(4-Volume Set) Rama Cont - Encyclopedia of Quantitative Finance-Wiley (2010)
language reads as follows: let $\{X_1, X_2, \ldots, X_n, \ldots\}$ be a sequence of independent random variables taking values 1 or −1 with probability 1/2. If we let $S_n = X_1 + \cdots + X_n$ and let $[x]$ denote the integer part of a real number $x$, then

\[ \left( \frac{1}{\sqrt{n}}\, S_{[nt]},\ t \ge 0 \right) \longrightarrow \left( B_t,\ t \ge 0 \right) \tag{1} \]

in law as $n \to \infty$, where $(B_t, t \ge 0)$ is a standard Brownian motion.

This second method, which is somewhat difficult to read and not very rigorous, naturally leads to the previous solution. But it is still not sufficient. Bachelier proposes a third method, the “radiation (or diffusion) of probability”. Bachelier, having attended the lectures of Poincaré and Boussinesq on the theory of heat, was aware of the “method of Laplace”, which gives the fundamental solution of the heat equation, a solution that has exactly the form given by the first (and second) methods used by Bachelier. Hence, there is a coincidence to be elucidated. We know that Laplace probably knew the reason for this coincidence. Lord Rayleigh had recently noticed this coincidence in his solution to the problem of “random phases”. It is likely that neither Bachelier nor Poincaré had read the work of Rayleigh. Anyway, Bachelier, in turn, explains this curious intersection between the theory of heat and the prices of annuities on the Paris stock exchange. This is his third method, which can be summarized as follows.

Consider the game of flipping a fair coin an infinite number of times and set $f(n, x) = P(S_n = x)$. It has been known since at least the seventeenth century that

\[ f(n+1, x) = \tfrac{1}{2} f(n, x-1) + \tfrac{1}{2} f(n, x+1) \tag{2} \]

Subtracting $f(n, x)$ from both sides of the equation, we obtain

\[ f(n+1, x) - f(n, x) = \tfrac{1}{2} \left[ f(n, x+1) - 2 f(n, x) + f(n, x-1) \right] \tag{3} \]

It then suffices to take the unit 1 in the preceding equation to be infinitely small to obtain the heat equation

\[ \frac{\partial f}{\partial n} = \frac{1}{2} \frac{\partial^2 f}{\partial x^2} \tag{4} \]

whose solution is the law of a centered Gaussian random variable with variance $n$.

Theory of Speculation

At the stock market, probability radiates like heat. This “demonstrates” the role of Gaussian laws in problems related to the stock market, as acknowledged by Poincaré himself in his report: “A little reflection shows that the analogy is real and the comparison legitimate. The arguments of Fourier are applicable, with very little change, to this problem that is so different from the problem to which these arguments were originally applied.” And Poincaré regretted that Bachelier did not develop this point further, though this point would be developed in a masterly way by Kolmogorov in a famous article published in 1931 in the Mathematische Annalen. In fact, the first and third methods used by Bachelier are intrinsically linked: the Chapman–Kolmogorov equation for any regular Markov process is equivalent to a partial differential equation of parabolic type. In all regular Markovian schemes that are continuous, probability radiates like heat from a fire fanned by the thousand winds of chance. And further work, exploiting this real analogy, would transform not only the theory of Markov processes but also the century-old theory of Fourier equations and parabolic equations.

Now, having determined the law of price changes, all calculations of financial products involving time follow easily. But Bachelier did not stop there. He proposed a general theory of speculation integrating all stock market products that could be proposed to clients, whose (expected) value at maturity, and therefore whose price, can be calculated using general formulas resulting from theory. The most remarkable product that Bachelier priced was based on the maximum value of a stock during the period between its purchase and a maturity date (usually one month later). In this case, one must determine the law of the maximum of a stock price over some interval of time. This problem would be of concern to Norbert Wiener, the inventor of the mathematical theory of Brownian motion, in 1923. It involves knowing a priori the law of the price over an infinite time interval, but it was not known, either in 1923 or in 1900, how to easily calculate the integrals of functions of an infinite number of variables. Let us explain the reasoning used by Bachelier [1] as an example of his methods of analysis.
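Recursion (2) and the Gaussian limit behind equation (4) can be checked numerically. The following is a minimal Python sketch, not taken from the source: the function names `walk_distribution` and `gaussian_lattice_mass` are illustrative. It iterates (2) from $f(0, 0) = 1$ and compares the result with the centered Gaussian density of variance $n$. Because $S_n$ only takes values of the same parity as $n$, the lattice spacing is 2, so the Gaussian density is multiplied by 2 before comparison.

```python
from math import comb, exp, pi, sqrt


def walk_distribution(n_steps):
    """Return {x: P(S_n = x)} for a fair +/-1 walk, built with recursion (2)."""
    f = {0: 1.0}  # f(0, 0) = 1: the walk starts at the origin
    for _ in range(n_steps):
        g = {}
        for x, p in f.items():
            # Half of the mass at x moves to x - 1 and half to x + 1,
            # which is equivalent to f(n+1, x) = f(n, x-1)/2 + f(n, x+1)/2.
            g[x - 1] = g.get(x - 1, 0.0) + 0.5 * p
            g[x + 1] = g.get(x + 1, 0.0) + 0.5 * p
        f = g
    return f


def gaussian_lattice_mass(n_steps, x):
    """Gaussian density of variance n at x, times the lattice spacing 2."""
    return 2.0 * exp(-x * x / (2.0 * n_steps)) / sqrt(2.0 * pi * n_steps)


if __name__ == "__main__":
    n = 400
    f = walk_distribution(n)
    for x in (0, 10, 20, 40):
        exact = comb(n, (n + x) // 2) / 2**n  # binomial formula for P(S_n = x)
        print(x, f[x], exact, gaussian_lattice_mass(n, x))
```

The three printed columns should agree closely for moderate $x$, with the recursion matching the binomial formula to rounding error and the Gaussian approximation improving as $n$ grows.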
Bachelier, Louis (1870–1946) 3
Bachelier proceeded in two different ways. The first way was based on the second method developed in Bachelier’s thesis. It consists of discretizing time in steps of $\Delta t$, and introducing a change in price at each step of $\pm\Delta x$. Bachelier wanted to calculate the probability that before time $t = n\Delta t$, the game (or price) exceeds a given value $c = m\Delta x$. Let $n = m + 2p$. Bachelier proposed to first calculate the probability that the price $c$ is reached for the first time at exactly time $t$. To this end, he uses the gambler’s ruin argument: the probability is equal to $\frac{m}{n} C_n^p\, 2^{-n}$, which Bachelier obtained from the ballot formula of Bertrand, which he learned from Poincaré or Bertrand’s work, or perhaps both. It then suffices to pass properly to the limit so that $\Delta x = O(\sqrt{\Delta t})$. One then obtains the probability that the price exceeds $c$ before $t$. Bachelier then noted that this probability is equal to twice the probability that the price exceeds $c$ at time $t$.

The result is Bachelier’s formula for the law of the maximum $M_t$ of the price $B_t$ over the interval $[0, t]$; that is,

\[ P(M_t > c) = 2\,P(B_t > c) \tag{5} \]

It would have been difficult to proceed in a simpler fashion. Having obtained this formula, Bachelier had to justify it in a simple way to understand why it holds. Bachelier therefore added to his first calculation (which was somewhat confusing and difficult to follow) a “direct demonstration” without passing to the limit. He used the argument that “the price cannot pass the threshold c over a time interval of length t without having done so previously” and hence that

\[ P(B_t > c) = P(M_t > c)\,\alpha \tag{6} \]

where $\alpha$ is the probability that the price, having attained $c$ before time $t$, is greater than $c$ at time $t$. The latter probability is obviously 1/2, due to symmetry of the sample paths that go above and that remain below $c$ by time $t$. And Bachelier concludes: “It is remarkable that the multiple integral that expresses the probability $P(M_t > c)$ does not seem amenable to ordinary methods of calculation, but can be determined by very simple probabilistic reasoning.” It was, without doubt, the first example of the use of the reflection principle in probability theory. In two steps, a complicated calculation yields a simple formula by using a very simple probabilistic (or combinatorial) argument.

Of course, Bachelier had to do his mathematics without a safety net. What could his safety net have been? The mathematical analysis available during his time could not deal with such strange objects and calculations. It was not until the following year, 1901, that Lebesgue introduced the integral based on the measure that Borel had just recently constructed. The Daniell integral, which Wiener used, dates to 1920 and it was not until the 1930s that European mathematicians realized that computing probabilities with respect to Brownian motion, or with respect to sequences of independent random variables, could be done using Lebesgue measure on the unit interval. Since Lebesgue’s theory came to be viewed as one of the strongest pillars of analysis in the twentieth century, this approach gave probability theory a very strong analytic basis. We will have to wait much longer to place the stochastic calculus of Brownian motion and sample path arguments involving stopping times into a relatively uniform analytical framework. Anyway, Bachelier had little concern for either this new theory in analysis or the work of his contemporaries, whom he never cites. He refers to the work of Laplace, Bertrand, and Poincaré, who never cared about the Lebesgue integral, and so Bachelier always ignored its existence.

It seems that in 1900, Bachelier [1] saw very clearly how to model the continuous movement of stock prices and he established new computational techniques, derived notably from the classical techniques involving infinite sequences of fair coin flips. He provided an intermediate mathematical argument to explain a new class of functions that reflected the vagaries of the market, just as in the eighteenth century, when one used geometric reasoning and physical intuition to explain things.

After the Thesis

His Ph.D. thesis defended, Bachelier suddenly seemed to discover the immensity of a world in which randomness exists. The theory of the stock market allowed him to view the classical results of probability with a new eye, and it opened new viewpoints for him. Starting in 1901, Bachelier showed that the known results about infinite sequences of fair coin flips could all (or almost all) be obtained from stock
market theory and that one can derive new results that are more precise than anyone had previously suspected. In 1906, Bachelier proposed an almost general theory of “related probabilities”, that is to say, a theory about what would, 30 years later, be called Markov processes. This article by Bachelier was the starting point of a major study by Kolmogorov in 1931 that we already mentioned. All of Bachelier’s work was published with the distant but caring recommendation of Poincaré, so that by 1910, Bachelier, whose income remains unknown and was probably modest, was permitted to teach a “free course” in probability theory at the Sorbonne, without compensation. Shortly thereafter, he won a scholarship that allowed him to publish his Calculus of Probability, Volume I, Paris, Gauthier-Villars, 1912 (Volume II never appeared), which included all of his work since his thesis. This very surprising book was not widely circulated in France, and had no impact on the Paris stock market or on French mathematics, but it was one of the sources that motivated work in stochastic processes at the Moscow School in the 1930s. It also influenced work by the American School on sums of independent random variables in the 1950s, and at the same time, influenced new theories in mathematical finance that were developing in the United States. And, as things should rightly be, these theories traced back to France, where Bachelier’s name had become so well recognized that in 2000, the centennial anniversary of his work in “theory of speculation” was celebrated.

The First World War interrupted the work of Bachelier, who was summoned for military service in September 1914 as a simple soldier. When the war ended in December 1918, he was a sublieutenant in the Army Service Corps. He served far from the front, but he carried out his service with honor. As a result, in 1919, the Directorate of Higher Education in Paris believed it was necessary to appoint Bachelier to a university outside of Paris, since the war had decimated the ranks of young French mathematicians and there were many positions to be filled. After many difficulties, due to his marginalization in the French mathematical community and the incongruent nature of his research, Bachelier finally received tenure in 1927 (at the age of 57) as a professor at the University of Besançon, where he remained until his retirement in 1937. Throughout the postwar years, Bachelier essentially did not publish any original work. He married in 1920, but his wife died a few months later. He was often ill and he seems to have been quite isolated.

In 1937, he moved with his sister to Saint-Malo in Brittany. During World War II, he moved to Saint-Servan, where he died in 1946. He seemed to be aware of the new theory of stochastic processes that was then developing in Paris and Moscow, and that was progressively spreading all over the world. He attempted to claim credit for the things that he had done, without any success. He regained his appetite for research, to the point that in 1941, at the age of 70, he submitted a note for publication to the Academy of Sciences in Paris on the “probability of maximum oscillations”, in which he demonstrated a fine mastery of the theory of Brownian motion, which was undertaken systematically by Paul Lévy starting in 1938. Paul Lévy, the principal French researcher of the theory of Brownian motion, recognized, albeit belatedly, the work of Bachelier, and his work provided a more rigorous foundation for Bachelier’s “theory of speculation”.

Reference

[1] Bachelier, L. (1900). Théorie de la spéculation, Thèse Sciences mathématiques Paris. Annales Scientifiques de l’Ecole Normale Supérieure 17, 21–86; The Random Character of Stock Market Prices, P. Cootner, ed, MIT Press, Cambridge, 1964, pp. 17–78.

Further Reading

Courtault, J.M. & Kabanov, Y. (eds) (2002). Louis Bachelier: Aux origines de la Finance Mathématique, Presses Universitaires Franc-Comtoises, Besançon.
Taqqu, M.S. (2001). Bachelier and his times: a conversation with Bernard Bru, Finance and Stochastics 5(1), 3–32.

Related Articles

Black–Scholes Formula; Markov Processes; Martingales; Option Pricing: General Principles.

BERNARD BRU
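Bachelier’s formula (5) for the law of the maximum has an exact counterpart for the fair ±1 coin-flip walk, which makes the reflection argument easy to verify by direct computation: for an integer level $c \ge 1$, $P(\max_{k \le n} S_k \ge c) = 2P(S_n > c) + P(S_n = c)$, and the last term disappears in the diffusion limit. The Python sketch below is only an illustration under that discrete setting; the function names are hypothetical and it is not Bachelier’s own calculation. It computes the left-hand side by stopping paths the first time they touch $c$, and the right-hand side from binomial probabilities.

```python
from math import comb


def prob_sn(n, x):
    """P(S_n = x) for a fair +/-1 walk started at 0."""
    if abs(x) > n or (n + x) % 2:
        return 0.0
    return comb(n, (n + x) // 2) / 2**n


def prob_max_at_least(n, c):
    """P(max_{0<=k<=n} S_k >= c) for an integer c >= 1, by absorbing paths at c."""
    alive = {0: 1.0}  # probability mass of paths that have not yet reached c
    hit = 0.0         # probability mass of paths that have reached c
    for _ in range(n):
        nxt = {}
        for x, p in alive.items():
            for y in (x - 1, x + 1):
                if y >= c:
                    hit += 0.5 * p  # first passage through level c
                else:
                    nxt[y] = nxt.get(y, 0.0) + 0.5 * p
        alive = nxt
    return hit


def reflection_rhs(n, c):
    """2 P(S_n > c) + P(S_n = c), the discrete form of 2 P(B_t > c)."""
    return 2.0 * sum(prob_sn(n, x) for x in range(c + 1, n + 1)) + prob_sn(n, c)


if __name__ == "__main__":
    for n, c in [(10, 2), (50, 6), (200, 15)]:
        print(n, c, prob_max_at_least(n, c), reflection_rhs(n, c))
```

The two columns agree to rounding error for every $n$ and $c$, which is the combinatorial content of the “direct demonstration” summarized in equation (6).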
Samuelson, Paul A.

Paul Anthony Samuelson (1915–) is Institute Professor Emeritus at the Massachusetts Institute of Technology where he has taught since 1940. He earned a BA from the University of Chicago in 1935 and his PhD in economics from Harvard University in 1941. He received the John Bates Clark Medal in 1947 and the National Medal of Science in 1996. In 1970, he became the first American to receive the Alfred Nobel Memorial Prize in Economic Sciences. His textbook, Economics, first published in 1948, and in its 18th edition, is the best-selling and arguably the most influential economics textbook of all time.

Paul Samuelson is the last great general economist: never again will any one person make such foundational contributions to so many distinct areas of economics. His prolific and profound theoretical contributions over seven decades of published research have been universal in scope, and his ramified influence on the whole of economics has led to foundational contributions in virtually every field of economics, including financial economics. Representing 27 years of scientific writing from 1937 to the middle of 1964, the first two volumes of his Collected Scientific Papers contain 129 articles and 1772 pages. These were followed by the publication of the 897-page third volume in 1972, which registers the succeeding seven years’ product of 78 articles published when he was between the ages of 49 and 56 [18]. A mere five years later, at the age of 61, Samuelson had published another 86 papers, which fill the 944 pages of the fourth volume. A decade later, the fifth volume appeared with 108 articles and 1064 pages. A glance at his list of publications since 1986 assures us that a sixth and even seventh volume could be filled. That Samuelson paid no heed to the myth of debilitating age in science is particularly well-exemplified in his contributions to financial economics, with all but 6 of his more than 60 papers being published after he had reached the age of 50.

Samuelson’s contributions to quantitative finance, as with mathematical economics generally, have been foundational and wide-ranging. They include reconciling the axioms of expected utility theory first with nonstochastic theories of choice [9] and then with the ubiquitous and practical mean–variance criterion of choice [16], exploring the foundations of diversification [13] and optimal portfolio selection when facing fat-tailed, infinite-variance return distributions [14], and, over a span of nearly four decades, analyzing the systematic dependence on age of optimal portfolio strategies, in particular, optimal long-horizon investment strategies, and the improper use of the Law of Large Numbers to arrive at seemingly dominating strategies for the long run [10, 15, 17, 21–27]. In investigating the oft-told tale that investors become systematically more conservative as they get older, Samuelson shows that perfectly rational risk-averse investors with constant relative risk aversion will select the same fraction of risky stocks versus safe cash period by period, independently of age, provided that the investment opportunity set is unchanging. Having shown that greater investment conservatism is not an inevitable consequence of aging, he later [24] demonstrates conditions under which such behavior can be optimal: with mean-reverting changing opportunity sets, older investors will indeed be more conservative than in their younger days, provided that they are more risk averse than a growth-optimum, log-utility maximizer. To complete the rich set of age-dependent risk-taking behaviors, Samuelson shows that rational investors may actually become less conservative with age, if either they are less risk averse than log or if the opportunity set follows a trending, momentum-like dynamic process. He recently confided that in finance, this analysis is a favorite brainchild of his.

Published in the same issue of the Industrial Management Review, “Proof That Properly Anticipated Prices Fluctuate Randomly” and “Rational Theory of Warrant Pricing” are perhaps the two most influential Samuelson papers in quantitative finance. During the decade before their printed publication in 1965, Samuelson had set down, in an unpublished manuscript, many of the results in these papers and had communicated them in lectures at MIT, Yale, Carnegie, the American Philosophical Society, and elsewhere. In the early 1950s, he supervised a PhD thesis on put and call pricing [5].

The sociologist or historian of science would undoubtedly be able to develop a rich case study of alternative paths for circulating scientific ideas by exploring the impact of this oral publication of research in rational expectations, efficient markets, geometric Brownian motion, and warrant pricing in the period between 1956 and 1965.

Samuelson (1965a) and Eugene Fama independently provide the foundation of the Efficient Market
theory that developed into one of the most important concepts in modern financial economics. As indicated by its title, the principal conclusion of the paper is that in well-informed and competitive speculative markets, the intertemporal changes in prices will be essentially random. Samuelson has described the reaction (presumably his own as well as that of others) to this conclusion as one of “initial shock—and then, upon reflection, that it is obvious”. The argument is as follows: the time series of changes in most economic variables (gross national product (GNP), inflation, unemployment, earnings, and even the weather) exhibit cyclical or serial dependencies. Furthermore, in a rational and well-informed capital market, it is reasonable to presume that the prices of common stocks, bonds, and commodity futures depend upon such economic variables. Thus, the shock comes from the seemingly inconsistent conclusion that in such well-functioning markets the changes in speculative prices should exhibit no serial dependencies. However, once the problem is viewed from the perspective offered in the paper, this seeming inconsistency disappears and all becomes obvious.

Starting from the consideration that in a competitive market, if everyone knew that a speculative security was expected to rise in price by more (less) than the required or fair expected rate of return, it would already be bid up (down) to negate that possibility, Samuelson postulates that securities will be priced at each point in time so as to yield this fair expected rate of return. Using a backward-in-time induction argument, he proves that the changes in speculative prices around that fair return will form a martingale. And this follows no matter how much serial dependency there is in the underlying economic variables upon which such speculative prices are formed. In an informed market, therefore, current speculative prices will already reflect anticipated or forecastable future changes in the underlying economic variables that are relevant to the formation of prices, and this leaves only the unanticipated or unforecastable changes in these variables as the sole source of fluctuations in speculative prices.

Samuelson is careful to warn the reader against interpreting his mathematically derived theoretical conclusions about markets as empirical statements. Nevertheless, for 40 years, his model has been important to the understanding and interpretation of the empirical results observed in real-world markets. For the most part in those ensuing years, his interpretation of the data is that organized markets where widely owned securities are traded are well approximated as microefficient, meaning that the relative pricing of individual securities within the same or very similar asset classes is such that active asset management applied to those similar securities (e.g., individual stock selection) does not earn greater risk-adjusted returns.

However, Samuelson is discriminating in his assessment of the efficient market hypothesis as it relates to real-world markets. He notes a list of the “few not-very-significant apparent exceptions” to microefficient markets [23, p. 5]. He also expresses belief that there are exceptionally talented people who can probably garner superior risk-corrected returns, and even names a few. He does not see them as offering a practical broad alternative investment prescription for active management since such talents are few and hard to identify. Although Samuelson believes strongly in the microefficiency of markets, he expresses doubt about macromarket efficiency: namely, he holds that asset-value “bubbles” do indeed occur.

There is no doubt that the mainstream of the professional investment community has moved significantly in the direction of Paul Samuelson’s position during the 35 years since he issued his challenge to that community to demonstrate widespread superior performance [20]. Indexing as either a core investment strategy or a significant component of institutional portfolios is ubiquitous, and even among those institutional investors who believe they can deliver superior performance, performance is typically measured incrementally relative to an index benchmark and the expected performance increment to the benchmark is generally small compared to the expected return on the benchmark itself. It is therefore with no little irony that as investment practice has moved in this direction, for the last 15 years, academic research has moved in the opposite direction, strongly questioning even the microefficiency case for the efficient market hypothesis. The conceptual basis of these challenges comes from theories of asymmetric information and institutional rigidities that limit the arbitrage mechanisms that enforce microefficiency and of cognitive dissonance and other systematic behavioral dysfunctions among individual investors that purportedly distort market prices away from rationally determined asset prices in identified ways. A substantial quantity of empirical
evidence has been assembled, but there is considerable controversy over whether it does indeed make a strong case to reject market microefficiency in the Samuelsonian sense. What is not controversial at all is that Paul Samuelson’s efficient market hypothesis has had a deep and profound influence on finance research and practice for more than 40 years and all indications are that it will continue to do so well into the future.

If one were to describe the 1960s as “the decade of capital asset pricing and market efficiency” in view of the important research gains in quantitative finance during that period, one need hardly say more than “the Black-Scholes option pricing model” to justify describing the 1970s as “the decade of option and derivative security pricing.” Samuelson was ahead of the field in recognizing the arcane topic of option pricing as a rich area for problem choice and solution. By at least the early 1950s, Samuelson had shown that the assumption of an absolute random walk or arithmetic Brownian motion for stock price changes leads to absurd prices for long-lived options, and this was done before his rediscovery of Bachelier’s pioneering work [1] in which this very assumption is made. He introduced the alternative process of a “geometric” Brownian motion in which the logarithm of the price follows a Brownian motion, possibly with a drift. His paper on the rational theory of warrant pricing [12] resolves a number of apparent paradoxes that had plagued the existing mathematical theory of option pricing from the time of Bachelier. In the process (with the aid of a mathematical appendix provided by H. P. McKean, Jr), Samuelson also derives much of what has become the basic mathematical structure of option pricing theory today.

Bachelier [1] considered options that could only be exercised on the expiration date. In modern times, the standard terms for options and warrants permit the option holder to exercise on or before the expiration date. Samuelson coined the terms European option to refer to the former and American option to refer to the latter. As he tells the story, to get a practitioner’s perspective in preparation for his research, he went to New York to meet with a well-known put and call dealer (there were no traded options exchanges until 1973) who happened to be Swiss. Upon his identifying himself and explaining what he had in mind, Samuelson was quickly told, “You are wasting your time—it takes a European mind to understand options.” Later on, when writing his paper, Samuelson thus chose the term European for the relatively simple(-minded)-to-value option contract that can only be exercised at expiration and American for the considerably more-(complex)-to-value option contract that could be exercised early, any time on or before its expiration date.

Although real-world options are almost always of the American type, published analyses of option pricing prior to his 1965 paper focused exclusively on the evaluation of European options and therefore did not include the extra value to the option from the right to exercise early.

The most striking comparison to make between the Black–Scholes option pricing theory and Samuelson’s rational theory [12] is the formula for the option price. The Samuelson partial differential equation for the option price is the same as the corresponding equation for the Black–Scholes option price if one sets the Samuelson parameter for the expected return on the underlying stock equal to the riskless interest rate minus the dividend yield and sets the Samuelson parameter for the expected return on the option equal to the riskless interest rate. It should, however, be underscored that the mathematical equivalence between the two formulas with the redefinition of parameters is purely a formal one. The Samuelson model simply posits the expected returns for the stock and option. By employing a dynamic hedging or replicating portfolio strategy, the Black–Scholes analysis derives the option price without the need to know either the expected return on the stock or the required expected return on the option. Therefore, the fact that the Black–Scholes option price satisfies the Samuelson formula implies neither that the expected returns on the stock and option are equal nor that they are equal to the riskless rate of interest. Furthermore, it should also be noted that Black–Scholes pricing of options does not require knowledge of investors’ preferences and endowments as is required, for example, in the sequel Samuelson and Merton [28] warrant pricing paper. The “rational theory” put forward in 1965 is thus clearly a “miss” with respect to the Black–Scholes development. However, as this analysis shows, it is just as clearly a “near miss”. See [6, 19] for a formal comparison of the two models.

Extensive reviews of Paul Samuelson’s remarkable set of contributions to quantitative finance can be found in [2–4, 7, 8].
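The formal equivalence between the two option formulas can be illustrated for the simplest case, a European call on a stock following geometric Brownian motion. The Python sketch below is an assumption-laden illustration rather than Samuelson’s own derivation: it prices the call as the expected payoff under a posited stock return `alpha`, discounted at a posited option return `beta` (both parameter names are hypothetical), and compares it with the standard Black–Scholes formula with a continuous dividend yield `q`. Setting `alpha = r - q` and `beta = r` reproduces the Black–Scholes price.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def expected_value_call(s, k, t, sigma, alpha, beta):
    """Expected call payoff under stock drift alpha, discounted at rate beta.

    Assumes the log of the stock price follows a Brownian motion with drift,
    so that E[S_T] = s * exp(alpha * t).
    """
    d1 = (log(s / k) + (alpha + 0.5 * sigma**2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return exp(-beta * t) * (s * exp(alpha * t) * norm_cdf(d1) - k * norm_cdf(d2))


def black_scholes_call(s, k, t, sigma, r, q=0.0):
    """Black-Scholes European call price with continuous dividend yield q."""
    d1 = (log(s / k) + (r - q + 0.5 * sigma**2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return s * exp(-q * t) * norm_cdf(d1) - k * exp(-r * t) * norm_cdf(d2)


if __name__ == "__main__":
    s, k, t, sigma, r, q = 100.0, 100.0, 1.0, 0.2, 0.05, 0.02
    # Redefining the posited returns recovers the Black-Scholes price.
    print(expected_value_call(s, k, t, sigma, alpha=r - q, beta=r))
    print(black_scholes_call(s, k, t, sigma, r, q))
    # Any other choice of alpha and beta gives a different price.
    print(expected_value_call(s, k, t, sigma, alpha=0.10, beta=0.15))
```

The agreement of the first two numbers is purely formal, as the text stresses: the first function needs `alpha` and `beta` as inputs, whereas the hedging argument behind the second requires neither.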
4 Samuelson, Paul A.
to him, his most important work was the two books he wrote that extended the image of CAPM to the real economy, including the theory of money and business cycles [5, 8]. The fluctuation of aggregate output, he reasoned, was nothing more than the fluctuating yield on the national stock of capital. Just as risk is the price we pay for higher expected yield, business fluctuation is also the price we pay for higher expected rates of economic growth.

The rise of modern finance in the last third of the twentieth century transformed the financial infrastructure within which businesses and households interact. A system of banking institutions was replaced by a system of capital markets, as financial engineering developed ways to turn loans into bonds. This revolution in institutions has also brought with it a revolution in our thinking about how the economy works, including the role of government regulation and stabilization policy. Crises in the old banking system gave rise to the old macroeconomics. Crises in the new capital markets system will give rise to a new macroeconomics, possibly built on the foundations laid by Fischer Black.

References

[1] Black, F. (1972). Capital market equilibrium with restricted borrowing, Journal of Business 45, 444–455.
[2] Black, F. (1974). International capital market equilibrium with investment barriers, Journal of Financial Economics 1, 337–352.
[3] Black, F. (1975). Bank funds management in an efficient market, Journal of Financial Economics 2, 323–339.
[4] Black, F. (1976). The pricing of commodity contracts, Journal of Financial Economics 3, 167–179.
[5] Black, F. (1987). Business Cycles and Equilibrium, Basil Blackwell, Cambridge, MA.
[6] Black, F. (1988). Individual investment and consumption under uncertainty, in Portfolio Insurance, A Guide to Dynamic Hedging, D.L. Luskin, ed, John Wiley & Sons, New York, pp. 207–225.
[7] Black, F. (1990). Equilibrium exchange rate hedging, Journal of Finance 45, 899–907.
[8] Black, F. (1995). Exploring General Equilibrium, MIT Press, Cambridge, MA.
[9] Black, F. & Cox, J.C. (1976). Valuing corporate securities: some effects of bond indenture provisions, Journal of Finance 31, 351–368.
[10] Black, F., Derman, E. & Toy, W.T. (1990). A one-factor model of interest rates and its application to treasury bond options, Financial Analysts Journal 46, 33–39.
[11] Black, F. & Litterman, R. (1991). Asset allocation: combining investor views with market equilibrium, Journal of Fixed Income 1, 7–18.
[12] Black, F. & Litterman, R. (1992). Global portfolio optimization, Financial Analysts Journal 48, 28–43.
[13] Black, F. & Perold, A.F. (1992). Theory of constant proportion portfolio insurance, Journal of Economic Dynamics and Control 16, 403–426.
[14] Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, Journal of Political Economy 81, 637–654.
[15] Black, F. & Scholes, M. (1974). From theory to a new financial product, Journal of Finance 29, 399–412.
[16] Mehrling, P.G. (2005). Fischer Black and the Revolutionary Idea of Finance, John Wiley & Sons, Hoboken, New Jersey.
[17] Merton, R.C. (1973). Theory of rational option pricing, Bell Journal of Economics and Management Science 4, 141–183.
[18] Treynor, J.L. (1962). Toward a theory of market value of risky assets, in Asset Pricing and Portfolio Performance, R.A. Korajczyk, ed, Risk Books, London, pp. 15–22.

Related Articles

Black–Scholes Formula; Black–Litterman Approach; Option Pricing Theory: Historical Perspectives; Merton, Robert C.; Modern Portfolio Theory; Term Structure Models; Sharpe, William F.

PERRY MEHRLING
Mandelbrot, Benoit

disordered and random phenomena ranging from the
geometry of coastlines to the variation of foreign
exchange rates. In his own words
The roughness of clusters in the physics of disor-
der, of turbulent flows, of exotic noises, of chaotic
dynamical systems, of the distribution of galaxies, of
coastlines, of stock price charts, and of mathemat-
ical constructions,—these have typified the topics
I studied.
He formalized the notion of ‘fractal process’—and
later, that of multifractal [13]—which provided a
tool for quantifying the “degree of irregularity” of
various random phenomena in mathematics, physics,
and economics.
Benoit B. Mandelbrot, Sterling Professor Emeritus Benoit Mandelbrot’s numerous awards include the
of Mathematical Sciences at Yale University and 1993 Wolf Prize for Physics and the 2003 Japan Prize
IBM Fellow Emeritus at the IBM Research Cen- for Science and Technology, the 1985 F. Barnard
ter, best known as the “father of fractal geometry”, Medal for Meritorious Service to Science (“Magna
is a Polish-born French-American multidisciplinary est Veritas”) of the US National Academy of Sci-
scientist with numerous contributions to different ences, the 1986 Franklin Medal for Signal and Emi-
fields of knowledge including mathematics, statistics, nent Service in Science of the Franklin Institute
hydrology, physics, engineering, physiology, eco- of Philadelphia, the 1988 Charles Proteus Stein-
nomics and, last but not least, quantitative finance. metz Medal of IEEE, the 2004 Prize of Financial
In this short text we will focus on Mandelbrot’s con- Times/Deutschland, and a Humboldt Preis from the
tributions to the study of financial markets. Alexander von Humboldt Stiftung.
Benoit Mandelbrot was born in Warsaw, Poland,
on November 20, 1924 in a family of scholars from
Lituania. In 1936 Mandelbrot’s family moved to From Mild to Wild Randomness:
Paris, where he was influenced by his mathemati- The Noah Effect
cian uncle Szolem Mandelbrojt (1899–1983). He
entered the Ecole Polytechnique in 1944. Among his Mandelbrot developed an early interest in the stochas-
professors at Polytechnique was Paul Levy, whose tic modeling of financial markets. Familiar with
pioneering work on stochastic processes influenced the work of Louis Bachelier (see Bachelier, Louis
Mandelbrot. (1870–1946)), Mandelbrot published a series of
After two years in Caltech and after obtaining pioneering studies [6–8, 21] on the tail behavior
a doctoral degree in mathematics from University of the distribution of price variations, where he
of Paris in 1952, he started his scientific career at advocated the use of heavy-tailed distributions and
scale-invariant Lévy processes for modeling price
the Centre National de la Recherche Scientifique in
fluctuations. The discovery of the heavy-tailed nature
Paris, before moving on various scientific appoint-
of price movements led him to coin the term
ments which included those at Ecole Polytechnique,
“wild randomness” for describing market behavior,
Universite de Lille, the University of Geneva MIT,
as opposed to the “mild randomness” represented by
Princeton, University of Chicago, and finally the
Bachelier’s Brownian model, which later became the
IBM Thomas J. Watson Research Center in York-
standard approach embodied in the Black–Scholes
town Heights, New York and Yale University where
model. Mandelbrot likened the sudden bursts of
he spent the longer part of his career.
volatility in financial markets to the “Noah effect”,
A central thread in his scientific career is the
by analogy with the flood which destroys the world
“ardent pursuit of the concept of roughness” which
in Noah’s biblical story:
resulted in a rich theoretical apparatus—fractal and
multifractal geometry—whose aim is to describe In science, all important ideas need names and
and represent the order hidden in apparently wildly stories to fix them in the memory. It occurred to
    me that the market’s first wild trait, abrupt change or discontinuity, is prefigured in the tale of Noah. As Genesis relates, in Noah’s six-hundredth year God ordered the Great Flood to purify a wicked world. [. . .] The flood came and went, catastrophic but transient. Market crashes are like that: at times, even a great bank or brokerage house can seem like a little boat in a big storm.

Long-range Dependence: The Joseph Effect

Another early insight of Mandelbrot’s studies of financial and economic data was the presence of long-range dependence [9–11] in market fluctuations:

    The market’s second wild trait, almost cycles, is prefigured in the story of Joseph. The Pharaoh dreamed that seven fat cattle were feeding in the meadows, when seven lean kine rose out of the Nile and ate them. [. . .] Joseph, a Hebrew slave, called the dreams prophetic: Seven years of famine would follow seven years of prosperity. [. . .] Of course, this is not a regular or predictable pattern. But the appearance of one is strong. Behind it is the influence of long-range dependence in an otherwise random process or, put another way, a long-term memory through which the past continues to influence the random fluctuations of the present. I called these two distinct forms of wild behavior the Noah effect and the Joseph effect. They are two aspects of one reality.

Building on his earlier work [22, 23] on long-range dependence in hydrology and fractional Brownian motion, Mandelbrot proposed the use of fractional processes for modeling long-range dependence and scaling properties of economic quantities (see Long Range Dependence).

Multifractal Models and Stochastic Time Changes

In a series of papers [2, 4, 20] with Adlai Fisher and Laurent Calvet, Mandelbrot studied the scaling properties of the USD/DEM foreign exchange rate at frequencies ranging from a few minutes to weeks and, building on earlier work by Clark [3] and Mandelbrot [12, 13], introduced a new family of stochastic models in which the (log) price of an asset is represented by a time-changed fractional Brownian motion, where the time change, representing market activity, is given by a multifractal (see Multifractals) increasing process (see Mixture of Distribution Hypothesis; Time Change) [5, 15]:

    The key step is to introduce an auxiliary quantity called trading time. The term is self-explanatory and embodies two observations. While price changes over fixed clock time intervals are long-tailed, price changes between successive transactions stay near-Gaussian over sometimes long periods between discontinuities. Following variations in the trading volume, the time interval between successive transactions varies greatly. This suggests that trading time is related to volume.

The topic of multifractal modeling in finance was further developed in [1, 17–19]; a nontechnical account is given in [16].

Mandelbrot’s work in quantitative finance has generally been 20 years ahead of its time: many of the ideas he proposed in the 1960s (such as long-range dependence, volatility clustering, and heavy tails) became mainstream in financial modeling in the 1990s. If this is anything of a pattern, his more recent work in the field might deserve a closer look.

Perhaps one of the most important insights of his work on financial modeling is to closely examine the empirical features of data before axiomatizing and writing down complex equations, a timeless piece of advice which can be a useful guide for quantitative modeling in finance.

Mandelbrot’s work in finance is summarized in the books [14, 15], and a popular account of this work is given in the book [5].

References

[1] Barral, J. & Mandelbrot, B. (2002). Multifractal products of cylindrical pulses, Probability Theory and Related Fields 124, 409–430.
[2] Calvet, L., Fisher, A. & Mandelbrot, B. (1997). Large Deviations and the Distribution of Price Changes. Cowles Foundation Discussion Papers: 1165.
[3] Clark, P.K. (1973). A subordinated stochastic process model with finite variance for speculative prices, Econometrica 41(1), 135–155.
[4] Fisher, A., Calvet, L.M. & Mandelbrot, B. (1997). Multifractality of the Deutschmark/US Dollar exchange rates. Cowles Foundation Discussion Papers: 1166.
[5] Hudson, R.L. & Mandelbrot, B. (2004). The (Mis)behavior of Markets: A Fractal View of Risk, Ruin, and Reward, Basic Books, New York, & Profile Books, London, pp. xxvi + 329.
[6] Mandelbrot, B. (1962). Sur certains prix spéculatifs: faits empiriques et modèle basé sur les processus stables
additifs de Paul Lévy, Comptes Rendus (Paris) 254, 3968–3970.
[7] Mandelbrot, B. (1963). The variation of certain speculative prices, The Journal of Business of the University of Chicago 36, 394–419.
[8] Mandelbrot, B. (1963). New methods in statistical economics, The Journal of Political Economy 71, 421–440.
[9] Mandelbrot, B. (1971). Analysis of long-run dependence in economics: the R/S technique, Econometrica 39 (July Supplement), 68–69.
[10] Mandelbrot, B. (1971). When can price be arbitraged efficiently? A limit to the validity of the random-walk and martingale models, Review of Economics and Statistics 53, 225–236.
[11] Mandelbrot, B. (1972). Statistical methodology for nonperiodic cycles: from the covariance to R/S analysis, Annals of Economic and Social Measurement 1, 257–288.
[12] Mandelbrot, B. (1973). Comments on “A subordinated stochastic process model with finite variance for speculative prices” by Peter K. Clark, Econometrica 41, 157–160.
[13] Mandelbrot, B. (1974). Intermittent turbulence in self-similar cascades; divergence of high moments and dimension of the carrier, Journal of Fluid Mechanics 62, 331–358.
[14] Mandelbrot, B. (1997). Fractals and Scaling in Finance: Discontinuity, Concentration, Risk, Springer, New York, pp. x + 551.
[15] Mandelbrot, B. (1997). Fractales, hasard et finance (1959–1997), Flammarion (Collection Champs), Paris, p. 246.
[16] Mandelbrot, B. (1999). A Multifractal Walk down Wall Street, Scientific American, February 1999, pp. 50–53.
[17] Mandelbrot, B. (2001). Scaling in financial prices, I: tails and dependence, Quantitative Finance 1, 113–123.
[18] Mandelbrot, B. (2001). Scaling in financial prices, IV: multifractal concentration, Quantitative Finance 1, 641–649.
[19] Mandelbrot, B. (2001). Stochastic volatility, power-laws and long memory, Quantitative Finance 1, 558–559.
[20] Mandelbrot, B., Fisher, A. & Calvet, L. (1997). The Multifractal Model of Asset Returns. Cowles Foundation Discussion Papers: 1164.
[21] Mandelbrot, B. & Taylor, H.M. (1967). On the distribution of stock price differences, Operations Research 15, 1057–1062.
[22] Mandelbrot, B. & Van Ness, J.W. (1968). Fractional Brownian motions, fractional noises and applications, SIAM Review 10, 422–437.
[23] Mandelbrot, B. & Wallis, J.R. (1968). Noah, Joseph and operational hydrology, Water Resources Research 4, 909–918.

Further Reading

Mandelbrot, B. (1966). Forecasts of future prices, unbiased markets and “martingale” models, The Journal of Business of the University of Chicago 39, 242–255.
Mandelbrot, B. (1982). The Fractal Geometry of Nature.
Mandelbrot, B. (2003). Heavy tails in finance for independent or multifractal price increments, in Handbook on Heavy Tailed Distributions in Finance, S.T. Rachev, ed., Handbooks in Finance, 30, Elsevier, pp. 1–34, Vol. 1.

Related Articles

Exponential Lévy Models; Fractional Brownian Motion; Heavy Tails; Lévy Processes; Long Range Dependence; Mixture of Distribution Hypothesis; Stylized Properties of Asset Returns.

RAMA CONT
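
The trading-time construction described in the Mandelbrot entry above (a price process observed in a randomly deformed clock) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names `binomial_cascade` and `simulate_log_price`, the parameter `m0`, the use of a binomial multiplicative cascade as the multifractal trading time, and the use of ordinary Brownian motion (Hurst exponent 1/2) in place of the fractional Brownian motion of the original papers.

```python
import math
import random


def binomial_cascade(levels, m0=0.6, rng=None):
    """Split unit mass over 2**levels equal clock intervals.

    At each level every interval is halved, and a fraction m0 of its mass
    goes to a randomly chosen half (1 - m0 to the other). The result is a
    list of positive masses that sum to 1 (up to rounding).
    """
    rng = rng or random.Random(0)
    masses = [1.0]
    for _ in range(levels):
        refined = []
        for mass in masses:
            left_share = m0 if rng.random() < 0.5 else 1.0 - m0
            refined.append(mass * left_share)
            refined.append(mass * (1.0 - left_share))
        masses = refined
    return masses


def simulate_log_price(levels=10, sigma=1.0, m0=0.6, seed=0):
    """Simulate a log-price as Brownian motion run in cascade trading time.

    Returns (trading_time, log_price), two lists of length 2**levels + 1.
    """
    rng = random.Random(seed)
    # Trading time elapsed during each equal-length clock interval.
    increments = binomial_cascade(levels, m0, rng)
    trading_time = [0.0]
    log_price = [0.0]
    for d_theta in increments:
        trading_time.append(trading_time[-1] + d_theta)
        # Brownian increment over a trading-time step of length d_theta.
        shock = sigma * math.sqrt(d_theta) * rng.gauss(0.0, 1.0)
        log_price.append(log_price[-1] + shock)
    return trading_time, log_price


if __name__ == "__main__":
    theta, x = simulate_log_price(levels=8, seed=42)
    print(len(theta), theta[-1], x[-1])
```

Clock intervals that receive a large share of trading time produce large price moves, so bursts of activity cluster together even though each increment is Gaussian once trading time is given. This is only a toy version of the idea; the cited papers should be consulted for the actual model.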
Sharpe, William F.

William Forsyth Sharpe (born on June 16, 1934) is one of the leading contributors to financial economics and shared the Nobel Memorial Prize in Economic Sciences in 1990 with Harry Markowitz and Merton Miller. His most important contribution is the capital asset pricing model (CAPM), which provided an equilibrium-based relationship between the expected return on an asset and its risk as measured by its covariance with the market portfolio. Similar ideas were developed by John Lintner, Jack Treynor (see Treynor, Lawrence Jack), and Jan Mossin around the same time. Sharpe has made other important contributions to the field of financial economics but, given the space limitations, we only describe two of his contributions: the CAPM and the Sharpe ratio.

It is instructive to trace the approach used by Sharpe in developing the CAPM. His starting point was Markowitz’s model of portfolio selection, which showed how rational investors would select optimal portfolios. If investors only care about the expected return and the variance of their portfolios, then the optimal weights can be obtained by quadratic programming. The inputs to the optimization are the expected returns on the individual securities and their covariance matrix. In 1963, Sharpe [1] showed how to simplify the computations required under the Markowitz approach. He assumed that each security’s return was generated by two random factors: one common to all securities and a second factor that was uncorrelated across securities. This assumption leads to a simple diagonal covariance matrix. Although the initial motivation for this simplifying assumption was to reduce the computational time, it would turn out to have deep economic significance.

These economic ideas were developed in Sharpe’s [2] Journal of Finance paper. He assumed that all investors would select mean-variance-efficient portfolios. He also assumed that investors had homogeneous beliefs and that investors could borrow and lend at the same riskless rate. As Tobin had shown, this implied two-fund separation, where the investor would divide his money between the risk-free asset and an efficient portfolio of risky assets. Sharpe highlighted the importance of the notion of equilibrium in this context. This efficient portfolio of risky assets in equilibrium can be identified with the market portfolio. Sharpe’s next step was to derive a relationship between the expected return on any risky asset and the expected return on the market. As a matter of curiosity, the CAPM relationship does not appear in the body of the paper but rather as the final equation in footnote 23 on page 438.

The CAPM relationship in modern notation is

    E[R_j] - r_f = \beta_j (E[R_m] - r_f)    (1)

where R_j is the return on security j, R_m is the return on the market portfolio of all risky assets, r_f is the return on the risk-free security, and

    \beta_j = \frac{\mathrm{Cov}(R_j, R_m)}{\mathrm{Var}(R_m)}    (2)

is the beta of security j. The CAPM asserts that the excess expected return on a risky security is equal to the security’s beta times the excess expected return on the market. Note that this is a single-period model and that it is formulated in terms of ex ante expectations. Note also that formula (2) provides an explicit expression for the risk of a security in terms of its covariance with the market and the variance of the market.

The CAPM has become widely used in both investment finance and corporate finance. It can be used as a tool in portfolio selection and also in the measurement of investment performance of portfolio managers. The CAPM is also useful in capital budgeting applications since it gives a formula for the required expected return on an investment. For this reason, the CAPM is often used in rate hearings in some jurisdictions for regulated entities such as utility companies or insurance companies.

The insights from the CAPM also played an important role in subsequent theoretical advances, but owing to space constraints we only mention one. The original derivation of the classic Black–Scholes option formula was based on the CAPM. Black assumed that the return on the stock and the return on its associated warrant both obeyed the CAPM. Hence he was able to obtain expressions for the expected return on both of these securities and he used this in deriving the Black–Scholes equation for the warrant price.

The second contribution that we discuss is the Sharpe ratio. In the case of a portfolio p with expected return E[R_p] and standard deviation σ_p, the
It is interesting to see that the Sharpe ratio figures so prominently in this fundamental relationship in modern mathematical finance.

Bill Sharpe has made several other notable contributions to the development of the finance field. His papers have profoundly influenced investment science and portfolio management. He developed the first binomial tree model (see Binomial Tree) for option pricing, the gradient method for asset

Sharpe, W.F., Alexander, G.J. & Bailey, J. (1999). Investments, Prentice-Hall.

Related Articles

Capital Asset Pricing Model; Style Analysis; Binomial Tree.

PHELIM BOYLE
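
As a worked illustration of the CAPM relationship (1) and the beta formula (2) in the Sharpe entry above, the following sketch estimates beta from two return series and plugs it into (1). The function names and the sample numbers are hypothetical, and using historical sample moments is itself an assumption: the CAPM is stated in terms of ex ante expectations, so sample estimates are only a stand-in.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def sample_covariance(xs, ys):
    """Unbiased sample covariance of two equal-length return series."""
    mean_x, mean_y = mean(xs), mean(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (len(xs) - 1)


def beta(asset_returns, market_returns):
    """Equation (2): Cov(R_j, R_m) / Var(R_m), estimated from samples."""
    market_variance = sample_covariance(market_returns, market_returns)
    return sample_covariance(asset_returns, market_returns) / market_variance


def capm_expected_return(beta_j, expected_market_return, risk_free_rate):
    """Equation (1) solved for E[R_j]."""
    return risk_free_rate + beta_j * (expected_market_return - risk_free_rate)


if __name__ == "__main__":
    # Made-up per-period returns, for illustration only.
    market = [0.01, -0.02, 0.03, 0.00]
    asset = [0.02, -0.04, 0.06, 0.00]
    b = beta(asset, market)
    print(b, capm_expected_return(b, 0.08, 0.02))
```

In this toy data the asset moves exactly twice as much as the market, so its estimated beta is 2 and its required excess return under (1) is twice the market's excess return.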
Markowitz, Harry

Harry Max Markowitz, born in Chicago in 1927, said in his 1990 Nobel Prize acceptance speech that, as a child, he was unaware of the Great Depression, which caused a generation of investors and noninvestors the world over to mistrust the markets. However, it was a slim, 15-page paper published by Markowitz as a young man that would eventually transform the way people viewed the relationship between risk and return, and that overhauled the way the investment community constructed diversified portfolios of securities.

Markowitz was working on his dissertation in economics at the University of Chicago when his now-famous “Portfolio Selection” paper appeared in the March 1952 issue of the Journal of Finance [1]. He was 25. He went on to win the Nobel Prize in Economic Sciences in 1990 for providing the cornerstone to what came to be known as modern portfolio theory (see Modern Portfolio Theory).

Markowitz shared the Nobel Prize with Merton H. Miller and William F. Sharpe (see Sharpe, William F.), who were recognized, respectively, for their work on how firms’ capital structure and dividend policy affect their stock price, and the development of the capital asset pricing model, which presents a way to measure the riskiness of a stock relative to the performance of the stock market as a whole. Together, the three redefined the way investors thought about the investment process, and created the field of financial economics. Markowitz, whose work built on earlier work on diversification by Yale University’s James Tobin, who received a Nobel Prize in 1981, was teaching at Baruch College at the City University of New York when he won the Nobel at the age of 63.

Markowitz received a bachelor of philosophy in 1947 and a PhD in economics in 1955, both from the University of Chicago. Years later he said that when he decided to study economics, his philosophical interests drew him toward the “economics of uncertainty”. At Chicago, he studied with Milton Friedman, Jacob Marschak, Leonard Savage, and Tjalling Koopmans, and became a student member of the famed Cowles Commission for Research in Economics (which moved to Yale University in 1955 and was renamed the Cowles Foundation).

The now-landmark 1952 “Portfolio Selection” paper skipped over the problem of selecting individual stocks and focused instead on how a manager or investor selects a portfolio best suited to the individual’s risk and return preferences. Pre-Markowitz, diversification was considered important, but there was no framework to determine how diversified a portfolio was or how an investor could create a well-diversified portfolio.

Keeping in mind that “diversification is both observed and sensible,” the paper began from the premise that investors consider expected return a “desirable thing” and risk an “undesirable thing”. Markowitz’s first insight was to look at a portfolio’s risk as the variance of its returns. This offered a way to quantify investment risk that previously had not existed. He then perceived that a portfolio’s riskiness depended not just on the expected returns and variances of the individual assets but also on the correlations between the assets in the portfolio. For Markowitz, the wisdom of diversification was not simply a matter of holding a large number of different securities, but of holding securities whose value did not rise and fall in tandem with one another. “It is necessary to avoid investing in securities with high covariances among themselves,” he stated in the paper. Investing in companies in different industries, for instance, increased a portfolio’s diversification and, paradoxically, improved the portfolio’s risk-return trade-off by reducing its variance.

Markowitz’s paper laid out a mathematical theory for deriving the set of optimal portfolios based on their risk-return characteristics. Markowitz showed how mean-variance analysis could be used to find a set of securities whose risk-return combinations were deemed “efficient”. Markowitz referred to this as the expected returns–variance of returns rule (E-V rule). The range of possible risk–return combinations yielded what Markowitz described as efficient and inefficient portfolios, an idea he based on Koopmans’ notion that there are efficient and inefficient allocations of resources [3]. Koopmans, at the time, was one of Markowitz’s professors. Markowitz’s notion of efficient portfolios was subsequently called the efficient frontier. “Not only does the E-V hypothesis imply diversification, it implies the ‘right kind’ of diversification for the ‘right reason,’” Markowitz wrote. The optimal portfolio was the one that would provide the minimum risk for a
given expected return, or the highest expected return for a given level of risk. An investor would select the portfolio whose risk-return characteristics he preferred.

It has been said many times over the years that Markowitz’s portfolio theory provided, at long last, the math behind the adage “Don’t put all your eggs in one basket.” In 1988, Sharpe said of Markowitz’s portfolio selection concept: “I liked the parsimony, the beauty, of it. . . . I loved the mathematics. It was simple but elegant. It had all the aesthetic qualities that a model builder likes” [5].

Back in 1952, Markowitz already knew the practical value of the E-V rule he had crafted. It functioned, his paper noted, both “as a hypothesis to explain well-established investment behavior and as a maxim to guide one’s own action.” However, Markowitz’s insight was deeper. The E-V rule enabled the investment management profession to distinguish between investment and speculative behavior, which helped fuel the gradual institutionalization of the investment management profession. In the wake of Markowitz’s ideas, investment managers could strive to build portfolios that were not simply groupings of speculative stocks but well-diversified sets of securities designed to meet the risk-return expectations of investors pursuing clear investment goals.

Markowitz’s ideas gained traction slowly, but within a decade investment managers were turning to Markowitz’s theory of portfolio selection (see Modern Portfolio Theory) to help them determine how to select portfolios of diversified securities. This occurred as institutional investors in the United States were casting around for ways to structure portfolios that relied more on analytics and less on relationships with brokers and bankers. In the intervening years, Markowitz expanded his groundbreaking work. In 1956, he published the Critical Line Algorithm, which explained how to compute the efficient frontier for portfolios with large numbers of securities subject to constraints. In 1959, he published Portfolio Selection: Efficient Diversification of Investments, which delved further into the subject and explored the relationship between his mean-variance analysis and the fundamental theories of action under uncertainty of John von Neumann and Oskar Morgenstern, and of Leonard J. Savage [2].

However, while Markowitz is most widely known for his work in portfolio theory, he has said that he values another prize he received more than the Nobel: the von Neumann Prize in operations research theory. That prize, he said, recognized the three main research areas that have defined his career. Markowitz received the von Neumann prize in 1989 from the Operations Research Society of America and the Institute of Management Sciences (now combined as INFORMS) for his work on portfolio theory, sparse matrix techniques, and the high-level simulation programming language SIMSCRIPT.

After Chicago, Markowitz went to the RAND Corp. in Santa Monica, CA, where he worked with a group of economists on linear programming techniques. In the mid-1950s, he developed sparse matrix methods, a technique to solve large mathematical optimization problems. Toward the end of the decade, he went to General Electric to build models of manufacturing plants in the company’s manufacturing services department. After returning to RAND in 1961, he and his team developed a high-level programming language for simulations called SIMSCRIPT to support Air Force projects that involved simulation models. The language was published in 1962. The same year, Markowitz and former colleague Herb Karr formed CACI, the California Analysis Center Inc. The firm later changed its name to Consolidated Analysis Centers Inc. and became a publicly traded company that provided IT services to the government and intelligence community. It is now called CACI International.

Markowitz’s career has ranged across academia, research, and business. He worked in the money management industry as president of Arbitrage Management Company from 1969 to 1972. From 1974 until 1983, Markowitz was at IBM’s T.J. Watson Research Center in Yorktown Heights, NY. He has taught at the University of California at Los Angeles, Baruch College and, since 1994, at the University of California at San Diego. He continues to teach at UC-San Diego and is an academic consultant to Index Fund Advisors, a financial services firm that provides low-cost index funds to investors.

In the fall of 2008 and subsequent winter, Markowitz’s landmark portfolio theory came under harsh criticism in the lay press as all asset classes declined together. Markowitz, however, argued that the credit crisis and ensuing losses highlighted the benefits of diversification and exposed the risks in
not understanding, or in misunderstanding, the correlations between assets in a portfolio. “Portfolio theory was not invalidated, it was validated,” he noted in a 2009 interview with Index Fund Advisors [4]. He has said numerous times over the years that there are no “shortcuts” to understanding the trade-off between risk and return. “US portfolio theorists do not talk about risk control,” he said in that interview. “It sounds like you can control risk. You can’t.” “But diversification,” he continued, “is the next best thing.”

References

[1] Markowitz, H.M. (1952). Portfolio selection, Journal of Finance 7, 77–91.
[2] Markowitz, H.M. (1959). Portfolio Selection: Efficient Diversification of Investments, John Wiley & Sons, New York.
[3] Markowitz, H.M. (2002). An Interview with Harry Markowitz by Jeffrey R. Yost, Charles Babbage Institute, University of Minnesota, Minneapolis, MN.
[4] Markowitz, H.M. (2009). An Interview with Harry M. Markowitz by Mark Hebner, Index Fund Advisors, Irvine, CA.
[5] Sharpe, W.F. (1988). Revisiting the Capital Asset Pricing Model, an interview by Jonathan Burton. Dow Jones Asset Manager, May/June, 20–28.

Related Articles

Modern Portfolio Theory; Risk–Return Analysis; Sharpe, William F.

NINA MEHTA
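
The point made in the Markowitz entry above, that a portfolio's risk depends on the correlations between its assets and not only on their individual variances, can be checked with a small calculation. The sketch below is an illustration under stated assumptions, not Markowitz's own procedure: the helper names, the two-asset setup, and the numbers are hypothetical, and it evaluates a fixed set of weights instead of optimizing them.

```python
import math


def portfolio_mean(weights, means):
    """Expected portfolio return: the weighted average of asset means."""
    return sum(w * m for w, m in zip(weights, means))


def portfolio_variance(weights, cov):
    """Portfolio variance w' C w for a covariance matrix given as nested lists."""
    n = len(weights)
    return sum(
        weights[i] * weights[j] * cov[i][j] for i in range(n) for j in range(n)
    )


def two_asset_cov(sigma_1, sigma_2, rho):
    """Covariance matrix of two assets with volatilities sigma_1, sigma_2
    and correlation rho."""
    cross = rho * sigma_1 * sigma_2
    return [[sigma_1 ** 2, cross], [cross, sigma_2 ** 2]]


if __name__ == "__main__":
    weights = [0.5, 0.5]
    means = [0.08, 0.08]  # hypothetical expected returns
    for rho in (1.0, 0.5, 0.0, -0.5):
        variance = portfolio_variance(weights, two_asset_cov(0.2, 0.2, rho))
        print(rho, portfolio_mean(weights, means), math.sqrt(variance))
```

With these inputs the expected return is the same for every value of the correlation, while the portfolio's standard deviation falls as the correlation falls. That is the sense in which holding securities that do not move in tandem gives the “right kind” of diversification.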
Merton, Robert C.

Robert C. Merton is the John and Natty McArthur University Professor at Harvard Business School. In 1966, he earned a BS in engineering mathematics from Columbia University, where he published his first paper, “The ‘Motionless’ Motion of Swift’s Flying Island”, in the Journal of the History of Ideas [4]. He then went on to pursue graduate studies in applied mathematics at the California Institute of Technology, leaving the institution with an MS in 1967. He obtained a PhD in economics in 1970 from the Massachusetts Institute of Technology, where he worked under the Nobel laureate Paul A. Samuelson (see Samuelson, Paul A.). His dissertation was entitled “Analytical Optimal Control Theory as Applied to Stochastic and Nonstochastic Economics.” Prior to joining Harvard in 1988, Merton served on the finance faculty of the Massachusetts Institute of Technology.

In 1997, Merton shared the Nobel Prize in Economic Sciences with Myron Scholes “for a new method to determine the value of derivatives”. Merton taught himself stochastic dynamic programming and Ito calculus during graduate school at the Massachusetts Institute of Technology and subsequently introduced Ito calculus (see Stochastic Integrals) into finance and economics. Continuous-time stochastic calculus has become a cornerstone of mathematical finance, and more than anyone Merton is responsible for making manifest the mathematical tool’s power in financial modeling and applications. Merton has also produced highly regarded work on dynamic models of optimal lifetime consumption and portfolio selection, equilibrium asset pricing, contingent-claim analysis, and financial systems. Merton’s monograph “Continuous-Time Finance” [8] is a classic introduction to these topics.

Merton proposed an intertemporal capital asset pricing model (ICAPM) [6] (see Capital Asset Pricing Model), a model empirically more attractive than the single-period capital asset pricing model (CAPM) (see Capital Asset Pricing Model). Assuming continuous-time stochastic processes with continuous decision-making and trading, Merton showed that mean–variance portfolio choice is optimal at each moment of time. It explained when and how the CAPM could hold in a dynamic setting. As an extension, Merton looked at the case when the set of investment opportunities is stochastic and evolves over time. Investors hold a portfolio to hedge against shifts in the opportunity set of security returns. This implies that investors are compensated in the expected return for bearing the risk of shifts in the opportunity set of security returns, in addition to bearing market risk. Because of this additional compensation in expected return, in equilibrium, expected returns on risky assets may differ from the riskless expected return even when they have no market risk. Through this work, we obtain an empirically more useful version of the CAPM that allows for multiple risk factors. Merton’s ICAPM predated many subsequently published multifactor models like the arbitrage pricing theory [11] (see Arbitrage Pricing Theory).

Merton’s work in the 1970s laid the foundation for modern derivative pricing theory (see Option Pricing: General Principles). His paper “Theory of Rational Option Pricing” [5] is one of the two classic papers on derivative pricing that led to the Black–Scholes–Merton option pricing theory (see Black–Scholes Formula). Merton’s essential contribution was his hedging (see Hedging) argument for option pricing based on no arbitrage; he showed that one can use the prescribed dynamic trading strategy under Black–Scholes [1] to offset the risk exposure of an option and obtain a perfect hedge under the continuous trading limit. In other words, he discovered how to construct a “synthetic option” using continual revision of a “self-financing” portfolio involving the underlying asset and riskless borrowing to replicate the expiration-date payoff of the option. And no arbitrage dictates that the cost of constructing this synthetic option must give the price of the option even if it does not exist. This seminal paper also extended the Black–Scholes model to allow for predictably changing interest rates, dividend payments on the underlying asset, changing exercise price, and early exercise for American options. Merton also produced “perhaps the first closed-form formula for an exotic option” [12]. Merton’s approach to derivative securities provided the intellectual basis for the rise of the profession of financial engineering.

The Merton model (see Structural Default Risk Models) refers to an increasingly popular structural credit risk model introduced by Merton [7] in the early 1970s. Drawing on the insight that the payoff
2 Merton, Robert C.
structure of the leveraged equity of a firm is identical to that of a call option (see Call Options) on the market value of the assets of the whole firm, Merton proposed that the leveraged equity of a firm could be valued as if it were a call option on the assets of the whole firm. The isomorphic (same payoff structure) price relation between the leveraged equity of a firm and a call option allows one to apply the Black–Scholes–Merton contingent-claim pricing model to value the equities [7]. The value for the corporate debt could then be obtained by subtracting the value of the option-type structure that the leveraged equity represents from the total market value of the assets. Merton’s methodology offered a way to obtain valuation functions for the equity and debt of a firm, a measure of the risk of the debt, as well as all the Greeks of contingent-claim pricing. The Merton model provided a useful basis for valuing and assessing corporate debt, its risk, and the sensitivity of debt value to various parameters (e.g., the delta gives the sensitivity of either debt value or equity value to change in asset value). Commercial versions of the Merton model include the KMV model and the Jarrow–Turnbull model.

Since the 1990s, Merton collaborated with Zvi Bodie, Professor of Finance at Boston University, to develop a new line of research on the financial system [2, 9, 10]. They adopted a functional perspective, “similar in spirit to the functional approach in sociology pioneered by Robert K. Merton (1957)” [3, 9]. By focusing on the underlying functions of financial systems, the functional perspective takes functions rather than institutions and forms as the conceptual anchor in its analysis of financial institutional change over time and contemporaneous institutional differences across borders. The functional perspective is also useful for predicting and guiding financial institutional change. The existing approaches of neoclassical, institutional, and behavioral theories in economics are taken as complementary in the functional approach to understand financial systems.

Merton had made significant contributions to finance across a broad spectrum and they are too numerous to mention exhaustively. His other works include those on Markowitz–Sharpe-type models with investors with homogeneous beliefs but with incomplete information about securities, the use of jump-diffusion models (see Jump-diffusion Models) in option pricing, valuation of market forecasts, pension reforms, and employee stock options (see Employee Stock Options).

In addition to his academic duties, Merton has also been a partner of the now defunct hedge fund Long-Term Capital Management (see Long-Term Capital Management) and is currently Chief Scientific Officer at the Trinsum Group.

References

[1] Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, Journal of Political Economy 81(3), 637–659.
[2] Crane, D., Froot, K., Mason, S., Perold, A., Merton, R.C., Bodie, Z., Sirri, E. & Tufano, P. (1995). The Global Financial System: A Functional Perspective, Harvard Business School Press, Boston, MA.
[3] Merton, R.K. (1957). Social Theory and Social Structure, revised and enlarged edition, The Free Press, Glencoe, IL.
[4] Merton, R.C. (1966). The “Motionless” Motion of Swift’s flying island, Journal of the History of Ideas 27, 275–277.
[5] Merton, R.C. (1973). Theory of rational option pricing, Bell Journal of Economics and Management Science 4(1), 141–183.
[6] Merton, R.C. (1973). An intertemporal capital asset pricing model, Econometrica 41(5), 867–887.
[7] Merton, R.C. (1974). On the pricing of corporate debt: the risk structure of interest rates, Journal of Finance 29(2), 449–470.
[8] Merton, R.C. (1990). Continuous-Time Finance, Blackwell, Malden, MA.
[9] Merton, R.C. & Bodie, Z. (1995). A conceptual framework for analyzing the financial system. Chapter 1 in The Global Financial System: A Functional Perspective, D. Crane, K. Froot, S. Mason, A. Perold, R. Merton, Z. Bodie, E. Sirri, & P. Tufano, eds, Harvard Business School Press, Boston, MA, pp. 3–31.
[10] Merton, R.C. & Bodie, Z. (2005). Design of financial systems: towards a synthesis of function and structure, Journal of Investment Management 3(1), 1–23.
[11] Ross, S. (1976). The arbitrage theory of capital asset pricing, Journal of Economic Theory 13(3), 341–360.
[12] Rubinstein, M. (2006). A History of the Theory of Investments, John Wiley & Sons, Hoboken, NJ, p. 240.
other merchants seeking to avoid the risks of carrying significant sums of money over long distance, making a local payment in exchange for a disbursement of the local currency in a different location. The basic cash-and-carry arbitrage is complicated by the presence of different payment locations and currency units. The significant risk of delivery failure or nonpayment was controlled through the close-knit organizational structure of the merchant networks [7]. These same networks provided information on changing prices in different regions that could be used in geographical goods arbitrage.

The gradual introduction of standardized coinage starting around 650 BC expanded available arbitraging opportunities to include geographical arbitrage of physical coins to exploit differing exchange ratios [6, p.19–20]. For example, during the era of the Athenian empire (480–404 BC), Persia maintained a bimetallic coinage system where silver was undervalued relative to gold. The resulting export of silver coins from Persia to Greece and elsewhere in the Mediterranean is an early instance of a type of arbitrage activity that became a mainstay of the arbitrageur in later years. This type of arbitrage trading was confined to money changers with the special skills and tools to measure the bullion value of coins. In addition to the costs and risks of transportation, the arbitrage was restricted by the seigniorage and minting charges levied in the different political jurisdictions. Because coinage was exchanged by weight and trading by bills of exchange was rudimentary, there were no arbitrageurs specializing solely in “arbitrating of exchange rates”. Rather, arbitrage opportunities arose from the trading activities of networks of merchants and money changers. These opportunities included uncovered interest arbitrage between areas with low interest rates, such as Jewish Palestine, and those with high rates, such as Babylonia [6, p.18–19].

Evolution of the Bill of Exchange

Though the precise origin of the practice is unknown, “arbitration of exchange” first developed during the Middle Ages. Around the time of the First Crusade, Genoa had emerged as a major sea power and important trading center. The Genoa fairs had become economic and financial events important enough to attract traders from around the Mediterranean. To deal with the problems of reconciling transactions using different coinages and units of account, a forum for arbitrating exchange rates was introduced. On the third day of each fair, a representative body composed of recognized merchant bankers would assemble and determine the exchange rates that would prevail for that fair. The process involved each banker suggesting an exchange rate and, after some discussion, a voting process would determine the exchange rates that would apply at that fair. Similar practices were adopted at other important fairs later in the Middle Ages. At Lyon, for example, Florentine, Genoese, and Lucca bankers would meet separately to determine rates, with the average of these group rates becoming the official rate. These rates would then apply to bill transactions and other business conducted at the fair. Rates typically stayed constant between fairs in a particular location, providing the opportunity for arbitraging of exchange rates across fairs in different locations.

From ancient beginnings involving commodity transactions of merchants, the bill of exchange evolved during the Middle Ages to address the difficulties of using specie or bullion to conduct foreign exchange transactions in different geographical locations. In general, a bill of exchange contract involved four persons and two payments. The bill is created when a “deliverer” exchanges domestic cash money for a bill issued by a “taker”. The issued bill of exchange is drawn on a correspondent or agent of the taker who is situated abroad. The correspondent, the “payer”, is required to pay a stated amount of foreign cash money to the “payee”, to whom the bill is made payable. Consider the precise text of an actual bill of exchange from the early seventeenth century that appeared just prior to the introduction of negotiability [28, p.123]:

    March 14, 1611
    In London for £69.15.7 at 33.9
    At half usance pay by this first of exchange to Francesco Rois Serra sixty-nine pounds, fifteen shillings, and seven pence sterling at thirty-three shillings and nine pence groat per £ sterling, value [received] from Master Francesco Pinto de Britto, and put it into our account, God be with you.
    Giovanni Calandrini and Filippo Burlamachi
    Accepted
    [On the back:] To Balthasar Andrea in Antwerp
    First 117.15.0 [pounds groat]
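The amounts on the bill can be checked with a short calculation. The sketch below is only an illustration: it assumes 20 shillings to the pound and 12 pence to the shilling for both sterling and pounds groat, reads "33.9" as 33 shillings 9 pence groat per £ sterling, and drops any fraction of a penny (the source does not state the rounding convention). The function names are hypothetical.

```python
import math
from fractions import Fraction

def lsd_to_pounds(pounds, shillings, pence):
    """Convert pounds.shillings.pence to a fraction of a pound (assumes 20s = 1 pound, 12d = 1s)."""
    return Fraction(pounds) + Fraction(shillings, 20) + Fraction(pence, 240)

def pounds_to_lsd(amount):
    """Split a fractional pound amount into (pounds, shillings, pence), dropping any fraction of a penny."""
    total_pence = math.floor(amount * 240)
    pounds, remainder = divmod(total_pence, 240)
    shillings, pence = divmod(remainder, 12)
    return pounds, shillings, pence

# Face value of the bill: 69 pounds 15 shillings 7 pence sterling.
sterling = lsd_to_pounds(69, 15, 7)

# Exchange rate: 33 shillings 9 pence groat per pound sterling, expressed in shillings groat.
rate_shillings_groat = Fraction(33) + Fraction(9, 12)

# Amount payable in Antwerp, converted from shillings groat to pounds groat.
payable_pounds_groat = sterling * rate_shillings_groat / 20

print(pounds_to_lsd(payable_pounds_groat))
```

Under these assumptions the result agrees with the 117.15.0 pounds groat written on the back of the bill.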
Arbitrage: Historical Perspectives 3
The essential features of the bill of exchange all appear here: the four separate parties; the final payment being made in a different location from the original payment; and the element of currency exchange. “Usance” is the period of time, set by custom, before a bill of exchange could be redeemed at its destination. For example, usance was 3 months between Italy and London and 4 weeks between Holland and London. The practice of issuing bills at usance, as opposed to specifying any number of days to maturity, did not disappear until the nineteenth century [34, p.7].

Commercial and financial activities in the Middle Ages were profoundly impacted by Church doctrine and arbitrage trading was no exception. Exchange rates determined for a given fair would have to be roughly consistent with triangular arbitrage to avoid Church sanctions. In addition, the Church usury prohibition impacted the payment of interest on money loans. Because foreign exchange transactions were licit under canon law, it was possible to disguise the payment of interest in a combination of bill of exchange transactions referred to as dry exchange or fictitious exchange [13, p.380–381], [17, 26]. The associated exchange and re-exchange of bills was a risky set of transactions that could be covertly used to invest money balances or to borrow funds to finance the contractual obligations. The expansion of bill trading for financial purposes combined with the variation in the exchange rates obtained at fairs in different locations provided the opportunity of geographical arbitrage of exchange rates using bills of exchange. It was this financial practice of exploiting differences in bill exchange rates between financial centers that evolved into the “arbitration of exchange” identified by la Porte [22], Savary [24], and Postlethwayt [30] in the eighteenth century.

The bill of exchange contract evolved over time to meet the requirements of merchant bankers. As monetary units became based on coinage with specific bullion content, the relationship between exchange rates in different geographical locations for bills of exchange, coinage, and physical bullion became the mainstay of traders involved in “arbitration of exchange”. Until the development of the “inland” bill in the early seventeenth century in England, all bills of exchange involved some form of foreign exchange trading, and hence the name bill of exchange. Contractual features of the bill of exchange, such as negotiability and priority of claim, evolved over time producing a number of different contractual variations [9, 15, 26]. The market for bills of exchange also went through a number of different stages. At the largest and most strategic medieval fairs, financial activities, especially settlement and creation of bills of exchange, came to dominate the trading in goods [27]. By the sixteenth century, bourses such as the Antwerp Exchange were replacing the fairs as the key international venues for bill trading.

Arbitrage in Coinage and Bullion

Arbitrage trading in coins and bullion can be traced to ancient times. Reflecting the importance of the activity to ordinary merchants in the Middle Ages, methods of determining the bullion content of coins from assay results, and rates of exchange between coins once bullion content had been determined, formed a substantial part of important commercial arithmetics, such as the Triparty (1484) of Nicolas Chuquet [2]. The complications involved in trading without a standardized unit of account were imposing. There were a sizable number of political jurisdictions that minted coins, each with distinct characteristics and weights [14]. Different metals and combinations of metals were used to mint coinage. The value of silver coins, the type of coins most commonly used for ordinary transactions, was constantly changing because of debasement and “clipping”. Over time, significant changes in the relative supply of gold and silver, especially due to inflows from the New World, altered the relative values of bullion. As a result, merchants in a particular political jurisdiction were reluctant to accept foreign coinage at the par value set by the originating jurisdiction. It was common practice for foreign coinage to be assayed and a value set by the mint conducting the assay. Over time, this led to considerable market pressures to develop a unit of account that would alleviate the expensive and time-consuming practice of determining coinage value.

An important step in the development of such a standardized unit of account occurred in 1284 when the Doge of Venice began minting the gold ducat: a coin weighing about 3.5 g and struck in 0.986 gold. While ducats did circulate, the primary function was as a trade coin. Over time, the ducat was adopted as a standard for gold coins in other countries, including other Italian city states, Spain,
Austria, the German city states, France, Switzerland, and England. Holland first issued a ducat in 1487 and, as a consequence of the global trading power of Holland in the sixteenth and seventeenth centuries, the ducat became the primary trade coin for the world. Unlike similar coins such as the florin and guinea, the ducat specifications of about 3.5 g of 0.986 gold did not change over time. The use of mint parities for specific coins and market prices for others did result in the gold–silver exchange ratio differing across jurisdictions. For example, in 1688, the Amsterdam gold–silver ratio for the silver rixdollar mint price and gold ducat market price was 14.93 and, in London, the mint price ratio was 15.58 for the silver shilling and gold guinea [25, p.475]. Given transport and other costs of moving bullion, such gold/silver price ratio differences were not usually sufficient to generate significant bullion flows. However, combined in trading with bills of exchange, substantial bullion flows did occur from arbitrage trading.

Details of a May 1686 arbitrage by a London goldsmith involving bills of exchange and gold coins are provided by Quinn [25, p.479]. The arbitrage illustrates how the markets for gold, silver, and bills of exchange interacted. At that time, silver was the primary monetary metal used for transactions though gold coins were available. Prior to 1663, when the English Mint introduced milling of coins with serrated edges to prevent clipping, all English coins were “hammered” [20]. The minting technology of hammering coins was little changed from Roman times. The process produced imperfect coins, not milled at the edges, which were only approximately equal in size, weight, and imprint, making altered coins difficult to identify [29, ch.4]. Such coins were susceptible to clipping, resulting in circulating silver coins that were usually under the nominal Mint weight. Despite a number of legislative attempts at remedying the situation, around 1686, the bulk of the circulating coins in England were still hammered silver. The Mint would buy silver and gold by weight in exchange for milled silver shilling coins at a set price per ounce. When the market price of silver rose sufficiently above the mint price, English goldsmiths would melt the milled silver coin issued by the Mint, though it was technically illegal to do so.

In addition to mint prices for silver and gold, there were also market prices for gold and silver. Around 1686, the Mint would issue guineas in exchange for silver shillings at a fixed price (£1.075 = 21s. 6d./oz.). In Amsterdam, the market price for a Dutch gold ducat was 17.5 schellingen (S). Observing that the ducat contained 0.1091 ounces of recoverable gold and the guinea 0.2471 ounces, it follows that 36.87 S could be obtained for £1 if gold was used to effect the exchange. Or, put differently, 1 ducat would produce £0.4746. Because transportation of coins and bullion was expensive, there was a sizable band within which rates on bills of exchange could fluctuate without producing bullion flows. If the (S/£) bill exchange rate rose above the rate of exchange for gold plus transport costs, merchants in Amsterdam seeking funds in London would prefer to send gold rather than buy bills of exchange on London. Merchants in London seeking funds in Amsterdam would buy bills on Amsterdam to benefit from the favorable exchange. Similarly, if the bill exchange rate fell below the rate of exchange for silver plus transport costs, merchants in London would gain by exporting silver to Amsterdam rather than buying a bill on Amsterdam.

To reconstruct the 1686 goldsmith arbitrage, observe that the exchange rate for a 4-week bill in London on Amsterdam at the time of the arbitrage was 37.8 (S/£). Obtaining gold ducats in Holland for £0.4746 and allowing for transport costs of 1.5% and transport time of 1 week produces gold in London for £0.4675. Using this gold to purchase a bill of exchange on Amsterdam produces 17.6715 S in Amsterdam 5 weeks after the trade is initiated, an arbitrage profit of 0.1715 S. Even if the gold can be borrowed in Amsterdam and repaid in silver, the trade is not riskless owing to the transport risk and the possible movement in bill rates before the bill is purchased in London. These costs would be mitigated significantly for a London firm also operating in the bill and bullion market of Amsterdam, as was the case with a number of London goldsmiths. The strength of the pound sterling in the bill market from 1685 to 1688 generated gold inflows to England from this trade higher than in any other four-year period in the seventeenth century [25, p.478]. The subsequent weakening of the pound in the bill market from 1689 until the great recoinage in 1696 led to arbitrage trades switching from producing gold inflows to substantial outflows of silver from melted coins and clipping.
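The arithmetic of the 1686 goldsmith arbitrage can be retraced step by step. The sketch below is an illustration only: the constants are the figures quoted in the text, the function names are hypothetical, and intermediate values are carried at full precision rather than rounded as in the source, so the final profit differs slightly from the printed 0.1715 S.

```python
# Figures quoted in the text for the 1686 example.
GUINEA_PRICE_GBP = 1.075      # mint price of a guinea (21s. 6d.)
GUINEA_GOLD_OZ = 0.2471       # recoverable gold in a guinea, in ounces
DUCAT_PRICE_S = 17.5          # Amsterdam market price of a ducat, in schellingen
DUCAT_GOLD_OZ = 0.1091        # recoverable gold in a ducat, in ounces
BILL_RATE_S_PER_GBP = 37.8    # 4-week bill rate in London on Amsterdam
TRANSPORT_COST = 0.015        # 1.5% cost of moving gold to London

def gold_parity_s_per_gbp():
    """Schellingen obtainable for one pound when gold is used to effect the exchange."""
    gbp_per_oz = GUINEA_PRICE_GBP / GUINEA_GOLD_OZ
    s_per_oz = DUCAT_PRICE_S / DUCAT_GOLD_OZ
    return s_per_oz / gbp_per_oz

def ducat_value_gbp():
    """Sterling value in London of the gold contained in one ducat."""
    return DUCAT_GOLD_OZ * GUINEA_PRICE_GBP / GUINEA_GOLD_OZ

def arbitrage_profit_s_per_ducat():
    """Profit in schellingen per ducat shipped to London and remitted back by bill."""
    net_gbp = ducat_value_gbp() * (1 - TRANSPORT_COST)   # gold landed in London, net of transport
    proceeds_s = net_gbp * BILL_RATE_S_PER_GBP           # bill on Amsterdam bought with that gold
    return proceeds_s - DUCAT_PRICE_S

print(round(gold_parity_s_per_gbp(), 2))
print(round(ducat_value_gbp(), 4))
print(round(arbitrage_profit_s_per_ducat(), 3))
```

The trade is profitable because the bill rate of 37.8 S/£ exceeds the gold parity of roughly 36.87 S/£ by more than the 1.5% transport cost.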
Bill of Exchange Arbitrage

The roots of “arbitration of exchange” can be traced to the transactions of medieval merchant bankers seeking to profit from discrepancies in bill exchange rates across geographical locations [27, 28]. For example, if sterling bills on London were cheaper in Paris than in Bruges, then medieval bankers would profit by selling sterling in Bruges and buying in Paris. The effect of such transactions was to keep all exchange rates roughly in parity with the triangular arbitrage condition. Temporary discrepancies did occur but such trading provided a mechanism of adjustment. The arbitrages were risky even when done entirely with bills of exchange. Owing to the slowness of communications, market conditions could change before bills of exchange reached their destination and the re-exchange could be completed. As late as the sixteenth century, only the Italian merchant bankers, the Fuggers of Augsburg, and a few other houses with correspondents in all banking centers were able to engage actively in arbitrage [28, p.137]. It was not until the eighteenth century that markets for bills were sufficiently developed to permit arbitration of exchange to become standard practice of merchants deciding on the most profitable method of remitting or drawing funds offshore.

The transactions in arbitration of exchange by medieval bankers are complicated by the absence of offsetting cash flows in the locations where bills are bought and sold. In the example above, the purchase of a bill in Paris would require funds, which are generated by the bill sale in Bruges. The profits are realized in London. Merchant bankers would be able to temporarily mitigate the associated geographical fund imbalances with internally generated capital, but re-exchanges or movements of bullion were necessary if imbalances persisted. To be consistent with the spirit of the self-financing element of modern riskless arbitrage, the example of medieval banker arbitrage among Paris, Bruges, and London can be extended to two issuing locations and two payment centers. It is possible for the same location to be used as both the issuing and payment location but that will not be assumed. Let the two issuing locations be, say, Antwerp and Hamburg, with the two payment locations being London and Venice. The basic strategy involves making offsetting bill transactions in the two issuing locations and then matching the settlements in the payment centers.

In the following example, $G is the domestic currency in Hamburg and $A is the domestic currency in Antwerp. The forward exchange rate imbedded in the bill transaction is denoted as F_1 for Ducats/$A; F_2 for Ducats/$G; F_3 for £/$G; and F_4 for £/$A.

In Hamburg
    Acquire $G Q_G using a bill which agrees to pay ($G Q_G F_2) in Venice at time T.
    Deliver the $G Q_G on another bill which agrees to be repaid ($G Q_G F_3) in London at time T.

In Antwerp
    Acquire $A Q_A using a bill which agrees to pay ($A Q_A F_4) in London at time T.
    Deliver the $A Q_A on another bill which agrees to be repaid ($A Q_A F_1) in Venice at time T.

At t = 0, the cash flows from all the bill transactions offset. If the size of the borrowings in the two issuing centers is calculated to produce the same maturity value, in terms of the domestic currencies of the two payment centers, then the profit on the transaction depends on the relative values of the payment center currencies in the issuing centers. If there is sufficient liquidity in the Hamburg and Antwerp bill markets, the banker can generate triangular arbitrage trades designed to profit from discrepancies in bid/offer rates arising in different geographical locations.

To see the precise connection to triangular arbitrage, consider the profit function from the trading strategy. At time T in Venice, the cash flows would provide ($A Q_A F_1) − ($G Q_G F_2). And, in London, the cash flows would provide ($G Q_G F_3) − ($A Q_A F_4). For the intermediary operating in both locations, the resulting profit (π) on the trade would be the sum of the two cash flows:

\[
\pi(T) = (\$A\,Q_A F_1 - \$G\,Q_G F_2) + (\$G\,Q_G F_3 - \$A\,Q_A F_4)
       = \$A\,Q_A (F_1 - F_4) + \$G\,Q_G (F_3 - F_2) \tag{1}
\]
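The profit function can be evaluated numerically as a sanity check. The sketch below is a minimal illustration with made-up rates, not figures from the source; the function name and the auxiliary spot rate `f0` (units of $G per $A, used only to size the two principals equally) are assumptions. Like the formula in the text, it adds the Venice leg (in ducats) and the London leg (in pounds) directly, and it also returns each leg separately.

```python
from fractions import Fraction

def bill_arbitrage_profit(q_a, q_g, f1, f2, f3, f4):
    """Cash flows at time T for the four-bill strategy.

    q_a, q_g : principal amounts in $A (Antwerp) and $G (Hamburg)
    f1, f2   : Ducats per $A and Ducats per $G
    f3, f4   : pounds per $G and pounds per $A
    Returns (Venice leg, London leg, their sum as written in the text).
    """
    venice = q_a * f1 - q_g * f2   # received on the Antwerp bill, paid on the Hamburg bill
    london = q_g * f3 - q_a * f4   # received on the Hamburg bill, paid on the Antwerp bill
    return venice, london, venice + london

# Hypothetical rates chosen so that direct and indirect quotes agree:
f0 = Fraction(2)           # assumed spot rate, $G per $A
f2 = Fraction(1, 2)        # Ducats per $G
f3 = Fraction(1, 4)        # pounds per $G
f1 = f0 * f2               # Ducats per $A implied by the cross rate
f4 = f0 * f3               # pounds per $A implied by the cross rate

q_a = Fraction(100)
q_g = q_a * f0             # equal principal values in the two issuing centers

consistent = bill_arbitrage_profit(q_a, q_g, f1, f2, f3, f4)

# Same trade when the Antwerp quote for ducats is 2% above the cross rate.
mispriced = bill_arbitrage_profit(q_a, q_g, f1 * Fraction(51, 50), f2, f3, f4)

print(consistent, mispriced)
```

With consistent quotes both legs net to zero; when one direct quote departs from its cross rate, the surplus appears in the payment center where that currency settles.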
Constructing the principal values of the two transactions to be of equal value now permits the substitution of Q_G = Q_A ($G/$A), where ($G/$A) = F_0 is the prevailing exchange rate between $G and $A:

\[
\pi(T) = \$A\,Q_A \left[ (F_1 - F_0 F_2) - (F_4 - F_0 F_3) \right]
       = \$A\,Q_A \left[ \left( \frac{\text{Ducats}}{\$A} - \frac{\$G}{\$A}\,\frac{\text{Ducats}}{\$G} \right) - \left( \frac{\text{£}}{\$A} - \frac{\$G}{\$A}\,\frac{\text{£}}{\$G} \right) \right] \tag{2}
\]

The two values in brackets will be zero if triangular arbitrage holds for both currencies. If the direct and indirect exchange rates for one of the currencies are not consistent with triangular arbitrage, then the banker can obtain a self-financing arbitrage profit.

Arbitration of Exchange

By the eighteenth century, the bill market in key financial centers such as Amsterdam, London, Hamburg, and Paris had developed to the point where merchants as well as bankers could engage in arbitration of exchange to determine the most profitable method of remitting funds to or drawing funds from offshore locations. From a relatively brief treatment in early seventeenth century sources, for example, [13], merchants’ manuals detailing technical aspects of bill trading were available by the beginning of the eighteenth century. The English work by Justice, A General Treatise on Money and Exchanges [9], an expanded translation of an earlier treatise in French by M. Ricard, details the workings of bill transactions, recognizing subtle characteristics in the bill contract. However, as a reflection of the rudimentary state of the English bill market in the early eighteenth century, Justice did not approve of “drawing bills upon one country payable in another” due to the “difference in the Laws of Exchange, in different countries” giving rise to “a great many inconveniences” [9, p.28]. As the eighteenth century progressed, there was substantial growth in the breadth and depth of the bill market supported by increases in speed of communication between key financial centers with London emerging as the focal point [16, 31]. This progress was reflected in the increasingly sophisticated treatment of arbitration of exchange in merchants’ manuals.

Merchants’ manuals of the eighteenth and nineteenth centuries typically present arbitration of exchange from the perspective of a merchant engaged in transferring funds. In some sources, self-financing arbitrage opportunities created by combining remitting and drawing opportunities are identified. Discussions of the practice invariably involve calculations of the “arbitrated rates”. Earlier manuals such as the one by Le Moine [11] only provide a few basic calculations aimed to illustrate the transactions involved. The expanded treatment in Postlethwayt [24] provides a number of worked calculations. In one example, exchange rates at London are given as London–Paris 31 3/4 pence sterling for 1 French crown; London–Amsterdam as 240 pence sterling for 414 groats. Worked calculations are given for the problem “What is the proportional arbitrated price between Amsterdam and Paris?” Considerable effort is given to show the arithmetic involved in determining this arbitrated rate as 54 123/160 groat for 1 crown. Using this calculated arbitrated exchange rate and the already known actual London–Paris rate, Postlethwayt then proceeds to determine the arbitrated rate for London–Amsterdam using these exchange rates for Paris–London and Paris–Amsterdam, finding that it equals 240 pence sterling for 414 groats.

Having shown how to determine arbitrated rates, Postlethwayt provides worked examples of appropriate arbitrage trades when the actual exchange rate is above or below the arbitrated rate. For example, when the arbitrated Amsterdam–Paris rate is above the actual rate, calculations are provided to demonstrate that drawing sterling in London by selling a bill on Paris, using the funds to buy a bill on Amsterdam and then exchanging the guilders/groats received in Amsterdam at the actual rate to cover the crown liability in Paris will produce a self-financing arbitrage profit. Similarly, when the arbitrated Amsterdam–Paris rate is below the actual rate, the trades in the arbitrage involve drawing sterling in London by selling a bill on Amsterdam, using the funds to buy a bill on Paris and then exchanging at the actual Amsterdam–Paris exchange rate the crowns received in Paris to cover the guilder liability. This is similar to the risky medieval banker arbitrage where the rate on re-exchange is uncertain. Though the actual rate is assumed to be known, in practice, this rate could change over the time period it takes to settle the relevant bill transactions. However, the degree of risk
facing the medieval banker was mitigated by the eighteenth century due to the considerably increased speed of communication between centers and subsequent developments in the bill contract, such as negotiability and priority of claim.

Earlier writers on arbitration of exchange, such as Postlethwayt, accurately portrayed the concept but did not adequately detail all costs involved in the transactions. By the nineteenth century, merchants’ manuals such as [34] accurately described the range of adjustments required for the actual execution of the trades. Taking the perspective of a London merchant with sterling seeking to create a fund of francs in Paris, a difference is recognized between two methods of determining the direct rate of exchange: buying a bill in the London market for payment in Paris; or having correspondents in Paris issue for francs a bill for sterling payment in London. In comparing with the arbitrated rates, the more advantageous direct rate is used. In determining direct rates, 3-month bill exchange rates are used even though the trade is of shorter duration. These rates are then adjusted to “short” rates to account for the interest factor. Arbitrated rates are calculated and, in comparing with direct rates, an additional brokerage charge (plus postage) is deducted from the indirect trade due to the extra transaction involved, for example, a London merchant buys a bill for payment in Frankfurt, which is then sold in Paris. No commissions are charged as it is assumed that the trade is done “between branches of the same house, or on joint account” [34, p.98].

Arbitrage in Securities and Commodities

Arbitrage involving bills of exchange survives in modern times in the foreign exchange swap trades of international banks. Though this arbitrage is of central historical importance, it attracts less attention now than a range of arbitrage activities involving securities and commodities that benefited from the financial and derivative security market developments of the nineteenth century. Interexchange and geographical arbitrages were facilitated by developments in communication. The invention of the telegraph in 1844 permitted geographical arbitrage in stocks and shares between London and the provincial stock exchanges by the 1850s. This trade was referred to as shunting. In 1866, Europe and America were linked by cable, significantly enhancing the speed at which price discrepancies across international markets could be identified. Telegraph technology allowed the introduction of the stock market ticker in 1867. Opportunity for arbitraging differences in the prices of securities across markets was further aided by expansion of the number and variety of stocks and shares, many of which were interlisted on different regional and international exchanges. (Where applicable, the nineteenth century convention of referring to fixed-income securities as stocks and common stocks as shares will be used.) For example, after 1873 arbitraging the share price of Rio Tinto between the London and Paris stock exchanges was a popular trade.

Cohn [3, p.3] attributes “the enormous increase in business on the London Stock Exchange within the last few years” to the development of “Arbitrage transactions between London and Continental Bourses”. In addition to various government bond issues, available securities liquid enough for arbitrage trading included numerous railway securities that appeared around the middle of the century. For example, both Haupt [8] and Cohn [3] specifically identify over a dozen securities traded in Amsterdam that were sufficiently liquid to be available for arbitrage with London. Included on both lists are securities as diverse as the Illinois and Erie Railway shares and the Austrian government silver loan. Securities of mines and banks increased in importance as the century progressed. The expansion in railway securities, particularly during the US consolidations of the 1860s, led to the introduction of traded contingencies associated with these securities such as rights issues, warrant options, and convertible securities. Weinstein [33] identifies this development as the beginning of arbitrage in equivalent securities, which, in modern times, encompasses convertible bond arbitrage and municipal bond arbitrage. However, early eighteenth century English and French subscription shares do have a similar claim [32]. Increased liquidity in the share market provided increased opportunities for option trading in stocks and shares.

Also during the nineteenth century, trading in “time bargains” evolved with the commencement of trading in such contracts for agricultural commodities on the Chicago Board of Trade in 1851. While initially structured as forward contracts, adoption of the General Rules of the Board of Trade in 1865 laid a foundation for trading of modern
futures contracts. Securities and contracts with contingencies have a history stretching to ancient times when trading was often done using samples and merchandise contracts had to allow for time to delivery and the possibility that the sample was not representative of the delivered goods. Such contingencies were embedded in merchandise contracts and were not suited to arbitrage trading. The securitization of such contingencies into forward contracts that are adaptable to cash-and-carry arbitrage trading can be traced to the introduction of “to arrive” contracts on the Antwerp bourse during the sixteenth century [19, ch.9]. Options trading was a natural development on the trade in time bargains, where buyers could either take delivery or could pay a fixed fee in lieu of delivery. In effect, such forward contracts were bundled with an option contract having the premium paid at delivery.

Unlike arbitration of exchange using bills of exchange, which was widely used and understood by the eighteenth century, arbitrage trades involving options—also known as privileges and premiums—were not. Available sources on such trades conducted in Amsterdam, Joseph de la Vega [21, ch.3] and Isaac da Pinto [19, p.366–377], were written by observers who were not the actual traders, so only crude details of the arbitrage trades are provided. Conversion arbitrages for put and call options, which involve knowledge of put–call parity, are described by both de la Vega and da Pinto. Despite this, prior to the mid-nineteenth century, options trading was a relatively esoteric activity confined to a specialized group of traders. Having attracted passing mention by Cohn [3], Castelli [1, p.2] identifies “the great want of a popular treatise” on options as the reason for undertaking a detailed treatment of mostly speculative option trading strategies. In a brief treatment, Castelli uses put–call parity in an arbitrage trade combining a short position in “Turks 5%” in Constantinople with a written put and purchased call in London. The trade is executed to take advantage of “enormous contangoes collected at Constantinople” [1, p.74–77].

Etymology and Historical Usage

The Oxford International Dictionary [12] defines arbitrage as: “the traffic in bills of exchange drawn on sundry places, and bought or sold in sight of the daily quotations of rates in several markets. Also, the similar traffic in stock.” The initial usage is given as 1881. Reference is also directed to “arbitration of exchange” where the definition is “the determination of the rate of exchange to be obtained between two countries or currencies, when the operation is conducted through a third or several intermediate ones, in order to ascertain the most advantageous method of drawing or remitting bills.” The singular position given to “arbitration of exchange” trading using bills of exchange recognizes the practical importance of these securities in arbitrage activities up to that time. The Oxford International Dictionary definition does not recognize the specific concepts of arbitrage, such as triangular currency arbitrage or interexchange arbitrage, or that such arbitrage trading applies to coinage, bullion, commodities, and shares as well as to trading bills of exchange. There is also no recognition that doing arbitrage with bills of exchange introduces two additional elements not relevant to triangular arbitrage for manual foreign exchange transactions: time and location.

The word “arbitrage” is derived from a Latin root (arbitrari, to give judgment; arbitrio, arbitration) with variants appearing in the Romance languages. Consider the modern Italian variants: arbitraggio is the term for arbitrage; arbitrato is arbitration or umpiring; and arbitrare is to arbitrate. Similarly, for modern French variants, arbitrage is arbitration; arbitrer is to arbitrate a quarrel or to umpire; and arbitre is an arbitrator or umpire. Recognizing that the “arbitration of prices” concept underlying arbitrage predates Roman times, the historical origin where the word arbitrage or a close variant was first used in relation to arbitrating differences in prices is unknown. A possible candidate involves arbitration of exchange rates for different currencies observed at the medieval fairs, around the time of the First Crusade (1100). The dominance of Italian bankers in this era indicates the first usage was the close variant, arbitrio, with the French “arbitrage” coming into usage during the eighteenth century. Religious and social restrictions effectively barred public discussion of the execution and profitability of such banking activities during the Middle Ages, though account books of the merchant banks do remain as evidence that there was significant arbitrage trading.
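The dictionary definition of “arbitration of exchange” quoted above, namely determining the rate obtained between two currencies when the operation runs through a third, can be illustrated with a small calculation. The following is a minimal sketch only: the function names, the proportional `extra_cost` parameter standing in for the additional brokerage and postage, and all numbers are hypothetical and are not taken from the historical sources.

```python
def arbitrated_rate(rate_home_to_mid: float, rate_mid_to_foreign: float) -> float:
    """Implied home->foreign rate when the exchange runs through a third currency.

    Both inputs are quoted as units of the target currency per unit of the
    source currency (an assumption of this sketch).
    """
    return rate_home_to_mid * rate_mid_to_foreign


def best_route(direct_rate: float,
               rate_home_to_mid: float,
               rate_mid_to_foreign: float,
               extra_cost: float = 0.0) -> tuple[str, float]:
    """Compare the direct rate with the arbitrated (indirect) rate.

    extra_cost is a hypothetical proportional charge on the indirect route,
    standing in for the extra brokerage incurred by the second transaction.
    """
    indirect = arbitrated_rate(rate_home_to_mid, rate_mid_to_foreign) * (1.0 - extra_cost)
    if indirect > direct_rate:
        return "indirect", indirect
    return "direct", direct_rate


if __name__ == "__main__":
    # Purely illustrative quotes, not historical data.
    route, rate = best_route(direct_rate=5.90,
                             rate_home_to_mid=2.0,
                             rate_mid_to_foreign=3.0,
                             extra_cost=0.01)
    print(route, round(rate, 4))
```

The sketch ignores the two elements the text singles out for bills of exchange, time and location, so it corresponds more closely to triangular arbitrage in manual foreign exchange than to the historical bill trade.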
As late as the seventeenth century, important English sources on the Law Merchant such as Gerard Malynes, Lex Mercatoria [13], make no reference to arbitrage trading strategies in bills of exchange. In contrast, a similar text in Italian, Il Negotiante (1638) by Giovanni Peri [18], a seventeenth century Italian merchant, has a detailed discussion on exchange dealings. Peri states that profit is the objective of all trade and that the “activity directed to this end is subject to chance, which mocks at every calculation. Yet there is still ample space for reasonable calculation in which the possibility of adverse fortunes is never left out of account” [5, p.327]. This mental activity engaged in the service of business is called arbitrio. Peri identifies a connection between speculation on future exchange rate movements and the arbitrio concept of arbitrage: “the profits from exchange dealings originate in price differences and not in time” with profits turning to losses if re-exchange is unfavorable [18, p.150]. For Peri, the connection between speculation and arbitrage applies to commodities and specie, as well as bills of exchange.

The first published usage of “arbitrage” in discussing the relationship between exchange rates and the most profitable locations for issuing and settling a bill of exchange appears in French in La Science des Négocians et Teneurs de Livres [22, p.452]. From the brief reference in a glossary of terms by de la Porte, a number of French sources, including the section Traité des arbitrages by Mondoteguy in Le Moine, Le Negoce d’Amsterdam [11] and Savary, Dictionnaire Universel de Commerce (1730, 2nd ed.) [30], developed a more detailed presentation of arbitrage transactions involving bills of exchange. An important eighteenth century English source, The Universal Dictionary of Trade and Commerce [24], is an expanded translation of Savary where the French word “arbitrage” is translated into English as “arbitration”. This is consistent with the linguistic convention of referring to arbitration instead of arbitrage found in the earlier English source, The Merchant’s Public Counting House [23]. This led to the common English use of the terms “simple arbitrations”, “compound arbitrations”, and “arbitrated rates”. The practice of using arbitration instead of arbitrage continues into nineteenth century works by Patrick Kelly, The Universal Cambist [10] and William Tate, The Modern Cambist [34]. The latter book went into six editions.

Following the usage of “arbitrage” in German and Dutch works in the 1860s, common usage of “arbitrageur” in English appears with Ottomar Haupt, The London Arbitrageur [8], though reference is still made to “arbitration of exchange” as the activity of the arbitrageur. Haupt produced similar works in German and French that used “arbitrage” to describe the calculation of parity relationships. A pamphlet by Maurice Cohn, The Stock Exchange Arbitrageur [3], describes “arbitrage transactions” between bourses but also uses “arbitration” to refer to calculated parity relationships. Charles Castelli’s The Theory of “Options” in Stocks and Shares [1] concludes with a section on “combination of options with arbitrage operations” where arbitrage has exclusive use and no mention is made of “arbitration” of prices or rates across different locations. Following Arbitrage in Bullion, Coins, Bills, Stocks, Shares and Options by Henry Deutsch [4], “arbitration of exchange” is no longer commonly used.

References

[1] Castelli, C. (1877). The Theory of “Options” in Stocks and Shares, F. Mathieson, London.
[2] Chuquet, N. (1484, 1985). Triparty, in Nicolas Chuquet, Renaissance Mathematician, G. Flegg, C. Hay & B. Moss, eds, D. Reidel Publishing, Boston.
[3] Cohn, M. (1874). The London Stock Exchange in Relation with the Foreign Bourses. The Stock Exchange Arbitrageur, Effingham Wilson, London.
[4] Deutsch, H. (1904, 1933). Arbitrage in Bullion, Coins, Bills, Stocks, Shares and Options, 3rd Edition, Effingham Wilson, London.
[5] Ehrenberg, R. (1928). Capital and Finance in the Age of the Renaissance, translated from the German by H. Lucas, Jonathan Cape, London.
[6] Einzig, P. (1964). The History of Foreign Exchange, 2nd Edition, Macmillan, London.
[7] Greif, A. (1989). Reputation and coalitions in medieval trade: evidence on the Maghribi Traders, Journal of Economic History 49, 857–882.
[8] Haupt, O. (1870). The London Arbitrageur; or, the English Money Market in connexion with foreign Bourses. A Collection of Notes and Formulae for the Arbitration of Bills, Stocks, Shares, Bullion and Coins, with all the Important Foreign Countries, Trubner and Co., London.
[9] Justice, A. (1707). A General Treatise on Monies and Exchanges; in which those of all Trading Nations are Describ’d and Consider’d, S. and J. Sprint, London.
[10] Kelly, P. (1811, 1835). The Universal Cambist and Commercial Instructor; Being a General Treatise on Exchange including the Monies, Coins, Weights and
Measures, of all Trading Nations and Colonies, 2nd Edition, Lackington, Allan and Co., London, 2 Vols.
[11] Le Moine de l’Espine, J. (1710). Le Negoce d’Amsterdam . . . Augmenté d’un Traité des arbitrages & des changes sur les principales villes de l’Europe (by Jacques Mondoteguy), Chez Pierre Brunel, Amsterdam.
[12] Little, W., Fowler, H. & Coulson, J. (1933, 1958). Oxford International Dictionary of the English Language, Leland Publishing, Toronto, revised and edited by C. Onions, 1958.
[13] Malynes, G. (1622, 1979). Consuetudo, vel Lex Mercatoria or The Ancient Law Merchant, Adam Islip, London; reprinted (1979) by Theatrum Orbis Terrarum, Amsterdam.
[14] McCusker, J. (1978). Money and Exchange in Europe and America, 1600–1775, University of North Carolina Press, Chapel Hill NC.
[15] Munro, J. (2000). English ‘Backwardness’ and financial innovations in commerce with the low countries, 14th to 16th centuries, in International Trade in the Low Countries (14th–16th Centuries), P. Stabel, B. Blondé, A. Greve, eds, Garant, Leuven-Apeldoorn, pp. 105–167.
[16] Neal, L. & Quinn, S. (2001). Networks of information, markets, and institutions in the rise of London as a financial centre, 1660–1720, Financial History Review 8, 7–26.
[17] Noonan, J. (1957). The Scholastic Analysis of Usury, Harvard University Press, Cambridge, MA.
[18] Peri, G. (1638, 1707). Il Negotiante, Giacomo Hertz, Venice. (last revised edition 1707).
[19] Poitras, G. (2000). The Early History of Financial Economics, 1478–1776, Edward Elgar, Cheltenham, U.K.
[20] Poitras, G. (2004). William Lowndes, 1652–1724, in Biographical Dictionary of British Economists, R. Donald, ed., Thoemmes Press, Bristol, UK, pp. 699–702.
[21] Poitras, G. (2006). Pioneers of Financial Economics: Contributions Prior to Irving Fisher, Edward Elgar, Cheltenham, UK, Vol. I.
[22] la Porte, M. (1704). La Science des Négocians et Teneurs de Livres, Chez Guillaume Chevelier, Paris.
[23] Postlethwayt, M. (1750). The Merchant’s Public Counting House, John and Paul Napton, London.
[24] Postlethwayt, M. (1751, 1774). The Universal Dictionary of Trade and Commerce, 4th Edition, John and Paul Napton, London.
[25] Quinn, S. (1996). Gold, silver and the glorious revolution: arbitrage between bills of exchange and bullion, Economic History Review 49, 473–490.
[26] de Roover, R. (1944). What is dry exchange? A contribution to the study of English mercantilism, Journal of Political Economy 52, 250–266.
[27] de Roover, R. (1948). Banking and Credit in Medieval Bruges, Harvard University Press, Cambridge, MA.
[28] de Roover, R. (1949). Gresham on Foreign Exchange, Harvard University Press, Cambridge, MA.
[29] Sargent, T. & Velde, F. (2002). The Big Problem of Small Change, Princeton University Press, Princeton, NJ.
[30] Savary des Bruslons, J. (1730). Dictionnaire Universel de Commerce, Chez Jacques Etienne, Paris, Vol. 3.
[31] Schubert, E. (1989). Arbitrage in the foreign exchange markets of London and Amsterdam during the 18th Century, Explorations in Economic History 26, 1–20.
[32] Shea, G. (2007). Understanding financial derivatives during the south sea bubble: the case of the south sea subscription shares, Oxford Economic Papers 59 (Special Issue), 73–104.
[33] Weinstein, M. (1931). Arbitrage in Securities, Harper & Bros, New York.
[34] Tate, W. (1820, 1848). The Modern Cambist: Forming a Manual of Foreign Exchanges, in the Different Operations of Bills of Exchange and Bullion, 6th Edition, Effingham Wilson, London.

GEOFFREY POITRAS
Utility Theory: Historical Perspectives

of the traders—presaged, but did not directly influence, what will become known in economics as the “Marginalist revolution” led by William Jevons [13], Carl Menger [17], and Leon Walras [26].
One of the major features of the expected-utility theory is the separation between the utility function and the resolution of uncertainty, in that equal payoffs in different states of the world yield the same utilities. It has been argued that, while sometimes useful, such a separation is not necessary. An approach in which the utility of a payoff depends not only on its monetary value but also on the state of the world has been proposed. Such an approach has been popularized through the work of Kenneth Arrow [2] (see Arrow, Kenneth) and Gerard Debreu [7], largely because of its versatility and compatibility with general-equilibrium theory where the payoffs are not necessarily monetary. Further successful applications have been made by Roy Radner [20] and many others.

Empirical Paradoxes and Prospect Theory

With the early statistical evidence being mostly anecdotal, many empirical studies have found significant inconsistencies between the observed behavior and the axioms of utility theory. The most influential of these early studies were performed by George Shackle [24], Maurice Allais [1], and Daniel Ellsberg [9]. In 1979, Daniel Kahneman and Amos Tversky [14] proposed “prospect theory” as a psychologically more plausible alternative to the expected utility theory.

Utility in Financial Theory

The general notion of a numerical value associated with a risky payoff was introduced to finance by Harry Markowitz [15] (see Markowitz, Harry) through his influential “portfolio theory”. Markowitz’s work made transparent the need for a precise measurement and quantitative understanding of the levels of “risk aversion” (degree of concavity of the utility function) in financial theory. Even though a similar concept had been studied by Milton Friedman and Leonard Savage [10] before that, the major contribution to this endeavor was made by John Pratt [19] and Kenneth Arrow [3].

With the advent of stochastic calculus (developed by Kiyosi Itô [12], see Itô, Kiyosi (1915–2008)), the mathematical tools for continuous-time financial modeling became available. Paul Samuelson [22] (see Samuelson, Paul A.) introduced geometric Brownian motion as a model for stock evolution, and it was not long before it was combined with expected utility theory in the work of Robert Merton [18] (see Merton, Robert C.).

References

[1] Allais, M. (1953). La psychologie de l’homme rationnel devant le risque: critique des postulats et axiomes de l’école Américaine, Econometrica 21(4), 503–546. Translated and reprinted in Allais and Hagen, 1979.
[2] Arrow, K.J. (1953). Le Rôle des valeurs boursières pour la Répartition la meilleure des risques, Econométrie, Colloques Internationaux du Centre National de la Recherche Scientifique, Paris 11, 41–47; Published in English as (1964). The role of securities in the optimal allocation of risk-bearing, Review of Economic Studies 31(2), 91–96.
[3] Arrow, K.J. (1965). Aspects of the Theory of Risk-Bearing, Yrjö Jahnsson Foundation, Helsinki.
[4] Bernoulli, D. (1954). Exposition of a new theory on the measurement of risk, Econometrica 22(1), 23–36. Translation from the Latin by Dr. Louise Sommer of work first published 1738.
[5] de Finetti, B. (1931). Sul significato soggettivo della probabilità, Fundamenta Mathematicae 17, 298–329.
[6] de Finetti, B. (1937). La prévision: ses lois logiques, ses sources subjectives, Annales de l’Institut Henri Poincaré 7(1), 1–68.
[7] Debreu, G. (1959). Theory of Value—An Axiomatic Analysis of Economic Equilibrium, Cowles Foundation Monograph # 17, Yale University Press.
[8] Edgeworth, F.Y. (1911). Probability and Expectation, Encyclopedia Britannica.
[9] Ellsberg, D. (1961). Risk, ambiguity and the Savage axioms, Quarterly Journal of Economics 75, 643–69.
[10] Friedman, M. & Savage, L.J. (1952). The expected-utility hypothesis and the measurability of utility, Journal of Political Economy 60, 463–474.
[11] Gossen, H.H. (1854). The Laws of Human Relations and the Rules of Human Action Derived Therefrom, MIT Press, Cambridge, 1983. Translated from 1854 original by Rudolph C. Blitz with an introductory essay by Nicholas Georgescu-Roegen.
[12] Itô, K. (1942). On stochastic processes. I. (Infinitely divisible laws of probability), Japanese Journal of Mathematics 18, 261–301.
[13] Jevons, W.S. (1871). The Theory of Political Economy. History of Economic Thought Books, McMaster University Archive for the History of Economic Thought.
[14] Kahneman, D. & Tversky, A. (1979). Prospect theory: an analysis of decision under risk, Econometrica 47(2), 263–292.
[15] Markowitz, H. (1952). Portfolio selection, Journal of Finance 7(1), 77–91.
[16] Marshall, A. (1895). Principles of Economics, 3rd Edition, 1st Edition 1890, Macmillan, London, New York.
[17] Menger, C. (1871). Principles of Economics, 1981 edition of 1971 Translation, New York University Press, New York.
[18] Merton, R.C. (1969). Lifetime portfolio selection under uncertainty: the continuous-time case, The Review of Economics and Statistics 51, 247–257.
[19] Pratt, J. (1964). Risk aversion in the small and in the large, Econometrica 32(1), 122–136.
[20] Radner, R. (1972). Existence of equilibrium of plans, prices, and price expectations in a sequence of markets, Econometrica 40(2), 289–303.
[21] Ramsey, F.P. (1931). The foundations of mathematics and other logical essays, in Truth and Probability, R.B. Braithwaite, ed, Kegan, Paul, Trench, Trubner & Co., Harcourt, Brace and Company, London, New York, Chapter VII, pp. 156–198.
[22] Samuelson, P.A. (1965). Rational theory of Warrant Pricing, Industrial Management Review 6(2), 13–31.
[23] Savage, L.J. (1954). The Foundations of Statistics, John Wiley & Sons Inc., New York.
[24] Shackle, G.L.S. (1949). Expectations in Economics, Gibson Press.
[25] von Neumann, J. & Morgenstern, O. (2007). Theory of Games and Economic Behavior, Anniversary Edition. 1st Edition, 1944, Princeton University Press, Princeton, NJ.
[26] Walras, L. (1874). Eléments d’économie Politique Pure, 4th Edition, L. Corbaz, Lausanne.

Related Articles

Behavioral Portfolio Selection; Expected Utility Maximization; Merton Problem; Risk Aversion; Risk–Return Analysis.

GORDAN ŽITKOVIĆ
Itô, Kiyosi (1915–2008)

Kiyosi Itô was born in 1915, approximately 60 years after the Meiji Restoration. Responding to the appearance of the “Black Ships” in Yokohama harbor and Commodore Perry’s demand that they open their doors, the Japanese overthrew the Tokugawa shogunate and in 1868 “restored” the emperor Meiji to power. The Meiji Restoration initiated a period of rapid change during which Japan made a concerted and remarkably successful effort to transform itself from an isolated, feudal society into a modern state that was ready to play a major role in the world. During the first phase of this period, they sent their best and brightest abroad to acquire and bring back to Japan the ideas and techniques that had been previously blocked entry by the shogunate’s closed door policy. However, by 1935, the year that Itô entered Tokyo University, the Japanese transformation process had already moved to a second phase, one in which the best and brightest were kept at home to study, assimilate, and eventually disseminate the vast store of information which had been imported during the first phase. Thus, Itô and his peers were expected to choose a topic that they would first teach themselves and then teach their compatriots. For those of us who had the benefit of step-by-step guidance from knowledgeable teachers, it is difficult to imagine how Itô and his fellow students managed, and we can only marvel at the fact that they did.

The topic which Itô chose was that of stochastic processes. At the time, the field of stochastic processes had only recently emerged and was still in its infancy. N. Wiener (1923) had constructed Brownian motion, A.N. Kolmogorov (1933) and Wm. Feller (1936) had laid the analytic foundations on which the theory of diffusions would be built, and P. Lévy (1937) had given a pathspace interpretation of infinitely divisible laws. However, in comparison to well-established fields such as complex analysis, stochastic processes still looked more like a haphazard collection of examples than a unified field.

Having studied mechanics, Itô from the outset was drawn to Lévy’s pathspace perspective with its emphasis on paths and dynamics, and he set as his goal the reconciliation of Kolmogorov and Feller’s analytic treatment with Lévy’s pathspace picture. To carry out his program, he first had to thoroughly understand Lévy, and, as anyone who has attempted to read Lévy in the original knows, this is in itself a daunting task. Indeed, I have my doubts that, even now, many of us would know what Lévy did had Itô not explained it to us. Be that as it may, Itô’s first published paper (1941) was devoted to a reworking (incorporating important ideas due to J.L. Doob) of Lévy’s theory of homogeneous, independent increment processes.

Undoubtedly as a dividend of the time and effort which he spent unraveling Lévy’s ideas, shortly after completing this paper Itô had a wonderful insight of his own. To explain his insight, imagine that the space $M_1(\mathbb{R})$ of probability measures on $\mathbb{R}$ has a differentiable structure in which the underlying dynamics is given by convolution. Then, if $t \in [0, \infty) \longmapsto \mu_t \in M_1(\mathbb{R})$ is a “smooth curve” which starts at the unit point mass $\delta_0$, its “tangent” at time 0 should be given by the limit

$$\lim_{n \to \infty} \left(\mu_{1/n}\right)^{\star n},$$

where $\star$ denotes convolution and therefore $\nu^{\star n}$ is the n-fold convolution power of $\nu \in M_1(\mathbb{R})$. What Itô realized is that, if this limit exists, it must be an infinitely divisible law. Applied to $\mu_t = P(t, x, \cdot)$, where $(t, x) \in [0, \infty) \times \mathbb{R} \longmapsto P(t, x, \cdot) \in M_1(\mathbb{R})$ is the transition probability function for a Markov process, this key observation led Itô to view Kolmogorov’s forward equation as describing the flow of a vector field on $M_1(\mathbb{R})$. In addition, because infinitely divisible laws play in the geometry of $M_1(\mathbb{R})$ the role (see End Note a) that straight lines play in Euclidean space, he saw that one should be able to “integrate” Kolmogorov’s equation by piecing together infinitely divisible laws, just as one integrates a vector field in Euclidean space by piecing together straight lines.

Profound as the preceding idea is, Itô went a step further. Again under Lévy’s influence, he wanted to transfer his idea to a pathspace setting. He reasoned that if the transition function can be obtained by concatenating infinitely divisible laws, then the paths of the associated stochastic processes must be obtainable by concatenating paths coming from Lévy’s independent increment processes, and that one should be able to encode this concatenation procedure in some sort of “differential equation” for the resulting paths. The implementation of this program required him to develop what is now called the “Itô calculus”.
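As a hedged illustration only, in modern textbook notation that is not taken from the text above: for continuous paths driven by a Brownian motion $B_t$, the “differential equation” for the paths is usually written with a drift coefficient $b$ and a diffusion coefficient $\sigma$ (both symbols are assumptions of this sketch), and for a twice continuously differentiable function $f$ the analog of the chain rule picks up a second-order correction term:

```latex
dX_t = b(X_t)\,dt + \sigma(X_t)\,dB_t ,
\qquad
df(X_t) = f'(X_t)\,dX_t + \tfrac{1}{2}\,\sigma(X_t)^2 f''(X_t)\,dt .
```

The extra term $\tfrac{1}{2}\sigma^2 f''\,dt$ is what distinguishes this calculus from Newton’s; setting $\sigma = 0$ recovers the ordinary chain rule.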
It was during the period when he was working out the details of his calculus that he realized that, at least in the special case when paths are continuous, there is a formula which plays the role in his calculus that the chain rule plays in Newton’s. This formula, which appeared for the first time in a footnote, is what we now call Itô’s formula. Humble as its origins may have been, it has become one of the three or four most famous mathematics formulae of the twentieth century. Itô’s formula is not only a boon of unquestioned and inestimable value to mathematicians but also has become an indispensable tool in the world of mathematically oriented finance.

Itô had these ideas in the early 1940s, around the time when Japan attacked Pearl Harbor and its population had to face the consequent horrors. In view of the circumstances, it is not surprising that few inside Japan, and nobody outside of Japan, knew what Itô was doing for nearly a decade. Itô did publish an outline of his program in a journal of mimeographed notes (1942) at Osaka University, but he says that only his friend G. Maruyama really read what he had written. Thus, it was not until 1950, when he sent the manuscript for a monograph to Doob who arranged that it be published by the A.M.S. as a Memoir, that Itô’s work began to receive the attention which it deserved. Full appreciation of Itô’s ideas by the mathematical community came only after first Doob and then H.P. McKean applied martingale theory to greatly simplify some of Itô’s more technical arguments.

Despite its less than auspicious beginning, the story has a happy ending. Itô spent many years traveling the world: he has three daughters, one living in Japan, one in Denmark, and one in America. He is, in large part, responsible for the position of Japan as a major force in probability theory, and he has disciples all over the planet. His accomplishments are widely recognized: he is a member of the Japanese Academy of Sciences and the National Academy of Sciences; and he is the recipient of, among others, the Kyoto, Wolf, and Gauss Prizes. When I think of Itô’s career and the rocky road that he had to travel, I recall what Jack Schwartz told a topology class I was attending about Jean Leray’s invention of spectral sequences. At the time, Leray was a prisoner in a German prison camp for French intellectuals, each of whom attempted to explain to the others something about which he was thinking. With the objective of not discussing anything that might be useful to the enemy, Leray chose to talk about algebraic topology rather than his own work on partial differential equations, and for this purpose, he introduced spectral sequences as a pedagogic tool. After relating this anecdote, Schwartz leaned back against the blackboard and spent several minutes musing about the advantages of doing research in ideal working conditions.

Kiyosi Itô died at the age of 93 on November 10, 2008. He is survived by his three daughters. A week before his death, he received the Cultural Medal from the Japanese emperor. The end of an era is fast approaching.

End Notes

a. Note that when $t \mapsto \mu_t$ is the flow of infinitely divisible law $\mu$ in the sense that $\mu_1 = \mu$ and $\mu_{s+t} = \mu_s \star \mu_t$, $\mu = (\mu_{1/n})^{\star n}$ for all $n \ge 1$, which is the convolution analog of $f(1) = n^{-1} f(n)$ for a linear function on $\mathbb{R}$.

References

[1] Stroock, D. & Varadhan S.R.S. (eds) (1986). Selected Papers: K. Itô, Springer-Verlag.
[2] Stroock, D. (2003). Markov Processes from K. Itô’s Perspective, Annals of Mathematics Studies, Vol. 155, Princeton University Press.
[3] Stroock, D. (2007). The Japanese Journal of Mathematical Studies 2(1).

Further Reading

A selection of Itô’s papers as well as an essay about his life can be found in [1]. The first half of the book [2] provides a lengthy exposition of Itô’s ideas about Markov processes. Reference [3] is devoted to articles, by several mathematicians, about Itô and his work. In addition, thumbnail biographies can be found on the web at www-groups.dcs.st-and.ac.uk/history/Biographies/Ito.html and www.math.uah.edu/stat/biographies/Ito.xhtml

DANIEL W. STROOCK
Thorp, Edward

Edward O. Thorp is a mathematician who has made seminal contributions to games of chance and investment science. He invented original strategies for the game of blackjack that revolutionized the game. Together with Sheen Kassouf, he showed how warrants could be hedged using a short position in the underlying stocks and described and implemented arbitrage portfolios of stocks and warrants. Thorp made other important contributions to the development of option pricing and to investment theory and practice. He has had a very successful record as an investment manager. This note contains a brief account of some of his major contributions.

Thorp studied physics as an undergraduate and obtained his PhD in mathematics from the University of California at Los Angeles in 1958. The title of his dissertation was Compact Linear Operators in Normed Spaces, and he has published several papers on functional analysis. He taught at UCLA, MIT, and New Mexico State University and was professor of mathematics and finance at the University of California at Irvine.

Thorp’s interest in devising scientific systems for playing games of chance began when he was a graduate student in the late 1950s. He invented a system for playing roulette and also became interested in blackjack and devised strategies based on card counting systems. While at MIT, he collaborated with Claude Shannon, and together they developed strategies for improving the odds at roulette and blackjack. One of their inventions was a wearable computer that was the size of a modern-day cell phone. In 1962, Thorp [3] published Beat the Dealer: A Winning Strategy for the Game of Twenty One. This book had a profound impact on the game of blackjack as gamblers tried to implement his methods, and casinos responded with various countermeasures that were sometimes less than gentle.

In June 1965, Thorp’s interest in warrants was piqued by reading Sydney Fried’s RHM Warrant Survey. He was motivated by the intellectual challenge of warrant valuation and by the prospect of making money using these instruments. He developed his

also interested in warrants because of his own investing. Kassouf had analyzed market data to determine the key variables that affected warrant prices. On the basis of his analysis, Kassouf developed an empirical formula for a warrant’s price in terms of these variables.

In September 1965, Thorp and Kassouf discovered their mutual interest in warrant pricing and began their collaboration. In 1967, they published their book, Beat the Market, in which they proposed a method for hedging warrants using the underlying stock and developed a formula for the hedge ratio [5]. Their insights on warrant pricing were used (End Note a) by Black and Scholes in their landmark 1973 paper on option pricing.

Thorp and Kassouf were aware that the conventional valuation method was based on projecting the warrant’s expected terminal payoff and discounting back to current time. This approach involved two troublesome parameters: the expected return on the warrant and the appropriate discount rate. Black and Scholes in their seminal paper would show that the values of both these parameters had to coincide with the riskless rate. There is strong evidence (End Note b) that Thorp independently discovered this solution in 1967 and used it in his personal investment strategies. Thorp (End Note c) makes it quite clear that the credit rightfully belongs to Black and Scholes.

    Black Scholes was a watershed. It was only after seeing their proof that I was certain that this was the formula—and they justifiably get all the credit. They did two things that are required. They proved the formula (I didn’t) and they published it (I didn’t).

Thorp made a number of other contributions to the development of option theory and modern finance and his ideas laid the foundations for further advances. As one illustration based on my own experience, I will mention Thorp’s essential contribution to a paper that David Emanuel and I published in 1980 [2]. Our paper examined the distribution of a hedged portfolio of a stock and option that was rebalanced after a short interval. The key equation on which our paper rests was first developed by Thorp in 1976 [4].

Throughout his career, Edward Thorp has applied
initial ideas on warrant pricing and investing during mathematical tools to develop highly original solu-
the summer of 1965. Sheen Kassouf who was, like tions to difficult problems and he has demonstrated a
Thorp, a new faculty member at the University of unique ability to implement these solution in a prac-
California’s newly established campus at Irvine, was tical way.
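The conventional valuation method discussed in this article (project the expected terminal payoff, then discount it back) can be illustrated with a minimal sketch. It assumes a lognormally distributed terminal stock price, which is an assumption made here for illustration and not a detail given in the article. The function and parameter names (`growth` for the rate used to project the payoff, `discount` for the rate used to discount it) are hypothetical. Setting both rates equal to the riskless rate reproduces the standard Black–Scholes call value.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def discounted_expected_payoff(spot, strike, sigma, maturity, growth, discount):
    """Discounted expectation of max(S_T - K, 0) for a lognormal S_T.

    growth   -- assumed rate used to project the terminal stock price
    discount -- assumed rate used to discount the expected payoff
    Both names are illustrative, not taken from the article.
    """
    vol_sqrt_t = sigma * sqrt(maturity)
    d1 = (log(spot / strike) + (growth + 0.5 * sigma ** 2) * maturity) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    expected_payoff = (spot * exp(growth * maturity) * norm_cdf(d1)
                       - strike * norm_cdf(d2))
    return exp(-discount * maturity) * expected_payoff


def black_scholes_call(spot, strike, sigma, maturity, rate):
    """Special case in which both rates coincide with the riskless rate."""
    return discounted_expected_payoff(spot, strike, sigma, maturity, rate, rate)


if __name__ == "__main__":
    # At-the-money call, 20% volatility, one year, 5% riskless rate.
    print(round(black_scholes_call(100.0, 100.0, 0.2, 1.0, 0.05), 2))  # 10.45
```

With any other pair of rates the result depends on two inputs that are hard to estimate, which is the difficulty the text describes.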
End Notes

a. Black and Scholes state, “One of the concepts we use in developing our model was expressed by Thorp and Kassouf.”
b. For a more detailed discussion of this issue, see Boyle and Boyle [1] Chapter Five.
c. Email to the author dated July 26, 2000.

References

[1] Boyle, P.P. & Boyle, F.P. (2001). Derivatives: the Tools that Changed Finance, Risk Books, UK.
[2] Boyle, P.P. & Emanuel, D. (1980). Discretely adjusted option hedges, Journal of Financial Economics 8(3), 259–282.
[3] Thorp, E.O. (1962). Beat the Dealer: A Winning Strategy for the Game of Twenty-One, Random House, New York.
[4] Thorp, E.O. (1976). Common stock volatilities in option formulas, Proceedings, Seminar on the Analysis of Security Prices, Center for Research in Security Prices, Graduate School of Business, University of Chicago, Vol. 21, 1, May 13–14, pp. 235–276.
[5] Thorp, E.O. & Kassouf, S. (1967). Beat the Market: A Scientific Stock Market System, Random House, New York.

PHELIM BOYLE
Option Pricing Theory: Historical Perspectives

This article traces the history of option pricing theory from the turn of the twentieth century to the present. This history documents and clarifies the origins of the key contributions (authors and papers) to the theory of option pricing and hedging. Contributions with respect to the empirical understanding of the theories are not discussed, except implicitly, because the usefulness and longevity of any model is based on its empirical validity.

It is widely agreed that the modern theory of option pricing began in 1973 with the publication of the Black–Scholes–Merton model [12, 104]. Except for the early years (pre-1973), this history is restricted to papers that use the no arbitrage and complete markets technology to price options. Equilibrium option pricing models are not discussed herein. In particular, this excludes the consideration of option pricing in incomplete markets. An outline for this article is as follows. The following section discusses the early years of option pricing (pre-1973). The remaining sections deal with 1973 to the present: the section “Equity Derivatives” discusses the Black–Scholes–Merton model; the section “Interest Rate Derivatives” concerns the Heath–Jarrow–Morton model; and the section “Credit Derivatives” corresponds to credit risk derivative pricing models.

Early Option Pricing Literature (Pre-1973)

Interestingly, many of the basic insights of option pricing originated in the early years, that is, pre-1973. It all began at the turn of the century in 1900 with Bachelier’s [4] derivation of an option pricing formula in his doctoral dissertation on the theory of speculation at France’s Sorbonne University. Although remarkably close to the Black–Scholes–Merton model, Bachelier’s formula was flawed because he used normally distributed stock prices that violated limited liability. More than half a century later, Paul Samuelson read Bachelier’s dissertation, recognized this flaw, and fixed it by using geometric Brownian motion instead in his work on warrant pricing [117]. Samuelson derived valuation formulas for both European and American options, coining these terms in the process.

Samuelson’s derivation was almost identical to that used nearly a decade later to derive the Black–Scholes–Merton formula, except that instead of invoking the no arbitrage principle to derive the valuation formula, Samuelson postulated the condition that the discounted option’s payoffs follow a martingale (see [117], p. 19). Furthermore, it is also interesting to note that, in the appendix to this article, Samuelson and McKean determined the price of an American option by observing the correspondence between an American option’s valuation and the free boundary problem for the heat equation.

A few years later, instead of invoking the postulate that discounted option payoffs follow a martingale, Samuelson and Merton [118] derived this condition as an implication of a utility maximizing investor’s behavior. In this article, they also showed that the option’s price could be viewed as its discounted expected value, where instead of using the actual probabilities to compute the expectation, one employs utility or risk-adjusted probabilities (see expression (20) on page 26). These risk-adjusted probabilities are now known as “risk-neutral” or “equivalent martingale” probabilities. Contrary to a widely held belief, the use of “equivalent martingale probabilities” in option pricing theory predated the paper by Cox and Ross [36] by nearly 10 years (Merton (footnote 5 p. 218, [107]) points out that Samuelson knew this fact as early as 1953).

Unfortunately, these early option pricing formulas depended on the expected return on the stock, or equivalently, the stock’s risk premium. This dependency made the formulas difficult to estimate and to use. The reason for this difficulty is that the empirical finance literature has documented that the stock’s risk premium is nonstationary. It varies across time according to both changing tastes and changing economic fundamentals. This nonstationarity makes both the modeling of risk premia and their estimation problematic. Indeed, at present, there is still no generally accepted model for an asset’s risk premium that is consistent with historical data (see [32], Part IV for a review).

Perhaps the most important criticism of this early approach to option pricing is that it did not invoke the riskless hedging argument in conjunction with the no-arbitrage principle to price an option. (The first use of
riskless hedging with no arbitrage to prove a pricing relationship between financial securities can be found in [110].) And, as such, these valuation formulas provided no insights into how to hedge an option using the underlying stock and riskless borrowing. It can be argued that the idea of hedging an option is the single most important insight of modern option pricing theory. The use of the no arbitrage hedging argument to price an option can be traced to the seminal papers by Black and Scholes [12] and Merton [104], although the no arbitrage hedging argument itself has been attributed to Merton (see [79] in this regard).

Equity Derivatives

Fischer Black, Myron Scholes, and Robert Merton pioneered the modern theory of option pricing with the publication of the Black–Scholes–Merton option pricing model [12, 104] in 1973. The original Black–Scholes–Merton model is based on five assumptions: (i) competitive markets, (ii) frictionless markets, (iii) geometric Brownian motion, (iv) deterministic interest rates, and (v) no credit risk. For the purposes of this section, the defining characteristics of this model are the assumptions of deterministic interest rates and no credit risk.

The original derivation followed an economic hedging argument. The hedging argument involves holding simultaneous and offsetting positions in a stock and option that generates an instantaneous riskless position. This, in turn, implies a partial differential equation (PDE) for the option’s value that is subject to a set of boundary conditions. The solution under geometric Brownian motion is the Black–Scholes formula.

It was not until six years later that the martingale pricing technology was introduced by Harrison and Kreps [65] and Harrison and Pliska [66, 67], providing an alternative derivation of the Black–Scholes–Merton model. These papers, and later refinements by Delbaen and Schachermayer [40, 41, 42], introduced the first and second fundamental theorems of asset pricing, thereby providing the rigorous foundations to option pricing theory.

Roughly speaking, the first fundamental theorem of asset pricing states that no arbitrage is equivalent to the existence of an equivalent martingale probability measure, that is, a probability measure that makes the discounted stock price process a martingale. The second fundamental theorem of asset pricing states that the market is complete if and only if the equivalent martingale measure is unique. A complete market is one in which any derivative security’s payoffs can be generated by a dynamic trading strategy in the stock and riskless asset. These two theorems enabled the full-fledged use of stochastic calculus for option pricing theory. A review and summary of these results can be found in [43].

At the beginning, this alternative and more formal approach to option pricing theory was viewed as only of tangential interest. Indeed, all existing option pricing theorems could be derived without this technology and only using the more intuitive economic hedging argument. It was not until the Heath–Jarrow–Morton (HJM) model [70] was developed (it circulated as a working paper in 1987) that this impression changed. The HJM model was the first significant application that could not be derived without the use of the martingale pricing technology. More discussion relating to the HJM model is contained in the section “Interest Rate Derivatives”.

Extensions

The original Black–Scholes–Merton model is based on the following five assumptions: (i) competitive markets, (ii) frictionless markets, (iii) geometric Brownian motion, (iv) deterministic interest rates, and (v) no credit risk. The first two assumptions, competitive and frictionless markets, are the mainstay of finance. Competitive markets means that all traders act as price takers, believing their trades have no impact on the market price. Frictionless markets imply that there are no transaction costs nor trade restrictions, for example, no short sale constraints. Geometric Brownian motion implies that the stock price is lognormally distributed with a constant volatility. Deterministic interest rates are self-explanatory. No credit risk means that the investors (all counterparties) who trade financial securities will not default on their obligations.

Extensions of the Black–Scholes–Merton model that relaxed assumptions (i)–(iii) quickly flourished. Significant papers relaxing the geometric Brownian motion assumption include those by Merton [106] and Cox and Ross [36], who studied jump and jump-diffusion processes. Merton’s paper [106] also
included the insight that if unhedgeable jump risk is diversifiable, then it carries no risk premium. Under this assumption, one can value jump risk using the statistical probability measure, enabling the simple pricing of options in an incomplete market. This insight was subsequently invoked in the context of stochastic volatility option pricing and in the context of pricing credit risk derivatives.

Merton [104], Cox [34] and Cox and Ross [36] were among the first to study stochastic volatility option pricing in a complete market. Option pricing with stochastic volatility in incomplete markets was subsequently studied by Hull and White [73] and Heston [71]. More recent developments in this line of research use an HJM-type model [70] with a term structure of forward volatilities (see [51, 52]). Stochastic volatility models are of considerable current interest in the pricing of volatility swaps, variance swaps, and options on variance swaps.

A new class of Lévy processes was introduced by Madan and Milne [102] into option pricing and generalized by Carr et al. [20]. Lévy processes have the nice property that their characteristic function is known, and it can be shown that an option’s price can be represented in terms of the stock price’s characteristic function. This leads to some alternative numerical procedures for computing option values using fast Fourier transforms (see [23]). For a survey of the use of Lévy processes in option pricing, see [33].

The relaxation of the frictionless market assumption has received less attention in the literature. The inclusion of transaction costs into option pricing was originally studied by Leland [99], while Heath and Jarrow [69] studied the imposition of margin requirements. A more recent investigation into the impact of transaction costs on option pricing, using the martingale pricing technology, can be found in [26].

The relaxation of the competitive market assumption was first studied by Jarrow [77, 78] via the consideration of a large trader whose trades change the price. Jarrow’s approach maintains the no arbitrage assumption, or in this context, a no market manipulation assumption (see also [5]).

In between a market with competitive traders and a market with a large trader is a market where traders have only a temporary impact on the market price. That is, purchases/sales change the price paid/received depending upon a given supply curve. Traders act as price takers with respect to the supply curve. Such a price impact is called liquidity risk. Liquidity risk, of this type, can be considered as an endogenous transaction cost. This extension is studied in [26]. Liquidity risk is currently a hot research topic in option pricing theory.

The Black–Scholes–Merton model has been applied to foreign currency options (see [58]) and to all types of exotic options on both equities and foreign currencies. A complete reference for exotic options is [44].

Computations

The original derivation of the Black–Scholes–Merton model yields an option’s value satisfying a PDE subject to a set of boundary conditions. For a European call or put option, under geometric Brownian motion, the PDE has an analytic solution. For American options under geometric Brownian motion, analytic solutions are not available for puts independent of dividend payments on the underlying stock, and for American calls with dividends. For different stock price processes, analytic solutions are often not available as well, even for European options. In these cases, numerical solutions are needed. The first numerical approaches employed in this regard were finite difference methods (see [15, 16]).

Closely related, but containing more economic intuition, option prices can also be computed numerically by using a binomial approximation. The first users in this regard were Sharpe [122] chapter 16, and Rendleman and Bartter [113]. Cox et al. [37] published the definitive paper documenting the binomial model and its convergence to the continuous time limit (see also [68]). A related paper on convergence of discrete time models to continuous time models is that by Duffie and Protter [48].

The binomial pricing model, as it is now known, is also an extremely useful pedagogical device for explaining option pricing theory. This is true because the binomial model uses only discrete time mathematics. As such, it is usually the first model presented in standard option pricing textbooks. It is interesting to note that the first two textbooks on option pricing both utilized the binomial model in this fashion (see [38] and [84]).

Another technique for computing option values is to use a series expansion (see [50, 83, 123]). Series expansions are also useful for hedging exotic options that employ only static hedge positions with
plain vanilla options (see [38] chapter 7.2, [24, 63, 116]).

As computing a European option’s price is equivalent to computing an expectation, an alternative approach to either finite difference methods or the binomial model is Monte Carlo simulation. The paper that introduced this technique to option pricing is by Boyle [13]. This technique has become very popular because of its simplicity and its ability to handle high-dimensional problems (greater than three dimensions). This technique has also recently been extended to pricing American options. Important contributions in this regard are by Longstaff and Schwartz [101] and Broadie and Glasserman [18]. For a complete reference on Monte Carlo techniques, see [61].

Following the publication of Merton’s original paper [104], which contained an analytic solution for a perpetual American put option, much energy has been expended in the search for analytic solutions for both American puts and calls with finite maturities. For the American call, with a finite number of known dividends, a solution was provided by Roll [115]. For American puts, breaking the maturity of the option into a finite number of discrete intervals, the compound option pricing technique is applicable (see [60] and [93]). More recently, the decomposition of American options into a European option and an early exercise premium was discovered by Carr et al. [22], Kim [96], and Jacka [75].

These computational procedures are more generally applicable to all derivative pricing models, including those discussed in the next two sections.

Interest Rate Derivatives

Interest rate derivative pricing models provided the next major advance in option pricing theory. Recall that a defining characteristic of the Black–Scholes–Merton model is that it assumes deterministic interest rates. This assumption limits its usefulness in two ways. First, it cannot be used for long-dated contracts. Indeed, for long-dated contracts (greater than a year or two), interest rates cannot be approximated as being deterministic. Second, for short-dated contracts, if the underlying asset’s price process is highly correlated with interest rate movements, then interest rate risk will affect hedging, and therefore valuation. The extreme cases, of course, are interest rate derivatives where the underlyings are the interest rates themselves.

During the late 1970s and 1980s, interest rates were large and volatile, relative to historical norms. New interest rate risk management tools were needed because the Black–Scholes–Merton model was not useful in this regard. In response, a class of interest rate pricing models was developed by Vasicek [124], Brennan and Schwartz [17], and Cox et al. (CIR) [35]. This class, called the spot rate models, had two limitations. First, they depended on the market price(s) of interest rate risk, or equivalently, the expected return on default free bonds. This dependence, just as with the option pricing models pre-Black–Scholes–Merton, made their implementation problematic. Second, these models could not easily match the initial yield curve. This calibration is essential for the accurate pricing and hedging of interest rate derivatives because any discrepancies in yield curve matching may indicate “false” arbitrage opportunities in the priced derivatives.

To address these problems, Ho and Lee [72] applied the binomial model to interest rate derivatives with a twist. Instead of imposing an evolution on the spot rate, they had the zero coupon bond price curve evolve in a binomial tree. Motivated by this paper, Heath, Jarrow, and Morton [70] generalized this idea in the context of a continuous time and multifactor model to price interest rate derivatives. The key step in the derivation of the HJM model was determining the necessary and sufficient conditions for an arbitrage free evolution of the term structure of interest rates.

The defining characteristic of the HJM model is that there is a continuum of underlying assets, a term structure, whose correlated evolution needs to be considered when pricing and hedging options. For interest rate derivatives, this term structure is the term structure of interest rates. To be specific, it is the term structure of default free interest rates. But there are other term structures of relevance, including foreign interest rates, commodity futures prices, convenience yields on commodities, and equity forward volatilities. These alternative applications are discussed later in this section.

To simplify the mathematics, HJM focused on forward rates instead of zero-coupon bond prices. The martingale pricing technology was the tool used to obtain the desired conditions, the “HJM drift conditions”. Given the HJM drift conditions and the fact that the interest rate derivative market is
was first discovered by Jarrow [76] and later independently discovered by Geman [59] (see [112] for a discussion of the LIBOR model and its history).

Applications

The HJM model has been extended to multiple term structures and applied to foreign currency derivatives [2], to equities and commodities [3], and to Treasury inflation protected bonds [89]. The HJM model has also been applied to term structures of futures prices (see [21] and [108]), term structures of convenience yields [111], term structures of credit risky bonds (discussed in the next section), and term structures of equity forward volatilities ([51, 52], and [121]). In fact, it can be shown that almost all option pricing applications can be viewed as special cases of a multiple term structure HJM model (see [88]). A summary of many of these applications can be found in [19].

Credit Derivatives

The previously discussed models excluded the consideration of default when trading financial securities. The first model for studying credit risk, called the structural approach, was introduced by Merton [105]. Credit risk, although always an important consideration in fixed income markets, dramatically expanded its market-wide recognition with the introduction of trading in credit default swaps after the mid-1990s. The reason for this delayed importance was that it took until then for the interest rate derivative markets to mature sufficiently for sophisticated financial institutions to successfully manage/hedge equity, foreign currency and interest rate risk. This risk-controlling ability enabled firms to seek out arbitrage opportunities, and in the process, lever up on the remaining financial risks, which are credit/counterparty, liquidity, and operational risk. This greater risk exposure by financial institutions to both credit and liquidity risk (as evidenced by the events surrounding the failure of Long Term Capital Management) spurred the more rapid development of credit risk modeling.

As the first serious contribution to credit risk modeling, Merton’s original model was purposely simple. Merton considered credit risk in the context of a firm issuing only a single zero coupon bond. As such, risky debt could be decomposed into riskless debt plus a short put option on the assets of the firm. Shortly thereafter, extensions to address this simple liability structure were quickly discovered by Black and Cox [11], Jones et al. [94], and Leland [100], among others.

The structural approach to credit risk modeling has two well-known empirical shortcomings: (i) that default occurs smoothly, implying that bond prices do not jump at default and (ii) that the firm’s assets are neither traded nor observable. The first shortcoming means that for short maturity bonds, credit spreads as implied by the structural model are smaller than those observed in practice. Extensions of the structural approach that address the absence of a jump at default include that by Zhou [125]. These extensions, however, did not overcome the second shortcoming.

Almost 20 years after Merton’s original paper, Jarrow and Turnbull [85, 86] developed an alternative credit risk model that overcame the second shortcoming. As a corollary, this approach also overcame the first shortcoming. This alternative approach has become known as the reduced form model. Early important contributions to the reduced form model were by Lando [97], Madan and Unal [103], Jarrow et al. [80], and Duffie and Singleton [49].

As the credit derivative markets expanded, so did extensions to the reduced form model. To consider credit rating migration, Jarrow et al. [80] introduced a Markov chain model, where the states correspond to credit ratings. Next, there was the issue of default correlation for pricing credit derivatives on baskets (e.g., collateralized debt obligations (CDOs)). This correlation was first handled with Cox processes (Lando [97]).

The use of Cox processes induces default correlations across firms through common state variables that drive the default intensities. But when conditioning on the state variables, defaults are assumed to be independent across firms. If this structure is true, then after conditioning, defaults are diversifiable in a large portfolio and require no additional risk premium. The implication is that the empirical and risk neutral default intensities are equal. This equality, of course, would considerably simplify direct estimation of the risk neutral default intensity [81].
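The structural decomposition described in this section (risky debt equals riskless debt plus a short put on the firm’s assets) can be made concrete with a minimal sketch. It assumes lognormal firm assets and a constant riskless rate, so that the put takes the Black–Scholes form. The function and parameter names (`assets`, `face`, `sigma`, `maturity`, `rate`) are hypothetical and chosen here only for illustration.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def _d1_d2(assets, face, sigma, maturity, rate):
    """Black-Scholes d1 and d2 with firm assets as the underlying."""
    vol_sqrt_t = sigma * sqrt(maturity)
    d1 = (log(assets / face) + (rate + 0.5 * sigma ** 2) * maturity) / vol_sqrt_t
    return d1, d1 - vol_sqrt_t


def merton_debt_value(assets, face, sigma, maturity, rate):
    """Risky zero coupon debt = riskless debt minus a put on firm assets."""
    d1, d2 = _d1_d2(assets, face, sigma, maturity, rate)
    riskless_debt = face * exp(-rate * maturity)
    put_on_assets = riskless_debt * norm_cdf(-d2) - assets * norm_cdf(-d1)
    return riskless_debt - put_on_assets


def merton_equity_value(assets, face, sigma, maturity, rate):
    """Equity is a call on firm assets struck at the debt's face value."""
    d1, d2 = _d1_d2(assets, face, sigma, maturity, rate)
    return assets * norm_cdf(d1) - face * exp(-rate * maturity) * norm_cdf(d2)


def credit_spread(assets, face, sigma, maturity, rate):
    """Continuously compounded yield on the risky debt minus the riskless rate."""
    debt = merton_debt_value(assets, face, sigma, maturity, rate)
    return -log(debt / face) / maturity - rate


if __name__ == "__main__":
    args = (120.0, 100.0, 0.25, 1.0, 0.05)  # illustrative inputs only
    print(merton_debt_value(*args), merton_equity_value(*args), credit_spread(*args))
```

Under these assumptions debt and equity sum to the asset value, by put-call parity, and the spread is positive because the short put lowers the debt value below that of riskless debt.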
This is not the only mechanism through which default correlations can be generated. Default contagion is also possible through competitive industry considerations. This type of default contagion is a type of “counterparty” risk, and it was first studied in the context of a reduced form model by Jarrow and Yu [90]. “Counterparty risk” in a reduced form model, an issue in and of itself, was previously studied by Jarrow and Turnbull [86, 87].

Finally, default correlation could be induced via information flows as well. Indeed, a default by one firm may cause other firms’ default intensities to increase as the market learns about the reasons for the realized default (see [120]). Finding a suitable correlation structure for implementation and estimation is still a topic of considerable interest.

An important contribution to the credit risk model literature was the integration of structural and reduced form models. These two credit risk models can be understood through the information sets used in their construction. Structural models use the management’s information set, while reduced form models use the market’s information set. Indeed, the manager has access to the firm’s asset values, while the market does not. The first paper making this connection was by Duffie and Lando [46] who viewed the market as having the management’s information set plus noise, due to the accounting process. An alternative view is that the market has a coarser partitioning of management’s information, that is, less of it. Both views are reasonable, but the mathematics is quite different. The second approach was first explored by Cetin et al. [27].

Credit risk modeling continues to be a hot area of research. Books on the current state of the art with respect to credit risk derivative pricing models are by Lando [98] and Bielecki and Rutkowski [6].

References

[1] Andersen, L. & Brotherton-Ratcliffe, R. (2005). Extended LIBOR market models with stochastic volatility, Journal of Computational Finance 9, 1–26.
[2] Amin, K. & Jarrow, R. (1991). Pricing foreign currency options under stochastic interest rates, Journal of International Money and Finance 10(3), 310–329.
[3] Amin, K. & Jarrow, R. (1992). Pricing American options on risky assets in a stochastic interest rate economy, Mathematical Finance 2(4), 217–237.
[4] Bachelier, L. (1900). Théorie de la Spéculation, Ph.D. Dissertation, L’École Normale Supérieure. English translation in P. Cootner (ed.) (1964) The Random Character of Stock Market Prices, MIT Press, Cambridge, MA.
[5] Bank, P. & Baum, D. (2004). Hedging and portfolio optimization in illiquid financial markets with a large trader, Mathematical Finance 14(1), 1–18.
[6] Bielecki, T. & Rutkowski, M. (2002). Credit Risk: Modeling, Valuation, and Hedging, Springer Verlag.
[7] Björk, T. & Christensen, B. (1999). Interest rate dynamics and consistent forward rate curves, Mathematical Finance 9(4), 323–348.
[8] Björk, T., Di Masi, G., Kabanov, Y. & Runggaldier, W. (1997). Towards a general theory of bond markets, Finance and Stochastics 1, 141–174.
[9] Björk, T. & Svensson, L. (2001). On the existence of finite dimensional realizations for nonlinear forward rate models, Mathematical Finance 11(2), 205–243.
[10] Black, F. (1976). The pricing of commodity contracts, Journal of Financial Economics 3, 167–179.
[11] Black, F. & Cox, J. (1976). Valuing corporate securities: some effects of bond indenture provisions, Journal of Finance 31, 351–367.
[12] Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, Journal of Political Economy 81, 637–659.
[13] Boyle, P. (1977). Options: a Monte Carlo approach, Journal of Financial Economics 4, 323–338.
[14] Brace, A., Gatarek, D. & Musiela, M. (1997). The market model of interest rate dynamics, Mathematical Finance 7(2), 127–147.
[15] Brennan, M. & Schwartz, E. (1977). The valuation of American put options, Journal of Finance 32, 449–462.
[16] Brennan, M. & Schwartz, E. (1978). Finite difference methods and jump processes arising in the pricing of contingent claims: a synthesis, Journal of Financial and Quantitative Analysis 13, 461–474.
[17] Brennan, M. & Schwartz, E. (1979). A continuous time approach to the pricing of bonds, Journal of Banking and Finance 3, 135–155.
[18] Broadie, M. & Glasserman, P. (1997). Pricing American style securities by simulation, Journal of Economic Dynamics and Control 21, 1323–1352.
[19] Carmona, R. (2007). HJM: a unified approach to dynamic models for fixed income, credit and equity markets. Paris-Princeton Lectures on Mathematical Finance 2004, Lecture Notes in Mathematics, vol. 1919, Springer Verlag.
[20] Carr, P., Geman, H., Madan, D. & Yor, M. (2003). Stochastic volatility for Lévy processes, Mathematical Finance 13, 345–382.
[21] Carr, P. & Jarrow, R. (1995). A discrete time synthesis of derivative security valuation using a term structure of futures prices, in Handbooks in OR & MS, R. Jarrow, V. Maksimovic & W. Ziemba, eds, Elsevier Science B.V., Vol. 9, pp. 225–249.
[22] Carr, P., Jarrow, R. & Myneni, R. (1992). Alternative characterizations of American put options, Mathematical Finance 2(2), 87–106.
[23] Carr, P. & Madan, D. (1998). Option valuation using the fast Fourier transform, Journal of Computational Finance 2, 61–73.
[24] Carr, P. & Madan, D. (1998). Toward a theory of volatility trading, in Volatility, R. Jarrow, ed., Risk Publications, pp. 417–427.
[25] Carverhill, A. (1994). When is the spot rate Markovian?, Mathematical Finance 4, 305–312.
[26] Çetin, U., Jarrow, R. & Protter, P. (2004). Liquidity risk and arbitrage pricing theory, Finance and Stochastics 8, 311–341.
[27] Çetin, U., Jarrow, R., Protter, P. & Yildirim, Y. (2004). Modeling credit risk with partial information, The Annals of Applied Probability 14(3), 1167–1178.
[28] Chen, L., Filipovic, D. & Poor, H. (2004). Quadratic term structure models for risk free and defaultable rates, Mathematical Finance 14(4), 515–536.
[29] Cheng, P. & Scaillet, O. (2007). Linear-quadratic jump diffusion modeling, Mathematical Finance 17(4), 575–598.
[30] Cheyette, O. (1992). Term structure dynamics and mortgage valuation, Journal of Fixed Income 1, 28–41.
[31] Chiarella, C. & Kwon, O. (2000). A complete Markovian stochastic volatility model in the HJM framework, Asia-Pacific Financial Markets 7, 293–304.
[32] Cochrane, J. (2001). Asset Pricing, Princeton University Press.
[33] Cont, R. & Tankov, P. (2004). Financial Modeling with Jump Processes, Chapman & Hall.
[34] Cox, J. (1975). Notes on Option Pricing I: Constant Elasticity of Variance Diffusions, working paper, Stanford University.
[35] Cox, J., Ingersoll, J. & Ross, S. (1985). A theory of the term structure of interest rates, Econometrica 53, 385–407.
[36] Cox, J. & Ross, S.A. (1976). The valuation of options for alternative stochastic processes, Journal of Financial Economics 3(1/2), 145–166.
[37] Cox, J., Ross, S. & Rubinstein, M. (1979). Option pricing: a simplified approach, Journal of Financial Economics 7, 229–263.
[38] Cox, J. & Rubinstein, M. (1985). Option Markets, Prentice Hall.
[39] Dai, Q. & Singleton, K. (2000). Specification analysis of affine term structure models, Journal of Finance 55, 1943–1978.
[40] Delbaen, F. & Schachermayer, W. (1994). A general version of the fundamental theorem of asset pricing, Mathematische Annalen 300, 463–520.
[41] Delbaen, F. & Schachermayer, W. (1995). The existence of absolutely continuous local martingale measures, Annals of Applied Probability 5, 926–945.
[42] Delbaen, F. & Schachermayer, W. (1998). The fundamental theorem for unbounded stochastic processes, Mathematische Annalen 312, 215–250.
[43] Delbaen, F. & Schachermayer, W. (2006). The Mathematics of Arbitrage, Springer Verlag.
[44] Detemple, J. (2006). American Style Derivatives: Valuation and Computation, Financial Mathematics Series, Chapman & Hall/CRC.
[45] Duffie, D. & Kan, R. (1996). A yield factor model of interest rates, Mathematical Finance 6, 379–406.
[46] Duffie, D. & Lando, D. (2001). Term structure of credit spreads with incomplete accounting information, Econometrica 69, 633–664.
[47] Duffie, D., Pan, J. & Singleton, K. (2000). Transform analysis and asset pricing for affine jump-diffusions, Econometrica 68, 1343–1376.
[48] Duffie, D. & Protter, P. (1992). From discrete to continuous time finance: weak convergence of the financial gain process, Mathematical Finance 2(1), 1–15.
[49] Duffie, D. & Singleton, K. (1999). Modeling term structures of defaultable bonds, Review of Financial Studies 12(4), 687–720.
[50] Dufresne, D. (2000). Laguerre series for Asian and other options, Mathematical Finance 10(4), 407–428.
[51] Dupire, B. (1992). Arbitrage pricing with stochastic volatility. Proceedings of AFFI Conference, Paris, June.
[52] Dupire, B. (1996). A Unified Theory of Volatility. Paribas working paper.
[53] Eberlein, E. & Raible, S. (1999). Term structure models driven by general Lévy processes, Mathematical Finance 9(1), 31–53.
[54] Eberlein, E. & Ozkan, F. (2005). The Lévy LIBOR model, Finance and Stochastics 9, 327–348.
[55] Flesaker, B. & Hughston, L. (1996). Positive interest, Risk Magazine 9, 46–49.
[56] Filipovic, D. (2001). Consistency Problems for Heath Jarrow Morton Interest Rate Models, Springer Lecture Notes in Mathematics, Vol. 1760, Springer Verlag.
[57] Filipovic, D. (2002). Separable term structures and the maximal degree problem, Mathematical Finance 12(4), 341–349.
[58] Garman, M. & Kohlhagen, S. (1983). Foreign currency exchange values, Journal of International Money and Finance 2, 231–237.
[59] Geman, H. (1989). The Importance of the Forward Neutral Probability in a Stochastic Approach of Interest Rates, working paper, ESSEC.
[60] Geske, R. (1979). The valuation of compound options, Journal of Financial Economics 7, 63–81.
[61] Glasserman, P. (2004). Monte Carlo Methods in Financial Engineering, Springer Verlag.
[62] Glasserman, P. & Kou, S. (2003). The term structure of simple forward rates with jump risk, Mathematical Finance 13(3), 383–410.
[63] Green, R. & Jarrow, R. (1987). Spanning and completeness in markets with contingent claims, Journal of Economic Theory 41(1), 202–210.
[64] Goldstein, R. (2000). The term structure of interest rates as a random field, Review of Financial Studies 13(2), 365–384.
[65] Harrison, J. & Kreps, D. (1979). Martingales and arbitrage in multiperiod security markets, Journal of Economic Theory 20, 381–408.
[66] Harrison, J. & Pliska, S. (1981). Martingales and stochastic integrals in the theory of continuous trading, Stochastic Processes and Their Applications 11, 215–260.
[67] Harrison, J. & Pliska, S. (1983). A stochastic calculus model of continuous trading: complete markets, Stochastic Processes and Their Applications 15, 313–316.
[68] He, H. (1990). Convergence of discrete time to continuous time contingent claims prices, Review of Financial Studies 3, 523–546.
[69] Heath, D. & Jarrow, R. (1987). Arbitrage, continuous trading and margin requirements, Journal of Finance 42, 1129–1142.
[70] Heath, D., Jarrow, R. & Morton, A. (1992). Bond pricing and the term structure of interest rates: a new methodology for contingent claims valuation, Econometrica 60(1), 77–105.
[71] Heston, S. (1993). A closed form solution for options with stochastic volatility with applications to bond and currency options, Review of Financial Studies 6, 327–343.
[72] Ho, T. & Lee, S. (1986). Term structure movements and pricing interest rate contingent claims, Journal of Finance 41, 1011–1028.
[73] Hull, J. & White, A. (1987). The pricing of options on assets with stochastic volatilities, Journal of Finance 42, 271–301.
[74] Hull, J. & White, A. (1990). Pricing interest rate derivative securities, Review of Financial Studies 3, 573–592.
[75] Jacka, S. (1991). Optimal stopping and the American put, Mathematical Finance 1, 1–14.
[76] Jarrow, R. (1987). The pricing of commodity options with stochastic interest rates, Advances in Futures and Options Research 2, 15–28.
[77] Jarrow, R. (1992). Market manipulation, bubbles, corners and short squeezes, Journal of Financial and Quantitative Analysis 27(3), 311–336.
[78] Jarrow, R. (1994). Derivative security markets, market manipulation and option pricing, Journal of Financial and Quantitative Analysis 29(2), 241–261.
[79] Jarrow, R. (1999). In honor of the Nobel Laureates Robert C. Merton and Myron S. Scholes: a partial differential equation that changed the world, Journal of Economic Perspectives 13(4), 229–248.
[80] Jarrow, R., Lando, D. & Turnbull, S. (1997). A Markov model for the term structure of credit risk spreads, Review of Financial Studies 10(1), 481–523.
[81] Jarrow, R., Lando, D. & Yu, F. (2005). Default risk and diversification: theory and empirical applications, Mathematical Finance 15(1), 1–26.
[82] Jarrow, R. & Madan, D. (1995). Option pricing using the term structure of interest rates to hedge systematic discontinuities in asset returns, Mathematical Finance 5(4), 311–336.
[83] Jarrow, R. & Rudd, A. (1982). Approximate option valuation for arbitrary stochastic processes, Journal of Financial Economics 10, 347–369.
[84] Jarrow, R. & Rudd, A. (1983). Option Pricing, Dow Jones Irwin.
[85] Jarrow, R. & Turnbull, S. (1992). Credit risk: drawing the analogy, Risk Magazine 5(9).
[86] Jarrow, R. & Turnbull, S. (1995). Pricing derivatives on financial securities subject to credit risk, Journal of Finance 50(1), 53–85.
[87] Jarrow, R. & Turnbull, S. (1997). When swaps are dropped, Risk Magazine 10(5), 70–75.
[88] Jarrow, R. & Turnbull, S. (1998). A unified approach for pricing contingent claims on multiple term structures, Review of Quantitative Finance and Accounting 10(1), 5–19.
[89] Jarrow, R. & Yildirim, Y. (2003). Pricing treasury inflation protected securities and related derivatives using an HJM model, Journal of Financial and Quantitative Analysis 38(2), 337–358.
[90] Jarrow, R. & Yu, F. (2001). Counterparty risk and the pricing of defaultable securities, Journal of Finance 56(5), 1765–1799.
[91] Jin, Y. & Glasserman, P. (2001). Equilibrium positive interest rates: a unified view, Review of Financial Studies 14, 187–214.
[92] Jeffrey, A. (1995). Single factor Heath Jarrow Morton term structure models based on Markov spot rate dynamics, Journal of Financial and Quantitative Analysis 30, 619–642.
[93] Johnson, H. (1983). An analytic approximation of the American put price, Journal of Financial and Quantitative Analysis 18, 141–148.
[94] Jones, E., Mason, S. & Rosenfeld, E. (1984). Contingent claims analysis of corporate capital structures: an empirical investigation, Journal of Finance 39, 611–627.
[95] Kennedy, D. (1994). The term structure of interest rates as a Gaussian random field, Mathematical Finance 4, 247–258.
[96] Kim, J. (1990). The analytic valuation of American options, Review of Financial Studies 3, 547–572.
[97] Lando, D. (1998). On Cox processes and credit risky securities, Review of Derivatives Research 2, 99–120.
[98] Lando, D. (2004). Credit Risk Modeling: Theory and Applications, Princeton University Press, Princeton.
[99] Leland, H. (1985). Option pricing and replication with transaction costs, Journal of Finance 40, 1283–1301.
[100] Leland, H. (1994). Corporate debt value, bond covenants and optimal capital structure, Journal of Finance 49, 1213–1252.
[101] Longstaff, F. & Schwartz, E. (2001). Valuing American options by simulation: a simple least squares approach, Review of Financial Studies 14, 113–147.
[102] Madan, D. & Milne, F. (1991). Option pricing with variance gamma martingale components, Mathematical Finance 1, 39–55.
[103] Madan, D. & Unal, H. (1998). Pricing the risks of default, Review of Derivatives Research 2, 121–160.
[104] Merton, R.C. (1973). The theory of rational option pricing, Bell Journal of Economics and Management Science 4, 141–183.
[105] Merton, R.C. (1974). On the pricing of corporate debt: the risk structure of interest rates, Journal of Finance 29, 449–470.
[106] Merton, R.C. (1976). Option pricing when underlying stock returns are discontinuous, Journal of Financial Economics 3, 125–144.
[107] Merton, R.C. (1990). Continuous Time Finance, Basil Blackwell, Cambridge, Massachusetts.
[108] Miltersen, K., Nielsen, J. & Sandmann, K. (2006). New no-arbitrage conditions and the term structure of interest rate futures, Annals of Finance 2, 303–325.
[109] Miltersen, K., Sandmann, K. & Sondermann, D. (1997). Closed form solutions for term structure derivatives with log-normal interest rates, Journal of Finance 52, 409–430.
[110] Modigliani, F. & Miller, M. (1958). The cost of capital, corporation finance, and the theory of investment, American Economic Review 48, 261–297.
[111] Nakajima, K. & Maeda, A. (2007). Pricing commodity spread options with stochastic term structure of convenience yields and interest rates, Asia Pacific Financial Markets 14, 157–184.
[112] Rebonato, R. (2002). Modern Pricing of Interest Rate Derivatives: The LIBOR Market Model and Beyond, Princeton University Press.
[113] Rendleman, R. & Bartter, B. (1979). Two state option pricing, Journal of Finance 34, 1093–1110.
[114] Rogers, L. (1997). The potential approach to the term structure of interest rates and foreign exchange rates, Mathematical Finance 7, 157–176.
[115] Roll, R. (1977). An analytic valuation formula for unprotected American call options on stocks with known dividends, Journal of Financial Economics 5, 251–258.
[116] Ross, S. (1976). Options and efficiency, Quarterly Journal of Economics 90, 75–89.
[117] Samuelson, P. (1965). Rational theory of warrant pricing, Industrial Management Review 6, 13–39.
[118] Samuelson, P. & Merton, R.C. (1969). A complete model of warrant pricing that maximizes utility, Industrial Management Review 10(2), 17–46.
[119] Sandmann, K., Sondermann, D. & Miltersen, K. (1995). Closed form term structure derivatives in a Heath Jarrow Morton model with log-normal annually compounded interest rates, Proceedings of the Seventh Annual European Research Symposium, Bonn, September 1994, Chicago Board of Trade, pp. 145–164.
[120] Schonbucher, P. (2004). Information Driven Default Contagion, working paper, ETH Zurich.
[121] Schweizer, M. & Wissel, J. (2008). Term structure of implied volatilities: absence of arbitrage and existence results, Mathematical Finance 18(1), 77–114.
[122] Sharpe, W. (1981). Investments, Prentice Hall, Englewood Cliffs.
[123] Turnbull, S. & Wakeman, L. (1991). A quick algorithm for pricing European average options, Journal of Financial and Quantitative Analysis 26, 377–389.
[124] Vasicek, O. (1977). An equilibrium characterization of the term structure, Journal of Financial Economics 5, 177–188.
[125] Zhou, C. (2001). The term structure of credit spreads with jump risk, Journal of Banking and Finance 25, 2015–2040.
ROBERT A. JARROW
Modern Portfolio Theory
Modern portfolio theory (MPT) is generally defined as the body of financial economics beginning with Markowitz’ famous 1952 paper, “Portfolio Selection”, and extending through the next several decades of research into what has variously been called Financial Decision Making under Uncertainty, The Theory of Investments, The Theory of Financial Economics, Theory of Asset Selection and Capital–Market Equilibrium, and The Revolutionary Idea of Finance [45, 53, 58, 82, 88, 98]. Usually this definition includes the Capital Asset Pricing Model (CAPM) and its various extensions. Markowitz once remarked to Marschak that the first “CAPM” should be attributed to Marschak because of his pioneering work in the field [56]; Marschak politely declined the honor.
The original CAPM, as we understand it today, was first developed by Treynor [91, 92], and subsequently independently derived in the works of Sharpe [84], Lintner [47], and Mossin [65]. With the exception of some commercially successful multifactor models that implement the approaches pioneered in [71, 72, 74, 75], most practitioners have little use for market models other than the CAPM, although (or, perhaps rather, because of the simplicity it derives from the fact that) its conclusions are based on extremely restrictive and unrealistic assumptions. Academics have spent much time and effort attempting to substantiate or refute the validity of the CAPM as a positive economic model. The best examples of such attempts are [13, 28]. Roll [70] effectively ended this debate, however, by demonstrating that, since the “market portfolio” is not measurable, the CAPM can never be empirically proven or disproven.
History of Modern Portfolio Theory
The history of MPT extends back farther than the history of CAPM, to Tobin [90], Markowitz [53], and Roy [78], all of whom consider the “price of risk”. For more detailed treatments of MPT and pre-MPT financial economic thought, refer to [22, 69, 82]. The prehistory of MPT can be traced further yet, to Hicks [34], who includes the “price of risk” in his discussion of commodity futures, and to Williams [95], who considers stock prices to be determined by the present value of discounted future dividends. MPT prehistory can be traced even beyond to Bachelier [3], who was the first to describe arithmetic Brownian motion with the objective of determining the value of financial derivatives, all the way to Bernoulli [7], who originated the concept of risk aversion while working to solve the St. Petersburg Paradox. Bernoulli, in his derivation of logarithmic utility, suggested that people maximize “moral expectation”, which today we call expected utility; further, Bernoulli, like Markowitz [53] and Roy [78], advised risk-averse investors to diversify: “. . . it is advisable to divide goods which are exposed to some small danger into several portions rather than to risk them all together.”
Notwithstanding this ancient history, MPT is inextricably connected to CAPM, which for the first time placed the investor’s problem in the context of an economic equilibrium. This modern approach finds its origin in the work of Mossin [65], Lintner [47, 48], and Sharpe [84], and even earlier in Treynor [91, 92]. Accounts of these origins can be found in [8, 29, 85]. Treynor [92] built on the single-period discrete-time foundation of Markowitz [53, 54] and Tobin [90]. Similar CAPM models of this type were later published in [47, 48, 84]. Mossin [65] clarified Sharpe [84] by providing a more precise specification of the equilibrium conditions. Fama [26] reconciled the Sharpe and Lintner models; Lintner [49] incorporated heterogeneous beliefs; and Mayers [57] allowed for concentrated portfolios through trading restrictions on risky assets, transactions costs, and information asymmetries. Black [10] utilized the two-fund separation theorem to construct the zero-beta CAPM, by using a portfolio that is orthogonal to the market portfolio in place of a risk-free asset. Rubinstein [79] extended the model to higher moments and also (independently of Black) derived the CAPM without a riskless asset.
Discrete-time multiperiod models were the next step; these models generally extend the discrete-time single-period model into an intertemporal setting in which investors maximize the expected utility of lifetime consumption and bequests. Building upon the multiperiod lifetime consumption literature of Phelps [68], Mirrlees [63], Yaari [97], Levhari and Srinivasan [44], and Hahn [30], models of this type include those of Merton [59, 60], Samuelson [83], Hakansson [31, 32], Fama [27], Beja [4], Rubinstein [80, 81], Long [50, 51], Kraus and Litzenberger
[41], and culminate in the consumption CAPMs (CCAPMs) of Lucas [52] and Breeden [15].
The multiperiod approach was taken to its continuous-time limit in the intertemporal CAPM (“ICAPM”) of Merton [61]. In addition to the standard assumptions (limited liability of assets, no market frictions, individual trading does not affect prices, the market is in equilibrium, a perfect borrowing and lending market exists, and no nonnegativity constraints, the last of which relaxes the no short-sale rule employed by Tobin and Sharpe but not by Treynor and Lintner), this model assumes that trading takes place continually through time, as opposed to at discrete points in time. Rather than assuming normally distributed security returns, the ICAPM assumes a log-normal distribution of prices and a geometric Brownian motion of security returns. Also, the constant rate of interest provided by the risk-free asset in the CAPM is replaced by a dynamically changing rate, which is certain in the next instant but uncertain in the future. Williams [96] extended this model by relaxing the homogeneous expectations assumption, and Duffie and Huang [23] confirmed that such a relaxation is consistent with the ICAPM. The continuous-time model was shown to be consistent with a single-beta CCAPM by Breeden [15]. Hellwig [33] and Duffie and Huang [24] construct continuous-time models that allow for informational asymmetries. The continuous-time model was further extended to include macroeconomic factors in [20]. Kyle [42] constructs an ICAPM to model insider trading.
These, and other CAPMs, including the international models of Black [12], Solnik [86], and Stulz [89], as well as the CAPMs of Ross [73, 76] and Stapleton and Subrahmanyam [87], are reviewed in [16, 17, 19, 62, 77]. Bergstrom [5] provides a survey of continuous-time models.
Extensions of the CAPM have also been developed for use, in particular, in industrial applications; for example, Cummins [21] reviews the models of Cooper [18], Biger and Kahane [9], Fairley [25], Kahane [39], Hill [35], Ang and Lai [2], and Turner [94], which are specific to the insurance industry. More recent work continues to extend the theory. Nielsen [66, 67], Allingham [1], and Berk [6] examine conditions for equilibrium in the CAPM. Current research, such as the collateral adjusted CCAPM of Hindy and Huang [36] and the parsimonious conditional discrete-time CAPM and simplified infinite-date model of LeRoy [43], continues to build upon the model originated in [91]. Each is perhaps more realistic, if less elegant, than the original. And yet it is the single-period, discrete-time CAPM that has become popular and endured, as all great models do, precisely because it is simple and unrealistic. It is realistic enough, apparently, to be coincident with the utility functions of a great many agents.
A Perspective on CAPM
One of the puzzles that confronts the historian of CAPM is the changing attitude over time and across different scholarly communities toward the seminal work of Treynor [91, 92]. Contemporaries consistently cited the latter paper [11, 13, 37, 38], including also [84, 85]. However, in other papers, such as [16, 45, 55], these citations were not made. Histories and bibliographies continue to take note of Treynor’s contribution [8, 14, 58, 82], but not textbooks or the scholarly literature that builds on CAPM. Why not?
One reason is certainly that Treynor’s manuscript [92] was not actually published in a book until much later [40], although the paper did circulate widely in mimeograph form. Another is that Treynor never held a permanent academic post, and so did not have a community of students and academic colleagues to draw attention to his work. A third is that, although Treynor continued to write on financial topics (writings collected in [93]), these writings were consistently addressed to practitioners, not to an academic audience.
Even more than these, perhaps the most important reason (paradoxically) is the enormous attention that was paid in subsequent years to refinement of MPT. Unlike Markowitz and Sharpe, Treynor came to CAPM from a concern about the firm’s capital budgeting problem, not the investor’s portfolio allocation problem. (This concern is clear in the 1961 draft, which builds explicitly on [64].) This was the same concern, of course, that motivated Lintner, and it is significant therefore that the CAPMs of Lintner and Sharpe were originally seen as different theories, rather than different formulations of the same theory.
Because the portfolio choice problem became such a dominant strand of academic research, it
was perhaps inevitable that retrospective accounts of CAPM would emphasize the line of development that passes from the individual investor’s problem to the general equilibrium problem, which is to say the line that passes through Tobin and Markowitz to Sharpe. Lintner and Mossin come in for some attention, as academics who not only contributed their own versions of CAPM but also produced a series of additional contributions to the academic literature. However, Treynor was not only interested in a different problem but also was, and remained, a practitioner.
Conclusion
In 1990, the world beyond financial economists was made aware of the importance of MPT, when Markowitz and Sharpe, along with Miller, were awarded the Nobel Prize in Economics for their roles in the development of MPT. In the presentation speech, Assar Lindbeck of the Royal Swedish Academy of Sciences said, “Before the 1950s, there was hardly any theory whatsoever of financial markets. A first pioneering contribution in the field was made by Harry Markowitz, who developed a theory . . . [which] shows how the multidimensional problem of investing under conditions of uncertainty in a large number of assets . . . may be reduced to the issue of a trade-off between only two dimensions, namely the expected return and the variance of the return of the portfolio . . . . The next step in the analysis is to explain how these asset prices are determined. This was achieved by development of the so-called Capital Asset Pricing Model, or CAPM. It is for this contribution that William Sharpe has been awarded. The CAPM shows that the optimum risk portfolio of a financial investor depends only on the portfolio manager’s prediction about the prospects of different assets, not on his own risk preferences . . . . The Capital Asset Pricing Model has become the backbone of modern price theory of financial markets” [46].
References
[1] Allingham, M. (1991). Existence theorems in the capital asset pricing model, Econometrica 59(4), 1169–1174.
[2] Ang, J.S. & Lai, T.-Y. (1987). Insurance premium pricing and ratemaking in competitive insurance and capital asset markets, The Journal of Risk and Insurance 54, 767–779.
[3] Bachelier, L. (1900). Théorie de la spéculation, Annales Scientifiques de l’École Normale Supérieure 17, 3e série, 21–86; Translated by Boness, A.J. and reprinted in Cootner, P.H. (ed.) (1964). The Random Character of Stock Market Prices, MIT Press, Cambridge. (Revised edition, first MIT Press Paperback Edition, July 1967). pp. 17–78; Also reprinted as Bachelier, L. (1995). Théorie de la Spéculation & Théorie Mathématique du jeu, (2 titres en 1 vol.) Les Grands Classiques Gauthier-Villars, Éditions Jacques Gabay, Paris, Part 1, pp. 21–86.
[4] Beja, A. (1971). The structure of the cost of capital under uncertainty, The Review of Economic Studies 38, 359–369.
[5] Bergstrom, A.R. (1988). The history of continuous-time econometric models, Econometric Theory 4(3), 365–383.
[6] Berk, J.B. (1992). The Necessary and Sufficient Conditions that Imply the CAPM, working paper, Faculty of Commerce, University of British Columbia, Canada; Subsequently published as (1997). Necessary conditions for the CAPM, Journal of Economic Theory 73, 245–257.
[7] Bernoulli, D. (1738). Exposition of a new theory on the measurement of risk, Papers of the Imperial Academy of Science, Petersburg, Vol. II, pp. 175–192; Translated and reprinted in Sommer, L. (1954). Econometrica 22(1), 23–36.
[8] Bernstein, P.L. (1992). Capital Ideas: The Improbable Origins of Modern Wall Street, The Free Press, New York.
[9] Biger, N. & Kahane, Y. (1978). Risk considerations in insurance ratemaking, The Journal of Risk and Insurance 45, 121–132.
[10] Black, F. (1972). Capital market equilibrium with restricted borrowing, Journal of Business 45(3), 444–455.
[11] Black, F. (1972). Equilibrium in the creation of investment goods under uncertainty, in Studies in the Theory of Capital Markets, M.C. Jensen, ed., Praeger, New York, pp. 249–265.
[12] Black, F. (1974). International capital market equilibrium with investment barriers, Journal of Financial Economics 1(4), 337–352.
[13] Black, F., Jensen, M.C. & Scholes, M. (1972). The capital asset pricing model: some empirical tests, in Studies in the Theory of Capital Markets, M.C. Jensen, ed., Praeger, New York, pp. 79–121.
[14] Brealey, R.A. & Edwards, H. (1991). A Bibliography of Finance, MIT Press, Cambridge.
[15] Breeden, D.T. (1979). An intertemporal asset pricing model with stochastic consumption and investment opportunities, Journal of Financial Economics 7(3), 265–296.
[16] Breeden, D.T. (1987). Intertemporal portfolio theory and asset pricing, in The New Palgrave Finance, J. Eatwell, M. Milgate & P. Newman, eds, W.W. Norton, New York, pp. 180–193.
[17] Brennan, M.J. (1987). Capital asset pricing model, in The New Palgrave Finance, J. Eatwell, M. Milgate & P. Newman, eds, W.W. Norton, New York, pp. 91–102.
[18] Cooper, R.W. (1974). Investment Return and Property-Liability Insurance Ratemaking, Huebner Foundation, University of Pennsylvania, Philadelphia.
[19] Copeland, T.E. & Weston, J.F. (1987). Asset pricing, in The New Palgrave Finance, J. Eatwell, M. Milgate & P. Newman, eds, W.W. Norton, New York, pp. 81–85.
[20] Cox, J.C., Ingersoll Jr, J.E. & Ross, S.A. (1985). An intertemporal general equilibrium model of asset prices, Econometrica 53(2), 363–384.
[21] Cummins, J.D. (1990). Asset pricing models and insurance ratemaking, ASTIN Bulletin 20(2), 125–166.
[22] Dimson, E. & Mussavian, M. (2000). Three Centuries of Asset Pricing, Social Science Research Network Electronic Library, paper 000105402.pdf. January.
[23] Duffie, D. & Huang, C.F. (1985). Implementing Arrow-Debreu equilibria by continuous trading of few long-lived securities, Econometrica 53, 1337–1356; Also reprinted in Schaefer, S. (ed.) (2000). Continuous-Time Finance, Edward Elgar, London.
[24] Duffie, D. & Huang, C.F. (1986). Multiperiod security markets with differential information: martingales and resolution times, Journal of Mathematical Economics 15, 283–303.
[25] Fairley, W. (1979). Investment income and profit margins in property-liability insurance: theory and empirical tests, Bell Journal of Economics 10, 192–210.
[26] Fama, E.F. (1968). Risk, return, and equilibrium: some clarifying comments, Journal of Finance 23(1), 29–40.
[27] Fama, E.F. (1970). Multiperiod consumption-investment decisions, The American Economic Review 60, 163–174.
[28] Fama, E.F. & MacBeth, J. (1973). Risk, return and equilibrium: empirical tests, The Journal of Political Economy 81(3), 607–636.
[29] French, C.W. (2003). The Treynor capital asset pricing model, Journal of Investment Management 1(2), Second quarter, 60–72.
[30] Hahn, F.H. (1970). Savings and uncertainty, The Review of Economic Studies 37(1), 21–24.
[31] Hakansson, N.H. (1969). Optimal investment and consumption strategies under risk, an uncertain lifetime, and insurance, International Economic Review 10(3), 443–466.
[32] Hakansson, N.H. (1970). Optimal investment and consumption strategies under risk for a class of utility functions, Econometrica 38(5), 587–607.
[33] Hellwig, M.F. (1982). Rational expectations equilibrium with conditioning on past prices: a mean-variance example, Journal of Economic Theory 26, 279–312.
[34] Hicks, J.R. (1939). Value and Capital: An Inquiry into some Fundamental Principles of Economic Theory, Clarendon Press, Oxford.
[35] Hill, R. (1979). Profit regulation in property-liability insurance, Bell Journal of Economics 10, 172–191.
[36] Hindy, A. & Huang, M. (1995). Asset Pricing With Linear Collateral Constraints, unpublished manuscript, Graduate School of Business, Stanford University. March.
[37] Jensen, M.C. (ed.) (1972). Studies in the Theory of Capital Markets, Praeger, New York.
[38] Jensen, M.C. (1972). The foundations and current state of capital market theory, in Studies in the Theory of Capital Markets, M.C. Jensen, ed., Praeger, New York, pp. 3–43.
[39] Kahane, Y. (1979). The theory of insurance risk premiums: a re-examination in the light of recent developments in capital market theory, ASTIN Bulletin 10(2), 223–239.
[40] Korajczyk, R.A. (1999). Asset Pricing and Portfolio Performance: Models, Strategy and Performance Metrics, Risk Books, London.
[41] Kraus, A. & Litzenberger, R.H. (1975). Market equilibrium in a multiperiod state-preference model with logarithmic utility, Journal of Finance 30(5), 1213–1227.
[42] Kyle, A.S. (1985). Continuous auctions and insider trading, Econometrica 53(3), 1315–1335.
[43] LeRoy, S.F. (2002). Theoretical Foundations for Conditional CAPM, unpublished manuscript, University of California, Santa Barbara. May.
[44] Levhari, D. & Srinivasan, T.N. (1969). Optimal savings under uncertainty, The Review of Economic Studies 36(106), 153–163.
[45] Levy, H. & Sarnat, M. (eds) (1977). Financial Decision Making under Uncertainty, Academic Press, New York.
[46] Lindbeck, A. (1990). The Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel 1990 presentation speech, Nobel Lectures, Economics 1981–1990, K.-G. Mäler, ed., World Scientific Publishing Co., Singapore, 1992.
[47] Lintner, J. (1965). The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets, The Review of Economics and Statistics 47, 13–37.
[48] Lintner, J. (1965). Securities prices, risk, and maximal gains from diversification, Journal of Finance 20(4), 587–615.
[49] Lintner, J. (1969). The aggregation of investor’s diverse judgment and preferences in purely competitive securities markets, Journal of Financial and Quantitative Analysis 4, 347–400.
[50] Long Jr, J.B. (1972). Consumption-investment decisions and equilibrium in the securities markets, in Studies in the Theory of Capital Markets, M.C. Jensen, ed., Praeger, New York, pp. 146–222.
[51] Long Jr, J.B. (1974). Stock prices, inflation and the term structure of interest rates, Journal of Financial Economics 2, 131–170.
[52] Lucas Jr, R.E. (1978). Asset prices in an exchange economy, Econometrica 46(6), 1429–1445.
[53] Markowitz, H.M. (1952). Portfolio selection, Journal of Finance 7(1), 77–91.
Modern Portfolio Theory 5
[54] Markowitz, H.M. (1959). Portfolio Selection: Efficient Journal of Financial and Quantitative Analysis 8(3),
Diversification of Investments, Cowles Foundation for 317–333.
Research in Economics at Yale University, Monograph [73] Ross, S.A. (1975). Uncertainty and the heterogeneous
#6. John Wiley & Sons, Inc., New York. (2nd Edition, capital good model, The Review of Economic Studies
1991, Basil Blackwell, Inc., Cambridge). 42(1), 133–146.
[55] Markowitz, H.M. (2000). Mean-Variance Analysis in [74] Ross, S.A. (1976). The arbitrage theory of capital asset
Portfolio Choice and Capital Markets, Frank J. Fabozzi pricing, Journal of Economic Theory 13(3), 341–360.
Associates, New Hope. [75] Ross, S.A. (1976). Risk, return and arbitrage, in Risk and
[56] Marschak, J. (1938). Money and the theory of assets, Return in Finance, I. Friend & J. Bicksler, eds, Ballinger,
Econometrica 6, 311–325. Cambridge, pp. 1–34.
[57] Mayers, D. (1972). Nonmarketable assets and capital [76] Ross, S.A. (1978). Mutual fund separation in financial
market equilibrium under uncertainty, in Studies in the theory—the separating distributions, Journal of Eco-
Theory of Capital Markets, M.C. Jensen, ed., Praeger, nomic Theory 17(2), 254–286.
New York, pp. 223–248. [77] Ross, S.A. (1987). Finance, in The New Palgrave
[58] Mehrling, P. (2005). Fischer Black and the Revolution- Finance, J. Eatwell, M. Milgate & P. Newman, eds,
ary Idea of Finance, Wiley, Hoboken. W.W. Norton, New York, pp. 1–34.
[59] Merton, R.C. (1969). Lifetime portfolio selection under [78] Roy, A.D. (1952). Safety first and the holding of assets,
uncertainty: the continuous time case, The Review of Econometrica 20(3), 431–439.
Economics and Statistics 51, 247–257; Reprinted as [79] Rubinstein, M. (1973). The fundamental theorem of
chapter 4 of Merton, R.C. (1990). Continuous-Time parameter-preference security valuation, Journal of Fin-
Finance, Blackwell, Cambridge, pp. 97–119. ancial and Quantitative Analysis 8, 61–69.
[60] Merton, R.C. (1971). Optimum consumption and port- [80] Rubinstein, M. (1974). A Discrete-Time Synthesis of
folio rules in a continuous time model, Journal of Eco- Financial Theory, Working Paper 20, Haas School
nomic Theory 3, 373–413; Reprinted as chapter 5 of of Business, University of California at Berkeley;
Reprinted in Research in Finance, JAI Press, Greenwich,
Merton, R.C. (1990). Continuous-Time Finance, Black-
Vol. 3, pp. 53–102.
well, Cambridge pp. 120–165.
[81] Rubinstein, M. (1976). The valuation of uncertain
[61] Merton, R.C. (1973). An intertemporal capital asset
income streams and the pricing of options, Bell Journal
pricing model, Econometrica 41, 867–887; Reprinted
of Economics 7, Autumn, 407–425.
as chapter 15 of Merton, R.C. (1990). Continuous-Time
[82] Rubinstein, M. (2006). A History of the Theory of Invest-
Finance, Blackwell, Cambridge, pp. 475–523.
ments My Annotated Bibliography, Wiley, Hoboken.
[62] Merton, R.C. (1990). Continuous-Time Finance, Black-
[83] Samuelson, P.A. (1969). Lifetime portfolio selection
well, Cambridge. (revised paperback edition, 1999
by dynamic stochastic programming, The Review of
reprint).
Economics and Statistics 57(3), 239–246.
[63] Mirrlees, J.A. (1965). Optimum Accumulation Under [84] Sharpe, W.F. (1964). Capital asset prices: a theory of
Uncertainty. unpublished manuscript. December. market equilibrium under conditions of risk, Journal of
[64] Modigliani, F. & Miller, M.H. (1958). The cost of cap- Finance 19(3), 425–442.
ital, corporation finance, and the theory of investment, [85] Sharpe, W.F. (1990). Autobiography, in Les Prix Nobel
The American Economic Review 48, 261–297. 1990, Tore Frängsmyr, ed., Nobel Foundation, Stock-
[65] Mossin, J. (1966). Equilibrium in a capital asset market, holm.
Econometrica 34(4), 768–783. [86] Solnik, B. (1974). An equilibrium model of interna-
[66] Nielsen, L.T. (1990). Equilibrium in CAPM without tional capital markets, Journal of Economic Theory 8(4),
a riskless asset, The Review of Economic Studies 57, 500–524.
315–324. [87] Stapleton, R.C. & Subrahmanyam, M. (1978). A mul-
[67] Nielsen, L.T. (1990). Existence of equilibrium in CAPM, tiperiod equilibrium asset pricing model, Econometrica
Journal of Economic Theory 52, 223–231. 46(5), 1077–1095.
[68] Phelps, E.S. (1962). The accumulation of risky capi- [88] Stone, B.K. (1970). Risk, Return, and Equilibrium, a
tal: a sequential utility analysis, Econometrica 30(4), General Single-Period Theory of Asset Selection and
729–743. Capital-Market Equilibrium, MIT Press, Cambridge.
[69] Poitras, G. (2000). The Early History of Financial [89] Stulz, R.M. (1981). A model of international asset
Economics, Edward Elgar, Chentenham. pricing, Journal of Financial Economics 9(4), 383–406.
[70] Roll, R. (1977). A critique of the asset pricing theory’s [90] Tobin, J. (1958). Liquidity preference as behavior
tests, Journal of Financial Economics 4(2), 129–176. towards risk, The Review of Economic Studies (67),
[71] Rosenberg, B. (1974). Extra-market component of 65–86. Reprinted as Cowles Foundation Paper 118.
covariance in security returns, Journal of Financial and [91] Treynor, J.L. (1961). Market Value, Time and Risk .
Quantitative Analysis 9(2), 263–273. unpublished manuscript dated 8/8/61.
[72] Rosenberg, B. & McKibben, W. (1973). The prediction [92] Treynor, J.L. (1962). Toward a Theory of Market Value
of systematic and specific risk in security returns, of Risky Assets, unpublished manuscript. “Rough Draft”
6 Modern Portfolio Theory
dated by Mr. Treynor to the fall of 1962. A final version of the American Economic Association, Boston, December;
was published in 1999, in Asset Pricing and Portfolio Subsequently extended and published as (1965). Investment
Performance, R.A. Korajczyk, ed., Risk Books, London, decision under uncertainty: choice-theoretic approaches, The
pp. 15–22. Quarterly Journal of Economics 79(5), 509–536; Also, see
[93] Treynor, J.L. (2007). Treynor on Institutional Investing, (1966). Investment decision under uncertainty: applications
Wiley, Hoboken. of the state-preference approach, The Quarterly Journal of
[94] Turner, A.L. (1987). Insurance in an equilibrium asset Economics 80(2), 252–277.
pricing model, in Fair Rate of Return in Property- Itô, K. (1944). Stochastic integrals, Proceedings of the Imperial
Liability Insurance, J.D. Cummins & S.E. Harrington, Academy Tokyo 22, 519–524.
eds, Kluwer Academic Publishers, Norwell. Itô, K. (1951). Stochastic differentials, Applied Mathematics
[95] Williams, J.B. (1938). The Theory of Investment Value, and Optimization 1, 374–381.
Harvard University Press, Cambridge. Itô, K. (1998). My sixty years in studies of probability
[96] Williams, J.T. (1977). Capital asset prices with het- theory, acceptance speech of the Kyoto prize in basic
erogeneous beliefs, Journal of Financial Economics 5, sciences, in The Inamori Foundation Yearbook 1998, Inamori
219–239. Foundation, Kyoto.
[97] Yaari, M.E. (1965). Uncertain lifetime, life insurance, Jensen, M.C. (1968). The performance of mutual funds in the
and the theory of the consumer, The Review of Economic period 1945-64, Journal of Finance 23(2), 389–416.
Studies 32(2), 137–150. Jensen, M.C. (1969). Risk, the pricing of capital assets, and
[98] The Royal Swedish Academy of Sciences (1990). The the evaluation of investment portfolios, Journal of Business
Sveriges Riskbank Prize in Economic Sciences in Mem- 42(2), 167–247.
ory of Alfred Nobel 1990 , Press release 16 October 1990. Keynes, J.M. (1936). The General Theory of Employment,
Interest, and Money, Harcourt Brace, New York.
Further Reading Leontief, W. (1947). Postulates: Keynes’ general theory and
the classicists, in The New Economics: Keynes’ Influence on
Theory and Public Policy, S.E. Harris, ed., Knopf, New York,
Arrow, K.J. (1953). Le Rôle des Valuers Boursières pour la
Chapter 19, pp. 232–242.
Répartition la Meilleure des Risques, Économetrie, Collo-
Lintner, J. (1965). Securities Prices and Risk; the Theory and
ques Internationaux du Centre National de la Recherche
a Comparative Analysis of AT&T and Leading Industrials,
Scientifique 11, 41–47.
Paper Presented at the Bell System Conference on the Eco-
Black, F. & Scholes, M. (1973). The pricing of options and
nomics of Regulated Public Utilities, University of Chicago
corporate liabilities, The Journal of Political Economy 81(3),
637–654. Business School, Chicago, June.
Cootner, P.H. (ed.) (1964). The Random Character of Stock Lintner, J. (1970). The market price of risk, size of market
Market Prices, MIT Press, Cambridge. (Revised edition, and investor’s risk aversion, The Review of Economics and
First MIT Press Paperback Edition, July 1967). Statistics 52, 87–99.
Courtault, J.M., Kabanov, Y., Bru, B., Crépel, P., Lebon, I. & Lintner, J. (1971). The effects of short selling and margin
Le Marchand, A. (2000). Louis Bachelier on the centenary requirements in perfect capital markets, Journal of Financial
of théorie de la spéculation, Mathematical Finance 10(3), and Quantitative Analysis 6, 1173–1196.
341–353. Lintner, J. (1972). Finance and Capital Markets, National
Cvitanić, J., Lazrak, A., Martinelli, L. & Zapatero, F. (2002). Bureau of Economic Research, New York.
Revisiting Treynor and Black (1973): an Intertemporal Model Mandelbrot, B.B. (1987). Louis Bachelier, in The New Pal-
of Active Portfolio Management , unpublished manuscript. grave Finance, J. Eatwell, M. Milgate & P. Newman, eds,
The University of Southern California and the University W.W. Norton, New York, pp. 86–88.
of British Columbia. Markowitz, H.M. (1952). The utility of wealth, The Journal of
Duffie, D. (1996). Dynamic Asset Pricing Theory, 2nd Edition, Political Economy 60(2), 151–158.
Princeton University Press, Princeton. Markowitz, H.M. (1956). The optimization of a quadratic func-
Eatwell, J., Milgate, M. & Newman, P. (eds) (1987). The New tion subject to linear constraints, Naval Research Logistics
Palgrave Finance, W.W. Norton, New York. Quarterly 3, 111–133.
Friedman, M. & Jimmie Savage, L. (1948). The utility analysis Markowitz, H.M. (1957). The elimination form of the inverse
of choices involving risk, The Journal of Political Economy and its application to linear programming, Management
56(4), 279–304. Science 3, 255–269.
Friend, I. & Bicksler, J.L. (1976). Risk and Return in Finance, Marschak, J. (1950). Rational behavior, uncertain prospects,
Ballinger, Cambridge. and measurable utility, Econometrica 18(2), 111–141.
Hakansson, N.H. (1987). Portfolio analysis, in The New Pal- Marschak, J. (1951). Why “Should” statisticians and busi-
grave Finance, J. Eatwell, M. Milgate & P. Newman, eds, nessmen maximize “moral expectation”?, Proceedings of
W.W. Norton, New York, pp. 227–236. the Second Berkeley Symposium on Mathematical Statistics
Hirshleifer, J. (1963). Investment Decision Under Uncertainty, and Probability, University of California Press, Berkeley,
Papers and Proceedings of the Seventy-Sixth Annual Meeting pp. 493–506. Reprinted as Cowles Foundation Paper 53.
Modern Portfolio Theory 7
Marshall, A. (1890, 1891). Principles of Economics, 2nd Sharpe, W.F. (1961a). Portfolio Analysis Based on a Simpli-
Edition, Macmillan and Co., London and New York. fied Model of the Relationships Among Securities, unpub-
Merton, R.C. (1970). A Dynamic General Equilibrium Model lished doctoral dissertation. University of California at Los
of the Asset Market and Its Application to the Pricing of Angeles, Los Angeles.
the Capital Structure of the Firm, Working Paper 497-70, Sharpe, W.F. (1961b). A Computer Program for Portfolio Anal-
Sloan School of Management, MIT, Cambridge; Reprinted ysis Based on a Simplified Model of the Relationships Among
as chapter 11 of Merton, R.C. (1990). Continuous-Time Securities, unpublished mimeo. University of Washington,
Finance, Blackwell, Cambridge, pp. 357–387. Seattle.
Merton, R.C. (1972). An analytic derivation of the efficient Sharpe, W.F. (1963). A simplified model for portfolio analysis,
portfolio frontier, Journal of Financial and Quantitative Management Science 9(2), 277–293.
Analysis 7, 1851–1872. Sharpe, W.F. (1966). Mutual fund performance, Journal of
Miller, M.H. & Modigliani, F. (1961). Dividend policy, Business 39,(Suppl), 119–138.
growth and the valuation of shares, Journal of Business 34, Sharpe, W.F. (1970). Portfolio Theory and Capital Markets,
235–264. McGraw-Hill, New York.
Modigliani, F. & Miller, M.H. (1963). Corporate income taxes Sharpe, W.F. (1977). The capital asset pricing model: a
and the cost of capital, The American Economic Review 53, ‘multi-Beta’ interpretation, in Financial Decision Making
433–443. Under Uncertainty, H. Levy & M. Sarnatt, eds, Har-
Mossin, J. (1968). Optimal multiperiod portfolio policies, court Brace Jovanovich, Academic Press, New York, pp.
Journal of Business 4(2), 215–229. 127–136.
Mossin, J. (1969a). A note on uncertainty and preferences in Sharpe, W.F. & Alexander, G.J. (1978). Investments, 4th
a temporal context, The American Economic Review 59(1), Edition, (1990), Prentice-Hall, Englewood Cliffs.
172–174. Taqqu, M.S. (2001). Bachelier and his times: a conver-
Mossin, J. (1969b). Security pricing and investment criteria in sation with Bernard Bru, Finance and Stochastics 5(1),
competitive markets, The American Economic Review 59(5), 3–32.
749–756. Treynor, J.L. (1963). Implications for the Theory of Finance,
Mossin, J. (1973). Theory of Financial Markets, Prentice-Hall, unpublished manuscript. “Rough Draft” dated by Mr.
Englewood Cliffs. Treynor to the spring of 1963.
Mossin, J. (1977). The Economic Efficiency of Financial Mar- Treynor, J.L. (1965). How to rate management of investment
kets, Lexington, Lanham. funds, Harvard Business Review 43, 63–75.
von Neumann, J.L. & Morgenstern, O. (1953). Theory of Treynor, J.L. & Black, F. (1973). How to use security analysis
Games and Economic Behavior, 3rd Edition, Princeton to improve portfolio selection, Journal of Business 46(1),
University Press, Princeton. 66–88.
Roy, A.D. (1956). Risk and rank or safety first generalised,
Economica 23(91), 214–228.
Rubinstein, M. (1970). Addendum (1970), in Portfolio Related Articles
Selection: Efficient Diversification of Investments, Cowles
Foundation for Research in Economics at Yale University,
Monograph #6, H.M. Markowitz, ed., 1959. John Wiley & Bernoulli, Jacob; Black–Litterman Approach;
Sons, Inc., New York. (2nd Edition, 1991, Basil Blackwell, Risk–Return Analysis; Markowitz, Harry; Mutual
Inc., Cambridge), pp. 308–315. Funds; Sharpe, William F..
Savage, L.J. (1954). The Foundations of Statistics, John Wiley
& Sons, New York. CRAIG W. FRENCH
Long-Term Capital Management

Background

Long-Term Capital Management (LTCM) launched its flagship fund on February 24, 1994, with $1.125 billion in capital, making it the largest start-up hedge fund to date. Over $100 million came from the partners themselves, especially those who came from the proprietary trading operation that John Meriwether had headed at Salomon Brothers. At Salomon, the profit generated by this group had regularly exceeded the profit generated by the entire firm, and the idea of LTCM was to continue this record on their own. To help them, they also recruited a dream team of academic talent, most notably Myron Scholes and Robert Merton (see Merton, Robert C.), who would win the 1997 Nobel Prize in Economics for their pioneering work in financial economics. But they were not alone; half of the founding partners taught finance at major business schools.

The first few years of the fund continued the success of the Salomon years (Table 1).

The fund was closed to new capital in 1995 and quickly grew to $7.5 billion of capital by the end of 1997. At this time the partners decided, given the lack of additional opportunities, to pay a dividend of $2.7 billion, which left the capital at the beginning of 1998 at $4.8 billion.

Table 1 LTCM returns

Year   Net return (%)   Gross return (%)   Dollar profits ($ billions)   Ending capital ($ billions)
1994   20               28                 0.4                           1.6
1995   43               59                 1.3                           3.6
1996   41               57                 2.1                           5.2
1997   17               25                 1.4                           7.5

Investment Style

The fund invested in relative-value convergence trades. They would buy cheap assets and hedge as many of the systematic risk factors as possible by selling rich assets. The resulting “spread” trade had significantly less risk than the outright trade, so LTCM would lever the spread trade to raise the overall risk level, as well as the expected return on invested capital.

An example of such a trade is an on-the-run versus off-the-run trade. In August 1998, 30-year treasuries (the on-the-run bond) had a yield to maturity of 5.50%. The 29-year bond (the off-the-run issue) was 12 basis points (bp) cheaper, with a yield to maturity of 5.62%. The outright risk of 30-year treasury bonds was a standard deviation of around 85 bp per year. The spread trade only had a risk level of around 3.5 bp per year, so the spread trade could be levered 25 to 30 to 1, bringing it in line with the market risk of 30-year treasuries.

LTCM would never do a trade that looked mathematically attractive according to its models unless the partners qualitatively understood why the trade worked and what forces would bring the “spreads” to convergence. In the case of the on-the-run versus off-the-run trade, the main force leading to a difference in yields between the two bonds is liquidity. The 30-year bond is priced higher by 12 bp (approximately 1.2 points on a par bond) because some investors are willing to pay more to own a more liquid bond. But in six months’ time, when the treasury issues a new 30-year bond, that new bond will be the most liquid one and the old 30-year bond will lose its liquidity premium. This means that in six months’ time, it will trade at a yield similar to that of the old 29-year bond, thus bringing about a convergence of the spread.

LTCM was involved in many such relative-value trades, in many different and seemingly unrelated markets and instruments. These included trades in government bond spreads, swap spreads, yield curve arbitrage, mortgage arbitrage, volatility spreads, risk arbitrage, and equity relative-value trades. In each case, the bet was that some spread would converge over time.

Risk Management

LTCM knew that a major risk in pursuing relative-value convergence trades was whether it could hold the trades until they converged. To ensure this, LTCM insisted that investors lock in equity capital for 3 years, so there would be no premature liquidation from investor cashout. This equity lock-in also gave counterparties comfort that LTCM had long-lasting
creditworthiness, and that enabled LTCM to acquire preferential financing.

As a further protection, LTCM also made extensive use of term financing. If the on-the-run/off-the-run trade might take six months to converge, LTCM would finance the securities for six months, instead of rolling the financing overnight. LTCM also had two-way mark-to-market provisions in all of its over-the-counter contracts. Thus for its relative-value trades that consisted of both securities and contractual agreements it had fully symmetric marks, so that the only time LTCM had to put additional equity capital into a trade was if the spreads widened out. The fund also had term debt and backstop credit lines in place as alternative funding.

LTCM also stress tested its portfolio relative to potential economic shocks to the system, and hedged against the consequences. As an example, in 1995, LTCM had a large swapped position in Italian government bonds. The firm got very worried that if the Republic of Italy defaulted, it would have a sizable loss. So it purchased insurance against this potential default by doing a credit default swap on the Italian government bonds.

But risk management relied primarily on the benefit that the portfolio obtained from diversification. If the relative-value strategies had very low correlations with each other, then the risk of the overall portfolio would be low. LTCM assumed that in the long run these correlations were low because of the loose economic ties between the trades, although in the short run these correlations could be significantly higher. LTCM also assumed that the downside risk on some of the trades was diminished, as spreads got very wide, on the assumption that other leveraged funds would rush in to take advantage. In retrospect, these assumptions were all falsified by experience.

Before the crisis, LTCM had a historical risk level of a $45 million daily standard deviation of return on the fund. See Figure 1 for historical daily returns.

After the fund reached global scale in 1995, the risk level was remarkably stable. In fact, the partners had actually predicted a higher risk level for the fund as they assumed that the correlations among the relative-value trades would be higher than historical levels. But in 1998, all this changed.
Figure 1 Historical daily returns (millions of dollars), February 24, 1994 to July 22, 1998
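As a rough illustration of the two quantitative ideas above, the following Python sketch computes the leverage at which the spread trade would carry the same risk as the outright bond, using the 85 bp and 3.5 bp figures quoted in the text, and the standard deviation of a portfolio of equally risky trades that share a common pairwise correlation. The function names, the choice of ten trades, and the $10 million risk per trade are hypothetical and are not taken from the article; the equal-risk, equal-correlation setup is a simplifying assumption.

```python
import math

# Figures quoted in the text for the on-the-run versus off-the-run trade.
OUTRIGHT_RISK_BP = 85.0  # annual standard deviation of 30-year treasuries
SPREAD_RISK_BP = 3.5     # annual standard deviation of the spread trade


def leverage_to_match(outright_risk_bp: float, spread_risk_bp: float) -> float:
    """Leverage at which the spread trade has the same risk as the outright position."""
    return outright_risk_bp / spread_risk_bp


def portfolio_risk(n_trades: int, risk_per_trade: float, correlation: float) -> float:
    """Standard deviation of the sum of n equally risky trades.

    Assumes every pair of trades has the same correlation, so that
    variance = n * sigma^2 * (1 + (n - 1) * rho).
    """
    variance = n_trades * risk_per_trade ** 2 * (1.0 + (n_trades - 1) * correlation)
    return math.sqrt(variance)


# Hypothetical book: 10 trades, each with $10 million of daily risk.
N_TRADES = 10
RISK_PER_TRADE = 10.0

print(f"matching leverage: {leverage_to_match(OUTRIGHT_RISK_BP, SPREAD_RISK_BP):.1f} to 1")
for rho in (0.0, 0.2, 1.0):
    risk = portfolio_risk(N_TRADES, RISK_PER_TRADE, rho)
    print(f"correlation {rho:.1f}: portfolio risk {risk:.1f}")
```

The ratio 85/3.5 is roughly 24, close to the lower end of the 25 to 30 to 1 range quoted in the text. In the hypothetical book, moving the common correlation from zero to one multiplies the portfolio risk by the square root of the number of trades, which is why the assumption of low correlations carried so much weight.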
The 1998 Crisis

In 1998, LTCM was up slightly in the first four months of the year. Then, in May, the portfolio lost 6% and in June, it lost 10%. In early July, the portfolio rebounded by about 7% and the partners reduced the underlying risk of the portfolio accordingly by about 10%.

The crisis was triggered by the Russian default on its domestic bonds on August 17, 1998. While LTCM did not have many Russian positions so that its direct losses were small, the default did initiate the process that was to follow as unrelated markets all over the world reacted. On Friday August 21, LTCM had a one-day loss of $550 million. (A risk arb deal that was set to close on that day, that of Ciena and Tellabs, broke, causing a $160 million loss. Swap spreads that normally move about 1 bp a day were out 21 bp intraday.) The Russian debt crisis had triggered a flight out of all relative-value positions. In the illiquid days at the end of August, these liquidations caused a downward spiral as new losses led to more liquidations and more losses. The result was that by the end of August LTCM was down by 53% for the year, with the capital now at $2.3 billion.

While the Russian default triggered the economic crisis in August, it was an LTCM crisis in September. Would the fund fail? Many other institutions with similar positions liquidated them in advance of the potential failure. Some market participants bet against the firm and counterparties marked contractual agreements at extremely wide levels to obtain additional cushions against bankruptcy. The partners hired Goldman Sachs to help them raise additional capital and to sell off assets; for this, they received 50% of the management company.

The leverage of the firm went to enormous levels involuntarily (Figure 2), not because of an increase in assets but because of equity falling. In the event, attempts to raise additional funds failed and on Monday, September 21, the fund lost another $550 million, putting its capital for the first time below $1 billion. On Wednesday, at the behest of the Federal Reserve, the 15 major counterparties met at the New York Fed to discuss the situation.

During the meeting, at 11:00 AM the partners received a telephone call from Warren Buffett, who was on a satellite phone while vacationing with Bill Gates in Alaska. He said that LTCM was about to receive a bid on its entire portfolio from him and that he hoped they would seriously consider it. At 11:30 AM LTCM received the fax message given in Figure 3.
Figure 2 Leverage, June 1994 to September 1998
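The capital figures quoted for 1998, and the point that leverage rose because equity fell rather than because assets grew, can be checked with a short Python sketch. The starting capital, the 53% loss, and the equity levels come from the text; the asset figure of $100 billion is purely hypothetical (the article does not state LTCM's total assets), and the function names are illustrative.

```python
# Figures quoted in the text, in $ billions.
START_CAPITAL_1998 = 4.8     # capital at the beginning of 1998
LOSS_THROUGH_AUGUST = 0.53   # down 53% for the year by the end of August


def capital_after_loss(capital: float, loss_fraction: float) -> float:
    """Capital remaining after losing the given fraction."""
    return capital * (1.0 - loss_fraction)


def leverage(assets: float, equity: float) -> float:
    """Leverage as the ratio of assets to equity."""
    return assets / equity


end_august = capital_after_loss(START_CAPITAL_1998, LOSS_THROUGH_AUGUST)
print(f"capital at end of August: ${end_august:.2f} billion")

# HYPOTHETICAL asset level, held constant to isolate the effect of falling equity.
HYPOTHETICAL_ASSETS = 100.0
for equity in (START_CAPITAL_1998, end_august, 1.0):
    ratio = leverage(HYPOTHETICAL_ASSETS, equity)
    print(f"equity ${equity:.2f} billion -> leverage {ratio:.0f} to 1")
```

A 53% loss on $4.8 billion leaves about $2.26 billion, which agrees with the $2.3 billion quoted for the end of August. With assets held fixed, each fall in equity raises the leverage ratio mechanically, whatever asset level is assumed.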
HIGHLY CONFIDENTIAL
Subject to the following deal structure, the partnership described below proposes to purchase
the assets of Long-Term Capital Management (and/or its affiliates and subsidiaries, collectively
referred to as "Long-Term Capital") for $250 million.
The purchaser will be a limited partnership whose investors will be Berkshire Hathaway for $3
billion, American International Group for $700 million and Goldman Sachs for $300 million (or
each of their respective affiliates). All management of the assets will be under the sole control
of the partnership and will be transferred to the partnership in an orderly manner.
1) The limited partnership described herein will not assume any liabilities of Long-Term
Capital arising from any activities prior to the purchase by the partnership
2) All current financing provided to Long-Term Capital will remain in place under current
terms and conditions.
The names of the proposal participants may not be disclosed to anyone. If the names are
disclosed, the bid will expire.
This bid will expire at 12:30 p.m. New York time on September 23, 1998.
Sincerely,
John Meriwether
Figure 3 Copy of the $250 million offer for Long-Term Capital Management
The partners were unable to accept the proposal as it was crafted. The fund had approximately 15,000 distinct positions. Each of these positions was a credit counterparty transaction (i.e., a repo or swap contract). Transfer of those positions to the Buffett-led group would require the approval of all the counterparties. Clearly, all of LTCM’s counterparties would prefer to have Warren Buffett as a creditor as opposed to an about-to-be-bankrupt hedge fund. But it was going to be next to impossible to obtain complete approval in one hour.

The partners proposed, as an alternative, that the group make an emergency equity infusion into the fund in return for 90% ownership and the right to kick the partners out as managers. Under this plan, all the financing would stay in place and the third-party investors could be redeemed at any time. Unfortunately, the lawyers were not able to get Buffett back on his satellite phone and no one was prepared to consummate the deal without his approval.

At the end of the day, 14 financial institutions (everyone with the exception of Bear Stearns) agreed to make an emergency $3.625 billion equity infusion into the fund. The plan was essentially a no-fault bankruptcy where the creditors of a company (in this case, the secured creditors) make an equity investment, cramming down the old equity holders, in order to liquidate the company in an orderly manner.

Why did the Fed orchestrate the bailout? The answer has to do with how the bankruptcy laws are applied with respect to financial firms. When LTCM did the on-the-run versus off-the-run strategy, the risk of the two sides of the trade netted within the fund. But in bankruptcy, each side of the trade liquidates its collateral separately, and sends a bill to LTCM. The risk involved in the position is thus no longer netted at 3.5 bp but is actually 85 bp per side. Although the netted risk of LTCM was $45 million per day, the gross risk was much larger, more like $30 million per day with each of 15 counterparties.

As conditions worsened, early in September, the partners had been going around to the counterparties and explaining this enormous potential risk factor in the event of bankruptcy and the large losses that the counterparties would potentially face. They separately asked each dealer to make an equity infusion to shore up LTCM’s capital situation. But it was a classic Prisoner’s Dilemma problem. No dealer would commit unless everyone else did. It was necessary to get everyone in the same room, so that they would all know the full extent of the exposures and all commit together, and that could not happen until bankruptcy was imminent.

In this event, the private bailout was a success. No counterparty had any losses on their collateral. By the end of the first quarter of 1999, the fund had rallied 25% from its value at the time of the bailout. At that time third-party investors were paid off. The consortium of banks decided to continue the liquidation at a faster pace and, by December 1999, the liquidation was complete. The banks had no losses and had made a 10% return on their investment.

Investors who had made a $1 investment at the beginning of 1998 would have seen their investment fall to 8 cents at the time of the bailout, and would have received 10 cents on April 1, 1999. But in its earlier years, LTCM had made high returns and paid out high dividends such that of its 100 investors only 12 actually lost money, and only 6 lost more than $2 million. The median investor actually had a 19% internal rate of return (IRR) even including the loss.

The partners did not fare as well. Their capital was about $2 billion at the beginning of 1998 and they received no final payout.

Lessons Learned

The LTCM crisis illustrates some of the pitfalls of a VaR-based risk management system (see Value-at-Risk), where the risk of the portfolio is determined by the exogenous economic relationships among the trades. During the crisis, all of LTCM’s trades moved together with correlations approaching one, even though the trades were economically diverse. It was hard to believe that the returns from US mortgage arbitrage trades would be highly related to LTCM’s Japanese warrant and convertible book or highly related to their European government bond spread trades. Yet, during the crisis these correlations all moved toward one, resulting in a failure of diversification and creating enormous risk for the fund.

What was the common thread in all of these trades? It was not that they were economically related, but more that they had similar holders of the trades with common risk tolerances. When these hedge funds and proprietary trading groups at the banks lost money in the Russian crisis they were ordered by senior management to reduce their risk exposures. The trades that they took off were the relative-value trades. As they unwound their positions in the illiquid days of August, the spreads went out further, causing more losses and further unwinds.

This risk might be better classified as endogenous risk, risk that comes about not from the fundamental
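Two pieces of arithmetic in this section can be checked with a short Python sketch: the investor payout after the post-bailout rally, and the contrast between the fund's netted daily risk and its counterparty-by-counterparty gross risk. All inputs are figures quoted in the text. Adding the 15 counterparty exposures into a single total is an assumption made here for illustration, since the article quotes only the per-counterparty figure; the function names are hypothetical.

```python
# Figures quoted in the text.
VALUE_AT_BAILOUT_CENTS = 8.0          # value of $1 invested at the start of 1998
RALLY_AFTER_BAILOUT = 0.25            # rally by the end of the first quarter of 1999
NET_RISK_MUSD = 45.0                  # netted daily risk of the fund, $ millions
GROSS_RISK_PER_COUNTERPARTY = 30.0    # daily risk with each counterparty, $ millions
N_COUNTERPARTIES = 15


def value_after_rally(value_cents: float, rally: float) -> float:
    """Value of the investment after a fractional rally."""
    return value_cents * (1.0 + rally)


def summed_gross_risk(per_counterparty: float, n_counterparties: int) -> float:
    """Simple sum of per-counterparty risk (an illustrative aggregation, not from the text)."""
    return per_counterparty * n_counterparties


payout_cents = value_after_rally(VALUE_AT_BAILOUT_CENTS, RALLY_AFTER_BAILOUT)
gross_musd = summed_gross_risk(GROSS_RISK_PER_COUNTERPARTY, N_COUNTERPARTIES)

print(f"payout per $1 invested: {payout_cents:.0f} cents")
print(f"summed gross risk: ${gross_musd:.0f} million per day")
print(f"gross to net ratio: {gross_musd / NET_RISK_MUSD:.0f} to 1")
```

The 25% rally on 8 cents reproduces the 10 cents paid out on April 1, 1999. Under the simple-sum assumption, the unnetted exposure is ten times the netted $45 million per day, which gives a sense of why the counterparties had reason to act together.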
6 Long-Term Capital Management
30% above fundamentals in late summer 1929. White [107] suggests that the 1929 boom cannot be readily explained by fundamentals, represented by expected dividend growth or changes in the equity premium.

While Galbraith’s and Kindleberger’s classical views have been most often cited by the mass media, they had received little scholarly attention. Since the 1960s, in parallel with the emergence of the efficient-market hypothesis, their position has lost ground among economists and especially among financial economists. More recent works, described at the end of this article, revive their views in the form of quantitative diagnostics.

Efficient-market Hypothesis

The efficient-markets hypothesis (see Efficient Market Hypothesis) states that asset prices reflect fundamental value, defined as the discounted sum of expected future cash flows where, in forming expectations, investors “correctly process” all available information. Therefore, in an efficient market, there is “no free lunch”: no investment strategy can earn excess risk-adjusted average returns or average returns greater than are warranted for its risk. Proponents of the efficient-markets hypothesis, Friedman and Schwartz [39] and Fama [34], argue that rational speculative activity would eliminate riskless arbitrage opportunities. Fama ([34], p.38) states that, if there are many sophisticated traders in the market, they may cause these bubbles to burst before they have a chance to really get under way.

However, after years of effort, it has become clear that some basic empirical facts about the stock markets cannot be understood in this framework [106]. The efficient-markets hypothesis entirely lost ground after the burst of the Internet bubble in 2000, which provided one of the most striking recent episodes of anomalous price behavior and volatility in one of the most developed capital markets of the world. The movement of Internet stock prices during the late 1990s was extraordinary in many respects. The Internet sector earned over 1000% returns on its public equity in the two-year period from early 1998 through February 2000. The valuations of these stocks began to collapse shortly thereafter and by the end of the same year, they had returned to pre-1998 levels, losing nearly 70% from the peak. The extraordinary returns of 1998–February 2000 had largely disappeared by the end of 2000. Although in February 2000 the vast majority of Internet-related companies had negative earnings, the Internet sector in the United States was equal to 6% of the market capitalization of all US public companies and 20% of the publicly traded volume of the US stock market [82, 83].

Ofek and Richardson [83] used the financial data from 400 companies in the Internet-related sectors and analyzed to what extent their stock prices differed from their fundamental values estimated by using the Miller and Modigliani [79] model for stock valuation [38]. Since almost all companies in the Internet sector had negative earnings, they estimated the (implied) price-to-earnings (P/E) ratios, which are derived from the revenue streams of these firms rather than their earnings that would be read from the 1999 financial data. Their results are striking. Almost 20% of the Internet-related firms have P/E ratios in excess of 1500, while over 50% exceed 500, and the aggregate P/E ratio of the entire Internet sector is 605. Under the assumption that the aggregate long-run P/E ratio is 20 on average (which is already at the high end from a historical point of view), the Internet sector would have needed to generate 40.6% excess returns over a 10-year period to justify the P/E ratio of 605 implied in 2000. The vast majority of the implied P/Es are much too high relative to the P/Es usually obtained by firms. By almost any standard, this clearly represented “irrational” valuation levels. These and similar figures led many to believe that this set of stocks was in the midst of an asset price bubble.

From the theoretical point of view, some rational equilibrium asset-pricing models allow for the presence of bubbles, as pointed out for infinite-horizon models in discrete-time setups by Blanchard and Watson [9]. Loewenstein and Willard [70, 71] characterized the necessary and sufficient conditions for the absence of bubbles in complete and incomplete markets equilibria with several types of borrowing constraints and in which agents are allowed to trade continuously. For zero net supply assets, including financial derivatives with finite maturities, they show that bubbles can generally exist and have properties different from their discrete-time, infinite-horizon counterparts. However, Lux and Sornette [73] demonstrated that exogenous rational bubbles are hardly reconcilable with some of the stylized facts of financial data at a very elementary level.
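
The P/E arithmetic quoted above from Ofek and Richardson [83] can be checked with a short calculation. The sketch below rests on an interpretation that the text does not state explicitly: the price is held fixed relative to the required return while earnings grow at a constant annual excess rate, so that the aggregate P/E ratio shrinks from 605 to the assumed long-run value of 20 over 10 years. The function name `implied_annual_excess_growth` is illustrative only and does not come from the source.

```python
def implied_annual_excess_growth(pe_now: float, pe_long_run: float, years: int) -> float:
    """Annual excess growth rate g solving pe_now / (1 + g)**years == pe_long_run.

    Assumption (not from the source): the P/E ratio normalizes purely through
    compounded excess growth at a constant annual rate over the horizon.
    """
    if pe_now <= 0 or pe_long_run <= 0 or years <= 0:
        raise ValueError("P/E ratios and horizon must be positive")
    return (pe_now / pe_long_run) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Figures taken from the text: aggregate P/E of 605, long-run P/E of 20, 10 years.
    g = implied_annual_excess_growth(pe_now=605.0, pe_long_run=20.0, years=10)
    print(f"implied annual excess growth: {g:.1%}")
```

Under these assumptions the result rounds to 40.6% per year, which is consistent with reading the quoted 40.6% figure as an annual rate compounded over the 10-year period.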
Bubbles and Crashes 3
Jarrow et al. [53] showed that if financial agents prefer more to less (no dominance assumption), then the only bubbles that can exist in complete markets are uniformly integrable martingales, and these can exist with an infinite lifetime. Under these conditions, the put–call parity holds and there are no bubbles in standard call and put options. Their analysis implies that if one believes that asset price bubbles exist, then asset markets must be incomplete. Jarrow et al. [54] extend their discussion in [53] to characterize all possible price bubbles in an incomplete market, satisfying the “no free lunch with vanishing risk” and “no dominance” assumptions. Their [54] new theory for bubbles is formulated in terms of different local martingale measures across time, which leads to some testable predictions on derivative pricing in the presence of bubbles.

Heterogeneous Beliefs and Limits to Arbitrage

The collapsing Internet bubble has thrown new light on the old subject and raised the acute question of why rational investors have not moved earlier into the market and driven the Internet stock prices back to their fundamental valuations.

Two conditions are, in general, invoked as being necessary for prices to deviate from the fundamental value. First, there must be some degree of irrationality in the market; that is, investors’ demand for stocks must be driven by something other than fundamentals, such as overconfidence in the future. Second, even if a market has such investors, the general argument is that rational investors will drive prices back to fundamental value. To avoid this, there needs to be some limit on arbitrage. Shleifer and Vishny [92] provide a description for various limits of arbitrage. With respect to the equity market, clearly the most important impediment to arbitrage is short-sales restrictions. Roughly 70% of mutual funds explicitly state (in the Securities and Exchange Commission (SEC) form N-SAR) that they are not permitted to sell short [2]. Seventy-nine percent of equity mutual funds make no use of derivatives whatsoever (either futures or options), suggesting further that funds do not take synthetically short positions [64]. These figures indicate that the vast majority of funds never take short positions.

Recognizing that the world has limited arbitrage and significant numbers of irrational investors, the finance literature has evolved to increasingly recognize the evidence of deviations from the fundamental value. One important class of theories shows that there can be large movements in asset prices caused by the combined effects of heterogeneous beliefs and short-sales constraints. The basic idea traces back to the original capital asset pricing model (CAPM) theories, in particular, to Lintner’s model of asset prices with investors having heterogeneous beliefs [69]. In his model, asset prices are a weighted average of beliefs about asset payoffs with the weights being determined by the investor’s risk aversion and beliefs about asset price covariances. Lintner [69] and many others after him show that widely inflated prices can occur.

Many other asset-pricing models in the spirit of Lintner [69] have been proposed [19, 29, 48, 52, 78, 89]. In these models that assume heterogeneous beliefs and short-sales restrictions, the asset prices are determined at equilibrium to the extent that they reflect the heterogeneous beliefs about payoffs, but short-sales restrictions force the pessimistic investors out of the market, leaving only optimistic investors and thus inflated asset price levels. However, when short-sales restrictions no longer bind investors, then prices fall. This provides a possible account of the bursting of the Internet bubble that developed in 1998–2000. As documented by Ofek and Richardson [83], and by Cochrane [20], typically as much as 80% of Internet-related shares were locked up. This is due to the fact that many Internet companies had gone through recent initial public offerings (IPOs) and regulations require that shares held by insiders and other pre-IPO equity holders not be traded for at least six months after the IPO date. The float of the Internet sector dramatically increased as the lockups of many of these stocks expired. The unlocking of literally hundreds of billions of dollars of shares in the Internet sector in Spring 2000 was equivalent to removing short-sales restrictions. And the collapse of Internet stock prices coincided with a dramatic expansion in the number of publicly tradable shares of Internet companies. Among many others, Hong et al. [49] explicitly model the relationship between the number of publicly tradable shares of an asset and the propensity for speculative bubbles to form. So far, the theoretical models based on agents with heterogeneous beliefs facing short-sales restrictions are considered among the most
convincing models to explain the burst of the Internet bubble.

Another test of this hypothesis on the origin of the 2000 market crash is provided by the search for possible discrepancies between option and stock prices. Indeed, even though it is difficult for rational investors to borrow Internet stocks for short selling due to the lockup period discussed above, they should have been able to construct equivalent synthetic short positions by purchasing puts and writing calls in the option market and either borrowing or lending cash, without the need for borrowing the stocks. The question is now transformed into finding some evidence for the use or the absence of such a strategy and the reason for its absence in the latter case. One possible thread is that, if short selling through option positions was difficult or impractical, prices in the stock and options markets should decouple [67]. Using a sample of closing bid and ask prices for 9026 option pairs for three days in February 2000 along with closing trade prices for the underlying equities, Ofek and Richardson [83] find that 36% of the Internet stocks had put–call parity violations as compared to only 23.8% of the other stocks. One reason for put–call parity violations may be that short-sale restrictions prevent arbitrage from equilibrating option and stock prices. Hence, one interpretation of the finding that there are more put–call parity violations for Internet stocks is that short-sale constraints are more frequently binding for Internet stocks. Furthermore, Ofek et al. [84] provide a comprehensive comparison of the prices of stocks and options, using closing options quotes and closing trades on the underlying stock for July 1999 through November 2001. They find that there are large differences between the synthetic stock price and the actual stock price, which implies the presence of apparent arbitrage opportunities involving selling actual shares and buying synthetic shares. They interpret their findings as evidence that short-sale constraints provide meaningful limits to arbitrage that can allow prices of identical assets to diverge.

By defining a bubble as a price process that, when discounted, is a local martingale under the risk-neutral measure but not a martingale, Cox and Hobson [21] provide a complementary explanation for the failure of put–call parity. Intuitively, the local martingale model views a bubble as a stopped stochastic process for which the expectation exhibits a discontinuity when it ends. It can then be shown that several standard results fail for local martingales: put–call parity does not hold, the price of an American call exceeds that of a European call, and call prices are no longer increasing in maturity (for a fixed strike).

Thus, it would seem that the issue of the origin of the 2000 crash is settled. However, Battalio and Schultz [6] arrive at the opposite conclusion, using proprietary intraday option trade and quote data generated in the days surrounding the collapse of the Internet bubble. They find that the general public could cheaply short synthetically using options, and this information could have been transmitted to the stock market, in line with the absence of evidence that synthetic stock prices diverged from actual stock prices. The difference between the work of Ofek and Richardson [83] and Ofek et al. [84], on the one hand, and Battalio and Schultz [6], on the other, is that the former used closing option quotes and last stock trade prices from the OptionMetrics Ivy database. As pointed out by Battalio and Schultz [6], OptionMetrics matches closing stock trades that occurred no later than 4:00 pm, and perhaps much earlier, with closing option quotes posted at 4:02 pm. Furthermore, option market makers that post closing quotes on day t are not required to trade at those quotes on day t + 1. Likewise, dealers and specialists in the underlying stocks have no obligation to execute incoming orders at the price of the most recent transaction. Hence, closing option quotes and closing stock prices obtained from the OptionMetrics database do not represent contemporaneous prices at which investors could have simultaneously traded. To address this problem, Battalio and Schultz [6] use a unique set of intraday option price data. They first ensure that the synthetic and the actual stock prices that they compare are synchronous, and then, they discard quotes that, according to exchange rules, are only indicative of the prices at which liquidity demanders could have traded. They find that almost all of the remaining apparent put–call parity violations disappear when they discard locked or crossed quotes and quotes from fast options markets. In other words, the apparent arbitrage opportunities almost always arise from quotes upon which investors could not actually trade. Battalio and Schultz [6] conclude that short-sale constraints were not responsible for the high prices of Internet stocks at the peak of the bubble and that small investors could have
sold short synthetically using options, and this information would have been transmitted to the stock market. The fact that investors did not take advantage of these opportunities to profit from overpriced Internet stocks suggests that the overpricing was not as obvious then as it is now, with the benefit of hindsight. Schultz [90] provides additional evidence that contemporaneous lockup expirations and equity offerings do not explain the collapse of Internet stocks because the stocks that were restricted to a fixed supply of shares by lockup provisions actually performed worse than stocks with an increasing supply of shares. This shows that current explanations for the collapse of Internet stocks are incomplete.

Riding Bubbles

One cannot understand crashes without knowing the origin of bubbles. In a nutshell, speculative bubbles are caused by “precipitating factors” that change public opinion about markets or that have an immediate impact on demand and by “amplification mechanisms” that take the form of price-to-price feedback, as stressed by Shiller [91]. Consider the example of a housing-market bubble. A number of fundamental factors can influence price movements in housing markets. The following characteristics have been shown to influence the demand for housing: demographics, income growth, employment growth, changes in financing mechanisms, interest rates, as well as changes in the characteristics of the geographic location such as accessibility, schools, or crime, to name a few. On the supply side, attention has been paid to construction costs, the age of the housing stock, and the industrial organization of the housing market. The elasticity of supply has been shown to be a critical factor in the cyclical behavior of home prices. The cyclical process that we observed in the 1980s in those cities experiencing boom-and-bust cycles was caused by the general economic expansion, best proxied by employment gains, which drove up the demand. In the short run, those increases in demand encountered an inelastic supply of housing and developable land, inventories of for-sale properties shrank, and vacancy declined. As a consequence, prices accelerated. This provided an amplification mechanism as it led buyers to anticipate further gains, and the bubble was born. Once prices overshoot or supply catches up, inventories begin to rise, time on the market increases, vacancy rises, and price increases slow down, eventually encountering downward stickiness. The predominant story about home prices is always the prices themselves [91, 93]; the feedback from initial price increases to further price increases is a mechanism that amplifies the effects of the precipitating factors. If prices are going up rapidly, there is much word-of-mouth communication, a hallmark of a bubble. The word of mouth can spread optimistic stories and thus help cause an overreaction to other stories, such as ones about employment. The amplification can work on the downside as well.

Hedge funds are among the most sophisticated investors, probably closer to the ideal of “rational arbitrageurs” than any other class of investors. It is therefore particularly telling that successful hedge-fund managers have been repeatedly reported to ride rather than attack bubbles, suggesting the existence of mechanisms that entice rational investors to surf bubbles rather than attempt to arbitrage them. However, the evidence may not be that strong and could even be circular, since only successful hedge-fund managers would survive a given 2–5 year period, opening the possibility that the mentioned evidence could result in large part from a survival bias [14, 44]. Keeping this in mind, we now discuss two classes of models, which attempt to justify why sophisticated “rational” traders would be willing to ride bubbles. These models share a common theme: rational investors try to ride bubbles, and the incentive to ride the bubble stems from predictable “sentiment”, that is, anticipation of continuing bubble growth [1] and predictable feedback trader demand [26, 27]. An important implication of these theories is that rational investors should be able to reap gains from riding a bubble at the expense of less-sophisticated investors.

Positive Feedback Trading by Noise Traders

The term noise traders was introduced first by Kyle [65] and Black [8] to describe irrational investors. Thereafter, many scholars exploited this concept to extend the standard models by introducing the simplest possible heterogeneity in terms
of two interacting populations of rational and irrational agents. One can say that the one-representative-agent theory is being progressively replaced by a two-representative-agents theory, analogously to the progress from the one-body to the two-body problems in astronomy.

De Long et al. [26, 27] introduced a model of market bubbles and crashes, which exploits this idea of the possible role of noise traders in the development of bubbles as a possible mechanism for why asset prices may deviate from the fundamentals over rather long time periods. Their inspiration came from the observation of successful investors such as George Soros, who reveal that they often exploit naive investors following positive feedback strategies or momentum investment strategies. Positive feedback investors are those who buy securities when prices rise and sell when prices fall. In the words of Jegadeesh and Titman [55], positive feedback investors are buying winners and selling losers. In a description of his own investment strategy, Soros [101] stresses that the key to his success was not to counter the irrational wave of enthusiasm that appears in financial markets, but rather to ride this wave for a while and sell out much later. The model of De Long et al. [26, 27] assumes that when rational speculators receive good news and trade on this news, they recognize that the initial price increase will stimulate buying by noise traders who will follow positive feedback trading strategies with a delay. In anticipation of these purchases, rational speculators buy more today, and so drive prices up today higher than fundamental news warrants. Tomorrow, noise traders buy in response to the increase in today’s price and so keep prices above the fundamentals. The key point is that trading between rational arbitrageurs and positive feedback traders gives rise to bubble-like price patterns. In their model, rational speculators destabilize prices because their trading triggers positive feedback trading by other investors. Positive feedback trading reinforced by arbitrageurs’ jumping on the bandwagon leads to a positive autocorrelation of returns at short horizons. Eventually, selling out or going short by rational speculators will pull the prices back to the fundamentals, entailing a negative autocorrelation of returns at longer horizons. In summary, the model of De Long et al. [26, 27] suggests the coexistence of intermediate-horizon momentum and long-horizon reversals in stock returns.

Their work was followed by a number of behavioral models based on the idea that trend chasing by one class of agents produces momentum in stock prices [5, 22, 50]. The most influential empirical evidence on momentum strategies came from the work of Jegadeesh and Titman [55, 56], who established that stock returns exhibit momentum behavior at intermediate horizons. Strategies that buy stocks that have performed well in the past and sell stocks that have performed poorly in the past generate significant positive returns over 3- to 12-month holding periods. De Bondt and Thaler [24] documented long-term reversals in stock returns. Stocks that performed poorly in the past perform better over the next 3–5 years than stocks that performed well in the past. These findings present a serious challenge to the view that markets are semistrong-form efficient.

In practice, do investors engage in momentum trading? A growing number of empirical studies address momentum trading by investors, with somewhat conflicting results. Lakonishok et al. [66] analyzed the quarterly holdings of a sample of pension funds and found little evidence of momentum trading. Grinblatt et al. [45] examined the quarterly holdings of 274 mutual funds and found that 77% of the funds in their sample engaged in momentum trading [105]. Nofsinger and Sias [81] examined total institutional holdings of individual stocks and found evidence of intraperiod momentum trading. Using a different sample, Gompers and Metrick [41] investigated the relationship between institutional holdings and lagged returns and concluded that once they controlled for the firm size, there was no evidence of momentum trading. Griffin et al. [43] reported that, on a daily and intraday basis, institutional investors engaged in trend chasing in NASDAQ 100 stocks. Finally, Badrinath and Wahal [4] documented the equity trading practices of approximately 1200 institutions from the third quarter of 1987 through the third quarter of 1995. They decomposed trading by institutions into (i) the initiation of new positions (entry), (ii) the termination of previous positions (exit), and (iii) the adjustments to ongoing holdings. Institutions were found to act as momentum traders when they enter stocks but as contrarian traders when they exit or make adjustments to ongoing holdings. Badrinath and Wahal [4] found significant differences in trading practices among different types of institutions. These studies are limited in their ability to capture the full range of trading
practices, in part because they focus almost exclusively on the behavior of institutional investors. In summary, many experimental studies and surveys suggest that positive feedback trading exists in greater or lesser degrees.

Synchronization Failures among Rational Traders

Abreu and Brunnermeier [1] propose a completely different mechanism justifying why rational traders ride rather than arbitrage bubbles. They consider a market where arbitrageurs face synchronization risk and, as a consequence, delay usage of arbitrage opportunities. Rational arbitrageurs are supposed to know that the market will eventually collapse. They know that the bubble will burst as soon as a sufficient number of (rational) traders sell out. However, the dispersion of rational arbitrageurs’ opinions on market timing and the consequent uncertainty on the synchronization of their sell-off are delaying this collapse, allowing the bubble to grow. In this framework, bubbles persist in the short and intermediate term because short sellers face synchronization risk, that is, uncertainty regarding the timing of the correction. As a result, arbitrageurs who conclude that other arbitrageurs are yet unlikely to trade against the bubble find it optimal to ride the still growing bubble for a while.

Like other institutional investors, hedge funds with large holdings in US equities have to report their quarterly equity positions to the SEC on Form 13F. Brunnermeier and Nagel [15] extracted hedge-fund holdings from these data, including those of well-known managers such as Soros, Tiger, Tudor, and others in the period from 1998 to 2000. They found that, over the sample period 1998–2000, hedge-fund portfolios were heavily tilted toward highly priced technology stocks. The proportion of their overall stock holdings devoted to this segment was higher than the corresponding weight of technology stocks in the market portfolio. In addition, the hedge funds in their sample skillfully anticipated price peaks of individual technology stocks. On a stock-by-stock basis, hedge funds started cutting back their holdings before prices collapsed, switching to technology stocks that still experienced rising prices. As a result, hedge-fund managers captured the upturn, but avoided much of the downturn. This is reflected in the fact that hedge funds earned substantial excess returns in the technology segment of the NASDAQ.

Complex Systems Approach to Bubbles and Crashes

Bhattacharya and Yu [7] provide a summary of recent efforts to expand on the above concepts, in particular, to address the two main questions of (i) the cause(s) of bubbles and crashes and (ii) the possibility to diagnose them ex ante. Many financial economists recognize that positive feedbacks and, in particular, herding are the key factors for the growth of bubbles. Herding can result from a variety of mechanisms, such as anticipation by rational investors of noise traders’ strategies [26, 27], agency costs and monetary incentives given to competing fund managers [23] sometimes leading in the extreme to Ponzi schemes [28], rational imitation in the presence of uncertainty [88], and social imitation.

The Madoff Ponzi scheme is a significant recent illustration, revealed by the unfolding of the financial crisis that started in 2007 [97]. It is the world’s biggest fraud allegedly perpetrated by long-time investment adviser Bernard Madoff, arrested on December 11, 2008 and sentenced on June 29, 2009 to 150 years in prison, the maximum allowed. His fraud led to losses of 65 billion US dollars that caused reverberations around the world as the list of victims included many wealthy private investors, charities, hedge funds, and major banks in the United States, Europe, and Asia. The Madoff Ponzi scheme surfed on the general psychology, characterizing the first decade of the twenty-first century, of exorbitant unsustainable expected financial gains. It is a remarkable illustration of the problem of implementing sound risk management, due diligence processes, and of the capabilities of the SEC, the US markets watchdog, when markets are booming and there is a general sentiment of a new economy and new financial era, in which old rules are believed not to apply anymore [75]. Actually, the Madoff Ponzi scheme is only the largest of a surprising number of other Ponzi schemes revealed by the financial crisis in many different countries (see accounts from village.albourne.com).

Discussing social imitation is often considered off-stream among financial economists but warrants
some scrutiny, given its pervasive presence in human affairs. On the question of the ex ante detection of bubbles, Gurkaynak [46] summarizes the dismal state of the econometric approach, stating that the “econometric detection of asset price bubbles cannot be achieved with a satisfactory degree of certainty. For each paper that finds evidence of bubbles, there is another one that fits the data equally well without allowing for a bubble. We are still unable to distinguish bubbles from time-varying or regime-switching fundamentals, while many small sample econometrics problems of bubble tests remain unresolved.” The following discusses an arguably off-stream approach that, by using concepts and tools from the theory of complex systems and statistical physics, suggests that ex ante diagnostic and partial predictability might be possible [93].

Social Mimetism, Collective Phenomena, Bifurcations, and Phase Transitions

Market behavior is the aggregation of the individual behavior of the many investors participating in it. In an economy of traders with completely rational expectations and the same information sets, no bubbles are possible [104]. Rational bubbles can, however, occur in infinite-horizon models [9], with dynamics of growth and collapse driven by noise traders [57, 59]. However, the key issue is to understand by what detailed mechanism the aggregation of many individual behaviors can give rise to bubbles and crashes. Modeling social imitation and social interactions requires using approaches, little known to financial economists, that address the fundamental question of how global behaviors can emerge at the macroscopic level. This extends the representative agent approach, but it also goes well beyond the introduction of heterogeneous agents. A key insight from statistical physics and complex systems theory is that systems with a large number of interacting agents, open to their environment, self-organize their internal structure and their dynamics with novel and sometimes surprising “emergent” out-of-equilibrium properties. A central property of a complex system is the possible occurrence and coexistence of many large-scale collective behaviors with a very rich structure, resulting from the repeated nonlinear interactions among its constituents.

How can this help address the question of what is/are the cause(s) of bubbles and crashes? The crucial insight is that a system, made of competing investors subjected to a myriad of influences, both exogenous news and endogenous interactions and reflexivity, can develop into endogenously self-organized self-reinforcing regimes, which would qualify as bubbles, and that crashes occur as a global self-organized transition. Mathematicians refer to this behavior as a bifurcation or more specifically as a catastrophe [103]. Physicists call these phenomena phase transitions [102]. The implication of modeling a market crash as a bifurcation is to solve the question of what makes a crash: in the framework of bifurcation theory (or phase transitions), sudden shifts in behavior arise from small changes in circumstances, with qualitative changes in the nature of the solutions that can occur abruptly when the parameters change smoothly. A minor change of circumstances, of interaction strength, or heterogeneity may lead to a sudden and dramatic change, such as during an earthquake or a financial crash.

Most approaches for explaining crashes search for possible mechanisms or effects that operate at very short timescales (hours, days, or weeks at most). According to the “bifurcation” approach, the underlying cause of the crash should be found in the preceding months and years, in the progressively increasing buildup of market cooperativity, or effective interactions between investors, often translated into accelerating ascent of the market price (the bubble). According to this “critical” point of view, the specific manner in which prices collapsed is not the most important problem: a crash occurs because the market has entered an unstable phase and any small disturbance or process may reveal the existence of the instability.

Ising Models of Social Imitation and Phase Transitions

Perhaps the simplest and historically most important model describing how the aggregation of many individual behaviors can give rise to macroscopic out-of-equilibrium dynamics such as bubbles, with bifurcations in the organization of social systems due to slight changes in the interactions, is the Ising model [16, 80]. In particular, Orléan [85, 86] captured the paradox of combining rational and imitative behavior under the name mimetic rationality, by developing
models of mimetic contagion of investors in the stock markets, which are based on irreversible processes of opinion forming. Roehner and Sornette [88], among others, showed that the dynamical updating rules of the Ising model are obtained in a natural way as the optimal strategy of rational traders with limited information who have the possibility to make up for their lack of information via information exchange with other agents within their social network. The Ising model is one of the simplest models describing the competition between the ordering force of imitation or contagion and the disordering impact of private information or idiosyncratic noise (see [77] for a technical review).

Starting with a framework suggested by Blume [10, 11], Brock [12], and Durlauf [30–33], Phan et al. [87] summarize the formalism, starting with different implementations of the agents’ decision processes, whose aggregation is inspired by statistical mechanics to account for social influence in individual decisions. Lux and Marchesi [72], Brock and Hommes [13], Kaizoji [60], and Kirman and Teyssiere [63] also developed related models in which agents’ successful forecasts reinforce the forecasts. Such models have been found to generate swings in opinions, regime changes, and long memory. An essential feature of these models is that agents are wrong some of the time, but whenever they are in the majority they are essentially right. Thus, they are not systematically irrational [62]. Sornette and Zhou [99] show how Bayesian learning added to the Ising model framework reproduces the stylized facts of financial markets. Harras and Sornette [47] show how overlearning from lucky runs of random news in the presence of social imitation may lead to endogenous bubbles and crashes.

These models allow one to combine the questions on the cause of both bubbles and crashes, as resulting from the collective emergence of herding via self-reinforcing imitation and social interactions, which are then susceptible to phase transitions or bifurcations occurring under minor changes in the control parameters. Hence, the difficulty in answering the question of “what causes a bubble and a crash” may, in this context, be attributed to this distinctive attribute of a dynamical out-of-equilibrium system to exhibit bifurcation behavior in its dynamics. This line of thought has been pursued by Sornette and his coauthors to propose a novel operational diagnostic of bubbles.

V-3 Bubble as Superexponential Price Growth, Diagnostic, and Prediction

Bubbles are often defined as exponentially explosive prices, which are followed by a sudden collapse. As summarized, for instance, by Gurkaynak [46], the problem with this definition is that any exponentially growing price regime (which one would call a bubble) can also be rationalized by a fundamental valuation model. This is related to the problem that the fundamental price is not directly observable, giving no strong anchor to understand observed prices. This was exemplified during the last Internet bubble by fundamental pricing models, which incorporated real options in the fundamental valuation, justifying basically any price. Mauboussin and Hiler [76] were among the most vocal proponents of the proposition, offered close to the peak of the Internet bubble that culminated in 2000, that better business models, the network effect, first-to-scale advantages, and real options effect could account rationally for the high prices of dot-com and other New Economy companies. These interesting views expounded in early 1999 were in synchrony with the bull market of 1999 and preceding years. They participated in the general optimistic view and added to the strength of the herd. Later, after the collapse of the bubble, these explanations seemed less attractive. This did not escape the US Federal Reserve chairman Greenspan [42], who said: “Is it possible that there is something fundamentally new about this current period that would warrant such complacency? Yes, it is possible. Markets may have become more efficient, competition is more global, and information technology has doubtless enhanced the stability of business operations. But, regrettably, history is strewn with visions of such new eras that, in the end, have proven to be a mirage. In short, history counsels caution.” In this vein, the buzzword “new economy” so much used in the late 1990s was also in use in the 1960s during the “tronic boom”, also followed by a market crash, and during the bubble of the late 1920s before the October 1929 crash. In the latter case, the “new” economy was referring to firms in the utility sector. It is remarkable how traders do not learn the lessons of their predecessors.

A better model derives from the mechanism of positive feedbacks discussed above, which generically gives rise to faster-than-exponential growth of
price (termed superexponential) [95, 96]. An exponentially growing price is characterized by a constant expected growth rate. The geometric random walk is the standard stochastic price model embodying this class of behaviors. A superexponentially growing price is such that the growth rate itself grows as a result of positive feedbacks of price, momentum, and other characteristics on the growth rate [95]. As a consequence of the acceleration, the mathematical models generalizing the geometric random walk exhibit so-called finite-time singularities. In other words, the resulting processes are not defined for all times: the dynamics has to end after a finite life and to transform into something else. This captures well the transient nature of bubbles, and the fact that the crashes ending the bubbles are often the antechambers to different market regimes.

Such an approach may be thought of, at first sight, to be inadequate or too naive to capture the intrinsic stochastic nature of financial prices, whose null hypothesis is the geometric random walk model [74]. However, it is possible to generalize this simple deterministic model to incorporate nonlinear positive feedback on the stochastic Black–Scholes model, leading to the concept of stochastic finite-time singularities [3, 36, 37, 51, 95]. Much work still needs to be done on this theoretical aspect.

In a series of empirical papers, Sornette and his collaborators have used this concept to empirically test for bubbles and prognosticate their demise, often in the form of crashes. Johansen and Sornette [58] provide perhaps the most inclusive series of tests of this approach. First, they identify the most extreme cumulative losses (drawdowns) in a variety of asset classes, markets, and epochs, and show that they belong to a probability density distribution, which is distinct from the distribution of 99% of the smaller drawdowns (the more “normal” market regime). These drawdowns can thus be called outliers or kings [94]. Second, they show that, for two-thirds of these extreme drawdowns, the market prices followed a superexponential behavior before their occurrences, as characterized by the calibration of the power law with a finite-time singularity.

This provides a systematic approach to diagnose bubbles ex ante, as shown in a series of real-life tests [98, 100, 108–111]. Although this approach has enjoyed a large visibility in the professional financial community around the world (banks, mutual funds, hedge funds, investment houses, etc.), it has not yet received the attention from the academic financial community that it perhaps deserves given the stakes. This is probably due to several factors, which include the following: (i) the origin of the hypothesis coming from analogies with complex critical systems in physics and the theory of complex systems, which constitutes a well-known obstacle to climb the ivory towers of standard financial economics; (ii) the nonstandard (from an econometric viewpoint) formulation of the statistical tests performed until present (in this respect, see the attempts in terms of a Bayesian analysis of log-periodic power law (LPPL) precursors [17] to focus on the time series of returns instead of prices, and of regime-switching model of LPPL [18]); (iii) the nonstandard expression of some of the mathematical models underpinning the hypothesis; and (iv) perhaps an implicit general belief in academia that forecasting financial instabilities is inherently impossible. Lin et al. [68] have recently addressed problem (ii) by combining a mean-reverting volatility process and a stochastic conditional return, which reflects nonlinear positive feedbacks and continuous updates of the investors’ beliefs and sentiments. When tested on the S&P500 US index from January 3, 1950 to November 21, 2008, the model correctly identifies the bubbles that ended in October 1987, in October 1997, in August 1998, and the information and communication technologies (ICT) bubble that ended in the first quarter of 2000. Using Bayesian inference, Lin et al. [68] find a very strong statistical preference for their model compared with a standard benchmark, in contradiction with Chang and Feigenbaum [17], who used a unit-root model for residuals.

V-4 Bubbles and the Great Financial Crisis of 2007

It is appropriate to end this article with some comments on the relationship between the momentous financial crisis and bubbles. The financial crisis, which started with an initially well-defined epicenter focused on mortgage-backed securities (MBS), has been cascading into a global economic recession, whose increasing severity and uncertain duration are continuing to lead to massive losses and damage for billions of people. At the time of writing (July 2009), the world still suffers from a major financial crisis that has transformed into the worst economic recession since the Great Depression, perhaps on its way
to surpass it. Heavy central bank interventions and government spending programs have been launched worldwide and especially in the United States and Europe, with the hope to unfreeze credit and bolster consumption.

The current financial crisis is a perfect illustration of the major role played by financial bubbles. We refer to the analysis, figures, and references in [97], which articulate a general framework, suggesting that the fundamental cause of the unfolding financial and economic crisis is the accumulation of five bubbles:

1. the “new economy” ICT bubble that started in the mid-1990s and ended with the crash of 2000;
2. the real-estate bubble launched in large part by easy access to a large amount of liquidity as a result of the active monetary policy of the US Federal Reserve lowering the fed rate from 6.5% in 2000 to 1% in 2003 and 2004 in a successful attempt to alleviate the consequences of the 2000 crash;
3. the innovations in financial engineering with the collateralized debt obligations (CDOs) and other derivatives of debts and loan instruments issued by banks and eagerly bought by the market, accompanying and fueling the real-estate bubble;
4. the commodity bubble(s) on food, metals, and energy; and
5. the stock market bubble that peaked in October 2007.

These bubbles, by their interplay and mutual reinforcement, have led to the illusion of a “perpetual money machine”, allowing financial institutions to extract wealth from an unsustainable artificial process. This realization calls into question the soundness of many of the interventions to address the recent liquidity crisis that tend to encourage more consumption.

References

[1] Abreu, D. & Brunnermeier, M.K. (2003). Bubbles and crashes, Econometrica 71, 173–204.
[2] Almazan, A., Brown, K.C., Carlson, M. & Chapman, D.A. (2004). Why constrain your mutual fund manager? Journal of Financial Economics 73, 289–321.
[3] Andersen, J.V. & Sornette, D. (2004). Fearless versus fearful speculative financial bubbles, Physica A 337(3–4), 565–585.
[4] Badrinath, S.G. & Wahal, S. (2002). Momentum trading by institutions, Journal of Finance 57(6), 2449–2478.
[5] Barberis, N., Shleifer, A. & Vishny, R. (1998). A model of investor sentiment, Journal of Financial Economics 49, 307–343.
[6] Battalio, R. & Schultz, P. (2006). Option and the bubble, Journal of Finance 61(5), 2071–2102.
[7] Bhattacharya, U. & Yu, X. (2008). The causes and consequences of recent financial market bubbles: an introduction, Review of Financial Studies 21(1), 3–10.
[8] Black, F. (1986). Noise, The Journal of Finance 41(3), 529–543. Papers and Proceedings of the Forty-Fourth Annual Meeting of the American Finance Association, New York, NY, December 28–30, 1985.
[9] Blanchard, O.J. & Watson, M.W. (1982). Bubbles, rational expectations and speculative markets, in Crisis in Economic and Financial Structure: Bubbles, Bursts, and Shocks, P. Wachtel, ed., Lexington Books, Lexington.
[10] Blume, L.E. (1993). The statistical mechanics of strategic interaction, Games and Economic Behavior 5, 387–424.
[11] Blume, L.E. (1995). The statistical mechanics of best-response strategy revisions, Games and Economic Behavior 11, 111–145.
[12] Brock, W.A. (1993). Pathways to randomness in the economy: emergent nonlinearity and chaos in economics and finance, Estudios Económicos 8, 3–55.
[13] Brock, W.A. & Hommes, C.H. (1999). Rational animal spirits, in The Theory of Markets, P.J.J. Herings, G. van der Laan & A.J.J. Talman, eds, North-Holland, Amsterdam, pp. 109–137.
[14] Brown, S.J., Goetzmann, W., Ibbotson, R.G. & Ross, S.A. (1992). Survivorship bias in performance studies, Review of Financial Studies 5(4), 553–580.
[15] Brunnermeier, M.K. & Nagel, S. (2004). Hedge funds and the technology bubble, Journal of Finance 59(5), 2013–2040.
[16] Callen, E. & Shapero, D. (1974). A theory of social imitation, Physics Today July, 23–28.
[17] Chang, G. & Feigenbaum, J. (2006). A Bayesian analysis of log-periodic precursors to financial crashes, Quantitative Finance 6(1), 15–36.
[18] Chang, G. & Feigenbaum, J. (2007). Detecting log-periodicity in a regime-switching model of stock returns, Quantitative Finance 8, 723–738.
[19] Chen, J., Hong, H. & Stein, J. (2002). Breadth of ownership and stock returns, Journal of Financial Economics 66, 171–205.
[20] Cochrane, J.H. (2003). Stocks as money: convenience yield and the tech-stock bubble, in Asset Price Bubbles, W.C. Hunter, G.G. Kaufman & M. Pomerleano, eds, MIT Press, Cambridge.
[21] Cox, A.M.G. & Hobson, D.G. (2005). Local martingales, bubbles and option prices, Finance and Stochastics 9(4), 477–492.
[22] Daniel, K., Hirshleifer, D. & Subrahmanyam, A. (1998). Investor psychology and security market under- and overreactions, The Journal of Finance 53(6), 1839–1885.
[23] Dass, N., Massa, M. & Patgiri, R. (2008). Mutual funds and bubbles: the surprising role of contracted incentives, Review of Financial Studies 21(1), 51–99.
[24] De Bondt, W.F.M. & Thaler, R.H. (1985). Does the stock market overreact? Journal of Finance 40, 793–805.
[25] De Long, J.B. & Shleifer, A. (1991). The stock market bubble of 1929: evidence from closed-end mutual funds, The Journal of Economic History 51(3), 675–700.
[26] De Long, J.B., Shleifer, A., Summers, L.H. & Waldmann, R.J. (1990a). Positive feedback investment strategies and destabilizing rational speculation, The Journal of Finance 45(2), 379–395.
[27] De Long, J.B., Shleifer, A., Summers, L.H. & Waldmann, R.J. (1990b). Noise trader risk in financial markets, The Journal of Political Economy 98(4), 703–738.
[28] Dimitriadi, G.G. (2004). What are “Financial Bubbles”: approaches and definitions, Electronic journal “INVESTIGATED in RUSSIA” http://zhurnal.ape.relarn.ru/articles/2004/245e.pdf
[29] Duffie, D., Garleanu, N. & Pedersen, L.H. (2002). Security lending, shorting and pricing, Journal of Financial Economics 66, 307–339.
[30] Durlauf, S.N. (1991). Multiple equilibria and persistence in aggregate fluctuations, American Economic Review 81, 70–74.
[31] Durlauf, S.N. (1993). Nonergodic economic growth, Review of Economic Studies 60(203), 349–366.
[32] Durlauf, S.N. (1997). Statistical mechanics approaches to socioeconomic behavior, in The Economy as an Evolving Complex System II, Santa Fe Institute Studies in the Sciences of Complexity, B. Arthur, S. Durlauf & D. Lane, eds, Addison-Wesley, Reading, MA, Vol. XXVII.
[33] Durlauf, S.N. (1999). How can statistical mechanics contribute to social science? Proceedings of the National Academy of Sciences of the USA 96, 10582–10584.
[34] Fama, E.F. (1965). The behavior of stock-market prices, Journal of Business 38(1), 34–105.
[35] Fisher, I. (1930). The Stock Market Crash-and After, Macmillan, New York.
[36] Fogedby, H.C. (2003). Damped finite-time-singularity driven by noise, Physical Review E 68, 051105.
[37] Fogedby, H.C. & Poukaradze, V. (2002). Power laws and stretched exponentials in a noisy finite-time-singularity model, Physical Review E 66, 021103.
[38] French, K.R. & Poterba, J.M. (1991). Were Japanese stock prices too high? Journal of Financial Economics 29(2), 337–363.
[39] Friedman, M. & Schwartz, A.J. (1963). A Monetary History of the United States, 1867–1960, Princeton University Press, Princeton.
[40] Galbraith, J.K. (1954/1988). The Great Crash 1929, Houghton Mifflin Company, Boston.
[41] Gompers, P.A. & Metrick, A. (2001). Institutional investors and equity prices, Quarterly Journal of Economics 116, 229–259.
[42] Greenspan, A. (1997). Federal Reserve’s Semiannual Monetary Policy Report, before the Committee on Banking, Housing, and Urban Affairs, U.S. Senate, February 26.
[43] Griffin, J.M., Harris, J. & Topaloglu, S. (2003). The dynamics of institutional and individual trading, Journal of Finance 58, 2285–2320.
[44] Grinblatt, M. & Titman, S. (1992). The persistence of mutual fund performance, Journal of Finance 47, 1977–1984.
[45] Grinblatt, M., Titman, S. & Wermers, R. (1995). Momentum investment strategies, portfolio performance and herding: a study of mutual fund behavior, The American Economic Review 85(5), 1088–1105.
[46] Gurkaynak, R.S. (2008). Econometric tests of asset price bubbles: taking stock, Journal of Economic Surveys 22(1), 166–186.
[47] Harras, G. & Sornette, D. (2008). Endogenous versus Exogenous Origins of Financial Rallies and Crashes in an Agent-based Model with Bayesian Learning and Imitation, ETH Zurich preprint (http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1156348)
[48] Harrison, M. & Kreps, D. (1978). Speculative investor behavior in a stock market with heterogeneous expectations, Quarterly Journal of Economics 92, 323–336.
[49] Hong, H., Scheinkman, J. & Xiong, W. (2006). Asset float and speculative bubbles, Journal of Finance 59(3), 1073–1117.
[50] Hong, H. & Stein, J.C. (2003). Differences of opinion, short-sales constraints, and market crashes, The Review of Financial Studies 16(2), 487–525.
[51] Ide, K. & Sornette, D. (2002). Oscillatory finite-time singularities in finance, population and rupture, Physica A 307(1–2), 63–106.
[52] Jarrow, R. (1980). Heterogeneous expectations, restrictions on short sales, and equilibrium asset prices, Journal of Finance 35, 1105–1113.
[53] Jarrow, R., Protter, P. & Shimbo, K. (2007). Asset price bubbles in a complete market, in Advances in Mathematical Finance (Festschrift in honor of Dilip Madan’s 60th birthday), M.C. Fu, R.A. Jarrow, J.-Y. Yen & R.J. Elliott, eds, Birkhäuser, pp. 97–122.
[54] Jarrow, R., Protter, P. & Shimbo, K. (2008). Asset price bubbles in incomplete markets, Mathematical Finance, to appear.
[55] Jegadeesh, N. & Titman, S. (1993). Returns to buying winners and selling losers: implications for stock market efficiency, Journal of Finance 48, 65–91.
[56] Jegadeesh, N. & Titman, S. (2001). Profitability of momentum strategies: an evaluation of alternative explanations, Journal of Finance 56(2), 699–720.
[57] Johansen, A., Ledoit, O. & Sornette, D. (2000). Crashes as critical points, International Journal of Theoretical and Applied Finance 3(2), 219–255.
[58] Johansen, A. & Sornette, D. (2004). Endogenous versus Exogenous Crashes in Financial Markets, preprint at http://papers.ssrn.com/paper.taf?abstract_id=344980, published as “Shocks, Crashes and Bubbles in Financial Markets,” Brussels Economic Review (Cahiers économiques de Bruxelles) 49(3/4), Special Issue on Nonlinear Analysis (2006) (http://ideas.repec.org/s/bxr/bxrceb.html)
[59] Johansen, A., Sornette, D. & Ledoit, O. (1999). Predicting financial crashes using discrete scale invariance, Journal of Risk 1(4), 5–32.
[60] Kaizoji, T. (2000). Speculative bubbles and crashes in stock markets: an interacting agent model of speculative activity, Physica A 287(3–4), 493–506.
[61] Kindleberger, C.P. (1978). Manias, Panics and Crashes: A History of Financial Crises, Basic Books, New York.
[62] Kirman, A.P. (1997). Interaction and Markets, G.R.E.Q.A.M. 97a02, Université Aix-Marseille III.
[63] Kirman, A.P. & Teyssiere, G. (2002). Micro-economic models for long memory in the volatility of financial time series, in The Theory of Markets, P.J.J. Herings, G. van der Laan & A.J.J. Talman, eds, North-Holland, Amsterdam, pp. 109–137.
[64] Koski, J.L. & Pontiff, J. (1999). How are derivatives used? Evidence from the mutual fund industry, Journal of Finance 54(2), 791–816.
[65] Kyle, A.S. (1985). Continuous auctions and insider trading, Econometrica 53, 1315–1335.
[66] Lakonishok, J., Shleifer, A. & Vishny, R.W. (1992). The impact of institutional trading on stock prices, Journal of Financial Economics 32, 23–43.
[67] Lamont, O.A. & Thaler, R.H. (2003). Can the market add and subtract? Mispricing in tech stock carve-outs, Journal of Political Economy 111(2), 227–268. University of Chicago Press.
[68] Lin, L., Ren, R.E. & Sornette, D. (2009). A Consistent Model of ‘Explosive’ Financial Bubbles With Mean-Reversing Residuals, preprint at http://papers.ssrn.com/abstract=1407574
[69] Lintner, J. (1969). The aggregation of investors’ diverse judgments and preferences in purely competitive security markets, Journal of Financial and Quantitative Analysis 4, 347–400.
[70] Loewenstein, M. & Willard, G.A. (2000a). Rational equilibrium asset-pricing bubbles in continuous trading models, Journal of Economic Theory 91(1), 17–58.
[71] Loewenstein, M. & Willard, G.A. (2000b). Local martingales, arbitrage and viability: free snacks and cheap thrills, Economic Theory 16, 135–161.
[72] Lux, T. & Marchesi, M. (1999). Scaling and criticality in a stochastic multi-agent model of a financial market, Nature 397, 498–500.
[73] Lux, T. & Sornette, D. (2002). On rational bubbles and fat tails, Journal of Money, Credit and Banking Part 1 34(3), 589–610.
[74] Malkiel, B.G. (2007). A Random Walk Down Wall Street: The Time-Tested Strategy for Successful Investing, W.W. Norton & Co., revised and updated edition (December 17, 2007).
[75] Markopolos, H. (2009). Testimony of Harry Markopolos, CFA, CFE Chartered Financial Analyst, Certified Fraud Examiner, before the U.S. House of Representatives, Committee on Financial Services. Wednesday, February 4, 2009, 9:30am, McCarter & English LLP, Boston.
[76] Mauboussin, M.J. & Hiler, B. (1999). Rational Exuberance? Equity Research, Credit Suisse First Boston, pp. 1–6. January 26, 1999.
[77] McCoy, B.M. & Wu, T.T. (1973). The Two-Dimensional Ising Model, Harvard University, Cambridge, MA.
[78] Miller, E. (1977). Risk, uncertainty and divergence of opinion, Journal of Finance 32, 1151–1168.
[79] Miller, M.H. & Modigliani, F. (1961). Dividend policy, growth, and the valuation of shares, Journal of Business 34(4), 411–433.
[80] Montroll, E.W. & Badger, W.W. (1974). Introduction to Quantitative Aspects of Social Phenomena, Gordon and Breach, New York.
[81] Nofsinger, J.R. & Sias, R.W. (1999). Herding and feedback trading by institutional and individual investors, Journal of Finance 54, 2263–2295.
[82] Ofek, E. & Richardson, M. (2002). The valuation and market rationality of internet stock prices, Oxford Review of Economic Policy 18(3), 265–287.
[83] Ofek, E. & Richardson, M. (2003). DotCom mania: the rise and fall of internet stock prices, The Journal of Finance 58(3), 1113–1137.
[84] Ofek, E., Richardson, M. & Whitelaw, R.F. (2004). Limited arbitrage and short sale constraints: evidence from the options market, Journal of Financial Economics 74(2), 305–342.
[85] Orléan, A. (1989). Mimetic contagion and speculative bubbles, Theory and Decision 27, 63–92.
[86] Orléan, A. (1995). Bayesian interactions and collective dynamics of opinion – herd behavior and mimetic contagion, Journal of Economic Behavior and Organization 28, 257–274.
[87] Phan, D., Gordon, M.B. & Nadal, J.-P. (2004). Social interactions in economic theory: an insight from statistical mechanics, in Cognitive Economics – An Interdisciplinary Approach, P. Bourgine & J.-P. Nadal, eds, Springer, Berlin.
[88] Roehner, B.M. & Sornette, D. (2000). Thermometers of speculative frenzy, European Physical Journal B 16, 729–739.
[89] Scheinkman, J. & Xiong, W. (2003). Overconfidence and speculative bubbles, Journal of Political Economy 111, 1183–1219.
[90] Schultz, P. (2008). Downward-sloping demand curves, the supply of shares, and the collapse of internet stock prices, Journal of Finance 63, 351–378.
[91] Shiller, R. (2000). Irrational Exuberance, Princeton University Press, Princeton, NJ.
[92] Shleifer, A. & Vishny, R. (1997). Limits of arbitrage, Journal of Finance 52, 35–55.
[93] Sornette, D. (2003). Why Stock Markets Crash (Critical Events in Complex Financial Systems), Princeton University Press, Princeton, NJ.
[94] Sornette, D. (2009). Dragon-Kings, Black Swans and the Prediction of Crises, in press in the International Journal of Terraspace Science and Engineering (http://ssrn.com/abstract=1470006).
[95] Sornette, D. & Andersen, J.V. (2002). A nonlinear super-exponential rational model of speculative financial bubbles, International Journal of Modern Physics C 13(2), 171–188.
[96] Sornette, D., Takayasu, H. & Zhou, W.-X. (2003). Finite-time singularity signature of hyperinflation, Physica A: Statistical Mechanics and Its Applications 325, 492–506.
[97] Sornette, D. & Woodard, R. (2009). Financial bubbles, real estate bubbles, derivative bubbles, and the financial and economic crisis, to appear in the Proceedings of APFA7 (Applications of Physics in Financial Analysis), in New Approaches to the Analysis of Large-Scale Business and Economic Data, M. Takayasu, T. Watanabe & H. Takayasu, eds, Springer (2010) (e-print at http://arxiv.org/abs/0905.0220)
[98] Sornette, D., Woodard, R. & Zhou, W.-X. (2008). The 2006–2008 Oil Bubble and Beyond, ETH Zurich preprint (http://arXiv.org/abs/0806.1170)
[99] Sornette, D. & Zhou, W.-X. (2006a). Importance of positive feedbacks and over-confidence in a self-fulfilling Ising model of financial markets, Physica A: Statistical Mechanics and Its Applications 370(2), 704–726.
[100] Sornette, D. & Zhou, W.-X. (2006b). Predictability of large future changes in major financial indices, International Journal of Forecasting 22, 153–168.
[101] Soros, G. (1987). The Alchemy of Finance: Reading the Mind of the Market, Wiley, Chichester.
[102] Stanley, H.E. (1987). Introduction to Phase Transitions and Critical Phenomena, Oxford University Press, USA.
[103] Thom, R. (1989). Structural Stability and Morphogenesis: An Outline of a General Theory of Models, Addison-Wesley, Reading, MA.
[104] Tirole, J. (1982). On the possibility of speculation under rational expectations, Econometrica 50, 1163–1182.
[105] Wermers, R. (1999). Mutual fund herding and the impact on stock prices, Journal of Finance 54(2), 581–622.
[106] West, K.D. (1988). Bubbles, fads and stock price volatility tests: a partial evaluation, Journal of Finance 43(3), 639–656.
[107] White, E.N. (2006). Bubbles and Busts: The 1990s in the Mirror of the 1920s, NBER Working Paper No. 12138.
[108] Zhou, W.-X. & Sornette, D. (2003). 2000–2003 real estate bubble in the UK but not in the USA, Physica A 329, 249–263.
[109] Zhou, W.-X. & Sornette, D. (2006). Is there a real-estate bubble in the US? Physica A 361, 297–308.
[110] Zhou, W.-X. & Sornette, D. (2007). A Case Study of Speculative Financial Bubbles in the South African Stock Market 2003–2006, ETH Zurich preprint (http://arxiv.org/abs/physics/0701171)
[111] Zhou, W.-X. & Sornette, D. (2008). Analysis of the real estate market in Las Vegas: bubble, seasonal patterns, and prediction of the CSW indexes, Physica A 387, 243–260.

Further Reading

Abreu, D. & Brunnermeier, M.K. (2002). Synchronization risk and delayed arbitrage, Journal of Financial Economics 66, 341–360.
Farmer, J.D. (2002). Market force, ecology and evolution, Industrial and Corporate Change 11(5), 895–953.
Narasimhan, J. & Titman, S. (1993). Returns to buying winners and selling losers: implications for stock market efficiency, The Journal of Finance 48(1), 65–91.
Narasimhan, J. & Titman, S. (2001). Profitability of momentum strategies: an evaluation of alternative explanations, The Journal of Finance 56(2), 699–720.
Shleifer, A. & Summers, L.H. (1990). The noise trader approach to finance, The Journal of Economic Perspectives 4(2), 19–33.

TAISEI KAIZOJI & DIDIER SORNETTE
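
As a numerical aside to the article's discussion of superexponential growth and finite-time singularities, the following minimal sketch contrasts a constant growth rate with a growth rate that rises with the price. The feedback law dp/dt = r p^(1 + delta), the parameter values, and all function names are illustrative assumptions chosen for simplicity; they are not the calibrated models of [95, 96]. Under this assumed law the closed-form solution is p(t) = p0 (1 - t/t_c)^(-1/delta) with t_c = 1 / (delta r p0^delta), so the price diverges at the finite time t_c, whereas letting delta tend to 0 recovers ordinary exponential growth.

```python
import math


def exponential_price(p0, r, t):
    """Price under a constant growth rate r: p(t) = p0 * exp(r * t)."""
    return p0 * math.exp(r * t)


def critical_time(p0, r, delta):
    """Finite time t_c at which the solution of dp/dt = r * p**(1 + delta) diverges."""
    return 1.0 / (delta * r * p0 ** delta)


def superexponential_price(p0, r, delta, t):
    """Closed-form solution of dp/dt = r * p**(1 + delta), defined only for t < t_c."""
    tc = critical_time(p0, r, delta)
    if t >= tc:
        raise ValueError("solution is not defined at or beyond the critical time")
    return p0 * (1.0 - t / tc) ** (-1.0 / delta)


def growth_rate(p, r, delta):
    """Instantaneous growth rate (1/p) dp/dt = r * p**delta; it rises with the price."""
    return r * p ** delta


if __name__ == "__main__":
    p0, r, delta = 1.0, 0.1, 0.5  # illustrative values only (assumed, not fitted)
    tc = critical_time(p0, r, delta)
    print(f"critical time t_c = {tc:.1f}")
    for frac in (0.0, 0.5, 0.9, 0.99):
        t = frac * tc
        p_exp = exponential_price(p0, r, t)
        p_sup = superexponential_price(p0, r, delta, t)
        print(
            f"t = {t:5.2f}  exponential = {p_exp:8.3f}  "
            f"superexponential = {p_sup:10.3f}  "
            f"growth rate = {growth_rate(p_sup, r, delta):.3f}"
        )
```

In this toy setting the growth rate of the superexponential path keeps increasing as the critical time approaches, while the exponential path keeps the constant rate r throughout; the refusal to evaluate the solution at or beyond t_c mirrors the statement that such processes are not defined for all times.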
Ross, Stephen

The central focus of the work of Ross (1944–) has been to tease out the consequences of the assumption that all riskless arbitrage opportunities have already been exploited and none remain. The empirical relevance of the no arbitrage assumption is especially high in the area of financial markets for two simple reasons: there are many actors actively searching for arbitrage opportunities, and the exploitation of such opportunities is relatively costless. For finance, therefore, the principle of no arbitrage is not merely a convenient assumption that makes it possible to derive clean theoretical results but even more an idealization of observable empirical reality, and a characterization of the deep and simple structure underlying multifarious surface phenomena. For one whose habits of mind were initially shaped by the methods of natural science, specifically physics as taught by Richard Feynman (B.S. California Institute of Technology, 1965), finance seemed to be an area of economics where a truly scientific approach was possible.

It was exposure to the Black–Scholes option pricing theory, when Ross was starting his career as an assistant professor at the University of Pennsylvania, that first sparked his interest in the line of research that would occupy him for the rest of his life. If the apparently simple and eminently plausible assumption of no arbitrage could crack the problem of option pricing, perhaps it could crack other problems in finance as well. In short order, Ross produced what he later called the fundamental theorem of asset pricing [7, p. 101], which linked the absence of arbitrage with the existence of a positive linear pricing rule [12, 15] (see Fundamental Theorem of Asset Pricing).

Perhaps the most important practical implication of this theorem is that it is possible to price assets that are not yet traded simply by reference to the price of assets that are already traded, and to do so without the need to invoke any particular theory of asset pricing. This opened the possibility of creating new assets, such as options, that would in practical terms “complete” markets, and so help move the economy closer to the ideal efficient frontier characterized by Kenneth Arrow (see Arrow, Kenneth) as a complete set of markets for state-contingent securities [11]. Here, in the abstract, is arguably the vision that underlies the entire field of financial engineering.

The general existence of a linear pricing rule has further implications that Ross would later group together in what he called the pricing rule representation theorem [7, p. 104]. Most important for practical purposes is the existence of positive risk-neutral probabilities and an associated riskless rate of interest, a feature first noted in [4, 5]. It is this general feature that makes it possible to model option prices by treating the underlying stock price as a binomial random variable in discrete time, as first introduced by Cox et al. [6] in an approach that is now ubiquitous in industry practice. It is this same general feature that makes it possible to characterize asset prices generally as following a martingale under the equivalent martingale measure [9], a characterization that is also now routine in financial engineering practice.

What is most remarkable about these consequences of the no arbitrage point of view is how little economics has to do with it. Ross, a trained economist (Harvard, PhD, 1969), might well have built a rather different career, perhaps in the area of agency theory where he made one of the early seminal contributions [10], but once he found finance he never looked back. (His subsequent involvement in agency theory largely focused on financial intermediation in a world with no arbitrage, as in [14, 18].)

When Ross was starting his career, economists had already begun making inroads into finance, and one of the consequences was the Sharpe–Lintner capital asset pricing model (CAPM) (see Modern Portfolio Theory). Ross [16] reinterpreted the CAPM as a possible consequence of no arbitrage and then proposed his own arbitrage pricing theory [13] as a more general consequence that would be true whenever asset prices were generated by a linear factor model such as

$R_i = E_i + \beta_{ij} f_j + \varepsilon_i, \quad i = 1, \ldots, n$  (1)

where $E_i$ is the expected return on asset $i$, $f_j$ is an exogenous systematic factor, and $\varepsilon_i$ is the random noise.

In such a world, it follows from no arbitrage that the expected return on asset $i$, in excess of the risk-free rate of return $r$, is equal to a linear combination
[18] Ross, S.A. (2004). Markets for agents: fund management, in The Legacy of Fischer Black, B.N. Lehman, ed, Oxford University Press.

Further Reading

Ross, S.A. (1974). Portfolio Turnpike theorems for constant policies, Journal of Financial Economics 1, 171–198.
Ross, S.A. (1978a). Mutual fund separation in financial theory: the separating distributions, Journal of Economic Theory 17(2), 254–286.

Related Articles

Arbitrage: Historical Perspectives; Arbitrage Pricing Theory; Black, Fischer; Equivalent Martingale Measures; Martingale Representation Theorem; Option Pricing Theory: Historical Perspectives; Risk-neutral Pricing.

PERRY MEHRLING
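
As an illustration of the risk-neutral pricing idea described in the entry on Ross above (positive risk-neutral probabilities together with a riskless rate, applied to a stock treated as a binomial random variable), the following is a minimal one-period sketch in Python. The class and function names, and all numbers (stock price 100, up factor 1.2, down factor 0.9, riskless rate 5%, strike 100), are hypothetical choices made for this example and are not taken from the entry or from Cox et al. [6].

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class OnePeriodTree:
    """Hypothetical one-period binomial market; all values are illustrative."""
    s0: float    # current stock price
    up: float    # gross stock return in the up state
    down: float  # gross stock return in the down state
    r: float     # riskless rate of interest over the period


def risk_neutral_prob(tree: OnePeriodTree) -> float:
    """Up-state probability q under which the stock earns the riskless rate."""
    # No arbitrage requires down < 1 + r < up, which keeps q strictly in (0, 1).
    if not tree.down < 1.0 + tree.r < tree.up:
        raise ValueError("arbitrage opportunity: need down < 1 + r < up")
    return (1.0 + tree.r - tree.down) / (tree.up - tree.down)


def price_claim(tree: OnePeriodTree, payoff: Callable[[float], float]) -> float:
    """Discounted risk-neutral expectation of a payoff on the end-of-period price."""
    q = risk_neutral_prob(tree)
    s_up, s_down = tree.s0 * tree.up, tree.s0 * tree.down
    return (q * payoff(s_up) + (1.0 - q) * payoff(s_down)) / (1.0 + tree.r)


tree = OnePeriodTree(s0=100.0, up=1.2, down=0.9, r=0.05)
q = risk_neutral_prob(tree)
call = price_claim(tree, lambda s: max(s - 100.0, 0.0))  # call struck at 100
stock = price_claim(tree, lambda s: s)                    # the stock itself
print(f"q = {q:.4f}, call = {call:.4f}, stock = {stock:.4f}")
```

Pricing the stock with the same rule returns its current price, which is the linear pricing rule at work: the untraded call is valued purely by reference to the traded stock and the riskless asset, without any assumption about investors' actual probabilities or preferences.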
Fisher, Irving

The American economist Irving Fisher (born 1867, died 1947) advanced the use of formal mathematical and statistical techniques in economics and finance, both in his own pioneering research in monetary and capital theory and in his roles as a mentor to a handful of talented doctoral students and as founding president of the Econometric Society. As an undergraduate and a graduate student at Yale University, Fisher studied with the physicist J. Willard Gibbs and the economist and sociologist William Graham Sumner. Fisher’s 1891 doctoral dissertation in economics and mathematics, Mathematical Investigations in the Theory of Value and Prices (reprinted in [12], Vol. 1), was the first North American use of general equilibrium analysis (indeed, an independent rediscovery of general equilibrium, because Fisher did not read the works of Léon Walras and F.Y. Edgeworth until his thesis was nearly completed). To accompany this thesis, Fisher constructed a hydraulic mechanism to simulate the determination of equilibrium prices and quantities, a remarkable achievement in the days before electronic computers (see Brainard and Scarf in [5] and Schwalbe in [14]). Initially appointed to teach mathematics at Yale, Fisher soon switched to political economy, teaching at Yale until he retired in 1935. Stricken with tuberculosis in 1898, Fisher was on leave for three years, and did not resume a full teaching load until 1903. This ordeal turned Fisher into a relentless crusader for healthier living and economic reforms, dedicated to improving the world and confident of overcoming adversity and daunting obstacles [1, 5, 14]. As a scientific economist and as a reformer, Fisher was a brilliant and multifaceted innovator, but he never managed to pull his ideas together in a grand synthesis.

In The Nature of Capital and Income, Fisher [7] popularized the concept of net present value, viewing capital as the present discounted value of an expected income stream. Controversially, Fisher excluded saving from his definition of income, and advocated a spending tax instead of a tax on income as usually defined. Since saving is the acquisition of assets whose market value is the net present value of the expected taxable income from owning the assets, a tax on income (as usually defined) would involve double taxation and would introduce a distortion favoring consumption at the expense of saving, a view now increasingly held by economists. Fisher [7] also discussed the pricing and allocation of risk in financial markets, using a “coefficient of caution” to represent subjective attitudes to risk tolerance [2, 3, 18]. In The Rate of Interest, Fisher [8] drew on the earlier work of John Rae and Eugen von Böhm-Bawerk to examine how intertemporal allocation and the real interest rate depend on impatience (time preference) and opportunity to invest (expected rate of return over cost). He illustrated this analysis with the celebrated “Fisher diagram” showing optimal smoothing of consumption over two periods. According to the “Fisher separation theorem,” the time pattern of consumption is independent of the time pattern of income (assuming perfect credit markets), because the net present value of expected lifetime income is the relevant budget constraint for consumption and saving decisions, rather than income in a particular period. Fisher’s analysis of consumption smoothing across time periods provided the basis for later permanent-income and life-cycle models of consumption, and was extended by others to consumption smoothing across possible states of the world. John Maynard Keynes later identified his concept of the marginal efficiency of capital with Fisher’s rate of return over cost.

Fisher’s Appreciation and Interest [6] presented the “Fisher equation,” decomposing nominal interest into real interest and expected inflation, formalizing and expounding an idea that had been briefly noted by, among others, John Stuart Mill and Alfred Marshall. With $i$ as the nominal interest rate, $j$ as the real interest rate, and $a$ as the expected rate of appreciation of the purchasing power of money ([6] appeared at the end of two decades of falling prices),

$(1 + j) = (1 + a)(1 + i)$  (1)

in Fisher’s notation. This analysis of the relationship between interest rates expressed in two different standards (money and goods, gold and silver, dollars and pounds sterling) led Fisher [6] to uncovered interest parity (the difference between nominal interest rates in two currencies is the expected rate of change of the exchange rate) and to a theory of the term structure of interest rates as reflecting expectations about future changes in the purchasing power of money. In later work (see [12], Vol. 9), Fisher correlated nominal interest with a distributed lag of
past price level changes, deriving expected inflation adaptively from past inflation. Distributed lags were introduced into economics by Fisher, who was also among the first economists to use correlation analysis. Long after Fisher’s death, his pioneering 1926 article [10], correlating unemployment with a distributed lag of inflation, was reprinted in 1973, under the title “I Discovered the Phillips Curve.”

In The Purchasing Power of Money, Fisher [13] upheld the quantity theory of money, arguing that changes in the quantity of money affect real output and real interest during adjustment periods of up to 10 years, but affect only nominal variables in the long run. He extended the quantity theory’s equation of exchange to include bank deposits:

$MV + M'V' = PT$  (2)

where $M$ is currency, $M'$ bank deposits, $V$ and $V'$ the velocities of circulation of currency and bank deposits, respectively, $P$ the price level, and $T$ an index of the volume of transactions. Fisher attributed economic fluctuations to the slow adjustment of nominal interest to monetary shocks, resulting from what he termed “the money illusion” in the title of a 1928 book (in [12], Vol. 8). The economy would be stable if, instead of pegging the dollar price of gold, monetary policy followed Fisher’s “compensated dollar” plan of regularly varying the price of gold to target an index number of prices. Inflation targeting is a modern version of Fisher’s proposed price level target (without attempting a variable peg of the price of gold, which would have made Fisher’s plan vulnerable to speculative attacks). Failing to persuade governments to stabilize the purchasing power of money, Fisher attempted to neutralize the effects of price level changes by advocating the creation of indexed financial instruments, persuading Rand Kardex (later Remington Rand) to issue the first indexed bond (see [12], Vol. 8). Fisher tried to educate the public against money illusion, publishing a weekly index of wholesale prices calculated by an index number institute operating out of his house in New Haven, Connecticut. Indexed bonds, the compensated dollar, statistical verification of the quantity theory, and eradication of money illusion all called for a measure of the price level. In The Making of Index Numbers, Fisher [9] argued that a simple formula, the geometric mean of the Laspeyres (base-year weighted) index and the Paasche (current-year weighted) index, was the best index number for that and all other purposes, as it came closer than any other formula to satisfying seven tests for such desirable properties as determinateness, proportionality, and independence of the units of measurement. Later research demonstrated that no formula can satisfy more than six of the seven tests, although which one should be dropped remains an open question. Three quarters of a century later, the “Fisher ideal index” began to be adopted by governments.

Beyond his own work, Fisher encouraged quantitative research by others, notably Yale dissertations by J. Pease Norton [16] and Chester A. Phillips [17], and through his role as founding president of the Econometric Society. Norton’s Statistical Studies in the New York Money Market is now recognized as a landmark in time-series analysis, while Phillips’s Bank Credit (together with later work by Fisher’s former student James Harvey Rogers) analyzed the creation and absorption of bank deposits by the banking system [4]. Arguing that fluctuations in the purchasing power of money make money and bonds risky assets, contrary to the widespread “money illusion,” Fisher and his students advocated common stocks as a long-term investment, with the return on stocks more than compensating for their risk, once risk is calculated in real rather than in nominal terms.

Fisher was swept up in the “New Economy” rhetoric of the 1920s stock boom. He promoted several ventures, of which by far the most successful was his “Index Visible,” a precursor of the Rolodex. Fisher sold Index Visible to Rand Kardex for shares and stock options, which he exercised with borrowed money. In mid-1929, Fisher’s net worth was 10 million dollars. Had he died then, he would have been remembered like Keynes as a financial success as well as a brilliant theorist; however, a few years later, Fisher’s debts exceeded his assets by a million dollars, a loss of 11 million dollars, which, as John Kenneth Galbraith remarked, was “a substantial sum of money, even for a professor of economics” [1, 3]. Worst of all for his public and professional reputation, Fisher memorably asserted in October 1929, on the eve of the Wall Street crash, that stock prices appeared to have reached a permanently high plateau. McGrattan and Prescott [15] hold that Fisher was right to deny that stocks were overvalued in 1929 given the price/earnings multiples of the time. Whether or not Fisher could reasonably be faulted for not
predicting the subsequent errors of public policy that converted the downturn into the Great Depression, and even though many others were just as mistaken about the future course of stock prices, Fisher’s mistaken prediction was particularly pithy, quotable, and memorable, and his reputation suffered as severely as his personal finances. Fisher’s 1933 article on “The Debt-Deflation Theory of Great Depressions” [11], linking the fragility of the financial system to the nonneutrality of inside nominal debt whose real value grew as the price level fell, was much later taken up by such economists as Hyman Minsky, James Tobin, Ben Bernanke, and Mervyn King [5, 14], but in the 1930s Fisher had lost his audience.

Fisher’s 1929 debacle (together with his enthusiastic embrace of causes ranging from a new world map projection, the unhealthiness of smoking, and the usefulness of mathematics in economics, through the League of Nations, universal health insurance, and a low-protein diet to, more regrettably, prohibition and eugenics) long tarnished his public and professional reputation, but he has increasingly come to be recognized as a great figure in the development of theoretical and quantitative economics, including financial economics.

References

[1] Allen, R.L. (1993). Irving Fisher: A Biography, Blackwell, Cambridge, MA.
[2] Crockett, J.H. Jr. (1980). Irving Fisher on the financial economics of uncertainty, History of Political Economy 12, 65–82.
[3] Dimand, R. (2007). Irving Fisher and financial economics: the equity premium puzzle, the predictability of stock prices, and intertemporal allocation under risk, Journal of the History of Economic Thought 29, 153–166.
[4] Dimand, R. (2007). Irving Fisher and his students as financial economists, in Pioneers of Financial Economics, G. Poitras, ed., Edward Elgar, Cheltenham, UK, Vol. 2, pp. 45–59.
[5] Dimand, R. & Geanakoplos, J. (eds) (2005). Celebrating Irving Fisher, Blackwell, Malden, MA.
[6] Fisher, I. (1896). Appreciation and Interest, Macmillan for American Economic Association, New York. (reprinted in Fisher [12], 1).
[7] Fisher, I. (1906). The Nature of Capital and Income, Macmillan, New York. (reprinted in Fisher [12], 2).
[8] Fisher, I. (1907). The Rate of Interest, Macmillan, New York. (reprinted in Fisher [12], 3).
[9] Fisher, I. (1922). The Making of Index Numbers, Houghton Mifflin, Boston. (reprinted in Fisher [12], 7).
[10] Fisher, I. (1926). A statistical relation between unemployment and price changes, International Labour Review 13, 785–792. Reprinted (1973) as Lost and found: I discovered the Phillips curve – Irving Fisher, Journal of Political Economy 81, 496–502.
[11] Fisher, I. (1933). The debt-deflation theory of great depressions, Econometrica 1, 337–357. (reprinted in Fisher [12], 10).
[12] Fisher, I. (1997). The Works of Irving Fisher, W.J. Barber, ed, Pickering & Chatto, London.
[13] Fisher, I. & Brown, H.G. (1911). The Purchasing Power of Money, Macmillan, New York. (reprinted in Fisher [12], 4).
[14] Loef, H. & Monissen, H. (eds) The Economics of Irving Fisher, Edward Elgar, Cheltenham, UK.
[15] McGrattan, E. & Prescott, E. (2004). The 1929 stock market: Irving Fisher was right, International Economic Review 45, 991–1009.
[16] Norton, J.P. (1902). Statistical Studies in the New York Money Market, Macmillan, New York.
[17] Phillips, C. (1920). Bank Credit, Macmillan, New York.
[18] Stabile, D. & Putnam, B. (2002). Irving Fisher and statistical approaches to risk, Review of Financial Economics 11, 191–203.

ROBERT W. DIMAND
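
Two formulas from the entry on Fisher above lend themselves to a short worked example: the Fisher equation, $(1 + j) = (1 + a)(1 + i)$, and the “ideal” index, the geometric mean of the Laspeyres and Paasche indexes. The Python sketch below is only an illustration under stated assumptions: the function names are invented for this example, and the two-good prices and quantities and the interest rates are hypothetical numbers, not data from Fisher’s work.

```python
import math
from typing import Sequence


def real_rate(nominal: float, appreciation: float) -> float:
    """Solve Fisher's equation (1 + j) = (1 + a)(1 + i) for the real rate j.

    `appreciation` is a, the expected rate of appreciation of the purchasing
    power of money, so falling prices correspond to a > 0.
    """
    return (1.0 + appreciation) * (1.0 + nominal) - 1.0


def laspeyres(p0: Sequence[float], p1: Sequence[float], q0: Sequence[float]) -> float:
    """Price index weighted by base-year quantities."""
    return sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))


def paasche(p0: Sequence[float], p1: Sequence[float], q1: Sequence[float]) -> float:
    """Price index weighted by current-year quantities."""
    return sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))


def fisher_ideal(p0, p1, q0, q1) -> float:
    """Geometric mean of the Laspeyres and Paasche indexes."""
    return math.sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))


# Hypothetical two-good economy: base-year and current-year prices and quantities.
p0, p1 = [1.0, 2.0], [1.5, 2.0]
q0, q1 = [10.0, 5.0], [8.0, 6.0]

L = laspeyres(p0, p1, q0)
P = paasche(p0, p1, q1)
F = fisher_ideal(p0, p1, q0, q1)
j = real_rate(nominal=0.05, appreciation=0.02)
print(f"Laspeyres = {L:.4f}, Paasche = {P:.4f}, Fisher ideal = {F:.4f}")
print(f"real rate j = {j:.4f}")
```

Because it is a geometric mean, the ideal index always lies between the Laspeyres and Paasche values. In the interest example, money that is expected to gain 2% in purchasing power turns a 5% nominal rate into a real rate slightly above 7%, the situation of falling prices that the entry notes for the period before [6] appeared.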
Modigliani, Franco

An Italian-born economist who fled the fascist regime of Benito Mussolini at the outbreak of WWII, Modigliani pursued the study of economics at the New School for Social Research (renamed New School University) in New York where he received his doctorate in 1944. He taught at several universities but, from 1962 on, he stayed at the Massachusetts Institute of Technology. His famous dissertation on the Keynesian system served as a springboard for many of his lifetime contributions, which include stabilization policies, the FRB–MIT–Penn–SSRC Model (MPS), the Modigliani–Miller (M&M) theorem (see Modigliani–Miller Theorem) and the life cycle hypothesis (LCH). Modigliani was awarded the Nobel Memorial Prize in economics in 1985 for research in the latter two areas.

Modigliani contributed to making the disciplines of financial economics and macroeconomics operational, and thus more quantitative from a neoclassical perspective. The influence of his teachers, particularly J. Marschak and A. Wald, is seen in his quantitative MPS model based on Keynesian economic thought and his M&M hypothesis in financial economics. The macroeconomic framework that Modigliani built emphasized the savings, consumption, investment, and liquidity components of the Keynesian model. He explained the anomalous fluctuations of the savings ($S$) to income ($Y$) ratio during the 1940s and 1950s. He explained the $S/Y$ ratio by the relative position in the income distribution of individuals, and by secular and cyclical changes in income ([3], Vol. 2). The secular changes represent differences in real income per capita above the highest level reached in any preceding year, signifying his contribution to the relative income hypothesis in consumption theory. The cyclical changes represent variation in money income measured by an index, $(Y_t - Y_t^0)/Y_t$, where $Y_t$ is real income per capita in current time, and $Y_t^0$ is the past peak level of such income. He estimated that the secular and the cyclical effects on income were approximately 0.1% and 0.125%, respectively. These coefficients translate to an $S/Y$ ratio of about 11.7%. Klein and Ozmucur [1] revisited Modigliani’s $S/Y$ specification with a much larger sample size and were able to reaffirm the robustness of the model.

In 1954, Modigliani laid the groundwork for the now-famous life cycle hypothesis (LCH) ([5], Vol. 6, pp. 3–45). The LCH bracketed broader macroeconomic problems such as why $S/Y$ is larger in rich countries than in poor countries; why $S$ is greater for farm families than urban families; why lower status urban families save less than other urban families; why when a higher future income is expected, more of current income will be consumed now; why in countries with rising income that is expected to continue to increase, $S/Y$ will be smaller; and why property income that mostly accrues to the rich is largely saved, whereas wages that are mostly earned by the poor are largely spent. To answer these questions, the LCH model maintains the relative income concept of the early $S/Y$ model. The income concept is, however, more encompassing in being high or low relative to the individual’s lifetime or permanent income, marking Modigliani’s contribution to the permanent income hypothesis in consumption theory. The LCH captures how individuals save when they are young, spend when they are old, and make bequests to their children. In that scenario, consumption, $C$, is uniform over time, $T$, so that $C(T) = (N/L)Y$, where $L$ is the number of years the representative individual lives; $N < L$ is the number of years the individual earns labor income, and $Y$ is average income. Average income is represented by a flat line, $Y(T)$ up to $N$, which falls to zero after $N$, when the individual retires. Since income is earned for $N$ periods, lifetime income is $NY$, and savings is defined as the excess of $Y(T)$ over $C(T)$.

The empirical estimate of the LCH included a wealth-effect variable on consumption. Saving during an individual’s early working life is one way in which wealth accumulates. Such an accumulation of wealth reaches a peak during the person’s working age when income is highest. Individuals also inherit wealth. If the initial stock of wealth is $A_0$, then, at a certain age, $\tau$, a person’s consumption can be expressed as $(L - \tau)C = A + (N - \tau)Y$. Thus, we have a model of consumption explained by income and wealth or assets that can be confronted with data. An early estimate of the coefficients of this LCH model yielded $C = 0.76Y + 0.073A$ (Modigliani, ibid., 70). The result reconciled an early controversy that the short-run propensity to consume from income was between 70% and 80%, and the long-run propensity was approximately 100%. The
reconciliation occurs because the short-run marginal propensity to consume (MPC) is 0.766, and assuming assets, $A$, are approximately five times income, while labor income is approximately 80% of income, the long-run MPC is approximately 0.98, since $C \approx 0.766(0.8Y) + 0.073(5Y) \approx 0.98Y$.

Modigliani’s largest quantitative effort was the MPS model. Working with the board of governors of the Federal Reserve Banks (FRB) and the Social Science Research Council (SSRC), Modigliani built the MIT–Penn–SSRC (MPS) econometric model in the 1960s. The 1968 version, which had 171 endogenous and 119 exogenous variables, predicted poorly in the 1970s and 1980s. In 1996, the FRB/US model replaced the MPS by incorporating rational and vector autoregression types of expectations with a view to improving forecasts. The financial sector was the dominant module in the MPS model. The net worth of consumers took the form of the real value of money and debt. The demand for money depended on the nominal interest rate and the current value of output. Unborrowed reserves influenced the short-term money rate of interest and the nominal money supply, and through the term structure effect, the short-term rate affected the long-term rate and hence savings, which is essential for the expansion of output and employment. Out of this process came the following two fitted demand and supply equations that characterized the financial sector:

$M_d = -0.0021\,iY - 0.0043\,r_s Y + 0.542\,Y + 0.0046\,NP + 0.833\,M_{d,t-1}$  (1)

$FR = (0.001 - 0.00204\,S_2 - 0.00237\,S_3 - 0.00223\,S_4)\,D_{t-1} + 0.00122\,iD_{t-1} + 0.00144\,d\,dD_{t-1} + 0.646(1 - \delta)RU - 0.502\,\delta CL + 0.394\,RD + 0.705\,FR_{t-1}$  (2)

where $M_d$ is demand for deposits held by the public, $Y$ is gross national product (GNP), $r_s$ is the savings deposit rate, $i$ is the available return on short-term assets, $P$ is expected profits, $FR$ is free reserves, $S_i$ are seasonal adjustments, $D$ is the expected value of the stock of member banks deposits, $RU$ is unborrowed reserves, $CL$ is commercial loans, $RL$ is a reserve release term, and $\delta$ is a constant. The equations indicate that the causal link from unborrowed reserves to GNP works through lags, causing delayed responses to policy measures.

Another of Modigliani’s noteworthy contributions to quantitative analysis is the Modigliani and Miller (M&M) theorem [6], which has created a revolution in corporate finance equivalent to the revolution in portfolio theory by H. Markowitz and W. Sharpe. The M&M hypothesis stands on two major propositions, namely that “. . . market value of any firm is independent of its capital structure and is given by capitalizing its expected return at the rate $\rho_k$ appropriate to its class,” and that “the average cost of capital to any firm is completely independent of the capital structure and is equal to the capitalization rate of a pure equity stream of its class” (Italics original) ([4], Vol. 3, 10–11). The M&M model can be demonstrated for a firm with no growth, no new net investment, and no taxes. The firm belongs to a risk group in which its shares can be substituted for one another. The value of the firm can be written as $V_j \equiv S_j + D_j = \bar{X}_j / \rho_j$, where $\bar{X}_j$ measures expected return on assets, $\rho_j$ measures the interest rate for a given risk class, $D_j$ is the market value of bonds, and $S_j$ is the market value of stocks. For instance, if the earnings before interest and taxes (EBIT) are $5000 and if the low-risk interest is 10%, then the value of the firm is $5000/0.10 = $50 000.

The proposition of the M&M hypothesis is often expressed as an invariance principle based on the idea that the value of a firm is independent of how it is financed. The proof of this invariance is based on arbitrage. As stated by Modigliani, “. . . an investor can buy and sell stocks and bonds in such a way as to exchange one income stream for another . . . the value of the overpriced shares will fall and that of the under priced shares will rise, thereby tending to eliminate the discrepancy between the market values of the firms” (ibid., p. 11). For example, an investor can get a 6% return either by holding the stocks of an unlevered firm ($0.06X_1$), or holding the stocks and debts of a levered firm, that is, [$0.06(X_2 - rD_2)$ of stocks $+\ 0.06rD_2$ of debts], where the subscripts refer to firms, $X$ is stock, $D$ is debt, and $r$ is return.

The M&M hypothesis was a springboard for many new works in finance. A first extension of the model by the authors incorporated the effects of corporate taxes. Further analysis incorporating the effects of
personal and corporate income taxes does not change the value of the firm because both personal and corporate tax rates tend to cancel out. Researchers dealt with questions that arise when the concept of risk class used in the computation of a firm’s value is replaced with perfect market assumptions, and when mean–variance models are used instead of arbitrage. The value of the firm was also found to be independent of dividend policy. By changing the discount rate for the purpose of calculating a firm’s present value, it was found that bankruptcy can have an effect on the value of a firm. Macroeconomic variables such as the inflation rate can result in the underestimation of the value of a firm’s equity.

The M&M theorem has been extended into many areas of modern research. It supports the popular Black–Scholes capital structure model. It has been used to validate the effect of the Tax Reform Act of 1986 on values of the firm. Modern capital asset pricing model (CAPM) scholars such as Sharpe (see Sharpe, William F.), J. Lintner, and J. Treynor [2] were influenced by the M&M result in the construction of their financial models and ratios.

On a personal level, Modigliani was an outstandingly enthusiastic, passionate, relentless, and focus-driven teacher and exceptional researcher whose arena was both economic theory and the real empirical world.

References

[1] Klein, L.R. & Ozmucur, S. (2005). The Wealth Effect: A Contemporary Update, paper presented at the New School University.
[2] Mehrling, P. (2005). Fischer Black and the Revolutionary Idea of Finance, John Wiley & Sons, Inc, Hoboken.
[3] Modigliani, F. (1980). Fluctuations in the saving-income ratio: a problem in economic forecasting, in The Collected Papers of Franco Modigliani, The Life Cycle Hypothesis of Savings, A. Abel, & S. Johnson, eds, The MIT Press, Cambridge, MA, Vol. 2.
[4] Modigliani, F. (1980). The cost of capital, corporate finance and the theory of investment, in The Collected Papers of Franco Modigliani, The Theory of Finance and Other Essays, A. Abel, ed., The MIT Press, Cambridge, MA, Vol. 3.
[5] Modigliani, F. (2005). Collected Papers of Franco Modigliani, F. Modigliani, ed., The MIT Press, Cambridge, MA, Vol. 6.
[6] Modigliani, F. & Miller, M. (1958). The cost of capital, corporation finance and the theory of investment, American Economic Review 48(3), 261–297.

Further Reading

Modigliani, F. (2003). The Keynesian Gospel according to Modigliani, The American Economist 47(1), 3–24.
Ramrattan, L. & Szenberg, M. (2004). Franco Modigliani 1918–2003, in memoriam, The American Economist 43(1), 3–8.
Szenberg, M. & Ramrattan, L. (2008). Franco Modigliani, A Mind That Never Rests with a Foreword by Robert M. Solow, Palgrave Macmillan, Houndmills, Basingstoke and New York.

Related Articles

Modigliani–Miller Theorem.

MICHAEL SZENBERG & LALL RAMRATTAN
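
The arithmetic in the entry on Modigliani above can be checked with a few lines of Python. The sketch below covers the life cycle relations $C = (N/L)Y$ and $(L - \tau)C = A + (N - \tau)Y$, the long-run MPC calculation, and the M&M capitalization $V = \bar{X}/\rho$. The function names are invented for this illustration; the working life of 40 years, lifespan of 50 years, and income of 30 000 are hypothetical; and age $\tau$ is assumed to be measured in years since the start of working life. The coefficients 0.766 and 0.073, the 80% labor share, the asset-to-income ratio of five, and the EBIT of $5000 capitalized at 10% are the figures quoted in the entry.

```python
def lch_consumption(years_working: float, years_living: float, income: float) -> float:
    """Level lifetime consumption C = (N / L) * Y with no initial assets."""
    return years_working / years_living * income


def lch_consumption_with_wealth(age: float, assets: float, years_working: float,
                                years_living: float, income: float) -> float:
    """Solve (L - tau) * C = A + (N - tau) * Y for C, assuming tau < N < L."""
    if not age < years_working < years_living:
        raise ValueError("need tau < N < L")
    return (assets + (years_working - age) * income) / (years_living - age)


def long_run_mpc(mpc_income: float = 0.766, mpc_assets: float = 0.073,
                 labor_share: float = 0.8, assets_to_income: float = 5.0) -> float:
    """Consumption per unit of total income when assets move in step with income."""
    return mpc_income * labor_share + mpc_assets * assets_to_income


def firm_value(expected_earnings: float, capitalization_rate: float) -> float:
    """M&M value of the firm: expected earnings capitalized at the class rate."""
    return expected_earnings / capitalization_rate


# Hypothetical individual: works N = 40 of L = 50 years at average income 30 000.
c_flat = lch_consumption(40, 50, 30_000.0)
c_start = lch_consumption_with_wealth(0, 0.0, 40, 50, 30_000.0)
mpc = long_run_mpc()
value = firm_value(5_000.0, 0.10)
print(f"C = {c_flat:.2f}, C at tau = 0 with A = 0: {c_start:.2f}")
print(f"long-run MPC = {mpc:.4f}, firm value = {value:.2f}")
```

With no assets at the start of working life, the wealth version reduces to the flat rule, so the two consumption figures agree. The long-run MPC works out to about 0.98, and the capitalized EBIT gives a firm value of about $50 000.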
Arrow, Kenneth and Morgenstern [41], Hernstein and Milnor [33],
De Groot [31], and Villegas [40]. The legacy of
Arrow’s work is very extensive and some of it
Most financial decisions are made under conditions surprising. This article describes his legacy along
of uncertainty. Yet a formal analysis of markets under three lines: (i) individual and idiosyncratic risks,
uncertainty emerged only recently, in the 1950s. The (ii) rare risks and catastrophic events, and (iii)
matter is complex as it involves explaining how endogenous uncertainty.
individuals make decisions when facing uncertain
situations, the behavior of market instruments such
as insurance, securities, and their prices, the welfare Biographical Background
properties of the distribution of goods and services
under uncertainty, and how risks are shared among Kenneth Joseph Arrow is an American economist and
the traders. It is not even obvious how to formulate joint winner of the Nobel Memorial Prize in Eco-
market clearing under conditions of uncertainty. A nomics with John Hicks in 1972. Arrow taught at
popular view in the middle of the last century was Stanford University and Harvard University. He is
that markets would only clear on the average and one of the founders of modern (post World War
asymptotically in large economies.a This approach II) economic theory, and one of the most important
was a reflection of how insurance markets work, and economists of the twentieth century. For a full bio-
followed a notion of actuarially fair trading. graphical note, the reader is referred to [18]. Born in
A different formulation was proposed in the 1921 in New York City to Harry and Lilian Arrow,
early 1950s by Arrow and Debreu [10, 12, 30]. Kenneth was raised in the city. He graduated from
They introduced an economic theory of markets in Townsend Harris High School and earned a bach-
which the treatment of uncertainty follows basic elor’s degree from the City College of New York
principles of physics. The contribution of Arrow studying under Alfred Tarski. After graduating in
and Debreu is as fundamental as it is surpris- 1940, he went to Columbia University and after a
ing. For Arrow and Debreu, markets under uncer- hiatus caused by World War II, when he served
tainty are formally identical to markets without with the Weather Division of the Army Air Forces,
uncertainty. In their approach, uncertainty all but he returned to Columbia University to study under
disappears.b the great statistician Harold Hotelling at Columbia
It may seem curious to explain trade with uncer- University. He received a master’s degree in 1941
tainty as though uncertainty did not matter. The studying under A. Wald, who was the supervisor
disappearing act of the issue at stake is an unusual of his master’s thesis on stochastic processes. From
way to think about financial risk, and how we trade 1946 to 1949 he spent his time partly as a grad-
when facing such risks. But the insight is valu- uate student at Columbia and partly as a research
able. Arrow and Debreu produced a rigorous, con- associate at the Cowles Commission for Research in
sistent, general theory of markets under uncertainty Economics at the University of Chicago; it was in
that inherits the most important properties of mar- in Chicago that he met his wife Selma Schweitzer.
kets without uncertainty. In doing so, they forced us During that time, he also held the position of Assis-
to clarify what is intrinsically different about uncer- tant Professor of Economics at the University of
tainty. Chicago. Initially interested in following a career as
This article summarizes the theory of markets an actuary, in 1951 he earned his doctorate in eco-
under uncertainty that Arrow and Debreu created, nomics from Columbia University working under the
including critical issues that arise from it, and also supervision of Harold Hotelling and Albert Hart. His
its legacy. It focuses on the way Arrow introduced published work on risk started in 1951 [3]. In devel-
securities: how he defined them and the limits of oping his own approach to risk, Arrow grapples with
his theory. It mentions the theory of insurance the ideas of Shackle [39], Knight [35], and Keynes
that Arrow pioneered together with Malinvaud and [34] among others, seeking and not always finding
others [6], as well as the theory of risk bearing a rigorous mathematical foundation. His best-known
that Arrow developed on the basis of expected works on financial markets date back to 1953 [3].
utility [7], following the axioms of Von Neumann These works provide a solid foundation based on the
role of securities in the allocation of risks [4, 5, 7, 9, 10]. His approach can be described as a state contingent security approach to the allocation of risks in an economy, and is largely an extension of the same approach he followed in his work on general equilibrium theory with Gerard Debreu, for which he was awarded the Nobel Prize in 1972 [8]. Nevertheless, his work connects also with social issues of risk allocation and with the French literature of the time, especially [1, 2].

Markets under Uncertainty

The Arrow–Debreu theory conceptualizes uncertainty with a number of possible states of the world s = 1, 2, . . . that may occur. Commodities can be in one of several states, and are traded separately in each of the states of nature. In this theory, one does not trade a good, but a “contingent good”, namely, a good in each state of the world: apples when it rains and apples when it shines [10, 12, 30]. This way the theory of markets with N goods and S states of nature is formally identical to the theory of markets without uncertainty but with N × S commodities. Traders trade “state contingent commodities”. This simple formulation allows one to apply the results of the theory of markets without uncertainty to markets with uncertainty. One recovers most of the important results such as (i) the existence of a market equilibrium and (ii) the “invisible hand theorem” that establishes that market solutions are always Pareto efficient. The approach is elegant, simple, and general.

Along with its elegance and simplicity, the formulation of this theory can be unexpectedly demanding. It requires that we all agree on all the possible states of the world that describe “collective uncertainty”, and that we trade accordingly. This turns out to be more demanding than it seems: for example, one may need to have one market for apples when it rains and a separate one for when it does not, with separate market prices for each case. The assumption requires N × S markets to guarantee market efficiency, a requirement that in some cases militates against the applicability of the theory. In a later article, Arrow simplified the demands of the theory and reduced the number of markets needed for efficiency by defining “securities”, which are different payments of money exchanged among the traders in different states of nature [4, 5]. This new approach no longer requires trading “contingent” commodities but rather trading a combination of commodities and securities. Arrow proves that by trading commodities and securities, one can achieve the same results as trading state contingent commodities [4, 5]. Rather than needing N × S markets, one needs fewer markets, namely, N markets for commodities and S − 1 markets for securities. This approach was a great improvement and led to the study of securities in a rigorous and productive manner, an area in which his work has left a large legacy. The mathematical requirement to reach Pareto efficiency was simplified gradually to require that the securities traded should provide for each trader a set of choices with the same dimension as the original state contingent commodity approach. When this condition is not satisfied, the markets are called “incomplete”. This led to a large literature on incomplete markets, for example, [26, 32], in which Pareto efficiency is not assured, and government intervention may be required, an area that exceeds the scope of this article.

Individual Risk and Insurance

The Arrow–Debreu theory is not equally well suited for all types of risks. In some cases, it could require an unrealistically large number of markets to reach efficient allocations. A clear example of this phenomenon arises for those risks that pertain to one individual at a time, called individual risks, which are not readily interpreted as states of the world on which we all agree and are willing to trade. Individuals’ accidents, illnesses, deaths, and defaults are frequent and important risks that fall under this category. Arrow [6] and Malinvaud [37] showed how individual uncertainty can be reformulated or reinterpreted as collective uncertainty. Malinvaud formalized the creation of states of collective risks from individual risks, by lists that describe all individuals in the economy, each in one state of individual risk. The theory of markets can be reinterpreted accordingly [14, 37, 38], yet remains somewhat awkward. The process of trading under individual risk using the Arrow–Debreu theory requires an unrealistically large number of markets. For example, with N individuals, each in one of two individual states G (good) and B (bad), the number of (collective) states that are required to apply the Arrow–Debreu
theory is S = 2^N. The number of markets required is as above, either S × N or N + S − 1. But with N = 300 million people, as in the US economy, applying the Arrow–Debreu approach would require N × S = N × 2^(300 million) markets to achieve Pareto efficiency, more markets than the total number of particles in the known universe [25]. For this reason, individual uncertainty is best treated with another formulation of uncertainty involving individual states of uncertainty and insurance rather than securities, in which market clearing is defined on the average and may never actually occur. In this new approach, instead of requiring N + S − 1 markets, one requires only N commodity markets and, with two states of individual risk, just one security: an insurance contract suffices to obtain asymptotic efficiency [37, 38]. This is a satisfactory theory of individual risk and insurance, but it leads only to asymptotic market clearing and Pareto efficiency. More recently, the theory was improved and it was shown that one can obtain exact market-clearing solutions and Pareto-efficient allocations based on N commodity markets with the introduction of a limited number of financial instruments called mutual insurance [14]. It is shown in [14] that if there are N households (consisting of H types), each facing the possibility of being in S individual states together with T collective states, then ensuring Pareto optimality requires only H(S − 1)T independent mutual insurance policies plus T pure Arrow securities.

Choice and Risk Bearing

Choice under uncertainty explains how individuals rank risky outcomes. In describing how we rank choices under uncertainty, one follows principles that were established to describe the way nature ranks what is most likely to occur, a topic that was widely explored and is at the foundation of statistics [31, 40]. To explain how individuals choose under conditions of uncertainty, Arrow used behavioral axioms that were introduced by Von Neumann and Morgenstern [41] for the theory of games (see End Note c) and axioms defined by De Groot [31] and Villegas [40] for the foundation of statistics. The main result obtained in the middle of the twentieth century was that under rather simple behavioral assumptions, individuals behave as though they were optimizing an “expected utility function”. This means that they behave as though they have (i) a utility u for commodities, which is independent of the state of nature, and (ii) subjective probabilities about how likely the various states of nature are. Using the classic axioms one constructs a ranking of choice under uncertainty, obtaining the well-known expected utility approach. Specifically, traders choose over “lotteries” that achieve different outcomes in different states of nature. When states of nature and outcomes are represented by real numbers in $\mathbb{R}$, a lottery is a function $f : \mathbb{R} \to \mathbb{R}^N$, a utility is a function $u : \mathbb{R}^N \to \mathbb{R}$, and a subjective probability is $p : \mathbb{R} \to [0, 1]$ with $\int_{\mathbb{R}} p(s)\, ds = 1$. Von Neumann, Arrow, and Herstein and Milnor all obtained the same classic “representation theorem” that identifies choice under uncertainty by the ranking of lotteries according to a real-valued function W, where W has the now familiar “expected utility” form:

$$W(f) = \int_{s \in \mathbb{R}} p(s)\, u(f(s))\, ds \qquad (1)$$

The utility function u is typically bounded to avoid paradoxical behavior (see End Note d). The expected utility approach just described has been generally used since the mid-twentieth century. Despite its elegance and appeal, from the very beginning, expected utility has been unable to explain a host of experimental evidence that was reported in the work of Allais [2] and others. There has been a persistent conflict between theory and observed behavior, but no axiomatic foundation to replace Von Neumann’s foundational approach. The reason for this discrepancy has been identified more recently, and it is attributed to the fact that expected utility is dominated by frequent events and neglects rare events, even those that are potentially catastrophic, such as widespread default in today’s economies. That expected utility neglects rare events was shown in [17, 19, 23]. In [23], the problem was traced back to Arrow’s axiom of monotone continuity [7], which Arrow attributed to Villegas [40], and to the corresponding continuity axioms of Herstein and Milnor, and De Groot [31], who defined a related continuity condition denoted “SP4”. Because of this property, on which Arrow’s work is based, the expected utility approach has been characterized as the “dictatorship” of frequent events, since it is dominated by the consideration of “normal” and frequent events [19]. To correct this bias, and to represent more accurately how we choose
under uncertainty, and to arrive at a more realistic meaning of rationality, a new axiom was added in [17, 19, 21], requiring equal treatment for frequent and for rare events. The new axiom was subsequently proven to be the logical negation of Arrow’s monotone continuity axiom, which was shown to neglect small-probability events [23].

The new axioms led to a “representation theorem” according to which the ranking of lotteries is a modified expected utility formula

$$W(f) = \int_{s \in \mathbb{R}} p(s)\, u(f(s))\, ds + \phi(f) \qquad (2)$$

where φ is a continuous linear function on lotteries defined by a finitely additive measure, rather than a countably additive measure [17, 19]. This measure assigns most weight to rare events. The new formulation has both types of measures, so the new characterization of choice under uncertainty incorporates both (i) frequent and (ii) rare events in a balanced manner, conforming more closely to the experimental evidence on how humans choose under uncertainty [15]. The new specification gives well-deserved importance to catastrophic risks, and a special role to fear in decision making [23], leading to a more realistic theory of choice under uncertainty and foundations of statistics [15, 23, 24]. The legacy of Kenneth Arrow’s work is surprising but strong: the new theory of choice under uncertainty coincides with the old when there are no catastrophic risks so that, in reality, the new theory is an extension of the old to incorporate rare events. Some of the most interesting applications are to environmental risks such as global warming [25]. Here Kenneth Arrow’s work was prescient: Arrow was a contributor to the early literature on environmental risks and irreversibilities [11], along with option values.

Endogenous Uncertainty and Widespread Default

Some of the risks we face are not created by nature. They are our own creation, such as global warming or the financial crisis of 2008 and 2009 anticipated in [27]. In physics, the realization that the observer matters, that the observer is a participant and creates uncertainty, is called Heisenberg’s uncertainty principle. The equivalent in economics is an uncertainty principle that describes how we create risks through our economic behavior. This realization led to the new concept of “markets with endogenous uncertainty”, created in 1991, and embodied in early articles [16, 27, 28] that established some of the basic principles and welfare theorems in markets with endogenous uncertainty. These and other later articles [20, 25, 27, 36] established basic principles of existence and the properties of the general equilibrium of markets with endogenous uncertainty. It is possible to extend the Arrow–Debreu theory of markets to encompass markets with endogenous uncertainty and also to prove the existence of market equilibrium under these conditions [20]. But in the new formulation, Heisenberg’s uncertainty principle rears its quizzical face. It is shown that it is no longer possible to fully hedge the risks that we create ourselves [16], no matter how many financial instruments we create. The equivalent of Russell’s paradox in mathematical logic appears also in this context due to the self-referential aspects of endogenous uncertainty [16, 20]. Pareto efficiency of equilibrium can no longer be ensured. Some of the worst economic risks we face are endogenously determined: for example, those that led to the 2008–2009 global financial crisis [27]. In [27] it was shown that the creation of financial instruments to hedge individual risks (such as credit default insurance, which is often a subject of discussion in today’s financial turmoil) can by itself induce collective risks of widespread default. The widespread default that we experience today was anticipated in [27], in 1991, and in 2006, when it was attributed to endogenous uncertainty created by financial innovation as well as to our choices of regulation or deregulation of financial instruments. Examples are the extent of reserves that are required for investment banking operations, and the creation of mortgage-backed securities that are behind many of the default risks faced today [29]. Financial innovation of this nature, and the attendant regulation of new financial instruments, causes welfare gains for individuals, but at the same time creates new risks for society, which bears the collective risks that ensue, as observed in 2008 and 2009. In this context, an extension of the Arrow–Debreu theory of markets can no longer treat markets with endogenous uncertainty as equivalent to markets with standard commodities. The symmetry of markets with and without uncertainty is now broken. We face a brave new world of financial innovation and the
endogenous uncertainty that we create ourselves. Creation and hedging of risks are closely linked, and endogenous uncertainty has acquired a critical role in market performance and economic welfare, an issue that Kenneth Arrow has more recently tackled himself through joint work with Frank Hahn [13].

Acknowledgments

Many thanks are due to Professors Rama Cont and Perry Mehrling of Columbia University and Barnard College, respectively, for their comments and excellent suggestions.

End Notes

a. See [37, 38]; later on Werner Hildenbrand followed this approach.
b. They achieved the same for their treatment of economic dynamics. Trading over time and under conditions of uncertainty characterizes financial markets.
c. And similar axioms used by Herstein and Milnor [33].
d. Specifically to avoid the so-called St. Petersburg paradox, see [7].

References

[1] Allais, M. (ed) (1953). Fondements et Applications de la Theorie du Risque en Econometrie, CNRS, Paris.
[2] Allais, M. (1987). The general theory of random choices in relation to the invariant cardinality and the specific probability function, in Risk Decision and Rationality, B.R. Munier, ed., Reidel, Dordrecht, The Netherlands, pp. 233–289.
[3] Arrow, K. (1951). Alternative approaches to the theory of choice in risk-taking situations, Econometrica 19(4), 404–438.
[4] Arrow, K. (1953). Le Role des Valeurs Boursiers pour la Repartition la Meilleure des Risques, Econometrie 11, 41–47. Paris CNRS, translated in English in RES 1964 (below).
[5] Arrow, K. (1953). The role of securities in the optimal allocation of risk bearing, Proceedings of the Colloque sur les Fondements et Applications de la Theorie du Risque en Econometrie. CNRS, Paris. English Translation published in The Review of Economic Studies Vol. 31, No. 2, April 1964, p. 91–96.
[6] Arrow, K. (1963). Uncertainty and the welfare economics of medical care, American Economic Review 53, 941–973.
[7] Arrow, K. (1970). Essays on the Theory of Risk Bearing, North Holland, Amsterdam.
[8] Arrow, K. (1972). General Economic Equilibrium: Purpose Analytical Techniques Collective Choice, Les Prix Nobel en 1972, Stockholm Nobel Foundation pp. 253–272.
[9] Arrow, K. (1983). Collected Papers of Kenneth Arrow, Belknap Press of Harvard University Press.
[10] Arrow, K.J. & Debreu, G. (1954). Existence of an equilibrium for a competitive economy, Econometrica 22, 265–290.
[11] Arrow, K.J. & Fisher, A. (1974). Environmental preservation, uncertainty and irreversibilities, Quarterly Journal of Economics 88(2), 312–319.
[12] Arrow, K. & Hahn, F. (1971). General Competitive Analysis, Holden Day, San Francisco.
[13] Arrow, K. & Hahn, F. (1999). Notes on sequence economies, transaction costs and uncertainty, Journal of Economic Theory 86, 203–218.
[14] Cass, D., Chichilnisky, G. & Wu, H.M. (1996). Individual risk and mutual insurance, Econometrica 64, 333–341.
[15] Chanel, O. & Chichilnisky, G. (2009). The influence of fear in decisions: experimental evidence, Journal of Risk and Uncertainty 39(3).
[16] Chichilnisky, G. (1991, 1996). Markets with endogenous uncertainty: theory and policy, Columbia University Working paper 1991 and Theory and Decision 41(2), 99–131.
[17] Chichilnisky, G. (1996). Updating Von Neumann Morgenstern axioms for choice under uncertainty with catastrophic risks. Proceedings of Conference on Catastrophic Risks, Fields Institute for Mathematical Sciences, Toronto, Canada.
[18] Chichilnisky, G. (ed) (1999). Markets Information and Uncertainty: Essays in Honor of Kenneth Arrow, Cambridge University Press.
[19] Chichilnisky, G. (2000). An axiomatic treatment of choice under uncertainty with catastrophic risks, Resource and Energy Economics 22, 221–231.
[20] Chichilnisky, G. (1999/2008). Existence and optimality of general equilibrium with endogenous uncertainty, in Markets Information and Uncertainty: Essays in Honor of Kenneth Arrow, 2nd Edition, G. Chichilnisky, ed., Cambridge University Press, Chapter 5.
[21] Chichilnisky, G. (2009). The foundations of statistics with Black Swans, Mathematical Social Sciences, DOI:10.1016/j.mathsocsci.2009.09.007.
[22] Chichilnisky, G. (2009). The limits of econometrics: non parametric estimation in Hilbert spaces, Econometric Theory 25, 1–17.
[23] Chichilnisky, G. (2009). “The Topology of Fear” invited presentation at NBER conference in honor of Gerard Debreu, UC Berkeley, December 2006, Journal of Mathematical Economics 45(11–12), December 2009. Available online 30 June 2009, ISSN 0304–4068, DOI: 10.1016/j.jmateco.2009.06.006.
[24] Chichilnisky, G. (2009a). Subjective Probability with Black Swans, Journal of Probability and Statistics (in press, 2010).
[25] Chichilnisky, G. & Heal, G. (1993). Global environmental risks, Journal of Economic Perspectives, Special Issue on the Environment Fall, 65–86.
[26] Chichilnisky, G. & Heal, G. (1996). On the existence and the structure of the pseudo-equilibrium manifold, Journal of Mathematical Economics 26, 171–186.
[27] Chichilnisky, G. & Wu, H.M. (1991, 2006). General equilibrium with endogenous uncertainty and default, Working Paper Stanford University, 1991, Journal of Mathematical Economics 42, 499–524.
[28] Chichilnisky, G., Heal, G. & Dutta, J. (1991). Endogenous Uncertainty and Derivative Securities in a General Equilibrium Model, Working Paper Columbia University.
[29] Chichilnisky, G., Heal, G. & Tsomocos, D. (1995). Option values and endogenous uncertainty with asset backed securities, Economics Letters 48(3–4), 379–388.
[30] Debreu, G. (1959). Theory of Value: An Axiomatic Analysis of Economic Equilibrium, John Wiley & Sons, New York.
[31] De Groot, M.H. (1970, 2004). Optimal Statistical Decisions, John Wiley & Sons, Hoboken, New Jersey.
[32] Geanakoplos, J. (1990). An introduction to general equilibrium with incomplete asset markets, Journal of Mathematical Economics 19, 1–38.
[33] Herstein, I.N. & Milnor, J. (1953). An axiomatic approach to measurable utility, Econometrica 21, 291–297.
[34] Keynes, J.M. (1921). A Treatise on Probability, MacMillan and Co., London.
[35] Knight, F. (1921). Risk, Uncertainty and Profit, Houghton Mifflin and Co., New York.
[36] Kurz, M. & Wu, H.M. (1996). Endogenous uncertainty in a general equilibrium model with price-contingent contracts, Economic Theory 6, 461–488.
[37] Malinvaud, E. (1972). The allocation of individual risks in large markets, Journal of Economic Theory 4, 312–328.
[38] Malinvaud, E. (1973). Markets for an exchange economy with individual risks, Econometrica 41, 383–410.
[39] Shackle, G.L. (1949). Expectations in Economics, Cambridge University Press, Cambridge, UK.
[40] Villegas, C. (1964). On quantitative probability σ-algebras, Annals of Mathematical Statistics 35, 1789–1800.
[41] Von Neumann, J. & Morgenstern, O. (1944). Theory of Games and Economic Behavior, Princeton University Press, Princeton, NJ.

Related Articles

Arrow–Debreu Prices; Risk Aversion; Risk Premia; Utility Theory: Historical Perspectives.

GRACIELA CHICHILNISKY
Efficient Markets Theory: Historical Perspectives

Without any doubt, it can be said that the efficient market hypothesis (EMH) was crucial in the emergence of financial economics as a proper subfield of economics. But this was not its original goal: EMH was initially created to give a theoretical explanation of the random character of stock market prices.

The historical roots of EMH can be traced back to the nineteenth century and the early twentieth century in the work of Regnault and Bachelier, but their work was isolated and not embedded in a scientific community interested in finance. More immediate roots of the EMH lie in the empirical work of Cowles, Working, and Kendall from 1933 to 1959, which laid the foundation for the key works published in the period from 1959 (Roberts) to 1976 (Fama’s reply to LeRoy). More than any other single contributor, it was Fama [7] in his 1965 dissertation, building on the work of Roberts, Cowles, and Cootner, who formulated the EMH, suggesting that stock prices reflect all available information, and that, consequently, the actual value of a security is equal to its price. In addition, because new information arrives randomly, stock prices fluctuate randomly.

The idea that stock prices fluctuate randomly was not new: in 1863, a French broker, Jules Regnault [20], had already suggested it. Regnault was the first author to put forward this hypothesis, to validate it empirically, and to give it a theoretical interpretation. In 1900, Louis Bachelier [1], a French mathematician, used Regnault’s hypothesis and framework to develop the first mathematical model of Brownian motion, and tested the model by using it to price futures and options. In retrospect, we can recognize that Bachelier’s doctoral dissertation constitutes the first work in mathematical finance. Unfortunately for him, however, financial economics did not then exist as a scientific field, and there was no organized scientific community interested in his research. Consequently, both Regnault and Bachelier were ignored by economists until the 1960s.

Although these early authors did suggest modeling stock prices as a stochastic process, they did not formulate the EMH as it is known today. EMH was genuinely born in linking three elements that originally existed independently of each other: (i) the mathematical model of a stochastic process (random walk, Brownian motion, or martingale); (ii) the concept of economic equilibrium; and (iii) the statistical results about the unpredictability of stock market prices. EMH’s creation took place only between 1959 and 1976, when a large number of economists became familiar with these three features. Between the time of Bachelier and the development of EMH, there were no theoretical preoccupations per se about the random character of stock prices, and research was only empirical.

Empirical Research between 1933 and 1959

Between 1933 and the end of the 1950s, only three authors dealt with the random character of stock market prices: Cowles [3, 4], Working [24, 25], and Kendall [13]. They compared stock price fluctuations with random simulations and found similarities. One point must be underlined: these works were strictly statistical, and no theory explained these empirical results.

The situation changed at the end of the 1950s and during the 1960s because of three particular events. First, the Koopmans–Vining controversy at the end of the 1940s led to a decline of descriptive approaches and to the increased use of modeling based on theoretical foundations. Second, modern probability theory, and consequently also the theory of stochastic processes, became usable for nonmathematicians. Significantly, economists were attracted to the new formalisms by some features that were already familiar consequences of economic equilibrium. Most important, the zero expected profit when prices follow a Brownian motion reminded economists of the zero marginal profit in the equilibrium of a perfectly competitive market. Third, research on the stock market became more and more popular among scholars: groups of researchers and seminars in financial economics became organized; scientific journals such as the Journal of Financial and Quantitative Analysis were created and a community of scholars was born. This context raised awareness about the need for theoretical investigations, and these investigations, in turn, allowed the creation of the EMH.
Theoretical Investigations during the 1960s link between empirical results about stock price
variations, the random walk model, and economic
Financial economists did not speak immediately of equilibrium. EMH was born.
EMH; they talked about “random walk theory”.
Following his empirical results, Working [26] was
the first author to suggest a theoretical explana-
tion; he established an explicit link between the Evolution of Fama’s Definition during the
unpredictable arrival of information and the random 1970s
character of stock market price changes. However,
this paper made no link with economic equilibrium Five years after his PhD dissertation, Fama [8]
and, probably for this reason, it was not widely offered a mathematical demonstration of the EMH.
diffused. Instead, it was Roberts [21], a professor He simplified his first definition by making the
at the University of Chicago, who first suggested implicit assumption of a representative agent. He
a link between economic concepts and the random also used another stochastic process: the martingale
walk model by using the “arbitrage proof” argu- model, which had been introduced to model the ran-
ment that had been popularized by Modigliani and dom character of stock market prices by Samuelson
Miller [19]. Then, Cowles [5] made an important [22] and Mandelbrot [17]. The martingale model
step by identifying a link between financial econo- is less restrictive than the random walk model: the
metric results and economic equilibrium. Finally, martingale model requires only independence of the
two years later, Cootner [2] linked the random walk conditional expectation of price changes, whereas
model, information, and economic equilibrium, and the random walk model requires also independence
exposed the idea of EMH, although he did not use that expression.

Cootner [2] had the essential idea of EMH, but he did not make the crucial empirical link because he considered that real-world stock price variations were not purely random. This point of view was defended by economists from MIT (such as Samuelson) and Stanford University (such as Working). By contrast, economists from the University of Chicago claimed that real stock markets were perfect, and so were more inclined to characterize them as efficient. Thus, it was a scholar from the University of Chicago, Eugene Fama, who formulated the EMH.

In his 1965 PhD thesis, Fama gave the first theoretical account of EMH. In that account, the key assumption is the existence of “sophisticated traders” who, due to their skills, make a better estimate of intrinsic valuation than do other agents by using all available information. Provided that such traders have predominant access to financial resources, their activity of buying underpriced assets and selling overpriced assets will tend to make prices equal the intrinsic values about which they have a shared assessment and also to eliminate any expectation of profit from trading. Linking these consequences with the random walk model, Fama added that because information arrives randomly, stock prices have to fluctuate randomly. Fama thus offered the first clear

involving the higher conditional moments (i.e., variance, skewness, and kurtosis) of the probability distribution of price changes. For Fama’s [8] purposes, the most important attraction of the martingale formalism was its explicit reference to a set of information, $\Phi_t$,

$$E(P_{t+1} \mid \Phi_t) - P_t = 0 \qquad (1)$$

As such, the martingale model could be used to test the implication of EMH that, if all available information is used, the expected profit is null. This idea led to the definition of an efficient market that is generally used nowadays: “a market in which prices always ‘fully reflect’ available information is called ‘efficient’” [8].

However, in 1976, LeRoy [15] showed that Fama’s demonstration is tautological and that his theory is not testable. Fama answered by modifying his definition and he also admitted that any test of the EMH is a test of both market efficiency and the model of equilibrium used by investors. In addition, it is striking to note that the test suggested by Fama [9] (i.e., markets are efficient if stock prices are equal to the prediction provided by the model of equilibrium used) does not imply any clear causality between the random character of stock market prices and the EMH; it is mostly a plausible correlation valid only for some cases.
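As a rough illustration of how the martingale condition in equation (1) can be confronted with data, the following sketch simulates a driftless random walk and checks whether the average one-step price change is statistically close to zero. The function names, the Gaussian increments, and the sample size are assumptions made for this example only; they are not taken from Fama's work, and a test on real prices would also depend on the model of equilibrium used, as the text points out.

```python
import random
import statistics


def simulate_prices(n_steps, start=100.0, step_sd=1.0, seed=0):
    """Simulate a driftless random walk P[t+1] = P[t] + noise.

    Gaussian noise is an illustrative assumption, not a claim about real prices.
    """
    rng = random.Random(seed)
    prices = [start]
    for _ in range(n_steps):
        prices.append(prices[-1] + rng.gauss(0.0, step_sd))
    return prices


def mean_increment_t_stat(prices):
    """Return the sample mean of P[t+1] - P[t] and its t-statistic against zero."""
    diffs = [b - a for a, b in zip(prices, prices[1:])]
    mean = statistics.fmean(diffs)
    std_err = statistics.stdev(diffs) / len(diffs) ** 0.5
    return mean, mean / std_err


prices = simulate_prices(10_000)
mean, t_stat = mean_increment_t_stat(prices)
print(f"mean change = {mean:.4f}, t-statistic = {t_stat:.2f}")
```

A small t-statistic is consistent with a zero expected price change, but it says nothing by itself about efficiency, since the zero-drift benchmark is only one possible equilibrium model.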
The Proliferation of Definitions since the 1970s

Fama’s modification of his definition proved to be a fateful admission. In retrospect, it is clear that the theoretical content of EMH comprised its suggestion of a link between some mathematical model, some empirical results, and some concept of economic equilibrium. The precise linkage proposed by Fama was, however, only one of many possible linkages, as subsequent literature would demonstrate. Just so, LeRoy [14] and Lucas [16] provided theoretical proofs that efficient markets and the martingale hypothesis are two distinct ideas: the martingale property is neither necessary nor sufficient for an efficient market. In a similar way, Samuelson [23], who gave a mathematical proof that prices may be permanently equal to the intrinsic value and fluctuate randomly, explained that it cannot be excluded that some agents make profits, contrary to the original definition of EMH. De Meyer and Saley [6] show that stock market prices follow a martingale even if not all available information is contained in stock market prices.

This proliferation at the level of theory has been matched by proliferation at the level of empirical testing, as the definition of EMH has changed depending on the emphasis placed by each author on one particular feature. For instance, Fama et al. [10] defined an efficient market as “a market that adjusts rapidly to new information”; Jensen [12] considered that “a market is efficient with respect to information set $\theta_t$ if it is impossible to make economic profit by trading on the basis of information set $\theta_t$”; according to Malkiel [18] “the market is said to be efficient with respect to some information set [. . .] if security prices would be unaffected by revealing that information to all participants. Moreover, efficiency with respect to an information set [. . .] implies that it is impossible to make economic profits by trading on the basis of [that information set]”.

The situation is similar regarding the tests: the type of test used depends on the definition used by the authors and on the data used (for instance, most of the tests are done with low frequency or daily data, while statistical arbitrage opportunities are discernible and exploitable at high frequency using algorithmic trading). Moreover, some authors have used the weakness of the definitions to criticize the very relevance of efficient markets. For instance, Grossman and Stiglitz [11] argued that because information is costly, prices cannot perfectly reflect all available information. Consequently, they considered that perfectly information-efficient markets are impossible.

The history of EMH shows that the definition of this theory is plural, and the initial project of EMH (the creation of a link between a mathematical model, the concept of economic equilibrium, and statistical results about the unpredictability of stock market prices) has not been fully achieved. Moreover, this theory is not empirically refutable (since a test of the random character of stock prices does not imply a test of efficiency). Nevertheless, financial economists have considered EMH as one of the pillars of financial economics because it played a key role in the creation and history of financial economics by linking financial results with standard economics. This link is the main contribution of EMH.

References

[1] Bachelier, L. (1900). Théorie de la spéculation reproduced in Annales de l’Ecole Normale Supérieure, 3ème série 17, in Random Character of Stock Market Prices (English Translation: P.H. Cootner, ed, (1964)), M.I.T. Press, Cambridge, MA, pp. 21–86.
[2] Cootner, P.H. (1962). Stock prices: random vs. systematic changes, Industrial Management Review 3(2), 24–45.
[3] Cowles, A. (1933). Can stock market forecasters forecast? Econometrica 1(3), 309–324.
[4] Cowles, A. (1944). Stock market forecasting, Econometrica 12(3/4), 206–214.
[5] Cowles, A. (1960). A revision of previous conclusions regarding stock price behavior, Econometrica 28(4), 909–915.
[6] De Meyer, B. & Saley, H.M. (2003). On the strategic origin of Brownian motion in finance, International Journal of Game Theory 31, 285–319.
[7] Fama, E.F. (1965). The behavior of stock-market prices, Journal of Business 38(1), 34–105.
[8] Fama, E.F. (1970). Efficient capital markets: a review of theory and empirical work, Journal of Finance 25(2), 383–417.
[9] Fama, E.F. (1976). Efficient capital markets: reply, Journal of Finance 31(1), 143–145.
[10] Fama, E.F., Fisher, L., Jensen, M.C. & Roll, R. (1969). The adjustment of stock prices to new information, International Economic Review 10(1), 1–21.
[11] Grossman, S.J. & Stiglitz, J.E. (1980). The impossibility of informationally efficient markets, American Economic Review 70(3), 393–407.
[12] Jensen, M.C. (1978). Some anomalous evidence regarding market efficiency, Journal of Financial Economics 6, 95–101.
[13] Kendall, M.G. (1953). The analysis of economic time-series. Part I: prices, Journal of the Royal Statistical Society 116, 11–25.
[14] LeRoy, S.F. (1973). Risk-aversion and the martingale property of stock prices, International Economic Review 14(2), 436–446.
[15] LeRoy, S.F. (1976). Efficient capital markets: comment, Journal of Finance 31(1), 139–141.
[16] Lucas, R.E. (1978). Asset prices in an exchange economy, Econometrica 46(6), 1429–1445.
[17] Mandelbrot, B. (1966). Forecasts of future prices, unbiased markets, and “Martingale” models, Journal of Business 39(1), 242–255.
[18] Malkiel, B.G. (1992). Efficient Market Hypothesis, in The New Palgrave Dictionary of Money and Finance, P. Newman, M. Milgate & J. Eatwell, eds, Macmillan, London.
[19] Modigliani, F. & Miller, M.H. (1958). The cost of capital, corporation finance and the theory of investment, The American Economic Review 48(3), 261–297.
[20] Regnault, J. (1863). Calcul des Chances et Philosophie de la Bourse, Mallet-Bachelier and Castel, Paris.
[21] Roberts, H.V. (1959). Stock-market “Patterns” and financial analysis: methodological suggestions, Journal of Finance 14(1), 1–10.
[22] Samuelson, P.A. (1965). Proof that properly anticipated prices fluctuate randomly, Industrial Management Review 6(2), 41–49.
[23] Samuelson, P.A. (1973). Proof that properly discounted present value of assets vibrate randomly, Bell Journal of Economics 4(2), 369–374.
[24] Working, H. (1934). A random-difference series for use in the analysis of time series, Journal of the American Statistical Association 29, 11–24.
[25] Working, H. (1949). The investigation of economic expectations, The American Economic Review 39(3), 150–166.
[26] Working, H. (1956). New ideas and methods for price research, Journal of Farm Economics 38, 1427–1436.

Further Reading

Jovanovic, F. (2008). The construction of the canonical history of financial economics, History of Political Economy 40(3), 213–242.
Jovanovic, F. & Le Gall, P. (2001). Does God practice a random walk? The “financial physics” of a 19th century forerunner, Jules Regnault, European Journal of the History of Economic Thought 8(3), 323–362.
Jovanovic, F. & Poitras, G. (eds) (2007). Pioneers of Financial Economics: Twentieth Century Contributions, Edward Elgar, Cheltenham, Vol. 2.
Poitras, G. (ed) (2006). Pioneers of Financial Economics: Contributions prior to Irving Fisher, Edward Elgar, Cheltenham, Vol. 1.
Rubinstein, M. (1975). Securities market efficiency in an Arrow-Debreu economy, The American Economic Review 65(5), 812–824.

Related Articles

Bachelier, Louis (1870–1946); Efficient Market Hypothesis.

FRANCK JOVANOVIC
Econophysics

The Prehistoric Times of Econophysics

The term econophysics was introduced in the 1990s, endorsed in 1999 by the publication of Mantegna & Stanley’s “An Introduction to Econophysics” [33]. The word “econophysics”, paralleling the quests of biophysics or geophysics, suggests that there is a physics-based approach to economics.

From classical to neoclassical economics and until now, economists have been inspired by the conceptual and mathematical developments of the physical sciences and by their remarkable successes in describing and predicting natural phenomena. Reciprocally, physics has been enriched several times by developments first observed in economics. Well before the christening of econophysics as the incarnation of the multidisciplinary study of complex large-scale financial and economic systems, a multitude of small and large collisions have punctuated the development of these two fields. We now mention a few that illustrate the remarkable commonalities and interfertilization.

In his “Inquiry into the Nature and Causes of the Wealth of Nations” (1776), Adam Smith found inspiration in the Philosophiae Naturalis Principia Mathematica (1687) of Isaac Newton, specifically based on the (novel at the time) notion of causative forces.

The recognition of the importance of feedbacks to fathom the sheer complexity of economic systems has been at the root of economic thinking for a long time. Toward the end of the nineteenth century, the microeconomists Francis Edgeworth and Alfred Marshall drew on some of the ideas of physicists to develop the notion that the economy achieves an equilibrium state like that described for gases by Clerk Maxwell and Ludwig Boltzmann. The general equilibrium theory now at the core of much of economic thinking is nothing but a formalization of the idea that “everything in the economy affects everything else” [18], reminiscent of mean-field theory or self-consistent effective medium methods in physics, but emphasizing and transcending these ideas much beyond their initial sense in physics.

While developing the field of microeconomics in his “Cours d’Economie Politique” (1897), the economist and philosopher Vilfredo Pareto was the first to describe, for the distribution of incomes, the eponymous power laws that would later become the center of attention of physicists and other scientists observing this remarkable and universal statistical signature in the distribution of event sizes (earthquakes, avalanches, landslides, storms, forest fires, solar flares, commercial sales, war sizes, and so on) punctuating so many natural and social systems [3, 29, 35, 41].

While attempting to model the erratic motion of bonds and stock options in the Paris Bourse in 1900, mathematician Louis Bachelier developed the mathematical theory of diffusion (and the first elements of financial option pricing) and solved the parabolic diffusion equation five years before Albert Einstein [10] established the theory of Brownian motion based on the same diffusion equation (also underpinning the theory of random walks) in 1905. The ensuing modern theory of random walks now constitutes one of the fundamental pillars of theoretical physics and economics and finance models.

In the early 1960s, mathematician Benoit Mandelbrot [28] pioneered the use in financial economics of heavy-tailed distributions (Lévy stable laws) as opposed to the traditional Gaussian (normal) law. A cohort of economists, notably at the University of Chicago (Merton Miller, Eugene Fama, and Richard Roll), at MIT (Paul Samuelson), and at Carnegie Mellon University (Thomas Sargent), initially followed in his footsteps. In his PhD thesis, Eugene Fama confirmed that the frequency distribution of the changes in the logarithms of prices was “leptokurtic”, that is, with a high peak and fat tails. However, other notable economists (Paul Cootner and Clive Granger) opposed Mandelbrot’s proposal, on the basis of the argument that “the statistical theory that exists for the normal case is nonexistent for the other members of the class of Lévy laws.” The coup de grâce was the mounting empirical evidence that the distributions of returns were becoming closer to the Gaussian law at timescales larger than one month, at odds with the self-similarity hypothesis associated with the Lévy laws [7, 23]. Much of the effort in the econophysics literature of the late 1990s and early 2000s revisited and refined this hypothesis, confirming on the one hand the existence of the variance (which rules out the class of Lévy distributions proposed by Mandelbrot), but also suggesting a power-law tail with an exponent close to 3 [16, 32]; several other groups have discussed alternatives, such as exponential [39]
or stretched exponential distributions [19, 24, 26]. Financial engineers actually care about these apparent technicalities because the tail structure controls the Value at Risk and other measures of large losses, and physicists care because the tail may constrain the underlying mechanism(s). For instance, Gabaix et al. [14] attribute the large movements in stock market activity to the interplay between the power-law distribution of the sizes of large financial institutions and the optimal trading of such large institutions. In this domain, econophysics focuses on models that can reproduce and explain the main stylized facts of financial time series: non-Gaussian fat tail distribution of returns, long-range autocorrelation of volatility and the absence of correlation of returns, multifractal property of the absolute value of returns, and so on.

In the late 1960s, Benoit Mandelbrot left financial economics but, inspired by this first episode, went on to explore other uncharted territories to show how nondifferentiable geometries (which he termed fractal), previously developed by mathematicians from the 1870s to the 1940s, could provide new ways to deal with the real complexity of the world [29]. He later returned to finance in the late 1990s, in the midst of the enthusiasm for econophysics, to model the multifractal properties associated with the long-memory properties observed in financial asset returns [2, 30, 31, 34, 43].

Notable Contributions

The modern econophysicists are implicitly and sometimes explicitly driven by the hope that the concept of “universality” holds in economics and finance. The value of this strategy remains to be validated [42], as most econophysicists have not yet digested the subtleties of economic thinking and have failed to marry their ideas and techniques with mainstream economics. The following is a partial list of a few notable exceptions: precursory physics approach to social systems [15], agent-based models, induction, evolutionary models [1, 9, 11, 21], option theory for incomplete markets [4, 6], interest rate curves [5, 38], minority games [8], theory of Zipf law and its economic consequences [12, 13, 27], theory of large price fluctuations [14], theory of bubbles and crashes [17, 22, 40], random matrix theory applied to covariance of returns [20, 36, 37], and methods and models of dependence between financial assets [25, 43].

At present, the most exciting progress seems to be unfolding at the boundary between economics and the biological, cognitive, and behavioral sciences. While it is difficult to argue for a physics-based foundation of economics and finance, physics still has a role to play as a unifying framework full of concepts and tools to deal with the complex. The modeling skills of physicists explain their impressive number in investment and financial institutions, where their data-driven approach coupled with a pragmatic sense of theorizing has made them a most valuable commodity on Wall Street.

Acknowledgments

We would like to thank Y. Malevergne for many discussions and a long-term enjoyable and fruitful collaboration.

References

[1] Arthur, W.B. (2005). Out-of-equilibrium economics and agent-based modeling, in Handbook of Computational Economics, Vol. 2: Agent-Based Computational Economics, K. Judd & L. Tesfatsion, eds, Elsevier, North Holland.
[2] Bacry, E., Delour, J. & Muzy, J.-F. (2001). Multifractal random walk, Physical Review E 64, 026103.
[3] Bak, P. (1996). How Nature Works: The Science of Self-Organized Criticality, Copernicus, New York.
[4] Bouchaud, J.-P. & Potters, M. (2003). Theory of financial risk and derivative pricing, From Statistical Physics to Risk Management, 2nd Edition, Cambridge University Press.
[5] Bouchaud, J.-P., Sagna, N., Cont, R., El-Karoui, N. & Potters, M. (1999). Phenomenology of the interest rate curve, Applied Mathematical Finance 6, 209.
[6] Bouchaud, J.-P. & Sornette, D. (1994). The Black-Scholes option pricing problem in mathematical finance: generalization and extensions for a large class of stochastic processes, Journal de Physique I France 4, 863–881.
[7] Campbell, J.Y., Lo, A.W. & MacKinlay, A.C. (1997). The Econometrics of Financial Markets, Princeton University Press, Princeton.
[8] Challet, D., Marsili, M. & Zhang, Y.-C. (2005). Minority Games, Oxford University Press, Oxford.
[9] Cont, R. & Bouchaud, J.-P. (2000). Herd behavior and aggregate fluctuations in financial markets, Journal of Macroeconomic Dynamics 4(2), 170–195.
[10] Einstein, A. (1905). On the motion of small particles suspended in liquids at rest required by the molecular-kinetic theory of heat, Annalen der Physik 17, 549–560.
[11] Farmer, J.D. (2002). Market forces, ecology and evolution, Industrial and Corporate Change 11(5), 895–953.
[12] Gabaix, X. (1999). Zipf’s law for cities: an explanation, Quarterly Journal of Economics 114(3), 739–767.
[13] Gabaix, X. (2005). The Granular Origins of Aggregate Fluctuations, working paper, Stern School of Business, New York.
[14] Gabaix, X., Gopikrishnan, P., Plerou, V. & Stanley, H.E. (2003). A theory of power law distributions in financial market fluctuations, Nature 423, 267–270.
[15] Galam, S. & Moscovici, S. (1991). Towards a theory of collective phenomena: consensus and attitude changes in groups, European Journal of Social Psychology 21, 49–74.
[16] Gopikrishnan, P., Plerou, V., Amaral, L.A.N., Meyer, M. & Stanley, H.E. (1999). Scaling of the distributions of fluctuations of financial market indices, Physical Review E 60, 5305–5316.
[17] Johansen, A., Sornette, D. & Ledoit, O. (1999). Predicting financial crashes using discrete scale invariance, Journal of Risk 1(4), 5–32.
[18] Krugman, P. (1996). The Self-Organizing Economy, Blackwell, Malden.
[19] Laherrere, J. & Sornette, D. (1999). Stretched exponential distributions in nature and economy: fat tails with characteristic scales, European Physical Journal B 2, 525–539.
[20] Laloux, L., Cizeau, P., Bouchaud, J.-P. & Potters, M. (1999). Noise dressing of financial correlation matrices, Physical Review Letters 83, 1467–1470.
[21] Lux, T. & Marchesi, M. (1999). Scaling and criticality in a stochastic multi-agent model of financial market, Nature 397, 498–500.
[22] Lux, T. & Sornette, D. (2002). On rational bubbles and fat tails, Journal of Money, Credit and Banking, Part 1 34(3), 589–610.
[23] MacKenzie, D. (2006). An Engine, Not a Camera: How Financial Models Shape Markets, The MIT Press, Cambridge, London.
[24] Malevergne, Y., Pisarenko, V.F. & Sornette, D. (2005). Empirical distributions of log-returns: between the stretched exponential and the power law? Quantitative Finance 5(4), 379–401.
[25] Malevergne, Y. & Sornette, D. (2003). Testing the Gaussian copula hypothesis for financial assets dependences, Quantitative Finance 3, 231–250.
[26] Malevergne, Y. & Sornette, D. (2006). Extreme Financial Risks: From Dependence to Risk Management, Springer, Heidelberg.
[27] Malevergne, Y. & Sornette, D. (2007). A two-factor Asset Pricing Model Based on the Fat Tail Distribution of Firm Sizes, ETH Zurich working paper. http://arxiv.org/abs/physics/0702027
[28] Mandelbrot, B.B. (1963). The variation of certain speculative prices, Journal of Business 36, 394–419.
[29] Mandelbrot, B.B. (1982). The Fractal Geometry of Nature, W.H. Freeman, San Francisco.
[30] Mandelbrot, B.B. (1997). Fractals and Scaling in Finance: Discontinuity, Concentration, Risk, Springer, New York.
[31] Mandelbrot, B.B., Fisher, A. & Calvet, L. (1997). A Multifractal Model of Asset Returns, Cowles Foundation Discussion Papers 1164, Cowles Foundation, Yale University.
[32] Mantegna, R.N. & Stanley, H.E. (1995). Scaling behavior in the dynamics of an economic index, Nature 376, 46–49.
[33] Mantegna, R. & Stanley, H.E. (1999). An Introduction to Econophysics: Correlations and Complexity in Finance, Cambridge University Press, Cambridge and New York.
[34] Muzy, J.-F., Sornette, D., Delour, J. & Arneodo, A. (2001). Multifractal returns and hierarchical portfolio theory, Quantitative Finance 1, 131–148.
[35] Newman, M.E.J. (2005). Power laws, Pareto distributions and Zipf’s law, Contemporary Physics 46, 323–351.
[36] Pafka, S. & Kondor, I. (2002). Noisy covariance matrices and portfolio optimization, European Physical Journal B 27, 277–280.
[37] Plerou, V., Gopikrishnan, P., Rosenow, B., Amaral, L.A.N. & Stanley, H.E. (1999). Universal and nonuniversal properties of cross correlations in financial time series, Physical Review Letters 83(7), 1471–1474.
[38] Santa-Clara, P. & Sornette, D. (2001). The dynamics of the forward interest rate curve with stochastic string shocks, The Review of Financial Studies 14(1), 149–185.
[39] Silva, A.C., Prange, R.E. & Yakovenko, V.M. (2004). Exponential distribution of financial returns at mesoscopic time lags: a new stylized fact, Physica A 344, 227–235.
[40] Sornette, D. (2003). Why Stock Markets Crash, Critical Events in Complex Financial Systems, Princeton University Press.
[41] Sornette, D. (2006). Critical Phenomena in Natural Sciences, Chaos, Fractals, Self-organization and Disorder: Concepts and Tools, Series in Synergetics, 2nd Edition, Springer, Heidelberg.
[42] Sornette, D., Davis, A.B., Ide, K., Vixie, K.R., Pisarenko, V. & Kamm, J.R. (2007). Algorithm for model validation: theory and applications, Proceedings of the National Academy of Sciences of the United States of America 104(16), 6562–6567.
[43] Sornette, D., Malevergne, Y. & Muzy, J.F. (2003). What causes crashes? Risk 16, 67–71. http://arXiv.org/abs/cond-mat/0204626

Further Reading

Bachelier, L. (1900). Théorie de la spéculation, Annales de l’Ecole Normale Supérieure (translated in the book Random Character of Stock Market Prices), Théorie des probabilités continues, 1906, Journal des Mathématiques Pures et Appliquées; Les Probabilités cinématiques et dynamiques, 1913, Annales de l’Ecole Normale Supérieure.
Cardy, J.L. (1996). Scaling and Renormalization in Statistical Physics, Cambridge University Press, Cambridge.
Pareto, V. (1897). Cours d’Économie Politique, Macmillan, Paris, Vol. 2.
Stanley, H.E. (1999). Scaling, universality, and renormalization: three pillars of modern critical phenomena, Reviews of Modern Physics 71(2), S358–S366.

GILLES DANIEL & DIDIER SORNETTE
Kolmogorov, Andrei Nikolaevich

Andrei Nikolaevich Kolmogorov was born on April 25, 1903 and died on October 20, 1987 in the Soviet Union.

Springer Verlag published (in German) Kolmogorov’s monograph “Foundations of the Theory of Probability” more than seventy-five years ago [3]. In this small, 80-page book, he not only provided the logical foundation of the mathematical theory of probability (axiomatics) but also defined new concepts: conditional probability as a random variable, conditional expectations, the notion of independence, the use of Borel fields of probability, and so on. The “Main theorem” in Chapter III “Probability in Infinite Spaces” indicated how to construct stochastic processes starting from their finite-dimensional distributions. His approach has made the development of modern mathematical finance possible.

Before writing “Foundations of the Theory of Probability”, Kolmogorov wrote his great paper “Analytical Methods in Probability Theory” [2], which gave birth to the theory of Markov processes in continuous time. In this paper, Kolmogorov presented his famous forward and backward differential equations, which are often-used tools in probability theory and its applications. He also gave credit to L. Bachelier for the latter’s pioneering investigations of probabilistic schemes evolving continuously in time.

The two works mentioned earlier laid the groundwork for all subsequent developments of the theory of probability and stochastic processes. Today, it is impossible to imagine the state of these sciences without Kolmogorov’s contributions.

Kolmogorov developed many fundamentally important concepts that have determined the progress in different branches of mathematics and other branches of science and arts. Being an outstanding mathematician and scientist, he obtained, besides fundamental results in the theory of probability [5], results in the theory of trigonometric series, measure and set theory, the theory of integration, approximation theory, constructive logic, topology, the theory of superposition of functions and Hilbert’s thirteenth problem, classical mechanics, ergodic theory, the theory of turbulence, diffusion and models of population dynamics, mathematical statistics, the theory of algorithms, information theory, the theory of automata and applications of mathematical methods in the humanities (including work in the theory of poetry, the statistics of text, and history), and the history and methodology of mathematics for school children and teachers of school mathematics [4–6]. For more descriptions of Kolmogorov’s works, see [1, 7].

References

[1] Bogolyubov, N.N., Gnedenko, B.V. & Sobolev, S.L. (1983). Andrei Nikolaevich Kolmogorov (on his eightieth birthday), Russian Mathematical Surveys 38(4), 9–27.
[2] Kolmogoroff, A. (1931). Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung, Mathematische Annalen, 104, 415–458.
[3] Kolmogoroff, A. (1933). Grundbegriffe der Wahrscheinlichkeitsrechnung, Springer, Berlin.
[4] Kolmogorov, A.N. (1991). Mathematics and mechanics, in Mathematics and its Applications (Soviet Series 25), V.M. Tikhomirov, ed., Kluwer, Dordrecht, Vol. I, pp. xx+551.
[5] Kolmogorov, A.N. (1992). Probability theory and mathematical statistics, in Mathematics and its Applications (Soviet Series 26), A.N. Shiryayev, ed., Kluwer, Dordrecht, Vol. II, pp. xvi+597.
[6] Kolmogorov, A.N. (1993). Information theory and the theory of algorithms, in Mathematics and its Applications (Soviet Series 27), A.N. Shiryayev, ed., Kluwer, Dordrecht, Vol. III, pp. xxvi+275.
[7] Shiryaev, A.N. (2000). Andrei Nikolaevich Kolmogorov (April 25, 1903 to October 20, 1987). A biographical sketch of his life and creative paths, in Kolmogorov in Perspective, American Mathematical Society, London Mathematical Society, pp. 1–87.

ALBERT N. SHIRYAEV
Bernoulli, Jacob

Jacob Bernoulli (1654–1705), the son and grandson of spice merchants in the city of Basel, Switzerland, was trained to be a Protestant clergyman, but, following his own interests and talents, instead became the professor of mathematics at the University of Basel from 1687 until his death. He taught mathematics to his nephew Nicolaus Bernoulli (1687–1759) and to his younger brother Johann (John, Jean) Bernoulli (1667–1748), who was trained in medicine, but took over as professor of mathematics at Basel after Jacob’s death in 1705. As a professor of mathematics, Johann Bernoulli, in turn, taught mathematics to his sons, including Daniel Bernoulli (1700–1782), known for the St. Petersburg paradox in probability, as well as for work in hydrodynamics. Jacob and Johann Bernoulli were among the first to read and understand Gottfried Wilhelm Leibniz’s articles in the Acta Eruditorum of 1684 and 1686, in which Leibniz put forth the new algorithm of calculus. They helped to develop and spread Leibniz’s calculus throughout Europe, Johann teaching calculus to the Marquis de l’Hôpital, who published the first calculus textbook. Nicolaus Bernoulli wrote his master’s thesis [1] on the basis of the manuscripts of Jacob’s still unpublished Art of Conjecturing, and helped to spread its contents in the years between Jacob’s death and the posthumous publication of Jacob’s work in 1713 [2]. In the remainder of this article, the name “Bernoulli” without any first name refers to Jacob Bernoulli. (Readers should be aware that many Bernoulli mathematicians are not infrequently confused with each other. For instance, it was Jacob’s son Nicolaus, also born in 1687, but a painter and not a mathematician, who had the Latin manuscript of [2] printed, and not his nephew Nicolaus, although the latter wrote a brief preface.)

As far as the application of the art of conjecturing to economics (or finance) is concerned, much of the mathematics that Jacob Bernoulli inherited relied more on law and other institutional factors than it relied on statistics or mathematical probability, a discipline that did not then exist. Muslim traders had played a significant role in Mediterranean commerce in the medieval period and in the development of mathematics, particularly algebra, as well. Muslim mathematical methods were famously transmitted to Europe by Leonardo of Pisa, also known as Fibonacci [6]. Rather than relying on investments with guaranteed rates of return, which were frowned upon as involving usury, Muslim trade was often carried out by partnerships or companies, many involving members of extended families. Such partnerships would be based on a written contract between those involved, spelling out the agreed-upon division of the profits once voyagers had returned and the goods had been sold, the shares of each partner depending upon their investment of cash, supply of capital goods such as ships or warehouses, and labor. According to Islamic law, if one of the partners in such an enterprise died before the end of the anticipated period of the venture, his heirs were entitled to demand the dissolution of the firm, so that they might receive their legal inheritances. Not infrequently, applied mathematicians were called upon to calculate the value of the partnership on a given intermediate date, so that the partnership could be dissolved fairly.

In Arabic and then Latin books of commercial arithmetic or business mathematics in general (geometry, for instance, volumes of barrels, might also be included), there were frequently problems of “societies” or partnerships, which later evolved into the so-called “problem of points” concerning the division of the stakes of a gambling game if it were terminated before its intended end. Typically, the values of the various partners’ shares were calculated using (i) the amounts invested; (ii) the length of time each amount was invested in the company, if the partners were not all equal in this regard; and (iii) the original contract, which generally specified the division of the capital and profits among partners traveling to carry out the business and those remaining at home. The actual mathematics involved in making these calculations was similar to the mathematics of calculating the price of a mixture [2, 7, 8]. (If, as was often the case, “story problems” were described only in long paragraphs, what was intended might seem much more complex than if everything could have been set out in the subsequently developed notation of algebraic equations.)

In Part IV of [2], Bernoulli had intended to apply the mathematics of games of chance, expounded in Parts I–III of the book on the basis of Huygens’ work, by analogy, to civil, moral, and economic problems. The fundamental principle of Huygens’ and Bernoulli’s mathematics of games of chance was that the game should be fair and that players should
pay to play a game in proportion to their expected winnings. Most games, like business partnerships, were assumed to involve only the players, so that the total paid in would equal the total paid out at the end. Here, a key concept was the number of “cases” or possible alternative outcomes. If a player might win a set amount if a die came up a 1, then there were said to be six cases, corresponding to the six faces of the die, of which one, the 1, would be favorable to that player. For this game to be fair, the player should pay in one-sixth of the amount he or she would win if the 1 were thrown.

Bernoulli applied this kind of mathematics in an effort to quantify the evidence that an accused person had committed a crime by systematically combining all the various types of circumstantial evidence of the crime. He supposed that something similar might be done to judge life expectancies, except that no one knew all the “cases” that might affect life expectancy, such as the person’s inherited vigor and healthiness, the diseases to which a person might succumb, the accidents that might happen, and so forth. With the law that later came to be known as the weak law of large numbers, Bernoulli proposed to discover a posteriori from the results many times observed in similar situations what the ratios of unobserved underlying “cases” might be. Most people realize, Bernoulli said, that if you want to judge what may happen in the future by what has happened in the past, you are less liable to be mistaken if you have made more observations or have a longer time series of outcomes. What people do not know, he said, is whether, if you make more and more observations, you can be more and more sure, without limit, that your prediction is reliable. By his proof he claimed to show that there was no limit to the degree of confidence or probability one might have that the ratio of results would fall within some interval around an expected ratio. In addition, he made a rough calculation of the number of trials (later called Bernoulli trials) that would be needed for a proposed degree of certainty. The mathematics he used in his proof basically involved binomial expansions and the possible combinations and permutations of outcomes (“successes” or “failures”) over a long series of trials. After a long series of trials, the distribution of ratios of outcomes would take the shape of a bell curve, with increasing percentages of outcomes clustering around the central value. For a comparison of Jacob Bernoulli’s proof with Nicolaus Bernoulli’s proof of the same theorem, see [5].

In correspondence with Leibniz, Bernoulli unsuccessfully tried to obtain from Leibniz a copy of Jan De Witt’s rare pamphlet, in Dutch, on the mathematics of annuities; this was the sort of problem to which he hoped to apply his new mathematical theory [4]. Leibniz, in reply, without having been told the mathematical basis of Bernoulli’s proof of his law for finding, a posteriori, ratios of cases, for instance, of surviving past a given age, objected that no such approach would work because the causes of death might be changeable over time. What if a new disease should make an appearance, leading to an increase of early deaths? Bernoulli’s reply was that, if there were such changed circumstances, then it would be necessary to make new observations to calculate new ratios for life expectancies or values of annuities [2].

But what if not only were there no fixed ratios of cases over time, but no such regularities (underlying ratios of cases) at all? For Bernoulli this was not a serious issue because he was a determinist, believing that from the point of view of the Creator everything is determined and known eternally. It is only because we humans do not have such godlike knowledge that we cannot know the future in detail. Nevertheless, we can increase the security and prudence of our actions through the application of the mathematical art of conjecturing that he proposed to develop. Even before the publication of The Art of Conjecturing, Abraham De Moivre had begun to carry out with great success the program that Bernoulli had begun [3]. Although, for Bernoulli, probability was an epistemic concept, and expectation was more fundamental than relative chances, De Moivre established mathematical probability on the basis of relative frequencies.

References

[1] Bernoulli, N. (1709). De Usu Artis Conjectandi in Jure, in Die Werke von Jacob Bernoulli III, B.L. van der Waerden, ed., Birkhäuser, Basel, pp. 287–326. An English translation of Chapter VII can be found at http://www.york.ac.uk/depts/mathes/histstat/bernoulli n.htm [last access December 13, 2008].
[2] Bernoulli, J. (2006). [Ars Conjectandi (1713)], English translation in Jacob Bernoulli, The Art of Conjecturing together with Letter to a Friend on Sets in Court Tennis, E.D. Sylla ed., The Johns Hopkins University Press, Baltimore.
[3] De Moivre, A. (1712). De Mensura Sortis, seu, de Probabilitate Eventuum in Ludis a Casu Fortuito Pendentibus, Philosophical Transactions of the Royal Society 27, 213–264; translated by Bruce McClintock in Hald, A. (1984a). A. De Moivre: ‘De Mensura Sortis’ or ‘On the Measurement of Chance’ . . . Commentary on ‘De Mensura Sortis’, International Statistical Review 52, 229–262. After Bernoulli’s The Art of Conjecturing, De Moivre published The Doctrine of Chances, London 1718, 1738, 1756.
[4] De Witt, J. (1671). Waerdye van Lyf-renten, in Die Werke von Jacob Bernoulli III, B.L. van der Waerden, ed., Birkhäuser, Basel, pp. 328–350.
[5] Hald, A. (1984b). Nicholas Bernoulli’s theorem, International Statistical Review 52, 93–99; cf. Hald, A. (1990). A History of Probability and Statistics and Their Applications before 1750, Wiley, New York.
[6] Leonardo of Pisa (Fibonacci) (2002). [Liber Abaci (1202)], English translation in Fibonacci’s Liber Abaci: A Translation into Modern English of Leonardo Pisano’s Book of Calculation, Springer Verlag, New York.
[7] Sylla, E. (2003). Business ethics, commercial mathematics, and the origins of mathematical probability, in Oeconomies in the Age of Newton, M. Schabas & N.D. Marchi, eds, Annual Supplement to History of Political Economy, Duke University Press, Durham, Vol. 35, pp. 309–327.
[8] Sylla, E. (2006). Revised and expanded version of [7]: “Commercial Arithmetic, theology and the intellectual foundations of Jacob Bernoulli’s Art of Conjecturing”, in G. Poitras, ed., Pioneers of Financial Economics, Contributions Prior to Irving Fisher, Edward Elgar Publishing, Cheltenham UK and Northampton MA, Vol. 1.

EDITH DUDLEY SYLLA
Treynor, Lawrence Jack

Jack Lawrence Treynor was born in Council Bluffs, Iowa, on February 21, 1930 to Jack Vernon Treynor and Alice Cavin Treynor. In 1951, he graduated from Haverford College on Philadelphia’s Main Line with a Bachelor of Arts degree in mathematics. He served two years in the US Army before moving to Cambridge, MA to attend Harvard Business School. After a year writing cases for Professor Robert Anthony, Treynor went to work for the Operations Research department at Arthur D. Little in 1956.

Treynor was particularly inspired by the 1958 paper coauthored by Franco Modigliani and Merton H. Miller, titled “The Cost of Capital, Corporation Finance, and the Theory of Investment.” At the invitation of Modigliani, Treynor spent a sabbatical year at MIT between 1962 and 1963. While at MIT, Treynor made two presentations to the finance faculty, the first of which, “Toward a Theory of the Market Value of Risky Assets,” introduced the capital asset pricing model (CAPM). The CAPM says that the return on an asset should equal the risk-free rate plus a premium proportional to its contribution to the risk in the market portfolio. The model is often referred to as the Treynor–Sharpe–Lintner–Mossin CAPM to reflect the fact that it was simultaneously and independently developed by multiple individuals, albeit with slight differences. Although Treynor’s paper was not published until Robert Korajczyk included the unrevised version in his 1999 book, Asset Pricing and Portfolio Performance, it is also included in the “Risk” section of Treynor’s own 2007 book, Treynor on Institutional Investing (Wiley, 2008). William F. Sharpe’s 1964 version, which was built on the earlier work of Harry M. Markowitz, won the Nobel Prize for Economics in 1990.

The CAPM makes no assumptions about the factor structure of the market. In particular, it does not assume the single-factor structure of the so-called market model. However, in his Harvard Business Review papers on performance measurement, Treynor assumed a single factor. He used a regression of returns on managed funds against returns on the “market” to estimate the sensitivity of the fund to the market factor and then used the slope of that regression line to estimate the contribution of market fluctuations to a fund’s rate of return, which permitted him to isolate the portion of fund return that was actually due to the selection skills of the fund manager. In 1981, Fischer Black wrote an open letter in the Financial Analysts Journal, stating that Treynor had “developed the capital asset pricing model before anyone else.”

In his second Harvard Business Review paper, Treynor and Kay Mazuy used a curvilinear regression line to test whether funds were more sensitive to the market in the years when the market went up versus the years when the market went down.

When Fischer Black arrived at Arthur D. Little in 1965, Black took an interest in Treynor’s work and later inherited Treynor’s caseload (after Treynor went to work for Merrill Lynch). In their paper, “How to Use Security Analysis to Improve Portfolio Selection,” Treynor and Black proposed viewing portfolios as having three distinct parts: a riskless part, a highly diversified part (devoid of specific risk), and an active part (which would have both specific risk and market risk). The paper spells out the optimal balance, not only between the three parts but also between the individual securities in the active part.

In 1966, Treynor was hired by Merrill Lynch, where he headed Wall Street’s first quantitative research group. Treynor left Merrill Lynch in 1969 to serve as the editor of the Financial Analysts Journal, with which he stayed until 1981. Treynor then joined Harold Arbit in starting Treynor–Arbit Associates, an investment firm based in Chicago. Treynor continues to serve on the advisory boards of the Financial Analysts Journal and the Journal of Investment Management, where he is also case editor.

In addition to his 1976 book published with William Priest and Patrick Regan titled The Financial Reality of Pension Funding under ERISA, Treynor coauthored Machine Tool Leasing in 1956 with Richard Vancil of Harvard Business School. Treynor has authored and coauthored more than 90 papers on such topics as risk, performance measurement, economics, trading (market microstructure), accounting, investment value, active management, and pensions. He has also written 20 cases, many published in the Journal of Investment Management.

Treynor’s work has appeared in the Financial Analysts Journal, the Journal of Business, the Harvard Business Review, the Journal of Finance, and the Journal of Investment Management, among others. Some of Treynor’s works were published under the pen-name “Walter Bagehot,” a cover that offered him
anonymity while allowing him to share his often unorthodox theories. He promoted notions such as random walks, efficient markets, risk/return trade-off, and betas that others in the field actively avoided. Treynor has since become renowned not only for pushing the envelope with new ideas but also for encouraging others to do the same. Eighteen of his papers have appeared in anthologies.

Two papers that have not been anthologized are “Treynor’s Theory of Inflation” and “Will the Phillips Curve Cause World War III?” In these papers, he points out that, because in industry labor and capital are complements (rather than substitutes, as depicted in economics textbooks), over the business cycle they will become more or less scarce together. However, when capital gets more or less scarce, the identity of the marginal machine will change. If the real wage is determined by the marginal productivity of labor, then (as Treynor argues) it is determined by the labor productivity of the marginal machine. As demand rises and the marginal machines get older and less efficient, the real wage falls, but labor negotiations fix the money wage. In order to satisfy the identity

$$\text{money prices} \equiv \frac{\text{money wage}}{\text{real wage}} \qquad (1)$$

when the real wage falls, money prices must rise. According to Nobel Laureate Merton Miller, Treynor’s main competitor on the topic, the Phillips curve is “just an empirical regularity” (i.e., just data snooping).

Treynor won the Financial Analysts Journal’s Graham and Dodd Scroll award in 1968 and 1982, twice in 1987, for “The Economics of the Dealer Function” and “Market Efficiency and the Bean Jar Experiment,” in 1998 for “Bulls Bears and Market Bubbles”, and in 1999 for “The Investment Value of Brand Franchise.” In 1981 Treynor was again recognized for his research, winning the Graham and Dodd award for “Best Paper” titled “What Does It Take to Win the Trading Game?” In 1987, he was presented with the James R. Vertin Award of the Research Foundation of the Institute of Chartered Financial Analysts, “in recognition of his research, notable for its relevance and enduring value to investment professionals.” In addition, the Financial Analysts Association presented him with the Nicholas Molodovsky Award in 1985, “in recognition of his outstanding contributions to the profession of financial analysis of such significance as to change the direction of the profession and raise it to higher standards of accomplishment.” He received the Roger F. Murray prize in 1994 from the Institute for Quantitative Research in Finance for “Active Management as an Adversary Game.” That same year he was also named a Distinguished Fellow of the Institute for Quantitative Research in Finance along with William Sharpe, Merton Miller, and Harry Markowitz. In 1997, he received the EBRI Lillywhite Award, which is “awarded to persons who have had distinguished careers in the investment management and employee benefits fields and whose outstanding service enhances Americans’ economic security.” In 2007, he was presented with The Award for Professional Excellence, presented periodically by the CFA Institute Board to “a member of the investment profession whose exemplary achievement, excellence of practice, and true leadership have inspired and reflected honor upon our profession to the highest degree” (previous winners were Jack Bogle and Warren Buffett). In 2008, he was recognized as the 2007 IAFE/SunGard Financial Engineer of the Year for his contributions to financial theory and practice.

Treynor taught investments at Columbia University while working at the Financial Analysts Journal. Between 1985 and 1988, Treynor taught investments at the University of Southern California.

He is currently President of Treynor Capital Management in Palos Verdes, California.

Further Reading

Bernstein, P.L. (1992). Capital Ideas: The Improbable Origins of Modern Wall Street, The Free Press, New York.
Black, F.S. (1981). An open letter to Jack Treynor, Financial Analysts Journal July/August, 14.
Black, F.S. & Treynor, J.L. (1973). How to use security analysis to improve portfolio selection, The Journal of Business 46(1), 66–88.
Black, F.S. & Treynor, J.L. (1986). Corporate investment decision, in Modern Developments in Financial Management, S.C. Myers, ed., Praeger Publishers.
French, C. (2003). The Treynor capital asset pricing model, Journal of Investment Management 1(2), 60–72.
Keynes, J.M. (1936). The General Theory of Employment, Interest, and Money, Harcourt Brace, New York.
Korajczyk, R. (1999). Asset Pricing and Portfolio Performance: Models, Strategy and Performance Metrics, Risk Books, London.
Lintner, J. (1965a). The valuation of risk assets and the selection of risky investment in stock portfolios and capital budgets, The Review of Economics and Statistics 47, 13–37.
Lintner, J. (1965b). Securities prices, risk, and maximal gains from diversification, The Journal of Finance 20(4), 587–615.
Markowitz, H.M. (1952). Portfolio selection, The Journal of Finance 7(1), 77–91.
Mehrling, P. (2005). Fischer Black and the Revolutionary Idea of Finance, Wiley, New York.
Modigliani, F. & Miller, M.H. (1958). The cost of capital, corporation finance, and the theory of investment, The American Economic Review 48, 261–297.
Sharpe, W.F. (1964). Capital asset prices: a theory of market equilibrium under conditions of risk, The Journal of Finance 19(3), 425–442.
Treynor, J.L. (1961). Market Value, Time, and Risk. Unpublished manuscript. Dated 8/8/1961, #95-209.
Treynor, J.L. (1962). Toward a Theory of Market Value of Risk Assets. Unpublished manuscript. Dated Fall of 1962.
Treynor, J.L. (1963). Implications for the Theory of Finance. Unpublished manuscript. Dated Spring of 1963.
Treynor, J.L. (1965). How to rate management of investment funds, Harvard Business Review 43, 63–75.
Treynor, J.L. (2007). Treynor on Institutional Investing, Wiley, New York.
Treynor, J.L. & Mazuy, K. (1966). Can mutual funds outguess the market? Harvard Business Review 44, 131–136.
Treynor, J.L. & Vancil, R. (1956). Machine Tool Leasing, Management Analysis Center.

Related Articles

Black, Fischer; Capital Asset Pricing Model; Factor Models; Modigliani, Franco; Samuelson, Paul A.; Sharpe, William F.

ETHAN NAMVAR
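The regressions described in the Treynor entry above can be sketched in a few lines of Python. The sketch below is an illustration under stated assumptions, not a reconstruction of Treynor’s own procedure: it fits a straight line of fund returns on market returns (the slope plays the role of the fund’s market sensitivity), evaluates the CAPM relation for a given beta, and fits a curve with a squared market term as one common way to model a “curvilinear” regression. All function names and the return series are hypothetical.

```python
def ols_line(x, y):
    """Least-squares intercept and slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def capm_expected_return(risk_free, beta, market_return):
    """Risk-free rate plus a premium proportional to beta."""
    return risk_free + beta * (market_return - risk_free)


def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def quadratic_fit(x, y):
    """Least-squares fit of y = a + b*x + c*x**2 using the normal equations."""
    s = [sum(xi ** k for xi in x) for k in range(5)]
    t = [sum(yi * xi ** k for xi, yi in zip(x, y)) for k in range(3)]
    normal = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    return solve3(normal, t)


if __name__ == "__main__":
    # Hypothetical yearly market and fund returns, for illustration only.
    market = [-0.10, -0.05, 0.00, 0.05, 0.10, 0.15]
    fund = [0.01 + 1.2 * m + 2.0 * m * m for m in market]
    alpha, beta = ols_line(market, fund)
    print("single-factor fit: alpha=%.4f beta=%.4f" % (alpha, beta))
    print("CAPM expected return:", capm_expected_return(0.03, beta, 0.08))
    a, b, c = quadratic_fit(market, fund)
    print("curvilinear fit: a=%.4f b=%.4f c=%.4f" % (a, b, c))
```

In this setup a positive coefficient on the squared term means the fund’s sensitivity to the market is higher when the market rises than when it falls, which is the pattern the curvilinear test looks for.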
Rubinstein, Edward Mark

Mark Rubinstein, the only child of Sam and Gladys Rubinstein of Seattle, Washington, was born on June 8, 1944. He attended the Lakeside School in Seattle and graduated in 1962 as one of the two graduation speakers. He earned an A.B. in Economics, magna cum laude, from Harvard College in 1966 and an MBA with a concentration in finance from the Graduate School of Business at Stanford University in 1968. In 1971, Rubinstein earned his PhD in Finance from the University of California, Los Angeles (UCLA). During this time at UCLA, he was heavily influenced by the microeconomist Jack Hirshleifer. In July 1972, he became an assistant professor in finance at the University of California at Berkeley, where he remained for his entire career. He was advanced to tenure unusually early in 1976 and became a full professor in 1980.

Rubinstein’s early work concentrated on asset pricing. Specifically, between 1971 and 1973, his research centered on the mean–variance capital asset pricing model and came to include skewness as a measure of risk [3–5]. Rubinstein’s extension has new relevance as several researchers have since determined its predictive power in explaining realized security returns. In 1974, Rubinstein’s research turned to more general models of asset pricing. He developed an extensive example of multiperiod security market equilibrium, which later became the dominant model used by academics in their theoretical papers on asset pricing. Unlike earlier work, he left the intertemporal process of security returns to be determined in equilibrium rather than as datum (although as special cases he assumed a random walk and constant interest rates). Rubinstein was thus able to derive conditions for the existence of a random walk and an unbiased term structure of interest rates. He also was the first to derive a simple equation in equilibrium for valuing a risky stream of income received over time. He published the first paper to show explicitly how and why in equilibrium investors would want to hold long-term bonds in their portfolios, and in particular would want to hold a riskless (in terms of income) annuity maturing at their death, foreshadowing several strands of later research.

In 1975, Rubinstein began developing theoretical models of “efficient markets.” In 1976, he published a paper showing that the same formula derived by Black and Scholes for valuing options could come from an alternative set of assumptions based on risk aversion and discrete-time trading opportunities. (Black and Scholes had required continuous trading and continuous price movements.)

Working together with Cox and Ross [1], Rubinstein published the popular and original paper developing the binomial option pricing model, one of the most widely cited papers in financial economics and now probably the most widely used model by professional traders to value derivatives. The model is often referred to as the Cox–Ross–Rubinstein (CRR) option pricing model. At the same time, Rubinstein began work with Cox [2] on their own text, Options Markets, which was eventually published in 1985 and won the biennial award of the University of Chicago for the best work by professors of business concerning any area of business.

He supplemented his academic work with firsthand experience as a market maker in options when he became a member of the Pacific Stock Exchange. In 1981, together with Hayne E. Leland and John W. O’Brien, Rubinstein founded Leland O’Brien Rubinstein (LOR) Associates, the original portfolio insurance firm. At the time, the novel idea of portfolio insurance had been put forth by Leland, later fully developed together with Rubinstein, and successfully marketed among large institutional investors by O’Brien. Their business grew extremely rapidly, only to be cut short when they had to share the blame for the October 1987 stock market crash. Not admitting defeat, LOR invented another product that became the first exchange-traded fund (ETF), the SuperTrust, listed on the American Stock Exchange in 1992. Rubinstein also published a related article examining alternative basket vehicles.

In the early 1990s, Rubinstein published a series of eight articles in Risk Magazine showing how option pricing tools could easily be applied to value a host of so-called exotic derivatives, which were just becoming popular.

Motivated by the failure after 1987 of index options to be priced anywhere close to the predictions of the Black–Scholes formula, in an article
published in the Journal of Finance [8], he developed an important generalization of the original binomial model, which he called implied binomial trees. The article included new techniques for inferring risk-neutral probability distributions from options on the same underlying asset. Rubinstein’s revisions of the model provide the natural generalization of the standard binomial model to accommodate arbitrary expiration date risk-neutral probability distributions. This paper, in turn, spurred new academic work on option pricing in the latter half of the 1990s and found immediate application among various professionals. In 1998 and 1999, Rubinstein rounded out his work on derivatives by publishing a second text titled “Rubinstein on Derivatives,” which expanded its domain from calls and puts to futures and more general types of derivatives. The book also pioneered new ways to integrate computers as an aid to learning.

After a 1999 debate about the empirical rationality of financial markets with the key behavioral finance theorist, Richard Thaler, Rubinstein began to rethink the concept of efficient markets. In 2001, he published a version of his conference argument in the Financial Analysts Journal [6, 7], titled “Rational Markets? Yes or No: The Affirmative Case,” which won the Graham and Dodd Plaque award in 2002.

He then returned to the more general theory of investments with which he had begun his research career as a doctoral student. In 2006, Rubinstein [11] published “A History of the Theory of Investments: My Annotated Bibliography”, an academic history of the theory of investments from the thirteenth to the beginning of the twenty-first century, systematizing the knowledge, and identifying the relations between apparently disparate lines of research. No other book has so far been written that comes close to examining in detail the intellectual path that has led to modern financial economics (particularly, in the subarea of investments). Rubinstein shows that the discovery of key ideas in finance is much more complex and multistaged than anyone had realized. Too few are given too much credit, and sometimes original work has been forgotten.

Rubinstein has taught and lectured widely. During his career, he has given 303 invited lectures, including conference presentations, full course seminars, and honorary addresses all over the United States and around the world. He has served as chairman of the Berkeley finance group, and as director of the Berkeley Program in Finance; he is the founder of the Berkeley Options Database (the first large transaction-level database ever assembled with respect to options and stocks). He has served on the editorial boards of numerous finance journals. He has authored 62 journal articles, published 3 books, and developed several computer programs dealing with derivatives.

Rubinstein is currently a professor of finance at the Haas School of Business at the University of California, Berkeley. Many of his papers are frequently reprinted in survey publications, and he has won numerous prizes and awards for his research and writing on financial economics. He was named “Businessman of the Year” (one of 12) in 1987 by Fortune magazine. In 1995, the International Association of Financial Engineers (IAFE) named him the 1995 IAFE/SunGard Financial Engineer of the Year. In 2000, he was elected to Derivatives Strategy Magazine’s “Derivatives Hall of Fame” and named in the “RISK Hall of Fame” by Risk Magazine in 2002. Of all his awards, the one he cherishes the most is the 2003 Earl F. Cheit Teaching award in the Masters of Financial Engineering Program at the University of California, Berkeley [10] (Rubinstein, M.E. (2003). A Short Career Biography. Unpublished.)

Rubinstein has two grown-up children, Maisiee and Judd. He lives with Diane Rubinstein in the San Francisco Bay Area.

References

[1] Cox, J.C., Ross, S.A. & Rubinstein, M.E. (1979). Option pricing: a simplified approach, Journal of Financial Economics September, 229–263.
[2] Cox, J.C. & Rubinstein, M.E. (1985). Options Markets, Prentice-Hall.
[3] Rubinstein, M.E. (1973). The fundamental theorem of parameter-preference security valuation, Journal of Financial and Quantitative Analysis January, 61–69.
[4] Rubinstein, M.E. (1973). A comparative statics analysis of risk premiums, Journal of Business October.
[5] Rubinstein, M.E. (1973). A mean-variance synthesis of corporate financial theory, Journal of Finance March.
[6] Rubinstein, M.E. (1989). Market basket alternatives, Financial Analysts Journal September/October.
[7] Rubinstein, M.E. (1989). Rational markets? Yes or No: the affirmative case, Financial Analysts Journal May/June.
[8] Rubinstein, M.E. (1994). Implied binomial trees, Journal of Finance July, 771–818.
[9] Rubinstein, M.E. (2000). Rubinstein on Derivatives, Risk Books.
[10] Rubinstein, M.E. (2003). All in All, it’s been a Good Life, The Growth of Modern Risk Management: A History July, 581–585.
[11] Rubinstein, M.E. (2006). A History of the Theory of Investments: My Annotated Bibliography, John Wiley & Sons, New York.

ETHAN NAMVAR
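As an illustration of the binomial option pricing model mentioned in the Rubinstein entry above, here is a minimal Python sketch of a European option valued on a recombining binomial tree. It assumes the commonly used parameterization in which the up factor is exp(sigma * sqrt(dt)), the down factor is its reciprocal, and values are discounted under a risk-neutral up-probability; the function name, argument names, and example inputs are hypothetical and chosen only for illustration.

```python
import math


def crr_price(spot, strike, rate, sigma, maturity, steps, kind="call"):
    """European option value on a recombining binomial tree (a sketch)."""
    if kind not in ("call", "put"):
        raise ValueError("kind must be 'call' or 'put'")
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))   # up move per step
    down = 1.0 / up                        # down move per step
    growth = math.exp(rate * dt)           # risk-free growth per step
    q = (growth - down) / (up - down)      # risk-neutral up-probability
    if not 0.0 < q < 1.0:
        raise ValueError("no-arbitrage condition violated; use more steps")
    sign = 1.0 if kind == "call" else -1.0

    # Payoffs at expiration for j up moves and (steps - j) down moves.
    values = [max(sign * (spot * up ** j * down ** (steps - j) - strike), 0.0)
              for j in range(steps + 1)]

    # Roll back through the tree, discounting risk-neutral expectations.
    for _ in range(steps):
        values = [(q * values[j + 1] + (1.0 - q) * values[j]) / growth
                  for j in range(len(values) - 1)]
    return values[0]


if __name__ == "__main__":
    call = crr_price(100.0, 100.0, 0.05, 0.2, 1.0, 500, "call")
    put = crr_price(100.0, 100.0, 0.05, 0.2, 1.0, 500, "put")
    print("call:", round(call, 4), "put:", round(put, 4))
    # Put-call parity holds on the tree up to rounding error.
    print("parity gap:", call - put - (100.0 - 100.0 * math.exp(-0.05)))
```

Because the tree only requires discrete trading dates, it gives a concrete sense of how an option value can be obtained without continuous trading; as the number of steps grows, the value for these inputs approaches the Black–Scholes value.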
Infinite Divisibility

We say that a random variable $X$ has an infinitely divisible (ID) distribution (in short, $X$ is ID) if for every integer $n \ge 1$ there exist $n$ independent identically distributed (i.i.d.) random variables $X_1, \ldots, X_n$ such that $X_1 + \cdots + X_n \stackrel{d}{=} X$, where $\stackrel{d}{=}$ is equality in distribution. Alternatively, $X$ (or its distribution $\mu$) is ID if for all $n \ge 1$, $\mu$ is the $n$-fold convolution $\mu_n * \cdots * \mu_n$, where $\mu_n$ is a probability distribution.

There are several advantages in using infinitely divisible distributions and processes in financial modeling. First, they offer wide possibilities for modeling alternatives to the Gaussian and stable distributions, while maintaining a link with the central limit theorem and a rich probabilistic structure. Second, they are closely linked to Lévy processes: for each ID distribution $\mu$ there is a Lévy process (see Lévy Processes) $\{X_t : t \ge 0\}$ with $X_1$ having distribution $\mu$. Third, every stationary distribution of an Ornstein–Uhlenbeck process (see Ornstein–Uhlenbeck Processes) belongs to the class $L$ of ID distributions, which are self-decomposable (SD). We say that a random variable $X$ is SD if it has the linear autoregressive property: for any $\theta \in (0, 1)$, there is a random variable $\varepsilon_\theta$ independent of $X$ such that $X \stackrel{d}{=} \theta X + \varepsilon_\theta$.

The concept of infinite divisibility in probability was introduced in 1929 by de Finetti. Its theory was established in the 1930s by Khintchine, Kolmogorov, and Lévy. Motivated by applications arising in different fields, from the 1960s on there was a renewed interest in the subject, in particular, among many other topics, in the study of concrete examples and subclasses of ID distributions. Historical notes and references are found in [3, 6, 8, 9].

Link with the Central Limit Theorem

The class of ID distributions is characterized as the class of possible limit laws for triangular arrays of the form $X_{n,1} + \cdots + X_{n,k_n} - a_n$, where $k_n > 0$ is an increasing sequence, $X_{n,1}, \ldots, X_{n,k_n}$ are independent random variables for every $n \ge 1$, $a_n$ are normalizing constants, and $\{X_{n,j}\}$ is infinitesimal: $\lim_{n\to\infty} \max_{1 \le j \le k_n} P(|X_{n,j}| > \varepsilon) = 0$ for each $\varepsilon > 0$. On the other hand, the class $L$ of SD distributions is characterized as the class of possible limit laws for normalized sequences of the form $(X_1 + \cdots + X_n - a_n)/b_n$, where $X_1, X_2, \ldots$ are independent random variables and $a_n$ and $b_n > 0$ are sequences of numbers with $\lim_{n\to\infty} b_n = \infty$ and $\lim_{n\to\infty} b_{n+1}/b_n = 1$.

Lévy–Khintchine Representation

In terms of characteristic functions (see Filtering), a random variable $X$ is ID if $\varphi(u) = E[e^{iuX}]$ is represented by $\varphi = (\varphi_n)^n$, where $\varphi_n$ is the characteristic function of a probability distribution for every $n \ge 1$. We define the characteristic exponent or cumulant function of $X$ by $\Psi(u) = \log \varphi(u)$. The Lévy–Khintchine representation establishes that a distribution $\mu$ is ID if and only if its characteristic exponent is represented by

$$\Psi(u) = iau - \frac{1}{2}u^2\sigma^2 + \int_{\mathbb{R}} \left(e^{iux} - 1 - iux\,\mathbf{1}_{|x| \le 1}\right)\nu(dx), \quad u \in \mathbb{R} \qquad (1)$$

where $\sigma^2 \ge 0$, $a \in \mathbb{R}$, and $\nu$ is a positive measure on $\mathbb{R}$ with no atom at zero and $\int_{\mathbb{R}} \min(1, |x|^2)\,\nu(dx) < \infty$. The triplet $(a, \sigma^2, \nu)$ is unique and is called the generating triplet of $\mu$, while $\nu$ is its Lévy measure. When $\nu$ is zero, we have the Gaussian distribution. We speak of the purely non-Gaussian case when $\sigma^2 = 0$. When $\nu(dx) = h(x)\,dx$ is absolutely continuous, we call the nonnegative function $h$ the Lévy density of $\nu$. Distributions in the class $L$ are also characterized by having Lévy densities of the form $h(x) = |x|^{-1} g(x)$, where $g$ is nondecreasing in $x < 0$ and nonincreasing in $x > 0$.

A nonnegative ID random variable is characterized by a special form of its Lévy–Khintchine representation: it is purely non-Gaussian, $\nu((-\infty, 0)) = 0$, $\int_{|x| \le 1} |x|\,\nu(dx) < \infty$, and

$$\Psi(u) = i a_0 u + \int_{\mathbb{R}_+} \left(e^{iux} - 1\right)\nu(dx) \qquad (2)$$

where $a_0 \ge 0$ is called the drift. The associated Lévy process $\{X_t : t \ge 0\}$ is called a subordinator. It is a
2 Infinite Divisibility
nonnegative increasing process having characteristic of SD distributions is that they always have densities
exponent (2). Subordinators are useful models for that are unimodal.
random time evolutions. Infinite divisibility is preserved under some mix-
Several properties of an ID random variable X tures of distributions. One has the surprising fact
are related to corresponding properties of its Lévy that any mixture of the exponential distribution is
measure . For example,
the kth moment E |X|k is d
ID: X = Y V is ID whenever V has exponential dis-
|x|
finite if and only if |x|>1 k (dx) is finite. Like-
tribution and Y is an arbitrary nonnegative random
wise, for the IDlog condition: |x|>2 ln |x| (dx) < ∞ variable independent of V . The monograph [9] has a
if and only if |x|>2 ln |x| µ(dx) < ∞. detailed study of ID mixtures.
The monograph [8] has a detailed study of mul-
tivariate ID distributions and their associated Lévy
processes. Stochastic Integral Representations
Several classes of ID distributions are characterized
Classical Examples and Criteria by stochastic integrals (see Stochastic Integrals)
of a nonrandom function with respect to a Lévy
The Poisson distribution with mean λ > 0 is ID process [2]. The classical example is the class L
with Lévy measure (B) = λ1{1} (B), but is not d
that is also characterized as all the laws of X =
SD. A compound Poisson distribution is the law of ∞ −t
X= N 0 e dZt , where Zt is a Lévy process having
i=1 Yi , where N, Y1 , Y2 , . . . are independent
Lévy measure Z with the IDlog condition. More
random variables, N having Poisson distribution with 1
mean λ and the Yi ’s have the same distribution G, generally, the stochastic integral 0 log t −1 dZt is well
with G({0}) = 0. Any compound Poisson distribution defined for every Lévy process Zt . Denote by B()
is ID with Lévy measure (B) = λG(B). This the class of all the distributions of these stochastic
distribution is a building block for all other ID laws, integrals. The class B() coincides with those ID
since every ID distribution is the limit of a sequence laws with completely monotone Lévy density. It is
of compound Poisson distributions. also characterized as the smallest class that contains
An important example of an SD law is the all mixtures of exponential distributions and is closed
gamma distribution with shape parameter α > 0 and under convolution, convergence, and reflection. It
scale parameter β > 0. It has Lévy density h(x) = is sometimes called the Bondesson–Goldie–Steutel
αx −1 e−βx , x > 0. The α-stable distribution, with class of distributions. Multivariate extensions are
0 < α < 2 and purely non-Gaussian, is also SD. Its presented in [2].
Lévy density is h(x) = c1 x −1−α on (0, ∞) and
h(x) = c2 |x|−1−α on (−∞, 0), with c1 ≥ 0, c2 ≥ 0
and c1 + c2 > 0. Generalized Gamma Convolutions
There is no explicit characterization of infinite
divisibility in terms of densities or distributions. The class of generalized gamma convolutions
However, there are some sufficient or necessary con- (GGCs) is the smallest class of probability distribu-
ditions to test for infinite divisibility. A nonnegative tions on + that contains all gamma distributions and
random variable with density f is ID in any of the fol- is closed under convolution and convergence in dis-
lowing cases: (i) log f is convex, (ii) f is completely tribution [6]. These laws are in the class L and have
monotone, or (iii) f is hyperbolically completely Lévy density of the form h(x) = x −1 g(x), x > 0,
monotone [9]. If X is symmetric around zero, it is with g a completely monotone function on (0, ∞).
ID if it has a density that is completely monotone on Most of the classical distributions on + are GGC:
(0, ∞). For a non-Gaussian ID distribution F, its tail gamma, lognormal, positive α-stable, Pareto, Student
behavior is − log(1 + F (−x) − F (x)) = O(x log x), t-distribution, Gumbel, and F -distribution. Of spe-
when x → ∞. Hence, no bounded random variable cial applicability in financial modeling is the family
is ID and if a density has a decay of the type of generalized inverse Gaussian distributions [4, 7].
c1 exp(−c2 x 2 ) with some c1 , c2 positive and if it is A distribution µ with characteristic exponent
not Gaussian, then F is not ID. An important property is GGC if and only if there exists a positive Radon
measure U on (0, ∞) such that with V being nonnegative ID and N having the
∞ standard normal distribution. Any type G distribu-
iu
(u) = ia0 u − log 1 + U (ds) (3) tion is ID and it is interpreted as the law of a
0 s random time changed Brownian motion BV , where
1 ∞ {Bt : t ≥ 0} is a Brownian motion independent of V .
with 0 | log x|U (dx) < ∞ and 1 U (dx)/x < ∞.
When we know the Lévy measure ρ of V , we
The measure Uµ is called the Thorin measure of µ.
can compute the Lévy density of X as h(x) =
So, the triplet of µ is (a0 , 0, νµ ) where the Lévy 1 2
measure is concentrated
∞ on (0, ∞) and such that (2π)−1/2 + s −1/2 e− 2s x ρ(ds) as well as its charac-
νµ (dx) = dx/x 0 e−xs Uµ (ds). Moreover, ∞any GGC
teristic exponent
is the law of a Wiener-gamma integral 0 h(u)dγu ,
where (γt ; t ≥ 0) is the standard gamma process with X (u) = e−(1/2)u s − 1 ρ(ds)
2
(5)
Lévy measure ν(dx) = e−x (dx/x) ∞ and h is a Borel +
function h : + → + with 0 log(1 + h(t))dt <
∞. The function h is called the Thorin function of Many classical distributions are of type G and SD:
x the gamma variance distribution, where V has a
µ and is obtained as follows. Let FU (x) = 0 U (dy)
for x ≥ 0 and let FU−1 (s) be the right continuous gamma distribution; the Student t, where V has the
distribution of the reciprocal chi-square distribution
inverse of FU−1 (s) in the sense of composition of
and the symmetric α-stable distributions, 0 < α < 2;
functions, that is FU−1 (s) = inf{t > 0; FU (t) ≥ s} for
here V is a positive α/2-stable random variable,
s ≥ 0. Then, h(s) = 1/FU−1 (s) for s ≥ 0. For the
including the Cauchy distribution case α = 1. Of
positive α-stable distributions, 0 < α < 1, h(s) =
special relevance in financial modeling are the nor-
{sθ
(α + 1)}−1/α for a θ > 0.
mal inverse Gaussian, with V following the inverse
For distributions on , Thorin also introduced
Gaussian law [1], and the zero-mean symmetric gen-
the class T () of extended generalized gamma
eralized hyperbolic distributions, where V has the
convolutions as the smallest class that contains the
generalized inverse Gaussian law [5, 7]; all their
GGC and is closed under convolution, convergence
moments are finite and they can accommodate heavy
in distribution, and reflection. These distributions are
tails.
in the class L and are characterized by the alternative
representation of their characteristic exponents
where Ψ is the characteristic exponent of the Lévy process Zt given by the Lévy–Khintchine representation

$$\Psi(u) = iau - \tfrac{1}{2}u^2\sigma^2 + \int_{\mathbb{R}} \left(e^{iux} - 1 - iux\,\mathbf{1}_{|x|\le 1}\right)\nu(dx), \quad u \in \mathbb{R} \tag{7}$$

where σ² ≥ 0, a ∈ ℝ, and ν, the Lévy measure, is a positive measure on ℝ with ν({0}) = 0 and $\int_{\mathbb{R}} \min(1, |x|^2)\,\nu(dx) < \infty$. For each t > 0, the probability distribution of Zt has characteristic function $\varphi_t(u) = E[e^{iuZ_t}] = \exp(t\Psi(u))$. When the Lévy measure is zero, Zt is a Brownian motion with variance σ² and drift a.

The Integrated OU Process

A non-Gaussian OU process Xt has the same jump times as Zt, as one sees from equation (4). However, Xt and Zt cobreak in the sense that a linear combination of the two does not jump. We see this by considering the continuous integrated OU process $I_t^X = \int_0^t X_s\,ds$, which has two alternative representations

$$I_t^X = \lambda^{-1}\{X_0 - X_t + Z_t\} = \lambda^{-1}(1 - e^{-\lambda t})X_0 + \lambda^{-1}\int_0^t \left(1 - e^{-\lambda(t-s)}\right)dZ_s \tag{8}$$

In the Gaussian case, the process $I_t^X$ is interpreted as the displacement of the Brownian particle. In

divisible): for any θ ∈ (0, 1), there is a random variable εθ independent of X such that $X \overset{d}{=} \theta X + \varepsilon_\theta$. Conversely, for every SD distribution µ there exists a Lévy process Zt with Z1 being IDlog and such that µ is the stationary distribution of the OU process driven by Zt.

The strictly stationary OU process is defined as

$$X_t = e^{-\lambda t}\int_{-\infty}^{t} e^{\lambda s}\,dZ_s, \quad t \in \mathbb{R} \tag{9}$$

where {Zt : t ∈ ℝ} is a Lévy process constructed as follows: let $\{Z_t^1 : t \ge 0\}$ be a Lévy process with characteristic exponent Ψ1 and let $\{Z_t^2 : t \ge 0\}$ be a Lévy process with characteristic exponent Ψ2(u) = Ψ1(−u) and independent of $Z^1$. Then $Z_t = Z_t^1$ for t ≥ 0 and $Z_t = Z_{-t}^2$ for t < 0. In this case, the law of Xt is SD and conversely, for any SD law µ there exists a BDLP Zt such that equation (9) determines a stationary OU process with distribution µ. As a result, taking $X_0 = \int_{-\infty}^{0} e^{\lambda s}\,dZ_s$, we can always consider (5) as a strictly stationary OU process with a prescribed SD distribution µ. It is an important example of a continuous-time moving average process.

Generalizations

The monographs [6, 7] contain a detailed study of multivariate OU processes, while matrix extensions are considered in [2]. Another extension is the generalized OU process, which has arisen in several financial applications [4, 8]. It is defined as

$$X_t = e^{-\xi_t}X_0 + e^{-\xi_t}\int_0^t e^{\xi_{s-}}\,d\eta_s, \quad t \ge 0 \tag{10}$$
degenerate to a constant random variable, and where Lt is the accompanying Lévy process $L_t = \eta_t + \sum_{0<s\le t}(e^{-\Delta\xi_s} - 1)\Delta\eta_s - tE(B_1^{\xi}, B_1^{\eta})$, where $\Delta\xi_s = \xi_s - \xi_{s-}$, with $B_1^{\xi}$, $B_1^{\eta}$ the Gaussian parts of ξ and η respectively [3, 5].

References

[1] Barndorff-Nielsen, O.E. & Shephard, N. (2001). Non-Gaussian Ornstein-Uhlenbeck-based models and some of their uses in financial economics (with discussion), The Journal of the Royal Statistical Society B 63, 167–241.
[2] Barndorff-Nielsen, O.E. & Stelzer, R. (2007). Positive-definite matrix processes of finite variation, Probability and Mathematical Statistics 27, 3–43.
[3] Carmona, P., Petit, F. & Yor, M. (2001). Exponential functionals of Lévy processes, in Lévy Processes. Theory and Applications, O.E. Barndorff-Nielsen, T. Mikosch & S.I. Resnick, eds, Birkhäuser, pp. 41–55.
[4] Klüppelberg, C., Lindner, A. & Maller, R. (2006). Continuous time volatility modelling: COGARCH versus Ornstein-Uhlenbeck models, in The Shiryaev Festschrift: From Stochastic Calculus to Mathematical Finance, Y. Kabanov, R. Liptser & J. Stoyanov, eds, Springer, pp. 392–419.
[5] Lindner, A. & Maller, R. (2005). Lévy processes and the stationarity of generalised Ornstein-Uhlenbeck processes, Stochastic Processes and Their Applications 115, 1701–1722.
[6] Rocha-Arteaga, A. & Sato, K. (2003). Topics in Infinitely Divisible Distributions and Lévy Processes, Aportaciones Matemáticas Investigación, Mexican Mathematical Society, 17.
[7] Sato, K. (1999). Lévy Processes and Infinitely Divisible Distributions, Cambridge University Press, Cambridge.
[8] Yor, M. (2001). Exponential Functionals of Brownian Motion and Related Processes, Springer, New York.

Related Articles

Infinite Divisibility; Lévy Processes; Stochastic Integrals.

VÍCTOR PÉREZ-ABREU
Fractional Brownian Motion

A fractional Brownian motion (fBm) is a self-similar Gaussian process, defined as follows:

Definition 1 Let 0 < H < 1. The Gaussian stochastic process {BH(t)}t≥0 satisfying the following three properties

(i) BH(0) = 0
(ii) E[BH(t)] = 0 for all t ≥ 0,
(iii) for all s, t ≥ 0,

$$E[B_H(t)B_H(s)] = \tfrac{1}{2}\left(|t|^{2H} - |t - s|^{2H} + |s|^{2H}\right) \tag{1}$$

is called the (standard) fBm with parameter H.

The fBm has been the subject of numerous investigations, in particular, in the context of long-range dependence (often referred to as long memory). fBm was first introduced in 1940 by Kolmogorov (see Kolmogorov, Andrei Nikolaevich) [11], but its main properties and its relevance in many fields of application such as economics, finance, turbulence, and telecommunications were first discussed in the seminal paper of Mandelbrot (see Mandelbrot, Benoit) and Van Ness [12].

For historical reasons, the parameter H is also referred to as the Hurst coefficient. In fact, in 1951, while he was investigating the flow of the river Nile, the British hydrologist H. E. Hurst [10] noticed that his measurements showed dependence properties and, in particular, long memory behavior in the sense that they seemed to require models whose autocorrelation functions exhibit a power law decay at large timescales. This index of dependence H always takes values between 0 and 1 and indicates relatively long-range dependence if H > 0.5; for example, Hurst observed H = 0.91 in the case of Nile level data.

If H = 0.5, it is obvious from equation (1) that the increments of fBm are independent and {B0.5(t)}t∈ℝ = {B(t)}t∈ℝ is ordinary Brownian motion. Moreover, fBm has stationary increments which, for H ≠ 0.5, are not independent.

One can define a parametric family of fBms in terms of the stochastic Weyl integral (see e.g. [16], chapter 7.2). In fact, for any a, b ∈ ℝ,

$$\{B_H(t)\}_{t\in\mathbb{R}} \overset{d}{=} \left\{\int_{\mathbb{R}} \Big( a\big[(t-s)_+^{H-\frac12} - (-s)_+^{H-\frac12}\big] + b\big[(t-s)_-^{H-\frac12} - (-s)_-^{H-\frac12}\big] \Big)\,dB(s)\right\}_{t\in\mathbb{R}} \tag{2}$$

where u+ = max(u, 0), u− = max(−u, 0), and {B(t)}t∈ℝ is a two-sided standard Brownian motion constructed by taking a Brownian motion B1 and an independent copy B2 and setting B(t) = B1(t)1{t≥0} − B2(−t−)1{t<0}.

If we choose $a = \sqrt{\Gamma(2H+1)\sin(\pi H)}/\Gamma(H + 1/2)$ and b = 0 in equation (2) then {BH(t)}t∈ℝ is an fBm satisfying equation (1).

fBm admits a Volterra type representation $B_H(t) = \int_0^t K_H(t, s)\,B(ds)$, where KH is some square integrable kernel (see [13] or [1] for details).

Properties

Many properties of fBm, like self-similarity, are given by its fractional index H.

Definition 2 A real-valued stochastic process {X(t)}t∈ℝ is self-similar with index H if for all c > 0, $\{X(ct)\}_{t\in\mathbb{R}} \overset{d}{=} c^H\{X(t)\}_{t\in\mathbb{R}}$, where $\overset{d}{=}$ denotes equality in distribution.

Proposition 1 Fractional Brownian motion (fBm) is self-similar with index H. Moreover, fBm is the only self-similar Gaussian process with stationary increments.

Now, we consider the increments of fBm.

Definition 3 The stationary process {Y(t)}t∈ℝ given by

$$Y(t) = B_H(t) - B_H(t - 1), \quad t \in \mathbb{R} \tag{3}$$

is called fractional Gaussian noise.
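As a rough illustration of Definition 1, the sketch below evaluates the covariance in equation (1) and draws an fBm sample on a small grid by factoring the covariance matrix. The Cholesky approach and the helper names (`fbm_cov`, `cholesky`, `simulate_fbm`) are illustrative choices, not something prescribed by the article; the method is exact in law but only practical for short grids of strictly positive, distinct times.

```python
import math
import random

def fbm_cov(s, t, hurst):
    """Covariance E[B_H(t) B_H(s)] taken from equation (1)."""
    h2 = 2 * hurst
    return 0.5 * (abs(t) ** h2 - abs(t - s) ** h2 + abs(s) ** h2)

def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - acc)
            else:
                lower[i][j] = (matrix[i][j] - acc) / lower[j][j]
    return lower

def simulate_fbm(times, hurst, rng):
    """Sample B_H at the given times as L z, with z i.i.d. standard normal."""
    cov = [[fbm_cov(s, t, hurst) for t in times] for s in times]
    lower = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in times]
    return [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(len(times))]

if __name__ == "__main__":
    rng = random.Random(0)          # fixed seed, chosen arbitrarily
    grid = [k / 10 for k in range(1, 11)]
    print(simulate_fbm(grid, hurst=0.75, rng=rng))
```

For H = 0.5 the function `fbm_cov` reduces to min(s, t), the covariance of ordinary Brownian motion, which matches the remark that B0.5 is ordinary Brownian motion.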
[Figure: sample paths of fBm BH(t) plotted against t from 0 to 500 for H = 0.55, H = 0.75, and H = 0.95.]

(i) If 0 < H < 0.5, ρH is negative and $\sum_{n=1}^{\infty} |\rho_H(n)| < \infty$.
(ii) If H = 0.5, ρH equals 0, that is, the increments are independent.

Hence, for 0.5 < H < 1 the increments of fBm are persistent or long-range dependent, whereas for 0 < H < 0.5 they are said to be antipersistent.

Proposition 3 The sample paths of fBm are continuous. In particular, for every H̃ < H there exists a modification of BH whose sample paths are almost surely (a.s.) locally H̃-Hölder continuous on ℝ, that is, for each trajectory, there exists a constant c > 0

p-variation for every p > 1/H and of infinite p-variation if p < 1/H.

Consequently, for H < 0.5 the quadratic variation

A proof of this well-known fact can be found in, for example, [15] or [4].

However, since fBm is not a semimartingale one cannot use the Itô stochastic integral (see Stochastic Integrals) when considering integrals with respect to fBm. Recently, integration with respect to fBms has been studied extensively and various approaches have been made to define a stochastic integration theory for fBm (see e.g., [14] for a survey).
$$\int_0^T f(t)\,\delta B_H(t) = \lim_{|\pi|\to 0}\sum_{k=0}^{n-1} f(t_k) \diamond \big(B_H(t_{k+1}) - B_H(t_k)\big) \tag{8}$$

where ♦ represents the Wick product [18] and the convergence is the L²(Ω)-convergence of random variables [2].

Whereas the pathwise fractional integral mirrors a Stratonovich integral, the Wick–Itô–Skorohod calculus is similar to the Itô calculus, for example, integrals always have zero expectation.

The Wick–Itô integral was constructed by Duncan et al. [8] and later applied to finance by, for example,

References

[1] Baudoin, F. & Nualart, D. (2003). Equivalence of Volterra processes, Stochastic Processes and their Applications 107, 327–350.
[2] Bender, C. (2003). An Itô formula for generalized functionals of a fractional Brownian motion with arbitrary Hurst parameter, Stochastic Processes and their Applications 104, 81–106.
[3] Björk, T. & Hult, H. (2005). A note on Wick products and the fractional Black-Scholes model, Finance and Stochastics 9, 197–209.
[4] Cheridito, P. (2001). Regularizing Fractional Brownian Motion with a View towards Stock Price Modelling, PhD Dissertation, ETH Zurich.
[5] Cheridito, P. (2003). Arbitrage in fractional Brownian motion models, Finance and Stochastics 7, 533–553.
[6] Comte, F. & Renault, E. (1998). Long memory in continuous time stochastic volatility models, Mathematical Finance 8, 291–323.
[7] Cont, R. (2005). Long range dependence in financial time series, in Fractals in Engineering, E. Lutton & J. Levy-Vehel, eds, Springer.
[8] Duncan, T.E., Hu, Y. & Pasik-Duncan, B. (2000). Stochastic calculus for fractional Brownian motion I. Theory, SIAM Journal on Control and Optimization 38, 582–612.
[9] Hu, Y. & Oksendal, B. (2003). Fractional white noise calculus and applications to finance, Infinite Dimensional Analysis, Quantum Probability and Related Topics 6, 1–32.
[10] Hurst, H. (1951). Long term storage capacity of reservoirs, Transactions of the American Society of Civil Engineers 116, 770–1299.
[11] Kolmogorov, A.N. (1940). Wienersche Spiralen und einige andere interessante Kurven im Hilbertschen Raum, Comptes Rendus (Doklady) Académie des Sciences URSS (N.S.) 26, 115–118.
[12] Mandelbrot, B.B. & Van Ness, J.W. (1968). Fractional Brownian motions, fractional noises and applications, SIAM Review 10, 422–437.
[13] Norros, I., Valkeila, E. & Virtamo, J. (1999). An elementary approach to a Girsanov formula and other analytical results on fractional Brownian motion, Bernoulli 5, 571–589.
[14] Nualart, D. (2003). Stochastic calculus with respect to the fractional Brownian motion and applications, Contemporary Mathematics 336, 3–39.
[15] Rogers, L.C.G. (1997). Arbitrage with fractional Brownian motion, Mathematical Finance 7, 95–105.
[16] Samorodnitsky, G. & Taqqu, M. (1994). Stable Non-Gaussian Random Processes: Stochastic Models with Infinite Variance, Chapman & Hall, New York.
[17] Sottinen, T. & Valkeila, E. (2003). On arbitrage and replication in the fractional Black-Scholes pricing model, Statistics and Decisions 21, 93–107.
[18] Wick, G.-C. (1950). Evaluation of the collision matrix, Physical Review 80, 268–272.

Further Reading

Doukhan, P., Oppenheim, G. & Taqqu, M.S. (2003). Theory and Applications of Long-Range Dependence, Birkhäuser, Boston.
Lin, S.J. (1995). Stochastic analysis of fractional Brownian motion, Stochastics and Stochastics Reports 55, 121–140.

Related Articles

Long Range Dependence; Mandelbrot, Benoit; Multifractals; Semimartingale; Stylized Properties of Asset Returns.

TINA M. MARQUARDT
Lévy Processes

A Lévy process is a continuous-time stochastic process with independent and stationary increments. Lévy processes may be thought of as the continuous-time analogs of random walks. Mathematically, a Lévy process can be defined as follows.

Definition 1 An ℝ^d-valued stochastic process X = {Xt : t ≥ 0} defined on a probability space (Ω, F, ℙ) is said to be a Lévy process if it possesses the following properties:

1. The paths of X are almost surely right continuous with left limits.
2. ℙ(X0 = 0) = 1.
3. For 0 ≤ s ≤ t, Xt − Xs is equal in distribution to Xt−s.
4. For 0 ≤ s ≤ t, Xt − Xs is independent of {Xu : u ≤ s}.

Historically, Lévy processes have always played a central role in the study of stochastic processes with some of the earliest work dating back to the early 1900s. The reason for this is that, mathematically, they represent an extremely robust class of processes, which exhibit many of the interesting phenomena that appear in, for example, the theories of stochastic and potential analysis. Moreover, this in turn, together with their elementary definition, has made Lévy processes an extremely attractive class of processes for modeling in a wide variety of physical, biological, engineering, and economical scenarios. Indeed, the first appearance of particular examples of Lévy processes can be found in the foundational works of Bachelier [1, 2], concerning the use of Brownian motion, within the context of financial mathematics, and Lundberg [9], concerning the use of Poisson processes within the context of insurance mathematics.

The term Lévy process honors the work of the French mathematician Paul Lévy who, although not alone in his contribution, played an instrumental role in bringing together an understanding and characterization of processes with stationary and independent increments. In earlier literature, Lévy processes have been dealt with under various names. In the 1940s, Lévy himself referred to them as a subclass of processus additifs (additive processes), that is, processes with independent increments. For the most part, however, research literature through the 1960s and 1970s refers to Lévy processes simply as processes with stationary and independent increments. One sees a change in language through the 1980s and by the 1990s the use of the term Lévy process had become standard.

Judging by the volume of published mathematical research articles, the theory of Lévy processes can be said to have experienced a steady flow of interest from the time of the foundational works, for example, of Lévy [8], Kolmogorov [7], Khintchine [6], and Itô [5]. However, it was arguably in the 1990s that a surge of interest in this field of research occurred, drastically accelerating the breadth and depth of understanding and application of the theory of Lévy processes. While there are many who made prolific contributions during this period, as well as thereafter, the general progression of this field of mathematics was enormously encouraged by the monographs of Bertoin [3] and Sato [10]. It was also the growing research momentum in the field of financial and insurance mathematics that stimulated a great deal of the interest in Lévy processes in recent times, thus entwining the modern theory of Lévy processes ever more with its historical roots.

Lévy Processes and Infinite Divisibility

The properties of stationary and independent increments imply that a Lévy process is a Markov process. One may show in addition that Lévy processes are strong Markov processes. From Definition 1 alone it is otherwise difficult to understand the richness of the class of Lévy processes. To get a better impression in this respect, it is necessary to introduce the notion of an infinitely divisible distribution. Generally, an ℝ^d-valued random variable Θ has an infinitely divisible distribution if for each n = 1, 2, . . . there exists a sequence of i.i.d. random variables Θ1,n, . . . , Θn,n such that

$$\Theta \overset{d}{=} \Theta_{1,n} + \cdots + \Theta_{n,n} \tag{1}$$

where $\overset{d}{=}$ is equality in distribution. Alternatively, this relation can be expressed in terms of characteristic exponents. That is to say, if Θ has characteristic exponent $\Psi(u) := -\log \mathbb{E}(e^{iu\cdot\Theta})$, then Θ is infinitely divisible if and only if for all n ≥ 1 there exists a characteristic exponent of a probability distribution, say Ψn, such that Ψ(u) = nΨn(u) for all u ∈ ℝ^d.
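The characteristic-exponent form of infinite divisibility can be made concrete with two standard one-dimensional examples. The sketch below works with characteristic functions rather than exponents, so the relation being tested is that the characteristic function equals the n-th power of the characteristic function of one summand. It assumes the usual closed forms for the gamma distribution (shape and rate parametrization) and the Poisson distribution; the function names are illustrative only.

```python
import cmath

def cf_gamma(u, shape, rate):
    """Characteristic function of Gamma(shape, rate): (1 - i u / rate) ** (-shape)."""
    return (1 - 1j * u / rate) ** (-shape)

def cf_poisson(u, lam):
    """Characteristic function of Poisson(lam): exp(lam * (exp(i u) - 1))."""
    return cmath.exp(lam * (cmath.exp(1j * u) - 1))

def gamma_gap(u, shape, rate, n):
    """|cf(u) - cf_n(u) ** n| where cf_n belongs to Gamma(shape / n, rate)."""
    return abs(cf_gamma(u, shape, rate) - cf_gamma(u, shape / n, rate) ** n)

def poisson_gap(u, lam, n):
    """|cf(u) - cf_n(u) ** n| where cf_n belongs to Poisson(lam / n)."""
    return abs(cf_poisson(u, lam) - cf_poisson(u, lam / n) ** n)

if __name__ == "__main__":
    for n in (2, 5, 50):
        print(n, gamma_gap(1.3, 2.0, 0.7, n), poisson_gap(1.3, 4.0, n))
```

Both gaps vanish up to floating-point rounding for every n, which mirrors the statement that a Gamma(α, β) variable is a sum of n i.i.d. Gamma(α/n, β) variables and a Poisson(λ) variable is a sum of n i.i.d. Poisson(λ/n) variables.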
It turns out that Θ has an infinitely divisible distribution if and only if there exists a triple (a, Σ, Π), where a ∈ ℝ^d, Σ is a d × d matrix whose eigenvalues are all nonnegative, and Π is a measure concentrated on ℝ^d\{0} satisfying $\int_{\mathbb{R}^d} (1 \wedge |x|^2)\,\Pi(dx) < \infty$, such that

$$\Psi(u) = ia\cdot u + \tfrac{1}{2}u\cdot\Sigma u + \int_{\mathbb{R}^d}\left(1 - e^{iu\cdot x} + iu\cdot x\,\mathbf{1}_{(|x|<1)}\right)\Pi(dx) \tag{2}$$

for every u ∈ ℝ^d. Here, we use the notation u · x for the Euclidean inner product and |x| for Euclidean distance. The measure Π is called the Lévy (characteristic) measure and it is unique. The identity in equation (2) is known as the Lévy–Khintchine formula.

The link between Lévy processes and infinitely divisible distributions becomes clear when one notes that for each t > 0 and any n = 1, 2, . . . ,

$$X_t = X_{t/n} + (X_{2t/n} - X_{t/n}) + \cdots + (X_t - X_{(n-1)t/n}) \tag{3}$$

As a result of the fact that X has stationary independent increments, it follows that Xt is infinitely divisible.

It can be deduced from the above observation that any Lévy process has the property that for all t ≥ 0

$$\mathbb{E}\left(e^{iu\cdot X_t}\right) = e^{-t\Psi(u)} \tag{4}$$

where Ψ(u) := Ψ1(u) is the characteristic exponent of X1, which has an infinitely divisible distribution. The converse of this statement is also true, thus constituting the Lévy–Khintchine formula for Lévy processes.

Theorem 1 (Lévy–Khintchine formula for Lévy processes). Suppose that a ∈ ℝ^d, Σ is a d × d matrix whose eigenvalues are all nonnegative, and Π is a measure concentrated on ℝ^d\{0} satisfying $\int_{\mathbb{R}^d} (1 \wedge |x|^2)\,\Pi(dx) < \infty$. Then there exists a Lévy process having characteristic exponent

$$\Psi(u) = ia\cdot u + \tfrac{1}{2}u\cdot\Sigma u + \int_{\mathbb{R}^d}\left(1 - e^{iu\cdot x} + iu\cdot x\,\mathbf{1}_{(|x|<1)}\right)\Pi(dx) \tag{5}$$

Two fundamental examples of Lévy processes, which are shown in the next section to form the “building blocks” of all the other Lévy processes, are Brownian motion and compound Poisson processes. A Brownian motion is the Lévy process associated with the characteristic exponent

$$\Psi(u) = \tfrac{1}{2}u\cdot\Sigma u \tag{6}$$

and therefore has increments over time periods of length t, which are Gaussian distributed with covariance matrix tΣ. It can be shown that, up to the addition of a linear drift, Brownian motions are the only Lévy processes that have continuous paths.

A compound Poisson process is the Lévy process associated with the characteristic exponent:

$$\Psi(u) = \int_{\mathbb{R}^d}\left(1 - e^{iu\cdot x}\right)\lambda F(dx) \tag{7}$$

where λ > 0 and F is a probability distribution. Such processes may be described pathwise by the piecewise linear process:

$$\sum_{i=1}^{N_t}\xi_i, \quad t \ge 0 \tag{8}$$

where {ξi : i ≥ 1} are a sequence of i.i.d. random variables with common distribution F, and {Nt : t ≥ 0} is a Poisson process with rate λ; the latter is the process with initial value zero and with unit increments whose interarrival times are independent and exponentially distributed with parameter λ.

It is a straightforward exercise to show that the sum of any finite number of independent Lévy processes is also a Lévy process. Under some circumstances, one may show that a countably infinite sum of Lévy processes also converges in an appropriate sense to a Lévy process. This idea forms the basis of the Lévy–Itô decomposition, discussed in the next section, where, as alluded to above, the Lévy processes that are summed together are either a Brownian motion with drift or a compound Poisson process with drift.

The Lévy–Itô Decomposition

Hidden in the Lévy–Khintchine formula is a representation of the path of a given Lévy process. Every
Lévy process may always be written as the independent sum of up to a countably infinite number of other Lévy processes, at most one of which will be a linear Brownian motion and the remaining processes will be compound Poisson processes with drift.

Let Ψ be the characteristic exponent of some infinitely divisible distribution with associated triple (a, Σ, Π). The necessary assumption that $\int_{\mathbb{R}^d} (1 \wedge |x|^2)\,\Pi(dx) < \infty$ implies that Π(A) < ∞ for all Borel A such that 0 is in the interior of A^c and, in particular, that Π({x : |x| ≥ 1}) ∈ [0, ∞). With this in mind, it is not difficult to see that, after some simple reorganization, for u ∈ ℝ^d, the Lévy–Khintchine formula can be written in the form

$$\Psi(u) = iu\cdot a + \tfrac{1}{2}u\cdot\Sigma u + \lambda_0\int_{|x|\ge 1}\left(1 - e^{iu\cdot x}\right)F_0(dx)$$

distributed with common distribution F0(dx) concentrated on {x : |x| ≥ 1} and for n = 1, 2, 3, . . .

$$X_t^{(n)} = \sum_{i=1}^{N_t^{(n)}}\xi_i^{(n)} - \lambda_n t\int_{2^{-n}\le |x| < 2^{-(n-1)}} x\,F_n(dx), \quad t \ge 0 \tag{13}$$

with $\{N_t^{(n)} : t \ge 0\}$ as a Poisson process with rate λn and $\{\xi_i^{(n)} : i \ge 1\}$ are independent and identically distributed with common distribution Fn(dx) concentrated on {x : 2^{−n} ≤ |x| < 2^{−(n−1)}}. The limit in equation (10) needs to be understood in the appropriate context, however.

It is a straightforward exercise to deduce that $X_\cdot^{(n)}$ is a square integrable martingale on account of the fact that it is a centered compound Poisson process together with the fact that x² is integrable in the neighborhood of the origin against the measure Π. It is not difficult to see that $\sum_{n=1}^{k} X_\cdot^{(n)}$ is also a square
martingale just as the elements of the approximating sequence are.

Path Variation

Consider any function f : [0, ∞) → ℝ^d. Given any partition P = {a = t0 < t1 < · · · < tn = b} of the bounded interval [a, b], define the variation of f over [a, b] with partition P by

$$V_P(f, [a, b]) = \sum_{i=1}^{n} |f(t_i) - f(t_{i-1})| \tag{15}$$

The function f is said to be of bounded variation over [a, b] if

$$V(f, [a, b]) := \sup_{P} V_P(f, [a, b]) < \infty \tag{16}$$

where the supremum is taken over all partitions of [a, b]. Moreover, f is said to be of bounded variation if the above inequality is valid for all bounded intervals [a, b]. If V(f, [a, b]) = ∞ for all bounded intervals [a, b], then f is said to be of unbounded variation.

For any given stochastic process X = {Xt : t ≥ 0}, we may adopt these notions in the almost sure sense. So, for example, the statement “X is a process of bounded variation” (or “has paths of bounded variation”) simply means that as a random mapping, X : [0, ∞) → ℝ^d is of bounded variation almost surely.

In the case that X is a Lévy process, the Lévy–Itô decomposition also gives the opportunity to establish a precise characterization of the path variation of a Lévy process. Since any Lévy process may be written as the independent sum as in equation (10) and any d-dimensional Brownian motion is known to have paths of unbounded variation, it follows that any Lévy process for which Σ ≠ 0 has unbounded variation. In the case that Σ = 0, since the paths of the component $X^{(0)}$ in equation (10) are independent and clearly of bounded variation (they are piecewise linear), the path variation of X is characterized by the way in which the component $\sum_{n=1}^{k} X_t^{(n)}$ converges. In the case that

$$\int_{|x|<1} |x|\,\Pi(dx) < \infty \tag{17}$$

the Lévy process X will thus be of bounded variation and otherwise, when the above integral is infinite, the paths are of unbounded variation.

In the case that d = 1, as an extreme case of a Lévy process with bounded variation, it is possible that the process X has nondecreasing paths, in which case it is called a subordinator. As is apparent from the Lévy–Itô decomposition (9), this will necessarily occur when Π(−∞, 0) = 0,

$$\int_{(0,1)} x\,\Pi(dx) < \infty \tag{18}$$

and Σ = 0. In that case, reconsidering the decomposition (10), one may identify

$$X_t = \left(-a - \int_{(0,1)} x\,\Pi(dx)\right)t + \lim_{k\uparrow\infty}\sum_{n=1}^{k}\sum_{i=1}^{N_t^{(n)}}\xi_i^{(n)} \tag{19}$$

On account of the assumption Π(−∞, 0) = 0, all the jumps $\xi_i^{(n)}$ are nonnegative. Hence, it is also a necessary condition that

$$-a - \int_{(0,1)} x\,\Pi(dx) \ge 0 \tag{20}$$

for X to have nondecreasing paths. These necessary conditions are also sufficient.

Lévy Processes as Semimartingales

Recall that a semimartingale with respect to a given filtration 𝔽 := {Ft : t ≥ 0} is defined as the sum of an 𝔽-local martingale and an 𝔽-adapted process of bounded variation. The importance of semimartingales is that they form a natural class of stochastic processes with respect to which one may construct a stochastic integral and thereafter perform calculus. Moreover, the theory of stochastic calculus plays a significant role in mathematical finance as it can be used as a key ingredient in justifying the pricing and hedging of derivatives in markets where risky assets are modeled as positive semimartingales.

A popular choice of model for risky assets in recent years has been the exponential of a Lévy process (see Exponential Lévy Models). Lévy processes have also been used as building blocks in more complex stochastic models for prices, such as stochastic
volatility models with jumps (see Barndorff-Nielsen and Shephard (BNS) Models) and time-changed Lévy models (see Time-changed Lévy Process). The monograph of Cont and Tankov [4] gives an extensive exposition on these types of models. Thanks to Itô’s formula for semimartingales, the exponential of a Lévy process is a semimartingale when it can be shown that a Lévy process is a semimartingale. However, reconsidering equation (10) and recalling that B and $\lim_{k\uparrow\infty}\sum_{n=1}^{k} X_\cdot^{(n)}$ are martingales and that $X_\cdot^{(0)} - a\,\cdot$ is an adapted process with bounded variation paths, it follows immediately that any Lévy process is a semimartingale.

References

[6] Khintchine, A. (1937). A new derivation of one formula by P. Lévy, Bulletin of Moscow State University I(1), 1–5.
[7] Kolmogorov, A.N. (1932). Sulla forma generale di un processo stocastico omogeneo (un problema di B. de Finetti), Atti Reale Accademia Nazionale dei Lincei Rend 15, 805–808.
[8] Lévy, P. (1934). Sur les intégrales dont les éléments sont des variables aléatoires indépendantes, Annali della Scuola Normale Superiore di Pisa 3–4, 217–218, 337–366.
[9] Lundberg, F. (1903). Approximerad framställning av sannolikhetsfunktionen, Återförsäkring av kollektivrisker, Akademisk Afhandling Almqvist och Wiksell, Uppsala.
[10] Sato, K. (1999). Lévy Processes and Infinitely Divisible Distributions, Cambridge University Press, Cambridge.
are those that may be expressed as the distribution of a random walk sampled at an independent and geometrically distributed time; $S_{\Gamma_p} = \sum_{i=1}^{\Gamma_p} \xi_i$. (Note, we interpret $\sum_{i=1}^{0}$ as the empty sum.) To justify the previous claim, a straightforward computation shows that for each $n = 1, 2, 3, \ldots$

$$\mathbb{E}\left(e^{i\theta S_{\Gamma_p}}\right) = \left[\left(\frac{p}{1 - q\,\mathbb{E}\left(e^{i\theta \xi_1}\right)}\right)^{1/n}\right]^{n} = \left[\mathbb{E}\left(e^{i\theta S_{\Lambda_{1/n,p}}}\right)\right]^{n} \tag{9}$$

where $\Lambda_{1/n,p}$ is a negative binomial random variable with parameters $1/n$ and $p$, which is independent of $S$. The latter has distribution mass function

$$\mathbb{P}(\Lambda_{1/n,p} = k) = \frac{1}{k!}\,\frac{\Gamma(k + 1/n)}{\Gamma(1/n)}\, p^{1/n} q^{k} \tag{10}$$

for $k = 0, 1, 2, \ldots$

Wiener–Hopf Factorization for Random Walks

We now turn our attention to the Wiener–Hopf factorization. Fix $0 < p < 1$ and define

$$G = \inf\left\{k = 0, 1, \ldots, \Gamma_p : S_k = \max_{j=0,1,\ldots,\Gamma_p} S_j\right\} \tag{11}$$

where $\Gamma_p$ is a geometrically distributed random variable with parameter $p$, which is independent of the random walk $S$, that is, $G$ is the first visit of $S$ to its maximum over the time period $\{0, 1, \ldots, \Gamma_p\}$. Now define

$$N = \inf\{n > 0 : S_n > 0\} \tag{12}$$

In other words, the first visit of $S$ to $(0, \infty)$ after time 0.

Theorem 1 (Wiener–Hopf Factorization for Random Walks) Assume all of the notation and conventions above.

(i) $(G, S_G)$ is independent of $(\Gamma_p - G, S_{\Gamma_p} - S_G)$ and both pairs are infinitely divisible.

(ii) For $0 < s \le 1$ and $\theta \in \mathbb{R}$

$$\mathbb{E}\left(s^{G} e^{i\theta S_G}\right) = \exp\left\{-\int_{(0,\infty)} \sum_{n=1}^{\infty} \left(1 - s^{n} e^{i\theta x}\right) q^{n}\,\frac{1}{n}\,F^{*n}(dx)\right\} \tag{13}$$

(iii) For $0 < s \le 1$ and $\theta \in \mathbb{R}$

$$\mathbb{E}\left(s^{N} e^{i\theta S_N}\right) = 1 - \exp\left\{-\int_{(0,\infty)} \sum_{n=1}^{\infty} s^{n} e^{i\theta x}\,\frac{1}{n}\,F^{*n}(dx)\right\} \tag{14}$$

Note that the third part of the Wiener–Hopf factorization characterizes what is known as the ladder height process of the random walk $S$. The latter is the bivariate random walk $(T, H) := \{(T_n, H_n) : n = 0, 1, 2, \ldots\}$ where $(T_0, H_0) = (0, 0)$, and otherwise for $n = 1, 2, 3, \ldots$,

$$T_n = \begin{cases} \min\{k \ge 1 : S_{T_{n-1}+k} > H_{n-1}\} & \text{if } T_{n-1} < \infty \\ \infty & \text{if } T_{n-1} = \infty \end{cases}$$

and

$$H_n = \begin{cases} S_{T_n} & \text{if } T_n < \infty \\ \infty & \text{if } T_n = \infty \end{cases} \tag{15}$$

That is to say, the process $(T, H)$, until becoming infinite in value, represents the times and positions of the running maxima of $S$, the so-called ladder times and ladder heights. It is not difficult to see that $T_n$ is a stopping time for each $n = 0, 1, 2, \ldots$ and hence thanks to the i.i.d. increments of $S$, the increments of $(T, H)$ are i.i.d. with the same law as the pair $(N, S_N)$.

Proof (i) The path of the random walk may be broken into $\nu \in \{0, 1, 2, \ldots\}$ finite (or completed) excursions from the maximum followed by an additional excursion, which straddles the random time $\Gamma_p$. Here, we understand the use of the word straddle to mean that if $\ell$ is the index of the left end point of the straddling excursion then $\ell \le \Gamma_p$. By the strong Markov property for random walks and lack of memory, the completed excursions must have the same law, namely, that of a random walk sampled on the time points $\{1, 2, \ldots, N\}$ conditioned on the
event that $\{N \le \Gamma_p\}$ and hence $\nu$ is geometrically distributed with parameter $1 - \mathbb{P}(N \le \Gamma_p)$. Mathematically, we express

$$(G, S_G) = \sum_{i=1}^{\nu} \left(N^{(i)}, H^{(i)}\right) \tag{16}$$

where the pairs $\{(N^{(i)}, H^{(i)}) : i = 1, 2, \ldots\}$ are independent having the same distribution as $(N, S_N)$ conditioned on $\{N \le \Gamma_p\}$. Note also that $G$ is the sum of the lengths of the latter conditioned excursions and $S_G$ is the sum of the respective increment of the terminal value over the initial value of each excursion. In other words, $(G, S_G)$ is the componentwise sum of $\nu$ independent copies of $(N, S_N)$ (with $(G, S_G) = (0, 0)$ if $\nu = 0$). Infinite divisibility follows as a consequence of the fact that $(G, S_G)$ is a geometric sum of i.i.d. random variables. The independence of $(G, S_G)$ and $(\Gamma_p - G, S_{\Gamma_p} - S_G)$ is immediate from the decomposition described above.

Feller’s classic duality lemma (cf [3]) for random walks says that for any $n = 0, 1, 2, \ldots$ (which may later be randomized with an independent geometric distribution), the independence and common distribution of increments implies that $\{S_{n-k} - S_n : k = 0, 1, \ldots, n\}$ has the same law as $\{-S_k : k = 0, 1, \ldots, n\}$. In the current context, the duality lemma also implies that the pair $(\Gamma_p - G, S_{\Gamma_p} - S_G)$ is equal in distribution to $(D, S_D)$ where

$$D := \sup\left\{k = 0, 1, \ldots, \Gamma_p : S_k = \min_{j=0,1,\ldots,\Gamma_p} S_j\right\} \tag{17}$$

(ii) Note that, as a geometric sum of i.i.d. random variables, the pair $(\Gamma_p, S_{\Gamma_p})$ is infinitely divisible. For $s \in (0, 1)$ and $\theta \in \mathbb{R}$, let $q = 1 - p$ and note also that, on one hand,

$$\mathbb{E}\left(s^{\Gamma_p} e^{i\theta S_{\Gamma_p}}\right) = \mathbb{E}\left(\mathbb{E}\left(s e^{i\theta S_1}\right)^{\Gamma_p}\right) = \sum_{k \ge 0} p \left(q s\,\mathbb{E}\left(e^{i\theta S_1}\right)\right)^{k} = \frac{p}{1 - q s\,\mathbb{E}\left(e^{i\theta S_1}\right)} \tag{18}$$

and, on the other hand, with the help of Fubini’s Theorem,

$$\begin{aligned}
\exp\left\{-\int_{\mathbb{R}} \sum_{n=1}^{\infty} \left(1 - s^{n} e^{i\theta x}\right) q^{n}\,\frac{1}{n}\,F^{*n}(dx)\right\}
&= \exp\left\{-\sum_{n=1}^{\infty} \left(1 - s^{n}\,\mathbb{E}\left(e^{i\theta S_n}\right)\right) \frac{q^{n}}{n}\right\} \\
&= \exp\left\{-\sum_{n=1}^{\infty} \left(1 - s^{n}\,\mathbb{E}\left(e^{i\theta S_1}\right)^{n}\right) \frac{q^{n}}{n}\right\} \\
&= \exp\left\{\log(1 - q) - \log\left(1 - s q\,\mathbb{E}\left(e^{i\theta S_1}\right)\right)\right\} \\
&= \frac{p}{1 - q s\,\mathbb{E}\left(e^{i\theta S_1}\right)}
\end{aligned} \tag{19}$$

where, in the last equality, we have applied the Mercator–Newton series expansion of the logarithm. Comparing the conclusions of the last two series of equalities, the required expression for $\mathbb{E}(s^{\Gamma_p} e^{i\theta S_{\Gamma_p}})$ follows. The Lévy measure mentioned in equation (4) is thus identifiable as

$$\Pi(dy, dx) = \sum_{n=1}^{\infty} \delta_{\{n\}}(dy)\,F^{*n}(dx)\,\frac{1}{n}\,q^{n} \tag{20}$$

for $(y, x) \in \mathbb{R}^2$.

We know that $(\Gamma_p, S_{\Gamma_p})$ may be written as the independent sum of $(G, S_G)$ and $(\Gamma_p - G, S_{\Gamma_p} - S_G)$, where both are infinitely divisible. Further, the former has Lévy measure supported on $\{1, 2, \ldots\} \times (0, \infty)$ and the latter has Lévy measure supported on $\{1, 2, \ldots\} \times (-\infty, 0)$. In addition, $\mathbb{E}(s^{G} e^{i\theta S_G})$ extends to the upper half of the complex plane in $\theta$ (and is continuous on the real axis) and $\mathbb{E}(s^{\Gamma_p - G} e^{i\theta(S_{\Gamma_p} - S_G)})$ extends to the lower half of the complex plane in $\theta$ (and is continuous on the real axis).$^{a}$ Taking account of equation (4), this forces the factorization of the expression for $\mathbb{E}(s^{\Gamma_p} e^{i\theta S_{\Gamma_p}})$ in such a way that

$$\mathbb{E}\left(s^{G} e^{i\theta S_G}\right) = \exp\left\{-\int_{(0,\infty)} \sum_{n=1}^{\infty} \left(1 - s^{n} e^{i\theta x}\right) q^{n} F^{*n}(dx)/n\right\} \tag{21}$$

(iii) Note that the path decomposition given in part (i) shows that

$$\mathbb{E}\left(s^{G} e^{i\theta S_G}\right) = \mathbb{E}\left(s^{\sum_{i=1}^{\nu} N^{(i)}} e^{i\theta \sum_{i=1}^{\nu} H^{(i)}}\right) \tag{22}$$
where the pairs $\{(N^{(i)}, H^{(i)}) : i = 1, 2, \ldots\}$ are independent having the same distribution as $(N, S_N)$ conditioned on $\{N \le \Gamma_p\}$. Hence, we have

$$\begin{aligned}
\mathbb{E}\left(s^{G} e^{i\theta S_G}\right)
&= \sum_{k \ge 0} \mathbb{P}(N > \Gamma_p)\,\mathbb{P}(N \le \Gamma_p)^{k}\,\mathbb{E}\left(s^{\sum_{i=1}^{k} N^{(i)}} e^{i\theta \sum_{i=1}^{k} H^{(i)}}\right) \\
&= \sum_{k \ge 0} \mathbb{P}(N > \Gamma_p)\,\mathbb{P}(N \le \Gamma_p)^{k}\,\mathbb{E}\left(s^{N} e^{i\theta S_N} \mid N \le \Gamma_p\right)^{k} \\
&= \sum_{k \ge 0} \mathbb{P}(N > \Gamma_p)\,\mathbb{E}\left(s^{N} e^{i\theta S_N} \mathbf{1}_{(N \le \Gamma_p)}\right)^{k} \\
&= \sum_{k \ge 0} \mathbb{P}(N > \Gamma_p)\,\mathbb{E}\left((qs)^{N} e^{i\theta S_N}\right)^{k} \\
&= \frac{\mathbb{P}(N > \Gamma_p)}{1 - \mathbb{E}\left((qs)^{N} e^{i\theta S_N}\right)}
\end{aligned} \tag{23}$$

Note that in the fourth equality we use the fact that $\mathbb{P}(\Gamma_p \ge n) = q^{n}$.

The required equality to be proved follows by setting $s = 0$ in equation (21) to recover

$$\mathbb{P}(N > \Gamma_p) = \exp\left\{-\int_{(0,\infty)} \sum_{n=1}^{\infty} \frac{q^{n}}{n}\,F^{*n}(dx)\right\} \tag{24}$$

and then plugging this back into the right-hand side of equation (23) and rearranging.

Lévy Processes and Infinite Divisibility

A (one-dimensional) stochastic process $X = \{X_t : t \ge 0\}$ is called a Lévy process (see Lévy Processes) on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$ if

1. $X$ has paths that are $\mathbb{P}$-almost surely right continuous with left limits;
2. given $0 \le s \le t < \infty$, $X_t - X_s$ is independent of $\{X_u : u \le s\}$;
3. given $0 \le s \le t < \infty$, $X_t - X_s$ is equal in distribution to $X_{t-s}$; and

$$\mathbb{P}(X_0 = 0) = 1 \tag{25}$$

It is easy to deduce that if $X$ is a Lévy process, then for each $t > 0$ the random variable $X_t$ is infinitely divisible. Indeed, one may also show via a straightforward computation that

$$\mathbb{E}\left(e^{i\theta X_t}\right) = e^{-\Psi(\theta) t} \quad \text{for all } \theta \in \mathbb{R},\ t \ge 0 \tag{26}$$

where, in its most general form, $\Psi$ takes the form given in equation (4). Conversely, it can also be shown that given a Lévy–Khintchine exponent (4) of an infinitely divisible random variable, there exists a Lévy process that satisfies equation (26). In the special case that the Lévy–Khintchine exponent belongs to that of a positive-valued infinitely divisible distribution, it follows that the increments of the associated Lévy process must be positive and hence its paths are necessarily monotone increasing. In full generality, a Lévy process may be naively thought of as the independent sum of a linear Brownian motion plus an independent process with discontinuities in its path, which, in turn, may be seen as the limit (in an appropriate sense) of the partial sums of a sequence of compound Poisson processes with drift. The book by Bertoin [1] gives a comprehensive account of the above details.

The definition of a Lévy process suggests that it may be thought of as a continuous-time analog of a random walk. Let us introduce the exponential random variable with parameter $p$, denoted by $\mathbf{e}_p$, which henceforth is assumed to be independent of all other random quantities under discussion and defined on the same probability space. Like the geometric distribution, the exponential distribution also has a lack-of-memory property in the sense that for all $0 \le s, t < \infty$ we have $\mathbb{P}(\mathbf{e}_p > t + s \mid \mathbf{e}_p > t) = \mathbb{P}(\mathbf{e}_p > s) = e^{-ps}$. Moreover, $\mathbf{e}_p$, and, more generally, $X_{\mathbf{e}_p}$, is infinitely divisible. Indeed, straightforward computations show that for each $n = 1, 2, 3, \ldots$

$$\mathbb{E}\left(e^{i\theta X_{\mathbf{e}_p}}\right) = \left[\left(\frac{p}{p + \Psi(\theta)}\right)^{1/n}\right]^{n} = \left[\mathbb{E}\left(e^{i\theta X_{\gamma_{1/n,p}}}\right)\right]^{n} \tag{27}$$

where $\gamma_{1/n,p}$ is a gamma distribution with parameters $1/n$ and $p$, which is independent of $X$. The latter has distribution

$$\mathbb{P}(\gamma_{1/n,p} \in dx) = \frac{p^{1/n}}{\Gamma(1/n)}\, x^{-1+1/n} e^{-px}\,dx \tag{28}$$

for $x > 0$.
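The two sampling identities above, equation (18) for a random walk observed at an independent geometric time and equation (27) for a Lévy process observed at an independent exponential time, can be checked numerically in simple special cases. The sketch below is only an illustration under stated assumptions: it takes the simple symmetric random walk (so that $\mathbb{E}(e^{i\theta S_1}) = \cos\theta$) and standard Brownian motion (so that $\Psi(\theta) = \theta^2/2$). The function names and parameter values are hypothetical and do not come from the article.

```python
import cmath
import math
import random


def geometric_time_cf(p, s, theta, n_samples, rng):
    """Monte Carlo estimate of E[s^G * exp(i*theta*S_G)], where S is a simple
    symmetric random walk (an assumption made for this sketch) and G is an
    independent geometric time with P(G = k) = p * q^k, k = 0, 1, 2, ..."""
    total = 0j
    for _ in range(n_samples):
        k, walk = 0, 0
        while rng.random() >= p:              # continue with probability q = 1 - p
            walk += 1 if rng.random() < 0.5 else -1
            k += 1
        total += (s ** k) * cmath.exp(1j * theta * walk)
    return total / n_samples


def exponential_time_cf(p, theta, n_samples, rng):
    """Monte Carlo estimate of E[exp(i*theta*B_e)], where B is a standard
    Brownian motion and e is an independent exponential time with rate p."""
    total = 0j
    for _ in range(n_samples):
        e = rng.expovariate(p)
        x = rng.gauss(0.0, math.sqrt(e))      # B_e given e is N(0, e)
        total += cmath.exp(1j * theta * x)
    return total / n_samples


if __name__ == "__main__":
    rng = random.Random(0)
    p, s, theta = 0.3, 0.9, 0.7
    q = 1.0 - p
    # Right-hand side of (18) with E exp(i*theta*S_1) = cos(theta).
    print(geometric_time_cf(p, s, theta, 100_000, rng),
          p / (1.0 - q * s * math.cos(theta)))
    # Right-hand side of (27) with Psi(theta) = theta^2 / 2.
    print(exponential_time_cf(p, theta, 100_000, rng),
          p / (p + 0.5 * theta ** 2))
```

In both cases the Monte Carlo estimate should agree with the closed form up to sampling error of order $n^{-1/2}$.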
put, boils down to the computation of the following Corollary 1 For all α, β ≥ 0 and x ≥ 0, we have
quantity:
−
Ɛ eβXeα 1(−X >x)
− X− + −ατ−x +βXτ − eα
vy (x) := Ɛ e−rτy K − e τy |X0 = x (35) Ɛ e (τ−x <∞) =
−x 1 −
Ɛ eβXeα
End Notes
= Ɛ e−βXeα 1(τx+ <eα )
a.
−β Xeα −Xτ + It is this part of the proof that makes the connection
−βXτ + + with the general analytic technique of the Wiener–Hopf
= Ɛ 1(τx+ < eα ) e x Ɛ e x
Fτx (37)
method of factorizing operators. This also explains the
origin of the terminology Wiener–Hopf factorization for
what is otherwise a path, and consequently distributional,
Now, conditionally on Fτx+ and on the event τx+ < eα , decomposition.
the random variables X eα − Xτx+ and X eα have the
same distribution, thanks to the lack-of-memory prop- References
erty of eα and the strong Markov property. Hence, we
have the factorization [1] Bertoin, J. (1996). Lévy Processes, Cambridge Univer-
sity Press.
+ [2] Borovkov, A.A. (1976). Stochastic Processes in Queue-
Ɛ e−βXeα 1X = Ɛ e−ατx −βXτx+ Ɛ e−βXeα
eα >x ing Theory, Springer-Verlag.
[3] Feller, W. (1971). An Introduction to Probability Theory
(38) and its Applications, 2nd Edition, Wiley, Vol. II.
[4] Fristedt, B.E. (1974). Sample functions of stochastic pro-
cesses with stationary independent increments, Advances
The case that β or x is equal to zero can be achieved
in Probability 3, 241–396.
by taking limits on both sides of the above equality. [5] Fusai, G., Abrahams, I.D. & Sgarra, C. (2006). An exact
analytical solution for discrete barrier options, Finance
By replacing X by −X in Lemma 1, we get the and Stochastics 10, 1–26.
following analogous result for first passage into the [6] Greenwood, P.E. & Pitman, J.W. (1979). Fluctua-
negative half line. tion identities for Lévy processes and splitting at
the maximum, Advances in Applied Probability 12, 839–902.
[7] Greenwood, P.E. & Pitman, J.W. (1980). Fluctuation identities for random walk by path decomposition at the maximum. Abstracts of the Ninth Conference on Stochastic Processes and Their Applications, Evanston, Illinois, 6–10 August 1979, Advances in Applied Probability 12, 291–293.
[8] Gusak, D.V. & Korolyuk, V.S. (1969). On the joint distribution of a process with stationary independent increments and its maximum. Theory of Probability 14, 400–409.
[9] Hopf, E. (1934). Mathematical Problems of Radiative Equilibrium. Cambridge tracts, No. 31.
[10] Jeannin, M. & Pistorius, M.R. (2007). A Transform Approach to Calculate Prices and Greeks of Barrier Options Driven by a Class of Lévy. Available at arXiv: http://arxiv.org/abs/0812.3128.
[11] Kudryavtsev, O. & Levendorski, S.Z. (2007). Fast and Accurate Pricing of Barrier Options Under Levy Processes. Available at SSRN: http://ssrn.com/abstract=1040061.
[12] Kyprianou, A.E. (2006). Introductory Lectures on Fluctuations of Lévy Processes with Applications, Springer.
[13] Paley, R. & Wiener, N. (1934). Fourier Transforms in the Complex Domain, American Mathematical Society. Colloquium Publications, New York, Vol. 19.
[14] Pecherskii, E.A. & Rogozin, B.A. (1969). On the joint distribution of random variables associated with fluctuations of a process with independent increments, Theory of Probability and its Applications 14, 410–423.
[15] Spitzer, F. (1956). A combinatorial lemma and its application to probability theory, Transactions of the American Mathematical Society 82, 323–339.
[16] Spitzer, F. (1957). The Wiener-Hopf equation whose kernel is a probability density, Duke Mathematical Journal 24, 327–343.
[17] Spitzer, F. (1964). Principles of Random Walk, Van Nostrand.
[18] Sato, K.-I. (1999). Lévy Processes and Infinitely Divisible Distributions, Cambridge University Press.

Related Articles

Fractional Brownian Motion; Infinite Divisibility; Lévy Processes; Lookback Options.

ANDREAS E. KYPRIANOU
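As a hedged numerical illustration of the random-walk factorization, identity (24) can be tested on the simple symmetric random walk, for which $F^{*n}((0,\infty)) = \mathbb{P}(S_n > 0) = \tfrac{1}{2}(1 - \mathbb{P}(S_n = 0))$ by symmetry. The sketch below compares a truncated version of the series on the right-hand side of (24) with a Monte Carlo estimate of $\mathbb{P}(N > \Gamma_p)$. The choice of walk, the truncation level, and all function names are assumptions made for this example only.

```python
import math
import random


def series_side(q, n_terms=400):
    """exp(-sum_{n>=1} q^n * P(S_n > 0) / n) for the simple symmetric walk,
    truncated after n_terms terms (the tail is geometrically small)."""
    total = 0.0
    for n in range(1, n_terms + 1):
        # P(S_n = 0) is a central binomial probability for even n, else 0.
        p_zero = math.comb(n, n // 2) / 2.0 ** n if n % 2 == 0 else 0.0
        p_pos = 0.5 * (1.0 - p_zero)          # symmetry of the walk
        total += q ** n * p_pos / n
    return math.exp(-total)


def monte_carlo_side(p, n_samples, rng):
    """Estimate P(N > Gamma_p): the walk does not enter (0, inf) during the
    times 1, ..., Gamma_p, with Gamma_p geometric, P(Gamma_p = k) = p * q^k."""
    survived = 0
    for _ in range(n_samples):
        walk, entered = 0, False
        while rng.random() >= p:              # one more step with probability q
            walk += 1 if rng.random() < 0.5 else -1
            if walk > 0:
                entered = True
                break
        if not entered:
            survived += 1
    return survived / n_samples


if __name__ == "__main__":
    p = 0.3
    print(series_side(1.0 - p), monte_carlo_side(p, 100_000, random.Random(0)))
```

For this particular walk the first passage time to level 1 has the classical generating function $\mathbb{E}(s^N) = (1 - \sqrt{1 - s^2})/s$, so $\mathbb{P}(N > \Gamma_p) = 1 - \mathbb{E}(q^N)$ gives an independent closed form against which both numbers can be compared.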
Poisson Process

• for every $s, t \ge 0$, the r.v. $N_{t+s} - N_t$ has the same law as $N_s$.

$$= \prod_{j=1}^{d} e^{-\lambda_j (t-s)}\,\frac{(\lambda_j (t-s))^{n_j}}{n_j!} \tag{7}$$

Proposition 2 An F-adapted process $N$ is a d-dimensional F-Poisson process if and only if

1. each $N^{j}$ is an F-Poisson process
2. no two $N^{j}$’s jump simultaneously.

Inhomogeneous Poisson Processes

We assume that the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is endowed with a filtration F.

Definition

Let $\lambda$ be an F-adapted nonnegative process satisfying $\mathbb{E}\left(\int_0^t \lambda_s\,ds\right) < \infty$, $\forall t$, and $\int_0^\infty \lambda_s\,ds = \infty$.

An inhomogeneous Poisson process $N$ with stochastic intensity $\lambda$ is a counting process such that for every nonnegative F-predictable process $(\phi_t, t \ge 0)$, the following equality is satisfied:

$$\mathbb{E}\left(\int_0^\infty \phi_s\,dN_s\right) = \mathbb{E}\left(\int_0^\infty \phi_s \lambda_s\,ds\right) \tag{8}$$

Therefore $(M_t = N_t - \int_0^t \lambda_s\,ds,\ t \ge 0)$ is an F-martingale, and if $\phi$ is an F-predictable process such that $\forall t$, $\mathbb{E}\left(\int_0^t |\phi_s| \lambda_s\,ds\right) < \infty$, then $(\int_0^t \phi_s\,dM_s,\ t \ge 0)$ is an F-martingale. The process $\Lambda_t = \int_0^t \lambda_s\,ds$ is called the compensator of $N$.

Stochastic Calculus

Integration by Parts Formula. Let $dX_t = b_t\,dt + \varphi_t\,dM_t$ and $dY_t = c_t\,dt + \psi_t\,dM_t$, where $\varphi$ and $\psi$ are predictable processes, and $b$, $c$ are adapted processes such that the processes $X$ and $Y$ are well defined. Then,

$$X_t Y_t = xy + \int_0^t Y_{s-}\,dX_s + \int_0^t X_{s-}\,dY_s + [X, Y]_t \tag{10}$$

where $[X, Y]_t$ is the quadratic covariation process, defined as

$$[X, Y]_t := \int_0^t \varphi_s \psi_s\,dN_s \tag{11}$$

In particular, if $dX_t = \varphi_t\,dM_t$ and $dY_t = \psi_t\,dM_t$ (i.e., $X$ and $Y$ are local martingales), the process $(X_t Y_t - [X, Y]_t,\ t \ge 0)$ is a martingale. It can be noted that, in that case, the process $(X_t Y_t - \langle X, Y \rangle_t,\ t \ge 0)$, where $\langle X, Y \rangle_t = \int_0^t \varphi_s \psi_s \lambda_s\,ds$, is also a martingale. The process $\langle X, Y \rangle$ is the compensator of $[X, Y]$ if $[X, Y]$ is integrable (see Compensators). The predictable process $(\langle X, Y \rangle_t,\ t \ge 0)$ is called the predictable covariation process of the pair $(X, Y)$, or the compensator of the product $XY$. If $dX_t^{i} = x_t^{i}\,dN_t^{i}$, where $N^{i}$, $i = 1, 2$ are independent inhomogeneous Poisson processes, the covariation processes $[X^{1}, X^{2}]$ and $\langle X^{1}, X^{2} \rangle$ are null, and $X^{1} X^{2}$ is a martingale.

Itô’s Formula. Itô’s formula is a special case of the general one; it is a bit simpler and is used for the
processes that are within bounded variation. Let b be The local martingale L is denoted by E(µ M) and
an adapted process and ϕ a predictable process with named the Doléans-Dade exponential (alternatively,
adequate integrability conditions, and the stochastic exponential) of the process µ M.
If µ > −1, the process L is nonnegative and is a
dXt = bt dt + ϕt dMt = (bt − ϕt λt ) dt + ϕt dNt martingale if ∀t, Ɛ(Lt ) = 1 (this is the case if µ
satisfies −1 + δ < µs < C where C and δ > 0 are
(12)
two constants).
and F ∈ C 1,1 (+ × ). Then, the process (F (t, Xt ), If µ is not greater than −1, then the process L
t ≥ 0) is a semimartingale with decomposition defined in equation (16) may take negative values.
F (t, Xt ) = Zt + At (13)
where Z is a local martingale given by
Change of Probability Measure
Lt = (1 + µTn )
n,Tn ≤t Definition and Properties
t
× exp − µs λs ds if t ≥ T1 Let λ be a positive number, and F (dy) be a proba-
0
(16) bility law on . A (λ, F )-compound Poisson process
is a local martingale solution of is a process X = (Xt , t ≥ 0) of the form
dLt = Lt− µt dMt , L0 = 1 (17)
Nt
Xt = Yn = Yn (21)
Moreover, if µ is such that ∀s, µs > −1,
n=1 n>0,Tn ≤t
t t
Lt = exp − µs λs ds + ln(1 + µs ) dNs where N is a standard Poisson process with intensity
0 0 λ > 0, and the (Yn , n ≥ 1) are i.i.d. square-integrable
t random variables with law F (dy) = (Y1 ∈ dy),
= exp − (µs − ln(1 + µs ))λs ds independent of N .
0
t
Proposition 4 A compound Poisson process has
+ ln(1 + µs ) dMs (18) stationary and independent increments; for fixed t, the
0
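cumulative distribution function of $X_t$ is

$$\mathbb{P}(X_t \le x) = e^{-\lambda t} \sum_{n=0}^{\infty} \frac{(\lambda t)^{n}}{n!}\,F^{*n}(x) \tag{22}$$

where the star indicates a convolution.

If $\mathbb{E}(|Y_1|) < \infty$, the process $(Z_t = X_t - t\lambda\,\mathbb{E}(Y_1),\ t \ge 0)$ is a martingale and $\mathbb{E}(X_t) = \lambda t\,\mathbb{E}(Y_1)$.

If $\mathbb{E}(Y_1^2) < \infty$, the process $(Z_t^2 - t\lambda\,\mathbb{E}(Y_1^2),$

$$= \sum_{n=1}^{N_t} f(Y_n(\omega)) \tag{24}$$

we obtain that

$$M_t^{f} = (f * \mu)_t - t\lambda\,\mathbb{E}(f(Y_1)) = \int_0^t \int_{\mathbb{R}} f(x)\left(\mu(\omega, ds, dx) - \lambda F(dx)\,ds\right) \tag{25}$$

is a martingale.

Martingales

Proposition 5 If $X$ is a $(\lambda, F)$-compound Poisson process, for any $\alpha$ such that $\int_{-\infty}^{\infty} e^{\alpha x} F(dx) < \infty$, the process

$$Z_t = \exp\left(\alpha X_t - \lambda t \int_{-\infty}^{\infty} \left(e^{\alpha x} - 1\right) F(dx)\right) \tag{26}$$

is a martingale and

$$\mathbb{E}\left(e^{\alpha X_t}\right) = \exp\left(\lambda t \int_{-\infty}^{\infty} \left(e^{\alpha x} - 1\right) F(dx)\right) = \exp\left(\lambda t\,\mathbb{E}\left(e^{\alpha Y_1} - 1\right)\right) \tag{27}$$

In other words, for any $\alpha$ such that $\mathbb{E}(e^{\alpha X_t}) < \infty$ (or equivalently $\mathbb{E}(e^{\alpha Y_1}) < \infty$), the process $(e^{\alpha X_t}/\mathbb{E}(e^{\alpha X_t}),\ t \ge 0)$ is a martingale. More generally, let $f$ be a bounded Borel function. Then, the process

$$\exp\left(\sum_{n=1}^{N_t} f(Y_n) - \lambda t \int_{-\infty}^{\infty} \left(e^{f(x)} - 1\right) F(dx)\right) \tag{28}$$

Let $X$ be a $(\lambda, F)$-compound Poisson process, $\tilde{\lambda} > 0$, and $\tilde{F}$ a probability measure on $\mathbb{R}$, absolutely continuous with respect to $F$, with Radon–Nikodym density $\varphi$, that is, $\tilde{F}(dx) = \varphi(x) F(dx)$. The process

$$L_t = \exp\left(t(\lambda - \tilde{\lambda}) + \sum_{s \le t} \ln\left(\frac{\tilde{\lambda}}{\lambda}\,\varphi(\Delta X_s)\right)\right) \tag{30}$$

is a positive martingale (take $f(x) = \ln((\tilde{\lambda}/\lambda)\,\varphi(x))$ in equation (28)) with expectation 1. Set $d\mathbb{Q}|_{\mathcal{F}_t} = L_t\,d\mathbb{P}|_{\mathcal{F}_t}$.

Proposition 6 Under $\mathbb{Q}$, the process $X$ is a $(\tilde{\lambda}, \tilde{F})$-compound Poisson process.

Let $\alpha$ be such that $\mathbb{E}(e^{\alpha Y_1}) < \infty$. The particular case with $\varphi(x) = e^{\alpha x}/\mathbb{E}(e^{\alpha Y_1})$ and $\tilde{\lambda} = \lambda\,\mathbb{E}(e^{\alpha Y_1})$ corresponds to the Esscher transform for which

$$d\mathbb{Q}|_{\mathcal{F}_t} = \frac{e^{\alpha X_t}}{\mathbb{E}\left(e^{\alpha X_t}\right)}\,d\mathbb{P}|_{\mathcal{F}_t} \tag{31}$$

We emphasize that there exist changes of probability that do not preserve the compound Poisson process property. For the predictable representation theorem, see Point Processes.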
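The mean formula $\mathbb{E}(X_t) = \lambda t\,\mathbb{E}(Y_1)$ and the exponential moment in equation (27) can be illustrated by simulation. The following is a minimal sketch, assuming Gaussian jump sizes $Y_n \sim N(\mu, \sigma^2)$ so that $\mathbb{E}(e^{\alpha Y_1}) = \exp(\alpha\mu + \alpha^2\sigma^2/2)$ is available in closed form. The jump law, the parameter values, and the function names are hypothetical choices for this example.

```python
import math
import random


def compound_poisson(lam, t, jump_sampler, rng):
    """One draw of X_t = sum of Y_n over jump times T_n <= t, where the
    interarrival times of the Poisson process are i.i.d. exponential(lam)."""
    x, clock = 0.0, rng.expovariate(lam)
    while clock <= t:
        x += jump_sampler(rng)
        clock += rng.expovariate(lam)
    return x


def check(lam=2.0, t=1.5, alpha=0.5, mu=0.1, sigma=0.3, n=100_000, seed=0):
    """Return (Monte Carlo mean, lam*t*E[Y], Monte Carlo E[exp(alpha*X_t)],
    closed form from equation (27)) for Gaussian jumps (an assumption)."""
    rng = random.Random(seed)
    draws = [compound_poisson(lam, t, lambda r: r.gauss(mu, sigma), rng)
             for _ in range(n)]
    mean_mc = sum(draws) / n
    mgf_mc = sum(math.exp(alpha * x) for x in draws) / n
    mgf_jump = math.exp(alpha * mu + 0.5 * (alpha * sigma) ** 2)
    return mean_mc, lam * t * mu, mgf_mc, math.exp(lam * t * (mgf_jump - 1.0))


if __name__ == "__main__":
    print(check())
```

Dividing each simulated $e^{\alpha X_t}$ by the closed-form value gives draws of the Esscher density in equation (31), whose sample mean should be close to 1.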
The compound Poisson process is said to be a double exponential process if the law of the random variable $Y_1$ is

$$F(dx) = \left(p\,\theta_1 e^{-\theta_1 x}\,\mathbf{1}_{\{x>0\}} + (1 - p)\,\theta_2 e^{\theta_2 x}\,\mathbf{1}_{\{x<0\}}\right) dx \tag{32}$$

where $p \in\ ]0, 1[$ and $\theta_i$, $i = 1, 2$ are positive numbers. Under an Esscher transform, this model is still a double exponential model. This particular dynamic allows one to compute the Laplace transform of the first hitting times of a given level.

[1] Brémaud, P. (1981). Point Processes and Queues: Martingale Dynamics, Springer-Verlag, Berlin.
[2] Çinlar, E. (1975). Introduction to Stochastic Processes, Prentice Hall.
[3] Cont, R. & Tankov, P. (2004). Financial Modeling with Jump Processes, Chapman & Hall/CRC.
[4] Jeanblanc, M., Yor, M. & Chesney, M. (2009). Mathematical Models for Financial Markets, Springer, Berlin.
[5] Karlin, S. & Taylor, H. (1975). A First Course in Stochastic Processes, Academic Press, San Diego.
[6] Protter, P.E. (2005). Stochastic Integration and Differential Equations, 2nd Edition, Springer, Berlin.

Related Articles
Definition t
Ɛ (s, z; ω)µ(ω; ds, dz)
An increasing sequence of random times is called 0 E
a univariate point process. A simple example is the t
Poisson process. =Ɛ (s, z; ω) ν(ω; ds, dz) (5)
Given a univariate point process, we associate 0 E
to every time Tn a mark Zn . More precisely, let
(, F, ) be a probability space, (Zn , n ≥ 1) a In the case of a marked point process on × d ,
sequence of random variables taking values in a the compensator admits an explicit representation: let
measurable space (E, E), and (Tn , n ≥ 1) an increas- Gn (dt, dz) be a regular version of the conditional
ing sequence of nonnegative random variables. We distribution of (Tn+1 , Zn+1 ) with respect to FTNn =
assume that lim Tn = ∞, so that there is only a finite σ {(T1 , Z1 ), . . . (Tn , Zn )}. Then,
number of n such that, for a given t, one has Tn ≤ t. Gn (dt, dz)
We define the process
N as follows. For each set, ν(dt, dz) = 11{Tn <t≤Tn+1 } (6)
A ∈ E, Nt (A) = n 11{Tn ≤t} 11{Zn ∈A} is the number of n
Gn ([t, ∞[×d )
“marks” in the set A before time t. The natural fil-
tration of N is
Intensity Process
FtN = σ (Ns (A), s ≤ t, A ∈ E ) (1)
In what follows, we assume that, for any A ∈ E,
The predictable σ -algebra P is the σ -algebra defined the process (Nt (A), t ≥ 0) admits the F-predictable
on × + that is generated by the sets intensity (λt (A), t ≥ 0), that is, there exists a non-
negative process (λt (A), t ≥ 0) such that
A × {0}, A ∈ F0N ; A×]s, t], A ∈ FsN , s ≤ t t
Nt (A) − λs (A)ds (7)
(2) 0
t (E)
The associated random counting measure µ(ω, is an F- martingale. Then, if Xt = N n=1 (Tn , Zn )
ds, dz) is defined as follows: let be a map where is an F-predictable process that satisfies
(t, ω, z) ∈ (+ , , E) → (t, ω, z) ∈ (3)
Ɛ |(s, z)|λs (dz)ds < ∞ (8)
]0,t] E
We set
the process
∞
(s, z)µ(ds, dz) = (Tn , Zn )11{Tn ≤t} t
]0,t] E n=1 Xt − (s, z)λs (dz)ds
0 E
N
t (E)
= (Tn , Zn ) (4) = (s, z) [µ(ds, dz) − λs (dz)ds] (9)
n=1 ]0,t] E
Nt
= N B = Card{s ≤ t : e(s) ∈
} (20)
Ɛ exp i f (s, es )
0<s≤t
Poisson Point Processes t
= exp ds (eif (s,u) − 1)n(du) (24)
Definition 2 An F-Poisson point process e is a point 0
process such that Moreover, if f ≥ 0,
1. NtE < ∞ a.s. for every t
2. for any
∈ E, the process N
is F-adapted
Ɛ exp − f (s, es )
3. for any s and t and any
∈ E, Ns+t
− Nt
is
0<s≤t
independent from Ft and distributed as Ns
.
t
[3] Jacod, J. & Shiryaev, A.N. (2003). Limit Theorems for Stochastic Processes, 2nd Edition, Springer Verlag.
[4] Last, G. & Brandt, A. (1995). Marked Point Processes on the Real Line. The Dynamic Approach, Springer, Berlin.
[5] Protter, P.E. (2005). Stochastic Integration and Differential Equations, 2nd Edition, Springer, Berlin.

Related Articles

Lévy Processes; Martingales; Martingale Representation Theorem.

MONIQUE JEANBLANC
Compensators

In probability theory, the compensator of a stochastic process designates a quantity that, once subtracted from a stochastic process, yields a martingale.

Compensator of a Random Time

Let $(\Omega, \mathcal{G}, \mathbb{P})$ be a filtered probability space and $\tau$ a G-stopping time. The process $H_t = \mathbf{1}_{\tau \le t}$ is a G-adapted increasing process, hence a G-submartingale and admits a Doob–Meyer decomposition as

$$H_t = M_t + \Lambda_t \tag{1}$$

where $M$ is a G-local martingale and $\Lambda$ a G-predictable increasing process. The process $\Lambda$, called the G-compensator of $H$, is constant after $\tau$, that is, $\Lambda_t = \Lambda_{t \wedge \tau}$. The process $\Lambda$ “compensates” $H$ with the meaning that $H - \Lambda$ is a martingale. If $\tau$ is G-predictable, then $\Lambda_t = H_t$. The continuity of $\Lambda$ is equivalent to the fact that $\tau$ is a G-totally inaccessible stopping time. If $\Lambda$ is absolutely continuous with respect to the Lebesgue measure, that is, if $\Lambda_t = \int_0^t \lambda_s^{G}\,ds$, the nonnegative G-adapted process $\lambda^{G}$ is called the intensity rate of $\tau$. Note that $\lambda_t^{G}$ is null on the set $\tau \le t$.

For any integrable random variable $X \in \mathcal{G}_T$, one has

$$\mathbb{E}(X \mathbf{1}_{T<\tau} \mid \mathcal{G}_t) = \mathbf{1}_{\{t<\tau\}}\left(V_t - \mathbb{E}(\Delta V_\tau \mathbf{1}_{\tau \le T} \mid \mathcal{G}_t)\right) \tag{2}$$

with $V_t = e^{\Lambda_t}\,\mathbb{E}(X e^{-\Lambda_T} \mid \mathcal{G}_t)$.

In the following examples, $\tau$ is a given random time, that is, a nonnegative random variable, and H the natural filtration of $H$ (i.e., the smallest filtration satisfying the usual conditions such that the process $H$ is adapted). The random time $\tau$ is a H-stopping time.

Elementary Case

Let $\tau$ be an exponential random variable with constant parameter $\lambda$. Then, the H-compensator of $H$ is $\lambda(t \wedge \tau)$. More generally, if $\tau$ is a nonnegative random variable with cumulative distribution function $F$, taken continuous on the right ($F(t) = \mathbb{P}(\tau \le t)$) such that $\forall t$, $F(t) < 1$, the H-compensator of $\tau$ is $\Lambda_t = \int_0^{t \wedge \tau} \frac{dF(s)}{1 - F(s-)}$. If $F$ is continuous, the H-compensator is $\Lambda_t = -\ln(1 - F(t \wedge \tau))$.

Cox Processes

Let F be a given filtration, $\lambda$ a given F-adapted nonnegative process, $\Lambda_t^{F} = \int_0^t \lambda_s\,ds$, and $\Theta$ a random variable with exponential law, independent of F. Let us define the random time $\tau$ as

$$\tau = \inf\left\{t : \Lambda_t^{F} \ge \Theta\right\} \tag{3}$$

Then, the process

$$\mathbf{1}_{\tau \le t} - \int_0^{t \wedge \tau} \lambda_s\,ds = \mathbf{1}_{\tau \le t} - \Lambda_{t \wedge \tau}^{F} \tag{4}$$

is a martingale in the filtration G = F ∨ H, the smallest filtration that contains F, making $\tau$ a stopping time (in fact a totally inaccessible stopping time). The G-compensator of $H$ is $\Lambda_t = \Lambda_{t \wedge \tau}^{F}$, and the G-intensity rate is $\lambda_t^{G} = \mathbf{1}_{t<\tau}\,\lambda_t$. In that case, for an integrable random variable $X \in \mathcal{F}_T$, one has

$$\mathbb{E}(X \mathbf{1}_{T<\tau} \mid \mathcal{G}_t) = \mathbf{1}_{t<\tau}\,e^{\Lambda_t^{F}}\,\mathbb{E}\left(X e^{-\Lambda_T^{F}} \mid \mathcal{F}_t\right) \tag{5}$$

and, for $H$, an F-predictable (bounded) process

$$\mathbb{E}(H_\tau \mathbf{1}_{\tau \le T} \mid \mathcal{G}_t) = H_\tau \mathbf{1}_{\tau \le t} + \mathbf{1}_{t<\tau}\,e^{\Lambda_t^{F}}\,\mathbb{E}\left(\int_t^T H_s e^{-\Lambda_s^{F}} \lambda_s\,ds \,\Big|\, \mathcal{F}_t\right) \tag{6}$$

Conditional Survival Probability

Assume now that $\tau$ is a nonnegative random variable on the filtered probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with conditional survival probability $G_t := \mathbb{P}(\tau > t \mid \mathcal{F}_t)$, taken continuous on the right and let G = F ∨ H. The random time $\tau$ is a G-stopping time.

If $\tau$ is an F-predictable stopping time (hence a G-predictable stopping time), then $G_t = \mathbf{1}_{\tau > t}$ and $\Lambda = H$.

In what follows, we assume that $G_t > 0$ and we introduce the Doob–Meyer decomposition of the F-supermartingale $G$, that is, $G_t = Z_t - A_t$, where $Z$ is an F-martingale and $A$ is an increasing F-predictable process. Then, the G-compensator of $\tau$ is $\Lambda_t = \int_0^{t \wedge \tau} (G_{s-})^{-1}\,dA_s$. If $dA_t = a_t\,dt$, the G-intensity rate is $\lambda_t^{G} = \mathbf{1}_{t<\tau}\,(G_{t-})^{-1} a_t$. Moreover, if G
exists a unique F-predictable increasing process $(A_t^{(p)},\ t \ge 0)$, called the F-dual predictable projection of $A$ such that

$$\mathbb{E}\left(\int_0^\infty H_s\,dA_s\right) = \mathbb{E}\left(\int_0^\infty H_s\,dA_s^{(p)}\right) \tag{17}$$

for any positive F-predictable process $H$.

The definition of compensator of a random time can be interpreted in terms of dual predictable projection: if $\tau$ is a random time, the F-predictable compensator associated with $\tau$ is the dual predictable projection $A^{\tau}$ of the increasing process $\mathbf{1}_{\{\tau \le t\}}$. It satisfies

$$\mathbb{E}(k_\tau) = \mathbb{E}\left(\int_0^\infty k_s\,dA_s^{\tau}\right) \tag{18}$$

for any positive, F-predictable process $k$.

Examples

Covariation Processes. Let $M$ be a martingale and $[M]$ its quadratic variation process. If $[M]$ is integrable, its compensator is $\langle M \rangle$.

Standard Poisson Process. If $N$ is a Poisson process, $(M_t = N_t - \lambda t,\ t \ge 0)$ is a martingale, and $\lambda t$ is the compensator of $N$; the martingale $M$ is called the compensated martingale.

Compensated Poisson Integrals. Let $N$ be a time inhomogeneous Poisson process with deterministic intensity $\lambda$ and $F^{N}$ its natural filtration. The process

$$\left(M_t = N_t - \int_0^t \lambda(s)\,ds,\ t \ge 0\right) \tag{19}$$

is an $F^{N}$-martingale. The increasing function $\Lambda(t) := \int_0^t \lambda(s)\,ds$ is called the (deterministic) compensator of $N$.

Random Measures

Definitions

The compensator of a random measure $\mu$ is the unique random measure $\nu$ such that

1. for every predictable process $H$, the process $(H \star \nu)$ is predictable (the measure $\nu$ is said to be predictable) and
2. for every predictable process $H$ such that the process $|H| \star \mu$ is increasing and locally integrable, the process $(H \star \mu - H \star \nu)$ is a local martingale.

Examples

If $N$ is a Lévy process with Lévy measure $\nu$

$$\int_{\Lambda} f(x)\,N_t(\cdot, dx) - t \int_{\Lambda} f(x)\,\nu(dx) = \sum_{0<s\le t} f(\Delta X_s)\,\mathbf{1}_{\Lambda}(\Delta X_s) - t \int_{\Lambda} f(x)\,\nu(dx) \tag{20}$$

is a martingale, the compensator of $\int_{\Lambda} f(x)\,N_t(\cdot, dx)$ is $t \int_{\Lambda} f(x)\,\nu(dx)$.

For other examples see the article on point processes (see Point Processes).

References

[1] Brémaud, P. & Yor, M. (1978). Changes of filtration and of probability measures, Zeit Wahr and Verw Gebiete 45, 269–295.
[2] Zeng, Y. (2006). Compensators of Stopping Times, PhD thesis, Cornell University.

Further Reading

Brémaud, P. (1981). Point Processes and Queues. Martingale Dynamics, Springer-Verlag, Berlin.
Çinlar, E. (1975). Introduction to Stochastic Processes, Prentice Hall.
Cont, R. & Tankov, P. (2004). Financial Modeling with Jump Processes, Chapman & Hall/CRC.
Jeanblanc, M., Yor, M. & Chesney, M. (2009). Mathematical Models for Financial Markets, Springer, Berlin.
Karlin, S. & Taylor, H. (1975). A First Course in Stochastic Processes, Academic Press, San Diego.

Related Articles

Doob–Meyer Decomposition; Filtrations; Intensity-based Credit Risk Models; Point Processes.

MONIQUE JEANBLANC
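The elementary case of the Compensators article, an exponential random time $\tau$ with parameter $\lambda$ whose compensator is $\lambda(t \wedge \tau)$, lends itself to a quick numerical sanity check: the compensated process $\mathbf{1}_{\tau \le t} - \lambda(t \wedge \tau)$ should have mean zero at every fixed $t$. The sketch below is an illustration only; the function names and parameter values are assumptions, and the second helper simply evaluates $-\ln(1 - F(t \wedge \tau))$ for a user-supplied continuous distribution function.

```python
import math
import random


def compensated_mean(lam, t, n, rng):
    """Monte Carlo mean of 1{tau <= t} - lam * min(t, tau), tau ~ Exp(lam).
    A value near zero is consistent with lam * min(t, tau) compensating H."""
    total = 0.0
    for _ in range(n):
        tau = rng.expovariate(lam)
        total += (1.0 if tau <= t else 0.0) - lam * min(t, tau)
    return total / n


def log_compensator(cdf, t, tau):
    """-ln(1 - F(min(t, tau))) for a continuous c.d.f. F with F < 1."""
    return -math.log(1.0 - cdf(min(t, tau)))


if __name__ == "__main__":
    lam, t = 1.3, 0.8
    print(compensated_mean(lam, t, 200_000, random.Random(1)))
    # For the exponential law the general formula reduces to lam * min(t, tau).
    exp_cdf = lambda x: 1.0 - math.exp(-lam * x)
    print(log_compensator(exp_cdf, t, 0.5), lam * min(t, 0.5))
```

A zero mean at fixed times is only a necessary condition for the martingale property, so this is a consistency check rather than a proof.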
Heavy Tails

The three most cited stylized properties attributed to log-returns of financial assets or stocks are (i) a kurtosis much larger than 3, the kurtosis of a normal distribution; (ii) serial dependence without correlation; and (iii) volatility clustering. Any realistic and useful model for log-returns must account for all three of these characteristics. In this article, the focus is on the large kurtosis property, which is indicative of heavy tails in the returns. Although this stylized fact may not draw the same level of attention as the other two, it can have a serious impact on modeling and inference questions related to financial time series. One such application is the estimation of the Value at Risk, which is an important entity in the finance industry. For example, financial institutions would like to estimate large quantiles of the absolute returns, that is, the level at which the probability that an absolute return exceeds this value is small such as 0.01 or less. The estimation of these large quantiles is extremely sensitive to the shape assumed for the tail of the marginal distribution. A light-tailed assumption for the tails can severely underestimate the actual quantiles of the marginal distribution. In addition to Value at Risk, heavy tails can impact the estimation of key measures of dependencies in financial time series. This includes the sample autocorrelation of the time series and of functions of the time series such as absolute values and squares. Standard central limit theory for mixing sequences generally directly applies to the sample autocorrelation functions (ACFs) of a financial time series and its squares, provided the fourth and eighth moments, respectively, are finite. If these moments are infinite, as well may be the case for financial time series, then the asymptotic behavior of the sample ACFs is often nonstandard. As it turns out, GARCH processes and stochastic volatility (SV) processes, which are the primary modeling engines for financial returns, exhibit heavy tails in the marginal distribution. We focus on heavy tails and how the concept of regular variation plays a vital role in both these processes.

It is often a misconception to associate heavy-tailed distributions with a very large variance. Rather, the term is used to describe data that exhibit bursts of outlying observations. These outlying observations could be orders of magnitude larger than the median of the observations. In the early 1960s, Mandelbrot (see Mandelbrot, Benoit) [31], Mandelbrot and Taylor [32], and Fama [21] realized that the marginal distribution of returns appeared to be heavy tailed. To cope with heavy tails, they considered non-Gaussian stable distributions for the marginals. Since this class of distributions has infinite variance, it was a slightly controversial approach. On the other hand, for many financial time series, there is evidence that the marginal distribution may have a finite variance but an infinite fourth moment. Figure 1 contains two financial time series that exhibit heavy tails. Figure 1(a) consists of the daily pound/US dollar exchange rate from October 1, 1981 to June 28, 1985, while Figure 1(b) displays the log-returns of the daily closing price of Merck stock from January 2, 2003 through April 28, 2006. One can certainly detect the occasional bursts of outlying observations in both series that are representative of heavy tails. As described in the second section (see Figure 3c and d), there is statistical evidence that the tail behavior of the marginal distribution is heavy with possibly infinite fourth moments.

Regular variation is a natural and often used concept to describe and model heavy-tailed phenomena. Many processes that are designed to model financial time series, such as the GARCH and heavy-tailed SV processes, have the property that all finite-dimensional distributions are regularly varying. For such processes, one can apply standard results from extreme value theory for establishing limiting behavior of the extremes of the process, the sample ACF of the process and its squares, and a host of other statistics. The regular variation condition and its properties are described in the second section. In the third section, some of the main results on regular variation for GARCH and SV processes, respectively, are described. The fourth section describes some of the applications of the regular variation conditions mentioned in the third section, with emphasis on extreme values, point processes, and sample autocorrelations.

Regular Variation

Multivariate regular variation plays an indispensable role in extreme value theory and often serves as the starting point for modeling multivariate extremes. In some respect, one can regard a random vector that is regularly varying as the heavy-tailed analog
2 Heavy Tails
[Figure 1 plots: panel (a), vertical axis "Exchange returns"; panel (b), vertical axis "Log-returns"]
Figure 1 Log-returns for US/pound exchange rate, October 1, 1981 to June 28, 1985 (a) and log-returns for closing price
of Merck stock, January 2, 2003 to April 28, 2006 (b)
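As a rough illustration of why the assumed tail shape matters for large quantiles of absolute returns (the Value at Risk setting), the sketch below compares an empirical high quantile with the one implied by a normal distribution that has the same standard deviation. The simulated Student-t returns with 3 degrees of freedom, the level p = 0.001, and the helper names `student_t` and `abs_return_quantiles` are assumptions made for this example only; they are not taken from the article.

```python
import math
import random
from statistics import NormalDist


def student_t(df, rng):
    """Draw one Student-t variate as Z / sqrt(chi2 / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / math.sqrt(chi2 / df)


def abs_return_quantiles(returns, p):
    """Return (empirical, Gaussian-fit) levels q with P(|X - mean| > q) = p."""
    n = len(returns)
    mean = sum(returns) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in returns) / n)
    abs_sorted = sorted(abs(x - mean) for x in returns)
    # Empirical level exceeded by roughly a fraction p of the absolute returns.
    index = min(n - 1, int(math.ceil((1.0 - p) * n)) - 1)
    empirical = abs_sorted[index]
    # Level implied by a normal distribution with the same standard deviation.
    gaussian = sd * NormalDist().inv_cdf(1.0 - p / 2.0)
    return empirical, gaussian


rng = random.Random(12345)
simulated = [student_t(3, rng) for _ in range(100_000)]  # hypothetical return series
empirical_q, gaussian_q = abs_return_quantiles(simulated, p=0.001)
print(f"empirical level: {empirical_q:.2f}   Gaussian-fit level: {gaussian_q:.2f}")
```

With heavy-tailed input, the Gaussian-fit level would be expected to fall well below the empirical one, which is the underestimation the text warns about.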
varying if and only if $|X|$ is regularly varying,

$$\lim_{u\to\infty} \frac{P(|X| > tu)}{P(|X| > u)} = t^{-\alpha} \tag{6}$$

and the tail balancing condition,

$$\lim_{u\to\infty} \frac{P(X > u)}{P(|X| > u)} = p \quad\text{and}\quad \lim_{u\to\infty} \frac{P(X < -u)}{P(|X| > u)} = q \tag{7}$$

holds, where p and q are nonnegative constants with p + q = 1. The Pareto distribution, t-distribution, and nonnormal stable distributions are all examples of one-dimensional distributions that are regularly varying.

Example 1 (Independent components). Suppose that $\mathbf{X} = (X_1, X_2)$ consists of two independent and identically distributed (i.i.d.) components, where $X_1$ is a regularly varying random variable. The scatter plot of 10 000 replicates of these pairs, where $X_1$ has a t-distribution with 3 degrees of freedom, is displayed in Figure 2(a). The t-distribution is regularly varying, with index α being equal to the degrees of freedom. In this case, the spectral measure is a discrete distribution, which places equal mass at the intersection of the unit circle and the coordinate axes. That is,

$$P\left(\Theta = \frac{\pi k}{2}\right) = \frac{1}{4} \quad\text{for } k = -1, 0, 1, 2 \tag{8}$$

The scatter plot in Figure 2 reflects the form of the spectral distribution. The points that are far from the origin occur only near the coordinate axes. The interpretation is that the probability that both components of the random vector are large at the same time is quite small.

Example 2 (Totally Dependent Components). In contrast to the independent case of Example 1, suppose that both components of the vector are identical, that is, $\mathbf{X} = (X, X)$, with $X$ regularly varying in one dimension. Independent replicates of this random vector would just produce points lying on a 45° line through the origin. Here, it is easy to see that the vector is regularly varying with spectral measure given by

$$P\left(\Theta = \frac{\pi}{4}\right) = p \quad\text{and}\quad P\left(\Theta = \frac{-\pi}{4}\right) = q \tag{9}$$

Example 3 (AR(1) Process). Let $\{X_t\}$ be the AR(1) process defined by the recursion:

$$X_t = 0.9\,X_{t-1} + Z_t \tag{10}$$
[Figure 2 plots: panel (a), titled "Independent components", vertical axis x_2; panel (b), vertical axis x_{t+1}]
Figure 2 Scatter plot of 10 000 pairs of observations with i.i.d. components having a t-distribution with 3 degrees of
freedom (a) and 10 000 observations of (Xt , Xt+1 ) from an AR(1) process (b)
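The pattern in panel (a), where points far from the origin sit near the coordinate axes, can be checked with a small simulation. This is only a sketch: the random seed, the choice of the top 1% of points by Euclidean norm as "far from the origin", the cone half-width of π/8, and the helper names `student_t` and `fraction_near_axes` are assumptions made for illustration.

```python
import math
import random


def student_t(df, rng):
    """Draw one Student-t variate as Z / sqrt(chi2 / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / math.sqrt(chi2 / df)


def fraction_near_axes(pairs, top_fraction, half_width=math.pi / 8):
    """Share of the largest points (by norm) whose angle is close to an axis."""
    ranked = sorted(pairs, key=lambda p: math.hypot(p[0], p[1]), reverse=True)
    extremes = ranked[: max(1, int(top_fraction * len(ranked)))]
    near = 0
    for x1, x2 in extremes:
        theta = math.atan2(x2, x1)
        # Angular distance from theta to the nearest multiple of pi/2.
        dist = abs(theta - (math.pi / 2) * round(theta / (math.pi / 2)))
        if dist < half_width:
            near += 1
    return near / len(extremes)


rng = random.Random(2024)
pairs = [(student_t(3, rng), student_t(3, rng)) for _ in range(10_000)]
frac_extreme = fraction_near_axes(pairs, top_fraction=0.01)  # far from the origin
frac_all = fraction_near_axes(pairs, top_fraction=1.0)       # whole sample
print(f"near axes, extremes: {frac_extreme:.2f}   near axes, all points: {frac_all:.2f}")
```

With a half-width of π/8 the four cones cover half of the circle, so a share well above one half among the extremes points to the concentration on the axes described by the spectral measure in Example 1.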
where $\{Z_t\}$ is an i.i.d. sequence of random variables that have a symmetric stable distribution with exponent 1.8. This stable distribution is regularly varying with index α = 1.8. Since $X_t = \sum_{j=0}^{\infty} 0.9^{j} Z_{t-j}$ is a linear process, it follows [14, 15] that $X_t$ is also symmetric and regularly varying with index 1.8. In fact, $X_t$ has a symmetric stable distribution with exponent 1.8 and scale parameter $(1 - 0.9^{1.8})^{-1/1.8}$. The scatter plot of consecutive observations $(X_t, X_{t+1})$ based on 10 000 observations generated from an AR(1) process is displayed in Figure 2(b). It can be shown that all finite-dimensional distributions of this time series are regularly varying. The spectral distribution of the vector consisting of two consecutive observations $\mathbf{X} = (X_t, X_{t+1})$ is given by

$$P(\Theta = \pm \arctan(0.9)) = 0.9898 \quad\text{and}\quad P(\Theta = \pm \pi/2) = 0.0102 \tag{11}$$

As seen in Figure 2, most of the points in the scatter plot, especially those far from the origin, cluster tightly around the line through the origin with slope 0.9. This corresponds to the large mass at arctan(0.9) of the distribution of Θ. One can also detect a smattering of extreme points clustered around the vertical axis.

Estimation of α

A great deal of attention in the extreme value theory community has been devoted to the estimation of α in the regular variation condition (1). The generic Hill estimate is often a good starting point for this task. There are more sophisticated versions of Hill estimates (see [23] for a nice treatment of Hill estimators), but for illustration we stick with the standard version. For observations $X_1, \ldots, X_n$ from a nonnegative-valued time series, let $X_{n:1} > \cdots > X_{n:n}$ be the corresponding descending order statistics. If the data were in fact i.i.d. from a Pareto distribution, then the maximum likelihood estimator of $\alpha^{-1}$ based on the largest m + 1 order statistics is

$$\hat{\alpha}^{-1} = \frac{1}{m} \sum_{j=1}^{m} \left( \ln X_{n:j} - \ln X_{n:m+1} \right) \tag{12}$$

[...] of m where the plot appears horizontal for an extended segment. See [7, 37] for other procedures for selecting m. There is the typical bias versus variance trade-off, with larger m producing smaller variance but larger bias. Figure 3 contains graphs of the Hill estimate of α as a function of m for the two simulated series in Figure 2 and the exchange rate and log-return data of Figure 1. In all cases, one can see a range of m for which the graph of $\hat{\alpha}$ is relatively flat. Using this segment as an estimate of α, we would estimate the index as approximately 3 for the two simulated series, approximately 3 for the exchange rate data, and around 3.5 for the stock price data. (The value of α for the two simulated series is indeed 3.) Also displayed on the plots are 95% confidence intervals for α, assuming the data are i.i.d. As suggested by these plots, the return data appear to have quite heavy tails.

Estimation of the Spectral Distribution

Using property (3), a naive estimate of the distribution of Θ is based on the angular components $X_t/|X_t|$ in the sample. One simply uses the empirical distribution of these angular pieces for which the modulus $|X_t|$ exceeds some large threshold. More details can be found in [37]. For the scatter plots in Figure 2, we produced in Figure 4 kernel density estimates of the spectral density function for the random variable Θ on (−π, π]. In the graph for the i.i.d. data, one can see large spikes at the values θ = −π, −π/2, 0, π/2, π corresponding to the coordinate axes (the values at −π and π should be grouped together). On the other hand, for the AR(1) process, the density estimate puts large mass at θ = arctan(0.9) and θ = arctan(0.9) − π, corresponding to the line with slope 0.9 in the first and third quadrants, respectively. Since there are only a few points on the vertical axis, the density estimate does not register much mass at 0 and π.

Regular Variation for GARCH and SV Processes

GARCH Processes
[Figure 3 plots: four panels (a) to (d), each showing the Hill estimate (vertical axis, labeled "Hill", ticks 1 to 4) against m (horizontal axis, 0 to 150)]
Figure 3 Hill plots for tail index: (a) i.i.d. data in Figure 2; (b) AR(1) process in Figure 2; (c) log-returns for US/pound
exchange rate; and (d) log-returns for Merck stock, January 2, 2003 to April 28, 2006
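The Hill estimate of equation (12) and the values behind a Hill plot can be sketched as follows. The function names `hill_estimate` and `hill_plot`, the Pareto test data, and the normal-approximation band $\hat{\alpha} \pm 1.96\,\hat{\alpha}/\sqrt{m}$ (valid under an i.i.d. assumption) are choices made for this illustration; the article does not specify how its plotted intervals were computed.

```python
import math
import random


def hill_estimate(data, m):
    """Hill estimate of alpha from the m + 1 largest (positive) observations."""
    order = sorted(data, reverse=True)       # X_{n:1} >= ... >= X_{n:n}
    if not 1 <= m < len(order):
        raise ValueError("m must satisfy 1 <= m < len(data)")
    log_ref = math.log(order[m])             # ln X_{n:m+1}
    inv_alpha = sum(math.log(x) - log_ref for x in order[:m]) / m
    return 1.0 / inv_alpha


def hill_plot(data, m_max):
    """Return rows (m, alpha_hat, lower, upper) for m = 1, ..., m_max.

    The band is the i.i.d. normal approximation alpha_hat +/- 1.96 * alpha_hat / sqrt(m).
    """
    logs = [math.log(x) for x in sorted(data, reverse=True)]
    if not 1 <= m_max < len(logs):
        raise ValueError("m_max must satisfy 1 <= m_max < len(data)")
    rows, running = [], 0.0
    for m in range(1, m_max + 1):
        running += logs[m - 1]               # running sum of the m largest logs
        alpha_hat = 1.0 / (running / m - logs[m])
        half = 1.96 * alpha_hat / math.sqrt(m)
        rows.append((m, alpha_hat, alpha_hat - half, alpha_hat + half))
    return rows


rng = random.Random(7)
pareto_sample = [rng.paretovariate(3.0) for _ in range(20_000)]  # exact Pareto, alpha = 3
for m, alpha_hat, lower, upper in hill_plot(pareto_sample, 150)[49::50]:
    print(f"m = {m:3d}   alpha_hat = {alpha_hat:.2f}   band = ({lower:.2f}, {upper:.2f})")
```

For real return data one would apply the estimator to absolute values (or to one tail at a time) and look for a range of m over which the estimates are roughly flat, as described in the text.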
and Bollerslev [20] are perhaps the most popular models for financial time series (see GARCH Models). Although there are many variations of the GARCH process, we focus on the traditional version. We say that $\{X_t\}$ is a GARCH(p, q) process if it is a strictly stationary solution of the equations:

$$X_t = \sigma_t Z_t, \qquad \sigma_t^2 = \alpha_0 + \sum_{i=1}^{p} \alpha_i X_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2, \quad t \in \mathbb{Z} \tag{13}$$

where the noise or innovations sequence $(Z_t)_{t\in\mathbb{Z}}$ is an i.i.d. sequence with mean zero and unit variance. It is usually assumed that all coefficients $\alpha_i$ and $\beta_j$ are nonnegative, with $\alpha_0 > 0$. For identification purposes, the variance of the noise is assumed to be 1 since otherwise its standard deviation can be absorbed into $\sigma_t$. $(\sigma_t)$ is referred to as the volatility sequence of the GARCH process.

The parameters are typically chosen to ensure that a causal and strictly stationary solution to the equations (13) exists. This means that $X_t$ has a representation as a measurable function of the past and present noise values $Z_s$, $s \le t$. The necessary and sufficient conditions for the existence and uniqueness of a stationary ergodic solution to equation (13) are
[Figure 4 plots: panels (a) and (b), density estimates against θ (horizontal axis, −3 to 3)]
Figure 4 The estimation of the spectral density function for i.i.d. components (a) and for the AR(1) process (b) from
Figure 2
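The naive spectral estimate, which keeps only the angles of observations whose modulus exceeds a large threshold, can be sketched for pairs of consecutive AR(1) observations. To stay within the standard library, the sketch swaps in symmetrized Pareto noise with tail index 1.8 in place of the symmetric stable noise used for Figure 2; that substitution, the top-2% threshold, the tolerance of 0.15 radians, and all function names are assumptions of this example.

```python
import math
import random


def simulate_ar1(n, phi, alpha, rng, burn_in=500):
    """AR(1) path X_t = phi * X_{t-1} + Z_t with symmetrized Pareto(alpha) noise."""
    x, path = 0.0, []
    for t in range(n + burn_in):
        z = rng.paretovariate(alpha) * rng.choice((-1.0, 1.0))
        x = phi * x + z
        if t >= burn_in:
            path.append(x)
    return path


def extreme_angles(path, top_fraction=0.02):
    """Angles of the pairs (X_t, X_{t+1}) with the largest Euclidean norm."""
    pairs = sorted(zip(path[:-1], path[1:]),
                   key=lambda p: math.hypot(p[0], p[1]), reverse=True)
    k = max(1, int(top_fraction * len(pairs)))
    return [math.atan2(x_next, x_now) for x_now, x_next in pairs[:k]]


def mass_near_line(angles, slope, tol=0.15):
    """Share of angles within tol radians of the line through the origin with this slope."""
    target = math.atan(slope)
    hits = 0
    for theta in angles:
        # Compare modulo pi so the first and third quadrants are grouped together.
        d = (theta - target) % math.pi
        if min(d, math.pi - d) < tol:
            hits += 1
    return hits / len(angles)


rng = random.Random(7)
path = simulate_ar1(10_000, phi=0.9, alpha=1.8, rng=rng)
angles = extreme_angles(path)
print(f"share of extreme angles near slope 0.9: {mass_near_line(angles, 0.9):.2f}")
```

A histogram or kernel density estimate of `angles` would play the role of the plots in Figure 4; the summary share printed here is just a compact check that most of the extreme pairs line up with slope 0.9.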
given in [35] for the GARCH(1, 1) case and for the general GARCH(p, q) case in [4]; see [30] for a summary of the key properties of a GARCH process. In some cases, one only assumes weak stationarity, in which case the conditions on the parameters reduce substantially. A GARCH process is weakly stationary if and only if

$$\alpha_0 > 0 \quad\text{and}\quad \sum_{j=1}^{p} \alpha_j + \sum_{j=1}^{q} \beta_j < 1 \tag{14}$$

To derive properties of the tail of the finite-dimensional distributions of a GARCH process, including the marginal distribution, it is convenient to embed the squares $X_t^2$ and $\sigma_t^2$ in a stochastic recurrence equation (SRE). This embedding can be used to derive other key properties of the process beyond the finite-dimensional distributions. For example, conditions for stationarity and β-mixing can be established from the properties of SREs and general theory of Markov chains. Here, we focus on the tail behavior.

One builds an SRE by including the volatility process in the state vector. An SRE takes the form

$$\mathbf{Y}_t = \mathbf{A}_t \mathbf{Y}_{t-1} + \mathbf{B}_t \tag{15}$$

where $\mathbf{Y}_t$ is an m-dimensional random vector, $\mathbf{A}_t$ is an m × m random matrix, $\mathbf{B}_t$ is a random vector, and $\{(\mathbf{A}_t, \mathbf{B}_t)\}$ is an i.i.d. sequence. Under suitable conditions on the coefficient matrices and error matrices, one can derive various properties about the Markov chain $\mathbf{Y}_t$. For example, iteration of equation (15) yields a unique stationary and causal solution:

$$\mathbf{Y}_t = \mathbf{B}_t + \sum_{i=1}^{\infty} \mathbf{A}_t \cdots \mathbf{A}_{t-i+1} \mathbf{B}_{t-i}, \quad t \in \mathbb{Z} \tag{16}$$

To ensure almost sure (a.s.) convergence of the infinite series in equation (16), and hence the existence of a unique strictly stationary solution to equation (15), it is assumed that the top Lyapunov exponent given by

$$\gamma = \inf_{n \ge 1} n^{-1} E \log \left\| \mathbf{A}_n \cdots \mathbf{A}_1 \right\| \tag{17}$$

is negative, where $\|\cdot\|$ is the operator norm corresponding to a given norm in $\mathbb{R}^m$.

Now, the GARCH process, at least its squares, can be embedded into an SRE by choosing

[...]
where, as required, $\{(\mathbf{A}_t, \mathbf{B}_t)\}$ is an i.i.d. sequence. The top row in the SRE for the GARCH specification follows directly from the definition of the squared volatility process $\sigma_{t+1}^2$ and the property that $X_t = \sigma_t Z_t$.

In general, the top Lyapunov coefficient γ for the GARCH SRE cannot be calculated explicitly. However, a sufficient condition for γ < 0 is given as

$$\sum_{i=1}^{p} \alpha_i + \sum_{j=1}^{q} \beta_j < 1 \tag{19}$$

see p. 122 [4]. It turns out that this condition is also necessary and sufficient for the existence of a weakly stationary solution to the GARCH recursions. The solution will also be strictly stationary in this case.

It has been noted that for many financial time series, the GARCH(1,1) often provides an adequate model or is at least a good starter model. This is one of the few models where the Lyapunov coefficient can be computed explicitly. In this case, the SRE equation essentially collapses to the one-dimensional SRE given as

$$\sigma_{t+1}^2 = \alpha_0 + (\alpha_1 Z_t^2 + \beta_1)\,\sigma_t^2 = A_t \sigma_t^2 + \alpha_0 \tag{20}$$

where $A_t = \alpha_1 Z_t^2 + \beta_1$. The elements in the second row in the vector and matrix components of equation (18) play no role in this case. Hence,

[...]

The conditions [35], $E \log(\alpha_1 Z^2 + \beta_1) < 0$ and $\alpha_0 > 0$, are necessary and sufficient for the existence of a stationary causal nondegenerate solution to the GARCH(1,1) equations.

Once the squares and volatility sequence, $X_t^2$ and $\sigma_t^2$, respectively, are embedded in an SRE, then one can apply classical theory for SREs as developed by Kesten [28] (see also [22]), and extended by Basrak et al. [2], to establish regular variation of the tails of $X_t^2$ and $\sigma_t^2$. The following result by Basrak et al. [1] summarizes the key results applied to a GARCH process.

Theorem 1 Consider the process $(\mathbf{Y}_t)$ in equation (18) obtained from embedding a stationary GARCH process into the SRE (18). Assume that Z has a positive density on $\mathbb{R}$ such that $E(|Z|^h) < \infty$ for $h < h_0$ and $E(|Z|^{h_0}) = \infty$ for some $h_0 \in (0, \infty]$. Then with $\mathbf{Y} = \mathbf{Y}_1$, there exist α > 0, a constant c > 0, and a random vector Θ on the unit sphere $\mathbb{S}^{p+q-2}$ such that

$$x^{\alpha/2} P(|\mathbf{Y}| > x) \to c \quad\text{as } x \to \infty \tag{22}$$

and for every t > 0

$$\frac{P(|\mathbf{Y}| > tx,\ \mathbf{Y}/|\mathbf{Y}| \in \cdot)}{P(|\mathbf{Y}| > x)} \xrightarrow{w} t^{-\alpha/2} P(\Theta \in \cdot) \quad\text{as } x \to \infty \tag{23}$$
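For the GARCH(1,1) case, the strict stationarity condition $E \log(\alpha_1 Z^2 + \beta_1) < 0$ and the moment equation $E(\alpha_1 Z^2 + \beta_1)^{\alpha/2} = 1$ that characterizes the tail index can both be evaluated numerically. The sketch below assumes standard Gaussian noise Z and uses a midpoint-rule quadrature followed by bisection; these numerical choices and the function names are illustrative assumptions, not the method used in the cited references.

```python
import math


def gaussian_expectation(f, half_width=12.0, steps=4800):
    """Approximate E f(Z) for Z ~ N(0, 1) by the midpoint rule on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        z = -half_width + (i + 0.5) * h      # midpoints never hit z = 0 exactly
        total += f(z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


def lyapunov_exponent(alpha1, beta1):
    """E log(alpha1 * Z^2 + beta1); a negative value gives a stationary GARCH(1,1)."""
    return gaussian_expectation(lambda z: math.log(alpha1 * z * z + beta1))


def tail_index(alpha1, beta1, tol=1e-9):
    """Solve E (alpha1 * Z^2 + beta1)^(alpha / 2) = 1 for alpha > 0 by bisection."""
    if lyapunov_exponent(alpha1, beta1) >= 0.0:
        raise ValueError("E log(alpha1 Z^2 + beta1) must be negative")

    def excess(kappa):
        return gaussian_expectation(lambda z: (alpha1 * z * z + beta1) ** kappa) - 1.0

    # excess(0) = 0 and excess is convex in kappa, so it is negative just above 0
    # and crosses zero once more at kappa = alpha / 2.
    lo, hi = 1e-3, 1.0
    while excess(hi) <= 0.0:                 # expand the bracket until the sign changes
        lo, hi = hi, 2.0 * hi
        if hi > 64.0:
            raise ValueError("no root found below alpha = 128")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return lo + hi                           # alpha = 2 * kappa with kappa = (lo + hi) / 2


for a1, b1 in ((0.1, 0.9), (0.1, 0.85)):     # hypothetical parameter choices
    print(a1, b1, round(lyapunov_exponent(a1, b1), 4), round(tail_index(a1, b1), 3))
```

For the integrated case with α1 + β1 = 1 the solver should return a value very close to 2, while parameters with α1 + β1 < 1 give an index above 2.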
[...] where $Z_1^{\pm}$ are the respective positive and negative parts of $Z_1$. With the exception of simple models such as the GARCH(1,1), there is no explicit formula for the index α of regular variation of the marginal distribution. In principle, α could be estimated from the data using a Hill style estimator, but an enormous sample size would be required in order to obtain a precise estimate of the index.

In the GARCH(1,1) case, α is found by solving the following equation:

$$E\,(\alpha_1 Z^2 + \beta_1)^{\alpha/2} = 1 \tag{27}$$

This equation can be solved for α by numerical and/or simulation methods for fixed values of $\alpha_1$ and $\beta_1$ from the stationarity region of a GARCH(1,1) process and assuming a concrete density for Z. (See [12] for a table of values of α for various choices of $\alpha_1$ and $\beta_1$.) Note that in the case of an integrated GARCH (IGARCH) process, where $\alpha_1 + \beta_1 = 1$, we have α = 2. This holds regardless of the distribution of $Z_1$, provided it has a finite variance. Since the marginal distribution of an IGARCH process has Pareto-like tails with index 2, the variance is infinite.

While equations (25) and (26) describe only the regular variation of the marginal distribution, it is also true that the finite-dimensional distributions are regularly varying. To see this in the GARCH(1,1) case, we note that the volatility process is given as

$$\sigma_{t+1}^2 = (\alpha_1 Z_t^2 + \beta_1)\,\sigma_t^2 + \alpha_0 \tag{28}$$

[...]

$$\lim_{x\to\infty} \frac{P(|\mathbf{U}_m| > x,\ \mathbf{U}_m/|\mathbf{U}_m| \in A)}{P(|\mathbf{U}_m| > x)} = \lim_{x\to\infty} \frac{P(|\mathbf{F}_m|\,\sigma_1^2 > x,\ \mathbf{F}_m/|\mathbf{F}_m| \in A)}{P(|\mathbf{F}_m|\,\sigma_1^2 > x)} = \frac{E\left[\,|\mathbf{F}_m|^{\alpha/2}\, I_A(\mathbf{F}_m/|\mathbf{F}_m|)\right]}{E|\mathbf{F}_m|^{\alpha/2}} \tag{31}$$

It follows that the finite-dimensional distributions of a GARCH process are regularly varying.

Stochastic Volatility Processes

The SV process also starts with the multiplicative model (13)

$$X_t = \sigma_t Z_t \tag{32}$$

with $(Z_t)$ being an i.i.d. sequence of random variables. If $\mathrm{var}(Z_t) < \infty$, then it is conventional to assume that $Z_t$ has mean 0 and variance 1. Unlike the GARCH process, the volatility process $(\sigma_t)$ for SV processes is assumed to be independent of the sequence $(Z_t)$. Often, one assumes that $\log \sigma_t^2$ is a linear Gaussian process given by

$$\log \sigma_t^2 = Y_t = \mu + \sum_{j=0}^{\infty} \psi_j \eta_{t-j} \tag{33}$$

where $(\psi_j)$ is a sequence of square summable coefficients and $(\eta_t)$ is a sequence of i.i.d. $N(0, \sigma^2)$ random variables independent of $(Z_t)$. If $\mathrm{var}(Z_t)$ is
finite and equal to 1, then the SV process $X_t = \sigma_t Z_t = \exp\{Y_t/2\}\, Z_t$ is white noise with mean 0 and variance $\exp\{\mu + \sigma^2 \sum_{j=0}^{\infty} \psi_j^2/2\}$. One advantage of such processes is that one can explicitly compute the autocovariance function (ACVF) of any power of $X_t$ and its absolute values. For example, the ACVF of the squares of $(X_t)$ is, for h > 0, given as

$$\begin{aligned}
\gamma_{|X|^2}(h) &= E(\exp\{Y_0 + Y_h\}) - (E \exp\{Y_0\})^2 \\
&= \exp\Big\{2\mu + \sigma^2 \sum_{i=0}^{\infty} \psi_i^2\Big\} \Big( \exp\Big\{\sigma^2 \sum_{i=0}^{\infty} \psi_i \psi_{i+h}\Big\} - 1 \Big) \\
&= e^{2\mu} e^{\gamma_Y(0)} \big( e^{\gamma_Y(h)} - 1 \big)
\end{aligned} \tag{34}$$

Note that as h → ∞,

$$\gamma_{|X|^2}(h) \sim e^{2\mu} e^{\gamma_Y(0)} \big( e^{\gamma_Y(h)} - 1 \big) \sim e^{2\mu} e^{\gamma_Y(0)} \gamma_Y(h) \tag{35}$$

so that the ACVF of the SV for the squares converges to zero at the same rate as the log-volatility process.

If $Z_t$ has a Gaussian distribution, then the tail of $X_t$ remains light although a bit heavier than a Gaussian [3]. This is in contrast to the GARCH case where an i.i.d. Gaussian input leads to heavy-tailed marginals of the process. On the other hand, for SV processes, if the $Z_t$ have heavy tails, for example, if $Z_t$ has a t-distribution, then Davis and Mikosch [10] show that $X_t$ is regularly varying. Furthermore, in this case, any finite collection of $X_t$'s has the same limiting joint tail behavior as an i.i.d. sequence with regularly varying marginals. Specifically, the two random vectors, $(X_1, \ldots, X_k)$ and $(E|\sigma_1|^{\alpha})^{1/\alpha} (Z_1, \ldots, Z_k)$ have the same joint tail behavior.

Limit Theory GARCH and SV Processes

If $(X_t)$ is a stationary sequence of random variables with common distribution function F, then often one can directly relate the limiting distribution of the maxima, $M_n = \max\{X_1, \ldots, X_n\}$ to F. Assuming that $X_1$ is regularly varying with index −α and choosing the sequence $(a_n)$ such that $n(1 - F(a_n)) \to 1$, then

$$F^n(a_n x) \to G(x) = \begin{cases} 0, & x \le 0 \\ e^{-x^{-\alpha}}, & x > 0 \end{cases} \tag{36}$$

This relation is equivalent to convergence in distribution of the maxima of the associated independent sequence $(\hat{X}_t)$ (i.e., the sequence $(\hat{X}_t)$ is i.i.d. with common distribution function F) normalized by $a_n$ to the Fréchet distribution G. Specifically, if $\hat{M}_n = \max\{\hat{X}_1, \ldots, \hat{X}_n\}$, then

$$P(a_n^{-1} \hat{M}_n \le x) \to G(x) \tag{37}$$

Under mild mixing conditions on the sequence $(X_t)$ [29], we have

$$P(a_n^{-1} M_n \le x) \to H(x) \tag{38}$$

with H a nondegenerate distribution function if and only if

$$H(x) = G^{\theta}(x) \tag{39}$$

for some θ ∈ (0, 1]. The parameter θ is called the extremal index and can be viewed as a sample size adjustment for the maxima of the dependent sequence due to clustering of the extremes. The case θ = 1 corresponds to no clustering, in which case the limiting behavior of $M_n$ and $\hat{M}_n$ is identical. In case θ < 1, $M_n$ behaves asymptotically like the maximum of nθ independent observations. The reciprocal of the extremal index 1/θ of a stationary sequence $(X_t)$ also has the interpretation as the expected size of clusters of high-level exceedances in the sequence.

There are various sufficient conditions for ensuring that θ = 1. Perhaps the most common anticlustering condition is D′ [28], which has the following form:

$$\limsup_{n\to\infty}\ n \sum_{t=2}^{[n/k]} P(X_1 > a_n x,\ X_t > a_n x) = O(1/k)$$

as k → ∞. Hence, if the stationary process $(X_t)$ satisfies a mixing condition and D′, then

$$P(a_n^{-1} M_n \le x) \to G(x) \tag{41}$$
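The Fréchet limit for normalized maxima of an i.i.d. regularly varying sequence can be checked by simulation. The sketch below uses exact Pareto variables, for which $1 - F(x) = x^{-\alpha}$ on x ≥ 1 and hence $a_n = n^{1/\alpha}$ satisfies $n(1 - F(a_n)) = 1$; the sample sizes, seed, and function names are assumptions of this example, and it covers only the independent case (θ = 1).

```python
import math
import random


def frechet_cdf(x, alpha):
    """Frechet distribution function G(x) = exp(-x^(-alpha)) for x > 0, else 0."""
    return math.exp(-(x ** (-alpha))) if x > 0 else 0.0


def normalized_maxima(n, alpha, replications, rng):
    """Simulate a_n^{-1} * max(X_1, ..., X_n) for i.i.d. Pareto(alpha) variables."""
    a_n = n ** (1.0 / alpha)                 # n * (1 - F(a_n)) = 1 for the Pareto law
    return [max(rng.paretovariate(alpha) for _ in range(n)) / a_n
            for _ in range(replications)]


def empirical_cdf(sample, x):
    """Fraction of sample values that are <= x."""
    return sum(value <= x for value in sample) / len(sample)


rng = random.Random(99)
alpha = 3.0
maxima = normalized_maxima(n=500, alpha=alpha, replications=2000, rng=rng)
for x in (0.75, 1.0, 2.0):
    print(f"x = {x}: empirical {empirical_cdf(maxima, x):.3f}   "
          f"Frechet {frechet_cdf(x, alpha):.3f}")
```

For a dependent sequence with extremal index θ < 1, the same experiment would be expected to match $G^{\theta}$ rather than G.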
Returning to the GARCH setting, we assume that the conditions of Theorem 1 are satisfied. Then we know that $P(|X| > x) \sim c_1 x^{-\alpha}$ for some α, $c_1 > 0$, and we can even specify the value of α in the GARCH(1, 1) case by solving equation (27). Now choosing $a_n = n^{1/\alpha} c_1^{1/\alpha}$, we have $nP(|X_1| > a_n) \to 1$ and defining $M_n = \max\{|X_1|, \ldots, |X_n|\}$, we obtain

[...]

[...] regularly varying tails with index −α. Choosing the sequence $a_n$ satisfying $n(1 - F(a_n)) \to 1$, we have

$$nP(\hat{X}_1 > a_n x) \to x^{-\alpha} \tag{44}$$

as n → ∞. Now equation (44) can be strengthened to the statement

[...]
As an application of equation (49), define $\hat{M}_{n,k}$ to be the kth largest among $\hat{X}_1, \ldots, \hat{X}_n$. For y ≤ x, the event $\{a_n^{-1} \hat{M}_n \le x,\ a_n^{-1} \hat{M}_{n,k} \le y\} = \{\hat{N}_n(x, \infty) = 0,\ \hat{N}_n(y, x] \le k - 1\}$ and hence

$$\begin{aligned}
P(a_n^{-1} \hat{M}_n \le x,\ a_n^{-1} \hat{M}_{n,k} \le y) &= P(\hat{N}_n(x, \infty) = 0,\ \hat{N}_n(y, x] \le k - 1) \\
&\to P(N(x, \infty) = 0,\ N(y, x] \le k - 1) \\
&= e^{-y^{-\alpha}} \sum_{j=0}^{k-1} (y^{-\alpha} - x^{-\alpha})^{j} / j!
\end{aligned} \tag{51}$$

As a second application of the limiting Poisson convergence in equation (49), the limiting Poisson process $\hat{N}$ has points located at $\Gamma_k^{-1/\alpha}$, where $\Gamma_k = E_1 + \cdots + E_k$ is the sum of k i.i.d. unit exponentially distributed random variables. Then if α ≥ 1, the result is more complicated; if α < 1, we obtain the convergence of partial sums:

$$a_n^{-1} \sum_{t=1}^{n} \hat{X}_t \xrightarrow{d} \sum_{j=1}^{\infty} \Gamma_j^{-1/\alpha} \tag{52}$$

In other words, the sum of the points of the point process $N_n$ converges in distribution to the sum of points in the limiting Poisson process.

For a stationary time series $(X_t)$ with heavy tails that satisfies a suitable mixing condition, such as strong mixing, and the anticlustering condition D′, the convergence in equation (49) remains valid, as well as the limit in equation (52), at least for positive random variables. For example, this is the case for SV processes. If the condition D′ is [...]

The Behavior of the Sample Autocovariance and Autocorrelation Functions

The ACF is one of the principal tools used in classical time series modeling. For a stationary Gaussian process, the dependence structure of the process is completely determined by the ACF. The ACF also conveys important dependence information for linear processes. To some extent, the dependence governed by a linear filter can be fully recovered from the ACF. For the time series consisting of financial returns, the data are uncorrelated, so the value of the ACF is substantially diminished. Nevertheless, the ACF of other functions of the process such as the squares and absolute values can still convey useful information about the nature of the nonlinearity in the time series. For example, slow decay of the ACF of the squares is consistent with the volatility clustering present in the data. For a stationary time series $(X_t)$, the ACVF and ACF are defined as

$$\gamma_X(h) = \mathrm{cov}(X_0, X_h) \quad\text{and}\quad \rho_X(h) = \mathrm{corr}(X_0, X_h) = \frac{\gamma_X(h)}{\gamma_X(0)}, \quad h \ge 0 \tag{53}$$

respectively. Now for observations $X_1, \ldots, X_n$ from the stationary time series, the ACVF and ACF are estimated by their sample counterparts, namely, by

$$\hat{\gamma}_X(h) = \frac{1}{n} \sum_{t=1}^{n-h} (X_t - \bar{X}_n)(X_{t+h} - \bar{X}_n) \tag{54}$$

and

[...]
[...] stationary process consisting of products $(X_t X_{t+h})$. The first such results were established by Davis and Resnick [14–16] in a linear process setting. Extensions by Davis and Hsing [8] and Davis and Mikosch [9] allowed one to consider more general time series models beyond those linear. The main idea is to consider a point process $N_n$ based on products of the form $X_t X_{t+h}/a_n^2$. After establishing convergence of this point process, in many cases one can apply the continuous mapping theorem to show that the sum of the points that comprise $N_n$ converges in distribution to the sum of the points that make up the limiting point process. Although the basic idea for establishing these results is rather straightforward, the details are slightly complex. These ideas have been applied to the case of GARCH processes in [1] and to SV processes in [10], which are summarized below.

The GARCH Case

The scaling in the limiting distribution for the sample ACF depends on the index of regular variation α specified in Theorem 1. We summarize the results for the various cases of α.

1. If α ∈ (0, 2), then $\hat{\rho}_X(h)$ and $\hat{\rho}_{|X|}(h)$ have nondegenerate limit distributions. The same statement holds for $\hat{\rho}_{X^2}(h)$ when α ∈ (0, 4).
2. If α ∈ (2, 4), then both $\hat{\rho}_X(h)$, $\hat{\rho}_{|X|}(h)$ converge in probability to their deterministic counterparts $\rho_X(h)$, $\rho_{|X|}(h)$, respectively, at the rate $n^{1-2/\alpha}$ and the limit distribution is a complex function of non-Gaussian stable random variables.
3. If α ∈ (4, 8), then

$$n^{1-4/(2\alpha)} \big( \hat{\rho}_{X^2}(h) - \rho_{X^2}(h) \big) \xrightarrow{d} S_{\alpha/2}(h) \tag{56}$$

[...] value statistics as described in the second section, indicating that log-return series might not have a finite fourth or fifth moment (see end note c), and then the limit results above would show that the usual confidence bands for the sample ACF based on the central limit theorem and the corresponding $\sqrt{n}$-rates are far too optimistic in this case.

The Stochastic Volatility Case

For a more direct comparison with the GARCH process, we choose a distribution for the noise process that matches the power law tail of the GARCH with index α. Then

$$\left(\frac{n}{\ln n}\right)^{1/\alpha} \hat{\rho}_X(h) \quad\text{and}\quad \left(\frac{n}{\ln n}\right)^{1/(2\alpha)} \hat{\rho}_{X^2}(h) \tag{57}$$

converge in distribution for α ∈ (0, 2) and α ∈ (0, 4), respectively. This illustrates the excellent large sample behavior of the sample ACF for SV models even if $\rho_X$ and $\rho_{X^2}$ are not defined [11, 13]. Thus, even if $\mathrm{var}(Z_t) = \infty$ or $EZ_t^4 = \infty$, the estimates $\hat{\rho}_X(h)$ and $\hat{\rho}_{X^2}(h)$, respectively, converge to zero at a rapid rate. This is in marked contrast with the situation for GARCH processes, where under similar conditions on the marginal distribution, the respective sample ACFs converge in distribution to random variables without any scaling.

End Notes

a. Basrak et al. [1] proved this result under the condition that α/2 is not an even integer. Boman and Lindskog [5] removed this condition.
b. Here bounded means bounded away from zero.
c. See, for example, [18], Chapter 6, and [33].
[5] Boman, J. & Lindskog, F. (2007). Support Theorems for the Radon Transform and Cramér-Wold Theorems. Technical report, KTH, Stockholm.
[6] Breiman, L. (1965). On some limit theorems similar to the arc-sin law, Theory of Probability and Its Applications 10, 323–331.
[7] Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values, Springer, London.
[8] Davis, R.A. & Hsing, T. (1995). Point process and partial sum convergence for weakly dependent random variables with infinite variance, Annals of Probability 23, 879–917.
[9] Davis, R.A. & Mikosch, T. (1998). The sample autocorrelations of heavy-tailed processes with applications to ARCH, Annals of Statistics 26, 2049–2080.
[10] Davis, R.A. & Mikosch, T. (2001). Point process convergence of stochastic volatility processes with application to sample autocorrelation, Journal of Applied Probability 38A, 93–104.
[11] Davis, R.A. & Mikosch, T. (2001). The sample autocorrelations of financial time series models, in W.J. Fitzgerald, R.L. Smith, A.T. Walden & P.C. Young, (eds), Nonlinear and Nonstationary Signal Processing, Cambridge University Press, Cambridge, pp. 247–274.
[12] Davis, R.A. & Mikosch, T. (2009). Extreme value theory for GARCH processes, in T. Andersen, R.A. Davis, J.-P. Kreiss & T. Mikosch, (eds), Handbook of Financial Time Series, Springer, New York, pp. 187–200.
[13] Davis, R.A. & Mikosch, T. (2009). Probabilistic properties of stochastic volatility models, in T. Andersen, R.A. Davis, J.-P. Kreiss & T. Mikosch, (eds), Handbook of Financial Time Series, Springer, New York, pp. 255–267.
[14] Davis, R.A. & Resnick, S.I. (1985). Limit theory for moving averages of random variables with regularly varying tail probabilities, Annals of Probability 13, 179–195.
[15] Davis, R.A. & Resnick, S.I. (1985). More limit theory for the sample correlation function of moving averages, Stochastic Processes and Their Applications 20, 257–279.
[16] Davis, R.A. & Resnick, S.I. (1986). Limit theory for the sample covariance and correlation functions of moving averages, Annals of Statistics 14, 533–558.
[17] Doukhan, P. (1994). Mixing Properties and Examples, Lecture Notes in Statistics, Vol. 85, Springer Verlag, New York.
[18] Embrechts, P., Klüppelberg, C. & Mikosch, T. (1997). Modelling Extremal Events for Insurance and Finance, Springer, Berlin.
[19] Engle, R.F. (1982). Autoregressive conditional heteroscedastic models with estimates of the variance of United Kingdom inflation, Econometrica 50, 987–1007.
[20] Engle, R.F. & Bollerslev, T. (1986). Modelling the persistence of conditional variances. With comments and a reply by the authors, Econometric Reviews 5, 1–87.
[21] Fama, E.F. (1965). The behaviour of stock market prices, Journal of Business 38, 34–105.
[22] Goldie, C.M. (1991). Implicit renewal theory and tails of solutions of random equations, Annals of Applied Probability 1, 126–166.
[23] Haan, L. & Ferreira, A. (2006). Extreme Value Theory: An Introduction, Springer, New York.
[24] Haan, L. & Resnick, S.I. (1977). Limit theory for multivariate sample extremes, Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 40, 317–337.
[25] Haan, L. de, Resnick, S.I., Rootzén, H. & Vries, C.G. de (1989). Extremal behaviour of solutions to a stochastic difference equation with applications to ARCH processes, Stochastic Processes and Their Applications 32, 213–224.
[26] Ibragimov, I.A. & Linnik, Yu.V. (1971). Independent and Stationary Sequences of Random Variables, Wolters-Noordhoff, Groningen.
[27] Kallenberg, O. (1983). Random Measures, 3rd edition, Akademie-Verlag, Berlin.
[28] Kesten, H. (1973). Random difference equations and renewal theory for products of random matrices, Acta Mathematica 131, 207–248.
[29] Leadbetter, M.R., Lindgren, G. & Rootzén, H. (1983). Extremes and Related Properties of Random Sequences and Processes, Springer, New York.
[30] Linder, A. (2009). Stationarity, mixing, distributional properties and moments of GARCH(p,q) processes, in T. Andersen, R.A. Davis, J.-P. Kreiss & T. Mikosch, (eds), Handbook of Financial Time Series, Springer, New York.
[31] Mandelbrot, B. (1963). The variation of certain speculative prices, Journal of Business 36, 394–419.
[32] Mandelbrot, B. & Taylor, H. (1967). On the distribution of stock price differences, Operations Research 15, 1057–1062.
[33] Mikosch, T. (2003). Modelling dependence and tails of financial time series, in B. Finkenstadt & H. Rootzen, (eds), Extreme Values in Finance, Telecommunications and the Environment, Chapman & Hall, pp. 185–286.
[34] Mikosch, T. & Stărică, C. (2000). Limit theory for the sample autocorrelations and extremes of a GARCH(1,1) process, Annals of Statistics 28, 1427–1451.
[35] Nelson, D.B. (1990). Stationarity and persistence in the GARCH(1,1) model, Econometric Theory 6, 318–334.
[36] Resnick, S.I. (1987). Extreme Values, Regular Variation, and Point Processes, Springer, New York.
[37] Resnick, S.I. (2007). Heavy Tail Phenomena; Probabilistic and Statistical Modeling, Springer, New York.

Further Reading

Resnick, S.I. (1986). Point processes, regular variation and weak convergence, Advances in Applied Probability 18, 66–138.
Taylor, S.J. (1986). Modelling Financial Time Series, Wiley, Chichester.
Analytic Solutions of the Filtering Problem which represents the prediction step, and
filter where, combining equations (8) and (9), this equation (14) becomes (on replacing L by Q)
latter representation is given by
N
at each step the relevant distribution (predictive and filter distribution, respectively) is approximated by a discrete probability measure supported by a finite number of points. These approaches vary mainly in the updating step.

A simple version of a particle filter is as follows (see [3]): in the generic period t − 1 approximate $p(x_{t-1} \mid y_0^{t-1})$ by a discrete distribution $((x_{t-1}^1, p_{t-1}^1), \ldots, (x_{t-1}^L, p_{t-1}^L))$ where $p_{t-1}^i$ is the probability that $x_{t-1} = x_{t-1}^i$. Consider each location $x_{t-1}^i$ as the position of a “particle”.

1. Prediction step
Propagate each of the particles $x_{t-1}^i \to \hat{x}_t^i$ over one time period, using the given (discrete time) evolution dynamics of $x_t$: referring to the model in equation (2) just simulate independent trajectories of $x_t$ starting from the various $x_{t-1}^i$. This leads to an approximation of $p(x_t \mid y_0^{t-1})$ by the discrete distribution $((\hat{x}_t^1, \hat{p}_t^1), \ldots, (\hat{x}_t^L, \hat{p}_t^L))$ where one puts $\hat{p}_t^i = p_{t-1}^i$.

2. Updating step
Update the weights using the new observation $y_t$ by putting $p_t^i = c\, p_{t-1}^i\, p(y_t \mid \hat{x}_t^i)$ where c is the normalization constant (see the second relation in equation (5) for an analogy).

Notice that $p(y_t \mid \hat{x}_t^i)$ may be viewed as the likelihood of particle $\hat{x}_t^i$, given the observation $y_t$, so that in the updating step one weighs each particle according to its likelihood. There exist various improvements of this basic setup. There are also variants, where in the updating step each particle is made to branch into a random number of offsprings, where the mean number of offsprings is taken to be proportional to the likelihood of that position. In this latter variant, the number of particles increases and one can show that, under certain assumptions, the empirical distribution of the particles converges to the true filter distribution. There is a vast literature on particle filters, of which we mention [5] and, in particular, [1].

[...] addition, with Markovian factor processes, Markov-process techniques can be fruitfully applied. In many financial applications of factor models, the investors have only incomplete information about the actual state of the factors and this may induce model risk. In fact, even if the factors are associated with economic quantities, some of them are difficult to observe precisely. Furthermore, abstract factors without economic interpretation are often included in the specification of a model to increase its flexibility. Under incomplete information of the factors, their values have to be inferred from observable quantities and this is where filtering comes in as an appropriate tool.

Most financial problems concern pricing as well as portfolio management, in particular, hedging and portfolio optimization. While portfolio management is performed under the physical measure, for pricing, one has to use a martingale measure. Filtering problems in finance may therefore be considered under the physical or the martingale measures, or under both (see [22]). In what follows, we shall discuss filtering for pricing problems, with examples from term structure and credit risk, as well as for portfolio management. More general aspects can be found, for example, in the recent papers [6, 7], and [23].

Filtering in Pricing Problems

This section is to a large extent based on [14]. In Markovian factor models, the price of an asset at a generic time t can, under full observation of the factors, be expressed as an instantaneous function (t, xt) of time and the value of the factors. Let Gt denote the full filtration that measures all the processes of interest, and let Ft ⊂ Gt be a subfiltration representing the information of an investor. What is
an arbitrage-free price in the filtration Ft ? Assume
the asset to be priced is a European derivative with
Filtering in Finance maturity T and claim H ∈ FT . Let N be a numeraire,
adapted to the investor filtration Ft , and let QN be
There are various situations in finance where filtering
the corresponding martingale measure. One can easily
problems may arise, but one typical situation is given
by factor models. These models have proven to prove the following:
be useful for capturing the complicated nonlinear N
dynamics of real asset prices, while at the same Lemma 1 Let (t, xt ) = Nt E Q NHT | Gt be the
time being parsimonious and numerically tractable. In arbitrage-free price of the claim H under the full
Filtering 5
ˆ
information Gt and (t)
N
= Nt E Q NHT | Ft the cor- From the filtering point of view, the system (20) is
a linear-Gaussian model with xt unobserved and the
responding arbitrage-free price in the investor filtra-
observations given by (rt , yti ). We shall thus put Ft =
tion. It then follows that
σ {rs , ysi ; s ≤ t, i = 1, . . . , n}. The filter distribution
ˆ N is Gaussian and, via the Kalman filter, one can
(t) = E Q {(t, xt ) | Ft } (16)
obtain its conditional mean mt and (co)variance
t
Furthermore, if the savings account Bt = exp{ 0 t . Applying Lemma 1 and using the moment-
rs ds} with corresponding martingale measure Q is generating function of a Gaussian random variable,
Ft −adapted, then we obtain the arbitrage-free price, in the investor
filtration, of an illiquid bond with maturity T as
ˆ
(t) = E Q {(t, xt ) | Ft } (17) follows:
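Returning to the basic particle filter described earlier in this article (prediction step followed by updating step), the following Python sketch may help fix ideas. It is only an illustration under stated assumptions: the scalar linear-Gaussian model and the names `predict`, `update`, `likelihood`, `A`, `STATE_STD`, and `OBS_STD` are hypothetical and are not taken from equation (2).

```python
import math
import random

# Hypothetical scalar model, assumed only for illustration:
#   x_t = A * x_{t-1} + STATE_STD * eps_t,   y_t = x_t + OBS_STD * eta_t
A = 0.9
STATE_STD = 0.5
OBS_STD = 0.5


def predict(particles, rng):
    """Prediction step: propagate each particle one period through the dynamics.
    The weights are left unchanged here (p-hat_t^i = p_{t-1}^i)."""
    return [A * x + STATE_STD * rng.gauss(0.0, 1.0) for x in particles]


def likelihood(y, x):
    """Gaussian observation density p(y_t | x_t) of the assumed model."""
    z = (y - x) / OBS_STD
    return math.exp(-0.5 * z * z) / (OBS_STD * math.sqrt(2.0 * math.pi))


def update(weights, particles, y):
    """Updating step: p_t^i = c * p_{t-1}^i * p(y_t | x-hat_t^i), c normalizing."""
    unnormalized = [w * likelihood(y, x) for w, x in zip(weights, particles)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def filter_mean(weights, particles):
    """Mean of the discrete approximation to the filter distribution."""
    return sum(w * x for w, x in zip(weights, particles))


if __name__ == "__main__":
    rng = random.Random(0)
    n_particles = 500
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    for y in [0.3, 0.5, 0.1]:  # made-up observations
        particles = predict(particles, rng)
        weights = update(weights, particles, y)
        print(round(filter_mean(weights, particles), 3))
```

This sketch deliberately omits the branching (resampling) variants mentioned in the text; without them the weights of such a scheme tend to concentrate on a few particles as the number of periods grows.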
of $x_t$ and thus to a jump in the default intensities of the still surviving obligors. In this context, we shall consider the pricing of illiquid credit derivatives on the basis of the investor filtration supposed to be given by the default history and noisily observed prices of liquid credit derivatives.

We assume that, conditionally on $x_t$, the defaults are independent with intensities $\lambda_i(x_t)$ and that $(x_t, y_t)$ is jointly Markov. A credit derivative has the payoff linked to default events in a given reference portfolio and so one can think of it as a random variable $H \in \mathcal{F}_T^y$ with $T$ being the maturity. Its full information price at the generic $t \le T$, that is, in the filtration $\mathcal{G}_t$ that measures also $x_t$, is given by $\tilde{H}_t = E\{e^{-r(T-t)} H \mid \mathcal{G}_t\}$ where $r$ is the short rate and the expectation is under a given martingale measure $Q$. By the Markov property of $(x_t, y_t)$, one gets a representation of the form

$$\tilde{H}_t = E\{e^{-r(T-t)} H \mid \mathcal{G}_t\} := a(t, x_t, y_t) \qquad (22)$$

for a suitable $a(\cdot)$. In addition to the default history, we assume that the investor filtration also includes noisy observations of liquid credit derivatives. In view of equation (22), it is reasonable to model such observations as

$$dz_t = \gamma(t, x_t, y_t)\, dt + d\beta_t \qquad (23)$$

where the various quantities may also be column vectors, $\beta_t$ is an independent Wiener process and $\gamma(\cdot)$ is a function of the type of $a(\cdot)$ in equation (22). The investor filtration is then $\mathcal{F}_t = \mathcal{F}_t^y \vee \mathcal{F}_t^z$. The price at $t < T$ of the credit derivative in the investor filtration is now $H_t = E\{e^{-r(T-t)} H \mid \mathcal{F}_t\}$ and by Lemma 1 we have

$$H_t = E\{e^{-r(T-t)} H \mid \mathcal{F}_t\} = E\{a(t, x_t, y_t) \mid \mathcal{F}_t\} \qquad (24)$$

Again, if one knows the price $a(t, x_t, y_t)$ in $\mathcal{G}_t$, one can thus obtain the price in $\mathcal{F}_t$ by computing the right-hand side in equation (24) and for this we need the filter distribution of $x_t$ given $\mathcal{F}_t$.

To define the corresponding filtering problem, we need a more precise model for $(x_t, y_t)$ (the process $z_t$ is already given by equation (23)). Since $y_t$ is a jump process, the model cannot be one of those for which we had described an explicit analytic solution. Without entering into details, we refer to [13] (see also [14]), where a jump-diffusion model is considered that allows for common jumps between $x_t$ and $y_t$. In [13] it is shown that an arbitrarily good approximation to the filter solution can be obtained both analytically and by particle filtering.

We conclude this section with a couple of additional remarks:

1. Traditional credit risk models are either structural models or reduced-form (intensity-based) models. Example 2 belongs to the latter class. In structural models, the default of the generic obligor/firm $i$ is defined as the first passage time of the asset value $V_i(t)$ of the firm at a given (possibly stochastic) barrier $K_i(t)$, that is,

$$\tau_i = \inf\{t \ge 0 \mid V_i(t) \le K_i(t)\} \qquad (25)$$

In such a context, filtering problems may arise when either $V_i(t)$ or $K_i(t)$ or both are not exactly known/observable (see e.g., [9]).

2. Can a structural model also be seen as a reduced-form model? At first sight, this is not clear since $\tau_i$ in equation (25) is predictable, while in intensity-based models it is totally inaccessible. However, it turns out (see e.g., [16]) that, while $\tau_i$ in equation (25) is predictable with respect to the full filtration (measuring also $V_i(t)$ and $K_i(t)$), it becomes totally inaccessible in the smaller investor filtration that, say, does not measure $V_i(t)$ and, furthermore, it admits an intensity.

Filtering in Portfolio Management Problems

Rather than presenting a general treatment (for this, we refer to [21] and the references therein), we discuss here two specific examples in models with unobserved factors, one in discrete time and one in continuous time. Contrary to the previous section on pricing, here we shall work under the physical measure $P$.

A Discrete Time Case. To motivate the model, start from the classical continuous time asset price model $dS_t = S_t[a\, dt + x_t\, dw_t]$ where $w_t$ is Wiener and $x_t$ is the volatility process (factor), which is not directly observable. For $y_t := \log S_t$, one then has

$$dy_t = \left(a - \tfrac{1}{2} x_t^2\right) dt + x_t\, dw_t \qquad (26)$$

Passing to discrete time with step $\delta$, let for $t = 0, \ldots, T$ the process $x_t$ be a Markov chain with $m$
states $x^1, \ldots, x^m$ (may result from a time discretization of a continuous time $x_t$) and

$$y_t = y_{t-1} + \left(a - \tfrac{1}{2} x_{t-1}^2\right)\delta + x_{t-1}\sqrt{\delta}\,\varepsilon_t \qquad (27)$$

with $\varepsilon_t$ i.i.d. standard Gaussian as it results from equation (26) by applying the Euler–Maruyama scheme. Notice that $(x_t, y_t)$ is Markov. Having for simplicity only one stock to invest in, denote by $\phi_t$ the number of shares of stock held in the portfolio in period $t$ with the rest invested in a riskless bond $B_t$ (for simplicity assume $r = 0$). The corresponding self-financed wealth process then evolves according to

$$V_{t+1}^{\phi} = V_t^{\phi} + \phi_t\left(e^{y_{t+1}} - e^{y_t}\right) := F\left(V_t^{\phi}, \phi_t, y_t, y_{t+1}\right) \qquad (28)$$

and $\phi_t$ is supposed to be adapted to $\mathcal{F}_t^y$; denote by $\mathcal{A}$ the class of such strategies. Given a horizon $T$, consider the following investment criterion

$$J_{opt}(V_0) = \sup_{\phi \in \mathcal{A}} J(V_0, \phi) = \sup_{\phi \in \mathcal{A}} E\left\{\sum_{t=0}^{T-1} r_t(x_t, y_t, V_t^{\phi}, \phi_t) + f(x_T, y_T, V_T^{\phi})\right\} \qquad (29)$$

which, besides portfolio optimization, includes also hedging problems. The problem in equations (27), (28), and (29) is now a stochastic control problem under partial/incomplete information given that $x_t$ is an unobservable factor process.

A standard approach to dynamic optimization problems under partial information is to transform them into corresponding complete information ones whereby $x_t$ is replaced by its filter distribution given $\mathcal{F}_t^y$. Letting $\pi_t^i := P\{x_t = x^i \mid \mathcal{F}_t^y\}$, $i = 1, \ldots, m$, we first adapt the filter dynamics in equation (5) to our situation to derive a recursive relation for $\pi_t = (\pi_t^1, \ldots, \pi_t^m)$. Since $x_t$ is finite-state Markov, $p(x_{t+1} \mid x_t)$ is given by the transition probability matrix and the integral in equation (5) reduces to a sum. On the other hand, $p(y_t \mid x_t)$ in equation (5) corresponds to the model in equation (2) that does not include our model in equation (27) for $y_t$. One can however easily see that equation (27) leads to a distribution of the form $p(y_t \mid x_{t-1}, y_{t-1})$, and equation (5) can be adapted to become here

$$\pi_0 = \mu \ \text{(initial distribution for } x_t\text{)}, \qquad \pi_t^i \propto \sum_{j=1}^{m} p(y_t \mid x_{t-1} = j, y_{t-1})\, p(x_t = i \mid x_{t-1} = j)\, \pi_{t-1}^j \qquad (30)$$

In addition, we may consider the law of $y_t$ conditional on $(\pi_{t-1}, y_{t-1}) = (\pi, y)$ that is given by

$$Q_t(\pi, y, dy') = \sum_{i,j=1}^{m} p(y' \mid x_{t-1} = j, y)\, p(x_t = i \mid x_{t-1} = j)\, \pi^j \qquad (31)$$

From equations (30) and (31), it follows easily that $(\pi_t, y_t)$ is a sufficient statistic and an $\mathcal{F}_t^y$-Markov process.

To transform the original partial information problem with criterion (29) into a corresponding complete observation problem, put $\hat{r}_t(\pi, y, v, \phi) = \sum_{i=1}^{m} r_t(x^i, y, v, \phi)\, \pi^i$ and $\hat{f}(\pi, y, v) = \sum_{i=1}^{m} f(x^i, y, v)\, \pi^i$ so that, by double conditioning, one obtains

$$J(V_0, \phi) = E\left\{\sum_{t=0}^{T-1} E\left\{r_t(x_t, y_t, V_t^{\phi}, \phi_t) \mid \mathcal{F}_t^y\right\} + E\left\{f(x_T, y_T, V_T^{\phi}) \mid \mathcal{F}_T^y\right\}\right\} = E\left\{\sum_{t=0}^{T-1} \hat{r}_t(\pi_t, y_t, V_t^{\phi}, \phi_t) + \hat{f}(\pi_T, y_T, V_T^{\phi})\right\} \qquad (32)$$

Owing to the Markov property of $(\pi_t, y_t)$, one can write the following (backward) dynamic programming recursions:

$$u_T(\pi, y, v) = \hat{f}(\pi, y, v), \qquad u_t(\pi, y, v) = \sup_{\phi \in \mathcal{A}} \left[\hat{r}_t(\pi, y, v, \phi) + E\left\{u_{t+1}\left(\pi_{t+1}, y_{t+1}, F(v, \phi, y, y_{t+1})\right) \mid (\pi_t, y_t) = (\pi, y)\right\}\right] \qquad (33)$$

where the function $F(\cdot)$ was defined in equation (28), and $\phi$ here refers to the generic choice of $\phi = \phi_t$ in period $t$. It leads to the optimal investment strategy $\phi^*$ and the optimal value $J_{opt}(V_0) = u_0(\mu, y_0, V_0)$. It can, in fact, be shown that the strategy and value thus
obtained are optimal also for the original incomplete information problem when $\phi$ there is required to be $\mathcal{F}_t^y$-adapted.

To actually compute the recursions in equation (33), one needs the conditional law of $(\pi_{t+1}, y_{t+1})$ given $(\pi_t, y_t)$, which can be deduced from equations (30) and (31). In this context, notice that, even if $x$ is $m$-valued, $\pi_t$ takes values in the $m$-dimensional simplex that is $\infty$-valued. To actually perform the calculation, one needs an approximation leading to a finite-valued process $(\pi_t, y_t)$ and to this effect various approaches have appeared in the literature (for an approach with numerical results see [4]).

A Continuous Time Case. Consider the following market model where $x_t$ is an unobserved factor process and $S_t$ is the price of a single risky asset:

$$dx_t = F_t(x_t)\, dt + R_t(x_t)\, dM_t, \qquad dS_t = S_t\left[a_t(S_t, x_t)\, dt + \sigma_t(S_t)\, dw_t\right] \qquad (34)$$

with $w_t$ a Wiener process and $M_t$ a not necessarily continuous martingale, independent of $w_t$. Since, in continuous time, $\int_0^t \sigma_s^2\, ds$ can be estimated by the empirical quadratic variation of $S_t$, in order not to have degeneracy in the filter to be derived below for $x_t$, we do not let $\sigma(\cdot)$ depend also on $x_t$. For the riskless asset, we assume for simplicity that its price is $B_t \equiv \text{const}$ (short rate $r = 0$). In what follows, it is convenient to consider log-prices $y_t = \log S_t$, for which

$$dy_t = \left[a_t(S_t, x_t) - \tfrac{1}{2}\sigma_t^2(S_t)\right] dt + \sigma_t(S_t)\, dw_t := A_t(y_t, x_t)\, dt + B_t(y_t)\, dw_t \qquad (35)$$

Investing in this market in a self-financing way and denoting by $\rho_t$ the fraction of wealth invested in the risky asset, we have from $\frac{dV_t}{V_t} = \rho_t \frac{dS_t}{S_t} = \rho_t \frac{de^{y_t}}{e^{y_t}}$ that

$$dV_t = V_t\left[\rho_t\left(A_t(y_t, x_t) + \tfrac{1}{2} B_t^2(y_t)\right) dt + \rho_t B_t(y_t)\, dw_t\right] \qquad (36)$$

We want to consider the problem of maximization of expected utility from terminal wealth, without consumption, and with a power utility function. Combining equations (34), (35), and (36) we obtain the following portfolio optimization problem under incomplete information where the factor process $x_t$ is not observed and where we shall require that $\rho_t$ is $\mathcal{F}_t^y$-adapted:

$$\begin{cases} dx_t = F_t(x_t)\, dt + R_t(x_t)\, dM_t & \text{(unobserved)} \\ dy_t = A_t(y_t, x_t)\, dt + B_t(y_t)\, dw_t & \text{(observed)} \\ dV_t = V_t\left[\rho_t\left(A_t(y_t, x_t) + \tfrac{1}{2} B_t^2(y_t)\right) dt + \rho_t B_t(y_t)\, dw_t\right] \\ \sup_{\rho} E\{(V_T)^{\mu}\}, \quad \mu \in (0, 1) \end{cases} \qquad (37)$$

As in the previous discrete time case, we shall now transform this problem into a corresponding one under complete information, thereby replacing the unobserved state variable $x_t$ by its filter distribution, given $\mathcal{F}_t^y$, that is, $\pi_t(x) := p(x_t \mid \mathcal{F}_t^y)|_{x_t = x}$. Even if $x_t$ is finite-dimensional, $\pi_t(\cdot)$ is $\infty$-dimensional. We have seen above cases where the filter distribution is finitely parameterized, namely, the linear-Gaussian case (Kalman filter) and when $x_t$ is finite-state Markov. The parameters characterizing the filter were seen to evolve over time driven by the innovations process (see equations (8), (10) and (14)). In what follows, we then assume that the filter is parameterized by a vector process $\xi_t \in \mathbb{R}^p$, that is, $\pi_t(x) := p(x_t \mid \mathcal{F}_t^y)|_{x_t = x} = \pi(x; \xi_t)$ and that $\xi_t$ satisfies

$$d\xi_t = \beta_t(y_t, \xi_t)\, dt + \eta_t(y_t, \xi_t)\, d\bar{w}_t \qquad (38)$$

where $\bar{w}_t$ is Wiener and given by the innovations process. We now specify this innovations process $\bar{w}_t$ for our general model in equation (37). To this effect, putting $A_t(y_t, \xi_t) := \int A_t(y_t, x)\, d\pi_t(x; \xi_t)$, let

$$d\bar{w}_t := B_t^{-1}(y_t)\left[dy_t - A_t(y_t, \xi_t)\, dt\right] \qquad (39)$$

and notice that, replacing $dy_t$ from equation (35), this definition implies a translation of the original $(P, \mathcal{F}_t)$-Wiener $w_t$, that is,

$$d\bar{w}_t = dw_t + B_t^{-1}(y_t)\left[A_t(y_t, x_t) - A_t(y_t, \xi_t)\right] dt \qquad (40)$$
and thus the implicit change of measure $P \to \bar{P}$ with

$$\left.\frac{d\bar{P}}{dP}\right|_{\mathcal{F}_T} = \exp\left\{\int_0^T \left[A_t(y_t, \xi_t) - A_t(y_t, x_t)\right] B_t^{-1}(y_t)\, dw_t - \frac{1}{2}\int_0^T \left[A_t(y_t, \xi_t) - A_t(y_t, x_t)\right]^2 B_t^{-2}(y_t)\, dt\right\} \qquad (41)$$

We obtain thus as the complete information problem corresponding to equation (37), the following, which is defined on the space $(\Omega, \mathcal{F}, \mathcal{F}_t, \bar{P})$ with Wiener $\bar{w}_t$:

$$\begin{cases} d\xi_t = \beta_t(y_t, \xi_t)\, dt + \eta_t(y_t, \xi_t)\, d\bar{w}_t \\ dy_t = A_t(y_t, \xi_t)\, dt + B_t(y_t)\, d\bar{w}_t \\ dV_t = V_t\left[\rho_t\left(A_t(y_t, \xi_t) + \tfrac{1}{2} B_t^2(y_t)\right) dt + \rho_t B_t(y_t)\, d\bar{w}_t\right] \\ \sup_{\rho} \bar{E}\{(V_T)^{\mu}\}, \quad \mu \in (0, 1) \end{cases} \qquad (42)$$

One can now use methods for complete information problems to solve equation (42), and it can also be shown that the solution to equation (42) gives a solution of the original problem for which $\rho_t$ was assumed $\mathcal{F}_t^y$-adapted.

We remark that other reformulations of the incomplete information problem as a complete information one are also possible (see e.g., [20]).

A final comment concerns hedging under incomplete information (incomplete market). When using the quadratic hedging criterion, that is, $\min_{\rho} E_{S_0, V_0}\{(H_T - V_T^{\rho})^2\}$, its quadratic nature implies that if $\phi_t^*(x_t, y_t)$ is the optimal strategy (number of units invested in the risky asset) under complete information also of $x_t$, then, under the partial information $\mathcal{F}_t^y$, the optimal strategy is simply the projection $E\{\phi_t^*(x_t, y_t) \mid \mathcal{F}_t^y\}$ that can be computed on the basis of the filter of $x_t$ given $\mathcal{F}_t^y$ (see [12]).

References

[1] Bain, A. & Crisan, D. (2009). Fundamentals of stochastic filtering, in Series: Stochastic Modelling and Applied Probability, Vol. 60, Springer Science+Business Media, New York.
[2] Bhar, R. Chiarella, C. Hung, H. & Runggaldier, W. (2005). The volatility of the instantaneous spot interest rate implied by arbitrage pricing: a dynamic Bayesian approach. Automatica 42, 1381–1393.
[3] Budhiraja, A., Chen, L. & Lee, C. (2007). A survey of numerical methods for nonlinear filtering problems, Physica D 230, 27–36.
[4] Corsi, M., Pham, H. & Runggaldier, W.J. (2008). Numerical approximation by quantization of control problems in finance under partial observations, to appear in Mathematical Modeling and Numerical Methods in Finance. Handbook of Numerical Analysis, A. Bensoussan & Q. Zhang, eds, Elsevier, Vol. 15.
[5] Crisan, D., Del Moral, P. & Lyons, T. (1999). Interacting particle systems approximations of the Kushner–Stratonovich equation, Advances in Applied Probability 31, 819–838.
[6] Cvitanic, J., Liptser, R. & Rozovski, B. (2006). A filtering approach to tracking volatility from prices observed at random times, The Annals of Applied Probability 16, 1633–1652.
[7] Cvitanic, J., Rozovski, B. & Zaliapin, I. (2006). Numerical estimation of volatility values from discretely observed diffusion data, Journal of Computational Finance 9, 1–36.
[8] Davis, M.H.A. & Marcus, S.I. (1981). An Introduction to nonlinear filtering, in Stochastic Systems: The Mathematics of Filtering and Identification and Applications, M. Hazewinkel & J.C. Willems, eds, D. Reidel, Dordrecht, pp. 53–75.
[9] Duffie, D. & Lando, D. (2001). Term structure of credit risk with incomplete accounting observations, Econometrica 69, 633–664.
[10] Elliott, R.J. (1993). New finite-dimensional filters and smoothers for noisily observed Markov chains, IEEE Transactions on Information Theory, IT-39, 265–271.
[11] Elliott, R.J., Aggoun, L. & Moore, J.B. (1994). Hidden Markov models: estimation and control, in Applications of Mathematics, Springer-Verlag, Berlin-Heidelberg-New York, Vol. 29.
[12] Frey, R. & Runggaldier, W. (1999). Risk-minimizing hedging strategies under restricted information: the case of stochastic volatility models observed only at discrete random times, Mathematical Methods of Operations Research 50(3), 339–350.
[13] Frey, R. & Runggaldier, W. (2008). Credit risk and incomplete information: a nonlinear filtering approach, preprint, Universitat Leipzig, Available from www.math.uni-leipzig.de/%7Efrey/publications-frey.html.
[14] Frey, R. & Runggaldier, W.R. Nonlinear filtering in models for interest-rate and credit risk, to appear in Handbook of Nonlinear Filtering, D. Crisan & B. Rozovski, eds, Oxford University Press (to be published in 2009).
[15] Gombani, A., Jaschke, S. & Runggaldier, W. (2005). A filtered no arbitrage model for term structures with noisy data, Stochastic Processes and Applications 115, 381–400.
[16] Jarrow, R. & Protter, P. (2004). Structural versus reduced-form models: a new information based perspective, Journal of Investment Management, 2, 1–10.
[17] Kliemann, W., Koch, G. & Marchetti, F. (1990). On the unnormalized solution of the filtering problem with counting process observations, IEEE IT-36, 1415–1425.
[18] Kushner, H.J. & Dupuis, P. (1992). Numerical methods for stochastic control Problems in continuous time, in Applications of Mathematics, Springer, New York, Vol. 24.
[19] Liptser, R.S. & Shiryaev, A.N. (2001). Statistics of random processes, Series: Applications of Mathematics; Stochastic Modelling and Applied Probability, Springer-Verlag, Berlin, Vols. I, II.
[20] Nagai, H. & Runggaldier, W.J. (2008). PDE approach to utility maximization for market models with hidden Markov factors, in Seminar on Stochastic Analysis, Random Fields and Applications V, R.C. Dalang, M. Dozzi, & F. Russo, eds, Progress in Probability, Birkhäuser Verlag, Vol. 59, pp. 493–506.
[21] Pham, H. Portfolio optimization under partial observation: theoretical and numerical aspects, to appear in Handbook of Nonlinear Filtering, D. Crisan & B. Rozovski, eds, Oxford University Press (to be published in 2009).
[22] Runggaldier, W.J. (2004). Estimation via stochastic filtering in financial market models, in Mathematics of Finance. Contemporary Mathematics, G. Yin & Q. Zhang, eds, AMS, Vol. 351, pp. 309–318.
[23] Zeng, Y. (2003). A partially observed model for micromovement of asset prices with Bayes estimation via filtering, Mathematical Finance, 13, 411–444.

WOLFGANG RUNGGALDIER
Filtrations

The notion of filtration, introduced by Doob, has become a fundamental feature of the theory of stochastic processes. Most basic objects, such as martingales, semimartingales, stopping times, or Markov processes, involve the notion of filtration.

Definition 1 Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space. A filtration $\mathbb{F}$, on $(\Omega, \mathcal{F}, \mathbb{P})$, is an increasing family $(\mathcal{F}_t)_{t \ge 0}$ of sub-$\sigma$-algebras of $\mathcal{F}$. In other words, for each $t$, $\mathcal{F}_t$ is a $\sigma$-algebra included in $\mathcal{F}$ and if $s \le t$, $\mathcal{F}_s \subset \mathcal{F}_t$. A probability space $(\Omega, \mathcal{F}, \mathbb{P})$ endowed with a filtration $\mathbb{F}$ is called a filtered probability space.

We now give a definition that is very closely related to that of a filtration.

Definition 2 A stochastic process $(X_t)_{t \ge 0}$ on $(\Omega, \mathcal{F}, \mathbb{P})$ is adapted to the filtration $(\mathcal{F}_t)$ if, for each $t \ge 0$, $X_t$ is $\mathcal{F}_t$-measurable.

A stochastic process $X$ is always adapted to its natural filtration $\mathbb{F}^X$, where for each $t \ge 0$, $\mathcal{F}_t^X = \sigma(X_s, s \le t)$ (the last notation means that $\mathcal{F}_t$ is the smallest $\sigma$-algebra with respect to which all the variables $(X_s, s \le t)$ are measurable). $\mathbb{F}^X$ is, hence, the smallest filtration to which $X$ is adapted.

The parameter $t$ is often thought of as time, and the $\sigma$-algebra $\mathcal{F}_t$ represents the set of information available at time $t$, that is, events that have occurred up to time $t$. Thus, the filtration $\mathbb{F}$ represents the evolution of the information or knowledge of the world with time. If $X$ is an adapted process, then $X_t$, its value at time $t$, depends only on the evolution of the universe prior to $t$.

Definition 3 Let $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ be a filtered probability space.

1. The filtration $\mathbb{F}$ is said to be complete if $(\Omega, \mathcal{F}, \mathbb{P})$ is complete and if $\mathcal{F}_0$ contains all the $\mathbb{P}$-null sets.
2. The filtration $\mathbb{F}$ is said to satisfy the usual hypotheses if it is complete and right continuous, that is, for all $t \ge 0$, $\mathcal{F}_t = \mathcal{F}_{t+}$, where

$$\mathcal{F}_{t+} = \bigcap_{u > t} \mathcal{F}_u \qquad (1)$$

Some fundamental theorems, such as the Début theorem, require the usual hypotheses. Hence naturally, very often in the literature on the theory of stochastic processes and mathematical finance, the underlying filtered probability spaces are assumed to satisfy the usual hypotheses. This assumption is not very restrictive for the following reasons:

1. Any filtration can easily be made complete and right continuous; indeed, given a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, we first complete the probability space $(\Omega, \mathcal{F}, \mathbb{P})$, and then we add all the $\mathbb{P}$-null sets to every $\mathcal{F}_{t+}$, $t \ge 0$. The new filtration thus obtained satisfies the usual hypotheses and is called the usual augmentation of $\mathbb{F}$;
2. Moreover, in most classical and encountered cases, the filtration $\mathbb{F}$ is right continuous. Indeed, this is the case when, for instance, $\mathbb{F}$ is the natural filtration of a Brownian motion, a Lévy process, a Feller process, or a Hunt process [8, 9].

Enlargements of Filtrations

For more precise and detailed references, the reader can consult the books [4–6, 8] or the survey article [7].

Generalities

Let $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ be a filtered probability space satisfying the usual hypotheses. Let $\mathbb{G}$ be another filtration satisfying the usual hypotheses and such that $\mathcal{F}_t \subset \mathcal{G}_t$ for every $t \ge 0$. One natural question is, how are the $\mathbb{F}$-semimartingales modified when considered as stochastic processes in the larger filtration $\mathbb{G}$? Given the importance of semimartingales and martingales (in particular, in mathematical finance where they are used to model prices), it seems natural to characterize situations where the semimartingale or martingale properties are preserved.

Definition 4 We shall say that the pair of filtrations $(\mathbb{F}, \mathbb{G})$ satisfies the $(H')$ hypothesis if every $\mathbb{F}$-semimartingale is a $\mathbb{G}$-semimartingale.

Remark 1 In fact, using a classical decomposition of semimartingales due to Jacod and Mémin, it is enough to check that every $\mathbb{F}$-bounded martingale is a $\mathbb{G}$-semimartingale.
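To make Definitions 1 and 2 concrete, here is a minimal Python sketch on a finite sample space (two coin tosses). It relies on the fact that, on a finite space, each $\sigma$-algebra can be represented by the partition that generates it, and a random variable is measurable exactly when it is constant on every block of that partition. The names `OMEGA`, `FILTRATION`, `is_measurable`, `is_adapted`, and `heads_up_to` are illustrative assumptions, not notation from the text.

```python
# Sample space: outcomes of two coin tosses.
OMEGA = ["HH", "HT", "TH", "TT"]

# On a finite space a sigma-algebra corresponds to a partition into atoms.
# F_0 is trivial, F_1 knows the first toss, F_2 knows both tosses.
FILTRATION = [
    [{"HH", "HT", "TH", "TT"}],
    [{"HH", "HT"}, {"TH", "TT"}],
    [{"HH"}, {"HT"}, {"TH"}, {"TT"}],
]


def is_measurable(variable, partition):
    """True iff the variable is constant on each atom of the partition."""
    return all(len({variable(w) for w in block}) == 1 for block in partition)


def is_adapted(process, filtration):
    """process[t] must be F_t-measurable for every t (Definition 2)."""
    return all(
        is_measurable(process[t], filtration[t]) for t in range(len(filtration))
    )


def heads_up_to(t):
    """Random variable: number of heads among the first t tosses."""
    return lambda w: w[:t].count("H")


# S_t counts the heads seen so far: adapted.
S = [heads_up_to(t) for t in range(3)]
# Y_t looks one toss ahead (capped at 2 tosses): not adapted.
Y = [heads_up_to(min(t + 1, 2)) for t in range(3)]

if __name__ == "__main__":
    print(is_adapted(S, FILTRATION))
    print(is_adapted(Y, FILTRATION))
```

The process `Y` fails the check already at $t = 0$, because the first toss is not known under the trivial $\sigma$-algebra; this is the discrete analogue of a process that anticipates information not yet contained in $\mathcal{F}_t$.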
Definition 5 We shall say that the pair of filtrations $(\mathbb{F}, \mathbb{G})$ satisfies the $(H)$ hypothesis if every $\mathbb{F}$-local martingale is a $\mathbb{G}$-local martingale.

The theory of enlargements of filtrations, developed in the late 1970s, provides answers to questions such as those mentioned earlier. Currently, this theory has been widely used in mathematical finance, especially in insider trading models and in models of default risk. The insider trading models are usually based on the so-called initial enlargements of filtrations, whereas the models of default risk fit well in the framework of the progressive enlargements of filtrations. More precisely, given a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, there are essentially two ways of enlarging filtrations:

• initial enlargements, for which $\mathcal{G}_t = \mathcal{F}_t \vee \mathcal{H}$ for every $t \ge 0$, that is, the new information $\mathcal{H}$ is brought in at the origin of time and
• progressive enlargements, for which $\mathcal{G}_t = \mathcal{F}_t \vee \mathcal{H}_t$ for every $t \ge 0$, that is, the new information is brought in progressively as the time $t$ increases.

Before presenting the basic theorems on enlargements of filtrations, we state a useful theorem due to Stricker.

Theorem 1 (Stricker [10]). Let $\mathbb{F}$ and $\mathbb{G}$ be two filtrations as above, such that for all $t \ge 0$, $\mathcal{F}_t \subset \mathcal{G}_t$. If $(X_t)$ is a $\mathbb{G}$-semimartingale that is $\mathbb{F}$-adapted, then it is also an $\mathbb{F}$-semimartingale.

Initial Enlargements of Filtrations

The most important theorem on initial enlargements of filtrations is due to Jacod and deals with the special case where the initial information brought in at the origin of time consists of the $\sigma$-algebra generated by a random variable. More precisely, let $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ be a filtered probability space satisfying the usual assumptions. Let $Z$ be an $\mathcal{F}$ measurable random variable. Define

$$\mathcal{G}_t = \bigcap_{\varepsilon > 0} \left(\mathcal{F}_{t+\varepsilon} \vee \sigma\{Z\}\right), \quad t \ge 0 \qquad (2)$$

In financial models, the filtration $\mathbb{F}$ represents the public information in a financial market and the random variable $Z$ stands for the additional (anticipating) information of an insider.

The conditional laws of $Z$ given $\mathcal{F}_t$, for $t \ge 0$, play a crucial role in initial enlargements.

Theorem 2 (Jacod's criterion). Let $Z$ be an $\mathcal{F}$ measurable random variable and let $Q_t(\omega, dx)$ denote the regular conditional distribution of $Z$ given $\mathcal{F}_t$, $t \ge 0$. Suppose that for each $t \ge 0$, there exists a positive $\sigma$-finite measure $\eta_t(dx)$ on $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ such that

$$Q_t(\omega, dx) \ll \eta_t(dx) \quad \text{almost surely} \qquad (3)$$

Then every $\mathbb{F}$-semimartingale is a $\mathbb{G}$-semimartingale.

Remark 2 In fact, this theorem still holds for random variables with values in a standard Borel space. Moreover, the existence of the $\sigma$-finite measure $\eta_t(dx)$ is equivalent to the existence of one positive $\sigma$-finite measure $\eta(dx)$ such that $Q_t(\omega, dx) \ll \eta(dx)$ and in this case $\eta$ can be taken to be the distribution of $Z$.

Now we give classical corollaries of Jacod's theorem.

Corollary 1 Let $Z$ be independent of $\mathcal{F}_\infty$. Then, every $\mathbb{F}$-semimartingale is a $\mathbb{G}$-semimartingale.

Corollary 2 Let $Z$ be a random variable taking on only a countable number of values. Then every $\mathbb{F}$-semimartingale is a $\mathbb{G}$-semimartingale.

In some cases, it is possible to obtain an explicit decomposition of an $\mathbb{F}$-local martingale as a $\mathbb{G}$-semimartingale [4–8]. For example, if $Z = B_{t_0}$, for some fixed time $t_0 > 0$ and a Brownian Motion $B$, it can be shown that Jacod's criterion holds for $t < t_0$ and that every $\mathbb{F}$-local martingale is a $\mathbb{G}$-semimartingale for $0 \le t < t_0$, but not necessarily including $t_0$. Indeed in this case, there are $\mathbb{F}$-local martingales that are not $\mathbb{G}$-semimartingales. Moreover, $B$ is a $\mathbb{G}$-semimartingale, which decomposes as

$$B_t = B_0 + \tilde{B}_t + \int_0^{t \wedge t_0} \frac{B_{t_0} - B_s}{t_0 - s}\, ds \qquad (4)$$

where $(\tilde{B}_t)$ is a $\mathbb{G}$ Brownian Motion.

Remark 3 There are cases where Jacod's criterion does not hold but where other methods apply [4, 6, 7].
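The decomposition (4) can be checked numerically. The sketch below is a rough Monte Carlo illustration, assuming $B_0 = 0$, $t_0 = 1$, and a left-point discretization of the integral; the function names `simulate` and `covariance` and all parameter values are assumptions made for this example. Subtracting the drift term from $B_t$ should leave a quantity that, for $t < t_0$, is approximately uncorrelated with $Z = B_{t_0}$ and has variance close to $t$, as one expects from a Brownian motion in the enlarged filtration.

```python
import math
import random


def simulate(n_paths=3000, n_steps=100, t0=1.0, t=0.5, seed=0):
    """Return samples of (B_t, B~_t, B_t0), where
    B~_t = B_t - int_0^t (B_t0 - B_s) / (t0 - s) ds  (left-point rule)."""
    rng = random.Random(seed)
    dt = t0 / n_steps
    k_t = int(round(t / dt))  # grid index of time t
    samples = []
    for _ in range(n_paths):
        # Simulate the whole path first, since the drift needs B_t0.
        path = [0.0]
        for _ in range(n_steps):
            path.append(path[-1] + math.sqrt(dt) * rng.gauss(0.0, 1.0))
        b_t0 = path[-1]
        drift = sum((b_t0 - path[k]) / (t0 - k * dt) * dt for k in range(k_t))
        samples.append((path[k_t], path[k_t] - drift, b_t0))
    return samples


def covariance(xs, ys):
    """Plain sample covariance (divides by the sample size)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    b_t, b_tilde, b_t0 = zip(*simulate())
    print("cov(B_t,  B_t0) ~", round(covariance(b_t, b_t0), 3))
    print("cov(B~_t, B_t0) ~", round(covariance(b_tilde, b_t0), 3))
    print("var(B~_t)       ~", round(covariance(b_tilde, b_tilde), 3))
```

In this setup the raw Brownian motion $B_t$ is clearly correlated with the insider's information $B_{t_0}$, whereas the compensated process is not; the estimates are subject to Monte Carlo and discretization error.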
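Progressive Enlargements of Filtrations

Let $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ be a filtered probability space satisfying the usual hypotheses, and $\rho : (\Omega, \mathcal{F}) \to (\mathbb{R}_+, \mathcal{B}(\mathbb{R}_+))$ be a random time. We enlarge the initial filtration $\mathbb{F}$ with the process $(\rho \wedge t)_{t \ge 0}$, so that the new enlarged filtration $\mathbb{F}^\rho$ is the smallest filtration (satisfying the usual assumptions) containing $\mathbb{F}$ and making $\rho$ a stopping time (i.e., for all $t \ge 0$, $\mathcal{F}_t^\rho = \mathcal{K}_{t+}^o$, where $\mathcal{K}_t^o = \mathcal{F}_t \vee \sigma(\rho \wedge t)$). One may interpret $\rho$ as the instant of default of an issuer; the given filtration $\mathbb{F}$ can be thought of as the filtration of default-free prices, for which $\rho$ is not a stopping time. Then, the filtration $\mathbb{F}^\rho$ is the defaultable market filtration used for the pricing of defaultable assets.

A few processes play a crucial role in our discussion:

• the $\mathbb{F}$-supermartingale

$$Z_t^\rho = \mathbb{P}[\rho > t \mid \mathcal{F}_t] \qquad (5)$$

chosen to be càdlàg, associated to $\rho$ by Azéma [1];
• the $\mathbb{F}$-dual optional projection of the process $1_{\{\rho \le t\}}$, denoted by $A_t^\rho$ (see [7, 8] for a definition of dual optional projections); and
• the càdlàg martingale

$$\mu_t^\rho = \mathbb{E}\left[A_\infty^\rho \mid \mathcal{F}_t\right] = A_t^\rho + Z_t^\rho \qquad (6)$$

Theorem 3 Every $\mathbb{F}$-local martingale $(M_t)$, stopped at $\rho$, is an $\mathbb{F}^\rho$-semimartingale, with canonical decomposition:

$$M_{t \wedge \rho} = \tilde{M}_t + \int_0^{t \wedge \rho} \frac{d\langle M, \mu^\rho \rangle_s}{Z_{s-}^\rho} \qquad (7)$$

where $(\tilde{M}_t)$ is an $\mathbb{F}^\rho$-local martingale.

The most interesting case in the theory of progressive enlargements of filtrations is when $\rho$ is an honest time or equivalently the end of an $\mathbb{F}$ optional set $\Gamma$, that is,

$$\rho = \sup\{t : (t, \omega) \in \Gamma\} \qquad (8)$$

Indeed, in this case, the pair of filtrations $(\mathbb{F}, \mathbb{F}^\rho)$ satisfies the $(H')$ hypothesis: every $\mathbb{F}$-local martingale $(M_t)$ is an $\mathbb{F}^\rho$-semimartingale, with canonical decomposition:

$$M_t = \tilde{M}_t + \int_0^{t \wedge \rho} \frac{d\langle M, \mu^\rho \rangle_s}{Z_{s-}^\rho} - 1_{\{\rho \le t\}} \int_\rho^t \frac{d\langle M, \mu^\rho \rangle_s}{1 - Z_{s-}^\rho} \qquad (9)$$

The next decomposition formulas are used for pricing in default models:

Proposition 1

1. Let $\xi \in L^1$. Then a càdlàg version of the martingale $\xi_t = \mathbb{E}\left[\xi \mid \mathcal{F}_t^\rho\right]$, on the set $\{t < \rho\}$, is given by:

$$\xi_t 1_{t < \rho} = \frac{1}{Z_t^\rho}\, 1_{t < \rho}\, \mathbb{E}\left[\xi 1_{t < \rho} \mid \mathcal{F}_t\right] \qquad (10)$$

2. Let $\xi \in L^1$ and let $\rho$ be an honest time. Then a càdlàg version of the martingale $\xi_t = \mathbb{E}\left[\xi \mid \mathcal{F}_t^\rho\right]$ is given as

$$\xi_t = \frac{1}{Z_t^\rho}\, \mathbb{E}\left[\xi 1_{t < \rho} \mid \mathcal{F}_t\right] 1_{t < \rho} + \frac{1}{1 - Z_t^\rho}\, \mathbb{E}\left[\xi 1_{t \ge \rho} \mid \mathcal{F}_t\right] 1_{t \ge \rho} \qquad (11)$$

The (H) Hypothesis

The $(H)$ hypothesis, in contrast to the $(H')$ hypothesis, is sometimes presented as a no-arbitrage condition in default models. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space satisfying the usual assumptions. Let $\mathbb{F}$ and $\mathbb{G}$ be two subfiltrations of $\mathcal{F}$, with

$$\mathcal{F}_t \subset \mathcal{G}_t \qquad (12)$$

Brémaud and Yor [2] have proven the following characterization of the $(H)$ hypothesis:

Theorem 4 The following are equivalent:

1. Every $\mathbb{F}$-martingale is a $\mathbb{G}$-martingale.
2. For all $t \ge 0$, the sigma fields $\mathcal{G}_t$ and $\mathcal{F}_\infty$ are independent conditionally on $\mathcal{F}_t$.

Remark 4 We also say that $\mathbb{F}$ is immersed in $\mathbb{G}$.

In the framework of the progressive enlargement of some filtration $\mathbb{F}$ with a random time $\rho$, the $(H)$ hypothesis is equivalent to one of the following hypotheses [3]:

1. $\forall t$, the $\sigma$-algebras $\mathcal{F}_\infty$ and $\mathcal{F}_t^\rho$ are conditionally independent given $\mathcal{F}_t$.
2. For all bounded $\mathcal{F}_\infty$ measurable random variables $F$ and all bounded $\mathcal{F}_t^\rho$ measurable random variables $G_t$, we have

$$\mathbb{E}[F G_t \mid \mathcal{F}_t] = \mathbb{E}[F \mid \mathcal{F}_t]\, \mathbb{E}[G_t \mid \mathcal{F}_t] \qquad (13)$$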
3. For all bounded $\mathcal{F}_t^\rho$ measurable random variables $G_t$:

$$\mathbb{E}[G_t \mid \mathcal{F}_\infty] = \mathbb{E}[G_t \mid \mathcal{F}_t] \qquad (14)$$

4. For all bounded $\mathcal{F}_\infty$ measurable random variables $F$,

$$\mathbb{E}\left[F \mid \mathcal{F}_t^\rho\right] = \mathbb{E}[F \mid \mathcal{F}_t] \qquad (15)$$

5. For all $s \le t$,

$$\mathbb{P}[\rho \le s \mid \mathcal{F}_t] = \mathbb{P}[\rho \le s \mid \mathcal{F}_\infty] \qquad (16)$$

In view of applications to financial mathematics, a natural question is, how is the $(H)$ hypothesis affected when we make an equivalent change of probability measure?

Proposition 2 Let $\mathbb{Q}$ be a probability measure that is equivalent to $\mathbb{P}$ (on $\mathcal{F}$). Then, every $(\mathbb{F}, \mathbb{Q})$-semimartingale is a $(\mathbb{G}, \mathbb{Q})$-semimartingale.

Now, define

$$\left.\frac{d\mathbb{Q}}{d\mathbb{P}}\right|_{\mathcal{F}_t} = R_t, \qquad \left.\frac{d\mathbb{Q}}{d\mathbb{P}}\right|_{\mathcal{G}_t} = R'_t \qquad (17)$$

If $Y = \frac{d\mathbb{Q}}{d\mathbb{P}}$, then the hypothesis $(H)$ holds under $\mathbb{Q}$ if and only if

$$\times \left(\frac{1}{R_{s-}}\, d[X, R]_s - \frac{1}{R'_{s-}}\, d[X, R']_s\right) \qquad (19)$$

is a $(\mathbb{G}, \mathbb{Q})$-local martingale.

References

[1] Azéma, J. (1972). Quelques applications de la théorie générale des processus I, Inventiones Mathematicae 18, 293–336.
[2] Brémaud, P. & Yor, M. (1978). Changes of filtration and of probability measures, Zeitschrift fur Wahrscheinlichkeitstheorie und Verwandte Gebiete 45, 269–295.
[3] Elliott, R.J., Jeanblanc, M. & Yor, M. (2000). On models of default risk, Mathematical Finance 10, 179–196.
[4] Jeulin, T. (1980). Semi-martingales et Grossissements d’une Filtration, Lecture Notes in Mathematics, Springer, Vol. 833.
[5] Jeulin, T. & Yor, M. (eds) (1985). Grossissements de Filtrations: Exemples et Applications, Lecture Notes in Mathematics, Springer, Vol. 1118.
[6] Mansuy, R. & Yor, M. (2006). Random Times and (Enlargement of Filtrations) in a Brownian Setting, Lecture Notes in Mathematics, Springer, Vol. 1873.
[7] Nikeghbali, A. (2006). An essay on the general theory of stochastic processes, Probability Surveys 3, 345–412.
[8] Protter, P.E. (2005). Stochastic Integration and Differential Equations, 2nd Edition, version 2.1, Springer.
[9] Revuz, D. & Yor, M. (1999). Continuous Martingales and Brownian Motion, 3rd Edition, Springer.
[10] Stricker, C. (1977). Quasi-martingales, martingales locales, semimartingales et filtration naturelle, Zeitschrift fur Wahrscheinlichkeitstheorie und Verwandte Gebiete 39, 55–63.
with $0 < c < \infty$, and

\[ \limsup_{t\to\infty}\, \sup_{x\in\mathbb{R}} L_t^x\,(t \log\log t)^{-1/2} = \sqrt{2} \tag{4} \]

One of these special times is $T_a$, the first hitting time by $B$ of a given value $a$. The law of $(L^b_{T_a},\ b \in \mathbb{R})$ is described by one of the famous Ray–Knight Theorems (see [8, Chapter XI]).

One can actually see Tanaka’s formula as an example of extension of Itô’s formula (see Itô’s Formula). Local time is also involved in inequalities reminiscent of the Burkholder–Davis–Gundy ones. Indeed, in [2], it is shown that there exist two universal positive and finite constants $c$ and $C$ such that

\[ c\,E\big[\sup_t |X_t|\big] \le E\big[\sup_a L^a_\infty\big] \le C\,E\big[\sup_t |X_t|\big] \tag{8} \]
2 Local Times
for any continuous local martingale X with occupation time formula as for the real Brownian
X0 = 0. motion:
t
f (Xs ) ds = f (b)bt (X) db (12)
Local Time of a Markov Process 0 E
possible form for the integrand $H$ and work gradually to extend the stochastic integral to more complex integrands by imposing conditions on $X$ but making sure that these conditions are as minimal as possible at the same time.

The simplest integrand one can think of is of the following form:

\[ H_t(\omega) = 1_{(S(\omega),\,T(\omega)]}(t) := \begin{cases} 1 & \text{if } S(\omega) < t \le T(\omega) \\ 0 & \text{otherwise} \end{cases} \tag{3} \]

where $S$ and $T$ are stopping times (see Filtrations) with respect to $\mathbb{F}$. In financial terms, this corresponds to a buy-and-hold strategy, whereby one unit of the asset is bought at, possibly random, time $S$ and sold at time $T$. If $X$ is the stochastic process representing the price of the asset, the net profit of such a trading strategy after time $T$ is equal to $X_T - X_S$. This leads us to define $\int H\,dX$ as

\[ \int_0^t H_s\,dX_s = X_{t\wedge T} - X_{t\wedge S} \tag{4} \]

where $t \wedge T := \min\{t, T\}$ for all $t$, $0 \le t < \infty$, and stopping times $T$. Clearly, the process $H$ in equation (3) has paths that are left continuous and possess right limits. We could similarly have defined $\int H\,dX$ for $H$ of the form, say, $1_{[S,T)}$. However, there is a good reason for insisting on paths that are continuous from the left on $(0, \infty)$ as we see in Example 1. Let us denote $\int_0^t H_s\,dX_s$ by $(H \cdot X)_t$.

Theorem 1 Let $H$ be of the form (3) and $M$ be a martingale (see Martingales). Then $H \cdot M$ is a martingale.

asset since its price does not change over time on average. Indeed, if $H$ is of the form (3), then $H \cdot X$ is a martingale with expected value zero so that the traders earn zero profit on average, as expected. Now consider another strategy $H = 1_{[0,T_1)}$, where $T_1$ is the time of the first jump of $N$. Since $X$ is an FV process, $H \cdot X$ is well defined as a Stieltjes integral and is given by $(H \cdot X)_t = \lambda(t \wedge T_1) > 0$, a.s., being the value of the portfolio at time $t$. Thus, this trading strategy immediately accumulates arbitrage profits. A moment of reflection reveals that such a trading strategy is not feasible under usual circumstances since it requires the knowledge of the time of a market crash, time $T_1$ in this case, before it happens. If we use $H = 1_{[0,T_1]}$ instead, this problem disappears.

Naturally, one will want the stochastic integral to be linear. Given a linear integral operator, we can define $H \cdot X$ for integrands that are linear combinations of processes of the form (3).

Definition 3 A process $H$ is said to be simple predictable if $H$ has a representation

\[ H_t = H_0 1_{\{0\}}(t) + \sum_{i=1}^{n} H_i 1_{(T_i,\,T_{i+1}]}(t) \tag{5} \]

where $0 = T_1 \le \cdots \le T_{n+1} < \infty$ is a finite sequence of stopping times, $H_0 \in \mathcal{F}_0$, $H_i \in \mathcal{F}_{T_i}$, $1 \le i \le n$, with $|H_i| < \infty$, a.s., $0 \le i \le n$. The collection of simple predictable processes is denoted by $\mathbf{S}$.

Let $L^0$ be the space of finite-valued random variables endowed with the topology of convergence in probability. Define the linear mapping $I_X : \mathbf{S} \to L^0$ as
reasonably weak version. A particularly weak version of the bounded convergence theorem is that the uniform convergence of $H^n$ to $H$ in $\mathbf{S}$ implies the convergence of $I_X(H^n)$ to $I_X(H)$ only in probability.

Let $\mathbf{S}_u$ be the space $\mathbf{S}$ topologized by uniform convergence and recall that for a process $X$ and a stopping time $T$, the notation $X^T$ denotes the process $(X_{t\wedge T})_{t\ge 0}$.

Definition 4 A process $X$ is a total semimartingale if $X$ is càdlàg, adapted and $I_X : \mathbf{S}_u \to L^0$ is continuous. $X$ is a semimartingale (see Semimartingale) if, for each $t \in [0, \infty)$, $X^t$ is a total semimartingale.

This continuity property of $I_X$ allows us to extend the definition of stochastic integrals to a class of integrands that is larger than $\mathbf{S}$ when the integrator is a semimartingale.

It follows from the definition of a semimartingale that semimartingales form a vector space. One can also show that all square integrable martingales and all adapted FV processes are semimartingales (see Semimartingale). Therefore, the sum of a square integrable martingale and an adapted FV process would also be a semimartingale. The converse of this statement is also “essentially” true. The precise statement is the following theorem.

Theorem 2 (Bichteler–Dellacherie Theorem). Let $X$ be a semimartingale. Then there exist processes $M$, $A$, with $M_0 = A_0 = 0$ such that

\[ X_t = X_0 + M_t + A_t \tag{7} \]

where $M$ is a local martingale and $A$ is an adapted FV process.

Here, we emphasize that this decomposition is not necessarily unique. Indeed, suppose that $X$ has the decomposition $X = X_0 + M + A$ and the space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ supports a Poisson process $N$ with intensity $\lambda$. Then $Y_t = N_t - \lambda t$ will define a martingale, which is also an FV process. Therefore, $X$ can also be written as $X = X_0 + (M + Y) + (A - Y)$. The reason for the nonuniqueness is the existence of martingales that are of finite variation. However, if $X$ has a decomposition $X = X_0 + M + A$, where $M$ is a local martingale and $A$ is predictable (see end note a) and FV with $M_0 = A_0 = 0$, then such a decomposition is unique since all predictable local martingales that are of finite variation have to be constant.

Arguably, Brownian motion is the most well known of all semimartingales. In the following section, we develop stochastic integration with respect to a Brownian motion.

L2 Theory of Stochastic Integration with Respect to Brownian Motion

We assume that there exists a Brownian motion, $B$, on $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ with $B_0 = 0$, and that $\mathcal{F}_0$ only contains the $(\mathcal{F}, \mathbb{P})$-null sets. First, we define the notion of predictability, which is the key concept in defining the stochastic integral.

Definition 5 The predictable σ-algebra $\mathcal{P}$ on $[0, \infty) \times \Omega$ is defined to be the smallest σ-algebra on $[0, \infty) \times \Omega$ with respect to which every adapted càglàd process is measurable. A process is said to be predictable if it is a $\mathcal{P}$-measurable map from $[0, \infty) \times \Omega$ to $\mathbb{R}$.

Clearly, $\mathbf{S} \subset \mathcal{P}$. Actually, there is more to this as is shown by the next theorem.

Theorem 3 Let $b\mathbf{S}$ be the set of elements of $\mathbf{S}$ that are bounded a.s. Then, $\mathcal{P} = \sigma(b\mathbf{S})$, that is, $\mathcal{P}$ is generated by the processes in $b\mathbf{S}$.

By linearity of the stochastic integral and Theorem 1 and using the fact that Brownian motion has increments independent from the past with a certain Gaussian distribution, we have the following.

Theorem 4 Let $H \in b\mathbf{S}$ and define $(H \cdot B)_t = (H \cdot B^t)_\infty$, that is, $(H \cdot B)_t$ is the stochastic integral of $H$ with respect to $B^t$. Then $H \cdot B$ is a martingale and

\[ \mathbb{E}\big[(H \cdot B)_t^2\big] = \int_0^t \mathbb{E}[H_s^2]\,ds \tag{8} \]

In the following, we construct the stochastic integral with respect to Brownian motion for a subset of predictable processes. To keep the exposition simple, we restrict our attention to a finite interval $[0, T]$, where $T$ is arbitrary but deterministic. Define

\[ L^2(B^T) := \Big\{ H \in \mathcal{P} : \int_0^T \mathbb{E}[H_s^2]\,ds < \infty \Big\} \tag{9} \]

which is a Hilbert space. Note that $b\mathbf{S} \subset L^2(B^T)$. Letting $L^2(\mathcal{F}_T)$ denote the space of square integrable
4 Stochastic Integrals
$\mathcal{F}_T$-measurable random variables, Theorem 4 now implies the map

\[ I_{B^T} : b\mathbf{S} \to L^2(\mathcal{F}_T) \tag{10} \]

defined by

\[ I_{B^T}(H) = (H \cdot B)_T \tag{11} \]

is an isometry. Consequently, we can extend the definition of the stochastic integral uniquely to the closure of $b\mathbf{S}$ in $L^2(B^T)$. An application of the monotone class theorem along with Theorem 3 yields that the closure is the whole $L^2(B^T)$.

Theorem 5 Let $H \in L^2(B^T)$. Then the Itô integral $(H \cdot B)_T$ of $H$ with respect to $B^T$ is the image of $H$ under the extension of the isometry $I_{B^T}$ to the whole of $L^2(B^T)$. In particular,

\[ \mathbb{E}\big[(H \cdot B)_T^2\big] = \int_0^T \mathbb{E}[H_s^2]\,ds \tag{12} \]

Moreover, the process $Y$ defined by $Y_t = (H \cdot B)_{t\wedge T}$ is a square integrable martingale.

The property (12) is often called the Itô isometry.

Brownian motion. We show that the integral operator is a continuous mapping from the set of simple predictable processes into an appropriate space so that we can extend the set of possible integrands to the closure of $\mathbf{S}$ in a certain topology.

Definition 6 A sequence of processes $(H^n)_{n\ge 1}$ converges to a process $H$ uniformly on compacts in probability (UCP) if, for each $t > 0$, $\sup_{0\le s\le t} |H^n_s - H_s|$ converges to 0 in probability.

The following result is not surprising and one can refer to, for example, [7] for a proof.

Theorem 6 The space $\mathbf{S}$ is dense in $\mathbb{L}$ under the UCP topology.

The following mapping is key to defining the stochastic integral with respect to a general semimartingale.

Definition 7 For $H \in \mathbf{S}$ and $X$ being a càdlàg process, define the linear mapping $J_X : \mathbf{S} \to \mathbb{D}$ by

\[ J_X(H) = H_0 X_0 + \sum_{i=1}^{n} H_i \big(X^{T_{i+1}} - X^{T_i}\big) \tag{14} \]

where $H$ has the representation as in equation (5).
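As a rough illustration of Definition 7, the sketch below evaluates $J_X(H)_t$ on a discretely sampled path. It assumes deterministic partition times that sit exactly on the sampling grid, which is a simplification of the stopping times used in the text. The function name `simple_integral` and its arguments are hypothetical and chosen for this example only.

```python
def simple_integral(path, dt, times, values, h0, t):
    """Evaluate J_X(H)_t for a simple predictable H on a sampled path.

    path   : samples X_0, X_dt, X_2dt, ... of the integrator
    times  : deterministic partition times T_1 <= ... <= T_{n+1} (on the grid)
    values : H_1, ..., H_n, where H_i is held on (T_i, T_{i+1}]
    h0     : H_0, the weight attached to X_0
    """
    def x_at(s):
        # Look up the sampled path at time s (clamped to the last sample).
        return path[min(int(round(s / dt)), len(path) - 1)]

    total = h0 * path[0]
    for i, h in enumerate(values):
        # Increment of the stopped paths X^{T_{i+1}} - X^{T_i}, read at time t.
        total += h * (x_at(min(t, times[i + 1])) - x_at(min(t, times[i])))
    return total


path = [0.0, 1.0, 3.0, 2.0, 5.0]  # X sampled at times 0, 1, 2, 3, 4
# Buy one unit at time 1 and sell at time 3: the result is X_3 - X_1.
buy_and_hold = simple_integral(path, 1.0, [1.0, 3.0], [1.0], 0.0, 4.0)
print(buy_and_hold)
```

With a single holding period the function reduces to the buy-and-hold profit $X_{t\wedge T} - X_{t\wedge S}$, and longer lists of `times` and `values` simply add up such increments by linearity.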
\[ 0 = T_0 \le T_1 \le \cdots \le T_k < \infty. \tag{15} \]

The sequence $\sigma$ is called a random partition. A sequence of random partitions $\sigma_n$

\[ \sigma_n : 0 = T_0^n \le T_1^n \le \cdots \le T_{k_n}^n \tag{16} \]

is said to tend to identity if

1. $\lim_{n\to\infty} \sup_j T_j^n = \infty$, a.s. and
2. $\sup_j |T_{j+1}^n - T_j^n|$ converges to 0 a.s.

Let $Y$ be a process and $\sigma$ be a random partition. Define the process

\[ Y^\sigma := Y_0 1_{\{0\}} + \sum_j Y_{T_j} 1_{(T_j,\,T_{j+1}]} \tag{17} \]

Consequently, if $Y$ is in $\mathbb{D}$ or $\mathbb{L}$,

\[ Y^\sigma \cdot X = Y_0 X_0 + \sum_j Y_{T_j} \big(X^{T_{j+1}} - X^{T_j}\big) \tag{18} \]

\[ = \frac{1}{2} B^2_{t\wedge T^n_{k_n}} - \frac{1}{2} \sum_{t_j \in \sigma_n,\ t_j < t} \big(B_{t_{j+1}} - B_{t_j}\big)^2 \tag{19} \]

As $n$ tends to $\infty$, the sum (see end note c) in equation (19) is known to converge to $t$. Obviously, $B^2_{T^n_{k_n}\wedge t}$ tends to $B_t^2$ since $\sigma_n$ tends to identity. Thus, we conclude via Theorem 8 that

\[ \int_0^t B_s\,dB_s = \frac{1}{2} B_t^2 - \frac{t}{2} \tag{20} \]

since $B$ is continuous with $B_0 = 0$. Thus, the integration rules for a stochastic integral are quite different from those for an ordinary integral. Indeed, if $A$ were a continuous process of finite variation with $A_0 = 0$, then the Riemann–Stieltjes integral of $A \cdot A$ will yield the following formula:

\[ \int_0^t A_s\,dA_s = \frac{1}{2} A_t^2 \tag{21} \]
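The contrast between formulas (20) and (21) can be seen numerically. The following sketch is only an illustration. It assumes a deterministic, evenly spaced partition of $[0, t]$ rather than random partitions, and simulated Gaussian increments. The function name `ito_left_sum` is hypothetical. The left-endpoint sum equals $\frac{1}{2}B_t^2$ minus half the sum of squared increments exactly, and that sum of squares should be close to $t$ for a fine partition.

```python
import math
import random


def ito_left_sum(n_steps=100_000, t=1.0, seed=0):
    """Simulate B on an even grid and return (B_t, left-point sum, sum of squares)."""
    rng = random.Random(seed)
    sd = math.sqrt(t / n_steps)   # standard deviation of one Brownian increment
    b = 0.0                       # current value of the path, B_0 = 0
    ito = 0.0                     # sum of B_{t_j} (B_{t_{j+1}} - B_{t_j})
    qv = 0.0                      # sum of (B_{t_{j+1}} - B_{t_j})^2
    for _ in range(n_steps):
        db = rng.gauss(0.0, sd)
        ito += b * db             # integrand evaluated at the LEFT endpoint
        qv += db * db
        b += db
    return b, ito, qv


b_t, ito, qv = ito_left_sum()
print("left-point sum      :", ito)
print("B_t^2/2 - QV/2      :", 0.5 * b_t * b_t - 0.5 * qv)
print("sum of squares (~t) :", qv)
```

Evaluating the integrand at the left endpoint is what makes the approximating sums non-anticipating. A midpoint or right-endpoint rule would converge to a different limit.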
integral of H with respect to a particular martingale Theorem 15 The stochastic integral is associative.
is not a martingale. That is, H · X is also a semimartingale and if G ∈
Before we allow more general predictable inte-
grands in a stochastic integral, we need to develop G · (H · X) = (GH ) · X (22)
the notion of quadratic variation of a semimartingale.
Definition 11 The quadratic variation process of X,
This is discussed in the following section.
denoted by [X, X] = ([X, X]t )t≥0 , is defined as
When X and Y are FV processes, the classical then X is constant on [S, T ]. Moreover, if [X, X] is
integration by parts formula reads as follows: constant on [S, T ] ∩ [0, ∞), then X is also constant
there.
Xt Yt = X0 Y0 + (X− · Y )t The following result is quite handy when it comes
+ (Y− · X)t + Xs Ys (28) to the calculation of the quadratic covariation of two
0<s≤t stochastic integrals.
local martingale and A is predictable and of finite Moreover, it is easy to show that if (H n ) ⊂ b and
variation with M0 = A0 = 0 and X = X0 + M + A. (J n ) ⊂ b converge to the same limit under dX (·, ·),
This decomposition of a special semimartingale is then (H n · X) and (J n · X) converge to the same limit
unique and called the canonical decomposition. With- in H2 . Thus, we can now define the stochastic integral
out loss of generality, let us assume that X0 = 0. H · X for any H ∈ bP.
In the case X = B T , M = B T , and A = 0; therefore, random variable with (U = 1) = (U = −1) =
H being (H2 , X) integrable is equivalent to the 1/2, and set X = U 1[T ,∞) . Then, X is a martingale in
condition T
its own filtration. Let H be defined as Ht = 1t 1{t>0} .
Ɛ(Hs2 ) ds < ∞ (35) H is a deterministic predictable integral. Note that
0 H is not locally bounded, being only continuous on
(0, ∞). H · X exists as a Lebesgue–Stieltjes integral
which gives exactly the elements of L2 (B T ).
since X has paths of finite variation. However, H · X
So far, we have been able to define the stochastic
is not a local martingale since, for any stopping time
integral with predictable integrands only for semi-
S with P (S > 0) > 0, Ɛ(|(H · X)S |) = ∞.
martingales in H2 . This seems to be a major restric-
tion. However, as the following theorem shows, it When M is a continuous local martingale, the
is not. Recall that for a stopping time T , X T − = theory becomes nicer.
X1[0,T ) + XT − 1[T ,∞] .
Theorem 26 Let M be a continuous
t local martin-
Theorem 23 Let X be a semimartingale, X0 = gale and let H ∈ P be such that 0 Hs2 d[M, M]s <
0. Then X is prelocally in H2 . That is, there ∞, for each t ≥ 0. Then H ∈ L(M) and H · M is a
exists a nondecreasing sequence of stopping times continuous local martingale.
(T n ), limn→∞ T n = ∞ a.s., such that X T − ∈ H2 , for
n
Theorem 28 Let $X$ be a semimartingale and $(H^n) \subset \mathcal{P}$ be a sequence converging a.s. to a limit $H \in \mathcal{P}$. If there exists a process $G \in L(X)$ such that $|H^n| \le G$, for all $n$, then $H^n \in L(X)$ for all $n$, $H \in L(X)$ and $(H^n \cdot X)$ converges to $H \cdot X$ in UCP.

Concluding Remarks

In this article, we used the approach of Protter [7] to define the semimartingale as a good integrator and construct its stochastic integral. Another approach that is closely related is given by Chou et al. [1], who developed the stochastic integration for general predictable integrands with respect to a semimartingale in a space endowed with the semimartingale topology. Historically, the stochastic integral was first proposed for Brownian motion by Itô [3], then for continuous martingales, then for square integrable martingales, and finally for càdlàg processes that can be written as the sum of a locally square integrable local martingale and an FV process by J.L. Doob, H. Kunita, S. Watanabe, P. Courrège, P.A. Meyer, and others. Later in 1970, Doléans-Dade and Meyer [2] showed that the local square integrability condition could be relaxed, which led to the traditional definition of a semimartingale as a sum of a local martingale and an FV process. A different theory of stochastic integration, the Itô-belated integral, was developed by McShane [5]. It imposed different restrictions on the integrators and the integrands and used a theory of “gauges” and appeared to be very different from the approach here. It turns out, however, that when the integral $\int H\,dX$ made sense both as a stochastic integral in the sense developed here and as an Itô-belated integral, they were indistinguishable. See [6] for a comparison of these two integrals. Another related stochastic integral is called the Fisk–Stratonovich (FS) integral that was developed by Fisk and Stratonovich independently. The FS integral obeys the integration by parts formula for FV processes when at least one of the integrand or the integrator is continuous.

End Notes

a. See Definition 5 for the definition of a predictable process.
b. For a proof of the fact that UCP is metrizable and complete under that metric, see [7].
c. This sum converges to the quadratic variation of $B$ over the interval $[0, t]$ as we see in Theorem 16.

References

[1] Chou, C.S., Meyer, P.A. & Stricker, C. (1980). Sur les intégrales stochastiques de processus prévisibles non bornés, Séminaire de Probabilités, XIV. Lecture Notes in Mathematics, 784, Springer, Berlin, pp. 128–139.
[2] Doléans-Dade, C. & Meyer, P.-A. (1970). Intégrales stochastiques par rapport aux martingales locales, Séminaire de Probabilités, IV. Lecture Notes in Mathematics, 124, Springer, Berlin, pp. 77–107.
[3] Itô, K. (1944). Stochastic integral, Proceedings of the Imperial Academy of Tokyo 20, 519–524.
[4] Jeulin, T. (1980). Semi-martingales et Grossissement d’une Filtration, Lecture Notes in Mathematics, Springer, Berlin, Vol. 833.
[5] McShane, E.J. (1974). Stochastic Calculus and Stochastic Models, Probability and Mathematical Statistics, Academic Press, New York, Vol. 25.
[6] Protter, P. (1979). A comparison of stochastic integrals, The Annals of Probability 7(2), 276–289.
[7] Protter, P. (2005). Stochastic Integration and Differential Equations, 2nd Edition, Version 2.1, Springer, Berlin.

Related Articles

Arbitrage Strategy; Complete Markets; Equivalent Martingale Measures; Filtrations; Itô’s Formula; Martingale Representation Theorem; Semimartingale.

UMUT ÇETIN
Equivalence of Probability Measures

in $L^1(Q)$. We then have

\[ Z_s\,E_Q[f \mid \mathcal{F}_s] = E_P[Z_t f \mid \mathcal{F}_s] \tag{3} \]
it in a Wiener space setting. Later on it was extended in various levels of generality by Girsanov, Meyer, and Lenglart, among many others.

Let us first give some examples. They are all consequences of the general formulation of Girsanov’s theorem to be given below.

1. Let $B$ be a $P$-Brownian motion, $\mu \in \mathbb{R}$, and define an equivalent measure $Q$ by the stochastic exponential

\[ \frac{dQ}{dP} = \mathcal{E}(-\mu B)_T = \exp\Big(-\mu B_T - \frac{1}{2}\mu^2 T\Big) \tag{4} \]

Then $\tilde{B} = B + \mu t$ is a $Q$-Brownian motion (up to time $T$). Alternatively stated, the semimartingale decomposition of $B$ under $Q$ is $B = \tilde{B} - \mu t$. Hence the effect of the measure change is to add a drift term to the Brownian motion.

2. Let $N_t - \lambda t$ be a compensated Poisson process on an interval $[0, T]$ with $P$-intensity $\lambda > 0$, and let $\kappa > 0$. Define an equivalent measure $Q$ by

\[ \frac{dQ}{dP} = e^{-\kappa\lambda T} \prod_{0 < s \le T} (1 + \kappa\,\Delta N_s) = e^{-\kappa\lambda T} (1 + \kappa)^{N_T} = \exp\big(N_T \ln(1 + \kappa) - \kappa\lambda T\big) \tag{5} \]

Then $N$ is a Poisson process on $[0, T]$ under $Q$ with intensity $(1 + \kappa)\lambda$. The process $N_t - (1 + \kappa)\lambda t$ is a compensated Poisson process under $Q$ and thus a $Q$-martingale. Hence the effect of the measure change is to change the intensity of the Poisson process, or in other words, to add a drift term to the compensated Poisson process.

One of the most important applications of measure changes in mathematical finance is to find martingale measures for the price process $S$ of some risky asset.

Definition 2 A martingale measure for $S$ is a probability measure $Q$ such that $S$ is a $Q$-local martingale.

Let us now state a general form of Girsanov’s theorem. It is not the most general setting, though, since we will assume that $Q$ is equivalent to $P$, which suffices for most applications in finance. This is due to the fact that one would often choose $Q$ to be a martingale measure for the price process, and then equivalence is a necessary condition to exclude arbitrage opportunities [1]. There is, however, also a result which covers the case where $Q$ is only absolutely continuous, but not equivalent to $P$, and which has been proven by Lenglart [2].

Theorem 1 (Girsanov’s Theorem: Standard Version). Let $P \sim Q$, with density process given by

\[ Z_t = E\Big[\frac{dQ}{dP} \,\Big|\, \mathcal{F}_t\Big] \tag{6} \]

If $S$ is a semimartingale under $P$ with decomposition $S = M + A$ (here $M$ is a local martingale, and $A$ a process of locally finite variation), then $S$ is a semimartingale under $Q$ as well and has decomposition

\[ S = \Big(M - \int \frac{1}{Z}\,d[Z, M]\Big) + \Big(A + \int \frac{1}{Z}\,d[Z, M]\Big) \tag{7} \]

In particular, $M - \int \frac{1}{Z}\,d[Z, M]$ is a local $Q$-martingale.

In situations where the process $S$ may exhibit jumps, it is often more convenient to apply a version of Girsanov which uses the angle bracket instead of the quadratic covariation.

Theorem 2 (Girsanov’s Theorem: Predictable Version). Let $P \sim Q$, with density process as above, and $S = M + A$ be a $P$-semimartingale. Given that $\langle Z, M\rangle$ exists (with respect to $P$), then the decomposition of $S$ under $Q$ is

\[ S = \Big(M - \int \frac{1}{Z_-}\,d\langle Z, M\rangle\Big) + \Big(A + \int \frac{1}{Z_-}\,d\langle Z, M\rangle\Big) \tag{8} \]

Here $Z_-$ denotes the left-continuous version of $Z$.

Whereas the standard version of Girsanov’s theorem always works, we need an integrability condition (existence of $\langle Z, M\rangle$) for the predictable version.
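The Poisson example above (the density in equation (5)) lends itself to a direct numerical check, since under $P$ the variable $N_T$ is Poisson with mean $\lambda T$. The sketch below is an illustration only, with hypothetical function names and arbitrarily chosen parameter values. It sums the density against the Poisson probabilities to confirm that the total mass under $Q$ is 1 and that the $Q$-mean of $N_T$ is $(1 + \kappa)\lambda T$.

```python
import math


def poisson_pmf(k, mean):
    # P(N = k) for a Poisson variable, computed in log space for stability.
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def density(n_T, kappa, lam, T):
    # dQ/dP as a function of N_T: exp(N_T ln(1 + kappa) - kappa * lam * T).
    return math.exp(n_T * math.log(1.0 + kappa) - kappa * lam * T)


def q_mass_and_mean(kappa, lam, T, k_max=200):
    """Return (E_P[dQ/dP], E_P[dQ/dP * N_T]) by summing the series up to k_max."""
    mass = 0.0
    mean = 0.0
    for k in range(k_max + 1):
        w = poisson_pmf(k, lam * T) * density(k, kappa, lam, T)
        mass += w
        mean += k * w
    return mass, mean


mass, mean = q_mass_and_mean(kappa=0.5, lam=2.0, T=1.5)
print(mass)   # total Q-mass, should be close to 1
print(mean)   # Q-mean of N_T, should be close to (1 + 0.5) * 2.0 * 1.5
```

The series is truncated at `k_max`, which is an assumption that is harmless for moderate values of $\lambda T$ because the Poisson tail decays faster than geometrically.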
However, in case S = M + A for a local martingale For example, in the Bachelier model S = B + µt
M and a finite variation process A, it is rarely the case we have that Bt = t, and hence λ equals the con-
in a discontinuous framework that dA << d [M], stant µ.
whereas it is quite natural in financial applications The predictable version of Girsanov’s theorem
that dA << d M (see below). can now be applied to remove the drift λd M as
In mathematical finance, these results are often follows: we define a probability measure Q via
applied to find a martingale measure for the price pro-
cess S. Consider, for example, the Bachelier model dQ
= E − λ dM (14)
where S = B + µt is a Brownian motion plus drift. dP T
If we now take as above the measure
change as
given
where E denotes the Doléans-Dade stochastic
by a density process Zt = exp −µBt − 1 µ2 t , then exponential, assuming that E − λdM is a
2
we have (since dZ = −µZdB) martingale. The corresponding density process Z
therefore satisfies the stochastic differential equation
1 1
A+ d [Z, M] = µt + d −µ Z dB, B dZ = −Z− λ dM (15)
Z Z
1 It follows that
= µt + d −µ Z dt
Z
1
according to Lévy’s characterization), and hence Q S=M+ λd M = M − d Z, M (17)
Z−
is a martingale measure for S.
More generally, Girsanov’s theorem implies an is by the (predictable version) of the Girsanov theo-
important structural result for the price process S rem a local Q-martingale: the drift has been removed
in an arbitrage-free market. As has been mentioned by the measure change.
above, it is essentially true that some no-arbitrage This representation of S has an important con-
property implies the existence of an equivalent mar- sequence for the structure of martingale measures,
tingale measure Q for S = M + A, with density pro- provided the so-called structure condition holds:
cess Z. Therefore, we must have by the predictable
T
version (8), given that Z, M exists, that λ2s d Ms < ∞ P –a.s. (18)
0
1
A=− d Z, M (10) In that case, the remarkable conclusion we can
Z−
draw from (13) is that the existence of an equivalent
to get that S is a local Q-martingale. As it follows martingale measure for S implies that S is a spe-
from the so-called Kunita-Watanabe inequality that cial semimartingale, for example, its finite variation
part is predictable and therefore the semimartingale
d Z, M d M (11)
decomposition (13) is unique. Moreover, the follow-
(here Z, M respectively M are interpreted as the ing result holds.
associated measures on the nonnegative real line), we
conclude that Proposition 1 Let Q be an equivalent martingale
dA d M (12) measure for S, and the structure condition (18) hold.
Then the density process Z of Q with respect to P is
and hence there exists some predictable process λ given by the stochastic exponential
such that
S = M + λ d M (13) Z = E − λ dM + L (19)
The SEP can be stated as follows:

Given a stochastic process $(X_t : t \ge 0)$ and a probability measure $\mu$, find a minimal stopping time $\tau$ such that $X_\tau$ has the law $\mu$: $X_\tau \sim \mu$.

At first, there seems to be a trivial solution to the SEP when $X_t = B_t$ is a Brownian motion. Write $\Phi$ and $F_\mu$ for the cumulative distribution function of the standard normal distribution and of $\mu$, respectively. Then $F_\mu^{-1}(\Phi(B_1))$ has law $\mu$ and hence the stopping time $\tau = \inf\{t \ge 2 : B_t = F_\mu^{-1}(\Phi(B_1))\}$ satisfies $B_\tau \sim \mu$. However, this solution is intuitively “too large”, in particular $\mathbb{E}\tau = \infty$. A meaningful solution needs to be “small”. To express this, Skorokhod [20] imposed $\mathbb{E}\tau < \infty$ and solved the problem explicitly for any centered target measure with finite variance. To avoid the restriction on the set of target measures, in general, one requires $\tau$ to be minimal. Minimality of $\tau$ signifies that if a stopping time $\rho$ satisfies $\rho \le \tau$ and $X_\rho \sim X_\tau$ then $\rho = \tau$. When $\mathbb{E}B_\tau = 0$,

Skorokhod [20] and Dubins [8] solved the SEP for Brownian motion and arbitrary centered (see end note a) probability measure $\mu$. However, the search for new solutions continued and was, to a large extent, motivated by the properties of the stopping times. Researchers sought simple explicit solutions that would have additional optimal properties. Several solutions were obtained using stopping times of the form

\[ \tau = \inf\{t : (A_t, B_t) \in \mathcal{R}\}, \qquad \mathcal{R} = \mathcal{R}(\mu) \subset \mathbb{R}^2 \tag{1} \]

which is a first hitting time for the Markov process $(A_t, B_t)$, where $(A_t)$ is some auxiliary increasing process. We now give two examples.

Consider $A_t = t$ and let $\tau_R$ be the resulting stopping time in (1). Root [17] proved that for any centered $\mu$ there is a barrier $\mathcal{R} = \mathcal{R}(\mu)$ such that $B_{\tau_R} \sim \mu$, where a barrier is a set in $\mathbb{R}_+ \times \mathbb{R}$ (time–space) such that if a point is in $\mathcal{R}$, then all points to the right of it are also in $\mathcal{R}$ (see Figure 1).
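Returning to the “trivial” construction described earlier, the key step is that $F_\mu^{-1}(\Phi(B_1))$ has law $\mu$. The sketch below illustrates only that step by simulation. The target measure (uniform on $[-1, 1]$) and the function names are assumptions made for this example, and the code does not simulate the stopping time itself.

```python
import random
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # provides the standard normal CDF, Phi


def uniform_quantile(u):
    # F_mu^{-1} for the assumed target mu = Uniform[-1, 1].
    return 2.0 * u - 1.0


def target_value(b1, quantile):
    # Map a draw of B_1 ~ N(0, 1) to F_mu^{-1}(Phi(B_1)).
    return quantile(STANDARD_NORMAL.cdf(b1))


def sample_targets(n, seed=0):
    rng = random.Random(seed)
    return [target_value(rng.gauss(0.0, 1.0), uniform_quantile) for _ in range(n)]


xs = sample_targets(20_000)
# Empirical distribution function at two points, to compare with the
# uniform values F_mu(0) = 0.5 and F_mu(0.5) = 0.75.
print(sum(x <= 0.0 for x in xs) / len(xs))
print(sum(x <= 0.5 for x in xs) / len(xs))
```

The transformation fixes the value that the path must reach only at time 1, which is why the resulting hitting time is large. The solutions discussed in the text instead build the target law into the stopping rule.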
2 Skorokhod Embedding
[Figure 1: The barrier $\mathcal{R}$ and Root stopping time $\tau_R$ embedding a uniform law; axes $B_t$ against $t$.]

Later Rost (cf. [14]) proved an analogous result replacing $\mathcal{R}(\mu)$ with a reversed barrier $\tilde{\mathcal{R}} = \tilde{\mathcal{R}}(\mu)$, which is a set in time–space such that if a point is in $\tilde{\mathcal{R}}$, then all the points to the left of it are also in $\tilde{\mathcal{R}}$. We denote by $\tilde{\tau}_R$ the first hitting time of $\tilde{\mathcal{R}}(\mu)$. Rost (cf. [14, 19]) proved that for any other solution $\tau$ to the SEP and any positive convex function $f$, we have

\[ \mathbb{E}f(\tau_R) \le \mathbb{E}f(\tau) \le \mathbb{E}f(\tilde{\tau}_R) \tag{2} \]

In financial terms, as we will see, this implies bounds on the prices of volatility derivatives. Given a measure $\mu$, the barrier $\mathcal{R}$ and the reversed barrier $\tilde{\mathcal{R}}$ are not known explicitly. However, using techniques of partial differential equations, they can be computed numerically together with the bounds in equation (2) (see [9]).

Consider now $A_t = \overline{B}_t = \sup_{u\le t} B_u$ in equation (1). Azéma and Yor [1] proved that, for a probability measure $\mu$ satisfying $\int x\,\mu(dx) = B_0$, the stopping time

\[ \tau_{AY} = \inf\{t : \Psi_\mu(B_t) \le \overline{B}_t\}, \qquad \text{where } \Psi_\mu(x) = \frac{1}{\mu([x, \infty))} \int_{[x,\infty)} u\,\mu(du) \tag{3} \]

is minimal and $B_{\tau_{AY}} \sim \mu$. The Azéma–Yor stopping time is also optimal as it stochastically maximizes the maximum: $\mathbb{P}(\overline{B}_\tau \ge \alpha) \le \mathbb{P}(\overline{B}_{\tau_{AY}} \ge \alpha)$, for all $\alpha \ge 0$ and any minimal $\tau$ with $B_\tau \sim B_{\tau_{AY}}$. Later, Perkins [16] developed a stopping time $\tau_P$, which, in turn, stochastically minimizes the maximum. As we will

Applications

Robust Price Bounds

In the standard approach to pricing and hedging, one postulates a model for the underlying, calibrates it to the market prices of liquidly traded vanilla options (see Call Options), and then uses the model to derive prices and associated hedges for exotic over-the-counter products (such as Barrier Options; Lookback Options; Foreign Exchange Options). Prices and hedges will be correct only if the model describes the real world perfectly, which is not very likely. The SEP-driven approach uses the market data to deduce bounds on the prices consistent with no-arbitrage and the associated super-replicating strategies (see Superhedging), which are robust to model misspecification.

Assume absence of arbitrage (see Fundamental Theorem of Asset Pricing) and work under a risk-neutral measure (see Risk-neutral Pricing) so that the forward price process (see Forwards and Futures) $(S_t : t \le T)$ is a martingale. Equivalently, under a simplifying assumption of zero interest rates, $S_t$ is simply the stock price process. We are interested in pricing an exotic option with payoff given by a path-dependent functional $F(S)_T$. Our main example considered below is a one-touch option struck at $\alpha$ that pays 1 if the stock price reaches $\alpha$ before maturity $T$: $O^\alpha(S)_T = 1_{\overline{S}_T \ge \alpha}$, where $\overline{S}_T = \sup_{t\le T} S_t$. It follows from Monroe’s theorem that $S_t = B_{\rho_t}$, for a Brownian motion $(B_t)$ with $B_0 = S_0$ and some increasing family of stopping times $(\rho_t : t \le T)$ (possibly relative to an enlarged filtration). We make no other assumptions about the dynamics of the underlying. Instead, we propose to investigate the restrictions induced by the market data.

Suppose, first, that we know the market prices of calls and puts (see Call Options) for all strikes at one maturity $T$. This is equivalent to knowing the distribution $\mu$ of $S_T$ (cf. [3]). Thus, we can see the stopping time $\rho = \rho_T$ as a solution to the SEP for $\mu$. Conversely, given a solution $\tau$ to the SEP for $\mu$, the process $\tilde{S}_t = B_{\tau \wedge \frac{t}{T-t}}$ is a model for the stock-price process consistent with the observed prices of calls and puts at maturity $T$. In this way, we obtain
a correspondence that allows us to identify market models with solutions to the SEP and vice versa. In consequence, to estimate the fair price of the exotic option $\mathbb{E}F(S)_T$, it suffices to bound $\mathbb{E}F(B)_\tau$ among all solutions $\tau$ to the SEP. More precisely, if $F(S)_T = F(B)_{\rho_T}$ a.s., then we have

\[ \inf_{\tau : B_\tau \sim \mu} \mathbb{E}F(B)_\tau \le \mathbb{E}F(S)_T \le \sup_{\tau : B_\tau \sim \mu} \mathbb{E}F(B)_\tau \tag{4} \]

where all stopping times $\tau$ are minimal. Consider, for example, a volatility derivative (see end note b) paying $F(S)_T = f(\langle S\rangle_T)$, for some positive convex function $f$, and suppose that the underlying $(S_t)$ is continuous. Then, by the Dubins–Schwarz theorem, we can take the time change $\rho_t = \langle S\rangle_t$ so that $f(\langle S\rangle_T) = f(\rho_T) = F(B)_{\rho_T}$. Using inequality (2), inequality (4) becomes

\[ \mathbb{E}f(\tau_R) \le \mathbb{E}f(\langle S\rangle_T) \le \mathbb{E}f(\tilde{\tau}_R) \tag{5} \]

where $B_{\tau_R} \sim S_T \sim B_{\tilde{\tau}_R}$ (cf. [9]).

When $(S_t)$ has jumps, typically one of the bounds in inequality (4) remains true and the other degenerates. In the example of a one-touch option, one sees that $O^\alpha(S)_T \le O^\alpha(B)_{\rho_T}$ and the fair price is always bounded above by $\sup_\tau \{\mathbb{P}(\overline{B}_\tau \ge \alpha) : B_\tau \sim \mu\}$. Furthermore, the supremum is attained by the Azéma–Yor construction discussed above. The best lower bound on the price in the presence of jumps is the obvious bound $\mu([\alpha, \infty))$. In consequence, the price of a one-touch option $\mathbb{E}O^\alpha(S)_T = \mathbb{P}(\overline{S}_T \ge \alpha)$ is bounded by

\[ \mu([\alpha, \infty)) \le \mathbb{P}(\overline{S}_T \ge \alpha) \le \mathbb{P}(\overline{B}_{\tau_{AY}} \ge \alpha) = \mu\big([\Psi_\mu^{-1}(\alpha), \infty)\big) \tag{6} \]

and the lower bound can be improved to $\mathbb{P}(\overline{B}_{\tau_P} \ge \alpha)$ under the hypothesis that $(S_t)$ is continuous, where $\tau_P$ is Perkins’ stopping time (see [5] for detailed discussion and numerical examples). Selling a one-touch option for a lower price than the upper bound in equation (6) necessarily involves some risk. If additional modeling assumptions are made, then a lower price can be justified, but this new price is not necessarily robust to model misspecification.

The above analysis can be extended if we know more market data. For example, knowing prices of puts and calls at some earlier expiry $T_1 < T$ would lead to solving the SEP, constrained by embedding an intermediate law $\mu_1$ before $\mu$. This was achieved by Brown et al. [4] who gave an explicit construction of an embedding that maximizes the maximum. As we have seen, in financial terms, this amounts to obtaining the least upper bound on the price of a one-touch option.

In practice, we do not observe the prices of calls and puts for all strikes but only for a finite family of strikes. As a result, the terminal law of $S_T$ is not specified entirely and one needs to optimize among possible terminal laws (cf. [5, 10]). In general, different sets of market prices lead to embedding problems with different constraints. The resulting problems can be complex. In particular, to the best of our knowledge, there are no known optimal solutions to the SEP with multiple intermediate law constraints.

Robust Hedging

Once we know the price range for an option, we want to understand model-free super-replicating strategies (see Superhedging). In general, to achieve this, we need to develop a pathwise approach to the SEP. Following [5], we treat the example of a one-touch option. We develop a super-replicating portfolio with the initial wealth equal to the upper bound displayed in equation (6).

The key observation lies in the following simple inequality:

\[ 1_{\overline{S}_T \ge \alpha} \le \frac{(S_T - K)^+}{\alpha - K} + \frac{S_{\varsigma\wedge T} - S_T}{\alpha - K}\,1_{\overline{S}_T \ge \alpha} \tag{7} \]

where $\alpha > S_0, K$ and $\varsigma = \inf\{t : S_t \ge \alpha\}$. Taking expectations yields $\mathbb{P}(\overline{S}_T \ge \alpha) \le C(K)/(\alpha - K)$, where $C(K)$ denotes the price of a European call with strike $K$ and maturity $T$. Taking the optimal $K = K^*$ such that $C(K^*) = (\alpha - K^*)|C'(K^*)|$ we find $\mathbb{P}(\overline{S}_T \ge \alpha) \le |C'(K^*)| = \mathbb{P}(S_T \ge K^*)$. On the other hand, using $|C'(K)| = \mu([K, \infty))$, where $\mu \sim S_T$, we have

\[ C(K) = \int_K^\infty (u - K)\,\mu(du) = |C'(K)|\,\big(\Psi_\mu(K) - K\big) \tag{8} \]

The equation for $K^*$ implies readily that $K^* = \Psi_\mu^{-1}(\alpha)$ and the bound we have derived coincides with equation (6).

Inequality (7) encodes the super-replicating strategy. The first term of the right-hand side means we buy $1/(\alpha - K^*)$ calls with strike $K^*$. The second
[16] Perkins, E. (1986). The Cereteli-Davis solution to the $H^1$-embedding problem and an optimal embedding in Brownian motion, in Seminar on stochastic processes, 1985 (Gainesville, Fla., 1985), Progress in Probability and Statistics, Birkhäuser Boston, Boston, Vol. 12, pp. 172–223.
[17] Root, D.H. (1969). The existence of certain stopping times on Brownian motion, The Annals of Mathematical Statistics 40, 715–718.
[18] Rost, H. (1971). The stopping distributions of a Markov Process, Inventiones Mathematicae 14, 1–16.
[19] Rost, H. (1976). Skorokhod stopping times of minimal variance, in Séminaire de Probabilités, X, Lecture Notes in Mathematics, Springer, Berlin, Vol. 511, pp. 194–208.
[20] Skorokhod, A.V. (1965). Studies in the Theory of Random Processes, Addison-Wesley Publishing Co., Reading, Translated from the Russian by Scripta Technica, Inc.
[21] Vallois, P. (1983). Le problème de Skorokhod sur R: une approche avec le temps local, in Séminaire de Probabilités, XVII, Lecture Notes in Mathematics, Springer, Berlin, Vol. 986, pp. 227–239.

Related Articles

Arbitrage Bounds; Arbitrage: Historical Perspectives; Arbitrage Pricing Theory; Arbitrage Strategy; Barrier Options; Complete Markets; Convex Risk Measures; Good-deal Bounds; Hedging; Implied Volatility Surface; Martingales; Model Calibration; Static Hedging; Superhedging.

JAN OBŁÓJ
Markov Processes

A Markov process is a process that evolves in a memoryless way: its future law depends on the past only through the present position of the process. This property can be formalized in terms of conditional expectations: a process $(X_t, t \ge 0)$ adapted to the filtration $(\mathcal{F}_t)_{t \ge 0}$ (representing the information available at time $t$) is a Markov process if

$$\mathbb{E}(f(X_{t+s}) \mid \mathcal{F}_t) = \mathbb{E}(f(X_{t+s}) \mid X_t) \tag{1}$$

for all $s, t \ge 0$ and $f$ bounded and measurable.

The interest of such a process in financial models becomes clear when one observes that the price of an option, or more generally, the value at time $t$ of any future claim with maturity $T$, is given by the general formula (see Risk-neutral Pricing)

$$V_t = \text{value at time } t = \mathbb{E}(\text{discounted payoff at time } T \mid \mathcal{F}_t) \tag{2}$$

where the expectation is computed with respect to a pricing measure (see Equivalent Martingale Measures). The Markov property is a frequent assumption in financial models because it provides powerful tools (semigroup, theory of partial differential equations (PDEs), etc.) for the quantitative analysis of such problems.

Assuming the Markov property (1) for $(S_t, t \ge 0)$, the value $V_t$ of the option can be expressed as

$$V_t = \mathbb{E}(e^{-r(T-t)} f(S_T) \mid \mathcal{F}_t) = \mathbb{E}(e^{-r(T-t)} f(S_T) \mid S_t) \tag{3}$$

so $V_t$ can be expressed as a (deterministic) function of $t$, $S_t$: $u(t, S_t) = \mathbb{E}(e^{-r(T-t)} f(S_T) \mid S_t)$. Furthermore, this function $u$ is shown to be the solution of a parabolic PDE, the Kolmogorov backward equation.

The goal of this article is to present the Markov processes and their relation with PDEs, and to illustrate the role of Markovian models in various financial problems. We give a general overview of the links between Markov processes and PDEs without going into detail, and we focus on the case of Markov processes that are solutions to stochastic differential equations (SDEs).

We will restrict ourselves to $\mathbb{R}^d$-valued Markov processes. The set of Borel subsets of $\mathbb{R}^d$ is denoted by $\mathcal{B}$. In the following, we will denote a Markov process by $(X_t, t \ge 0)$, or simply $X$ when no confusion is possible.

Markov Property and Transition Semigroup

A Markov process retains no memory of where it has been in the past. Only the current state of the process influences its future dynamics. The following definition formalizes this notion:

Definition 1 Let $(X_t, t \ge 0)$ be a stochastic process defined on a filtered probability space $(\Omega, \mathcal{F}_t, \mathbb{P})$ with values in $\mathbb{R}^d$. $X$ is a Markov process if

$$\mathbb{P}(X_{t+s} \in \Gamma \mid \mathcal{F}_t) = \mathbb{P}(X_{t+s} \in \Gamma \mid X_t) \quad \mathbb{P}\text{-a.s.} \tag{4}$$

for all $s, t \ge 0$ and $\Gamma \in \mathcal{B}$. Equation (4) is called the Markov property of the process $X$. The Markov process is called time homogeneous if the law of $X_{t+s}$ conditionally on $X_t = x$ is independent of $t$.

Observe that equation (4) is equivalent to equation (1) and that $X$ is a time-homogeneous Markov process if there exists a positive function $P$ defined on $\mathbb{R}_+ \times \mathbb{R}^d \times \mathcal{B}$ such that

$$P(s, X_t, \Gamma) = \mathbb{P}(X_{t+s} \in \Gamma \mid \mathcal{F}_t) \tag{5}$$

holds $\mathbb{P}$-a.s. for all $t, s \ge 0$ and $\Gamma \in \mathcal{B}$. $P$ is called the transition function of the time-homogeneous Markov process $X$.

For the moment, we restrict ourselves to the time-homogeneous case.

Proposition 1 The transition function $P$ of a time-homogeneous Markov process $X$ satisfies

1. $P(t, x, \cdot)$ is a probability measure on $\mathbb{R}^d$ for any $t \ge 0$ and $x \in \mathbb{R}^d$,
2. $P(0, x, \cdot) = \delta_x$ (unit mass at $x$) for any $x \in \mathbb{R}^d$,
3. $P(\cdot, \cdot, \Gamma)$ is measurable for any $\Gamma \in \mathcal{B}$,

and for any $s, t \ge 0$, $x \in \mathbb{R}^d$, $\Gamma \in \mathcal{B}$, $P$ satisfies the Chapman–Kolmogorov property

$$P(t + s, x, \Gamma) = \int_{\mathbb{R}^d} P(s, y, \Gamma)\, P(t, x, dy) \tag{6}$$
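The Chapman–Kolmogorov property (6) can be checked numerically in a simple case. The sketch below assumes the process is a one-dimensional standard Brownian motion, whose transition function has the Gaussian density $p(t, x, y)$; the function names `gauss_kernel` and `chapman_kolmogorov_rhs` and the integration grid are illustrative choices, not part of the article.

```python
import numpy as np

def gauss_kernel(t, x, y):
    """Transition density p(t, x, y) of a standard Brownian motion, t > 0."""
    return np.exp(-(y - x) ** 2 / (2.0 * t)) / np.sqrt(2.0 * np.pi * t)

def chapman_kolmogorov_rhs(t, s, x, z, grid):
    """Approximate the integral of p(s, y, z) p(t, x, y) over y (trapezoidal rule)."""
    integrand = gauss_kernel(s, grid, z) * gauss_kernel(t, x, grid)
    dy = grid[1] - grid[0]
    return dy * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

if __name__ == "__main__":
    t, s, x, z = 0.7, 1.3, 0.2, -0.5
    grid = np.linspace(-20.0, 20.0, 40001)  # assumed wide enough for these values
    lhs = gauss_kernel(t + s, x, z)          # density form of P(t + s, x, .)
    rhs = chapman_kolmogorov_rhs(t, s, x, z, grid)
    print(lhs, rhs)
```

The two printed numbers should agree up to quadrature error, which is the density version of equation (6) for this particular process.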
From an analytical viewpoint, we can think of the transition function as a Markov semigroup (see End Note a) $(P_t, t \ge 0)$, defined by

$$P_t f(x) := \int_{\mathbb{R}^d} P(t, x, dy)\, f(y) = \mathbb{E}(f(X_t) \mid X_0 = x) \tag{7}$$

in which case the Chapman–Kolmogorov equation becomes the semigroup property

Theorem 1 ([9] Th.4.2.7). Given a Feller semigroup $(P_t, t \ge 0)$ and any probability measure $\nu$ on $\mathbb{R}^d$, there exists a filtered probability space $(\Omega, \mathcal{F}_t, \mathbb{P})$ and a strong Markov process $(X_t, t \ge 0)$ on this space with values in $\mathbb{R}^d$ with initial law $\nu$ and with transition function $P_t$. A strong Markov process whose semigroup is Feller is called a Feller process.

Infinitesimal Generator
this to hold are given by the Hille–Yosida theorem, see [21, Th.III.5.1]). For almost all Markov financial models, these conditions are well established and always satisfied (see Examples 1, 2, 3, and 4). As illustrated by equation (14), when $D(L)$ is large enough, the infinitesimal generator captures the law of the whole dynamics of a Markov process and provides an analytical tool to study the Markov process.

The other major mathematical tool used in finance is the stochastic calculus (see Stochastic integral, Itô formula), which applies to Semimartingales (see [18]). It is therefore crucial for applications to characterize under which conditions a Markov process is a semimartingale. This question is answered for very general processes in [5]. We mention that this is always the case for Feller diffusions, defined later.

equation; for all $f \in D(L)$,

$$\frac{d}{dt} P_t f = L P_t f \tag{16}$$

This equation is called Kolmogorov's backward equation. In particular, if $L$ is a differential operator (e.g., if $X$ is a Feller diffusion), the function $u(t, x) = P_t f(x)$ is the solution of the PDE

$$\begin{cases} \dfrac{\partial u}{\partial t} = L u \\ u(0, x) = f(x) \end{cases} \tag{17}$$

Conversely, if this PDE admits a unique solution, then its solution is given by

$$u(t, x) = \mathbb{E}(f(X_t) \mid X_0 = x) \tag{18}$$
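As a small numerical illustration of the backward equation (17) and the representation (18), the sketch below takes the simplest case, assumed here for illustration only: $X$ is a one-dimensional standard Brownian motion, so that $L = \tfrac{1}{2}\, d^2/dx^2$. The expectation in (18) is computed by Gauss–Hermite quadrature, and the PDE residual is estimated by finite differences; the helper names are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

# Nodes and weights for integrals against exp(-z^2).
_NODES, _WEIGHTS = hermgauss(60)

def semigroup(f, t, x):
    """P_t f(x) = E[f(x + B_t)] for standard Brownian motion B (t > 0)."""
    return np.sum(_WEIGHTS * f(x + np.sqrt(2.0 * t) * _NODES)) / np.sqrt(np.pi)

def backward_residual(f, t, x, h=1e-3):
    """Finite-difference estimate of du/dt - (1/2) d^2u/dx^2 for u = P_t f."""
    du_dt = (semigroup(f, t + h, x) - semigroup(f, t - h, x)) / (2.0 * h)
    d2u_dx2 = (semigroup(f, t, x + h) - 2.0 * semigroup(f, t, x)
               + semigroup(f, t, x - h)) / h ** 2
    return du_dt - 0.5 * d2u_dx2

if __name__ == "__main__":
    t, x = 0.5, 0.3
    # For f = cos, the exact value is exp(-t/2) * cos(x).
    print(semigroup(np.cos, t, x), np.exp(-t / 2.0) * np.cos(x))
    print(backward_residual(np.cos, t, x))  # should be close to zero
```

The choice $f = \cos$ is convenient because $P_t f$ is known in closed form, so both the expectation and the PDE residual can be compared against exact values.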
which the parameters of Theorem 3 are $b = 0$ and $a = 1$. The Brownian motion is the fundamental prototype of Feller diffusions. Other diffusions are inherited from this process because they can be expressed as solutions to SDEs driven by independent Brownian motions (see later). Similarly, the standard $d$-dimensional Brownian motion is a vector of $d$ independent standard one-dimensional Brownian motions and corresponds to the case $b_i = 0$ and $a_{ij} = \delta_{ij}$ for $1 \le i, j \le d$, where $\delta_{ij}$ is the Kronecker delta function ($\delta_{ij} = 1$ if $i = j$ and $0$ otherwise).

Example 2 Black–Scholes Model In the Black–Scholes model, the underlying asset price $S_t$ follows a geometric Brownian motion with constant drift $\mu$ and volatility $\sigma$.

$$S_t = S_0 \exp\left((\mu - \sigma^2/2)\, t + \sigma B_t\right) \tag{29}$$

where $B$ is a standard Brownian motion. With Itô's formula, it is easily checked that $S$ is a Feller diffusion with infinitesimal generator

$$L f(x) = \mu x f'(x) + \tfrac{1}{2} \sigma^2 x^2 f''(x) \tag{30}$$

Itô's formula also yields

$$S_t = S_0 + \mu \int_0^t S_s\, ds + \sigma \int_0^t S_s\, dB_s \tag{31}$$

which can be written as the SDE

$$dS_t = \mu S_t\, dt + \sigma S_t\, dB_t \tag{32}$$

The correspondence between the SDE and the second-order differential operator $L$ appears below as a general fact.

Example 3 Stochastic Differential Equations SDEs are probably the most used Markov models in finance. Solutions of SDEs are examples of Feller diffusions. When the parameters $b_i$ and $a_{ij}$ of Theorem 3 are sufficiently regular, a Feller process $X$ with generator equation (15) can be constructed as the solution of the SDE

$$dX_t = b(X_t)\, dt + \sigma(X_t)\, dB_t \tag{33}$$

where $b(x) \in \mathbb{R}^d$ is $(b_1(x), \ldots, b_d(x))$, where the $d \times r$ matrix $\sigma(x)$ satisfies $a_{ij}(x) = \sum_{k=1}^{r} \sigma_{ik}(x) \sigma_{jk}(x)$ (i.e., $a = \sigma \sigma^{\top}$) and where $B_t$ is an $r$-dimensional standard Brownian motion. For example, when $d = r$, one can take for $\sigma(x)$ the symmetric square root matrix of the matrix $a(x)$.

The construction of Markov solutions to the SDE (33) with generator (15) is possible if $b$ and $\sigma$ are globally Lipschitz with linear growth [13, Th.5.2.9], or if $b$ and $a$ are bounded and continuous functions [13, Th.5.4.22]. In the second case, the SDE has a solution in a weaker sense. Uniqueness (at least in law) and the strong Markov property hold if $b$ and $\sigma$ are locally Lipschitz [13, Th.5.2.5], or if $b$ and $a$ are Hölder continuous and the matrix $a$ is uniformly positive definite [13, Rmk.5.4.30, Th.5.4.20]. In the one-dimensional case, existence and uniqueness for the SDE (33) can be proved under weaker assumptions [13, Sec.5.5].

In all these cases, the Markov property allows one to identify the SDE (33) with its generator (15). This will allow us to make the link between parabolic PDEs and the corresponding SDE in the section "Parabolic PDEs Associated to Markov Processes" and the sections that follow.

Similarly, one can associate to the time-inhomogeneous SDE

$$dX_t = b(t, X_t)\, dt + \sigma(t, X_t)\, dB_t \tag{34}$$

the time-inhomogeneous generators (28). Existence for this SDE holds if $b_i$ and $\sigma_{ij}$ are globally Lipschitz in $x$ and locally bounded (uniqueness holds if $b_i$ and $\sigma_{ij}$ are only locally Lipschitz in $x$). As earlier, in this case, a solution to equation (34) is strong Markov. We refer the reader to [16] for more details.

Example 4 Backward Stochastic Differential Equations Backward stochastic differential equations are SDEs where a random variable is given as a terminal condition. Let us motivate the definition of a backward SDE (BSDE) by continuing the study of the elementary example of the introduction of this article.

Consider an asset $S_t$ modeled by the Black–Scholes SDE (32) and assume that it is possible to borrow and lend cash at a constant risk-free interest rate $r$. A self-financed trading strategy is determined by an initial portfolio value and the amount $\pi_t$ of the portfolio value placed in the risky asset at time $t$. Given the stochastic process $(\pi_t, t \ge 0)$, the portfolio
where $c(t, x)$ is uniformly bounded and locally Hölder on $[0, T] \times \mathbb{R}^d$, $f(t, x)$ is locally Hölder on $[0, T] \times \mathbb{R}^d$, $g(x)$ is continuous on $\mathbb{R}^d$ and

$$|f(t, x)| + |g(x)| \le A \exp(a|x|), \quad \forall (t, x) \in [0, T] \times \mathbb{R}^d \tag{56}$$

for some constants $A, a > 0$. Under these conditions, it follows easily from Theorems 6.4.5 and 6.4.6 of [10] that equation (55) admits a unique classical solution $u$ such that

$$|u(t, x)| \le A \exp(a|x|) \quad \forall (t, x) \in [0, T] \times \mathbb{R}^d \tag{57}$$

for some constant $A > 0$.

The following result is known as the Feynman–Kac formula and can be deduced from equation (57) using exactly the same method as for [10, Th.6.5.3] and using the fact that, under our assumptions, $X_t$ has finite exponential moments [10, Th.6.4.5].

Theorem 4 Under the previous assumptions, the solution of the Cauchy problem (55) is given by

$$u(t, x) = \mathbb{E}\left[ g(X_T) \exp\left( \int_t^T c(s, X_s)\, ds \right) \,\Big|\, X_t = x \right] - \mathbb{E}\left[ \int_t^T f(s, X_s) \exp\left( \int_t^s c(\alpha, X_\alpha)\, d\alpha \right) ds \,\Big|\, X_t = x \right] \tag{58}$$

Let us mention that this result can be extended to parabolic linear PDEs on bounded domains [10, Th.6.5.2] and to elliptic linear PDEs on bounded domains [10, Th.6.5.1].

Example 5 European Options The Feynman–Kac formula has many applications in finance. Let us consider the case of a European option on a one-dimensional Markov asset $(S_t, t \ge 0)$ with payoff $g(S_u, 0 \le u \le T)$. The arbitrage-free value at time $t$ of this option is

$$V_t = \mathbb{E}[e^{-r(T-t)} g(S_u, t \le u \le T) \mid \mathcal{F}_t] \tag{59}$$

By the Markov property (1), this quantity only depends on $S_t$ and $t$ [10, Th.2.1.2]. The Feynman–Kac formula (58) allows one to characterize $V$ in the case where $g$ depends only on $S_T$ and $S$ is a Feller diffusion.

Most often, the asset SDE

$$dS_t = S_t \left( \mu(t, S_t)\, dt + \sigma(t, S_t)\, dB_t \right) \tag{60}$$

cannot satisfy the uniform ellipticity assumption (54) in the neighborhood of 0. Therefore, Theorem 4 does not apply directly. This is a general difficulty for financial models. However, in most cases (and in all the examples below), it can be overcome by taking the logarithm of the asset price. In our case, we assume that the process $(\log S_t, 0 \le t \le T)$ is a Feller diffusion on $\mathbb{R}$ with time-inhomogeneous generator

$$L_t \phi(y) = \tfrac{1}{2} a(t, y) \phi''(y) + b(t, y) \phi'(y) \tag{61}$$

that satisfies the assumptions of Theorem 4. This holds for example for the Black–Scholes model (32). This assumption implies that $S$ is a Feller diffusion on $(0, +\infty)$ whose generator takes the form

$$\tilde{L}_t \phi(x) = \tfrac{1}{2} \tilde{a}(t, x) x^2 \phi''(x) + \tilde{b}(t, x) x \phi'(x) \tag{62}$$

where $\tilde{a}(t, x) = a(t, \log x)$ and $\tilde{b}(t, x) = b(t, \log x) + a(t, \log x)/2$.

Assume also that $g(x)$ is continuous on $\mathbb{R}_+$ with polynomial growth when $x \to +\infty$. Then, by Theorem 4, the function

$$v(t, y) = \mathbb{E}\left[ e^{-r(T-t)} g(S_T) \mid \log S_t = y \right] \tag{63}$$

is solution to the Cauchy problem

$$\begin{cases} \dfrac{\partial v}{\partial t}(t, y) + L_t v(t, y) - r v(t, y) = 0, & (t, y) \in [0, T) \times \mathbb{R} \\ v(T, y) = g(\exp(y)), & y \in \mathbb{R} \end{cases} \tag{64}$$
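The representation (63) can be illustrated numerically. The sketch below assumes a Black–Scholes asset under the pricing measure (constant volatility `sigma`, drift equal to the interest rate `r`) and a call payoff $g(x) = (x - K)^+$; these are illustrative assumptions, not requirements of the text. It evaluates the conditional expectation in (63) by direct integration against the Gaussian law of $\log S_T$ and compares it with the Black–Scholes closed form. The function names are hypothetical.

```python
import math
import numpy as np

def bs_call_closed_form(s, k, r, sigma, tau):
    """Black-Scholes price of a European call with time to maturity tau."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    cdf = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return s * cdf(d1) - k * math.exp(-r * tau) * cdf(d2)

def feynman_kac_price(y, k, r, sigma, tau, n=200001, zmax=10.0):
    """v(t, y) = E[exp(-r tau) (S_T - k)^+ | log S_t = y], by trapezoidal quadrature."""
    z = np.linspace(-zmax, zmax, n)                      # standard normal variable
    density = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    log_s_T = y + (r - 0.5 * sigma ** 2) * tau + sigma * math.sqrt(tau) * z
    integrand = np.maximum(np.exp(log_s_T) - k, 0.0) * density
    dz = z[1] - z[0]
    integral = dz * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return math.exp(-r * tau) * integral

if __name__ == "__main__":
    s, k, r, sigma, tau = 100.0, 100.0, 0.05, 0.2, 1.0
    print(feynman_kac_price(math.log(s), k, r, sigma, tau))
    print(bs_call_closed_form(s, k, r, sigma, tau))
```

Both numbers should coincide up to quadrature error; for more general coefficients $a$ and $b$ the law of $\log S_T$ is not explicit and a Monte Carlo or PDE method would be needed instead.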
Making the change of variable $x = \exp(y)$, $u(t, x) = v(t, \log x)$ is solution to

$$\begin{cases} \dfrac{\partial u}{\partial t}(t, x) + \tilde{b}(t, x)\, x\, \dfrac{\partial u}{\partial x}(t, x) + \dfrac{1}{2} \tilde{a}(t, x)\, x^2\, \dfrac{\partial^2 u}{\partial x^2}(t, x) - r u(t, x) = 0, & (t, x) \in [0, T) \times (0, +\infty) \\ u(T, x) = g(x), & x \in (0, +\infty) \end{cases} \tag{65}$$

where $B$ is a standard one-dimensional Brownian motion. The arbitrage-free price at time $t$ is

$$\mathbb{E}\left[ e^{-r(T-t)} \left( \frac{1}{T} \int_0^T S_u\, du - K \right)^+ \,\Big|\, \mathcal{F}_t \right] \tag{68}$$

To apply the Feynman–Kac formula, one must express this quantity as the (conditional) expectation of the value at time $T$ of some Markov quantity. This can be done by introducing the process

$$A_t = \int_0^t S_u\, du, \quad 0 \le t \le T \tag{69}$$

It is straightforward to check that $(S, A)$ is a Feller diffusion on $(0, +\infty)^2$ with infinitesimal generator

Actually, it is possible to justify the previous statement in the specific case of a one-dimensional Black–Scholes asset: $u$ can be written as

$$u(t, x, y) = e^{-r(T-t)}\, x\, \varphi\left( t, \frac{KT - y}{x} \right) \tag{73}$$

(see [20]) where $\varphi(t, z)$ is the solution of the one-dimensional parabolic PDE

$$\begin{cases} \dfrac{\partial \varphi}{\partial t}(t, z) + \dfrac{\sigma^2 z^2}{2} \dfrac{\partial^2 \varphi}{\partial z^2}(t, z) - \left( \dfrac{1}{T} + r z \right) \dfrac{\partial \varphi}{\partial z}(t, z) + r \varphi(t, z) = 0, & (t, z) \in [0, T) \times \mathbb{R} \\ \varphi(T, z) = (-z)^+ / T, & z \in \mathbb{R} \end{cases} \tag{74}$$

From this, it is easy to check that $u$ solves equation (72).

Note that this relies heavily on the fact that the underlying asset follows the Black–Scholes model. As far as we know, no rigorous justification of
Feynman–Kac formula is available for Asian options on more general assets.

Quasi- and Semilinear PDEs and BSDEs

The link between quasi- and semilinear PDEs and BSDEs is motivated by the following formal argument. Consider the semilinear PDE

$$\begin{cases} \dfrac{\partial u}{\partial t}(t, x) + L_t u(t, x) = f(u(t, x)), & (t, x) \in (0, T) \times \mathbb{R} \\ u(T, x) = g(x), & x \in \mathbb{R} \end{cases} \tag{75}$$

where $(L_t)$ is the family of infinitesimal generators of a time-inhomogeneous Feller diffusion $(X_t, t \ge 0)$. Assume that this PDE admits a classical solution $u(t, x)$. Assume also that we can find a unique adapted process $(Y_t, 0 \le t \le T)$ such that

$$Y_t = \mathbb{E}\left[ g(X_T) - \int_t^T f(Y_s)\, ds \,\Big|\, \mathcal{F}_t \right] \quad \forall t \in [0, T] \tag{76}$$

Now, by Itô's formula applied to $u(t, X_t)$,

$$u(t, X_t) = \mathbb{E}\left[ g(X_T) - \int_t^T f(u(s, X_s))\, ds \,\Big|\, \mathcal{F}_t \right] \tag{77}$$

Therefore, $Y_t = u(t, X_t)$ and the stochastic process $Y$ provides a probabilistic interpretation of the solution of the PDE (75). Now, by the martingale decomposition theorem, if $Y$ satisfies (76), there exists an adapted process $(Z_t, 0 \le t \le T)$ such that

$$Y_t = g(X_T) - \int_t^T f(Y_s)\, ds - \int_t^T Z_s\, dB_s \quad \forall t \in [0, T] \tag{78}$$

where $B$ is the same Brownian motion as the one driving the Feller diffusion $X$. In other words, $Y$ is solution of the SDE $dY_t = f(Y_t)\, dt + Z_t\, dB_t$ with terminal condition $Y_T = g(X_T)$.

The following definition of a BSDE generalizes the previous situation. Given functions $b_i(t, x)$ and $\sigma_{ij}(t, x)$ that are globally Lipschitz in $x$ and locally bounded ($1 \le i, j \le d$) and a standard $d$-dimensional Brownian motion $B$, consider the unique solution $X$ of the time-inhomogeneous SDE

$$dX_t = b(t, X_t)\, dt + \sigma(t, X_t)\, dB_t \tag{79}$$

with initial condition $X_0 = x$. Consider also two functions $f : [0, T] \times \mathbb{R}^d \times \mathbb{R}^k \times \mathbb{R}^{k \times d} \to \mathbb{R}^k$ and $g : \mathbb{R}^d \to \mathbb{R}^k$. We say that $((Y_t, Z_t), t \ge 0)$ solve the BSDE

$$dY_t = f(t, X_t, Y_t, Z_t)\, dt + Z_t\, dB_t \tag{80}$$

with terminal condition $g(X_T)$ if $Y$ and $Z$ are progressively measurable processes with respect to the Brownian filtration $\mathcal{F}_t = \sigma(B_s, s \le t)$ such that, for any $0 \le t \le T$,

$$Y_t = g(X_T) - \int_t^T f(s, X_s, Y_s, Z_s)\, ds - \int_t^T Z_s\, dB_s \tag{81}$$

Example 4 corresponds to $g(x) = (x - K)^+$, $f(t, x, y, z) = -ry + z(\mu - r)/\sigma$ and $Z_t = \sigma \pi_t$.

Note that the role of the implicit unknown process $Z$ is to make $Y$ adapted.

The existence and uniqueness of $(Y, Z)$ solving equation (81) hold under the assumptions that $g(x)$ is continuous with polynomial growth in $x$, $f(t, x, y, z)$ is continuous with polynomial growth in $x$ and linear growth in $y$ and $z$, and $f$ is uniformly Lipschitz in $y$ and $z$. Let us denote by (A) all these assumptions. We refer to [17] for the proof of this result and the general theory of BSDEs (see also forward-backward SDEs).

Consider the quasi-linear parabolic PDE

$$\begin{cases} \dfrac{\partial u}{\partial t}(t, x) + L_t u(t, x) = f(t, x, u(t, x), \nabla_x u(t, x) \sigma(t, x)), & (t, x) \in (0, T) \times \mathbb{R}^d \\ u(T, x) = g(x), & x \in \mathbb{R}^d \end{cases} \tag{82}$$

The following results give the links between the BSDE (80) and the PDE (82).
Theorem 5 ([15], Th.4.1). Assume that $b(t, x)$, $\sigma(t, x)$, $f(t, x, y, z)$, and $g(x)$ are continuous and differentiable with respect to the space variables $x, y, z$ with uniformly bounded derivatives. Assume also that $b$, $\sigma$, and $f$ are uniformly bounded and that $a = \sigma \sigma^{\top}$ is uniformly elliptic. Then equation (82) admits a unique classical solution $u$ and

$$Y_t = u(t, X_t) \quad \text{and} \quad Z_t = \nabla_x u(t, X_t)\, \sigma(t, X_t) \tag{83}$$

Theorem 6 ([17], Th.2.4). Assume (A) and that $b(t, x)$ and $\sigma(t, x)$ are globally Lipschitz in $x$ and locally bounded. Define the function $u(t, x) = Y_t^{t,x}$, where $Y^{t,x}$ is the solution to the BSDE (80) on the time interval $[t, T]$, where $X$ is solution to the SDE (79) with initial condition $X_t = x$. Then $u$ is a viscosity solution of equation (82).

Theorem 5 gives an interpretation of the solution of a BSDE in terms of the solution of a quasi-linear PDE. In particular, in Example 4, it gives the usual interpretation of the hedging strategy $\pi_t = Z_t / \sigma$ as the $\Delta$-hedge of the option price. Note also that Theorem 5 implies that the process $(X, Y, Z)$ is Markov, a fact which is not obvious from the definition. Conversely, Theorem 6 shows how to construct a viscosity solution of a quasi-linear PDE from BSDEs.

BSDEs provide an indirect tool to compute quantities related to a solution $X$ of the SDE (such as the hedging price and strategy of an option based on the process $X$). BSDEs also have links with general stochastic control problems, which we will not discuss (see BSDEs). Here, we give an example of application to the pricing of an American put option.

In the case of a European put option, the price is given by the solution of the BSDE

$$Y_t = (K - S_T)^+ - \int_t^T Z_s\, dB_s \tag{85}$$

by a similar argument as in Example 4. In the case of an American put option, the price at time $t$ is necessarily bigger than $(K - S_t)^+$. It is therefore natural to include this condition by considering the BSDE (85) reflected on the obstacle $(K - S_t)^+$. Mathematically, this corresponds to the problem of finding adapted processes $Y$, $Z$, and $R$ such that

$$\begin{cases} Y_t = (K - S_T)^+ - \displaystyle\int_t^T Z_s\, dB_s + R_T - R_t \\ Y_t \ge (K - S_t)^+ \\ R \text{ is continuous, increasing, } R_0 = 0 \text{ and } \displaystyle\int_0^T [Y_t - (K - S_t)^+]\, dR_t = 0 \end{cases} \tag{86}$$

The process $R$ increases only when $Y_t = (K - S_t)^+$ in such a way that $Y$ cannot cross this obstacle. The existence of a solution of this problem is a particular case of general results (see [7]). As a consequence of the following theorem, this reflected BSDE gives a way to compute the price of the American put option.

Theorem 7 ([7], Th.7.2). The American put option has the price $Y_0$, where $(Y, Z, R)$ solves the reflected BSDE (86).

The essential argument of the proof is the following. Fix $t \in [0, T)$ and a stopping time $\tau \in [t, T]$. Since

$$Y_\tau - Y_t = R_t - R_\tau + \int_t^\tau Z_s\, dB_s \tag{87}$$
Therefore, similarly as in Theorem 6, the reflected BSDE (86) provides a probabilistic interpretation of the solution of this PDE.

The (formal) essential argument of the proof of this result can be summarized as follows (for details, see [14, Section V.3.1]). Consider the solution $u$ of equation (90) and apply Itô's formula to $u(t, S_t)$. Then, for any stopping time $\tau \in [0, T]$,

$$u(0, x) = \mathbb{E}[u(\tau, S_\tau)] - \mathbb{E}\left[ \int_0^\tau \left( \frac{\partial u}{\partial t}(s, S_s) + \frac{\sigma^2}{2} S_s^2 \frac{\partial^2 u}{\partial x^2}(s, S_s) \right) ds \right] \tag{91}$$

Because $u$ is solution of equation (90), $u(0, x) \ge \mathbb{E}[u(\tau, S_\tau)] \ge \mathbb{E}[(K - S_\tau)^+]$. Hence, $u(0, x) \ge \sup_{0 \le \tau \le T} \mathbb{E}[(K - S_\tau)^+]$.

Conversely, if $\tau^* = \inf\{0 \le s \le T : u(s, S_s) = (K - S_s)^+\}$, then

$$\frac{\partial u}{\partial t}(t, S_t) + \frac{\sigma^2}{2} S_t^2 \frac{\partial^2 u}{\partial x^2}(t, S_t) = 0 \quad \forall t \in [0, \tau^*] \tag{92}$$

Therefore, for $\tau = \tau^*$, all the inequalities in the previous computation are equalities and $u(0, x) = \sup_{0 \le \tau \le T} \mathbb{E}[(K - S_\tau)^+]$.

tions of portfolio management, quadratic hedging of options, or super-hedging cost for uncertain volatility models.

Let us consider a controlled diffusion $X^\alpha$ in $\mathbb{R}^d$ solution to the SDE

$$dX_t^\alpha = b(X_t^\alpha, \alpha_t)\, dt + \sigma(X_t^\alpha)\, dB_t \tag{93}$$

where $B$ is a standard $r$-dimensional Brownian motion and the control $\alpha$ is a given progressively measurable process taking values in some compact metric space $A$. Such a control is called admissible. For simplicity, we consider the time-homogeneous case and we assume that the control does not act on the diffusion coefficient $\sigma$ of the SDE. Assume that $b(x, a)$ is bounded, continuous, and Lipschitz in the variable $x$, uniformly in $a \in A$. Assume also that $\sigma$ is Lipschitz and bounded. For any $a \in A$, we introduce the linear differential operator

$$L^a \varphi = \frac{1}{2} \sum_{i,j=1}^{d} \left( \sum_{k=1}^{r} \sigma_{ik}(x) \sigma_{jk}(x) \right) \frac{\partial^2 \varphi}{\partial x_i \partial x_j} + \sum_{i=1}^{d} b_i(x, a) \frac{\partial \varphi}{\partial x_i} \tag{94}$$

which is the infinitesimal generator of $X^\alpha$ when $\alpha$ is a constant equal to $a \in A$.
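The optimal stopping characterization of the American put discussed above can be illustrated with a discrete approximation. The sketch below is not the method of the article: it swaps in a Cox–Ross–Rubinstein binomial tree as a stand-in for the Black–Scholes asset, includes discounting at rate `r`, and computes the supremum over stopping times by backward dynamic programming. The function name and all parameter values are illustrative assumptions.

```python
import math
import numpy as np

def crr_put(s0, k, r, sigma, T, n, american=True):
    """Put price on an n-step Cox-Ross-Rubinstein tree (assumed discretization)."""
    dt = T / n
    u = math.exp(sigma * math.sqrt(dt))      # up factor
    d = 1.0 / u                              # down factor
    p = (math.exp(r * dt) - d) / (u - d)     # risk-neutral up probability
    disc = math.exp(-r * dt)
    j = np.arange(n + 1)                     # number of up moves at maturity
    values = np.maximum(k - s0 * u ** j * d ** (n - j), 0.0)
    for step in range(n - 1, -1, -1):
        j = np.arange(step + 1)
        prices = s0 * u ** j * d ** (step - j)
        # Continuation value: discounted expectation over the two successors.
        values = disc * (p * values[1:] + (1.0 - p) * values[:-1])
        if american:
            # Obstacle: the value can never fall below immediate exercise.
            values = np.maximum(values, k - prices)
    return float(values[0])

if __name__ == "__main__":
    args = (100.0, 100.0, 0.05, 0.2, 1.0, 500)
    print("European put:", crr_put(*args, american=False))
    print("American put:", crr_put(*args, american=True))
```

The `np.maximum` step plays the role of the obstacle constraint: the computed value stays above the exercise payoff, and the American price is never below the European one.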
A typical form of finite horizon optimal control problems in finance consists in computing

$$u(t, x) = \inf_{\alpha \text{ admissible}} \mathbb{E}\left[ e^{-rT} g(X_T^\alpha) + \int_t^T e^{-rs} f(X_s^\alpha, \alpha_s)\, ds \,\Big|\, X_t^\alpha = x \right] \tag{95}$$

where $f$ and $g$ are continuous and bounded functions and to find an optimal control $\alpha^*$ that realizes the minimum. Moreover, it is desirable to find a Markov optimal control, that is, an optimal control having the form $\alpha_t^* = \psi(t, X_t)$. Indeed, in this case, the controlled diffusion $X^{\alpha^*}$ is a Markov process.

In the case of nondegenerate diffusion coefficient, we have the following link between optimal control problems and semilinear PDEs.

Theorem 8 Under the additional assumption that $\sigma$ is uniformly elliptic, $u$ is the unique bounded classical solution of the Hamilton–Jacobi–Bellman (HJB) equation

$$\begin{cases} \dfrac{\partial u}{\partial t}(t, x) + \inf_{a \in A} \{ L^a u(t, x) + f(x, a) \} - r u(t, x) = 0, & (t, x) \in (0, T) \times \mathbb{R}^d \\ u(T, x) = g(x), & x \in \mathbb{R}^d \end{cases} \tag{96}$$

Furthermore, a Markov control $\alpha_t^* = \psi(t, X_t)$ is optimal for a fixed initial condition $x$ and initial time $t = 0$ if and only if

$$L^{\psi(t,x)} u(t, x) + f(x, \psi(t, x)) = \inf_{a \in A} \{ L^a u(t, x) + f(x, a) \} \tag{97}$$

for almost every $(t, x) \in [0, T] \times \mathbb{R}^d$.

This is Theorem III.2.3 of [3] restricted to the case of precise controls (see later).

Here again, the essential argument of the proof can be easily (at least formally) written: consider any admissible control $\alpha$ and the corresponding controlled diffusion $X^\alpha$ with initial condition $X_0 = x$. By Itô's formula applied to $e^{-rt} v(t, X_t^\alpha)$, where $v$ is the solution of equation (96),

$$\mathbb{E}[e^{-rT} v(T, X_T^\alpha)] = v(0, x) + \mathbb{E}\left[ \int_0^T e^{-rs} \left( \frac{\partial v}{\partial t}(s, X_s^\alpha) + L^{\alpha_s} v(s, X_s^\alpha) - r v(s, X_s^\alpha) \right) ds \right] \tag{98}$$

Therefore, by equation (96),

$$v(0, x) \le \mathbb{E}\left[ e^{-rT} g(X_T^\alpha) + \int_0^T e^{-rs} f(X_s^\alpha, \alpha_s)\, ds \,\Big|\, X_0^\alpha = x \right] \tag{99}$$

for any admissible control $\alpha$. Now, for the Markov control $\alpha^*$ defined in Theorem 8, all the inequalities in the previous computation are equalities. Hence $v = u$.

The cases where $\sigma$ is not uniformly elliptic or where $\sigma$ is also dependent on the current control $\alpha_t$ are much more difficult. In both cases, it is necessary to enlarge the set of admissible controls by considering relaxed controls, that is, controls that belong to the set $\mathcal{P}(A)$ of probability measures on $A$. For such a control $\alpha$, the terms $b(x, \alpha_t)$ and $f(x, \alpha_t)$ in equations (93) and (95) are replaced by $\int b(x, a)\, \alpha_t(da)$ and $\int f(x, a)\, \alpha_t(da)$, respectively. The admissible controls of the original problem correspond to relaxed controls that are Dirac masses at each time. These are called precise controls.

The value $\tilde{u}$ of this new problem is defined as in equation (95), but the infimum is taken over all progressively measurable processes $\alpha$ taking values in $\mathcal{P}(A)$. It is possible to prove under general assumptions that both problems give the same value: $\tilde{u} = u$ (cf. [3, Cor.I.2.1] or [8, Th.2.3]).

In these cases, one usually cannot prove the existence of a classical solution of equation (96). The weaker notion of viscosity solution is generally the correct one. In all the cases treated in the literature, $u = \tilde{u}$ solves the same HJB equation as in Theorem 8, except that the infimum is taken over $\mathcal{P}(A)$ instead of $A$ (cf. [3, Th.IV.2.2] for the case without control on $\sigma$). However, it is not trivial at all in general to obtain a result on precise controls from the result on relaxed controls. This is due to the fact that
usually no result is available on the existence and the characterization of a Markov-relaxed optimal control. The only examples where it has been done require restrictive assumptions (cf. [8, Cor.6.8]). However, in most of the financial applications, the value function $u$ is the most useful information. In practice, one usually only needs to compute a control that gives an expected value arbitrarily close to the optimal one.

Optimal Stopping Problems

Optimal stopping problems arise in finance, for example, for the American options pricing (when to sell a claim, an asset?) or in production models (when to extract or produce a good? when to stop production?).

Let us consider a Feller diffusion $X$ in $\mathbb{R}^d$ solution to the SDE

$$dX_t = b(t, X_t)\, dt + \sigma(t, X_t)\, dB_t \tag{100}$$

where $B$ is a standard $d$-dimensional Brownian motion. As in equation (28), let $(L_t)_{t \ge 0}$ denote its family of time-inhomogeneous infinitesimal generators. Denote by $\mathcal{T}(t, T)$ the set of stopping times valued in $[t, T]$.

A typical form of optimal stopping problems consists in computing

$$u(t, x) = \inf_{\tau \in \mathcal{T}(t, T)} \mathbb{E}\left[ e^{-r(\tau - t)} g(\tau, X_\tau) + \int_t^\tau e^{-r(s - t)} f(s, X_s)\, ds \,\Big|\, X_t = x \right] \tag{101}$$

and to characterize an optimal stopping time.

Assume that $b(t, x)$ is bounded and continuously differentiable with bounded derivatives and that $\sigma(t, x)$ is bounded, continuously differentiable with respect to $t$ and twice continuously differentiable with respect to $x$ with bounded derivatives. Assume also that $\sigma$ is uniformly elliptic. Finally, assume that $g(t, x)$ is differentiable with respect to $t$ and twice differentiable with respect to $x$ and that

$$|f(t, x)| + \left| \frac{\partial g}{\partial t}(t, x) \right| + \sum_{i=1}^{d} \left| \frac{\partial g}{\partial x_i}(t, x) \right| \le C e^{\mu |x|} \tag{102}$$

for positive constants $C$ and $\mu$.

Theorem 9 ([2], Sec.III.4.9). Under the previous assumptions, $u(t, x)$ admits first-order derivatives with respect to $t$ and second-order derivatives with respect to $x$ that are $L^p$ for all $1 \le p < \infty$. Moreover, $u$ is the solution of the variational inequality

$$\begin{cases} \max\left\{ u(t, x) - g(t, x);\; -\dfrac{\partial u}{\partial t}(t, x) - L_t u(t, x) + r u(t, x) - f(t, x) \right\} = 0, & (t, x) \in (0, T) \times \mathbb{R}^d \\ u(T, x) = g(T, x), & x \in \mathbb{R}^d \end{cases} \tag{103}$$

The proof of this result is based on a similar (formal) justification as the one we gave for equation (90). We refer to [12] for a similar result under weaker assumptions more suited to financial models when $f = 0$ (this is in particular the case for American options).

In some cases (typically with $f = 0$, see [11]), it can be shown that the infimum in equation (101) is attained for the stopping time

$$\tau^* = \inf\left\{ t \le s \le T : u(s, X_s^{t,x}) = g(s, X_s^{t,x}) \right\} \tag{104}$$

where $X^{t,x}$ is the solution of the SDE (100) with initial condition $X_t^{t,x} = x$.

Generalizations and Extensions

An optimal control problem can also be solved through the optimization of a family of BSDEs related to the laws of the controlled diffusions. On this question, we refer to [19] and BSDEs.

In this section, we considered only very specific optimal control problems. Other important families of optimal control problems are given by impulse control problems, where the control may induce a jump of the underlying stochastic process, or ergodic control problems, where the goal is to optimize a quantity related to the stationary behavior of the controlled
diffusion. Impulse control has applications, for example, in stock or resource management problems. In the finite horizon case, when the underlying asset follows a model with stochastic or elastic volatility or when the market is incomplete, other optimal control problems can be considered, such as characterizing the superhedging cost, or minimizing some risk measure. Various constraints can be included in the optimal control problem, such as maximizing the expectation of a utility with the constraint that this utility has a fixed volatility, or minimizing the volatility for a fixed expected utility. One can also impose Gamma constraints on the control. Another important extension of optimal control problems arises when one wants to solve numerically an HJB equation. Usual discretization methods require restricting the problem to a bounded domain and fixing artificial boundary conditions. The numerical solution can be interpreted as the solution of an optimal control problem in a bounded domain. In this situation, a crucial question is to quantify the impact on the discretized solution of an error on the artificial boundary condition (which usually cannot be computed exactly).

On Numerical Methods

or difficult depending on the particular constraints imposed on the control. Moreover, these methods require localizing the problem, that is, solving the problem in a bounded domain with artificial boundary conditions, which are usually difficult to compute precisely. This localization problem can be solved by computing the artificial boundary condition with a Monte Carlo method based on BSDEs. However, the error analysis of this method is based on the probabilistic interpretation of HJB equations in bounded domains, which is a difficult problem in general.

End Notes

a. A Markov semigroup family $(P_t, t \ge 0)$ on $\mathbb{R}^d$ is a family of bounded linear operators of norm 1 on the set of bounded measurable functions on $\mathbb{R}^d$ equipped with the $L^\infty$ norm, which satisfies equation (8).

b. This is not the most general definition of Feller semigroups (see [21, Def.III.6.5]). In our context, because we only introduce analytical objects from stochastic processes, the semigroup $(P_t)$ is naturally defined on the set of bounded measurable functions.

c. The strong continuity of a semigroup is usually defined as $\|P_t f - f\| \to 0$ as $t \to 0$ for all $f \in C_0(\mathbb{R}^d)$. However, in the case of Feller semigroups, this is equivalent to the weaker formulation (10) (see [21, Lemma III.6.7]).
[7] El Karoui, N., Kapoudjian, C., Pardoux, E., Peng, S. & Quenez, M.C. (1997). Reflected solutions of backward SDE's, and related obstacle problems for PDE's, Annals of Probability 25(2), 702–737.
[8] El Karoui, N., Huu Nguyen, D. & Jeanblanc-Picqué, M. (1987). Compactification methods in the control of degenerate diffusions: existence of an optimal control, Stochastics 20(3), 169–219.
[9] Ethier, S.N. & Kurtz, T.G. (1986). Markov Processes: Characterization and Convergence, Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics, John Wiley & Sons, New York.
[10] Friedman, A. (1975). Stochastic Differential Equations and Applications, Vol. 1, Probability and Mathematical Statistics, Academic Press [Harcourt Brace Jovanovich Publishers], New York, Vol. 28.
[11] Jacka, S.D. (1993). Local times, optimal stopping and semimartingales, Annals of Applied Probability 21(1), 329–339.
[12] Jaillet, P., Lamberton, D. & Lapeyre, B. (1990). Variational inequalities and the pricing of American options, Acta Applicandae Mathematicae 21(3), 263–289.
[13] Karatzas, I. & Shreve, S.E. (1988). Brownian Motion and Stochastic Calculus, Graduate Texts in Mathematics, Springer-Verlag, New York, Vol. 113.
[14] Lamberton, D. & Lapeyre, B. (1996). Introduction to Stochastic Calculus Applied to Finance, Chapman & Hall, London (Translated from the 1991 French original by Nicolas Rabeau and François Mantion).
[15] Ma, J., Protter, P. & Yong, J.M. (1994). Solving forward-backward stochastic differential equations explicitly - a four step scheme, Probability Theory and Related Fields 98(3), 339–359.
[16] Øksendal, B. (2003). Stochastic Differential Equations: An Introduction with Applications, 6th Edition, Universitext, Springer-Verlag, Berlin.
[17] Pardoux, E. (1998). Backward stochastic differential equations and viscosity solutions of systems of semilinear parabolic and elliptic PDEs of second order, in Stochastic Analysis and Related Topics: The Geilo Workshop, B.O.L. Decreusefond, J. Gjerde & A. Ustunel, eds, Birkhäuser, pp. 79–127.
[18] Protter, P. (2001). A partial introduction to financial asset pricing theory, Stochastic Processes and Their Applications 91(2), 169–203.
[19] Quenez, M.C. (1997). Stochastic control and BSDEs, in Backward Stochastic Differential Equations (Paris, 1995–1996), Pitman Research Notes in Mathematics Series, Longman, Harlow, Vol. 364, pp. 83–99.
[20] Rogers, L.C.G. & Shi, Z. (1995). The value of an Asian option, Journal of Applied Probability 32(4), 1077–1088.
[21] Rogers, L.C.G. & Williams, D. (1994). Diffusions, Markov Processes, and Martingales, Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics, 2nd Edition, John Wiley & Sons, Chichester, Vol. 1.
[22] Talay, D. & Zheng, Z. (2003). Quantiles of the Euler scheme for diffusion processes and financial applications, Mathematical Finance 13(1), 187–199, Conference on Applications of Malliavin Calculus in Finance (Rocquencourt, 2001).

MIREILLE BOSSY & NICOLAS CHAMPAGNAT
Doob–Meyer Decomposition

Submartingales are processes that grow on average. Subject to some condition of uniform integrability, they can be written uniquely as the sum of a martingale and a predictable increasing process. This result is known as the Doob–Meyer decomposition.

Consider a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, P)$. It consists of a probability space $(\Omega, \mathcal{F}, P)$ and a filtration $\mathbb{F} = (\mathcal{F}_t)_{t \ge 0}$, that is, an increasing family of sub-$\sigma$-fields of $\mathcal{F}$. The $\sigma$-field $\mathcal{F}_t$ stands for the information available at time $t$. A random event $A$ belongs to $\mathcal{F}_t$ if we know at time $t$ whether it will take place or not, that is, $A$ does not depend on randomness in the future. For technical reasons, one typically assumes right continuity, that is, $\mathcal{F}_t = \bigcap_{s > t} \mathcal{F}_s$.

A martingale (see Martingales) (respectively submartingale, supermartingale) is an adapted, integrable process $(X_t)_{t \in \mathbb{R}_+}$ satisfying

$$E(X_t \mid \mathcal{F}_s) = X_s \tag{1}$$

(respectively $\ge X_s$, $\le X_s$) for $s \le t$. Moreover, we require these processes to be a.s. càdlàg, that is, right-continuous with left-hand limits. Adaptedness means that $X_t$ is $\mathcal{F}_t$-measurable, that is, the random value $X_t$ is known at the latest at time $t$. Integrability $E(|X_t|) < \infty$ is needed for the conditional expectation to be defined. The crucial martingale equality (1) means that the best prediction of future values of $X$ is the current value, that is, $X$ will stay on the current level on average. In other words, it does not exhibit any positive or negative trend. If $X$ denotes the price of a security, this asset does not produce profits or losses on average. Submartingales, on the other hand, grow on average. Put differently, they show an upward trend compared to a martingale. This loose statement is made precise in terms of the Doob–Meyer decomposition.

As a starting point, consider a discrete-time process $X = (X_t)_{t = 0, 1, 2, \ldots}$. In discrete time, a process $X$ is called predictable if $X_t$ is $\mathcal{F}_{t-1}$-measurable for $t = 1, 2, \ldots$. This means that the value $X_t$ is known already one period ahead. The Doob decomposition states that any submartingale $X$ can be written uniquely as

$$X_t = M_t + A_t \tag{2}$$

with a martingale $M$ and an increasing predictable process $A$ satisfying $A_0 = 0$. While the intuitive meaning of $M$ and $A$ may not be obvious, the corresponding decomposition of the increments $\Delta X_t := X_t - X_{t-1}$ is easier to understand.

$$\Delta X_t = \Delta M_t + \Delta A_t \tag{3}$$

can be interpreted in the sense that the increment $\Delta X_t$ consists of a predictable trend $\Delta A_t$ and a random deviation $\Delta M_t$ from that trend. Its implication $\Delta A_t = E(\Delta X_t \mid \mathcal{F}_{t-1})$ means that $\Delta A_t$ is the best prediction of $\Delta X_t$ in a mean-square sense and based on the information up to time $t - 1$.

The natural decomposition (3) does not make sense for continuous-time processes, but an analog of equation (2) still exists. To this end, the notion of predictability must be extended to continuous time. A process $X = (X_t)_{t \in \mathbb{R}_+}$ is called predictable if, viewed as a mapping on $\Omega \times \mathbb{R}_+$, it is measurable with respect to the $\sigma$-field generated by all adapted, left-continuous processes. Intuitively, this rather abstract definition means that $X_t$ is known slightly ahead of time $t$. In view of the discrete-time case, it may seem more natural to require that $X_t$ be $\mathcal{F}_{t-}$-measurable, where $\mathcal{F}_{t-}$ stands for the smallest sub-$\sigma$-field containing all $\mathcal{F}_s$, $s < t$. However, this slightly weaker condition turns out to be too weak for the general theory.

In order for a decomposition (2) into a martingale $M$ and a predictable increasing process $A$ to exist, one must assume some uniform integrability of $X$. The process $X$ must belong to the so-called class (D), which amounts to a rather technical condition implying $\sup_{t \ge 0} E(|X_t|) < \infty$ but being itself implied by $E(\sup_{t \ge 0} |X_t|) < \infty$. For its precise definition, we need to introduce the concept of a stopping time, which is not only an indispensable tool for the general theory of stochastic processes but also interesting for applications, for example, in mathematical finance. A $[0, \infty]$-valued random variable $T$ is called a stopping time if $\{T \le t\} \in \mathcal{F}_t$ for any $t \ge 0$. Intuitively, $T$ stands for a random time, which is generally not known in advance but at the latest once it has happened (e.g., the time of a phone call, the first time when a stock hits 100, the time when you crash your car into a tree). In financial applications, it appears, for example, as the exercise time of an American option.

Stopping times can be classified by their degree of suddenness. Predictable stopping times do not come
entirely as a surprise because one anticipates them. Formally, a stopping time $T$ is called predictable if it allows for an announcing sequence, that is, for a sequence $(T_n)_{n \in \mathbb{N}}$ of stopping times satisfying $T_0 < T_1 < T_2 < \ldots$ on $\{T > 0\}$ and $T_n \to T$ as $n \to \infty$. This is the case for a continuous stock price hitting 100 or for the car crashing into a tree, because you can literally see the level 100 or the tree coming increasingly closer. Phone calls, strikes of lightning, or jumps of a Lévy process, on the other hand, are of an entirely different kind because they happen completely out of the blue. Such stopping times $T$ are called totally inaccessible, which formally means that $P(S = T < \infty) = 0$ for all predictable stopping times $S$.

Coming back to our original theme, a process $X$ is said to be of class (D) if the set $\{X_T : T \text{ finite stopping time}\}$ is uniformly integrable, which in turn means that

$$\lim_{c \to \infty} \sup_{T \text{ finite stopping time}} E\left(1_{\{|X_T| > c\}} |X_T|\right) = 0$$

The Doob–Meyer decomposition can now be stated as follows:

Theorem 1 Any submartingale $X$ of class (D) allows for a unique decomposition

$$X_t = M_t + A_t \tag{4}$$

with a martingale $M$ and some predictable increasing process $A$ satisfying $A_0 = 0$.

The martingale $M$ turns out to be of class (D) as well, which implies that it converges a.s. and in $L^1$ to some terminal random variable $M_\infty$. Since the whole martingale $M$ can be recovered from its limit via $M_t = E(M_\infty \mid \mathcal{F}_t)$, one can formally identify such uniformly integrable martingales with their limit.

In the case of an Itô process

$$dX_t = H_t\,dW_t + K_t\,dt \tag{5}$$

the Doob–Meyer decomposition is easily obtained. Indeed, we have $M_t = X_0 + \int_0^t H_s\,dW_s$ and $A_t = \int_0^t K_s\,ds$. However, a general Itô process need not, of course, be a submartingale. However, equation (5) suggests that a similar decomposition exists for more general processes. This is indeed the case. For a generalization covering all Itô processes we relax both the martingale property of $M$ and the monotonicity of $A$. In general, $A$ is only required to be of finite variation, that is, the difference of two increasing processes. In the Itô process example, these are $A^{(+)}_t = \int_0^t \max(K_s, 0)\,ds$ and $A^{(-)}_t = \int_0^t \max(-K_s, 0)\,ds$. Put differently, the trend may change its direction every now and then.

To cover all Itô processes, one must also allow for local martingales rather than martingales. $M$ is said to be a local martingale if there exists a sequence of stopping times $(T_n)_{n \in \mathbb{N}}$, which increases to $\infty$ almost surely such that $M^{T_n}$ is a martingale for any $n$. Here, the stopped process $M^{T_n}$ is defined as $M^{T_n}_t := M_{\min(T_n, t)}$, that is, it stays constant after time $T_n$ (as, e.g., your wealth does if you sell an asset at $T_n$). This rather technical concept appears naturally in the general theory of stochastic processes. For example, stochastic integrals $M_t = \int_0^t H_s\,dN_s$ relative to martingales $N$ generally fail to be martingales but are typically local martingales or a little less, namely, $\sigma$-martingales.

A local martingale is a uniformly integrable martingale if and only if it is of class (D). Nevertheless, one should be careful with thinking that local martingales behave basically as martingales up to some integrability. For example, there exist local martingales $M_t = \int_0^t H_s\,dW_s$ with $M_0 = 0$ and $M_1 = 1$ a.s. and such that $E(|M_t|) < \infty$, $t \ge 0$. Even though such a process has no trend in a local sense, it behaves entirely differently from a martingale on a global scale. The difference between local martingales and martingales leads to many technical problems in mathematical finance. For example, the previous example may be interpreted in the sense that dynamic investment in a perfectly reasonable martingale may lead to arbitrage unless the set of trading strategies is restricted to some admissible subset.

Let us come back to generalizing the Doob–Meyer decomposition. Without class (D) it reads as follows:

Theorem 2 Any submartingale $X$ allows for a unique decomposition (4) with a local martingale $M$ and some predictable increasing process $A$ satisfying $A_0 = 0$.

For a considerably larger class of processes $X$, there exists a canonical decomposition (4) with a local martingale $M$ and some predictable process $A$ of finite variation, which starts in 0. These processes are called special semimartingales and they play a key role in stochastic calculus. The slightly larger
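The discrete-time Doob decomposition (2)–(3) described above can be made concrete with a small sketch. It assumes, purely for illustration and not as part of the original text, that $X_t = f(S_t)$ for a simple symmetric random walk $S$ started at 0; the function name `doob_decomposition` and its arguments are hypothetical. The predictable increment $\Delta A_t = E(\Delta X_t \mid \mathcal{F}_{t-1})$ is then an average over the two possible next positions of the walk.

```python
import random


def doob_decomposition(steps, f):
    """Doob decomposition of X_t = f(S_t) along one path of a simple
    symmetric random walk S (assumed example, steps are +1 or -1).

    Returns the lists (X, M, A) with X_t = M_t + A_t and A_0 = 0.
    """
    s = 0
    x = [f(s)]
    a = [0.0]
    m = [float(f(s))]
    for step in steps:
        # Predictable increment: E(X_t - X_{t-1} | F_{t-1}), computed from
        # the position at time t-1 only (up and down moves have prob. 1/2).
        delta_a = 0.5 * (f(s + 1) + f(s - 1)) - f(s)
        s += step
        x.append(f(s))
        a.append(a[-1] + delta_a)
        # The martingale part is what remains after removing the trend.
        m.append(x[-1] - a[-1])
    return x, m, a


if __name__ == "__main__":
    rng = random.Random(1)
    path = [rng.choice((-1, 1)) for _ in range(10)]
    x, m, a = doob_decomposition(path, lambda s: s * s)
    print("A:", a)  # for f(s) = s^2 the trend is A_t = t
    print("M:", m)  # so that M_t = S_t^2 - t
```

For the submartingale $X_t = S_t^2$ the sketch recovers the trend $A_t = t$ and the martingale $M_t = S_t^2 - t$, which matches the interpretation of $\Delta A_t$ as the best prediction of the increment.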
Forward–Backward Stochastic Differential Equations (SDEs)

Then FBSDE (1) admits a unique solution $(X, Y, Z)$, and there exists a constant $C$, depending only on $T$, the dimensions, and the Lipschitz constant, such that $\|(X, Y, Z)\|^2 \le C\,[\,|x_0|^2 + I_0^2\,]$.

When $\dim(Y) = 1$, we have the following comparison result for the BSDE. For $i = 1, 2$, assume $(b, \sigma, f^i, g^i)$ satisfy the assumptions in Theorem 1 and let $(X, Y^i, Z^i)$ denote the corresponding solutions to equation (1). If $f^1 \le f^2$, $g^1 \le g^2$, $P$ a.s., for any $(t, x, y, z)$, then $Y^1_t \le Y^2_t$, $\forall t$, $P$ a.s.; see, for example, [24]. On the basis of this result, Lepeltier and San Martín [31] constructed solutions to BSDEs with non-Lipschitz coefficients. Moreover, Kobylanski [30] and Briand and Hu [10] proved the well-posedness of BSDEs whose generator $f$ has quadratic growth in $Z$. Such BSDEs are quite useful in practice.

When the coefficients are deterministic, the decoupled FBSDE (1) becomes

$$\begin{cases} dX_t = b(t, X_t)\,dt + \sigma(t, X_t)\,dW_t, & X_0 = x; \\ dY_t = -f(t, X_t, Y_t, Z_t)\,dt + Z_t\,dW_t, & Y_T = g(X_T) \end{cases} \tag{4}$$

In this case, the FBSDE is associated with the following system of parabolic PDEs:

$$\begin{cases} u^i_t + \tfrac{1}{2}\,\mathrm{tr}\left[u^i_{xx}\,\sigma\sigma^*(t, x)\right] + u^i_x\,b(t, x) + f^i(t, x, u, u_x \sigma(t, x)) = 0, & i = 1, \cdots, m; \\ u(T, x) = g(x) \end{cases} \tag{5}$$

Theorem 2 ([38]). Assume $b$, $\sigma$, $f$, $g$ satisfy all the conditions in Theorem 1.

(i) If PDE (5) has a classical solution $u \in C^{1,2}([0, T] \times \mathbb{R}^n)$, then

$$Y_t = u(t, X_t), \qquad Z_t = u_x \sigma(t, X_t) \tag{6}$$

(ii) In general, define

$$u(t, x) = E\{Y_t \mid X_t = x\} \tag{7}$$

Then $u$ is deterministic and $Y_t = u(t, X_t)$. Moreover, when $m = 1$, $u$ is the unique viscosity solution to the PDE (5).

In this case, $X$ is a Markov process; then by equation (6) the solution $(X, Y, Z)$ is Markovian. For this reason we call equation (4) a Markovian FBSDE. We note that in the Black–Scholes model, as we see in the section Applications, the PDE (5) is linear and one can solve for $u$ explicitly. Then equation (6) in fact gives us the well-known Black–Scholes formula. Moreover, the hedging portfolio $Z_t \sigma^{-1}(t, X_t)$ is the sensitivity of the option price $Y_t$ with respect to the underlying asset price $X_t$. This is exactly the idea of $\Delta$-hedging. On the other hand, when $f$ is linear in $(y, z)$, equation (7) actually is equivalent to the Feynman–Kac formula. In general, when $m = 1$, equation (7) provides a probabilistic representation for the viscosity solution to the PDE (5), and thus is called a nonlinear Feynman–Kac formula. Such a type of representation formula is also available for $u_x$ [36].

The link between FBSDEs and PDEs opens the door to efficient Monte Carlo methods for high-dimensional PDEs and FBSDEs, and thus also for many financial problems. This approach can effectively overcome the curse of dimensionality; see, for example, [3–5, 8, 27, 45], and [12]. There are also some numerical algorithms for non-Markovian BSDEs and coupled FBSDEs; see, for example, [2, 9, 18, 33], and [17].

Coupled FBSDEs

The theory of coupled FBSDEs is much more complex and is far from complete. There are mainly three approaches to its well-posedness, each with its own limitations. Since the precise statements of the results require complicated notation and technical conditions, we refer readers to the original research papers and focus only on the main ideas here.

Method 1 Contraction Mapping This method works very well for BSDEs and decoupled FBSDEs. However, to ensure that the constructed mapping is a contraction, for coupled FBSDEs one has to assume some stronger conditions. The first well-posedness result was by Antonelli [1], which has been extended further by Pardoux and Tang [39]. Roughly speaking, besides the standard Lipschitz conditions, FBSDE (1) is well posed in one of the following three cases: (i) $T$ is small and either $\sigma_z$ or $g_x$ is small; (ii) $X$ is weakly coupled into the BSDE (i.e., $g_x$ and $f_x$ are small) or $(Y, Z)$ are weakly coupled into the FSDE (i.e., $b_y$, $b_z$, $\sigma_y$, $\sigma_z$ are small); or (iii) $b$ is deeply decreasing in $x$ (i.e., $[b(\cdot, x_1, \cdot) - b(\cdot, x_2, \cdot)][x_1 - x_2] \le -C|x_1 - x_2|^2$ for some large $C$) or $f$ is deeply decreasing in $y$. Antonelli [1] also provides a counterexample to show that, under Lipschitz conditions only, equation (1) may have no solution.

Method 2 Four-step Scheme This is the most popular method for coupled FBSDEs with deterministic coefficients, proposed by Ma et al. [34]. The main idea is to use the close relationship between Markovian FBSDEs and PDEs, in the spirit of Theorem 2. Step 1 in [34] deals with the dependence of $\sigma$ on $z$, which works only in very limited cases. The more interesting case is that $\sigma$ does not depend on $z$. Then the other three steps read as follows:

Step 2. Solve the following PDE with $u(T, x) = g(x)$: for $i = 1, \cdots, m$,

$$u^i_t + \tfrac{1}{2}\,\mathrm{tr}\left[u^i_{xx}\,\sigma\sigma^*(t, x, u)\right] + u^i_x\,b(t, x, u, u_x \sigma(t, x, u)) + f^i(t, x, u, u_x \sigma(t, x, u)) = 0 \tag{8}$$

… assumes some sufficient conditions on the deterministic coefficients to ensure such Lipschitz continuity. In particular, one key condition is that the coefficient $\sigma$ be uniformly nondegenerate. Zhang [46] allows the coefficients to be random and $\sigma$ to be degenerate, but assumes all processes are one-dimensional along with some special compatibility condition on the coefficients, so that a similarly defined random field $u(t, \omega, x)$ is uniformly Lipschitz continuous in $x$.

Method 3 Method of Continuation The idea is that, if an FBSDE is well-posed, then a new FBSDE with slightly modified coefficients is also well-posed. The problem is then to find sufficient conditions so that this modification procedure can go arbitrarily long. This method allows the coefficients to be random and $\sigma$ to be degenerate. However, it requires some monotonicity conditions; see, for example, [29, 42], and [43]. For example, [29] assumes that, for some constant $\beta > 0$ and for any $\theta_i = (x_i, y_i, z_i)$, $i = 1, 2$,

$[b(t, \omega, \theta_1) - b(t, \omega, \theta_2)][y_1 - y_2]$
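The Black–Scholes remark in the discussion of the Markovian FBSDE (4) and the representation (7) can be checked numerically. The following is a minimal sketch under stated assumptions, none of which come from the original text: the forward equation is a geometric Brownian motion with $b(t, x) = r x$ and $\sigma(t, x) = \sigma x$, the terminal condition is a call payoff $g(x) = \max(x - K, 0)$, and the driver is the linear one $f(t, x, y, z) = -r y$. In that case $Y_0 = u(0, x_0) = e^{-rT} E[g(X_T)]$, which is estimated here by plain Monte Carlo and compared with the closed-form Black–Scholes price. All function and parameter names are hypothetical.

```python
import math
import random


def black_scholes_call(x0, strike, rate, vol, maturity):
    """Closed-form Black-Scholes call price, used as the reference u(0, x0)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(x0 / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t

    def normal_cdf(v):
        return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

    return x0 * normal_cdf(d1) - strike * math.exp(-rate * maturity) * normal_cdf(d2)


def monte_carlo_y0(x0, strike, rate, vol, maturity, n_paths=200_000, seed=0):
    """Monte Carlo estimate of Y_0 = exp(-rT) E[g(X_T)] for the linear driver
    f = -r*y (assumed example), sampling X_T exactly from its lognormal law."""
    rng = random.Random(seed)
    drift = (rate - 0.5 * vol ** 2) * maturity
    scale = vol * math.sqrt(maturity)
    total = 0.0
    for _ in range(n_paths):
        x_terminal = x0 * math.exp(drift + scale * rng.gauss(0.0, 1.0))
        total += max(x_terminal - strike, 0.0)
    return math.exp(-rate * maturity) * total / n_paths


if __name__ == "__main__":
    params = dict(x0=100.0, strike=100.0, rate=0.05, vol=0.2, maturity=1.0)
    print("Monte Carlo Y_0:", monte_carlo_y0(**params))
    print("Black-Scholes  :", black_scholes_call(**params))
```

The two numbers should agree up to the Monte Carlo error, which shrinks like the inverse square root of the number of paths. For a nonlinear driver no closed form is available in general, and the regression-based schemes cited in the text replace the single expectation by a backward recursion.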
[12] Cheridito, P., Soner, M., Touzi, N. & Victoir, N. (2006). Second order backward stochastic differential equations and fully non-linear parabolic PDEs, Communications in Pure and Applied Mathematics 60, 1081–1110.
[13] Cvitanić, J. & Karatzas, I. (1996). Backward SDE's with reflection and Dynkin games, The Annals of Probability 24, 2024–2056.
[14] Cvitanić, J., Karatzas, I. & Soner, M. (1998). Backward stochastic differential equations with constraints on the gains-process, The Annals of Probability 26(4), 1522–1551.
[15] Cvitanić, J. & Ma, J. (1996). Hedging options for a large investor and forward-backward SDE's, The Annals of Applied Probability 6(2), 370–398.
[16] Delarue, F. (2002). On the existence and uniqueness of solutions to FBSDEs in a non-degenerate case, Stochastic Processes and their Applications 99(2), 209–286.
[17] Delarue, F. & Menozzi, S. (2006). A forward backward stochastic algorithm for quasi-linear PDEs, The Annals of Applied Probability 16, 140–184.
[18] Douglas, J., Ma, J. & Protter, P. (1996). Numerical methods for forward backward stochastic differential equations, The Annals of Applied Probability 6, 940–968.
[19] Duffie, D. & Epstein, L. (1992). Stochastic differential utility, Econometrica 60, 353–394.
[20] Duffie, D. & Epstein, L. (1992). Asset pricing with stochastic differential utility, Review of Financial Studies 5, 411–436.
[21] Duffie, D., Ma, J. & Yong, J. (1995). Black's consol rate conjecture, The Annals of Applied Probability 5(2), 356–382.
[22] El Karoui, N., Kapoudjian, C., Pardoux, E., Peng, S. & Quenez, M.C. (1997). Reflected solutions of backward SDE's, and related obstacle problems for PDE's, The Annals of Probability 25(2), 702–737.
[23] El Karoui, N. & Mazliak, L. (1997). Backward Stochastic Differential Equations, Pitman Research Notes in Mathematics Series, Longman, Harlow, Vol. 364.
[24] El Karoui, N., Peng, S. & Quenez, M.C. (1997). Backward stochastic differential equations in finance, Mathematical Finance 7, 1–72.
[25] El Karoui, N., Peng, S. & Quenez, M.C. (2001). A dynamic maximum principle for the optimization of recursive utilities under constraints, The Annals of Applied Probability 11(3), 664–693.
[26] El Karoui, N. & Quenez, M.C. (1995). Dynamic programming and pricing of contingent claims in an incomplete market, SIAM Journal on Control and Optimization 33(1), 29–66.
[27] Gobet, E., Lemor, J.-P. & Warin, X. (2005). A regression-based Monte-Carlo method to solve backward stochastic differential equations, The Annals of Applied Probability 15, 2172–2202.
[28] Hamadene, S. & Lepeltier, J.-P. (1995). Zero-sum stochastic differential games and backward equations, Systems and Control Letters 24(4), 259–263.
[29] Hu, Y. & Peng, S. (1995). Solution of forward-backward stochastic differential equations, Probability Theory and Related Fields 103(2), 273–283.
[30] Kobylanski, M. (2000). Backward stochastic differential equations and partial differential equations with quadratic growth, The Annals of Probability 28(2), 558–602.
[31] Lepeltier, J.P. & San Martín, J. (1997). Backward stochastic differential equations with continuous coefficients, Statistics and Probability Letters 32, 425–430.
[32] Ma, J. & Cvitanić, J. (2001). Reflected forward-backward SDEs and obstacle problems with boundary conditions, Journal of Applied Mathematics and Stochastic Analysis 14(2), 113–138.
[33] Ma, J., Protter, P., San Martín, J. & Torres, S. (2002). Numerical method for backward stochastic differential equations, The Annals of Applied Probability 12(1), 302–316.
[34] Ma, J., Protter, P. & Yong, J. (1994). Solving forward-backward stochastic differential equations explicitly - a four step scheme, Probability Theory and Related Fields 98, 339–359.
[35] Ma, J. & Yong, J. (1999). Forward-backward Stochastic Differential Equations and their Applications, Lecture Notes in Mathematics, Springer, Vol. 1702.
[36] Ma, J. & Zhang, J. (2002). Representation theorems for backward SDEs, The Annals of Applied Probability 12, 1390–1418.
[37] Pardoux, E. & Peng, S. (1990). Adapted solutions of backward stochastic equations, Systems and Control Letters 14, 55–61.
[38] Pardoux, E. & Peng, S. (1992). Backward Stochastic Differential Equations and Quasilinear Parabolic Partial Differential Equations, Lecture Notes in CIS, Springer, Vol. 176, pp. 200–217.
[39] Pardoux, E. & Tang, S. (1999). Forward-backward stochastic differential equations and quasilinear parabolic PDEs, Probability Theory and Related Fields 114(2), 123–150.
[40] Peng, S. (1990). A general stochastic maximum principle for optimal control problems, SIAM Journal on Control and Optimization 28(4), 966–979.
[41] Peng, S. (1992). A nonlinear Feynman-Kac formula and applications, in Control Theory, Stochastic Analysis and Applications: Proceedings of the Symposium on System Sciences and Control Theory (Hangzhou, 1992), S.P. Shen & J.M. Yong, eds, World Scientific Publications, River Edge, NJ, pp. 173–184.
[42] Peng, S. & Wu, Z. (1999). Fully coupled forward-backward stochastic differential equations and applications to optimal control, SIAM Journal on Control and Optimization 37(3), 825–843.
[43] Yong, J. (1997). Finding adapted solutions of forward-backward stochastic differential equations: method of
a predictable process $H$ such that

$$P\left\{\int_0^t |H_s|^2\,ds < \infty : t \ge 0\right\} = 1 \tag{3}$$

We note that there is a slight difference between Corollary 1 and Theorem 2, on the integrability of the integrand $H$. In fact, without the local martingale assumption the "local" square integrability such as equation (3) does not guarantee the uniqueness of the process $H$ in Corollary 1. A very elegant result in this regard is attributed to Dudley [4], who proved that any almost surely finite $\mathcal{F}_T$-measurable random variable $\xi$ can be represented as a stochastic integral evaluated at $T$, and the "martingale integrand" satisfies only equation (3). However, such a representation does not have uniqueness. This point was further investigated in [7]. In this study, the filtration is generated by a higher dimensional Brownian motion, of which $B$ is only a part of the components. We also refer to [12] for the discussions on this issue.

Itô's original martingale representation theorem has been extended to many other situations when the Brownian motion is replaced by certain semimartingales. In this section, we give a brief summary of these cases. For simplicity in what follows, we shall consider only martingales rather than local martingales. The versions for the latter are essentially identical, but with slightly relaxed integrability requirements on the representing integrands, as we saw in Theorem 2.

The generalization of type (1) essentially uses the idea of orthogonal decomposition of the Hilbert space. In fact, note that $\mathcal{M}^2(\mathbb{F})$ is a Hilbert space, and let $\mathcal{H}$ denote the set of all $H \in \mathcal{M}^2(\mathbb{F})$ such that $H_t = \int_0^t \Phi_s\,dB_s$, $t \ge 0$, for some progressively measurable process $\Phi \in L^2([0, T] \times \Omega)$. Then $\mathcal{H}$ is a closed subspace of $\mathcal{M}^2(\mathbb{F})$; thus for any $M \in \mathcal{M}^2(\mathbb{F})$ the following decomposition holds:

$$M_t = M_0 + H_t + N_t = M_0 + \int_0^t \Phi_s\,dB_s + N_t, \qquad t \ge 0 \tag{4}$$

where $N \in \mathcal{H}^\perp$, the subspace of $\mathcal{M}^2(\mathbb{F})$ consisting of all martingales that are "orthogonal" to $\mathcal{H}$. We refer to [12] and [20], for example, for detailed discussions of this type of representation. The generalizations of types (2) and (3) keep the original form of the representation. We now list two results adapted from Ikeda–Watanabe [8].

Theorem 3 Let $M^i \in \mathcal{M}^2_c(\mathbb{F})$, $i = 1, 2, \ldots, d$. Suppose that $\Phi^{i,j} \in \mathcal{L}^1(\mathbb{F})$ and $\Psi^{i,k} \in \mathcal{L}^2(\mathbb{F})$, $i, j, k = 1, 2, \ldots, d$, exist such that for $i, j = 1, 2, \ldots, d$,

$$\langle M^i, M^j \rangle_t = \int_0^t \Phi^{i,j}_s\,ds \quad \text{and} \quad \Phi^{i,j}_s = \sum_{k=1}^d \Psi^{i,k}_s \Psi^{j,k}_s, \quad P \text{ a.s.} \tag{5}$$
same as asking whether a market could be complete when the dynamics of the underlying assets have jumps. It turns out that there indeed exists a class of martingales, known as the normal martingales, that are discontinuous in general but for which the martingale representation theorem holds. A square-integrable martingale $M$ is called normal if $\langle M \rangle_t = t$ (cf. [2]). The class of normal martingales, in particular, includes those martingales that satisfy the so-called structure equation (cf. [5, 6]). Examples of normal martingales satisfying the structure equation include Brownian motion, the compensated Poisson process, the Azéma martingale, and the "parabolic" martingale [20]. The martingale representation, or more precisely the Clark–Ocone formula, was proved in [16]. The application of such a representation in finance was first carried out by Dritschel and Protter [3] (see also [11]).

Relation with Hedging

The martingale representation theorem is the basis for the arguments leading to market completeness, a fundamental component in the "Second Fundamental Theorem" of mathematical finance (see Second Fundamental Theorem of Asset Pricing). Consider a market modeled by a probability space $(\Omega, \mathcal{F}, P, \mathbb{F})$, where $\mathbb{F}$ is the filtration generated by a Brownian motion $B$ that represents market randomness. Assume that the market is arbitrage free; then there exists a risk neutral measure $Q$ (see Fundamental Theorem of Asset Pricing), equivalent to $P$. The arbitrage price at time $t \in [0, T]$ for any contingent $T$-claim $X$ is given by the discounted present value formula:

$$V_t = e^{-r(T - t)} E^Q[X \mid \mathcal{F}_t], \qquad t \in [0, T] \tag{11}$$

where $r$ is the (constant) interest rate. If $X$ is square integrable, then $M_t = e^{-rt} V_t$, $t \ge 0$, is a square-integrable $\mathbb{F}$-martingale under $Q$. Applying the martingale representation theorem one has

$$M_t = M_0 + \int_0^t \phi_s\,dB_s, \qquad t \in [0, T] \tag{12}$$

for some square-integrable, $\mathbb{F}$-predictable process $\phi$. Or equivalently, assuming that the volatility of the market, denoted by $\sigma$, is positive, we can write

$$V_t = V_0 + \int_0^t r V_s\,ds + \int_0^t \pi_s \sigma_s\,dB_s, \qquad t \in [0, T] \tag{13}$$

where $\pi_t = e^{rt} \phi_t \sigma_t^{-1}$, $t \ge 0$. The process $\pi$ is then exactly the "hedging strategy" for the claim $X$, that is, the amount of money one should invest in the stock, so that $V_T = X$, almost surely.

The martingale representation theorem also plays an important role in portfolio optimization problems, especially in finding optimal strategies [12].

One of the abstract forms of the hedging problem described earlier is the so-called backward stochastic differential equation (BSDE), which is the problem of finding a pair of $\mathbb{F}$-adapted processes $(V, Z)$ so that the following terminal value problem for a stochastic differential equation similar to (13) holds:

$$dV_t = f(t, V_t, Z_t)\,dt + Z_t\,dB_t, \quad t \in [0, T]; \qquad V_T = X \tag{14}$$

See Forward–Backward Stochastic Differential Equations (SDEs); Backward Stochastic Differential Equations.

References

[1] Dellacherie, C. (1974). Intégrales Stochastiques par Rapport aux Processus de Wiener et de Poisson, Séminaire de Probabilités (Univ. de Strasbourg) IV, Lecture Notes in Math, Springer-Verlag, Berlin, Vol. 124, 77–107.
[2] Dellacherie, C., Maisonneuve, B. & Meyer, P.A. (1992). Probabilités et Potentiel: Chapitres XVII à XXIV, Hermann, Paris.
[3] Dritschel, M. & Protter, P. (1999). Complete markets with discontinuous security price, Finance and Stochastics 3(2), 203–214.
[4] Dudley, R.M. (1977). Wiener functionals as Itô integrals, Annals of Probability 5, 140–141.
[5] Emery, M. (1989). On the Azéma Martingales, Séminaire de Probabilités XXIII, Lecture Notes in Mathematics, Vol. 1372, Springer Verlag, pp. 66–87.
[6] Emery, M. (2006). Chaotic representation property of certain Azéma martingales, Illinois Journal of Mathematics 50(2), 395–411.
[7] Emery, M., Stricker, C. & Yan, J. (1983). Valeurs prises par les martingales locales continues à un instant donné, Annals of Probability 11, 635–641.
Martingale Representation Theorem 5
[8] Ikeda, N. & Watanabe, S. (1981). Stochastic Differential Equations and Diffusion Processes, North-Holland.
[9] Itô, K. (1951). Multiple Wiener integral, Journal of Mathematical Society of Japan 3, 157–169.
[10] Jacod, J. & Shiryaev, A.N. (1987). Limit Theorems for Stochastic Processes, Springer-Verlag, Berlin.
[11] Jeanblanc, M. & Privault, N. (2002). A complete market model with Poisson and Brownian components, Seminar on Stochastic Analysis, Random Fields and Applications, Ascona; Progress in Probability, 52, 189–204.
[12] Karatzas, I. & Shreve, S.E. (1987). Brownian Motion and Stochastic Calculus, Springer.
[13] Kunita, H. (2004). Representation of martingales with jumps and applications to mathematical finance, in Stochastic Analysis and Related Topics in Kyoto, Advanced Studies in Pure Mathematics 41, H. Kunita, S. Watanabe & Y. Takahashi, eds, Mathematical Society of Japan, Tokyo, pp. 209–232.
[14] Liptser, R.S. & Shiryaev, A.N. (1977). Statistics of Random Processes. Vol I: General Theory, Springer-Verlag, New York.
[15] Løkka, A. (2004). Martingale representation of functionals of Lévy processes, Stochastic Analysis and Applications 22(4), 867–892.
[16] Ma, J., Protter, P. & San Martin, J. (1998). Anticipating integrals for a class of martingales, Bernoulli 4(1), 81–114.
[17] Ma, J. & Yong, J. (1999). Forward-Backward Stochastic Differential Equations and Their Applications, LNM 1702, Springer.
[18] Nualart, D. & Schoutens, W. (2000). Chaotic and predictable representations for Lévy processes, Stochastic Processes and their Applications 90, 109–122.
[19] Pardoux, E. & Peng, S. (1990). Adapted solutions of backward stochastic equations, Systems and Control Letters 14, 55–61.
[20] Protter, P. (1990). Stochastic Integration and Stochastic Differential Equations, Springer.
[21] Rogers, L.C.G. & Williams, D. (1987). Diffusions, Markov Processes and Martingales, Vol. 2: Itô Calculus, John Wiley & Sons.

Further Reading

Dellacherie, C. & Meyer, P. (1978). Probabilities and Potential, North-Holland.
Doob, J.L. (1984). Classical Potential Theory and its Probabilistic Counterparts, Springer.
Revuz, D. & Yor, M. (1991, 1994). Continuous Martingales and Brownian Motion, Springer.

Related Articles

Backward Stochastic Differential Equations; Convex Duality; Complete Markets; Filtrations; Second Fundamental Theorem of Asset Pricing.

JIN MA
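As a numerical illustration of the representation (12) from the section Relation with Hedging, the following minimal sketch uses an example that is an assumption and does not appear in the original text: zero interest rate, unit volatility, and the claim $X = B_T^2$. For this claim $E[B_T^2 \mid \mathcal{F}_t] = B_t^2 + (T - t)$, so that $M_0 = T$ and the integrand is $\phi_s = 2 B_s$. The function name and parameters are hypothetical, and the stochastic integral is approximated by a left-endpoint (Itô) sum on a uniform grid.

```python
import math
import random


def replicate_squared_endpoint(maturity=1.0, n_steps=20_000, seed=0):
    """Approximate M_0 + int_0^T phi_s dB_s for the claim X = B_T^2
    (assumed example), with M_0 = T and phi_s = 2 * B_s.

    Returns the pair (replicated value, realized claim B_T^2).
    """
    rng = random.Random(seed)
    dt = maturity / n_steps
    b = 0.0
    value = maturity  # M_0 = E[B_T^2] = T
    for _ in range(n_steps):
        db = rng.gauss(0.0, math.sqrt(dt))
        # Ito sum: the integrand is evaluated at the left endpoint, so it
        # only uses information available before the increment db.
        value += 2.0 * b * db
        b += db
    return value, b * b


if __name__ == "__main__":
    replicated, claim = replicate_squared_endpoint()
    print("replicated value:", replicated)
    print("claim B_T^2     :", claim)
```

The gap between the two outputs equals $T$ minus the sum of squared increments, so it vanishes as the grid is refined. Evaluating the integrand at the right endpoint instead would break adaptedness and leave a bias of about $T$.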
Backward Stochastic Differential Equations

Backward stochastic differential equations (BSDEs) occur in situations where the terminal (as opposed to the initial) condition of a stochastic differential equation is a given random variable. Linear BSDEs were first introduced by Bismut (1976) as the adjoint equation associated with the stochastic version of the Pontryagin maximum principle in control theory. The general case of a nonlinear BSDE was first introduced by Pardoux and Peng [23] to give a Feynman–Kac representation of nonlinear parabolic partial differential equations (PDEs). The solution of a BSDE consists of a pair of adapted processes $(Y, Z)$ satisfying

$$-dY_t = f(t, Y_t, Z_t)\,dt - Z_t\,dW_t, \qquad Y_T = \xi \tag{1}$$

where $f$ is called the driver and $\xi$ the terminal condition. This type of equation appears naturally in hedging problems. For example, in a complete market (see Complete Markets), the price process $(Y_t)_{0 \le t \le T}$ of a European contingent claim $\xi$ with maturity $T$ corresponds to the solution of a BSDE with a linear driver $f$ and a terminal condition equal to $\xi$.

Reflected BSDEs were introduced by El Karoui et al. [6]. In the case of a reflected BSDE, the solution $Y$ is constrained to be greater than a given process $S$ called the obstacle. A nondecreasing process $K$ is introduced in the equation in order to push the solution upward so that the constraint is satisfied, and this push is minimal, that is, $Y$ satisfies the following equation:

$$-dY_t = f(t, Y_t, Z_t)\,dt + dK_t - Z_t\,dW_t, \qquad Y_T = \xi \tag{2}$$

with $(Y_t - S_t)\,dK_t = 0$. One can show that the price of an American option (possibly with some nonlinear constraints) is the solution of a reflected BSDE, where the obstacle is given by the payoff process.

Definition and Properties

We adopt the following notation: $\mathbb{F} = \{\mathcal{F}_t, 0 \le t \le T\}$ is the natural filtration of an $n$-dimensional Brownian motion $W$; $L^2$ is the set of random variables $\xi$ that are $\mathcal{F}_T$-measurable and square-integrable; $\mathbb{H}^2$ is the set of predictable processes $\phi$ such that $E\int_0^T |\phi_t|^2\,dt < \infty$. In the following, the sign $^{*}$ denotes transposition.

Let us consider the following BSDE (with dimension 1 to simplify the presentation):

$$-dY_t = f(t, Y_t, Z_t)\,dt - Z_t\,dW_t, \qquad Y_T = \xi \tag{3}$$

where $\xi \in L^2$ and $f$ is a driver, that is, it satisfies the following assumptions: $f : \Omega \times [0, T] \times \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}$ is $\mathcal{P} \otimes \mathcal{B} \otimes \mathcal{B}^n$-measurable, $f(\cdot, 0, 0) \in \mathbb{H}^2$, and $f$ is uniformly Lipschitz with respect to $y, z$ with constant $C > 0$. Such a pair $(\xi, f)$ is called a pair of standard parameters. If the driver $f$ does not depend on $y$ and $z$, the solution $Y$ of equation (3) is then given as

$$Y_t = E\left[\xi + \int_t^T f(s)\,ds \,\Big|\, \mathcal{F}_t\right] \tag{4}$$

and the martingale representation theorem for Brownian motion ([16] Theorem 4.15) gives the existence of a unique process $Z \in \mathbb{H}^2$ such that

$$E\left[\xi + \int_0^T f(s)\,ds \,\Big|\, \mathcal{F}_t\right] = Y_0 + \int_0^t Z_s\,dW_s \tag{5}$$

In 1990, Pardoux and Peng [23] stated the following theorem.

Theorem 1 If $\xi \in L^2$ and if $f$ is a driver, then there exists a unique solution pair $(Y, Z) \in \mathbb{H}^2 \times \mathbb{H}^2$ of equation (3).

In [7], El Karoui et al. have given a short proof of this theorem based on a priori estimates of the solutions. More precisely, the proposition is given as follows:

Proposition 1 (A Priori Estimates). Let $f^1, \xi^1$, $f^2, \xi^2$ be standard parameters. Let $(Y^1, Z^1)$ be the solution associated with $f^1, \xi^1$ and $(Y^2, Z^2)$ be the solution associated with $f^2, \xi^2$. Let $C$ be the Lipschitz constant of $f^1$. Set $\delta Y_t = Y_t^1 - Y_t^2$, $\delta Z_t = Z_t^1 - Z_t^2$, and $\delta_2 f_t = f^1(t, Y_t^2, Z_t^2) - f^2(t, Y_t^2, Z_t^2)$. For $(\lambda, \mu, \beta)$ such that $\lambda^2 > C$ and $\beta$ sufficiently
large, that is, $\beta > C(2 + \lambda^2) + \mu^2$, the following estimates hold:

$$\|\delta Y\|_\beta^2 \le T\left[e^{\beta T} E(|\delta Y_T|^2) + \frac{1}{\mu^2}\,\|\delta_2 f\|_\beta^2\right] \tag{6}$$

$$\|\delta Z\|_\beta^2 \le \frac{\lambda^2}{\lambda^2 - C}\left[e^{\beta T} E(|\delta Y_T|^2) + \frac{1}{\mu^2}\,\|\delta_2 f\|_\beta^2\right] \tag{7}$$

where $\|\delta Y\|_\beta^2 = E\int_0^T e^{\beta t} |\delta Y_t|^2\,dt$.

From these estimates, uniqueness and existence of a solution follow by using the fixed point theorem applied to the function $\Phi : \mathbb{H}_\beta^2 \otimes \mathbb{H}_\beta^2 \to \mathbb{H}_\beta^2 \otimes \mathbb{H}_\beta^2$; $(y, z) \mapsto (Y, Z)$, where $(Y, Z)$ is the solution associated with the driver $f(t, y_t, z_t)$ and $\mathbb{H}_\beta^2$ denotes the space $\mathbb{H}^2$ endowed with the norm $\|\cdot\|_\beta$. Indeed, by using the previous estimates, one can show that for sufficiently large $\beta$, the mapping $\Phi$ is a strict contraction, which gives the existence of a unique fixed point, which is the solution of the BSDE.

In addition, from the a priori estimates (Proposition 1), some continuity and differentiability of solutions of BSDEs (with respect to some parameter) can be derived ([7] section 2).

Furthermore, the estimates of Proposition 1 are also very useful to derive some results concerning approximation or discretization of BSDEs [14].

We indicate the dependence of the solutions of BSDEs on the terminal time $T$ and the terminal condition $\xi$ by the notation $(Y_t(T, \xi), Z_t(T, \xi))$. We have the following flow property.

Proposition 2 (Flow Property). Let $(Y(T, \xi), Z(T, \xi))$ be the solution of a BSDE associated with the terminal time $T > 0$ and standard parameters $(\xi, f)$. For any stopping time $S \le T$,

$$Y_t(T, \xi) = Y_t(S, Y_S(T, \xi)), \quad Z_t(T, \xi) = Z_t(S, Y_S(T, \xi)), \quad t \in [0, S],\ dP \otimes dt\text{-almost surely} \tag{8}$$

Proof By convention, we define the solution of the BSDE with terminal condition $(T, \xi)$ for $t \ge T$ by $(Y_t = \xi, Z_t = 0)$. Thus, if $T' \ge T$, then $((Y_t, Z_t); t \le T')$ is the unique solution of the BSDE with terminal time $T'$, coefficient $f(t, y, z)1_{\{t \le T\}}$, and terminal condition $\xi$.

Let $S \le T$ be a stopping time, and denote by $Y_t(S, \xi')$ the solution of the BSDE with terminal time $T$, coefficient $f(t, y, z)1_{\{t \le S\}}$, and terminal condition $\xi'$ ($\mathcal{F}_S$-measurable). Both the processes $(Y_t(S, Y_S), Z_t(S, Y_S); t \in [0, T])$ and $(Y_{t \wedge S}(T, \xi), Z_t(T, \xi)1_{\{t \le S\}}; t \in [0, T])$ are solutions of the BSDE with terminal time $T$, coefficient $f(t, y, z)1_{\{t \le S\}}$, and terminal condition $Y_S$. By uniqueness, these processes are the same $dP \otimes dt$-a.s.

The simplest case is that of a linear BSDE. Let $(\beta, \gamma)$ be a bounded $(\mathbb{R}, \mathbb{R}^n)$-valued predictable process and let $\varphi \in \mathbb{H}^2(\mathbb{R})$, $\xi \in L^2(\mathbb{R})$. We consider the following BSDE:

$$-dY_t = (\varphi_t + Y_t \beta_t + Z_t \gamma_t)\,dt - Z_t\,dW_t, \qquad Y_T = \xi \tag{9}$$

By applying Itô's formula to $\Gamma_t Y_t$, it can easily be shown that the process $\Gamma_t Y_t + \int_0^t \Gamma_s \varphi_s\,ds$ is a local martingale and even a uniformly integrable martingale, which gives the following proposition.

Proposition 3 The solution $(Y, Z)$ of the linear BSDE (9) satisfies

$$\Gamma_t Y_t = E\left[\xi\,\Gamma_T + \int_t^T \Gamma_s \varphi_s\,ds \,\Big|\, \mathcal{F}_t\right] \tag{10}$$

where $\Gamma$ is the adjoint process (corresponding to a change of numéraire or a deflator in finance) defined by $d\Gamma_t = \Gamma_t[\beta_t\,dt + \gamma_t^{*}\,dW_t]$, $\Gamma_0 = 1$.

Remark 1 First, it can be noted that if $\xi$ and $\varphi$ are positive, then the process $Y$ is positive. Second, if in addition $Y_0 = 0$ a.s., then for any $t$, $Y_t = 0$ a.s. and $\varphi_t = 0$ $dt \otimes dP$-a.s.

From the first point in this remark, one can derive the classical comparison theorem, which is a key property of BSDEs.

Theorem 2 (Comparison Theorem). If $f^1, \xi^1$ and $f^2, \xi^2$ are standard parameters and if $(Y^1, Z^1)$ (respectively $(Y^2, Z^2)$) is the solution associated with $(f^1, \xi^1)$ (respectively $(f^2, \xi^2)$) satisfying

1. $\xi^1 \ge \xi^2$ $P$-a.s.
2. $\delta_2 f_t = f^1(t, Y_t^2, Z_t^2) - f^2(t, Y_t^2, Z_t^2) \ge 0$ $dt \times dP$-a.s.
3. $f^1(t, Y_t^2, Z_t^2) \in \mathbb{H}^2$.
Idea of the proof. We denote by $\delta Y$ the spread between those two solutions: $\delta Y_t = Y_t^2 - Y_t^1$ and $\delta Z_t = Z_t^2 - Z_t^1$. The problem is to show that under the above assumptions, $\delta Y_t \ge 0$.

Now, the pair $(\delta Y, \delta Z)$ is the solution of the following LBSDE:

Proof For each $\alpha$, since $f(t, Y_t, Z_t) \le f^\alpha(t, Y_t, Z_t)$ $dt \otimes dP$-a.s. and $\xi \le \xi^\alpha$, the comparison theorem gives that $Y_t \le Y_t^\alpha$, $0 \le t \le T$, $P$-a.s. It follows that

$$Y_t \le \operatorname*{ess\,inf}_{\alpha} Y_t^\alpha, \qquad 0 \le t \le T,\ P\text{-a.s.} \tag{15}$$

If there exist standard parameters $f$ and $\xi$ and a parameter $\alpha$ such that equation (12) holds, then the value function coincides with the solution of the BSDE associated with $(f, \xi)$. In other words, the value function equals $Y_t$, $0 \le t \le T$, $P$-a.s., where $(Y, Z)$ denotes the solution of the BSDE associated with $(f, \xi)$. It can be noted that this verification theorem generalizes the well-known Hamilton–Jacobi–Bellman verification theorem, which holds in a Markovian framework.

Indeed, recall that in the Markovian case, that is, the case where the driver and the terminal condition are functions of a state process, Pardoux and Peng (1992) have given an interpretation of the solution of a BSDE in terms of a PDE [24]. More precisely, the state process $X_\cdot^{t,x}$ is a diffusion of the following type:

$$dX_s = b(s, X_s)\,ds + \sigma(s, X_s)\,dW_s, \qquad X_t = x \tag{19}$$

Then, let us consider $(Y^{t,x}, Z^{t,x})$ solution of the following BSDE:

Many attempts have been made to relax the Lipschitz assumption on the driver $f$; for instance, Lepeltier and San Martín [19] have proved, by an approximation method, the existence of a solution for BSDEs with a driver $f$ that is only continuous with linear growth. Kobylanski [17] studied the case of quadratic BSDEs [20]. To give some intuition on quadratic BSDEs, let us consider the following simple example:

$$-dY_t = \frac{Z_t^2}{2}\,dt - Z_t\,dW_t, \qquad Y_T = \xi \tag{23}$$

Let us make the exponential change of variable $y_t = e^{Y_t}$. By applying Itô's formula, we easily derive

$$dy_t = e^{Y_t} Z_t\,dW_t,$$
BSDE for a European Option

Consider a market model with a nonrisky asset, whose price per unit $P_0(t)$ at time $t$ satisfies

$$dP_0(t) = P_0(t)\,r(t)\,dt \tag{26}$$

and $n$ risky assets, where the price of the $i$th stock $P_i(t)$ is modeled by the linear stochastic differential equation

$$dP_i(t) = P_i(t)\left[b_i(t)\,dt + \sum_{j=1}^{n} \sigma_{i,j}(t)\,dW_t^j\right] \tag{27}$$

driven by a standard $n$-dimensional Wiener process $W = (W^1, \ldots, W^n)$, defined on a filtered probability space $(\Omega, \mathbb{F}, P)$. We assume that the filtration $\mathbb{F}$ generated by the Brownian motion $W$ is complete. The probability $P$ corresponds to the objective probability measure. The coefficients $r$, $b_i$, $\sigma_{i,j}$ are $\mathbb{F}$-predictable processes. We denote by $b := (b_1, \ldots, b_n)$ the vector of the $b_i$ and by $\sigma := (\sigma_{i,j}, 1 \le i \le n, 1 \le j \le n)$ the volatility matrix. We will assume that the matrix $\sigma_t$ has full rank for any $t \in [0, T]$. Let $\theta_t = (\theta_t^1, \ldots, \theta_t^d)$ be the classical risk-premium vector defined as

$$\theta_t = \sigma_t^{-1}(b_t - r_t \mathbf{1}) \quad P\text{-a.s.} \tag{28}$$

The coefficients $\sigma$, $b$, $\theta$, and $r$ are supposed to be bounded.

Let us consider a small investor, who can invest in the $n + 1$ basic securities. We denote by $(X_t)$ the wealth process. At each time $t$, he/she chooses the amount $\pi_i(t)$ invested in the $i$th stock.

More precisely, a portfolio process is an adapted process $\pi = (\pi_1, \ldots, \pi_n)$ with $\int_0^T |\sigma_t \pi_t|^2\,dt < \infty$, $P$-a.s.

The strategy is supposed to be self-financing, that is, the wealth process satisfies the following dynamics:

$$dX_t^{x,\pi} = r_t X_t\,dt + \pi_t \sigma_t (dW_t + \theta_t\,dt) \tag{29}$$

Generally, the initial wealth $x = X_0$ is taken as a primitive, and for an initial endowment and portfolio process $(x, \pi)$, there exists a unique wealth process $X$, which is the solution of the linear equation (29) with initial condition $X_0 = x$. Therefore, there exists a one-to-one correspondence between pairs $(x, \pi)$ and trading strategies $(X, \pi)$.

Let $T$ be a strictly positive real, which will be the terminal time of our problem. Let $\xi$ be a European contingent claim settled at time $T$, that is, an $\mathcal{F}_T$-measurable square-integrable random variable (it can be thought of as a contract that pays the amount $\xi$ at time $T$). By a direct application of BSDE results, we derive that there exists a unique $P$-square-integrable strategy $(X, \pi)$ such that

$$dX_t = r_t X_t\,dt + \pi_t \sigma_t \theta_t\,dt + \pi_t \sigma_t\,dW_t, \qquad X_T = \xi \tag{30}$$

$X_t$ is the price of claim $\xi$ at time $t$ and $(X, \pi)$ is a hedging strategy for $\xi$.

In the case of constraints such as a borrowing interest rate $R_t$ greater than the bond rate $r$ (see [10] p. 201 and 216 or [7]), taxes [8], or a large investor (whose strategy has an influence on prices, see [10] p. 216), the dynamics of the wealth-portfolio strategy is no longer linear. Generally, it can be written as follows:

$$-dX_t = b(t, X_t, \sigma_t \pi_t)\,dt - \pi_t \sigma_t\,dW_t \tag{31}$$

where $b$ is a driver (the classical case corresponds to the case where $b(t, x, z) = -r_t x - z\,\theta_t$).

Let $\xi$ be a square-integrable European contingent claim. BSDE results give the existence and the uniqueness of a $P$-square-integrable strategy $(X, \pi)$ such that

$$-dX_t = b(t, X_t, \sigma_t \pi_t)\,dt - \pi_t \sigma_t\,dW_t, \qquad X_T = \xi \tag{32}$$

As in the classical case, $X_t$ is the price of the claim $\xi$ at time $t$ and $(X, \pi)$ is a hedging strategy of $\xi$. Also note that, under some smoothness assumptions on the driver $b$, by equality (22), the hedging portfolio process (multiplied by the volatility) $\pi_t \sigma_t$ corresponds to the Malliavin derivative $D_t X_t$ of the price process, that is,

$$D_t X_t = \sigma_t \pi_t, \qquad dP \otimes dt\text{-a.s.} \tag{33}$$

which generalizes (to the nonlinear case) the useful result stated by Karatzas and Ocone [21] in the linear case. Thus, we obtain a nonlinear price system (see [10] p. 209), that is, a mapping that, for each $\xi \in L^2(\mathcal{F}_T)$ and $T \ge 0$, associates an adapted process $(X_t^b(\xi, T))_{0 \le t \le T}$, where $X_t^b(\xi, T)$ denotes the solution of the BSDE associated with the driver $b$, terminal condition $\xi$, and terminal time $T$.
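As an illustration only, the price defined by a BSDE with driver b can be approximated numerically. The sketch below is not taken from the article: it assumes a single risky asset with constant coefficients, replaces the Brownian motion by a scaled symmetric random walk, and steps backward from X_T = ξ with the explicit rule X_{t_i} ≈ E_i[X_{t_{i+1}}] + h b(t_i, E_i[X_{t_{i+1}}], Z_{t_i}), where Z_{t_i} is the scaled difference of the two successor values. The function names, the choice of an explicit time step, and the particular borrowing-rate driver b(t, x, z) = -r x - θ z + (R - r)(x - z/σ)^- are assumptions made for this example.

```python
import math
import numpy as np


def bsde_price_random_walk(driver, payoff, s0, mu, sigma, T, n):
    """Approximate X_0 for -dX = b(t, X, Z) dt - Z dW, X_T = payoff(S_T).

    Assumptions of this sketch: one stock with constant drift mu and
    volatility sigma, and W replaced by a walk with steps +/- sqrt(h).
    """
    h = T / n
    sqrt_h = math.sqrt(h)
    # Terminal nodes: j up-moves out of n, so W_T = (2j - n) * sqrt(h).
    j = np.arange(n + 1)
    w = (2 * j - n) * sqrt_h
    s = s0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * w)
    x = payoff(s)
    for i in range(n - 1, -1, -1):
        up, down = x[1:], x[:-1]
        z = (up - down) / (2.0 * sqrt_h)   # discrete analogue of Z
        cond_mean = 0.5 * (up + down)      # conditional expectation
        x = cond_mean + h * driver(i * h, cond_mean, z)
    return float(x[0])


def classical_driver(r, theta):
    # Linear driver of the classical case: b(t, x, z) = -r x - z theta.
    return lambda t, x, z: -r * x - theta * z


def borrowing_driver(r, R, theta, sigma):
    # Hypothetical nonlinear driver for a borrowing rate R > r:
    # the extra cost applies when the stock position z / sigma exceeds x.
    return lambda t, x, z: (-r * x - theta * z
                            + (R - r) * np.maximum(z / sigma - x, 0.0))


def black_scholes_call(s0, k, r, sigma, T):
    # Closed-form reference value used only to check the linear case.
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    cdf = lambda u: 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))
    return s0 * cdf(d1) - k * math.exp(-r * T) * cdf(d2)


if __name__ == "__main__":
    r, R, mu, sigma, T, s0, k = 0.05, 0.08, 0.08, 0.2, 1.0, 100.0, 100.0
    theta = (mu - r) / sigma
    call = lambda s: np.maximum(s - k, 0.0)
    linear = bsde_price_random_walk(classical_driver(r, theta), call,
                                    s0, mu, sigma, T, 400)
    nonlinear = bsde_price_random_walk(borrowing_driver(r, R, theta, sigma),
                                       call, s0, mu, sigma, T, 400)
    print("linear driver:", linear)
    print("Black-Scholes:", black_scholes_call(s0, k, r, sigma, T))
    print("borrowing-rate driver:", nonlinear)
```

With the classical driver the scheme should reproduce the Black–Scholes value up to discretization error, whatever the drift mu, which is a convenient sanity check. With the borrowing-rate driver the computed price is higher, since hedging a call requires borrowing and the driver is larger wherever the position is leveraged.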
By the comparison theorem, this price system is nondecreasing with respect to $\xi$ and satisfies the no-arbitrage property:

A1. If $\xi^1 \ge \xi^2$ and if $X_t^b(\xi^1, T) = X_t^b(\xi^2, T)$ on an event $A \in \mathcal{F}_t$, then $\xi^1 = \xi^2$ on $A$.

By the flow property of BSDEs (Proposition 2), it is also consistent: more precisely, if $S$ is a stopping time (smaller than $T$), then for each time $t$ smaller than $S$, the price associated with payoff $\xi$ and maturity $T$ coincides with the price associated with maturity $S$ and payoff $X_S^b(\xi, T)$, that is,

A2. $\forall t \le S$, $X_t^b(\xi, T) = X_t^b(X_S^b(\xi, T), S)$.

In addition, if $b(t, 0, 0) \ge 0$, then, by the comparison theorem, the price $X_\cdot^b$ is positive. Finally, if $b$ is sublinear with respect to $(x, \pi)$ (which is generally the case), then, by the comparison theorem, the price system is sublinear. Also note that if $b(t, 0, 0) = 0$, then the price of a contingent claim $\xi = 0$ is equal to 0, that is, $X_t^b(0, T) = 0$ and moreover (see, e.g., [25]), the price system satisfies the zero–one law property, that is,

A3. $X_t(1_A \xi, T) = 1_A X_t(\xi, T)$ a.s. for $t \le T$, $A \in \mathcal{F}_t$, and $\xi \in L^2(\mathcal{F}_T)$.

Furthermore, if $b$ does not depend on $x$, then the price system satisfies the translation invariance property (A4).

In the case where the driver $b$ is convex with respect to $(x, \pi)$ (which is generally the case), we have a variational formulation of the price of a European contingent claim (see [7] or [10] Prop. 3.8 p. 215). Indeed, by classical properties of convex analysis, $b$ can be written as the maximum of a family of affine functions. More precisely, we have

$$b(t, x, \pi) = \sup_{(\beta, \gamma) \in \mathcal{A}} \{b^{\beta,\gamma}(t, x, \pi)\} \tag{34}$$

where $b^{\beta,\gamma}(t, x, \pi) = B(t, \beta_t, \gamma_t) - \beta_t x - \gamma_t \pi$, and $B(t, \cdot, \cdot)$ is the polar function of $b$ with respect to $x, \pi$, that is,

$$B(\omega, t, \beta, \gamma) = \inf_{(x, \pi) \in \mathbb{R} \times \mathbb{R}^n} \left[b(\omega, t, x, \pi) + \beta_t(\omega)\,x + \gamma_t(\omega)\,\pi\right] \tag{35}$$

$\mathcal{A}$ is a bounded set of pairs of adapted processes $(\beta, \gamma)$ such that $E\int_0^T B(t, \beta_t, \gamma_t)^2\,dt < +\infty$. The properties of BSDEs give the following variational formulation:

$$X_t^b = \operatorname*{ess\,sup}_{(\beta, \gamma) \in \mathcal{A}} X_t^{\beta,\gamma} \tag{36}$$

where $X^{\beta,\gamma}$ is the solution of the linear BSDE associated with the driver $b^{\beta,\gamma}$ and terminal condition $\xi$. In other words, $X^{\beta,\gamma}$ is the classical linear price of $\xi$ in a fictitious market with interest rate $\beta$ and risk-premium $\gamma$. The function $B$ can be interpreted as a cost function or a penalty function (which is equal to 0 in quite a few examples).

An interesting question that follows is "Under what conditions does a nonlinear price system have a BSDE representation?" In 2002, Coquet et al. [3] gave the first answer to this question.

Theorem 3 Let $X(\cdot)$ be a price system, that is, a mapping that, for each $\xi \in L^2(\mathcal{F}_T)$ and $T \ge 0$, associates an adapted process $(X_t(\xi, T))_{0 \le t \le T}$, that is nondecreasing and satisfies the no-arbitrage property (A1), time consistency (A2), zero–one law (A3), and translation invariance property (A4). Suppose that it satisfies the following assumption: there exists some $\mu > 0$ such that $X_0(\xi + \xi', T) - X_0(\xi, T) \le Y_0^\mu(\xi', T)$, for any $\xi \in L^2(\mathcal{F}_T)$ and any positive random variable $\xi' \in L^2(\mathcal{F}_T)$, where $Y_t^\mu(\xi', T)$ is the solution of the BSDE

$$-dY_t = \mu |Z_t|\,dt - Z_t\,dW_t, \qquad Y_T = \xi' \tag{37}$$

Then the price system has a BSDE representation, that is, there exists a standard driver $b(t, z)$ that does not depend on $x$, satisfies $b(t, 0) = 0$, and is Lipschitz with respect to $z$ with coefficient $\mu$, such that $X(\xi, T)$ corresponds to the solution of the BSDE associated with the terminal time $T$, driver $b$, and terminal condition $\xi$, for any $\xi \in L^2(\mathcal{F}_T)$, $T \ge 0$, that is, $X(\xi, T) = X^b(\xi, T)$.

In this theorem, the existence of the coefficient $\mu$ might be interpreted in terms of risk aversion.

Many nonlinear BSDEs also appear in the case of an incomplete market (see Complete Markets). For example, the superreplication price of a European contingent claim can be obtained as the limit
of a nondecreasing sequence of penalized prices, which are solutions of nonlinear BSDEs [9, 10]. Another example is given by the pricing of a European contingent claim via exponential utility maximization in an incomplete market. In this case, El Karoui and Rouge [11] have stated that the price of such an option is the solution of a quadratic BSDE. More precisely, let us consider a complete market (see Complete Markets) [11] that contains $n$ securities, whose (invertible) volatility matrix is denoted by $\sigma_t$. Suppose that only the first $j$ securities are available for hedging and their volatility matrix is denoted by $\sigma_t^1$. The utility function is given by $u(x) = -e^{-\gamma x}$, where $\gamma$ ($\ge 0$) corresponds to the risk-aversion coefficient. Let $\xi$ be a given contingent claim corresponding to an exercise time $T$; in other words, $\xi$ is a bounded $\mathcal{F}_T$-measurable variable. Let $(X_t(\xi, T))$ (also denoted by $(X_t)$) be the forward price process defined via the exponential utility function as in [11]. By Theorem 5.1 in [11], there exists $Z \in \mathbb{H}^2(\mathbb{R}^n)$ such that the pair $(X, Z)$ is the solution of the quadratic BSDE:

$$-dX_t = \left[-(\eta_t + \sigma_t^{-1} \nu_t^0) \cdot Z_t + \frac{\gamma}{2}\,|\Pi(Z_t)|^2\right] dt - Z_t\,dW_t, \qquad X_T = \xi \tag{38}$$

where $\eta$ is the classical relative risk process, $\nu^0$ is a given process [11], and $\Pi(z)$ denotes the orthogonal projection of $z$ onto the kernel of $\sigma_t^1$.

Dynamic Risk Measures

In the same way as in the previous section, some dynamic measures of risk can be induced quite simply by BSDEs (note that time-consistent dynamic risk measures are otherwise very difficult to deal with). More precisely, let $b$ be a standard driver. We define a dynamic risk measure $\rho^b$ as follows: for each $T \ge 0$ and $\xi \in L^2(\mathcal{F}_T)$, we set

$$\rho_\cdot^b(\xi, T) = X_\cdot^b(-\xi, T) \tag{39}$$

where $(X_t^b(-\xi, T))$ denotes the solution of the BSDE associated with the terminal condition $-\xi$, terminal time $T$, and driver $b(t, \omega, x, z)$ [25]. Also note that $\rho_\cdot^b(\xi, T) = -X_\cdot^{\bar{b}}(\xi, T)$, where $\bar{b}(t, x, z) = -b(t, -x, -z)$.

Then, by the results of the previous section, the dynamic risk measure $\rho^b$ is nonincreasing and satisfies the no-arbitrage property (A1). In addition, the risk measure $\rho^b$ is also consistent.

If $b$ is superadditive with respect to $(x, z)$, then the dynamic risk measure $\rho^b$ is subadditive, that is, for any $T \ge 0$ and $\xi, \xi' \in L^2(\mathcal{F}_T)$, $\rho_t^b(\xi + \xi', T) \le \rho_t^b(\xi, T) + \rho_t^b(\xi', T)$.

If $b(t, 0, 0) = 0$, then $\rho^b$ satisfies the zero–one law (A3).

In addition, if $b$ does not depend on $x$, then the measure of risk satisfies the translation invariance property (A4).

In addition, if $b$ is positively homogeneous with respect to $(x, z)$, then the risk measure $\rho^b$ is positively homogeneous with respect to $\xi$, that is, $\rho_\cdot^b(\lambda \xi, T) = \lambda \rho_\cdot^b(\xi, T)$, for each real $\lambda \ge 0$, $T \ge 0$, and $\xi \in L^2(\mathcal{F}_T)$.

If $b$ is convex (respectively, concave) with respect to $(x, z)$, then $\rho^b$ is concave (respectively, convex) with respect to $\xi$. Furthermore, if $b$ is concave (respectively, convex), we have a variational formulation of the risk measure $\rho^b$ (similar to the one obtained for nonlinear price systems). Note that in the case where $b$ does not depend on $x$, this dual formulation corresponds to a famous theorem for convex and translation-invariant risk measures [12] and the polar function $B$ corresponds to the penalty function.

Clearly, Theorem 3 can be written in terms of risk measures. Thus, it gives the following interesting result.

Proposition 7 Let $\rho$ be a dynamic risk measure, that is, a mapping that, for each $\xi \in L^2(\mathcal{F}_T)$ and $T \ge 0$, associates an adapted process $(\rho_t(\xi, T))_{0 \le t \le T}$. Suppose that $\rho$ is nonincreasing and satisfies assumptions (A1)–(A4) and that there exists some $\mu > 0$ such that $\rho_0(\xi + \xi', T) - \rho_0(\xi, T) \ge -Y_0^\mu(\xi', T)$, for any $\xi \in L^2(\mathcal{F}_T)$ and any positive random variable $\xi' \in L^2(\mathcal{F}_T)$, where $Y_t^\mu(\xi', T)$ is the solution of BSDE (37). Then, $\rho$ can be represented by a backward equation, that is, there exists a standard driver $b(t, z)$, which is Lipschitz with respect to $z$ with coefficient $\mu$, such that $\rho = \rho^b$ a.s.

Relation with Recursive Utility

Another example of BSDEs in finance is given by recursive utilities introduced by Duffie and Epstein [5]. Such a utility function associated with
BSDE associated with the terminal time $\tau$, terminal condition $\xi_\tau$, and coefficient $f$. We easily derive the following property.

Proposition 8 (Characterization). Suppose that $(Y, Z, K)$ is a solution of the reflected BSDE (43). Then, for each $t \in [0, T]$,

$$Y_t = X_t(D_t, \xi_{D_t}) = \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_t} X_t(\tau, \xi_\tau) \tag{44}$$

where $D_t = \inf\{u \ge t;\ Y_u = \xi_u\}$.

Proof By using the fact that $Y_{D_t} = \xi_{D_t}$ and since the process $K$ is constant on $[t, D_t]$, we easily derive that $(Y_s, t \le s \le D_t)$ is the solution of the BSDE associated with the terminal time $D_t$, terminal condition $\xi_{D_t}$, and coefficient $f$, that is,

$$Y_t = X_t(D_t, \xi_{D_t}) \tag{45}$$

It remains now to show that $Y_t \ge X_t(\tau, \xi_\tau)$, for each $\tau \in \mathcal{T}_t$.

Fix $\tau \in \mathcal{T}_t$. On the interval $[t, \tau]$, the pair $(Y_s, Z_s)$ satisfies

$$-dY_s = f(s, Y_s, Z_s)\,ds + dK_s - Z_s\,dW_s, \qquad Y_\tau = Y_\tau \tag{46}$$

In other words, the pair $(Y_s, Z_s, t \le s \le D_t)$ is the solution of the BSDE associated with the terminal time $\tau$, terminal condition $Y_\tau$, and coefficient

Proposition 9 (Comparison). Let $\xi^1, \xi^2$ be two obstacle processes and let $f^1, f^2$ be two coefficients. Let $(Y^1, Z^1, K^1)$ (respectively, $(Y^2, Z^2, K^2)$) be a solution of the reflected BSDE (43) for $(\xi^1, f^1)$ (respectively, for $(\xi^2, f^2)$) and assume that

• $\xi^1 \le \xi^2$ a.s.
• $f^1(t, y, z) \le f^2(t, y, z)$, $t \in [0, T]$, $(y, z) \in \mathbb{R} \times \mathbb{R}^d$.

Then, $Y_t^1 \le Y_t^2$ $\forall t \in [0, T]$ a.s.

As in the case of classical BSDEs, some a priori estimates similar to equations (6) and (7) can be given [6]. From these estimates, we can derive the existence of a solution, that is, the following theorem.

Theorem 4 There exists a unique solution $(Y, Z, K)$ of RBSDE (43).

Sketch of the proof. The arguments are the same as in the classical case. The only problem is to show the existence of a solution in the case where the driver $f$ does not depend on $y, z$. However, this problem is already solved by optimal stopping time theory. Indeed, recall that, by Theorem (4), if $Y$ is a solution of the RBSDE associated with the driver $f(t)$ and obstacle $\xi$, then

$$Y_t = \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_t} X(\tau, \xi_\tau) = \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_t} E\left[\int_t^\tau f(s)\,ds + \xi_\tau \,\Big|\, \mathcal{F}_t\right] \tag{48}$$
The price of the American option is then given by a right-continuous process with left limits (RCLL) $Y$, satisfying for each $t$,

$$Y_t = \operatorname*{ess\,sup}_{\nu \in \mathcal{T}_t} X_t(\nu, \xi_\nu), \qquad P\text{-a.s.} \tag{55}$$

By the previous results, the price $(Y_t, 0 \le t \le T)$ corresponds to the solution of a reflected BSDE associated with the coefficient $b$ and obstacle $\xi$. In other words, there exist a process $\pi \in \mathbb{H}^2$ and an increasing continuous process $K$ such that

$$-dY_t = b(t, Y_t, \pi_t)\,dt + dK_t - \pi_t\,dW_t, \qquad Y_T = \xi_T \tag{56}$$

with $Y_\cdot \ge \xi_\cdot$ and $\int_0^T (Y_t - \xi_t)\,dK_t = 0$. In addition, the stopping time $D_t = \inf\{s \ge t : Y_s = \xi_s\}$ is optimal, that is,

$$Y_t = \operatorname*{ess\,sup}_{\nu \in \mathcal{T}_t} X_t(\nu, \xi_\nu) = X_t(D_t, \xi_{D_t}) \tag{57}$$

Moreover, by the minimality property of the increasing process $K$, the process $Y$ corresponds to the superreplication price of the option, that is, the smallest price that allows the superreplication of the payoff.

One can also easily state that the price system $\xi_\cdot \mapsto Y_\cdot(\xi_\cdot)$ is nondecreasing and sublinear if $b$ is sublinear with respect to $x, \pi$. Note (see [10] p. 239) that the no-arbitrage property holds only in a weak sense: more precisely, let $\xi_\cdot$ and $\xi_\cdot'$ be two payoffs and let $Y$ and $Y'$ be their associated prices. If $\xi_\cdot \ge \xi_\cdot'$ and also $Y_0 = Y_0'$, then $D_0 \le D_0'$, the payoffs are equal at time $D_0$, and the prices are equal until $D_0$.

In the previous section, we have seen how, in the case where the driver $b$ is convex, one can obtain a variational formulation of the price of a European option. Similarly, one can show that the price of an American option is equal to the value function of a mixed control problem [10].

References

[1] Buckdahn, R. (1993). Backward Stochastic Differential Equations Driven by a Martingale. Preprint.
[2] Chen, Z. & Epstein, L. (1998). Ambiguity, Risk and Asset Returns in Continuous Time, working paper 1998, University of Rochester.
[3] Coquet, F., Hu, Y., Mémin, J. & Peng, S. (2002). Filtration-consistent nonlinear expectations and related g-expectations, Probability Theory and Related Fields 123, 1–27.
[4] Cvitanić, J. & Karatzas, I. (1996). Backward stochastic differential equations with reflection and Dynkin games, Annals of Probability 24(4), 2024–2056.
[5] Duffie, D. & Epstein, L. (1992). Stochastic differential utility, Econometrica 60, 353–394.
[6] El Karoui, N., Kapoudjian, C., Pardoux, E., Peng, S. & Quenez, M.C. (1997). Reflected solutions of Backward SDE's and related obstacle problems for PDE's, The Annals of Probability 25(2), 702–737.
[7] El Karoui, N., Peng, S. & Quenez, M.C. (1997). Backward stochastic differential equations in finance, Mathematical Finance 7(1), 1–71.
[8] El Karoui, N., Peng, S. & Quenez, M.C. (2001). A dynamic maximum principle for the optimization of recursive utilities under constraints, Annals of Applied Probability 11(3), 664–693.
[9] El Karoui, N. & Quenez, M.C. (1995). Dynamic programming and pricing of a contingent claim in an incomplete market, SIAM Journal on Control and Optimization 33(1), 29–66.
[10] El Karoui, N. & Quenez, M.C. (1996). Non-linear pricing theory and backward stochastic differential equations, in Financial Mathematics, Lecture Notes in Mathematics 1656, Bressanone, W.J. Runggaldier, ed., Springer.
[11] El Karoui, N. & Rouge, R. (2000). Contingent claim pricing via utility maximization, Mathematical Finance 10(2), 259–276.
[12] Föllmer, H. & Schied, A. (2004). Stochastic Finance: An Introduction in Discrete Time, Walter de Gruyter, Berlin.
[13] Gegout-Petit, A. & Pardoux, E. (1996). Equations différentielles stochastiques rétrogrades réfléchies dans un convexe, Stochastics and Stochastics Reports 57, 111–128.
[14] Gobet, E. & Labart, C. (2007). Error expansion for the discretization of Backward Stochastic Differential Equations, Stochastic Processes and their Applications 117(7), 803–829.
[15] Hamadène, S., Lepeltier, J.P. & Matoussi, A. (1997). Double barrier reflected backward SDE's with continuous coefficient, in Backward Stochastic Differential Equations, Collection Pitman Research Notes in Mathematics Series 364, N. El Karoui & L. Mazliak, eds, Longman.
[16] Karatzas, I. & Shreve, S. (1991). Brownian Motion and Stochastic Calculus, Springer Verlag.
[17] Kobylanski, M. (2000). Backward stochastic differential equations and partial differential equations with quadratic growth, The Annals of Probability 28, 558–602.
[18] Kobylanski, M., Lepeltier, J.P., Quenez, M.C. & Torres, S. (2002). Reflected BSDE with super-linear quadratic coefficient, Probability and Mathematical Statistics 22, Fasc. 1, 51–83.
[19] Lepeltier, J.P. & San Martín, J. (1997). Backward stochastic differential equations with continuous coefficients, Statistics and Probability Letters 32, 425–430.
[20] Lepeltier, J.P. & San Martín, J. (1998). Existence for BSDE with superlinear-quadratic coefficient, Stochastics and Stochastics Reports 63, 227–240.
[21] Ocone, D. & Karatzas, I. (1991). A generalized Clark representation formula with application to optimal portfolios, Stochastics and Stochastics Reports 34, 187–220.
[22] Ouknine, Y. (1998). Reflected backward stochastic differential equation with jumps, Stochastics and Stochastics Reports 65, 111–125.
[23] Pardoux, E. & Peng, S. (1990). Adapted solution of backward stochastic differential equation, Systems and Control Letters 14, 55–61.
[24] Pardoux, E. & Peng, S. (1992). Backward stochastic differential equations and Quasilinear parabolic partial differential equations, Lecture Notes in CIS 176, 200–217.
[25] Peng, S. (2004). Nonlinear Expectations, Nonlinear Evaluations and Risk Measures, Lecture Notes in Math., 1856, Springer, Berlin, pp. 165–253.
[26] Quenez, M.C. (1997). "Stochastic Control and BSDE's", in "Backward Stochastic Differential Equations", N. El Karoui & L. Mazliak, eds, Collection Pitman Research Notes in Mathematics Series 364, Longman.

Related Articles

Backward Stochastic Differential Equations: Numerical Methods; Convex Risk Measures; Forward–Backward Stochastic Differential Equations (SDEs); Markov Processes; Martingale Representation Theorem; Mean–Variance Hedging; Recursive Preferences; Stochastic Control; Stochastic Integrals; Superhedging.

MARIE-CLAIRE QUENEZ
Backward Stochastic Differential Equations: Numerical Methods

Nonlinear backward stochastic differential equations (BSDEs) were introduced in 1990 by Pardoux and Peng [34]. The interest in BSDEs comes from their connections with partial differential equations (PDEs) [14, 38]; stochastic control (see Stochastic Control); and mathematical finance (see [16, 17], among others). In particular, as shown in [15], BSDEs are a useful tool in the pricing and hedging of European options. In a complete market, the price process $Y$ of $\xi$ is a solution of a BSDE. BSDEs are also useful in quadratic hedging problems in incomplete markets (see Mean–Variance Hedging).

Existence and uniqueness of solutions of BSDEs under the assumption that the generator is locally Lipschitz can be found in [19]. A similar result was obtained in the case when the coefficient is continuous with linear growth [24]. The same authors, Lepeltier and San Martín [23], generalized these results under the assumption that the coefficients have a superlinear quadratic growth. Other extensions of existence and uniqueness results for BSDEs are dealt with in [20, 25, 30]. Stability of solutions of BSDEs has been studied, for example, in [1], where the authors analyze stability under disturbances in the filtration. In [6], the authors show the existence and uniqueness of the solution and the link with integral-PDEs (see Partial Integro-differential Equations (PIDEs)). An existence theorem for BSDEs with jumps is presented in [25, 36]. The authors state a theorem for Lipschitz generators proved by fixed point techniques [37].

Since BSDE solutions are explicit in only a few cases, it is natural to search for numerical methods approximating the unique solution of such equations and to know the associated type of convergence. Some methods of approximation have been developed.

A four-step algorithm is proposed in [27] to solve equations of forward–backward type, relating the type of approximation to PDE theory. On the other hand, in [3], a method of random discretization in time is used where the convergence of the method for the solution $(Y, Z)$ needs regularity assumptions only, but for simulation studies multiple approximations are needed. See also [10, 13, 28] for forward–backward systems of SDE (FBSDE) solutions, [18] for a regression-based Monte Carlo method, [39] for approximating solutions of BSDEs, and [35] for Monte Carlo valuation of American options.

On the other hand, in [2, 9, 11, 26] the authors replace Brownian motion by simple random walks in order to define numerical approximations for BSDEs. This technique simplifies the computation of the conditional expectations involved at each time step.

A quantization (see Quantization Methods) technique was suggested in [4, 5] for the resolution of reflected backward stochastic differential equations (RBSDEs) when the generator $f$ does not depend on the control variable $z$. This method is based on the approximation of continuous time processes on a finite grid, and requires a further estimation of the transition probabilities on the grid.

In [8], the authors propose a discrete-time approximation of RBSDEs. The $L^p$ norm of the error is shown to be of the order of the time step. On the other hand, a numerical approximation for a class of RBSDEs, based on numerical approximations for BSDEs and on approximations given in [29], can be found in [31, 33].

Recently, work on numerical schemes for BSDEs with jumps was given in [22]; it is based on the approximation of the Brownian motion and a Poisson process by two simple random walks. Finally, for decoupled FBSDEs with jumps a numerical scheme is proposed in [7].

Let $\Omega = C([0, 1], \mathbb{R}^d)$ and consider the canonical Wiener space $(\Omega, \mathcal{F}, P, \mathcal{F}_t)$, in which $B_t(\omega) = \omega(t)$ is a standard $d$-dimensional Brownian motion. We consider the following BSDE:

$$Y_t = \xi + \int_t^T f(s, Y_s, Z_s)\,ds - \int_t^T Z_s\,dB_s \tag{1}$$

where $\xi$ is an $\mathcal{F}_T$-measurable square-integrable random variable and $f$ is Lipschitz continuous in the space variable with Lipschitz constant $L$. The solution of equation (1) is a pair of adapted processes $(Y, Z)$, which satisfies the equation.

Numerical Methods for BSDEs

One approach for a numerical scheme for solving BSDEs is based upon a discretization of the equation
2 Backward Stochastic Differential Equations: Numerical Methods
(1) by replacing $B$ with a simple random walk. To be more precise, let us consider the symmetric random walk $W^n$:

$$W_t^n := \frac{1}{\sqrt{n}} \sum_{k=0}^{n c_n(t)} \zeta_k^n, \qquad 0 \le t \le T \tag{2}$$

where $\{\zeta_k^n\}_{1 \le k \le n}$ is an i.i.d. symmetric Bernoulli sequence. We define $\mathcal{G}_k^n := \sigma(\zeta_1^n, \ldots, \zeta_k^n)$. Throughout this section $c_n(t) = [nt]/n$, and $\xi^n$ denotes a square integrable random variable, measurable w.r.t. $\mathcal{G}_n^n$, that should converge to $\xi$. We assume that $W^n$ and $B$ are defined on the same probability space.

In [26], the authors consider the case when the generator depends only on the variable $y$:

$$Y_t = \xi + \int_t^T f(Y_s)\,ds - \int_t^T Z_s\,dB_s \tag{3}$$

whose solution is given by

$$Y_t = \mathbb{E}\left[\xi + \int_t^T f(Y_s)\,ds \,\middle|\, \mathcal{F}_t\right] \tag{4}$$

It is standard to show that if $f$ is uniformly Lipschitz in the spatial variable $x$ with Lipschitz constant $L$ (we also assume that $f$ is bounded by $R$), then the iterations of this procedure will converge to the true solution of equation (7) at a geometric rate $L/n$. Therefore, in the case where $n$ is large enough, one iteration would already give us the error estimate $|Y_{t_i}^n - X^1| \le LR/n^2$, producing a good approximate solution of equation (7). Consequently, the explicit numerical scheme is given by

$$\hat{Y}_T^n = \xi^n; \quad \hat{Z}_T^n = 0, \qquad X_{t_i} = \mathbb{E}\left[\hat{Y}_{t_{i+1}}^n \,\middle|\, \mathcal{G}_i^n\right] \tag{9}$$

The convergence of $\hat{Y}^n$ to $Y$ is proved in the sense of the Skorohod topology in [9, 26]. In [11], the convergence of the sequence $Y^n$ is established using the tool of convergence of filtrations. See also [3] for the case where $f$ depends on both variables $y$ and $z$.
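The backward recursion behind the explicit scheme can be sketched on a recombining binomial tree. This is a minimal illustration and not the authors' implementation: it assumes that the terminal value has the form $\xi^n = g(W_T^n)$ for a hypothetical function `g`, that the generator depends on $y$ only as in equation (3), and that the single explicit iteration takes the form $\hat{Y}_{t_i} = X_{t_i} + h f(X_{t_i})$ with $h = T/n$. The function name and parameters are illustrative.

```python
import math


def solve_bsde_on_random_walk(f, g, n, T=1.0):
    """Approximate Y_0 for Y_t = xi + int_t^T f(Y_s) ds - int_t^T Z_s dB_s.

    f: generator depending on y only, as in equation (3).
    g: terminal function, so that xi^n = g(W^n_T) (assumption of this sketch).
    n: number of steps of the symmetric random walk.
    """
    h = T / n              # time step
    dx = math.sqrt(h)      # size of one random-walk move
    # Terminal layer: node j corresponds to j up-moves out of n.
    y = [g((2 * j - n) * dx) for j in range(n + 1)]
    for i in range(n - 1, -1, -1):
        nxt = y
        y = []
        for j in range(i + 1):
            x = 0.5 * (nxt[j] + nxt[j + 1])  # X_{t_i} = E[Y_{t_{i+1}} | G_i^n]
            y.append(x + h * f(x))           # single explicit iteration (assumed form)
    return y[0]
```

With `f` identically zero the recursion reduces to a plain conditional expectation, so `solve_bsde_on_random_walk(lambda y: 0.0, lambda x: x * x, 50)` returns the second moment of the walk at time $T$.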
, $\xi = (S_0 e^{(\mu - \frac{1}{2}\sigma^2)T + \sigma B_T} - K)^+$, and with $\alpha = \frac{r - \mu}{\sigma}$, $Z_s = \sigma S_s \frac{\partial w}{\partial x}$. In this case, we have an explicit solution for $w$ given by

$$Y_0 = S_0\,\Phi(g(T, S_0)) - K e^{-rT}\,\Phi(h(T, S_0)) \tag{23}$$

$$w(t, x) = x\,\Phi(g(T - t, x)) - K e^{-r(T - t)}\,\Phi(h(T - t, x)) \tag{24}$$

where $g(t, x) = \frac{\ln(x/K) + (r + \frac{1}{2}\sigma^2)t}{\sigma\sqrt{t}}$, $h(t, x) = g(t, x) - \sigma\sqrt{t}$, and $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\,dy$ is the standard normal distribution function. In general, for example, when $\sigma$ may depend on time and $(S_t)$, we obtain a BSDE for $(Y_t)$ coupled with a forward equation for $(S_t)$, which can be solved numerically.

[14] coupled with the use of the standard Euler scheme. The penalization equation is given by

$$Y_t^\varepsilon = \xi + \int_t^1 f(s, Y_s^\varepsilon, Z_s^\varepsilon)\,ds - \int_t^1 Z_s^\varepsilon\,dB_s + \frac{1}{\varepsilon} \int_t^1 (L_s - Y_s^\varepsilon)^+\,ds \tag{27}$$

In this framework, we define

$$K_t^\varepsilon := \frac{1}{\varepsilon} \int_0^t (L_s - Y_s^\varepsilon)^+\,ds, \qquad 0 \le t \le 1 \tag{28}$$

where $\varepsilon$ is the penalization parameter. In order to have an explicit iteration, we include an extra Picard iteration, and the numerical procedure is then
impose in the final condition of every step of the discretization that the solution must be above the barrier. Schematically we have

• $Y_1^n := \xi^n$
• for $i = n, n - 1, \ldots, 1$ let $(\tilde{Y}^n, Z^n)$ be the solution of the BSDE:

Clearly $K^n$ is predictable and we have

$$Y_{t_{i-1}}^n = Y_{t_i}^n + \int_{t_{i-1}}^{t_i} f(s, \tilde{Y}_s^n, Z_s^n)\,ds - \int_{t_{i-1}}^{t_i} Z_s^n\,dW_s^n + K_{t_i}^n - K_{t_{i-1}}^n \tag{32}$$
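The idea of pushing the solution above the barrier at every step can be sketched as a backward recursion on a binomial tree. This is a simplified illustration under stated assumptions, not the scheme of the article: the generator is taken to depend on $y$ only, the barrier is a hypothetical function `barrier(t, x)` of time and of the random-walk position, the terminal function `g` is assumed to lie above the barrier at time $T$, and the one-step BSDE is replaced by a single explicit step.

```python
import math


def solve_reflected_bsde_on_random_walk(f, g, barrier, n, T=1.0):
    """Y_0 for a BSDE reflected at a lower barrier L_t = barrier(t, W^n_t).

    Assumes g(x) >= barrier(T, x), so that the terminal value is admissible.
    """
    h = T / n
    dx = math.sqrt(h)
    y = [g((2 * j - n) * dx) for j in range(n + 1)]
    for i in range(n - 1, -1, -1):
        nxt = y
        y = []
        for j in range(i + 1):
            x = 0.5 * (nxt[j] + nxt[j + 1])       # conditional expectation
            y_tilde = x + h * f(x)                # unreflected one-step value
            lower = barrier(i * h, (2 * j - i) * dx)
            y.append(max(y_tilde, lower))         # push the value above the barrier
    return y[0]
```

The amount added by the `max` at each node plays the role of the increment $K_{t_i}^n - K_{t_{i-1}}^n$ in equation (32); passing a barrier equal to `-math.inf` recovers the unreflected recursion.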
Node values of the binomial tree (node i.j is the j-th node from the top at time level i):

Level 1: 1.1 = 100
Level 2: 2.1 = 117.3299316; 2.2 = 84.67006838
Level 3: 3.1 = 137.663129; 3.2 = 99.3433333; 3.3 = 71.6902048
Level 4: 4.1 = 161.520055; 4.2 = 116.559465; 4.3 = 84.1140683; 4.4 = 60.7001454
Level 5: 5.1 = 189.51137; 5.2 = 136.759141; 5.3 = 98.6909788; 5.4 = 71.2194391; 5.5 = 51.3948546
Level 6: 6.1 = 222.35356; 6.2 = 160.459406; 6.3 = 115.794058; 6.4 = 83.5617192; 6.5 = 60.3015478; 6.6 = 43.5160586
Level 7: 7.1 = 260.88728; 7.2 = 188.266912; 7.3 = 135.861089; 7.4 = 98.042908; 7.5 = 70.7517648; 7.6 = 51.0573618; 7.7 = 36.8450765
Figure 1 Binomial tree for six time steps, r = 0.06, σ = 0.4, and T = 0.5
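A recombining price tree of this kind can be generated in a few lines. The sketch below is an assumption-laden illustration: the up and down factors $1 + rh \pm \sigma\sqrt{h}$ and the step $h = 1/6$ are inferred from the first printed node values (100, 117.3299316, 84.67006838) and are not stated in the text; the function name is hypothetical.

```python
def build_price_tree(s0, r, sigma, h, steps):
    """Recombining tree with up/down factors 1 + r*h +/- sigma*sqrt(h)."""
    up = 1.0 + r * h + sigma * h ** 0.5
    down = 1.0 + r * h - sigma * h ** 0.5
    # tree[i][j]: price after i steps with j down-moves (j = 0 is the top node)
    return [[s0 * up ** (i - j) * down ** j for j in range(i + 1)]
            for i in range(steps + 1)]


# Assumed inputs: S_0 = 100, r = 0.06, sigma = 0.4, h = 1/6, six steps.
tree = build_price_tree(100.0, 0.06, 0.4, 1.0 / 6.0, 6)
```

Here `tree[i][j]` corresponds to node (i+1).(j+1) in the numbering of the figure.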
Carlo and Probabilistic Methods for Partial Differential Equations (Monte Carlo, 2000). Monte Carlo Methods and Applications 7(1–2), 21–33.
[6] Barles, G., Buckdahn, R. & Pardoux, E. (1997). BSDEs and integral-partial differential equations, Stochastics and Stochastics Reports 60(1–2), 57–83.
[7] Bouchard, B. & Elie, R. (2005). Discrete time approximation of decoupled forward-backward SDE with jumps, Stochastic Processes and Their Applications 118(1), 53–75.
[8] Bouchard, B. & Touzi, N. (2004). Discrete-time approximation and Monte-Carlo simulation of backward stochastic differential equations, Stochastic Processes and Their Applications 111(2), 175–206.
[9] Briand, P., Delyon, B. & Mémin, J. (2001). Donsker-type theorem for BSDEs, Electronic Communications in Probability 6, 1–14.
[10] Chevance, D. (1997). Numerical Methods for Backward Stochastic Differential Equations. Numerical Methods in Finance, Publications of the Newton Institute, Cambridge University Press, Cambridge, pp. 232–244.
[11] Coquet, F., Mémin, J. & Slomiński, L. (2001). On Weak Convergence of Filtrations, Séminaire de Probabilités, XXXV, Lecture Notes in Mathematics, Springer, Berlin, Vol. 1755, pp. 306–328.
[12] Cvitanic, J. & Karatzas, I. (1996). Backward stochastic differential equations with reflections and Dynkin games, Annals of Probability 24, 2024–2056.
[13] Douglas, J., Ma, J. & Protter, P. (1996). Numerical methods for forward-backward stochastic differential equations, Annals of Applied Probability 6(3), 940–968.
[14] El Karoui, N., Kapoudjian, C., Pardoux, E. & Quenez, M.C. (1997). Reflected solutions of backward SDE's, and related obstacle problems for PDE's, Annals of Probability 25(2), 702–737.
[15] El Karoui, N., Peng, S. & Quenez, M.C. (1997). Backward stochastic differential equations in finance, Mathematical Finance 7, 1–71.
[16] El Karoui, N. & Quenez, M.C. (1997). Imperfect Markets and Backward Stochastic Differential Equation. Numerical Methods in Finance, Publications of the Newton Institute, Cambridge University Press, Cambridge, pp. 181–214.
[17] El Karoui, N. & Rouge, R. (2000). Contingent claim pricing via utility maximization, Mathematical Finance 10(2), 259–276.
[18] Gobet, E., Lemor, J.-P. & Warin, X. (2005). A regression-based Monte Carlo method to solve backward stochastic differential equations, Annals of Applied Probability 15(3), 2172–2202.
[19] Hamadene, S. (1996). Équations différentielles stochastiques rétrogrades: les cas localement Lipschitzien, Annales de l'institut Henri Poincaré (B) Probabilités et Statistiques 32(5), 645–659.
[20] Kobylanski, M. (2000). Backward stochastic differential equations and partial differential equations with quadratic growth, Annals of Probability 28, 558–602.
[21] Kobylanski, M., Lepeltier, J.P., Quenez, M.C. & Torres, S. (2002). Reflected BSDE with superlinear quadratic coefficient, Probability and Mathematical Statistics 22(Fasc. 1), 51–83.
[22] Lejay, A., Mordecki, E. & Torres, S. (2008). Numerical method for backward stochastic differential equations with jumps. Submitted, preprint inria-00357992.
[23] Lepeltier, J.P. & San Martín, J. (1997). Backward stochastic differential equations with continuous coefficient, Statistics and Probability Letters 32(4), 425–430.
[24] Lepeltier, J.P. & San Martín, J. (1998). Existence for BSDE with superlinear-quadratic coefficients, Stochastics and Stochastics Reports 63, 227–240.
[25] Li, X. & Tang, S. (1994). Necessary condition for optimal control of stochastic systems with random jumps, SIAM Journal on Control and Optimization 32(5), 1447–1475.
[26] Ma, J., Protter, P., San Martín, J. & Torres, S. (2002). Numerical method for backward stochastic differential equations, Annals of Applied Probability 12, 302–316.
[27] Ma, J., Protter, P. & Yong, J. (1994). Solving forward-backward stochastic differential equations explicitly: a four step scheme, Probability Theory and Related Fields 98(3), 339–359.
[28] Ma, J. & Yong, J. (1999). Forward-Backward Stochastic Differential Equations and their Applications, Lecture Notes in Mathematics, Springer Verlag, Berlin, Vol. 1702.
[29] Ma, J. & Zhang, L. (2005). Representations and regularities for solutions to BSDE's with reflections, Stochastic Processes and their Applications 115, 539–569.
[30] Mao, X.R. (1995). Adapted solutions of BSDE with non-Lipschitz coefficients, Stochastic Processes and their Applications 58, 281–292.
[31] Martínez, M., San Martín, J. & Torres, S. Numerical method for Reflected Backward Stochastic Differential Equations. Submitted.
[32] Matoussi, A. (1997). Reflected solutions of backward stochastic differential equations with continuous coefficient, Statistics and Probability Letters 34, 347–354.
[33] Mémin, J., Peng, S. & Xu, M. (2008). Convergence of solutions of discrete reflected backward SDE's and simulations, Acta Mathematicae Applicatae Sinica 24(1), 1–18.
[34] Pardoux, E. & Peng, S. (1990). Adapted solution of backward stochastic differential equation, Systems and Control Letters 14, 55–61.
[35] Rogers, L.C.G. (2002). Monte Carlo valuation of American options, Mathematical Finance 12(3), 271–286.
[36] Situ, R. (1997). On solution of backward stochastic differential equations with jumps, Stochastic Processes and their Applications 66(2), 209–236.
[37] Situ, R. & Yin, J. (2003). On solutions of forward-backward stochastic differential equations with Poisson jumps, Stochastic Analysis and Applications 21(6), 1419–1448.
[38] Sow, A.B. & Pardoux, E. (2004). Probabilistic interpretation of a system of quasilinear parabolic PDEs, Stochastics and Stochastics Reports 76(5), 429–477.
[39] Zhang, J. (2004). A numerical scheme for BSDEs, Annals of Applied Probability 14(1), 459–488.

Related Articles

American Options; Backward Stochastic Differential Equations; Forward–Backward Stochastic Differential Equations (SDEs); Markov Processes; Martingales; Martingale Representation Theorem; Mean–Variance Hedging; Partial Differential Equations; Partial Integro-differential Equations (PIDEs); Quantization Methods; Stochastic Control.

JAIME SAN MARTÍN & SOLEDAD TORRES
Stochastic Exponential

Let $X$ be a semimartingale with $X_0 = 0$. Then there exists a unique semimartingale $Z$ that satisfies the equation

$$Z_t = 1 + \int_0^t Z_{s-}\,dX_s \tag{1}$$

It is called the stochastic exponential of $X$ and is denoted by $\mathcal{E}(X)$. Sometimes the stochastic exponential is also called the Doléans exponential, after the French mathematician Catherine Doléans-Dade. Note that $Z_-$ denotes the left-limit process, so that the integrand in the stochastic integral is predictable.

We first give some examples as follows:

1. If $B$ is a Brownian motion, then an application of Itô's formula reveals that

$$\mathcal{E}(B)_t = \exp\left(B_t - \tfrac{1}{2} t\right) \tag{2}$$

2. Likewise, the stochastic exponential for a compensated Poisson process $N - \lambda t$ is given as

$$\mathcal{E}(N - \lambda t)_t = \exp(-\lambda t) \times 2^{N_t} = \exp\left(\ln(2) N_t - \lambda t\right) \tag{3}$$

3. The classical Samuelson model for the evolution of stock prices is also given as a stochastic exponential. The price process $S$ is modeled here as the solution of the stochastic differential equation

$$\frac{dS_t}{S_t} = \sigma\,dB_t + \mu\,dt \tag{4}$$

Here, we consider the constant trend coefficient $\mu$, the volatility $\sigma$, and a Brownian motion $B$. The solution to this equation is

$$S_t = \mathcal{E}(\sigma B_t + \mu t) = \exp\left(\sigma B_t + \left(\mu - \tfrac{1}{2}\sigma^2\right) t\right) \tag{5}$$

For a general semimartingale $X$ as above, the expression for the stochastic exponential is

$$Z_t = \exp\left(X_t - \tfrac{1}{2}[X]_t\right) \prod_{0 < s \le t} (1 + \Delta X_s) \exp\left(-\Delta X_s + \tfrac{1}{2}(\Delta X_s)^2\right) \tag{6}$$

where the possibly infinite product converges. Here $[X]$ denotes the quadratic variation process of $X$.

In case $X$ is a local martingale vanishing at zero with $\Delta X > -1$, then $\mathcal{E}(X)$ is a strictly positive local martingale. This property renders the stochastic exponential very useful as a model for asset prices in case the price process is directly modeled under a martingale measure, that is, in the risk neutral world. However, considering some Lévy process $X$, many authors prefer to model the price process as $\exp(X)$ rather than $\mathcal{E}(X)$ since this form is better suited for applying Laplace transform methods. In fact, the two representations are equivalent because starting with a model of the form $\exp(X)$, one can always find a Lévy process $\tilde{X}$ such that $\exp(X) = \mathcal{E}(\tilde{X})$ and vice versa (in case the stochastic exponential is positive). The detailed calculations involving characteristic triplets can be found in Goll and Kallsen [3].

Finally, for any two semimartingales $X$, $Y$ we have the formula

$$\mathcal{E}(X)\,\mathcal{E}(Y) = \mathcal{E}(X + Y + [X, Y]) \tag{7}$$

which generalizes the multiplicative property of the usual exponential function.

Martingale Property

The most crucial issue from the point of view of mathematical finance is that, given $X$ is a local martingale, the stochastic exponential $\mathcal{E}(X)$ may fail to be a martingale. Let us give an illustration of this phenomenon.

We assume that the price process of a risky asset evolves as the stochastic exponential $Z_t = \exp\left(B_t - \tfrac{1}{2} t\right)$ where $B$ is a standard Brownian motion starting in zero. Since one-dimensional Brownian motion is almost surely recurrent, and therefore gets negative for arbitrarily large times, zero must be an accumulation point of $Z$. As $Z$ can be written as a stochastic integral of $B$, it is a local martingale, and hence a supermartingale by Fatou's lemma
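In discrete time the defining equation (1) becomes the recursion $Z_k = Z_{k-1}(1 + \Delta X_k)$, which gives a quick way to experiment with formulas (2) and (7). The sketch below is illustrative only: the function names, the step count, and the fixed random seed are assumptions, and the comparison with equation (2) holds only approximately on a finite grid.

```python
import math
import random


def stochastic_exponential(increments):
    """Discrete analog of (1): Z_k = Z_{k-1} + Z_{k-1} * dX_k, with Z_0 = 1."""
    z = [1.0]
    for dx in increments:
        z.append(z[-1] * (1.0 + dx))
    return z


def brownian_increments(n, T=1.0, seed=0):
    """n independent N(0, T/n) increments (seed fixed for reproducibility)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, math.sqrt(T / n)) for _ in range(n)]


dB = brownian_increments(20000)
z_T = stochastic_exponential(dB)[-1]
closed_form = math.exp(sum(dB) - 0.5 * 1.0)   # equation (2) at t = T = 1

# Discrete version of formula (7): [X, Y] has increments dX * dY.
dX = [0.1, -0.2, 0.05]
dY = [0.3, 0.1, -0.4]
lhs = stochastic_exponential(dX)[-1] * stochastic_exponential(dY)[-1]
rhs = stochastic_exponential([a + b + a * b for a, b in zip(dX, dY)])[-1]
```

On the grid, `lhs` and `rhs` agree up to rounding because $(1 + a)(1 + b) = 1 + a + b + ab$ term by term, while `z_T` and `closed_form` differ by a discretization error that shrinks as the number of steps grows.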
stochastic exponentials is that they are intricately related to measure changes since they qualify as candidates for density processes (see Girsanov's theorem). Let us fix a filtered probability space $(\Omega, \mathcal{F}_\infty, (\mathcal{F}_t), P)$. In case the stochastic exponential is positive, we may define a new measure $Q$ on $\mathcal{F}_\infty$ via

$$\frac{dQ}{dP} = Z_\infty \tag{9}$$

If $Z$ is a uniformly integrable martingale, then $Q$ is a probability measure since $E[Z_\infty] = Z_0 = 1$. On the other hand, if $Z$ is a strict local martingale, hence a strict supermartingale, then we get $Q(\Omega) = E[Z_\infty] < 1$. It is therefore of paramount interest to have criteria at hand for stochastic exponentials to be true martingales. We first focus on the continuous case.

Theorem 1 (Kazamaki's Criterion). Let $M$ be a continuous local martingale. Suppose

$$\sup_T E\left[\exp\left(\tfrac{1}{2} M_T\right)\right] < \infty \tag{10}$$

where the supremum is taken over all bounded stopping times $T$. Then $\mathcal{E}(M)$ is a uniformly integrable martingale.

A slightly weaker result, which, however, is often easier to apply, is given by the following criterion.

Theorem 2 (Novikov's Criterion). Let $M$ be a continuous local martingale. Suppose

$$E\left[\exp\left(\tfrac{1}{2} [M]_\infty\right)\right] < \infty \tag{11}$$

and $B$ a Brownian motion. Provided that there is $\varepsilon > 0$ such that

$$\sup_{0 \le t \le T} E\left[\exp\left(\varepsilon \vartheta_t^2\right)\right] < \infty \quad P\text{-a.s.} \tag{13}$$

then the stochastic exponential $\mathcal{E}\left(\int \vartheta\,dB\right)$ is a martingale on $[0, T]$.

Let us now turn to the discontinuous case. A generalization of Novikov's criterion has been obtained by Lepingle and Mémin [7] where more results in this direction can be found.

Theorem 4 Let $M$ be a locally bounded local $P$-martingale with $\Delta M > -1$. If

$$E\left[\exp\left(\tfrac{1}{2} \langle M^c \rangle_\infty\right) \prod_t (1 + \Delta M_t) \exp\left(-\frac{\Delta M_t}{1 + \Delta M_t}\right)\right] < \infty \tag{14}$$

then $\mathcal{E}(M)$ is a uniformly integrable martingale. Here $M^c$ denotes the continuous local martingale part of $M$.

The situation is particularly transparent for Lévy processes; see Cont and Tankov [1].

Theorem 5 If $M$ is both a Lévy process and a local martingale, then its stochastic exponential $\mathcal{E}(M)$ (given that it is positive) is already a martingale.

Alternative conditions for ensuring that stochastic exponentials are martingales in case of Brownian motion driven stochastic volatility models have been
provided in Hobson [4] as well as in Wong and Heyde [9]. Moreover, Kallsen and Shiryaev [6] give results generalizing and complementing the criteria in Lepingle and Mémin [7]. In case of local martingales of stochastic exponential form $\mathcal{E}(X)$, where $X$ denotes one component of a multivariate affine process, Kallsen and Muhle-Karbe [5] give sufficient conditions for $\mathcal{E}(X)$ to be a true martingale. Finally, there are important links between stochastic exponentials of BMO-martingales, reverse Hölder inequalities, and weighted norm inequalities (i.e., inequalities generalizing martingale inequalities to certain semimartingales); compare Doléans-Dade and Meyer [2].

References

[1] Cont, R. & Tankov, P. (2003). Financial Modelling with Jump Processes, Chapman & Hall/CRC Press, Boca Raton.
[2] Doléans-Dade, C. & Meyer, P.A. (1979). Inégalités de normes avec poids, Séminaire de Probabilités de Strasbourg 13, 313–331.
[3] Goll, T. & Kallsen, J. (2000). Optimal portfolio with logarithmic utility, Stochastic Processes and their Applications 89, 91–98.
[4] Hobson, D. (2004). Stochastic volatility models, correlation and the q-optimal measure, Mathematical Finance 14, 537–556.
[5] Kallsen, J. & Muhle-Karbe, J. (2007). Exponentially Affine Martingales and Affine Measure Changes, preprint, TU München.
[6] Kallsen, J. & Shiryaev, A.N. (2002). The cumulant process and Esscher's change of measure, Finance and Stochastics 6, 397–428.
[7] Lepingle, D. & Mémin, J. (1978). Sur l'intégrabilité uniforme des martingales exponentielles, Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 42, 175–203.
[8] Liptser, R. & Shiryaev, A.N. (1977). Statistics of Random Processes I, Springer, Berlin.
[9] Wong, B. & Heyde, C.C. (2004). On the martingale property of stochastic exponentials, Journal of Applied Probability 41, 654–664.

THORSTEN RHEINLÄNDER
Martingales

The word martingale originated from Middle French. It means a device for steadying a horse's head or checking its upward movement. In eighteenth-century France, martingale also referred to a class of betting strategies in which a player increases the stake, usually by doubling, each time a bet is lost. The word “martingale”, which appeared in the official dictionary of the Academy in 1762 (in the sense of a strategy), means “a strategy that consists in betting all that you have lost”. See [7] for more about the origin of martingales. The simplest version of the martingale betting strategies was designed to beat a fair game in which the gambler wins his stake if a coin comes up heads and loses it if the coin comes up tails. The strategy had the gambler keep doubling his bet until the first head eventually occurs. At this point, the gambler stops the game and recovers all previous losses, besides winning a profit equal to the original stake. Logically, if a gambler is able to follow this “doubling strategy” (in French, it is still referred to as la martingale), he would win sooner or later. But in reality, the exponential growth of the bets would bankrupt the gambler quickly. It is Doob's optional stopping theorem (the cornerstone of martingale theory) that shows the impossibility of successful betting strategies.

In probability theory, a martingale is a stochastic process (a collection of random variables) such that the conditional expectation of an observation at some future time $t$, given all the observations up to some earlier time $s < t$, is equal to the observation at that earlier time $s$. The name “martingale” was introduced by Jean Ville (1910–1989) as a synonym of “gambling system” in his book on “collectif” in the Borel collection, 1938. However, the concept of martingale was created and investigated as early as 1934 by Paul Pierre Lévy (1886–1971), and a lot of the original development of the theory was done by Joseph Leo Doob (1910–2004). At present, martingale theory is one of the central themes of modern probability. It plays a very important role in the study of stochastic processes. In practice, a martingale is a model of a fair game. In financial markets, a fair game means that there is no arbitrage. Mathematical finance builds the bridge that connects no-arbitrage arguments and martingale theory. The fundamental theorem (principle) of asset pricing states, roughly speaking, that a mathematical model for stochastic asset prices $X$ is free of arbitrage if and only if $X$ is a martingale under an equivalent probability measure. The fair price of a contingent claim associated with those assets $X$ is the expectation of its payoff under the martingale equivalent measure (risk neutral measure).

Martingale theory is a vast field of study, and this article only gives an introduction to the theory and describes its use in finance. For a complete description, readers should consult texts such as [4, 13] and [6].

Discrete-time Martingales

A (finite or infinite) sequence of random variables $X = \{X_n \mid n = 0, 1, 2, \ldots\}$ on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ is called a discrete-time martingale (respectively, submartingale, supermartingale) if for all $n = 0, 1, 2, \ldots$, $\mathbb{E}[|X_n|] < \infty$ and

$$\mathbb{E}[X_{n+1} \mid X_0, X_1, \ldots, X_n] = X_n \quad (\text{respectively } \ge X_n,\ \le X_n) \tag{1}$$

By the tower property of conditional expectations, equation (1) is equivalent to

$$\mathbb{E}[X_n \mid X_0, X_1, \ldots, X_k] = X_k \quad (\text{respectively } \ge X_k,\ \le X_k), \text{ for any } k \le n \tag{2}$$

Obviously, $X$ is a submartingale if and only if $-X$ is a supermartingale. Every martingale is also a submartingale and a supermartingale; conversely, any stochastic process that is both a submartingale and a supermartingale is a martingale. The expectation $\mathbb{E}[X_n]$ of a martingale $X$ at time $n$ is a constant for all $n$. This is one of the reasons that in a fair game, the asset of a player is supposed to be a martingale. For a supermartingale $X$, $\mathbb{E}[X_n]$ is a nonincreasing function of $n$, whereas for a submartingale $X$, $\mathbb{E}[X_n]$ is a nondecreasing function of $n$. Here is a mnemonic for remembering which is which: “Life is a supermartingale; as time advances, expectation decreases.” The conditional expectation of $X_n$ in equation (2) should be evaluated on the basis
of all information available up to time $k$, which can be summarized by a $\sigma$-algebra $\mathcal{F}_k$,

$$\mathcal{F}_k = \{\text{all events occurring at times } i = 0, 1, 2, \ldots, k\} \tag{3}$$

A sequence of increasing $\sigma$-algebras $\{\mathcal{F}_n \mid n = 0, 1, 2, \ldots\}$, that is, $\mathcal{F}_k \subseteq \mathcal{F}_n \subseteq \mathcal{F}$ for $k \le n$, is called a filtration, denoted by $\mathbb{F}$. When $\mathcal{F}_n$ is the smallest $\sigma$-algebra containing all the information of $X$ up to time $n$, $\mathcal{F}_n$ is called the $\sigma$-algebra generated by $X_0, X_1, \ldots, X_n$, denoted by $\sigma\{X_0, X_1, \ldots, X_n\}$, and $\mathbb{F}$ is called the natural filtration of $X$. For another sequence of random variables $\{Y_k \mid k = 0, 1, \ldots\}$, let $\mathcal{F}_k = \sigma\{Y_0, Y_1, \ldots, Y_k\}$, then $\mathbb{E}[X_n \mid Y_0, Y_1, \ldots, Y_k] = \mathbb{E}[X_n \mid \mathcal{F}_k]$.

A sequence of random variables $X = \{X_n \mid n = 0, 1, 2, \ldots\}$ on the filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ is said to be adapted if $X_n$ is $\mathcal{F}_n$-measurable for each $n$, which means that given $\mathcal{F}_n$, there is no randomness in $X_n$. An adapted $X$ is called a discrete-time martingale (respectively submartingale, supermartingale) with respect to the filtration $\mathbb{F}$, if for each $n$, $\mathbb{E}[|X_n|] < \infty$, and

$$\mathbb{E}[X_n \mid \mathcal{F}_k] = X_k \quad (\text{respectively } \ge X_k,\ \le X_k), \text{ for any } k \le n \tag{4}$$

Example 1 (Closed Martingales). Let $Z$ be a random variable with $\mathbb{E}|Z| < \infty$, then for any filtration $\mathbb{F} = (\mathcal{F}_n)$, $X_n = \mathbb{E}[Z \mid \mathcal{F}_n]$ is a martingale (also called a martingale closed by $Z$). Conversely, for any martingale $X$ on a finite probability space, there exists a random variable $Z$ such that $X_n = \mathbb{E}[Z \mid \mathcal{F}_n]$.

Example 2 (Partial Sums of i.i.d. Random Variables). Let $Z_1, Z_2, \ldots$ be a sequence of independent, identically distributed (i.i.d.) random variables such that $\mathbb{E}[Z_n] = \mu$ and $\mathrm{Var}[Z_n] = \sigma^2 < \infty$, and such that the moment generating function $\varphi(\theta) = \mathbb{E}[\theta^{Z_1}]$ exists for some $\theta > 0$. Let $S_n$ be the partial sum, $S_n = Z_1 + \cdots + Z_n$, also called a random walk. Let $\mathcal{F}_n = \sigma\{Z_1, \ldots, Z_n\}$. Then

$$S_n - n\mu, \qquad (S_n - n\mu)^2 - n\sigma^2, \qquad \frac{\theta^{S_n}}{[\varphi(\theta)]^n} \tag{5}$$

are all martingales. If $\mathbb{P}(Z_k = +1) = p$, $\mathbb{P}(Z_k = -1) = q = 1 - p$, then $S_n$ is called a simple random walk and $(q/p)^{S_n}$ is a martingale since $\varphi(q/p) = 1$; in particular, when $p = q = 1/2$, $S_n$ is called a simple symmetric random walk. If $Z_k$ has the Bernoulli distribution, $\mathbb{P}(Z_k = +1) = p$, $\mathbb{P}(Z_k = 0) = q = 1 - p$, then $S_n$ has the binomial distribution with parameters $(n, p)$, and $(q/p)^{2S_n - n}$ is a martingale since $\varphi([q/p]^2) = q/p$.

Example 3 (Polya's Urn). An urn initially contains $r$ red and $b$ blue marbles. One is chosen randomly. Then it is put back together with another one of the same color. Let $X_n$ be the number of red marbles in the urn after $n$ iterations of this procedure, and let $Y_n = X_n/(n + r + b)$. Then the sequence $Y_n$ is a martingale.

Example 4 (A Convex Function of Martingales). By Jensen's inequality, a convex function of a martingale is a submartingale. Similarly, a convex and nondecreasing function of a submartingale is also a submartingale. Examples of convex functions are $\max(x - k, 0)$ for constant $k$, $|x|^p$ for $p \ge 1$, and $e^{\theta x}$ for constant $\theta$.

Example 5 (Martingale Transforms). Let $X$ be a martingale with respect to the filtration $\mathbb{F}$ and $H$ be a predictable process with respect to $\mathbb{F}$, that is, $H_n$ is $\mathcal{F}_{n-1}$-measurable for $n \ge 1$, where $\mathcal{F}_0 = \{\emptyset, \Omega\}$. A martingale transform of $X$ by $H$ is defined by

$$(H \cdot X)_n = H_0 X_0 + \sum_{i=1}^{n} H_i (X_i - X_{i-1}) \tag{6}$$

where the expression $H \cdot X$ is the discrete analog of the stochastic integral $\int H\,dX$. If $\mathbb{E}|(H \cdot X)_n| < \infty$ for $n \ge 1$, then $(H \cdot X)_n$ is a martingale with respect to $\mathbb{F}$. The interpretation is that in a fair game $X$, if we choose our bet at each stage on the basis of the prior history, that is, the bet $H_n$ for the $n$th gamble only depends on $\{X_0, X_1, \ldots, X_{n-1}\}$, then the game will continue to be fair. If $X_n$ is the asset price at time $n$ and $H_n$ is the number of shares of the assets held by the investor during the time period from time $n - 1$ until time $n$, more precisely, for the time interval $[n - 1, n)$, then $(H \cdot X)_n$ is the total gain (or loss) up to time $n$ (the value of the portfolio at time $n$ with the trading strategy $H$).

A random variable $T$ taking values in $\{0, 1, 2, \ldots; \infty\}$ is a stopping time with respect to a filtration $\mathbb{F} = \{\mathcal{F}_n \mid n = 0, 1, 2, \ldots\}$, if for each $n$, the
event $\{T = n\}$ is $\mathcal{F}_n$-measurable, or equivalently, the event $\{T \le n\}$ is $\mathcal{F}_n$-measurable. If $S$ and $T$ are stopping times, then $S + T$, $S \vee T = \max(S, T)$, and $S \wedge T = \min(S, T)$ are all stopping times. In particular, $T \wedge n$ is a bounded stopping time for any fixed time $n$. $X_n^T := X_{T \wedge n}$ is said to be the process $X$ stopped at $T$, since on the event $\{\omega \mid T(\omega) = k\}$, $X_n^T = X_k$ for $n = k, k + 1, \ldots$.

Doob's Optional Stopping Theorem

Let $X$ be a martingale and $T$ be a bounded stopping time with respect to the same filtration $\mathbb{F}$, then $\mathbb{E}[X_T] = \mathbb{E}[X_0]$. Conversely, for an adapted process $X$, if $\mathbb{E}[|X_T|] < \infty$ and $\mathbb{E}[X_T] = \mathbb{E}[X_0]$ hold for all bounded stopping times $T$, then $X$ is a martingale. This theorem says roughly that stopping a martingale at a stopping time $T$ does not alter its expectation, provided that the decision when to stop is based only on information available up to time $T$. The theorem also shows that a martingale stopped at a stopping time is still a martingale, and there is no way to be sure to win in a fair game if the stopping time is bounded.

Continuous-time Martingales

A continuous-time stochastic process $X$ on a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ is a collection of random variables $X = \{X_t : 0 \le t \le \infty\}$, where $X_t$ is a random variable observed at time $t$, and the filtration $\mathbb{F} = \{\mathcal{F}_t : 0 \le t \le \infty\}$ is a family of increasing $\sigma$-algebras, $\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F}$ for $s \le t$. A process $X$ is said to be adapted if $X_t$ is $\mathcal{F}_t$-measurable for each $t$. A random variable $T$ taking values in $[0, \infty]$ is called a stopping time, if the event $\{T \le t\}$ is $\mathcal{F}_t$-measurable for each $t$. The stopping time $\sigma$-algebra $\mathcal{F}_T$ is defined to be $\mathcal{F}_T = \{A \in \mathcal{F} : A \cap \{T \le t\} \in \mathcal{F}_t, \text{ all } t \ge 0\}$, which represents the information up to the stopping time $T$.

A real-valued, adapted process $X$ is called a continuous-time martingale (respectively supermartingale, submartingale) with respect to the filtration $\mathbb{F}$ if

1. $\mathbb{E}[|X_t|] < \infty$, for $t > 0$ (7)
2. $\mathbb{E}[X_t \mid \mathcal{F}_s] = X_s$ (respectively $\le X_s$, $\ge X_s$), a.s. for any $0 \le s \le t$ (8)

Continuous-time martingales have the same properties as discrete-time martingales. For example, Doob's optional stopping theorem says that for a martingale $X_t$ with right continuous paths, which is closed in $L^1$ by a random variable $X_\infty$, we have

$$\mathbb{E}[X_T \mid \mathcal{F}_S] = X_S \quad \text{a.s. for any two stopping times } 0 \le S \le T \tag{9}$$

The most important continuous-time martingale is Brownian motion, which was named for the Scottish botanist Robert Brown, who, in 1827, observed ceaseless and irregular movement of pollen grains suspended in water. It was studied by Albert Einstein in 1905 at the level of modern physics. Its mathematical model was first rigorously constructed in 1923 by Norbert Wiener. Brownian motion is also called a Wiener process. The Wiener process gave rise to the study of continuous-time martingales, and has been an example that helps mathematicians to understand stochastic calculus and diffusion processes.

It was Louis Bachelier (1870–1946), now recognized as the founder of mathematical finance (see [9]), who first, in 1900, used Brownian motion $B$ to model short-term stock prices $S_t$ at a time $t$ in financial markets, that is, $S_t = S_0 + \sigma B_t$, where $\sigma > 0$ is a constant. Now we can see that if Brownian motion $B$ is defined on $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, then the price process $S$ is a martingale under the probability measure $\mathbb{P}$.

In 1965, the American economist Paul Samuelson rediscovered Bachelier's ideas and proposed the geometric Brownian motion $S_0 \exp\{(\mu - (\sigma^2/2))t + \sigma B_t\}$ as a model for long-term stock prices $S_t$. That is, $S_t$ follows the stochastic differential equation (SDE): $dS_t = \mu S_t\,dt + \sigma S_t\,dB_t$. From this simple structure, we get the famous Black–Scholes option price formulas for European calls and puts. This SDE is now called the Black–Scholes equation (model). Contrary to Bachelier's setting, the price process $S$ is not a martingale under $\mathbb{P}$. However, by Girsanov's theorem, there is a unique probability measure $\mathbb{Q}$, which is equivalent to $\mathbb{P}$, such that the discounted stock price $e^{-rt} S_t$ is a martingale under $\mathbb{Q}$ for $0 \le t \le T$, where $r$ is the riskless rate of interest, and $T > 0$ is a fixed constant.

The reality is not as simple as the above linear SDE. A simple generalization is $dS_t = \mu(t, S_t)\,dt + \sigma(t, S_t)\,dB_t$. If one believes that risky asset prices
have jumps, an appropriate model might be

$$dS_t = \mu(t, S_t)\,dt + \sigma(t, S_t)\,dB_t + J(t, S_t)\,dN_t \qquad (10)$$

where $N$ is a Poisson process with intensity $\lambda$, $J(t, S_t)$ refers to the jump size, and $N$ indicates when the jumps occur. Since $N$ is a counting (pure jump) process with independent and stationary increments, both $N_t - \lambda t$ and $(N_t - \lambda t)^2 - \lambda t$ are martingales. For a more general model, we could replace $N$ by a Lévy process that includes the Brownian motion and Poisson process as special cases.
Under these general mathematical models, it becomes hard to turn the fundamental principle of asset pricing into a precise mathematical theorem: the absence of arbitrage possibilities for a stochastic process $S$, a semimartingale defined on $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$, is equivalent to the existence of an equivalent measure $\mathbb{Q}$, under which $S$ is a local martingale or, in some cases, a sigma martingale. See [2] or [3].

Local Martingales and Finite Variation Processes

There are two types of processes with only jump discontinuities. A process is said to be càdlàg if it almost surely (a.s.) has sample paths that are right continuous, with left limits. A process is said to be càglàd if it almost surely has sample paths that are left continuous, with right limits. The words càdlàg and càglàd are acronyms from the French for continu à droite, limites à gauche, and continu à gauche, limites à droite, respectively. Let

$\mathbb{D}$ = the space of adapted processes with càdlàg paths
$\mathbb{L}$ = the space of adapted processes with càglàd paths  (11)

An adapted, càdlàg process $A$ is called a finite variation (FV) process if $\sup \sum_{i=1}^{N} |A_{t_i} - A_{t_{i-1}}|$ is bounded almost surely for each constant $t > 0$, where the supremum is taken over the set of all partitions $0 = t_0 \le t_1 \le \cdots \le t_N = t$. An FV process is a difference of two increasing processes. Although the Brownian motion $B$ has continuous paths, it has paths of infinite variation on $[0, t]$, which prevents us from defining the stochastic integral $\int H\,dB$ as a Riemann–Stieltjes integral, path by path.
An adapted, càdlàg process $M$ is called a local martingale with respect to a filtration if there exists an increasing sequence of stopping times $T_n$ with $\lim_{n\to\infty} T_n = \infty$ almost surely, such that for each $n$, $M_{t \wedge T_n}$ is a martingale. A similar concept is that of a locally bounded function: for example, $1/t$ is not bounded over $(0, 1]$, but it is bounded on the interval $[1/n, 1]$ for any integer $n$. A process moving very rapidly though with continuous paths, or jumping unboundedly and frequently, might not be a martingale. However, we could modify it to be a martingale by stopping it properly, that is, it is a martingale up to a stopping time, but may not be a martingale for all time.
The class of local martingales includes martingales as special cases. For example, if for every $t > 0$, $\mathbb{E}\{\sup_{s \le t} |M_s|\} < \infty$, then $M$ is a martingale; if for all $t > 0$, $\mathbb{E}\{[M, M]_t\} < \infty$, then $M$ is a martingale, and $\mathbb{E}\{M_t^2\} = \mathbb{E}\{[M, M]_t\}$. Conversely, if $M$ is a martingale with $\mathbb{E}\{M_t^2\} < \infty$ for all $t > 0$, then $\mathbb{E}\{[M, M]_t\} < \infty$ for all $t > 0$. For the definition of quadratic variation $[M, M]_t$, see equation (14) in the next section.
Not all local martingales are martingales. Here is a typical example of a local martingale that is not a martingale. Many continuous-time martingales, supermartingales, and submartingales can be constructed from Brownian motion, since it has independent and stationary increments and it can be approximated by a random walk. For example, let $B$ be a standard Brownian motion in $\mathbb{R}^3$ with $B_0 = x \ne 0$. Let $u(y) = \|y\|^{-1}$, a superharmonic function on $\mathbb{R}^3$. Then $M_t = u(B_t)$ is a positive supermartingale. Since $\lim_{t\to\infty} \sqrt{t}\,\mathbb{E}\{M_t\} = \sqrt{2/\pi}$ and $\mathbb{E}\{M_0\} = u(x)$, $M$ does not have constant expectations and it cannot be a martingale. $M$ is known as the inverse Bessel process. For each $n$, we define a stopping time $T_n = \inf\{t > 0 : \|B_t\| \le 1/n\}$. Since the function $u$ is harmonic outside of the ball of radius $1/n$ centered at the origin, the process $\{M_{t \wedge T_n} : t \ge 0\}$ is a martingale for each $n$. Therefore, $M$ is a local martingale.

Semimartingales and Stochastic Integrals

Today stocks and bonds are traded globally almost 24 hours a day, and online trading happens every second.
When trading takes place almost continuously, it is simpler to use a continuous-time stochastic process to model the price $X$. The value of the portfolio at time $t$ with the continuous-time trading strategy $H$ becomes the limit of sums as shown in the martingale transform $(H \cdot X)_n$ in equation (6), that is, the stochastic integral $\int_0^t H_s\,dX_s$. Stochastic calculus is more complicated than regular calculus because $X$ can have paths of infinite variation (for example, when $X$ is Brownian motion, a continuous-time martingale, or a local martingale), especially when $X$ also has unbounded jumps. For stochastic integration theory, see Stochastic Integrals or consult [8, 11] and [12], and other texts.
Let $0 = T_1 \le \cdots \le T_{n+1} < \infty$ be a sequence of stopping times and $H_i \in \mathcal{F}_{T_i}$ with $|H_i| < \infty$. A process $H$ with a representation

$$H_t = H_0 1_{\{0\}}(t) + \sum_{i=1}^{n} H_i 1_{(T_i, T_{i+1}]}(t) \qquad (12)$$

is called a simple predictable process. The collection of simple predictable processes is denoted by $\mathbf{S}$. For a process $X \in \mathbb{D}$ and $H \in \mathbf{S}$ having the representation (12), we define a linear mapping as the martingale transforms in equation (6) in the discrete-time case

$$(H \cdot X)_t = H_0 X_0 + \sum_{i=1}^{n} H_i (X_{t \wedge T_{i+1}} - X_{t \wedge T_i}) \qquad (13)$$

If for any $H \in \mathbf{S}$ and each $t \ge 0$, the sequence of random variables $(H^n \cdot X)_t$ converges to $(H \cdot X)_t$ in probability, whenever $H^n \in \mathbf{S}$ converges to $H$ uniformly, then $X$ is called a semimartingale. For example, an FV process, a local martingale with continuous paths, and a Lévy process are all semimartingales.
Since the space $\mathbf{S}$ is dense in $\mathbb{L}$, for any $H \in \mathbb{L}$, there exists $H^n \in \mathbf{S}$ such that $H^n$ converges to $H$. For a semimartingale $X$ and a process $H \in \mathbb{L}$, the stochastic integral $\int H\,dX$, also denoted by $(H \cdot X)$, is defined by $\lim_{n\to\infty} (H^n \cdot X)$. For any $H \in \mathbb{L}$, $H \cdot X$ is a semimartingale; it is an FV process if $X$ is, and it is a local martingale if $X$ is. But $H \cdot X$ may not be a martingale even if $X$ is. $H \cdot X$ is a martingale if $X$ is a local martingale and $\mathbb{E}\{\int_0^t H_s^2\,d[X, X]_s\} < \infty$ for each $t > 0$.
For a semimartingale $X$, its quadratic variation $[X, X]$ is defined by

$$[X, X]_t = X_t^2 - 2\int_0^t X_{s-}\,dX_s \qquad (14)$$

where $X_{s-}$ denotes the left limit at $s$. Let $[X, X]^c$ denote the path-by-path continuous part of $[X, X]$, and $\Delta X_s = X_s - X_{s-}$ be the jump of $X$ at $s$; then $[X, X]_t = [X, X]^c_t + \sum_{0 \le s \le t} (\Delta X_s)^2$. For an FV process $X$, $[X, X]_t = \sum_{0 \le s \le t} (\Delta X_s)^2$. In particular, if $X$ is an FV process with continuous paths, then $[X, X]_t = X_0^2$ for all $t \ge 0$. For a continuous local martingale $X$, $X_t^2 - [X, X]_t$ is a continuous local martingale. Moreover, if $[X, X]_t = X_0^2$ for all $t$, then $X_t = X_0$ for all $t$; in other words, if an FV process is also a continuous local martingale, then it is a constant process.

Lévy’s Characterization of Brownian Motion

A process $X$ is a standard Brownian motion if and only if it is a continuous local martingale with $[X, X]_t = t$.
The theory of stochastic integration for integrands in $\mathbb{L}$ is sufficient to establish Itô’s formula, the Girsanov–Meyer theorem, and to study SDEs. For example, the stochastic exponential of a semimartingale $X$ with $X_0 = 0$, written $\mathcal{E}(X)$, is the unique semimartingale $Z$ that is a solution of the linear SDE: $Z_t = 1 + \int_0^t Z_{s-}\,dX_s$. When $X$ is a continuous local martingale, so is $\mathcal{E}(X)_t = \exp\{X_t - \tfrac{1}{2}[X, X]_t\}$. Furthermore, if Kazamaki’s Criterion $\sup_T \mathbb{E}\{\exp(\tfrac{1}{2} X_T)\} < \infty$ holds, where the supremum is taken over all bounded stopping times, or if Novikov’s Criterion $\mathbb{E}\{\exp(\tfrac{1}{2}[X, X]_\infty)\} < \infty$ holds (stronger but easier to check in practice), then $\mathcal{E}(X)$ is a martingale. See [10] for more on these conditions. When $X$ is Brownian motion, $\mathcal{E}(X)_t = \exp\{X_t - \tfrac{1}{2} t\}$ is referred to as geometric Brownian motion.
The space of integrands $\mathbb{L}$ is not general enough to have local times and martingale representation theory, which is essential for hedging in finance. On the basis of the Bichteler–Dellacherie theorem ($X$ is a semimartingale if and only if $X = M + A$, where $M$ is a local martingale and $A$ is an FV process), we can extend the stochastic integration from $\mathbb{L}$ to the space $\mathcal{P}$ of predictable processes, which are measurable with respect to $\sigma\{H : H \in \mathbb{L}\}$. For a semimartingale
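The elementary integral in equation (13) can be evaluated directly on a single observed path. The following is a minimal sketch, assuming deterministic trading times and a hypothetical linear price path; the function name `simple_integral` and all numerical inputs are illustrative choices, not part of the original article.

```python
def simple_integral(h0, h, times, path, t):
    """(H . X)_t = H_0 X_0 + sum_i H_i (X_{t ^ T_{i+1}} - X_{t ^ T_i}), as in equation (13).

    h     : list [H_1, ..., H_n] of position sizes
    times : list [T_1, ..., T_{n+1}] of nondecreasing times with T_1 = 0
            (deterministic here, which is an assumption of this sketch)
    path  : callable s -> X_s giving one observed price path
    """
    if len(times) != len(h) + 1:
        raise ValueError("need n + 1 times for n positions")
    total = h0 * path(0.0)
    for h_i, t_lo, t_hi in zip(h, times[:-1], times[1:]):
        # Each position earns the price change over (T_i, T_{i+1}], stopped at t.
        total += h_i * (path(min(t, t_hi)) - path(min(t, t_lo)))
    return total


def path(s):
    """Hypothetical price path X_s = 100 + 2 s, used only for illustration."""
    return 100.0 + 2.0 * s


# Hold 1 unit over (0, 1] and 3 units over (1, 2]; evaluate the gain at t = 1.5.
value = simple_integral(h0=0.0, h=[1.0, 3.0], times=[0.0, 1.0, 2.0], path=path, t=1.5)
```

Stopping each increment at `t` through `min(t, T_i)` is what makes the value depend only on the path up to time `t`, which mirrors the adaptedness of $H \cdot X$.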
[6] Ethier, S. & Kurtz, T.G. (1986). Markov Processes: Characterization and Convergence, Wiley, New York.
[7] Mansuy, R. (2005). Histoire de martingales, Mathematiques et Sciences Humaines/Mathematical Social Sciences 169(1), 105–113.
[8] Protter, P. (2003). Stochastic Integration and Differential Equations, Applications of Mathematics, 2nd Edition, Springer, Vol. 21.
[9] Protter, P. (2007). Louis Bachelier’s Theory of Speculation: The Origins of Modern Finance, M. Davis & A. Etheridge, eds, a book review in the Bulletin of the American Mathematical Society, Vol. 45, No. 4, pp. 657–660.
[10] Protter, P. & Shimbo, K. (2006). No Arbitrage and General Semimartingales. To appear in the Festschrift.
[11] Revuz, D. & Yor, M. (1991). Continuous Martingales and Brownian motion, Grundlehren der Mathematischen Wissenschaften, 3rd Edition, Springer, Vol. 293.
[12] Rogers, L.C.G. & Williams, D. (2000). Diffusions, Markov Processes and Martingales, Vols 1 and 2, Cambridge University Press.
[13] Williams, D. (1991). Probability with Martingales, Cambridge University Press.

Related Articles

Equivalent Martingale Measures; Fundamental Theorem of Asset Pricing; Markov Processes; Martingale Representation Theorem.

LIQING YAN
Itô’s Formula

For a function depending on space and time parameters, rules of differentiation are well known. For a function depending on space and time parameters and also on a randomness parameter, Itô’s formulas provide rules of differentiation. These rules of differentiation are based on the complementary notion of stochastic integration (see Stochastic Integrals).
More precisely, given a probability space $(\Omega, \mathbb{P}, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0})$, Itô’s formulas deal with $(F(X_t); t \ge 0)$, where $F$ is a deterministic function defined on $\mathbb{R}$ and $(X_t)_{t \ge 0}$ is a random process such that integration of locally bounded predictable processes is possible with respect to $(X_t)_{t \ge 0}$ and satisfies a property equivalent to the Lebesgue dominated convergence theorem. This means that $(X_t)_{t \ge 0}$ is a semimartingale and therefore has a finite quadratic variation process $([X]_t, t \ge 0)$ (see Stochastic Integrals) defined as

$$[X]_t = \lim_{n \to \infty} \sum_{i} \left(X_{s^n_{i+1}} - X_{s^n_i}\right)^2 \quad \text{in probability, uniformly on time intervals} \qquad (1)$$

where $(s^n_i)_{1 \le i \le n}$ is a subdivision of $[0, t]$ whose mesh converges to 0 as $n$ tends to $\infty$.
We will see that Itô’s formulas also provide information on the stochastic structure of the process $(F(X_t), t \ge 0)$. We first introduce the formula established by Itô in 1951. Consider a process $(X_t)_{t \ge 0}$ of the form

$$X_t = \int_0^t H_s\,dB_s + \int_0^t G_s\,ds \qquad (2)$$

where $(B_s)_{s \ge 0}$ is a real-valued Brownian motion, and $(H_s)_{s \ge 0}$ and $(G_s)_{s \ge 0}$ are locally bounded predictable processes. Then for every $C^2$-function $F$ from $\mathbb{R}$ to $\mathbb{R}$, we have

$$F(X_t) = F(X_0) + \int_0^t F'(X_s) H_s\,dB_s + \int_0^t F'(X_s) G_s\,ds + \frac{1}{2}\int_0^t H_s^2 F''(X_s)\,ds \qquad (3)$$

The process defined in formula (2) is an example of a continuous semimartingale. Here is the classical Itô formula for a general semimartingale $(X_s)_{s \ge 0}$ (e.g., [7, 9]) and $F$ in $C^2$:

$$F(X_t) = F(X_0) + \int_0^t F'(X_{s-})\,dX_s + \frac{1}{2}\int_0^t F''(X_s)\,d[X]^c_s + \sum_{0 \le s \le t} \left\{F(X_s) - F(X_{s-}) - F'(X_{s-})\,\Delta X_s\right\} \qquad (4)$$

where $[X]^c$ is the continuous part of $[X]$. For continuous semimartingales, formula (4) becomes

$$F(X_t) = F(X_0) + \int_0^t F'(X_s)\,dX_s + \frac{1}{2}\int_0^t F''(X_s)\,d[X]_s \qquad (5)$$

In the special case when $(X_t)_{t \ge 0}$ is a real Brownian motion, $[X]_t = t$.
The multidimensional version of formula (4) gives the expansion of $F(X^{(1)}_t, X^{(2)}_t, \ldots, X^{(d)}_t)$ for $F$ a real-valued function of $C^2(\mathbb{R}^d)$ and $d$ semimartingales $X^{(1)}, X^{(2)}, \ldots, X^{(d)}$. We set $X = (X^{(1)}, X^{(2)}, \ldots, X^{(d)})$:

$$F(X_t) = F(X_0) + \sum_{i=1}^{d} \int_0^t \frac{\partial F}{\partial x_i}(X_{s-})\,dX^{(i)}_s + \frac{1}{2}\sum_{1 \le i, j \le d} \int_0^t \frac{\partial^2 F}{\partial x_i \partial x_j}(X_{s-})\,d[X^{(i)}, X^{(j)}]^c_s + \sum_{0 \le s \le t} \left\{F(X_s) - F(X_{s-}) - \sum_{i=1}^{d} \frac{\partial F}{\partial x_i}(X_{s-})\,\Delta X^{(i)}_s\right\} \qquad (6)$$

Note the Itô formula corresponding to the case of the couple of semimartingales $(X_t, t)_{t \ge 0}$ with X
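For $F(x) = x^2$ and $X = B$ a real Brownian motion, formula (5) reads $B_t^2 = 2\int_0^t B_s\,dB_s + t$. The sketch below checks this on one simulated path; it is only an illustration, and the step count, the seed, and the helper names `brownian_increments` and `ito_terms_for_square` are assumptions made for this example.

```python
import math
import random


def brownian_increments(n_steps, horizon, seed):
    """Independent N(0, dt) increments of a discretised Brownian path."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    return [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]


def ito_terms_for_square(increments):
    """Return (B_T^2, 2 * sum B_i dB_i, sum dB_i^2) along one discretised path."""
    b = 0.0
    stochastic_integral = 0.0
    quadratic_variation = 0.0
    for db in increments:
        stochastic_integral += 2.0 * b * db  # F'(B_s) dB_s with the left-endpoint value of B
        quadratic_variation += db * db       # discrete analogue of [B]_T as in formula (1)
        b += db
    return b * b, stochastic_integral, quadratic_variation


T = 1.0
lhs, integral, qv = ito_terms_for_square(brownian_increments(20000, T, seed=1))
# lhs equals integral + qv up to rounding (a telescoping identity),
# and qv is close to T for a fine subdivision.
```

The integrand is evaluated at the left endpoint of each step; this choice is what distinguishes the Itô integral from, for instance, a midpoint rule, and it is why the second-order term $\sum (\Delta B)^2 \approx t$ survives in the limit.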
[2] Bouleau N. & Yor M. (1981). Sur la variation quadratique des temps locaux de certaines semimartingales, Comptes Rendus de l’Académie des Sciences 292, 491–494.
[3] Eisenbaum N. (2000). Integration with respect to local time, Potential Analysis 13, 303–328.
[4] Eisenbaum N. (2006). Local time-space stochastic calculus for Lévy processes, Stochastic Processes and their Applications 116(5), 757–778.
[5] Errami M., Russo F. & Vallois P. (2002). Itô formula for $C^{1,\lambda}$-functions of a càdlàg process, Probability Theory and Related Fields 122, 191–221.
[6] Föllmer H., Protter P. & Shiryayev A.N. (1995). Quadratic covariation and an extension of Itô’s formula, Bernoulli 1(1/2), 149–169.
[7] Jacod J. & Shiryayev A.N. (2003). Limit Theorems for Stochastic Processes, 2nd Edition, Springer.
[8] Peskir G. (2005). A change-of-variable formula with local time on curves, Journal of Theoretical Probability 18, 499–535.
[9] Protter, P. (2004). Stochastic Integration and Differential Equations, 2nd Edition, Springer.

Related Articles

Lévy Processes; Local Times; Stochastic Integrals.

NATHALIE EISENBAUM
Lévy Copulas

Lévy copulas characterize the dependence among components of multidimensional Lévy processes. They are similar to copulas of probability distributions but are defined at the level of Lévy measures. Lévy copulas separate the dependence structure of a Lévy measure from the one-dimensional marginal measures, meaning that any $d$-dimensional Lévy measure can be constructed from a set of one-dimensional margins and a Lévy copula. This suggests the construction of parametric multidimensional Lévy mod-

say, positive jumps, the definition of the tail integral is simple: given an $\mathbb{R}^d$-valued Lévy process with Lévy measure $\nu$ supported by $[0, \infty)^d$, the tail integral of $\nu$ is the function $U : (0, \infty)^d \to [0, \infty)$ defined by

$$U(x_1, \ldots, x_d) = \nu((x_1, \infty) \times \cdots \times (x_d, \infty)) \qquad (1)$$

In the general case, care must be taken to avoid the possible singularity of $\nu$ near zero: so the tail integral is a function $U : (\mathbb{R} \setminus \{0\})^d \to \mathbb{R}$ defined by

$$U(x_1, \ldots, x_d) := \prod_{i=1}^{d} \operatorname{sgn}(x_i)\; \nu\Big(\prod_{j=1}^{d} I(x_j)\Big) \qquad (2)$$
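Definition (2) can be made concrete for a Lévy measure with finitely many atoms. The sketch below is illustrative only: it assumes the convention $I(x) = (x, \infty)$ for $x > 0$ and $I(x) = (-\infty, x]$ for $x < 0$ (the convention of Kallsen and Tankov [6], which is not restated in the text above), and the atoms and the helper names are hypothetical.

```python
import math


def sgn(x):
    """Sign of a nonzero real number."""
    return 1.0 if x > 0 else -1.0


def in_interval(x, y):
    """True if y lies in I(x), assumed to be (x, inf) for x > 0 and (-inf, x] for x < 0."""
    return y > x if x > 0 else y <= x


def tail_integral(atoms, x):
    """Tail integral (2) of a discrete measure.

    atoms : list of (jump_vector, mass) pairs describing a hypothetical finite measure
    x     : evaluation point; every coordinate must be nonzero
    """
    if any(xi == 0 for xi in x):
        raise ValueError("the tail integral is only defined away from the axes")
    mass = sum(m for jump, m in atoms
               if all(in_interval(xi, yi) for xi, yi in zip(x, jump)))
    return math.prod(sgn(xi) for xi in x) * mass


# Three atoms in the plane: one in the positive orthant, two with negative components.
atoms = [((1.0, 2.0), 0.5), ((0.5, -1.0), 0.3), ((-2.0, -0.5), 0.2)]
```

With these atoms, `tail_integral(atoms, (0.4, 1.0))` picks up only the first atom, while `tail_integral(atoms, (0.4, -0.5))` picks up only the second and carries a negative sign because exactly one coordinate of the evaluation point is negative.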
d = 2. If $F(u) = \prod_{i=1}^{d} u_i$, the $F$ volume of any interval is equal to its Lebesgue measure.
A function $F : (-\infty, \infty]^d \to (-\infty, \infty]$ is called $d$ increasing if $V_F((a, b]) \ge 0$ for all $a \le b$. The distribution function of a random vector is one example of a $d$-increasing function. The tail integral $U$ was defined in such way that $(-1)^d U$ is $d$ increasing in every orthant (but not on the entire space).
Let $F : (-\infty, \infty]^d \to (-\infty, \infty]$ be a $d$-increasing function such that $F(u_1, \ldots, u_d) = 0$ if $u_i = 0$ for at least one $i$. For an index set $I$, the $I$ margin of $F$ is the function $F^I : (-\infty, \infty]^{|I|} \to (-\infty, \infty]$, defined by

$$F^I((u_i)_{i \in I}) := \lim_{a \to \infty} \sum_{(u_i)_{i \in I^c} \in \{-a, \infty\}^{|I^c|}} F(u_1, \ldots, u_d) \prod_{i \in I^c} \operatorname{sgn} u_i \qquad (5)$$

where $I^c := \{1, \ldots, d\} \setminus I$. In particular, we have $F^{\{1\}}(u) = F(u, \infty) - \lim_{a \to -\infty} F(u, a)$ for $d = 2$. To understand the reasoning leading to the above definition of margins, note that any positive measure $\mu$ on $(-\infty, \infty]^d$ naturally induces an increasing function $F$ via

$$F(u_1, \ldots, u_d) := \mu\big((u_1 \wedge 0, u_1 \vee 0] \times \cdots \times (u_d \wedge 0, u_d \vee 0]\big) \prod_{i=1}^{d} \operatorname{sgn} u_i \qquad (6)$$

for $u_1, \ldots, u_d \in (-\infty, \infty]$. The margins of $\mu$ are usually defined by

$$\mu^I(A) = \mu\big(\{u \in (-\infty, \infty]^d : (u_i)_{i \in I} \in A\}\big), \quad A \subset (-\infty, \infty]^{|I|} \qquad (7)$$

It is now easy to see that the margins of $F$ are induced by the margins of $\mu$ in the sense of equation (6).
A function $F : (-\infty, \infty]^d \to (-\infty, \infty]$ is called Lévy copula if it satisfies the following four conditions (the first one is just a nontriviality requirement):

1. $F(u_1, \ldots, u_d) \ne \infty$ for $(u_1, \ldots, u_d) \ne (\infty, \ldots, \infty)$;
2. $F(u_1, \ldots, u_d) = 0$ if $u_i = 0$ for at least one $i \in \{1, \ldots, d\}$;
3. $F$ is $d$-increasing; and
4. $F^{\{i\}}(u) = u$ for any $i \in \{1, \ldots, d\}$, $u \in \mathbb{R}$.

Lévy Copulas: The Spectrally One-sided Case

If $X$ has only positive jumps in each component, or if we are only interested in the positive jumps of $X$, only the values $F(u_1, \ldots, u_d)$ for $u_1, \ldots, u_d \ge 0$ are relevant. We can then set $F(u_1, \ldots, u_d) = 0$ if $u_i < 0$ for at least one $i$, which greatly simplifies the definition of the margins:

$$F^I((u_i)_{i \in I}) = F(u_1, \ldots, u_d)\big|_{u_j = +\infty,\ j \notin I} \qquad (8)$$

Taking the margins now amounts to replacing the variable that is being integrated out with infinity, exactly the same procedure as for probability distribution functions. Restricting a Lévy copula to $[0, \infty]^d$ in such way, we obtain a Lévy copula for spectrally positive Lévy processes, or, for short, a positive Lévy copula.

Sklar’s Theorem for Lévy Processes

The following theorem [4, 7] characterizes the dependence structure of Lévy processes in terms of Lévy copulas:

Theorem 1

1. Let $X = (X^1, \ldots, X^d)$ be an $\mathbb{R}^d$-valued Lévy process. Then there exists a Lévy copula $F$ such that the tail integrals of $X$ satisfy

$$U^I((x_i)_{i \in I}) = F^I((U_i(x_i))_{i \in I}) \qquad (9)$$

for any nonempty index set $I \subset \{1, \ldots, d\}$ and any $(x_i)_{i \in I} \in (\mathbb{R} \setminus \{0\})^{|I|}$. The Lévy copula $F$ is unique on $\prod_{i=1}^{d} \operatorname{Ran} U_i$.
2. Let $F$ be a $d$-dimensional Lévy copula and $U_i$, $i = 1, \ldots, d$, tail integrals of real-valued Lévy processes. Then there exists an $\mathbb{R}^d$-valued Lévy process $X$ whose components have tail integrals $U_1, \ldots, U_d$ and whose marginal tail integrals satisfy equation (9) for any nonempty $I \subset \{1, \ldots, d\}$ and any $(x_i)_{i \in I} \in (\mathbb{R} \setminus \{0\})^{|I|}$. The Lévy measure $\nu$ of $X$ is uniquely determined by $F$ and $U_i$, $i = 1, \ldots, d$.

In particular, applying the above theorem with $I = \{1, \ldots, d\}$, we obtain the usual formula

$$U(x_1, \ldots, x_d) = F(U_1(x_1), \ldots, U_d(x_d)) \qquad (10)$$
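Formula (10) can be tried out in the spectrally positive, two-dimensional case. The sketch below is a hedged illustration: it assumes the Clayton family of positive Lévy copulas, $F_\theta(u, v) = (u^{-\theta} + v^{-\theta})^{-1/\theta}$, as described in Cont and Tankov [4], together with hypothetical stable-like margins $U_i(x) = c_i x^{-\alpha_i}$; the constants and function names are chosen for the example only.

```python
import math


def clayton_levy_copula(u, v, theta):
    """Positive Clayton Levy copula (u^-theta + v^-theta)^(-1/theta) on [0, inf]^2."""
    if u == 0.0 or v == 0.0:
        return 0.0          # grounded, condition 2
    if math.isinf(u):
        return v            # uniform margin, condition 4
    if math.isinf(v):
        return u
    return (u ** -theta + v ** -theta) ** (-1.0 / theta)


def stable_tail(x, c, alpha):
    """Hypothetical one-dimensional tail integral U_i(x) = c * x^(-alpha) for x > 0."""
    return c * x ** -alpha


def joint_tail(x1, x2, theta):
    """Equation (10): U(x1, x2) = F(U_1(x1), U_2(x2)), with illustrative margin parameters."""
    return clayton_levy_copula(stable_tail(x1, 1.0, 0.5), stable_tail(x2, 2.0, 0.8), theta)


def f_volume(a, b, theta):
    """F-volume of the rectangle (a1, b1] x (a2, b2]; nonnegative when F is 2-increasing."""
    def F(u, v):
        return clayton_levy_copula(u, v, theta)
    return F(b[0], b[1]) - F(a[0], b[1]) - F(b[0], a[1]) + F(a[0], a[1])
```

Letting `x2` tend to 0 sends $U_2(x_2)$ to infinity, so `joint_tail(x1, x2, theta)` approaches `stable_tail(x1, 1.0, 0.5)`: the first margin is recovered, as equation (9) requires for $I = \{1\}$.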
If the one-dimensional marginal Lévy measures are infinite and have no atoms, $\operatorname{Ran} U_i = (-\infty, 0) \cup (0, \infty)$ for any $i$ and one can compute $F$ directly via

$$F(u_1, \ldots, u_d) = U\big(U_1^{-1}(u_1), \ldots, U_d^{-1}(u_d)\big) \qquad (11)$$

dependence Lévy copula given by

$$F(x) := \min(|x_1|, \ldots, |x_d|)\, 1_K(x) \prod_{i=1}^{d} \operatorname{sgn} x_i \qquad (14)$$
same direction, and when $\eta = 0$, positive jumps in one component are accompanied by negative jumps in the other and vice versa. The parameter $\theta$ is responsible for the dependence of absolute values of jumps in different components.
Figure 1 shows the scatter plots of weekly returns in an exponential Lévy model with variance gamma (see Variance-gamma Model) margins and the dependence pattern given by the Lévy copula (18) with two different sets of dependence parameters, both of which lead to a correlation of 50% but have different tail dependence patterns. It is clear that when a precise description of tail events such as simultaneous large jumps is necessary, Lévy copulas offer more freedom in modeling dependence than traditional correlation-based approaches. A natural application of Lévy copulas arises in the context of multidimensional gap options [8] that are exotic products whose payoff depends on the total number of sharp downside moves in a basket of assets.
Figure 1 Scatter plots of returns in a two-dimensional variance gamma model with correlation ρ = 50% and different tail dependence. (a) Strong tail dependence (η = 0.75 and θ = 10) and (b) weak tail dependence (η = 0.99 and θ = 0.61). Both axes of each panel range from −0.2 to 0.2.

References

[1] Barndorff-Nielsen, O.E. & Lindner, A.M. (2007). Lévy copulas: dynamics and transforms of upsilon type, Scandinavian Journal of Statistics 34, 298–316.
[2] Böcker, K. & Klüppelberg, C. (2007). Multivariate operational risk: dependence modelling with Lévy copulas, ERM Symposium Online Monograph, Society of Actuaries, and Joint Risk Management, section newsletter.
[3] Bregman, Y. & Klüppelberg, C. (2005). Ruin estimation in multivariate models with Clayton dependence structure, Scandinavian Actuarial Journal November(6), 462–480.
[4] Cont, R. & Tankov, P. (2004). Financial Modelling with Jump Processes, Chapman & Hall/CRC Press.
[5] Farkas, W., Reich, N. & Schwab, C. (2007). Anisotropic stable Lévy copula processes-analytical and numerical aspects, Mathematical Models and Methods in Applied Sciences 17, 1405–1443.
[6] Kallsen, J. & Tankov, P. (2006). Characterization of dependence of multidimensional Lévy processes using Lévy copulas, Journal of Multivariate Analysis 97, 1551–1572.
[7] Tankov, P. (2004). Lévy Processes in Finance: Inverse Problems and Dependence Modelling, PhD thesis, Ecole Polytechnique, France.
[8] Tankov, P. (2008). Pricing and Hedging Gap Risk, preprint, available at http://papers.ssrn.com.

Related Articles

Copulas: Estimation; Exponential Lévy Models; Lévy Processes; Multivariate Distributions; Operational Risk.

PETER TANKOV
Convex Duality

Convex duality refers to a general principle that allows us to associate with an original minimization program (the primal problem) a class of concave maximization programs (the dual problem), which, under some conditions, are equivalent to the primal. The unifying principles underlying these methods can be traced back to the basic duality that exists between a convex set of points in the plane and the set of supporting lines (hyperplanes). Duality tools can be applied to nonconvex programs too, but are most effective for convex problems.
Convex optimization problems naturally arise in many areas of finance; we mention just a few of them (see the list of the related entries at the end of this article): maximization of expected utility in complete or incomplete markets, mean–variance portfolio selection and CAPM, utility indifference pricing, selection of the minimal entropy martingale measure, and model calibration. This short and nonexhaustive list should give a hint of the scope of convex duality methods in financial applications.
Consider the following primal minimization (convex) problem:

$$(P): \min f(v) \quad \text{sub } v \in A \qquad (1)$$

where $A$ is a convex subset of some vector space $V$ and $f : A \to \mathbb{R}$ is a convex function. Convex duality principles consist in pairing this problem with a dual maximization (concave) problem:

$$(D): \max g(w) \quad \text{sub } w \in B \qquad (2)$$

where $B$ is a convex subset of some other vector space $W$ (possibly $W = V$) and $g : B \to \mathbb{R}$ is a concave function.
In general, by applying a duality principle, we usually try to

Different duality principles differ in the way the dual problem is built. Two main principles are Lagrange duality and Fenchel duality. Even though they are formally equivalent, at least in the finite-dimensional case, they provide different insights into the problem. We will see below how the Lagrange and Fenchel duality principles practically accomplish the tasks 1 to 3 above.
For the topics to be presented below, comprehensive references are [4] and [1] for the finite-dimensional case ([1] also provides an extensive account of numerical methods) and [2] for the infinite-dimensional case.

Lagrange Duality in Finite-dimensional Problems

We consider finite-dimensional problems, that is, $V = \mathbb{R}^N$ for some $N \ge 1$. We denote by $v \cdot w$ the inner product between two vectors $v, w \in \mathbb{R}^N$ and use $v \ge 0$ as a shorthand for $v_n \ge 0\ \forall n$. Let $f, h_1, \ldots, h_M : C \to \mathbb{R}$ be $M + 1$ convex functions, where $C \subseteq \mathbb{R}^N$ is a convex set. Setting $h = (h_1, \ldots, h_M)$, so that $h$ is a convex function from $C$ to $\mathbb{R}^M$, we consider, as the primal problem, the minimization of $f$ under $M$ inequality constraints:

$$(P): \min f(v) \quad \text{sub } v \in A = \{v \in C : h(v) \le 0\} \subset \mathbb{R}^N \qquad (3)$$

To build a dual problem, we define the so-called Lagrangian function

$$L(v, w) := f(v) + w \cdot h(v), \quad v \in C,\ w \in \mathbb{R}^M \qquad (4)$$

and note that $f(v) = \sup_{w \ge 0} L(v, w)$ for any $v \in A$. As a consequence, we can write the primal problem in terms of $L$:

$$(P): \inf_{v \in C} \sup_{w \ge 0} L(v, w) \qquad (5)$$
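The identity behind (5) can be seen on a one-dimensional toy problem. The sketch below is illustrative and not from the article: it assumes $f(v) = v^2$ with the single constraint $h(v) = 1 - v \le 0$, for which the supremum of the Lagrangian over $w \ge 0$ equals $f(v)$ on the feasible set and $+\infty$ elsewhere.

```python
import math


def f(v):
    """Hypothetical convex objective."""
    return v * v


def h(v):
    """Hypothetical constraint function; feasibility means h(v) <= 0, that is v >= 1."""
    return 1.0 - v


def lagrangian(v, w):
    """L(v, w) = f(v) + w * h(v), as in equation (4) with M = 1."""
    return f(v) + w * h(v)


def sup_over_multipliers(v):
    """sup over w >= 0 of L(v, w), in closed form for a single constraint.

    If h(v) <= 0 the supremum is attained at w = 0; otherwise w -> infinity
    drives the Lagrangian to +infinity.
    """
    return f(v) if h(v) <= 0.0 else math.inf


# inf over v of sup over w, approximated on a grid that contains the point v = 1.
grid = [i / 100.0 for i in range(-300, 301)]
primal_value = min(sup_over_multipliers(v) for v in grid)
```

Infeasible points are penalized with an infinite value by the inner supremum, so the outer infimum only ever selects feasible points; this is how the unconstrained inf-sup form (5) encodes the constraint.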
In the terminology of the introductory section, the dual problem is then

$$(D): \max g(w) \quad \text{sub } w \in B = \{w \in D : w \ge 0\} \subset \mathbb{R}^M \qquad (7)$$

where

$$g(w) = \inf_{v \in C} L(v, w) \qquad (8)$$

and $D = \{w \in \mathbb{R}^M : g(w) > -\infty\}$ is the domain of $g$. It can be proved that $D$ is a convex set and $g$ is a concave function on $D$ even if $f$ is not convex: therefore the dual problem is always concave, even when the primal problem is not convex.
We assume throughout primal and dual feasibility, that is, $A$ and $B$ are assumed to be nonempty. Dual feasibility would however be ensured under Slater conditions for $A$ (see below). Let $p = \inf_A f$ and $d = \sup_B g$ be the (possibly infinite) values of the primal and the dual. A primal (dual) solution is $\hat{v} \in A$ ($\hat{w} \in B$), if any, such that $f(\hat{v}) = p$ ($g(\hat{w}) = d$); a solution pair is a feasible pair $(\hat{v}, \hat{w}) \in A \times B$ made

$$\hat{w} \cdot h(\hat{v}) = 0 \quad \text{and} \quad L(\hat{v}, \hat{w}) = g(\hat{w}) \qquad (10)$$

practical situations, “branch and bound” algorithms in integer programming being a prominent example. It also provides a workable condition that characterizes a solution pair, at least when there is no duality gap.
Strong duality, on the contrary, requires a precise topological assumption: the interior of the constraint set has to be nonempty (Slater condition). We note, however, that this condition is satisfied in most cases, at least in the present finite-dimensional setting. The proof is then based on a separating hyperplane theorem, that in turn requires convexity assumptions about $f$ and $h$. When strong duality holds, and provided we are able to actually solve the dual problem, we obtain the exact value of the primal (no duality gap).
We can add a finite number (say $L$) of linear equality constraints to (P), obtaining

$$(P): \min f(v) \quad \text{sub } v \in A = \{v \in C : h(v) \le 0,\ Qv = r\} \subset \mathbb{R}^N \qquad (11)$$

where $Q$ is an $L \times N$ matrix and $r \in \mathbb{R}^L$. The L

$$= \{w \in D : w^{in} \ge 0\} \subset \mathbb{R}^M \times \mathbb{R}^L \qquad (14)$$
The relative interior $\operatorname{ri}(C)$ is the interior of the convex set $C$ relative to the affine hull of $C$. For instance, if $C = [0, 1] \times \{0\} \subset \mathbb{R}^2$, then $\operatorname{ri}(C) = (0, 1) \times \{0\}$ (because the affine hull of $C$ is $\mathbb{R} \times \{0\}$), while the interior of $C$ is clearly empty (see [4] for more on relative interiors and related topics about convex sets).
In many concrete problems, $C$ is a polyhedron, that is, it is the (convex and closed) set defined by a certain finite set of linear inequalities, and all the functions $h_m$ are affine. If we assume, in addition, that $f$ may be extended to a finite convex function over all $\mathbb{R}^N$, Farkas Lemma allows us to prove strong duality without requiring any Slater condition. Remarkably, if $f$ is linear too, then the existence of a primal solution is ensured.
The Lagrange duality theorem provides us with a simple criterion for the existence of a dual solution and a set of conditions characterizing a possible primal solution. It is, however, not directly concerned with the existence of a primal solution. To ensure this, one has to assume stronger conditions such as compactness of $C$ or coercivity of $f$. A third condition ($f$ linear) has been described above.
We have seen that the dual problem usually looks much better than the primal: it is always concave and its solvability is guaranteed under mild assumptions about the primal. This fact is particularly useful in designing numerical procedures. Moreover, even

if $N$ is much larger than $L$. This is the basis for great enhancements in existing numerical methods.
A last remark concerns the word “duality”: any dual problem can be turned into an equivalent minimization primal problem. It turns out that the bidual, that is, the dual of this new primal problem, seldom coincides with the original primal problem. LP problems are an important exception: the bidual of an LP problem is the problem itself.

Fenchel Duality in Finite-dimensional Problems

Fenchel duality, that we will derive from Lagrange duality, may be applied to primal problems in the form

$$(P): \min \{f_1(v) - f_2(v)\} \quad \text{sub } v \in A = C_1 \cap C_2 \subset \mathbb{R}^N \qquad (18)$$

where $C_1, C_2 \subseteq \mathbb{R}^N$ are convex, $f_1 : C_1 \to \mathbb{R}$ is convex, and $f_2 : C_2 \to \mathbb{R}$ is concave.
Consider the function $f(x, y) = f_1(x) - f_2(y)$ defined on $\mathbb{R}^{2N}$ and clearly convex. We can restate the primal as

$$(P): \min f(x, y) \quad \text{sub } (x, y) \in A$$
where $c \in \mathbb{R}^N$, $Q$ is an $L \times N$ matrix and $r \in \mathbb{R}^L$. An easy computation shows that the dual problem is ($T$ denotes transposition)

$$g(w) = \inf_{x \in C_1,\ y \in C_2} L(x, y, w) = f_2^*(w) - f_1^*(w) \qquad (20)$$
is the concave conjugate (indeed, $f_2^*$ is concave) of the concave function $f_2$. As a consequence, the dual problem is

$$(D): \max \{f_2^*(w) - f_1^*(w)\} \quad \text{sub } w \in B = C_1^* \cap C_2^* \subset \mathbb{R}^N \qquad (23)$$

where $C_1^*$ and $C_2^*$ are the domains of $f_1^*$ and $f_2^*$, respectively. Assuming primal feasibility and boundedness, the Lagrange duality theorem yields the Fenchel duality theorem.

Fenchel Duality Theorem

1. Weak duality
If there is no duality gap, $(\hat{v}, \hat{w})$ is a solution pair if and only if

$$\hat{v} \cdot \hat{w} = f_1(\hat{v}) + f_1^*(\hat{w}) = f_2(\hat{v}) + f_2^*(\hat{w}) \qquad (24)$$

2. Strong duality
There is no duality gap between the primal and the dual, and there is a dual solution, provided one of the following conditions is satisfied:
(a) $\operatorname{ri}(C_1) \cap \operatorname{ri}(C_2)$ is nonempty
(b) $C_1$ and $C_2$ are polyhedra and $f_1$ (resp. $f_2$) may be extended to a finite convex (concave) function over all $\mathbb{R}^N$

See [4] or [1] for a proof.

We say that a convex function $f$ is closed if, for any $a \in \mathbb{R}$, the sublevel set $\{v : f(v) \le a\}$ is closed; a similar definition applies to concave functions, where the inequality defining the set is reversed. A sufficient, though not necessary, condition for $f$ to be closed is continuity on all $C$. A celebrated result (the Fenchel–Moreau theorem) states that $(f^*)^* \equiv f$, provided $f$ is a closed (convex or concave) function. Therefore, if in the primal problem $f_1$ and $f_2$ are closed, then the dual problem of the dual coincides with the primal, and the duality is therefore complete. Thanks to this fact, an application of the Fenchel duality theorem to the dual problem allows us to state that the primal has a solution provided one of the following conditions is satisfied:

1. $\operatorname{ri}(C_1^*) \cap \operatorname{ri}(C_2^*)$ is nonempty.
2. $C_1^*$ and $C_2^*$ are polyhedra, and $f_1^*$ (resp. $f_2^*$) may be extended to a finite convex (concave) function over all $\mathbb{R}^N$.

Fenchel duality can sometimes be effectively used for general problems in the form

$$(P): \min f(v) \quad \text{sub } v \in C \subset \mathbb{R}^N \qquad (25)$$

where $f$ and $C$ are convex. Indeed, such a problem can be cast in the form (18) provided we set $f_1 = f$, $f_2 = 0$ (concave), $C_1 = \mathbb{R}^N$, and $C_2 = C$. The dual problem is given by equation (23), where

$$f_1^*(w) = \sup_{v \in \mathbb{R}^N} \{w \cdot v - f(v)\} \qquad (26)$$

is an unconstrained problem and

$$f_2^*(w) = \inf_{v \in C} w \cdot v \qquad (27)$$

has a simple goal function.
We have derived Fenchel duality as a by-product of Lagrange duality. However, it is possible to go in the opposite direction, by first proving Fenchel duality (unsurprisingly, using hyperplane separation arguments, see [2]) and then writing a Lagrange problem in the Fenchel form, so that Lagrange duality can be derived (see [3]). Therefore, at least in the finite-dimensional setting, Lagrange and Fenchel duality are formally equivalent.

Duality in Infinite-dimensional Problems

For infinite-dimensional problems, Lagrange or Fenchel duality exhibit a large formal similarity with the finite-dimensional counterparts we have described so far. Nevertheless, the technical topological assumptions, which are needed to ensure duality, become much less trivial when the space $V = \mathbb{R}^N$ is replaced by an infinite-dimensional Banach space. We give a brief account of these differences.
Let $V$ be a Banach space and consider the primal problem

$$(P): \min f(v) \quad \text{sub } v \in A = \{v \in C : h(v) \le 0\} \subset V \qquad (28)$$

where $C \subseteq V$ is a convex set, and $f : C \to \mathbb{R}$ and $h : C \to \mathbb{R}^M$ are convex functions. Then, by mimicking the finite-dimensional case, the dual problem is

$$(D): \max g(w) \quad \text{sub } w \in B = \{w \in D : w \ge 0\} \subset \mathbb{R}^M \qquad (29)$$
where $g(w) = \inf_{v \in C} \{f(v) + w \cdot h(v)\}$, and $D$ is the domain of $g$. We can note that the dual is finite-dimensional, but the definition of $g$ involves an infinite-dimensional problem. A perfect analog of the finite-dimensional Lagrange duality theorem may be derived in this more general case too (see [2]) with essentially the same Slater condition (existence of some $v \in C$ such that $h_m(v) < 0$ for any $m$). We can also introduce a finite set of linear inequalities: this case can be handled in exactly the same way as in the finite-dimensional case. However, the hypothesis $\operatorname{ri}(C) \ne \emptyset$ is not completely trivial here.
Fenchel duality too can be much generalized. Indeed, let $V$ be a Banach space, $W = V^*$ its dual space (the Banach space of continuous linear forms on $V$), and denote by $\langle v, v^* \rangle$ the action of $v^* \in V^*$ on $v \in V$. Consider the primal problem

$$(P): \min \{f_1(v) - f_2(v)\} \quad \text{sub } v \in A = C_1 \cap C_2 \subset V \qquad (30)$$

where $C_1, C_2 \subseteq V$ are convex sets, $f_1$ is convex on $C_1$, and $f_2$ is concave on $C_2$. Then, again by mimicking the finite-dimensional case, we associate the primal with the dual

$$(D): \max \{f_2^*(v^*) - f_1^*(v^*)\} \quad \text{sub } v^* \in B = C_1^* \cap C_2^* \subset V^* \qquad (31)$$

where

$$f_1^*(v^*) = \sup_{v \in C_1} \{\langle v, v^* \rangle - f_1(v)\} \quad \text{and} \quad f_2^*(v^*)$$

are the convex and concave conjugates of $f_1$ and $f_2$, respectively, and $C_1^*$ and $C_2^*$ are their domains. Then, with obvious formal modifications, Fenchel duality theorem holds in this case, too (see again [2]). However, to obtain strong duality, we must supplement conditions (a) or (b) with the following

• Either $\{(v, a) \in V \times \mathbb{R} : f_1(v) \le a\}$ or $\{(v, a) \in V \times \mathbb{R} : f_2(v) \ge a\}$ has a nonempty interior.

This latter condition, which, in the finite-dimensional setting, follows from (a) or (b), must be checked separately in the present case.

References

[1] Bertsekas, D.P. (1995). Nonlinear Programming, Athena Scientific, Belmont.
[2] Luenberger, D.G. (1969). Optimization by Vector Space Methods, Wiley, New York.
[3] Magnanti, T.L. (1974). Fenchel and Lagrange duality are equivalent, Mathematical Programming 7, 253–258.
[4] Rockafellar, R.T. (1970). Convex Analysis, Princeton University Press, Princeton.

Related Articles

Capital Asset Pricing Model; Expected Utility Maximization; Expected Utility Maximization: Duality Methods; Minimal Entropy Martingale Measure; Model Calibration; Optimization Methods; Risk–Return Analysis; Robust Portfolio Optimization; Stochastic Control; Utility Function; Utility Indifference Valuation.
A squared Bessel (BESQ) process $(X_t^{(x,\delta)}, t \ge 0)$ may be defined (in law) as the solution of the stochastic differential equation:

\[ X_t = x + 2 \int_0^t \sqrt{X_s} \, d\beta_s + \delta t, \qquad X_t \ge 0 \tag{1} \]

where $x$ is the starting value, $X_0 = x$, $\delta$ is the so-called dimension of $X$, and $(\beta_s)_{s \ge 0}$ is a standard Brownian motion. For any integer dimension $\delta$, $(X_t, t \ge 0)$ may be obtained as the square of the Euclidean norm of $\delta$-dimensional Brownian motion.

The general theory of stochastic differential equations (SDEs) ensures that equation (1) enjoys pathwise uniqueness, hence uniqueness in law, and consequently the strong Markov property. Denoting by $Q_x^\delta$ the law of $(X_t)_{t \ge 0}$, solution of equation (1), on the canonical space $C_+ \equiv C(\mathbb{R}_+, \mathbb{R}_+)$, where $(Z_u, u \ge 0)$ is taken as the coordinate process, there is the convolution property:

\[ Q_x^\delta * Q_{x'}^{\delta'} = Q_{x+x'}^{\delta+\delta'} \tag{2} \]

which holds for all $x, x', \delta, \delta' \ge 0$ ([7]); in other terms, adding two independent BESQ processes yields another BESQ process, whose starting point, respectively dimension, is the sum of the starting points, respectively dimensions.

It follows from equation (2) that, for any positive measure $\mu(du)$ on $\mathbb{R}_+$ such that $\int \mu(du)(1 + u) < \infty$, and with $I_\mu = \int \mu(du) Z_u$,

\[ Q_x^\delta \left( \exp\left( -\tfrac{1}{2} I_\mu \right) \right) = (A_\mu)^\delta (B_\mu)^x \tag{3} \]

where $M_{x,\delta} = xM + \delta N$, for $M$ and $N$ two σ-finite measures on $C_+$, which are described in detail in, for example, [5] and [6].

Brownian Local Times and BESQ Processes

The Ray–Knight theorems for Brownian local times $(L_t^y; y \in \mathbb{R}, t \ge 0)$ express the laws of $(L_T^y; y \in \mathbb{R})$ for some very particular stopping times $T$ in terms of certain $Q_x^\delta$'s, namely,

1. if $T = T_a$ is the first hitting time of $a$ by Brownian motion, then $Z_y^{(a)} \equiv L_{T_a}^{a-y}$, $y \ge 0$, satisfies the following:

\[ Z_y = 2 \int_0^y \sqrt{Z_z} \, d\beta_z + 2 (y \wedge a) \tag{5} \]

2. if $T = \tau_\ell$ is the first time that $(L_t^0, t \ge 0)$, the Brownian local time at level 0, reaches $\ell$, then $(L_{\tau_\ell}^y, y \ge 0)$ and $(L_{\tau_\ell}^{-y}, y \ge 0)$ are two independent BESQ processes, distributed as $Q_\ell^0$.

An Implicit Representation in Terms of Geometric Brownian Motions

Lamperti [3] showed a one-to-one correspondence between Lévy processes $(\xi_t, t \ge 0)$ and semistable Markov processes $(\Sigma_u, u \ge 0)$ via the (implicit) formula:

\[ \exp(\xi_t) = \Sigma_{\int_0^t ds \, \exp(\xi_s)}, \qquad t \ge 0 \tag{6} \]
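
As a numerical aside on the definition (1) and the additivity property (2), the following is a minimal sketch rather than anything taken from the article. It assumes NumPy and an integer dimension, uses the squared-norm construction mentioned after equation (1), and the function name `simulate_besq_integer_dim` as well as all parameter values are hypothetical. Taking expectations in equation (1), and treating the stochastic integral as a true martingale, suggests $E[X_t] = x + \delta t$, which the sketch checks by Monte Carlo.

```python
import numpy as np


def simulate_besq_integer_dim(x, dim, t, n_paths, seed=0):
    """Sample X_t for a BESQ process of integer dimension `dim` started at x.

    Uses the squared Euclidean norm of a dim-dimensional Brownian motion
    started at a point whose squared norm equals x.
    """
    rng = np.random.default_rng(seed)
    start = np.zeros(dim)
    start[0] = np.sqrt(x)  # any starting point with squared norm x would do
    endpoints = start + np.sqrt(t) * rng.standard_normal((n_paths, dim))
    return np.sum(endpoints ** 2, axis=1)


# Hypothetical parameters: (x, delta) = (1.0, 2) and (x', delta') = (0.5, 3).
t = 1.0
n_paths = 200_000
summed = (simulate_besq_integer_dim(1.0, 2, t, n_paths, seed=2)
          + simulate_besq_integer_dim(0.5, 3, t, n_paths, seed=3))
direct = simulate_besq_integer_dim(1.5, 5, t, n_paths, seed=1)

# Both sample means should be close to (x + x') + (delta + delta') * t = 6.5.
print("mean of sum of two independent BESQ samples:", summed.mean())
print("mean of direct BESQ(5) samples from 1.5:   ", direct.mean())
```

Comparing only means and variances is of course a weak check of equality in law; it is meant as an illustration of property (2), not a proof.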
2 Squared Bessel Processes
In the particular case where $\xi_t = 2(B_t + \nu t)$, $t \ge 0$, formula (6) becomes

\[ \exp(2(B_t + \nu t)) = X^{(1,\delta)}_{\int_0^t ds \, \exp(2(B_s + \nu s))} \tag{7} \]

where, in agreement with our notation, $(X_u^{(1,\delta)}, u \ge 0)$ denotes a BESQ process starting from 1 with dimension $\delta = 2(1 + \nu)$. We note that in equation (7), $\delta$ may be negative, that is, $\nu < -1$; however, formula (7) then describes $(X_u^{(1,\delta)})$ only for $u \le T_0(X^{(1,\delta)})$, the first hitting time of 0 by $(X^{(1,\delta)})$. Nonetheless, the study of BESQ$^\delta$, for any $\delta \in \mathbb{R}$, has been developed in [1].

Absolute continuity relationships between the laws of different BESQ processes may be derived from equation (7), combined with the Cameron–Martin relationship between the laws of $(B_t + \nu t, t \ge 0)$ and $(B_t, t \ge 0)$. Precisely, one thus obtains, for $\delta \ge 2$:

\[ Q_x^\delta \big|_{\mathcal{Z}_u} = \left( \frac{Z_u}{x} \right)^{\nu/2} \exp\left( -\frac{\nu^2}{2} \int_0^u \frac{ds}{Z_s} \right) \bullet Q_x^2 \big|_{\mathcal{Z}_u} \tag{8} \]

where $\mathcal{Z}_u \equiv \sigma\{Z_s, s \le u\}$, and $\nu = \frac{\delta}{2} - 1$. The combination of equations (7) and (8) may be used to derive results about $(B_t + \nu t, t \ge 0)$ from results about $X^{x,\delta}$ (and vice versa). In particular, the law of

\[ A^{(\nu)}_{T_\lambda} := \int_0^{T_\lambda} ds \, \exp(2(B_s + \nu s)) \tag{9} \]

where $T_\lambda$ denotes an independent exponential time, was derived in ([8], Paper 2) from this combination.

Some Explicit Formulae for BESQ Functionals

Formula (3), when $\mu$ is replaced by $\lambda\mu$, for any scalar $\lambda \ge 0$, yields the explicit Laplace transform of $I_\mu$, provided the function $\varphi_{\lambda\mu}$ is known explicitly, which is the case for $\mu(dt) = a t^\alpha 1_{(t \le A)} \, dt + b \, \varepsilon_A(dt)$ and many other examples.

Consequently, the semigroup of BESQ may be expressed explicitly in terms of Bessel functions, as well as the Laplace transforms of first hitting times (see, for example, [2]) and distributions of last passage times (see, for example, [4]). Chapter XI of [6] is entirely devoted to Bessel processes.

References

[1] Göing-Jaeschke, A. & Yor, M. (2003). A survey and some generalizations of Bessel processes, Bernoulli 9(2), 313–350.
[2] Kent, J. (1978). Some probabilistic properties of Bessel functions, The Annals of Probability 6, 760–770.
[3] Lamperti, J. (1972). Semi-stable Markov processes, Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 22, 205–225.
[4] Pitman, J. & Yor, M. (1981). Bessel processes and infinitely divisible laws, in Stochastic Integrals, D. Williams, ed., LNM 851, Springer, pp. 285–370.
[5] Pitman, J. & Yor, M. (1982). A decomposition of Bessel bridges, Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 59, 425–457.
[6] Revuz, D. & Yor, M. (1999). Continuous Martingales and Brownian Motion, 3rd Edition, Springer.
[7] Shiga, T. & Watanabe, S. (1973). Bessel diffusions as a one-parameter family of diffusion processes, Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 27, 37–46.
[8] Yor, M. (2001). Exponential Functionals of Brownian Motion and Related Processes, Springer-Finance.

Related Articles

Affine Models; Cox–Ingersoll–Ross (CIR) Model; Heston Model; Simulation of Square-root Processes.

MARC J. YOR

Semimartingale

Semimartingales form an important class of processes in probability theory, especially in the theory of stochastic integration and its applications. They serve as natural models for asset pricing, since under no-arbitrage assumptions a price process must be a semimartingale [1, 3].

Let $(\Omega, \mathcal{F}, \mathbb{F} = (\mathcal{F}_t)_{t \ge 0}, P)$ be a complete probability space that satisfies the usual assumptions (i.e., $\mathcal{F}_0$ contains all $P$-null sets of $\mathcal{F}$ and the filtration is right continuous). A càdlàg, adapted process $X$ is called a semimartingale if it admits a decomposition

\[ X_t = X_0 + A_t + M_t \tag{1} \]

where $X_0$ is $\mathcal{F}_0$-measurable, $A$ is a process with finite variation, $M$ is a local martingale, and $A_0 = M_0 = 0$. If, moreover, $A$ is predictable (i.e., measurable with respect to the σ-algebra generated by all left-continuous processes), $X$ is called a special semimartingale. In this case, the decomposition (1) is unique and we call it the canonical decomposition. Clearly, the set of all semimartingales is a vector space.

For any $a > 0$, a semimartingale $X$ can be further decomposed as

\[ X_t = X_0 + A_t + D_t + N_t \tag{2} \]

where $D$ and $N$ are local martingales such that $D$ is a process with finite variation and the jumps of $N$ are bounded by $2a$ (see [6] p. 126).

Alternatively, semimartingales can be defined as a class of "good integrators". Let $S$ be the collection of all simple predictable processes, equipped with uniform convergence in $(t, \omega)$. A process $H$ is called simple predictable if it has the representation

\[ H_t = H_0 1_{\{0\}}(t) + \sum_{i=1}^n H_i 1_{(T_i, T_{i+1}]}(t) \tag{3} \]

where $0 = T_1 \le \cdots \le T_{n+1} < \infty$ are stopping times, the $H_i$ are $\mathcal{F}_{T_i}$-measurable, and $|H_i| < \infty$ almost surely. Let $L^0$ be the space of (finite-valued) random variables topologized by convergence in probability. For a given process $X$, we define a linear mapping (stochastic integral) $I_X : S \to L^0$ by

\[ I_X(H) = H_0 X_0 + \sum_{i=1}^n H_i (X_{T_{i+1}} - X_{T_i}) \tag{4} \]

A process $X$ is defined to be a semimartingale if it is càdlàg, adapted, and the mapping $I_X : S \to L^0$ is continuous. Such processes are "good integrators" because they satisfy the following bounded convergence theorem: the uniform convergence of $H^n$ to $H$ (in $S$) implies the convergence in probability of $I_X(H^n)$ to $I_X(H)$. As a consequence, when $X$ is a semimartingale, the domain of the stochastic integral $I_X$ can be extended to the space of all predictable processes $H$ (see Stochastic Integrals).

In fact, these two definitions are equivalent. This result is known as the Bichteler–Dellacherie theorem [2, 4].

Examples

• Càdlàg adapted processes with finite variation are semimartingales.
• All càdlàg, adapted martingales, submartingales, and supermartingales are semimartingales.
• Brownian motion is a continuous martingale. Hence, it is a semimartingale.
• Lévy processes are semimartingales.
• Itô diffusions of the form

\[ X_t = X_0 + \int_0^t a_s \, ds + \int_0^t \sigma_s \, dW_s \tag{5} \]

where $W$ is a Brownian motion, are (continuous) semimartingales. In particular, solutions of stochastic differential equations of the type $dX_t = a(t, X_t) \, dt + \sigma(t, X_t) \, dW_t$ are semimartingales.

Quadratic Variation of Semimartingales

Quadratic variation is an important characteristic of a semimartingale. It is also one of the crucial objects in financial econometrics, as it serves as a measure of the variability of a price process.

Let $X$, $Y$ be semimartingales. The quadratic variation process $[X, X] = ([X, X]_t)_{t \ge 0}$ is given as

\[ [X, X]_t = X_t^2 - X_0^2 - 2 \int_0^t X_{s-} \, dX_s \tag{6} \]
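
The defining relation (6) has an exact discrete counterpart, which may help intuition. The following is a minimal sketch, not part of the original article: it assumes NumPy, a Brownian path sampled on a regular grid, and hypothetical helper names. On a grid, the sum of squared increments equals $X_t^2 - X_0^2 - 2 \sum X_{t_i} (X_{t_{i+1}} - X_{t_i})$ by simple algebra, and for Brownian motion on $[0, 1]$ this realized quadratic variation should be close to 1.

```python
import numpy as np


def simulate_brownian_path(n_steps, horizon=1.0, seed=0):
    """Brownian path on a regular grid of n_steps intervals, started at 0."""
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    increments = rng.normal(0.0, np.sqrt(dt), size=n_steps)
    return np.concatenate(([0.0], np.cumsum(increments)))


def realized_quadratic_variation(path):
    """Sum of squared increments of a discretely sampled path."""
    return float(np.sum(np.diff(path) ** 2))


def qv_via_identity(path):
    """Discrete analog of equation (6), with left endpoints as integrand."""
    increments = np.diff(path)
    left_values = path[:-1]  # value of the path just before each increment
    stochastic_sum = np.sum(left_values * increments)
    return float(path[-1] ** 2 - path[0] ** 2 - 2.0 * stochastic_sum)


path = simulate_brownian_path(100_000, horizon=1.0, seed=0)
print("sum of squared increments:", realized_quadratic_variation(path))
print("value from identity (6):  ", qv_via_identity(path))
```

The two printed numbers agree up to floating-point error because the identity is algebraic; only their closeness to the horizon length reflects a property of Brownian motion.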
One of the most interesting applications of Itô's formula is the so-called Doléans–Dade exponential (see Stochastic Exponential). Let $X$ be a (one-dimensional) semimartingale with $X_0 = 0$. Then there exists a unique semimartingale $Z$ that satisfies the equation $Z_t = 1 + \int_0^t Z_{s-} \, dX_s$. This solution is denoted by $\mathcal{E}(X)$ (the Doléans–Dade exponential) and is given by

\[ \mathcal{E}(X)_t = \exp\left( X_t - \tfrac{1}{2} [X, X]_t \right) \prod_{0 \le s \le t} (1 + \Delta X_s) \exp\left( -\Delta X_s + \tfrac{1}{2} |\Delta X_s|^2 \right) \tag{14} \]

Moreover, we obtain the identity $\mathcal{E}(X) \mathcal{E}(Y) = \mathcal{E}(X + Y + [X, Y])$.

An important example is $X_t = at + \sigma W_t$, where $W$ denotes a Brownian motion and $a$, $\sigma$ are constants. In this case, the continuous solution $\mathcal{E}(X)_t = \exp\left( \left( a - \frac{\sigma^2}{2} \right) t + \sigma W_t \right)$ is known as the Black–Scholes model.

References

[1] Back, K. (1991). Asset prices for general processes, Journal of Mathematical Economics 20(4), 371–395.
[2] Bichteler, K. (1981). Stochastic integration and Lp-theory of semimartingales, Annals of Probability 9, 49–89.
[3] Delbaen, F. & Schachermayer, W. (1994). A general version of the fundamental theorem of asset pricing, Mathematische Annalen 300, 463–520.
[4] Dellacherie, C. (1980). Un survol de la théorie de l'intégrale stochastique, Stochastic Processes and their Applications 10, 115–144.
[5] Jacod, J. & Shiryaev, A.N. (2003). Limit Theorems for Stochastic Processes, 2nd Edition, Springer-Verlag.
[6] Protter, P.E. (2005). Stochastic Integration and Differential Equations, 2nd Edition, Springer-Verlag.

Further Reading

Revuz, D. & Yor, M. (2005). Continuous Martingales and Brownian Motion, 3rd Edition, Springer-Verlag.

Related Articles

Doob–Meyer Decomposition; Equivalence of Probability Measures; Filtrations; Itô's Formula; Martingales; Poisson Process; Stochastic Exponential; Stochastic Integrals.

MARK PODOLSKIJ

Capital Asset Pricing Model

The 1990 Nobel Prize winner William Sharpe [49, 50] introduced a cornerstone of modern finance theory with his seminal capital asset pricing model (CAPM), for which Black [9], Lintner [35, 36], Mossin [43], and Treynor [54] proposed analogous and extended versions. He thereby proposed an answer to financial theory's question about the uncertainty surrounding any investment and any financial asset. Indeed, financial theory raised the question of how risk impacts the fixing of asset prices in the financial market (see Modern Portfolio Theory), and William Sharpe proposed an explanation of the link prevailing between risky asset prices and market equilibrium. The CAPM therefore proposes a characterization of the link between the risk and return of financial assets, on one side, and market equilibrium, on the other side. This fundamental relationship establishes that the expected excess return of a given risky asset (see Expectations Hypothesis; Risk Premia) corresponds to the expected market risk premium (i.e., market price of risk) times a constant parameter called beta (i.e., a proportionality constant). The beta is a measure of the asset's relative risk and represents the asset price's propensity to move with the market. Indeed, the beta assesses the extent to which the asset's price simultaneously follows the market trend. Namely, the CAPM explains that, on average, the unique source of risk impacting the returns of risky assets comes from the broad financial market to which all the risky assets belong and on which they are all traded. The main result is that the global risk of a given financial asset can be split into two distinct components, namely, a market-based component and a specific component. This specific component vanishes within well-diversified portfolios so that their global risk reduces to the broad market influence.

Framework and Risk Typology

The CAPM provides a foundation for the theory of market equilibrium, which relies on both the utility theory (see Utility Theory: Historical Perspectives) and the portfolio selection theory (see Markowitz, Harry). The main focus consists of analyzing and understanding the behaviors and transactions of market participants on the financial market. Under this setting, market participants are assumed to act simultaneously so that they can invest their money in only two asset classes, namely, risky assets, which are contingent claims, and nonrisky assets such as the risk-free asset. The confrontation between the supply and demand of financial assets in the market allows, therefore, for establishing an equilibrium price (for each traded asset) once the supply of financial assets satisfies the demand of financial assets. The uncertainty surrounding contingent claims is such that the general equilibrium theory explains risky asset prices by the equality between the supply and demand of financial assets. Under this setting, Sharpe [49, 50] assumes that the returns of contingent claims depend on each other only due to a unique exogenous market factor called the market portfolio. The other potential impacting factors are assumed to be random.

Hence, the CAPM results immediately from the Markowitz [37, 38] setting since it represents an equilibrium model of financial asset prices (see Markowitz, Harry). Basically, market participants hold portfolios, which are composed of the risk-free asset and the market portfolio (representing the set of all traded risky assets). The market portfolio is moreover a mean–variance efficient portfolio, which is optimally diversified and satisfies equilibrium conditions (see Efficient Markets Theory: Historical Perspectives; Efficient Market Hypothesis; Risk–Return Analysis). Consequently, holding a risky asset such as a stock is equivalent to holding a combination of the risk-free asset and the market portfolio, the market portfolio being the unique market factor.

The Capital Asset Pricing Model

Specifically, Sharpe [49, 50] describes the uncertainty underlying contingent claims with a one-factor model, the CAPM. The CAPM illustrates the establishment of financial asset prices under uncertainty and under market equilibrium. Such equilibrium is partial and takes place under a set of restrictive assumptions.

Assumptions

1. Markets are perfect and without frictions: no tax, no transaction costs (see Transaction Costs),
and no possibility of manipulating asset prices in the market (i.e., perfect market competition).
2. Information is instantaneously and perfectly available in the market so that investors simultaneously access the same information set without any cost.
3. Market participants invest over one time period so that we consider a one-period model setting.
4. Financial assets are infinitely divisible and liquid.
5. Lending and borrowing processes apply the risk-free rate (same rate of interest), and there is no short sale constraint.
6. Asset returns are normally distributed so that expected returns and corresponding standard deviations are sufficient to describe the assets' behaviors (i.e., their probability distributions). The Gaussian distribution assumption is equivalent to a quadratic utility setting.
7. Investors are risk averse and rational. Moreover, they seek to maximize the expected utility of their future wealth, that is, of the future value of their investment or portfolio (see Expected Utility Maximization: Duality Methods; Expected Utility Maximization; and the two-fund separation theorem of Tobin [52]).
8. Investors build homogeneous expectations about the future variation of interest rates. All the investors build the same forecasts about the expected returns and the variance–covariance matrix of stock returns. Therefore, there is a unique set of optimal portfolios. Basically, investors share the same opportunity sets, which means they consider the same sets of accessible and "interesting" portfolios.
9. The combination of two distinct and independent risk factors drives the evolution of any risky return over time, namely, the broad financial market and the fundamental/specific features of the asset under consideration. Basically, the risk level embedded in asset returns results from the trade-off between a market risk factor and an idiosyncratic risk factor.

The market risk factor is also called systematic risk factor and nondiversifiable risk factor. It represents a risk factor that is common to any traded financial asset. Specifically, the market risk factor represents the global evolution of the financial market and the economy (i.e., trend of the broad market, business cycle), and impacts any risky asset. Indeed, it characterizes the systematic fluctuations in asset prices, which result from the broad market. In a complementary way, the specific risk factor is also called idiosyncratic risk factor, unsystematic risk factor, or diversifiable risk factor. It represents a component that is peculiar to each financial asset or to each financial asset class (e.g., small or large caps). This specific component in asset prices has no link with the broad market. Moreover, the systematic risk factor is priced by the market, whereas the idiosyncratic risk factor is not priced by the market. Specifically, market participants ascribe a nonzero expected return to the market risk factor, whereas they ascribe a zero expected return to the specific risk factor. This feature results from the fact that the idiosyncratic risk can easily be mitigated within a well-diversified portfolio, namely, a portfolio with a sufficient number of heterogeneous risky assets so that their respective idiosyncratic risks cancel each other. Thus, a diversified portfolio's global risk (i.e., total variance) results only from the market risk (i.e., systematic risk).

CAPM equation

Under the previous assumptions, the CAPM establishes a linear relationship between a portfolio's expected risk premium and the expected market risk premium as follows:

\[ E[R_P] = r_f + \beta_P \times (E[R_M] - r_f) \tag{1} \]

where $R_M$ is the return of the market portfolio; $R_P$ is the return of portfolio $P$ (which may also correspond to a given stock $i$); $r_f$ is the risk-free interest rate; $\beta_P$ is the beta of portfolio $P$; and $E[R_M] - r_f$ is the market price of risk. The market portfolio $M$ is composed of all the available and traded assets in the market. The weights of the market portfolio's components are proportional to their corresponding market capitalization relative to the global broad market capitalization. Therefore, the market portfolio is representative of the broad market evolution and its related systematic risk. Finally, $\beta_P$ is a systematic risk measure, also called Sharpe coefficient, since it quantifies the sensitivity of portfolio $P$ or stock $i$ to the broad market. Basically, the portfolio's beta is written as

\[ \beta_P = \frac{\mathrm{Cov}(R_P, R_M)}{\mathrm{Var}(R_M)} = \frac{\sigma_{PM}}{\sigma_M^2} \tag{2} \]

where $\mathrm{Cov}(R_P, R_M) = \sigma_{PM}$ is the covariance between the portfolio's return and the market return,
and $\mathrm{Var}(R_M) = \sigma_M^2$ is the market return's variance over the investment period. In other words, beta is the risk of covariation between the portfolio's and the market's returns normalized by the market return's variance. Therefore, beta is a relative risk measure. Under the Gaussian return assumption, the standard deviation, or equivalently the variance, is an appropriate risk metric for measuring the dispersion risk of asset returns.

Therefore, under equilibrium, the portfolio's expected return $E[R_P]$ equals the risk-free rate increased by a risk premium. The risk premium is a linear function of the systematic risk measure as represented by the beta and the market price of risk as represented by the expected market risk premium. Such a relationship is qualified as the security market line (SML; see Figure 1). Since idiosyncratic risk can be diversified away, only the systematic risk component in asset returns matters.[a] Intuitively, diversified portfolios cannot get rid of their respective dependency on the broad market. From a portfolio management perspective, the CAPM relationship then focuses mainly on diversified portfolios, namely, portfolios or stocks with no idiosyncratic risk. It then becomes useless to keep any idiosyncratic risk in a given portfolio since such a risk is not priced by the market. The beta parameter subsequently becomes the only means to control the portfolio's risk since the CAPM relationship (1) establishes the premium investors require to bear the portfolio's systematic risk. Indeed, the higher the dependency on the broad financial market is, the greater the risk premium required by investors becomes. Consequently, the beta parameter allows investors to classify assets as a function of their respective systematic risk level (see Table 1).

[Figure 1: Security market line. Vertical axis: expected return, with the levels r_f (time price = risk-free rate), E[R_M], and E[R_P]; horizontal axis: beta, with β_M = 1 and β_P marked. Annotations: market price of risk; risk premium = systematic risk times market price of risk.]

Assets with negative beta values are usually specific commodity securities such as gold-linked assets. Moreover, risk-free securities such as cash or Treasury bills, Treasury bonds, or Treasury notes belong to the zero-beta asset class. Risk-free securities are independent from the broad market and exhibit a zero variance, or equivalently a zero standard deviation. However, the class of zero-beta securities also includes risky assets, namely, assets with a nonzero variance, which are not correlated with the market.

Table 1  Systematic risk classification

Beta level                 Classification
β > 1                      Offensive, cyclical asset amplifying market variations
0 < β < 1                  Defensive asset absorbing market variations
β = 1                      Market portfolio or asset mimicking market variations
β = 0                      Asset with no market dependency
β lies between −1 and 1    Asset with low systematic risk level
|β| lies above 1           Asset with a higher risk level than the broad market's risk
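
To make equations (1) and (2) and the classification of Table 1 concrete, here is a minimal sketch under stated assumptions. It is not part of the original article: it assumes NumPy, uses sample moments in place of the population covariance and variance, and the function names and all return figures are hypothetical illustration values.

```python
import math

import numpy as np


def capm_beta(asset_returns, market_returns):
    """Equation (2): beta = Cov(R_P, R_M) / Var(R_M), with sample moments."""
    asset = np.asarray(asset_returns, dtype=float)
    market = np.asarray(market_returns, dtype=float)
    cov = np.cov(asset, market, ddof=1)  # 2x2 sample covariance matrix
    return float(cov[0, 1] / cov[1, 1])


def capm_expected_return(beta, risk_free_rate, expected_market_return):
    """Equation (1): E[R_P] = r_f + beta * (E[R_M] - r_f)."""
    return risk_free_rate + beta * (expected_market_return - risk_free_rate)


def classify_beta(beta, tol=1e-9):
    """Labels loosely following Table 1; the tolerance is an assumption."""
    if math.isclose(beta, 1.0, abs_tol=tol):
        return "market-mimicking"
    if math.isclose(beta, 0.0, abs_tol=tol):
        return "no market dependency"
    if beta > 1.0:
        return "offensive"
    if beta > 0.0:
        return "defensive"
    if beta > -1.0:
        return "low systematic risk (negative beta)"
    # Table 1 only covers |beta| > 1; beta = -1 is grouped here by assumption.
    return "higher risk level than the market"


# Hypothetical one-period returns; the asset is built to have beta 1.5.
market = np.array([0.02, -0.01, 0.03, 0.00, 0.01])
asset = 0.001 + 1.5 * market

beta = capm_beta(asset, market)
print("beta:", beta, "->", classify_beta(beta))
print("CAPM expected return:", capm_expected_return(beta, 0.02, 0.08))
```

With a risk-free rate of 2% and an expected market return of 8% (both invented for the example), a beta of 1.5 gives a required return of 2% + 1.5 × 6% = 11%.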
Estimation and Usefulness

The CAPM theory gives a partial equilibrium relationship, which is assumed to be stable over time. However, how can we estimate such a linear relationship in practice, and how do we estimate a portfolio's beta? How useful is this theory to market participants and investors?

Empirical Estimation

As a first point, under the Gaussian return assumption, beta coefficients can be computed from the covariance and variance of asset returns over the one-period investment horizon (see equation (2)). However, this way of computing beta coefficients does not work in a non-Gaussian world. Moreover, beta estimates depend on the selected market index, the studied time window, and the frequency of historical data [8].

As a second point, empirical estimations of the CAPM consider historical data and select a stock market index as a proxy for the CAPM market portfolio. Basically, the CAPM is tested by running two possible types of regressions based on observed asset returns (i.e., past historical data). Therefore, stocks' and portfolios' betas are estimated by regressing past asset returns on past market portfolio returns. We therefore focus on the potential existence of a linear relationship between stock/asset returns and market returns. The first possible estimation method corresponds to the market model regression as follows:

\[ R_{it} - r_f = \alpha_i + \beta_i \times (R_{Mt} - r_f) + \varepsilon_{it} \tag{3} \]

where $R_{it}$ is the return of asset $i$ at time $t$; $R_{Mt}$ is the market portfolio's return at time $t$, namely, the systematic risk factor as represented by the chosen market benchmark, which is the unique explanatory factor; $r_f$ is the short-term risk-free rate; $\varepsilon_{it}$ is a Gaussian white noise with a zero expectation and a constant variance $\sigma_{\varepsilon_i}^2$; $\alpha_i$ is a constant trend coefficient; and the slope coefficient $\beta_i$ is simply the beta of asset $i$. The trend coefficient $\alpha_i$ measures the distance of the asset's average return to the security market line, namely, the propensity of asset $i$ to overperform (i.e., $\alpha_i > 0$) or to underperform (i.e., $\alpha_i < 0$) the broad market. In other words, $\alpha_i$ is the average return observed on past history minus the expected return forecast provided by the security market line. The error term $\varepsilon_{it}$ represents the diversifiable/idiosyncratic risk factor describing the return of asset $i$. Therefore, $R_{Mt}$ and $\varepsilon_{it}$ are assumed to be independent, whereas the $(\varepsilon_{it})$ are supposed to be mutually independent. Regression equation (3) is simply the ex-post form of the CAPM relationship, namely, the application of CAPM to past observed data [27].

The second method for estimating CAPM betas is the characteristic line, for which we consider the following regression:

\[ R_{it} = a_i + b_i \times R_{Mt} + \varepsilon_{it} \tag{4} \]

where $a_i$ and $b_i$ are constant trend and slope regression coefficients, respectively [51]. Moreover, such coefficients have to satisfy the following constraints:

\[ \alpha_i = a_i - (1 - b_i) \times r_f \tag{5} \]

\[ \beta_i = b_i \tag{6} \]

Regression equations (3) and (4) are only valid under the strong assumptions that the $\alpha_i$ and $\beta_i$ coefficients are stationary (i.e., stable over time), and that each regression equation is a valid model over each one-period investment horizon.

In practice, the market model (3) is estimated over a two-year window of weekly data, whereas the characteristic line (4) is estimated over a five-year window of monthly data. Basically, the market model and the characteristic line use, as a market proxy, well-chosen stock market indexes such as the NYSE index and the S&P500 index, respectively, which are adapted to the frequency of the historical data under consideration.

Practical Use

A sound estimation process is very important insofar as the CAPM relationship intends to satisfy investors' needs. From this viewpoint, the main goal of CAPM estimation is first to use past-history beta estimates to forecast future betas. Specifically, the main objective consists of extracting information from past history to predict future betas. However, extrapolating past beta estimates to build future beta values may generate estimation errors resulting from outliers due to firm-specific events or structural changes either in the broad market or in the firm [10].

Second, the CAPM is a benchmark tool that supports investors' decisions. Specifically, the SML is used to identify undervalued stocks (i.e., plotting above the SML, with a return exceeding the CAPM required return) and overvalued stocks (i.e., plotting below the SML) under a fundamental
analysis setting. Indeed, investors compare observed stock returns with CAPM required returns and then assess the performance of the securities under consideration. Therefore, the CAPM relationship provides investors with a tool for investment decisions and trading strategies since it provides buy and sell signals, and drives asset allocation across different asset classes.

Third, the CAPM allows for building classical performance measures such as the Sharpe ratio (see Sharpe Ratio), the Treynor index, or Jensen's alpha (see Style Analysis; Performance Measures). Finally, the CAPM theory can be transposed to firm valuation insofar as the equilibrium value of the firm is the discounted value of its future expected cash flows. The discounting factor is simply adjusted for one identified risk factor affecting equity [20, 29, 30, 47]. According to the theorem proposed by Modigliani and Miller [40–42] (see Modigliani–Miller Theorem), the cost of equity capital for an indebted firm corresponds to the risk-free rate increased by an operating risk premium (independent from the firm's debt) times a leverage-specific factor. The firm's risk is therefore measured by the beta of its equity (i.e., equity's systematic risk), which also depends on the beta of the firm's assets and on the firm's leverage. Indeed, the leverage increases the beta of equity in a perfect market and therefore increases the firm's risk, which represents the probability of facing a default situation. However, an optimal capital structure may result from market imperfections such as taxes, agency costs, bankruptcy costs, and information asymmetry, among others. For example, there exists a trade-off between the costs incurred by financial distress (i.e., default) and the potential tax benefits inferred from leverage (i.e., debt). Consequently, applying the CAPM to establish the cost of capital allows for budget planning and capital budgeting insofar as choosing an intelligent debt level allows for maximizing the firm value. Namely, there exists an optimal capital structure.

Limitations and Model Extensions

The CAPM is only valid under its strong seminal assumptions and exhibits a range of shortcomings, as reported by Banz [6], for example. In practice and in the real financial world, many of these assumptions are violated. As a result, the CAPM suffers from various estimation problems that impact its efficiency. Indeed, Campbell et al. [14] show the poor performance of CAPM over the 1990s investment period in the United States. Such a result has several possible explanations, among which are missing explanatory factors, heteroscedasticity in returns or autocorrelation patterns, and time-varying or nonstationary CAPM regression estimates. For example, heteroscedastic return features imply that the static estimation of the CAPM is flawed under the classic setting (e.g., ordinary least squares linear regression). One has, therefore, to use appropriate techniques while running the CAPM regression under heteroscedasticity or non-Gaussian stock returns (see [7], for example, and see also Generalized Method of Moments (GMM); GARCH Models).

General Violations

Basic CAPM assumptions are not satisfied in the market and engender a set of general violations. First, lending and borrowing rates of interest are different in practice. Generally speaking, it is more expensive to borrow money than to lend money in terms of interest rate level. Second, the risk-free rate is not constant over time, but one can focus on its arithmetic mean over the one-period investment horizon. Moreover, the choice of the risk-free rate employed in the CAPM has to be balanced with the unit-holding period under consideration. Third, transaction costs are often observed on financial markets and constitute part of the brokers' and dealers' commissions. Fourth, the market benchmark as well as stock returns are often nonnormally distributed and skewed [44]. Indeed, asset returns are skewed and leptokurtic [55], and they exhibit volatility clusters (i.e., time-varying volatility) and long memory patterns [2, 45]. Moreover, the market portfolio is assumed to be composed of all the risky assets available on the financial market so as to represent the portfolio of all the traded securities. Therefore, the broad market proxy or market benchmark should encompass stocks, bonds, human capital, real estate assets, and foreign assets (see the critique of Roll [46]). Fifth, financial assets are not infinitely divisible so that only fixed amounts or proportions of shares, stocks, and other traded financial instruments can be bought or sold.

Finally, the static representation of CAPM is at odds with the dynamic investment decision process. This limitation gives birth to multiperiodic extensions of CAPM. Extensions are usually called intertemporal capital asset pricing models (ICAPMs),
6 Capital Asset Pricing Model
and extend the CAPM framework to several unit-holding periods (see [11, 39]).

Trading, Information, and Preferences

Insider trading theory assumes that some market participants hold some private information. Specifically, information asymmetry prevails so that part of existing information is not available to all investors. Under such a setting, Easley and O’Hara [22] and Wang [56] show that the trade-off between public and private information affects any firm’s cost of capital as well as the related return required by investors. Namely, the existence of private information increases the return required by uninformed investors. Under information asymmetry, market participants indeed exchange information through observed trading prices [18]. Moreover, heterogeneity prevails across investors’ preferences. Namely, they exhibit different levels of risk tolerance, which drives their respective investments and behaviors in the financial market. Finally, homogeneous expectations are inconsistent with the symmetry in the motives of transaction underlying any given trade. For a transaction to take place, the buy side has to meet the sell side. Indeed, Anderson et al. [4] show that heterogeneous beliefs play a nonnegligible role in asset pricing.

Nonsynchronous Trading

Often, the market factor of risk and stocks are not traded at the same time on the financial market, specifically at the daily frequency level. This stylized fact engenders the so-called nonsynchronous trading problem. When the market portfolio is composed of highly liquid stocks, the nonsynchronism problem is reduced within the portfolio as compared to an individual stock. However, for less liquid stocks or less liquid financial markets, the previous stylized fact becomes an issue under the CAPM estimation setting. To bypass this problem, the asset pricing theory introduces one-lag systematic risk factor(s) as additional explanatory factor(s) to describe asset returns [13, 21, 48].

Missing Factors

The poor explanatory power of the CAPM setting [14] comes, among other things, from the lack of information describing stock returns in the market. The broad market’s uncertainty is described by a unique risk factor: the market portfolio. Indeed, considering the market portfolio as the unique source of systematic risk, or equivalently as the unique systematic risk information source, is insufficient. To bypass this shortcoming, a large academic literature proposes to add complementary factors to the CAPM in order to better forecast stock returns (see Arbitrage Pricing Theory; Predictability of Asset Prices; Factor Models). Those missing factors are often qualified as asset pricing anomalies [5, 24, 26, 31]. Namely, the absence of key explanatory factors generates misestimations in computed beta values. For example, Fama and French [25] propose to consider two additional factors, namely the issuing firm’s size and book-to-market characteristics. Further, Carhart [16] proposes to add a fourth complementary factor called momentum. The stock momentum represents the significance of recent past stock returns on the current observed stock returns. Indeed, investors’ sentiment and preferences may explain expected returns to some extent. In this respect, momentum is important since investors distinguish between poorly and highly performing stocks over the recent past. More recently, Li [34] proposed two additional factors to the four previous ones, namely, the earnings-to-price ratio and the share turnover as a liquidity indicator. Indeed, Acharya and Pedersen [1], Brennan and Subrahmanyam [12], Chordia et al. [19], and Keene and Peterson [32] underlined the importance of liquidity as an explanatory factor in asset pricing. Basically, the trading activity impacts asset prices since the degree of transactions’ fluidity drives the continuity of observed asset prices. In other words, traded volumes impact market prices, and the impact’s magnitude depends on the nature of market participants [17].

Time-varying Betas

Some authors like Tofallis [53] questioned the soundness of CAPM while assessing and forecasting stock returns’ performance. Indeed, the CAPM relationship is assumed to remain stable over time insofar as it relies on constant beta estimates over each unit-holding period (i.e., reference time window). Such a process assumes implicitly that beta estimates remain stable in the near future so that ex-post beta estimates are good future risk indicators. However, time instability is a key feature of beta estimates. For example,
Gençay et al. [28] and Koutmos and Knif [33] support time-varying betas in CAPM estimation.

Moreover, CAPM-type asset pricing models often suffer from error-in-variables problems coupled with time-varying parameter features [15]. To solve such problems, authors like Amman and Verhofen [3], Ellis [23], and Wang [57] among others advocate using conditional versions of the CAPM. Moreover, Amman and Verhofen [3] and Wang [57] show the efficiency of conditional asset pricing models and exhibit the superior performance of the conditional CAPM setting as compared to other asset pricing models.

End Notes

a. Specifically, the systematic risk represents that part of returns’ global risk/variance, which is common to all traded assets, or equivalently, which results from the broad market’s influence.

References

[1] Acharya, V.V. & Pedersen, L.H. (2005). Asset pricing with liquidity risk, Journal of Financial Economics 77(2), 375–410.
[2] Adrian, T. & Rosenberg, J. (2008). Stock Returns and Volatility: Pricing the Short-run and Long-run Components of Market Risk, Staff Report No 254, Federal Reserve Bank of New York.
[3] Amman, M. & Verhofen, M. (2008). Testing conditional asset pricing models using a Markov chain Monte Carlo approach, European Financial Management 14(3), 391–418.
[4] Anderson, E.W., Ghysels, E. & Juergens, J.L. (2005). Do heterogeneous beliefs matter for asset pricing? Review of Financial Studies 18(3), 875–924.
[5] Avramov, D. & Chordia, T. (2006). Asset pricing models and financial market anomalies, Review of Financial Studies 19(3), 1001–1040.
[6] Banz, R. (1981). The relationship between return and market value of common stocks, Journal of Financial Economics 9(1), 3–18.
[7] Barone Adesi, G., Gagliardini, P. & Urga, G. (2004). Testing asset pricing models with coskewness, Journal of Business and Economic Statistics 22(4), 474–495.
[8] Berk, J. & DeMarzo, P. (2007). Corporate Finance, Pearson International Education, USA.
[9] Black, F. (1972). Capital market equilibrium with restricted borrowing, Journal of Business 45(3), 444–455.
[10] Bossaerts, P. & Hillion, P. (1999). Implementing statistical criterion to select return forecasting models: what do we learn? Review of Financial Studies 12(2), 405–428.
[11] Breeden, D. (1979). An intertemporal capital asset pricing model with stochastic consumption and investment opportunities, Journal of Financial Economics 7(3), 265–296.
[12] Brennan, M.J. & Subrahmanyam, A. (1996). Market microstructure and asset pricing: on the compensation for illiquidity in stock returns, Journal of Financial Economics 41(3), 441–464.
[13] Busse, J.A. (1999). Volatility timing in mutual funds: evidence from daily returns, Review of Financial Studies 12(5), 1009–1041.
[14] Campbell, J.Y., Lettau, M., Malkiel, B.G. & Xu, Y. (2001). Have individual stocks become more volatile? An empirical exploration of idiosyncratic risk, Journal of Finance 56(1), 1–43.
[15] Capiello, L. & Fearnley, T.A. (2000). International CAPM with Regime Switching GARCH Parameters. Graduate Institute of International Studies, University of Geneva. Research Paper No 17.
[16] Carhart, M.M. (1997). On persistence in mutual fund performance, Journal of Finance 52(1), 57–82.
[17] Carpenter, A. & Wang, J. (2007). Herding and the information content of trades in the Australian dollar market, Pacific-Basin Finance Journal 15(2), 173–194.
[18] Chan, H., Faff, R., Ho, Y.K. & Ramsay, A. (2006). Asymmetric market reactions of growth and value firms with management earnings forecasts, International Review of Finance 6(1–2), 79–97.
[19] Chordia, T., Roll, R. & Subrahmanyam, A. (2001). Trading activity and expected stock returns, Journal of Financial Economics 59(1), 3–32.
[20] Cohen, R.D. (2008). Incorporating default risk into Hamada’s equation for application to capital structure, Wilmott Magazine March, 62–68.
[21] Dimson, E. (1979). Risk measurement when shares are subject to infrequent trading, Journal of Financial Economics 7(2), 197–226.
[22] Easley, D. & O’Hara, M. (2004). Information and the cost of capital, Journal of Finance 59(4), 1553–1583.
[23] Ellis, D. (1996). A test of the conditional CAPM with simultaneous estimation of the first and second conditional moments, Financial Review 31(3), 475–499.
[24] Faff, R. (2001). An Examination of the Fama and French three-factor model using commercially available factors, Australian Journal of Management 26(1), 1–17.
[25] Fama, E.F. & French, K.R. (1993). Common risk factors in the returns on stocks and bonds, Journal of Financial Economics 33(1), 3–56.
[26] Fama, E.F. & French, K.R. (1996). Multi-factor explanations of asset pricing anomalies, Journal of Finance 51(1), 55–84.
[27] Friend, I. & Westerfield, R. (1980). Co-skewness and capital asset pricing, Journal of Finance 35(4), 897–913.
[28] Gençay, R., Selçuk, F. & Whitcher, B. (2003). Systematic risk and timescales, Quantitative Finance 3(1), 108–116.
[29] Hamada, R. (1969). Portfolio analysis market equilibrium and corporation finance, Journal of Finance 24(1), 13–31.
[30] Hamada, R. (1972). The effect of the firm’s capital structure on the systematic risk of common stocks, Journal of Finance 27(2), 435–451.
[31] Hu, O. (2007). Applicability of the Fama-French three-factor model in forecasting portfolio returns, Journal of Financial Research 30(1), 111–127.
[32] Keene, M.A. & Peterson, D.R. (2007). The importance of liquidity as a factor in asset pricing, Journal of Financial Research 30(1), 91–109.
[33] Koutmos, G. & Knif, J. (2002). Estimating systematic risk using time-varying distributions, European Financial Management 8(1), 59–73.
[34] Li, X. (2001). Performance Evaluation of Recommended Portfolios of Individual Financial Analysts. Working Paper, Owen Graduate School of Management, Vanderbilt University.
[35] Lintner, J. (1965). The valuation of risky assets and the selection of risky investments in stock portfolios and capital budgets, Review of Economics and Statistics 47(1), 13–37.
[36] Lintner, J. (1969). The aggregation of investor’s diverse judgments and preferences in purely competitive security markets, Journal of Financial and Quantitative Analysis 4(4), 347–400.
[37] Markowitz, H.W. (1952). Portfolio selection, Journal of Finance 7(1), 77–91.
[38] Markowitz, H.W. (1959). Portfolio Selection. Efficient Diversification of Investment, John Wiley & Sons, New York.
[39] Merton, R.C. (1973). An intertemporal capital asset pricing model, Econometrica 41(5), 867–887.
[40] Modigliani, F. & Miller, M.H. (1958). The cost of capital, corporation finance and the theory of investment, American Economic Review 48(3), 261–297.
[41] Modigliani, F. & Miller, M.H. (1963). Corporate income taxes and the cost of capital: a correction, American Economic Review 53(3), 433–443.
[42] Modigliani, F. & Miller, M.H. (1966). Some estimates of the cost of capital to the utility industry 1954-7, American Economic Review 56(3), 333–391.
[43] Mossin, J. (1966). Equilibrium in a capital asset market, Econometrica 34(4), 768–783.
[44] Nelson, D.B. (1991). Conditional heteroskedasticity in asset returns: a new approach, Econometrica 59(2), 347–370.
[45] Oh, G., Kim, S. & Eom, C. (2008). Long-term memory and volatility clustering in high-frequency price changes, Physica A: Statistical Mechanics and Its Applications 387(5–6), 1247–1254.
[46] Roll, R. (1977). A critique of the asset pricing theory’s tests: Part one: on past and potential testability of the theory, Journal of Financial Economics 4(1), 129–176.
[47] Rubinstein, M. (1973). A mean variance synthesis of corporate financial theory, Journal of Finance 38(1), 167–181.
[48] Scholes, M. & Williams, J. (1977). Estimating betas from non synchronous data, Journal of Financial Economics 5(3), 309–327.
[49] Sharpe, W.F. (1963). A simplified model of portfolio analysis, Management Science 9(2), 227–293.
[50] Sharpe, W.F. (1964). Capital asset prices: a theory of market equilibrium under risk, Journal of Finance 19(3), 425–442.
[51] Smith, K.V. & Tito, D.A. (1969). Risk-return measures of ex post portfolio performance, Journal of Financial and Quantitative Analysis 4(4), 449–471.
[52] Tobin, J. (1958). Liquidity preferences as behavior towards risk, Review of Economic Studies 25(1), 65–86.
[53] Tofallis, C. (2008). Investment volatility: a critique of standard beta estimation and a simple way forward, European Journal of Operational Research 187(3), 1358–1367.
[54] Treynor, J. (1961). Toward a theory of the market value of risky assets, in Asset Pricing and Portfolio Performance: Models, Strategy and Performance Metrics, Korajczyk, Robert A., ed., Risk Books, London, pp. 15–22. Unpublished Manuscript. Recently published in 1999 as the Chapter 2 of editor).
[55] Verhoeven, P. & McAleer, M. (2004). Fat tails and asymmetry in financial volatility models, Mathematics and Computers in Simulation 64(3–4), 351–361.
[56] Wang, J. (1993). A model of intertemporal asset prices under asymmetric information, Review of Economic Studies 60(2), 249–282.
[57] Wang, K.Q. (2003). Asset pricing with conditioning information: a new test, Journal of Finance 58(1), 161–196.

Related Articles

Arbitrage Pricing Theory; Efficient Markets Theory: Historical Perspectives; Markowitz, Harry; Modigliani, Franco; Sharpe, William F.

HAYETTE GATFAOUI
Arbitrage Pricing Theory

The arbitrage pricing theory (APT) was introduced by Ross [10] as an alternative to the capital asset pricing model (CAPM). The model derives a multibeta representation of expected returns relative to a set of K reference variables under assumptions that may be described roughly as follows:

1. There exists no mean–variance arbitrage.
2. The asset returns follow a K-factor model.
3. The reference variables and the factors are nontrivially correlated.^a

The first assumption implies that there are no portfolios with arbitrarily large expected returns and unit variance. The second one assumes that the returns are a function of K factors common to all assets, and a noise term specific to each asset. The third one identifies the sets of reference variables for which the model works.

The model predictions may have approximation errors. However, these errors are small for each portfolio whose weight on each asset is small (a well-diversified portfolio).

Early versions of the model unnecessarily assumed that the factors are equal to the reference variables. The extension of the model to arbitrary sets of reference variables comes at the cost of increasing the bound on the approximation errors by a multiplicative factor. However, when focusing on pricing of only well-diversified portfolios, this seems to be unimportant because each of the approximation errors is small and a multiplicative factor does not much change the size of the error.

Factor Representation

Consider a finite sequence of random variables $\{Z_i;\ i = 1, \ldots, N\}$ with finite variances that will be held fixed throughout the article. It is regarded as representing the excess^b returns of a given set of assets (henceforth “assets $i = 1, \ldots, N$”). Without any further assumptions

$$Z_i = b_{i,0} + \sum_{k=1}^{K} b_{i,k} f_k + e_i; \quad i = 1, \ldots, N$$

where $f_1, \ldots, f_K$ are the first K factors in the principal component analysis (PCA) of the sequence $\{Z_i;\ i = 1, \ldots, N\}$. The $b_{i,k}$ are the factor loadings and the $e_i$ are the residuals from projecting the $Z_i$ on the factors.

The (K + 1)-th largest eigenvalue of the covariance matrix of the $Z_i$, denoted by ${}^{2}(K)$, is interpreted as a measure of the extent to which our sequence of assets has a K-factor representation. The PCA selects the $f_k$ so that ${}^{2}(K)$ is minimized. In addition, ${}^{2}(K)$ is also the largest eigenvalue of the covariance matrix of the $e_i$.

Diversified Portfolios

Let $w \in \mathbb{R}^N$ be a portfolio in assets $i = 1, \ldots, N$. Its excess return is

$$Z_w = \sum_{i=1}^{N} w_i Z_i.$$

Its representation as a linear function of the factors is

$$Z_w = b_{w,0} + \sum_{k=1}^{K} b_{w,k} f_k + e_w$$

where $b_{w,k} = \sum_{i=1}^{N} w_i b_{i,k}$ are the factor loadings and $e_w = \sum_{i=1}^{N} w_i e_i$ is the residual, which satisfies

$$\mathrm{Var}[e_w] < {}^{2}(K) \sum_{i=1}^{N} w_i^2$$

A portfolio $w = (w_1, \ldots)$ is called an (approximate) well-diversified portfolio if

$$\sum_{i=1}^{N} w_i^2 \approx 0 \tag{1}$$

Intuitively, a well-diversified portfolio is one with a large number of assets that has a small weight in many of them, and, in addition, there is no single asset for which the weight is not small.

The variance of the residual of a well-diversified portfolio is small and thus its excess return is approximately a linear function of the factors; that is,

$$Z_w \approx b_{w,0} + \sum_{k=1}^{K} b_{w,k} f_k \tag{2}$$

Although $\sum_{i=1}^{N} w_i^2 \approx 0$, $Z_w$ may not be small. For example, let $w_i = 1/N$; then we have $\sum_{i=1}^{N} w_i^2 = 1/N$, and $b_{w,k} = (1/N) \sum_{i=1}^{N} b_{i,k}$.

A further discussion on well-diversified portfolios can be found in [4].
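The PCA factor representation and the residual-variance bound for a diversified portfolio can be checked numerically. The following is only a minimal sketch under stated assumptions: the function names `pca_factor_model` and `residual_variance_and_bound`, the simulated data, and the choice to leave the factors as unscaled principal component scores are all illustrative and are not taken from the article. The covariance is the sample covariance with divisor T, and `eigvals[K]` plays the role of the (K + 1)-th largest eigenvalue.

```python
import numpy as np


def pca_factor_model(Z, K):
    """Split excess returns Z (T observations x N assets) into K PCA factors.

    Returns (b0, B, F, E, eigvals):
      b0      : (N,)   intercepts b_{i,0} (sample means, since factors are centered)
      B       : (N, K) factor loadings b_{i,k}
      F       : (T, K) factor realizations f_k (unscaled principal component scores)
      E       : (T, N) residuals e_i
      eigvals : (N,)   eigenvalues of the sample covariance, largest first
    """
    Z = np.asarray(Z, dtype=float)
    b0 = Z.mean(axis=0)
    Zc = Z - b0                                   # centered excess returns
    cov = Zc.T @ Zc / Z.shape[0]                  # sample covariance (divisor T)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending order
    eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
    B = eigvecs[:, :K]                            # top-K eigenvectors as loadings
    F = Zc @ B                                    # project returns on the factors
    E = Zc - F @ B.T                              # what the K factors leave unexplained
    return b0, B, F, E, eigvals


def residual_variance_and_bound(w, E, eigvals, K):
    """Return Var[e_w] and the bound (K+1)-th eigenvalue times sum of w_i^2."""
    w = np.asarray(w, dtype=float)
    var_ew = float(np.var(E @ w))                 # residual variance of portfolio w
    bound = float(eigvals[K] * np.sum(w ** 2))    # eigvals[K] is the (K+1)-th largest
    return var_ew, bound


# Simulated example (hypothetical data): N assets driven by K common factors.
rng = np.random.default_rng(0)
T, N, K = 2000, 50, 3
true_factors = rng.normal(size=(T, K))
true_loadings = rng.normal(size=(N, K))
Z = 0.01 + true_factors @ true_loadings.T + 0.5 * rng.normal(size=(T, N))

b0, B, F, E, eigvals = pca_factor_model(Z, K)
w = np.full(N, 1.0 / N)                           # equally weighted portfolio
var_ew, bound = residual_variance_and_bound(w, E, eigvals, K)
b_w = w @ B                                       # portfolio loadings (1/N) * sum_i b_{i,k}

print(f"sum of squared weights: {np.sum(w ** 2):.4f}")
print(f"Var[e_w] = {var_ew:.6f}, bound = {bound:.6f}")
```

For the equally weighted portfolio the sum of squared weights is 1/N, so the bound shrinks as N grows, while the portfolio loadings `b_w` need not shrink. In finite samples the inequality can hold with equality, so the sketch should be read as illustrating a weak (less-than-or-equal) version of the bound.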
may explain most of the variation of the market. Then he tested the multibeta representation with these factors as reference variables.

Equilibrium APT

The CAPM implies that the market portfolio is mean–variance efficient. If the market portfolio is a well-diversified one, then it is spanned by the factors. In that case, we get that if the reference variables are the factors, then is small, which implies that the approximation error for each asset in the sequence is small. Connor [2] and Wei [14] derived a related result which is called equilibrium APT.

Arbitrage and APT

S measures the extent to which arbitrage in the mean-variance sense exists. It is equal to the maximal expected excess return per unit variance of portfolios in the $Z_i$. A large S can be interpreted as some form of no arbitrage. However it is not an arbitrage in the standard sense as there are examples in which S is finite and arbitrage exists. See Reisman [6].

Testability

It was pointed out by Shanken [11, 12] that an inequality of the type given in equation (7) is a tautology. That is, it is a mathematical statement and thus cannot be rejected.

Assume that we performed statistical tests that imply that the probability that the bound in equation (7) holds, is small. Then the only explanation can be that it was a bad sample. Since equation (7) is a tautology, there is no other explanation.

Nevertheless, this does not imply that the bound is not useful. The bound translates prior beliefs on the sizes of , S, and , into a prior belief on a bound on the size of the approximation error of each well-diversified portfolio.

The relationship between the sizes of , S, and , and the model assumptions is illustrated in the next section.

The APT Assumptions

The model is derived under assumptions on the extent to which there exists

1. a factor structure with K factors;
2. no mean–variance arbitrage;
3. nontrivial correlation between our set of reference variables and the first K factors in the PCA.

The parameters , S, and are measures of the extent to which each of the above assumptions holds. The larger it is, the larger is the extent to which the related assumption does not hold.

What this says is that the model translates our beliefs on the extent to which the model assumptions hold to a belief on a bound on the size of the approximation errors in pricing well-diversified portfolios.

Summary

The APT implies that each (approximate) well-diversified portfolio is (approximately) priced by a set of K reference variables.

What distinguishes this model from the K-factor CAPM is the set of reference variables that is implied by each of the models.

In the CAPM, the market portfolio is mean–variance efficient and its return must be equal to a linear function of the set of reference variables.

In contrast, in the APT, the reference variables are any set that is nontrivially correlated with the common factors of the returns and it may not span the mean–variance frontier.

End Notes

a. The cross-correlation matrix is nonsingular.
b. The excess return is the return minus the risk-free rate.

References

[1] Chamberlain, G. & Rothschild, M. (1983). Arbitrage, factor structure, and mean variance analysis on large asset markets, Econometrica 51, 1281–1304.
[2] Connor, G. (1984). A unified beta pricing theory, Journal of Economic Theory 34, 13–31.
[3] Huberman, G. (1982). A simple approach to arbitrage pricing, Journal of Economic Theory 28, 183–191.
[4] Ingersoll Jr J.E. (1984). Some results in the theory of arbitrage pricing, Journal of Finance 39, 1021–1039.
[5] Nawalkha, S.K. (1997). A multibeta representation theorem for linear asset pricing theories, Journal of Financial Economics 46, 357–381.
[6] Reisman, H. (1988). A general approach to the Arbitrage Pricing Theory (APT), Econometrica 56, 473–476.
[7] Reisman, H. (1992). Reference variables, factor structure, and the approximate multibeta representation, Journal of Finance 47, 1303–1314.
[8] Reisman, H. (2002). Some comments on the APT, Quantitative Finance 2, 378–386.
[9] Roll, R. & Ross, S.A. (1980). An empirical investigation of the arbitrage pricing theory, Journal of Finance 35, 1073–1103.
[10] Ross, S.A. (1976). The arbitrage theory of capital asset pricing, Journal of Economic Theory 13, 341–360.
[11] Shanken, J. (1982). The arbitrage pricing theory: is it testable? Journal of Finance 37, 1129–1140.
[12] Shanken, J. (1992). The current state of the arbitrage pricing theory, Journal of Finance 47, 1569–1574.
[13] Trzcinka, C. (1986). On the number of factors in the arbitrage pricing model, Journal of Finance 41, 347–368.
[14] Wei, K. & John, C. (1988). An asset-pricing theory unifying CAPM and APT, Journal of Finance 43, 881–892.

Related Articles

Capital Asset Pricing Model; Correlation Risk; Factor Models; Risk–Return Analysis; Ross, Stephen; Sharpe, William F.

HAIM REISMAN
Efficient Market Hypothesis^a

The topic of capital market efficiency plays a central role in introductory instruction in finance. After investigating the risk–return trade-off and the selection of optimal portfolios, instructors find it natural to go on to raise the question of what information is incorporated in the estimates of risk and expected return that underlie portfolio choices. Information that is “fully reflected” in security prices (and therefore in investors’ estimates of expected return and risk) cannot be used to construct successful trading rules, which are defined as those with an abnormally high expected return for a given risk. In contrast, information that is not fully reflected in security prices can be so used. Students appear to find this material plausible and intuitive, and this is the basis of its appeal. Best of all, the idea of capital market efficiency appears not to depend on the validity of particular models, implying that students can grasp the major ideas without wading through the details of finance models.

However, those who are accustomed to relying on formal models to discipline their thinking find that capital market efficiency has the disadvantage of its advantage: the fact that market efficiency is not grounded in a particular model (unlike, e.g., portfolio theory) means that it is not so easy to determine what efficiency really means. To see this, consider the assertion of Fama [8] that capital market efficiency can only be tested in conjunction with a particular model of returns. This statement implies that there exist two independent sources of restrictions on the data that are being tested jointly: the assumed model and market efficiency. Analysts who are used to deriving all restrictions being tested from the assumed model find this puzzling: what is the additional source of information that is separate from the model?

This question was not addressed clearly in the major expositions of market efficiency offered by its proponents. One way to resolve this ambiguity is to look at the empirical tests that are interpreted as supporting or contradicting market efficiency. Most of the empirical evidence that Fama [7] interpreted as supporting market efficiency is based on a particular model: expected returns conditional on some prespecified information set are constant. For example, return autocorrelatedness is evidence against market efficiency only if market efficiency is identified with constancy of expected returns. On this reading, the additional restriction implied by market efficiency might consist of the assumption that investors have rational expectations. The market model explains asset prices based on investors’ subjective perceptions of their environment; the assumption of rational expectations is needed to connect these subjective perceptions with objective correlations. Admittedly, it is pure conjecture to assume that proponents intend this identification of market efficiency with rational expectations: as Berk [1] pointed out, there is no mention of rational expectations in [7, 8].

In many settings, conditional expected returns are constant over time when agents are risk neutral. If agents are risk averse, expected returns will generally differ across securities, as is clear from the capital asset pricing model (see Capital Asset Pricing Model), and will change over time according to the realizations of the conditioning variables even in stationary settings [14, 19]. Hence, if investors are risk averse, the assumption of rational expectations will not generally lead to returns that are fair games.

Analysts who understood that constancy of expected returns requires the assumption of risk neutrality (or some other even more extreme assumption, such as that growth rates of gross domestic product are independently and identically distributed over time) were skeptical about the empirical evidence offered in support of market efficiency. From the fact that high-risk assets generate higher average returns than low-risk assets (or from the fact that agents purchase insurance even at actuarially unfavorable prices, or from a variety of other considerations), we know that investors are risk averse. If so, there is no reason to expect that conditional expected returns will be constant.

One piece of evidence offered in the 1970s, which appeared to contradict the consensus in support of market efficiency, had to do with the volatility of security prices and returns. If conditional expected returns are constant, then the volatility of stock prices depends entirely on the volatility of dividends (under some auxiliary assumptions, such as exclusion of bubbles). This observation led LeRoy and Porter [16] and Shiller [23] to suggest that bounds on the volatility of stock prices and returns can be derived from the volatility of dividends. These authors concluded that stock prices appear to be more
volatile than can be justified by the volatility of dividends. This finding corroborated the informal opinion (that was subsequently confirmed by Cutler et al. [6]) that large moves in stock prices generally cannot be convincingly associated with contemporaneous news that would materially affect expected future dividends.

Connecting the volatility of stock prices with that of dividends required a number of auxiliary econometric specifications. These were supplied differently by LeRoy–Porter and Shiller. However, both sets of specifications turned out to be controversial (see [9] for a survey of the econometric side of the variance-bounds tests). Some analysts, such as Marsh and Merton [20], concluded that the appearance of excess volatility was exactly what should be expected in an efficient market, although the majority opinion was that resolving the econometric difficulties reduces but does not eliminate the excess volatility [25].

It was understood throughout that the variance bounds were implications of the assumption that expected returns are constant. As noted, this was the same model that was implicitly assumed in the market efficiency tests summarized by Fama. The interest in the variance-bounds tests derived from the fact that the results of the two sets of tests of the same model appeared to be so different. In the late 1980s, there was a growing realization that small but persistent autocorrelations in returns could explain the excess volatility of prices [24]. This connection is particularly easy to understand if we employ the Campbell–Shiller log-linearization. Defining $r_{t+1}$ as the log stock return from $t$ to $t + 1$, $p_t$ as the log stock price at $t$, and $d_t$ as the log dividend level, we have

$$p_t \cong k + p_{dt} + p_{rt} \tag{1}$$

where $p_{dt}$ and $p_{rt}$ are given by

$$p_{dt} = E_t \sum_{j=1}^{\infty} \rho^{j} \left[(1 - \rho)\, d_{t+j}\right] \tag{2}$$

and

$$p_{rt} = -E_t \sum_{j=1}^{\infty} \rho^{j}\, r_{t+j} \tag{3}$$

(see [2–4]). Here, $k$ and $\rho$ are parameters associated with the log-linearization. Thus $p_{dt}$ and $p_{rt}$ capture price variations induced by expected dividend variations and expected return variations, respectively.

The attractive feature of the log-linearization is that expectations of future dividends and expectations of future returns appear symmetrically and additively in relation (1). Without the log-linearization, dividends would appear in the numerator of the present-value relation and returns in the denominator, rendering the analysis less tractable.

As noted, the market-efficiency tests of Fama and the variance bounds are implications of the hypothesis that $p_{rt}$ is a constant. If $p_{rt}$ is, in fact, random and positively correlated with $p_{dt}$, then the assumption of constancy of expected returns will bias the implied volatility of $p_t$ downward. Campbell and Shiller found that if averages of future returns are regressed on current stock prices, a significant proportion of the variation can be explained, contradicting the specification that expected returns are constant.

Campbell et al. noted that as economists came to understand the connection between return autocorrelatedness and price and return volatility, the variance-bounds results seemed less controversial:

  LeRoy and Porter [16] and Shiller [23] started a heated debate in the early 1980s by arguing that stock prices are too volatile to be rational forecasts of future dividends discounted at a constant rate. This controversy has since died down, partly because it is now more clearly understood that a rejection of constant-discount-rate models is not the same as a rejection of Efficient Capital Markets, and partly because regression tests have convinced many financial economists that expected stock returns are time-varying rather than constant ([2] p. 275).

This passage, in implying that the return autocorrelation results provide an explanation for excess stock price volatility, is a bit misleading. The log-linearized present-value relation (1) is not a theoretical model with the potential to explain price volatility. Rather, it is very close to an identity (the only respect in which equation (1) imposes substantive restrictions lies in the assumption that the infinite sums converge; this rules out bubbles). The Campbell–Shiller exercise amounts to decomposing price variation into dividend variation, return variation, and a covariance term and observing that the latter two terms are not negligible quantitatively. This, although useful, is a restatement of the variance-bounds result, not an explanation of it. Explaining excess volatility would involve accounting in economic terms for the fact that expected returns have the time structure that they do. Campbell and Shiller have not done
this, nor has anyone else. LeRoy–Porter’s conclusion from the variance-bounds tests was that we do not understand why asset prices move as they do. That conclusion is no less true now than it was when the variance-bounds results were first reported.

Fama’s assertion that market efficiency is testable, but only in conjunction with a model of market returns, can be given another reading. Rather than identifying market efficiency with the proposition that investors have rational expectations (alternatively, with the decision to model investors as having rational expectations), one can associate market efficiency with the proposition that asset prices behave as one would expect if security markets were entirely frictionless. In such markets, prices respond quickly to information, implying that investors cannot use publicly available information to construct profitable trading rules because that information is reflected in security prices as soon as it becomes available. In contrast, the presence of major frictions in asset markets is held to imply that prices may respond slowly to information. In that case, the frictions prevent investors from exploiting the resulting trading opportunities.

In the foregoing argument, it is presumed that trading frictions and transactions costs are analogous to adjustment costs. In the theory of investment, it is sometimes assumed that investment in capital goods induces costs that motivate firms to change quantities (in this case, physical capital) more slowly than they would otherwise. It appears natural to assume that prices are similar. For example, real estate prices are held to respond slowly to relevant information because the costs implied by the illiquidity of real estate preclude the arbitrages that would otherwise bring about rapid price adjustment.

Recent work on the valuation of assets in the presence of market frictions raises questions as to the appropriateness of the analogy between quantity adjustment and price adjustment. It is correct that, if prices respond slowly to information, investors may be unable to construct the trades that exploit the mispricing because of frictions. This, however, does not establish that markets clear in settings where prices adjust slowly. Equilibrium models that characterize asset prices in the presence of frictions suggest that in equilibrium prices respond quickly to shocks, just as in the absence of frictions. For example, Krainer [11] and Krainer and LeRoy [13] analyzed equilibrium prices of illiquid assets such as real estate in a model that accounts explicitly for illiquidity in terms of search and matching. In a similar setting, Krainer [12] introduced economy-wide shocks and found that, despite the illiquidity of real estate, prices adjust instantaneously to the shocks, just as in liquid markets.

A similar result was demonstrated by Lim [17]. He considered the determination of asset prices when short sales are restricted. Lintner [18] and Miller [21], among others, proposed that short sale restrictions cause securities to trade at higher prices than they would otherwise. This is held to occur because investors with negative information may be unable to trade based on their information, whereas those with positive information can buy without restriction. Empirical evidence is held to support this result [5, 10, 22]. Lim showed that this outcome will not occur if investors have rational expectations about the extent of short sales restrictions. Under rational expectations, prices in Lim’s model follow a martingale under the natural probabilities (reflecting assumed risk neutrality), just as they would in the absence of short sales restrictions.

These results were derived in settings that imposed strong restrictions, and it is not clear how general they are. However, the preliminary conclusion is that if market efficiency is defined as the absence of frictions, empirical evidence of quick adjustment of prices to information cannot necessarily be interpreted as supporting market efficiency, since that outcome would occur in the presence of frictions.

It could be objected that none of these considerations supports distinguishing between the implications of an asset pricing model and market efficiency, however defined. All testable restrictions are derived from an assumed model; so, the question is, what can be gained by identifying some of these restrictions with something called market efficiency? This is particularly debatable, given the ambiguity in the usage of this term now. Berk [1] suggested dropping the term “market efficiency” from financial economics, and this might be the best course.

End Notes

a. An evaluation of the idea of capital market efficiency has been presented elsewhere [15]. In this essay, repetition of material found there has been avoided as much as possible.
References

[1] Berk, J. (2007). A Critique of the Efficient Capital Markets Hypothesis. Reproduced, Haas School of Business, University of California, Berkeley.
[2] Campbell, J.Y., Lo, A.W. & MacKinlay, A.C. (1996). The Econometrics of Financial Markets, Princeton University Press, Princeton, NJ, 275.
[3] Campbell, J.Y. & Shiller, R.J. (1988). The dividend-price ratio and expectations of future dividends and discount factors, Review of Financial Studies 1, 195–228.
[4] Campbell, J.Y. & Shiller, R. (1988). Stock prices, earnings, and expected dividends, Journal of Finance 43, 661–676.
[5] Cheng, J.W., Chang, E.C. & Yu, Y. (2007). Short-sales constraints and price discovery: evidence from the Hong Kong market, Journal of Finance 62(5), 2097–2121.
[6] Cutler, D., Poterba, J. & Summers, L. (1989). What moves stock prices? Journal of Portfolio Management 15, 4–12.
[7] Fama, E.F. (1970). Efficient capital markets: a review of theory and empirical work, Journal of Finance 25, 383–417.
[8] Fama, E.F. (1991). Efficient capital markets: II, Journal of Finance 46, 1575–1617.
[9] Gilles, C. & LeRoy, S.F. (1991). Econometric aspects of the variance-bounds tests: a survey, Review of Financial Studies 4, 753–791.
[10] Jones, C. & Lamont, O. (2002). Short-sale constraints and stock returns, Journal of Financial Economics 66, 207–239.
[11] Krainer, J. (1997). Pricing Illiquid Assets with a Matching Model. Reproduced, University of Minnesota.
[12] Krainer, J. (2001). A theory of liquidity in residential real estate markets, Journal of Urban Economics 13, 32–53.
[13] Krainer, J. & LeRoy, S.F. (2002). Equilibrium valuation of illiquid assets, Economic Theory 19, 223–242.
[14] LeRoy, S.F. (1973). Risk aversion and the martingale model of stock prices, International Economic Review 14, 436–446.
[15] LeRoy, S.F. (1989). Efficient capital markets and martingales, Journal of Economic Literature 27, 1583–1621.
[16] LeRoy, S.F. & Porter, R.D. (1981). The present value relation: tests based on implied variance bounds, Econometrica 49, 555–574.
[17] Lim, B. (2007). Short-sales Constraints and Price Bubbles. Reproduced, University of California, Santa Barbara.
[18] Lintner, J. (1969). The aggregation of investors’ diverse judgments and preferences in purely competitive security markets, Journal of Financial and Quantitative Analysis 4(4), 347–400.
[19] Lucas, R.E. (1978). Asset prices in an exchange economy, Econometrica 46, 1429–1445.
[20] Marsh, T.A. & Merton, R.C. (1986). Dividend variability and variance bounds tests for the rationality of stock market prices, American Economic Review 76, 483–498.
[21] Miller, E.M. (1977). Risk, uncertainty, and divergence of opinion, Journal of Finance 32(4), 1151–1168.
[22] Ofek, E. & Richardson, M. (2003). Dotcommania: the rise and fall of internet stock prices, Journal of Finance 58(3), 1113–1137.
[23] Shiller, R.J. (1981). Do stock prices move too much to be justified by subsequent changes in dividends? American Economic Review 71, 421–436.
[24] Summers, L. (1986). Does the stock market rationally reflect fundamental values? Journal of Finance 41, 591–600.
[25] West, K.D. (1988). Bubbles, fads and stock price volatility: a partial evaluation, Journal of Finance 43, 636–656.

Related Articles

Expectations Hypothesis; Predictability of Asset Prices; Risk Aversion; Risk Premia; Transaction Costs.

STEPHEN F. LEROY
Expectations Hypothesis

    If the attractiveness of an economic hypothesis is measured by the number of papers which statistically reject it, the expectations theory of the term structure is a knockout [43].

The term expectations hypothesis (EH) stands for numerous statements that link yields, returns on bonds, and forward rates of different maturities and periods. The EH has been the basis of empirical and theoretical work in fixed income following the work of Macaulay [54]. These hypotheses were developed for understanding the returns and yields on long- versus short-term bonds and the time series movements of the term structure. The literature distinguishes between the pure expectations hypothesis (PEH), which postulates that (i) expected excess returns on long-term over short-term bonds are zero, (ii) yield term premia are zero, or (iii) forward term premia are zero, from the EH, which postulates that (i) expected excess returns are constant over time, (ii) yield term premia are constant, or (iii) forward term premia are constant over time.

We review the literature related to the EH. We present the different forms of both the PEH and the less strong EH. We show that their mathematical expressions depend on the researchers’ choice of model (continuous time versus discrete time) and their choice of frequency of compounding returns: continuous (log-return) versus discrete (simple return). Depending on these choices, we may or may not have equivalence among the several forms of the (pure) EH. In addition, we examine which of the statements can be derived from a no-arbitrage general equilibrium model. Lastly, we present empirical evidence against the EH mainly from the US data, and the less strong rejection of the hypotheses when using non-US data.

Notation

To formulate the different forms of the EH, we need to introduce the basic fixed income assets and concepts associated with them. Even though all the empirical research is done using discrete time models, the theoretical literature predominantly uses continuous time models mainly for tractability and simplicity reasons. In comparing the expected returns on two bonds of different maturities, however, the returns may be compounded in any of four natural ways: continuously, to the shorter bond’s maturity, to the longer bond’s maturity, or to the nearest available future date. For these reasons, in the following, we introduce notation that is flexible enough to accommodate the description of discrete as well as continuous time models and all possible ways that compounding may take place.

A zero-coupon bond or discount bond, the simplest fixed income security, promises a single fixed payment at a specified date in the future known as the maturity date. The size of this payment is called the face value of the bond. Treasury bills, which are bonds issued by the US government with maturities up to a year, are an example of such securities.

We denote the price of a zero-coupon bond that matures $\tau \in \mathbb{R}_+$ periods from now and pays 1 unit at maturity by $P_t^{(\tau)}$. We denote the yield to maturity, compounded once per period, for this zero-coupon bond by $Y_t^{(\tau)}$. Then prices and yields are connected through the following equation:

$$ P_t^{(\tau)} = \frac{1}{\left(1 + Y_t^{(\tau)}\right)^{\tau}} \qquad (1) $$

It is common in the empirical finance literature to work with log or continuously compounded variables. This has the usual advantage of linearizing exponential affine equations that arise frequently in asset pricing and of defining comparable yield values independent of the remaining horizon value τ. Using lowercase letters for logs, the relationship between log yield and log price is

$$ y_t^{(\tau)} = -\frac{p_t^{(\tau)}}{\tau} \qquad (2) $$

The collection of all these yields for different maturities is called the zero term structure of interest rates.

Buying this bond at time t and reselling it at time t + s generates a holding period return of

$$ R_{t \to t+s}^{(\tau)} = \frac{P_{t+s}^{(\tau-s)}}{P_t^{(\tau)}} = \frac{\left(1 + Y_t^{(\tau)}\right)^{\tau}}{\left(1 + Y_{t+s}^{(\tau-s)}\right)^{\tau-s}} \qquad (3) $$

and a log holding period return of

$$ r_{t \to t+s}^{(\tau)} = p_{t+s}^{(\tau-s)} - p_t^{(\tau)} = s\, y_t^{(\tau)} - (\tau - s)\left(y_{t+s}^{(\tau-s)} - y_t^{(\tau)}\right) \qquad (4) $$
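As a hedged illustration of equations (1) to (4), the following minimal Python sketch computes a zero-coupon bond price from its yield, the log yield, and the gross and log holding period returns. The function names and the numbers (a five-period bond bought at a 5% yield and sold one period later at a 6% yield) are assumptions made for illustration only; they do not come from the article.

```python
import math


def price_from_yield(Y, tau):
    """Equation (1): price of a zero-coupon bond paying 1 unit in tau periods."""
    return 1.0 / (1.0 + Y) ** tau


def log_yield(P, tau):
    """Equation (2): log (continuously compounded) yield, y = -ln(P) / tau."""
    return -math.log(P) / tau


def holding_period_return(Y_now, tau, Y_later, s):
    """Equation (3): gross return from buying at t and reselling at t + s."""
    if not 0 < s <= tau:
        raise ValueError("holding period s must satisfy 0 < s <= tau")
    return price_from_yield(Y_later, tau - s) / price_from_yield(Y_now, tau)


def log_holding_period_return(y_now, tau, y_later, s):
    """Equation (4): log return expressed through log yields."""
    return s * y_now - (tau - s) * (y_later - y_now)


# Hypothetical inputs: 5-period bond, yield moves from 5% to 6% after 1 period.
Y_now, Y_later, tau, s = 0.05, 0.06, 5, 1

R = holding_period_return(Y_now, tau, Y_later, s)
y_now = log_yield(price_from_yield(Y_now, tau), tau)
y_later = log_yield(price_from_yield(Y_later, tau - s), tau - s)
r = log_holding_period_return(y_now, tau, y_later, s)

print("gross holding period return:", R)
print("log holding period return:  ", r)
```

With these inputs, the logarithm of the gross return from equation (3) agrees with the log return from equation (4), which is simply a check that the two expressions are consistent with each other.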
Clearly, the holding period s cannot exceed the time and the two measures and () are connected
to maturity τ , s ≤ τ . The above equation shows through the Radon–Nikodym derivative
that the holding period return on a zero-coupon
bond is not known at time t unless the holding
d() 1 T
period coincides with the lifetime of the bond. In = ξT = exp − (s) (s)ds−
d 2 0
this case, the holding period return is the yield to T
maturity. Otherwise, the return is a random variable
(s)ds (9)
that depends on the future evolution of yields. 0
Even though returns are unknown, bonds can be
combined to guarantee a fixed interest rate on an This gives rise to the following pricing equation
investment to be made in the future; the interest under both measures:
rate on this investment is called a forward rate. The T
forward and log forward rates guaranteed at time t M(T ) − r(s)ds
S(t) = Ɛt
S(T ) = Ɛt e t S(T )
for an investment made at time t + s until time t + τ M(t)
where s ≤ τ are given as (10)
These hypotheses were developed for understanding the returns and yields on long- versus short-term bonds, and the time series movements of the term structure. Later, researchers developed theoretical models that give rise to some of the hypothesized equations associated with the EH [20, 21, 27, 57].

The literature distinguishes between the PEH, which postulates that (i) expected excess returns on long-term over short-term bonds are zero, (ii) yield term premia are zero, or that (iii) forward term premia are zero, from the EH, which postulates that (i) expected excess returns are constant over time, (ii) yield term premia are constant, or (iii) forward term premia are constant over time. In the following, all the forms of the PEH in discrete time with discrete and continuous compounding and in continuous time (continuous compounding) are presented. We will see that the PEH expressions derived in all these models are not equivalent across models as well as within each model.

Pure Expectations Hypothesis in Discrete Time

Discrete Compounding

The first form of the PEH equates the expected returns on one-period (short-term) and n-period (long-term) bonds, or equivalently, expected excess returns on long-term over short-term bonds are zero:

$$ 1 + Y_t^{(1)} = E_t\left[1 + R_{t \to t+1}^{(n)}\right] = \left(1 + Y_t^{(n)}\right)^{n} \cdot E_t\left[\left(1 + Y_{t+1}^{(n-1)}\right)^{-n+1}\right] \qquad (12) $$

The second form of the PEH equates the n-period expected returns on the one-period and n-period bonds, or equivalently, yield term premia are zero (see end note b):

$$ \left(1 + Y_t^{(n)}\right)^{n} = E_t\left[\left(1 + Y_t^{(1)}\right)\left(1 + Y_{t+1}^{(1)}\right) \cdots \left(1 + Y_{t+n-1}^{(1)}\right)\right] \qquad (13) $$

The third form of the PEH equates the expected future one-period spot rate with the current forward rate of that future period, or equivalently, forward term premia are zero (see end note c):

$$ 1 + F_t^{(n-1,n)} = \frac{\left(1 + Y_t^{(n)}\right)^{n}}{\left(1 + Y_t^{(n-1)}\right)^{n-1}} = E_t\left[1 + Y_{t+n-1}^{(1)}\right] \qquad (14) $$

The last form of the PEH equates the n-period bond return with the one-period bond and n − 1 period bond:

$$ \left(1 + Y_t^{(n)}\right)^{n} = \left(1 + Y_t^{(1)}\right) E_t\left[\left(1 + Y_{t+1}^{(n-1)}\right)^{n-1}\right] \qquad (15) $$

Even though the above expressions describe different forms of the PEH, they are not mutually equivalent. Assuming that the above expressions are true for all t and n, it can be shown that (i) equation (13) is equivalent to equation (15), (ii) equation (14) implies equation (13) (therefore equation (15)), but the opposite is not true unless we make the additional assumption that $\{1 + Y_{t+j}^{(1)}\}_{j=1}^{\infty}$ are uncorrelated with each other, and (iii) equations (12) and (15) are inconsistent, because the expected value of the inverse of a random variable is not in general equal to the inverse of its expected value.

To summarize, the PEH cannot hold in both its one-period form and its n-period form, and, essentially there are three different (competing) forms of the PEH in discrete time, the excess return expression (12), the yield premia expression (13), and the forward premia expression (14).

Imposing more structure in the term structure model by assuming that the interest rate is lognormal and homoscedastic, we can quantify the effect of Jensen’s inequality. Under this additional assumption, the excess one-period bond returns under the different hypotheses can be shown to be of order $\tfrac{1}{2}\mathrm{Var}[r_{t \to t+1}^{(n)} - y_t^{(1)}]$. Therefore, the difference between the one-period excess bond returns of different PEH forms is $\mathrm{Var}[r_{t \to t+1}^{(n)} - y_t^{(1)}]$. Using sample means and standard deviations, we can get an estimate and a standard error of the above quantity. This magnitude is very small for short-term bonds and becomes significant only for long-term bonds; hence, the differences between different forms of the PEH are small except for very long term zero-coupon bonds. Thus, the data reject all forms of the PEH at the short end, but reject no forms of the PEH at the long end of the
term structure. In this sense, the distinction between the different forms of the PEH is not critical for evaluating this hypothesis.

Continuous Compounding

Most empirical research, though, uses none of the above PEH forms, but a log form of them. Once the PEH is formulated in logs, all the forms of the PEH become equivalent. Using log returns, the counterparts of equations (12), (13), and (14) are

$$ y_t^{(1)} = E_t\left[r_{t \to t+1}^{(n)}\right] \qquad (16) $$

$$ y_t^{(n)} = \frac{1}{n} \sum_{i=0}^{n-1} E_t\left[y_{t+i}^{(1)}\right] \qquad (17) $$

$$ f_t^{(\tau-1,\tau)} = E_t\left[y_{t+\tau-1}^{(1)}\right] \qquad (18) $$

The empirical literature uses equations (17) and (18) in order to construct two related notions of term premia that have played a prominent role in the literature of expected bond returns: the yield premium,

$$ c_t^{(n)} \equiv y_t^{(n)} - \frac{1}{n} \sum_{i=0}^{n-1} E_t\left[y_{t+i}^{(1)}\right] \qquad (19) $$

and the forward term premium,

$$ p_t^{(n)} \equiv f_t^{(n,n+1)} - E_t\left[y_{t+n}^{(1)}\right] \qquad (20) $$

Derivations of PEH- and EH-tested formulas follow below.

Pure Expectations Hypothesis in Continuous Time

Cox et al. [27] restate the PEH forms in continuous time and prove that the different forms are incompatible. The equivalent of expression (12) in continuous time is created by assuming that the holding period is the shortest possible, that is, infinitesimal. In this case, the PEH takes the following form:

$$ \frac{E_t\left[dP_t^{(\tau)}\right]}{P_t^{(\tau)}} = r(t)\, dt \qquad (21) $$

This expression states that all bonds have the same expected infinitesimal return, equal to the short-term interest rate. However, the above expression is also known to hold under the risk-neutral measure: under that measure, all assets have the same expected return, equal to the risk-free rate. This implies that this form of the PEH postulates that the risk-neutral and physical measures coincide. The continuous time equivalent of expression (13) is

$$ \frac{1}{P_t^{(\tau)}} = E_t\left[e^{\int_t^{t+\tau} r(s)\, ds}\right] \qquad (22) $$

This statement equates the guaranteed return from holding any zero-coupon bond to maturity with the total return expected from rolling over a series of short-term period bonds. The continuous time equivalent of equation (14) is

$$ \frac{-\partial P_t^{(\tau)} / \partial \tau}{P_t^{(\tau)}} = E_t\left[r(t + \tau)\right] \qquad (23) $$

The left-hand side of the equation is the current infinitesimal forward rate at time t + τ, and the right-hand side is the expected future spot rate at t + τ. Integrating the last equation and applying the boundary condition $P_t^{(0)} = 1$ gives

$$ -\ln\left[P_t^{(\tau)}\right] = \int_t^{t+\tau} E_t\left[r(s)\right] ds \qquad (24) $$

Formulating the PEH in continuous time makes the pairwise incompatibility of equations (21), (22), and (24) transparent. If we define the random variable $\tilde{X} \equiv \exp\left(-\int_t^{t+\tau} r(s)\, ds\right)$, then these equations can be rewritten as

$$ P = E_t[\tilde{X}] \qquad (25) $$

$$ P^{-1} = E_t[\tilde{X}^{-1}] \qquad (26) $$

$$ \ln(P) = E_t[\ln \tilde{X}] \qquad (27) $$

By invoking Jensen’s inequality, which gives $\left(E_t[\tilde{X}^{-1}]\right)^{-1} \le \exp\left(E_t[\ln \tilde{X}]\right) \le E_t[\tilde{X}]$, one can show that the yields to maturity implied from equations (21), (22), and (24) satisfy the relationship (with some abuse of notation):

$$ y_t^{(\tau)}(21) \le y_t^{(\tau)}(24) \le y_t^{(\tau)}(22) \qquad (28) $$

In this model, it is also easy to see that the expected excess returns are positive in all hypotheses except in equation (21).

Perhaps the most influential result of Cox et al. [27] is the characterization of the PEH forms that can be the result of a (no-arbitrage) equilibrium model.
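As a small numerical aside on equations (25) to (27), the sketch below evaluates the three bond prices, and the yields they imply, for a discrete distribution of the integrated short rate. The two-point distribution, the horizon, and the function name are hypothetical choices made only to make Jensen's inequality visible; this is an illustration under those assumptions, not a calculation from Cox et al. [27].

```python
import math


def implied_yields(integrated_rates, probs, tau):
    """Yields to maturity implied by equations (25), (26) and (27).

    integrated_rates: possible values of the integral of r(s) over [t, t + tau]
    probs: probabilities of those values (hypothetical example data below)
    Returns the tuple (y25, y26, y27).
    """
    if abs(sum(probs) - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to one")
    # X is the stochastic discount factor exp(-integral of r) in each state.
    X = [math.exp(-z) for z in integrated_rates]
    price_25 = sum(p * x for p, x in zip(probs, X))                       # P = E[X]
    price_26 = 1.0 / sum(p / x for p, x in zip(probs, X))                 # 1/P = E[1/X]
    price_27 = math.exp(sum(p * math.log(x) for p, x in zip(probs, X)))   # ln P = E[ln X]

    def to_yield(price):
        return -math.log(price) / tau

    return to_yield(price_25), to_yield(price_26), to_yield(price_27)


# Hypothetical two-state example: the integrated rate is 2% or 10% with equal odds.
y25, y26, y27 = implied_yields([0.02, 0.10], [0.5, 0.5], tau=2.0)
print("yield implied by (25):", y25)
print("yield implied by (27):", y27)
print("yield implied by (26):", y26)
```

When the integrated rate is random, the three yields differ, with the yield from (25) the lowest and the yield from (26) the highest; when the distribution is degenerate (no uncertainty), all three coincide.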
They examine whether there exist pricing kernels (i.e., prices of risk) that can satisfy the resulting pricing PDE in the economy and at the same time satisfy the form of the PEH under examination. They conclude that only equation (21) can be sustained by an equilibrium model. By definition, equation (21) implies that the risk-neutral and physical measures coincide; therefore, setting the price of risk to zero gives rise to a valid pricing kernel. Cox et al. [27] prove that the other forms do not give rise to a valid pricing kernel. However, McCulloch [57] later showed that their claim is incorrect. By generalizing a preexisting discrete time model to continuous time, he shows that there exists an equilibrium economy that also gives rise to equation (23).

Tests of the Expectations Hypothesis

The EH has been under scrutiny at least since the work of Macaulay [54]. In this study, Macaulay emphasizes the low (given the EH is true) correlation between forward rates and subsequent spot rates. Since then, the EH has been tested in hundreds of studies, and in all of them, with only a few exceptions, it has been rejected. Some of the early papers that test the EH are those of Sutch [74], Shiller [69, 70, 71], Modigliani and Shiller [59], and Sargent [67, 68]. Fama [39, 40] and Fama and Bliss [41] also present challenges of the EH where they find evidence of rich patterns of variation in expected returns across time and maturities. Keim and Stambaugh [50], Fama and French [42], and Campbell and Ammer [23] show that yield spreads help to forecast excess return on bonds as well as on other long-term assets.

Perhaps the most widely cited tests of the EH are the Campbell and Shiller [24] regressions based on the equations:

$$ E_t\left[y_{t+m}^{(n-m)} - y_t^{(n)}\right] = \alpha_{nm} + \frac{m}{n-m}\left(y_t^{(n)} - y_t^{(m)}\right) \qquad (29) $$

where $c_t^{(n)}$ and $p_t^{(n)}$ are the yield and forward premia defined in equations (19) and (20), respectively. The last expression implies that if the PEH holds (i.e., $c_t^{(n)} = 0$, $p_t^{(n)} = 0$) then the expected excess returns are zero, whereas if the EH holds (i.e., $c_t^{(n)} = c(n)$, $p_t^{(n)} = p(n)$), then the expected excess returns are constants that depend on the time to maturity n. Combining equations (31) and (32) gives rise to equation (30), the well-known LPY regressions of Dai and Singleton [30].

Campbell and Shiller [24] and Dai and Singleton [30], among others, document the failure of both the
regressions (31) and (32), which are true under the EH. According to these equations, the coefficients of the nonconstant terms when regressing $y_{t+m}^{(n-m)} - y_t^{(n)}$ onto $\frac{m}{n-m}\left(y_t^{(n)} - y_t^{(m)}\right)$, or $\frac{1}{n-1}\left(y_t^{(n)} - r_t\right)$ if m = 1, should be equal to unity. Not only are the estimated coefficients not unity but also they are often statistically significantly negative, particularly for large n. This means that the EH fails more significantly for long-term bonds. The intuition of the EH is that increases in the slope of the term structure $(y_t^{(n)} - r_t)$ reflect expectations of rising future short spot rates. For the “buy an n-period bond and hold it to maturity” investment strategy to match, on average, the returns from rolling over short rates in a rising short rate environment, the price of the long bond should decrease such that the yield increases ($y_{t+1}^{(n-1)} > y_t^{(n)}$). The regression results suggest that the slope of the yield curve does not even forecast correctly the direction of the changes in the long-bond yields. The underreaction of long rates to term spread changes has also been studied in [56]. Elaborating and further documenting this underreaction, Campbell [22] finds that yield spreads do not forecast short-run changes in long yields (against EH) but do forecast long-run changes in short yields (consistent with EH).

Backus et al. [7] tested the EH by running regressions based on analogous equations for the forward rates:

$$ E_t\left[f_{t+1}^{(n-1,n)} - r_t\right] = \alpha + \left(f_t^{(n-1,n)} - r_t\right) \qquad (33) $$

They also find that the regression coefficients of $(f_t^{(n-1,n)} - r_t)$ are not unity as the null hypothesis sets, but slightly less than one and significantly different from one. They also show that the small differences of the estimated coefficients from unity in the above regressions do not constitute separate or weaker findings against the EH than the deviations of the Campbell–Shiller coefficients from unity, but are actually the same. They constructed a one-factor model and showed that the small differences in the coefficients of the Backus et al. [7] regressions from their null value translate into large negative values for the Campbell–Shiller coefficients.

Similar to the above forward regressions are the forward term regressions tested in [43, 72]. Shiller et al. [72] use a log-linearized model [70, 71] that allows them to test several models of the EH without having to use discount rates (which are easily available for maturities up to a year, but for longer maturities the rates have to be constructed using spline methods) but with the use of the easily observable coupon-bond yields. One of their regressions is based on the EH equation:

$$ E_t\left[y_{t+m}^{(n-m)} - y_t^{(m)}\right] = \alpha_{n,m} + \left(f_t^{(m,n)} - y_t^{(m)}\right) \qquad (34) $$

Using coupon-bond yields does not change the result that future bond yield changes cannot be predicted by the current term spread or forward spread. Still, even the direction cannot be predicted correctly. They suggest time-varying risk premia as a plausible solution to the failure of the EH. Froot [43] also tests the same equation, trying to understand whether its failure is the result of time-varying term premia or of expected future spot rates under- or overreacting to changes in short rates. He finds that for short maturities its failure is due to variation in term premia, but this is not true for long maturities.

A recent paper that has received a lot of attention is Cochrane and Piazzesi [26]. Cochrane and Piazzesi [26] have revisited the forecasting regressions of Fama and Bliss using the term structure of forward rates instead of a single forward rate. Their most notable finding is that the coefficients from regressing excess bond returns over one year onto the one-year forward rates for the next five years exhibit a tentlike shape for all maturity bonds. The tentlike shape similarities for all bond maturities suggest that a single common factor may be underlying the predictability of excess bond returns for all maturities. Another very interesting fact in [26] is the high $R^2$ values generated in the above regressions. The $R^2$ values range between 36% and 39%. This is substantially more predictability than in [41] using a single forward factor.

A series of other papers have also examined alternative reasons for the failure of the EH: Mankiw and Miron [55] find that interest rate movements were more predictable before the founding of the Federal Reserve in 1913, and the downward bias appears to be smaller in that period. Campbell and Ammer [23] emphasize that long-term bond yields vary primarily in response to changing expected inflation. Rudebusch [65] argues that contemporary Federal Reserve operating procedures lead to predictable interest rate movements in the very short run and the very long run, but tend to smooth away predictable movements in the medium run. Balduzzi et al. [8]
argue that spreads between short-term rates and the overnight federal reserve funds rate are mainly driven by expectations of changes in the target, and not by the transitory dynamics of the overnight rate around the target. Hence, the bias in tests of the EH that they document can be mainly attributed to the erroneous anticipation of future changes in monetary policy.

Several studies have examined the small-sample bias of the regression coefficients. Bekaert and Hodrick [10] argue that the past use of large sample critical regions, instead of their small-sample counterparts, may have overstated the evidence against the expectations theory. They find that the evidence against the EH for these interest rates and exchange rates is much less strong than under asymptotic inference. Other studies, though, such as Backus et al. [7] and Bekaert, Hodrick, and Marshall [11], find that the small-sample properties of the regressions like the ones shown in this article are actually biased upward; this means the true Campbell–Shiller coefficients are more negative than the ones estimated in the regressions, heightening the puzzle related with the failure of the EH.

Researchers have also looked at the validity of the EH outside the United States. The tendency in these studies is to find Campbell–Shiller coefficients that are less than zero but less negative than the US data results. Some of those studies that are done primarily for European countries and show mixed results are Bekaert and Hodrick [10], Boero and Torricelli [14], Evans [37], Gerlach and Smets [44], Hardouvelis [48], and Kugler [51].

Conclusion

The EH constitutes several hypotheses that were generated to understand bond returns and their yields through the help of other maturity bond returns or investment strategies and forward rates. We showed that these hypotheses can be formulated in many different ways and using different models (continuous time vs discrete and continuous compounding vs discrete). The different hypotheses are not equivalent. Therefore, to test the validity of the EH, numerous different expressions have to be tested.

The consensus is that the EH fails in the US data. Its failure, though, is less strong or mixed for the non-US data. Researchers challenging and trying to understand the failure of the EH have hinted at different reasons that may give rise to time-varying expected excess returns on bonds. Even though the understanding of the failure of the EH is not complete, part of the literature is devoted to creating models that better capture this failure and that better replicate the data.

In this strand we can put the papers of reduced form (affine or nonaffine, with or without macrovariables) term structure models, such as Ahn et al. [1], Ang and Bekaert [2], Bansal and Zhou [9], Bikbov and Chernov [12, 13], Buraschi et al. [17], Dai and Singleton [29, 30, 32], Diebold et al. [33], Duarte [34], Evans [38], Leippold and Wu [52], and Naik and Lee [59], and those of the structural form models with or without macrovariables, such as Ang et al. [3, 6], Ang and Piazzesi [4, 5], Brandt and Wang [15], Buraschi and Jiltsov [18, 19], Dai [28], Greenwood and Vayanos [45], Guibaud et al. [46], Piazzesi [62], Piazzesi and Schneider [63], Rudebusch and Wu [65, 66], Vayanos and Vila [75], and Wachter [76].

Acknowledgments

The author thanks Aggie Moon for providing research assistantship. The author takes the responsibility for errors if any.

End Notes

a. Cochrane [25], Dai and Singleton [31], Duffie [35], Nielsen [60], Piazzesi [61], Singleton [73].
b. In the EH literature the term yield premium is used to denote the difference of the nth root of the terms in the left- and right-hand side of equation (13).
c. In the EH literature the term forward premium is used to denote the difference of the terms in the left- and right-hand side of equation (14).

References

[1] Ahn, D.-H., Dittmar, R.F. & Gallant, A.R. (2002). Quadratic term structure models: theory and evidence, Review of Financial Studies 15, 243–288.
[2] Ang, A. & Bekaert, G. (2002). Regime switches in interest rates, Journal of Business and Economic Statistics 20, 163–182.
[3] Ang, A., Dong, S. & Piazzesi, M. (2007). No-Arbitrage Taylor Rules, National Bureau of Economic Research.
[4] Ang, A. & Piazzesi, M. (2003a). A no-arbitrage vector autoregression of term structure dynamics with macroeconomic and latent variables, Journal of Monetary Economics 50, 745–787.
[5] Ang, A. & Piazzesi, M. (2003b). A no-arbitrage vector autoregression of term structure dynamics with macroeconomic and latent variables, Journal of Monetary Economics 50, 745–787.
[6] Ang, A., Piazzesi, M. & Wei, M. (2006). What does the yield curve tell us about the GDP growth? Journal of Econometrics 131, 359–403.
[7] Backus, D., Foresi, S., Mozumbar, A. & Wu, L. (2001). Predictable changes in yields and forward rates, Journal of Financial Economics 59, 281–311.
[8] Balduzzi, P., Bertola, G. & Foresi, S. (1997). A model of target changes and the term structure of interest rates, Journal of Monetary Economics 39, 223–249.
[9] Bansal, R. & Zhou, H. (2002). Term structure of interest rates with regime shifts, Journal of Finance 57, 1997–2044.
[10] Bekaert, G. & Hodrick, R.J. (2001). Expectations hypothesis tests, Journal of Finance 56, 1357–1394.
[11] Bekaert, G., Hodrick, R.J. & Marshall, D.A. (1997). On biases in tests of the expectations hypothesis of the term structure of interest rates, Journal of Financial Economics 44, 309–348.
[12] Bikbov, R. & Chernov, M. (2005). Term Structure and Volatility: Lessons from the Eurodollar Futures and Options. Working Paper, London Business School, London.
[13] Bikbov, R. & Chernov, M. (2006). No-Arbitrage Macroeconomic Determinants. Working Paper, London Business School.
[14] Boero, G. & Torricelli, C. (1997). The Expectations Hypothesis of the Term Structure: Evidence for Germany. Working Paper CRENoS 1997/4. Centre for North South Economic Research, University of Cagliari and Sassari, Sardinia, revised.
[15] Brandt, M.W. & Wang, K.Q. (2003). Time-varying risk aversion and unexpected inflation, Journal of Monetary Economics 50, 1457–1498.
[16] Breeden, D. (1986). Consumption, production and interest rates: a synthesis, Journal of Financial Economics 7, 265–296.
[17] Buraschi, A., Cieslak, A. & Trojani, F. (2007). Correlation Risk and the Term Structure of Interest Rates. Working Paper, Imperial College, U.K.
[18] Buraschi, A. & Jiltsov, A. (2005). Inflation risk premia and the expectations hypothesis, Journal of Financial Economics 75, 429–490.
[19] Buraschi, A. & Jiltsov, A. (2007). Term structure of interest rates implications of habit persistence, Journal of Finance 62, 3009–3063.
[20] Campbell, J.Y. (1986a). Bond and stock returns in a simple exchange model, Quarterly Journal of Economics 101, 785–803.
[21] Campbell, J.Y. (1986b). A defense of traditional hypotheses about the term structure of interest rates, Journal of Finance 41, 183–193.
[22] Campbell, J.Y. (1995). Some lessons from the yield curve, Journal of Economic Perspectives 9, 129–152.
[23] Campbell, J.Y. & Ammer, J. (1993). What moves the stock and bond markets? A variance decomposition for long-term asset returns, Journal of Finance 48, 3–37.
[24] Campbell, J.Y. & Shiller, R.J. (1991). Yield spreads and interest rate movements: A Bird’s eye view, Review of Economic Studies 58, 495–514.
[25] Cochrane, J. (2000). Asset Pricing, Princeton University Press, Princeton.
[26] Cochrane, J. & Piazzesi, M. (2005). Bond risk premia, American Economic Review 95, 138–160.
[27] Cox, J.C., Ingersoll, J.C. & Ross, S.A. (1981). A re-examination of traditional hypotheses about the term structure of interest rates, Journal of Finance 36, 769–799.
[28] Dai, Q. (2003). Term Structure Dynamics in a Model with Stochastic Internal Habit. Working Paper, New York University.
[29] Dai, Q. & Singleton, K. (2000). Specification analysis of affine term structure models, Journal of Finance 55, 1943–1978.
[30] Dai, Q. & Singleton, K. (2002). Expectations puzzles, time-varying risk premia, and affine models of the term structure, Journal of Financial Economics 63, 415–442.
[31] Dai, Q. & Singleton, K. (2003a). Fixed-income pricing, in Handbook of Economics and Finance, C. Constantinides, M. Harris & R. Stulz, eds, North Holland, Amsterdam.
[32] Dai, Q. & Singleton, K. (2003b). Term structure dynamics in theory and reality, Review of Financial Studies 16, 631–678.
[33] Diebold, F., Rudebusch, G. & Aruoba, B. (2006). The macroeconomy and the yield curve: a dynamic latent factor approach, Journal of Econometrics 131, 309–338.
[34] Duarte, J. (2004). Evaluating an alternative risk preference in affine term structure models, Review of Financial Studies 17, 370–404.
[35] Duffie, D. (1996). Dynamic Asset Pricing Theory, Princeton University Press, Princeton.
[36] Dunn, K.B. & Singleton, K.J. (1986). Modeling the term structure of interest rates under non-separable utility and durability of goods, Journal of Financial Economics 17, 27–55.
[37] Evans, M.D. (2000). Regime Shifts, Risk, and the Term Structure. Working Paper, Georgetown University.
[38] Evans, M.D. (2003). Real risk, inflation risk, and the term structure, The Economic Journal 113, 345–389.
[39] Fama, E.F. (1984a). The information in the term structure, Journal of Financial Economics 13, 509–528.
[40] Fama, E.F. (1984b). Term premiums in bond returns, Journal of Financial Economics 13, 529–546.
[41] Fama, E.F. & Bliss, R.R. (1987). The information in long-maturity forward-rates, American Economic Review 77, 680–692.
[42] Fama, E.F. & French, K.R. (1989). Business conditions and expected returns on stocks and bonds, Journal of Financial Economics 29, 23–49.
[43] Froot, K.A. (1989). New hope for the expectations hypothesis of the term structure of interest rates, Journal of Finance 44, 283–305.
[44] Gerlach, S. & Smets, F. (1997). The term structure of Euro-rates: some evidence in support of the expectations hypothesis, Journal of International Money and Finance 16, 305–321.
[45] Greenwood, R. & Vayanos, D. (2008). Bond Supply and Excess Bond Returns. Working Paper, London School of Economics.
[46] Guibaud, S., Nosbusch, Y. & Vayanos, D. (2007). Preferred Habitat and the Optimal Maturity Structure of Government Debt. Working Paper, London School of Economics.
[47] Hansen, L.P. & Singleton, K. (1983). Stochastic consumption, risk aversion, and the temporal behavior of asset returns, Journal of Political Economy 91, 249–268.
[48] Hardouvelis, G. (1994). The term structure spread and future changes in long and short rates in G7 countries, Journal of Monetary Economics 33, 255–283.
[49] Hicks, J.R. (1939). Value and Capital, Oxford University Press, Oxford.
[50] Keim, D.B. & Stambaugh, R.F. (1986). Predicting returns in the stock and bond markets, Journal of Financial Economics 17, 357–390.
[51] Kugler, P. (1997). Central bank policy reaction and the expectations hypothesis of the term structure, International Journal of Financial Economics 2, 164–181.
[52] Leippold, M. & Wu, L. (2003). Design and estimation of quadratic term structure models, European Finance Review 7, 47–73.
[53] Lutz, F.A. (1940). The structure of interest rates, The Quarterly Journal of Economics 55, 36–63.
[54] Macaulay, F.R. (1938). Some Theoretical Problems Suggested by the Movements of Interest Rates, Bond Yields, and Stock Prices in the United States Since 1856. NBER Working Paper Series, New York.
[55] Mankiw, G.N. & Miron, J.A. (1986). The changing behavior of the term structure of interest rates, Quarterly Journal of Economics 101, 211–228.
[56] Mankiw, G.N. & Summers, L.H. (1984). Do long-term interest rates overreact to short-term rates? Brookings Papers on Economic Activity 1, 223–242.
[57] McCulloch, H.J. (1993). A reexamination of traditional hypotheses about the term structure: a comment, Journal of Finance 48, 779–789.
[58] Modigliani, F. & Shiller, R.J. (1973). Inflation, rational expectations, and the term structure of interest rates, Economica 40, 12–43.
[59] Naik, V. & Lee, M.H. (1997). Yield Curve Dynamics with Discrete Shifts in Economic Regimes: Theory and Estimation. Unpublished Working Paper, Faculty of Commerce, University of British Columbia.
[60] Nielsen, L.T. (1999). Pricing and Hedging of Derivative Securities, Oxford University Press, Oxford.
[61] Piazzesi, M. (2003). Affine term structure models, Handbook of Financial Econometrics, Elsevier, p. 828.
[62] Piazzesi, M. (2005). Bond yields and the federal reserve, Journal of Political Economy 113, 311–344.
[63] Piazzesi, M. & Schneider, M. (2007). Equilibrium yield curves, NBER/Macroeconomics Annual 21, 389–442.
[64] Rudebusch, G.D. (1995). Federal reserve interest rate targeting, rational expectations, and the term structure, Journal of Monetary Economics 35, 245–274.
[65] Rudebusch, G.D. & Wu, T. (2004a). A Macro-Finance Model of the Term Structure, Monetary Policy, and the Economy. Working Paper, Federal Reserve Bank of San Francisco.
[66] Rudebusch, G.D. & Wu, T. (2004b). The Recent Shift in Term Structure Behavior from a No-Arbitrage Macro-Finance Perspective. Working Paper, Federal Reserve Bank of San Francisco.
[67] Sargent, T.J. (1972). Rational expectations and the term structure of interest rates, Journal of Money, Credit and Banking 4, 74–97.
[68] Sargent, T.J. (1979). A note on maximum likelihood estimation of the rational expectations model of the term structure, Journal of Monetary Economics 5, 133–143.
[69] Shiller, R.J. (1972). Rational Expectations and the Term Structure of Interest Rates. Ph.D. Dissertation, MIT.
[70] Shiller, R.J. (1979). The volatility of long-term interest rates and expectations models of the term structure, Journal of Political Economy 87, 1190–1219.
[71] Shiller, R.J. (1981). Do stock prices move too much to be justified by subsequent changes in dividends? American Economic Review 71, 421–436.
[72] Shiller, R.J., Campbell, J.Y. & Schoenholtz, K.L. (1983). Forward rates and future policy: interpreting the term structure of interest rates, Brookings Papers on Economic Activity 14(1), 173–224.
[73] Singleton, K.J. (2006). Empirical Dynamic Asset Pricing, Princeton University Press, Princeton.
[74] Sutch, R.C. (1970). Expectations, risk, and the term structure of interest rates, Journal of Finance 25, 703.
[75] Vayanos, D. & Vila, J.-L. (2007). A Preferred-Habitat Model of the Term Structure of Interest Rates. Working Paper, London School of Economics.
[76] Wachter, J.A. (2006). A consumption-based model of the term structure of interest rates, Journal of Financial Economics 79, 365–399.

Further Reading

Longstaff, F.A. (2000). The term structure of very short-term rates: new evidence for the expectations hypothesis, Journal of Financial Economics 58, 397–415.
Sutch, R.C. (1968). Expectations, Risk, and the Term Structure of Interest Rates. Dissertation, MIT.

ANTONIOS SANGVINATSOS
Stochastic Discount Factors

The Setup

Starting with capital $x \in \mathbb{R}$, an economic agent chooses at day zero a strategy $\theta \equiv (\theta^1, \ldots, \theta^d) \in \mathbb{R}^d$, where $\theta^j$ denotes the units from the $j$th asset held in the portfolio. What remains, $x - \sum_{i=1}^d \theta^i S_0^i$, is invested in the baseline asset. If $X^{(x;\theta)}$ is the wealth generated starting from capital $x$ and investing according to $\theta$, then $X_0^{(x;\theta)} = x$ and

$$X_T^{(x;\theta)} = \Big(x - \sum_{i=1}^d \theta^i S_0^i\Big)\frac{S_T^0}{S_0^0} + \sum_{i=1}^d \theta^i S_T^i = x\,\frac{S_T^0}{S_0^0} + \sum_{i=1}^d \theta^i \Big(S_T^i - \frac{S_T^0}{S_0^0}\,S_0^i\Big) \tag{1}$$

or, in deflated terms, $\beta_T X_T^{(x;\theta)} = x + \sum_{i=1}^d \theta^i (\beta_T S_T^i - S_0^i)$. The agent's objective is to choose a strategy in such a way as to maximize expected utility, that is, find $\theta_*$ such that

$$\mathbb{E}\big[U\big(X_T^{(x;\theta_*)}\big)\big] = \sup_{\theta \in \mathbb{R}^d} \mathbb{E}\big[U\big(X_T^{(x;\theta)}\big)\big] \tag{2}$$

The above problem will indeed have a solution if and only if no arbitrages exist in the market. By definition, an arbitrage is a wealth generated by some $\theta \in \mathbb{R}^d$ such that $\mathbb{P}[X_T^{(x;\theta)} \geq 0] = 1$ and $\mathbb{P}[X_T^{(x;\theta)} > 0] > 0$. It is easy to see that arbitrages exist in the market if and only if $\sup_{\theta \in \mathbb{R}^d} \mathbb{E}[U(X_T^{(x;\theta)})]$ is not attained by some $\theta_* \in \mathbb{R}^d$. Assuming, then, the no-arbitrage (NA) condition, concavity of the function $\mathbb{R}^d \ni \theta \mapsto \mathbb{E}[U(X_T^{(x;\theta)})]$ will imply that the first-order conditions

$$\frac{\partial}{\partial \theta^i}\Big|_{\theta = \theta_*} \mathbb{E}\big[U\big(X_T^{(x;\theta)}\big)\big] = 0, \quad \text{for all } i = 1, \ldots, d \tag{3}$$

will provide the solution $\theta_*$ to the problem. Since the expectation is just a finite sum, the differential operator can pass inside, and then the first-order conditions for optimality are

$$0 = \mathbb{E}\Big[\frac{\partial}{\partial \theta^i}\Big|_{\theta = \theta_*} U\big(X_T^{(x;\theta)}\big)\Big] = \mathbb{E}\Big[U'\big(X_T^{(x;\theta_*)}\big)\Big(S_T^i - \frac{S_T^0}{S_0^0}\,S_0^i\Big)\Big], \quad i = 1, \ldots, d \tag{4}$$

The above is a nonlinear system of $d$ equations to be solved for $d$ unknowns $(\theta_*^1, \ldots, \theta_*^d)$. Under NA, the system (4) has a solution $\theta_*$. Actually, under a trivial nondegeneracy condition in the market, the solution is unique; even if the optimal strategy $\theta_*$ is not unique, strict concavity of $U$ implies that the optimal wealth $X_T^{(x;\theta_*)}$ generated is unique.

A little bit of algebra on equation (4) gives, for all $i = 1, \ldots, d$,

$$S_0^i = \mathbb{E}\big[Y_T S_T^i\big], \quad \text{where } Y_T := \frac{U'\big(X_T^{(x;\theta_*)}\big)}{\mathbb{E}\big[(S_T^0/S_0^0)\,U'\big(X_T^{(x;\theta_*)}\big)\big]} \tag{5}$$

Observe that since $U$ is continuously differentiable and strictly increasing, $U'$ is a strictly positive function, and therefore $\mathbb{P}[Y_T > 0] = 1$. Also, equation (5) holds trivially for $i = 0$. Note that the random variable $Y_T$ that was obtained above depends on the utility function $U$, the probability $\mathbb{P}$, as well as on the initial capital $x \in \mathbb{R}$.

Definition 1 In the model described above, a process $Y = (Y_t)_{t=0,T}$ will be called a stochastic discount factor if $\mathbb{P}[Y_0 = 1, Y_T > 0] = 1$ and $S_0^i = \mathbb{E}[Y_T S_T^i]$ for all $i = 0, \ldots, d$.

If $Y$ is a stochastic discount factor, using equation (1), one can actually show that

$$\mathbb{E}\big[Y_T X_T^{(x;\theta)}\big] = x, \quad \text{for all } x \in \mathbb{R} \text{ and } \theta \in \mathbb{R}^d \tag{6}$$

In other words, the process $Y X^{(x;\theta)}$ is a $\mathbb{P}$-martingale for all $x \in \mathbb{R}$ and $\theta \in \mathbb{R}^d$.

Connection with Risk-neutral Valuation

Since $\mathbb{E}[S_T^0 Y_T] = S_0^0 > 0$, we can define a probability mass $Q$ by requiring that $Q(\omega) = (S_T^0(\omega)/S_0^0)\,Y_T(\omega)\,P(\omega)$, which defines a probability $\mathbb{Q}$ on subsets of $\Omega$ in the obvious way. Observe that, for any $A \subseteq \Omega$, $\mathbb{Q}[A] > 0$ if and only if $\mathbb{P}[A] > 0$; we say that the probabilities $\mathbb{P}$ and $\mathbb{Q}$ are equivalent and we denote this by $\mathbb{Q} \sim \mathbb{P}$. Now, rewrite equation (5) as

$$S_0^i = \mathbb{E}_{\mathbb{Q}}\big[\beta_T S_T^i\big], \quad \text{for all } i = 0, \ldots, d \tag{7}$$
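The passage from a stochastic discount factor to a risk-neutral probability can be checked numerically. The following is only a minimal sketch under stated assumptions: the one-period, two-state market, all numbers, and the names `P`, `S0_T`, `S1_T`, `Y_T`, `Q`, and `expectation` are hypothetical illustrations and do not come from the text. With two states and two assets, the pricing equations of Definition 1 determine $Y_T$ uniquely, and $Q$ is then built by the recipe $Q(\omega) = (S_T^0(\omega)/S_0^0)\,Y_T(\omega)\,P(\omega)$.

```python
# Hypothetical one-period, two-state market (numbers are illustrative only).
# Baseline asset: S^0_0 = 1, S^0_T = 1 + r in both states.
# Risky asset:    S^1_0 = 100, S^1_T = 120 (up) or 90 (down).
r = 0.05
P = {"up": 0.6, "down": 0.4}                 # real-world probabilities
S0_0, S1_0 = 1.0, 100.0
S0_T = {"up": 1.0 + r, "down": 1.0 + r}
S1_T = {"up": 120.0, "down": 90.0}


def expectation(payoff, prob):
    """Expectation of a state-dependent payoff under the probability `prob`."""
    return sum(payoff[w] * prob[w] for w in prob)


# The defining equations E[Y_T S^i_T] = S^i_0 for i = 0, 1 are linear in the
# state prices p(w) = Y_T(w) P(w); with two states they have a unique solution.
p_up = (S1_0 - S1_T["down"] * S0_0 / S0_T["down"]) / (S1_T["up"] - S1_T["down"])
p_down = S0_0 / S0_T["down"] - p_up
Y_T = {"up": p_up / P["up"], "down": p_down / P["down"]}

# Risk-neutral probability mass Q(w) = (S^0_T(w) / S^0_0) Y_T(w) P(w).
Q = {w: S0_T[w] / S0_0 * Y_T[w] * P[w] for w in P}
beta_T = {w: S0_0 / S0_T[w] for w in P}      # deflator at time T

# Price of the risky asset computed two ways: with Y_T under P, and with the
# deflator under Q. Both should reproduce S^1_0.
sdf_price = expectation({w: Y_T[w] * S1_T[w] for w in P}, P)
rn_price = expectation({w: beta_T[w] * S1_T[w] for w in P}, Q)
print(Y_T, Q, sdf_price, rn_price)
```

In this sketch the market is complete, so the stochastic discount factor does not depend on any utility function; in an incomplete market the choice of $U$, $\mathbb{P}$ and $x$ would matter, as the text notes.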
Stochastic Discount Factors 3
formed. Under the present setup, consider a claim with random payoff $H_T$ at time $T$. The question we wish to answer is this: what is the indifference price $H_0$ of this claim today for an economic agent?

For the time being, let $H_0$ be any price set by the market for the claim. The agent will invest in the risky assets and will hold $\theta$ units of them, as well as the new claim, taking a position of $\epsilon$ units. Then, the agent's terminal payoff is

$$X_T^{(x;\theta,\epsilon)} := X_T^{(x;\theta)} + \epsilon\Big(H_T - \frac{S_T^0}{S_0^0}\,H_0\Big) \tag{9}$$

The agent will again maximize expected utility, that is, will invest $(\theta_*, \epsilon_*) \in \mathbb{R}^d \times \mathbb{R}$ such that

$$\mathbb{E}\big[U\big(X_T^{(x;\theta_*,\epsilon_*)}\big)\big] = \sup_{(\theta,\epsilon) \in \mathbb{R}^d \times \mathbb{R}} \mathbb{E}\big[U\big(X_T^{(x;\theta,\epsilon)}\big)\big] \tag{10}$$

If $H_0$ is the agent's indifference price, it must follow that $\epsilon_* = 0$ in the above maximization problem; then, the agent's optimal decision regarding the claim would be not to buy or sell any units of the asset. In particular, the concave function $\mathbb{R} \ni \epsilon \mapsto \mathbb{E}[U(X_T^{(x;\theta_*,\epsilon)})]$ should achieve its maximum at $\epsilon = 0$. First-order conditions give that $H_0$ is the agent's indifference price if

[...] equation (5), we can write

$$H_0 = \mathbb{E}[Y_T H_T] \tag{12}$$

It is important to observe that $Y_T$ depends on a number of factors, namely, the probability $\mathbb{P}$, the utility $U$, and the initial capital $x$, but not on the particular claim to be valued. Thus, we need only one evaluation of the stochastic discount factor and we can use it to find indifference prices with respect to all kinds of different claims.

State Price Densities

For a fixed $\omega \in \Omega$, consider an Arrow–Debreu security that pays off a unit of account at time $T$ if the state of nature is $\omega$, and pays off nothing otherwise. The indifference price of this security for the economic agent is $p(\omega) := Y(\omega)P(\omega)$. Since $Y$ appears as the density of the “state price” $p$ with respect to the probability $\mathbb{P}$, stochastic discount factors are also termed state price densities in the literature. For two states of nature $\omega$ and $\omega'$ of $\Omega$ such that $Y(\omega) < Y(\omega')$, an agent who uses the stochastic discount factor $Y$ would consider $\omega'$ a more unfavorable state than $\omega$ and would be inclined to pay more for insurance against adverse market movements.
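The state-price reading of equation (12) can be illustrated with a small computation. This is a sketch only: the three states, their probabilities, the values of `Y`, the payoffs, and the helper `indifference_price` are all hypothetical and are not taken from the text. It shows that the price of an Arrow–Debreu security equals $Y(\omega)P(\omega)$ and that any other claim is priced as a state-price-weighted sum of its payoffs.

```python
# Hypothetical three-state example (all numbers are illustrative assumptions).
P = {"good": 0.5, "normal": 0.3, "bad": 0.2}      # real-world probabilities
Y = {"good": 0.8, "normal": 1.0, "bad": 1.4}      # assumed values of Y_T

# State prices p(w) = Y(w) P(w): price today of one unit paid only in state w.
state_price = {w: Y[w] * P[w] for w in P}


def indifference_price(payoff):
    """H_0 = E[Y_T H_T] for a state-dependent payoff H_T, as in equation (12)."""
    return sum(Y[w] * payoff[w] * P[w] for w in P)


# Arrow-Debreu security paying one unit of account in the "bad" state only.
arrow_bad = {w: 1.0 if w == "bad" else 0.0 for w in P}

# A claim that pays mostly in the unfavorable states (insurance-like payoff).
insurance = {"good": 0.0, "normal": 5.0, "bad": 20.0}

print(state_price)
print(indifference_price(arrow_bad), indifference_price(insurance))
```

Because `Y` is largest in the "bad" state, a unit of payoff there costs more per unit of probability than a unit of payoff in the "good" state, which is the sense in which the agent pays more for insurance.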
Comparison with Real-world Valuation
correction factor $\mathrm{cov}(Y_T, H_T)$ appearing in equation (13). If the covariance of $Y_T$ and $H_T$ is negative, the claim tends to pay more when $Y_T$ is low. By the discussion in the section State Price Densities, this means that the payoff will be high in states that are not greatly feared by the agent, who will therefore be inclined to pay less than what the real-world valuation gives. On the contrary, if the covariance of $Y_T$ and $H_T$ is positive, $H_T$ will pay off higher in dangerous states of nature for the agent (where $Y_T$ is also high), and the agent's indifference price will be higher than the real-world valuation.

Stochastic Discount Factors for Itô Processes

The Model

Uncertainty is modeled via a probability space $(\Omega, \mathcal{F}, \mathbf{F}, \mathbb{P})$, where $\mathbf{F} = (\mathcal{F}_t)_{t \in [0,T]}$ is a filtration representing the flow of information. The market consists of a locally riskless savings account whose price process $S^0$ satisfies $S_0^0 > 0$ and

$$\frac{dS_t^0}{S_t^0} = r_t\,dt, \quad t \in [0, T] \tag{14}$$

for some $\mathbf{F}$-adapted, positive short-rate process $r = (r_t)_{t \in [0,T]}$. It is obvious that $S_t^0 = S_0^0 \exp\big(\int_0^t r_u\,du\big)$ for $t \in [0, T]$. We define the deflator $\beta$ via

$$\beta_t = \frac{S_0^0}{S_t^0} = \exp\Big(-\int_0^t r_u\,du\Big), \quad t \in [0, T] \tag{15}$$

The movement of $d$ risky assets will be modeled via Itô processes:

[...] of the $j$th source of uncertainty on the $i$th asset at time $t \in [0, T]$. With “$\top$” denoting transposition, $c := \sigma^\top \sigma$ is the $d \times d$ local covariation matrix. To avoid degeneracies in the market, it is required that $c_t$ has full rank for all $t \in [0, T]$, almost surely (a.s.). This implies, in particular, that $d \leq m$: there are more sources of uncertainty in the market than are liquid assets to hedge away the uncertainty risk. Models of this sort are classical in the quantitative finance literature; see, for example, [8].

Definition 2 A risk premium is any $m$-dimensional, $\mathbf{F}$-adapted process $\lambda$ satisfying $\sigma^\top \lambda = b - r\mathbf{1}$, where $\mathbf{1}$ is the $d$-dimensional vector with all unit entries.

The terminology “risk premium” is better explained for the case $d = m = 1$; then $\lambda = (b - r)/\sigma$ is the premium over the risk-free rate that investors require per unit of risk associated with the (only) source of uncertainty. In the general case, $\lambda^j$ can be interpreted as the premium required for the risk associated with the $j$th source of uncertainty, represented by the Brownian motion $W^j$. In incomplete markets, when $d < m$, Proposition 1 shows all the different choices for $\lambda$. Each choice will parameterize the different risk attitudes of different investors. In other words, risk premia characterize the possible stochastic discount factors, as is revealed in Theorem 3.

If $m = d$, the equation $\sigma^\top \lambda = b - r\mathbf{1}$ has only one solution: $\lambda^* = \sigma c^{-1}(b - r\mathbf{1})$. If $d < m$ there are many solutions, but they can be characterized using easy linear algebra.

Proposition 1 The risk premia are exactly all processes of the form $\lambda = \lambda^* + \kappa$, where $\lambda^* := \sigma c^{-1}(b - r\mathbf{1})$ and $\kappa$ is any adapted process with $\sigma^\top \kappa = 0$.
defined via the recipe $d\mathbb{Q} = (Y_T S_T^0/S_0^0)\,d\mathbb{P}$. However, this is not always the case, as Example 1 below will show. Therefore, existence of a stochastic discount factor is a weaker notion than existence of a risk-neutral measure. For some practical applications though, these differences are unimportant. There is further discussion of this point later in the section Stochastic Discount Factors and Equivalent Martingale Measures.

Example 1 Let $S^0 \equiv 1$ and $S^1$ be a three-dimensional Bessel process with $S_0^1 = 1$. If $\mathbf{F}$ is the natural filtration of $S^1$, it can be shown that the only stochastic discount factor is $Y = 1/S^1$, which is a strict local martingale in the terminology of [4].

Credit Constraints on Investment

In view of the theoretical possibility of continuous trading, to avoid so-called doubling strategies (and for the fundamental theorem of asset pricing to hold), credit constraints have to be introduced. The wealth of agents has to be bounded from below by some constant, representing the credit limit. Shifting the wealth appropriately, one can assume that the credit limit is set to zero; therefore, only positive wealth processes are allowed in the market.

Since only strictly positive processes are considered, it is more convenient to work with proportions of investment, rather than absolute quantities as was the case in the section Stochastic Discount Factors in Discrete Probability Spaces. Pick some $\mathbf{F}$-adapted process $\pi = (\pi^1, \ldots, \pi^d)$. For $i = 1, \ldots, d$ and $t \in [0, T]$, the number $\pi_t^i$ represents the percentage of

[...]

$$\int_0^T \big|\langle \pi_t, b_t - r_t\mathbf{1} \rangle\big|\,dt < +\infty \quad \text{and} \quad \int_0^T \langle \pi_t, c_t \pi_t \rangle\,dt < +\infty, \quad \text{a.s.} \tag{18}$$

The set of all $d$-dimensional, $\mathbf{F}$-adapted processes $\pi$ that satisfy equation (18) is denoted by $\Pi$. A simple use of the integration-by-parts formula gives the following result:

Proposition 2 If $Y$ is a stochastic discount factor, then $Y X^\pi$ is a local martingale for all $\pi \in \Pi$.

Connection with “No Free Lunch” Notions

The next line of business is to obtain an existential result about stochastic discount factors in the present setting, also connecting their existence to an NA-type notion. Remember, from the section The Important Case of the Logarithm, the special stochastic discount factor that is the reciprocal of the log-optimal wealth process. We proceed somewhat heuristically to compute the analogous processes for the Itô-process model. The linear stochastic differential equation (17) has the following solution, expressed in logarithmic terms:

$$\log X^\pi = \int_0^\cdot \Big(r_t + \langle \pi_t, b_t - r_t\mathbf{1} \rangle - \frac{1}{2}\langle \pi_t, c_t \pi_t \rangle\Big)\,dt + \int_0^\cdot \langle \sigma_t \pi_t, dW_t \rangle \tag{19}$$

Assuming that the local martingale term $\int_0^\cdot \langle \sigma_t \pi_t, dW_t \rangle$ in equation (19) is an actual martingale, the aim is to maximize the expectation of the drift
term. Notice that we can actually maximize the drift pathwise if we choose the portfolio $\pi_* = c^{-1}(b - r\mathbf{1})$. We need to ensure that $\pi_*$ is in $\Pi$. It is easy to see that the equations in (18) are both satisfied if and only if $\int_0^T |\lambda_t^*|^2\,dt < \infty$ a.s., where $\lambda^* := \sigma c^{-1}(b - r\mathbf{1})$ is the special risk premium of Proposition 1. Under this assumption, $\pi_* \in \Pi$. Call $X^* = X^{\pi_*}$ and define

$$Y^* := \frac{1}{X^*} = \beta \exp\Big(-\int_0^\cdot \langle \lambda_t^*, dW_t \rangle - \frac{1}{2}\int_0^\cdot |\lambda_t^*|^2\,dt\Big) \tag{20}$$

Using the integration-by-parts formula, it is rather straightforward to check that $Y^*$ is a stochastic discount factor. In fact, the ability to define $Y^*$ is the way to establish that a stochastic discount factor exists, as the next result shows.

Theorem 2 For the Itô-process model considered above, the following are equivalent.

1. The set of stochastic discount factors is nonempty.
2. $\int_0^T |\lambda_t^*|^2\,dt < \infty$, $\mathbb{P}$-a.s.; in that case, $Y^*$ defined in equation (20) is a stochastic discount factor.
3. For any $\epsilon > 0$, there exists $\ell = \ell(\epsilon) \in \mathbb{R}_+$ such that $\mathbb{P}[X_T^\pi > \ell] < \epsilon$ uniformly over all portfolios $\pi \in \Pi$.

The interested reader is referred to [6], where the property of the market described in statement 3 of the above theorem is termed No Unbounded Profit with Bounded Risk.

The next structural result about the stochastic discount factors in the Itô-process setting reveals the importance of $Y^*$ as a building block.

Theorem 3 Assume that $\mathbf{F}$ is the filtration generated by the Brownian motion $W$. Then, any stochastic discount factor $Y$ in the previous Itô-process model can be decomposed as $Y = Y^* N^\kappa$, where $Y^*$ was defined in equation (20) and

$$N_t^\kappa = \exp\Big(-\int_0^t \langle \kappa_u, dW_u \rangle - \frac{1}{2}\int_0^t |\kappa_u|^2\,du\Big), \quad \forall t \in [0, T] \tag{21}$$

where $\kappa$ is an $m$-dimensional $\mathbf{F}$-adapted process with $\sigma^\top \kappa = 0$.

If the assumption that $\mathbf{F}$ is generated by $W$ is removed, one still obtains a similar result with $N^\kappa$ being replaced by any positive $\mathbf{F}$-martingale $N$ with $N_0 = 1$ that is strongly orthogonal to $W$. The specific representation obtained in Theorem 3 comes from the martingale representation theorem of Brownian filtrations; see, for example, [7].

Stochastic Discount Factors and Equivalent Martingale Measures

Consider an agent who uses a stochastic discount factor $Y$ for valuation purposes. There is a possibility that $Y S^i$ could be a strict local $\mathbb{P}$-martingale for some $i = 0, \ldots, d$, which would mean that $S_0^i > \mathbb{E}[Y_T S_T^i]$ (see end note e). The last inequality is puzzling in the sense that the agent's indifference price for the $i$th asset, which is $\mathbb{E}[Y_T S_T^i]$, is strictly lower than the market price $S_0^i$. In such a case, the agent would be expected to wish to short some units of the $i$th asset. This is indeed what is happening; however, because of credit constraints, this strategy is infeasible. The following is a convincing example that establishes this fact.

Before presenting the example, an important issue should be clarified. One would rush to state that such “inconsistencies” are tied to the notion of a stochastic discount factor as it appears in Definition 3, and that is strictly weaker than existence of a probability $\mathbb{Q} \sim \mathbb{P}$ that makes all discounted processes $\beta S^i$ local $\mathbb{Q}$-martingales for $i = 0, \ldots, d$. Even if such a probability did exist, $\beta S^i$ could be a strict local $\mathbb{Q}$-martingale for some $i = 1, \ldots, d$; in that case, $S_0^i > \mathbb{E}_{\mathbb{Q}}[\beta_T S_T^i]$ and the same mispricing problem pertains.

Example 2 Let $S^0 \equiv 1$, $S^1$ be the reciprocal of a three-dimensional Bessel process starting at $S_0^1 = 1$ under $\mathbb{P}$ and $\mathbf{F}$ be the filtration generated by $S^1$. Here, $\mathbb{P}$ is the only equivalent local martingale measure and $1 = S_0^1 > \mathbb{E}[S_T^1]$ for all $T > 0$. This is a complete market: an agent can start with capital $\mathbb{E}[S_T^1]$ and invest in a way so that at time $T$ the wealth generated is exactly $S_T^1$. Naturally, the agent would like to long as much as possible from this replicating portfolio and go as short as possible from the actual asset. However, in doing so, the possible downside risk is infinite throughout the life of the investment and the enforced credit constraints will disallow for such strategies.
In the context of Example 2, the law of one price fails, since the asset that provides payoff $S_T^1$ at time $T$ has a market price $S_0^1$ and a replication price $\mathbb{E}[S_T^1] < S_0^1$. Therefore, if the law of one price is to be valid in the market, one has to insist on existence of an equivalent (true) martingale measure $\mathbb{Q}$, where each discounted process $\beta S^i$ is a true (and not only local) $\mathbb{Q}$-martingale for all $i = 0, \ldots, d$. For pricing purposes then, it makes sense to ask that the stochastic discount factor $Y^\kappa$ that is chosen according to Theorem 3 is such that $Y^\kappa S^i$ is a true $\mathbb{P}$-martingale for all $i = 0, \ldots, d$. Such stochastic discount factors give rise to probabilities $\mathbb{Q}^\kappa$ that make all deflated asset-price processes $\mathbb{Q}^\kappa$-martingales and can be used as pricing measures.

Let us now specialize to the important “diffusion” case where $r_t = r \in \mathbb{R}$ for all $t \in [0, T]$ and $\sigma_t = \eta(t, S_t)$ for all $t \in [0, T]$, where $\eta$ is a nice function with values in the space of $(m \times d)$-matrices. As long as a claim written only on the traded assets is concerned, the choice of $\mathbb{Q}^\kappa$ for pricing is irrelevant, since the asset prices under $\mathbb{Q}^\kappa$ have dynamics

$$\frac{dS_t^i}{S_t^i} = r_t\,dt + \langle \sigma_t^{\cdot i}, dW_t^\kappa \rangle, \quad \forall t \in [0, T],\ i = 1, \ldots, d \tag{22}$$

where $W^\kappa$ is a $\mathbb{Q}^\kappa$-Brownian motion. However, if one is interested in pricing a claim written on a nontraded asset whose price process $Z$ has $\mathbb{P}$-dynamics

$$dZ_t = a_t\,dt + \langle f_t, dW_t \rangle, \quad t \in [0, T] \tag{23}$$

for $\mathbf{F}$-adapted $a$ and $f = (f^1, \ldots, f^m)$, then the $\mathbb{Q}^\kappa$-dynamics of $Z$ are

$$dZ_t = \big(a_t - \langle f_t, \lambda_t^* \rangle - \langle f_t, \kappa_t \rangle\big)\,dt + \langle f_t, dW_t^\kappa \rangle, \quad \forall t \in [0, T] \tag{24}$$

The dynamics of $Z$ will be independent of the choice of $\kappa$ only if the volatility structure of the process $Z$, given by $f$, is in the range of $\sigma$. This will mean that $\langle f, \kappa \rangle = 0$ for all $\kappa$ such that $\sigma^\top \kappa = 0$ and that $Z$ is perfectly replicable using the traded assets. As long as there is any randomness in the movement in $Z$ that cannot be captured by investing in the traded assets, that is, if there exists some $\kappa$ with $\sigma^\top \kappa = 0$ and $\langle f, \kappa \rangle$ not being identically zero, perfect replicability fails and pricing becomes a more complicated issue, depending on the preferences of the particular agent as given by the choice of $\kappa$ to form the stochastic discount factor.

End Notes

a. One can impose natural conditions on preference relations defined on the set of all possible outcomes that will lead to numerical representation of the preference relationship via expected utility maximization. This was axiomatized in [10]; see also Chapter 2 of [5] for a nice exposition.
b. We stress “infinitesimal” because when the portfolio holdings of the agent change, the indifference prices also change; thus, for large sales or buys that will considerably change the portfolio structure, there might appear an incentive, that was not there before, to sell or buy the asset.
c. For this reason, utility indifference prices are sometimes referred to as Davis prices.
d. Free lunches with vanishing risk is the suitable generalization of the notion of arbitrages to get a version of the fundamental theorem of asset pricing in continuous time. The reader is referred to [3].
e. The inequality follows because positive local martingales are supermartingales; see, for example, [7].

References

[1] Cochrane, J.H. (2001). Asset Pricing, Princeton University Press.
[2] Davis, M.H.A. (1997). Option pricing in incomplete markets, in Mathematics of Derivative Securities (Cambridge, 1995), Publications of the Newton Institute, Cambridge University Press, Cambridge, Vol. 15, pp. 216–226.
[3] Delbaen, F. & Schachermayer, W. (2006). The Mathematics of Arbitrage, Springer Finance, Springer-Verlag, Berlin.
[4] Elworthy, K.D., Li, X.-M. & Yor, M. (1999). The importance of strictly local martingales; applications to radial Ornstein-Uhlenbeck processes, Probability Theory and Related Fields 115, 325–355.
[5] Föllmer, H. & Schied, A. (2004). Stochastic Finance, extended Edition, de Gruyter Studies in Mathematics, Walter de Gruyter & Co., Berlin, Vol. 27.
[6] Karatzas, I. & Kardaras, C. (2007). The numéraire portfolio in semimartingale financial models, Finance and Stochastics 11, 447–493.
[7] Karatzas, I. & Shreve, S.E. (1991). Brownian Motion and Stochastic Calculus, 2nd Edition, Graduate Texts in Mathematics, Springer-Verlag, New York, Vol. 113.
[8] Karatzas, I. & Shreve, S.E. (1998). Methods of Mathematical Finance, Applications of Mathematics (New York), Springer-Verlag, New York, Vol. 39.
Axiom 1 (Transitivity). For any three elements $x$, $y$, and $z$ in $X$, if $x \succsim y$ and $y \succsim z$, then $x \succsim z$.

Axiom 3 (Monotonicity). For any two elements $x$ and $y$ in $X \subseteq \mathbb{R}^n$, if $x > y$, then $x \succ y$.
2 Utility Function
Axiom 3 connects the order ≥ on X and the DM’s preference relation ≿. In the context of consumer theory, it says that “the more, the better.” In particular, given two vectors x and y with x ≥ y, it is enough that x has strictly more of at least some good i to be strictly preferred to y. This means that all goods are “essential,” that is, the DM pays attention to each of them. Moreover, observe that, by Axiom 3 and reflexivity, x ≥ y implies x ≿ y. This is because x ≥ y if either x = y or x > y.

The following two axioms rely on the vector structure of X.

Axiom 4 (Archimedean). Suppose that x, y, and z are any three elements of a convex X ⊆ ℝ^n such that x ≻ y ≻ z. Then there exist α, β ∈ (0, 1) such that αx + (1 − α)z ≻ y ≻ βx + (1 − β)z.

According to this axiom, there are no infinitely preferred or infinitely despised alternatives. That is, given any pairs x ≻ y and y ≻ z, alternative x cannot be infinitely better than y, and alternative z cannot be infinitely worse than y. Indeed, we can always mix x and z to get better alternatives, that is, αx + (1 − α)z, or worse alternatives, that is, βx + (1 − β)z, than y.

It may be useful to remember the analogous property that holds for real numbers: if x, y, and z are real numbers with x > y > z, then there exist α, β ∈ (0, 1) such that αx + (1 − α)z > y > βx + (1 − β)z. This property does not hold any more if we consider ∞ and −∞, that is, the extended real line [−∞, ∞]. Specifically, let x = ∞ or z = −∞. In this case, x is infinitely larger than y, z is infinitely smaller than y, and there are no α, β ∈ (0, 1) that satisfy the previous inequality. In fact, α∞ = ∞ and β(−∞) = −∞ for all α, β ∈ (0, 1).

Axiom 5 (Convexity). Given any two elements x and y of a convex set X ⊆ ℝ^n, if x ∼ y then αx + (1 − α)y ≿ x for all α ∈ [0, 1].

This axiom captures a preference for mixing: given any two indifferent alternatives, the DM always prefers any of their combinations to each of the original alternatives. This preference for mixing is often assumed in applications and is a convexity property of indifference curves (see End Note a), the modern counterpart of the classic assumption of diminishing marginal utility.

Summing up, we have introduced a few properties that are often assumed on the preference ≿. All these axioms are behavioral, that is, they are expressed in terms of choice behavior. In particular, their behavioral meaning is transparent and, with the exception of the Archimedean axiom, they are all behaviorally falsifiable by suitable choice patterns. For example, one can show that a DM does not satisfy the transitivity axiom by finding alternatives x, y, z ∈ X over which his/her choices exhibit the cycle x ≻ y ≻ z ≻ x. This choice pattern would be enough to reject the hypothesis that his/her preference over X is transitive.

The use of preference axioms that have a transparent behavioral interpretation and that are falsifiable through choice behavior is the main methodological tenet of modern utility theory, often called the revealed preference methodology. In fact, choice behavior data are regarded as the only observable data that economic theories can rely upon.

Another important methodological feature of modern utility theory is that it adopts a weak notion of rationality, which requires only the consistency of choices without any demand on their motives. For example, transitivity is viewed as a rationality requirement in this sense because its violations would entail inconsistent patterns of choices that no DM would consciously follow, regardless of his/her motivations (see [15] for a recent discussion of this methodological issue).

Paretian Utility Functions

Although the preference ordering ≿ is the fundamental notion, for analytical convenience it is often of interest to find a numerical representation of ≿. Such numerical representations are called utility functions; formally, a real-valued function u : X → ℝ is a (Paretian) utility function if, for any pair x, y ∈ X,

x ≿ y if and only if u(x) ≥ u(y) (1)

In particular, for the derived relations ≻ and ∼ it holds, respectively, x ≻ y if and only if u(x) > u(y) and x ∼ y if and only if u(x) = u(y). Indifference curves can thus be written in terms of utility functions as [x] = {y ∈ X : u(y) = u(x)}.

Utility functions are analytically very convenient, but do not have any intrinsic psychological meaning: what matters is that they numerically rank vectors in the same way as the preference ordering ≿. This implies, inter alia, that every monotone
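A small numerical sketch of representation (1) may help fix ideas. It is an illustration under stated assumptions rather than part of the article: the utility u(x) = x1·x2 on an integer grid, the increasing transformation t ↦ t³, and all function names are hypothetical choices. It checks that a strictly increasing transformation of u induces the same ranking (a standard fact about ordinal representations), that a decreasing transformation does not, and it lists one indifference class [x].

```python
from itertools import product


def weakly_prefers(u, x, y):
    """Preference induced by u as in equation (1): x is weakly preferred to y
    exactly when u(x) >= u(y)."""
    return u(x) >= u(y)


def same_ranking(u, v, alternatives):
    """True if u and v induce the same weak preference on every pair."""
    return all(weakly_prefers(u, x, y) == weakly_prefers(v, x, y)
               for x, y in product(alternatives, repeat=2))


def indifference_class(u, x, alternatives):
    """The set [x] = {y : u(y) = u(x)}, restricted to the given alternatives."""
    return [y for y in alternatives if u(y) == u(x)]


# Assumed example: bundles on a 4x4 integer grid and a product utility.
grid = list(product(range(1, 5), repeat=2))
u = lambda x: x[0] * x[1]
u_cubed = lambda x: u(x) ** 3      # strictly increasing transformation of u
u_reversed = lambda x: -u(x)       # decreasing transformation of u

keeps_ranking = same_ranking(u, u_cubed, grid)
flips_ranking = not same_ranking(u, u_reversed, grid)
curve_through_1_4 = indifference_class(u, (1, 4), grid)
```

Only the ordering of the utility values matters here, so `u` and `u_cubed` represent the same preference even though their numerical values differ.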
where C is a suitable subset of the choice set X, determined by possible constraints that limit the DM’s choices. For example, in consumer theory, C

Matters are more complicated when the collection X/∼ is uncountable. It is easy to come up with examples of preferences that satisfy Axioms 1 and 2 and
do not admit a utility representation (see Example 2). We refer to [2, 12, 18] for general representation theorems. Here we establish an existence result for the important special case of preferences defined on ℝ^n, based on [3]. It is closely related to Theorems 3.3 and 3.6 of Fishburn (1970). For brevity, we omit its proof.

Write x ≤ ∞ (respectively, x ≥ −∞) when either x ∈ ℝ^n or xi = ∞ (respectively, xi = −∞) for each i. That is, x ≤ ∞ or x ≥ −∞ means that either each xi is finite or each xi is infinite. A subset of ℝ^n is a closed order interval if, given −∞ ≤ y < z ≤ ∞, it has the form [y, z] = {x ∈ ℝ^n : y ≤ x ≤ z} and is an open order interval if it has the form (y, z) = {x ∈ ℝ^n : yi < xi < zi for each i}. The half-open order intervals [y, z) and (y, z] are similarly defined. For example, [z, ∞) = {x ∈ ℝ^n : x ≥ z}, and so [0, ∞) = ℝ^n_+.

A function u : X → ℝ is monotone if x > y implies u(x) > u(y) and is quasiconcave if its upper sets {x : u(x) ≥ t} are convex for all t ∈ ℝ [16]. Since {y : u(y) ≥ u(x)} = {y : y ≿ x}, the quasi-concavity of u implies the convexity of the upper contour sets of indifference curves (cf. End Note a).

Theorem 2 For a preference ordering ≿ defined on an order interval X ⊆ ℝ^n, the following conditions are equivalent:

1. ≿ satisfies Axioms 1–4 and
2. there exists a monotonic and continuous function u : X → ℝ such that equation (1) holds.

Moreover, Axiom 5 holds if and only if u is quasiconcave.

Theorem 2 is an important result. Almost every work in economics contains a utility function, often defined on order intervals of ℝ^n and assumed to be monotone and quasi-concave. Theorem 2 shows the behavioral conditions that underlie this key modeling assumption.

By Theorem 2, the convexity axiom 5 is equivalent to the quasi-concavity of the utility function u. This is a substantially weaker property than the concavity of u, which would require u(αx + (1 − α)y) ≥ αu(x) + (1 − α)u(y) for all x, y ∈ X and all α ∈ [0, 1]. For example, any increasing function u : ℝ → ℝ is automatically quasi-concave.

Since concave utility functions are often used in applications because of their remarkable properties in optimization problems, a natural question is whether, among all monotone transformations f ∘ u of a quasi-concave utility function u, there exists a concave one; this would ensure the existence of a concave representation of a preference that satisfies Axiom 5. This important question was first studied by de Finetti [11], who showed that there exist quasi-concave functions that do not have any concave monotone transformation. Hence, convex indifference curves are not necessarily determined by a concave utility function (the converse is obviously true) and quasiconcavity in Theorem 2 cannot be improved to concavity. Inter alia, the seminal paper of de Finetti started the study of quasi-concave functions, later substantially developed by Fenchel [8], which is arguably the most important generalization of concavity.

Finally, observe that the utility function in Theorem 2 is continuous even though none of the axioms involves any topological notion. This is a remarkable consequence of the order and vector structures that the axioms use.

We close with an example of a preference that does not admit a utility representation.

Example 2 Lexicographic preferences are a classic example of preference orderings that do not admit a utility representation. Set X = ℝ^2 and say that x ≿ y if either x1 > y1 or x1 = y1 and x2 ≥ y2. That is, the DM first looks at the first coordinate: if x1 > y1, then x ≻ y. However, if x1 = y1, then the DM turns his/her attention to the second coordinate: if x2 ≥ y2, then x ≿ y. This is how dictionaries order words and this motivates the name of this particular ordering. Although they satisfy Axioms 1–3, it can be proved ([18], pages 24–25) that lexicographic preferences do not admit a utility representation (it is easy to check that they do not satisfy the Archimedean axiom).

Brief Historical Remarks

The early development of utility theory is surveyed in the two 1950 articles of George Stigler [24]. Here it is worth noting that originally utility functions were regarded as a primitive notion whose role was to quantify a Benthamian pain/pleasure calculus. In other words, utility functions were viewed as a measure or a quantification of an underlying physiological phenomenon. This view of utility theory is sometimes called cardinalism and utility
functions derived within this approach are called cardinal utility functions. A key feature of cardinalism is that utility differences and their ratios are meaningful notions that quantify differences in pain/pleasure that DMs experience among different quantities of the outcomes. In particular, marginal utilities measure the marginal pain/pleasure that results from choices and these played a central role in the early cardinal consumer theory.

However, the difficulty of any reliable scientific measurement of cardinal utility raised serious doubts on the scientific status of cardinalism. At the end of the nineteenth century Pareto revolutionized utility theory by showing that an ordinal approach, based on indifference curves as a primitive notion (unlike Edgeworth [7], who introduced them as level curves of an original cardinal utility function), was enough for consumer theory purposes [20]. In particular, Pareto showed that the classic consumer problem could be solved and characterized by replacing marginal utilities with marginal rates of substitution along indifference curves. For example, the classic key assumption of diminishing marginal utilities is replaced by the convexity property (Axiom 5) of indifference curves (the latter is actually a stronger property, unless utility functions are separable).

Unlike cardinal utility functions, indifference curves and their properties can be empirically determined and tested. Pareto’s insight thus represented a key methodological advance and his ordinal approach, later substantially extended by Hicks and Allen [17, 23], is today the mainstream version of consumer theory. More generally, Pareto’s ordinal revolution paved the way to the modern use of preferences as the primitive notion of decision theory. In fact, the use of preferences is the natural conceptual development of Pareto’s original insight of considering indifference curves as a primitive notion. The first appearance of preferences as primitive notions seems to be in [9, 13]. They earned their current central theoretical place in decision theory with the classic works [4, 9, 12].

The utility theory under certainty outlined here reached its maturity in the 1960s (see, e.g., [5]). Subsequent work on decision theory has been mainly concerned with choice under uncertainty, extending the scope of the seminal contributions [9, 10, 19, 21, 22]. We refer the reader to [14] for a thorough and updated introduction to these more recent advances.

End Notes

a. Observe that this convexity property of indifference curves is weaker than the convexity of their upper contour sets {y ∈ X : y ≿ x}.

References

[1] Aumann, R. (1962). Utility theory without the completeness axiom, Econometrica 30, 445–462.
[2] Bridges, D.S. & Mehta, G.B. (1995). Representations of Preference Orderings, Springer-Verlag, Berlin.
[3] Cerreia-Vioglio, S., Maccheroni, F., Marinacci, M. & Montrucchio, L. (2009). Uncertainty Averse Preferences, mimeo.
[4] Debreu, G. (1959). Theory of Value, Yale University Press.
[5] Debreu, G. (1964). Continuity properties of Paretian utility, International Economic Review 5, 285–293.
[6] Dubra, J., Maccheroni, F. & Ok, E.A. (2004). Expected utility theory without the completeness axiom, Journal of Economic Theory 115, 118–133.
[7] Edgeworth, F.Y. (1881). Mathematical Psychics: An Essay on the Application of Mathematics to the Moral Sciences, Kegan Paul, London.
[8] Fenchel, W. (1953). Convex Cones, Sets, and Functions, Princeton University Press, Princeton.
[9] de Finetti, B. (1931). Sul significato soggettivo della probabilità, Fundamenta Mathematicae 18, 298–329.
[10] de Finetti, B. (1937). La prévision: ses lois logiques, ses sources subjectives, Annales de l’Institut Henri Poincaré 7, 1–68.
[11] de Finetti, B. (1949). Sulle stratificazioni convesse, Annali di Matematica Pura ed Applicata 30, 173–183.
[12] Fishburn, P.C. (1970). Utility Theory for Decision Making, Wiley, New York.
[13] Frisch, R. (1926). Sur un problème d’économie pure, Norsk Matematisk Forenings Skrifter 1, 1–40.
[14] Gilboa, I. (2009). Theory of Decision under Uncertainty, Cambridge University Press, Cambridge.
[15] Gilboa, I., Maccheroni, F., Marinacci, M. & Schmeidler, D. (2009). Objective and subjective rationality in a multiple priors model, Econometrica, forthcoming.
[16] Greenberg, H.J. & Pierskalla, W.P. (1971). A review of quasi-convex functions, Operations Research 19, 1553–1570.
[17] Hicks, J.R. & Allen, R.G.D. (1934). A reconsideration of the theory of value I, II, Economica 1, 52–76, 196–219.
[18] Kreps, D.M. (1988). Notes on the Theory of Choice, Westview Press, London.
[19] von Neumann, J. & Morgenstern, O. (1947). Theory of Games and Economic Behavior, 2nd Edition, Princeton University Press, Princeton.
[20] Pareto, V. (1906). Manuale di Economia Politica, Società Editrice Libraria, Milano.
[21] Ramsey, F.P. (1931). Truth and probability, in Foundations of Mathematics and other Essays, R.B. Braithwaite, ed., Routledge.
[22] Savage, L.J. (1954). The Foundations of Statistics, Wiley, New York.
[23] Slutsky, E. (1915). Sulla teoria del bilancio del consumatore, Giornale degli Economisti 51, 1–26.
[24] Stigler, G.J. (1950). The development of utility theory I, II, Journal of Political Economy 58, 307–327, 373–396.

Related Articles

Expected Utility Maximization: Duality Methods; Expected Utility Maximization; Recursive Preferences; Risk Aversion; Utility Indifference Valuation; Utility Theory: Historical Perspectives.

MASSIMO MARINACCI
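As a supplementary illustration of Example 2 in the article above, the lexicographic ordering and its failure of the Archimedean axiom can be sketched in a few lines. This is a hedged sketch, not the author's construction: the function names, the three sample points, and the grid of mixing weights are assumptions chosen for illustration, and a grid search only suggests (it does not prove) that no suitable β exists.

```python
def lex_weakly_prefers(x, y):
    """x is weakly preferred to y when x1 > y1, or x1 == y1 and x2 >= y2."""
    return x[0] > y[0] or (x[0] == y[0] and x[1] >= y[1])


def lex_strictly_prefers(x, y):
    """Strict part of the ordering: x weakly preferred to y but not conversely."""
    return lex_weakly_prefers(x, y) and not lex_weakly_prefers(y, x)


def mix(a, x, z):
    """Convex combination a*x + (1 - a)*z of two vectors."""
    return tuple(a * xi + (1 - a) * zi for xi, zi in zip(x, z))


# Assumed sample points with x strictly above y strictly above z.
x, y, z = (1.0, 0.0), (0.0, 1.0), (0.0, 0.0)

# The Archimedean axiom asks for some beta in (0, 1) with y strictly
# preferred to beta*x + (1 - beta)*z. Every such mixture is (beta, 0),
# whose first coordinate already exceeds that of y.
weights = [k / 1000 for k in range(1, 1000)]
betas_below_y = [b for b in weights if lex_strictly_prefers(y, mix(b, x, z))]
```

The empty list `betas_below_y` reflects the argument in the comment: any positive weight on x lifts the first coordinate above that of y, so the mixture is always strictly preferred to y.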
Recursive Preferences

The standard additive utility model defines time-t utility for a discrete-time consumption process {ct ; t = 1, . . . , T } as

\[ U_t = E_t\left[\sum_{s=t}^{T} e^{-\beta(s-t)} u(c_s)\right] = E_t\left\{u(c_t) + e^{-\beta} U_{t+1}\right\} \quad (1) \]

where Et denotes the conditional expectation. The virtue of the model is its simplicity: only discounted probabilities and the function u determine preferences. However, the additive treatment of states and times precludes the model from distinguishing between aversion to variability in consumption across states and across time. In fact, the agent’s preferences are entirely determined by preferences over deterministic consumption streams (see [23]). Furthermore, because agents care only about the distribution of future consumption, they do not care about the temporal resolution of uncertainty.

A more flexible preference model is obtained with the Kreps and Porteus [14] recursive specification (see also [11]):

\[ U_t = F\left(c_t, E_t\, u(U_{t+1})\right), \qquad U_T = v(c_T) \quad (2) \]

where the aggregator function F models intertemporal substitution and u the aversion to risk in next period’s utility. The popular Epstein and Zin [12] model is the special case characterized by scale-invariant preferences (Ut homogeneous in (ct , Ut+1 ) and v(c) = c) and constant elasticity of substitution. The stochastic differential utility (SDU) formulation

\[ U_t = E_t\left[\int_{s=t}^{T} \left\{ b(c_s, U_s)\,ds + \tfrac{1}{2}\, a(U_s)\, d[U, U]_s \right\}\right] \quad (3) \]

where [·, ·] denotes quadratic variation, was obtained by Duffie and Epstein [8] as the continuous-time limit of recursive utility. Time-additive utility is the special case b(c, U ) = u(c) − βU and a = 0. Skiadas [22] shows that SDU includes the robust control formulations of Anderson et al. [1], Hansen et al. [13], and Maenhout [17]. It is straightforward to show that SDU also includes the continuous-time limit of Chew [3] and Dekel [7] preferences.

In this paper, we examine the generalized SDU model, given in differential form by equation (5). This preference model was introduced by Lazrak and Quenez [15] to unify the recursive formulation of Duffie and Epstein [8] and multiple-prior formulation of Chen and Epstein [2]. Schroder and Skiadas [19] and Skiadas [23] (see also [20] for the case with jumps) show that the more flexible form of the aggregator allows preferences to depend on the source of risk (e.g., domestic versus foreign), as well as first-order risk aversion (which imposes a higher penalty for small levels of risk) in addition to the standard second-order risk aversion dependence in equation (3) (see End Note a).

Relative to the time-additive model, the loss of tractability under generalized SDU is surprisingly small and mainly confined to the complete-markets setting. In the case of power utility, for example, once incompleteness or market constraints are imposed, the additive problem is no simpler to solve than a more general class of scale-invariant (homothetic) recursive utility. The tractability of the most popular additive utility models is obtained not from additivity but from the scale or translation invariance property. The second and third sections examine the recursive classes with these invariance properties and show that their solution essentially reduces to solving a single constrained backward stochastic differential equation.

After defining the preferences and markets in the second and third sections, the general solution to the optimal portfolio and consumption problem is presented in the fourth section. The solution is obtained by first characterizing the utility supergradient density (a generalization of marginal utility) and the state-price density. The state-price result is useful in other asset-pricing applications because it characterizes the set of pricing operators consistent with no arbitrage in a general market setting (see End Note b). The optimal consumption process is obtained by equating a supergradient density and state-price density (a generalized notion of equating marginal utility and prices). All results in this article are based on [18–20]; these references also develop more specialized and tractable formulations, based on quadratic modeling of risk aversion, and the last introduces jump risk (modeled by marked point processes).

All uncertainty is generated by d-dimensional standard Brownian motion B over the finite time horizon [0, T ], supported by a probability space (Ω, F, P ). All processes dealt with in this article are assumed to be progressively measurable with respect to the augmented filtration {Ft : t ∈ [0, T ]} generated
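The relation between the additive model (1) and the recursive specification (2) can be checked numerically. The sketch below is an illustration under assumptions that are not in the article: a recombining binomial tree with up-probability p, dates running from 0 to T, log period utility, and hypothetical function names. It evaluates recursion (2) backward and confirms that the aggregator F(c, m) = u(c) + e^{−β}m, with an identity risk function and v = u, reproduces the discounted expected sum in (1).

```python
import math


def recursive_utility(T, consumption, F, risk_u, v, p=0.5):
    """Backward recursion U_t = F(c_t, E_t[risk_u(U_{t+1})]) with U_T = v(c_T).

    Nodes are indexed by (t, k), where k is the number of up moves so far;
    consumption(t, k) gives consumption at that node (assumed tree model).
    """
    U = [v(consumption(T, k)) for k in range(T + 1)]
    for t in range(T - 1, -1, -1):
        U = [F(consumption(t, k),
               p * risk_u(U[k + 1]) + (1 - p) * risk_u(U[k]))
             for k in range(t + 1)]
    return U[0]


def additive_utility_direct(T, consumption, util, beta, p=0.5):
    """Time-0 value of the discounted expected sum in equation (1),
    computed node by node with binomial probabilities."""
    total = 0.0
    for s in range(T + 1):
        for k in range(s + 1):
            prob = math.comb(s, k) * p**k * (1 - p)**(s - k)
            total += math.exp(-beta * s) * prob * util(consumption(s, k))
    return total


# Assumed parameters: discount rate, horizon and a multiplicative tree.
beta, T = 0.05, 5
consumption = lambda t, k: 1.1**k * 0.9**(t - k)

U0_recursive = recursive_utility(
    T, consumption,
    F=lambda c, m: math.log(c) + math.exp(-beta) * m,  # additive aggregator
    risk_u=lambda U: U,                                # no extra risk adjustment
    v=math.log,
)
U0_direct = additive_utility_direct(T, consumption, math.log, beta)
```

Replacing `risk_u` with a nonlinear function (and applying its inverse inside `F`) would give a non-additive recursive utility; the additive case is used here only because it admits a direct cross-check.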
for all other feasible consumption plans c̃. We first in terms of the differential or superdifferential (in
show that optimality of c is essentially equivalent the absence of differentiability or in the presence of
to the utility supergradient density of U at c sat- constraints) of the corresponding aggregator.
isfying the conditions for a state-price density, and The superdifferential of f (t, ·) at (ω, t, w, α) rel-
then characterize these density equations in terms of ative to the constraint set K is the set ∂f (ω, t, w, α)
the utility and wealth aggregators defined above. The of all pairs (dw , dφ ) ∈ 1+m such that
resulting first-order conditions satisfy a constrained
forward–backward stochastic differential equation f (ω, t, w̃, α̃) − f (ω, t, w, α) ≤ dw (w̃ − w)
(FBSDE) system.
Given the feasible consumption plan c, the process + dφ (α̃ − α) for all (w̃, α̃) ∈ K (13)
π ∈ H is a state-price density at c if
Sufficient conditions for a state-price density
follow.c
(π|x) ≤ 0
Proposition 2 Suppose that (c, W , φ) is feasible
for all x such that c + x is a and π ∈ H++ satisfies
feasible consumption plan (10)
dπt
We can interpret (π|x) as the net present value of = −ζt dt − ηt dBt , (ζt , σtR ηt ) ∈ ∂f (t, Wt , φt )
πt
the cash flow x, which must be nonpositive for any (14)
feasible (i.e., affordable) incremental cash flow.
and πW ∈ S1 . Then π is a state-price density at c.
The process π ∈ H is a supergradient density of
U0 at c if The process η is often called the market price
of risk, with ηti representing the time-t shadow
U0 (c + x) ≤ U0 (c) + (π|x) incremental expected wealth return per unit additional
exposure to dBti . The drift term ζ represents the
for all x such that c + x ∈ C (11)
shadow incremental return per unit wealth. In the
and a utility gradient density of U0 at c if case of a linear budget equation (8) and K = 1+m
(no constraints but possibly incomplete markets), we
U0 (c + αx) − U0 (c) obtain the standard result ζt = rt and µRt = σtR ηt .
(π|x) = lim for all x such
α↓0 α Example 3 Collateral Constraint. Suppose that
that c + αx ∈ C for some α > 0 (12) there is a single risky asset (m = 1), and, as in
Example 1, f (ω, t, w, α) = r(ω, t)w + µR (ω, t)α.
If π is a supergradient density of U0 at c and We consider an agent who faces the collateral
the utility gradient of U0 at c exists, then the utility constraint:
gradient density is π.
The general optimality result follows. K = {(w, α) ∈ 2 : w ≥ |α|} (15)
Proposition 1 Suppose that (c, W , φ) is a feasible for some ∈ (0, 1). Then condition (ζ, σ R η) ∈
plan. If π ∈ H is both a supergradient density of U0 at ∂f (W, φ) is equivalent to the following restrictions:
c and a state-price density at c, then the plan (c, W , φ)
is optimal. Conversely, if the plan (c, W , φ) is optimal δt = ζt − rt ≥ 0, εt = µRt − t ∈ [−δt , δt ]
and π ∈ H is a utility gradient density of U0 at c, then
π is a state-price density at c. (φt > 0 ⇒ εt = δt ), (φt < 0 ⇒ εt = −δt ),
(Wt > |φt | ⇒ δt = 0) (16)
To apply Proposition 1, we obtain the dynamics
of the utility supergradient and state-price densities Papers analyzing collateral constraints in a
corresponding to the utility and market models, as Brownian setting and additive utility include [5, 16].
discussed in the sections Recursive Preferences and
Markets and the Wealth Equation. Both depend on the Assuming differentiability of the utility aggregator
feasible reference plan (c, W, φ) and are expressed F (nondifferentiability is accommodated by replacing
the differential with a superdifferential defined as for optimality conditions in the form of a constrained
f ), we now provide sufficient conditions for a utility FBSDE system:
supergradient density.
dU = − F (I(λ, U, ), U, )dt + dB,
Proposition 3 Suppose that c ∈ C, (U , ) solves
UT = F (T , WT + eT )
BSDE (5), π ∈ H++ satisfies
dλt
πt = Et Fc (t, ct , Ut , t ) (17) = − (ζ + FU + σ λ F )dt + σ λ dB,
λt
λT = Fc (T , WT + eT )
where
dW = (f (W, φ) + e − I(λ, U, ))dt + φ σtR dBt ,
dEt
= FU (t, ct , Ut , t )dt+ W0 = w0
Et
F (t, ct , Ut , t ) dBt , E0 = 1 (18) (ζ, −σ R (F + σ λ )) ∈ ∂f (W, φ),
(φ, W ) ∈ K (23)
and EU ∈ S1 . Then, π is a utility gradient density of
U0 at c. Given a solution (U, , λ, σ λ , W, φ) and suit-
able integrability assumptions (to satisfy Propositions
The supergradient density expression (17) is con- 1–3), then c in equation (21) defines an optimal con-
sistent with the calculations of Skiadas [2], Duffie and sumption plan.
Skiadas [9], Chen and Epstein [10], and El Karoui
et al. [21]. All these papers assume Lipschitz-growth
conditions that are violated in our setting. Scale and Translation-invariant Solutions
We now apply Proposition 1 to characterize the
first-order conditions. A key role in the solution is The first-order conditions significantly simplify when
played by the strictly positive process utility and wealth dynamics fall into either the scale
or translation-invariant classes. The scale-invariant,
λt = Fc (t, ct , Ut , t ) (19) or homothetic, class exhibits homogeneity of degree
one in consumption (when in certainty equivalent
computed at the optimum, which represents the form) and includes, as special cases, homothetic
derivative of time-t optimal utility with respect to Duffie–Epstein utility and additive power and log
time-t wealth (as in the familiar envelope result). We utility. The translation-invariant class exhibits quasi-
solve for µλ and σ λ in the Ito expansion linearity with respect to a reference consumption
dλt stream and generalizes additive exponential utility.
= µλt dt + σtλ dBt (20) In both cases, the FBSDE of the first-order condi-
λt
tions uncouples into a single pure backward equation
by applying Ito’s lemma to the utility gradient for λ and a pure forward equation for wealth.
density, πt = Et λt , and matching coefficients with
those of the state-price density in Proposition 2. Scale-invariant Class
Having solved for λ, invert equation (19) to express
the consumption plan c as We assume that consumption is strictly positive, and
the aggregator F (ω, t, ·) is homogeneous of degree
ct = I(t, λt , Ut , t ) (21) one, allowing the representation
where the function I : × [0, T ] × (0, ∞) × d+1
c
→ is defined implicitly through the following F (ω, t, c, U, ) = U G ω, t, , ,
equation: U U
F (T , c) = c (24)
Fc (t, I(t, y, U, ), U, ) = y, y ∈ (0, ∞) (22)
It is easy to confirm that utility is therefore
Combining the dynamics of λ with the utility homogeneous of degree one in consumption:
BSDE (5), the budget equation (7), and the state-
pricing restriction of Proposition 2, we obtain the U (kc) = kU (c) for all k ∈ + and c ∈ C (25)
Defining σtU = t /Ut , the BSDE (5) is equiva- and therefore σtU = σtλ + σtR ψt . Recalling λt =
lent to Gc (t, ct /Ut , σtU ), we define the inverse function
dUt IG () analogously to equation (22) to obtain ct /Ut =
= −G(t, ct /Ut , σtU )dt + σtU dBt , UT = cT IG (t, λt , σtU ). Defining the dual function of G∗
Ut
(26)
G∗ (t, λ, σ ) = G(t, IG (t, λ, σ ), σ ) − IG (t, λ, σ )λ
Example 4 Schroder and Skiadas [19] show that
the quasi-quadratic aggregator (32)
T
cT γ / . Utility and marginal utility of wealth processes
− exp − βu du − (37) satisfy
t γT
1 1
On the markets side, we assume that the reference Ut = (Yt + Wt ), λt = (41)
consumption stream γ is part of the feasible plan t t
(γ , , κ):
where (Y, φ 0 ) is determined by a constrained back-
ward SDE, given below, that is independent of finan-
dt γt cial wealth.
= µt −
κ
dt + κ σtR dBt , T = γT
t t Defining the superdifferential notation ∂f 0 anal-
(38) ogously to ∂f , the state-price density condition
(ζ, σ R η) ∈ ∂f (W, φ) is equivalent to ζ = µκ −
That is, is the price of a fund paying dividend κ σ R η and σ R η ∈ ∂f 0 (φ 0 ). Defining the inverse
process γ ; κ ∈ m represents the investment propor- and dual functions X, G∗ : × [0, T ] × d+1 →
tions of the fund; and µκ is the fund’s instantaneous by
expected return process.
For any (w, α) ∈ K, we assume (w + v, α + Gx (ω, t, X(ω, t, y, ), ) = y,
vκ) ∈ K and f (ω, t, w + v, α + vκ) = f (ω, t,
G∗(ω, t, y, ) = G(ω, t, X(ω, t, y,),)
w, α) + vµκ (ω, t) for all v ∈ . That is, trading in
the portfolio κ is unrestricted and earns instantaneous − y X(ω, t, y, )
expected return µκ regardless of the agent’s plan.
(42)
For example, under the linear budget equation
(Example 1), we have µκ = r + κ µR .
the processes (Y, σ Y , φ 0 ) satisfy
Defining the zero-wealth constraint set, aggrega-
tor, and portfolio and consumption processes
dY = − (e − Y µκ + f 0 (φ 0 ) + G∗ (δ, )
K 0 = {α : (0, α) ∈ K}, − κ σ R )dt + σ Y dB, YT = eT
f 0 (ω, t, α) = f (ω, t, 0, α), σ + (φ − κY ) σ
Y 0 R
= ,
φt0 = φt − Wt κ,
γt − σ R (G − σ R κ) ∈ ∂f 0 (φ 0 ), φ0 ∈ K 0
ct0 = ct − Wt (39)
t (43)
the budget equation (7) is equivalent to Given the solution (Y, σ Y , φ 0 ) and sufficient regu-
larity, the optimal wealth-independent component of
dt consumption is
dWt = Wt + (f 0 (t, φt0 ) + et − ct0 )dt
t
γt γt
+ φt0 σtR dBt , ct0 = Yt + γt X t, , t and cT0 = eT
t t
cT0 = eT , φt0 ∈ K 0 (40)
(44)
At the optimum, the quasi-linearity of utility and Substituting (c0 , φ 0 ) into the budget equation (40),
markets implies that there are two components to the optimal plan is (c0 + W γ / , W, φ 0 + W κ).
consumption and trading. The pair (c0 , φ 0 ) depends
on the investment opportunity set and the endowment,
but is independent of W . All incremental financial Acknowledgments
wealth is invested in the portfolio κ, and the resulting
dividend stream rate γ is consumed; therefore, (c − I am grateful to Costis Skiadas for many fruitful years of
c0 , φ − φ 0 ) depend only on W and the dividend yield joint research, on which this article is based.
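[Figure: utility function u(x) plotted against x, with the levels u(x1), u(x2), and (u(x1) + u(x2))/2 marked on the vertical axis and the certainty equivalent CE and expected value EV = (x1 + x2)/2 marked on the horizontal axis. Caption not recovered.]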
then EU ( ) ≥ λEU () + (1 − λ)EU ( ) for every B if their systems of preferences L, A and L, B
λ ∈ [0, 1] and every triplet , , ∈ L, with = give CEA () ≤ CEB () for every ∈ L.
(xi , pi )ni=1 , = (xi , pi )ni=1 and = (xi , pi )ni=1 .
Thus, if the agent is risk averse and the expected In the Hirshleifer–Yaari diagram, this definition
utility theory holds, the function EU (.) is concave implies that the indifference curves of the agents that
(and, all the more so, quasiconcave) with respect to go through the same point on the 45° line do not
the outcomes. Consequently, the indifference curves cross and that the indifference curve of the more risk
in the Hirshleifer–Yaari diagram are convex (as averse agent is to the north-east with respect to the
described in Figure 1, but not in Figure 2, which can indifference curve of the less risk averse agent, as
represent an agent who is risk averse but does not shown in Figure 4.
maximize expected utility).
Proposition 5 [7]. If agent A is more risk averse
Proposition 4 [5]. The agent is risk averse if the than agent B and the expected utility model applies,
certainty equivalent function CE : L → X is con-
vex with respect to the probabilities. The agent is
risk loving if it is concave and risk neutral if it is x2
linear.
The condition stated in Proposition 4 for risk aver-
sion is sufficient, but not necessary, nor is it necessary
that the certainty equivalent function CE(.) is quasi-
convex with respect to the probabilities. However,
if the expected utility theory holds and there is risk
aversion, then the certainty equivalent function is ∗ UA( ∗)
convex with respect to the probabilities: UB ( ∗)
n in fact, in
such a case, we have CE() = u−1 i=1 pi u(xi ) ,
where function u(.) is increasing and concave and
function u−1 (.) is increasing and convex. CEA( ∗) CEB ( ∗) x1
Definition 4 (Comparison of Risk Aversion across Figure 4 Indifference curves of two agents of whom one
Agents). An agent A is more risk averse than agent is more risk averse than the other
4 Risk Aversion
then the von Neumann–Morgenstern utility function uA (.) is a concave transformation of uB (.). That is, there exists an increasing and concave function g : ℝ → ℝ such that uA (x) = g(uB (x)) for every x ∈ X.

Local Risk Aversion

Till now, we considered global risk aversion, that is, the relationship CE(ℓ) ≤ EV (ℓ) was introduced for every lottery ℓ ∈ L. Now, let us consider local risk aversion, by taking into account only small lotteries, that is, the lotteries that have only little differences in consequences. For this purpose, we denote the lottery \( (x + t x_i, p_i)_{i=1}^{n} \) with x + tℓ, where \( \ell = (x_i, p_i)_{i=1}^{n} \).

Definition 5 (Local Risk Aversion). An agent is locally risk averse, if, for every x ∈ X and ℓ ∈ L, there exists a t* > 0 such that CE(x + tℓ) ≤ EV (x + tℓ) for all t ∈ [0, t*]. Thus, if the certainty equivalent function can be derived, then the agent is locally risk averse if \( \lim_{t \to 0} \frac{d}{dt}\bigl(EV(x + t\ell) - CE(x + t\ell)\bigr) > 0 \) and only if \( \lim_{t \to 0} \frac{d}{dt}\bigl(EV(x + t\ell) - CE(x + t\ell)\bigr) \geq 0 \) for every x ∈ X and ℓ ∈ L. By analogy, the definition holds with reversed inequality signs for the local risk loving.

Although the global risk aversion requires that in the Hirshleifer–Yaari diagram the indifference curve and the expected value line passing through some point on the 45° line do not cross and that the indifference curve is to the north-east with respect to the expected value line, this condition needs to be satisfied only in the vicinity of the 45° line for the local risk aversion.

Proposition 6 If the expected utility theory holds, then the agent is locally risk averse if and only if his/her von Neumann–Morgenstern utility function u : X → ℝ is concave. In other words, if the expected utility theory holds, then the conditions for local and global risk aversion (risk loving or neutrality) are the same.

Measure of the Risk Aversion

If the expected utility theory holds, then the local risk aversion can be measured by the concavity of the von Neumann–Morgenstern utility function u(.). However, the second derivative of the utility function u''(.), which is a measure of its concavity, is not invariant to increasing linear transformations of u(.). An invariant measure is the de Finetti–Arrow–Pratt coefficient of risk aversion (due to de Finetti [3], Pratt [7], and Arrow [1]). This measure of (absolute) risk aversion is defined as

\[ r(x) = -\frac{u''(x)}{u'(x)} \quad (3) \]

There also exists a measure of relative risk aversion \( r_r(x) = -x\,\frac{u''(x)}{u'(x)} \), which is important in the case of multiplicative lotteries \( \ell = (\alpha_i W, p_i)_{i=1}^{n} \).

The de Finetti–Arrow–Pratt measure can be justified in relation to the local risk premium, which is (by Definition 3)

\[ RP(x + t\ell) = EV(x + t\ell) - CE(x + t\ell) = x + t\,EV(\ell) - u^{-1}\!\left(\sum_{i=1}^{n} p_i\, u(x + t x_i)\right) \quad (4) \]

Then, assuming that this function is differentiable with respect to t, we get RP (x) = 0, \( \left.\frac{\partial RP(x + t\ell)}{\partial t}\right|_{t=0} = 0 \) and \( \left.\frac{\partial^2 RP(x + t\ell)}{\partial t^2}\right|_{t=0} = -\frac{u''(x)}{u'(x)}\,\sigma^2(\ell) \). Therefore, in the neighborhood of the certain outcome x, the risk premium is proportional to the de Finetti–Arrow–Pratt measure. Nevertheless, the fact that only the second derivative of the risk premium can be different from zero at t = 0, while the first derivative is always equal to zero, means that the expected utility theory allows only for local risk aversion of the second order, while that of the first order is zero. Other theories (e.g., rank-dependent expected utility, which is discussed later) also allow for the risk aversion of the first order and can, as a result, describe the preferences that indicate more relevant types of aversion to risk (like the one presented in Allais paradox) than the risk aversion admitted by the expected utility theory and measured by the de Finetti–Arrow–Pratt index.

Local risk aversion in the Hirshleifer–Yaari diagram is linked to the curvature of indifference curves at the point where they intersect the 45° line. In other words, it is linked to the value of the second derivative x2''(x1 ) at x1 = x, where
the function $x_2(x_1)$ that represents the indifference curve is implicitly defined by the condition $CE(x_1, x_2) = x$. Then, if the expected utility theory holds, we get

$$x_2(x) = x, \qquad x_2'(x) = -\frac{p}{1 - p}, \qquad x_2''(x) = -\frac{p}{(1 - p)^2}\,\frac{u''(x)}{u'(x)},$$

that is, the curvature of the indifference curves along the 45° line is proportional to the de Finetti–Arrow–Pratt measure of risk aversion.

The dependence of the de Finetti–Arrow–Pratt index r(x) on x defines decreasing absolute risk aversion if r'(x) < 0 (increasing if r'(x) > 0), as well as, with regard to $r_r(x)$, decreasing relative risk aversion if $r_r'(x) < 0$ (increasing if $r_r'(x) > 0$).

Aversion Toward Increases in Risk

Risk aversion can also be analyzed taking into account the riskiness of lotteries, that is, considering preference for less risky lotteries. However, there does not exist a unique definition of riskiness according to which lotteries can be ordered. In the following, only two definitions of riskiness are examined. Both introduce a partial ordering criterion.

1. The first definition refers to mean preserving spreads (introduced by Rothschild and Stiglitz [10]). A lottery $\ell = (x_i, p_i)_{i=1}^{n}$ is not less risky than lottery $\ell^* = (x_i^*, p_i^*)_{i=1}^{n}$ if ℓ can be obtained from ℓ* by mean preserving spreads. That is, if EV(ℓ) = EV(ℓ*), $x_i = x_i^*$ for every i = 1, . . . , n and $p_i = p_i^*$ for every i = 1, . . . , n except for three outcomes $x_a > x_b > x_c$, for which we have $p_a \ge p_a^*$, $p_b \le p_b^*$, and $p_c \ge p_c^*$. For example, $\ell = (x_1, p_1; x_2, p_2; x_3, p_3)$ is not less risky than $\ell^* = (x_1, p_1^*; x_2, p_2^*; x_3, p_3^*)$ if $p_2 \le p_2^*$, $p_1 = p_1^* + \frac{x_2 - x_3}{x_1 - x_3}(p_2^* - p_2)$, $p_3 = p_3^* + \frac{x_1 - x_2}{x_1 - x_3}(p_2^* - p_2)$, and $x_1 > x_2 > x_3$.

Definition 6 (Aversion to Mean Preserving Spreads Increases in Risk). An agent is averse to the increases in risk if CE(ℓ) ≤ CE(ℓ*) for every pair of lotteries ℓ, ℓ* ∈ L with ℓ not less risky than ℓ* (according to mean preserving spreads).

Proposition 7 If an agent is averse to mean preserving spreads increases in risk, then he/she is also risk averse (for this reason, sometimes the aversion to mean preserving spreads increases in risk is called strong risk aversion and the risk aversion as introduced in Definition 3 is called weak risk aversion [2]). To be precise, if CE(ℓ) ≤ CE(ℓ*) for every pair ℓ, ℓ* ∈ L with ℓ not less risky than ℓ* (according to mean preserving spreads), then CE(ℓ) ≤ EV(ℓ) for every ℓ ∈ L.

Proposition 8 If the expected utility model applies, then there is aversion toward mean preserving spreads increases in risk if and only if the von Neumann–Morgenstern utility function u : X → ℝ is concave.

Note that the concavity of the utility function is a necessary and sufficient condition for both risk aversion and aversion to increases in risk (determined by mean preserving spreads). The two conditions coincide in the case of expected utility theory. For other theories, we will generally have two different conditions (one for risk aversion and the other for the aversion to increases in risk).

An ordering of the lotteries according to their riskiness that is equivalent to the mean preserving spreads concept (for the lotteries that have equal expected value) is provided by the notion of second-order stochastic dominance.

Definition 7 (First-order Stochastic Dominance). A lottery $\ell = (x_i, p_i)_{i=1}^{n}$, where $x_i > x_{i+1}$ for every i = 1, . . . , n − 1, first-order stochastically dominates lottery $\ell' = (x_i, p_i')_{i=1}^{n}$ if $\sum_{h=1}^{i} p_h \ge \sum_{h=1}^{i} p_h'$ (or, equivalently, $\sum_{h=i+1}^{n} p_h \le \sum_{h=i+1}^{n} p_h'$) for every i = 1, . . . , n − 1, that is, with respect to the cumulative probability functions (introduced earlier), if F(x) ≤ F'(x) for every x ∈ X.

First-order stochastic dominance means that probabilities of the better (worse) outcomes are higher (lower) in the dominant lottery than in the dominated lottery. It implies that EV(ℓ) ≥ EV(ℓ') and, also, CE(ℓ) ≥ CE(ℓ') for a rational agent.

Definition 8 (Second-order Stochastic Dominance). A lottery $\ell = (x_i, p_i)_{i=1}^{n}$, where $x_i > x_{i+1}$ for every i = 1, . . . , n − 1, second-order stochastically dominates lottery $\ell' = (x_i, p_i')_{i=1}^{n}$ if

$$D_j(\ell, \ell') = \sum_{i=j}^{n-1} (x_i - x_{i+1}) \sum_{h=1}^{i} (p_h - p_h') \ge 0$$

for every j = 1, . . . , n − 1, that is, with respect to the cumulative probability functions in the continuous case, if $\int_{\underline{x}}^{x} (F(t) - F'(t))\,dt \le 0$ for every $x \in X = [\underline{x}, \overline{x}]$. The first-order
stochastic dominance implies second-order stochastic dominance, but not vice versa.

Proposition 9 Let two lotteries ℓ and ℓ' have the same expected value, so that $\sum_{i=1}^{n-1} (x_i - x_{i+1}) \sum_{h=1}^{i} (p_h - p_h') = 0$. If the lottery ℓ' is more risky than ℓ (according to the mean preserving spreads criterion), then ℓ second-order stochastically dominates ℓ'. Conversely, if ℓ second-order stochastically dominates ℓ', then ℓ' can be obtained from ℓ by a sequence of mean preserving spreads.

The equivalence of second-order stochastic dominance and mean preserving spreads for lotteries with the same expected value implies that the same conditions that determine the aversion to the increases in risk (introduced by mean preserving spreads) also determine the aversion for the lotteries that are second-order stochastically dominated (in comparison between lotteries of the same expected value).

2. The second definition of riskiness refers to probability mixtures [11]. According to this definition, a compound lottery is, ceteris paribus, more risky than a simple lottery. More precisely, let us define as a probability mixture of two simple lotteries $\ell_a = (x_a(s_j), p(s_j))_{j=1}^{m}$ and $\ell_b = (x_b(s_j), p(s_j))_{j=1}^{m}$, where $S = \{s_1, \ldots, s_m\}$ is the set of the states of nature, the two-stage lottery $\lambda\ell_a \oplus (1 - \lambda)\ell_b = (((x_a(s_j), \lambda), (x_b(s_j), (1 - \lambda))), p(s_j))_{j=1}^{m}$, where λ ∈ [0, 1]. Figure 5 represents the simplest case of a probability mixture.

[Figure 5: the simple lotteries ℓa and ℓb over the states s1 and s2, and their probability mixture λℓa ⊕ (1 − λ)ℓb]

Definition 9 (Aversion to Probability Mixture Increases in Risk). An agent is averse to the increases in risk if $CE(\lambda\ell_a \oplus (1 - \lambda)\ell_b) \le \max\{CE(\ell_a), CE(\ell_b)\}$ for every pair of lotteries $\ell_a, \ell_b \in L$ and λ ∈ [0, 1].

Note that the expected utility model implies neutrality toward probability mixture increases in risk, since this model satisfies the compound lottery principle, according to which $EU(\lambda\ell_a \oplus (1 - \lambda)\ell_b) = \lambda EU(\ell_a) + (1 - \lambda)EU(\ell_b)$.

Risk Aversion and Aversion to Increasing Risk with Regard to Rank-dependent Expected Utility

Let us take into consideration a generalization of expected utility theory in order to show some aspects of risk aversion and aversion to increasing risk, which appear very different from the case of expected utility.

Definition 10 (Rank-dependent Expected Utility [8, 4]). The system of preferences ⟨L, ≽⟩ is represented by rank-dependent expected utility U : L → ℝ if, for every lottery ℓ ∈ L with $\ell = (x_i, p_i)_{i=1}^{n}$ and $x_i > x_{i+1}$ for every i = 1, . . . , n − 1, where $x_i \in X$ with $X = [\underline{x}, \overline{x}] \subset \mathbb{R}$, we have

$$U(\ell) = u(x_n) + \sum_{i=1}^{n-1} \big(u(x_i) - u(x_{i+1})\big)\,\varphi\!\left(\sum_{h=1}^{i} p_h\right) \tag{5}$$

where function u : X → ℝ represents the system of preferences ⟨X, ≽⟩ over the set of outcomes and function ϕ : [0, 1] → [0, 1], which is increasing, with ϕ(0) = 0 and ϕ(1) = 1, distorts the decumulative probability function.

Thus, the rank-dependent expected utility model describes the agent’s system of preferences by means of a utility function on outcomes and a probability distortion function (while the expected utility model requires only the first function). Note that, when ϕ(p) = p for every p ∈ [0, 1], equation (5) reduces to the expected utility $\sum_{i=1}^{n} p_i u(x_i)$.
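The rank-dependent functional of equation (5) can be evaluated directly. The following is a minimal sketch under illustrative assumptions that are not part of the text: a square-root utility, a convex distortion ϕ(p) = p², and a three-outcome lottery chosen only for the example. It also checks numerically that the identity distortion recovers ordinary expected utility.

```python
import math
from typing import Callable, Sequence


def rdeu(outcomes: Sequence[float], probs: Sequence[float],
         u: Callable[[float], float], phi: Callable[[float], float]) -> float:
    """Rank-dependent expected utility as in equation (5).

    Outcomes must be strictly decreasing (x_1 > x_2 > ... > x_n),
    and probs[i] is the probability of outcomes[i].
    """
    n = len(outcomes)
    if any(outcomes[i] <= outcomes[i + 1] for i in range(n - 1)):
        raise ValueError("outcomes must be strictly decreasing")
    value = u(outcomes[-1])          # u(x_n), the worst outcome
    cumulative = 0.0
    for i in range(n - 1):
        cumulative += probs[i]       # sum_{h<=i} p_h, the decumulative probability
        value += (u(outcomes[i]) - u(outcomes[i + 1])) * phi(cumulative)
    return value


def expected_utility(outcomes, probs, u):
    """Ordinary expected utility sum_i p_i u(x_i)."""
    return sum(p * u(x) for x, p in zip(outcomes, probs))


# Illustrative (hypothetical) lottery, utility and distortions.
outcomes = [100.0, 50.0, 10.0]
probs = [0.2, 0.5, 0.3]
u = math.sqrt

eu = expected_utility(outcomes, probs, u)
rd_identity = rdeu(outcomes, probs, u, lambda p: p)          # phi(p) = p
rd_pessimistic = rdeu(outcomes, probs, u, lambda p: p ** 2)  # phi(p) <= p

print(eu, rd_identity, rd_pessimistic)
```

With ϕ(p) = p² ≤ p every decumulative probability is underweighted, so the rank-dependent value falls below the expected utility even though u is unchanged; this is one way the distortion function carries an attitude toward risk that the utility function alone does not.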
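The second-order behaviour of the local risk premium derived from equation (4) can also be checked numerically. Since RP and its first derivative vanish at t = 0 and the second derivative equals r(x)σ²(ℓ), a second-order Taylor expansion suggests RP(x + tℓ) ≈ ½ t² r(x) σ²(ℓ) for small t. The sketch below assumes, purely for illustration, logarithmic utility (for which r(x) = 1/x), an initial outcome x = 100 and a symmetric ±10 lottery; none of these choices come from the text.

```python
import math


def risk_premium(x, t, lottery, u, u_inv):
    """Exact RP(x + t*l) = EV(x + t*l) - CE(x + t*l), as in equation (4).

    lottery is a list of (x_i, p_i) pairs.
    """
    ev = x + t * sum(p * xi for xi, p in lottery)
    ce = u_inv(sum(p * u(x + t * xi) for xi, p in lottery))
    return ev - ce


def variance(lottery):
    """Variance sigma^2(l) of the lottery outcomes."""
    mean = sum(p * xi for xi, p in lottery)
    return sum(p * (xi - mean) ** 2 for xi, p in lottery)


def compare(t, x=100.0, lottery=((10.0, 0.5), (-10.0, 0.5))):
    """Return (exact risk premium, second-order approximation) for log utility."""
    lottery = list(lottery)
    r = 1.0 / x  # absolute risk aversion of u(x) = ln x
    exact = risk_premium(x, t, lottery, math.log, math.exp)
    approx = 0.5 * t ** 2 * r * variance(lottery)
    return exact, approx


for t in (1.0, 0.5, 0.1):
    exact, approx = compare(t)
    print(t, exact, approx)
```

The two columns agree more and more closely as t shrinks, which is the sense in which the risk premium is locally proportional to the de Finetti–Arrow–Pratt measure.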
References

[1] Arrow, K.J. (1965). Aspects of the Theory of Risk-Bearing, Yrjö Jahnssonin Säätiö, Helsinki.
[2] Cohen, M.D. (1995). Risk-aversion concepts in expected- and non-expected-utility models, Geneva Papers on Risk and Insurance Theory 20, 73–91.
[3] de Finetti, B. (1952). Sulla preferibilità, Giornale degli Economisti NS 11, 685–709.

Related Articles

Ambiguity; Behavioral Portfolio Selection; Expected Utility Maximization; Risk–Return Analysis; Utility Function.

ALDO MONTESANO
Ambiguity

In the literature on decision making under uncertainty, ambiguity is now consistently used to define those decision settings in which an economic agent perceives “[. . .] uncertainty about probability, created by missing information that is relevant and could be known” [17]. Other terms have been used interchangeably, notably “Knightian uncertainty,” based on Knight’s [32] distinction between “risk” (a context in which all the relevant “odds” are known and unanimously agreed upon) and “uncertainty” (a context in which some “odds” are not known). The term ambiguity, which avoids charging uncertainty with too many meanings, was introduced in [12], the paper that first showed how ambiguity represents a normative criticism of Savage’s [38] subjective expected utility (SEU) model.

Ellsberg proposed two famous thought experiments involving choices on urns in which the exact distribution of ball colors is unknown (one of which was anticipated in both [29] and [32]). A variant of Ellsberg’s so-called two-urn paradox is the following example, due to David Schmeidler. “Suppose that I ask you to make bets on two coins, one taken out of your pocket, a coin which you have flipped countless times, the other taken out of my pocket. If asked to bet on ‘heads’ or on ‘tails’ on one of the two coins, would you rather bet on your coin or mine?” Most people, when posed this question, announce a mild but strict preference for betting on their own coin rather than on somebody else’s, both for heads and for tails. The rationale is precisely that their coin has a well-understood stochastic behavior, while the other person’s coin does not; that is, its behavior is ambiguous. The possibility that the coin be biased, although remote, cannot be dismissed altogether. This pattern of preference is called ambiguity aversion, and is, as suggested, very common (e.g., [6, p. 646] references many experimental replications of the “paradox”). It is easy to see that it is not compatible with the SEU model. For, suppose that a decision maker has a probabilistic prior P over the state space S = {HH, HT, TH, TT} (where HT is the state in which the familiar coin lands heads up and the unfamiliar coin lands tails up, etc.). Then, by saying that he/she prefers a bet that pays off ¤1 if the familiar coin lands heads up (that is, a bet on the event A = {HH, HT}) to the bet that pays ¤1 if the unfamiliar coin lands heads up (that is, a bet on the event B = {HH, TH}), an SEU decision maker reveals that

$$u(1)\,P(A) + u(0)\,(1 - P(A)) > u(1)\,P(B) + u(0)\,(1 - P(B)) \tag{1}$$

that is, P(A) > P(B). Analogously, by preferring the bet on tails on the familiar coin to the bet on tails on the unfamiliar coin, an SEU decision maker reveals that

$$P(\{TH, TT\}) = P(A^c) = 1 - P(A) > 1 - P(B) = P(B^c) = P(\{HT, TT\}) \tag{2}$$

that is, P(A) < P(B): a contradiction. Yet, few people would immediately describe these preferences as being an example of irrationality. Ellsberg reports that Savage himself chose in the manner described above, and did not feel that his choices were clearly wrong [12, p. 656]. (Indeed, Savage was aware of the issue well before Ellsberg proposed his thought experiments, for Savage wrote in the Foundations of Statistics (pp. 57–58) that “there seem to be some probability relations about which we feel relatively ‘sure’ as compared to others,” adding that he did not know how to make such a notion of comparatively “sure” less vague.)

Ellsberg’s paper generated quite a bit of debate immediately after its publication (most of which is discussed in Ellsberg’s PhD dissertation [13]), but the lack of axiomatically founded models that could encompass a concern for ambiguity while retaining most of the compelling features of the SEU model worked to douse the flames. Moreover, the so-called Allais paradox [2], another descriptive failure of expected utility, which predated Ellsberg’s by a few years, monopolized the attention of decision theorists until the early 1980s. However, statisticians such as Good [23] and Arthur Dempster [9] did lay the foundations of statistics with sets of probabilities, providing analysis and technical results, which eventually made it into the toolbox of decision theorists.

Models of Ambiguity-sensitive Preferences

The interest in ambiguity as a reason for departure from the SEU model was revived by David Schmeidler, who proposed and characterized axiomatically
two of the most successful models of decision making in the presence of ambiguity, the Choquet expected utility (CEU) and the maxmin expected utility (MEU) models.

CEU [39] “resolves” the Ellsberg paradox by allowing a decision maker’s willingness to bet on an event to be represented by a set-function that is not necessarily additive; that is, a v, which, to disjoint events A and B, may assign v(A ∪ B) ≠ v(A) + v(B). More precisely, call a capacity any function v defined on a σ-algebra Σ of subsets of a state space S, which satisfies the following properties: (i) v(∅) = 0, (ii) v(S) = 1, (iii) for any A, B ∈ Σ such that A ⊆ B, v(A) ≤ v(B). (Note that a probability (charge) is a v which satisfies, instead of (iii), the property v(A ∪ B) = v(A) + v(B) − v(A ∩ B) for any A, B ∈ Σ.) It is simple to see that if v represents a decision maker’s beliefs, we may observe the preferences described above in the two-coin example. Just substitute P in equations (1) and (2) with v satisfying $v(A) = v(A^c) = 1/2$ and $v(B) = v(B^c) = 1/4$. The obvious question is that of defining expectations for a notion of “belief”, which is not a measure. As the model’s name suggests, Schmeidler used the notion of integral for capacities, which was developed by Choquet [8]. Formally, given a capacity space (S, Σ, v) and a Σ-measurable function a : S → ℝ, the Choquet integral of a with reference to (w.r.t.) v is given by the following formula:

$$\int_S a(s)\,dv(s) \equiv \int_0^{\infty} v(\{s \in S : a(s) \ge \alpha\})\,d\alpha + \int_{-\infty}^{0} \big[v(\{s \in S : a(s) \ge \alpha\}) - 1\big]\,d\alpha \tag{3}$$

This is shown to correspond to Lebesgue integration when the capacity v is a probability. Schmeidler provided axioms on a decision maker’s preference relation ≽, which guarantee that the latter is represented by the Choquet expectation w.r.t. v of a real-valued utility function u (on final prizes x ∈ X). Precisely, given choice options (acts) f, g : S → X,

$$f \succcurlyeq g \iff \int_S u(f(s))\,dv(s) \ge \int_S u(g(s))\,dv(s) \tag{4}$$

That is, the decision maker prefers f to g whenever the Choquet integral of u ∘ f is greater than that of u ∘ g. The interested reader is referred to Schmeidler’s paper for details of the axiomatization. For our purpose, it suffices to observe that, not too surprisingly, the key axiomatic departure from SEU (in the variant due to [3]) is a relaxation of the independence axiom (or what Savage calls the sure-thing principle), which is the property of preferences that the Ellsberg-like preferences above violate.

Not all capacities give rise to behavior which is averse to ambiguity, as in the above example. Schmeidler proposed the following behavioral notion of aversion to ambiguity. Assuming that the payoffs x can themselves be (objective and additive) lotteries over a set of certain prizes, define for any α ∈ [0, 1] the α-mixture of acts f and g as follows: for any s ∈ S,

$$(\alpha f + (1 - \alpha)g)(s) \equiv \alpha f(s) + (1 - \alpha)g(s) \tag{5}$$

where the object on the right-hand side is the lottery that pays off prize f(s) with probability α and prize g(s) with probability (1 − α). Now, say that a preference satisfies ambiguity hedging (Schmeidler calls this property uncertainty aversion) if for any f and g such that f ∼ g we have

$$\alpha f + (1 - \alpha)g \succcurlyeq f \tag{6}$$

for any α. That is, the decision maker may prefer to “hedge” the ambiguous returns of two indifferent acts by mixing them appropriately. This makes sense if we consider two acts whose payoff profiles are negatively correlated (over S), so that the mixture has a payoff profile which is flatter, hence less sensitive to the information on S, than the original acts. (Ghirardato and Marinacci [20] discuss ambiguity hedging, arguing that it captures more than just the ambiguity aversion of equations (1) and (2).) Schmeidler shows that a CEU decision maker satisfies ambiguity hedging if and only if her capacity v is supermodular; that is, for any A, B ∈ Σ,

$$v(A \cup B) \ge v(A) + v(B) - v(A \cap B) \tag{7}$$

Ambiguity hedging also plays a key role in the second model of ambiguity-sensitive preferences proposed by Schmeidler, the MEU model introduced jointly with Itzhak Gilboa [21]. In MEU, the decision maker’s preferences are represented by (a utility function u and) a set C of probability charges
on (S, Σ), which is nonempty, (weak*-)closed and convex, as follows:

$$f \succcurlyeq g \iff \min_{P \in C} \int_S u(f(s))\,dP(s) \ge \min_{P \in C} \int_S u(g(s))\,dP(s) \tag{8}$$

Thus, the presence of ambiguity is reflected by the nonuniqueness of the prior probabilities over the set of states. In the authors’ words, “the subject has too little information to form a prior. Hence, (s)he considers a set of priors as possible” [21, p. 142]. In the two-coin example, let S be the product space {H, T} × {H, T} and consider the set of priors

$$C \equiv \bigcup_{a \in [1/4,\,3/4]} \big\{\{1/2, 1/2\} \times \{a, 1 - a\}\big\} \tag{9}$$

It is easy to see that a decision maker with such a C will “assign” to events A and $A^c$ the weight $\min_{P \in C} P(A) = 1/2 = \min_{P \in C} P(A^c)$, and to events B and $B^c$ the weight $\min_{P \in C} P(B) = 1/4 = \min_{P \in C} P(B^c)$, thus displaying the classical Ellsberg preferences. Gilboa and Schmeidler showed that MEU is axiomatically very close to CEU. While ambiguity hedging is required (being single-handedly responsible for the “min” in the representation; see [19]), a weaker version of independence is used.

Ambiguity hedging characterizes the intersection of the CEU and MEU models. Schmeidler [39] shows that a decision maker’s preferences have both CEU and MEU representations if and only if (i) the v in the CEU representation is supermodular, and (ii) the lower envelope of the set C in the MEU representation, $\underline{C}(\cdot) \equiv \min_{P \in C} P(\cdot)$, is a supermodular capacity and C is the set of all the probability charges that dominate $\underline{C}$ (the core of $\underline{C}$). On the other hand, there are CEU preferences that are not MEU (take a capacity v which is not supermodular), and MEU preferences that are not CEU (see [30, Example 1]).

The CEU and MEU models brought ambiguity back to the forefront of decision theoretic research, and in due course, as “applications” of such theoretical models started to appear, they were key in attracting the attention of mainstream economics and finance.

On the theoretical front, a number of alternative axiomatic models have been developed. First, there are generalizations of CEU and MEU. For instance, Maccheroni et al. [33] presented a model that they called variational preferences, which relaxes the independence condition used in MEU while retaining the ambiguity hedging condition. An important special case of variational preferences is the so-called multiplier model of Hansen and Sargent [25], a key model in the applications literature to be discussed later. Siniscalchi [42] proposed a model that he called vector expected utility, in which an act is evaluated by modifying its expectation (w.r.t. a “baseline probability”) by an adjustment function capturing ambiguity attitudes. Such a model is also built with applications in mind, as it (potentially) employs a smaller number of parameters than CEU and MEU.

Second, Bewley [4] (originally circulated in 1986) suggested that ambiguity might result in incompleteness of preferences, rather than in violation of independence. Under such assumptions, he found a representation in which a set of priors C appears in a “unanimity” sense as follows:

$$f \succcurlyeq g \iff \int_S u(f(s))\,dP(s) \ge \int_S u(g(s))\,dP(s) \quad \text{for all } P \in C \tag{10}$$

That is, the decision maker prefers f over g whenever f dominates g according to every “possible scenario” in C. Preferences are undecided otherwise, and Bewley suggested completing them by following an “inertia” rule: the status quo is retained if undominated by any available act. In a model that joins the two research strands just described, Ghirardato et al. [19] showed that if we drop ambiguity hedging from the MEU axioms, we can still obtain the set of priors C as a “unanimous” representation of a suitably defined incomplete subset of the decision maker’s preference relation, which they interpreted as “unambiguous” preference (i.e., a preference that is not affected by the presence of ambiguity). This yields a model, of which both CEU and MEU are special cases, in which the decision maker evaluates act f via the functional

$$V(f) = a(f) \min_{P \in C} \int_S u(f(s))\,dP(s) + (1 - a(f)) \max_{P \in C} \int_S u(f(s))\,dP(s) \tag{11}$$
where a(f) ∈ [0, 1] is the decision maker’s ambiguity aversion in evaluating f (a generalization of the decision rule suggested by Hurwicz [27]).

A third modeling approach relaxes the “reduction of compound lotteries” property that is built within the expected utility model. The basic idea is that the decision maker forms a “second-order” probability µ over the set of possible priors over S, and that he/she does not reduce the resulting compound probability. That is, he/she could evaluate act f by first calculating its expectation $E_P(u \circ f) \equiv \int_S u(f(s))\,dP(s)$ with respect to each prior P that he/she deems possible, and then computing

$$\int_{\Delta} \phi\big(E_P(u \circ f)\big)\,d\mu(P) \tag{12}$$

where Δ denotes the set of all possible probability charges on (S, Σ), and φ : ℝ → ℝ is a function, which is not necessarily affine. This is the reasoning adopted by Segal [40], followed by Ergin and Gul [16], Klibanoff et al. [31], Nau [37], and Seo [41]. The case of SEU corresponds to φ being affine, while Klibanoff et al. [31] show that φ being concave corresponds intuitively to ambiguity averse preferences. That is, the “external” utility function describes ambiguity attitude, while the “internal” one describes risk attitude. An important feature of such a model is that its representation is smooth (in utility space), whereas those of MEU and CEU are generally not. For this reason, this is called the smooth ambiguity model.

In concluding this brief survey of decision models, it is important to stress that, owing to space constraints, the focus is on static models. The literature on intertemporal models is more recent and less developed, in part because non-SEU preferences often violate a property called dynamic consistency [18], making it hard to use the traditional dynamic programming tools. Important contributions in this area are found in [14, 22] (characterizing the so-called recursive MEU model) and [24, 34].

Applications

As mentioned above, the CEU and MEU models were finally successful in introducing ambiguity into mainstream research in economics and finance. Many papers have been written which assume that (some) agents have CEU or MEU preferences. The interested reader is referred to [36] for an extensive survey of such applications, while some applications to finance are briefly discussed here.

In a seminal contribution, Dow and Werlang [10] showed that a CEU agent with supermodular capacity may display a nontrivial bid–ask spread on the price of an (ambiguous) Arrow security, even without frictions. If the price of the security falls within such an interval, the agent will not want to trade the security at all (given an initial riskless position). Epstein and Wang [15] employed the recursive MEU model to study the equilibrium of a representative agent economy à la Lucas. They showed that price indeterminacy can arise in equilibrium for reasons that are closely related to Dow and Werlang’s observation. Other contributions followed along this line; for example, see [7, 35, 43]. More recently, the smooth ambiguity model has also been receiving attention; see, for example, [28].

Though originally not motivated by the Ellsberg paradox and ambiguity, the “model uncertainty” literature due to Hansen et al. ([26], but more comprehensively found in [25]) falls squarely within the scope of the applications of ambiguity. Moreover, both decision models they employ are special cases of the models described above: the “multiplier model” is a special case of variational preferences, and the “constraint model” is a special case of MEU.

Most of the applications of ambiguity to finance (an exception being [11]) are cast in a representative agent environment, with the preferences of the representative agent satisfying in one case MEU, in another CEU, and so on. Recent work on experimental finance by Bossaerts et al. [5] and Ahn et al. [1] finds that experimental subjects, when making portfolio choices with ambiguous Arrow securities, display substantial heterogeneity in ambiguity attitudes. Because Bossaerts et al. [5] show that such heterogeneity may easily result in a breakdown of the representative agent result, such findings cast some doubt on the generality of a representative agent approach to financial markets equilibrium.

References

[1] Ahn, D., Choi, S., Gale, D. & Shachar, K. (2007). Estimating Ambiguity Aversion in a Portfolio Choice Experiment, UC Berkeley, Mimeo.
[2] Allais, M. (1953). Le comportement de l’homme rationnel devant le risque: Critique des postulats
et axiomes de l’école américaine, Econometrica 21, 503–546.
[3] Anscombe, F.J. & Aumann, R.J. (1963). A definition of subjective probability, Annals of Mathematical Statistics 34, 199–205.
[4] Bewley, T. (2002). Knightian decision theory: part I, Decisions in Economics and Finance 25(2), 79–110. (First version 1986).
[5] Bossaerts, P., Ghirardato, P., Guarnaschelli, S. & Zame, W.R. (2006). Ambiguity and asset markets: theory and experiment, Review of Financial Studies, forthcoming, Notebook 27, Collegio Carlo Alberto.
[6] Camerer, C. (1995). Individual decision making, in The Handbook of Experimental Economics, J.H. Kagel & A.E. Roth, eds, Princeton University Press, Princeton, NJ, pp. 587–703.
[7] Chen, Z. & Epstein, L.G. (1999). Ambiguity, Risk and Asset Returns in Continuous Time, University of Rochester, Mimeo.
[8] Choquet, G. (1953). Theory of capacities, Annales de l’Institut Fourier (Grenoble) 5, 131–295.
[9] Dempster, A.P. (1967). Upper and lower probabilities induced by a multi-valued mapping, Annals of Mathematical Statistics 38, 325–339.
[10] Dow, J. & Werlang, S. (1992). Uncertainty aversion, risk aversion, and the optimal choice of portfolio, Econometrica 60, 197–204.
[11] Easley, D. & O’Hara, M. Ambiguity and nonparticipation: the role of regulation, Review of Financial Studies 22(5), 1817–1843.
[12] Ellsberg, D. (1961). Risk, ambiguity, and the Savage axioms, Quarterly Journal of Economics 75, 643–669.
[13] Ellsberg, D. (2001). Risk, Ambiguity and Decision. PhD thesis, Harvard University, 1962. Published by Garland Publishing Inc., New York.
[14] Epstein, L.G. & Schneider, M. (2003). Recursive multiple-priors, Journal of Economic Theory 113, 1–31.
[15] Epstein, L.G. & Wang, T. (1994). Intertemporal asset pricing under Knightian uncertainty, Econometrica 62, 283–322.
[16] Ergin, H. & Gul, F. (2004). A Subjective Theory of Compound Lotteries. February.
[17] Frisch, D. & Baron, J. (1988). Ambiguity and rationality, Journal of Behavioral Decision Making 1, 149–157.
[18] Ghirardato, P. (2002). Revisiting Savage in a conditional world, Economic Theory 20, 83–92.
[19] Ghirardato, P., Maccheroni, F. & Marinacci, M. (2004). Differentiating ambiguity and ambiguity attitude, Journal of Economic Theory 118(2), 133–173.
[20] Ghirardato, P. & Marinacci, M. (2002). Ambiguity made precise: a comparative foundation, Journal of Economic Theory 102, 251–289.
[21] Gilboa, I. & Schmeidler, D. (1989). Maxmin expected utility with a non-unique prior, Journal of Mathematical Economics 18, 141–153.
[22] Gilboa, I. & Schmeidler, D. (1993). Updating ambiguous beliefs, Journal of Economic Theory 59, 33–49.
[23] Good, I.J. (1962). Subjective probability as the measure of a nonmeasurable set, in Logic, Methodology and Philosophy of Science, E. Nagel, P. Suppes & A. Tarski, eds, Stanford University Press, Stanford, pp. 319–329.
[24] Hanany, E. & Klibanoff, P. (2007). Updating preferences with multiple priors, Theoretical Economics 2(3), 261–298.
[25] Hansen, L.P. & Sargent, T.J. (2007). Robustness, Princeton University Press, Princeton, NJ.
[26] Hansen, L.P., Sargent, T.J. & Tallarini, T.D. (1999). Robust permanent income and pricing, Review of Economic Studies 66, 873–907.
[27] Hurwicz, L. (1951). Optimality Criteria for Decision Making under Ignorance. Statistics 370, Cowles Commission Discussion Paper.
[28] Izhakian, Y. & Benninga, S. (2008). The Uncertainty Premium in an Ambiguous Economy. Technical report, Recanati School of Business, Tel-Aviv University.
[29] Keynes, J.M. (1921). A treatise on probability, The Collected Writings of John Maynard Keynes, Macmillan, London and Basingstoke, paperback 1988 edition, Vol. VIII.
[30] Klibanoff, P. (2001). Characterizing uncertainty aversion through preference for mixtures, Social Choice and Welfare 18, 289–301.
[31] Klibanoff, P., Marinacci, M. & Mukerji, S. (2005). A smooth model of decision making under ambiguity, Econometrica 73(6), 1849–1892.
[32] Knight, F.H. (1921). Risk, Uncertainty and Profit, Houghton Mifflin, Boston.
[33] Maccheroni, F., Marinacci, M. & Rustichini, A. (2006). Ambiguity aversion, robustness, and the variational representation of preferences, Econometrica 74(6), 1447–1498.
[34] Maccheroni, F., Marinacci, M. & Rustichini, A. (2006). Dynamic variational preferences, Journal of Economic Theory 128(1), 4–44.
[35] Mukerji, S. & Tallon, J.-M. (2001). Ambiguity aversion and incompleteness of financial markets, Review of Economic Studies 68(4), 883–904.
[36] Mukerji, S. & Tallon, J.-M. (2004). An overview of economic applications of David Schmeidler’s models of decision making under uncertainty, in Uncertainty in Economic Theory: A Collection of Essays in Honor of David Schmeidler’s 65th Birthday, I. Gilboa, ed., Routledge, Chapter 13, pp. 283–302.
[37] Nau, R.F. (2006). Uncertainty aversion with second-order utilities and probabilities, Management Science 52(1), 136.
[38] Savage, L.J. (1954). The Foundations of Statistics, Wiley, New York.
[39] Schmeidler, D. (1989). Subjective probability and expected utility without additivity, Econometrica 57, 571–587.
[40] Segal, U. (1987). The Ellsberg paradox and risk aversion: an anticipated utility approach, International Economic Review 28, 175–202.
6 Ambiguity
Utility-based Asset Pricing. Assume that the investor derives some utility $u$ from consumption $C$ now and in the next period. This setup can be easily generalized to many periods. Let us find the price $P_t$ at time $t$ of a payoff $X_{t+1}$ at time $t+1$. Let $Q$ be the original consumption level in the absence of any asset purchase and let $\xi$ be the amount of the asset the investor chooses to buy. The constant subjective discount factor is $\beta$. The maximization problem of this investor is

$$\max_{\xi}\; u(C_t) + E_t[\beta u(C_{t+1})] \quad \text{subject to:} \quad C_t = Q_t - P_t\xi, \quad C_{t+1} = Q_{t+1} + X_{t+1}\xi \tag{1}$$

Substituting the constraints into the objective and setting the derivative with respect to $\xi$ to zero yields

$$P_t u'(C_t) = E_t[\beta u'(C_{t+1}) X_{t+1}] \tag{2}$$

where $P_t u'(C_t)$ is the loss in utility if the investor buys another unit of the asset, and $E_t[\beta u'(C_{t+1}) X_{t+1}]$ is the expected and discounted increase in utility he/she obtains from the extra payoff $X_{t+1}$. The [...]

We define $M$ as the ratio of the contingent claim’s price to the corresponding state’s probability, $M(s) \equiv P_c(s)/\pi(s)$, to obtain the Euler equation in complete markets:

$$P(X) = \sum_{s=1}^{S} \pi(s) M(s) X(s) = E(MX) \tag{5}$$

Law of One Price and the Absence of Arbitrage. Finally, assume now that markets are incomplete and that we simply observe a set of prices $P$ and payoffs $X$. Under a minimal set of assumptions, some discount factor exists that represents the observed prices by the same equation $P = E(MX)$. These assumptions are defined below:

Definition 1 Free portfolio formation: $X_1, X_2 \in \underline{X} \Rightarrow aX_1 + bX_2 \in \underline{X}$ for any real $a$ and $b$.

Definition 2 Law of one price: $P(aX_1 + bX_2) = aP(X_1) + bP(X_2)$.

Note that free portfolio formation rules out short sales constraints, bid/ask spreads, leverage limitations, and so on. The law of one price says
2 Risk Premia
that investors cannot make instantaneous profits by repackaging portfolios. These assumptions lead to the following theorem:

Theorem 1 Given free portfolio formation and the law of one price, there exists a unique payoff $X^* \in \underline{X}$ such that $P(X) = E(X^* X)$ for all $X \in \underline{X}$.

As a result, there exists an SDF $M$ such that $P(X) = E(MX)$. Note that the existence of a discount factor implies the law of one price: $E[M(X + Y)] = E[MX] + E[MY]$. The theorem reverses this logic. Cochrane [7] offers a geometric and an arithmetic proof. With a stronger assumption, the absence of arbitrage, the SDF is strictly positive and thus represents some (potentially unknown) preferences. Let us first review the definition of the absence of arbitrage and then turn to this new theorem.

Definition 3 Absence of arbitrage: A payoff space $\underline{X}$ and pricing function $P(X)$ leave no arbitrage opportunities if every payoff $X$ that is always nonnegative ($X \geq 0$ almost surely) and strictly positive ($X > 0$) with some positive probability has some strictly positive price $P(X) > 0$.

In other words, no arbitrage says that one cannot get for free a portfolio that might pay off positively but will certainly never cost one anything. This assumption leads to the next theorem:

Theorem 2 No arbitrage and the law of one price imply the existence of a strictly positive discount factor $M > 0$ such that $P = E(MX)$, $\forall X \in \underline{X}$.

We have seen three ways to derive the Euler equation that links any asset’s price to the SDF. Before we exploit the Euler equation to define risk premia, note that only aggregate risk matters for asset prices.

Aggregate and Idiosyncratic Risk. Only the component of payoffs that is correlated with the SDF shows up in the asset’s price. Idiosyncratic risk, uncorrelated with the SDF, generates no premium. To see this, let us project $X$ on $M$ and decompose the payoff as follows:

$$X = \mathrm{proj}(X|M) + \varepsilon \tag{6}$$

Projecting $X$ on $M$ is like regressing $X$ on $M$ without a constant:

$$\mathrm{proj}(X|M) = \frac{E(MX)}{E(M^2)} M \tag{7}$$

The residuals $\varepsilon$ are orthogonal to the right-hand side variable $M$: $E(M\varepsilon) = 0$, which means that the price of $\varepsilon$ is zero. The price of the projection of $X$ on $M$ is the price of $X$:

$$P(\mathrm{proj}(X|M)) = E\left(M \frac{E(MX)}{E(M^2)} M\right) = E(MX) \tag{8}$$

Payoffs and Returns. We have reviewed three frameworks that lead to the Euler equation. This equation defines the asset price $P$ for any asset. For stocks, the payoff $X_{t+1}$ is the price next period $P_{t+1}$ and the dividend $D_{t+1}$. For a one-period bond, the payoff is 1: one buys a bond at price $P_t$ and receives 1 dollar next period. Alternatively, we can write the Euler equation in terms of returns. For stocks, returns are payoffs divided by prices: $R_{t+1} = X_{t+1}/P_t$. For bonds, one pays 1 dollar today and receives $R_{t+1}$ dollars tomorrow. In any case, the Euler equation in terms of returns is thus

$$E_t[M_{t+1} R_{t+1}] = 1 \tag{9}$$

The Euler equation naturally applies to a risk-free asset. If one pays 1 dollar today and receives $R^f_t$ dollars tomorrow for sure, the risk-free rate $R^f_t$ satisfies

$$R^f_t = 1/E_t[M_{t+1}] \tag{10}$$

Expected Excess Returns

Definition of Risk Premia. Applying the definition of the covariance to the Euler equation (9) for the asset return $R^i$ leads to $E_t(M_{t+1}) E_t(R^i_{t+1}) + \mathrm{cov}_t[M_{t+1}, R^i_{t+1}] = 1$. Using the definition of the risk-free rate in equation (10), we obtain

$$E_t(R^i_{t+1}) - R^f_t = -R^f_t\, \mathrm{cov}_t[M_{t+1}, R^i_{t+1}] \tag{11}$$

The left-hand side of equation (11) defines the expected excess return. The right-hand side of equation (11) defines the risk premium. When the asset return $R^i$ is negatively correlated with the SDF, the investor expects a positive excess return on asset
i. All assets have an expected return equal to the risk-free rate, plus a risk adjustment that is positive or negative.

To gain some intuition on the definition above, let us consider the case of preference-based SDFs. Assume that utility increases, and marginal utility decreases with consumption; this is the consumption-capital asset pricing model (consumption-CAPM). Here, the SDF, also known as the intertemporal marginal rate of substitution, is the ratio of marginal utility of consumption tomorrow divided by the marginal utility of consumption today. Substituting the SDF into equation (11), we obtain

$$E_t(R^i_{t+1}) - R^f_t = -R^f_t\, \frac{\mathrm{Cov}_t[\beta u'(C_{t+1}), R^i_{t+1}]}{u'(C_t)} \tag{12}$$

Marginal utility $u'(C)$ declines as consumption $C$ rises. Thus, an asset’s expected excess return is positive if its return covaries positively with consumption. The reason can be explained as follows. Our assumption on the investors’ utility function implies that investors dislike uncertainty about consumption. An asset whose return covaries positively with consumption pays off well when the investor is already feeling wealthy and it pays off badly when he/she is already feeling poor. Thus, such an asset will make the investor’s consumption stream more volatile. As a result, assets whose returns covary positively with consumption make consumption more volatile, and so must promise higher expected returns to induce investors to hold them.

Beta-representation and Market Price of Risk. We can rewrite the right-hand side of equation (11) as

$$\underbrace{\frac{\mathrm{Cov}_t[M_{t+1}, R^i_{t+1}]}{\mathrm{Var}_t[M_{t+1}]}}_{\beta_{i,M}} \; \underbrace{\left(-\frac{\mathrm{Var}_t[M_{t+1}]}{E_t[M_{t+1}]}\right)}_{\lambda_M} \tag{13}$$

$E(R^i_{t+1}) - R^f_t = \beta_{i,M} \lambda_M$ is then a beta-representation of the Euler equation. Note that $\lambda_M$ is independent of the asset $i$. It is called the market price of risk. $\beta_{i,M}$ is the quantity of risk. The expected excess return on asset $i$ is equal to the quantity of risk of this asset times the price of risk.

Euler Equation with Log Returns and Log SDF. To interpret risk premia, it is often easier to rewrite the previous results in terms of the log SDF $m_{t+1}$ and log return $r^i_{t+1}$. Assuming that SDF and returns are lognormal, equation (9) leads to

$$E_t(m_{t+1}) + E_t(r^i_{t+1}) + \frac{1}{2}\mathrm{Var}_t(m_{t+1}) + \frac{1}{2}\mathrm{Var}_t(r^i_{t+1}) + \mathrm{Cov}_t(m_{t+1}, r^i_{t+1}) = 0 \tag{14}$$

where lowercase letters denote logs. The same equation holds for the risk-free rate $r^f_t$. Let $\tilde{r}^{e,i}_{t+1}$ be the excess return corrected for the Jensen term: $\tilde{r}^{e,i}_{t+1} = r^i_{t+1} - r^f_t + \frac{1}{2}\mathrm{Var}_t(r^i_{t+1})$. Then, the expected log excess return is equal to

$$E_t(\tilde{r}^{e,i}_{t+1}) = -\mathrm{Cov}_t(m_{t+1}, \tilde{r}^{e,i}_{t+1}) \tag{15}$$

For the consumption-CAPM, the utility each period is $u(C) = C^{1-\gamma}/(1-\gamma)$. The log SDF depends only on consumption growth and is equal to $m_{t+1} = \log\beta - \gamma g - \gamma(\Delta c_{t+1} - g)$, where $g$ is the average consumption growth. In this case, the expected excess return is equal to

$$E_t(\tilde{r}^{e,i}_{t+1}) = \gamma\, \mathrm{Cov}_t(\Delta c_{t+1} - g, \tilde{r}^{e,i}_{t+1}) \tag{16}$$

Again, assets whose returns covary positively with consumption must promise positive expected returns to induce investors to hold them.

Empirical Evidence

Now the empirical stylized facts on risk premia are discussed. A large literature shows that, in many asset markets, expected excess returns are sizable and time-varying. The equity, bond, and currency markets are considered (see Predictability of Asset Prices).

Stock Markets

Evidence of large risk premia abounds on equity markets. The size of the average excess return on the stock market is actually puzzling from a consumption-based asset pricing perspective; it constitutes the equity premium puzzle. Moreover, expected equity returns appear time-varying.

Equity Premium Puzzle. To understand the equity premium puzzle, let us first define the Sharpe ratio.
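Equations (11) and (13) can be checked numerically in a small discrete-state economy. The Python sketch below is illustrative only: the two equally likely states, the consumption growth rates, the payoffs, and the parameter values (`discount`, `gamma`) are assumptions made for the example, and the SDF is assumed to be the power-utility one, M = β(C_{t+1}/C_t)^{−γ}.

```python
def expect(probs, xs):
    """Expectation of a discrete random variable."""
    return sum(p * x for p, x in zip(probs, xs))


def cov(probs, xs, ys):
    """Covariance of two discrete random variables defined on the same states."""
    exy = expect(probs, [x * y for x, y in zip(xs, ys)])
    return exy - expect(probs, xs) * expect(probs, ys)


def risk_premium(probs, growth, payoff, discount=0.98, gamma=2.0):
    """Return (expected excess return, -Rf * cov(M, R), beta * lambda)."""
    # Power-utility SDF (an assumption): M = discount * growth ** (-gamma)
    m = [discount * g ** (-gamma) for g in growth]
    price = expect(probs, [mi * x for mi, x in zip(m, payoff)])  # P = E(MX)
    ret = [x / price for x in payoff]       # gross return R = X / P
    rf = 1.0 / expect(probs, m)             # equation (10)
    excess = expect(probs, ret) - rf        # left-hand side of equation (11)
    premium = -rf * cov(probs, m, ret)      # right-hand side of equation (11)
    var_m = cov(probs, m, m)
    beta_im = cov(probs, m, ret) / var_m    # quantity of risk
    lambda_m = -var_m / expect(probs, m)    # market price of risk
    return excess, premium, beta_im * lambda_m


probs = [0.5, 0.5]
growth = [0.98, 1.04]            # hypothetical gross consumption growth
procyclical = [0.9, 1.1]         # pays more in the high-growth state
countercyclical = [1.1, 0.9]     # pays more in the low-growth state

print(risk_premium(probs, growth, procyclical))
print(risk_premium(probs, growth, countercyclical))
```

In each printed triple the three numbers coincide up to rounding. Under these assumed inputs, the payoff that is high when consumption growth is high earns a positive premium, and the payoff that is high when consumption growth is low earns a negative one.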
         Left panel: excess returns      Right panel: dividend growth
h        α        s.e.     R²            α        s.e.     R²
1        3.77     1.38     0.07          −0.11    1.00     0.00
2        7.46     2.36     0.12          −0.76    0.86     0.01
3        12.07    3.70     0.18          0.12     0.98     0.00
4        17.62    5.27     0.24          0.41     1.26     0.00
5        22.01    5.66     0.29          0.03     0.89     0.00

This table reports slope coefficients α, standard errors s.e. and R² from in-sample predictability tests. In the left panel, the univariate regressions are $R^e_{t,t+h} = C + \alpha D_t/P_t + \varepsilon_{t+h}$, where $R^e_{t,t+h}$ denotes the h-year ahead stock market excess return and $D_t/P_t$ the dividend-price ratio. In the right panel, the regressions are $D_{t+h}/D_t = C + \alpha D_t/P_t + \varepsilon_{t+h}$, where $D_{t+h}/D_t$ denotes the h-year ahead dividend growth rate. The sample relates to the period 1927–2006. Data are annual.
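The left-panel regression described in the table note can be sketched in a few lines of Python. This is a minimal illustration on synthetic data, not a replication of the table: the data-generating process in `simulate`, its parameter values, and the approximation of the h-year return by a sum of annual excess returns are all assumptions made for the example.

```python
import random


def ols(x, y):
    """Univariate OLS of y on a constant and x: (intercept, slope, R^2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return intercept, slope, 1.0 - ss_res / ss_tot


def horizon_regression(excess, dp, h):
    """Regress the h-year-ahead excess return on the current D/P ratio.

    excess[t] is the excess return realized in year t and dp[t] is the
    dividend-price ratio observed at the end of year t. The h-year return
    is approximated by a sum of annual returns (an assumption).
    """
    n = len(excess) - h
    y = [sum(excess[t + 1:t + 1 + h]) for t in range(n)]
    return ols(dp[:n], y)


def simulate(n, alpha, noise_sd, seed=0):
    """Synthetic data: a persistent D/P ratio and returns loading on lagged D/P."""
    rng = random.Random(seed)
    dp, excess = [0.04], [0.0]
    for _ in range(n - 1):
        excess.append(-0.05 + alpha * dp[-1] + rng.gauss(0.0, noise_sd))
        dp.append(0.004 + 0.9 * dp[-1] + rng.gauss(0.0, 0.005))
    return excess, dp


excess, dp = simulate(n=80, alpha=3.0, noise_sd=0.15)
for h in range(1, 6):
    _, slope, r2 = horizon_regression(excess, dp, h)
    print(h, round(slope, 2), round(r2, 2))
```

The sketch reports only the slope and R². Multi-year returns built from annual data overlap, which induces serial correlation in the residuals, so standard errors for such regressions need a correction that is not computed here.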
and domestic risk-free bonds. In this case, expected currency excess returns should be zero. However, the UIP condition is clearly rejected in the data. In a simple regression of exchange rate changes on interest rate differentials, UIP predicts a slope coefficient of 1. Instead, empirical work following Hansen and Hodrick [13] and Fama [11] consistently reveals a regression coefficient that is smaller than 1 and very often negative. The international economics literature refers to these negative UIP slope coefficients as the UIP puzzle or forward premium anomaly. Negative slope coefficients mean that currencies with higher than average interest rates actually tend to appreciate. Investors in foreign one-period discount bonds thus earn the interest rate spread, which is known at the time of their investment, plus the bonus from the appreciation of the currency during the holding period. As a result, the failure of the UIP condition implies positive predictable excess returns when investing in high interest rate currencies and negative excess returns for investing in low interest rate currencies. Lustig and Verdelhan [15] build portfolios of currency excess returns by sorting currencies on their interest rate differentials with the United States. They obtain a large cross section of currency excess returns and show that these excess returns compensate the US investor for bearing US aggregate macroeconomic risk because high interest rate currencies tend to depreciate in bad times. As a result, currency excess returns are also evidence of risk premia.

To summarize this section, equity, bond, and currency markets offer predictable excess returns, and are thus characterized by risk premia. Now the potential theoretical explanations of these risk premia are discussed.

Theoretical Interpretations

As observed above, the consumption-CAPM (also known as power utility) can replicate average equity excess returns only with implausibly high risk-aversion coefficients. Moreover, if consumption growth shocks are close to independent and identically distributed (i.i.d.), as they are in the data, this model does not explain time variations in expected excess returns. A large literature seeks to address these shortcomings and offers different interpretations of the observed risk premia. Now the three most successful classes of models in this literature, namely, habit preferences, long-run risk, and disaster risk, are reviewed.

Habit Preferences

Habit preferences assume that the agent does not care about the absolute level of his/her consumption, but cares about its relative level compared to a habit level that can be interpreted as a subsistence level, past consumption, or the neighbors’ consumption. Hence, preferences over habits $H$ are defined using ratios or differences ($C/H$ or $C - H$), where $H$ depends on past consumption: $H_t = f(C_{t-1}, C_{t-2}, \ldots)$. Major examples of habit preferences are found in Abel [1], Campbell and Cochrane [4], Constantinides [8] and Sundaresan [18]. Preferences defined using differences between consumption and habit (e.g., $u(C) = (C - H)^{-\gamma}$) imply a time-varying risk-aversion coefficient if the percentage gap between consumption and habit changes through time:

$$\gamma_t = -\frac{C U_{CC}}{U_C} = \frac{\gamma H_t}{C_t - H_t} \tag{30}$$

Campbell and Cochrane [4] propose a model along these lines. In their model, the habit level is slow moving; in bad times, consumption falls close to the habit level, and the investor is very risk averse. This model offers a new interpretation of risk premia: investors fear bad returns and wealth loss because they tend to happen in recessions, when consumption falls relative to its recent past. These preferences generate many interesting asset pricing features: pro-cyclical variations of stock prices, long-horizon predictability, countercyclical variation of stock market volatility, countercyclicality of the Sharpe ratio, and the short- and long-run equity premium.

Long-run Risk

The long-run risk literature works off the class of preferences due to Epstein and Zin [9, 10] and Kreps and Porteus [14]. These preferences impute a concern for the timing of the resolution of uncertainty to agents, and the risk-aversion coefficient is no longer the inverse of the intertemporal elasticity of substitution as it is with the consumption-CAPM (see Recursive Preferences). Building on these preferences,
[14] Kreps, D. & Porteus, E.L. (1978). Temporal resolution of uncertainty and dynamic choice theory, Econometrica 46, 185–200.
[15] Lustig, H. & Verdelhan, A. (2007). The cross-section of foreign currency risk premia and consumption growth risk, American Economic Review 97(1), 89–117.
[16] Mehra, R. & Prescott, E. (1985). The equity premium: a puzzle, Journal of Monetary Economics 15(2), 145–161.
[17] Rietz, T.A. (1988). The equity risk premium: a solution, Journal of Monetary Economics 22, 117–131.
[18] Sundaresan, S. (1989). Intertemporal dependent preferences and the volatility of consumption and wealth, The Review of Financial Studies 2(1), 73–88.
[19] Weil, P. (1989). The equity premium puzzle and the risk-free rate puzzle, Journal of Monetary Economics 24, 401–424.

Related Articles

Arbitrage Pricing Theory; Capital Asset Pricing Model; Stochastic Discount Factors; Utility Function.

ADRIEN VERDELHAN
Predictability of Asset Prices

Predictability can be interpreted in many ways in finance. The fundamental issue in asset pricing is to determine the relationship between risk and reward. To quantify such a relationship, an economic model is built to “predict” how the expected asset returns should vary with their risk measures. In this case, predictability means contemporaneous association between the expected return of an asset and the expected returns of different risk factors. For example, the capital asset pricing model (CAPM) predicts that a security’s expected risk premium is proportional to the expected return from the market factor, where the proportionality reflects the systematic risk measure. This type of predictability is not the focus of this article. Instead, the focus is on whether future security returns can be predicted from current known information.

One important assumption used to build a rational asset pricing model is market efficiency (see Efficient Market Hypothesis), in which security prices reflect all available information quickly and fairly. This was interpreted literally in the 1950s and 1960s as saying that any lagged variables possess no power in predicting current or future security prices or returns. Modern finance theory, however, has a different interpretation for the evidence of return predictability. In fact, researchers have recognized since the 1980s that the expected returns can vary over time due to changes in investors’ risk tolerance and/or investment opportunities [30] over business cycles. If business cycles are predictable to some degree, returns can also be predictable, which poses no challenge to the efficient market hypothesis (EMH). Under this view, one should not rely solely on the historical average returns to estimate expected returns in assisting our investment decisions. In other words, the task of estimating the expected returns precisely largely depends on our ability to predict future stock returns.

Given the fact that the serial correlations for aggregate stock returns are weak, especially in the recent decade, the quest for additional predictors goes on. Many financial variables have been shown to possess predictive power for stock returns. A partial list of these variables can be characterized as variables that are related to interest rates: relative interest rate [7], term spread and the default spread [7, 16, 23], inflation rate [14, 18]; variables that are related to “one over the price”: dividend yield [10], payout yield [4], earning–price ratio and dividend–earnings (payout) ratio [26], book-to-market ratio [25, 32]; and other variables including aggregate net issuing activity [2] and consumption–wealth–income ratio [27].

Although the focus is on the rational explanation for predictability, the evidence has also been interpreted differently under different views. Their differences are illustrated by the following story. Once there were four students walking on a street with their professor. A dollar bill lying on the sidewalk quickly caught the professor’s eye. The professor asked the four students why nobody was picking up the dollar bill. The first student answered that, although the dollar bill was real, people just pretended not to see it. The second student argued that the dollar bill was just an illusion (or a statistical illusion). The third student said that, even though the dollar bill was real, no one would bother to pick it up because it was too costly to pick it up (or transactions costs). The last student’s answer was that the dollar bill was real. Someone left it there for a needy person. Generally speaking, the first student is a behaviorist; the second and third students hold the traditional efficient market view; and the last student holds the modern view on the EMH. No matter which student’s answer represents your view, predictability cannot be too large. There is an old saying: if you can predict the market, why aren’t you rich!

The existence of predictability is crucial in testing the conditional asset pricing models [19], in return decomposition [8], in asset allocation [22], and so on. Because of the theoretical foundation for predictability, this article focuses primarily on aggregate market returns. Predictability is also related to anomalies. An anomaly is defined as the deviation from an asset pricing model. In most empirical studies, anomalies are tied to a specific part of the market, such as small firms, firms with low book-to-market ratios, and so on, or particular sample periods, such as January, weekends, and so on. A detailed review on anomalies can be found in [35].

This article intends to offer a perspective on both the evidence and the reasons for return predictability. A detailed discussion about the economic reasons for predictability is given in the section Economic Interpretation of Predictability. Recent empirical studies
have uncovered many useful predictors, which are summarized in the section Understanding Some Useful Predictors. Predictability is not without controversy. Many of the statistical issues in testing the predictability are discussed in the section Statistical Issues, followed by a conclusion in the last section.

Evidence on Predictability

The simplest form of predictability is the return autocorrelation. To gain a perspective on the magnitude of the serial correlation, returns of different frequencies and over different sample periods are examined. Owing to the availability of daily returns, the whole sample period is from 1962 to 2006. The summary statistics are listed in Table 1 for both value-weighted and equal-weighted NYSE/AMEX/NASDAQ composite index returns. For the whole sample period, the average value-weighted index daily return is 0.044% with a volatility of 0.859%. Such a large difference between average return and volatility implies a very low Sharpe ratio of 5%. If returns are autocorrelated, the “true” Sharpe ratio should be larger.ᵃ For the value-weighted index returns, the autocorrelation is about 13%. Such a large autocorrelation further increases to 31% when an equal-weighted index is used. If we fit an AR(1) model to the equal-weighted index returns, we see an R² of 9.61% (the square of the 31% autocorrelation)! The autocorrelation difference in the two types of index returns suggests that small stocks are more predictable than large stocks. To see whether such a predictability is stable over time, the whole sample period is split into two. From Table 1, it is clear that most of the predictability from past returns concentrates in the early sample period from 1962 to 1984, with autocorrelations as high as 22.4 and 38.5% for value-weighted and equal-weighted indices, respectively.

Predictability in daily returns might be subject to market microstructure effects discussed in the section The Economic Interpretation of Predictability. One way to alleviate such effects is to examine the behavior of monthly returns. For both value- and equal-weighted index returns, the autocorrelations have been substantially attenuated. For example, over the whole sample period, autocorrelation for value-weighted index returns is only 4.3%, almost negligible. For the equal-weighted index, however, the autocorrelation is still as large as 17.6% for the whole sample period and is stable over the two subsample periods. Therefore, it can be concluded that return serial correlations are more likely to occur in small stocks. Given there are still substantial serial correlations in low-frequency small stock return data, market microstructure effects cannot be the only factor.

If future returns can only be weakly predicted by past returns, are there other variables that help to predict returns? In Table 2, we further study return predictability using three other variables: the dividend yield, the repurchasing yield, and the relative interest rate. Our sample starts in 1952 after a major shift in the interest rate regime by the Federal Reserve. To be representative, we focus on the value-weighted index returns. During the first 27 years from 1952 to 1978, both the dividend yield and the relative interest rate
Dependent                                                      Adjusted
variable       r_t       (D/P)_t    (F/P)_t    rrel_t          R²
Panel A: sample period 1952–1978
r_{t+1}        0.061     10.90      0.675      −11.67          0.062
(D/P)_{t+1}    −0.000    0.966      0.003      0.042           0.956
(F/P)_{t+1}    −0.001    0.038      0.943      0.034           0.898
rrel_{t+1}     0.000     0.032      0.005      0.731           0.529
Panel B: sample period 1979–2005
r_{t+1}        0.030     0.461      3.508      −0.801          0.009
(D/P)_{t+1}    −0.000    0.994      −0.009     0.005           0.985
(F/P)_{t+1}    −0.001    0.029      0.971      0.071           0.960
rrel_{t+1}     −0.000    0.009      −0.010     0.751           0.560

This table reports the VAR results for the four variables including the value-weighted NYSE/AMEX/NASDAQ composite index return, dividend yield, repurchasing yield, and the relative interest rate over different sample periods. The bold face number indicates that the estimate is statistically significant at a 5% level.
have helped to predict returns, with an adjusted R² of 6.2%. In contrast, the repurchasing yield becomes more important over the next 27 years from 1979 to 2005, with an adjusted R² of 0.9%. The evidence suggests that returns are predictable even if not by their past returns. Despite large persistence of all three predictors as shown in Table 2, statistical adjustment for estimates will not likely take away the predictive power of the three variables (see the section Statistical Issues).

Predictability and Market Efficiency

Historically, predictability has been associated with market inefficiency. According to the fundamental law of valuation, a security price should reflect its expected fundamental value for risk-neutral investors with zero interest rate:

$$P_t = E[V^* | I_t], \quad P_{t+1} = E[V^* | I_{t+1}] \tag{1}$$

where $V^*$ is the fundamental value and $I_t$ is the information set at time $t$. Since the information set $I_t$ is included in the information set $I_{t+1}$, the following result is obtained by the law of iterated expectations:

[...] future prices is the current price. In other words, we have

$$\mathrm{Cov}[(P_{t+j} - P_{t+i}), (P_{t+l} - P_{t+k}) | I_t] = 0 \tag{3}$$

where $i < j < k < l$. In other words, the nonoverlapping price changes are uncorrelated at all leads and lags. If we interpret the price difference as a return, it means that returns should be unpredictable.

This analysis defines the notion of EMH. Financial markets are said to be efficient if security prices rapidly reflect all relevant information about asset values, and all securities are fairly priced in light of the available information. In other words, the EMH describes how security prices should react to available information and how prices should evolve over time. Under this framework, return predictability serves as evidence against the EMH.

Does the EMH indeed exclude predictability? To answer this question, we focus on a stronger version of the Martingale process, which is the random walk process, and assume that investors are risk averse. The random walk process was first used by Bachelier (1900) to model stock prices in his dissertation, and was rekindled by Merton in the late 1960s. For convenience, we use log price $p_t$
Strictly speaking, the EMH only puts a restriction on the residual $\varepsilon_{t+1}$ to satisfy the condition of $E[\varepsilon_{t+1} | I_t] = 0$ at any time $t$ in either equation (4) or (5). Since $\mu$ is determined by an asset pricing model, such as the CAPM, the traditional view on the EMH implicitly assumes that $\mu$ is constant. Modern finance theory, however, has offered a different view on $\mu$. For example, Fama and French [17] have suggested that the risk premium might be higher in an economic downturn than at the peak of a business cycle. This evidence suggests that the expected return might be time varying. In fact, many asset pricing models since Merton have emphasized the idea of changing investment opportunities, which requires additional risk compensation over time. Alternatively, investors’ risk tolerance might change over time, which will cause the investors to demand different levels of risk premium. No matter which scenario is more likely, one should allow $\mu$ to be time varying:

$$r_{t+1} = \mu_{t+1} + \varepsilon_{t+1} \tag{6}$$

Although under the EMH, we still have the condition of $E[\varepsilon_{t+1} | I_t] = 0$, $E[\mu_{t+1} | I_t]$ is not necessarily constant. For example, if risk premia change with the business cycle and the business cycle is predictable, returns should also be predictable. This analysis opens a channel for the predictability to coexist with the EMH.

Returns from a buy-and-hold strategy on the market portfolio correspond to returns for a representative investor. Predictability means that someone can implement a trading strategy that requires a full investment in some periods and a zero or a short position in other periods in order to earn higher returns than those from a buy-and-hold strategy. Clearly, this investment strategy cannot be implemented by the representative investor since he/she has to fully invest in the equity market. Although such a strategy will pay off in the long run, it is not without risk in the short term. The success of this strategy depends on the degree of predictability. Therefore, predictability cannot be too large in order to prevent too many investors defecting from being representative investors.

The Economic Interpretation of Predictability

Without assuming irrationality and market inefficiency, how can we interpret predictability under the traditional framework? Most explanations focus on market microstructure effects and transactions costs. This section reviews the bid–ask bounce, nonsynchronous trading, and transactions costs in explaining the return autocorrelation.

Bid–ask Bounce

Returns tend to be negatively autocorrelated in the short run. One possible explanation is offered by Roll [34] from the perspective of bid and ask price differences. In the absence of information, sell orders and buy orders arrive with the same probabilities. In other words, a buy order is likely to follow a sell order, which results in a negative autocorrelation. In particular, let $P^*_t$ be the fundamental value:

$$P_t = P^*_t + I_t (s/2) \tag{7}$$

$$I_t = \begin{cases} +1 & \text{if buy order with prob} = 0.5 \\ -1 & \text{if sell order with prob} = 0.5 \end{cases} \tag{8}$$

where $s$ is the bid–ask spread. This implies a price change of $\Delta P_t = \Delta P^*_t + (I_t - I_{t-1}) s/2$. In other words, autocorrelation is related to the spread $s$ in the following way:

$$\mathrm{Cov}(\Delta P_{t-1}, \Delta P_t) = -s^2/4 \tag{9}$$

Since the bid–ask spreads tend to be larger for small company stocks than for large stocks, autocorrelation will be stronger for small firms than for large stocks, other things being equal. Equation (9) can also be used to back out the implied bid–ask spread.

If the autocorrelation is due to differences in the bid and ask prices, the effect should be smaller if the average bid and ask prices are used to compute returns instead of the actual closing prices. Similarly, low-frequency returns, such as monthly returns, should have weaker autocorrelation than high-frequency returns, such as daily returns, which is true in general. We should also see a drop in autocorrelations over time when the average bid–ask spread shrinks, especially after decimalization. This is confirmed in Table 1. In general, investors cannot design a trading strategy to obtain excess returns in this case, since the bid and ask effect is due to the market friction.
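The bid–ask bounce of equations (7)–(9) is easy to reproduce by simulation. The following Python sketch is a minimal illustration under simplifying assumptions that are not in the text: the fundamental value is held constant, the spread is fixed, and the function names and parameter values are hypothetical.

```python
import math
import random


def simulate_roll_prices(n, spread, fundamental=100.0, seed=0):
    """Transaction prices P_t = P* + I_t * s/2 with a constant P* (assumption)."""
    rng = random.Random(seed)
    # I_t is +1 (buy) or -1 (sell) with equal probability, as in equation (8)
    return [fundamental + rng.choice((1, -1)) * spread / 2.0 for _ in range(n)]


def first_order_autocovariance(prices):
    """Sample covariance between successive price changes."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    lagged, current = changes[:-1], changes[1:]
    mean_lag = sum(lagged) / len(lagged)
    mean_cur = sum(current) / len(current)
    return sum((a - mean_lag) * (b - mean_cur)
               for a, b in zip(lagged, current)) / len(lagged)


def implied_spread(autocov):
    """Invert equation (9): s = 2 * sqrt(-Cov); undefined if Cov >= 0."""
    if autocov >= 0:
        raise ValueError("the estimator needs a negative autocovariance")
    return 2.0 * math.sqrt(-autocov)


prices = simulate_roll_prices(n=100_000, spread=0.10)
autocov = first_order_autocovariance(prices)
print(autocov, -(0.10 ** 2) / 4)   # sample value next to the theoretical -s^2/4
print(implied_spread(autocov))     # estimate of the assumed spread of 0.10
```

With real transaction data the sample autocovariance can turn out positive, in which case the implied spread is undefined; the sketch raises an error rather than returning a number in that case.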
Table 4  Portfolio correlations, adapted from Boudoukh et al. [5]

Portfolio   Small_{t+1}   Medium_{t+1}   Large_{t+1}   Small_t   Large_t
Small_t     0.36          0.19           0.03
Medium_t    0.35          0.22           0.06          0.89
Large_t     0.28          0.21           0.07          0.72      0.91

This table reports cross- and auto-correlations among size portfolios.

As seen from equation (12), the cross-autocorrelation is essentially the own autocorrelation acting on the contemporaneous correlation. Using a different sample period, Boudoukh et al. [5] found results that are consistent with equation (12) (Table 4).

Applying equation (12), we can compute the predicted cross-autocorrelations as

\[ \mathrm{Corr}^{*}(r_{\mathrm{small},t+1}, r_{\mathrm{large},t}) = \mathrm{Corr}(r_{\mathrm{small},t+1}, r_{\mathrm{small},t})\,\mathrm{Corr}(r_{\mathrm{small},t}, r_{\mathrm{large},t}) = 0.36 \times 0.72 = 0.26 \tag{13} \]

\[ \mathrm{Corr}^{*}(r_{\mathrm{large},t+1}, r_{\mathrm{small},t}) = \mathrm{Corr}(r_{\mathrm{large},t+1}, r_{\mathrm{large},t})\,\mathrm{Corr}(r_{\mathrm{large},t}, r_{\mathrm{small},t}) = 0.07 \times 0.72 = 0.05 \tag{14} \]

These numbers are very close to the actual cross-autocorrelations shown in the table (0.28 and 0.03, respectively). Therefore, we do not need frequent nontrading to justify the observed cross-autocorrelation. However, we still need to understand the serial correlation.

Time-varying Expected Returns

The mechanism for the observed autocorrelation, discussed in the previous sections, largely relies on market frictions. As discussed in the section Predictability and Market Efficiency, an alternative rational explanation for predictability is the time-varying expected return. Given the unobservable nature of expected returns, Conrad and Kaul [13] proposed to characterize the movement in expected returns as following a simple AR(1) process:

\[ r_{t+1} = E_t(r_{t+1}) + \epsilon_{t+1} \tag{15} \]

\[ E_t(r_{t+1}) = \bar{r} + \phi E_{t-1}(r_t) + u_t \tag{16} \]

Note that the coefficients in equations (15) and (16) can be estimated using the Kalman filter procedure. Testing the hypothesis of time-varying expected return is equivalent to testing whether φ = 0 in the above models. Using the 10 size-sorted weekly (Wednesday to Tuesday) portfolio returns from 1962 to 1985, Conrad and Kaul [13] found that the autocorrelation coefficients are 41 and 9% for the small and the large decile portfolios, respectively, which are both statistically significant when compared to the confidence bound of 1/√T = 0.03. Although the persistence parameter estimates (φ̂) of 0.589 and 0.087 for small and large portfolios, respectively, are very different, they are both statistically significant at the 1% level.

It is important to understand why expected returns change over time. In the CAPM world, it is implicitly assumed that a firm will continue to produce the same widgets and face the same uncertainty when selling these widgets in the market. In other words, the risk structure in future cash flows (CFs) is fixed. Thus, the comovement with the overall market is fixed. At the same time, investors' attitude toward risk does not change, which implies constant expected returns. Such a model structure may reasonably describe the real world over a short period of time.

Over a longer horizon, however, investment opportunities can change due to either technological advances or changes in consumers' preference toward goods and services. For example, Apple used to be in the business of making personal computers and software 10 years ago. Today, a significant portion of Apple's business is in consumer electronics, including music players and cell phones. Under this view, both the risk environment of a firm and the risk tolerance of investors could change over time. Therefore, the observed predictability may simply provide compensation for investors' exposure to the risk of change in investment opportunities or reflect the differences in the required risk compensation due to change in the risk tolerance over different economic conditions. In this case, a representative investor will not try to utilize the predictability to alter his/her asset allocations. For example, if he/she knows that the next period stock return will likely be high, he/she should allocate more assets to stocks. However, if he/she understands that the high return is associated with high expected return due to his/her increased risk aversion next period, he/she would not increase his/her holding of the risky stocks.

Understanding Some Useful Predictors

that security prices reflect investors' expectations, and expectations are good predictors of future values. To further illustrate this rationale, we can use mathematical models to relate returns to prices or other variables.
(the nonoverlapping 5-year sample) due to the highly persistent regressor. As pointed out by Boudoukh et al. [6], if an innovation in an independent variable happens to coincide with the next period return, this relationship will be repeated many times in the long-horizon regression since the shock will not die out for many periods and the particular return will appear many times in the overlapping return series.

Under the null hypothesis of no autocorrelation in returns, that is, β(τ) = 0, ∀τ in equation (22), Kirby [24] has shown the following asymptotic result for the R² from a predictive regression:

\[ T \times R^{2} \xrightarrow{d} \chi^{2}(K) \tag{23} \]

where T is the number of observations and K is the number of independent variables in the regression. Let us use a numerical example to illustrate the point. For a univariate regression with K = 1 and T = 12, what would we expect to see?

• Since the mean of a χ²(1) random variable is 1, we have E(12R²) = 1, which implies that E(R²) = 8.3%.
• The 95% cutoff for R² is expected to be 32% since the critical value for a χ²(1) distribution under the same confidence level is 3.84. In other words, we can expect to see R²s as high as 32% even though there is no predictability.

Therefore, long-horizon regression results are avoided in this article.

The Payout Ratio or the Repurchasing Yield

Recently, Boudoukh et al. [4] have proposed to use the total payout ratio as a predictor for stock returns. The importance of the new predictor can be illustrated by its impressive R² of 26% using annual data over the sample period from 1926 to 2003. The use of payout ratio can be justified since investors' total wealth could also be affected by share repurchasing. In fact, representative investors should care about the total distribution, which includes both the direct dividend distributions and repurchases. If there are rational reasons to believe that dividend yield predicts stock returns, the payout yield should play a similar role. In fact, the implementation of the SEC rule 10b-18 in 1982 gives firms an incentive to rely more on repurchases due to the tax advantages for investors.

Payout Ratio. Repurchasing could be used either to reduce the effect of stock option exercise or to substitute for dividends. To construct a measure that reflects the latter, Boudoukh et al. [4] used change in treasury stock adjusted for potential asynchronicity between the repurchase and option exercise as a measure of repurchasing (TS). They also used total repurchasing from the CF statement. Results are summarized in Table 6. Clearly, the predictive power of the dividend–price ratio (D/P) has gone down when comparing R²s from the two sample periods. In particular, R² has decreased from 13 to 8% when including the recent sample period.
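The arithmetic in the numerical example for equation (23) can be checked directly. The following is a minimal sketch, assuming only Kirby's asymptotic result with K = 1. The function name `r2_under_null` is hypothetical, and the χ²(1) critical value is obtained as the square of a two-sided standard normal quantile, a shortcut that is valid only for one degree of freedom.

```python
from statistics import NormalDist


def r2_under_null(num_obs: int, level: float = 0.95) -> tuple[float, float]:
    """Mean and upper cutoff of R^2 implied by T * R^2 ~ chi2(1) (K = 1 only)."""
    # The mean of a chi-square(1) variable is 1, so E(R^2) = 1 / T.
    expected_r2 = 1.0 / num_obs
    # A chi-square(1) variable is a squared standard normal, so its `level`
    # quantile equals the squared two-sided normal quantile.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    critical_value = z * z  # about 3.84 at the 95% level
    return expected_r2, critical_value / num_obs


expected_r2, cutoff_r2 = r2_under_null(num_obs=12)
print(f"E(R^2) = {expected_r2:.1%}, 95% cutoff = {cutoff_r2:.1%}")
```

With T = 12 this gives roughly the 8.3% and 32% figures quoted in the example. Raising `num_obs` shows how quickly both values shrink as the number of nonoverlapping observations grows.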
Table 7 The adjusted R 2 s for predictive regression using the dividend yield
(D/P) and/or the repurchasing yield (F/P) over different sample periods
1952–2005 1952–1978 1979–2005
In contrast, the repurchasing yield (measured as the ratio between repurchasing and market capitalization) is impressive. No matter how it is measured, the explanatory power is much larger than the pure dividend–price ratio. Moreover, when using the net payout yield (measured as the ratio between repurchasing minus new issuing plus dividend and the market capitalization), R² is as high as 26%. Although the payout yield is important empirically, its significance is overstated in Boudoukh et al. [4]. The most significant contributor to the predictive power of the payout yield is the new issuing yield when examining their accounting-based measures separately. Furthermore, the predictive power of new issuing yield largely comes from the two outliers of 1929 and 1930. In other words, the new issuing yield offers no explanatory power once the sample period starts from 1931 instead of 1926.

An Alternative Approach to Construct the Repurchasing Yield. The conventional approach of computing returns ignores changes in the market capitalization associated with changes in the total number of shares outstanding. When the number of shares changes over time due to either repurchasing or seasoned offering, capital gains do not purely reflect growth potential. From an asset pricing perspective, it is more important to consider different components of returns from a representative investor's perspective. In other words, we can decompose returns from the standpoint of a representative investor instead of a buy-and-hold investor. In particular, we can rewrite the return identity as follows:

\[ R_{t+1} \equiv \frac{S_{t+1} D_{t+1}}{S_t P_t} + \frac{(S_t - S_{t+1})(P_{t+1} + D_{t+1})}{S_t P_t} + \frac{S_{t+1} P_{t+1}}{S_t P_t} \tag{24} \]

where S_t is the number of shares outstanding at time t. Equation (24) can be interpreted in the following way:

• first term: dividend yield (D/P) for a representative shareholder;
• second term: net repurchasing yield (F/P) at the price before the ex-dividend day; and
• third term: change in market capitalization, which reflects growth.

Using NYSE/AMEX/NASDAQ index returns, we can construct both the smoothed dividend yield (D/P) and the repurchasing yield (F/P) over the past 12 months. Table 7 reports the adjusted R²s for various predictive regressions.

For the whole sample period from 1952 to 2005, quarterly returns are more predictable than monthly returns. Overall, the repurchasing yield has higher predictive power than the dividend yield. When we split the sample into two, it becomes clear that the two predictors have played very different roles. Almost all the predictive power in the first half of the sample comes from the dividend yield, whereas the majority of the predictive power in the second half of the sample is due to the repurchasing yield. This evidence is consistent with the observation of a decreasing trend in the dividend yield and the increasing role played by repurchases.

Statistical Issues

The use of many predictors can be controversial. In many cases, the issue lies in the statistical inference due to the persistence in predictors. These issues include spurious regression, biased estimates due to correlations between innovations to predictors and stock returns, and error in variables when using imperfect predictors.
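The return decomposition in equation (24) can be verified numerically. The sketch below uses purely hypothetical share counts, prices, and dividends (none of these figures come from the article), and the function name `decompose_gross_return` is likewise an illustrative assumption. It checks that the three terms sum to the usual gross return (P_{t+1} + D_{t+1})/P_t.

```python
def decompose_gross_return(s_t, s_next, p_t, p_next, d_next):
    """Split the gross return into the three terms of equation (24)."""
    cap_t = s_t * p_t  # market capitalization at time t
    dividend_yield = s_next * d_next / cap_t
    # Positive when shares are repurchased (s_next < s_t), negative for net issuance.
    net_repurchase_yield = (s_t - s_next) * (p_next + d_next) / cap_t
    cap_growth = s_next * p_next / cap_t
    return dividend_yield, net_repurchase_yield, cap_growth


# Hypothetical inputs: 100 shares shrink to 95 through repurchases.
terms = decompose_gross_return(s_t=100.0, s_next=95.0, p_t=50.0, p_next=54.0, d_next=1.0)
gross_return = (54.0 + 1.0) / 50.0
print(terms, sum(terms), gross_return)
```

Because the share-count terms cancel, the sum matches the per-share gross return for any inputs. Only the split between payout and capitalization growth depends on how many shares are repurchased or issued.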
Table 8  The actual parameter estimates and the implied parameters; adapted from Cochrane [12]

                                                   Correlation
       Estimates   σ(b̂)    Implied value           r        d        dp
b̂_r    0.097       0.050    0.101            r     19.6     66.0     −70.0
b̂_d    0.008       0.044    0.004            d     66.0     14.0     7.5
ψ̂      0.941       0.047    0.945            dp    −70.0    7.5      15.3

regression of equation (28). As shown in Table 8 (adapted from [12]), the b̂_d estimate is very close to zero with a large standard error. Therefore, b_r is probably close to 0.101 as implied by equation (31) using ρ = 0.9638, which is close to the actual estimate of 0.097.

At the same time, equation (32) suggests that shocks to returns and the dividend–price ratio are highly correlated with each other, which is indeed true from Table 8. For example, the negative correlation is as high as 70%. Table 8 also shows that the estimated coefficients b_r, b_d, and ψ and their corresponding implied values from equation (31) are amazingly close.

Error in Variables

through a correlation between u_{t+1} and v_{t+1}. In other words, information in the predictors helps to improve the "quality" of the expected return estimates in the spirit of the classical SUR (seemingly unrelated) regression. Since the system of equations (33)–(35) can be reduced to a predictive regression when µ_t = z_t, it should perform at least as well as the predictive regression. Additional constraints can be imposed to improve the estimation efficiency. For example, when there is a positive shock to the expected return, future expected returns will be high due to the persistence, which will result in a low price, or equivalently a low return. Therefore, we can incorporate this prior constraint of the negative correlation between u_{t+1} and ε_{t+1}. Using quarterly data and imposing an economic prior in a Bayesian framework, Pastor and Stambaugh [31] found that the dividend yield is a very useful predictor.

In Pastor and Stambaugh [31], a predictor affects the expected return through an indirect channel by improving the precision of the expected return estimate, in the same spirit as the SUR regression. To push the idea further, Baranchuk and Xu [3] studied both the direct and indirect effects of predictors on the expected return. In particular, equations (34) and (35) are replaced by
After comparing the (conditional) root-mean-squared errors (RMSEs) with respect to the predicted returns to the (unconditional) RMSEs using a simple sample mean, Goyal and Welch [37] concluded that in-sample predictability can be very different from out-of-sample performance. In most cases, the unconditional RMSEs are smaller than the conditional RMSEs. Therefore, they believe that most results from predictive regressions are just statistical illusions.

Similar to the idea of using prior information to improve the predictive power for future returns as in [12], prior economic constraints are valuable information and should be used simultaneously. Campbell and Yogo [11] recognized that if we are really predicting the expected returns in a predictive regression, we should throw out the negative predicted returns since the expected returns should always be positive. By constraining the predicted returns to be nonnegative, Campbell and Yogo [11] found that most predictors in the above list are indeed useful in predicting future returns even out of sample.

In a related study, Xu [38] recognized that, given the low R²s in predictive regressions, it is very difficult to provide accurate prediction about the

Given the vast literature on predictability, in this article attention was focused on the questions of the existence of predictability and the interpretation of predictability. Return predictability has always been a challenge to the EMH. The traditional view on the evidence is either denying the evidence with the help of statistical methods or attributing the phenomenon to market frictions. For example, most predictors, except for the past returns, are persistent. Such a statistical property may result in a spurious regression. Predictors are also imperfect, which will bring in an error-in-variables problem in estimation. Many market microstructure effects, such as bid–ask bounce and nonsynchronous trading, may induce autocorrelation in the short run and among small stocks. The modern view, however, takes a more positive approach by recognizing the time-varying risk premium due to changes in either investment opportunities or investors' risk tolerance. If this is indeed the case, many variables that predict business cycles should also help to predict returns, for example, the interest rate. Other variables that contain a price component can also predict returns because prices reflect expectations and should summarize all future changes in the expected return or CF distributions.
[20] Ferson, W.E., Sarkissian, S. & Simin, T.T. (2003). Spurious regressions in financial economics? Journal of Finance 58, 1393–1414.
[21] Granger, C.W.J. & Newbold, P. (1974). Spurious regressions in econometrics, Journal of Econometrics 2, 111–120.
[22] Kandel, S. & Stambaugh, R.F. (1996). On the predictability of stock returns: an asset allocation perspective, The Journal of Finance 51, 385–424.
[23] Keim, D. & Stambaugh, R. (1986). Predicting returns in stock and bond markets, Journal of Financial Economics 17, 357–390.
[24] Kirby, C. (1997). Measuring the predictability in stock and bond returns, Review of Financial Studies 10, 579–630.
[25] Kothari, S. & Shanken, J. (1997). Book-to-market, dividend yield, and expected market returns: a time-series analysis, Journal of Financial Economics 44, 169–203.
[26] Lamont, O. (1998). Earnings and expected returns, Journal of Finance 53, 1563–1587.
[27] Lettau, M. & Ludvigson, S. (2001). Consumption, aggregate wealth, and expected stock returns, Journal of Finance 56, 815–849.
[28] Lewellen, J. (2004). Predicting returns with financial ratios, Journal of Financial Economics 74, 209–235.
[29] Lo, A. & MacKinlay, A.C. (1990). An econometric analysis of nonsynchronous trading, Journal of Econometrics 45, 181–212.
[30] Pastor, L. & Stambaugh, R.F. (2008). Predictive Systems: Living with Imperfect Predictors, Working Paper, 12814, NBER.
[31] Pastor, L. & Stambaugh, R.F. (2007). Predictive Systems: Living with Imperfect Predictors. NBER, Working Paper.
[32] Pontiff, J. & Schall, L.D. (1998). Book-to-market ratio as predictors of market returns, Journal of Financial Economics 49, 141–160.
[33] Richardson, M. & Smith, T. (1991). Tests of financial models in the presence of overlapping observations, Review of Financial Studies 4, 227–254.
[34] Roll, R. (1984). A simple implicit measure of the effective bid-ask spread in an efficient market, Journal of Finance 39, 1127–1140.
[35] Schwert, G. (2003). Anomalies and market efficiency, in Handbook of Economics and Finance, G. Constantinides, M. Harris & R. Stulz, eds, North Holland, Amsterdam, Netherlands, Chapter 17.
[36] Stambaugh, R.F. (1999). Predictive regressions, Journal of Financial Economics 54, 375–421.
[37] Welch, I. & Goyal, A. (2008). A comprehensive look at the empirical performance of equity premium prediction, Review of Financial Studies 21, 1533–1575.
[38] Xu, Y. (2004). Small levels of predictability and large economic gains, Journal of Empirical Finance 11, 247–275.

Related Articles

Capital Asset Pricing Model; Efficient Market Hypothesis; Expectations Hypothesis; Risk Premia.

YEXIAO XU
Real Options

Real options theory is about decision making and value creation in an uncertain world. It owes its success to its ability to reconcile frequently observed investment behaviors that are seemingly inconsistent with rational choices at the firm level. For instance, Dixit [15] uses real options to explain why firms undertake investments only if they expect a yield in excess of a required hurdle rate, thus violating the Marshallian theory of long- and short-run equilibria.^{a,b} This is because, relative to a setting in which there is no uncertainty, unforeseeable future payouts discourage commitment to a project unless the expected profitability of the project is sufficiently high. The real options methodology makes it possible to identify and value risky investments and, under certain conditions, even to take advantage of uncertainty. Indeed, as we shall see, this valuation approach insures investments against possible adverse outcomes while retaining upside potential.^{c}

Definition of a Real Option

A real option gives its holder the right, but not the obligation, to take an action (e.g., deferring, expanding, contracting, or abandoning) for a specified price, called the exercise (or strike) price, on or before some specified future date. We can identify at least six factors that affect the value of a real option: the value of the underlying risky asset (i.e., the project, investment, or acquisition); the exercise price; the volatility of the value of the underlying asset; the time to expiration of the option; the interest rate; and the dividend rate of the underlying asset (i.e., the cash outflows or inflows over the life of the option). If the value of the underlying project, its standard deviation, or the time to expiration increases, so too does the value of the option. The value of the (call) option also increases if the risk-free rate of interest goes up. Lost dividends decrease the value of the option.^{d} A higher exercise price reduces (augments) the value of a call (put) option.^{e}

The quantitative origins of real options derive from the seminal work of Black and Scholes [2] and Merton [32] on financial options pricing (see Black–Scholes Formula). These roots are evident in the assumptions that trading and decision making take place in continuous time and that the underlying sources of uncertainty follow Brownian motions. Even though these assumptions may be unsuitable in some corporate contexts, they permit the derivation of precise theoretical solutions, thereby proving to be essential.^{f,g} The focus of this earlier literature has been on valuing individual real options: the option to expand a project, for instance, is an American call option (see American Options). So is a deferral option that gives a firm the right to delay the start of a project. The option to abandon a project, or to scale back by selling a fraction of it for a fixed price, is formally an American put (see American Options).

Real-world projects, however, are often more complex in that they involve a collection of real options, whose values may interact. The recent development in financial options interdependencies has enabled a smoother transition from a theoretical stage to an application stage.^{h} Margrabe's [29] valuation of an option to exchange one risky asset for another (see Margrabe Formula) finds immediate application in the modeling of switching options, which allow a firm to switch between two modes of operation. Geske [19] values options on options, called compound options, which may be applied to growth opportunities that become available only if earlier investments are undertaken. Phased investments belong to this category. Thus, almost paradoxically, in this relatively new field of research, the mathematically most complex models, which apply sophisticated contingent claims analysis techniques, entail a great wealth of factual applications.^{i} Moreover, numerous studies show that real options represent a sizable fraction of a firm's value; both Kester [25] and Pindyck [35], for instance, estimate that the value of a firm's growth options is more than half its market value of equity if demand volatility exceeds 20%. For this reason, the theory of real options has gained significant importance among management practitioners whose choices determine the success or failure of their enterprises. Amram and Kulatilaka [1] collect several case studies to show practitioner audiences how real options can improve capital investment planning and results. In particular, they list three real options characteristics that are of great use to managers: (i) options payoffs are contingent on the manager's decisions; (ii) options valuations are aligned with financial market valuations; and (iii) options thinking can be used to design and manage strategic investments proactively. The real options paradigm,
however, is only the last stage in the evolution of valuation models. The traditional approach to valuing investment projects, which owes its origins to John Hicks and Irving Fisher, is based on net present value. This technique involves discounting expected net cash flows from a project at a discount rate that reflects the risk of those cash flows, called the risk-adjusted discount rate. Brennan and Trigeorgis [8] characterize these first-phase models as static, or mechanistic. The second-phase models are controllable cash-flow models, in which projects can be managed actively in response to the resolution of exogenous uncertainties. Since they ignore strategic investment, both first- and second-phase models often lead to suboptimal decisions. Dynamic, game-theoretic options models assume that projects can be managed actively, instead.^{j} These models take into account not only the resolution of exogenous uncertainties but also the actions of outside parties. For this reason, an area of immense importance within game-theoretic options models concerns market competition and strategy.

Strategic firm interactions are isomorphic to a portfolio of real options.^{k} Furthermore, the payouts of a project (as well as its value) can be seen as the outcome of a game among the inside agent, outside agents, and nature. Dixit [14] and Williams [40] were the first to consider real options within an equilibrium context. Smit and Ankum [37], among others, study competitive reactions within a game-theoretic framework under different market structures. In the same line of research is Grenadier's [21] analysis of a perfectly competitive real-estate market with stochastic demand and time to build.^{l}

Solution of the Basic Model

Apart from particular cases, all investment expenditures have two important characteristics. First, they are at least partly irreversible, and second, they can be delayed so that the firm has the opportunity to wait for new information to arrive before committing any resources.

The most basic continuous-time model of irreversible investment was originally developed by McDonald and Siegel [31]. In their problem, a firm must decide when to invest in a single risky project, denoted by V, with a fixed known cost I. The project is assumed to follow a geometric Brownian motion with expected return and volatility indicated by µ and σ, respectively. The project's payout rate equals δ. Formally, the process can be written as

\[ \frac{dV}{V} = (\mu - \delta)\,dt + \sigma\,dz \tag{1} \]

where dz is the increment of a Wiener process and (dz)^2 = dt.^{m,n} In addition, denote the value of the firm's investment opportunity (its option to invest) by F(V). It can be shown that the optimal rule is to invest at the date τ* when the project's value first exceeds a certain optimal threshold V*. This rule maximizes

\[ F(V) = \max_{\tau} E\left[(V_{\tau} - I)\,e^{-\mu\tau}\right], \quad V_0 = V \tag{2} \]

over all possible stopping times τ, where E is the expectation operator. Prior to undertaking the project, the only return to holding the investment option is its capital appreciation, so that

\[ \mu F(V)\,dt = E[dF(V)] \tag{3} \]

Expanding dF(V) using Itô's lemma yields

\[ dF(V) = F'(V)\,dV + \frac{1}{2} F''(V)\,(dV)^{2} \tag{4} \]

where primes indicate derivatives. Lastly, substituting equation (1) in (4) and taking expectations on both sides gives

\[ \frac{1}{2}\sigma^{2} V^{2} F''(V) + (\mu - \delta) V F'(V) - \mu F(V) = 0 \tag{5} \]

Equation (5) must be solved simultaneously for the option value F(V) and the optimal investment threshold V*, subject to three boundary conditions:

\[ F(0) = 0 \tag{6} \]

\[ F(V^{*}) = V^{*} - I \tag{7} \]

\[ F'(V^{*}) = 1 \tag{8} \]

Equation (6) is equivalent to stating that the investment option is worthless when the project's outcome is null. Equations (7) and (8) indicate the payoff and marginal value associated with the optimum. To derive V*, we must guess a functional form
that satisfies equation (5) and verify whether it works. In particular, if we take F(V) = AV^β, then

\[ V^{*} = \frac{\beta I}{\beta - 1} \tag{9} \]

and

\[ \beta = \frac{1}{2} - \frac{\mu - \delta}{\sigma^{2}} + \sqrt{\left(\frac{\mu - \delta}{\sigma^{2}} - \frac{1}{2}\right)^{2} + \frac{2\mu}{\sigma^{2}}} \tag{10} \]

The optimal rule is to invest when the value of the project exceeds the cost by a factor β/(β − 1) > 1. This result is in contrast with net present value, which prescribes investing as soon as the value of the project exceeds the cost (V* = I). However, since the latter rule does not account for uncertainty and irreversibility, it is incorrect and it leads to suboptimal decisions.

Furthermore, as is apparent from the solution, the higher the risk of the project, measured by σ, the larger are the value of the option and the opportunity cost of investing. Increasing values of the growth rate, µ, also cause F(V) and V* to be higher. On the other hand, larger expected payout rates, δ, lower both F(V) and V* as holding the option becomes more expensive.

Dixit and Pindyck [16] show how the optimal investment rule can be found by using both dynamic programming (as is done above) and contingent claims analysis.^{o}

Contingent claims methods require one important assumption: stochastic changes in the value of the project must be spanned by existing assets in the economy (see Complete Markets). Specifically, capital markets must be sufficiently complete so that one could find an asset, or construct a dynamic portfolio of assets, the price of which is perfectly correlated with the value of the project (see Risk-neutral Pricing).^{p,q} This assumption allows properly taking into account all the flexibility (options) that the project might have and using all the information contained in market prices (e.g., futures prices) when such prices exist.^{r} If the sources of uncertainty in a project are not traded assets (examples of which are product demand uncertainty, geological uncertainty, technological uncertainty, cost uncertainty, etc.), an equilibrium model of asset prices can be used to value the contingent claim.^{s}

Numerical Methods in Real Options

In practice, most real option problems must be solved using numerical methods. Until recently, these methods were so complex that only a few companies found it practical to use them when formulating operating strategies. However, advances in both computational power and understanding of the techniques over the last 20 years have made it feasible to apply real options thinking to strategic decision making. Numerical solutions give not only the value of the project but also the optimal strategy for exercising the options.^{t} The simplest real option problems involving one or two state variables can be more conveniently solved using binomial or trinomial trees in one or two dimensions (see Finite Element Methods).^{u} When a problem involves more state variables, perhaps path dependent, the more practical solution is to use Monte Carlo simulation methods (see Monte Carlo Simulation).^{v,w} In order to do so, we use the assumption that properly anticipated prices (or cash flows) fluctuate randomly. Regardless of the pattern of cash flows that a project is expected to have, the changes in its present value will follow a random walk. This theorem, attributable to Paul Samuelson, allows us to combine any number of uncertainties by using Monte Carlo techniques, and to produce an estimate of the present value of a project conditional on the set of random variables drawn from their underlying distributions. More generally, there are two types of numerical techniques for option valuation: (i) those that approximate the underlying stochastic processes directly and (ii) those approximating the resulting partial differential equation. The first category includes lattice approaches and Monte Carlo simulations. Examples of the second category include numerical integration (see Quadrature Methods) and the implicit/explicit finite difference schemes (see Finite Difference Methods for Barrier Options; Finite Difference Methods for Early Exercise Options) used by Brennan [6], Brennan and Schwartz [7], and Majd and Pindyck [28], among others.

Conclusions

The application of option concepts to value real assets has been an important growth area in the theory and practice of finance. The insights and techniques
derived from option pricing have proven capable both plain and exotic contingent claims and presents
of quantifying the managerial operating flexibility recent results on the numerical computation of opti-
and strategic interactions thus far ignored by con- mal exercise boundaries, hedging prices, and hedging
ventional net present value and other quantitative portfolios.
approaches. This flexibility represents a substantial part of the value of many projects and neglecting it can undervalue investments and induce a misallocation of resources. By explicitly incorporating management flexibility into the analysis, real options have provided the tools for properly valuing corporate resources and capital budgeting.

End Notes

a. Marshall’s [30] analysis states that if price exceeds long-run average cost, then existing firms expand and new ones enter a business.

b. Symmetrically, firms often do not exit a business for lengthy periods, even after the price falls substantially below long-run average cost. This phenomenon is dubbed hysteresis.

c. Amram and Kulatilaka [1], Brennan and Trigeorgis [8], Copeland and Antikarov [10], Dixit and Pindyck [16], Grenadier [21], Schwartz and Trigeorgis [36], and Smit and Trigeorgis [38] represent core reference volumes on real investment decisions under uncertainty. The survey article by Boyer et al. [4] is a noteworthy collection of the most notable contributions to the literature on strategic investment games, from the pioneering works of Gilbert and Harris [20] and Fudenberg and Tirole [18] to more recent contributions.

d. For a thorough examination of the variables driving real options’ analysis, the reader is referred to [10], Chapter 1.

e. An interesting example on the effect of an option’s exercise price on its value is presented by Moel and Tufano [33]. They study the bidding for rights to explore and develop a copper mine in Peru. A peculiar aspect of the transaction is the nature of the bidding rules that bidders were required to follow by the Peruvian government. Each bid was required to specify the minimum amount that the bidder would spend on developing the property if they decided to go ahead after exploration. This is equivalent to allowing the bidders to specify the exercise price of their development option. This structure gave rise to incentives that affected the amount that firms would offer, thus inducing successful bidders to make uneconomic investments.

f. Boyarchenko and Levendorskii [3] relax these assumptions and show how to analyze firm decisions in discrete time.

g. Cox, Ross, and Rubinstein’s [12] binomial approach enables a more simplified valuation of options in discrete time.

h. Detemple [13] provides a complete treatment of American-style derivatives pricing. He analyzes in detail

i. Flexible manufacturing, natural resource investments, land development, leasing, large-scale energy projects, research and development, and foreign investment are all examples of real options cases.

j. Trigeorgis and Mason [39] remark that option valuation can be seen as a special version of decision tree analysis. Decision scientists propose the use of decision tree analysis [34] to capture the value of operating flexibility associated with many projects.

k. Luehrman [27] explains how a business strategy compares to a series of options more than to a single option. De facto, executing a strategy almost always involves making a sequence of decisions: some actions are taken immediately, while others are deliberately deferred.

l. The time-to-build and continuous-time features of Grenadier’s [21] model translate into an infinite state space. Despite this, he is able to determine the optimal construction rules by engineering an artificial economy with a finite state space in which the equilibrium strategy is identical to that of the true economy.

m. According to equation (1), the current project value is known but its future values are uncertain.

n. Chapters 3 and 4 in [16] provide a thorough overview of the mathematical tools necessary to study investment decisions using a continuous-time approach.

o. Although equivalent, the two methodologies are conceptually rather different: while the former relies on the option’s value satisfying the Bellman equation, the latter is founded on the construction of a risk-free portfolio formed by a long position in the firm’s option and a short position in units of the firm’s project. Chapter 5 in [16] presents a detailed explanation, along with a guided derivation, of the optimal rule obtained on adopting each technique.

p. Duffie [17] gives great emphasis to the implications of complete markets for asset pricing under uncertainty.

q. Harrison and Kreps [22], Harrison and Pliska [23], and others have shown that, in complete markets, the absence of arbitrage implies the existence of a probability distribution such that securities are priced on the basis of their discounted (at the risk-free rate) expected cash flows, where expectation is determined under the risk-neutral probability measure. If all risks can be hedged, this probability is unique. The critical advantage of working in the risk-neutral environment is that it is a convenient environment for option pricing.

r. The reader is referred to [36] for a more rigorous discussion on the application of contingent claims analysis to determine a project’s optimal operating policy.

s. See [11] for the derivation of a fundamental partial differential equation that must be satisfied by the value of all contingent claims on the value of state variables that are not traded assets.
t. Broadie and Detemple [9] conduct a careful evaluation of the many methods for computing American option prices.

u. Boyle [5] shows how lattice frameworks can be extended to handle two state variables.

v. In the last few years, methods have been developed that allow simulations to be used for solving American-style options. For example, Longstaff and Schwartz [26] developed a least-squares Monte Carlo approach to compare the value of immediate exercise with the conditional expected value from continuation.

w. Hull and White [24] suggest a control variate technique to improve computational efficiency when a similar derivative asset with an analytic solution is available.

References

[1] Amram, M. & Kulatilaka, N. (1999). Real Options: Managing Strategic Investment in an Uncertain World, Harvard Business School Press, Boston, MA.
[2] Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, The Journal of Political Economy 81(3), 637–654.
[3] Boyarchenko, S. & Levendorskii, S. (2000). Entry and exit strategies under Non-Gaussian distributions, in Project Flexibility, Agency, and Competition, M. Brennan & L. Trigeorgis, eds, Oxford University Press, Inc., New York, NY, pp. 71–84.
[4] Boyer, R., Gravelle, E. & Lasserre, P. (2004). Real Options and Strategic Competition: A Survey. Working Paper.
[5] Boyle, P. (1988). A lattice framework for option pricing with two state variables, The Journal of Financial and Quantitative Analysis 23(1), 1–12.
[6] Brennan, M. (1979). The pricing of contingent claims in discrete time models, The Journal of Finance 34(1), 53–68.
[7] Brennan, M. & Schwartz, E. (2001). Finite differences methods and jump processes arising in the pricing of contingent claims: a synthesis, in Real Options and Investment Under Uncertainty: Classical Readings and Recent Contributions, E. Schwartz & L. Trigeorgis, eds, The MIT Press, Cambridge, MA, pp. 559–570.
[8] Brennan, M. & Trigeorgis, L. (2000). Project Flexibility, Agency, and Competition, Oxford University Press, Inc., New York, NY.
[9] Broadie, M. & Detemple, J. (1996). American option valuation: new bounds, approximations, and a comparison of existing methods, The Review of Financial Studies 9(4), 1211–1250.
[10] Copeland, T. & Antikarov, V. (2001). Real Options: A Practitioner’s Guide, W.W. Norton & Company, New York.
[11] Cox, J., Ingersoll, J. & Ross, S. (1985). An intertemporal general equilibrium model of asset prices, Econometrica 53(2), 363–384.
[12] Cox, J., Ross, S. & Rubinstein, M. (1979). Option pricing: a simplified approach, The Journal of Financial Economics 7(3), 229–263.
[13] Detemple, J. (2005). American-Style Derivatives: Valuation and Computation, Chapman & Hall/CRC.
[14] Dixit, A. (1989). Entry and exit decisions under uncertainty, The Journal of Political Economy 97(3), 620–638.
[15] Dixit, A. (1992). Investment and hysteresis, The Journal of Economic Perspectives 6(1), 107–132.
[16] Dixit, A. & Pindyck, R. (1994). Investment Under Uncertainty, Princeton University Press, Princeton, NJ.
[17] Duffie, D. (1996). Dynamic Asset Pricing Theory, Princeton University Press, Princeton, NJ.
[18] Fudenberg, D. & Tirole, J. (1985). Preemption and rent equalization in the adoption of new technology, The Review of Economic Studies 52(3), 383–401.
[19] Geske, R. (1979). A note on analytical valuation formula for unprotected American call options on stocks with known dividends, The Journal of Financial Economics 7, 375–380.
[20] Gilbert, R. & Harris, R. (1984). Competition with lumpy investment, RAND Journal of Economics 15(2), 197–212.
[21] Grenadier, S. (2000). Strategic options and product market competition, in Project Flexibility, Agency, and Competition, M. Brennan & L. Trigeorgis, eds, Oxford University Press, Inc., New York, NY, pp. 275–296.
[22] Harrison, M. & Kreps, D. (1979). Martingales and arbitrage in multiperiod securities markets, The Journal of Economic Theory 20(3), 381–408.
[23] Harrison, J. & Pliska, S. (1981). Martingales and stochastic integrals in the theory of continuous trading, Stochastic Processes and Their Applications 11, 215–260.
[24] Hull, J. & White, A. (1988). The use of control variate technique in option pricing, The Journal of Financial and Quantitative Analysis 23(3), 237–251.
[25] Kester, W. (2001). Today’s options for tomorrow’s growth, in Real Options and Investment Under Uncertainty: Classical Readings and Recent Contributions, E. Schwartz & L. Trigeorgis, eds, The MIT Press, Cambridge, MA, pp. 33–46.
[26] Longstaff, F. & Schwartz, E. (2001). Valuing American options by simulations: a simple least-squares approach, The Review of Financial Studies 14(1), 113–147.
[27] Luehrman, T. (2001). Strategy as a portfolio of real options, in Real Options and Investment Under Uncertainty: Classical Readings and Recent Contributions, E. Schwartz & L. Trigeorgis, eds, The MIT Press, Cambridge, MA, pp. 385–404.
[28] Majd, S. & Pindyck, R. (1987). Time to build, option value, and investment decisions, The Journal of Financial Economics 18(1), 7–27.
[29] Margrabe, W. (1978). The value of an option to exchange one asset for another, The Journal of Finance 33(1), 177–186.
[30] Marshall, A. (1890). Principles of Economics, Macmillan and Co, London.
[31] McDonald, R. & Siegel, D. (1986). The value of waiting to invest, The Quarterly Journal of Economics 101(4), 707–728.
[32] Merton, R. (1973). Theory of rational option pricing, Bell Journal of Economics 4(1), 141–183.
[33] Moel, A., Tufano, P., Brennan, M. & Trigeorgis, L. (2000). Bidding for the Antamina mine: valuation and incentives in a real options context, in Project Flexibility, Agency, and Competition, Oxford University Press, London, pp. 128–150.
[34] Myers, S. (2001). Finance theory and financial strategy, in Real Options and Investment Under Uncertainty: Classical Readings and Recent Contributions, E. Schwartz & L. Trigeorgis, eds, The MIT Press, Cambridge, MA, pp. 19–32.
[35] Pindyck, R., Schwartz, E. & Trigeorgis, L. (eds) (2001). Irreversible investment, capacity choice, and the value of the firm, in Real Options and Investment Under Uncertainty: Classical Readings and Recent Contributions, The MIT Press, Cambridge, MA, pp. 313–334.
[36] Schwartz, E. & Trigeorgis, L. (2001). Real Options and Investment Under Uncertainty: Classical Readings and Recent Contributions, The MIT Press, Cambridge, MA.
[37] Smit, H. & Ankum, L. (1993). A real options and game-theoretic approach to corporate investment strategy under competition, Financial Management 22(3), 241–250.
[38] Smit, H. & Trigeorgis, L. (2004). Strategic Investment: Real Options and Games, Princeton University Press, Princeton, NJ.
[39] Trigeorgis, L. & Mason, S. (2001). Valuing managerial flexibility, in Real Options and Investment Under Uncertainty: Classical Readings and Recent Contributions, E. Schwartz & L. Trigeorgis, eds, The MIT Press, Cambridge, MA, pp. 47–60.
[40] Williams, J. (1993). Equilibrium and options on real assets, The Review of Financial Studies 6(4), 825–850.

Further Reading

Grenadier, S. (2000). Game Choices: The Intersection of Real Options and Game Theory, Risk Books, London.

Related Articles

Black–Scholes Formula; Option Pricing: General Principles; Options: Basic Definitions; Swing Options.

DORIANA RUFFINO
Employee Stock Options

Employee stock options (ESOs) are call options issued by a company and given to its employees as part of their remuneration. The rationale is that granting the employee options will align his or her interests with those of the firm’s shareholders. This is particularly relevant for managers and Chief Executive Officers (CEOs) whose behavior has more impact on firm value than that of lower ranked employees.

ESOs are prevalent in both the United States and Europe. In the fiscal year 1999, 94% of the S&P 500 companies granted options to their top executives, and the value at the grant date represented 47% of total pay for the CEOs [14]. The 2005 Mercer Compensation Survey [34] reports that over 75% of CEOs receive option grants and options account for 32% of CEO pay. The Hay Group’s 2006 European Executive Pay Survey [15] found that 55% of the companies in the study used stock options.

ESOs are American call options on the company stock granted to the employee. They typically have a number of characteristics that distinguish them from financial options; see [38] and [35] for overviews. There is usually an initial vesting period during which the options cannot be exercised. Cliff vesting is a structure where all options granted on a given date become exercisable after an initial period, usually 2–4 years. Stepped vesting refers to a structure where a proportion of an option grant becomes exercisable each year, for example, 10% after one year, then 20%, 30%, and 40% each subsequent year. The most common structure is straight vesting where the proportions are equal, say one-third of the grant is exercisable after each of the first three years (see [2, 30], and [25]). During this period, typically, the employee must forfeit the remaining unvested options if he or she resigns or is fired. Clearly, if there is no vesting period, the options are American style, whereas, in the limit, as the vesting period approaches maturity, the options become European (see American Options; Call Options for descriptions of European and American options).

After the vesting period, the options may be exercised at any time up to and including the maturity date. These options are typically long dated with a 10-year maturity being most common. The employee is not able to sell or transfer the options at any time. This is in keeping with the alignment or incentive effect of options. The option terms are modified if the employee exits the firm either because he or she is fired, leaves, retires, or dies. These “sunset rules” vary widely across firms (see [8] for details), but typically the employee is given a period of time in which to exercise the options or forfeit them. The length of time is generally longest if the employee retires and shortest if the employee leaves or is fired. In addition to being unable to unwind an option position by selling it, employees are typically also restricted from short selling the stock of their company and thus are very restricted in terms of hedging their option exposure [5].

There have been a number of empirical studies of ESO exercise patterns. Huddart and Lang [23] study exercise behavior in a sample of eight firms that volunteered internal records on option grants and exercises from 1982 to 1994. They find a pervasive pattern of option exercises well before expiration: the mean fraction of option life elapsed at the time of exercise varied from 0.26 to 0.79 over companies. Bettis et al. [2] analyze a unique database of more than 140 000 option exercises by corporate executives at almost 4000 firms during the period 1996 through 2002. They find 10-year options were exercised a median of 4.25 years before expiry. A further feature documented in the data is that of block exercise. Huddart and Lang [23] find that the mean fraction of options from a single grant exercised by an employee at one time varied from 0.18 to 0.72 over employees at a number of companies. Similarly, Aboody [1] reports yearly mean percentages of options exercised over the life of 5 and 10 year options, showing exercises are spread over the life of the options. Some of these block exercises are due to the nature of the vesting structure (for instance, Huddart and Lang [23] find spikes on vest dates corresponding to large exercises on those dates), but there are also other block exercises on dates that cannot be explained by vesting.

There are many questions of interest, including “What is the employee’s optimal exercise policy?”, “What are the options worth to him or her?”, and “What is the corresponding cost to the company of granting the options?” The employee’s exercise policy and option value should incorporate the features described above, his or her inability to hedge being key. The cost to the company should reflect the
value of the option liability to the issuing corporation. This usually entails the assumption that shareholders are well diversified, so the cost should be the risk-neutral option value conditional on the optimal exercise behavior of the employee. This distinction between the option value to the employee (often called subjective value) and the cost to the company is important and arises because the employee cannot perfectly hedge the risk of the option exposure, while shareholders are typically assumed to be well diversified.

The need to quantify the company cost is particularly relevant in light of changes in accounting rules, which require companies to expense options at the grant date. In 1995, the Financial Accounting Standards Board (FASB) set a standard to require firms to expense stock options using “fair value”. However, this included the possibility to calculate the option cost to the firm as the option’s intrinsic value at the grant date. Perhaps motivated by this, companies mainly granted options that were at-the-money thus calculating a zero value for the expense. The huge growth of employee options and a series of corporate scandals led to pressure for changes to these rules, and new regulations (FASB 123R in the United States, International Financial Reporting Standards (IFRS) 2 in Europe) were introduced in 2004. From 2005 onward, these regulations required companies to use a “fair value method” of accounting for the expense of employee options, and although recommendations are made concerning appropriate methods, there is still much scope for interpretation by companies. For instance, use of the (European) Black–Scholes price with an estimated “expected term” is an acceptable and popular approach. Despite these changes, the granting of options that are at-the-money is still typical.

To take into account the nonhedgeability aspect of employee options, we need to move outside of the complete market or risk-neutral pricing framework to an incomplete setting (see Complete Markets). There have been many papers in the literature in this direction, beginning with [22, 31, 32], and [14], amongst others. These papers typically develop binomial models that take trading restrictions and employee risk aversion into account and compute a certainty equivalent or subjective value for the employee options. These models make the simplistic assumption that any nonoption wealth is invested in a riskless bank account, and most treat the options as European. Also in a binomial model, Cai and Vijh [3] and Carpenter [5] assume nonoption or outside wealth is invested in a Merton-style portfolio, but only allow for a one-off choice of this portfolio.

Many of the papers mentioned above observe that the utility-based or subjective valuation to the employee is much lower than the equivalent Black–Scholes value (the value obtained in an equivalent complete market setting); however, this is not universally true in models where nonoption wealth is invested in a riskless bond [14]. Generally, however, the (subjective) value of the options to the employee is less than the cost of the options to the company because of the employee’s hedging restrictions.

These models have been extended to incorporate the impact of optimal investment of outside wealth in a market or risky asset, rather than just a bank account. This was tackled in the natural setting of utility-indifference pricing (see [19] for a survey containing many references) for European options by Henderson [17]. This allows the employee to reduce risk by partial hedging in the market asset, which would seem to reflect what can be done in practice. The basic setup for continuous-time models with hedging in the market asset is as follows. The market $M$ follows a geometric Brownian motion

$$\frac{dM}{M} = \mu\,dt + \sigma\,dB \tag{1}$$

where $\mu$, $\sigma$ are constants, and $B$ is standard Brownian motion. Let $W$ be a standard Brownian motion and assume $dB\,dW = \rho\,dt$. We can write $dW = \rho\,dB + \sqrt{1-\rho^2}\,dZ$ for $Z$ a Brownian motion independent of $B$. The company stock $S$ also follows a geometric Brownian motion:

$$\frac{dS}{S} = \nu\,dt + \eta\,dW = \nu\,dt + \eta\left(\rho\,dB + \sqrt{1-\rho^2}\,dZ\right) \tag{2}$$

The term $\rho^2\eta^2$ represents the hedgeable or market component of the total risk of the stock and $(1-\rho^2)\eta^2$ is the unhedgeable or idiosyncratic/firm-specific risk of the stock. When $\rho^2 = 1$ all the risk can be hedged and an employee with an option on the stock $S$ is able to perfectly hedge the risk he or she faces. (To avoid arbitrage, we should have $\nu - r = (\mu - r)\eta/\sigma$. More generally, CAPM imposes the relation
$\nu - r = (\mu - r)\eta\rho/\sigma$; see Capital Asset Pricing Model).

The employee can invest in a riskless asset with interest rate $r$ and hold a cash amount $\theta_t$ in the market at time $t$. The dynamics of the wealth account $X$ are then

$$dX = \theta\,\frac{dM}{M} + r(X - \theta)\,dt \tag{3}$$

If the employee is granted $\lambda$ European call options with strike $K$ then he or she solves

$$V(t, X_t, S_t, \lambda) = \sup_{\theta_u;\, u \ge t} E_t\left[U\left(X_T + \lambda (S_T - K)^+\right)\right] \tag{4}$$

Under the assumption of exponential utility, closed-form solutions are obtained for the value function. The utility-based or utility-indifference value $p$ of the $\lambda$ options solves $V(t, x + p, S_t, 0) = V(t, x, S_t, \lambda)$. In such models, it is straightforward to show that, in the limit, as the (absolute value of) correlation between the company stock and market approaches one, the Black–Scholes or complete market value is recovered. This value is then an upper bound on the utility-based valuation. In a European option setting, the Black–Scholes value represents the cost to the company, so we see the value to the employee is lower than the cost to the company. The other comparison of interest is to consider what difference the ability to undertake partial hedging in the market makes. The ability to partially hedge is valuable to the employee and his or her utility-based or subjective option value is higher than without the hedging/investment opportunity. In other words, the subjective value increases in (absolute value of) correlation. Similar to the models without the market asset, the higher the employee’s risk aversion, the lower the utility-based option value.

Of course, as we described earlier, employee stock options are American options, and allow for early exercise once the options have vested. Some of the aforementioned papers also treat American style options and the general intuition is that hedging restrictions of the employee result in an earlier exercise and a lower subjective value than the equivalent Black–Scholes (complete market) American option. In the continuous-time model with investment in the market asset, closed-form results are found under the assumptions of exponential utility and perpetual options in [18] and numerical solutions for finite maturity in [33]. Kadam et al. [29] considered the case of the perpetual option but without the partial hedging in the market. The exercise threshold and option values both decrease with risk aversion and increase with (absolute value of) correlation. Just as in the European case, the ability to partially hedge risk is valuable to the employee. He or she places a higher value on the option and waits longer to exercise it. It is also possible that stock volatility reduces the option value in some scenarios because of the interaction of the convex payoff with the concave utility function; see [17, 18, 33], and also [37]. Since the cost to the company is just the risk-neutral option value conditional on optimal exercise by the employee, it is also decreasing with risk aversion and increasing with (absolute value of) correlation [13]. Detemple and Sundaresan [9] and Ingersoll [25] also allow for optimal investment in a market portfolio and consider numerical approaches to the marginal pricing of small quantities of options.

As mentioned earlier, the data indicates that employees exercise options in a number of tranches on different occasions. Consideration of models that only allow for one option or one exercise time is not consistent with this observation. Vesting is one feature that clearly encourages such block exercise behavior, and indeed, Huddart and Lang [23] observe that many exercises take place immediately when the options vest. However, vesting does not appear to explain all of the intertemporal exercises, since not all exercises occur immediately upon vesting. Another reason for intertemporal exercise is risk aversion and the inability to hedge risk due to restrictions. Jain and Subrahmanian [26] consider a binomial model for a risk-averse employee who is granted a number of options. Grasselli [12] extends the binomial framework to include optimal investment in a correlated market asset. These papers find numerically that optimal behavior is to exercise options when the stock price reaches a boundary and the discrete nature of the model results in exercise occurring at a discrete set of dates or stock price levels. Rogers and Scheinkman [36] make similar observations numerically in a discrete approximation to a continuous-time model without investment opportunities in a market asset.

Grasselli and Henderson [13] show that under the assumption of exponential utility and perpetual options, closed-form solutions can be derived for the multiple-option problem with investment opportunities in a market asset. In fact, they show that
given N options, there are N unique stock price thresholds at which the employee should exercise an option. These thresholds are obtained using a recursive relation. The price thresholds are increasing as the quantity of options falls. In other words, when the employee has fewer options remaining, he or she is exposed to less risk, and thus is willing to wait for a higher price threshold before exercising further options. Similar comparative statics apply as in the single American option case: thresholds, option values, and company cost are decreasing in risk aversion and increasing in (absolute value of) correlation. In addition, they show that the cost to the company is underestimated if a single optimal exercise threshold is used. Since, in reality, options are not exercised one at a time, the paper also introduces a transaction cost on exercise, which restores block exercise as the optimal solution, again found in closed form. Leung and Sircar [33] consider the finite-maturity version of the problem, which leads to numerical solution of the free-boundary problem. They also include features such as vesting and job termination risk.

As described earlier, option terms change upon departure of an employee from the company and this should be incorporated into pricing models. Employee departure is typically modeled by an exogenous exponentially distributed time with constant intensity, independent of the stock price, similar to a reduced-form approach in credit modeling (see Structural Default Risk Models). The papers [5, 6, 27, 39], and [33], among others, incorporate departure into a variety of setups in this manner.

Although we do not discuss estimation in any detail here, it is clear that estimation of such models is difficult. The models require estimates of risk aversion, outside wealth, and employee departure rate, which are not easily obtained. Bettis et al. [2] and Carpenter [5] have attempted calibration exercises on utility style models to exercise data; however, many simplifying assumptions have to be made due to data limitations. For example, they assume an option grant is exercised on one date only rather than on multiple occasions. Perhaps surprising is the finding of Carpenter [5] that after a calibration to data, a reduced-form model of employee departure is as capable as a utility-maximizing model in explaining option exercises. This finding motivates another strand of the literature, which models option exercise exogenously by postulating an exercise boundary in terms of the moneyness of the option; see [24] and [7]. This style of model has the attraction of simplicity and is much easier for calibration since the employee’s risk aversion is no longer used. For this reason, it may well be a fruitful approach for calculating an approximation to the cost of the options to the company for accounting purposes.

We now turn to briefly discuss a number of other features relevant in employee compensation. Typically employees receive new grants of options periodically; however, companies also engage in resetting (where the option strike of existing options is adjusted downward when the options are out-of-the-money) and reloading (where additional options are granted automatically when existing options are exercised [10]).

Besides the traditional employee options described in this article, companies have increasingly granted performance-based options, which link option vesting or exercise to the achievement of market or accounting-based performance targets. These options are very popular in Europe, but have, until recently, been less common in the United States; see [11] and references therein. Compensation linked to accounting data is potentially open to manipulation and managers with such options may be motivated to inflate earnings. There is a large literature on the connection between compensation involving accounting-based targets and earnings management, either of a direct nature [4] or accrual-based management or manipulation (Healy [16]).

Performance-based options can also have exercise prices contingent on performance relative to a comparison group; these are known as indexed options. See Johnson and Tian [28], who value such options in a risk-neutral framework using techniques from exchange or Margrabe options (see Margrabe Formula). Managers are then rewarded as a function of performance relative to a peer group rather than on absolute performance [20].

Other important issues that have not been discussed here include the impact of dilution: when options are exercised, the company typically issues new shares. Another important issue is the influence the CEO has on the stock price via his or her effort or choice of projects/risk. The problem of how best to compensate managers, given the benefits of improved incentives and the costs of inefficient risk-sharing, is the subject of a large literature on the principal agent problem; see the classic reference [21].
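As a rough illustration of the market model in equations (1) and (2), the following Python sketch splits the stock's variance rate into the hedgeable part ρ²η² and the idiosyncratic part (1 − ρ²)η², evaluates the CAPM drift relation ν − r = (µ − r)ηρ/σ, and checks by simulation that increments built as dW = ρ dB + √(1 − ρ²) dZ have correlation close to ρ. The function names and all numerical inputs are hypothetical choices made for the example; they are not taken from the article or the papers it cites.

```python
import math
import random


def risk_split(eta: float, rho: float) -> tuple[float, float]:
    """Split the stock's variance rate eta^2 into its hedgeable (market)
    part rho^2 * eta^2 and its idiosyncratic part (1 - rho^2) * eta^2."""
    hedgeable = rho ** 2 * eta ** 2
    idiosyncratic = (1.0 - rho ** 2) * eta ** 2
    return hedgeable, idiosyncratic


def capm_drift(r: float, mu: float, sigma: float, eta: float, rho: float) -> float:
    """Stock drift nu implied by nu - r = (mu - r) * eta * rho / sigma."""
    return r + (mu - r) * eta * rho / sigma


def correlated_increments(rho: float, n: int, dt: float, seed: int = 0):
    """Draw n pairs (dB, dW) with dW = rho*dB + sqrt(1 - rho^2)*dZ,
    where dB and dZ are independent N(0, dt) increments."""
    rng = random.Random(seed)
    scale = math.sqrt(dt)
    pairs = []
    for _ in range(n):
        dB = rng.gauss(0.0, scale)
        dZ = rng.gauss(0.0, scale)
        dW = rho * dB + math.sqrt(1.0 - rho ** 2) * dZ
        pairs.append((dB, dW))
    return pairs


def sample_correlation(pairs) -> float:
    """Plain sample correlation between the two coordinates of each pair."""
    n = len(pairs)
    mean_b = sum(b for b, _ in pairs) / n
    mean_w = sum(w for _, w in pairs) / n
    cov = sum((b - mean_b) * (w - mean_w) for b, w in pairs)
    var_b = sum((b - mean_b) ** 2 for b, _ in pairs)
    var_w = sum((w - mean_w) ** 2 for _, w in pairs)
    return cov / math.sqrt(var_b * var_w)


if __name__ == "__main__":
    # Hypothetical inputs: stock volatility 40%, correlation 0.5 with the market.
    hedgeable, idiosyncratic = risk_split(eta=0.4, rho=0.5)
    print("hedgeable variance rate:    ", hedgeable)
    print("idiosyncratic variance rate:", idiosyncratic)
    print("CAPM drift nu:", capm_drift(r=0.03, mu=0.08, sigma=0.2, eta=0.4, rho=0.5))
    pairs = correlated_increments(rho=0.5, n=20000, dt=1 / 252)
    print("sample corr(dB, dW):", sample_correlation(pairs))
```

With these inputs three quarters of the stock's variance is firm-specific, which is the part the employee cannot offset by trading the market asset; setting rho to 1 or −1 drives the idiosyncratic term to zero, matching the complete-market limit described in the text.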
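The gap between the employee's subjective value and the company's cost can be seen in a deliberately simplified one-period example. The sketch below assumes exponential utility U(x) = −exp(−γx), a zero interest rate, no trading in the stock or in a market asset, and a two-state stock price; under these assumptions the indifference condition V(x + p, 0) = V(x, λ) reduces to p = −(1/γ) ln E[exp(−γλ(S_T − K)⁺)], and initial wealth drops out. All numbers and function names are hypothetical and chosen only for illustration; this is not the continuous-time model of the papers cited above.

```python
import math


def certainty_equivalent(payoffs, probs, gamma: float, n_options: float = 1.0) -> float:
    """Cash amount p that an exponential-utility employee regards as equivalent
    to holding n_options unhedged options (zero interest rate assumed):
    p = -(1/gamma) * ln E[exp(-gamma * n_options * payoff)]."""
    expectation = sum(
        prob * math.exp(-gamma * n_options * payoff)
        for payoff, prob in zip(payoffs, probs)
    )
    return -math.log(expectation) / gamma


def risk_neutral_value(s0: float, s_up: float, s_down: float,
                       strike: float, r: float = 0.0) -> float:
    """One-period binomial value of a call, used here as the cost to a
    well-diversified company."""
    q = (s0 * (1.0 + r) - s_down) / (s_up - s_down)  # risk-neutral up probability
    pay_up = max(s_up - strike, 0.0)
    pay_down = max(s_down - strike, 0.0)
    return (q * pay_up + (1.0 - q) * pay_down) / (1.0 + r)


if __name__ == "__main__":
    # Hypothetical one-period market: stock at 100 moves to 130 or 80.
    s0, s_up, s_down, strike = 100.0, 130.0, 80.0, 100.0
    payoffs = [max(s_up - strike, 0.0), max(s_down - strike, 0.0)]
    probs = [0.6, 0.4]  # assumed real-world probabilities of the up and down moves

    cost = risk_neutral_value(s0, s_up, s_down, strike)
    print("company cost (risk-neutral):", cost)
    for gamma in (0.01, 0.05, 0.1, 0.2):
        print("gamma =", gamma, "subjective value =",
              certainty_equivalent(payoffs, probs, gamma))
    two = certainty_equivalent(payoffs, probs, 0.1, n_options=2.0)
    print("per-option value when holding two options:", two / 2.0)
```

In this toy setting the subjective value falls as risk aversion γ rises and drops below the risk-neutral cost once γ is large enough, while for very small γ it can exceed that cost because the assumed real-world up probability is higher than the risk-neutral one. The per-option value is also lower when two options are held than when one is held, which is in line with the intuition that an employee holding fewer options is exposed to less risk.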
[39] Sircar, R. & Xiong, W. (2007). A general framework for evaluating executive stock options, Journal of Economic Dynamics and Control 31(7), 2317–2349.

Further Reading

Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, Journal of Political Economy 81(3), 637–654.

Related Articles

American Options; Black–Scholes Formula; Call Options; Capital Asset Pricing Model; Complete Markets; Structural Default Risk Models.

VICKY HENDERSON & JIA SUN
Arbitrage Strategy

It is difficult to imagine a normative condition that is more widely accepted and unquestionable in the minds of anyone involved in the field of quantitative finance other than the absence of arbitrage opportunities in a financial market. Put plainly, an arbitrage strategy allows a financial agent to make certain profit out of nothing, that is, out of zero initial investment. This has to be disallowed on economic basis if the market is in equilibrium state, as opportunities for riskless profit would result in an instantaneous movement of prices of certain financial instruments.

Let us give an illustrative example of an arbitrage strategy in the foreign exchange market, commonly called the triangular arbitrage. Suppose that Mary, in Paris, is buyingᵃ the US dollar for €0.685. Tom, in San Francisco, is buying Japanese yen for $0.009419. Finally, Toru, in Tokyo, is buying one euro for ¥155.02. All these transactions are supposed to be able to occur at the same time. There is something worth noting in the situation just described, something that could allow you to make riskless profit. Let us see how. You borrow $10 000 from your rich aunt Clara and tell her you will return the money in a matter of minutes. First, you approach Mary and change all your dollars to euros. This means that you will get €6850. With the euros in hand, you contact Toru and change them into yen: you will get ¥(6850 × 155.02) = ¥1 061 887. Finally, you call Tom, wire him all your yen and change them back to dollars, which gets you $(1 061 887 × 0.009419) ≈ $10 001.91. You give the $10 000 back to your aunt Clara as promised, and you have managed to create $1.91 out of thin air.

Japanese yen for less than $0.009419 and Toru will be buying euro for less than ¥155.02. Very soon, the situation will be such that nobody is able to make a riskless profit anymore.

The economic rationale behind asking for nonexistence of arbitrage opportunities is based exactly on the discussion in the previous paragraph. If arbitrage opportunities were present in the market, a multitude of investors would try to take advantage of them simultaneously. Therefore, there would be an almost instantaneous move of the prices of certain financial instruments as a response to a supply–demand imbalance. This price movement will continue until any opportunity for riskless profit is no longer available.

It is important to note that the preceding, somewhat theoretical, discussion does not imply that arbitrage opportunities never exist in practice. On the contrary, it has been observed that opportunities for some, albeit usually minuscule, riskless profit appear frequently as a consequence of the huge number of distant geographic trading locations, as well as a result of the numerous financial products that have sprung up and are sometimes interrelated in complicated ways. Realizing that such opportunities exist is a matter of rapid access to information that a certain group of investors, so-called arbitrageurs, has. It is rather the existence of arbitrageurs acting in financial markets that ensures that when arbitrage opportunities exist, they will be fleeting.

The principle of not allowing for arbitrage opportunities in financial markets has far-reaching consequences and has immensely boosted research in quantitative finance. The ground-breaking papers of Black (see Black, Fischer) and Scholes [1] and
Although the above-mentioned example is over- Merton (see Merton, Robert C.) [3], published
simplistic, it gives a clear idea of what arbitrage is: in 1973, were the first instances explaining how
a position on a combination of assets that requires absence of arbitrage opportunities leads to ratio-
zero initial capital and results in a profit with no nal pricing and hedging formulas for European-style
risk involved. Let us now take a step further and see options in a geometric Brownian motion financial
what will happen under the situation of the preceding model.b This idea was consequently taken up and
example. As more and more investors become aware generalized by many authors and has lead to a pro-
of the discrepancy between prices, they will all try to found understanding of the interplay between the
use the same smart strategy that you used for their economics of financial markets and the mathematics
benefit. Everyone will be trying to exchange US dol- of stochastic processes, with deep-reaching results—
lars for euros in the first step of the arbitrage, which see Fundamental Theorem of Asset Pricing; Risk-
will drive Mary to start buying the US dollar for neutral Pricing; Equivalent Martingale Measures;
less than ¤0.685 because of the high demand for the and Free Lunch for some amazing developments on
euros she is selling. Similarly, Tom will start buying this path.
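The arithmetic of the triangular arbitrage example can be checked with a short script. This is only a sketch under the example's own idealized assumptions (all three quotes can be hit simultaneously, with no fees or bid-ask spread); the constant and function names are illustrative and not part of the original article.

```python
from decimal import Decimal

# Quotes taken from the example; assumed executable at the same instant, no fees.
EUR_PER_USD = Decimal("0.685")     # Mary: euros received per dollar sold
JPY_PER_EUR = Decimal("155.02")    # Toru: yen received per euro sold
USD_PER_JPY = Decimal("0.009419")  # Tom: dollars received per yen sold


def triangular_round_trip(usd):
    """Convert USD -> EUR -> JPY -> USD and return the amount after each leg."""
    eur = usd * EUR_PER_USD
    jpy = eur * JPY_PER_EUR
    usd_back = jpy * USD_PER_JPY
    return eur, jpy, usd_back


def cycle_factor():
    """Product of the three rates; a value above 1 signals a riskless profit."""
    return EUR_PER_USD * JPY_PER_EUR * USD_PER_JPY


borrowed = Decimal("10000")
eur, jpy, usd_back = triangular_round_trip(borrowed)
profit = usd_back - borrowed
print(eur, jpy, round(usd_back, 2), round(profit, 2))
```

As arbitrageurs trade, each of the three quotes moves against them, so the product returned by `cycle_factor()` is pushed back toward 1, at which point the round trip no longer yields a profit.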
one hand, the methodology of pricing by taking expectations with respect to a properly chosen “risk neutral” or “martingale” measure $Q$; on the other hand, the methodology of pricing by “no arbitrage” considerations. Why, after all, do these two seemingly unrelated approaches yield identical results in the Black–Merton–Scholes approach? Maybe even more importantly: how far can this phenomenon be extended to more involved models?

To the best of the author’s knowledge, the first person to take up these questions in a systematic way was Ross (see Ross, Stephen) [29]; see also [4, 27, 28]. He chose the following setting to formalize the situation: fix a topological, ordered vector space $(X, \tau)$, modeling the possible cash flows (e.g., the payoff function of an option) at a fixed time horizon $T$. A good choice is, for example, $X = L^p(\Omega, \mathcal{F}, \mathbb{P})$, where $1 \le p \le \infty$ and $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{0 \le t \le T}, \mathbb{P})$ is the underlying filtered probability space. The set of marketed assets $M$ is a subspace of $X$.

In the context of a stock price process $S = (S_t)_{0 \le t \le T}$ as above, one might think of $M$ as all the outcomes of an initial investment $x \in \mathbb{R}$ plus the result of subsequent trading according to a predictable trading strategy $H = (H_t)_{0 \le t \le T}$. This yields (in discounted terms) an element

$$m = x + \int_0^T H_t \, dS_t \tag{1}$$

in the set $M$ of marketed claims. It is natural to price the above claim $m$ by setting $\pi(m) = x$, as this is the net investment necessary to finance the above claim $m$.

For notational convenience, we shall assume in the sequel that $S$ is a one-dimensional process. It is straightforward to generalize to the case of $d$ risky assets by assuming that $S$ is $\mathbb{R}^d$-valued and replacing the above integral by

$$m = x + \int_0^T \sum_{i=1}^d H_t^i \, dS_t^i \tag{2}$$

Some words of warning about the stochastic integral (1) seem necessary. The precise admissibility conditions, which should be imposed on the stochastic integral (1), in order to make sense both mathematically as well as economically, are a subtle issue. Much of the early literature on the fundamental theorem of asset pricing struggled exactly with this question. An excellent reference is [14]. Ross [29] circumvented this problem by deliberately leaving this issue aside and simply starting with the modeling assumption that the subset $M \subseteq X$ as well as a pricing operator $\pi : M \to \mathbb{R}$ are given.

Let us now formalize the notion of arbitrage. In the above setting, we say that the no arbitrage assumption is satisfied if, for $m \in M$, satisfying $m \ge 0$, $\mathbb{P}$-a.s. and $\mathbb{P}[m > 0] > 0$, we have $\pi(m) > 0$. In prose, this means that it is not possible to find a claim $m \in M$, which bears no risk (as $m \ge 0$, $\mathbb{P}$-a.s.), yields some gain with strictly positive probability (as $\mathbb{P}[m > 0] > 0$), and such that its price $\pi(m)$ is less than or equal to zero.

The question that now arises is whether it is possible to extend $\pi : M \to \mathbb{R}$ to a nonnegative, continuous linear functional $\pi^* : X \to \mathbb{R}$.

What does this have to do with the issue of martingale measures? This theme was developed in detail by Harrison and Kreps [14]. Suppose that $X = L^p(\Omega, \mathcal{F}, \mathbb{P})$ for some $1 \le p < \infty$, that the price process $S = (S_t)_{0 \le t \le T}$ satisfies $S_t \in X$, for each $0 \le t \le T$, and that $M$ contains (at least) the “simple integrals” on the process $S = (S_t)_{0 \le t \le T}$ of the form

$$m = x + \sum_{i=1}^n H_i \left(S_{t_i} - S_{t_{i-1}}\right) \tag{3}$$

Here $x \in \mathbb{R}$, $0 = t_0 < t_1 < \dots < t_n = T$ and $(H_i)_{i=1}^n$ is a (say) bounded process which is predictable, that is, $H_i$ is $\mathcal{F}_{t_{i-1}}$-measurable. The sums in equation (3) are the Riemann sums corresponding to the stochastic integrals (1). The Riemann sums (3) have a clear-cut economic interpretation [14]. In equation (3) we do not have to bother about subtle convergence issues as only finite sums are involved in the definition. It is therefore a traditional (minimal) requirement that the Riemann sums of the form (3) are in the space $M$ of marketed claims; naturally, the price of a claim $m$ of the form (3) should be defined as $\pi(m) = x$.

Now suppose that the functional $\pi$, which is defined for the claims of the form (3), can be extended to a continuous, nonnegative functional $\pi^*$ defined on $X = L^p(\Omega, \mathcal{F}, \mathbb{P})$. If such an extension $\pi^*$ exists, it is induced by some function $g \in L^q(\Omega, \mathcal{F}, \mathbb{P})$, where $\frac{1}{p} + \frac{1}{q} = 1$. The nonnegativity of $\pi^*$ is tantamount to $g \ge 0$, $\mathbb{P}$-a.s., and the fact that $\pi^*(1) = 1$ shows that $g$ is the density of a probability measure $Q$ with Radon–Nikodym derivative $\frac{dQ}{d\mathbb{P}} = g$.
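To make the role of the density $g$ concrete, the following sketch works through a one-period, two-state market. Everything specific here is an assumption chosen for illustration and not taken from the article: the state names, the physical probabilities 3/5 and 2/5, and the discounted prices 100, 120 and 90. The code builds the measure $Q$ under which $S$ is a martingale, forms $g = dQ/d\mathbb{P}$, and checks that the functional $\pi^*(m) = \mathbb{E}_{\mathbb{P}}[g\,m]$ prices every simple integral $x + H(S_1 - S_0)$ at $x$.

```python
from fractions import Fraction as F

# Hypothetical one-period, two-state market (discounted prices).
P = {"up": F(3, 5), "down": F(2, 5)}   # physical measure (assumed)
S0 = F(100)
S1 = {"up": F(120), "down": F(90)}

# Martingale condition E_Q[S1] = S0 determines Q in this two-state model.
q_up = (S0 - S1["down"]) / (S1["up"] - S1["down"])
Q = {"up": q_up, "down": 1 - q_up}

# Density g = dQ/dP, defined state by state.
g = {w: Q[w] / P[w] for w in P}


def pi_star(m):
    """Pricing functional induced by g: pi*(m) = E_P[g * m] = E_Q[m]."""
    return sum(P[w] * g[w] * m[w] for w in P)


def simple_integral(x, H):
    """Claim m = x + H * (S1 - S0), the one-period case of equation (3)."""
    return {w: x + H * (S1[w] - S0) for w in P}


constant_one = {w: F(1) for w in P}
call_110 = {w: max(S1[w] - F(110), F(0)) for w in P}  # call struck at 110

print(pi_star(constant_one))                 # normalization pi*(1)
print(pi_star(simple_integral(F(7), F(-3)))) # recovers the initial investment
print(pi_star(call_110))                     # price of a claim outside the simple integrals
```

Exact rational arithmetic is used so that the identities hold without rounding error. In this example $g$ is strictly positive in both states, so $Q$ is equivalent to $\mathbb{P}$; the separation argument alone guarantees nonnegativity only.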
Fundamental Theorem of Asset Pricing 3
If we can find such an extension $\pi^*$ of $\pi$, we thus find a probability measure $Q$ on $(\Omega, \mathcal{F}, \mathbb{P})$ for which

$$\pi^*\left(\sum_{i=1}^n H_i \left(S_{t_i} - S_{t_{i-1}}\right)\right) = \mathbb{E}_Q\left[\sum_{i=1}^n H_i \left(S_{t_i} - S_{t_{i-1}}\right)\right] \tag{4}$$

for every bounded predictable process $H = (H_i)_{i=1}^n$ as above, which is tantamount to $(S_t)_{0 \le t \le T}$ being a martingale (see [Th. 2] [14], or [Lemma 2.2.6] [11]).

To sum up, in the case $1 \le p < \infty$, finding a continuous, nonnegative extension $\pi^* : L^p(\Omega, \mathcal{F}, \mathbb{P}) \to \mathbb{R}$ of $\pi$ amounts to finding a $\mathbb{P}$-absolutely continuous measure $Q$ with $\frac{dQ}{d\mathbb{P}} \in L^q$ and such that $(S_t)_{0 \le t \le T}$ is a martingale under $Q$.

At this stage, it becomes clear that in order to find such an extension $\pi^*$ of $\pi$, the Hahn–Banach theorem should come into play in some form, for example, in one of the versions of the separating hyperplane theorem.

In order to be able to do so, Ross assumes ([p. 472] [29]) that “. . .we will endow X with a strong enough topology to insure that the positive orthant $\{x \in X \mid x > 0\}$ is an open set, . . .”. In practice, the only infinite-dimensional ordered topological vector space $X$, such that the positive orthant has nonempty interior, is $X = L^\infty(\Omega, \mathcal{F}, \mathbb{P})$, endowed with the topology induced by $\|\cdot\|_\infty$.

Hence the two important cases, applying to Ross’ hypothesis, are when either the probability space $\Omega$ is finite, so that $X = L^p(\Omega, \mathcal{F}, \mathbb{P})$ simply is finite dimensional and its topology does not depend on $1 \le p \le \infty$, or if $(\Omega, \mathcal{F}, \mathbb{P})$ is infinite and $X = L^\infty(\Omega, \mathcal{F}, \mathbb{P})$ equipped with the norm $\|\cdot\|_\infty$.

After these preparations we can identify the two convex sets to be separated: let $A = \{m \in M : \pi(m) \le 0\}$ and $B$ be the interior of the positive cone of $X$. Now make the easy, but crucial, observation: these sets are disjoint if and only if the no-arbitrage condition is satisfied. As one always can separate an open convex set from a disjoint convex set, we find a functional $\tilde{\pi}$, which is strictly positive on $B$, while $\tilde{\pi}$ takes nonpositive values on $A$. By normalizing $\tilde{\pi}$, that is, letting $\pi^* = \tilde{\pi}(1)^{-1} \tilde{\pi}$, we have thus found the desired extension.

In summary, the first precise version of the fundamental theorem of asset pricing is established in [29], the proof relying on the Hahn–Banach theorem. There are, however, serious limitations: in the case of infinite $(\Omega, \mathcal{F}, \mathbb{P})$, the present result only applies to $L^\infty(\Omega, \mathcal{F}, \mathbb{P})$ endowed with the norm topology. In this case, the continuous linear functional $\pi^*$ only is in $L^\infty(\Omega, \mathcal{F}, \mathbb{P})^*$ and not necessarily in $L^1(\Omega, \mathcal{F}, \mathbb{P})$; in other words, we cannot be sure that $\pi^*$ is induced by a probability measure $Q$, as it may happen that $\pi^* \in L^\infty(\Omega, \mathcal{F}, \mathbb{P})^*$ also has a singular part.

Another drawback, which already appears in the case of finite $\Omega$ (in which case $\pi^*$ certainly is induced by some $Q$ with $\frac{dQ}{d\mathbb{P}} = g \in L^1(\Omega, \mathcal{F}, \mathbb{P})$) is the following: we cannot be sure that the function $g$ is strictly positive $\mathbb{P}$-a.s. or, in other words, that $Q$ is equivalent to $\mathbb{P}$.

After this early work by Ross, a major advance in the theory was achieved between 1979 and 1981 by three seminal papers [14, 15, 24] by Harrison, Kreps, and Pliska. In particular, [14] is a landmark in the field. It uses a setting similar to that of [29], namely, an ordered topological vector space $(X, \tau)$ and a linear functional $\pi : M \to \mathbb{R}$, where $M$ is a linear subspace of $X$. Again the question is whether there exists an extension of $\pi$ to a linear, continuous, strictly positive $\pi^* : X \to \mathbb{R}$. This question is related in [14] to the issue of whether $(M, \pi)$ is viable as a model of economic equilibrium. Under proper assumptions on the convexity and continuity of the preferences of agents, this is shown to be equivalent to the extension discussed above.

The paper [14] also analyzes the case when $\Omega$ is finite. Of course, only processes $S = (S_t)_{t=0}^T$ indexed by finite, discrete time $\{0, 1, \dots, T\}$ make sense in this case. For this easier setting, the following precise theorem was stated and proved in the subsequent paper [15] by Harrison and Pliska:

Theorem 1 ([Th. 2.7] [15]): Suppose the stochastic process $S = (S_t)_{t=0}^T$ is based on a finite, filtered, probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t=0}^T, \mathbb{P})$. The market model contains no arbitrage possibilities if and only if there is an equivalent martingale measure for $S$.

The proof again relies on a (finite-dimensional) version of the Hahn–Banach theorem plus an extra argument making sure to find a measure $Q$, which is equivalent to $\mathbb{P}$. Harrison and Pliska thus have achieved a precise version of the above meta-theorem in terms of equivalent martingale measures, which does not use the word “essentially”. Actually, the theme of the Harrison–Pliska theorem goes back
much further, to the work of Shimony [35] and Kemeny [22] on symbolic logic in the tradition of Carnap, de Finetti, and Ramsey. These authors showed that, in a setting with only finitely many states of the world, a family of possible bets does not allow (by taking linear combinations) for making a riskless profit (i.e., one certainly does not lose but wins with strictly positive probability), if and only if there is a probability measure $Q$ on these finitely many states, which prices the possible bets by taking conditional $Q$-expectations.

The restriction to finite $\Omega$ is very severe in applications: the flavor of the theory, building on Black–Scholes–Merton, is precisely the concept of continuous time. Of course, this involves infinite probability spaces $(\Omega, \mathcal{F}, \mathbb{P})$.

Here is the ingenious construction of Kreps: define

$$A = \overline{M_0 - X_+} \tag{5}$$

where the bar denotes the closure with respect to the topology $\tau$. We shall require that $A$ still satisfies

$$A \cap X_+ = \{0\} \tag{6}$$

This property is baptized as “no free lunch” by Kreps:

Definition 1 [24]: The financial market defined by $(X, \tau)$, $M$, and $\pi$ admits a free lunch if there are nets $(m_\alpha)_{\alpha \in I} \in M_0$ and $(h_\alpha)_{\alpha \in I} \in X_+$ such that

$$\lim_{\alpha \in I} (m_\alpha - h_\alpha) = x \tag{7}$$
probability measure in $(X, \tau)^* = L^q(\Omega, \mathcal{F}, \mathbb{P})$, where $\frac{1}{p} + \frac{1}{q} = 1$. This yields the desired extension $\pi^*$ of $\pi$ which is strictly positive on $X_+ \setminus \{0\}$.

We still have to specify the choice of $(M_0, \pi)$. The most basic choice is to take for given $S = (S_t)_{0 \le t \le T}$ the space generated by the “simple integrands” (3) as proposed in [14]. We thus may deduce from Kreps’ arguments in [24] the following version of the fundamental theorem of asset pricing.

Theorem 2 Let $(\Omega, \mathcal{F}, \mathbb{P})$ be countably generated and $X = L^p(\Omega, \mathcal{F}, \mathbb{P})$ endowed with the norm topology $\tau$, if $1 \le p < \infty$, or the Mackey topology induced by $L^1(\Omega, \mathcal{F}, \mathbb{P})$, if $p = \infty$.
Let $S = (S_t)_{0 \le t \le T}$ be a stochastic process taking values in $X$. Define $M_0 \subseteq X$ to consist of the simple stochastic integrals $\sum_{i=1}^n H_i (S_{t_i} - S_{t_{i-1}})$ as in equation (3).
Then the “no free lunch” condition (6) is satisfied if and only if there is a probability measure $Q$ with $\frac{dQ}{d\mathbb{P}} \in L^q(\Omega, \mathcal{F}, \mathbb{P})$, where $\frac{1}{p} + \frac{1}{q} = 1$, such that $(S_t)_{0 \le t \le T}$ is a $Q$-martingale.

This remarkable theorem of Kreps sets new standards. For the first time, we have a mathematically precise statement of our meta-theorem applying to a general class of models in continuous time. There are still some limitations, however.

When applying the theorem to the case $1 \le p < \infty$, we find the requirement $\frac{dQ}{d\mathbb{P}} \in L^q(\Omega, \mathcal{F}, \mathbb{P})$ for some $q > 1$, which is not very pleasant. After all, we want to know what exactly corresponds (in terms of some no-arbitrage condition) to the existence of an equivalent martingale measure $Q$. The $q$-moment condition is unnatural in most applications. In particular, it is not invariant under equivalent changes of measure, which are performed often in applications.

The most interesting case of the above theorem is $p = \infty$. However, in this case, the requirement $S_t \in X = L^\infty(\Omega, \mathcal{F}, \mathbb{P})$ is unduly strong for most applications. In addition, for $p = \infty$, we run into the subtleties of the Mackey topology $\tau$ (or the weak-star topology, which does not make much of a difference) on $L^\infty(\Omega, \mathcal{F}, \mathbb{P})$. We shall discuss this issue below.

The “heroic period” of the development of the fundamental theorem of asset pricing, marked by Ross [29], Harrison–Kreps [14], Harrison–Pliska [15], and Kreps [24], put the issue on safe mathematical grounds and brought some spectacular results. However, it still left many questions open; quite a number of them were explicitly stated as open problems in these papers.

Subsequently a rather extensive literature developed, answering these problems and opening new perspectives. We cannot give a full account of all of this literature and refer, for example, to the monograph [11] for more extensive information. We can give an outline.

As regards the situation for $1 \le p \le \infty$ in Kreps’ theorem, this issue was further developed by Duffie and Huang [12] and, in particular, by Stricker [36]. This author related the no free lunch condition of Kreps to a theorem by Yan [37] obtained in the context of the Bichteler–Dellacherie theorem on the characterization of semimartingales. Using Yan’s theorem, Stricker gave a different proof of Kreps’ theorem, which does not need the assumption that $(\Omega, \mathcal{F}, \mathbb{P})$ is countably generated.

A beautiful extension of the Harrison–Pliska theorem was obtained in 1990 by Dalang, Morton, and Willinger [5]. They showed that, for an $\mathbb{R}^d$-valued process $(S_t)_{t=0}^T$ in finite discrete time, the no-arbitrage condition is indeed equivalent to the existence of an equivalent martingale measure. The proof is surprisingly tricky, at least for the case $d \ge 2$. It is based on the measurable selection theorem (the suggestion to use this theorem is credited to Delbaen). Different proofs of the Dalang–Morton–Willinger theorem have been given in [17, 20, 21, 26, 31].

An important question left unanswered by Kreps was whether one can, in general, replace the use of nets $(m_\alpha - h_\alpha)_{\alpha \in I}$, indexed by $\alpha$ ranging in a general ordered set $I$, simply by sequences $(m_n - h_n)_{n=1}^\infty$. In the context of continuous processes $S = (S_t)_{0 \le t \le T}$, a positive answer was given by Delbaen in [6], if one is willing to make the harmless modification to replace the deterministic times $0 = t_0 \le t_1 \le \dots \le t_n = T$ in equation (3) by stopping times $0 = \tau_0 \le \tau_1 \le \dots \le \tau_n = T$. A second case, where the answer to this question is positive, is that of processes $S = (S_t)_{t=0}^\infty$ in infinite, discrete time, as shown in [32].

The Banach–Steinhaus theorem implies that, for a sequence $(m_n - h_n)_{n=1}^\infty$ converging in $L^\infty(\Omega, \mathcal{F}, \mathbb{P})$ with respect to the weak-star (or Mackey) topology, the norms $(\|m_n - h_n\|_\infty)_{n=1}^\infty$ remain bounded (“uniform boundedness principle”). Therefore, it follows that in the above two cases of continuous processes $S = (S_t)_{0 \le t \le T}$ or processes $(S_t)_{t=0}^\infty$ in infinite, discrete time, the “no free lunch” condition of Kreps can be equivalently replaced by the “no free lunch
with bounded risk” condition introduced in [32]: in equation (7) above, we additionally impose that $(\|m_\alpha - h_\alpha\|_\infty)_{\alpha \in I}$ remains bounded. In this case, we have that there is a constant $M > 0$ such that $m_\alpha \ge -M$, $\mathbb{P}$-a.s. for each $\alpha \in I$, which explains the wording “bounded risk”.

However, in the context of general semimartingale models $S = (S_t)_{0 \le t \le T}$, a counter-example was given by Delbaen and the author in ([Ex. 7.8] [7]) showing that the “no free lunch with bounded risk” condition does not imply the existence of an equivalent martingale measure. Hence, in a general setting and by only using simple integrals, there is no possibility of getting any more precise information on the free lunch condition than the one provided by Kreps’ theorem.

At this stage it became clear that, in order to obtain sharper results, one has to go beyond the framework of simple integrals (3) and rather use general stochastic integrals (1). After all, the simple integrals are only a technical gimmick, analogous to step functions in measure theory. In virtually all the applications, for example, the replication strategy of an option in the Black–Scholes model, one uses general integrals of the form (1).

General integrands pose a number of questions to be settled. First of all, the integral (1) has to be mathematically well defined. The theory of stochastic calculus starting with K. Itô, and developed in particular by the Strasbourg school of probability around Meyer, provides very precise information on this issue: there is a good integration theory for a given stochastic process $S = (S_t)_{0 \le t \le T}$ if and only if $S$ is a semimartingale (theorem of Bichteler–Dellacherie).

Hence, mathematical arguments lead to the model assumption that $S$ has to be a semimartingale. However, what about an economic justification of this assumption? Fortunately, the economic reasoning hints in the same direction. It was shown by Delbaen and the author that, for a locally bounded stochastic process $S = (S_t)_{0 \le t \le T}$, a very weak form of Kreps’ “no free lunch” condition involving simple integrands (3), implies already that $S$ is a semimartingale (see [Theorem 7.2] [7], for a precise statement).

Hence, it is natural to assume that the model $S = (S_t)_{0 \le t \le T}$ of stock prices is a semimartingale so that the stochastic integral (1) makes sense mathematically, for all $S$-integrable, predictable processes $H = (H_t)_{0 \le t \le T}$. As pointed out, [14, 15] impose, in addition, an admissibility condition to rule out doubling strategies and similar schemes.

Definition 2 ([Def. 2.7] [7]): An $S$-integrable predictable process $H = (H_t)_{0 \le t \le T}$ is called admissible if there is a constant $M > 0$ such that

$$\int_0^t H_u \, dS_u \ge -M, \quad \text{a.s., for } 0 \le t \le T \tag{8}$$

The economic interpretation is that the economic agent, trading according to the strategy, has to respect a finite credit line $M$.

Let us now sketch the approach of [7]. Define

$$K = \left\{ \int_0^T H_t \, dS_t : H \text{ admissible} \right\} \tag{9}$$

which is a set of (equivalence classes of) random variables. Note that by equation (8) each element $f \in K$ is uniformly bounded from below, that is, $f \ge -M$ for some $M \ge 0$. On the other hand, there is no reason why the positive part $f_+$ should obey any boundedness or integrability assumption.

As a next step, we “allow agents to throw away money” similarly as in Kreps’ work [24]. Define

$$C = \left\{ g \in L^\infty(\Omega, \mathcal{F}, \mathbb{P}) : g \le f \text{ for some } f \in K \right\} = \left( K - L^0_+(\Omega, \mathcal{F}, \mathbb{P}) \right) \cap L^\infty(\Omega, \mathcal{F}, \mathbb{P}) \tag{10}$$

where $L^0_+(\Omega, \mathcal{F}, \mathbb{P})$ denotes the set of nonnegative measurable functions.

By construction, $C$ consists of bounded random variables, so that we can use the functional analytic duality theory between $L^\infty$ and $L^1$. The subsequent definition differs from Kreps’ approach in that it pertains to the norm topology $\|\cdot\|_\infty$ rather than to the Mackey topology on $L^\infty(\Omega, \mathcal{F}, \mathbb{P})$.

Definition 3 ([2.8] [11]): A locally bounded semimartingale $S = (S_t)_{0 \le t \le T}$ satisfies the no free lunch with vanishing risk condition if

$$\bar{C} \cap L^\infty_+(\Omega, \mathcal{F}, \mathbb{P}) = \{0\} \tag{11}$$

where $\bar{C}$ denotes the $\|\cdot\|_\infty$-closure of $C$.

Here is the translation of equation (11) into prose: the process $S$ fails the above condition if there is a function $g \in L^\infty_+(\Omega, \mathcal{F}, \mathbb{P})$ with $\mathbb{P}[g > 0] > 0$ and a sequence $(f_n)_{n=1}^\infty$ of the form

$$f_n = \int_0^T H_t^n \, dS_t \tag{12}$$
where $H^n$ are admissible integrands, such that

$$f_n \ge g - \frac{1}{n} \quad \text{a.s.} \tag{13}$$

Hence the condition of no free lunch with vanishing risk is intermediate between the (stronger) no free lunch condition of Kreps and the (weaker) no-arbitrage condition. Failure of the latter would require that there is a nonnegative function $g$ with $\mathbb{P}[g > 0] > 0$, which is of the form

$$g = \int_0^T H_t \, dS_t \tag{14}$$

for an admissible integrand $H$. Condition (13) does not quite guarantee this, but something very close, at least from an economic point of view: we can uniformly approximate from below such a $g$ by the outcomes $f_n$ of admissible trading strategies.

The main result of Delbaen and the author [7] reads as follows.

Theorem 3 ([Corr. 1.2] [7]): Let $S = (S_t)_{0 \le t \le T}$ be a locally bounded real-valued semimartingale.
There is a probability measure $Q$ on $(\Omega, \mathcal{F})$, which is equivalent to $\mathbb{P}$ and under which $S$ is a local martingale if and only if $S$ satisfies the condition of no free lunch with vanishing risk.

This is a mathematically precise theorem, which, in my opinion, is quite close to the vague “meta-theorem” at the beginning of this article. The difference to the intuitive “no arbitrage” idea is that the agent has to be willing to sacrifice (at most) the quantity $\frac{1}{n}$ in equation (13), where we may interpret $\frac{1}{n}$ as, say, 1 cent.

The proof of the above theorem is rather long and technical and a more detailed discussion goes beyond the scope of this article. To the best of the author’s knowledge, no essential simplification of this proof has been achieved so far ([19]).

Mathematically speaking, the statement of the theorem looks very suspicious at first glance: after all, the no free lunch with vanishing risk condition pertains to the norm topology of $L^\infty(\Omega, \mathcal{F}, \mathbb{P})$. Hence it seems that, when applying the Hahn–Banach theorem, one can only obtain a linear functional in $L^\infty(\Omega, \mathcal{F}, \mathbb{P})^*$, which is not necessarily of the form $\frac{dQ}{d\mathbb{P}} \in L^1(\Omega, \mathcal{F}, \mathbb{P})$, as we have seen in Ross’ work [29].

The reason why the above theorem, nevertheless, is true is a little miracle: it turns out ([Th. 4.2] [7]) that, under the assumption of no free lunch with vanishing risk, the set $C$ defined in equation (10) is automatically weak-star closed in $L^\infty(\Omega, \mathcal{F}, \mathbb{P})$. This pleasant fact is not only a crucial step in the proof of the above theorem; maybe even more importantly, it also found other applications. For example, to find general existence results in the theory of utility optimization (see Expected Utility Maximization: Duality Methods) it is of crucial importance to have a closedness property of the set over which one optimizes: for these applications, the above result is very useful [23].

Without going into the details of the proof, the importance of certain elements in the set $K$ is pointed out. The admissibility rules out the use of doubling strategies. The opposite of such a strategy can be called a suicide strategy. It is the mathematical equivalent of making a bet at the roulette, leaving it as well as all gains on the table as long as one keeps winning, and waiting until one loses for the first time. Such strategies, although admissible, do not reflect economic efficiency. More precisely, we define the following.

Definition 4 An admissible outcome $\int_0^T H_t \, dS_t$ is called maximal if there is no other admissible strategy $H'$ such that $\int_0^T H'_t \, dS_t \ge \int_0^T H_t \, dS_t$ with $\mathbb{P}\left[\int_0^T H'_t \, dS_t > \int_0^T H_t \, dS_t\right] > 0$.

In the proof of Theorem 3, these elements play a crucial role and the heart of the proof consists in showing that every element in $K$ is dominated by a maximal element. However, besides their mathematical relevance, they also have a clear economic interpretation. There is no use in implementing a strategy that is not maximal as one can do better. Nonmaximal elements can also be seen as bubbles [18].

In Theorem 3, we only assert that $S$ is a local martingale under $Q$. In fact, this technical concept cannot be avoided in this setting. Indeed, fix an $S$-integrable, predictable, admissible process $H = (H_t)_{0 \le t \le T}$ as well as a bounded, predictable, strictly positive process $(k_t)_{0 \le t \le T}$. The subsequent identity holds true trivially.

$$\int_0^t H_u \, dS_u = \int_0^t \frac{H_u}{k_u} \, d\tilde{S}_u, \quad 0 \le t \le T \tag{15}$$
where

$$\tilde{S}_u = \int_0^u k_v \, dS_v, \quad 0 \le u \le T \tag{16}$$

The message of equations (15) and (16) is that the classes of processes obtained by taking admissible stochastic integrals on $S$ or $\tilde{S}$ simply coincide. An easy interpretation of this rather trivial fact is that the possible investment opportunities do not depend on whether stock prices are denoted in euros or in cents (this corresponds to taking $k_t \equiv 100$ above).

However, it may very well happen that $\tilde{S}$ is a martingale while $S$ only is a local martingale. In fact, the concept of local martingales may even be characterized in these terms ([Proposition 2.5] [10]): a semimartingale $S$ is a local martingale if and only if there is a strictly positive, decreasing, predictable process $k$ such that $\tilde{S}$ defined in equation (16) is a martingale.

Again we want to emphasize the role of the maximal elements. It turns out ([8, 11]) that $\int_0^T H_t \, dS_t$ is maximal if and only if there is an equivalent local martingale measure $Q$ such that the process $\int_0^t H_u \, dS_u$ is a martingale and not just a local martingale under $Q$. One can show ([9, 11]) that for a given sequence of maximal elements $\int_0^T H_t^n \, dS_t$, one can find one and the same equivalent local martingale measure $Q$ such that all the processes $\int_0^t H_u^n \, dS_u$ are $Q$-martingales. Another useful and related characterization ([8, 11]) is that if a process $V_t = x + \int_0^t H_u \, dS_u$ defines a maximal element $\int_0^T H_u \, dS_u$ and remains strictly positive, the whole financial market can be rewritten in terms of $V$ as a new numéraire without losing the no-arbitrage properties. The change of numéraire and the use of the maximal elements allow us to introduce a numéraire-invariant concept of admissibility, see [9] for details. An important result in this article is that the sum of maximal elements is again a maximal element.

Theorem 3 above still contains one severe limitation of generality, namely, the local boundedness assumption on $S$. As long as we only deal with continuous processes $S$, this requirement is, of course, satisfied. However, if one also considers processes with jumps, in most applications it is natural to drop the local boundedness assumption.

The case of general semimartingales $S$ (without any boundedness assumption) was analyzed in [10]. Things become a little trickier as the concept of local martingales has to be weakened even further: we refer to Equivalent Martingale Measures for a discussion of the concept of sigma-martingales. This concept allows us to formulate a result pertaining to a perfectly general setting.

Theorem 4 ([Corr. 1.2] [7]): Let $S = (S_t)_{0 \le t \le T}$ be an $\mathbb{R}^d$-valued semimartingale.
There is a probability measure $Q$ on $(\Omega, \mathcal{F})$, which is equivalent to $\mathbb{P}$ and under which $S$ is a sigma-martingale if and only if $S$ satisfies the condition of no free lunch with vanishing risk with respect to admissible strategies.

One may still ask whether it is possible to formulate a version of the fundamental theorem, which does not rely on the concepts of local or sigma-martingales, but rather on “true” martingales.

This was achieved by Yan [38] by applying a clever change of numéraire technique (see Change of Numeraire; see also [Section 5] [13]): let us suppose that $(S_t)_{0 \le t \le T}$ is a positive semimartingale, which is natural if we model, for example, prices of shares (while the previous setting of not necessarily positive price processes also allows for the modeling of forwards, futures etc.).

Let us weaken the admissibility condition (8) above, by calling a predictable, $S$-integrable process allowable if

$$\int_0^t H_u \, dS_u \ge -M(1 + S_t) \quad \text{a.s., for } 0 \le t \le T \tag{17}$$

The economic idea underlying this notion is well known and allows for the following interpretation: an agent holding $M$ units of stock and bond may, in addition, trade in $S$ according to the trading strategy $H$ satisfying equation (17); the agent will then remain liquid during $[0, T]$.

By taking $S + 1$ as new numéraire and replacing admissible by allowable trading strategies, Yan obtains the following theorem.

Theorem 5 ([Theorem 3.2] [38]) Suppose that $S$ is a positive semimartingale.
There is a probability measure $Q$ on $(\Omega, \mathcal{F})$, which is equivalent to $\mathbb{P}$ and under which $S$ is a martingale if and only if $S$ satisfies the condition of no free lunch with vanishing risk with respect to allowable trading strategies.
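The credit-line conditions (8) and (17) exist to exclude doubling strategies. The following sketch shows, in a deliberately simplified discrete setting, why such a strategy cannot respect any fixed credit line once the number of rounds is unrestricted. The setting is an assumption made for illustration only: a game with outcomes +1 or −1 per unit staked, an initial stake of 1, and stopping at the first win; the function names are hypothetical.

```python
def doubling_path(outcomes):
    """Cumulative gains of the doubling strategy along one path.

    `outcomes` is a sequence of +1 (win) or -1 (loss) per unit staked.
    The stake starts at 1, doubles after each loss, and play stops at the first win.
    """
    wealth, stake, path = 0, 1, []
    for x in outcomes:
        wealth += stake * x
        path.append(wealth)
        if x == 1:
            break           # first win: stop playing
        stake *= 2          # double the stake after a loss
    return path


def worst_interim_loss(n):
    """Lowest cumulative gain reachable within n rounds (the path of n straight losses)."""
    return min(doubling_path([-1] * n))


# Any path that eventually wins ends with a gain of exactly 1 ...
print(doubling_path([-1, -1, -1, 1]))
# ... but the interim loss that must be tolerated grows like 2**n - 1.
print([worst_interim_loss(n) for n in (1, 5, 10, 20)])
```

For a fixed number of rounds the loss is bounded, so the example only illustrates the mechanism: in continuous time infinitely many rounds can be fitted into $[0, T]$, and then no constant $M$ satisfies the lower bound in equation (8).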
References

[1] Arrow, K. (1964). The role of securities in the optimal allocation of risk-bearing, Review of Economic Studies 31, 91–96.
[2] Bachelier, L. (1900). Théorie de la Spéculation, Annales Scientifiques de l’École Normale Supérieure 17, 21–86. English translation in: Cootner, P. (ed) (1964), The Random Character of Stock Market Prices, MIT Press.
[3] Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, Journal of Political Economy 81, 637–654.
[4] Cox, J. & Ross, S. (1976). The valuation of options for alternative stochastic processes, Journal of Financial Economics 3, 145–166.
[5] Dalang, R.C., Morton, A. & Willinger, W. (1990). Equivalent Martingale measures and no-arbitrage in stochastic securities market model, Stochastics and Stochastic Reports 29, 185–201.
[6] Delbaen, F. (1992). Representing martingale measures when asset prices are continuous and bounded, Mathematical Finance 2, 107–130.
[7] Delbaen, F. & Schachermayer, W. (1994). A general version of the fundamental theorem of asset pricing, Mathematische Annalen 300, 463–520.
[8] Delbaen, F. & Schachermayer, W. (1995). The no-arbitrage condition under a change of numéraire, Stochastics and Stochastic Reports 53, 213–226.
[9] Delbaen, F. & Schachermayer, W. (1997). The Banach space of workable contingent claims in arbitrage theory, Annales de IHP (B) Probability and Statistics 33, 113–144.
[17] Jacod, J. & Shiryaev, A.N. (1998). Local martingales and the fundamental asset pricing theorems in the discrete-time case, Finance and Stochastics 2(3), 259–273.
[18] Jarrow, R., Protter, P. & Shimbo, K. (2007). Asset price bubbles in complete markets, in Advances in Mathematical Finance, Appl. Numer. Harmon. Anal., Birkhäuser Boston, Boston, MA, pp. 97–121.
[19] Kabanov, Y.M. (1997). On the FTAP of Kreps-Delbaen-Schachermayer (English), in Statistics and Control of Stochastic Processes, Y.M. Kabanov ed., World Scientific, Singapore, pp. 191–203. The Liptser Festschrift. Papers from the Steklov seminar held in Moscow, Russia, 1995–1996.
[20] Kabanov, Y.M. & Kramkov, D. (1994). No-arbitrage and equivalent martingale measures: an elementary proof of the Harrison–Pliska theorem, Theory of Probability and its Applications 39(3), 523–527.
[21] Kabanov, Y.M. & Stricker, Ch. (2001). A teachers’ note on no-arbitrage criteria, Séminaire de Probabilités XXXV, Springer Lecture Notes in Mathematics 1755, 149–152.
[22] Kemeny, J.G. (1955). Fair bets and inductive probabilities, Journal of Symbolic Logic 20(3), 263–273.
[23] Kramkov, D. & Schachermayer, W. (1999). The asymptotic elasticity of utility functions and optimal investment in incomplete markets, Annals of Applied Probability 9(3), 904–950.
[24] Kreps, D.M. (1981). Arbitrage and equilibrium in economics with infinitely many commodities, Journal of Mathematical Economics 8, 15–35.
[25] Merton, R.C. (1973). The theory of rational option pricing, Bell Journal of Economics and Management Science 4, 141–183.
[26] Rogers, L.C.G. (1994). Equivalent martingale measures
[10] Delbaen, F. & Schachermayer, W. (1998). The funda-
and no-arbitrage, Stochastics and Stochastic Reports
mental theorem of asset pricing for unbounded stochastic
51(1–2), 41–49.
processes, Mathematische Annalen 312, 215–250. [27] Ross, S. (1976). The arbitrage theory of capital asset
[11] Delbaen, F. & Schachermayer, W. (2006). The Mathe- pricing, Journal of Economic Theory 13, 341–360.
matics of Arbitrage, Springer Finance, Springer, p. 371. [28] Ross, S. (1977). Return, risk and arbitrage, Risk and
[12] Duffie, D. & Huang, C.F. (1986). Multiperiod security Return in Finance 1, 189–218.
markets with differential information; martingales and [29] Ross, S. (1978). A simple approach to the valuation of
resolution times, Journal of Mathematical Economics 15, risky streams., Journal of Business 51, 453–475.
283–303. [30] Samuelson, P.A. (1965). Proof that properly antici-
[13] Guasoni, P., Rásonyi, M. & Schachermayer, W. (2009). pated prices fluctuate randomly, Industrial Management
The fundamental theorem of asset pricing for continu- Review 6, 41–50.
ous processes under small transaction costs, Annals of [31] Schachermayer, W. (1992). A Hilbert space proof of the
Finance, forthcoming. fundamental theorem of asset pricing in finite discrete
[14] Harrison, J.M. & Kreps, D.M. (1979). Martingales and time, Insurance: Mathematics and Economics 11(4),
arbitrage in multiperiod securities markets, Journal of 249–257.
Economic Theory 20, 381–408. [32] Schachermayer, W. (1994). Martingale Measures for
[15] Harrison, J.M. & Pliska, S.R. (1981). Martingales and Discrete time Processes with Infinite Horizon, Mathe-
stochastic integrals in the theory of continuous trad- matical Finance 4, 25–56.
ing, Stochastic Processes and their Applications 11, [33] Schachermayer, W. (2005). A note on arbitrage and
215–260. closed convex cones, Mathematical Finance (1), forth-
[16] Harrison, J.M. & Pliska, S.R. (1983). A stochastic coming.
calculus model of continuous trading: complete mar- [34] Schachermayer, W. & Teichmann, J. (2005). How close
kets, Stochastic Processes and their Applications 11, are the option pricing formulas of Bachelier and Black-
313–316. Merton-Scholes? Mathematical Finance 18(1), 55–76.
10 Fundamental Theorem of Asset Pricing
[35] Shimony, A. (1955). Coherence and the axioms of [39] Yor, M. (1978). Sous-espaces denses dans L1 ou H 1 et
confirmation, The Journal of Symbolic Logic 20, 1–28. représentation des martingales, in Séminaire de Prob-
[36] Stricker, Ch. (1990). Arbitrage et Lois de Martingale, abilités XII, Springer Lecture Notes in Mathematics,
Annales de l’Institut Henri Poincaré—Probabilites et Springer, Vol. 649, pp. 265–309.
Statistiques 26, 451–460.
[37] Yan, J.A. (1980). Caractérisation d’ une classe Related Articles
d’ensembles convexes de L1 ou H 1 , in Séminaire de
Probabilités XIV, J. Azema, M. Yor, eds, Springer Arbitrage Strategy; Arrow, Kenneth; Change
Lecture Notes in Mathematics 784, Springer, pp. of Numeraire; Equivalent Martingale Measures;
220–222. Martingales; Martingale Representation Theorem;
[38] Yan, J.A. (1998). A new look at the fundamental theorem Risk-neutral Pricing; Stochastic Integrals.
of asset pricing, Journal of Korean Mathematics Society
35, 659–673. WALTER SCHACHERMAYER
Risk-neutral Pricing

A classical problem arising frequently in business is the valuation of future cash flows that are risky. By the term risky we mean that the payment is not of a deterministic nature; rather there is some uncertainty in the amount of the future cash flows. Of course, in real life, virtually everything happening in the future contains some element of uncertainty.

As an example, let us think of an investment project, say, a company plans to build a new factory. A classical way to proceed is to calculate a net asset value. One tries to estimate the future cash flows generated by the project in the subsequent periods. In the present example, they will initially be negative; this initial investment should be compensated by the positive cash flows in the later periods. Having fixed these estimates of the future cash flows for all periods, one calculates a net asset value by discounting these cash flows to the present date. But, of course, there is uncertainty involved in the estimation of the future cash flows and people doing these calculations are, of course, aware of that. The usual way to compensate for this uncertainty is to apply an interest rate that is higher than the riskless^a rate of return corresponding to the rate of return of government bonds.

The spread between the riskless rate of return and the interest rate used for discounting the future cash flows in the calculation of the net asset value can be quite substantial in order to compensate for the riskiness. Only if the net asset value, obtained by discounting with a rather high rate of return, remains positive will the management of the company engage in the investment project.

Mathematically speaking, the above procedure may be described as follows: first, one determines the expected values of the future cash flows and, subsequently, one discounts by using an elevated discount factor. However, there is no systematic way of mathematically approaching the question of how the degree of uncertainty in the determination of the expected values can be quantified, and in which way this should be taken into account to determine the spread between the interest rates.

We now turn to a different approach, which interchanges the roles of taking expectations and discounting in taking the riskiness of the cash flows into account. This approach is used in modern mathematical finance, in particular, in the Black–Scholes formula. However, the idea goes back much further and the method was used by actuaries for centuries.

Think of a life insurance contract. To focus on the essential point, we consider the simplest case: a one-year death insurance. If the insured person dies within the subsequent year, the insured sum S, say S = €1, is paid out at the end of this year; if the insured person survives the year, nothing is paid, and the contract ends at the end of the year.

To calculate the premium^b for this contract, actuaries look up in their mortality tables^c the probability that the insured person dies within one year. The traditional notation for this probability is $q_x$, where x denotes the age of the insured person.

To calculate the premium for such a one-year death insurance contract, with S normalized to S = 1, actuaries apply the formula

$$P = \frac{1}{1+i}\, q_x \qquad (1)$$

The term $q_x$ is just the expected value of the future cash flow and i denotes “the” interest rate: hence the premium P is the discounted expected value of the cash flow at the end of the year.

It is important to note that actuaries use a “conservative” value for the interest rate, for example, i = 3%. In practical terms, this corresponds quite well to the “riskless rate of return”. In any case, it is quite different, in practical as well as in theoretical terms, from the discount factors used to calculate the net asset value of a risky future cash flow according to the method stated above.

But, after all, the premium of our death insurance contract also corresponds to the present value of an uncertain future cash flow! How do actuaries account for the risk involved in this cash flow, if not via an appropriate choice of the interest rate?

The answer is simple when looking at equation (1): apart from the interest rate i the probability $q_x$ of dying within the next year also enters the calculation of P. The art of the actuarial profession is to choose the “good” value for $q_x$. Typically, actuaries know very well the actual mortality probabilities in their portfolio of contracts, which often consists of several hundred thousand contracts; in other words, they have a very good understanding of what the “true value” of $q_x$ is. However, they do not apply this “true value” in their premium calculations: in equation (1) they
would apply a value for $q_x$ which is substantially higher than the “true” value of $q_x$. Actuaries speak about mortality tables of the first kind and the second kind.

Mortality tables of the second kind reflect the “true probabilities”. They are only used for the internal analysis of the profitability of the insurance company. On the other hand, in the daily life of actuaries only the mortality tables of the first kind, which properly display the “modified” probabilities $q_x$, are used. They are not only used for the calculation of premia but also for all quantities of relevance involved in an insurance policy, such as surrender values, reserves, and so on. This constitutes a big strength of the actuarial technique: actuaries are always armed with perfectly coherent logic when doing all these calculations. This logic is that of a fair game or, mathematically speaking, of a martingale. Indeed, if the $q_x$ correctly modeled the mortality of the insured person and if i were the interest rate that the insurance company could precisely achieve when investing the premia, then the premium calculation (1) would make the insurance contract a fair game.

It is important to note that this argument pertains only to a kind of virtual world, as it is precisely the task of actuaries to choose the mortalities $q_x$ in a prudent way such that they do not coincide with the “true” probabilities. In the case of insurance contracts where the insurance company has to pay in the case of death, actuaries choose the probabilities $q_x$ higher than the “true ones”. This happens in the simple example considered above. On the other hand, if the insurance company has to pay when the insured person is still alive, for example, in the case of a pension, actuaries use probabilities $q_x$ which are lower than the “true ones”, in order to be on the safe side.

These actuarial techniques have been elaborated on as this will be helpful to more clearly understand the essence of the option pricing approach of Black, Scholes, and Merton. Their well-known model for the risky stock S and the risk-free bond B is

$$dS_t = S_t\,\mu\,dt + S_t\,\sigma\,dW_t, \qquad dB_t = B_t\,r\,dt \qquad (2)$$

The task is to value a (European) derivative on the stock S at expiration time T, for example, $C_T = (S_T - K)^+$. As explained earlier (see Complete Markets), the solution proposed by Black, Scholes, and Merton is

$$C_0 = e^{-rT}\,\mathbb{E}_Q[C_T] \qquad (3)$$

The above equation is a perfect analog to the premium of a death insurance contract (1). The first term, taking care of the discounting, uses the “conservative” choice of a riskless interest rate r. The second term gives the expected value of the future cash flow, taken under the risk-neutral probability measure Q. This probability measure Q is chosen in such a way that the dynamics (2) of the stock under Q become

$$dS_t = S_t\,r\,dt + S_t\,\sigma\,dW_t \qquad (4)$$

The point is that the drift term $S_t\,r\,dt$ of S under Q is in line with the growth rate of the risk-free bond

$$dB_t = B_t\,r\,dt \qquad (5)$$

The interpretation of (4) is that if the market were correctly modeled by the probability Q, then the market would be risk neutral. The mathematical formulation is that $(e^{-rt} S_t)_{0 \le t \le T}$, that is, the stock price process discounted by the risk-free interest rate r, is a martingale under Q.

Similarly as in the actuarial context above, the mathematical model of a financial market under the risk-neutral measure Q pertains to a virtual world, and not to the real world. In reality, that is, under $\mathbb{P}$, we would typically have $\mu > r$. Fixing this case, Girsanov’s formula (see Equivalence of Probability Measures; Stochastic Exponential) tells us precisely that the probability measure Q represents a “prudent choice of probability”. It gives less weight than the original measure $\mathbb{P}$ to the events which are favorable for the buyer of a stock, that is, when $S_T$ is large. On the other hand, Q gives more weight than $\mathbb{P}$ to unfavorable events, that is, when $S_T$ is small. This can be seen from Girsanov’s formula

$$\frac{dQ}{d\mathbb{P}} = \exp\left(-\frac{\mu - r}{\sigma}\,W_T - \frac{(\mu - r)^2}{2\sigma^2}\,T\right) \qquad (6)$$

and the dynamics of the stock price process S under $\mathbb{P}$ resulting from (2)

$$S_T = S_0 \exp\left(\sigma W_T + \left(\mu - \frac{\sigma^2}{2}\right) T\right) \qquad (7)$$
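As a numerical illustration of equations (3), (6), and (7), the following Python sketch evaluates the call price in two ways: by the closed-form Black–Scholes expression for the Q-expectation in (3), and by integrating the payoff against the real-world law of $W_T$ after weighting it with the density (6). The parameter values, the function names, and the use of a trapezoidal rule are illustrative assumptions and are not taken from the text; the sketch merely checks that the two numbers agree and that the weight (6) decreases in $W_T$ when $\mu > r$.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(s0, k, r, sigma, t):
    """Closed-form value of e^{-rT} E_Q[(S_T - K)^+], equation (3)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)

def radon_nikodym(w, mu, r, sigma, t):
    """dQ/dP as a function of W_T = w, equation (6)."""
    theta = (mu - r) / sigma
    return math.exp(-theta * w - 0.5 * theta ** 2 * t)

def stock_price(w, s0, mu, sigma, t):
    """S_T as a function of W_T = w under P, equation (7)."""
    return s0 * math.exp(sigma * w + (mu - 0.5 * sigma ** 2) * t)

def price_under_p(s0, k, mu, r, sigma, t, n=20001, width=12.0):
    """e^{-rT} E_P[(dQ/dP) (S_T - K)^+], trapezoidal rule over W_T ~ N(0, T)."""
    sd = math.sqrt(t)
    lo, hi = -width * sd, width * sd
    h = (hi - lo) / (n - 1)
    total = 0.0
    for j in range(n):
        w = lo + j * h
        pdf = math.exp(-w * w / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)
        payoff = max(stock_price(w, s0, mu, sigma, t) - k, 0.0)
        weight = 0.5 if j in (0, n - 1) else 1.0  # trapezoid end points
        total += weight * radon_nikodym(w, mu, r, sigma, t) * payoff * pdf
    return math.exp(-r * t) * total * h

if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    print(black_scholes_call(100.0, 100.0, 0.03, 0.2, 1.0))
    print(price_under_p(100.0, 100.0, 0.08, 0.03, 0.2, 1.0))
```

Changing the real-world drift `mu` leaves the second number unchanged up to discretization error, which reflects the fact that the price (3) depends on r and σ but not on µ.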
Fixing a random element $\omega \in \Omega$, the Radon–Nikodym derivative $\frac{dQ}{d\mathbb{P}}(\omega)$ is small iff $W_T(\omega)$ is large, and the latter is large iff $S_T(\omega)$ is large.

In many applications, it is not even necessary to consider the original “true” probability measure $\mathbb{P}$. There are hundreds of papers containing the sentence: “we work under the risk-neutral measure Q”. This is parallel to the situation of an actuary in his/her daily work: he/she does not bother about the “true” mortality probabilities, but only about the probabilities listed in the mortality table of the first kind.

The history of the valuation formula (3), in fact, goes back much further than Black, Scholes, and Merton. Already in 1900, L. Bachelier applied this formula in his thesis [1] in order to price options. It seems worthwhile to have a closer look. Bachelier did not use a discount factor, such as $e^{-rT}$, in equation (3). The reason is that in 1900 prices underlying the option were denoted in forward prices at the Paris stock exchange (called “true prices” by Bachelier who also carefully adjusted for coupon payments; see [6] for details). As it is well known, when considering forward prices the discount factor disappears. In modern terminology, this fact boils down to “Black’s formula”.

As regards the second term in equation (3), Bachelier started from the very beginning with a martingale model, namely, (scaled) Brownian motion [6]

$$S_t = S_0 + \sigma W_t, \qquad 0 \le t \le T \qquad (8)$$

In other words, he also “worked assuming the risk-neutral probability”.

In fact, in the first pages of his thesis Bachelier does speak about two kinds of probabilities. The following is a quote from [1]:

(i) The probability which might be called “mathematical”, which can be determined a priori and which is studied in games of chance.
(ii) The probability dependent on future events and, consequently impossible to predict in a mathematical manner.
This latter is the probability that the speculator tries to predict.

Admitting a large portion of goodwill and hindsight knowledge one might interpret (i) something like the risk-neutral probability Q, while (ii) describes something like the historical measure $\mathbb{P}$.

Risk-neutral Pricing for General Models

In the Black–Scholes model (2) there is only one risk-neutral measure Q under which the discounted stock price process becomes a martingale.^d This feature characterizes complete financial markets (see Complete Markets). In this case, we not only obtain from equation (3) a price $C_0$ for the derivative security $C_T$, but we get much more: the derivative can be perfectly replicated by starting at time t = 0 with the initial investment given by equation (3) and subsequent dynamical trading in the underlying stock S. This is the essence of the approach of Black, Scholes, and Merton; it has no parallel in the classical actuarial approach or in the work of L. Bachelier.

What happens in incomplete financial markets, that is, when there is more than one risk-neutral measure Q? It has been shown by Harrison and Pliska [4] that equation (3) yields precisely all the consistent pricing rules for derivatives on S, when Q runs through the set of risk-neutral measures equivalent to $\mathbb{P}$. We denote the latter set by $\mathcal{M}^e(S)$. The term consistent means that there should be no arbitrage possibilities when all possible derivatives on S are traded at the price given by equation (3).

But, what is the good choice of $Q \in \mathcal{M}^e(S)$? In general, this question is as meaningless as the question: what is the good choice of an element in some convex subset of a vector space? In order to allow for a more intelligent version of this question, one needs additional information. It is here that the original probability measure $\mathbb{P}$ comes into play again: a popular approach is to choose the element $Q \in \mathcal{M}^e(S)$ which is “closest” to $\mathbb{P}$.

In order to make this idea precise, fix a strictly convex function V(y), for example,

$$V(y) = y\,(\ln(y) - 1), \qquad y > 0 \qquad (9)$$

$$\text{or} \quad V(y) = \frac{y^2}{2}, \qquad y \in \mathbb{R} \qquad (10)$$

Determine $\hat{Q} \in \mathcal{M}^e(S)$ as the optimizer of the optimization problem

$$\mathbb{E}_{\mathbb{P}}\left[V\left(\frac{dQ}{d\mathbb{P}}\right)\right] \to \min!, \qquad Q \in \mathcal{M}^e(S) \qquad (11)$$

To illustrate things at the hand of the above examples: For $V(y) = y\,(\ln(y) - 1)$, this corresponds to
egy as follows: we consider for each bounded claim B the associated Q-martingale V given by

$$V_t = E_Q[\,B \mid \mathcal{F}_t\,], \qquad t \le T \qquad (2)$$

By the PRP, there exists an admissible strategy ϑ

$$K = \left\{ c + \int_0^T \vartheta_t\,dS_t \;:\; c \in \mathbb{R},\ \vartheta \in L^2(S) \right\} \subset L^2(Q) \qquad (7)$$

For ϑ as above we also denote

$$\int_0^T \vartheta_t\,dS_t \longleftrightarrow \vartheta \qquad (10)$$

since we have

$$E_Q\left[\left(\int_0^T \vartheta_t\,dS_t\right)^2\right] = E_Q\left[\int_0^T \vartheta_t^2\,d[S]_t\right] \qquad (11)$$

Hence, $K_0$ is isometrically isomorphic to an $L^2$-space and therefore closed. Therefore, we can apply the theorem about the orthogonal projection in the Hilbert spaces to get a decomposition

$$B = c^B + \int_0^T \vartheta_t^B\,dS_t + L_T \qquad (12)$$

where $L_T$ is orthogonal to each element of K; in particular, $E_Q[L_T] = 0$ since $1 \in K$. It follows that we have $c^B = E_Q[B]$, and $\vartheta^B$ is called the FS optimal hedging strategy. As processes, $L_t := E[L_T \mid \mathcal{F}_t]$ and S are strongly orthogonal in the sense that LS is a Q-martingale or equivalently, $\langle L, S \rangle = 0$, where the predictable covariation $\langle \cdot, \cdot \rangle$ here refers to the measure Q. This implies

$$\vartheta^B = \frac{d\langle V, S \rangle}{d\langle S, S \rangle} \qquad (13)$$

Utility-indifference Hedging

Let u be some utility function defined on the whole real line. If there exists a number π satisfying

$$\sup_{\vartheta} E\left[u\left(x + \int_0^T \vartheta_t\,dS_t\right)\right] = \sup_{\vartheta} E\left[u\left(x + \int_0^T \vartheta_t\,dS_t + \pi - B\right)\right] \qquad (17)$$

then it is called utility-indifference price of the claim B. It is the threshold where the investor is indifferent whether just to maximize expected utility from a pure investment into the stock with the price process S or to sell in addition a claim B and collect a premium π for this.

The optimal strategies ϑ on both sides of equation (17) typically differ. The difference

$$\theta := \varphi^B - \varphi^0 \qquad (18)$$

of the optimizers on the right- and the left-hand side respectively can be interpreted as a utility-based hedging strategy. It corresponds to the adjustment of the investor’s portfolio made in order to account for the option.

Let us consider exponential utility
typically equals the highest price consistent with no-arbitrage pricing, that is, it amounts to $\sup_Q E_Q[B]$, where the supremum is taken over all the equivalent martingale measures Q.

Therefore, it has been proposed by Föllmer and Leukert [3] to maximize the probability of a successful hedge given a certain amount of initial capital, a concept that they call quantile hedging. However, with this approach there is no protection for the worst case scenarios other than portfolio diversification, and technically, it might be difficult to implement this since it corresponds to hedging a knock-out option. The same authors [4], moreover, considered efficient hedges which minimize the expected shortfall weighted by some loss function. In this way, the investor may interpolate between the extremes of no hedge and a superhedge, depending on the accepted level of shortfall risk.

References

[1] Cont R., Tankov P. & Voltchkova E. (2007). Hedging with options in presence of jumps, in Stochastic Analysis and Applications: The Abel Symposium 2005 in honor of Kiyosi Ito, F.E. Benth, G. Di Nunno, T. Lindstrom, B. Øksendal & T. Zhang, eds, Springer, pp. 197–218.
[2] Di Nunno, G. (2002). Stochastic integral representation, stochastic derivatives and minimal variance hedging, Stochastics and Stochastics Reports 73, 181–198.
[3] Föllmer, H. & Leukert, P. (1999). Quantile hedging, Finance and Stochastics 3, 251–273.
[4] Föllmer, H. & Leukert, P. (2000). Efficient hedging: cost versus shortfall risk, Finance and Stochastics 4, 117–146.
[5] Föllmer, H. & Sondermann, D. (1986). Hedging of non-redundant contingent claims, in Contributions to Mathematical Economics in Honor of G. Debreu, W. Hildenbrand & A. Mas-Colell, eds, Elsevier Science Publications, North-Holland, pp. 205–223.
[6] Kallsen, J. & Rheinländer, T. (2008). Asymptotic Utility-based Pricing and Hedging for Exponential Utility. Preprint.

Related Articles

Complete Markets; Delta Hedging; Equivalent Martingale Measures; Mean–Variance Hedging; Option Pricing: General Principles; Second Fundamental Theorem of Asset Pricing; Stochastic Integrals; Superhedging; Uncertain Volatility Model; Utility Indifference Valuation.

THORSTEN RHEINLÄNDER
Complete Markets

According to the arbitrage pricing of derivative securities, the arbitrage price of a financial derivative is defined as the wealth of a self-financing trading strategy based on traded primary assets, which replicates the terminal payoff at maturity (or, more generally, all cash flows) from the financial derivative. Hence, an important issue arises whether any financial derivative admits a replicating strategy in a given model; if this property holds, then the market model is said to be complete. Completeness of a market model ensures that any derivative security can be priced by arbitrage and hedged by a dynamic trading in primary traded assets. For example, in the framework of the Cox, Ross, and Rubinstein [9] model, not only the call and put options but also any path-independent or path-dependent contingent claim can be replicated by a dynamic trading in stock and bond. Similarly, the classic Black and Scholes [3] model enjoys the property of completeness, although a suitable technical assumption needs to be imposed on the class of considered contingent claims.

Even for an incomplete model, the class of hedgeable derivatives, formally represented by attainable contingent claims, can be sufficiently large for practical purposes. Therefore, completeness should not be seen as a necessary requirement, as opposed to the no-arbitrage property, which is an indispensable feature of any financial model used for arbitrage pricing of derivative securities.

Finite Market Models

The issue of completeness of a finite market model was analyzed, among others, by Taqqu and Willinger [24]. The finiteness of a market means that the underlying probability space is finite, $\Omega = \{\omega_1, \omega_2, \ldots, \omega_d\}$, and trading activities may only occur at the finite set of dates, denoted as $\{0, 1, \ldots, T\}$. As a standard example of a finite market model, one may quote, for instance, the Cox, Ross, and Rubinstein [9] binomial tree model (see Binomial Tree) or any of its multinomial extensions.

Let $S^1, S^2, \ldots, S^k$ be the stochastic processes describing the spot (or cash) prices of some non-dividend paying financial assets. As customary, we postulate that the price process of at least one asset is given as a strictly positive process, so that it can be selected as a numéraire asset. Let us then assume that $S_t^k > 0$ for every $t \le T$. To emphasize the special role of the process $S^k$, we will sometimes write B instead of $S^k$. We assume that all assets are perfectly divisible and the market is frictionless, that is, there are no restrictions on the short-selling of assets and no transaction costs, taxes, and so on.

We consider a probability space $(\Omega, \mathcal{F}_T, \mathbb{P})$, which is equipped with a filtration $\mathbb{F} = (\mathcal{F}_t)_{t \le T}$. A probability measure $\mathbb{P}$, to be interpreted as the real-life probability, is an arbitrary probability measure on $(\Omega, \mathcal{F}_T)$ such that $\mathbb{P}(\omega_i) > 0$ for every $i = 1, 2, \ldots, d$. For convenience, we assume throughout that the σ-field $\mathcal{F}_0$ is trivial, that is, $\mathcal{F}_0 = \{\emptyset, \Omega\}$. All processes considered in what follows are assumed to be $\mathbb{F}$-adapted.

Trading Strategies

The component $\phi_t^i$ of a trading strategy $\phi = (\phi^1, \phi^2, \ldots, \phi^k)$ represents the number of units of the ith security held by an investor at time t. In other words, $\phi_t^i S_t^i$ is the amount of funds invested in the ith security at time t. Hence, the wealth process $V(\phi)$ of a trading strategy φ is given by the equality, for $t = 0, 1, \ldots, T$,

$$V_t(\phi) = \sum_{i=1}^{k} \phi_t^i S_t^i \qquad (1)$$

The initial wealth $V_0(\phi) = \phi_0 S_0$ is also referred to as the initial cost of φ.

A trading strategy φ is said to be self-financing whenever it satisfies the following condition, for every $t = 0, 1, \ldots, T - 1$,

$$\sum_{i=1}^{k} \phi_t^i S_{t+1}^i = \sum_{i=1}^{k} \phi_{t+1}^i S_{t+1}^i \qquad (2)$$

In the financial interpretation, this condition means that the portfolio φ is revised at any date t in such a way that there are no infusions of external funds and no funds are withdrawn from the portfolio. We denote by $\Phi$ the vector space of all self-financing trading strategies. The gains process $G(\phi)$ of any trading strategy φ equals, for $t = 0, 1, \ldots, T$,

$$G_t(\phi) = \sum_{u=0}^{t-1} \sum_{i=1}^{k} \phi_u^i \left(S_{u+1}^i - S_u^i\right) \qquad (3)$$
with $G_0(\phi) = 0$. It can be checked that a trading strategy φ is self-financing if and only if the equality $V_t(\phi) = V_0(\phi) + G_t(\phi)$ holds for every $t = 0, 1, \ldots, T$.

Replication and Arbitrage

A European contingent claim X with maturity T is an arbitrary $\mathcal{F}_T$-measurable random variable. Since the space $\Omega$ is assumed to be a finite set with d elements, any claim X has the representation $X = (X(\omega_1), X(\omega_2), \ldots, X(\omega_d)) \in \mathbb{R}^d$. Hence, the class $\mathcal{X}$ of all contingent claims that settle at T may be identified with the vector space $\mathbb{R}^d$.

A replicating strategy for the contingent claim X, which settles at time T, is a self-financing trading strategy φ such that $V_T(\phi) = X$. For any claim X, we denote by $\Phi_X$ the class of all replicating strategies for X.

The wealth process $V(\phi)$ of an arbitrary strategy φ from $\Phi_X$ is called a replicating process of X in M. Finally, we say that a claim X is attainable in M if it admits at least one replicating strategy. We denote the class of all attainable claims by $\mathcal{A}$.

Definition 1 A market model M is said to be complete if every claim $X \in \mathcal{X}$ is attainable in M or, equivalently, if for every $\mathcal{F}_T$-measurable random variable X there exists at least one trading strategy $\phi \in \Phi$ such that $V_T(\phi) = X$. In other words, a market model M is complete whenever $\mathcal{X} = \mathcal{A}$.

Let X be an arbitrary attainable claim that settles at time T. We say that X is uniquely replicated in M if it admits a unique replicating process in M, that is, if the equality $V_t(\phi) = V_t(\psi)$, $t \in [0, T]$, holds for arbitrary trading strategies φ, ψ from $\Phi_X$. Then the process $V(\phi)$ is termed the wealth process of X in M.

Arbitrage Price

A trading strategy $\phi \in \Phi$ is called an arbitrage opportunity if $V_0(\phi) = 0$ and the terminal wealth of φ satisfies

$$\mathbb{P}(V_T(\phi) \ge 0) = 1 \quad \text{and} \quad \mathbb{P}(V_T(\phi) > 0) > 0 \qquad (4)$$

where $\mathbb{P}$ is the real-world probability measure. We say that a market $M = (S, \Phi)$ is arbitrage free if there are no arbitrage opportunities in the class $\Phi$ of all self-financing trading strategies.

It can be shown that if the market model M is arbitrage free, then any attainable contingent claim X is uniquely replicated in M. The converse implication is not true, however, that is, the uniqueness of the wealth process of any attainable contingent claim does not imply the arbitrage-free property of a market, in general. Therefore, the existence and uniqueness of the wealth process associated with any attainable claim is insufficient to justify the term arbitrage price. Indeed, it is easy to give an example of a finite market in which all claims can be uniquely replicated, but there exists a strictly positive claim which can be replicated by a self-financing strategy with a negative initial cost.

Definition 2 Let the market model M be arbitrage free. Then the wealth process of an attainable claim X is called the arbitrage price of X in M and it is denoted by $\pi_t(X)$ for every $t = 0, 1, \ldots, T$.

Risk-neutral Valuation Formula

Recall that we write $S^k = B$. Let us denote by $S^*$ the process of relative prices, which equals, for every $t = 0, 1, \ldots, T$,

$$S_t^* = \left(S_t^1 B_t^{-1}, S_t^2 B_t^{-1}, \ldots, S_t^k B_t^{-1}\right) = \left(S_t^{*1}, S_t^{*2}, \ldots, S_t^{*(k-1)}, 1\right) \qquad (5)$$

where we denote $S^{*i} = S^i B^{-1}$. Recall that the probability measures $\mathbb{P}$ and $\mathbb{Q}$ on $(\Omega, \mathcal{F})$ are said to be equivalent if, for any event $A \in \mathcal{F}$, the equality $\mathbb{P}(A) = 0$ holds if and only if $\mathbb{Q}(A) = 0$. Similarly, $\mathbb{Q}$ is said to be absolutely continuous with respect to $\mathbb{P}$ if, for any event $A \in \mathcal{F}$, the equality $\mathbb{P}(A) = 0$ implies that $\mathbb{Q}(A) = 0$. Clearly, if the probability measures $\mathbb{P}$ and $\mathbb{Q}$ are equivalent, then they are also absolutely continuous with respect to each other. The following concept is crucial in the so-called risk-neutral valuation approach.

Definition 3 A probability measure $\mathbb{P}^*$ on $(\Omega, \mathcal{F}_T)$ equivalent to $\mathbb{P}$ (absolutely continuous with respect to $\mathbb{P}$, respectively) is called an equivalent martingale measure for $S^*$ (a generalized martingale measure for $S^*$, respectively) if the relative price $S^*$ is a $\mathbb{P}^*$-martingale with respect to the filtration $\mathbb{F}$.
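The notions of replicating strategy, initial cost, relative price, and equivalent martingale measure can be made concrete in the smallest possible finite market. The following Python sketch assumes a hypothetical one-period model with two states and two assets (a stock and a bond used as numéraire); all numerical values and function names are illustrative assumptions, not part of the text. It solves $V_T(\phi) = X$ for a call payoff, computes the initial cost from equation (1), and compares it with the expectation of the relative payoff under the measure that makes the relative price (5) a martingale.

```python
from fractions import Fraction as F

# Hypothetical one-period market (T = 1) with two states and k = 2 assets:
# a stock S^1 and a strictly positive bond B = S^2 used as numeraire.
S0, S_UP, S_DN = F(100), F(120), F(90)
B0, B1 = F(1), F(21, 20)          # the bond earns 5% in both states

def replicate(x_up, x_dn):
    """Holdings (phi1, phi2) with phi1*S_T + phi2*B_T = X in both states."""
    phi1 = (x_up - x_dn) / (S_UP - S_DN)
    phi2 = (x_up - phi1 * S_UP) / B1
    return phi1, phi2

def initial_cost(phi1, phi2):
    """V_0(phi) from equation (1)."""
    return phi1 * S0 + phi2 * B0

def martingale_probability():
    """Weight q of the up state such that E*[S_T / B_T] = S_0 / B_0."""
    return (S0 * B1 / B0 - S_DN) / (S_UP - S_DN)

def discounted_expectation(q, x_up, x_dn):
    """B_0 * E*[X / B_T] under the measure (q, 1 - q)."""
    return B0 * (q * x_up + (1 - q) * x_dn) / B1

q = martingale_probability()
call = (F(20), F(0))               # payoff (S_T - 100)^+ in the up and down states
phi = replicate(*call)
print(q, phi, initial_cost(*phi), discounted_expectation(q, *call))
```

Because the weight q lies strictly between 0 and 1, the resulting measure is equivalent to any real-world measure that charges both states, and every claim in this two-state model can be replicated, so the model is complete in the sense of Definition 1.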
Lemma 1 A probability measure $\mathbb{P}^*$ on $(\Omega, \mathcal{F}_T)$ is a GMM for the market model M if and only if it is a GMM for the relative price process $S^*$, that is, $\mathcal{P}(S^*) = \mathcal{P}(M)$ and $\mathcal{Q}(S^*) = \mathcal{Q}(M)$.

The next result shows that the existence of an EMM for M is a sufficient condition for the

In the case of a finite market model, this result was established by Harrison and Pliska [13]. For a probabilistic approach to the First FTAP we refer to Taqqu and Willinger [20], who examine the case of a finite market model, and to papers by Dalang et al. [10] and Schachermayer [23], who study the case of a discrete-time model with infinite state space.
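To see how equivalent martingale measures behave in a finite market model that is arbitrage free but not complete, one may look at a hypothetical one-period model with three states. In the following Python sketch the stock prices, the zero interest rate, and the function names are illustrative assumptions. The martingale condition leaves a one-parameter family of measures, and the expectation of a claim is the same under all of them only for some claims.

```python
from fractions import Fraction as F

# Hypothetical one-period model: three states, one stock, bond price 1 at both dates.
S0 = F(100)
S_T = (F(120), F(100), F(80))      # stock price in the up, middle, and down states

def emm(a):
    """Martingale measures for S: q = (a, 1 - 2a, a) with 0 < a < 1/2."""
    return (a, 1 - 2 * a, a)

def expectation(q, x):
    """E_q[X] for a claim X given state by state."""
    return sum(qi * xi for qi, xi in zip(q, x))

call = tuple(max(s - 100, F(0)) for s in S_T)   # (20, 0, 0)
stock_minus_cash = tuple(s - 90 for s in S_T)   # (30, 10, -10)

for a in (F(1, 10), F(1, 4), F(2, 5)):
    q = emm(a)
    print(a, expectation(q, S_T), expectation(q, call), expectation(q, stock_minus_cash))
```

The claim `stock_minus_cash` receives the same value under every measure in the family and is replicated by holding one share and borrowing 90, whereas the value of the call varies with the parameter `a`, which signals that the call is not attainable in this model.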
martingale measures. In addition, if the risk-neutral valuation formula yields the same result for any choice of an EMM for the market model at hand, then a given claim is necessarily attainable.

Multidimensional Black and Scholes Model

A multidimensional Black and Scholes model is a natural extension to a multiasset setup of the classic Black and Scholes [3] options pricing model. Let k denote the number of primary risky assets. For any $i = 1, \ldots, k$, the price process $S^i$ of the ith risky asset, referred to as the ith stock, is modeled as an Itô process (the dot · stands for the inner product in $\mathbb{R}^d$)

$$dS_t^i = S_t^i \left(\mu_t^i\,dt + \sigma_t^i \cdot dW_t\right) \qquad (9)$$

with $S_0^i > 0$ or, more explicitly,

$$dS_t^i = S_t^i \left(\mu_t^i\,dt + \sum_{j=1}^{d} \sigma_t^{ij}\,dW_t^j\right) \qquad (10)$$

To ensure the absence of arbitrage opportunities, we postulate the existence of a d-dimensional, progressively measurable process γ such that the equality

$$r_t - \mu_t^i = \sum_{j=1}^{d} \sigma_t^{ij} \gamma_t^j = \sigma_t^i \cdot \gamma_t \qquad (13)$$

is satisfied simultaneously for every $i = 1, \ldots, k$ (for Lebesgue a.e. $t \in [0, T]$, with probability one). Note that the market price for risk γ is not uniquely determined, in general. Indeed, the uniqueness of a solution γ to this equation holds only if $d \le k$ and the volatility matrix σ has the full rank for every $t \in [0, T]$.

For example, if d = k and the volatility matrix σ is nonsingular (for Lebesgue a.e. $t \in [0, T]$, with probability one), then, for every $t \in [0, T]$,

$$\gamma_t = \sigma_t^{-1} \left(r_t \mathbf{1} - \mu_t\right) \qquad (14)$$

where $\mathbf{1}$ denotes the d-dimensional vector with every component equal to one, and $\mu_t$ is the vector with components $\mu_t^i$. For any process γ satisfying the
above equation, we introduce a probability measure $\mathbb{P}^*$ on $(\Omega, \mathcal{F}_T)$ by setting

$$\frac{d\mathbb{P}^*}{d\mathbb{P}} = \exp\left( \int_0^T \gamma_u \cdot dW_u - \frac{1}{2} \int_0^T |\gamma_u|^2\,du \right), \quad \mathbb{P}\text{-a.s.} \tag{15}$$

provided that the right-hand side in the last formula is well defined. The Doléans (stochastic) exponential

$$\eta_t = \exp\left( \int_0^t \gamma_u \cdot dW_u - \frac{1}{2} \int_0^t |\gamma_u|^2\,du \right) \tag{16}$$

is known to be a strictly positive supermartingale (but not necessarily a martingale) under $\mathbb{P}$, since it may happen that $\mathbb{E}_{\mathbb{P}}(\eta_T) < 1$. A probability measure $\mathbb{P}^*$ equivalent to $\mathbb{P}$ is well defined if and only if the process $\eta$ follows a $\mathbb{P}$-martingale, that is, when $\mathbb{E}_{\mathbb{P}}(\eta_T) = 1$. For the last property to hold, it is enough (but not necessary) that $\gamma$ is a bounded process.

Assume that the class of martingale measures is nonempty. By virtue of the Girsanov theorem, the process $W^*$, which equals, for every $t \in [0, T]$,

$$W_t^* = W_t - \int_0^t \gamma_u\,du \tag{17}$$

is a $d$-dimensional standard Brownian motion on $(\Omega, \mathbb{F}, \mathbb{P}^*)$. It follows from the Itô formula that the discounted stock price $S_t^{*i} = S_t^i B_t^{-1}$ satisfies under $\mathbb{P}^*$

$$dS_t^{*i} = S_t^{*i}\, \sigma_t^i \cdot dW_t^* \tag{18}$$

for any $i = 1, \dots, k$. This means that the discounted prices of all stocks follow local martingales under $\mathbb{P}^*$, so that any probability measure described above is a martingale measure for our model and it corresponds to the choice of the savings account as the numéraire asset. The class of tame strategies relative to $B$ is defined by postulating that the discounted wealth of a strategy follows a stochastic process bounded from below. The market model obtained in this way is referred to as the multidimensional Black and Scholes model.

In the classic version of the multidimensional Black and Scholes model, one postulates that $d = k$, the constant volatility matrix $\sigma$ is nonsingular, and the appreciation rates $\mu_i$ and the continuously compounded interest rate $r$ are constant. It is easily seen that under these assumptions, the martingale measure $\mathbb{P}^*$ exists and is unique.

Completeness of the Multidimensional Black and Scholes Model

The completeness of the multidimensional Black and Scholes model is defined in much the same way as for a finite market model, except that certain technical restrictions need to be imposed on the class of contingent claims we wish to hedge and price. This is linked to the fact that not all self-financing trading strategies are deemed to be admissible. Some of them should be excluded in order to ensure the no-arbitrage property of the model (in addition to the existence of a martingale measure). Typically, one considers the class of tame strategies to play the role of admissible trading strategies.

The multidimensional Black and Scholes model is said to be complete if any $\mathbb{P}^*$-integrable, bounded from below contingent claim $X$ is attainable, that is, if for any such claim $X$ there exists an admissible trading strategy $\varphi$ such that $X = V_T(\varphi)$. Otherwise, the market model is said to be incomplete.

Since, by assumption, the interest rate process $r$ is nonnegative and bounded, the integrability and boundedness of $X$ are equivalent to the integrability and boundedness of the discounted claim $X/B_T$. It is not postulated that the uniqueness of an EMM holds, and thus the $\mathbb{P}^*$-integrability of $X$ refers to any EMM for the model. The next result establishes necessary and sufficient conditions for the completeness of the Black and Scholes market.

Proposition 3 The following are equivalent:

1. the multidimensional Black and Scholes model is complete;
2. inequality $d \le k$ holds and the volatility matrix $\sigma$ has full rank for Lebesgue a.e. $t \in [0, T]$, with probability 1;
3. there exists a unique equivalent martingale measure $\mathbb{P}^*$ for discounted stock price $S^{*i}$ for every $i = 1, \dots, k$.

The classic one-dimensional Black and Scholes market model introduced in [3] is clearly a special case of the multidimensional Black and Scholes model. Hence, the above results apply also to the classic Black and Scholes market model, in which the
martingale measure $\mathbb{P}^*$ is well known to be unique. We conclude that the one-dimensional Black and Scholes market model is complete, that is, any $\mathbb{P}^*$-integrable contingent claim is $\mathbb{P}^*$-attainable and thus it can be priced by arbitrage.

In the general semimartingale framework, the equivalence of the uniqueness of an EMM and the completeness of a market model was conjectured by Harrison and Pliska [13, 14] (see also [18]). The case of the Brownian filtration is examined in [16]. Chatelain and Stricker [7, 8] provide definitive results for the case of continuous local martingales (see also [1, 20] for related results). They focus on the important distinction between the vector and componentwise stochastic integrals.

Local and Stochastic Volatility Models

Note that we have examined the completeness of the market model in which trading was restricted to a predetermined family of primary securities. In practice, several derivative securities are also traded either on organized exchanges or over-the-counter and thus they can be used to formally complete a given market model. Let us comment briefly on two classes of models in which, for simplicity, we assume that the bond price is deterministic.

Following Dupire [12], we define the stock price as a solution to the following stochastic differential equation:

$$dS_t = S_t \left( \mu(S_t, t)\,dt + \sigma(S_t, t)\,dW_t^* \right) \tag{19}$$

where $S_0 > 0$ and the function $\sigma : \mathbb{R}_+ \times \mathbb{R}_+ \to \mathbb{R}$ represents the so-called local volatility. In practice, the function $\sigma$ is obtained by fitting the model to market quotes of traded options. A model of this form is complete and thus any derivative security with the stock price as an underlying asset can be hedged and priced by arbitrage (provided, of course, that the model is arbitrage free). Another example of a complete model in which the volatility follows a stochastic process is discussed by Hobson and Rogers [15].

In a typical stochastic volatility model, the stock price $S$ is governed by the equation

$$dS_t = \mu(S_t, t)\,dt + \sigma_t S_t\,dW_t \tag{20}$$

where the stochastic volatility process $\sigma$ satisfies

$$d\sigma_t = a(\sigma_t, t)\,dt + b(\sigma_t, t)\,d\widetilde{W}_t \tag{21}$$

where $W$ and $\widetilde{W}$ are (possibly correlated) one-dimensional Brownian motions defined on some filtered probability space $(\Omega, \mathbb{F}, \mathbb{P})$. Owing to the presence of the Brownian motion $\widetilde{W}$, stochastic volatility models are incomplete if stock and bond are the only traded primary assets. By postulating that some plain-vanilla options are traded, it is possible to complete a stochastic volatility model, however.

Completeness of a model of financial market with traded call and put options and related topics, such as static hedging of exotic options, was examined by several authors: Bajeux-Besnainou and Rochet [2], Breeden and Litzenberger [4], Brown et al. [5], Carr et al. [6], Derman et al. [11], Madan and Milne [17], Nachman [19], Romano and Touzi [21], and Ross [22], to mention a few.

References

[1] Artzner, P. & Heath, D. (1995). Approximate completeness with multiple martingale measures, Mathematical Finance 5, 1–11.
[2] Bajeux-Besnainou, I. & Rochet, J.-C. (1996). Dynamic spanning: are options an appropriate instrument, Mathematical Finance 6, 1–16.
[3] Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, Journal of Political Economy 81, 637–654.
[4] Breeden, D. & Litzenberger, R. (1978). Prices of state-contingent claims implicit in option prices, Journal of Business 51, 621–651.
[5] Brown, H., Hobson, D. & Rogers, L. (2001). Robust hedging of options, Applied Mathematical Finance 5, 17–43.
[6] Carr, P., Ellis, K. & Gupta, V. (1998). Static hedging of exotic options, Journal of Finance 53, 1165–1190.
[7] Chatelain, M. & Stricker, C. (1994). On componentwise and vector stochastic integration, Mathematical Finance 4, 57–65.
[8] Chatelain, M. & Stricker, C. (1995). Componentwise and vector stochastic integration with respect to certain multi-dimensional continuous local martingales, in Seminar on Stochastic Analysis, Random Fields and Applications, E. Bolthausen, M. Dozzi, F. Russo, eds, Birkhäuser, Boston, Basel, Berlin, pp. 319–325.
[9] Cox, J.C., Ross, S.A. & Rubinstein, M. (1979). Option pricing: a simplified approach, Journal of Financial Economics 7, 229–263.
[10] Dalang, R.C., Morton, A. & Willinger, W. (1990). Equivalent martingale measures and no-arbitrage in stochastic securities market model, Stochastics and Stochastic Reports 29, 185–201.
[11] Derman, E., Ergener, D. & Kani, I. (1995). Static options replication, Journal of Derivatives 2(4), 78–95.
[12] Dupire, B. (1994). Pricing with a smile, Risk 7(1), 18–20.
[13] Harrison, J.M. & Pliska, S.R. (1981). Martingales and stochastic integrals in the theory of continuous trading, Stochastic Processes and their Applications 11, 215–260.
[14] Harrison, J.M. & Pliska, S.R. (1983). A stochastic calculus model of continuous trading: complete markets, Stochastic Processes and their Applications 15, 313–316.
[15] Hobson, D.G. & Rogers, L.C.G. (1998). Complete model with stochastic volatility, Mathematical Finance 8, 27–48.
[16] Jarrow, R.A. & Madan, D. (1991). A characterization of complete markets on a Brownian filtration, Mathematical Finance 1, 31–43.
[17] Madan, D.B. & Milne, F. (1993). Contingent claims valued and hedged by pricing and investing in a basis, Mathematical Finance 4, 223–245.
[18] Müller, S. (1989). On complete securities markets and the martingale property of securities prices, Economics Letters 31, 37–41.
[19] Nachman, D. (1989). Spanning and completeness with options, Review of Financial Studies 1, 311–328.
[20] Pratelli, M. (1996). Quelques résultats du calcul stochastique et leur application aux marchés financiers, Astérisque 236, 277–290.
[21] Romano, M. & Touzi, N. (1997). Contingent claims and market completeness in a stochastic volatility model, Mathematical Finance 7, 399–412.
[22] Ross, S.A. (1976). Options and efficiency, Quarterly Journal of Economics 90, 75–89.
[23] Schachermayer, W. (1992). A Hilbert space proof of the fundamental theorem of asset pricing in finite discrete time, Insurance: Mathematics and Economics 11, 249–257.
[24] Taqqu, M.S. & Willinger, W. (1987). The analysis of finite security markets using martingales, Advances in Applied Probability 19, 1–25.

Related Articles

Binomial Tree; Local Volatility Model; Martingale Representation Theorem; Second Fundamental Theorem of Asset Pricing.

MAREK RUTKOWSKI
Equivalent Martingale Measures

The usual setting of mathematical finance is provided by a $d$-dimensional stochastic process $S = (S_t)_{0 \le t \le T}$ based on and adapted to a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{0 \le t \le T}, \mathbb{P})$. This process $S$ models the price evolution of $d$ risky stocks, which is random. To alleviate notation, we assume from the very beginning that these prices are denoted in discounted terms: fix a traded asset, the “bond”, as numéraire and express stock prices $S$ in units of this bond. This simple and classical technique allows us to dispense with discount factors in the formulae below (compare Section 2.1 in [6] for more details).

A central topic in mathematical finance is to decide whether there is a probability measure $Q$, equivalent to $\mathbb{P}$, such that $S$ is a martingale under $Q$. This is the theme of the fundamental theorem of asset pricing (see Fundamental Theorem of Asset Pricing). Once we know that there exist equivalent martingale measures, they can be used to determine risk-neutral prices of derivative securities by taking expectations under these measures (see Risk-neutral Pricing), and to replicate, respectively, sub- or superreplicate, the derivative.

In fact, we were less precise in the previous paragraph (as is usual in this context) by requiring that $S$ is a martingale. It turns out that some technical care is needed here, involving the notions of local martingales and, more generally, of sigma-martingales. This article deals precisely with these technical variants of the concept of a martingale.

We start by giving precise definitions.

Definition 1 An $\mathbb{R}^d$-valued stochastic process $(S_t)_{0 \le t \le T}$ based on and adapted to $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{0 \le t \le T}, \mathbb{P})$ is called a

(i) martingale if

$$\mathbb{E}[S_t \mid \mathcal{F}_u] = S_u, \quad 0 \le u \le t \le T \tag{1}$$

(ii) local martingale if there exists a sequence $(\tau_n)_{n=1}^{\infty}$ of $[0, T] \cup \{+\infty\}$-valued stopping times, increasing a.s. to $\infty$, such that the stopped processes $S_t^{\tau_n}$ are all martingales, where

$$S_t^{\tau_n} = S_{t \wedge \tau_n}, \quad 0 \le t \le T \tag{2}$$

(iii) sigma-martingale if there is an $\mathbb{R}^d$-valued martingale $M = (M_t)_{0 \le t \le T}$ and a predictable $M$-integrable $\mathbb{R}_+$-valued process $\varphi$ such that $S = \varphi \cdot M$.

The process $\varphi \cdot M$ is defined as the stochastic integral in the sense of semimartingales. The underlying theory, by now well understood, was developed notably by the school of P.A. Meyer in Strasbourg [10–12]:

$$(\varphi \cdot M)_t = \int_0^t \varphi_u\,dM_u, \quad 0 \le t \le T \tag{3}$$

It is not obvious, but true, that a local martingale is a sigma-martingale, so that (i) ⇒ (ii) ⇒ (iii) holds true above, while the reverse implications fail to hold true as we discuss later.

Why is it necessary to introduce these generalizations of the concept of a martingale? Let us start with a familiar example of a martingale, namely, geometric Brownian motion

$$M_t = \exp\left( W_t - \frac{t}{2} \right), \quad t \ge 0 \tag{4}$$

where the process $(W_t)_{t \ge 0}$ is a standard Brownian motion.

Clearly, $(M_t)_{t \ge 0}$ is a martingale (with reference to its natural filtration) when $t$ ranges in $[0, \infty[$. But what happens if we include $t = \infty$ into the time set? It is straightforward to verify that

$$M_\infty := \lim_{t \to \infty} M_t \tag{5}$$

exists a.s. and equals

$$M_\infty = 0 \tag{6}$$

Hence we may well define the continuous process $(M_t)_{0 \le t \le \infty}$; this process is not a martingale any more as

$$1 = M_0 > \mathbb{E}[M_\infty] = 0 \tag{7}$$

In this example, the breakdown of the martingale property happens at $t = \infty$. However, it is purely formal to shift this problem to any other point $T \in\, ]0, \infty[$, for example, $T = 1$. Indeed, letting

$$\tilde{M}_t = M_{\tan(t\pi/2)}, \quad 0 \le t < 1$$
we find a process $(\tilde{M}_t)_{0 \le t \le 1}$, having a.s. continuous paths, which fails to be a martingale. However, it is intuitively clear that “locally”, that is, before $t$ assumes the value 1, the process $(\tilde{M}_t)_{0 \le t \le 1}$ is “something like a martingale”. A good way to formalize this intuition is to find a “localizing” sequence of stopping times as in (ii) above. The canonical choice (footnote a) is

or [Vol. II] [13]), goes as follows: for the drift term $\mu(X_t, t)$ we have $\mu(X_t, t) \equiv 0$ if and only if $(X_t)_{0 \le t \le 1}$ is a martingale. This is a very useful argument. However, this argument is not quite complete, as a glance at equation (10) reveals where we only obtain a local martingale. The correct statement is that the process $X$ is a local martingale if and only if $\mu(X_t, t) = 0$, a.s. with respect to $d\mathbb{P} \otimes d\lambda$.
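The behaviour of the geometric Brownian motion $M_t = \exp(W_t - t/2)$ from equations (4) to (7) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names are illustrative and not part of the article. It uses the fact that $W_t$ is normal with mean 0 and variance $t$, so that both the mean of $M_t$ and the probability that $M_t$ lies below a small level can be computed without simulation.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_below(eps, t):
    """P(M_t <= eps) for M_t = exp(W_t - t/2) with W_t ~ N(0, t), t > 0."""
    return normal_cdf((math.log(eps) + 0.5 * t) / math.sqrt(t))


def mean_by_quadrature(t, half_width=12.0, steps=20000):
    """Approximate E[M_t] = int exp(sqrt(t) z - t/2) phi(z) dz by the trapezoid rule."""
    s = math.sqrt(t)
    lo, hi = s - half_width, s + half_width  # the integrand is a Gaussian bump centred at sqrt(t)
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        z = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(s * z - 0.5 * t - 0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return total * h


def time_change(t):
    """Clock tan(t*pi/2), mapping [0, 1) onto [0, infinity)."""
    return math.tan(0.5 * math.pi * t)


if __name__ == "__main__":
    for t in (1.0, 25.0, 100.0):
        print(t, mean_by_quadrature(t), prob_below(0.01, t))
    # Under the time change, clock times close to 1 correspond to very large t.
    print(time_change(0.5), time_change(0.999))
```

The mean stays at 1 for every finite $t$, while almost all of the probability mass moves below any fixed level as $t$ grows. This is the numerical face of $1 = M_0 > \mathbb{E}[M_\infty] = 0$: the expectation is carried by ever rarer, ever larger values.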
and define the stochastic integral $S = \varphi \cdot M$, for which we get

$$S_t = \begin{cases} 0, & \text{for } 0 \le t < \tau \\ \tau^{-1} \varepsilon, & \text{for } \tau \le t \end{cases} \tag{18}$$

The process $S = (S_t)_{0 \le t \le 1}$ is a well-defined stochastic integral (in the pointwise Stieltjes sense). The verbal description of $S$ goes as follows: again $S$ remains at 0 until time $\tau$ and then, depending on the sign of $\varepsilon$, it jumps to $+\tau^{-1}$ or $-\tau^{-1}$.

Is the process $S$ a martingale? Morally speaking, one might think yes, as it has the same odds of jumping up or down (footnote d). But this intuition goes wrong: indeed, the notion of martingale requires certain (conditional) expectations to be zero. When we do the calculations in the present example we end up with expressions of the form $\infty - \infty$, which creates a problem. Indeed, we have

$$\mathbb{E}[|S_t|] = \infty, \quad \text{for } 0 < t \le 1 \tag{19}$$

as is easily seen from $\int_0^t u^{-1}\,du = \infty$, for $t > 0$. In fact, it is not hard to show [7] that, for every $(\mathcal{F}_t)_{0 \le t \le 1}$-stopping time $\sigma : \Omega \to [0, 1]$, such that $\mathbb{P}[\sigma > 0] > 0$, we have

$$\mathbb{E}[|S_\sigma|] = \infty \tag{20}$$

for processes $S$ and $M$ work equally well. In particular, the Ansel–Stricker Theorem carries over to sigma-martingales (see [4, Th. 5.5] for a somewhat stronger version of this result).

It is not hard to show that a locally bounded process, which is a sigma-martingale, is already a local martingale [4, Prop. 2.5 and 2.6]. Émery’s example shows that this is not the case any more if we drop the local boundedness assumption. From a financial point of view, however, the question of interest arises in a slightly different version. Is there an example of a process $S = (S_t)_{0 \le t \le T}$, which is a sigma-martingale, say under $\mathbb{P}$, but such that it fails to be a local martingale under any probability measure $Q$ equivalent to $\mathbb{P}$?

Émery’s original example does not provide a counterexample to this question; in this example, it is not hard to pass from $\mathbb{P}$ to $Q$ such that $S$ even becomes a $Q$-martingale. However, in [4, Ex. 2.3] a variant of Émery’s example has been constructed, which is a process $S$ taking values in $\mathbb{R}^2$ answering the above question negatively. It seems worth mentioning that, to the best of the author’s knowledge, it is unknown whether there also is a counterexample of a process $S$, taking values only in $\mathbb{R}$, to this question.
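The divergence in (19) can be made concrete with a small numerical sketch. It rests on an assumption that is not visible in the passage above: that the jump time $\tau$ is exponentially distributed with rate 1 and independent of the fair sign $\varepsilon$. Under that assumption the contribution to $\mathbb{E}[|S_t|]$ from the event $\{\delta \le \tau \le t\}$ equals $\int_\delta^t u^{-1} e^{-u}\,du$, which grows like $\log(1/\delta)$ as $\delta \to 0$. The function name is illustrative.

```python
import math


def truncated_abs_mean(delta, t=1.0, steps=20000):
    """Integral of u**-1 * exp(-u) over [delta, t], assuming tau ~ Exp(1) (an assumption).

    The substitution u = exp(x) turns the integrand into exp(-exp(x)) on
    [log(delta), log(t)], which is bounded and smooth, so the midpoint rule suffices.
    """
    a, b = math.log(delta), math.log(t)
    h = (b - a) / steps
    return h * sum(math.exp(-math.exp(a + (i + 0.5) * h)) for i in range(steps))


if __name__ == "__main__":
    for delta in (1e-2, 1e-4, 1e-8):
        # The truncated mean keeps growing as the cutoff delta shrinks.
        print(delta, truncated_abs_mean(delta))
```

Each time the cutoff is reduced by a factor of $10^4$ the truncated mean increases by roughly $\log(10^4)$, so no finite limit exists; this is the $\int_0^t u^{-1}\,du = \infty$ argument in numerical form.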
b. The name is derived from the following fact: let $(B_t)_{0 \le t \le 1} = (B_t^1, B_t^2, B_t^3)_{0 \le t \le 1}$ be an $\mathbb{R}^3$-valued standard Brownian motion starting at $B_0 = (B_0^1, B_0^2, B_0^3) = (1, 0, 0)$. Let $R_t = \|B_t\|^{-1}$ where $\|\cdot\|$ denotes Euclidean norm on $\mathbb{R}^3$. Then $(R_t)_{0 \le t \le 1}$ satisfies equation (10), where $(W_t)_{0 \le t \le 1}$ is a one-dimensional Brownian motion adapted to the filtration generated by the three-dimensional Brownian motion $(B_t)_{0 \le t \le 1}$. We refer to [11] for a beautiful presentation of the theory of Bessel processes (compare also [5]).

c. It is easy to verify that in Proposition 1 (as well as in Definition 1 (iii)), we may assume without loss of generality that $\varphi$ takes its values in $]0, \infty[$ (or, equivalently, in $[1, \infty[$). See [4, Prop. 2.5] for details.

d. A precise statement is that the processes $S$ and $-S$ have the same law, which obviously is the case.

e. To be precise, we have to consider the random variables $(H \cdot S)_T \wedge C$, where $C$ runs through $\mathbb{R}_+$, to make sure that these random variables are in $L^\infty(\Omega, \mathcal{F}, \mathbb{P})$.

References

[1] Ansel, J.P. & Stricker, C. (1994). Couverture des actifs contingents et prix maximum, Annales de l’Institut Henri Poincaré – Probabilités et Statistiques 30, 303–315.
[2] Chou, C.S. (1977/78). Caractérisation d’une classe de semimartingales, in Séminaire de Probabilités XIII, Springer Lecture Notes in Mathematics, Springer, Vol. 721, pp. 250–252.
[3] Delbaen, F. & Schachermayer, W. (1994). A general version of the fundamental theorem of asset pricing, Mathematische Annalen 300, 463–520.
[4] Delbaen, F. & Schachermayer, W. (1998). The fundamental theorem of asset pricing for unbounded stochastic processes, Mathematische Annalen 312, 215–250.
[5] Delbaen, F. & Schachermayer, W. (1995). Arbitrage possibilities in Bessel processes and their relations to local martingales, Probability Theory and Related Fields 102, 357–366.
[6] Delbaen, F. & Schachermayer, W. (2006). The Mathematics of Arbitrage, Springer Finance, p. 371. ISBN: 3-540-21992-7.
[7] Émery, M. (1980). Compensation de processus à variation finie non localement intégrables, in Séminaire de Probabilités XIV, J. Azema & M. Yor, eds, Springer Lecture Notes in Mathematics, Springer, Vol. 784, pp. 152–160.
[8] Harrison, J.M. & Pliska, S.R. (1981). Martingales and stochastic integrals in the theory of continuous trading, Stochastic Processes and their Applications 11, 215–260.
[9] Kabanov, Y.M. (1997). On the FTAP of Kreps-Delbaen-Schachermayer (English), in Statistics and Control of Stochastic Processes, Y.M. Kabanov, B.L. Rozovskii & A.N. Shiryaev, eds, The Liptser Festschrift. Papers from the Steklov Seminar held in Moscow, Russia, 1995–1996, World Scientific, Singapore, pp. 191–203.
[10] Protter, P. (1990). Stochastic integration and differential equations. A new approach, in Applications of Mathematics, (2nd edition, 2003, corrected third printing: 2005) Springer-Verlag, Berlin, Heidelberg, New York, Vol. 21.
[11] Revuz, D. & Yor, M. (1991). Continuous martingales and Brownian motion, in Grundlehren der Mathematischen Wissenschaften, 3rd edition, 1999, corrected third printing: 2005, Springer, Vol. 293.
[12] Rogers, L.C.G. & Williams, D. (2000). Diffusions, Markov Processes and Martingales, Cambridge University Press, Vol. I and II.
[13] Shreve, S. (2004). Stochastic calculus for finance, Springer Finance I, II, 208, 550.
[14] Stricker, Ch. (1990). Arbitrage et Lois de martingale, Annales de l’Institut Henri Poincaré – Probabilités et Statistiques 26, 451–460.
[15] Yan, J.A. (1980). Caractérisation d’une classe d’ensembles convexes de $L^1$ ou $H^1$, in Séminaire de Probabilités XIV, J. Azema & M. Yor, eds, Springer Lecture Notes in Mathematics, Springer, Vol. 784, pp. 220–222.

Related Articles

Arbitrage Strategy; Complete Markets; Free Lunch; Fundamental Theorem of Asset Pricing; Martingales; Minimal Entropy Martingale Measure; Minimal Martingale Measure; Risk-neutral Pricing; Second Fundamental Theorem of Asset Pricing.

WALTER SCHACHERMAYER
Second Fundamental Theorem of Asset Pricing

The second fundamental theorem of asset pricing concerns the mathematical characterization of the economic concept of market completeness for liquid and frictionless markets with an arbitrary number of assets. The theorem establishes the mathematical necessary and sufficient conditions in order to guarantee that every contingent claim on the market can be duplicated with a portfolio of primitive assets. For finite asset economies, completeness (i.e., perfect replication of every claim on the market by admissible self-financing strategies) is equivalent to uniqueness of the equivalent martingale measure. This result can be extended to market models with an infinite number of assets by defining completeness in terms of approximate replication of claims by attainable ones. Hence several definitions of completeness are possible, and in the sequel we will present and discuss them extensively.

Finite Number of Assets

and with $\mathcal{F}_T = \mathcal{F}$. Let $S = (S_t^0, \dots, S_t^d)_{t \in [0, T]}$ be a $(d + 1)$-dimensional strictly positive semimartingale, whose components $S^0, \dots, S^d$ are right continuous with left limits. Moreover, we assume that $S_0^0 = 1$. Here, the stochastic process $S_t^k$ represents the value at time $t$ of the $k$th security on the market. The discounted price process $Z = (Z_t^1, \dots, Z_t^d)_{t \in [0, T]}$ is then defined by setting $Z^k = S^k / S^0$, for $k = 1, \dots, d$.

Consider the set of probability measures $Q$ on $(\Omega, \mathcal{F})$ that are equivalent to $P$ and such that $Z$ is a (vector) martingale under $Q$. We assume that this set is not empty, that is, that the market is arbitrage free (see Fundamental Theorem of Asset Pricing). We fix an element $P^*$ of this set and denote by $E^*$ the expectation under $P^*$. Let $L(Z)$ denote the set of all vector-valued, predictable processes $H = (H_t^1, \dots, H_t^d)_{t \in [0, T]}$ that are integrable with respect to the semimartingale $Z$. For further details on $L(Z)$, we refer to Remark 1 below.

Definition 1 A stochastic process $\varphi \in L(Z)$ is said to be an admissible self-financing strategy if

(i) the discounted value process $V^*(\varphi) := \sum_{k=1}^{d} \varphi^k Z^k$ is almost surely nonnegative;
(ii) $V^*(\varphi)$ satisfies the self-financing condition
Theorem 2 (Theorem 4 of [3], Theorem 2.2 and 3.2 of [16]). Assume that the set of equivalent martingale measures is nonempty. Then the market is complete if and only if $P(\operatorname{rank}(\Sigma_t) = d \text{ for almost all } t \in [0, T]) = 1$.

For further references, see also Theorem 4.1 of [18]. Since there are $n$ sources of randomness represented by the Brownian motions, it is natural to expect that $n$ sufficiently independent asset prices are needed for completeness. Clearly, if $d < n$ the market cannot be complete.

Example 3 If price processes are discontinuous but with a finite number of jump sizes, then we obtain again a characterization of completeness in terms of the volatility matrix, as shown by the following theorem attributed to Bättig [3]. We set again $S^0 = 1$ and consider price processes driven by a multivariate point process (footnote c) $\mu$ with compensator $\nu(dt, dx) = K_t(dx)\,dt$ such that

$$S_t^i = S_0^i\, \mathcal{E}(R^i)_t, \quad t \in [0, T],\ i = 1, \dots, d \tag{8}$$

with

$$R_t^i = \int_0^t \alpha_s^i\,ds + \int_{[0,t] \times E} \sigma^i(u, x)\,\big(\mu(du, dx) - \nu(du, dx)\big), \quad t \in [0, T],\ i = 1, \dots, d, \tag{9}$$

where the $\sigma^i(t, x)$ are bounded $d\mu \otimes dP$ a.e., $\mathcal{E}$ is the Doléans exponential (for the definition, we refer to Theorem I.4.61 of [13]), and $E \subset \mathbb{R}$. Note that here $\sigma^i$, $\mu$, and $\nu$ may depend on $\omega$, but for the sake of simplicity we do not indicate this dependence. In this context, asset prices may have jumps that can be thought of as the result of possible shocks that may trigger the market. If the cardinality $|E|$ of $E$ is finite, we denote again by $\Sigma_t$ the volatility matrix, whose row vectors are given by $(\sigma^i(t, x))_{x \in E}$, $i = 1, \dots, d$.

Theorem 3 (Theorem 5 of [3]). Assume that the set of equivalent martingale measures is nonempty, $|E| < \infty$ and $K_t(\{x\}) > 0$ for every $x \in E$. Then the market is complete if and only if $P(\operatorname{rank}(\Sigma_t) = |E| \text{ for almost all } t \in [0, T]) = 1$.

Furthermore, in the case of a finite number of jumps that may trigger the economy, the characterization of market completeness is similar to the Itô price process case, that is, one needs $|E|$ sufficiently independent processes for completeness in presence of the $|E|$ sources of randomness, given by the $|E|$ different possible shocks.

We have seen that the key to completeness is the predictable representation property. Hence, a natural question concerns the kind of martingales for which the predictable representation property is satisfied. For the continuous case, we have that the predictable representation property holds for diffusion processes that are martingales and have either Lipschitz coefficients [24] or a nondegenerate diffusion matrix and continuous coefficients [14]. The only one-dimensional martingales with stationary and independent increments that satisfy the predictable representation property are the Wiener and the Poisson martingales [25]. Hence the representation property holds for finite Lévy measures, but it fails for infinite Lévy measures. In the next section, we discuss the second fundamental theorem in the case of infinite dimensional financial markets.

Infinite Number of Assets

Many applications of hedging involve dynamic trading in principle in infinitely many securities, for example, in pricing of interest rate derivatives by using pure discount bonds or in the use of the term and strike structure of European put and call options to hedge exotic derivatives, when asset prices are driven by Lévy measures. Hence it is natural to develop infinite dimensional market models to address this kind of issue. The problem now is to establish whether the second fundamental theorem still holds when the market is endowed with an infinite number of assets.

By defining a complete market via the density of a vector space, the second fundamental theorem is proved in [8] to hold true for (infinitely many) continuous and bounded asset price processes, if all the martingales with respect to the reference filtration $\mathcal{F}_t$ are continuous ([8], Theorem 6.7). In the case of a general filtration, Theorem 6.5 of [8] states that completeness is equivalent to $P^*$ being an extreme point of the set of equivalent martingale measures, that is, a weaker version of the second fundamental theorem holds.

The hypothesis of continuity cannot be dropped and in the presence of jumps (discontinuities) and infinitely many assets, a counterexample to the second fundamental theorem is provided in [2], where an economy with infinitely many assets is constructed,
in which the market is complete; yet, there exists an infinity of equivalent martingale measures. Since the formulation of this counterexample, many papers have studied the problem of extending the result of the second fundamental theorem to markets with infinitely many assets. Since many definitions of completeness are possible, the solution to the counterexample of [2] relies on the choice of the definition of completeness that is adopted. A first answer to this problem was provided in 1997 by Björk et al. [5, 6], where Theorem 6.11 shows that in the presence of infinitely many assets and a continuum of jump sizes, the uniqueness of the equivalent martingale measure is equivalent to the market being approximately complete, that is, every bounded contingent claim can be approached in $L^2(Q)$ for some equivalent martingale measure $Q$ by a sequence of hedgeable claims.

In 1999, a number of papers appeared [3, 4, 15, 17] at the same time, where new definitions of market completeness were proposed in order to maintain the second fundamental theorem, even in complex economies. The equivalence between market completeness and uniqueness of the pricing measure is maintained by introducing a notion of market completeness that is independent both of the notion of no arbitrage and of a chosen equivalent martingale measure. In finite-dimensional markets, the definition of market completeness is given in terms of replicating value processes in economies without arbitrage possibilities and with respect to a given equivalent martingale measure. However, the issue of completeness is about the ability to replicate certain cash flows, and not about how these cash flows are valued or whether these values are arbitrage free. From this perspective, the appropriate measure to address the issue of completeness is the statistical probability measure $P$, and not an equivalent martingale measure that may also not exist. In reference [17], this new approach was also motivated by the empirical asset pricing literature. Moreover, an example in [3] shows an economy where the existence of an equivalent martingale measure precludes the possibility of market completeness. Hence in references [3, 4, 15, 17], the concept of exact (almost everywhere) replication of a contingent claim via an admissible portfolio is substituted by the notion of approximation of a contingent claim. The main outlines of this approach are the following.

Consider the space of the $P$-absolutely continuous signed measures on $\mathcal{F}_T$. A measure $Q$ in this space can be interpreted as a market agent’s personal way of assigning values to claims, that is, the space represents the possible contingent claims valuation measures held by traders. An agent using the valuation measure $Q$ assigns to a contingent claim $H$ the value $\int H\,dQ$. The fact that the valuation measures are the $P$-absolutely continuous signed measures on $\mathcal{F}_T$ has two particular meanings: first that all traders agree on null events, and second, that there can be strictly positive random variables with negative personal value. For a given trader, represented by a valuation measure $Q$, two contingent claims $H_1$ and $H_2$ are approximately equal if

$$\left| \int (H_1 - H_2)\,dQ \right| < \varepsilon \quad \text{for small } \varepsilon > 0 \tag{10}$$

Denote the space of all bounded contingent claims by $C$. The finite intersections of the sets of the form $B(H_1, \varepsilon) = \{ H_2 \in C : | \int (H_1 - H_2)\,dQ | < \varepsilon \}$, $H_1 \in C$, and $\varepsilon > 0$, give a basis for a topology $\tau^Q$ on $C$. We endow $C$ with the coarsest topology $\tau$ finer than all of the $\tau^Q$, with $Q$ ranging over the valuation measures. This topology is now agent independent, that is, two claims are approximately equal if all the agents believe that their values are close. The topology $\tau$ is usually referred to as the weak* topology on $C$ [21].

An agent is then allowed to trade in a finite number of assets via self-financing, bounded, stopping time simple strategies that yield a bounded payoff at $T$. As in the previous section, a (bounded) claim is said to be attainable if it can be replicated by one of these strategies. In this setting, the market is said to be quasicomplete if any contingent claim $H \in C$ can be approximated by attainable claims in the weak* topology induced on $C$ by the valuation measures. Since the weak* topology as well as the trading strategies are agent measure independent, the same is true for this notion of completeness. Consider now the space of the $P$-absolutely continuous signed martingale measures. Then the following generalized version of the second fundamental theorem holds.

Theorem 4 (The second fundamental theorem of asset pricing, Theorem 2 of [3], Theorem 1 of [4], Theorem 5 of [17]). Assume that the space of $P$-absolutely continuous signed martingale measures is nonempty. Then there exists a unique $P$-absolutely continuous signed martingale measure if and only if the market is quasicomplete.
Second Fundamental Theorem of Asset Pricing 5
The proof of this theorem relies on the theory of linear operators between locally convex topological vector spaces.
Since the market is endowed with an infinite number of assets, in principle, trading in infinitely many assets may be possible. To take this possibility into account, in [5, 6, 15, 17] portfolios consisting of infinitely many assets are allowed by considering measure-valued strategies. The result of Theorem 4 still holds in the case of market models where measure-valued strategies are allowed as shown in Theorem 6.11 of [5] and Theorem 2.1 of [15].
This approach resolves the paradox of the counterexample of [2], since the economy considered in [2] is incomplete under this new definition of market completeness. Moreover, if ≠ ∅ and the number of assets is finite or the asset prices are given by continuous processes, then Theorem 5 of [4] shows that the market model is quasicomplete if and only if it is complete.

End Notes

a. We say that a contingent claim is integrable if $E^*[X/S_T^0] < \infty$. By Definition 1, it follows that an attainable contingent claim is necessarily integrable. Hence we can restate the definition of market completeness as follows. The model is said to be complete if every integrable claim is attainable.
b. The price process is said to contain a redundancy if $P(\alpha \cdot S_{t+1} = 0 \mid A) = 1$ for some nontrivial vector α, some t < T, and some $A \in P_t$.
c. Let E be a Blackwell space. An E-multivariate point process is an integer-valued random measure µ on [0, T] × E with µ([0, t] × E) < ∞ for every ω, t ∈ [0, T] (see Definition III.1.23 of [13]).

References

[1] Arrow, K. (1964). The role of securities in the optimal allocation of risk-bearing, Review of Economic Studies 31, 91–96.
[2] Artzner, P. & Heath, D. (1995). Approximate completeness with multiple martingale measures, Mathematical Finance 5, 1–11.
[3] Bättig, R. (1999). Completeness of securities market models–an operator point of view, The Annals of Applied Probability 9, 529–566.
[4] Bättig, R. & Jarrow, R.A. (1999). The second fundamental theorem of asset pricing: a new approach, The Review of Financial Studies 12, 1219–1235.
[5] Björk, T., Di Masi, G., Kabanov, Y. & Runggaldier, W. (1997). Towards a general theory of bond markets, Finance and Stochastics 1, 141–174.
[6] Björk, T.G., Kabanov, Y. & Runggaldier, W. (1997). Bond market structure in the presence of marked point processes, Mathematical Finance 7, 211–223.
[7] Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, Journal of Political Economy 81, 637–659.
[8] Delbaen, F. (1992). Representing martingale measures when asset prices are continuous and bounded, Mathematical Finance 2, 107–130.
[9] Harrison, J.M. & Kreps, D.M. (1979). Martingales and arbitrage in multiperiod securities markets, Journal of Economic Theory 20, 381–408.
[10] Harrison, J.M. & Pliska, S.R. (1981). Martingales and stochastic integrals in the theory of continuous trading, Stochastic Processes and Their Applications 11, 215–260.
[11] Harrison, J.M. & Pliska, S.R. (1983). A stochastic calculus model of continuous trading: complete markets, Stochastic Processes and Their Applications 15, 313–316.
[12] Jacod, J. (1979). Calcul Stochastique et Problèmes des Martingales, Lecture Notes in Mathematics, No. 714, Springer-Verlag, Berlin, Heidelberg, New York.
[13] Jacod, J. & Shiryaev, A.N. (1987). Limit Theorems for Stochastic Processes, Springer-Verlag, Berlin, Heidelberg, New York.
[14] Jacod, J. & Yor, M. (1977). Etude des solutions extrémales et représentation intégrales des solutions pour certains problèmes des martingales, Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 38, 83–125.
[15] Jarrow, R.A., Jin, X. & Madan, D.B. (1999). The second fundamental theorem of asset pricing, Mathematical Finance 9, 255–273.
[16] Jarrow, R.A. & Madan, D.B. (1991). A characterization of complete security markets on a Brownian filtration, Mathematical Finance 1, 31–43.
[17] Jarrow, R.A. & Madan, D.B. (1999). Hedging contingent claims on semimartingales, Finance and Stochastics 3, 111–134.
[18] Londono, J.A. (2004). State tameness: a new approach for credit constrains, Electronic Communications in Probability 9, 1–13.
[19] Müller, S.M. (1989). On complete securities markets and the martingale property of securities, Economics Letters 31, 37–41.
[20] Ross, S. (1976). The arbitrage theory of capital asset pricing, Journal of Economic Theory 13, 341–360.
[21] Rudin, W. (1991). Functional Analysis, 2nd Edition, McGraw-Hill, New York.
[22] Stiglitz, J. (1972). On the optimality of the stock market allocation of investment, Quarterly Journal of Economics 86, 25–60.
[23] Taqqu, M.S. & Willinger, W. (1987). The analysis of finite security markets using martingales, Advances in Applied Probability 19, 1–25.
[24] Yamada, T. & Watanabe, S. (1971). On the uniqueness of solutions of stochastic differential equations, Journal of Mathematics of Kyoto University 11, 155–167.
[25] Yor, M. (1977). Remarques Sur la Représentation des Martingales Comme intégrales Stochastiques, Séminaire de probabilités de Strasbourg XI, Lecture Notes in Mathematics, No. 581, Springer, New York, pp. 502–517.

Related Articles

Equivalence of Probability Measures; Equivalent Martingale Measures; Fundamental Theorem of Asset Pricing; Hedging; Martingales; Martingale Representation Theorem.

FRANCESCA BIAGINI
Expected Utility Maximization: Duality Methods

Expected utility maximization has a long tradition in modern mathematical finance. It dates back to the 1950s [18] when it provided a theoretical foundation to the (Markowitz's) mean–variance asset allocation method (see Risk–Return Analysis). The objective of a rational and risk-averse agent is captured by a concave function, the utility U of the agent (see Utility Function). It is typically assumed that U is increasing since the agent prefers more wealth to less. Given his/her U, the agent chooses the portfolio P* that maximizes the agent's expected utility over a horizon [0, T].
Some famous case studies are considered in [12, 13], where the agent is planning for retirement in a Black–Scholes (and thus complete) financial market (see Merton Problem). The complete market framework (see Complete Markets) is a convenient mathematical idealization as any conceivable risk can be hedged by cleverly investing in the market. As a consequence, independently of the specific utility of the agent, the price of any claim is also uniquely assigned since by the no-arbitrage principle it must coincide with the initial value of the hedging portfolio.
In the more realistic situation of an incomplete market, when there are, for example, intrinsic, nontraded sources of risk, both the valuation and the hedging problems become highly nontrivial issues. Expected utility maximization has also turned out to perform very well in the pricing problem in the general, incomplete market setup. The related pricing techniques are known as pricing by marginal utility and indifference pricing and are discussed briefly in this article (for more details see Utility Indifference Valuation).
The use of increasingly more complex probabilistic models of financial assets has continued to pose new mathematical challenges. If the setup is that of general non-Markovian diffusion or semimartingale models, direct methods from stochastic optimal control (as originally done by Merton and many others after him) become increasingly difficult to handle. As first suggested by Bismut [4], convex duality (see Convex Duality) is a powerful alternative approach. In the mid-1980s, with the works of Pliska [14], He and Pearson [8], Karatzas et al. [10], and Cox and Huang [5] this new methodology started to fully develop. Relying on convex duality (see Convex Duality) and martingale (see Martingales) methods, it enables the treatment of the most general cases. The price to pay for the achieved generality is that the results obtained have a mathematical existence–uniqueness–characterization form. As is always the case, explicit calculations require the specification of a (very) tractable model.
The presentation given here is based on the convex duality approach, in a general semimartingale model. For a treatment of the same problem with martingale methods in a diffusion context, see Expected Utility Maximization or [9].

Examples

Consider an agent who is a price taker, that is, his/her actions do not affect market prices, and whose goal is to trade dynamically in a financial market up to a horizon T, in order to achieve maximum expected utility. A host of features can be taken into account, such as the initial endowment, the possibility of intertemporal consumption, and the presence of a random endowment at time T. A list of various situations is given in the following. The mathematical details are discussed in the next section.

1. Maximizing Utility of Terminal Wealth
The preferences of the investor are represented by a von Neumann–Morgenstern utility function

$$U : \mathbb{R} \to [-\infty, +\infty) \qquad (1)$$

which must be not identical to −∞, increasing, and concave.
Typical examples are $U(x) = \ln x$, $U(x) = \frac{1}{\alpha} x^{\alpha}$ with $\alpha < 1$, $\alpha \neq 0$, where it is intended that $U(x) = -\infty$ outside the domain, and $U(x) = -\frac{1}{\gamma} e^{-\gamma x}$ with $\gamma > 0$.
No consumption occurs before time T. The agent has the initial endowment x and can invest in the financial market. The resulting optimization problem is

$$\sup_{k \in K(x)} E[U(k)] \qquad (2)$$
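The three typical utilities listed in case 1 can be written down directly. The following is a minimal sketch, not part of the original article: the function names and the grid-based check are illustrative assumptions, and the domain convention (value −∞ for nonpositive wealth in the logarithmic case and for negative wealth in the power case) is one reading of "outside the domain".

```python
import math

NEG_INF = float("-inf")


def log_utility(x):
    """U(x) = ln x for x > 0, and -infinity outside the domain."""
    return math.log(x) if x > 0 else NEG_INF


def power_utility(x, alpha):
    """U(x) = x**alpha / alpha with alpha < 1, alpha != 0 (assumed domain x > 0,
    extended to x = 0 when alpha > 0)."""
    if alpha >= 1 or alpha == 0:
        raise ValueError("alpha must satisfy alpha < 1 and alpha != 0")
    if x > 0:
        return x ** alpha / alpha
    if x == 0 and alpha > 0:
        return 0.0
    return NEG_INF


def exp_utility(x, gamma):
    """U(x) = -exp(-gamma * x) / gamma with gamma > 0, finite on the whole real line."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return -math.exp(-gamma * x) / gamma


def is_increasing_and_concave(u, grid):
    """Numerical sanity check on an equally spaced grid (illustrative helper).

    Monotonicity is checked on consecutive points; concavity is checked via
    the midpoint inequality u(x_i) >= (u(x_{i-1}) + u(x_{i+1})) / 2.
    """
    vals = [u(x) for x in grid]
    increasing = all(b > a for a, b in zip(vals, vals[1:]))
    concave = all(
        vals[i] >= 0.5 * (vals[i - 1] + vals[i + 1]) - 1e-12
        for i in range(1, len(vals) - 1)
    )
    return increasing and concave
```

The check is only a finite-grid diagnostic; it does not prove concavity, but it makes the defining properties of a utility function in (1) concrete.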
Here $K(x)$ is the set of random wealths that can be obtained at time T (terminal wealths) with initial wealth x.
The formulation of the problem with random endowment, namely, when the agent receives at T an additional cashflow B (say, an option), is the following:

$$\sup_{k \in K(x)} E[U(k + B)] \qquad (3)$$

as his/her terminal possible wealths now are of the form k + B.

2. Maximizing Utility of Consumption
Suppose that the agent is not particularly interested in consumption at the terminal time T, but rather he/she is willing to consume over the entire planning horizon. A consumption plan C for the agent is determined by its random rate of consumption c(t) at time t for all t ∈ [0, T]. It is evident from the financial perspective that the rate c(t) must be nonnegative, so the consumption in the interval [t, t + dt] increases by the quantity c(t)dt. The goal of the agent is thus the selection of the best consumption plan over [0, T], starting with an initial endowment x ≥ 0. The utility function will now measure the degree of satisfaction with the intertemporal consumption or better with the rate of consumption. As this measure may change over time, the utility also depends on the time parameter:

$$U : [0, T] \times \mathbb{R} \to [-\infty, +\infty) \qquad (4)$$

When t is fixed, then $U(t, \cdot)$ is a utility function with the same properties as in case 1. As the rate of consumption cannot be negative, $U(t, x) = -\infty$ when x < 0. The agent may clearly benefit from the opportunity of investing in the financial market, so in general his/her position can be expressed by a consumption plan C and a dynamically changing portfolio P. If $X^{C,P}(t)$ is the total wealth of the position (C, P) at time t, then as there is no inflow of cash the variation of the wealth in [t, t + dt] must satisfy

$$dX^{C,P}(t) = -c(t)\,dt + dV^{P}(t) \qquad (5)$$

where $dV^{P}(t)$ is the variation of the value of the portfolio P at time t due to market fluctuations.
Let A(x) indicate the set of all such consumption plan and portfolio pairs (C, P) when starting from the wealth level x. The maximization is then that of the expected integrated utility from the rate of consumption:

$$\sup_{(C,P) \in A(x)} E\left[\int_0^T U(t, c(t))\,dt\right] \qquad (6)$$

3. Maximizing Utility of Terminal Wealth and Consumption
Alternatively, the agent may wish to maximize expected utility from terminal wealth and intertemporal consumption given his/her initial wealth x ≥ 0. Therefore, there are two utilities, $U(\cdot)$ and $U(t, \cdot)$, from terminal wealth and from the rate of consumption, respectively. Let A(x) be the set of the possible consumption plan and portfolio pairs (C, P), obtained with initial wealth x, and let $X^{C,P}(T)$ be the terminal wealth from the choice (C, P). Then the optimal consumption–investment is the couple (C*, P*) that solves

$$\sup_{(C,P) \in A(x)} \left\{ E\left[\int_0^T U(t, c(t))\,dt\right] + E\left[U\left(X^{C,P}(T)\right)\right] \right\} \qquad (7)$$

The case selected in the following section for the illustration of the duality technique and the main results is the first, that is, utility maximization of terminal wealth. When intertemporal consumption is taken into account, similar results can be proved. In addition, case 3 turns out to be a superposition of cases 1 and 2, as shown in Chapters 3, 6 of [9].

Maximizing the Utility of (Discounted) Terminal Wealth

An analysis of any optimization problem relies on a precise definition of the domain of optimization and the objective function. Therefore, the study of maximization (2) requires a specification of

1. the financial market model and the admissible terminal wealths;
2. the technical assumptions on U; and
3. some joint condition on the market model and the utility function.

1. The financial market model considered is frictionless and consists of N risky assets and one
risk-free asset (money market account). Although it is not necessary, for the sake of convenience, it is assumed that the risk-free asset, $S^0$, is constantly equal to 1, that is, the prices are discounted. The N risky assets are globally indicated by $S = (S^1, \ldots, S^N)$. The trading can occur continuously in [0, T]. $S = (S_t)_{t \le T}$ is, in fact, an $\mathbb{R}^N$-valued, continuous-time process, defined on a filtered probability space $(\Omega, (\mathcal{F}_t)_{t \le T}, P)$. Since the wealth from an investment in this market is a (stochastic) integral, S is assumed to be a semimartingale so that the object "integral with respect to S" is mathematically well defined (see Stochastic Integrals). For expository reasons, S is a locally bounded semimartingale. This class of models is already very general, as all the diffusions are locally bounded semimartingales, as well as any jump-diffusion process with bounded jumps.
The agent has an initial endowment x and there are no restrictions on the quantities he/she can buy, sell, or sell short. $H_t = (H_t^1, \ldots, H_t^N)$ is the random vector with the number of shares of each risky asset that the agent holds in the infinitesimal interval [t, t + dt]. $B_t$ represents the number of shares of the risk-free asset held in the same interval. $H = (H_t)_t$ and $B = (B_t)_t$ are the corresponding processes and are referred to as the strategy of the agent. To be technically precise, H must be a predictable process and B a semimartingale. As there is no consumption and no infusion of money in the trading period [0, T], the wealth from a strategy (H, B) is the process X that solves

$$dX_t = (H_t\,dS_t + B_t\,dS_t^0) = H_t\,dS_t, \qquad X_0 = x \qquad (8)$$

or, in integral form, $X_t = x + \int_0^t H_s\,dS_s$. This can be equivalently stated as the strategy (H, B) is self-financing. Since $dS^0 = 0$, the self-financing condition enables a representation of the wealth X only in terms of H. This is the reason one typically refers to H only as the strategy.
As usual in continuous-time trading (see Fundamental Theorem of Asset Pricing) to avoid phenomena like doubling strategies, not every self-financing H is allowed. A self-financing strategy H is said to be admissible only if during the trading the losses do not exceed a finite credit line. That is, H is admissible if there exists some constant c > 0 such that

$$\text{for all } t \in [0, T], \quad \int_0^t H_s\,dS_s \ge -c \quad P\text{-a.s.} \qquad (9)$$

so that for any x the wealth process $X = x + \int H_s\,dS_s$ is also bounded from below. Maximizing expected utility from terminal wealth means, in fact, maximizing expected utility from the set K(x) of those random variables $X_T$ that can be represented as $X_T = x + \int_0^T H_t\,dS_t$ with H admissible in the sense of equation (9).
Hereafter, the notation $E[\cdot]$ indicates P-expectation. When considering expectation under another probability Q, the notation is explicitly $E_Q[\cdot]$.
As shown by Delbaen and Schachermayer [7] a financially relevant set of probabilities is $M^e$, namely, the set of the equivalent (local) martingale probabilities for S. When the market is complete, this set consists of only one probability, but in the general, incomplete market case, this set is infinite. Under each probability $Q \in M^e$, S is a (local) martingale and thus Q is a risk-neutral probability. This is the theoretical justification for the use of each of these Qs as a pricing measure for any derivative claim B, with (arbitrage-free) price given by the expectation $E_Q[B]$.
However, we need the less restrictive set M of the absolutely continuous (local) martingale probabilities for S, as this is the set that will show up in the dual problem. The set M can be characterized in the following manner:

$$M = \left\{ Q \ll P \;\middle|\; E_Q\left[\int_0^T H_t\,dS_t\right] \le 0 \ \ \forall \text{ adm. } H \right\} \qquad (10)$$

as the set of absolutely continuous probability measures that give nonpositive expectation to the terminal wealths from admissible self-financing strategies starting with zero wealth. Therefore, given any $X_T \in K(x)$ and any $Q \in M$,

$$E_Q[X_T] = E_Q\left[x + \int_0^T H_t\,dS_t\right] \le x \qquad (11)$$

2. Hypothesis on U. As a case study, let us assume that U is finite valued on $\mathbb{R}$, that is, the wealth can become arbitrarily negative (the closest
references are [2, 16]). A typical example is the exponential utility. The reason we prefer the exponential utility (and all the other utilities with the properties listed below) to, for example, the logarithmic or the power utilities is that the dual problem is easier to interpret. References for the case when there are constraints on the wealth (then U is finite only on a half-line), like $U(x) = \ln x$ or $U(x) = \frac{1}{\alpha} x^{\alpha}$, are [11], [17], and the bibliography contained therein.
A main difficulty the reader may encounter when comparing this literature is that the language and style in the papers differ. Very recently, Biagini and Frittelli [3] proposed a unifying approach that works both for the case of U finite on all $\mathbb{R}$ and for the case of U finite only on a half-line. The result there is enabled by the choice of an innovative duality (an Orlicz space duality), naturally induced by the utility function U.
Regarding U, it is here required that

and, apart from some minus signs, it coincides with the Fenchel conjugate of U (see Convex Duality). Thus, V is a convex function, which is identically equal to +∞ when y < 0. It is also differentiable on (0, +∞) and its derivative is $V' = -(U')^{-1}$. Traditionally, the inverse of the marginal utility $(U')^{-1}$ is denoted by I. By mere definition of V, for all x, y, the Fenchel inequality holds

$$U(x) \le xy + V(y) \qquad (14)$$

and the above relation is, in fact, an equality iff $y = U'(x)$ or equivalently $x = (U')^{-1}(y) = I(y)$. Also note that

$$U(x) = \inf_{y} \{xy + V(y)\} = \inf_{y > 0} \{xy + V(y)\} \qquad (15)$$

The typical example (and most used) is the following couple (U, V):
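The Fenchel inequality (14) and its equality case can be checked numerically. The sketch below is an illustration under stated assumptions, not the article's own computation: it takes the exponential utility $U(x) = -\frac{1}{\gamma}e^{-\gamma x}$, defines the conjugate as $V(y) = \sup_x \{U(x) - xy\}$ (the form consistent with (14) and (15)), and uses the closed form $V(y) = \frac{y}{\gamma}(\ln y - 1)$ for $y > 0$, which follows from the first-order condition $U'(x) = y$, that is, $x = I(y) = -\frac{1}{\gamma}\ln y$. The function names and the search interval are hypothetical choices.

```python
import math


def exp_utility(x, gamma):
    """Exponential utility U(x) = -exp(-gamma * x) / gamma."""
    return -math.exp(-gamma * x) / gamma


def marginal_inverse(y, gamma):
    """I(y) = (U')^{-1}(y) = -ln(y) / gamma, valid for y > 0."""
    return -math.log(y) / gamma


def conjugate_closed_form(y, gamma):
    """V(y) = (y / gamma) * (ln y - 1) for y > 0 (derived from U'(x) = y)."""
    return (y / gamma) * (math.log(y) - 1.0)


def conjugate_numeric(y, gamma, lo=-50.0, hi=50.0, n_iter=200):
    """Approximate V(y) = sup_x {U(x) - x*y} by ternary search.

    The map x -> U(x) - x*y is strictly concave, so ternary search on a
    bracketing interval [lo, hi] (an assumed search range) converges to
    the maximizer.
    """
    def objective(x):
        return exp_utility(x, gamma) - x * y

    for _ in range(n_iter):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective(0.5 * (lo + hi))


def fenchel_gap(x, y, gamma):
    """x*y + V(y) - U(x); nonnegative by (14), zero iff x = I(y)."""
    return x * y + conjugate_closed_form(y, gamma) - exp_utility(x, gamma)
```

For instance, `fenchel_gap(marginal_inverse(y, gamma), y, gamma)` should vanish up to rounding for any y > 0, while the gap is strictly positive at any other x.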
finite generalized entropy. Let us restate problem (2), the primal problem

out the value u (x) = u(x), one can now apply the traditional Lagrange multiplier method to get
Therefore, the (unique) optimal claim is $k^* = I\left(y^* \frac{dQ}{dP}\right)$ because it verifies the following:

• the balance equation $E_Q[k^*] = x$, so $k^* \in K_Q(x)$ and
• the Fenchel equality

$$U(k^*) = k^* y^* \frac{dQ}{dP} + V\left(y^* \frac{dQ}{dP}\right) \qquad (31)$$

from which, by taking the P-expectations, we get

$$E[U(k^*)] = y^* E\left[k^* \frac{dQ}{dP}\right] + E\left[V\left(y^* \frac{dQ}{dP}\right)\right] = y^* x + E\left[V\left(y^* \frac{dQ}{dP}\right)\right] \qquad (32)$$

which proves the main equality (28).
By market completeness, the martingale representation theorem applies, so that $k^*$ can be obtained via a self-financing strategy $H^*$:

$$k^* = x + \int_0^T H_t^*\,dS_t \qquad (33)$$

though $H^*$ is not admissible in general, that is, when optimally investing, the agent can incur arbitrarily large losses.
Moreover, as a function of x, the optimal value u(x) is also a utility function finite on $\mathbb{R}$, with the same properties of U. The duality equation (28) shows that u and

$$v(y) = \begin{cases} E\left[V\left(y \frac{dQ}{dP}\right)\right] & \text{if } y \ge 0 \\ +\infty & \text{otherwise} \end{cases} \qquad (34)$$

are conjugate functions.
The relationship between the primal and dual optima can also be expressed as

$$\frac{dQ}{dP} = \frac{1}{y^*} U'(k^*) \qquad (35)$$

which amounts to saying that $\frac{dQ}{dP}$ is proportional to one's marginal utility from the optimal investment. Therefore, in the complete market case, pricing by taking Q-expectations coincides with the pricing by marginal utility principle, introduced in the option pricing context by Davis [6].

Duality in Incomplete Market Models

The same methodology applies to the incomplete market framework, but the technicalities require some more effort. The main results are (more or less intuitive) generalizations of what happens in the complete case, as summarized below (see [2, 16] for the proofs).

1. The duality relation is the natural generalization of equation (28):

$$u(x) = \sup_{X_T \in K(x)} E[U(X_T)] = \inf_{y > 0,\, Q \in M} \left\{ xy + E\left[V\left(y \frac{dQ}{dP}\right)\right] \right\} \qquad (36)$$

and there exists a unique couple of dual minimizers $y^*, Q^*$.
2. As in the complete case, the supremum of the expected utility on K(x) may be not reached. $K_V(x)$ denotes the set of $k \in L^1(Q)$ such that $E_Q[k] \le x$ for all $Q \in M$ with finite generalized entropy. Then, the supremum of the expected utility on $K_V(x)$ coincides with the value u(x) and it is a maximum. The claim $k^* \in K_V$ attaining the maximum is unique and the relationship between primal and dual optima still holds

$$\frac{dQ^*}{dP} = \frac{1}{y^*} U'(k^*) \qquad (37)$$

3. $Q^*$ may be not equivalent to P. However, in the case $Q^* \sim P$, $k^*$ can be obtained through a self-financing strategy $H^*$, albeit not admissible in general.
4. The optimal value u as a function of the initial endowment x is a utility function, with the same properties of U. In fact, it is finite on $\mathbb{R}$, strictly concave, strictly increasing, it verifies the Inada conditions, and RAE(u) holds. The duality relation (36), rewritten as $u(x) = \inf_{y > 0} \{xy + v(y)\}$ with

$$v(y) = \begin{cases} \inf_{Q \in M} E\left[V\left(y \frac{dQ}{dP}\right)\right] & \text{if } y \ge 0 \\ +\infty & \text{otherwise} \end{cases} \qquad (38)$$
shows that u and v are conjugate functions (see Convex Duality).
As $Q^*$ results from a minimax theorem, it is also known as the minimax measure. For the applications, it is important to know that there are easy sufficient conditions that guarantee that $Q^*$ is equivalent to P, such as the following: (i) $U(+\infty) = +\infty$ as noted in [1] or (ii) in case $U(x) = -\frac{1}{\gamma} e^{-\gamma x}$, the existence of a $Q \in M^e$ with finite generalized entropy (see [17] for an extensive bibliography).
When $Q^*$ is indeed equivalent to P, its selection in the class of all risk-neutral, equivalent probabilities $M^e$ as the pricing measure is economically motivated by its proportionality to the marginal utility from the optimal investment.

Utility Maximization with Random Endowment

Under all the conditions stated above (on the market, on U, and on both), suppose that the agent has a random endowment B at T, in addition to the initial wealth x. For example, B can be the payoff of a European option expiring at T. The agent's goal is still the maximization of expected utility from terminal wealth, which now becomes

$$u(x, B) := \sup_{X_T \in K(x)} E[U(B + X_T)] \qquad (39)$$

The duality results, in this case, are similar to the ones just shown. In fact,

$$u(x, B) = \min_{y > 0,\, Q \in M} \left\{ xy + y E_Q[B] + E\left[V\left(y \frac{dQ}{dP}\right)\right] \right\} \qquad (40)$$

Note that the maximization without the claim can be seen as a particular case of the one above, with B = 0: u(x, 0) = u(x). The solution of a utility maximization problem with random endowment is the key step to the indifference pricing technique. The (buyer's) indifference price of B is, in fact, the unique price $p = p_B$ that solves

$$u(x - p, B) = u(x, 0) \qquad (41)$$

This means that the agent is indifferent, that is, he/she has the same (optimal expected) utility, between (i) paying $p_B$ at time t = 0 and receiving B at T and (ii) not entering into a deal for the claim B.

References

[1] Bellini, F. & Frittelli, M. (2002). On the existence of minimax martingale measures, Mathematical Finance 12/1, 1–21.
[2] Biagini, S. & Frittelli, M. (2005). Utility maximization in incomplete markets for unbounded processes, Finance and Stochastics 9, 493–517.
[3] Biagini, S. & Frittelli, M. (2008). A unified framework for utility maximization problems: an Orlicz space approach, Annals of Applied Probability 18/3, 929–966.
[4] Bismut, J.M. (1973). Conjugate convex functions in optimal stochastic control, Journal of Mathematical Analysis and Applications 44, 384–404.
[5] Cox, J.C. & Huang, C.F. (1989). Optimal consumption and portfolio policies when asset prices follow a diffusion process, Journal of Economic Theory 49, 33–83.
[6] Davis, M.H.A. (1997). Option pricing in incomplete markets, in Mathematics of Derivative Securities, M. Dempster & S.R. Pliska, eds, Cambridge University Press, pp. 216–227.
[7] Delbaen, F. & Schachermayer, W. (1994). A general version of the fundamental theorem of asset pricing, Mathematische Annalen 300, 463–520.
[8] He, H. & Pearson, N.D. (1991). Consumption and portfolio policies with incomplete markets and short-sale constraints: the infinite-dimensional case, Journal of Economic Theory 54, 259–304.
[9] Karatzas, I. & Shreve, S. (1998). Methods of Mathematical Finance, Springer.
[10] Karatzas, I., Shreve, S., Lehoczky, J. & Xu, G. (1991). Martingale and duality methods for utility maximization in an incomplete market, SIAM Journal on Control and Optimization 29, 702–730.
[11] Kramkov, D. & Schachermayer, W. (1999). The asymptotic elasticity of utility function and optimal investment in incomplete markets, Annals of Applied Probability 9/3, 904–950.
[12] Merton, R.C. (1969). Lifetime portfolio selection under uncertainty: the continuous-time case, The Review of Economics and Statistics 51, 247–257.
[13] Merton, R.C. (1971). Optimum consumption and portfolio rules in a continuous-time model, Journal of Economic Theory 3, 373–413.
[14] Pliska, S.R. (1986). A stochastic calculus model of continuous trading: optimal portfolios, Mathematics of Operations Research 11, 371–382.
[15] Rockafellar, R.T. (1974). Conjugate Duality and Optimization, Conference Board of Math. Sciences Series, SIAM Publications, No. 16.
[16] Schachermayer, W. (2001). Optimal investment in incomplete markets when wealth may become negative, Annals of Applied Probability 11/3, 694–734.
[17] Schachermayer, W. (2004). Portfolio Optimization in incomplete financial markets, Notes of the Scuola Normale Superiore di Pisa, Cattedra Galileiana, downloadable at http://www.fam.tuwien.ac.at/~wschach/pubs/
[18] Tobin, J. (1958). Liquidity preference as behavior towards risk, Review of Economic Studies 25, 68–85.

Related Articles

Complete Markets; Convex Duality; Equivalent Martingale Measures; Expected Utility Maximization; Merton Problem; Second Fundamental Theorem of Asset Pricing; Utility Function; Utility Indifference Valuation.

SARA BIAGINI
Change of Numeraire

Consider a financial market model with nondividend paying asset price processes $(S^0, S^1, \ldots, S^N)$ living on a filtered probability space $(\Omega, \mathcal{F}, \mathbf{F}, P)$, where $\mathbf{F} = \{\mathcal{F}_t\}_{t \ge 0}$ and P is the objective probability measure. For general results concerning completeness, self-financing portfolios, martingale measures, and arbitrage, see Arbitrage Strategy; Fundamental Theorem of Asset Pricing; Risk-neutral Pricing.
We choose the asset $S^0$ as the numeraire asset, and we assume that $S_t^0 > 0$ with probability 1. From general theory, we know that (modulo integrability and technical conditions) the market is free of arbitrage if and only if there exists a measure $Q^0 \sim P$ such that the normalized price processes

$$\frac{S_t^0}{S_t^0},\ \frac{S_t^1}{S_t^0},\ \ldots,\ \frac{S_t^N}{S_t^0}$$

are $Q^0$ martingales. Using the notation $Z^i = S^i/S^0$, thus we also have, apart from the nominal price system $S^0, S^1, \ldots, S^N$, the normalized price system $Z^0, Z^1, \ldots, Z^N$. The economic importance of the normalized system is clarified by the following standard result.

Proposition 1 With notation as defined above the following hold.

• A portfolio is self-financing in the S system if and only if it is self-financing in the Z system.
• A portfolio is an arbitrage opportunity in the S system if and only if it is an arbitrage in the Z system.
• The S market is complete if and only if the Z market is complete.
• In the Z market, the asset $Z^0$ has the property that $Z_t^0 \equiv 1$, so it represents a bank account with zero interest rate.

If $X \in \mathcal{F}_T$ is a fixed contingent claim with exercise date T, and if we denote the (not necessarily unique) arbitrage-free price process of X by $\Pi_t[X]$, then by applying the above-mentioned result to the extended market $S^0, S^1, \ldots, S^N, \Pi_t[X]$ we see that $\Pi_t[X]/S_t^0$ is a $Q^0$ martingale, and using this fact together with the obvious fact that $\Pi_T[X] = X$ we obtain the basic pricing formula:

$$\Pi_t[X] = S_t^0\, E^0\left[\left.\frac{X}{S_T^0}\,\right|\, \mathcal{F}_t\right] \qquad (1)$$

where $E^0$ denotes integration with respect to (w.r.t.) $Q^0$.
Very often one uses the bank account B with dynamics

$$dB_t = r_t B_t\,dt, \qquad B_0 = 1 \qquad (2)$$

where r is the short rate, as numeraire. The corresponding martingale measure $Q^B$ is then often denoted by Q and referred to as "the risk neutral martingale measure". In this case, the pricing formula becomes

$$\Pi_t[X] = E^Q\left[\left. e^{-\int_t^T r_s\,ds}\, X \,\right|\, \mathcal{F}_t\right] \qquad (3)$$

In many concrete situations, the computational work needed for the determination of arbitrage-free prices can be drastically reduced by a clever choice of numeraire, and the purpose of this article is to analyze such changes.
To set the scene, we consider a fixed risk neutral martingale measure Q for the numeraire B, and an alternative numeraire asset $S^0$ with the corresponding martingale measure $Q^0$. Our first task is to find the measure transformation between Q and $Q^0$.
To see what $Q^0$ must look like, we consider a fixed time T and an arbitrarily chosen T-claim X. Assuming enough integrability we then know that, by using B as the numeraire, the arbitrage-free price of X at time t = 0 is given as

$$\Pi_0[X] = E^Q\left[\frac{X}{B_T}\right] \qquad (4)$$

On the other hand, using $S^0$ as numeraire, the price is also given by the following formula:

$$\Pi_0[X] = S_0^0\, E^0\left[\frac{X}{S_T^0}\right] \qquad (5)$$

Defining the likelihood process L by $L_t = dQ^0/dQ$ on $\mathcal{F}_t$, we thus have

$$E^Q\left[\frac{X}{B_T}\right] = S_0^0\, E^Q\left[L_T \frac{X}{S_T^0}\right] \qquad (6)$$
Since this holds for all $X \in \mathcal{F}_T$, we have the following basic result.

Proposition 2 Under the above-mentioned assumptions, the likelihood process L, defined as

$$L_t = \frac{dQ^0}{dQ}, \quad \text{on } \mathcal{F}_t, \quad 0 \le t \le T \qquad (7)$$

is given by the formula

$$L_t = \frac{S_t^0}{S_0^0 \cdot B_t} \qquad (8)$$

We note that since $S^0/B$ is a Q martingale, the likelihood process L is also, as expected, a Q martingale.
As an immediate corollary we have the following.

Proposition 3 Assume that the $S^0$ dynamics under the Q measure are of the form

unique martingale measure $Q^0$. In more detail, the situation is as follows.

• If the market is incomplete, then there will exist several risk neutral measures Q.
• Each of these measures generates a different price system, defined by the pricing formula (3).
• Choosing one particular Q is thus equivalent to choosing one particular price system.
• For a given numeraire $S^0$, there will also exist several different martingale measures $Q^0$.
• Each of these measures generates a different price system, defined by the pricing formula (1).
• If a risk neutral measure Q and thus a price system are fixed, there exists a unique measure $Q^0$ such that $Q^0$ generates the same price system as Q.
• The measure transformations considered here are precisely those corresponding to a change of measure within a given price system.
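That a change of numeraire stays within one price system can be illustrated numerically. The sketch below is a hedged example under assumptions that are not part of the statements above: two assets following independent geometric Brownian motions under Q with constant short rate r and volatilities σ and δ, and the claim $X = \max(S_T^1 - S_T^0, 0)$. It prices X once with the bank account as numeraire, formula (4), and once with $S^0$ as numeraire, formula (5), simulating directly under $Q^0$, where by Girsanov's theorem with the likelihood process (8) the drift of $S^0$ becomes $r + \sigma^2$. Both estimates are compared with the zero-rate Black–Scholes expression on $Z = S^1/S^0$. All function and parameter names are hypothetical.

```python
import math
import random


def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_zero_rate(z, strike, vol, tau):
    """Black-Scholes call price with short rate r = 0."""
    d1 = (math.log(z / strike) + 0.5 * vol ** 2 * tau) / (vol * math.sqrt(tau))
    d2 = d1 - vol * math.sqrt(tau)
    return z * norm_cdf(d1) - strike * norm_cdf(d2)


def exchange_price_closed_form(s0, s1, sigma, delta, tau):
    """S^0_0 times a zero-rate call on Z = S^1/S^0 with strike 1."""
    vol = math.sqrt(sigma ** 2 + delta ** 2)
    return s0 * bs_call_zero_rate(s1 / s0, 1.0, vol, tau)


def exchange_price_mc(s0, s1, r, sigma, delta, tau, n_paths, seed, numeraire):
    """Monte Carlo price of max(S^1_T - S^0_T, 0).

    numeraire="bank":  simulate under Q and average X / B_T, as in (4).
    numeraire="asset": simulate under Q^0 and average S^0_0 * X / S^0_T, as in (5).
    """
    rng = random.Random(seed)
    sqrt_tau = math.sqrt(tau)
    total = 0.0
    for _ in range(n_paths):
        w0 = rng.gauss(0.0, 1.0) * sqrt_tau
        w1 = rng.gauss(0.0, 1.0) * sqrt_tau
        # S^1 has the same law under Q and Q^0 because W^1 is independent of W^0.
        s1_T = s1 * math.exp((r - 0.5 * delta ** 2) * tau + delta * w1)
        if numeraire == "bank":
            s0_T = s0 * math.exp((r - 0.5 * sigma ** 2) * tau + sigma * w0)
            total += math.exp(-r * tau) * max(s1_T - s0_T, 0.0)
        else:
            # Under Q^0 the Brownian motion driving S^0 picks up drift sigma.
            s0_T = s0 * math.exp((r + 0.5 * sigma ** 2) * tau + sigma * w0)
            total += s0 * max(s1_T / s0_T - 1.0, 0.0)
    return total / n_paths
```

Calling `exchange_price_mc(..., numeraire="bank")` and `exchange_price_mc(..., numeraire="asset")` with the same parameters should give estimates that agree with each other and with `exchange_price_closed_form` up to Monte Carlo error. The short rate r drops out of the closed form, which is one way to see why the numeraire $S^0$ is convenient for this claim.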
\[ = S_t^0 \, E^0\left[ \varphi(Z_T) \mid \mathcal{F}_t \right] \tag{15} \]

where $\varphi(z) = \Phi(1, z)$ and $Z_T = S_T^1/S_T^0$. Note that the factor $S_t^0$ is the price of the traded asset $S^0$ at time $t$, so this quantity does not have to be computed; it can be directly observed on the market. Thus, the computational work is reduced to computing a single integral.

As an example, assume that we have two stocks, $S^0$ and $S^1$, with price processes of the following form under the objective probability $P$:

\[ dS_t^1 = \beta S_t^1 \, dt + \delta S_t^1 \, d\tilde{W}_t^1 . \tag{17} \]

Here $\tilde{W}^0$ and $\tilde{W}^1$ are assumed to be independent $P$-Wiener processes, but it would also be easy to treat the case when there is a coupling between the two assets.

Under $Q$ the price dynamics will be given as

\[ dS_t^0 = r S_t^0 \, dt + \sigma S_t^0 \, dW_t^0 \tag{18} \]
\[ dS_t^1 = r S_t^1 \, dt + \delta S_t^1 \, dW_t^1 \tag{19} \]

where $W^0$ and $W^1$ are $Q$-Wiener processes, and from Proposition 3 it follows that the Girsanov transformation from $Q$ to $Q^0$ has a likelihood process with dynamics given as

\[ dL_t = L_t \sigma \, dW_t^0 \tag{20} \]

The $T$-claim to be priced is an exchange option, which gives the holder the right, but not the obligation, to exchange one $S^0$ share for one $S^1$ share at time $T$. Formally, this means that the claim is given by $X = \max[S_T^1 - S_T^0, 0]$, and we note that we have a linearly homogeneous contract function. From equation (15), the price process is given as

\[ \Pi_t[X] = S_t^0 \, E^0\left[ \max[Z_T - 1, 0] \mid \mathcal{F}_t \right] \tag{21} \]

with $Z_t = S_t^1/S_t^0$. We are thus, in fact, valuing a European call option on $Z_T$, with strike price $K = 1$. By construction, $Z$ will be a $Q^0$-martingale, and since a Girsanov transformation will not affect the volatility, it follows easily from equations (16) and (17) that the $Q^0$-dynamics of $Z$ are given by

\[ dZ_t = Z_t \sqrt{\sigma^2 + \delta^2} \, dW_t \tag{22} \]

where $W$ is a standard $Q^0$-Wiener process. The price is thus given by the following formula:

\[ \Pi_t[X] = S_t^0 \cdot c(t, Z_t) \tag{23} \]

Here $c(t, z)$ is given directly by the Black–Scholes formula as the price of a European call option, valued at $t$, with time of maturity $T$, strike price $K = 1$, short rate $r = 0$, on a stock with volatility $\sqrt{\sigma^2 + \delta^2}$ and price $z$.

We now specialize the theory to the case when the chosen numeraire is a zero coupon bond. As can be expected, this choice of numeraire is particularly useful when dealing with interest rate derivatives. Suppose, therefore, that we are given a specified bond market model with a fixed risk neutral martingale measure $Q$ (always with $B$ as numeraire). For a fixed time of maturity $T$, we now choose the price process $p(t, T)$, of a zero coupon bond maturing at $T$, as our new numeraire.

Definition 1 The $T$-forward measure $Q^T$ is defined as

\[ L_t^T = \frac{dQ^T}{dQ} \tag{24} \]

on $\mathcal{F}_t$ for $0 \le t \le T$ where $L^T$ is defined as

\[ L_t^T = \frac{p(t, T)}{B_t \, p(0, T)} \tag{25} \]

Observing that $p(T, T) = 1$ we have the following useful pricing formula as an immediate corollary of Proposition 3.

Proposition 5 For any sufficiently integrable $T$-claim $X$, we have the pricing formula

\[ \Pi_t[X] = p(t, T) \, E^T\left[ X \mid \mathcal{F}_t \right] \tag{26} \]

where $E^T$ denotes integration w.r.t. $Q^T$.

Note again that the price $p(t, T)$ does not have to be computed. It can be observed directly on the market at time $t$.

A natural question to ask is when $Q$ and $Q^T$ coincide. This occurs if and only if we $Q$-a.s. have
wealth at time $T$ that is attainable from $x \in L_{\mathcal{F}_t}$ at $t$ by trading. Let $\mathcal{X}_T(x, t)$ denote the set of all such $X_T$, and set $\mathcal{X}_t(x) := \mathcal{X}_t(x, t-1)$. In addition to the liquid assets, there exist $J$ illiquid assets delivering payoffs $B = (B^j)_{1 \le j \le J} \in L_{\mathcal{F}_T}(\mathbb{R}^J)$ at $T$. A quantity $b \in \mathbb{R}^J$ of illiquid assets provides at $T$ the payoff $bB := \sum_j b^j B^j$. We assume that the market is free of arbitrage in the sense that the set $\mathcal{M}^e$ of equivalent probability measures $Q$ under which $S$ is a martingale (see Martingales) is nonempty. This is equivalent to assuming that all sets $\mathcal{D}_{s,t}$, $s < t \le T$, of conditional state-price densities are nonempty. Technically, $\mathcal{D}_{s,t}$ is the set of strictly positive $D_{s,t} \in L_{\mathcal{F}_t}$ satisfying $E_s[D_{s,t}] = 1$ and $E_s[D_{s,t} \int_s^t \vartheta \, dS] = 0$ for all $\vartheta \in \Theta$. For brevity, let $\mathcal{D}_t := \mathcal{D}_{t-1,t}$. State-price densities are related to the likelihood density process $Z_t = E_t[dQ/dP]$ of a $Q \in \mathcal{M}^e$ by $D_{s,t} = Z_t/Z_s$.

Conditional Utility Functions and Dual Problems

Our agent's objective is to maximize her expected utility (3) of wealth at $T$ for a direct utility function $U$, which is finite, differentiable, strictly increasing, and concave on all of $\mathbb{R}$, with $\lim_{x \to -\infty} U'(x) = \infty$ and $\lim_{x \to \infty} U'(x) = 0$. Holding position $(x, b) \in \mathbb{R} \times \mathbb{R}^J$ in liquid and illiquid assets at $t \le T$, she maximizes

for the conjugate function $V^b(y) := \operatorname{ess\,sup}_x \, (U(x + bB) - xy)$ ($y > 0$, $b \in \mathbb{R}^J$), with $V^b(y) = V^0(y) + ybB$ and $v_T^b(y) = V^b(y)$. For later arguments, we assume the following:

(A1) $u_t^b(x)$ satisfies condition (4) for any $t$, $b$, and the value functions are conjugate:

\[ v_t^b(y) = \operatorname*{ess\,sup}_{x \in \mathbb{R}} \, \left( u_t^b(x) - xy \right), \quad y > 0 \tag{6} \]
\[ u_t^b(x) = \operatorname*{ess\,inf}_{y > 0} \, \left( v_t^b(y) + xy \right), \quad x \in \mathbb{R} \tag{7} \]

(A2) For any $t \le T$ and $b, x, y \in L_{\mathcal{F}_{t-1}}$, there exist unique $\hat{X}_t^b(x)$ and $\hat{D}_t^b(y)$ that attain the single-period optima (8) and (9)

\[ u_{t-1}^b(x) = \operatorname*{ess\,sup}_{X_t \in \mathcal{X}_t(x, t-1)} E_{t-1}\left[ u_t^b(X_t) \right] = E_{t-1}\left[ u_t^b(\hat{X}_t^b(x)) \right] \tag{8} \]
\[ v_{t-1}^b(y) = \operatorname*{ess\,inf}_{D_t \in \mathcal{D}_t} E_{t-1}\left[ v_t^b(y D_t) \right] = E_{t-1}\left[ v_t^b(y \hat{D}_t^b(y)) \right] \tag{9} \]

and satisfy ${u_t^b}'(\hat{X}_t^b(x)) = y \hat{D}_t^b(y)$ for $x$ and $y$ being related by ${u_{t-1}^b}'(x) = y$.

(A3) For $t \le T$, $b, x, y \in L_{\mathcal{F}_t}$, unique optima $\hat{X}_T^b$ and $\hat{D}_{t,T}^b$ for the multiperiod problems (3), (5) are attained, and can be constructed by dynamical programming $\hat{X}_k^b = \hat{X}_k^b(\hat{X}_{k-1}^b)$ and $\hat{D}_{t,k}^b = \hat{D}_{t,k-1}^b \hat{D}_k^b(y \hat{D}_{t,k-1}^b)$ for $t < k \le T$, with $\hat{D}_k^b(\cdot)$ and $\hat{X}_k^b(\cdot)$ from A2 and
that they compensate for, that is, opportunities to her counterparty. Indeed, a strategy
θ ∈ would offer arbitrage profits to him, jointly
λπtb,x (δ 1 ) + (1 − λ)πtb,x (δ 2 ) with (βt ), if his gains
(13)
for δ, δ ∈ J
(11)
and that 1A πtb,x (δ) = 0 holds if 1A δB = 0. satisfy GT ≥ 0 and P [GT > 0]. Unwinding her
illiquid asset position at T , leaves her with final
Dynamic consistency with no arbitrage wealth
So far, we took the agent to trade optimally in liquid T
assets, while holding a fixed position b in illiquid T = x +
X
ϑ dS
assets. Now, suppose that she is ready to buy (or 0
sell) at her compensating variations shares of illiquid
T
,
assets in quantities as requested by another agent − t−1
πt−1
β Xt−1
(βt ) + βT B (14)
(he), dynamically over time. Let βt − β0 ∈ LFt−1 (J ) t=1
denote the cumulative position in illiquid assets she
has accepted until date t − 1, when she initially has Adding equation T (13) to equation β(14), would im-
held β0 := b ∈ J . At t − 1 < T , he chooses to sell ply E[ubT (x + 0 θ + T )] = ub (x),
ϑ dS)] > E[uTT (X 0
βt ∈ LFt−1 (J ) illiquid assets. Given X t−1 is the contradicting definition (3).
wealth in liquid assets she arrived with at t − 1,
paying compensating variation changes her liquid Static no-arbitrage bounds
wealth to X t−1 − π βt−1 ,
t−1 := X Xt−1
(βt ) such that In particular, there is no arbitrage from buy(sell)-and-
t−1
βt−1
the utility of her position stays equal ut−1 (Xt−1 ) = hold strategies in illiquid assets. For x ∈ , b, δ ∈ J
βt t−1 ). Investing optimally for the next period
ut−1 (X it thus holds
according to A2 from her new position (X t−1 , βt )
(without knowing his future (βt+k )k≥1 ), she arrives πtb,x (δ) ≤ ess sup EtQ [δB]
at t with liquid wealth Q∈Me
t (15)
t−1 − π t−1 β ,
Xt−1
=: X t−1 (βt ) + ϑ dS (12) j T j
t−1 For replicable payoffs B j = Bt + t ϑ B dS with
j
ϑ B in for all j , Bt ∈ LFt (d ) the indifference
for an optimal strategy ϑ over (t − 1, t]. Given
an initial wealth X0 = x ∈ , the wealth process value πtb,x (δ) equals the replication cost (market
t is determined by compensating variations and
X price) δBt .
β
A2 such that (ut t (X t )) is a martingale. Trading
Marginal indifference values
against indifference valuations but not following
strategy ϑ would result in a suboptimal wealth In general, πtb,x (δ) is nonlinear in δ. Since
β
process Xt for which the utility process (ut t (Xt )) ε → ub+εδ
t (x − πtb,x (εδ)) is constant, it holds
is a supermartingale, therefore decreasing in the
mean. By accepting to trade illiquid assets against ∂ b,x gradb ubt (x)
πt (εδ) = δ (16)
u t (x)
∂ε b
her indifference values, she is not offering arbitrage ε=0
4 Utility Indifference Valuation
Hence, $R_0$ must be given by the ratio in (16) of marginal utilities at $t = 0$. If the agent is taken to be representative for the whole market, holding a net supply of $b$ illiquid assets, then $(R_t)$ could be interpreted as a partial equilibrium price process.

Numeraire dependence

In general, utility indifference values depend on the utility functions and the numeraire (unit of account) with respect to which they are defined. But it is possible to choose state-dependent utility functions with respect to another numeraire such that indifference values (and optimal strategies) become numeraire invariant. Let $(N_t)$ be the price process of a tradable numeraire, that is, $N_t = N_0 + \int_0^t \vartheta^N \, dS$ for $t \le T$, $\vartheta^N \in \Theta$, with $N > 0$. Then indifference values coincide, that means $\pi_{t,N}^{b,x}(\delta) = \pi_t^{b,xN_t}(\delta)/N_t$ holds, if utilities and payoffs with respect to $N$ satisfy the relations $u_{t,N}^b(x) = u_t^b(xN_t)$ (for $t = T$, hence all $t$) and $B_N := B/N_T$. Likewise, for numeraires $N$, $\tilde{N}$, the relations should be $u_{t,\tilde{N}}^b(x) = u_{t,N}^b(x \tilde{N}_t/N_t)$ and $B_{\tilde{N}} = B_N N_T/\tilde{N}_T$.

\[ \pi_t^{0,x}(\delta) = \operatorname*{ess\,inf}_{D \in \mathcal{D}_{t,T}} \left\{ E_t[D(\delta B)] + \frac{1}{\alpha} \left( E_t[D \log D] - E_t\left[ \hat{D}_{t,T}^0 \log \hat{D}_{t,T}^0 \right] \right) \right\} \tag{17} \]

for the indifference value $\pi_t^{b,x}(\delta) = \pi_t^{0,x}(\delta + b) - \pi_t^{0,x}(b)$, where $\hat{D}_{t,T}^0$ is the minimizer of equation (5) for $b = 0$ that satisfies $E_t[\hat{D}_{t,T}^0 \log \hat{D}_{t,T}^0] = \operatorname{ess\,inf}_{D_{t,T}} E_t[D_{t,T} \log D_{t,T}]$. By equation (17), utility indifference sell values $\delta B \mapsto -\pi_t^{b,x}(-\delta)$ are monotonic in $\alpha$, and satisfy the properties of convexity, translation invariance, and monotonicity, that constitute a convex risk measure (see Convex Risk Measures).

Under particular model assumptions, indifference values $\pi_t^{0,x}(\delta)$ can be computed by a backward induction scheme

\[ -\pi_{t-1}^{0,x}(\delta) = E_{t-1}^{Q^0}\left[ \frac{1}{\alpha} \log E_{\mathcal{G}_t}\left[ \exp\left( -\alpha \pi_t^{0,x}(\delta) \right) \right] \right] \tag{18} \]
starting from $\pi_T^{0,x}(\delta) = \delta B$. Roughly speaking, the assumptions needed comprise certain independence conditions plus semicompleteness of the market at each period. The scheme (18) has intuitive appeal, in showing that the indifference valuation is computed here by intertwining two well-known valuation methods: First, one takes an exponential certainty equivalent with respect to nontradable risk at the inner expectation (with $\mathcal{F}_{t-1} \subset \mathcal{G}_t \subset \mathcal{F}_t$); after that one takes a risk-neutral expectation of this certainty equivalent at the outer expectation (under the minimal entropy martingale measure), where, $\mathcal{G}_t$-risk is taken as replicable from $t - 1$. See [1, 18] for precise technical assumptions and examples.

Example in Continuous Time

\[ u_t^b(x) = -\frac{1}{\alpha} \exp\left( -\alpha \left( x + \frac{1}{2} \frac{\lambda^2}{\alpha} (T - t) + \pi_t^{0,x}(b) \right) \right), \quad x, b \in \mathbb{R} \tag{19} \]

are then attained by the optimal strategies $\hat{\vartheta}^b = \frac{\lambda}{\sigma \alpha} - b \frac{\tilde{\sigma}}{\sigma} \rho$, with

\[ \pi_t^{0,x}(b) = b Y_t + b \tilde{\sigma} (\tilde{\lambda} - \lambda \rho)(T - t) - \frac{1}{2} \alpha b^2 \tilde{\sigma}^2 (1 - \rho^2)(T - t) \tag{20} \]

Indifference values $\pi_t^{b,x}(\delta) = \pi_t^{0,x}(b + \delta) - \pi_t^{0,x}(b)$ for exponential utility do not depend on wealth $x$. Optimality of equation (19) and $\hat{\vartheta}^b$ follows by noting that $u_t^b(\int_0^t \vartheta \, dS)$ is a martingale for $\vartheta = \hat{\vartheta}^b$ and a supermartingale for any other $\vartheta \in \Theta$. Clearly, indifference buy (sell) values $\pi_t^{b,x}(\delta)$ (respectively $-\pi_t^{b,x}(-\delta)$) are decreasing (increasing) in the risk aversion $\alpha$. They are linear in the quantity $\delta$ only if correlation is perfect ($|\rho| = 1$). Then, they coincide with the replication cost of $\delta B$. Marginal utility indifference values are given by

\[ \frac{\partial}{\partial \delta} \pi_t^{b,x}(\delta) = Y_t + \tilde{\sigma} (\tilde{\lambda} - \lambda \rho)(T - t) - \alpha (b + \delta) \tilde{\sigma}^2 (1 - \rho^2)(T - t) \tag{21} \]

Under the (minimal entropy) martingale measure $dQ^0 = \exp(-\lambda W_T - \lambda^2 T/2) \, dP$ we have $S_t = S_0 + \sigma W_t^0$ for independent $Q^0$-Brownian motions $(W_t^0)$ and $(W_t^\perp)$. Indifference values can be expressed by

\[ -\pi_t^{0,x}(b) = \frac{1}{\alpha (1 - \rho^2)} \log E_t^{Q^0}\left[ \exp\left( \alpha (1 - \rho^2)(-bB) \right) \right] \tag{22} \]

$\frac{1}{2} \alpha b^2 \tilde{\sigma}^2 (1 - \rho^2) + b \tilde{\sigma} \sqrt{1 - \rho^2} \, W_1^\perp$ is normally distributed. Its standard deviation accounts for $\sqrt{1 - \rho^2} \times 100\%$ of that for the unhedged payoff $bB = bY_T$. For correlation $\rho = 80\%$, for example, the error size is still substantial at a ratio of 60%. Even for $\rho = 99\%$, it is still above 14%. To be compensated for the remaining risk in terms of her expected utility, the agent requires $-\pi_0^{0,x}(b) = \frac{1}{2} \alpha b^2 \tilde{\sigma}^2 (1 - \rho^2)$ at $t = 0$. Her compensating variation of wealth is proportional to the variance of $H$ and to her risk aversion $\alpha$.

Further Reading

To value options under transaction costs, indifference valuation was applied in [8]. The method is not limited to European payoffs. For payoffs with optimal exercise features, see [10, 17]. Indifference values for payoff streams could be defined by equation (1) for utilities that reflect preferences on future payment streams, like in [22]. For results on nonexponential
utilities, see [5, 12, 14]. For performance of utility-based hedging strategies, see [15]. Besides dynamical programming and convex duality, solutions have been obtained by backward stochastic differential equations (see Backward Stochastic Differential Equations) [10, 13, 20], also for non-convex closed constraints [9] and jumps [2]. For asymptotic results on valuation and hedging for small volumes, see [2, 5, 12, 13]. A Paretian equilibrium formulation for indifference pricing has been presented in [11]. Being nonlinear, indifference values can reflect diversification or accumulation of risk for applications areas like real options or insurance, see [6, 14, 19, 22]; but modeling and computation are more demanding, since a portfolio of assets cannot be valued by parts in general. Instead, each component is to be judged by its contribution to the overall portfolio. More comprehensive references are given in [1, 3, 4, 6, 16].

References

[1] Becherer, D. (2003). Rational hedging and valuation of integrated risks under constant absolute risk aversion, Insurance: Mathematics and Economics 33, 1–28.
[2] Becherer, D. (2006). Bounded solutions to backward SDEs with jumps for utility optimization and indifference hedging, Annals of Applied Probability 16, 2027–2054.
[3] Davis, M.H.A. (2006). Optimal hedging with basis risk, in From Stochastic Calculus to Mathematical Finance, Y. Kabanov, R. Liptser & J. Stoyanov, eds, Springer, Berlin, pp. 169–188.
[4] Foldes, L. (2000). Valuation and martingale properties of shadow prices: an exposition, Journal of Economic Dynamics and Control 24, 1641–1701.
[5] Henderson, V. (2002). Valuation of claims on non-traded assets using utility maximization, Mathematical Finance 12, 351–373.
[6] Henderson, V. & Hobson, D. (2008). Utility indifference pricing: an overview, in Indifference Pricing, R. Carmona, ed, Princeton University Press, pp. 44–74.
[7] Hicks, J.R. (1956). A Revision of Demand Theory, Oxford University Press, Oxford.
[8] Hodges, S.D. & Neuberger, A. (1989). Optimal replication of contingent claims under transaction costs, Review of Futures Markets 8, 222–239.
[9] Hu, Y., Imkeller, P. & Müller, M. (2005). Utility maximization in incomplete markets, Annals of Applied Probability 15, 1691–1712.
[10] Kobylanski, M., Lepeltier, J., Quenez, M. & Torres, S. (2002). Reflected backward SDE with super-linear quadratic coefficient, Probability and Mathematical Statistics 22, 51–83.
[11] Kramkov, D. & Bank, P. (2007). A model for large investor, where she trades at utility indifference prices of market makers, ICMS, Edinburgh, Presentation, www.icms.org.uk/downloads/quantfin/Kramkov.pdf
[12] Kramkov, D. & Sirbu, M. (2007). Asymptotic analysis of utility-based hedging strategies for small number of contingent claims, Stochastic Processes and Applications 117, 1606–1620.
[13] Mania, M. & Schweizer, M. (2005). Dynamic exponential utility indifference valuation, Annals of Applied Probability 15, 2113–2143.
[14] Møller, T. (2003). Indifference pricing of insurance contracts: applications, Insurance: Mathematics and Economics 32, 295–315.
[15] Monoyios, M. (2004). Performance of utility-based strategies for hedging basis risk, Quantitative Finance 4, 245–255.
[16] Musiela, M. & Zariphopoulou, T. (2004). An example of indifference pricing under exponential preferences, Finance and Stochastics 8, 229–239.
[17] Musiela, M. & Zariphopoulou, T. (2004). Indifference prices of early exercise claims, in Mathematics of Finance, G. Yin & Q. Zhang, eds, Contemporary Mathematics, AMS, Vol. 351, pp. 259–273.
[18] Musiela, M. & Zariphopoulou, T. (2004). A valuation algorithm for indifference prices in incomplete markets, Finance and Stochastics 8, 399–414.
[19] Porchet, A., Touzi, N. & Warin, X. (2008). Valuation of power plants by utility indifference and numerical computation, Mathematical Methods of Operations Research [Online], DOI: 10.007/s00186-008-0231-z.
[20] Rouge, R. & El Karoui, N. (2000). Pricing via utility maximization and entropy, Mathematical Finance 10, 259–276.
[21] Schachermayer, W. (2002). Optimal investment in incomplete financial markets, in Mathematical Finance: Bachelier Congress 2000, H. Geman, D. Madan & S.R. Pliska, eds, Springer, Berlin, pp. 427–462.
[22] Smith, J.E. & Nau, R.F. (1995). Valuing risky projects: option pricing theory and decision analysis, Management Science 41, 795–816.

Related Articles

Complete Markets; Expected Utility Maximization: Duality Methods; Good-deal Bounds; Hedging; Minimal Entropy Martingale Measure; Utility Theory: Historical Perspectives; Utility Function.

DIRK BECHERER
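As a numerical companion to the continuous-time example in the utility indifference article above, the following Python sketch recomputes the residual-risk ratios quoted there (60% for correlation 80%, above 14% for 99%) and the quadratic risk compensation at $t = 0$. It is a sketch under assumptions: the function names and the parameter name `sigma_y` (the volatility of the nontraded payoff, with unit horizon) are hypothetical labels introduced here, and the sample values of `alpha` and `sigma_y` are not from the article.

```python
from math import sqrt


def residual_risk_ratio(rho: float) -> float:
    """Standard deviation of the hedging error relative to the unhedged payoff."""
    return sqrt(1.0 - rho**2)


def risk_compensation(alpha: float, b: float, sigma_y: float, rho: float) -> float:
    """Compensation 0.5 * alpha * b^2 * sigma_y^2 * (1 - rho^2) at t = 0.

    alpha is the absolute risk aversion, b the quantity of the illiquid
    payoff, sigma_y its volatility (assumed label), rho the correlation
    with the traded asset. The horizon is taken as one time unit.
    """
    return 0.5 * alpha * b**2 * sigma_y**2 * (1.0 - rho**2)


ratio_80 = residual_risk_ratio(0.80)
ratio_99 = residual_risk_ratio(0.99)
print(f"rho = 80%: residual ratio {ratio_80:.1%}")
print(f"rho = 99%: residual ratio {ratio_99:.1%}")

# The compensation is quadratic in the quantity b, so doubling the position
# quadruples it; it vanishes only for perfect correlation.
comp_1 = risk_compensation(alpha=2.0, b=1.0, sigma_y=0.2, rho=0.8)
comp_2 = risk_compensation(alpha=2.0, b=2.0, sigma_y=0.2, rho=0.8)
print(comp_1, comp_2)
```

The quadratic dependence on `b` is the nonlinearity in quantity noted in the text: indifference values are linear in the quantity only when $|\rho| = 1$.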
Superhedging

Pricing and hedging of contingent claims are the two main problems of mathematical finance. They both have a clear and transparent solution when the underlying market model is complete, that is, for each contingent claim with promised payoff $H$ there exists a self-financing admissible trading strategy whose wealth at maturity equals $H$ (see Complete Markets). Such a strategy is called the hedging strategy of the contingent claim $H$. The smallest initial wealth that allows one to reach $H$ at maturity via admissible trading is called the hedging price of $H$.

Under a suitable no-arbitrage assumption (see Fundamental Theorem of Asset Pricing), the second fundamental theorem of asset pricing (see Second Fundamental Theorem of Asset Pricing) states that replicability of every contingent claim is equivalent to the uniqueness of the equivalent martingale measure (see Equivalent Martingale Measures).

It turns out that in a complete market (see Complete Markets), the hedging price at time $t = 0$ of a contingent claim $H$, denoted by $p(H)$, coincides with the expectation of discounted $H$ under the unique equivalent martingale measure $\mathbb{Q}$, that is, $p(H) = \mathbb{E}^{\mathbb{Q}}[D_T H]$ where $D_T$ is a discounting factor over $[0, T]$.

If the market model is incomplete, there exist contingent claims that are not perfectly replicable via admissible trading strategies. In other words, in such financial models, contingent claims are not redundant assets. Therefore, since perfect replicability cannot always be achieved, this requirement has to be relaxed. One way of doing this consists in introducing the concept of superhedging.

Given a contingent claim $H$ with maturity $T > 0$, a superhedging strategy for $H$ is an admissible trading strategy such that its terminal wealth $V_T$ super-replicates $H$, that is, $V_T \ge H$. The superhedging price of $H$ is the smallest initial endowment that allows an investor to super-replicate $H$ at maturity; in other words, it is the initial value $V_0$ of the superhedging strategy of $H$.

Superhedging was introduced and investigated first by El Karoui and Quenez [13, 14] in a continuous-time setting where the risky assets follow a multidimensional diffusion process. Independently, Naik and Uppal [25] studied the same problem in a discrete-time model with finite set of scenarios and noticed that, in the presence of leverage constraints, superhedging may be cheaper than perfect hedging. The same phenomenon has been observed by Benais et al. [1] in the presence of transaction costs.

The characterization of superhedging strategies and prices is the object of a family of results called superhedging theorems.

Superhedging Theorems

A large literature has been devoted to characterizing the set of all initial endowments that allow one to superhedge a contingent claim $H$ as a first crucial step to compute the superhedging price, the infimum of that set. In this article, we focus essentially on continuous-time hedging of European options, that is, with a fixed exercise time $T$, and distinguish between two cases: frictionless incomplete markets and markets with frictions. For superhedging in discrete-time models and for American options, the interested reader could see respectively Föllmer and Schied's book [17] and American Options.

Frictionless Incomplete Markets

To facilitate the discussion, let us fix the notation first. We consider a market model composed of $d \ge 1$ risky assets whose discounted price dynamics is described by a càdlàg and locally bounded semimartingale $S = (S_t)_{t \in [0,T]}$, where $T > 0$ is a given finite time horizon. $S$ is defined on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and adapted to a filtration $(\mathcal{F}_t)_{t \in [0,T]}$ with $\mathcal{F}_t \subset \mathcal{F}$ for all $t \le T$ satisfying usual conditions. Notice that prices $S$ are already discounted; this is equivalent to assuming that the spot interest rate $r = 0$. This model is, in general, incomplete, that is, it may admit infinitely many equivalent martingale measures (see Equivalent Martingale Measures).

Let $H$ be a positive $\mathcal{F}_T$-measurable random variable, modeling the final payoff of a given contingent claim, for example, $H = (S_T - K)^+$, a European call option written on $S$, with maturity $T$ and strike price $K > 0$.

An admissible trading strategy is a couple $(x, \theta)$ where $x \in \mathbb{R}$ is an initial endowment and $\theta = (\theta_t)_{t \in [0,T]}$ a predictable $S$-integrable process, such that the corresponding wealth $V_t^{x,\theta} = x + \int_0^t \theta_u \, dS_u \ge -a$ for every $t \in [0, T]$ and for some threshold $a > 0$. We denote $\mathcal{A}$ as the set of all admissible strategies.
Definition 1 Let $H \ge 0$ be a given contingent claim. $(x, \theta) \in \mathcal{A}$ is a superhedging strategy for $H$ if $V_T^{x,\theta} \ge H$ $\mathbb{P}$-a.s. (almost surely). Moreover, the superhedging price $\bar{p}(H)$ of $H$ is given by

\[ \bar{p}(H) = \inf\left\{ x \in \mathbb{R} : \exists (x, \theta) \in \mathcal{A}, \ V_T^{x,\theta} \ge H \ \text{a.s.} \right\} \tag{1} \]

The fundamental result in the literature on superhedging is the dual characterization of the set $D_H$ of all initial endowments $x \in \mathbb{R}$ that allow $H$ to be superhedged. In an incomplete frictionless market, the relevant dual variables are the densities of all equivalent martingale measures $d\mathbb{Q}/d\mathbb{P}$. We denote $\mathcal{M}^e$ as the set of all equivalent (local) martingale measures for $S$. In this setting, the superhedging theorem states that

\[ D_H = \left\{ x \in \mathbb{R} : \mathbb{E}^{\mathbb{Q}}[H] \le x, \ \forall \mathbb{Q} \in \mathcal{M}^e \right\} \tag{2} \]

An important consequence of equation (2) is that the superhedging price $\bar{p}(H)$ satisfies

\[ \bar{p}(H) = \sup_{\mathbb{Q} \in \mathcal{M}^e} \mathbb{E}^{\mathbb{Q}}[H] \tag{3} \]

While an advantage of superhedging is that it is preference free, from the previous characterization of $\bar{p}(H)$ as the biggest expectation $\mathbb{E}^{\mathbb{Q}}[H]$ over all equivalent martingale measures, it becomes apparent that pursuing a superhedging strategy can be too expensive, depending on the financial model and on the constraints on portfolios. This is the main disadvantage of such a criterion, which is, nonetheless, of great interest as a benchmark. Moreover, for an agent with a large risk aversion and under transaction costs (see the section Markets with Frictions), the reservation price approaches the superhedging price, as established in [2].

El Karoui and Quenez [13, 14] first proved the superhedging theorem in an Itô diffusion setting and Delbaen and Schachermayer [10, 11] generalized it to, respectively, a locally bounded and unbounded semimartingale model, using a Hahn–Banach separation argument.

The superhedging theorem can be extended in order to characterize the dynamics of the minimal superhedging portfolio of a contingent claim $H$, that is, the cheapest at any time $t$ of all superhedging portfolios of $H$ with the same initial wealth. This extension is a consequence of the so-called optional decomposition of supermartingales. The optional decomposition was first proved in [13, 14] for diffusions and then extended to general semimartingales by Kramkov [24], Föllmer and Kabanov [15], and Delbaen and Schachermayer [12]. This is a very deep result of the general theory of stochastic processes and roughly states that any càdlàg positive $\mathbb{Q}$-supermartingale $X$, for any $\mathbb{Q} \in \mathcal{M}^e$, can be decomposed as follows:

\[ X_t = X_0 + \int_0^t \theta_u \, dS_u - C_t, \quad t \in [0, T] \tag{4} \]

where $\theta$ is a predictable, $S$-integrable process and $C$ an increasing optional process, to be interpreted as a cumulative consumption process. What is remarkable is that the local martingale part can be represented as a stochastic integral with respect to $S$ so that it is a local martingale under any equivalent martingale measure $\mathbb{Q}$. In this sense, decomposition (4) is universal. The price to pay is that the increasing process $C$ is, in general, not predictable as in the Doob–Meyer decomposition (see Doob–Meyer Decomposition) but only optional. The process $C$ has the economic interpretation of cumulative consumption.

The decomposition (4) implies that the wealth dynamics of the minimal superhedging portfolio for a contingent claim $H$ is given by

\[ V_t = \operatorname*{ess\,sup}_{\mathbb{Q} \in \mathcal{M}^e} \mathbb{E}^{\mathbb{Q}}[H \mid \mathcal{F}_t], \quad t \in [0, T] \tag{5} \]

An analogous result holds for American contingent claims too (see [13–15, 24] for details at increasing levels of generality).

Finally, in the more specific setting of stochastic volatility models, Cvitanić et al. [8] compute the superhedging strategy and price for a contingent claim $H = g(S_T)$, yielding that the former is a buy-and-hold strategy and so the latter is just $S_0$. The same study is carried over under portfolio constraints.

Markets with Frictions

In the previous section, we made the implicit assumption that investors can trade in continuous time and without frictions. This is clearly a strong idealization of the real world; that is why during the last 15 years much effort has been devoted to the superhedging approach under various types of trading constraints.
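Returning briefly to the frictionless duality in equations (1) and (3), it can be checked by hand in a toy model. The Python sketch below uses a one-period trinomial market, which is an assumption made purely for illustration (the article itself works in continuous time); the function names and the numbers are hypothetical. The primal side searches for the cheapest initial endowment that dominates the payoff in every state, and the dual side approximates the supremum of expected payoffs over equivalent martingale measures on a grid.

```python
from itertools import combinations


def superhedge_primal(s0, states, payoff):
    """Smallest x with x + theta * (s - s0) >= payoff(s) in every state.

    For fixed theta the cheapest endowment is max_i (h_i - theta * ds_i),
    a convex piecewise-linear function of theta. Assuming no arbitrage
    (price moves of both signs), its minimum sits at a kink, that is, at
    an intersection of two of the lines.
    """
    ds = [s - s0 for s in states]
    h = [payoff(s) for s in states]

    def cost(theta):
        return max(hi - theta * di for hi, di in zip(h, ds))

    candidates = [(h[i] - h[j]) / (ds[i] - ds[j])
                  for i, j in combinations(range(len(states)), 2)
                  if ds[i] != ds[j]]
    theta = min(candidates, key=cost)
    return cost(theta), theta


def dual_sup(s0, states, payoff, n=10_000):
    """Grid approximation of sup E^Q[H] over equivalent martingale measures.

    Written for exactly three states (down, middle, up) with zero interest
    rate: q_up runs over a grid, the other two weights follow from the
    martingale and normalisation conditions.
    """
    d, m, u = (s - s0 for s in states)
    h = [payoff(s) for s in states]
    best = float("-inf")
    for k in range(1, n):
        qu = k / n
        qd = (-qu * u - (1.0 - qu) * m) / (d - m)
        qm = 1.0 - qu - qd
        if qd > 0.0 and qm > 0.0:  # equivalent measure: all weights positive
            best = max(best, qd * h[0] + qm * h[1] + qu * h[2])
    return best


call = lambda s: max(s - 100.0, 0.0)
primal_price, theta = superhedge_primal(100.0, [80.0, 100.0, 120.0], call)
dual_price = dual_sup(100.0, [80.0, 100.0, 120.0], call)
print(primal_price, theta, dual_price)
```

In this toy market the supremum on the dual side is approached but not attained by equivalent measures, since the maximizing weights put zero mass on the middle state; the grid value therefore stays slightly below the primal price.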
Transaction Costs. Financial models with proportional transaction costs were studied first by Jouini and Kallal [19] and then generalized in a series of papers by Kabanov and his coauthors [20–22].

For the reader's convenience, we briefly introduce the model, following the bid–ask matrix formalism introduced by Schachermayer [27], which is only one of many equivalent convenient ways of describing it (see, e.g., [22] and Transaction Costs for more details).

We consider an economy with $d \ge 1$ risky assets (e.g., foreign currencies); $\pi_t^{ij}(\omega)$ denotes the number of physical units of asset $i$ that can be exchanged with 1 unit of asset $j$ at time $t \in [0, T]$. All of them are assumed to be adapted to some filtration and càdlàg. An important role is played by the so-called solvency region $K_t(\omega)$, the cone generated by the unit vectors $e_i$ and $\pi^{ij} e_i - e_j$ for $1 \le i, j \le d$. Elements of $K_t(\omega)$ are all the positions that can be liquidated into a portfolio with a nonnegative quantity of each currency. We denote $K_t^*(\omega)$ as the positive polar of $K_t(\omega)$.

A self-financing portfolio process is modeled by a $d$-dimensional finite variation process $V = (V_t)_{t \in [0,T]}$ such that each infinitesimal change $dV_t(\omega)$ lies in $-K_t(\omega)$, that is, a portfolio change at time $t$ has to be done according to the trading terms described by the solvency cone $K_t$.

In this setting, so-called strictly consistent price systems play the same role as the equivalent martingale measures. A strictly consistent price system $Z$ is a positive non-null $d$-dimensional martingale such that each $Z_t(\omega)$ belongs to the relative interior of $K_t^*(\omega)$ almost surely for all $t \in [0, T]$. We denote $\mathcal{Z}^s$ as the set of all strictly consistent price systems. A standard assumption is that there exists at least one such $Z$, that is, $\mathcal{Z}^s \ne \emptyset$, which is equivalent to some kind of no-arbitrage condition (see Transaction Costs for details).

Let $H = (H^1, \ldots, H^d)$ be a $d$-dimensional contingent claim such that $H + a\mathbf{1} \in K_T$ for some $a \in \mathbb{R}$. We say that an admissible$^a$ portfolio $V$ superhedges $H$ if $V_T - H \in K_T$. Consider the set $D_H$ of all initial endowments $x \in \mathbb{R}^d$ such that there exists an admissible portfolio $V$, $V_0 = x$, that superhedges $H$. In this model, the superhedging theorem states that

\[ D_H = \left\{ x \in \mathbb{R}^d : \mathbb{E}[\langle Z_T, H \rangle] \le \langle x, Z_0 \rangle, \ \forall Z \in \mathcal{Z}^s \right\} \tag{6} \]

where $\langle \cdot, \cdot \rangle$ denotes the usual scalar product in $\mathbb{R}^d$. This theorem has been proven with increasing degree of generality by Cvitanić and Karatzas [7], Kabanov [20], and Kabanov and Last [21] for continuous bid–ask processes $(\pi_t)_{t \in [0,T]}$ and constant proportional transaction costs, by Kabanov and Stricker [22] under slightly more general assumptions and finally, motivated by a counterexample constructed by Rásonyi [26], Campi and Schachermayer [5] extend it to discontinuous $\pi$.

Explicit computations of the superhedging price have been performed in [3, 9, 18] for a European-type contingent claim $H = g(S_T)$, where $S_T$ is the price at time $T$ of a given asset in terms of some fixed numéraire. Under different assumptions, the superhedging strategy is a buy-and-hold one, so that the corresponding superhedging price is the price at time $t = 0$ of the underlying $S_0$.

Finally, duality methods for American options under proportional transaction costs are briefly treated in Transaction Costs.

Other Types of Market Frictions. Superhedging has also been studied under other types of constraints on, for example, shortselling and/or borrowing (see, e.g., Cvitanić and Karatzas' paper [6] and Karatzas and Shreve's book [23], Chapter 5, for more details). Very often, an agent willing to superhedge a contingent claim $H$ has to choose a strategy fulfilling a given set of constraints. Let us denote $\mathcal{A}_c$ as the class of constrained trading strategies. In this case, the constrained superhedging price $\bar{p}_c(H)$ is given by

\[ \bar{p}_c(H) = \inf\left\{ x \in \mathbb{R}^d : \exists (x, \theta) \in \mathcal{A}_c, \ V_T^{x,\theta} \ge H \right\} \tag{7} \]

Cvitanić and Karatzas [6] gave the first dual characterization of $\bar{p}_c(H)$ in a diffusions setting, which was further generalized to general semimartingales by Föllmer and Kramkov [16] via a constrained version of the optional decomposition theorem, whose original version we already discussed at the end of the section Frictionless Incomplete Markets.

We conclude by mentioning a recent series of papers by Broadie et al. [4] and by Soner and Touzi [28, 29] on superhedging under gamma constraints, where an agent is allowed to hedge $H$, having at the same time a control on the gamma of his or her portfolios.
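To make the solvency region of the transaction cost model concrete, here is a minimal Python sketch for the special case of $d = 2$ assets at one fixed time and state. It is an illustration under assumptions, not part of the article: the function name `is_solvent_2d` and the sample bid–ask values are hypothetical, and the check assumes $\pi^{12} \pi^{21} \ge 1$, so that a round trip between the two assets cannot create value.

```python
def is_solvent_2d(v, pi12, pi21):
    """Check whether position v = (v1, v2) lies in the solvency cone for d = 2.

    pi12: units of asset 1 that can be exchanged for one unit of asset 2.
    pi21: units of asset 2 that can be exchanged for one unit of asset 1.
    Assumes pi12 * pi21 >= 1 (illustrative assumption, see lead-in).
    """
    v1, v2 = v
    if v1 >= 0 and v2 >= 0:
        return True  # already nonnegative in both assets
    if v1 >= 0 > v2:
        # cover the short position in asset 2 by giving up asset 1
        return v1 >= pi12 * (-v2)
    if v2 >= 0 > v1:
        # cover the short position in asset 1 by giving up asset 2
        return v2 >= pi21 * (-v1)
    return False  # short in both assets: cannot be liquidated


# Hypothetical symmetric 5% proportional cost.
print(is_solvent_2d((1.10, -1.00), 1.05, 1.05))
print(is_solvent_2d((1.00, -1.00), 1.05, 1.05))
print(is_solvent_2d((1.05, -1.00), 1.05, 1.05))
```

The last call evaluates the generator $\pi^{12} e_1 - e_2$ itself, which lies on the boundary of the cone: it can be liquidated exactly to the zero portfolio.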
End Notes

a. We remark en passant that the notion of admissibility in the presence of transaction costs, that we do not give here, is a subtle one. The interested reader could look at [5] for a short discussion.

References

[1] Benais, B., Lesne, J.P., Pagès, H. & Scheinkman, J. (1992). Derivative asset pricing with transaction costs, Mathematical Finance 2, 63–86.
[2] Bouchard, B., Kabanov, Yu.M. & Touzi, N. (2001). Option pricing by large risk aversion utility under transaction costs, Decisions in Economics and Finance 24, 127–136.
[3] Bouchard, B. & Touzi, N. (2000). Explicit solution of the multivariate super-replication problem under transaction costs, Annals of Applied Probability 10, 685–708.
[4] Broadie, M., Cvitanić, J. & Soner, H.M. (1998). Optimal replication of contingent claims under portfolio constraints, The Review of Financial Studies 11, 59–79.
[5] Campi, L. & Schachermayer, W. (2006). A super-replication theorem in Kabanov’s model of transaction costs, Finance and Stochastics 10(4), 579–596.
[6] Cvitanić, J. & Karatzas, I. (1993). Hedging contingent claims with constrained portfolios, The Annals of Applied Probability 3(3), 652–681.
[7] Cvitanić, J. & Karatzas, I. (1996). Hedging and portfolio optimization under transaction costs: a martingale approach, Mathematical Finance 6(2), 133–165.
[8] Cvitanić, J., Pham, H. & Touzi, N. (1999). Super-replication in stochastic volatility models under portfolio constraints, Journal of Applied Probability 36(2), 523–545.
[9] Cvitanić, J., Pham, H. & Touzi, N. (1999). A closed form solution to the problem of super-replication under transaction costs, Finance and Stochastics 3, 35–54.
[10] Delbaen, F. & Schachermayer, W. (1994). A general version of the fundamental theorem of asset pricing, Mathematische Annalen 300, 463–520.
[11] Delbaen, F. & Schachermayer, W. (1998). The fundamental theorem of asset pricing for unbounded stochastic processes, Mathematische Annalen 312, 215–250.
[12] Delbaen, F. & Schachermayer, W. (1999). A compactness principle for bounded sequences of martingales with applications, Proceedings of the Seminar of Stochastic Analysis, Random Fields and Applications, Progress in Probability 45, 137–173.
[13] El Karoui, N. & Quenez, M.-C. (1991). Programmation dynamique et évaluation des actifs contingents en marché incomplet. (French) [Dynamic programming and pricing of contingent claims in an incomplete market], Comptes Rendus de l’Académie des Sciences Série Mathématiques 313(12), 851–854.
[14] El Karoui, N. & Quenez, M.-C. (1995). Dynamic programming and pricing of contingent claims in an incomplete market, SIAM Journal of Control and Optimization 33(1), 27–66.
[15] Föllmer, H. & Kabanov, Yu.M. (1998). Optional decomposition and Lagrange multipliers, Finance and Stochastics 2(1), 69–81.
[16] Föllmer, H. & Kramkov, D. (1997). Optional decompositions under constraints, Probability Theory and Related Fields 109, 1–25.
[17] Föllmer, H. & Schied, A. (2004). Stochastic Finance: An Introduction in Discrete Time, 2nd Edition, de Gruyter Studies in Mathematics, Berlin, P. 27.
[18] Guasoni, P., Rásonyi, M. & Schachermayer, W. (2007). Consistent price systems and face-lifting under transaction costs, Annals of Applied Probability 18(2), 491–520.
[19] Jouini, E. & Kallal, H. (1995). Martingales and arbitrage in securities markets with transaction costs, Journal of Economic Theory 66, 178–197.
[20] Kabanov, Yu.M. (1999). Hedging and liquidation under transaction costs in currency markets, Finance and Stochastics 3(2), 237–248.
[21] Kabanov, Yu.M. & Last, G. (2002). Hedging under transaction costs in currency markets: a continuous-time model, Mathematical Finance 12(1), 63–70.
[22] Kabanov, Yu. & Stricker, Ch. (2002). Hedging of contingent claims under transaction costs, in Advances in Finance and Stochastics. Essays in Honour of Dieter Sondermann, K. Sandmann & Ph. Schonbucher, eds, Springer, Berlin, Heidelberg, New York.
[23] Karatzas, I. & Shreve, S. (1998). Methods of Mathematical Finance, Springer.
[24] Kramkov, D. (1996). Optional decomposition of supermartingales and hedging contingent claims in incomplete security markets, Probability Theory and Related Fields 105, 459–479.
[25] Naik, V. & Uppal, R. (1994). Leverage constraints and the optimal hedging of stock and bond options, Journal of Financial and Quantitative Analysis 29(2), 199–222.
[26] Rásonyi, M. (2003). A Remark on the Superhedging Theorem Under Transaction Costs. Séminaires de Probabilités XXXVII, Lecture Notes in Mathematics, 1832, Springer, pp. 394–398.
[27] Schachermayer, W. (2004). The fundamental theorem of asset pricing under proportional transaction costs in finite discrete time, Mathematical Finance 14(1), 19–48.
[28] Soner, H.M. & Touzi, N. (2000). Super-replication under gamma constraints, SIAM Journal of Control and Optimization 39(1), 73–96.
[29] Soner, M. & Touzi, N. (2007). Hedging under gamma constraints by optimal stopping and face-lifting, Mathematical Finance 17(1), 59–80.

LUCIANO CAMPI
Free Lunch

In the process of building realistic mathematical models of financial markets, absence of opportunities for riskless profit is considered to be a minimal normative assumption in order for the market to be in an equilibrium state. The reason is quite obvious. If opportunities for riskless profit were present in the market, every economic agent would try to reap them. Prices would then instantaneously move in response to an imbalance between supply and demand. This sudden price movement would continue as long as opportunities for riskless profit are still present in the market. Therefore, in market equilibrium, no such opportunities should be possible.

The aforementioned simple and very natural idea has proved very fruitful and has led to great mathematical as well as economic insight in the theory of quantitative finance. A rigorous formulation of the exact definition of “absence of opportunities for riskless profit” turned out to be a highly nontrivial matter that troubled mathematicians and economists for at least two decades.ᵃ As the road unfolded, the valuable input of the theory of stochastic analysis in financial theory was obvious; in the other direction, the development of the theory of stochastic processes benefited immensely from problems that emerged purely from these financial considerations.

Since the late 1970s, there has been a notion that there is a deep connection between the absence of opportunities for riskless profit and the existence of a risk-neutral measure,ᵇ that is, a probability, equivalent to the original one, under which the discounted asset price processes have some kind of martingale property. Existence of such measures is of major practical importance, since they open the road to pricing illiquid assets or contingent claims in the market (see Risk-neutral Pricing). The result of the above notion has been called the fundamental theorem of asset pricing (FTAP); for a detailed account, see Fundamental Theorem of Asset Pricing.

The easiest and most classical way to formulate the notion of riskless profit is via the so-called arbitrage strategy (see Arbitrage Strategy). An arbitrage is a combination of positions in the traded assets that requires zero initial capital and results in a nonnegative outcome, with a strictly positive probability of the wealth being strictly positive at a fixed time point in the future (after liquidation has taken place).

Naturally, the previous formulation of an arbitrage presupposes that a probabilistic model for the random movement of liquid asset prices has been set up. In [5], a discrete state space, multiperiod discrete-time financial market was considered. For this model, the authors showed the equivalence between the economic “no arbitrage” (NA) condition and the mathematical stipulation of existence of an equivalent probability that makes the discounted asset price processes martingales.

Crucial in the proof of the result in [5] was the separating hyperplane theorem in finite-dimensional Euclidean spaces. One of the convex sets to be separated is the class of all terminal outcomes resulting from trading and possible consumption starting from zero capital; the other is the positive orthant. The NA condition is basically the statement that the intersection of these two convex sets consists of only the zero vector.

After the publication of [5], a saga of papers followed that were aimed, one way or another, at strengthening the conclusion by considering more complicated market models. It quickly became obvious that the previous NA condition is no longer sufficient to imply the existence of a risk-neutral measure; it is too weak. In infinite-dimensional spaces, separation by hyperplanes, made possible by means of the geometric version of the Hahn–Banach theorem, requires the closedness of the set C of all terminal outcomes resulting from trading and possible consumption starting from zero capital. The simple NA condition does not imply this, in general. This has led Kreps [7] to define a free lunch as a generalized, asymptotic form of an arbitrage.

Essentially, a free lunch is a possibly infinite-valued random variable f with P[f ≥ 0] = 1 and P[f > 0] > 0 that belongs to the closure of C. Once an appropriate topology is defined on L^0, the space of all random variables, in order for the last closure (call it C̄) to make sense, the “no-free-lunch” (NFL) condition states thatᶜ C̄ ∩ L^0_+ = {0}. Kreps [7] used this idea with a very weak topology on locally convex spaces and showed the existence of a separating measure.ᵈ However, apart from trivial cases, this topology does not stem from a metric, which means that closedness cannot be described in terms of convergence of sequences. This makes the definition of a free lunch quite nonintuitive.
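As a hedged illustration of the finite-state equivalence between NA and the existence of an equivalent martingale probability described above, the following minimal Python sketch treats a one-period market with a single discounted asset and finitely many states, each assumed to carry strictly positive probability under the original measure. The function name `martingale_measure` and the particular choice of weights are assumptions made for this example only; they are not taken from [5]. When no arbitrage exists, the sketch returns one of the possibly many strictly positive probability vectors under which the price increment has zero mean; when an arbitrage exists, it returns `None`.

```python
from fractions import Fraction


def martingale_measure(increments):
    """One-period, finite-state sketch (hypothetical helper).

    `increments` lists the discounted price change of the single risky
    asset in each state; every state is assumed to have positive
    probability under the original measure.  Returns strictly positive
    probabilities q with sum(q_i * increment_i) == 0, or None when an
    arbitrage exists.
    """
    ds = [Fraction(x) for x in increments]
    ups = [x for x in ds if x > 0]
    downs = [x for x in ds if x < 0]

    # If the price can only rise (or only fall), buying (or shorting)
    # the asset costs nothing and never loses: an arbitrage.
    if bool(ups) != bool(downs):
        return None

    # Unnormalized weights chosen so that the "up" states contribute +1
    # and the "down" states contribute -1 to the weighted mean; states
    # with zero increment receive weight 1.
    weights = []
    for x in ds:
        if x > 0:
            weights.append(1 / (len(ups) * x))
        elif x < 0:
            weights.append(1 / (len(downs) * -x))
        else:
            weights.append(Fraction(1))

    total = sum(weights)
    return [w / total for w in weights]
```

For the increments (-1, 0, 2) the sketch yields the probabilities (2/5, 2/5, 1/5), under which the increment has zero mean. For the increments (1, 2), buying the asset is an arbitrage and no such probability exists, so the function returns `None`.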
End Notes

a. The exact market viability definition is still sometimes the source of debate.
Minimal Entropy Martingale Measure

[...]

Q^{*,f} of D_f(· | P) is then called f-optimal ELMM.

In many situations arising in mathematical finance, f-optimal ELMMs come up via duality from expected utility maximization problems; see Expected Utility Maximization: Duality Methods; Expected Utility Maximization. One starts with a utility function U (see Utility Function) and obtains f (up to an affine function) as the convex conjugate of U, that is,

[...]

\[
\frac{dQ^E}{dP}\bigg|_{\mathcal{F}_T} = Z_T^E = Z_0 \exp\left( \int_0^T \vartheta_r^E \, dS_r \right) \tag{6}
\]

for some constant Z_0 > 0 and some predictable S-integrable process ϑ^E. This has been proved in [21] for models in finite discrete time and in [26, 28] in general; see also [23] for an application to finding optimal strategies in a Lévy process setting. Note, however, that representation (2) holds only at the time horizon, T; the density process

\[
Z_t^E = \frac{dQ^E}{dP}\bigg|_{\mathcal{F}_t} = E_P\left[ Z_T^E \mid \mathcal{F}_t \right], \quad 0 \le t \le T \tag{7}
\]

is usually quite difficult to find. We remark that the above results on both the equivalence to P and the structure of the f_E-optimal Q^E have versions for more general f-divergences [26]. (Essentially, equation (2) is relation (1) in the case of exponential utility, but it can also be proved directly without using general duality.)

The history of the minimal entropy martingale measure Q^E is not straightforward to trace. A general definition and an authoritative exposition are given by Frittelli [21]. However, the idea of the so-called minimax measures to link martingale measures via duality to utility maximization already appears, for instance, in [30, 31, 41]; see also [8]. Other early contributors include Miyahara [53], who used the term “canonical martingale measure”, and Stutzer [70]; some more historical comments and references are contained in [71]. Even before, in [20], it was shown that the property defining the MEMM is satisfied by the so-called minimal martingale measure if S is continuous and the so-called mean-variance trade-off of S has constant expectation over all ELMMs for S; see also Minimal Martingale Measure. The most prominent example for this occurs when S is a Markovian diffusion [53].

After the initial foundations, work on the MEMM has mainly concentrated on three major areas. The first aims to determine or describe the MEMM and, in particular, its density process Z^E more explicitly in specific models. This has been done, among others, for the following:

• stochastic volatility models: see [9, 10, 35, 62, 63], and compare also Volatility; Barndorff-Nielsen and Shephard (BNS) Models;
• jump-diffusions [54]; and
• Lévy processes (see Lévy Processes), both in general and in special settings: see [36] for an overview and [42, 43] for some examples. In particular, many studies have considered exponential Lévy models (see Exponential Lévy Models) where S = S_0 E(L) and L is a Lévy process under P. There, the existence of the MEMM Q^E reduces to an analytical condition on the Lévy triplet of L. Moreover, Q^E is then given by an Esscher transform (see Esscher Transform) and L is again a Lévy process under Q^E; see, for instance, [13, 19, 24, 39].

For continuous semimartingales S, an alternative approach is to characterize Z^E via semimartingale backward equations or backward stochastic differential equations [50, 52]. The results in [56, 57] use a mixture of the above ideas in a specific class of models.

The second major area is concerned with convergence questions. Several authors have proved, in several settings and with various techniques, that the minimal entropy martingale measure Q^E is the limit, as p ↘ 1, of the so-called p-optimal martingale measures obtained by minimizing the f-divergence associated to the function f(y) = y^p. This line of research was initiated in [27, 28], and later contributions include [39, 52, 65]. In [45, 60], this convergence is combined with the general duality (1) from utility maximization in order to obtain convergence results for optimal wealths and strategies as well.

The third, and by far the most important area of research on the MEMM, is centered on its link to the exponential utility maximization problem; see [8, 18] for a detailed exposition of this issue. More specifically, the MEMM is very useful when one studies the valuation of contingent claims by (exponential) utility indifference valuation; see Utility Indifference Valuation. To explain this, we fix an initial capital x_0 and a random payoff H due at time T. The maximal expected utility one can obtain by trading in S via some strategy ϑ, if one starts with x_0 and has to pay out H in T, is

\[
\sup_{\vartheta} E\left[ U\left( x_0 + \int_0^T \vartheta_r \, dS_r - H \right) \right] =: u(x_0; -H) \tag{8}
\]

and the utility indifference value x_H is then implicitly defined by

\[
u(x_0 + x_H; -H) = u(x_0; 0) \tag{9}
\]

Hence, x_H represents the monetary compensation required for selling H if one wants to achieve utility indifference at the optimal investment behavior. If U = U_α is exponential, its multiplicative structure
makes the analysis of the utility indifference value x_H tractable, in remarkable contrast to all other classical utility functions. Moreover, u(x_0; −H) as well as x_H and the optimal strategy ϑ^*_H can be described with the help of a minimal entropy martingale measure (defined here with respect to a new, H-dependent reference measure P_H instead of P). This topic was first studied in [4, 58, 59, 64]; later work has examined intertemporally dynamic extensions [5, 51], descriptions via backward stochastic differential equations (BSDEs) in specific models [6, 51], extensions to more general payoff structures [38, 47, 48, 61], and so on [29, 37, 69].

Apart from the above, there are a number of other areas where the minimal entropy martingale measure has come up; these include the following:

• option price comparisons [7, 11, 32–34, 55];
• generalizations or connections to other optimal ELMMs [2, 14, 15, 66]; see also Minimal Martingale Measure and [20];
• utility maximization with a random time horizon [12];
• good deal bounds [44]; see also Good-deal Bounds; and
• a calibration game [25].

There are also many papers that simply choose the MEMM as pricing measure for option pricing applications; especially in papers from the actuarial literature, this approach is often motivated by the connections between the MEMM and the Esscher transformation. Finally, we mention that the idea of looking for a martingale measure subject to a constraint on relative entropy also naturally comes up in calibration problems; see, for instance, [3, 16, 17] and Model Calibration.

References

[1] Acciaio, B. (2005). Absolutely continuous optimal martingale measures, Statistics and Decisions 23, 81–100.
[2] Arai, T. (2001). The relations between minimal martingale measure and minimal entropy martingale measure, Asia-Pacific Financial Markets 8, 137–177.
[3] Avellaneda, M. (1998). Minimum-relative-entropy calibration of asset pricing models, International Journal of Theoretical and Applied Finance 1, 447–472.
[4] Becherer, D. (2003). Rational hedging and valuation of integrated risks under constant absolute risk aversion, Insurance: Mathematics and Economics 33, 1–28.
[5] Becherer, D. (2004). Utility-indifference hedging and valuation via reaction-diffusion systems, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 460, 27–51.
[6] Becherer, D. (2006). Bounded solutions to backward SDEs with jumps for utility optimization and indifference hedging, Annals of Applied Probability 16, 2027–2054.
[7] Bellamy, N. (2001). Wealth optimization in an incomplete market driven by a jump-diffusion process, Journal of Mathematical Economics 35, 259–287.
[8] Bellini, F. & Frittelli, M. (2002). On the existence of minimax martingale measures, Mathematical Finance 12, 1–21.
[9] Benth, F.E. & Karlsen, K.H. (2005). A PDE representation of the density of the minimal entropy martingale measure in stochastic volatility markets, Stochastics 77, 109–137.
[10] Benth, F.E. & Meyer-Brandis, T. (2005). The density process of the minimal entropy martingale measure in a stochastic volatility model with jumps, Finance and Stochastics 9, 563–575.
[11] Bergenthum, J. & Rüschendorf, L. (2007). Convex ordering criteria for Lévy processes, Advances in Data Analysis and Classification 1, 143–173.
[12] Blanchet-Scalliet, C., El Karoui, N. & Martellini, L. (2005). Dynamic asset pricing theory with uncertain time-horizon, Journal of Economic Dynamics and Control 29, 1737–1764.
[13] Chan, T. (1999). Pricing contingent claims on stocks driven by Lévy processes, Annals of Applied Probability 9, 504–528.
[14] Choulli, T. & Stricker, C. (2005). Minimal entropy-Hellinger martingale measure in incomplete markets, Mathematical Finance 15, 465–490.
[15] Choulli, T. & Stricker, C. (2006). More on minimal entropy-Hellinger martingale measure, Mathematical Finance 16, 1–19.
[16] Cont, R. & Tankov, P. (2004). Nonparametric calibration of jump-diffusion option pricing models, Journal of Computational Finance 7, 1–49.
[17] Cont, R. & Tankov, P. (2006). Retrieving Lévy processes from option prices: regularization of an ill-posed inverse problem, SIAM Journal on Control and Optimization 45, 1–25.
[18] Delbaen, F., Grandits, P., Rheinländer, T., Samperi, D., Schweizer, M. & Stricker, C. (2002). Exponential hedging and entropic penalties, Mathematical Finance 12, 99–123.
[19] Esche, F. & Schweizer, M. (2005). Minimal entropy preserves the Lévy property: how and why, Stochastic Processes and their Applications 115, 299–327.
[20] Föllmer, H. & Schweizer, M. (1991). Hedging of contingent claims under incomplete information, in M.H.A. Davis & R.J. Elliott, eds, Applied Stochastic Analysis,
Stochastics Monographs, Gordon and Breach, London, Vol. 5, pp. 389–414.
[21] Frittelli, M. (2000). The minimal entropy martingale measure and the valuation problem in incomplete markets, Mathematical Finance 10, 39–52.
[22] Frittelli, M. (2000). Introduction to a theory of value coherent with the no-arbitrage principle, Finance and Stochastics 4, 275–297.
[23] Fujiwara, T. (2004). From the minimal entropy martingale measures to the optimal strategies for the exponential utility maximization: the case of geometric Lévy processes, Asia-Pacific Financial Markets 11, 367–391.
[24] Fujiwara, T. & Miyahara, Y. (2003). The minimal entropy martingale measures for geometric Lévy processes, Finance and Stochastics 7, 509–531.
[25] Glonti, O., Harremoes, P., Khechinashvili, Z., Topsøe, F. & Tbilisi, G. (2007). Nash equilibrium in a game of calibration, Theory of Probability and its Applications 51, 415–426.
[26] Goll, T. & Rüschendorf, L. (2001). Minimax and minimal distance martingale measures and their relationship to portfolio optimization, Finance and Stochastics 5, 557–581.
[27] Grandits, P. (1999). The p-optimal martingale measure and its asymptotic relation with the minimal entropy martingale measure, Bernoulli 5, 225–247.
[28] Grandits, P. & Rheinländer, T. (2002). On the minimal entropy martingale measure, Annals of Probability 30, 1003–1038.
[29] Grasselli, M. (2007). Indifference pricing and hedging for volatility derivatives, Applied Mathematical Finance 14, 303–317.
[30] He, H. & Pearson, N.D. (1991). Consumption and portfolio policies with incomplete markets and short-sale constraints: the finite-dimensional case, Mathematical Finance 1(3), 1–10.
[31] He, H. & Pearson, N.D. (1991). Consumption and portfolio policies with incomplete markets and short-sale constraints: the infinite dimensional case, Journal of Economic Theory 54, 259–304.
[32] Henderson, V. (2005). Analytical comparisons of option prices in stochastic volatility models, Mathematical Finance 15, 49–59.
[33] Henderson, V. & Hobson, D.G. (2003). Coupling and option price comparisons in a jump-diffusion model, Stochastics and Stochastics Reports 75, 79–101.
[34] Henderson, V., Hobson, D., Howison, S. & Kluge, T. (2005). A comparison of option prices under different pricing measures in a stochastic volatility model with correlation, Review of Derivatives Research 8, 5–25.
[35] Hobson, D. (2004). Stochastic volatility models, correlation, and the q-optimal measure, Mathematical Finance 14, 537–556.
[36] Hubalek, F. & Sgarra, C. (2006). Esscher transforms and the minimal entropy martingale measure for exponential Lévy models, Quantitative Finance 6, 125–145.
[37] İlhan, A., Jonsson, M. & Sircar, R. (2005). Optimal investment with derivative securities, Finance and Stochastics 9, 585–595.
[38] İlhan, A. & Sircar, R. (2006). Optimal static-dynamic hedges for barrier options, Mathematical Finance 16, 359–385.
[39] Jeanblanc, M., Klöppel, S. & Miyahara, Y. (2007). Minimal f^q-martingale measures for exponential Lévy processes, Annals of Applied Probability 17, 1615–1638.
[40] Kabanov, Y.M. & Stricker, C. (2002). On the optimal portfolio for the exponential utility maximization: remarks to the six-author paper, Mathematical Finance 12, 125–134.
[41] Karatzas, I., Lehoczky, J.P., Shreve, S.E. & Xu, G.L. (1991). Martingale and duality methods for utility maximization in an incomplete market, SIAM Journal on Control and Optimization 29, 702–730.
[42] Kassberger, S. & Liebmann, T. (2008). Minimal q-entropy Martingale Measures for Exponential Time-changed Lévy Processes and within Parametric Classes, preprint, University of Ulm, http://www.uni-ulm.de/mawi/finmath/people/kassberger.html
[43] Kim, Y.S. & Lee, J.H. (2007). The relative entropy in CGMY processes and its applications to finance, Mathematical Methods of Operations Research 66, 327–338.
[44] Klöppel, S. & Schweizer, M. (2007). Dynamic utility-based good deal bounds, Statistics and Decisions 25, 285–309.
[45] Kohlmann, M. & Niethammer, C.R. (2007). On convergence to the exponential utility problem, Stochastic Processes and their Applications 117, 1813–1834.
[46] Kramkov, D. & Schachermayer, W. (1999). The asymptotic elasticity of utility functions and optimal investment in incomplete markets, Annals of Applied Probability 9, 904–950.
[47] Leung, T. & Sircar, R. (2008). Exponential Hedging with Optimal Stopping and Application to ESO Valuation, preprint, Princeton University, http://ssrn.com/abstract=1111993
[48] Leung, T. & Sircar, R. (2009). Accounting for risk aversion, vesting, job termination risk and multiple exercises in valuation of employee stock options, Mathematical Finance 19, 99–128.
[49] Liese, F. & Vajda, I. (1987). Convex Statistical Distances, Teubner.
[50] Mania, M., Santacroce, M. & Tevzadze, R. (2003). A semimartingale BSDE related to the minimal entropy martingale measure, Finance and Stochastics 7, 385–402.
[51] Mania, M. & Schweizer, M. (2005). Dynamic exponential utility indifference valuation, Annals of Applied Probability 15, 2113–2143.
[52] Mania, M. & Tevzadze, R. (2003). A unified characterization of q-optimal and minimal entropy martingale measures by semimartingale backward equations, Georgian Mathematical Journal 10, 289–310.
[53] Miyahara, Y. (1995). Canonical martingale measures of incomplete assets markets, Probability Theory and Mathematical Statistics: Proceedings of the Seventh Japan-Russia Symposium, Tokyo, pp. 343–352.
[54] Miyahara, Y. (1999). Minimal entropy martingale measures of jump type price processes in incomplete assets markets, Asia-Pacific Financial Markets 6, 97–113.
[55] Møller, T. (2004). Stochastic orders in dynamic reinsurance markets, Finance and Stochastics 8, 479–499.
[56] Monoyios, M. (2006). Characterisation of optimal dual measures via distortion, Decisions in Economics and Finance 29, 95–119.
[57] Monoyios, M. (2007). The minimal entropy measure and an Esscher transform in an incomplete market, Statistics and Probability Letters 77, 1070–1076.
[58] Musiela, M. & Zariphopoulou, T. (2004). An example of indifference prices under exponential preferences, Finance and Stochastics 8, 229–239.
[59] Musiela, M. & Zariphopoulou, T. (2004). A valuation algorithm for indifference prices in incomplete markets, Finance and Stochastics 8, 399–414.
[60] Niethammer, C.R. (2008). On convergence to the exponential utility problem with jumps, Stochastic Analysis and Applications 26, 169–196.
[61] Oberman, A. & Zariphopoulou, T. (2003). Pricing early exercise contracts in incomplete markets, Computational Management Science 1, 75–107.
[62] Rheinländer, T. (2005). An entropy approach to the Stein and Stein model with correlation, Finance and Stochastics 9, 399–413.
[63] Rheinländer, T. & Steiger, G. (2006). The minimal entropy martingale measure for general Barndorff-Nielsen/Shephard models, Annals of Applied Probability 16, 1319–1351.
[64] Rouge, R. & El Karoui, N. (2000). Pricing via utility maximization and entropy, Mathematical Finance 10, 259–276.
[65] Santacroce, M. (2005). On the convergence of the p-optimal martingale measures to the minimal entropy martingale measure, Stochastic Analysis and Applications 23, 31–54.
[66] Santacroce, M. (2006). Derivatives pricing via p-optimal martingale measures: some extreme cases, Journal of Applied Probability 43, 634–651.
[67] Schachermayer, W. (2001). Optimal investment in incomplete markets when wealth may become negative, Annals of Applied Probability 11, 694–734.
[68] Schäl, M. (2000). Portfolio optimization and martingale measures, Mathematical Finance 10, 289–303.
[69] Stoikov, S. (2006). Pricing options from the point of view of a trader, International Journal of Theoretical and Applied Finance 9, 1245–1266.
[70] Stutzer, M. (1996). A simple nonparametric approach to derivative security valuation, Journal of Finance 51, 1633–1652.
[71] Stutzer, M.J. (2000). Simple entropic derivation of a generalized Black-Scholes option pricing model, Entropy 2, 70–77.

Related Articles

Entropy-based Estimation; Exponential Lévy Models; Minimal Martingale Measure; Risk-neutral Pricing; Semimartingale.

MARTIN SCHWEIZER
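The exponential form of the density of the minimal entropy martingale measure discussed in the article above can be made concrete in a toy setting. The following Python sketch is an illustration under stated assumptions, not the article's construction: it takes a one-period model with finitely many states, a single discounted price increment per state, and strictly positive original probabilities. In that setting a standard Lagrange-multiplier argument gives the entropy minimizer among martingale measures as an exponential tilt, q_i proportional to p_i exp(θ ΔS_i), with θ chosen so that the increment has zero mean. The function names, the bisection bracket [-50, 50] and the tolerance are hypothetical choices; the bracket is only adequate for increments of moderate size.

```python
import math


def minimal_entropy_measure(probs, increments, lo=-50.0, hi=50.0, tol=1e-12):
    """Exponential tilt q_i ~ p_i * exp(theta * dS_i) with E_q[dS] = 0.

    One-period, finite-state sketch (hypothetical helper).  `probs` are
    the original probabilities, `increments` the discounted price
    changes.  Returns (theta, q).
    """
    def tilted_mean(theta):
        # Unnormalized mean of dS under the tilted weights; it is
        # nondecreasing in theta, so bisection applies.
        return sum(p * x * math.exp(theta * x) for p, x in zip(probs, increments))

    if tilted_mean(lo) > 0 or tilted_mean(hi) < 0:
        raise ValueError("no root in the bracket; the model may admit arbitrage")

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tilted_mean(mid) > 0:
            hi = mid
        else:
            lo = mid

    theta = 0.5 * (lo + hi)
    weights = [p * math.exp(theta * x) for p, x in zip(probs, increments)]
    total = sum(weights)
    return theta, [w / total for w in weights]


def relative_entropy(q, p):
    """H(Q|P) = sum q_i * log(q_i / p_i) for finitely many states."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)
```

With original probabilities (0.2, 0.5, 0.3) and increments (-1, 0, 1), the tilted measure is a martingale measure whose relative entropy with respect to the original probabilities is smaller than that of, for example, the alternative martingale measure (0.3, 0.4, 0.3).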
Minimal Martingale Measure

Let S = (S_t) be a stochastic process on a filtered probability space (Ω, F, (F_t), P) that models the discounted prices of primary traded assets in a financial market. An equivalent local martingale measure (ELMM) for S is a probability measure Q equivalent to the original (historical) measure P such that S is a local Q-martingale (see Equivalent Martingale Measures). If S is a nonnegative P-semimartingale, the fundamental theorem of asset pricing says that an ELMM Q for S exists if and only if S satisfies the no-arbitrage condition (NFLVR), that is, admits no free lunch with vanishing risk (see Fundamental Theorem of Asset Pricing). By Girsanov’s theorem, S is then under P a semimartingale with a decomposition S = S_0 + M + A into a local P-martingale M and an adapted process A of finite variation. If S is special under P, then A can be chosen predictable and the resulting canonical decomposition of S is unique. We say that S satisfies the structure con-

[...]

locally risk-minimizing strategy for a given contingent claim H was obtained there (under some specific assumptions) as the integrand from the classical Galtchouk–Kunita–Watanabe decomposition of H under P̂. However, the introduction of P̂ in [46] and also in [47] was still somewhat ad hoc.

The above definition was given in [18] where the main results presented here can also be found. In particular, [18] showed that for continuous S, the Galtchouk–Kunita–Watanabe decomposition of H under the MMM P̂ provides (under very mild integrability conditions) the so-called Föllmer–Schweizer decomposition of H under the original measure P, and this in turn immediately gives the locally risk-minimizing strategy for H. We emphasize that this is no longer true, in general, if S has jumps. The MMM subsequently found various other applications and uses and has become fairly popular, especially in models with continuous price processes.

Suppose now S satisfies (SC). For every ELMM Q for S with dQ/dP ∈ L^2(P), the density process then takes the form

\[
Z^Q := \frac{dQ}{dP}\bigg|_{\mathbb{F}} = Z_0^Q \, \mathcal{E}\left( -\int \lambda \, dM + L^Q \right) \tag{1}
\]
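Assuming, as is standard, that the symbol E in (1) denotes the stochastic (Doléans-Dade) exponential, its discrete-time analogue is simply a running product of the factors 1 + ΔX. The short Python sketch below illustrates this for a driving process with increments ΔX_k = −λ_k ΔM_k, that is, the first term inside the exponential in (1) with the term L^Q left out. The function names and the sample numbers are hypothetical and serve only as an illustration; they are not taken from the article.

```python
import math


def stochastic_exponential(increments):
    """Discrete-time stochastic exponential started at 1.

    Returns the path Z_0 = 1, Z_n = Z_{n-1} * (1 + dX_n), which is the
    solution of Z_n = 1 + sum_{k<=n} Z_{k-1} * dX_k.
    """
    path = [1.0]
    for dx in increments:
        path.append(path[-1] * (1.0 + dx))
    return path


def density_increments(lam, dM):
    """Increments dX_k = -lam_k * dM_k of the driver -sum(lam * dM)."""
    return [-l * m for l, m in zip(lam, dM)]
```

For example, `stochastic_exponential(density_increments([1.0, 2.0], [0.1, -0.05]))` multiplies the factors 1 − 0.1 and 1 + 0.1 in turn. Such a product stays strictly positive only as long as every increment exceeds −1, which is one elementary way to see why jumps of S can cause difficulties for a density of this form.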
not yet well understood. See also Risk-sensitive Asset Management.

References

[1] Arai, T. (2001). The relations between minimal martingale measure and minimal entropy martingale measure, Asia-Pacific Financial Markets 8, 137–177.
[2] Becherer, D. (2001). The numeraire portfolio for unbounded semimartingales, Finance and Stochastics 5, 327–341.
[3] Berrier, F., Rogers, L.C.G. & Tehranchi, M. (2008). A Characterization of Forward Utility Functions, preprint, http://www.statslab.cam.ac.uk/~mike/forward-utilities.pdf.
[4] Biagini, F. & Pratelli, M. (1999). Local risk minimization and numeraire, Journal of Applied Probability 36, 1126–1139.
[5] Björk, T. & Slinko, I. (2006). Towards a general theory of good-deal bounds, The Review of Finance 10, 221–260.
[6] Černý, A. (2003). Generalised Sharpe ratios and asset pricing in incomplete markets, European Finance Review 7, 191–233.
[7] Černý, A. & Kallsen, J. (2007). On the structure of general mean-variance hedging strategies, The Annals of Probability 35, 1479–1531.
[8] Chan, T. (1999). Pricing contingent claims on stocks driven by Lévy processes, The Annals of Applied Probability 9, 504–528.
[9] Choulli, T. & Stricker, C. (2005). Minimal entropy-Hellinger martingale measure in incomplete markets, Mathematical Finance 15, 465–490.
[10] Choulli, T. & Stricker, C. (2006). More on minimal entropy-Hellinger martingale measure, Mathematical Finance 16, 1–19.
[11] Choulli, T., Stricker, C. & Li, J. (2007). Minimal Hellinger martingale measures of order q, Finance and Stochastics 11, 399–427.
[12] Christensen, M.M. & Larsen, K. (2007). No arbitrage and the growth optimal portfolio, Stochastic Analysis and Applications 25, 255–280.
[13] Colwell, D.B. & Elliott, R.J. (1993). Discontinuous asset prices and non-attainable contingent claims, Mathematical Finance 3, 295–308.
[14] Delbaen, F., Grandits, P., Rheinländer, T., Samperi, D., Schweizer, M. & Stricker, C. (2002). Exponential hedging and entropic penalties, Mathematical Finance 12, 99–123.
[15] Delbaen, F. & Schachermayer, W. (1998). A simple counterexample to several problems in the theory of asset pricing, Mathematical Finance 8, 1–11.
[16] Elliott, R.J. & Madan, D.B. (1998). A discrete time equivalent martingale measure, Mathematical Finance 8, 127–152.
[17] Fleming, W.H. & Sheu, S.J. (2002). Risk-sensitive control and an optimal investment model II, The Annals of Applied Probability 12, 730–767.
[18] Föllmer, H. & Schweizer, M. (1991). Hedging of contingent claims under incomplete information, in Applied Stochastic Analysis, Stochastics Monographs, M.H.A. Davis & R.J. Elliott eds, Gordon and Breach, London, Vol. 5, pp. 389–414.
[19] Grandits, P. (2000). On martingale measures for stochastic processes with independent increments, Theory of Probability and its Applications 44, 39–50.
[20] Grasselli, M. (2007). Indifference pricing and hedging for volatility derivatives, Applied Mathematical Finance 14, 303–317.
[21] Henderson, V. (2002). Valuation of claims on nontraded assets using utility maximization, Mathematical Finance 12, 351–373.
[22] Henderson, V. (2005). Analytical comparisons of option prices in stochastic volatility models, Mathematical Finance 15, 49–59.
[23] Henderson, V. & Hobson, D.G. (2002). Real options with constant relative risk aversion, Journal of Economic Dynamics and Control 27, 329–355.
[24] Henderson, V. & Hobson, D.G. (2003). Coupling and option price comparisons in a jump-diffusion model, Stochastics and Stochastics Reports 75, 79–101.
[25] Hong, D. & Wee, I.S. (2003). Convergence of jump-diffusion models to the Black-Scholes model, Stochastic Analysis and Applications 21, 141–160.
[26] Jouini, E. & Napp, C. (1999). Continuous Time Equilibrium Pricing of Nonredundant Assets, Leonard N. Stern School Finance Department Working Paper 99-008, New York University, http://w4.stern.nyu.edu/finance/research.cfm?doc_id=1216, http://www.stern.nyu.edu/fin/workpapers/papers99/wpa99008.pdf.
[27] Kallsen, J. (2002). Utility-based derivative pricing in incomplete markets, in Mathematical Finance – Bachelier Congress 2000, H. Geman, D. Madan, S.R. Pliska & T. Vorst, eds, Springer-Verlag, Berlin, Heidelberg, New York, pp. 313–338.
[28] Korn, R. (1998). Value preserving portfolio strategies and the minimal martingale measure, Mathematical Methods of Operations Research 47, 169–179.
[29] Korn, R. (2000). Value preserving strategies and a general framework for local approaches to optimal portfolios, Mathematical Finance 10, 227–241.
[30] Korn, R. & Schäl, M. (1999). On value preserving and growth optimal portfolios, Mathematical Methods of Operations Research 50, 189–218.
[31] Kuroda, K. & Nagai, H. (2002). Risk-sensitive portfolio optimization on infinite time horizon, Stochastics and Stochastics Reports 73, 309–331.
[32] Lesne, J.-P., Prigent, J.-L. & Scaillet, O. (2000). Convergence of discrete time option pricing models under stochastic interest rates, Finance and Stochastics 4, 81–93.
[33] Mania, M. & Tevzadze, R. (2003). A unified characterization of q-optimal and minimal entropy martingale
measures by semimartingale backward equation, The [45] Schachermayer, W. (1993). A counterexample to several
Georgian Mathematical Journal 10, 289–310. problems in the theory of asset pricing, Mathematical
[34] Møller, T. (2004). Stochastic orders in dynamic reinsur- Finance 3, 217–229.
ance markets, Finance and Stochastics 8, 479–499. [46] Schweizer, M. (1988). Hedging of options in a general
[35] Monoyios, M. (2004). Performance of utility-based semimartingale model, Dissertation ETH Zürich 8615.
strategies for hedging basis risk, Quantitative Finance [47] Schweizer, M. (1991). Option hedging for semimartin-
4, 245–255. gales, Stochastic Processes and their Applications 37,
[36] Monoyios, M. (2006). Characterisation of optimal dual 339–363.
measures via distortion, Decisions in Economics and [48] Schweizer, M. (1992). Mean-variance hedging for gen-
Finance 29, 95–119. eral claims, The Annals of Applied Probability 2,
[37] Monoyios, M. (2007). The minimal entropy measure and 171–179.
an Esscher transform in an incomplete market, Statistics [49] Schweizer, M. (1995). On the minimal martingale mea-
and Probability Letters 77, 1070–1076. sure and the Föllmer-Schweizer decomposition, Stochas-
[38] Nagai, H. & Peng, S. (2002). Risk-sensitive portfo- tic Analysis and Applications 13, 573–599.
lio optimization with partial information on infinite [50] Schweizer, M. (1999). A minimality property of the
time horizon, The Annals of Applied Probability 12,
minimal martingale measure, Statistics and Probability
173–195.
Letters 42, 27–31.
[39] Pham, H., Rheinländer, T. & Schweizer, M. (1998).
[51] Schweizer, M. (2001). A guided tour through quadratic
Mean-variance hedging for continuous processes: new
hedging approaches, in Option Pricing, Interest Rates
results and examples, Finance and Stochastics 2,
and Risk Management, E. Jouini, J. Cvitanić &
173–198.
M. Musiela, eds, Cambridge University Press, Cam-
[40] Pham, H. & Touzi, N. (1996). Equilibrium state prices
in a stochastic volatility model, Mathematical Finance bridge, pp. 538–574.
6, 215–236. [52] Sin, C.A. (1998). Complications with stochastic volatil-
[41] Pirvu, T.A. & Haussmann, U.G. (2007). On Robust ity models, Advances in Applied Probability 30,
Utility Maximization, University of British Columbia, 256–268.
arXiv:math/0702727, preprint. [53] Stoikov, S. & Zariphopoulou, T. (2004). Optimal invest-
[42] Prigent, J.-L. (1999). Incomplete markets: convergence ments in the presence of unhedgeable risks and under
of options values under the minimal martingale measure, CARA preferences, in IMA Volume in Mathematics and
Advances in Applied Probability 31, 1058–1077. its Applications, in press.
[43] Rheinländer, T. (2005). An entropy approach to the [54] Tehranchi, M. (2004). Explicit solutions of some utility
Stein and Stein model with correlation, Finance and maximization problems in incomplete markets, Stochas-
Stochastics 9, 399–413. tic Processes and their Applications 114, 109–125.
[44] Runggaldier, W.J. & Schweizer, M. (1995). Conver- [55] Zhang, X. (1997). Numerical analysis of American
gence of option values under incompleteness, in Seminar option pricing in a jump-diffusion model, Mathematics
on Stochastic Analysis, Random Fields and Applications, of Operations Research 22, 668–690.
E. Bolthausen, M. Dozzi & F. Russo, eds, Birkhäuser
Verlag, Basel, pp. 365–384. HANS FÖLLMER & MARTIN SCHWEIZER
Good-deal Bounds

Most contingent claims valuation is based, at least notionally, on the concept of exact replication. The difficulties of exactly replicating derivative positions suggest that in many cases we should, instead, put bounds around the value of an instrument. These bounds ought to depend on model assumptions and on the prices of securities that would be used to exploit mispricing. No-arbitrage bounds are often very weak, so good-deal bounds provide an attractive alternative.

Good-deal bounds provide a range of prices within which an instrument must trade if it is not to offer a surprisingly good reward-for-risk opportunity. This is illustrated in Figure 1, where the horizontal axis represents the distribution of future payoffs (or values) after zero cost hedging. In an incomplete market setting, rather strong assumptions are needed to arrive at a unique forward value, such as $p^*$ in the figure. Conversely, risk-free arbitrage typically allows a rather wide band of prices, as between the upper and lower bounds $b^+$, $b^-$. We can hope to obtain a much narrower band without the need for strong assumptions if we simply preclude profitable opportunities. This gives the good-deal bounds $p^+$ and $p^-$. These bounds have two alternative interpretations: we can think of them as establishing normative bid and ask forward prices for a particular trader or as predicting a range in which we expect the market price to lie.

This line of valuation analysis now has an interesting history and it has inspired a quite significant literature, much of it very mathematical. There are a great many different variations by which the philosophy just described can be implemented.

This article aims to cover the main issues without going too deeply into mathematical technicalities. We begin by considering a simple illustrative example to provide intuitive insights into the nature of the analysis, including the use of duality in the solutions. We then sketch the history of this topic, including the generalized Sharpe ratio. Finally, there is a discussion of the role of the utility function (see Utility Function) in the analysis, of applications, and of the more recent literature.

Illustration

Consider the problem faced by a financial intermediary in determining reservation bid and ask prices for some derivative which can, at best, only be partly hedged. There is no chance of replicating this claim exactly, and super-replication bounds may be too loose to be practically helpful. The company expects to trade using some kind of statistical arbitrage, for which each transaction passes a minimum reward-for-risk threshold and overall to obtain a portfolio that performs much better than that minimum.

More specifically, reservation forward bid and ask prices $p^- < p^+$ are to be determined at time zero for a derivative that will pay a random amount $\tilde{C}_T$ at later date $T$. We suppose a von Neumann–Morgenstern utility function $U(\cdot)$ for date $T$ wealth and a forward wealth endowment of $W_0$. The reservation prices are constructed so that trade will provide a level of expected utility at a predetermined level $U_R$ that exceeds the expected utility that could be reached without it by $A > 0$.

Figure 2 illustrates the construction. The horizontal axis represents the price of the contingent claim. The vertical axis represents the expected utility obtained from buying (or selling) the optimal quantity of the claim. Outside the super-replication bounds, $b^-$, $b^+$, unbounded wealth can be obtained.

In the case where no hedging will be undertaken and the forward price of the claim is $p$, we simply have the optimization of the quantity $\theta$ bought or sold as

$$\max_{\theta} \; E\left[U\left(W_0 + \theta(\tilde{C}_T - p)\right)\right] \qquad (1)$$

If $p$ is low enough we will expect to buy the claim, and if $p$ is high enough we will want to sell it. Intuitively the good-deal lower bound, $p^-$, is the highest price at which we can buy the claim and obtain expected utility of $U_R$, and the good-deal upper bound, $p^+$, is the lowest price at which we can sell the claim and obtain expected utility of $U_R$.

Now consider the first-order conditions from the optimization:

$$E\left[(\tilde{C}_T - p)\, U'\left(W_0 + \theta(\tilde{C}_T - p)\right)\right] = 0, \quad \text{so} \qquad (2)$$

$$p = \frac{E\left[\tilde{C}_T\, U'(\tilde{W}_T)\right]}{E\left[U'(\tilde{W}_T)\right]}, \quad \text{where } \tilde{W}_T = W_0 + \theta(\tilde{C}_T - p) \qquad (3)$$
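The optimization in equation (1) and the pricing relation in equation (3) can be illustrated numerically. The following is a minimal sketch under stated assumptions rather than the article's own procedure: it assumes negative exponential utility U(w) = −exp(−a·w), a three-point payoff distribution, no hedging, and a required level U_R = U(W0) + A. All function names and parameter values are hypothetical.

```python
import math

def foc(theta, p, payoffs, probs, a):
    """E[(C - p) * exp(-a*theta*(C - p))]: the first-order condition (2)
    for exponential utility, up to a positive constant factor."""
    return sum(q * (c - p) * math.exp(-a * theta * (c - p))
               for c, q in zip(payoffs, probs))

def bisect_decreasing(f, lo, hi, iters=100):
    """Root of a decreasing function with f(lo) > 0 > f(hi).
    Only midpoints are evaluated, never the end points."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def optimal_theta(p, payoffs, probs, a):
    """Optimal quantity bought (theta > 0) or sold (theta < 0) at price p.
    Assumes min(payoffs) < p < max(payoffs), so that no arbitrage exists."""
    g = lambda th: foc(th, p, payoffs, probs, a)
    if g(0.0) == 0.0:
        return 0.0
    step = 1.0
    if g(0.0) > 0:                 # claim is cheap: buy
        while g(step) > 0:
            step *= 2.0
        return bisect_decreasing(g, 0.0, step)
    while g(-step) < 0:            # claim is dear: sell
        step *= 2.0
    return bisect_decreasing(g, -step, 0.0)

def max_expected_utility(p, payoffs, probs, w0, a):
    """Value of problem (1) and the maximizing theta."""
    theta = optimal_theta(p, payoffs, probs, a)
    value = sum(q * -math.exp(-a * (w0 + theta * (c - p)))
                for c, q in zip(payoffs, probs))
    return value, theta

def good_deal_bounds(payoffs, probs, w0, a, premium):
    """Prices at which optimal trading just reaches U_R = U(W0) + premium."""
    mean = sum(q * c for c, q in zip(payoffs, probs))
    u_r = -math.exp(-a * w0) + premium      # must stay below 0 for this utility
    v = lambda p: max_expected_utility(p, payoffs, probs, w0, a)[0]
    p_minus = bisect_decreasing(lambda p: v(p) - u_r, min(payoffs), mean)
    p_plus = bisect_decreasing(lambda p: u_r - v(p), mean, max(payoffs))
    return p_minus, p_plus

# Hypothetical inputs: payoff of 0, 1 or 2 with a symmetric distribution.
payoffs, probs = [0.0, 1.0, 2.0], [0.3, 0.4, 0.3]
w0, a, premium = 0.0, 1.0, 0.05
p_minus, p_plus = good_deal_bounds(payoffs, probs, w0, a, premium)

# Check equation (3) at the lower bound: p = E[C U'(W)] / E[U'(W)].
_, theta = max_expected_utility(p_minus, payoffs, probs, w0, a)
weights = [q * math.exp(-a * (w0 + theta * (c - p_minus)))
           for c, q in zip(payoffs, probs)]
p_check = sum(c * w for c, w in zip(payoffs, weights)) / sum(weights)
print(p_minus, p_plus, p_check)
```

With exponential utility the wealth level only rescales expected utility, so the bounds in this sketch are driven by the risk aversion a and the premium A; other utility functions would need the full wealth dependence.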
Much of the work in the incomplete markets literature focuses on ways to obtain a particular pricing measure and hence unique prices (for example, see Minimal Entropy Martingale Measure; Minimal Martingale Measure and Schweizer [17]), but it is not clear why a particular agent would be prepared to trade at these prices.

The good-deal literature represents an important alternative between these two paths. Hansen and Jagannathan [9] provide a crucial stepping stone. They showed that the Sharpe ratio on any security is bounded by the coefficient of variation of the stochastic discount factor (see Stochastic Discount Factors). The Sharpe ratio provides a very natural benchmark (see [18]) and Cochrane and Saá-Requejo [6] subsequently used this to limit the volatility of the stochastic discount factor and infer the first no-good-deal prices, conditional on the absence of high Sharpe ratios. At about the same time, a related paper by Bernardo and Ledoit [2] showed how similar bounds could be obtained relative to a maximum gain–loss ratio for the economy as a whole. These papers have their disadvantages. Cochrane and Saá-Requejo work with quadratic utility (and sometimes truncated quadratic utility), whereas Bernardo and Ledoit use Domar–Musgrave utility (i.e., two linear segments). This led Hodges [12] to investigate bounds based on the more conventional choice of exponential utility and to thereby introduce the idea of a generalized Sharpe ratio.

This concept was extended by Černý and Hodges [5] into the more general framework of good-deal pricing mostly used today. By then, it was already clear that these prices satisfied the criteria for coherent risk measures of Artzner et al. [1], namely, the linearity, subadditivity, and monotonicity properties. This includes the representation of the lower good-deal price as an infimum over values from alternative pricing measures. Nevertheless, Jaschke and Küchler [13] provided an important clarification and unification of these ideas.

General Framework

The general framework of "no-good-deal" pricing (first described by Černý and Hodges [5]) places no-arbitrage and representative agent equilibrium at the two ends of a spectrum of possibilities. They define a desirable claim as one which provides a specific level of von Neumann–Morgenstern expected utility and a good-deal as a desirable claim with zero or negative price. Within the analysis, it is assumed that any quantity of any claim may be bought or sold. The economy contains a collection of claims with predetermined prices, so-called basis assets. These claims generate the marketed subspace $M$ and their prices define a price correspondence on this subspace. In an incomplete market, it is often convenient to suppose that the market is augmented in such a way that the resulting complete market contains no arbitrages. Instead, we can more powerfully augment the market so that the complete market contains no good-deals. We obtain a set of pricing functionals that form a subset of those that simply preclude arbitrage. The link between no arbitrage and strictly positive pricing rules carries over to good-deals and enables price restrictions to be placed on nonmarketed claims. Under suitable technical assumptions, the no-good-deal price region for a set of claims is a convex set, and redundant assets have unique good-deal prices.

With an acceptance set of deals, $K$, typically defined in terms of expected utility, the upper and lower good-deal bounds can be defined simply as

$$p^+ = \inf_{p,\, x_t} \left\{ p \;\middle|\; -\tilde{C}_T + p + \int_0^T x_t \, dS_t \in K \right\} \quad \text{and} \qquad (7)$$

$$p^- = \sup_{p,\, x_t} \left\{ p \;\middle|\; \tilde{C}_T - p + \int_0^T x_t \, dS_t \in K \right\} \qquad (8)$$

For a given utility function, the positions of the good-deal bounds naturally depend on the required expected utility premium, $A$. The higher this level, the further apart the bounds will be. Coherent risk measures, well into the tails of the final distribution, can be obtained if high levels are employed for $A$. Except for the case of exponential utility, the bounds also depend on the initial wealth level.

Generalized Sharpe Ratios

One method for setting the required premium comes from the Sharpe ratio available on a market opportunity. This gives rise to what are called generalized Sharpe ratio bounds (see [12] or [4]). The idea is to first compute the level of expected utility $U_R$ attainable from a market opportunity offering a specific annualized Sharpe ratio, such as 0.25, and without any investment in the derivative. The good-deal
bounds that are supported by this level of expected utility (but without this market opportunity) are then said to correspond to a generalized Sharpe ratio of 0.25. In the case of negative exponential utility, the wealth level and the risk aversion parameter play the same role and become irrelevant since the opportunity can be accepted at any scale. This provides a particularly simple implementation with minimal parameter requirements.

Subsequent analysis by Černý [4] further expands both the notions and the analysis of generalized Sharpe ratios. The analysis provides details of the dual formulations for alternative standard utility functions. For example, the dual constraints on the change of measure $m$ for different utility functions are as given in Table 1.

Table 1 Stochastic discount factor constraints for various utility functions

Utility function              Constraint
Quadratic: Cochrane et al.    $E[m^2] \le 1 + A^2$
Exponential                   $E[m \ln m] \le A$
Power, RRA = $\gamma$         $E[m^{1-1/\gamma}] \le (1 + A\gamma)^{1/\gamma - 1}$
Logarithmic                   $-E[\ln m] \le \ln(1 + A)$

The various properties of the utility affect the details of the mathematical analysis considerably. For some features to work cleanly we need unbounded utility, whereas for others the behavior for low wealth levels is critical. Exponential utility precludes any delta hedge that gives a short lognormal position over finite time, even though it would have a smaller standard deviation than the fully covered position. Capping such a liability at a finite level can therefore have a big effect on the good-deal price resulting from such an analysis. Depending on the context, this may or may not be desirable. While exponential utility precludes fat negative tails, such as the short lognormal, power and log utility preclude the possibility of any negative future wealth, and even stronger effects can, in principle, derive from this.

With constant absolute risk aversion (CARA) utility, changing the scale of investment is equivalent to changing the level of risk aversion. With constant relative risk aversion (CRRA), it is equivalent to scaling the initial wealth, $W_0$. The CRRA-based good-deal bound thus searches across measures with the same exponent, but different wealth levels. There may be some advantages to finding alternative utility functions that have properties intermediate between the Domar–Musgrave function used by Bernardo and Ledoit and the negative exponential one.

Coherent Risk Measures

Jaschke and Küchler [13] expand the link between good-deal bounds and coherent risk measures. They show that there is a one-to-one correspondence between
1. "coherent risk measures" (see Convex Risk Measures)
2. cones of "desirable claims"
3. partial orderings
4. good-deal valuation bounds
5. sets of "admissible" price systems.

It should be noted from this analysis that it is sufficient but not necessary to use expected utility to define all the abstract measures considered in their paper. In other words, acceptance sets must be consistent with coherence, but not necessarily with expected utility.

It is clear from the foregoing that good-deal analysis can easily be applied as the basis of risk measurement and will satisfy the axioms of coherent risk measures (see Convex Risk Measures). It can also be applied as a method of risk adjustment for performance measurement. For example, a utility-based generalized Sharpe ratio, when applied to an empirical distribution, provides a method of adjusting for skewness in the distribution. In doing so, it makes sense to apply a negative sign to situations where a short position would have been optimal.

Recent and Prospective Literature

Important new papers continue to appear quite regularly; a few recent ones are mentioned here. Staum [19] provides much of the background, treating good-deals from the perspective of convex optimization. Björk and Slinko [3] provide extensions to Cochrane and Saá-Requejo in a multidimensional jump-diffusion setting. There are further papers that expand on the dynamic aspects of this analysis, apply it to settings with stochastic volatility, or implement similar optimizations using mathematical programming. There are also a number of papers, which although not directly within the framework developed here, deal with related ideas in different ways.
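As a concrete reading of the stochastic discount factor constraints in Table 1, the sketch below evaluates three of them for a candidate change of measure m on a finite state space. It is an illustration under assumptions, not part of the cited analysis: the function name, the three-state example and the use of one common level A for every row are hypothetical, and the power-utility row is left out to keep the sketch short.

```python
import math

def sdf_constraints(m, probs, A):
    """Check the quadratic, exponential and logarithmic rows of Table 1
    for a discrete change of measure m (with E[m] = 1 and m > 0)."""
    expect = lambda f: sum(p * f(x) for p, x in zip(probs, m))
    return {
        "quadratic": expect(lambda x: x * x) <= 1.0 + A ** 2,
        "exponential": expect(lambda x: x * math.log(x)) <= A,
        "logarithmic": -expect(math.log) <= math.log(1.0 + A),
    }

# Hypothetical three-state example; m integrates to one under probs.
probs = [0.25, 0.50, 0.25]
m = [1.2, 1.0, 0.8]
mean_m = sum(p * x for p, x in zip(probs, m))

loose = sdf_constraints(m, probs, A=0.25)   # generous level A
tight = sdf_constraints(m, probs, A=0.10)   # stricter level A
print(mean_m, loose, tight)
```

A pricing measure that violates the chosen constraint would imply a good deal at level A under the corresponding utility, so it is excluded from the set over which the good-deal bounds are computed.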
The apparently simple concept of good-deal bounds has turned out to provide a great deal of richness for mathematicians to analyze, and there are now many variations on this theme in the published literature. Although the theory stems from a practical desire, very few of the papers have an applied flavor. Rather little algorithmic or numerical work has been reported, and most of that uses only somewhat simplified models, seldom calibrated to the market. The good-deal bounds approach could easily be adapted to deal with model risk, something which is hinted at in Cont [7]. The literature needs more real applications, and, perhaps, the balance will have changed when the next survey of this area comes to be written.

References

[1] Artzner, P., Delbaen, F., Eber, J. & Heath, D. (1999). Coherent measures of risk, Mathematical Finance 9(3), 203–228.
[2] Bernardo, A. & Ledoit, O. (1996). Gain, loss and asset pricing, Journal of Political Economy 108(1), 144–172.
[3] Björk, T. & Slinko, I. (2006). Towards a general theory of good-deal bounds, Review of Finance 10, 221–260.
[4] Černý, A. (2003). Generalised Sharpe ratios and asset pricing in incomplete markets, European Finance Review 7, 191–233.
[5] Černý, A. & Hodges, S.D. (2001). The theory of good-deal pricing in financial markets, in Selected Proceedings of the First Bachelier Congress Held in Paris, 2000, H. Geman, D. Madan, S.R. Pliska & T. Vorst, eds, Springer Verlag.
[6] Cochrane, J.H. & Saá-Requejo, J. (2000). Beyond arbitrage: 'Good-Deal' asset price bounds in incomplete markets, Journal of Political Economy 108(1), 79–119.
[7] Cont, R. (2006). Model uncertainty and its impact on the pricing of derivative instruments, Mathematical Finance 16(3), 519–547.
[8] Dybvig, P.H. & Ross, S.A. (1987). Arbitrage, in The New Palgrave: A Dictionary of Economics, J. Eatwell, M. Milgate & P. Newman, eds, Macmillan, London, Vol. 1, pp. 100–106.
[9] Hansen, L.P. & Jagannathan, R. (1991). Implications of security market data for models of dynamic economies, Journal of Political Economy 99, 225–262.
[10] Harrison, J. & Kreps, J. (1979). Martingales and arbitrage in multiperiod securities markets, Journal of Economic Theory 11, 215–260.
[11] Hobson, D.G. (1998). Robust hedging of the lookback option, Finance and Stochastics 2, 329–347.
[12] Hodges, S.D. (1998). A Generalization of the Sharpe Ratio and its Applications to Valuation Bounds and Risk Measures. FORC Preprint 1998/88, University of Warwick.
[13] Jaschke, S. & Küchler, U. (2001). Coherent risk measures and good-deal bounds, Finance and Stochastics 5, 181–200.
[14] Levy, H. (1985). Upper and lower bounds of put and call option values: stochastic dominance approach, Journal of Finance 40, 1197–1218.
[15] Merton, R.C. (1973). Theory of rational option pricing, Bell Journal of Economics 4, 141–183.
[16] Perrakis, S. & Ryan, P.J. (1984). Option pricing bounds in discrete time, Journal of Finance 39, 519–525.
[17] Schweizer, M. (1995). On the minimal martingale measure and the Föllmer-Schweizer decomposition, Stochastic Analysis and Applications 13, 573–599.
[18] Sharpe, W.F. (1994). The Sharpe ratio, Journal of Portfolio Management 21, 49–59.
[19] Staum, J. (2004). Pricing and hedging in incomplete markets: fundamental theorems and robust utility maximization, Mathematical Finance 14(2), 141–161.

Related Articles

Arbitrage Strategy; Convex Risk Measures; Stochastic Discount Factors; Sharpe Ratio; Superhedging; Utility Function.

STEWART D. HODGES
Arrow–Debreu Prices

Arrow–Debreu prices are the prices of "atomic" time and state-contingent claims, which deliver one unit of a specific consumption good if a specific uncertain state realizes at a specific future date. For instance, claims on the good "ice cream tomorrow" are split into different commodities depending on whether the weather will be good or bad, so that good-weather and bad-weather ice cream tomorrow can be traded separately. Such claims were introduced by Arrow and Debreu in their work on general equilibrium theory under uncertainty, to allow agents to exchange state and time contingent claims on goods. Thereby the general equilibrium problem with uncertainty can be reduced to a conventional one without uncertainty. In finite-state financial models, Arrow–Debreu securities delivering one unit of the numeraire good can be viewed as natural atomic building blocks for all other state–time contingent financial claims; their prices determine a unique arbitrage-free price system.

$d_i^a c^\gamma/\gamma$ with relative risk aversion coefficient $\gamma = \gamma^a \in (0, 1)$ and discount factors $d_i^a > 0$. This example for preferences satisfies the general requirements (insaturation, continuity and convexity) on preferences for state-contingent consumption in [3], which need not be of the separable subjective expected utility form above. The only way for agents to allocate their consumption is by exchanging state-contingent claims for the delivery of some units of the (perishable) consumption good at a specific future state. Let $q_\omega$ denote the price at time 0 for the state-contingent claim that pays $q_0 > 0$ units if and only if state $\omega \in \Omega$ is realized. Given the endowments and utility preferences of the agents, an equilibrium is given by consumption allocations $c^{a*}$ and a linear price system $(q_\omega)_{\omega \in \Omega} \in \mathbb{R}^m_+$ such that,

1. for any agent $a$, his or her consumption $c^{a*}$ maximizes $u^a(c^a)$ over all $c^a$ subject to budget constraint $(c_0^a - e_0^a)\, q_0 + \sum_{\omega} (c_1^a - e_1^a)(\omega)\, q_\omega \le 0$, and
2. markets clear, that is $\sum_a (c_t^a - e_t^a)(\omega) = 0$ for all dates $t = 0, 1$ and states $\omega$.
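The two equilibrium conditions above can be made concrete in a small numerical sketch. The example below swaps in logarithmic utility, u(c) = ln c_0 + d · Σ_ω P(ω) ln c_1(ω), as a simplifying assumption in place of the power utility of the text, because each agent then spends a fixed share of wealth on every good and market-clearing prices can be found by a simple fixed-point iteration. All names, endowments, discount factors and beliefs are hypothetical.

```python
def budget_shares(discount, beliefs):
    """Budget shares of a log-utility agent. Good 0 is consumption today,
    goods 1..m are consumption in each of the m states tomorrow."""
    scale = 1.0 + discount * sum(beliefs)
    return [1.0 / scale] + [discount * pr / scale for pr in beliefs]

def equilibrium_prices(endowments, shares, iters=200):
    """Iterate q_k = (sum_a share_k^a * wealth^a(q)) / total endowment_k,
    normalised so that the price of consumption today is q_0 = 1."""
    goods = range(len(endowments[0]))
    total = [sum(e[k] for e in endowments) for k in goods]
    q = [1.0 for _ in goods]
    for _ in range(iters):
        wealth = [sum(qk * ek for qk, ek in zip(q, e)) for e in endowments]
        q = [sum(s[k] * w for s, w in zip(shares, wealth)) / total[k]
             for k in goods]
        q = [qk / q[0] for qk in q]
    return q

def excess_demand(q, endowments, shares):
    """Aggregate demand minus aggregate endowment, good by good."""
    z = [0.0 for _ in q]
    for e, s in zip(endowments, shares):
        wealth = sum(qk * ek for qk, ek in zip(q, e))   # budget constraint binds
        for k in range(len(q)):
            z[k] += s[k] * wealth / q[k] - e[k]
    return z

# Two agents, two states; each endowment vector is (today, state 1, state 2).
endowments = [[1.0, 2.0, 0.5], [1.0, 0.5, 2.0]]
shares = [budget_shares(0.95, [0.5, 0.5]), budget_shares(0.90, [0.4, 0.6])]

q_star = equilibrium_prices(endowments, shares)
z_star = excess_demand(q_star, endowments, shares)
walras = sum(qk * zk for qk, zk in zip(q_star, z_star))
print(q_star, z_star, walras)
```

At the computed prices each agent's demand exhausts the budget (condition 1 for log utility) and the excess demand vanishes in every good (condition 2); the iteration relies on all endowments being strictly positive.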
c0a∗ = c0a∗ (q) = (U0a )−1 (λa q0 ) and For multiple consumption goods, the above ideas
generalize if one considers consumption bundles
a∗
c1,ω = c1,ω
a∗
(q) = (Uωa )−1 (λa qω /P a (ω)) , ω∈ and state-contingent claims of every good. Arrow
(3) [1] showed that in the case of multiple consump-
where λa = λa (q) > 0 is determined by the budget tion goods, all possible consumption allocations are
constraint (ca∗ − ea )q = 0 as the Lagrange multiplier spanned if agents could trade as securities solely
associated to the constrained optimization problem 1. state-contingent claims on the unit of account (so-
Equilibrium is attained at prices q ∗ where the aggre- called Arrow securities), provided that spot markets
gate excess demand with anticipated prices for all other goods exists in
all future states. In the sequel, we only deal with
z(q) := (ca∗ (q) − ea ) (4) Arrow securities in financial models with a single
a numeraire good that serves as unit of account, and
∗
vanishes, that is z(q ) = 0. One can check that could for simplicity be considered as money (“euro”).
z : → 1+m is continuous in the (relative) inte- If the set of outcomes were (uncountably) infi-
rior int := ∩ 1+m
++ of the simplex, and that nite, the natural notion of atomic securities is lost,
|z(q n )| goes to ∞ when q n tends to a point on although a state price density (stochastic discount
the boundary of . Since each agent exhausts his factor, deflator) may still exist, which could be inter-
or her budget constraint 1. with equality, Walras’ preted intuitively as an Arrow–Debreu state price per
law z(q)q = 0 holds for any q ∈ int . Let n be unit probability.
an increasing sequence of compact sets exhaust-
ing the simplex interior: int = ∪n n . Set ν n (z) :=
{q ∈ n | zq ≥ zp ∀p ∈ n }, and consider the corre- Multiple Period Extension and
spondence (a multivalued mapping) No-arbitrage Implications
n : (q, z) → (ν n (z), z(q)) (5) The one-period setting with finitely many states is
easily extended to finitely many periods with dates
that can be shown to be convex, nonempty valued, t ∈ {0, . . . , T } by considering an enlarged state space
and maps the compact convex set n × z(n ) into of suitable date–event pairs (see Chapter 7 in [3]). To
itself. Hence, by Kakutani’s fixed point theorem,
this end, it is mathematically convenient to describe
it has a fixed point (q n∗ , zn∗ ) ∈ n (q n∗ , zn∗ ). This
the information flow by a filtration (Ft ) that is
implies that
generated by a stochastic process X = (Xt (ω))0≤t≤T
z(q n∗ )q ≤ z(q n∗ )q n∗ = 0 for all q ∈ n (6) (abstract, at this stage) on the finite probability
space (, F, P0 ). Let F0 be trivial, FT = F = 2 ,
using Walras’ law. A subsequence of q n∗ converges and assume P0 ({ω}) > 0, ω ∈ . The σ -field Ft
to a limit q ∗ ∈ . Provided one can show that q ∗ is contains all events that are based on information
in the interior simplex int , existence of equilibrium from observing paths of X up to time t, and is
follows. Indeed, it follows that z(q ∗ )q ≤ 0 for all defined by a partition of . The smallest nonempty
q ∈ int , implying that z(q ∗ ) = 0 since z(q ∗ )q ∗ = 0 events in Ft are “t-atomic” events A ∈ At of the
by Walras’ law. To show that any limit point of q n∗ type A = [x0 · · · xt ] := {X0 = x0 , . . . , Xt = xt }, and
is indeed in int , it suffices to show that |z(q n∗ )| is constitute a partition of . Figure 1 illustrates the
bounded in n, recalling that z explodes at the sim- partitions At corresponding to the filtration (Ft )t=0,1,2
plex boundary. Indeed, z = a za is bounded from in a five-element space , as generated by a process
below since each agent’s excess demand satisfies Xt taking values a, . . . , f . It shows that a filtration
za = ca − ea ≥ −ea . This lower bound implies also can be represented by a nonrecombining tree. There
an upper bound, by using equation (6) applied with are eight (atomic) date–event pairs (t, A), A ∈ At .
some q ∈ 1 ⊂ n , since 0 <
≤ qi ≤ 1 uniformly An adapted process (ct )t≥0 , describing, for instance,
in i. This establishes existence of equilibrium. To a consumption allocation, has the property that ct is
ensure uniqueness of equilibrium, a sufficient con- constant on each atom A of partition At , and hence
dition is that all agents’ risk aversions are less than is determined by specifying its value ct (A) at each
or equal to 1, that is γ a ∈ (0, 1] for all a, see [2]. node point A ∈ At of the tree. Arrow–Debreu prices
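[Figure 1, two panels over dates 0, 1, 2. Panel (a): tree of the process Xt, with value a at date 0, values b and c at date 1, and values d, b, a (after b) and e, f (after c) at date 2. Panel (b): the corresponding atoms [a] at date 0; [ab], [ac] at date 1; [abd], [abb], [aba], [ace], [acf] at date 2.]
Figure 1 Equivalent representations of the multiperiod case. (a) Tree of the filtration generating process Xt. (b) Partitions At of filtration (Ft)t=0,1,2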
$q(t, A)$ are specified for each node of the tree and represent the value at time 0 of one unit of account at date $t$ in node $A \in \mathcal{A}_t$.

Technically, this is easily embedded in the previous single-period setting by passing to an extended space $\bar{\Omega} := \{1, \ldots, T\} \times \Omega$ with σ-field $\bar{\mathcal{F}}$ generated by all sets $\{t\} \times A$ with $A$ being an (atomic) event of $\mathcal{F}_t$, and $\bar{P}_0(\{t\} \times A) := \mu(t) P_0(A)$ for a (strictly positive) probability measure $\mu$ on $\{1, \ldots, T\}$.

For the common no-arbitrage pricing approach in finance, the focus is to price contingent claims solely in relation to prices of other claims, that are taken as exogenously given. In doing so, the aim of the model shifts from the fundamental economic equilibrium task to explain all prices, toward a "financial engineering" task to determine prices from already given other prices solely by no-arbitrage conditions, which are a necessary prerequisite for equilibrium. From this point of view, the (atomic) Arrow–Debreu securities span a complete market, as every contingent payoff $c$, paying $c_t(A)$ at time $t$ in atomic event $A \in \mathcal{A}_t$, can be decomposed by $c = \sum_{t, A \in \mathcal{A}_t} c_t(A)\, 1_{(t,A)}$ into a portfolio of atomic Arrow securities, paying one euro at date $t$ in event $A$. Hence the no-arbitrage price of the claim must be $\sum_{t, A \in \mathcal{A}_t} c_t(A)\, q(t, A)$. Given that all atomic Arrow–Debreu securities are traded at initial time 0, the market is statically complete in that any state-contingent cash flow $c$ can be replicated by a portfolio of Arrow–Debreu securities that is formed statically at initial time, without the need for any dynamic trading. The no-arbitrage price for $c$ simply equals the cost of replication by Arrow–Debreu securities. It is easy to check that, if all prices are determined like this and trading takes place only at time 0, the market is free of arbitrage, given all Arrow–Debreu prices are strictly positive. To give some examples, the price at time 0 of a zero coupon bond paying one euro at date $t$ equals $ZCB^t = \sum_{A \in \mathcal{A}_t} q(t, A)$. For the absence of arbitrage, the $t$-forward prices $q^f(t, A')$, $A' \in \mathcal{A}_t$, must be related to spot prices of Arrow–Debreu securities by

$$q^f(t, A') = \frac{q(t, A')}{\sum_{A \in \mathcal{A}_t} q(t, A)} = \frac{1}{ZCB^t}\, q(t, A'), \qquad A' \in \mathcal{A}_t \qquad (7)$$

Hence, the forward prices $q^f(t, A')$ are normalized Arrow–Debreu prices and constitute a probability measure $Q^t$ on $\mathcal{F}_t$, which is the $t$-forward measure associated to the $t$-ZCB as numeraire, and yields $q^f(t, A) = E^t[1_A]$ for $A \in \mathcal{A}_t$, with $E^t$ denoting expectation under $Q^t$. Below, we also consider "non-atomic" state-contingent claims with payoffs $c_k(\omega) = 1_{(t,B)}(k, \omega)$, $k \le T$, for $B \in \mathcal{F}_t$, whose Arrow–Debreu prices are denoted by $q(t, B) = \sum_{A \in \mathcal{A}_t, A \subset B} q(t, A)$.

Arrow–Debreu Prices in Dynamic Arbitrage-free Markets

In the above setting, information is revealed dynamically over time, but trading decisions are static in that they are entirely made at initial time 0. To discuss relations between initial and intertemporal
Arrow–Debreu prices in arbitrage-free models with dynamic trading, this section extends the above setting, assuming that all Arrow–Debreu securities are tradable dynamically over time.

Let $q_s(t, A_t)$, $s \le t$, denote the price process of the Arrow–Debreu security paying one euro at $t$ in state $A_t \in \mathcal{A}_t$. At maturity $t$, $q_t(t, A_t) = 1_{A_t}$ takes value 1 on $A_t$ and is 0 otherwise. For the absence of arbitrage, it is clearly necessary that Arrow–Debreu prices are nonnegative, and that $q_s(t, A_t)(A_s) > 0$ holds for $s < t$ at $A_s \in \mathcal{A}_s$ if and only if $A_s \supset A_t$. Further, for $s < t$ it must hold that

$$q_s(t, A_t)(A_s) = q_s(s+1, A_{s+1})(A_s) \times q_{s+1}(t, A_t)(A_{s+1}) \tag{8}$$

for $A_s \in \mathcal{A}_s$, $A_{s+1} \in \mathcal{A}_{s+1}$ such that $A_s \supset A_{s+1} \supset A_t$. In fact, the above conditions are also sufficient to ensure that the market model is free of arbitrage: At any date $t$, the Arrow–Debreu prices for the next date define the interest rate $R_{t+1}$ for the next period $(t, t+1)$ of length $\Delta t > 0$ of a savings account $B_t = \exp\left(\sum_{s=1}^{t} R_s \Delta t\right)$ by

$$\exp(-R_{t+1}(A_t)\Delta t) = \sum_{A_{t+1} \in \mathcal{A}_{t+1},\, A_{t+1} \subset A_t} q_t(t+1, A_{t+1})(A_t), \quad \text{for } A_t \in \mathcal{A}_t \tag{9}$$

and the transition probabilities

$$Q^B(A_{t+1} | A_t) = \frac{q_t(t+1, A_{t+1})(A_t)}{\sum_{A \in \mathcal{A}_{t+1},\, A \subset A_t} q_t(t+1, A)(A_t)} \tag{10}$$

The transition probability (10) can be interpreted as one-period forward price when being at $A_t$ at time $t$, for one euro at date $t+1$ in event $A_{t+1}$, cf. (7). Since all $B$-discounted Arrow–Debreu price processes $q_s(t, A_t)/B_s$, $s \le t$, are martingales under $Q^B$ thanks to equations (8, 10), the model is free of arbitrage by the fundamental theorem of asset pricing, see [6]. For initial Arrow–Debreu prices, denoted by $q_0(t, A_t) \equiv q(t, A_t)$, the martingale property and equations (8, 10) imply that

$$q(t+1, A_{t+1}) = e^{-R_{t+1}(A_t)\Delta t}\, Q^B(A_{t+1} | A_t)\, q(t, A_t) \tag{11}$$

Hence $q(t, A_t) = Q^B(A_t)/B_t(A_t)$ for $A_t \in \mathcal{A}_t$.

The deflator or state price density for agent $a$ is the adapted process $\zeta_t^a$ defined by

$$\zeta_t^a(A_t) := \frac{q(t, A_t)}{P^a(A_t)} = \frac{Q^B(A_t)}{B_t(A_t) P^a(A_t)}, \quad A_t \in \mathcal{A}_t \tag{12}$$

so that $\zeta_t^a S_t$ is a $P^a$-martingale for any security price process $S$, e.g. $S_t = q_t(T, A_T)$, $t \le T$, with $A_T \in \mathcal{A}_T$. If one chooses, instead of $B_t$, another security $N_t = \sum_{A_T \in \mathcal{A}_T} N_T(A_T) q_t(T, A_T)$ with $N_T > 0$ as the numeraire asset for discounting, one can define an equivalent measure $Q^N$ by

$$\frac{Q^N(A)}{P^a(A)} = \frac{N_T(A)}{N_0(A)}\, \zeta_T^a(A), \quad A \in \mathcal{A}_T \tag{13}$$

which has the property that $S_t/N_t$ is a $Q^N$-martingale for any security price process $S$. Taking $N = (ZCB_t^T)_{t \le T}$ as the $T$-zero-coupon bond yields the $T$-forward measure $Q^T$.

If $X$ is a $Q^B$-Markov process, the conditional probability $Q^B(A_{t+1} | A_t)$ in equation (11) is a transition probability $p_t(x_{t+1} | x_t) := Q^B(X_{t+1} = x_{t+1} | X_t = x_t)$, where $A_k = [x_1 \ldots x_k]$ for $k = t, t+1$. By summation of suitable atomic events, equation (11) yields

$$q(t+1, X_{t+1} = x_{t+1}) = \sum_{x_t} e^{-R_{t+1}(x_t)\Delta t}\, p_t(x_{t+1} | x_t)\, q(t, X_t = x_t) \tag{14}$$

where the sum is over all $x_t$ from the range of $X_t$.

Application Examples: Calibration of Pricing Models

The role of Arrow–Debreu securities as “atomic building blocks” is theoretical, in that there exist no corresponding securities in real financial markets. Nonetheless, they are of practical use in the calibration of pricing models. For this section, $X$ is taken to be a $Q^B$-Markov process, possibly time-inhomogeneous.
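As a minimal sketch of equations (9) to (11), the following Python fragment recovers the one-period interest rate and the transition probabilities at a single node from the prices of the Arrow–Debreu securities maturing one period later, and then rolls the initial prices forward. The function names, the successor labels "up" and "down", and the numerical prices are illustrative assumptions, not part of the article.

```python
import math


def short_rate_and_transitions(one_period_prices, dt):
    """Apply equations (9) and (10) at one node A_t.

    one_period_prices maps each successor event A_{t+1} (a subset of A_t)
    to its price q_t(t+1, A_{t+1})(A_t), assumed strictly positive.
    """
    total = sum(one_period_prices.values())
    rate = -math.log(total) / dt  # equation (9)
    probs = {a: q / total for a, q in one_period_prices.items()}  # equation (10)
    return rate, probs


def roll_forward(q_now, rate, probs, dt):
    """Equation (11): initial prices of the successors of A_t from q(t, A_t)."""
    return {a: math.exp(-rate * dt) * p * q_now for a, p in probs.items()}


# Hypothetical one-period prices at the root node (t = 0), where q(0, A_0) = 1.
prices = {"up": 0.48, "down": 0.49}
rate, probs = short_rate_and_transitions(prices, dt=1.0)
q_next = roll_forward(1.0, rate, probs, dt=1.0)
```

At the root node the rolled-forward prices coincide with the one-period prices that were fed in, which is a quick consistency check of the three equations.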
The first example concerns the calibration of a short rate model to some given term structure of zero coupon prices $(ZCB_t)_{t \le T}$, implied by market quotes. For such models, a common calibration procedure relies on a suitable time-dependent shift of the state space for the short rate (see [7], Chapter 28.7). Let suitable functions $R_t^*$ be given such that the variations of $R_t^*(X_t)$ already reflect the desired volatility and mean-reversion behavior of the (discretized) short rate. Making an ansatz $R_t(X_t) := R_t^*(X_t) + \alpha_t$ for the short rate, the calibration task is to determine the parameters $\alpha_t$, $1 \le t \le T$, such that

$$ZCB_t = E^B\left[\exp\left(-\sum_{k \le t} \left(R_k^*(X_k) + \alpha_k\right)\Delta t\right)\right] \tag{15}$$

with the expectation being taken under the risk neutral measure $Q^B$. It is obvious that this determines all the $\alpha_t$ uniquely. When computing this expectation to obtain the $\alpha_t$ by forward induction, it is efficient to use Arrow–Debreu prices $q(t, X_t = x_t)$, since $X$ usually can be implemented by a recombining tree. Summing over the range of states $x_t$ of $X_t$ is more efficient than summing over all paths of $X$. Suppose that $\alpha_k$, $k \le t$, and $q(t, X_t = x_t)$ for all values $x_t$ have been computed already. Using equation (14), one can then compute $\alpha_{t+1}$ from equation

$$ZCB_{t+1} = \sum_{x_{t+1}} \sum_{x_t} q(t, X_t = x_t)\, e^{-(R_{t+1}^*(x_t) + \alpha_{t+1})\Delta t}\, p_t(x_{t+1} | x_t) \tag{16}$$

prices of all state-contingent claims, which pay one unit at some $t$ if $X_t = x_t$ for some $x_t$, already determine the risk neutral transition probabilities of $X$. It is easy to see that these prices are determined by those of calls and puts for sufficiently many strikes and maturities. Indeed, strikes at all tree levels of the stock for each maturity date $t$ are sufficient, since Arrow–Debreu payoffs are equal to those of suitable butterfly options that are combinations of such calls and puts. From given Arrow–Debreu prices $q(t, X_t = x_t)$ for all $t$, $x_t$, the transition probabilities $p_t(x_{t+1} | x_t)$ are computed as follows: starting from the highest stock level $x_t$ at some date $t$, one obtains $p_t(x_t u | x_t)$ by equation (14) with $R_t(x_t) = r$ and $\Delta t = 1$. The remaining transition probabilities $p_t(x_t m | x_t)$, $p_t(x_t d | x_t)$ from $(t, x_t)$ are determined from

$$p_t(x_t u | x_t)\,u + p_t(x_t m | x_t)\,m + p_t(x_t d | x_t)\,d = 1 \tag{17}$$

and $p_t(x_t u | x_t) + p_t(x_t m | x_t) + p_t(x_t d | x_t) = 1$. Using these results, the transition probabilities from the second highest (and subsequent) stock level(s) are implied by equation (14) in a similar way. This yields all transition probabilities for any $t$.

To apply this in practice, the call and put prices for the maturities and strikes required would be obtained from real market quotes, using suitable interpolation, and the trinomial state space (i.e., $\sigma$, $r$, $\Delta t$) has to be chosen appropriately to ensure positivity of all $p_t$, see [4, 5].
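The forward induction for the shifts $\alpha_t$ in the short rate calibration can be sketched as follows. This is an illustration under stated assumptions rather than the procedure of [7]: the state $x_t$ is taken to be the number of up-moves in a recombining binomial tree, both transition probabilities are set to 1/2, and the unshifted rate function `r_star` and the zero-coupon prices are made-up inputs. The rate for the period from $t$ to $t+1$ is evaluated at the state $x_t$, following the indexing of the calibration equation for $ZCB_{t+1}$.

```python
import math


def calibrate_shifts(zcb, r_star, dt):
    """Forward induction for the shifts alpha_1, ..., alpha_T.

    zcb[t] is the target zero-coupon price for maturity t, with zcb[0] = 1.
    r_star(t, x) is the unshifted rate R*_{t+1}(x_t) for the period (t, t+1).
    Returns the shifts and the Arrow-Debreu prices q(t, X_t = x) per level.
    """
    q = [1.0]  # q(0, X_0 = 0) = 1
    levels = [q]
    alphas = []
    for t in range(len(zcb) - 1):
        # Sum over states of q(t, x) * exp(-R* dt); the shift only rescales it.
        unshifted_sum = sum(
            q_x * math.exp(-r_star(t, x) * dt) for x, q_x in enumerate(q)
        )
        alpha = -math.log(zcb[t + 1] / unshifted_sum) / dt
        # Propagate Arrow-Debreu prices one step with probabilities 1/2, 1/2.
        q_next = [0.0] * (t + 2)
        for x, q_x in enumerate(q):
            flow = q_x * math.exp(-(r_star(t, x) + alpha) * dt)
            q_next[x] += 0.5 * flow      # down-move keeps the up-count
            q_next[x + 1] += 0.5 * flow  # up-move increases it by one
        alphas.append(alpha)
        q = q_next
        levels.append(q)
    return alphas, levels


# Hypothetical inputs: target bond prices and a state-dependent rate spread.
zcb = [1.0, 0.97, 0.94, 0.91]
alphas, levels = calibrate_shifts(zcb, lambda t, x: 0.01 * (2 * x - t), dt=1.0)
```

Because each level is summed over states rather than paths, the work per time step grows linearly with the number of nodes, which is the efficiency gain the text attributes to using Arrow–Debreu prices on a recombining tree.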
[6] Harrison, J. & Kreps, D. (1979). Martingales and arbitrage in multiperiod securities markets, Journal of Economic Theory 20, 381–408.
[7] Hull, J. (2006). Options, Futures and Other Derivative Securities, Prentice Hall, Upper Saddle River, New Jersey.
[8] Mas-Colell, A., Whinston, M.D. & Green, J.R. (1995). Microeconomic Theory, Oxford University Press, Oxford.

Related Articles

Arrow, Kenneth; Complete Markets; Dupire Equation; Fundamental Theorem of Asset Pricing; Model Calibration; Pricing Kernels; Risk-neutral Pricing; Stochastic Discount Factors.

DIRK BECHERER & MARK H.A. DAVIS
Options: Basic Definitions

There are several binary classifications that help define an option contract.
while “over the counter” (OTC) options are bilateral agreements between market counterparties. Option exchanges have become increasingly globalized in recent years. They include the US–European consortium NYSE Euronext, Chicago Mercantile Exchange (CME), Eurex and EDX, all of which offer a range of financial contracts, and a number of specialist commodity exchanges such as NYMEX (oil), ICE, and the London Metal Exchange (LME). An exchange offers contracts on an underlying asset such as an individual stock or a stock index such as the S&P500, with a range of maturity times and strike values. New options are added as the old ones roll off, and the strikes offered are in a range around the spot price of the underlying asset at the time the contract is initiated (the options may turn out to be far in or out of the money at later times, of course). In a traded options market, prices are determined by supply and demand. If the exercise times are $T_i$ and the strike values $K_j$ then the matrix $V = [\hat\sigma_{ij}]$, where $\hat\sigma_{ij}$ is the implied volatility corresponding to the $(T_i, K_j)$ contract, defines the so-called volatility surface that plays a key role in option risk management.

All interest-rate options and most FX (foreign exchange) options are OTC, but many are, nevertheless, very liquidly traded and market information on implied volatilities is readily available.

Physical Settlement/Cash Settlement

Many single-stock options and commodity options are physically settled, that is, at exercise the holder pays the strike value and takes delivery of a share certificate or a barrel of oil. (One can, however, avoid physical delivery by selling the option shortly before final maturity.) The alternative is cash settlement, where the holder is simply paid a cash amount, such as $[S_T - K]^+$ for a call option, at exercise. When the underlying is an index like the S&P500 this is the only way: one cannot deliver the index! In this case, the amount paid is $c \times [I_T - K]^+$ where $I_T$ is the value of the index and $c$ is the contractually specified dollar value of one index point.

Liquid/Illiquid

Like any other traded asset, an option contract is liquid if there is large market depth, that is, there are a significant number of active traders in the market, none of whom controls a significant proportion of the total supply. In these circumstances, the price is well established, since the last trade was never very long ago; bid/ask spreads will be tight, buyers and sellers can enter the market at will, and there is little room for price manipulation. By contrast, in an illiquid market, it may be hard to establish a market price when actual trades are infrequent and bid/ask spreads are wide. The liquid/illiquid classification is not immutable: a liquid market can suddenly become illiquid if there is some shock that forces everybody onto the same side of the market. Several well-recorded disasters in the derivatives market have been due to this phenomenon.

(Plain) Vanilla/Exotic

The simplest, most standard, and most widely traded options are often referred to as plain vanilla options. This would certainly include all exchange-traded options. An “exotic” option is an OTC option with nonstandard features of some kind, which requires significant modeling effort to value, and where different analysts could well come up with significantly different valuations. Exotic options often involve several underlying assets and complicated payment streams, but even a simple call option can be exotic if it poses significant hedging difficulties, as for example do long-dated equity options. On the other hand, barrier options, for example, which once would have been considered exotic, have now become vanilla in some markets such as FX, because they are so widely traded.

Path-dependent/Path-independent

An option is path-dependent if its exercise value depends on the value of the underlying asset at more than one time. Examples are barrier and Asian options, and any American option. The exercise value of a path-independent option is a function only of the underlying price, say $S_T$, at the maturity time $T$, as for example in Black–Scholes. Valuation then only requires specifying the one-dimensional risk-neutral distribution of $S_T$, whereas for a path-dependent option a distribution in path space is required, making the valuation more computationally intensive and model dependent.
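The cash-settlement amount and the path-dependent/path-independent distinction can be illustrated with a few payoff functions. This is a small sketch: the function names, the index multiplier, and the two price paths are hypothetical, and the Asian payoff uses an arithmetic average over the observed prices, which is only one of several conventions.

```python
def cash_settled_index_call(index_level, strike, multiplier):
    """Cash amount c * [I_T - K]^+ paid at exercise of an index call."""
    return multiplier * max(index_level - strike, 0.0)


def european_call_payoff(path, strike):
    """Path-independent: depends only on the final price S_T."""
    return max(path[-1] - strike, 0.0)


def asian_call_payoff(path, strike):
    """Path-dependent: depends on the arithmetic average along the path."""
    average = sum(path) / len(path)
    return max(average - strike, 0.0)


# Two hypothetical paths with the same terminal price but different histories.
path_a = [100.0, 120.0, 110.0]
path_b = [100.0, 90.0, 110.0]

european = (european_call_payoff(path_a, 100.0), european_call_payoff(path_b, 100.0))
asian = (asian_call_payoff(path_a, 100.0), asian_call_payoff(path_b, 100.0))
index_cash = cash_settled_index_call(1250.0, 1200.0, multiplier=100.0)
```

Both paths give the same European payoff because they end at the same price, while the Asian payoffs differ, which is why the latter needs a distribution on path space.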
expressed as a present value? Letting $q = 0.5$, we can easily convince ourselves that

$$P = \frac{1}{1.05}\{q \times 5 + (1 - q) \times 0\} = \frac{1}{1.05}\,\mathbb{E}_q[\text{option payoff}] \tag{1}$$

where $\mathbb{E}_q$ is the expectation with respect to the probability $q$. This probability of a stock price increase is not the probability for a price increase observed in the market, but a constructed probability for which the option price can be expressed as a present expected value.

The probability $q$ has an interesting property that actually defines it. The present expected value of the stock price is equal to today’s value,

$$100 = \frac{1}{1.05}\,\mathbb{E}_q[\text{stock price}] \tag{2}$$

Hence, the discounted stock price is a martingale with respect to the probability $q$. Further, the return on an investment in the stock coincides with the risk-free rate under $q$, defending the name “risk-neutral probability” often assigned to $q$.

Option Pricing in Continuous Time

Our binomial one-period example basically contains the main concepts for pricing of options and claims in more general and realistic market models. Moving to a stock price that evolves dynamically in time with stochastic marginal changes, the principles of option pricing remain basically the same, while introducing interesting technical challenges. We now look at the case when the stock price follows a geometric Brownian motion (GBM), that is,

$$\frac{dS(t)}{S(t)} = \mu\,dt + \sigma\,dB(t) \tag{3}$$

defined on a probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ with the filtration $\mathcal{F}_t$ generated by the Brownian motion modeling the information flow. The GBM model indicates that returns are independent and normally distributed, with mean $\mu\,dt$ and volatility $\sigma\sqrt{dt}$ (for logarithmic returns the mean is $(\mu - \sigma^2/2)\,dt$). The model was first proposed for stock price dynamics by Samuelson [7] and later used by Black and Scholes [1] and Merton [6] in their derivation of the famous option pricing formula. We suppose that the market is frictionless in the sense that there are no transaction costs incurred when we trade in the stock or the bank, and there are no restrictions on short or long positions. Further, the interest rate is the same whether we borrow or lend money, and the market is perfectly liquid.

The main difference from the one-period model is that we can invest in the underlying stock at all times up to maturity of the claim. Obviously, we can also do the same with the bank deposit, which is now assumed to yield a continuously compounding interest rate $r$. An investment strategy will consist of $a(t)$ shares of the stock and $b(t)$ units of the bank deposit at time $t$. Since investors cannot foresee the future, the investment decisions at time $t$ can only be based upon the available market information, which is contained in the filtration $\mathcal{F}_t$. The value at time $t$ of the portfolio is

$$V(t) = a(t)S(t) + b(t)R(t) \tag{4}$$

where $R(t) = \exp(rt)$, the value of an initial bank deposit of 1. Further, since we are interested in creating strategies that are replicating an option, we wish to rule out any external funding or withdrawal of money in the portfolio we are setting up. This leads to the so-called self-financing hypothesis, saying that any change in portfolio value comes from a change in the underlying stock price and bank deposit. Mathematically, we can formulate this condition as

$$dV(t) = a(t)\,dS(t) + b(t)\,dR(t) \tag{5}$$

Note that Itô’s formula implies dynamics for $V(t)$ in which the differentials of $a(t)$ and $b(t)$ appear. The self-financing hypothesis states that the terms involving these differentials vanish.

For the one-period binomial model, we recall the existence of an equivalent martingale measure for which the discounted stock price is a martingale. Applying the Girsanov theorem, we find a probability measure $\mathbb{Q}$ equivalent to the market probability $\mathbb{P}$, for which the process $W(t)$ with differential

$$dW(t) = \frac{\mu - r}{\sigma}\,dt + dB(t) \tag{6}$$

is a Brownian motion. By a direct calculation, we find

$$d(e^{-rt}S(t)) = \sigma\,(e^{-rt}S(t))\,dW(t) \tag{7}$$
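The one-period relations (1) and (2) can be checked numerically. In the sketch below, the interest rate of 5%, today's stock price of 100, and the option payoffs 5 and 0 come from the text; the terminal stock prices 110 and 100 and the strike 105 are assumptions chosen so that the risk-neutral probability comes out as $q = 0.5$. The function names are illustrative.

```python
def risk_neutral_probability(spot, up, down, rate):
    """Probability q of the up state under which the discounted stock is a martingale."""
    return ((1.0 + rate) * spot - down) / (up - down)


def one_period_price(payoff_up, payoff_down, q, rate):
    """Present expected value under q, as in equations (1) and (2)."""
    return (q * payoff_up + (1.0 - q) * payoff_down) / (1.0 + rate)


# Assumed terminal prices and strike; spot and rate as in the text.
spot, up, down, rate, strike = 100.0, 110.0, 100.0, 0.05, 105.0

q = risk_neutral_probability(spot, up, down, rate)
call_price = one_period_price(max(up - strike, 0.0), max(down - strike, 0.0), q, rate)
stock_check = one_period_price(up, down, q, rate)  # should reproduce the spot price
```

The second call confirms the defining property of $q$: discounting the expected stock price under $q$ returns today's price.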
Option Pricing: General Principles 3
free. Furthermore, any self-financing strategy that costs less than $P_{\max}$ will always have a positive probability of having a value lower than the claim at maturity, and thus full replication is impossible. This leaves the issuer of the claim with some unhedgable risk. An acceptable or fair price of the claim will reflect the compensation the issuer demands for taking on this risk.

A change in the stock price dynamics gives another source of incompleteness in the market. The GBM model is rather unnatural from an empirical point of view, since observed stock price returns on the marketplace are frequently far from normally distributed, nor are they independent. Stock price models including stochastic volatility and/or stochastic drivers other than Brownian motion have been proposed. For instance, on the basis of empirics, the returns may be modeled by a heavy-tailed distribution, which gives rise to a Lévy process in the geometric dynamics of the stock price. A consequence of such a seemingly innocent change in the structure is that there exists a continuum (in general) of equivalent martingale measures such that the discounted stock price is a martingale. The complicating implication of this is the absence of martingale representations, so that it becomes impossible to find an investment strategy replicating the claim. As for markets with frictions, we have no possibility of replication, but an interval of possible arbitrage-free prices. In addition, in this case, the issuer of the claim needs to accept a certain unhedgable risk.

To price claims in incomplete markets, one must resort to methods that take into account the risk imposed on the issuer. Popular approaches include minimal-variance hedging, where the strategy minimizing the variance (that is, the risk) is sought. The price of the claim is the cost of buying the minimal-variance strategy [8] plus a compensation for the unhedged risk. Another possibility that has gained a lot of attention in the option pricing literature is indifference pricing (see also the seminal work of Hodges and Neuberger [5]). Here, one considers an investor who has two opportunities. Either he/she can invest his/her funds in the market, or he/she can sell a claim and invest his/her funds along with the claim price. In the latter case, he/she has more funds for investment, but on the other hand, he/she faces a claim at maturity. By optimizing his/her expected utility from the two investment scenarios, the indifference price of the claim is defined as the price that makes one indifferent between the two opportunities. The choice of an exponential utility function leads to prices where the singular case of zero risk aversion coincides with the price defined by the minimal entropy martingale measure [4]. This price lends itself to the interpretation of being the price that is equally desirable for both the issuer and the buyer in the case when both parties have zero risk aversion. For all other risk aversions, the seller will charge higher prices, and the buyer will demand lower ones. The difference of the two optimal investment strategies obtained from utility maximization becomes the hedging strategy. This and other similar approaches have gained a lot of academic attention in recent years.

Another path to pricing in incomplete markets is to try to complete the market by adding options. The required number of options to complete the market is closely linked to the number of sources of uncertainty and the number of assets. For example, considering a GBM with a stochastic volatility following the Heston model gives two random sources and one asset. Following the analysis in [2], one call option is sufficient to complete the market. In [2], the necessary and sufficient conditions to complete markets are given in the case when the filtration is spanned by more Brownian motions than there are traded assets.

References

[1] Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, Journal of Political Economy 81, 637–654.
[2] Davis, M. & Obloj, J. (2008). Market completion using options, in Advances in Mathematics of Finance, L. Stettner, ed., Banach Center Publications, pp. 49–60, Vol. 43.
[3] Delbaen, F. & Schachermayer, W. (1994). A general version of the fundamental theorem of asset pricing, Mathematische Annalen 300, 463–520.
[4] El Karoui, N. & Rouge, R. (2000). Pricing via utility maximization and entropy, Mathematical Finance 10(2), 259–276.
[5] Hodges, S. & Neuberger, A. (1989). Optimal replication of contingent claims under transaction costs, Review of Futures Markets 8, 222–239.
[6] Merton, R. (1973). Theory of rational option pricing, Bell Journal of Economics and Management Science 4, 141–183.
[7] Samuelson, P.A. (1965). Proof that properly anticipating prices fluctuate randomly, Industrial Management Review 6, 41–49.
[8] Schweizer, M. (2001). A guided tour through quadratic hedging approaches, in Option Pricing, Interest Rates, and Risk Management, E. Jouini, J. Cvitanic & M. Musiela, eds, Cambridge University Press, pp. 538–574.

Related Articles

Binomial Tree; Black–Scholes Formula; Hedging; Option Pricing Theory: Historical Perspectives.

FRED E. BENTH
Forwards and Futures

Futures and forwards are financial contracts that make it possible to reduce the price risk that arises from the intention to buy or sell certain assets at a later date. A forward contract specifies in advance the price that will be paid at such a later date for the delivery of the asset. This obviously reduces the price risk for that transaction to zero for all parties involved. A futures contract, on the other hand, guarantees that changes in the asset’s price that occur before the delivery date will be compensated for immediately when they arise. This compensation is achieved by offsetting payments into a bank account that is called the margin account. This significantly reduces the price risk associated with the futures transaction, since the only possible remaining source of uncertainty is now due to the interest rate used for the margin account.

The assets that are bought or sold at the delivery date can be storable commodities (such as gold, oil, and agricultural products), nonstorable commodities (such as electricity), or other financial assets (such as stocks, bonds, options, or currencies). Forward contracts are also used by parties to agree in advance on an interest rate that will be paid or charged during a later time period, in so-called forward rate agreements (FRAs). Similarly, one can buy and sell futures on the value of money deposited in a bank account. For such interest rate futures, which include the very popular eurodollar and euribor contracts, there is no actual delivery but the contract is fulfilled by cash settlement instead.

Here, we discuss only the general pricing principles for forwards and futures. We refer to other articles in the encyclopedia for detailed information concerning the delivery procedures and methods to quote prices for specific futures and forward contracts, such as eurodollar futures (see Eurodollar Futures and Options), forward rate agreements (see LIBOR Rate), electricity (see Electricity Forward Contracts), commodity (see Commodity Forward Curve Modeling), and foreign exchange forwards (see Currency Forward Contracts).

Using Futures and Forwards

Forward contracts are usually agreed upon by two parties who directly negotiate the terms of such contracts, which can therefore be very flexible. The two parties need to agree on the specific asset (often called the underlying asset) and on the precise quantities that are bought or sold, on the exact date when the transactions take place (the delivery date), and the price that will be charged on that date (the forward price). Usually the forward price is chosen in such a way that both parties agree to sign the contract without any money changing hands before the delivery date. This implies that the forward contract starts with having zero market value, since both parties are willing to sign it without receiving or paying any money for it. Later on, the contract may have a positive or negative market value, since every change in the market price of the underlying asset will make the existing agreement as written in the contract more beneficial to one of the parties and less beneficial to the other one. The forward contract may, therefore, become a serious liability for one of the two parties involved, so there is the risk that this party is no longer willing or able to honor the terms of the contract on the delivery date. This counterparty risk problem can be avoided by the use of futures contracts.

Futures are standardized contracts that are traded on futures exchanges. When entering a futures contract, a margin account on the futures exchange is opened and a payment into that account is required, to make it possible for the exchange to withdraw money when appropriate. The exchange publishes a futures price for every contract, which is updated regularly to reflect price changes in the underlying. Whenever a new futures price is announced, an amount of cash that is equal to the difference between the new futures price and the previous one is paid into or withdrawn from the margin account, depending on whether one is short the contract or long the contract. Parties that intend to buy the underlying are long the contract, and they, therefore, receive money if the futures price goes up and pay when it goes down. Parties that intend to sell are short the contract, and they, therefore, pay money if the futures price goes up and receive money when it goes down. This procedure is known as marking to market. Since on the delivery date the futures price is always equal to the underlying asset price, a possible difference between the initial futures price and the current asset price has been compensated for by the intermediate payments into the margin account.

Parties with opposite positions in the futures market deal only with the exchange instead of with each other, which explains the need for standardized
contracts and the significant reduction in counterparty risk. Since no cash is needed to enter into a new (long or short) futures contract as long as there is enough money left in the margin account, it is easy to change a position in futures once such an account has been established. One can terminate existing long contracts by simply taking a position in offsetting short contracts or vice versa, and many parties close their position just before the delivery date if they are only interested in compensation for price changes and not in the actual delivery. This makes futures very convenient to use for hedging purposes (see Hedging) and for speculation on an underlying’s price movements. Likewise, it is quite easy for the exchange to close the futures position of a party who refuses to put more money in their margin account when asked to do so in the so-called margin call.

These characteristics have made futures very popular financial instruments and the market for them is huge. In 2008, more than eight billion futures contracts were traded worldwide with underlying assets^a in equity indices (37%), individual equity (31%), interest rates (18%), agricultural goods (5%), energy (3%), currencies (3%), and metals (2%). The most popular are contracts on the S&P 500 and Dow Jones indices, followed by eurodollar and eurobund futures, and contracts on white sugar, soybeans, crude oil, aluminum, and gold. The notional amounts underlying futures on interest rates, equity indices, and currencies at the world’s exchanges were estimated to be 27 trillion, 1.6 trillion, and 175 billion US dollars, respectively, in June 2008^b.

Pricing Methods for Forwards in Discrete Time

To analyze the futures and forward prices, we first look at discrete-time models, and then look at generalizations in continuous time.

Consider a discrete-time market model on a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ with a filtration $(\mathcal{F}_n)_{n \in \mathcal{N}}$ where $\mathcal{N} = \{0, 1, \ldots, N\}$ denotes our discrete-time set. We define assets $S$ and $B$ to model the underlying asset and a bank account, respectively, with associated stochastic price processes $(S_n)_{n \in \mathcal{N}}$ and $(B_n)_{n \in \mathcal{N}}$. We assume that $S$ is adapted and that $B$ is predictable with respect to this filtration, and that both $B$ and $1/B$ are bounded. Associated with the asset $S$ are cash flows $(D_n)_{n \in \mathcal{N}}$ where $D_n$ denotes the sum of all cash flows caused by holding one unit of $S$ at time $n$. These cash flows can be positive (such as dividends when $S$ is a stock, or interest when $S$ is a currency) or negative (such as storage costs when $S$ is a commodity). We will always assume perfect market liquidity (see Liquidity), so all assets can be bought and sold in all possible quantities for their current market prices and no transaction costs (see Transaction Costs) are charged.

The cash flows associated with a forward contract that is initiated at time $T_0 \in \mathcal{N}$ take place at the time of delivery $T_d \in \mathcal{N}$ that is specified in the contract, with $T_0 \le T_d$. At time $T_d$, the asset is delivered while the forward price agreed upon at the initial time $T_0$ for delivery at time $T_d$, which we denote by $F(T_0, T_d)$, is paid in return. Since this forward price needs to be determined at time $T_0$, it should be $\mathcal{F}_{T_0}$-measurable. Moreover, the forward price is chosen in such a way that both parties agree to enter the contract without any cash changing hands at this initial time.

In complete and arbitrage-free markets (see Arbitrage Pricing Theory), it is often possible to find an explicit expression for the forward price $F(T_0, T_d)$, since the cash flows associated with the contract can then be replicated using other assets with known prices. Let us assume that there exists a unique martingale measure $\mathbb{Q}$, which is equivalent to $\mathbb{P}$, such that the discounted versions of tradable assets are martingales under this measure (see Equivalent Martingale Measures). This is almost equivalent to the assumption of a complete and arbitrage-free market; for the exact statement, see Fundamental Theorem of Asset Pricing. Contingent claims that pay a cash-flow stream of $\mathcal{F}_n$-measurable amounts $X_n$ at the times $n \in \mathcal{N}$ in such markets have a unique price $p$ at time $k \in \mathcal{N}$ equal to

$$p_k = B_k \sum_{n \in \mathcal{N},\, n \ge k} \mathbb{E}_{\mathbb{Q}}[X_n / B_n \mid \mathcal{F}_k] \tag{1}$$

A specific example is the zero-coupon bond price at time $k$ for the delivery of one unit cash at time $T > k$, which is equal to $p(k, T) = B_k\,\mathbb{E}_{\mathbb{Q}}[1/B_T \mid \mathcal{F}_k]$.

Suppose that an investor enters into a forward contract at time $T_0$, which obliges him/her to deliver the underlying asset $S$ at time $T_d$, and that he/she buys the underlying asset at time $T_0$ to hold it until delivery. This will lead to a cash flow of $-S_{T_0}$ at time $T_0$, to cash flows $D_n$ at times $\{n \in \mathcal{N} : T_0 \le n \le T_d\}$, and a cash flow of $F(T_0, T_d)$ at time $T_d$ when he/she delivers the asset. Since a forward contract is entered
into without any money changing hands and since the Pricing Methods for Futures in Discrete
net position after delivery will be zero, the value of Time
the cash-flow stream defined above must be zero if
there is no arbitrage in the market. Using the previous All cash flows associated with a futures contract take
equation, we thus find that place via the margin account. Let (Mn )n∈N be the
process describing the value of the margin account
0 = − ST0 + BT0 Ɛ [F (T0 , Td )/BTd | FT0 ] associated with a long position in one future on the
underlying asset S defined above. If f (k, Td ) is the
+ BT0 Ɛ [Dn /Bn | FT0 ] (2) futures price at time k for delivery of one unit of
T0 ≤ n ≤ Td

Since $F(T_0, T_d)$ is $\mathcal{F}_{T_0}$-measurable, this leads to the following expression for a forward price in a complete and arbitrage-free market:

$$F(T_0, T_d) = \frac{S_{T_0}/B_{T_0} - \sum_{T_0 \le n \le T_d} \mathbb{E}^{\mathbb{Q}}[D_n/B_n \mid \mathcal{F}_{T_0}]}{\mathbb{E}^{\mathbb{Q}}[1/B_{T_d} \mid \mathcal{F}_{T_0}]} = \frac{S_{T_0} - B_{T_0} \sum_{T_0 \le n \le T_d} \mathbb{E}^{\mathbb{Q}}[D_n/B_n \mid \mathcal{F}_{T_0}]}{p(T_0, T_d)} \tag{3}$$

In particular, when there are no dividends or storage costs, the forward price is simply equal to the current price of the underlying asset divided by the appropriate discount factor until delivery. For commodities, where the cash flows $D_n$ are often negative since they represent storage costs, this formula (3) is known as the cost-of-carry formula. Conversely, when the actual possession of an underlying asset is more beneficial than just holding the forward contract, this can be modeled by introducing positive cash flows $D_n$. Such benefits are often expressed as a rate, the so-called convenience yield, which may fluctuate as a result of changing expectations concerning the availability of the underlying asset on the delivery date.

The initial price of a forward contract is zero, but when the underlying asset's price changes, so does the value of an existing contract. If we denote by $G(T_0, T_d, k)$ the value at time $k$ of a forward contract entered at time $T_0 \le k$ for delivery at time $T_d \ge k$, then a similar argument as before leads to

$$G(T_0, T_d, k) = B_k\, \mathbb{E}^{\mathbb{Q}}\!\left[ \frac{F(k, T_d) - F(T_0, T_d)}{B_{T_d}} \,\Big|\, \mathcal{F}_k \right] = p(k, T_d)\,\bigl(F(k, T_d) - F(T_0, T_d)\bigr) \tag{4}$$

the asset at time $T_d > k$ (k, Td ∈ N), then the margin account values will satisfy

$$M_{k+1} = M_k \frac{B_{k+1}}{B_k} + f(k+1, T_d) - f(k, T_d) \tag{5}$$

where we assume that the interest rate used for the margin account is the same as the one used for $B$.

Futures prices are determined by supply and demand on the futures exchanges, but if we assume a complete and arbitrage-free market for $S$ and $B$, we can derive a theoretical formula for the futures price. We consider an investment strategy where at a certain time k ∈ N, we open a new margin account, put an initial margin amount $M_k$ into it, and take a long position in a futures contract for delivery at time $T_d$. One time step later, we go short one futures contract for the same delivery date, which effectively closes our futures position, and we then empty our margin account. Since our net position is then zero again and since we do not pay or receive money to go long or short a futures contract, the total value of this cash-flow stream at time $k$ should be equal to zero, so

$$0 = -M_k + B_k\, \mathbb{E}^{\mathbb{Q}}\!\left[ \frac{M_{k+1}}{B_{k+1}} \,\Big|\, \mathcal{F}_k \right] = B_k\, \mathbb{E}^{\mathbb{Q}}\!\left[ \frac{f(k+1, T_d) - f(k, T_d)}{B_{k+1}} \,\Big|\, \mathcal{F}_k \right] \tag{6}$$

Since $B$ was assumed to be predictable, that is, $B_{k+1}$ is $\mathcal{F}_k$-measurable for all k ∈ N \ {N}, we may conclude from the above that the futures price process $f(\cdot, T_d)$ is a $\mathbb{Q}$-martingale for any fixed delivery date Td ∈ N, and hence

$$f(k, T_d) = \mathbb{E}^{\mathbb{Q}}[S_{T_d} \mid \mathcal{F}_k] \tag{7}$$

since $f(T_d, T_d) = S_{T_d}$. Note that this formula no longer holds if $B$ fails to be predictable or when the interest rates paid on the bank account $B$ and the margin account $M$ are different.
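As a rough numerical illustration of the cost-of-carry formula (3), the sketch below assumes a deterministic bank account with a constant per-period rate, so that all conditional expectations reduce to plain discounting. The function name `forward_price`, the rate and the cash-flow figures are hypothetical choices made for this example only.

```python
def forward_price(spot, rate, cash_flows, t0, td):
    """Forward price from formula (3), assuming (for this sketch only) a
    deterministic bank account B_n = (1 + rate)**n and known cash flows.

    cash_flows maps a time n (t0 <= n <= td) to D_n: positive for
    dividends or convenience benefits, negative for storage costs.
    """
    def bank(n):
        return (1.0 + rate) ** n  # B_n with B_0 = 1

    # B_{t0} * sum of D_n / B_n over t0 <= n <= td
    carried = sum(d * bank(t0) / bank(n)
                  for n, d in cash_flows.items() if t0 <= n <= td)
    zero_coupon = bank(t0) / bank(td)  # p(t0, td) under deterministic rates
    return (spot - carried) / zero_coupon


# Hypothetical numbers: spot 100, 1% per period, delivery after 12 periods.
no_carry = forward_price(100.0, 0.01, {}, 0, 12)
with_storage = forward_price(100.0, 0.01, {n: -0.5 for n in range(1, 13)}, 0, 12)
with_dividend = forward_price(100.0, 0.01, {6: 2.0}, 0, 12)
print(no_carry, with_storage, with_dividend)
```

Storage costs (negative $D_n$) push the forward price above the no-carry level, while a dividend or convenience benefit (positive $D_n$) pulls it below.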
As in the discrete-time case, we assume that we have a complete and arbitrage-free market and that there exists a unique measure $\mathbb{Q}$ that is equivalent to $\mathbb{P}$ such that discounted versions of tradable assets become martingales under this measure (see Equivalent Martingale Measures). We model contingent claims by a cumulative cash-flow stream $(X_t)_{t \in [0,T]}$, which is an adapted semimartingale. The total cash amount paid out by the contingent claim between two times $t_1$ and $t_2$ is given by $X_{t_2} - X_{t_1-}$, and $X_t - X_{t-}$ corresponds to a payment at the single time $t$ (with $t, t_1, t_2 \in [0, T]$ and $t_2 \ge t_1$). Such contingent claims have a unique price $p$ in a complete and arbitrage-free market, which at time $t$ is equal to

$$p_t = B_t\, \mathbb{E}^{\mathbb{Q}}\!\left[ \int_t^T \left( \frac{dX_u}{B_{u-}} + d\Bigl[X, \frac{1}{B}\Bigr]_u \right) \Big|\, \mathcal{F}_t \right] \tag{8}$$

The last term involving the brackets compensates for the fact that the cash flows $X$ and the bank account may have nonzero covariation, so it disappears when $B$ has finite variation and is continuous, or when $B$ is deterministic. Compare this to the discrete-time case, where we assumed that $(B_n)$ is predictable.

$$F(T_0, T_d) = \frac{1}{p(T_0, T_d)} \left( S_{T_0} - B_{T_0}\, \mathbb{E}^{\mathbb{Q}}\!\left[ \int_{T_0}^{T_d} \left( \frac{dD_u}{B_{u-}} + d\Bigl[D, \frac{1}{B}\Bigr]_u \right) \Big|\, \mathcal{F}_{T_0} \right] \right) \tag{10}$$

The formula for the value of a forward contract at a later time after $T_0$ is the same as in the discrete-time case.

We now turn to the definition of a futures price process $(f(t, T_d))_{t \in [0, T_d]}$ in continuous time for delivery at a fixed time $T_d \in [0, T]$. Let $(\psi_t)_{t \in [0, T_d]}$ be a futures investment strategy: a bounded and predictable stochastic process such that $\psi_t$ represents the number of futures contracts (positive or negative) we own at time $t$. The associated margin account process $(M_t)_{t \in [0, T]}$ is then defined on $[0, T]$ as

$$dM_t = M_t \frac{dB_t}{B_{t-}} + \psi_t\, df(t, T_d) \tag{11}$$

with $M_0 \in \mathbb{R}$, where we have again assumed that the margin account earns the same interest rate as the bank account $B$. As mentioned before, the futures price process should be equal to the underlying asset price at delivery, so $f(T_d, T_d) = S_{T_d}$.

In a complete and arbitrage-free market, we consider an investment strategy where at any time $t \in [0, T_d]$ we open a new margin account and put an initial margin amount $M_t$ in, go long one futures contract at time $t$, wait until a later date $s \in\, ]t, T_d]$ and
close our futures position by going short one contract, and close our margin account. If there is no arbitrage, the discounted value of the cash flows from this strategy should be zero at time $t$ since we start and end without any position, so

$$M_t = B_t\, \mathbb{E}^{\mathbb{Q}}\!\left[ \frac{M_s}{B_s} \,\Big|\, \mathcal{F}_t \right] \tag{12}$$

This shows that $M/B$ is a martingale under $\mathbb{Q}$, that is, the margin account should be a tradable asset. A bit of stochastic calculus shows that

$$d\!\left( \frac{M_t}{B_t} \right) = \frac{df(t, T_d)}{B_{t-}} + d\Bigl[ f(\cdot, T_d), \frac{1}{B} \Bigr]_t \tag{13}$$

and we see that if $B$ is continuous, of finite variation, bounded, and bounded away from zero, then the futures price process $f(\cdot, T_d)$ is itself a martingale under $\mathbb{Q}$ and hence

$$f(t, T_d) = \mathbb{E}^{\mathbb{Q}}[f(T_d, T_d) \mid \mathcal{F}_t] = \mathbb{E}^{\mathbb{Q}}[S_{T_d} \mid \mathcal{F}_t] \tag{14}$$

Note that in this case the difference between the forward and futures prices can be expressed as

$$F(T_0, T_d) - f(T_0, T_d) = \frac{B_{T_0}}{p(T_0, T_d)} \left[ \mathbb{E}^{\mathbb{Q}}\!\left[ \frac{S_{T_d}}{B_{T_d}} \,\Big|\, \mathcal{F}_{T_0} \right] - \mathbb{E}^{\mathbb{Q}}[S_{T_d} \mid \mathcal{F}_{T_0}]\; \mathbb{E}^{\mathbb{Q}}\!\left[ \frac{1}{B_{T_d}} \,\Big|\, \mathcal{F}_{T_0} \right] \right] \tag{15}$$

Since the expression in brackets is the $\mathcal{F}_{T_0}$-conditional covariance between $S_{T_d}$ and $1/B_{T_d}$, we immediately see that forward and futures prices coincide if and only if these two stochastic variables are uncorrelated when conditioned on $\mathcal{F}_{T_0}$, for example, when the bank account $B$ is deterministic.

Extensions

For clarity of exposition, we have focused here on forward and futures prices in complete and arbitrage-free markets without transaction costs (see Transaction Costs). Early papers on the theoretical pricing methods are by Black [3] for deterministic interest rates and Cox et al. [5] and Jarrow and Oldfield [9] for the general case. Continuous resettlement is treated by Duffie and Stanton [7] and Karatzas and Shreve [10], see also [12]. See [2] for a very clear summary of the principles involved. For excellent introductions to the practical organization of futures and forward markets and for empirical results on prices, the books by Duffie [6], Hull [8], and Kolb [11] are recommended.

For incomplete markets, there is a theory of equilibrium in futures markets under mean–variance preferences; see, for example, [14] and the consumption-based capital asset pricing model of Breeden [4] (see also Capital Asset Pricing Model). Many futures allow a certain flexibility regarding the exact product that must be delivered and regarding the time of delivery. The value of this last "timing option" is analyzed in a paper by Biagini and Björk [1].

When the bank account process $B$ is not of finite variation and continuous, the futures price is no longer a martingale under $\mathbb{Q}$; however, under some technical conditions, it can be shown to be a martingale under another equivalent measure that can be found using a multiplicative Doob–Meyer decomposition (see Doob–Meyer Decomposition) as shown in [15]. The assumption that $B$ and $1/B$ are bounded is often too restrictive in practice; see [13] for weaker conditions.

End Notes

a. Sector estimates based on the US data, by the Futures Industry Association.
b. Quarterly Review, December 2008, Bank for International Settlements.

References

[1] Biagini, F. & Björk, T. (2007). On the timing option in a futures contract, Mathematical Finance 17(2), 267–283.
[2] Björk, T. (2004). Arbitrage Theory in Continuous Time, 2nd Edition, Oxford University Press.
[3] Black, F. (1976). The pricing of commodity contracts, Journal of Financial Economics 3(1–2), 167–179.
[4] Breeden, D.T. (1980). Consumption risk in futures markets, Journal of Finance 35(2), 503–520.
[5] Cox, J.C., Ingersoll, J. Jr. & Ross, S.A. (1981). The relation between forward prices and futures prices, Journal of Financial Economics 9(4), 321–346.
[6] Duffie, D. (1989). Futures Markets, Prentice-Hall.
[7] Duffie, D. & Stanton, R. (1992). Pricing continuously resettled contingent claims, Journal of Economic Dynamics and Control 16(3–4), 561–573.
[8] Hull, J. (2003). Options, Futures and Other Derivatives, 5th Edition, Prentice-Hall.
[9] Jarrow, R.A. & Oldfield, G.S. (1981). Forward contracts and futures contracts, Journal of Financial Economics 9(4), 373–382.
[10] Karatzas, I. & Shreve, S. (1998). Methods of Mathematical Finance, Springer-Verlag.
[11] Kolb, R. (2003). Futures, Options, and Swaps, 4th Edition, Blackwell Publishing.
[12] Norberg, R. & Steffensen, M. (2005). What is the time value of a stream of investments? Journal of Applied Probability 42, 861–866.
[13] Pozdnyakov, V. & Steele, J.M. (2004). On the martingale framework for futures prices, Stochastic Processes and Their Applications 109, 69–77.
[14] Richard, S.F. & Sundaresan, M.S. (1981). A continuous time equilibrium model of forward prices and futures prices in a multigood economy, Journal of Financial Economics 9(4), 347–371.
[15] Vellekoop, M. & Nieuwenhuis, H. (2007). Cash Dividends and Futures Prices on Discontinuous Filtrations. Technical Report 1838, University of Twente.

Related Articles

Commodity Forward Curve Modeling; Currency Forward Contracts; Electricity Forward Contracts; Eurodollar Futures and Options; LIBOR Rate.

MICHEL VELLEKOOP
Black–Scholes Formula

"If options are correctly priced in the market, it should not be possible to make sure profits by creating portfolios of long and short positions in options and their underlying stocks. Using this principle, a theoretical valuation formula for options is derived." These sentences, from the abstract of the great paper [2] by Fischer Black and Myron Scholes, encapsulate the basic idea that, with the asset price model they employ, insisting on absence of arbitrage is enough to obtain a unique value for a call option on the asset. The resulting formula, equation (6) below, is the most famous formula in financial economics, and, in fact, that whole subject splits decisively into the pre-Black–Scholes and post-Black–Scholes eras.

This article aims to give a self-contained derivation of the formula, some discussion of the hedge parameters, and some extensions of the formula, and to indicate why a formula based on a stylized mathematical model, which is known not to be a particularly accurate representation of real asset prices, has nevertheless proved so effective in the world of option trading. The section The Model and Formula formulates the model and states and proves the formula. As is well known, the formula can equally well be stated in the form of a partial differential equation (PDE); this is equation (9) below. The next section discusses the PDE aspects of Black–Scholes. The section Hedge Parameters summarizes information about the option "greeks", while the sections The Black "Forward" Option Formula and A Universal Black Formula introduce what is actually a more useful form of Black–Scholes, usually known as the Black formula. Finally, the section Implied Volatility and Market Trading discusses the applications of the formula in market trading. We define the implied volatility and demonstrate a "robustness" property of Black–Scholes, which implies that effective hedging can be achieved even if the "true" price process is substantially different from Black and Scholes' stylized model.

The Model and Formula

Let $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \mathbb{R}_+}, \mathbb{P})$ be a probability space with a given filtration $(\mathcal{F}_t)$ representing the flow of information in the market. Traded asset prices are $\mathcal{F}_t$-adapted stochastic processes on $(\Omega, \mathcal{F}, \mathbb{P})$. We assume that the market is frictionless; assets may be held in arbitrary amount, positive and negative, the interest rate for borrowing and lending is the same, and there are no transaction costs (i.e., the bid–ask spread is 0). While there may be many traded assets in the market, we fix attention on two of them. First, there is a "risky" asset whose price process $(S_t, t \in \mathbb{R}_+)$ is assumed to satisfy the stochastic differential equation (SDE)

$$dS_t = \mu S_t\, dt + \sigma S_t\, dw_t \tag{1}$$

with given drift $\mu$ and volatility $\sigma$. Here $(w_t, t \in \mathbb{R}_+)$ is an $(\mathcal{F}_t)$-Brownian motion. Equation (1) has a unique solution: if $S_t$ satisfies equation (1), then by the Itô formula

$$d \log S_t = \left( \mu - \tfrac{1}{2}\sigma^2 \right) dt + \sigma\, dw_t \tag{2}$$

so that $S_t$ satisfies equation (1) if and only if

$$S_t = S_0 \exp\!\left( \left( \mu - \tfrac{1}{2}\sigma^2 \right) t + \sigma w_t \right) \tag{3}$$

Asset $S_t$ is assumed to have a constant dividend yield $q$, that is, the holder receives a dividend payment $q S_t\, dt$ in the time interval $[t, t + dt[$. Secondly, there is a riskless asset paying interest at a fixed continuously compounding rate $r$. The exact form of this asset is unimportant: it could be a money-market account in which \$1 deposited at time $s$ grows to \$$e^{r(t-s)}$ at time $t$, or it could be a zero-coupon bond maturing with a value of \$1 at some time $T$, so that its value at $t \le T$ is

$$B_t = \exp(-r(T - t)) \tag{4}$$

This grows, as required, at rate $r$:

$$dB_t = r B_t\, dt \tag{5}$$

Note that equation (5) does not depend on the final maturity $T$ (the same growth rate is obtained from any zero-coupon bond) and the choice of $T$ is a matter of convenience.

A European call option on $S_t$ is a contract, entered at time 0 and specified by two parameters $(K, T)$, which gives the holder the right, but not the obligation, to purchase 1 unit of the risky asset at price $K$ at time $T > 0$. (In the frictionless market setting, an option to buy $N$ units of stock is equivalent
to $N$ options on a single unit, so we do not need to include quantity as a parameter.) If $S_T \le K$ the option is worthless and will not be exercised. If $S_T > K$ the holder can exercise his option, buying the asset at price $K$, and then immediately selling it at the prevailing market price $S_T$, realizing a profit of $S_T - K$. Thus, the exercise value of the option is $[S_T - K]^+ = \max(S_T - K, 0)$. Similarly, the exercise value of a European put option, conferring on the holder the right to sell at a fixed price $K$, is $[K - S_T]^+$. In either case, the exercise value is nonnegative and, in the above model, is strictly positive with positive probability, so the option buyer should pay the writer a premium to acquire it. Black and Scholes [2] showed that there is a unique arbitrage-free value for this premium.

Theorem 1

1. In the above model, the unique arbitrage-free value at time $t < T$ when $S_t = S$ of the call option maturing at time $T$ with strike $K$ is

$$C(t, S) = e^{-q(T-t)} S N(d_1) - e^{-r(T-t)} K N(d_2) \tag{6}$$

where $N(\cdot)$ denotes the cumulative standard normal distribution function

$$N(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-\frac{1}{2} y^2}\, dy \tag{7}$$

3. The value of the put option with exercise time $T$ and strike $K$ is

$$P(t, S) = e^{-r(T-t)} K N(-d_2) - e^{-q(T-t)} S N(-d_1) \tag{11}$$

To prove the theorem, we are going to show that the call option value can be replicated by a dynamic trading strategy investing in the asset $S_t$ and in the zero-coupon bond $B_t = e^{-r(T-t)}$. A trading strategy is specified by an initial capital $x$ and a pair of adapted processes $\alpha_t, \beta_t$ representing the number of units of $S$, $B$ respectively held at time $t$; the portfolio value at time $t$ is then $X_t = \alpha_t S_t + \beta_t B_t$, and by definition $x = \alpha_0 S_0 + \beta_0 B_0$. The trading strategy $(x, \alpha, \beta)$ is admissible if

(i) $\int_0^T \alpha_t^2 S_t^2\, dt < \infty$ a.s.
(ii) $\int_0^T |\beta_t|\, dt < \infty$ a.s.
(iii) There exists a constant $L \ge 0$ such that $X_t \ge -L$ for all $t$, a.s. (12)

The gain from trade in $[s, t]$ is

$$\int_s^t \alpha_u\, dS_u + \int_s^t \beta_u\, dB_u + \int_s^t q \alpha_u S_u\, du$$
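The call and put values (6) and (11) are straightforward to evaluate numerically. The sketch below is illustrative only: it assumes the standard definitions $d_1 = [\log(S/K) + (r - q + \sigma^2/2)\tau]/(\sigma\sqrt{\tau})$ and $d_2 = d_1 - \sigma\sqrt{\tau}$ with $\tau = T - t$, which agree with the forward-price form (37) given later in the article, and the function names and parameter values are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function N(x) of equation (7)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1_d2(S, K, tau, r, q, sigma):
    # Assumed standard definitions of d1 and d2 (see the lead-in).
    d1 = (math.log(S / K) + (r - q + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    return d1, d1 - sigma * math.sqrt(tau)


def bs_call(S, K, tau, r, q, sigma):
    """Call value, equation (6), with tau = T - t."""
    d1, d2 = d1_d2(S, K, tau, r, q, sigma)
    return math.exp(-q * tau) * S * norm_cdf(d1) - math.exp(-r * tau) * K * norm_cdf(d2)


def bs_put(S, K, tau, r, q, sigma):
    """Put value, equation (11), with tau = T - t."""
    d1, d2 = d1_d2(S, K, tau, r, q, sigma)
    return math.exp(-r * tau) * K * norm_cdf(-d2) - math.exp(-q * tau) * S * norm_cdf(-d1)


# Hypothetical inputs: at-the-money, one year, r = 5%, q = 0, sigma = 20%.
call = bs_call(100.0, 100.0, 1.0, 0.05, 0.0, 0.2)
put = bs_put(100.0, 100.0, 1.0, 0.05, 0.0, 0.2)
parity_gap = (call - put) - (100.0 - math.exp(-0.05) * 100.0)
print(round(call, 4), round(put, 4), parity_gap)
```

The last quantity checks the put–call parity relation $C - P = e^{-q\tau}S - e^{-r\tau}K$ and should vanish up to rounding error.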
the so-called risk-neutral measure $\mathbb{Q}$ on $(\Omega, \mathcal{F}_T)$ by the Radon–Nikodým derivative

$$\frac{d\mathbb{Q}}{d\mathbb{P}} = \exp\!\left( -\theta w_T - \tfrac{1}{2}\theta^2 T \right) \tag{15}$$

(The right-hand side has expectation 1, since $w_T \sim N(0, T)$.) Expectation with respect to $\mathbb{Q}$ will be denoted $\mathbb{E}^{\mathbb{Q}}$. By the Girsanov theorem, $\check{w}_t = w_t + \theta t$ is a $\mathbb{Q}$-Brownian motion, so that from equation (1) the SDE satisfied by $S_t$ under $\mathbb{Q}$ is

$$dS_t = (r - q) S_t\, dt + \sigma S_t\, d\check{w}_t \tag{16}$$

so that for $t < T$

$$S_T = S_t \exp\!\left( \left( r - q - \tfrac{1}{2}\sigma^2 \right)(T - t) + \sigma(\check{w}_T - \check{w}_t) \right) \tag{17}$$

Applying the Itô formula and equation (14) we find that, with $\tilde{X}_t = e^{-rt} X_t$ and $\tilde{S}_t = e^{-rt} S_t$,

$$d\tilde{X}_t = \alpha_t \tilde{S}_t \sigma\, d\check{w}_t \tag{18}$$

Thus $e^{-rt} X_t$ is a $\mathbb{Q}$-local martingale under condition (12)(i). Let $h(S) = [S - K]^+$ and suppose there exists a replicating strategy, that is, a strategy $(x, \alpha, \beta)$ with value process $X_t$ constructed as in equation (14) such that $X_T = h(S_T)$ a.s. Suppose also that $\alpha_t$ satisfies the stronger condition

$$\mathbb{E}^{\mathbb{Q}} \int_0^T \alpha_t^2 S_t^2\, dt < \infty \tag{19}$$

Then $\tilde{X}_t$ is a $\mathbb{Q}$-martingale, and hence for $t < T$

$$X_t = e^{-r(T-t)}\, \mathbb{E}^{\mathbb{Q}}[h(S_T) \mid \mathcal{F}_t] \tag{20}$$

$$C(t, S) = \frac{e^{-r(T-t)}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} h\!\left( S \exp\!\left( (r - q - \sigma^2/2)(T - t) - \sigma x \sqrt{T - t} \right) \right) e^{-\frac{1}{2} x^2}\, dx \tag{22}$$

Straightforward calculations show that this integral is equal to the closed-form expression in equation (6).

The argument so far shows that if there is a replicating strategy, the initial capital required must be $x = C(0, S_0)$ where $C$ is defined by equation (22). It remains to identify the strategy $(x, \alpha, \beta)$ and to show that it is admissible. Let us temporarily take for granted the assertions of part (2) of the theorem; these will be proved in Theorem 3 below, where we also show that $(\partial C/\partial S)(t, S) = e^{-q(T-t)} N(d_1)$, so that in particular $0 < \partial C/\partial S < 1$.

The replicating strategy is $A = (x, \alpha, \beta)$ defined by

$$x = C(0, S_0), \qquad \alpha_t = \frac{\partial C}{\partial S}(t, S_t), \qquad \beta_t = \frac{1}{r B_t} \left( \frac{\partial C}{\partial t} + \frac{1}{2} \sigma^2 S_t^2 \frac{\partial^2 C}{\partial S^2} - q S_t \frac{\partial C}{\partial S} \right) \tag{23}$$

Indeed, using the PDE (9) we find that $X_t = \alpha_t S_t + \beta_t B_t = C(t, S_t)$, so that $A$ is replicating and also $X_t \ge 0$, so that condition (12)(iii) is satisfied. From equation (17)

$$S_t^2 = S_0^2 \exp\!\left( (2r - 2q - \sigma^2) t + 2\sigma \check{w}_t \right) \tag{24}$$

so that $\mathbb{E}^{\mathbb{Q}}[S_t^2] = S_0^2 \exp((2r - 2q + \sigma^2) t)$. Since $|e^{-r(T-t)}\, \partial C/\partial S| < 1$, this shows that $\mathbb{E}^{\mathbb{Q}} \int_0^T \alpha_t^2 S_t^2\, dt < \infty$, that is, condition (19) is satisfied. Since $\beta_t$ is, almost surely (a.s.), a continuous function of $t$ it
satisfies equation (12)(ii). Thus $A$ is admissible. The gain from trade in an interval $[s, t]$ is

$$\int_s^t \alpha_u\, dS_u + \int_s^t q \alpha_u S_u\, du + \int_s^t \beta_u\, dB_u = \int_s^t \frac{\partial C}{\partial S}\, dS + \int_s^t \left( \frac{\partial C}{\partial t} + \frac{1}{2} \sigma^2 S_u^2 \frac{\partial^2 C}{\partial S^2} \right) du = \int_s^t dC = C(t, S_t) - C(s, S_s) \tag{25}$$

(We obtain the first equality from the definition of $\alpha$, $\beta$, and it turns out to be just the Itô formula applied to the function $C$.) This confirms the self-financing property and completes the proof.

Finally, part (3) of the theorem follows from the model-free put–call parity relation $C - P = e^{-q(T-t)} S - e^{-r(T-t)} K$ and symmetry of the normal distribution: $N(-x) = 1 - N(x)$.

The replicating strategy derived above is known as delta hedging: the number of units of the risky asset held in the portfolio is equal to the Black–Scholes delta, $\Delta = \partial C/\partial S$.

So far, we have concentrated entirely on the hedging of call options. We conclude this section by showing that, with the class of trading strategies we have defined, there are no arbitrage opportunities in the Black–Scholes model.

Theorem 2 There is no admissible trading strategy in a single asset and the zero-coupon bond that generates an arbitrage opportunity, in the Black–Scholes model.

Proof Suppose $X_t$ is the portfolio value process corresponding to an admissible trading strategy $(x, \alpha, \beta)$. There is an arbitrage opportunity if $x = 0$ and, for some $t$, $X_t \ge 0$ a.s. and $\mathbb{P}[X_t > 0] > 0$, or equivalently $\mathbb{E}[X_t] > 0$. This is the $\mathbb{P}$-expectation, but $\mathbb{E}[X_t] > 0 \Leftrightarrow \mathbb{E}^{\mathbb{Q}}[\tilde{X}_t] > 0$ since $\mathbb{P}$ and $\mathbb{Q}$ are equivalent measures and $e^{-rt} > 0$. From equation (18), $\tilde{X}_t$ is a $\mathbb{Q}$-local martingale which, by the definition of admissibility, is bounded below by a constant $-L$. It follows that $\tilde{X}_t$ is a supermartingale, so if $x = 0$, then $\mathbb{E}^{\mathbb{Q}}[\tilde{X}_t] \le 0$ for any $t$. So no arbitrage can arise from the strategy $(0, \alpha, \beta)$.

The Black–Scholes Partial Differential Equation

Theorem 3

1. The Black–Scholes PDE (9) with boundary condition (10) has a unique $C^{1,2}$ solution, given by equation (6).
2. The Black–Scholes "delta", $\Delta(t, S)$, is given by

$$\Delta(t, S) = \frac{\partial C}{\partial S}(t, S) = e^{-q(T-t)} N(d_1) \tag{26}$$

Proof It can, with some pain, be directly checked that $C(t, S)$ defined by equation (6) does satisfy the Black–Scholes PDE (9), (10), and a further calculation (not quite as simple as it appears) gives the formula (26) for the Black–Scholes delta. It is, however, enlightening to take the original route of Black and Scholes and relate the equation (9) to a simpler equation, the heat equation. Note from the explicit expression (17) for the price process under the risk-neutral measure that, given the starting point $S_t$, there is a one-to-one relation between $S_T$ and the Brownian increment $\check{w}_T - \check{w}_t$. We can therefore always express things interchangeably in "$S$ coordinates" or in "$\check{w}$ coordinates". In fact, we already made use of this in deriving the integral price expression (22). Here we proceed as follows.

For fixed parameters $S_0, r, q, \sigma$, define the functions $\phi : \mathbb{R}_+ \times \mathbb{R} \to \mathbb{R}_+$ and $u : [0, T[ \times \mathbb{R} \to \mathbb{R}_+$ by

$$\phi(t, x) = S_0 \exp\!\left( \left( r - q - \tfrac{1}{2}\sigma^2 \right) t + \sigma x \right) \tag{27}$$

and

$$u(t, x) = C(t, \phi(t, x)) \tag{28}$$

Note that the inverse function $\psi(t, s) = \phi^{-1}(t, s)$ (i.e., the solution for $x$ of the equation $s = \phi(t, x)$) is

$$\psi(t, s) = \frac{1}{\sigma} \left( \log \frac{s}{S_0} - \left( r - q - \tfrac{1}{2}\sigma^2 \right) t \right) \tag{29}$$

A direct calculation shows that $C$ satisfies equation (9) if and only if $u$ satisfies the heat equation

$$\frac{\partial u}{\partial t} + \frac{1}{2} \frac{\partial^2 u}{\partial x^2} - r u = 0 \tag{30}$$
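The delta formula (26) can be sanity-checked numerically. The sketch below compares the closed-form delta with a central finite difference of the call price (6); it again assumes the standard definitions of $d_1$ and $d_2$ (consistent with (37)), and the parameter values and function names are hypothetical.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d1(S, K, tau, r, q, sigma):
    # Assumed standard definition of d1, consistent with equation (37).
    return (math.log(S / K) + (r - q + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))


def bs_call(S, K, tau, r, q, sigma):
    """Call value, equation (6), with tau = T - t."""
    x = d1(S, K, tau, r, q, sigma)
    return (math.exp(-q * tau) * S * norm_cdf(x)
            - math.exp(-r * tau) * K * norm_cdf(x - sigma * math.sqrt(tau)))


def bs_delta(S, K, tau, r, q, sigma):
    """Closed-form delta, equation (26)."""
    return math.exp(-q * tau) * norm_cdf(d1(S, K, tau, r, q, sigma))


# Hypothetical inputs; h is the bump size for the central difference.
S, K, tau, r, q, sigma, h = 100.0, 100.0, 1.0, 0.05, 0.02, 0.2, 0.01
fd_delta = (bs_call(S + h, K, tau, r, q, sigma)
            - bs_call(S - h, K, tau, r, q, sigma)) / (2.0 * h)
closed_delta = bs_delta(S, K, tau, r, q, sigma)
print(fd_delta, closed_delta)
```

The two numbers should agree to several decimal places, and the closed-form value lies strictly between 0 and 1, as used in the admissibility argument.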
If $W_t$ is Brownian motion on some probability space and $u$ is a $C^{1,2}$ function, then an application of the Itô formula shows that

$$d\bigl(e^{-rt} u(t, W_t)\bigr) = e^{-rt} \left( \frac{\partial u}{\partial t} + \frac{1}{2} \frac{\partial^2 u}{\partial x^2} - r u \right) dt + e^{-rt} \frac{\partial u}{\partial x}\, dW_t \tag{31}$$

If $u$ satisfies equation (30) with boundary condition $u(T, x) = g(x)$ and

$$\mathbb{E} \int_0^T \left( \frac{\partial u}{\partial x}(t, W_t) \right)^2 dt < \infty \tag{32}$$

then the process $t \mapsto e^{-rt} u(t, W_t)$ is a martingale so that, with $\mathbb{E}_{t,x}$ denoting the conditional expectation given $W_t = x$,

$$e^{-rt} u(t, x) = \mathbb{E}_{t,x}[e^{-rT} u(T, W_T)] = \mathbb{E}_{t,x}[e^{-rT} g(W_T)] \tag{33}$$

Since $W_T \sim N(x, T - t)$, this shows that $u$ is given by

$$u(t, x) = \frac{e^{-r(T-t)}}{\sqrt{2\pi(T - t)}} \int_{-\infty}^{\infty} g(y)\, e^{-\frac{(y - x)^2}{2(T - t)}}\, dy \tag{34}$$

A sufficient condition for equation (32) is

$$\frac{1}{\sqrt{2\pi T}} \int_{-\infty}^{\infty} g^2(y)\, e^{-y^2/2T}\, dy < \infty \tag{35}$$

In our case, the boundary condition is $g(x) = [\phi(T, x) - K]^+ < \phi(T, x)$ and this condition is easily checked. Hence, equation (30) with this boundary condition has unique $C^{1,2}$ solution (34), implying that the inverse function $C(t, S) = u(t, \psi(t, S))$ given by equation (22) is the unique $C^{1,2}$ solution of equation (9) as claimed.

Hedge Parameters

Table 1

Parameter   Symbol   Definition   Expression
Theta   $\Theta$   $-\partial C/\partial \tau$   $-\dfrac{e^{-q\tau} S N'(d_1) \sigma}{2\sqrt{\tau}} + q e^{-q\tau} S N(d_1) - r K e^{-r\tau} N(d_2)$
Rho   P   $\partial C/\partial r$   $K \tau e^{-r\tau} N(d_2)$
Vega   $\Upsilon$   $\partial C/\partial \sigma$   $e^{-q\tau} S \sqrt{\tau}\, N'(d_1)$

Bringing in all the parameters, the Black–Scholes formula (6) is a six-parameter function $C(t, S) = C(\tau, S, K, r, q, \sigma)$, where $\tau = T - t$ is the time to maturity. For risk-management purposes, it is important to know the sensitivities of the option value to changes in the parameters. The conventional hedge parameters or "greeks" are given in Table 1. There are slight notational problems in that "vega" is not the name of a Greek letter (here we have used upper-case upsilon, but this is not necessarily a conventional choice) and upper-case rho coincides with Latin P, so this parameter is usually written $\rho$, risking confusion with correlation parameters. The expressions in the right-hand column are readily obtained from the sensitivity parameters (42) and (43) of the "universal" Black Formula introduced below.

Delta is, of course, the Black–Scholes hedge ratio. Gamma measures the convexity of $C$ and is at its maximum when the option is close to being at the money. Since gamma is the rate of change of delta, frequent rebalancing of the hedge portfolio will be required in areas of high gamma. Theta is defined as $-\partial C/\partial \tau$ and is generally negative (as can be seen from the table, it is always negative for a call option on an asset with no dividends). It represents the "time decay" in the option value as the maturity time is reduced, that is, real time advances. As regards rho, it is not immediately obvious, without doing the calculation, what its sign will be: on one hand, increasing $r$ increases the forward price, pushing a call option further into the money, while on the other hand increased $r$ implies heavier discounting, reducing option value. As can be seen from the table,
the first effect wins: rho is always positive. Vega is in some ways the most important parameter, since a key risk in managing books of traded options is "vega risk", and in Black–Scholes this is completely "outside the model". Bringing it back inside the model is the subject of stochastic volatility.

An extensive discussion of the risk parameters and their uses can be found in Hull [6].

The Black "Forward" Option Formula

The six-parameter representation $C(\tau, S, K, r, q, \sigma)$ is not the best parameterization of Black–Scholes. For the asset $S_t$ with dividend yield $q$, the forward price at time $t$ for delivery at time $T$ is $F(t, T) = S_t e^{(r-q)(T-t)}$ (this is a model-free result, not related to the Black–Scholes model). We can trivially re-express the price formula (6) as

$$C(t, S_t) = B(t, T)\bigl(F(t, T) N(d_1) - K N(d_2)\bigr) \tag{36}$$

with

$$d_1 = \frac{\log(F(t, T)/K) + \frac{1}{2}\sigma^2 (T - t)}{\sigma \sqrt{T - t}}, \qquad d_2 = d_1 - \sigma \sqrt{T - t} \tag{37}$$

risk-neutral measure as $S_t = F(0, t) M_t$ where $M_t$ is the exponential martingale

$$M_t = \exp\!\left( \sigma \check{w}_t - \tfrac{1}{2}\sigma^2 t \right) \tag{38}$$

which is equivalent to equation (17). This model accords with the general fact that, in a world of deterministic interest rates, the forward price is the expected price in the risk-neutral measure, that is, the ratio $S_t/F(0, t)$ is a positive martingale with expectation 1. The exponential martingale (38) is the simplest continuous-path process with these properties.

A Universal Black Formula

The parameterization of Black–Scholes can be further compressed as follows. First, note that $\sigma$ and $\tau = (T - t)$ do not appear separately, but only in the combination $a = \sigma \sqrt{T - t}$, where $a^2$ is sometimes known as the operational time. Next, define the "moneyness" $m$ as $m(t, T) = K/F(t, T)$, and define

$$d(a, m) = \frac{a}{2} - \frac{\log m}{a} \tag{39}$$

(so that $d_1 = d(\sigma \sqrt{T - t}, K/F(t, T))$). Then the Black formula (36) becomes

$$C = B F f(a, m) \tag{40}$$
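The compression to two parameters can be checked numerically. The sketch below assumes the explicit form $f(a, m) = N(d(a, m)) - m\,N(d(a, m) - a)$, obtained here by dividing (36) by $BF$ and substituting (39); this expression is a derived assumption for illustration rather than a quotation from the article, and the input values are hypothetical.

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d(a, m):
    """Equation (39)."""
    return 0.5 * a - math.log(m) / a


def f(a, m):
    # Assumed form, derived from (36) and (39): C / (B F) = N(d1) - m N(d1 - a).
    return norm_cdf(d(a, m)) - m * norm_cdf(d(a, m) - a)


def black_call(B, F, K, sigma, tau):
    """Black formula (36) with d1, d2 from (37)."""
    d1 = (math.log(F / K) + 0.5 * sigma ** 2 * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return B * (F * norm_cdf(d1) - K * norm_cdf(d2))


# Hypothetical market data: spot 100, r = 5%, q = 2%, one year, strike 110.
S, K, r, q, sigma, tau = 100.0, 110.0, 0.05, 0.02, 0.25, 1.0
B = math.exp(-r * tau)          # discount factor B(t, T)
F = S * math.exp((r - q) * tau)  # forward price F(t, T)
a, m = sigma * math.sqrt(tau), K / F

price_36 = black_call(B, F, K, sigma, tau)
price_40 = B * F * f(a, m)
print(price_36, price_40)
```

Both routes give the same price, and $f$ depends on the inputs only through the pair $(a, m)$.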
Figure 1 The universal Black–Scholes function (plotted against moneyness $m$ and $a$)

This minimal parameterization of Black–Scholes is used in studies of stochastic volatility; see, for example, Gatheral [5].

Implied Volatility and Market Trading

So far, our discussion has been entirely within the Black–Scholes model. What happens if we attempt to use Black–Scholes delta hedging in real market trading? This question has been considered by several authors, including El Karoui et al. [3] and Fouque et al. [4], though neither of these discusses the effect of jumps in the price process.

In the universal price formula (40), the parameters $B, F, m$ are market data, so we can regard the formula as a mapping $a \mapsto p = B F f(a, m)$ from $a$ to price $p \in [B[F - K]^+, BF)$. In a traded options market, $p$ is market data (but must lie in the stated interval, else there is a static arbitrage opportunity). In view of equation (42), $f(a, m)$ is strictly increasing in $a$ and hence there is a unique value $a = \hat{a}(p)$ such that $p = B F f(\hat{a}(p), m)$. The implied volatility is $\hat{\sigma}(p) = \hat{a}(p)/\sqrt{T - t}$. If the underlying price process $S_t$ actually were geometric Brownian motion (1), then $\hat{\sigma}$ would be the same, and equal to the volatility $\sigma$, for call options of all strikes and maturities. Of course, this is never the case in practice; see [5] for

$$S_t = S_0 + \int_0^t \eta_u S_{u-}\, du + \int_0^t \kappa_u S_{u-}\, dW_u + \int_{[0,t] \times \mathbb{R}} S_{u-} v_u(z)\, \mu(du, dz) \tag{44}$$

where $\mu$ is a finite-activity Poisson random measure, so that there is a finite measure $\nu$ on $\mathbb{R}$ such that $\mu([0, t] \times A) - \nu(A) t \equiv (\mu - \pi)([0, t] \times A)$ is a martingale for each $A \in \mathcal{B}(\mathbb{R})$. $\eta$, $\kappa$, $v$ are predictable processes. Assume that $\eta$, $\kappa$ and $v$ are such that the solution to the SDE (44) is well defined and, moreover, that $v_t(z) > -1$ so $S_t > 0$ almost surely. This is a very general model including path-dependent coefficients, stochastic volatility, and jumps. Readers unfamiliar with jump-diffusion models can set $\mu = \nu = \pi = 0$ below, and refer to the last paragraph of this section for comments on the effect of jumps.

Consider the scenario of selling at time 0 a European call option at implied volatility $\hat{\sigma}$, that is, for the price $p = C(T, S_0, K, r, \hat{\sigma})$ and then following a Black–Scholes delta-hedging trading strategy based on constant volatility $\hat{\sigma}$ until the option expires at time $T$. As usual, we shall denote $C(t, s) = C(T - t, s, K, r, \hat{\sigma})$, so that the hedge portfolio, with value process $X_t$, is constructed by holding $\alpha_t := \partial_S C(t, S_{t-})$ units of the risky asset $S$, and the remainder $\beta_t := \frac{1}{B_t}(X_{t-} - \alpha_t S_{t-})$ units in the riskless asset $B$ (a unit notional zero-coupon bond). This portfolio, initially funded by the option sale (so $X_0 = p$), defines a self-financing trading strategy. Hence, the portfolio value process $X$ satisfies the SDE

$$X_t = p + \int_0^t \partial_S C(u, S_{u-}) \eta_u S_{u-}\, du + \int_0^t \partial_S C(u, S_{u-}) \kappa_u S_{u-}\, dW_u + \int_{[0,t] \times \mathbb{R}} \partial_S C(u, S_{u-}) S_{u-} v_u(z)\, \mu(du, dz) + \int_0^t \bigl(X_u - \partial_S C(u, S_{u-}) S_u\bigr) r\, du \tag{45}$$
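The hedging scenario just described can be explored with a small simulation. The sketch below is an illustration under simplifying assumptions that are not part of the article: no jumps, constant true volatility $\kappa$ and drift $\eta$, no dividends, rebalancing on a discrete time grid, and hypothetical parameter values. It sells the call at $\hat{\sigma}$, delta-hedges with $\hat{\sigma}$, and reports the average terminal difference $X_T - [S_T - K]^+$.

```python
import math
import random


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_and_delta(S, K, tau, r, sigma):
    """Black-Scholes call price and delta with q = 0 (assumed for this sketch)."""
    if tau <= 0.0:
        return max(S - K, 0.0), (1.0 if S > K else 0.0)
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return S * norm_cdf(d1) - math.exp(-r * tau) * K * norm_cdf(d2), norm_cdf(d1)


def mean_hedging_error(sigma_hat, kappa, n_paths=500, n_steps=50, seed=0,
                       S0=100.0, K=100.0, T=1.0, r=0.05, eta=0.10):
    """Average of X_T - [S_T - K]^+ over simulated paths."""
    rng = random.Random(seed)
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        S = S0
        X, alpha = call_and_delta(S, K, T, r, sigma_hat)  # X_0 = p
        for i in range(n_steps):
            # True dynamics: geometric Brownian motion with volatility kappa.
            z = rng.gauss(0.0, 1.0)
            S_new = S * math.exp((eta - 0.5 * kappa ** 2) * dt + kappa * math.sqrt(dt) * z)
            # Self-financing update: stock position plus interest on the cash part.
            X = alpha * S_new + (X - alpha * S) * math.exp(r * dt)
            S = S_new
            _, alpha = call_and_delta(S, K, T - (i + 1) * dt, r, sigma_hat)
        total += X - max(S - K, 0.0)
    return total / n_paths


overestimated = mean_hedging_error(sigma_hat=0.30, kappa=0.20)
underestimated = mean_hedging_error(sigma_hat=0.20, kappa=0.30)
print(overestimated, underestimated)
```

With these inputs the writer who hedges at a volatility above the true one ends with a positive average balance, and the writer who hedges below it ends with a negative one; discrete rebalancing adds noise around these averages.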
Now define $Y_t = C(t, S_t)$, so that, in particular, $Y_0 = p$. Applying the Itô formula (Lemma 4.4.6 of [1]) gives

$$Y_t = p + \int_0^t \partial_t C(u, S_{u-})\, du + \int_0^t \partial_S C(u, S_{u-}) \eta_u S_{u-}\, du + \int_0^t \partial_S C(u, S_{u-}) \kappa_u S_{u-}\, dW_u + \frac{1}{2} \int_0^t \partial^2_{SS} C(u, S_{u-}) \kappa_u^2 S_{u-}^2\, du + \int_{[0,t] \times \mathbb{R}} \bigl( C(u, S_{u-}(1 + v_u(z))) - C(u, S_{u-}) \bigr)\, \mu(du, dz) \tag{46}$$

Thus the "hedging error" process defined by $Z_t := X_t - Y_t$ satisfies the SDE

$$\begin{aligned}
Z_t &= \int_0^t r X_u\, du - \int_0^t \left( r S_{u-} \partial_S C(u, S_{u-}) + \partial_t C(u, S_{u-}) + \frac{1}{2} \kappa_u^2 S_{u-}^2 \partial^2_{SS} C(u, S_{u-}) \right) du \\
&\quad - \int_{[0,t] \times \mathbb{R}} \bigl( C(u, S_{u-}(1 + v_u(z))) - C(u, S_{u-}) - \partial_S C(u, S_{u-}) S_{u-} v_u(z) \bigr)\, \mu(du, dz) \\
&= \int_0^t r Z_u\, du + \frac{1}{2} \int_0^t \Gamma(u, S_{u-}) S_{u-}^2 (\hat{\sigma}^2 - \kappa_u^2)\, du \\
&\quad - \int_{[0,t] \times \mathbb{R}} \bigl( C(u, S_{u-}(1 + v_u(z))) - C(u, S_{u-}) - \partial_S C(u, S_{u-}) S_{u-} v_u(z) \bigr)\, \mu(du, dz)
\end{aligned} \tag{47}$$

where $\Gamma(t, S_t) = \partial^2_{SS} C(t, S_t)$, and the last equality follows from the Black–Scholes PDE. Therefore, the final difference between the hedging strategy and the required option payout is given by

$$\begin{aligned}
Z_T &= X_T - [S_T - K]^+ \\
&= \frac{1}{2} \int_0^T e^{r(T-t)} S_{t-}^2 \Gamma(t, S_{t-}) (\hat{\sigma}^2 - \kappa_t^2)\, dt \\
&\quad - \int_{[0,T] \times \mathbb{R}} e^{r(T-t)} \int_0^1 \!\! \int_0^{\lambda} \Gamma\bigl(t, S_{t-}(1 + \xi v_t(z))\bigr)\, v_t^2(z) S_{t-}^2\, d\xi\, d\lambda\; \pi(dt, dz) - M_T
\end{aligned} \tag{48}$$

where $M_T$ is the terminal value of the martingale

$$M_t = \int_{[0,t] \times \mathbb{R}} e^{r(T-u)} \int_0^1 \!\! \int_0^{\lambda} \Gamma\bigl(u, S_{u-}(1 + \xi v_u(z))\bigr)\, v_u^2(z) S_{u-}^2\, d\xi\, d\lambda\; (\mu - \pi)(du, dz) \tag{49}$$

Equation (48) is a key formula, as it shows that successful hedging is quite possible even under significant model error. Without some "robustness" property of this kind, it is hard to imagine that the derivatives industry could exist at all, since hedging under realistic conditions would be impossible.

Consider first the case $\mu \equiv 0$, where $S_t$ has continuous sample paths and the last two terms in equation (48) vanish. Then, successful hedging depends entirely on the relationship between the implied volatility $\hat{\sigma}$ and the true "local volatility" $\kappa_t$. Note from Table 1 that $\Gamma_t > 0$. If we, as option writers, are lucky and $\hat{\sigma}^2 \ge \kappa_t^2$ a.s. for all $t$, then the hedging strategy makes a profit with probability 1 even though the true price model is substantially different from the assumed model as in equation (1). On the other hand, if we underestimate the volatility, we will consistently make a loss. The magnitude of the profit or loss depends on the option convexity $\Gamma$. If $\Gamma$ is small, then hedging error is small even if the volatility has been grossly misestimated.

For the option writer, jumps in either direction are unambiguously bad news. Since $C$ is convex, $\Delta C > (\partial C/\partial S)\Delta S$, so the last term in equation (47) is monotone decreasing: the hedge profit takes a hit every time there is a jump, either upward or downward, in the underlying price. However, there is some recourse: in equation (48), $M_T$ has expectation 0 while the penultimate term is negative. By increasing $\hat{\sigma}$ we increase $\mathbb{E}[Z_T]$, so we could arrive at a situation where $\mathbb{E}[Z_T] > 0$, although in this case there is no possibility of a profit with probability 1 because of the martingale term. All of this reinforces the trader's intuition that one can offset additional hedge costs by charging more upfront (i.e.,
increasing $\hat{\sigma}$) and hedging at the higher level of implied volatility.

End Notes

a. A two-parameter function is $C^{1,2}$ if it is once (twice) continuously differentiable in the first (second) argument.

References

[1] Applebaum, D. (2004). Lévy Processes and Stochastic Calculus, Cambridge University Press.
[2] Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, Journal of Political Economy 81, 637–654.
[3] El Karoui, N., Jeanblanc-Picqué, M. & Shreve, S.E. (1998). Robustness of the Black and Scholes formula, Mathematical Finance 8, 93–126.
[4] Fouque, J.-P., Papanicolaou, G. & Sircar, K.R. (2000). Derivatives in Financial Markets with Stochastic Volatility, Cambridge University Press.
[5] Gatheral, J. (2006). The Volatility Surface, Wiley.
[6] Hull, J.C. (2005). Options, Futures and Other Derivatives, 6th Edition, Prentice Hall.

MARK H.A. DAVIS
Exchange Options

American option to exchange two fixed zero-dividend assets is not exercised early.)

general dynamics. There is another concrete, albeit less known, example with simple jumps in X involving the Poisson rather than the normal distribution. The pattern is similar, with the main difference being that the deltas are the partial differences rather than the partial derivatives of the option price function.

We fix, throughout, a stochastic basis $(\Omega, (\mathcal{F}_t), \mathcal{F}, \mathbb{P})$ with time horizon $t \in [0, T]$, $T > 0$. In this section, we fix two zero-dividend assets with price processes $A = (A_t)$ and $B = (B_t)$.

The Exchange Option Price Process

When A and B are semimartingales, we call a pair $(\delta^A, \delta^B)$ of (locally) bounded predictable processes a (locally) bounded SFTS (see, more generally, the section Self-financing Trading Strategies) if $C = C_0 + \int \delta^A\,dA + \int \delta^B\,dB$, where

$$C = \delta^A A + \delta^B B \tag{2}$$

$$dC = \delta^A\,dA + \delta^B\,dB \tag{3}$$

SFTSs form a linear space. If there exists a unique bounded SFTS $(\delta^A, \delta^B)$ such that

$$C_T = (A_T - B_T)^+ \tag{4}$$

then it is justified to call C the exchange option price process and $\delta^A$ and $\delta^B$ the deltas.

Assume now that the semimartingales A and B are positive and have positive left limits. The numéraire invariance principle (see the section Numéraire Invariance and, more comprehensively, the section The Invariance Principle) states that if $(\delta^A, \delta^B)$ is a locally bounded SFTS, then $C = \delta^A A + \delta^B B$ satisfies $d\frac{C}{B} = \delta^A\,d\frac{A}{B}$ (similarly, by symmetry, with A as numéraire). This is useful for uniqueness. Numéraire invariance also states the converse: if C is a semimartingale and $\delta^A$ a locally bounded predictable process such that $d\frac{C}{B} = \delta^A\,d\frac{A}{B}$, then $(\delta^A, \delta^B)$ is an SFTS and equations (2) and (3) hold, where $\delta^B = \frac{C_-}{B_-} - \delta^A \frac{A_-}{B_-}$.

This reduces existence to finding an $F_0$ and $\delta^A$ such that

$$\left(\frac{A_T}{B_T} - 1\right)^+ = F_0 + \int_0^T \delta_t^A\,d\frac{A_t}{B_t} \tag{5}$$

The exchange option price process is then the semimartingale $C = B\left(F_0 + \int \delta^A\,d\frac{A}{B}\right)$. Numéraire invariance in effect reduces general option pricing and hedging to a market where one of the asset price processes equals 1 identically. The remaining task is to find the above “projective” predictable representation of the ratio payoff against the ratio process.

Deterministic-volatility Exchange Option Model

Let $\sigma(t) > 0$ be a continuous positive function. Define the Black–Scholes/Merton projective option price function

$$f(t, x) := \delta_A(t, x)\,x + \delta_B(t, x) \tag{6}$$

$$\delta_A(t, x) := N\!\left(\frac{\log x}{\sqrt{\nu_t}} + \frac{\sqrt{\nu_t}}{2}\right), \qquad \delta_B(t, x) := -N\!\left(\frac{\log x}{\sqrt{\nu_t}} - \frac{\sqrt{\nu_t}}{2}\right) \tag{7}$$

where $\nu_t := \int_t^T \sigma^2(s)\,ds$ and $N(\cdot)$ is the normal distribution function. The function $f(t, x)$ is continuous, and on $t < T$ is $C^1$ in t and analytic in x. In addition, $-1 \le \delta_B \le 0 \le \delta_A \le 1$, and

$$f(T, x) = (x - 1)^+, \qquad \frac{\partial f}{\partial x}(t, x) = \delta_A(t, x) \tag{8}$$

As is well known and seen in the sections Deterministic-volatility Model Uniqueness and Projective Continuous SDE SFTS, the function $f(t, x)$ is the unique $C^{1,2}$ (on $t < T$) solution with bounded partial derivative $\frac{\partial f}{\partial x}(t, x)$ subject to $f(T, x) = (x - 1)^+$ of the PDE

$$\frac{\partial f}{\partial t}(t, x) + \frac{1}{2}\sigma^2(t)\,x^2\,\frac{\partial^2 f}{\partial x^2}(t, x) = 0 \tag{9}$$
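The projective price function and its deltas in equations (6) to (9) can be checked numerically. The following is a minimal sketch, assuming a constant volatility σ (so that ν_t = σ²(T − t)); the function names and the finite-difference step sizes are illustrative choices, not part of the article.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function N(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def nu(t, T, sigma):
    """Remaining variance nu_t = int_t^T sigma^2 ds for constant sigma (assumption)."""
    return sigma * sigma * (T - t)


def delta_A(t, x, T, sigma):
    """delta_A(t, x) as in equation (7)."""
    v = nu(t, T, sigma)
    return norm_cdf(math.log(x) / math.sqrt(v) + math.sqrt(v) / 2.0)


def delta_B(t, x, T, sigma):
    """delta_B(t, x) as in equation (7)."""
    v = nu(t, T, sigma)
    return -norm_cdf(math.log(x) / math.sqrt(v) - math.sqrt(v) / 2.0)


def f(t, x, T, sigma):
    """Projective price f = delta_A * x + delta_B, equation (6)."""
    return delta_A(t, x, T, sigma) * x + delta_B(t, x, T, sigma)


def dfdx(t, x, T, sigma, h=1e-5):
    """Central-difference estimate of the x-derivative of f."""
    return (f(t, x + h, T, sigma) - f(t, x - h, T, sigma)) / (2.0 * h)


def pde_residual(t, x, T, sigma, h=1e-3):
    """Central-difference estimate of f_t + 0.5 * sigma^2 * x^2 * f_xx, equation (9)."""
    f_t = (f(t + h, x, T, sigma) - f(t - h, x, T, sigma)) / (2.0 * h)
    f_xx = (f(t, x + h, T, sigma) - 2.0 * f(t, x, T, sigma)
            + f(t, x - h, T, sigma)) / (h * h)
    return f_t + 0.5 * sigma * sigma * x * x * f_xx
```

For t < T the residual should be close to zero and `dfdx` should agree with `delta_A`, which is the content of equations (8) and (9).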
Assume now $A = BX$ for some positive continuous semimartingale $X > 0$ satisfying

$$d[\log X]_t = \sigma^2(t)\,dt, \qquad (A = BX) \tag{10}$$

If, at this stage, we assume that B is a semimartingale, then A and C are semimartingales too, and by the invariance principle discussed next, $dC = \delta^A\,dA + \delta^B\,dB$ and $(\delta^A, \delta^B)$ is a bounded SFTS.

Numéraire Invariance

Let X and F be two semimartingales and $\delta^A$ be a locally bounded predictable process such that $dF = \delta^A\,dX$. Set $\delta^B = F - \delta^A X$. Clearly $\delta^B = F_- - \delta^A X_-$ since $\Delta F = \delta^A \Delta X$. Let B be any semimartingale. Set $A = BX$, $C = BF$. Clearly $C = \delta^A A + \delta^B B$. We claim $dC = \delta^A\,dA + \delta^B\,dB$, so $(\delta^A, \delta^B)$ is an SFTS.

Indeed, this follows by applying Itô’s product rule to BF, then substituting $dF = \delta^A\,dX$ and $F_- = \delta^B + \delta^A X_-$, followed by Itô’s product rule on BX:

$$\begin{aligned} dC = d(BF) &= B_-\,dF + F_-\,dB + d[B, F] \\ &= B_-\,\delta^A\,dX + (\delta^B + \delta^A X_-)\,dB + \delta^A\,d[B, X] \\ &= \delta^A\,d(BX) + \delta^B\,dB = \delta^A\,dA + \delta^B\,dB \end{aligned} \tag{14}$$

Conversely, if A and B are semimartingales with $B, B_- > 0$ and $(\delta^A, \delta^B)$ is an SFTS, then $d\frac{C}{B} = \delta^A\,d\frac{A}{B}$, where $C = \delta^A A + \delta^B B$. (See the section The Invariance Principle for a more lucid treatment.)

$$C := BF, \qquad F = (F_t), \qquad F_t := f(t, X_t) \tag{17}$$

Clearly $f(T, x) = (x - 1)^+$ and $C_T = (A_T - B_T)^+$. One has the predictable representation

$$dF = \delta^A\,dX \tag{18}$$

as shown shortly, where

$$\delta_t^A := \delta_A(t, X_{t-}), \qquad \delta_A(t, x) := \frac{f(t, e^{\beta}x) - f(t, x)}{(e^{\beta} - 1)\,x} \tag{19}$$

Thus by numéraire invariance, $(\delta^A, \delta^B)$ is an SFTS if A and B are semimartingales, where

$$\delta^B := F - \delta^A X = F_- - \delta^A X_- \tag{20}$$

Moreover, it is bounded. Indeed, since $|(e^{\beta}y - 1)^+ - (y - 1)^+| \le |e^{\beta} - 1|\,y$ for any $y > 0$,

$$0 \le \delta_A(t, x) \le \sum_{n=0}^{\infty} e^{\beta n - (e^{\beta} - 1)\lambda(T - t)}\,\frac{\lambda^n}{n!}\,(T - t)^n\,e^{-\lambda(T - t)} = 1 \tag{21}$$
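The bound (21) and the partial-difference delta (19) can be illustrated numerically. The sketch below assumes a constant intensity λ and takes f(t, x) to be the Poisson-weighted expectation of (x X_T/X_t − 1)⁺, with X_T/X_t = exp(βn − (e^β − 1)λ(T − t)) given n jumps, in line with equation (54); the truncation level `n_terms` and all function names are illustrative assumptions.

```python
import math


def poisson_pmf(n, mean):
    """Probability of n jumps for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean ** n / math.factorial(n)


def f_poisson(t, x, T, beta, lam, n_terms=80):
    """Truncated series for E (x * X_T / X_t - 1)^+ in the exponential-Poisson model."""
    tau = T - t
    drift = (math.exp(beta) - 1.0) * lam * tau
    total = 0.0
    for n in range(n_terms):
        ratio = math.exp(beta * n - drift)  # X_T / X_t given n jumps
        total += max(x * ratio - 1.0, 0.0) * poisson_pmf(n, lam * tau)
    return total


def delta_A_poisson(t, x, T, beta, lam):
    """Partial-difference delta of equation (19)."""
    e = math.exp(beta)
    num = f_poisson(t, e * x, T, beta, lam) - f_poisson(t, x, T, beta, lam)
    return num / ((e - 1.0) * x)


def bound_sum(t, T, beta, lam, n_terms=80):
    """Truncated series on the right-hand side of (21); it should sum to 1."""
    tau = T - t
    drift = (math.exp(beta) - 1.0) * lam * tau
    return sum(math.exp(beta * n - drift) * poisson_pmf(n, lam * tau)
               for n in range(n_terms))
```

The series in `bound_sum` is also the mean of X_T/X_t, so its value 1 reflects the martingale property of X; the delta stays in [0, 1] for both signs of β.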
To show $dF = \delta^A\,dX$ (equation (18)), we first note that $[P]^c = 0$ since $[P] = P$; hence, $(\Delta P)^2 = \Delta P$.

Hence, as V is clearly the sum of its jumps,

$$\begin{aligned} V_t - v(0) &= \sum_{s \le t} \Delta V_s = \sum_{s \le t} \bigl(v(P_{s-} + 1) - v(P_{s-})\bigr)\,\Delta P_s \\ &= \int_0^t \bigl(v(P_{s-} + 1) - v(P_{s-})\bigr)\,dP_s \end{aligned} \tag{23}$$

Likewise, $(u(t, P_t))$ is a semimartingale for any $C^1$ in t function $u(t, p)$, $p \in \mathbb{Z}$, and one has

$$du(t, P_t) = \frac{\partial u}{\partial t}(t, P_{t-})\,dt + \bigl(u(t, P_{t-} + 1) - u(t, P_{t-})\bigr)\,dP_t \tag{24}$$

Now, define the function

$$x(t, p) := X_0\,e^{\beta p - (e^{\beta} - 1)\lambda t} \qquad (p \in \mathbb{Z}) \tag{25}$$

Clearly $X_t = x(t, P_t)$. Applying equation (24) to the function $x(t, p)$ and using that

$$\frac{\partial x}{\partial t}(t, p) = -x(t, p)(e^{\beta} - 1)\lambda, \qquad x(t, p + 1) - x(t, p) = x(t, p)(e^{\beta} - 1) \tag{26}$$

(or alternatively applying Itô’s formula to $x(t, P_t)$ and simplifying) yields

$$dX_t = X_{t-}(e^{\beta} - 1)\,d(P_t - \lambda t) \tag{27}$$

Clearly, $u(t, P_t) = F_t$. One readily verifies that $u(t, p)$ satisfies the equation

Combining this with equation (27) and the fact that clearly

$$u(t, p + 1) - u(t, p) = f(t, e^{\beta}x(t, p)) - f(t, x(t, p)) \tag{31}$$

we conclude that, as desired,

$$dF_t = \frac{f(t, e^{\beta}X_{t-}) - f(t, X_{t-})}{(e^{\beta} - 1)\,X_{t-}}\,dX_t \tag{32}$$

The Homogeneous Option Price Function

There is an alternative derivation of the self-financing equation $dC = \delta^A\,dA + \delta^B\,dB$ much along that in [9] and [8] that does not employ numéraire invariance. It is related to a family of two-dimensional PDEs satisfied by the Merton/Margrabe homogeneous option price function $c(t, a, b)$ below.

Let $f(t, x)$, $x > 0$, be any $C^{1,2}$ function, for example, as in equation (6). Define the homogenized function

$$c(t, a, b) := b\,f\!\left(t, \frac{a}{b}\right) \qquad (a, b > 0) \tag{33}$$

Then $c(t, a, b)$ is homogeneous of degree 1 in $(a, b)$, and hence by Euler’s formula

$$c(t, a, b) = \frac{\partial c}{\partial a}(t, a, b)\,a + \frac{\partial c}{\partial b}(t, a, b)\,b \tag{34}$$
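Homogenization (33) and Euler’s formula (34) are easy to verify numerically. The sketch below assumes f is the Black–Scholes/Merton projective function with a constant volatility σ, so that c(t, a, b) is the familiar exchange option value; the names `f_proj`, `c_hom` and `partials`, and the step size h, are illustrative.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function N(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def f_proj(t, x, T, sigma):
    """Projective price x * N(d1) - N(d2) for constant sigma (assumption)."""
    s = sigma * math.sqrt(T - t)
    return (x * norm_cdf(math.log(x) / s + s / 2.0)
            - norm_cdf(math.log(x) / s - s / 2.0))


def c_hom(t, a, b, T, sigma):
    """Homogenized function c(t, a, b) = b * f(t, a / b), equation (33)."""
    return b * f_proj(t, a / b, T, sigma)


def partials(t, a, b, T, sigma, h=1e-4):
    """Central-difference estimates of dc/da and dc/db."""
    c_a = (c_hom(t, a + h, b, T, sigma) - c_hom(t, a - h, b, T, sigma)) / (2.0 * h)
    c_b = (c_hom(t, a, b + h, T, sigma) - c_hom(t, a, b - h, T, sigma)) / (2.0 * h)
    return c_a, c_b


def euler_gap(t, a, b, T, sigma):
    """Difference between c and a * dc/da + b * dc/db; near zero by (34)."""
    c_a, c_b = partials(t, a, b, T, sigma)
    return c_hom(t, a, b, T, sigma) - (c_a * a + c_b * b)
```

Scaling both asset prices by the same factor scales `c_hom` by that factor, which is the degree-1 homogeneity that Euler’s formula expresses.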
A laborious repeated application of the chain rule on equation (33) gives

$$a^2\frac{\partial^2 c}{\partial a^2}(t, a, b) = b^2\frac{\partial^2 c}{\partial b^2}(t, a, b) = -ab\,\frac{\partial^2 c}{\partial a\,\partial b}(t, a, b) = b\,x^2\,\frac{\partial^2 f}{\partial x^2}(t, x), \qquad x := \frac{a}{b} \tag{35}$$

$$\sigma^2(t) = \sigma_A^2(t, a, b) + \sigma_B^2(t, a, b) - 2\sigma_{AB}(t, a, b) \tag{36}$$

Using equations (35), (36), and $\frac{\partial c}{\partial t}(t, a, b) = b\,\frac{\partial f}{\partial t}\!\left(t, \frac{a}{b}\right)$, we see that $c(t, a, b)$ satisfies the PDE

$$\frac{\partial c}{\partial t} + \frac{1}{2}\sigma_A^2(t, a, b)\,a^2\frac{\partial^2 c}{\partial a^2} + \frac{1}{2}\sigma_B^2(t, a, b)\,b^2\frac{\partial^2 c}{\partial b^2} + \sigma_{AB}(t, a, b)\,ab\,\frac{\partial^2 c}{\partial a\,\partial b} = 0 \tag{37}$$

if and only if $f(t, x)$ satisfies the PDE (9): $\frac{\partial f}{\partial t} + \frac{1}{2}\sigma^2(t)\,x^2\frac{\partial^2 f}{\partial x^2} = 0$.

The PDE (9) was utilized in [1] and [9] (but not in [8]), and Merton [9] stated its equivalence to the PDE (37) (assuming $\sigma_A$, etc., depend only on t). As noted in [9] and expounded in [8], if $d[\log A]_t = \sigma_A^2(t)\,dt$, $d[\log B]_t = \sigma_B^2(t)\,dt$ and $d[\log A, \log B]_t = \sigma_{AB}(t)\,dt$, then Itô’s formula and equation (37) imply at once $dc(t, A_t, B_t) = \delta_t^A\,dA_t + \delta_t^B\,dB_t$, with $\delta^A$ and $\delta^B$ as in equation (40), and thus $(\delta^A, \delta^B)$ is an SFTS with price process $c(t, A, B)$ by Euler’s formula (34).

Let us expand on this (see also the sections Self-financing Trading Strategies and Homogeneous Continuous Markovian SFTS). Let $\sigma(t) > 0$ be a continuous function, and $f(t, x)$ be the Black–Scholes/Merton function (6). Set $c(t, a, b) := b\,f(t, a/b)$. Clearly,

$$\frac{\partial c}{\partial a}(t, a, b) = \frac{\partial f}{\partial x}\!\left(t, \frac{a}{b}\right) = \delta_A\!\left(t, \frac{a}{b}\right) \tag{38}$$

This combined with Euler’s formula (34) and the definition (6) $f := \delta_A x + \delta_B$ give

$$\frac{\partial c}{\partial b}(t, a, b) = \delta_B\!\left(t, \frac{a}{b}\right) \tag{39}$$

Assume that A and B are positive semimartingales with positive left limits and $X := A/B$ has deterministic volatility $\sigma(t)$: $d[X]_t = X_t^2\,\sigma^2(t)\,dt$. Using equation (12), the deltas are conveniently the sensitivities of the homogeneous Merton/Margrabe function:

$$\delta_t^A = \frac{\partial c}{\partial a}(t, A_t, B_t), \qquad \delta_t^B = \frac{\partial c}{\partial b}(t, A_t, B_t) \tag{40}$$

Since X is continuous, we also have $\delta_t^A = \frac{\partial c}{\partial a}(t, A_{t-}, B_{t-})$ and similarly $\delta_t^B$. The section Deterministic-Volatility Exchange Option Model yields $dC = \delta^A\,dA + \delta^B\,dB$ with $C_t = B_t f(t, X_t) = c(t, A_t, B_t)$. Therefore, by equation (40) and Itô’s formula,

$$\frac{\partial c}{\partial t}\,dt + \frac{1}{2}\frac{\partial^2 c}{\partial a^2}\,d[A]_t^c + \frac{1}{2}\frac{\partial^2 c}{\partial b^2}\,d[B]_t^c + \frac{\partial^2 c}{\partial a\,\partial b}\,d[A, B]_t^c = 0 \tag{41}$$

where the partial derivatives are evaluated at $(t, A_{t-}, B_{t-})$ and $[\cdot]^c$ is the bracket continuous part. (The jump term in Itô’s formula vanishes as it equals $\sum_{s \le t}(\Delta C_s - \delta_s^A \Delta A_s - \delta_s^B \Delta B_s) = 0$.)

Returning to the approach of Merton [9], assume now that $d[\log A]_t = \sigma_A^2(t, A_t, B_t)\,dt$ for some function $\sigma_A$ and similarly $d[\log B] = \sigma_B^2\,dt$ and $d[\log A, \log B] = \sigma_{AB}\,dt$. Then equation (36) holds using $\log X = \log A - \log B$. Since $f(t, x)$ satisfies the PDE (9), the PDE (37) follows as before by the chain rule. However, equation (37) implies equation (41), which by Itô’s formula in turn implies the self-financing equation $dC = \delta^A\,dA + \delta^B\,dB$ with $\delta^A$ and $\delta^B$ given by equation (40).

Change of Numéraire

The solution $c(t, a, b)$ to the PDE (37) subject to $c(T, a, b) = (a - b)^+$ can be expressed in a form $\mathbb{E}(X - Y)^+$ for some random variables X and $Y > 0$ with means a and b. Expectations of this form often become more tractable by a change of measure as in
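[4]. Define the equivalent probability measure $\mathbb{Q}$ by $\frac{d\mathbb{Q}}{d\mathbb{P}} := \frac{Y}{\mathbb{E}Y}$. Clearly,

$$\mathbb{E}^{\mathbb{Q}}\!\left[\frac{X}{Y}\right] = \frac{\mathbb{E}(X)}{\mathbb{E}(Y)}, \qquad \frac{d\mathbb{Q}}{d\mathbb{P}} := \frac{Y}{\mathbb{E}(Y)} \tag{42}$$

Replacing X by $(X - Y)^+$ in equation (42) and using the homogeneity to factor out Y, we get

$$\mathbb{E}(X - Y)^+ = \mathbb{E}(Y)\,\mathbb{E}^{\mathbb{Q}}\!\left(\frac{X}{Y} - 1\right)^+ \tag{43}$$

If $X/Y$ is $\mathbb{Q}$-lognormally distributed then equation (43) together with equation (42) readily yields

$$\mathbb{E}(X - Y)^+ = \mathbb{E}(X)\,N\!\left(\frac{\log(\mathbb{E}X/\mathbb{E}Y)}{\sqrt{\nu_{\mathbb{Q}}}} + \frac{\sqrt{\nu_{\mathbb{Q}}}}{2}\right) - \mathbb{E}(Y)\,N\!\left(\frac{\log(\mathbb{E}X/\mathbb{E}Y)}{\sqrt{\nu_{\mathbb{Q}}}} - \frac{\sqrt{\nu_{\mathbb{Q}}}}{2}\right) \tag{44}$$

where $\nu_{\mathbb{Q}} := \operatorname{var}^{\mathbb{Q}}[\log(X/Y)]$. When X and Y are bivariately lognormally distributed, it is not difficult to show that $X/Y$ is lognormally distributed in both $\mathbb{P}$ and $\mathbb{Q}$ with the same log-variance $\nu_{\mathbb{Q}} = \nu := \operatorname{var}[\log(X/Y)]$. Then $\nu_{\mathbb{Q}}$ can be replaced with $\nu$ in equation (44). This occurs when the functions $\sigma_A$, $\sigma_B$ and $\sigma_{AB}$ in equation (37) are independent of a and b, as in [8, 9].

Uniqueness

Assume that A and B are positive semimartingales with positive left limits such that $X := A/B$ is a square-integrable martingale under an equivalent probability measure $\mathbb{Q}$ and $d\langle X\rangle_t^{\mathbb{Q}} = X_{t-}^2\,\sigma_t^2\,dt$ for some nowhere zero process $\sigma$, where $\langle X\rangle^{\mathbb{Q}}$ is the $\mathbb{Q}$-compensator of $[X]$. (Of course, $\langle X\rangle = [X]$ if X is continuous.) Let $(\delta^A, \delta^B)$ be an SFTS and set $C := \delta^A A + \delta^B B$. We claim that $\delta^A = \delta^B = 0$ if $C_T = 0$ and $\delta^A$ is bounded.

Indeed, set $F := C/B$. By numéraire invariance, $dF = \delta^A\,dX$. Hence, F is a $\mathbb{Q}$-square-integrable martingale since X is and $\delta^A$ is bounded. Thus, $F = 0$ since $F_T = C_T/B_T = 0$. Hence, $0 = d\langle F\rangle = (\delta^A)^2 X_-^2\,\sigma^2\,dt$. However, $X_-^2\,\sigma^2 > 0$. Thus, $\delta^A = 0$ and $\delta^B = F - \delta^A X = 0$.

In general, since $F := C/B$ is a $\mathbb{Q}$-martingale, we have the following pricing formula:

$$C_t = B_t\,\mathbb{E}^{\mathbb{Q}}[C_T/B_T \mid \mathcal{F}_t] \tag{45}$$

Deterministic-volatility Model Uniqueness

Assume that A and B are positive semimartingales with positive left limits and $X := A/B$ is an Itô process following

$$\frac{dX_t}{X_t} = \mu_t\,dt + \sigma_t\,dZ_t, \qquad X := \frac{A}{B} \tag{46}$$

where Z is a Brownian motion and $\mu$ and $\sigma > 0$ are predictable processes with $\sigma$ bounded and $\mathbb{E}\,e^{\frac{1}{2}\int_0^T (\mu_t/\sigma_t)^2\,dt} < \infty$. Let $(\delta^A, \delta^B)$ be an SFTS with $\delta^A$ bounded. Set $C := \delta^A A + \delta^B B$. We claim that $\delta^A = \delta^B = 0$ if $C_T = 0$. Indeed, the process

$$M := \mathcal{E}\!\left(-\int \frac{\mu}{\sigma}\,dZ\right) = e^{-\int \frac{\mu}{\sigma}\,dZ - \frac{1}{2}\int \left(\frac{\mu}{\sigma}\right)^2 dt} \tag{47}$$

is then a positive martingale with $M_0 = 1$. Define the equivalent probability measure $\mathbb{Q}$ by $d\mathbb{Q} = M_T\,d\mathbb{P}$. The process $W := Z + \int \frac{\mu}{\sigma}\,dt$ is a $\mathbb{Q}$-Brownian motion because $[W]_t = t$ and W is a $\mathbb{Q}$-local martingale as MW is a local martingale using Itô’s product rule:

$$\begin{aligned} d(MW) - W\,dM &= M\,dW + d[W, M] \\ &= M\left(dZ + \frac{\mu}{\sigma}\,dt\right) - M\frac{\mu}{\sigma}\,d[Z] \\ &= M\,dZ \end{aligned} \tag{48}$$

Moreover, $dX = X\sigma\,dW$ by equation (46). Therefore, X is a $\mathbb{Q}$-square integrable martingale since $\sigma$ is bounded. The claim, thus, follows by the section Uniqueness.

Assume now that $\sigma_t$ is deterministic. The results of the section Deterministic-Volatility Exchange Option Model hold since $d[\log X] = \sigma_t^2\,dt$. However, we can now derive them more conceptually. Indeed, both conditioned on $\mathcal{F}_t$ and unconditionally, $X_T/X_t$ is $\mathbb{Q}$-lognormally distributed with mean 1 and log-variance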
$\int_t^T \sigma_s^2\,ds$ since $X_T = X_t\,e^{\int_t^T \sigma_s\,dW_s - \frac{1}{2}\int_t^T \sigma_s^2\,ds}$. Hence, by equation (45),

$$f(t, X_t) = \mathbb{E}^{\mathbb{Q}}[(X_T - 1)^+ \mid \mathcal{F}_t] \quad\text{where}\quad f(t, x) := \mathbb{E}^{\mathbb{Q}}\!\left(x\,\frac{X_T}{X_t} - 1\right)^+ \tag{49}$$

which function readily equals the Black–Scholes/Merton option price function (6). Thus, $F := (f(t, X_t))$ is a $\mathbb{Q}$-martingale. Therefore, Itô’s formula implies that $f(t, x)$ satisfies the PDE (9) and $dF = \delta^A\,dX$ where $\delta^A := \frac{\partial f}{\partial x}(t, X_t)$. Numéraire invariance now yields that the pair $(\delta^A, \delta^B := F - \delta^A X)$ is an SFTS. Clearly, $C_T = (A_T - B_T)^+$ where $C := \delta^A A + \delta^B B = BF$.

Exponential-Poisson Model Uniqueness

Let $\beta \ne 0$ be a constant and $\kappa$ and $\lambda$ be positive continuous adapted processes such that $\lambda$ is bounded and $\mathbb{E}\,e^{\int_0^T \left(\frac{\lambda_t}{\kappa_t} - 1\right)^2 \kappa_t\,dt} < \infty$. Let P be a semimartingale satisfying $[P] = P$ with $P_0 = 0$ and compensator $\int \kappa\,dt$. Assume that A and B are positive semimartingales with positive left limits and $X := A/B$ satisfies

$$dX_t = X_{t-}(e^{\beta} - 1)(dP_t - \lambda_t\,dt) \tag{50}$$

Using $de^{\beta P} = (e^{\beta} - 1)\,e^{\beta P_-}\,dP$ or as in the section Derivation of the Predictable Representation, this is equivalent to the integrated form

$$X_t = X_0\,e^{\beta P_t - (e^{\beta} - 1)\int_0^t \lambda_s\,ds} \tag{51}$$

Let $(\delta^A, \delta^B)$ be an SFTS with $\delta^A$ bounded. Set $C := \delta^A A + \delta^B B$. We claim that $\delta^A = \delta^B = 0$ if $C_T = 0$. Indeed, $\mathbb{E}\,e^{\left\langle \int \left(\frac{\lambda}{\kappa} - 1\right)(dP - \kappa\,dt)\right\rangle_T} = \mathbb{E}\,e^{\int_0^T \left(\frac{\lambda_t}{\kappa_t} - 1\right)^2 \kappa_t\,dt} < \infty$, so that

$$M := \mathcal{E}\!\left(\int \left(\frac{\lambda}{\kappa} - 1\right)(dP - \kappa\,dt)\right) = e^{-\int (\lambda - \kappa)\,dt}\prod_{s \le \cdot}\left(1 + \left(\frac{\lambda_s}{\kappa_s} - 1\right)\Delta P_s\right) \tag{52}$$

is a martingale. Define the equivalent probability measure $\mathbb{Q}$ by $d\mathbb{Q} = M_T\,d\mathbb{P}$. Then $N := P - \int \lambda\,dt$ is a $\mathbb{Q}$-local martingale as MN is a local martingale by Itô’s product rule:

$$\begin{aligned} d(MN) - N_-\,dM &= M_-\,dN + d[M, N] \\ &= M_-(dP - \lambda\,dt) + M_-\left(\frac{\lambda}{\kappa} - 1\right)dP \\ &= M_-\frac{\lambda}{\kappa}(dP - \kappa\,dt) \end{aligned} \tag{53}$$

Therefore, by equation (50), X is a $\mathbb{Q}$-square-integrable martingale (in fact, in $H^p(\mathbb{Q})$ for all $p > 0$) since $\lambda$ is bounded. Thus, by the section Uniqueness, $\delta^A = \delta^B = 0$ if $C_T = 0$, as claimed.

Assume now that $\lambda$ is a positive constant. By equation (51) we have a special case of the exponential-Poisson model. Further, P is a $\mathbb{Q}$-Poisson process with intensity $\lambda$ since $[P] = P$. We now have uniqueness, but additionally, the previous results follow more conceptually as follows.

Conditioned on $\mathcal{F}_t$, $P_T - P_t$ is $\mathbb{Q}$-Poisson distributed with mean $\lambda(T - t)$. Its unconditional $\mathbb{Q}$-distribution is identical. Thus, the $\mathcal{F}_t$-conditional and the unconditional $\mathbb{Q}$-distribution of $X_T/X_t$ are identical and are exponentially Poisson distributed with mean 1. Hence, by equation (45),

$$f(t, X_t) = \mathbb{E}^{\mathbb{Q}}[(X_T - 1)^+ \mid \mathcal{F}_t] \quad\text{where}\quad f(t, x) := \mathbb{E}^{\mathbb{Q}}\!\left(x\,\frac{X_T}{X_t} - 1\right)^+ \tag{54}$$

which function readily equals that defined in equation (16). Thus, $F := (f(t, X_t))$ is a $\mathbb{Q}$-martingale. Using this and equation (24), one shows that F satisfies equation (32) and with it that the pair $(\delta^A, \delta^B)$ as defined in equation (19), equation (20) is a bounded SFTS.

Extension to Dividends

Consider two assets with positive price processes $\hat A$ and $\hat B$ and continuous dividend yields $y_t^A$ and $y_t^B$. When there exist traded or replicable zero-dividend assets A and B such that $A_T = \hat A_T$ and $B_T = \hat B_T$ (if not, there is little hope of replication), it is natural to
define the price process of the option to exchange $\hat A$ and $\hat B$ to be that of the option to exchange A and B. If $y^A$ and $y^B$ are deterministic, then consistent with the treatment of dividends in [9], A (and similarly B) is simply given by

$$A_t := a\,\tilde A_t = e^{-\int_t^T y_s^A\,ds}\,\hat A_t, \qquad \tilde A_t := e^{\int_0^t y_s^A\,ds}\,\hat A_t, \qquad a := e^{-\int_0^T y_t^A\,dt} \tag{55}$$

Note $A/B$ is a semimartingale if and only if $\hat A/\hat B$ is, in which case $[\log A/B] = [\log \hat A/\hat B]$.

In general, $\tilde A_t$ is the price of the zero-dividend asset that initially buys one share of $\hat A$ and thereon continually reinvests all dividends in $\hat A$ itself. What is required is that the four zero-dividend assets $A$, $\tilde A$, $B$, and $\tilde B$ be arbitrage free in relation to one another (see the section Arbitrage-free Semimartingales and Uniqueness).

For instance, say $\hat A$ and $\hat B$ are the yen/dollar and yen/Euro exchange rates viewed as yen-denominated dividend assets. Then A is the yen-value of the US T-maturity zero-coupon bond and $\tilde A$ is the yen-value of the US money market asset. This exchange option is equivalent to a Euro-denominated call struck at 1 on the Euro/dollar exchange rate $\hat A/\hat B$. The ratio $A/B$ is the forward Euro/dollar exchange rate. If it has deterministic volatility, we are as in a setting of [7], which yields the same pricing formula as that from the section Deterministic-volatility Exchange Option Model.

Pricing and Hedging Options with Homogeneous Payoffs

We took some shortcuts to quickly present the main results for two of the simplest and

replicates the given payoff $h(A_T)$ in general. The construction is explicit in the multivariate extensions of the deterministic-volatility and exponential-Poisson models.

The homogeneity of the payoff function $h(a)$ implies $h(A_T) = A_T^m\,g(X_T)$ where $g(x) := h(x, 1)$, $x \in \mathbb{R}_+^n$, $n := m - 1$, and $X := \left(\frac{A^1}{A^m}, \dots, \frac{A^n}{A^m}\right)$. Once a predictable representation $F = F_0 + \boldsymbol{\delta} \cdot X$, $F_T = g(X_T)$ is found, then by numéraire invariance $\delta := (\boldsymbol{\delta}, \delta^m)$ will be an SFTS with payoff $h(A_T)$, where $\delta^m := F_- - \sum_{i=1}^n \delta^i X_-^i = F - \sum_{i=1}^n \delta^i X^i$. Uniqueness of pricing requires boundedness of partial derivatives (or differences) of $h(a)$ (or $g(x)$) and that A be arbitrage free, meaning X is a martingale under an equivalent measure. Arbitrage freedom holds “generically” when the matrix $(\langle X^i, X^j\rangle)$ is nonsingular, basically a “no-redundant-asset” condition. Then the SFTS is also unique.

Libor and swap derivatives are among contingent claims with homogeneous payoffs.

Self-financing Trading Strategies

By an SFTS we mean a pair $(\delta, A)$ of an m-dimensional semimartingale $A = (A^1, \dots, A^m)$ and an A-integrable predictable vector process $\delta = (\delta^1, \dots, \delta^m)$ such that (with $\delta \cdot A$ denoting the m-dimensional stochastic integral)

$$\sum_{i=1}^m \delta^i A^i = \sum_{i=1}^m \delta_0^i A_0^i + \delta \cdot A \tag{56}$$

We then say $\delta$ is an SFTS for A. This is equivalent to saying that the SFTS price process
because C is then a local martingale that is dominated by a martingale M:

$$|C_t| \le b\sum_i |A_t^i| = b\sum_i |\mathbb{E}[A_T^i \mid \mathcal{F}_t]| \le b\sum_i \mathbb{E}[|A_T^i| \mid \mathcal{F}_t] =: M_t \tag{59}$$

As suggested by the case of a locally bounded $\delta$, we often use the differential form

$$dC = \sum_{i=1}^m \delta^i\,dA^i \tag{60}$$

of the equation $C = C_0 + \delta \cdot A$ as a convenient symbolic equivalent in calculations. One interprets $A^i$ as prices of m zero-dividend assets and $\delta_t^i$ as the number of shares invested in them at time t. Then $C_t$ indicates the resultant self-financing portfolio price by equation (57), and equation (60) is the self-financing equation, implying that the change $dC$ in the portfolio price is only due to the changes $dA^i$ in the asset prices with no financing from outside.

Assume for the remainder of this subsection as a way of motivation that A is continuous and $C_t = c(t, A_t)$ for some $C^{1,2}$ function $c(t, a)$.ᵃ Then by equation (60) and Itô’s formula, we have

$$\frac{\partial c}{\partial t}(t, A_t)\,dt + \frac{1}{2}\sum_{i,j=1}^m \frac{\partial^2 c}{\partial a_i\,\partial a_j}(t, A_t)\,d[A^i, A^j]_t = \sum_{i=1}^m \left(\delta_t^i - \frac{\partial c}{\partial a_i}(t, A_t)\right)dA_t^i \tag{61}$$

In general, $\sum_{i,j}\left(\delta^i - \frac{\partial c}{\partial a_i}\right)\left(\delta^j - \frac{\partial c}{\partial a_j}\right)d[A^i, A^j] = 0$ since the (left) right-hand side of equation (61) has finite variation. Thus, if $[A^i]$ are absolutely continuous and the $m \times m$ matrix $\left(\frac{d}{dt}[A^i, A^j]\right)$ is nonsingular, then $\delta_t^i = \frac{\partial c}{\partial a_i}(t, A_t)$, so equation (62) holds and $c(t, A_t) = \sum_i \frac{\partial c}{\partial a_i}(t, A_t)\,A_t^i$. If further the support of $A_t$ is a cone, it follows $c(t, a)$ is homogeneous of degree 1 in a on that cone.

Assume that $M^i := e^{-\int r\,dt}A^i$ are local martingales under an equivalent measure for some locally bounded predictable process r. Then $dA^i = rA^i\,dt + e^{\int r\,dt}\,dM^i$; thus, by equations (61) and (57)

$$\frac{\partial c}{\partial t}(t, A_t)\,dt + \frac{1}{2}\sum_{i,j=1}^m \frac{\partial^2 c}{\partial a_i\,\partial a_j}(t, A_t)\,d[A^i, A^j]_t = r_t\left(C_t - \sum_{i=1}^m \frac{\partial c}{\partial a_i}(t, A_t)\,A_t^i\right)dt \tag{63}$$

Hence, if $c(t, a)$ is homogeneous (in a), then by Euler’s formula equation (62) holds (yet $\delta_t^i$ may differ from $\frac{\partial c}{\partial a_i}(t, A_t)$ if there are redundancies, for then a regular replicating SFTS is not unique).

Given a homogeneous payoff function $h(a)$, the section Homogeneous Continuous Markovian SFTS constructs under suitable assumptions a homogeneous solution $c(t, a)$ to equation (62) with $c(T, a) = h(a)$. Clearly then, by Euler and Itô formulae, $\left(\frac{\partial c}{\partial a_i}(t, A_t)\right)$ is an SFTS for A (as observed in [9] and highlighted in [8], see the section The Homogeneous Option Price Function). To this end, we first factor out the homogeneous symmetry of $h(a)$ next.

where $C := \sum_i \delta^i A^i = C_0 + \delta \cdot A$, that is, $SC = S_0 C_0 + \delta \cdot (SA)$. Indeed, by Itô’s product rule, then substituting for $dC$ and $C_-$ and regrouping, followed
by Itô’s product rule again,

$$\begin{aligned} d(SC) &= S_-\,dC + C_-\,dS + d[S, C] \\ &= S_-\sum_{i=1}^m \delta^i\,dA^i + \sum_{i=1}^m \delta^i A_-^i\,dS + \sum_{i=1}^m \delta^i\,d[S, A^i] \\ &= \sum_{i=1}^m \delta^i\,(S_-\,dA^i + A_-^i\,dS + d[S, A^i]) \\ &= \sum_{i=1}^m \delta^i\,d(SA^i) \end{aligned} \tag{65}$$

Interpreting S as an exchange rate, this result [3, 4, 8], called numéraire invariance, means that the self-financing property is independent of the base currency. (To the best of our knowledge, the term was coined in the 1992 edition of [3], where a similar proof is given.)

If $S, S_- > 0$, then applied to the semimartingale $1/S$ we see that $\delta$ is an SFTS for A if and only if it is one for SA. Thus, if equation (57) holds, then equations (60) and (64) are equivalent.

Assume now that $A^m, A_-^m > 0$ and $m \ge 2$. Define the $n := m - 1$ dimensional semimartingale

$$X := \left(\frac{A^1}{A^m}, \dots, \frac{A^n}{A^m}\right), \qquad n := m - 1 \tag{66}$$

Taking $S = 1/A^m$, it follows that $\delta$ is an SFTS for A if and only if it is an SFTS for $A/A^m = (X, 1)$, that is, if and only if $F := C/A^m$ satisfies $F = F_0 + \boldsymbol{\delta} \cdot X$ where $\boldsymbol{\delta} := (\delta^1, \dots, \delta^n)$. Clearly in this case, $F = \sum_{i=1}^n \delta^i X^i + \delta^m$ and $F_- = \sum_{i=1}^n \delta^i X_-^i + \delta^m$ as $\Delta F = \boldsymbol{\delta} \cdot \Delta X$. Thus,

$$\delta^m = F - \sum_{i=1}^n \delta^i X^i = F_- - \sum_{i=1}^n \delta^i X_-^i, \qquad F := \frac{C}{A^m} \tag{67}$$

(When $m = 1$, a similar argument shows that $\delta$ must be a constant, as intuitively obvious.)

Conversely, suppose that $\boldsymbol{\delta}$ is an X-integrable process and F is a process such that $F = F_0 + \boldsymbol{\delta} \cdot X$. Define $\delta^m$ by either of the above formulas; the other then holds as before. Obviously then $\delta = (\boldsymbol{\delta}, \delta^m)$ is an SFTS for $(X, 1)$ with price process F. Hence by numéraire invariance, $\delta$ is an SFTS for A with price process $C = A^m F$, provided $\delta$ is A-integrable.

Thus, numéraire invariance shows that in order to find an SFTS with a given time-T payoff $C_T$ it is sufficient to find processes $\boldsymbol{\delta}$ and F such that $F = F_0 + \boldsymbol{\delta} \cdot X$ and $F_T = C_T/A_T^m$.

Since $\delta^m = F - \sum_{i=1}^n \delta^i X^i$, the mth delta $\delta^m$ is like F determined by $\boldsymbol{\delta}$ and $F_0$. As such, one interprets the mth asset as the “numéraire asset” chosen to finance an otherwise arbitrary trading strategy $\boldsymbol{\delta}$ in the other assets, post an initial investment of $C_0 = A_0^m F_0$.

We often use the differential form $dF = \sum_{i=1}^n \delta^i\,dX^i$ of the equation $F = F_0 + \boldsymbol{\delta} \cdot X$.

Arbitrage-free Semimartingales and Uniqueness

We call a semimartingale $A = (A^1, \dots, A^m)$, $m \ge 2$, arbitrage free if there exists a positive semimartingale S with $S_- > 0$ such that $SA^i$ are martingales for all i. Such a process S is called a state price density or deflator for A. The law of one price (with bounded deltas) justifies the terminology:

If A is arbitrage free and $\delta$ is a bounded SFTS for A, then SC is a martingale where $C := \sum_{i=1}^m \delta^i A^i$; consequently, $C = 0$ if $C_T = 0$.

Indeed, by numéraire invariance $\delta$ is then an SFTS for SA with price process SC. Hence by the section Self-financing Trading Strategies, SC is a martingale, implying $SC = 0$ if $C_T = 0$, and with it $C = 0$, as claimed.

A simple and well-known argument yields that if $A^m, A_-^m > 0$, then A is arbitrage free if and only if there exists an equivalent probability measure $\mathbb{Q}$ such that X is a $\mathbb{Q}$-martingale, where $X := \left(\frac{A^1}{A^m}, \dots, \frac{A^n}{A^m}\right)$, $n := m - 1$.ᵇ Numéraire invariance then implies that $C/A^m$ is a $\mathbb{Q}$-martingale for the price process $C := \sum_i \delta^i A^i$ of any bounded SFTS $\delta$, and hence

$$C_t = A_t^m\,\mathbb{E}^{\mathbb{Q}}\!\left[\frac{C_T}{A_T^m} \,\Big|\, \mathcal{F}_t\right] \tag{68}$$

Indeed, by numéraire invariance, $\delta$ is an SFTS for $A/A^m$ with price process $C/A^m$. Hence, $C/A^m$ is a $\mathbb{Q}$-martingale by the section Self-Financing Trading Strategies since $A/A^m$ is a $\mathbb{Q}$-martingale and $\delta$ is bounded.

Suppose that X is a $\mathbb{Q}$-square-integrable martingale and $\delta^i$ are bounded for $i \le n$. Then $F := C/A^m$ is a $\mathbb{Q}$-square-integrable martingale
since $dF = \sum_{i=1}^n \delta^i\,dX^i$ by numéraire invariance. Moreover, $d\langle F\rangle = \sum_{i,j=1}^n \delta^i \delta^j\,d\langle X^i, X^j\rangle$. Thus, if $\langle X^i\rangle$ are absolutely continuous and the $n \times n$ matrix $\left(\frac{d}{dt}\langle X^i, X^j\rangle\right)$ is nonsingular, then given any random variable R, there exists at most one SFTS $\delta$ for A such that $\sum_{i=1}^m \delta_T^i A_T^i = R$ and $\delta^i$ are bounded for

unique bounded SFTS for $(X, 1)$ with payoff $g(X_T)$, provided $d[X^i, X^j] = X^i X^j \sigma^{ij}\,dt$ for some nonsingular matrix process $(\sigma_t^{ij})$.

Example: Projective Deterministic Volatility
$\varphi^{ij}$ bounded and $\mathbb{E}\,e^{\frac{1}{2}\sum_j \int_0^T (\phi_t^j)^2\,dt} < \infty$. Define the

$$M := \mathcal{E}\!\left(-\sum_{j=1}^k \int \phi^j\,dZ^j\right) = e^{-\sum_{j=1}^k \int \phi^j\,dZ^j - \frac{1}{2}\sum_{j=1}^k \int (\phi^j)^2\,dt} \tag{85}$$

arbitrage free.

Now let $h(a)$, $a \in \mathbb{R}_+^m$, be a homogeneous function of linear growth. Define $g(x) := h(x, 1)$, $x \in \mathbb{R}_+^n$. Assume further that $\varphi_t^{ij} = \varphi_{ij}(t, X_t)$ for some continuous bounded functions $\varphi_{ij}(t, x)$. Then equation (80) holds, and hence the section Projective Continuous SDE SFTS applied under measure $\mathbb{Q}$ shows that X is $\mathbb{Q}$-Markovian in that $\mathbb{E}^{\mathbb{Q}}[g(X_T) \mid \mathcal{F}_t] := f(t, X_t)$ where $f(t, x) = \mathbb{E}^{\mathbb{Q}}\,g(X_T^{t,x})$, as in equation (81). Thus, by the section Projective Continuous SDE SFTS, equations (72) and (73) hold and $\boldsymbol{\delta}$ as defined in equation (70) is an SFTS for $(X, 1)$. Therefore by numéraire invariance, $\delta$ is an SFTS for A with price process $C = A^m F$. The homogeneity of $h(a)$ further implies $C_T = A_T^m\,g(X_T) = h(A_T)$.

We have thus constructed an SFTS with the given payoff $h(A_T)$. As in the section Example: Projective Deterministic Volatility or Projective Continuous SDE SFTS, we ensure its boundedness by requiring the x-partial derivatives of $g(x)$ or equivalently a-partial derivatives of $h(a)$ (as $L^1_{loc}$ functions) be bounded and thereby get unique pricing. For (very) low dimensions n, the PDE (83) is suitable for numerical valuation in the absence of a closed-form solution.

Although the option price process and the deltas are already found, let us also consider the homogeneous option price function referred to in the section Self-financing Trading Strategies, and now naturally defined by

$$c(t, a) := a^m f\!\left(t, \frac{a^1}{a^m}, \dots, \frac{a^n}{a^m}\right) \tag{86}$$

Then $C_t = c(t, A_t)$. Agreeably, $\delta_t^i = \frac{\partial c}{\partial a_i}(t, A_t)$

$$\frac{\partial c}{\partial t}(t, A_{t-})\,dt + \frac{1}{2}\sum_{i,j=1}^m \frac{\partial^2 c}{\partial a_i\,\partial a_j}(t, A_{t-})\,d[A^i, A^j]_t^c = 0 \tag{87}$$

tions $\sigma_{ij}^A(t, a)$. The quotient-space PDE (83) is more fundamental for it holds in general (even when A is discontinuous) and has one lower dimension. Change of variable $L^i = \frac{X^i}{X^{i+1}} - 1$ $(i < n)$, $L^n = X^n - 1$, transforms equation (83) to the Libor market model PDE.

Multivariate Poisson Predictable Representation

Let $P = (P^1, \dots, P^k)$ be a vector of independent Poisson processes $P^i$ with intensities $\lambda_i > 0$. For any $C^1$ in t function $u(t, p)$, $p \in \mathbb{Z}^k$, the process $u(t, P) = (u(t, P_t))$ is a finite activity semimartingale, and using $[P^i, P^j] = 0$, one has $\Delta u(t, P) = \sum_i \Delta_i u(t, P_-)\,\Delta P^i$, where

$$\Delta_i u(t, p) := u(t, p_1, \dots, p_i + 1, \dots, p_k) - u(t, p) \tag{88}$$

denotes the ith forward partial difference of $u(t, p)$ in p. This in turn readily implies

$$du(t, P) = \frac{\partial u}{\partial t}(t, P_-)\,dt + \sum_{i=1}^k \Delta_i u(t, P_-)\,dP^i \tag{89}$$

Let $v(p)$, $p \in \mathbb{Z}^k$, be a function of exponential linear growth. Define the function

$$u(t, p) := \sum_{q_1, \dots, q_k = 0}^{\infty} v(p + q)\prod_{i=1}^k \frac{\lambda_i^{q_i}}{q_i!}\,(T - t)^{q_i}\,e^{-\lambda_i(T - t)} \qquad (p \in \mathbb{Z}^k) \tag{90}$$
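The function u(t, p) of equation (90) is a Poisson-weighted average of the terminal function v, and the forward partial differences of equation (88) are straightforward to compute. The sketch below takes k = 2 and truncates each series after `n_terms` terms; the payoff `v_example`, the truncation level and the function names are illustrative assumptions. It also evaluates the combination ∂u/∂t + Σ λ_i Δ_i u, which vanishes for this u because each Poisson weight satisfies the corresponding forward equation in T − t.

```python
import math


def poisson_pmf(q, mean):
    """Probability of q jumps for a Poisson variable with the given mean."""
    return math.exp(-mean) * mean ** q / math.factorial(q)


def u(t, p, T, lams, v, n_terms=40):
    """Truncated version of equation (90) for k = 2."""
    tau = T - t
    total = 0.0
    for q1 in range(n_terms):
        w1 = poisson_pmf(q1, lams[0] * tau)
        for q2 in range(n_terms):
            w2 = poisson_pmf(q2, lams[1] * tau)
            total += v((p[0] + q1, p[1] + q2)) * w1 * w2
    return total


def forward_diff(i, t, p, T, lams, v):
    """i-th forward partial difference of u in p, equation (88)."""
    shifted = list(p)
    shifted[i] += 1
    return u(t, tuple(shifted), T, lams, v) - u(t, p, T, lams, v)


def generator_residual(t, p, T, lams, v, h=1e-4):
    """Central-difference estimate of du/dt + sum_i lam_i * Delta_i u."""
    u_t = (u(t + h, p, T, lams, v) - u(t - h, p, T, lams, v)) / (2.0 * h)
    return u_t + sum(lams[i] * forward_diff(i, t, p, T, lams, v) for i in range(2))


def v_example(p):
    """Hypothetical terminal function of exponential linear growth."""
    return max(math.exp(0.1 * p[0] - 0.2 * p[1]) - 1.0, 0.0)
```

At t = T the Poisson weights collapse onto q = 0, so `u(T, p, ...)` returns `v(p)`.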
$$dF = \sum_{i=1}^k \Delta_i u(t, P_-)\,d(P^i - \lambda_i t) \tag{93}$$

and $u(t, p)$ satisfies the equation

$$\frac{\partial u}{\partial t}(t, P_{t-}) + \sum_{i=1}^k \lambda_i\,\Delta_i u(t, P_{t-}) = 0 \tag{94}$$

Since $F_T = v(P_T)$ and $F_0 = u(0, 0)$, combining equations (90) and (93) yields the following representation:

$$v(P_T) = \sum_{q_1, \dots, q_k = 0}^{\infty} v(q_1, \dots, q_k)\prod_{i=1}^k \frac{\lambda_i^{q_i}}{q_i!}\,T^{q_i}\,e^{-\lambda_i T} + \sum_{i=1}^k \int_0^T \Delta_i u(t, P_{t-})\,d(P_t^i - \lambda_i t) \tag{95}$$

Projective Exponential-Poisson SFTS

Let $P = (P^1, \dots, P^k)$ be a vector of independent Poisson processes $P^j$ with intensities $\lambda_j > 0$. Let $X_0 \in \mathbb{R}_+^n$, $n \ge k$, and $\beta = (\beta_{ij})$ be an $n \times k$ matrix such that the $n \times k$ matrix $(e^{\beta_{ij}} - 1)$ has full rank. Then the processes $X^i := (x_i(t, P_t))$, $i = 1, \dots, n$, are square-integrable martingales (in fact in all $H^p$), where

$$x_i(t, p) := X_0^i \exp\sum_{j=1}^k \bigl(\beta_{ij}\,p_j - (e^{\beta_{ij}} - 1)\lambda_j t\bigr) \qquad (p \in \mathbb{Z}^k) \tag{96}$$

$$dX_t^i = X_{t-}^i \sum_{j=1}^k (e^{\beta_{ij}} - 1)\,d(P_t^j - \lambda_j t) \qquad (X_t^i := x_i(t, P_t)) \tag{98}$$

Let $\alpha = (\alpha_{ij})$ be any $n \times k$ matrix such that $\sum_i (e^{\beta_{il}} - 1)\,\alpha_{ij} = \delta_{jl}$, all $1 \le j, l \le k$. Then

$$d(P^j - \lambda_j t) = \sum_{i=1}^n \alpha_{ij}\,\frac{dX^i}{X_-^i} \tag{99}$$

Now let $g(x)$, $x \in \mathbb{R}_+^n$, be a function of linear growth; define the function

$$v(p) := g(x_1(T, p), \dots, x_n(T, p)) \qquad (p \in \mathbb{Z}^k) \tag{100}$$

and the function $u(t, p)$ by equation (90). By the section Multivariate Poisson Predictable Representation, $F := (u(t, P_t))$ is a martingale with $F_T = v(P_T) = g(X_T)$ and is represented as equation (93). Substituting equation (99) into equation (93) yields

$$dF = \sum_{i=1}^n \delta^i\,dX^i \tag{101}$$

where

$$\delta_t^i := \frac{1}{X_{t-}^i}\sum_{j=1}^k \alpha_{ij}\,\Delta_j u(t, P_{t-}) \tag{102}$$

Thus, $\delta = (\delta^1, \dots, \delta^m)$ is an SFTS for $(X, 1)$ where $m := n + 1$ and $\delta^m := F - \sum_{i=1}^n \delta^i X^i$.

It is more desirable to express $\delta$ in terms of X. One has $u(t, p) = f(t, x(t, p))$, where
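$$\begin{aligned} f(t, x) &:= \mathbb{E}\,g\!\left(x\,\frac{X_T}{X_t}\right) = \mathbb{E}\!\left[g\!\left(x\,\frac{X_T}{X_t}\right) \Big|\, \mathcal{F}_t\right] \\ &= \sum_{q_1, \dots, q_k = 0}^{\infty} g\!\left(x_1 e^{\sum_{j=1}^k (\beta_{1j}q_j - (e^{\beta_{1j}} - 1)\lambda_j(T - t))}, \dots, x_n e^{\sum_{j=1}^k (\beta_{nj}q_j - (e^{\beta_{nj}} - 1)\lambda_j(T - t))}\right)\prod_{i=1}^k \frac{\lambda_i^{q_i}}{q_i!}\,(T - t)^{q_i}\,e^{-\lambda_i(T - t)} \end{aligned} \tag{103}$$

The equalities follow from the definition of $v(p)$ above and of $u(t, p)$ in equation (90) together with the two formulae following it.ᶜ We clearly have $f(T, x) = g(x)$ and

$$F_t := u(t, P_t) = f(t, X_t) = \mathbb{E}[g(X_T) \mid \mathcal{F}_t] \tag{104}$$

Since $u(t, p) = f(t, x(t, p))$, the deltas in equation (102) are given by partial differences of $f(t, x)$ as

$$\delta_t^i = \delta_i(t, X_{t-}) \quad\text{where}\quad \delta_i(t, x) := \frac{1}{x_i}\sum_{j=1}^k \alpha_{ij}\bigl(f(t, e^{\beta_{1j}}x_1, \dots, e^{\beta_{nj}}x_n) - f(t, x)\bigr) \tag{105}$$

the last equality following from equation (98). However, the $n \times n$ matrix $\left(\sum_{l=1}^n (e^{\beta_{il}} - 1)(e^{\beta_{jl}} - 1)\lambda_l\right)_{i,j=1}^n$ is nonsingular. Therefore, $\theta^i = 0$, that is, $\hat\delta^i = \delta^i$ for $i \le n$, implying $\hat\delta^m = \delta^m$ too as $\hat F = F$.

One shows, as in the section Exponential-Poisson Exchange Option Model, that the processes $\delta^i$ are bounded if $\gamma_i(x)$ are bounded, where

$$\gamma_i(x) := \frac{1}{x_i}\sum_{j=1}^k \alpha_{ij}\bigl(g(e^{\beta_{1j}}x_1, \dots, e^{\beta_{nj}}x_n) - g(x)\bigr), \qquad \gamma_m(x) := g(x) - \sum_{i=1}^n \gamma_i(x)\,x_i \tag{107}$$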
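The matrix α and the functions γ_i of equation (107) can be made concrete in a small case. The sketch below assumes n = k = 2, so that α is simply the inverse of the transpose of the matrix (e^{β_il} − 1), and uses the payoff g(x) = max(x_1, x_2, 1), which corresponds to h(a) = max(a^1, a^2, a^3) with m = 3; the numerical values of β and all function names are illustrative.

```python
import math

# Hypothetical 2 x 2 jump-size matrix beta (rows: assets i, columns: Poisson drivers j).
BETA = [[0.2, -0.1],
        [0.05, 0.3]]


def jump_matrix(beta):
    """Matrix E with entries e^{beta_il} - 1."""
    return [[math.exp(b) - 1.0 for b in row] for row in beta]


def alpha_matrix(beta):
    """alpha with sum_i E[i][l] * alpha[i][j] = delta_jl, i.e. alpha = inverse of E transposed."""
    e = jump_matrix(beta)
    a, b = e[0][0], e[1][0]   # first row of E transposed
    c, d = e[0][1], e[1][1]   # second row of E transposed
    det = a * d - b * c
    return [[d / det, -b / det],
            [-c / det, a / det]]


def g(x):
    """Projective payoff g(x) = h(x, 1) for h(a) = max(a1, a2, a3)."""
    return max(x[0], x[1], 1.0)


def gamma(i, x, beta, alpha):
    """gamma_i(x) of equation (107) for i = 0, 1 (zero-based index)."""
    total = 0.0
    for j in range(2):
        shifted = [math.exp(beta[r][j]) * x[r] for r in range(2)]
        total += alpha[i][j] * (g(shifted) - g(x))
    return total / x[i]


def gamma_m(x, beta, alpha):
    """gamma_m(x) = g(x) - sum_i gamma_i(x) * x_i."""
    return g(x) - sum(gamma(i, x, beta, alpha) * x[i] for i in range(2))
```

When one ratio dominates, for example x = (0.01, 5), the payoff is locally linear in that ratio, so the corresponding γ is 1 and the others vanish; this is the mechanism that keeps the γ_i bounded for the maximum payoff.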
Owing to the above growth condition, the positive local martingale

$$M := \mathcal{E}\!\left(\sum_{j=1}^k \int \left(\frac{\lambda^j}{\kappa^j} - 1\right)(dP^j - \kappa^j\,dt)\right) = e^{-\sum_{j=1}^k \int (\lambda^j - \kappa^j)\,dt}\prod_{j=1}^k \prod_{s \le \cdot}\left(1 + \left(\frac{\lambda_s^j}{\kappa_s^j} - 1\right)\Delta P_s^j\right) \tag{109}$$

is a martingale. Define the measure $\mathbb{Q}$ by $d\mathbb{Q} = M_T\,d\mathbb{P}$. As in the section Exponential-Poisson Model Uniqueness, $\int \lambda^j\,dt$ are the $\mathbb{Q}$-compensator of $P^j$. This, equation (108), and boundedness of $\lambda^j$ imply that $X^i$ are $\mathbb{Q}$-square integrable martingales. Thus, A is arbitrage free. As before, the SDE (108) integrates to

$$X_t^i = X_0^i\,e^{\sum_{j=1}^k \left(\beta_{ij}P_t^j - (e^{\beta_{ij}} - 1)\int_0^t \lambda_s^j\,ds\right)} \tag{110}$$

Now assume $\lambda^j$ are constant. Then $P^j$ are $\mathbb{Q}$-Poisson processes with intensities $\lambda^j$ and are independent since $[P^j, P^l] = 0$, $j \ne l$. Let $h(a)$, $a \in \mathbb{R}_+^m$, be a homogeneous function of linear growth. Define $g(x) := h(x, 1)$, $x \in \mathbb{R}_+^n$. The section Projective Exponential-Poisson SFTS applied under $\mathbb{Q}$ implies that $\delta^i$ given by equation (105) (with $\delta^m = F - \sum_{i=1}^n \delta^i X^i$) is an SFTS for $(X, 1)$ with price process $F = (f(t, X_t))$ satisfying $F_T = g(X_T)$, where $f(t, x)$ is defined explicitly by equation (103), or equivalently, $f(t, x) = \mathbb{E}^{\mathbb{Q}}\,g(xX_T/X_t)$. Therefore, by numéraire invariance, $\delta$ is an SFTS for A with price process $C := A^m F$ satisfying $C_T = A_T^m\,g(X_T) = h(A_T)$ by homogeneity.

Assume finally that the payoff function $h(a)$ is such that the functions $\gamma_i(x)$ defined in equation (107) are bounded (e.g., $h(a) = \max(a^1, \dots, a^m)$). By the section Projective Exponential-Poisson SFTS, if $k = n$, then $\delta$ is the unique bounded SFTS for A with payoff $C_T = h(A_T)$. In general, since A is arbitrage free, $\hat C = C$ for any other bounded SFTS $\hat\delta$ for A with payoff $\hat C_T = h(A_T)$, where $\hat C := \sum_i \hat\delta^i A^i$.

End Notes

a. Clearly, then the restriction of (any such) $c(t, a)$ to the support of A is unique, and if $\hat c(t, a)$ is any function that equals $c(t, a)$ on the support of A, then $C_t = \hat c(t, A_t)$ too. If the support of $A_t$ is a proper surface, for example, if $m = 2$ and $A^2$ is deterministic as in the Black–Scholes model or $A_t^2 = a_2(t, A_t^1)$ as in Markovian short-rate models, then obviously there exist infinitely many nonhomogeneous functions $\hat c(t, a)$ such that $C_t = \hat c(t, A_t)$. (Such a homogeneous function also exists under some assumptions as in the section Homogeneous Continuous Markovian SFTS.)

b. Indeed, first assume that A is arbitrage free and let S be a state price density. The martingale $M := \frac{SA^m}{\mathbb{E}[S_0 A_0^m]}$ clearly satisfies $\mathbb{E}\,M_T = 1$. Hence, the equivalent measure $\mathbb{Q}$ defined by $d\mathbb{Q} = M_T\,d\mathbb{P}$ is a probability measure. Since $MX^i = \frac{SA^i}{\mathbb{E}[S_0 A_0^m]}$ is a martingale, $X^i$ is a $\mathbb{Q}$-martingale by Bayes’ rule. Conversely, assume that $X^i$ are $\mathbb{Q}$-martingales for some $\mathbb{Q}$. Define $M_t := \mathbb{E}\!\left[\frac{d\mathbb{Q}}{d\mathbb{P}} \,\Big|\, \mathcal{F}_t\right] > 0$. Then (the right continuous version of) $M = (M_t)$ is a martingale (so $M_- > 0$). By Bayes’ rule $MX^i$ are martingales since $X^i$ are $\mathbb{Q}$-martingales. Set $S := M/A^m$. Then $S, S_- > 0$ and $SA^i = MX^i$. Thus S is a deflator, as desired. Further, since SC is a martingale for any bounded SFTS $\delta$, by Bayes’ rule $SC/M = C/A^m$ is a $\mathbb{Q}$-martingale.

c. The projective option price function $f(t, x) := \mathbb{E}\,g\!\left(x\,\frac{X_T}{X_t}\right)$, also encountered for the log-Gaussian case in equation (75), satisfies $f(t, X_t) = \mathbb{E}[g(X_T) \mid \mathcal{F}_t]$ in general when X is the exponential of any n-dimensional process of independent increments (inhomogeneous Lévy process), but we no longer have hedging in general.

References

[1] Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, Journal of Political Economy 81, 637–654.
[2] Delbaen, F. & Schachermayer, W. (2006). The Mathematics of Arbitrage, Springer.
[3] Duffie, D. (2001). Dynamic Asset Pricing Theory, 3rd Edition, Princeton University Press.
[4] El-Karoui, N., Geman, H. & Rochet, J.C. (1995). Change of numeraire, change of probability measure, and option pricing, Journal of Applied Probability 32, 443–458.
[5] Harrison, M.J. & Kreps, D.M. (1979). Martingales and arbitrage in multiperiod securities markets, Journal of Economic Theory 20, 381–408.
[6] Harrison, M.J. & Pliska, S. (1981). Martingales and stochastic integrals in the theory of continuous trading, Stochastic Processes and their Applications 11, 215–260.
[7] Jamshidian, F. (1993). Options and futures evaluation with deterministic volatilities, Mathematical Finance 3(2), 149–159.
[8] Margrabe, W. (1978). The value of an option to exchange one asset for another, Journal of Finance 33, 177–186.
18 Exchange Options
[9] Merton, R. (1973). Theory of rational option pricing, eign Exchange Options; Forward–Backward Sto-
Bell Journal of Economics 4(1), 141–183. chastic Differential Equations (SDEs); Hedging;
[10] Neuberger, A. (1990). Pricing Swap Options Using the Itô’s Formula; Markov Processes; Martingales;
Forward Swap Market, IFA Preprint.
Poisson Process.
is $v_{0,0} = 9.086$ and $N = 0.588$. The initial holding in the bank is therefore $v_{0,0} - NS = -49.72$. This is the typical situation: hedging involves leverage (borrowing from the bank to invest in shares).

Now let us consider scaling the binomial model to a continuous limit. Take a fixed time horizon $T$ and think of the price $S(i)$ above, now written $S_n(i)$, as the price at time $iT/n = i\Delta t$. Suppose the continuously compounding rate of interest is $r$, so that $R = e^{r\Delta t}$. Finally, define $h = \log u$ and $X(i) = \log(S(i)/S(0))$; then $X(i)$ is a random walk on the lattice $\{\ldots, -2h, -h, 0, h, \ldots\}$ with right and left probabilities $q_0, q_1$ as defined earlier and $X(0) = 0$. If we now take $h = \sigma\sqrt{\Delta t}$ for some constant $\sigma$, we find that

$$q_0, q_1 = \frac{1}{2} \pm \frac{h}{2\sigma^2}\left(r - \frac{1}{2}\sigma^2\right) + O(h^2) \tag{10}$$

Thus $Z(i) := X(i) - X(i-1)$ are independent random variables with

$$\mathrm{E}Z(i) = \frac{h^2}{\sigma^2}\left(r - \frac{1}{2}\sigma^2\right) + O(h^3) = \left(r - \frac{1}{2}\sigma^2\right)\Delta t + O(n^{-3/2}) \tag{11}$$

$$\operatorname{var}(Z(i)) = \sigma^2 \Delta t + O(n^{-2}) \tag{12}$$

Hence $X_n(T) := X(n) = \sum_{i=1}^{n} Z(i)$ has mean $\mu_n$ and variance $V_n$ such that $\mu_n \to (r - \sigma^2/2)T$ and $V_n \to \sigma^2 T$ as $n \to \infty$. By the central limit theorem, the distribution of $X_n(T)$ converges weakly to the normal distribution with the limiting mean and variance. If the contingent claim payoff is a continuous bounded function $O = h(S_n(n))$, then the option value converges to a normal expectation that can be written as

$$V_0(S) = \frac{e^{-rT}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} h\left(S \exp\left(\left(r - \frac{1}{2}\sigma^2\right)T + \sigma\sqrt{T}\,x\right)\right) e^{-\frac{1}{2}x^2}\,dx \tag{13}$$

This is the Black–Scholes formula. It can be given in more explicit terms when, for example, $h(S) = [S - K]^+$, the standard call option (see Black–Scholes Formula).

In the multiperiod binomial model, the basic computational step is the backward recursion

$$v_{i-1,j} = \frac{1}{R}\left(q_0 v_{i,j} + q_1 v_{i,j+1}\right) \tag{14}$$

defining the values at time step $i - 1$ from those at time $i$ by discounted conditional expectation, starting with the exercise values $v_{n,j} = O_j$ at the final time $n$. In an American option, we have the right to exercise at any time, the exercise value at time $i$ being some given function $h(i, S_i)$, for example, $h(i, S_i) = [K - S_i]^+$ for an American put. The exercise value at node $(i, j)$ in the binomial tree is therefore $\tilde{h}(i, j) = h(i, S u^{i-2j})$. In this case, it is natural to replace equation (14) by

$$v_{i-1,j} = \max\{v^c_{i-1,j},\ \tilde{h}(i-1, j)\} \tag{15}$$

where $v^c_{i-1,j}$ is given by the right-hand side of equation (14). At each node $(i-1, j)$, we compare the "continuation value" $v^c_{i-1,j}$ with the "immediate exercise" value $\tilde{h}(i-1, j)$ and take the larger value. This intuition is correct, and the value $v_{0,0}$ obtained by applying equation (15) for $i = n, n-1, \ldots, 1$ and with starting condition $v_{n,j} = \tilde{h}(n, j)$ is the unique arbitrage-free value of the American option at time 0.

The reader should refer to American Options for a complete treatment, but, in outline, the argument establishing the above claim is as follows. The algorithm divides the set of nodes into two, the stopping set $S = \{(i, j) : v_{i,j} = \tilde{h}(i, j)\}$ and the complementary continuation set $C$. By definition, $(n, j) \in S$ for $j = 0, \ldots, n$. Let $\tau^*$ be the stopping time $\tau^* = \min\{i : S_i \in S\}$. Then $\tau^*$ is the optimal time at which the holder of the option should exercise. The process $V_i = v_{i,S_i}/R^i$ is a supermartingale, while the stopped process $V_{i\wedge\tau^*}$ is a martingale with the property that $V_{i\wedge\tau^*} \ge h(i\wedge\tau^*, S_{i\wedge\tau^*})/R^i$. These facts follow from the general theory of optimal stopping, but are not hard to establish directly in the present case. The value $V_{i\wedge\tau^*}$ can be replicated by trading in the underlying asset (using the basic hedging strategy (3)
4 Binomial Tree
derived for the one-period model). It follows that this strategy (call it SR) is the cheapest superreplicating strategy, that is, $x = v_{0,0}$ is the minimum capital required to construct a trading strategy with value $X_i$ at time $i$ with the property that $X_i \ge h(i, S_i)$ for all $i$ almost surely. If the seller of the option is paid more than $v_{0,0}$, then he or she can put the excess in the bank and employ the trading strategy SR, which is guaranteed to cover his or her obligation to the buyer whenever he or she chooses to exercise. Conversely, if the seller will accept $p < v_{0,0}$ for the option then the buyer should short SR, obtaining an initial value $v_{0,0}$ of which $p$ is paid to the seller and $v_{0,0} - p$ placed in (for clarity) a second bank account. The short strategy has value $-X_i$ and the buyer exercises at $\tau^*$, receiving from the seller the exercise value $h(\tau^*, S_{\tau^*}) = X_{\tau^*}$, which is equal and opposite to the value of the short hedge at $\tau^*$. Thus, there is an arbitrage opportunity for one party or the other unless the price is $v_{0,0}$.

The impact of the binomial model as introduced by Cox et al. [1] is largely due to the fact that the European option pricer can be turned into an American option pricer by a trivial one-line modification of the code. Pricing American options in (essentially) the Black–Scholes model was recognized as a free-boundary problem in partial differential equations (PDE) by McKean [3] in 1965, but the only computational techniques were PDE methods (see Finite Difference Methods for Early Exercise Options) generally designed for much more complicated problems.

Computations in the Binomial Model

Nowadays the binomial model is rarely, if ever, used for practical problems, largely because it is comprehensively outperformed by the trinomial tree (see Tree Methods).

First, the form of the tree given above is probably not the best if we want to regard the tree as an approximation to the Black–Scholes model. We see from equation (10) that the risk-neutral probabilities $q_0, q_1$ depend on $r$, so if we want to calibrate the model to the market yield curve we will need time-varying $q_0, q_1$. This can be avoided if we write the Black–Scholes model as

$$S_t = F_{0,t} M_t \tag{16}$$

where $F_{0,t}$ is the forward price quoted at time 0 for exchange at time $t$ and

$$M_t = \exp\left(\sigma W_t - \frac{1}{2}\sigma^2 t\right) \tag{17}$$

is the exponential martingale with Brownian motion $W_t$. See Black–Scholes Formula for this representation. $F_{0,t}$ only depends on the spot price $S_0$ and the yield curve (and the dividend yield, if any), so the only stochastic modeling required relates to the Brownian motion $\sigma W_t$. Here we can use a standard "symmetric random walk" approximation: divide the time interval $[0, T]$ into $n$ intervals of length $\delta = T/n$ and take a space step of length $h = \sigma\sqrt{\delta}$. At each discrete time point, the random walk (denoted $X_i$) takes a step of $\pm h$ with probability 1/2 each; this is just a binomial tree with equal up and down probabilities. For a single step $Z = X_i - X_{i-1} = \pm h$ we have $\mathrm{E}[e^Z] = \cosh h$, so if we define $\alpha = \log(\cosh h)$ then $M_i^{(n)} = \exp(X_i - \alpha i)$ is a positive discrete-time martingale with $\mathrm{E}[M_i^{(n)}] = 1$. It is a standard result that the sequence $M^{(n)}$ (suitably interpreted) converges weakly to $M$ given by equation (17) as $n \to \infty$. This gives us a discrete-time model

$$S_i^{(n)} = F_{0,i\delta}\, M_i^{(n)} \tag{18}$$

such that $\mathrm{E}[S_i^{(n)}] = F_{0,i\delta}$ holds exactly at each $i$. At node $(i, j)$ in the tree the corresponding price is $F_{0,i\delta} \exp((n - j)h - i\alpha)$. Essentially, we have replaced the original multiplicative random walk representing the price $S(t)$ by an additive random walk representing the return process $\log S(t)$. The advantages of this are (i) all the yield curve aspects are bundled up in the model-free function $F$, and (ii) the stochastic model is "universal" (and very simple).

The decisive drawback of any binomial model is the absolute inflexibility with respect to volatility: it is impossible to maintain a recombining tree while allowing time-varying volatility. This means that the model cannot be calibrated to more than a single option price, making it useless for real pricing applications. The trinomial tree gets around this: we can adjust the local volatility by changing the transition probabilities while maintaining the tree geometry (i.e., the constant spatial step $h$).
or for a put option on the minimum of two assets, we have $Z_t = (K - \min(S^1_t, S^2_t))^+$. There also exist options, called Amerasian options, where the pay-off depends on the whole path of the assets, for instance, $Z_t = \left(K - \frac{1}{t}\int_0^t S_u\,du\right)^+$.

Using arbitrage arguments, Bensoussan and Karatzas [5, 32] have shown that the discounted American option value at time $t$ is the Snell envelope of the discounted pay-off process [19, 43]. For the definition and general properties of the Snell envelope, we refer to [23] for continuous time and to [44] for discrete time. We can then assert that the price at time $t$ of an American option with pay-off process $Z$ and maturity $T$ is

$$P_t = \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{t,T}} \mathbb{E}^*\left[\frac{S^0_t}{S^0_\tau} Z_\tau \,\Big|\, \mathcal{F}_t\right] \tag{1}$$

where $\mathcal{T}_{t,T}$ is the set of $\mathbb{F}$-stopping times with values in $[t, T]$.

The second problem appearing in the option theory consists in determining a hedging strategy for the option seller (see Hedging). The solution follows directly from the Snell envelope properties. Indeed, if $X$ is a process, we will denote the discounted process by $\tilde{X} = X/S^0$ and we have the following result ([35, Corollary 10.2.4]).

Proposition 1 The process $(\tilde{P}_t)_{0\le t\le T}$ is the smallest right-continuous supermartingale that dominates $(\tilde{Z}_t)_{0\le t\le T}$.

As $(\tilde{P}_t)_{0\le t\le T}$ is a supermartingale, it admits a Doob decomposition (see Doob–Meyer Decomposition). There exist a unique right-continuous martingale $(M_t)_{0\le t\le T}$ and a unique nondecreasing, continuous, adapted process $(A_t)_{0\le t\le T}$ such that $A_0 = 0$ and $\tilde{P}_t = M_t - A_t$ for all $t \in [0, T]$. This decomposition of $\tilde{P}$ is very useful to determine a superreplication strategy for an American option (see Superhedging). A strategy is defined as a predictable process $(\phi_t)_{0\le t\le T}$ such that the value, at time $t$, of the portfolio associated with this strategy is $V_t(\phi) = \sum_{i=0}^{d} \phi^i_t S^i_t$. In a complete market, each contingent claim is replicable; then there exists a self-financing strategy $\phi$ such that $V_T(\phi) = S^0_T M_T$. As $\tilde{V}(\phi)$ is a martingale under the risk-neutral probability, we get $\tilde{V}_t(\phi) = M_t$ for all $t \in [0, T]$. In conclusion, we have constructed a self-financing strategy such that

$$\forall t \in [0, T], \quad V_t(\phi) = P_t + S^0_t A_t \ge P_t \tag{2}$$

This is a superreplication strategy for American options. Moreover, for this strategy, the initial wealth for hedging the option is minimum because we have $V_0(\phi) = P_0$.

The third problem arising in the American option theory is linked to early exercise opportunity. Contrary to European options, for the American option holder, knowing the arbitrage price of his/her option is not enough. He/she has to know when it is optimal for him/her to exercise the option. The tool to study this problem is the optimal stopping theory.

Optimal Exercise

We recall some useful results of the optimal stopping theory and apply them to the American put option in the famous Black–Scholes model. These results are proved in [23] in a larger setting and their financial applications have been developed in [35].

An optimal stopping time for an American option holder is a stopping time that maximizes his/her gain. Consequently, a stopping time $\nu$ is optimal if we have

$$\mathbb{E}^*[\tilde{Z}_\nu] = \operatorname*{ess\,sup}_{\tau \in \mathcal{T}_{0,T}} \mathbb{E}^*[\tilde{Z}_\tau \mid \mathcal{F}_t] \tag{3}$$

We have a characterization of optimal stopping times, thanks to the following theorem.

Theorem 1 Let $\tau^* \in \mathcal{T}_{0,T}$. $\tau^*$ is an optimal stopping time if and only if $P_{\tau^*} = Z_{\tau^*}$ and the process $(\tilde{P}_{t\wedge\tau^*})_{0\le t\le T}$ is a martingale.

It follows from this result that the stopping time

$$\tau^* = \inf\{t \ge 0 : P_t = Z_t\} \wedge T \tag{4}$$

is an optimal stopping time and, obviously, it is the smallest one. We can easily determine the largest optimal stopping time by using the Doob decomposition of the supermartingale. We introduce the following stopping time:

$$\nu^* = \inf\{t \ge 0 : A_t > 0\} \wedge T \tag{5}$$

and it is easy to see that $\nu^*$ is the largest optimal stopping time.

We then apply these results to an American put option in the Black–Scholes framework (see Black–Scholes Formula). We assume that the
American Options 3
underlying asset $S$ of the option is the solution, under the risk-neutral probability, to the following equation:

$$dS_t = S_t (r\,dt + \sigma\,dW_t) \tag{6}$$

with $r, \sigma > 0$ and $W$ a standard Brownian motion. From the Markov property of $S$, we can deduce that the option price at time $t$ is $P(t, S_t)$, where

$$P(t, x) = \sup_{\tau \in \mathcal{T}_{0,T-t}} \mathbb{E}^*\left[e^{-r\tau}(K - S_\tau)^+ \mid S_0 = x\right] \tag{7}$$

It is easy to see that $t \to P(t, x)$ is nonincreasing for all $x \in [0, +\infty)$. Moreover, for $t \in [0, T]$, the function $x \to P(t, x)$ is convex [24, 29, 30]. From the convexity of $P$, we deduce that there exists a unique optimal stopping time: $\tau^* = \inf\{t \ge 0 : P(t, S_t) = (K - S_t)^+\} \wedge T$. We introduce the so-called critical price or free boundary $s(t) = \inf\{x \in [0, +\infty) : P(t, x) > (K - x)^+\}$ and can write that

$$\tau^* = \inf\{t \ge 0 : S_t \le s(t)\} \wedge T = \inf\{t \ge 0 : W_t \le \alpha(t)\} \wedge T \quad \text{with} \quad \alpha(t) = \frac{1}{\sigma}\left(\ln\frac{s(t)}{S_0} - \left(r - \frac{\sigma^2}{2}\right)t\right) \tag{8}$$

Hence, $\tau^*$ is the reaching time of $\alpha$ by a Brownian motion. If $\alpha$ was known, we could compute $\tau^*$ and then $P$. However, the only way to get the law of $\tau^*$ explicitly is to reduce the dimension by considering options with infinite maturity (also known as perpetual options). In this case, we have the following result ([37, Proposition 4.5]).

Proposition 2 The value function of an American perpetual put option is

$$P^\infty(x) = \sup_{\tau \in \mathcal{T}_{0,+\infty}} \mathbb{E}^*\left[e^{-r\tau}(K - S_\tau)^+ \mathbf{1}_{\{\tau < +\infty\}} \mid S_0 = x\right] \tag{9}$$

and is given by

$$P^\infty(x) = \begin{cases} K - x & \text{if } x \le s^* \\ (K - s^*)\left(\dfrac{s^*}{x}\right)^{\gamma} & \text{if } x > s^* \end{cases} \qquad \text{where } \gamma = \frac{2r}{\sigma^2} \text{ and } s^* = \frac{K\gamma}{1 + \gamma} \tag{10}$$

is the critical price.

Another technique to reduce the dimension of the problem is the randomization of the maturity applied in [9, 13], but only approximations of the option price can be obtained in this way. In the following section, we present methods to approximate $P$ based on the discretization of the problem.

Approximation of the American Option Value

To approximate $P_t$, it is natural to restrict the set of exercise dates to a finite one. We then introduce a subdivision $S = \{t_1, \ldots, t_n\}$ of the interval $[0, T]$ and assume that the option owner can exercise only at a date in $S$. Such options are called Bermuda options and their price at time $t$ is given by

$$P^n_t = \operatorname*{ess\,sup}_{\tau \in \mathcal{T}^n_{t,T}} \mathbb{E}^*\left[S^0_t \tilde{Z}_\tau \mid \mathcal{F}_t\right] \tag{11}$$

where $\mathcal{T}^n_{t,T}$ is the set of $\mathbb{F}$-stopping times with values in $S \cap [t, T]$. We obviously have $\lim_{n\to+\infty} P^n = P$ and some estimates of the error have been given in [1, 15]. For perpetual put options, Dupuis and Wang [21] have obtained a first-order expansion of the error on the value function and on the critical prices. In the case of finite maturity, this problem is still open; we just know that the error is proportional to $\frac{1}{n}$ for the value function and to $\frac{1}{\sqrt{n}}$ for the critical prices [18].

We have to determine $P^n_{t_i}$ for all $i \in \{1, \ldots, n\}$. For this, we use the so-called dynamic programming equation:

$$P^n_T = Z_T, \qquad P^n_{t_i} = \max\left(Z_{t_i},\ \mathbb{E}^*\left[\frac{S^0_{t_i}}{S^0_{t_{i+1}}} P^n_{t_{i+1}} \,\Big|\, \mathcal{F}_{t_i}\right]\right) \tag{12}$$

This equation is easy to understand with financial arguments. At maturity of the option, it is obvious that the option price $P^n_T$ is equal to the pay-off $Z_T$. At time $t_i < T$, the option holder has two choices: he/she exercises and then earns $Z_{t_i}$; else he/she keeps the option and then would have the option value at time $t_{i+1}$, $P^n_{t_{i+1}}$. Hence, using the no arbitrage assumption, one can prove that at time $t_i$ the option seller should receive

$$\max\left(Z_{t_i},\ \mathbb{E}^*\left[\frac{S^0_{t_i}}{S^0_{t_{i+1}}} P^n_{t_{i+1}} \,\Big|\, \mathcal{F}_{t_i}\right]\right) \tag{13}$$

Computing the Bermuda option price consists now in calculating the expectations in the dynamic
programming equation. On the one hand, Monte Carlo techniques have been applied to solve this problem (see Monte Carlo Simulation for Stochastic Differential Equations; Bermudan Options and [11]). More precisely, we can quote some regression methods based on projections on a Hilbert space basis [40, 47], quantization algorithms proposed in [1, 2], and some Monte Carlo methods based on Malliavin calculus [3, 8]. On the other hand, we can use a discrete approximation of the underlying assets process. A widely used model is the Cox, Ross, and Rubinstein model (see Binomial Tree). We introduce a family of independent and identically distributed Bernoulli variables $(U_n)_{1\le n\le N}$ with values in $\{b, h\}$, where $-1 < b < h$. We then consider only two assets $S^0$ and $S$ whose respective initial values are 1 and $S_0$ such that

$$S^0_n = (1 + r)^n \quad \text{and} \quad S_n = S_{n-1}(1 + U_n) \qquad \forall n \in \{1, \ldots, N\} \tag{14}$$

where $r > 0$ is the constant interest rate of the market. From the no arbitrage assumption, it follows that $b < r < h$ and that, under the risk-neutral probability $\mathbb{P}^*$, we have $p := \mathbb{P}^*(U_1 = h) = \frac{r - b}{h - b}$. Hence, using the Markov property of $S$, we can price an American option on $S$. For instance, for a put option with exercise price $K$, we get $P_n = F(n, S_n)$, where $F$ is the solution to the following equation:

$$\begin{cases} F(N, x) = (K - x)^+ \\ F(n, x) = \max\left(K - x,\ \dfrac{1}{1 + r}\big(p\,F(n+1, x(1+h)) + (1 - p)\,F(n+1, x(1+b))\big)\right) \end{cases} \tag{15}$$

The convergence of binomial approximations was first studied in a general setting in [34]. The rate of convergence is difficult to get, but some estimates are given in [36, 38].

In conclusion, for some simple models, one can

consequences from the theoretical point of view and on practical aspects.

Analytic Properties of American Options

In this section, we assume that the assets prices process follows a model called local volatility model (see Local Volatility Model). This model is complete and takes into account the smile of volatility observed when one calibrates the Black–Scholes model (see Model Calibration; Implied Volatility Surface and [20]). We suppose that the assets prices process is the solution to the following stochastic differential equation:

$$dS^i_t = S^i_t\left(b_i(t, S_t)\,dt + \sum_{j=1}^{d} \sigma_{i,j}(t, S_t)\,dW^j_t\right) \tag{16}$$

where $W$ is a standard Brownian motion on $\mathbb{R}^d$, $b$ a function mapping $[0, T] \times [0, +\infty)^d$ into $\mathbb{R}^d$, and $\sigma$ a function mapping $[0, T] \times [0, +\infty)^d$ into $\mathbb{R}^{d\times d}$. Moreover, we assume that $b$ is bounded and Lipschitz continuous, that $\sigma$ is Lipschitz continuous in the space variable, and that there exist $\alpha \ge 1/2$ and $\sigma_H$ such that $\forall x \in [0, +\infty)^d$, $(t, s) \in [0, T]^2$, $|\sigma(t, x) - \sigma(s, x)| \le \sigma_H |t - s|^\alpha$. Moreover, to ensure the completeness of the market and the nondegeneracy of the partial differential equation satisfied by European option price functions, we assume that there exist $m > 0$ and $M > 0$ such that

$$\forall (t, x, \xi) \in [0, T] \times [0, +\infty)^d \times \mathbb{R}^d, \qquad m^2 |\xi|^2 \le \xi^* \sigma^* \sigma(t, x)\,\xi \le M^2 |\xi|^2 \tag{17}$$

From the Markov property of the process $S$, at time $t$, the price of an American option with maturity $T$ and pay-off process $(f(S_t))_{0\le t\le T}$ is $P(t, S_t)$, where

$$P(t, x) = \sup_{\tau \in \mathcal{T}_{t,T}} \mathbb{E}\left[e^{-r(\tau - t)} f(S_\tau) \mid S_t = x\right] \tag{18}$$
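Two formulas from the preceding pages lend themselves to a quick numerical sketch: the closed-form perpetual put of Proposition 2, equation (10), and the Cox, Ross, and Rubinstein recursion (15) for a put. The Python below is a minimal illustration, not a reference implementation; the function names `perpetual_put` and `crr_american_put` are hypothetical, `r` is the continuously compounded rate in the first function and the per-period rate of equation (14) in the second.

```python
def perpetual_put(x, K, r, sigma):
    """Closed-form value of the perpetual American put, equation (10).

    r is the continuously compounded rate of the Black-Scholes model (6).
    """
    gamma = 2.0 * r / sigma ** 2
    s_star = K * gamma / (1.0 + gamma)   # critical price s*
    if x <= s_star:
        return K - x                     # immediate exercise region
    return (K - s_star) * (s_star / x) ** gamma


def crr_american_put(S0, K, r, b, h, N):
    """American put in the two-state model (14) via the recursion (15).

    r, b, h are per-period quantities with b < r < h (assumed by the caller).
    """
    p = (r - b) / (h - b)                # risk-neutral probability of U = h
    # F(N, x) on terminal nodes, indexed by k = number of up-moves
    values = [max(K - S0 * (1 + h) ** k * (1 + b) ** (N - k), 0.0)
              for k in range(N + 1)]
    for n in range(N - 1, -1, -1):
        values = [
            max(K - S0 * (1 + h) ** k * (1 + b) ** (n - k),            # exercise
                (p * values[k + 1] + (1 - p) * values[k]) / (1 + r))   # continue
            for k in range(n + 1)
        ]
    return values[0]
```

For a finite maturity the recursion value can be compared with the perpetual value, which bounds it from above because a longer exercise window can only increase the value of the put.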
• $\forall (t, x) \in [0, T] \times [0, +\infty)^d$, $P(t, x) \ge f(x)$
• If the coefficients $\sigma$ and $b$ do not depend on time, we can write

$$P(t, x) = \sup_{\tau \in \mathcal{T}_{0,T-t}} \mathbb{E}\left[e^{-r\tau} f(S_\tau) \mid S_0 = x\right] \tag{19}$$

then the function $t \to P(t, x)$ is nonincreasing on $[0, T]$.

Up to imposing some assumptions on the regularity of the pay-off function, we can derive some important continuity properties of $P$. In this section, we assume that $f$ is nonnegative and continuous on $[0, +\infty)^d$ such that

$$\exists (M, n) \in [0, +\infty) \times \mathbb{N},\ \forall x \in [0, +\infty)^d, \qquad |f(x)| + \sum_{i=1}^{d} \left|\frac{\partial f}{\partial x_i}(x)\right| \le M(1 + |x|^n) \tag{20}$$

These assumptions are generally satisfied by the pay-off functions appearing in finance, especially by the pay-off functions of put and call options. In this setting, we have the following result [31].

Proposition 3 There exists a constant $C > 0$ such that

$$\forall t \in [0, T],\ \forall (x, y) \in [0, +\infty)^{2d}, \qquad |P(t, x) - P(t, y)| \le C\,|x - y| \tag{21}$$

$$\forall x \in [0, +\infty)^d,\ \forall (t, s) \in [0, T]^2, \qquad |P(t, x) - P(s, x)| \le C\left|(T - t)^{\frac{1}{2}} - (T - s)^{\frac{1}{2}}\right|$$

this method to the American option problem, Jaillet et al. have proved that the value function $P$ can be characterized as the unique solution, in the sense of distribution, of the following variational inequality [31]:

$$\begin{cases} DP \le 0, \quad f \le P, \quad (P - f)\,DP = 0 \ \text{a.e.} \\ P(T, x) = f(x) \ \text{on } [0, +\infty)^d \end{cases} \tag{23}$$

where we set

$$Dh(t, x) = \frac{\partial h}{\partial t} + \frac{1}{2}\sum_{i,j=1}^{d} (\sigma\sigma^*)_{i,j}(t, x)\,x_i x_j \frac{\partial^2 h}{\partial x_i \partial x_j} + \sum_{i=1}^{d} b_i(t, x)\,x_i \frac{\partial h}{\partial x_i} - rh \tag{24}$$

This inequality directly derives from the properties of the Snell envelope. Indeed, the condition $DP \le 0$ is the analytic translation of the supermartingale property of $\tilde{P}$, $f \le P$ corresponds to $Z \le P$, and the fact that one of these two inequalities has to be an equality follows from the martingale property of $(\tilde{P}_{t\wedge\tau^*})_{0\le t\le T}$.

From the variational inequality, we can use numerical methods, such as finite difference methods, to compute the option price (see Finite Difference Methods for Early Exercise Options and [31]). From a theoretical point of view, we can deduce some analytic properties of $P$. If we add the condition that second-order derivatives of the pay-off function are bounded from below, we have the following result.

Proposition 4 Regularity of P
exercise premium formula presented in the section Exercise Region [30, 43]. In connection with free boundary problems, some analytic methods have been developed in [26] from which we can deduce the continuity of $\frac{\partial P}{\partial t}$ on $[0, T) \times [0, +\infty)^d$.

Thanks to the variational inequality, we can establish the so-called robustness of the Black–Scholes formula [24]. The two main results obtained are the following.

Proposition 5 We assume that $d = 1$. If the pay-off function is convex, then the value function $P$ is also convex. Moreover, if there exist $\sigma_1, \sigma_2 > 0$ such that $\sigma_1 \le \sigma \le \sigma_2$, then we have

$$P^{\sigma_1} \le P \le P^{\sigma_2} \tag{26}$$

where $P^{\sigma_i}$ is the value function of the American option on an underlying asset with volatility $\sigma_i$.

The propagation of convexity has been proved with probabilistic arguments in [29] and can be extended to the case $d > 1$. The robustness of the Black–Scholes formula is equally useful from a practical point of view because it allows one to construct superreplication and subreplication strategies using a constant volatility.

When there is only one risky asset modeled as a geometric Brownian motion, the analytic properties presented in this section can be used to transform, thanks to Green's theorem, the variational inequality into an integral equation (see Integral Equation Methods for Free Boundaries). This point of view has been adopted to provide new numerical methods [16] and to get theoretical results such as the convexity of the critical price for the put option [22] or its behavior near maturity [16, 25].

Integro-differential Equation

The integro-differential approach can be extended to the American option on jump diffusions (see Partial Integro-differential Equations (PIDEs)). In 1976, Merton (see Merton, Robert C. and [42]) introduced a model including some discontinuities in the assets value process. He considered a risky asset whose value process is the solution to the following equation:

$$dS_t = S_{t^-}\left(\mu\,dt + \sigma\,dW_t + d\left(\sum_{i=1}^{N_t} U_i\right)\right) \tag{27}$$

where $\mu \in \mathbb{R}$, $\sigma > 0$, $W$ is a standard Brownian motion, $N$ is a Poisson process with intensity $\lambda > 0$, and the $U_i$ are independent and identically distributed variables with values in $(-1, +\infty)$ such that $\mathbb{E}[U_i^2] < +\infty$.

This model is not complete but up to a change of probability measure, we can suppose that $\mu = r - \lambda\mathbb{E}[U_1]$, where $r > 0$ is the constant interest rate of the market. Hence, $\tilde{S}$ is a martingale with respect to the filtration generated by $W$, $N$, and $\left(U_i \mathbf{1}_{\{i \le N_t\}}\right)_{0\le t\le T}$. The option price is then determined as the initial wealth of a replication portfolio, which minimizes the quadratic risk. Merton obtained closed formulas to calculate the European options price. In this model, Zhang [50] extended the variational inequality approach to evaluate the American options price and he got a characterization of the value function as the solution to the following integro-differential equation:

$$\begin{cases} DP + IP \le 0, \quad f \le P, \quad (DP + IP)(P - f) = 0 \ \text{a.e.} \\ P(T, x) = f(x) \ \text{on } [0, +\infty) \end{cases} \tag{28}$$

with

$$Dh(t, x) = \frac{\partial h}{\partial t} + \frac{\sigma^2 x^2}{2}\frac{\partial^2 h}{\partial x^2} + \mu x \frac{\partial h}{\partial x} - rh, \qquad Ih(t, x) = \lambda \int \big(h(t, x + z) - h(t, x)\big)\,\nu(dz) \tag{29}$$

where $\nu$ is the law of $\ln(1 + U_1)$. Zhang used this equation to derive numerical schemes for approximating $P$. However, he could not obtain a description of the optimal exercise strategies. This was studied by Pham [46] who obtained a pricing decomposition formula and some properties of the exercise boundary.

In conclusion, analytic properties of the American option value function have been used to build numerical methods of pricing and to get some theoretical properties. Although the variational point of view is better for understanding the discretization of American options, it is less explicit than the probabilistic methods. We can remark that a specific region of $[0, T] \times [0, +\infty)^d$ appears in these two approaches: the so-called exercise region

$$E = \{(t, x) \in [0, T) \times [0, +\infty)^d : P(t, x) = f(x)\} \tag{30}$$
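The drift choice $\mu = r - \lambda\mathbb{E}[U_1]$ in the jump-diffusion model (27) can be checked numerically: with that drift, the discounted terminal price should average to the initial price. The sketch below is an illustration only. It relies on the standard explicit solution $S_T = S_0 \exp((\mu - \sigma^2/2)T + \sigma W_T)\prod_{i=1}^{N_T}(1 + U_i)$ of equation (27); the function name `simulate_discounted_terminal` and the `jump_sampler` callable are hypothetical, and the jump law is left to the caller.

```python
import math
import random


def simulate_discounted_terminal(s0, r, sigma, lam, jump_sampler, mean_jump, T, rng):
    """Draw one sample of exp(-r*T) * S_T in the jump-diffusion model (27).

    jump_sampler(rng) -- returns one jump size U_i in (-1, +inf) (assumed)
    mean_jump         -- E[U_1], supplied by the caller
    rng               -- a random.Random instance
    """
    mu = r - lam * mean_jump             # drift that makes discounted S a martingale
    # number of Poisson jumps on [0, T], from exponential inter-arrival times
    n_jumps = 0
    clock = rng.expovariate(lam)
    while clock <= T:
        n_jumps += 1
        clock += rng.expovariate(lam)
    # diffusion part of log S_T
    log_s = (math.log(s0) + (mu - 0.5 * sigma ** 2) * T
             + sigma * math.sqrt(T) * rng.gauss(0.0, 1.0))
    # each jump multiplies the price by (1 + U_i)
    for _ in range(n_jumps):
        log_s += math.log1p(jump_sampler(rng))
    return math.exp(log_s - r * T)
```

Averaging many such draws, for example with jumps uniform on $(-0.3, 0.2)$ and `mean_jump=-0.05`, should return a value close to `s0` up to Monte Carlo error.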
If we knew $E$, on the one hand, we would be able to determine the law of optimal stopping times and, on the other hand, the option pricing problem would be reduced to solving a partial differential equation in the complementary set of $E$. In the following section, we recall some results on exercise regions and in particular we give a price decomposition, known as the early exercise premium formula, which involves the exercise region.

concerning a call option on the maximum on two assets but the same kinds of results exist for many other options.

We denote by $E_t$ the temporal section of the exercise region. For a call option on the maximum of two assets, $S^1$ and $S^2$, $E_t$ can be decomposed in two regions: $E^1_t = E_t \cap \{(x_1, x_2) \in [0, +\infty)^2 : x_2 \le x_1\}$ and $E^2_t = E_t \cap \{(x_1, x_2) \in [0, +\infty)^2 : x_1 \le x_2\}$. These two regions are convex and can be rewritten as follows:

$\times\ \mathbf{1}_{\{S_u \ge s(u)\}} \mid S_t = s(t)]\,du$ (34)
a local volatility model, Mathematical Finance 15, 439–463.
[18] Chevalier, E. (2007). Bermudean approximation of the free boundary associated with an American option, Free Boundary Problems: Theory and Applications 154, 137–147.
[19] Duffie, D. (1992). Dynamic Asset Pricing Theory, Princeton University Press, Princeton.
[20] Dupire, B. (1994). Pricing with a smile, Risk Magazine 7, 18–20.
[21] Dupuis, P. & Wang, H. (2004). On the convergence from discrete to continuous time in an optimal stopping problem, Annals of Applied Probability 15, 1339–1366.
[22] Ekström, E. (2004). Convexity of the optimal stopping boundary for the American put option, Journal of Mathematical Analysis and Applications 299, 147–156.
[23] El Karoui, N. (1981). Les aspects probabilistes du contrôle stochastique, Lecture Notes in Mathematics 876, 72–238. Springer-Verlag.
[24] El Karoui, N., Jeanblanc-Piqué, M. & Shreve, S. (1998). Robustness of the Black-Scholes formula, Mathematical Finance 8, 93–126.
[25] Evans, J.D., Keller, R.J. & Kuske, R. (2002). American options on assets with dividends near expiry, Mathematical Finance 12(3), 219–237.
[26] Friedman, A. (1975). Stochastic Differential Equations and Applications, Academic Press, New York, Vol. 1.
[27] Friedman, A. (1976). Stochastic Differential Equations and Applications, Academic Press, New York, Vol. 2.
[28] Harrison, J.M. & Pliska, S.R. (1981). Martingales and stochastic integrals in the theory of continuous trading, Stochastic Processes and their Applications 11, 215–260.
[29] Hobson, D. (1998). Volatility misspecification, option pricing and superreplication via coupling, The Annals of Applied Probability 8(1), 193–205.
[30] Jacka, S.D. (1991). Optimal stopping and the American put, Mathematical Finance 1, 1–14.
[31] Jaillet, P., Lamberton, D. & Lapeyre, B. (1990). Variational inequalities and the pricing of American options, Acta Applicandae Mathematicae 21, 263–289.
[32] Karatzas, I. (1988). On the pricing of American options, Applied Mathematics and Optimization 17, 37–60.
[33] Kim, I.J. (1990). The analytic valuation of American options, Review of Financial Studies 3, 547–572.
[34] Kushner, H.J. (1977). Probability Methods for Approximations in Stochastic Control and for Elliptic Equations, Academic Press, New York.
[35] Lamberton, D. (1998). American options, in Statistics in Finance, D. Hand & S. Jacka, eds, Arnold Applications of Statistics Series, Edward Arnold, London.
[36] Lamberton, D. (1998). Error estimates for the binomial approximation of American put options, Annals of Applied Probability 8, 206–233.
[37] Lamberton, D. & Lapeyre, B. (1996). Introduction to Stochastic Calculus Applied to Finance, Chapman and Hall, London.
[38] Lamberton, D. & Pagès, G. (1990). Sur l’approximation des réduites, Annales de l’I.H.P., Probabilités et Statistiques 26(2), 331–355.
[39] Lamberton, D. & Villeneuve, S. (2003). Critical price for an American option on a dividend-paying stock, The Annals of Applied Probability 13, 800–815.
[40] Longstaff, F.A. & Schwartz, E.S. (2001). Valuing American options by simulations: a simple least squares approach, Review of Financial Studies 14, 113–147.
[41] McKean, H.P. Jr. (1965). Appendix: a free boundary problem for the heat equation arising from a problem in mathematical economics, Industrial Management Review 6, 32–39.
[42] Merton, R.C. (1976). Option pricing when underlying stock returns are discontinuous, Journal of Financial Economics 3, 125–144.
[43] Myneni, R. (1992). The pricing of the American option, Annals of Applied Probability 2, 1–23.
[44] Neveu, J. (1975). Discrete-Parameter Martingales, North Holland, Amsterdam.
[45] Nyström, K. (2007). On the behaviour near expiry for multi-dimensional American options, to appear in Journal of Mathematical Analysis and Applications 339, 664–654.
[46] Pham, H. (1997). Optimal stopping, free boundary and American option in a jump-diffusion model, Applied Mathematics and Optimization 35, 145–164.
[47] Tsitsiklis, J.N. & Van Roy, B. (2001). Regression methods for pricing complex American-Style options, IEEE Transactions on Neural Networks 12(4), 694–703.
[48] Van Moerbeke, P. (1976). On optimal stopping and free boundary problems, Archive for Rational Mechanics and Analysis 20, 101–148.
[49] Villeneuve, S. (1999). Exercice region of American options on several assets, Finance and Stochastics 3, 295–322.
[50] Zhang, X.L. (1997). Numerical analysis of American option pricing in a jump-diffusion model, Mathematics of Operations Research 22, 668–690.

Further Reading

Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, Journal of Political Economy 81, 637–659.
Dalang, R.C., Morton, A. & Willinger, W. (1990). Equivalent martingale measures and no-arbitrage in stochastic securities market models, Stochastics and Stochastics Reports 29(2), 185–202.
10 American Options
The corresponding geometric average \(G_t^l\), \(l = c, d\), is defined to be

\[ G_t^c = \exp\left( \frac{1}{t - t_0} \int_{t_0}^{t} \ln S_u \, du \right) \tag{4} \]

for continuous averaging and

\[ G_t^d = (S_{t_0} S_{t_1} \cdots S_{t_m})^{1/(m+1)} \tag{5} \]

for discrete averaging.

The payoff of an Asian call with arithmetic averaging is given as

\[ (A_T^l - K)^+ \tag{6} \]

and the payoff of an Asian put with arithmetic averaging is given as

\[ (K - A_T^l)^+ \tag{7} \]

where \(K\) is the fixed strike. Option payoffs depending on the geometric average are identical with \(A_T^l\) replaced by \(G_T^l\).

By standard arbitrage arguments, the time-\(t\) price of the Asian call is

\[ e^{-r(T-t)} \, \mathbb{E}\left[ (A_T^l - K)^+ \mid \mathcal{F}_t \right] \tag{8} \]

and the price of the put is

\[ e^{-r(T-t)} \, \mathbb{E}\left[ (K - A_T^l)^+ \mid \mathcal{F}_t \right] \tag{9} \]

It is worth noting that in pricing the Asian option, we need to consider only those cases where \(t \le t_0\). For \(t > t_0\), the option is in progress, and we can write \(e^{-r(T-t)} \, \mathbb{E}[(A_T - K)^+ \mid \mathcal{F}_t]\) as

\[
\begin{aligned}
& e^{-r(T-t)} \, \mathbb{E}_t\left[ \left( \frac{1}{T - t_0} \int_t^T S_u \, du + \frac{t - t_0}{T - t_0} A_t - K \right)^+ \right] \\
&\quad = \frac{T - t}{T - t_0} \, e^{-r(T-t)} \, \mathbb{E}_t\left[ \left( \frac{1}{T - t} \int_t^T S_u \, du - \left( \frac{T - t_0}{T - t} K - \frac{t - t_0}{T - t} A_t \right) \right)^+ \right]
\end{aligned}
\tag{10}
\]

(documented in many papers including Levy [13]), so it is enough to consider the call option and derive the price of the put from this.

The main difficulty in pricing and hedging the Asian option is that the random variable \(A_T\) does not have a lognormal distribution. This makes the pricing very involved, and an explicit formula does not exist to date. This is an interesting mathematical problem and many research papers have been and still are written on the topic. The first of these was by Boyle and Emanuel [3] in 1980.

Early methods for pricing the Asian option with arithmetic average involved replacing the arithmetic average \(A_T\) with the geometric average \(G_T\), which is lognormally distributed; see [5, 10, 11, 15, 17]. This gives a simple formula, but it underprices the call significantly. However, it is worth noting that the formula leads to a scaling known as the \(1/\sqrt{3}\) rule, since for \(t > t_0\), the volatility is scaled down by this factor. That is, the formula involves the term \(\sigma \frac{1}{\sqrt{3}} \sqrt{T - t}\). This is a particularly useful observation if the averaging period is quite short relative to the life of the option. See [12], among others, for a description and more details.

The second class of methods used is to approximate the true distribution of the arithmetic average using an approximate distribution, usually lognormal with appropriate parameters. True moments for \(A_T\) are equated with those implied by a lognormal model, so

\[ \mathbb{E} A_T^n = e^{n\alpha + \frac{1}{2} n^2 v^2} \tag{11} \]

for any integer \(n\), where \(\alpha\) and \(v\) are the mean and standard deviation of a normally distributed variable. This idea was used in a number of papers including
methods work well for some parameter values but not for others.

A further analytical technique in approximating the price of the Asian option is to establish price bounds. Curran [6] and Rogers and Shi [16] used conditioning to obtain a two-dimensional integral, which proves to be a tight lower bound for the option.

Much work has been done on pricing the Asian option using quasi-analytic methods. Geman and Yor [8] derived a closed-form solution for an “in-the-money” Asian call and a Laplace transform for “at-the-money” and “out-of-the-money” cases. Their methods are based on a relationship between geometric Brownian motion and time-changed Bessel processes. To price the option, one must invert the Laplace transform numerically; see [7]. Shaw [18] demonstrated that the inversion can be done quickly and efficiently for all reasonable parameter choices in Mathematica, making this a fast and effective approach. Linetsky [14] produced a quasi-analytic pricing formula using eigenfunction methods, with highly accurate results, also employing a package such as Mathematica.

Direct numerical methods such as Monte Carlo or quasi-Monte Carlo simulation and finite-difference partial differential equation (PDE) methods can be used to price the Asian option (see Lattice Methods for Path-dependent Options). In fact, given the popularity of such techniques, these methods were probably amongst the first used by practitioners (and remain popular today). Monte Carlo simulation was used to price Asian options by Broadie and Glasserman [4] and Kemna and Vorst [11], among many other more recent researchers. Simulation methods have the advantage of being widely used by practitioners to price derivatives, so no “new” method is required. Additional practical features such as stochastic volatility or interest rates can be incorporated without a significant increase in complexity. Control variates can often be used (e.g., using a geometric Asian option when pricing an arithmetic option). Additionally, simulation is often used as a benchmark price against which other methods are tested. The disadvantage is that it is computationally expensive, even when variance reduction techniques are used. Lapeyre and Temam [12] showed that Monte Carlo simulation can be competitive under the more advanced schemes they propose and with variance reduction techniques.

The Asian option is an exotic path-dependent option since the value at any point in time depends on the history of the underlying asset price. Specifically, the value of the option at \(t\) depends on the current level of the underlying asset \(S_t\), time to expiry \(T - t\), and the average level of the underlying up to \(t\), \(A_t\). Zvan et al. [21] presented numerical methods for solving this PDE. It turns out that the problem can be reduced to two variables (one state and the other time). Rogers and Shi [16], Alziary et al. [1], and Andreasen [2] formulated a one-dimensional PDE. The PDE approach is flexible in that it can handle market realities, but it is difficult to solve numerically as the diffusion term is very small for values of interest on the finite-difference grid. Vecer [20] reformulated the problem using analogies to passport options [9] to obtain an unconditionally stable PDE, which is more easily solved.

Methods based on discrete sampling become more appropriate when there are relatively few averaging dates. One simplistic approach is a scaling correction to volatility as described. Other possibilities include a Monte Carlo simulation or numerical solution of a sequence of PDEs [2]. Monte Carlo simulation can be quite efficient when there are only a small number of averaging dates, since the first “step” can take one straight to the averaging period (under the usual exponential Brownian motion model). Andreasen [2] priced discretely sampled Asian options using finite-difference schemes on a sequence of PDEs. This is particularly efficient if the averaging period is short and hence there are only a small number of PDEs to solve. He compared his PDE results to those of Monte Carlo simulation and showed that the finite-difference schemes get within a penny accuracy of the Monte Carlo simulation in less than a second of CPU time.

To conclude, there has been ongoing research into the methods for pricing the Asian option. It seems, however, that the current state-of-the-art pricing methods (good implementation of inversion of Laplace transform, eigenfunction and other expansions, stable PDE, and Monte Carlo simulation where appropriate) are fast, accurate, and adequate for most uses.

References

[1] Alziary, B., Decamps, J.P. & Koehl, P.F. (1997). A PDE approach to Asian options: analytical and numerical evidence, Journal of Banking and Finance 21(5), 613–640.
[2] Andreasen, J. (1998). The pricing of discretely sampled Asian and lookback options: a change of numeraire approach, Journal of Computational Finance 2(1), 5–30.
[3] Boyle, P. & Emanuel, D. (1980). The Pricing of Options on the Generalized Mean. Working paper, University of British Columbia.
[4] Broadie, M. & Glasserman, P. (1996). Estimating security price derivatives using simulation, Management Science 42, 269–285.
[5] Conze, A. & Viswanathan, R. (1991). European path dependent options: the case of geometric averages, Finance 12(1), 7–22.
[6] Curran, M. (1992). Beyond average intelligence, Risk 5, 60.
[7] Fu, M., Madan, D. & Wang, T. (1999). Pricing continuous Asian options: a comparison of Monte Carlo and Laplace transform inversion methods, Journal of Computational Finance 2(2), 49–74.
[8] Geman, H. & Yor, M. (1993). Bessel processes, Asian options and perpetuities, Mathematical Finance 3, 349–375.
[9] Henderson, V. & Hobson, D. (2000). Local time, coupling and the passport option, Finance and Stochastics 4(1), 69–80.
[10] Jarrow, R.A. & Rudd, A. (1983). Option Pricing. Irwin, IL.
[11] Kemna, A.G.Z. & Vorst, A.C.F. (1990). A pricing method for options based on average asset values, Journal of Banking and Finance 14, 113–129.
[12] Lapeyre, B. & Temam, E. (2000). Competitive Monte Carlo methods for the pricing of Asian options, Journal of Computational Finance 5, 39–57.
[13] Levy, E. (1992). Pricing European average rate currency options, Journal of International Money and Finance 11(5), 474–491.
[14] Linetsky, V. (2004). Spectral expansions for Asian (average price) options, Operations Research 52(6), 856–867.
[15] Ritchken, P., Sankarasubramanian, L. & Vijh, A.M. (1993). The valuation of path-dependent contracts on the average, Management Science 39(10), 1202–1213.
[16] Rogers, L.C.G. & Shi, Z. (1995). The value of an Asian option, Journal of Applied Probability 32, 1077–1088.
[17] Ruttiens, A. (1990). Classical replica, Risk February, 33–36.
[18] Shaw, W. (2000). A Reply to Pricing Continuous Asian Options by Fu, Madan and Wang, Working paper.
[19] Turnbull, S.M. & Wakeman, L.M. (1991). A quick algorithm for pricing European average options, Journal of Financial and Quantitative Analysis 26(3), 377–389.
[20] Vecer, J. (2001). A new PDE approach for pricing arithmetic average Asian options, Journal of Computational Finance 4(4), 105–113.
[21] Zvan, R., Forsyth, P. & Vetzal, K. (1998). Robust numerical methods for PDE models of Asian options, Journal of Computational Finance 2, 39–78.

Related Articles

Average Strike Options; Black–Scholes Formula; Lattice Methods for Path-dependent Options; Risk-neutral Pricing.

VICKY HENDERSON
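As an illustration of the control-variate remark in the Asian Options article above (using a geometric Asian option when pricing an arithmetic one), the following is a minimal sketch rather than a reference implementation. It assumes a risk-neutral geometric Brownian motion, equally weighted discrete averaging over the supplied dates, and a control coefficient fixed at 1; the function names and parameter values are hypothetical and only the Python standard library is used.

```python
import math
import random


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def geometric_asian_call(s0, k, r, sigma, times):
    """Exact price of a call on the discrete geometric average under GBM.

    The log of the geometric average is normal, so the price has a
    Black-Scholes-type form (assumption: equal weights on all dates).
    """
    n = len(times)
    mu = math.log(s0) + (r - 0.5 * sigma ** 2) * sum(times) / n
    var = sigma ** 2 * sum(min(ti, tj) for ti in times for tj in times) / n ** 2
    sd = math.sqrt(var)
    d1 = (mu - math.log(k) + var) / sd
    d2 = d1 - sd
    disc = math.exp(-r * times[-1])
    return disc * (math.exp(mu + 0.5 * var) * norm_cdf(d1) - k * norm_cdf(d2))


def mean_and_se(xs):
    """Sample mean and its standard error."""
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    return m, math.sqrt(v / len(xs))


def asian_call_mc(s0, k, r, sigma, times, n_paths=20000, seed=0):
    """Plain and control-variate Monte Carlo estimates of the arithmetic call."""
    rng = random.Random(seed)
    n = len(times)
    disc = math.exp(-r * times[-1])
    geo_exact = geometric_asian_call(s0, k, r, sigma, times)
    plain, controlled = [], []
    for _ in range(n_paths):
        log_s, t_prev, sum_s, sum_log = math.log(s0), 0.0, 0.0, 0.0
        for t in times:
            dt = t - t_prev
            # Exact GBM step between consecutive averaging dates.
            log_s += (r - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            sum_s += math.exp(log_s)
            sum_log += log_s
            t_prev = t
        arith = disc * max(sum_s / n - k, 0.0)
        geo = disc * max(math.exp(sum_log / n) - k, 0.0)
        plain.append(arith)
        # Control variate with coefficient 1: subtract the simulated
        # geometric payoff and add back its exact price.
        controlled.append(arith - geo + geo_exact)
    return mean_and_se(plain), mean_and_se(controlled), geo_exact


# Hypothetical example: monthly averaging over one year.
times = [i / 12 for i in range(1, 13)]
(plain, plain_se), (cv, cv_se), geo = asian_call_mc(100.0, 100.0, 0.05, 0.2, times)
print(plain, plain_se, cv, cv_se, geo)
```

Because the arithmetic average is never below the geometric average on any path, the controlled estimate cannot fall below the exact geometric price. Its standard error is typically far smaller than that of the plain estimate.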
Arbitrage Bounds

A key question in option pricing concerns how to incorporate information about the prices of existing, liquidly traded options into the prices of exotic options. In the classical Black–Scholes model, where there is only one parameter to choose, this question becomes: what do existing prices tell us about the volatility? Since the Black–Scholes model lacks the flexibility to capture all the market information, a wide variety of pricing models have been proposed. Rather than specifying a model and pricing with respect to this model, an alternative approach is to construct model-free arbitrage bounds on the price of exotic options. Arbitrage bounds are constraints on the price of an option, due to the absence of arbitrage strategies. These strategies are typically derived from relationships between the payoff of an option, and the payoff of a simple trading strategy constructed from other related derivatives; for example, the strategy might be a buy-and-hold strategy. If such a simple trading strategy can be shown to be worth at least as much as the corresponding option at maturity in every possible outcome, then the initial cost of the trading strategy must be at least as much as the cost of the option, or else there exists a simple arbitrage. An important feature of these bounds is that they are often valid for a very wide class of models.

Arbitrage Bounds for Call Prices

Perhaps the earliest and simplest example of arbitrage bounds are the following inequalities, which are described in the seminal paper [29]:

\[ \max\{0, S_0 - B(T)K\} \le C(K, T) \le S_0 \tag{1} \]

where \(C(K, T)\) is the time-0 price of a European call option on the asset \((S_t)_{t \ge 0}\) with strike \(K\) and maturity \(T\), and \(B(T)\) is the time-0 price of a bond that is worth $1 at time \(T\). These bounds can be derived from the following simple arbitrages:

1. Suppose \(C(K, T) > S_0\). Then we can construct an arbitrage by selling the call option and buying the asset. We receive an initial positive cash flow, while at maturity the option is worth \((S_T - K)^+\), which is less than \(S_T\), the value of the asset we hold.
2. Suppose \(C(K, T) < S_0 - B(T)K\). Then we can construct an arbitrage by buying the call option with strike \(K\), selling short the asset, and buying \(K\) units of the bond that pays $1 at time \(T\). At time 0, we receive the cash amount

\[ S_0 - B(T)K - C(K, T) \tag{2} \]

which, by assumption, is strictly positive. At maturity, writing \(x^+ = \max\{x, 0\}\), we hold a portfolio whose value is

\[ (S_T - K)^+ - (S_T - K) \tag{3} \]

which is nonnegative.
3. Finally, it is clear that the call option must have a nonnegative value (i.e., \(C(K, T) \ge 0\)), but this can also be considered a consequence of the arbitrage strategy of “buying” the derivative (for a negative price), and hence receiving positive cash flows both initially and at maturity.

There are some key features of the above example that are repeated in other similar applications. Note, first of all, that the inequalities make no modeling assumptions: the final value of the arbitrage portfolios will be larger/smaller than the call option for any final value of the asset, so these bounds are truly independent of any model for the underlying asset. Secondly, the bounds are the best we can do in the following sense: it can be shown that there are arbitrage-free models for the asset price under which the bounds are tight. For example, if the interest rates are deterministic, and the asset price satisfies \(S_t = S_0 B(t)\), then the lower bounds hold for all strikes, and there is no arbitrage in the market. Alternatively, the upper and lower bounds can be shown to be the Black–Scholes price of an option in the limit as \(\sigma \to \infty\) and \(\sigma \to 0\), respectively.

In practice, these bounds are far too wide for most practical purposes, although they can be useful as a check that a pricing algorithm is producing sensible numerical results. Part of the reason for this wide range of values concerns the relatively small amount of information that is being used in deriving the bound. In general, one would expect to have some information about the behavior of the market. A natural place to look for further information is in the market prices of other vanilla options: in model-specific pricing, this information is commonly used for calibration of the model. However, the
information contained in these prices can also be used to provide arbitrage bounds on the prices of other exotic derivatives through the formulation of appropriate portfolios.

Breeden–Litzenberger Formula

One of the initial works to consider the pricing implications of vanilla options on exotic options is [6]. Here, the authors suppose that the value of calls at all strikes and a given maturity are known, and observe that

\[ p(x) = \frac{1}{B(T)} \left. \frac{\partial^2 C(K, T)}{\partial K^2} \right|_{K = x} \tag{4} \]

can be thought of as the density of a random variable. The value at time-0 of an option whose payoff is only a function of the terminal value of the asset, \(f(S_T)\), can then be shown to be

\[ B(T) \int f(x) \, p(x) \, dx \tag{5} \]

or, intuitively, the discounted expectation under the density implied by the call prices. We can see this by noting that (at least for twice-differentiable functions \(f\)), we have

\[ f(S) = f(0) + S f'(0) + \int_0^\infty f''(K) (S - K)^+ \, dK \tag{6} \]

and therefore may replicate the contract \(f(S)\) exactly by holding \(f(0)\) in cash, buying \(f'(0)\) units of the asset, and “holding” a continuous portfolio of calls consisting of \(f''(K) \, dK\) units of call options with strikes in \([K, K + dK]\). Since this portfolio replicates the exotic option exactly, by an arbitrage argument, the prices must agree. The price of the portfolio of calls can be shown to be equation (5). In practice, some discrete approximation of such a portfolio is necessary, and this is generally possible provided the calls trade at a suitably large range of strikes.

One of the interesting consequences of this result is that we have a representation for the price of the exotic option as a discounted expectation. A key result in modern mathematical finance is the fundamental theorem of asset pricing, which allows us to deduce from the assumption of no arbitrage that the price of an option may be written as a discounted expectation under a suitable probability measure. However, an assumption of the fundamental theorem of asset pricing is that there is a (known) model for the underlying asset. In the situation we wish to consider, there is no such measure. It is therefore not immediate that we can say anything about any probabilistic structure that might help us. One of the interesting consequences of this result is that it does provide some information about the underlying probabilistic structure: namely, that the call prices “imply” a risk-neutral distribution for the asset price, and that there are arbitrage relationships that ensure that any other option whose payoff depends only on the final value of the asset also has the price implied by this probability measure.

Arbitrage Bounds for Exotic Options

A general approach that is implied by the above examples is the following: suppose we know the prices of (and can trade in) a set of “vanilla” derivatives. Consider also an exotic option, for example, a barrier option. Without making any (strong) assumptions about a model for the underlying asset, what does arbitrage imply about the price of the barrier option? Through a suitable set of trades in the underlying and vanilla options, we should be able to construct portfolios and self-financing trading strategies that either dominate, or are dominated by, the payoff of the exotic option. If we can find a portfolio that dominates the exotic option, then the initial cost of this portfolio (which is known) must be at least as much as the price of the exotic option, or else there will be an arbitrage from buying the portfolio and selling the exotic option. The price of this portfolio therefore provides an upper bound on the price of the option. In a similar manner, we may also find a lower bound for the price of the option by looking for portfolios and trading strategies in the underlying and vanilla options that result in a terminal value that is always dominated by the exotic option. Note that we are, in general, interested in the least upper bound and also the greatest lower bound that can be attained, since these will give the tightest possible bounds.

We have been vague about two concepts here: first, we said that we would not want to make any “strong” assumptions about the model of the underlying asset.
The exact assumptions that different examples make about the underlying models vary from case to case, but typically we might assume, perhaps, that the underlying asset price is continuous (or at least, that it continuously crosses a barrier), or that the price process satisfies some symmetry assumption. Secondly, we have not specified what types of trading strategies we wish to consider: this is because, in part, this depends heavily on the assumptions on the price process: for example, trading strategies that involve a trade when the asset first crosses a barrier often assume that the underlying crosses the barrier continuously; the assumption on the symmetry of the asset price results in identities connecting the prices of call and put options. However, the important point to note here is that we work typically in a class of price processes that is too large to be able to hedge dynamically in any meaningful way, so that continuously rebalancing the portfolio is not an option. Two important classes of strategies are static strategies, which involve purchasing an initial portfolio of the underlying and vanilla options, and holding this to maturity (see Static Hedging), and semistatic strategies, which involve a fixed position in the options, and some trading in the underlying asset, often at hitting times of certain levels or sets.

Consistency of Vanilla Options

Since we are looking for arbitrage in the market when we add an exotic option, it is important that the initial prices of the vanilla options do not include an arbitrage. In the case of equity markets, where the underlying vanilla options are call options, written on a given set of strikes and maturities, this is a question that has been studied by a number of authors [9, 11, 13, 15, 18]. The fundamental conclusion that may be arrived at from all these works is the following: the prices of calls are arbitrage free if and only if there exists a model under which the prices agree with the discounted expectation under the model. Moreover, the existence of the model has a relatively straightforward characterization in terms of the properties of the call prices, so that for a given set of call prices, the conditions may be checked with relative ease. Moreover, some practical concerns can be included in the models: [15] allows the inclusion of default of the asset, while [18] also allows for the inclusion of dividends.

Of course, not all markets fit naturally into this framework, and so other settings should also be considered, as, for example, in [27], where arbitrage bounds for fixed income markets are considered.

Barrier Options

One of the simplest classes of options that can be considered are the various types of barrier options, and one of the simplest of these options is the one-touch barrier option: this is an option that pays $1 at maturity if the barrier is breached during the lifetime of the contract, and expires worthless if the barrier is not hit before maturity. Suppose that the price process is continuous, and suppose further that the riskless interest rate is zero. Then [7] provides an upper bound on the price of the option, \(OT(R, T)\), where \(R\) is the level of the barrier, \(R > S_0\), and \(T\) is the maturity of the option. The bound that is derived in [7] is

\[ OT(R, T) \le \inf_{x < R} \frac{C(x, T)}{R - x} \tag{7} \]

The bound can be most clearly seen by noting the corresponding arbitrage strategy: suppose that the bound does not hold, then we can find an \(x\) for which

\[ OT(R, T) > \frac{C(x, T)}{R - x} \tag{8} \]

We sell the one-touch option, and buy \(\frac{1}{R - x}\) units of the call with strike \(x\) and maturity \(T\). If the barrier at \(R\) is not hit, the one-touch option expires worthless, and our call option may have positive value. Alternatively, suppose that at some time, the barrier is hit. At this time, we enter into a forward contract on the asset. Specifically, we sell \(\frac{1}{R - x}\) units of a forward struck at \(R\). Since the current value of the asset is \(R\), and we have assumed that the interest rates are zero, we may enter into such a contract for free. At maturity, the value of our position in the forward will be \(\frac{R - S_T}{R - x}\), and the total value of our position in the call and the forward is

\[
\frac{1}{R - x} (S_T - x)^+ + \frac{R - S_T}{R - x}
= \frac{1}{R - x} \left[ (S_T - x) + (x - S_T)^+ + (R - S_T) \right]
= 1 + \frac{(x - S_T)^+}{R - x}
\tag{9}
\]
where we write \(x^+ = \max\{x, 0\}\). Since the value of the portfolio is now greater than the value of the one-touch option, we have an arbitrage.

It can also be shown that the bound here is the best that can be attained: specifically, it can be shown that there exists a model under which there is equality in the inequality (7). By considering the form of the hedge, we can also say something about the extremal model. For equality to be there in equation (7), we must always have equality between the payoff of the one-touch option, and the value of the hedging portfolio. The case where the barrier is not hit requires that

\[ 0 = \frac{(S_T - x)^+}{R - x} \tag{10} \]

or, equivalently, that \(S_T\) is always below \(x\). The case where the barrier is struck requires that

\[ 1 = 1 + \frac{(x - S_T)^+}{R - x} \tag{11} \]

or that \(S_T\) is always above \(x\). In other words, in the extremal model, the paths that hit the barrier will, at maturity, finish above the minimizing value of \(x\), while those that do not hit the barrier will always end up below \(x\).

A similar approach allows us to find a lower bound. In this case, the hedging portfolio consists of a digital call struck at the barrier, so that the payoff of this option is simply $1 if the asset ends up above the barrier, and put options are struck at the barrier, at some \(y < R\). Note that the digital call can, in theory at least, be arbitrarily closely approximated by buying a suitably large number of calls just below the strike, and selling the same number of calls at the strike, so that we can deduce the price of the digital call from the prices of the vanilla call options. The prices of the puts can be deduced from put–call parity. In a manner similar to the above, we can find the “best” bound by finding the value of \(y\) that corresponds to the most expensive portfolio. Again, the bound is tight, in the sense that there exists a model under which we attain equality. We can also describe the behavior in this model: the paths that hit the barrier will end up either below \(y\) or above \(R\). Those that do not hit the barrier will finish between \(y\) and \(R\).

Using extensions of these ideas, similar bounds can be found for other common barrier options, for example, down-and-in calls. Full details can be found in [7].

There are a number of observations that we can make about the solution to the above problem, and which extend more generally. First, the extension to nonzero interest rates is nontrivial: one of the assumptions that was made in constructing the trading strategy was that, when the barrier is struck, we would be able to enter into a forward contract with a strike at the barrier. If there are nonzero interest rates, we will not be able to enter into such a contract at no cost. Consequently, these results are only generally valid in cases where there is zero cost of carry, for example, where the underlying is a forward price, in foreign exchange markets where both currencies have the same interest rate, or commodities where the interest rate is the same as the convenience yield. Secondly, recall that the only assumption we made on the paths was continuity. This assumption is key to knowing that we can sell forward as we hit the barrier. In fact, the upper bound will still hold if the path is not continuous, provided we sell forward the first time that we go above the barrier, at which point, we can enter into a forward contract that is at least as good for our purposes. Note, however, that under the model for which the bound is tight, we must cross the barrier continuously. The same is not true of the lower bound, which fails if the asset price does not cross the barrier continuously. If the path is not assumed continuous, a new bound can be derived, which corresponds to the asset jumping immediately to its final value. The third aspect to note about these constructions is that there is a natural extension to the case where calls are available at finitely many strikes. Consider the upper bound on the one-touch option, and suppose that calls trade at a finite set of strikes \(K_1, K_2, \ldots, K_n\). Rather than taking the infimum over \(x\) where \(x < R\), to get an upper bound, we can take the minimum over the strikes at which calls are available

\[ OT(R, T) \le \min_{i : K_i < R} \frac{C(K_i, T)}{R - K_i} \tag{12} \]

The previous arguments can be applied directly to show that this is an upper bound. It can also be shown that there is a model that fits with the call prices, and under which this bound is attained, so the resulting bounds are the best possible. Details of this extension can be found in [7].
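The finite-strike bound (12) and the dominating portfolio behind it can be sketched in a few lines. This is an illustrative sketch only, assuming zero interest rates as in the text; the function names and the call quotes are hypothetical and are not taken from [7].

```python
def one_touch_upper_bound(strikes, call_prices, barrier):
    """Return (bound, strike) minimizing C(K_i, T) / (R - K_i) over K_i < R."""
    candidates = [
        (price / (barrier - strike), strike)
        for strike, price in zip(strikes, call_prices)
        if strike < barrier
    ]
    if not candidates:
        raise ValueError("no traded strike lies below the barrier")
    # Tuples compare by their first entry, so min() picks the smallest ratio.
    return min(candidates)


def hedge_value(s_T, barrier_hit, strike, barrier):
    """Terminal value of the dominating portfolio under zero interest rates.

    The portfolio holds 1/(R - x) calls struck at x and, if the barrier
    was hit, is also short 1/(R - x) forwards struck at R.
    """
    units = 1.0 / (barrier - strike)
    value = units * max(s_T - strike, 0.0)
    if barrier_hit:
        value += units * (barrier - s_T)
    return value


# Hypothetical call quotes for S_0 = 100 and barrier R = 115.
strikes = [90.0, 95.0, 100.0, 105.0, 110.0]
calls = [12.0, 8.5, 5.6, 3.5, 2.0]
bound, best_strike = one_touch_upper_bound(strikes, calls, barrier=115.0)
print(bound, best_strike)

# The portfolio pays at least 1 when the barrier is hit and at least 0 otherwise.
print(hedge_value(110.0, True, best_strike, 115.0))
print(hedge_value(110.0, False, best_strike, 115.0))
```

Taking the minimum over traded strikes rather than an infimum over all \(x < R\) means the bound can only get tighter as more strikes below the barrier become available.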
where \(W_t\) is a Brownian motion, \(\tau(t)\) is increasing in \(t\) and is a stopping time for all \(t\). As a consequence, any martingale price process should be a time change of a Brownian motion. If, in addition, we know that the law of \(S_T\) under the risk-neutral measure is implied by the call prices, we also know that \(W_{\tau(T)}\) has a given law. Finally, suppose that the time change is continuous (as it will be if the price process is continuous), then many of the properties in which we are interested remain unaffected by the exact form of the time change. For example, consider the probability of whether the discounted asset price goes above a barrier \(R\) before time \(T\). This is the same as the probability that the Brownian motion \(W_t\) with \(W_{\tau(t)} = B(t) S_t\) goes above the barrier before time \(\tau(T)\). Moreover, consider two time changes \(\tau(t)\) and \(\tilde{\tau}(t)\), such that we always have \(\tau(T) = \tilde{\tau}(T)\). Then the probability of whether the barrier has been breached will be the same for the price processes corresponding to the time change \(\tau\) and the time change \(\tilde{\tau}\). Consequently, if we are concerned with such path properties of the underlying price process, when we look in the Brownian setting, we need only differentiate between different final stopping times \(\tau(T)\), and not different time changes.

The argument then goes as follows: suppose we know call prices at all strikes at time \(T\). From this information, we may deduce the law of the discounted asset price \(B(T) S_T\), which we assume to be a time change of a Brownian motion, and whose value at some stopping time \(\tau\) therefore has the same law. Since the time change in the intermediate time is assumed to be continuous, and its exact form will not impact the quantities of interest, we get a one-to-one correspondence between possible price processes and the class of stopping times of a Brownian motion that have a given law. This line of reasoning is of interest, since the problem of finding a stopping time with a given terminal law has a long history in the probabilistic literature, where it is known as the Skorokhod embedding problem. In particular, given a distribution \(\mu\), we say that a stopping time \(\tau\) is a (Skorokhod) embedding of \(\mu\), if \(W_\tau\) has law \(\mu\). The recent survey paper [31] contains a comprehensive survey of the probabilistic literature on the Skorokhod embedding problem.

Getting back to the one-touch option, we see that the upper bound will correspond to the stopping time that maximizes the probability of being larger than the barrier within the class of embeddings, and the minimum will correspond to the stopping time that minimizes the probability within this class. The construction of arbitrage bounds for the price of the option is therefore equivalent to the identification of extremal Skorokhod embeddings for the law implied by the call prices at maturity, as seen in [7]. The construction that attains this maximum is due to Azéma and Yor [3], while the construction that attains the minimum is due to Perkins [32], and it can be shown that these embeddings do indeed have the behavior that was hypothesized previously: for the upper bound, those paths that hit the barrier remain above the level \(x\) derived in the bound, while in the lower bound, those paths that hit the barrier all either finish above the barrier, or stop below \(y\).

The Skorokhod embedding approach was initially explored in [23]. In this work, it is shown that the upper bound on the price of a lookback option can be computed in terms of the available call prices. Moreover, Hobson [23] has constructed a trading strategy that will result in an arbitrage should the lookback option trade above the given bound. In this case, the strategy involves constructing an initial portfolio of calls (purchased at the specified prices) and then selling these calls appropriately as the price process sets new maxima. The price at which the calls can be sold will be at least the intrinsic value of the call, and it can be shown that the profit from selling off the calls appropriately will be at least the payoff from the lookback option. A simple lower bound is also derived, but without assuming any continuity. For discontinuous asset prices, the lower bound is attained by the price process that jumps immediately to its final value. In terms of the corresponding Skorokhod embeddings, the upper bound has close connections with the embedding due to Azéma and Yor [3]; this can be shown to maximize the law of the maximum over the class of embeddings. Further, it can be shown that if we use the price process that corresponds to the stopping time constructed in [3], then the trading strategy dominating the lookback option actually attains equality, demonstrating that the upper bound is the best possible. This connection between an extremal Skorokhod embedding and a corresponding bound on the price of a connected exotic option has been exploited a number of times: in [8], these techniques are used to generalize the above results to the case where the call prices at an intermediate time are also known; in [24] the embedding due to Perkins [32] is generalized to
A related development of these ideas is considered in [28], wherein the problem of fitting martingales to marginal distributions specified at all maturities is presented, and some solutions corresponding to the different Skorokhod embedding approaches, the local volatility models of Dupire [21], and processes with independent increments are discussed.
[Figure: Price (vertical axis, 0 to 0.5) against Strike (horizontal axis, 90 to 120)]
on the value of a weighted sum of a number of assets, and where calls are traded on each of the underlying assets. There are also connections to [1], where bounds on the prices of Asian options are derived.
Another class of options where similar hedging techniques have been considered are installment options [19], which are options similar to a European call, but where the holder pays for the option in a set number of installments, and has the option to stop paying the installments at any point before maturity, thereby losing the final payoff for the contract.
A common complication that arises in constructing many of the bounds and their respective hedging portfolios is that there can be some nontrivial optimization problems, typically, large linear programming problems [4, 17, 25].
References
[1] Albrecher, H., Mayer, P.A. & Schoutens, W. (2008). General lower bounds for arithmetic Asian option prices, Applied Mathematical Finance 15(2), 123–149.
[2] Andersen, L.B.G., Andreasen, J. & Eliezer, D. (2002). Static replication of barrier options: some general results, Journal of Computational Finance 5(4), 1–25.
[3] Azéma, J. & Yor, M. (1979). Une solution simple au problème de Skorokhod, in Séminaire de Probabilités, XIII (Univ. Strasbourg, Strasbourg, 1977/78), Lecture Notes in Mathematics, Springer, Berlin, Vol. 721, pp. 90–115.
[4] Bertsimas, D. & Popescu, I. (2002). On the relation between option and stock prices: a convex optimization approach, Operations Research 50(2), 358–374.
[5] Bowie, J. & Carr, P. (1994). Static simplicity, Risk 7(8), 45–49.
[6] Breeden, D.T. & Litzenberger, R.H. (1978). Prices of state-contingent claims implicit in option prices, Journal of Business 51(4), 621–651.
[7] Brown, H., Hobson, D. & Rogers, L.C.G. (2001a). Robust hedging of barrier options, Mathematical Finance 11(3), 285–314.
[8] Brown, H., Hobson, D. & Rogers, L.C.G. (2001b). The maximum maximum of a martingale constrained by an intermediate law, Probability Theory and Related Fields 119(4), 558–578.
[9] Buehler, H. (2006). Expensive martingales, Quantitative Finance 6(3), 207–218.
[10] Carr, P. & Chou, A. (1997). Breaking barriers, Risk 10(9), 139–145.
[11] Carr, P. & Madan, D.B. (2005). A note on sufficient conditions for no arbitrage, Finance Research Letters 2, 125–130.
[12] Carr, P., Ellis, K. & Gupta, V. (1998). Static hedging of exotic options, Journal of Finance 53(3), 1165–1190.
[13] Carr, P., Geman, H., Madan, D.B. & Yor, M. (2003). Stochastic volatility for Lévy processes, Mathematical Finance 13(3), 345–382.
[14] Cerny, A. & Hodges, S.D. (1999). The theory of good-deal pricing in financial markets, FORC preprint, No. 98/90.
[15] Cousot, L. (2007). Conditions on option prices for absence of arbitrage and exact calibration, Journal of Banking and Finance 31, 3377–3397.
[16] Cox, A.M.G., Hobson, D.G. & Obloj, J. (2008). Pathwise inequalities for local time: applications to Skorokhod embeddings and optimal stopping, Annals of Applied Probability 18(5), 1870–1896.
[17] d’Aspremont, A. & El Ghaoui, L. (2006). Static arbitrage bounds on basket option prices, Mathematical Programming 106(3), Series A, 467–489.
[18] Davis, M.H.A. & Hobson, D.G. (2007). The range of traded option prices, Mathematical Finance 17(1), 1–14.
[19] Davis, M.H.A., Schachermayer, W. & Tompkins, R.G. (2001). Installment options and static hedging, in Mathematical Finance (Konstanz, 2000), Trends in Mathematics, Birkhäuser, Basel, pp. 131–139.
[20] Derman, E., Ergener, D. & Kani, I. (1995). Static options replication, Journal of Derivatives 2, 78–95.
[21] Dupire, B. (1994). Pricing with a smile, Risk 7, 32–39.
[22] Engelmann, B., Fengler, M.R., Nalholm, M. & Schwender, P. (2006). Static versus dynamic hedges: an empirical comparison for barrier options, Review of Derivatives Research 9(3), 239–264.
[23] Hobson, D.G. (1998). Robust hedging of the lookback option, Finance and Stochastics 2(4), 329–347.
[24] Hobson, D.G. & Pedersen, J.L. (2002). The minimum maximum of a continuous martingale with given initial and terminal laws, Annals of Probability 30(2), 978–999.
[25] Hobson, D., Laurence, P. & Wang, T. (2005a). Static-arbitrage upper bounds for the prices of basket options, Quantitative Finance 5(4), 329–342.
[26] Hobson, D., Laurence, P. & Wang, T. (2005b). Static-arbitrage optimal subreplicating strategies for basket options, Insurance, Mathematics & Economics 37(3), 553–572.
[27] Jaschke, S.R. (1997). Arbitrage bounds for the term structure of interest rates, Finance and Stochastics 2(1), 29–40.
[28] Madan, D.B. & Yor, M. (2002). Making Markov martingales meet marginals: with explicit constructions, Bernoulli 8(4), 509–536.
[29] Merton, R.C. (1973). Theory of rational option pricing, The Bell Journal of Economics and Management Science 4(1), 141–183.
[30] Monroe, I. (1972). On embedding right continuous martingales in Brownian motion, Annals of Mathematical Statistics 43, 1293–1311.
[31] Obłój, J. (2004). The Skorokhod embedding problem and its offspring, Probability Surveys 1, 321–390, electronic.
[32] Perkins, E. (1986). The Cereteli-Davis solution to the H^1-embedding problem and an optimal embedding in Brownian motion, in Seminar on Stochastic Processes, 1985 (Gainesville, Fla., 1985), Progress in Probability and Statistics, Birkhäuser Boston, Boston, Vol. 12, pp. 172–223.
[33] Revuz, D. & Yor, M. (1999). Continuous Martingales and Brownian Motion, Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 3rd Edition, Springer-Verlag, Berlin, Vol. 293.
[34] Tompkins, R. (1997). Static versus dynamic hedging of exotic options: an evaluation of hedge performance via simulation, Netexposure 1, 1–28.
[35] Vallois, P. (1983). Le problème de Skorokhod sur R: une approche avec le temps local, in Seminar on Probability, XVII, Lecture Notes in Mathematics, Springer, Berlin, Vol. 986, pp. 227–239.
Related Articles
Arbitrage Strategy; Barrier Options; Dupire Equation; Good-deal Bounds; Hedging; Model Calibration; Skorokhod Embedding; Static Hedging.
ALEXANDER COX
Average Strike Options
An average strike option is also known as an Asian option with floating strike. These options have a payoff based on the difference between the terminal asset price and the average of an underlying asset price over a specified time period. The other type of Asian option is the fixed-strike option, where the payoff is determined by the average of an underlying asset price and a fixed strike set in advance (see Asian Options).
If the average is computed using a finite sample of asset price observations taken at a set of regularly spaced time points, we have a discrete average strike option. A continuous time option is obtained by computing the average via the integral of the price path over an interval of time. The average itself can be defined to be geometric or arithmetic. As for the fixed strike Asian option, when the geometric average is used, the average strike option has a closed-form solution for the price, whereas the option with arithmetic average does not have a known closed-form solution.
We concentrate on the continuous time, average strike option of European style with arithmetic averaging. A discussion of the uses and rationale for introducing Asian contracts is given in Asian Options. Average strike options are closely related to these options, but are less commonly used in practice.
Consider the standard Black–Scholes economy with a risky asset (stock) and a money market account. We also assume the existence of a risk-neutral probability measure Q (equivalent to the real-world measure P) under which discounted asset prices are martingales. We denote expectation under measure Q by \mathbb{E}, and the stock price follows
    \frac{dS_t}{S_t} = (r - \delta)\,dt + \sigma\,dW_t \qquad (1)
where r is the constant continuously compounded interest rate, δ is a continuous dividend yield, σ is the instantaneous volatility of asset return, and W is a Q-Brownian motion. The reader is referred to Black–Scholes Formula for details on the Black–Scholes model and Risk-neutral Pricing for a discussion of risk-neutral pricing.
We consider a contract that is based on the value A_T, where (A_t)_{t ≥ t_0} is the arithmetic average
    A_t = \frac{1}{t - t_0} \int_{t_0}^{t} S_u \, du, \quad t > t_0 \qquad (2)
and by continuity, we define A_{t_0} = S_{t_0}. The corresponding geometric average G_t is defined as
    G_t = \exp\left\{ \frac{1}{t - t_0} \int_{t_0}^{t} \ln S_u \, du \right\} \qquad (3)
The contract is written at time 0 (with 0 ≤ t_0) and expires at T > t_0. It is of interest to calculate the price of the option at the current time t, where 0 ≤ t ≤ T. The position of t compared to the start of the averaging, t_0, may vary, as described in Asian Options.
The payoff of an average strike call with arithmetic averaging is given as
    (S_T - A_T)^+ \qquad (4)
and the payoff of an average strike put with arithmetic averaging is
    (A_T - S_T)^+ \qquad (5)
Average strike option payoffs with geometric averaging are identical, with A_T replaced by G_T. The buyer of an average strike call is able to exchange the terminal asset price for the average of the asset price over a given period. For this reason, it is sometimes referred to as a lookback on the average (see Lookback Options for a discussion of the lookback option).
By standard arbitrage arguments, the time-t price of the average strike call is
    e^{-r(T-t)} \, \mathbb{E}\left[ (S_T - A_T)^+ \mid \mathcal{F}_t \right] \qquad (6)
and the price of the put is
    e^{-r(T-t)} \, \mathbb{E}\left[ (A_T - S_T)^+ \mid \mathcal{F}_t \right] \qquad (7)
It turns out that we need to consider only the case t ≥ t_0, where the option is “in progress”. The forward starting case (t < t_0) can be rewritten as a modified option with averaging starting at t, today. This is in contrast to the Asian option with fixed strike, where the difficult case was when the option was forward starting.
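The pricing expression (6) can be approximated by simulation. The following is a minimal sketch, not a production pricer: it assumes that averaging starts today (t = t_0 = 0), simulates the dynamics (1) exactly on a time grid, and replaces the integral in (2) by a trapezoidal sum. The function name, the parameter name `div` (standing for δ), and the grid and path counts are illustrative choices, not taken from the article.

```python
import math
import random


def average_strike_call_mc(s0, r, div, sigma, maturity,
                           n_steps=50, n_paths=2000, seed=0):
    """Monte Carlo estimate of e^{-rT} E[(S_T - A_T)^+], averaging from t0 = 0.

    Assumptions (illustrative): lognormal dynamics as in equation (1),
    continuous average approximated by the trapezoidal rule on n_steps intervals.
    """
    rng = random.Random(seed)
    dt = maturity / n_steps
    drift = (r - div - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    total_payoff = 0.0
    for _path in range(n_paths):
        s = s0
        integral = 0.0
        for _step in range(n_steps):
            s_next = s * math.exp(drift + vol * rng.gauss(0.0, 1.0))
            integral += 0.5 * (s + s_next) * dt  # trapezoidal rule for the integral of S
            s = s_next
        average = integral / maturity            # A_T as in equation (2)
        total_payoff += max(s - average, 0.0)    # call payoff, equation (4)
    return math.exp(-r * maturity) * total_payoff / n_paths
```

With σ = 0 the path is deterministic, so the estimate can be checked against the closed-form value of the average of an exponential, which is a convenient sanity test of the discretization. Replacing `max(s - average, 0.0)` by `max(average - s, 0.0)` gives the put of equation (5).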
As for the Asian option, the average strike option satisfies a put–call parity; see [1] for details.
The average strike option is an exotic path-dependent option, as the price depends on the path of the underlying asset via the average. The distribution of the average A_T is not lognormal, even when the asset price is lognormal, and pricing is difficult because the joint law of A_T and S_T is needed. This is in contrast to the Asian option, which required only the law of the average. Perhaps because of this increased complexity, or their lesser popularity in practice, fewer methods exist for the pricing of average strike options. Just as for the Asian option, there are no closed-form solutions for the price of the arithmetic average strike option.
Many of the methods that we discuss here for pricing are similar to those used to price the Asian option. An early technique to give an approximate price for the average strike option was to replace the arithmetic average A_T with the geometric average G_T. Since G_T has a lognormal distribution, the (approximate) pricing problem becomes (for a call)
    e^{-r(T-t)} \, \mathbb{E}\left[ (S_T - G_T)^+ \mid \mathcal{F}_t \right] \qquad (8)
We recognize that this is exactly an exchange option (see Exchange Options), which can be priced via a change of measure, as in [9]. Levy and Turnbull [8] mentioned this connection to exchange options, but it was Conze and Viswanathan [3] who presented the results of this computation.
Other analytical approximations can be obtained by approximating the true joint distribution of the arithmetic average and asset price using an approximate distribution, usually jointly lognormal with appropriate parameters. Chung et al. [2] extended the linear approximations of Bouaziz et al. [1], Levy [7], and Ritchken et al. [10] (approximating distribution of {A_T, S_T} by joint lognormal) to include quadratic terms. Their approximation is no longer based on a geometric-type approximation.
Recently, symmetries of a similar style to that of the put-call symmetry have been found between fixed strike Asian options and average strike options. For forward starting average strike options, Henderson et al. [4] gave a symmetry with a starting Asian option. If the average strike option is starting, the special case of Henderson and Wojakowski [5] is recovered. If the average strike option is in progress, it cannot be rewritten as an Asian option, and Henderson et al. [4] derived an upper bound for the price of the average strike option. This bound is in terms of an Asian option with fixed strike and a vanilla option. The method gives an “exact” bound for forward starting and starting options and when expiry is reached.
Numerical methods can be used to price the average strike option. The discussion of Monte Carlo simulation in Asian Options is also relevant here, as simulation is often used as a benchmark price. Ingersoll [6] was the first to recognize that it is possible to reduce the dimension of the pricing problem for the average strike option using a transformation of variables. Although the value of the average strike option at t depends on the current asset price, the current value of the average, and the time to expiry, a one-dimensional partial differential equation (PDE) can be derived by using Ingersoll’s reduction of variables. However, the drawback is that the Dirac delta function appears as a coefficient of the PDE, making it prone to instabilities. Vecer’s [12] PDE method for Asian options with fixed strike also applies to average strike options and gives a stable one-dimensional PDE. Some testing of this method for the average strike option is given in [11].
To conclude, research into pricing the average strike option is ongoing, with current PDE and bound methods being very efficient.
References
[1] Bouaziz, L., Briys, E. & Crouhy, M. (1994). The pricing of forward starting Asian options, Journal of Banking and Finance 18(5), 823–839.
[2] Chung, S., Shackleton, M. & Wojakowski, R. (2003). Efficient quadratic approximation of floating strike Asian option values, Finance 24(1), 49–62.
[3] Conze, A. & Viswanathan, R. (1991). European path dependent options: the case of geometric averages, Finance 12(1), 7–22.
[4] Henderson, V., Hobson, D., Shaw, W. & Wojakowski, R. (2007). Bounds for in-progress floating-strike Asian options using symmetry, Annals of Operations Research 151, 81–98.
[5] Henderson, V. & Wojakowski, R. (2002). On the equivalence of fixed and floating-strike Asian options, Journal of Applied Probability 39(2), 391–394.
[6] Ingersoll, J. (1987). Theory of Financial Decision Making, Rowman and Littlefield Publishers, New Jersey.
[7] Levy, E. (1992). Pricing European average rate currency options, Journal of International Money and Finance 11(5), 474–491.
[8] Levy, E. & Turnbull, S. (1992). Average intelligence, Risk 5, 2.
[9] Margrabe, W. (1978). The value of an option to exchange one asset for another, Journal of Finance 33, 177–186.
[10] Ritchken, P., Sankarasubramanian, L. & Vijh, A.M. (1993). The valuation of path-dependent contracts on the average, Management Science 39(10), 1202–1213.
[11] Shiuan, Y.J. (2001). Pricing Floating-Strike Asian Options. MSc dissertation, University of Warwick.
[12] Vecer, J. (2001). A new PDE approach for pricing arithmetic average Asian options, Journal of Computational Finance 4(4), 105–113.
Related Articles
Asian Options; Black–Scholes Formula; Exchange Options; Lookback Options; Risk-neutral Pricing.
VICKY HENDERSON
Foreign Exchange Markets
The foreign exchange (FX) market has two major functionalities, one related to hedging and the other to investment.
In the age of globalization, it is essential for corporates and multinationals to hedge their FX exposure due to export/import activities. In addition, fund managers (institutional) need to hedge their FX risk in stocks or bonds if the stocks/bonds are quoted in a foreign currency. With hedging instruments, the FX exposure can be reduced and one can even benefit from certain market scenarios. This kind of participation brings us to the important class of investor-oriented products where the coupon depends on an FX rate or, at maturity, the pay-off (amount, currency) will be determined by an FX rate. This kind of product can be issued as a note, certificate, or bond.
For the major currencies such as USD, EUR, JPY, GBP, CHF, AUD, CAD, and NZD, the market has become more transparent over the last few years. For plain vanilla options, market data, especially volatilities for maturities below 1 year, are published by brokers or banks and are shown on Reuters pages (e.g., TTKLINDEX10, ICAPFXOP, GFIVOLS). For exotic products, new pricing tools, such as Superderivatives, LPA, Bloomberg, ICY, Fenics, and so on, are available for users, but the premium of the option will depend on the pricing model and the adjustments used. For the emerging market currencies such as PLN (Polish zloty), HUF (Hungarian forint), ZAR (South African rand), and so on, which are freely tradable but less liquid, the market data are less transparent. Currencies that are not freely tradable (the currency cannot be cash-settled offshore) such as BRL (Brazilian real) or CNY (Chinese yuan renminbi) can be traded as a nondeliverable forward (NDF) or as a nondeliverable option (NDO) against a tradable currency. The NDF is a cash-settled product without exchange of notionals, which means that the intrinsic value at maturity will be paid in the freely tradable currency based on a fixing source. The underlying of an NDO is the NDF, meaning that exercising the NDO results in an NDF, which will also be cash-settled. Another class of currencies is that of the fully cash-settled pegged ones, which means that their exchange rate is 100% correlated to a major currency, mostly the USD. If one expects that this peg will continue, hedges should be done in the correlated major currency. In the case of SAR (Saudi riyal) or AED (United Arab Emirates dirham), discussion has been ongoing about depegging the currencies. This opens an increasing interest in SAR- or AED-linked investments to participate in the case that these currencies are depegged. For the “more exotic currencies” such as the GHC (Ghanaian cedi), there is no options market.
Quotation
The exchange rate can be defined as the amount of domestic currency one gets if one sells one unit of foreign currency. If we take a look at an example of the EUR/USD exchange rate, the default quotation is EUR-USD, where USD is the domestic currency and EUR is the foreign currency. The terms domestic and foreign are not related to the location of the trader or any country, but it is more a question of the definition. Domestic and base are synonyms as are foreign and underlying. The common way is to denote the currency pair with a slash (/) and the quotation with a dash (−). The slash (/) does not mean a division.
For example, the currency pair EUR/USD can be quoted either in EUR-USD, which means how many USD one gets for selling one EUR, or in USD-EUR, which then means how many EUR one gets for selling one USD. There are certain market standard quotations; some of them are listed in Table 1.
Table 1 Market convention of some major currency pairs with sample spot price
Currency pair   Quotation   Quote
EUR/USD         EUR-USD     1.4400
GBP/USD         GBP-USD     1.9800
USD/JPY         USD-JPY     114.00
USD/CHF         USD-CHF     1.1500
EUR/CHF         EUR-CHF     1.6600
EUR/JPY         EUR-JPY     165.00
EUR/GBP         EUR-GBP     0.7300
USD/CAD         USD-CAD     0.9800
In the FX market, two currencies are involved, which means that one needs to specify on which currency a particular call or put option is written. For
instance, in the currency pair EUR/USD, there can be a EUR call, which is equivalent to a USD put, or a EUR put, which is equivalent to a USD call.
FX Terminology
In the FX market, a million is called a buck and a billion a yard. This is because the word “billion” has different meanings in different languages. In French and German, it represents 10^12 and in English it stands for 10^9.
Certain currency pairs have their own names in the market. For instance, GBP/USD is called a cable, because the exchange rate information used to be sent between England and America through a telephone cable in the Atlantic Ocean. EUR/JPY is called the cross, because it is the cross rate of the more liquidly traded USD/JPY and EUR/USD.
Some currency pairs also have their own names to make them short and unique in communication. New Zealand dollar, which is NZD/USD, is called Kiwi, and the Australian dollar, which is AUD/USD, is called Aussie. Among the Scandinavian currencies, NOK (Norwegian krone) is called Noki, SEK (Swedish krona) is called Stoki, and in combination with DKK (Danish krone) the three are called Scandies.
The exchange rates are usually quoted in five relevant figures, for example, in EUR-USD we would get a quote of 1.4567. Sometimes one can get a quote up to six figures, but for the time being we focus on five figures. The last digit “7” is called the pip and the middle digit “5” is called the big figure, because the interbank spot trading tools show this digit in bigger size since it is the most important information. The figure to the left of the big figure is known anyway and the pips to the right of the big figures are sometimes “negligible”. For example, a rise of EUR-JPY 165.00 by 40 pips is 165.40 or a rise by 3 big figures would be 168.00.
Quotation of Option Prices
Plain vanilla option prices are usually quoted in terms of implied volatility. If an option is priced in volatility, a delta exchange is necessary. The advantage is that the volatility does not usually move as quickly as the spot rate and one has the chance to compare the prices, especially in the broker market. On the basis of the spot rate on which the delta exchange is done, the premium of the plain vanilla option is calculated via the Black–Scholes formula. For exotic options, a price in volatility is not possible because each bank has its own pricing model for these.
The premium, value, or prices of options can be quoted in six different ways (Table 2). The Black–Scholes formula quotes in domestic pips per one unit of foreign notional. The others can be retrieved in the following manner:
    \text{d pips} \xrightarrow{\times \frac{1}{SK}} \text{f pips} \xrightarrow{\times K} \%f \xrightarrow{\times \frac{S}{K}} \%d \qquad (1)
Table 2 Standard market quotation types for option premiums
Symbol   Description of symbol       Result of example
d pips   Domestic per unit foreign   208.42 USD pips per EUR
f pips   Foreign per unit domestic   97.17 EUR pips per USD
%f       Foreign per unit foreign    1.4575% EUR
%d       Domestic per unit domestic  1.3895% USD
d        Domestic amount             20 842 USD
f        Foreign amount              14 575 EUR
Foreign = EUR, domestic = USD, S_0 = 1.4300, r_d = 5.0%, r_f = 4.5%, volatility = 8.0%, K = 1.5000, T = 365 days, EUR call USD put, notional = 1 000 000 EUR = 1 500 000 USD
Delta and Premium Convention
The spot delta of a plain vanilla option can be retrieved in a straightforward way by using the Black–Scholes formula. It is called the raw spot delta, δ_raw. One retrieves it in percentage of the foreign currency, but the delta in the second involved currency δ_raw^opposite can be computed in the following manner:
    \delta_{raw}^{opposite} = -\delta_{raw} \frac{S}{K} \qquad (2)
The delta multiplied with the corresponding notional determines the amount that has to be bought
or sold to hedge the spot risk of the option up to the first order.
An important question is whether the premium of the option needs to be included in the delta or not. An example, EUR-USD, for investigation is considered here. In this quotation, USD is the domestic currency and EUR is the foreign one. The Black–Scholes formula calculates the premium in domestic per 1 unit foreign currency, which in our example is in USD per 1 EUR. This premium is denoted by p. If the premium is paid in EUR, which means in the foreign currency, it includes an FX risk. The premium p in USD is equivalent to p/S EUR, which means that the amount of EUR that has to be bought to hedge the option needs to be reduced by this EUR premium and is given as
    \delta_{raw} - \frac{p}{S} \ \text{EUR} \qquad (3)
We denoted USD as domestic currency and EUR as foreign currency, but do all banks or trading places have this notion? What is the notional currency of the option and what is the premium currency? In the interbank market, there exists a fixed notion of the delta of the currency pair. Normally, it is the LHS delta in Fenics if the option is traded in the LHS premium, which is mostly used, for example, for EUR/USD, USD/JPY, and EUR/JPY, and the RHS delta if it is the RHS premium, for example, for GBP/USD and AUD/USD. Most of the options traded in the market are out-of-the-money; therefore, the premium does not create a critical FX risk for the trader.
For the banks where the base currency is considered the risk-free currency, the market value of the option is in the base currency, and if the premium is in the risky currency, the premium needs to be included in the hedge. If the premium is in the risk-free (or the base) currency, the premium will be offset by the market value of the option. In the opposite case, where the risk-free currency is the underlying currency, if the premium is in the risky currency, the premium will be offset by the market value of the option. Only in the case of premium in risk-free currency, the amount needs to be included in the hedge. Therefore, the delta hedge is invariant with respect to the risky currency notion of the bank; for example, for both banks, one is based in USD and the other in EUR, and the delta is the same.
Examples
To see the different deltas used in practice, consider two examples discussed in Tables 3 and 4.
Table 3 One-year EUR call USD put, the strike is 1.4300 for a EUR-based bank
Delta currency   Premium currency   Fenics     Hedge                 Delta
%EUR             EUR                LHS        δ_raw − P             48.35
%EUR             USD                RHS        δ_raw                 51.64
%USD             EUR                RHS + F4   −(δ_raw − P) S/K      −48.35
%USD             USD                LHS + F4   −δ_raw S/K            −51.64
S = 1.4300, r_d = 5.0%, r_f = 4.5%, volatility = 8.0%, K = 1.4300
Table 4 One-year EUR call USD put, the strike is 1.5000 for a EUR-based bank
Delta currency   Premium currency   Fenics     Hedge                 Delta
%EUR             EUR                LHS        δ_raw − P             28.22
%EUR             USD                RHS        δ_raw                 29.69
%USD             EUR                RHS + F4   −(δ_raw − P) S/K      −26.91
%USD             USD                LHS + F4   −δ_raw S/K            −28.30
S = 1.4300, r_d = 5.0%, r_f = 4.5%, volatility = 8.0%, K = 1.5000
Implied Volatility and Delta for a Given Strike
Implied volatility is not constant across strikes (see Foreign Exchange Smiles). The volatility σ depends on the corresponding delta of the option, but the delta depends on the price of the option and therefore on the volatility used. How can we retrieve the correct volatility for a given strike? It is an iterative process. Initially, one uses the at-the-money (ATM) volatility σ_0 and calculates the delta Δ_1. On the basis of Δ_1, a new volatility σ_1 can be retrieved from the volatility matrix. This new volatility leads to a new delta and so on. Now one can define a convergence criterion to stop the iteration. In practice, a fixed number of iterations is used, usually five steps.
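The quotation chain (1), the delta conventions (2) and (3), and the fixed-iteration volatility lookup described above can be sketched together in a few lines. This is an illustration under stated assumptions, not the method of any particular pricing tool: the Black–Scholes formula is taken in its standard FX (Garman–Kohlhagen) form with continuous compounding and T = 1 year, all function names are hypothetical, and `smile` is a hypothetical callable standing in for a lookup in the volatility matrix. With the inputs of Tables 2 and 4 the results land close to the tabulated figures; small differences are to be expected because day-count and settlement conventions are not specified here.

```python
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def vanilla_call(spot, strike, rd, rf, vol, t):
    """Call on the foreign currency: value in domestic per unit foreign, raw spot delta."""
    total_vol = vol * math.sqrt(t)
    d1 = (math.log(spot / strike) + (rd - rf) * t) / total_vol + 0.5 * total_vol
    d2 = d1 - total_vol
    value = (spot * math.exp(-rf * t) * norm_cdf(d1)
             - strike * math.exp(-rd * t) * norm_cdf(d2))
    delta_raw = math.exp(-rf * t) * norm_cdf(d1)
    return value, delta_raw


def premium_quotes(value, spot, strike, notional_foreign):
    """The six quotation types of Table 2, following the chain in equation (1)."""
    d_pips = value                      # domestic per unit foreign
    f_pips = d_pips / (spot * strike)   # times 1/(S K)
    pct_f = f_pips * strike             # times K
    pct_d = pct_f * spot / strike       # times S/K
    return {"d pips": 1e4 * d_pips, "f pips": 1e4 * f_pips,
            "%f": 100.0 * pct_f, "%d": 100.0 * pct_d,
            "d": d_pips * notional_foreign, "f": pct_f * notional_foreign}


def hedge_deltas(delta_raw, value, spot, strike):
    """The four hedge deltas of Tables 3 and 4, keyed by (delta ccy, premium ccy)."""
    p_foreign = value / spot            # premium p/S, as in equation (3)
    ratio = spot / strike               # conversion as in equation (2)
    return {("%FOR", "FOR"): delta_raw - p_foreign,
            ("%FOR", "DOM"): delta_raw,
            ("%DOM", "FOR"): -(delta_raw - p_foreign) * ratio,
            ("%DOM", "DOM"): -delta_raw * ratio}


def vol_for_strike(spot, strike, rd, rf, t, atm_vol, smile, n_iter=5):
    """Fixed-count iteration: start from the ATM volatility, alternate delta and lookup."""
    vol = atm_vol
    for _ in range(n_iter):
        _, delta = vanilla_call(spot, strike, rd, rf, vol, t)
        vol = smile(delta)              # hypothetical volatility-matrix lookup by delta
    return vol


# Inputs of Tables 2 and 4 (T taken as one year).
S, K, RD, RF, VOL, T = 1.4300, 1.5000, 0.05, 0.045, 0.08, 1.0
value, delta_raw = vanilla_call(S, K, RD, RF, VOL, T)
quotes = premium_quotes(value, S, K, 1_000_000)
deltas = hedge_deltas(delta_raw, value, S, K)
```

Keeping the four deltas in one dictionary makes the invariance remark above easy to inspect: the two premium-adjusted entries differ only by the factor −S/K, as do the two unadjusted ones.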
Table 5 Vega matrix for standard maturities and delta values, expressed in percent of foreign notional
Mat/ 50% 45% 40% 35% 30% 25% 20% 15% 10% 5%
O/N 0.02 0.02 0.02 0.02 0.02 0.02 0.02 0.01 0.01 0.01
1W 0.06 0.06 0.05 0.05 0.05 0.04 0.04 0.03 0.02 0.01
1M 0.11 0.11 0.11 0.10 0.10 0.09 0.08 0.07 0.05 0.03
2M 0.16 0.16 0.15 0.15 0.14 0.13 0.11 0.09 0.07 0.04
3M 0.20 0.02 0.19 0.19 0.17 0.16 0.14 0.12 0.09 0.05
6M 0.28 0.28 0.27 0.26 0.25 0.23 0.20 0.17 0.13 0.07
9M 0.33 0.33 0.33 0.32 0.30 0.28 0.24 0.20 0.15 0.09
1Y 0.38 0.38 0.38 0.37 0.35 0.32 0.28 0.24 0.18 0.10
2Y 0.51 0.51 0.51 0.50 0.48 0.44 0.40 0.33 0.25 0.15
3Y 0.60 0.60 0.60 0.60 0.57 0.54 0.48 0.40 0.31 0.18
The matrix shows, for example, that 2Y EUR call USD put 35 delta can be hedged with two times 6M EUR call USD put 30 delta
Dec 5, 2007
Dec 6, 2007 −0.525 −0.55 −0.525
Dec 7, 2007 −0.6 −0.6 −0.6
Dec 10, 2007 −0.6 −0.6 −0.6
If we denote the ATM volatility by σ_0, the 25-delta put volatility by σ_−, and the 25-delta call volatility by σ_+, we get the following relationships:
    RR = \sigma_+ - \sigma_- \qquad (4)
    BF = \frac{1}{2}(\sigma_+ + \sigma_-) - \sigma_0 \qquad (5)
    \sigma_+ = \sigma_0 + BF + \frac{1}{2} RR \qquad (6)
    \sigma_- = \sigma_0 + BF - \frac{1}{2} RR \qquad (7)
It should be noted that the values RR and BF given above have nothing to do with the prices of actual risk reversal and butterfly contracts: rather they provide a convenient representation of the implied volatility smile in terms of its level (σ_0), convexity (BF), and skewness (RR).
At-the-money Definition
There exist several definitions of ATM:
• ATM spot: Strike is equal to the spot.
• ATM forward: Strike is equal to the forward.
• Delta parity: The absolute value of the delta call is equal to the absolute value of the delta put.
• Fifty delta: Put delta is 50% and the call delta is 50%.
• Value parity: The premium of the put is equal to the premium of the call.
The most widely used one in the interbank market is the delta parity up to 1 year for the most liquid currencies. In emerging markets, at-the-money forward (ATMF) is used. For long-term options such as USD/JPY 15 years, the ATMF convention is used, but since this results in a delta position, a forward delta exchange will be done.
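Relationships (4) to (7) are inverse to each other, which a short sketch makes explicit. The function names and the sample quotes below are hypothetical and chosen only for illustration; volatilities are in percentage points.

```python
def smile_from_quotes(atm, rr, bf):
    """25-delta put and call volatilities from ATM, RR and BF (equations (6) and (7))."""
    call_25 = atm + bf + 0.5 * rr
    put_25 = atm + bf - 0.5 * rr
    return put_25, call_25


def quotes_from_smile(atm, put_25, call_25):
    """Risk reversal and butterfly from the three volatilities (equations (4) and (5))."""
    rr = call_25 - put_25
    bf = 0.5 * (call_25 + put_25) - atm
    return rr, bf


# Hypothetical quotes: ATM 8.0, RR -0.6, BF 0.15.
put_vol, call_vol = smile_from_quotes(8.0, -0.6, 0.15)
rr_back, bf_back = quotes_from_smile(8.0, put_vol, call_vol)
```

A negative RR places the 25-delta put volatility above the 25-delta call volatility, which is the skewness reading mentioned above.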
[Figure 1: two panels, (a) and (b), each plotting P/L against spot at expiry for spot between 1.25 and 1.65]
Figure 1 (a) Payoff profile of a EUR-USD knockout option that is not knocked out during its lifetime; (b) payoff profile of the EUR-USD risk reversal at expiry
the delta position. Normally, during the lifetime of the option, the risk is hedged dynamically across the entire option book.
The quoting bank (market maker) is the calculation agent. It stipulates the regulations under which predefined triggers are reached or how often the underlying is traded in certain predefined ranges. The market maker informs the market user about the trigger event.

Barrier Options

Barrier options are vanilla put and call options with additional barriers. In case of a knockout, the option expires worthless if the spot ever trades at or beyond the prespecified barrier. In case of a knockin option, the option is only activated if the spot ever trades at or beyond the prespecified barrier. The barrier is valid at all times between inception of the trade and the maturity time of the option.
One can further distinguish regular barrier options, where the barrier is out-of-the-money, and reverse-barrier options, where the barrier is in-the-money. A regular knockout barrier option can basically be priced and semistatically hedged by a risk reversal (Lego-brick principle).
Figure 1 illustrates the example: EUR-USD spot 1.4600, expiry six months, strike 1.5000, EUR call with regular knockout trigger at 1.4300.
Hedging a short regular knockout EUR call, we can go long a vanilla EUR call with the same strike and the same expiry and go short a vanilla EUR put
Foreign Exchange Options 3
with a strike such that the value of the hedge portfolio is zero if the spot is at the barrier. The long call and short put is called a risk reversal and its market price can be used as a proxy for the price of the regular knockout call. In our example, it would be a 1.3650 EUR put. If the trigger is not reached, then the put expires worthless and the call offsets the knockout call payoff. If the trigger is reached, the risk reversal can be canceled with approximately zero value. The delta of a knockout option is higher than the delta of the corresponding plain vanilla option, and the higher it is, the closer the trigger is to the underlying spot.
Reverse knockout and reverse knockin are more difficult to price and hedge as the risk profile of these options is difficult to replicate with other options. In this case, the trigger is in the money. The volatility risk of first and second order arising from these options can be hedged dynamically with risk reversals and butterflies (see Vanna–Volga Pricing). However, all sensitivities take extreme values when getting closer to the trigger and closer to maturity. Delta positions can be a multiple of the notional amount. Therefore, it is difficult for the trader to perform dynamic hedging strategies. To manage these risks, short-term reverse knockout barrier options are often removed from the global books and are matched as individual positions, or are closed two to three weeks before expiry. The risk surcharge paid in this case is often smaller than the cost of keeping such positions and hedging them individually.

Modifications and Extensions of Barrier Options

Standard extensions of barrier options are double-barrier options, where there is a barrier above and below the current spot. A double knockout option expires worthless if any of the two barriers are ever touched or crossed. A double knockin option only becomes a vanilla option if at least one of the two barriers is touched or crossed in the underlying.
A further modification of barrier options is called the knockin/knockout (KIKO) option. This option can knockout at any time; however, it must knockin to become alive. A short KIKO option can be statically hedged with a long knockout option and a short double knockout option, if the spot value is between the triggers, and with a long knockout option and a short knockout option, if the spot value is above both triggers (Lego-brick principle).
Window barriers (partial barriers) are additional modifications of barrier options. In case of a window-barrier option, the trigger is valid only within a certain period of time. Commonly, this period of time is from inception of the trade until a specific date (early ending) or from a specific date during validity until expiry date of the option (deferred start). Arbitrary time intervals are possible.
For European barrier options, the triggers are only valid at maturity. They can be statically hedged with plain vanilla options and European digital options (Lego-brick principle).

Binary Options/Digital Options

Digital or binary options pay a fixed amount in a currency to be specified if the spot trades at or beyond a prespecified barrier or trigger. For European digitals, the trigger is valid only at maturity, whereas for American digitals, the trigger is valid during the entire lifetime of the trade. In FX-interbank trade, American digitals are also called one-touch (if the fixed amount is paid at maturity) or instant one-touch (if the fixed amount is paid at first hitting time) options. Further touch options are the so-called no-touch options, double no-touch options, and double one-touch options. A no-touch pays only if the spot never touches or crosses the prespecified trigger. A double no-touch pays only if neither the upper trigger nor the lower trigger is ever touched or crossed during the lifetime of the contract. A double one-touch pays only if at least one of the upper or the lower triggers is touched. When buying a double no-touch option, a vega short position is generated. This means that double no-touch options are cheap in phases of high volatility.
European digital options can be replicated with bull or bear spreads with large amounts. Their market price can thus be approximated by liquid vanilla options. However, this type of option is difficult to hedge as the delta hedge close to expiry is zero almost everywhere.

General Features When Pricing Exotic Options

Most commercial software packages calculate the “theoretical value (TV)” of the exotic options, which
is the value of the product in a Black–Scholes model with constant parameters.
Knowing the TV is important for trading partners as it serves as a checksum to ensure that both parties talk about the same product. The market value, however, often deviates from this value because of so-called overhedge costs, which arise when hedging the exotic option. Every trader must be aware of the risk arising from these options and should be able to control this risk dynamically in his books via the Greeks (price sensitivity with respect to market and model parameters). If a gain is generated by performing this hedge, the price of an exotic option must be lower than the TV. Conversely, if the hedge leads to a loss, the market price of the exotic option should be above TV.
A very important issue when trading exotic options is placing automatic spot orders at spot levels that could lead to a knockout or expiry of the option. This order eliminates the delta hedge of the option automatically when reaching the trigger. This explains the occasional very heavy spot movements during specific trigger events in the market.
The following vega structure is often found in options books as it stems from most of the structured products offered today in the FX range: ATM vega long and wing vega short. This is the reason for a long phase of low volatility and high butterflies for the past years. See also Foreign Exchange Smiles.

Second-generation Exotic Options

We consider every exotic option as second generation if it is not a vanilla and not a first-generation product. Some of the common examples in FX markets are range accruals and faders.
A range accrual is a sum of digital call spreads and pays an amount of a prespecified currency that depends on the number of currency fixings that come to fall inside a prespecified range. A fader is any basic option product like a vanilla or barrier option, whose notional amount depends on the number of currency fixings that come to fall inside a prespecified range. We distinguish fade-in products, where the notional grows with each fixing inside the range, and fade-out products, where the notional decreases with each fixing inside the range.
Further extensions are target redemption products, whose notional amount increases until a certain gain is reached. A common example is a target redemption forward (TRF). We provide a description and an example here: We consider a TRF in which a counterpart sells EUR and buys USD at a much higher rate than current spot or forward rates. The key feature in this product is that the counterpart has a total target profit that, once hit, knocks out all future settlements (in the example below, all weekly settlements), locking the gains registered until then.
The idea is to place the strike over 5.5 big figures above spot to allow the counterpart to quickly accumulate profits and have the trade knocked out after five or six weeks. The counterpart will start losing money if EUR-USD starts fixing above the strike. On a spot reference of 1.4760, consider a one-year TRF, in which the counterpart sells EUR 1 million per week at 1.5335, subject to a knockout condition: if the sum of the counterpart profits reaches the target, all future settlements are canceled. We let the target be 0.30 (i.e., 30 big figures), measured weekly as Profit = Max(0, 1.5335 − EUR-USD Spot Fixing). As usual, this type of forward is also traded at zero cost:

Week 1: Fixing = 1.4800, Profit = 0.0535 = Max(1.5335 − 1.4800, 0)
Week 2: Fixing = 1.4750, Profit = 0.0585, Accumulated profit = 0.1120
Week 3: Fixing = 1.4825, Profit = 0.0510, Accumulated profit = 0.1630
Week 4: Fixing = 1.4900, Profit = 0.0435, Accumulated profit = 0.2065
Week 5: Fixing = 1.4775, Profit = 0.0560, Accumulated profit = 0.2625
Week 6: Fixing = 1.4850, Profit = 0.0485, Accumulated profit = 0.3110

The profit is capped at 0.30, so in week 6 the counterpart only accumulates the last 3.75 big figures and the trade knocks out.
Each forward will be settled physically every week until the trade knocks out (if the target is reached).
Another popular FX product is the time option, which is essentially a forward contract of American style, that is, the buyer is entitled and obliged to trade a prespecified amount at a prespecified strike, but can choose the time within a prespecified time interval.
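The weekly accumulation in the TRF example can be replayed in a few lines. This is a minimal sketch, assuming the profit rule quoted in the text, Profit = Max(0, strike − fixing), with the final settlement capped so that the total equals the target; the function name and the seventh fixing are hypothetical, and losses from fixings above the strike are not modelled.

```python
def trf_settlements(strike, target, fixings):
    """Replay weekly TRF profits until the accumulated profit reaches the target."""
    rows, accumulated = [], 0.0
    for week, fixing in enumerate(fixings, start=1):
        profit = max(0.0, strike - fixing)          # weekly profit as in the text
        paid = min(profit, target - accumulated)    # last settlement is capped
        accumulated += paid
        rows.append((week, fixing, round(profit, 4), round(accumulated, 4)))
        if accumulated >= target - 1e-12:           # target hit: trade knocks out
            break
    return rows


# Fixings of weeks 1 to 6 are taken from the example; week 7 is hypothetical
# and is never reached because the trade knocks out in week 6.
fixings = [1.4800, 1.4750, 1.4825, 1.4900, 1.4775, 1.4850, 1.4700]
schedule = trf_settlements(strike=1.5335, target=0.30, fixings=fixings)
for row in schedule:
    print(row)
```

Before the cap is applied, week 6 would bring the total to 0.3110; the cap leaves 0.0375 (3.75 big figures) for the final settlement.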
is the net present value (NPV) of USD 100 000. To calculate this, we use the following formula:

$$NPV = \frac{N}{1 + r \cdot d/dc} \tag{1}$$

where

• N is the amount for which one wants to calculate the NPV. In this example, it is USD 100 000.
• r is the interest rate, expressed as percentage per annum, for the currency in which N is denominated.
• d is the duration of the deposit or loan in days. In this example, it is 180 days (i.e., six months).
• dc is the day-count basis (the number of days in the year). This is usually 360, except for GBP deposits or loans where it is 365.

We are now able to calculate the amount company X has to borrow: USD 100 000/(1 + 0.03 × 180/360) = USD 98 522.17. If they borrow this money, they have to pay back exactly USD 100 000 in six months time including the interest charge. This is the amount company X is due to receive in six months time from the sales of its goods to the importing company.
If company X now sells the borrowed USD in the spot market and buys GBP, they receive GBP 49 261.08. This is calculated by dividing the borrowed USD amount by the current GBP/USD exchange rate (2.0000 in this example).
Company X now has the GBP and has eliminated the foreign exchange exposure. They can take GBP and deposit it with their bank at the current interest rate (6% in this example). The amount they get back after six months is equal to GBP 50 718.67, calculated as follows: GBP 49 261.08 × (1 + 0.06 × 180/365).
After this series of transactions, company X is left with no cash position at the beginning of the transaction. They receive GBP 50 718.67 after six months and have to pay USD 100 000 in exchange. The exchange rate that is implied from the above-mentioned two amounts is 1.9717 (calculated as USD 100 000 divided by GBP 50 718.67).
What happens when a forward transaction is entered into? Exactly the same:

• At the beginning of the transaction:
– company X has no cash position;
– company X agrees to sell USD 100 000 for GBP at the market forward rate.
• At the end of the transaction:
– company X pays USD 100 000 for the GBP amount exchanged at the agreed forward rate.

Because the two approaches described earlier have the same outcome, the GBP amount received for the USD 100 000 has to be the same; otherwise there would be an arbitrage opportunity. Therefore, the market forward rate in this example has to be 1.9717.
Generally, the forward rate can be calculated with a single and easy formula:

$$F_{GBP/USD} = S_{GBP/USD} \cdot \frac{1 + r_{USD} \cdot d/dc_{USD}}{1 + r_{GBP} \cdot d/dc_{GBP}} \tag{2}$$

where F_GBP/USD, forward rate for GBP/USD; S_GBP/USD, spot exchange rate for GBP/USD; r_USD, USD interest rate expressed in percentage per annum; r_GBP, GBP interest rate expressed in percentage per annum; d, duration of the deposit or loan in days; dc_USD, day-count basis for USD (360); and dc_GBP, day-count basis for GBP (365).
As the formula suggests, the market forward rate is a function of only the current (spot) exchange rate and the interest rates of the two currencies for the specified forward period. Hence, it is not market expectations, or any other factor that determines the arbitrage-free forward rate.

Structured Forwards

The previous section helped us to understand how a foreign exchange exposure resulting from a cross-border transaction can be eliminated and hedged through a forward transaction. It showed that the forward exchange rate was fixed right at the beginning of the contract and hence the uncertainty about exchange rate movements was turned into a known rate with which companies can calculate their cost of production. The example also demonstrated that there is no cash flow at the beginning of a forward transaction and there is no premium or any other fee associated with it. A forward transaction is by definition a zero-cost strategy.

The Difference between Forwards and Structured Forwards

The disadvantage of forwards is that favorable exchange rate moves are also eliminated when the
Currency Forward Contracts 3
exchange rate is fixed. In the previous example, the forward rate was calculated to be 1.9717. This is the rate at which company X has to buy GBP and sell the USD. If in six months time the GBP/USD exchange rate falls below 1.9717, company X would be better off without hedging the GBP purchase through a forward.
Structured forwards allow just this. They are more flexible, because favorable exchange rate moves, and, in fact, any market view can be incorporated into the transaction to enhance the rate at which a currency is exchanged for another.
As with forwards, structured forwards offer the worst-case exchange rate. This rate is fixed at the beginning of the contract and similar to a regular forward, it offers the benefit of certainty about the exchange rate that can be used for financial planning.
Similar to standard forward contracts, most structured forward contracts are zero-cost strategies, that is, no upfront premium is required.
We all know that there is no such thing as a “free lunch”. Therefore, to have the benefit of an improved exchange rate, a fixed worst-case rate, and a zero-cost strategy, the company entering into a structured forward transaction needs to take on certain risks. This risk is usually structured so that the guaranteed worst-case exchange rate is set at a rate that is worse than the prevailing market forward rate. The hedging counterparty accepts this worse guaranteed rate for the chance of receiving a better rate, in case a predefined condition is met. As the examples in the following section demonstrate, these predefined conditions can take many forms and may incorporate the market view of the counterparty entering into the structured forward transaction.

Examples of Structured Forwards

As mentioned in the previous section, structured forwards offer the possibility to incorporate one’s market view into a forward transaction. This view might be the appreciation or depreciation of a currency or even the view that a currency pair remains in a certain range over a given period of time. The following examples demonstrate how these different market views can be expressed with currency options that can be structured into the forward transaction. As a reminder: all examples follow the basic assumptions that the structured forward has a worst-case buying (or selling) rate and no upfront premium must enter into the transaction.

Forward Plus. The forward plus is the simplest of all structured forwards. It offers the possibility to take advantage of favorable market movements up to a certain point, while still having a certain worst-case hedged rate.
How does it work: by accepting a worst-case hedge rate that is less favorable than the prevailing market forward rate, we create excess cash. Remember, trading at the market forward rate is zero cost by definition. If one trades at a rate that is worse than the market rate, one can expect some compensation. The cash generated is used to buy an option that pays out if the underlying currency pair moves favorably. To make this a zero-cost strategy, we need to introduce a barrier, or knockout. This has the effect that options cease to exist (are knocked out) if the barrier is reached. For our strategy, it means that we can participate in a favorable market move, but only up to a certain point, namely, the predefined barrier level. If the barrier is reached we are locked into a forward transaction with a rate equal to the worst-case rate.
Let us continue the previous example with company X: We calculated the market forward rate to purchase GBP against USD in six months time to be 1.9717. A forward plus could have a worst-case buying rate of 1.9850. This rate is 0.0133 worse than the market forward rate. As compensation for accepting this hedge rate, company X has the opportunity to buy GBP at the prevailing spot rate in six months time as long as the barrier of 1.8875 is not reached or breached during the life of the contract. As the barrier is observed continuously during the entire life of the transaction, we call this barrier an American style barrier (this is not to be confused with an American style option that is exercisable during the life of the option). So what does this right to buy the GBP at the prevailing market spot rate in six months time give to company X? Imagine that the barrier was never reached and the spot rate in six months time is at 1.9000. In this case, company X may buy the GBP at 1.9000 and it will outperform the forward transaction that would have forced it to buy the GBP at 1.9717. However, if the spot rate ever trades at or below the barrier of 1.8875, company X has to buy the GBP at the worst-case rate of 1.9850.
Table 1 and Figure 1 demonstrate possible scenarios with assumed spot rates after six months.
Spot rate in six months time Barrier never reached Barrier reached Market forward rate
2.0200 1.9850 1.9850 1.9717
2.0100 1.9850 1.9850 1.9717
2.0000 1.9850 1.9850 1.9717
1.9900 1.9850 1.9850 1.9717
1.9850 1.9850 1.9850 1.9717
1.9750 1.9750 1.9850 1.9717
1.9700 1.9700 1.9850 1.9717
1.9650 1.9650 1.9850 1.9717
1.9600 1.9600 1.9850 1.9717
1.9550 1.9550 1.9850 1.9717
1.9500 1.9500 1.9850 1.9717
1.9450 1.9450 1.9850 1.9717
1.9400 1.9400 1.9850 1.9717
1.9350 1.9350 1.9850 1.9717
1.9300 1.9300 1.9850 1.9717
1.9250 1.9250 1.9850 1.9717
1.9200 1.9200 1.9850 1.9717
1.9150 1.9150 1.9850 1.9717
1.9100 1.9100 1.9850 1.9717
1.9050 1.9050 1.9850 1.9717
1.9000 1.9000 1.9850 1.9717
1.8950 1.8950 1.9850 1.9717
1.8876 1.8876 1.9850 1.9717
1.8875 1.9850 1.9850 1.9717
1.8800 1.9850 1.9850 1.9717
1.8750 1.9850 1.9850 1.9717
1.8700 1.9850 1.9850 1.9717
[Figure 1 plot: GBP purchasing rate (vertical axis, 1.8800 to 2.0000) against GBP/USD spot rate at maturity (horizontal axis, 1.8700 to 2.0100).]
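The forward rate of equation (2) and the scenario logic of Table 1 can be reproduced with a short sketch. The function names are hypothetical; the rates, day-count bases, worst-case rate, and barrier are the ones used in the example, and the `barrier_hit` flag stands in for continuous monitoring of the barrier during the life of the contract.

```python
def forward_rate(spot, r_usd, r_gbp, days, dc_usd=360, dc_gbp=365):
    """Equation (2): arbitrage-free GBP/USD forward from simple interest rates."""
    return spot * (1 + r_usd * days / dc_usd) / (1 + r_gbp * days / dc_gbp)


def forward_plus_rate(spot_at_maturity, barrier_hit,
                      worst_case=1.9850, barrier=1.8875):
    """GBP purchasing rate of the forward plus for one scenario of Table 1."""
    if barrier_hit or spot_at_maturity <= barrier:
        return worst_case                       # knocked out: worst-case forward
    return min(spot_at_maturity, worst_case)    # otherwise buy at spot, capped


market_forward = forward_rate(spot=2.0000, r_usd=0.03, r_gbp=0.06, days=180)
print(round(market_forward, 4))

for spot, hit in [(2.0200, False), (1.9000, False), (1.9000, True), (1.8875, False)]:
    print(spot, hit, forward_plus_rate(spot, hit))
```

The scenario with spot at 1.9000 and no barrier event buys GBP below the market forward rate, which is the outperformance region discussed in the text.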
As Figure 1 demonstrates, the forward plus outperforms the market forward rate if the barrier is never reached and the GBP/USD spot rate at maturity is below 1.9717.
If we set the worst-case scenario even higher than 1.9850, we can set the barrier further down. Taking advantage of this flexibility, each company entering into a forward plus can create a product that suits its risk appetite.

Range Forward. The following example uses another market view to try to outperform the forward rate. In this case, we expect the underlying currency pair to trade within a predefined range during the life of the contract.
Like with the forward plus (and with nearly all other structured forwards), the worst-case hedge rate is less favorable than the prevailing market forward rate. The generated excess cash is spent on an option that pays out if the range holds. The payout of the option is then used to improve the worst-case rate.
Here is an example: we calculated the market forward rate to purchase GBP against USD in six months time to be 1.9717. A range forward could have a worst-case buying rate of 1.9850. This rate is 0.0133 worse than the market forward rate. In compensation for accepting this hedge rate, company X can buy GBP at 1.8850 (0.0867 better than the forward rate), if the GBP/USD exchange rate remains within the 1.9400–2.0700 range during the entire six-month period. If at any time during the life of the contract, the underlying currency pair trades outside the range, company X has to buy the GBP at the worst-case rate of 1.9850.
Table 2 and Figure 2 demonstrate possible scenarios with assumed spot rates after six months.
As Figure 2 demonstrates, the range forward outperforms the market forward rate if the range holds, even if the spot rate closes above the forward rate.
Spot rate in six months time Barriers never reached Barrier reached Market forward rate
2.1000 1.9850 1.9850 1.9717
2.0700 1.9850 1.9850 1.9717
2.0699 1.8850 1.9850 1.9717
2.0500 1.8850 1.9850 1.9717
2.0300 1.8850 1.9850 1.9717
2.0250 1.8850 1.9850 1.9717
2.0200 1.8850 1.9850 1.9717
2.0150 1.8850 1.9850 1.9717
2.0100 1.8850 1.9850 1.9717
2.0050 1.8850 1.9850 1.9717
2.0000 1.8850 1.9850 1.9717
1.9950 1.8850 1.9850 1.9717
1.9900 1.8850 1.9850 1.9717
1.9850 1.8850 1.9850 1.9717
1.9800 1.8850 1.9850 1.9717
1.9750 1.8850 1.9850 1.9717
1.9700 1.8850 1.9850 1.9717
1.9650 1.8850 1.9850 1.9717
1.9600 1.8850 1.9850 1.9717
1.9550 1.8850 1.9850 1.9717
1.9500 1.8850 1.9850 1.9717
1.9450 1.8850 1.9850 1.9717
1.9401 1.8850 1.9850 1.9717
1.9400 1.9850 1.9850 1.9717
1.9300 1.9850 1.9850 1.9717
1.9250 1.9850 1.9850 1.9717
1.9200 1.9850 1.9850 1.9717
[Figure 2 plot: vertical axis from 1.8800 to 2.0000; horizontal axis: GBP/USD spot rate at maturity (1.9200 to 2.0700).]
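The range forward outcome depends only on whether the range held, not on where spot ends. A minimal sketch under that reading of the example follows; the function names and the sample paths are hypothetical, and the discrete path is only a stand-in for the continuous observation described in the text. The range boundaries themselves count as a breach, consistent with the Table 2 rows at 2.0700 and 1.9400.

```python
def range_held(path, lower=1.9400, upper=2.0700):
    """True if every observed spot stays strictly inside the range."""
    return all(lower < spot < upper for spot in path)


def range_forward_rate(path, best_case=1.8850, worst_case=1.9850):
    """GBP purchasing rate of the range forward for one observed spot path."""
    return best_case if range_held(path) else worst_case


quiet_path = [2.0000, 1.9800, 2.0300, 2.0100]      # hypothetical, stays in range
breach_path = [2.0000, 1.9500, 1.9400, 1.9900]     # hypothetical, touches 1.9400

market_forward = 1.9717
print(range_forward_rate(quiet_path), range_forward_rate(breach_path))
print(round(market_forward - range_forward_rate(quiet_path), 4))
```

The last line reproduces the 0.0867 improvement over the market forward rate quoted in the example.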
If we set the worst-case scenario even higher than 1.9850, we can widen the range or improve the best-case buying rate. Taking advantage of this flexibility, each company entering into a range forward can create a product that suits its risk appetite.

[4] Chisholm, A.M. (2004). Derivatives Demystified: A Step-by-Step Guide to Forwards, Futures, Swaps and Options, John Wiley & Sons.

Related Articles
$$[\phi(S_T - K)]^+ \, I\!I_{\{\eta S_t > \eta B,\ 0 \le t \le T\}} = [\phi(S_T - K)]^+ \, I\!I_{\{\min_{t \in [0,T]}(\eta S_t) > \eta B\}} \tag{4}$$
Using the density (8), the value of a barrier option can be written as the following integral
(up-and-out). +
To price knock-in options paying = e−rd T φ(S0 eσ x − K)
x=−∞ ηy≤min(0,ηx)
Digital options have a payoff

$$v(T, S_T) = I\!I_{\{\phi S_T \ge \phi K\}} \quad \text{(domestic paying)} \tag{15}$$
$$w(T, S_T) = S_T \, I\!I_{\{\phi S_T \ge \phi K\}} \quad \text{(foreign paying)} \tag{16}$$

In the domestic paying case, the payment of the fixed amount is in domestic currency, whereas in the foreign paying case the payment is in foreign currency. We obtain for the value functions

$$v(t, x) = D_d N(\phi d_-) \tag{17}$$
$$w(t, x) = x D_f N(\phi d_+) \tag{18}$$

of the digital options paying one unit of domestic and paying one unit of foreign currency, respectively.

One-touch Options

The payoff of a one-touch is given by

$$R \, I\!I_{\{\tau_B \le T\}} \tag{19}$$
$$\tau_B = \inf\{t \ge 0 : \eta S_t \le \eta B\} \tag{20}$$

This type of option pays a domestic cash amount R if a barrier B is hit any time before the expiration time. We use the binary variable η to describe whether B is a lower barrier (η = 1) or an upper barrier (η = −1). The stopping time τB is called the first hitting time. In FX markets, an option with this payoff is usually called a one-touch (option), one-touch-digital, or hit option. The modified payoff of a no-touch (option), $R \, I\!I_{\{\tau_B \ge T\}}$, describes a rebate, which is paid if a knock-in option is not knocked in by the time it expires and can be valued similarly by exploiting the identity

$$R \, I\!I_{\{\tau_B \le T\}} + R \, I\!I_{\{\tau_B > T\}} = R \tag{21}$$

Furthermore, we distinguish the time at which the rebate is paid and let

$$\omega = 0, \text{ if the rebate is paid at first hitting time } \tau_B \tag{22}$$
$$\omega = 1, \text{ if the rebate is paid at maturity time } T \tag{23}$$

where the former is also called instant one-touch and the latter is the default in FX options markets. It is important to mention that the payoff is one unit of the domestic currency. For a payment in the foreign currency EUR, one needs to exchange rd and rf, replace x and B by their reciprocal values, and change the sign of η; see Foreign Exchange Symmetries.
For the one-touch, we use the abbreviations $\vartheta_- = \sqrt{\theta_-^2 + 2(1-\omega) r_d}$ and

$$e_\pm = \frac{\pm \ln\frac{x}{B} - \sigma \vartheta_- \tau}{\sigma \sqrt{\tau}} \tag{24}$$

The theoretical value of the one-touch turns out to be

$$v(t, x) = R e^{-\omega r_d \tau} \left[ \left(\frac{B}{x}\right)^{\frac{\theta_- + \vartheta_-}{\sigma}} N(-\eta e_+) + \left(\frac{B}{x}\right)^{\frac{\theta_- - \vartheta_-}{\sigma}} N(\eta e_-) \right] \tag{25}$$

Note that ϑ− = |θ−| for rebates paid-at-end (ω = 1).
The risk-neutral probability of knocking out is given by

$$I\!P[\tau_B \le T] = I\!E\left[ I\!I_{\{\tau_B \le T\}} \right] = \frac{1}{R} e^{r_d T} v(0, S_0) \tag{26}$$

Properties of the First Hitting Time τB. As derived, for example, in [15], the first hitting time

$$\tilde{\tau} = \inf\{t \ge 0 : \theta t + W(t) = x\} \tag{27}$$

of a Brownian motion with drift θ and hit level x > 0 has the density

$$I\!P[\tilde{\tau} \in dt] = \frac{x}{t\sqrt{2\pi t}} \exp\left\{ -\frac{(x - \theta t)^2}{2t} \right\} dt, \quad t > 0, \tag{28}$$
4 Pricing Formulae for Foreign Exchange Options
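the cumulative distribution function

$$I\!P[\tilde{\tau} \le t] = N\left( \frac{\theta t - x}{\sqrt{t}} \right) + e^{2\theta x} N\left( \frac{-\theta t - x}{\sqrt{t}} \right), \quad t > 0, \tag{29}$$

the Laplace transform

$$I\!E e^{-\alpha \tilde{\tau}} = \exp\left\{ x\theta - x\sqrt{2\alpha + \theta^2} \right\}, \quad \alpha > 0, \ x > 0, \tag{30}$$

and the property

$$I\!P[\tilde{\tau} < \infty] = \begin{cases} 1 & \text{if } \theta \ge 0 \\ e^{2\theta x} & \text{if } \theta < 0 \end{cases} \tag{31}$$

For upper barriers B > S0, we can now rewrite the first passage time τB as

$$\tau_B = \inf\{t \ge 0 : S_t = B\} = \inf\left\{ t \ge 0 : W_t + \theta_- t = \frac{1}{\sigma} \ln\frac{B}{S_0} \right\} \tag{32}$$

The density of τB is hence

$$I\!P[\tilde{\tau}_B \in dt] = \frac{\frac{1}{\sigma} \ln\frac{B}{S_0}}{t\sqrt{2\pi t}} \exp\left\{ -\frac{\left( \frac{1}{\sigma} \ln\frac{B}{S_0} - \theta_- t \right)^2}{2t} \right\}, \quad t > 0 \tag{33}$$

To evaluate this integral, we introduce the notation

$$e_\pm(t) = \frac{\pm \ln\frac{S_0}{B} - \sigma \theta_- t}{\sigma \sqrt{t}} \tag{35}$$

and list the properties

$$e_-(t) - e_+(t) = \frac{2}{\sqrt{t}} \frac{1}{\sigma} \ln\frac{B}{S_0} \tag{36}$$
$$n(e_+(t)) = \left( \frac{B}{S_0} \right)^{-\frac{2\theta_-}{\sigma}} n(e_-(t)) \tag{37}$$
$$\frac{\partial e_\pm(t)}{\partial t} = \frac{e_\mp(t)}{2t} \tag{38}$$

We evaluate the integral in equation (34) by rewriting the integrand in such a way that the coefficients of the exponentials are the inner derivatives of the exponentials using properties (36)–(38).

$$\begin{aligned}
\int_0^T \frac{\frac{1}{\sigma} \ln\frac{B}{S_0}}{t\sqrt{2\pi t}} \exp\left\{ -\frac{\left( \frac{1}{\sigma} \ln\frac{B}{S_0} - \theta_- t \right)^2}{2t} \right\} dt
&= \frac{1}{\sigma} \ln\frac{B}{S_0} \int_0^T \frac{1}{t^{3/2}} \, n(e_-(t)) \, dt \\
&= \int_0^T \frac{1}{2t} \, n(e_-(t)) \left[ e_-(t) - e_+(t) \right] dt \\
&= \int_0^T \left[ -n(e_-(t)) \frac{e_+(t)}{2t} + \left( \frac{B}{S_0} \right)^{\frac{2\theta_-}{\sigma}} n(e_+(t)) \frac{e_-(t)}{2t} \right] dt
\end{aligned}$$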
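As a numerical cross-check of the one-touch value (25) against the hitting-time distribution (29), the following sketch prices a one-touch paid at maturity and compares the implied knock probability (26) with (29). It assumes θ− = (rd − rf)/σ − σ/2, a definition that is not stated in this excerpt; the function names and the sample market data are hypothetical.

```python
from math import erf, exp, log, sqrt


def N(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def one_touch(x, B, rd, rf, sigma, tau, R=1.0, omega=1, eta=-1):
    """One-touch value per equations (24) and (25)."""
    theta_minus = (rd - rf) / sigma - sigma / 2.0   # assumed definition
    vartheta = sqrt(theta_minus ** 2 + 2.0 * (1 - omega) * rd)

    def e(sign):
        return (sign * log(x / B) - sigma * vartheta * tau) / (sigma * sqrt(tau))

    first = (B / x) ** ((theta_minus + vartheta) / sigma) * N(-eta * e(+1))
    second = (B / x) ** ((theta_minus - vartheta) / sigma) * N(eta * e(-1))
    return R * exp(-omega * rd * tau) * (first + second)


def hitting_cdf(theta, level, t):
    """Equation (29): P[first hitting time of `level` <= t] for drift theta."""
    return (N((theta * t - level) / sqrt(t))
            + exp(2.0 * theta * level) * N((-theta * t - level) / sqrt(t)))


# Hypothetical market data; upper barrier, so eta = -1.
x, B, rd, rf, sigma, tau = 1.00, 1.05, 0.03, 0.06, 0.10, 0.5
value = one_touch(x, B, rd, rf, sigma, tau, R=1.0, omega=1, eta=-1)
prob_from_value = exp(rd * tau) * value             # equation (26) with R = 1
prob_from_cdf = hitting_cdf((rd - rf) / sigma - sigma / 2.0, log(B / x) / sigma, tau)
print(value, prob_from_value, prob_from_cdf)
```

The two probabilities agree because, for ω = 1, ϑ− reduces to |θ−| and (25) collapses to the discounted form of (29) with hit level ln(B/x)/σ.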
that period. Thus, lookback options (like Asian options) avoid the problem of European options that the underlying performed favorably throughout most of the option’s lifetime but moves into a nonfavorable direction toward maturity. Moreover, (unlike American Options) lookback options optimize the market timing, because the investor gets, by definition, the most favorable underlying price. As summarized in Table 3, lookback options can be structured in two different types with the extremum representing either the strike price or the underlying value. Figure 1 shows the development of the payoff of lookback options depending on a sample price path. In detail, we define

$$M_T = \max_{0 \le u \le T} S(u) \quad \text{and} \quad m_T = \min_{0 \le u \le T} S(u) \tag{57}$$

Table 3 Types of lookback options. The contract parameters T and X are the time to maturity and the strike price, respectively, and ST denotes the spot price at expiration time. Fixed strike lookback options are also called hindsight options

Payoff | Lookback type | Parameter used below in valuation
MT − ST | Floating strike put | φ = −1, η̄ = +1
ST − mT | Floating strike call | φ = +1, η̄ = +1
(MT − X)+ | Fixed strike call | φ = +1, η̄ = −1
(X − mT)+ | Fixed strike put | φ = −1, η̄ = −1

Variations of lookback options include partial lookback options, where the monitoring period for the underlying is shorter than the lifetime of the option. Conze and Viswanathan [2] present further variations like limited risk and American lookback options.
In theory, Garman pointed out in [4] that lookback options can also add value for risk managers, because floating (fixed) strike lookback options are good means to solve the timing problem of market entries (exits) (see [9]). For instance, a minimum strike call is suitable for avoiding missing the best exchange rate in currency-linked security issues. However, this right is very expensive. Since one buys a guarantee for the best possible exchange rate ever, lookback options are generally too expensive and hardly ever trade. Exceptions are performance notes, where lookback and average features are mixed, for example, performance notes paying say 50% of the best of 36 monthly average gold price returns.

Valuation

We consider the example of the floating strike lookback call. Again, the value of the option is given by

$$v(0, S_0) = I\!E\left[ e^{-r_d T} (S_T - m_T) \right] = S_0 e^{-r_f T} - e^{-r_d T} I\!E[m_T] \tag{58}$$
[Figure 1 plot: horizontal axis: trading day (1 to 45).]
Figure 1 Payoff profile of lookback calls (sample underlying price path, m = 20 trading days)
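The four payoffs of Table 3 follow directly from the running extrema defined in (57). The sketch below evaluates them on a discretely observed path; the function name, the sample path, and the strike are hypothetical.

```python
def lookback_payoffs(path, strike):
    """Payoffs of the four lookback types of Table 3 for an observed price path."""
    M_T, m_T, S_T = max(path), min(path), path[-1]   # extrema as in (57)
    return {
        "floating strike put": M_T - S_T,
        "floating strike call": S_T - m_T,
        "fixed strike call": max(M_T - strike, 0.0),
        "fixed strike put": max(strike - m_T, 0.0),
    }


# Hypothetical spot path and strike
path = [0.980, 0.970, 0.990, 1.010, 0.960, 0.985]
payoffs = lookback_payoffs(path, strike=0.980)
for name, amount in payoffs.items():
    print(f"{name}: {amount:.4f}")
```

The floating strike payoffs are never negative by construction, which is why Table 3 lists them without a positive-part operator.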
In the standard Black–Scholes model (1), the value can be derived using the reflection principle and results in

$$v(t, x) = \phi \left\{ x D_f N(\phi d_+) - K D_d N(\phi d_-) + \frac{1-\eta}{2} \phi D_d [\phi(R - X)]^+ + \eta x D_d \frac{1}{h} \left[ \left( \frac{x}{K} \right)^{-h} N\left( -\eta\phi(d_+ - h\sigma\sqrt{\tau}) \right) - e^{(r_d - r_f)\tau} N(-\eta\phi d_+) \right] \right\} \tag{59}$$

This value function has a removable discontinuity at h = 0 where it turns out to be

$$v(t, x) = \phi \left\{ x D_f N(\phi d_+) - K D_d N(\phi d_-) \ldots \right.$$

$$h = \frac{2(r_d - r_f)}{\sigma^2} \tag{61}$$
$$R = \text{running extremum: extremum observed until valuation time} \tag{62}$$
$$K = \begin{cases} R & \text{floating strike lookback} \\ -\phi \min(-\phi X, -\phi R) & \text{fixed strike lookback} \end{cases} \tag{63}$$
$$\eta = \begin{cases} +1 & \text{floating strike lookback} \\ -1 & \text{fixed strike lookback} \end{cases} \tag{64}$$

Note that this formula basically consists of that for a call option (the first two terms) plus another term. Conze and Viswanathan also show closed-form solutions for fixed strike lookback options and the variations mentioned above in [2]. Heynen and Kat develop equations for partial fixed and floating strike lookback options in [10]. We list some sample results in Table 4.

Table 4 Sample values for lookback options. For the input data, we used spot S0 = 0.9800, rd = 3%, rf = 6%, σ = 10%, τ = 1/12, running min R = 0.9500, running max R = 0.9900, and number of equidistant fixings m = 22

Payoff | Discretely sampled, equations (67) and (68) | Continuously sampled, equations (59) or (60)
MT − ST | 0.0231 | 0.0255
ST − mT | 0.0310 | 0.0320
(MT − 0.99)+ | 0.0107 | 0.0131
(0.97 − mT)+ | 0.0235 | 0.0246

Discrete Sampling

$$\beta = -\zeta(1/2)/\sqrt{2\pi} = 0.5826 \quad (\zeta \text{ being Riemann's } \zeta\text{-function}) \tag{65}$$
$$\alpha = e^{\phi\beta\sigma\sqrt{\tau/m}} \tag{66}$$

and obtain for fixed strike lookback options

$$v_m(t, x, r_d, r_f, \sigma, R, X, \phi, \eta) = v(t, x, r_d, r_f, \sigma, \alpha R, \alpha X, \phi, \eta)/\alpha \tag{67}$$

and for floating strike lookback options

$$v_m(t, x, r_d, r_f, \sigma, R, X, \phi, \eta) = \alpha v(t, x, r_d, r_f, \sigma, R/\alpha, X, \phi, \eta) - \phi(\alpha - 1) x D_f \tag{68}$$
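The constants in (65) and (66) are easy to evaluate for the Table 4 inputs. This is a small sketch: the numerical value of ζ(1/2) is hard-coded as a known constant rather than computed, and the function name is hypothetical.

```python
from math import exp, pi, sqrt

# Riemann zeta function at 1/2, hard-coded as a known constant.
ZETA_HALF = -1.4603545088095868

# Equation (65)
BETA = -ZETA_HALF / sqrt(2.0 * pi)


def alpha(phi, sigma, tau, m):
    """Equation (66): shift factor for m equidistant fixings over time tau."""
    return exp(phi * BETA * sigma * sqrt(tau / m))


# Table 4 inputs: sigma = 10%, tau = 1/12, m = 22
alpha_call = alpha(+1, 0.10, 1.0 / 12.0, 22)
alpha_put = alpha(-1, 0.10, 1.0 / 12.0, 22)
print(round(BETA, 4), alpha_call, alpha_put)
```

For these inputs α differs from 1 by roughly 0.36%, which is the size of the strike and extremum shift that (67) and (68) apply to the continuously sampled value.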
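The Curnow and Dunnett Integral Reduction Technique

Denote the n-dimensional multivariate normal integral with upper limits h1, . . . , hn and correlation

$$\int_{-\infty}^{x} e^{Az} N(az + B) n(z) \, dz = e^{\frac{A^2}{2}} N_2\left( x - A, \ \frac{aA + B}{\sqrt{1 + a^2}}; \ \frac{-a}{\sqrt{1 + a^2}} \right) \tag{80}$$

[Figure: lifetime of an instalment option. Timeline t0 = 0, t1, t2, . . . , tn−1, tn = T with instalments k1, k2, . . . , kn−1, kn; P = V0(t0); value functions V0(t), V1(t), . . . , Vn−2(t) (compound option), Vn−1(t) (standard option).]

A Closed-form Solution for the Value of an Instalment Option

Heuristically, the formula which is given in the Theorem 1 has the structure of the Black–Scholes formula in higher dimensions, namely, S0 Nn(·) − kn Nn(·) minus the later premium payments ki Ni(·)

Theorem 1 Let k = (k1, . . . , kn) be the strike price vector, t = (t1, . . . , tn) the vector of the exercise dates of an n-variate instalment option and φ = (φ1, . . . , φn) the vector of the put/call-indicators of these n options.
The value function of an n-variate instalment option is given by

$$\begin{aligned}
V_n(S_0, k, t, \phi) ={}& e^{-r_f t_n} S_0 \, \phi_1 \cdots \phi_n \, N_n\left[ \phi_1 \frac{\ln\frac{S_0}{S_1^*} + \sigma\theta_+ t_1}{\sigma\sqrt{t_1}}, \ \phi_2 \frac{\ln\frac{S_0}{S_2^*} + \sigma\theta_+ t_2}{\sigma\sqrt{t_2}}, \ \ldots, \ \phi_n \frac{\ln\frac{S_0}{S_n^*} + \sigma\theta_+ t_n}{\sigma\sqrt{t_n}}; \ R_n \right] \\
& - e^{-r_d t_n} k_n \, \phi_1 \cdots \phi_n \, N_n\left[ \phi_1 \frac{\ln\frac{S_0}{S_1^*} + \sigma\theta_- t_1}{\sigma\sqrt{t_1}}, \ \phi_2 \frac{\ln\frac{S_0}{S_2^*} + \sigma\theta_- t_2}{\sigma\sqrt{t_2}}, \ \ldots, \ \phi_n \frac{\ln\frac{S_0}{S_n^*} + \sigma\theta_- t_n}{\sigma\sqrt{t_n}}; \ R_n \right] \\
& - e^{-r_d t_{n-1}} k_{n-1} \, \phi_1 \cdots \phi_{n-1} \, N_{n-1}\left[ \phi_1 \frac{\ln\frac{S_0}{S_1^*} + \sigma\theta_- t_1}{\sigma\sqrt{t_1}}, \ \phi_2 \frac{\ln\frac{S_0}{S_2^*} + \sigma\theta_- t_2}{\sigma\sqrt{t_2}}, \ \ldots, \ \phi_{n-1} \frac{\ln\frac{S_0}{S_{n-1}^*} + \sigma\theta_- t_{n-1}}{\sigma\sqrt{t_{n-1}}}; \ R_{n-1} \right] \\
& \ \vdots \\
& - e^{-r_d t_2} k_2 \, \phi_1 \phi_2 \, N_2\left[ \phi_1 \frac{\ln\frac{S_0}{S_1^*} + \sigma\theta_- t_1}{\sigma\sqrt{t_1}}, \ \phi_2 \frac{\ln\frac{S_0}{S_2^*} + \sigma\theta_- t_2}{\sigma\sqrt{t_2}}; \ \rho_{12} \right] \\
& - e^{-r_d t_1} k_1 \, \phi_1 \, N\left[ \phi_1 \frac{\ln\frac{S_0}{S_1^*} + \sigma\theta_- t_1}{\sigma\sqrt{t_1}} \right]
\end{aligned} \tag{81}$$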
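• $d_\pm = \dfrac{\ln\frac{S}{K} + \sigma\theta_\pm\tau}{\sigma\sqrt{\tau}} = \dfrac{\ln\frac{f}{K} \pm \frac{\sigma^2}{2}\tau}{\sigma\sqrt{\tau}}$
• $n(t) = \dfrac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}t^2} = n(-t)$
• $N(x) = \int_{-\infty}^{x} n(t)\, dt = 1 - N(-x)$

Greeks

Greeks are derivatives of the value function with respect to model and contract parameters. They are important information for traders and have become standard information provided by front-office systems. More details on Greeks and the relations among Greeks are presented in [5] or [6]. We now list some of them for vanilla options:

• Dual delta.

\[ \frac{\partial v}{\partial K} = -\phi e^{-r_d\tau} N(\phi d_-) \tag{9} \]

The forward dual delta

\[ N(\phi d_-) = \mathbb{P}[\phi S_T \ge \phi K] \tag{10} \]

\[ \frac{\partial d_\pm}{\partial \sigma} = -\frac{d_\mp}{\sigma} \tag{12} \]

\[ \frac{\partial d_\pm}{\partial r_d} = \frac{\sqrt{\tau}}{\sigma} \tag{13} \]

\[ \frac{\partial d_\pm}{\partial r_f} = -\frac{\sqrt{\tau}}{\sigma} \tag{14} \]

\[ S e^{-r_f\tau} n(d_+) = K e^{-r_d\tau} n(d_-) \tag{15} \]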
In particular, we learn that the absolute values of a put delta and a call delta do not add up exactly to 1, but only to the positive number $e^{-r_f\tau}$. They add up to 1 approximately if either the time to expiration τ is short or if the foreign interest rate $r_f$ is close to 0. For this reason, traders often prefer to work with forward deltas, because these are symmetric in the sense that a 25-delta call is a 75-delta put.

Although the choice K = f produces identical values for call and put, we seek the delta-symmetric strike Ǩ, which produces absolutely identical deltas (spot, forward, or driftless). This condition implies $d_+ = 0$ and thus

\[ \check{K} = f e^{\frac{\sigma^2}{2}\tau} \tag{18} \]

in which case the absolute delta is $e^{-r_f\tau}/2$. In particular, we learn that always Ǩ > f, that is, there cannot be a put and a call with identical values and deltas. This is natural as the payoffs of calls and puts are not symmetric to start with: the call has unlimited upside potential, whereas the put payoff is always bounded by the strike. Note that the strike Ǩ is usually chosen as the middle strike when trading a straddle or a butterfly. Similarly the dual-delta-symmetric strike $\hat{K} = f e^{-\frac{\sigma^2}{2}\tau}$ can be derived from the condition $d_- = 0$.

Note that the delta-symmetric strike Ǩ also maximizes gamma and vega of a vanilla option and is thus often considered as a center of symmetry.

Homogeneity-based Relationships

Space Homogeneity

Comparing the coefficients of S and K in equations (3) and (20) leads to suggestive results for the delta $v_S$ and dual delta $v_K$. This space-homogeneity is the reason behind the simplicity of the delta formulas, whose tedious computation can be saved this way.

Time Homogeneity

We can perform a similar computation for the time-affected parameters and obtain the obvious equation

\[ v(S, K, T, t, \sigma, r_d, r_f, \phi) = v\left(S, K, \frac{T}{a}, \frac{t}{a}, \sqrt{a}\,\sigma, a r_d, a r_f, \phi\right) \quad \text{for all } a > 0 \tag{21} \]

Differentiating both sides with respect to a and then setting a = 1 yields

\[ 0 = \tau v_t + \frac{1}{2}\sigma v_\sigma + r_d v_{r_d} + r_f v_{r_f} \tag{22} \]

Of course, this can also be verified by direct computation. The overall use of such equations is to generate double checking benchmarks when computing Greeks. These homogeneity methods can easily be extended to other more complex options.

Put–Call Symmetry

By put–call symmetry, we understand the relationship (see [1–4])
the put–call parity where the call and the put have the same value.

Rates Symmetry

Direct computation shows that the rates symmetry

\[ \frac{\partial v}{\partial r_d} + \frac{\partial v}{\partial r_f} = -\tau v \tag{24} \]

holds for vanilla options. This relationship, in fact, holds for all European options and a wide class of path-dependent options as shown in [6].

Foreign–Domestic Symmetry

One can directly verify the FOR–DOM symmetry

\[ \frac{1}{S}\, v(S, K, T, t, \sigma, r_d, r_f, \phi) = K\, v\left(\frac{1}{S}, \frac{1}{K}, T, t, \sigma, r_f, r_d, -\phi\right) \tag{25} \]

This equality can be viewed as one of the faces of put–call symmetry. The reason is that the value of an option can be computed both in a domestic as well as in a foreign scenario. We consider the example of $S_t$ modeling the exchange rate of EUR/USD. In New York, the call option $(S_T - K)^+$ costs $v(S, K, T, t, \sigma, r_{usd}, r_{eur}, 1)$ USD and hence $v(S, K, T, t, \sigma, r_{usd}, r_{eur}, 1)/S$ EUR. This EUR call option can also be viewed as a USD put option with payoff $K\left(\frac{1}{K} - \frac{1}{S_T}\right)^+$. This option costs $K\, v\left(\frac{1}{S}, \frac{1}{K}, T, t, \sigma, r_{eur}, r_{usd}, -1\right)$ EUR in Frankfurt, because $S_t$ and $\frac{1}{S_t}$ have the same volatility. Of course, the New York value and the Frankfurt value must agree, which leads to equation (25). This can also be seen as a change of measure to the foreign discount bond as numeraire (see, e.g., [7]).

Exotic Options

In FX markets, one can use many symmetry relationships for exotic options.

Digital Options

For example, let us define the payoff of digital options by

\[ \text{DOM paying: } \mathbf{1}_{\{\phi S_T \ge \phi K\}} \tag{26} \]

\[ \text{FOR paying: } S_T \mathbf{1}_{\{\phi S_T \ge \phi K\}} \tag{27} \]

where the contractual parameters are the strike K, the expiration time T, and the type φ, a binary variable, which takes the value +1 in the case of a call and −1 in the case of a put. Then we observe that a DOM-paying digital call in the currency pair FOR–DOM with a value of $v_d$ units of domestic currency must be worth the same as a FOR-paying digital put in the currency pair DOM–FOR with a value of $v_f$ units of foreign currency. And since we are looking at the same product, we conclude that $v_d = v_f \cdot S$, where S is the initial spot of FOR–DOM.

Touch Options

This key idea generalizes from the path-independent digitals to touch products. Consider the value function for one-touch in EUR-USD paying 1 USD. If we want to find the value function of a one-touch in EUR-USD paying 1 EUR, we can price the one-touch in USD-EUR paying 1 EUR using the known value function with rates $r_d$ and $r_f$ exchanged, volatility unchanged, using the formula for a one-touch in EUR-USD paying 1 USD. We also note that an upper one-touch in EUR-USD becomes a lower one-touch in USD-EUR. The result we get is in domestic currency, which is EUR in USD-EUR notation. To convert it into a USD price, we just multiply by the EUR-USD spot S.

Barrier Options

For a standard knockout barrier option, we let the value function be

\[ v(S, r_d, r_f, \sigma, K, B, T, t, \phi, \eta) \tag{28} \]

where B denotes the barrier and the variable η takes the value +1 for a lower barrier and −1 for an upper barrier. With this notation at hand, we can state our FOR–DOM symmetry as

\[ v(S, r_d, r_f, \sigma, K, B, T, t, \phi, \eta) = v\left(\frac{1}{S}, r_f, r_d, \sigma, \frac{1}{K}, \frac{1}{B}, T, t, -\phi, -\eta\right) S K \tag{29} \]

Note that the rates $r_d$ and $r_f$ have been interchanged on purpose. This implies that if we know how to price barrier contracts with upper barriers, we can derive the formulas for lower barriers.

Table 1 Standard market quotation of major currency pairs with sample spot prices
Currency pair Default quotation Sample quote
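The delta-symmetric strike (18), the time homogeneity (22), the rates symmetry (24) and the foreign–domestic symmetry (25) for vanilla options can all be verified numerically. The sketch below assumes the standard Garman–Kohlhagen value function (not reproduced in this excerpt) and uses central finite differences for the Greeks; the helper names `vanilla` and `partial` and the sample market data are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cumulative distribution function


def vanilla(S, K, T, t, sigma, rd, rf, phi):
    """Assumed Garman-Kohlhagen value v(S, K, T, t, sigma, rd, rf, phi)."""
    tau = T - t
    dp = (log(S / K) + (rd - rf + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    dm = dp - sigma * sqrt(tau)
    return phi * (S * exp(-rf * tau) * N(phi * dp) - K * exp(-rd * tau) * N(phi * dm))


def partial(f, args, i, h=1e-5):
    """Central finite difference of f in its i-th positional argument."""
    up, dn = list(args), list(args)
    up[i] += h
    dn[i] -= h
    return (f(*up) - f(*dn)) / (2 * h)


# Argument order: S, K, T, t, sigma, rd, rf, phi (hypothetical sample data).
args = [1.20, 1.25, 1.0, 0.0, 0.10, 0.03, 0.025, 1]
S, K, T, t, sigma, rd, rf, phi = args
tau = T - t
v = vanilla(*args)

# Time homogeneity (22): tau*v_t + sigma*v_sigma/2 + rd*v_rd + rf*v_rf = 0.
time_residual = (tau * partial(vanilla, args, 3)
                 + 0.5 * sigma * partial(vanilla, args, 4)
                 + rd * partial(vanilla, args, 5)
                 + rf * partial(vanilla, args, 6))

# Rates symmetry (24): v_rd + v_rf = -tau * v.
rates_residual = partial(vanilla, args, 5) + partial(vanilla, args, 6) + tau * v

# Foreign-domestic symmetry (25): rates swapped, put/call flipped.
fordom_residual = v / S - K * vanilla(1 / S, 1 / K, T, t, sigma, rf, rd, -phi)

# Delta-symmetric strike (18): call and put deltas of equal size exp(-rf*tau)/2.
f = S * exp((rd - rf) * tau)
K_sym = f * exp(0.5 * sigma**2 * tau)
call_delta = partial(vanilla, [S, K_sym, T, t, sigma, rd, rf, 1], 0)
put_delta = partial(vanilla, [S, K_sym, T, t, sigma, rd, rf, -1], 0)

print(time_residual, rates_residual, fordom_residual)
print(call_delta, put_delta, exp(-rf * tau) / 2)
```

All three residuals should vanish up to finite-difference and rounding error, which illustrates how such relations can serve as benchmarks for a Greeks implementation.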
Table 2 Standard market quotation types for option values. In the example, we take FOR = EUR, DOM = USD, S = 1.2000, rd = 3.0%, rf = 2.5%, σ = 10%, K = 1.2500, T = 1 year, φ = +1 (call), notional = 1 000 000 EUR = 1 250 000 USD. For the pips, the quotation 291.48 USD pips per EUR is also sometimes stated as 2.9148% USD per 1 EUR. Similarly, the 194.32 EUR pips per USD can also be quoted as 1.9432% EUR per 1 USD

Name            Symbol    Value in units of       Example
Domestic cash   d         DOM                     29 148 USD
Foreign cash    f         FOR                     24 290 EUR
% domestic      % d       DOM per unit of DOM     2.3318% USD
% foreign       % f       FOR per unit of FOR     2.4290% EUR
Domestic pips   d pips    DOM per unit of FOR     291.48 USD pips per EUR
Foreign pips    f pips    FOR per unit of DOM     194.32 EUR pips per USD
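The six quotation types of Table 2 are related by the spot S, the strike K and the notional. A minimal sketch of the conversions follows, starting from the value in DOM per unit of FOR notional (0.029148 USD per EUR, taken from the table rather than recomputed). The function name `quotations` is hypothetical, and one pip is assumed to be 0.0001, which is what the table's figures imply.

```python
def quotations(value_dom_per_for, spot, strike, notional_for):
    """Convert a value in DOM per unit of FOR into the quotation types of Table 2."""
    v = value_dom_per_for
    return {
        "domestic cash": v * notional_for,             # in DOM
        "foreign cash": v * notional_for / spot,       # in FOR
        "% domestic": 100 * v / strike,                # DOM per unit of DOM, in %
        "% foreign": 100 * v / spot,                   # FOR per unit of FOR, in %
        "domestic pips": 10_000 * v,                   # DOM pips per unit of FOR
        "foreign pips": 10_000 * v / (spot * strike),  # FOR pips per unit of DOM
    }


q = quotations(0.029148, spot=1.2000, strike=1.2500, notional_for=1_000_000)
for name, value in q.items():
    print(f"{name:15s} {value:12.4f}")
```

The division by K in the "% domestic" row reflects that a notional of 1 EUR corresponds to K USD.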
6 Foreign Exchange Symmetries
to first order. To interpret this relationship, note that the minus sign refers to selling DOM instead of buying FOR, and the multiplication by S adjusts the amounts. Furthermore, we divide by the strike, because a call on 1 EUR corresponds to K USD puts. More details on delta conventions are contained in Foreign Exchange Options: Delta- and At-the-money Conventions.

For consistency, the premium needs to be incorporated into the delta hedge, since a premium in foreign currency will already hedge part of the option’s delta risk. To make this clear, let us consider EUR-USD. In the standard arbitrage theory, v(S) denotes the value or premium in USD of an option with 1 EUR notional, if the spot is at S, and the raw delta $v_S$ denotes the number of EUR to buy for the delta hedge. Therefore, $S v_S$ is the number of USD to sell. If now the premium is paid in EUR rather than in USD, then we already have $v/S$ EUR, and the number of EUR to buy has to be reduced by this amount, that is, if EUR is the premium currency, we need to buy $v_S - v/S$ EUR for the delta hedge or equivalently sell $S v_S - v$ USD.

To quote an FX option, we need to first sort out which currency is domestic, which is foreign, what is the notional currency of the option, and what is the premium currency. Unfortunately, this is not symmetric, since the counterparty might have another notion of domestic currency for a given currency pair. Hence, in the professional interbank market, there is one notion of delta per currency pair. Normally, it is the left-hand side delta of the Fenics(a) screen if the option is traded in left-hand side premium, which is normally the standard, and right-hand side delta if it is traded with right-hand-side premium, for example, EUR/USD lhs, USD/JPY lhs, EUR/JPY lhs, AUD/USD rhs, and so on. Since OTM options are traded most of the time, the difference is not huge and hence does not create a huge spot risk.

Additionally, the standard delta per currency pair (left-hand-side delta in Fenics for most cases) is used to quote options in volatility. This has to be specified by currency pair.

This standard interbank notion must be adapted to the real delta risk of the bank for an automated trading system. For currency pairs where the risk-free currency of the bank is the domestic or base currency, it is clear that the delta is the raw delta of the option, and for risky premium this premium must be included. In the opposite case, the risky premium and the market value must be taken into account for the base currency premium, such that these offset each other. And for premium in underlying currency of the contract, the market value needs to be taken into account. In this way, the delta hedge is invariant with respect to the risky currency notion of the bank, for example, the delta is the same for a USD-based bank and an EUR-based bank.

Example

We consider two examples in Tables 3 and 4 to compare the various versions of deltas that are used in practice.

Greeks in Terms of Deltas

In FX markets, the moneyness of vanilla options is always expressed in terms of deltas and prices

Table 3 1 y EUR call USD put strike K = 0.9090 for a EUR-based bank. Market data: spot S = 0.9090, volatility σ = 12%, EUR rate rf = 3.96%, USD rate rd = 3.57%. The raw delta is 49.15%EUR and the value is 4.427%EUR

Delta currency   Prem currency   Fenics           Formula             Delta
% EUR            EUR             lhs              δraw − P            44.72
% EUR            USD             rhs              δraw                49.15
% USD            EUR             rhs [flip F4]    −(δraw − P)S/K      −44.72
% USD            USD             lhs [flip F4]    −(δraw)S/K          −49.15

Table 4 1 y EUR call USD put strike K = 0.7000 for a EUR-based bank. Market data: spot S = 0.9090, volatility σ = 12%, EUR rate rf = 3.96%, USD rate rd = 3.57%. The raw delta is 94.82%EUR and the value is 21.88%EUR

Delta currency   Prem currency   Fenics           Formula             Delta
% EUR            EUR             lhs              δraw − P            72.94
% EUR            USD             rhs              δraw                94.82
% USD            EUR             rhs [flip F4]    −(δraw − P)S/K      −94.72
% USD            USD             lhs [flip F4]    −δraw S/K           −123.13
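The delta columns of Tables 3 and 4 follow from the raw delta, the premium P, the spot and the strike through the formulas listed in the tables. A minimal sketch that re-derives them is given below; the function name `delta_conventions` is hypothetical, and the raw delta and premium are taken from the table captions rather than recomputed from the market data.

```python
def delta_conventions(delta_raw, premium, spot, strike):
    """Deltas in per cent, keyed by (delta currency, premium currency)."""
    return {
        ("% EUR", "EUR"): delta_raw - premium,
        ("% EUR", "USD"): delta_raw,
        ("% USD", "EUR"): -(delta_raw - premium) * spot / strike,
        ("% USD", "USD"): -delta_raw * spot / strike,
    }


table3 = delta_conventions(49.15, 4.427, spot=0.9090, strike=0.9090)
table4 = delta_conventions(94.82, 21.88, spot=0.9090, strike=0.7000)

for name, table in (("Table 3", table3), ("Table 4", table4)):
    for (delta_ccy, prem_ccy), value in table.items():
        print(f"{name}: delta in {delta_ccy}, premium in {prem_ccy}: {value:8.2f}")
```

For the at-the-money-spot case of Table 3 the factor S/K equals 1, so the USD deltas differ from the EUR deltas only in sign; in Table 4 the factor S/K is about 1.2986, which explains a delta beyond 100% in absolute value.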
Table 5 Vega in terms of Delta for the standard maturity labels and various deltas. It shows that one can neutralize a vega position of a long 9M 35 delta call with 4 short 1M 20 delta puts. This offsetting, however, is not a static hedge, only a momentary one

Maturity / Δ   50%   45%   40%   35%   30%   25%   20%   15%   10%    5%
1D               2     2     2     2     2     2     1     1     1     1
1W               6     5     5     5     5     4     4     3     2     1
2W               8     8     8     7     7     6     5     5     3     2
1M              11    11    11    11    10     9     8     7     5     3
2M              16    16    16    15    14    13    11     9     7     4
3M              20    20    19    18    17    16    14    12     9     5
6M              28    28    27    26    24    22    20    16    12     7
9M              34    34    33    32    30    27    24    20    15     9
1Y              39    39    38    36    34    31    28    23    17    10
2Y              53    53    52    50    48    44    39    32    24    14
3Y              63    63    62    60    57    53    47    39    30    18
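Individual entries of Table 5 can be recomputed from the delta-to-vega mapping of equation (42). The sketch below assumes that the table shows vega in % foreign, that is, equation (42) divided by the spot and scaled by 100, with $r_f$ = 3%; the scaling is inferred from the magnitudes in the table, and the year fractions chosen for the maturity labels are assumptions as well. The function name `vega_pct_foreign` is hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal: pdf is n, inv_cdf is the inverse of N


def vega_pct_foreign(delta, tau, rf):
    """100 * exp(-rf*tau) * sqrt(tau) * n(N^{-1}(exp(rf*tau) * delta))."""
    return 100 * exp(-rf * tau) * sqrt(tau) * _STD.pdf(_STD.inv_cdf(exp(rf * tau) * delta))


# Assumed year fractions for three of the maturity labels.
maturities = {"1M": 1 / 12, "9M": 9 / 12, "1Y": 1.0}
deltas = [0.50, 0.35, 0.20]

for label, tau in maturities.items():
    row = [round(vega_pct_foreign(d, tau, rf=0.03)) for d in deltas]
    print(label, row)
```

With these assumptions the rounded 9M 35-delta vega equals four times the rounded 1M 20-delta vega, which is the offsetting position mentioned in the caption of Table 5.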
Vega in Terms of Delta

The mapping

\[ \Delta \mapsto v_\sigma = S e^{-r_f\tau} \sqrt{\tau}\, n\left(N^{-1}(e^{r_f\tau}\Delta)\right) \tag{42} \]

is important for trading vanilla options. Observe that this function does not depend on $r_d$ or σ, just on $r_f$. Quoting vega in % foreign will additionally remove the spot dependence. This means that for a moderately stable foreign term structure curve, traders will be able to use a moderately stable vega matrix. For $r_f$ = 3%, the vega matrix is presented in Table 5.

The most important result of this paragraph is the fact that vega can be written in terms of delta, which is the main reason why the FX market uses implied volatility quotation based on deltas in the first place.

End Notes

a. Fenics is one of the standard tools for FX option pricing (see http://www.fenics.com/)

References

[1] Bates, D. (1988). Crashes, Options and International Asset Substitutability. PhD Dissertation, Economics Department, Princeton University.
[2] Bates, D. (1991). The crash of 1987: was it expected? The evidence from options markets, The Journal of Finance 46, 1009–1044.
[3] Bowie, J. & Carr, P. (1994). Static simplicity, Risk Magazine (7), 45–49. http://www.riskpublications.com
[4] Carr, P. (1994). European Put Call Symmetry, Cornell University Working Paper.
[5] Hakala, J. & Wystup, U. (2002). Foreign Exchange Risk, Risk Publications, London. http://www.mathfinance.com/FXRiskBook/.
[6] Reiss, O. & Wystup, U. (2001). Efficient computation of option price sensitivities using homogeneity and other tricks, The Journal of Derivatives 9(2), 41–53.
[7] Shreve, S.E. (2004). Stochastic Calculus for Finance II. Springer.

Further Reading

Wystup, U. (2006). FX Options and Structured Products, Wiley Finance Series, Wiley. http://fxoptions.mathfinance.com/.

Related Articles

Black–Scholes Formula; Foreign Exchange Options: Delta- and At-the-money Conventions; Foreign Exchange Markets; Put–Call Parity.

UWE WYSTUP
Quanto Options

A quanto option can be any cash-settled option whose payoff is converted into a third currency at maturity at a prespecified rate, called the quanto factor. There can be quanto plain vanilla, quanto barriers, quanto forward starts, quanto corridors, and so on. The arbitrage pricing theory and the fundamental theorem of asset pricing, also covered for example in [3] and [2], allow the computation of option values. Other references include Options: Basic Definitions; Option Pricing: General Principles; Foreign Exchange Markets.

Foreign Exchange Quanto Drift Adjustment

We take the example of a gold contract with underlying XAU/USD in XAU–USD quotation that is quantoed into EUR. Since the payoff is in EUR, we let EUR be the numeraire or domestic or base currency and consider a Black–Scholes model

\[ \text{XAU–EUR: } dS_t^{(3)} = (r_{EUR} - r_{XAU}) S_t^{(3)}\, dt + \sigma_3 S_t^{(3)}\, dW_t^{(3)} \tag{1} \]

\[ \text{USD–EUR: } dS_t^{(2)} = (r_{EUR} - r_{USD}) S_t^{(2)}\, dt + \sigma_2 S_t^{(2)}\, dW_t^{(2)} \tag{2} \]

\[ dW_t^{(3)}\, dW_t^{(2)} = -\rho_{23}\, dt \tag{3} \]

where we use a minus sign in front of the correlation, because both $S^{(3)}$ and $S^{(2)}$ have the same base currency (DOM), which is EUR in this case. The scenario is displayed in Figure 1. The actual underlying is then

\[ \text{XAU–USD: } S_t^{(1)} = \frac{S_t^{(3)}}{S_t^{(2)}} \tag{4} \]

Using Itô’s formula, we first obtain

\[ d\frac{1}{S_t^{(2)}} = -\frac{1}{(S_t^{(2)})^2}\, dS_t^{(2)} + \frac{1}{(S_t^{(2)})^3}\, (dS_t^{(2)})^2 = (r_{USD} - r_{EUR} + \sigma_2^2) \frac{1}{S_t^{(2)}}\, dt - \sigma_2 \frac{1}{S_t^{(2)}}\, dW_t^{(2)} \tag{5} \]

and hence

\[
\begin{aligned}
dS_t^{(1)} &= \frac{1}{S_t^{(2)}}\, dS_t^{(3)} + S_t^{(3)}\, d\frac{1}{S_t^{(2)}} + dS_t^{(3)}\, d\frac{1}{S_t^{(2)}} \\
&= \frac{S_t^{(3)}}{S_t^{(2)}} (r_{EUR} - r_{XAU})\, dt + \frac{S_t^{(3)}}{S_t^{(2)}} \sigma_3\, dW_t^{(3)} + \frac{S_t^{(3)}}{S_t^{(2)}} (r_{USD} - r_{EUR} + \sigma_2^2)\, dt - \frac{S_t^{(3)}}{S_t^{(2)}} \sigma_2\, dW_t^{(2)} + \frac{S_t^{(3)}}{S_t^{(2)}} \rho_{23} \sigma_2 \sigma_3\, dt \\
&= (r_{USD} - r_{XAU} + \sigma_2^2 + \rho_{23} \sigma_2 \sigma_3) S_t^{(1)}\, dt + S_t^{(1)} (\sigma_3\, dW_t^{(3)} - \sigma_2\, dW_t^{(2)})
\end{aligned}
\tag{6}
\]

Since $S_t^{(1)}$ is a geometric Brownian motion with volatility $\sigma_1$, we introduce a new Brownian motion $W_t^{(1)}$ and find

\[ dS_t^{(1)} = (r_{USD} - r_{XAU} + \sigma_2^2 + \rho_{23} \sigma_2 \sigma_3) S_t^{(1)}\, dt + \sigma_1 S_t^{(1)}\, dW_t^{(1)} \tag{7} \]

Now Figure 1 and the law of cosine imply

\[ \sigma_3^2 = \sigma_1^2 + \sigma_2^2 - 2\rho_{12} \sigma_1 \sigma_2 \tag{8} \]

\[ \sigma_1^2 = \sigma_2^2 + \sigma_3^2 + 2\rho_{23} \sigma_2 \sigma_3 \tag{9} \]

which yields

\[ \sigma_2^2 + \rho_{23} \sigma_2 \sigma_3 = \rho_{12} \sigma_1 \sigma_2 \tag{10} \]

As explained in the currency triangle in Figure 1, $\rho_{12}$ is the correlation between XAU–USD and USD–EUR, whence $\rho = -\rho_{12}$ is the correlation between XAU–USD and EUR–USD. Inserting this into equation (7), we obtain the usual formula for the drift adjustment

\[ dS_t^{(1)} = (r_{USD} - r_{XAU} - \rho \sigma_1 \sigma_2) S_t^{(1)}\, dt + \sigma_1 S_t^{(1)}\, dW_t^{(1)} \]
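The step from the law of cosine, equations (8) and (9), to the identity (10) and on to the drift adjustment can be checked with numbers. The following is a minimal sketch with hypothetical volatilities and correlation; the function name `quanto_triangle` is an assumption made for illustration.

```python
from math import sqrt


def quanto_triangle(sigma2, sigma3, rho23):
    """Return (sigma1, rho12) implied by equations (9) and (8)."""
    sigma1 = sqrt(sigma2**2 + sigma3**2 + 2 * rho23 * sigma2 * sigma3)   # equation (9)
    rho12 = (sigma1**2 + sigma2**2 - sigma3**2) / (2 * sigma1 * sigma2)  # (8) solved for rho12
    return sigma1, rho12


# Hypothetical inputs: USD-EUR volatility, XAU-EUR volatility, and rho23.
sigma2, sigma3, rho23 = 0.10, 0.12, -0.5
sigma1, rho12 = quanto_triangle(sigma2, sigma3, rho23)

drift_term = sigma2**2 + rho23 * sigma2 * sigma3  # extra drift appearing in (7)
identity_rhs = rho12 * sigma1 * sigma2            # right-hand side of (10)

rho = -rho12                                      # correlation of XAU-USD with EUR-USD
drift_adjustment = -rho * sigma1 * sigma2         # extra drift after inserting (10) into (7)

print(sigma1, rho12)
print(drift_term, identity_rhs, drift_adjustment)
```

All three printed drift quantities coincide up to rounding, so the extra drift of XAU–USD under the EUR numeraire is $-\rho\sigma_1\sigma_2$, with ρ the correlation between XAU–USD and EUR–USD.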
Table 3 Example of a performance-linked deposit, where the investor is paid 30% of the EUR–GBP return. Note that in GBP the day count convention in the money market is act(a)/365 rather than act/360

Notional                      5 000 000 GBP
Start date                    3 June 2005
Maturity                      2 September 2005 (91 days)
Number of days (act)          91
Money market reference rate   4.00% act/365
EUR–GBP spot reference        0.7000
Minimum rate                  2.00% act/365
Additional coupon             30% · 100 · max[S_T − 0.7000, 0] / 0.7000 act/365
S_T                           EUR–GBP fixing on 31 August 2005 (88 days)
Fixing source                 ECB

(a) act = actual number of days

be to buy an up-and-out call with barrier at 0.7400 and 75% participation, where we would find the best case to be 0.7399 with an additional coupon of 4.275% per annum, which would lead to a total coupon of 6.275% per annum.

Composition

• From the money market we get 49 863.01 GBP at the maturity date.
• The investor buys a EUR call GBP put with strike 0.7000 and with notional 1.5 million GBP.
• The offer price of the call is 26 220.73 GBP, assuming a volatility of 8.0% and a EUR rate of 2.50%.
• The deferred premium is 24 677.11 GBP.
• The investor receives a minimum payment of 24 931.51 GBP.
• Subtracting the deferred premium and the minimum payment from the money market leaves a sales margin of 254.40 GBP (which is extremely poor).
• Note that the option the investor is buying must be cash-settled.

foreign exchange, however, is the deposit currency being different from the domestic currency of the exchange rate, which is quoted in FOR–DOM (foreign–domestic), meaning how many units of domestic currency are required to buy one unit of foreign currency. So, if we have a EUR investor who wishes to participate in a EUR–USD movement, we need to quanto the domestic payoff currency (USD) into the foreign currency (EUR). The payoff of the EUR call USD put

\[ [S_T - K]^+ \tag{24} \]

is in domestic currency (USD). Of course, this payoff can be converted into the foreign currency (EUR) at maturity, but the question is, at what rate? If we convert at rate $S_T$, which is what we could do in the spot market at no cost, then the investor buys a vanilla EUR call. But here, the investor receives a coupon given by

\[ p \cdot \frac{\max[S_T - S_0, 0]}{S_T} \tag{25} \]

If the investor wishes to have performance of equation (23) rather than equation (25), then the payoff at maturity is converted at a rate of 1.0000 into EUR, and this rate is set at the beginning of the trade. This is the quanto factor, and the vanilla is actually a self-quanto vanilla, that is, a EUR call USD put, cash settled in EUR, where the payoff in USD is converted into EUR at a rate of 1.0000. This self-quanto vanilla can be valued by inverting the exchange rate, that is, looking at USD–EUR. This way the valuation can incorporate the smile of EUR–USD.

Similar considerations need to be taken into account if the currency pair to participate in does not contain the deposit currency at all. A typical situation is a EUR investor, who wishes to participate in the gold price, which is measured in USD, so the investor needs to buy a XAU call USD put quantoed into EUR. So the investor is promised a coupon as in equation (23) for a XAU–USD underlying, where the coupon is paid in EUR; this implicitly means that we must use a quanto plain vanilla with a quanto factor of 1.0000.
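Returning to the performance-linked deposit of Table 3, the cash amounts in the Composition list can be re-derived with a few lines of arithmetic. This is a minimal sketch: the deferred premium is taken as quoted rather than derived from the option price, and the variable names are illustrative.

```python
# Terms from Table 3.
notional = 5_000_000             # GBP
days, basis = 91, 365            # act/365 day count
money_market_rate = 0.04
minimum_rate = 0.02
participation = 0.30

# Interest earned in the money market and the guaranteed minimum payment.
money_market_interest = notional * money_market_rate * days / basis
minimum_payment = notional * minimum_rate * days / basis

# Notional of the EUR call GBP put bought for the investor.
option_notional = participation * notional

# Deferred premium as quoted in the Composition list.
deferred_premium = 24_677.11     # GBP
sales_margin = money_market_interest - deferred_premium - minimum_payment

# Best case of the up-and-out alternative: 75% participation, spot at 0.7399.
best_case_coupon = 0.75 * (0.7399 - 0.7000) / 0.7000

print(f"money market interest: {money_market_interest:,.2f} GBP")
print(f"minimum payment:       {minimum_payment:,.2f} GBP")
print(f"option notional:       {option_notional:,.0f} GBP")
print(f"sales margin:          {sales_margin:,.2f} GBP")
print(f"best-case coupon:      {best_case_coupon:.3%}")
```

The sales margin only matches the quoted 254.40 GBP when the unrounded interest amounts are used; with the rounded figures from the list the difference is 254.39 GBP.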
International Journal of Theoretical and Applied Finance 4(1), 91–119.
[2] Hakala, J. & Wystup, U. (2002). Foreign Exchange Risk, Risk Publications, London.
[3] Shreve, S.E. (2004). Stochastic Calculus for Finance I+II, Springer.
[4] Wystup, U. (2001). How the Greeks would have hedged correlation risk of foreign exchange options, Wilmott Research Report, August 2001.
[5] Wystup, U. (2002). How the Greeks would have hedged correlation risk of foreign exchange options, in Foreign Exchange Risk, Risk Publications, London.
[6] Wystup, U. (2006). FX Options and Structured Products, Wiley Finance Series.

Related Articles

Black–Scholes Formula; Foreign Exchange Markets; Foreign Exchange Options.

UWE WYSTUP
Vanna–Volga Pricing

The vanna–volga method, also called the traders’ rule of thumb, is an empirical procedure that can be used to infer an implied-volatility smile from three available quotes for a given maturity. It is based on the construction of locally replicating portfolios whose associated hedging costs are added to corresponding Black–Scholes prices to produce smile-consistent values. Besides being intuitive and easy to implement, this procedure has a clear financial interpretation, which further supports its use in practice. In fact, SuperDerivatives has implemented a type of this method in their pricing platform, as one can read in the patent that SuperDerivatives has filed.

The vanna–volga method is commonly used in foreign exchange options markets, where three main volatility quotes are typically available for a given market maturity: the delta-neutral straddle, referred to as at-the-money (ATM); the risk reversal (RR) for 25 delta call and put; and the (vega-weighted) butterfly (BF) with 25 delta wings. The application of vanna–volga pricing allows us to derive implied volatilities for any option’s delta, in particular for those outside the basic range set by the 25 delta put and call quotes. The notion of risk reversals and butterflies is explained in the article on foreign exchange (FX) market terminology (see Foreign Exchange Markets).

In the financial literature, the vanna–volga approach was introduced by Lipton and McGhee in [2], who compare different approaches to the pricing of double-no-touch (DNT) options, and by Wystup in [5], who describes its application to the valuation of one-touch (OT) options. The vanna–volga procedure is reviewed in more detail and some important results concerning the tractability of the method and its robustness are derived by Castagna and Mercurio in [1].

The following is based on the section Traders’ Rule of Thumb by Wystup in [6].

The traders’ rule of thumb is a method of traders to determine the cost of risk managing the volatility risk of exotic options with vanilla options. This cost is then added to the theoretical value (TV) in the Black–Scholes model and is called the overhedge. We explain the rule and then consider an example of a one-touch option.

Delta and vega are the most relevant sensitivity parameters for FX options maturing within one year. A delta-neutral position can be achieved by trading the spot. Changes in the spot are explicitly allowed in the Black–Scholes model. Therefore, model and practical trading have very good control over spot change risk. The more sensitive part is the vega position. This is not taken care of in the Black–Scholes model. Market participants need to trade other options to obtain a vega-neutral position. However, even a vega-neutral position is subject to changes of spot and volatility. For this reason, the sensitivity parameters vanna (change of vega due to change of spot) and volga (change of vega due to change of volatility) are of special interest. Vanna is also called d vega/d spot, volga is also called d vega/d vol. The plots for vanna and volga for a vanilla option are displayed in Figures 1 and 2. In this section, we outline how the cost of such a vanna and volga exposure can be used to obtain prices for options that are closer to the market than their theoretical Black–Scholes value.

Cost of Vanna and Volga

We fix the rates $r_d$ and $r_f$, the time to maturity T, and the spot x and define

\[ \text{cost of vanna} = \text{exotic vanna ratio} \times \text{value of RR} \tag{1} \]

\[ \text{cost of volga} = \text{exotic volga ratio} \times \text{value of BF} \tag{2} \]

\[ \text{exotic vanna ratio} = B_{\sigma x} / RR_{\sigma x} \tag{3} \]

\[ \text{exotic volga ratio} = B_{\sigma\sigma} / BF_{\sigma\sigma} \tag{4} \]

\[ \text{value of RR} = [RR(\sigma_\Delta) - RR(\sigma_0)] \tag{5} \]

\[ \text{value of BF} = [BF(\sigma_\Delta) - BF(\sigma_0)] \tag{6} \]

where $\sigma_0$ denotes the ATM (forward) volatility and $\sigma_\Delta$ denotes the wing volatility at the delta pillar Δ, and B denotes the value function of a given exotic option. The values of risk reversals and butterflies are defined by
[Figures 1 and 2: surface plots of vanna and volga for a vanilla option against time to expiration (days); the second horizontal axis runs from 0.70 to 1.00.]
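\[ RR(\sigma) = \mathrm{call}(x, \Delta, \sigma, r_d, r_f, T) - \mathrm{put}(x, \Delta, \sigma, r_d, r_f, T) \tag{7} \]

\[ BF(\sigma) = \frac{\mathrm{call}(x, \Delta, \sigma, r_d, r_f, T) + \mathrm{put}(x, \Delta, \sigma, r_d, r_f, T)}{2} - \frac{\mathrm{call}(x, \Delta_0, \sigma_0, r_d, r_f, T) + \mathrm{put}(x, \Delta_0, \sigma_0, r_d, r_f, T)}{2} \tag{8} \]

and obtain

\[ \text{cost of vanna} = \frac{B_{\sigma x}}{c_{\sigma x}(\sigma_+) - p_{\sigma x}(\sigma_-)} \times \left[ c(\sigma_+) - c(\sigma_0) - p(\sigma_-) + p(\sigma_0) \right] \tag{11} \]

\[ \frac{1}{2}\left[ c_{\sigma\sigma}(\sigma_+) + p_{\sigma\sigma}(\sigma_-) - c_{\sigma\sigma}(\sigma_0) - p_{\sigma\sigma}(\sigma_0) \right] \tag{13} \]

but the last two summands are close to zero. The vanna–volga adjusted value of the exotic is then

\[ B(\sigma_0) + p \times [\text{cost of vanna} + \text{cost of volga}] \tag{14} \]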
A division by the spot x converts everything into the usual quotation of the price in per cent of the underlying currency. The cost of vanna and volga is commonly adjusted by a number p ∈ [0, 1], which is often taken to be the risk-neutral no-touch (NT) probability. The reason is that in the case of options that can knock out, the hedge is not needed anymore once the option has knocked out. The exact choice of p depends on the product to be priced; see Table 1. Taking p = 1 as the default value would lead to overestimated overhedges for DNT options as pointed out in [2].

The values of risk reversals and butterflies in equations (11) and (12) can be approximated by a first-order expansion as follows. For a risk reversal, we take the difference of the call with correct implied volatility and the call with ATM volatility minus the difference of the put with correct implied volatility and the put with ATM volatility. It is easy to see that this can be well-approximated by the vega of the ATM vanilla times the risk reversal in terms of volatility. Similarly, the cost of the butterfly can be approximated by the vega of the ATM volatility times the butterfly in terms of volatility. In formulae, this is

\[
\begin{aligned}
c(\sigma_+) - c(\sigma_0) - p(\sigma_-) + p(\sigma_0) &\approx c_\sigma(\sigma_0)(\sigma_+ - \sigma_0) - p_\sigma(\sigma_0)(\sigma_- - \sigma_0) \\
&= \sigma_0 [p_\sigma(\sigma_0) - c_\sigma(\sigma_0)] + c_\sigma(\sigma_0)[\sigma_+ - \sigma_-] \\
&= c_\sigma(\sigma_0)\, RR
\end{aligned}
\tag{15}
\]

and, similarly,

\[ \frac{c(\sigma_+) - c(\sigma_0) + p(\sigma_-) - p(\sigma_0)}{2} \approx c_\sigma(\sigma_0)\, BF \tag{16} \]

With these approximations, we obtain the formulae

\[ \text{cost of vanna} \approx \frac{B_{\sigma x}}{c_{\sigma x}(\sigma_+) - p_{\sigma x}(\sigma_-)}\, c_\sigma(\sigma_0)\, RR \tag{17} \]

\[ \text{cost of volga} \approx \frac{2 B_{\sigma\sigma}}{c_{\sigma\sigma}(\sigma_+) + p_{\sigma\sigma}(\sigma_-)}\, c_\sigma(\sigma_0)\, BF \tag{18} \]

Observations

1. The price supplements are linear in butterflies and risk reversals. In particular, there is no cost of vanna supplement if the risk reversal is zero and no cost of volga supplement if the butterfly is zero.
2. The price supplements are linear in the ATM vanilla vega. This means supplements grow with growing volatility change risk of the hedge instruments.
3. The price supplements are linear in vanna and volga of the given exotic option.
4. We have not observed any relevant difference between the exact method and its first-order approximation. Since the computation time for the approximation is shorter, we recommend using the approximation.
5. It is not clear up front which target delta to use for the butterflies and risk reversals. We take a delta of 25% merely on the basis of its liquidity.
6. The prices for vanilla options are consistent with the input volatilities as shown in Figures 3, 4, and 5.
7. The method assumes a zero volga of risk reversals and a zero vanna of butterflies. This way the two sources of risk can be decomposed and hedged with risk reversals and butterflies. However, the assumption is actually not exact. For this reason, the method should be used with a lot of care. It causes traders and financial engineers to keep adding exceptions to the standard method.

Consistency Check

A minimum requirement for the vanna–volga pricing to be correct is the consistency of the method with vanilla options. We show in Figures 3, 4, and 5 that

Table 1 Adjustment factors for the overhedge for first-generation exotics

Option   p
KO       No-touch probability
RKO      No-touch probability
DKO      No-touch probability
OT       0.9 × no-touch probability − 0.5 × bid–offer-spread × (TV − 33%)/66%
DNT      0.5

KO, knock out; RKO, reverse knockout; DKO, double knockout; OT, one touch; DNT, double no touch
[Figures: consistency check plots of volatility (%) against call delta (one panel labeled "One-year call delta (%)"), comparing the implied volatility with the given volatility.]
For DNT options with lower barrier L and higher barrier H at spot S, one can use the overhedge

\[ OH = \max\{\text{vanna–volga-OH};\; \delta(S - L) - TV - 0.5\%;\; \delta(H - S) - TV - 0.5\%\} \tag{26} \]

where δ denotes the delta of the DNT option.

The Cost of Trading and Its Implication on the Market Price of One-touch Options

Now let us take a look at an example of the traders’ rule of thumb in its simple version. We consider OT options, which hardly ever trade at TV. The tradable price is the sum of the TV and the overhedge. Typical examples are shown in Figure 6, one for an upper touch level in EUR/USD, and one for a lower touch level.
[Figure 6 panels: (a) One-touch up and (b) One-touch down; overhedge (%) plotted against theoretical value (%) from 0 to 100.]
Figure 6 Overhedge of a one-touch option in EUR/USD for (a) an upper touch level and (b) a lower touch level, based
on the traders’ rule of thumb
[Figure 7 plot: option sensitivity (rho and vega) against the maturity of a vanilla option in years, from 0 to 2 years.]
Figure 7 Comparison of interest rate and volatility risk for a vanilla option. The volatility risk behaves like a square-root
function, whereas the interest rate risk is close to linear. Therefore, short-dated FX options have higher volatility risk than
interest rate risk
[Figure 1 panels: actual implied volatility smiles for (a) D-mark futures, (b) yen futures, (c) B-pound futures, and (d) S-franc futures; standardized implied volatility plotted against strike price (in standard deviation terms) and time to expiry.]
Figure 1 Actual implied volatility surfaces of option prices for four foreign exchange futures standardized to the level of
the ATM volatility (1985–2000)
tested a number of arbitrary models based upon a polynomial expansion across strike price (x) and time (t). Tompkins [15] extended the polynomial expansion to degree three and included additional factors, which might also influence the behaviors of volatility surfaces.

For all four foreign exchange options markets, a parsimonious model explains the vast majority of the variance in the standardized implied volatility surfaces. The analysis allowed strike price effects to be separated into a first-order effect (the skew), a second-order effect (the smile), and higher order effects. For the skew effect, the results suggested that an asymmetrical smile pattern is a function of the level of the foreign exchange rate. The evidence suggests that when futures prices are low (high), the implied volatility pattern becomes more negatively (positively) skewed.

For the second-order “curved” pattern, all four markets display a convex pattern that becomes more extreme as the options expiration date is approached. Furthermore, a significant negative relationship is found between the degree of curvature and the level of the ATM implied volatility. Curved patterns are independent of the level of the exchange rate. Finally, Tompkins [15] reports a significant third-order strike price effect for all four foreign exchange option markets. Tompkins [15] shows that the high degree of explanatory power is invariant to the time period of analysis and that the model provides accurate smile predictions outside of the estimation sample period. Under these assumptions, we conclude that regularities in implied volatility surfaces exist and are similar for the four currency markets. Furthermore, the regularities are time period invariant. These general results provide means to test alternative models, which could potentially explain why implied volatility surfaces exist. This is discussed in the following section.

Empirical Regularities for Currency Option Smiles

From [15], the following general conclusions can be drawn for the behaviors of implied volatility surfaces
for options on foreign exchange:

1. Implied volatility patterns are symmetrical on average for options on currencies.
2. For three of the four markets, the skew effect is related to the level of the underlying futures price. The only exception is for the British pound/US dollar. The level of the futures price impacts the skewness in an inverse manner to the pure skewness effect. This suggests that for low futures prices a negative skew occurs and at higher futures prices the skew flattens and can become positive.
3. The skew effect for currency options is relatively invariant to the time to expiration of the options. It is solely due to extreme levels of the underlying exchange rate or to some market shock.
4. For two of the four markets, the level of the skew effect is inversely related to the level of the ATM implied volatility. For the Deutsche mark and Swiss franc, the higher (lower) the level of the ATM implied volatility, the more negative (positive) the level of the skew.
5. Shocks change the degree and sign of the skew effect. For the Deutsche mark and Swiss franc, the concerted intervention in the currency markets by the Group of Seven (G7) caused a negative skew to occur. The 1987 stock crash had minimal impact on the currency markets, with only a slightly negative skew impact for the Deutsche mark. For the second shock, the only currency option skew affected was the Japanese yen. This occurred in January 1988 and appears to have been associated with international capital flows out of the US dollar into yen.
6. All implied volatility patterns display some degree of curvature and the degree of curvature is inversely related to the option’s term to expiration. The longer the term to expiration, the less extreme the degree of curvature in the smile.
7. Shocks change the degree of curvature of the implied volatility pattern. However, the effect is not systematic and often shocks reduce the degree of curvature. For the G7 intervention in 1985, there was a reduction in the degree of smile curvature for both the Deutsche mark and Swiss franc, while for the Japanese yen, this event caused greater curvature for the smiles. For the second shock, both the British pound and Japanese yen displayed greater smile curvature thereafter.
8. For all four markets, the degree of curvature of the implied volatility pattern is inversely related to the level of the ATM implied volatility. Thus, the higher the level of ATM implied volatility, the less pronounced the degree of curvature in the smile.
9. For three of the four currency markets, the degree of curvature is independent of the level of the underlying futures price. The only exception is for the Japanese yen, where the higher the level of the exchange rate, the lesser the curvature (however, this impact is small).
10. For all four markets, the degree of curvature of the implied volatility pattern is asymmetrical. For the Deutsche mark, Japanese yen, and Swiss franc, the degree of asymmetry is negative. This suggests that the curvature is more extreme for options with strike prices below the current level of the underlying futures. For the British pound, the relationship is positive, indicating that the curvature is more extreme for options with strike prices above the current level of the underlying futures.

Using these 10 stylized facts as clues, we now examine alternative explanations for the existence of implied volatility smiles. It is crucial that any coherent explanation must conform to all of these facts simultaneously. If selected models are internally inconsistent with these facts, it is grounds for rejection.

A nontrivial problem is that the statistical testing of any option-pricing model has to be a joint hypothesis that the option-pricing model is correct and that the markets are efficient. Given that smiles do exist, we can reject the hypothesis that actual option values conform to the Black [2] model. However, we are uncertain as to why this occurs. Consider two possible reasons for the existence of smiles: the underlying asset may follow an alternative price process or the Black [2] model is correct but market imperfections exist. The next sections discuss both possibilities to better understand the regularities in implied volatility surfaces presented in [15].
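The decomposition of a standardized smile into first-order (skew), second-order (curvature), and third-order strike effects can be illustrated with an ordinary least-squares fit of a polynomial in strike and time. The Python sketch below is only an illustration and not the specification estimated in [15]: the choice of regressors, the synthetic noise-free data, and the function name `fit_smile_polynomial` are assumptions made for the example.

```python
import numpy as np

def fit_smile_polynomial(x, t, vol_index):
    """OLS fit of a standardized smile (ATM level = 100) to a cubic in the
    standardized strike x, plus a hypothetical curvature term that grows
    as the time to expiry t shrinks."""
    # Columns: level, skew, curvature, third-order effect, curvature/time interaction
    design = np.column_stack([np.ones_like(x), x, x**2, x**3, x**2 / np.sqrt(t)])
    coef, *_ = np.linalg.lstsq(design, vol_index, rcond=None)
    return coef

# Synthetic surface: strikes in standard deviation terms, expiries in days
x, t = np.meshgrid(np.linspace(-3.5, 3.5, 15), np.array([5.0, 20.0, 35.0, 50.0, 80.0]))
x, t = x.ravel(), t.ravel()

# Assumed "true" coefficients, used only to generate the illustrative data
true_coef = np.array([100.0, -0.5, 1.0, 0.1, 8.0])
vol_index = (true_coef[0] + true_coef[1] * x + true_coef[2] * x**2
             + true_coef[3] * x**3 + true_coef[4] * x**2 / np.sqrt(t))

coef = fit_smile_polynomial(x, t, vol_index)
print(dict(zip(["level", "skew", "curvature", "third_order", "curvature_time"], coef)))
```

With real data, the sign and significance of each fitted coefficient would be read as evidence for or against the corresponding stylized fact; additional regressors (such as the level of the futures price or of the ATM volatility) could be added as further columns of the design matrix.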
Models with Alternative Price and Volatility Processes

Consider first that some alternative price (and volatility) process is at work instead of geometric Brownian motion with constant variance. Following the general approach of Jarrow and Rudd [11], we consider alternative true terminal distributions for the underlying asset. Consider the following models that include stochastic volatility (σ̂) and alternative price processes. For the sake of convenience, the volatility processes will be evaluated in terms of a stochastic variance process V, with σ̂ = √V. Given that our previous results examined options on futures, the notation indicates that the underlying asset is a futures price (F). The first model, which will be considered, is a stochastic volatility model: the square root process model proposed by Heston [7] (see also Stochastic Volatility Models: Foreign Exchange; Heston Model). This choice is due to the ability of this model to allow correlated underlying and volatility processes. This will be defined as

Model 1

$$dF(t) = \mu F(t)\,dt + \hat{\sigma} F(t)\,dZ_1(t) \qquad (3)$$

with the variance process defined by

$$dV(t) = \kappa(\theta - V(t))\,dt + \xi\sqrt{V(t)}\,dZ_2(t) \qquad (4)$$

where Z1 and Z2 are standard Wiener processes with correlation ρ. The term κ indicates the rate of mean reversion of the variance, θ is the long-term variance, and ξ indicates the volatility of the variance. The terms V and √V represent the variance and the volatility of the process, respectively.

The second model that we consider is the jump-diffusion model proposed by Merton [13] (see also Jump-diffusion Models). Using his notation, this can be expressed as

Model 2

$$dF(t) = F(\alpha - \lambda\kappa)\,dt + F\sigma(t)\,dZ(t) + dq(t) \qquad (5)$$

Using his notation, α is the instantaneous expected return on the futures contract, σ(t) is the instantaneous volatility of the futures contract, conditional on no arrivals of important new information (no jumps), dZ is a standard Wiener process, and q(t) is the independent Poisson process, which captures the jumps. The term λ is the mean number of arrivals per unit time and κ represents the jump size (which can also be a random variable).

Bates [1], Ho et al. [8], and Jiang [12] assumed that the volatility process is subordinated in a nonnormal price process; this provides the inspiration for the third model (see [1] for tests of these models). In this spirit, the third proposed model is a variant of the Heston [7] model, proposed by Tompkins [16, 17], which includes jumps (as captured by a normal inverse Gaussian (NIG) process) in the underlying price process.

Model 3

$$dF(t) = \mu F(t-)\,dt + \hat{\sigma}(t) F(t-)\,dN(t) \qquad (6)$$

with the variance process defined by

$$dV(t) = k(\theta - V(t))\,dt + \xi\sqrt{V(t)}\,dZ(t) \qquad (7)$$

where N(t) is a purely discontinuous martingale corresponding to log returns driven by an NIG Lévy process (see Normal Inverse Gaussian Model). This model will be referred to as normal inverse Gaussian stochastic volatility (NIGSV) for the sake of convenience.

Smile Patterns Associated with the Proposed Models

Tompkins [17] discussed how parameters for each of these models could be estimated (under the physical measure) and the change of measure to allow risk neutral pricing. Of more interest to this article is the resulting smile behavior of each model. This can be seen in Figure 2 (restricted solely to the Deutsche mark/US dollar). Figure 2(a) shows the empirical smile patterns for Deutsche mark/US dollar from 1985 to 2000. Figure 2(b) shows the smile surface associated with the Heston [7] model. Figure 2(c) shows the smile surface associated with the jump-diffusion model of Merton [13]. Figure 2(d) represents the combination of stochastic volatility and jump processes (NIGSV model).
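To make the dynamics of Model 1 concrete, the following Python sketch simulates the futures price and the variance in equations (3) and (4) with a simple Euler scheme. It is a minimal illustration under stated assumptions, not the estimation or simulation procedure of [17]: the "full truncation" treatment of negative variance, the log-Euler step for F, the function name `simulate_heston`, and all parameter values are choices made for the example.

```python
import numpy as np

def simulate_heston(f0, v0, mu, kappa, theta, xi, rho, horizon, n_steps, n_paths, seed=0):
    """Euler simulation of dF = mu F dt + sqrt(V) F dZ1 and
    dV = kappa (theta - V) dt + xi sqrt(V) dZ2, with corr(dZ1, dZ2) = rho."""
    rng = np.random.default_rng(seed)
    dt = horizon / n_steps
    f = np.full(n_paths, f0, dtype=float)
    v = np.full(n_paths, v0, dtype=float)
    for _ in range(n_steps):
        # Full truncation (an assumption): use max(V, 0) in drift and diffusion
        v_pos = np.maximum(v, 0.0)
        z1 = rng.standard_normal(n_paths)
        # Build a second normal draw with correlation rho to the first
        z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(n_paths)
        # Log-Euler step keeps the futures price strictly positive
        f *= np.exp((mu - 0.5 * v_pos) * dt + np.sqrt(v_pos * dt) * z1)
        v += kappa * (theta - v_pos) * dt + xi * np.sqrt(v_pos * dt) * z2
    return f, np.maximum(v, 0.0)

# Hypothetical parameters: variance starts above its long-run level theta
f_T, v_T = simulate_heston(f0=1.0, v0=0.04, mu=0.0, kappa=2.0, theta=0.01,
                           xi=0.1, rho=-0.3, horizon=1.0, n_steps=200,
                           n_paths=20000, seed=1)
print("mean terminal variance:", v_T.mean())
print("mean terminal futures price:", f_T.mean())
```

The mean of the simulated terminal variance can be compared with the mean-reverting expectation θ + (V(0) − θ)e^(−κT) implied by equation (4), which gives a quick sanity check on the scheme before the simulated prices are used for anything else.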
[Figure 2, panels (a)–(d): (a) empirical implied volatility smiles, Dmark/US Dollar; (b) simulated implied volatility smiles, Heston (1993); (c) simulated implied volatility smiles, Merton (1976); (d) simulated implied volatility smiles, Tompkins (2007). Axes: standardized implied volatility, strike price (in standard deviation terms), and time to expiry.]
Figure 2 Simulated implied volatility smiles for options on Deutsche mark/US dollar
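One way to see why the option horizon matters for a jump-diffusion such as Model 2 is to compute the skewness and excess kurtosis of the log return over different horizons. The Python sketch below does this from the cumulants of a diffusion plus a compound Poisson process. It assumes jump sizes in the log price that are i.i.d. normal, as in Merton [13]; the function name `jump_diffusion_moments` and all parameter values are hypothetical. Under these assumptions skewness scales like 1/√T and excess kurtosis like 1/T.

```python
import math

def jump_diffusion_moments(sigma, lam, jump_mean, jump_std, horizon):
    """Skewness and excess kurtosis of the log return over `horizon` for a
    diffusion with volatility sigma plus Poisson jumps (intensity lam) whose
    log sizes are N(jump_mean, jump_std**2). The drift does not affect either."""
    m, d = jump_mean, jump_std
    # Cumulants of order n >= 2: diffusion contributes only to the variance,
    # the compound Poisson part contributes lam * horizon * E[Y**n].
    k2 = (sigma**2 + lam * (m**2 + d**2)) * horizon
    k3 = lam * (m**3 + 3.0 * m * d**2) * horizon
    k4 = lam * (m**4 + 6.0 * m**2 * d**2 + 3.0 * d**4) * horizon
    return k3 / k2**1.5, k4 / k2**2

# Hypothetical parameters; horizons of one month and one year
for horizon in (1.0 / 12.0, 1.0):
    skew, ex_kurt = jump_diffusion_moments(0.10, 5.0, -0.01, 0.03, horizon)
    print(f"T = {horizon:.3f}: skewness = {skew:.4f}, excess kurtosis = {ex_kurt:.4f}")
```

Because both measures shrink as the horizon lengthens, the terminal distribution under i.i.d. jumps moves toward the normal for long-dated options, while a symmetric jump distribution (`jump_mean = 0`) produces excess kurtosis but no skewness at any horizon.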
Smile Patterns Associated with Stochastic Volatility

As one can see in Figure 2(b), the Heston [7] model does generate a symmetrically curved smile function consistent with point #1, but the smiles are flat as the option expiration approaches and become more curved, the longer the term to expiration (which is inconsistent with point #6). This is exactly the opposite of what is observed for currency smiles empirically. The Heston [7] model can generate a skewed implied volatility pattern from a nonzero correlation between the volatility and underlying processes (see equations 3 and 4). However, the longer the term to expiration, the more extreme the skew pattern would be. This is inconsistent with point #3, that skewed patterns for currency options are time invariant and are only associated with the levels of the ATM implied volatility or the underlying currency exchange rate. However, this model is consistent with fact #5 that shocks could change the degree of skewness. The model could still be valid under a regime of stochastic correlations. However, it seems inconsistent from an economic standpoint; if shocks change the degree of asymmetry in the expected terminal distribution of the underlying asset, it is not clear why in half of the instances the degree of curvature (fact #7) is reduced. This model is also inconsistent with fact #8, that the higher the level of expected variance (ATM volatility), the flatter the degree of curvature. Given that this model would produce effects that are contradictory to both first and second strike price effects observed empirically, we must reject it. An alternative explanation is that the jump-diffusion model of Merton [13] may be more appropriate.

Smile Patterns Associated with Jump-Diffusion

According to Hull [9], this model could produce a curved implied volatility surface and this curve would be consistent with fact #6, that curves exist and become more extreme the shorter the time to expiration of the option. This can be seen in Figure 2,
where the degree of curvature is most extreme closest to expiration. However, as the Poisson process in equation (5) is independent and identically distributed (i.i.d.), this will converge over time to a normal distribution and thus, the implied volatility surface would flatten, which is what occurs in Figure 2. It could also hold under a regime associated with fact #7, that shocks do change the degree of curvature. It could be that the inflow of new information changes the expectations of market agents regarding the degree and magnitude of future jumps. However, the model, as it stands, would not be able to explain the first-order strike price effects. One alternative would be to allow the shocks to be asymmetric. This would allow a skewed implied volatility pattern to exist. However, if the jumps follow some i.i.d. process, the central limit theorem would imply that the degree of skewness would be highest when the options are closest to expiration and would flatten as the term to expiration is lengthened. This is at variance with fact #3 that for currency options the skew effects are time invariant. Therefore, we can also reject a jump-diffusion model as being inconsistent with the empirical record.

Smile Patterns Associated with the NIGSV Model

This model assumes a symmetrical jump-diffusion process with a subordinated stochastic volatility process with nonzero correlations between the two processes. The simulated implied volatility smiles appear in Figure 2(d) and seem to resemble most closely the actual smiles for Deutsche mark/US dollar options in Figure 2(a). As can be seen, there is curvature in the smile patterns for both short term and longer term options. The shorter term curvature is associated with the jump process, while the longer term curvature is associated with stochastic volatility. This is consistent with both fact #1 and fact #6, that the average smile pattern is symmetrical and the degree of curvature is inversely related to time. Dynamics of the skew relationship can be explained with variations of the correlation between the two processes. Finally, the asymmetry of the smile shapes can be explained by the jump process. While this model appears to display many of the dynamics of empirical smiles, the degree of curvature is not as extreme as is observed for the actual smiles. The reason for this is that the parameters for the model were estimated using the underlying Deutsche mark/US dollar currency futures (see [17] for details). While a feasible measure change was used to price options (one that excluded arbitrage), it is unlikely that this measure change is unique as nontraded sources of risk have been introduced into the state space. These include jumps and stochastic volatility. Given this, we should expect that option prices will also contain some risk premium above and beyond the values associated with the underlying asset.

Conclusions and Implications

In this article, we have examined currency option smiles. Previous research by Tompkins [15] suggests that when implied volatility patterns are standardized, regularities are observed both across markets and across time. He concludes that this may suggest that market participants have developed some consistent algorithm to vary option prices away from Black [2] values.

To better understand the nature of this algorithm, 10 stylized results are identified from his results for the four currency option markets. With these 10 results we test whether alternative models, which have been proposed to explain the existence of implied volatility surfaces, can generate the same dynamics as these empirical results. Initially, models were examined that suggest an alternative price process may better define the underlying price and volatility processes. We reject both the Heston [7] and the Merton [13] models as appropriate models, as they cannot produce all the empirical dynamics for actual smiles. The only model that could explain all the dynamics is a model that combines stochastic volatility and nonnormal innovations for currency returns. When appropriate parameters are input into this model and a feasible change of measure is made, option prices can be determined. The smiles associated with this model match the dynamics observed for actual currency option smiles. However, the model smiles do not display the same extreme degree of curvature as the empirical smiles. Following Tompkins [17], this suggests that a substantial risk premium exists for currency options and that the hypothesis that the existence of implied volatility surfaces is due solely to an alternative price process is rejected.
Alternatively, market imperfections may be the reason for the existence of implied volatility surfaces. Given that existing research has previously rejected this, we tend to concur that market imperfections alone are also probably not sufficient to explain the existence of implied volatility smiles. However, it is possible that both alternative price processes and market imperfections jointly contribute to the existence of implied volatility smiles.

References

[1] Bates, D.S. (1996). Jumps and stochastic volatility: exchange rate process implicit in Deutsche Mark options, Review of Financial Studies 9, 69–107.
[2] Black, F. (1976). The pricing of commodity contracts, Journal of Financial Economics 3, 167–179.
[3] Derman, E. & Kani, I. (1994). Riding on the smile, Risk 7, 32–39.
[4] Dumas, B., Fleming, J. & Whaley, R.E. (1998). Implied volatility functions: empirical tests, The Journal of Finance 53, 2059–2106.
[5] Dupire, B. (1992). Arbitrage Pricing with Stochastic Volatility, Working Paper, Société Générale Options Division.
[6] Dupire, B. (1994). Pricing with a smile, Risk 7, 18–20.
[7] Heston, S.L. (1993). A closed-form solution for options with stochastic volatility with applications to bond and currency options, Review of Financial Studies 6, 327–343.
[8] Ho, M.S., Perraudin, W.R.M. & Sørensen, B.E. (1996). A continuous-time arbitrage-pricing model with stochastic volatility and jumps, Journal of Business & Economic Statistics 14, 31–43.
[9] Hull, J. (1997). Options, Futures and other Derivative Securities, 3rd Edition, Prentice Hall, Upper Saddle River.
[10] Jackwerth, J.C. & Rubinstein, M. (1996). Recovering probability distributions from option prices, The Journal of Finance 51, 1611–1631.
[11] Jarrow, R. & Rudd, A. (1982). Approximate option valuation for arbitrary stochastic processes, Journal of Financial Economics 10, 347–369.
[12] Jiang, G. (1999). Stochastic volatility and jump-diffusion—implications on option pricing, International Journal of Theoretical and Applied Finance 2(4), 409–440.
[13] Merton, R. (1976). Option pricing when underlying stock returns are discontinuous, Journal of Financial Economics 3, 125–144.
[14] Rubinstein, M. (1994). Implied binomial trees, The Journal of Finance 49, 771–818.
[15] Tompkins, R.G. (2001). Implied volatility surfaces: uncovering regularities for options of financial futures, The European Journal of Finance 7, 198–230.
[16] Tompkins, R.G. (2003). Options on bond futures: isolating the risk premium, Journal of Futures Markets 23(2), 169–215.
[17] Tompkins, R.G. (2006). Why smiles exist in foreign exchange options: isolating components of the risk neutral process, The European Journal of Finance 12, 583–604.

Further Reading

Balyeat, R.B. (2002). The economic significance of risk premiums in the S&P 500 options market, Journal of Futures Markets 22, 1145–1178.
Garman, M. & Kohlhagen, S. (1983). Foreign currency option values, Journal of International Money and Finance 2, 231–237.
Henker, T. & Kazemi, H.B. (1998). The Impact of Deviations from Random Walk in Security Prices on Option Prices, Working Paper, University of Massachusetts, Amherst.

Related Articles

Foreign Exchange Smile Interpolation; Implied Volatility Surface; Stochastic Volatility Models: Foreign Exchange.

ROBERT G. TOMPKINS
Foreign Exchange Smile Interpolation

application of Dupire-style local volatility models, it is crucial to construct an interpolation that is at least C² in strike and at least C¹ in time direction. This becomes obvious when considering the expression for the local volatility in this context:

For sufficiently large σ(Δ_n) and a smooth, differentiable volatility smile, the sequence converges for n → ∞ to the unique fixed point Δ* ∈ A with σ* = σ(Δ*), corresponding to strike K.

Definition 1 (Slice Kernel). Let (x1, y1), (x2, y2), . . . , (xn, yn) be n given points and g : ℝ → ℝ a smooth function which fulfills

$$g(x_i) = y_i, \quad i = 1, \ldots, n \qquad (6)$$
[Figure: implied volatility (vertical axis, 0.08 to 0.14) plotted against percent delta (horizontal axis, 0 to 100).]
is a nondecreasing variance for constant moneyness F/K (see also [1] for a discussion of this).

Figure 1 displays the shape of a slice kernel applied to a typical FX volatility surface constructed from 10 and 25 delta volatilities, and the ATM volatility (in this example λ = 0.25 was chosen).

References

[1] Gatheral, J. (2004). A Parsimonious Arbitrage-free Implied Volatility Parameterization with Application to the Valuation of Volatility Derivatives, Workshop Presentation, Madrid.
[2] Hakala, J. & Wystup, U. (2002). Local volatility surfaces—tackling the smile, Foreign Exchange Risk, Risk Books.
[3] Rebonato, R. (1999). Volatility and Correlation, John Wiley & Sons.
[4] Tistaert, J., Schoutens, W. & Simons, E. (2004). A perfect calibration now what? Wilmott Magazine (March), 66–78.
[5] Wystup, U. (2006). FX Options and Structured Products, John Wiley & Sons.

Related Articles

Foreign Exchange Markets; Foreign Exchange Options: Delta- and At-the-money Conventions.

UWE WYSTUP
Margrabe Formula

An exchange option gives its owner the right, but not the obligation, to exchange b units of one asset for a units of another asset at a specific point in time, that is, it is a claim that pays off $(aS_1(T) - bS_2(T))^+$ at time T. Outperformance option or Margrabe option are alternative names for the same payoff.

Let us assume that the interest rate is constant (r) and that the underlying assets follow correlated ($dW_1\,dW_2 = \rho\,dt$) geometric Brownian motions under the risk-neutral measure,

$$dS_i = \mu_i S_i\,dt + \sigma_i S_i\,dW_i \quad \text{for } i = 1, 2 \qquad (1)$$

Note that allowing µi’s that are different from r enables us to use the resulting valuation formula for the exchange option directly in cases with nontrivial carrying costs on the underlying. This could be for futures (where the drift rate is 0), currencies (where the drift rate is the difference between domestic and foreign interest rates, see Foreign Exchange Options), stocks with dividends (where the drift rate is r less the dividend yield), or nontraded quantities with convenience yields.

The value of the exchange option at time t is

$$\pi^{EO}(t) = EO(T - t,\, aS_1(t),\, bS_2(t)) \qquad (2)$$

where the function EO is given by

$$EO(\tau, S_1, S_2) = S_1 e^{(\mu_1 - r)\tau} N(d_+) - S_2 e^{(\mu_2 - r)\tau} N(d_-) \qquad (3)$$

with

$$d_\pm = \frac{\ln(S_1/S_2) + (\mu_1 - \mu_2 \pm \sigma^2/2)\tau}{\sigma\sqrt{\tau}} \qquad (4)$$

where $\sigma = \sqrt{\sigma_1^2 + \sigma_2^2 - 2\sigma_1\sigma_2\rho}$, N denotes the standard normal distribution function and τ = T − t. The formula was derived independently by Margrabe [12] and Fischer [6], but despite the two papers being published side by side in the Journal of Finance, the formula commonly bears only the former author’s name. The result is most easily proven by using a change of numeraire (see Change of Numeraire), writing

$$\pi^{EO}(t) = S_2(t)\, E_t^{Q^{S_2}}\!\left((aS_1(T)/S_2(T) - b)^+\right) \qquad (5)$$

noting that S1/S2 follows a geometric Brownian motion, and reusing the Black–Scholes calculation for the mean of a truncated lognormal variable.

If the underlying asset prices are multiplied by a positive factor, then the exchange option’s value changes by that same factor. This means that we can use Euler’s homogeneous function theorem to read off the partial derivatives of the option value with respect to the underlying assets (the deltas) directly from the Margrabe formula (see [15] for more such tricks), specifically

$$\frac{\partial EO}{\partial S_1} = e^{(\mu_1 - r)\tau} N(d_+) \qquad (6)$$

and similarly for S2. If the S assets are traded, then a portfolio with these holdings (scaled by a and b) that is made self-financing with the risk-free asset replicates the exchange option, and the Margrabe formula gives the only no-arbitrage price.

If the underlying assets do not pay dividends during the life of the exchange option (so that the risk-neutral drift rates are µ1 = µ2 = r), then early exercise is never optimal, and the Margrabe formula holds for American options too. With nontrivial carrying costs, this is not true, but as noted by [2], a change of numeraire reduces the dimensionality of the problem so that standard one-dimensional methods for American option pricing can be used. The Margrabe formula is still valid with stochastic interest rates, provided the factors that drive interest rates are independent of those driving the S assets.

Exchange options are most common in over-the-counter foreign exchange markets, but exchange features are embedded in many other financial contexts; mergers and acquisitions (see [12]) and indexed executive stock options (see [9]) to give just two examples.

Variations and Extensions

Some variations of exchange options can be valued in closed form. In [10], a formula for a so-called traffic light option that pays

$$(S_1(T) - K_1)^+ (S_2(T) - K_2)^+ \qquad (7)$$
is derived, and [4] gives a formula for the value of a compound exchange option, that is, a contract that pays

$$(\pi^{EO}(T_C) - S_2(T_C))^+ \quad \text{at time } T_C < T \qquad (8)$$

Both formulas involve the bivariate normal distribution function, and in the case of the compound exchange option a nonlinear but well-behaved equation that must be solved numerically.

For knock-in and knockout exchange options whose barriers are expressed in terms of the ratio of the two underlying assets, [7] show that the reflection-principle-based closed-form solutions (see [14]) from the Black–Scholes model carry over; this means that barrier option values can be expressed solely through the EO-function evaluated at appropriate points.

However, there are not always easy answers; in the simple case of a spread option

$$(S_1(T) - S_2(T) - K)^+ \qquad (9)$$

there is no commonly accepted closed-form solution. The reason for this is that a sum of lognormal variables is not lognormal. More generally, many financial valuation problems can be cast as follows: calculate the expected value of

$$\left(\sum_{i=1}^{n} \alpha_{i,n} X_{i,n} - K\right)^+ \qquad (10)$$

where the $X_{i,n}$’s are lognormally distributed. One can use generic techniques such as direct integration, numerical solution of partial differential equations, or Monte Carlo simulation, but there is an extensive literature on other approximation methods. These include

• moment approximation, where the moments of $\sum_{i=1}^{n} \alpha_{i,n} X_{i,n}$ are calculated, the variable then treated as lognormal, and the option priced by a Black–Scholes-like formula; an application to Asian options is given in [11].
• integration by Fourier transform techniques, which extends beyond lognormal models and works well if n is not too large (say 2–4); an application to spread options is given in [1].
• limiting results for n → ∞ as obtained in [5] and [13]; the relation to the reciprocal gamma distribution has been used for Asian and basket options.
• changing to Gaussian processes as suggested in [3]; this may be suitable for commodity markets where spread contracts are popular, and it allows for the inclusion of mean reversion.
• if the $\alpha_{i,n} X_{i,n}$’s depend monotonically on a common random variable, then Jamshidian’s approach from [8] can be used to decompose an option on a portfolio into a portfolio of simpler options. This is used to value options on coupon-bearing bonds in one-factor interest-rate models.

References

[1] Alexander, C. & Scourse, A. (2004). Bivariate normal mixture spread option valuation, Quantitative Finance 4, 637–648.
[2] Bjerksund, P. & Stensland, G. (1993). American exchange options and a put-call transformation: a note, Journal of Business, Finance and Accounting 20, 761–764.
[3] Carmona, R. & Durrleman, V. (2003). Pricing and hedging spread options, SIAM Review 45, 627–685.
[4] Carr, P. (1988). The valuation of sequential exchange opportunities, Journal of Finance 43, 1235–1256.
[5] Dufresne, D. (2004). The log-normal approximation in financial and other computations, Advances in Applied Probability 36, 747–773.
[6] Fischer, S. (1978). Call option pricing when the exercise price is uncertain, and the valuation of index bonds, Journal of Finance 33, 169–176.
[7] Haug, E.G. & Haug, J. (2002). Knock-in/out Margrabe, Wilmott Magazine 1, 38–41.
[8] Jamshidian, F. (1989). An exact bond option formula, Journal of Finance 44, 205–209.
[9] Johnson, S.A. & Tian, Y.S. (2001). Indexed executive stock options, Journal of Financial Economics 57, 35–64.
[10] Jørgensen, P.L. (2007). Traffic light options, Journal of Banking and Finance 31, 3698–3719.
[11] Levy, E. (1992). Pricing European average rate currency options, Journal of International Money and Finance 11(5), 474–491.
[12] Margrabe, W. (1978). The value of an option to exchange one asset for another, Journal of Finance 33, 177–186.
[13] Milevsky, M.A. & Posner, S.E. (1998). Asian options, the sum of lognormals, and the reciprocal gamma distribution, Journal of Financial and Quantitative Analysis 33, 409–422.
[14] Poulsen, R. (2006). Barrier options and their static hedges: simple derivations and extensions, Quantitative Finance 6, 327–335.
[15] Reiss, O. & Wystup, U. (2001). Efficient computation of options price sensitivities using homogeneity and other tricks, Journal of Derivatives 9, 41–53.

Related Articles

Black–Scholes Formula; Change of Numeraire; Exchange Options; Foreign Exchange Options.

ROLF POULSEN
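As a worked illustration of the Margrabe article above, equations (3), (4), and (6) translate almost line by line into code. The Python sketch below assumes the article's setting (constant interest rate r, correlated geometric Brownian motions with drifts µ1 and µ2). The function names `norm_cdf`, `exchange_option`, and `exchange_option_delta1` are hypothetical, and the normal distribution function is built from `math.erf` only to keep the example free of external dependencies.

```python
import math

def norm_cdf(x):
    """Standard normal distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def _d_plus_minus(tau, s1, s2, sigma1, sigma2, rho, mu1, mu2):
    # sigma is the volatility of the ratio S1/S2, as defined below equation (4)
    sigma = math.sqrt(sigma1**2 + sigma2**2 - 2.0 * sigma1 * sigma2 * rho)
    vol = sigma * math.sqrt(tau)
    d_plus = (math.log(s1 / s2) + (mu1 - mu2 + 0.5 * sigma**2) * tau) / vol
    return d_plus, d_plus - vol

def exchange_option(tau, s1, s2, sigma1, sigma2, rho, mu1, mu2, r):
    """EO(tau, S1, S2) of equation (3); pass a*S1 and b*S2 for the general payoff."""
    d_plus, d_minus = _d_plus_minus(tau, s1, s2, sigma1, sigma2, rho, mu1, mu2)
    return (s1 * math.exp((mu1 - r) * tau) * norm_cdf(d_plus)
            - s2 * math.exp((mu2 - r) * tau) * norm_cdf(d_minus))

def exchange_option_delta1(tau, s1, s2, sigma1, sigma2, rho, mu1, mu2, r):
    """Partial derivative of EO with respect to S1, equation (6)."""
    d_plus, _ = _d_plus_minus(tau, s1, s2, sigma1, sigma2, rho, mu1, mu2)
    return math.exp((mu1 - r) * tau) * norm_cdf(d_plus)

# Hypothetical inputs: two non-dividend-paying assets, one year to expiry
params = dict(sigma1=0.25, sigma2=0.20, rho=0.4, mu1=0.03, mu2=0.03, r=0.03)
price = exchange_option(1.0, 105.0, 100.0, **params)
delta1 = exchange_option_delta1(1.0, 105.0, 100.0, **params)
print("exchange option value:", price)
print("delta with respect to S1:", delta1)
```

Two quick checks are available from the text itself: scaling both asset prices by a positive factor should scale the value by the same factor, and a central finite difference in S1 should agree with the closed-form delta of equation (6).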
Foreign Exchange Options: Delta- and At-the-money Conventions

In financial markets, the value of a plain-vanilla European option is generally quoted in terms of its implied volatility, that is, the volatility that, when plugged into the Black–Scholes formula, gives the correct market price. Market prices show, however, that the implied volatility is a function of the option’s strike, thus giving rise to the so-called volatility smile.

In foreign exchange (FX) markets, it is common practice to quote volatilities for FX call and put options in terms of their delta sensitivities rather than in terms of their strikes or their moneyness. Volatilities and deltas are quoted by means of a table, the volatility smile table, consisting of rows for each FX option expiry date and columns for a number of delta values, as well as a column for the at-the-money (ATM) volatilities.

The definition and usage of a volatility smile table is complicated by the fact that FX markets have established various delta and ATM conventions. In this article, we summarize these conventions and highlight their intuition. For each delta convention, we give formulas and methods for the conversion of deltas to strikes and vice versa. We describe how to retrieve volatilities from the table for an arbitrary FX option that is to be priced in accordance with the information contained therein. We point out some mathematical problems and pitfalls when trying to do so and give criteria under which these problems surface.

Definitions

FX Rate

Before discussing the various delta conventions, we summarize some basic terms and definitions that we use in this article.

• FX spot rate S(t): The FX spot rate S(t) is the current exchange rate at the present time t (“today”) between the domestic and the foreign currency. It is specified as the number of units of domestic currency that an investor gets in exchange for one unit of foreign currency,

$$S(t) := \frac{\text{number of units of domestic currency}}{\text{one unit of foreign currency}} \qquad (1)$$

• FX forward rate F(t, T): The FX forward rate F(t, T) is the exchange rate between the domestic and the foreign currency at some future point of time T as observed at the present time t (t < T). It is again specified as the number of units of domestic currency that an investor gets in exchange for one unit of foreign currency at time T.

Using arbitrage arguments, spot and forward FX rates are related by (see, for instance, [3]):

$$F(t, T) = S(t) \cdot \frac{D_{for}(t, T)}{D_{dom}(t, T)} \qquad (2)$$

where $D_{for} := D_{for}(t, T)$ is the foreign discount factor for time T (observed at time t) and $D_{dom} := D_{dom}(t, T)$ is the domestic discount factor for time T (observed at time t).

Note that the terminology in FX transactions is always confusing. In this article, we refer to the “domestic” currency in the sense of a base currency in relation to which “foreign” amounts of money are measured (see also [4]). By definition (1), an amount x in foreign currency, for example, is equivalent to x · S(t) units of domestic currency at time t.

In the markets, FX rates are usually quoted in a standard manner. For example, the USD–JPY exchange rate is usually quoted as the number of Japanese yen an investor receives in exchange for 1 USD. For a Japanese investor, the exchange rate would fit the earlier definition, while a US investor would either need to look at the reverse exchange rate 1/S(t) or think of Japanese yen as the “domestic” currency.

Value of FX Forward Contracts

When two parties agree on an FX forward contract at time s, they agree on the exchange of an amount of money in foreign currency at an agreed exchange rate
K against an amount of money in domestic currency at time T > s. When choosing K = F(s, T), the FX forward contract has no value to either of the parties at time s.

As, in general, the forward exchange rate changes over time, at some time t (s < t < T), the FX forward contract will have a nonzero value (in domestic currency) given by

\[ v_f(t,T) = D_{dom}\,(F(t,T) - K) = S(t)\,D_{for} - K\,D_{dom} \tag{3} \]

Value of FX Options

Upon deal inception, the holder of an FX option obtains the right to exchange a specified amount of money in domestic currency against a specified amount of money in foreign currency at an agreed exchange rate K. Assuming nonstochastic interest rates and the standard lognormal dynamics for the spot exchange rate, at time t, the domestic currency values of plain-vanilla European call and put FX options with strike K and expiry date T are given by their respective Black–Scholes formulas:

Call option:
\[ v_c(t,T) = D_{dom}\,F(t,T)\,N(d_+) - D_{dom}\,K\,N(d_-) \tag{4} \]

Put option:
\[ v_p(t,T) = (-1)\cdot\left[ D_{dom}\,F(t,T)\,N(-d_+) - D_{dom}\,K\,N(-d_-) \right] \tag{5} \]
\[ \phantom{v_p(t,T)} = D_{dom}\,F(t,T)\,\left[N(d_+) - 1\right] - D_{dom}\,K\,\left[N(d_-) - 1\right] \tag{6} \]

We call the currency in which an option’s value is measured its premium currency.

Also note that the present values of call and put options are related by the put–call parity:

\[ v_c - v_p = v_f = D_{dom}\,(F(t,T) - K) \tag{8} \]

Definition of Delta Types

This section summarizes the delta conventions used in FX markets and gives some of their properties. We outline the correspondence of each delta sensitivity with a particular delta hedge strategy the holder of an FX option chooses. FX options are peculiar in that the underlying coincides with the exchange rate. While, in general, it makes no sense to measure the value of an option in units of its underlying (e.g., in the number of shares of a company), the FX option position can be held either in domestic or in foreign currency. This gives rise to the premium-adjusted deltas.

For ease of notation, we drop the time dependency of S(t) and F(t, T) in the following and denote the spot exchange rate as of time t by S and the forward exchange rate for time T as observed in t by F.

Unadjusted Deltas

Spot Delta.

Definition 1 For FX options, the spot delta is defined as the derivative of the option price $v_{c/p}$ with respect to the FX spot rate S:
\[ \Delta^c_{S,pa} - \Delta^p_{S,pa} = D_{for}\,\frac{K}{F} \tag{21} \]

The defining equations for premium-adjusted deltas have interesting consequences: while put deltas are unbounded and strictly monotonous functions of K, call deltas are bounded (i.e., $\Delta^c_S \in [0; \Delta_{max}]$ with $\Delta_{max} < 1$) and are not monotonous functions of K. Thus, the relationship between call deltas and strikes K is not one to one.

Forward Delta Premium Adjusted.

Definition 4 The premium-adjusted forward delta is defined in analogy to the unadjusted delta:

\[ \Delta^{c/p}_{F,pa} = \frac{S \cdot \partial/\partial S\,\left(v_{c/p}/S\right)}{\partial v_f/\partial S} = \frac{\Delta^{c/p}_{S,pa}}{\partial v_f/\partial S} \tag{22} \]

Interpretation The intuition behind the definition of the premium-adjusted forward delta follows from an FX option position that is held in foreign currency and hedged by forward FX contracts in domestic currency. The premium-adjusted forward delta gives the number of forward contracts that are needed for the delta hedge in domestic currency of an FX option held in foreign currency. The derivation of the defining equation (22) is similar to the one for spot delta premium adjusted (cf. the section Spot Delta Premium Adjusted). Note that $\Delta_{F,pa}$ is a pure number without units.

Properties

Call option:
\[ \Delta^c_{F,pa} = \frac{K}{F}\,N(d_-) \tag{23} \]

Put option:
\[ \Delta^p_{F,pa} = \frac{K}{F}\cdot\left[N(d_-) - 1\right] = -\frac{K}{F}\,N(-d_-) \tag{24} \]

Put–call delta parity:
\[ \Delta^c_{F,pa} + \Delta^p_{F,pa} = \frac{K}{F}\left[N(d_-) - N(-d_-)\right] = 2\Delta^c_{F,pa} - \frac{K}{F} \tag{25} \]
\[ \Delta^c_{F,pa} - \Delta^p_{F,pa} = \frac{K}{F} \tag{26} \]

Also note the important remarks in the previous section on the domain, the range of values, and the relationship between call delta and option strike. They apply likewise to premium-adjusted forward deltas.

Definition of At-the-money Types

In this section, we summarize the various ATM definitions, comment on their financial interpretation, and give the relations between all relevant quantities in Table 1.
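As a quick numerical illustration of equations (23) to (26), the following sketch evaluates the premium-adjusted forward deltas for a few strikes. It assumes the usual Black–Scholes definition d− = (ln(F/K) − ½σ²T)/(σ√T); the function names and the input values are hypothetical and chosen only for the example.

```python
from math import log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cumulative distribution function


def d_minus(F, K, sigma, T):
    """d_- in its usual Black-Scholes form (an assumption of this sketch)."""
    return (log(F / K) - 0.5 * sigma**2 * T) / (sigma * sqrt(T))


def call_delta_fwd_pa(F, K, sigma, T):
    """Premium-adjusted forward call delta, equation (23)."""
    return K / F * N(d_minus(F, K, sigma, T))


def put_delta_fwd_pa(F, K, sigma, T):
    """Premium-adjusted forward put delta, equation (24)."""
    return -K / F * N(-d_minus(F, K, sigma, T))


# Hypothetical inputs: forward 1.0, volatility 20%, one year to expiry.
F, sigma, T = 1.0, 0.2, 1.0
for K in (0.5, 0.9, 2.0):
    c = call_delta_fwd_pa(F, K, sigma, T)
    p = put_delta_fwd_pa(F, K, sigma, T)
    # c - p should reproduce K/F, the parity in equation (26).
    print(f"K={K:4.2f}  call={c:.4f}  put={p:.4f}  call-put={c - p:.4f}")
```

With these inputs the call delta at K = 0.9 is larger than at both K = 0.5 and K = 2.0, which illustrates the remark that premium-adjusted call deltas are not monotonous in the strike.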
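Table 1 Strike values and delta values at the ATM point for the different FX delta conventions(a)

Convention | Δ-neutral | $K_{atm} = F$ | Vega/gamma = max | Δ = 50%
ATM strike values
Spot delta | $F e^{\frac12\sigma^2\tau}$ | $F$ | $F e^{\frac12\sigma^2\tau}$ | –
Fwd delta | $F e^{\frac12\sigma^2\tau}$ | $F$ | $F e^{\frac12\sigma^2\tau}$ | $F e^{\frac12\sigma^2\tau}$
Spot delta p.a. | $F e^{-\frac12\sigma^2\tau}$ | $F$ | $F e^{\frac12\sigma^2\tau}$ | –
Fwd delta p.a. | $F e^{-\frac12\sigma^2\tau}$ | $F$ | $F e^{\frac12\sigma^2\tau}$ | –
ATM delta values
Spot delta | $D_{for} N(0)$ | $D_{for} N(\frac12\sigma\sqrt{\tau})$ | $D_{for} N(0)$ | –
Fwd delta | $N(0)$ | $N(\frac12\sigma\sqrt{\tau})$ | $N(0)$ | $N(0)$
Spot delta p.a. | $D_{for}\, e^{-\frac12\sigma^2\tau} N(0)$ | $D_{for} N(-\frac12\sigma\sqrt{\tau})$ | $D_{for}\, e^{+\frac12\sigma^2\tau} N(-\sigma\sqrt{\tau})$ | –
Fwd delta p.a. | $e^{-\frac12\sigma^2\tau} N(0)$ | $N(-\frac12\sigma\sqrt{\tau})$ | $e^{+\frac12\sigma^2\tau} N(-\sigma\sqrt{\tau})$ | –

(a) Note that N(0) = 1/2. The delta values are given for call options; the corresponding values for put options can be obtained by replacing N(x) with (N(x) − 1).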
ATM Definition “Delta Neutral”

Definition 5 The ATM point is defined as the strike $K_{atm}$ for which the deltas of a call and a put option add up to zero:

\[ \Delta^c_x(K_{atm}, \sigma_{atm}) + \Delta^p_x(K_{atm}, \sigma_{atm}) = 0 \tag{27} \]

Here, x represents any of the delta conventions defined in the section Definition of Delta Types.

[…] this choice is that traders can use straddles to hedge the vega of their position without upsetting the delta.

Properties The ATM definition mentioned earlier for delta neutral FX options is equivalent to $N(d_+) = 1/2$ in case of the unadjusted delta conventions and $N(d_-) = 1/2$ in case of the premium-adjusted delta conventions. From this, the relationships of Table 1 follow in a straightforward manner.

ATM Definition via Forward

Definition 6 The ATM point is defined as the strike equaling the forward exchange rate:

\[ K_{atm} := F \tag{28} \]

Interpretation This definition reflects the view that (given the information at deal inception) an option is ATM when its strike is chosen equal to the expected exchange rate at option expiry. If the spot exchange rate, indeed, approached F as t → T (as would be the case in a fully deterministic world by arbitrage arguments, cf. equation (2)), then the ATM strike would mark the dividing point between options that expire in-the-money (ITM) and out-of-the-money (OTM). From the put–call parity (8), we see that this is also the strike at which put and call options have the same value. Thus, this ATM definition is also called value parity [4].

Properties The relationships of Table 1 can again be derived in a straightforward manner from the definitions.

ATM Definition “vega = max”

Definition 7 The ATM point is defined as the strike $K_{atm}$ for which vega of the FX option is at its maximum. Vega is the sensitivity of the FX option with respect to the implied volatility of the underlying exchange rate. It is given by (cf. [4])

[…]

\[ K_{atm} = F e^{\frac12\sigma^2\tau} \tag{30} \]

Properties Table 1 again summarizes the relevant quantities for this ATM definition. Note that in case of unadjusted deltas, this ATM definition is equivalent to the delta neutral ATM definition. This is, however, not the case for adjusted deltas.

ATM Definition “γ = max”

Definition 8 The ATM point is defined as the strike $K_{atm}$ for which the gamma sensitivity of the FX option is at its maximum.

We restrict the discussion to the case of gamma spot,

\[ \gamma^{c/p}_S := \frac{\partial \Delta^{c/p}_S}{\partial S} = D_{for}\,\frac{n(d_+)}{S\sigma\sqrt{\tau}} \tag{31} \]

From ∂γ/∂K = 0, the ATM strike can be derived as

\[ K_{atm} = F e^{\frac12\sigma^2\tau} \tag{32} \]

thus revealing the equivalence to the ATM definition “vega = max”.a

ATM Definition “50%”

Definition 9 According to this convention, the ATM point is defined by

\[ \Delta^c = 0.5 \quad\text{and}\quad |\Delta^p| = 0.5 \tag{33} \]
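The ATM strike definitions above can be sketched in a few lines. The example below assumes the usual Black–Scholes form d± = (ln(F/K) ± ½σ²τ)/(σ√τ); the function name `atm_strike` and its string arguments are hypothetical helpers, not part of any convention. It returns the strike for the delta-neutral, forward, and vega-maximum definitions and lets one confirm that the delta-neutral strike gives d+ = 0 for unadjusted deltas and d− = 0 for premium-adjusted deltas.

```python
from math import exp, log, sqrt


def d_plus(F, K, sigma, tau):
    """d_+ in its usual Black-Scholes form (an assumption of this sketch)."""
    return (log(F / K) + 0.5 * sigma**2 * tau) / (sigma * sqrt(tau))


def d_minus(F, K, sigma, tau):
    """d_- = d_+ - sigma * sqrt(tau)."""
    return d_plus(F, K, sigma, tau) - sigma * sqrt(tau)


def atm_strike(F, sigma, tau, atm_type, premium_adjusted=False):
    """ATM strike for the definitions discussed in the text.

    atm_type is one of "delta_neutral", "forward", "vega_max"
    (hypothetical labels introduced for this example).
    """
    half_var = 0.5 * sigma**2 * tau
    if atm_type == "forward":
        return F                      # equation (28)
    if atm_type == "vega_max":
        return F * exp(half_var)      # equation (30)
    if atm_type == "delta_neutral":
        # N(d_+) = 1/2 for unadjusted deltas, N(d_-) = 1/2 for adjusted ones.
        return F * exp(-half_var if premium_adjusted else half_var)
    raise ValueError(f"unknown ATM type: {atm_type}")


# Hypothetical inputs: forward 1.25, volatility 15%, two years to expiry.
F, sigma, tau = 1.25, 0.15, 2.0
k_dn = atm_strike(F, sigma, tau, "delta_neutral")
k_dn_pa = atm_strike(F, sigma, tau, "delta_neutral", premium_adjusted=True)
print(k_dn, d_plus(F, k_dn, sigma, tau))
print(k_dn_pa, d_minus(F, k_dn_pa, sigma, tau))
```

The two printed d values should be zero up to floating-point rounding, which is the content of the Properties paragraph for the delta-neutral definition.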
This condition can only be true for the forward delta convention and, thus, does not apply to any of the other delta conventions.

Properties of ATM Definitions

Table 1 summarizes the properties of the ATM point for all possible combinations of ATM definitions and delta conventions.

In this context, it is interesting to note that, beside their financial interpretation, mathematically, the various definitions of the ATM point lead to three characteristic relationships between the strike K and the forward exchange rate F:

\[ K = F \;\Rightarrow\; d_\pm = \pm\tfrac12\sigma\sqrt{\tau} \tag{34} \]
\[ K = F e^{\frac12\sigma^2\tau} \;\Rightarrow\; d_+ = 0,\; d_- = -\sigma\sqrt{\tau} \tag{35} \]
\[ K = F e^{-\frac12\sigma^2\tau} \;\Rightarrow\; d_+ = \sigma\sqrt{\tau},\; d_- = 0 \tag{36} \]

Converting Deltas to Strikes and Vice Versa

Quoting volatilities as a function of the options’ deltas rather than as a function of the options’ strikes brings about a problem when it comes to pricing FX options. Consider the case that we want to price a vanilla European option for a given strike K. To price this option, we have to find the correct volatility. As the volatility is given in terms of delta and delta itself is a function of volatility and strike, we have to solve an implicit problem, which, in general, has to be done numerically.

The following sections outline the algorithms that can be used to that end for the various delta conventions and directions of conversion. As the spot and forward deltas differ only by constant discount factors, we restrict the presentation to the forward versions of the adjusted and unadjusted deltas.

Forward Delta

For unadjusted deltas, there are simple one-to-one relationships between put and call deltas, the put–call delta parities (12) and (16). Put deltas can therefore easily be translated into the corresponding call deltas and it is sufficient to perform all the calculations for call deltas here.

Conversion of Forward Delta to Strike. If volatilities are given as a function of forward delta, the strike corresponding to a given forward delta $\Delta^c_F$ can be calculated analytically. Let $\sigma(\Delta^c_F)$ denote the volatility associated with $\Delta^c_F$; $\sigma(\Delta^c_F)$ may either be quoted or interpolated from the volatility smile table. With $\Delta^c_F$ and $\sigma(\Delta^c_F)$ given, we can directly solve equation (14),

\[ \Delta^c_F = N(d_+) = N\!\left( \frac{\ln(F/K) + \frac12\sigma(\Delta^c_F)^2\,T}{\sigma(\Delta^c_F)\sqrt{T}} \right) \tag{37} \]

for the strike K. We get

\[ K = F \exp\!\left( -\sigma(\Delta^c_F)\sqrt{T}\;N^{-1}(\Delta^c_F) + \tfrac12\sigma(\Delta^c_F)^2\,T \right) \tag{38} \]

Conversion of Strike to Forward Delta. The reverse conversion from strikes to forward deltas is more difficult and can only be achieved numerically. The following algorithm can be shown to converge [1] and has empirically proven to be very efficient.

In the first step, calculate a zero-order guess $\Delta_0$ by using the ATM volatility $\sigma_{atm}$ in equation (14):

\[ \Delta_0 = \Delta^c_F(K, \sigma_{atm}) = N\!\left( \frac{\ln(F/K) + \frac12\sigma_{atm}^2\,T}{\sigma_{atm}\sqrt{T}} \right) \tag{39} \]

In the second step, use this zero-order guess for delta to derive a first-order guess $\sigma_1$ for the volatility by interpolating the curve $\sigma(\Delta)$:

\[ \sigma_1 = \sigma(\Delta_0) \tag{40} \]

Finally, calculate the corresponding first-order guess for $\Delta^c_F$ in the third step:

\[ \Delta_1 = \Delta^c_F(K, \sigma_1) = N\!\left( \frac{\ln(F/K) + \frac12\sigma_1^2\,T}{\sigma_1\sqrt{T}} \right) \tag{41} \]

and repeat steps two and three until the changes in Δ from one iteration to the next are below the specified accuracy.
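The two conversions for unadjusted forward deltas can be sketched as follows: the closed-form inversion of equation (38) and the fixed-point iteration of equations (39) to (41). The smile function `smile` is a hypothetical quadratic in delta that stands in for interpolation of a quoted volatility smile table; all names and numbers are illustrative assumptions rather than market conventions.

```python
from math import exp, log, sqrt
from statistics import NormalDist

_nd = NormalDist()  # standard normal distribution


def fwd_call_delta(F, K, sigma, T):
    """Unadjusted forward call delta N(d_+), as in equation (37)."""
    return _nd.cdf((log(F / K) + 0.5 * sigma**2 * T) / (sigma * sqrt(T)))


def strike_from_fwd_delta(F, delta, sigma, T):
    """Closed-form inversion of the forward call delta, equation (38)."""
    return F * exp(-sigma * sqrt(T) * _nd.inv_cdf(delta) + 0.5 * sigma**2 * T)


def smile(delta):
    """Hypothetical smile sigma(delta); a real table would be interpolated."""
    return 0.10 + 0.08 * (delta - 0.5) ** 2


def fwd_delta_from_strike(F, K, T, smile_fn, sigma_atm, tol=1e-10, max_iter=100):
    """Fixed-point iteration of equations (39)-(41).

    Returns the converged forward call delta and the volatility used for it.
    """
    delta = fwd_call_delta(F, K, sigma_atm, T)        # step 1, eq. (39)
    for _ in range(max_iter):
        sigma = smile_fn(delta)                       # step 2, eq. (40)
        new_delta = fwd_call_delta(F, K, sigma, T)    # step 3, eq. (41)
        if abs(new_delta - delta) < tol:
            return new_delta, sigma
        delta = new_delta
    raise RuntimeError("delta iteration did not converge")


# Hypothetical inputs: forward 1.0, strike 1.05, six months, ATM vol 10%.
delta, sigma = fwd_delta_from_strike(1.0, 1.05, 0.5, smile, 0.10)
print(delta, sigma, strike_from_fwd_delta(1.0, delta, sigma, 0.5))
```

Feeding the converged delta and volatility back into `strike_from_fwd_delta` should return the original strike, which is a convenient consistency check on both directions of the conversion.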
[…] the strike K appears inside and outside the cumulative normal distribution function so that one cannot solve directly for K. Even though both $\Delta^c_{F,pa}$ and $\sigma(\Delta^c_{F,pa})$ are given, the problem has to be solved numerically.

So when converting a given call delta $\Delta^c_{F,pa}$, a root finder has to be used to solve for the corresponding strike K. This could, for example, be a simple bisection method where, for call deltas, the ATM strike $K_{atm}$ is the lower bound and some high (i.e., quasi infinite) value such as $100\,K_{atm}$ can be used as upper bound. Of course, a more elaborate root finder to solve this problem could (and should!) be used, but a discussion of the various methods lies beyond the scope of this article.

Note, however, that all these methods require that Δ is a strictly monotonous function of K. We will see in section Ambiguities in the Conversion from Δ to Strike for Premium-adjusted Deltas that this is not always the case.

Conversion of Strike to Forward Delta Premium Adjusted. The conversion of an absolute strike to forward delta premium adjusted can be done analogously to the conversion into an unadjusted forward delta as described in section Conversion of Strike to Forward Delta.

First, use the previous guess $\Delta_{i-1}$ to obtain an improved guess $\sigma_i$ for the volatility by interpolating […]

In FX markets, linear combinations of plain-vanilla FX options, such as “strangles”, “risk reversals”, and “butterflies”, are liquidly traded. These instruments are composed of ATM and OTM plain-vanilla put and call options at specific values of delta (typically, 0.25 or 0.1). When aggregating this market information, one obtains a scheme called the volatility smile table consisting of rows for each FX option expiry date, a column for ATM volatilities, and two distinct sets of columns for volatilities of OTM put and call options. We call these sets the put and call sides of the table.

Thus, OTM options can be priced by retrieving volatilities from the respective side of the volatility smile table, for example, OTM calls are priced using volatilities from the call side. By virtue of the put–call parity, ITM options can be priced using volatilities from the opposite side of the table, that is, ITM calls are priced using volatilities from the put side.

To exemplify this, consider the case of an option with arbitrary time to maturity and strike. The typical procedure to retrieve this option’s volatility from the smile table would be the following.

1. Determine the volatilities of the ATM point and the call and put sides at the option’s expiry. For this, the volatilities of each delta column will, in general, have to be interpolated in time.
2. Decide which side of the table to use depending on the option’s strike K: options with K >
$K_{atm}$ are either OTM calls or ITM puts and are therefore priced using volatilities from the call side. Accordingly, options with $K < K_{atm}$ are priced using volatilities from the put side (cf. equation (4)).
3. Convert the option’s strike to delta. This depends on the side of the table chosen in the previous step: convert to a call delta if $K > K_{atm}$, and to a put delta if $K < K_{atm}$. See the section Converting Deltas to Strikes and Vice Versa for the details of these conversions.
4. Retrieve the volatility from the table by interpolating volatilities in delta.

Alternatively, one could also translate the full smile volatility table from deltas to strikes. The conversion of deltas to strikes would then be necessary for the grid points of the table only, thus making steps 2 and 3 of the earlier listed procedure obsolete. The interpolation in step 4 would be done in strikes. However, keep in mind that the strike grid points would vary from row to row.

It is important to note that the earlier procedure is based on the assumption that delta is a strictly monotonous function of the option’s strike K: only in this case are the option’s delta and strike equivalent measures of the option’s moneyness, that is, only in this case are we guaranteed the equivalence

\[ K > K_{atm} \iff \Delta^c < \Delta^c_{atm} \tag{45} \]

In the following section, we will show that the assumption of monotonicity is not always true and derive the conditions under which it is violated.

[…] be performed. Which of the two conventions shall be used for options with expiries t within the range $T_k < t < T_{k+1}$?

Some possibilities are as follows: convert the grid points at $T_k$ into the convention used for $T_{k+1}$ and do interpolation in the long-term convention or, conversely, convert the grid points at $T_{k+1}$ into the convention used for $T_k$ and interpolate in the short-term convention. Another possibility would be to translate the delta grid into a strike grid and do the interpolation on strikes.

None of these approaches is a priori superior to the others. In real life, however, a choice has to be made. Even though the differences may be small, one should be aware that the choice is arbitrary.

ATM Delta Falling in the OTM Range

For long times to maturity, the delta of an ATM FX option may become smaller than the delta of the closest OTM option. In a sense, the ATM delta “crosses” the nearest OTM delta. In this case, it is unclear how the interpolation of volatilities (in delta) should be done or whether the ATM point or the crossed delta point should possibly be ignored.

The conditions under which this problem occurs vary with the delta and ATM types. In the following, we outline the derivation of these conditions for an exemplarily chosen combination (premium-adjusted spot deltas and forward ATM definition), summarize the results for other combinations, and finally discuss a typical numerical example for long-dated FX options.
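Returning to the conversion of a premium-adjusted forward call delta to a strike, the simple bisection mentioned in the text (lower bound at the ATM strike, upper bound at 100 times the ATM strike) might look like the sketch below. It assumes the standard form of d−, a delta-neutral ATM strike, and a known volatility for the quoted delta; the function names and inputs are hypothetical. The sketch is only meaningful when the delta is monotonous in K on the bracket, which the text warns is not always the case.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cumulative distribution function


def call_delta_fwd_pa(F, K, sigma, T):
    """Premium-adjusted forward call delta (K/F) * N(d_-)."""
    d_minus = (log(F / K) - 0.5 * sigma**2 * T) / (sigma * sqrt(T))
    return K / F * N(d_minus)


def strike_from_fwd_pa_call_delta(F, delta, sigma, T, K_atm,
                                  tol=1e-12, max_iter=200):
    """Bisection on [K_atm, 100 * K_atm] for the strike matching `delta`.

    Assumes the delta decreases monotonically in K on this bracket.
    """
    lo, hi = K_atm, 100.0 * K_atm

    def g(K):
        return call_delta_fwd_pa(F, K, sigma, T) - delta

    if g(lo) < 0.0 or g(hi) > 0.0:
        raise ValueError("target delta is not bracketed by the strike bounds")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid   # delta still too high: move to higher strikes
        else:
            hi = mid
        if hi - lo < tol * K_atm:
            break
    return 0.5 * (lo + hi)


# Hypothetical inputs: F = 1, one year, ATM vol 10%, vol 11% at the 25-delta call.
F, T, sigma_atm, sigma_25 = 1.0, 1.0, 0.10, 0.11
K_atm = F * exp(-0.5 * sigma_atm**2 * T)  # delta-neutral ATM strike, p.a. case
K_25 = strike_from_fwd_pa_call_delta(F, 0.25, sigma_25, T, K_atm)
print(K_25, call_delta_fwd_pa(F, K_25, sigma_25, T))
```

The bracket check at the start is a deliberate design choice: it makes the routine fail loudly when the quoted delta cannot be reached between the two bounds instead of silently returning an endpoint.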
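Table 2 Restrictions on discount factors and ATM volatilities for various combinations of delta and ATM types(a)

Delta type | ATM-type forward | ATM-type delta neutral
Spot delta | $D_{for} > -2\Delta^p_1$ and $\sigma_{atm}\sqrt{T} < 2N^{-1}\!\left(1 + \frac{\Delta^p_1}{D_{for}}\right)$ | $D_{for} > 2\Delta^c_1$
Forward delta | $\sigma_{atm}\sqrt{T} < 2N^{-1}\!\left(1 + \Delta^p_1\right)$ | No constraint
Spot delta p.a. | $D_{for} > 2\Delta^c_1$ and $\sigma_{atm}\sqrt{T} < 2N^{-1}\!\left(1 - \frac{\Delta^c_1}{D_{for}}\right)$ | $D_{for} > 2\Delta^c_1$ and $\sigma_{atm}\sqrt{T} < \sqrt{2\ln\frac{D_{for}}{2\Delta^c_1}}$
Forward delta p.a. | $\sigma_{atm}\sqrt{T} < 2N^{-1}\!\left(1 - \Delta^c_1\right)$ | $\sigma_{atm}\sqrt{T} < \sqrt{2\ln\frac{1}{2\Delta^c_1}}$

(a) If these conditions are violated, the ATM point will cross the nearest OTM point. Note that the put delta $\Delta^p_1$ is negative.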
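To get a feeling for the size of the bounds in Table 2, the sketch below evaluates the two premium-adjusted spot delta conditions for a 30-year option. The nearest OTM call delta of 0.25 follows the typical value mentioned in the text, while the foreign discount factor of 0.9 is purely a hypothetical input chosen for illustration; the function names are likewise assumptions of this example.

```python
from math import log, sqrt
from statistics import NormalDist

inv_N = NormalDist().inv_cdf  # inverse standard normal CDF


def max_alpha_spot_pa_forward_atm(delta_c, D_for):
    """Upper bound on sigma_atm * sqrt(T): spot delta p.a., forward ATM."""
    if D_for <= 2.0 * delta_c:
        raise ValueError("requires D_for > 2 * delta_c")
    return 2.0 * inv_N(1.0 - delta_c / D_for)


def max_alpha_spot_pa_delta_neutral(delta_c, D_for):
    """Upper bound on sigma_atm * sqrt(T): spot delta p.a., delta-neutral ATM."""
    if D_for <= 2.0 * delta_c:
        raise ValueError("requires D_for > 2 * delta_c")
    return sqrt(2.0 * log(D_for / (2.0 * delta_c)))


T = 30.0            # years to expiry
delta_c = 0.25      # nearest OTM call delta on the grid
D_for = 0.9         # hypothetical foreign discount factor

for name, bound in (
    ("forward ATM", max_alpha_spot_pa_forward_atm(delta_c, D_for)),
    ("delta-neutral ATM", max_alpha_spot_pa_delta_neutral(delta_c, D_for)),
):
    print(f"{name}: sigma_atm < {bound / sqrt(T):.4f}")
```

Setting `D_for = 1.0` in the same functions gives the corresponding forward delta p.a. rows of the table, since the spot and forward versions differ only by the discount factor.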
[…] OTM point $\Delta^c_1$ only if

\[ \sigma_{atm}\sqrt{T} < -2N^{-1}\!\left(\frac{\Delta^c_1}{D_{for}}\right) = 2N^{-1}\!\left(1 - \frac{\Delta^c_1}{D_{for}}\right) \tag{47} \]

This inequality has meaningful solutions only if the right-hand side is positive, that is, only if $1 - \Delta^c_1/D_{for} > 1/2$. We therefore have the additional constraint:

[…]

For a 30-year option, this condition restricts the ATM volatility to a maximum of 21.5%, a value that, a priori, does not seem unattainable.

Ambiguities in the Conversion from Δ to Strike for Premium-adjusted Deltas

Premium-adjusted deltas can cause further complications. Recall from the section Forward Delta Premium Adjusted […]

[Figure 1: panel (a) plots the call delta fwd p.a. against strike, with market quotes and the levels $\Delta_{max}$, $\Delta_{atm}$, and 0.25 marked, and $K_{atm}$, $K_{max}$, and K(Δ = 0.25) on the strike axis; panel (b) plots $f_\alpha(x)$ against x for α = 0.7 and α = 1.1.]

Figure 1 (a) Premium adjusted (forward) call delta as a function of strike. (b) The function $f_\alpha(x)$ as defined in equation (56) for α = 0.7 (broken line) and α = 1.1 (solid line)
[…] smile table, this can be done by direct interpolation of the entries, possibly including the ATM point.

In case $K_{max} > K_{atm}$, however, the situation is more complicated. $\Delta^c_{F,pa}(K)$ is no longer a monotonous function of K and the conversion of deltas to strikes is no longer unique for all OTM options. Thus, when translating a smile table in deltas into a smile table in strikes, particular care has to be taken. In addition, when retrieving the volatility for options with strikes $K \in (K_{atm}; K_{max})$ from the volatility smile table, one has to extrapolate volatilities in delta beyond the ATM point, which seems odd.

Note that the counterintuitive extrapolation in delta beyond the ATM point does not occur in case the volatility smile table in deltas is translated to a table in strikes on the grid points first (see the section Using the Smile Volatility Table) and the interpolation is done in strikes. This is possible as long as there is no crossing of the ATM point and the closest delta grid point as discussed in the section ATM Delta Falling in the OTM Range.

In the following discussion, we show that the case $K_{max} > K_{atm}$ can, indeed, occur. We restrict ourselves to premium adjusted forward call deltas and ATM-type delta neutral. The conditions for the ATM-type forward can be derived analogously and we summarize the results at the end of this section. Similar arguments hold for premium-adjusted spot deltas; however, it is reasonable to assume that the problems outlined in the section ATM Delta Falling in the OTM Range surface beforehand.

Starting from the expressions for the ATM strike and the call delta, see Table 1,

\[ K_{atm} = F e^{-\frac12\sigma_{atm}^2 T} \tag{51} \]
\[ \Delta^c_{atm} = \Delta^c_{F,pa}\!\left(K_{atm}, \sigma_{atm}\right) = \tfrac12\, e^{-\frac12\sigma_{atm}^2 T} \tag{52} \]

we need to find conditions under which there is a second solution with strike $K > K_{atm}$ that solves

\[ \Delta^c_{atm} = \Delta^c_{F,pa}\!\left(K, \sigma(\Delta^c_{F,pa})\right) \tag{53} \]

This may seem a difficult problem at first glance since the delta on the right-hand side depends on the volatility, which itself is given in terms of delta. It is, however, simplified considerably by the following argument: while we do not know the strike for the point that solves equation (53), we do know the delta there: it is the ATM delta. As the volatility is given in terms of delta, we also know the volatility on the right-hand side of equation (53): it must be the ATM volatility $\sigma_{atm}$.

Therefore, the problem can be reformulated as follows: when does

\[ \Delta^c_{atm} = \Delta^c_{F,pa}(K, \sigma_{atm}) \tag{54} \]

have a second solution with $K > K_{atm}$ (besides the trivial one at $K = K_{atm}$)?

Inserting the expressions for the ATM delta and the premium-adjusted forward call delta into equation […]
\[ f_\alpha(x) := e^{-\alpha^2/2}\left[ \frac12 - x\,N\!\left(-\frac{\ln(x)}{\alpha}\right) \right] \tag{56} \]

The condition $K > K_{atm}$ in the previous equation corresponds to x > 1. Plotting $f_\alpha(x)$ for various values of α, see Figure 1(b), we make the following observation: for small values of α the function increases monotonically, taking on only positive values for x > 1. For larger values of α, the function decreases at first, thus taking on negative values in a certain range, and then increases again, eventually reaching a second zero.

Therefore, the question reduces further to this: under which conditions does the function $f_\alpha(x)$ have a negative slope at x = 1? The first derivative of $f_\alpha$ can easily be calculated,

\[ f'_\alpha(x) = -e^{-\alpha^2/2}\left[ N\!\left(-\frac{\ln(x)}{\alpha}\right) - \frac{1}{\alpha}\,N'\!\left(-\frac{\ln(x)}{\alpha}\right) \right] \tag{57} \]

and the slope of $f_\alpha$ at x = 1 is now readily obtained as

\[ f'_\alpha(x = 1) = -e^{-\alpha^2/2}\left[ N(0) - \frac{1}{\alpha}\,N'(0) \right] = -e^{-\alpha^2/2}\left[ \frac12 - \frac{1}{\alpha}\,\frac{1}{\sqrt{2\pi}} \right] \tag{58} \]

The constraint that it has to be positive yields the condition

\[ \alpha = \sigma_{atm}\sqrt{T} < \sqrt{\frac{2}{\pi}} \tag{59} \]

which restricts the ATM volatility for a 30-year option to only 14.6%.

[…]

\[ \alpha = \sigma_{atm}\sqrt{T} < 1.224 \tag{61} \]

so that for a 30-year option, ATM volatilities above 22.3% will lead to ambiguities.

End Notes

a. It should be noted that equation (32) was obtained under the assumption that the slope of the volatility σ as a function of the strike K in equation (31) is 0 ATM. This is necessary as otherwise this ATM definition would become impracticable. In general, however, the volatility smile implies a nonzero slope and the maximum value of γ will not be found at the strike given by equation (32).

References

[1] Borowski, B. (2005). Hedgingverfahren für Foreign Exchange Barrieroptionen, Diploma Thesis, Technical University of Munich.
[2] Carr, P. & Bandyopadhyay, A. (2000). How to derive the Black–Scholes Equation Correctly? http://faculty.chicagogsb.edu/akash.bandyopadhyay/research/ (accessed Mar 2000).
[3] Hull, J.C. (1997). Options, Futures and Other Derivative Securities, 3rd Edition, Prentice Hall, NJ.
[4] Wystup, U. (2006). FX Options and Structured Products, Wiley.

Related Articles

Foreign Exchange Markets; Foreign Exchange Symmetries; Foreign Exchange Smile Interpolation.

CLAUS CHRISTIAN BEIER & CHRISTOPH RENNER
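As a numerical check of the slope argument for $f_\alpha$ in equations (56) to (59), the sketch below evaluates the function and its slope at x = 1 for the two values of α shown in Figure 1(b), and converts the bounds (59) and (61) into volatilities for a 30-year option. The function names are hypothetical; the threshold 1.224 in (61) is taken as given from the text.

```python
from math import exp, log, pi, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cumulative distribution function


def f_alpha(x, alpha):
    """The function of equation (56); it vanishes at x = 1."""
    return exp(-alpha**2 / 2.0) * (0.5 - x * N(-log(x) / alpha))


def slope_at_one(alpha):
    """Slope of f_alpha at x = 1, equation (58)."""
    return -exp(-alpha**2 / 2.0) * (0.5 - 1.0 / (alpha * sqrt(2.0 * pi)))


alpha_crit = sqrt(2.0 / pi)   # bound on sigma_atm * sqrt(T), equation (59)
T = 30.0                      # years to expiry

for alpha in (0.7, 1.1):      # the two curves of Figure 1(b)
    print(alpha, slope_at_one(alpha), f_alpha(1.2, alpha))

print("max ATM vol from (59):", alpha_crit / sqrt(T))
print("max ATM vol from (61):", 1.224 / sqrt(T))
```

For α = 0.7 the slope at x = 1 is positive and the function stays positive just beyond x = 1, whereas for α = 1.1 the slope is negative and the function dips below zero, which matches the qualitative description of the two curves.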
Stochastic Volatility Models: Foreign Exchange

[…] is to model the FX rate $X_t$ with a volatility process $v_t$ by a system of stochastic differential equations, like

\[ dX_t = \mu_t X_t\,dt + \sigma_t X_t\,dW^X_t \]
\[ dv_t = \eta(v_t)\,dt + \zeta(v_t)\,dW^v_t \tag{1} \]
Table 1 Some common stochastic volatility models

$f(\sigma_t)$ | $\eta(v_t)$ | $\zeta(v_t)$ | Reference
$\sigma_t^2$ | $\kappa(\theta - v_t)$ | $\xi v_t$ | GARCH similar diffusion model [11]
$\sigma_t^2$ | $\kappa(\theta - v_t)$ | $\xi\sqrt{v_t}$ | Heston model [9]
$\sigma_t^2$ | $\kappa(\theta v_t - v_t^2)$ | $\xi v_t^{3/2}$ | 3/2 model [11]
$\ln\sigma_t^2$ | $\kappa(\theta - v_t)$ | $\xi$ | Log-volatility Ornstein–Uhlenbeck [11]

[…] an FX setting. The model is characterized by the stochastic differential equations:

\[ dX_t = (r_d - r_f)X_t\,dt + \sqrt{v_t}\,X_t\,dW^X_t \]
\[ dv_t = \kappa(\theta - v_t)\,dt + \sigma\sqrt{v_t}\,dW^v_t \tag{2} \]

with $\mathrm{Cov}\left[dW^X_t, dW^v_t\right] = \rho\,dt$. Here, the FX rate process $\{X_t\}_{t\ge 0}$ is modeled by a process, similar to the geometric Brownian motion, but with a nonconstant instantaneous variance $v_t$. The variance process $\{v_t\}_{t\ge 0}$ is driven by a mean-reverting stochastic square-root process. The increments of the two Wiener processes $\{W^X_t\}_{t\ge 0}$ and $\{W^v_t\}_{t\ge 0}$ are assumed to be correlated with rate ρ. In an FX setting, the risk-neutral drift term of the underlying process is the difference between the domestic and the foreign interest rates $r_d - r_f$. The quantities κ ≥ 0 and θ ≥ 0 denote the rate of mean reversion and the long-term variance. The parameter σ is often called vol of vol, but it should be called volatility of instantaneous variance.

The term $\sqrt{v_t}$ in equation (2) ensures a nonnegative volatility in the FX rate process. It is known that the distribution of values of $\{v_t\}_{t\ge 0}$ is given by a noncentral chi-squared distribution. Hence, the probability that the variance takes a negative value is equal to zero. Thus, if the process touches the zero bound, the stochastic part of the volatility process turns zero and the deterministic part will ensure a nonnegative volatility because of the positivity of κ and θ.

The Heston model is often not capable of fitting complicated structures of implied volatility surfaces. In particular, this is true if the term structure exhibits a nonmonotone form or the sign of the skew changes with increasing maturity. For a discussion of the implied volatility surface generated by this model, see [7]. One approach to tackle this limitation is to extend the original Heston model by time-dependent parameters [3, 14].

Valuation of Options in the Heston Model

For the valuation of options in the Heston model, we consider the value function of a general contingent claim V(t, v, X). As shown in [8], applying Itô’s lemma, the self-financing condition, and the possibility to trade in the underlying exchange rate, money market, and another option, which is dependent on time, volatility, and X, we arrive at Garman’s partial differential equation:

\[ \frac{\partial V}{\partial t} + \kappa(\theta - v)\frac{\partial V}{\partial v} + (r_d - r_f)X\frac{\partial V}{\partial X} + \frac12\sigma^2 v\,\frac{\partial^2 V}{\partial v^2} + \frac12 v X^2\,\frac{\partial^2 V}{\partial X^2} + \rho\sigma v X\,\frac{\partial^2 V}{\partial v\,\partial X} - r_d V = 0 \tag{3} \]

A solution to the above equation can be obtained by specifying appropriate boundary and exercise conditions, which depend on the contract specifications. In the case of European vanilla options, Heston [9] provided a closed-form solution, namely,

\[ \text{Vanilla} = \phi\left[ e^{-r_f\tau} X_t P_1 - K e^{-r_d\tau} P_2 \right] \tag{4} \]

where τ = T − t is the time to maturity, φ = ±1 is the call–put indicator and K is the strike price. The quantities $P_1$ and $P_2$ define the probability that the exchange rate X at maturity is greater than K under the spot and the risk-neutral measure, respectively. The spot delta of the European vanilla option is equal to $\phi e^{-r_f\tau} P_1$.

Assuming that the distribution of $\ln X_T$ at time t under the two different measures is determined uniquely by its characteristic function $\varphi_j$, for j = 1, 2, it is shown, in [15], that $P_1$ and $P_2$ can be expressed in terms of the inverse Fourier transformation

\[ P_j = \frac12 + \phi\,\frac{1}{\pi}\int_0^\infty \mathrm{Re}\left[ \frac{\exp(-iu\ln K)\,\varphi_j(u)}{iu} \right] du \tag{5} \]

The integration in equation (5) can be done using numerical integration methods such as Gauss–Laguerre integration or fast Fourier transform approximation. In [10], it is shown that the computational time of the fast Fourier transform approach to compute vanilla option prices is higher […]
[Figure 2: panel (a) plots volatility against delta (−25D, ATM, 25D) and time to maturity; panel (b) plots volatility against time to maturity.]
Figure 2 Implied volatilities of the Heston model fitted to market volatilities for USD/JPY with maturities of 1, 2, 3,
6, 9, 12, and 24 months and strikes for 10% and 25% put, ATM, 10%, and 25% call. The dots show the market
volatilities and the circles the calibrated volatilities. (a) The whole volatility surface. (b) The implied volatility term structure
for strikes 25% call, ATM, and 25% put (from bottom to top)
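To make the Heston dynamics of equation (2) more concrete, the following sketch simulates one path of the FX rate and its variance. It uses a log-Euler step for the rate together with a full-truncation Euler step for the variance, which is a common discretization swapped in here for illustration and not a method described in this article; the function name and all parameter values are hypothetical.

```python
import math
import random


def simulate_heston(x0, v0, rd, rf, kappa, theta, sigma, rho, T, n_steps, rng):
    """Simulate one path of equation (2) and return (X_T, v_T).

    The variance is floored at zero inside drift and diffusion
    (full truncation), an assumption of this sketch.
    """
    dt = T / n_steps
    x, v = x0, v0
    for _ in range(n_steps):
        z1 = rng.gauss(0.0, 1.0)
        # Second normal with correlation rho to the first one.
        z2 = rho * z1 + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        v_pos = max(v, 0.0)
        # Log-Euler step keeps the FX rate strictly positive.
        x *= math.exp((rd - rf - 0.5 * v_pos) * dt + math.sqrt(v_pos * dt) * z1)
        v += kappa * (theta - v_pos) * dt + sigma * math.sqrt(v_pos * dt) * z2
    return x, v


# Hypothetical parameters for a two-year horizon with daily-like steps.
rng = random.Random(42)
x_T, v_T = simulate_heston(
    x0=100.0, v0=0.02, rd=0.03, rf=0.01,
    kappa=1.5, theta=0.02, sigma=0.3, rho=-0.4,
    T=2.0, n_steps=500, rng=rng,
)
print(x_T, v_T)
```

Passing the random generator in explicitly keeps the path reproducible, which is convenient when comparing discretization choices on identical random draws.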
Table 2 Down-and-out put values with at-the-money strike and discrete monitoring at 6, 12, and 18 months

Barrier (% of spot) | 90 | 80 | 70 | 60 | 50 | 10
Black–Scholes | 0.8752 | 3.7068 | 5.7670 | 6.2663 | 6.3000 | 6.3012
Heston | 0.6309 | 1.8723 | 3.1830 | 4.3376 | 5.2202 | 6.3023

The value of the corresponding plain vanilla put in the Heston model is given by 6.3060
Foreign Exchange Options; Foreign Exchange Smiles; Heston Model; Implied Volatility Surface;

SUSANNE A. GRIEBSCH & KAY F. PILZ
Foreign Exchange Basket Options

These strategies are composed of spot transactions, forwards, and, in many cases, options on a single currency. Nevertheless, there are instruments that include several currencies, and these can be used to build a multicurrency strategy that is almost always cheaper than the portfolio of the individual strategies. As a leading example, we explain basket options in detail.a

[…] at maturity T, where $S_1(t)$ denotes the exchange rate of EUR/USD and $S_2(t)$ denotes the exchange rate of EUR/JPY at time t, $a_i$ the corresponding weights, and K the strike.

A basket option protects against a drop in both currencies at the same time. Individual options on each currency cover some cases, which are not protected by a basket option (shaded triangular areas in Figure 1) and that is why they cost more than a basket.

The ellipsoids connect the points that are reached with the same probability assuming that the forward prices are at the center.

Pricing Basket Options

[…]

\[ dS_i = \mu_i S_i\,dt + S_i \sum_{j=1}^{N} \Omega_{ij}\,dW_j \tag{2} \]

is the basis for pricing. Here $\mu_i$ denotes the difference between the foreign and the domestic interest rate of the ith currency pair and $dW_j$ the jth component of independent Brownian increments. The covariance matrix is given by $C_{ij} = (\Omega\Omega^T)_{ij} = \rho_{ij}\sigma_i\sigma_j$. Here $\sigma_i$ denotes the volatility of the ith currency pair and $\rho_{ij}$ the correlation coefficients.

[…] given we know the spot S(t) at time t. It is a fact that the sum of lognormal processes is not lognormal itself, but as a crude approximation, it is certainly a quick method that is easy to implement. To price the basket call, the drift and the volatility of the basket spot need to be determined. This is done by matching the first and second moment of the basket spot with the first and second moment of the lognormal model for the basket spot. The moments of lognormal spot are

\[ E(S(T)) = S(t)\,e^{\mu(T-t)} \]
\[ E(S(T)^2) = S(t)^2\,e^{(2\mu+\sigma^2)(T-t)} \tag{5} \]
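The moment-matching idea described above can be sketched as follows. The example assumes that the basket spot is the weighted sum of the individual spots, that t = 0, and that each component is a correlated geometric Brownian motion, for which the cross moment $E(S_i(T)S_j(T)) = S_i S_j e^{(\mu_i+\mu_j+C_{ij})T}$ holds; the function name and all inputs are hypothetical. Solving the two relations in (5) for μ and σ² then gives the parameters of the approximating lognormal, which are fed into a Black–Scholes call formula.

```python
from math import exp, log, sqrt
from statistics import NormalDist

N = NormalDist().cdf  # standard normal cumulative distribution function


def basket_call_moment_matching(weights, spots, drifts, cov, rd, K, T):
    """Approximate basket call value by matching two lognormal moments.

    weights, spots, drifts: per-currency lists; cov: matrix C_ij of
    annualized covariances; rd: domestic rate (all hypothetical inputs).
    """
    n = len(spots)
    s0 = sum(w * s for w, s in zip(weights, spots))
    m1 = sum(weights[i] * spots[i] * exp(drifts[i] * T) for i in range(n))
    m2 = sum(
        weights[i] * weights[j] * spots[i] * spots[j]
        * exp((drifts[i] + drifts[j] + cov[i][j]) * T)
        for i in range(n) for j in range(n)
    )
    mu = log(m1 / s0) / T          # from E(S(T)) in equation (5)
    var = log(m2 / m1**2) / T      # sigma^2, from the ratio of the two moments
    sigma = sqrt(var)
    F = s0 * exp(mu * T)           # forward of the approximating lognormal
    d_plus = (log(F / K) + 0.5 * var * T) / (sigma * sqrt(T))
    d_minus = d_plus - sigma * sqrt(T)
    return exp(-rd * T) * (F * N(d_plus) - K * N(d_minus))


# Hypothetical two-currency basket, both legs quoted in the base currency.
vols, rho = (0.105, 0.100), 0.6
cov = [[vols[0]**2, rho * vols[0] * vols[1]],
       [rho * vols[0] * vols[1], vols[1]**2]]
print(basket_call_moment_matching([0.5, 0.5], [1.0, 1.0], [0.0, 0.01],
                                  cov, rd=0.04, K=1.0, T=0.25))
```

For a basket with a single component the matched lognormal is exact, so the function reduces to the plain Black–Scholes call value; this gives a simple sanity check of the implementation.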
η
√
S1(T ) σ − σ 2 +λ (1+(λ/η)) σ 2 T −2 ln(F η/K)
d2 = √
Tλ
√
√ σ T
d1 = ηd2 + √ (9)
η
Figure 1 Basket-payoff and contour lines for probabilities
where η = 1 − λT
The new parameter λ is determined by matching
We solve these equations for the drift and volatil-
the third moment of the basket spot and the model
ity:
spot. For details, see [1].
1 E(S(T )) Most remarkably, this major improvement in the
µ= ln accuracy only requires a marginal additional compu-
T −t S(t)
tation effort.
1 E(S(T )2 )
σ = ln (6)
T −t E(S(T ))2 Correlation Risk
Correlation coefficients between market instruments
In these formulae we now use the moments for
are usually not obtained easily. Either historical data
the basket spot:
analysis or implied calibrations need to be done.
N However, in the foreign exchange (FX) market, the
E(S(T )) = αi Si (t)eµi (T −t) cross instrument is traded as well. For instance, in
i=1 the example above, USD/JPY spot and options are
traded, and the correlation can be determined from
N
E(S(T )2 ) = αi αj Si (t)Sj (t) this contract. In fact, denoting the volatilities as in
i,j =1
the tetrahedron (Figure 2), we obtain formulae for
N the correlation coefficients in terms of known market
µi +µj + ki j k (T −t) implied volatilities:
×e k=1 (7)
σ32 − σ12 − σ22
The pricing formula is the well-known Black– ρ12 =
2σ1 σ2
Scholes–Merton formula for plain vanilla call
options: σ12 + σ62 − σ22 − σ52
ρ34 = (10)
2σ3 σ4
υ(0) = e−rd T (F N(d+ ) − KN(d− ))
F = S(0)eµT
1 F 1
d± = √ ln ± σ 2T (8)
σ T K 2
s6
s4 s5
Here N denotes the cumulative normal distribution
function and rd the domestic interest rate.
f13 s3
s1
A More Accurate and Equally Fast Approximation f12 s2 f23
Table 1
Currency pair    Implied volatility
GBP/USD          8.9%
USD/JPY          10.1%
GBP/JPY          9.8%
EUR/USD          10.5%
EUR/GBP          7.5%
EUR/JPY          10.0%
FX implied volatilities for three-month at-the-money vanilla options as of November 23, 2001. Source: Reuters

Table 3
Base currency        EUR        Interest rate    4.0%
Nominal in EUR       39 007     Strike K         1

Currencies           USD        JPY        GBP
Nominals             29%        30%        41%
1/spot               1.1429     0.00919    1.6091
Spot                 0.8750     108.81     0.6215
Strikes (in EUR)     1.1432     0.00927    1.5985
Volatilities         10.5%      10.0%      7.5%
Interest rates       4.0%       0.5%       7.0%
BS-values (in EUR)   235        227        233

Basket value         563
Sum of individuals   695
Comparison of a basket call with three currencies for a maturity of three months versus the cost of three individual call options

This method also allows hedging correlation risk by trading FX implied volatility. For details see [1]. Table 2.

The amount of option premium one can save using a basket call rather than three individual call options is illustrated in Table 3. The amount of premium saved essentially depends on the correlation of the currency pairs (Figure 3). In Figure 3, we take the parameters of the previous scenario, but restrict ourselves to the currencies USD and JPY.

Figure 3 Premium of basket option versus premium of option strategy depending on the correlation (horizontal axis: correlation in %, from −90 to 100)
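The moment-matching approximation described in equations (5) to (8) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function and argument names are hypothetical, the covariance term in the second moment is taken as ρ_ij σ_i σ_j, and valuation is assumed to take place at t = 0.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def basket_call_moment_matching(spots, weights, drifts, vols, corr,
                                strike, maturity, r_dom):
    """Basket call via lognormal moment matching (hypothetical helper).

    spots, weights, drifts, vols: per-currency lists (S_i, alpha_i, mu_i, sigma_i)
    corr: correlation matrix rho_ij as a list of lists
    """
    n = len(spots)
    # First moment of the basket spot, as in equation (7)
    m1 = sum(weights[i] * spots[i] * math.exp(drifts[i] * maturity)
             for i in range(n))
    # Second moment, using C_ij = rho_ij * sigma_i * sigma_j
    m2 = 0.0
    for i in range(n):
        for j in range(n):
            cov = corr[i][j] * vols[i] * vols[j]
            m2 += (weights[i] * weights[j] * spots[i] * spots[j]
                   * math.exp((drifts[i] + drifts[j] + cov) * maturity))
    basket_spot = sum(weights[i] * spots[i] for i in range(n))
    # Matched drift and volatility, as in equation (6)
    mu = math.log(m1 / basket_spot) / maturity
    sigma = math.sqrt(math.log(m2 / (m1 * m1)) / maturity)
    # Black-Scholes-Merton formula, as in equation (8)
    forward = basket_spot * math.exp(mu * maturity)
    sqrt_t = math.sqrt(maturity)
    d_plus = (math.log(forward / strike) + 0.5 * sigma ** 2 * maturity) / (sigma * sqrt_t)
    d_minus = d_plus - sigma * sqrt_t
    return math.exp(-r_dom * maturity) * (forward * norm_cdf(d_plus)
                                          - strike * norm_cdf(d_minus))
```

With a single currency the matched volatility equals the input volatility, so the function reduces to the plain vanilla formula. Lowering the correlation lowers the matched volatility and hence the basket premium, which is the effect Figure 3 illustrates.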
Upper Bound by Vanilla Options

It is actually clear that the price of the two vanilla options in the previous example is an upper bound of the basket option price. It seems intuitively clear that for a correlation of 100% the price is the same. Surprisingly, this is just the case if a specific relation between the strike of the individual options and their volatilities is satisfied. The basket strike has to satisfy

\[ K = a_1\,\frac{K_1}{S_1(0)} + a_2\,\frac{K_2}{S_2(0)} \tag{11} \]
Table 2
            GBP/USD (%)  USD/JPY (%)  GBP/JPY (%)  EUR/USD (%)  EUR/GBP (%)  EUR/JPY (%)
GBP/USD         100          −47           42           71          −19           27
USD/JPY         −47          100           60          −53          −18           45
GBP/JPY          42           60          100           10          −36           71
EUR/USD          71          −53           10          100           55           52
EUR/GBP         −19          −18          −36           55          100           40
EUR/JPY          27           45           71           52           40          100
FX implied three-month correlation coefficients as of Nov 23, 2001
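As a quick check, the first formula in equation (10) can be applied to the volatilities of Table 1. The sketch below uses hypothetical function names. The second function, for two pairs that share a base currency so that the cross is their ratio, is an assumption added here for illustration and is not stated in the text; it follows from the variance of a difference of log-returns.

```python
def implied_correlation_product(vol_a, vol_b, vol_cross):
    """Correlation of two pairs whose product is the cross pair,
    e.g. GBP/USD and USD/JPY with cross GBP/JPY (first formula in (10))."""
    return (vol_cross ** 2 - vol_a ** 2 - vol_b ** 2) / (2.0 * vol_a * vol_b)


def implied_correlation_ratio(vol_a, vol_b, vol_cross):
    """Correlation of two pairs whose ratio is the cross pair,
    e.g. EUR/USD and EUR/JPY with cross USD/JPY (assumed sign convention)."""
    return (vol_a ** 2 + vol_b ** 2 - vol_cross ** 2) / (2.0 * vol_a * vol_b)


# Table 1 inputs: GBP/USD 8.9%, USD/JPY 10.1%, GBP/JPY 9.8%
rho_gbpusd_usdjpy = implied_correlation_product(0.089, 0.101, 0.098)
# Table 1 inputs: EUR/USD 10.5%, EUR/JPY 10.0%, USD/JPY 10.1%
rho_eurusd_eurjpy = implied_correlation_ratio(0.105, 0.100, 0.101)
```

Rounded to whole percentages, these two values agree with the corresponding entries of Table 2 (−47 and 52).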
ing traded asset for a prespecified price, named the option strike or the exercise price, at a prespecified future date named the option expiry or the maturity. Put options are analogous rights to sell an underlying asset. For strike $K$ and maturity $T$ with the underlying asset trading at maturity for $S$, the call expires unexercised if $S$ is below $K$ while the put expires unexercised if $S$ is above $K$. On exercise, the value of the call option is $S - K$ while that of the put option is $K - S$. Hence, one may write the payoffs at maturity to the call and put options as $(S - K)^+$ and $(K - S)^+$, respectively. More generally, one may define a call or put payoff for any underlying random variable, which need not be a traded asset, for which the realized value at maturity is known to be $A$, as $(A - K)^+$ and $(K - A)^+$, respectively.

When call and put options trade before the maturity, on an underlying uncertainty resolved at maturity for various strikes $K$, with prices determined in markets at time $t < T$ as $c(t, K, T)$, $p(t, K, T)$, respectively, we have an options market for the underlying risk. Such markets provide a rich source of opportunities for holding the underlying asset or risk while simultaneously providing information on the prices of these risks. With regard to the opportunities, they make it possible to hold any function $f(A)$ of the underlying risk via a portfolio of put and call options. This fact is easily demonstrated as follows [2]. Let $f(A)$ be the function we wish to hold. We note that

\[
\begin{aligned}
f(A) &= f(a) + 1_{A>a}\int_a^A f'(u)\,du - 1_{a>A}\int_A^a f'(u)\,du \\
&= f(a) + 1_{A>a}\int_a^A \Big[f'(a) + \int_a^u f''(v)\,dv\Big]du - 1_{a>A}\int_A^a \Big[f'(a) - \int_u^a f''(v)\,dv\Big]du \\
&= f(a) + f'(a)(A-a) + 1_{A>a}\int_a^A dv\,f''(v)(A-v) + 1_{a>A}\int_A^a dv\,f''(v)(v-A) \\
&= f(a) + f'(a)(A-a) + 1_{A>a}\int_a^\infty dv\,f''(v)(A-v)^+ + 1_{a>A}\int_0^a dv\,f''(v)(v-A)^+
\end{aligned}
\tag{1}
\]

On the right hand side, we have a position in a bond with face value given by the constant term, a position in the underlying risk of $f'(a)$ and a position in puts struck below $a$ and calls struck above $a$ at strike $v$ of $f''(v)$.

With regard to the information content of the market prices, we consider Breeden and Litzenberger [1], who showed how one may extract the pricing density at time $t < T$, $p(t, A)$ for the underlying risk from market option prices. By definition, we have

\[ c(t,K,T) = e^{-r(T-t)}\int_K^\infty (A-K)\,p(t,A)\,dA \tag{2} \]

where $r$ is the interest rate prevailing at time $t$ for the maturity $(T - t)$. We may differentiate twice with respect to the strike to get

\[ p(t,K) = e^{r(T-t)}\,\frac{\partial^2 c(t,K,T)}{\partial K^2} \tag{3} \]

In the case when the underlying risk is an asset price with a specific dynamics with exposure to a Brownian motion with a space–time deterministic volatility (see Local Volatility Model) as postulated by Dupire [6] plus a compensated jump martingale with a space–time deterministic arrival rate of jumps and a fixed dependence of the arrival rate on the jump size, one may extract information on the dynamics from market prices. Here, we follow Carr et al. [4].

Let $(S(t), t > 0)$ denote the path of the stock price, where $r$ is the interest rate, $\eta$ the dividend yield, $\sigma(S, t)$ the deterministic space–time volatility function, $(W(t), t > 0)$ a Brownian motion, $m(dx, ds)$ the integer-valued counting measure associated with the jumps in the logarithm of the stock price, $a(S, t)$ the deterministic space–time jump arrival rate, and $k(x)$ the Lévy density across jump sizes $x$. The dynamics for the stock price may be written as

\[
\begin{aligned}
S(t) = S(0) &+ \int_0^t S(u_-)(r-\eta)\,du + \int_0^t S(u_-)\,\sigma(S(u_-),u)\,dW(u) \\
&+ \int_0^t\!\!\int_{-\infty}^{\infty} S(u_-)\big(e^x - 1\big)\big(m(dx,du) - a(S(u_-),u)\,k(x)\,dx\,du\big)
\end{aligned}
\tag{4}
\]

We now apply a generalization of Itô’s lemma to convex functions known as the Meyer–Tanaka formula (see, e.g., [5, 7, 8] for the specific formulation below) to the call option payoff at maturity to obtain

\[
\begin{aligned}
(S(T)-K)^+ = (S(0)-K)^+ &+ \int_0^T 1_{S(u_-)>K}\,dS(u) + \frac{1}{2}\int_0^T \delta_K(S(u_-))\,\sigma^2(S(u),u)\,S^2(u)\,du \\
&+ \sum_{u\le T}\Big[1_{S(u_-)>K}\,(K-S(u))^+ + 1_{S(u_-)<K}\,(S(u)-K)^+\Big]
\end{aligned}
\tag{5}
\]

The second integral denotes the value at $K$ of the continuous local time $(L_T^a;\ a \in \mathbb{R})$, which is globally defined for every bounded Borel function $f$, as $\int_{-\infty}^{\infty} f(a)\,L_T^a\,da = \int_0^T f(S(u_-))\,d\langle S^c\rangle_u$, where $d\langle S^c\rangle_u = \sigma^2(S(u),u)\,S^2(u)\,du$, and is applied here formally to the Dirac measure $f(a) = \delta_K(a)$. The last term, which is the discontinuous component of local time at level $K$, is made up of just the crossovers, whereby one receives $S(u) - K$ on crossing the strike into the money, whereas one receives $(K - S(u))$ on crossing the strike out of the money.

Computing expectations on both sides of equation (5) and introducing $q(Y, u)$, the transition density that the stock price is $Y$ at time $u$ given that at time 0 it is at $S(0)$, we may write the call price function at time zero as

\[
\begin{aligned}
e^{rT}C(K,T) = (S(0)-K)^+ &+ \int_0^T\!\!\int_K^\infty dY\,q(Y,u)\,Y\,(r-\eta)\,du + \frac{1}{2}\int_0^T q(K,u)\,\sigma^2(K,u)\,K^2\,du \\
&+ \int_0^T\!\!\int_K^\infty dY\,q(Y,u)\,a(Y,u)\int_{-\infty}^{\ln\frac{K}{Y}} \big(K - Ye^x\big)\,k(x)\,dx\,du \\
&+ \int_0^T\!\!\int_0^K dY\,q(Y,u)\,a(Y,u)\int_{\ln\frac{K}{Y}}^{\infty} \big(Ye^x - K\big)\,k(x)\,dx\,du
\end{aligned}
\tag{6}
\]

Now differentiating equation (6) with respect to $T$, we get

\[
\begin{aligned}
re^{rT}C + e^{rT}C_T = (r-\eta)\int_K^\infty q(Y,T)\,Y\,dY &+ \frac{\sigma^2(K,T)K^2}{2}\,q(K,T) \\
&+ \int_K^\infty dY\,Y\,q(Y,T)\,a(Y,T)\int_{-\infty}^{\ln\frac{K}{Y}} \Big(e^{\ln\frac{K}{Y}} - e^x\Big)k(x)\,dx \\
&+ \int_0^K dY\,Y\,q(Y,T)\,a(Y,T)\int_{\ln\frac{K}{Y}}^{\infty} \Big(e^x - e^{\ln\frac{K}{Y}}\Big)k(x)\,dx
\end{aligned}
\tag{7}
\]

We now isolate $C_T$ on the left, using some elementary properties of the relationship between call prices and the pricing density. In particular, we note

\[ e^{-rT}\int_K^\infty Y\,q(Y,T)\,dY = C - KC_K \tag{8} \]
\[ e^{-rT}\,q(K,T) = C_{KK} \tag{9} \]

and obtain

\[
\begin{aligned}
C_T = -\eta C - (r-\eta)KC_K &+ \frac{\sigma^2(K,T)K^2}{2}\,C_{KK} \\
&+ \int_K^\infty dY\,Y\,C_{YY}\,a(Y,T)\int_{-\infty}^{\ln\frac{K}{Y}} \Big(e^{\ln\frac{K}{Y}} - e^x\Big)k(x)\,dx \\
&+ \int_0^K dY\,Y\,C_{YY}\,a(Y,T)\int_{\ln\frac{K}{Y}}^{\infty} \Big(e^x - e^{\ln\frac{K}{Y}}\Big)k(x)\,dx
\end{aligned}
\tag{10}
\]
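The Breeden and Litzenberger relation, equation (3), can be illustrated numerically. The sketch below is only an assumption-laden example: call prices are generated from the Black–Scholes formula without dividends (a choice made here for convenience, not taken from the text), the second strike derivative is approximated by a central finite difference, and all function names are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, rate, vol, tau):
    """Black-Scholes call price, used here only to generate test prices."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * tau) / (vol * math.sqrt(tau))
    d2 = d1 - vol * math.sqrt(tau)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * tau) * norm_cdf(d2)


def implied_density(call_price, strike, rate, tau, h=0.5):
    """Equation (3): p(t, K) = exp(r (T - t)) * d^2 c / dK^2,
    with the second derivative replaced by a central difference of width h."""
    second = (call_price(strike + h) - 2.0 * call_price(strike)
              + call_price(strike - h)) / (h * h)
    return math.exp(rate * tau) * second


def lognormal_density(a, spot, rate, vol, tau):
    """Risk-neutral lognormal density of the terminal price, for comparison."""
    mean = math.log(spot) + (rate - 0.5 * vol ** 2) * tau
    std = vol * math.sqrt(tau)
    return math.exp(-(math.log(a) - mean) ** 2 / (2.0 * std * std)) / (a * std * math.sqrt(2.0 * math.pi))


# Hypothetical market: spot 100, rate 3%, volatility 20%, one year to maturity
prices = lambda k: bs_call(100.0, k, 0.03, 0.2, 1.0)
recovered = implied_density(prices, 105.0, 0.03, 1.0)
exact = lognormal_density(105.0, 100.0, 0.03, 0.2, 1.0)
```

In practice the call prices would come from quoted strikes rather than a formula, and the finite difference would then be sensitive to bid-ask noise, so some smoothing of the price curve is usually needed first.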
most days VIX call option prices when deflated by the forward VIX are increasing in maturity for given strikes and we have an empirical increase in the convex order, but there are days when this monotonicity is lost. The conditions for VIX option surfaces to be free of arbitrage are, therefore, not as clear as they are for an underlying stock or a stock index.

References

[1] Breeden, D. & Litzenberger, R.L. (1978). Pricing of state-contingent claims implicit in option prices, Journal of Business 51, 621–651.
[2] Carr, P. & Madan, D.B. (2001). Optimal positioning in derivatives, Quantitative Finance 1, 19–37.
[3] Carr, P. & Madan, D.B. (2005). A note on sufficient conditions for no arbitrage, Finance Research Letters 2, 125–130.
[4] Carr, P., Geman, H., Madan, D. & Yor, M. (2005). From local volatility to local Lévy models, Quantitative Finance 4, 581–588.
[5] Dellacherie, C. & Meyer, P. (1980). Probabilités et Potentiel, Theorie des Martingales, Hermann, Paris.
[6] Dupire, B. (1994). Pricing with a smile, Risk 7, 18–20.
[7] Meyer, P. (1976). Un Cours sur les Intégrales stochastiques, in Séminaire de Probabilités X, Lecture Notes in Mathematics, Springer-Verlag, Berlin, Vol. 511.
[8] Yor, M. (1978). Rappels et Préliminaires Généraux, in Temps Locaux, Société Mathématique de France, Astérisque, pp. 17–22, 52–53.

Related Articles

Dupire Equation; Local Volatility Model; Put–Call Parity; Static Hedging; Variance Swap.

DILIP B. MADAN
Barrier Options

Barrier options are vanilla options with path-dependent payoffs, that is, the payoff is not only a function of stock level relative to option strike but also dependent upon whether or not the stock reaches a certain prespecified barrier level before maturity. An example will illustrate the idea. Suppose an investor is long an up-and-in at-the-money call option on the S&P 500 index with barrier level at 110% of the initial S&P 500 index level. Before maturity, if the index never reaches 110% of the initial index level, the option never gets knocked in. The investor receives nothing at maturity. However, if the index level reaches 110% at some point before maturity, the investor receives a payoff identical to a vanilla at-the-money call option at maturity. In the latter scenario, the option is “knocked in” on the day when the index reaches the 110% level.

There are many types of barrier options. We discuss two common ones, that is, knock-out and knock-in barrier options. For knock-out barrier options, the option will be knocked out and become worthless if the underlying asset crosses a prespecified barrier level. For knock-in barrier options, the barrier option will be knocked in and become a vanilla option only if the underlying asset crosses the prespecified level before maturity. The example used earlier is a knock-in barrier option.

Depending upon the barrier level relative to the initial underlying asset level, we can have an “up” barrier or a “down” barrier. If the barrier is above the initial underlying asset level, it is called an up barrier. If the barrier is below the initial underlying asset level, it is called a down barrier. Together, we can have four different variations of barrier options, that is, up-and-in, up-and-out, down-and-in, and down-and-out options. Table 1 shows these four variations schematically.

Basic Features

Up-and-in Call/Up-and-in Put

This is the first kind of knock-in barrier options. The up-and-in barrier option has a knock-in barrier level, which is higher than the initial underlying asset level. Before maturity, if the underlying asset goes above the barrier level, the barrier option will be knocked in and become a vanilla option. Otherwise, the barrier option will expire worthless at maturity. Up-and-in calls are more common. This is because, when the underlying asset increases to the knock-in barrier level, it would most likely stay above the initial underlying asset level. Therefore, call options are more likely to be in the money at maturity than put options. Bullish investors can buy up-and-in call options and pay a lower premium than that on the vanilla call options. This makes up-and-in calls more leveraged than vanilla calls.

Down-and-in Call/Down-and-in Put

The down-and-in barrier option has a knock-in barrier level, which is below the initial underlying asset level. Before the maturity, if the underlying asset goes below the barrier level the barrier option will be knocked in and become a vanilla option. Otherwise, the barrier option will expire worthless at maturity. Down-and-in puts are more common in this case. Bearish investors can buy down-and-in puts and pay a lower premium than that on the vanilla put options.

Up-and-out Call/Up-and-out Put

This is the first kind of knock-out barrier options. The up-and-out barrier option has a knock-out barrier level above the initial underlying asset level. Before maturity, if the underlying asset crosses the barrier level, the option will be knocked out and become worthless. Otherwise, the barrier option will just be a vanilla option. A bearish investor would buy up-and-out puts to achieve more leverage by paying a lower premium than that on vanilla puts.

Down-and-out Call/Down-and-out Put

The down-and-out barrier option has a knock-out barrier level below the initial underlying asset level. Before maturity, if the underlying asset goes below the barrier level, the option will be knocked out and become worthless. A bullish investor would buy down-and-out calls to achieve more leverage by paying a lower premium than that on vanilla calls.

Some Variations

With increased popularity of barrier options and growth of their market, some other features are being
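The four basic barrier types described above can be summarized in a small payoff function. This is a minimal sketch under stated assumptions: the path is a list of discretely observed prices (so the barrier is only checked at those observations), touching the barrier counts as reaching it, and the function and argument names are hypothetical.

```python
def barrier_option_payoff(path, strike, barrier, direction, knock, option_type):
    """Payoff at maturity of a barrier option on a discretely observed path.

    path: observed prices, first entry initial level, last entry terminal level
    direction: "up" or "down"; knock: "in" or "out"; option_type: "call" or "put"
    """
    if direction == "up":
        hit = any(price >= barrier for price in path)
    elif direction == "down":
        hit = any(price <= barrier for price in path)
    else:
        raise ValueError("direction must be 'up' or 'down'")

    if knock == "in":
        alive = hit          # knock-in: becomes vanilla only if the barrier was reached
    elif knock == "out":
        alive = not hit      # knock-out: worthless once the barrier was reached
    else:
        raise ValueError("knock must be 'in' or 'out'")

    terminal = path[-1]
    if option_type == "call":
        vanilla = max(terminal - strike, 0.0)
    elif option_type == "put":
        vanilla = max(strike - terminal, 0.0)
    else:
        raise ValueError("option_type must be 'call' or 'put'")
    return vanilla if alive else 0.0


# The S&P 500 example: at-the-money up-and-in call, barrier at 110% of the initial level
knocked_in = barrier_option_payoff([100.0, 105.0, 111.0, 108.0], 100.0, 110.0, "up", "in", "call")
never_hit = barrier_option_payoff([100.0, 105.0, 109.0, 108.0], 100.0, 110.0, "up", "in", "call")
```

On any given path the knock-in and knock-out payoffs with the same barrier add up to the vanilla payoff, which is a convenient sanity check for an implementation.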
Adjustment for Discrete Barrier

Often, the barrier option has a discrete barrier schedule but the exact valuation is only available for the case of continuous barrier. In the constant volatility

Related Articles

Finite Difference Methods for Barrier Options; Pricing Formulae for Foreign Exchange Options.

MICHAEL QIAN
Corridor Options

that is, the coupon payment depends on the level of the reference rate at the current coupon payment date (maturity and payment date coincide).

Sometimes, a minimum coupon clause is also included, so that the coupon amounts to

\[ \phi(T_{j+1}) = \max\!\Big(C_{j+1}\times\frac{H(T_j,T_{j+1})}{D(T_j,T_{j+1})},\,K\Big) \tag{5} \]

The standard contract is the fixed rate range note if the underlying is a stock index or a foreign currency and the delayed floating range note if the underlying is a LIBOR rate.

be seen as the sum of the payoffs of digital options expiring on successive days. Therefore, for a fixed-rate range note, we have

\[ \mathbb{E}_t\big[\phi(T_{j+1})\big] = \frac{C_j}{D(T_j,T_{j+1})}\sum_{i=1}^{D(T_j,T_{j+1})}\Big[N\big(d^{x_u}_{T_{j,i}-t}\big) - N\big(d^{x_l}_{T_{j,i}-t}\big)\Big] \tag{8} \]

and the price of the fixed range note is

\[ \sum_{j=0}^{n} P(t,T_{j+1})\,\mathbb{E}_t\big[\phi(T_{j+1})\big] + P(t,T_n) \tag{9} \]
[12] Linetsky, V. (1999). Step options: The Feynman-Kac approach to occupation time derivatives, Mathematical Finance 9, 55–96.
[13] Miura, R. (1992). A note on lookback options based on order statistics, Hitotsubashi Journal of Commerce and Management 27, 15–28.
[14] Navatte, P. & Quittard-Pinon, F. (1999). The valuation of interest rate digital options and range notes revisited, European Financial Management 5(3), 425–440.
[15] Nunes, J.P.V. (2004). Multi-factor valuation of floating range notes, Mathematical Finance 14(1), 79–97.
[16] Pechtl, A. (1995). Classified information, in Over the Rainbow, J. Robert, ed, Risk Publications, pp. 71–74.
[17] Takàcs, L. (1996). On a generalization of the arc-sine law, Annals of Applied Probability 6(3), 1035–1040.
[18] Turnbull, S.M. (1995). Interest rate digital options and range notes, Journal of Derivatives 3, 92–101.
[19] Wu, T.P. & Chen, S.N. (2008). Valuation of floating range notes in a LIBOR market model, Journal of Futures Markets 28(7), 697–710.

Further Reading

Tucker, A.L. & Wei, J.Z. (1997). The latest range, Advances in Futures and Options Research 9, 287–296.

Related Articles

Barrier Options; Corridor Variance Swap; Discretely Monitored Options; Parisian Option.

GIANLUCA FUSAI
Lookback Options

Lookback options are path-dependent options, introduced at first in [25] and [26], characterized by having their settlement based on the minimum or the maximum value of an underlying index as registered during the lifetime of the option. At maturity, the holder can “lookback” and select the most convenient price of the underlying that occurred during this period: therefore they offer investors the opportunity (at a price, of course) of buying a stock at its lowest price and selling a stock at its highest price. Since this scheme guarantees the best possible result for the option holder, he or she will never regret the option payoff. As a consequence, a lookback option is more expensive than a vanilla option with similar payoff function. However, these options do not offer a natural hedge for typical business and are used mainly by speculators. To mitigate their cost, sometimes the lookback feature is mixed with an average feature: for example, the payoff is the best or the worst of past average prices and they are offered as investment product under names such as Everest, Napoleon, and Altiplano.

In the section Payoff Function, we describe lookback options payoff. In the section Pricing, we illustrate the pricing of these options in the Black–Scholes setting and some results on the hedging problem. Thereafter, in the section Non-Gaussian Models, we consider the pricing problem under non-Gaussian models. Finally, in the section Related Payoff, we present payoffs related to lookback options.

Payoff Function

A lookback option can be structured as a put or call. The strike can be either fixed or floating. We now consider two lookback options written on the minimum value achieved by the underlying index during a fixed time window:

• A fixed strike lookback put The payoff is given by the difference, if positive, between the strike price and the minimum price over the monitoring period. Therefore, the buyer of this option can sell the asset at the minimum price receiving the strike K.
• A floating strike lookback call The payoff is given by the difference between the asset price at the option maturity and the minimum price over the monitoring period, which represents the floating strike. Therefore, the buyer of this option can buy the underlying asset paying the minimum price.

Notice that floating strike options will always be exercised. Formulae for the payoffs are provided in Table 1, as well as versions involving the maximum and variants denominated partial price or partial time.

Pricing

In this section, we discuss the pricing problem under the geometric Brownian motion (GBM) assumption, that is,

\[ dS(t) = (r-q)S(t)\,dt + \sigma S(t)\,dW(t), \qquad S(0) = S_0 \tag{1} \]

where $r$ is the instantaneous risk-free rate; $q$ is the instantaneous dividend yield; $\sigma$ is the percentage volatility; $S_0$ is the initial underlying price; and we are interested in the distribution of the minimum $m(T)$ and maximum $M(T)$

\[ M(T) = \max_{0\le u\le T} S(u), \qquad m(T) = \min_{0\le u\le T} S(u) \tag{2} \]

Figure 1 illustrates a simulated path of the underlying asset according to the dynamics in equation 1 and the corresponding trajectories for the maximum and minimum price.

Analytical Solution

Under the GBM assumption, the distribution law of $m(T)$ (as well as the joint density of $m(T)$ and $S(T)$) is known in closed form. This allows one to obtain an analytical solution for standard lookback options as expected value of the discounted payoff; see, for example, [25, 26], and [15]. Therefore, we obtain the following pricing formula for the floating strike lookback call:

\[
\begin{aligned}
\mathbb{E}_{t,S_t}\big[e^{-r(T-t)}(S(T)-m(T))\big] &= S_t e^{-q(T-t)} - e^{-r(T-t)}\,\mathbb{E}_{t,S_t}\big[m(T)\big] \\
&= S_t e^{-q(T-t)} N(d_2)
\end{aligned}
\]
6 2(r − q) √
d3 = − d2 + T − t,
σ
5 x −u2
Stock price
Minimum
N(x) = e 2 du (4)
−∞
4 Maximum
1 St −r(T −t) St
d2 = ln + (r − q)(T − t) × e N d1 − N(−d)
σ (T − t)
2 m t K
1
+ σ 2 (T − t) , + 1(K≥mt ) e−r(T −t) (K − mt ) − St N(−d2 )
2
price and time, but also on the current realized minimum or maximum, and we can write $V = V(S, m, t)$ for options on the minimum (similar discussion holds for lookback on the maximum). Applying Itô’s lemma and equating the expected return on the option to the return on a risk-free investment, it can be shown that $V$ solves the following PDE:

\[ \frac{\partial V}{\partial t} + (r-q)S\frac{\partial V}{\partial S} + \frac{\sigma^2}{2}S^2\frac{\partial^2 V}{\partial S^2} = rV \tag{8} \]

which has to be solved for $S \ge m$ and $T \ge t \ge 0$. The above PDE is the standard Black–Scholes PDE, with the change of the domain from $S \ge 0$ to $S \ge m$. Here $m$ appears as parameter delimiting the domain of the spot price. This implies that a boundary condition at $S = m$ is needed. The important point is the observation that when the spot price is near to the running minimum, the probability that at expiry the minimum will be equal to the current minimum $m$ is zero and therefore changes in $m$ do not affect the option value. This allows one to set the boundary condition at $S = m$:

\[ \frac{\partial V(S,m,t)}{\partial m} = 0 \quad \text{when } S = m \tag{9} \]

Together with the payoff condition at $t = T$, equations (8) and (9) allow us to fully characterize the lookback option premium. For discretely monitored lookback options, with monitoring at dates $t_i = i\Delta$, the PDE (8) remains unchanged, while the boundary condition (9) does not apply anymore. Indeed, between monitoring dates, the spot price can freely move in $(0, +\infty)$ and at monitoring dates, the solution is updated according to the rule

\[ V(S, m, t_i^+) = V(S, \min(S,m), t_i^-) \tag{10} \]

Equation (8) can be solved numerically using an appropriate numerical scheme such as the Crank–Nicolson one (see Crank–Nicolson Scheme). However, exploiting a change of numeraire, the PDE (8) can be simplified to a single state variable [3].

Monte Carlo Simulation

Discretely monitored lookback options can be easily priced by standard Monte Carlo (MC) simulation. The underlying price is simulated at all monitoring dates exploiting the exact solution of the stochastic differential equation (1):

\[ S^{(j)}((i+1)\Delta) = S^{(j)}(i\Delta)\,\exp\!\Big(\big(r - q - 0.5\sigma^2\big)\Delta + \sigma\sqrt{\Delta}\,\epsilon_i^{(j)}\Big) \tag{11} \]

where $\epsilon_i^{(j)}$ is a standard normal random variate and $S^{(j)}(i\Delta)$ is the spot price at time $t_i = i\Delta$ as sampled in the $j$th simulation. The corresponding minimum price $m^{(j)}(i\Delta)$ over the time interval $[t_{i-1}, t_i]$ is updated at each monitoring date according to the rule

\[ m^{(j)}(i\Delta) = \min\!\big(S^{(j)}(i\Delta),\, m^{(j)}((i-1)\Delta)\big) \tag{12} \]

with a starting condition $m_0^{(j)} = S_0$. The MC price for a lookback is given by the average of the discounted payoff computed over $J$ simulated sample paths. For a lookback option with fixed strike $K$, the MC price is

\[ e^{-rt_n}\,\frac{1}{J}\sum_{j=1}^{J}\big(K - m^{(j)}(n\Delta)\big)^+ \tag{13} \]

Similarly, for a floating strike lookback option, the MC price is

\[ e^{-rt_n}\,\frac{1}{J}\sum_{j=1}^{J}\big(S^{(j)}(n\Delta) - m^{(j)}(n\Delta)\big) \tag{14} \]

Unfortunately, the procedure cannot be immediately applied to continuously monitored options by shrinking the time step $\Delta$. Indeed, owing to the fact that we can only sample at discrete times, we lose information about the parts of the continuous-time path that lie between the sampling dates. This procedure will be systematically biased in the sense that the continuous minimum (maximum) will be always overestimated (underestimated). Andersen and Brotherton-Ratcliffe [4] show that for a one-year lookback with 256 discrete monitoring points this bias is around 5% of the option price and suggest a procedure to correct it.

Binomial and Tree Methods

As for PDEs, the implementation of a tree for path-dependent options involves two state variables and the need to keep track of current extreme values will cause the number of calculations to grow substantially faster than the number of nodes. However, under the GBM assumption and exploiting
that the realized maximum drawdown (maximum drawup) will be larger than expected, and thus they are natural buyers of this contract. On the other hand, selling the (unhedged) contract is equivalent to taking the opposite strategy, namely, buying the asset when it is setting its new low. This is known as contrarian trading. Contrarian traders believe that the realized maximum drawdown (maximum drawup or range) will be smaller than expected, and they are natural sellers of this contract. The distribution of the maximum drawdown of Brownian motion is studied in [37].

Russian options are perpetual American options with lookback payoff, introduced in [40]. Russian options can be regarded as a kind of perpetual American fixed strike lookback option with zero strike price and their pricing can be derived by using a probability approach [21], or a PDE approach [17].

Finally, we mention the class of structures named mountain options, having names such as Himalaya, Everest, Altiplano and so on (see Atlas Option; Himalayan Option; Altiplano Option). Here, the extremum over a given period of a given asset is replaced by the best or the worst performer over different periods of assets in a given basket. Sometimes, a global floor on the return of the product is also introduced. It is clear that MC simulation is needed to price this type of product, which, in general, is very sensitive to the cross-correlation of the assets. In addition, the Greeks of these contracts can change markedly as the trade progresses.

References

[1] Aitsahlia, F. & Leung, L.T. (1998). Random walk duality and the valuation of discrete lookback options, Applied Mathematical Finance 5(3/4), 227–240.
[2] Akahori, J. (1995). Some formulae for a new type of path-dependent option, Annals of Applied Probability 5, 383–388.
[3] Andreasen, J. (1998). The pricing of discretely sampled Asian and lookback options: a change of numeraire approach, Journal of Computational Finance 2(1), 5–30.
[4] Andersen, L. & Brotherton-Ratcliffe, R. (1996). Exact exotics, Risk Magazine 9, 85–89.
[5] Atkinson, C. & Fusai, G. (2007). Discrete extrema of Brownian motion and pricing of exotic options, Journal of Computational Finance 10(3), 1–43.
[6] Babbs, S. (2000). Binomial valuation of lookback options, Journal of Economic Dynamics and Control 24(11–12), 1499–1525.
[7] Ballotta, L. & Kyprianou, A. (2001). A note on the Alpha-Quantile option, Applied Mathematical Finance 8, 137–144.
[8] Bermin, H.-P. (2000). Hedging lookback and partial lookback options using Malliavin calculus, Applied Mathematical Finance 39, 75–100.
[9] Borovkov, K. & Novikov, A. (2002). On a new approach for option pricing, Journal of Applied Probability 39, 1–7.
[10] Boyle, P.P. & Tian, Y. (1999). Pricing lookback and barrier options under the CEV process, Journal of Financial and Quantitative Analysis 34, 241–264.
[11] Boyle, P.P., Tian. Y. & Imai, J. (1999). Lookback options under the CEV process: a correction, Journal of Finance and Quantitative Analysis web site http://www.jfqa.org/. In: Notes, Comments, and Corrections.
[12] Broadie, M., Glasserman, P. & Kou, S. (1999). Connecting discrete and continuous path-dependent options, Finance and Stochastics 3, 55–82.
[13] Broadie, M. & Yamamoto, Y. (2005). A double-exponential fast Gauss transform algorithm for pricing discrete path-dependent options, Operations Research 53(5), 764–779.
[14] Cheuck, T.H.F. & Vorst, T.C.F. (1997). Currency lookback options and observation frequency: a binomial approach, Journal of International Money and Finance 16(2), 173–187.
[15] Conze, A. & Vishwanathan, R. (1991). Path-dependent options: the case of lookback options, Journal of Finance 46, 1893–1907.
[16] Costabile, M. (2006). On pricing lookback options under the CEV process, Decisions in Economics and Finance 29, 139–153.
[17] Dai, M. (2000). A closed-form solution for perpetual American floating strike lookback options, Journal of Computational Finance 4(2), 63–68.
[18] Dai, M., Kwok, Y.K. & Wong, H.Y. (2004). Quanto lookback options, Mathematical Finance 14(3), 445–467.
[19] Dassios, A. (1995). The distribution of the quantile of a Brownian motion with drift & the pricing of related path-dependent options, Annals of Applied Probability 5, 389–398.
[20] Davydov, D. & Linetsky, V. (2001). The valuation and hedging of barrier and lookback options under the CEV process, Management Science 47, 949–965.
[21] Duffie, D. & Harrison, J.M. (1993). Arbitrage pricing of Russian options and perpetual lookback options, The Annals of Applied Probability 3(3), 641–651.
[22] Feng, L. & Linetsky, V. (2009). Computing exponential moments of the discrete maximum of a Levy process and lookback options, Finance and Stochastics, available at SSRN:http://ssrn.com/abstract=1260934.
[23] Fusai, G., Abrahams, I.D. & Sgarra, C. (2006). An exact analytical solution for discrete barrier options, Finance and Stochastics 10(1), 1–26.
[24] Fusai, G., Marazzina, D., Marena, M. & Ng, M. (2008). Maturity Randomization and Option Pricing. w.p. SEMeQ.
[25] Goldman, M.B., Sosin, H.B. & Gatto, M.A. (1979). Path-dependent options: buy at the low, sell at the high, Journal of Finance 34, 1111–1127.
[26] Goldman, M.B., Sosin, H.B. & Shepp, L. (1979). On contingent claims that insure ex-post optimal stock market timing, Journal of Finance 34, 401–413.
[27] Green, R., Fusai, G. & Abrahams, I.D. (2009). The Wiener-Hopf technique & discretely monitored path dependent option pricing, Mathematical Finance, to appear.
[28] He, H., Keirstead, W. & Rebholz, J. (1998). Double lookbacks, Mathematical Finance 8, 201–228.
[29] Heynen, R.C. & Kat, H.M. (1995). Lookback options with discrete and partial monitoring of the underlying price, Applied Mathematical Finance 2, 273–284.
[30] Hobson, D.G. (1998). Robust hedging of the lookback option, Finance and Stochastics 2(4), 329–347.
[31] Hörfelt, P. (2003). Extension of the corrected barrier approximation by Broadie, Glasserman, and Kou, Finance and Stochastics 7(2), 231–243.
[32] Howison, S. & Steinberg, M. (2007). A matched asymptotic expansions approach to continuity corrections for discretely sampled options. Part 1: barrier options, Applied Mathematical Finance 14, 63–89.
[33] Kat, H.M. & Heynen, R.C. (1994). Selective memory. Risk Magazine 7(11), 73–76.
[34] Kou, S.G. (2003). On pricing of discrete barrier options, Statistica Sinica 13, 955–964.
[35] Kou, S.G. & Wang, H. (2003). First passage times of a jump diffusion process, Advances in Applied Probability 35, 504–531.
[36] Linetsky, V. (2004). Lookback options and diffusion hitting time: a spectral expansion approach, Finance and Stochastics 8, 373–398.
[37] Magdon-Ismail, M., Atiya, A., Pratap, A. & Abu-Mostafa, Y. (2004). On the maximum drawdown of a Brownian motion, Journal of Applied Probability 41(1), 147–161.
[38] Miura, R. (1992). A note on lookback options based on order statistics, Hitotsubashi Journal of Commerce & Management 27, 15–28.
[39] Petrella, G. & Kou, S.G. (2004). Numerical pricing of discrete barrier and lookback options via Laplace transforms, Journal of Computational Finance 8, 1–37.
[40] Shepp, L. & Shiryaev, A.N. (1993). The Russian option: reduced regret, Annals of Applied Probability 3, 631–640.
[41] Vecer, J. (2006). Maximum drawdown and directional trading, Risk Magazine 19(12), 88–92.
[42] Wilmott, P., Dewynne, J.N. & Howison, S. (1993). Option Pricing: Mathematical Models and Computation, Oxford Financial Press.

Related Articles

Barrier Options; Corridor Options; Discretely Monitored Options; Parisian Option.

GIANLUCA FUSAI
Parisian Option

Parisian options are barrier options that are activated or canceled, depending on the type of option, if the underlying asset has been continuously traded above or below the barrier level long enough. A down-and-out Parisian option denotes a contract that expires worthless if the underlying asset reaches a prespecified level L and remains constantly below this level for a time interval longer than a fixed number D, called the window. Its price (for a call option) at time 0 is given by

$$\phi(T, K) := e^{-rT}\, \mathbb{E}\left[(S_T - K)^+ \mathbf{1}_{\{T_L^{D,-}(S) > T\}}\right] \tag{1}$$

where $T_L^{D,-}(S)$ is the first time the asset S makes an excursion longer than D below L. Parisian-style options are mostly encountered in convertible bonds with a "soft-call" provision for conversion. For example, the bond's specifications may be such that conversion will be allowed if and only if the share price remains above a theoretical price for a given amount of time, for example, 20 business days prior to the conversion date (this is a Parisian option). Other covenants stipulate that the average share price trades for n days above the trigger level. While the latter does not correspond sensu stricto to [...] methods have been proposed in the literature: Monte Carlo simulations, Laplace transforms, lattices, and partial differential equations.

Monte Carlo Method

As for standard barrier options, using simulations leads to a biased problem, owing to the choice of the [...] over a time of length [...] by studying its asymptotic behavior when [...] tends to 0. They derive precise estimates of $g^S_{L,t} := \sup\{u \le t \mid S_u = L\}$, which is, for a down-and-out Parisian option, related to $T_L^{D,-}(S)$ by the following formula: $T_L^{D,-}(S) := \inf\{t > 0 : (t - g^S_{L,t})\,\mathbf{1}_{\{S_t < L\}} > D\}$. This procedure still works when the asset follows a diffusion process with general coefficients.

Laplace Transforms

The idea of using Laplace transforms for pricing Parisian options is owed to Chesney et al. [5]. By using the Brownian excursion theory, they get closed formulas for

$$\int_0^\infty dt\, e^{-\lambda t}\, \phi(t, K) \tag{2}$$

the Laplace transform of the price with respect to the maturity time.
For models with constant parameters, when considering a down-and-in call option, one rewrites $\phi(T, K)$ as

$$e^{-\left(r + \frac{m^2}{2}\right)T}\, \mathbb{E}\left[\mathbf{1}_{\{T_b^{D,-} < T\}} \left(x e^{\sigma Z_T} - K\right)^+ e^{m Z_T}\right] \tag{3}$$

[...] it can be shown that $T_b^{D,-}$ and $Z_{T_b^{D,-}}$ are independent. There is no explicit formula for the density of $T_b^{D,-}$; only its Laplace transform is known. The strong Markov property makes it possible to introduce $Z_{T_b^{D,-}}$ in equation (3). We rewrite equation (3) as

$$\int_{-\infty}^{\infty} \mathbb{E}\left(\mathbf{1}_{\{T_b^{D,-} < T\}}\, P_{T - T_b^{D,-}}(f_x)(z)\right) \nu^-(dz) \tag{4}$$
[...] introduces the Laplace transform of $T_b^{D,-}$, which is explicitly known. This leads to a closed formula. We refer the reader to [1] for the description of a fast and accurate numerical inversion of the Laplace transforms. By studying the regularity of the Parisian option prices with respect to the maturity time, Labart and Lelong [9] justify the accuracy of the numerical inversion. Except for particular values of the barrier, the prices are of class $C^\infty$. Their study relies on the existence and the regularity of a density for the Parisian time $T_b^{D,-}$.
This algorithm is implemented in [4] and is compared to a procedure for approximating a general Laplace transform with one that can be easily inverted. The Laplace transform approach is very specific to the problem, but in practice the method's lack of flexibility is compensated by its accuracy and computational speed.

Lattices

Costabile [6] presents a discrete time algorithm to evaluate Parisian options. The evaluation method is based on a combinatorial approach used to count the number of trajectories of a particle which, moving in a binomial lattice, remains constantly above an upper barrier for time intervals strictly smaller than a prespecified window period. Once this number has been computed, it can be used to derive a binomial algorithm, based on the Cox–Ross–Rubinstein (CRR) model (see Binomial Tree or Tree Methods). It makes it possible to evaluate Parisian options with a constant or an exponential barrier. Avellaneda and Wu [2] model and price Parisian-style options by a trinomial lattice method, which changes with the value of the asset with respect to the barrier.

Partial Differential Equations

Pricing of Parisian options can be done using partial differential equations. Let τ denote the time the underlying asset has continuously spent in the excursion. For a down Parisian option, $\tau := t - \sup\{t' \le t \mid S_{t'} \ge L\}$. The dynamics of τ is

$$d\tau_t = \begin{cases} dt & \text{if } S_t < L, \\ -\tau_{t^-} & \text{if } S_t = L, \\ 0 & \text{if } S_t > L \end{cases} \tag{5}$$

The new state variable τ can be viewed as a clock that starts ticking as soon as the share price crosses the barrier level and is immediately reset when the share price returns above L. We assume that the asset follows a lognormal Brownian motion given by $dS_t = \mu S_t\, dt + \sigma S_t\, dW_t$. The option price is a function of S, t, τ. If S ≥ L, the governing equation is the standard Black–Scholes equation:

$$\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + rS \frac{\partial V}{\partial S} - rV = 0 \tag{6}$$

If S ≤ L, τ is ticking. The new governing equation is

$$\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + rS \frac{\partial V}{\partial S} + \frac{\partial V}{\partial \tau} - rV = 0 \tag{7}$$

The boundary conditions are the following: the pathwise continuity of V in S = L leads to $V(L, t, \tau) = V(L, t, 0)$ for all t, and

$$V(S, T, \tau) = (S_T - K)^+ \ \text{if } \tau < D, \qquad V(S, T, \tau) = 0 \ \text{otherwise} \tag{8}$$

In the study of Haber et al. [8], the numerical solution to equations (6) and (7) is implemented using an explicit finite difference scheme. In the case of a discrete monitoring of the contract, Vetzal and Forsyth [7] develop an algorithm based on the numerical solution of a system of one-dimensional PDEs. It is assumed that τ only changes at observation dates with the value of S with respect to the barrier. Away from observation dates, the PDE satisfied by V does not depend on τ. Then, the pricing problems consist of a small number of one-dimensional PDEs, which exchange information only at observation dates (we impose the continuity of V).
These methods have one major benefit: they are flexible enough to be easily modified to price more general options, like cumulative Parisian options (i.e., when the recorded duration is cumulative rather than continuous).

Double Parisian

There exists a double barrier version of the standard Parisian options. Double Parisian options are barrier options that are activated or canceled if the underlying asset continuously remains outside a range
$[L_1, L_2]$ long enough. The price of a double Parisian out call at time 0 is given by

$$e^{-rT}\, \mathbb{E}\left[(S_T - K)^+ \mathbf{1}_{\{T_{L_1}^{D,-}(S) > T\}}\, \mathbf{1}_{\{T_{L_2}^{D,+}(S) > T\}}\right] \tag{9}$$

These double Parisian options can be priced using the Monte Carlo procedure improved with the sharp large deviation method proposed by Baldi, Caramellino, and Iovino [3]. Labart and Lelong [9] give analytical formulas for the Laplace transforms of the prices with respect to the maturity time.

References

[5] Chesney, M., Jeanblanc-Picqué, M. & Yor, M. (1997). Brownian excursions and Parisian barrier options, Advances in Applied Probability 29(1), 165–184.
[6] Costabile, M. (2002). A combinatorial approach for pricing Parisian options, Decisions in Economics and Finance 25(2), 111–125.
[7] Forsyth, P.A. & Vetzal, K.R. (1999). Discrete Parisian and delayed barrier options: A general numerical approach, Advances in Futures Options Research 10, 1–16.
[8] Haber, R.J., Schonbucher, P.J. & Wilmott, P. (1999). Pricing Parisian options, Journal of Derivatives 6(3), 71–79.
[9] Labart, C. & Lelong, J. Pricing Double Parisian options using Laplace transforms, International Journal of Theoretical and Applied Finance (to appear), http://hal.archives-ouvertes.fr/hal-00220470/fr/.
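
As a rough illustration of the Monte Carlo approach and of the excursion clock τ used in the PDE formulation of the Parisian Option article, the following Python sketch estimates the down-and-out Parisian call of equation (1). It is a minimal sketch under stated assumptions, not the method of any of the cited papers: the function name and all parameter values are hypothetical, the asset is simulated under the risk-neutral drift r, and the clock is only observed on a discrete time grid, which introduces the kind of discretization bias mentioned in the Monte Carlo section.

```python
import math
import random


def parisian_down_out_call_mc(s0, strike, barrier, window, maturity,
                              rate, sigma, n_steps=250, n_paths=2000, seed=0):
    """Crude Monte Carlo estimate of a down-and-out Parisian call.

    Assumptions (not from the source): lognormal asset simulated under the
    risk-neutral drift `rate`, clock observed only on the time grid.
    The clock tau grows by dt while S < barrier and is reset to zero
    otherwise; a path is knocked out once tau exceeds `window`.
    """
    rng = random.Random(seed)
    dt = maturity / n_steps
    drift = (rate - 0.5 * sigma ** 2) * dt
    vol = sigma * math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        s, tau, alive = s0, 0.0, True
        for _ in range(n_steps):
            s *= math.exp(drift + vol * rng.gauss(0.0, 1.0))
            tau = tau + dt if s < barrier else 0.0
            if tau > window:
                alive = False  # keep simulating so paths stay comparable across windows
        if alive:
            total += max(s - strike, 0.0)
    return math.exp(-rate * maturity) * total / n_paths


# Hypothetical contract: spot 100, strike 100, barrier 90, window of 10 days.
price = parisian_down_out_call_mc(100.0, 100.0, 90.0, 10.0 / 250.0, 1.0, 0.05, 0.2)
```

Because every path consumes the same random draws whatever the window, two calls with the same seed are directly comparable: a longer window can only keep more paths alive, so the estimate is nondecreasing in D.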
The Market for Cliquet Options

The early market in cliquet options featured vanilla contracts that were simply a series of forward starting at-the-money options. Rubinstein [4] provided pricing formulae for forward-start options in a Black–Scholes framework resulting in Black–Scholes pricing for vanilla cliquets. Cliquet products now trade on exchanges and the forerunners of these listings were reset warrants, whose first public listings in the United States appeared in 1993 [5] and 1996 [1, 2]. Cliquet options are equally effective in capturing bullish (call) and bearish (put) market sentiments.
The current market for cliquet options accommodates a rich variety of features, which are sometimes best illuminated in discussions of pricing methods [6, 7]. The most actively traded cliquets are return-based products that accumulate periodic settlement values and pay a cash flow at maturity. The return characteristics and the price appeal of a cliquet can be tailored by adding caps and floors to the period returns and by introducing a strike moneyness factor different from one. Defining the ith settlement value, $R_i$, in a call style cliquet by

$$R_i = \max\left[\text{floor}_i,\ \min\left(\frac{S_i}{S_i^0} - k_i,\ \text{cap}_i\right)\right] \tag{1}$$

where $\text{floor}_i$ is the one-period (local) return floor for period i; $\text{cap}_i$ is the one-period (local) return cap for period i; $S_i$ is the market level on the settlement date for period i; $S_i^0$ is the market level on the strike setting date for period i; and $k_i$ is a strike moneyness factor for period i.
The payoff at maturity is given by

$$\text{payoff} = \text{Notional} \cdot \max\left[GF,\ \min\left(\sum_{i}^{n} R_i,\ GC\right)\right] \tag{2}$$

The periodic strike setting feature of a cliquet enables an investor to implement a strategy consistent with rolling options positions but without exposure to volatility movements. For example, an investor could buy a cliquet to implement a rolling three-month put strategy and be immunized against the future increase in options premiums that would accompany increases in volatility throughout the life of the strategy. Hence a cliquet provides cost certainty, whereas the rolling put strategy does not.
Cliquet products are often embedded in principal-protected notes, which combine certain aspects of fixed-income investing with equity investing. These notes guarantee the return of principal at maturity with the investment upside provided by the cliquet return. Retail notes would generally base investment gains on a broad market index such as the S&P 500 index. Principal-protected notes may further guarantee a minimum investment yield, which compounds to the value of the global floor at maturity. The guaranteed yield may be considered as part of the equity return, as it is in equation (2), or it can be considered as part of the fixed-income return. In the latter case, the equity payoff in equation (2) would be modified as in equation (3):

$$\text{payoff} = \text{Notional} \cdot \max\left[0,\ \min\left(\sum_{i}^{n} R_i,\ GC\right) - GF\right] \tag{3}$$

where the global floor now sets a strike on the sum of periodic returns.

Summary

We have discussed the general characteristics of cliquet options and illustrated the payoff for one commonly traded type of the cliquet. Numerous variations
exist and can be tailored to give very different risk-reward profiles. Some are distinguished in the market by specific names, for example reverse cliquets [3]. The customizability of cliquet options likely means we will continue to see product innovation in this area in the future.

References

[1] Conran, A. (1996). IFC Issues S&P 500 Index Bear Market Warrants, November 26, 1996 Press Release, http://www.ifc.org/ifcext/media.nsf/Content/PressReleases.
[2] Gray, S.F. & Whaley, R.E. (1997). Valuing S&P 500 bear market warrants with a periodic reset, Journal of Derivatives 5(1), 99–106.
[3] Jeffrey, C. (2004). Reverse cliquets: end of the road? RISK 17(2), 20–22.
[4] Rubinstein, M. (1991). Pay now, choose later, RISK 4, 13.
[5] Walmsley, J. (1998). New Financial Instruments, 2nd edition, John Wiley & Sons, New York.
[6] Wilmott, P. (2002). Cliquet options and volatility models, Wilmott Magazine, 6.
[7] Windcliff, H., Forsyth, P.A. & Vetzal, K.R. (2006). Numerical methods and volatility models for valuing cliquet options, Applied Mathematical Finance 13, 353.

RICK L. SHYPIT
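
The cliquet payoff formulas (1) to (3) above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a market-standard implementation: the function names and the sample index levels are hypothetical, GF and GC are read as the global floor and global cap, the same local floor, cap and moneyness factor are used in every period, and each period's strike-setting level is taken to be the previous settlement level.

```python
def settlement_value(s_end, s_start, k, floor, cap):
    """Equation (1): R_i = max(floor_i, min(S_i / S_i^0 - k_i, cap_i))."""
    return max(floor, min(s_end / s_start - k, cap))


def cliquet_payoff(levels, notional, k=1.0, floor=0.0, cap=0.05,
                   global_floor=0.0, global_cap=0.30, floor_as_strike=False):
    """Payoff at maturity from a list of index levels on the reset dates.

    Assumption: levels[i] is the strike-setting level of period i + 1 and
    levels[i + 1] its settlement level (hypothetical convention).
    """
    returns = [settlement_value(levels[i + 1], levels[i], k, floor, cap)
               for i in range(len(levels) - 1)]
    capped_sum = min(sum(returns), global_cap)
    if floor_as_strike:
        # Equation (3): the global floor acts as a strike on the summed returns.
        return notional * max(0.0, capped_sum - global_floor)
    # Equation (2): the global floor is a minimum on the summed returns.
    return notional * max(global_floor, capped_sum)


# Hypothetical three-period path: +4%, about -4.8% (floored at 0), about +11% (capped at 5%).
levels = [100.0, 104.0, 99.0, 110.0]
payoff_eq2 = cliquet_payoff(levels, 1000.0, global_floor=0.02)
payoff_eq3 = cliquet_payoff(levels, 1000.0, global_floor=0.02, floor_as_strike=True)
```

With these inputs the summed local returns are 4% + 0% + 5% = 9%, so equation (2) pays 9% of the notional while equation (3) pays 9% minus the 2% global floor.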
Basket Options

Equity basket options are derivative contracts that have as underlying asset a basket of stocks. This category may include (broadly speaking) options on indices as well as options on exchange-traded funds (ETFs), as well as options on bespoke baskets. The latter are generally traded over the counter, often as part of, or embedded in, structured equity derivatives.
Options on broad market ETFs, such as the Nasdaq 100 Index Trust (QQQQ) and the S&P 500 Index Trust (SPY), are the most widely traded contracts in the US markets. As of this writing, their daily volumes far exceed those of options on most individual stocks. Owing to this wide acceptance, QQQQ and ETF options have recently been given quarterly expirations in addition to the standard expirations for equity options. Options on sector ETFs, such as the S&P Financials Index (XLF) or the Merrill Lynch HOLDR (SMH), are also highly liquid.
If we denote by B the value of the basket of stocks at the expiration date of the option, a basket call has payoff given by max(B − K, 0) and a basket put has payoff max(K − B, 0), where K is the strike price. Most exchange-traded ETF options are physically settled. Index options tend to be cash settled. Over-the-counter basket options, especially those embedded in structured notes, are cash settled.
The fair value price of a (bespoke) basket option is determined by the joint risk-neutral distribution of the underlying stocks. If we write the value of the basket as

$$B = \sum_{i=1}^{n} w_i S_i \tag{1}$$

where $w_i$, $S_i$ denote respectively the number of shares of the ith stock and its price, the returns satisfy

$$\frac{dB}{B} = \sum_{i=1}^{n} \frac{w_i S_i}{B}\, \frac{dS_i}{S_i} = \sum_{i=1}^{n} p_i\, \frac{dS_i}{S_i}, \quad \text{with } p_i \equiv \frac{w_i S_i}{B} \tag{2}$$

Here, $p_i$ represents the instantaneous capitalization weight of the ith stock in the basket, that is, the percentage of the total dollar amount of the basket associated with each stock. If we assume that these weights are approximately constant, which is reasonable, it follows that the volatility of the basket and the volatilities of the stocks satisfy the relation

$$\sigma_B^2 = \sum_{i,j=1}^{n} p_i p_j \sigma_i \sigma_j \rho_{ij} \tag{3}$$

where $\sigma_B$ is the volatility of the basket, $\sigma_i$ are the volatilities of the stocks, and $\rho_{ij}$ is the correlation matrix of stock returns. If we assume lognormal returns for the individual stocks, then the probability distribution for the price of the basket is not lognormal. Nevertheless, the distribution is well approximated by a lognormal and equation (3) represents the natural approximation for the implied volatility of the basket in this case.
The notion of implied correlation is sometimes used to quote basket option prices. The market convention is to assume (for quoting purposes) that $\rho_{ij} \equiv \rho$, a constant. It then follows from equation (3) that the implied correlation of a basket option is

$$\rho \equiv \frac{\sigma_B^2 - \sum_{i=1}^{n} p_i^2 \sigma_i^2}{\sum_{i \ne j} p_i p_j \sigma_i \sigma_j} = \frac{\sigma_B^2 - \sum_{i=1}^{n} p_i^2 \sigma_i^2}{\left(\sum_{i=1}^{n} p_i \sigma_i\right)^2 - \sum_{i=1}^{n} p_i^2 \sigma_i^2} \approx \frac{\sigma_B^2}{\left(\sum_{i=1}^{n} p_i \sigma_i\right)^2} \tag{4}$$

Implied correlation is the market convention for quoting the implied volatility of a basket option as a fraction of the weighted average of implied volatilities of the components.
For example, if the average implied volatility for the components of the QQQQ for the December at-the-money options is 25% and the corresponding QQQQ option is trading at an implied volatility of 19%, the implied correlation is $\rho \approx (19/25)^2 = 58\%$. This convention is sometimes applied to options that are not at the money as well. In this case, in the calculation of implied correlation for the basket option, the implied volatilities for the component stocks are usually taken to have the same moneyness as the index in percentage terms. Other
conventions for choosing the volatilities of the components, such as equal-delta or "beta-adjusted" moneyness, are sometimes used as well. Since the corresponding implied correlations can vary with strike price, market participants sometimes talk about the implied correlation skew of a series of basket options.

Further Reading

Avellaneda, M., Boyer-Olson, D., Busca, J. & Friz, P. (2002). Reconstructing volatility, Risk 15(10).
Haug, E.G. (1998). The Complete Guide to Option Pricing Formulas, McGraw-Hill.
Hull, J. (1993). Options Futures and Other Derivative Securities, Prentice Hall Inc., Toronto.

Related Articles

Correlation Swap; Exchange-traded Funds (ETFs).

MARCO AVELLANEDA
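
The basket volatility relation (3) and the implied correlation formula (4) of the Basket Options article lend themselves to a short numerical check. The Python sketch below is only illustrative: the function names are hypothetical, and the 100 equally weighted components, each with a 25% volatility, are an assumption chosen to mimic the QQQQ example (25% average component volatility, 19% basket volatility), not data from the source.

```python
import math


def basket_vol(weights, vols, rho):
    """Equation (3) with a single correlation rho for every pair i != j."""
    diag = sum((p * s) ** 2 for p, s in zip(weights, vols))
    full = sum(p * s for p, s in zip(weights, vols)) ** 2
    return math.sqrt(diag + rho * (full - diag))


def implied_correlation(sigma_b, weights, vols):
    """Exact form of equation (4)."""
    diag = sum((p * s) ** 2 for p, s in zip(weights, vols))
    full = sum(p * s for p, s in zip(weights, vols)) ** 2
    return (sigma_b ** 2 - diag) / (full - diag)


def implied_correlation_approx(sigma_b, weights, vols):
    """Approximate form of equation (4): sigma_B^2 / (sum_i p_i sigma_i)^2."""
    full = sum(p * s for p, s in zip(weights, vols)) ** 2
    return sigma_b ** 2 / full


# Hypothetical basket: 100 equally weighted names, each with 25% volatility.
weights = [0.01] * 100
vols = [0.25] * 100
rho_exact = implied_correlation(0.19, weights, vols)
rho_approx = implied_correlation_approx(0.19, weights, vols)  # (19/25)^2
```

The approximation drops the diagonal terms, so it slightly overstates the exact implied correlation; the gap shrinks as the number of components grows and widens for concentrated baskets.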
Call Spread

A call spread is an option strategy with limited upside and limited downside that uses call options of two different strikes but the same maturity on the same underlying. More details and pricing models can be found in [1]. Market considerations can be found in [3–5]. The call spread produces a structure that at maturity pays off only in scenarios where the price of the underlying is above the lower strike. One can think of this strategy as buying a low-strike call option and financing part of the upfront cost by selling a higher strike call option. The effect of selling the higher strike option is to limit the upside potential, but reduce the cost of the structure. It should be used for expressing a bullish view that the underlying will rise in price above the lower strike. As with all options, choosing the strike and maturity will depend on one's view of how much the underlying will move and how quickly it will move there. An example is shown, in detail, in Figure 1.
In the example shown in Figure 1, we look at a 790/810 call spread on the S&P 500 index, SPX. With the underlying SPX index at 770 and with three months to expiration, a 790 strike call price is 49.44 and an 810 strike call price is 41.79. The spread cost is 49.44 − 41.79 = 7.65. Thus, the cost for a call spread is significantly reduced from the outright cost of a call option with the same strike. This upfront cost for the call spread is the most one can lose in a call spread. We subtract this initial investment from all other valuations, as shown in Figure 1, to get a total value. On the other hand, if both options expire in the money, one will earn 20 = 810 − 790 on the call spread. Then the maximum profit is the spread minus the initial cost, or 20 − 7.65 = 12.35.
Note that with three months to expiration, the call spread value is fairly insensitive to the underlying price. However, as the option gets closer to expiration, the sensitivity of the price of the call spread becomes greater, especially in the range of the spread itself. This sensitivity to underlying price or delta (see Delta Hedging) is illustrated in Figure 2.
The delta of the call spread stays relatively flat until relatively close to expiration of the option. When the call spread is close to expiration, the delta is very unstable around the two strikes.
Another important consideration is the volatility implied by the market (see Implied Volatility: Market Models). When the strikes are out of the money, the call spread price increases when volatility increases because the probability of finishing in the money increases. On the other hand, when the strikes are in the money, the call spread price decreases when volatility increases because the probability of finishing out of the money increases. This is illustrated in Figure 3.

The Relationship with Digital Options and Skew

The value of a European call spread structure can be written in terms of the difference of two call options. If we let p(S) denote the probability distribution of the underlying at the time of option expiry, then we have

$$\text{CallSpread} = e^{-r\tau} \int_{K_1}^{\infty} (S - K_1)\, p(S)\, dS - e^{-r\tau} \int_{K_2}^{\infty} (S - K_2)\, p(S)\, dS \tag{1}$$

$$\text{CallSpread} = e^{-r\tau} \int_{K_2}^{\infty} (K_2 - K_1)\, p(S)\, dS + e^{-r\tau} \int_{K_1}^{K_2} (S - K_1)\, p(S)\, dS \tag{2}$$

If we now take the strikes very close to each other, the second term becomes insignificant. Next, if we lever up by $1/(K_2 - K_1)$, the payoff approximates the payoff of a digital option, which pays one if the underlying at termination is greater than the strike and zero otherwise. In this case,

$$\text{DigitalOption} = e^{-r\tau} \int_{K}^{\infty} p(S)\, dS = e^{-r\tau} \left(1 - \Phi(K)\right) \tag{3}$$

where $\Phi(K)$ is the cumulative probability distribution at termination of the underlying. For the original paper, see [6]. Also see [2, 7, 8] for more details.
We can state equation (3) in words as follows. The
probability distribution function of the underlying at termination can be inferred from market prices as the derivative of digital options prices with respect to the strike. As these digital option prices come from call spreads with close strikes, we can conclude that the probability distribution function can be inferred from vanilla option prices.

Figure 1 The value of a call spread at various times before expiration (call spread total value against underlying price from 700 to 900; curves for 3 months, 2 weeks, 3 days, and expiration)
Figure 2 The delta of a call spread at various times before expiration (delta against underlying price from 700 to 900; curves for 3 months, 2 weeks, 3 days, and expiration)
Figure 3 The vega of a call spread at various times before expiration (vega against underlying price from 700 to 900; curves for 3 months, 2 weeks, and 3 days)

Equation (2) shows that for close strikes or long expiries, the value of a call spread is approximately the strike difference times the probability that the underlying finishes above the spread:

$$\text{CallSpread} \approx e^{-r\tau} (K_2 - K_1) \cdot \left(1 - \Phi(K_2)\right) \tag{4}$$

This can be used as a crude first-order estimate for the value of a call spread. The second term in equation (2) can be approximated as (similar to the area of a triangle)

$$e^{-r\tau} \int_{K_1}^{K_2} (S - K_1)\, p(S)\, dS \approx e^{-r\tau}\, \frac{K_2 - K_1}{2} \left(\Phi(K_2) - \Phi(K_1)\right) \tag{5}$$

This gives a better approximation

$$\text{CallSpread} \approx e^{-r\tau} (K_2 - K_1) \left(1 - \frac{\Phi(K_1) + \Phi(K_2)}{2}\right) \tag{6}$$

This is a very intuitive formula as it is just the payoff of the call spread times the "average" probability the call spread finishes in the money.

References

[1] Hull, J. (2003). Options, Futures, and Other Derivatives, 5th Edition, Prentice Hall.
[2] Lehman Brothers (2008). Listed Binary Options, available at http://www.cboe.com/Institutional/pdf/ListedBinaryOptions.pdf
[3] The Options Industry Council (2007). Option Strategies in a Bull Market, available at www.888options.com.
[4] The Options Industry Council (2007). Option Strategies in a Bear Market, available at www.888options.com.
[5] The Options Industry Council (2007). The Equity Options Strategy Guide, January 2007, available at www.888options.com
[6] Reiner, E. & Rubinstein, M. (1991). Breaking down the barriers, Risk Magazine 4, 28–35.
[7] Taleb, N.N. (1997). Dynamic Hedging: Managing Vanilla and Exotic Options, Wiley Finance.
Further Reading

Haug, E.G. (2007). Option Pricing Formulas, 2nd Edition, McGraw Hill.

Call Options.

ERIC LIVERANCE
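
The worked 790/810 example and the two call spread approximations (4) and (6) of the Call Spread article can be checked numerically. The Python sketch below is an illustration under explicit assumptions: the underlying is taken to be lognormal (Black–Scholes) so that the cumulative distribution is known in closed form, and the volatility, rate and helper names are hypothetical choices, not values from the source.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, k, r, sigma, tau):
    """Black-Scholes call price with no dividends (modeling assumption)."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


def cdf_terminal(s, k, r, sigma, tau):
    """Risk-neutral P(S_T <= K) for a lognormal underlying, i.e. N(-d2)."""
    d2 = (math.log(s / k) + (r - 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    return norm_cdf(-d2)


# Arithmetic of the example quoted in the text.
cost = 49.44 - 41.79                 # upfront cost of the 790/810 spread
max_profit = (810.0 - 790.0) - cost  # strike difference minus cost

# Hypothetical model inputs for the comparison (not from the source).
s, k1, k2, r, sigma, tau = 770.0, 790.0, 810.0, 0.0, 0.25, 0.25
disc = math.exp(-r * tau)
exact = bs_call(s, k1, r, sigma, tau) - bs_call(s, k2, r, sigma, tau)
approx4 = disc * (k2 - k1) * (1.0 - cdf_terminal(s, k2, r, sigma, tau))
approx6 = disc * (k2 - k1) * (1.0 - 0.5 * (cdf_terminal(s, k1, r, sigma, tau)
                                           + cdf_terminal(s, k2, r, sigma, tau)))
```

Since the exact call spread value equals the discounted integral of 1 − Φ(K) between the two strikes, approximation (6) is the trapezoid rule for that integral, which is one way to see why it improves on the one-sided estimate (4).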
Butterfly

A butterfly spread is an option strategy with limited upside and limited downside that uses call options of three different strikes but the same maturity on the same underlying. Specifically, a butterfly is a structure that is a long position in 1 low-strike call, a short position in 2 midstrike calls, and a long position in 1 high-strike call. More details and pricing models can be found in [2]. Market considerations can be found in [4–6]. The butterfly spread produces a structure that at maturity pays off only in scenarios where the price of the underlying is between the lowest and highest strikes. One can think of this strategy as buying an option on the underlying being in a range. The butterfly has limited upside potential, but a significantly reduced cost compared to that of an outright call option. It should be used for expressing a view that the underlying will trade in a range. As with all options, choosing the strike and maturity will depend on one's view of how much the underlying will move and how quickly it will move there. An example is shown in detail in Figure 1.
In the example shown in Figure 1, we look at a 780/800/820 butterfly on the S&P 500 index, SPX. With the underlying SPX index at 770 and with three months to expiration, the butterfly cost is close to 1.00. The call option with the 800 strike is 70.14; thus, the cost for a butterfly is significantly reduced from the outright cost of a call option with the same strike. This upfront cost for the butterfly is the maximum that this butterfly position can lose. We subtract this initial investment from all other valuations, as shown in Figure 1, to get a total value. If the underlying is exactly 800 at expiration, the position will earn 20 on the butterfly from the low-strike option. The maximum position profit then is the strike spread minus the initial cost, or 20 − 1.00 = 19.00.
Note that with three months to expiration, the butterfly value is fairly insensitive to the underlying price and is difficult to distinguish from the x-axis. However, as the option gets closer to expiration, the sensitivity of the price of the butterfly becomes greater, especially in the range of the butterfly strikes. This sensitivity to underlying price or delta (see Delta Hedging) is illustrated in Figure 2.
The delta of the butterfly stays relatively flat until relatively close to expiration of the option. When the butterfly is close to expiration, the delta is very unstable around the three strikes.
Another important consideration is the volatility implied by the market (see Implied Volatility: Market Models). The vega profile of a butterfly is shown in Figure 3. When the underlying is close to the strikes, the vega is negative because when volatility increases the probability that the underlying expires out of the money increases. For this reason, it is common to use a butterfly with relatively long expiries and with strikes centered around at-the-money to take a view that implied volatility will decline while still holding a position with relatively small delta (insensitive to changes in the underlying). When the underlying is away from the money, the butterfly is long vega because when volatility increases, the probability that the underlying finishes in the money increases.

The Relationship with Distribution of the Underlying

A butterfly can be thought of as a long call spread plus a short call spread, with overlapping strikes and the same strike spread. An approximation for the value of a call spread can be found in Call Spread:

$$\text{Call Spread} \approx e^{-r\tau} (K_2 - K_1) \cdot \left(1 - \frac{\Phi(K_1) + \Phi(K_2)}{2}\right) \tag{1}$$

where $\Phi(x)$ is the cumulative distribution function of the underlying. Applying equation (1) to a butterfly, we have

$$\text{Butterfly} \approx e^{-r\tau} (K_2 - K_1)^2\, \frac{\Phi(K_3) - \Phi(K_1)}{K_3 - K_1} \approx e^{-r\tau} (K_2 - K_1)^2\, p(K_2) \tag{2}$$

where p(x) is the probability distribution function of the underlying at option expiration and is the derivative of $\Phi(x)$ (Figure 4).
We can apply this formula in the following way. We convert the triangle in the lower part of Figure 4 to a square. Then we let the value of the payoff of the
$$\text{prob}(K_a < S < K_b) \approx e^{r\tau}\, \frac{\text{Butterfly}}{K_2 - K_1} \tag{3}$$

Figure 1 The value of a butterfly at various times before expiration (butterfly total value against underlying price from 700 to 900; curves for 3 months, 2 weeks, 3 days, and expiration)
Figure 2 The delta of a butterfly at various times before expiration (delta against underlying price from 700 to 900; curves for 3 months, 2 weeks, 3 days, and expiration)
Figure 3 The vega of a butterfly at various times before expiration (vega against underlying price from 700 to 900; curves for 3 months, 2 weeks, and 3 days)
Figure 4 Using a butterfly to infer the underlying probability distribution (butterfly payout and its approximation against change in underlying, between $K_a$ and $K_b$)

The relationship between option prices and the distribution of the underlying was first pointed out in [1], but see also [3, 7]. The use of call spreads and butterflies to impute the market-implied underlying probability distribution can be related to taking derivatives with respect to the strike of the call price. A call spread is like a first derivative and a butterfly is like a second derivative. Formally, we have

$$\text{Call} = e^{-r\tau} \int_{K}^{\infty} (S - K)\, p(S)\, dS \tag{4}$$

$$\frac{\partial^2}{\partial K^2} \text{Call} = e^{-r\tau}\, p(K) \tag{6}$$

References

[1] Breeden, D. & Litzenberger, R. (1978). Prices of state-contingent claims implicit in option prices, Journal of Business 51, 621–651.
[2] Hull, J. (2003). Options, Futures, and Other Derivatives, 5th Edition, Prentice Hall.
[3] Jackwerth, J.C. (1999). Option-implied risk-neutral distributions and implied binomial trees: a literature review, The Journal of Derivatives 7, 66–82.
[4] The Options Industry Council (2007). Option Strategies in a Bull Market, available at www.888options.com.
[5] The Options Industry Council (2007). Option Strategies in a Bear Market, available at www.888options.com
[6] The Options Industry Council (2007). The Equity Options Strategy Guide, January 2007, available at www.888options.com
[7] Rubinstein, M. (1994). Implied binomial trees, The Journal of Finance 49, 771–818.

Related Articles

Corridor Options; Risk-neutral Pricing; Variance Swap.

ERIC LIVERANCE
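
The idea in the Butterfly article that a butterfly behaves like a second derivative of the call price with respect to the strike can be tested numerically. In the hedged Python sketch below, call prices come from the Black–Scholes formula so that the true (lognormal) density is known; this model choice, the parameter values and the helper names are assumptions made for illustration only.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(s, k, r, sigma, tau):
    """Black-Scholes call price with no dividends (modeling assumption)."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)


def lognormal_pdf(x, s, r, sigma, tau):
    """Risk-neutral density of S_T under the same lognormal assumption."""
    mean = math.log(s) + (r - 0.5 * sigma ** 2) * tau
    std = sigma * math.sqrt(tau)
    z = (math.log(x) - mean) / std
    return math.exp(-0.5 * z * z) / (x * std * math.sqrt(2.0 * math.pi))


# Hypothetical inputs: the 780/800/820 strikes of the text, other values assumed.
s, r, sigma, tau = 770.0, 0.02, 0.25, 0.25
k1, k2, k3 = 780.0, 800.0, 820.0
h = k2 - k1

# Long 1 low-strike call, short 2 mid-strike calls, long 1 high-strike call.
butterfly = (bs_call(s, k1, r, sigma, tau) - 2.0 * bs_call(s, k2, r, sigma, tau)
             + bs_call(s, k3, r, sigma, tau))

implied_density = math.exp(r * tau) * butterfly / h ** 2  # from equation (2)
implied_prob = math.exp(r * tau) * butterfly / h          # equation (3)
true_density = lognormal_pdf(k2, s, r, sigma, tau)
```

Dividing the butterfly price by the squared strike spread is a central finite difference in the strike, so the recovered density should approach the true one as the strikes are moved closer together.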
Gamma Hedging

Why Hedging Gamma?

Gamma is defined as the second derivative of a derivative product with respect to the underlying price. To understand why gamma hedging is not just the issue of annihilating a second-order term in the Taylor expansion of a portfolio, we review the profit and loss (P&L)ᵃ explanation of a delta-hedged self-financing portfolio for a monounderlying option and its link to the gamma.
Let us consider an economy described by the Black and Scholes framework, with a riskless interest rate r, a stock S with no repo or dividend whose volatility is σ, and an option O written on that stock. Let Π be a self-financing portfolio composed at t of

• the option $O_t$;
• its delta hedge: $-\Delta_t S_t$ with $\Delta_t = \partial O / \partial S$; and
• the corresponding financing cash amount $-O_t + \Delta_t S_t$.

We note δΠ the P&L of the portfolio between t and t + δt and we set $\delta S = S_{t+\delta t} - S_t$. Directly, we have that the delta part of the portfolio P&L is $-\Delta_t\, \delta S$ and that the P&L of the financing part is $(-O_t + \Delta_t S_t)\, r\, \delta t$. Regarding the option P&L, δO, we have, by a second-order expansion,

$$\delta O \approx \frac{\partial O}{\partial t}\, \delta t + \frac{\partial O}{\partial S}\, \delta S + \frac{1}{2} \frac{\partial^2 O}{\partial S^2}\, (\delta S)^2 \tag{1}$$

Furthermore, the option satisfies the Black and Scholes equation (see Black–Scholes Formula):

$$\frac{\partial O}{\partial t} + rS \frac{\partial O}{\partial S} + \frac{1}{2} \sigma^2 S^2 \frac{\partial^2 O}{\partial S^2} = rO \tag{2}$$

Combining these two equations and writing the P&L of the portfolio as the sum of the three terms, we get

$$\delta \Pi \approx \frac{1}{2} S^2 \frac{\partial^2 O}{\partial S^2} \left[\left(\frac{\delta S}{S}\right)^2 - \sigma^2\, \delta t\right] \tag{3}$$

where $\partial^2 O / \partial S^2$ is the gamma of the option part of the portfolio (in terms of definition, $S^2 (\partial^2 O / \partial S^2)$ is called the cash gamma because it is expressed in currency and can be summed over several stock positions, whereas the direct gamma cannot).
As no condition was put on the relation of the volatility to time and space, equation (3) is easily extended to a local volatility setting (see Local Volatility Model). Practitioners call this equation the breakeven relation and $\sigma \sqrt{\delta t}$ the breakeven for it represents the move in performance the stock has to make in the time δt to ensure a flat P&L (e.g., if we consider that a year is composed of 256 open days, a stock having an annualized volatility of 16% needs to make a move of 1%, at which the delta is rebalanced, to ensure a flat P&L between two consecutive days). Figure 1 shows the portfolio P&L for a position composed of an option with a positive gamma.
Equation (3) leads to two important remarks. First, it is a local relation, both in time and space, and the fact that the gamma is gearing the breakeven relation implies that the global P&L of a positive gamma position, hedged according to the Black and Scholes self-financing strategy, can very well be negative if a stock makes large moves in a region where the gamma is small and makes small moves in a region where the gamma is maximum, even if the realized variance of the stock is higher than the pricing variance $\sigma^2$. Secondly, in the long run, the realized variance is usually smaller than the implied variance, which can lead practitioners to build negative gamma positions. Yet, Figure 1 shows that a positive gamma position is of finite loss and possibly infinite gain, whereas it is the opposite for a negative gamma position. Practically, this is why traders tend naturally to a gamma neutral position.
A specific aspect of the equity market is the presence of dividends. One can wonder if, on the date the stock drops by the dividend amount, a positive gamma position is easier to carry than a negative gamma position. It is, of course, linked to the dividend representation chosen in the stock modeling. It can be shown that the only consistent way of representing the dividends is the one proposed in Dividend Modeling, where the stock is modeled as in Black and Scholes between two consecutive dividend dates. It is the only representation in which equation (3) stands (on the dividend date, the P&L term coming from the cash dividend part is offset by a term arising from the adapted Black and Scholes equation). In others, either the gamma carries a dividend part (dividend yield models) that leads to a
Figure 1 The P&L of a self-financing portfolio composed of an option with a positive gamma in the interval δt (axes: P&L versus δS/S, with the breakeven points marked)
false breakeven on the dividend date or equation (3) is not associated with the stock but with the variable that is stochastic (model in which the stock is described as a capitalized exponential martingale minus a capitalized dividend term, for example). This is why practitioners use the model proposed in Dividend Modeling rather than any other. This is also why it is, indeed, a general framework we put ourselves in by excluding dividends and repo (which is usually represented by a drift term whose P&L impact is also offset by a term arising from the adapted Black and Scholes equation) in our analysis.

Practical Gamma Hedging

We have seen why traders usually try to build a gamma-neutral portfolio. Yet, there is no pure gamma instrument in the market, and neutralizing the gamma exposure always brings a vega exposure to the portfolio. Without trying to be exhaustive, we briefly review here some natural gamma hedging instruments.

Hedging Gamma with Vanilla Options

European calls and puts have the same gamma (and the same vega). Hence, they are equivalent hedging instruments. Figure 2 shows the gamma of a European option for two different maturities and Figure 3 shows the compared evolutions with respect to the maturity of the gamma and of the vega.

These two figures show that, to efficiently hedge his or her gamma exposure, a trader would rather use a short-term option to avoid bringing too much vega to his or her position. Moreover, the gamma of an “at-the-money” option is increasing as one gets closer to the maturity, whereas the gamma of an “out-of-the-money” option is decreasing.

The Put Ratio Temptation

As equation (3) shows, the gamma and the theta (first derivative of a derivative product with respect to time) of a portfolio are of opposite signs. Moreover, in the equity market, the implied volatility is usually described by a skew, meaning that if we consider two puts $P_1$ and $P_2$ for the same maturity $T$, having two strikes $K_1$ and $K_2$ with $K_1 < K_2$, we classically have $\sigma_{K_1} > \sigma_{K_2}$. If we now build a self-financing portfolio that is composed of $P_2 - \alpha P_1$ with $\alpha = \Gamma_2/\Gamma_1$, the ratio of the two gammas, we get from equation (3) that

$$\delta\Pi \approx \frac{1}{2} S^2\,\Gamma_2\left(\sigma_{K_1}^2 - \sigma_{K_2}^2\right)\delta t > 0.$$

This result is not in contradiction with arbitrage theory; it only demonstrates that equation (3) is strictly a local relation. As shown by Figure 2, to keep this relation through time, the trader would have to continuously sell the put $P_2$, as $\alpha$ increases as time to maturity decreases, and, in case of a market drop down, he or she would find himself in a massive negative gamma situation. Still, practitioners commonly use put ratios to improve the breakeven of their position.
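The breakeven relation of equation (3) and the put ratio carry can be checked numerically. The following is a minimal sketch, not part of the original article: the function names and the sample cash gamma and volatilities are illustrative assumptions, and the 256 open days per year follow the convention used in the text.

```python
import math


def daily_breakeven_move(annual_vol: float, open_days: int = 256) -> float:
    """Relative move |dS/S| over one open day that zeroes the P&L in equation (3)."""
    return annual_vol * math.sqrt(1.0 / open_days)


def delta_hedged_pnl(cash_gamma: float, rel_move: float, vol: float, dt: float) -> float:
    """Equation (3): 0.5 * [S^2 * gamma] * ((dS/S)^2 - sigma^2 * dt)."""
    return 0.5 * cash_gamma * (rel_move ** 2 - vol ** 2 * dt)


def put_ratio_carry(cash_gamma_2: float, vol_k1: float, vol_k2: float, dt: float) -> float:
    """Local P&L of P2 - alpha * P1 with alpha = gamma_2 / gamma_1, from equation (3)."""
    return 0.5 * cash_gamma_2 * (vol_k1 ** 2 - vol_k2 ** 2) * dt


if __name__ == "__main__":
    dt = 1.0 / 256  # one open day, in years (assumed convention)
    print(daily_breakeven_move(0.16))             # about 0.01, i.e. a 1% move
    print(delta_hedged_pnl(1e6, 0.02, 0.16, dt))  # positive: the move exceeds the breakeven
    print(put_ratio_carry(1e6, 0.25, 0.20, dt))   # positive while the local relation holds
```

The sign of `delta_hedged_pnl` flips with the sign of the cash gamma, which is the finite-loss versus finite-gain asymmetry discussed around Figure 1.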
Figure 2 Gamma of a European call as a function of the spot for two maturities (strike is equal to 100)
Figure 3 Compared evolution of the gamma and vega of an at-the-money European call as a function of maturity (scales are different; horizontal axis: time, from 0 to 10)
is defined as

$$\frac{\partial^2 O}{\partial S^2} + \frac{\partial}{\partial S}\frac{\partial O}{\partial \sigma}\,\frac{\partial \sigma}{\partial S} \qquad (4)$$

The second term, the shadow term, depends on the chosen dynamics of the implied volatility.

The problem with the shadow approach is that we cannot rely anymore on a self-financing strategy in the Black and Scholes framework to define the breakeven. One solution, in order to build a self-financing strategy that incorporates volatility surface into the dynamics, is to use a stochastic volatility model (see Heston Model) instead of a Black and Scholes model. For example, one can use the following model:

$$\begin{aligned} dS &= rS\,dt + \sigma S\,dW_t^1 \\ d\sigma &= \mu\,dt + \nu\,dW_t^2 \\ d\langle W^1, W^2\rangle_t &= \rho\,dt \end{aligned} \qquad (5)$$

Using the same arguments as in the Black and Scholes framework, the P&L of a delta-hedged self-financing portfolio (now with a first-order hedge for the volatility factor using a volatility instrument like a straddle, for example) in this model is

$$\delta\Pi \approx \frac{1}{2} S^2 \frac{\partial^2 O}{\partial S^2}\left[\left(\frac{\delta S}{S}\right)^2 - \sigma^2\,\delta t\right] + \frac{1}{2}\frac{\partial^2 O}{\partial \sigma^2}\left((\delta\sigma)^2 - \nu^2\,\delta t\right) + S\frac{\partial^2 O}{\partial S\,\partial\sigma}\left[\frac{\delta S}{S}\,\delta\sigma - \rho\sigma\nu\,\delta t\right] \qquad (6)$$

Two other “gamma” terms appear in this equation, which proves that incorporating the dynamics of the volatility is not as simple as the addition of a shadow term in the Black and Scholes breakeven relation. It also shows that controlling the P&L leads to a more complex gamma hedge, as it is now necessary to annihilate two more terms (the second and third ones, for which natural hedging instruments are strangles and risk reversals).

Another popular way of integrating the volatility surface dynamics in the model is to use Lévy processes (see Exponential Lévy Models). We do not give the P&L explanation in that case, but, like in the stochastic volatility framework, it is the sum of the term presented in equation (3), for the Brownian part, and of a term coming from the pure jump part. The hedge of the latter is very complex because it is not localized in space (one needs to use a strip of gap options, e.g., to control it).

Finally, a possible way of controlling the volatility surface dynamics is to make no assumption on the volatility except that it is bounded. This framework is known as uncertain volatility modeling and is presented in Uncertain Volatility Model. The analysis leads to the conclusion that instead of one breakeven volatility, there are, in fact, two: the upper bound for positive gamma regions and the lower bound for negative gamma ones. In that case, and supposing that the effective realized volatility stays locally between these two bounds, gamma hedging is not necessary, as the P&L of the delta-hedged self-financing portfolio is naturally systematically positive.

Multiunderlying Derivatives

We consider a multidimensional Black and Scholes model of $N$ stocks $S_i$ with volatility $\sigma_i$. $\rho_{ij}$ represents the correlation between the Brownian motions controlling the evolution of $S_i$ and $S_j$. We do not discuss the issue of multicurrency (see Quanto Options) and, using the same mechanism as in the monounderlying framework, we can express the P&L of a delta-hedged self-financing portfolio as

$$\delta\Pi \approx \frac{1}{2}\sum_{i=1}^{N} S_i^2 \frac{\partial^2 O}{\partial S_i^2}\left[\left(\frac{\delta S_i}{S_i}\right)^2 - \sigma_i^2\,\delta t\right] + \sum_{i<j} S_i S_j \frac{\partial^2 O}{\partial S_i\,\partial S_j}\left[\frac{\delta S_i}{S_i}\frac{\delta S_j}{S_j} - \rho_{ij}\sigma_i\sigma_j\,\delta t\right] \qquad (7)$$

The first term can be controlled by the hedging instruments we have previously reviewed. The cross ones, which incorporate the “cross gammas”, can also be controlled, using so-called correlation swaps (typically, a basket option minus the sum of the individual options).

Conclusion

Controlling the gamma exposure of a position is one of the main concerns of traders. Hedging instruments
CHARLES-HENRI ROUBINET

End Notes

a. P&L stands for “profit and loss” and represents the evolution of the portfolio value between two dates due to time and to the market activity between these dates.
Delta Hedging

arising from the transaction. If the net delta exposure between the financial derivative and the hedge is zero, the position is said to be delta neutral.
derivative, the hedge is considered stable. However, if the market maker needs to borrow the hedge to go short (e.g., short stocks or bonds), then the hedge may be subject to a lack of availability to borrow in the marketplace.

Incorporating the Cost of Hedging into the Price of the Financial Derivative

1. Market makers who seek to hedge a position may incorporate the cost of hedging using adjustments to the financing rates, expected loss owing to basis risk, or future rollover costs of short-term hedges (like futures) into the pricing of the derivative.
2. In addition, the market maker may seek contractual obligations to ensure that if he is unable to hedge the contract, he has a right to unwind the derivative at fair market price.

Examples of Delta Hedging

Below are some examples of delta hedging.

Example 1. Listed put option on S&P 500 index.
Market Maker Derivatives Inc. (MMDI) sells an exchange listed put on the S&P 500 (SPX) Index on a $100mm notional with a strike price of $1350 and one year expiration to a client when the SPX is trading at $1400. The Black–Scholes model calculates the delta of the position at 38.5%. To delta hedge and neutralize the position to P&L swings from changes in the level of the SPX, MMDI needs to sell $38.5mm of SPX Index. MMDI can do one of the following to achieve this:

1. Sell $38.5mm in SPX index futures (110 futures at $1400 with a 250 multiplier). Since these futures expire every three months, the market maker needs to roll this exposure into a new contract every three months.
2. Sell $38.5mm of a one-year swap on the SPX Index with another counterparty or enter into an OTC put/call combo transaction (buying a put and selling a call with the same strike price to replicate a forward using put–call parity) for $38.5mm notional. This is done in the OTC marketplace, and appropriate International Swaps and Derivatives Association, Inc. (ISDA) (http://www.isda.org/) documents and collateral agreements need to be in place before this is done.
3. Sell a $38.5mm basket of stocks that comprises the S&P 500 index, paying a borrowing fee and executing stock orders on the exchange.

Example 2. Long total return swap on XYZ commodity index.
MMDI sold a five-year total return swap on the XYZ commodity index to a client for $10mm notional, which has a delta of $10mm. To hedge the position, MMDI can do one of the following:

1. Buy an equal and offsetting swap with another client or market counterparty, or a basket of commodity swaps for each of the components of the XYZ index from another counterparty, which is the perfect hedge (assuming no counterparty risk).
2. Hedge the XYZ index with a much more actively traded index like the CRB index, taking into account the tracking risk and weighting differences of the components between the two indices.
3. Maintain a portfolio of long futures on the commodities that comprise the XYZ index, rolling the futures positions for the next five years. The market maker assumes the risk that changes in the shape of the commodity forward curve create when rolling the futures positions.

Rehedging Delta with Time, Spot Moves

The delta of a financial product may change with time or the levels of the different market parameters such as volatility, underlying price, interest rates, or skew. It is then necessary to periodically adjust the size of the hedge to maintain the delta at a preset level. The rate of change of the delta is proportional to the gamma (see Gamma Hedging).

Reference

[1] Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, Journal of Political Economy 81(May-June), 637–659.

Related Articles

Hedging; Hedging of Interest Rate Derivatives; Option Pricing: General Principles.

VIJU JOSEPH
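The sizing arithmetic behind the futures hedge in Example 1 of the Delta Hedging article can be sketched as follows. This is an illustration only: the function names are hypothetical, and the inputs ($100mm notional, 38.5% delta, index at $1400, multiplier of 250) are the figures quoted in the example.

```python
def delta_notional(option_notional: float, delta: float) -> float:
    """Dollar amount of underlying exposure to hedge."""
    return option_notional * delta


def futures_contracts(hedge_notional: float, index_level: float, multiplier: float) -> float:
    """Number of futures needed: hedge notional divided by the notional of one contract."""
    return hedge_notional / (index_level * multiplier)


if __name__ == "__main__":
    hedge = delta_notional(100e6, 0.385)                  # about $38.5mm to sell
    contracts = futures_contracts(hedge, 1400.0, 250.0)   # one contract covers $350,000
    print(round(hedge), round(contracts))                 # about 110 contracts
```

In practice the count is rounded to a whole number of contracts and recomputed as the delta changes with time and spot.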
Dispersion Trading

Dispersion trading refers to the practice of selling index variance while buying variance of its constituents at the same time. The reverse strategy (buying index variance while selling constituents variance) can also be employed, but it is not as popular.

To understand dispersion trading, consider the index as a basket of stocks:

$$S_I = \sum_{i=1}^{n} w_i S_i \qquad (1)$$

D = 0 corresponds to the case when there is no dispersion: all correlations are 100%. So, being long dispersion is equivalent to being short correlation and vice versa.

To characterize the correlation between the constituents, one can define the average correlation as if the correlation is the same between every pair of stocks in the basket

$$\bar{\rho} = \frac{\sigma_I^2 - \sum_{i=1}^{n} w_i^2\sigma_i^2}{2\sum_{i=1}^{n}\sum_{j>i} w_i w_j \sigma_i \sigma_j} \qquad (7)$$
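Equation (7) can be evaluated directly from the index volatility, the weights, and the constituent volatilities. The sketch below is illustrative only; it assumes that $\sigma_I$ is the index volatility and that $w_i$ and $\sigma_i$ are the weight and volatility of constituent $i$, and the helper `index_vol_flat_correlation` is a hypothetical function added to check the round trip.

```python
import math
from itertools import combinations


def _own_and_cross(weights, vols):
    """Return sum_i w_i^2 s_i^2 and sum_{i<j} w_i w_j s_i s_j."""
    own = sum((w * s) ** 2 for w, s in zip(weights, vols))
    cross = sum(weights[i] * weights[j] * vols[i] * vols[j]
                for i, j in combinations(range(len(vols)), 2))
    return own, cross


def average_correlation(index_vol, weights, vols):
    """Equation (7): the single correlation that reproduces the index variance."""
    own, cross = _own_and_cross(weights, vols)
    return (index_vol ** 2 - own) / (2.0 * cross)


def index_vol_flat_correlation(rho, weights, vols):
    """Index volatility when every pair of stocks has the same correlation rho."""
    own, cross = _own_and_cross(weights, vols)
    return math.sqrt(own + 2.0 * rho * cross)


if __name__ == "__main__":
    w = [0.5, 0.3, 0.2]      # sample weights (assumed)
    s = [0.20, 0.30, 0.25]   # sample constituent volatilities (assumed)
    sigma_index = index_vol_flat_correlation(0.5, w, s)
    print(average_correlation(sigma_index, w, s))  # recovers a value close to 0.5
```

Feeding implied volatilities gives an implied average correlation, and feeding realized volatilities gives a realized one.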
By definition, realized correlation is the correlation calculated using realized volatilities, and implied correlation is the correlation calculated using implied volatilities. Implied volatilities decide the price of the traded instruments like vanilla options and variance swaps.

The success of dispersion trades relies on the fact that statistically the realized correlation tends to be below the implied correlation. Historically, if one were long dispersion, on average, one made more money than the amount one lost. There are many different reasons for this phenomenon, for example, one may argue that there is more market demand for index volatility than that of the individual stock, which means usually there is more premium for equity stock volatility. More importantly, correlation jumps to a very high level when extreme market conditions exist, namely, global recession and market crash, while it stays low in a normal and uneventful market.

To long the volatility of each component stock, and short the index volatility, one can either trade vanilla options or variance swaps. The variance swaps provide direct exposure to variance without the unnecessary cost and hassle of hedging against daily stock movements.

One issue in dispersion trade is to decide the relative weight for index and constituents variances. There is no single “correct” relative weight to use. For example, vega neutral weights aim to make the sum of constituents vega and index vega zero, so that the trade is hedged against fluctuations in level of volatility. “Premium neutral” weights make the initial premium of buying constituents and selling index cancel each other.

In reality, it is impractical to trade all constituents. Often, a selection of names in the index (or even those not in the index) is used. This is called a proxy basket. One can build the proxy basket by selecting, for example, the names that have the largest weights in the index, or the names that are judged relatively “cheap”, or the names that are most likely to “disperse” against each other, or simply by the stock fundamentals.

Related Articles

Basket Options; Correlation Swap.

YONG REN
Correlation Swap

A correlation swap is a type of exotic derivative security that pays off the observed statistical correlation between the returns of several underlying assets, against a preagreed price. At the time of writing, it is traded over-the-counter (OTC) on equity and foreign exchange derivatives markets. This article focuses on equity correlation swaps, which appeared in the early 2000s, as a means to hedge the parametric risk exposure of exotic trading desks to changes in correlation.

Payoff

Similar to variance swaps, the correlation swap payoff involves a notional (the amount to be paid/received per correlation point^a), a realized correlation component (the formula used to calculate the level of observed statistical correlation between the underlying assets), and a strike price:

Correlation swap payoff = notional × (realized correlation − strike)

Realized Correlation

There are mainly two types of realized correlation formulas currently found on over-the-counter (OTC) markets:

• equally weighted realized correlation: the formula used in the above example;
• weighted realized correlation: $\dfrac{\sum_{1\le i<j\le N} w_i w_j \rho_{i,j}}{\sum_{1\le i<j\le N} w_i w_j}$, where $w_1, \ldots, w_N$ are preagreed positive weights summing to 1. In the above example, one would typically take the “index weights” as of the trade date, that is, the stock quantities that a portfolio manager would invest in to track one unit of the Dow Jones Euro Stoxx 50 index.

Several technical reports have investigated how the above weighted realized correlation (WRC) formula relates to other proxy formulas that are popular in econometrics, when the underlying assets and weights correspond to an equity index. Tierens–Anadu [4] give empirical evidence that in the case of the S&P 500 index,

$$WRC \approx \frac{\sum_{i<j} w_i w_j\,\mathrm{Cov}(X_i, X_j)}{\sum_{i<j} w_i w_j \sqrt{\mathrm{Var}(X_i)\,\mathrm{Var}(X_j)}} \qquad (1)$$
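The two realized correlation formulas and the payoff can be sketched in a few lines. This is an illustration under stated assumptions, not the contract definition: pairwise correlations are taken to be sample Pearson correlations of the return series, the notional is quoted per correlation point of 0.01, and all function names are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Sample Pearson correlation of two equally long return series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def weighted_realized_correlation(returns, weights=None):
    """Sum over i<j of w_i w_j rho_ij, divided by sum over i<j of w_i w_j.

    With weights=None all weights are equal, which gives the equally
    weighted realized correlation (the plain average of pairwise correlations).
    """
    n = len(returns)
    if weights is None:
        weights = [1.0 / n] * n
    num = den = 0.0
    for i, j in combinations(range(n), 2):
        ww = weights[i] * weights[j]
        num += ww * pearson(returns[i], returns[j])
        den += ww
    return num / den


def correlation_swap_payoff(notional_per_point, realized, strike):
    """Payoff with realized and strike given as decimals; one point is 0.01."""
    return notional_per_point * (realized - strike) / 0.01


if __name__ == "__main__":
    x = [0.01, -0.02, 0.015, 0.005]   # made-up return series
    y = [2.0 * v for v in x]          # perfectly correlated with x
    z = [-v for v in x]               # perfectly anticorrelated with x
    print(weighted_realized_correlation([x, y, z]))                   # close to -1/3
    print(weighted_realized_correlation([x, y, z], [0.5, 0.3, 0.2]))
    print(correlation_swap_payoff(1000.0, 0.55, 0.50))                # close to 5000
```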
End Notes

a. In market jargon, a correlation point is equal to 0.01. With this convention, the value of a correlation coefficient is comprised between −100 and +100 correlation points.
b. To imply the value for $\rho_{i,j}$, one needs three option prices: a vanilla option on $S_i$, a vanilla option on $S_j$ and, for

Related Articles

Basket Options; Correlation Risk; Dispersion Trading; Variance Swap.

SÉBASTIEN BOSSU
Stock Pinning

Stock pinning, or simply pinning, is formally the occurrence of a closing stock print, on option expiration day, which exactly matches the denominated value of a strike price. As an example, let stock XYZ have strikes 30, 32.5, 35, and 40. If on Friday, May 16, 2008 at 4:00 pm EDT (USA), the third Friday of the month and thus an expiration day for listed options, the closing print of XYZ is $35, then stock XYZ is said to pin. Prices of $34.27, $31.60, or even $32.48, are said not to have pinned. Figure 1 is a tick price graph of KO (Coca Cola Corporation) showing the last several days prior to a pinning expiration.

As a practical matter, it may be useful experimentally to consider pinning to have occurred if the stock expires within a certain interval of a strike price. There are several reasons for this looser definition. Empirically there may be several “closing prints”
Figure 1 KO (Coca Cola) tick data for a pinning expiration, October 17, 2003 (axes: price versus tick)
Figure 2 All optionable stocks in 2002 divided into quartiles by pinning strength, β. As predicted, the probability of pinning increases with beta. (Pinning criteria $0.15, courtesy Bart Rothwell) (horizontal axis: adjusted open interest, 0.001 × (OI/volatility), bin average shown)
Figure 3 Cumulative distribution function of pinning of stocks, which are within $1 of a strike with 1 week to go to expiration as a function of the parameter, β. (Courtesy Tom MacFarland) (horizontal axis: OI-ratio)
Figure 4 The percentage of days KO closed within $0.15 of a strike in a 10-year period January 1, 1996 to January 1, 2005. 0 is expiration day, negative integers are days prior to expiration, positive integers are days following expiration. (Courtesy Bart Rothwell) (axes: % pinned versus relative trading date from option expiration date)
making a choice of the closing price arbitrary. Tick data shows that a stock may be effectively pinned over the last several minutes before expiration but then have a closing print just off the strike. In the first example, this might happen if the last quote were $34.98 bid, at $35.01, and the closing price stayed in the interval but was not precisely $35.

Two additional reasons for a looser definition are the automatic exercise conditions mandated by the OCC (Options Clearing Corporation) and the consequent pin risk, which attends expiring short positions on the (nearly) pinned strike. The OCC has traditionally fixed an interval about a strike outside of which in-the-money puts and calls would be automatically exercised by the clearing process; options within the interval would require exercise notice by the holder. Over time, the OCC has reduced the interval to the current $0.01 (from $0.05 before June 2008 expiration); traders will declare a stock to have pinned if it falls within the OCC interval. Pin risk attends to any short position inside this interval because an uncertain number of options may be assigned and thus an uncertain postexpiration stock position exists in the positions of those short the expiring at-the-money options.^a
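The looser, interval-based definition lends itself to a simple empirical check. The sketch below is illustrative only: the function names are hypothetical, the tolerance is a free parameter (for instance $0.01 for the OCC interval, or the $0.15 criterion used in some of the figures), and the sample closes are the XYZ prices from the opening example.

```python
def nearest_strike(close, strikes):
    """Strike closest to the closing print."""
    return min(strikes, key=lambda k: abs(close - k))


def is_pinned(close, strikes, tolerance):
    """True if the close lies within `tolerance` dollars of some strike."""
    return abs(close - nearest_strike(close, strikes)) <= tolerance


def pinning_frequency(closes, strikes, tolerance):
    """Fraction of expiration closes that count as pinned."""
    hits = sum(1 for c in closes if is_pinned(c, strikes, tolerance))
    return hits / len(closes)


if __name__ == "__main__":
    strikes = [30.0, 32.5, 35.0, 40.0]
    closes = [35.0, 34.27, 31.60, 32.48]
    print(pinning_frequency(closes, strikes, 0.01))  # only the $35 close counts
    print(pinning_frequency(closes, strikes, 0.15))  # $32.48 now counts as well
```

Repeating the count for non-expiration days gives the baseline against which excess clustering near strikes can be measured.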
Figure 5 All stocks, January 1996 to September 2002 [Reproduced with permission from Stock price clustering on option expiration dates, Ni et al., Journal of Financial Econometrics, Elsevier 2005.] (axes: % versus relative trading date from option expiration date)
Figure 6 Nonoptionable stocks do not pin, January 1996 to September 2002 [Reproduced with permission from Stock price clustering on option expiration dates, Ni et al., Journal of Financial Econometrics, Elsevier 2005.] (axes: % versus relative trading date from option expiration date)
So far we have defined a single instance of pinning. Complementing the notion of an individual instance of stock pinning is an ensemble assertion of stock pinning. In this perspective, a stock, or stocks, is said to pin if, no matter how small an interval one chooses about a strike price, there
Figure 7 All optionable stocks, January 1996 to September 2002. (Pinning criterion $0.125): pinning when professional traders are (a) long and (b) short the expiring at-the-money strike (both panels: horizontal axis is the relative trading date from option expiration date)
is a finite probability of finding closing prints within the interval. To compute this limit, expressed mathematically as

$$\lim_{\varepsilon \to 0} P(|S - K| \le \varepsilon) > 0 \qquad (1)$$

where $\varepsilon$ is an interval about (any) strike $K$, $S$ is the stock price at expiry, and $P$ is the probability among all expiration closes, one can do empirical experiments or theoretical calculations. It is important to note that standard models of option pricing such as Black–Scholes, Heston, SABR, and stochastic volatility models, in general, cannot exhibit pinning mathematically.

Although traders had long believed pinning to be a real phenomenon, little theoretical or experimental effort was made to examine the subject through the 1990s. Krishnan and Nelken [3] looked at the data set of MSFT (Microsoft Corporation) expirations and found evidence of pinning. They proposed a model that combined a Gaussian random walk with a Brownian bridge process in order to force pinning to a strike. The model perforce guaranteed pinning, but suffered from many obvious weaknesses: stocks do not always pin; they can pin at many possible strikes; and the choice of the amount of Brownian bridge component was exogenously (and arbitrarily) imposed.

Then Avellaneda and Lipkin [1], and nearly simultaneously, Ni et al. [4], produced theoretical and experimental arguments for pinning. In the former work, Avellaneda and Lipkin proposed an asymmetric hedging strategy for professional traders: aggressive hedging of long gamma positions and weak hedging of short gamma positions. This hedging strategy coupled with a stock impact function (simplistically assumed to be linear^b) led directly to pinning (with nonzero probability), which depended naturally and endogenously on the option open interest, the intrinsic stock volatility, the (logarithmic) distance to the strike and the time to expiration. In dimensionless form, the strength parameter, β, is proportional to the open interest and inversely proportional to the volatility. Figures 2 and 3 show, experimentally, the monotonic growth of pinning probability with β.

Ni et al. used the IVY and CBOE databases to check pinning frequencies. Figures 4 and 5, typical Ni et al. graphs, indicate the excess clustering of stock prices near a strike on expiration days for KO and for the entire market over extended periods. Figure 6 demonstrates the absence of pinning in nonoptionable stocks. Finally, Ni et al. lent support for the hedging assumptions of Avellaneda and Lipkin. Figure 7(a,b) shows the difference between pinning when professional traders are long (a) and short (b) the expiring at-the-money strike.

Since 2004, other research groups, for example, Jeannin et al. [2], have continued to explore the details of pinning.

End Notes

a. Practitioners define pin risk as the uncertain deltas which an otherwise balanced position might have postexpiration due to the assignment of calls or puts on a near pinning strike. For example, a position long 50 calls and short 50 puts on the $25 strike, for a stock which expires near $25, may be assigned from 0 to 5000 shares of stock due to the uncertain number of puts which may have been exercised. This amount of stock thus assigned is independent of the number of calls the trader chooses to exercise himself.
b. Following the initial 2003 work, Gennady Kasyan (with Avellaneda and Lipkin, unpublished) showed that any impact function stronger than square-root would result in pinning. This suggests that weaker impact functions may be contradicted by the extensive market evidence of stock pinning.

References

[1] Avellaneda, M. & Lipkin, M.D. (2003). A market-induced mechanism for stock pinning, Quantitative Finance 3, 417–425.
[2] Jeannin, M., Iori, G. & Samuel, D. (2008). The pinning effect: theory and a simulated microstructure model, Quantitative Finance 8, 823–831.
[3] Krishnan, H. & Nelken, I. (2001). The effect of stock pinning upon option prices, Risk December.
[4] Ni, S., Pearson, N. & Poteshman, A. (2004). Stock price clustering on option expiration dates, SSRN August 27.

Related Articles

Price Impact.

MIKE LIPKIN
Variance Swap

Definition

A variance swap is a volatility derivative that pays off on realized volatility of some underlying:

$$\text{Payoff} = (\sigma_R^2 - K_{var}) \times N \qquad (1)$$

where $\sigma_R^2$ is the realized variance, $K_{var}$ is the fair value of the realized variance at inception, and $N$ is the notional amount, a leverage factor. The realized variance may be defined differently in different markets depending not only on the “default model” but also on specific contract specifications. One standard way for stocks [9] is to define realized variance as

$$\sigma_R^2 = \frac{252}{n}\sum_{i=1}^{n} u_i^2, \qquad u_i = \ln\frac{S_i}{S_{i-1}} \qquad (2)$$

This requires $n + 1$ observations of the daily closing stock price $S_i$. The factor 252 is the approximate number of business days in a year, which gives an annualized variance. Also note that this formula assumes the mean of the returns to be zero. This means that it is distinct from the statistical variance of the returns. The mean of returns is typically small and setting it to zero makes returns on variance swaps additive over time. Using log returns to measure variance makes this formula compatible with the standard Black–Scholes option pricing formula.

The long position in a variance swap receives $N$ dollars for every point by which the stock’s realized variance $\sigma_R^2$ has exceeded the inception fair value $K_{var}$. See [6] for one of the earliest, but most comprehensive, references on variance swaps. In this reference, the authors discuss replication problems due to strike spacing and gapping of the underlying, for example.

It is market practice to define the variance notional in volatility terms:

$$\text{Variance Notional} = \frac{\text{Vega Notional}}{2 \times \sigma_{strike}} \qquad (3)$$

where the strike volatility $\sigma_{strike}$ is equal to the square root of the strike variance $K_{var}$. With this adjustment, if the realized volatility is 1 percentage point above $\sigma_{strike}$ at maturity, the payoff is approximately equal to the vega notional.

Uses of variance swaps include trading the level of volatility (variance swaps are a more pure way to do this compared to straddles), trading the realized/implied vol spread, hedging of volatility exposures, or trading volatility on a forward basis via forward variance swaps. See [1] for more details about variance swaps, market practices, and their use in vol spread trading and correlation trading.

Fair Value and Skew

In [6], it is shown that a variance swap can be statically replicated with calls, puts, and a forward contract. The payoff for the variance swap then comes from delta hedging of this portfolio of options with the underlying. The fair value is determined by the cost of the replicating portfolio of options. The payoff in terms of realized volatility is achieved by delta hedging with the underlying. If the realized vol is exactly equal to the expected vol at inception, the hedging profits will be exactly equal to the cost of the option portfolio and the payout will be zero. As the portfolio of options (in theory) includes option of every strike, the fair value cost is affected significantly by the skew. The skew is defined to be the way the implied volatility changes as the strike changes, all else being equal.

The fair value can be derived using the volatility formula discussed in [4]. In this reference, the authors show the remarkable formula that the expected value of any smooth payoff function $f(S_T)$ in terms of the terminal stock price $S_T$ can be written in terms of stock and option prices.

$$E_t[f(S_T)] = f(\kappa)B_0 + f'(\kappa)\left[C_t(\kappa) - P_t(\kappa)\right] + \int_{-\infty}^{\kappa} f''(s)P_t(s)\,ds + \int_{\kappa}^{\infty} f''(s)C_t(s)\,ds \qquad (4)$$

An application of Ito’s lemma to the usual Black diffusion yields the variance differential:

$$2\left(\frac{dS_t}{S_t} - d(\log S_t)\right) = \sigma^2\,dt \qquad (5)$$

Applying equation (4) to the log contract [11] in equation (5) shows that the replicating portfolio
$$\sigma(K) = \sigma_{ATM} - b\,\frac{K - S_F}{S_F} \qquad (7)$$

the fair value is

$$K_{var} = \sigma_{ATM}^2\left(1 + 3Tb^2 + \cdots\right) \qquad (8)$$

For the skew that is linear in the delta (here $\Delta_p$ is the put delta),

$$\sigma(\Delta_p) = \sigma_{ATM} + b\left(\Delta_p + \frac{1}{2}\right) \qquad (9)$$

the fair value is

$$K_{var} = \sigma_{ATM}^2\left(1 + \frac{1}{\sqrt{\pi}}\,b\sqrt{T} + \frac{1}{12}\frac{b^2}{\sigma_{ATM}^2} + \cdots\right) \qquad (10)$$

In [7], we have a particularly elegant formula for the fair value given directly in terms of the skew. Denote

$$z(k) = d_2 = -\frac{k}{\sigma_{BS}(k)\sqrt{T}} + \frac{\sigma_{BS}(k)\sqrt{T}}{2} \qquad (11)$$

Intuitively, $z$ measures the log-moneyness of an option in implied standard deviations. Then

$$K_{var} = \int_{-\infty}^{\infty} dz\; N'(z)\,\sigma_{BS}^2(z)\,T \qquad (12)$$

$$+\,(1 - \lambda)(K_t - K_0)\big], \qquad \lambda = (t - t_0)/(T - t_0) \qquad (14)$$

The first piece is just the time-weighted value of realized variance against the strike. The second piece is the time-weighted difference of fair value variance strikes. The same formula can be used to decompose the daily price change into three risk components: gamma, vega, and theta:

$$\Delta M(t) = M(t + \Delta t) - M(t) = N\Big[\underbrace{\frac{1}{\tau}V(t, t + \Delta t)\,\Delta t}_{\text{Gamma}} + \underbrace{(1 - \lambda_{t+\Delta t})\cdot\Delta K}_{\text{Vega}} - \underbrace{\frac{1}{\tau}K_t\,\Delta t}_{\text{Theta}}\Big] \qquad (15)$$

The biggest independent risk is in the fair value of the strike, which we have put in the vega component. This component also entails the skew risk.

Volatility Swaps

Although variance swaps can be statically replicated, volatility swaps (see Volatility Swaps) cannot. In [3], the authors show that there is an approximate dynamic replicating strategy for volatility
Variance Swap 3
swaps. Before this, volatility swap valuation had been [7] Gatheral, J. (2006). The Volatility Surface: A Practi-
thought to be highly model dependent [2]. See also tioner’s Guide, Wiley Finance.
[8, 10], where the authors give a closed formula for [8] Haug, E.G. (2007). Option Pricing Formulas, 2nd Edi-
tion, McGraw Hill.
valuing vol swaps using a GARCH model. [9] Hull, J. (2003). Options, Futures, and other Derivatives,
5th Edition, Prentice Hall.
References [10] Javaheri, A., Wilmott, P. & Haug, E.G. (2002). Garch
and Volatility Swaps, published on Wilmott.com.
[11] Neuberger, A. (1994). The log contract, The Journal of
[1] Bossu, S., Strasser, E. & Guichard, R. (2005). Just What Portfolio Management 20, 74–80.
You Need to Know About Variance Swaps, JP Morgan
Equity Derivatives Research Publication.
[2] Brockhaus, O. & Long, D. (2000). Volatility Swaps Made Related Articles
Simple, RISK Magazine, pp. 92–95.
[3] Carr, P. & Lee, R. (2008). Robust replication of volatility
derivatives, Mathematics in Finance Working Paper Correlation Swap; Corridor Variance Swap;
#2008-3. Courant Institute of Mathematical Sciences. Gamma Swap; Realized Volatility and Multipower
[4] Carr, P. & Madan, D. (1998). Towards a theory of Variation; Realized Volatility Options; Volati-
volatility trading, in Volatility ed., R.A. Jarrow, Risk lity; Volatility Index Options; Volatility Swaps;
Books. Weighted Variance Swap.
[5] Chriss, N. & Morokoff, W. (1999). Market Risk for
Volatility and Variance Swaps, Risk Magazine. ERIC LIVERANCE
[6] Demeterfi, K., Derman, E., Kamal, M. & Zou, J. (1999).
A guide to volatility and variance swaps, Journal of
Derivatives 6, 9–32.
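The gamma, vega, and theta split in equation (15) can be checked numerically. The sketch below assumes the mark-to-market has the form M(t) = N[λ(V(t0, t) − K0) + (1 − λ)(Kt − K0)], of which only the tail is visible above, and that realized variance is a time average so that it accrues linearly in time. The function names and all numbers are hypothetical.

```python
def mark_to_market(notional, lam, realized_var, fair_strike, strike0):
    """Assumed form: M = N * [lam*(V(t0,t) - K0) + (1 - lam)*(K_t - K0)]."""
    return notional * (lam * (realized_var - strike0)
                       + (1.0 - lam) * (fair_strike - strike0))


def daily_pnl_components(notional, tau, dt, lam_next, day_var, k_now, k_next):
    """Gamma, vega and theta pieces of the one-day change, as in equation (15)."""
    gamma = notional * day_var * dt / tau               # variance realized today
    vega = notional * (1.0 - lam_next) * (k_next - k_now)  # change in fair strike
    theta = -notional * k_now * dt / tau                # one day of strike decays away
    return gamma, vega, theta


# Hypothetical one-year swap, half a year in, marked one business day apart.
N, tau, elapsed, dt = 10_000.0, 1.0, 0.5, 1.0 / 252.0
K0, k_now, k_next = 0.0441, 0.0500, 0.0520   # variance strikes (volatility squared)
v_so_far, day_var = 0.0400, 0.0900           # annualized realized variances

lam_now = elapsed / tau
lam_next = (elapsed + dt) / tau
# Realized variance is a time average, so it accrues linearly in time.
v_next = (v_so_far * elapsed + day_var * dt) / (elapsed + dt)

m_now = mark_to_market(N, lam_now, v_so_far, k_now, K0)
m_next = mark_to_market(N, lam_next, v_next, k_next, K0)
gamma, vega, theta = daily_pnl_components(N, tau, dt, lam_next, day_var, k_now, k_next)

# Difference between the direct revaluation and the three-way split.
pnl_gap = (m_next - m_now) - (gamma + vega + theta)
```

Under these assumptions `pnl_gap` is zero up to floating-point rounding, so the three components account for the whole daily change.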
Volatility Swaps

Volatility swaps are very similar to variance swaps, both in concept and in application. Like variance swaps, volatility swaps can be used by hedge funds to speculate on volatility movements or by portfolio managers to hedge other products against volatility fluctuations. Since their introduction in 1998, they have seen rapid growth and there is currently a sizable market in both the equity and foreign exchange markets. Volatility swaps are also traded in interest rate and commodity markets.

Technically speaking, a volatility swap is not really a swap, but a forward contract on the realized volatility. At maturity, the buyer receives from the seller the difference between the realized volatility and the fixed strike amount, multiplied by the dollar notional (quoted in dollars per volatility point):

    \text{Volatility swap} = \text{Notional} \times (\sigma_{realized} - K)   (1)

while

    \text{Variance swap} = \text{Notional} \times (\sigma_{realized}^2 - K^2)   (2)

The fixed strike payment is usually referred to as the fixed leg of the swap and the realized volatility is referred to as the floating leg. The swap contract contains the detailed specifications on how the volatility is calculated. Typically, the floating leg is calculated as

    \sqrt{ \text{Annualization} \times \sum_{i=1}^{N} \left( \ln \frac{S_i}{S_{i-1}} \right)^2 }   (3)

where S_i should be adjusted by discrete dividends across dividend payment days. In the most accepted convention, the floating leg is reset and computed daily.

Besides variance swaps, another product closely related to volatility swaps is the VIX contract, which is written as the square root of the sum of expected future variances.

Because variance is the square of volatility, the payoff of a variance swap is convex in volatility while the payoff of a volatility swap is linear in volatility. A volatility swap is thus cheaper than the corresponding variance swap. More specifically, the fair strike for the volatility swap is slightly lower than that of a variance swap. The difference between the two is known as the convexity adjustment and gets larger as volatility of volatility gets larger. The convexity adjustment can be calculated, for example, in the Heston model.

The variance swap is preferred in the equity market because it can be replicated with a linear combination of vanilla options and a dynamic position in futures (see Variance Swap; [2]). In other markets, the volatility swap is actually more liquid than the variance swap. Although a position in a variance swap can be replicated, a position in a volatility swap cannot. This means that different models that correctly calibrate to the vanilla option surface will give the same price for variance swaps but not for volatility swaps. In other words, the price of a variance swap is model independent, but the price of a volatility swap is not. In practice, the volatility swap and variance swap admit almost equal pricing for short-term maturities. Recent research also suggests that the model dependence is not as large as is commonly believed and that volatility swaps can also be approximately replicated by trading vanilla options (see Volatility Index Options; Realized Volatility Options; [1]). Newer models have been developed that can price volatility derivatives, including volatility swaps, while remaining consistent with the entire volatility surface [3].

References

[1] Carr, P. & Lee, R. (2007). Realised volatility and variance: options via swaps, Risk 20(5), 76–83.
[2] Demeterfi, K., Derman, E., Kamal, M. & Zou, J. (1999). More than You Ever Wanted to Know About Volatility Swaps, Goldman Sachs Quantitative Strategies Research Notes.
[3] Ren, Y., Madan, D. & Qian, M. (2007). Calibrating and pricing with embedded local volatility models, Risk 20(9), 138–143.

Related Articles

Corridor Variance Swap; Realized Volatility Options; Variance Swap; Volatility Index Options; Weighted Variance Swap.

YONG REN
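As a rough illustration of the floating leg in equation (3) and the two payoffs in equations (1) and (2), the sketch below computes realized volatility from a price series. It assumes the annualization factor is 252/N, whereas the contract specifies the actual factor, and it ignores dividend adjustments. The `convexity_adjustment` helper is a hypothetical toy: it weights a handful of volatility scenarios equally instead of using a pricing model such as Heston.

```python
import math


def realized_volatility(prices, periods_per_year=252.0):
    """Square root of the annualized sum of squared log returns (equation (3)).

    Assumes the annualization factor is periods_per_year / N.
    """
    n = len(prices) - 1
    if n < 1:
        raise ValueError("need at least two prices")
    sum_sq = sum(math.log(prices[i] / prices[i - 1]) ** 2 for i in range(1, n + 1))
    return math.sqrt(periods_per_year / n * sum_sq)


def volatility_swap_payoff(notional, realized_vol, strike):
    """Equation (1): linear in realized volatility."""
    return notional * (realized_vol - strike)


def variance_swap_payoff(notional, realized_vol, strike):
    """Equation (2): convex in realized volatility."""
    return notional * (realized_vol ** 2 - strike ** 2)


def convexity_adjustment(vol_scenarios):
    """Toy gap between sqrt(mean variance) and mean volatility.

    Equal scenario weights are an illustrative assumption, not a pricing model.
    """
    n = len(vol_scenarios)
    mean_var = sum(v * v for v in vol_scenarios) / n
    mean_vol = sum(vol_scenarios) / n
    return math.sqrt(mean_var) - mean_vol


# Usage: a series with a constant 1% log return per period.
prices = [100.0 * math.exp(0.01 * i) for i in range(5)]
rv = realized_volatility(prices)
```

By Jensen's inequality the adjustment is never negative, and it widens as the scenarios spread out, which matches the statement that the gap grows with the volatility of volatility.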
Static Hedging

Liquid traded put and call options can be used as hedge instruments for over-the-counter traded products. Barrier options are the most common exotic options, and, for these contracts, static hedging works out particularly well. In the Black–Scholes model there are simple methods (conceptually straightforward and/or closed form) for constructing replicating portfolios that do not require dynamic trading; they are set up at initiation of the barrier option, and liquidated at either knockout or expiry. They are, thus, static hedges. Inspired by Allen and Padovani [1], we describe how to find static hedges for barrier options in the Black–Scholes model in a way that encompasses both Derman’s [6] intuitive calendar-spread algorithm and Carr’s [4] strike-spread hedges stemming from put–call symmetry.

Construction of Static Hedges

Unless we explicitly say otherwise, we consider a Black–Scholes model throughout this article. This means that the interest rate is constant, and all options are written on some underlying asset S that follows a geometric Brownian motion. A zero-rebate, knockout barrier option is a contract that pays off as a plain vanilla option if S stays within a specified barrier for the whole life of the barrier option, but becomes worthless if the barrier is hit or crossed (see also Barrier Options). Recurrent examples are the down-and-out call and the up-and-out call. The value of a still-alive barrier option is of the form F(S_t, t), where the function F solves the Black–Scholes partial differential equation with 0 as boundary condition along the barrier (see Finite Difference Methods for Barrier Options). This is illustrated in Figure 1, which is useful to keep in mind when the method for constructing static hedges is described in the following.

Figure 1 The PDEs for (a) down-and-out and (b) up-and-out call options. [Each panel plots the spot x against time t ∈ [0, T]: F solves the Black–Scholes PDE while the option is alive, F(B, t) = 0 along the barrier B, and F(x, T) = (x − K)+ at expiry.]

Construction of Static Hedges

A portfolio of puts and calls that (approximately) replicates the barrier option can be found as the solution to a linear system of equations, and constructing it does not require knowledge or implementation of barrier option valuation formulas. The idea is to match the barrier option’s value at expiry and along the barrier.

To illustrate, consider a down-and-out call with strike K, expiry T, and barrier B. (An up-and-out call is treated similarly, except that strike-above-the-barrier calls are used as hedge instruments.) Let Put, Call (spot, time | strike, expiry) denote put and call values. Suppose that we have specified a grid of time points 0 = t_0 < t_1 < . . . < t_n = T, and n puts with strikes K_j ≤ B and expiries T_j ≤ T. Find the solution α to

    A\alpha + u = 0   (1)

where A is an n × n matrix with entries A_{i,j} = Put(B, t_i | K_j, T_j) and u is an n-vector with entries Call(B, t_i | K, T). A portfolio with the (K, T)-call and α_j in the (K_j, T_j)-put then matches the barrier option’s zero value at the t_i-points along the barrier, and its expiry payoff above the barrier. So the barrier option is, to a good approximation when the match points t_i are close, replicated by buying this portfolio at time 0 and selling it either when the barrier option is knocked out (because sample paths are continuous, this can only happen if the barrier is actually hit) or when it expires. In other words, this represents a static hedge. There is freedom of choice regarding strikes and expiries of the hedge instruments. Derman [6] suggests calendar-spread hedging with strikes along the barrier, that is, using T_j = t_{j−1} and K_j = B. This makes the A-matrix triangular so that we can solve for the α_j’s in one easy-to-explain backward-working pass. Another choice, closely related to Carr’s work [4], is to use strike spreads, that is, T_j = T for all j and K_j’s that are different and below the barrier.

Example 1. Table 1 gives a numerical comparison of the performance of different hedge portfolios for three-month barrier options, a typical lifetime of a barrier option in foreign exchange markets. Looking at the results in Table 1 for the down-and-out call, we see the appeal of using options as hedge instruments: very few options are needed in the static hedges to achieve a hedge quality that is several orders of magnitude better than usual dynamic Δ-hedging. The numbers for the up-and-out call demonstrate one problem that static hedging does not immediately solve: the up-and-out call is a reverse or live-out option, meaning that the underlying call is in the money when the barrier option knocks out. This discontinuity creates a large gap risk, and hedge quality deteriorates. To alleviate this, a number of regularization techniques have been suggested, for instance [10] using singular value decomposition when solving equation (1).

Table 1 Performance of dynamic and static hedge strategies in the Black–Scholes model with 15% volatility and zero interest rate and dividends. The columns show the initial price of the hedge portfolio and the standard deviation of the benchmarked discounted hedge error, that is, the value of the hedge portfolio at liquidation minus the barrier option payoff, relative to the initial value of the barrier option. All static hedges use three options besides the (K, T)-call. The time points for value matching, the t_i’s, and the expiries for the calendar spreads are from the list (0, 1/12, 2/12, 3/12). The strike spreads use calls with strikes (110, 112, 114) for the up-and-out case, and puts with strikes (90.25 = B^2/K, 88.25, 86.25) for the down-and-out case. The Δ-hedge is adjusted daily and all portfolios are continuously monitored

                              Down-and-out call                  Up-and-out call
Hedge method                  Cost     Standard deviation (%)    Cost     Standard deviation (%)
Dynamic; Δ                    2.6964   11                        1.0358   81
Static; strike spreads        2.6964   0                         1.0674   19
Static; calendar spreads      2.6704   1.0                       1.3468   94

Beyond Black–Scholes Dynamics

dimension is needed to match different volatility levels at knockout. By using both strike and calendar spreads, asymptotically perfect static hedges can be found in these two cases. It should be stressed that the static hedges are model and parameter dependent, but experimental and empirical evidence [7, 9] suggests a high degree of robustness to model risk.

model, we have

    Call(S_t, t | K, T) = (K / S_t) \times Put(S_t, t | S_t^2 / K, T)   for all S_t, t, K, and T   (2)

So a down-and-out call is replicated by buying one strike-K call, selling K/B puts with strike B^2/K, liquidating this position the first time that S_t = B, and if that does not happen, holding it until the options expire. More general symmetry relations enable one to find static hedges for such contracts as up-and-out calls, barrier options with rebates, lookback options, and double barrier options ([11] is a survey). Those static hedges will typically involve a continuum of plain vanilla options. Put–call symmetries also exist in models with nonzero interest rates and dividends, and more general dynamics than geometric Brownian motion (see [5]). Note that the strike-spread approach from the previous section finds the symmetry-based static hedges without explicit knowledge of closed-form results, and that the perfect replication of the down-and-out call in Table 1, where the strike-B^2/K put is included as a hedge instrument, demonstrates the basic put–call symmetry.

References

[3] Bowie, J. & Carr, P. (1994). Static simplicity, Risk Magazine 7(8), 44–50.
[4] Carr, P., Ellis, K. & Gupta, V. (1998). Static hedging of exotic options, Journal of Finance 53, 1165–1190.
[5] Carr, P. & Lee, R. (2008). Put-call symmetry: extensions and applications, Mathematical Finance forthcoming.
[6] Derman, E., Ergener, D. & Kani, I. (1995). Static options replication, Journal of Derivatives 2, 78–95.
[7] Engelmann, B., Fengler, M., Nalholm, M. & Schwendner, P. (2007). Static versus dynamic hedges: an empirical comparison for barrier options, Review of Derivatives Research 9, 239–264.
[8] Fink, J. (2003). An examination of the effectiveness of static hedging in the presence of stochastic volatility, Journal of Futures Markets 23, 859–890.
[9] Nalholm, M. & Poulsen, R. (2006). Static hedging and model risk for barrier options, Journal of Futures Markets 26, 449–463.
[10] Nalholm, M. & Poulsen, R. (2006). Static hedging of barrier options under general asset dynamics: unification and application, Journal of Derivatives 13, 46–60.
[11] Poulsen, R. (2006). Barrier options and their static hedges: simple derivations and extensions, Quantitative Finance 6, 327–335.

Related Articles

Barrier Options; Finite Difference Methods for Barrier Options; Hedging; Put–Call Parity;
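The linear system in equation (1) is small enough to solve directly. The sketch below sets it up for the strike-spread hedge of a down-and-out call, using the Table 1 setting of 15% volatility, zero rates, three-month expiry, and put strikes (90.25, 88.25, 86.25). The values K = 100 and B = 95 are inferred from 90.25 = B^2/K and are an assumption, as are all function names. The solver is plain Gaussian elimination, not the regularized approach of [10].

```python
import math


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_price(spot, strike, tau, vol, is_call):
    """Black-Scholes value with zero interest rate and dividends."""
    if tau <= 0.0:
        return max(spot - strike, 0.0) if is_call else max(strike - spot, 0.0)
    sd = vol * math.sqrt(tau)
    d1 = math.log(spot / strike) / sd + 0.5 * sd
    d2 = d1 - sd
    if is_call:
        return spot * norm_cdf(d1) - strike * norm_cdf(d2)
    return strike * norm_cdf(-d2) - spot * norm_cdf(-d1)


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def static_hedge_weights(barrier, strike, expiry, vol,
                         match_times, put_strikes, put_expiries):
    """Solve A alpha + u = 0 with A[i][j] = Put(B, t_i | K_j, T_j), u[i] = Call(B, t_i | K, T)."""
    a = [[bs_price(barrier, kj, tj - ti, vol, False)
          for kj, tj in zip(put_strikes, put_expiries)]
         for ti in match_times]
    u = [bs_price(barrier, strike, expiry - ti, vol, True) for ti in match_times]
    return solve_linear(a, [-ui for ui in u])


# Assumed contract: K = 100, B = 95, three months, 15% volatility.
K, B, T, VOL = 100.0, 95.0, 0.25, 0.15
alpha = static_hedge_weights(B, K, T, VOL,
                             match_times=[0.0, 1.0 / 12.0, 2.0 / 12.0],
                             put_strikes=[B * B / K, 88.25, 86.25],
                             put_expiries=[T, T, T])
```

In this setting the solver puts a weight of −K/B on the strike-B^2/K put and essentially zero on the other two, which is the put–call symmetry hedge from equation (2) recovered without using the closed-form result.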
a standard variance swap holds options at all strikes K ∈ (0, ∞). In practice, not all of those strikes actually trade. If we truncate the portfolio to hold only the strikes in some interval C, then the resulting value does not price a full variance swap but rather a C-corridor variance swap. (Moreover, in practice not even an interval of strikes actually trades, but rather a finite set, which can replicate instead a strike-to-strike notion of corridor variance, as shown in [1].)

3. In the case C = (H, ∞), where H > 0, we rewrite equation (4) as

    \lambda(y) = \frac{2}{H} (y - H)^+ - 2 (\log y - \log H)^+   (7)

Thus, the replicating portfolio is long calls on Y_T and short calls on log Y_T.

Let F_{X_T} be the characteristic function of X_T = log Y_T. Then techniques in [4, 5] price the calls on Y_T and log Y_T, respectively. Specifically, assuming zero interest rates and dividends, we have the following semiexplicit formula for the corridor variance swap’s fair strike:

    Ɛ\lambda(Y_T) - \lambda(Y_0)
      = \frac{2}{H\pi} \int_{0-\alpha i}^{\infty-\alpha i} \mathrm{Re}\left[ \frac{F_{X_T}(z - i)}{iz - z^2} e^{-iz \log H} \right] dz
      + \frac{2}{\pi} \int_{0-\beta i}^{\infty-\beta i} \mathrm{Re}\left[ \frac{F_{X_T}(z)}{z^2} e^{-iz \log H} \right] dz - \lambda(Y_0)   (8)

for arbitrary positive α, β such that α + 1, β < sup{p : ƐY_T^p < ∞}, where Ɛ denotes expectation with respect to martingale measure.

In the case C = (0, ∞), equation (4) implies the fair strike formula

    Ɛ\lambda(Y_T) - \lambda(Y_0) = -2 Ɛ \log(Y_T / Y_0) = 2i F'_{X_T}(0) + 2 \log Y_0   (9)

In the case C = (H_1, H_2), where 0 ≤ H_1 < H_2, subtract the formula for C = (H_2, ∞) from the formula for C = (H_1, ∞).

In the case of nonzero interest rates or dividends, add to equation (8) a correction involving payoffs at all expiries in (0, T), as specified in equation (7a) in Weighted Variance Swap, and in equation (9) replace Y_0 by the forward price.

4. With discrete monitoring, the question arises of how to define up-variance and down-variance, and in particular how much variance to recognize, given a discrete move that takes Y across H. Definition (2) recognizes the full square of each move that ends in the corridor. Alternatively, the contract specifications in [2] treat the movements of Y across H by recognizing a fraction of the squared move. The fraction is defined in a way that admits approximate discrete hedging, in the sense that the time-discretized implementation of the continuous replication strategy has in each period a hedging error of only third order in that period’s return.

References

[1] Carr, P. & Lee, R. (2008). From Hyper Options to Variance Swaps, Bloomberg LP, University of Chicago.
[2] Carr, P. & Lewis, K. (2004). Corridor variance swaps, Risk 17(2), 67–72.
[3] Carr, P. & Madan, D. (1998). Towards a theory of volatility trading, in Volatility, R. Jarrow, ed., Risk Publications, pp. 417–427.
[4] Carr, P. & Madan, D. (1999). Option valuation using the fast Fourier transform, Journal of Computational Finance 3, 463–520.
[5] Lee, R. (2004). Option pricing by transform methods: extensions, unification, and error control, Journal of Computational Finance 7(3), 51–86.

Related Articles

Delta Hedging; Gamma Swap; Realized Volatility Options; Variance Swap; Volatility Swaps; Weighted Variance Swap.

ROGER LEE
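A quick sanity check on equations (7) and (9): for driftless geometric Brownian motion with zero rates, the full-range fair strike in equation (9) should reduce to the total variance σ²T. The sketch below assumes that model purely for illustration and approximates F′(0) by a central difference; the function names and the step size are hypothetical choices.

```python
import cmath
import math


def corridor_lambda(y, h):
    """Payoff function of equation (7) for the corridor C = (H, infinity)."""
    return (2.0 / h) * max(y - h, 0.0) - 2.0 * max(math.log(y) - math.log(h), 0.0)


def bs_char_fn(z, y0, vol, t):
    """Characteristic function of log Y_T for driftless geometric Brownian motion."""
    mean = math.log(y0) - 0.5 * vol * vol * t
    return cmath.exp(1j * z * mean - 0.5 * z * z * vol * vol * t)


def full_range_fair_strike(char_fn, y0, step=1e-5):
    """Equation (9): 2 i F'(0) + 2 log Y0, with F'(0) by central difference."""
    deriv = (char_fn(step) - char_fn(-step)) / (2.0 * step)
    return (2j * deriv + 2.0 * math.log(y0)).real


# Usage: 20% volatility over half a year gives total variance 0.04 * 0.5 = 0.02.
strike = full_range_fair_strike(lambda z: bs_char_fn(z, 100.0, 0.2, 0.5), 100.0)
```

The same `full_range_fair_strike` helper accepts any characteristic function of log Y_T, so other models for which F is known in closed form could be plugged in the same way.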
Gamma Swap

A gamma swap on an underlying Y is a weighted variance swap (see Weighted Variance Swap) on log Y, with weight function

    w(y) := y / Y_0   (1)

In practice, the gamma swap monitors Y discretely, typically daily, for some number of periods N, annualizes by a factor such as 252/N, and multiplies by notional, for a total payoff

and dynamic trading of shares, under conditions specified in Weighted Variance Swap, which include all positive continuous semimartingale share prices Y under deterministic interest rates and proportional dividends.

Explicitly, one replicates by using equation (7) of the above article, with

    \lambda(y) = \frac{2}{Y_0} \left[ y \log(y/\kappa) - y + \kappa \right] = \int_0^{\infty} \frac{2}{Y_0 K} \mathrm{Van}(y, K) \, dK   (3)

as noted in [2]. Hence, a static combination of gamma swaps produces cumulative index-weighted dispersion.

2. By Corollary 2.7 in [1], if the implied volatility smile is symmetric in log-moneyness, and the dividend yield equals the interest rate (q_t = r_t), and there are no discrete dividends, then a gamma swap has the same value as a variance swap.

3. Assuming that Y_T = Y_t R_{t,T} for all t, where the time-t conditional distribution of each R_{t,T} does not depend on Y_t, the gamma swap has time-t gamma equal to a discounting/dividend-dependent factor times

    \frac{2}{Y_0} \left. \frac{\partial^2}{\partial y^2} \right|_{y=Y_t} Ɛ_t \left[ y R_{t,T} \log(y R_{t,T}) \right] = \frac{2 Ɛ_t R_{t,T}}{Y_0 Y_t}   (6)

where Ɛ denotes expectation with respect to martingale measure. Therefore, the share gamma, defined to be Y_t times the gamma, does not depend on Y_t. This property motivates the term gamma swap.

4. Within the family of weight functions proportional to w(y) = y^n, the gamma swap takes n = 1. In that sense, the gamma swap is intermediate between the usual logarithmic variance swap (which takes n = 0) and an arithmetic variance swap (which, in effect, takes n = 2). Expressed in terms of put and call holdings, the replicating portfolios in these three cases hold, at each strike K, a quantity proportional to K^{n−2}. The gamma swap O(1/K) is intermediate between logarithmic variance O(1/K^2) and arithmetic variance O(1).

5. Let F be the characteristic function of log Y_T. If ƐY_T^p < ∞ for some p > 1 then

    Ɛ Y_T \log Y_T = -i F'(-i)   (7)

Gamma swap valuations are therefore directly computable in continuous models for which F is known, such as the Heston model (see Heston Model).

References

[1] Carr, P. & Lee, R. Put–call symmetry: extensions and applications, Mathematical Finance, Forthcoming.
[2] Mougeot, N. (2005). Variance Swaps and Beyond, BNP Paribas.
[3] Overhaus, M., Bermúdez, A., Buehler, H., Ferraris, A., Jordinson, C. & Lamnouar, A. (2007). Equity Hybrid Derivatives, John Wiley & Sons.

Related Articles

Corridor Variance Swap; Delta Hedging; Realized Volatility Options; Variance Swap; Volatility Swaps; Weighted Variance Swap.

ROGER LEE
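The two forms of λ(y) in equation (3) of the Gamma Swap article can be compared numerically. The sketch below assumes that Van(y, K) denotes the out-of-the-money vanilla payoff relative to κ, that is, a put payoff for strikes below κ and a call payoff for strikes at or above κ. That definition comes from the Weighted Variance Swap article and is not restated here, so it is an assumption, as are the function names. The integral is evaluated with a simple midpoint rule.

```python
import math


def vanilla_otm_payoff(y, strike, kappa):
    """Assumed meaning of Van(y, K): put payoff below kappa, call payoff above."""
    if strike < kappa:
        return max(strike - y, 0.0)
    return max(y - strike, 0.0)


def gamma_lambda(y, y0, kappa):
    """Closed form in equation (3)."""
    return (2.0 / y0) * (y * math.log(y / kappa) - y + kappa)


def gamma_lambda_by_options(y, y0, kappa, n_steps=20000):
    """Midpoint-rule integral of 2 / (Y0 K) * Van(y, K) dK.

    The integrand vanishes outside the strikes between kappa and y,
    so only that interval is integrated.
    """
    lo, hi = min(y, kappa), max(y, kappa)
    width = (hi - lo) / n_steps
    total = 0.0
    for i in range(n_steps):
        k = lo + (i + 0.5) * width
        total += 2.0 / (y0 * k) * vanilla_otm_payoff(y, k, kappa) * width
    return total


# Usage: compare both forms for terminal prices below, at, and above kappa.
gaps = [abs(gamma_lambda(y, 100.0, 100.0) - gamma_lambda_by_options(y, 100.0, 100.0))
        for y in (70.0, 100.0, 130.0)]
```

The 1/K weighting inside the integral is the O(1/K) option holding mentioned in item 4, which sits between the 1/K^2 weights of the logarithmic variance swap and the flat weights of the arithmetic one.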
Atlas Option

In the late 1990s, Societe Generale introduced a series of options on baskets of assets which are now commonly referred to as “Mountain Range” options [2]. They were introduced in part to replicate certain portfolio strategies and in part to extend single-name options to portfolios. What these options share is a strong dependence on the correlation structure of the assets, brought about by their nonlinear and path-dependent payoffs. But beyond this similarity, each type has its own distinct payoff tailored to its own risk profile and usage, making each deserving of a study of its own. In a series of three articles, we look at the three commonly traded types of Mountain Range options: the Atlas option, the Himalayan option, and the Altiplano option.

We start with the Atlas option, which, being the only non-path-dependent option in this group, is somewhat easier to analyze than the other two. This article is organized as follows. We first provide a description of the Atlas option and discuss the financial motivations for and strategies of its usage. We then discuss modeling, valuation, and risk issues that Atlas options share with all Mountain Range options, and conclude with a brief analysis of the risk profile that is unique to it. We remark that although the following discussion holds for a wide class of assets, such as foreign exchange (FX) and commodities, these options are traded mostly on baskets of stocks.

Contract Description

The payoff of the Atlas option is simply a call (or a put) option on the performance of a portfolio at maturity with the best and worst performing names removed. More precisely, given a portfolio, or basket, of n stocks, let S_i(t) be the price of stock i = 1, . . . , n at time t, with 0 being the start of the option and T its maturity. Furthermore, assume that the indices are such that the performances S_1(T)/S_1(0), . . . , S_n(T)/S_n(0) are in increasing order. Given a strike K, the number of underperforming assets w and outperforming ones b to be removed, the payoff of the Atlas option is

    \max\left( \frac{1}{n - (w + b)} \sum_{i=1+w}^{n-b} \frac{S_i(T)}{S_i(0)} - K, \; 0 \right)   (1)

with the obvious condition that b + w < n.

If b < w, with b possibly 0, or if b > w with w possibly 0, then this option becomes the best of or worst of option, respectively. On the other hand, if b = w, with an equal number of underperforming and outperforming stocks removed, this option becomes, in effect, a “middle of the road” or an “average of averages” option. By removing the outliers, we are removing extreme risk and lowering the premium, while making it more favorable to risk-averse customers. For example, it can provide protection against defaults for the price of missing out on top performers.

Modeling

In the simplest of implementations, as in single-name options, the asset price processes are modeled as lognormal processes but with a correlation matrix. In more advanced implementations, to account for the volatility smile, some versions of stochastic volatility models are often used. One may even model some form of default component. However, in these more complex models, the modeling of correlation and its estimation become more complex as well.

Valuation and Risk

The number of assets in Mountain Range options generally ranges from a low of 4 or 5 to a high of about 20. Owing to their complex payoff and path-dependency, idiosyncratic characteristics of each asset need to be taken into account. Hence one cannot assume homogeneity of assets for either small or large baskets, making any closed-form approximation (especially in light of path dependence) intractable. Consequently, Mountain Range options, even the non-path-dependent Atlas options, are calculated using Monte Carlo simulation [1]. Monte Carlo methods, especially for high-dimensional payoffs with a large number of assets, are slow to converge, and usually one or more variance-reduction techniques are employed. This problem is exacerbated further when calculating first and second order Greeks.

But even in the simple lognormal model, the sheer size of the correlation matrix can become a challenge. Since for n assets there can be n(n − 1)/2 distinct correlations, even for a modest basket of 10 assets, 45 different correlations are possible. Moreover, it is not clear how one can obtain the correlation numbers themselves. If, theoretically speaking, there existed n(n − 1)/2 traded spread options on each pair, their implied correlations could be used with the spread options as hedges. However, it is unlikely that every pair of assets in a basket would have a traded spread option. Even if they did, their sheer number would make transaction costs prohibitive, even for moderate bid–ask spreads. Hence historical correlations are more often used, even though as with all historical estimates, they are hard to hedge and can change with macro- and microeconomic shifts. When all assets belong to the same sector, a single correlation number is commonly used. This high amount of asset interdependence makes cross-gammas (see Gamma Hedging) important, adding further to the hedging complexity.

the option payoff. Since it behaves as a single asset, as in single-asset calls, the option’s payoff generally increases with volatility.

For low correlations, on the other hand, the basket has a high dispersion at maturity (on average, a few stocks will have a high price and the rest low), and the higher the volatility, the higher the dispersion. Since the expectation of the sum of asset prices at maturity is independent of both volatility and correlation, having a few high asset prices implies that many others are very low in most paths. But it is precisely these high-contribution assets that are removed from the basket, leaving the basket with low-priced assets, thus reducing the price of the option. So for low correlation, with b ≠ 0, increasing volatility does not necessarily increase the price. We remind the reader that this simple analysis applies to a homogeneous basket. Individual volatilities, dividends, and a nonconstant correlation can affect the payoff in ways not always easily explained.
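The call payoff in equation (1) of the Contract Description translates into a few lines of code. The sketch below sorts the performances S_i(T)/S_i(0), drops the w worst and b best, and applies the call payoff to the average of the rest. The function and argument names are hypothetical, and only the call version is shown.

```python
def atlas_call_payoff(initial_prices, final_prices, strike, n_worst, n_best):
    """Equation (1): call on the average performance after removing outliers.

    n_worst is w (underperformers removed) and n_best is b (outperformers removed).
    """
    n = len(initial_prices)
    if len(final_prices) != n:
        raise ValueError("price lists must have the same length")
    if n_worst < 0 or n_best < 0 or n_worst + n_best >= n:
        raise ValueError("need 0 <= w, 0 <= b and w + b < n")
    # Sort performances in increasing order, as the indexing in (1) assumes.
    performance = sorted(f / s for s, f in zip(initial_prices, final_prices))
    kept = performance[n_worst:n - n_best]
    return max(sum(kept) / len(kept) - strike, 0.0)


# Usage: five stocks starting at 100, strike at 100% performance.
start = [100.0] * 5
end = [50.0, 90.0, 100.0, 120.0, 200.0]
plain_basket = atlas_call_payoff(start, end, 1.0, 0, 0)   # nothing removed
atlas = atlas_call_payoff(start, end, 1.0, 1, 1)          # drop worst and best
```

In this example the plain basket call pays 0.12 and the Atlas version pays about 0.033, because removing the 200 outlier costs more than removing the 50 outlier gains.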
References
and can change with macro- and microeconomic shifts. When all assets belong to the same sector, a single correlation number is commonly used. This high amount of asset interdependence makes cross-gammas important, adding further to the hedging complexity.

Risk Profile

Payoffs for Himalayan options can be surprising. For simplicity, we look at a homogeneous portfolio with identical pairwise correlations given by a single number in the simple lognormal model. We look at the effects of removing the best performing assets compared to removing assets in a predetermined order. For high correlations, since basket assets move together, the effect of “greediness” is small: removing the best performing assets leaves the portfolio with similarly performing assets. So compared to removal by some predetermined order, its effects are small. Increasing the volatility increases the dispersion, and as we next see for the case of low-correlation baskets, it may adversely impact the value of the option.

Assets with low correlation, on the other hand, become more dispersed as time passes. Since the expected sum of assets at termination of the option is independent of correlation, removing the best performing asset would in most scenarios leave the worse performing assets, and since no asset is given a chance to grow (it would get removed in the next cut), we are left with increasingly worse performing assets. Thus greediness would actually be a bad strategy compared to some predetermined asset-removal order. Increasing volatility increases this dispersion, further reducing the payoff of the Himalayan. We remind the reader that this simple analysis applies to a homogeneous basket. Individual volatilities, dividends, and a nonconstant correlation can affect the payoff in ways not always easily explained.

References

[1] Glasserman, P. (2004). Monte Carlo Methods in Financial Engineering, Springer, New York.
[2] Mountain Range Options, document downloadable from global-derivatives.com.

Related Articles

Altiplano Option; Atlas Option; Basket Options; Correlation Risk.

REZA K. GHARAVI
Altiplano Option

In the late 1990s, Societe Generale introduced a series of options on baskets of assets that are now commonly referred to as Mountain Range options [2]. They were introduced in part to replicate certain portfolio strategies and in part to extend single-name options to portfolios. What these options share is a strong dependence on the correlation structure of the assets, brought about by their nonlinear and path-dependent payoffs. But beyond this similarity, each type has its own distinct payoff tailored to its own risk profile and use, making each deserving of its own study. In this last of three articles on Mountain Range options, we look at the Altiplano option, which can be thought of as an extension of barrier options to baskets.

This article is organized as follows. We first provide a description of the Altiplano option and discuss the financial motivations for and strategies of its usage. We then discuss modeling, valuation, and risk issues that Altiplano options share with all Mountain Range options, and conclude with a brief analysis of the risk profile that is unique to it. We remark that although the following discussion holds for a wide class of assets, such as foreign exchange (FX) and commodities, these options are traded mostly on baskets of stocks.

use any option. As in single-name barriers, there are many possible wrinkles for the barrier event, such as the Parisian type (see Parisian Option). But unlike a simple extension from the single-name case where the barrier is triggered by the sum of the portfolio, individual assets can trigger the barriers by themselves. In the example above, all it takes is for one asset to activate a barrier, independently of the level of other assets at the time. This makes the Altiplano sensitive to individual asset moves, rather than the collective sum.

As in single-name barriers, since I_B is always less than 1, the payoff, and thus the risk, are lower than for standard options on baskets, which makes their premiums lower as well (assuming C is small or zero). The lower premium makes it more attractive when used as a hedge, for example.

Modeling

In the simplest of implementations, as in single-name options, the asset price processes are modeled as lognormal processes but with a correlation matrix. In more advanced implementations, to account for the volatility smile, some versions of stochastic volatility models are often used. One may even include some form of default component. However, in these more complex models, the modeling of correlation and its estimation become more complex as well.
continuous payoffs, making first- and second-order Greeks calculations even more noisy. This makes use of variance-reduction methods even more critical.

The other challenge posed by these options is the correlation. Even in the simple lognormal model, the sheer size of the correlation matrix can become a challenge. Since for n assets, there can be n(n − 1)/2 distinct correlations, even for a modest basket of 10 assets, 45 distinct correlations are possible. Moreover, it is not clear how one can obtain the pairwise correlations themselves. If, theoretically speaking, there existed n(n − 1)/2 traded spread options on each pair, their implied correlations could be used with the spread options as hedges. However, it is unlikely that every pair of assets in a basket would have a traded spread option. Even if they did, their sheer number would make transaction costs prohibitive, even for moderate bid–ask spreads. Hence historical correlations are more often used, even though as with all historical estimates, they are hard to hedge and can change with macro- and microeconomic shifts. When all assets belong to the same sector, a single correlation number is commonly used. This high amount of asset interdependence makes cross-gammas important, adding further to the hedging complexity.

Risk Profile

As in single-asset barrier options, the payoff of an Altiplano option is determined by two competing terms: the option term and the barrier term. Again, for ease of analysis, we look at a homogeneous portfolio with identical pairwise correlations given by a single number in the simple lognormal model with no coupon payout. For a simple call option on a basket, it is known that high correlation and volatility increase its price. For the barrier event, it depends on the type. If it takes only one asset to hit a barrier, then low correlation and high volatility increase the probability of hitting it. Moreover, depending on the barrier, the paths that lead to higher option prices may make the barrier event more or less likely, leading to possible nonmonotonic behavior. So even in this simple homogeneous case with a single correlation, the behavior is rather complex.

References

[1] Glasserman, P. (2004). Monte Carlo Methods in Financial Engineering, Springer, New York.
[2] Mountain Range Options, document downloadable from global-derivatives.com.

Related Articles

Atlas Option; Basket Options; Correlation Risk; Himalayan Option.

REZA K. GHARAVI
Constant Proportion Portfolio Insurance

Portfolio insurance is a dynamic management technique that aims at giving the investor the ability to limit the downside risk while allowing some participation in the upside market. Option-based portfolio insurance combines a position in the risky asset with a put option on the asset to achieve this goal. In many cases, options on a portfolio may not be available. Constant proportion portfolio insurance (CPPI) is an alternative to that approach. CPPI was first introduced by Perold [4] for fixed-income instruments and by Black and Jones [2] for equity instruments.

CPPI utilizes a rule-based strategy to allocate assets dynamically over time. It involves maintaining a dynamic mix of a “riskless” asset (usually treasury bills or liquid money market instruments) and the risky asset, usually a market index. In the case of having more than one risky asset, an index is formed, which would be treated as a single risky asset. The weights in the index do not change during the life of the trade.

The strategy is based on the notion of cushion, which is the difference between the current portfolio value and the guaranteed level, called bond floor. Obviously, the initial floor $F_0$ is less than the initial portfolio value $V_0$. Define $C_t$ to be the time-$t$ value of the cushion, that is

$$C_t = V_t - F_t \tag{1}$$

The final payoff at maturity is the maximum of these two quantities: (i) the value of the portfolio at maturity and (ii) the guaranteed level. In a nutshell, CPPI is a path-dependent, self-financing capital guarantee structured product that has final payoff linked to the performance of a pool of assets.

Throughout the existence of the contract, an amount of wealth is invested into the risky asset. This amount, called exposure, is proportional to the cushion and is calculated by multiplying the cushion by a predetermined multiple $m$:

$$e_t = m C_t \tag{2}$$

The remainder of wealth is allocated into the riskless asset. Trivially, the higher the multiple, the more the holder will participate in rising markets and the faster the portfolio value approaches the bond floor in downturn markets.

Both the bond floor and multiple are specified in the contract and indicate the investor's appetite for risk. Assuming a Black–Scholes framework for the risky asset $S_t$

$$dS_t = \mu S_t\,dt + \sigma S_t\,dW_t \tag{3}$$

and assuming the floor evolves as follows:

$$dF_t = r F_t\,dt \tag{4}$$

the value of the portfolio at time $t$, as shown in [5] and [3], is

$$V_t(S_t, m) = F_0 \exp(rt) + \alpha_t S_t^m \tag{5}$$

where $\alpha_t = \dfrac{C_0}{S_0^m}\exp(\beta t)$ and $\beta$ is given by

$$\beta = r - m\left(r - \tfrac{1}{2}\sigma^2\right) - m^2\,\frac{\sigma^2}{2} \tag{6}$$

In periods of negative performance, a specified amount of the risky asset, according to a predetermined asset allocation formula, is liquidated and used to purchase riskless assets. On the other hand, when the market goes up, a specified amount of riskless assets, according to the formula, is liquidated and proceeds are used to purchase the risky asset. The provider undertakes the risk of managing the pool of assets (both risk-free and risky assets).

For risk managing of the pool in CPPI, the provider receives an annual fee. The fee is specified in the contract in one of the following ways: (i) as a fixed percentage of the initial notional per annum, (ii) as a fixed percentage of the value of the portfolio per annum (path dependent), and (iii) as a fixed percentage of the value of the equity held in the pool per annum (path dependent). The first two are more common than the third one.

The provider also receives the potential value of dividends on the equity. This amount is also path dependent.

If there is more than one risky asset in the pool, they would be treated like a basket in a basket option, so that in the event of rebalancing the collection of risky assets is treated like a single underlying.

Rebalancing Procedure

The asset allocation formula is a part of the contract. The terms in the formula are negotiated between the
provider and the counterparty before entering into the transaction.

A feature is built into the allocation formula in order to avoid constant rebalancing. Rebalancing occurs where the difference between theoretical equity exposure from the formula and the actual equity allocation is greater than a predefined number of percentage points specified in the contract.

In the case that it is triggered, the provider sells/purchases an amount of equity and purchases/sells risk-free assets to make the actual ratio equal to the number generated by the formula. The timetable for rebalancing is up to the investor, with monthly or quarterly rebalancing being often cited.

Gap Risk

The risk that the value of the portfolio is less than or equal to the bond floor is called the gap risk.

If there is no drastic jump in the value of the risky asset for the life of the trade, then there is no need for injection of money for rebalancing of the pool. Therefore, the downfall is lower than the gap risk and the value of the pool would be above the bond floor. In that case, hedging the CPPI trade would be risk free.

There are cases in which the value of the pool may go under the bond floor. Either the provider cannot liquidate the risky asset due to illiquidity or the value of the equity asset has dropped so much that proceeds are not sufficient to maintain the value of the portfolio above the floor: the market simply drops by more than the gap risk before a rebalancing can be undertaken. In either case, the provider must make up the shortfalls. Whenever the portfolio value reaches a given floor, the investor receives a given amount. At this point, the entire pool comprises 100% exposure to the riskless asset. The gap risk is presented as basis points per annum.

Modeling Gap Risk

[...] options and (ii) fat-tailed behaviors observed in stock returns distributions.

In [1], the authors apply extreme value theory to determine the multiple. A quantile hedging approach is introduced, which provides an upper bound on the multiple. This bound is statistically estimated from the behavior of extreme variations in rates of asset returns. The authors also introduce the distributions of interarrival times of these extreme movements and show their impacts on CPPI.

In [5] the authors analyze the cost of the guarantee and the performance of a portfolio based on such a strategy. They provide two extensions. One is based on Lévy processes that allow jumps into the dynamics of the underlying asset. Second, they deal with insurance against all hitting times of a modified floor.

In [3], Cont and Tankov study the behavior of CPPI strategies in models where the price of the underlying portfolio may experience downward jumps. That allows them to quantify the gap risk while maintaining the analytical tractability of the continuous-time framework. With respect to the work done in [5], in [3] the authors consider various risk measures for the loss and provide an analytical method to compute them.

CPPI techniques have also been applied to credit portfolios (see Credit Portfolio Insurance).

References

[1] Bertrand, P. & Prigent, J.-L. (2002). Portfolio insurance: the extreme value approach to the CPPI method, Finance 23(2), 69–86.
[2] Black, F. & Jones, R. (1987). Simplifying portfolio insurance, Journal of Portfolio Management 14(1), 48–51.
[3] Cont, R. & Tankov, P. (2009). Constant proportion portfolio insurance in presence of jumps in asset prices, Mathematical Finance 19(3), 379–401.
[4] Perold, A.R. (1986). Constant Proportion Portfolio Insurance, Harvard Business School, Working Paper.
[5] Prigent, J.-L. & Tahar, F. (2005). CPPI with Cushion Insurance, University of Cergy-Pontoise, working paper.
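The CPPI quantities defined in this article, namely the cushion (1), the exposure (2), and the closed-form portfolio value (5) with the exponent (6), can be sketched in a few lines of Python. This is only an illustration under the continuous-rebalancing Black–Scholes assumptions stated in the article; the function names and the numerical inputs are hypothetical and not taken from any contract or library.

```python
import math


def cppi_exposure(v_t, f_t, m):
    """Exposure e_t = m * C_t, where the cushion is C_t = V_t - F_t."""
    return m * (v_t - f_t)


def cppi_beta(r, sigma, m):
    """Exponent beta = r - m (r - sigma^2 / 2) - m^2 sigma^2 / 2, equation (6)."""
    return r - m * (r - 0.5 * sigma ** 2) - 0.5 * m ** 2 * sigma ** 2


def cppi_value(t, s_t, s0, v0, f0, r, sigma, m):
    """Portfolio value V_t = F_0 exp(rt) + alpha_t S_t^m, equation (5).

    alpha_t = C_0 / S_0^m * exp(beta t), with initial cushion C_0 = V_0 - F_0.
    """
    c0 = v0 - f0
    alpha_t = c0 / s0 ** m * math.exp(cppi_beta(r, sigma, m) * t)
    return f0 * math.exp(r * t) + alpha_t * s_t ** m


# Hypothetical inputs: initial value 100, floor 90, multiple 4.
initial_exposure = cppi_exposure(100.0, 90.0, 4)  # 4 * (100 - 90)
value_at_start = cppi_value(0.0, 100.0, 100.0, 100.0, 90.0, 0.03, 0.2, 4)
```

At t = 0 the closed form returns the initial portfolio value, and for m = 1 the exponent β vanishes, so the cushion simply tracks the risky asset. Both make quick sanity checks on the formula.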
These are termed equity collateralized obligations (ECO).

EDS as Credit–Equity Hybrids

Since a large fall in the equity price is needed to trigger the payout of an EDS, there are often accompanying credit implications to the firm, leading many to refer to EDS as a credit–equity hybrid instrument, despite the default event being defined exclusively in terms of the share price.

There is empirical evidence that EDS should not be considered simply as an equity derivative. de Servigny and Jobst [8] assess the relative weighting of debt and equity factors for equity default probabilities. They find that for the typical definition of equity default as 30% of the initial share price, debt factors are more important than equity factors. In addition, Jobst and de Servigny [5] study EDS correlation and EDS–CDS correlation and find that multivariate analyses commonly used for credit are the most appropriate.

EDS Pricing

Consider the problem of finding the fair spread for an EDS that is to be initiated now at time $t = 0$, with the current stock price as $S_0$. The default payment in an EDS is made the first time that the share price trades at the prespecified level $L$ where $L < S_0$. In addition to this, the swap payments are made contingent on the share price not being traded at or below the prespecified level. Thus, to price an EDS, the probability distribution of the first passage time of the level $L$ is required. Here, the first passage time of $L$ is defined as

$$\tau_L = \inf\{t > 0;\ S_t \le L\} \tag{2}$$

Then, if an EDS with $1 notional requires a periodic payment of $C$ at times $T_1, \ldots, T_n$, the price of the EDS for a protection buyer is

$$\mathrm{EDS}(S_0; C, R, L) = -C \sum_{i=1}^{n} D_{T_i}\, P(\tau_L > T_i \mid S_0) + (1 - R)\, E\!\left[D_{\tau_L} 1_{\{\tau_L \le T_n\}} \mid S_0\right] \tag{3}$$

where $R$ is the recovery rate, $D_t$ is the discount factor to time $t$, and $P$ and $E$ are probabilities and expectations under the risk-neutral measure. Then, the fair spread for an EDS is the value of $C$ that makes $\mathrm{EDS}(S_0; C, R, L) = 0$.

The problem of deriving the first passage time distribution has been solved in several special cases. Albanese and Chen [1] compute the EDS spread under the assumption that the stock price follows a constant elasticity of variance (CEV) process (see Constant Elasticity of Variance (CEV) Diffusion Model) and Asmussen et al. [2] use the Wiener–Hopf factorization (see Wiener–Hopf Decomposition) to compute the EDS spread under the assumption that the stock price follows a Carr–Geman–Madan–Yor (CGMY) Lévy process (see Tempered Stable Process). For models where credit considerations are more explicitly addressed, EDS spreads have been priced using different methods: Albanese and Chen [1] use a credit barrier model with a credit to equity mapping; Campi et al. [3] extend the CEV case to include jump to default; Medova and Smith [6] use a structural model of credit risk (see Structural Default Risk Models) where the firm’s asset value follows a geometric Brownian motion; and Sepp [7] makes use of an extended structural model where the firm’s asset value can have stochastic volatility (see Heston Model) or be a double exponential jump diffusion, with the default barrier being deterministic or stochastic.

References

[1] Albanese, C. & Chen, O. (2005). Pricing equity default swaps, Risk 18, 83–87.
[2] Asmussen, S., Madan, D. & Pistorius, M.R. (2008). Pricing equity default swaps under an approximation to the CGMY Lévy Model, Journal of Computational Finance 11, 79–93.
[3] Campi, L., Polbennikov, S. & Sbuelz, A. (2009). Systematic equity-based credit risk: a CEV model with jump to default, Journal of Economic Dynamics and Control 33, 93–108.
[4] Gil-Bazo, J. (2006). The value of the ‘swap’ feature in equity default swaps, Quantitative Finance 6, 67–74.
[5] Jobst, N. & de Servigny, A. (2006). An empirical analysis of equity default swaps (II): multivariate insights, Risk 19, 97–103.
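The EDS pricing identity (3) is linear in the spread C, so the fair spread follows by simple division once the first-passage quantities are known. The Python sketch below assumes those quantities, the discounted survival terms D_{T_i} P(τ_L > T_i | S_0) and the discounted default expectation E[D_{τ_L} 1{τ_L ≤ T_n} | S_0], have already been produced by some model. The function names and the numbers are hypothetical and carry no market meaning.

```python
def eds_value(spread, recovery, discounted_survival, discounted_default):
    """Value to the protection buyer, following equation (3).

    discounted_survival: list of D_{T_i} * P(tau_L > T_i | S_0), one per payment date.
    discounted_default:  E[D_{tau_L} 1{tau_L <= T_n} | S_0].
    """
    premium_leg = spread * sum(discounted_survival)
    protection_leg = (1.0 - recovery) * discounted_default
    return protection_leg - premium_leg


def eds_fair_spread(recovery, discounted_survival, discounted_default):
    """Spread C that sets the EDS value in equation (3) to zero."""
    return (1.0 - recovery) * discounted_default / sum(discounted_survival)


# Hypothetical model outputs for three payment dates.
survival_terms = [0.95, 0.88, 0.80]
default_term = 0.15
fair_c = eds_fair_spread(0.5, survival_terms, default_term)
```

Plugging `fair_c` back into `eds_value` returns zero up to rounding, which is the defining property of the fair spread.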
In conclusion, a number of financial entities are necessary to create and maintain ETFs, indexed products that trade like stocks on major bourses. Furthermore, ETFs differ from mutual funds in that they trade throughout the day, can be shorted, and generally carry lower fees. Finally, the close to 2000 ETFs trading today cover a plethora of different investment categories, offering investors an inexpensive means to construct well-diversified portfolios.

References

[1] Amery, P. (2008). European ETF Secondary Market Dealing Spreads, Index Universe, Retrieved on July 25, 2008 from http://www.indexuniverse.com/sections/features/12/4294-european-etf-secondary-market-dealing-spreads.html
[2] Kinnel, R. (2007). Fund Fees are Coming Down, Morningstar, Retrieved on July 25, 2008 from http://ibd.morningstar.com/article/article.asp?CN=aol828&id=194298
[3] Marquardt, K. (2008). Surprise: ETF Fees are Going Up, U.S. News, Retrieved on July 25, 2008 from http://www.usnews.com/blogs/new-money/2008/7/9/surprise-etf-fees-are-going-up.html
[4] McKeever, C. (2007). A Cost Comparison - The Real Cost of Mutual Funds v ETF’s, Chance Favors, Retrieved on July 28, 2008 from http://chancefavors.com/2007/10/cost-comparison-mutual-funds-vs-etfs/
[5] McWhinney, J. (2005). An Inside Look at ETF Construction, Investopedia, Retrieved on July 25, 2008 from http://www.investopedia.com/articles/mutualfund/05/062705.asp

MICHAEL J. TARI
Volume-weighted Average Price (VWAP)

The volume-weighted average price (VWAP) and its close cousin, the time-weighted average price (TWAP), are commonly used measures of the average price of a security over a period of time. VWAP and TWAP are used by traders and other investment professionals as reference prices, an indication of the average transaction price over an interval of time. So, for example, if the TWAP of a security is $10 on a given day and a trader had bought a sizeable block of shares at $9.50, we might conclude that the trader had added value in that he or she obtained a better price than a naïve program that mechanically sends out orders in the market at a steady rate throughout the day.

Mathematical Definition

More formally, the VWAP of a security over a specified trading horizon (e.g., from market open to close) is defined as the ratio of the total transaction value in that security (i.e., the sum, over all trades in the specified horizon, of the product of each trade’s share volume and the corresponding price) to the total volume of shares traded (i.e., the sum of all shares traded in the trading horizon). While the trading horizon is typically a trading day, intraday or multiday VWAP measures are also computed. A related concept is the TWAP, defined as the average price over a particular time interval with no explicit volume weighting. Traders use TWAP over VWAP for securities where the temporal pattern of volume exhibits considerable variation, for example, in less-active securities.

Formally, given $N$ trades in the relevant interval, let $S_1, \ldots, S_N$ be the shares transacted with corresponding prices $P_1, \ldots, P_N$. Then, we have

$$\mathrm{VWAP} = \frac{\sum_{i=1}^{N} P_i S_i}{\sum_{i=1}^{N} S_i} \tag{1}$$

$$\mathrm{TWAP} = \frac{\sum_{i=1}^{N} P_i}{N} \tag{2}$$

Subtleties in the computation of VWAP/TWAP include (i) the choice of volume definition (e.g., primary market volume or composite volume), (ii) the treatment of certain trades (e.g., block trades that might be negotiated off market), and (iii) the decision whether to include volumes at the open and close of the market.

Uses

VWAP is commonly used as an approximation to the price that could be realized by a trader who passively participates in trading activity. As such, the performance of traders can be measured by their ability to execute orders at prices better than the VWAP benchmark prevailing over the trading horizon.

The computational simplicity of the VWAP is a major factor in its popularity in measuring trade execution, especially in markets where detailed trade-level data is difficult or expensive to obtain. VWAP can be misleading as a benchmark in certain situations where the trader’s objective is to control the slippage from a given strike or decision price, or where the strategy is not passive. In such cases, for example, if the trader has short-term alpha, the mechanical application of a VWAP strategy (i.e., trading in parallel to historical volume patterns) can lead to significant opportunity costs in terms of slippage. VWAP is not appropriate when the trader’s executions are large relative to market volumes. In this case, VWAP might conceal a large price impact because the trader’s own trades constitute the bulk of the reported volume. Finally, if traders have discretion over whether to execute or not, the VWAP benchmark can be gamed by selectively timing executions.

An important application is to so-called VWAP strategies, typically algorithmic trading strategies that automatically break up an order and send trades to the market to match the historical volume pattern or profile (see, e.g., [1]) of a security. See, for example, [2] for a discussion of the uses of VWAP in trading strategies and algorithms. The goal of a VWAP strategy is to obtain an execution price close to the VWAP for the day. Some brokers also guarantee VWAP execution, essentially taking on the execution risk for a fee.
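The two definitions in equations (1) and (2) translate directly into code. The Python sketch below is an illustration only: the function names and the three sample trades are hypothetical, and it deliberately ignores the subtleties of volume definition and trade filtering that the article mentions.

```python
def vwap(prices, shares):
    """Volume-weighted average price: sum(P_i * S_i) / sum(S_i), equation (1)."""
    if not prices or len(prices) != len(shares):
        raise ValueError("prices and shares must be non-empty and of equal length")
    total_volume = sum(shares)
    if total_volume <= 0:
        raise ValueError("total traded volume must be positive")
    return sum(p * s for p, s in zip(prices, shares)) / total_volume


def twap(prices):
    """Average trade price with no volume weighting: sum(P_i) / N, equation (2)."""
    if not prices:
        raise ValueError("prices must be non-empty")
    return sum(prices) / len(prices)


# Hypothetical trades: most of the volume prints at the lowest price.
trade_prices = [10.0, 10.2, 9.8]
trade_shares = [100, 300, 600]
day_vwap = vwap(trade_prices, trade_shares)   # (1000 + 3060 + 5880) / 1000
day_twap = twap(trade_prices)                 # 30 / 3
```

In this sample the VWAP falls below the TWAP because the largest trade occurs at the lowest price, which shows how the two benchmarks diverge when volume is unevenly distributed.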
References

[1] Hobson, D. (2006). VWAP and volume profiles, Journal of Trading, 1(2), Spring, 38–42.
[2] Madhavan, A. (2002). VWAP Strategies, in Transaction Performance: The Changing Face of Trading, Handbook Series in Finance, B. Bruce, ed, Institutional Investor Inc.

Related Articles

Automated Trading; Execution Costs; Price Impact.

ANANTH N. MADHAVAN
Equity Swaps

on the return of a stock or an equity index over the period $[T_{i-1}, T_i]$.

usually a LIBOR rate. Let $L(T_i, T_{i+1})$ denote the simple spot rate over the period $[T_i, T_{i+1}]$.

Definition 3 A generic equity-for-floating-rate swap
An equity-for-floating-rate swap, with tenor structure T, which is written on the equity $Z$ will give rise to the following payments between the counterparties A and B at each payment date $T_i$:

• A pays to B the amount: $N R(T_{i-1}, T_i)$.
• B pays to A the amount: $N \delta_i (L(T_{i-1}, T_i) + s)$.

where $s$ is a constant rate such that the initial value of the swap at time $T_0$ equals zero.

An equity-for-floating swap can be decomposed into an equity-for-fixed swap and a suitably chosen interest rate swap (see LIBOR Rate and [2]).

Let $R_1$ and $R_2$ denote the return of assets $Z_1$ and $Z_2$, respectively.

Definition 4 A generic equity-for-equity swap
An equity-for-equity swap, with tenor structure T, which is written on the equities $Z_1$ and $Z_2$, will give rise to the following payments between the counterparties A and B at each payment date $T_i$:

where $s$ is the constant rate such that the initial value of the swap at time $T_0$ equals zero.

The equity-for-equity swap is also referred to as a two-way equity swap. The simplest contract of this type is a domestic equity-for-equity swap where both returns are based on domestic indices or assets.

So far, we have only considered domestic equity indices and assets. However, all of the three equity swaps mentioned above have versions where one or both cash flows are based on a foreign equity return or interest rate. They are so-called cross-currency swaps.

To illustrate a cross-currency equity swap, suppose that the United States is the domestic market. Let the notional principal be expressed in US dollars. Let $Z_1$ be a foreign equity index such as, for instance, the NIKKEI, while $Z_2$ is a domestic equity index such as the S&P 500. The period return $R_1$ is based on a foreign equity index, while the nominal amount is in domestic units. There is a currency mismatch in the cash flow that A pays (but none in the cash flow that B pays). This type of contract is referred to as a quanto swap (see also Quanto Options). Quanto swaps are more complicated to price than other swaps. Quanto contracts have been considered in [3, 4] and [7].

From a pricing and hedging perspective, the simplest cross-currency swaps are the ones that are currency adjusted. Consider a cross-currency equity-for-equity swap with currency-adjusted returns. Let $Z_1$ be a foreign equity, while $Z_2$ is a domestic equity. Let $X(t)$ denote the exchange rate expressed as the number of domestic currency units per foreign currency unit. Then the currency-adjusted period return over the interval $[T_i, T_{i+1}]$ for the asset $Z_1$ is

$$R_1(T_i, T_{i+1}) = \frac{X(T_{i+1})\, Z_1(T_{i+1})}{X(T_i)\, Z_1(T_i)} - 1 \tag{2}$$

While the unit of $Z_1(t)$ is foreign currency, the unit of $Z_1(t)X(t)$ is domestic currency. Regarding the underlying index as the foreign asset times the exchange rate, $R_1$ can be treated as the return on a domestic index. A cross-currency equity-for-equity swap that is currency adjusted is, from a valuation point of view, equal to a domestic equity-for-equity swap.

Some equity swaps are constructed with a variable notional principal. A variable notional principal changes over time according to changes in the referenced equity index.

Consider an equity-for-fixed-rate swap. It can essentially be regarded as a leveraged position in the underlying equity. If the notional principal is constant, the realized returns from the equity index are withdrawn in each period, resulting in a position that is rebalanced periodically. If the notional principal is variable, the realized returns in each period are reinvested.

Let $N_i$ denote the variable notional principal, which determines the size of the payments at time $T_i$ for $i = 1, 2, \ldots, M$. Let $N_1 = 1$ and $N_i = Z(T_{i-1})/Z(T_0)$ for $i = 2, 3, \ldots, M$. Thus, for instance, at the third payment date $T_3$

• A pays to B the amount: $\dfrac{Z(T_2)}{Z(T_0)}\, R(T_2, T_3)$.
• B pays to A the amount: $\dfrac{Z(T_2)}{Z(T_0)}\, \delta_3 K$.
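The variable-notional payments described above can be tabulated with a short Python sketch. It assumes the simple period return R(T_{i-1}, T_i) = Z(T_i)/Z(T_{i-1}) − 1 and takes the fixed rate K and the accrual fractions δ_i as given. The function names and the index levels are hypothetical.

```python
def period_return(z_prev, z_curr):
    """Simple equity return over one period: Z(T_i) / Z(T_{i-1}) - 1 (assumed form)."""
    return z_curr / z_prev - 1.0


def variable_notional_payments(z, deltas, fixed_rate):
    """Payments of an equity-for-fixed swap with variable notional.

    z:      index levels Z(T_0), ..., Z(T_M)
    deltas: accrual fractions delta_1, ..., delta_M
    Returns a list of (equity_leg, fixed_leg) tuples for payment dates T_1..T_M,
    using N_1 = 1 and N_i = Z(T_{i-1}) / Z(T_0) for i >= 2.
    """
    if len(z) != len(deltas) + 1:
        raise ValueError("need one more index level than accrual fractions")
    payments = []
    for i in range(1, len(z)):
        notional = z[i - 1] / z[0]  # equals 1 at the first payment date
        equity_leg = notional * period_return(z[i - 1], z[i])  # A pays to B
        fixed_leg = notional * deltas[i - 1] * fixed_rate      # B pays to A
        payments.append((equity_leg, fixed_leg))
    return payments


# Hypothetical index path rising 10% per period, annual accruals, K = 5%.
index_levels = [100.0, 110.0, 121.0, 133.1]
legs = variable_notional_payments(index_levels, [1.0, 1.0, 1.0], 0.05)
```

With this path the third-date notional is Z(T_2)/Z(T_0) = 1.21, so the equity leg is 1.21 × 10% and the fixed leg is 1.21 × 5%. Both grow with the index, which is the reinvestment effect the article describes.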
where $\sigma$ is the VIX/100, or equivalently, VIX = $100\sigma$, $T_i$ the time to expiration of the $i$th option, F

In terms of pricing, roughly speaking there are two methods to price index volatility options:

• a model-dependent approach that assumes a model for the index volatility diffusion and provides a closed-form formula of call and put options;
• a model-free approach that computes the cost of the static hedge to replicate the volatility index option.

Historically, the model-dependent approach has been the first to emerge. Successively, among others, Whaley [11], Grunbichler and Longstaff [6], Howison et al. [7], and Elliott et al. [5], and lately Sepp [9, 10], presented the model-dependent approaches to price volatility index options. They all assume an underlying stochastic process for the index volatility (or the index volatility futures) and explicitly compute the price of the call and put options.

The first model-dependent approach presented by Whaley [11] assumed a lognormal diffusion for the VIX cash index and the VIX futures, leading to a standard Black–Scholes formula for the VIX call options as follows:

$$C(T, F, K) = D_T \left( F\, N(d_+) - K\, N(d_-) \right) \tag{2}$$

$$d_\pm = \frac{\ln\dfrac{F}{K} \pm \dfrac{1}{2} T \sigma^2}{\sqrt{T}\,\sigma} \tag{3}$$

where $C(T, F, K)$ stands for the value of the call option with expiry time $T$, strike $K$, and forward price $F$, and where $\sigma$ is the volatility of the futures price, $D_T$ is the discount factor expiring at time $T$, and $N(x)$ is the cumulative normal distribution up to $x$. A straightforward drawback of the Whaley approach was the strong lognormal underlying assumption. This motivated further research and led to many works. Grunbichler and Longstaff [6] proposed a mean-reverting square root process for the volatility process. Following the popularity of stochastic volatility, Howison et al. [7] and Elliott et al. [5] suggested using a stochastic volatility model for the index volatility to capture the risk of volatility for the index volatility. Moreover, because it is well known that the index volatility smile is upward sloping, the stochastic volatility approach, which can cope with this important feature, was an appealing modeling choice. Figure 1, for instance, gives the VIX smile for 27 July 2009.

Lately, Sepp [9, 10] argued in favor of adding jumps to the stochastic volatility model to get a more realistic diffusion for the index volatility. This was supported by the econometric works that confirmed the evidence of jumps for the volatility index.

To sum up, the model-dependent approach aims at modeling the index volatility evolution as accurately as possible and providing a very consistent framework for pricing any type of option on index volatility. The strength is the flexibility in terms of pricing as there is no limitation for the types of options. The weakness is the strong assumption of a specific model and distribution for the index volatility.

[Figure 1: VIX smile for 27 July 2009 (volatilities plotted against strikes)]

Another approach initiated by Neuberger [8], Demeterfi et al. [4], Carr and Lee [1], and Carr and Wu [2] is to exhibit a static hedge and compute in a model-free way the price of this hedge using call and put options on the index itself. In a very insightful paper, Demeterfi et al. [4] showed that a static portfolio of call and put options on the volatility index can replicate a variance swap. Lately, Carr and Lee [1] and Carr and Wu [2] extended the closed-form formula to the case of both the variance and the volatility swap. The starting point is to assume a pure diffusion given as follows:

$$dS_t = \mu_t S_t\,dt + \sigma(t, \ldots) S_t\,dW_t \tag{4}$$

[...] the volatility swaps using market prices of volatility options as inputs. The resulting pricing consists in numerically computing the cost of the hedging strategy with the series of options as shown by the integrals of equation (6).

The obvious strength of this approach is to avoid any assumption on the underlying distribution of the index volatility. Primarily, the weaknesses are that the replication methods do not work for very specific index volatility options and that the discretization bias due to the lack of reliable liquid quotes for call and put options at any strikes can be of the same order of magnitude as the misspecification of the index volatility distribution.
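Whaley's lognormal call formula, equations (2) and (3), can be sketched in Python with the standard library alone. The function names and the inputs (a futures level of 20, a volatility of volatility of 80%, three months to expiry, no discounting) are hypothetical choices for illustration and not calibrated values.

```python
import math


def norm_cdf(x):
    """Cumulative standard normal distribution N(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def lognormal_futures_call(T, F, K, sigma, discount):
    """Call on a lognormal futures price: D_T * (F N(d+) - K N(d-)).

    d+- = (ln(F/K) +- sigma^2 T / 2) / (sigma sqrt(T)), as in equations (2)-(3).
    """
    vol_sqrt_t = sigma * math.sqrt(T)
    d_plus = (math.log(F / K) + 0.5 * sigma ** 2 * T) / vol_sqrt_t
    d_minus = d_plus - vol_sqrt_t
    return discount * (F * norm_cdf(d_plus) - K * norm_cdf(d_minus))


# Hypothetical at-the-money example: F = K = 20, sigma = 0.8, T = 0.25, D_T = 1.
atm_call = lognormal_futures_call(0.25, 20.0, 20.0, 0.8, 1.0)
```

For the at-the-money case d± = ±0.2, so the price reduces to 20 (2N(0.2) − 1), roughly 3.17. Because a single σ is used for all strikes, this sketch cannot reproduce the upward-sloping smile discussed in the article.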
Cont, R. and Kokholm, T. (2009). A Consistent Pricing Model for Index Options and Volatility Derivatives. Available at SSRN: http://ssrn.com/abstract=1474691.

Realized Volatility and Multipower Variation; Realized Volatility Options; Stochastic Volatility Models; Variance Swap; Weighted Variance Swap.
$$(Q^{1/2} - [X]_T^{1/2})^+$$

for a realized volatility put.

In some places, we restrict attention to puts. Call prices follow by put–call parity: for realized variance options, a long-call short-put combination pays $[X]_T - Q$, equal to a $Q$-strike variance swap, and for realized volatility options, a long-call short-put combination pays $[X]_T^{1/2} - Q^{1/2}$, equal to a $Q^{1/2}$-strike volatility swap.

Unlike variance swaps (see Variance Swap; Weighted Variance Swap), which admit exact model-free (assuming only continuity of $Y$) hedging and pricing in terms of Europeans, variance and volatility options have a range of values consistent with the given prices of Europeans. With no further assumptions, there exist sub/superreplication strategies and lower/upper pricing bounds (in the section “Pricing Bounds by Model-free Use of Europeans”). Under an independence condition, there exist exact pricing formulas in terms of Europeans (in the section “Pricing by Use of Europeans, Under an Independence Condition”). Under specific models, there exist exact pricing formulas in terms of model parameters (in the section “Pricing by Modeling the Underlying Process”).

Unless otherwise noted, all prices are denominated in units of a $T$-maturity discount bond. The results apply to dollar-denominated prices, provided that

[...]

for some $\alpha \in \mathbb{R}$. For all $z \in \alpha + \mathbb{R}i := \{z \in \mathbb{C} : \operatorname{Re} z = \alpha\}$, define the bilateral Laplace transform

$$H(z) := \int_{-\infty}^{\infty} e^{-zq}\, h(q)\, dq \tag{2}$$

If $|H|$ is integrable along $\alpha + \mathbb{R}i$ for some $\alpha \le 0$, then by Bromwich and Fubini, the $h([X]_T)$ payoff has price

$$\mathbb{E}\,h([X]_T) = \frac{1}{2\pi i} \int_{\alpha - \infty i}^{\alpha + \infty i} H(z)\, \mathbb{E}e^{z[X]_T}\, dz \tag{3}$$

For a variance put, let $h(q) = (Q - q)^+$. Then for all $\alpha < 0$, formula (3) holds with

$$H(z) = \frac{e^{-Qz}}{z^2} \tag{4}$$

For a volatility put, let $h(q) = (\sqrt{Q} - \sqrt{q^+})^+$. Then for all $\alpha < 0$, formula (3) holds with

$$H(z) = -\frac{\sqrt{\pi}\, \operatorname{Erf}(\sqrt{zQ})}{2 z^{3/2}} \tag{5}$$

To price variance and volatility calls by put–call parity, we have the variance swap value

$$\mathbb{E}[X]_T = \frac{\partial}{\partial z}\bigg|_{z=0} \mathbb{E}e^{z[X]_T} \tag{6}$$
and the volatility swap value For variance option pricing under pure-jump pro-
∞ cesses with independent increments, but without
1 1 − Ɛe−z[X]T
Ɛ[X]T1/2 = √ dz (7) assuming stationary increments, see [2].
2 π 0 z3/2
if Ɛez[X]T is analytic in a neighborhood of z = 0.
Pricing by Use of Europeans, Under an
Independence Condition
Pricing by Modeling the Underlying
Process In this section, let Y be a share price that follows
general stochastic volatility dynamics
Under Heston and under Lévy models, we give
formulas for the transform Ɛez[X]T , where Re z ≤ 0. dYt = σt Yt dWt (13)
Hence, formula (3) prices the variance put and vola-
tility put, using equations (4) and (5), respectively. where σ and the Brownian motion W are independent.
Although all three subsections use this assumption,
the schemes in the sections “Pricing via Transform”
Example: Heston Dynamics and “Pricing and Hedging via Uniform or L2 Payoff
Under the Heston model for instantaneous variance Approximation” are immunized, to first order, against
(see Heston Model), violations of the independence condition.
dVt = (a − κVt ) dt + β Vt dWt (8)
T Pricing via Transform
and the transform of [X]T = 0 Vt dt is T
The transform of [X]T = σt2 dt satisfies [5]
Ɛez[X]T = eA(z)+B(z)V0 (9) 0
where √
Ɛez[X]t = Ɛ θ+ (YT /Y0 )1/2+ (1/4)+2z
a
A(z) := 2
(κ − γ )T √
β + θ− (YT /Y0 ) 1/2− (1/4)+2z
(14)
κ −γ
− 2 log 1 + (1 − e−γ T ) (10) provided√that the expectations are finite. Here, θ± :=
2γ (1 ∓ 1/ 1 + 8z)/2. The right-hand side (RHS) of
2z(eγ T − 1) equation (14) is in principle observable from T -
B(z) := , expiry Europeans, which allows variance/volatility
2γ + (γ + κ)(eγ T − 1)
put option pricing by the formulas (3–5). In this
context, equation (6) can be replaced by the log-
γ := κ 2 − 2β 2 z (11) contract value −2ƐXT , and equation (7) can be
replaced by the synthetic volatility swap value (see
by [6]. Other affine models also have explicit formu-
Volatility Swaps).
las for Ɛez[x]T .
Moreover, source [5] shows that equation (14)
still holds approximately in the presence of corre-
Example: Lévy Dynamics lation between σ and W , in the sense that the RHS
If X is a Lévy process (see Lévy Processes) with is constructed to have zero sensitivity to first-order
Gaussian variance σ 2 and Lévy measure ν, then [X] correlation effects.
has transform
Pricing and Hedging via Uniform or L2 Payoff
σ 2 z2 zx 2 Approximation
Ɛe z[X]T
= exp T +T e − 1 ν( dx)
2
For continuous payoffs, h : [0, ∞) → with finite
(12) limit at ∞, such as the variance put or volatility put,
Define for y > 0 and v > 0

$$BS(y, v; \lambda) := \int_{-\infty}^{\infty} \lambda(ye^z)\,\frac{1}{\sqrt{2\pi v}}\,e^{-(z+v/2)^2/(2v)}\,dz \qquad (19)$$

and define $BS(y, 0; \lambda) := \lambda(y)$, and let $BS_y$ denote its y-derivative. Let $\tau_Q := \inf\{t \ge 0 : [X]_t \ge Q\}$. Then the following trading strategy subreplicates the variance call payoff: hold statically a claim that pays at time T

$$\lambda(Y_T) - BS(Y_0, Q; \lambda) \qquad (20)$$

and trade shares dynamically, holding at each time t ∈ (0, T)

$$\begin{cases} -BS_y(Y_t, Q-[X]_t; \lambda) \text{ shares} & \text{if } t \le \tau_Q \\ -\lambda_y(Y_t) \text{ shares} & \text{if } t > \tau_Q \end{cases} \qquad (21)$$

and a bond position that finances the shares and accumulates the trading gains or losses. Therefore, the time-0 value of the contract paying (20) provides a lower bound on the variance call value.

The lower bound from equation (20) is optimized by λ consisting of $2/K^2\,dK$ out-of-the-money vanilla payoffs at all K where $I_0(K, T)$, the squared unannualized Black–Scholes implied volatility, exceeds Q:

$$\lambda(y) = \int_{\{K:\,I_0(K,T)>Q\}} \frac{2}{K^2}\,\mathrm{van}_K(y)\,dK \qquad (22)$$

See [3] for generalization to forward-starting variance options.

Superreplication and Upper Bounds

The following superreplication strategy is due to [3]. Choose any $b_d \in (0, Y_0]$ and $b_u \in [Y_0, \infty)$. Let

$$BP(y, q) := \int_{-\infty-\alpha i}^{\infty-\alpha i} \frac{\sqrt{y/b_u}\,\sinh\left(\log(b_d/y)\,f(z)\right) - \sqrt{y/b_d}\,\sinh\left(\log(b_u/y)\,f(z)\right)}{2\pi z^2\,e^{i(Q-q)z}\,\sinh\left(\log(b_u/b_d)\,f(z)\right)}\,dz \qquad (23)$$

where $f(z) := \sqrt{1/4 - 2iz}$ and where α > 0 is arbitrary. For y > 0 and $b_d \ne b_u$, define

$$L(y; b_d, b_u) := -2\log(y/b_u) + 2\,\frac{\log(b_u/b_d)}{b_u-b_d}\,(y-b_u) \qquad (24)$$

and define $L(y; Y_0, Y_0) := -2\log(y/Y_0) + 2y/Y_0 - 2$. Let

$$L^*(y) := \begin{cases} L(y) & \text{if } y \notin (b_d, b_u) \\ -BP(y, 0) & \text{if } y \in (b_d, b_u) \end{cases} \qquad (25)$$

Let $BP_y$ and $L_y$ denote the y-derivatives, and let $\tau_b := \inf\{t \ge 0 : Y_t \notin (b_d, b_u)\}$.

Then, the following strategy superreplicates the variance call payoff $([X]_T - Q)^+$. Hold statically a claim that pays at time T

$$L^*(Y_T) - L^*(Y_0) \qquad (26)$$

and trade shares dynamically, holding at each time t ∈ (0, T)

$$\begin{cases} BP_y(Y_t, [X]_t-[X]_0) \text{ shares} & \text{if } 0 \le t \le \tau_b \\ -L_y(Y_t) \text{ shares} & \text{if } t > \tau_b \end{cases} \qquad (27)$$

and a bond position that finances the shares and accumulates the trading gains or losses.

Therefore, the time-0 value of the contract paying (26) provides an upper bound on the variance call value. Given T-expiry European options data, the upper bound from equation (26) may be optimized over all choices of $(b_d, b_u)$.

Connection to the Skorokhod Problem

Whereas the sections “Subreplication and Lower Bounds” and “Superreplication and Upper Bounds” presented explicit hedging strategies, which imply pricing bounds, this section presents (a logarithmic version of) the result in [7], which showed that stopping-time analysis also implies pricing bounds.

Denote by ν the distribution of $Y_T$ under the pricing measure, which is revealed by the prices of T-expiry options on Y. Suppose that $\tilde{Y}$ is a continuous $\mathbb{F}$-martingale with $\tilde{Y}_T \sim \nu$, and $[\tilde{X}]_T$ has finite expectation, where $\tilde{X} := \log \tilde{Y}$. Then Dambis–Dubins–Schwartz implies that
$\tilde{Y}_t = G_{[\tilde{X}]_t}$, where G is a driftless unit-volatility geometric $\mathbb{G}$-Brownian motion (on an enlarged probability space if needed) with $G_0 = Y_0$, and $[\tilde{X}]_t$ are $\mathbb{G}$-stopping times, where $\mathcal{G}_s := \mathcal{F}_{\inf\{t:\,[\tilde{X}]_t > s\}}$. Thus $G_{[\tilde{X}]_T} \sim \nu$; and hence $[\tilde{X}]_T$ solves a Skorokhod problem (see Skorokhod Embedding): it is a finite-expectation stopping time that embeds the distribution ν in G. Conversely, if some finite-expectation τ embeds ν in a driftless unit-volatility geometric Brownian motion G, then $\tilde{Y}_t := G_{\tau \wedge (t/(T-t))}$ defines a continuous martingale with $\tilde{Y}_T \sim \nu$ and $[\log \tilde{Y}]_T = \tau$.

Therefore, distributions of stopping times solving the Skorokhod problem are identical to distributions of realized variance consistent with the given price distribution ν. Skorokhod solutions that have optimality properties, therefore, imply bounds on prices of variance/volatility options. In particular, Root’s solution is known [9] to minimize the expectations of convex functions of the stopping time; the minimized expectation is, in that sense, a sharp lower bound on the price of a variance option (see also Skorokhod Embedding).

Contract Specifications in Practice

In practice, the realized variance in the payoff specification is defined by replacing quadratic variation $[X]_T$ with an annualized discretization that monitors Y, typically daily, for N periods, resulting in a specification

$$\text{Annualization} \times \sum_{n=1}^{N} \left(\log\frac{Y_n}{Y_{n-1}}\right)^2 \qquad (28)$$

If the contract adjusts for dividends (as typical for single-stock dividends but not index dividends) then the term inside the parentheses becomes $\log((Y_n + D_n)/Y_{n-1})$, where $D_n$ denotes the discrete dividend payment, if any, of the nth period.

References

[1] Buehler, H. (2006). Consistent variance curve models, Finance and Stochastics 10(2), 178–203.
[2] Carr, P., Geman, H., Madan, D. & Yor, M. (2005). Pricing options on realized variance, Finance and Stochastics 9(4), 453–475.
[3] Carr, P. & Lee, R. Hedging variance options on continuous semimartingales, Finance and Stochastics, forthcoming.
[4] Carr, P. & Lee, R. (2007). Realized volatility and variance: options via swaps, Risk 20(5), 76–83.
[5] Carr, P. & Lee, R. (2008). Robust Replication of Volatility Derivatives, Bloomberg LP and University of Chicago.
[6] Cox, J., Ingersoll, J. & Ross, S. (1985). A theory of the term structure of interest rates, Econometrica 53(2), 385–407.
[7] Dupire, B. (2005). Volatility Derivatives Modeling, Bloomberg LP.
[8] Friz, P. & Gatheral, J. (2005). Valuation of volatility derivatives as an inverse problem, Quantitative Finance 5(6), 531–542.
[9] Rost, H. (1976). Skorokhod stopping times of minimal variance, Séminaire de Probabilités (Strasbourg), Springer-Verlag, Vol. 10, pp. 194–208.

Related Articles

Exponential Lévy Models; Heston Model; Lévy Processes; Skorokhod Embedding; Variance Swap; Volatility Swaps; Volatility Index Options; Weighted Variance Swap.

ROGER LEE
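The discretized specification (28) in “Contract Specifications in Practice” can be sketched in a few lines. This is only an illustration: the function name, the choice of 252 periods per year, and the convention Annualization = periods per year / N are assumptions, since the article leaves the annualization factor to the contract.

```python
import math

def realized_variance(prices, periods_per_year=252, dividends=None):
    """Annualized realized variance in the spirit of (28).

    prices    : [Y_0, Y_1, ..., Y_N] monitored prices
    dividends : optional [D_1, ..., D_N] discrete dividends (defaults to zero)
    The factor periods_per_year / N is an assumed annualization convention.
    """
    n_periods = len(prices) - 1
    if n_periods < 1:
        raise ValueError("need at least two monitored prices")
    if dividends is None:
        dividends = [0.0] * n_periods
    annualization = periods_per_year / n_periods
    total = 0.0
    for n in range(1, n_periods + 1):
        # Dividend-adjusted log return log((Y_n + D_n) / Y_{n-1})
        log_return = math.log((prices[n] + dividends[n - 1]) / prices[n - 1])
        total += log_return ** 2
    return annualization * total
```

With all dividends set to zero the dividend-adjusted term reduces to the plain log return in (28).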
Put–Call Parity

Put–call parity means that one may switch between call and put positions by selling or buying the underlying forward: “long call, short put is long forward contract” or c − p ≡ f. In other words, one may replicate a put contract by buying a call of identical characteristics (underlying asset, strike, maturity) and selling the underlying asset forward (p ≡ c − f), and one may replicate a call by buying a put and the underlying forward (c ≡ p + f). This is shown in the three payoff diagrams (Figures 1–3).

A logical proof of the third instance (c ≡ p + f) is as follows: a rational investor will exercise a call option whenever the asset price S at maturity is above the strike K; this is equivalent to promising to buy the asset at K and having the option to sell it at that level, which a rational investor will exercise whenever S falls below K.

Put–call parity is often referred to as option synthetics by practitioners and holds only for European options.^a It does not require any assumption other than the ability to buy or sell the asset forward, but it is worth noting that this may not always be the case: to sell forward, either a futures market must exist or one must be able to short-sell the asset.

Put–call parity must not be confused with “put–call symmetry” (see Foreign Exchange Symmetries) in foreign exchange, which states that a call struck at K on a given exchange rate S (e.g., dollars per 1 euro) is identical to a put struck at 1/K on the reverse rate 1/S (euros per 1 dollar), after the ad hoc numeraire conversions: c(S, K)/S ≡ K p(1/S, 1/K).

Price Relationship

Assuming no arbitrage, the synthetic relationship immediately translates into the well-known price relationship: “call minus put equals forward” or $c_t - p_t = f_t$. Note that here $f_t$ denotes the price of a forward contract struck at K, that is, the present value (p.v.) of the gap between the forward price $F_t$ and the strike price K (see Forwards and Futures). Denoting the price of the zero-coupon bond maturing at T by $B_t$ and rearranging terms, we have

call + p.v. of strike price = put + p.v. of forward price (1)

or

$$c_t + K \cdot B_t = p_t + F_t \cdot B_t \qquad (2)$$

For all investment assets where short selling is feasible, the forward price can be further expressed as a function of the spot price $S_t$ and the revenue or cost of carry until maturity T (see Forwards and Futures). For example, the forward price of a stock with continuous dividend rate q satisfies $F_t = S_t/B_t \cdot \exp(-q(T-t))$, and put–call parity simplifies to

$$c_t + K \cdot B_t = p_t + S_t \cdot e^{-q(T-t)} \qquad (3)$$

In practice, Kamara and Miller [5] give empirical evidence that while put–call parity has many small violations, almost half of the arbitrages would result in a loss when execution delays are accounted for.

Basic Implications

• For trading purposes, puts and calls are identical instruments (up to a directional position in the underlying asset).
• At-the-money-forward calls and puts must have the same value. (An at-the-money-forward option has its strike set at the forward price of the underlying asset.)
• In the absence of revenue or cost of carry, the deltas (see Black–Scholes Formula; Delta Hedging) of a call and put must add up to 1 (in absolute value).
• Puts and calls must have the same gamma and vega (see Black–Scholes Formula; Gamma Hedging).

In volatility modeling, put–call parity implies that calls and puts of identical characteristics must have the same implied volatility.

In exotic option pricing, Carr and Lee [1] put forward the idea of a generalized American option that may be indefinitely exercised until maturity to lock in the intrinsic value and switch between call and put styles. The authors show that this option may be replicated by holding onto a European
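The price relationship (3) above can be checked numerically. The sketch below is only an illustration: parity itself is model-free, and the Black–Scholes formulas appear here purely as an assumed way to generate a mutually consistent call and put price. The flat interest rate (so that the bond price is exp(−r·τ)) and all function names are assumptions as well.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call_put(spot, strike, rate, div_yield, vol, tau):
    """Black-Scholes European call and put prices (assumed price generator)."""
    sqrt_tau = math.sqrt(tau)
    d1 = (math.log(spot / strike)
          + (rate - div_yield + 0.5 * vol ** 2) * tau) / (vol * sqrt_tau)
    d2 = d1 - vol * sqrt_tau
    carry_spot = spot * math.exp(-div_yield * tau)  # S_t * exp(-q (T - t))
    bond = math.exp(-rate * tau)                    # B_t under a flat rate
    call = carry_spot * norm_cdf(d1) - strike * bond * norm_cdf(d2)
    put = strike * bond * norm_cdf(-d2) - carry_spot * norm_cdf(-d1)
    return call, put

def parity_gap(call, put, spot, strike, bond, div_yield, tau):
    """Left side minus right side of (3); zero when parity holds."""
    return call + strike * bond - (put + spot * math.exp(-div_yield * tau))

def synthetic_put(call, spot, strike, bond, div_yield, tau):
    """Put price implied by a call price through (3)."""
    return call + strike * bond - spot * math.exp(-div_yield * tau)
```

Passing a quoted call price to `synthetic_put` gives the put price that parity implies, which is the “option synthetics” reading of the relationship.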
[Figures 1–3: payoff diagrams plotting payoff against the asset price S around the strike K; curve labels: Call, Put, Forward, Short put.]

History

Haug [3] traces put–call parity as far back as the seventeenth century, but its formulation was then “diffuse”. According to the author, an early
1902:

It can be shown that the adroit dealer in options can convert a ‘put’ into a ‘call’, a ‘call’ into a ‘put’ [. . .] by dealing against it in the stock.

Derman and Taleb [2] argue that the Black–Scholes–Merton formulas could have been established earlier than 1973 via put–call parity instead of the dynamic
agent attempts to replicate a forward contract by buying a call and selling a put. If the put is American, it may be exercised against the agent before maturity, thus breaking the replication strategy.

References

[1] Carr, P. & Lee, R. (2002). Hyper Options. Working paper, Courant Institute and Stanford University, December 2002.
[2] Derman, E. & Taleb, N.N. (2005). The illusions of dynamic replication, Quantitative Finance 5(4), 323–326.
[3] Haug, E. (2007). Derivatives: Models on Models. Wiley.
[4] Higgins, L.R. (1902). The Put-and-Call. E. Wilson, London.
[5] Kamara, A. & Miller, T.W. (1995). Daily and intradaily tests of European put-call parity, Journal of Financial and Quantitative Analysis 30, 519–539.

Related Articles

Black–Scholes Formula; Call Options; Forwards and Futures; Option Pricing: General Principles; Options: Basic Definitions.

SÉBASTIEN BOSSU
Discretely Monitored Options

Traditional pricing models for path-dependent options rely on continuously monitoring the underlying, often resulting in closed-form or analytic formulas. References include [14, 19, 20, 21] for barrier options, [6, 12, 13] for look-back options, and [11, 18] for Asian or average options. However, in practice, monitoring is performed over discrete dates (e.g., monthly, weekly, or daily), while the underlying is still assumed to follow a continuous model. In contrast to continuous monitoring, discrete monitoring rarely, if ever, leads to similarly tractable solutions, and using continuous monitoring as an approximation for discrete monitoring often leads to significant mispricing (cf. [5, 15, 16]). As a consequence, various approaches have been followed to arrive at practically useful computational schemes.

For illustration, we focus on a down-and-out call option, where a standard call option with strike K is canceled if the underlying falls below a barrier prior to expiry T. We first assume the traditional Black–Scholes–Merton setup with the price $\{S_t\}$ of the underlying following a geometric Brownian motion

$$S_t = S_0 e^{B_t} \qquad (1)$$

where $\{B_t\}$ is a Brownian motion with drift $r - \sigma^2/2$ and standard deviation σ. Here the parameters r and σ represent the prevailing risk-free rate and the return volatility of the underlying asset, respectively. Let H > 0 be a given con-

is the computational complexity associated with an m-variate normal distribution for even moderate values of m. For example, Monte Carlo or tree-based algorithms may take several hours or even days for common values of m [4]. In their paper, Broadie et al. [3] (see also [17]) opt to circumvent this hurdle by linking $V_m(H)$ to the price of a continuously monitored option with a barrier shifted away from the original. More precisely, they show that

$$V_m(H) = V\left(He^{\pm\beta\sigma\sqrt{\Delta t}}\right) + o\left(1/\sqrt{m}\right) \qquad (3)$$

where $V(\tilde{H})$ is the price of a continuously monitored barrier option with threshold $\tilde{H}$ and β ≈ 0.5826, with + for an up option and − for a down option.

Although this approach works very well, it appears to be inaccurate when the barrier is near the initial price of the underlying. Under such circumstances, one can opt to use the recursive method of AitSahlia and Lai [1], which consists in reducing an m-dimensional integration problem to successively evaluating m one-dimensional integrals. Specifically, they show that

$$V_m(H) = \int_{\log(H/S_0)}^{\infty} \left(S_0 e^x - K\right)^+ f_m(x)\,dx \qquad (4)$$

where, for 1 ≤ n ≤ m, $f_n(x)\,dx = P\{\tau > n,\ U_n \in dx\}$ for $x > \log(H/S_0)$, with $f_n$ defined recursively for each n according to the following:

$$f_1(x) = \psi(x)$$

$$f_n(x) = \int_{\log(H/S_0)}^{\infty} f_{n-1}(y)\,\psi(x-y)\,dy$$
context of a GARCH model, Duan et al. [7] propose a Markov chain technique that can also handle American-style exercise. Partial differential equations are used in [9, 22, 23, 24] to price average and barrier options, including when volatility is stochastic and exercise is of American style. Finally, [8] contains an approach that ultimately relies on Hilbert and Fourier transform techniques to address the situation when the underlying follows a Lévy process.

References

[1] AitSahlia, F. & Lai, T. (1997). Valuation of discrete barrier and hindsight options, Journal of Financial Engineering 6, 169–177.
[2] AitSahlia, F. & Lai, T. (1998). Random walk duality and the valuation of discrete lookback options, Applied Mathematical Finance 5, 227–240.
[3] Broadie, M., Glasserman, P. & Kou, S. (1997). A continuity correction for discrete barrier options, Mathematical Finance 7, 325–349.
[4] Broadie, M., Glasserman, P. & Kou, S. (1999). Connecting discrete and continuous path-dependent options, Finance and Stochastics 3, 55–82.
[5] Chance, D. (1994). The pricing and hedging of limited exercise caps and spreads, Journal of Financial Research 17, 561–583.
[6] Conze, A. & Viswanathan, R. (1991). Path dependent options: the case of lookback options, Journal of Finance 46, 1893–1907.
[7] Duan, J.C., Dudley, E., Gauthier, G. & Simonato, J.G. (2003). Pricing discretely monitored barrier options by a Markov Chain, Journal of Derivatives 10, 9–31.
[8] Feng, L. & Linetsky, V. (2008). Pricing discretely monitored barrier options and defaultable bonds in Lévy process models: a fast Hilbert transform approach, Mathematical Finance 18, 337–384.
[9] Forsyth, P.A., Vetzal, K. & Zvan, R. (1999). A finite element approach to the pricing of discrete lookbacks with stochastic volatility, Applied Mathematical Finance 6, 87–106.
[10] Fusai, G., Abrahams, D. & Sgarra, C. (2006). An exact analytical solution for discrete barrier options, Finance and Stochastics 10, 1–26.
[11] Geman, H. & Yor, M. (1993). Bessel processes, Asian options and perpetuities, Mathematical Finance 3, 349–375.
[12] Goldman, M., Sosin, H. & Gatto, M. (1979). Path dependent options: ‘Buy at the low, sell at the high’, Journal of Finance 34, 1111–1127.
[13] Goldman, M., Sosin, H. & Shepp, L. (1979). On contingent claims that insure ex-post optimal stock market timing, Journal of Finance 34, 401–414.
[14] Heynen, R.C. & Kat, H.M. (1994). Partial barrier options, Journal of Financial Engineering 3, 253–274.
[15] Heynen, R.C. & Kat, H.M. (1994). Lookback options with discrete and partial monitoring of the underlying price, Applied Mathematical Finance 2, 273–284.
[16] Kat, H. & Verdonk, L. (1995). Tree surgery, Risk 8, 53–56.
[17] Kou, S. (2003). On pricing of discrete barrier options, Statistica Sinica 13, 955–964.
[18] Linetsky, V. (2004). Spectral expansions for Asian (average price) options, Operations Research 52, 856–867.
[19] Merton, R.C. (1973). Theory of rational option pricing, Bell Journal of Economics and Management Science 4, 141–183.
[20] Rich, D. (1994). The mathematical foundations of barrier option pricing theory, Advances in Futures and Options Research 7, 267–312.
[21] Rubinstein, M. & Reiner, E. (1991). Breaking down the barriers, Risk 4, 28–35.
[22] Vetzal, K. & Forsyth, P.A. (1999). Discrete Parisian and delayed barrier options: a general numerical approach, Advances in Futures and Options Research 10, 1–16.
[23] Zvan, R., Forsyth, P.A. & Vetzal, K. (1999). Discrete Asian barrier options, Journal of Computational Finance 3, 41–68.
[24] Zvan, R., Vetzal, K. & Forsyth, P.A. (2000). PDE methods for pricing barrier options, Journal of Economic Dynamics and Control 24, 1563–1590.

FARID AITSAHLIA
Weighted Variance Swap

Let the underlying process Y be a semimartingale taking values in an interval I. Let $\varphi : I \to \mathbb{R}$ be a difference of convex functions, and let $X := \varphi(Y)$. A typical application takes Y to be a positive price process and $\varphi(y) = \log y$ for $y \in I = (0, \infty)$.

Then (the floating leg of) a forward-starting weighted variance swap or generalized variance swap on $\varphi(Y)$ (shortened to “on Y” if the ϕ is understood), with weight process $w_t$, forward-start time θ, and expiry T, is defined to pay, at a fixed time $T_{pay} \ge T > \theta \ge 0$,

so the share price with reinvested dividends is $Y_t Q_t$. Then the payoff

$$\int_\theta^T w(Y_t)\,d[X]_t \qquad (3)$$

admits a model-independent replication strategy, which holds European options statically and trades the underlying shares dynamically. Indeed, let $\lambda : I \to \mathbb{R}$ be a difference of convex functions, let $\lambda_y$ denote its left-hand derivative, and assume that its second derivative in the distributional sense has a signed density, denoted $\lambda_{yy}$, which satisfies for all $y \in I$

$$\lambda_{yy}(y) = 2\varphi_y^2(y)\,w(y) \qquad (4)$$
each λ claim decomposes into puts/calls at all strikes K, with quantities $2\varphi_y^2(K)\,w(K)\,dK$:

$$\lambda(y) = \int_I 2\varphi_y^2(K)\,w(K)\,\mathrm{Van}(y, K)\,dK \qquad (8)$$

where $\mathrm{Van}(y, K) := (K-y)^+\,\mathbf{1}_{K<\kappa} + (y-K)^+\,\mathbf{1}_{K>\kappa}$ denotes the vanilla put or call payoff. For put/call decompositions of general European payoffs, see [1].

Futures-dependent Weights

In equation (3), the weight is a function of spot $Y_t$. The alternative payoff specification

$$\int_\theta^T w(Y_tQ_t/Z_t)\,d[X]_t \qquad (9)$$

makes $w_t$ a function of the futures price (a constant times $Y_tQ_t/Z_t$).

In the case ϕ = log, we have [X] = [log Y] = [log(YQ/Z)]; hence

$$\int_\theta^T w(Y_tQ_t/Z_t)\,d[X]_t = \lambda\left(\frac{Y_TQ_T}{Z_T}\right) - \lambda\left(\frac{Y_\theta Q_\theta}{Z_\theta}\right) - \int_\theta^T \lambda_y(Y_tQ_t/Z_t)\,d(Y_tQ_t/Z_t)$$

for λ satisfying equation (4). So the alternative payoff (9) admits replication as follows: hold statically a claim that pays at time $T_{pay}$

$$\lambda(Y_TQ_T/Z_T) - \lambda(Y_\theta Q_\theta/Z_\theta) \qquad (10a)$$

and trade shares dynamically, holding at each time t ∈ (θ, T)

$$-\lambda_y(Y_tQ_t/Z_t)\,Q_t \text{ shares} \qquad (10b)$$

and a bond position that finances the shares and accumulates the trading gains or losses. Thus, the payoffs (9) and (10a) have equal values at time 0.

In special cases (such as w = 1 or r = q = 0), the spot-dependent (3) and futures-dependent (9) weight specifications are equivalent. In general, the spot-dependent weighting is harder to replicate, as it requires a continuum of expiries in equation (7a), unlike equation (10a). The spot-dependent weighting is, however, the more common specification and is assumed in the remainder of this article.

Examples

Returning to the previously specified examples of weights $w(Y_t)$, we express the replication payoff λ in a compact formula, and also expanded in terms of vanilla payoffs according to equation (8). We take $\varphi(y) = \log y$ unless otherwise stated.

• Variance swap: Equation (4) has solution

$$\lambda(y) = -2\log(y/\kappa) + 2y/\kappa - 2 = \int_0^\infty \frac{2}{K^2}\,\mathrm{Van}(y, K)\,dK \qquad (11)$$

• Arithmetic variance swap: For $\varphi(y) = y$, equation (4) has solution

$$\lambda(y) = (y-\kappa)^2 = \int_0^\infty 2\,\mathrm{Van}(y, K)\,dK \qquad (12)$$

• Corridor variance swap: Equation (4) has solution

$$\lambda(y) = \int_{K \in C} \frac{2}{K^2}\,\mathrm{Van}(y, K)\,dK \qquad (13)$$

• Gamma swap: Equation (4) has solution

$$\lambda(y) = \frac{2}{Y_0}\left[y\log(y/\kappa) - y + \kappa\right] = \int_0^\infty \frac{2}{Y_0K}\,\mathrm{Van}(y, K)\,dK \qquad (14)$$

In all cases, the strategy (7) replicates the desired contract. In the case of a variance swap, the strategy (10) also replicates it, because w(Y) = 1 = w(YQ/Z).

Discrete Dividends

Assume that at the fixed times $t_m$ where $\theta = t_0 < t_1 < \cdots < t_M = T$, the share price jumps to $Y_{t_m} = Y_{t_m-} - \delta_m(Y_{t_m-})$, where each discrete dividend is given by a function $\delta_m$ of prejump price. In this case, the dividend-adjusted weighted variance swap can be defined to pay at time $T_{pay}$

$$\sum_{m=1}^{M} \int_{t_{m-1}+}^{t_m-} w(Y_t)\,d[X]_t \qquad (15)$$
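The closed forms in (11) and (14) from the Examples above can be checked against the strike decomposition (8) by direct numerical integration. This is a rough sketch under stated assumptions: the strike range is truncated to a finite interval that contains both y and κ, a midpoint rule stands in for the integral, and every function name is hypothetical.

```python
import math

def vanilla(y, strike, kappa):
    """Van(y, K): put payoff for strikes below kappa, call payoff above."""
    if strike < kappa:
        return max(strike - y, 0.0)
    return max(y - strike, 0.0)

def replicated_payoff(y, kappa, weight, phi_y, k_lo, k_hi, n=200_000):
    """Midpoint-rule value of (8): integral of 2 * phi_y(K)^2 * w(K) * Van(y, K) dK."""
    h = (k_hi - k_lo) / n
    total = 0.0
    for i in range(n):
        k = k_lo + (i + 0.5) * h
        total += 2.0 * phi_y(k) ** 2 * weight(k) * vanilla(y, k, kappa)
    return total * h

def variance_swap_lambda(y, kappa):
    """Closed form (11)."""
    return -2.0 * math.log(y / kappa) + 2.0 * y / kappa - 2.0

def gamma_swap_lambda(y, kappa, y0):
    """Closed form (14)."""
    return (2.0 / y0) * (y * math.log(y / kappa) - y + kappa)
```

For the variance swap the inputs are `weight = lambda k: 1.0` and `phi_y = lambda k: 1.0 / k`; for the gamma swap the weight becomes `lambda k: k / y0`.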
If the function $y \mapsto y - \delta_m(y)$ has an inverse $f_m : I \to I$, and if Y is still continuous on each $[t_{m-1}, t_m)$, then each term in equation (15) can be constructed via equation (7), together with the relation $\lambda(Y_{t_m-}) = \lambda(f_m(Y_{t_m}))$. Specifically, the mth term admits replication by holding statically a claim that pays at time $T_{pay}$

$$\lambda(f_m(Y_{t_m})) - \lambda(Y_{t_{m-1}}) + \int_{t_{m-1}}^{t_m} (q_\tau - r_\tau)\,\lambda_y(Y_\tau)\,Y_\tau\,d\tau \qquad (16)$$

and holding dynamically $-\lambda_y(Y_t)Z_t$ shares, at each time $t \in (t_{m-1}, t_m)$.

Contract Specifications in Practice

In practice, weighted variance swap transactions are forward settled; no payment occurs at time 0, and at time $T_{pay}$ the party long the swap receives the total payment

$$\text{Notional} \times (\text{Floating} - \text{Fixed}) \qquad (17)$$

where “fixed” (also known as the strike), expressed in units of annualized variance, is the price contracted at time 0 for time-$T_{pay}$ delivery of “floating”, an annualized discretization of equation (15) that monitors Y, typically daily, for N periods. In the usual case of ϕ = log, this results in a specification

where $D_n$ denotes the discrete dividend payment, if any, of the nth period. Both here and in the theoretical form (15), no adjustment is made for any dividends deemed to be continuous (for example, index variance contracts typically do not adjust for index dividends; see [3]).

In some contracts, for example single-stock (down-)variance, the risk to the variance seller that Y crashes is limited by imposing a cap on the payoff. Hence,

$$\text{Notional} \times \left(\min(\text{Floating},\ \text{Cap} \times \text{Fixed}) - \text{Fixed}\right) \qquad (19)$$

replaces equation (17), where “Cap” is an agreed constant, such as the square of 2.5.

References

[1] Carr, P. & Madan, D. (1998). Towards a theory of volatility trading, in Volatility, R. Jarrow, ed, Risk Publications, pp. 417–427.
[2] Carr, P. & Lee, R. (2009). Hedging Variance Options on Continuous Semimartingales, Forthcoming in Finance and Stochastics.
[3] Overhaus, M., Bermúdez, A., Buehler, H., Ferraris, A., Jordinson, C. & Lamnouar, A. (2007). Equity Hybrid Derivatives, John Wiley & Sons.

Related Articles
[Figure 1 panels: u1(t,x) and u2(t,x) (left), s1(t,x) and s2(t,x) (right), each plotted against t and x.]
Figure 1 Extreme sensitivity of Dupire formula to noise in the data. Two examples of call price function (left) and their corresponding local volatilities (right). The prices differ through IID noise ∼ UNIF(0, 0.001), representing a bid–ask spread
In most models, the call prices are computed numerically via Fourier transform (see Fourier Methods in Options Pricing) or by solving a partial differential equation (PDE) (see Partial Differential Equations). However, in many situations (short or long maturity, small vol–vol, etc.) approximation formulae for implied volatilities $\Sigma(T_i, K_i)$ of call options are available [5, 10, 11, 30] in terms of model parameters (see Implied Volatility in Stochastic Volatility Models; Implied Volatility: Volvol Expansion; Implied Volatility: Long Maturity Behavior; SABR Model). In these situations, parameters are calibrated by a least-squares fit to the approximate formula:

$$\inf_{\theta \in E} \sum_{i=1}^{I} w_i\,\left|\Sigma(T_i, K_i; \theta) - \Sigma^*(T_i, K_i)\right|^2 \qquad (13)$$

An example is the SABR model (see SABR Model), whose popularity is almost entirely due to its ease of calibration using the Hagan formula [30].

In most cases, option prices $C_i(\theta)$ depend continuously on θ and E is a subset of a finite dimensional space (i.e., there are a finite number of bounded parameters), so the least-squares formulation always admits a solution. However, the solution of equation (12) need not be unique: $J_0$ may, in fact, have several global minima, when the observed option prices do not uniquely identify the model. Figures 2 and 3 show examples of the function $J_0$ for some popular parametric option pricing models, computed using a data set of DAX index options prices on May 11, 2001. The pricing error in the Heston stochastic volatility model (see Heston Model), shown in Figure 2 as a function of the “volatility of volatility” and the mean reversion rate, displays a line of local minima. The pricing error for the variance gamma model (see Variance-gamma Model) in Figure 3 displays a nonconvex profile, with two distinct minima in the range
[Figure 2 axes: log error against the mean reversion parameter and the volatility of volatility.]
Figure 2 Error surface for the Heston stochastic volatility model, DAX options
[Figure 3 axes: error (× 10^5) against k and s.]
Figure 3 Error surface for variance gamma (pure jump) model, DAX options
of observed values. These examples show that, even if the number of observations (option prices) is much higher than the number of parameters, this does not imply identifiability of parameters.

term, to the pricing error and solve the auxiliary problem:

$$\inf_{\theta \in E} J_\alpha(\theta) \qquad (14)$$
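The identifiability issue and the effect of a regularization term can be illustrated on a toy problem. Everything in this sketch is hypothetical and chosen only for illustration: the pricing map C_i(θ) = θ1·θ2·x_i (which depends on θ only through a product, so the unregularized error has a whole curve of minima), the quadratic penalty α·|θ − prior|², and plain gradient descent as the minimizer.

```python
def model_prices(theta, xs):
    """Toy pricing map C_i(theta) = theta1 * theta2 * x_i (hypothetical)."""
    return [theta[0] * theta[1] * x for x in xs]

def calibrate(xs, targets, start, alpha=0.0, prior=(1.0, 1.0),
              lr=0.002, n_iter=50_000):
    """Gradient descent on J_alpha = sum_i (C_i - C_i*)^2 + alpha * |theta - prior|^2."""
    t1, t2 = start
    for _ in range(n_iter):
        prices = model_prices((t1, t2), xs)
        resid = [c - c_star for c, c_star in zip(prices, targets)]
        # Derivative of the pricing error with respect to the product theta1 * theta2
        common = 2.0 * sum(e * x for e, x in zip(resid, xs))
        g1 = common * t2 + 2.0 * alpha * (t1 - prior[0])
        g2 = common * t1 + 2.0 * alpha * (t2 - prior[1])
        t1, t2 = t1 - lr * g1, t2 - lr * g2
    return t1, t2
```

With `alpha=0.0`, different starting points fit the data equally well but end at different parameters; with a positive `alpha` the penalty selects a single solution close to the prior, at the cost of a small pricing error.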
The functional (16) consists of two parts: the regularization term αR(θ), which is convex in its argument, and the quadratic pricing error, which measures the precision of calibration. The coefficient α, called the regularization parameter, defines the relative importance of the two terms: it characterizes the trade-off between prior knowledge and the information contained in option prices. $J_\alpha(\cdot)$ is usually minimized by gradient-based methods, where the crux of the algorithm is an efficient computation of the gradient $\nabla_\theta J$.

When the parameter is a function (such as the local volatility function), the regularization term is often chosen to be a smoothness (e.g., Sobolev) norm. This method, called Tikhonov regularization (see Tikhonov Regularization), has been applied to diffusion models [1, 2, 13, 23, 26] and to exponential-Lévy models [19].

Another popular choice of regularization term is the relative entropy (see Entropy-based Estimation) $R(\theta) = H(\mathbb{Q}_\theta \mid \mathbb{P})$ with respect to a prior probability measure $\mathbb{P}$. In continuous-time models, relative entropy can be used as regularization criterion only if the prior possesses a nonempty class of equivalent martingale measures, that is, it corresponds to an incomplete market model (see Complete Markets). From a calibration perspective, market incompleteness (i.e., the nonuniqueness of the equivalent martingale measure) is therefore an advantage: it allows one to reconcile compatibility with option prices and equivalence with respect to a reference probability measure. Examples are provided by jump processes (see Jump Processes; Exponential Lévy Models) or reduced-form credit risk models (see Reduced Form Credit Risk Models): one can modify the jump size distribution (Lévy measure) or the default intensity while preserving equivalence (see Equivalence of Probability Measures) of measures [18, 20]. For Lévy processes (see Exponential Lévy Models), the relative entropy term H(ν) is computable in terms of the Lévy measure ν [21]. The calibration problem then takes the following form:

Problem 2 Given a prior Lévy process with law $\mathbb{P}_0$ and characteristics $(\sigma_0, \nu_0)$, find a Lévy measure ν which minimizes

$$J_\alpha(\nu) = \alpha H(\nu) + \sum_{i=1}^{N} w_i\left(C_0^\nu(T_i, K_i) - C_0(T_i, K_i)\right)^2 \qquad (16)$$

This regularized formulation has the advantage that its solution exhibits continuous dependence on market prices and with respect to the choice of the prior model [21, 22].

Simpler regularization methods can be used in settings where prices are computed using analytical transform methods. Belomestny & Reiss [8] propose a spectral regularization method for calibrating exponential-Lévy models. Aspremont [3] formulates the calibration of LIBOR market models (Example 3) as semidefinite programming problems under constraints.

Different regularization terms select different solutions: Tikhonov regularization approximates the least-squares solution with smallest norm [27] while entropy-based regularization selects the minimum-entropy least-squares solution [22].

Entropy Minimization Under Calibration Constraints

An alternative approach to regularization is to select a pricing model $\mathbb{Q}$ by minimizing the relative entropy (see Entropy-based Estimation) of the probability measure $\mathbb{Q}$ with respect to a prior, under calibration constraints

$$\inf_{\mathbb{Q} \sim \mathbb{P}} H(\mathbb{Q} \mid \mathbb{P}) \quad \text{under} \quad C_i = E^{\mathbb{Q}}[H_i] \text{ for } i \in I \qquad (17)$$

Relative entropy being strictly convex, any solution of equation (17) is unique and can be computed in a stable manner using Lagrange multiplier (dual) methods [24] (see Convex Duality).

Application of these ideas to a set of scenarios leads to the weighted Monte Carlo algorithm (see Weighted Monte Carlo) [6]: one first simulates N sample paths $\Omega_N = \{\omega_1, \ldots, \omega_N\}$ from a prior model $\mathbb{P}$ and then solves the above problem (AV) using as prior the uniform distribution $\mathbb{P}_N$ on $\Omega_N$. The idea is to weight the paths in order to verify the calibration constraints. The weights $(\mathbb{Q}_N(\omega_i),\ i = 1..N)$ are constructed by minimizing relative entropy under calibration constraints

$$\inf_{\mathbb{Q}_N \in \mathcal{P}(\Omega_N)} \sum_{i=1}^{N} \mathbb{Q}_N(\omega_i)\,\ln\frac{\mathbb{Q}_N(\omega_i)}{\mathbb{P}_N(\omega_i)} \quad \text{under}$$
$$\sum_{i=1}^{N} \mathbb{Q}_N(\omega_i)\,G_j(\omega_i) = C_j \qquad (18)$$

This constrained optimization problem is solved by duality [6, 24]: the dual has an explicit solution, in the form of a Gibbs–Boltzmann measure [4, 6] (see Entropy-based Estimation). A (discounted) payoff X is then priced using the same set of simulated paths via

$$E^{\mathbb{Q}_N}[X] = \sum_{i=1}^{N} \mathbb{Q}_N(\omega_i)\,X(\omega_i) = \frac{1}{N}\sum_{i=1}^{N} \frac{\mathbb{Q}_N(\omega_i)}{\mathbb{P}_N(\omega_i)}\,X(\omega_i) \qquad (19)$$

The benchmark payoffs (calibration instruments) play the role of biased control variates, leading to variance reduction [29]:

$$E^{\mathbb{Q}_N}[X] = E^{\mathbb{Q}_N}\left[X - \sum_{i=1}^{I} \alpha_i H_i\right] + \sum_{i=1}^{I} \alpha_i C_i \qquad (20)$$

This method yields, as a by-product, a static hedge portfolio $\alpha_i^*$, which minimizes the variance in equation (20) [3, 6, 17].

A drawback is that the martingale property is lost in this process since it would correspond to an infinite number of constraints. As a result, derivative prices computed with the weighted Monte Carlo algorithm may fail to verify arbitrage relations across maturities (e.g., calendar spread relations), especially when applied to forward-starting contracts.

These arbitrage constraints can be restored by representing the pricing measure as a random mixture of martingales, the law of the random mixture being chosen via relative entropy minimization under calibration constraints [17]. This results in an arbitrage-free version of the weighted Monte Carlo approach, which is applied to recovering covariance matrices implied by index options in [15].

Stochastic Control Methods

In certain continuous-time models, the relative entropy minimization approach can be mapped, via a duality argument, into a stochastic control problem, which can then be solved using dynamic programming techniques. Consider a Markovian model where the state variable $X_t$ (asset price, interest rate, ...) follows a stochastic differential equation

$$dX_t = \mu_\theta(t)\,dt + \sigma_\theta(t, X_t)\,dW_t + \int \gamma_\theta(t, X_{t-})\,\mu(dt\,dz) \qquad (21)$$

where W is a Wiener process and µ a compensated Poisson random measure with intensity $\nu_\theta(dz)\,\lambda_\theta(t)\,dt$. The coefficients of the model are parameterized by some parameter θ ∈ E; in a nonparametric setting, θ is just the coefficient itself and E is a functional space. Denote the law of the solution by $\mathbb{Q}_\theta$. Consider now the case where the calibration criterion J(.) can be expressed as an expected value $J(\theta) = E^{\mathbb{Q}_\theta}\left[\int_0^T \varphi(X_t)\,dt\right]$ with a strictly convex function φ(.). A classical approach to solve the calibration problem

$$\inf_{\theta \in E} J(\theta), \quad \text{under } E^{\mathbb{Q}_\theta}[H_i] = C_i \qquad (22)$$

is to introduce the Lagrangian functional

$$L(\theta, \lambda) = J(\theta) - \sum_{i \in I} \lambda_i\left(E^{\mathbb{Q}_\theta}[H_i] - C_i\right) = E^{\mathbb{Q}_\theta}\left[\int_0^T \varphi(X_t)\,dt - \sum_{i \in I} \lambda_i(H_i - C_i)\right] \qquad (23)$$

where $\lambda_i$ is the Lagrange multiplier associated to the calibration constraint for payoff $H_i$. The dual problem associated to the constrained minimization problem (22) is given by

$$\inf_{\theta \in E} L(\theta, \lambda) = \inf_{\theta \in E} E^{\mathbb{Q}_\theta}\left[\int_0^T \varphi(X_t)\,dt - \sum_{i \in I} \lambda_i(H_i - C_i)\right] \qquad (24)$$

It can be viewed as a stochastic control problem (see Stochastic Control) with running cost $\varphi(t, X_t)$ and terminal cost .

This original formulation of the calibration problem was first presented by Avellaneda et al. [7] in the
context of a diffusion model with unknown volatility

$$dS_t = S_t\, \sigma(t, S_t)\, dW_t \qquad (25)$$

The calibration criterion in [7] was chosen to be

$$J(\sigma) = E^{\mathbb{Q}_\sigma}\Big[\int_0^T \eta\big(\sigma^2(t, X_t^\sigma)\big)\, dt\Big] \qquad (26)$$

where $\eta$ is a strictly convex function. Duality between (22) and (24) is not obvious in this case since the Lagrangian is not convex with respect to its argument [31]. The stochastic control approach can also be applied in the context of model calibration by relative entropy minimization for classes of models where absolute continuity is preserved under a change of parameters, such as models with jumps. Cont and Minca [18] use this approach for retrieving the default rate in a portfolio from CDO tranche spreads indexed on the portfolio.

Stochastic Algorithms

Objective functions used in calibration (with the exception of entropy-based methods) are typically nonconvex, even after regularization, leading to multiple minima and lack of convergence in gradient-based methods. Stochastic algorithms known as evolutionary algorithms, which contain simulated annealing as a special case, have been widely used for global nonconvex optimization and are natural candidates for solving such problems [9].

Suppose, for instance, we want to minimize the pricing error

$$J_0(\theta) = \sum_{i=1}^{I} w_i\, |C_i^\theta - C_i|, \quad \theta \in E \qquad (27)$$

where $C_i^\theta$ are model prices and $C_i$ are observed (transaction or mid-market) prices for the benchmark options. Now define the a priori error level $\delta$ as

$$\delta = \sum_{i=1}^{I} w_i\, |C_i^{bid} - C_i^{ask}| \qquad (28)$$

Given the uncertainty on option values due to bid–ask spreads, one cannot meaningfully distinguish a “perfect” fit $J_0(\theta) = 0$ from any other fit with $J_0(\theta) \le \delta$. Therefore, all parameter values in the level set $G_\delta = \{\theta \in E,\ J_0(\theta) \le \delta\}$ correspond to models that are compatible with the market data $(C_i^{bid}, C_i^{ask})_{i=1..I}$. An evolutionary algorithm simulates an inhomogeneous Markov chain $(X_n)_{n \ge 1}$ in $E^N$ which undergoes mutation–selection cycles [9] designed such that, as the number of iterations $n$ grows, the components $(\theta_n^1, ..., \theta_n^N)$ of $X_n$ converge to the level set $G_\delta$, yielding a population of points $(\theta_k)$ which converges to a sample of model parameters compatible with the market data $(C_i^{bid}, C_i^{ask})_{i=1..I}$ in the sense that $J_0(\theta_k) \le \delta$. We thus obtain a population of $N$ model parameters calibrated to market data, which can be different, especially if the initial problem has multiple solutions.

Figure 4 shows a sample of local volatility functions obtained using this approach [9]. These examples illustrate that precise reconstruction of local volatility from call option prices is at best illusory; the parameter uncertainty is too important to be ignored, especially for short maturities where it does not affect the prices very much; short-term volatility hovers anywhere between 15% and 30%. These observations cast doubt on the information content of very short-term options in terms of volatility and question whether one can rely solely on short maturity asymptotics (see SABR Model) in model calibration.

Parameter Uncertainty

Model calibration is usually the first step in a procedure whose ultimate purpose is the pricing and hedging of (exotic) options. Once the model parameter $\theta$ is calibrated to market prices, it is used to compute a model-dependent quantity $f(\theta)$, such as the price of an exotic option or a hedge ratio, using a numerical procedure. Given the ill-posedness of the calibration problem and the resulting uncertainty on the solution $\theta$, one question is the impact of this uncertainty on such model-dependent quantities. This aspect is often neglected in practice and many users of pricing models view the calibrated parameter as fixed, equating calibration with a curve-fitting exercise.

Particle methods yield, as a by-product, a way to analyze model uncertainty. While calibration algorithms based on deterministic optimization yield a point estimate for model parameters, particle methods yield a population $\mathcal{Q} = \{\mathbb{Q}_{\theta_1}, ..., \mathbb{Q}_{\theta_k}\}$ of pricing models, all of which price the benchmark options with equivalent precision $E^{\mathbb{Q}}[H_i] \in [C_i^{bid}, C_i^{ask}]$. The
[Figure 4: local volatility surface plotted against S/S0 and t]
heterogeneity of this population reflects the uncertainty in model parameters, which are left undetermined by the benchmark options. This idea can be exploited to produce a quantitative measure of model uncertainty compatible with observed market prices of benchmark instruments [14], by considering the interval of prices

$$\Big[\inf_{\mathbb{Q} \in \mathcal{Q}} E^{\mathbb{Q}}[X],\ \sup_{\mathbb{Q} \in \mathcal{Q}} E^{\mathbb{Q}}[X]\Big] \qquad (29)$$

for a payoff $X$ in the various calibrated models. Another approach is to calibrate several different models to the same data and compare the value of the exotic option across models [14, 32]. Model uncertainty in derivative pricing is further discussed in [14].

Relation with Pricing and Hedging

Calibrating a model to market prices simply ensures that model prices of benchmark instruments reflect current “mark-to-market” values. It also ensures that the cost of a static hedge (see Static Hedging) using these benchmark instruments is correctly reflected in model prices: if a payoff $H$ can be statically hedged with a portfolio containing $\alpha_i$ units of benchmark instrument $H_i$,

$$H = \alpha_0 + \sum_{i \in I} \alpha_i H_i \qquad (30)$$

the cost $\alpha_0 + \sum_{i \in I} \alpha_i C_i$ of setting up the hedge is automatically equal to the model price $E^{\mathbb{Q}}[H]$.

Calibration does not entail that prices, hedge ratios, or risk parameters generated by the model are “correct” in any sense. This requires a correct model specification with realistic dynamics for risk factors. Indeed, many different models may calibrate to the same prices of, say, a set of call options but lead to very different prices or hedge ratios for exotics [14, 32]. For example, any equity volatility smile can be reproduced by a one-factor diffusion model (see Example 1) via an appropriate specification of the local volatility surface, but there is ample evidence that volatility itself should be modeled as a risk factor (see Stochastic Volatility Models) and a one-factor diffusion may lead to an underestimation of volatility risk and unrealistic dynamics [30].

However, a model that is not calibrated to market prices of liquidly traded derivatives is typically not easy to use. For example, even if a payoff can be statically hedged with traded derivatives using an initial capital $V_0$, the model price will not be
equal to $V_0$. Thus, model prices will, in general, be inconsistent with hedging costs if the model is not calibrated. Calibration therefore seems a necessary but not sufficient condition for choosing a model for pricing and hedging.

References

[1] Achdou, Y. (2005). An inverse problem for a parabolic variational inequality arising in volatility calibration with American options, SIAM Journal on Control and Optimization 43, 1583–1615.
[2] Achdou, Y. & Pironneau, O. (2002). Volatility smile by multilevel least square, International Journal of Theoretical and Applied Finance 5(2), 619–643.
[3] d’Aspremont, A. (2005). Risk-management methods for the Libor market model using semidefinite programming, Journal of Computational Finance 8(4), 77–99.
[4] Avellaneda, M. (1998). The minimum-entropy algorithm and related methods for calibrating asset-pricing models, Proceedings of the International Congress of Mathematicians, Documenta Mathematica, Berlin, Vol. III, pp. 545–563.
[5] Avellaneda, M., Boyer-Olson, D., Busca, J. & Friz, P. (2002). Reconstructing the smile, Risk Magazine October.
[6] Avellaneda, M., Buff, R., Friedman, C., Grandchamp, N., Kruk, L. & Newman, J. (2001). Weighted Monte Carlo: a new technique for calibrating asset-pricing models, International Journal of Theoretical and Applied Finance 4, 91–119.
[7] Avellaneda, M., Friedman, C., Holmes, R. & Samperi, D. (1997). Calibrating volatility surfaces via relative entropy minimization, Applied Mathematical Finance 4, 37–64.
[8] Belomestny, D. & Reiss, M. (2006). Spectral calibration of exponential Lévy Models, Finance and Stochastics 10(4), 449–474.
[9] Ben Hamida, S. & Cont, R. (2004). Recovering volatility from option prices by evolutionary optimization, Journal of Computational Finance 8(3), 43–76.
[10] Berestycki, H., Busca, J. & Florent, I. (2004). Computing the implied volatility in stochastic volatility models, Communications on Pure and Applied Mathematics 57(10), 1352–1373.
[11] Bouchouev, I., Isakov, V. & Valdivia, N. (2002). Recovering a volatility coefficient by linearization, Quantitative Finance 2, 257–263.
[12] Carr, P., Geman, H., Madan, D.B. & Yor, M. (2004). From local volatility to local Lévy models, Quantitative Finance 4(5), 581–588.
[13] Coleman, T., Li, Y. & Verma, A. (1999). Reconstructing the unknown volatility function, Journal of Computational Finance 2(3), 77–102.
[14] Cont, R. (2006). Model uncertainty and its impact on the pricing of derivative instruments, Mathematical Finance 16(3), 519–547.
[15] Cont, R. & Deguest, R. (2009). What do index options imply about the dependence among stock returns? Columbia University Financial Engineering Report 2009-06, www.ssrn.com.
[16] Cont, R., Deguest, R. & Kan, Y.H. (2009). Default Intensities Implied by CDO Spreads: Inversion Formula and Model Calibration. Columbia University Financial Engineering Report 2009-04, www.ssrn.com.
[17] Cont, R. & Léonard, Ch. (2008). A Probabilistic Approach to Inverse Problems in Option Pricing. Working Paper.
[18] Cont, R. & Minca, A. (2008). Recovering Portfolio Default Intensities Implied by CDO Tranches. Financial Engineering Report 2008-01, Columbia University.
[19] Cont, R. & Rouis, M. (2006). Recovering Lévy Processes from Option Prices by Tikhonov Regularization. Working Paper.
[20] Cont, R. & Tankov, P. (2004). Financial Modelling with Jump Processes, Chapman and Hall/CRC Press, Boca Raton.
[21] Cont, R. & Tankov, P. (2004). Nonparametric calibration of jump-diffusion option pricing models, Journal of Computational Finance 7(3), 1–49.
[22] Cont, R. & Tankov, P. (2005). Recovering Lévy processes from option prices: regularization of an ill-posed inverse problem, SIAM Journal on Control and Optimization 45(1), 1–25.
[23] Crépey, S. (2003). Calibration of the local volatility in a trinomial tree using Tikhonov regularization, Inverse Problems 19, 91–127.
[24] Csiszár, I. (1975). I-divergence geometry of probability distributions and minimization problems, The Annals of Probability 3, 146–158.
[25] Dupire, B. (1994). Pricing with a smile, Risk 7, 18–20.
[26] Engl, H. & Egger, H. (2005). Tikhonov regularization applied to the inverse problem of option pricing: convergence analysis and rates, Inverse Problems 21, 1027–1045.
[27] Engl, H.W., Hanke, M. & Neubauer, A. (1996). Regularization of Inverse Problems, Mathematics and its Applications, Kluwer Academic Publishers, Dordrecht, The Netherlands, Vol. 375.
[28] Friz, P. & Gatheral, J. (2005). Valuing Volatility Derivatives as an Inverse Problem, Quantitative Finance, December 2005.
[29] Glasserman, P. & Yu, B. (2005). Large sample properties of weighted Monte Carlo estimators, Operations Research 53(2), 298–312.
[30] Hagan, P., Kumar, D., Lesniewski, A.S. & Woodward, D.E. Managing smile risk, Wilmott Magazine September, 84–108.
[31] Samperi, D. (2002). Calibrating a diffusion model with uncertain volatility, Mathematical Finance 12, 71–87.
[32] Schoutens, W., Simons, E. & Tistaert, J. (2004). A perfect calibration! Now what? Wilmott Magazine March.

Further Reading

Biagini, S. & Cont, R. (2006). Model-free representation of pricing rules as conditional expectations, in Stochastic Processes and Applications to Mathematical Finance, J. Akahori, S. Ogawa and S. Watanabe, eds, World Scientific, Singapore, pp. 53–66.
Harrison, J.M. & Pliska, S.R. (1981). Martingales and stochastic integrals in the theory of continuous trading, Stochastic Processes and their Applications 11, 215–260.

Related Articles

Black–Scholes Formula; Convex Duality; Dupire Equation; Entropy-based Estimation; Exponential Lévy Models; Implied Volatility in Stochastic Volatility Models; Implied Volatility: Large Strike Asymptotics; Jump Processes; Local Volatility Model; Markov Functional Models; SABR Model; Stochastic Volatility Models; Weighted Monte Carlo; Yield Curve Construction.

RAMA CONT
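As an illustration of the level-set idea from the Stochastic Algorithms and Parameter Uncertainty sections of this article, the sketch below computes the pricing error J0 of equation (27), the a priori error level δ of equation (28), keeps the candidate parameters with J0(θ) ≤ δ, and reports the price interval of equation (29) for an exotic payoff. It is only a plain filter over a fixed list of candidates, not the mutation–selection algorithm of [9]. The toy model (`toy_benchmarks`, `toy_exotic`) and all numbers are hypothetical and chosen purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def pricing_error(model_prices: Sequence[float],
                  market_prices: Sequence[float],
                  weights: Sequence[float]) -> float:
    """J0(theta) = sum_i w_i * |C_i^theta - C_i|, as in equation (27)."""
    return sum(w * abs(cm - c)
               for w, cm, c in zip(weights, model_prices, market_prices))


def error_level(bids: Sequence[float],
                asks: Sequence[float],
                weights: Sequence[float]) -> float:
    """delta = sum_i w_i * |C_i^bid - C_i^ask|, as in equation (28)."""
    return sum(w * abs(b - a) for w, b, a in zip(weights, bids, asks))


def level_set(candidates: Sequence[float],
              benchmark_prices: Callable[[float], List[float]],
              market_prices: Sequence[float],
              bids: Sequence[float],
              asks: Sequence[float],
              weights: Sequence[float]) -> List[float]:
    """Keep the candidates theta with J0(theta) <= delta."""
    delta = error_level(bids, asks, weights)
    return [theta for theta in candidates
            if pricing_error(benchmark_prices(theta), market_prices, weights) <= delta]


def price_interval(population: Sequence[float],
                   exotic_price: Callable[[float], float]) -> Tuple[float, float]:
    """Range of exotic prices over the calibrated population, as in (29)."""
    values = [exotic_price(theta) for theta in population]
    return min(values), max(values)


# Hypothetical toy model: benchmark prices are linear in theta,
# the "exotic" price is quadratic in theta.
def toy_benchmarks(theta: float) -> List[float]:
    return [10.0 * theta, 20.0 * theta]


def toy_exotic(theta: float) -> float:
    return 100.0 * theta ** 2


weights = [1.0, 1.0]
mids = [2.0, 4.0]
bids = [1.9, 3.8]
asks = [2.1, 4.2]
candidates = [0.15, 0.19, 0.20, 0.21, 0.25]

calibrated = level_set(candidates, toy_benchmarks, mids, bids, asks, weights)
low, high = price_interval(calibrated, toy_exotic)
print(calibrated, low, high)
```

In this toy setting three of the five candidates fall inside the level set, and the spread between `low` and `high` is the model uncertainty on the exotic that the benchmark prices leave undetermined.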
Dupire Equation

The Dupire equation is a partial differential equation (PDE) that links the contemporaneous prices of European call options of all strikes and maturities to the instantaneous volatility of the price process, assumed to be a function of price and time only. The main application of the equation is to compute (i.e., invert) local volatilities from market option prices to build a local volatility model, which many major banks currently use for option pricing.

If we assume that the price process $S$ follows the stochastic differential equation

$$\frac{dS_t}{S_t} = \mu_t\, dt + \sigma(S_t, t)\, dW_t \qquad (1)$$

and if $C(S, t, K, T)$ denotes the price at time $t$, for an underlying price of $S$, of the European call of strike $K$ and maturity $T$ that pays $(S_T - K)^+$ at time $T$, then $C$ satisfies, for a fixed $(S, t)$, the Dupire equation,

$$\frac{\partial C}{\partial T} = \frac{\sigma^2(K, T)}{2} K^2 \frac{\partial^2 C}{\partial K^2} - (r - q) K \frac{\partial C}{\partial K} - qC \qquad (2)$$

where $r$ is the interest rate, $q$ is the dividend yield (or foreign interest rate in the case of a currency), and the boundary condition at $T = t$ is given by $C(S, t, K, t) = (S - K)^+$.

This can be established by a variety of methods, including double integration of the Fokker–Planck equation, the Tanaka formula, and a replication strategy.

It is commonly named the forward equation, as it indicates how current call prices are affected by an increase in maturity. This can be contrasted with the classical backward Black–Scholes PDE that applies to a European call of fixed strike and maturity:

$$\frac{\partial C}{\partial t} = -\frac{\sigma^2(S, t)}{2} S^2 \frac{\partial^2 C}{\partial S^2} - (r - q) S \frac{\partial C}{\partial S} + rC \qquad (3)$$

call price an instant later and the Jensen convexity bias θ depends on.

According to the forward Dupire equation, the cost of extending the maturity of a call depends on the probability of being at the strike at maturity and on the level of volatility there. It can be seen as relating the price of a calendar spread to the price of a butterfly spread.

Uses

The equation

$$\frac{\partial C}{\partial T} = \frac{\sigma^2(K, T)}{2} K^2 \frac{\partial^2 C}{\partial K^2} - (r - q) K \frac{\partial C}{\partial K} - qC \qquad (4)$$

can be used in the following two ways:

1. If the local volatility $\sigma(S, t)$ is known, the PDE can be used to compute the price today of all call options in a single sweep, starting from the boundary condition $C(S, t, K, t) = (S - K)^+$. In contrast, the Black–Scholes backward equation requires one PDE for each strike and maturity.
In the case of calibrating a parametric form of $\sigma(S, t)$ to a set of market option prices, one needs to compute the model price of all these options and the forward equation can accelerate the computation by a factor of 100.
2. If the call prices are known today, one can compute their derivatives and extract the local volatility by the following formula:

$$\sigma(K, T) = \sqrt{2\, \frac{\dfrac{\partial C}{\partial T} + (r - q) K \dfrac{\partial C}{\partial K} + qC}{K^2 \dfrac{\partial^2 C}{\partial K^2}}} \qquad (5)$$

This equation is also known as the stripping formula.
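The "single sweep" use of the forward equation can be sketched with a basic explicit finite-difference march in maturity. The example below is a minimal illustration, assuming r = q = 0 (so that only the diffusion term of the Dupire equation remains), a uniform strike grid, and a constant local volatility so that the result can be checked against the Black–Scholes formula. The function names, grid sizes, and parameters are hypothetical choices for this sketch, not part of the original article.

```python
import math


def bs_call(s, k, sigma, tau):
    """Black-Scholes call price with r = q = 0 (used only as a benchmark)."""
    if tau <= 0.0 or k <= 0.0:
        return max(s - k, 0.0)
    vol = sigma * math.sqrt(tau)
    d1 = math.log(s / k) / vol + 0.5 * vol
    d2 = d1 - vol

    def norm_cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s * norm_cdf(d1) - k * norm_cdf(d2)


def dupire_forward_sweep(s0, local_vol, k_max, n_k, t_max, n_t):
    """Explicit march of dC/dT = 0.5 * sigma(K, T)^2 * K^2 * d2C/dK2 (r = q = 0).

    Returns the strike grid and the call prices for maturity t_max.
    The explicit scheme needs 0.5 * sigma^2 * K^2 * dt / dk^2 <= 0.5 on the grid.
    """
    dk = k_max / n_k
    dt = t_max / n_t
    strikes = [i * dk for i in range(n_k + 1)]
    # Initial condition at zero maturity: intrinsic value (S - K)^+.
    calls = [max(s0 - k, 0.0) for k in strikes]
    for step in range(n_t):
        t = step * dt
        new_calls = calls[:]  # boundary values at K = 0 and K = k_max are kept
        for i in range(1, n_k):
            k = strikes[i]
            d2c_dk2 = (calls[i + 1] - 2.0 * calls[i] + calls[i - 1]) / dk ** 2
            new_calls[i] = calls[i] + dt * 0.5 * local_vol(k, t) ** 2 * k ** 2 * d2c_dk2
        calls = new_calls
    return strikes, calls


# One sweep prices every strike on the grid for maturity 0.5.
strikes, calls = dupire_forward_sweep(
    s0=100.0, local_vol=lambda k, t: 0.2,
    k_max=300.0, n_k=150, t_max=0.5, n_t=1000)

for strike in (90.0, 100.0, 110.0):
    i = strikes.index(strike)
    print(strike, round(calls[i], 3), round(bs_call(100.0, strike, 0.2, 0.5), 3))
```

With a constant local volatility the grid values should agree with the Black–Scholes prices up to discretization error. An implicit or Crank–Nicolson scheme would normally be preferred in practice, since the explicit scheme is only conditionally stable.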
[Diagram labels: European prices, Pricing, Calibration, Local volatilities, Pricing, Exotic prices]
Figure 2 Local volatility surface of the NASDAQ
prices and then applying a nonparametric interpolation to the residuals.

Figure 1 displays the implied volatility surface of NASDAQ, and the associated local volatility surface is shown in Figure 2.

Once the local volatilities are obtained, one can price nonvanilla instruments with this calibrated local volatility model (Figure 3).

Properly accounting for the market skew can have a massive impact on the price of exotics. For instance, an up-and-out call option has a positive gamma close to the strike and a negative gamma close to the barrier. A typical equity negative skew corresponds to high local volatilities close to the strike, which adds value to the option due to the positive gamma, and low local volatilities close to the barrier, which is also beneficial to the option holder as the gamma is negative there. The combined effect is that the local volatility price of the up-and-out call may exceed its price in any Black–Scholes model, irrespective of the volatility input used (Figure 4).

Figure 4 Comparison of an up-and-out call option in the local volatility model and in Black–Scholes model with various volatilities (axis label: Implied volatility)

Local Volatilities as Forward Volatilities

The most common interpretation of local volatility is that it is the instantaneous volatility as a certain function of spot price and time that fits market prices. It gives the simplest model calibrated to the market but assumes a deterministic behavior of instantaneous
volatility, a fact crudely belied by the market. As such, the local volatility model is an important step away from the Black–Scholes model, which assumes constant volatility, though it may not necessarily provide the most realistic dynamics for the price.

The second interpretation, as forward volatilities, is far more patent. More precisely, the square of the local volatility, the local variance, is the instantaneous forward variance conditional on the spot price being equal to the strike at maturity:

$$\sigma^2(K, T) = E[\sigma_T^2 \mid S_T = K] \qquad (6)$$

This means that in a frictionless market where all strikes and maturities are available, it is possible to combine options into a portfolio that will lock these forward values. In other words, the local variance is not only a function calibrated to the market that allows one to retrieve market prices but it is also the fair value of the fixed leg of a swap with a floating leg equal to the instantaneous variance at time $T$, with the exchange taking place only if the price at maturity is $K$. It can be seen as an infinitesimal forward corridor variance swap.

By way of consequence, if one disagrees with the forward variance, one can put on a trade (in essence calendar spread against butterfly spread) aligned with this view. Conversely, if one has no view but finds someone who disagrees with the forward view and agrees to trade at a different level, one can lock the difference.

Another important consequence of this relationship is that a stochastic volatility model (with no jumps) will be calibrated to the market if and only if the conditional expectation of the instantaneous variance is the local variance computed from the market prices. In essence, it means that a calibrated stochastic volatility model is a noisy version of the local volatility model, which is centered on it. In this sense, the local volatility model plays a central role.

Beyond the fit to the current market prices, these results have dynamic consequences. For example, they imply that in the absence of jumps, the at-the-money (ATM) implied volatility converges to the instantaneous volatility when the maturity shrinks to 0. The same relation indicates that for any stochastic volatility model calibrated to the market, the average level of the short-term ATM implied variance at any time in the future, conditioned on a price level, has to equal the local variance, which is dictated by the current market prices of calls and puts. Fitting to today’s market strongly constrains future dynamics and, for instance, the backbone, defined as the behavior of the at-the-money volatility as a function of the underlying price, cannot be independently specified.

Once we get a perfect fit of option prices using equation (5), we can perturb the volatility surface, recalibrate, and conduct a sensitivity analysis. This provides a decomposition of the volatility risk of any structured product (or portfolio of products) across strikes and maturities, because seeing the price as a function of the whole volatility surface provides, through perturbation analysis, the sensitivity to all volatilities.

Extensions

There are numerous extensions of the forward PDE, with stochastic rates and dividends, stochastic volatility, jumps, to the Greeks (sensitivities) and to other products than European options, such as barrier options, compound options, Asian options, and basket options. However, until now, there is no satisfactory counterpart for American options.

Further Reading

Derman, E. & Kani, I. (1994). Riding on a smile, Risk 7(2), 32–39, 139–145.
Dupire, B. (1993). Model art, Risk 6(9), 118–124.
Dupire, B. (1994). Pricing with a smile, Risk 7, 18–20.
Dupire, B. (1997). Pricing and hedging with smiles, in Mathematics of Derivative Securities, M.A.H. Dempster & S.R. Pliska, eds, Cambridge University Press.
Dupire, B. (2004). A unified theory of volatility, working paper Paribas capital markets 1996, reprinted in Derivatives Pricing: The Classic Collection, P. Carr, ed., Risk Books, London.

Related Articles

Implied Volatility Surface; Local Times; Local Volatility Model; Markov Processes; Model Calibration.

BRUNO DUPIRE
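The reading of the local variance as a calendar spread priced against a butterfly spread can be made concrete with finite differences. The sketch below is a minimal illustration of the stripping formula, equation (5) of this article, assuming r = q = 0, so that σ²(K, T) = 2 (∂C/∂T) / (K² ∂²C/∂K²). Call prices are generated from a flat Black–Scholes volatility so that the recovered local volatility can be checked. The function names and step sizes are hypothetical choices for this sketch.

```python
import math


def bs_call(s, k, sigma, tau):
    """Black-Scholes call price with r = q = 0 (stands in for market quotes)."""
    vol = sigma * math.sqrt(tau)
    d1 = math.log(s / k) / vol + 0.5 * vol
    d2 = d1 - vol

    def norm_cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s * norm_cdf(d1) - k * norm_cdf(d2)


def local_vol_from_calls(call, k, t, dk, dt):
    """Stripping formula with r = q = 0, using finite differences.

    `call(k, t)` returns the call price for strike k and maturity t.
    """
    # Calendar spread per unit of maturity: approximates dC/dT.
    calendar = (call(k, t + dt) - call(k, t - dt)) / (2.0 * dt)
    # Butterfly spread per unit of squared strike spacing: approximates d2C/dK2.
    butterfly = (call(k + dk, t) - 2.0 * call(k, t) + call(k - dk, t)) / dk ** 2
    return math.sqrt(2.0 * calendar / (k ** 2 * butterfly))


def flat_surface(k, t):
    # Hypothetical "market": spot 100, flat 25% implied volatility.
    return bs_call(100.0, k, 0.25, t)


stripped = local_vol_from_calls(flat_surface, k=90.0, t=1.0, dk=0.5, dt=0.01)
print(round(stripped, 4))
```

With real quotes the second strike derivative is noisy and can be close to zero in the wings, which is why smoothing or a parametric fit of the price or implied volatility surface usually precedes this step.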
Implied Volatility Surface

The widespread practice of quoting option prices in terms of their Black–Scholes implied volatilities (IVs) in no way implies that market participants believe underlying returns to be lognormal. On the contrary, the variation of IVs across option strike and term to maturity, which is widely referred to as the volatility surface, can be substantial. In this article, we highlight some empirical observations that are most relevant for the construction and validation of realistic models of the volatility surface for equity indices.

The Shape of the Volatility Surface

Ever since the 1987 stock market crash, volatility surfaces for global indices have been characterized by the volatility skew: For a given expiration date, IVs increase as strike price decreases for strikes below the current stock price (spot) or current forward price. This tendency can be seen clearly in the S&P500 volatility surface shown in Figure 1. For short-dated expirations, the cross section of IVs as a function of strike is roughly V-shaped, but has a rounded vertex and is slightly tilted. Generally, this V-shape softens and becomes flatter for longer dated expirations, but the vertex itself may rise or fall depending on whether the term structure (TS) of at-the-money (ATM) volatility is upward or downward sloping.

Conventional explanations for the volatility skew include the following:

• The leverage effect: Stocks tend to be more volatile at lower prices than at higher prices.
• Volatility moves and spot moves are anticorrelated.
• Big jumps in spot tend to be downward rather than upward.
• The risk of default: There is a nonzero probability for the price of a stock to collapse if the issuer defaults.
• Supply and demand: Investors are net long of stock and so tend to be net buyers of downside puts and sellers of upside calls.

The volatility skew probably reflects all of these factors.

Conventional stochastic volatility (SV) models imply a relationship between the assumed dynamics of the instantaneous volatility and the volatility skew (see Chapter 8 of [8]). Empirically, volatility is well known to be roughly lognormally distributed [1, 4] and in this case, the derivative of IV with respect to log-strike in an SV model is approximately independent of volatility [8]. This motivates a simple measure of skew: For a given term to expiration, the “95–105” skew is simply the difference between the IVs at strikes of 95% and 105% of the forward price. Figure 2 shows the historical variation of this measure as a function of term to expiration as calculated from end-of-day SPX volatility surfaces generated from listed options prices between January 2, 2001 and February 6, 2009. To fairly compare across different dates and over all volatility levels, all volatilities for a given date are scaled uniformly to ensure that the one-year at-the-money-forward (ATMF) volatility equals its historical median value over this period (18.80%). The skews for all listed expirations are binned by their term to expiration; the median value for each five-day bin is plotted along with fits to both $1/\sqrt{T}$ and the best-fitting power-law dependence on $T$.

The important conclusion to draw here is that the TS of skew is approximately consistent with square-root (or at least power-law) decay. Moreover, this rough relationship continues to hold for longer expirations that are typically traded over-the-counter (OTC).

Significantly, this empirically observed TS of the volatility skew is inconsistent with the $1/T$ dependence for longer expirations typical of popular one-factor SV models (see Chapter 7 of [8] for example): Jumps affect only short-term volatility skews, so adding jumps does not resolve this disagreement between theory and observation. Introducing more volatility factors with different timescales [3] does help but does not entirely eliminate the problem. Market models of IV (see Implied Volatility: Market Models) obviously fit the TS of skew by construction, but such models are, in general, time inhomogeneous and, in any case, have not so far proven to be tractable. In summary, fitting the TS of skew remains an important and elusive benchmark by which to gauge models of the volatility surface.
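The “95–105” skew measure and the power-law fit of its term structure described in this section can be sketched as follows. This is a minimal illustration on synthetic inputs: the smile function, the term grid, and the generating power law are hypothetical, and an ordinary least-squares fit in log-log coordinates stands in for whatever fitting procedure was used for Figure 2.

```python
import math


def skew_95_105(iv, forward):
    """Difference between the IVs at strikes of 95% and 105% of the forward.

    `iv(strike)` returns the implied volatility for one expiration.
    """
    return iv(0.95 * forward) - iv(1.05 * forward)


def fit_power_law(terms, skews):
    """Least-squares fit of log(skew) = log(a) - b * log(T); returns (a, b)."""
    xs = [math.log(t) for t in terms]
    ys = [math.log(s) for s in skews]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope


# Hypothetical smile, linear in log-strike, for a single expiration.
def toy_smile(strike):
    return 0.20 - 0.5 * math.log(strike / 100.0)


one_skew = skew_95_105(toy_smile, forward=100.0)

# Synthetic term structure of skew that decays like 1 / sqrt(T).
terms = [0.1, 0.25, 0.5, 1.0, 2.0]
skews = [0.04 / math.sqrt(t) for t in terms]
a, b = fit_power_law(terms, skews)
print(round(one_skew, 5), round(a, 4), round(b, 4))
```

An exponent `b` near 0.5 corresponds to the square-root decay discussed above, while one-factor SV models would imply an exponent near 1 at long expirations.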
together with the fact that index options in most mar- [...]
expiration dates, might lead one to expect the dynam- [...]
Table 1 PCA studies of the volatility surface. GS, Goldman Sachs study [9] and ML, Merrill Lynch proprietary data

Source | Market | Top 3 modes | Var. explained by first mode (%) | Var. explained by top 3 (%) | Correlation of 3 modes with spot
GS | S&P500, weekly, 1994–1997 | Level, TS, skew | 81.6 | 90.7 | −0.61, −0.07, 0.07
GS | Nikkei, daily, 1994–1997 | Level, TS, skew | 85.6 | 95.9 | −0.67, −0.05, 0.04
Cont et al. | S&P500, daily, 2000–2001 | Level, skew, curvature | 94 | 97.8 | −0.66, ∼0, 0.27
Cont et al. | FTSE100, daily, 1999–2001 | Level, skew, curvature | 96 | 98.8 | −0.70, 0.08, 0.7
Daglish et al. | S&P500, monthly, 1998–2002 | Level, TS, skew | 92.6 | 99.3 | n.a.
ML | S&P500, daily, 2001–2009 | Level, TS, skew | 95.3 | 98.2 | −0.87, −0.11, ∼0
[Figure 3 panels (a)-(c): Vol plotted against normalized strike and term (days)]
Figure 3 PCA modes for Merrill Lynch S&P500 volatility surfaces: (a) level; (b) term structure; and (c) skew
from those above and where the overall magnitude is attenuated as term increases (Figure 3c).

It is also worth noting that the two studies [5, 9] that looked at two different markets during comparable periods found very similar patterns of variation; the modes and their relative importance were very similar, suggesting strong global correlation across index volatility markets.

In the study by Cont and da Fonseca [5], a TS mode does not figure in the top three modes; instead a skew mode and another strike-related mode related to the curvature emerge as number two and three. This likely reflects the atypically low variation in TS over the historical sample period and is not due to any methodological differences with the other studies. As in the other studies, the patterns of variation were very similar across markets (S&P500 and FTSE100).

Table 2 Historical estimates of βT

T (days) | βT | Standard error | R²
30 | 1.55 | (0.02) | 0.774
91 | 1.50 | (0.02) | 0.825
182 | 1.48 | (0.02) | 0.818
365 | 1.49 | (0.02) | 0.791

Changes in Spot and Volatility are Negatively Correlated

Perhaps the sturdiest empirical observation of all is simply that changes in spot and changes in volatility (by pretty much any measure) are negatively and strongly correlated. From the results that we have surveyed here, this can be inferred from the high R² obtained in the regressions of daily ATMF volatility changes shown in Table 2, as well as directly from the correlations between spot return and PCA modes shown in Table 1. It is striking that the correlation between the level mode and spot return is consistently high across studies, ranging from −0.61 to −0.87. Correlation between the spot return and the other modes is significantly weaker and less stable.

This high correlation between spot returns and changes in volatility persists even in the most extreme market conditions. For example, during the turbulent period following the collapse of Lehman Brothers in September 2008, which was characterized by both high volatility and high volatility of volatility, spot-volatility correlation remained at historically high levels: −0.92 for daily changes between September 15, 2008 and December 31, 2008. On the other hand, the skew mode, which is essentially uncorrelated with spot return in the full historical period (see Table 1), did exhibit stronger correlation in this period (−0.55), while the TS mode did not. These observations underscore the robustness of the level-spot correlation as well as the time-varying nature of correlations between spot returns and the other modes of fluctuation of the volatility surface.

Other studies have also commented on the robustness of the spot-volatility correlation. For example, using a maximum-likelihood technique, Aït-Sahalia and Kimmel [1] carefully estimated the parameters of the Heston, CEV, and GARCH models from S&P500 and VIX data between January 2, 1990 and September 30, 2003; the correlation between spot and volatility changes varied little between these models and various estimation techniques, and all estimates were around −0.76 for the period studied.

A related question that was studied by Bouchaud et al. [4] is whether spot changes drive realized
stock prices. Moreover, unlike the decay of the IV correlation function itself, which is power-law with an exponent of around 0.3 for SPX, the decay of the spot-volatility correlation function is exponential with a short half-life of a few days. Supposing the general level of IV (the variation of which accounts for most of the variation of the volatility surface) to be highly correlated with realized volatility, these results also apply to the dynamics of the IV surface. Under diffusion assumptions, the relationship between implied and realized volatility is even more direct: instantaneous volatility is given by the IV of the ATM option with zero time to expiration.

[Figure 4: scatterplot of dVol (vertical axis) against Skew*dS/S (horizontal axis)]
Figure 4 Regression of 91-day volatility changes versus spot returns. A zero-intercept least squares fit to model (1) leads to $\beta_{91} = 1.50$ (solid lines). The $\beta = 1$ (“sticky-strike”) prediction (dashed line) clearly does not fit

Skew Relates Statics to Dynamics

Volatility changes are related to changes in spot: as mentioned earlier, volatility and spot tend to move in opposite directions, and large moves in volatility tend to follow large moves in the spot.

It is reasonable to expect skew to play a role in relating the magnitudes of these changes. For example, if all the variation in ATMF volatility were explained simply by movement along a surface that is unchanged as a function of strike when spot changes, then we would expect

\[ \Delta\sigma_{\mathrm{ATMF}}(T) = \beta_T \, \frac{d\sigma}{d(\log K)} \, \frac{\Delta S}{S} \tag{1} \]

with $\beta_T = 1$ for all terms to expiration ($T$).

The empirical estimates of $\beta_T$ shown in Table 2 are based on the daily changes in S&P500 ATMF volatilities from January 2, 2001 to February 6, 2009 (volatilities tied to fixed expiration dates are interpolated to arrive at volatilities for a fixed number of days to expiration). Two important conclusions may be drawn: (i) $\beta$ is not 1.0; rather, it is closer to 1.5, and (ii) remarkably, $\beta_T$ does not change appreciably by expiration. In other words, although the volatility skew systematically underestimates the daily change in volatility, it does so by roughly the same factor for all maturities. It is also worth noting that the hypothesis $\beta_T = 1$ would be rejected even if the regression were restricted to spot returns of smaller magnitude, as suggested visually by the scatterplots of Figure 4.

Although empirical relationships between changes in ATMF volatility and changes in spot are clearly relevant to volatility trading and risk management, the magnitude of $\beta_T$ itself has direct implications for volatility modeling as well. In both local and SV models, $\beta_T \to 2$ in the short-expiration limit. Under SV, $\beta_T$ is typically a decreasing function of $T$, whereas under the local volatility assumption where the local volatility surface is fixed with respect to a given level of the underlying, $\beta_T$ is typically an increasing function of $T$.

Market participants often adopt a phenomenological approach and characterize surface dynamics as following one of these rules: “sticky strike,” “sticky delta,” or “local volatility”; each rule has an associated value of $\beta_T$. Under the sticky-strike assumption, $\beta_T = 1$ and the volatility surface is fixed by strike; under the sticky-delta assumption, $\beta_T = 0$ and the volatility surface is a fixed function of $K/S$; and under the local volatility assumption, as mentioned earlier, $\beta_T = 2$ for short expirations.

Neither “sticky-strike” nor “sticky-delta” rules imply reasonable dynamics [2]: In a sticky-delta model, the log of the spot has independent increments and the only arbitrage-free sticky-strike model is Black–Scholes (where there is no smile).
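As a rough illustration of the zero-intercept least squares fit to model (1), the sketch below estimates β from paired observations of skew-scaled spot returns and ATMF volatility changes. The data are synthetic, and the names `estimate_beta`, `skew_times_return` and `dvol` are assumptions made for this example; it is not the authors' estimation procedure or dataset.

```python
import random


def estimate_beta(skew_times_return, dvol):
    """Zero-intercept least squares slope: beta = sum(x*y) / sum(x*x)."""
    sxx = sum(x * x for x in skew_times_return)
    sxy = sum(x * y for x, y in zip(skew_times_return, dvol))
    return sxy / sxx


# Synthetic data (assumed numbers): a "true" beta of 1.5 plus small noise.
rng = random.Random(0)
x = [rng.gauss(0.0, 0.01) for _ in range(2000)]      # skew * dS/S
y = [1.5 * xi + rng.gauss(0.0, 0.002) for xi in x]   # change in ATMF vol
beta_hat = estimate_beta(x, y)
```

Under the sticky-strike rule the fitted slope would be close to 1; a value near 1.5, as in this synthetic sample, corresponds to the empirical finding quoted in the text.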
Although the estimates of βT in Table 2 are all around 1.5, consistent with SV, this does not exclude the possibility that there may be periods where the βT may substantially depart from these average values. Derman [7] identified seven distinct regimes for S&P500 daily volatility changes between September 1997 and November 1998, finding evidence for all three of the alternatives listed above. A subsequent study [6] looked at S&P500 monthly data between June 1998 and April 2002 (47 points) and found that for that period, the data were much more consistent with the sticky-delta rule than with the sticky-strike rule.

References

[1] Aït-Sahalia, Y. & Kimmel, R. (2007). Maximum likelihood estimation of stochastic volatility models, Journal of Financial Economics 83, 413–452.
[2] Balland, P. (2002). Deterministic implied volatility models, Quantitative Finance 2, 31–44.
[3] Bergomi, L. (2008). Smile dynamics III, Risk 21, 90–96.
[4] Bouchaud, J.-P. & Potters, M. (2003). Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management, Cambridge University Press, Cambridge.
[5] Cont, R. & Fonseca, J. (2002). Dynamics of implied volatility surfaces, Quantitative Finance 2, 45–60.
[6] Daglish, T., Hull, J. & Suo, W. (2007). Volatility surfaces: theory, rules of thumb, and empirical evidence, Quantitative Finance 7, 507–524.
[7] Derman, E. (1999). Regimes of Volatility, Risk 12, 55–59.
[8] Gatheral, J. (2006). The Volatility Surface, John Wiley & Sons, Hoboken.
[9] Kamal, M. & Derman, E. (1997). The patterns of change in implied index volatilities, in Goldman Sachs Quantitative Research Notes, Goldman Sachs, New York.

Related Articles

Black–Scholes Formula; Implied Volatility in Stochastic Volatility Models; Implied Volatility: Large Strike Asymptotics; Implied Volatility: Long Maturity Behavior; Implied Volatility: Market Models; SABR Model.

MICHAEL KAMAL & JIM GATHERAL
Moment Explosions

Let $(S_t, V_t)_{t\ge0}$ be a Markov process, representing a (not necessarily purely continuous) stochastic volatility model. $(S_t)_{t\ge0}$ is the (discounted) price of a traded asset, such as a stock, and $(V_t)_{t\ge0}$ represents a latent factor, such as stochastic volatility, stochastic variance, or the stochastic arrival rate of jumps. A moment explosion takes place, if the moment $\mathbb{E}[S_t^u]$ of some given order $u \in \mathbb{R}$ becomes infinite (“explodes”) after some finite time $T_*(u)$. This time is called the time of moment explosion and formally defined by

\[ T_*(u) = \sup\{\, t \ge 0 : \mathbb{E}[S_t^u] < \infty \,\} \tag{1} \]

We say that no moment explosion takes place for some given order $u$, if $T_*(u) = \infty$.

Moment explosions can be considered both under the physical and the pricing measure, with most applications belonging to the latter. If $(S_t)_{t\ge0}$ is a martingale, then Jensen’s inequality implies that moment explosions can only occur for moments of order $u \in \mathbb{R} \setminus [0, 1]$.

Conceptually, the notion of a moment explosion has to be distinguished from an explosion of the process itself, which refers to the situation that the process $(S_t)_{t\ge0}$, not one of its moments, becomes infinite with some positive probability.

Applications

In equity and foreign exchange models, where $(S_t)_{t\ge0}$ represents a stock price or an exchange rate, moment explosions are closely related to the shape of the implied volatility surface, and can be used to obtain approximations for the implied volatility of deep in-the-money and out-of-the-money options (see Implied Volatility: Large Strike Asymptotics, and the references therein). According to [5, 14], the asymptotic shape of the implied volatility surface for some fixed maturity $T$ is determined by the smallest and largest moment of $S_T$ that is still finite. These critical moments $u_-(T)$ and $u_+(T)$ are the piecewise inverse functions (end note a) of the moment explosion time. Often the explosion time is easier to calculate, so a feasible approach is to first calculate explosion times, and then to invert to obtain the critical moments. Let us note that finite critical moments of the underlying $S_T$ correspond, in essence, to exponential tails of $\log(S_T)$. There is evidence that refined knowledge of how moment explosion occurs (or the asymptotic behavior of $u \mapsto \mathbb{E}[S_T^u]$ in the case of nonexplosion) can lead to refined results about implied volatility, see [6, 11] for some examples of stochastic alpha beta rho (SABR) type.

In fixed-income markets $(S_t)_{t\ge0}$ might represent a forward LIBOR rate or swap rate. Andersen and Piterbarg [2] give examples of derivatives with superlinear payoff, whose pricing involves calculation of the second moment of $S_T$. It is clear that an explosion of the second moment will lead to infinite prices of such derivatives.

For numerical procedures, such as discretization schemes for stochastic differential equations (SDEs), error estimates that depend on higher order moments of the approximated process may break down if moment explosions occur [1]. Moment explosions may also lead to infinite expected utility in utility maximization problems [12].

Moment Explosions in the Black–Scholes and Exponential Lévy Models

In the Black–Scholes model, moment explosions never occur, since moments of all orders exist for all times. In an exponential Lévy model (see Exponential Lévy Models), $S_t$ is given by $S_t = S_0 \exp(X_t)$, where $X_t$ is a Lévy process. It holds that $\mathbb{E}[S_t^u] = e^{t\kappa(u)}$, where $\kappa(u)$ is the cumulant-generating function (cgf) of $X_1$. Thus in an exponential Lévy model, the time of moment explosion is given by

\[ T_*(u) = \begin{cases} +\infty & \kappa(u) < \infty \\ 0 & \kappa(u) = \infty \end{cases} \tag{2} \]

Let us remark that, from Theorem 25.3 in [16], $\kappa(u) < \infty$ iff $\int e^{ux} \mathbf{1}_{|x|>1}\, \nu(dx) < \infty$ where $\nu(dx)$ denotes the Lévy measure of $X$.

Moment Explosions in the Heston Model

The situation becomes more interesting in a stochastic volatility model, like the Heston model (see Heston
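\[ {} + \lambda(\theta - v)\frac{\partial}{\partial v} + \rho\eta v \frac{\partial^2}{\partial x\,\partial v} \tag{4} \]

Note that $(X_t, V_t)_{t\ge0}$ has affine structure in the sense that the coefficients of $L$ are affine linear in the state variables (end note b).

Now

\[ \mathbb{E}\left[ e^{uX_T} \mid X_t = x, V_t = v \right] = e^{ux}\, \mathbb{E}\left[ e^{uX_T} \mid X_t = 0, V_t = v \right] \tag{5} \]

satisfies, as a function of $(t, x, v)$, the backward equation $\partial_t + L = 0$ with terminal data $e^{ux}$, and after replacing $T - t$ with $t$ we can rewrite this as an initial value problem. Indeed, setting $f = f(t, v; u) := \mathbb{E}\left[ e^{uX_t} \mid X_0 = 0, V_0 = v \right]$, and noting that

\[ \left( \frac{\partial^2}{\partial x^2} - \frac{\partial}{\partial x} \right) e^{ux} f = e^{ux} (u^2 - u) f \quad\text{and}\quad \frac{\partial^2}{\partial x\,\partial v}\, e^{ux} f = e^{ux}\, u\, \frac{\partial f}{\partial v} \tag{6} \]

\[ \frac{\partial}{\partial t}\phi(t, u) = F(u, \psi(t, u)), \qquad \phi(0, u) = 0 \tag{8} \]

\[ \frac{\partial}{\partial t}\psi(t, u) = R(u, \psi(t, u)), \qquad \psi(0, u) = 0 \tag{9} \]

where $F(u, w) = \lambda\theta w$ and $R(u, w) = \frac{w^2}{2}\eta^2 + (\rho\eta u - \lambda) w + \frac{1}{2}(u^2 - u)$. Equation (9) is a Riccati differential equation, whose solution blows up at finite time, corresponding to the moment explosion of $S_t$. Explicit calculations ([2], for instance) yield (end note c)

\[ T_*^{\mathrm{Heston}}(u) = \begin{cases} +\infty & \text{if } \Delta(u) \ge 0,\ \chi(u) < 0 \\[4pt] \dfrac{1}{\sqrt{\Delta(u)}} \log\left( \dfrac{\chi(u) + \sqrt{\Delta(u)}}{\chi(u) - \sqrt{\Delta(u)}} \right) & \text{if } \Delta(u) \ge 0,\ \chi(u) > 0 \\[8pt] \dfrac{2}{\sqrt{-\Delta(u)}} \left( \arctan\dfrac{\sqrt{-\Delta(u)}}{\chi(u)} + \pi \mathbf{1}_{\{\chi(u) < 0\}} \right) & \text{if } \Delta(u) < 0 \end{cases} \tag{10} \]

where $\chi(u) = \rho\eta u - \lambda$ and $\Delta(u) = \chi(u)^2 - \eta^2 (u^2 - u)$. A simple analysis of this condition (cf. [2]) then allows one to express the no-explosion condition in terms of the correlation parameter $\rho$. With focus on positive moments of the underlying, $u \ge 1$, we have

\[ T_*^{\mathrm{Heston}}(u) = +\infty \iff \rho \le -\sqrt{\frac{u - 1}{u}} + \frac{\lambda}{\eta u} \tag{11} \]

Similar results for a class of nonaffine stochastic volatility models are discussed below.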
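The explosion time in equation (10) and the no-explosion condition (11) are straightforward to evaluate numerically. The following is a minimal sketch, assuming η > 0 and a martingale price process; the function names and the parameter values in the usage note are illustrative assumptions, not taken from [2].

```python
import math


def heston_explosion_time(u, lam, eta, rho):
    """Time of moment explosion from equation (10); math.inf means none.

    Assumes eta > 0. For u in [0, 1] no explosion occurs (Jensen's
    inequality for the martingale S), so only u outside [0, 1] is computed.
    """
    if 0.0 <= u <= 1.0:
        return math.inf
    chi = rho * eta * u - lam
    delta = chi * chi - eta * eta * (u * u - u)
    if delta >= 0.0:
        if chi < 0.0:
            return math.inf
        if delta == 0.0:
            return 2.0 / chi  # limit of the logarithmic branch as delta -> 0
        root = math.sqrt(delta)
        return math.log((chi + root) / (chi - root)) / root
    root = math.sqrt(-delta)
    # For root > 0, atan2(root, chi) equals arctan(root/chi) + pi*1{chi<0},
    # and it also handles chi == 0 without a division by zero.
    return 2.0 / root * math.atan2(root, chi)


def no_explosion_by_rho(u, lam, eta, rho):
    """Right-hand side test of condition (11), intended for u > 1."""
    return rho <= -math.sqrt((u - 1.0) / u) + lam / (eta * u)
```

For example, with the assumed values λ = 0.5, η = 1, ρ = 0.5, the fifth moment has χ = 2 and Δ = −16, so the third branch applies and `heston_explosion_time(5, 0.5, 1.0, 0.5)` evaluates to arctan(2)/2; with λ = 1, η = 0.5, ρ = −0.7 the second moment never explodes.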
the mean-reversion term $-\lambda(V_t - \theta)\,dt$ has been replaced by the more general $b(V_t)\,dt$. With $\lambda$ replaced by $\lim_{v\to\infty} -b(v)/v$ the formula (10) remains valid. If $\gamma = \delta$, then the model can be transformed into a Heston-like model by the change of variables $\tilde V_t := V_t^{2\delta}$. The time of moment explosion $T_*(u)$ can be related to the expression in equation (10), by

\[ T_*(u) = \frac{1}{2\delta}\, T_*^{\mathrm{Heston}}(u) \tag{21} \]

affine ansatz $f(t, v; u) = \exp(\phi(t, u) + v\psi(t, u))$ still reduces the Kolmogorov equation to ordinary differential equations of the type equation (8). The functions $F(u, w)$ and $R(u, w)$ are no longer quadratic polynomials, but of Lévy–Khintchine form (see Infinite Divisibility). The time of moment explosion can be determined by calculating the blow-up time for the solutions of these generalized Riccati equations. This approach can be applied to a Heston model with an additional jump term:
by the martingale condition for $(S_t)_{t\ge0}$. The time of moment explosion can be calculated [13] and is given by

(C4) $B = \begin{pmatrix} I & B^c \\ 0 & 0 \end{pmatrix}$, and $b = (b^v, b^d)$ with $b^v = 0$ and $b^d = (1, \ldots, 1)$.
of $Y_t$ are represented by the transform formula

\[ \mathbb{E}[\exp(2u \cdot Y_t)] = \exp\left( -2\int_0^t A x(s)\,ds + 2\int_0^t |x^d(s)|^2\,ds + 2x(t) \cdot Y_0 \right) \tag{33} \]

where $x(t)$ is a solution to the coupled system of Riccati equations, given by

\[ \begin{pmatrix} \dot x_1(t) \\ \vdots \\ \dot x_n(t) \end{pmatrix} = \begin{pmatrix} A^v & A^c \\ 0 & A^d \end{pmatrix} \cdot \begin{pmatrix} x_1(t) \\ \vdots \\ x_n(t) \end{pmatrix} + \begin{pmatrix} I & B^c \\ 0 & 0 \end{pmatrix} \cdot \begin{pmatrix} x_1^2(t) \\ \vdots \\ x_n^2(t) \end{pmatrix} \tag{34} \]

with initial condition $x(0) = u$. Equation (33) holds in the sense that if either side is well defined and finite, the other one is also finite, and equality holds. Thus, moment explosions can again be linked to the blow-up time of the ODE (34). [10] considers two concrete specifications of the above model, with one volatility factor and one dependent factor in

$p x_2$, with the solution $x_2(t) = u_2 e^{pt}$. Substituting into the equation for the first component yields $\dot x_1(t) = p x_1 + x_1^2 + s u_2^2 e^{2pt}$, a nonautonomous Riccati equation. After the transformation $\xi(t) = e^{-pt} x(t)$ it can be solved explicitly, and the moment explosion time is determined as

\[ T_*(u_1, u_2) = \frac{1}{p} \log \max\left\{ 0,\ \frac{p}{|u_2|\sqrt{s}} \operatorname{arccot}\left( \frac{u_1}{|u_2|\sqrt{s}} \right) + 1 \right\} \tag{36} \]

End Notes

a. On the intervals $(-\infty, 0)$ and $(1, \infty)$, respectively.
b. In fact, it does not even depend on $x$, which implies the homogeneity properties in equation (5).
c. Only $u \notin [0, 1]$ needs to be discussed; in this case, $\chi(u) = 0 \Rightarrow \Delta(u) < 0$.
d. When $u = 1$, equation (14) is precisely the Cox–Ingersoll–Ross bond pricing formula.
e. For $u < u^*$ since equation (14) explodes as $u \uparrow u^*$, where $u^* > 0$ is determined by $I(u^*) \equiv \lambda + \gamma(u) \coth(\gamma(u)t/2) = 0$.
f. Care is necessary since $f$ can be $+\infty$; see [15] for a proper discussion via localization.
g. A supersolution $f$ of equation (20) satisfies $Af - \frac{\partial f}{\partial t} \le$
[6] Benaim, S., Friz, P. & Lee, R. (2008). The Black Scholes implied volatility at extreme strikes, in Frontiers in Quantitative Finance: Volatility and Credit Risk Modeling, R. Cont, ed, Wiley, Chapter 2.
[7] Dai, Q. & Singleton, K.J. (2000). Specification analysis of affine term structure models, The Journal of Finance 55, 1943–1977.
[8] Duffie, D., Filipović, D. & Schachermayer, W. (2003). Affine processes and applications in finance, The Annals of Applied Probability 13(3), 984–1053.
[9] Filipović, D. & Mayerhofer, E. (2009). Affine Diffusion Processes: Theory and Applications, Preprint, arXiv:0901.4003.
[10] Glasserman, P. & Kim, K.-K. (2009). Moment explosions and stationary distributions in affine diffusion models, Mathematical Finance, Forthcoming, available at SSRN: http://ssrn.com/abstract=1280428.
[11] Gulisashvili, A. & Stein, E. (2009). Implied volatility in the Hull-White model, Mathematical Finance, to appear.
[12] Kallsen, J. & Muhle-Karbe, J. (2008). Utility Maximization in Affine Stochastic Volatility Models, Preprint.
[13] Keller-Ressel, M. (2008). Moment explosions and long-term behavior of affine stochastic volatility models, arXiv:0802.1823, forthcoming in Mathematical Finance.
[14] Lee, R. (2004). The moment formula for implied volatility at extreme strikes, Mathematical Finance 14(3), 469–480.
[15] Lions, P.-L. & Musiela, M. (2007). Correlations and bounds for stochastic volatility models, Annales de l’Institut Henri Poincaré 24, 1–16.
[16] Sato, K.-I. (1999). Lévy Processes and Infinitely Divisible Distributions, Cambridge University Press.

PETER K. FRIZ & MARTIN KELLER-RESSEL
Implied Volatility in Stochastic Volatility Models

• “VIX-style” MFIV equals the square root of expected variance.
• “Synthetic volatility swap (SVS) style” MFIV equals expected volatility under an independence condition, and approximates expected volatility under perturbations of that condition.

For each $K > 0$, define the time-0 dimensionless Black–Scholes implied volatility $IV_0(K)$ to be the unique solution of

Expected Realized Variance Equals Log Contract Value. Realized variance admits replication by holding a log contract and dynamically trading shares, via the strategy developed in [7, 10, 21], and [9]. Specifically,

where $'$ denotes derivative (unambiguously, as $C$, $P$, $d_2$, $IV_0$, $N$ are defined as single-variable functions).
For brevity, we suppress the argument $(k)$ of $d_2$ and $IV_0$ and their derivatives.

To justify the integration by parts in equations (15, 16), it suffices to assume the existence of $\varepsilon > 0$ such that $\mathbb{E} S_T^{1+\varepsilon} < \infty$ and $\mathbb{E} S_T^{-\varepsilon} < \infty$. Then the moment formula [18] implies that for some $\beta < 2$ and all $|k|$ sufficiently large, we have $IV_0^2(k) < \beta|k|$; hence

\[ kN(d_2)\Big|_{-\infty}^{0} - kN(-d_2)\Big|_{0}^{\infty} = 0 \quad\text{and}\quad N'(d_2)\, IV_0\Big|_{-\infty}^{\infty} = 0 \tag{18} \]

Combining equations (11) and (17) gives the conclusion in equation (8).

Implied Volatility Equals Break-even Realized Volatility

Suppose that we buy at time 0 a $T$-expiry $K$-strike call or put; to be definite, let us say a call. We pay a premium of $C_0 := C^{BS}(IV_0^2(K), S_0, K)$.

Dynamically, delta hedging this option using shares, we have, in principle, a position that is delta neutral and “long vega”. Indeed, the implied volatility is the option’s break-even realized volatility in the following sense: There exists a model-independent share trading strategy $N_t$, such that

\[ P\&L := -C_0 + \int_0^T N_t\, dS_t + (S_T - K)^+ < 0 \quad \text{in the event } \sqrt{\langle X \rangle_T} < IV_0(K) \tag{19} \]

and $P\&L \ge 0$ in the event $\sqrt{\langle X \rangle_T} \ge IV_0(K)$.

In other words, total profit/loss (from the time-0 option purchase, the trading in shares, and the time-$T$ option payout) is negative if and only if volatility realizes to less than the initial implied volatility.

Implied Volatility is Break-even Realized Volatility for Business-time Delta Hedging. Define the business-time delta hedging strategy by letting

\[ \tau := \inf\{ t : \langle X \rangle_t = IV_0^2(K) \} \tag{20} \]

and holding $N_t$ shares at each time $t \in [0, T]$, where

\[ N_t := -\mathbf{1}(S_\tau > K), \qquad t \in (\tau \wedge T, T] \tag{22} \]

The break-even property follows from applying Ito’s rule to the process

\[ C_t := C^{BS}(IV_0^2(K) - \langle X \rangle_t, S_t, K), \qquad t \in [0, \tau \wedge T] \tag{23} \]

to obtain

\[ \begin{aligned} dC_t &= -\frac{\partial C^{BS}}{\partial V}\, d\langle X \rangle_t + \frac{\partial C^{BS}}{\partial S}\, dS_t + \frac{1}{2} \frac{\partial^2 C^{BS}}{\partial S^2}\, d\langle S \rangle_t \\ &= \left( \frac{1}{2} S_t^2 \frac{\partial^2 C^{BS}}{\partial S^2} - \frac{\partial C^{BS}}{\partial V} \right) d\langle X \rangle_t + \frac{\partial C^{BS}}{\partial S}\, dS_t = \frac{\partial C^{BS}}{\partial S}\, dS_t \end{aligned} \tag{24} \]

where the partials of $C^{BS}$ are evaluated at $(IV_0^2(K) - \langle X \rangle_t, S_t, K)$. Therefore,

\[ -C_{\tau \wedge T} = -C_0 - \int_0^{\tau \wedge T} \frac{\partial C^{BS}}{\partial S}\, dS_t \tag{25} \]

as shown in [2, 11, 20].

In the event $\langle X \rangle_T < IV_0^2(K)$, hence $T < \tau$, we have

\[ P\&L = (S_T - K)^+ - C_T = (S_T - K)^+ - C^{BS}(IV_0^2(K) - \langle X \rangle_T, S_T, K) < 0 \tag{26} \]

and in the event $\langle X \rangle_T \ge IV_0^2(K)$, hence $\tau \le T$, we have

\[ \begin{aligned} P\&L &= (S_T - K)^+ - C_\tau - \int_\tau^T \mathbf{1}(S_\tau > K)\, dS_t \\ &= (S_T - K)^+ - (S_\tau - K)^+ - \mathbf{1}(S_\tau > K)(S_T - S_\tau) \ge 0 \end{aligned} \tag{27} \]

as claimed. This break-even result is a special case of a proposition in [6].
to standard “calendar time” delta hedging, defined by share holdings

\[ -\frac{\partial C^{BS}}{\partial S}\left( (T - t)\,\overline{IV}_0^2, S_t, K \right), \qquad t \in [0, T] \tag{28} \]

where $\overline{IV}_0^2 := IV_0^2(K)/T$ denotes the time-0 annualized implied variance.

This strategy guarantees neither a profit in the event that $\langle X \rangle_T > IV_0^2(K)$ nor a loss in the opposite event. To see this, under the dynamics (2), let

\[ Y_t = C^{BS}\left( (T - t)\,\overline{IV}_0^2, S_t, K \right) \tag{29} \]

and apply Ito’s rule to obtain

\[ \begin{aligned} dY_t &= -\frac{\partial C^{BS}}{\partial V}\, \overline{IV}_0^2\, dt + \frac{\partial C^{BS}}{\partial S}\, dS_t + \frac{1}{2} \frac{\partial^2 C^{BS}}{\partial S^2}\, d\langle S \rangle_t \\ &= -\frac{1}{2}\, \overline{IV}_0^2\, S_t^2 \frac{\partial^2 C^{BS}}{\partial S^2}\, dt + \frac{\partial C^{BS}}{\partial S}\, dS_t + \frac{1}{2}\, \sigma_t^2 S_t^2 \frac{\partial^2 C^{BS}}{\partial S^2}\, dt \end{aligned} \tag{30} \]

where the partial derivatives of $C^{BS}$ are evaluated at $((T - t)\,\overline{IV}_0^2, S_t, K)$. Hence,

\[ P\&L = Y_T - Y_0 - \int_0^T \frac{\partial C^{BS}}{\partial S}\, dS_t = \int_0^T \frac{1}{2} \left( \sigma_t^2 - \overline{IV}_0^2 \right) S_t^2 \frac{\partial^2 C^{BS}}{\partial S^2}\, dt \tag{31} \]

which is half the time-integrated cash-gamma-weighted difference of instantaneous variance $\sigma_t^2$ and implied variance $\overline{IV}_0^2$, as shown in [7] and [12]. So if, along some trajectory, $\sigma_t > \overline{IV}_0$ at points where gamma is low, but $\sigma_t < \overline{IV}_0$ at points where gamma is high, then it can occur that realized variance $\int_0^T \sigma_t^2\, dt$ exceeds implied variance $IV_0^2$, yet this long-vega strategy incurs a loss.

In conclusion, implied volatility is the option’s break-even realized volatility for business-time delta hedging, but not for calendar-time delta hedging.

Implied Volatility ATM Approximates Expected Realized Volatility, Under an Independence Condition

In this section, we specialize to dynamics (2) such that $\sigma$ and $W$ are independent.

Let $K_{atm} := S_0$ be the at-the-money (ATM) strike. Then

\[ \mathbb{E}(S_T - K_{atm})^+ = \mathbb{E}\, C^{bs}\left( \sqrt{\langle X \rangle_T}, S_0, K_{atm} \right) \tag{32} \]

\[ \le C^{bs}\left( \mathbb{E}\sqrt{\langle X \rangle_T}, S_0, K_{atm} \right) \tag{33} \]

by the conditioning argument of [15], independence, and the concavity of

\[ v \mapsto C^{bs}(v, S_0, K_{atm}) \tag{34} \]

It follows that

\[ IV_0(K_{atm}) \le \mathbb{E}\sqrt{\langle X \rangle_T} \tag{35} \]

The function (34), while concave, is nearly linear for small $v$; indeed, its second derivative vanishes at $v = 0$, as observed in [4]. Therefore, the inequalities (33) and (35) are nearly equalities, as shown in [13]. In that sense,

\[ IV_0(K_{atm}) \approx \mathbb{E}\sqrt{\langle X \rangle_T} \tag{36} \]

assuming the independence of $\sigma$ and $W$.

Model-free Implied Volatility (MFIV)

Inverting Black–Scholes is not the only way to extract an implied volatility from option prices. While the ATM Black–Scholes implied volatility approximates expected volatility under the independence assumption, alternative definitions of MFIV use call/put data at all strikes, in order to reflect the expected variance or volatility under more general conditions.

VIX-style MFIV Equals the Square Root of Expected Realized Variance

Motivated by equation (11), define the VIX-style model-free implied volatility by

\[ \begin{aligned} \mathrm{VIXIV}_0 &:= \sqrt{\mathbb{E}_0[-2X_T]} \\ &:= \sqrt{\mathbb{E}_0[-2\log(S_T/S_0) + 2(S_T/S_0) - 2]} \end{aligned} \tag{37} \]
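The near-equality between ATM implied volatility and expected realized volatility under independence can be checked on a toy example. The sketch below assumes zero rates, S0 = K = 1, and a two-point distribution for realized variance (values 0.01 and 0.09 with equal probability); these numbers and the function names are assumptions for illustration only. With these conventions the ATM Black–Scholes price as a function of dimensionless volatility v is 2N(v/2) − 1.

```python
from statistics import NormalDist

N = NormalDist()


def atm_call(vol):
    """ATM Black-Scholes call for S0 = K = 1, zero rates, dimensionless vol."""
    return 2.0 * N.cdf(vol / 2.0) - 1.0


def atm_implied_vol(price):
    """Inverse of atm_call."""
    return 2.0 * N.inv_cdf((price + 1.0) / 2.0)


# Assumed two-point law for realized variance, independent of W.
variances = [0.01, 0.09]
probs = [0.5, 0.5]

# Conditioning argument: the price is the average Black-Scholes price
# over the realized-variance scenarios.
price = sum(p * atm_call(v ** 0.5) for p, v in zip(probs, variances))

iv_atm = atm_implied_vol(price)                                   # ATM implied vol
expected_vol = sum(p * v ** 0.5 for p, v in zip(probs, variances))  # E sqrt(variance)
sqrt_expected_var = sum(p * v for p, v in zip(probs, variances)) ** 0.5
```

In this example `iv_atm` sits slightly below `expected_vol`, while the square root of expected variance is noticeably higher, in line with the concavity argument above.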
$\mathrm{VIXIV}_0$ is an observable function of option prices, specifically the square root of the time-0 value of the portfolio

\[ \frac{2}{K^2}\, dK \quad \text{puts at strikes } K < S_0 \tag{38} \]

Indeed in 2003, the Chicago board options exchange

$\mathrm{SVSIV}_0$ is observable from option prices, as the time-0 value of the portfolio

\[ \sqrt{\pi/2}\,/\,S_0 \]

\[ \sqrt{\frac{\pi}{8K^3 S_0}} \left( I_1\left( \log\sqrt{K/S_0} \right) - I_0\left( \log\sqrt{K/S_0} \right) \right) dK \quad \text{calls at strikes } K > S_0, \]
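As a sanity check on the VIX-style definition, the sketch below values a strip of out-of-the-money options with weight 2/K² dK under Black–Scholes prices, where the result should recover the Black–Scholes volatility itself. It assumes zero rates, S0 = 1, and that the portfolio holds calls with the same weight at strikes above S0 (the standard completion implied by the payoff in equation (37)); the function names and grid settings are assumptions for illustration.

```python
import math
from statistics import NormalDist

N = NormalDist()


def bs_otm_price(k, v):
    """OTM option value for S0 = 1, zero rates: put if k < 0, else call.

    k is the log-strike log(K/S0); v is dimensionless volatility sigma*sqrt(T).
    """
    strike = math.exp(k)
    d1 = (-k + 0.5 * v * v) / v
    d2 = d1 - v
    if k < 0.0:
        return strike * N.cdf(-d2) - N.cdf(-d1)
    return N.cdf(d1) - strike * N.cdf(d2)


def vix_style_iv(v, half_width=10.0, steps=4000):
    """Square root of the value of 2/K^2 dK OTM options across strikes."""
    span = half_width * v   # integrate log-strike over [-span, span]
    h = span / steps

    def integrand(k):
        # 2/K^2 * price * dK with dK = K dk  ->  2 * price / K dk
        return 2.0 * bs_otm_price(k, v) / math.exp(k)

    total = 0.0
    for sign in (-1.0, 1.0):  # puts below S0, calls above S0
        vals = [integrand(sign * i * h) for i in range(steps + 1)]
        total += h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))  # trapezoid rule
    return math.sqrt(total)
```

With a flat smile the strip reproduces the input volatility, for instance `vix_style_iv(0.2)` is close to 0.2; with a skewed smile the same integral would weight the wings and generally differ from the ATM implied volatility.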
can be verified. This suggests that SVS-style implied volatility $\mathrm{SVSIV}_0$ should outperform Black–Scholes implied volatility $IV_0$, as an approximation to the expected realized volatility, at least for $\rho$ not too large.

This is confirmed in [5] for Heston dynamics with parameters from [1], and $T = 0.5$. Across essentially all correlation assumptions, the SVS notion of implied volatility exhibited the smallest bias, relative to the true expected annualized volatility. For example, in the case $\rho = -0.64$, the VIX-style implied volatility had bias +98 bp, the Black–Scholes implied volatility had bias −30 bp, and the SVS-style implied volatility had the smallest bias, −6 bp.

Acknowledgments

This article benefited from the comments of Peter Carr.

References

[1] Bakshi, G., Cao, C. & Chen, Z. (1997). Empirical performance of alternative option pricing models, Journal of Finance 52, 2003–2049.
[2] Bick, A. (1995). Quadratic-variation-based dynamic strategies, Management Science 41, 722–732.
[3] Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, Journal of Political Economy 81, 637–659.
[4] Brenner, M. & Subrahmanyam, M. (1988). A simple formula to compute the implied standard deviation, Financial Analysts Journal 44, 80–83.
[5] Carr, P. & Lee, R. (2008). Robust Replication of Volatility Derivatives, Bloomberg LP, University of Chicago.
[6] Carr, P. & Lee, R. (2008). Hedging Variance Options on Continuous Semimartingales, Forthcoming in Finance and Stochastics.
[7] Carr, P. & Madan, D. (1998). Towards a theory of volatility trading, in Volatility, R. Jarrow, ed, Risk Publications, pp. 417–427.
[8] CBOE. (2003). The VIX White Paper, Chicago Board Options Exchange.
[9] Derman, E., Demeterfi, K., Kamal, M. & Zou, J. (1999). A guide to volatility and variance swaps, Journal of Derivatives 6, 9–32.
[10] Dupire, B. (1992). Arbitrage pricing with stochastic volatility, Société Générale.
[11] Dupire, B. (2005). Volatility Derivatives Modeling, Bloomberg LP.
[12] El Karoui, N., Jeanblanc-Picqué, M. & Shreve, S. (1998). Robustness of the Black and Scholes formula, Mathematical Finance 8, 93–126.
[13] Feinstein, S.P. (1989). The Black–Scholes Formula is Nearly Linear in Sigma for At-the-Money Options: Therefore Implied Volatilities from At-the-Money Options are Virtually Unbiased, Federal Reserve Bank of Atlanta.
[14] Gatheral, J. (2006). The Volatility Surface: A Practitioner’s Guide, John Wiley & Sons.
[15] Hull, J. & White, A. (1987). The pricing of options on assets with stochastic volatilities, Journal of Finance 42, 281–300.
[16] Jiang, G.J. & Tian, Y.S. (2005). The model-free implied volatility and its information content, Review of Financial Studies 18, 1305–1342.
[17] Lee, R. (2004). Implied volatility: statics, dynamics, and probabilistic interpretation, Recent Advances in Applied Probability, Springer, pp. 241–268.
[18] Lee, R. (2004). The moment formula for implied volatility at extreme strikes, Mathematical Finance 14, 469–480.
[19] Matytsin, A. (2000). Perturbative Analysis of Volatility Smiles, Merrill Lynch.
[20] Mykland, P. (2000). Conservative delta hedging, Annals of Applied Probability 10, 664–683.
[21] Neuberger, A. (1994). The log contract, Journal of Portfolio Management 20, 74–80.
[22] Polishchuk, A. (2007). Variance swap valuation, Bloomberg LP.
[23] Rebonato, R. (1999). Volatility and Correlation in the Pricing of Equity, FX and Interest Rate Options, John Wiley & Sons.

PETER CARR & ROGER LEE
Local Volatility Model

\[ \frac{dS_t}{S_t} = g_t\, dt + \sigma(t, S_t)\, dW_t \tag{1} \]

to that from the market. With a careful design of the local volatility function, this so-called calibration process can be implemented very efficiently for practical use. This methodology has the advantage that the knowledge of a perfect implied volatility surface is not required and the model is arbitrage free by construction. In addition, a great amount of analytical flexibility is available, which allows tailor-made designs of different models for specific purposes.

Volatility Surface Design and Calibration

The key to the success of a volatility model lies in an understanding of how the implied volatility surface is used in practice. Empirically, option traders often refer to the implied volatility surface and its shape deformation with intuitive descriptions such as level, slope, and curvature, effectively approximating the shape as simple quadratic functions. In addition, for strikes away from the at-the-money (ATM) region, sometimes the ability to modify the out-of-the-money (OTM) surface independently of the central shape is desired, which traders intuitively speak of as changing the put wing or the call wing. Thus there exist several degrees of freedom on the volatility surface that a good model should be able to accommodate, and we can design the local volatility function so that each mode is captured by a distinct parameter.

To facilitate comparison across different modeling techniques, we standardize the model specification in terms of the BSM implied volatilities on a small number of strikes per maturity, typically three or five. For example, volatilities on three strikes in the ATM region can be used to provide a precise definition of the traders’ level, slope, and curvature parameters. Similarly, fixing volatilities at one downside strike and one upside strike in the OTM region allows the model to agree on a five-parameter specification of level, slope, curvature, put wing, and call wing. These calibration strikes on each maturity are chosen to cover the range of practical interest, usually one to two standard deviations of diffusion at the stock’s typical volatility. In the absence of fine structures such as sharp jumps in the underlying, we expect that one standard deviation in the strike range provides a natural length scale over which the stock price distribution varies smoothly. Thus the implied volatility should have a very smooth shape over the range thus defined, and matching the implied volatilities at the calibration strikes should produce a very good match for all the implied volatilities between them.

The preceding discussions give a straightforward strategy for building the local volatility model: we specify a small number of strikes, and tune the local volatility function with the same number of parameters as the number of strikes for each maturity in a bootstrapping process. The local volatility parameters are then solved through a root-finding routine so that the implied volatilities at the specified strikes on each maturity are reproduced. As each local volatility parameter is designed to capture a distinct aspect of the surface shape, the root-finding system is well behaved and converges quickly to the solution in practice. More importantly, such a process produces much less numerical noise than a typical optimization process, giving rise to much more stable calibration results. This is essential in ensuring robust Greeks and scenario outputs from the model.

Discrete Dividend Models

Dividend modeling is an important problem in equity derivatives. It can be shown [15] that with nonzero dividends, the original BSM model only works when the payment amount is proportional to the stock price immediately before the ex-dividend date (ex-date), through incorporating the dividend yields in $g_t$ of equation (1). However, many market participants tend to view future dividends as absolute cash amounts, and this is especially true after trading in index dividend swaps becomes liquid. Existing literature [4, 5, 12, 14] suggests that even in the case of a constant volatility, cash dividend equity models (also known as discrete dividend models) are much less tractable than proportional dividend ones. Recently, Overhaus et al. [16] proposed a theory to ship future cash dividends from the stock price to arrive at a pure stock process, on which one can apply the Dupire equation. This theory calls for the changes in future dividends to have a global impact, especially for maturities before their ex-dates, a feature that certain traders find somewhat counterintuitive.

Nontrivial dividend specifications can be naturally introduced in the framework here. We note that between ex-dates, equations (1) and (2) continue to
and hence derive the implied volatility surface from the hybrid model (6). The strategy discussed in the

to use the basic model solution as a starting point for the hybrid calibration.
[Figure 1: four surface plots, panels (a) to (d). Horizontal axes: Strike/spot (0 to 2) and Time to maturity (years, 0 to 10). Vertical axis: Volatility (0% to 80%) in panels (a) and (b); Volatility difference (0% to 12%) in panels (c) and (d).]
Figure 1 The implied and local volatility surface on the S&P 500 Index in November 2007. (a) The implied volatility
as a function of time to maturity and strike price (expressed as a percentage of spot price). (b) The local volatility surface
calibrated under the basic model. (c) Changes in the local volatility surface when cash dividends are assumed for the first
five years, gradually transitioning to proportional dividends in 10 years. (d) Changes in the local volatility surface where
the interest rate is assumed to follow the Hull–White model calibrated to ATM caps with correlation ρ = 30%. In both
(c) and (d) the new local volatility is smaller than in (b)
[Figure 2: four line plots, panels (a) to (d). Panels (a) and (b): Change in fair strike (%), one curve per maturity from 1 year to 5 years. Panels (c) and (d): Change in PV (vega), curves for the call on maximum and the put on minimum. Horizontal axis: Cash dividend proportion (0.0 to 1.0) in panels (a) and (c); Equity-interest rate correlation (−0.8 to 0.8) in panels (b) and (d).]
Figure 2 Impact of discrete dividends and stochastic interest rate on derivative pricing. (a) Changes to the fair strike of
the variance swaps with different dividend assumptions. (b) Changes to the fair strike of the variance swaps under stochastic
interest rate with different correlation. The labels indicate the maturity of the variance swaps. (c) Changes to the PV of
the lookback options with different dividend assumption. (d) Changes to the PV of the lookback options under stochastic
interest rate with different correlation. The numbers in (c) and (d) are in units of vega of Table 2
dividends introduce additional deterministic, nonproportional jump structures in the equity dynamics, and to maintain the same implied volatility surface the local volatility needs to become smaller. This effect depends on the dividend size relative to future spot prices, and thus becomes more pronounced for smaller strikes and longer maturities, producing a skewed shape in the difference. On the other hand, a stochastic interest rate introduces volatility in discount bond prices and, with positive correlation, also reduces the equity local volatility. This effect does not depend on spot levels explicitly and is instead related to the volatility ratio between the interest rate and the equity and to their correlation. Since the interest rate usually has a small volatility compared to the equity, to leading order the effect of stochastic rates can sometimes be approximated by a parallel shift on the local volatility surface.

We can apply these local volatility models to price exotic derivatives not directly available from the vanilla market. One example is variance swaps, which are popular OTC products offered to capitalize on the discrepancy between implied and realized volatility. Another example is lookback options, which provide payoffs on the maximum/minimum index prices over a set of observation dates and can be appealing hedges to insurance companies who have sold policies with similar exposure. Tables 1 and 2 display the pricing results for these structures using the basic model.

Table 1 Pricing of variance swaps with the basic model

Maturity (years)    Fair strike (%)
1                   27.59
2                   28.18
3                   28.42
4                   29.14
5                   30.00

The payoff for strike K at maturity is
\[ \frac{252}{N}\sum_{i=0}^{N-1}\left(\ln\frac{S_{i+1}}{S_i}\right)^2 - K^2, \]
where S_i is the index closing price on the ith business day from the current date (i = N corresponds to the maturity). The fair strike is the value K such that the contract costs nothing to enter

Table 2 Pricing of five-year lookback options with the basic model

Option type        Payout formula                                        PV (%)    Vega (%)
Call on maximum    \( \max_{i=0}^{5} \frac{S_i}{S_0} - \frac{S_5}{S_0} \)    25.29     1.17
Put on minimum     \( 1 - \min_{i=0}^{5} \frac{S_i}{S_0} \)                  22.35     0.85

S_i (i = 0, 1, . . . , 5) is the index price at annual observation dates on year i from the current date. The PV is the calculated present value according to the payout formula at maturity. The Vega is the change in PV when a parallel shift of 1% is applied to the implied volatility surface

Figure 2 shows the pricing impact on these structures when the effects of cash dividends and stochastic interest rates are considered. As the payout for the variance swap is directly linked to the equity's average local volatility, the pricing is strongly affected by the assumption of cash dividends and stochastic interest rate. For lookback options, one needs to look at the joint distribution among equity prices across different observation dates. Cash dividends generally reduce the local volatility and hence decrease the correlation between the equity prices at different dates, leading to lower lookback prices. With stochastic interest rate, the effect of modified equity diffusion volatility can either reinforce (e.g., call on maximum) or partly cancel (e.g., put on minimum) the effect of stochastic discounting.

The numerical impact of different modeling assumptions can be comparable to a full percentage difference in volatility. Hence, it may be important to take these into account when accurate and competitive pricing of exotic equity derivatives is required. An extensive and detailed discussion of the impact of stochastic interest rate on popular hybrid products can be found in [16].

References

[1] Andersen, L. & Brotherton-Ratcliffe, R. (1997). The equity option volatility smile: an implicit finite-difference approach, Journal of Computational Finance 1, 5–38.
[2] Berestycki, H., Busca, J. & Florent, I. (2002). Asymptotics and calibrations of local volatility models, Quantitative Finance 2, 61–69.
[3] Black, F. & Scholes, M. (1973). The pricing of options and corporate liabilities, Journal of Political Economy 81, 637–654.
[4] Bos, M. & Vandermark, S. (2002). Finessing fixed dividends, Risk Magazine 15(9), 157–158.
[5] Bos, R., Gairat, A. & Shepeleva, A. (2003). Dealing with discrete dividends, Risk Magazine 16(1), 109–112.
[6] Brigo, D. & Mercurio, F. (2006). Interest Rate Models - Theory and Practice with Smile, Inflation and Credit, 2nd Edition, Springer Finance.
[7] Brown, G. & Randall, C. (1999). If the skew fits, Risk Magazine 12(4), 62–65.
[8] Coleman, T.F., Li, Y. & Verma, A. (1999). Reconstructing the unknown volatility function, Journal of Computational Finance 2, 77–102.
[9] Derman, E. & Kani, I. (1994). Riding on a smile, Risk Magazine 7(2), 32–39.
[10] Dumas, B., Fleming, J. & Whaley, R.E. (1998). Implied volatility functions: empirical tests, Journal of Finance 53, 2059–2106.
[11] Dupire, B. (1994). Pricing with a smile, Risk Magazine 7(1), 18–20.
[12] Frishling, V. (2002). A discrete question, Risk Magazine 15(1), 115–116.
[13] Gatheral, J. (2006). The Volatility Surface: A Practitioner's Guide, Wiley, Hoboken, New Jersey.
[14] Haug, E., Haug, J. & Lewis, A. (2003). Back to basics: a new approach to the discrete dividend problem, Wilmott Magazine 5, 37–47.
[15] Merton, R.C. (1973). Theory of rational option pricing, The Bell Journal of Economics and Management Science 4, 141–183.
[16] Overhaus, M., Bermúdez, A., Buehler, H., Ferraris, A., Jordinson, C. & Lamnouar, A. (2007). Equity Hybrid Derivatives, Wiley, Hoboken, New Jersey.
[17] Piterbarg, V. (2007). Markovian projection method for volatility calibration, Risk Magazine 20(4), 84–89.
[18] Rebonato, R. (2004). Volatility and Correlation, 2nd Edition, Wiley, Chichester, West Sussex.
[19] Rubinstein, M. (1994). Implied binomial trees, Journal of Finance 49, 771–818.

Related Articles

Corridor Variance Swap; Dividend Modeling; Dupire Equation; Lookback Options; Model Calibration; Optimization Methods; Stochastic Volatility Interest Rate Models; Tikhonov Regularization; Variance Swap; Yield Curve Construction.

CHIYAN LUO & XINMING LIU
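The variance swap payoff quoted with Table 1 can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `variance_swap_payoff` is hypothetical, the strike is entered as a decimal volatility (0.2759 for 27.59%), and the closing prices are made-up numbers, not data from the article.

```python
import math


def variance_swap_payoff(closes, strike, days_per_year=252):
    """Payoff (252/N) * sum_i ln(S_{i+1}/S_i)^2 - K^2 for closes S_0, ..., S_N.

    `closes` is the list of daily closing prices, `strike` is K as a decimal
    volatility. The name and signature are hypothetical, for illustration only.
    """
    n = len(closes) - 1  # number of daily log returns, N
    if n < 1:
        raise ValueError("need at least two closing prices")
    realized = sum(math.log(closes[i + 1] / closes[i]) ** 2 for i in range(n))
    return days_per_year / n * realized - strike ** 2


# Toy path with two daily log returns of +1% and -1% (illustrative numbers).
path = [100.0, 100.0 * math.exp(0.01), 100.0]
payoff = variance_swap_payoff(path, strike=0.10)
```

In this toy path the annualized realized variance is 252/2 × 0.0002 = 0.0252, so the payoff against a 10% strike is 0.0252 − 0.01 = 0.0152 per unit of variance notional.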
Dividend Modeling

A dividend is a portion of a company's earnings paid to its shareholders. In the process of dividend payment, the following stages are distinguished: (i) declaration date, when the dividend size and the ex-dividend date are announced; (ii) ex-dividend date, when the share starts trading net of dividend; (iii) record date, when holders eligible for the dividend payment are identified; and (iv) payment date, when delivery is made. At the ex-dividend date, the stock price drops by an amount proportional to the size of the dividend; the proportionality factor depends on the tax regulations. There are many issues, research streams, and approaches in dividend modeling; here the issue is considered mainly in the context of option pricing theory.

The usual way to price derivatives on dividend-paying stocks is to take a model for non-dividend-paying stocks and extend it to take the dividends into account. The dividends then are commonly modeled as (i) continuously paid dividend yield, (ii) proportional dividends (known fractions of the stock price) paid at known discrete times, or (iii) fixed dividends (known amounts), paid at known discrete times. It is also possible to model the dividend amounts and the dividend dates stochastically (though there is evidence that this has a negligible impact on vanilla options [10]). In fact, there is an alternative approach, pioneered in [9], where the stochastic dividends are the primary quantities, from which the stock price and then the option price are derived. As usual, one has to choose the complexity of the model depending on the dividend exposure of the derivative to be priced.

In practice, one comes across the notion of implied dividends: the value of the dividends (independent of how they are modeled) can be inverted from the synthetic forward or future contract; the fact that one can get numbers quite different from analyst predictions reflects various uncertainties. Among them are the sundry tax regulations in different countries for various market players, and the timing and value of the dividends, just to name a few.

The impact of dividends can be illustrated, starting simply by adding a continuous dividend yield to the drift. For the sake of the simplicity of notations, it is written in lognormal terms:

This approach is especially popular when modeling options on indexes, where dividend payments are numerous and spread through time. Another choice, of proportional amounts \( d_i = f_i S_{t_i} \) paid at ex-dividend dates \( t_1 < t_2 < \cdots \) for single shares, can be justified by the fact that dividends tend to increase when a company is doing well, which is correlated with a high share price:

\[ dS_t = (r - Q_t) S_t\,dt + \sigma S_t\,dW_t \quad\text{with}\quad Q_t = \sum_i \delta(t - t_i)\, f_i \tag{2} \]

In both these cases, the stock price at each time still has a lognormal distribution, so the prices of European options are given by straightforward modifications of the Black–Scholes (BS) pricing formula. This is no longer true, however, for discrete cash dividends:

\[ dS_t = (r S_t - D_t)\,dt + \sigma S_t\,dW_t \quad\text{with}\quad D_t = \sum_i \delta(t - t_i)\, d_i \tag{3} \]

The stock price S_t jumps down by the amount of dividend d_i paid at time t_i, and between the dividends it follows a geometric Brownian motion. In this setting, the stock price can become negative, but this is usually so unlikely that, in practice, it is not a problem. Still, one might want to use a more robust dividend policy in the model, such as capping the dividend at the stock price. Obviously, different dividend policies result in different option prices [7].

Impact on Option Pricing

To compute an option price under equation (3) the standard collection of numerical methods can be employed: finite difference (FD) method with jump conditions across ex-dividend dates [11], Monte Carlo simulations, or nonrecombining trees [8]. There is no real closed-form solution with multiple dividends for the European option under equation (3); however, several approximations are available. All of them are
based on bootstrapping, that is, repeatedly computing the convolution of the option value at one dividend date with the density kernel from that date to the previous dividend date and applying the jump condition at the dividend dates, starting from the payoff at maturity. One can use a piecewise linear or a more sophisticated approximation of the option value at each convolution step and enjoy having a finite sum of closed-form solutions. On the basis of the fact that diffusion preserves monotonicity and convexity, it can be shown that the result converges to the true value (unpublished work of Amaro de Matos et al.). Another choice of parameterization was made in [7]: at each step of the integration the option value is approximated by a BS-like function where strike and volatility are adjusted to obtain the best fit. Such methods can be used for any underlying process where one can compute the density kernel (Green function, propagator) for the convolution, though it will probably not be much faster or more accurate than employing the standard finite difference method, especially in the case of multiple dividends. For the handling of American options, one can find an extensive list of references in [4], and the relation between early exercise and dividends is explained in American Options; Finite Difference Methods for Early Exercise Options or [8].

A Common Approach

As already pointed out, in the discrete cash dividend model (3) the price of a European option has no closed-form solution and trees do not recombine. In order to remedy this, traders often split the stock price into a risky net-dividend part and a deterministic dividend part:

\[ d\tilde S_t = r \tilde S_t\,dt + \tilde\sigma \tilde S_t\,dW_t, \qquad S_t = \tilde S_t + \tilde D_t \tag{4} \]

\[ \tilde D_t = \tilde D_t^{\mathrm{fut}} - \tilde D_t^{\mathrm{past}}, \qquad \tilde D_t^{\mathrm{fut}} = \sum_{t < t_i \le T} \alpha_i d_i e^{-r(t_i - t)}, \qquad \tilde D_t^{\mathrm{past}} = \sum_{0 < t_i \le t} (1 - \alpha_i) d_i e^{-r(t_i - t)} \tag{5} \]

Note that the dependence on the option maturity T in the notation \( \tilde D_t^{\mathrm{fut}} \) is suppressed. The most common choices used by practitioners [3, 5] are

\[ \alpha_i = 1 \quad\text{so}\quad \tilde D_t = \sum_{t < t_i \le T} d_i e^{-r(t_i - t)} \tag{6} \]

\[ \alpha_i = 0 \quad\text{so}\quad \tilde D_t = -\sum_{0 < t_i \le t} d_i e^{-r(t_i - t)} \tag{7} \]

\[ \alpha_i = 1 - \frac{t_i}{T} \tag{8} \]

In this approach, the tree for \( \tilde S_t \) is recombining (see Binomial Tree or [8]), and the price of a European option is again given by a BS type formula, where the spot and the strike are adjusted as \( S_0 \to S_0 - \tilde D_0^{\mathrm{fut}} \) and \( K \to K + \tilde D_T^{\mathrm{past}} \). Needless to say, however, the volatility for each of these processes will be different. Namely, choice (6) underestimates and choice (7) overestimates the volatility compared to the “true” model (3); the weighted choice (8) aims to minimize this effect.

Arbitrage Opportunities

In reference [1], it was shown that arbitrage opportunities exist in the most standard approach (6) if the volatility surface is continuously interpolated around ex-dividend dates. They apply a rough volatility adjustment to prevent the arbitrage opportunities. The following example demonstrates that the continuous interpolation of volatility around ex-dividend dates can lead to significant mispricing. Figure 1 shows

Figure 1 Price of an American call as a function of the time to maturity T for the following models: (3) (solid line), (6) (dotted line), and (6) with the volatility adjustment (10) (dashed line). The parameters are S0 = 100, K = 100, r = 0.05, σ = 0.3, d_i = 8, and t_i = i − 1/2. Axes: price (vertical) against maturity (horizontal)
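The BS-type formula with adjusted spot and strike described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the maturity T = 3 is an arbitrary choice, and the same volatility is fed to all three weightings purely to isolate the effect of the spot and strike shifts (in practice each choice would carry its own calibrated volatility).

```python
import math


def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, r, sigma, T):
    """Plain Black-Scholes European call price, no dividends."""
    d1 = (math.log(spot / strike) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return spot * norm_cdf(d1) - strike * math.exp(-r * T) * norm_cdf(d2)


def adjusted_call(S0, K, r, sigma, T, dividends, alpha):
    """BS price with S0 -> S0 - D_fut(0) and K -> K + D_past(T).

    `dividends` is a list of (t_i, d_i); `alpha` maps t_i to the weight alpha_i.
    Both names are hypothetical helpers introduced for this sketch.
    """
    live = [(t, d) for t, d in dividends if 0.0 < t <= T]
    # Present value at time 0 of the alpha-weighted future dividends.
    d_fut_0 = sum(alpha(t) * d * math.exp(-r * t) for t, d in live)
    # Value at maturity of the (1 - alpha)-weighted dividends already paid.
    d_past_T = sum((1.0 - alpha(t)) * d * math.exp(r * (T - t)) for t, d in live)
    return bs_call(S0 - d_fut_0, K + d_past_T, r, sigma, T)


# Parameters of Figure 1; T = 3 is an assumption made for the example.
T = 3.0
divs = [(i - 0.5, 8.0) for i in (1, 2, 3)]
c6 = adjusted_call(100.0, 100.0, 0.05, 0.3, T, divs, lambda t: 1.0)          # choice (6)
c7 = adjusted_call(100.0, 100.0, 0.05, 0.3, T, divs, lambda t: 0.0)          # choice (7)
c8 = adjusted_call(100.0, 100.0, 0.05, 0.3, T, divs, lambda t: 1.0 - t / T)  # choice (8)
```

With a common volatility the three prices are ordered c6 < c8 < c7, which mirrors the statement that (6) underestimates and (7) overestimates relative to model (3), with (8) in between.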
Table 1 European call prices with parameter set of Figure 1 for different strikes. HHL refers to the approximation of [7]; FD to finite difference and BS to closed-form solution

Strike   FD for (3)   HHL for (3)   BS for (6)   BS for (6) with (10)   BS for (8)   BS for (8) with (10)
50       33.509       33.641        29.908       33.312                 33.547       33.497
80       22.482       22.559        17.846       22.414                 22.304       22.473
100      17.393       17.428        12.772       17.404                 17.102       17.388
120      13.573       13.575        9.250        13.644                 13.209       13.573
150      9.511        9.479         5.836        9.635                  9.099        9.515
dividends. Examples are exotic options on stocks and derivatives involving realized volatility, such as variance swaps (see Variance Swap), volatility swaps (see Volatility Swaps), correlation swaps (see Correlation Swap), and gamma swaps (see Gamma Swap). This sensitivity determines the required sophistication in dividend modeling. Adding dividends to a stock price process may seem trivial at first glance, but one has to be careful in setting the model parameters. The resulting model can then be solved by the usual methods. For the plain vanilla option with dividends, a number of numerical approximations have been developed.

References

[1] Beneder, R. & Vorst, T. (2001). Options on dividends paying stocks, in Recent Developments in Mathematical Finance, World Scientific Printers, Shanghai, pp. 204–217.
[2] Bos, R., Gairat, A. & Shepeleva, A. (2003). Dealing with discrete dividends, Risk 16, 109–112.
[3] Bos, M. & Vandermark, S. (2002). Finessing fixed dividends, Risk 15, 157–158.
[4] Cassimon, D., Engelen, P.J., Thomassen, L. & Van Wouwe, M. (2007). Closed-form valuation of American call options on stocks paying multiple dividends, Finance Research Letters 4, 34–48.
[5] Frishling, V. (2002). A discrete question, Risk 15, 115–116.
[6] Gatheral, J. (2006). The Volatility Surface, John Wiley & Sons, Hoboken, pp. 13–14.
[7] Haug, E., Haug, J. & Lewis, A. (2003). Back to basics: a new approach to the discrete dividend problem, Wilmott Magazine (September), 37–47.
[8] Hull, J.C. (2006). Options, Futures and Other Derivatives, 6th Edition, Prentice-Hall, Upper Saddle River.
[9] Korn, R. & Rogers, L.C.G. (2005). Stocks paying discrete dividends: modeling and option pricing, Journal of Derivatives 13(2), 44–48.
[10] Kruchen, S. (2005). Dividend Risk, thesis, Uni/ETH, Zürich.
[11] Tavella, D. & Randall, C. (2000). Pricing Financial Instruments. The Finite Difference Method, John Wiley & Sons, New York.

Related Articles

American Options; Finite Difference Methods for Early Exercise Options; Local Volatility Model; Monte Carlo Simulation.

ANNA SHEPELEVA & ALAIN VERBERKMOES
u(u − i)
Implied Volatility: Volvol ×
∂f
∂V
−
2
Vf (5)
Expansion
We look for solutions which can be written as a
power series of .
In order to calibrate stochastic volatility models, it
is convenient to have an accurate analytical for-
mula or approximation for call options. However, f (u, V , T ) = f (0) (u, V , T ) + f (1) (u, V , T )
deriving such a formula is not always an easy task.
In the Heston model, the most popular technique + 2 f (2) (u, V , T ) (6)
involves numerical integration, which is necessar-
ily time consuming. The main idea is to apply a
perturbation method to the volvol parameter, cal- Thus, we can obtain the power series of the call
culating the first and second order of the differ- price using either equation (4) directly
ence between a stochastic volatility model and a
Black–Scholes model. In general case, we can reduce
the integration of the exact formula to some simpler C(u, V , T ) = C (0) (u, V , T ) + C (1) (u, V , T )
integration. + 2 C (2) (u, V , T ) (7)
Consider the following two-factor stochastic
volatility model:
Figure 1 Series A (expansion on price), series B (expansion on implied volatility), and exact volatility (plotted curves: Exact, Series A, Series B)
First, the expansion is on the fundamental trans- V imp = v(S, v, τ ) + τ −1 J (1) R̃ (1,1)
form of the closed formula, which is presented by
(u, V , T ) in equation (4). The idea is that we can + 2 τ −1 J 2 + τ −2 J (3) R̃ (2,0)
expand this function into a simpler form, so that
the integration in the equation (4) can be reduced to
analytic form. Here, we do not discuss the detailed τ −2 (1) 2
+ τ −1 J (4) R̃ (1,2) + (J )
derivation but only give the result. Interested readers 2
can refer to [2]. For series A, the expansion on
2
price is
× R̃ (2,2)
− R̃ (1,1)
R̃ (2,0)
1 X2 1 1 X 1
R̃ (2,0)
=τ − − , R̃ (1,1)
= − +
2 Y2 2Z 8 Z 2
X 2
X 1 1 1X 3
X 2
1 X 1 1
R̃ (1,2) = 2
− − (4 − Z) , R̃ (2,2) = τ
4
− 3
−3 3 + 2
(12 + Z) + 2
(48 − Z 2 )
Z Z 4Z X 2Z Z 8Z 32 Z
2 4
Z
(14)
ρ τ ω ω φ+ 2
1
J (1)
(V , τ ) = 1 − e−θ(τ −s)
+ e−θs V − ds, J (2) (V , τ ) = 0 (15)
0 θ θ θ
τ
1 2 ω ω 2φ
J (3) (V , τ ) = 2 1 − e−θ(τ −s) + e−θs V − ds (16)
2θ 0 θ θ
τ
1 ω ω φ+ 1
J (4) (V , τ ) = φ + + e−θ(τ −s) V − 2 J (6) (V , τ )ds (17)
2 0 θ θ
τ
−θ(τ −s) ω ω φ− 21
with J (V , τ )ds =
(6)
e − e−θs + e−θ(τ −u) V − du (18)
0 θ θ
Example of Volvol Expansion: Heston Model

We will now show the expansion of volvol for the Heston model (see Heston Model). By the asymptotic expansion, we can finally obtain an approximate analytic formula for the European call option. This work comes from the result of Benhamou et al. [1]. Consider a Heston model

\[ dX_t = \sqrt{v_t}\,dW_t - \frac{v_t}{2}\,dt, \qquad X_0 = x_0 \tag{19} \]

To expand the model, we add a parameter ε to the model.

\[ dX_t = \sqrt{v_t}\,dW_t - \frac{v_t}{2}\,dt, \qquad X_0 = x_0 \tag{22} \]

\[ dv_t = \kappa(\theta_t - v_t)\,dt + \epsilon\,\xi_t \sqrt{v_t}\,dB_t, \qquad v_0 = v_0 \tag{23} \]

Now, we will expand the European call option price formula with respect to ε. Note that when ε = 0 we have a Black–Scholes model, while for ε = 1 we have a Heston model. We already have the closed
formula of Black–Scholes for ε = 0. We expand at ε = 0, and let ε = 1 to obtain the approximate formula. In mathematical language, this can be written as follows:

\[ P_{\mathrm{Heston}} = P_{BS} + \mathbb{E}\left[\frac{\partial P_{BS}}{\partial \epsilon}\right] + \frac{1}{2}\,\mathbb{E}\left[\frac{\partial^2 P_{BS}}{\partial \epsilon^2}\right] + E \tag{24} \]

Here, we take a further approximation and express the partial derivatives in the above equation as a linear combination of the Black–Scholes Greeks. The idea is that the chain rule gives \( \frac{\partial P_{BS}}{\partial \epsilon} = \frac{\partial P_{BS}}{\partial S}\frac{\partial S}{\partial \epsilon} + \frac{\partial P_{BS}}{\partial \sigma}\frac{\partial \sigma}{\partial \epsilon} \). The same idea holds for the second derivative.

\[ P_{\mathrm{Heston}} = P_{BS}(x_0, \mathrm{var}_T) + \sum_{i=1}^{2} a_{i,T}\,\frac{\partial^{i+1} P_{BS}}{\partial x^{i}\,\partial y}(x_0, \mathrm{var}_T) + \sum_{i=0}^{1} b_{2i,T}\,\frac{\partial^{2i+2} P_{BS}}{\partial x^{2i}\,\partial y^{2}}(x_0, \mathrm{var}_T) + E \tag{25} \]

We refer to [1] for proofs and intermediate derivation. The parameters in the formula are as follows:

\[ \mathrm{var}_T = m_0 v_0 + m_1 \theta, \quad a_{1,T} = \rho\xi\,(p_0 v_0 + p_1 \theta), \quad a_{2,T} = (\rho\xi)^2 (q_0 v_0 + q_1 \theta), \quad b_{0,T} = \xi^2 (r_0 v_0 + r_1 \theta) \tag{26} \]

\[ \mathrm{var}_T = \int_0^T v_{0,t}\,dt \tag{27} \]

\[ m_0 = \frac{e^{-\kappa T}\left(-1 + e^{\kappa T}\right)}{\kappa}, \qquad m_1 = T - \frac{e^{-\kappa T}\left(-1 + e^{\kappa T}\right)}{\kappa}, \]
\[ p_0 = \frac{e^{-\kappa T}\left(-\kappa T + e^{\kappa T} - 1\right)}{\kappa^2}, \qquad p_1 = \frac{e^{-\kappa T}\left(\kappa T + e^{\kappa T}(\kappa T - 2) + 2\right)}{\kappa^2}, \]
\[ q_0 = \frac{e^{-\kappa T}\left(-\kappa T(\kappa T + 2) + 2e^{\kappa T} - 2\right)}{2\kappa^3}, \qquad q_1 = \frac{e^{-\kappa T}\left(2e^{\kappa T}(\kappa T - 3) + \kappa T(\kappa T + 4) + 6\right)}{2\kappa^3}, \]
\[ r_0 = \frac{e^{-2\kappa T}\left(-4e^{\kappa T}\kappa T + 2e^{2\kappa T} - 2\right)}{4\kappa^3}, \qquad r_1 = \frac{e^{-2\kappa T}\left(4e^{\kappa T}(\kappa T + 1) + e^{2\kappa T}(2\kappa T - 5) + 1\right)}{4\kappa^3} \tag{28} \]

The advantage is that there is no integration in the approximate formula, so the calculations are done much faster than in the exact formula. We will discuss this point in the section Numerical Results.

The error in the approximation is estimated as \( E = O\left([\xi_{\mathrm{Sup}}\sqrt{T}]^{3}\sqrt{T}\right) \).

Numerical Results

We test the approximate formula with the following strikes. We take strikes from 70% to 130% for short maturity and 10% to 730% for long maturity. Implied Black–Scholes volatilities of the closed formula, of the approximation formula, and related errors (in bp) are expressed as a function of maturities in fractions of years and relative strikes. The values of the parameters are as follows: θ = 6%, κ = 3, ξ = 30%, and ρ = 0%. Except for short maturity plus very small strikes, where we observe the largest difference (18.01 bp), the difference is less than 5 bp (1 bp = 0.01%) in almost all other cases. With regard to the speed of calculation, the approximate formula is about 100 times quicker than the exact formula (with the optimization in the integral).

References

[1] Benhamou, E., Gobet, E. & Miri, M. (2009). Time dependent Heston model, SIAM Journal on Financial Mathematics.
[2] Lewis, A.L. (2000). Option Valuation Under Stochastic Volatility: With Mathematica Code. February 2000.
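The coefficients in equations (26) and (28) are closed-form and easy to evaluate. The following sketch computes them for constant parameters; the function name and the returned dictionary layout are hypothetical, and the initial variance v0 = 6% used in the example is an assumption (the article lists θ, κ, ξ, and ρ but not v0).

```python
import math


def heston_expansion_coefficients(kappa, theta, xi, rho, v0, T):
    """Evaluate var_T, a_{1,T}, a_{2,T}, b_{0,T} from equations (26) and (28).

    Constant parameters are assumed; the name and return layout are illustrative.
    """
    kT = kappa * T
    ekt = math.exp(kT)          # e^{kappa T}
    e1 = math.exp(-kT)          # e^{-kappa T}
    e2 = math.exp(-2.0 * kT)    # e^{-2 kappa T}

    m0 = e1 * (-1.0 + ekt) / kappa
    m1 = T - m0
    p0 = e1 * (-kT + ekt - 1.0) / kappa ** 2
    p1 = e1 * (kT + ekt * (kT - 2.0) + 2.0) / kappa ** 2
    q0 = e1 * (-kT * (kT + 2.0) + 2.0 * ekt - 2.0) / (2.0 * kappa ** 3)
    q1 = e1 * (2.0 * ekt * (kT - 3.0) + kT * (kT + 4.0) + 6.0) / (2.0 * kappa ** 3)
    r0 = e2 * (-4.0 * ekt * kT + 2.0 * ekt ** 2 - 2.0) / (4.0 * kappa ** 3)
    r1 = e2 * (4.0 * ekt * (kT + 1.0) + ekt ** 2 * (2.0 * kT - 5.0) + 1.0) / (4.0 * kappa ** 3)

    return {
        "var_T": m0 * v0 + m1 * theta,
        "a1": rho * xi * (p0 * v0 + p1 * theta),
        "a2": (rho * xi) ** 2 * (q0 * v0 + q1 * theta),
        "b0": xi ** 2 * (r0 * v0 + r1 * theta),
    }


# Parameters from the Numerical Results section; v0 = theta is an assumption.
coeffs = heston_expansion_coefficients(kappa=3.0, theta=0.06, xi=0.30, rho=0.0, v0=0.06, T=1.0)
```

Two quick consistency checks follow from the formulas: m0 + m1 = T, so var_T equals θT whenever v0 = θ, and with ρ = 0 the skew terms a1 and a2 vanish, leaving only the b0 correction.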
The inequality in part 3 of Theorem 2 is sharp, as there exists a martingale \( (S_t)_{t \ge 0} \) such that \( S_t \to 0 \) in probability and such that

\[ T\,\frac{\partial}{\partial k}\,\Sigma(k, T)^2 \to -4 \tag{7} \]

as \( T \to \infty \) uniformly for \( k \in [-M, M] \). A proof of Theorem 2 is in [9].

Remark The condition \( S_t \to 0 \) in probability appearing in part 3 of Theorem 2 has a natural financial interpretation. Indeed, we have \( S_t \to 0 \) in probability (equivalently, almost surely) if and only if \( C(K, T) \to S_0 \) as \( T \to \infty \) for some K > 0 (equivalently, for all K > 0) where

\[ C(K, T) = \mathbb{E}[(S_T - K)^+] \tag{8} \]

is the price of a European call option. Since the long maturity call prices converge to the stock price in many models of interest (including, of course, the Black–Scholes model), we see that the assumption \( S_t \to 0 \) is not particularly onerous.

In fact, since \( (S_t)_{t \ge 0} \) is a nonnegative martingale, it must converge almost surely to some random variable \( S_\infty \) by the martingale convergence theorem. If \( S_\infty > 0 \) with positive probability, then \( \lim_{T \to \infty} \sqrt{T}\,\Sigma(k, T) \) exists and is finite for each k, and hence \( \lim_{T \to \infty} \Sigma(k, T) = 0 \).

Formula (10) of Theorem 3 can be found, for instance, in [8]. It can be used to calculate the long implied volatility for some examples.

Example (Exponential Lévy Models (see Exponential Lévy Models)). The simple inequality

\[ \mathbf{1}_{[1,\infty)}(x) \le x \wedge 1 \le x^p \tag{11} \]

which holds for all 0 ≤ p ≤ 1 and x ≥ 0, gives the bound

\[ \limsup_{T \to \infty}\ \sup_{p \in [0,1]} \left( -\frac{8}{T} \log \mathbb{E}[S_T^p] \right)^{1/2} \le \Sigma(\infty) \le \liminf_{T \to \infty} \left( -\frac{8}{T} \log \mathbb{P}[S_T \ge 1] \right)^{1/2} \tag{12} \]

If \( (\log(S_t))_{t \ge 0} \) has independent identically distributed increments, then the above bounds hold with equality by the large deviation principle. Indeed, let \( (L_t)_{t \ge 0} \) be a Lévy process with cumulant-generating function

\[ \Lambda(p) = \log \mathbb{E}(e^{pL_1}) \tag{13} \]

such that \( \Lambda(1) < \infty \), and model the stock price by the martingale \( S_t = e^{L_t - t\Lambda(1)} \). Then the long implied volatility satisfies

\[ \Sigma(\infty)^2 = 8 \sup_{p \in [0,1]} \{ p\Lambda(1) - \Lambda(p) \} \tag{14} \]

which is eight times the Legendre transform of the cumulant generating function evaluated at \( \Lambda(1) \).
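As a quick check of formula (14), consider the Black–Scholes case as a worked example. The symbols Λ for the cumulant generating function and Σ(∞) for the long implied volatility, and the choice L_t = σW_t with W a standard Brownian motion, are assumptions made here for illustration:

```latex
\begin{align*}
\Lambda(p) &= \log \mathbb{E}\big(e^{p\sigma W_1}\big) = \tfrac{1}{2}\sigma^2 p^2, \\
p\,\Lambda(1) - \Lambda(p) &= \tfrac{1}{2}\sigma^2\, p(1-p), \\
\Sigma(\infty)^2 &= 8 \sup_{p\in[0,1]} \tfrac{1}{2}\sigma^2\, p(1-p)
  = 8 \cdot \tfrac{\sigma^2}{8} = \sigma^2 .
\end{align*}
```

The supremum is attained at p = 1/2, so under these assumptions the long implied volatility equals σ, as one would expect for a model with constant volatility.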
See [5] for further asymptotics of stochastic volatility models based on this method, and see [4] for asymptotics based on perturbation methods.

Long Implied Volatility Cannot Fall

In many models of interest, the long implied volatility, if it exists, is constant as a function of the calendar time. However, the long implied volatility need not be a constant in general. In this section, we consider the dynamics of the long implied volatility, and in fact, we will see that the long implied volatility can never fall. In this section, we also assume that the stock price is strictly positive, rather than merely nonnegative. We define the implied volatility \( \Sigma_t(k, \tau) \) for log moneyness k and time to maturity τ as the unique nonnegative \( \mathcal{F}_t \)-measurable random variable that satisfies

\[ \mathbb{E}\left[ \left( \frac{S_{t+\tau}}{S_t} - e^{k} \right)^{+} \,\Big|\, \mathcal{F}_t \right] = C_{BS}\big(k, \tau\,\Sigma_t(k, \tau)^2\big) \tag{18} \]

The following theorem was proved in [9].

Theorem 4 For all \( k_1, k_2 \) and 0 ≤ s ≤ t we have

\[ \limsup_{\tau \to \infty}\ \Sigma_t(k_1, \tau) - \Sigma_s(k_2, \tau) \ge 0 \tag{19} \]

almost surely.

This result is an exact analog of the Dybvig–Ingersoll–Ross theorem that long zero-coupon rates never fall. See [6] for a nice proof of this fact.

Extensions

The previous discussion has considered the case where the stock pays no dividend and the risk-free interest rate is zero. In the general case, a stock pays a dividend and there is a cost to borrow money. The situation is usually modeled as follows. Let \( S_t \) be the stock price, let \( D_t \) be the cumulative dividends, and let \( B_t \) be the price of a numéraire asset such as a bank account at time t. There is no arbitrage if there exists a probability measure such that the process

\[ \frac{S_t}{B_t} + \int_0^t \frac{dD_s}{B_s} \tag{20} \]

is a martingale. In the case of proportional continuous dividends \( dD_t = \delta S_t\,dt \) and constant interest rate \( dB_t = rB_t\,dt \), there is no arbitrage if \( \tilde S_t = S_t e^{(\delta - r)t} \) defines a martingale. In this case, everything from above applies if we define the implied volatility by

\[ \mathbb{E}\left[ \left( \frac{\tilde S_T}{\tilde S_0} - e^{k} \right)^{+} \right] = C_{BS}\big(k, T\,\Sigma(k, T)^2\big) \tag{21} \]

where the log-moneyness parameter k now corresponds to the strike \( K = S_0 e^{k + (r - \delta)T} \).

However, it is unclear which of the above results can be suitably extended to the general case with arbitrary increasing adapted processes \( (D_t)_{t \ge 0} \) and \( (B_t)_{t \ge 0} \).

References

[1] Carr, P. & Wu, L. (2003). The finite moment log stable process and option pricing, Journal of Finance 58(2), 753–778.
[2] Dybvig, P., Ingersoll, J. & Ross, S. (1996). Long forward and zero-coupon rates can never fall, Journal of Business 69, 1–25.
[3] Gatheral, J. (2006). The Volatility Surface: A Practitioner's Guide, John Wiley & Sons, Hoboken, NJ.
[4] Fouque, J.-P., Papanicolaou, G. & Sircar, K.R. (2000). Derivatives in Financial Markets with Stochastic Volatility, Cambridge University Press.
[5] Jacquier, A. (2007). Asymptotic Skew Under Stochastic Volatility, Pre-print, Birkbeck College, University of London.
[6] Hubalek, F., Klein, I. & Teichmann, J. (2002). A general proof of the Dybvig-Ingersoll-Ross theorem: long forward rates can never fall, Mathematical Finance 12(4), 447–451.
[7] Lee, R. (2004). Implied volatility: statics, dynamics, and probabilistic interpretation, in Recent Advances in Applied Probability, R. Baeza-Yates, et al., eds, Springer, New York, 241–268.
[8] Lewis, A. (2000). Option Valuation Under Stochastic Volatility, Finance Press, Newport Beach.
[9] Rogers, L.C.G. & Tehranchi, M.R. (2008). Can the Implied Volatility Surface Move by Parallel Shifts? Pre-print, University of Cambridge.

Related Articles

Exponential Lévy Models; Heston Model; Implied Volatility Surface; Moment Explosions.

MICHAEL R. TEHRANCHI
SABR Model volatility mean-reverts. The use of geometrical meth-
ods in quantitative finance originates from [1, 2] and
was investigated in detail in [5, 6, 7].
The SABR model [4] is a stochastic volatility (see
Stochastic Volatility Models) model in which the
forward asset price follows the dynamics in a forward A More General Stochastic Process
measure T :
In the following, we will assume arbitrary local
volatility functions C(·) and a general time-homo
dft = at C(ft ) dWt (1) geneous one-dimensional stochastic differential equa-
dat = νat dZt , α ≡ a0 (2) tion (SDE) for the stochastic volatility process
with
g(x) ≡ det[gµν (x)] (10) The Short-time Limit
Plugging the general short-time limit for p at the first-
order in time as given by equation (12) in equation
Here, we have used the Einstein convention mean-
(7) and using a saddle-point approximation for the
ing that two repeated indices are implicitly summed.
integration over a, we obtain the short-time limit of
We set
the effective local volatility function.
Aµ (x) = gµν (x)Aν (x) (11) Getting implied volatility from the effective local
volatility function boils down to calculating the
The asymptotic solution to the Kolmogorov equation geodesic distance between any two given points
in the short-time limit is given by in the metric defined by the SVM. While this is
generally a nontrivial task, the geodesic distance is
known analytically in the special case of the geometry
g(x) −
d(x)2
p(t, x|x0 ) = n
(x)P (x 0
, x)e 4t associated with the SVM defined by equations (1) and
(4πt) 2 (6). Details are given in [7].
∞
× an (x)t n (12)
n=1 Asymptotic Implied Volatility
• d(x) is the geodesic distance between x and x 0 Applying these techniques, we find that the general
measured in the metric gµν . d(x) is defined as asymptotic implied volatility at the first order for any
amin α + qρν
with amin the volatility a, which minimizes the
geodesic distance d(a, fav |α, f0 ). The g ff are the and
ff -components of the inverse metric evaluated at
amin .
is the Van Vleck–Morette determinant as in
sinh(d(amin ))
equation (14), g is the determinant of the metric, and 1 (20)
P is the parallel gauge transport as in equation (15). d(amin )
The prime symbol indicates a derivative according An asymptotic formula for a SABR model with
to a. This formula in equation (16) is particularly a mean-reversion term, called λ-SABR, has been
useful as we can use it to rapidly calibrate any given obtained similarly in [7].
SVM. In the following, we apply it to the SABR
model with an arbitrary local volatility C(·).
Calibration of the Short-term Smile
Improved SABR Formula Moreover, by inverting equation (17) to lowest order
in τ , we see that for any values α, ρ, and ν, a given
The asymptotic implied volatility in the SABR model short-term smile σBS (f ) is calibrated by construction
with arbitrary local volatility C(·) is then given by if the local volatility function is chosen as
ln fK0 C(f ) =
σBS (τ, K) = (1 + σ1 (fav ) τ ) (17)
σ (K) σBS (f ) −1
f σBS (f ) 1 − f ln ff0 σBS (f )
f
with ln −1 √ 1
α 1 − ρ cosh |ρ| ν σBS (f ) + cosh
2 ρ f0
1−ρ 2
ανρ sinh(d(f )) (C(f )amin )2 (21)
σ1 (f ) = ∂f C(f ) +
4 d(f ) 24
1 2∂ff (C(f )amin ) References
2
+
f C(f )amin
[1] Avellaneda, M., Boyer-Olson, D., Busca, J. & Friz, P.
∂f (C(f )amin ) 2
− (18) (2002). Reconstructing the smile, Risk Magazine October
C(f )amin 91–95.
[2] Berestycki, H., Busca, J. & Florent, I. (2004). Computing the implied volatility in stochastic volatility models, Communications on Pure and Applied Mathematics 57(10), 1352–1373.
[3] Dupire, B. (2004). A unified theory of volatility, in Derivatives Pricing: The Classic Collection, P. Carr, ed., Risk Publications.
[4] Hagan, P., Kumar, D., Lesniewski, A.S. & Woodward, D.E. (2002). Managing smile risk, Wilmott Magazine September, 84–108.
[5] Henry-Labordère, P. (2007). Combining the SABR and BGM models, Risk Magazine October 102–107.
[6] Henry-Labordère, P. (2008). A geometric approach to the asymptotics of implied volatility, in R. Cont, ed., Frontiers in Quantitative Finance: Volatility and Credit Risk Modeling, Wiley, Chapter 4.
[7] Henry-Labordère, P. (2008). Analysis, Geometry and Modeling in Finance: Advanced Methods in Option Pricing, Financial Mathematics Series, Chapman & Hall/CRC 102–104.
[8] Obłój, J. (2008). Fine-tune your smile, Wilmott Magazine May.

Further Reading

Benaim, S., Friz, P., Lee, R. (2008). On the Black-Scholes implied volatility at extreme strikes, in Frontiers in Quantitative Finance: Volatility and Credit Risk Modeling, R. Cont, ed., Wiley, Chapter 3.
Lee, R. (2004). The moment formula for implied volatility at extreme strikes, Mathematical Finance 14(3), July 469–480.

PIERRE HENRY-LABORDÈRE
with d1,2 = − log (K/F0 ) /V ± V /2. It follows that
Implied Volatility: Large we can express the normalized Black–Scholes call
Strike Asymptotics price
cBS := CBS /S0 (8)
It is clear from the afore mentioned monotonicity of and one is led to Lee’s moment formula
cBS (k, ·) that the fatness of the tail of the returns, for
example, the behavior of V (k)2 /k ∼ ψ p ∗
F̄ (k) = 1 − F (k) = [X > k] as k → ∞ (16) V (−k)2 /k ∼ ψ q ∗ (26)
is related to the shape of the “wing” of the implied Recall that g (k) ∼ h (k) stands for the precise math-
volatility (smile) for far-out-of-the-money calls, V (k) ematical statement that lim g (k) / h (k) → 1 as k →
as k → ∞, and similarly, for F (k) , V (k) as k → ∞. In the same spirit, let us agree that
−∞. Surprisingly, perhaps, this link can be made
very explicit. Let us agree that if F admits a density, g (k) h (k) means lim sup g (k) / h (k) → 1
it is denoted by f = F . Let us also adopt the
common convention that as k → ∞ (27)
$$ dF = \sigma F^{\beta}\, dW, \qquad d\sigma = \eta \sigma\, dZ \tag{36} $$

with σ, η > 0, β < 1 and two Brownian motions W, Z assumed (here) to be independent. Using standard stochastic calculus [4], one can give good enough estimates on $\mathbb{E}|F_T/F_0|^u$, from above and below, to see that

$$ \log \mathbb{E}|F_T/F_0|^u = \log \mathbb{E}\exp(uX) \sim \frac{\eta^2 T u^2}{2(1-\beta)^2} \quad \text{as } u \to \infty \tag{37} $$

From this, Kasahara's theorem allows us to deduce the tail behavior of X, namely

$$ -\log \mathbb{P}[X > x] \sim \frac{(1-\beta)^2 x^2}{2\eta^2 T} \tag{38} $$

and the (right hand) tail-wing formula reveals that the implied volatility in the SABR model is asymptotically flat, σ(k, T) ∼ η/(1 − β) as k → ∞.

Early contributions in the study of smile asymptotics are [1, 6]. The moment formula appears in [8], the tail-wing formula in [2] with some additional criteria in [3]. A survey on the topic, together with some new examples (including CEV and SABR), is found in [4]. Further developments in the field include the refined asymptotic results of Gulisashvili and Stein [7]; in a simple log-normal stochastic volatility model of the form dF = σF dW, dσ = ησ dZ, with two independent Brownian motions W, Z, they find

$$ \sigma(k,T)\sqrt{T} = \sqrt{2k} - \frac{\log k + \log\log k}{2\eta\sqrt{T}} + O(1) \tag{39} $$

End Notes

a. Equation (4) is valid in a nondeterministic interest rate setting, provided the expectation is taken with respect to the risk-neutral measure (which is equivalent but, in general, not identical to T ).
b. Φ denotes the distribution function of normal (0, 1).

References

[1] Avellaneda, M. & Zhu, Y. (1998). A risk-neutral stochastic volatility model, International Journal of Theoretical and Applied Finance 1(2), 289–310.
[2] Benaim, S. & Friz, P.K. (2009). Regular variation and smile asymptotics, Mathematical Finance 19(1), 1–12, eprint arXiv:math/0603146.
[3] Benaim, S. & Friz, P.K. (2008). Smile asymptotics II: models with known MGF, Journal of Applied Probability 45(1), 16–32.
[4] Benaim, S., Friz, P.K. & Lee, R. (2008). The Black–Scholes implied volatility at extreme strikes, in Frontiers in Quantitative Finance: Volatility and Credit Risk Modeling, Chapter 2, Wiley.
[5] Bingham, N.H., Goldie, C.M. & Teugels, J.L. (1987). Regular Variation, CUP.
[6] Gatheral, J. (2000). Rational shapes of the Volatility Surface, Presentation, RISK Conference.
[7] Gulisashvili, A. & Stein, E. Implied volatility in the Hull–White model, Mathematical Finance, to appear.
[8] Lee, R. (2004). The moment formula for implied volatility at extreme strikes, Mathematical Finance 14(3), 469–480.

Further Reading

Gatheral, J. (2006). The Volatility Surface, A Practitioner's Guide, Wiley.

PETER K. FRIZ
Constant Elasticity of Variance (CEV) Diffusion Model

The CEV Process

The constant elasticity of variance (CEV) model is a one-dimensional diffusion process that solves a stochastic differential equation (SDE)

$$ dS_t = \mu S_t\, dt + a S_t^{\beta+1}\, dB_t \tag{1} $$

with the instantaneous volatility $\sigma(S) = aS^{\beta}$ specified to be a power function of the underlying spot price. The model was introduced by Cox [7] as one of the early alternative processes to the geometric Brownian motion to model asset prices. Here β is the elasticity parameter of the local volatility, dσ/dS = βσ/S, and a is the volatility scale parameter. For β = 0, the CEV model reduces to the constant volatility geometric Brownian motion process employed in the Black, Scholes, and Merton model. When β = −1, the volatility specification is that of Bachelier (the asset price has the constant diffusion coefficient, while the logarithm of the asset price has the a/S volatility). For β = −1/2 the model reduces to the square-root model of Cox and Ross [8].

Cox [7] originally studied the case β < 0 for which the volatility is a decreasing function of the asset price. This specification captures the leverage effect in the equity markets: the stock price volatility increases as the stock price declines. The result of this inverse relationship between the price and volatility is the implied volatility skew exhibited by options prices in the CEV model with negative elasticity. The elasticity parameter β controls the steepness of the skew (the larger the |β|, the steeper the skew), while the scale parameter a fixes the at-the-money volatility level. This ability to capture the skew has made the CEV model popular in equity options markets.

Emanuel and MacBeth [14] extended Cox's analysis to the positive elasticity case β > 0, where the asset price volatility is an increasing function of the asset price. The driftless process with µ = 0 and with positive β is a strict local martingale. It has been applied to modeling commodity prices that exhibit increasing implied volatility skews with the volatility increasing with the strike price, but care should be taken when working with this model (see the discussion below).

The CEV diffusion has the following boundary characterization (see, e.g., [4] for Feller's boundary classification for one-dimensional diffusions). For −1/2 ≤ β < 0, the origin is an exit boundary, and the process is killed the first time it hits the origin. For β < −1/2, the origin is a regular boundary point. The SDE (1) does not uniquely specify the diffusion process, and a boundary condition is needed at the origin. In the CEV model, it is specified as a killing boundary. Thus, the CEV process with β < 0 naturally incorporates the possibility of bankruptcy: the stock price can hit zero with positive probability, at which time the bankruptcy occurs. For β ≥ 0, the origin is an inaccessible natural boundary.

Reduction to Bessel Processes, Transition Density, and Probability of Default

The CEV process is analytically tractable. Its transition probability density and cumulative distribution function are known in closed form (see end note a). It is closely related to Bessel processes and inherits their analytical tractability. The CEV process with drift (µ ≠ 0) is obtained from the process without drift (µ = 0) via a scale and time change:

$$ S_t^{(\mu)} = e^{\mu t} S_{\tau(t)}^{(0)}, \qquad \tau(t) = \frac{e^{2\mu\beta t} - 1}{2\mu\beta} \tag{2} $$

Let $\{R_t^{(\nu)}, t \ge 0\}$ be a Bessel process of index ν. Recall that for ν ≥ 0, zero is an unattainable entrance boundary. For ν ≤ −1, zero is an exit boundary. For ν ∈ (−1, 0), zero is a regular boundary. In our application, we specify zero as a killing boundary to kill the process at the first hitting time of zero (see, e.g., [4, pp. 133–134], for a summary of Bessel processes). Before the first hitting time of zero, the CEV process without drift can be represented as a power of a Bessel process:

$$ S_t^{(0)} = \left( a|\beta| R_t^{(\nu)} \right)^{-1/\beta} \tag{3} $$

where ν = 1/(2β).

The CEV transition density is obtained from the well-known expression for the transition density of
the Bessel process (see [4, p. 115; 21, p. 446]). For the driftless process, it is given by

$$ p^{(0)}(S_0, S_t; t) = \frac{S_t^{-2\beta - 3/2} S_0^{1/2}}{a^2 |\beta| t}\, I_{|\nu|}\!\left( \frac{S_0^{-\beta} S_t^{-\beta}}{a^2 \beta^2 t} \right) \exp\!\left( -\frac{S_0^{-2\beta} + S_t^{-2\beta}}{2 a^2 \beta^2 t} \right) \tag{4} $$

where $I_\nu$ is the modified Bessel function of the first kind of order ν. From equation (2), the transition density with drift is obtained from the density (4) according to

$$ p^{(\mu)}(S_0, S_t; t) = e^{-\mu t}\, p^{(0)}\!\left( S_0, e^{-\mu t} S_t; \tau(t) \right) \tag{5} $$

The density (5) was originally obtained by Cox [7] for β < 0 and by Emanuel and MacBeth [14] for β > 0 on the basis of the result due to Feller [15].

For β < 0, in addition to the continuous transition density, we also have a positive probability for the process started at $S_0$ at time zero to hit zero by time t ≥ 0 (probability of default or bankruptcy) that is given explicitly by

$$ G\!\left( |\nu|, \frac{\mu S_0^{-2\beta}}{a^2 \beta \left( e^{2\mu\beta t} - 1 \right)} \right) \tag{6} $$

where $G(\nu, x) = (1/\Gamma(\nu)) \int_x^{\infty} u^{\nu-1} e^{-u}\, du$ is the complementary Gamma distribution function. This expression can be obtained by integrating the continuous density (5) from zero to infinity and observing that the result is less than one, that is, the density is defective. The defect is equal to the probability mass at zero given by equation (6).

While killing the process at zero is desirable for stock price modeling, it may be undesirable in other contexts, where one would prefer the process that stays strictly positive (e.g., in stock index models). A regularized version of the CEV process that never hits zero has been constructed by Andersen and Andreasen [1] (see also [9]). The positive probability of hitting zero comes from the explosion of instantaneous volatility as the process falls toward zero. The regularized version of the CEV process fixes a small value ε > 0. For S > ε, the volatility is according to the CEV specification. For S ≤ ε, the volatility is fixed at the constant level $a\epsilon^{\beta}$. We thus have a sequence of regularized strictly positive processes indexed by ε that converge to the CEV process in the limit ε → 0.

The CEV process with β > 0 can similarly be regularized to prevent the volatility explosion as the process tends to infinity by picking a large value E > 0 and fixing the volatility above E to equal $aE^{\beta}$. The regularized processes with µ = 0 are true martingales, as opposed to the failure of the martingale property for the driftless CEV process with β > 0 and µ = 0, which is only a strict local martingale. The failure of the martingale property for the nonregularized process with β > 0 can be explicitly illustrated by computing the expectation (using the transition density (5)):

$$ \mathbb{E}[S_t] = e^{\mu t} S_0 \left[ 1 - G\!\left( \nu, \frac{\mu S_0^{-2\beta}}{a^2 \beta \left( e^{2\mu\beta t} - 1 \right)} \right) \right] \tag{7} $$

CEV Options Pricing

The CEV call option pricing formula with strike K, time to expiration T, and the initial asset price S can be obtained in closed form by integrating the call payoff with the risk-neutral CEV density (5) with the risk-neutral drift µ = r − q (r is the risk-free interest rate and q is the dividend yield). The result can be expressed in terms of the complementary noncentral chi-square distribution function Q(z; v, k) ([7] for β < 0, [14] for β > 0; see also [11, 22]):

$$ C(S; K, T) = e^{-rT}\, \mathbb{E}\!\left[ (S_T - K)^+ \right] = \begin{cases} e^{-qT} S\, Q(\xi; 2\nu, y_0) - e^{-rT} K \left( 1 - Q(y_0; 2(1+\nu), \xi) \right), & \beta > 0 \\ e^{-qT} S\, Q(y_0; 2(1+|\nu|), \xi) - e^{-rT} K \left( 1 - Q(\xi; 2|\nu|, y_0) \right), & \beta < 0 \end{cases} \tag{8} $$

where

$$ \xi = \frac{2\mu S^{-2\beta}}{a^2 \beta \left( e^{2\mu\beta T} - 1 \right)}, \qquad y_0 = \frac{2\mu K^{-2\beta}}{a^2 \beta \left( 1 - e^{-2\mu\beta T} \right)} \tag{9} $$

and $S = S_0$ is the initial asset price at time zero. The price of the put option is obtained from the put–call parity relationship:

$$ P(S; K, T) = C(S; K, T) + K e^{-rT} - S e^{-qT} \tag{10} $$
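The time change in equation (2) and the default probability in equation (6) lend themselves to a short numerical check. The Python sketch below is illustrative only: all function names are hypothetical, the complementary Gamma function is evaluated with a textbook series and continued fraction rather than a library routine, and the argument of G is rewritten in terms of τ(t) from equation (2), which is algebraically the same quantity as in equation (6) and stays finite as µ → 0.

```python
import math

def time_change(t, mu, beta):
    """tau(t) = (exp(2*mu*beta*t) - 1) / (2*mu*beta), equation (2)."""
    x = 2.0 * mu * beta
    if abs(x * t) < 1e-12:            # driftless limit: tau(t) -> t
        return t
    return math.expm1(x * t) / x

def comp_gamma(nu, x):
    """Complementary Gamma function G(nu, x) for nu > 0 (assumed helper)."""
    if x <= 0.0:
        return 1.0
    if x < nu + 1.0:
        # 1 - P(nu, x), with P(nu, x) = x^nu e^-x sum_n x^n / Gamma(nu+n+1)
        term = math.exp(nu * math.log(x) - x - math.lgamma(nu + 1.0))
        total, n = term, 1
        while term > 1e-17 * total and n < 10_000:
            term *= x / (nu + n)
            total += term
            n += 1
        return min(1.0, max(0.0, 1.0 - total))
    # Continued fraction (modified Lentz method) for larger x
    tiny = 1e-300
    b = x + 1.0 - nu
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 1000):
        an = -i * (i - nu)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return math.exp(nu * math.log(x) - x - math.lgamma(nu)) * h

def cev_default_probability(S0, a, beta, mu, t):
    """Probability of hitting zero by time t for beta < 0, equation (6)."""
    if beta >= 0.0:
        raise ValueError("equation (6) applies to beta < 0")
    nu_abs = 1.0 / (2.0 * abs(beta))
    tau = time_change(t, mu, beta)
    # mu*S0^(-2b) / (a^2 b (e^(2 mu b t) - 1)) == S0^(-2b) / (2 a^2 b^2 tau)
    x = S0 ** (-2.0 * beta) / (2.0 * a * a * beta * beta * tau)
    return comp_gamma(nu_abs, x)

# Square-root case beta = -1/2, where |nu| = 1 and G(1, x) = exp(-x).
S0, sigma_star, beta, mu = 50.0, 0.2, -0.5, 0.05
a = sigma_star / S0 ** beta           # scale chosen so that sigma(S0) = sigma_star
p10 = cev_default_probability(S0, a, beta, mu, 10.0)
print(f"10-year default probability: {p10:.6f}")
```

The square-root case is a convenient check because the complementary Gamma function collapses to an exponential there. For other values of β the same routine applies, although a tested library implementation of the incomplete Gamma function would be the safer choice in practice.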
The complementary noncentral chi-square distribution function can be expressed as the series of complementary Gamma distribution functions ([22, p. 214]):

intensity that is an affine function of the instantaneous variance:

$$ \lambda(S) = b + c\,\sigma^2(S) = b + c\,a^2 S^{2\beta} \tag{13} $$
[Figure 1, two panels. (a) Credit spread (percent, 0 to 6) against time to maturity (years, 0 to 30) for JDCEV and CEV with β = −1/2, −1, −2, −3. (b) Implied volatility (%) against strike (30 to 55) for JDCEV with T = 0.25, 0.5, 1, 5.]

Figure 1 (a) Term structures of credit spreads. Parameter values: S = S* = 50, σ* = 0.2, β = −1/2, −1, −2, −3, r = 0.05, q = 0. JDCEV: b = 0.02 and c = 1/2. CEV: b = 0 and c = 0. (b) Implied volatility skews. Parameter values: S = S* = 50, σ* = 0.2, r = 0.05, q = 0. For JDCEV model: b = 0.02, c = 1/2 and β = −1, the times to expiration are T = 0.25, 0.5, 1, 5 years. For CEV model: b = c = 0, β = −1, −2 and times to expiration are T = 0.25, 5. Implied volatilities are plotted against the strike price
volatility skew with implied volatilities increasing for lower strikes, as the local volatility and the default intensity both increase as the stock price declines. The volatility elasticity β controls the slope of the skew in the CEV model. The slope of the skew in the JDCEV model is steeper and is controlled by β, as well as the default intensity parameters b and c.
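To see how the default intensity in equation (13) reacts to a falling stock price, the following Python sketch evaluates it with the parameter values quoted in the Figure 1 caption. It uses the reference-level form σ(S) = σ*(S/S*)^β from the article's end note b; the function names are hypothetical and the snippet is only a numerical illustration of equation (13), not part of the pricing formulas.

```python
def cev_local_vol(S, sigma_star, S_star, beta):
    """Local volatility sigma(S) = sigma_star * (S / S_star)**beta."""
    return sigma_star * (S / S_star) ** beta

def jdcev_default_intensity(S, sigma_star, S_star, beta, b, c):
    """Default intensity lambda(S) = b + c * sigma(S)**2, equation (13)."""
    sigma = cev_local_vol(S, sigma_star, S_star, beta)
    return b + c * sigma * sigma

# Figure 1 caption values: S* = 50, sigma* = 0.2, b = 0.02, c = 1/2, beta = -1.
for spot in (50.0, 40.0, 25.0):
    lam = jdcev_default_intensity(spot, 0.2, 50.0, -1.0, 0.02, 0.5)
    print(f"S = {spot:5.1f}  lambda = {lam:.4f}")
```

With β < 0 both the local volatility and the intensity rise as the spot falls, which is the mechanism behind the steeper JDCEV skew described above.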
Implied Volatility and the SABR model

By using singular perturbation techniques, Hagan and Woodward [17] obtained explicit asymptotic formulas for the Black–Scholes implied volatility $\sigma_{BS}$ of European calls and puts on an asset whose forward price F(t) follows the CEV dynamics, that is, $dF_t = aF_t^{\beta+1}\, dB_t$,

$$ \sigma_{BS} = a f_{av}^{\beta} \left[ 1 - \frac{\beta(\beta+3)}{24} \left( \frac{F_0 - K}{f_{av}} \right)^2 + \frac{\beta^2}{24}\, a^2 \tau f_{av}^{2\beta} + \cdots \right] \tag{15} $$

where τ is time to expiration, $f_{av} = (F_0 + K)/2$ and $F_0$ is today's forward price (Hagan and Woodward's β is equal to our β + 1). This asymptotic formula approximates the exact CEV-implied volatilities well when the ratio $F_0/K$ is not too far from one and when K and $F_0$ are far away from zero. The accuracy tends to deteriorate when the values are close to zero since this asymptotic approximation does not take into account the killing boundary condition at zero.

Hagan et al. [16] introduced the SABR model, which is a CEV model with stochastic volatility. More precisely, the volatility scale parameter a is made stochastic, so that the forward asset price follows the dynamics:

$$ dF_t = a_t F_t^{\beta+1}\, dB_t^{(1)} \quad \text{and} \quad da_t = \eta a_t\, dB_t^{(2)} \tag{16} $$

where $\langle dB_t^{(1)}, dB_t^{(2)} \rangle = \rho\, dt$. Hagan et al. derive the asymptotic expression for the implied volatility in the SABR model.

Introducing Jumps and Stochastic Volatility into the CEV Model

Mendoza et al. [19] introduce jumps and stochastic volatility into the JDCEV model by time changing the JDCEV process. Lévy subordinator time changes introduce state-dependent jumps into the process, while absolutely continuous time changes introduce stochastic volatility. The result is a flexible family of models that exhibit the leverage effect, default intensity linked to the stock price volatility, jumps, and stochastic volatility. These models inherit the analytical tractability of the CEV and JDCEV models as long as the Laplace transform of the time-change process is analytically tractable. The stochastic volatility version of the CEV model obtained in this approach is different from the SABR model in two respects. The advantage of the time-change approach is that it preserves the analytical tractability for more realistic choices for the stochastic volatility process, such as the Cox–Ingersoll–Ross (CIR) process with mean-reversion. Another advantage is that jumps, including the jump to default, can also be incorporated. The weakness is that it is hard to incorporate the correlation between the price and volatility.

End Notes

a. In this article we present the results for the CEV model with constant parameters. We note that the process remains analytically tractable when µ and a are taken to be deterministic functions of time [6].
b. It is convenient to parameterize the local volatility function as $\sigma(S) = aS^{\beta} = \sigma_*(S/S^*)^{\beta}$ so that at some reference spot price level S = S* (e.g., the at-the-money level at the time of model calibration) the volatility takes the reference value, σ(S*) = σ*. In the example presented here, the reference level is taken to equal the initial spot price level, S* = S0, and the volatility scale parameter is $a = \sigma_*/(S_0)^{\beta}$.

Acknowledgments

This research was supported by the National Science Foundation under grant DMS-0802720.

References

[1] Andersen, L. & Andreasen, J. (2000). Volatility skew and extensions of the LIBOR market model, Applied Mathematical Finance 7, 1–32.
[2] Atlan, M. & Leblanc, B. (2005). Hybrid equity-credit modelling, Risk Magazine 18, 8.
[3] Benton, D. & Krishnamoorthy, K. (2003). Computing discrete mixtures of continuous distributions: noncentral chi-square, noncentral t and the distribution of the square of the sample multiple correlation coefficient, Computational Statistics and Data Analysis 43, 249–267.
[4] Borodin, A. & Salminen, P. (2002). Handbook of Brownian Motion: Facts and Formulae, Probability and Its Applications, 2nd rev Edition, Birkhäuser Verlag AG.
[5] Campi, L., Sbuelz, A. & Polbennikov, S. (2008). Systematic equity-based credit risk: A CEV model with jump to default, Journal of Economic Dynamics and Control 33, 93–108.
[6] Carr, P. & Linetsky, V. (2006). A jump to default extended CEV model: an application of Bessel processes, Finance and Stochastics 10, 303–330.
[7] Cox, J.C. (1975, 1996). Notes on option pricing I: constant elasticity of variance diffusions, Reprinted in The Journal of Portfolio Management 23, 15–17.
[8] Cox, J.C. & Ross, S.A. (1976). The valuation of options for alternative stochastic processes, Journal of Financial Economics 3, 145–166.
[9] Davydov, D. & Linetsky, V. (2001). Pricing and hedging path-dependent options under the CEV process, Management Science 47, 949–965.
[10] Davydov, D. & Linetsky, V. (2003). Pricing options on scalar diffusions: an eigenfunction expansion approach, Operations Research 51, 185–209.
[11] Delbaen, F. & Shirakawa, H. (2002). A note of option pricing for constant elasticity of variance model, Asia-Pacific Financial Markets 9, 85–99.
[12] Ding, C.G. (1992). Computing the non-central χ² distribution function, Applied Statistics 41, 478–482.
[13] Dyrting, S. (2004). Evaluating the noncentral chi-square distribution for the Cox-Ingersoll-Ross process, Computational Economics 24, 35–50.
[14] Emanuel, D.C. & MacBeth, J.D. (1982). Further results on the constant elasticity of variance call option pricing model, The Journal of Financial and Quantitative Analysis 17, 533–554.
[15] Feller, W. (1951). Two singular diffusion problems, The Annals of Mathematics 54, 173–182.
[16] Hagan, P.S., Kumar, D., Lesniewski, A.S. & Woodward, D.E. (2002). Managing smile risk, Wilmott Magazine 1, 84–108.
[17] Hagan, P. & Woodward, D. (1999). Equivalent Black volatilities, Applied Mathematical Finance 6, 147–157.
[18] Linetsky, V. (2004). Lookback options and diffusion hitting times: a spectral expansion approach, Finance and Stochastics 8, 343–371.
[19] Mendoza, R., Carr, P. & Linetsky, V. (2007). Time Changed Markov Processes in Credit-Equity Modeling, Mathematical Finance, to appear.
[20] Mendoza, R. & Linetsky, V. (2008). Equity Default Swaps under the Jump-to-Default Extended CEV Model. Working paper.
[21] Revuz, D. & Yor, M. (1999). Continuous Martingales and Brownian Motion, Grundlehren Der Mathematischen Wissenschaften, Springer.
[22] Schroder, M. (1989). Computing the constant elasticity of variance option pricing formula, The Journal of Finance 44, 211–219.

VADIM LINETSKY & RAFAEL MENDOZA
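As a numerical companion to the Hagan and Woodward expansion in equation (15) of the CEV article above, the following Python sketch evaluates the series truncated after the two displayed correction terms. The function name and the example inputs are hypothetical, and the scale parameter is set through a = σ*/S0^β as in end note b; the output is an approximation whose accuracy, as the article notes, degrades near zero.

```python
def hagan_woodward_implied_vol(F0, K, a, beta, tau):
    """Black-Scholes implied volatility from equation (15), truncated after
    the two displayed correction terms (beta is the article's elasticity)."""
    f_av = 0.5 * (F0 + K)
    rel = (F0 - K) / f_av
    correction = (1.0
                  - beta * (beta + 3.0) / 24.0 * rel ** 2
                  + beta ** 2 / 24.0 * a ** 2 * tau * f_av ** (2.0 * beta))
    return a * f_av ** beta * correction

# Illustrative inputs: forward 50, reference volatility 20%, beta = -1.
F0, sigma_star, beta, tau = 50.0, 0.2, -1.0, 1.0
a = sigma_star / F0 ** beta
for K in (30.0, 40.0, 50.0, 60.0):
    vol = hagan_woodward_implied_vol(F0, K, a, beta, tau)
    print(f"K = {K:4.0f}  implied vol = {vol:.4f}")
```

For β = 0 every correction term vanishes and the function returns a, the constant Black–Scholes volatility. For β < 0 the leading factor a·f_av^β alone already produces higher implied volatilities at lower strikes.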
Bates Model

The Bates [3] and Scott [13] option pricing models were designed to capture two features of the asset returns: the fact that conditional volatility evolves over time in a stochastic but mean-reverting fashion, and the presence of occasional substantial outliers in the asset returns. The two models combined the Heston [9] model of stochastic volatility (see Heston Model) with the Merton [11] model of independent normally distributed jumps in the log asset price (see Jump-diffusion Models). The Bates model ignores interest rate risk, while the Scott model allows interest rates to be stochastic. Both models evaluate European option prices numerically, using the Fourier inversion approach of Heston (see also Fourier Transform and Fourier Methods in Options Pricing for a general discussion of Fourier transform methods in finance). The Bates model also includes an approximation for pricing American options (see American Options). The two models were historically important in showing that the tractable class of affine option pricing models includes jump processes as well as diffusion processes.

All option pricing models rely upon a risk-neutral representation of the data generating process that includes appropriate compensation for the various risks. In the Bates and Scott models, the risk-neutral processes for the underlying asset price $S_t$ and instantaneous variance $V_t$ are assumed to be of the form

$$ \begin{aligned} dS_t/S_t &= (b - \lambda^* \bar{k}^*)\, dt + \sqrt{V_t}\, dZ_t + k^*\, dq_t \\ dV_t &= (\alpha - \beta^* V_t)\, dt + \sigma_v \sqrt{V_t}\, dZ_{vt} \end{aligned} \tag{1} $$

where b is the cost of carry; $Z_t$ and $Z_{vt}$ are Wiener processes with correlation ρ; $q_t$ is an integer-valued Poisson counter with risk-neutral intensity λ* that counts the occurrence of jumps; and k* is the random percentage jump size, with a Gaussian distribution $\ln(1 + k^*) \sim N\!\left( \ln(1 + \bar{k}^*) - \tfrac{1}{2}\delta^2, \delta^2 \right)$ conditional upon the occurrence of a jump. The Bates model assumes b is constant, while the Scott model assumes it is a linear combination of $V_t$ and an additional state variable that follows an independent square-root process. Bates [3] examines foreign currency options, for which b is the domestic/foreign interest differential, while Scott's application [13] to nondividend paying stock options implies the cost of carry is equal to the risk-free interest rate.

The postulated process has an associated conditional characteristic function that is exponentially affine in the state variables. For the Bates model, the characteristic function is

$$ \varphi(i\Phi) = E_0^*\!\left[ e^{i\Phi \ln S_T} \mid S_0, V_0, T \right] = \exp\!\left[ i\Phi \ln S_0 + C(T; i\Phi) + D(T; i\Phi) V_0 + \lambda^* T E(i\Phi) \right] \tag{2} $$

where $E_0^*[\cdot]$ is the risk-neutral expectational operator associated with equation (1), and

$$ \gamma(z) = \sqrt{(\rho\sigma_v z - \beta^*)^2 - \sigma_v^2 (z^2 - z)} \tag{3} $$

$$ C(T; z) = bTz - \frac{\alpha T}{\sigma_v^2} \left[ \rho\sigma_v z - \beta^* - \gamma(z) \right] - \frac{2\alpha}{\sigma_v^2} \ln\!\left[ 1 + \left[ \rho\sigma_v z - \beta^* - \gamma(z) \right] \frac{1 - e^{\gamma(z)T}}{2\gamma(z)} \right] \tag{4} $$

$$ D(T; z) = \frac{z^2 - z}{\gamma(z)\, \dfrac{e^{\gamma(z)T} + 1}{e^{\gamma(z)T} - 1} + \beta^* - \rho\sigma_v z} \tag{5} $$

$$ E(z) = (1 + \bar{k}^*)^z e^{\frac{1}{2}\delta^2 (z^2 - z)} - 1 - \bar{k}^* z \tag{6} $$

The terms C(·) and D(·) are identical to those in the Heston [9] stochastic volatility model, while E(·) captures the additional distributional impact of jumps. Scott's generalization to stochastic interest rates uses an extended Fourier transform of the form

$$ \varphi^*(z) = E_0^*\!\left[ \exp\!\left( -\int_0^T r_t\, dt + z \ln S_T \right) \Big|\, S_0, r_0, V_0, T \right] \tag{7} $$

which has an analytical solution for complex-valued z that is also exponentially affine in the state variables $S_0$, $r_0$, and $V_0$.

European call option prices take the form c = B(F P1 − X P2), where B is the price of a discount bond of maturity T, F is the forward price on the underlying asset, X is the option's exercise price, and
P1 and P2 are upper tail probability measures derivable from the characteristic function. The papers of Bates [3] and Scott [13] present Fourier inversion methods for evaluating P1 and P2 numerically. However, faster methods were subsequently developed for directly evaluating European call options, using a single numerical integration of the form

$$ c = BF - BX \left[ \frac{1}{2} + \frac{1}{\pi} \int_0^{\infty} \mathrm{Re}\!\left[ \frac{f(i\Phi)\, e^{-i\Phi \ln X}}{i\Phi (1 - i\Phi)} \right] d\Phi \right] \tag{8} $$

where Re[z] is the real component of a complex variable z (see Fourier Methods in Options Pricing). For the Bates model, f(iΦ) = ϕ(iΦ); for the Scott model, f(iΦ) = ϕ*(iΦ)/B. European put options can be evaluated from European call option prices using the put–call parity relationship p = c + B(X − F) (see Put–Call Parity for details on put–call parity).

Evaluating equation (8) typically involves integration of a dampened oscillatory function. While there exist canned programs for integration over a semi-infinite domain, most papers use various forms of integration over a truncated domain. Bates [3] uses Gauss–Kronrod quadrature (see Quadrature Methods). Fast Fourier transform approaches have also been proposed, but these involve substantially more functional evaluations. The integration is typically well behaved, but there do exist extreme parameter values (e.g., |ρ| near 1) for which the path of integration crosses the branch cut of the log function. As all contemporaneous option prices of a given maturity use the same values of f(iΦ) regardless of the strike price X, evaluating options jointly greatly increases numerical efficiency.

Related Models

Related affine models can be categorized along four lines:
1. alternate specifications of jump processes;
2. the Bates [5] extension to stochastic-intensity jump processes;
3. models in which the underlying volatility can also jump; and
4. multifactor specifications.

Alternate jump specifications (including Lévy processes) with independent and identically distributed jumps involve modification of the functional form of E(·), and are discussed in other articles (Tempered Stable Process; Normal Inverse Gaussian Model; Variance-gamma Model; Kou Model; Exponential Lévy Models). The Bates [5] model with (risk-neutral) stochastic jump intensities of the form $\lambda^* + \lambda_1^* V_t$ involves modifying γ(·) and D(·):

$$ \gamma(z) = \sqrt{(\rho\sigma_v z - \beta^*)^2 - \sigma_v^2 \left[ z^2 - z + 2\lambda_1^* E(z) \right]} \tag{9} $$

$$ D(T; z) = \frac{z^2 - z + 2\lambda_1^* E(z)}{\gamma(z)\, \dfrac{e^{\gamma(z)T} + 1}{e^{\gamma(z)T} - 1} + \beta^* - \rho\sigma_v z} \tag{10} $$

See also Time-changed Lévy Process for other stochastic-intensity jump models.

Bates [5] also contains multifactor specifications for the instantaneous variance and jump intensity. The general class of affine jump-diffusion models is presented in [8], including the volatility-jump option pricing model. Scott's extended Fourier transform approach for stochastic interest rates was subsequently also used by Bakshi and Madan [2] and Duffie et al. [8].

Further Reference Material

Bates [7, pp. 943–4] presents a simple derivation of equation (8), and cites earlier papers that develop the single-integration approach. Numerical integration issues are discussed by Lee [10]. Bates [3] and Bakshi et al. [1] estimate and test the Bates and Scott models, respectively, while Pan [12] provides additional estimates and tests of the Bates [5] stochastic-intensity model. Bates [4, 6] surveys empirical option pricing research.

References

[1] Bakshi, G., Cao, C. & Chen, Z. (1997). Empirical performance of alternative option pricing models, Journal of Finance 52, 2003–2049.
[2] Bakshi, G. & Madan, D.B. (2000). Spanning and derivative-security valuation, Journal of Financial Economics 55, 205–238.
[3] Bates, D.S. (1996). Jumps and stochastic volatility: exchange rate processes implicit in PHLX deutschemark options, Review of Financial Studies 9, 69–107.
[4] Bates, D.S. (1996). Testing option pricing models, in Handbook of Statistics, G.S. Maddala & C.R. Rao, eds, (Statistical Methods in Finance), Elsevier, Amsterdam, Vol. 14, pp. 567–611.
[5] Bates, D.S. (2000). Post-'87 crash fears in the S&P 500 futures option market, Journal of Econometrics 94, 181–238.
[6] Bates, D.S. (2003). Empirical option pricing: a retrospection, Journal of Econometrics 116, 387–404.
[7] Bates, D.S. (2006). Maximum likelihood estimation of latent affine processes, Review of Financial Studies 19, 909–965.
[8] Duffie, D., Pan, J. & Singleton, K.J. (2000). Transform analysis and asset pricing for affine jump-diffusions, Econometrica 68, 1343–1376.
[9] Heston, S.L. (1993). A closed-form solution for options with stochastic volatility with applications to bond and currency options, Review of Financial Studies 6, 327–344.
[10] Lee, R.W. (2004). Option pricing by transform methods: extensions, unification and error control, Journal of Computational Finance 7, 51–86.
[11] Merton, R.C. (1976). Option pricing when underlying stock returns are discontinuous, Journal of Financial Economics 3, 125–144.
[12] Pan, J. (2002). The jump-risk premia implicit in options: evidence from an integrated time-series study, Journal of Financial Economics 63, 3–50.
[13] Scott, L.O. (1997). Pricing stock options in a jump-diffusion model with stochastic volatility and interest rates: applications of Fourier inversion methods, Mathematical Finance 7, 413–426.

Related Articles

Barndorff-Nielsen and Shephard (BNS) Models; Heston Model; Jump-diffusion Models; Stochastic Volatility Models: Foreign Exchange; Time-changed Lévy Process.

DAVID S. BATES
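Equations (2) to (6) of the Bates model article above translate almost line by line into code. The Python sketch below is a minimal illustration under stated assumptions: the function and argument names are hypothetical, the characteristic function is taken to be that of ln S_T, the principal branches of the complex square root and logarithm are used without any branch-cut correction, and the parameter values are invented for the example.

```python
import cmath
import math

def bates_log_cf(z, T, lnS0, V0, b, alpha, beta_star, sigma_v, rho,
                 lam_star, kbar_star, delta):
    """Exponent z*ln(S0) + C(T;z) + D(T;z)*V0 + lam_star*T*E(z) of equation (2)."""
    p = rho * sigma_v * z - beta_star
    g = cmath.sqrt(p * p - sigma_v ** 2 * (z * z - z))            # equation (3)
    egT = cmath.exp(g * T)
    C = (b * T * z
         - alpha * T / sigma_v ** 2 * (p - g)
         - 2.0 * alpha / sigma_v ** 2
         * cmath.log(1.0 + (p - g) * (1.0 - egT) / (2.0 * g)))    # equation (4)
    D = (z * z - z) / (g * (egT + 1.0) / (egT - 1.0) - p)         # equation (5)
    E = ((1.0 + kbar_star) ** z
         * cmath.exp(0.5 * delta ** 2 * (z * z - z))
         - 1.0 - kbar_star * z)                                   # equation (6)
    return z * lnS0 + C + D * V0 + lam_star * T * E

def bates_cf(z, **params):
    """Characteristic function of ln(S_T); pass z = 1j * Phi for real Phi."""
    return cmath.exp(bates_log_cf(z, **params))

# Invented parameter values, used only to exercise the formulas.
params = dict(T=1.0, lnS0=math.log(100.0), V0=0.04, b=0.02, alpha=0.08,
              beta_star=2.0, sigma_v=0.3, rho=-0.5,
              lam_star=0.5, kbar_star=-0.05, delta=0.1)

print(abs(bates_cf(0j, **params)))        # normalization at z = 0
print(bates_cf(1 + 0j, **params).real)    # forward value S0 * exp(b*T)
print(abs(bates_cf(1.5j, **params)))      # modulus at Phi = 1.5
```

Two checks are built into the formulas themselves: at z = 0 the exponent vanishes, and at z = 1 it reduces to ln S0 + bT because D(T; 1) = 0 and E(1) = 0, so the expected asset price grows at the cost of carry. For extreme parameter values the principal-branch logarithm can fail in the way the article describes, which this sketch does not address.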
Barndorff-Nielsen and Shephard (BNS) Models

Stochastic volatility models based on non-Gaussian Ornstein–Uhlenbeck (OU)-type processes were introduced in [3]. The motivation was to construct a mathematically tractable model that provides an adequate description of price fluctuations on various timescales. The main idea is to model the volatility with a non-Gaussian OU process: the solution of a linear stochastic differential equation (SDE) with Lévy increments. The non-Gaussian increments make it possible to build a process that is positive and linear, meaning that many computations are very simple.

Non-Gaussian Ornstein–Uhlenbeck-type Processes

The OU-type process (see [8, 11, 12] for original introduction or [2, 4, 10] for a more modern treatment) is defined as the solution of the stochastic differential equation

$$ dy_t = -\lambda y_t\, dt + dZ_t \tag{1} $$

with $\psi(u) = \ln E[e^{iuZ_1}]$. Under the integrability condition $\int_{|x|>1} \ln|x|\, \nu(dx) < \infty$, the process $(y_t)$ has a stationary distribution with characteristics

$$ A_y = \frac{A}{2\lambda}, \qquad \gamma_y = \frac{\gamma}{\lambda}, \qquad \nu_y(B) = \int_1^{\infty} \nu(\xi B)\, \frac{d\xi}{\lambda\xi}, \quad \forall B \in \mathcal{B}(\mathbb{R}) \tag{5} $$

In the stationary case, an OU-type process has an exponential (short memory) autocorrelation structure:

$$ \mathrm{Cov}(y_t, y_{t+s}) = e^{-\lambda s}\, \mathrm{Var}\, y_t \tag{6} $$

To obtain more interesting correlation structures, one can add up several OU-type processes [1]: if y and ỹ are independent stationary OU-type processes with parameters λ and λ̃ then

$$ \mathrm{Cov}(y_t + \tilde{y}_t, y_{t+s} + \tilde{y}_{t+s}) = e^{-\lambda s}\, \mathrm{Var}\, y_t + e^{-\tilde{\lambda} s}\, \mathrm{Var}\, \tilde{y}_t \tag{7} $$

The price to be paid is an increased model dimension: the two-dimensional process (y, ỹ) is Markov but the sum y + ỹ is not. Superpositions of OU-type processes can also be used to construct finite-dimensional approximations to non-Markov (e.g., long memory) processes.
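The effect of superposition on the correlation structure in equations (6) and (7) can be seen with a few lines of Python. This is only a sketch: the function name and the (λ, variance) pairs are hypothetical, and the components are assumed to be independent and stationary as in the text.

```python
import math

def ou_superposition_autocorr(s, components):
    """Autocorrelation at lag s of a sum of independent stationary OU-type
    processes, each given as a (lam, var) pair; see equations (6) and (7)."""
    total_var = sum(var for _, var in components)
    cov = sum(math.exp(-lam * s) * var for lam, var in components)
    return cov / total_var

# One slowly and one quickly mean-reverting component with equal variances.
mix = [(0.1, 1.0), (10.0, 1.0)]
for lag in (0.0, 0.5, 1.0, 5.0):
    single = math.exp(-10.0 * lag)
    print(f"lag {lag:3.1f}: fast only {single:.4f}, "
          f"superposition {ou_superposition_autocorr(lag, mix):.4f}")
```

The superposed autocorrelation decays quickly at short lags and slowly afterwards, a shape that a single exponential cannot reproduce.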
with pronounced skew changes from short to long maturities such as FX markets.
• Lack of stability: since the calibration functional (12) is not convex and the number of model parameters may be large, the calibration algorithm may be caught in a local minimum, which leads to instabilities in the calibration procedure. Usual remedies for this problem include the use of global minimization algorithms such as simulated annealing or adding a convex penalty term to the functional (12) to make the problem well posed.

The minimal variance hedging in BNS models is discussed in [5]. Let the option price at time t be given by $C(t, S_t, \sigma_t^2)$ (this can be computed by Fourier transform). The hedging strategy minimizing the variance of the residual hedging error under the risk-neutral probability is then given by

$$ \phi_t = \sigma_{t-}^2 \frac{\partial C}{\partial S} + \frac{1}{S_{t-}} \int \nu(dz)\, (e^{\rho z} - 1) \cdots $$

[3] Barndorff-Nielsen, O.E. & Shephard, N. (2001). Non-Gaussian Ornstein–Uhlenbeck based models and some of their uses in financial econometrics, Journal of the Royal Statistical Society: Series B 63, 167–241.
[4] Cont, R. & Tankov, P. (2004). Financial Modelling with Jump Processes, Chapman & Hall/CRC Press.
[5] Cont, R., Tankov, P. & Voltchkova, E. (2007). Hedging with options in models with jumps, in Stochastic Analysis and Applications: The Abel Symposium 2005 in Honor of Kiyosi Ito, F.E. Benth, G. Di Nunno, T. Lindstrom, B. Øksendal & T. Zhang, eds, Springer, pp. 197–218.
[6] Deng, S.-J. & Jiang, W. (2005). Lévy process-driven mean-reverting electricity price model: the marginal distribution analysis, Decision Support Systems 40, 483–494.
[7] Duffie, D., Filipovic, D. & Schachermayer, W. (2003). Affine processes and applications in finance, Annals of Applied Probability 13, 984–1053.
[8] Jurek, Z.J. & Vervaat, W. (1983). An integral representation for self-decomposable Banach space valued random variables, Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 62(2), 247–262.
[9] Nicolato, E. & Venardos, E. (2003). Option pricing
[Figure 1: implied volatility (0 to 30) against strike/forward (in log scale, 60.7% to 164.9%) for Black–Scholes, Heston with zero correlation, and Heston; the plot is annotated with the label "Correlation".]

Figure 1 Stylized effects of changing Vol Of Vol and Correlation in Heston's model on the one-year implied volatility. The Heston parameters are v0 = 15%², θ = 20%², κ = 1, ρ = −70%, and σ = 35%
[Figure 2, three panels, each over 0 to 12 years. (a) Short Vol 10, 15, 20. (b) Long Vol 15, 20, 25. (c) Reversion speed 0.5, 1, 1.5.]

Figure 2 The effects of changing Short Vol (a), Long Vol (b), and Reversion Speed (c) on the ATM term structure of implied volatilities. Each graph shows the volatility term structure for 12 years. The reference Heston parameters are v0 = 15%², θ = 20%², κ = 1, ρ = 70%, and σ = 35%
term structure and strike structure, was made for illustration purposes only: in particular, κ and σ are strongly interdependent if the model is used in the form (1).

This is one of the most serious drawbacks of Heston's model since it means that a trader who uses it to risk-manage a position cannot independently control the risk with the five available parameters, but has to understand very well their interdependency. For example, to hedge, say, convexity risk in strike direction of the implied volatility surface, the trader will also have to deal with the skew risk at the same time since in Heston, there is no one parameter to control either: convexity is mainly controlled
by Vol Of Vol, but the effect of Correlation on skew depends on the level of Vol Of Vol, too. Moreover, changes to the short end volatility skew will always affect the long-term skew. A similar strong codependency exists between Vol Of Vol and Reversion Speed; as pointed out in [14], some of the strong interdependence between Vol Of Vol and Reversion Speed can be alleviated by using the alternative formulation

$$dv_t = (\theta - v_t)\,\kappa\,dt + \tilde\sigma \sqrt{v_t}\,\sqrt{\kappa}\,dW_t \qquad (2)$$

In this parametrization, the new Vol Of Vol and reversion speed are much less interdependent, which stabilizes results of daily calibration to market data substantially. Mathematically, this parametrization much more naturally defines κ as the “speed” of the equation.
Such complications are a general issue with stochastic volatility models: since such models attempt to describe an unobservable, rather theoretical quantity (instantaneous variance), they do not produce very intuitive behavior when looked at through the lens of the observable measure of “implied volatility”. That said, implied volatility itself or, rather, its interpolations are also moving on a daily basis. This indicates that natural parameters such as convexity and skew of implied volatility might be a valuable tool for feeding a stochastic volatility model, but it is unreasonable to keep them as constant parameters inside the model.

Pricing European Options

Heston’s popularity is probably mainly derived from the fact that it is possible to price European options on the stock price S using semi-closed-form Fourier transformation, which in turn allows rapid calibration of the model parameters to market data. “Calibration” here means to infer values for the five unobservable parameters $\sqrt{v_0}$, $\sqrt{\theta}$, κ, σ, ρ from market data by minimizing the distance between the model’s European option prices and observed market prices.
We focus on the call prices. Following Carr and Madan [7], we price them via Fourier inversion (see End Note a). The call price for a relative strike K at maturity T is given as

$$\mathbb{C}(T, K) := DF(T)\,\mathbb{E}\big[(S_T - K F_T)^+\big] \qquad (3)$$

where DF(T) represents the discount factor and $F_T$ is the forward of the stock. Since the call price itself is not an $L^2$-function in K, we define a “dampened call”

$$c(T, k) := \frac{e^{\alpha k}}{DF(T)\,F_T}\,\mathbb{C}(T, e^k) \qquad (4)$$

for an α > 0 (see End Note b), for which its characteristic function $\psi_t(z; k) := \int e^{ikz}\, c(t, k)\, dk$ is well defined and given as

$$\psi_t(z; k) = \frac{\varphi_t\big(k - i(\alpha + 1)\big)}{(ik + \alpha)(ik + \alpha + 1)} \qquad (5)$$

The function $\varphi_t(z) := \mathbb{E}[\exp\{iz \log S_t/F_t\}]$ is the characteristic function of $X_t := \log S_t/F_t$. Since Heston belongs to the affine model class, its characteristic function has the form

$$\varphi_t(z) = e^{-v_0 A_t - m B_t} \qquad (6)$$

with (cf. [14])

$$A_t := \frac{\alpha + a e^{\gamma t}}{\beta + b e^{\gamma t}} \quad\text{and}\quad B_t := \tilde\kappa\, \frac{\alpha b \gamma t + (a\beta - \alpha b) \log \dfrac{\beta + b e^{\gamma t}}{\beta + b}}{\beta b \gamma} \qquad (7)$$

where $\mu := (iz + z^2)/2$, $\tilde\kappa := \kappa - \rho i z \sigma$, $\gamma := \sqrt{2\mu\sigma^2 + \tilde\kappa^2}$, $a := -2\mu$, $\alpha := 2\mu$, $b := -\tilde\kappa + \gamma$ and $\beta := \tilde\kappa + \gamma$.
We can then price a call on X using

$$\mathbb{C}(T, K) = DF(T)\,F_T\, \frac{e^{-\alpha \ln(K)}}{\pi} \int_0^\infty e^{-iz \ln(K)}\, \psi_t(z; \ln(K))\, dz \qquad (8)$$

The method also lends itself to Fast Fourier Transform if a range of option prices for a single maturity is required.
Similarly, various other payoffs can be computed very efficiently with the Fourier approach, for example, forward started vanilla options, options on integrated short variance, and digital options.

Time-Dependent Parameters

Moreover, for most of these products, and most importantly plain European options, it is very straightforward to extend the model to time-dependent, piece-wise constant parameters. This is briefly
                      6m       1y       3y       ∞
Long Vol √θ           20.7%    23.6%    36.1%    46.5%
Reversion Speed κ     5.0      3.2      0.4      0.3
Correlation ρ         −55.2%   −70.9%   −80.1%   −69.4%
Vol Of Vol σ          78.7%    81.5%    35.3%    60.0

[Figure: panels (a) and (b); vertical axis from 0.00% to −0.20%, strikes from 75% to 110%, maturities 1m, 3m, 1y, 3y, and 5y.]

Mathematical Drawbacks

The underlying mathematical reason for the relative
[Figure 4: two panels showing the density of vt against the short volatility level (−5 to 40) for maturities 1m, 3m, and 6m.]
Figure 4 This graph shows the density of vt for one, three and six months for the case where condition (10) is satisfied (above) or not (below). Apart from Vol Of Vol, the parameters were v0 = 15%², θ = 20%² and κ = 1
numerical approximations more complicated. In a Monte Carlo simulation, for example, we have to take the event of v being negative into account. The same problem appears in a partial differential equation (PDE) solver: Heston’s PDE becomes degenerate if Short Vol hits zero. A violation of Equation (10) also implies that the distribution of short variance vt at some later time t is very wide, cf. Figure 4.
Additionally, if Equation (10) does not hold, then the stock price S may fail to have a second moment if the Correlation is not negative enough in the sense detailed in proposition 3.1 in [2] (see Moment Explosions for more details). Again, this is not a problem from a purely mathematical point of view, but it makes numerical schemes less efficient. In particular, Monte Carlo simulations perform much worse: although an Euler scheme will still converge to the desired value, the speed of convergence deteriorates. Moreover, we cannot safely use control variates anymore if the payoff is not bounded.
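The remark that a Monte Carlo simulation has to take negative values of v into account can be illustrated with a small experiment. The following is a minimal sketch, not the article's implementation: it applies an Euler step to the square-root variance dynamics, using the positive part of v inside the square root, and counts how many paths dip below zero. The function name `euler_variance_paths`, the two Vol Of Vol values (10% and 60%), the step count, and the path count are illustrative assumptions; v0, θ, and κ are the values quoted in the caption of Figure 4.

```python
import numpy as np


def euler_variance_paths(v0, theta, kappa, sigma, T, n_steps, n_paths, seed=0):
    """Euler steps for dv = kappa*(theta - v) dt + sigma*sqrt(v^+) dW.

    Returns the terminal variances and a flag per path telling whether
    the discretized variance was negative at any step.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    v = np.full(n_paths, v0, dtype=float)
    went_negative = np.zeros(n_paths, dtype=bool)
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)  # variance dt
        v_plus = np.maximum(v, 0.0)                      # positive part of v
        v = v + kappa * (theta - v) * dt + sigma * np.sqrt(v_plus) * dW
        went_negative |= v < 0.0
    return v, went_negative


# Parameters from the Figure 4 caption; the sigma values are hypothetical.
v0, theta, kappa = 0.15**2, 0.20**2, 1.0
for sigma in (0.10, 0.60):
    v_T, neg = euler_variance_paths(v0, theta, kappa, sigma,
                                    T=0.5, n_steps=126, n_paths=20000)
    print(f"sigma={sigma:.2f}: share of paths with v<0 = {neg.mean():.3f}, "
          f"mean v_T = {v_T.mean():.4f}")
```

With a small Vol Of Vol the discretized variance should stay positive on essentially every path, whereas with a large Vol Of Vol a noticeable share of paths becomes negative at some step, which is the situation the scheme has to handle.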
Once we have calibrated the model using the aforementioned semiclosed form of solution for the European options, the question is how to evaluate complex products. At our disposal are PDEs and Monte Carlo schemes.
Since the conditional transition density of the entire process is not known, we have to revert to solving a discretization of the SDE (1) if we want to use a Monte Carlo scheme (see Monte Carlo Simulation for Stochastic Differential Equations for an overview of Monte Carlo concepts). To this end, assume that we are given fixing dates $0 = t_0 < \cdots < t_N = T$ and let $\Delta t_i := t_{i+1} - t_i$ for $i = 0, \ldots, N-1$. Moreover, we denote by $\Delta W_i$ for $i = 0, \ldots, N-1$ a sequence of independent normal variables with variance $\Delta t_i$, and by $\Delta B_i$ a corresponding sequence where $\Delta B_i$ and $\Delta W_i$ have Correlation ρ.
When using a straightforward Euler scheme, we will face the problem that v can become negative. It works well simply to reduce the volatility term of the variance to the positive part of the variance, that is, to simulate

$$v_{t_{i+1}} = v_{t_i} + \kappa(\theta - v_{t_i})\,\Delta t_i + \sigma \sqrt{v_{t_i}^+}\;\Delta W_i \qquad (11)$$

A flaw of this scheme is that it is biased. This is overcome by using the moment-matching scheme

$$v_{t_{i+1}} = \theta + (v_{t_i} - \theta)\, e^{-\kappa \Delta t_i} + \sigma \sqrt{v_{t_i}^+}\, \sqrt{\frac{1 - e^{-2\kappa \Delta t_i}}{2\kappa}}\;\Delta W_i \qquad (12)$$

which works well in practice. To compute the stock price, we approximate the integrated variance over $[t_i, t_{i+1}]$ as

$$\Delta_i V := \theta\,\Delta t_i + (v_{t_i} - \theta)\, \frac{1 - e^{-\kappa \Delta t_i}}{\kappa} \qquad (13)$$

and set

$$S_{t_k} := F_{t_k} \exp\left\{ \sum_{i=0}^{k-1} \left( \sqrt{\Delta_i V}\;\Delta B_i - \frac{1}{2}\,\Delta_i V \right) \right\} \qquad (14)$$

Note that this scheme is unbiased in the sense that $\mathbb{E}[S_T] = F_T$.
It is straightforward to derive the PDE for the previous model. Let

$$P_t(v, S) := DF_t(T)\,\mathbb{E}\big[F(S_T) \mid S_t = S,\ v_t = v\big] \qquad (15)$$

be the price of a derivative with maturity T at time t. It satisfies

$$0 = \partial_t P_t - r_t P_t + \partial_S P_t\,(r_t - \mu_t) S_t + \partial_v P_t\,\kappa(m - v_t) + \frac{1}{2}\,\partial^2_{SS} P_t\, S_t^2 v_t + \frac{1}{2}\,\partial^2_{vv} P_t\, \sigma^2 v_t + \partial^2_{vS} P_t\, \rho\sigma v_t S_t \qquad (16)$$

with boundary condition $P_T(S, v) = F(S_T)$. To solve this two-factor PDE with a potentially degenerate diffusion term in $\partial^2_{vv} P_t$, it is recommended to use a stabilized alternating direction implicit (ADI) scheme such as the one described by Craig and Sneyd [8] (see Alternating Direction Implicit (ADI) Method for a discussion on ADI).

Risk Management

Provided that we consider not only the stock price itself but also a second liquid instrument V such as a listed option as hedging instrument, stochastic volatility models are complete, that is, in theory every contingent claim P can be replicated in the sense that there are hedging strategies $(\Delta_t, \Delta^V_t)_t$ such that

$$dP_t - r_t P_t\,dt = \Delta_t\,\big(dS_t - S_t (r_t - \mu_t)\,dt\big) + \Delta^V_t\,\big(dV_t - r_t V_t\,dt\big) \qquad (17)$$

(see Complete Markets for a discussion on complete markets). In Heston’s model, we can write the price process of both the derivative we want to hedge and the hedging instrument as a function of current spot level and short variance, that is, $P_t \equiv P_t(S_t, v_t)$ and $V_t \equiv V_t(S_t, v_t)$. Then, the correct hedging ratios are

$$\Delta^V_t = \frac{\partial_v P}{\partial_v V} \quad\text{and}\quad \Delta_t = \partial_S P_t - \frac{\partial_v P}{\partial_v V}\,\partial_S V_t \qquad (18)$$

This is the equivalent of delta hedging in Black and Scholes (see Delta Hedging). However, as for the latter, plain theoretical hedging will not work since the other parameters in our model, Reversion Speed, Vol of Vol, Long Vol, and potentially Correlation, will not remain constant if we calibrate our model on a
daily basis. This is the effect of a change in volatility for Black and Scholes: a change of this parameter is not anticipated by the model itself and must be taken care of “outside the model”.
As a result, one way to control this risk is to engage in additional parameter hedging, that is, the desk also displays sensitivities with respect to the other model parameters including, potentially, second-order exposures. Those can then be monitored on a book level and managed explicitly. The drawback of this method is that to reduce risk with respect to those parameters, a portfolio of vanilla options has to be bought whose composition can change quickly if implemented blindly (see End Note c).
A second variant is to try to map standard risks of the desk such as implied volatility convexity, skewness, and so on into stochastic volatility risk by “recalibration”. The idea here is that, say, the convexity parameter of the implied volatility is modified, then Heston’s model is calibrated to this new implied volatility surface and the option priced off this model. The resulting change in model price is then considered the sensitivity of the option to convexity in implied volatility. This approach suffers from the fact that typical “implied vol risks” are very different from typical movements in the Heston model. For example, the standard Heston model is homogeneous so it cannot easily accommodate changes in short-term skew only.

Related Models

Owing to its numerical efficiency, Heston’s model is the base for many extensions. The first notable extension is Bates’ addition of jumps to the diffusion process in his article [3] (see Bates Model). Jumps are commonly seen as a necessary feature of any risk management model, even though the actual handling of the jump risk part is far from clear.
Bates’ approach can be written as follows: let X be given by

$$dv_t = \kappa(\theta - v_t)\,dt + \sigma \sqrt{v_t}\,dW_t$$
$$dX_t = X_t \sqrt{v_t}\,dB_t \qquad (19)$$

and let

$$S_t = F_t X_t\, e^{\sum_{j=1}^{N_t} \xi_j - \lambda m t} \qquad (20)$$

where $N_t$ is a Poisson process with intensity λ (see Poisson Process) and where $(\xi_j)_j$ are the normal jumps of the returns of S with mean µ and volatility ν. To make sure that $S_t/F_t$ is a martingale we stipulate that $\mu = e^{m + \frac{1}{2}\nu^2} - 1$.
Since the process X is independent of the jumps, the characteristic function of the log-stock process is the product of the separate characteristic functions. In other words, Bates’ model can be evaluated using the same approach as above and is equally efficient while allowing for a very pronounced short-term skew due to the jump part (see End Note d). Figure 5 shows the improvement of time-dependent Bates over time-dependent Heston.
The model has been further enhanced by Knudsen and Nguyen-Ngoc [12] who also added exponentially distributed jumps to the variance process.

Multifactor Models

Structurally, Heston’s model is a member of the class of “affine models” as introduced by Duffie et al. [9]. As such, it can easily be extended by mixing in further independent square-root processes. One obvious approach presented in [14] is simply to multiply several independent Heston processes. For the two-factor case, this means to set $S_t := F_t X^1_t X^2_t$ where both $X^1$ and $X^2$ have the form (19). Jumps can be added, but to make the Fourier integration work efficiently, the processes $X^1$ and $X^2$ must remain independent.
The stochastic variance of the joint stock price is then simply the sum of the two separate variances, $v^1$ and $v^2$, and it is intuitively assumed that one is a “short-term”, fast mean-reverting process whereas the other is mean reverting slowly. Such a structure is supported by statistical evidence, cf. [10]. However, the independence of the two processes makes it very difficult to impose enough skew into this model since the effective Correlation between instantaneous variance and stock price weakens. In practice, this model is used only rarely.
A related model “Double Heston” has been mentioned by Buehler [6], which is obtained by modeling the mean variance level θ in Heston itself as a square-root diffusion, that is,

$$dv_t = \kappa(\theta_t - v_t)\,dt + \sigma \sqrt{v_t}\,dW_t$$
$$d\theta_t = c(m - \theta_t)\,dt + \nu \sqrt{\theta_t}\,dW^\theta_t$$
$$dS_t = S_t (r_t - \mu_t)\,dt + S_t \sqrt{v_t}\,dB_t \qquad (21)$$
[Figure: panel titled “Fitted Heston” and panel (b); vertical axis from −0.20% to 0.20%, strikes from 75% to 110%, maturities 1m, 3m, 1y, 3y, and 5y.]
A particular class of derivatives that has gained reasonable popularity in recent years are “Options on Variance”, that is, structures whose terminal payoff depends on the realized variance of the returns of the stock over a set of business days $0 = t_0 < \cdots < t_n = T$,

$$\sum_{i=1}^{n} \left( \log \frac{S_{t_i}}{S_{t_{i-1}}} \right)^2 \qquad (22)$$

The most standard of such products is a “variance swap” (see Variance Swap), which essentially pays the actual realized annualized variance over the
with the stock price as

$$dS_t = S_t (r_t - \mu_t)\,dt + S_t \sqrt{w_t}\,dB_t \qquad (27)$$

This now reprices all variance swaps automatically in the sense (24). Note that this method does not at all depend on using Heston’s model and can be applied to any stochastic volatility model as long as the expectation of instantaneous variance is known.
As pointed out in [6], this model is naturally very attractive from a risk-management point of view if the input M is computed on the fly within the risk management system. In this case, the risk embedded in the variance swap level (called VarSwapDelta) is automatically reflected back in the standard implied volatility risk, and the underlying stochastic volatility model is used purely to control skew and convexity around the variance swap backbone (see End Note e). Further practical considerations and the impact of jumps are discussed by Buehler in [5].

End Notes

a. In his original paper [11], Heston suggested a numerically more expensive approach via numerical integration that is twice as slow but still much faster than the same computation for most other models. The approach to price with Fourier inversion is due to Carr and Madan [7]; the interested reader finds more details on the subject in Lewis’s book [13].
b. See [7] for a discussion on the choice of α.
c. Bermudez et al. discuss one approach to find such portfolios [14].
d. In practice, calibrating all parameters (stochastic volatility plus jumps) together is relatively unstable since the two parts play similar roles for the short-term options. It is therefore customary to fix the jump parameters themselves or to calibrate them separately to very short-term options.
e. Usually, the parameters v0 and θ are fixed to some “usual level” such as 20%. Then, they do not need to be calibrated anymore and, in addition, σ retains some comparability to the standard Heston model.

References

[1] Aït-Sahalia, Y. & Kimmel, R. (2004). Maximum Likelihood Estimation of Stochastic Volatility Models, NBER Working Paper No. 10579, June 2004.
[2] Andersen, L. & Piterbarg, V. (2007). Moment explosions in stochastic volatility models, Finance and Stochastics 11(1), 20–50.
[3] Bates, D. (1996). Jumps and stochastic volatility: exchange rate process implicit in the Deutschemark options, Review of Financial Studies 9(1), 69–107.
[4] Buehler, H. (2006). Consistent variance curve models, Finance and Stochastics 10(2), 178–203.
[5] Buehler, H. (2006). Options on variance: pricing and hedging, Presentation, IQPC Volatility Trading Conference, London November 28th, 2006, http://www.quantitative-research.de/dl/IQPC2006-2.pdf
[6] Buehler, H. (2006). Volatility Markets: Consistent Modeling, Hedging and Practical Implementation, PhD thesis TU Berlin, http://www.quantitative-research.de/dl/HansBuehlerDiss.pdf
[7] Carr, P. & Madan, D. (1999). Option pricing and the Fast Fourier Transform, Journal of Computational Finance 2(4), 61–73. Summer.
[8] Craig, I.J.D. & Sneyd, A.D. (1988). An alternating-direction implicit scheme for parabolic equations with mixed derivatives, Computers and Mathematics with Applications 16(4), 341–350.
[9] Duffie, D., Pan, J. & Singleton, K. (2000). Transform analysis and asset pricing for affine jump-diffusions, Econometrica 68, 1343–1376.
[10] Fouque, J.-P., Papanicolaou, G. & Sircar, K. (2000). Derivatives in Financial Markets with Stochastic Volatility, Cambridge Press.
[11] Heston, S. (1993). A closed-form solution for options with stochastic volatility with applications to bond and currency options, Review of Financial Studies 6(2), 327–343.
[12] Knudsen, T. & Nguyen-Ngoc, L. (2000). The Heston model steps further, Deutsche Bank Quantessence 1(7), https://www.dbconvertibles.com/dbquant/quantessence/Vol1Issue7External.pdf
[13] Lewis, A. (2000). Option Valuation under Stochastic Volatility, Finance Press.
[14] Overhaus, M., Bermudez, A., Buehler, H., Ferraris, A., Jordinson, C. & Lamnouar, A. (2006). Equity Hybrid Derivatives, Wiley.

Related Articles

Alternating Direction Implicit (ADI) Method; Bates Model; Complete Markets; Cliquet Options; Econometrics of Diffusion Models; Hedging; Hull–White Stochastic Volatility Model; Model Calibration; Monte Carlo Simulation for Stochastic Differential Equations; Moment Explosions; Realized Volatility Options; Variance Swap.

HANS BUEHLER
Hull–White Stochastic Volatility Model

Even before practitioners started using the Black–Scholes formula extensively, one had identified the assumption of constant volatility as unrealistic. Empirical observation of the equity vanilla option market shows, indeed, that the implied volatility level depends on the strike. This feature, commonly known as the volatility smile, violates the constant volatility assumption. This essential remark motivated the birth of stochastic volatility models (see Stochastic Volatility Models).
Among the first authors to tackle this issue, Hull and White proposed in 1987 a simple extension of the Black–Scholes model [1]. This article aims at presenting a sound introduction to the Hull–White stochastic volatility model and at indicating its implications in terms of volatility behavior and correlation.
Hull and White describe the variance $V = \sigma^2$ as a geometric Brownian motion. Therefore, the asset and variance satisfy the following stochastic differential equation:

$$\cdots\; - rf = -rS\,\frac{\partial f}{\partial S} - \mu\sigma^2\,\frac{\partial f}{\partial V} \qquad (4)$$

Assuming volatility and stock price are uncorrelated, we derive an analytic solution to equation (4) through the risk-neutral valuation procedure

$$f(S_t, \sigma_t^2, t) = e^{-r(T-t)} \int_0^\infty f(S_T, \sigma_T^2, T)\; p(S_T \mid S_t, \sigma_t^2)\; dS_T \qquad (5)$$

where T is the option maturity, $S_t$ is the security at pricing time t, $\sigma_t$ is the instantaneous volatility at time t, and $p(S_T \mid S_t, \sigma_t^2)$ is the conditional distribution of $S_T$ given the security price and variance at time t.
Introducing the mean variance over the option life $\bar V$,

$$\bar V = \frac{1}{T - t} \int_t^T \sigma_\tau^2\, d\tau \qquad (6)$$

we express the call option value as

$$C_{HW}(S_t, \sigma_t^2, t) = \int_0^\infty C_{BS}(\bar V)\; h(\bar V \mid \sigma_t^2)\; d\bar V \qquad (7)$$

[Figure: implied volatility (14.5% to 17.0%) against strike (0.70 to 1.30).]
Figure 1 Implied volatility as a function of strike
[Figure: implied volatility (10% to 20%) against strike (0.70 to 1.30) for correlation levels 30%, 0%, −30%, and −70%.]
Figure 2 Volatility smile for various correlation levels

Correlation between Stock Returns and Changes in Volatility

By introducing correlation between stock and variance Gaussian increments, Hull and White incorporate explicitly a cause of the volatility skew: the leverage effect. Even if they do not provide any analytic formula in the correlated case, one can still analyze the impact of correlation through numerical simulation.
As shown in Figure 2, this correlation has a huge impact since it makes it possible to transform the smile into

Related Articles

Heavy Tails; Heston Model; Implied Volatility in Stochastic Volatility Models; Implied Volatility Surface; Partial Differential Equations; Stochastic Volatility Models; Stylized Properties of Asset Returns.

PIERRE GAUTHIER & PIERRE-YVES H. RIVAILLE
Tempered Stable Process

A tempered stable process is a pure-jump Lévy process (see Lévy Processes) with infinite activity (see Exponential Lévy Models) whose small jumps behave like a stable process, while the large jumps are “tempered” so that the tail of the density decays exponentially. Tempered stable processes can be constructed from stable processes by exponential tilting (see Esscher Transform) of the Lévy measure. Tempered stable processes were introduced in [8] and brought into financial modeling by Cont et al. [4] under the name truncated stable process, where it was noted that tempered stable processes have a short-time behavior similar to stable Lévy processes while retaining finite variance and finite exponential moments. Option pricing with tempered stable processes was studied in [1], [2], and [5].
The best known example of tempered stable processes is the CGMY process introduced in [2], which is a pure-jump Lévy process with Lévy density given by

$$k_{CGMY}(x) = C\,\frac{\exp(-G|x|)}{|x|^{1+Y}}\,1_{\{x<0\}} + C\,\frac{\exp(-M|x|)}{|x|^{1+Y}}\,1_{\{x>0\}} \qquad (1)$$

The model parameters of equation (1) fulfill C > 0, G, M ≥ 0, and Y ∈ (−∞, 2). The restriction on the parameter Y ensures that the measure is a Lévy measure.
For a given stochastic process X(t), its characteristic function is given by $\varphi(u, t) = \mathbb{E}[\exp(iuX(t))]$ (see Fourier Transform; Fourier Methods in Options Pricing). For the CGMY model, it is derived in [2] and it is given by

$$\varphi(u, t) = \exp\Big(t\,C\,\Gamma(-Y)\big((M - iu)^Y - M^Y + (G + iu)^Y - G^Y\big)\Big) \qquad (2)$$

On this basis, Fourier-transform methods (see Fourier Transform; Fourier Methods in Options Pricing) can be applied to option pricing. Carr et al. show that the CGMY process has completely monotone Lévy density for Y > −1 and is of infinite activity for Y > 0. The drift parameter is chosen to make S(t) into a martingale and can be determined by using equation (2), which leads to $\omega = -\ln(\varphi(-i))$. Further methods to compute an equivalent martingale measure are discussed in [7].

Tempered Stable

The CGMY process is a special case of the tempered stable process considered in Boyarchenko and Levendorskii [1], Cont and Tankov [5] or Rosinski [12]. The latter process has Lévy measure with density given by

$$k_{TS}(x) = C_-\,\frac{\exp(-G|x|)}{|x|^{1+Y_-}}\,1_{\{x<0\}} + C_+\,\frac{\exp(-M|x|)}{|x|^{1+Y_+}}\,1_{\{x>0\}} \qquad (3)$$

The parameters of equation (3) fulfill G, M > 0, $C_\pm > 0$, and $Y_\pm \in (-\infty, 2)$. The characteristic function is available in closed form and hence option pricing and calibration can be performed using Fourier-transform methods. Choosing $C_- = C_+$ and $Y_- = Y_+$ leads to the CGMY process and $Y_- = Y_+ = 0$ leads to a variance gamma process (see Variance-gamma Model).

Interpretation of the Parameters

In order to show the impact of the model parameters on asset returns, we consider the properties of the process X(t). Increasing C makes the density more peaked while decreasing C flattens it. C controls the frequency of jumps; it also enters the probability of jumps larger than a given level. The parameter Y governs the fine structure of the process and the choice affects the overall properties of the process as explained in the previous section. It determines if the process is of finite or infinite activity.
The parameters G and M control the rate of exponential decay, that is, the tail behavior, on the left and the right of $k_{CGMY}$, respectively. We consider three cases: G = M leads to a symmetric Lévy measure, G < M makes the left tail heavier than the right one, and vice versa for the case G > M. The last two cases lead to a skewed distribution.
This behavior is illustrated in Figure 1.
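The closed-form characteristic function (2) and the martingale correction ω = −ln(φ(−i)) can be evaluated directly. The following is a minimal Python sketch under stated assumptions: the function names `cgmy_cf` and `martingale_drift` and the parameter values are hypothetical, complex powers are taken on the principal branch, Y must avoid 0 and 1 (where Γ(−Y) has poles), and the drift is evaluated at t = 1 with M > 1 so that the exponential moment exists.

```python
import cmath
import math


def cgmy_cf(u, t, C, G, M, Y):
    """CGMY characteristic function of equation (2).

    u may be real or complex; Y must be below 2 and different from 0 and 1.
    """
    if Y in (0, 1):
        raise ValueError("Gamma(-Y) has a pole at Y = 0 and Y = 1")
    # Characteristic exponent per unit of time.
    psi = C * math.gamma(-Y) * (
        (M - 1j * u) ** Y - M ** Y + (G + 1j * u) ** Y - G ** Y
    )
    return cmath.exp(t * psi)


def martingale_drift(C, G, M, Y):
    """omega = -ln(phi(-i)) evaluated at t = 1 (assumes M > 1)."""
    return -cmath.log(cgmy_cf(-1j, 1.0, C, G, M, Y)).real


# Hypothetical parameter set, for illustration only.
C, G, M, Y = 1.0, 5.0, 10.0, 0.5
print("phi(1, t=1) =", cgmy_cf(1.0, 1.0, C, G, M, Y))
print("omega       =", martingale_drift(C, G, M, Y))
```

With G = M the exponent is a sum of complex conjugates, so the characteristic function is real, which is a quick check of the symmetric case described in the text.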
[Figure 1: four panels of the probability density on [−1, 1], titled Changing G, Changing C, Changing M, and Changing Y; each panel compares the base case with a doubled and a halved parameter.]
Figure 1 Illustration of the effect of changing the CGMY model parameters C, G, M, and Y on the probability density function

If variance, skewness, and kurtosis exist, they can be computed by

$$\text{Variance} = C\,\Gamma(2 - Y)\left(\frac{1}{M^{2-Y}} + \frac{1}{G^{2-Y}}\right) \qquad (4)$$

$$\text{Skewness} = \frac{C\,\Gamma(3 - Y)\left(\dfrac{1}{M^{3-Y}} - \dfrac{1}{G^{3-Y}}\right)}{V^{3/2}} \qquad (5)$$

$$\text{Kurtosis} = \frac{C\,\Gamma(4 - Y)\left(\dfrac{1}{M^{4-Y}} + \dfrac{1}{G^{4-Y}}\right)}{V^{2}} \qquad (6)$$

where V denotes the variance (4). The equations for the higher moments suggest that the parameter C controls the overall size of the moments. This has already been verified by the expression for the density. In the case $\int k(x)\,dx < +\infty$, it can be interpreted as a measure for the overall level of activity. In the case of finite activity, the process has a finite number of jumps on every compact interval.

Pricing and Calibration

We can use the characteristic function of the log-price X(t) from equation (2) to apply Fourier methods described in Carr and Madan [3] or Eberlein et al. [6] to price European and path-dependent options (see Fourier Methods in Options Pricing).
Options may also be priced by Monte Carlo simulation [10, 11] using the representation of the tempered stable process as a subordinated Brownian motion [5, Proposition 4.1].
In contrast to the diffusion processes, for the pure-jump processes, the change in measure can be computed from the statistical measure, that is, using parameters computed from time-series data, and the risk-neutral measure, that is, using parameters obtained from quoted option prices. It holds that $\tilde k(x) = Y(x)\,k(x)$; see [2] for details. Let us call the corresponding parameter sets $P = \{C, G, M, Y, \mu\}$ and $\tilde P = \{\tilde C, \tilde G, \tilde M, \tilde Y, r\}$, where r denotes the riskless rate, and the corresponding measures $\mathbb{P}$ and $\tilde{\mathbb{P}}$ respectively. If the characteristic functions are denoted by $\varphi$ and
$\tilde\varphi$, then the results in [2] state that $\tilde{\mathbb{P}}$ is an equivalent martingale measure to $\mathbb{P}$ if and only if $C = \tilde C$, $Y = \tilde Y$, and $r - \tilde\varphi(-i) = \mu - \varphi(-i)$. The constraints on the parameters G, M, $\tilde G$, and $\tilde M$ are implicit in the last equality.

Path Properties

Path properties of the model affect the prices of exotic path-dependent options. We considered path variation when we gave the interpretation for the model parameters. Other concepts like hitting points, creeping, or regularity of the half-line are considered in [9]. We briefly introduce hitting points. The process $X_t$ can hit a point $x \in \mathbb{R}$ if $\mathbb{P}(X_t = x \text{ for at least one } t > 0) > 0$. We denote the set of all points the process can hit by $H = \{x \in \mathbb{R} \mid \mathbb{P}(X_t = x \text{ for at least one } t > 0) > 0\}$. See [9] for details.

References

[1] Boyarchenko, S.I. & Levendorskii, S.Z. (2002). Non-Gaussian Merton-Black-Scholes theory, Advanced Series on Statistical Science and Applied Probability, World Scientific, River Edge, NJ, Vol. 9.
[2] Carr, P., Geman, H., Madan, D. & Yor, M. (2002). The fine structure of asset returns: an empirical investigation, Journal of Business 75(2), 305–332.
[3] Carr, P. & Madan, D. (1999). Option valuation using the fast Fourier transform, Journal of Computational Finance 2(4), 61–73.
[4] Cont, R., Potters, M. & Bouchaud, J.P. (1997). Scaling in stock market data: stable laws and beyond, in Scale Invariance and Beyond, B. Dubrulle, F. Graner & D. Sornette, eds, Springer.
[5] Cont, R. & Tankov, P. (2003). Financial Modelling with Jump Processes, Chapman and Hall / CRC Press.
[6] Eberlein, E., Glau, K. & Papapantoleon, A. (2008). Analysis of Valuation Formulae and Applications to Exotic Options, Preprint Uni Freiburg, www.stochastic.uni-freiburg.de/∼eberlein/papers/Eberlein-glau.Papapan.pdf
[7] Kim, Y.S. & Lee, J.H. (2007). The relative entropy in CGMY processes and its applications to finance, Mathematical Methods of Operations Research 66(2), 327–338.
[8] Koponen, I. (1995). Analytic approach to the problem of convergence of truncated Lévy flights towards the Gaussian stochastic process, Physical Review E 52, 1197–1199.
[9] Kyprianou, A.E. & Loeffen, T.L. (2005). Lévy processes in finance distinguished by their coarse and fine path properties, in Exotic Option Pricing and Advanced Lévy Models, A.E. Kyprianou, W. Schoutens & P. Wilmott, eds, Wiley, Chichester.
[10] Madan, D. & Yor, M. (2005). CGMY and Meixner Subordinators are Absolutely Continuous with Respect to One Sided Stable Subordinators, Prépublication du Laboratoire de Probabilités et Modèles Aléatoires.
[11] Poirot, J. & Tankov, P. (2006). Monte Carlo option pricing for tempered stable (CGMY) processes, Asia Pacific Financial Markets 13(4), 327–344.
[12] Rosinski, J. (2007). Tempering stable processes, Stochastic Processes and their Applications 117(6), 677–707.

Related Articles

Exponential Lévy Models; Fourier Methods in Options Pricing; Fourier Transform; Lévy Processes; Time-changed Lévy Process.

JÖRG KIENITZ
Lognormal Mixture Diffusion Model

Let us denote the time-t price of a given financial asset by S(t), equivalently $S_t$. We say that S evolves according to a local-volatility model (see also Local Volatility Model) if, under the risk-neutral measure,

$$dS(t) = \mu S(t)\,dt + \sigma(t, S(t))\,S(t)\,dW(t), \qquad S(0) = S_0 \qquad (1)$$

where $S_0$ is a positive constant, W is a standard Brownian motion, σ is a well-behaved deterministic function, and µ is the risk-neutral drift rate, which is assumed to be constant. For instance, in case of a stock paying a continuous dividend yield q, µ = r − q, where r is the (assumed constant) continuously compounded risk-free rate.
Brigo and Mercurio [1–3] find an explicit expression for the function σ such that the resulting process has a density that, at each time, is given by a mixture of lognormal densities. Their result is briefly reviewed in the following.
Let us consider N functions $\sigma_i$ that are deterministic and bounded from above and below by positive constants, and corresponding lognormal densities

$$p^i_t(y) = \frac{1}{y V_i(t) \sqrt{2\pi}} \exp\left\{ -\frac{1}{2 V_i^2(t)} \left[ \ln\frac{y}{S_0} - \mu t + \frac{1}{2} V_i^2(t) \right]^2 \right\} \qquad (2)$$

$$V_i(t) := \sqrt{\int_0^t \sigma_i^2(u)\,du} \qquad (3)$$

Proposition 1 Let us assume that each $\sigma_i$ is also continuous and that there exists an ε > 0 such that $\sigma_i(t) = \sigma_0 > 0$, for each t in [0, ε] and i = 1, . . . , N. Then, if we set

$$\nu(t, y) = \sqrt{ \frac{ \displaystyle\sum_{i=1}^{N} \lambda_i\, \sigma_i^2(t)\, \frac{1}{V_i(t)} \exp\left\{ -\frac{1}{2 V_i^2(t)} \left[ \ln\frac{y}{S_0} - \mu t + \frac{1}{2} V_i^2(t) \right]^2 \right\} }{ \displaystyle\sum_{i=1}^{N} \lambda_i\, \frac{1}{V_i(t)} \exp\left\{ -\frac{1}{2 V_i^2(t)} \left[ \ln\frac{y}{S_0} - \mu t + \frac{1}{2} V_i^2(t) \right]^2 \right\} } } \qquad (4)$$

for (t, y) > (0, 0) and $\nu(t, y) = \sigma_0$ for (t, y) = (0, $S_0$), the SDE

$$dS_t = \mu S_t\,dt + \nu(t, S_t)\,S_t\,dW_t \qquad (5)$$

has a unique strong solution whose marginal density is given by the mixture of lognormals

$$p_t(y) = \sum_{i=1}^{N} \lambda_i\, \frac{1}{y V_i(t) \sqrt{2\pi}} \exp\left\{ -\frac{1}{2 V_i^2(t)} \left[ \ln\frac{y}{S_0} - \mu t + \frac{1}{2} V_i^2(t) \right]^2 \right\} \qquad (6)$$

Moreover, for (t, y) > (0, 0), we can write $\nu^2(t, y) = \sum_{i=1}^{N} \Lambda_i(t, y)\,\sigma_i^2(t)$, where, for each (t, y) and i, $\Lambda_i(t, y) \ge 0$ and $\sum_{i=1}^{N} \Lambda_i(t, y) = 1$. As a consequence, for each t, y > 0,

$$0 < \tilde\sigma := \inf_{t \ge 0}\, \min_{i=1,\ldots,N} \sigma_i(t) \;\le\; \nu(t, y) \;\le\; \hat\sigma := \sup_{t \ge 0}\, \max_{i=1,\ldots,N} \sigma_i(t) < +\infty \qquad (7)$$

A proof of this proposition can be found in [2], and more formally in [5].
The pricing of European options under the lognormal-mixture local-volatility model is quite straightforward (see also Risk-neutral Pricing; Black–Scholes Formula).

Proposition 2 Consider a European option with maturity T, strike K, and written on the asset. The option value at the initial time t = 0 is given by the following convex combination of Black–Scholes prices:
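The local volatility of equation (4) in Proposition 1 is a weighted average of the component volatilities, which is why it stays within the bounds of equation (7). The following minimal Python sketch evaluates (4) under simplifying assumptions that are not part of the article: each $\sigma_i$ is constant in time, so that $V_i(t) = \sigma_i \sqrt{t}$, and the weights $\lambda_i$ are taken to be positive and to sum to one. The function name `mixture_local_vol` and all numerical values are hypothetical.

```python
import math


def mixture_local_vol(t, y, S0, mu, lambdas, sigmas):
    """Local volatility nu(t, y) of equation (4) for constant sigma_i.

    With constant sigma_i the standard deviation is V_i(t) = sigma_i*sqrt(t).
    Requires t > 0 and y > 0.
    """
    num = 0.0
    den = 0.0
    for lam, sig in zip(lambdas, sigmas):
        V = sig * math.sqrt(t)
        z = math.log(y / S0) - mu * t + 0.5 * V * V
        # Unnormalized weight of component i at the point (t, y).
        w = lam / V * math.exp(-z * z / (2.0 * V * V))
        num += w * sig * sig
        den += w
    return math.sqrt(num / den)


# Two hypothetical components with volatilities 15% and 35%.
lambdas, sigmas = [0.6, 0.4], [0.15, 0.35]
for strike_level in (80.0, 100.0, 120.0):
    nu = mixture_local_vol(0.5, strike_level, 100.0, 0.02, lambdas, sigmas)
    print(f"y={strike_level:6.1f}: nu = {nu:.4f}")
```

Each printed value should lie between the smallest and the largest component volatility, in line with equation (7); far from the money the high-volatility component dominates the weights.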
2 Lognormal Mixture Diffusion Model
We shall denote this distribution by NIG($\alpha, \beta, \delta$). The distribution is so named due to the fact that NIG($\alpha, \beta, \delta$) is a variance–mean mixture of a normal distribution with the inverse Gaussian as the mixing distribution. It follows immediately from expression (1) that this distribution is infinitely divisible. The distribution is defined on the whole real line and has the density function

$$f(x; \alpha, \beta, \delta) = \frac{\alpha\delta}{\pi} \exp\left(\delta\sqrt{\alpha^2 - \beta^2} + \beta x\right) \frac{K_1\left(\alpha\sqrt{\delta^2 + x^2}\right)}{\sqrt{\delta^2 + x^2}}, \quad x \in \mathbb{R} \tag{2}$$

where $K_1$ is the modified Bessel function of the third kind with index 1. If a random variable $X$ follows an NIG($\alpha, \beta, \delta$) distribution and $c > 0$, then $cX$ is NIG($\alpha/c, \beta/c, c\delta$)-distributed. Further, if $X \sim$ NIG($\alpha, \beta, \delta_1$) is independent of $Y \sim$ NIG($\alpha, \beta, \delta_2$), then $X + Y \sim$ NIG($\alpha, \beta, \delta_1 + \delta_2$). If $\beta = 0$, the

$$\nu_{NIG}(dx) = \frac{\delta\alpha}{\pi}\,\frac{\exp\{\beta x\}\,K_1(\alpha|x|)}{|x|}\,dx \tag{4}$$

An NIG process has no Brownian component and its Lévy triplet is given by $[\gamma, 0, \nu_{NIG}(dx)]$, where

$$\gamma = \frac{2\delta\alpha}{\pi} \int_0^1 \sinh(\beta x)\,K_1(\alpha x)\,dx \tag{5}$$

The NIG Lévy process may alternatively be represented via random time change of Brownian motion, using the inverse Gaussian (IG) process to determine time, as

$$X_t^{NIG} = \beta\delta^2 I_t + \delta W_{I_t} \tag{6}$$

where $W = \{W_t, t \ge 0\}$ is a standard Brownian motion and $I = \{I_t, t \ge 0\}$ is an IG process with parameters $a = 1$ and $b = \delta\sqrt{\alpha^2 - \beta^2}$.
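The time-change representation (6) suggests a simple way to simulate NIG increments. The sketch below is a minimal illustration under an assumption that is not stated in the text: the IG law with parameters $(a, b)$ is taken to have density proportional to $x^{-3/2}\exp\{-\tfrac12(a^2/x + b^2 x)\}$, that is, mean $a/b$ and shape $a^2$. Inverse Gaussian draws use the Michael–Schucany–Haas transformation method; the function names are hypothetical.

```python
import math
import random


def sample_ig(mean, shape, rng):
    """One draw from an inverse Gaussian law (Michael-Schucany-Haas method)."""
    y = rng.gauss(0.0, 1.0) ** 2
    root = math.sqrt(4.0 * mean * shape * y + mean * mean * y * y)
    x = mean + mean * mean * y / (2.0 * shape) - mean / (2.0 * shape) * root
    # Accept the smaller root with probability mean / (mean + x).
    if rng.random() <= mean / (mean + x):
        return x
    return mean * mean / x


def sample_nig(alpha, beta, delta, rng):
    """One unit-time NIG(alpha, beta, delta) increment via equation (6)."""
    b = delta * math.sqrt(alpha ** 2 - beta ** 2)
    i1 = sample_ig(1.0 / b, 1.0, rng)        # I_1 with a = 1: mean 1/b, shape 1
    w = rng.gauss(0.0, math.sqrt(i1))        # Brownian motion evaluated at time I_1
    return beta * delta ** 2 * i1 + delta * w
```

Conditioning on $I_1$ gives mean $\delta\beta(\alpha^2-\beta^2)^{-1/2}$ and variance $\alpha^2\delta(\alpha^2-\beta^2)^{-3/2}$ for the simulated increments, which a large sample should reproduce approximately.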
Table 1 Mean, variance, skewness, and kurtosis of the normal inverse Gaussian distribution NIG($\alpha, \beta, \delta$)

Mean        $\delta\beta\,(\alpha^2 - \beta^2)^{-1/2}$
Variance    $\alpha^2\delta\,(\alpha^2 - \beta^2)^{-3/2}$
Skewness    $3\beta\alpha^{-1}\delta^{-1/2}(\alpha^2 - \beta^2)^{-1/4}$
Kurtosis    $3\left(1 + (\alpha^2 + 4\beta^2)\,\delta^{-1}\alpha^{-2}(\alpha^2 - \beta^2)^{-1/2}\right)$

The NIG model belongs to the class of exponential Lévy models (see Time-changed Lévy Process). Consider a market with a riskless asset (the bond), with a price process given by $b_t = \exp\{rt\}$, and one risky asset (the stock or index). The model for the risky asset is

$$S_t = S_0 \exp\{X_t^{(NIG)}\} \tag{7}$$

where the log returns $\log(S_{t+s}/S_t)$ follow the NIG($\alpha, \beta, \delta s$) distribution (i.e., the distribution of increments of length $s$ of the NIG process).

Another way to obtain an equivalent martingale measure $Q$ is by mean-correcting the exponential of the NIG($\alpha, \beta, \delta$) process, that is, introducing the distribution NIG($\alpha, \beta, \delta, m$) with the characteristic function

$$\tilde{\phi}(u; \alpha, \beta, \delta, m) = \phi(u; \alpha, \beta, \delta) \exp\{ium\} \tag{9}$$

where

$$m = r - q + \delta\left(\sqrt{\alpha^2 - (\beta + 1)^2} - \sqrt{\alpha^2 - \beta^2}\right)$$

and $\phi(u; \alpha, \beta, \delta)$ is defined by expression (1).

Pricing of European Options

Given our NIG market model, we focus now on the pricing of European options whose payoffs are functions of the terminal asset value only. Denote the payoff of the option at its time of expiry $T$ by $G(S_T)$ and let $F(X_T) = G(S_T)$

Pricing through Density Function

Another way to find the value $V_t = V(t, X_t)$ at time $t$ is by solving a partial integro-differential
$$d_{GH(\lambda,\alpha,\beta,\delta,\mu)}(x) = a(\lambda, \alpha, \beta, \delta, \mu)\,\left(\delta^2 + (x - \mu)^2\right)^{(\lambda - \frac{1}{2})/2}\,K_{\lambda - \frac{1}{2}}\left(\alpha\sqrt{\delta^2 + (x - \mu)^2}\right) \exp(\beta(x - \mu)) \tag{1}$$

with the norming constant

$$a(\lambda, \alpha, \beta, \delta, \mu) = \frac{(\alpha^2 - \beta^2)^{\lambda/2}}{\sqrt{2\pi}\,\alpha^{\lambda - \frac{1}{2}}\,\delta^{\lambda}\,K_{\lambda}\left(\delta\sqrt{\alpha^2 - \beta^2}\right)} \tag{2}$$

$K_\nu$ denotes the modified Bessel function of the third kind with index $\nu$. The parameters can be interpreted as follows: $\alpha > 0$ determines the shape, $\beta$ with $0 \le |\beta| < \alpha$ the skewness and $\mu \in \mathbb{R}$ the location. $\delta > 0$ serves for scaling, and $\lambda \in \mathbb{R}$ characterizes subclasses. It is essentially the weight in the tails that changes with $\lambda$. There are two alternative parameterizations that are scale- and location-invariant, that is, they do not change under affine transformations $Y = aX + b$ for $a \ne 0$, namely, $\zeta = \delta\sqrt{\alpha^2 - \beta^2}$, $\rho = \beta/\alpha$ and $\xi = (1 + \zeta)^{-1/2}$, $\chi = \xi\rho$. Since $0 \le |\chi| < \xi < 1$, for a fixed $\lambda$ the distributions parameterized by $\chi$ and $\xi$ can be represented by the points of a triangle, the so-called shape triangle.

GH distributions arise in a natural way as variance–mean mixtures of normal distributions. Let $d_{GIG}$ denote the density of a generalized inverse Gaussian distribution (see Normal Inverse Gaussian Model) with parameters $\delta > 0$, $\gamma > 0$, and $\lambda \in \mathbb{R}$,

$$d_{GH(\lambda,\alpha,\beta,\delta,\mu)}(x) = \int_0^\infty d_{N(\mu + \beta y,\,y)}(x)\; d_{GIG(\lambda,\,\delta,\,\sqrt{\alpha^2 - \beta^2})}(y)\,dy \tag{4}$$

Using maximum likelihood estimation, one can fit GH distributions to empirical return distributions from financial time series such as the daily stock or index prices. Figure 1 shows a fit to the daily closing prices of Telekom over a period of seven years. Figure 2 shows the same densities on a log scale in order to make the fit in the tails visible. One recognizes the hyperbolic shape of the GH density in comparison to the parabolic shape of the normal density. The characteristic function of the GH distribution is

$$\varphi_{GH}(u) = e^{iu\mu} \left(\frac{\alpha^2 - \beta^2}{\alpha^2 - (\beta + iu)^2}\right)^{\lambda/2} \frac{K_\lambda\left(\delta\sqrt{\alpha^2 - (\beta + iu)^2}\right)}{K_\lambda\left(\delta\sqrt{\alpha^2 - \beta^2}\right)} \tag{5}$$

and expectation and variance are

$$\mathrm{E}[GH] = \mu + \frac{\beta\delta^2}{\zeta}\,\frac{K_{\lambda+1}(\zeta)}{K_\lambda(\zeta)} \tag{6}$$

$$\mathrm{Var}(GH) = \frac{\delta^2}{\zeta}\,\frac{K_{\lambda+1}(\zeta)}{K_\lambda(\zeta)} + \frac{\beta^2\delta^4}{\zeta^2}\left(\frac{K_{\lambda+2}(\zeta)}{K_\lambda(\zeta)} - \frac{K_{\lambda+1}^2(\zeta)}{K_\lambda^2(\zeta)}\right) \tag{7}$$
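Formulas (6) and (7) are straightforward to evaluate once $K_\lambda$ is available. The following sketch is illustrative only: it evaluates the Bessel function from the integral representation $K_\nu(x) = \int_0^\infty e^{-x\cosh s}\cosh(\nu s)\,ds$ with a plain trapezoid rule (a choice made here so that only the standard library is needed, not a method taken from the text), and the function names, step size, and cutoff are assumptions.

```python
import math


def bessel_k(nu, x, step=0.01, s_max=20.0):
    """K_nu(x), x > 0, via int_0^inf exp(-x cosh s) cosh(nu s) ds (trapezoid rule)."""
    total = 0.5 * math.exp(-x)               # half weight for the endpoint s = 0
    n = int(s_max / step)
    for k in range(1, n + 1):
        s = k * step
        total += math.exp(-x * math.cosh(s)) * math.cosh(nu * s)
    return total * step


def gh_mean_var(lam, alpha, beta, delta, mu):
    """Expectation (6) and variance (7) of the GH distribution."""
    zeta = delta * math.sqrt(alpha ** 2 - beta ** 2)
    k0 = bessel_k(lam, zeta)
    r1 = bessel_k(lam + 1.0, zeta) / k0      # K_{lam+1}(zeta) / K_lam(zeta)
    r2 = bessel_k(lam + 2.0, zeta) / k0      # K_{lam+2}(zeta) / K_lam(zeta)
    mean = mu + beta * delta ** 2 / zeta * r1
    var = delta ** 2 / zeta * r1 + beta ** 2 * delta ** 4 / zeta ** 2 * (r2 - r1 * r1)
    return mean, var
```

For $\lambda = -1/2$ the GH class reduces to the NIG subclass, so `gh_mean_var(-0.5, alpha, beta, delta, mu)` should agree with the closed-form NIG mean $\mu + \delta\beta/\sqrt{\alpha^2-\beta^2}$ and variance $\alpha^2\delta(\alpha^2-\beta^2)^{-3/2}$.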
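Figure 1 GH and normal density fitted to the daily Telekom returns (vertical axis: densities; horizontal axis: x)

Figure 2 (vertical axis: log densities; curves: GH, Norm)

$$d_{NIG(\alpha,\beta,\delta,\mu)}(x) = \frac{\alpha}{\pi} \exp\left(\delta\sqrt{\alpha^2 - \beta^2} + \beta(x - \mu)\right) \frac{K_1\left(\alpha\delta\sqrt{1 + \left(\frac{x - \mu}{\delta}\right)^2}\right)}{\sqrt{1 + \left(\frac{x - \mu}{\delta}\right)^2}} \tag{9}$$

The latter one has a particularly simple characteristic function:

$$\varphi_{NIG}(u) = e^{iu\mu}\,\frac{\exp\left(\delta\sqrt{\alpha^2 - \beta^2}\right)}{\exp\left(\delta\sqrt{\alpha^2 - (\beta + iu)^2}\right)} \tag{10}$$

Many well-known distributions are limit cases of the class of GH distributions. For $\lambda > 0$ and $\delta \to 0$, one gets a variance-gamma distribution; in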
appeared in [3, 8]. The log returns from this model taken along time intervals of length 1 are $L_t - L_{t-1}$ and therefore they have exactly the GH distribution that generates the Lévy process. It was shown in [7] that the model (11) is successful in producing empirically correct distributions on other time horizons as well. This time consistency property can, for example, be used to derive the correct VaR estimates on a two-week horizon according to the Basel II rules. Equation (11) can be expressed by the following stochastic differential equation:

$$dS_t = S_{t-}\left(dL_t + e^{\Delta L_t} - 1 - \Delta L_t\right) \tag{12}$$

The price of a European option with payoff $f(S_T)$ is

$$V = e^{-rT}\,\mathrm{E}[f(S_T)] \tag{13}$$

where $r$ is the interest rate and expectation is taken with respect to a risk-neutral (martingale) measure. As shown in [5], there are many equivalent martingale measures due to the rich structure of the driving process $L$. The simplest choice is the so-called Esscher transform, which was used in [6]. For the process $L$ to be again a GH Lévy motion under an equivalent martingale measure (see Equivalent Martingale Measures), the parameters $\delta$ and $\mu$ have to be kept fixed [9]. Since the density of the distribution of $S_T$ can be derived via inversion of the characteristic function, the expectation in equation (13) can be computed directly. A numerically much more efficient method based on two-sided Laplace transforms, which is applicable to a wide variety of options, has been developed in [9]. Assume that $e^{-Rx} f(e^{-x})$ is bounded and integrable for some $R$ such that the moment-generating function of $L_T$ is finite at $-R$. Write $g(x) = f(e^{-x})$ and $\psi_g(z) = \int_{\mathbb{R}} e^{-zx} g(x)\,dx$ for the bilateral Laplace transform of $g$. If $\zeta := -\log S_0$, then the option price $V$ can be expressed in the following form:

$$V(\zeta) = \frac{e^{\zeta R - rT}}{2\pi} \int_{-\infty}^{\infty} e^{iu\zeta}\,\psi_g(R + iu)\,\varphi_{L_T}(iR - u)\,du \tag{14}$$

whenever the integral exists. $\varphi_{L_T}$ denotes the characteristic function of the distribution of $L_T$.

References

[1] Barndorff-Nielsen, O.E. (1977). Exponentially decreasing distributions for the logarithm of particle size, Proceedings of the Royal Society of London A 353, 401–419.
[2] Barndorff-Nielsen, O.E. (1998). Processes of normal inverse Gaussian type, Finance and Stochastics 2(1), 41–68.
[3] Eberlein, E. (2001). Application of generalized hyperbolic Lévy motions to finance, in Lévy Processes. Theory and Applications, O.E. Barndorff-Nielsen, T. Mikosch & S. Resnick, eds, Birkhäuser, pp. 319–336.
[4] Eberlein, E. & von Hammerstein, E.A. (2004). Generalized hyperbolic and inverse Gaussian distributions: limiting cases and approximation of processes, in Seminar on Stochastic Analysis, Random Fields and Applications IV, R.C. Dalang, M. Dozzi & F. Russo, eds, Progress in Probability, Birkhäuser, Vol. 58, 221–264.
[5] Eberlein, E. & Jacod, J. (1997). On the range of options prices, Finance and Stochastics 1, 131–140.
[6] Eberlein, E. & Keller, U. (1995). Hyperbolic distributions in finance, Bernoulli 1(3), 281–299.
[7] Eberlein, E. & Özkan, F. (2003). Time consistency of Lévy models, Quantitative Finance 3, 40–50.
[8] Eberlein, E. & Prause, K. (2002). The generalized hyperbolic model: financial derivatives and risk measures, in Mathematical Finance – Bachelier Congress, 2000, H. Geman, D. Madan, S. Pliska & T. Vorst, eds, Springer, Paris, pp. 245–267.
[9] Raible, S. (2000). Lévy Processes in Finance: Theory, Numerics, and Empirical Facts. Ph.D. thesis, University of Freiburg.

Related Articles

Exponential Lévy Models; Fourier Methods in Options Pricing; Heavy Tails; Implied Volatility Surface; Jump-diffusion Models; Normal Inverse Gaussian Model; Partial Integro-differential Equations (PIDEs); Stochastic Exponential; Stylized Properties of Asset Returns.

ERNST EBERLEIN
Regime-switching Models

Many financial time series exhibit sudden changes in the structure of the data-generating process. Examples include financial crises, exchange rate swings, and jumps in the volatility. Sometimes, this sudden switch is due to a change in policy, for example, when moving from a fixed to a floating exchange rate regime. In other cases, the behavior of the series is influenced by an exogenous fundamental variable, such as the current position on the business or the credit cycle.

Regime-switching models attempt to capture this behavior by allowing the data-generating process to change in time, depending on an underlying, discrete but unobserved state variable. Typically, the functional form of the data-generating process remains the same across the different regimes with only the parameter values being state-dependent, as, for example, in a random walk equity return model where the drift and volatility change with the regime. However, it is feasible to set up models where the data-generating process itself changes, for example, moving from a deterministic fixed exchange to a stochastic floating one.

From a statistical point of view, regime-switching models will produce mixtures of distributions (see Mixture of Distribution Hypothesis), offering a very stylized and intuitive way of accommodating features such as fat tails, skewness, and volatility clustering (see Stylized Properties of Asset Returns). It is very easy to calibrate regime-switching models on historical data using maximum likelihood techniques, implementing what can be thought of as a discrete version of the Kalman filter (see Filtering). Virtually all economic and financial time series have been analyzed under the regime-switching framework, including interest and exchange rates, equity returns, commodity prices, energy prices, and credit spreads.

Derivative prices can be computed for regime-switching models by using transform methods (see Hazard Rate). The characteristic function of a regime-switching process can be computed in closed form if the characteristic functions conditional on each regime are available. This makes such processes a viable alternative to the stochastic volatility models, where one has to also resort to transform methods for pricing.

The Regime-switching Framework

A regime-switching model can be cast in either a discrete or continuous time setting. The model is built conditional to a Markov chain $s(t)$, the realization of which is not directly observed by economic agents. The chain can take a discrete set of values, and here we label them as $s(t) \in \{1, 2, \ldots, N\} = \mathcal{S}$. The Markov chain is determined by its transition probability matrix in discrete time or its rate matrix in continuous time. In particular, in a discrete time setting, we write the transition probabilities $p_{i,j}$

$$\mathrm{P}[s(t + 1) = j \mid s(t) = i] = p_{i,j} \tag{1}$$

and we collect the elements $\{p_{i,j}\}$ in the transition probability matrix $\mathbf{P}$. The columns of $\mathbf{P}$ must sum up to 1, and all transition probabilities must be nonnegative. For a continuous time process, we define the transition rates $q_{i,j}$:

$$\mathrm{P}[s(t + dt) = j \mid s(t) = i] = \mathbf{1}(i = j) + q_{i,j}\,dt \tag{2}$$

where $\mathbf{1}$ is the indicator function. The elements $\{q_{i,j}\}$ are collected in the rate matrix $\mathbf{Q}$. Note that by this definition, the columns of the rate matrix must sum up to 0, and the diagonal elements will be negative. The infinitesimal transition probability matrix is given as

$$\mathbf{P}(dt) = \mathbf{I} + \mathbf{Q}\,dt \tag{3}$$

where $\mathbf{I}$ is the unit $(N \times N)$ matrix.

At each point in time, the data-generating process will vary, according to the regime $s(t)$ that prevails at that time. Thus, for a discrete time process, we can write

$$x(t) = g[t, s(t), y(t), \varepsilon(t); \beta] \tag{4}$$

In the above expression, $y(t)$ includes variables known at time $t$, including exogenous variables and lagged values of $x(t)$, and $\varepsilon(t)$ represents the error term. In continuous time, we can write the stochastic differential equation:

$$dx(t) = \mu[t, s(t), y(t); \beta]\,dt + \sigma[t, s(t), y(t); \beta]\,dB(t) \tag{5}$$

Again, $y(t)$ can include exogenous variables and the history of $x(t)$, while now $B(t)$ is a standard Brownian motion. The above equation can be extended to a multidimensional setting and can be generalized to include jumps or Lévy processes.
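As a minimal illustration of equations (1)–(3), the sketch below builds the first-order transition matrix $\mathbf{I} + \mathbf{Q}\,dt$ from a rate matrix and simulates a regime path. It assumes the column convention implied by the text (column $i$ holds the law of the next regime given current regime $i$, so columns of $\mathbf{P}$ sum to 1 and columns of $\mathbf{Q}$ sum to 0); regimes are indexed from 0 rather than 1, and all names and numerical values are hypothetical.

```python
import random


def transition_matrix(rate, dt):
    """First-order approximation P(dt) = I + Q dt of equation (3)."""
    n = len(rate)
    return [[(1.0 if i == j else 0.0) + rate[i][j] * dt for j in range(n)]
            for i in range(n)]


def column_sums(m):
    """Sum of each column of a square matrix given as a list of rows."""
    n = len(m)
    return [sum(m[i][j] for i in range(n)) for j in range(n)]


def simulate_chain(p, start, steps, rng):
    """Simulate a regime path; p[j][i] is the probability of moving from i to j."""
    n = len(p)
    path = [start]
    for _ in range(steps):
        cur = path[-1]
        u = rng.random()
        acc = 0.0
        nxt = n - 1                      # fallback guards against rounding in the sum
        for j in range(n):
            acc += p[j][cur]
            if u < acc:
                nxt = j
                break
        path.append(nxt)
    return path


# Two regimes: regime 0 is left at rate 0.5, regime 1 at rate 1.0 (per unit time).
Q = [[-0.5, 1.0],
     [0.5, -1.0]]
P = transition_matrix(Q, 0.01)
path = simulate_chain(P, 0, 1000, random.Random(1))
```

The approximation $\mathbf{I} + \mathbf{Q}\,dt$ is a valid transition matrix only when $dt$ is small enough for every diagonal entry to stay nonnegative.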
A standard simple example is a regime-dependent random walk process, where

$$x(t) = \mu[s(t)] + \sigma[s(t)]\,\varepsilon(t) \quad \text{in discrete time}$$
$$dx(t) = \mu[s(t)]\,dt + \sigma[s(t)]\,dB(t) \quad \text{in continuous time} \tag{6}$$

The parameter set in this example is $\beta = \{\mu(1), \sigma(1), \mu(2), \sigma(2), \ldots, \mu(N), \sigma(N)\}$.

• The numerator of the above expression is the product of the conditional density with the forecast probability for each state.
• The denominator, which is the sum of all numerators computed in the previous step, is also the conditional density of $\mathrm{P}[x(t + 1) \in dx \mid \mathcal{F}(t)]$. This is the likelihood function of the observation $x(t + 1)$.

In the Kalman filtering terminology, the above computation can be compactly written in two steps:
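A common way to organize this computation is a forecast step followed by an update step. The sketch below is one such recursion for the discrete-time random walk (6), written under assumptions that are not from the source: Gaussian errors $\varepsilon(t)$, a transition matrix stored so that `p[j][i]` is the probability of moving from regime `i` to regime `j` (columns sum to 1), and regimes indexed from 0. The function names are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def filter_regimes(xs, p, mus, sigmas, prior):
    """Filtered regime probabilities and log-likelihood of the observations xs."""
    n = len(prior)
    filt = list(prior)
    loglik = 0.0
    history = []
    for x in xs:
        # Forecast step: regime probabilities one period ahead.
        fore = [sum(p[j][i] * filt[i] for i in range(n)) for j in range(n)]
        # Update step: numerators are conditional density times forecast probability.
        nums = [normal_pdf(x, mus[j], sigmas[j]) * fore[j] for j in range(n)]
        denom = sum(nums)                # likelihood contribution of this observation
        filt = [v / denom for v in nums]
        loglik += math.log(denom)
        history.append(filt)
    return history, loglik
```

Summing the logarithms of the denominators gives the log-likelihood that a maximum likelihood routine would maximize over the parameter set $\beta$ and the transition probabilities.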
Derivative Pricing under Regime Switching

Derivative pricing is typically carried out in a continuous time setting (see End Note a). For a vanilla payoff with maturity $T$, say $z(T) = h(x(T))$, the time-zero price is given by the risk neutral expectation:

$$z(0) = \mathrm{E}^{Q}[D(T)\,z(T)] \tag{12}$$

where $D(t)$ is the discount factor.

In the regime-switching framework, pricing is routinely carried out using the Fourier inversion techniques (see Fourier Transform) outlined in [7]. In particular, if the log asset price $x(T)$ follows a regime-switching Brownian motion

$$dx(t) = \mu(s(t))\,dt + \sigma(s(t))\,dB(t) \tag{13}$$

then the characteristic function $\phi(u; T) = \mathrm{E}\exp(\mathrm{i}u\,x(T))$ is given by the matrix exponential

$$\phi(u; T) = \iota \exp(T\mathbf{A}(u))\,\xi(0|0) \tag{14}$$

where the $(N \times N)$ matrix $\mathbf{A}(u)$ has the following form:

$$a_{i,j}(u) = \begin{cases} q_{i,i} + g(u; i) & \text{if } i = j \\ q_{i,j} & \text{if } i \ne j \end{cases} \tag{15}$$

for $g(u; i) = \mathrm{i}u\mu(i) - \tfrac{1}{2}u^2\sigma^2(i)$.

The first implementation that prices options where a two-regime process is present is that in [26]. In a more general setting with $N$ regimes, vanilla call option prices can be easily retrieved using the Fast Fourier Transform (FFT) approach of Carr and Madan [7] or the fractional variant that allows explicit control of the discretization grids [10].

The above prototypical process can be extended in two directions. Rather than having switching Brownian motions that generate the conditional paths, one can consider switching Lévy processes (see Lévy Processes; Exponential Lévy Models) between the regimes (see [24], for the special two-regime case, and [11], for a more general setting). In that case, the function $g(u; i)$ in $\mathbf{A}(u)$ is replaced by the characteristic exponent of the Lévy process that is active in the $i$th regime. In addition, to introduce a correlation structure between the regime changes and the log-price changes, a jump in the log-price is introduced when the Markov chain switches. In that case, the off-diagonal elements of $\mathbf{A}(u)$ are multiplied by the characteristic function of the jump size.

Pricing of American options can be done by setting up the continuation region [5] or by employing a variant of Carr's randomization procedure [4]. More exotic products can be handled by setting up a system of partial (integro-)differential equations (see Partial Integro-differential Equations (PIDEs)) or by explicitly using Fourier methods as in [22]. As the conditional distribution can be recovered from the characteristic function numerically, the density-based approach of [2] (see Quadrature Methods) can be a viable alternative.

Regime Switching as an Approximation

Rather than serving as the fundamental latent process, the Markov chain can serve as an approximation to more complex jump-diffusive dynamics. Then, one can use the regime-switching framework to tackle problems in a nonaffine (see Affine Models) setting, both in terms of calibration and derivative pricing. To achieve that, the number of regimes must be large, but the transition rates and conditional dynamics will be functions of a small number of parameters. The book by Kushner and Dupuis [25] outlines the convergence conditions for the approximation of generic diffusions and shows how one can implement the Markov chain approximation in practice.

Following this approach, many stochastic volatility problems can be cast as regime-switching ones. Chourdakis [8] shows how a generic stochastic volatility process can be approximated in that way, whereas Chourdakis [9] extends this method to produce the counterpart of the Heston [21] stochastic volatility model (see Heston Model) in a regime-switching framework where the equity is driven by a Lévy noise.

End Notes

a. The treatments in References [12, 13] are exceptions to this.

References

[1] Albert, J. & Chib, S. (1993). Bayes inference via Gibbs sampling of autoregressive time series subject to Markov mean and variance shifts, Journal of Business and Economic Statistics 11, 1–15.
[2] Andricopoulos, A.D., Widdicks, M., Duck, P.W. & Newton, D.P. (2003). Universal option valuation using quadrature methods, Journal of Financial Economics 67, 447–471.
[3] Ang, A. & Bekaert, G. (2002). Regime switches in interest rates, Journal of Business and Economic Statistics 20(2), 163–182.
[4] Boyarchenko, S.I. & Levendorski, S.Z. (2006). American Options in Regime-switching Models. Manuscript available online at SSRN: 929215.
[5] Buffington, J. & Elliott, R.J. (2002). American options with regime switching, International Journal of Theoretical and Applied Finance 5, 497–514.
[6] Calvet, L. & Fisher, A. (2004). How to forecast long-run volatility: regime-switching and the estimation of multifractal processes, Journal of Financial Econometrics 2, 49–83.
[7] Carr, P. & Madan, D. (1999). Option valuation using the Fast Fourier Transform, Journal of Computational Finance 3, 463–520.
[8] Chourdakis, K. (2004). Non-affine option pricing, Journal of Derivatives 11(3), 10–25.
[9] Chourdakis, K. (2005). Lévy processes driven by stochastic volatility, Asia-Pacific Financial Markets 12, 333–352.
[10] Chourdakis, K. (2005b). Option pricing using the Fractional FFT, Journal of Computational Finance 8(2), 1–18.
[11] Chourdakis, K. (2005c). Switching Levy Models in Continuous Time: Finite Distributions and Option Pricing. Manuscript available online at SSRN: 838924.
[12] Chourdakis, K. & Tzavalis, E. (2000). Option Pricing Under Discrete Shifts in Stock Returns. Manuscript available online at SSRN: 252307.
[13] Duan, J.-C., Popova, I. & Ritchken, P. (1999). Option Pricing under Regime Switching. Technical report, Hong Kong University of Science and Technology.
[14] Filardo, A.J. (1994). Business-cycle phases and their transitional dynamics, Journal of Business and Economic Statistics 12, 299–308.
[15] Garcia, R., Luger, R. & Renault, E. (2003). Empirical assessment of an intertemporal option pricing model with latent variables, Journal of Econometrics 116, 49–83.
[16] Gray, S. (1996). Modeling the conditional distribution of interest rates as a regime-switching process, Journal of Financial Economics 42, 27–62.
[17] Hamilton, J.D. (1989). A new approach to the economic analysis of nonstationary time series and the business cycle, Econometrica 57, 357–384.
[18] Hamilton, J.D. (1994). Time Series Analysis, Princeton University Press, Princeton, NJ.
[19] Hamilton, J.D. (2005). What's real about the business cycle? Federal Reserve Bank of St. Louis Review 87(4), 435–452.
[20] Hamilton, J.D. & Susmel, R. (1994). Autoregressive conditional heteroscedasticity and changes in regime, Journal of Econometrics 64, 307–333.
[21] Heston, S.L. (1993). A closed-form solution for options with stochastic volatility with applications to bond and currency options, Review of Financial Studies 6, 327–344.
[22] Jackson, K.R., Jaimungal, S. & Surkov, V. (2007). Fourier Space Time-stepping for Option Pricing with Lévy Models. Manuscript available online at SSRN: 1020209.
[23] Jeanne, O. & Masson, P. (2000). Currency crises, sunspots, and Markov-switching regimes, Journal of International Economics 50, 327–350.
[24] Konikov, M. & Madan, D. (2002). Option pricing using Variance Gamma Markov chains, Review of Derivatives Research 5(1), 81–115.
[25] Kushner, H.J. & Dupuis, P.G. (2001). Numerical Methods for Stochastic Control Problems in Continuous Time, 2nd Edition, Applications of Mathematics, Springer Verlag, New York, NY, Vol. 24.
[26] Naik, V. (1993). Option valuation and hedging strategies with jumps in the volatility of asset returns, The Journal of Finance 48, 1969–1984.
[27] Weron, R., Bierbauer, M. & Trück, S. (2004). Modeling electricity prices: jump diffusion and regime switching, Physica A 336, 39–48.

Related Articles

Exponential Lévy Models; Filtering; Fourier Methods in Options Pricing; Fourier Transform; Monte Carlo Simulation; Stochastic Volatility Models; Stylized Properties of Asset Returns; Variance-gamma Model.

KYRIAKOS CHOURDAKIS
Variance-gamma Model

The variance-gamma (VG) process is a stochastic process with independent stationary increments, which allows for flexible parameterization of skewness and kurtosis of increments. It has gained popularity, especially in option pricing, because of its analytical tractability. It is an example of a pure-jump Lévy process (see Lévy Processes).

The VG model is derived from the (symmetric) VG probability distribution, which is so named because it is the distribution of a random variable $X$ that results from mixing a normal variable on its variance by a gamma distribution. Specifically, the conditional distribution of $X$ is given by $X|W \sim N(\mu, \sigma^2 W)$, $\mu \in \mathbb{R}$, $\sigma > 0$, where $W \sim \Gamma(\alpha, \alpha)$, $\alpha > 0$. The symbol "$\sim$" stands for "is distributed as". The symbol $\Gamma(\alpha, \lambda)$ indicates a gamma probability distribution, with probability density function (PDF), for $\alpha, \lambda > 0$,

$$f(w; \alpha, \lambda) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)}\,w^{\alpha - 1} e^{-\lambda w}, \quad w > 0; \quad = 0 \text{ elsewhere} \tag{1}$$

The choice $\lambda = \alpha$ implies that $\mathrm{E}(W) = 1$ and $\mathrm{Var}(W) = \alpha^{-1}$. Thus, $X$ is a random variable symmetrically distributed about its mean $\mathrm{E}(X) = \mu$, with $\mathrm{Var}(X) = \sigma^2$ and a simple characteristic function (CF), so that when $X$ is mean-corrected to $Y = X - \mu$, the CF is

$$\mathrm{E}(e^{iuY}) = \left(1 + \frac{\sigma^2 u^2}{2\alpha}\right)^{-\alpha} \tag{2}$$

A random variable $X$ having (symmetric) VG distribution may also be viewed as

$$X \overset{D}{=} \mu + \sigma W^{1/2} Z, \quad \mathrm{E}(W) = 1 \tag{3}$$

where the symbol $\overset{D}{=}$ means "has the same distribution as". Here $Z \sim N(0, 1)$ and $W$ is a positive nondegenerate random variable distributed independently of $Z$. In the case of the VG distribution, $W \sim \Gamma(\alpha, \alpha)$.

Log returns of financial assets

$$X_t = \log P_t - \log P_{t-1}, \quad t = 1, 2, \ldots, N \tag{4}$$

reveal a kurtosis well in excess of 3, suggesting that the modeling of returns should be done by a symmetric distribution with heavier tails than the normal (see Stylized Properties of Asset Returns; Heavy Tails). For example, in 1972, Praetz [20] argued in favor of variance-dilation of the normal through variance-mixing and found that mixing according to $X|W \sim N(\mu, \sigma^2 W)$, where $W$ has reciprocal (inverse) gamma PDF with $\mathrm{E}W = 1$, gives the scaled t-distribution symmetric about $\mu$ for the returns. This is a slight generalization of the classical Student's t-distribution, in that fractional degrees of freedom are permitted.

Influenced by Praetz's work, Madan and Seneta [16] took the distribution of the mixing variable $W$ itself to be gamma (rather than reciprocal gamma). This resulted in a continuous-time model, which is now known as the (symmetric) variance-gamma model.

The VG model may be placed within the context of a more general subordinator (see Time Change) model [10], which gives the price $P_t$ of a risky asset over continuous time $t \ge 0$ as

$$P_t = P_0 \exp\{\mu t + \sigma B(T_t)\} \tag{5}$$

where $\mu$ and $\sigma$ $(> 0)$ are real constants. $\{T_t\}$, the (market) activity time, is a positive, increasing random process (with stationary differences $W_t = T_t - T_{t-1}$, $t = 1, 2, \ldots$), which is independent of the standard Brownian motion $\{B(t)\}$. The corresponding returns are then given as

$$X_t = \log P_t - \log P_{t-1} = \mu + \sigma\left(B(T_t) - B(T_{t-1})\right) \tag{6}$$

We assume that $\mathrm{E}(W_t) < \infty$, and so without loss of generality that

$$\mathrm{E}(W_t) = 1 \tag{7}$$

to make the expected activity time change over unit calendar time equal to one unit, the scaling change in time being absorbed into $\sigma$, while noting that

$$X_t \overset{D}{=} \mu + \sigma W_t^{1/2} B(1) \tag{8}$$

which is of form (3). The case $T_t = t$ of the model (5) is the classical geometric Brownian motion (GBM) model for the process $\{P_t\}$, with corresponding returns being independently $N(\mu, \sigma^2)$ distributed.
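Representation (3) with $W \sim \Gamma(\alpha, \alpha)$ translates directly into a sampler for symmetric VG returns. The sketch below is illustrative; the function names and parameter values are hypothetical. Note that Python's `random.gammavariate` takes a shape and a scale, so the rate $\alpha$ of equation (1) enters as the scale $1/\alpha$.

```python
import math
import random


def sample_vg(mu, sigma, alpha, rng):
    """One symmetric VG draw: X = mu + sigma * sqrt(W) * Z, with W ~ Gamma(alpha, alpha)."""
    w = rng.gammavariate(alpha, 1.0 / alpha)   # shape alpha, scale 1/alpha: E(W) = 1
    return mu + sigma * math.sqrt(w) * rng.gauss(0.0, 1.0)


def vg_cf(u, sigma, alpha):
    """Characteristic function (2) of the mean-corrected variable Y = X - mu."""
    return (1.0 + sigma ** 2 * u ** 2 / (2.0 * alpha)) ** (-alpha)


rng = random.Random(2024)
mu, sigma, alpha = 0.01, 0.2, 2.0
sample = [sample_vg(mu, sigma, alpha, rng) for _ in range(50000)]
mean = sum(sample) / len(sample)
var = sum((x - mean) ** 2 for x in sample) / len(sample)
# Empirical CF of Y at u = 5; the imaginary part vanishes by symmetry.
ecf = sum(math.cos(5.0 * (x - mu)) for x in sample) / len(sample)
```

The sample mean and variance should be close to $\mu$ and $\sigma^2$, and the empirical CF close to `vg_cf(5.0, sigma, alpha)`, up to Monte Carlo error.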
In the VG model, $\{T_t\}$ for $t \ge 0$, is the gamma process, a process of stationary-independent increments. The distribution of an increment over a time interval of length $t$ is $\Gamma(\alpha t, \alpha)$. It is a remarkable feature that the distributional form for any $t$ is the same; this is inherited by the VG model for $\{\log(P_t/P_0)\}$, $t \ge 0$, which is a process of stationary-independent increments, with the distribution of an increment over any time period $t$ having CF

$$e^{i\mu u t}\left(1 + \frac{\sigma^2 u^2}{2\alpha}\right)^{-\alpha t} \tag{9}$$

The corresponding distribution is also called a (symmetric) VG distribution. Its mean and variance are given by $\mathrm{E}\log(P_t/P_0) = \mu t$ and $\mathrm{Var}(\log(P_t/P_0)) = \sigma^2 t$, respectively. The whole structure is redolent of Brownian motion, to which the VG process reduces in the limit as $\alpha \to \infty$.

An important consequence of the VG distributional form of an increment over any time interval of length $t$ is that, irrespective of the size of unit of time between successive data readings, returns have a VG distribution.

The CF (2), clearly the CF of an infinitely divisible distribution, is also the CF of a difference of two independently and identically distributed (i.i.d.) gamma random variables, which reflects the fact shown in [16] that the process $\{\log(P_t/P_0) - t\mu\}$, $t \ge 0$, is the difference of two i.i.d. gamma processes.

The VG model is a pure-jump process [16] (see Jump Processes) reflecting this feature of a gamma process. This is seen from the Lévy–Khinchin representation (see Lévy Processes).

The analytical simplicity of the VG model and its pure-jump nature make it a leading candidate for modeling historical financial data. Further, the VG distribution's PDF has explicit structural form (see below), which is tractable for maximum-likelihood estimation of parameters from returns data.

Returns $\{X_t\}$, $t = 0, 1, 2, \ldots$, considered in isolation, need not be taken to be i.i.d. as in the preceding discussion, but to form (more generally) a strictly stationary sequence, to which moment estimation methods, for example, will still apply [25]. The symmetric scaled t-distribution continues to enjoy favor as a model for the distribution of returns because of its power-law (Pareto-type) probability tails, a property manifested (in contrast to the VG) in the nonexistence of higher moments. For some data sets, empirical investigations [10, 12] suggest the nonexistence of higher moments in a model for returns (see Heavy Tails; Stylized Properties of Asset Returns) and hence the scaled t-distribution.

On the other hand, other investigations [9] suggest that it is virtually impossible to distinguish between the symmetric scaled t and VG distributions in regard to distributional tail structure by taking compatible parameter values in the two distributions. In fact, the PDFs of the two distributions reveal that the concentration of probability near the point of symmetry $\mu$ and in the middle range of the distributions is qualitatively and quantitatively different. The VG distribution tends to increase the probability near $\mu$ and in the tails, at the expense of the middle range. The different natures in regard to shape are most significantly revealed by the Cauchy distribution as a special case of the t-distribution and the Laplace (two-sided exponential) distribution as a special case of the VG distribution.

The first monograph to include a study of the VG model was [4]. Since then it has found a place in monographs such as [22], where it is treated in the general context of Lévy processes (see Lévy Processes).

Both the VG distribution and the scaled t-distribution are extreme cases of the generalized hyperbolic (see Generalized Hyperbolic Models) distribution [23, 25].

Allowing for Skewness

A generalized normal mean–variance-mixing distribution is the distribution of $X$, where the conditional distribution of $X$ is given as

$$X|W \sim N(\mu + \theta W, \sigma^2 W + d^2) \tag{10}$$

Here, $\mu$, $\theta$, $d$, and $\sigma$ $(> 0)$ are real numbers, and $W$ is a nondegenerate positive random variable. The distribution is skew if $\theta \ne 0$, and it is symmetric otherwise. Press [21] studied a continuous-time model with this distribution for returns, where $W \sim \text{Poisson}(\lambda)$. This is a process of stationary-independent increments, resulting from adding a compound Poisson process of normal shocks to a Brownian motion, and has both continuous and jump components. Some special cases of equation (10) as a returns distribution, with focus on the estimation of parameters by the method of moments, are considered in [25].
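The role of $\theta$ in the mixing structure (10) can be checked by simulation. The sketch below assumes, for illustration only, a gamma mixing variable $W \sim \Gamma(\alpha, \alpha)$ as in the VG case; conditioning on $W$ then gives $\mathrm{E}(X) = \mu + \theta$ and $\mathrm{Var}(X) = \sigma^2 + d^2 + \theta^2/\alpha$. The function names and parameter values are hypothetical.

```python
import math
import random


def sample_mixture(mu, theta, sigma, d, alpha, rng):
    """One draw from (10): X | W ~ N(mu + theta W, sigma^2 W + d^2), W ~ Gamma(alpha, alpha)."""
    w = rng.gammavariate(alpha, 1.0 / alpha)   # shape alpha, scale 1/alpha
    return rng.gauss(mu + theta * w, math.sqrt(sigma ** 2 * w + d ** 2))


def mixture_moments(mu, theta, sigma, d, alpha):
    """Mean and variance implied by (10) when E(W) = 1 and Var(W) = 1/alpha."""
    return mu + theta, sigma ** 2 + d ** 2 + theta ** 2 / alpha


def sample_skewness(xs):
    """Sample skewness: third central moment over the cubed standard deviation."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


rng = random.Random(7)
params = (0.0, -0.3, 0.2, 0.1, 2.0)            # mu, theta, sigma, d, alpha
draws = [sample_mixture(*params, rng) for _ in range(50000)]
skew = sample_skewness(draws)                  # negative here because theta < 0
```

A negative $\theta$ produces a left-skewed sample, while setting $\theta = 0$ gives a sample skewness near zero.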
A random variable $X$ is said to have a normal variance–mean mixture (NVM) distribution [1] if equation (10) holds with $d = 0$.

The symmetric VG and scaled t-distributions are instances of equation (10), with $d = \theta = 0$.

The skew VG distribution, as introduced in [15], is the case of NVM where $W$ is described by equation (1) with $\mathrm{E}W = 1$ as in the symmetric case. (The skewed scaled t-distribution is defined analogously by taking $W$ to have a reciprocal gamma distribution.) The skew VG distribution has PDF

$$f_{VG}(x) = \sqrt{\frac{2}{\pi}}\,\frac{\alpha^{\alpha}\,e^{(x - \mu)\theta/\sigma^2}}{\sigma\,\Gamma(\alpha)} \left(\frac{|x - \mu|}{\sqrt{\theta^2 + 2\alpha\sigma^2}}\right)^{\alpha - \frac{1}{2}} K_{\alpha - \frac{1}{2}}\left(\frac{|x - \mu|\sqrt{\theta^2 + 2\alpha\sigma^2}}{\sigma^2}\right), \quad x \in \mathbb{R} \tag{11}$$

and CF

$$\phi_{VG}(u) = \mathrm{E}(e^{iuX}) = e^{i\mu u}\left(1 - \frac{1}{\alpha}\left(iu\theta - \frac{\sigma^2 u^2}{2}\right)\right)^{-\alpha} \tag{12}$$

$K_\eta(\omega)$ for $\eta \in \mathbb{R}$ and $\omega > 0$, given as

$$K_\eta(\omega) = \frac{1}{2}\int_0^\infty z^{\eta - 1} e^{-\frac{\omega}{2}\left(z + \frac{1}{z}\right)}\,dz \tag{13}$$

is a modified Bessel function of the third kind with index $\eta$ ($K_\eta(\omega)$ is referred to as a modified Bessel function of the second kind in some texts).

An equivalent representation is

$$X \overset{D}{=} \mu + \theta W + \sigma W^{1/2} Z, \quad \mathrm{E}W = 1 \tag{14}$$

where $Z$ and $W$ are independently distributed, $Z \sim N(0, 1)$, $W \sim \Gamma(\alpha, \alpha)$ as mentioned before. This distributional structure is consistent with the continuous-time model for prices

$$P_t = P_0 \exp\{\mu t + \theta T_t + \sigma B(T_t)\} \tag{15}$$

where $\{T_t\}$, $t \ge 0$, is a gamma process, exactly as before. The process of independent stationary increments $\{\log(P_t/P_0)\}$, $t \ge 0$, with the distribution of returns described by equations (11)–(13), is also called the variance-gamma model. Its properties are extensively studied in [15].

Dependence and Estimation

The VG model described above is a Lévy process (see Lévy Processes), that is, a stochastic process in continuous time with stationary independent increments, whose increments are independent and VG distributed. To discuss dependence, we consider the model for returns:

$$X_t = \log P_t - \log P_{t-1} = \mu + \theta W_t + \sigma W_t^{1/2} Z_t, \quad t = 1, 2, \ldots \tag{16}$$

where $Z_t$, $t = 1, 2, \ldots$, are identically distributed $N(0, 1)$ random variables, independent of the strictly stationary process $\{W_t\}$, $t = 1, 2, \ldots$. Here $\theta$, $\sigma$ $(> 0)$ are constants as before.

When $\theta = 0$, this discrete-time model is equivalent in distribution to that described by the subordinator model of Heyde [10] given by equations (5)–(8). Note that $\mathrm{Cov}(X_t, X_{t+k}) = 0$, while $\mathrm{Cov}(X_t^2, X_{t+k}^2) \ne 0$, $k = 1, 2, \ldots$. This is an important feature inasmuch as many asset returns display a sample autocorrelation function plot characteristic of white noise, but no longer do so in a sample autocorrelation plot of squared returns and of absolute values of returns [10, 12, 17].

McLeish [17] considered the distribution of individual $W_t \sim \Gamma(\alpha, \lambda)$, which gives the distribution of individual $X_t$ as (symmetric) VG, which he regarded as a robust alternative to the normal. He suggested a number of ways of introducing the dependence in the process $\{W_t\}$, $t = 1, 2, \ldots$.

The continuous-time subordinator model was expanded in [11] to allow for scaled t-distributed returns. Their specification of the activity time process in continuous time $\{T_t\}$ incorporated self-similarity (a scaling property) and long-range dependence (LRD) in the stochastic process of squared returns. (LRD in the Allen sense is expressed as divergence of the sequence of ultimately nonnegative autocorrelations of a discrete stationary process.)

The general form of the continuous-time model for prices over continuous time $t \ge 0$ as

$$P_t = P_0 \exp\{\mu t + \theta T_t + \sigma B(T_t)\} \tag{17}$$
4 Variance-gamma Model
for which the returns are equivalent in distribution to equation (16) was given in [23] as a generalization of the subordinator model that allows for skewness in the distribution of returns in the same way as in [15], but the returns inherit the postulated strict stationarity of the sequence $\{W_t\}$, $t = 1, 2, \ldots$. Following on from [11], Finlay and Seneta [5, 6] studied in detail and in parallel the continuous time structure of the skew VG model and the skew t-model, with focus on skewness, asymptotic self-similarity, and LRD.

Maximum-likelihood estimation for independent readings from a symmetric VG distribution is discussed in [17] and in [23], which however proposes moment estimation in the presence of dependence. Moment estimation, allowing for dependence, is further developed in [25], along with goodness of fit of various models for several sets of asset data. A method of simulating data from long-range dependent processes with skew VG or t-distributed increments is described in [7], and various estimation procedures (method of moments, product-density maximum likelihood, and nonstandard minimum $\chi^{2}$ in particular) are tested on the data to assess their performance. In the simulations considered, the product-density maximum-likelihood method performs favorably. The conclusion, within the limited testing carried out, indicates then that, in practice, ordinary product density maximum-likelihood estimation is satisfactory even in the presence of LRD. This is tantamount to saying that one may treat such data on returns as i.i.d. This entails an enormous simplification in estimation procedures in fitting the skew VG and skew t-distributions.

Option Pricing Applications

Comparing equations (12) and (19), it is clear that choosing $c = a$ results in a (skew) VG distribution, with PDF (11) parameters $\alpha = a$, $\theta = a\left(\frac{1}{b} - \frac{1}{d}\right)$, and $\sigma^{2} = \frac{2a}{bd}$. The further simplification $b = d$ results in the symmetric VG process for returns.

Using this model for option pricing requires imposing parameter restrictions to ensure that $\{e^{-rt}P_t\}$ is a martingale, where $r$ is the interest rate. This amounts to ensuring that $\mathbb{E}(e^{-rt}P_t \mid \mathcal{F}_s) = e^{-rs}P_s$, where $\mathcal{F}_s$ represents information available to time $s \le t$. In the case of the DG process,

$$\mathbb{E}(e^{-rt}P_t \mid \mathcal{F}_s) = e^{-rs}P_s \times e^{(\mu-r)(t-s)} \left(\frac{b}{b-1}\right)^{a(t-s)} \left(\frac{d}{d+1}\right)^{c(t-s)} \tag{20}$$

so that imposing the restriction

$$\mu = r - a\log\frac{b}{b-1} - c\log\frac{d}{d+1} \tag{21}$$

with $b > 1$ results in $\{e^{-rt}P_t\}$, which is a martingale with four free parameters: $a$, $b$, $c$, and $d$. We label it MDG.

The (skew) VG special case is obtained by choosing $c = a$. Relabeling the parameters as above, $\alpha = a$, $\theta = a\left(\frac{1}{b} - \frac{1}{d}\right)$, and $\sigma^{2} = \frac{2a}{bd}$, results in a martingale that is a (skew) VG process. The mean constraint (21) now translates to

$$\mu = r + \alpha\log\left(1 - \frac{\theta + \frac{1}{2}\sigma^{2}}{\alpha}\right) \tag{22}$$
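As a quick numerical check of the relabeling, the sketch below computes the drift from restriction (21) with $c = a$ and from restriction (22) after mapping $(a, b, d)$ to $(\alpha, \theta, \sigma^{2})$, and confirms that the two agree. The function names and the parameter values are illustrative assumptions, not taken from the cited papers.

```python
import math

def dg_to_vg(a, b, d):
    """Relabel DG parameters (with c = a) as skew VG parameters."""
    alpha = a
    theta = a * (1.0 / b - 1.0 / d)
    sigma2 = 2.0 * a / (b * d)
    return alpha, theta, sigma2

def mu_mdg(r, a, b, c, d):
    """Drift from restriction (21); the restriction needs b > 1."""
    if b <= 1.0:
        raise ValueError("restriction (21) requires b > 1")
    return r - a * math.log(b / (b - 1.0)) - c * math.log(d / (d + 1.0))

def mu_mvg(r, alpha, theta, sigma2):
    """Drift from restriction (22) in the VG parameterization."""
    return r + alpha * math.log(1.0 - (theta + 0.5 * sigma2) / alpha)

# Illustrative values only (hypothetical, not calibrated to any data set).
r, a, b, d = 0.03, 2.0, 15.0, 6.0
alpha, theta, sigma2 = dg_to_vg(a, b, d)
mu_21 = mu_mdg(r, a, b, a, d)          # c = a gives the VG special case
mu_22 = mu_mvg(r, alpha, theta, sigma2)
print(mu_21, mu_22)
```

The agreement follows from $(1 - 1/b)(1 + 1/d) = 1 - \theta/\alpha - \sigma^{2}/(2\alpha)$ under the relabeling, so the check is algebraic rather than statistical.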
Ɛ(e−rt Pt |Fs ) Taking the inverse Fourier transform and using the
fact that c(ϒ, k) is real, and using equation (25) gives
1 2
= e−rs Ps × e(µ−r)(t−s) Ɛ e(θ+ 2 σ )(Tt −Ts ) |Fs
e−γ k ∞
(23) C(ϒ, k) = e−ixk ϒ (x) dx
2π −∞
where the sequence {Wt }, where Wt = Tt − Tt−1 , is
e−γ k ∞
strictly stationary. = {e−ixk ϒ (x)} dx (27)
π
Thus, if we take µ = r and θ = − 1 σ 2 in (23), 0
2
the right-hand side of equation (23) becomes e−st Ps , In fact, we shall use a modified version of equation
and we have a martingale. This construction of (27) suggested in [14]:
a martingale, simple and quite general, is slightly
restrictive, however, in that two parameters, µ and θ, e−γ k ∞
C(ϒ, k) = Rγ + {e−ixk ϒ (x)} dx (28)
are constrained. We shall refer to this construction as π 0
a skew-correcting martingale, since θ, the parameter
that determines skewness, is constrained. We denote where Rγ = φϒ (−i) for −1 < γ < 0. The choice of
this martingale model by MSK. Out of the “external” γ generally impacts on the error generated by the
parameters µ, θ, and σ , the only parameter that is numerical approximation of equation (28). Finally,
retained is σ , which is called the historical volatility the option price (28) is computed via numerical
in the Black–Scholes (BS) context, which is a special integration.
case when Tt = t. Any additional parameters in The option price in this procedure is given simply
the martingale (risk-neutral) process will be those by the sum of a number of function evaluations.
emanating from the nature of {Tt }, which will need Lee [14] shows that with a judicious choice of
to be specified for any examination of estimation and tuning parameters, one can calculate the option price
goodness of fit. up to 99.99% accuracy with less than 100, and in
When the CF of the risk-neutral distribution of some cases less than 10, function evaluations. This
price is of the closed form, option prices may be cal- CF-based pricing method lends itself easily to the
culated using Fourier methods (see Fourier Methods fast Fourier transform, which allows for a very fast
in Options Pricing) as in [3, 14]. Specifically, for calculation of a range of option prices.
C(ϒ, k), the price of a European call option with time To numerically illustrate the method and the
to maturity ϒ and strike price K and k = log(K), let empirical performance of MVG against some com-
qϒ (p) be the risk-neutral density of log(Pϒ ), with petitors, we [8] use the data set in [22], Appendix
CF φϒ (u), at time ϒ. Thus, C, which contains 77 call option prices on the S&P
500 index at the close of market on April 18, 2002.
Fundamentally, each data point consists of the triple:
C(ϒ, k) = e−rϒ Ɛ(Pϒ − K)+ strike, option price, and expiry date.
∞ Fitting models involves estimating model param-
= e−rϒ (ep − ek )qϒ (p) dp (24) eters. To do this, we follow [22], p. 7, by minimizing
k
with respect to the model parameters, the root-mean-
Define the modified call price as square error (RMSE):
the asset price process, the RMSE value would be zero, with all model prices matching market prices, given the single true set of parameters.

The estimates of model parameters produced for a given model correspond to the current market status of that model. The procedure is thus, for a given model, a calibration procedure. No historical data are used in this procedure. The use of this data set for comparison of several different models in this way as already done in [22] allows for easy comparison of goodness of fit. We used the tuning parameter value $\gamma = -\frac{1}{2}$; the other nonmodel constants, $q$, $r$, were as in [22], Appendix C.

The RMSE surface for the MDG was reported in [8] to be quite flat, with a number of different parameter values giving essentially the same RMSE value of 2.24. The parameter values that gave the lowest value by 0.001 were as follows: $a = 4.35$, $b = 240.86$, $c = 9.79 \times 10^{-6}$, $d = 2.65 \times 10^{-7}$. The four-parameter MDG model thus did better than the four-parameter CGMY and GH models reported in [22], p. 83, and shown in Table 1.

Table 1 Fit of models to Schoutens [22] option data

Model   MDG    CGMY   GH     VG     MVG    BS
RMSE    2.24   2.76   2.88   3.56   3.56   6.73

The VG (skew) model fit reported in [22], p. 83, corresponded to the parameter values $\alpha\,(= a = c) = 5.4296 \times 10^{-3}$, $b = 14.2699$, $d = 5.8704$. The recalculation of RMSE with these parameter values reported in [8] gave the value 3.57. Optimization of RMSE reported in [8] with starting values $a = c = 0.01$, $b = d = 10$ resulted in the parameter estimates and RMSE as in [22]. Thus in Table 1, the RMSE values reported under VG and MVG are the same, 3.56.

As expected, this three-parameter model (MVG/VG) does not perform quite as well as the four-parameter models. The VG model is a special case of the GH model, so this is not unexpected.

Finlay and Seneta [8] discuss fitting an MSK martingale model, which allows for LRD in the historical data. This introduces two parameters in addition to the “historical volatility” parameter $\sigma$, namely, a parameter $\alpha$ corresponding to the gamma distribution with mean 1, as before, and a “Hurst” parameter $H$ associated with dependence. The fit of MSK produces an RMSE of 6.35 and an estimate of $\sigma = 0.012$. There is almost no improvement on the BS situation reported in Table 1, which is the standard BS martingale model; its RMSE and $\sigma$-estimate values are reported in [22], pp. 40-41, and are 6.73 and 0.011, respectively. This apparent insensitivity of the MSK model to departure from BS, possibly due to the skewness parameter being constrained in the martingale construction, is overcome, as reported in [8], by a four-parameter (“lack of static arbitrage” model: [2]) model, which is termed as C3. This model, though not a martingale model, gives an RMSE = 0.76, and its parameter estimates conform with estimates from historical data for an LRD VG model [7].

Thus, for a given maturity, the three-parameter MVG model and its associated (skew) VG model for historical data perform reasonably well in fitting option prices. If a four-parameter martingale model is to be used, the parent model of MVG that should be used is the MDG, in which the gamma process continues to play a fundamental role.

Historical Notes

In the case where $\sigma^{2} = 2\alpha$ in the CF (2), the corresponding PDF (11) (with $\mu = \theta = 0$) already appears in [18], p. 184, equation (xlii), and is the theme of [19], where it is shown to be the distribution of difference of two i.i.d. gamma random variables, an idea clarified in [13]. The definition of the Bessel function $K_{\eta}(\omega)$ used differs from equation (13). Teichroew [24] obtained the PDF (11) (with $\mu = \theta = 0$), in terms of a Hankel function, from the normal variance-mixing structure of the distribution of $X$, using form (1) for the PDF of the mixing variable $W$. These themes are taken up by McLeish [17] as a starting point.

The skew VG distribution with $\alpha = 2n$, where $n$ is a positive integer, and $-1 < \theta/\sigma^{2} < 1$ appears in [26], a paper generalizing [19], which was also published in 1932.

Acknowledgments

Many thanks are due to Richard Finlay for his help.

References

[1] Barndorff-Nielsen, O.E., Kent, J. & Sørensen, M. (1982). Normal variance-mean mixtures and z distributions, International Statistical Review 50, 145–159.
[2] Carr, P., Geman, H., Madan, D. & Yor, M. (2003). Stochastic volatility for Lévy processes, Mathematical Finance 13, 345–382.
[3] Carr, P. & Madan, D. (1999). Option valuation using the fast Fourier transform, Journal of Computational Finance 2, 61–73.
[4] Epps, T.W. (2000). Pricing Derivative Securities, World Scientific, Singapore.
[5] Finlay, R. & Seneta, E. (2006). Stationary-increment Student and Variance-Gamma processes, Journal of Applied Probability 43, 441–453.
[6] Finlay, R. & Seneta, E. (2007). A gamma activity time process with noninteger parameter and self-similar limit, Journal of Applied Probability 44, 950–959.
[7] Finlay, R. & Seneta, E. (2008a). Stationary-increment Variance-Gamma and t-models: simulation and parameter estimation, International Statistical Review 76, 167–186.
[8] Finlay, R. & Seneta, E. (2008b). Option pricing with VG-like models, International Journal of Theoretical and Applied Finance 11, 943–955.
[9] Fung, T. & Seneta, E. (2007). Tailweight, quantiles and kurtosis. A study of competing distributions, Operations Research Letters 35, 448–454.
[10] Heyde, C.C. (1999). A risky asset model with strong dependence through fractal activity time, Journal of Applied Probability 36, 1234–1239.
[11] Heyde, C.C. & Leonenko, N.N. (2005). Student processes, Advances in Applied Probability 37, 342–365.
[12] Heyde, C.C. & Liu, S. (2001). Empirical realities for a minimal description risky asset model. The need for fractal features, Journal of the Korean Mathematical Society 38, 1047–1059.
[13] Kullback, S. (1936). The distribution laws of the difference and quotient of variables independently distributed in Pearson type III laws, Annals of Mathematical Statistics 7, 51–53.
[14] Lee, R. (2004). Option pricing by transform methods: extensions, unification and error control, Journal of Computational Finance 7, 51–86.
[15] Madan, D.B., Carr, P.P. & Chang, E.C. (1998). The Variance-Gamma process and option pricing, European Finance Review 2, 79–105.
[16] Madan, D.B. & Seneta, E. (1990). The Variance-Gamma (V.G.) model for share market returns, Journal of Business 63, 511–524.
[17] McLeish, D.L. (1982). A robust alternative to the normal distribution, Canadian Journal of Statistics 10, 89–102.
[18] Pearson, K., Jeffery, G.B. & Elderton, E.M. (1929). On the distribution of the first product-moment coefficient in samples drawn from an indefinitely large normal population, Biometrika 21, 164–201.
[19] Pearson, K., Stouffer, S.A. & David, F.N. (1932). Further applications in statistics of the Tm(x) Bessel function, Biometrika 24, 316–343.
[20] Praetz, P.D. (1972). The distribution of share price changes, Journal of Business 45, 49–55.
[21] Press, S.J. (1967). A compound events model for security prices, Journal of Business 40, 317–335.
[22] Schoutens, W. (2003). Lévy Processes in Finance. Pricing Financial Derivatives, Wiley, Chichester.
[23] Seneta, E. (2004). Fitting the Variance-Gamma model to financial data, in Stochastic Methods and Their Applications (C.C. Heyde Festschrift), J. Gani & E. Seneta, eds, Journal of Applied Probability, Vol. 41A, pp. 177–187.
[24] Teichroew, D. (1957). The mixture of normal distributions with different variances, Annals of Mathematical Statistics 28, 510–512.
[25] Tjetjep, A. & Seneta, E. (2006). Skewed normal variance-mean models for asset pricing and the method of moments, International Statistical Review 74, 109–126.
[26] Wishart, J. & Bartlett, M.S. (1932). The distribution of second order moment statistics in a normal system, Proceedings of the Cambridge Philosophical Society 28, 455–459.

Further Reading

Seneta, E. (2007). The early years of the Variance-Gamma process, in Advances in Mathematical Finance (Dilip B. Madan Festschrift), M.C. Fu, R.A. Jarrow, J.-Y.J. Yen, & R.J. Elliott, eds, Birkhäuser, Boston, pp. 3–19.

Related Articles

Exponential Lévy Models; Generalized Hyperbolic Models; Hazard Rate; Heavy Tails; Lévy Processes; Stylized Properties of Asset Returns; Tempered Stable Process.

EUGENE SENETA
Jump-diffusion Models

$$J_t = \sum_{i=1}^{N_t} Y_i \tag{2}$$

is a compound Poisson process where the jump sizes $Y_i$ are independent and identically distributed with distribution $F$ and the number of jumps $N_t$ is a Poisson process with intensity $\lambda$. The asset price $S_t$ thus follows geometric Brownian motion between jumps. Monte Carlo simulation of the process can be carried out by first simulating the number of jumps $N_t$, the jump times, and then simulating geometric Brownian motion on intervals between jump times.

The SDE (1) has the exact solution:

To identify $\mu_J$, taking expectations of equation (1) and from the definition of $\hat{\mu}$,

Define the forward price $F := S_0 e^{rt}$. If $x_t := \log(S_t/F)$ is a Lévy process (see Lévy Processes), its characteristic function $\varphi_T(u) := \mathbb{E}\,e^{iux_T}$ has the Lévy–Khintchine representation

$$\varphi_T(u) = \exp\left\{ i u \left(\mu_J - \sigma^{2}/2\right) T - u^{2}\sigma^{2}T/2 + T \int \left(e^{iu\xi} - 1\right) \nu(\xi)\,d\xi \right\} \tag{7}$$

Typical assumptions for the distribution of jump sizes are as follows: normal as in the original paper by Merton [5] and double exponential as in [3] (see Kou Model). In the Merton model, the Lévy density $\nu(\cdot)$ is given as
In the double-exponential case (see Kou Model) where for ease of notation, V denotes V (S, t).
Equation (15) is a partial integro-differential equation
ν(ξ ) = λ p α+ e−α+ ξ 1ξ ≥0 (PIDE) (see Partial Integro-differential Equations
(PIDEs)), which can be solved using finite-difference
+ (1 − p) α− e−α− |ξ | 1ξ <0 (11) methods [1].
where α+ and α− are the expected positive and
negative jump sizes, respectively, and p is the relative A Valuation Formula for European Options
probability of a positive jump. This gives the explicit
characteristic function: Merton [5] derived an exact solution of the valuation
equation (15) for a European-style call option with
strike K and time to expiration T , which has the
1
φT (u) = exp i u ω T − u2 σ 2 T + λ T form of an infinite sum of Black–Scholes-like terms:
2
p 1−p ∞
× − (12) e−λ T (λ T )n
α+ − i u α− + i u C(S, K, T ) = Fn (dξ )
n=0
n!
with
× CBS (S eξ eµJ T , K, r, σ, T )
1 p 1−p
ω = − σ2 − λ − (13) (16)
2 α+ + 1 α− − 1
where Fn is the distribution of the sum of n-
Pricing of European Options independent jumps and CBS (·) denotes the Black–
Scholes solution, which is given as
Given a characteristic function, European call options
can be priced using Fourier methods (see Fourier CBS (S, K, r, σ, T ) = S N (d1 ) − K e−rT N (d2 )
Methods in Options Pricing), as in [4]: (17)
with
√
−r T
√ 1 ∞ du
C(S, K, T ) = e F − FK log(S/K) + r T σ T
π 0 1 d1 = √ +
u2 + σ T 2
4 √
log(S/K) + r T σ T
d2 = √ −
2
(18)
× Re e−iuk φT (u − i/2) (14) σ T
Jump to Ruin
where the log-strike k := log (K/F ).
In the case where eYi = 0 with probability 1, µJ = λ
and equation (16) simplifies to
Valuation Equation
C(S, K, T ) = e−λ T CBS (S eλ T , K, r, σ, T ) (19)
Assuming the process (1) and a constant risk-free rate
r and further supposing that the market is complete, which is the Black–Scholes formula with a shifted
the value V (S, t) of a European-style option satisfies interest rate r → r + λ. This special case of the JD
model where the stock price jumps to zero (or ruin)
∂V 1 ∂ 2V ∂V whenever there is a jump is the simplest possible
+ σ 2 S2 2 + r S −rV +λ F (dξ )
∂t 2 ∂S ∂S model of default. Equation (19) for the option price
in the jump-to-ruin model may also be derived from a
ξ ∂V
V (S e , t) − V (S, t) − e − 1 S
ξ
=0 Black–Scholes style replication argument using stock
∂S and bonds of the issuer of the stock; upon default of
(15) the issuer, both stock and bonds jump to zero. The
cost of funding stock with bonds of the issuer is $r + \lambda$ in this picture, which explains the simple form (19) of the solution.

Each term $C_{BS}(S, K, r_n, \sigma_n, T)$ in equation (22) is the value of the option conditional on there being exactly $n$ jumps during its life.
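The jump-to-ruin identity can be checked numerically: equation (19) should coincide with the Black–Scholes formula (17)–(18) evaluated at the shifted rate $r + \lambda$. The following is a minimal sketch using only the standard library; the function names and the input values are illustrative assumptions.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, sigma, T):
    """Black-Scholes call price, equations (17)-(18)."""
    vol = sigma * math.sqrt(T)
    d1 = (math.log(S / K) + r * T) / vol + 0.5 * vol
    d2 = d1 - vol
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def jump_to_ruin_call(S, K, r, sigma, T, lam):
    """Jump-to-ruin call price, equation (19)."""
    return math.exp(-lam * T) * bs_call(S * math.exp(lam * T), K, r, sigma, T)

# Illustrative inputs only (hypothetical, not from the article).
S, K, r, sigma, T, lam = 100.0, 105.0, 0.02, 0.25, 1.5, 0.04
price_19 = jump_to_ruin_call(S, K, r, sigma, T, lam)
price_shifted = bs_call(S, K, r + lam, sigma, T)
print(price_19, price_shifted)
```

Because a call price increases with the interest rate, the jump-to-ruin price exceeds the plain Black–Scholes price at rate $r$ whenever $\lambda > 0$.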
2κη
(B 2 − A2 )y B 2 y γY/2 γt κ γt λ2
− − cosh + sinh
f (y) = e 2 E e 2 γ1/2 , 2 γ 2
(16)
G−M G+M 2iu
In the second approach, one writes the stock price We suppose that the background driving Lévy
process S2 (t) as process has the following characteristic function:
E[exp(iuU (t)) = exp(tψU (u)) (28)
S2 (t) = S(0) exp ((r − q)t) exp(Z(t)
− Y (t)ψX (−i)) (22) The characteristic function of the composite pro-
cess Z J (t) may be developed in terms of the joint
For the first approach, the log characteristic func- characteristic function of Y J (t), U (t) as
tion for the logarithm of the stock price is given as
t (a, b) = E exp iaY J (t) + ibU (t) (29)
E exp (iu log S1 (t)) We may show that
= exp (iu(log(S(0) + (r − q)t)))
E exp iuZ J (t) = t (−iψX (u), uρ) (30)
φ (−iψX (u), t, y(0); κ, η, λ)
× (23) We have that
φ(−iψX (−i), t, y(0); κ, η, λ)iu
ψU (v)
E exp (iu log S2 (t)) × exp dv (31)
L a + κb − κv
= exp (iu(log(S(0) + (r − q)t))
L=b (32)
× φ(−iψX (u) − uψX (−i), t, y(0); κ, η, λ)
1 − e−κt
(24) U =b+a (33)
κ
The models of the first approach are termed The characteristic functions for the logarithm of
NIGSA, VGSA, and CGMYSA for NIG, VG, and the stock price for the exponential model are now
CGMY with a stochastic arrival rate of jumps adapted
to the level of the process y(t). The models of
E exp iu log(S1J (t)
the second approach are martingale models and are
termed N I GSAM, V GSAM, and CGMY SAM, = exp (iu(log(S(0) + (r − q)t))
respectively. It is observed in calibrations that the first × t (−iψX (u), ρu)
approach generally fits the option price data better.
× exp (−iu log (t (−iψX (−i), −iρ)) (34)
Some Discontinuous Time Changes For the stochastic exponential, the result is
given as
One can replace the continuous stochastic process
for the arrival rate of jump activity y(t) by a
E exp iu log(S2J (t)
discontinuous process that now only has upward
jumps. We call this process y J (t) for discontinuous = exp(iu(log(S(0)) + (r − q)t − ψU (−iρ)t))
jump arrival rates. Given a background driving Lévy × t (−iψX (u) − uψX (−i), ρu) (35)
process (BDLP) U (t) with only positive jumps, we
define Some explicit examples for ψU (u), for which we
dy J (t) = −κy J (t) dt + dU (t) (25) may obtain exact expressions for t (a, b), remain to
The composite process now permits some direct be determined.
dependence between arrival rate jumps and the under-
lying uncertainty:
Examples for ψU (u) and t (a, b)
Z (t) = X(Y (t)) + ρU (t)
J J
(26)
Three explicit models for ψU were developed. These
t
are SG for stationary gamma, IG for inverse Gaus-
Y J (t) = y J (s) ds (27) sian, and SIG for stationary inverse Gaussian.
0
4 Time-changed Lévy Process
γt γt γt
ib γ cosh − κ sinh + 2ia sinh
2 2 2
B C (t, a, b) =
(52)
γt γ t
γ cosh + κ − ibλ2 sinh
2 2
γ = κ 2 − 2λ2 ia (53)
t
We get the characteristic function for the model
VGCSA, where the letter C denotes correlated Y (t) = y(u) du (56)
0
stochastic arrival by exponentiation as √
dy = κ(η − y) dt + λ y dWy (t) (57)
E exp iu log(S1C (t) dWy dWS = ρ dt (58)
= exp (iu(log(S(0)) + (r − q)t)) ν( dx, dt) = cp + sp y(t) kp (x)1x>0 dx
× Ct (−iψX (u), ρu) + (cn + sn y(t))kn (x)1x<0 dx (59)
× exp −iu log(Ct(−iψX (−i), −iρ)) (54)
The growth rate of the stock price is at the risk
neutral level of (r − q). The coefficients cp , cn are
Exciting the Jumps by the Level of the Lévy jump response components. The sensitivities
of jumps to volatility are captured by the two
Activity that Is Also a Heston Type of
slope coefficients sp , sn for the positive and the
Correlated Volatility negative sides. The logarithm of the stock price
In this class of models, we introduce stochastic is a continuous martingale with stochastic volatility
volatility and allow jump arrival rates to respond to plus a compensated jump martingale that has jumps
the volatility on each side with separate sensitivities. responding to volatility with log price drift set to fix
This will give rise to stochastic skewness as well as the stock drift at r − q.
to volatility. The model for the logarithm of the stock The joint characteristic function of the log of the
price H (t) = log(S(t)) is now as follows: stock price, the level of the terminal variance, and
the remaining integrated variance is
t
y(u)
H (t) = H (0) + (r − q)t − du
0 2 (t, H (t), y(t))
∞
x T
− cp t + sp Y (t) e − 1 − x kp (x) dx = Et exp(iaH (T ) + iby(T ) + ic y(u) du
0 t
0
(60)
− (cn t + sn Y (t)) (e − 1 − x)kn (x) dx
x
−∞
t We have a closed form for in this model given
+ x ∗ (µ − ν) + y(u) dWS (u) (55) as
0
κ − λρia ξ ξ
γ (τ ) = tanh D −+ τ (63)
λ2 λ2 2
ξ = (κ − λρia)2 + λ2 a 2 + i(a − 2c) − 2(sp up + sn un ) (64)
∞ ∞
up = (e − 1 − iax)kp (x) dx − ia
x
(ex − 1 − x)kp (x) dx (65)
0 0
0 0
un = (ex − 1 − iax)kn (x) dx − (ex − 1 − x)kn (x) dx (66)
−∞ −∞
−1 ibλ2 κ − λρia
D = tanh − (67)
ξ ξ
On setting b = c = 0, we obtain the characteristic [3] Cont, R. & Tankov, P. (2004). Financial Modelling with
function of the log of the final stock price and this Jump Processes, Series in Financial Mathematics, CRC
yields the models: SVADNE, SVAVG, and SVAC- Press.
CGMYY. [4] Duffie, D., Filipovic, D. & Schachermayer, W. (2003).
Affine processes and applications in finance, Annals of
We note that for DNE
Applied Probability 13, 984–1053.
[5] Lamberton, D. & Lapeyre, B. (1996). Introduction to
1
up = (1 − ia) (68) Stochastic Calculus Applied to Finance, Chapman and
βp − 1 Hall, New York.
[6] Madan, D. & Yor, M. (2008). Representing the CGMY
1
un = − (1 − ia) (69) and Meixner Levy processes as time changed Brownian
βn + 1 motions, Journal of Computational Finance Fall, 27–47.
[7] Pitman, J. & Yor M. (1982). A decomposition of Bessel
The corresponding calculations for VG in the Bridges, Zeitschrift für Wahrsch- einlichkeitstheorie und
CGM parameterization are Verwandte Gebiete 59, 425–457.
Further Reading
M
up = log (70)
M −1 Carr, P., Geman, H., Madan, D. & Yor, M. (2002). The fine
structure of asset returns: an empirical investigation, Journal
G
un = log (71) of Business, 75(2), 305–332.
G+1 Madan, D., Carr, P. & Chang, E. (1998). The variance gamma
process and option pricing, European Finance Review 2,
For CCGMYY, we have the following result: 79–105.
Madan, D.B. & Seneta, E. (1990). The Variance Gamma (VG)
model for share market returns, Journal of Business 63,
up = (−yp ) (M − 1)yp − M yp (72) 511–524.
un = (−yn ) (G + 1)yn − Gyn (73)
Related Articles
Overshoot
sample sizes typically in the tens of thousands or even hundreds of thousands necessary to distinguish power-type tails from exponential-type tails. For further discussion, see [20], which also discusses the implications for risk measures.

The difficulty in distinguishing tail behavior motivated Cai and Kou [5] to extend the double exponential jump-diffusion model to a hyperexponential jump-diffusion model, in which the jump size $\{Y_i := \log(V_i) : i = 1, 2, \ldots\}$ is a sequence of i.i.d. hyperexponential random variables with density

$$f_Y(x) = \sum_{i=1}^{m} p_i \eta_i e^{-\eta_i x} I_{\{x \ge 0\}} + \sum_{j=1}^{n} q_j \theta_j e^{\theta_j x} I_{\{x < 0\}} \tag{6}$$

where $p_i > 0$ and $\eta_i > 1$ for all $i = 1, \ldots, m$, $q_j > 0$ and $\theta_j > 0$ for all $j = 1, \ldots, n$, and $\sum_{i=1}^{m} p_i + \sum_{j=1}^{n} q_j = 1$. Here the condition that $\eta_i > 1$, for all $i = 1, \ldots, m$, is imposed to ensure that the stock price $S_t$ has a finite expectation.

The hyperexponential distribution is general enough to provide a link between various heavy-tail distributions, no matter which ones we prefer. In particular, any completely monotone distribution, that is, one with a density $f(x)$ such that all derivatives of $f(x)$ exist and $(-1)^{n} f^{(n)}(x) \ge 0$ for all $x$ and $n \ge 1$, can be approximated by hyperexponential distributions as closely as possible in the sense of weak convergence. Many distributions with tails heavier than those of the normal distribution are completely monotone. Here are some examples of completely monotone distributions frequently used in finance:

1. Gamma distribution. The density of Gamma $(\alpha, \beta)$ is proportional to $x^{\alpha-1} e^{-\beta x}$, where $\alpha, \beta > 0$. When $\alpha < 1$, the distribution is completely monotone.
2. Weibull distribution. The cumulative distribution function of Weibull $(c, d)$ is given by $1 - e^{-(x/d)^{c}}$, where $c, d > 0$. When $c < 2$, it has heavier tails than the normal distribution.
3. Pareto distribution. The distribution of Pareto $(a, b)$ is given by $1 - (1 + bx)^{-a}$, where $a, b > 0$.
4. Pareto mixture of exponential distribution (PME). The density of PME $(a, b)$ is given by $\int_{0}^{+\infty} f_{a,b}(y)\, y^{-1} e^{-x/y}\,dy$, where $f_{a,b}$ is the density of the Pareto $(a, b)$.

In summary, many heavy-tail distributions used in finance can be approximated arbitrarily closely by the hyperexponential distribution. Feldmann and Whitt [15] develop a numerical algorithm to approximate completely monotone distributions by the hyperexponential distribution.

Cai and Kou [5] show that the hyperexponential jump-diffusion model can lead to analytical solutions for popular path-dependent options, such as lookback, barrier, and perpetual American options. These analytical solutions are made possible mainly because we solve several high-order integro-differential equations related to first passage time problems and optimal stopping problems explicitly. Solving the high-order integro-differential equations is the main technical contribution of [5], which is achieved by discovering a connection between integro-differential equations and homogeneous ordinary differential equations in the case of the hyperexponential jump-diffusion generator.

Multivariate Version

A significant drawback of most of the Lévy processes discussed in the literature is that they are one dimensional, whereas many options traded in markets have several underlying assets. To overcome this, Huang and Kou [21] introduced a multivariate jump-diffusion model in which, under the physical measure $P$, the following stochastic differential equation is proposed to model the asset prices $S(t)$:

$$\frac{dS(t)}{S(t-)} = \mu\,dt + \sigma\,dW(t) + d\left(\sum_{i=1}^{N(t)} (V_i - 1)\right) \tag{7}$$

where $W(t)$ is an $n$-dimensional standard Brownian motion, $\sigma \in \mathbb{R}^{n \times n}$ with the covariance matrix $\Sigma = \sigma\sigma^{T}$. The rate of the Poisson process $N(t)$ is $\lambda = \lambda_c + \sum_{k=1}^{n} \lambda_k$; in other words, there are two types of jumps, common jumps for all assets with jump rate $\lambda_c$ and individual jumps with rate $\lambda_k$, $1 \le k \le n$, only for the $k$th asset.

The logarithms of the common jumps have an $n$-dimensional asymmetric Laplace distribution $AL_n(m_c, J_c)$, where $m_c = (m_{1,c}, \ldots, m_{n,c}) \in \mathbb{R}^{n}$ and $J_c \in \mathbb{R}^{n \times n}$ is positive definite. For the individual jumps of the $k$th asset, the logarithms of the jump sizes follow a one-dimensional asymmetric Laplace distribution, $AL_1(m_k, v_k^{2})$. In summary,

$$Y = \log(V) \sim \begin{cases} AL_n(m_c, J_c), & \text{with prob. } \lambda_c/\lambda \\ (\underbrace{0, \ldots, 0}_{k-1}, AL_1(m_k, v_k^{2}), \underbrace{0, \ldots, 0}_{n-k}), & \text{with prob. } \lambda_k/\lambda, \quad 1 \le k \le n \end{cases} \tag{8}$$
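For the univariate hyperexponential jump density (6), sampling reduces to choosing one mixture component and then drawing an exponential variate with the matching sign. The sketch below is a minimal illustration under that reading; the function names, the seed, and the parameter values are assumptions made for the example, not values from [5].

```python
import random

def sample_hyperexp(n, p, eta, q, theta, seed=0):
    """Draw n jump sizes Y from density (6): mixture weights p (upward,
    rates eta) and q (downward, rates theta)."""
    weights = list(p) + list(q)
    if abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("mixture weights must sum to 1")
    if any(e <= 1.0 for e in eta):
        raise ValueError("each eta_i must exceed 1 for a finite expected price")
    rng = random.Random(seed)
    comps = rng.choices(range(len(weights)), weights=weights, k=n)
    m = len(p)
    draws = []
    for c in comps:
        if c < m:
            draws.append(rng.expovariate(eta[c]))          # upward jump
        else:
            draws.append(-rng.expovariate(theta[c - m]))   # downward jump
    return draws

def hyperexp_mean(p, eta, q, theta):
    """Exact mean of Y under density (6)."""
    up = sum(pi / ei for pi, ei in zip(p, eta))
    down = sum(qj / tj for qj, tj in zip(q, theta))
    return up - down

# Illustrative parameters only (hypothetical).
p, eta = [0.3, 0.2], [20.0, 40.0]
q, theta = [0.25, 0.25], [15.0, 30.0]
ys = sample_hyperexp(100_000, p, eta, q, theta, seed=42)
sample_mean = sum(ys) / len(ys)
exact_mean = hyperexp_mean(p, eta, q, theta)
print(sample_mean, exact_mean)
```

With these inputs the exact mean is $-0.005$, and the sample mean should agree with it up to Monte Carlo error.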
The sources of randomness, N (t), W (t) are assumed The infinitesimal generator of {X1 (t), X2 (t)} is
to be independent of the jump sizes Vi . Jumps given by
at different times are assumed to be independent.
Note that in the univariate case, the above model ∂u ∂u
Lu = µ1 + µ2
degenerates to the double exponential jump-diffusion ∂x1 ∂x2
model [23] but with pη1 = qη2 .
1 ∂ 2u 1 ∂ 2u ∂ 2u
In the special case of a two-dimensional model, + σ12 2 + σ22 2 + ρσ1 σ2
the two-dimensional jump-diffusion return process 2 ∂x1 2 ∂x2 ∂x1 ∂x2
(X1 (t), X2 (t)), with Xi (t) = log(Si (t)/S(0)), is ∞ ∞
given by + λc [u(x1 +y1 , x2 +y2 ) − u(x1 , x2 )]
y2 =−∞ y1 =−∞
N(t) × fY (2) (y2 ) dy2 (12)
+ Yi(2) (9)
i=1 for all continuous twice differentiable function
u(x1 , x2 ), where f(Yc (1) ,Y (2) ) (y1 , y2 ) is the joint den-
sity of correlated common jumps AL2 (mc , Jc ),
Here all the parameters are risk-neutral parameters;
and fY (i) (yi ) is the individual jump density of
W1 (t) and W2 (t) are two independent standard Brow-
AL1 (mi , Ji ), i = 1, 2.
nian motions; and N (t) is a Poisson process with rate One difficulty in studying the generator is that
λ = λc + λ1 + λ2 . The distribution of the logarithm the joint density of the asymmetric Laplace distri-
of the jump sizes Yi is given by bution has no analytical expression. Therefore, the
calculation related to the joint density and gener-
ator becomes complicated. See [21] for change of
Yi = (Yi(1) , Yi(2) ) measures from a physical measure to a risk-neutral
measure, analytical solutions for the first passage
AL2 (mc , Jc ), with prob. λc /λ
times, and pricing formulae for barrier and exchange
∼ (AL1 (m1 , v12 ), 0) , with prob. λ1 /λ
(0, AL (m , v 2 )) , with prob. λ2 /λ
options.
1 2 2
(10) References
[3] Boyle, P., Broadie, M. & Glasserman, P. (1997). Monte Carlo methods for security pricing, Journal of Economic Dynamics and Control 21(8–9), 1267–1321.
[4] Cai, N., Chen, N. & Wan, X. (2008). Pricing Double Barrier Options Under a Flexible Jump Diffusion Model, Hong Kong University of Science and Technology. Preprint.
[5] Cai, N. & Kou, S.G. (2008). Option Pricing Under a HyperExponential Jump Diffusion Model, Columbia University. Preprint.
[6] Carr, P., Geman, H., Madan, D. & Yor, M. (2002). The fine structure of asset returns: an empirical investigation, Journal of Business 75, 305–332.
[7] Carr, P., Geman, H., Madan, D. & Yor, M. (2003). Stochastic volatility for Lévy processes, Mathematical Finance 13, 345–382.
[8] Carr, P. & Wu, L. (2004). Time-changed Lévy processes and option pricing, Journal of Financial Economics 71, 113–141.
[9] Chen, N. & Kou, S.G. (2005). Credit spreads, optimal capital structure, and implied volatility with endogenous default and jump risk, Mathematical Finance. Preprint, Columbia University. To appear.
[10] Cont, R. & Tankov, P. (2004). Financial Modelling with Jump Processes, 2nd Printing, Chapman & Hall/CRC Press, London.
[11] Cont, R. & Voltchkova, E. (2005). Finite difference methods for option pricing in jump-diffusion and exponential Lévy models, SIAM Journal on Numerical Analysis 43, 1596–1626.
[12] d’Halluin, Y., Forsyth, P.A. & Vetzal, K.R. (2003). Robust Numerical Methods for Contingent Claims under Jump-diffusion Processes, Working paper, University of Waterloo.
[13] Duffie, D., Pan, J. & Singleton, K. (2000). Transform analysis and asset pricing for affine jump-diffusions, Econometrica 68, 1343–1376.
[14] Engle, R. (1995). ARCH: Selected Readings, Oxford University Press.
[15] Feldmann, A. & Whitt, W. (1998). Fitting mixtures of exponentials to long-tail distributions to analyze network performance models, Performance Evaluation 31, 245–279.
[16] Feng, L., Kovalov, P., Linetsky, V. & Marcozzi, M. (2007). Variational methods in derivatives pricing, in Handbook of Financial Engineering, J. Birge & V. Linetsky, eds, Elsevier, Amsterdam.
[17] Feng, L. & Linetsky, V. (2008). Pricing options in jump-diffusion models: an extrapolation approach, Operations Research 52, 304–325.
[18] Glasserman, P. & Kou, S.G. (2003). The term structure of simple forward rates with jump risk, Mathematical Finance 13, 383–410.
[19] Heyde, C.C. & Kou, S.G. (2004). On the controversy over tailweight of distributions, Operations Research Letters 32, 399–408.
[20] Heyde, C.C., Kou, S.G. & Peng, X.H. (2008). What is a Good Risk Measure: Bridging the Gaps Between Robustness, Subadditivity, Prospect Theory, and Insurance Risk Measures, Columbia University. Preprint.
[21] Huang, Z. & Kou, S.G. (2006). First Passage Times and Analytical Solutions for Options on Two Assets with Jump Risk, Columbia University. Preprint.
[22] Kijima, M. (2002). Stochastic Processes with Applications to Finance, Chapman & Hall, London.
[23] Kou, S.G. (2002). A jump-diffusion model for option pricing, Management Science 48, 1086–1101.
[24] Kou, S.G., Petrella, G. & Wang, H. (2005). Pricing path-dependent options with jump risk via Laplace transforms, Kyoto Economic Review 74, 1–23.
[25] Kou, S.G. & Wang, H. (2003). First passage time of a jump diffusion process, Advances in Applied Probability 35, 504–531.
[26] Kou, S.G. & Wang, H. (2004). Option pricing under a double exponential jump-diffusion model, Management Science 50, 1178–1192.
[27] Lucas, R.E. (1978). Asset prices in an exchange economy, Econometrica 46, 1429–1445.
[28] Merton, R.C. (1976). Option pricing when underlying stock returns are discontinuous, Journal of Financial Economics 3, 125–144.
[29] Ramezani, C.A. & Zeng, Y. (2002). Maximum Likelihood Estimation of Asymmetric Jump-Diffusion Process: Application to Security Prices, Working Paper, Department of Mathematics and Statistics, University of Missouri, Kansas City.
[30] Sepp, A. (2004). Analytical pricing of double-barrier options under a double exponential jump diffusion process: applications of Laplace transform, International Journal of Theoretical and Applied Finance 7, 151–175.
[31] Singleton, K. (2006). Empirical Dynamic Asset Pricing, Princeton University Press.
[32] Stokey, N.L. & Lucas, R.E. (1989). Recursive Methods in Economic Dynamics, Harvard University Press.

Further Reading

Hull, J. (2005). Options, Futures, and Other Derivatives, Prentice Hall.

Related Articles

Barrier Options; Exponential Lévy Models; Jump Processes; Lookback Options; Partial Integro-differential Equations (PIDEs); Wiener–Hopf Decomposition.

STEVEN KOU
Exponential Lévy Models

Exponential Lévy models generalize the classical Black and Scholes model by allowing the stock prices to jump while preserving the independence and stationarity of returns. There are ample reasons for introducing jumps in financial modeling. First, asset prices exhibit jumps, and the associated risks cannot be handled within continuous-path models. Second, the well-documented phenomenon of implied volatility smile in option markets shows that the risk-neutral returns are non-Gaussian and leptokurtic, all the more so for short maturities, a clear indication of the presence of jumps. In continuous-path models, the law of returns for shorter maturities becomes closer to the Gaussian law, whereas in reality and in models with jumps, returns actually become less Gaussian as the horizon becomes shorter. Finally, jump processes correspond to genuinely incomplete markets, whereas all continuous-path models are either complete or can be made so with a small number of additional assets. This fundamental incompleteness makes it possible to carry out a rigorous analysis of the hedging errors in discontinuous models and find ways to improve the hedging performance using additional instruments such as liquid European options.

Lévy Processes

Lévy processes (see Fundamental Theorem of Asset Pricing) [1, 3, 17] are stochastic processes with stationary and independent increments. The only Lévy process with continuous trajectories is the Brownian motion with drift; all others have paths with a finite or countably infinite number of discontinuities. The simplest example of a Lévy process is the Poisson process (see Poisson Process): the increasing piecewise constant process with jumps of size 1 only and exponential waiting times between jumps. If $(\tau_i)$ is a sequence of independent exponential random variables with intensity $\lambda$ and $T_k := \sum_{i=1}^{k} \tau_i$, then the process

$$Z_t := \sum_{i} \mathbf{1}_{T_i \le t} \tag{1}$$

is called a Poisson process with intensity $\lambda$. A piecewise constant Lévy process with arbitrary jump sizes is called compound Poisson and can be written as

$$X_t = \sum_{i=1}^{Z_t} Y_i \tag{2}$$

where $Z$ is a Poisson process and $(Y_i)$ is an i.i.d. sequence of random variables. In general, the number of jumps of a Lévy process in a given interval need not be finite, and the process can be represented as a sum of a Brownian motion with drift and a limit of processes of the form in equation (2):

$$X_t = \gamma t + B_t + N_t + \lim_{\varepsilon \downarrow 0} M_t^{\varepsilon} \tag{3}$$

where $B$ is a $d$-dimensional Brownian motion, $\gamma \in \mathbb{R}^d$, $N$ is a compound Poisson process that includes the jumps of $X$ with $|\Delta X_t| > 1$, and $M_t^{\varepsilon}$ is a compensated compound Poisson process (compound Poisson minus its expectation) that includes the jumps of $X$ with $\varepsilon < |\Delta X_t| \le 1$. The law of a Lévy process is completely identified by its characteristic triplet: the positive definite matrix $A$ (unit covariance of $B$), the vector $\gamma$ (drift), and the measure $\nu$ on $\mathbb{R}^d$, called the Lévy measure, which determines the intensity of jumps of different sizes. For a set $A$ of jump sizes, $\nu(A)$ is the expected number of jumps on the time interval $[0, 1]$ whose sizes fall in $A$. The Lévy measure satisfies the integrability condition

$$\int_{\mathbb{R}^d} \left(1 \wedge \|x\|^2\right) \nu(dx) < \infty \tag{4}$$

and $\nu(\mathbb{R}^d) < \infty$ if the process has finite jump intensity. The law of $X_t$ at all times $t$ is determined by the triplet and, in particular, the Lévy–Khintchine formula gives the characteristic function $E[e^{i\langle u, X_t\rangle}] = \exp[t\psi(u)]$ with

$$\psi(u) = i\langle \gamma, u\rangle - \frac{1}{2}\langle Au, u\rangle + \int_{\mathbb{R}^d} \left(e^{i\langle u, x\rangle} - 1 - i\langle u, x\rangle \mathbf{1}_{\|x\| \le 1}\right) \nu(dx) \tag{5}$$

Conversely, any infinitely divisible law (see Infinite Divisibility) has a Lévy–Khintchine representation as above, so modeling with Lévy processes allows one to pick any infinitely divisible distribution for the law (say, at time $t = 1$) of the process.
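The construction of equations (1) and (2) translates directly into a simulation: draw exponential waiting times, accumulate them into jump times, and attach an i.i.d. jump size to each. The following is a minimal sketch in plain Python; the function name `simulate_compound_poisson` and the Gaussian jump sizes in the usage lines are illustrative assumptions, not part of the article.

```python
import random


def simulate_compound_poisson(t, lam, jump_sampler, rng):
    """Return (jump_times, jump_sizes, X_t) for a compound Poisson process on [0, t].

    lam is the intensity of the exponential waiting times tau_i;
    jump_sampler(rng) draws one jump size Y_i (hypothetical helper signature).
    """
    jump_times = []
    clock = rng.expovariate(lam)       # T_1 = tau_1
    while clock <= t:
        jump_times.append(clock)
        clock += rng.expovariate(lam)  # T_{k+1} = T_k + tau_{k+1}
    jump_sizes = [jump_sampler(rng) for _ in jump_times]
    # X_t is the sum of the Z_t jumps that occurred up to time t, as in equation (2)
    return jump_times, jump_sizes, sum(jump_sizes)


# Usage: intensity 3 on [0, 2] with (assumed) Gaussian jump sizes.
rng = random.Random(0)
times, sizes, x_t = simulate_compound_poisson(2.0, 3.0, lambda g: g.gauss(0.0, 0.1), rng)
print(len(times), "jumps on [0, 2]; X_2 =", x_t)
```

With a jump sampler that always returns 1, the same routine returns the Poisson process $Z_t$ of equation (1).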
or using the ordinary exponential $S_t = S_0 e^{X_t}$. The solution to equation (7) with initial condition $S_0 = 1$ is called the stochastic exponential of $X$. It can become negative if the process $X$ has a big negative jump: $\Delta X_s < -1$ for $s \le t$. However, if $X$ does not have jumps of size smaller than $-1$, then its stochastic exponential is positive, and the stochastic and the ordinary exponential yield the same class of positive processes. Given this result and the fact that ordinary exponentials are more tractable (in particular, we have the Lévy–Khintchine representation), they are more often used for modeling financial time series than the stochastic ones. In the rest of this article, we focus on the exponential Lévy model

$$S_t = S_0 e^{rt + X_t} \tag{8}$$

where $X$ is a one-dimensional Lévy process with characteristic triplet $(\sigma^2, \nu, \gamma)$ and $r$ denotes the interest rate.

Examples

Exponential Lévy models fall into two categories. In the first category, called jump-diffusion models, the “normal” evolution of prices is given by a diffusion process, punctuated by jumps at random intervals. Here the jumps represent rare events such as crashes and large drawdowns. Such an evolution can be represented by a Lévy process with a nonzero Gaussian component and a jump part with finitely many jumps:

$$X_t = \gamma t + \sigma W_t + \sum_{i=1}^{N_t} Y_i \tag{9}$$

In the Kou model (see Kou Model) [13], jump sizes are distributed according to an asymmetric Laplace law with a density of the form

$$\nu_0(x) = \left[p\lambda_+ e^{-\lambda_+ x} \mathbf{1}_{x>0} + (1-p)\lambda_- e^{-\lambda_- |x|} \mathbf{1}_{x<0}\right] \tag{11}$$

with $\lambda_+ > 0$, $\lambda_- > 0$ governing the decay of the tails for the distribution of positive and negative jump sizes and $p \in [0, 1]$ representing the probability of an upward jump. The probability distribution of returns in this model has semiheavy (exponential) tails.

The second category consists of models with an infinite number of jumps in every interval, which we call infinite activity or infinite intensity models. In these models, one does not need to introduce a Brownian component since the dynamics of jumps is already rich enough to generate nontrivial small time behavior [4].

There are several ways to define a parametric Lévy process with infinite jump intensity. The first approach is to obtain a Lévy process by subordinating a Brownian motion with an independent increasing Lévy process (called subordinator). Two examples of models from this class are the variance gamma process and the normal inverse Gaussian process. The variance gamma process (see Variance-gamma Model) [5, 15] is obtained by time changing a Brownian motion with a gamma subordinator and has the characteristic exponent of the form

$$\psi(u) = -\frac{1}{\kappa} \log\left(1 + \frac{u^2\sigma^2\kappa}{2} - i\theta\kappa u\right) \tag{12}$$
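As a small illustration of the time-change construction, the sketch below evaluates the variance gamma exponent of equation (12) and samples the process by running a Brownian motion with drift $\theta$ and volatility $\sigma$ on a gamma clock. The parameterization of the gamma subordinator (mean $t$, variance $\kappa t$) is an assumption chosen so that the sampled law matches equation (12); the function names are hypothetical.

```python
import cmath
import random


def vg_exponent(u, sigma, theta, kappa):
    """Characteristic exponent psi(u) of the variance gamma process, equation (12)."""
    return -cmath.log(1.0 + 0.5 * u * u * sigma**2 * kappa - 1j * theta * kappa * u) / kappa


def sample_vg(t, sigma, theta, kappa, rng):
    """Draw X_t by time changing a Brownian motion with a gamma subordinator.

    Assumption: the gamma clock G_t has shape t / kappa and scale kappa,
    i.e. mean t and variance kappa * t.
    """
    g = rng.gammavariate(t / kappa, kappa)
    return theta * g + sigma * (g ** 0.5) * rng.gauss(0.0, 1.0)


# Usage: compare an empirical characteristic function with exp(t * psi(u)) at t = 1.
rng = random.Random(7)
u = 2.0
draws = [sample_vg(1.0, 0.2, -0.1, 0.5, rng) for _ in range(5000)]
empirical = sum(cmath.exp(1j * u * x) for x in draws) / len(draws)
print(empirical, cmath.exp(vg_exponent(u, 0.2, -0.1, 0.5)))
```

As $\kappa \to 0$ the exponent tends to $i\theta u - \sigma^2 u^2/2$, the exponent of a Brownian motion with drift, which gives a quick sanity check on the sign conventions.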
The density of the Lévy measure of the variance gamma process is given by

$$\nu(x) = \frac{c}{|x|} e^{-\lambda_- |x|} \mathbf{1}_{x<0} + \frac{c}{x} e^{-\lambda_+ x} \mathbf{1}_{x>0} \tag{13}$$

where $c = 1/\kappa$, $\lambda_+ = \frac{\sqrt{\theta^2 + 2\sigma^2/\kappa}}{\sigma^2} - \frac{\theta}{\sigma^2}$ and $\lambda_- = \frac{\sqrt{\theta^2 + 2\sigma^2/\kappa}}{\sigma^2} + \frac{\theta}{\sigma^2}$.

The normal inverse Gaussian process (see Normal Inverse Gaussian Model) [2] is the result of time changing a Brownian motion with the inverse Gaussian subordinator and has the characteristic exponent

$$\psi(u) = \frac{1}{\kappa} - \frac{1}{\kappa}\sqrt{1 + u^2\sigma^2\kappa - 2i\theta u\kappa} \tag{14}$$

The second approach is to specify the Lévy measure directly. The main example of this category is the tempered stable process (see Tempered Stable Process), introduced by Koponen [12] and also known under the name of CGMY model [4]. This process has a Lévy measure with density of the form

$$\nu(x) = \frac{c_-}{|x|^{1+\alpha_-}} e^{-\lambda_- |x|} \mathbf{1}_{x<0} + \frac{c_+}{x^{1+\alpha_+}} e^{-\lambda_+ x} \mathbf{1}_{x>0} \tag{15}$$

with $\alpha_+ < 2$ and $\alpha_- < 2$.

The third approach is to specify the density of increments of the process at a given time scale, say $\Delta$, by taking an arbitrary infinitely divisible distribution. Generalized hyperbolic processes (see Generalized Hyperbolic Models) [10] can be constructed in this way. In this approach, it is easy to simulate the increments of the process at the same time scale and to estimate parameters of the distribution if data are sampled with the same period $\Delta$, but, unless this distribution belongs to some parametric class closed under convolution, we do not know the law of the increments at other time scales.

Market Incompleteness and Option Pricing

The exponential Lévy models correspond, in general, to arbitrage-free incomplete markets, meaning that options cannot be replicated exactly and, consequently, their price is not uniquely determined by the law of the underlying. This is good news: this means that the pricing model can be adjusted to take into account both the historical dynamics of the underlying and the market-quoted prices of European call and put options, a procedure known as model calibration (see Model Calibration). Once the risk-neutral measure $Q$ is calibrated, one can price an exotic option with payoff $H_T$ at time $T$ by taking the discounted expectation

$$P_0 = e^{-rT} E^Q[H_T] \tag{16}$$

Fourier Transform Methods for Option Pricing and Model Calibration

In exponential Lévy models, and in all models where the characteristic function of the log stock price $\Phi_t(u) = E[e^{iuX_t}]$ is known explicitly, Fourier inversion provides a very efficient algorithm for pricing European options. This method was introduced in [5] and later improved and generalized in [14].

Consider a financial model of the form $S_t = S_0 e^{rt + X_t}$, where $X$ is a stochastic process whose characteristic function is known explicitly. To compute the price of a call option,

$$C(k) = S_0 E[(e^{X_T} - e^k)^+] \tag{17}$$

where $k = \log(K/S_0) - rT$ is the log forward moneyness, we would like to express its Fourier transform in terms of the characteristic function of $X_T$ and then find the prices for a range of strikes by Fourier inversion. However, the Fourier transform of $C(k)$ is not well defined because this function is not integrable, so we subtract the Black–Scholes call price with nonzero volatility $\Sigma$ to obtain a function that is both integrable and smooth:

$$z_T(k) = C(k) - C_{BS}^{\Sigma}(k) \tag{18}$$

If $X$ is a stochastic process such that $E[e^{X_T}] = 1$ and $E[e^{(1+\alpha)X_T}] < \infty$ for some $\alpha > 0$, then the Fourier transform of $z_T(k)$ is given by

$$\zeta_T(v) = S_0 \frac{\Phi_T(v - i) - \Phi_T^{\Sigma}(v - i)}{iv(1 + iv)} \tag{19}$$

where $\Phi_T^{\Sigma}(v) = \exp\left(-\frac{\Sigma^2 T}{2}(v^2 + iv)\right)$ is the characteristic function of log stock price in the Black–Scholes model with volatility $\Sigma$. The exact value of $\Sigma$ is not very important, and one can take, for example, $\Sigma = 0.2$ for practical calculations.

Option prices are computed by evaluating numerically the inverse Fourier transform of $\zeta_T$:

$$z_T(k) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{-ivk} \zeta_T(v)\,dv \tag{20}$$

PIDE Methods for Exotic Options

For contracts with barriers or American-style exercise, partial integro-differential equation (PIDE) methods provide an efficient alternative to Monte Carlo simulation. In diffusion models, the price of an option with payoff $h(S_T)$ at time $T$ solves the Black–Scholes partial differential equation (PDE)
Figure 1 Implied volatility surface (axes: strike K, maturity T, implied volatility) in the Kou model with diffusion volatility σ = 0.2 and only negative jumps with intensity λ = 10 and average size 1/λ− = 0.05
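A single point of a surface like the one in Figure 1 can be computed with the Fourier recipe of equations (17)–(20). The sketch below is one possible implementation under stated assumptions, not the procedure used for the figure: the integral in equation (20) is approximated by a plain midpoint rule on a truncated grid (the values of `v_max` and `n` are arbitrary choices), the drift of the Kou process is set so that $E[e^{X_T}] = 1$, the implied volatility is found by bisection, and all function names are hypothetical.

```python
import cmath
import math


def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(S0, k, vol, T):
    """Black-Scholes call price as a function of log forward moneyness k."""
    d1 = (-k + 0.5 * vol * vol * T) / (vol * math.sqrt(T))
    d2 = d1 - vol * math.sqrt(T)
    return S0 * (norm_cdf(d1) - math.exp(k) * norm_cdf(d2))


def bs_char(u, vol, T):
    """Characteristic function of X_T in the Black-Scholes model."""
    return cmath.exp(-0.5 * vol * vol * T * (u * u + 1j * u))


def kou_char(u, T, sigma, lam, p, lam_plus, lam_minus):
    """Characteristic function of X_T for diffusion plus Kou jumps, with the
    drift chosen so that E[exp(X_T)] = 1 (needs lam_plus > 1 when p > 0)."""
    def jump_cf(w):
        return (p * lam_plus / (lam_plus - 1j * w)
                + (1.0 - p) * lam_minus / (lam_minus + 1j * w))
    kappa = jump_cf(-1j).real - 1.0  # E[e^Y] - 1
    drift = -0.5 * sigma * sigma - lam * kappa
    psi = 1j * u * drift - 0.5 * sigma * sigma * u * u + lam * (jump_cf(u) - 1.0)
    return cmath.exp(T * psi)


def call_price_fourier(char_fn, S0, K, r, T, Sigma=0.2, v_max=100.0, n=20000):
    """Call price via equations (18)-(20), midpoint rule on (0, v_max)."""
    k = math.log(K / S0) - r * T
    dv = v_max / n
    total = 0.0
    for j in range(n):
        v = (j + 0.5) * dv  # midpoints avoid the removable singularity at v = 0
        zeta = S0 * (char_fn(v - 1j) - bs_char(v - 1j, Sigma, T)) / (1j * v * (1.0 + 1j * v))
        total += (cmath.exp(-1j * v * k) * zeta).real
    # z_T is real, so the integral over the real line is twice the half-line real part
    return total * dv / math.pi + bs_call(S0, k, Sigma, T)


def implied_vol(price, S0, k, T, lo=1e-4, hi=2.0):
    """Black-Scholes implied volatility by bisection."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if bs_call(S0, k, mid, T) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Usage with the Figure 1 parameters: sigma = 0.2, lambda = 10, only negative
# jumps (p = 0) with mean size 1/lambda_minus = 0.05. S0, K, r, T are assumed values.
S0, K, r, T = 100.0, 100.0, 0.0, 0.5
kou = lambda u: kou_char(u, T, 0.2, 10.0, 0.0, 50.0, 20.0)
price = call_price_fourier(kou, S0, K, r, T)
print(price, implied_vol(price, S0, math.log(K / S0) - r * T, T))
```

Subtracting the Black–Scholes price before inverting is what makes the integrand decay fast enough for such a crude quadrature; an FFT over a grid of strikes, as in [5], would be the natural refinement.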
Different path-dependent characteristics of the payoff are translated into the boundary conditions of the equation: for example, for a down-and-out option with barrier $B$, we would impose $P(t, S) = 0$ for $S \le B$ and all $t$. This equation and its numerical solution using finite differences are discussed in detail in [9] (see Partial Integro-differential Equations (PIDEs)).

Hedging

In the Black–Scholes model, delta hedging is known to completely eliminate the risk of an option position. In the presence of jumps, delta hedging is no longer optimal: to hedge a jump of a given size, one should use the sensitivity to fluctuations of this particular size rather than the sensitivity to infinitesimal movements. Since the jump size is not known in advance, the risk associated with jumps cannot be hedged away completely. The model given by equation (8) therefore corresponds to an incomplete market except for the following two cases:

• no jumps in the stock price ($\nu \equiv 0$, the Black–Scholes case) and
• no diffusion component ($\sigma = 0$) and only one possible jump size ($\nu = \delta_{z_0}(z)$). In this case, the optimal hedging strategy is

$$\phi_t = \frac{P(S_t e^{z_0}) - P(S_t)}{S_t(e^{z_0} - 1)} \tag{24}$$

In all other cases, the hedging becomes an approximation problem: instead of replicating an option, one tries to minimize the residual hedging error. Many authors (see, e.g. [8, 11]) have studied quadratic hedging, where the optimal strategy is obtained by minimizing the expected squared hedging error. A particularly simple situation is when this error is computed under the martingale probability. The optimal hedge is then a weighted sum of the sensitivity of option price to infinitesimal stock movements, and the average sensitivity to jumps:

$$\phi^*(t, S_t) = \frac{\sigma^2 \frac{\partial P}{\partial S} + \frac{1}{S_t}\int \nu(dz)(e^z - 1)\left(P(t, S_t e^z) - P(t, S_t)\right)}{\sigma^2 + \int (e^z - 1)^2 \nu(dz)} \tag{25}$$

For jump diffusions, if jumps are small, the Taylor decomposition of this formula gives

$$\phi_t \approx \frac{\partial P}{\partial S} + \frac{S_t}{2\Sigma^2} \frac{\partial^2 P}{\partial S^2} \int \nu(dz)(e^z - 1)^3, \qquad \Sigma^2 = \sigma^2 + \int (e^z - 1)^2 \nu(dz) \tag{26}$$

Therefore, the optimal strategy can be seen as a small and typically negative (since the jumps are mostly negative) correction to delta hedging. For pure-jump processes such as variance gamma, $\partial^2 P/\partial S^2$ may not be defined and the correction may be big.

Numerical studies of the performance of hedging strategies in the presence of jumps show that

• If the jumps are small, delta hedging works well and its performance is close to optimal.
• In the presence of a strong jump component, the optimal strategy is superior to delta hedging both in terms of hedge stability and residual error.
• If jumps are strong, the residual hedging error can be further reduced by adding options to the hedging portfolio.

To eliminate the remaining hedging error, a possible solution is to use liquid options as hedging instruments. Optimal quadratic hedge ratios in the case when the hedging portfolio may contain options can be found in [8].

Additional Reading

For a more in-depth treatment, the reader may refer to the monographs [6, 18].

References

[1] Applebaum, D. (2004). Lévy Processes and Stochastic Calculus, Cambridge University Press.
[2] Barndorff-Nielsen, O. (1998). Processes of normal inverse Gaussian type, Finance and Stochastics 2, 41–68.
[3] Bertoin, J. (1996). Lévy Processes, Cambridge University Press, Cambridge.
[4] Carr, P., Geman, H., Madan, D. & Yor, M. (2002). The fine structure of asset returns: an empirical investigation, Journal of Business 75, 305–332.
[5] Carr, P. & Madan, D. (1998). Option valuation using the fast Fourier transform, Journal of Computational Finance 2, 61–73.
[6] Cont, R. & Tankov, P. (2004). Financial Modelling with Jump Processes, Chapman & Hall/CRC Press.
[7] Cont, R. & Tankov, P. (2006). Retrieving Lévy processes from option prices: regularization of an ill-posed inverse problem, SIAM Journal on Control and Optimization 45, 1–25.
[8] Cont, R., Tankov, P. & Voltchkova, E. (2007). Hedging with options in models with jumps. Proceedings of the 2005 Abel Symposium in Honor of Kiyosi Itô, F.E. Benth, G. Di Nunno, T. Lindstrom, B. Øksendal & T. Zhang, eds, Springer, pp. 197–218.
[9] Cont, R. & Voltchkova, E. (2005). A finite difference scheme for option pricing in jump-diffusion and exponential Lévy models, SIAM Journal on Numerical Analysis 43, 1596–1626.
[10] Eberlein, E. (2001). Applications of generalized hyperbolic Lévy motion to Finance, in Lévy Processes: Theory and Applications, O. Barndorff-Nielsen, T. Mikosch & S. Resnick, eds, Birkhäuser, Boston, pp. 319–336.
[11] Kallsen, J., Hubalek, F. & Krawczyk, L. (2006). Variance-optimal hedging for processes with stationary independent increments, The Annals of Applied Probability 16, 853–885.
[12] Koponen, I. (1995). Analytic approach to the problem of convergence of truncated Lévy flights towards the Gaussian stochastic process, Physical Review E 52, 1197–1199.
[13] Kou, S. (2002). A jump-diffusion model for option pricing, Management Science 48, 1086–1101.
[14] Lee, R.W. (2004). Option pricing by transform methods: extensions, unification and error control, Journal of Computational Finance 7, 51–86.
[15] Madan, D., Carr, P. & Chang, E. (1998). The variance gamma process and option pricing, European Finance Review 2, 79–105.
[16] Merton, R. (1976). Option pricing when underlying stock returns are discontinuous, Journal of Financial Economics 3, 125–144.
[17] Sato, K. (1999). Lévy Processes and Infinitely Divisible Distributions, Cambridge University Press, Cambridge.
[18] Schoutens, W. (2003). Lévy Processes in Finance: Pricing Financial Derivatives, Wiley, New York.

Related Articles

Barndorff-Nielsen and Shephard (BNS) Models; Fourier Transform; Infinite Divisibility; Jump Processes; Jump-diffusion Models; Kou Model; Partial Integro-differential Equations (PIDEs); Tempered Stable Process; Time-changed Lévy Process.

PETER TANKOV
Uncertain Volatility Model

Black–Scholes and Realized Volatility

What happens when a trader uses the Black–Scholes (BS in the sequel) formula to dynamically hedge a call option at a given constant volatility while the realized volatility is not constant?

It is not difficult to show that the answer is the following: if the realized volatility is lower than the managing volatility, the corresponding profit and loss (P&L) will be nonnegative. Indeed, a simple yet clever application of Itô’s formula shows us that the instantaneous P&L of being short a delta-hedged option reads

$$P\&L_t = \frac{1}{2}\Gamma S_t^2 \left[\sigma_t^2\,dt - \left(\frac{dS_t}{S_t}\right)^2\right] \tag{1}$$

where $\Gamma$ is the gamma of the option (the second derivative with respect to the underlying, which is positive for a call option), $\sigma_t$ is the spot volatility, that is, the volatility at which the option was sold, and $(dS_t/S_t)^2$ represents the realized variance over the period $[t, t + dt]$. Note that this holds without any assumption on the realized volatility, which will certainly turn out to be nonconstant. This result is fundamental in practice: it allows traders to work with neither exact knowledge of the behavior of the volatility nor a more complex toolbox than the plain BS formula; an upper bound of the realized volatility is enough to grant a profit (conversely, a lower bound for option buyers). This way of handling the realized volatility with the BS formula is of historical importance in the option market. El Karoui, Jeanblanc, and Shreve have formalized it masterfully in [5].

Superhedging and the Uncertain Volatility Model (UVM)

The UVM Framework

Assume that you perform the previous strategy. You are certainly not alone in the market, and you wish to have the lowest possible selling price compatible with your risk aversion. In practice, on the derivatives desk (this is a big difference with the insurance world where the risk is distributed among a large enough number of buyers), the risk aversion is total, meaning that your managing policy will aim at yielding a nonnegative P&L whatever the realized path. This approach is what is called the superhedging strategy (or superstrategy) approach to derivative pricing. Of course, the larger the set of the underlying scenarios (or paths) for which you want to have the superhedging property (see Superhedging), the higher the initial selling price. The first set that comes to mind is the set of paths associated with an unknown volatility, say between two boundary values $\sigma_{\min}$ and $\sigma_{\max}$. In other words, we look for the cheapest price at which we can sell and manage an option without any assumption on the volatility except that it lies in the $[\sigma_{\min}, \sigma_{\max}]$ range. This framework is the uncertain volatility model (UVM) introduced by Avellaneda et al. [2].

If you take a call option (or more generally a European option with convex payoff), the BS price at volatility $\sigma_{\max}$ is a good candidate. Indeed, it yields a superhedging strategy by result (1). And should the realized volatility be constantly $\sigma_{\max}$, then your P&L will be 0. It is easy to conclude from this that the BS $\sigma_{\max}$ price is the UVM selling price for an option with a convex payoff.

Now very often traders use strategies (butterflies, callspreads, etc.) which are not convex any longer. It is not at all easy to find a superstrategy in such cases. There is one exception: if you hedge at the selling time and do not rebalance your hedge before maturity, the cheapest price associated with such a strategy will be the value at the initial underlying value of the concave envelope of the payoff function. It is easy to see that this value corresponds to the total uncertainty case, or to the $[0, \infty]$ case in the UVM model. For a call option it will be the value of the underlying.

Black–Scholes–Barenblatt Equation

Here the seminal work [2] and, independently, [7] come into play. Going back to equation (1), we are looking for a model with the property that the managing volatility is $\sigma_{\max}$ when the gamma is nonnegative, and $\sigma_{\min}$ in the converse situation, so that the P&L in equation (1) is nonnegative for every realized volatility in the range. Should such a model exist, it will yield an optimal solution to the superhedging problem.

An easy way to approximate the optimal solution is to consider a tree (a trinomial tree, for instance) where the dependence upon the volatility lies in
$W^-(t, S_t) = \inf_{P \in \mathcal{P}} \mathbb{E}^P_t\left[e^{-r(T-t)} (S_T)\right]$

each of which corresponds to a volatility process with value at each time in $[\sigma_{\min}, \sigma_{\max}]$. In fact,

described above, for example. It can be extended in the same way for path-dependent options. Nevertheless, when the price pops up, the usual reaction of the trader or risk officer is that the price is too high, especially too high to explain the observed market price.

The fact that the price is high is a direct consequence of the total aversion approach in the superstrategy formulation, and also of the fact that the price corresponds to the worst-case scenario where the gamma changes signs exactly when the volatility switches regimes. This is a highly unlikely situation. To lower the price and fit in the traditional setting where one wants to fit the observed market price of liquid European calls and puts (so-called vanillas), Avellaneda, Levy, and Paras propose a constrained extension of the UVM model where the price of the complex products of the trader is handled within the UVM framework with the additional constraint of fitting the vanilla prices. By duality, this reduces to computing the UVM price for a portfolio parameterized by a Lagrangian multiplier and then minimizing the dual value function over the Lagrangian parameter. Mathematically speaking, let us consider an asset $S_t$ and a payoff $(S_T)$. $m$ European options with

$-\mathbb{E}^{P^*}\left[e^{-r(T_i - t)} F_i(S_{T_i})\right] = 0$, where $P^*$ realizes the sup above. These conditions exactly fit the model to observed market prices. The convexity of $(t, S_t, \lambda_1, \dots, \lambda_m)$ with respect to $\lambda_i$ ensures that if a minimum exists, then it is unique.

This approach is very attractive from a theoretical point of view, but it is much harder to implement. The consistency of observed vanilla prices is a crucial step that is rarely met in practice. Even if numerous robust algorithms exist to handle the dual problem, their implementation is quite tricky. In fact, this constrained formulation implies a calibration property of the model, and the design of a stable and robust calibration algorithm is one of the greatest challenges in the field of financial derivatives.

The Curse of Nonlinearity

Another issue for a practitioner is the inherent nonlinearity of the UVM formulation. Most traditional models like BS, Heston, or Lévy-based models are linear models. The fact that an option price should depend on the whole portfolio of the trader is a no-brainer for risk officers, but this nonlinearity is a challenge for the modularity and the flexibility of
for a fixed $K$ and all $T > t$, it alone specifies the spot volatility $\sigma_t$. This phenomenon is directly related to the convergence of option prices to the option payoff at expiry. Equivalently, the solution to equation (2) should not blow up too fast near expiry. It is called no-bubble restriction in [17], whereas [5] calls it the feedback condition and traces it back to [8]. It is also called the volatility specification in [19] and [6]. It reads

$$\Sigma_t(t, K) = \sigma_t - \ln\frac{S_t}{K}\,\xi_t(t, K) \tag{4}$$

For a proof under proper assumptions, see [13]. The case where we let $K = S_t$ in equation (4) says $\Sigma_t(t, S_t) = |\sigma_t|$. In other words, the current value of the spot volatility can be exactly recovered from the implied volatility smile. This very much parallels the fact that the instantaneous forward rate with infinitely small tenor is the short rate in the HJM approach to interest rates.

It is shown in [14] that the relation $\Sigma_t(t, S_t) = |\sigma_t|$ holds in great generality even when jumps in the spot and/or its volatility are present. It turns out to be a consequence of the central limit theorem for martingales.

Equation (4) has an interesting connection to the work of Berestycki et al. [3]. In a time homogeneous stochastic volatility model, [3] shows that the implied volatility in the short maturity limit can be expressed using the geodesic distance associated with the generator of the bivariate diffusion $(x_t, y_t)$, where $x_t$ is the log-moneyness and $y_t$ is the spot volatility ($|\sigma_t|$ in our notation). Keeping their notation, we denote by $d(x, y)$ the signed geodesic distance from $(x, y)$ to $(0, y)$, and obtain

$$\Sigma_t(t, K) = \frac{\ln(S_t/K)}{d(\ln(S_t/K), |\sigma_t|)} \tag{5}$$

By comparing equations (4) and (5), it becomes clear that the geodesic distance associated with the generator of the stochastic volatility model and the implied volatility’s volatility vector are strongly related.

The Case of a Single Strike

We first deal with the problem studied by [1, 4, 16, 17, 19] where only a single option is considered. The goal is to set up conditions under which the infinite system of equation (2) for a fixed $K$ and all $T > t$ together with equation (1) admits a unique solution. The best results were obtained by [4] and [19]. Without loss of generality, one can assume that equation (1) is driven by the first Wiener process only and that $\sigma_t = (\sigma_t, 0, \dots, 0)$. Assume that $\xi_t$ has the functional form,

$$\xi_t(T, K) = \frac{1}{2} \int_t^T \frac{X_t(u, K)}{X_t(T, K)}\,V_t(u, K)\,du \tag{6}$$

where $X_t(T, K) = \frac{\partial}{\partial T}\left(\Sigma_t(T, K)^2 (T - t)\right)$ is the square of the forward implied volatility and where $V$ has the form

$$V_t(T, K) = V(t, T, K, \Sigma_t(T, K), \Sigma_t(t, K), S_t) \tag{7}$$

for a deterministic function $V$ satisfying technical positivity, growth, and Lipschitz conditions [19]. Assume also that the spot volatility has the functional form $\sigma_t = \sigma(t, K, \Sigma_t(t, K), S_t)$, where the deterministic function $\sigma$ is determined by equation (4). Then, the infinite system of equation (2) for a fixed $K$ and all $T > t$ together with equation (1) admits a unique solution.

The Case of Several Strikes

The infinite system of equation (2) for all $K$ and all $T > t$ together with equation (1) is more complicated and conditions on $\xi_t$ under which it admits a unique solution are still poorly understood. One advantage of dealing with all strikes $K$ at once is that one can remove the dependence on $S$ in equation (2) by changing the parameterization of the surface from $K$ to moneyness $K/S_t$. The dynamics of the implied volatility surface in these coordinates are obtained by applying the Itô–Wentzell formula as in [5]. One of the difficulties of the multistrike case is that the solution to the infinite system in equation (2) must satisfy some shape restrictions at each time $t$. These are consequences of the well-known static arbitrage restrictions that we now recall.

Static Arbitrage Restrictions

Static arbitrage relations lead to constraints on the shape of the implied volatility surface. The fact that calendar spreads have positive values leads to

$$\frac{\partial \Sigma_t}{\partial T} + \frac{\Sigma_t}{2(T - t)} \ge 0 \tag{8}$$
Implied Volatility: Market Models 3
The fact that call values are a decreasing function of the strike leads to

$$\frac{-\Phi(-d_1)}{\phi(d_1)\sqrt{T - t}} \le K\frac{\partial \Sigma_t}{\partial K} \le \frac{\Phi(d_2)}{\phi(d_2)\sqrt{T - t}} \tag{9}$$

Finally, the fact that butterfly spreads have positive values, or that calls are convex functions of the strike, leads to

$$\left(1 - \frac{\ln(K/S_t)}{\Sigma_t} K\frac{\partial \Sigma_t}{\partial K}\right)^2 - \frac{(T - t)^2}{4}\Sigma_t^2 \left(K\frac{\partial \Sigma_t}{\partial K}\right)^2 + (T - t)\Sigma_t\left(K\frac{\partial \Sigma_t}{\partial K} + K^2\frac{\partial^2 \Sigma_t}{\partial K^2}\right) \ge 0 \tag{10}$$

where $d_2 = d_1 - \Sigma_t(T, K)\sqrt{T - t}$. These restrictions must hold at each time $t$ and at each point $(T, K)$ of the implied volatility surface.

Empirical Models

To overcome the obvious shortcomings of the sticky strike and sticky delta models, Cont and da Fonseca [9] have proposed to write down a model for the future evolution of the surface as an infinite system where each point of the surface is driven by a few common factors. These dynamics allow for easy calibration using principal component analysis [9] and can be useful for risk management and scenario simulation. It is difficult, however, to check whether such specifications satisfy arbitrage restrictions, which prevents them from being used to price exotic options.

The Spot Volatility Dynamics from the Implied Volatility Surface
[17] Schönbucher, P. (1999). A market model for stochastic implied volatility, Philosophical Transactions of the Royal Society of London. Series A: Mathematical and Physical Sciences 357(1758), 2071–2092.
[18] Schweizer, M. & Wissel, J. (2008). Arbitrage-free market models for option prices: the multi-strike case, Finance and Stochastics 12(4), 469–505.
[19] Schweizer, M. & Wissel, J. (2008). Term structures of implied volatilities: absence of arbitrage and existence results, Mathematical Finance 18, 77–114.
[20] Zhu, Y. & Avellaneda, M. (1998). A risk-neutral stochastic volatility model, International Journal of Theoretical and Applied Finance 1(2), 289–310.

Further Reading

Gatheral, J. (2006). The Volatility Surface: A Practitioner’s Guide, Wiley Finance.
Heath, D., Jarrow, R. & Morton, A. Bond pricing and the term structure of interest rates: a new methodology for contingent claims valuation, Econometrica 60(1), 77–105.

Related Articles

Black–Scholes Formula; Dividend Modeling; Exponential Lévy Models; Heath–Jarrow–Morton Approach; Implied Volatility: Long Maturity Behavior; Implied Volatility: Large Strike Asymptotics; Implied Volatility: Volvol Expansion; Implied Volatility Surface; Implied Volatility in Stochastic Volatility Models; Local Volatility Model; Moment Explosions; SABR Model.

VALDO DURRLEMAN
Rating Transition Matrices

Rating transition matrices play an important role in credit risk management both as a method for summarizing the empirical behavior of a rating system and as a tool for computing probabilities of rating migrations in, for example, a portfolio of risky loans. Analysis of statistical properties of rating transition matrices is intimately linked with Markov chains. Even if rating processes in general are not Markovian, statistical analysis of rating systems often focuses on assessing a particular deviation from Markovian behavior. Furthermore, the tractability of the Markovian setting can be preserved in some simple extensions.

Discrete-time Markov Chains

Let the rating process $\eta = (\eta_0, \eta_1, \ldots)$ be a discrete-time stochastic process taking values in a finite state space $\{1, \ldots, K\}$. If the rating process is a Markov chain, the probability of making a particular transition between time $t$ and time $t + 1$ does not depend on the history before time $t$, and one-step transition probabilities of the form

$$p_{ij}(t; t + 1) = \Pr(\eta_{t+1} = j \mid \eta_t = i) \qquad (1)$$

describe the evolution of the chain. If the one-step transition probabilities are independent of time, we call the chain time homogeneous and write

$$p_{ij} = \Pr(\eta_{t+1} = j \mid \eta_t = i) \qquad (2)$$

The one-period transition matrix of the chain is then given as

$$P = \begin{pmatrix} p_{11} & \cdots & p_{1K} \\ \vdots & \ddots & \vdots \\ p_{K1} & \cdots & p_{KK} \end{pmatrix} \qquad (3)$$

where $\sum_{j=1}^{K} p_{ij} = 1$ for all $i$.

Consider a sample of $N$ firms whose transitions between different states are observed at discrete dates $t = 0, \ldots, T$. Now introduce the following notation:

• $n_i(t)$ = number of firms in state $i$ at date $t$.
• $n_{ij}(t)$ = number of firms that went from $i$ at date $t - 1$ to $j$ at date $t$.
• $N_i(T) = \sum_{t=0}^{T-1} n_i(t)$ = number of firm exposures recorded at the beginning of transition periods.
• $N_{ij}(T) = \sum_{t=1}^{T} n_{ij}(t)$ = total number of transitions observed from $i$ to $j$ over the entire period.

If we do not assume time homogeneity, we can estimate each element of the one-step transition probability matrix using the maximum-likelihood estimator

$$\hat{p}_{ij}(t - 1; t) = \frac{n_{ij}(t)}{n_i(t - 1)} \qquad (4)$$

which is simply the number of firms that made the transition divided by the number of firms that could have made it.

Assuming time homogeneity, the maximum-likelihood estimator of the transition probability matrix is

$$\hat{p}_{ij} = \frac{N_{ij}(T)}{N_i(T)} \qquad (5)$$

for all $i, j \in \{1, \ldots, K\}$. This estimator is different from the estimator obtained by estimating a sequence of 1-year transition matrices and then computing the average of each element at a time. The latter method will weigh years with few observations as heavily as years with many observations. If the viewpoint is that there is variation in 1-year transition probabilities over time due to, for example, business cycle fluctuations, the averaging can be justified as a way of obtaining an unconditional 1-year default probability over the cycle.

Rating agencies often form a cohort of firms at a particular date, say January 1, 1980, and record transition frequencies over a fixed time horizon, say 5 years. This can be done in a straightforward way using only information on the initial rating and final rating after 5 years, assuming that all companies that are in the cohort to begin with stay in the sample. In practice, rating withdrawals occur, that is, firms or debt issues cease to have a rating. According to [4], the vast majority of withdrawals are due to debt maturing, being redeemed or called. It is traditional in the rating literature to view these events as “noninformative” censoring. One way to deal with withdrawals is to eliminate the firms from the sample and in essence use only those firms that do not have their rating withdrawn in the 5-year period. Another way is to estimate a sequence of
1-year transition probability matrices using the 1-year estimator and then estimate the 5-year matrix as the product of 1-year matrices. In this case, information of a firm whose rating is withdrawn is used for the years where it is still present in the sample. Both methods rely on the assumption of withdrawals being noninformative.

Continuous-time Markov Chains

When one has access to full rating histories and therefore knows the exact dates of transitions, the continuous-time formulation offers significant advantages in terms of tractability. Recall that the family of transition matrices for a time-homogeneous Markov chain in continuous time on a finite state space can be described by an associated generator matrix, that is, a $K \times K$ matrix $\Lambda$, whose elements satisfy

$$\lambda_{ij} \ge 0 \ \text{for } i \ne j, \qquad \lambda_{ii} = -\sum_{j \ne i} \lambda_{ij} \qquad (6)$$

Let $P(t)$ denote the $K \times K$ matrix of transition probabilities, that is, $p_{ij}(t) = P(\eta_t = j \mid \eta_0 = i)$. Then

$$P(t) = \exp(\Lambda t) \qquad (7)$$

where the right hand side is the matrix exponential of the matrix $\Lambda t$ obtained by multiplying all entries of $\Lambda$ by $t$.

In case a row of $\Lambda$ consists of all zeros, the chain is absorbed in that state when it hits it. It is convenient to work with the default states as absorbing states even if firms in practice may recover and leave the default state. If we ask what the probability is that a firm will default before time $T$, then this can be read from the transition matrix $P(T)$ when we have defined default to be an absorbing state. If the state is not absorbing, but $P$ allows the chain to jump back into the nondefault rating categories, then the transition probability matrix for time $T$ will only give the probability of being in default at time $T$, and this (smaller) probability is typically not the one we are interested in for risk management purposes.

Assume that we have observed a collection of firms between time 0 and time $T$. The maximum-likelihood estimator for the off-diagonal elements of the generator matrix is given by

$$\hat{\lambda}_{ij} = \frac{N_{ij}(T)}{\int_0^T Y_i(s)\, ds} \qquad (8)$$

where $Y_i(s)$ is the number of firms in rating class $i$ at time $s$ and $N_{ij}(T)$ is the total number of direct transitions over the period from $i$ to $j$, where $i \ne j$. The denominator counts the number of “firm-years” spent in state $i$.

Any period a firm spends in a state will be picked up through the denominator. In this sense all information is being used. Note also how (noninformative) censoring is handled automatically: when a firm leaves the sample, it simply stops contributing to the denominator. Also, this method will produce estimates of transition probabilities for “rare transitions”, even if the rare transitions have not been observed in the sample. For more on this, see [9].

Nonhomogeneous Chains

For statistical specifications and applications to pricing, the concept of a nonhomogeneous chain is useful. In complete analogy with the discrete-time case, the definition of the Markov property does not change when we drop the assumption of time homogeneity, but the description of the family of transition matrices requires that we keep track of calendar dates instead of just time lengths.

For each pair of states $i, j$ with $i \ne j$, let $A_{ij}$ be a nondecreasing right-continuous (and with left limits) function, which is zero at time zero. Let

$$A_{ii}(t) = -\sum_{j \ne i} A_{ij}(t) \qquad (9)$$

and assume that

$$\Delta A_{ii}(t) \ge -1 \qquad (10)$$

Then there exists a Markov process with state space $1, \ldots, K$ whose transition matrix is given by

$$P(s, t) = \prod_{[s,t]} (I + dA) \equiv \lim_{\max |t_i - t_{i-1}| \to 0} \prod_i \left(I + A(t_i) - A(t_{i-1})\right) \qquad (11)$$
where $s \le t_1 \le \cdots \le t_n \le t$. One can think of the probabilistic behavior as follows: given that the chain is in state $i$ at time $s$, the probability that it remains in that state at least until $t$ (assuming that $\Delta A_{ii}(u) > -1$ for $u \le t$) is given by

$$P(\eta_u = i \ \text{for } s < u \le t \mid \eta_s = i) = \exp\left(A_{ii}(t) - A_{ii}(s)\right) \qquad (12)$$

which lies between 0 and 1 because $A_{ii}$ is nonincreasing by equation (9).

We are interested in testing assumptions on the intensity measure when it can be represented through integrated intensities, that is, we assume that there exist integrable functions (or transition intensities) $\lambda_{ij}(\cdot)$ such that

$$A_{ij}(t) = \int_0^t \lambda_{ij}(s)\, ds \qquad (13)$$

for every pair of states $i, j$ with $i \ne j$.

In this case, given that the chain jumps away from $i$ at date $t$, the probability that it jumps to state $j$ is given by $\lambda_{ij}(t) / \sum_{k \ne i} \lambda_{ik}(t)$.

A homogeneous Markov chain with intensity matrix $\Lambda$ has $A_{ij}(t) = \lambda_{ij} t$ and in this special case we can write $P(s, t) = \exp(\Lambda (t - s))$.

For a method for estimating the continuous-time transition probabilities nonparametrically using the so-called Aalen–Johansen estimator, see, for example, [2]. The specification of individual transition intensities allows us to use hazard regressions on specific rating transitions. For an example of nonparametric techniques, see [5]. A Cox regression approach can be found in [9].

Empirical Observations

There is a large literature on the statistical properties of the observed rating transitions, mainly for firms rated by Moody’s and Standard & Poor’s. It has been acknowledged for a long time that the observed processes are not time homogeneous and not Markov. This is consistent with stated objectives of rating agencies of trying to avoid rating reversals and seeking to change ratings only when the change in credit quality is seen as enduring, a property sometimes referred to as “rating through the cycle”. This is in contrast to “point-in-time” rating. The distinction between the two approaches is not rigorous, but a rough indication of the difference is that a primary concern of through-the-cycle rating is the correct ranking of the firm’s default probabilities (or expected loss) over a longer time horizon, whereas a point-in-time rating is more concerned with following actual, shorter-term default probabilities, seeking to maintain a constant meaning of riskiness associated with each rating category.

The degree to which transition probabilities depend on the previous rating history, business cycle variables, and the sector or country to which the rated companies belong has been investigated, for example, in papers [1, 9, 10]. A good entry into the literature is in the special journal issue introduced by Cantor [3].

Rating agencies have a system of modifiers that effectively enlarge the state space. For example, Moody’s operates with a watchlist and long-term outlooks. Being on a watchlist signals a high likelihood of rating action in a particular direction in the near future, and outlooks signal longer term likely rating directions. Hamilton and Cantor [7] investigate the performance of ratings when the state space is enlarged with these modifiers and conclude that they go a long way in reducing dependence on rating history.

Correlated Transitions

In risk management, the risk of loan portfolios and exposures to different counterparties in derivatives contracts depends critically on the extent to which the credit ratings of different loans and counterparties are correlated.

We finish by briefly outlining two ways of incorporating dependence into rating migrations. For the first approach, see, for example, [6]; we map rating probabilities into thresholds. The idea is easily illustrated through an example. If firm 1 is currently rated $i$ and we know the (say) 1-year transition probabilities $p_{i1}, \ldots, p_{iK}$, then we can model the transition to the various categories using a standard Gaussian random variable $\epsilon_1$ and defining thresholds $a_1 > a_2 > \cdots > a_{K-1}$ such that

$$p_{iK} = P(\epsilon_1 \le a_{K-1}) = \Phi(a_{K-1}) \qquad (14)$$

$$p_{i,K-1} = P(a_{K-1} \le \epsilon_1 \le a_{K-2}) = \Phi(a_{K-2}) - \Phi(a_{K-1}) \qquad (15)$$

$$\vdots$$

$$p_{i1} = P(a_1 \le \epsilon_1) = 1 - \Phi(a_1) \qquad (16)$$
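The threshold construction for a single firm can be sketched in a few lines of Python. Summing the threshold equations from the worst category upward gives $a_k = \Phi^{-1}(p_{i,k+1} + \cdots + p_{iK})$ for $k = 1, \ldots, K-1$. The function names and the sample row of transition probabilities below are hypothetical illustrations, not taken from the article or from any rating agency.

```python
from statistics import NormalDist


def migration_thresholds(row):
    """Return [a_1, ..., a_{K-1}] for one row [p_i1, ..., p_iK].

    Category K is the worst one: it is reached when the standard Gaussian
    draw falls at or below the lowest threshold a_{K-1}.
    """
    if abs(sum(row) - 1.0) > 1e-9:
        raise ValueError("transition probabilities must sum to 1")
    if any(p <= 0.0 for p in row):
        raise ValueError("this sketch assumes strictly positive probabilities")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # a_k is the quantile of the mass in categories k+1, ..., K.
    return [std_normal.inv_cdf(sum(row[k:])) for k in range(1, len(row))]


def new_rating(eps, thresholds):
    """Map a standard Gaussian draw eps to a category in 1..K."""
    # Every threshold at or above eps pushes the rating one category down.
    return 1 + sum(1 for a in thresholds if eps <= a)


# Hypothetical 1-year transition probabilities for a firm with K = 4.
row = [0.05, 0.85, 0.08, 0.02]
thresholds = migration_thresholds(row)
print(thresholds)
print(new_rating(0.0, thresholds))
```

Because the probabilities are strictly positive, the thresholds come out strictly decreasing, as the construction requires.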
Similarly, for firm 2, we can define thresholds $b_1, \ldots, b_{K-1}$ and a standard normal random variable $\epsilon_2$ so that the transition probabilities are matched as earlier. Letting $\epsilon_1$ and $\epsilon_2$ be correlated with correlation coefficient $\rho$ induces correlation into the migration patterns of the two firms. This can be extended to a large collection of firms using a full correlation matrix obtained, for example, by looking at equity return correlations.

A second approach, which makes it possible to link up rating dynamics with continuous-time pricing models, is proposed in [8]. The idea here is to model the “conditional generator” of a Markov process as the product of a constant generator $\Lambda$ and a strictly positive affine process $\mu$, that is, conditionally on a realization of the process $\mu$, the Markov chain is time nonhomogeneous with the transition intensity $\lambda_{ij}(s) = \mu(s)\lambda_{ij}$. This framework allows for closed form computation of transition probabilities in a setting where rating migrations are correlated through dependence on state variables.

References

[1] Altman, E. & Kao, D.L. (1992). The implications of corporate bond rating drift, Financial Analysts Journal 48(3), 64–75.
[2] Andersen, P.K., Borgan, O., Gill, R. & Keiding, N. (1993). Statistical Models Based on Counting Processes, Springer, New York.
[3] Cantor, R. (2004). An introduction to recent research on credit ratings, Journal of Banking and Finance 28, 2565–2573.
[4] Cantor, R. (2008). Moody’s Guidelines for the Withdrawal of Ratings, Rating Methodology, Moody’s Investors Service, New York.
[5] Fledelius, P., Lando, D. & Nielsen, J. (2004). Non-parametric analysis of rating transition and default data, Journal of Investment Management 2(2), 71–85.
[6] Gupton, G., Finger, C. & Bhatia, M. (1997). CreditMetrics – Technical Document, Morgan Guaranty Trust Company.
[7] Hamilton, D. & Cantor, R. (2004). Rating Transitions and Defaults Conditional on Watchlists, Outlook and Rating History, Special comment, Moody’s Investors Service, New York.
[8] Lando, D. (1998). On Cox processes and credit risky securities, Review of Derivatives Research 2, 99–120.
[9] Lando, D. & Skødeberg, T. (2002). Analyzing rating transitions and rating drift with continuous observations, The Journal of Banking and Finance 26, 423–444.
[10] Nickell, P., Perraudin, W. & Varotto, S. (2000). Stability of ratings transitions, Journal of Banking and Finance 24, 203–227.

DAVID LANDO
Credit Migration Models

It is nowadays widely recognized that portfolio models are an essential tool for a proper and effective management of credit portfolios, be it from the perspective of a corporate bank, a mortgage bank, a consumer finance provider, or a fixed-income asset manager. Traditional credit management was, to a large extent, focused on the stand-alone analysis and monitoring of the credit quality of obligors or counterparties. Frequently, the credit process also included ad hoc exposure-based limit-setting policies that were devised in order to prevent excessive risk concentrations. This approach was scrutinized in the 1990s, when the financial industry started to realize that univariate models for obligor default had to be extended to a portfolio context. It was recognized that credit rating and loss recovery models, although a crucial element in the assessment of credit risk, fail to explain some of the important stylized facts of credit loss distributions if the stochastic dependence of obligor defaults is neglected. From a statistical point of view, not only the skewness and the relatively heavy upper tails of credit portfolio loss distributions, but also the historically observed variation of default rates and the clustering of bankruptcies in single sectors are clearly inconsistent with stochastic independence of defaults. From an economics point of view, it is plausible that default rates are connected to the intrinsic fluctuations of business cycles; relationships between default rates and the economic environment have indeed been established in numerous empirical studies [5]. All these insights supported the quest for tractable credit portfolio models that reflect these stylized facts.

Apart from an accurate statistical description of credit losses, a portfolio model can serve many more purposes. In contrast to a univariate approach, a credit portfolio framework allows one to quantify the diversification effects between credit instruments. This makes it possible, for example, to evaluate the impact on the total risk when securities are added to or removed from a portfolio. In the same vein, the risk numbers produced by a portfolio model help to identify possible hedges. Ultimately, the use of a portfolio model facilitates the active management of credit portfolios and the efficient allocation of capital. Less of a pure risk management matter is the use of portfolio models for risk-adjusted pricing (see Loan Valuation) or performance measurement. The total portfolio risk is commonly considered as a capital which the lender should hold in order to buffer large losses. For nontraded assets, such as loans or mortgages, the costs for holding this risk capital are typically transferred to the borrowers by means of a surcharge on interest rates. Calculating these surcharges necessitates that the total portfolio risk capital is broken down to borrower (or instrument) level risk contributions. Only in a portfolio model framework, where the dependence between obligors and the resulting diversification benefits are correctly captured, can this risk contribution be determined in an economically rational and fair fashion. We mention that risk contributions can also be applied in order to determine the ex post (historical) risk-adjusted performance of instruments or subportfolios. Credit portfolio models also play an important role in the pricing of credit derivatives or structured products, such as credit default swaps (CDSs). For the correct pricing of many of these credit instruments, it is crucial that the dependence between obligor default times is well modeled.

Overview of Credit Migration-based Models

This article gives a survey on migration-based portfolio models, that is, models that describe the joint evolution of credit ratings. The ancestor of all such models is CreditMetrics^a, which was introduced by the US investment bank J.P. Morgan. In 1997, J.P. Morgan and cosponsors from the financial industry published a comprehensive technical document [13] on CreditMetrics, in an effort to set industry standards and to create more transparency in credit risk management. This publication attracted a lot of attention and proved to stimulate research in credit risk. To this date, CreditMetrics or derivations thereof have been implemented by a large number of financial institutions. Before we turn to a detailed description of CreditMetrics, it is worth mentioning two related models. CreditPortfolioView by McKinsey & Co is credit-migration-based as well. However, in contrast to CreditMetrics, which assumes temporally constant transition matrices, it is endowed with an estimator of credit migration probabilities based on macroeconomic observables. [...] is dedicated to its discussion.
The second link concerns the longer standing KMV model.^b An outline of the KMV methodology can be extracted from an article by Kealhofer and Bohn [16]. In both CreditMetrics and KMV, the obligor correlation is generated in a similar fashion, that is, with a dependence structure following a Gaussian copula. The main differences concern the number of credit states and the source of probabilities of default (PDs). The KMV model operates on a continuum of states, namely, the so-called expected default frequencies (Moody’s KMV EDF^c), basically estimated PDs, whereas CreditMetrics is restricted to a finite number of credit rating states. For this reason, KMV is strictly speaking not a credit-migration-based model and is therefore only touched upon in this article. As remarked by McNeil et al. [19], a discretization of EDF would translate KMV to a model which, apart from parametrization, is structurally equivalent to CreditMetrics. Secondly, while for CreditMetrics rating transition matrices are the required exogenous inputs, the KMV counterparts, EDF of listed companies, are estimated through a proprietary method, which is basically an extension of the celebrated Merton model [20] for firm default. Inputs to the EDF model are historical time-series of equity prices together with company debt information, with which the unobserved asset value processes are reconstructed and a quantity called distance to default (DD) is calculated for every firm. This DD is used as a predictor of EDF; the relationship is determined by a nonlinear regression of historical default data against historical DD values. It is beyond the scope of this article to provide more details and so we refer to [2] or [17] for an account of the EDF methodology.

The CreditMetrics Model

CreditMetrics models the distribution of the credit portfolio value at a future time, from which risk measures can be derived. The changes of portfolio value are caused by credit migrations of the underlying instruments. In the following, we describe the rationale of the main building blocks of CreditMetrics.

Timescale

CreditMetrics was conceived as a discrete time model. It has a user-specified time horizon $T$ that is reached in one step from the analysis time 0; typically the time horizon is 1 year. It is assumed that the portfolio is static, that is, its composition is not altered during the time period $(0, T)$.

Risk Factors and Valuation

In case of CreditMetrics, the basic assumption is that each instrument is tied to one or several obligors. The user furnishes obligors with a rating from a rating system with a finite number of classes and an absorbing default state. The obligor ratings are the main risk drivers. We index the obligors by $i = 1, \ldots, n$ and assume a rating system with rating classes $\{1, \ldots, K\}$ that are ordered with respect to the credit quality, and a default class 0. At time 0, the obligor $i$ has the (known) initial rating $S_i^{\mathrm{init}}$, which then becomes $S_i^{\mathrm{new}}$ at time $T$. The change from $S_i^{\mathrm{init}}$ to $S_i^{\mathrm{new}}$ happens in a random fashion, according to the so-called credit migration probabilities. These probabilities are assumed to be identical for obligors in the same rating class and can therefore be represented by a so-called credit migration (or rating transition) matrix $M = (m_{jk})_{j,k \in \{0, \ldots, K\}}$. Clearly,

$$\mathbb{P}(S_i^{\mathrm{new}} = k \mid S_i^{\mathrm{init}} = j) = m_{jk} \qquad (1)$$

The credit migration matrix is an important input to CreditMetrics. In practice, one often uses rating systems supplied by agencies such as Moody’s or Standard & Poor’s. The model also allows one to work in parallel with several rating systems, depending on the obligor. If public ratings are not available, financial institutions can resort to internal ratings; see Credit Rating; Internal-ratings-based Approach; Credit Scoring.

To treat specific positions, CreditMetrics must estimate values for the position contingent on the position’s obligor being in each possible future rating state. This is equivalent to estimating the loss (or gain) on the position contingent on each possible rating transition. In the case of default, the recovery rate $\delta_i$ determines the proportion of the position’s principal that is paid back by the obligor.^d

For the nondefault states, the standard implementation of the model is to value positions based on market factors: the risk-free interest rate curve and a spread curve corresponding to the rating state. For this reason, CreditMetrics is commonly referred to as a mark-to-market model. Importantly, the mark-to-market approach incorporates a maturity effect into
the model: other things being equal, a downward credit migration will have a greater impact on a long maturity bond than a short one, given the long bond’s higher sensitivity (duration) to the spread widening that is assumed to accompany the migration. However, this approach does require relevant spread curves for positions of all possible rating states.

For positions where there is little market information, or where the mark-to-market approach is inconsistent with an institution’s accounting scheme, it is possible to utilize policy-driven rather than market-driven valuation. For example, if an institution has a reserves policy whereby loss reserves are determined by credit rating and maturity, then the change in required reserves can serve as a proxy for the loss on a position, contingent on a particular rating move. In this way, the model can still incorporate a maturity effect, even where a mark-to-market approach is not practical.

Risk Factor Dynamics and Obligor Dependence Structure

In the original formulation of CreditMetrics, foreign exchange (FX) rates and interest rate and spread curves are assumed to be deterministic since one focuses on the rating as the main risk driver. In principle, this assumption could be relaxed.

The migration matrix in the CreditMetrics model specifies the rating dynamics of a single obligor, but it does not provide any information about the joint obligor credit migrations. In order to capture the obligor dependence structure, CreditMetrics borrows ideas from the Merton structural model for firm default, which links default to the obligor asset value falling short of its liabilities. The assumption of CreditMetrics is that the obligor rating transition is caused by changes of the obligor’s asset value, or equivalently, the asset value return. The lower this random return, the lower the new rating; if the asset value return drops below a certain threshold, default occurs. Mathematically, this amounts to defining return buckets for each obligor; the thresholds bounding these buckets depend on the initial obligor rating, the transition probabilities, and the return distribution. The rating of an obligor is determined by the bucket its return falls into. Obviously, the bucket probabilities must coincide with the transition probabilities. Models of this type are also called threshold models.

More formally, if $R_i$ denotes the asset return of obligor $i$ over $(0, T]$, then the rating at time $T$ is determined by

$$S_i^{\mathrm{new}} = j \iff d_j^{(i)} < R_i \le d_{j+1}^{(i)} \qquad (2)$$

The increasing thresholds $d_j^{(i)}$ are picked such that the resulting migration probabilities coincide with the ones prescribed by the credit migration matrix. Consequently,

$$d_0^{(i)} = -\infty \quad \text{and} \quad d_{K+1}^{(i)} = +\infty$$

$$G_i(d_{j+1}^{(i)}) - G_i(d_j^{(i)}) = \mathbb{P}(S_i^{\mathrm{new}} = j \mid S_i^{\mathrm{init}}) \qquad (3)$$

where $G_i$ is the cumulative distribution function of $R_i$. We illustrate the rating transition mechanism in Figure 1, which shows the return distribution and the thresholds for an obligor with an initial rating 2 in a hypothetical rating system with four nondefault classes.

The dependence between obligor ratings stems from the dependence of the asset returns. CreditMetrics assumes that these returns follow a linear factor model with multivariate normal factors and independent Gaussian innovations. This means that

$$R_i = \alpha_i + \sum_{\ell=1}^{p} \beta_{i\ell} F_\ell + \sigma_i \epsilon_i \qquad (4)$$

where the common factors $F = (F_1, \ldots, F_p) \sim N_p(\mu, \Sigma)$ are multivariate Gaussian and the $\epsilon_i$’s are independent and identically distributed (i.i.d.) standard normal variables independent of the factors. The numbers $\beta_{i\ell}$ are also called factor exposures or loadings, $\sum_{\ell=1}^{p} \beta_{i\ell} F_\ell$ is the systematic return, $\sigma_i$ is the volatility of idiosyncratic (or specific) return $\sigma_i \epsilon_i$ of obligor $i$, and the real parameter $\alpha_i$ is referred to as alpha. The dependence between the returns, and consequently the dependence between future ratings, is caused by the exposure of the obligors to the common factors. Usually one normalizes the returns $R_i$ to unit variance; this does not alter the joint distribution of $S^{\mathrm{new}} = (S_1^{\mathrm{new}}, \ldots, S_n^{\mathrm{new}})$ and leads to adjusted thresholds that are simpler. Not explicitly distinguishing between returns and normalized returns in our notation, equation (4) then reads as

$$R_i = \psi_i(F) + \sqrt{1 - \mathrm{Var}(\psi_i(F))}\, \epsilon_i \sim N(0, 1) \qquad (5)$$
for appropriate affine linear functions $\psi_i$. The adjusted thresholds are given by

$$d_j^{(i)} = d_j(S_i^{\mathrm{init}}) \quad \text{with} \quad d_j(s) = \Phi^{-1}\left(\sum_{k=0}^{j-1} m_{sk}\right), \quad 0 < j \le K \qquad (6)$$

where $\Phi$ is the standard normal distribution function.

[Figure 1: density of the asset value return, with thresholds d1, d2, d3, d4 separating the return buckets for the ratings 0, 1, 2, 3, 4. Horizontal axis: asset value return. Vertical axis: density.]

As regards the recovery rates $\delta_i$, they are assumed [...]

Obligor Dependence Structure from Copulas

As was first recognized by Frey et al. [9] (see also [8]), copulas provide an elegant means for describing the obligor dependence structure in credit portfolio models. By virtue of Sklar’s theorem, the joint distribution of the random vector $R = (R_1, \ldots, R_n)$ can be factorized as

$$G(r_1, \ldots, r_n) = \mathbb{P}(R_1 \le r_1, \ldots, R_n \le r_n) = C(G_1(r_1), \ldots, G_n(r_n)) \qquad (7)$$

where $C$, the copula associated with $R$, is the distribution function of a random vector with standard uniform marginal distributions. From standard arguments (see e.g., Copulas: Estimation; Copulas in Econometrics; or Copulas in Insurance and references therein),

$$\mathbb{P}(S_1^{\mathrm{new}} = s_1, \ldots, S_n^{\mathrm{new}} = s_n) = \mathbb{P}\left(d_{s_1}^{(1)} < R_1 \le d_{s_1+1}^{(1)}, \ldots, d_{s_n}^{(n)} < R_n \le d_{s_n+1}^{(n)}\right) = \int_{G_1(d_{s_1}^{(1)})}^{G_1(d_{s_1+1}^{(1)})} \cdots \int_{G_n(d_{s_n}^{(n)})}^{G_n(d_{s_n+1}^{(n)})} dC(u_1, \ldots, u_n) \qquad (8)$$

Note that the integration limits $G_i(d_{s_i}^{(i)}) = \sum_{k=0}^{s_i - 1} m_{S_i^{\mathrm{init}}, k}$ do not depend on $G_i$. This implies that the joint distribution of the ratings vector $S^{\mathrm{new}}$ is determined by the initial obligor ratings, the credit migration matrix $M$ and the copula $C$; the marginal distributions $G_i$ do not matter.

This result helps to categorize threshold credit portfolio models; models using the same families of copulas can be considered as structurally equivalent. The copula associated with a Gaussian random vector
is called Gaussian copula and depends on the correlation matrix only. Since the returns R are multivariate Gaussian, the original CreditMetrics model [13] is referred to as having a Gaussian copula. Replacing the Gaussian copula family by other families gives different models. Frey et al. [9] study the CreditMetrics model with Student-t copulas and find that the tail of the loss distribution is considerably fatter as compared with the Gaussian copula with identical correlation parameters.
CreditMetrics as a Mixture Model
We now interpret the CreditMetrics model in a conditional fashion in order to better understand the meaning of the factors F. To this end, we look at the vector of default indicators D = (D_1, …, D_n), where D_i = I{S_i^new = 0}. We denote the default probabilities by p̄_i = ℙ(D_i = 1). Then conditional on the factors F, the vector D consists of independent Bernoulli random variables with success probabilities
    p_i(F) = \mathbb{P}(D_i = 1 \mid F) = \Phi\Bigl(\frac{\Phi^{-1}(\bar p_i) - \psi_i(F)}{\sqrt{1 - \mathrm{Var}(\psi_i(F))}}\Bigr) \qquad (9)
We deduce that one can simulate D by first drawing F from a multivariate Gaussian distribution and then generating independent Bernoulli random variables with success probabilities p_i(F). From this angle, F represents the state of the economy that determines the obligor default probabilities. The distribution of D, which is obtained from "mixing" the conditional distributions of D by F, is a so-called Bernoulli mixture:
    \mathbb{P}(D_1 = d_1, \ldots, D_n = d_n) = \mathbb{E}\bigl[\mathbb{P}(D_1 = d_1, \ldots, D_n = d_n \mid F)\bigr] = \mathbb{E}\Bigl[\prod_{i=1}^{n} p_i(F)^{d_i} (1 - p_i(F))^{1 - d_i}\Bigr] \qquad (10)
The conditional view offers computational advantages. Finger [6] exploits it for the determination of credit portfolio distributions. Using the Poisson approximation for sums of independent Bernoulli variables, various authors have shown that CreditMetrics is approximated by a Poisson mixture model; this allows one to make a link to CreditRisk+ (see [3, 8, 11] for details).
Asymptotic Behavior of CreditMetrics
In CreditMetrics, risk measures such as VaR (Value-at-Risk) or ES (expected shortfall) cannot be expressed in terms of simple closed formulas. For their estimation, one has to resort to Monte-Carlo (MC) simulation or other numerical methods (see Credit Portfolio Simulation). Although the topic of approximations in credit portfolio models is covered in Large Pool Approximations; Saddlepoint Approximation, we provide a brief discussion because the asymptotic results provide important qualitative insights.
Concerning approximation, research has dealt with strong limits of the relative portfolio loss and the tail behavior of the loss distribution when the number of obligors tends to infinity. While the derivation of strong limits consists of straightforward applications of the strong law of large numbers, the analysis of the tail behavior of the loss is more involved. For the tail behavior, we refer to [18] and a recent article by Glasserman et al. [10].
We next present the idea of the so-called large pool approximation. To this end, we work in a simplified framework that is adapted to loan portfolios. We assume that recovery rates, spreads, and interest rates are all equal to zero. Every obligor i has an outstanding loan of size e_i. Then the loss in the period (0, T] is given by
    L_n = \sum_{i=1}^{n} e_i D_i \qquad (11)
We define total exposure by e = \sum_{i=1}^{n} e_i and set L̄_n = L_n / e for the relative loss of the portfolio. We want to verify to which extent the specific risk caused by the obligor-specific returns ε_i is diversified away when the number of obligors grows. To this end, we decompose the relative loss into a systematic and an obligor-specific component:
    \bar L_n = \mathbb{E}(\bar L_n \mid F) + \varepsilon_n \qquad (12)
It is straightforward to show that obligor-specific variance tends to zero as n → ∞, provided the
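Herfindahl index, which measures exposure concentration, converges to zero:
    \mathrm{Var}(\varepsilon_n) \to 0 \quad\text{if}\quad H_n = \frac{1}{e^2} \sum_{i=1}^{n} e_i^2 \to 0 \qquad (13)
equal to p̄, the limit 𝔼(L̄_n | F) is a so-called probit-normal random variable:
    \bar L_n \approx p(F) = \Phi\Bigl(\frac{\Phi^{-1}(\bar p) - \sqrt{\rho}\,F}{\sqrt{1 - \rho}}\Bigr) \qquad (15)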
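The large pool approximation in equations (13) and (15) can be illustrated with a short simulation. The following is a minimal sketch, assuming a single systematic factor F ~ N(0, 1), a common default probability p̄ and a common correlation ρ as in equation (15), zero recovery, and equal exposures. The function names, the seed, and the parameter values (p̄ = 0.02, ρ = 0.2, 5000 loans) are illustrative assumptions and do not come from the article.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf is Phi, inv_cdf is Phi^{-1}


def conditional_default_probability(p_bar, rho, f):
    """p(F) from equation (15) for a single systematic factor F ~ N(0, 1)."""
    numerator = STD_NORMAL.inv_cdf(p_bar) - math.sqrt(rho) * f
    return STD_NORMAL.cdf(numerator / math.sqrt(1.0 - rho))


def herfindahl_index(exposures):
    """H_n = sum(e_i^2) / e^2 with total exposure e = sum(e_i), as in equation (13)."""
    total = sum(exposures)
    return sum(e * e for e in exposures) / (total * total)


def simulate_relative_loss(exposures, p_bar, rho, rng):
    """Draw F, then independent defaults given F; return (relative loss, p(F))."""
    f = rng.gauss(0.0, 1.0)
    p_f = conditional_default_probability(p_bar, rho, f)
    # Zero recovery: a defaulted obligor loses its full exposure.
    loss = sum(e for e in exposures if rng.random() < p_f)
    return loss / sum(exposures), p_f


rng = random.Random(0)    # fixed seed, illustrative only
exposures = [1.0] * 5000  # hypothetical equal-sized loans
draws = [simulate_relative_loss(exposures, 0.02, 0.2, rng) for _ in range(200)]
largest_gap = max(abs(loss - p_f) for loss, p_f in draws)
print(f"H_n = {herfindahl_index(exposures):.6f}, max |loss - p(F)| = {largest_gap:.4f}")
```

With a small Herfindahl index, the simulated relative loss should stay close to p(F) in every scenario, which is the qualitative content of the approximation. Concentrated exposures (a large H_n) would widen the gap.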
[13] Gupton, G.M., Finger, C.C. & Bhatia, M. (1997). CreditMetrics–Technical Document, J.P. Morgan & Co. Incorporated.
[14] Hahnenstein, L. (2004). Calibrating the CreditMetrics correlation concept – empirical evidence from Germany, Financial Markets and Portfolio Management 18, 358–381.
[15] Hamerle, A. & Rösch, D. (2006). Parameterizing credit risk models, Journal of Credit Risk 2(4), 101–122.
[16] Kealhofer, S. & Bohn, J.R. (2001). Portfolio Management of Default Risk. Available at www.moodyskmv.com
[17] Lando, D. (2004). Credit Risk Modeling, Princeton Series in Finance, Princeton University Press.
[18] Lucas, A., Klaassen, P., Spreij, P. & Straetmans, S. (2001). An analytical approach to credit risk in large corporate bond and loan portfolios, Journal of Banking and Finance 2, 1635–1664.
[19] McNeil, A.J., Frey, R. & Embrechts, P. (2005). Quantitative Risk Management, Princeton Series in Finance, Princeton University Press.
[20] Merton, R.C. (1974). On the pricing of corporate debt: the risk structure of interest rates, Journal of Finance 29, 449–470.
[21] Mina, J. & Xiao, J.Y. (2001). Return to RiskMetrics: The Evolution of a Standard, RiskMetrics Group.
[22] de Servigny, A. & Renault, O. (2003). Correlation evidence, Risk, July, 90–94.
[23] Vasicek, O.A. (1987). Probability of Loss on Loan Portfolio. Available at www.moodyskmv.com
[24] Vasicek, O.A. (2002). Loan portfolio value, Risk, December, 160–162.
Related Articles
Exposure to Default and Loss Given Default; Gaussian Copula Model; Large Pool Approximations; Structural Default Risk Models; Rating Transition Matrices.
DANIEL STRAUMANN & CHRISTOPHER C. FINGER
Structural Default Risk Models
Structural models of default risk for individual firms originate from the seminal work of Merton [25]. Default is linked to the economic fundamentals of the considered firm via the assumption that default occurs if the value of the firm's assets, modeled as a geometric Brownian motion, falls below some default threshold (the firm's liabilities) at some future point in time (the maturity of a zero-coupon bond). A significant extension of this methodology was proposed by Black and Cox [7], who continuously test for default. Hence, in their model, the time of default is a first-passage time. Further generalizations address stochastic interest rates, more general assumptions on the default threshold, the definition of the default event, and discontinuous processes as model for the firm's assets [10, 12, 20–22, 34]. On a high level, these innovations aim at making the model-induced term structure of default probabilities flexible enough to allow for a precise fit of the model to observed bond prices and credit default swap (CDS) spreads.
The growing popularity of derivatives on credit portfolios, for example, collateralized debt obligations (CDOs) and nth to default baskets, and advanced demands on risk-management solutions produced a need for portfolio models that simultaneously explain the credit quality of multiple firms. Since corporate defaults in a globalized economy are not independent, a multivariate default model has to explain univariate default probabilities and the dependence among the default events. A natural assumption for a multivariate structural-default model is to introduce dependence by assuming correlated asset values, leading to dependent default events. Zhou [33] motivates this approach by the observation: "The fortunes of individual companies are linked together via industry-specific and/or general economic conditions." The first portfolio model of this class was formulated by Vasicek [31] and can be classified as a multivariate generalization of the work by Merton [25]. This model is discussed in some detail in the section "The Model of Vasicek", as it constitutes the basis for most of today's generalizations and is used to assess the regulatory capital for loan portfolios within the Basel II framework.
A major advantage of (multivariate) structural-default models is the appealing economic interpretation of the definition of default. Additionally, comovements of the individual firm-value processes might also be interpreted as being the result of common risk factors. Moreover, modeling the evolution of the firm's values as some multivariate stochastic process naturally implies a dynamic model, which is highly desirable in risk-management and pricing applications. The downside of this class of models is the mathematical challenge of computing the portfolio-loss distribution or even bivariate default correlations. Hence, most of the proposed models rely on simplifying assumptions or can be solved only via a Monte Carlo simulation.
Merton-type Models
The Model of Vasicek
In his short memo [31], Vasicek considers a portfolio of n loans with unit nominal and maturity T. Each individual firm-value process is modeled as a geometric Brownian motion defined by the stochastic differential equation:
    dV_t^i = V_t^i (\gamma^i\,dt + \sigma^i\,dW_t^i), \quad V_0^i > 0, \quad i \in \{1, \ldots, n\} \qquad (1)
The first simplification, often referred to as homogeneous portfolio assumption, is to assume identical default probabilities for all firms. In the current setup, this assumption corresponds to identical parameters V_0 ≡ V_0^i, σ ≡ σ^i, and γ ≡ γ^i. Moreover, an identical correlation across all bivariate pairs of Brownian motions W^i and W^j is assumed. Using Itô's formula and replacing the growth rate γ by the risk-free interest rate r, we find
    V_T^i \overset{d}{=} V_0^i \exp\bigl((r - 0.5\,\sigma^2) T + \sigma \sqrt{T}\, X^i\bigr) \qquad (2)
where X^i := W_T^i / √T follows a standard normal distribution. Given some default threshold d_T ≡ d_T^i, one can immediately compute the probability of default at time T, since the distribution of the firm value at time T is known explicitly. Moreover, since default can only happen at maturity, only the distribution of V_T^i is of importance and not the dynamic model leading to it. By scaling the
original default threshold, default can alternatively be expressed in terms of the standard normally distributed variable X^i. More precisely, assuming the default probability of firm i at time T is given by p^i, the default threshold with respect to X^i is K^i = Φ^{-1}(p^i), where Φ^{-1} is the quantile function of the standard normal distribution.
To incorporate correlation among the companies, one explains X^i by a common market factor M and an idiosyncratic risk factor ε^i, that is,
    X^i \overset{d}{=} \sqrt{\rho}\, M + \sqrt{1 - \rho}\, \varepsilon^i, \quad \rho \in (0, 1) \qquad (3)
can be avoided by applying the law of large numbers; this approach is called large portfolio approximation. The key observation is that, conditional on M = m, the probability that the portfolio-loss fraction does not exceed x converges to 1{p(m) ≤ x} for n → ∞. A straightforward calculation, see, for example, [29] for details, establishes
    F^{(n)}_{p,\rho}(x) \to F^{\infty}_{p,\rho}(x) := \Phi\Bigl(\frac{\sqrt{1 - \rho}\,\Phi^{-1}(x) - \Phi^{-1}(p)}{\sqrt{\rho}}\Bigr), \quad (n \to \infty) \qquad (7)
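The limiting distribution F^∞_{p,ρ} is simple to evaluate numerically. The sketch below uses only Python's standard library; the function names are hypothetical, and the quantile function is obtained here by inverting the displayed formula algebraically rather than being quoted from the article. The parameter values p = 0.01 and ρ = 0.2 are illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # cdf is Phi, inv_cdf is Phi^{-1}


def vasicek_cdf(x, p, rho):
    """Large-portfolio loss distribution F^infinity_{p,rho}(x) for a loss fraction x."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    z = (math.sqrt(1.0 - rho) * STD_NORMAL.inv_cdf(x) - STD_NORMAL.inv_cdf(p)) / math.sqrt(rho)
    return STD_NORMAL.cdf(z)


def vasicek_quantile(q, p, rho):
    """Loss fraction x with F^infinity_{p,rho}(x) = q, found by solving the formula for x."""
    z = (STD_NORMAL.inv_cdf(p) + math.sqrt(rho) * STD_NORMAL.inv_cdf(q)) / math.sqrt(1.0 - rho)
    return STD_NORMAL.cdf(z)


p, rho = 0.01, 0.2  # illustrative default probability and asset correlation
x_999 = vasicek_quantile(0.999, p, rho)
print(f"99.9% loss-fraction quantile: {x_999:.4f}")
print(f"round trip through the cdf:   {vasicek_cdf(x_999, p, rho):.6f}")
```

The round trip through `vasicek_cdf` is a quick consistency check on the inversion. A high quantile of this distribution is the kind of quantity that a capital rule based on the model would read off.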
The risk-weighted assets (RWA) are then obtained by
    RWA = \frac{K_{IRB} \cdot EAD}{0.08} = 12.5 \cdot K_{IRB} \cdot EAD \qquad (10)
where 0.08 corresponds to the 8% minimum capital ratio. The very conservative one-year 99.9%-quantile in equation (9) is part of the Basel II accord and might be interpreted as some cushion regarding the underlying simplifications in Vasicek's model. The factor MA is the maturity adjustment and calculated via (some exceptions apply)
    MA = \frac{1 + (M - 2.5) \cdot b(PD)}{1 - 1.5 \cdot b(PD)}, \quad M = \min\Bigl\{\frac{\sum_t t \cdot CF_t}{\sum_t CF_t},\ 5\Bigr\} \qquad (11)
and b(PD) = (0.11852 − 0.05478 · log PD)^2, where CF_t denotes the expected cash-flow at time t. M accounts for the fact that loans with longer (shorter) maturity than one year require a higher (lower) capital charge. Finally, the crucial correlation parameter needs to be specified. Basel II uses a convex combination between some lower ρ^l and upper ρ^u correlation whose weights depend on the default probability of the respective loan, that is,
    \rho = \rho^l\, a(PD) + \rho^u\, (1 - a(PD)), \quad a(x) = \frac{1 - e^{-50x}}{1 - e^{-50}} \qquad (12)
For corporate credits, the correlation-adjustment factor
    SM_{ad}(S) = -0.04 \cdot \Bigl(1 - \frac{\max\{5, S\} - 5}{45}\Bigr) \times \mathbf{1}_{\{S \le 50\}} \qquad (13)
is added to ρ for borrowers with reported annual sales S ≤ 50, measured in millions of Euros. The specific form of a(x) and the adjustment factor SM_ad(S) being negative stem from the empirical observation [23] that large firms that bear more systemic risk are more correlated compared to small firms that are more likely to default due to idiosyncratic reasons. (ρ^l, ρ^u) depend on the type of loan and are specified as (0.12, 0.24) for sovereign, corporate, and bank loans; (0.12, 0.30) for highly volatile commercial real estate loans; ρ^l = ρ^u = 0.15 for residential mortgages; ρ^l = ρ^u = 0.04 for revolving retail loans such as credit cards; and finally (0.03, 0.16) for other retail exposures, where in this case the weight function a(x) is computed with exponents −35 instead of −50.
The IRB approach is sometimes criticized for the strong assumptions that are required to derive Vasicek's distribution. However, one should recognize the IRB approach as a compromise that provides a common language for regulators, banks, and investors to communicate and establishes comparable risk estimates across banks. The IRB formula is discussed in depth in [5, 30].
Generalizations Using Other Distributions
It is well known that the model [31] does not yield a satisfactory fit to market quotes of tranches of CDOs. More precisely, an implied correlation smile is present when the model is inverted for the correlation parameter tranche by tranche. Especially tail events with multiple defaults are underrepresented in a Gaussian world, making a precise fit to senior tranches of a CDO impossible. To overcome this shortcoming, a natural assumption is to give up normality in equation (3) and consider other heavier tailed distributions. For the derivation leading to F^∞_{p,ρ} in equation (7), the stability of the normal distribution under convolutions is essential in equation (3). Hence, natural choices for generalizations are other infinitely divisible distributions, which are connected to Lévy processes; see, for example, [8]. These generalizations add flexibility to the model and can additionally imply a dependence structure with tail dependence, making multiple defaults more likely. Specific models in this spirit include, for example, the NIG model of Kalemanova et al. [17], the VG model of Moosbrucker [27], and the BVG model of Baxter [6]. Following [1], we now derive a large homogeneous portfolio approximation in a general Lévy framework.
Let X = {X_t}_{t∈[0,1]} be a Lévy process (see Lévy Processes) with X_1 ∼ H_1 for some infinitely divisible distribution H_1. Assume X_1 to be standardized to zero mean and unit variance. Given a correlation
ρ ∈ (0, 1), define in analogy to equation (3) for independent copies {X^i}_{i=1}^n of X the random variables V^i by
    V^i := X_\rho + X^i_{1-\rho}, \quad i \in \{1, \ldots, n\} \qquad (14)
Here, the common market factor is represented by X_ρ, and the idiosyncratic risk of firm i is captured in X^i_{1−ρ}. Using the Lévy properties of X, each V^i is again distributed according to H_1 and Cor(V^i, V^j) = ρ for i ≠ j. In what follows, we denote by H_t^{-1} the inverse of the distribution function of X_t. The homogeneous portfolio assumption in the present setup translates to identical univariate default probabilities up to time T, abbreviated as p ≡ p^i, identical threshold levels K_T = H_1^{-1}(p) ≡ K_T^i, and unit notional of each firm. Writing L_n for the number of defaults among the n firms, the probability of exactly k defaults in the portfolio is then again obtained as
    \mathbb{P}(L_n = k) = \int_{-\infty}^{\infty} \mathbb{P}(L_n = k \mid X_\rho = m)\, dH_\rho(m), \quad k \in \{0, \ldots, n\} \qquad (15)
Similar to Vasicek's model, the conditional distribution of the number of defaults given X_ρ = m is a binomial distribution with n trials and success probability p(m) = ℙ(V^i ≤ K_T | X_ρ = m) = H_{1−ρ}(K_T − m). The large portfolio assumption, that is, letting the number of firms n tend to infinity, then gives
    F^{\infty}_{p,\rho}(x) = 1 - H_\rho\bigl(H_1^{-1}(p) - H_{1-\rho}^{-1}(x)\bigr) \qquad (16)
as distribution function of the fractional loss in an infinite granular portfolio; see [1] for a complete proof. Let us finally remark that evaluating H_t and H_t^{-1} requires numerical routines for most choices of X_1 ∼ H_1.
The Model of Willemann
The starting point for Willemann [32] is the univariate jump-diffusion model of Zhou [34]. This model assumes a discontinuous firm-value process of the form
    dV_t = V_t\bigl((\gamma - \lambda\nu)\,dt + \sigma\,dW_t + (\Pi - 1)\,dN_t\bigr), \quad V_0 > 0 \qquad (17)
where N_t is a Poisson process with intensity λ > 0 and the jumps Π are log-normally distributed with expected jump size ν = 𝔼[Π − 1]. The advantage of supporting negative jumps on a univariate level is that default events are no longer predictable, which translates to positive short-term credit spreads. Willemann [32] incorporates dependence to the individual firm-value processes by the classical decomposition of each Brownian motion into a market factor and an idiosyncratic component. Moreover, it is assumed that all firm-value processes jump together, that is, all processes are driven by the same Poisson process N_t. Consequently, this construction allows for two layers of correlation: diffusion and jump correlation; the latter being the main innovation of this setup.
The default threshold of firm i is set to K_t^i = e^{−φ^i t} K_0^i for some positive constants φ^i and K_0^i. This declining form is chosen to increase short-term spreads, but might also imply that the fit to individual CDS gets worse with increase in time. To achieve semianalytical results for the portfolio-loss distribution, default is tested on a grid. The advantage of this simplification is that only the distribution of each firm-value process at the grid points is required, instead of functionals such as inf_{s∈[0,t]} V_s. Individual default probabilities up to time t can then be computed conditional on the number of jumps up to time t, which is a Poisson-distributed random variable. Since the specific choice of jump-size distribution is compatible with the Brownian motion of the model, this leads to an infinite sum of normally distributed random variables. Moreover, all default events are independent conditional on the market factor and the number of jumps. Hence, the portfolio-loss distribution can be found by integrating out these common factors and using a recursion technique similar to [3, 16]. Willemann [32] demonstrates quite successfully how the model is simultaneously fitted (in seconds) to individual CDS spreads and the tranches of a CDO.
A Remark on Asset and Default Correlation
Modeling asset values as correlated stochastic processes introduces dependence to the resulting default times. Still, this relation is not trivial and deserves some caution, especially when it comes to estimating the model's asset-correlation parameter. We follow [24] in defining the default correlation of two firms (up to time t) as
    \rho_t^D := \mathrm{Cor}\bigl(\mathbf{1}_{\{\tau^1 \le t\}}, \mathbf{1}_{\{\tau^2 \le t\}}\bigr) = \frac{\mathbb{P}(P_t^1, P_t^2) - \mathbb{P}(P_t^1)\,\mathbb{P}(P_t^2)}{\sqrt{\mathbb{P}(P_t^1)(1 - \mathbb{P}(P_t^1))}\,\sqrt{\mathbb{P}(P_t^2)(1 - \mathbb{P}(P_t^2))}} \qquad (18)
where P_t^i := {τ^i ≤ t}, i ∈ {1, 2}. Most structural-default models share the commonality that evaluating ℙ(P_t^1, P_t^2), the probability of a joint default of both firms up to time t, is quite difficult; an exception being the case of two companies with Gaussian factors coupled as described in equation (3). This example is, therefore, used to illustrate the nonlinear relation of asset and default correlation. A joint default in this setup corresponds to a simultaneous drop of both factors X^1 and X^2 below their respective default threshold K^i = Φ^{-1}(p_t^i), i ∈ {1, 2}. Since the vector (X^1, X^2) follows a two-dimensional normal distribution with mean vector (0, 0) and the asset-correlation ρ as correlation parameter, we obtain
    \mathbb{P}(P_t^1, P_t^2) = \Phi_2(K^1, K^2; \rho) \qquad (19)
which is used to produce Figure 1. This example illustrates that small asset correlations induce only a negligible default correlation.
Being able to convert default to asset correlations (and vice versa) opens the possibility of estimating the model's asset correlation using historical default correlations (and vice versa); see, for example, [14]. This approach is relevant since asset values are not directly observable, making an estimation of asset correlations delicate. It is an ongoing debate whether indirectly observed changes in asset values, computed from changes in the respective firm's equity, or observed defaults are the better source of data for the estimation of the model's correlation parameter. In both cases, pointing out the respective limitations is much simpler than providing theoretical evidence for the methodology. Empirically estimating default correlations (based on groups of firms with similar characteristics) requires a large set of observations, since corporate defaults are rare events. This makes the approach vulnerable to structural changes such as new bankruptcy rules. On the other hand, daily equity prices are readily available for most firms. When this latter source of data is used, the difficulty lies in transforming equity to asset returns, see, for example, [9], from which the correlation might be estimated. In addition, one should be aware that equity prices might change for reasons that are not related to credit risk.
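The conversion from asset correlation to default correlation in equations (18) and (19) can be sketched in a few lines. This is a minimal illustration, assuming the Gaussian two-firm setup of equation (3). The bivariate normal distribution function Φ_2 is approximated here by Simpson's rule applied to the conditional representation Φ_2(a, b; ρ) = ∫_{−∞}^{a} φ(x) Φ((b − ρx)/√(1 − ρ²)) dx, which is a numerical choice made for this sketch and not a method taken from the article. The function names and the default probability of 1% are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # pdf is phi, cdf is Phi, inv_cdf is Phi^{-1}


def bivariate_normal_cdf(a, b, rho, steps=4000):
    """Approximate Phi_2(a, b; rho) for |rho| < 1 with Simpson's rule (steps must be even)."""
    lower = -10.0  # the normal density below -10 is negligible
    if a <= lower:
        return 0.0
    scale = math.sqrt(1.0 - rho * rho)
    width = (a - lower) / steps

    def integrand(x):
        return STD_NORMAL.pdf(x) * STD_NORMAL.cdf((b - rho * x) / scale)

    total = integrand(lower) + integrand(a)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(lower + k * width)
    return total * width / 3.0


def default_correlation(p1, p2, rho):
    """Default correlation of equation (18), using equation (19) for the joint default probability."""
    k1, k2 = STD_NORMAL.inv_cdf(p1), STD_NORMAL.inv_cdf(p2)
    joint = bivariate_normal_cdf(k1, k2, rho)
    return (joint - p1 * p2) / math.sqrt(p1 * (1.0 - p1) * p2 * (1.0 - p2))


for rho in (0.1, 0.3, 0.5, 0.7):
    print(f"asset correlation {rho:.1f} -> default correlation {default_correlation(0.01, 0.01, rho):.4f}")
```

For small default probabilities the printed default correlations should be far below the asset correlations that generate them, which is the nonlinear relation the text describes.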
[Figure 1: default correlation (vertical axis) in the two-firm Gaussian example]
First-passage Time Models
The starting point for most multivariate first passage-time models is equation (1). Compared to models in the spirit of the work by Merton [25], the time of default is now defined as suggested in [7], that is,
the fact that the joint distribution of the minimum of several firm-value processes is required, which is already a challenging problem for univariate marginals. The following section collects models where analytical results or numerical routines are available to overcome this problem.
The Model of Zhou
Zhou [33] studies a portfolio of two firms whose asset-value processes are modeled as in equation (1) with correlated Brownian motions. The default thresholds are assumed to be exponential, that is, d_t^i = e^{λ^i t} K^i for i ∈ {1, 2}. The degree of dependence of both firms is measured in terms of their default correlation up to time t, that is, as Cor(1{τ^1 ≤ t}, 1{τ^2 ≤ t}). The key observation is that results of Rebholz [28] can be applied to give an analytical representation of the default correlation in terms of an infinite sum of indefinite integrals over modified Bessel functions. Sensitivity analysis of the model parameters indicates that the model-induced default correlations for short maturities are close to zero. This observation needs to be considered when portfolio derivatives with short maturities are priced within such a framework.
The Model of Giesecke
Giesecke [13] considers a portfolio of n firms whose value processes evolve according to some vector-valued stochastic process (V^1, …, V^n), where default is again defined as in equation (20). The key innovation is to replace the vector of default thresholds by an initially unobservable random vector (d^1, …, d^n) whose dependence structure is represented by some copula. It is shown that the model-induced copula of default times is a function of the copula of default thresholds and the copula of the vector of historical lows of the firm-value processes. On a univariate level, the assumption of an unobservable random threshold overcomes the predictability of individual defaults, which is responsible for vanishing credit spreads for short maturities; see [10] for a related model. Short-term spreads [13] are positive as long as the respective firm-value process is close to its historical low. The consequence of this construction on a portfolio level is also remarkable. Observing a corporate default τ^i reveals the respective default threshold d^i to all investors. This piece of information allows investors to update their knowledge on all other default thresholds, leading to contagious jumps in credit spreads of the remaining firms. Giesecke [13] also presents an explicit example of two firms with independent value processes modeled as geometric Brownian motions and default thresholds coupled via a Clayton copula. While this simplified example illustrates the desired contagion effect of the model, it also highlights the challenge of finding analytic results in a realistic framework.
Models Relying on Monte Carlo Simulations
This section briefly presents two first-passage time models that rely on Monte Carlo simulations for the pricing of CDOs.
The n firm-value processes [15] are defined as in equation (1); the model can therefore be considered as a generalization of Zhou's [33] bivariate model to larger portfolios. The default thresholds are rewritten in terms of the driving Brownian motions. Asset correlation is introduced by n_F risk factors, that is, the Brownian motion of firm i is replaced by
    dW_t^i := \sum_{j=1}^{n_F} \alpha_{i,j}\, dF_t^j + \Bigl(1 - \sum_{j=1}^{n_F} \alpha_{i,j}^2\Bigr)^{1/2} dU_t^i, \quad i \in \{1, \ldots, n\} \qquad (21)
where α_{i,j} is the sensitivity of firm i to changes of the risk factor F^j and U^i is the idiosyncratic risk of this firm. All processes F^j and U^i are independent Brownian motions. Hull et al. [15] also consider extensions to stochastic correlations, stochastic recovery rates, and stochastic volatilities and compare these in terms of their fitting capability to CDO tranches. An interesting conclusion that also applies to similar first-passage time models is drawn when the model is compared to a copula model. It is argued that the default environment in a copula model is static for the whole life of the model, while the dynamic nature of equation (21) allows for bad default environments in one year, followed by good environments later. Hence, the use of one or more common risk factors implies a sound economic model for cyclical correlation.
Kiesel and Scherer [18] present another multivariate extension of the work by Zhou [34]. They model the firm-value process of the company i as
the exponential of a jump-diffusion process with two-sided exponentially distributed jumps Y_{ij}, that is,
    V_t^i = V_0^i \exp(X_t^i), \quad X_t^i = \gamma^i t + \sigma^i W_t^i + \sum_{j=1}^{N_t(b^i)} Y_{ij}, \quad V_0^i > 0 \qquad (22)
where the Brownian motions of different firms are again correlated via a factor decomposition. The novelty in their approach is the use of a Poisson process N_t as ticker for jumps in the market that is thinned-out with probability (1 − b^i) to induce jumps in V^i. Consequently, some but not necessarily all firms jump (and possibly default) together. As a result of common jumps, the model allows for default clusters that extend the cyclical correlation induced by common continuous factors. For this choice of jump distribution, the marginals of the model can be calibrated to CDS quotes using the Laplace transform of first-passage times of X^i, which is derived in [19]. The multivariate model is solved via a Brownian-bridge Monte Carlo simulation in the spirit of the work by Metwally and Atiya [26].
the portfolio-loss process; a desired property in risk-management solutions and for the pricing of (exotic) credit-portfolio derivatives.
The downside of multivariate structural-default models lies in the difficulty of translating the model to analytical formulas for default correlations and the portfolio-loss distribution. This becomes especially apparent when the simplifying assumption in [31] and its generalizations are reconsidered; the bottom-up nature of structural-default models is entirely given up in order to compute the portfolio-loss distribution in closed form. The price to pay for a more realistic framework typically is a Monte Carlo simulation. However, if such a simulation is efficiently implemented, a realistic dynamic model for a portfolio of credit-risky assets is available.
Acknowledgments
Research support by Daniela Neykova, Technische Universität München, is gratefully acknowledged.
References
[9] Crosbie, P. & Bohn, J. Modeling Default Risk, KMV Corporation, retrieved from http://www.moodyskmv.com/research/files/wp/ModelingDefaultRisk.pdf.
[10] Duffie, D. & Lando, D. (2001). The term structure of credit spreads with incomplete accounting information, Econometrica 69, 633–664.
[11] Frye, J. (2000). Depressing recoveries, Risk 13(11), 106–111.
[12] Geske, R. (1977). The valuation of corporate liabilities as compound options, Journal of Financial and Quantitative Analysis 12(4), 541–552.
[13] Giesecke, K. (2004). Correlated default with incomplete information, Journal of Banking and Finance 28(7), 1521–1545.
[14] Gordy, M. (2000). A comparative anatomy of credit risk models, Journal of Banking and Finance 24(1), 119–149.
[15] Hull, J., Predescu, M. & White, A. (2005). The Valuation of Correlation-dependent Credit Derivatives using a Structural Model. Working paper, retrieved from http://www.rotman.utoronto.ca/hull/DownloadablePublications/StructuralModel.pdf
[16] Hull, J. & White, A. (2004). Valuation of a CDO and an n-th to default CDS without a Monte Carlo simulation, Journal of Derivatives 12(2), 8–23.
[17] Kalemanova, A., Schmid, B. & Werner, R. (2007). The normal inverse Gaussian distribution for synthetic CDO pricing, Journal of Derivatives 14(3), 80–93.
[18] Kiesel, R. & Scherer, M. (2007). Dynamic Credit Portfolio Modelling in Structural Models with Jumps. Working paper, retrieved from http://www.uni-ulm.de/fileadmin/website uni ulm / mawi.inst.050 / people /kiesel/publications/ Kiesel Scherer Dec07.pdf.
[19] Kou, S. & Wang, H. (2003). First passage times of a jump diffusion process, Advances in Applied Probability 35, 504–531.
[20] Leland, H. (1994). Corporate debt value, bond covenants, and optimal capital structure, Journal of Finance 49(4), 1213–1252.
[21] Leland, H. & Toft, K. (1996). Optimal capital structure, endogenous bankruptcy, and the term structure of credit spreads, Journal of Finance 51(3), 987–1019.
[22] Longstaff, F. & Schwartz, E. (1995). A simple approach to valuing risky fixed and floating rate debt, Journal of Finance 50(3), 789–819.
[23] Lopez, J. (2004). The empirical relationship between average asset correlation, firm probability of default, and asset size, Journal of Financial Intermediation 13(2), 265–283.
[24] Lucas, D. (1995). Default correlation and credit analysis, Journal of Fixed Income 4(4), 76–87.
[25] Merton, R. (1974). On the pricing of corporate debt: the risk structure of interest rates, Journal of Finance 29, 449–470. Reprinted as Chapter 12 in Merton, R. (1990) Continuous-time Finance, Blackwell.
[26] Metwally, S. & Atiya, A. (2002). Using Brownian bridge for fast simulation of jump-diffusion processes and barrier options, The Journal of Derivatives 10(1), 43–54.
[27] Moosbrucker, T. (2006). Pricing CDOs with Correlated Variance Gamma Distributions. Research report, Department of Banking, University of Cologne.
[28] Rebholz, J. (1994). Planar Diffusions with Applications to Mathematical Finance, PhD thesis, University of California, Berkeley.
[29] Schönbucher, P. (2003). Credit Derivatives Pricing Models: Models, Pricing, Implementation, Wiley Finance.
[30] Thomas, H. & Wang, Z. (2005). Interpreting the internal ratings-based capital requirements in Basel II, Journal of Banking Regulation 6, 274–289.
[31] Vasicek, O. (1987). Probability of Loss on Loan Portfolio, KMV Corporation, retrieved from http://www.moodyskmv.com/research/whitepaper/Probability of Loss on Loan Portfolio.pdf
[32] Willemann, S. (2007). Fitting the CDO correlation skew: a tractable structural jump-diffusion model, The Journal of Credit Risk 3(1), 63–90.
[33] Zhou, C. (2001). An analysis of default correlations and multiple defaults, Review of Financial Studies 14, 555–576.
[34] Zhou, C. (2001). The term structure of credit spreads with jump risk, Journal of Banking and Finance 25, 2015–2040.
Further Reading
Lipton, A. (2002). Assets with jumps, Risk 15(9), 149–153.
Lipton, A. & Sepp, A. (2009). Credit value adjustment for credit default swaps via the structural default model, The Journal of Credit Risk 5(2), 125.
Related Articles
Default Barrier Models; Modeling Correlation of Structured Instruments in a Portfolio Setting; Gaussian Copula Model; Internal-ratings-based Approach; Reduced Form Credit Risk Models.
RÜDIGER KIESEL & MATTHIAS A. SCHERER
CreditRisk+

CreditRisk+ is a portfolio credit risk model developed by the bank Credit Suisse, which published the methodology in 1997 [2].

A portfolio credit risk model is a means of estimating the statistical distribution of the aggregate loss from defaults in a portfolio of loans or other credit-risky instruments over a period of time. More generally, changes in credit quality other than default can be considered, but CreditRisk+ in its original form is focused only on default. The most widely used portfolio credit risk models are undoubtedly the so-called structural models, including models based on the Gaussian copula framework (see Structural Default Risk Models). CreditRisk+ performs its calculation in a different way to these models, but it is recognized that CreditRisk+ and Gaussian copula models have a similar conceptual basis. A detailed discussion can be found in [4, 7].

Financial institutions use portfolio credit risk models to estimate aggregate credit losses at high percentiles, corresponding to very bad outcomes (often known as the tail of the loss distribution). These estimates are then used in setting and allocating economic capital (see Economic Capital) and determining portfolio performance measures such as risk-adjusted return on capital (see Risk-adjusted Return on Capital (RAROC)).

Portfolio credit risk models have two elements. The first is a set of statistical assumptions about the effect of economic influences on the likelihood of individual borrowers defaulting, and about how much the individual losses might be when they default. The second element is an algorithm for calculating the resulting loss distribution under these assumptions for a specific portfolio. Unlike most portfolio credit risk models, CreditRisk+ calculates the loss distribution using a numerical technique that avoids Monte Carlo simulation. The other distinction of CreditRisk+ is that it was presented as a methodology rather than as a software implementation. Practitioners and institutions have developed their own implementations, leading to a number of significant variants and improvements of the original model. The model has also been used by regulators and central banks: CreditRisk+ played a role in the early formulation of the Basel accord (see [5]) and has been used by central banks to analyze country-wide panel data on defaults (an example is reported in [1]).

For these reasons, since its introduction in 1997, CreditRisk+ has consistently attracted the interest of practitioners, financial regulators, and academics, who have generated a significant body of literature on the model. An account of CreditRisk+ and its subsequent developments can be found in [6].

The CreditRisk+ Algorithm

The function of CreditRisk+ is to transform data about the creditworthiness of individual borrowers into a portfolio-level assessment of risk. In most portfolio credit risk models, this step requires Monte Carlo simulation (see Credit Portfolio Simulation). However, CreditRisk+ avoids simulation by using an efficient numerical algorithm, as outlined below. The approach confers advantages in terms of speed of computation and enhanced understanding of the drivers of the resulting distribution: many useful statistics, such as the moments of the loss distribution, are given by simple formulae in CreditRisk+, whose relationship to the risk management features of the situation is transparent. On the other hand, owing to its analytic nature, CreditRisk+ is a relatively inflexible portfolio model, and as such has tended to find application where transparency and ease of calculation are more important than flexible parameterization.

To understand the CreditRisk+ calculation, we consider a portfolio containing N loans, where we wish to assess the loss distribution over a one-year time horizon. (The model can be applied to bonds or derivatives counterparties, but the main features of the calculation are the same.) To run CreditRisk+, a number R of economic factors must be chosen. This can be the number of distinct economic influences on the portfolio that are considered to exist (say, the number of geographical regions or industries significantly represented in the portfolio), but it is often assumed in practice that R = 1, in which case the model is said to be in “one-factor” mode. CreditRisk+ with one factor gives an assessment of risk that ignores subtle industry or geographic diversification, but can capture the correct overall amount of economic and concentration risk present
in the portfolio, and is sufficient for many purposes. In any event, typically R is much less than N, the number of loans, reflecting the fact that all the significant influences on the portfolio affect many borrowers at once.

For each loan i, where 1 ≤ i ≤ N, the model needs the following input data:

1. Long-term average probability of default $p_i$: This is the probability that the obligor will default over the year, typically estimated from the credit rating (see Credit Rating).
2. Loss on default $E_i$: This is typically estimated as the loan notional less an estimated recovery amount (see Recovery Rate).
3. Economic factor loadings: These are given by $\theta_{i,j}$, for 1 ≤ j ≤ R, where R is the number of factors introduced above. The $\theta_{i,j}$ must be nonnegative numbers satisfying $\sum_{j=1}^{R}\theta_{i,j} = 1$ for each i.

The factor loadings $\theta_{i,j}$ require some further explanation: they represent the sensitivity of the obligor i to each of the R economic factors assumed to influence the portfolio. In general, determining suitable values for $\theta_{i,j}$ is one of the main difficulties of using CreditRisk+, and analogous difficulties exist for all portfolio models. Note, however, that if R is chosen to be 1 (“one-factor mode” as described above), then we must have $\theta_{i,1} = 1$ for all i, and there is no information requirement. This reflects the fact that one-factor mode ignores the subtle industry or geographic diversification effects in the portfolio, but is, nevertheless, a popular mode of use of the model due to the simpler parameter requirements.

To understand how CreditRisk+ processes this data, let $X_1, \ldots, X_R$ be random variables, each with mean $E(X_j) = 1$. The variable $X_j$ represents the economic influence of sector j over the year. In common with most portfolio credit risk models, CreditRisk+ does not incorporate economic prediction. Instead, uncertainty about the economy is reflected by representing economic factors as random variables in this way. CreditRisk+ then assumes that the realized probability of default $P_i$ for loan i is given by the following critical relationship:

$$P_i = p_i\,(\theta_{i,1}X_1 + \cdots + \theta_{i,R}X_R) \tag{1}$$

The realized default probability $P_i$ depends not only on the long-term average probability of default $p_i$ but also on the random variables $X_1, \ldots, X_R$. Note that because $E(X_j) = 1$ and $\sum_{j=1}^{R}\theta_{i,j} = 1$, we have

$$E(P_i) = p_i\,(\theta_{i,1} + \cdots + \theta_{i,R}) = p_i \tag{2}$$

so that the long-term average default probability (or equivalently, the average of the default probabilities across all states of the economy) is $p_i$ as required. In a particular year, however, $P_i$ will differ from its long-term average. If the borrower i is sensitive to a factor j (i.e., $\theta_{i,j} > 0$), and if a large value is drawn for $X_j$, then this represents a poor economy with a negative impact on the obligor i, and we will tend to have $P_i > p_i$, meaning that the obligor i is more likely to default in this particular year than on average. Because the same will be true of other obligors $i'$ with $\theta_{i',j} > 0$, the economic influence represented by factor j can affect a large number of obligors at once. This mechanism incorporates systematic risk, which affects many obligors at once and so cannot be diversified away. The same mechanism in various forms is present in all commonly used portfolio credit risk models.

Two technical assumptions are now made in CreditRisk+:

1. The random variables $X_j$, 1 ≤ j ≤ R, are independent, and each has a Gamma distribution with mean 1 and variance $\beta_j$.
2. For each loan i, 1 ≤ i ≤ N, the loss given default $E_i$ is a positive integer.

The first assumption is made to facilitate the CreditRisk+ numerical algorithm. In other credit risk models, notably the Gaussian copula models, the variables that play the role of the $X_j$ are assumed to be normally distributed. Although these assumptions seem very different, in fact for many applications they have little effect on the final risk estimate. Assumption (1) can, however, lead to difficulties in parameterizing CreditRisk+.

The second assumption, known as bucketing of exposures, also requires some further explanation. Without this assumption, the $E_i$ could be any positive amounts, all expressed in units of a common reference currency. An insight of CreditRisk+ is that the precise values of $E_i$ are not critical: $E_i$ can be rounded to whole numbers without significantly affecting the aggregate risk assessment (a simple way of estimating the resulting error is given in Section A4.2 of [2]). The amount of rounding depends on
how the $E_i$ are expressed before rounding; for example, it is common to express $E_i$ in millions, so that a loss on default of say 24.35, meaning 24.35 million units of the reference currency, would be rounded to 25.

After bucketing of exposures, the aggregate loss from the portfolio must itself be a whole number (in the example above, this would mean a whole number of millions of the reference currency). The loss distribution can therefore be summarized in terms of its probability generating function

$$G(z) = \sum_{n=0}^{\infty} A_n z^n \tag{3}$$

where $A_n$ denotes the probability that the aggregate loss is exactly n. To obtain the loss distribution, we need the numerical value of $A_n$, for n = 0 (corresponding to no loss), 1, 2, . . . up to a desired point. For CreditRisk+, with the inputs described above, it can be shown that the probability generating function (3) is given explicitly as

$$G(z) = \prod_{j=1}^{R}\left(1 - \beta_j \sum_{i=1}^{N} \theta_{i,j}\, p_i\, (z^{E_i} - 1)\right)^{-1/\beta_j} \tag{4}$$

For the derivation of this equation, see, for example, [2], Section A9 or [6], Chapter 2. The derivation involves a further approximation, known as the Poisson approximation, which can roughly be described as assuming that the default probabilities $p_i$ are small enough that their squares can be neglected. CreditRisk+ then uses an approach related to the so-called Panjer algorithm, which was developed originally for use in actuarial aggregate claim estimation. This relies on the fact that there exist polynomials P(z) and Q(z), whose coefficients can be computed explicitly from the input data via equation (4), and which satisfy

$$P(z)\,\frac{dG(z)}{dz} = Q(z)\,G(z) \tag{5}$$

Equating the coefficients of $z^n$ on each side of this identity, for each n ≥ 0, leads finally to a simple recurrence relationship between the $A_n$ in equation (3). The recurrence relationship expresses the value of $A_n$ for each n, in terms of the earlier coefficients $A_0, \ldots, A_{n-1}$. The calculation is started by calculating $A_0$, which is the probability of no loss, by setting z = 0 in equation (4) to give the explicit formula

$$A_0 = G(0) = \prod_{j=1}^{R}\left(1 + \beta_j \sum_{i=1}^{N} \theta_{i,j}\, p_i\right)^{-1/\beta_j} \tag{6}$$

and the recurrence relation then allows efficient calculation of $A_n$ up to any desired level. For a complete treatment of this algorithm, see, for example, [6], Chapter 2.

Later Developments of CreditRisk+

Many enhancements to CreditRisk+ have been proposed by various authors (see the introduction to [6] for a discussion of some of the drawbacks of the original model). Developments have fallen into the following broad themes:

1. alternative calculation algorithms, such as saddlepoint approximation, Fourier inversion, and the method of Giese [3];
2. improved capital allocation methods, notably the method of Haaf and Tasche;
3. inclusion of additional risks, such as migration risk and uncertain recovery rates;
4. improved methods for determining inputs, particularly the economic factor loadings $\theta_{i,j}$;
5. application to novel situations such as default probability estimation [8]; and
6. asymptotic formulae, notably the application of the “granularity adjustment” [5].

The reader is also referred to [6] for details on many of these developments.

References

[1] Balzarotti, V., Castro, C. & Powell, A. (2004). Reforming Capital Requirements in Emerging Countries: Calibrating Basel II using Historical Argentine Credit Bureau Data and CreditRisk+. Working Paper, Universidad Torcuato Di Tella, Centro de Investigación en Finanzas.
[2] Credit Suisse Financial Products (1997). CreditRisk+, a Credit Risk Management Framework, Credit Suisse Financial Products, London.
[3] Giese, G. (2003). Enhancing CreditRisk+, Risk 16(4), 73–77.
[4] Gordy, M. (2000). A comparative anatomy of Credit Risk Models, Journal of Banking and Finance 24, 119–149.
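The mixing mechanism of equations (1) and (2) in the CreditRisk+ article can be illustrated with a short simulation. The following is only a sketch under stated assumptions: the function names `draw_factors` and `realized_default_prob` and all parameter values are hypothetical, and the Gamma factors are parameterized with shape 1/β_j and scale β_j so that each has mean 1 and variance β_j, as required by the first technical assumption.

```python
import random

def draw_factors(beta, rng):
    """Draw independent Gamma factors X_j with mean 1 and variance beta[j].

    A Gamma variable with shape k and scale s has mean k*s and variance
    k*s**2, so shape = 1/beta_j and scale = beta_j give mean 1, variance beta_j.
    """
    return [rng.gammavariate(1.0 / b, b) for b in beta]

def realized_default_prob(p_i, theta_i, x):
    """Equation (1): P_i = p_i * (theta_i1 * X_1 + ... + theta_iR * X_R)."""
    return p_i * sum(t * xj for t, xj in zip(theta_i, x))

# Hypothetical obligor: 2% long-term default probability, two factors.
rng = random.Random(12345)
p_i, theta_i, beta = 0.02, [0.3, 0.7], [0.5, 1.0]
draws = [realized_default_prob(p_i, theta_i, draw_factors(beta, rng))
         for _ in range(50000)]
print("average realized default probability:", sum(draws) / len(draws))
```

Averaged over many simulated years, the realized probability should be close to p_i, in line with equation (2). Note that nothing in equation (1) prevents an individual draw from exceeding 1 when p_i is large; the model is intended for small default probabilities.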
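The recurrence described around equations (3) to (6) of the CreditRisk+ article can be sketched for the one-factor case (R = 1, all loadings equal to 1). This is an illustration rather than the published algorithm: the function name and inputs are hypothetical, and the recurrence below is derived here from equation (4) by taking P(z) = 1 + βμ − β a(z) and Q(z) = a'(z), where a(z) = Σ_i p_i z^{E_i} and μ = Σ_i p_i. Equating coefficients in equation (5) then gives (1 + βμ) m A_m = Σ_{k=1}^{m} a_k (k + β(m − k)) A_{m−k}.

```python
def creditrisk_plus_one_factor(p, exposures, beta, n_max):
    """Loss probabilities A_0..A_n_max for one-factor CreditRisk+.

    p         : long-term default probabilities p_i
    exposures : bucketed losses E_i (positive integers)
    beta      : variance of the single Gamma factor (mean 1)
    """
    mu = sum(p)
    # a[k] = sum of p_i over obligors whose bucketed exposure equals k
    a = [0.0] * (n_max + 1)
    for p_i, e_i in zip(p, exposures):
        if e_i < 1:
            raise ValueError("exposures must be positive integers")
        if e_i <= n_max:
            a[e_i] += p_i
    A = [0.0] * (n_max + 1)
    # Equation (6) with R = 1: probability of no loss
    A[0] = (1.0 + beta * mu) ** (-1.0 / beta)
    for m in range(1, n_max + 1):
        s = sum(a[k] * (k + beta * (m - k)) * A[m - k] for k in range(1, m + 1))
        A[m] = s / ((1.0 + beta * mu) * m)
    return A

# Hypothetical three-loan portfolio
A = creditrisk_plus_one_factor([0.01, 0.02, 0.03], [1, 2, 3], beta=0.5, n_max=200)
mean_loss = sum(n * a_n for n, a_n in enumerate(A))
print("P(no loss) =", A[0], " mean loss =", mean_loss)
```

A useful sanity check is that the computed mean equals Σ_i p_i E_i and the variance equals Σ_i p_i E_i² + β (Σ_i p_i E_i)², which are the kind of simple moment formulae the article refers to. All terms in the recurrence are nonnegative, so it does not suffer from cancellation in this one-factor form.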
generalized binomial distribution

$$\mathbb{P}(L = mN) = \mathbb{P}(\nu = m) = \int_{-\infty}^{\infty} \binom{K}{m}\, p^m(x)\, q^{K-m}(x)\, d\Phi(x), \qquad m = 0, 1, \ldots, K \tag{4}$$

and given default of the kth name in the portfolio, and by $p_k(x)$, the conditional default probability of the kth name. Then the conditional mean, $\mu(x)$, and the conditional variance, $\sigma^2(x)$, of the portfolio losses are

$$\mu(x) = \sum_{k=1}^{K} N_k\, p_k(x),$$
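$$\lim_{K\to\infty} \nu_K = \pi_* \quad \text{almost surely} \tag{6}$$

in accordance with the strong law of large numbers. If β ≠ 0 the limit in equation (6) is in distribution, to a random variable with the same distribution as ξ = p(X). Thus, one obtains

$$\lim_{K\to\infty} \mathbb{P}(\nu_K \le \ell) = \Phi\!\left(\frac{\sigma\,\Phi^{-1}(\ell) - H}{\beta}\right), \qquad 0 \le \ell \le 1 \tag{7}$$

It follows from equation (7) that the quantile approximation, $\ell_q^*$, corresponding to the probability q, is

$$\ell_q^* = N\,\Phi\!\left(\frac{\beta\,\Phi^{-1}(q) + H}{\sigma}\right) \tag{8}$$

In terms of the general approach, one has θ = µ with $F_\theta(\ell) = \mathbb{1}_{[\mu,\infty)}(\ell)$ and θ(x) ≡ µ(x) = KNp(x).

Let a probability, q, 0 < q < 1, be fixed and consider the equation

$$q = \mathbb{P}(L \le \ell_q) \tag{11}$$

for the quantile of the distribution of the random variable L. One has

$$\mathbb{P}(L \le \ell_q) = \int_{-\infty}^{\infty} \mathbb{P}_x(L \le \ell_q)\, d\Phi(x) \doteq \int_{-\infty}^{\infty} \Phi\!\left(\frac{\ell_q - \mu(x)}{\sigma(x)}\right) d\Phi(x) \tag{12}$$

Therefore the quantile approximation, $\ell_q^*$, is the solution of the equation

$$q = \int_{-\infty}^{\infty} \Phi\!\left(\frac{\ell_q^* - \mu(x)}{\sigma(x)}\right) d\Phi(x) \tag{13}$$

In terms of the general approach, one has θ = (µ, σ) with $F_\theta(\ell) = \Phi\!\left(\frac{\ell - \mu}{\sigma}\right)$ and θ(x) = (µ(x), σ(x)).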
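Equation (13) has no closed-form solution in general, but it can be solved numerically. The sketch below integrates over the factor with a trapezoid rule and finds the quantile by bisection. Everything specific in it is an assumption made for illustration: the function names, the homogeneous portfolio (K = 100 names with unit notional), and the one-factor Gaussian form of the conditional default probability p(x) with hypothetical parameters `PD` and `RHO`.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def loss_cdf_normal_approx(ell, mu, sigma, n_grid=801, x_max=8.0):
    """Right-hand side of (12)/(13): integral of Phi((ell - mu(x)) / sigma(x)) dPhi(x)."""
    h = 2.0 * x_max / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        x = -x_max + i * h
        weight = 0.5 if i in (0, n_grid - 1) else 1.0  # trapezoid end points
        total += weight * STD_NORMAL.cdf((ell - mu(x)) / sigma(x)) * STD_NORMAL.pdf(x)
    return total * h

def quantile_normal_approx(q, mu, sigma, lo, hi, tol=1e-8):
    """Solve equation (13) for ell by bisection on the bracket [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if loss_cdf_normal_approx(mid, mu, sigma) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical homogeneous portfolio (all values are illustrative assumptions)
K, N, PD, RHO = 100, 1.0, 0.02, 0.2
C = STD_NORMAL.inv_cdf(PD)

def p(x):
    # assumed one-factor Gaussian conditional default probability
    return STD_NORMAL.cdf((C - sqrt(RHO) * x) / sqrt(1.0 - RHO))

def mu(x):
    return K * N * p(x)

def sigma(x):
    return max(N * sqrt(K * p(x) * (1.0 - p(x))), 1e-12)  # floor avoids division by zero

ell_99 = quantile_normal_approx(0.99, mu, sigma, 0.0, K * N)
print("99% loss quantile (normal approximation):", ell_99)
```

Bisection is used because the right-hand side of (13) is monotone increasing in the quantile, so a bracket from zero to the total notional always contains the solution.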
The normal approximation is just the classical central limit theorem (CLT). The equation for the quantile approximation simplifies to

$$q = \int_{-\infty}^{\infty} \Phi\!\left(\frac{\ell_q^*/N - K p(x)}{\sqrt{K p(x)\,(1 - p(x))}}\right) d\Phi(x) \tag{15}$$

Generalized Poisson Approximation

Consider a homogeneous portfolio, for which the number, K, of obligors is moderately large but not very large. If also the conditional mean number of default events in the portfolio, K · p(x), takes moderate values, the conditional distribution of ν might be better approximated by a Poisson distribution

$$\mathbb{P}_x(\nu = m) \doteq \exp(-\lambda(x))\,\frac{\lambda^m(x)}{m!}, \qquad m = 0, 1, 2, \ldots \tag{16}$$

than by a normal distribution, where λ(x) = Kp(x). In this case, the (unconditional) portfolio losses can be approximated by the generalized Poisson distribution: for moderately large K,

$$\mathbb{P}(L = mN) \doteq \int e^{-\lambda(x)}\,\frac{\lambda^m(x)}{m!}\, dG_X(x), \qquad m = 0, 1, 2, \ldots \tag{17}$$

In terms of the general approach, one has $F_\theta$ being the Poisson distribution function with mean θ and θ(x) = λ(x).

In particular, for the quantile approximation, one obtains

$$q = \int \sum_{m=0}^{\ell_q^*/N} e^{-\lambda(x)}\,\frac{\lambda^m(x)}{m!}\, dG_X(x) \tag{18}$$

was used in [2] and [6] for synthetic collateralized debt obligation (CDO) pricing. The same approach is applicable for approximation of portfolio losses.

In the case of a heterogeneous portfolio, it is not sufficient to approximate the distribution of the number of losses suffered. One must keep track of who defaults or at least the sizes of the individual potential losses because, given only the number of defaults, one cannot infer the losses incurred. To see how this added complexity is handled and how the compound Poisson distribution arises quite naturally, the simplest heterogeneous case is analyzed first; namely, when there are only two distinct recovery-adjusted notional values among the obligors in the portfolio.

Denote by $N^{(1)}$ and $N^{(2)}$ the two distinct values of the recovery-adjusted notionals in the pool. The portfolio then divides into two groups: one with obligors having the common recovery-adjusted notional equaling $N^{(1)}$; the other having common recovery-adjusted notional equaling $N^{(2)}$. Denote the number of defaults in each of the two groups by $\nu_1$ and $\nu_2$, respectively. Conditionally, their distributions are independent and can be approximated by a Poisson distribution with conditional mean $\lambda_i(x) = \sum_{k:\,N_k = N^{(i)}} p_k(x)$, i = 1, 2, provided both group sizes are moderately large. (This assumption on the group sizes is only being made in the context of this example.) The total number of defaults in the portfolio, $\nu = \nu_1 + \nu_2$, is conditionally Poisson with conditional mean $\lambda(x) = \lambda_1(x) + \lambda_2(x)$. The total portfolio loss is the sum of the losses of the first and second groups:

$$L = \nu_1 N^{(1)} + \nu_2 N^{(2)} \tag{19}$$

As a positive linear combination of conditionally independent Poisson random variables, L is conditionally a compound Poisson random variable with the same distribution as that of
$$= \begin{cases} \lambda_1(x)/\lambda(x), & N = N^{(1)} \\ \lambda_2(x)/\lambda(x), & N = N^{(2)} \end{cases} \tag{21}$$

In the general case where the recovery-adjusted notionals take more than two values, the conditional distribution of the random variable, $N^{(j)}$, is

$$f(N; x) = \sum_{k:\,N_k = N} p_k(x)/\lambda(x) \tag{22}$$

where $\lambda(x) = \sum_{k=1}^{K} p_k(x)$ and N represents a possible individual loss.

In the special case where $p_k$ does not depend on k, f is simply the relative frequency of the notional values and does not depend on x:

In general, the function f(N; x) is a probability mass function with respect to N, which approximates the conditional probability that the portfolio loss is of size N, given that there has been only one default. More generally, it can be shown that

$$\mathbb{P}_x(L = N \mid \nu = m) \approx f^{*m}(N; x) \tag{24}$$

where $f^{*m}$ denotes the m-fold convolution of f with itself, as a probability mass function (for notational convenience, $f^{*1} \equiv f$ and $f^{*0}(N; x) = 1$ if and only if N = 0). Given that there have been exactly m defaults, the pool loss amounts to a sum of m notional amounts but, as one does not know who defaulted, in the heterogeneous case there is still some randomness left; that randomness is captured (approximately) by $f^{*m}$.

Assuming that a monetary unit has been chosen and that all recovery-adjusted notionals are expressed as integers (that is, integer multiples of the monetary unit), one has the following result [6]:

Theorem 1 In the limiting case of a large portfolio (K large), the following approximate equality holds in distribution under $\mathbb{P}_x$ (i.e., conditional on X = x):

$$L \overset{D}{\approx} \sum_{m=1}^{\nu} N^{(m)}$$

where $(N^{(m)})_{m=1}^{K}$ is an i.i.d. sequence of random variables with common probability mass function f and independent of ν, the number of defaults in the pool, which is approximately Poisson distributed under $\mathbb{P}_x$:

$$\nu \overset{D}{\approx} \mathrm{Pois}(\lambda(x)) \tag{26}$$

More precisely,

$$\max_{N}\left|\,\mathbb{P}_x(L = N) - \mathbb{P}_x\!\left(\sum_{m=1}^{\nu} N^{(m)} = N\right)\right| = O\!\left(\sum_{k=1}^{K} (p_k(x))^2\right) \tag{27}$$

$$\mathbb{P}(L \le \ell) \doteq \int \sum_{N \le \ell}\,\sum_{m=0}^{\infty} \frac{e^{-\lambda(x)}\lambda^m(x)}{m!}\, f^{*m}(N; x)\, dG_X(x) \tag{28}$$

In terms of the general approach, one has $F_\theta$ being the compound Poisson distribution function with parameter $\theta = (\theta_1, \theta_2, \ldots, \theta_K) \in [0, 1]^K$ and $\theta(x) = (p_1(x), p_2(x), \ldots, p_K(x))$; $F_\theta$ is defined as

$$F_\theta(\ell) = \sum_{N \le \ell}\,\sum_{m=0}^{\infty} \frac{e^{-\lambda}\lambda^m}{m!}\, f^{*m}(N) \tag{29}$$

where $\lambda := \sum_{k=1}^{K} \theta_k$ and $f(N) := \lambda^{-1} \sum_{k:\,N_k = N} \theta_k$. In practice, the convolutions would be calculated recursively using the fast Fourier transform.

Large Deviations

Approximations based on large deviation theory usually lead to exponential approximations of the tail of the conditional portfolio loss distribution. These approximations are derived using the saddlepoint
Large Pool Approximations 5
method for the characteristic function of the portfolio losses,

$$\varphi_L(s) = \mathbb{E}[\exp(isL)] = \int \prod_{k=1}^{K} \left(1 - p_k(x) + p_k(x)\, e^{isN_k}\right) dG_X(x) \tag{30}$$

The technical details can be found in [1] (see also Saddlepoint Approximation).

Other Methods

There are some methods of approximation that deal only with quantiles of the loss distribution directly, focusing on quantiles with high quantile probability, which is the case of interest for credit risk. The large deviation approximations are examples of such methods.

Another one of these methods is due to Pykhtin [12] who, building on the work of Martin and Wilde [10], adapted the tools of an earlier investigation [4] in market-risk sensitivity to position sizes, to the credit risk setting. Note that this method is a direct, analytical approximation to the quantile of the unconditional loss distribution using an approximate model, unlike the other semianalytic methods described so far, which calculate the quantile by making analytical approximations to the conditional loss distribution (conditional on a systemic credit scenario). It is also worth noting that the result is in closed form, a qualitative description of which is given here.

Pykhtin’s approach can be described at a high level as follows. It consists of a three-stage series of approximations:

1. A single-factor model, which is an approximation based on an LLN type of loss function; that is, it is a Vasicek type of model.
   a) The single factor is built as a weighted sum of the portfolio’s counterparties’ credit drivers.
   b) The weights are chosen to maximize the single factor’s correlation with the drivers.
   c) The weights use the counterparties’ loss characteristics such as default probabilities and losses given default.
2. An analytic adjustment (approximation) to a full multifactor model that is still based on an LLN type of loss function. This adjustment is called a multifactor adjustment.
3. An analytic adjustment, bridging the LLN-type loss function of the second stage to the usual Merton-type one with full specific risk. This adjustment is called a granularity adjustment.

The reason behind the terminology for the two adjustments is that for a single-factor model, the multifactor adjustment vanishes, whereas for an infinitely granular portfolio (i.e., a very large, homogeneous one), the granularity adjustment vanishes.

The approximations, in both the second and third stages, are based on a single formula for quantile approximation, due originally to Gourieroux et al. [4]. The formula is a second-order Taylor expansion, for the quantile, in a small parameter that is used to express the full loss model as a perturbation of the single-factor model. The first-order Taylor coefficient is the difference between the single-factor (conditional) loss and the conditional expected loss of the full model, conditional on the single factor. The single factor is constructed so that the first-order Taylor term vanishes.

The second-order Taylor coefficient is related to the conditional variance of the full loss, conditional on the single factor. The well-known conditional variance decomposition from statistics is used to split the Taylor coefficient into two terms, which are the approximations in the second and third stages.

The end result for the entire adjustment to the single-factor quantile is expressed as a sum of four quadratic forms in the recovery-adjusted exposures, with coefficients involving the bivariate and univariate normal cumulative distribution functions, evaluated in terms of the input statistical parameters of the model. The result is thus in closed form. The reader is referred to [12] for the quantitative details of the construction, the formulae for the terms in the quantile approximation, and a study of the scope of applicability of the method.

End Notes

a. This specification is a partial case of the famous Gaussian copula model [9].
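The conditional compound Poisson distribution that appears in the heterogeneous-portfolio approximation of this article (the double sum inside equation (29), for fixed conditional default probabilities) can be sketched with direct convolutions. The article notes that in practice the fast Fourier transform would be used; plain convolution is swapped in here only for readability. The function name, the truncation parameter `max_defaults`, and the sample inputs are hypothetical.

```python
import math

def compound_poisson_pmf(theta, notionals, max_loss, max_defaults=60):
    """Compound Poisson probabilities of total loss N = 0..max_loss.

    theta     : (conditional) default probabilities theta_k
    notionals : recovery-adjusted notionals N_k, positive integers
    The sum over the number of defaults m is truncated at max_defaults.
    """
    if any(n < 1 for n in notionals):
        raise ValueError("notionals must be positive integers")
    lam = sum(theta)
    # f(N) = (1/lambda) * sum of theta_k over names with N_k = N
    f = [0.0] * (max_loss + 1)
    for t, n in zip(theta, notionals):
        if n <= max_loss:
            f[n] += t / lam
    pmf = [0.0] * (max_loss + 1)
    conv = [1.0] + [0.0] * max_loss      # f^{*0}: unit mass at zero loss
    weight = math.exp(-lam)              # e^{-lam} lam^m / m! for m = 0
    for m in range(max_defaults + 1):
        for n in range(max_loss + 1):
            pmf[n] += weight * conv[n]
        # build f^{*(m+1)} from f^{*m}, keeping only losses <= max_loss
        nxt = [0.0] * (max_loss + 1)
        for a, c in enumerate(conv):
            if c == 0.0:
                continue
            for b in range(1, max_loss + 1 - a):
                nxt[a + b] += c * f[b]
        conv = nxt
        weight *= lam / (m + 1)
    return pmf

# Hypothetical pool with two distinct notionals
pmf = compound_poisson_pmf([0.1, 0.2], [1, 2], max_loss=10)
cdf = [sum(pmf[: n + 1]) for n in range(len(pmf))]
print("P(L <= 2) =", cdf[2])
```

When all notionals equal one, the result reduces to the ordinary Poisson probabilities of equation (16), which gives a simple check of the implementation. The unconditional distribution of equation (28) would additionally require integrating these conditional probabilities over the factor distribution.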
critical points may need to be accounted for. The tangent line to the steepest descent curve at $\zeta^*$ can be parameterized by $w \in \mathbb{R}$ by the equation

$$\left(s f^{(2)}(\zeta^*)\right)^{1/2} (\zeta - \zeta^*) = iw \tag{2}$$

is finite for τ in an open interval $(-c_1, c_2)$ containing the origin, the Fourier inversion theorem implies that

$$f_n(x) = \frac{n}{2\pi i} \int_{\alpha - i\infty}^{\alpha + i\infty} e^{n(\Psi(\tau) - \tau x)}\, d\tau \tag{7}$$
2 Saddlepoint Approximation
for any real $\alpha \in (-c_1, c_2)$. This integral is now amenable to a saddlepoint treatment as follows. For each x in the support of f, one can show that the saddlepoint condition

$$\Psi'(\tau) - x = 0 \tag{8}$$

has a unique real solution $\tau^* = \tau^*(x)$. One now evaluates the integral given by equation (7) with $\alpha = \tau^*$, and uses Taylor expansion and the substitution $w = -i\sqrt{n\,\Psi''(\tau^*)}\,(\tau - \tau^*)$ to write

$$f_n(x) \sim \frac{\sqrt{n}}{2\pi\sqrt{\Psi''(\tau^*)}} \int_{-\infty}^{\infty} e^{n(\Psi(\tau^*) - \tau^* x) - w^2/2} \left[1 + i\,n^{-1/2}\,(\Psi''(\tau^*))^{-3/2}\,\Psi^{(3)}(\tau^*)\,\frac{w^3}{3!} + n^{-1}\,(\Psi''(\tau^*))^{-2}\,\Psi^{(4)}(\tau^*)\,\frac{w^4}{4!} + \cdots\right] dw \tag{9}$$

Each term in this expansion is a Gaussian integral that can be evaluated in closed form. The odd terms all vanish, leaving an expansion in powers of $n^{-1}$:

$$f_n(x) \sim g_n(x)\left[1 + n^{-1}\left(\frac{\Psi^{(4)}(\tau^*)}{8\,(\Psi''(\tau^*))^2} - \frac{5\,(\Psi^{(3)}(\tau^*))^2}{24\,(\Psi''(\tau^*))^3}\right) + O(n^{-2})\right] \tag{10}$$

where the leading term (called the saddlepoint approximation) is given by

$$g_n(x) = \left(\frac{n}{2\pi\,\Psi''(\tau^*)}\right)^{1/2} e^{n(\Psi(\tau^*) - \tau^* x)} \tag{11}$$

The function $I(x) = \sup_\tau \left(\tau x - \Psi(\tau)\right) = \tau^* x - \Psi(\tau^*)$ that appears in this expression is the Legendre transform of the cumulant generating function Ψ, and is known as the rate function or Cramér function of the random variable X. The large deviation principle

which means roughly that when truncated at any order of $n^{-1}$, the remainder is of the same magnitude as the first omitted term. A more precise statement of the magnitude of the remainder is difficult to establish: the lack of a general error analysis is an acknowledged deficiency of the saddlepoint method.

Applications to Portfolio Credit Risk

The problem of portfolio credit risk measures and the problem of evaluating arbitrage-free pricing of collateralized debt obligations (CDOs) both boil down to computation of the probability distribution of the portfolio loss at a set of times, and can be amenable to a saddlepoint treatment. To illustrate this fact, we consider a simple portfolio of credit risky instruments (e.g., corporate loans or credit default swaps), and investigate the properties of the losses caused by default of the obligors. Let $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ be a filtered probability space that contains all of the random elements: P may be either the physical or the risk-neutral probability measure. The portfolio is defined by the following basic quantities:

• M reference obligors with notional amounts $N_j$, j = 1, 2, . . . , M;
• the default time $\tau_j$ of the jth credit, an $\mathcal{F}_t$ stopping time;
• the fractional recovery $R_j$ after default of the jth obligor;
• the loss $l_j = (1 - R_j)N_j/N$ caused by default of the jth obligor as a fraction of the total notional $N = \sum_j N_j$;
• the cumulative portfolio loss $L(t) = \sum_j l_j\, I(\tau_j \le t)$ up to time t as a fraction of the total notional.

For simplicity, we make the following assumptions:
The most important consequence of these assumptions is that, conditioned on H, the fractional loss L(t) is a sum of independent (but not identical) Bernoulli random variables. For fixed values of the time t and conditioning random variable Y, we note that $\hat{L} := L(t)\,|\,Y \sim \sum_j l_j X_j$ where $X_j \sim \mathrm{Bern}(p_j(t, y))$, $p_j = \mathrm{Prob}(\tau_j \le t \mid Y = y)$. The following functions are associated with the random variable $\hat{L}$:

1. the pdf $\rho(x) := F^{(-1)}(x)$ (in our simple example, it is a sum of delta functions supported on the interval [0, 1]);
2. the cumulative distribution function (CDF) $F^{(0)}(x) = E[I(\hat{L} \le x)]$;
3. the higher conditional moment functions $F^{(m)}(x) = (m!)^{-1} E[((x - \hat{L})^+)^m]$, m = 1, 2, . . . ;
4. the cumulant generating function (CGF) $\Psi(u) = \log(E[e^{u\hat{L}}])$.

When we need to make explicit the dependence on t, y we write $F^{(m)}(x|t, y)$. The unconditional versions of these functions are given by

$$F^{(m)}(x|t) = E[F^{(m)}(x|t, Y)] = \int_{\mathbb{R}^d} F^{(m)}(x|t, y)\, \rho_Y(dy), \qquad m = -1, 0, \ldots \tag{13}$$

According to these definitions, for all m = 0, 1, . . . we have the integration formula

$$F^{(m)}(x) = \int_0^x F^{(m-1)}(z)\, dz \tag{14}$$

$$\mathrm{VaR}_\alpha(L_T) = \inf\{x \mid F^{(0)}(x|T) > \alpha\} \tag{15}$$

$$\mathrm{CVaR}_\alpha(L_T) = \frac{E[(L_T - x)^+]}{1 - \alpha} = \frac{F^{(1)}(x|T) + E[L_T] - x}{1 - \alpha} \tag{16}$$

Here, we need to take P to be the physical measure.

CDO Pricing

CDOs are portfolio credit swaps that can be schematically decomposed into two types of basic contingent claims whose cash flows depend on the portfolio loss $L_t$. These cash flows are analogous to insurance and premium payments paid periodically (typically, quarterly) on dates $t_k$, k = 1, . . . , K, to cover default losses within a “tranche” that occurred during that period.

The writer (the insurer) of one unit of a default leg for a tranche with attachment levels 0 ≤ a < b ≤ 1 pays the holder (the buyer of insurance) at each date $t_k$ all default losses within the interval [a, b] that occurred over $[t_{k-1}, t_k]$. The time 0 arbitrage price of such a contract is

$$W_{a,b} = \sum_k e^{-r t_k}\, E\!\left[(b - L_{t_k})^+ - (b - L_{t_{k-1}})^+ - (a - L_{t_k})^+ + (a - L_{t_{k-1}})^+\right] \tag{17}$$

where E is now the expectation with respect to some risk-neutral measure. The writer of one unit of a premium leg for a tranche with attachment levels a < b (the insured) pays the holder (the insurer) on each date $t_k$ an amount jointly proportional to the year fraction $t_k - t_{k-1}$ and the amount remaining in the tranche. We ignore a possible “accrual term” that accounts for defaults between payment dates. The time 0 arbitrage price of such a contract is

Saddlepoint Approximations for $F^{(m)}$

We see that the credit risk management problem and the CDO pricing problem both boil down to finding an efficient method to compute $E[F^{(m)}(x|t, y)]$ for m = 0, 1 and a large but finite set of values (x, t, y). For the conditional loss $\hat{L} = L_t\,|\,Y = y$, the CGF is
or (25) is used. Thus for example, when x > E[L], the approximation for m = 1 is

$$F^{(1)}(x) \sim x - E[L] + \frac{e^{\tau_1^+ x + \Psi(\tau_1^+)}}{\sqrt{2\pi\,\Psi^{(2)}(\tau_1^+)}} \left[1 + \frac{\Psi^{(4)}(\tau_1^+)}{8\,(\Psi^{(2)}(\tau_1^+))^2} - \frac{5\,(\Psi^{(3)}(\tau_1^+))^2}{24\,(\Psi^{(2)}(\tau_1^+))^3} + \cdots\right] \tag{27}$$

In [2], the m = −1 solution $\tau^*$, suggested by large deviation theory, is chosen as the center of the Taylor expansion, even for m ≠ −1. The factor $\tau^{-m-1}$ is then included with the other nonexponentiated terms, leading to an asymptotic expansion with terms of the form

$$\int_{-\infty}^{\infty} e^{-w^2/2}\,(w + w_0)^{-m-1}\, w^k\, dw, \qquad w_0 = \tau^*/\Psi^{(2)}(\tau^*) \tag{28}$$

These integrals can be evaluated in closed form, but are somewhat complicated, and more terms are needed for a given order of accuracy.

Numerical implementation of the saddlepoint method for portfolio credit problems thus boils down to efficient computation of the appropriate solutions of the saddlepoint condition given by equation (26). This is a relatively straightforward application of one-dimensional Newton–Raphson iteration, but must be done for a large number of values of (x, t, y). For typical parameter values and up to 210 obligors, [9] report that saddlepoints were usually found in under 10 iterations, which suggests that a saddlepoint expansion will run no more than about 10 times slower than the Edgeworth expansion with the same number of terms. However, both [2] and [9] observe that the accuracy of the saddlepoint expansion is often far greater.

Acknowledgments

Research underlying this article was supported by the Natural Sciences and Engineering Research Council of Canada and MITACS, Canada.

References

[1] Andersen, L., Sidenius, J. & Basu, S. (2003). All your hedges in one basket, Risk 16, 67–72.
[2] Antonov, A., Mechkov, S. & Misirpashaev, T. (2005). Analytical Techniques for Synthetic CDOs and Credit Default Risk Measures, Numerix Preprint http://www.defaultrisk.com/pp crdrv 77.htm.
[3] Daniels, H.E. (1954). Saddlepoint approximations in statistics, Annals of Mathematical Statistics 25, 631–650.
[4] Gordy, M. (2002). Saddlepoint approximation of credit risk, Journal of Banking and Finance 26(2), 1335–1353.
[5] Hull, J. & White, A. (2004). Valuation of a CDO and an nth to default CDS without Monte Carlo simulation, Journal of Derivatives 2, 8–23.
[6] Martin, R., Thompson, K. & Browne, C. (2003). Taking to the saddle, in Credit Risk Modelling: The Cutting-edge Collection, M. Gordy, ed, Riskbooks, London.
[7] Varadhan, S.R.S. (1966). Asymptotic probabilities and differential equations, Communications on Pure and Applied Mathematics 19, 261–286.
[8] Watson, G.N. (1995). A Treatise on the Theory of Bessel Functions, 2nd Edition, Cambridge University Press, Cambridge, reprint of the second (1944) edition.
[9] Yang, J.P., Hurd, T.R. & Zhang, X.P. (2006). Saddlepoint approximation method for pricing CDOs, Journal of Computational Finance 10, 1–20.

THOMAS R. HURD
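The Newton–Raphson step mentioned in this article can be sketched for the simplest saddlepoint condition, Ψ'(τ) = x, applied to a conditional loss that is a sum of independent Bernoulli losses. The CGF used, Ψ(u) = Σ_j log(1 − p_j + p_j e^{u l_j}), follows from that Bernoulli structure. Everything else is an assumption for illustration: the function names, the restriction to E[L] < x < Σ_j l_j (so that τ* > 0), and the bracketing safeguard that falls back to bisection, which is an addition here and not part of the method as described. The leading-order density uses the saddlepoint approximation with n = 1.

```python
import math

def cgf_derivatives(u, p, l):
    """Return Psi(u), Psi'(u), Psi''(u) for L = sum_j l_j X_j, X_j ~ Bern(p_j)."""
    psi = d1 = d2 = 0.0
    for pj, lj in zip(p, l):
        e = math.exp(u * lj)
        denom = 1.0 - pj + pj * e
        w = pj * e / denom              # tilted default probability
        psi += math.log(denom)
        d1 += lj * w
        d2 += lj * lj * w * (1.0 - w)
    return psi, d1, d2

def solve_saddlepoint(x, p, l, tol=1e-12, max_iter=100):
    """Solve Psi'(tau) = x for tau > 0 by Newton-Raphson inside a bracket."""
    mean = sum(pj * lj for pj, lj in zip(p, l))
    if not mean < x < sum(l):
        raise ValueError("this sketch requires E[L] < x < sum of losses")
    lo, hi = 0.0, 1.0
    while cgf_derivatives(hi, p, l)[1] < x:   # Psi' is increasing: expand bracket
        lo, hi = hi, 2.0 * hi
    tau = 0.5 * (lo + hi)
    for _ in range(max_iter):
        _, d1, d2 = cgf_derivatives(tau, p, l)
        if abs(d1 - x) < tol:
            break
        if d1 < x:
            lo = tau
        else:
            hi = tau
        newton = tau - (d1 - x) / d2
        # accept the Newton step only if it stays inside the bracket
        tau = newton if lo < newton < hi else 0.5 * (lo + hi)
    return tau

def saddlepoint_density(x, p, l):
    """Leading-order saddlepoint approximation with n = 1."""
    tau = solve_saddlepoint(x, p, l)
    psi, _, d2 = cgf_derivatives(tau, p, l)
    return math.exp(psi - tau * x) / math.sqrt(2.0 * math.pi * d2)

# Hypothetical homogeneous pool: 50 obligors, 2% conditional default probability
p = [0.02] * 50
l = [1.0 / 50] * 50
print("saddlepoint for x = 0.1:", solve_saddlepoint(0.1, p, l))
```

For this homogeneous example the saddlepoint has a closed form, τ* = 50 log(0.1 · 0.98 / (0.02 · 0.9)), which can be used to check the iteration. Because the loss here is lattice-valued, the leading-order term should be read as a smooth approximation rather than an exact density.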
Credit Scoring

Credit scoring models play a fundamental role in the risk management practice at most banks. Commercial banks’ primary business activity is related to extending credit to borrowers and generating loans and credit assets. A significant component of a bank’s risk, therefore, lies in the quality of its assets, which needs to be in line with the bank’s risk appetite.^a To manage risk efficiently, quantifying it with the most appropriate and advanced tools is an extremely important factor in determining the bank’s success.

Credit risk models are used to quantify credit risk at counterparty or transaction level and they differ significantly by the nature of the counterparty (e.g., corporate, small business, private individual). Rating models have a long-term view (through the cycle) and have always been associated with corporate clients, financial institutions, and the public sector (see Credit Rating; Counterparty Credit Risk). Scoring models, instead, focus more on the short term (point in time) and have been mainly applied to private individuals and, more recently, extended to small- and medium-sized enterprises (SMEs).^b In this article, we focus on credit scoring models, giving an overview of their assessment, implementation, and usage.

Since the 1960s, larger organizations have been utilizing credit scoring to quickly and accurately assess the risk level of their prospects, applicants, and existing customers, mainly in the consumer-lending business. Increasingly, midsize and smaller organizations are appreciating the benefits of credit scoring as well. The credit score is reflected in a number or letter(s) that summarizes the overall risk utilizing available information on the customer. Credit scoring models predict the probability that an applicant or existing borrower will default or become delinquent over a fixed time horizon.^c The credit score empowers users to make quick decisions or even to automate decisions, and this is extremely desirable when banks are dealing with large volumes of clients and relatively small profit margins at individual transaction level.

Credit scoring models can be classified into three main categories: application, behavioral, and collection models, depending on the stage of the consumer credit cycle in which they are used. The main difference between them lies in the set of variables that are available to estimate the client’s creditworthiness, that is, the earlier the stage in the credit cycle, the less client-specific information is available to the bank. This generally means that application models have lower predictive power than behavioral and collection models.

Over the last 50 years, several statistical methodologies have been used to build credit scoring models. The very simplistic univariate analysis applied at the beginning (late 1950s) was replaced as soon as academic research started to focus on credit scoring modeling techniques (late 1960s). The seminal works in this field, by Beaver [10] and Altman [1], introduced multivariate discriminant analysis (MDA), which became the most popular statistical methodology used to estimate credit scoring models until Ohlson [26], for the first time, applied the conditional logit model to the study of default prediction. Since Ohlson’s research (early 1980s), several other statistical techniques have been utilized to improve the predictive power of credit scoring models (e.g., linear regression, probit analysis, Bayesian methods, neural networks, etc.), but logistic regression still remains the most popular method.

Lately, credit scoring has gained new importance with the new Basel Capital Accord. The so-called Basel II replaces the current 1988 capital accord and focuses on techniques that allow banks and supervisors to properly evaluate the various risks that banks face (see Internal-ratings-based Approach; Regulatory Capital). Since credit scoring contributes broadly to the internal risk assessment process of an institution, regulators have enforced stricter rules about model development, implementation, and validation to be followed by banks that wish to use their internal models in order to estimate capital requirements.

The remainder of the article is structured as follows. In the second section, we review some of the most relevant research related to credit scoring modeling methodologies. In the third section, following the model lifecycle structure, we analyze the main steps related to the model assessment, implementation, and validation process.

The statistical techniques used for credit scoring are based on the idea of discrimination between several groups in a data sample. These procedures originated in the 1930s and 1940s [18]. At that time, some of the finance houses and mail order firms were having difficulties with their credit management. The decision whether to give loans or send merchandise to the applicants was
made judgmentally by credit analysts. The decision procedure was nonuniform, subjective, and opaque; it depended on the rules of each financial house and on the personal and empirical knowledge of each single clerk. With the rising number of people applying for a credit card, it was impossible to rely only on credit analysts; an automated system was necessary. The first consultancy was formed in San Francisco by Bill Fair and Earl Isaac in the late 1950s.

After the first empirical solutions, academic interest in the topic rose and, given the lack of consumer-lending figures, researchers focused their attention on small business clients. The seminal works in this field were Beaver [10] and Altman [1], who developed univariate and multivariate models, applying an MDA technique to predict business failures using a set of financial ratios.^d

For many years thereafter, MDA was the prevalent statistical technique applied to default prediction models and it was used by many authors [2, 3, 13, 15, 16, 24, 29]. However, in most of these studies, authors pointed out that two basic assumptions of MDA are often violated when applied to default prediction problems.^e Moreover, in MDA models, the standardized coefficients cannot be interpreted like the slopes of a regression equation and, hence, do not indicate the relative importance of the different variables. Considering these problems of MDA, Ohlson [26], for the first time, applied the conditional logit model to the study of default prediction.^f The practical benefits of the logit methodology are that it does not require the restrictive assumptions of MDA and allows working with disproportional samples. The performance of his models, in terms of classification accuracy, was lower than the one reported in the previous studies based on MDA, but he pointed out some reasons to prefer the logistic analysis.

From a statistical point of view, logit regression seems to fit well the characteristics of the default prediction problem, where the dependent variable is binary (default/nondefault) and the groups are discrete, nonoverlapping, and identifiable. The logit model yields a score between 0 and 1, which can conveniently be transformed into the probability of default (PD) of the client. Lastly, the estimated coefficients can be interpreted separately as the importance or significance of each of the independent variables in the explanation of the estimated PD. After the work of Ohlson [26], most of the academic literature [5, 11, 19, 27, 30] used logit models to predict default.

Several other statistical techniques have been tested to improve the prediction accuracy of credit scoring models (e.g., linear regression, probit analysis, Bayesian methods, neural networks, etc.), but the empirical results have never shown really significant benefits.

Credit Scoring Models Lifecycle

As already mentioned, banks that want to implement the most advanced approach to calculate their minimum capital requirements (i.e., advanced internal rating based approach, A-IRB) are subject to stricter and common rules regarding how their internal models should be developed, implemented, and validated.^g A standard model lifecycle has been designed to be followed by the financial institutions that want to implement the A-IRB approach. The lifecycle of every model is divided into several phases (assessment, implementation, validation) and regulators have published specific requirements for each one of them. In this section, we describe the key aspects of each model’s lifecycle phase.

Model Assessment

Credit scoring models are used to risk rank new or existing clients on the basis of the assumption that the future will be similar to the past. If an applicant or an existing client had a certain behavior in the past (e.g., paid back his debt or not), it is likely that a new applicant or client, with similar characteristics, will show the same behavior. As such, to develop a credit scoring model, we need a sample of past applicants’ or clients’ data related to the same product as the one we want to use our scoring model for. If historical data from the bank are available, an empirical model can be developed. When banks do not have data or do not have a sufficient amount of data to develop an empirical model, an expert or a generic model is the most popular solution.^h

When a data sample covering the time horizon necessary for the statistical analysis (usually at least 1 year) is available, the performance of the clients inside the sample can be observed. We define performance as the default or nondefault event associated with each client.^i This binary variable is the dependent variable used to run the regression analysis. The
characteristics of the client at the beginning of the selected period are the predictors.

Following the literature discussed in the second section, a conditional probability model, the logit model, is commonly used by most banks to estimate the 1-year score through a range of variables by maximizing the log-likelihood function. This procedure is used to obtain the estimates of the parameters of the following logit model [20, 21]:

$$P_1(X_i) = \frac{1}{1 + e^{-(B_0 + B_1 X_{i1} + B_2 X_{i2} + \cdots + B_n X_{in})}} = \frac{1}{1 + e^{-D_i}} \tag{1}$$

where $P_1(X_i)$ is the score given the vector of attributes $X_i$; $B_j$ is the coefficient of attribute $j$ (with $j = 1, \ldots, n$); $B_0$ is the intercept; $X_{ij}$ is the value of attribute $j$ (with $j = 1, \ldots, n$) for customer $i$; and $D_i$ is the logit for customer $i$.

The logistic function implies that the logit score $P_1$ takes values in the interval [0, 1] and is increasing in $D_i$. If $D_i$ approaches minus infinity, $P_1$ approaches zero, and if $D_i$ approaches plus infinity, $P_1$ approaches one.

The set of attributes that are used in the regression depends on the type of model that is going to be developed. Application models, employed to decide whether to accept or reject an applicant, typically rely only on personal information about the applicant, given the fact that this is usually the only information available to the bank at that stage.^j Behavioral and collection models include variables describing the status of the relationship between the client and the bank that may add significant predictive power to the model.^k

Once the model is developed, it needs to be tested on a test sample to confirm the soundness of its results. When enough data are available, part of the development sample (hold-out sample) is usually kept for the final test of the model. However, an optimal test of the model would require investigating its performance also on an out-of-time and out-of-universe sample.

Model Implementation

The main advantage of scoring models is to allow banks to implement automated decision systems to manage their retail clients (private individuals and SMEs). When a large number of applicants or clients is manually referred to credit analysts to check their information and apply policy rules, most of the benefits associated with the use of scoring models are lost. On the other hand, any scoring model has a “gray” area where it is not able to separate with an acceptable level of confidence between expected “good” clients and expected “bad” ones.^l The main challenge for credit risk managers is to define the most appropriate and efficient thresholds (cutoffs) for each scoring model.

In order to maximize the benefits of a scoring model, the optimal cutoff should be set taking into account the misclassification costs related to the type I and type II error rates, as Altman et al. [2], Taffler [29], and Koh [23] point out. Moreover, we believe that the optimum cutoff value cannot be found without a careful consideration of each particular bank’s peculiarities (e.g., tolerance for risk, profit–loss objectives, recovery process costs and efficiency, possible marketing strategies). Today, the most advanced banks set cutoffs using profitability analyses at account level.

The availability of sophisticated IT systems has significantly broadened the number of strategies that can be implemented using credit scoring models. The most efficient banks are able to follow the lifecycle of any client, from the application to the end of the relationship, with monthly updated scores calculated by different scorecards related to the phase of the credit cycle where the client is located (e.g., origination, account maintenance, collection, write off). Marketing campaigns (e.g., cross-selling, up-selling), automated limit changes, early collection strategies, and shadow limit management are some of the activities that are fully driven by the output of scoring models in most banks.

Model Validation

Banks that have adopted or are willing to adopt the Basel II IRB-advanced approach are required to put in place a regular cycle of model validation that should include at least monitoring of the model performance and stability, reviewing of the model relationships, and testing of model outputs against outcomes (i.e., backtesting).^m

Considering the relatively short lifecycle of credit scoring models due to the high volatility of retail markets, their validation has always been completed
by banks. Basel II has only given it a more official shape, prescribing that the validation should be undertaken by a team independent from the one that has developed the models.

Stability and performance (i.e., prediction accuracy) are extremely important information about the quality of the scoring models. As such, they should be tracked and analyzed at least monthly by banks, regardless of the validation exercise. As we have discussed above, scoring models are often used to generate a considerable amount of automated decisions that may have a significant impact on the banking business. Even small changes in the population’s characteristics can substantially affect the quality of the models, creating undesired selection bias.

In the literature, we have found several indexes that have been used to assess the performance of the models. The simple type I and type II error rates that quantify the accuracy of each model in correctly classifying defaulted and nondefaulted observations were the first measures to be applied to scoring models. More recently, the accuracy ratio (AR) and the Gini index have become the most popular measures (see [17] for further details).

Backtesting and benchmarking are an essential part of the scoring models’ validation. With backtesting, we evaluate the calibration and discrimination of a scoring model. Calibration refers to the mapping of a score to a quantitative risk measure (e.g., PD). A scoring model is considered well calibrated if the (ex ante) estimated risk measures (PD) deviate only marginally from what has been observed ex post (actual default rate per score band). Discrimination measures how well the scoring model provides an ordinal ranking of the risk profile of the observations in the sample; for example, in the credit risk context, discrimination measures to what extent defaulters were assigned low scores and nondefaulters high scores.

Benchmarking is another quantitative validation method that aims at assessing the consistency of the estimated scoring models with those obtained using other estimation techniques and potentially using other data sources. This analysis may be quite difficult to perform for retail portfolios, given the lack of generic benchmarks in the market.^n

Lastly, we would like to point out that Basel II specifically requires senior management to be fully involved and aware of the quality and performance of all the scoring models utilized in the daily business (see [9], par. 438, 439, 660, 718 (LXXVI), 728).

End Notes

a. Risk appetite is defined as the maximum risk the bank is willing to accept in executing its chosen business strategy, to protect itself against events that may have an adverse impact on its profitability, the capital base, or share price (see Economic Capital Allocation; Economic Capital).

b. Recently, several studies [4, 12] have shown the importance for banks of classifying SMEs as retail clients and applying credit scoring models developed specifically for them.

c. The default definition may be significantly different by bank and type of client. The new Basel Capital Accord [9] (par. 452) has given a common definition of default (i.e., 90 days past due over 1-year horizon) that is consistently used by most banks today.

d. The original Z-score model of Altman [1] used five ratios: working capital/total assets, retained earnings/total assets, EBIT/total assets, market value equity/BV of total debt, and sales/total assets.

e. MDA is based on two restrictive assumptions: (i) the independent variables included in the model are multivariate normally distributed and (ii) the group dispersion matrices (or variance–covariance matrices) are equal across the failing and the nonfailing group. See [6, 22, 25] for further discussions about this topic.

f. Zmijewski [31] was the pioneer in applying probit analysis to predict default, but, until now, logit analysis has given better results in this field.

g. The new Basel Capital Accord offers financial institutions the possibility to choose between the standardized and the advanced approach to calculate their capital requirements. Only the latter requires banks to use their own internal risk assessment tools to quantify the inputs of the capital requirements formulas (i.e., PD and loss given default).

h. Expert scorecards are based on subjective weights assigned by an analyst, whereas generic scorecards are developed on pooled data from other banks operating in the same market. For a more detailed analysis of the possible solutions that banks can consider when not enough historical data is available, see [28].

i. See end note (b).

j. The most common application variables used are sociodemographic information about the applicants (e.g., marital status, residence type, time at current address, type of work, time at current work, flag phone, number of children, installment on income, etc.). When a credit bureau is available in the market, the information that can be obtained related to the behavior of the applicant with other financial institutions is an extremely powerful variable to be used in application models.

k. Variables used in behavioral and collection scoring models are calculated and updated at least monthly. As such, the
correlation between these variables and the default event is significantly high. Examples of behavioral variables are as follows: the number of missed installments (current, max last 3/6/12 months, or ever), number of days in excess (current, max last 3/6/12 months, or ever), outstanding on limit, and so on. Behavioral scores can be calculated at facility and customer level (when several facilities are related to the same client).

l. Depending on the chosen binary-dependent variable, “good” and “bad” will have different meanings. For credit risk models, these terms are usually associated with nondefaulted and defaulted clients, respectively.

m. See par. 417 and 718 (XCix) of the new Basel Capital Accord [7–9] (see also Model Validation; Backtesting).

n. Recently, rating agencies (e.g., Standard & Poor’s and Moody’s) and credit bureau providers (e.g., Fair Isaac and Experian) have started to offer services of benchmarking for retail scoring models. For more details about backtesting and benchmarking techniques, see [14].

References

[1] Altman, E.I. (1968). Financial ratios, discriminant analysis and the prediction of corporate bankruptcy, Journal of Finance 23(4), 589–611.
[2] Altman, E.I., Haldeman, R.G. & Narayanan, P. (1977). Zeta-analysis. A new model to identify bankruptcy risk of corporations, Journal of Banking and Finance 1, 29–54.
[3] Altman, E.I., Hartzell, J. & Peck, M. (1995). A Scoring System for Emerging Market Corporate Debt. Salomon Brothers Emerging Markets Bond Research, May 15.
[4] Altman, E.I. & Sabato, G. (2005). Effects of the new Basel capital accord on bank capital requirements for SMEs, Journal of Financial Services Research 28(1/3), 15–42.
[5] Aziz, A., Emanuel, D.C. & Lawson, G.H. (1988). Bankruptcy prediction – an investigation of cash flow based models, Journal of Management Studies 25(5), 419–437.
[6] Barnes, P. (1982). Methodological implications of non-normality distributed financial ratios, Journal of Business Finance and Accounting 9(1), 51–62.
[7] Basel Committee on Banking Supervision (2005). Studies on the Validation of Internal Rating Systems. Working paper 14, www.bis.org.
[8] Basel Committee on Banking Supervision (2005). Update on Work of the Accord Implementation Group Related to Validation Under the Basel II Framework. Newsletter 4, www.bis.org.
[9] Basel Committee on Banking Supervision (2006). International Convergence of Capital Measurement and Capital Standards. www.bis.org.
[10] Beaver, W. (1967). Financial ratios predictors of failure, Journal of Accounting Research 4, 71–111.
[11] Becchetti, L. & Sierra, J. (2003). Bankruptcy risk and productive efficiency in manufacturing firms, Journal of Banking and Finance 27(11), 2099–2120.
[12] Berger, A.N. & Frame, S.W. (2007). Small business credit scoring and credit availability, Journal of Small Business Management 45(1), 5–22.
[13] Blum, M. (1974). Failing company discriminant analysis, Journal of Accounting Research 12(1), 1–25.
[14] Castermans, G., Martens, D., Van Gestel, T., Hamers, B. & Baesens, B. (2007). An overview and framework for PD backtesting and benchmarking, Proceedings of Credit Scoring and Credit Control X, Edinburgh, Scotland.
[15] Deakin, E. (1972). A discriminant analysis of predictors of business failure, Journal of Accounting Research 10(1), 167–179.
[16] Edmister, R. (1972). An empirical test of financial ratio analysis for small business failure prediction, Journal of Financial and Quantitative Analysis 7(2), 1477–1493.
[17] Engelmann, B., Hayden, E. & Tasche, D. (2003). Testing rating accuracy, Risk 16(1), 82–86.
[18] Fisher, R.A. (1936). The use of multiple measurements in taxonomic problems, Annals of Eugenics 7, 179–188.
[19] Gentry, J.A., Newbold, P. & Whitford, D.T. (1985). Classifying bankrupt firms with funds flow components, Journal of Accounting Research 23(1), 146–160.
[20] Gujarati, N.D. (2003). Basic Econometrics, 4th Edition, McGraw-Hill, London.
[21] Hosmer, D.W. & Lemeshow, S. (2000). Applied Logistic Regression, 2nd Edition, John Wiley & Sons, New York.
[22] Karels, G.V. & Prakash, A.J. (1987). Multivariate normality and forecasting of business bankruptcy, Journal of Business Finance & Accounting 14(4), 573–593.
[23] Koh, H.C. (1992). The sensitivity of optimal cutoff points to misclassification costs of Type I and Type II errors in the going-concern prediction context, Journal of Business Finance & Accounting 19(2), 187–197.
[24] Lussier, R.N. (1995). A non-financial business success versus failure prediction model for young firms, Journal of Small Business Management 33(1), 8–20.
[25] Mc Leay, S. & Omar, A. (2000). The sensitivity of prediction models to the non-normality of bounded and unbounded financial ratios, British Accounting Review 32, 213–230.
[26] Ohlson, J. (1980). Financial ratios and the probabilistic prediction of bankruptcy, Journal of Accounting Research 18(1), 109–131.
[27] Platt, H.D. & Platt, M.B. (1990). Development of a class of stable predictive variables: the case of bankruptcy prediction, Journal of Business Finance & Accounting 17(1), 31–51.
[28] Sabato, G. (2008). Managing credit risk for retail low-default portfolios, in Credit Risk: Models, Derivatives and Management, N. Wagner, ed., Financial Mathematics Series, Chapman & Hall/CRC.
[29] Taffler, R.J. & Tisshaw, H. (1977). Going, going, gone – four factors which predict, Accountancy 88(1083), 50–54.
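As an illustration of the logit score of equation (1) and of the cost-based cutoff discussed under Model Implementation, the following is a minimal sketch rather than any bank's actual procedure. The function names, the toy sample, and the two cost figures are hypothetical. It also assumes the common convention in the default-prediction literature that a type I error is a defaulter who was accepted and a type II error is a nondefaulter who was rejected, and that clients with a score at or above the cutoff are rejected.

```python
import math


def logistic(d):
    """Numerically stable evaluation of 1 / (1 + exp(-d))."""
    if d >= 0:
        return 1.0 / (1.0 + math.exp(-d))
    e = math.exp(d)
    return e / (1.0 + e)


def logit_score(x, b0, b):
    """Equation (1): score P1 for one customer with attribute vector x."""
    d = b0 + sum(bj * xj for bj, xj in zip(b, x))  # the logit D_i
    return logistic(d)


def misclassification_cost(scores, defaulted, cutoff, cost_type1, cost_type2):
    """Total cost when clients with score >= cutoff are rejected."""
    cost = 0.0
    for s, d in zip(scores, defaulted):
        rejected = s >= cutoff
        if d and not rejected:
            cost += cost_type1  # assumed type I: accepted defaulter
        elif not d and rejected:
            cost += cost_type2  # assumed type II: rejected nondefaulter
    return cost


def best_cutoff(scores, defaulted, cost_type1, cost_type2):
    """Search observed scores (plus 'accept everyone') for the cheapest cutoff."""
    candidates = sorted(set(scores)) + [float("inf")]
    costed = (
        (c, misclassification_cost(scores, defaulted, c, cost_type1, cost_type2))
        for c in candidates
    )
    return min(costed, key=lambda pair: pair[1])


# Hypothetical scored sample: 1 marks a client who defaulted.
scores = [0.05, 0.10, 0.20, 0.40, 0.70, 0.90]
defaulted = [0, 0, 0, 1, 0, 1]
cutoff, cost = best_cutoff(scores, defaulted, cost_type1=5.0, cost_type2=1.0)
print(cutoff, cost)
```

Raising the assumed cost of a type I error relative to a type II error pushes the selected cutoff down, which is one way to see why the optimal threshold depends on each bank's own cost structure.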
information into a single measure of creditworthiness (riskiness) and then make that summary statistic (the credit rating) public, essentially providing a public good [47]. By contrast, a Moody’s KMV EDF makes use only of public information, although it transforms this information using a proprietary methodology.^g In fact, rating agencies are in the business of not just information production but, in the words of Boot et al. [12], they also act as “information equalizers” [quotes in the original]. In this way, they serve as a coordinating mechanism or focal point to the financial markets.

Model Performance

All credit scoring or rating models map a set of financial and nonfinancial variables into the unit interval: the objective is to generate a probability of default, to separate the defaulters from the nondefaulters. Unsurprisingly, there is a plethora of modeling choices as documented, for instance, by Resti and Sironi [41]. However, in the horse race of default prediction models, the hazard approach as shown in [16, 42] seems to be emerging as the winner. See [11] for a recent overview. While we can say little about the performance of bank internal credit scoring models, since they are proprietary, we can examine the empirical default experience of firms with a rating from a credit rating agency.

Highly rated firms default quite rarely. For example, Moody’s reports that the 1-year investment grade default rate over the period 1983–2007 was 0.069% or 6.9 bp [35]. This is an average over four letter grade ratings: Aaa through Baa. Thus in a pool of 10 000 investment grade obligors or instruments we would expect seven defaults over the course of 1 year. But what if only four default? What about 11? Higher than expected default could be the result of either bad luck or a bad model, and it is very hard to distinguish between the two, especially for small probabilities (see also [29] and Backtesting). Indeed the use of the regulatory color scheme (green, amber, red), which is behind the 1996 Market Risk Amendment to Basel I, was motivated precisely by this recognition, and in that case the probability to be validated is comparatively large, 1% (for 99% VaR) [8], with daily data.

Although rating agencies insist that their ratings scale reflects an ordinal ranking of credit risk, they also publish default rates for different horizons by rating. Thus we would expect default rates or probabilities to be monotonically increasing as one descends the credit spectrum. Using S&P rating histories, Hanson and Schuermann [23] show formally that monotonicity is violated frequently for most notch-level investment grade 1-year estimated default probabilities. The precision of the PD point estimates is quite low; there have been no defaults over 1 year for triple-A or AA+ (Aa1) rated firms, yet surely we do not believe that the 1-year probability of default is identically equal to zero. The new Basel Capital Accord (see Regulatory Capital), perhaps with this in mind, has set a lower bound of 3 bp for any PD estimate [10, §285], commensurate with about a single-A rating. Trück and Rachev [46] show the economic impact resulting from such uncertainty using bank internal ratings and a corresponding loan portfolio. Pluto and Tasche [40] propose a conservative approach to generating PD estimates for low-default portfolios.

Despite this lack of statistical precision, Kliger and Sarig [26] show that bond ratings contain price-relevant information by taking advantage of a natural experiment. On April 26, 1982, Moody’s introduced overnight modifiers to their rating system, much like the notching used by S&P and Fitch, effectively introducing finer credit rating information about their issuer base without any change in the firm fundamentals. They find that bond prices indeed adjust to the new information, as do stock prices, and that any gains enjoyed by bondholders are offset by losses suffered by stockholders.

Although the 1-year horizon is typical in credit analysis (and is also the horizon used in Basel II), most traded credit instruments have longer maturity. For example, the typical CDS contract (see Credit Default Swaps) is five years, and over that horizon there are positive empirical default rates for Aaa and Aa, which Moody’s reports to be 7.8 bp and 18.3 bp, respectively [35].

The preceding discussion highlights the difficulty of accurately forecasting such small PDs. Empirical estimates of PDs using credit rating histories can be quite noisy, even with over 25 years of data. Under the new Basel Capital Accord (Basel II), US regulators would require banks to have a minimum of seven nondefault rating categories [21].
4 Credit Rating
[17] Committee on the Global Financial System (2005). [36] Moody’s Investors Services (2008). Introducing Assump-
The Role of Ratings in Structured Finance: Issues tion Volatility Scores and Loss Sensitivities for Struc-
and Implications. Available at http://www.bis.org/publ/ tured Finance Securities, Moody’s Global Credit Policy,
cgfs23.htm, January. New York.
[18] Crouhy, M., Galai, D. & Mark, R. (2000). Risk Manage- [37] Nickell, P., Perraudin, W. & Varotto, S. (2000). Stability
ment, McGraw-Hill, New York. of rating transitions, Journal of Banking and Finance 24,
[19] Duffie, Darrell. (2007). Innovations in Credit Risk 203–227.
Transfer: Implications for Financial Stability. Stan- [38] Ong, M.K. (1999). Internal Credit Risk Models: Capital
ford University GSB Working Paper, available at Allocation and Performance Measurement, Risk Books,
http://www.stanford.edu/∼duffie/BIS.pdf. London.
[20] The Economist (2007). Measuring the Measurers. [39] Peterson, M.A. & Rajan, R. (2002). Does distance still
May 31. matter: the information revolution in small business
[21] Federal Reserve Board (2003). Supervisory Guidance lending, Journal of Finance 57, 2533–2570.
on Internal Ratings-Based Systems for Corporate [40] Pluto, K. & Tasche, D. (2005). Thinking positively, Risk
Credit. Attachment 2 in http://www.federalreserve.gov/ August, 76–82.
boarddocs/meetings/2003/20030711/attachment.pdf. [41] Resti, A. & Sironi, A. (2007). Risk Management and
[22] Gross, D. & Souleles, N. (2002). An empirical analysis Shareholders’ Value in Banking, John Wiley & Sons,
of personal bankruptcy and delinquency, Review of New York.
Financial Studies 15, 319–347. [42] Shumway, T. (2001). Forecasting bankruptcy more accu-
[23] Hanson, S.G. & Schuermann, T. (2006). Confidence rately: a simple hazard model, Journal of Business 74,
intervals for probabilities of default, Journal of Banking 101–124.
and Finance 30(8), 2281–2301. [43] Standard and Poor’s (2001). Rating Methodology: Eval-
[24] Kealhofer, S. & Kurbat, M. (2002). Predictive Merton uating the Issuer, Standard & Poor’s Credit Ratings,
models, Risk February, 67–71. New York.
[25] Kiefer, N. (2007). The probability approach to default [44] Standard and Poor’s (2007). Principles-Based Rating
estimation, Risk July, 146–150. Methodology for Global Structured Finance Securities,
[26] Kliger, D. & Sarig, O. (2000). The information value of Standard & Poor’s RatingsDirect Research, New York.
bond ratings, Journal of Finance 55(6), 2879–2902. [45] Stefanescu, C., Tunaru, R. & Turnbull, S. (2008).
[27] Lando, D. & Skødeberg, T. (2002). Analyzing ratings The Credit Rating Process and Estimation of Transition
transitions and rating drift with continuous observations, Probabilities: A Bayesian Approach. London Business
Journal of Banking and Finance 26(2/3), 423–444. School working paper.
[28] Löffler, G. (2004). Ratings versus market-based mea- [46] Trück, S. & Rachev, S.T. (2005). Credit portfolio risk
sures of default risk in portfolio governance, Journal of and PD confidence sets through the business cycle,
Banking and Finance 28, 2715–2746. Journal of Credit Risk 1(4), 61–88.
[29] Lopez, J.A. & Saidenberg, M. (2000). Evaluating credit [47] White, L. (2002). The credit rating industry: an indus-
risk models, Journal of Banking and Finance 24, trial organization analysis, in Ratings, Rating Agencies
151–165. and the Global Financial System, R.M. Levich, C. Rein-
[30] Marrison, C. (2002). Fundamentals of Risk Measure- hart & G. Majnoni, eds, Kluwer, Amsterdam, NL.
ment, McGraw Hill, New York. pp. 41–64.
[31] Mason, J.R. & Rosner, J. (2007). Where Did the Risk
Go? How Misapplied Bond Ratings Cause Mortgage
Backed Securities and Collateralized Debt Obligation Related Articles
Market Disruptions. Hudson Institute Working Paper.
[32] Merton, R.C. (1974). On the pricing of corporate debt: Collateralized Debt Obligations (CDO); Credit
the risk structure of interest rates, Journal of Finance
29, 449–470.
Migration Models; Credit Risk; CreditRisk+;
[33] Moody’s Investors Services (1999). Rating Methodol- Credit Scoring; Internal-ratings-based Approach;
ogy: The Evolving Meanings of Moody’s Bond Ratings, Rating Transition Matrices; Structured Finance
Moody’s Global Credit Research, New York. Rating Methodologies.
[34] Moody’s Investors Services (2007). Structured Finance
Rating Transitions: 1983–2006. Special Comment, ADAM ASHCRAFT & TIL SCHUERMANN
Moody’s Global Credit Research, New York.
[35] Moody’s Investors Services (2008). Corporate Default
and Recovery Rates: 1920–2007. Special Comment,
Moody’s Global Credit Research, New York.
Portfolio Credit Risk: Statistical Methods

This article gives a brief overview of statistical methods for estimating the parameters of credit portfolio models from default data. The focus is on models for default probability and correlations; for recovery rates, see Recovery Rate. First, a rather general model setting is introduced along the lines of the models of McNeil and Wendin [10] and others, who depict portfolio models as generalized linear mixed models (GLMMs). Then, we describe the most common estimation techniques, which are the method of moments and maximum likelihood. An excellent reference for other estimation techniques is [10], in particular for Bayes estimation.

A Single Obligor's Default Risk

be diversified away. The second is an idiosyncratic part $\varepsilon_{ti}$ with variance $\sigma^2$, which is specific to each firm and independent both across firms and of the systematic factor. The default threshold $c_{ti}$ is mostly modeled via credit ratings that reflect an aggregated summary of a firm's risk characteristics. In a simple case, both variables may be expressed as linear functions of the respective risk drivers, such that $V_{ti} = -\omega F_t + \varepsilon_{ti}$ and $c_{ti} = \alpha + \beta' x_{ti}$, where $\alpha$, $\beta$, and $\omega$ are unknown parameters, and $x_{ti}$ is a design vector consisting of observable covariates for obligor $i$, which may be time- and obligor-specific (such as balance sheet ratios) or only time-specific (such as macroeconomic variables). Then the firm defaults if $V_{ti} < \alpha + \beta' x_{ti}$. As shown in [7], the aforementioned credit risk models mainly differ in the distributional assumptions regarding the common and the idiosyncratic random factors driving the firm value, as discussed below.

The probability of the firm's default conditional on the random factor $f_t$ can be expressed as
PD ti = P (Dti = 1|xti ) = (α̃ + β̃ xti ) = (c̃ti ) × (1 − P (Dti = 1|xt , ft ))nt −dt (7)
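The threshold mechanism described above (default when the firm value falls below $\alpha + \beta' x_{ti}$) can be illustrated with a short simulation. This is only a sketch under assumptions that the text does not fix: the idiosyncratic part is taken to be normal with standard deviation `sigma` (a probit-type specification), a single covariate is used, and all parameter values are hypothetical. Under these assumptions the default condition $-\omega f_t + \varepsilon_{ti} < \alpha + \beta x_{ti}$ gives a conditional default probability of $\Phi((\alpha + \beta x_{ti} + \omega f_t)/\sigma)$.

```python
import random
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal distribution function


def conditional_pd(alpha, beta, omega, sigma, x, f):
    """P(default | f) when V = -omega*f + eps, eps ~ N(0, sigma^2) (assumed),
    and default occurs if V < alpha + beta*x."""
    return PHI((alpha + beta * x + omega * f) / sigma)


def simulate_default_rate(alpha, beta, omega, sigma, x, f, n_obligors, rng):
    """Share of n_obligors homogeneous firms that default for a given factor value f."""
    threshold = alpha + beta * x
    defaults = 0
    for _ in range(n_obligors):
        v = -omega * f + rng.gauss(0.0, sigma)  # firm value: systematic + idiosyncratic part
        if v < threshold:
            defaults += 1
    return defaults / n_obligors


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical parameter values, chosen only for illustration.
    args = dict(alpha=-2.0, beta=0.5, omega=0.3, sigma=1.0, x=0.2)
    for f in (-1.0, 0.0, 1.0):
        simulated = simulate_default_rate(f=f, n_obligors=50_000, rng=rng, **args)
        print(f, round(conditional_pd(f=f, **args), 4), round(simulated, 4))
```

Because the idiosyncratic parts are independent across firms, the simulated default rate of a large homogeneous segment stays close to the conditional default probability for each value of the factor; only the systematic factor moves the default rate from period to period.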
be employed for the case of the homogeneous portfolio with time-constant parameters where closed-form solutions for the estimators exist. In the GLMM model, more advanced numerical techniques have to be used. Here, we briefly describe the method of moments and the maximum-likelihood method. For Bayes estimation, we refer to [10] and the references cited therein.

Method of Moments

If the obligors in the portfolio or the segment are homogeneous and the parameters are constant, only two parameters are to be estimated, namely, the PD and the correlation. Gordy [7] applies the method of moments estimator to the probit model. He shows that expectation and variance of the conditional default probability are

$$E\left(CPD(F_t)\right) = PD \tag{9}$$

and

$$Var\left(CPD(F_t)\right) = \Phi_2\left(\Phi^{-1}(PD), \Phi^{-1}(PD), \rho\right) - PD^2 \tag{10}$$

where $\Phi_2(\cdot)$ is the bivariate normal cumulative distribution function for two random variates, with expectation zero and variance one each and correlation $\rho$. An unbiased estimator for the unconditional PD is given by the average default rate:

$$\bar{p} = \frac{1}{T}\sum_{t=1}^{T} p_t \tag{11}$$

The left-hand side of equation (10) can be estimated by the sample variance of the default rate:

$$s_p^2 = \frac{1}{T-1}\sum_{t=1}^{T} \left(p_t - \bar{p}\right)^2 \tag{12}$$

Given the two estimates, the asset correlation $\rho$ can be backed out numerically from equation (10). Gordy [7] also provides a finite sample adjustment for the estimator. However, this modified estimator turns out to perform similarly to the simple estimator [3].

Maximum-likelihood Method

In the limiting case (8), asymptotic maximum-likelihood estimators of the (homogeneous) PD and the (homogeneous) asset correlation can be derived as

$$\hat{\rho} = \frac{m_2/T - m_1^2/T^2}{1 + m_2/T - m_1^2/T^2} \tag{13}$$

$$\widehat{PD} = \Phi\left(T^{-1}\sqrt{1 - \hat{\rho}}\; m_1\right) \tag{14}$$

where $m_1 = \sum_{t=1}^{T} \Phi^{-1}(p_t)$ and $m_2 = \sum_{t=1}^{T} \left(\Phi^{-1}(p_t)\right)^2$ [4].

In the general case of the GLMM where obligors are heterogeneous, the log-likelihood is given via equation (6) as

$$l = \sum_{t=1}^{T} \ln \int \prod_{i \in t} P(D_{ti} = 1 \mid x_{ti}, f_t)^{d_{ti}} \left(1 - P(D_{ti} = 1 \mid x_{ti}, f_t)\right)^{1 - d_{ti}} \, dG(f_t) \tag{15}$$

As the log-likelihood function involves several integrals, it is numerically optimized w.r.t. the unknown parameters, for which several algorithms, such as the Newton–Raphson method, exist and are implemented in many statistical software packages. The integrals can be approximated by, for example, adaptive Gaussian quadrature as described in [12]. Under the usual regularity conditions, the resulting estimators asymptotically exist, are consistent, and are asymptotically normal. See [2], p. 243, for a detailed discussion. Applications and estimation results can, for instance, be found in [6, 8, 13, 14]. For the extension to higher dimensional random effects, there are also some approximation methods that can be used, particularly penalized quasi-likelihood (PQL) and marginal quasi-likelihood (MQL) [10].

Bayes Estimation

Finally, Bayes estimation can be used for estimation as thoroughly shown in [10]. The joint prior distribution of $F_t$, $\beta$ (including a constant) and some hyperparameters $\theta$ can be given as

$$p(\beta, F_t, \theta) = p(F_t \mid \theta) \cdot p(\theta) \cdot p(\beta) \tag{16}$$

where a priori independence between $\beta$ and $\theta$ is assumed. Mostly, Markov chain Monte Carlo methods are applied, which can deal with even more complex models than shown here, such as autocorrelated random effects or multifactor models. For a detailed description, we refer to [10].
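The two estimators for the homogeneous portfolio, the method of moments in equations (9) to (12) and the asymptotic maximum-likelihood estimator in equations (13) and (14), can be sketched in a few lines of Python. The sketch makes assumptions that are not spelled out here: the conditional default probability is taken in the one-factor probit form $\Phi((\Phi^{-1}(PD) - \sqrt{\rho} f)/\sqrt{1-\rho})$, the bivariate normal term in equation (10) is evaluated as $E[CPD(F)^2]$ by a simple trapezoidal rule, and $\rho$ is backed out by bisection. The function names and the sample default rates are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()  # Phi = cdf, Phi^{-1} = inv_cdf


def aml_estimates(default_rates):
    """Asymptotic ML estimates (rho_hat, pd_hat), equations (13)-(14).

    Every default rate must lie strictly between 0 and 1.
    """
    probits = [STD_NORMAL.inv_cdf(p) for p in default_rates]
    T = len(probits)
    m1 = sum(probits)
    m2 = sum(z * z for z in probits)
    var = m2 / T - m1 ** 2 / T ** 2
    rho_hat = var / (1.0 + var)
    pd_hat = STD_NORMAL.cdf(sqrt(1.0 - rho_hat) * m1 / T)
    return rho_hat, pd_hat


def cpd_variance(pd, rho, steps=2000):
    """Var(CPD(F)) as in equation (10), using Phi_2(c, c, rho) = E[CPD(F)^2].

    The expectation over the standard normal factor is computed with a
    trapezoidal rule on [-8, 8] (assumed one-factor probit form).
    """
    c = STD_NORMAL.inv_cdf(pd)
    lo, hi = -8.0, 8.0
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        f = lo + k * h
        cpd = STD_NORMAL.cdf((c - sqrt(rho) * f) / sqrt(1.0 - rho))
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * cpd * cpd * STD_NORMAL.pdf(f)
    return total * h - pd ** 2


def mom_estimates(default_rates):
    """Method-of-moments estimates (rho_hat, pd_hat), equations (10)-(12)."""
    T = len(default_rates)
    p_bar = mean(default_rates)                                   # equation (11)
    s2 = sum((p - p_bar) ** 2 for p in default_rates) / (T - 1)  # equation (12)
    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(60):  # bisection: Var(CPD) increases with rho
        mid = 0.5 * (lo + hi)
        if cpd_variance(p_bar, mid) < s2:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), p_bar


if __name__ == "__main__":
    # Hypothetical yearly default rates of a homogeneous segment.
    rates = [0.012, 0.018, 0.031, 0.009, 0.024, 0.045, 0.015]
    print("asymptotic ML:", aml_estimates(rates))
    print("method of moments:", mom_estimates(rates))
```

Both functions use only the observed default rates, which is why they apply only to the homogeneous case with time-constant parameters; the heterogeneous GLMM requires numerical maximization of the integrated likelihood in equation (15).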
Recovery Rate

Acharya et al. [1] find both theories to be at work in explaining recovery rates. Altman et al. [2] empirically estimate the relationship between recovery rates (y) and default rates (x) using one linear and three nonlinear specifications:

$$
\begin{aligned}
y &= 0.51 - 2.61\,x, & R^2 &= 0.51 \\
y &= 0.002 - 0.113\,\ln(x), & R^2 &= 0.63 \\
y &= 0.61 - 8.72\,x + 54.8\,x^2, & R^2 &= 0.65 \\
y &= \frac{0.138}{x^{0.29}}, & R^2 &= 0.65
\end{aligned}
\tag{1}
$$

All these specifications show a strong negative relationship between default rates and recovery rates.

Economic Features of Recovery Rates

There are several economic features of recovery rates that are important:

1. As described above, recovery rates are negatively correlated with default rates. This is the case when the data is examined historically as shown in [2] as well as when implied from the data, as in [10].
2. Recovery rates are highly variable and depend on regime (see [12]). They vary within rating and seniority class as well.
3. Seniority and industry are statistically significant determinants of recovery rates, as shown by Acharya et al. [1]. These authors also find that, in industries with high asset-specificity, recovery rates are lower.

Implied Recovery Rates

Recovery rates can also be implied from prices of certain credit derivatives. One then speaks of “implied” (or risk-neutral) recovery rates, which may not coincide with historically observed recovery rates. Recovery rate swaps are agreements to exchange a fixed recovery rate for the realized recovery rate, allowing the market's expected recovery rate to be directly recovered [5]. Digital credit default swaps (DDS) are credit default swaps (CDSs) where the recovery rates on default are prespecified, irrespective of the final recovery rate. Berd and Kapoor [6] show how recovery rates can be recovered using a DDS and a CDS. Finally, Pan and Singleton [14] and Das and Hanouna [10] use CDS with different maturities to extract default probabilities and recovery rates.

Approximately, if credit spreads are known, we may write the spread $s$ as a function of default probability ($\lambda$) and recovery rate ($\phi$): $s \approx \lambda(1 - \phi)$, implying that recovery may be written in a reduced-form setting as follows:

$$\phi = 1 - \frac{s}{\lambda} \tag{2}$$

More formalized and exact versions of this approximate relation may be derived from a CDS pricing model or a bond pricing model. Recovery may also be derived in the class of Merton [13] models. The expression for recovery rate is

$$
\begin{aligned}
E[\phi] &= E\left[\frac{V_T}{D} \,\Big|\, V_T < D\right] = \frac{1}{D}\, E\left[V_T \mid V_T < D\right] \\
&= \frac{V_0}{D}\, e^{rT} \left\{1 - N(d_1)\right\} \\
d_1 &= \frac{\ln(V_0/D) + \left(r + \tfrac{1}{2}\sigma_V^2\right) T}{\sigma_V \sqrt{T}}
\end{aligned}
\tag{3}
$$

where $\{V_0, \sigma_V\}$ are the initial value and volatility of the firm, $D$ is the face value of debt with maturity $T$, and $r$ is the risk-free interest rate. $N(\cdot)$ is the normal distribution function.

References

[1] Acharya, V., Bharath, S.T. & Srinivasan, A. (2007). Does industry-wide distress affect defaulted firms? Evidence from creditor recoveries, Journal of Financial Economics 85(3), 787–821.
[2] Altman, E., Brady, B., Resti, A. & Sironi, A. (2004). The link between default and recovery rates: theory, empirical evidence and implications, Journal of Business 76(6), 2203–2227.
[3] Altman, E., Resti, A. & Sironi, A. (2003). Default Recovery Rates in Credit Risk Modeling: A Review of the Literature and Empirical Evidence, working paper, New York University.
[4] Bakshi, G., Madan, D. & Zhang, F. (2001). Recovery in Default Risk Modeling: Theoretical Foundations and Empirical Applications, working paper, University of Maryland.
[5] Berd, A.M. (2005). Recovery swaps, Journal of Credit Risk 1(3), 1–10.
[6] Berd, A. & Kapoor, V. (2002). Digital premium, Journal of Derivatives 10(3), 66.
[7] Carayon, J.-M., West, M., Emery, K. & Cantor, R. (2008). European Corporate Default and Recovery Rates, 1985–2007, Moody's Investors Service.
[8] Chew, W.H. & Kerr, S.S. (2005). Recovery ratings: a new window on recovery risk, in Standard and Poor's: A Guide to the Loan Market, Standard and Poor's.
[9] Christensen, J. (2005). Joint Estimation of Default and Recovery Risk: A Simulation Study, working paper, Copenhagen Business School.
[10] Das, S.R. & Hanouna, P. (2009). Implied Recovery, forthcoming, Journal of Economic Dynamics and Control.
[11] Guo, X., Jarrow, R. & Zeng, Y. (2005). Modeling the Recovery Rate in a Reduced Form Model, working paper, Cornell University.
[12] Hu, W. (2004). Applying the MLE Analysis on the Recovery Rate Modeling of US Corporate Bonds, Master's Thesis in Financial Engineering, University of California, Berkeley.
[13] Merton, R.C. (1974). On the pricing of corporate debt: the risk structure of interest rates, The Journal of Finance 29, 449–470.
[14] Pan, J. & Singleton, K. (2008). Default and recovery implicit in the term structure of sovereign CDS spreads, Journal of Finance 63, 2345–2384.
[15] Shleifer, A. & Vishny, R. (1992). Liquidation values and debt capacity: a market equilibrium approach, Journal of Finance 47, 1343–1366.
[16] Tennant, J., Emery, K., Cantor, R., Elliott, J. & Cahill, B. (2007). Default and Recovery Rates of Asia-Pacific Corporate Bond and Loan Issuers, Excluding Japan, 1990–1H200, Moody's Investors Service.
[17] Varma, P., Cantor, R. & Hamilton, D. (2003). Recovery Rates on Defaulted Corporate Bonds and Preferred Stocks, 1982–2003, Moody's Investors Service.

Related Articles

Credit Default Swaps; Credit Risk; Exposure to Default and Loss Given Default; Recovery Swap.

SANJIV R. DAS & PAUL HANOUNA
Internal-ratings-based Approach

Within the new Basel capital rules for banks (see Regulatory Capital), the internal-ratings-based approach (IRBA) represents perhaps the most important innovation for regulatory minimum capital requirements. For the first time, subject to supervisory approval, banks are allowed to use their own risk assessments of credit exposures in order to determine the capital to be held against them. Within the IRBA, banks estimate the riskiness of each exposure on a stand-alone basis. The risk estimates serve as input for a supervisory credit risk model (implicitly given by risk weight functions) that provides a value for capital that is deemed sufficient to cover against the credit risk of the exposure, given the assumed portfolio diversification. In order to obtain supervisory approval for the IRBA, banks must apply for IRBA and fulfill a set of minimum requirements. Until approval is granted for the entire book or specific portfolios, banks must apply the simpler and less risk-sensitive standardized approach for credit risk, where minimum capital requirements are determined in dependence on asset class (sovereign, bank, corporate, or retail exposure) only and, if applicable, ratings by external credit assessment agencies like rating agencies or export credit agencies.

The Conception of Internal Rating Systems in Basel II

Bank internal rating systems are, in the most general sense, risk assessment procedures, which are used for the assignment of internally defined borrower and exposure categories. A rating system is based on a set of predefined criteria to be evaluated for each borrower or exposure subject to the system, and results in a final “score” or “rating grade” for the borrower or exposure. The choice and weighting of the criteria can be manifold; there are no rules or guidance on which criteria to include or exclude. The main requirement on IRBA systems is that their rating grades or scores do indeed discriminate borrowers according to credit default risk.

In practice, rating systems are often designed as purely or partly statistical tools, for example, by identifying criteria (typically financial ratios) with good discriminatory power and combining them by means of statistical regression or other mathematical methods.

However, in order to use such tools, there must be sufficient historical data, both on defaulted and surviving borrowers and exposures, for determining the discrimination criteria and calibrating their weightings. In practice, obtaining such data often proves to be more difficult than the statistical analysis as such, either because historically borrower and exposure characteristics were not stored in a readily usable manner, or simply because for some portfolios there is not sufficient default data. In general, rating systems may include a set of quantitative and some qualitative criteria. The weighting of these criteria may also be determined by expert opinion rather than by statistical tools. In the extreme, for example, in international project finance where certain criteria are deal breakers for loan arrangements (e.g., the existence of sovereign risk coverage via export insurance for projects in regions with high political risk), there might be no predetermined weighting scheme at all.

Notions appear to be not entirely uniform in practice. Often, but not always, the notion of a “scoring system” or a “score card” is used for a purely statistical rating system or the statistical part of a mixed quantitative and qualitative rating system. Moreover, the notion of scores tends to be more often used for retail and small business portfolios, while for corporate, bank, and sovereign portfolios, the literature tends to speak of rating systems. From an IRBA perspective, there are no conceptual differences between these notions: they all depict different forms of IRBA systems. Likewise, there is no IRBA requirement for the number of rating systems a bank should apply. Usually, one would expect different systems for retail, small businesses and self-employed borrowers, corporates, specialized lending portfolios, sovereigns, and banks. Many of these asset classes might again see different rating systems, depending on, for example, product type (very common for retail portfolios, but not constrained to them) or sales volume and region (both common for corporate portfolios), because the different borrower and exposure categories might call for different sets of rating criteria. Within a large, internationally active and well-diversified bank, one might expect to see a large number of different rating systems.
• N denotes the standard normal distribution function;
• G denotes the inverse standard normal distribution function;
• the effective maturity M was fixed at 1 year for retail exposures, and assumes values between 0 and 5 years for other exposures as described in detail in paragraph 320 of [3].

The risk weight functions for the different exposure classes differ mostly in the specification of the asset correlation R. For the retail mortgage exposure class, R was fixed at 15%. For revolving retail credit, R is 4%. In contrast, in the corporate, sovereign, and bank exposure classes, R depends on PD by

$$R = 0.12\,\frac{1 - e^{-50\,PD}}{1 - e^{-50}} + 0.24\left(1 - \frac{1 - e^{-50\,PD}}{1 - e^{-50}}\right) \tag{2}$$

and also for other retail exposures, R is given as a function of PD by

$$R = 0.03\,\frac{1 - e^{-35\,PD}}{1 - e^{-35}} + 0.16\left(1 - \frac{1 - e^{-35\,PD}}{1 - e^{-35}}\right) \tag{3}$$

2. The capital charge per exposure depends only on the risk parameters PD, LGD, and EAD of the exposure, but not on the portfolio composition. Thus, the capital charge for each exposure is exactly the same, no matter which portfolio it is added to (portfolio invariance). From a supervisory point of view, portfolio invariance was an important characteristic in developing the Basel risk weight functions, as it ensures computational simplicity and the preservation of a level playing field between well diversified and specialized banks. The capital charge for the entire portfolio is the sum of the capital charges for individual exposures. The downside of portfolio invariance is that the Basel formula cannot account for risk concentrations. If it did, the capital charge for an exposure would again have to depend on the portfolio to which it is added. If banks are concerned about concentration effects or want to measure the diversification benefits of their portfolio, they need to develop their own, fully fledged credit risk models.
3. The capital charge for each exposure, given its risk parameters, depends on the correlation with the single systematic risk factor and the so-called confidence level. The confidence level for minimum capital requirements was set by the Basel Committee to be 99.9%. As a consequence, the probability that the bank will suffer losses from the credit portfolio that exceed the capital requirements should be of the order of 0.1%. The correlations were estimated from supervisory databases and are assumed to decrease with decreasing creditworthiness.
4. The ASRF takes only the default event as stochastic and treats the loss in case of default as deterministic. As in practice loss amounts are stochastic as well, and potentially correlated with the drivers of the default events, banks are supposed to take account of this effect in their LGD estimates, by estimating downturn LGDs instead of average LGDs.
5. Lastly, the ASRF is a default mode (DM) model that only accounts for losses due to defaults within a given time horizon (1 year) but not for losses due to rating migrations and future losses after 1 year. This simplification does not comply with modern accounting practice. It was therefore adjusted by introducing the maturity adjustments, which can be seen as an extension of the model toward a mark-to-market (MtM) mode.

Minimum Requirements

In order to apply the IRBA, banks must have explicit approval from their supervisors. Approval is subject to a set of minimum requirements aimed to ensure the integrity of the rating model, rating process, and thus of the risk parameters and capital charges. The minimum requirements ([3], Part 2, Section III. H) hence assemble around the following themes:

• Rating system design. As mentioned before, there are no regulatory requirements with regard to the rating criteria. Rating grades must, in a sensible way, discriminate for credit risk, and the onus of proof is with the bank. Moreover, there must be at least seven rating grades for performing and one grade for nonperforming
exposures in the PD dimension. No minimum grade numbers are given for the LGD and EAD dimension. Also, there is no requirement of a common master scale across all rating systems, although many banks develop such a scale for internal risk management and communication purposes.
• Rating system operations. By this set of minimum requirements, banks are asked to ensure the integrity of the rating process. Most notably, the rating assignments must be independent from any business units gaining from credit approval (e.g., the sales department). Moreover, there should be no “cherry picking” between rated and nonrated exposures (the latter being treated in the less risk-sensitive standardized approach), although a temporary partial use of IRBA, coupled with a supervisory approved implementation (roll-out plan) for bankwide IRBA use, and a permanent partial use for insignificant portfolios are allowed. Another important aspect is the integration of the ratings into day-to-day credit processes, including IT systems and input data availability.
• Corporate governance and oversight. This set of criteria requires banks to embed their rating systems into the overall governance structure of the bank. Most notably, senior management is supposed to buy into the systems and formally approve them for wide use within the bank, such that the systems become accepted risk management tools at all levels in the organization. Also, the role of internal audit in regular rating audits is defined.
• Use of internal ratings. Banks will only receive IRBA approval if they use their ratings for a wide range of bank internal applications. Examples include credit approval, limit systems, risk-sensitive pricing and loss provisioning. Rating systems solely developed for regulatory purposes will not be recognized, as only the deep rooting into day-to-day credit risk management actions will ensure their integrity.
• Risk quantification. Banks need to quantify the risk parameters PD, LGD, and EAD, based on their rating grades and on the Basel default and loss definitions ([3], paragraphs 452, 453, and 460). In doing so, they should employ a variety of data sources: preferentially internal data, but enhanced with external data sources and expert judgment if needed.
• Validation of internal estimates. The PD, LGD, and EAD estimates must be validated against actually observed default rates and losses. Owing to relatively short time series for the latter, validation remains one of the more difficult issues within the IRBA. For available statistical techniques see, for example, [2]. Where statistical validation is not reliable, banks should use more qualitative validation techniques, like ensuring good rating process governance, integrity of the input data, and so on.
• Disclosure requirements. Banks that use the IRBA must base their capital and risk disclosure requirements (the Third Pillar of Basel II) on their IRBA figures.

In practice, compliance with the minimum requirements often seems to prove much more difficult and costly than the development of the rating systems as such. The most difficult issues seem to be data availability, IT system implementation, and data feed, the actual rating of entire portfolios, which often requires large amounts of data for all exposures to be fed into the systems (in the worst case manually, as often data not consistent with the rating criteria have been stored in the past) and, connected to this, the buy-in of senior management and the entire credit business into the more risk-sensitive and more transparent IRBA.

Implications for the Bank Internal Use of IRBA Figures

Risk quantification via IRBA can be of great use for the bank internal credit risk measurement and management. However, there are some limitations. The most important of these is surely that due to the asymptotic single risk factor model, the IRBA provides no measure of risk concentrations, be they single name, industry, or regional concentrations. If banks are concerned about concentration effects or, as the other side of the same coin, want to measure the diversification benefits of their portfolio, they need to go further and develop their own, full-fledged credit risk models with more than one risk factor and their own correlation estimates. Likewise, the asymptotic assumption needs to be given up in order to capture idiosyncratic single name risk.
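The absence of any concentration measure can be made concrete with a short Python sketch. It evaluates the PD-dependent asset correlations of equations (2) and (3) and a per-exposure capital charge, and shows that one large loan and many small loans with the same risk parameters receive the same total charge. The capital formula used here, LGD times the difference between the conditional default probability at the 99.9% confidence level and PD, is the usual asymptotic single risk factor expression and is stated as an assumption; the maturity adjustment is omitted, and the function names and portfolio figures are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist

N = NormalDist().cdf      # standard normal distribution function
G = NormalDist().inv_cdf  # inverse standard normal distribution function


def _pd_weight(pd, k):
    """Weight (1 - exp(-k*PD)) / (1 - exp(-k)) shared by equations (2) and (3)."""
    return (1.0 - exp(-k * pd)) / (1.0 - exp(-k))


def corporate_correlation(pd):
    """Asset correlation R for corporate, sovereign, and bank exposures, equation (2)."""
    w = _pd_weight(pd, 50.0)
    return 0.12 * w + 0.24 * (1.0 - w)


def other_retail_correlation(pd):
    """Asset correlation R for other retail exposures, equation (3)."""
    w = _pd_weight(pd, 35.0)
    return 0.03 * w + 0.16 * (1.0 - w)


def capital_requirement(pd, lgd, r, confidence=0.999):
    """Capital per unit of EAD (assumed ASRF form, no maturity adjustment)."""
    conditional_pd = N((G(pd) + sqrt(r) * G(confidence)) / sqrt(1.0 - r))
    return lgd * (conditional_pd - pd)


def portfolio_capital(exposures):
    """Sum of stand-alone charges for (ead, pd, lgd) tuples (portfolio invariance)."""
    return sum(
        ead * capital_requirement(pd, lgd, corporate_correlation(pd))
        for ead, pd, lgd in exposures
    )


if __name__ == "__main__":
    one_big_loan = [(100.0, 0.01, 0.45)]
    hundred_small_loans = [(1.0, 0.01, 0.45)] * 100
    print(portfolio_capital(one_big_loan))
    print(portfolio_capital(hundred_small_loans))
```

Because the portfolio charge is a plain sum of stand-alone charges, the concentrated and the granular portfolio are indistinguishable in this calculation, which is why a bank interested in concentration or diversification effects needs a separate portfolio model.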
The most significant benefit of the IRBA for bank internal risk management lies in the standardized assessment and measurement of stand-alone borrower and exposure credit risk. Credit risk becomes much more transparent within the organization, and there is “one common currency” for risk, expressed by the risk parameters PD, LGD, and EAD and the regulatory capital charges based on them.

References

[1] BCBS (2004). An Explanatory Note on the Basel II IRB Risk Weight Functions. Basel Committee on Banking Supervision.
[2] BCBS (2005). Studies on the Validation of Internal Rating Systems. Basel Committee on Banking Supervision, Working Paper No. 14.
[3] BCBS (2006). International Convergence of Capital Measurement and Capital Standards. A Revised Framework, Comprehensive Version. Basel Committee on Banking Supervision.
[4] Gordy, M. (2003). A risk-factor model foundation for ratings-based bank capital rules, Journal of Financial Intermediation 12(3), 199–232.
[5] Merton, R.C. (1974). On the pricing of corporate debt: the risk structure of interest rates, Journal of Finance 29(2), 449–470.
[6] Vasicek, O.A. (2002). The distribution of loan portfolio value, Risk 15, 160–162.

Related Articles

Credit Rating; Credit Risk; Credit Scoring; Economic Capital; Exposure to Default and Loss Given Default; Large Pool Approximations; Regulatory Capital.

KATJA PLUTO & DIRK TASCHE
Exposure to Default and Loss Given Default

In the study of credit risk, the most relevant factor has traditionally been the borrower's probability of default (or intensity of default), expressing default risk and, indirectly, migration risk. However, there are other risk profiles that significantly affect the loss experienced by the lender upon the occurrence of a default: exposure at default and loss given default. The uncertainty surrounding these variables gives rise, respectively, to exposure risk and recovery risk. These risks (captured through parameters like EAD, LGD, and RR, as explained below) have become increasingly popular thanks to the preliminary drafts of the new accord on bank capital requirements (“Basel II”) that were circulated by the Basel Committee after 1999 and led to a new regulatory text in 2004 [12].

Exposure at Default and Exposure Risk

In the simplest forms of credit exposure, the amount due to the lender in the event of a default (that is, the exposure at default (EAD)) is known with certainty. This is the case, for example, of zero-coupon bonds or fixed-term loans, where the balance outstanding is predetermined and cannot be modified without a formal credit restructuring.

However, the amount outstanding in the event of a default might also be uncertain, basically due to the following reasons:

1. changes in the value of the contract to which the defaulted party had committed itself (typically, an OTC derivative affected by a number of underlying variables);
2. the presence of a revolving credit line (e.g., a loan commitment) where the borrower could increase his/her credit usage before default.

While case 1, known as counterparty risk, can be considered as a sort of intersection between credit and market risk, case 2 represents a typical example of exposure risk. Here, the borrower's current exposure (that is, the drawn part of the credit line, DP) can increase to a larger EAD, with the increase (E) potentially as large as the current unused portion (UP) of the credit line. To account for exposure risk, banks compute credit conversion factors (CCF) as CCF ≡ E/UP.ᵃ Once a set of CCFs, associated with different types of borrowers and exposures, has been estimated, a bank can forecast EAD as

$$EAD = DP + CCF \cdot UP \tag{1}$$

CCFs are usually calibrated through a statistical analysis of past defaults (see, e.g., [9, 11, 25, 27]), where the CCF is explained through the characteristics of the borrower, the exposure, and the economic environment. When past events are analyzed, the UPs must be recorded some time before the default: this can be a fixed interval (“fixed time horizon,” e.g., 12 months before the default) or a fixed moment in time for all defaults that occurred in the same period (“cohort approach,” e.g., January 1 for all exposures defaulted in a given year); multiple UPs can also be recorded at several different instants in time (“variable time horizon,” e.g., 6, 12, 24 months before default) to assess the impact of time-to-default on exposure risk.ᵇ

In fact, CCFs can be expected to increase with time-to-default: a study based on some 400 borrowers in the period 1995–2000 [9] has shown that one-year CCFs average 32%, while five-year CCFs average 72%; this may be due to a rating migration effect and a greater opportunity to draw down. CCFs also seem to be driven by the percent usage ratio of the credit line (DP/(DP + UP)): lower usage rates are usually associated with higher CCFs and with better ratings [9].

A well-known relationship also exists between ratings and CCFs: indeed, the latter have often been found to increase for borrowers with better ratings.ᶜ In other words, exposure risk is especially significant when default risk is comparatively low. This is an expected result, given that firms with investment-grade ratings can get funds from the commercial paper market or by negotiating better terms with their suppliers, and hence tend to use a small portion of the available credit lines (which are comparatively more expensive); however, as their financial condition deteriorates and default gets closer, firms quickly resort to bank credit lines, as other sources of funds dry up.

Besides focusing on loan behavior at default, one can assess exposure risk by monitoring credit usage throughout the life of a facility, including both defaulted and performing exposures. These “usage
ratios” have been found to behave very differently for place. Curiously, however, Basel II states that CCFs
firms that eventually default, even several years later, cannot be set below zero, regardless of any empirical
as opposed to nondefaulting obligors. For example, evidence that a bank may produce to its supervisors.
a sample of about 770 000 committed credit Apart from OTC derivatives and credit lines,
lines recorded in the Spanish central credit register (note d) exposure risk can also arise from the issuance of
shows that defaulting exposures have a median usage guarantees and other off-balance-sheet items (e.g.,
ratio of 50%, in contrast to 43% for nondefaulting letters of credit, bid bonds, and performance bonds)
facilities; this median usage ratio was found to that might be used by third parties to get relief
increase (71%) in the last year before default. Usage after the default of the guaranteed entity (leading
ratios are instead lower, all other things being equal, to monetary outflow for the guarantor, that is, to
for “seasoned” credit lines (i.e., credit lines that an EAD). In this case, the EAD can be anywhere
have been in place for a number of years); this between zero and the amount of the off-balance
suggests that relationship banking may play a role sheet item (OBS), and CCFs can be computed as
in preventing usage peaks in credit lines. CCF ≡ EAD/OBS . CCF estimates associated with
Other borrower characteristics may also help different types of guarantees and OBS can then be
explain exposure risk: for example, usage ratios have used to forecast the EAD as EAD = CCF · OBS
been found to be higher for younger, smaller, and less
profitable firms (as age, size, and profitability tend
to be inversely related to PD, this is consistent with Loss Given Default and Recovery Risk
poorly rated companies being more dependent on
bank credit lines).e Other important explanatory vari- The loss rate given default—or simply loss given
ables are the borrower’s leverage, liquidity, and debt default (LGDg )—is the loss rate experienced by a
cushion; also, exposure risk tends to be higher for lender on a credit exposure if the borrower defaults.
larger companies and for those having a larger share It is given by 1 minus the recovery rate (RR) (see
of bank debt in their liabilities mix [25]. However, equation (3)) and can take any value between 0 and
generally, firm characteristics tend to have a compar- 100%. Formally,
atively limited impact on CCFs and usage statistics.
Exposure risk also seems to be affected by the LGD = 1 − RR (2)
macroeconomic cycle. For example, the gross domes-
tic product (GDP) growth rate has been found [27] LGD is never known when a new loan is issued,
to be inversely related to credit line usage, and such although a reasonable estimate can be produced when
a link is especially meaningful in the case of a slow- the default occurs, at least if there is a secondary
down or recession. This makes sense, as credit lines market where the defaulted exposure can be traded.
are often used to provide a liquidity buffer for bor- In fact, RRs can be computed based on several
rowers in times of financial strain. approaches [8,33]:
Other measures have been proposed as an alter-
native to CCFs: these are the EAD factor, EADF = 1. The market LGD approach uses prices of
EAD/(DP + UP ), and the exposure multiplier, defaulted exposures as an estimate of the RR. In
EM = EAD/DP . The former can be considered as practice, if a defaulted bond trades at 30 cents a
a special case of the usage ratio, recorded at the euro, one can infer that the market is estimating a
time of default; the latter cannot be computed when a 30% RR (hence, a 70% LGD). This approach can
credit line was totally undrawn before the borrower’s be used only for exposures traded on a secondary
defaultf . market.
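The exposure measures defined above are simple ratios, and the market LGD reading of a defaulted bond price follows directly from equation (2). The following is a minimal Python sketch, not a reference implementation. It assumes that the CCF is taken as E/UP with E = EAD − DP; the function names and the sample figures are hypothetical.

```python
def exposure_measures(dp, up, ead):
    """Return (CCF, EADF, EM) for one credit line (illustrative helper).

    dp  -- drawn part of the line before default (DP)
    up  -- undrawn part of the line before default (UP)
    ead -- exposure at default (EAD)
    """
    e = ead - dp                          # increase in exposure, E
    ccf = e / up if up != 0 else None     # assumption: CCF = E / UP
    eadf = ead / (dp + up)                # EADF = EAD / (DP + UP)
    em = ead / dp if dp != 0 else None    # EM is undefined for a fully undrawn line
    return ccf, eadf, em


def market_lgd(price_per_unit_face):
    """Market LGD from a defaulted bond price, using LGD = 1 - RR."""
    return 1.0 - price_per_unit_face


# Hypothetical line: limit 100, of which 40 drawn; 70 outstanding at default.
ccf, eadf, em = exposure_measures(dp=40.0, up=60.0, ead=70.0)
print(ccf, eadf, em)

# A defaulted bond trading at 30 cents per euro of face value.
print(market_lgd(0.30))
```

Calling `exposure_measures(40.0, 60.0, 30.0)` gives a negative CCF, which corresponds to a line that the bank managed to reduce before default.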
CCFs can usually be expected to lie between 0 A variation of this approach (emergence LGD
(if the UP is still unused at default) and 1 (if the approach) estimates the RR on the basis of the
whole UP gives rise to an extra exposure). However, market value of the new financial instruments
the E and hence the CCF could also be negative; (usually, shares or long-term bonds) that are
this is likely to be the case if the credit line is offered to lenders in exchange for their defaulted
revocable or has some covenant entitling the bank claims. These are usually issued only when the
to claim its money back before a proper default takes restructuring process is over and the company
emerges from default; their market price must, therefore, be discounted back in time to the moment when the default took place, using an adequate discount rate.
A third version of market LGD involves the use of spreads on performing bonds as a source of information; in fact, spreads on corporate bonds depend on both the borrower’s PD and the expected RR. Assuming the PD can be estimated otherwise, one can then work out the LGD implied by market spreads (implicit market LGD); alternatively, by assuming that some relationship exists between PD and LGD (see below), PD and LGD can be derived jointly [13]. Note that implicit market LGD makes it possible to use a considerably larger dataset, including performing exposures, and not only defaulted ones. However, LGDs derived from market prices are often risk-neutral quantities; therefore, some assumption on the relationship between them and real-world LGDs is needed if implicit market LGDs are to be used.
2. When market data are not available (as for most traditional banking loans, where no secondary market exists) one must turn to the workout approach. This is based on the actual recoveries (and recovery costs) experienced by the lender in the months (years) after the default took place. It therefore requires setting up a database where all recoveries on defaulted exposures are filed. According to this approach, the RR (also known as ultimate recovery) can be computed using the following equation:
\[ RR = \frac{\sum_i R_i \,(1 + r)^{-T_i}}{EAD} \qquad (3) \]
where R_i is the ith recovery flow associated with the defaulted exposure (negative R_i denote recovery costs), r is the appropriate discount rate (note h), and T_i is the time elapsed between the default and the ith recovery. Note that, based on equation (3), RR can be negative (hence LGD can exceed 100%) if recoveries do not offset recovery costs.
The determinants of RRs have been extensively investigated, mainly based on the market LGD approach, although some examples of workout LGDs exist (mainly for bank loans). Indeed, one of the first studies estimating RRs in the 1970s [6] was based on a survey carried out among the workout departments of a number of large banks in the period 1971–1975; the average recovery on unsecured loans (based on the face value of cash flows on defaulted exposures, recorded in the first three years after default and not discounted) was found to be about 30%.
In the following years, recoveries on bank loans have been found (note i) to be affected by many factors, including the size of the loan and different collateral types. More generally, the four main drivers of RRs and LGDs can be summarized as follows:
Exposure characteristics
These include the presence of any collateral (be it represented by financial assets or other goods, such as plants, real estate, inventories) and its degree of effectiveness (that is, how easily it can be seized and liquidated); the priority level of the exposure, which can be senior or subordinated to other exposures; any guarantees provided by third parties (like banks, holding companies, or public sector entities). An important driver of recoveries is also the exposure’s “debt cushion”, that is, the amount of the liabilities in the borrower’s balance sheet that are junior to the one being evaluated; as the volume of such junior securities increases, so does the RR on the senior exposure, as its holders are more likely to find an adequate volume of assets to be liquidated and used as a source of cash [28,34].
Borrower characteristics
These include the industry where the company operates, which may affect the liquidation process, that is, the ease with which the firm’s assets can be sold and turned into cash for the creditors (note j); the country of the obligor, which affects the speed and effectiveness of the bankruptcy procedures; some financial ratios, like the leverage (namely, the ratio between total assets and liabilities, which shows how many euros of assets are reported in the balance sheet for each euro of debt to be paid back) and the ratio of EBITDA (earnings before interest, taxes, depreciation, and amortization) to total turnover (which indicates whether the defaulted company is still capable of generating an adequate level of cash flow for its would-be borrowers). Another interesting variable affecting LGD is the borrower’s original rating: indeed, “fallen angels” (i.e., investment-class obligors that were downgraded to junk) appear to behave
differently from straight speculative-grade issuers, [15], as well as junk bond data for 1982–2000.n
and have been found to recover significantly more Evidence of a strong relationship between LGD and
than bonds of the same seniority that were rated as the state of the economy, including default frequen-
speculative-grade at issuance.k cies, is also found by Moody’s KMV in its LossCalc
model [23], estimated on a dataset of over 3000
Lender (e.g., bank) characteristics recoveries on loans, bonds, and preferred stock.
These may include the efficiency levels of the depart- The correlation between economic cycle and
ment that takes care of the recovery process (workout recoveries appears stronger if estimated at the indus-
department) or the frequency with which out-of- try level [1]. In fact, if the sector where the borrower
court settlements are reached with the borrowers, or used to operate is undergoing a recession, the lender
nonperforming loans are spun-off and sold to third will find it more difficult to find a buyer for the
parties; in fact, sales of nonperforming loans and out- defaulted company or its assets (as competitors are
of-court settlements, while reducing the face value of likely to suffer from excess production capacity) and
the recovery (compared to what could be obtained by recoveries will be lower than expected. As recessions
the bank on the basis of a formal bankruptcy proce- may occur at the industry level when the econ-
dure), also significantly shorten the duration of the omy as a whole is doing reasonably well, moving
recovery process. The financial effect of this shorter from economy-wide to industry-specific conditions
recovery time usually more than offsets the lower can make the empirical link between default rates
recovered amount. and recoveries much easier to detect.
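The trade-off between a lower face-value recovery and a shorter recovery period can be checked numerically with equation (3). The sketch below is illustrative only: the cash flows, the 10% discount rate, and the function name `workout_rr` are hypothetical and not taken from any study cited here.

```python
def workout_rr(flows, r, ead):
    """Recovery rate as in equation (3).

    flows -- iterable of (R_i, T_i): recovery amount (negative for recovery
             costs) and time in years elapsed since default
    r     -- annual discount rate
    ead   -- exposure at default
    """
    pv = sum(amount * (1.0 + r) ** (-t) for amount, t in flows)
    return pv / ead


EAD = 100.0
RATE = 0.10  # hypothetical risk-adjusted discount rate

# Hypothetical formal bankruptcy: higher face-value recovery, but late and costly.
bankruptcy = [(-5.0, 1.0), (-5.0, 2.0), (70.0, 4.0)]
# Hypothetical out-of-court settlement: lower face value, cashed in after one year.
settlement = [(-2.0, 0.5), (55.0, 1.0)]

rr_bankruptcy = workout_rr(bankruptcy, RATE, EAD)
rr_settlement = workout_rr(settlement, RATE, EAD)
print(f"bankruptcy: RR={rr_bankruptcy:.3f}, LGD={1 - rr_bankruptcy:.3f}")
print(f"settlement: RR={rr_settlement:.3f}, LGD={1 - rr_settlement:.3f}")
```

With these made-up figures the settlement recovers less in face value (53 against 60, net of costs) but more in present value. The same function returns a negative RR when costs exceed recoveries.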
The PD/LGD correlation has wide-ranging impli-
Macroeconomic variables cations for credit risk models. First, the expected
These mainly include the level of the interest rates loss rate can no longer be considered as the product
(higher rates reduce the present value of recoveries) of the expected LGD times the borrower’s uncondi-
and the state of the economic cycle (if the economy is tional PD, since a second, positive addendum must
in recession, the value at which the companies’ assets be factored in, accounting for covariance. Second,
can be liquidated is likely to be lower). unexpected loss and Value at Risk prove to be con-
In recent years, an important stream of siderably higher than they are if independence is
research has addressed the relationship between PD assumed, as shown by [7]; in other words, if system-
and LGD. From a theoretical point of view, the same atic risk plays an important role for RRs, estimates of
macroeconomic background variables that affect the economic capital turn out to be downward biased.o
default probability of the borrowers (and cause While most RR studies focus on mean or median
default rates to rise) may drive down the liquida- values, it is also important to understand the whole
tion value of assets and increase LGD (so that the probability distribution of recoveries, if extreme sce-
distribution of LGDs is different in high-default and narios are to be fully understood and managed. In
low-default periods).l This intuition has prompted a the case of bank loans, the probability distribution
number of modelsm generalizing the “classic” single- of workout LGDs is usually strongly bimodal, with
factor model in [17] and [22] to the case where peaks at 0% and 100%. In the case of bonds, uni-
recoveries and defaults are driven by a common com- modal distributions may be sensible, but still it is
ponent (usually systemic in nature). strongly advisable to use flexible distributions, such
From an empirical point of view, several pieces of as the beta (which can be either uni- or bimodal
evidence indicate that LGDs and default rates tend to depending on the estimated parameters, and can eas-
increase together when the overall economic cycle ily be fit to the data by the generalized method of
deteriorates. For example, using data on US corpo- moments).p
rate bonds (Moody’s Default Risk Service database) Finally, it is worth emphasizing that, as with all
for 1982–1997, one finds that in a severe economic other risks, recovery risk may also produce prof-
downturn (when defaults are more frequent), recov- its. Indeed, the price performance of defaulted bonds
eries can be expected to drop by 20–25% compared (estimated by comparing market LGDs to emergence
with their unconditional average [20]. Similar results LGDs) can prove extremely brilliant, although this
are found using Standard and Poor’s Credit Pro is not always the case: while senior bonds (both
database (bond and loan defaults) for 1982–1999 secured and unsecured) have been found to perform
very well in the postdefault period (with per annum returns of 20–30%), junior bonds often show negative returns [3].
Acknowledgments
Part of this article, especially the LGD section, draws on previous work carried out with Andrea Sironi, to whom I wish to express my gratitude.
End Notes
a. CCFs are sometimes also known as loan equivalents (LEQs).
b. See [30] for further details on fixed time horizon, cohort approach, and variable time horizon.
c. See, for example, [11], where a sample of loan commitments in 1987–93 is analyzed, [25], based on 3281 defaulted exposures issued by 720 borrowers in 1985–2006, or [9].
d. These are all loan commitments above €6000 issued by Spanish banks after 1984. See [27] for further details.
e. See again [27], based on a subset of about 86 000 companies.
f. The EM is sometimes referred to as the CCF (in which case, what we called CCF is indicated as LEQ). Note that, given the important role played by bank capital regulation in shaping credit risk measurement techniques and jargon, we chose to use the word CCF in a way that is consistent with the terminology of the new Basel accord.
g. In principle, one should indicate the loss rate given default as LGDR (LGD rate) and use LGD for the absolute LGD (in euros or dollars). However, “LGD” is used by most practitioners (and by the new Basel accord on bank capital) to indicate the loss rate, while the absolute loss is usually indicated as LGD · EAD.
h. The choice of a suitable risk-adjusted r is far from trivial, and basically depends on the amount of systemic risk of the defaulted exposure. See [29].
i. See, for example, [10], based on 24 years of data compiled by Citibank, or [14], using a sample of 371 loans issued by Portugal’s largest private bank during 1985–2000; both studies are based on the workout approach. A study on bank loans (large syndicated loans traded on the secondary market) based on the market LGD approach is, for example, [16].
j. See [1], based on market LGDs observed in the United States during 1982–1999. See also [5] and the literature survey in [33].
k. See [4], based on a sample of corporate bonds stratified by original rating and seniority: in the case of senior-secured exposures, for example, the median RR for fallen angels was 50.5% versus 33.5%.
l. A somewhat different approach has been proposed by Peura and Jokivuolle [31]. Using an option-pricing, à la Merton framework, they present a model where collateral value is correlated with the value of the borrower’s assets and hence to his/her PD. This leads to an inverse relationship between default rates and RRs.
m. See [19–21]. Jarrow [26] presents a model where, as in Frye’s works, RRs and PDs are correlated and depend on the state of the economy; however, his methodology explicitly incorporates equity prices in the estimation procedure, allowing the separate identification of RRs and PDs and the use of a larger dataset. Furthermore, he explicitly incorporates a liquidity premium to account for the high variability in the spreads on US corporate debt. In [32] and [15], models are also proposed that account for the dependence of recoveries on systematic risk by extending Gordy’s single-factor model.
n. See [2]. Note, however, that this study finds that a single systematic risk factor (that is, the performance of the economy as a whole) is less predictive than theoretical models would suggest, while a key role is played by the supply of defaulted bonds.
o. See also the empirical results in [15].
p. For a more flexible approach, see [24], where a variation of the Gaussian kernel, known as Beta kernel, is used to fit the distribution of RRs of a sample of defaulted bonds from the period 1981–1999. See also [18] for an interesting utility-based approach to the estimation of the conditional probability distribution of RRs.
References
[1] Acharya, V., Bharath, S. & Srinivasan, A. (2007). Does industry-wide distress affect defaulted firms: evidence from creditor recoveries, Journal of Financial Economics 85, 787–821.
[2] Altman, E.I., Brady, B., Resti, A. & Sironi, A. (2005). The link between default and recovery rates: theory, empirical evidence and implications, Journal of Business 78(6), 2203–2228.
[3] Altman, E.I. & Eberhart, A. (1994). Do seniority provisions protect bondholders’ investments? Journal of Portfolio Management (Summer), 67–75.
[4] Altman, E.I. & Fanjul, G. (2004). Defaults and returns in the high-yield bond market: the year 2003 in review and market outlook, in Credit Risk: Models and Management, D. Shimko, ed, Risk Books, London.
[5] Altman, E.I. & Kishore, V.M. (1996). Almost everything you wanted to know about recoveries on defaulted bonds, Financial Analysts Journal 52(6), 57–64.
[6] Altman, E.I., Haldeman, R.G. & Narayanan, P. (1977). ZETA analysis: a new model to identify bankruptcy risk of corporations, Journal of Banking & Finance 1(1), 29–54.
[7] Altman, E.I., Resti, A. & Sironi, A. (2005). Recovery Risk: The Next Challenge in Credit Risk Management, Risk Books, London.
[8] Altman, E., Resti, A. & Sironi, A. (2005). The PD/LGD link: implications for credit risk modelling, in Recovery
Risk: The Next Challenge in Credit Risk Management, E. Altman, A. Resti & A. Sironi, eds, Risk Books, London, pp. 253–266.
[9] Araten, M. & Jacobs, M.J. (2001). Loan equivalents for revolving credit and advised lines, The RMA Journal 83(8), 34–39.
[10] Asarnow, E. & Edwards, D. (1995). Measuring loss on defaulted bank loans: a 24 year study, Journal of Commercial Bank Lending 77(7), 11–23.
[11] Asarnow, E. & Marker, J. (1995). Historical performance of the US corporate loan market: 1988–1993, Journal of Commercial Bank Lending (Spring), 13–32.
[12] Basel Committee on Banking Supervision (2006). International Convergence of Capital Measurement and Capital Standards: A Revised Framework, Comprehensive Version, Bank for International Settlements, Basel.
[13] Das, S.R. & Hanouna, P.E. (2007). Implied Recovery, available at SSRN, http://ssrn.com/abstract=1028612.
[14] Dermine, J. & Neto de Carvalho, C. (2005). How to measure recoveries and provisions on bank lending: methodology and empirical evidence, in Recovery Risk: The Next Challenge in Credit Risk Management, E. Altman, A. Resti & A. Sironi, eds, Risk Books, London, pp. 101–120.
[15] Duellmann, K. & Trapp, M. (2005). Systematic risk in recovery rates of US corporate credit exposures, in Recovery Risk: The Next Challenge in Credit Risk Management, E. Altman, A. Resti & A. Sironi, eds, Risk Books, London, pp. 235–252.
[16] Emery, K. (2003). Moody’s Loan Default Database as of November 2003, Moody’s Investors Service, New York.
[17] Finger, C. (2001). The one-factor creditmetrics model in the new Basel capital accord, RiskMetrics Journal 2(1), 9–18.
[18] Friedman, C. & Sandow, S. (2003). Ultimate recoveries, Risk (August), 69–73.
[19] Frye, J. (2000). Collateral damage, Risk (April), 91–94.
[20] Frye, J. (2000). Collateral Damage Detected, Federal Reserve Bank of Chicago, Chicago.
[21] Frye, J. (2000). Depressing recoveries, Risk (November), 108–111.
[22] Gordy, M.B. (2003). A risk-factor model foundation for ratings-based bank capital rules, Journal of Financial Intermediation 12, 199–232.
[23] Gupton, G.M. & Stein, R.M. (2002). LossCalc: Moody’s Model for Predicting Loss Given Default (LGD), Moody’s Investors Service, New York.
[24] Hagmann, M., Renault, O. & Scaillet, O. (2005). Estimation of recovery rate densities: non-parametric and semi-parametric approaches versus industry practice, in Recovery Risk: The Next Challenge in Credit Risk Management, E. Altman, A. Resti & A. Sironi, eds, Risk Books, London.
[25] Jacobs, M. (2007). An Empirical Study of Exposure at Default, mimeo, Office of the Comptroller of the Currency, Washington, DC.
[26] Jarrow, R. (2001). Default parameter estimation using market prices, Financial Analysts Journal 57(5), 75–92.
[27] Jiménez, G., Lopez, J.A. & Saurina, J. (2007). Empirical Analysis of Corporate Credit Lines, Working Paper 2007–14, Federal Reserve Bank of San Francisco, San Francisco.
[28] Keisman, D. (2003). Loss Stats, Standard & Poor’s, New York.
[29] Maclachlan, I. (2005). Choosing the discount factor for estimating economic LGD, in Recovery Risk: The Next Challenge in Credit Risk Management, E. Altman, A. Resti & A. Sironi, eds, Risk Books, London, pp. 285–306.
[30] Moral, G. (2006). EAD estimates for facilities with explicit limits, in The Basel II Risk Parameters: Estimation, Validation and Stress Testing, B. Engelmann & R. Rauhmeier, eds, Springer Verlag, Berlin.
[31] Peura, S. & Jokivuolle, E. (2005). LGD in a structural model of default, in Recovery Risk: The Next Challenge in Credit Risk Management, E.I. Altman, A. Resti & A. Sironi, eds, Risk Books, London, pp. 201–216.
[32] Pykhtin, M. (2003). Unexpected recovery risk, Risk 16(8), 74–78.
[33] Schuermann, T. (2005). What do we know about Loss Given Default? in Recovery Risk: The Next Challenge in Credit Risk Management, A. Resti & E.I. Altman, eds, Risk Books, London.
[34] Van de Castle, K. & Keisman, D. (1999). Recovering Your Money: Insights Into Losses from Defaults, Standard & Poor’s, New York.
Further Reading
Shleifer, A. & Vishny, R. (1992). Liquidation values and debt capacity: a market equilibrium approach, Journal of Finance 47, 1343–1366.
Related Articles
Counterparty Credit Risk; Recovery Rate; Value-at-Risk.
ANDREA RESTI
Credit Portfolio Simulation (note a)
is A_i has the representation
\[ A_i = \sqrt{R_i^2}\, \sum_{j=1}^{m} w_{ij} X_j + \sqrt{1 - R_i^2}\, Z_i \qquad (2) \]
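A representation of this kind is straightforward to simulate. The following Python sketch is only an illustration under stated assumptions, none of which come from the text: the systematic factors X_j are taken as independent standard normals, each row of weights is rescaled so that the systematic component has unit variance, Z_i is an independent standard normal, and an obligor is counted as defaulted when A_i falls below the Merton-type threshold Φ⁻¹(PD_i). All names and portfolio figures are hypothetical.

```python
import math
import random
from statistics import NormalDist


def simulate_asset_values(r2, weights, n_scenarios, seed=0):
    """Draw scenarios of A_i = sqrt(R_i^2) * sum_j w_ij X_j + sqrt(1 - R_i^2) * Z_i.

    Assumptions (not from the source): the X_j are independent standard
    normals and each weight row is rescaled to unit Euclidean norm, so that
    sum_j w_ij X_j has unit variance and every A_i is standard normal.
    """
    rng = random.Random(seed)
    n_obligors, n_factors = len(r2), len(weights[0])
    w = [[x / math.sqrt(sum(v * v for v in row)) for x in row] for row in weights]
    scenarios = []
    for _ in range(n_scenarios):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_factors)]  # systematic factors
        a = []
        for i in range(n_obligors):
            systematic = sum(w[i][j] * x[j] for j in range(n_factors))
            z = rng.gauss(0.0, 1.0)  # idiosyncratic factor Z_i
            a.append(math.sqrt(r2[i]) * systematic + math.sqrt(1.0 - r2[i]) * z)
        scenarios.append(a)
    return scenarios


# Hypothetical two-obligor, two-factor portfolio.
R2 = [0.3, 0.2]
W = [[1.0, 0.0], [0.6, 0.8]]
PD = [0.02, 0.05]
# Assumed Merton-type default thresholds: default if A_i < inv_cdf(PD_i).
thresholds = [NormalDist().inv_cdf(p) for p in PD]

draws = simulate_asset_values(R2, W, n_scenarios=20000, seed=42)
default_rates = [
    sum(a[i] < thresholds[i] for a in draws) / len(draws) for i in range(len(PD))
]
print(default_rates)
```

Plain Monte Carlo of this kind needs many scenarios to resolve tail losses, which is the motivation for variance reduction techniques such as importance sampling.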
granularity approximation of the portfolio loss distribution (compare to [16]). More precisely, the original portfolio P is approximated by a homogeneous and infinitely granular portfolio P̄. The loss distribution of P̄ can be specified by a Gaussian one-factor model. The calculation of the shift of the systematic factors is now done in two steps: in the first step, the optimal mean is calculated in the one-factor setting and then the scalar mean is lifted to a mean vector for the systematic factors in the original multifactor model. Other importance sampling techniques [3, 6] are based on the Robbins–Monro stochastic approximation method or use large deviation analysis to calculate multiple mean shifts.
The efficiency of the proposed variance reduction schemes heavily depends on the portfolio characteristics. For example, the technique proposed in [8, 9] is tailored to large and well-diversified portfolios. For those portfolios the analytic loss distribution of the infinitely granular portfolio provides an excellent fit, which typically reduces the variance (and therefore the number of required MC scenarios) by a factor of more than 100. Smaller portfolios with low dependence on systematic factors, on the other hand, are dominated by idiosyncratic risk, which increases the relative importance of variance reduction techniques on idiosyncratic factors [7, 8], for example, importance sampling based on exponential tilting.
End Notes
a. The views expressed in this article are those of the author and do not necessarily reflect the position of Deutsche Bank AG.
b. A survey on credit portfolio modeling can be found in [2, 11].
References
[1] Barndorff-Nielsen, O. (1978). Information and Exponential Families, Wiley.
[2] Bluhm, C., Overbeck, L. & Wagner, C. (2002). An Introduction to Credit Risk Modeling, CRC Press/Chapman & Hall.
[3] Egloff, D., Leippold, M., Jöhri, S. & Dalbert, C. (2005). Optimal Importance Sampling for Credit Portfolios with Stochastic Approximations. Working paper, Zürcher Kantonalbank, Zurich.
[4] Glasserman, P. (2004). Monte Carlo Methods in Financial Engineering, Springer.
[5] Glasserman, P. (2005). Measuring marginal risk contributions in credit portfolios, Journal of Computational Finance 9, 1–41.
[6] Glasserman, P., Kang, W. & Shahabuddin, P. (2007). Fast Simulation of Multifactor Portfolio Credit Risk. Working paper, Columbia University, New York.
[7] Glasserman, P. & Li, J. (2005). Importance sampling for portfolio credit risk, Management Science 51, 1643–1656.
[8] Kalkbrener, M., Kennedy, A. & Popp, M. (2007). Efficient calculation of expected shortfall contributions in large credit portfolios, Journal of Computational Finance 11, 45–77.
[9] Kalkbrener, M., Lotter, H. & Overbeck, L. (2004). Sensible and efficient capital allocation for credit portfolios, Risk 17(1), S19–S24.
[10] Martin, R., Thompson, K. & Browne, C. (2001). Taking to the saddle, Risk 14(6), 91–94.
[11] McNeil, A.J., Frey, R. & Embrechts, P. (2005). Quantitative Risk Management: Concepts, Techniques, and Tools, Princeton University Press.
[12] Merino, S. & Nyfeler, M. (2004). Applying importance sampling for estimating coherent credit risk contributions, Quantitative Finance 4, 199–207.
[13] Merton, R. (1974). On the pricing of corporate debt: the risk structure of interest rates, Journal of Finance 29, 449–470.
[14] Morokoff, W.J. (2004). An importance sampling method for portfolios of credit risky assets, Proceedings of the 2004 Winter Simulation Conference, IEEE Press, pp. 1668–1676.
[15] Rubinstein, R.Y. (1981). Simulation and the Monte Carlo Method, Wiley.
[16] Vasicek, O. (2002). Loan portfolio value, Risk 15(12), 160–162.
Related Articles
Large Pool Approximations; Monte Carlo Simulation; Structural Default Risk Models; Saddlepoint Approximation; Variance Reduction.
MICHAEL KALKBRENER
Counterparty Credit Risk (note a)
Counterparty credit risk (CCR) is the risk that a counterparty in a financial contract will default prior to the expiration of the contract and will fail to make all the payments required by the contract. Only the contracts privately negotiated between the counterparties, that is, over-the-counter (OTC) derivatives and securities financing transactions (SFT), bear CCR. Exchange-traded derivatives are not subject to CCR because all contractual payments promised by these derivatives are guaranteed by the exchange.
CCR is similar to other forms of credit risk (such as lending risk) in that the source of economic loss is an obligor’s default. However, CCR has two unique features that set it apart from lending risk:
• Uncertainty of credit exposure: Credit exposure of one counterparty to the other is determined by the market value of all the contracts between these counterparties. While one can obtain the current exposure from the current contract values, the future exposure is uncertain because the future contract values are not known at present.
• Bilateral nature of credit exposure: Since both counterparties can default and the value of many financial contracts (such as swaps) can change sign, the direction of future credit exposure is uncertain. Counterparty A may be exposed to default of counterparty B under one set of future market scenarios, while counterparty B may be exposed to default of counterparty A under another set of scenarios.
The uncertainty of future credit exposure makes managing and modeling CCR of the trading book challenging. For a comprehensive introduction to CCR, see [1, 5, 17].
Managing and Mitigating Counterparty Credit Risk
One of the most conventional techniques of managing credit risk is setting counterparty-level credit limits. If a new transaction with the counterparty would result in the counterparty-level exposure exceeding the limit, the transaction is not allowed. The limits usually depend on the counterparty’s credit quality: higher rated counterparties have higher limits. To compare uncertain future exposure with a deterministic limit, potential future exposure (PFE) profiles are calculated from exposure probability distributions at future time points. PFE profiles are obtained by calculating a quantile of exposure at a high confidence level (typically, above 90%). Some institutions use different exposure measures, such as expected exposure (EE) profiles, for comparing with the credit limit. It is important to understand that a given credit limit amount is meaningful only in the context of a given exposure measure (e.g., 95%-level quantile).
Future credit exposure can be greatly reduced by means of risk-mitigating agreements between two counterparties, which include netting agreements, margin agreements, and early termination agreements. A netting agreement is a legally binding contract between two counterparties that, in the event of default of one of them, allows aggregation of transactions between these counterparties. Instead of each trade between the counterparties being settled separately, the entire portfolio covered by the netting agreement is settled as a single trade whose value equals the net value of the portfolio. Margin agreements limit the potential exposure of one counterparty to the other by means of requiring collateral should the unsecured exposure exceed a predefined threshold. The threshold value depends primarily on the credit quality of the counterparty: the higher the credit quality, the higher the threshold.
There are two types of early termination agreements: termination clauses and downgrade provisions. A termination clause is specified at the trade level. A unilateral (bilateral) termination clause gives one (both) of the counterparties the right to terminate the trade at the fair market value at a predefined set of dates. A downgrade provision is specified for the entire portfolio between two counterparties. Under a unilateral (bilateral) downgrade provision, the portfolio is settled at its fair market value the first time the credit rating of one (either) of the counterparties falls below a predefined level.
Contract-level Exposure
Let us consider a financial institution (we will call it a bank for brevity) that has a single derivative
contract with a counterparty. The bank’s exposure by any of the netting agreements. Counterparty-level
to the counterparty at a given future time is given exposure in this most general case is given by
by the bank’s economic loss in the event of the
counterparty’s default at that time. If the counterparty
defaults, the bank must close out its position with the
counterparty. To determine the loss arising from the
counterparty’s default, it is convenient to assume that
the bank enters into a similar contract with another
counterparty in order to maintain its market position.
\[ E_c(t) = \sum_k \max\Bigl[\sum_{i \in NA_k} V_i(t),\, 0\Bigr] + \sum_{i \notin \{NA\}} \max[V_i(t),\, 0] \qquad (3) \]
Since the bank’s market position is unchanged after
replacing the contract, the loss is determined by the The inner summation in the first term of equation (3)
contract’s replacement cost at the time of default. aggregates the values of all trades covered by the
If the trade value at the time of default is negative kth netting agreement (hence the notation i ∈ NAk ),
for the bank, the bank receives this amount when it while the outer summation aggregates exposures
replaces the trade, but has to forward the money to the across all netting agreements. The second term in
defaulting counterparty, so that the net loss is zero. If equation (3) is simply the sum of the contract-level
the trade value at the time of default is positive for the exposures of all trades that do not belong to any
bank, the bank pays this amount when replacing the netting agreement (hence the notation i ∈/ {NA}).
trade, but receives nothing (assuming no recovery)
from the defaulting counterparty, so that the net loss
is equal to the trade value.
Margin Agreements and Collateral
Summarizing this, we can write the bank’s credit Modeling
exposure to the counterparty at future time t as
Margin agreements can further reduce credit expo-
Ei (t) = max{Vi (t), 0} (1) sure. Margin agreements can be either unilateral or
bilateral. Under a unilateral agreement, only one of
where Vi (t) is the value of trade i with the counter- the counterparties has to post collateral. If the agree-
party at time t from the bank’s point of view and ment is bilateral, both counterparties have to post
Ei (t) the bank’s contract-level exposure to the coun- collateral.
terparty created by trade i at time t. Usually a margin agreement covers one or more
Since the contract value changes unpredictably netting agreements. We can generalize equation (3)
over time as the market moves, only the current by specifying collateral amount Ck (t) available to the
exposure is known with certainty, while the future bank under netting agreements NAk at time t with
exposure is uncertain. the convention that this amount is positive when the
bank holds collateral and negative when the bank has
posted collateral:
Counterparty-level Exposure and Netting
Agreements
If the bank has more than one trade with the Ec (t) = max Vi (t) − Ck (t), 0
counterparty and counterparty risk is not mitigated k i∈NAk
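The netting rule of equation (3), together with the sign convention for collateral described above, can be illustrated with a short Python sketch for a single scenario at a single time t. The function and argument names (`contract_exposure`, `counterparty_exposure`, `netting_sets`, `unnetted`, `collateral`) and the trade values are hypothetical and chosen only for illustration; they are not part of the article.

```python
def contract_exposure(value):
    """Contract-level exposure, equation (1): max{V_i(t), 0}."""
    return max(value, 0.0)


def counterparty_exposure(netting_sets, unnetted, collateral=None):
    """Counterparty-level exposure for one scenario at one time t.

    netting_sets: list of lists; each inner list holds the trade values V_i(t)
                  covered by one netting agreement NA_k.
    unnetted:     trade values not covered by any netting agreement.
    collateral:   optional list of C_k(t), one per netting agreement; positive
                  when the bank holds collateral, negative when it has posted.
    """
    if collateral is None:
        collateral = [0.0] * len(netting_sets)  # no margin agreement
    if len(collateral) != len(netting_sets):
        raise ValueError("need one collateral amount per netting agreement")
    # Trades inside a netting agreement offset each other before the max is
    # taken; collateral held by the bank reduces the netted value further.
    netted = sum(max(sum(values) - c, 0.0)
                 for values, c in zip(netting_sets, collateral))
    # Trades outside any netting agreement each contribute their own exposure.
    return netted + sum(contract_exposure(v) for v in unnetted)


# Three trades valued from the bank's point of view (illustrative numbers).
values = [10.0, -6.0, 3.0]
no_netting = counterparty_exposure([], values)                 # 10 + 0 + 3
one_netting_set = counterparty_exposure([values], [])          # max(7, 0)
with_collateral = counterparty_exposure([values], [], [5.0])   # max(7 - 5, 0)
print(no_netting, one_netting_set, with_collateral)
```

With these numbers the exposure falls from 13 without netting to 7 with a single netting agreement, and to 2 when the bank additionally holds 5 of collateral.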
and minimum transfer amount (MTA). When the portfolio value exceeds the threshold, the counterparty must post collateral to keep the bank's exposure from rising above the threshold. As the exposure drops below the threshold, the bank returns collateral to the counterparty. MTA limits the frequency of collateral exchange. It is difficult to model collateral subject to MTA exactly because that would require daily simulation time points, which is not feasible given the long-term nature of exposure modeling. In practice, the actual threshold $H_c$ is often replaced by the effective threshold defined as $H_c^{(e)} = H_c + \mathrm{MTA}$. After this replacement, the margin agreement is treated as if it had zero MTA.

The simplest approach to modeling collateral is to limit the future exposure from above by the threshold (i.e., for all scenarios with portfolio value above the threshold, set the exposure equal to the threshold). However, this approach is too simplistic because it ignores the time lag between the last delivery of collateral and the time when the loss is realized. This time lag is known as the margin period of risk (MPR), which we will denote by $\delta t$. While the MPR is not known with certainty, it is typically assumed to be a deterministic number that is defined at the margin agreement level. Its value depends on the contractual margin call frequency and the liquidity of the portfolio. For example, $\delta t$ = 2 weeks is usually assumed for portfolios of liquid contracts and daily margin call frequency.

Applying the rules of posting collateral under the assumption of the effective threshold with zero MTA and taking into account the MPR, the collateral $C(t)$ available to the bank at time $t$ is given by

$$C(t) = \max\{V(t - \delta t) - H_c^{(e)},\, 0\} \qquad (5)$$

where $V(t)$ is the portfolio value from the bank's point of view at time $t$.

Bilateral Margin Agreements

Under a bilateral margin agreement, both the counterparty and the bank have to post collateral: the counterparty posts collateral when the bank's exposure to the counterparty exceeds the counterparty's threshold, while the bank posts collateral when the counterparty's exposure to the bank exceeds the bank's threshold. Since we are doing our analysis from the point of view of the bank, we will keep the counterparty's threshold $H_c$ nonnegative, but will specify the bank's threshold $H_b$ as nonpositive. Then, the bank posts collateral when the portfolio value (defined from the bank's point of view) is below the bank's threshold. The MTA is the same for the bank and the counterparty and MTA > 0.

Similar to the unilateral case, we will create effective thresholds for the bank and for the counterparty. The effective threshold for the counterparty, $H_c^{(e)}$, remains unchanged. From the counterparty's point of view, the effective threshold for the bank must be defined in exactly the same way. After taking into account that we do not switch our point of view and $H_b \le 0$, the definition of the effective threshold for the bank will be $H_b^{(e)} = H_b - \mathrm{MTA}$. Now the bilateral agreement can be treated as if it had zero MTA.

Collateral available to the bank at time $t$ under the bilateral agreement is modeled as

$$C(t) = \max\{V(t - \delta t) - H_c^{(e)},\, 0\} + \min\{V(t - \delta t) - H_b^{(e)},\, 0\} \qquad (6)$$

The first term in the right-hand side of equation (6) describes the scenarios when the bank receives collateral (i.e., $C(t) > 0$), while the second term describes the scenarios when the bank posts collateral (i.e., $C(t) < 0$).

For more details on collateral modeling, see [9, 15, 16].

Simulating Credit Exposure

Because of the complex nature of banks' portfolios, the exposure distribution at future time points is usually obtained via a Monte Carlo simulation process. This process typically consists of three major steps:

• Scenario generation. The dynamics of market risk factors (e.g., interest rates, foreign exchange (FX) rates, etc.) are specified via relatively simple stochastic processes (e.g., geometric Brownian motion). These processes are calibrated either to historical data or to market implied data. Future values of the market risk factors are simulated for a fixed set of future time points.
• Instrument valuation. For each simulation time point and for each realization of the underlying market risk factors, valuation is performed for each trade in the counterparty portfolio.
• Aggregation. For each simulation time point and for each realization of the underlying market risk factors, counterparty-level exposure is obtained by applying the necessary netting and collateral rules, conceptually described by equations (3) and (4).

The outcome of this process is a set of realizations of the counterparty-level exposure (each realization corresponds to one market scenario) at each simulation time point.

Because of the computational intensity required to calculate counterparty exposures, especially for a bank with a large portfolio, certain compromises between the accuracy and the speed of the calculation are usually made: a relatively small number of market scenarios (typically, a few thousand) and simulation time points (typically, in the 50–200 range), simplified valuation methods, and so on.

For more details on simulating credit exposure, see [7, 17].

Pricing Counterparty Risk: Unilateral Approach

Let us assume that the bank is default-risk-free. Then, when pricing transactions with a counterparty, the bank should require a risk premium to be compensated for the risk of the counterparty defaulting. The market value of this risk premium, defined for the entire portfolio of trades with the counterparty, is known as unilateral credit valuation adjustment (CVA).

A risk-neutral pricing valuation framework is used for pricing CCR. The bank's economic loss arising from the counterparty's default and discounted to today is given by

$$L_{u\text{-}l} = 1_{\{\tau_c \le T\}}\, (1 - R_c)\, \frac{B_0}{B_{\tau_c}}\, E_c(\tau_c) \qquad (7)$$

where $\tau_c$ is the time of default of the counterparty; $1_{\{A\}}$ is the indicator function that assumes the value 1 if Boolean variable $A$ is TRUE and value 0 otherwise; $E_c(t)$ is the bank's exposure to the counterparty's default at time $t$; $R_c$ is the counterparty recovery rate (i.e., percentage of the bank's exposure to the counterparty that the bank will be able to recover in the event of the counterparty's default); and $B_t$ is the value of the money-market account at time $t$.

One should keep in mind that counterparty-level exposure $E_c(t)$ incorporates all netting and margin agreements between the bank and the counterparty, as discussed above.

Unilateral CVA is obtained by taking the risk-neutral expectation of the loss in equation (7). Under the assumption that recovery rate is independent of the market factors and the time of default, this results in

$$\mathrm{CVA}_{u\text{-}l} = (1 - \bar{R}_c) \int_0^T EE_c^*(t)\, dPD_c(t) \qquad (8)$$

where $EE_c^*(t)$ is the risk-neutral discounted expected exposure (EE) at time $t$, conditional on the counterparty defaulting at time $t$, given by

$$EE_c^*(t) = E^Q\left[\frac{B_0}{B_t}\, E_c(t) \,\Big|\, \tau_c = t\right] \qquad (9)$$

and $\bar{R}_c$ is the expected recovery rate; $PD_c(t)$ is the counterparty's cumulative risk-neutral probability of default from today to time $t$, estimated today; and $T$ is the maturity of the longest trade in the portfolio.

The term structure of the risk-neutral PDs is obtained from the credit default swap spreads quoted in the market [19].

We would like to emphasize that the expectation of the discounted exposure at time $t$ in equation (9) is conditional on the counterparty's default occurring at time $t$. This conditioning is material when there is a significant dependence between the exposure and the counterparty credit quality. This dependence, known as right/wrong-way risk, was first considered in [8] and [12]. To account for it, the counterparty's credit quality must be modeled jointly with the market risk factors. For more details on modeling right/wrong-way risk, see [4, 10, 18].

In practice, the dependence between exposure and the counterparty's credit quality is often ignored and conditioning on default in equation (9) is removed. Discounted EE is calculated for a set of simulation time points $\{t_k\}$ under the exposure simulation framework outlined above. Then, CVA is calculated by approximating the integral in equation (8) by a sum:

$$\mathrm{CVA}_{u\text{-}l} \approx (1 - \bar{R}_c) \sum_k EE_c^*(t_k)\, [PD_c(t_k) - PD_c(t_{k-1})] \qquad (10)$$
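A minimal Python sketch can tie together the collateral rules of equations (5) and (6), the three simulation steps, and the sum that approximates the integral in equation (8). It rests on loud assumptions that are not in the article: a single netting set whose value follows a driftless Gaussian random walk, a flat discount rate, a flat hazard rate for the counterparty, exposure independent of default, and a margin period of risk equal to one (quarterly) grid step, far coarser than the two weeks mentioned in the text. Default-probability increments are taken as $PD_c(t_k) - PD_c(t_{k-1})$ so that every term of the sum is nonnegative. All function names and parameter values are hypothetical.

```python
import math
import random


def collateral(value_lagged, h_c_eff, h_b_eff=None):
    """Collateral available to the bank, equations (5) and (6).

    value_lagged: portfolio value V(t - dt), one margin period of risk earlier.
    h_c_eff:      effective counterparty threshold H_c + MTA (nonnegative).
    h_b_eff:      effective bank threshold H_b - MTA (nonpositive); None means
                  a unilateral agreement, i.e. equation (5).
    """
    c = max(value_lagged - h_c_eff, 0.0)
    if h_b_eff is not None:
        c += min(value_lagged - h_b_eff, 0.0)
    return c


def expected_exposure(times, n_paths, vol, mpr_steps=None, h_c_eff=0.0, seed=0):
    """Undiscounted EE(t_k) for one netting set (illustrative dynamics only)."""
    rng = random.Random(seed)
    totals = [0.0] * len(times)
    for _ in range(n_paths):
        # Scenario generation and valuation: the portfolio value itself is
        # simulated as a driftless random walk (an assumption of this sketch).
        path, v, t_prev = [], 0.0, 0.0
        for t in times:
            v += vol * math.sqrt(t - t_prev) * rng.gauss(0.0, 1.0)
            t_prev = t
            path.append(v)
        # Aggregation: apply the collateral rule with a lag of mpr_steps.
        for k, v_k in enumerate(path):
            c = 0.0
            if mpr_steps is not None:
                lagged = path[k - mpr_steps] if k >= mpr_steps else 0.0
                c = collateral(lagged, h_c_eff)
            totals[k] += max(v_k - c, 0.0)
    return [x / n_paths for x in totals]


def unilateral_cva(times, ee, pd, recovery, rate=0.0):
    """(1 - R) * sum_k exp(-r t_k) * EE(t_k) * [PD(t_k) - PD(t_{k-1})]."""
    cva, pd_prev = 0.0, 0.0
    for t, e, p in zip(times, ee, pd):
        cva += math.exp(-rate * t) * e * (p - pd_prev)
        pd_prev = p
    return (1.0 - recovery) * cva


times = [0.25 * k for k in range(1, 21)]           # quarterly grid to 5 years
pd = [1.0 - math.exp(-0.02 * t) for t in times]    # flat 2% hazard rate
ee_open = expected_exposure(times, 2000, vol=10.0)
ee_margined = expected_exposure(times, 2000, vol=10.0, mpr_steps=1)
cva_open = unilateral_cva(times, ee_open, pd, recovery=0.4, rate=0.03)
cva_margined = unilateral_cva(times, ee_margined, pd, recovery=0.4, rate=0.03)
print(round(cva_open, 3), round(cva_margined, 3))
```

Because both runs use the same seed, they share the same paths, so the margined exposure is never larger than the unmargined one on any path and the margined CVA cannot exceed the unmargined CVA.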
Since the exposure expectation in equation (10) is risk neutral, scenario models for all market risk factors should be arbitrage free. This is achieved by appropriate calibration of drifts. Moreover, risk factor volatilities should be calibrated to the available market prices of options on the risk factors.

For more details on unilateral CVA, see [3].

Pricing Counterparty Risk: Bilateral Approach

In reality, banks are not default-risk-free. Because of the bilateral nature of credit exposure, the bank and the counterparty will never agree on the fair price of CCR if they apply unilateral pricing outlined above: each of them will demand a risk premium from the other. The bilateral approach specifies a single quantity, known as bilateral CVA, that accounts both for the bank's loss caused by the counterparty's default and the counterparty's loss caused by the bank's default.

Bilateral loss of the bank is given by

$$L_{b\text{-}l} = 1_{\{\tau_c \le T\}}\, 1_{\{\tau_c < \tau_b\}}\, (1 - R_c)\, \frac{B_0}{B_{\tau_c}}\, E_c(\tau_c) - 1_{\{\tau_b \le T\}}\, 1_{\{\tau_b < \tau_c\}}\, (1 - R_b)\, \frac{B_0}{B_{\tau_b}}\, E_b(\tau_b) \qquad (11)$$

where $\tau_b$ is the time of default of the bank; $E_b(t)$ is the counterparty's exposure to the bank's default at time $t$; and $R_b$ is the bank recovery rate (i.e., percentage of the counterparty's exposure to the bank that the counterparty will be able to recover in the event of the bank's default).

The first term in equation (11) describes the bank's loss when the counterparty defaults, but the bank does not default. The second term describes the loss of the counterparty in the event of the bank's default and the counterparty's survival. From the bank's point of view, the counterparty's loss is a gain arising from the bank's option not to pay the counterparty when the bank defaults, so this term is subtracted from the bank's loss. Equation (11) is completely symmetric: if we change the sign of the right-hand side, we will obtain the bilateral loss of the counterparty.

Bilateral CVA is obtained by taking the risk-neutral expectation of equation (11):

$$\mathrm{CVA}_{b\text{-}l} = (1 - \bar{R}_c) \int_0^T EE_c^*(t)\, \Pr[\tau_b > t \mid \tau_c = t]\, dPD_c(t) - (1 - \bar{R}_b) \int_0^T EE_b^*(t)\, \Pr[\tau_c > t \mid \tau_b = t]\, dPD_b(t) \qquad (12)$$

where $EE_c^*(t)$ is the bank's discounted EE to the counterparty at time $t$, conditional on the counterparty defaulting at time $t$, defined in equation (9), and $EE_b^*(t)$ is the counterparty's discounted EE to the bank at time $t$, conditional on the bank defaulting at time $t$, defined as

$$EE_b^*(t) = E^Q\left[\frac{B_0}{B_t}\, E_b(t) \,\Big|\, \tau_b = t\right] \qquad (13)$$

If the dependence between credit exposure and the credit quality of the counterparty and of the bank can be ignored, the conditional expectations in equations (9) and (13) should be replaced with the unconditional ones. As expected, equation (12) is symmetric between the bank and the counterparty, so that the bank and the counterparty will always agree on the price of CCR for their portfolio.

One can use copulas (see Default Time Copulas; Gaussian Copula Model; Copulas: Estimation) to express the conditional probabilities in equation (12) as functions of the counterparty's and the bank's risk-neutral PDs. For example, if the normal copula model [13] is used to describe the dependence between $\tau_c$ and $\tau_b$, the conditional probabilities in equation (12) take the form

$$\Pr[\tau_b > t \mid \tau_c = t] = 1 - \Phi\left(\frac{\Phi^{-1}[PD_b(t)] - \rho\, \Phi^{-1}[PD_c(t)]}{\sqrt{1 - \rho^2}}\right) \qquad (14)$$
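Equation (14) is straightforward to evaluate numerically. The sketch below uses the standard normal distribution from Python's standard library; the function name, the example PDs, and the interpretation of `rho` as the normal copula correlation between the two default times are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def prob_bank_survives_given_cpty_default(pd_b, pd_c, rho):
    """Pr[tau_b > t | tau_c = t] under a normal copula, as in equation (14).

    pd_b, pd_c: cumulative risk-neutral PDs of the bank and the counterparty
                at time t, each strictly between 0 and 1.
    rho:        copula correlation, strictly between -1 and 1.
    """
    if not (0.0 < pd_b < 1.0 and 0.0 < pd_c < 1.0):
        raise ValueError("PDs must lie strictly between 0 and 1")
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    z = (_STD_NORMAL.inv_cdf(pd_b) - rho * _STD_NORMAL.inv_cdf(pd_c)) \
        / sqrt(1.0 - rho * rho)
    return 1.0 - _STD_NORMAL.cdf(z)


independent = prob_bank_survives_given_cpty_default(0.05, 0.10, 0.0)
correlated = prob_bank_survives_given_cpty_default(0.05, 0.10, 0.5)
print(independent, correlated)
```

With zero correlation the conditional survival probability reduces to 1 − PD_b(t); with positive correlation and small PDs, the counterparty's default makes the bank's survival less likely, which lowers the first term of equation (12).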
If alpha of a real portfolio can be estimated, its LEQ can be defined according to

$$LEQ^{(j)}(T) = \alpha_q(T)\, EPE^{(j)}(T) \qquad (21)$$

Because the EC of a portfolio with deterministic exposures is a homogeneous function of the exposures, using the LEQ defined in equation (21) will produce the correct $EC_q^{(\mathrm{Real})}(T)$. The caveat of this approach is that one has to run a joint simulation of trade values and counterparties' defaults to calculate alpha.

Several estimates of typical values of alpha for a large dealer portfolio and the time horizon T = 1 year are available. An International Swaps and Derivatives Association (ISDA) survey [11] has reported alpha calculated by four large banks for their actual portfolios to be in the 1.07–1.10 range. Theoretical estimates of alpha under a set of simplifying assumptions [6, 20] are 1.1 when market-credit correlations are ignored, and 1.2 when they are not.

The framework described above has found its place in the regulatory capital calculations under Basel II (see Regulatory Capital): a slightly modified version of equation (21) is used to calculate exposure at default (EAD) under the internal models method for CCR [2]. Basel fixes alpha at 1.4, but it allows banks to calculate their own alpha, subject to the supervisory approval and a floor of 1.2.

End Notes

a. The opinions expressed here are those of the author and do not necessarily reflect the views or policies of the author's employer.

References

[1] Arvanitis, A. & Gregory, J. (2001). Credit: The Complete Guide to Pricing, Hedging and Risk Management, Risk Books.
[2] Basel Committee on Banking Supervision (2006). International Convergence of Capital Measurement and Capital Standards, A Revised Framework.
[3] Brigo, D. & Masetti, M. (2005). Risk neutral pricing of counterparty credit risk, in Counterparty Credit Risk Modelling, M. Pykhtin, ed., Risk Books.
[4] Brigo, D. & Pallavicini, A. (2008). Counterparty risk and contingent CDS under correlation, Risk February, 84–88.
[5] Canabarro, E. & Duffie, D. (2003). Measuring and marking counterparty risk, in Asset/Liability Management for Financial Institutions, L. Tilman, ed., Institutional Investor Books.
[6] Canabarro, E., Picoult, E. & Wilde, T. (2003). Analysing counterparty risk, Risk September, 117–122.
[7] De Prisco, B. & Rosen, D. (2005). Modeling stochastic counterparty credit exposures for derivatives portfolios, in Counterparty Credit Risk Modelling, M. Pykhtin, ed., Risk Books.
[8] Finger, C. (2000). Toward a better estimation of wrong-way credit exposure, Journal of Risk Finance 1(3), 43–51.
[9] Gibson, M. (2005). Measuring counterparty credit exposure to a margined counterparty, in Counterparty Credit Risk Modelling, M. Pykhtin, ed., Risk Books.
[10] Hille, C., Ring, J. & Shimamoto, H. (2005). Modelling counterparty credit exposure for credit default swaps, Risk May, 65–69.
[11] ISDA-TBMA-LIBA (2003). Counterparty Risk Treatment of OTC Derivatives and Securities Financing Transactions, June.
[12] Levin, R. & Levy, A. (1999). Wrong way exposure – are firms underestimating their credit risk? Risk July, 52–55.
[13] Li, D. (2000). On default correlation: a copula approach, Journal of Fixed Income 9, 43–54.
[14] Picoult, E. (2005). Calculating and hedging exposure, credit value adjustment and economic capital for counterparty credit risk, in Counterparty Credit Risk Modelling, M. Pykhtin, ed., Risk Books.
[15] Pykhtin, M. (2009). Modeling credit exposure for collateralized counterparties, Journal of Credit Risk; to be published.
[16] Pykhtin, M. & Zhu, S. (2006). Measuring counterparty credit risk for trading products under Basel II, in Basel Handbook, 2nd Edition, M. Ong, ed., Risk Books.
[17] Pykhtin, M. & Zhu, S. (2007). A guide to modeling counterparty credit risk, GARP Risk Review July/August, 16–22.
[18] Redon, C. (2006). Wrong way risk modelling, Risk April, 90–95.
[19] Schonbucher, P. (2003). Credit Derivatives Pricing Models, Wiley.
[20] Wilde, T. (2005). Analytic methods for portfolio counterparty risk, in Counterparty Credit Risk Modelling, M. Pykhtin, ed., Risk Books.

Related Articles

Default Time Copulas; Economic Capital; Exposure to Default and Loss Given Default; Monte Carlo Simulation; Risk-neutral Pricing.

MICHAEL PYKHTIN
Loan Valuation

A loan is an agreement in which one party, called a lender, provides the use of property, the principal, to another party, the borrower. The borrower customarily promises to return the principal after a specified period along with payment for its use, called interest [3]. When the property loaned is cash, the documentation of the agreement between borrower and lender is called a promissory note.

Although cash loans can take many forms, traditionally, banks and other financial institutions are the primary lenders of cash and businesses, organizations, and individuals are the borrowers. Most loans to corporations share a common set of structural characteristics [2, 5].

1. Interest on loans is typically paid quarterly at a rate specified relative to some reference rate such as LIBOR (e.g., L + 250 bp) [note a]. Thus, loans have floating-rate coupons whose absolute values are not known with certainty except over the next quarter.
2. Often the firm's assets or receivables are pledged against the borrowed principal. Because of this, their recovery rates are generally higher than corporate bonds, which are most commonly unsecured.
3. Most loans are prepayable on any coupon date at par, although some agreements contain a prepayment penalty or have a noncall period. The loan prepayment feature ensures that loan prices rarely exceed several points above par.
4. Finally, unlike bonds, which are public securities, loans are private credit agreements. Thus, access to firm fundamentals and loan terms may be limited and loan contracts are less standardized. It is not uncommon to find "nonstandard" covenants or other structural features catering to specific needs of borrowers or investors.

Loan valuation concerns the amount of interest that a lender requires for use of the property or an investor will charge for purchasing the loan agreement. That valuation depends on several factors, such as

1. the likelihood of failure to receive timely payments of principal, called risk of default;
2. the residual value of the loan in the event of default, called its recovery value;
3. the time by which the principal of the loan must be repaid, the maturity;
4. the current market rate of interest for the obligor's likelihood of default, called the market credit spread;
5. the likelihood of the event that a borrower will have repaid the principal at any particular date prior to maturity.

Although the bulk of the loans outstanding are rated investment-grade or better, these loans trade very infrequently because of their high credit quality and lack of price differentiation. In fact, most loans that trade after origination are those made by banks to borrowers having speculative-grade credit ratings. These loans, made to high-yield firms, are typically referred to as leveraged loans, though the exact definition varies slightly among market participants [note b].

The types of loan facilities commonly traded in secondary markets include the following:

1. Amortizing term loans. Usually called "term loan A", the periodic payments from these loans include partial payment of principal, similar to what a mortgage loan does. These loans are usually held by banks and are becoming less popular.
2. Institutional term loans. These loans are structured to have bullet or close-to-bullet payment schedules and are targeted for institutional investors. They are referred to as "term loan B", "term loan C" and so on. Institutional term loans constitute the bulk of the leveraged loan market.
3. Revolving credit lines. These are unfunded or partially funded commitments by lenders that can be drawn at the discretion of the borrowers. The facility is analogous to a corporate credit card. It can be drawn and repaid multiple times during the term of the commitment. These commitments are traded in the secondary market. They are also known as revolvers.
4. Second-lien term loans. They have a cash-flow schedule similar to that of institutional term loans, except that their claims on borrowers' assets are behind first-lien loan holders in the event of default.
5. Covenant-lite loans. These are borrower-friendly versions of institutional term loans that have fewer than the typical stringent covenants
that restrict use of the principal or subsequent borrowing activities of the firm.

Loan Pricing

Like bonds, loans contain risk of default; an obligor may fail to make timely payments of interest and/or principal. Thus, the notion of a credit spread to LIBOR has been used to characterize the riskiness of loans, where the credit spread, $s$, to LIBOR is calculated as

$$V = \sum_{t=1}^{4n} \frac{c_t/4}{\left(1 + \dfrac{r_t + s}{4}\right)^{t}} + \frac{F}{\left(1 + \dfrac{r_{4n} + s}{4}\right)^{4n}} \qquad (1)$$

where $V$ is the market value of the loan, $c_t$ is the coupon (LIBOR + contractual spread), $r_t$ is the LIBOR spot rate for maturity $t$, $F$ is the face value of the loan to be repaid at maturity, and $4n$ is the number of quarterly payment dates. Loan coupons are generally paid quarterly and then reset relative to LIBOR and this is reflected in equation 1. Using equation 1, we can calculate a credit spread for any loan whose market price is known.

One problem with equation 1 for loan valuation is that it fails to account for the fact that loans, unlike bonds, are typically prepayable at par on any given coupon date. The loan prepayment option creates uncertainty in the expected pattern of cash flows and complicates comparisons of value among loans based on their credit spreads. Pricing the prepayment option has proved difficult because of its dependence on the evolution of an obligor's credit state and the changing market costs of borrowing. For example, if a firm's credit improves or the loan rate over LIBOR decreases, the likelihood of prepayment increases; the borrower can refinance at a lower rate. Conversely, if a borrower's credit deteriorates or lending rates increase, it will not be advantageous for the borrower to refinance.

To account for the prepayment option, we price the loans using a credit-state-dependent backward induction method [note c]. To illustrate, consider pricing a term loan with face value $F$, intermediate floating-rate coupon payments of $c_t$, and a maturity at time $T$, to a borrower of known credit quality, $J$. Specifically, Figure 1 displays pricing lattices for a five-year loan to a double-B rated (i.e., $J$ = BB) obligor having a coupon of LIBOR + 3% [note d] and face value of 100 at maturity [note e]. Figure 1(a) shows how the obligor's credit state evolves over time. In the lattice, probabilities are assigned reflecting transitions from each node at time $t$ to all nodes at $t + 1$. Thus, the probability of being at a given node will be conditional upon all the previous transitions. In practice, ratings transition probabilities are based on historical data from credit rating agencies [note f], and these are typically modified by the current market price of risk to produce risk-neutral ratings transition matrices [notes g, h].

Having calculated transition probabilities between all future nodes, we then apply the backward induction method. At maturity $T$, the borrower pays the principal plus coupon, $F + c_T$, or the recovery
value in default $R \cdot F$. Those cash flows are discounted back to each node at the previous period using forward LIBOR at $T - 1$. In other words, for each node at time $i < T$ and credit state $j$ = (AAA, AA, ..., CCC) we calculate an induced value, $v_{i,j}$, as

$$v_{i,j} = \min\left\{\frac{1}{1 + \dfrac{f_{i+1,i}}{4}} \sum_{k=D}^{AAA} \left(P_{j,k,i} \cdot v_{i+1,k}\right) + c_i,\; K_i\right\} \qquad (2)$$

where $P_{j,k,i}$ is the probability of migrating from state $j$ to state $k$ from time $i$ to $i + 1$, $f_{i+1,i}$ is the forward LIBOR rate from time $i$ to $i + 1$, $K_i$ is the terminal value of the loan at time $i$ [note i], and $v_{T,j} = F + c_T$. Thus, at each node $i, j$ we compute the induced value, compare it with the terminal value, $K_i$, and set the value at that node, $v_{i,j}$, to the lesser of the two. In other words, if the induced value exceeds the terminal value, the loan is effectively repaid and terminates at $i, j$. Also, if the loan defaults at time $i$, the loan terminates with a value $v_{i,D} = R \cdot F$ for all $i$. Finally, the value of $v_{i,j}$ at time 0 (in this example, at $v_{0,BB}$) is the model price of the loan.

[Figure 1: two lattice panels plotting credit rating (AAA, AA, A, ...) against time; panel (a) shows risk-neutral credit transition probabilities, panel (b) shows node values.]

Figure 1 Credit-dependent backward induction method. (a) Credit transitions for a double-B rated obligor, derived from historical data and incorporating market risk premiums, are used to specify the likelihood of being in any credit state at future times to maturity. (b) Calculation of node values using backward induction, whereby values at each non-defaulted node are the coupon value at that node plus the sum of the conditional cash flows from the later date, discounted one period at forward LIBOR. In the example in (b), we assume a refinancing penalty of 0.5% of the principal

Although equation 2 is useful for calculating prices of illiquid loans and for estimating the coupon premiums to charge for new loans, it is less useful for evaluating relative value among existing loans, which are better assessed using credit spreads. In fact, we can calculate the credit spread for a loan by discounting its expected nondefault cash flows by a constant amount over the LIBOR curve such that the discounted value matches its current market price. For all nondefault cash flows at a given time, the borrower will either prepay the principal and terminate, or pay a coupon and continue. The prepayment region in the time-and-credit-state lattice can be determined using the values of $v_{i,j}$ in equation 2. The probability of prepaying at period $i$ is the sum of the probabilities of reaching nodes whose values $v_{i,j}$ are capped at the terminal values $K_i$. Given the probability transition matrix and the set $\omega$ of all prepayment nodes, we can calculate the probability of prepayment at time $i$ conditional on no prepayment before time $i$.

Let the conditional probability of prepayment at time $i$ be $q_i$ [note j]; then the discounted cash flow is given by

$$V_J = \sum_{i=1}^{T} D_i \cdot \prod_{j=1}^{i-1} (1 - q_j) \cdot \left[(q_i \cdot K_i) + ((1 - q_i) \cdot CF_i)\right] \qquad (3)$$

where $CF_i = c_i/4$ for $i < T$; $CF_i = (c_i/4 + F)$ for $i = T$, and the discount margin $D_i$ is given by

$$D_i = \prod_{j=0}^{i} \frac{1}{1 + \dfrac{f_{j,j-1} + \hat{s}}{4}} \qquad (4)$$

The credit spread, $s$, is determined by iteratively changing the parameter $\hat{s}$ and recalculating the discounted value of the cash flows, $V_J$, until $V_J$ converges to $P$, the market price.

Revolving lines of credit are priced by assuming that the fraction of the loan drawn at a particular time, called the usage, is directly related to changes in the obligor's credit quality. In other words, if a borrower's credit rating improves, it can access credit more cheaply and is also less likely to draw on existing lines of credit. Conversely, a borrower with deteriorating credit will likely draw on the credit lines it obtained when more highly rated. In this framework, usage can be interpreted as credit-dependent face value. Thus, in the equations above the face value is modified by $F \to U_j \cdot F$, where $j$ is the credit state and usage $U_j$ ranges from 0 to 1.

End Notes

a. LIBOR stands for London interbank offered rate, which roughly corresponds to the interest rate charged between banks when lending large amounts of US dollars outside the United States. The coupon rate for a given quarter is set at the beginning of the period. For example, the L + 250 bp coupon in the text indicates that the borrower will pay one-quarter of 250 bp (0.625%) plus the current three-month LIBOR rate on the next coupon date.
b. Although some people define leveraged loans on the basis of their balance sheet leverage ratio, it is more common to use credit ratings (i.e., below BBB-) or credit spread to LIBOR above some maximum.
c. Several versions of the backward induction method have been proposed over the years [1, 6, 7, 9]. The version presented in equations (1–3) embodies elements that are common to most of these methodologies.
d. Loan spreads are typically quoted in basis points such as LIBOR + 300 bp, where 1% = 100 bp.
e. For convenience, we assume LIBOR is constant at 2%, thereby generating a constant 5% coupon, and that the loan pays annually, rather than the typical quarterly coupon payment.
f. The most well-known credit rating agencies are Fitch, Moody's, and Standard & Poor's.
g. Ratings transition matrices are published regularly by the major agencies [4, 8].
h. Most models specify adjustment of physical credit transitions so that the default probabilities at each time, $i$, match the risk-neutral probabilities of default as implied by the bond and loan markets. For example, the risk-neutral default probability for a single risky cash flow at time $t$ is given as $P_t^Q = \dfrac{1 - e^{-ts}}{1 - R}$ and $P_t^Q = N\left(N^{-1}(P_t) + \beta\lambda\sqrt{t}\right)$, where $P_t^Q$ is the cumulative risk-neutral default probability to time $t$, $s$ is the market credit spread, and $R$ is the recovery rate in default. In the second expression, we calculate $P_t^Q$ from $P_t$, the physical default probability, by adding a term related to the volatility of the credit relative to the market, the market price of risk, and the time to receipt of the cash flow. (For an elaboration and discussion of the derivation of this relation, see Bohn [1]. Zeng and Wen [9] describe its application to loan pricing.)
i. It is common to add a refinancing premium to the principal plus coupon when defining the terminal value for evaluating prepayment as there are costs and/or penalties associated with the refinancing process.
j. The probability of prepayment at time 1 from the initial state $J$ is given by $q_1 = \sum_{k \in \omega} P_{J,k,0}$. For time $i > 1$, we must add the condition that the loan was not prepaid before time $i$; thus, $q_i = \prod_{m=1}^{i-1} (1 - q_m) \cdot \sum_{k \in \omega,\, l \notin \omega} P_{l,k,i-1}$.

References

[1] Bohn, J. (2000). A Survey of Contingent-Claims Approaches to Risky Debt Valuation, Institutional Investor.
[2] Deitrick, W. (2006). Leveraged Loan Handbook, Citi Markets and Banking.
[3] Downs, J. & Goodman, J.E. (1991). Dictionary of Finance and Investment Terms, Barron's, Hauppauge, New York.
[4] Emery, K., Ou, S., Tennant, J., Kim, F. & Cantor, R. (2008). Corporate Default and Recovery Rates, Moody's Global Corporate Finance. 1920-2007, Special Comment.
[5] Miller, S. & William, C. (2007). A Guide to the Loan Market, Standard & Poor's.
[6] Rizk, H. (1993). GMPM Valuation Methodology: An Overview, Citi Markets and Banking.
[7] Rosen, D. (2002). Does Structure Matter? Advanced Methods for Pricing and Managing the Risk of Loan Portfolios, Algorithmics Inc.
[8] Vazza, D., Aurora, D., Kraemer, N., Kesh, S., Torres, J. & Erturk, E. (2007). Annual 2006 Global Corporate Default Study and Rating Transition, Standard and Poor's Global Fixed Income Research.
[9] Zeng, B. & Wen, K. (2006). CreditMark Valuation Methodology, Moody's K.M.V.

Further Reading

Aguais, S., Forest, L. & Rosen, D. (2000). Building a Credit Risk Valuation Framework for Loan Instruments, Algo Research Quarterly.

TERRY BENZSCHAWEL, JULIO DAGRACA & HENRY FOK
Credit Risk

Credit risk is the risk of an economic loss from the failure of a counterparty^a to fulfill its contractual obligations. For example, credit risk in the loan portfolio of a bank materializes when a borrower fails to make a payment, either the periodic interest charge or the periodic reimbursement of principal on the loan he contracted with the bank. Credit risk can be further decomposed into four main types: default risk, bankruptcy risk, deterioration in creditworthiness (or downgrading) risk, and settlement risk.

Default risk corresponds to the debtor’s incapacity or refusal to meet his/her debt obligations, whether interest or principal payments on the loan contracted, by more than a reasonable relief period from the due date, which is usually 60 days in the banking industry.

Bankruptcy risk is the risk of actually taking over the collateralized, or escrowed, assets of a defaulted borrower or counterparty, and liquidating them.

Creditworthiness risk is the risk that the perceived creditworthiness of the borrower or counterparty might deteriorate. In general, deteriorated creditworthiness translates into a downgrade action by the rating agencies, such as Standard and Poor’s (S&P) or Moody’s, and an increase in the risk premium, or credit spread of the borrower. A major deterioration in the creditworthiness of a borrower might be the precursor of default.

Settlement risk is the risk due to the exchange of cash flows when a transaction is settled. Failure to perform on settlement can be caused by a counterparty defaulting, liquidity constraints, or operational issues. This risk is greatest when payments occur in different time zones, especially for foreign exchange transactions, such as currency swaps, where notional amounts are exchanged in different currencies.^b

Credit risk is only an issue when the position is an asset, that is, when it exhibits a positive replacement value. In that situation, if the counterparty defaults, the firm loses either all of the market value of the position or, more commonly, the part of the value that it cannot recover following the credit event. The value it is likely to recover is called the recovery value or recovery rate when expressed as a percentage; the amount it is expected to lose is called the loss given default (see Recovery Rate).

Unlike the potential loss given default on coupon bonds or loans, the one on derivative positions is usually much lower than the nominal amount of the deal, and in many cases is only a fraction of this amount. This is because the economic value of a derivative instrument is related to its replacement, or market value, rather than its nominal or face value. However, the credit exposures induced by the replacement values of derivative instruments are dynamic: they can be negative at one point in time and yet become positive at a later point in time after market conditions have changed. Therefore, firms must examine not only the current exposure, measured by the current replacement value, but also the profile of potential future exposures up to the termination of the deal.

Credit Risk at the Portfolio Level

The first factor affecting the amount of credit risk in a portfolio is clearly the credit standing of specific obligors (see Rating Transition Matrices; Credit Rating). The critical issue, then, is to charge the appropriate interest rate, or spread, to each borrower so that the lender is compensated for the risk he/she undertakes and to set the right amount of risk capital aside (see Economic Capital).

The second factor is “concentration risk” or the extent to which the obligors are diversified in terms of number, geography, and industry.

This leads us to the third important factor that affects the risk of the portfolio: the state of the economy. During an economic boom, the frequency of default falls sharply compared with the periods of recession. Conversely, the default rate rises again as the economy enters a downturn. Downturns in the credit cycle often uncover the hidden tendency of customers to default together, with banks being affected to the degree that they have allowed their portfolios to become concentrated in various ways (e.g., customer, region, and industry concentrations) [1].

Credit portfolio models are an attempt to discover the degree of correlation/concentration risk in a bank portfolio (see Portfolio Credit Risk: Statistical Methods).

The quality of the portfolio can also be affected by the maturities of the loans, as longer loans are generally considered more risky than short-term loans. Banks that build portfolios that are not concentrated in particular maturities (“time diversification”) can reduce this kind of portfolio maturity risk. This also helps reduce liquidity risk or the risk that the bank will run into difficulties when it tries to refinance large amounts of its assets at the same time.

Credit Derivatives and the ISDA Definition of a Credit Event

With the spectacular growth of the market for credit default swaps (CDSs) (see Credit Default Swaps), it has become necessary to be specific about what a credit event is. A credit event, usually a default, triggers the payment on a CDS. This event, then, should be clearly defined to avoid any litigation when the contract is settled. CDSs normally contain a “materiality clause” requiring that the change in credit status be validated by third-party evidence.

The new CDS market has struggled to define the kind of credit event that should trigger a payout under a credit derivatives contract. Major credit events as stipulated in CDS documentations and as formalized by the International Swaps and Derivatives Association (ISDA) are the following.

• Bankruptcy, insolvency, or payment default.
• Obligation/cross default that means the occurrence of a default (other than failure to make a payment) on any other similar obligation.
• Obligation acceleration which refers to the situation where debt becomes due and repayable prior to maturity. This event is subject to a materiality threshold of $10 million unless otherwise stated.
• Stipulated fall in the price of the underlying asset.
• Downgrade in the rating of the issuer of the underlying asset.
• Restructuring: this is probably the most controversial credit event.
• Repudiation/moratorium: this can occur in two situations. First, the reference entity (the obligor of the underlying bond or loan issue) refuses to honor its obligations. Second, a company could be prevented from making a payment because of a sovereign debt moratorium (City of Moscow in 1998).

One of the most controversial aspects of the debate is whether the restructuring of a loan, which can include changes such as an agreed reduction in interest and principal, postponement of payments, or change in the currencies of payment, should count as a credit event. The Conseco case famously highlighted the problems that restructuring can cause.

In October 2000, a group of banks led by Bank of America and Chase granted to Conseco a three-month extension of the maturity of approximately $2.8 billion of short-term loans, while simultaneously increasing the coupon and enhancing the covenant protection. The extension of credit might have helped to prevent an immediate bankruptcy, but as a significant credit event it also triggered potential payouts on as much as $2 billion of CDS.

The original sellers of the CDS were not happy and were annoyed further when the CDS buyers seemed to play the “cheapest to deliver” game by delivering long-dated bonds instead of the restructured loans; at the time, these bonds were trading significantly lower than the restructured bank loans. (The restructured loans traded at a higher price in the secondary market due to the new credit-mitigation features.)

In May 2001, following this episode, ISDA issued a restructuring supplement to its 1999 definitions concerning credit derivative contractual terminology. Among other things, this document requires that to qualify as a credit event, a restructuring event must occur to an obligation that has at least three holders, and that at least two-thirds of the holders must agree to the restructuring. The ISDA document also imposes a maturity limitation on deliverables (the protection buyer can only deliver securities with a maturity of less than 30 months following the restructuring date or the extended maturity of the restructured loan) and it requires that the delivered security be fully transferable. Some key players in the market have now dropped restructuring from their list of credit events.

End Notes

a. In the following, we use the terms borrower and counterparty interchangeably for a debtor. In practice, we refer to issuer risk, or borrower risk, when credit risk involves a funded transaction such as a bond or a bank loan. In derivatives markets, counterparty risk is the credit risk of a counterparty for an unfunded derivatives transaction such as a swap or an option.

b. Settlement failures due to operational problems result only in payment delays and have only minor economic consequences. In some cases, however, the loss can be quite
Credit Risk 3
acts as a hedge against default. For a bank hedging its loans, this can lead to economic and regulatory capital relief. If the buyer does not have exposure to the reference security, the CDS enables him/her to take a speculative short position that benefits from a deterioration of the issuer’s creditworthiness.

CDSs are often used to hedge against losses in the event of a default. Thus, CDSs can be viewed as insurance contracts against default or, more generally, as insurance against credit events. However, it is important to note that, unlike the case of insurance contracts, the protection buyer does not need to own the underlying security or have any exposure to it. In fact, an investor can speculate on the default of an entity by buying protection on a reference entity. Thus, they are more like deep out-of-the-money equity puts rather than insurance contracts.

The sheer volume of the CDS market indicates that a large portion of contracts are speculative since, in many cases, the outstanding notional of CDSs is (much) larger than the total debt of the reference entity. For example, when it filed for bankruptcy on September 15, 2008, Lehman Brothers had $155 billion of outstanding debt, but more than $400 billion notional value of CDS contracts had been written with Lehman as reference entity [8].

Also, unlike insurance companies, which are required to hold reserves in accordance with their issued insurance claims, a protection seller in a CDS is not required to maintain any reserves to pay off buyers. An important case is the event where a protection seller has insufficient funds to cover the default payment, thereby defaulting on its CDS payment. A famous example is the downfall of AIG, in which CDSs sold by its Financial Products subsidiary (AIGFP) played a major role.

CDSs, like many other credit derivatives, are unfunded and typically do not appear as a liability

A basic question is to determine the fair swap spread, or the premium, at inception. The CDS spread must equate the present value at inception of the premium payments (premium leg) and the present value of the payments at default. After inception, the swap must be marked to the market. Arbitrage-free valuation of credit default swaps can be done by using the risk-neutral pricing principle (see Risk-neutral Pricing): we assume a pricing measure $\mathbb{Q}$ such that the present value at t of any payout H at T > t is $E^{\mathbb{Q}}[B(t,T)H]$ where $B(t,T)$ is the (risk-free) discount factor.

Consider a CDS with the notional N, payment dates $T_1, T_2, \ldots, T_n = T$. Denote the (random) date of the underlying credit event as $\tau$. A key role is played by the conditional risk-neutral survival probability $S(t,T) = \mathbb{Q}(\tau > T \mid \mathcal{F}_t)$ where $\mathcal{F}_t$ represents information available at date t. We denote $S(T) = S(0,T)$, its value at the inception of the contract. Denote the recovery rate by R, and $\bar{R} = E^{\mathbb{Q}}[R]$, the “implied” recovery rate (see Recovery Swap).

The premium leg pays a fixed annual percentage X on the notional N at dates $T_i$ until default: the cash flow at $T_i$ is therefore

$$X N (T_i - T_{i-1})\,\mathbf{1}_{\tau > T_i} \qquad (1)$$

The value at inception t = 0 of this stream of cash flows is therefore

$$X N \sum_{i=1}^{n} (T_i - T_{i-1})\, B(0,T_i)\, E^{\mathbb{Q}}[\mathbf{1}_{\tau > T_i} \mid \mathcal{F}_0] = X N \sum_{i=1}^{n} (T_i - T_{i-1})\, B(0,T_i)\, S(0,T_i) = X N \sum_{i=1}^{n} (T_i - T_{i-1})\, D(0,T_i) \qquad (2)$$
Credit Default Swaps 3
4]). This can be represented as a stream of cash flows $N(1-R)\,\mathbf{1}_{T_{i-1} \le \tau \le T_i}$ paid at $T_i$. The value at inception is

$$N \sum_{i=1}^{n} B(0,T_i)\, E^{\mathbb{Q}}[(1-R)\,\mathbf{1}_{T_{i-1} \le \tau \le T_i} \mid \mathcal{F}_0] = N(1-\bar{R}) \sum_{i=1}^{n} B(0,T_i)\, \mathbb{Q}(T_{i-1} \le \tau \le T_i) = N(1-\bar{R}) \sum_{i=1}^{n} B(0,T_i)\,\big(S(T_{i-1}) - S(T_i)\big) \qquad (3)$$

If payments are made at dates other than $T_i$, then accrued interest must be added. If payment dates are frequent (e.g., quarterly) the correction is small.

The fair spread for maturity $T_n$ (or contracted spread or par spread) is defined as the spread that [...]

$$\cdots = N(1-\bar{R}) \sum_{i=1}^{n} B(0,T_i)\,\big[S(T_{i-1}) - S(T_i)\big]$$

which yields

$$X = CDS(T_n) = \frac{(1-\bar{R}) \sum_{i=1}^{n} B(0,T_i)\,\big(S(T_{i-1}) - S(T_i)\big)}{\sum_{i=1}^{n} (T_i - T_{i-1})\, B(0,T_i)\, S(T_i)} \qquad (5)$$

[Figure 2: plot of the CDS spread term structure; horizontal axis: Years (0 to 10); vertical axis ticks from 250 to 700]

Figure 2 shows the term structure of CDS spreads written on Lehman Brothers in September 2008. To derive this formula, we have assumed that the firm’s default time and recovery rate are independent, that interest rate movements are independent from
default times, and that the protection seller has negligible default probability (no counterparty risk; see Counterparty Credit Risk). All these assumptions can be relaxed, especially in the context of reduced-form (see Reduced Form Credit Risk Models; Intensity-based Credit Risk Models) pricing models [3, 5–7]. Hull and White [6] discuss the incorporation of counterparty risk in CDSs. We note that

• CDS spreads depend on the term structure of default probabilities and on the term structure of interest rates, but only through payment dates $T_1, \ldots, T_n$: two models that agree on the term structure of default probabilities will agree on CDS spreads.
• CDS spread depends on the recovery rate only through its expectation $\bar{R}$ under the pricing measure $\mathbb{Q}$. In market quotes, $\bar{R}$ has usually been chosen to be 40% for corporates, although this convention is subject to change.

Implied Default Probability

Given an estimate for the expected recovery rate $\bar{R}$ and the term structure of discount factors, one can solve equation (5) for the term structure of default probabilities given the CDS spreads $CDS(T_1), \ldots, CDS(T_n)$. The solution $S(T_i)$ is called the implied survival probability and $1 - S(T_i)$ is the implied default probability or the “risk-neutral” default probability implied by CDS quotes.

This procedure of inverting survival probabilities from CDS spreads is analogous to the procedure of stripping discount factors/zero coupon bond prices from bond yields (see Yield Curve Construction). Note that, as for yield curve construction, there are, in general, many more dates $T_i$ (quarterly payments) than CDS maturities; hence, reconstructing $S(T)$ from CDS spreads requires interpolation or extra assumptions on survival probabilities. For example, survival probabilities are commonly parameterized as

$$S(t,T) = \exp\left(-\int_t^T h(t,u)\,du\right) \qquad (6)$$

where $h(t,T) = -\partial_T S(t,T)/S(t,T)$ is the forward hazard rate (defined analogously to the forward interest rate; see Heath–Jarrow–Morton Approach).^a

Reduced-form models (see Reduced Form Credit Risk Models; Intensity-based Credit Risk Models) lead to parametric functional forms for $h(t,\cdot)$, which can then be used to calibrate parameters to the observed CDS spreads.
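As a rough illustration of equation (5) and of the inversion described under Implied Default Probability, the following Python sketch computes a par spread from given discount factors and survival probabilities, and then recovers a hazard rate from a quoted spread. It assumes the simplest possible form of equation (6), a hazard rate that is constant in time, and solves for it by bisection; the function names, the flat 3% interest rate, the quarterly grid, and the 5% hazard rate are illustrative assumptions, not values taken from the article.

```python
import math


def par_spread(pay_times, discount, survival, recovery=0.4):
    """Fair CDS spread following equation (5).

    pay_times: payment dates T_1..T_n in years (T_0 = 0 is implied)
    discount:  risk-free discount factors B(0, T_i)
    survival:  risk-neutral survival probabilities S(T_i), with S(0) = 1
    recovery:  expected recovery rate (40% is the quoting convention cited in the text)
    """
    protection = 0.0  # sum of B(0,T_i) * (S(T_{i-1}) - S(T_i))
    annuity = 0.0     # sum of (T_i - T_{i-1}) * B(0,T_i) * S(T_i)
    prev_t, prev_s = 0.0, 1.0
    for t, b, s in zip(pay_times, discount, survival):
        protection += b * (prev_s - s)
        annuity += (t - prev_t) * b * s
        prev_t, prev_s = t, s
    return (1.0 - recovery) * protection / annuity


def flat_survival(pay_times, hazard):
    """Equation (6) under the assumption h(t, u) = hazard (constant)."""
    return [math.exp(-hazard * t) for t in pay_times]


def implied_flat_hazard(spread, pay_times, discount, recovery=0.4,
                        lo=1e-8, hi=5.0, tol=1e-12):
    """Bisection for the constant hazard rate that reproduces a quoted spread.

    Assumes the solution lies in [lo, hi]; with a flat hazard rate the par
    spread increases with the hazard rate, so bisection is applicable.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        model = par_spread(pay_times, discount,
                           flat_survival(pay_times, mid), recovery)
        if model < spread:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Illustrative inputs (assumptions): quarterly payments over 5 years,
# flat 3% continuously compounded risk-free rate, 5% flat hazard rate.
times = [0.25 * i for i in range(1, 21)]
discount = [math.exp(-0.03 * t) for t in times]
spread = par_spread(times, discount, flat_survival(times, 0.05))
h = implied_flat_hazard(spread, times, discount)
print(f"par spread: {spread * 1e4:.1f} bp, implied hazard rate: {h:.4f}")
```

With several quoted maturities, the same idea is usually applied maturity by maturity with a piecewise-constant hazard rate; the single-parameter version above is only meant to make the structure of the inversion concrete.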
Figure 3 Risk-neutral survival probabilities implied by CDS spreads on Lehman Brothers on September 8, 2008 (vertical axis: survival probability; horizontal axis: years)

Figure 4 Hazard rates implied by CDS spreads on Lehman Brothers on September 8, 2008 (horizontal axis: years)
transactions [1]. In the United States the first CDS clearinghouse, ICE Trust, began operating in March 2009. Other proposals to clear credit default swaps have been made by CME, NYSE Euronext, Eurex AG, and LCH Clearnet.

Credit Default Swap (CDS) Basis

An asset swap is a transaction between two parties in which the asset swap buyer purchases a bond from the other party and simultaneously enters into an interest rate swap transaction, usually with the same counterparty, to exchange the coupon on the bond for Libor plus a spread. The spread is called the asset swap spread. A common asset swap is the par asset swap where the buyer pays par at the inception of the deal. Unlike a CDS, an asset swap continues following bond default.

The CDS-Bond basis is the difference between the CDS spread and the asset swap spread on the same bond. It is an indicator of relative value of CDS versus the cash bond [2]. For example, when the CDS spread is higher than the asset swap spread, that is, the basis is positive, the CDS is generally considered to be more attractive than the bond. The reverse is true if the basis is negative. Negative CDS basis has been frequently observed during the recent financial crisis.

Changes in Conventions

Since 2009, the CDS market has been evolving in the direction of trading standardized single-name contracts with an upfront payment and a fixed coupon of either 100 or 500 bp and a common set of coupon payment dates (see www.cdsmodel.com). Standard maturity dates are March/June/September/December 20. Coupon payment dates are like standard maturity dates, but are adjusted to fall on the following business day. Each coupon is equal to

(annual coupon/360) × (number of days in accrual period).

This simplifies processing and computation of coupons and cash flows. For example, every $10 mm 100 bp standard CDS contract will pay the same 2Q09 coupon, $26,111, on Monday, June 22, 2009, regardless of trade date, maturity, or reference entity.

The upfront payment is then set at the inception such that the buyer and seller positions have the same present value. In this convention, the dealer will quote not a spread (which is fixed) but an upfront payment. This convention applies to standardized CDS contracts on names contained in CDX and ITRAXX indices and may set the example for all other CDS contracts in the future.
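The standardized coupon rule in Changes in Conventions can be checked with a few lines of Python. The sketch below assumes that the 2Q09 accrual period runs from March 20, 2009 to the adjusted payment date of June 22, 2009 (94 days), which is the reading that reproduces the $26,111 figure quoted in the text. The helper names are hypothetical, and the business-day adjustment only skips weekends; a real calendar would also handle holidays.

```python
from datetime import date, timedelta


def following_business_day(d):
    """Roll a date forward past Saturday/Sunday (holidays are ignored here)."""
    while d.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        d += timedelta(days=1)
    return d


def standard_coupon(notional, annual_coupon, accrual_start, accrual_end):
    """(annual coupon / 360) x (number of days in accrual period) x notional."""
    days = (accrual_end - accrual_start).days
    return notional * annual_coupon / 360.0 * days


# Assumed accrual period for the 2Q09 coupon: March 20 to the adjusted June date.
start = following_business_day(date(2009, 3, 20))  # a Friday, so unchanged
pay = following_business_day(date(2009, 6, 20))    # a Saturday, rolls to Monday
coupon = standard_coupon(10_000_000, 0.01, start, pay)
print(pay, round(coupon, 2))  # 2009-06-22 26111.11
```

Because the coupon depends only on the notional, the fixed coupon rate, and the common accrual dates, every contract with the same notional and coupon pays the same amount, which is the simplification the text describes.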
End Notes

a. Not to be confused with the (instantaneous) hazard rate or the default intensity (see Hazard Rate).

References

[1] Cont, R. & Minca, A. (2009). Credit Default Swaps and Systemic Risk. Financial Engineering Report, Columbia University.
[2] Davies, M. & Pugachevsky, D. (2005). Bond spreads as a proxy for credit default swap spreads, Risk.
[3] Duffie, D. (1999). Credit swap valuation, Financial Analyst’s Journal 54(1), 73–87.
[4] Duffie, D. & Singleton, K.J. (1999). Modeling term structures of defaultable bonds, Review of Financial Studies 12, 687–720.
[5] Hull, J. & White, A. (2000). Valuing credit default swaps i: no counterparty default risk, Journal of Derivatives 8, 29–40.
[6] Hull, J. & White, A. (2000). Valuing credit default swaps ii: modeling default correlations, Journal of Derivatives 8, 897–907.
[7] Schönbucher, P. (1998). Term structure modeling of defaultable bonds, Review of Derivatives Research 2, 161–192.
[8] VanDuyn, A. & Weitzman, H. (2008). Fed to hold CDS clearance talks, Financial Times (Oct 7).

Related Articles

Basket Default Swaps; Counterparty Credit Risk; Credit Default Swaption; Equity–Credit Problem; Exposure to Default and Loss Given Default; Hazard Rate; Intensity-based Credit Risk Models; Recovery Rate; Recovery Swap; Reduced Form Credit Risk Models.

RAMA CONT
Total Return Swap

A total return swap (TRS) is a financial contract between two counterparties to synthetically replicate the economic returns of an underlying asset. The principal mechanism and interaction are shown in Figure 1.

[Figure 1: principal mechanism of a TRS; surviving diagram labels: “Interest etc.” and “Reference asset”]

The reference asset still belongs to the TRS payer, who is buying protection from the TRS receiver. The reference asset typically pays fixed interest and carries credit risk against which protection is sought. The TRS payer transfers any payment made by the reference asset to the TRS receiver, who conversely pays a variable payment (typically the London interbank offered rate (LIBOR)) and a positive (or negative) spread as risk premium. Additionally, settlements for price depreciation and appreciation of the reference asset are made between the counterparties.

The TRS payer thus sells the market and credit risk of the reference asset to the TRS receiver without selling the reference asset itself. In the case of a credit event, the TRS receiver pays the difference between the value of the reference asset and the recovery value to the TRS payer. He acquires the counterparty risk of the TRS receiver instead.

Note that payments are not made continuously but rather at discrete times, that is, at given and specified reset periods. Occasionally, the reference asset consists of a whole portfolio of assets.

Reasons for Investing in a Total Return Swap

The TRS receiver explores the possibility of investing in the risk profile of the reference asset without owning it legally. Thus, insurance companies, hedge funds, and so on, count among the typical investors. They aim to work on a leveraged basis, diversify their portfolio, and achieve higher yields by taking on risk exposure. They can explore a synthetic way to make loans without having the costs and administrative burden; they explore possibilities for originating credit. Sometimes, for certain investors with capital constraints, TRS may be an effective way to leverage the use of capital.

TRS payers are typically lenders and investors who want to reduce their respective exposure to the given borrower and potentially diversify a concentrated portfolio without removing the asset itself from their balance sheet, while maintaining the relationship with the borrower. However, TRS payers do not have to hold the asset itself on their balance sheets. If a TRS payer is taking an outright position, i.e. without holding the asset itself on the balance sheet, a TRS is an efficient way to short the asset synthetically.

A TRS can help to activate comparative advantages of financing, depending on whether a certain market plays a role in a certain part of the market. Typically, a TRS is an off-balance-sheet deal.

Comparison with an Outright Investment in the Bond

The most striking difference from an outright investment in a bond or a loan is that with a TRS, price changes become cash flows at the predefined reset periods, at which settlements are made. For a bond, they are only accounting profits or losses and become effective at maturity or when the position is unwound. Thus, the TRS resembles a futures contract whereas the direct investment is more similar to a forward one (see, e.g., Schönbucher [3]).

Valuation and Risk Management

Schönbucher [3] gives an indication about the payoff streams of a TRS from the point of view of the TRS receiver to be counted for valuation purposes:

• Initially the TRS is entered into at fair value; hence, no cash flow takes place.
• If the bond does not default, the TRS receiver pays a variable coupon plus (or minus) a spread at every predefined reset point; he receives the interest from the bond and the difference in market value of the bond since the last reset is exchanged.
• If the bond defaults, the TRS receiver pays for a last time the variable coupon plus (or minus) a spread and the difference between the last market value of the bond and its recovery.

Thus, several risk factors influence the value of the TRS: the interest rate risk driven by the changing yield curve and the default probability of the reference asset (we neglect, for instance, the counterparty risk). Typical valuation models include the Duffie–Singleton model, hazard rates, and forward measure. The credit risk is reflected in the fair spread (fair means that initially there has to be no cash flow) (see also Anson et al. [1]).

References

[1] Anson, M.J.P., Fabozzi, F.J., Choudry, M. & Chen, R.R. (2004). Credit Derivatives: Instruments, Applications, and Pricing. John Wiley & Sons.
[2] Martin, M.R.W., Reitz, S. & Wehn, C.S. (2006). Kreditderivate und Kreditrisikomodelle: Eine mathematische Einführung, Vieweg Verlag. (in German).
[3] Schönbucher, P. (2003). Credit Derivatives Pricing Models: Models, Pricing and Implementation, Wiley.

Further Reading

Kasapi, A. (1999). Mastering Credit Derivatives: A Step-by-Step Guide to Credit Derivatives and their Application. Prentice Hall.
Tavakoli, J.M. (1998). Credit Derivatives. A Guide to Instruments and Applications, Wiley.

CARSTEN S. WEHN
Recovery Swap

A recovery swap (RS), also called a recovery lock or a recovery default swap (RDS), is an agreement to exchange a fixed recovery rate $R_S$ for the realized recovery rate $\phi$, the latter being determined under prespecified contractual terms. The fixed recovery rate may be specified in terms of a recovery of par amount (RP), or as the recovery percentage of an equivalent Treasury bond, known as recovery of Treasury (RT), or as a fraction of the market value of the bond prior to default, also known as recovery of market value (RMV).

A recovery swap is no different than a forward contract at rate $R_S$ on the underlying recovery rate $\phi$. The maturity of the contract is denoted as T. If the reference credit underlying the recovery swap does not default before T, the swap expires worthless. There are no intermediate or periodic cash flows in a recovery swap. In a liquid market for recovery swaps, the quoted rate $R_S$ is the best forecast of the expected recovery rate for default at time T. This recovery rate may then be used to price credit default swaps (CDSs).

We assume that the buyer of the recovery swap will receive $R_S$ and pay $\phi$. Hence, the buyer gains when the realized recovery rate is lower than that of the strike rate $R_S$. The net payoff to the contract is $(R_S - \phi)$. Recovery swaps are quoted in terms of the “strike” rate $R_S$. For example, a dealer might quote a recovery swap in GM at 37/40. This means the dealer is prepared to sell a recovery swap with $R_S = 37\%$ and buy at $R_S = 40\%$.

In order to state the triangular arbitrage relation more generally, consider the case when $R_S \neq R_D$, that is, when the strike recovery rates of the recovery swap and DDS are not the same. Let the premium on the CDS be $c_1$ and the premium on the DDS be $c_2$. In order to replicate the RS, we will hold x units of the CDS and y units of the DDS. The replication has two conditions:

1. The cashflows at default must be equal for the RS and the replicating portfolio of CDS and DDS. In other words,

$$x \cdot (1 - \phi) - y \cdot (1 - R_D) = R_S - \phi \qquad (1)$$

2. The premiums of the replicating portfolio must be net zero as the recovery swap does not have any intermediate cash flows. Hence the following equation must hold:

$$y \cdot c_2 - x \cdot c_1 = 0 \qquad (2)$$

Set x = 1 in equation (1) so as to eliminate dependence on $\phi$ in the equation. Then we have that

$$x = 1 \quad \text{implies} \quad y = \frac{1 - R_S}{1 - R_D} \qquad (3)$$

Substituting this result for x, y in equation (2) results in the following:

$$\frac{c_1}{c_2} = \frac{1 - R_S}{1 - R_D} = \frac{L_S}{L_D} \qquad (4)$$

where L denotes the loss rate. We note the following:
• These no-arbitrage based results do not depend against the default risk of this entity, but because
in any way on the underlying process for default of differences in settlement at default between the
or that of recovery. This makes the relationships CDS Index and the single-name CDS, the investor
in equation (4) very general and easy to apply in might get different recovery rates on the two instru-
practice, as well as easy to assess empirically for ments (recovery basis risk). Hence, recovery swaps
academic purposes. can hedge against recovery basis risk by locking-in
recovery rates.
Furthermore, in the case where the CDSs specify a
Applications and Uses of Recovery Swaps physical settlement, it is possible that the underlying
bonds might be scarce compared to the notional
Recovery swaps were first developed by BNP Paribas amount of CDS traded on the bond. This causes a
in early 2004 [10]. In response to market demand, “delivery squeeze” where the price, and therefore the
banks started issuing fixed-rate recovery collateral- recovery of the bond, is artificially increased because
ized debt obligations (CDOs) and as a consequence the buyers of CDS need to buy the bonds for delivery
were bearing recovery rate risk. In order to hedge to their counterparty. For instance, in October 2005,
against this recovery rate risk, market participants Delphi Corporation had $27.1 billion of outstanding
started selling recovery swaps.

Recovery swap markets are predominantly traded on reference entities with a high risk of default or of declining credit quality. For this reason, the largest activity in the recovery swaps market is in the auto parts and auto manufacturing sectors and geographically on North American entities [7]. Trading volumes in recovery swaps, although still small relative to the overall credit derivatives market, increased in 2005 with the defaults of Delphi Corporation and the Collins & Aikman Corporation [7]. Still, the market remains largely undeveloped and the International Swaps and Derivatives Association (ISDA), in May 2006, issued a template for the documentation on recovery swaps but the full documentation remains to be completed at this time [13].

There are two primary uses of recovery swaps. The first is to isolate the probability of default from the recovery rate. Traders may have in-house expertise in determining default probabilities but not in determining recovery and thus may wish to hedge their recovery risks through recovery swaps. The second use of recovery swaps is to eliminate recovery basis risk. Recovery basis risk occurs because of different settlement procedures between CDSs and CDOs. CDSs are often settled physically, meaning that when default occurs the seller of protection receives the defaulted bonds, whereas CDOs are almost always cash settled. The difference in settlement procedures is the source of recovery basis risk. For instance, an investor might hold a CDS Index that includes a given reference entity and have an offsetting position by selling the single-name CDS of the same entity. The investor is hedged

CDSs against notional outstanding bonds of just $2 billion causing the price of the defaulted bonds to surge by as much as 24% [9]. The consequence of this delivery squeeze is to reduce the profits accruing to buyers of CDSs, and recovery swaps provide a hedge against this by locking in the recovery rate ahead of time. More recently though, most CDSs are being settled in cash, thereby circumventing this problem.

Recovery Risk

There is a growing literature on recovery risk. Berd [3] provides a nice introduction and analysis of recovery swaps. DDSs are analyzed in [4]. Altman et al. [2] present a detailed study showing how recovery rates depend on default rates, positing and finding an inverse relationship. Chan-Lau [6] presents a method to obtain the upper bound on recovery on emerging market debt. Das and Hanouna [8] develop a methodology for identifying implied recovery rates and default probabilities from CDS spreads and data on stock prices and volatilities. Acharya et al. [1] provide empirical evidence that recovery rates depend on the industry, state of the economy, and specificity of assets to the industry in which the firm operates. Carey and Gordy [5, 14] show that recovery has systematic risk. Guo et al. [11] look at recoveries in reduced form models by explicitly modeling the postbankruptcy process of recoveries. The well-known loss given default model of Gupton and Stein [12] is well liked and used. Absolute priority rule (APR) violations are modeled in [15]. For a nice overview, see [16].
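The hedging role of recovery swaps against a delivery squeeze can be illustrated with a minimal sketch. It assumes the usual recovery swap convention, which is not spelled out in the text here: at default, one party receives the realized recovery rate and pays a recovery rate fixed at inception, both applied to the same notional. The function names, the notional, and the recovery values are hypothetical.

```python
def cds_protection_payoff(notional, realized_recovery):
    """Default payoff to a CDS protection buyer: par minus realized recovery."""
    return notional * (1.0 - realized_recovery)


def recovery_swap_payoff(notional, fixed_recovery, realized_recovery):
    """Payoff to the party that receives realized recovery and pays the fixed rate.

    Assumed convention: the swap settles only at default and has no premium.
    """
    return notional * (realized_recovery - fixed_recovery)


def hedged_payoff(notional, fixed_recovery, realized_recovery):
    """CDS protection combined with a recovery swap on the same notional."""
    return (cds_protection_payoff(notional, realized_recovery)
            + recovery_swap_payoff(notional, fixed_recovery, realized_recovery))


if __name__ == "__main__":
    # A squeeze that pushes the realized recovery up lowers the unhedged CDS
    # payoff, while the hedged payoff stays at notional * (1 - fixed_recovery).
    for realized in (0.20, 0.40, 0.64):
        print(realized,
              cds_protection_payoff(10e6, realized),
              hedged_payoff(10e6, 0.40, realized))
```

Under these assumptions the combined position pays the notional times one minus the fixed recovery, whatever recovery is realized, which is the sense in which the recovery rate is locked in ahead of time.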
under risk-neutral measure: Aj = (Tj −1 , τk )
Tj −1 <τk ≤Tj
V = i E[min[C, a·STi (Ti , Ti+M )]PTi D(0, Ti )]
× (ps (τk−1 ) − ps (τk ))D0 (τk ) (5)
i
(4) (6)
where $B_t(T_j)$ is the time-$t$ value of a risky unit payment at $T_j$ and $H_t(\tau_{k-1}, \tau_k)$ is the time-$t$ value of a unit payment at $\tau_k$ conditional on a default event in the interval $(\tau_{k-1}, \tau_k]$. We used a discretized form of the accrued interest consistent with equation (3).

The existence of the required measure follows from a representation of the CDS rate as a ratio of two tradable assets:

$$S_t(T_i, T_{i+M}) = \frac{L_t(T_i, T_{i+M})}{N_t(T_i, T_{i+M})} \qquad (7)$$

where $L_t(T_i, T_{i+M})$ is the time-$t$ expectation of the CDS default leg,

$$L_t(T_i, T_{i+M}) = (1 - R) P_t \sum_{T_i < \tau_k \le T_{i+M}} H_t(\tau_{k-1}, \tau_k) \qquad (8)$$

(For a rigorous discussion of the mathematics of measure change involving risky basis point value as a numeraire, see [8].) After the measure change, the contribution of each individual coupon payment to the CM CDS leg can be written as

$$\Delta_i N_0(T_i, T_{i+M}) \, E\left[ f(S_{T_i}(T_i, T_{i+M})) \frac{B_{T_i}(T_i)}{N_{T_i}(T_i, T_{i+M})} \right] \qquad (9)$$

where $f(X) = \min(C, aX)$. The next step is to assume that $S_t$ follows a lognormal martingale, $F \exp(\sigma W_t - 0.5 \sigma^2 t)$, and to replace the true value of the ratio $B_{T_i}(T_i)/N_{T_i}(T_i, T_{i+M})$ by the value of a suitable increasing function $g(S)$ at $S = S_{T_i}$. Imposing the condition $N_0(T_i, T_{i+M}) g(F(T_i, T_{i+M})) = p_s(T_i) D_0(T_i)$ ensures that the calculation of the average (9) for $\sigma = 0$ brings us back to the nonstochastic valuation. A nonzero volatility $\sigma > 0$ leads to a positive correction due to the convexity of the product $g(S) f(S)$ in the region of values of $S$ close to $F(T_i, T_{i+M})$ and distant from the cap $C$.

This approach has the advantage of relative simplicity and the potential ability to calibrate the model volatility $\sigma$ to CDS options. The disadvantage is an uncontrollable assumption in the choice of the function $g(S)$.

Instantaneous Hazard Rate Modeling

A more systematic modeling of CM CDS is possible in the framework of stochastic instantaneous hazard rates. This approach starts with postulating a stochastic differential equation (SDE) for the stochastic default intensity $\lambda(t)$. A reasonable choice is a lognormal process (similar to the Black–Karasinski model of interest rates) or an affine process (similar to the Cox–Ingersoll–Ross model of interest rates). A normal process (similar to the Hull–White model of interest rates) was also used despite a conceptual problem posed by positive probabilities of negative hazard rates. Multifactor models for joint stochastic evolution of hazard rates and instantaneous interest rates are also possible.

An exact analytical solution for a CM CDS is not available in any of these models because of a two-layered structure involving inner expectations for CDS rate fixings conditional on the state achieved on the fixing dates. The machinery of trees, lattices, or partial differential equation (PDE) solvers, however, can be adapted to handle CM CDS structures. The key element is a construction of a slice of values of CM CDS rate fixings on the set of model states achieved on the fixing date. This is done using a representation of the CDS rate in terms of conditional expectations of elementary instruments $B_t(T)$ and $H_t(T_1, T_2)$ provided by equations (6), (7), and (8). We refer to Chapter 7 of the book [7] for the details of a possible realization of a tree-based construction.

An advantage of hazard rate modeling is its consistency, which makes it possible to price a wide range of credit instruments of different maturities, including CDS options, asset swaps, bond options, and credit linked notes using the same model. A notable disadvantage is the difficulty of calibration.

Forward Credit Spread Modeling

As drawbacks of short-rate models of interest rates led to the invention and development of swap and LIBOR market models, a similar progression is taking place in the space of structured credit models. We refer to the work [1] and Chapter 23 of [2] for details of a model in which the CDS rates $S_t(T_i, T_{i+M})$ are chosen as primary variables.

An advantage of this approach is the ease of calibration and ability to derive efficient analytical
approximations under minimal additional assumptions. At present, the disadvantage is the paucity of relevant market data, leaving a large freedom in specifying the structure of volatilities and correlations. A full payback from this level of model sophistication cannot be expected until the market for structured products develops enough to provide liquid quotes for CDS option volatilities for a dense set of maturities, similarly to caplet and swaption volatility matrices in the interest rate markets.

End Notes

a. The structure can obviously be extended to admit a fixed rate reset floor, which, however, is not included in the standard ISDA template.
b. The actual payment dates $T_i$ usually have a delay of at least one business day and are rolled forward or backward to fall on a valid business day in accordance with currency-dependent conventions. In the practice of quantitative modeling, proper care is taken to make sure that correct discount factors reflecting the actual payment dates are used.
c. These expressions are often written in terms of integrals obtained in the limit of an infinitely frequent discretization. The same remark applies to equation (3).
d. A rigorous calculation of the convexity correction to accrued interest term is technically involved and can be avoided by using a proportionally adjusted correction to the main term.

References

[2] Brigo, D. & Mercurio, F. (2007). Interest Rate Models–Theory and Practice, with Smile, Inflation, and Credit, 2nd Edition, Springer.
[3] Calamaro, J.-P. & Nassar, T. (2004). CMCDS: The Path to Floating Credit Spread Products, Deutsche Bank, Global Markets Research.
[4] ISDA (2005). Additional Provisions for Constant Maturity Credit Default Swaps, International Swaps and Derivatives Association, November 21, 2005. Available at www.isda.org.
[5] Pedersen, C. & Sen, S. (2004). Valuation of Constant Maturity Default Swaps, Lehman Brothers, Quantitative Research Quarterly.
[6] Renault, O. & Ratul, R. (2007). Constant maturity credit default swaps, in The Structured Credit Handbook, A. Rajan, G. McDermott & R. Ratul, eds, Wiley Finance, pp. 57–77.
[7] Schönbucher, P. (2003). Credit Derivatives Pricing Models, Wiley Finance.
[8] Schönbucher, P. (2004). Measure of survival, Risk August, 79–85.

Related Articles

Constant Maturity Swap; Convexity Adjustments; Credit Default Swaps; Credit Default Swaption; Forward and Swap Measures; Hazard Rate; Intensity-based Credit Risk Models; Swap Market Models; Term Structure Models.

TIMUR S. MISIRPASHAEV
with respect to $G_T$. This does not change the computation in the previous equation since $\hat{Q}(\tau > T) = 1$.

Let us first remark that for the forward CDS to be priced normally, we must have $E^{\hat{Q}}[p_T 1_{\{\tau > T\}}] = p_{0,T}$, where $p_{0,T}$ denotes the forward CDS premium. In the case where $p_T 1_{\{\tau > T\}}$ is lognormal under $\hat{Q}$, with volatility parameter $\sigma$, we readily get a Black formula for the price of the CDS option:

$$\sum_{T_k > T} \tilde{B}(0, T_k) \times (p_{0,T} N(d_1) - p N(d_2)) \qquad (3)$$

where $d_1 = \frac{\ln(p_{0,T}/p) + \frac{\sigma^2}{2} T}{\sigma \sqrt{T}}$ and $d_2 = d_1 - \sigma \sqrt{T}$.

Intensity Approaches

Another approach consists in specifying the intensity of the default time. This is the path followed in [2–4]. To circumvent the difficulty with default intensity dropping down to zero after default and the various mathematical issues related to enlargement of filtrations, the easiest way is to model the default time through a Cox process. We thus define the default time $\tau$ associated with the underlying name as

$$\tau = \inf\left\{ t, \int_0^t \lambda(s)\,ds \ge -\ln U \right\} \qquad (4)$$

where $\lambda$ is a positive process adapted to some filtration $\mathcal{F} = (\mathcal{F}_t)$ and $U$ is a standard uniform variable independent of $\mathcal{F}$. For simplicity, we will further assume that $(\mathcal{F}, Q)$ is a Brownian filtration. Following [1] or [8], we define as $H = (H_t)$ the filtration generated by the counting process $N_t = 1_{\{\tau \le t\}}$ and we denote by $G_t = \mathcal{F}_t \vee H_t$ the relevant information at time $t$, incorporating knowledge about occurrence of default prior to $t$ and current and past values of financial variables such as interest rates or credit spreads of the reference entity (see Filtrations for mathematical details about filtrations in finance). Up to default time, $\lambda(t)$ is the default intensity of $\tau$ (we refer to Point Processes regarding point processes and to Compensators about compensators and intensities). While the default intensity drops to zero after $\tau$, we can remark that $\lambda(t)$ is still well defined, thanks to the above Cox modeling framework. For instance, one can consider shifted Cox–Ingersoll–Ross (CIR) processes for the short rate $r$ and the pseudodefault intensity $\lambda$ (see Cox–Ingersoll–Ross (CIR) Model).

$p_{t,T}$ will further denote the time-$t$ forward CDS premium. Though $p_{t,T}$ has a financial meaning only on the set $\{\tau > t\}$, its computation can be extended to the complete set of events in the previous Cox modeling framework (see [4] for further discussion). $p_{t,T}$ solves the following equation where, once again, we do not take into account accrued premium or up-front payment effects:

$$p_{t,T} \times \sum_{T_k > T} E\left[ \exp\left( -\int_t^{T_k} (r + \lambda)(u)\,du \right) \Big| \mathcal{F}_t \right] = \int_T^{T_N} E\left[ \exp\left( -\int_t^{s} (r + \lambda)(u)\,du \right) \times (1 - \delta) \lambda(s) \Big| \mathcal{F}_t \right] ds \qquad (5)$$

Prior to default, the left-hand term corresponds to the value at time $t$ of the premium leg of the underlying forward default swap, while the right-hand term is associated with the default leg. Clearly $p_{t,T}$ is $\mathcal{F}_t$-measurable and we can prove that it is both a $(\mathcal{F}, \hat{Q})$ and a $(G, \hat{Q})$ martingale. Thus, the forward default swap premium shares the properties of a "true" price. It can be checked that $p_{T,T} = p_T$.

Using an extended version of Girsanov theorem (see Equivalence of Probability Measures) for point processes (see Point Processes), it can be shown that

$$\frac{dp_{t,T}}{p_{t,T}} = \sigma \, d\hat{W}_t \qquad (6)$$

where $\hat{W}$ is a $(\mathcal{F}, \hat{Q})$ Brownian motion.

Let us also assume that there exists some specification of $r$ and $\lambda$ such that the volatility $\sigma$ is constant. Then, the forward CDS spread has a lognormal dynamics under $\hat{Q}$. This readily leads to the already stated Black formula for the price of the CDS option. The most obvious advantage is the simplicity of the outcome. The drawbacks are also rather obvious. The lognormal assumption for the forward spreads is questionable since jumps are often included in the dynamics of $\lambda$ as in the affine specification within [5].
The intensity approach is easy to understand and is consistent across strikes, maturity of the option, and maturity of the CDS. However, it entails dealing with extra parameters and is numerically more involved. In the more general setting involving correlation between $r$ and $\lambda$, Monte Carlo simulation is usually required. In special cases, such as deterministic default-free rates, analytical formulas can be derived. Fortunately enough, in most examples, the correlation parameter has little impact on option prices and analytical approximations of the implied volatility in the Black formula can be derived. Let us remark that in these approximations $\sigma$ depends on the exercise date and the maturity of the underlying CDS.

Acknowledgments

The author thanks A. Cousin, L. Cousot, A. Godet and C. Pedersen and the editors for helpful remarks. The usual disclaimer applies.

References

[1] Bielecki, T.R. & Rutkowski, M. (2002). Credit Risk: Modeling, Valuation and Hedging, Springer.
[2] Brigo, D. & Alfonsi, A. (2005). Credit default swap calibration and derivatives pricing with the SSRD stochastic intensity model, Finance and Stochastics 9(1), 29–42.
[3] Brigo, D. & Cousot, L. (2006). The stochastic intensity SSRD implied volatility patterns for credit default swap options and the impact of correlation, International Journal of Theoretical and Applied Finance 9(3), 315–339.
[4] Brigo, D. & Matteotti, C. (2005). Candidate Market Models and the Calibrated CIR++ Stochastic Intensity Model for Credit Default Swap Options and Callable Floaters. Working paper, Credit Models, Banca IMI.
[5] Duffie, D. & Gârleanu, N. (2001). Risk and valuation of collateralized debt obligations, Financial Analysts Journal 57(1), 41–59.
[6] Hull, J. & White, A. (2003). The valuation of credit default swap options, Journal of Derivatives 10(3), 40–50.
[7] Jamshidian, F. (2004). Valuation of credit default swaps and swaptions, Finance and Stochastics 8(3), 343–371.
[8] Jeanblanc, M. & Rutkowski, M. (2000). Modelling of default risk: an overview, in Mathematical Finance: Theory and Practice, J. Yong & R. Cont, eds, Higher Education Press, Beijing, pp. 171–269.
[9] Schönbucher, P.J. (2000). A Libor Market Model with Default Risk. Working paper, University of Bonn.
[10] Schönbucher, P.J. (2003). A Note on Survival Measures and the Pricing of Options on Credit Default Swaps. Working paper, ETH Zurich.

Related Articles

Change of Numeraire; Compensators; Cox–Ingersoll–Ross (CIR) Model; Credit Default Swap Index Options; Credit Default Swaps; Filtrations; Point Processes.

JEAN-PAUL LAURENT
Credit Default Swap (CDS) Indices

Credit markets have shown tremendous growth in the last 10 years. In particular, the telecom bubble and corporate scandals of the early 2000s increased the interest of market participants in products such as credit default swaps (CDS) (see Credit Default Swaps), which provide protection against credit events. In response to this demand for credit protection, credit indices were introduced in 2003, increasing the liquidity of CDS markets. These indices are, in essence, standardized baskets of CDS written on investment-grade and high-yield corporate issuers, or emerging-market governments. Table 1 shows the basic composition criteria of the main indices (more stringent criteria apply too, in particular, those concerning liquidity of the individual CDS). The specific constituents for each index are posted at www.markit.com.

In most indices, issuers are equally weighted. A new series of a given index is issued semiannually, excluding from the basket those issuers who no longer match selection criteria (e.g., downgraded issuers) and adding new ones. In case of a default event, the defaulting issuer is removed from the basket, but the weights remain and the index continues to trade. The reduced basket is referred to as a new version of the same series. The loss payment for a default event is determined through the same settlement auction as for single-name CDS (see Credit Default Swaps).

Credit indices are commonly issued with initial maturities of 3–10 years. Similar to CDS, a credit index is a contract which entails that the protection buyer pays a spread (or coupon) at a regular frequency (usually quarterly according to International Swaps and Derivatives Association (ISDA) dates) in return for default protection on some notional amount. In case of a default of one of the referenced issuers, the protection seller pays the nonrecovered part of the protected notional times the weight of the issuer in the index. The contract does not terminate, but the protected notional is reduced accordingly. Importantly, the index trades with a fixed spread for each series; changes in market pricing are reflected in the upfront payment required to enter the contract. In contrast, in a standard CDS (for all but distressed credits), the spread is set on any given day such that no upfront payment is required (see End Note a).

A standard market practice is to roll index positions so as to maintain a position in the on-the-run (i.e., most recent) series and version, in order to guarantee maximum liquidity. From an investor's point of view, in addition to enabling credit diversification, credit indices introduce the possibility of leverage without significant liquidity concerns, as several derivatives on these indices exist today (see Collateralized Debt Obligations (CDO); Credit Default Swap Index Options).

Pricing Framework

Credit indices are routinely priced through the standard CDS model. Indeed, though the index contracts trade with a fixed spread, the convention is to quote a theoretical fair spread (i.e., the coupon that the index would need to pay in theory in order to require no upfront payment) and use the CDS model to convert this fair spread to an upfront payment for the index. The issuers in the basket are assumed to be homogeneous in credit quality and recovery rate. When deriving the common hazard credit curve for the issuers, the convention is to assume a flat curve (see Hazard Rate). The expected losses are computed from the credit curve, assuming that losses are paid at the end of a coupon period, and given a particular recovery rate. The present value for the index contract is then the difference between the discounted expected losses and the discounted spread payments weighted by the survival probability (since premium is only paid on the remaining protected notional).

The contract can alternatively be valued by using information on the individual constituents, thus relaxing the homogeneity assumption. We can theoretically replicate the index by considering a basket of individual CDS that pay the same spread. We compute the expected losses on the index by aggregating the individual-constituent expected losses, each of which is derived from the full-term structures of credit spreads for the constituent. Similarly, the payment side aggregates survival probabilities over all issuers. It is worth noting that the dependence structure between the issuers does not play a role here as the whole basket is considered.
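The conversion from a quoted fair spread to an upfront payment under a flat hazard curve can be sketched as follows. The sketch follows the conventions named in the Pricing Framework discussion (homogeneous issuers, flat hazard rate, losses paid at the end of each coupon period, premium weighted by survival probability). The flat continuously compounded discount rate, the regular quarterly grid without day-count adjustments, the omission of accrued premium, and all function and parameter names are simplifying assumptions of this example.

```python
from math import exp, log


def flat_hazard_from_fair_spread(fair_spread, recovery, dt):
    """Flat hazard rate that makes the contract worth zero at the fair spread.

    With losses paid at period end, each period satisfies
    fair_spread * dt * Q(t_i) = (1 - recovery) * (Q(t_{i-1}) - Q(t_i)).
    """
    return log(1.0 + fair_spread * dt / (1.0 - recovery)) / dt


def index_upfront(fair_spread, fixed_coupon, maturity,
                  recovery=0.4, rate=0.03, dt=0.25):
    """Upfront paid by the protection buyer, per unit of notional."""
    hazard = flat_hazard_from_fair_spread(fair_spread, recovery, dt)
    periods = round(maturity / dt)
    pv_losses = 0.0
    pv_premium = 0.0
    for i in range(1, periods + 1):
        t_prev, t = (i - 1) * dt, i * dt
        discount = exp(-rate * t)  # assumed flat discount curve
        q_prev, q = exp(-hazard * t_prev), exp(-hazard * t)
        pv_losses += discount * (1.0 - recovery) * (q_prev - q)
        pv_premium += discount * fixed_coupon * dt * q
    return pv_losses - pv_premium


if __name__ == "__main__":
    # Hypothetical quote: index fixed coupon 100 bp, fair spread 150 bp, 5 years.
    print(index_upfront(0.015, 0.010, 5.0))
```

When the fair spread equals the fixed coupon the upfront is zero by construction, and it becomes positive for the protection buyer once the fair spread widens beyond the coupon.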
Imperfect Replication

Pricing the index through the constituents is appealing, but it is not surprising to observe significant differences with the quoted index prices. The replicating strategy is not perfect, as the mechanics behind credit indices are slightly different from those for CDS. We mentioned earlier that an index trades with a floating upfront and a fixed spread, whereas most CDS trade with a floating spread and no upfront. This implies that we cannot, in general, enter into a basket of CDS contracts that pay the same spread as the index. And while the basket can be composed without initial capital, the index requires a nonzero upfront investment. After a default event, new differences between the credit index and the basket appear. On the index, the reduction in spread payments is independent of the defaulting issuer; the spread is fixed and only the protected notional changes. On the other hand, the spread reduction for the basket is proportional to the spread on the CDS for the specific defaulting issuer. The index and the basket consequently exhibit different behaviors through time, and offer different sensitivities to interest rates.

Fair Spread Decomposition

As contracts in their own right, credit indices are subject to specific demand and supply effects, and have their own distinct risk profile. A simple, standard way of analyzing their risk is to observe the quoted index fair spread. This approach, though, cannot distinguish the risk due to specific issuers from the risk due to demand for the index as a whole.

A useful decomposition is to break the index fair spread into three components: the average fair CDS spread across the constituent issuers, the nonlinear component, and the basis. The first two components constitute the theoretical fair spread of the index, as replicated through a basket of (market-traded) issuer CDS. The nonlinear portion of this fair spread accounts for the heterogeneity in credit quality among the issuers, and increases both with the level of the average fair spread and the dispersion of the individual fair spreads. The nonlinear component is very sensitive to an increase in default likelihood of a single issuer. The basis, defined as the difference between the observed fair spread and the theoretical fair spread, contains a risk premium rewarding the index dealer for the small portion of risk that cannot be perfectly hedged through the replicating basket, and embeds a liquidity premium as well.

End Notes

a. Note that changes to the conventional CDS protocol were instituted in early 2009. Among other things, the new protocol stipulates that single-name CDS trade with a fixed coupon of 100 or 500 bp, and settle via an upfront payment (see Credit Default Swaps for further discussion).

Further Reading

Couderc, F. (2006). Measuring risk on credit indices: on the use of the basis, Risk Metrics Journal Winter 2007, 61–87.
Zhang, H. (2005). Instant default, upfront concession and CDS index basis, Journal of Credit Risk 1(2), 79–89.

Related Articles

Basket Default Swaps; Collateralized Debt Obligations (CDO); Credit Default Swap Index Options; Credit Default Swaps.

FABIEN COUDERC & CHRISTOPHER C. FINGER
Basket Default Swaps Modeling Approaches
Copula Approach
First, successful models of this class are reached when $Y^i$ are either Brownian motions with drift or time-changed Brownian motions; see [9, 15], where some numerical calibration results are also shown. Exit times of more general stochastic processes, including stochastic volatility models, are applied to default modeling in [8].

Reduced-form Modeling

Here we start from the classical single-name CDS approach, where the default time is modeled through a doubly stochastic Poisson process (or Cox process); see Hazard Rate; Multiname Reduced Form Models and [5, 6, 11]. In this approach, it is assumed that conditional on a realization of a path of the default intensity, the default time is distributed like the time of the first jump of a time-inhomogeneous Poisson process with this intensity. Typically, the dynamics of resulting credit spreads are closely tied to the dynamics of the default intensity in this approach.

The main challenge here is the incorporation of default dependence. One either has to model common jumps in the spread processes or apply the copula approach exogenously to the default times given from the spread and hazard rates [4, 17]. Recently, an even more reduced approach was developed [1, 7, 18, 19] in which the accumulated losses $(L_t)_{t \ge 0}$ are modeled directly as a stochastic process. The most general construction (as, e.g., in [7]) is to view $L$ as an increasing cadlag pure jump process with absolutely continuous compensator $\nu(dt, dx) = g(t, dx)\,dt$; see, for example, [10] for the underlying stochastic analysis. This is particularly useful if one considers options on the spread $s^{kth}$ of a basket swap. Here, the modeling attempt is on $L$ and the single-name modeling is not considered.

Pricing

In order to price basket default swaps, we need the distribution $F_{(k:n)}(t)$ of the time $\tau^{kth}$ of the kth default. The kth default time is, in fact, the order statistic $\tau_{(k:n)}$, $k \le n$, and, in general, we can derive the distribution of the kth order statistics from the multivariate distribution functions [3]. For pricing we also need the survival function:

$$S_{(k:n)}(t) = 1 - F_{(k:n)}(t) \qquad (4)$$

The fair spread $s^{kth}$ for maturity $T_m$ is then given by

$$s^{kth} \sum_{i=1}^{m} \Delta_i B(T_0, T_i) S_{(k:n)}(T_i) = \sum_{i=1}^{n} (1 - REC_i) \int_{T_0}^{T_m} B(T_0, u) F^{kth=i}_{(k:n)}(du) \qquad (5)$$
Figure 1 kth-to-default spread versus correlation for a basket with three underlyings: (solid) $s^{1st}$, (dashed) $s^{2nd}$, (dashed-dotted) $s^{3rd}$. Horizontal axis: correlation r, from 0 to 1; vertical axis label: (std,min,max)/mean.
The first part is the present value of the spread payments, which stop at $\tau^{kth}$. The second part is the present value of the payment at the time of the kth default. Since the recovery rates might be different for the n underlying names, we have to sum up over all names and weight with the probability that the kth default happens around $u$ and that the kth defaulted name is just $i$ (we assume that there are no joint defaults at exactly the same time). So $F^{kth=i}_{(k:n)}$ is the joint probability distribution of the kth order statistic of the default times and the event that the kth defaulted name is $i$. Figure 1 [3] shows the kth-to-default spreads for a basket of three underlyings with fair spreads $s_1 = 0.009$, $s_2 = 0.010$, and $s_3 = 0.011$, and pairwise equal normal copula correlation on the x-axis. In [16], it was already observed that the sum of the kth-to-default swap spreads is greater than the sum of the individual spreads, that is, $\sum_{k=1}^{n} s^{kth} > \sum_{i=1}^{n} s_i$. Both sides insure exactly the same risk, so this discrepancy is due to a windfall effect of the first-to-default swap. At the time of the first default, one stops paying the huge spread $s^{1st}$ on the one side, but on the plain-vanilla side one stops paying only the spread $s_i$ of the first defaulted obligor $i$.

References

[1] Bennani, N. (2005). The Forward Loss Model: A Dynamic Term Structure Approach for the Pricing of Portfolio Credit Derivatives. Working paper.
[2] Bluhm, C., Overbeck, L. & Wagner, C. (2002). An Introduction to Credit Risk Modeling, CRC Press/Chapman & Hall.
[3] Bluhm, C. & Overbeck, L. (2006). Structured Credit Portfolio Analysis, Baskets and CDOs, CRC Press/Chapman & Hall.
[4] Duffie, D. & Gârleanu, N. (2001). Risk and valuation of collateralized debt obligations, Financial Analysts Journal 57, 41–59.
[5] Duffie, D. & Singleton, K. (1998). Simulating Correlated Defaults. Working paper, Graduate School of Business, Stanford University.
[6] Duffie, D. & Singleton, K. (1999). Modeling term structures of defaultable bonds, Review of Financial Studies 12, 687–720.
[7] Filipovic, D., Overbeck, L. & Schmidt, T. (2008). Dynamic Term Structure of CDO-losses. Working paper.
[8] Fouque, J.P., Wignall, B.C. & Zhou, X. (2008). Modeling correlated defaults: first passage model under stochastic volatility, Journal of Computational Finance 11(3), 43–78.
[9] Hull, J. & White, A. (2001). Valuing credit default swaps II: modeling default correlations, The Journal of Derivatives Spring, 12–21.
[10] Jacod, J. & Shiryaev, A.N. (1987). Limit Theorems for Stochastic Processes, Springer.
[11] Jarrow, R.A., Lando, D. & Turnbull, S.M. (1997). A Markov model for the term structure of credit risk spreads, Review of Financial Studies 10, 481–523.
[12] Laurent, J. & Gregory, J. (2005). Basket default swaps, CDOs and factor copulas, Journal of Risk 7, 103–122.
[13] Li, D.X. (1999). The valuation of basket credit derivatives, CreditMetrics™ Monitor April, 34–50.
[14] Li, D.X. (2000). On default correlation: a copula function approach, Journal of Fixed Income 6, 43–54.
[15] Overbeck, L. & Schmidt, W. (2005). Modeling default dependence with threshold models, Journal of Derivatives 12(4), 10–19.
[16] Schmidt, W. & Ward, I. (2002). Pricing default baskets, Risk 15(1), 111–114.
[17] Schoenbucher, P. (2003). Credit Derivatives Pricing Models: Models, Pricing, Implementation, Wiley Finance.
[18] Schönbucher, P. (2005). Portfolio Losses and the Term Structure of Loss Transition Rates: A New Methodology for the Pricing of Portfolio Credit Derivatives. Working paper.
[19] Sidenius, J., Piterbarg, V. & Andersen, L. (2005). A New Framework for Dynamic Credit Portfolio Loss Modelling. Working paper.

Related Articles

Collateralized Debt Obligations (CDO); Copulas: Estimation; Copulas in Insurance; Credit Default Swaps; Credit Default Swap (CDS) Indices; Default Time Copulas; Duffie–Singleton Model; Gaussian Copula Model; Hazard Rate; Jarrow–Lando–Turnbull Model; Multiname Reduced Form Models; Intensity-based Credit Risk Models; Reduced Form Credit Risk Models; Structural Default Risk Models.

LUDGER OVERBECK
Collateralized Debt The Nature of Collateral Assets
• “real” asset acquisition (true sale): “cash CDO” or
• credit derivative technology (or other, e.g., insurance): “synthetic CDO” or collateralized synthetic obligation (CSO).

Risk transfer from the SPV to capital market investors can take the following forms:

• SPV credit-linked note issuance: “funded CDO”;
• credit derivatives (CDSs) sold by the investor to the SPV: “unfunded CDO”; and
• a combination of the above-mentioned: “partially funded CDO”. Most whole capital structure CDOs fall into that category.

Objective of the Transaction

Most CDOs are structured for arbitrage purposes. Arbitrage CDOs are tailor-made investment products, using cash or synthetic technology, created for the benefit of capital market investors. In these transactions, collateral assets are usually sourced in the fixed-income cash or credit derivative markets.

However, a significant part of the CDO market was also driven with the purpose of bank balance sheet management. In such a transaction, the objective for the sponsor bank is to obtain regulatory or economic capital relief using CDO technology to transfer credit risk to investors. In these transactions, assets or credit risk exposures are typically sourced from the sponsor bank’s own balance sheet.

Static or Managed CDOs

“Static CDOs” are characterized by the fact that the composition of the reference portfolio does not change over the life of the transaction (but for substitutions in a limited number of cases).

At the opposite end of the spectrum, “managed CDOs” (see Managed CDO) allow for the dynamic management of the portfolio of collateral assets within a predetermined set of constraints. CDOs are usually managed by a third-party asset manager with credit management expertise. In a managed arbitrage CDO, the asset manager’s objective may be the following:

• to avoid default and ensure timely payment of interest and repayment of principal (“cash-flow CDO”) or
• to optimize the market value of the underlying collateral pool through active management (“market-value CDO”).
“Self-managed CDOs” enable investors themselves to manage the reference portfolio of the CDO they have underwritten.
The following section provides an analysis of the main CDO modeling techniques.

Analysis of CDO Modelling Techniques

Cash-flow CDOs

On the basis of securitization techniques, cash-flow CDOs usually aim at exploiting an arbitrage opportunity between the yield generated by a portfolio of credit assets and that required by investors on the securitized debt, the great majority of which (80–90%) is rated investment grade due to the various credit enhancement mechanisms:

• Tranching and waterfall
The creation of several layers of risk (“tranches”) and the sequential allocation of income generated by the collateral portfolio in order of tranche seniority.
• Subordination
Losses are absorbed first by all tranches junior to a given tranche, thus providing a protection “cushion” (when the CDO is liquidated, the senior creditors have priority over the mezzanine investors, who have priority over the equity holders).
• Overcollateralization (O/C) and interest cover (I/C) tests
These act as CDO covenants, leading to the diversion of cash flows toward the early repayment of the most senior tranche if they are breached, thus strengthening the level of subordination.
• Diversification
Reference portfolios are diversified in terms of obligor geography and sector, thus limiting the risk of correlated defaults.

Risks and sources of performance in cash-flow CDOs include the following:

• Default risk
Underperformance of the underlying portfolio (defaults) leads to a decrease in the amount of assets (and therefore the amount of capital, the equivalent of a write-off in accounting terms) and in future income streams (since the coupon is no longer being paid on the asset in default) and therefore in the dividend amounts ultimately paid to the equity tranche investors.
• Portfolio management
Active trading by the CDO manager may generate losses (which have the same impact as a default) or gains (which are then paid out in dividends or incorporated into the CDO capital, thereby increasing the subordination level). Generally, the CDO manager is only able to modify the portfolio for a given period (5–7 years, the so-called reinvestment period). He/she must comply with a set of criteria (quality of the portfolio, sector diversification, maturity profile, maximum annual trading allowance, etc.) defined in accordance with the rating agencies.
• Ramp-up risk
When a cash CDO is launched, the underlying portfolio cannot be immediately constituted by the manager (essentially to avoid disturbing market liquidity). The portfolio is, therefore, built up over 3–6 months (the ramp-up period). During that time, asset prices may go up and the initial average coupon target for the portfolio might not be attained. In addition, the bank arranging the transaction carries the credit risk of the collateral during the ramp-up period (the so-called “warehousing” risk). To avoid taking too much risk on their balance sheets and allocating capital against it, banks have been using off-balance sheet vehicles (such as conduits and structured investment vehicles (SIVs)) to park the assets during the ramp-up period. However, as witnessed during the 2007 credit crisis, these defense structures backfired as liquidity dried up and banks were forced to reconsolidate the vehicles and the security warehouses on their balance sheets.
• Reinvestment risk
During the life of the transaction, the manager is regularly led to replace assets and therefore to reinvest part of the portfolio. Market conditions may change and the average coupon level might not be attained. To manage this risk, the manager and the other equity investors usually have an early termination option on the CDO.

Synthetic CDOs: Correlation Products

In the synthetic space, tranching and securitization techniques can also be applied to a portfolio of
a CDO structure enabled the CDO manager to create, in essence, new AAA-rated CDO bonds, using only BBB subprime RMBS.
The assumed diversification benefit drove the capital structure of the CDO and explains a large part of the enormous “misrating” of subprime CDO risk by rating agencies (let alone the rating of the underlying subprime RMBS risk itself).

ABS CDO: A Key Driver of the “Subprime” Demand. The demand from ABS CDOs allowed RMBS originators to lay off a significant portion of the risk. We estimate that $70 billion of mezzanine subprime RMBS were issued in 2005–2007 versus $200 billion of mezzanine ABS CDOs over the same period. Such a notional amount of mezzanine ABS CDOs roughly represents an implied capacity of $90 billion for mezzanine subprime RMBS investments (over the vintages 2005–2007; see end note h).
This excess demand was filled by synthetic risk (CDS) buckets. The creation of the ABS CDS market multiplied credit risk in the system, allowing for the creation of far more CDOs than the available cash “CDOable” assets. For example, one tranche of a subprime RMBS securitization (nominal $15.2 million) was referenced in at least 31 mezzanine ABS CDOs (total notional of $240.5 million).
High-grade ABS CDOs also need to be taken into account. Although the “subprime” demand from these CDOs (roughly $85 billion) was lower than the nominal of high-grade subprime actually issued ($230 billion), they fueled the issuance of mezzanine ABS CDOs through the feature of the “inner CDO bucket”. Such a bucket typically had an average size of 20%, allowing CDO arrangers to channel a significant portion of ABS CDO risk. Such “resecuritization” was also facilitated by the existence of CDS on CDOs, further multiplying the credit risk in the system: one tranche of a mezzanine ABS CDO ($7.5 million nominal) was referenced in at least 17 high-grade ABS CDOs ($154 million total notional).
At first sight, it would, therefore, be fair to conclude that, since 2005, ABS CDOs have globally absorbed almost every cash-subordinated bond created in the subprime world (and have sold significant protection in synthetic form as well), while traditional cash buyers were largely absent. However, does this mean that the credit risk was effectively transferred to “mainstream” capital market investors?

Anatomy of the ABS CDO Market: Where Did It All Go? About $430 billion of ABS CDOs were issued between 2005 and 2007. However, the amount of risk transferred outside the banking system was actually limited because of the following factors:

• investment banks retaining a significant part of super-senior risk, either directly ($85 billion for the most affected: Citigroup, UBS, Merrill Lynch, Morgan Stanley) or indirectly (by taking on counterparty risk on monoline insurers; $120 billion notional amounts);
• the resecuritization effect through CDO buckets ($40 billion notional);
• off-balance sheet vehicles, for which banks retained all potential losses (conduits) or part of the losses (SIV, ∼$15 billion of ABS CDO investments); and
• “quasi-”off-balance sheet vehicles, such as money market funds that were subsequently supported by bank capital.

Outside the main banking sector, the most notable “CDO” casualties were either sophisticated insurers (such as AIG) or medium-sized banks (IKB, SachsenLB, and other German Landesbanken).
As a result, it appears that CDOs were primarily a repackaging tool. The main roots of the “subprime” demand stem from abusive off-balance sheet structures (SIVs, conduits) and regulatory capital arbitrages (negative basis trades, long/short badly captured by Value-at-Risk (VaR) models, etc.), both of which resulted in maintaining most of the risk within the banking system while “masking” its true price/value.
One could argue that there was no “real” CDO market for RMBS where rational investors could have sent earlier warning signals (by reducing demand, refusing incestuous features such as CDO buckets within ABS CDOs) and acted as stabilization agents (long-term demand, different investor base than in the underlying RMBS market).
In addition, the derivative market did not perform up to its objectives, as it was created too late (the ABX index, which effectively introduced a greater price transparency) and actually magnified the effects of the mispricing/misrating of RMBS risk.
In conclusion, if the ABS CDO market effectively drove the demand for mezzanine subprime RMBS, its impact on mainstream investors has been limited. In
that respect, it is worth noting that the vast majority of RMBS risk (approximately 82 cents on the dollar) ended up being rated AAA and acquired not by CDOs but by institutions taking advantage of very cheap funding.

How Did Other CDO Markets Fare?

Leveraged Loan CLOs. CLOs have suffered from pressure on both the asset and the liability sides. Prices of leveraged loans fell in line with the overall credit market, due to technical factors (significant loan overhang resulting from warehouses at the major investment banks) and fundamental fears (increase in default rates, weakly structured leveraged buyout (LBO) deals). On the liability side, we estimate that negative basis buyers represented 50% of the AAA CLO buyer base, while banks and SIVs/CDOs accounted for 25% and 15%, respectively. The CLO market suffered from the disappearance of such “cheap funding”.
Even though we witnessed an LBO “bubble” (private equity houses taking advantage of the strong CLO bid), the impact of the burst has not been as significant as for the ABS CDO market:

• CLOs were not the sole buyer of leveraged loans.
• They did not suffer from misrating.
• New AAA CLO buyers stepped in (Asian institutions, unaffected banks, insurance companies).

Most of the CLO deals issued in 2008 have been balance sheet driven (cleaning up of warehouses), with simple two-tier structures (AAA and equity), where the AAA tranche (or the equity) is retained by the originating bank.
As the full capital structure execution is challenging and as the sourcing of cash assets is difficult (illiquidity, no warehouse providers for ramp up), the development of single-tranche synthetic CLOs, supported by the growth of the Loan CDS market (ISDA documentation, launch of LCDX and LevX indices), is a key feature of the forthcoming years.

Corporate Synthetic CDOs. With the huge growth in synthetic CDOs, what is commonly referred to as the structured bid became a dominant driver of credit spreads. While a combination of mark-to-market losses, rating downgrade risk, and headline risk could have caused investors to unwind positions in synthetic CDOs, this market segment actually held up well in line with the underlying asset quality (corporate earnings), further supported by the liquidity provided by banks (correlation desks).
Even though the market avoided the “great unwind”, the buying base for these products has essentially gone away, and while some prop desks and hedge funds are still active, the institutional money that provided the liquidity backbone has vanished.

Conclusion: Where Next for CDOs?

The postcrisis CDO market will probably be characterized by a convergence trend toward the mechanics of the corporate synthetic market, which has proved more efficient and resilient for the distribution of credit risk:

• the development of index and index tranches (transparent and traded correlation) fueling liquidity;
• less reliance on rating agencies and more in-house due diligence on assets; and
• a return to balance-sheet-driven transactions.

The main challenges for the CDO market include the following:

• restoring investor confidence in the benefit of structured products by providing better transparency and liquidity;
• addressing the AAA funding issue (now that SIVs and conduits have been dissolved); and
• overcoming the discrepancies in accounting treatment (see end note i).

Once the dust has settled, we expect securitization and CDO transactions to come back on the basis of more transparent and rational fundamentals.

End Notes

a. Tranching is the operation by which the cash flows from a portfolio of assets are allocated by order of priority to create various layers (“tranches”), from the least risky (“senior” tranche) to the most risky (“first loss” or “equity” tranche). Tranching technology is usually performed using rating agency guidelines in order to ensure that the senior tranche attracts the most favorable rating (triple-A).
b. Asset-backed securities are securities representing a securitization issue. The ABS market covers mortgage-backed securities (residential and commercial), consumer loans (credit card, student loans, auto loans), and commercial loans (trade receivables, leases, small business loans, etc.).
c. Collateralized fund obligations.
d. “Synthetic” in as far as the mechanism for transferring risk is synthetic, using a derivative.
e. Combination of two put options on the same underlying asset, at two different strike prices.
f. Usually in the form of bid–ask spreads.
g. iTraxx for the European market and CDX.NA for the US market.
h. On the basis of the following assumptions: 50% of the portfolio allocated to subprime, of which 60% to the preceding vintage.
i. While a cash CDO (or any cash bond) can be accounted for as “available for sale” by banks and insurers (meaning that its price volatility will directly impact the equity base of the investor), the valuation of an equivalent synthetic product impacts the income (P&L) of the investor.

Reference

[1] Bruyere, R., Cont, R., Copinot, R., Jaeck, Ch., Fery, L. & Spitz, T. (2005). Credit Derivatives and Structured Credit: A Guide for Investors, Wiley.

Related Articles

Base Correlation; Basket Default Swaps; CDO Square; CDO Tranches: Impact on Economic Capital; Collateralized Debt Obligation (CDO) Options; Credit Default Swaps; Default Barrier Models; Forward-starting CDO Tranche; Managed CDO; Multiname Reduced Form Models; Nested Simulation; Random Factor Loading Model (for Portfolio Credit); Reduced Form Credit Risk Models; Special-purpose Vehicle (SPV); Total Return Swap.

RICHARD BRUYERE & CHRISTOPHE JAECK
Forward-starting CDO Tranche

At the core of any CDO pricing model is a mechanism for generating dependent defaults. If a simple factor structure is used to join their marginal distributions, the default times of the underlying credits are independent conditionally on the realization of the common factor(s). This conditional independence of defaults is very useful because it allows one to use quasi-analytical algorithms to compute the term structure of expected tranche losses, which is the fundamental ingredient for the valuation of a synthetic CDO.
Because of their analytical tractability, conditionally independent models have become a standard in the synthetic CDO market. In the next section, we review the one-factor Gaussian-copula model, which

Given a realization of the Gaussian factor Y, the M individual credits are independent, and a simple recursive procedure [2] can then be employed to recover the conditional loss distribution of the underlying portfolio, as well as the loss distribution of any particular tranche of interest. Once we know how to compute the loss distribution of a tranche for a given realization of the common factor, it is straightforward to take a probability-weighted average across all possible realizations of Y and thus recover the unconditional loss distribution of the tranche.
Repeating this procedure for a grid of horizon dates and interpreting the expected percentage loss up to time t as a “cumulative default probability”, we can price the tranche using exactly the same analytics that we would use for pricing a CDS. More precisely, we can define the “tranche curve” as the term structure of expected surviving percentage notionals of the tranche, that is,
a predetermined time (the reset date) as deterministic functions of the random amount of losses incurred by the reference portfolio up to that time. Notice that forward-starting tranches and tranches whose attachment point resets at a future date both belong to this class.

Pricing a Reset Tranche

Let $t_s$ denote the reset date, $\lambda_j$, $j = 1, 2, \ldots, M$, the number of loss units produced by the default of the $j$th name, $\lambda = \sum_{j=1}^{M} \lambda_j$ the maximum number of loss units that the portfolio can suffer, and $p(\omega)$ the probability today that the reference portfolio incurs exactly $\omega$ loss units by the reset date $t_s$.
A reset tranche can be defined by the vector $\{t_T, t_s, U, V, U(\omega), V(\omega)\}$ where $U(\omega) \ge \omega$ is the attachment point of the tranche (in loss units) after the reset date, and $V(\omega)$ is the number of loss units protected by the tranche investor after the reset date. We can price the two legs of this swap as follows:

$$\text{Premium} = cN \sum_{\omega=0}^{\lambda} p(\omega) \sum_{i=1}^{T} \Delta_i\, Q(t_i;\omega)\, B(t_i) \qquad (6)$$

$$\text{Protection} = N \sum_{\omega=0}^{\lambda} p(\omega) \sum_{i=1}^{T} B(t_i)\,\bigl(Q(t_{i-1};\omega) - Q(t_i;\omega)\bigr) \qquad (7)$$

where

$$Q(t;\omega) = T(t,\omega) \cdot \left(1 - E\left[\frac{[L_t - U(t;\omega)]^+ - [L_t - (U(t;\omega) + V(t;\omega))]^+}{V(t;\omega)} \,\Big|\, L_{t_s} = \omega\right]\right),$$

$$T(t,\omega) = 1 - 1_{\{t > t_s\}}\, \frac{[\omega - U]^+ - [\omega - (U + V)]^+}{V},$$

$$U(t;\omega) = \begin{cases} U, & t \le t_s \\ U(\omega), & t > t_s \end{cases}, \qquad V(t;\omega) = \begin{cases} V, & t \le t_s \\ V(\omega), & t > t_s \end{cases} \qquad (8)$$

In words, the conditional tranche curve $Q(t;\omega)$ represents the (risk-neutral) expected percentage surviving notional of the tranche at time $t$, conditional on the event that the reference portfolio experiences a cumulative loss of $\omega$ units up to the reset date.
Equally, we can write down the valuation in terms of the unconditional tranche curve

$$Q(t) = \sum_{\omega=0}^{\lambda} p(\omega) \cdot Q(t;\omega) \qquad (9)$$

and thus obtain the familiar equations

$$\text{Premium} = cN \sum_{i=1}^{T} \Delta_i\, Q(t_i)\, B(t_i) \qquad (10)$$

$$\text{Protection} = N \sum_{i=1}^{T} B(t_i)\,\bigl(Q(t_{i-1}) - Q(t_i)\bigr) \qquad (11)$$

However, while the unconditional tranche curve for $t_0 \le t \le t_s$ reduces to the standard tranche curve defined in the section The Gaussian-copula Model,

$$Q(t) = \sum_{\omega=0}^{\lambda} p(\omega) \cdot Q(t;\omega) = 1 - \sum_{\omega=0}^{\lambda} p(\omega)\, E\left[\frac{[L_t - U]^+ - [L_t - (U + V)]^+}{V} \,\Big|\, L_{t_s} = \omega\right] = 1 - E\left[\frac{[L_t - U]^+ - [L_t - (U + V)]^+}{V}\right], \qquad (12)$$
the unconditional tranche curve for $t_s < t \le t_T$

$$Q(t) = \sum_{\omega=0}^{\lambda} p(\omega) \cdot Q(t;\omega) = \sum_{\omega=0}^{\lambda} p(\omega) \cdot T(t,\omega) \cdot \left(1 - E\left[\frac{[L_t - U(\omega)]^+ - [L_t - (U(\omega) + V(\omega))]^+}{V(\omega)} \,\Big|\, L_{t_s} = \omega\right]\right) \qquad (13)$$

incorporates the added complexity of the path-dependent valuation.

Deriving the Conditional Tranche Curve

Our discussion so far leaves open the problem of constructing the conditional tranche curve. From the previous discussion, it should be clear that to achieve this goal we need to be able to compute conditional expectations of the form $E[f(L_{t_u}, \omega) \mid L_{t_s} = \omega]$ for some function $f$. In this section, we present a two-dimensional recursive algorithm for computing the joint distribution of cumulative losses at two different horizons, which in turn allows us to compute the conditional expectations that we need. The methodology is conceptually similar to the one introduced by Baheti et al. [3] for pricing “squared” products.
As anticipated, we assume that the underlying default model exhibits the property of conditional independence. We exploit this by conditioning our procedure on a particular realization of a common factor $Y$. We first discretize losses in the event of default by associating each credit with the number of loss units that its default would produce: we indicate by $\lambda_j$ the integer number of loss units that would result from the default of name $j$. Next, we construct a square matrix $Z_{v_1,v_2}$ whose sides consist of all possible loss levels for the reference portfolio, that is, $(0, 1, \ldots, \lambda)$. In this matrix, we store the joint probabilities that the reference portfolio incurs $v_1$ loss units up to time $t_s$ and $v_2$ loss units up to time $t_u$, with $t_u \ge t_s$. By definition of cumulative loss, the matrix must be upper triangular, that is,

$$Z_{v_1,v_2} = 0 \quad \text{if } v_2 < v_1 \qquad (14)$$

For the nontrivial elements where $v_2 \ge v_1$, we set up the following recursion. We first initiate each state (recursion step $j = 0$) by setting

$$Z^{0}_{v_1,v_2} = \begin{cases} 1, & \text{if } v_1 = 0 \text{ and } v_2 = 0 \\ 0, & \text{otherwise} \end{cases} \qquad (15)$$

We preserve the notation adopted during our description of the Gaussian-copula model and denote by $\pi_{j,t}(Y)$ the probability that name $j$ defaults by time $t$, conditional on the market factor taking value $Y$. Now we feed one credit at a time into the recursion and update each element according to the following:
If $v_1 \ge \lambda_j$, then

$$Z^{j}_{v_1,v_2} = (1 - \pi_{j,u}(Y)) \cdot Z^{j-1}_{v_1,v_2} + \pi_{j,s}(Y) \cdot Z^{j-1}_{(v_1-\lambda_j),(v_2-\lambda_j)} + (\pi_{j,u}(Y) - \pi_{j,s}(Y)) \cdot Z^{j-1}_{(v_1),(v_2-\lambda_j)} \qquad (16)$$

If $v_2 < \lambda_j$, then

$$Z^{j}_{v_1,v_2} = (1 - \pi_{j,u}(Y)) \cdot Z^{j-1}_{v_1,v_2} \qquad (17)$$

If $v_1 < \lambda_j \le v_2$, then

$$Z^{j}_{v_1,v_2} = (1 - \pi_{j,u}(Y)) \cdot Z^{j-1}_{v_1,v_2} + (\pi_{j,u}(Y) - \pi_{j,s}(Y)) \cdot Z^{j-1}_{(v_1),(v_2-\lambda_j)} \qquad (18)$$

After including all the issuers, we set

$$Z_{v_1,v_2} = Z^{M}_{v_1,v_2} \qquad (19)$$

The matrix $Z_{v_1,v_2}$ now holds the joint loss distribution of the reference portfolio at the two horizon dates $t_s$ and $t_u$, conditional on the realization of the market factor $Y$, and we can numerically integrate over the common factor to recover the unconditional joint loss distribution. Using the joint distribution of losses at different horizons, it is then straightforward, for any function $f(\cdot)$, to compute conditional expectations of the form $E[f(L_{t_u}, \omega) \mid L_{t_s} = \omega]$, which is how we construct the conditional tranche curve.

Comments

We have presented a simple methodology for quasi-analytically pricing a class of default-path-dependent tranches. The proposed methodology is general in the sense that it can be easily applied to any model with conditionally independent defaults, including “implied copula” models fitted to liquidly traded tranches as in the Hull–White [4] model. The algorithm is useful because fast pricing of reset tranches allows one to obtain a variety of Greeks that are essential for effective risk management.
As observed by Andersen [1], however, some caution is necessary when pricing instruments whose valuation is sensitive to the joint distribution of cumulative losses at different horizons. Liquidly traded tranches only contain information about marginal loss distributions and tell us nothing about their dependence. Implying a default time copula from these prices, therefore, implicitly contains an arbitrary assumption about intertemporal dependencies, and it is easy to verify that different implied copulae that fit observable prices equally well may produce significantly different valuations for path-dependent instruments.

References

[1] Andersen, L. (2006). Portfolio losses in factor models: term structures and intertemporal loss dependence, Journal of Credit Risk 4, 71–78.
[2] Andersen, L., Sidenius, J. & Basu, S. (2003). All your hedges in one basket, Risk November, 67–72.
[3] Baheti, P., Mashal, R., Naldi, M. & Schloegl, L. (2005). Squaring factor copula models, Risk June, 73–76.
[4] Hull, J. & White, A. (2006). The Perfect Copula. Working Paper, University of Toronto.

PRASUN BAHETI, ROY MASHAL & MARCO NALDI
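As a supplementary illustration of the two-horizon recursion in equations (14)–(19) of the article above, the following is a minimal Python sketch for a single realization of the common factor Y. The function names `joint_loss_matrix` and `conditional_expectation`, and the use of NumPy, are assumptions made for this example and are not part of the original article. The conditional default probabilities for the two horizons are taken as given inputs, since their functional form depends on the chosen factor model.

```python
import numpy as np

def joint_loss_matrix(loss_units, pi_s, pi_u):
    """Joint distribution Z[v1, v2] of cumulative losses at t_s and t_u,
    conditional on one realization of the common factor Y.

    loss_units : integer loss units lambda_j per name (assumed >= 1)
    pi_s, pi_u : conditional default probabilities by t_s and by t_u,
                 with pi_s[j] <= pi_u[j] (hypothetical inputs)
    """
    lam = int(sum(loss_units))
    z = np.zeros((lam + 1, lam + 1))
    z[0, 0] = 1.0  # equation (15): no names included yet
    for lj, ps, pu in zip(loss_units, pi_s, pi_u):
        n = lam + 1 - lj
        new = (1.0 - pu) * z                 # name j survives beyond t_u
        new[lj:, lj:] += ps * z[:n, :n]      # name j defaults before t_s
        new[:, lj:] += (pu - ps) * z[:, :n]  # name j defaults in (t_s, t_u]
        z = new
    return z

def conditional_expectation(z, f, omega):
    """E[f(L_tu, omega) | L_ts = omega] computed from a joint loss matrix."""
    row = z[omega]
    mass = row.sum()
    if mass == 0.0:
        return float("nan")  # the conditioning event has zero probability
    values = np.array([f(v2, omega) for v2 in range(len(row))])
    return float(values @ row / mass)

# Two hypothetical names with 1 and 2 loss units.
z = joint_loss_matrix([1, 2], pi_s=[0.1, 0.2], pi_u=[0.3, 0.5])
expected_later_loss = conditional_expectation(z, lambda v2, w: v2, 0)
```

The slice-based update covers the three cases (16)–(18) at once: entries with $v_1 < \lambda_j$ or $v_2 < \lambda_j$ simply receive no contribution from the shifted terms. To obtain the unconditional joint distribution, one would average such matrices over a quadrature grid for Y, weighting each by the factor density.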
CDO Square Mechanics of a Synthetic CDO2
inner CDO tranche losses $L_{\mathrm{inn},j}(t)$ (see Collateralized Debt Obligations (CDO))

$$L_{\mathrm{inn},j}(t) = \min[D_j - A_j, \max[L_j(t) - A_j, 0]] \qquad (2)$$

where $A_j$ and $D_j$ denote the inner attachment and exhaustion level of the corresponding inner reference portfolio $j$. Third, the outer tranche or CDO² tranche loss can be computed as

$$L_{\mathrm{out}}(t) = \min[D_{\mathrm{out}} - A_{\mathrm{out}}, \max[L_{\mathrm{tot}}(t) - A_{\mathrm{out}}, 0]] \qquad (3)$$

where $A_{\mathrm{out}}$ and $D_{\mathrm{out}}$ denote the attachment and exhaustion points of the outer tranche and $L_{\mathrm{tot}}(t) = \sum_{j=1}^{M} L_{\mathrm{inn},j}(t)$ the sum of inner tranche losses.

Risk Analysis and Pricing

The limited universe of liquid and actively traded reference assets naturally yields overlaps in inner reference portfolios; in other words, reference assets tend to occur in more than one real-life inner reference portfolio. This causes the CDO² loss distribution to display fatter tails on both ends, since the (non)occurrence of an isolated default event might simultaneously affect several inner reference portfolios, thereby displaying a leveraged effect [1]. This impact is even more pronounced in case of thin tranches and understood as cliff risk of CDO²s. Moreover, the double-layer tranche technology generally amplifies correlation sensitivities: an increase in the asset correlation yields a higher increase of correlation between affected inner CDO tranches. In summary, overlap and correlation are the main risk drivers of a CDO² tranche. In addition, the described effects considerably increase the impact of other risk drivers such as changing credit spreads (respectively, changing default probabilities) and changing recovery rates.
The key ingredient to pricing is the stochastic evaluation of the accumulated CDO² tranche loss $L_{\mathrm{out}}(t)$ as determined in the previous paragraph. This requires the consistent use of a multivariate credit (default) model. Since no market standard has been developed yet owing to the lack of truly observable correlation information, the necessity and benefit of appropriate scenario models are highlighted in this article. The rating agencies Moody’s, Standard & Poor’s, and Fitch have consistently adapted their CDO rating technology to the CDO² case. In particular, the rating technology comes with a look-through capability to underlying assets of inner reference portfolios. However, the look-through capacity stops with ABS-type assets that are modeled as a single asset.

[2] Smith, D. (2003). CDOs of CDOs: art eating itself? in Credit Derivatives: The Definitive Guide, J. Gregory, ed, Risk Books, London, pp. 257–279.

Related Articles
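Returning to the loss definitions in equations (2) and (3) above, the two-layer payoff can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `tranche_loss` and `cdo_squared_loss` and the sample numbers are hypothetical, and the inner portfolio losses are taken as given rather than simulated from a multivariate default model.

```python
def tranche_loss(portfolio_loss, attachment, exhaustion):
    """Tranche loss min[D - A, max[L - A, 0]], as in equations (2) and (3)."""
    return min(exhaustion - attachment, max(portfolio_loss - attachment, 0))

def cdo_squared_loss(inner_losses, inner_tranches, outer_attachment, outer_exhaustion):
    """Outer (CDO-squared) tranche loss given the inner portfolio losses.

    inner_losses   : losses L_j(t) of each inner reference portfolio
    inner_tranches : (A_j, D_j) attachment/exhaustion pairs, one per portfolio
    """
    total_inner = sum(
        tranche_loss(loss, a, d)
        for loss, (a, d) in zip(inner_losses, inner_tranches)
    )
    return tranche_loss(total_inner, outer_attachment, outer_exhaustion)

# Three hypothetical inner portfolios, each with a 3-7 inner tranche,
# feeding an outer tranche attaching at 4 and exhausting at 10.
outer = cdo_squared_loss([5, 12, 2], [(3, 7), (3, 7), (3, 7)], 4, 10)
```

In this example the inner tranche losses are 2, 4, and 0, so the total of 6 produces an outer tranche loss of 2. Overlap between inner portfolios would show up as several entries of `inner_losses` rising together when a shared name defaults.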
[Figure 1 diagram. Parties: issuer (protection buyer), special purpose vehicle (SPV), investor (protection seller), and collateral. Labeled flows: coupons c, interest r, c + r, losses; collateral X at inception t and X − losses at maturity T.]
Figure 1 This figure shows the standard structure of cash flows in an LSS transaction
Figure 2 This figure shows the credit losses to the investor and the risky fee notional, N (t), as a function of the portfolio
losses. The coupon amount paid to the investor at a coupon date ti is s · N (ti ), where s is the spread. For comparison we
include both the behavior of the LSS and the reference unleveraged super-senior (SS)
transaction. This means that it is never optimal for the investor to deliver since after posting more collateral he/she will be liable for more losses without receiving a higher coupon in compensation. It is more favorable to reinvest in a new LSS transaction.
We describe the three main types of trigger mechanisms below. There is a trade-off between how well the trigger can approximate the MTM of the trade and how easy it is to objectively assess whether the trigger has been breached.

• Loss Trigger
A loss trigger is breached when the amount of portfolio notional lost owing to defaults exceeds the trigger level. This is the easiest trigger to monitor as the loss amounts can be objectively determined. However, the loss provides an imperfect approximation for the MTM of the tranche. In particular, if spreads widen, the value of the LSS can drop severely from the point of view of the investor even in the absence of any defaults. This poses a risk to the issuer since at the time of trigger the MTM of the tranche could have dropped below the collateral amount.

• Spread Trigger
Spread triggers are based on the average spread of the underlying portfolio. Trigger levels can be defined as a function of the time to maturity and the level of losses in the portfolio. This provides a much better proxy to the MTM of the tranche than the loss trigger. For some standard portfolios, for example, iTraxx or CDX (see Credit Default Swap (CDS) Indices), the value of the average spread can also be assessed using publicly available information and is hence unambiguous. Often, however, the LSS is based on bespoke portfolios. In this case, the valuation of the
SS spreads will have to rely on models, for which there is no universally agreed methodology.

• Mark-to-market Trigger
The MTM trigger is based on the MTM of the reference (unleveraged) SS tranche. Clearly, if the MTM trigger is set below the collateral level the issuer ensures that the collateral will cover the unwind payment (up to gap risk, cf. the section Hedging and Risks). The disadvantage is that the MTM trigger is the hardest to assess objectively. Typically, the MTM for a tranche is not quoted, and hence one has to rely entirely on (complex) models for valuation.

Hedging and Risks

If the trigger mechanism guaranteed that upon unwind the issuer would receive the full MTM of the reference swap, then this swap would provide a perfect hedge for the LSS. The coupon amount the investor would receive would be the same as if he/she had invested the full notional amount N in an SS swap transaction.
However, there are two reasons why the trade can unwind without recovering the full MTM of the hedge:

• Typically, there will be a delay between a trigger breach and the actual unwind of the hedge. In this period, there is the risk that the MTM will drop below the collateral amount. The issuer then has to make good the difference (MTM − X). This is the so-called gap risk: the issuer is exposed to large and sudden increases in the value of SS protection or, equivalently, increases in the spread.
• Even in the absence of a trigger breach, the LSS will unwind if the SS tranche losses have wiped out the collateral. Since the collateral is 0 in this case, the issuer will have to pay the full MTM of the hedge to unwind his/her position. However, this scenario is unlikely, since the trigger should be set so that a trigger event occurs before the collateral has been reduced to 0.

The investor in an LSS transaction faces MTM risks as well as credit risks associated with SS tranche losses. In the case of a trigger event, the investor will be forced to unwind his/her position and will lose part or all of his/her principal as he/she realizes his/her MTM losses (unless he/she posts more collateral). Unless dealing with a loss trigger, a trigger breach can happen even if the investor has not incurred any actual credit losses, for example, if there is a dramatic rise in spreads.

Valuation

The valuation of LSS transactions poses challenges beyond those of pricing a standard SS tranche. This is because the unwind feature means that we need to be able to value the risk of possible MTM losses to the investor and the issuer. Hence, we need to model the joint behavior of MTM and the portfolio losses. This is a dynamic problem that requires more than knowledge of the marginal loss distributions needed for standard tranche pricing.
There are two main candidates for dynamic credit models that can, in principle, be used to value an LSS transaction:

• low-dimensional models of the portfolio loss process;
• dynamic models of all single name spreads in the portfolio (see Duffie–Singleton Model; Multiname Reduced Form Models).

Modeling and valuation of the LSS product is important not only for the issuer and the investor but also for assessing the rating of the note. This depends not only on the probability of experiencing credit losses but also on the probability of having a trigger event. Rating agencies (see Credit Rating) use in-house models for this as described in, for example, [1, 2].

Model-independent Bounds

Some model-independent bounds for the value of the LSS can be derived. We discuss this from the perspective of the issuer who is long protection. Let us denote the spread of a standard tranche with attachment point $a$ and detachment point $b$ by $S_{a,b}$. The spread of the corresponding leveraged tranche with collateral amount $x$ will be denoted by $S_{a,b,x}$. Note that the leverage amount $\alpha$ is given by $\alpha = (b - a)/x$. The most basic bound we can then write down is

$$S_{a,b} \le S_{a,b,x} \le \alpha \cdot S_{a,b} \qquad (1)$$
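The bound in equation (1) is simple arithmetic once the leverage is known. The short Python sketch below computes the leverage and the implied spread interval; the function names and the sample tranche (12%–22% with 1% collateral and a 10 basis point unleveraged spread) are hypothetical values chosen purely for illustration.

```python
def leverage(a, b, x):
    """Leverage alpha = (b - a) / x for tranche [a, b] with collateral x."""
    if not (b > a and 0 < x <= b - a):
        raise ValueError("expected b > a and 0 < x <= b - a")
    return (b - a) / x

def lss_spread_bounds(s_ab, a, b, x):
    """Model-independent interval [S_ab, alpha * S_ab] for the LSS spread."""
    alpha = leverage(a, b, x)
    return s_ab, alpha * s_ab

# Hypothetical 12%-22% super-senior tranche, 1% collateral, 10 bp spread.
lower, upper = lss_spread_bounds(0.0010, 0.12, 0.22, 0.01)
```

With these inputs the leverage is about 10, so the leveraged spread would lie between roughly 10 and 100 basis points. Where it falls inside that interval depends on the trigger design and on the dynamics of spreads and losses, which is what the dynamic models mentioned above are needed for.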
phase” (which lasts about one year), asset managers invest the proceeds from CDO placement (possibly after an initial warehousing period when the sponsor finances the buildup of the asset portfolio before securitizing). During the subsequent “reinvestment phase” (up to five years or longer), managers reinvest cash flows as well as trade the reference portfolio within the prescribed guidelines. Cash flows generated by the assets are used to pay back investors generally in sequential order from the senior investors, who hold the highest rated (typically “AAA”-rated) securities, to the “equity investors” who bear the first-loss risk and generally hold unrated securities. In transactions with revolving pools, portfolio assets can be replaced (e.g., credit card and trade receivables, corporate bonds) and balances are adjustable up to maximum limits without amortization schedule of principal. In contrast, managers of substituting pools incorporate new assets (within defined credit parameters) as original liabilities are paid down (e.g., corporate bonds, some residential mortgages, and consumer loans), but balances remain fixed. In the “amortization phase”, the reference portfolio matures (or is prepaid/sold) and investors receive some or all of their principal investment back according to the seniority of their claim.

Lessons from the Credit Crisis

Although rating agencies have developed stress tests to evaluate the resilience of dynamic portfolio structures, the 2007 subprime mortgage crisis demonstrated that managed CDOs might create incentive problems [4]. Existing quality and coverage tests on the underlying collaterals are designed to trigger amortization scenarios if asset performance deteriorates. However, CDO managers can manipulate these tests to avoid early amortization. In response to a general repricing of risk, dwindling investor demand increased risk premia and curtailed the capacity of CDO managers to offset higher funding costs. Faced with rising liability pressures and without real buyers available, managers of “blind pools” could “double up” by opting for riskier positions and greater leverage to preserve own arbitrage gains within predefined investment guidelines, which were gradually undermined by the disassociation of ratings and structured asset performance. In principle, if transaction costs are ignored, risk-neutral managers would not benefit from dynamic asset allocation by substituting badly performing assets. Under worsening credit conditions, better asset performance comes at a premium, making it more expensive to weed out distressed assets. Therefore, CDO managers are no better off than before once they divert funds to safer but more costly assets (or accept higher hedging costs).

References

[1] Cousseran, O. & Rahmouni, I. (2005). The CDO market – functioning and implications in terms of financial stability, Banque de France Financial Stability Review June (6), 43–62.
[2] Duffie, D. & Gârleanu, N. (2001). Risk and valuation of collateralized debt obligations, Financial Analysts Journal 57(1), 41–59.
[3] Goodman, L.S. & Fabozzi, F.J. (2002). Collateralized Debt Obligations: Structures and Analysis, John Wiley & Sons Inc., Hoboken, NJ.
[4] Jobst, A. (2005). Risk management of CDOs during times of stress, Derivatives Week, Euromoney, London. (28 November), pp. 8–10.
[5] Jobst, A. (2007). A primer on structured finance, Journal of Derivatives and Hedge Funds 13(3), 199–213.
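The sequential repayment order described in this article (senior investors first, equity investors last) can be illustrated with a minimal sketch. The function name and the tranche balances are hypothetical, and real transactions add interest, fees, and coverage tests that are ignored here.

```python
def sequential_paydown(cash: float, balances: list[float]) -> list[float]:
    """Allocate `cash` to tranches listed from most senior to equity; return the payments."""
    payments = []
    for balance in balances:
        paid = min(cash, balance)   # a tranche never receives more than it is owed
        payments.append(paid)
        cash -= paid                # only the remainder flows down to the next tranche
    return payments

# Hypothetical balances: senior 70, mezzanine 20, equity 10; 85 of cash available.
print(sequential_paydown(85.0, [70.0, 20.0, 10.0]))  # [70.0, 15.0, 0.0]
```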
Managed CDO 3
not started to touch the tranche the subordination will be increased by a fixed amount. Multiple variations of those contracts exist with several reset dates or increase in subordination linked to losses being in a specific band.
• Leveraged supersenior (see Leveraged Super-senior Tranche): This is a synthetic CDO tranche, with a large attachment point, thus its supersenior nature, which is initially partially collateralized by the protection seller. Owing to that partial collateralization, when a loss trigger or mark-to-market (MtM) trigger is breached, the protection seller has the obligation of either providing additional collateralization or unwinding its contract at market value. When the trigger is based on loss level, this can be viewed as a reset tranche.

The second category encompasses all derivatives in the classical sense, that is, derivatives based on the market value of the underlying asset.

• Call on CDO tranches: This is an option giving the option holder the possibility to buy protection on a synthetic CDO tranche at a predetermined spread on one or several future dates, being either European for one single date or Bermudan for a set of future dates. The strike is defined as a spread level (and not as a value of the tranche). The synthetic CDO tranche, that is, portfolio composition, attachment/detachment points, and maturity, is defined initially, and akin to the differentiation done on forward-starting CDO, losses up to the exercise date of the option may or may not affect the attachment point.
• Put on CDO tranches: Contrary to the call option, this gives the option holder the possibility to sell protection on a CDO tranche at a predetermined spread.
• Callable structure: This is an option that gives the protection seller (or the protection buyer) the right to terminate the transaction at no additional costs during its life. If the option is for the protection seller, this is, in fact, a Bermudan call on the CDO tranche itself with a strike equal to its initial spread level. Here the attachment point of the underlying synthetic CDO tranche will be eroded by losses up to the exercise date of the option.
• Rating guarantee: We know of one investment bank that worked on the possibility to issue a guarantee on the CDO tranche rating, giving the option to investors to put their CDO tranche at par to the issuer of such guarantee if the rating was downgraded below a prespecified threshold. This is in effect a callable structure conditional on the tranche being downgraded by a rating agency.

Purpose and Market

The purpose of those innovations is, in most cases, related to issues encountered by the bank’s desk working on synthetic CDO tranches. The innovations in that market were always caused not by a need of the investors but by potential arbitrages to exploit. The synthetic CDO tranche market from 2001 to 2007 was a booming market with several success stories. However, this was a very competitive market, where the competitive advantage was due to the endless creation of new structural features. Indeed, as soon as an innovation was introduced to the market by one player, several others tried to imitate it, soon depleting the potential gains evidenced by such innovation (Figure 1).
Each innovation was triggered by either an arbitrage to exploit or a specific problem encountered by the desk:

• Forward-starting CDO: Those products were created to exploit discrepancies in the term structure of spreads, as the five-year maturity spread was depleted because of the wave of five-year synthetic CDO tranches. Synthetic CDO tranches were starting to be structured at 10 years or even as forward starting 5–10 years to benefit from the tightening at 5 years. Indeed, a 5–10 year forward-starting CDO can be seen as a combination of a 10-year synthetic tranche and a 5-year synthetic tranche, selling protection for 10 years but buying protection for the first 5 years.
• Leveraged supersenior: When the correlation desk sold synthetic CDO tranches, they sold mainly equity and mezzanine tranches, and thus either delta-hedged them or kept the most senior tranches on their book. The supersenior exposures were very hard to sell due to their low spreads compared to their notional amounts, that is, the amount of cash needed to invest in those tranches. The creation of LSS allowed those desks to buy protection on supersenior synthetic CDO tranches by broadening the investor base outside of its
Collateralized Debt Obligation (CDO) Options 3
Figure 1 Evolution of the notional of the credit derivatives market with several innovations (horizontal axis: date, mid-2001 to mid-2008; vertical axis: notional, $0 to $60 000)
initial clients (monolines or (re)insurance companies), the LSS having a higher spread for an “assumed” low credit risk.

Valuation

The valuation of a CDO tranche, whether initially or during its life, relies on the knowledge of the loss distribution of the underlying portfolio through time or, in other words, on the law governing the random path $L_t$ representing the cumulative losses up to t. The knowledge of the loss distribution at different future dates (thus a loss distribution surface) is required (see end note b) to price a CDO tranche, that is, to value the two legs of that tranche swap: $P[L_T \le l \mid F_t]$, the probability that losses up to time T will not exceed threshold l knowing the losses at time t (see end note c). If the existing information in the market consists of the credit index tranche prices, then, in the arbitrage pricing theory framework, from those prices we will extract constraints on the “spot” loss distribution surface, being $P[L_T \le l \mid F_{t_0}]$.

Some CDO derivatives can be valued with that “spot” loss distribution surface: the forward-starting CDO as described in [7] (the second variation as described above) can be understood as a long–short position on maturities: long, the CDO tranche at the longest maturity and short, the same CDO tranche at the effective date. Indeed, compare the two positions (a $t_1/t_2$ forward-starting CDO tranche versus a long CDO tranche at $t_2$ and a short CDO tranche at $t_1$ with the same attachment/detachment points a and d):

• if losses are always below the attachment point, no CDO tranches will be touched;
• if $L_{t_1} < a$ and $L_{t_2} > a$, the forward-starting CDO will lose $\min(d - a, L_{t_2} - a)$ and the long CDO tranche will lose the same amount; and
• if $L_{t_1} > a$ and $L_{t_2} > a$, the forward-starting CDO will lose $\min(d - L_{t_1}, L_{t_2} - L_{t_1})$ and the long–short CDO tranches will lose/gain $\min(d - a, L_{t_2} - a)$ / $\min(d - a, L_{t_1} - a)$, which gives the same aggregate amount (see end note d).

However, to value the other reset tranches, we need additional information: the intertemporal dependence of losses, that is, the dependence between losses at different dates. For a forward-starting CDO (first variation), we need the law of $(L_{t_1}, L_{t_2})$ to be able to price it.

In addition, for options on tranche, dependent on future spreads, the knowledge of the “spot” loss
distribution surface is not sufficient to value those options. An additional assumption related to spread volatility is needed: this can be an ad hoc assumption directly on the volatility [7] or can be embedded into a stochastic deformation of the loss distribution surface $P[L_T \le l \mid F_t]$ through time. This leads researchers to introduce the class of models known as dynamic losses models. The dynamic losses models defined so far rely on the standard CDO models, which are classified according to two broad categories [8]:

• Top-down models: The top-down approach looks only at the evolution of the losses on the portfolio and models its dynamics. The seminal paper describing a general framework for such dynamics of the “forward” loss distribution surface is [11], where the distribution of losses in the portfolio is represented as a Markov chain with stochastic transition rates. Andersen et al. [3] explore the same road in a less general manner. Those approaches are tractable and flexible, but they do not capture information from the single-name CDS market.
• Bottom-up models: The approach starts with a representation of the credit risk of the underlying single names in order to build a loss distribution surface. Starting from the modeling of individual defaults, they use classical credit modeling:
  • Structural models (see Default Barrier Models): A structural model computes the default as the breaching by a random process of a barrier (in the initial Merton seminal article the first represents the assets of a company and the second its indebtedness). That class of model incorporates a dynamic for the probabilities of losses naturally, introducing default dependencies through the random process, with linear combination of random processes (Brownian motion or Gamma process, see [9]). A related class of models looks at a discrete evolution of creditworthiness, generally with a Markov chain, where the use of stochastic transition rates can also be applied (a related example is in [1]).
  • Reduced-form models (see Intensity-based Credit Risk Models): A reduced-form model uses hazard rates to represent the risk of default of individual companies; a portfolio can be analyzed with correlated hazard rates. A natural extension of those models to address dynamic losses is to use a stochastic hazard rate for each company, which may be linked through common jumps (as first introduced by Duffie and Gârleanu [5]), correlated Brownian motion, or even introduction of a stochastic time process mapping calendar time to business time [10].

Following the financial crisis of 2008, the landscape for synthetic CDO tranches has seen a change in paradigm. The default of Lehman Brothers in September 2008 and the demise of the investment bank business model have exposed the shaky foundations of the CDO market: liquidity drying in stress period and lack of acknowledgment of the counterparty risk in the CDS market. However, the initiatives that are currently discussed (standardization of that market, central clearing house) will, in the long term, expand the scope of that market and ultimately be beneficial for the development of those instruments.

End Notes

a. Apart from rare guarantees offered by structuring desks or call optionality for the equity tranche of cash CDOs.
b. In reality as pointed out in [6], the knowledge of the expected loss on the CDO tranche is sufficient to price it.
c. The filtration $F_t$ may embed more information than the cumulative losses up to that time.
d. Taking into account even the timing of payment of losses, the two positions are the same.

References

[1] Albanese, C., Chen, O., Dalessandro, A. & Vidler, A. (2005). Dynamic Credit Correlation Modeling, Working Paper, Imperial College.
[2] Andersen, L. (2006). Portfolio Losses in Factor Models: Term Structures and Intertemporal Loss Dependence.
[3] Andersen, L., Piterbarg, V. & Sidenius, J. (2005). A New Framework for Dynamic Credit Portfolio Loss Modelling. Working Paper, November.
[4] Baheti, P., Mashal, R. & Naldi, M. (2006). Step it Up or Start it Forward: Fast Pricing of Reset Tranches, Lehman Brothers Quantitative Credit Research, Vol. 2006-Q1.
[5] Duffie, D. & Gârleanu, N. (2001). Risk and the valuation of collateralized debt obligations, Financial Analysts Journal 57, 41–59.
[6] Hull, J. & White, A. (2006). Valuing credit derivatives using an implied copula approach, Journal of Derivatives 14, 8–28.
[7] Hull, J. & White, A. (2007). Forward and European options on CDO tranches, Journal of Credit Risk 3, 63–73.
[8] Hull, J. & White, A. (2008). Dynamic models of portfolio credit risk: a simplified approach, Journal of Derivatives 15, 9–28.
[9] Jäckel, P. (2008). The Discrete Gamma Pool Model. Working Paper, August.
[10] Joshi, M. & Stacey, A. (2006). Intensity Gamma, Risk 19, 78–83.
[11] Schönbucher, P.J. (2006). Portfolio Losses and the Term Structure of Loss Transition Rates: A New Methodology for the Pricing of Portfolio Credit Derivatives. Working Paper, ETHZ.

Related Articles

Collateralized Debt Obligations (CDO); Default Barrier Models; Forward-starting CDO Tranche; Intensity-based Credit Risk Models; Leveraged Super-senior Tranche.

OLIVIER TOUTAIN
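The case-by-case comparison in the Valuation section of the CDO options article (a forward-starting tranche versus a long tranche to t2 and a short tranche to t1) can be checked numerically. The sketch below is illustrative only: the function names, the attachment/detachment values, and the treatment of a tranche already exhausted at t1 are assumptions added for this example.

```python
import random

def tranche_loss(loss: float, a: float, d: float) -> float:
    """Loss absorbed by a tranche with attachment a and detachment d."""
    return min(max(loss - a, 0.0), d - a)

def long_short_loss(l1: float, l2: float, a: float, d: float) -> float:
    """Long the tranche to t2, short the same tranche to t1."""
    return tranche_loss(l2, a, d) - tranche_loss(l1, a, d)

def forward_start_loss(l1: float, l2: float, a: float, d: float) -> float:
    """Case-by-case loss of the forward-starting tranche (second variation)."""
    if l2 <= a:
        return 0.0                      # losses never reach the attachment point
    if l1 <= a:
        return min(d - a, l2 - a)       # tranche first touched after t1
    if l1 >= d:
        return 0.0                      # assumption: tranche already exhausted at t1
    return min(d - l1, l2 - l1)         # tranche partly eroded at t1

random.seed(0)
a, d = 0.03, 0.07                       # hypothetical 3%-7% tranche
for _ in range(10_000):
    l1 = random.uniform(0.0, 0.12)
    l2 = random.uniform(l1, 0.12)       # cumulative losses cannot decrease
    assert abs(forward_start_loss(l1, l2, a, d) - long_short_loss(l1, l2, a, d)) < 1e-12
```

The loop passes silently because the two positions absorb the same loss on every sampled path, which is why the spot loss distribution surface is enough for this variation.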
Credit Default Swap Index Options

Portfolio credit default swaps (CDSs) referencing indices such as CDX and iTraxx are the most liquid instruments in today’s credit market and options on these have become mainstream. A CDS index option (also called a portfolio swaption) is an option to enter into a portfolio swap as a protection buyer or a protection seller. A portfolio swap (also called a CDS index swap) is similar to a portfolio of single-name CDS all with the same coupon (for details, see Credit Default Swap (CDS) Indices).

Both portfolio swaps and swaptions are traded over the counter but are standardized. The conventions for how portfolio swaps are quoted and traded are important for properly valuing portfolio swaptions.

In this article, we outline the basic conventions and terminology for portfolio swaptions, explain the standard model used by most market participants, and briefly discuss other models and approaches.

The cash amount is paid by the protection buyer. The accrued coupon enters the calculation because portfolio swaps, by convention, trade with accrued coupon, similar to the way bonds trade with accrued interest. To simplify the exposition, we ignore accrued coupon in the remainder of the article.

When a strike spread is specified, the cash amount is calculated using the standard CDS valuation model, for example, as implemented in the Bloomberg CDSW screen:

Cash amount = Notional · PV01 · (Strike spread − Coupon)    (2)

The coupon is the fixed premium rate for the underlying portfolio swap. When valuing a portfolio swaption, it is important to respect the exact market convention for calculating the PV01 such as the flat spread curve convention (see Credit Default Swap (CDS) Indices).

Another important market convention is that if the swaption is exercised, the option holder will buy or sell protection on all names in the portfolio including those that may have defaulted before option expiration.
To price a swaption, we must specify a stochastic model for V. In addition to assuming that risk-neutral valuation is proper [1, 3], the standard model is based on two minimal assumptions that are clarified further:

1. the spread of the underlying portfolio swap is lognormally distributed and
2. the model correctly prices a synthetic forward contract constructed by combining a long payer and a short receiver with the same strikes.

The standard model assumes that V is a function, V(X), of a hypothetical spread:

$$X = E(X)\,\exp\left(-0.5\,\sigma^2 T + \sigma\,\mathrm{Normal}(0, T)\right) \qquad (4)$$

where Normal(0, T) is a normal random variable with mean 0 and variance T (the time, in years, to option expiration), and σ is the free parameter that we interpret as the spread volatility. E(X) is the expected value of X. The function V(X) is the one found in equation (2) when the cash amount is seen as a function of the strike spread.

The swaptions are priced by discounting their expected terminal payoff (risk-neutral valuation). To understand where E(X) comes from, consider a payer and a receiver swaption both with a strike price of 100% or equivalently strike spreads equal to the coupon in the underlying portfolio swap. In this case, the cash amount in equation (1) or (2) is zero and the terminal payoff from a position that is long the payer and short the receiver is V. The value of this position is therefore

$$V_0 = D(T)\,E(V(X)) \qquad (5)$$

where D(T) is the discount factor to time T (option expiration).

The value of a position that pays V, that is, $V_0$, can also be determined from the credit curve of the underlying portfolio (potentially using the credit curves of all the names in the portfolio) since it is simply the value of owning protection on all names in the portfolio but only having to pay premium from option expiration onward. Once we have a value for $V_0$, E(V(X)) can be found as $V_0/D(T)$ and E(X) can be implied from this value. We can then price the swaptions using σ as the only additional parameter.

It is recommended to solve the model numerically to get the most accurate pricing. However, by making a few simple approximations (such as simplifying the expression for the PV01 in equation (2)) it is possible to derive approximate closed-form solutions that look like Black formulas.

See [2] for details on the model outlined above.

Other Models and Approaches

The standard model is a simple approach to what could be a very complicated problem. Instead of trying to model the credit curves and default of each of the names in the portfolio, the approach in the standard model is to model the hypothetical spread on the aggregate portfolio that also includes defaulted names. Thereby the model has only one free parameter, the aggregate spread volatility, and the approach becomes similar to using Black–Scholes for S&P 500 options. This analogy to the equity world suggests paths to the next generation of models such as introducing stochastic volatility and jumps or creating a model that starts from the individual credits by modeling their default, spread volatility, and spread correlation.

References

[1] Morini, M. & Brigo, D. (2007). Arbitrage-free Pricing of Credit Index Options, Working Paper, Bocconi University.
[2] Pedersen, C. (2003). Valuation of Portfolio Credit Default Swaptions, Lehman Brothers Quantitative Credit Research Quarterly, 2003-Q4, pp. 71–81.
[3] Rutkowski, M. & Armstrong, A. (2008). Valuation of Credit Default Swaptions and Credit Default Index Swaptions, Working Paper, University of New South Wales.

Related Articles

Credit Default Swaps; Credit Default Swap (CDS) Indices; Credit Default Swaption; Hazard Rate.

CLAUS M. PEDERSEN
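The standard swaption model described in this article (a lognormal hypothetical spread X as in equation (4), terminal value V(X) from equation (2), and discounted expected payoffs) can be sketched numerically as follows. Several ingredients are assumptions made only for illustration: the simplified PV01 (flat hazard rate, continuous premium), the payoff forms max(V(X) − cash, 0) and max(cash − V(X), 0), the trapezoid quadrature, and all parameter values. It is not the market-convention calculation.

```python
import math

def pv01(spread, maturity, r=0.03, recovery=0.4):
    """Simplified risky annuity (assumption): flat hazard rate, continuous premium."""
    h = spread / (1.0 - recovery)
    return (1.0 - math.exp(-(r + h) * maturity)) / (r + h)

def V(x, coupon, maturity):
    """Equation (2) per unit notional, read as a function of the spread x."""
    return pv01(x, maturity) * (x - coupon)

def expect(f, mean_x, sigma, T, n=2001, width=8.0):
    """E[f(X)] with X from equation (4), writing Normal(0, T) = sqrt(T) * Z."""
    step = 2.0 * width / (n - 1)
    total = 0.0
    for i in range(n):                      # trapezoid rule over z in [-width, width]
        z = -width + i * step
        x = mean_x * math.exp(-0.5 * sigma**2 * T + sigma * math.sqrt(T) * z)
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * f(x) * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return total * step

def swaption_prices(mean_x, sigma, T, strike, coupon, maturity, r=0.03):
    """Return (payer, receiver) values under the assumed payoff forms."""
    discount = math.exp(-r * T)             # D(T)
    cash = V(strike, coupon, maturity)      # cash amount at the strike spread
    payer = discount * expect(lambda x: max(V(x, coupon, maturity) - cash, 0.0), mean_x, sigma, T)
    receiver = discount * expect(lambda x: max(cash - V(x, coupon, maturity), 0.0), mean_x, sigma, T)
    return payer, receiver

# Hypothetical inputs: 3-month option on a 5-year swap, 100 bp coupon, 120 bp expected spread.
payer, receiver = swaption_prices(mean_x=0.012, sigma=0.5, T=0.25,
                                  strike=0.012, coupon=0.01, maturity=5.0)
print(payer, receiver)
```

By construction, payer minus receiver equals D(T)·(E[V(X)] − cash amount), which is the synthetic forward in the model's second assumption. In practice E(X) would be implied from $V_0$ as described in the article rather than supplied directly.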
Hazard Rate

Consider a credit default swap (CDS) (see Credit Default Swaps), where the premium payments are periodic and the terminal payment is a digital cash settlement of recovery rate $1 - \bar{\delta}$. For simplicity, we assume that the current time is normalized to t = 0, the risk-free rate r is constant throughout the maturity of the contract, and spreads are already given at standard interperiod rates (allowing us to ignore day-count fractions and division by period length). The cash flows of a CDS can be decomposed into the default leg and the premium leg. The default leg is a single lump sum compensation for the loss $\bar{\delta}$ on the face value of the reference asset made at the default time by the protection seller to the protection buyer, given that the default is before the expiration date T of the contract. The premium leg consists of the fees, called the CDS spread, paid by the protection buyer at dates $t_m$ (assumed to be equidistant, e.g., quarterly) until the default event or T, whichever is first. The spread S is given as a fraction of the unit notional.

A concise mathematical expression for both legs can be obtained via a point process representation. Suppose we have a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ satisfying the usual conditions. We model the default time as a random time τ in [0, ∞] with an associated single jump point process

$$N_t = 1_{\{\tau \le t\}} = \begin{cases} 1 & \text{if } \tau \le t \\ 0 & \text{if } \tau > t \end{cases} \qquad (1)$$

The default leg $D_0$ and premium leg $P_0(S)$ can now be expressed in terms of N as

$$D_0 = \mathbb{E}_0\left[\bar{\delta}\, e^{-r\tau} N_T\right] = \mathbb{E}_0\left[\int_0^T \bar{\delta}\, e^{-rs}\, dN_s\right] \qquad (2)$$

$$P_0(S) = \mathbb{E}_0\left[S \sum_{t_m} e^{-r t_m} \left(1 - N_{t_m}\right)\right] \qquad (3)$$

where $\mathbb{E}_t[\cdot] = \mathbb{E}[\cdot \mid \mathcal{F}_t]$ is the conditional expectation with respect to time t information $\mathcal{F}_t$, and the integral of equation (2) is defined in the Stieltjes sense. Finally, by an application of Fubini’s theorem and integration by parts, equations (2) and (3) can be expressed as

$$D_0 = \bar{\delta}\, e^{-rT}\, \mathbb{P}_0[\tau \le T] + r \bar{\delta} \int_0^T e^{-rs}\, \mathbb{P}_0[\tau \le s]\, ds \qquad (4)$$

$$P_0(S) = S \sum_{t_m} e^{-r t_m}\, \mathbb{P}_0[\tau > t_m] \qquad (5)$$

The fair spread is the spread $S^*$ for which $P_0(S^*) = D_0$, making the value of the contract at initiation 0. This simple expositional formulation shows that the modeling of survival probabilities under the pricing measure of the form $\mathbb{P}_0[\tau > s]$ is the essence of CDS pricing. These quantities can be modeled in a unified way using the concept of hazard rate.

Hazard Rate and Default Intensity

Suppose that we have a filtered probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbb{P})$ satisfying the usual conditions and that the default time of a firm is modeled by a random time τ, where $\mathbb{P}[\tau = 0] = 0$ and $\mathbb{P}[\tau > t] > 0$ for all $t \in \mathbb{R}_+$. We start under the assumption, which will be relaxed later, that the evolution of information only involves observations of whether or not default has occurred up to time t. In other words, we are dealing with the natural filtration $\mathcal{F}_t = \mathcal{N}_t = \sigma(N_s, s \le t)$ of the right continuous, increasing process N, introduced earlier, completed to include the $\mathbb{P}$-negligible sets. Let $F(t) = \mathbb{P}[\tau \le t]$ be the cumulative distribution function of τ. Then, the hazard function of τ is defined by the increasing function $H : \mathbb{R}_+ \to \mathbb{R}_+$ given as

$$H(t) = -\ln(1 - F(t)) \quad \forall t \in \mathbb{R}_+ \qquad (6)$$

Suppose, furthermore, that F is absolutely continuous, admitting a density representation of $F(t) = \int_0^t f(s)\, ds$. The hazard rate of τ is defined by the nonnegative function $h : \mathbb{R}_+ \to \mathbb{R}_+$ given as

$$h(t) = \frac{f(t)}{1 - F(t)} \qquad (7)$$

under which we have

$$F(t) = 1 - e^{-H(t)} = 1 - e^{-\int_0^t h(s)\, ds} \quad \forall t \in \mathbb{R}_+ \qquad (8)$$

Naturally, the component probabilities of equations (4) and (5) can be expressed in terms of the
distributional properties of τ, we can still use point process martingale theory (see Point Processes) to find an increasing predictable process $\Lambda$ for which the conditional survival probabilities are given by

$$\mathbb{P}[\tau > s \mid \mathcal{F}_t] = 1_{\{\tau > t\}}\, \mathbb{E}\left[e^{\Lambda(t) - \Lambda(s)} \mid \mathcal{G}_t\right] \qquad (14)$$

Routinely, if $\Lambda$ is absolutely continuous, then $\Lambda(t) - \Lambda(s)$ in equation (14) can be replaced by its density representation $-\int_t^s \lambda(u)\, du$. The details of such conditions and results, as well as a general theory of hazard processes, are summarized in [1, 7].

Reduced-form Modeling and Other Issues

The importance of the concept of hazard rates (or intensities) lies in the fact that their direct modeling and parametrization is the prevalent industry practice in evaluating credit derivatives. Now that the CDS market has grown to one of great volume and liquidity, the realm of CDS spread modeling has become less of a pricing issue and more of a calibration one. Reduced-form modeling (see Intensity-based Credit Risk Models) refers to valuation methods in which one exogenously specifies the dynamics of an intensity model, much like we would for spot rates, and then calibrates the model parameters to fit the market spread data via a pricing formulation such as equations (4)–(5). A full-fledged model could incorporate features such as premium accrual, dependence of intensity with stochastic spot rates and the loss rate, and interaction/contagion effects with other names, which were ignored in our expositional formulation.

The assumptions, the underlying informational assumptions in particular, implied by the mere existence of a hazard rate are a nontrivial issue. Not all models admit an intensity process in their given information filtrations. For instance, in the classical first passage structural model under perfect information (see Default Barrier Models) the forward default rate (hazard rate with survival information only) exists, but the intensity process (hazard rate with all available information, i.e., the firm value process) does not. Conceptually, the existence of a positive instantaneous default arrival rate implies a certain imperfection in the observable information, modeled either explicitly through a noisy filtration or implicitly through a totally inaccessible stopping time in the complete filtration. This underlines one of the many issues surrounding the divergence of opinions and efforts for convergence in the reduced-form versus structural literature. Duffie and Singleton [5] and Lando [8] both provide a comprehensive overview of different credit models, while Giesecke [6] specifically outlines the different informational assumptions and their implications in intensity formulations.

References

[1] Bielecki, T.R. & Rutkowski, M. (2001). Credit Risk: Modeling, Valuation and Hedging, Springer.
[2] Bremaud, P. (1981). Point Processes and Queues, Martingale Dynamics, Springer-Verlag.
[3] Brigo, D. & Mercurio, F. (2007). Interest Rate Models - Theory and Practice, With Smile, Inflation and Credit, 2nd Edition, Springer.
[4] Dellacherie, C. & Meyer, P.A. (1982). Probabilities and Potential, North Holland, Amsterdam.
[5] Duffie, D. & Singleton, K. (2003). Credit Risk: Pricing, Measurement and Management, Princeton University Press.
[6] Giesecke, K. (2006). Default and information, Journal of Economic Dynamics and Control 30, 2281–2303.
[7] Jeanblanc, M. & Rutkowski, M. (2000). Modelling of default risk: an overview, in Mathematical Finance: Theory and Practice, Higher Education Press, Beijing, pp. 171–269.
[8] Lando, D. (2004). Credit Risk Modeling: Theory and Applications, Princeton University Press.

Further Reading

International Swaps and Derivatives Association (1997). Confirmation of OTC Credit Swap Transaction Single Reference Entity Non-Sovereign.
International Swaps and Derivatives Association (2002). 2002 Master Agreement.
Tavakoli, J. (2001). Credit Derivatives and Synthetic Structures, A Guide to Instruments and Structures, 2nd Edition, John Wiley & Sons.

Related Articles

Compensators; Credit Default Swaps; Default Barrier Models; Duffie–Singleton Model; Intensity-based Credit Risk Models; Jarrow–Lando–Turnbull Model; Point Processes; Reduced Form Credit Risk Models.

JUNE HO KIM
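The Hazard Rate article's pricing formulation, equations (4) and (5) combined with the survival probability implied by equation (8), can be made concrete with a constant hazard rate. The sketch below is a minimal illustration under that assumption; the function names, the midpoint-rule integration, and the parameter values are hypothetical choices for this example.

```python
import math

def survival(t, h):
    """P[tau > t] = exp(-H(t)) with H(t) = h * t for a constant hazard rate h."""
    return math.exp(-h * t)

def default_leg(loss, r, h, T, n=20000):
    """Equation (4): loss * (e^{-rT} F(T) + r * integral_0^T e^{-rs} F(s) ds)."""
    F = lambda s: 1.0 - survival(s, h)      # cumulative default probability
    ds = T / n
    # Midpoint rule for the integral term.
    integral = sum(math.exp(-r * (i + 0.5) * ds) * F((i + 0.5) * ds) for i in range(n)) * ds
    return loss * (math.exp(-r * T) * F(T) + r * integral)

def premium_leg(S, r, h, dates):
    """Equation (5): S * sum over payment dates of e^{-r t_m} P[tau > t_m]."""
    return S * sum(math.exp(-r * t) * survival(t, h) for t in dates)

def fair_spread(loss, r, h, T, dates):
    """Per-period spread S* solving P0(S*) = D0."""
    return default_leg(loss, r, h, T) / premium_leg(1.0, r, h, dates)

# Hypothetical inputs: 5-year contract, quarterly premiums, 60% loss, 2% hazard rate.
dates = [0.25 * k for k in range(1, 21)]
s_star = fair_spread(loss=0.6, r=0.03, h=0.02, T=5.0, dates=dates)
print(f"per-period fair spread: {s_star:.6f}, annualized: {4 * s_star:.6f}")
```

Under these assumptions the annualized fair spread comes out close to the loss fraction times the hazard rate, the usual rule of thumb for a flat hazard curve.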
Duffie–Singleton Model

The credit risk modeling approach of Duffie and Singleton [8, 9] falls into the class of reduced-form (see Reduced Form Credit Risk Models; Intensity-based Credit Risk Models) or intensity-based models in the sense that default is directly modeled as being triggered by a point process, as opposed to structural models (see Structural Default Risk Models) attempting to explain default through the dynamics of the firm’s capital structure, and the intensity of this process under a risk-neutral probability measure is related to an appropriately defined instantaneous credit spread. In its original construction, it is set out as an econometric model, that is, a model the parameters of which are estimated from the time series of market data, such as the weekly data of swap yields used in [8]. To this end, the model is driven by a set of state variables following a Markov process under the risk-neutral measure, and defaultable zero-coupon bond prices are exponentially affine functions of the state variables along the lines of the results derived by Duffie and Kan [6] for default-free models of the term structure of interest rates (see Affine Models). Duffie and Singleton [9] show that the model framework can be made specific in a way that also allows default intensities and default-free interest rates to be negatively correlated in a manner that is more consistent theoretically than in prior attempts in the

$h_t$ the default hazard rate, and $L_t$ the fraction of market value lost in the event of default. $\lambda_t = h_t L_t$ can be interpreted as a “risk-neutral mean-loss rate of the instrument due to default.” As a consequence, credit spread data alone (be it corporate bond yields, swap to treasury spreads, or credit default swap spreads) are insufficient to separate the “risk-neutral mean-loss rate” $\lambda_t$ into its hazard rate $h_t$ and loss fraction $L_t$.

The representation (1) lends the model considerable tractability, particularly for applications that do not require the separation of $R_t$ into its components $r_t$, $h_t$, and $L_t$, since $R_t$ could then be modeled directly as a function $\rho(Y_t)$ of a state variable process Y that is Markovian under Q. If the payoff of the claim is also Markovian in Y, say $X = g(Y_T)$, then the value of the claim at any time t (assuming that default has not occurred by time t) can be written as the conditional expectation

$$V_t = E^Q\left[\exp\left(-\int_t^T \rho(Y_s)\, ds\right) g(Y_T) \,\Big|\, Y_t\right] \qquad (2)$$

$\rho(Y_s)$ can be modeled analogously to any one of a number of tractable default-free interest rate term structure models. One possible choice of making the Markovian model specific is along the lines of a multifactor affine term structure model as studied by Dai and Singleton [3], in which $r_t$ and $\lambda_t$ are affine functions of the vector $Y_t$,
N
default-free and defaultable zero-coupon bond prices are exponential affine functions of the state variables.

Duffie and Singleton [9] highlight that modeling $Y$ as a vector of independent components following "square-root diffusions" [2] constrains the joint conditional distribution of $r_t$ and $\lambda_t$ in a manner inconsistent with empirical findings. In particular, the conditions of [3] on admissible model parameters imply that such a model cannot produce negative correlation between the default-free interest rate and the default hazard rate. Duffie and Singleton instead propose a more flexible specification, which does not suffer from this disadvantage. In its three-factor form, it is given by

$$
\alpha = \begin{pmatrix} 0 \\ 0 \\ \beta_3 \end{pmatrix}, \quad
\beta_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad
\beta_2 = \begin{pmatrix} 0 \\ \beta_{22} \\ 0 \end{pmatrix}, \quad
\beta_3 = \begin{pmatrix} \beta_{31} \\ \beta_{32} \\ 0 \end{pmatrix}, \quad
\delta_Y = \begin{pmatrix} \delta_1 \\ 1 \\ 1 \end{pmatrix}, \quad
\gamma_Y = \begin{pmatrix} \gamma \\ \gamma \\ 0 \end{pmatrix}
\tag{7}
$$

with all coefficients (including $\delta_0$ and $\gamma_0$ in equations (3) and (4)) strictly positive. Furthermore,

$$
K = \begin{pmatrix} \kappa_{11} & \kappa_{12} & 0 \\ \kappa_{21} & \kappa_{22} & 0 \\ 0 & 0 & \kappa_{33} \end{pmatrix}, \quad
\Sigma = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ \sigma_{31} & \sigma_{32} & 1 \end{pmatrix}
\tag{8}
$$

with the off-diagonal elements of $K$ being nonpositive. This specification ensures strictly positive credit spreads $\lambda_t$ and can represent negative correlation between the increments of $r$ and $\lambda$.

The "recovery-of-market-value" assumption at the core of the Duffie–Singleton framework is in line with market practice for defaultable derivative financial instruments such as swaps. For defaultable bonds, it is arguably more realistic to model the loss in the event of default as a fraction of the par value. However, Duffie and Singleton [9] provide evidence that par yield spreads implied by reduced-form models are relatively robust with respect to different recovery assumptions, and suggest that for bonds trading substantially away from par, pricing differences due to different recovery assumptions can be largely compensated by changes in the recovery parameters. The computational tractability gained through the "recovery-of-market-value" assumption may thus justify accepting its slight inconsistency with legal and market practice.

The parallels of equation (1) to the valuation of contingent claims in default-free interest rate term structure models also extend to the methodology of Heath et al. [10] (HJM). Defining a term structure of "defaultable instantaneous forward rates" $\bar f(t, T)$ in terms of defaultable zero-coupon bond prices $\bar B(t, T)$ (i.e., the time $t$ price of a bond maturing at $T$) by

$$
\bar B(t, T) = \exp\left( -\int_t^T \bar f(t, u)\, du \right)
\tag{9}
$$

the model can be written in terms of the dynamics of the $\bar f(t, T)$, the drift of which under the risk-neutral measure must obey the no-arbitrage restrictions derived by Heath, Jarrow, and Morton (HJM) in the default-free case. Note that the $\bar f$ are "forward rates" only in the sense that equation (9) is analogous to the definition of instantaneous forward rates in the default-free case; their relationship to forward bond prices is less straightforward than for default-free forward rates. That is to say that typically for the forward price $\bar F(t, T_1, T_2) = \bar B(t, T_2)/B(t, T_1)$ (where $B(t, T)$ is a default-free zero-coupon bond), one has

$$
\bar F(t, T_1, T_2) \neq \frac{\bar B(t, T_2)}{\bar B(t, T_1)} = \exp\left( -\int_{T_1}^{T_2} \bar f(t, u)\, du \right)
\tag{10}
$$

For the continuously compounded defaultable short rate $\bar r(t) = \bar f(t, t)$, the no-arbitrage restrictions imply

$$
\bar f(t, t) = r_t + h_t L_t = R_t
\tag{11}
$$

which is equal to the default-adjusted short rate given in equation (1). In this sense, the risk-neutral mean-loss rate $h_t L_t$ is equal to the instantaneous credit spread $\bar r(t) - r_t$.

Cast in terms of HJM, the model is automatically calibrated to an initial term structure of defaultable discount factors $\bar B(t, T)$. This type of straightforward "cross-sectional" calibration makes the model useful not only for the econometric estimation pursued by Duffie and Singleton [8] and others such as Duffee [4] and Collin-Dufresne and Solnik [1] but also for the relative pricing of credit derivatives.

The model can be extended in a number of directions, several of which are discussed in [9]. "Liquidity" effects can be modeled by defining a fractional
carrying cost of defaultable instruments, in which case the relevant discount rate $R_t = r_t + h_t L_t + \ell_t$ is adjusted for default and liquidity. The assumption of exogenous default intensity and recovery rate can be lifted, as in [5], by allowing intensities/recovery rates to differ for the counterparties in an over-the-counter (OTC) derivative transaction, with the intensity/recovery rate relevant for discounting determined by which counterparty is in the money. Jumps in the default-adjusted rate can be introduced along the lines of [6] while preserving the tractability of an affine term structure model. The model of single-obligor default considered by Duffie and Singleton [8, 9] can also be extended to the portfolio level using the copula function approach of Schönbucher and Schubert [11], since introducing default correlation through correlated diffusive dynamics of the default intensities $h_t$ for different obligors is typically insufficient, resulting only in very mild correlation of defaults.

Historically, reduced-form models like Duffie–Singleton have been considered to be following a different paradigm than the more fundamental structural models where default is triggered when the value of the firm falls below a barrier taken to represent the firm's liabilities. However, the two approaches have been reconciled by Duffie and Lando [7], who show that models based on a default intensity can be underpinned by a structural model in which bondholders are imperfectly informed about the firm's value.

References

[1] Collin-Dufresne, P. & Solnik, B. (2001). On the term structure of default premia in the swap and LIBOR markets, Journal of Finance 56(3), 1095–1115.
[2] Cox, J.C., Ingersoll, J.E. & Ross, S.A. (1985). A theory of the term structure of interest rates, Econometrica 53(2), 385–407.
[3] Dai, Q. & Singleton, K.J. (2000). Specification analysis of affine term structure models, The Journal of Finance 55(5), 1943–1978.
[4] Duffee, G. (1999). Estimating the price of default risk, Review of Financial Studies 12(1), 197–226.
[5] Duffie, D. & Huang, M. (1996). Swap rates and credit quality, Journal of Finance 51(3), 921–949.
[6] Duffie, J.D. & Kan, R. (1996). A yield factor model of interest rates, Mathematical Finance 6(4), 379–406.
[7] Duffie, D. & Lando, D. (2001). Term structures of credit spreads with incomplete accounting information, Econometrica 69(3), 633–664.
[8] Duffie, D. & Singleton, K.J. (1997). An econometric model of the term structure of interest-rate swap yields, The Journal of Finance 52(4), 1287–1322.
[9] Duffie, D. & Singleton, K. (1999). Modeling term structures of defaultable bonds, Review of Financial Studies 12, 687–720.
[10] Heath, D., Jarrow, R. & Morton, A. (1992). Bond pricing and the term structure of interest rates: a new methodology for contingent claims valuation, Econometrica 60(1), 77–105.
[11] Schönbucher, P. & Schubert, D. (2001). Copula Dependent Default Risk in Intensity Models, University of Bonn. Working paper.

Related Articles

Affine Models; Constant Maturity Credit Default Swap; Intensity-based Credit Risk Models; Jarrow–Lando–Turnbull Model; Markov Processes; Multiname Reduced Form Models; Point Processes; Reduced Form Credit Risk Models.

ERIK SCHLÖGL & LUTZ SCHLÖGL
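As a numerical aside on equations (9) to (11) of the Duffie–Singleton article above, the following minimal sketch may help fix ideas. It assumes, purely for illustration, a constant default-free short rate, a constant risk-neutral default intensity, and a constant fractional loss, so that the default-adjusted rate is constant; the parameter values and function names are hypothetical and not taken from the article. Under these assumptions it recovers the defaultable short rate from bond prices by finite differences and compares the forward price (default-free bond in the denominator) with the ratio of two defaultable bond prices.

```python
import math

# Hypothetical constant inputs (assumptions for illustration only):
# default-free short rate r, risk-neutral default intensity h, fractional loss L.
R_FREE, HAZARD, LOSS = 0.03, 0.02, 0.6


def bond_defaultable(T, r=R_FREE, h=HAZARD, L=LOSS):
    """Defaultable zero-coupon bond price B_bar(0, T) when R = r + h*L is constant."""
    return math.exp(-(r + h * L) * T)


def bond_default_free(T, r=R_FREE):
    """Default-free zero-coupon bond price B(0, T) for a constant short rate."""
    return math.exp(-r * T)


def forward_rate_defaultable(T, eps=1e-4):
    """Invert equation (9): f_bar(0, T) = -d/dT log B_bar(0, T), by forward difference."""
    return -(math.log(bond_defaultable(T + eps)) - math.log(bond_defaultable(T))) / eps


# Equation (11): the defaultable short rate f_bar(0, 0) should equal r + h*L,
# so the instantaneous credit spread should equal the mean-loss rate h*L.
short_rate_defaultable = forward_rate_defaultable(0.0)
credit_spread = short_rate_defaultable - R_FREE

# Equation (10): the forward price uses the default-free bond in the denominator
# and therefore differs from the ratio of two defaultable bond prices.
T1, T2 = 1.0, 3.0
forward_price = bond_defaultable(T2) / bond_default_free(T1)
price_ratio = bond_defaultable(T2) / bond_defaultable(T1)

print(f"defaultable short rate : {short_rate_defaultable:.6f}")
print(f"credit spread          : {credit_spread:.6f}")
print(f"forward price          : {forward_price:.6f}")
print(f"defaultable price ratio: {price_ratio:.6f}")
```

In this flat setting the two quantities in equation (10) differ by the factor exp(-h L T1), which is the risk-neutral expected loss accumulated up to T1; with stochastic rates and intensities the relationship would be less simple.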
Jarrow–Lando–Turnbull Model

describe the discrete-time case. Denoting the matrix of risk premiums at time $t$ by the $K \times K$-dimensional diagonal matrix $\Pi(t) = \operatorname{diag}(\pi_1(t), \ldots, \pi_{K-1}(t), 1)$, it is assumed that