Risk Assessment

Georg Bol ● Svetlozar T. Rachev ● Reinhold Würth (Editors)

Risk Assessment
Decisions in Banking and Finance

Physica-Verlag
A Springer Company
Editors
Prof. Dr. Georg Bol
Prof. Dr. Svetlozar T. Rachev
University of Karlsruhe (TH)
Kollegium am Schloss, Geb. 20.12
76131 Karlsruhe
Germany
bol@statistik.uni-karlsruhe.de
rachev@statistik.uni-karlsruhe.de

Prof. Dr. h.c. mult. Reinhold Würth
Reinhold-Würth-Str. 12-17
74653 Künzelsau-Gaisbach
Germany

ISBN 978-3-7908-2049-2 e-ISBN 978-3-7908-2050-8

DOI: 10.1007/978-3-7908-2050-8

Contributions to Economics ISSN 1431-1933

Library of Congress Control Number: 2008929529

© 2009 Physica-Verlag Heidelberg

This work is subject to copyright. All rights are reserved, whether the whole or part of the material
is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication
of this publication or parts thereof is permitted only under the provisions of the German Copyright
Law of September 9, 1965, in its current version, and permissions for use must always be obtained
from Physica-Verlag. Violations are liable for prosecution under the German Copyright Law.

The use of general descriptive names, registered names, trademarks, etc. in this publication does
not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.

Cover design: WMXDesign GmbH, Heidelberg

Printed on acid-free paper


springer.com
Preface

On April 5–7, 2006, the 9th Econometric Workshop with the title “Risk
Assessment: Decisions in Banking and Finance” was held at the University
of Karlsruhe (TH), Germany. The workshop was organized by the Institute
for Statistics and Mathematical Economics and the Adolf Würth GmbH &
Co.KG, Künzelsau. More than 20 invited speakers and 70 participants at-
tended the workshop. The papers presented at the conference dealt with new
approaches and solutions in the field of risk assessment and management,
covering all types of risk (i.e., market risk, credit risk, and operational risk).
This volume includes 12 of the papers presented at the workshop. We are
delighted with the range of papers, especially from practitioners.
Many people have contributed to the success of the workshop: Sebastian
Kring and Sven Klussmeier did the major part of organizing the workshop.
The organizational skills of Markus Höchstötter, Wei Sun, Theda Schmidt,
Nadja Safronova, and Aksana Hurynovich proved indispensable. Jens Büchele
and Lyuben Atanasov were responsible for the technical infrastructure while
Thomas Plum prepared the design for this volume. All of their help is very
much appreciated.
The organization committee also wishes to thank the School of Economics
and Business Engineering, Vice-Dean Professor Dr. Christof Weinhardt, and
Professor Dr. Frank Fabozzi (Yale University’s School of Management) for
their cooperation. Last but certainly not least we thank Professor Dr. h.c.
Reinhold Würth and the Adolf Würth GmbH & Co. KG for their generous
support of this conference.

Karlsruhe, April 2008
Georg Bol
Svetlozar T. Rachev
Reinhold Würth
Contents

Automotive Finance: The Case for an Industry-Specific Approach to Risk Management
Christian Diekmann . . . . . 1

Evidence on Time-Varying Factor Models for Equity Portfolio Construction
Markus Ebner and Thorsten Neumann . . . . . 11

Time Dependent Relative Risk Aversion
Enzo Giacomini, Michael Handel, and Wolfgang K. Härdle . . . . . 15

Portfolio Selection with Common Correlation Mixture Models
Markus Haas and Stefan Mittnik . . . . . 47

A New Tempered Stable Distribution and Its Application to Finance
Young Shin Kim, Svetlozar T. Rachev, Michele Leonardo Bianchi, and Frank J. Fabozzi . . . . . 77

Estimation of α-Stable Sub-Gaussian Distributions for Asset Returns
Sebastian Kring, Svetlozar T. Rachev, Markus Höchstötter, and Frank J. Fabozzi . . . . . 111

Risk Measures for Portfolio Vectors and Allocation of Risks
Ludger Rüschendorf . . . . . 153

The Road to Hedge Fund Replication: The Very First Steps
Lars Jaeger . . . . . 165

Asset Securitisation as a Profits Management Instrument
Markus Schmidtchen . . . . . 205

Recent Advances in Credit Risk Management
Frances Cowell, Borjana Racheva, and Stefan Trück . . . . . 215

Stable ETL Optimal Portfolios and Extreme Risk Management
Svetlozar T. Rachev, R. Douglas Martin, Borjana Racheva, and Stoyan Stoyanov . . . . . 235

Pricing Tranches of a CDO and a CDS Index: Recent Advances and Future Research
Dezhong Wang, Svetlozar T. Rachev, and Frank J. Fabozzi . . . . . 263
Automotive Finance: The Case
for an Industry-Specific Approach
to Risk Management

Christian Diekmann

Department of Econometrics, Statistics and Mathematical Finance, University of Karlsruhe, Germany, christian.diekmann@statistik.uni-karlsruhe.de

1 Introduction

Automotive finance has come to represent a significant portion of many financial institutions' portfolios. The list of these institutions includes the financial service entities of automotive groups as well as banks, bank-owned and independent finance companies. All of them face the need to assess and manage the risks of their activities in the automotive sector. These risks depend to a large extent on the dynamics of the market for motor vehicles. Understanding these dynamics is an essential prerequisite for dealing adequately with the financial risk they imply. The requirements imposed on providers of financial services in the automotive business clearly go beyond the standard techniques applied to the assessment and management of risk in traditional banking. This is the reason why, in the virtual absence of literature on this subject, the present contribution is concerned with the specific aspects to be considered when managing risk in automotive finance.

2 Automotive Finance
The significant role automotive finance plays today should be seen against the
background that the market for motor vehicles has matured in most countries
over the past two decades while economic growth has slowed down. The face
of the industry serving this market has kept changing during this period more
rapidly than ever before. Three major trends can be identified:
• A global race for economies of scale, resulting in a series of mergers and take-overs (the number of independent manufacturers of motor vehicles has actually halved since the mid-80s), accompanied by an extreme expansion of production capacity.
• An intense product differentiation with constant expansion of product lines, shortening life cycles and accelerated technological innovation, with consumers demanding a wider choice of products.
• A series of cost-cutting and productivity enhancements based on best-practice techniques, lean production, just-in-time methods, modularization, outsourcing and joint ventures.
One of the effects of these changes has been a shift of value-added from ve-
hicle producers (OEMs) to their suppliers induced by the need to cut costs
and achieve greater flexibility in response to the needs of the market. At the
same time, however, vehicle producers started to move downstream, moti-
vated by the fact that, according to industry experts, almost two-thirds of the
total profits generated over the lifetime of a car originate from downstream
rather than from upstream activities (Table 1). Finance and insurance alone
account for about one quarter of total profits. This strategic expansion of ac-
tivities, aiming at selling customers mobility rather than vehicles, has largely
contributed to the dynamic growth of automotive captives.
Today the financial service entities of automotive groups operate on a
global scale and manage significant portfolios (Table 2). Well above 40% of
annual sales in passenger cars in the world’s leading automotive markets – the
US and Germany – are financed or leased by captives. As a result, these entities
have not only become a crucial element in the sales strategies of their par-
ent companies but have become major profit contributors. The most striking
examples are the two large US manufacturers who operate profitable finance
entities while losing money on their industrial activities (Table 3).

Table 1. Profit contributions in the automotive value chain [8]

Up-stream activities
Manufacturer 16%
Systems & modules suppliers 7%
Component specialists 8%
Standard parts suppliers 2%
Raw material providers 5%
Total up-stream 38%

Down-stream activities
New car retailing 5%
Leasing & financing 9%
Insurance business 15%
Used car retailing 12%
Car rental business 4%
Service & parts business 17%
Total down-stream 62%

Table 2. Automotive portfolios (2004, bn Euro) [2]

Captive Portfolio
GMAC 143.9
FMCC 123.4
DC FS 97.9
T FS 63.3
VW FS 51.9
BMW FS 43.5
RCI Banque 21.9
PSA Banque 21.5
Volvo FS 7.1

Table 3. Profit contributions by financial services (2004, US $ m)^a

Group | Total net revenue | Total op. profit | FS net revenue | FS op. profit | FS profit contribution (%)
GM 193,517 1,192 31,972 4,316 362.08
DCX 192,319 7,790 18,871 1,692 21.72
FMC 171,652 4,853 24,518 5,008 103.19
TMC 163,637 15,772 6,782 1,381 8.76
VW 121,177 2,207 11,864 1,261 57.16
PSA 75,320 3,452 2,371 712 20.64
Nissan 70,087 7,781 3,361 611 7.85
BMW 60,389 4,841 11,205 701 14.49
Renault 55,458 3,294 3,159 605 19.970
^a Estimates based on 2004 annual reports. Figures may be biased due to accounting issues.

Although captives dominate the automotive finance market, they are far from having total control of it (Table 4). The situation in the US, where commercial banks and other types of financial institutions account for about half of the relevant market, may serve as an example. It should be noted that captives
and their competitors show very different characteristics. Captives pursue au-
tomotive finance as a core activity and benefit from the close relation to their
parent group. This gives them the advantage of a superior representation at
the point of sale and low costs of distribution. Another benefit is that the
manufacturers favour incorporating incentives into finance rates rather than
giving direct discounts. On the other hand, captives have a somewhat different
mission from that of other lending institutions. Their job is to support brand
sales, acquiring new customers and keeping existing ones on board. This means
the customer is viewed not only from the perspective of a lending institution
but also from that of an automotive manufacturer. Therefore, although they generate profits, they do not pursue purely financial objectives. In contrast,

Table 4. Outstanding finance receivables: US loans & leases automotive business [6]

Indirect (US $ bn)


Captives 580
Large Banks 205
Independent Finance 200
Credit Unions 40
Small Banks 15
Total 1,040

Direct (US $ bn)


Credit Unions 95
Small Banks 60
Independent Finance 25
Large Banks 20
Internet Only 20
Online Mixed 20
Total 240

banks and other financial institutions pursue pure profit targets. They view automotive finance simply as another area of their credit and leasing business. A potential advantage particularly banks have over captives is a stronger position in refinancing. Besides captives and commercial banks, there are two
other types of financial service providers in the market. The first category
consists of independent finance companies who often focus on sub-segments
of the automotive finance market, e.g. the sub-prime, used vehicle or fleet
business. The second category comprises internet and direct lenders whose
focus is to increase returns by saving distribution costs, regularly trying to
disintermediate the dealer network.
It should be noted, however, that while automotive finance enjoyed a period of exuberant growth in the past, the market is beginning to show signs of saturation. Competition is intensifying, spilling over from the vehicle markets and leaving less scope for conservative business policies. Generating profitable
growth in this environment calls for maintaining a careful balance between risk
and return in the future. This leads us to a more specific analysis of the risks
involved in the automotive finance business.

3 Risk Assessment
The theory of risk management has made significant advances over recent
years. However, most studies undertaken in this field have been concerned
with corporate default risk and risk assessment in the case of traditional bank
portfolios. Although many of the existing concepts can successfully be applied to automotive finance, there are some marked differences in comparison to traditional bank portfolios, namely:
• A substantial portion of retail exposures.
• A virtually complete collateralisation by the financed vehicles.
• A significant portion of leasing exposures.
These characteristics have a substantial impact on the risk structure to be
considered in automotive finance.

3.1 Auto Loans

As far as the loan business is concerned, automotive finance is generally considered to be a low-risk activity. Default rates tend to be reasonably small and
the high share of retail exposures adds to the granularity of portfolios thus
reducing undiversified idiosyncratic risk. The low-risk argument is further sup-
ported by the strong collateralisation of exposures. Due to the liquid secondary
market, repossessed vehicles can be sold more easily than most other types of
collateral. However, collateralisation should be given a closer look. Banks in
general either neglect systematic recovery risk or assume collateral values to
fluctuate in step with the state of the economy. This is a practical assumption
and quite justified in the case of portfolios that are diversified with regard
to collateralisation. In this case, fluctuations in the value of different types
of collateral may have offsetting effects leaving the overall economic situation
as the only identifiable source of systematic recovery risk. In contrast to this,
automotive finance lacks these benefits of diversification. Recovery rates are
here substantially affected by the state of the used vehicle market, which need
not necessarily move in step with the economy as a whole. As a result, mar-
ket specific supply and demand factors are an additional source of systematic
recovery risk and should not be neglected in the assessment of portfolio risk.
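To see why this distinction matters, consider the following small simulation (a sketch of our own, not part of the original study; default probabilities, factor loadings and recovery parameters are purely hypothetical). It compares the loss quantiles of an auto-loan portfolio when recoveries are driven by a used-vehicle market factor that only partly follows the economy with the quantiles implied by a model that ties recoveries to the economy alone.

```python
import numpy as np

rng = np.random.default_rng(42)
n_loans, n_sims = 10_000, 20_000
pd_base = 0.02                                   # hypothetical per-loan default probability

loss_true, loss_model = np.empty(n_sims), np.empty(n_sims)
for s in range(n_sims):
    econ = rng.normal()                          # general economic factor
    mkt = rng.normal()                           # used-vehicle market shock, independent of econ
    used_car = 0.4 * econ + 0.6 * mkt            # used-car prices only partly follow the economy
    n_def = rng.binomial(n_loans, min(1.0, pd_base * np.exp(-0.5 * econ)))
    rec_true = np.clip(0.75 + 0.15 * used_car, 0.0, 1.0)     # recoveries set by the used-car market
    rec_model = np.clip(0.75 + 0.15 * 0.4 * econ, 0.0, 1.0)  # bank model: recoveries follow the economy only
    loss_true[s] = n_def * (1.0 - rec_true)      # loss in units of exposure per loan
    loss_model[s] = n_def * (1.0 - rec_model)

for q in (0.95, 0.99):
    print(f"{q:.0%} loss quantile: with used-car factor {np.quantile(loss_true, q):7.1f}, "
          f"economy-only model {np.quantile(loss_model, q):7.1f}")
```

In such a setting the economy-only model understates the upper loss quantiles, which is precisely the effect of the additional systematic recovery risk described above.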

3.2 Auto Leases

Leasing accounts for a substantial portion of the auto finance business. This
has important implications for risk management.
Essentially a lease is a contract by which the lessor conveys the right to
use an asset for a specific period of time to the lessee. In return he receives
a series of payments. A major characteristic of leasing is that, while the
lessee takes full responsibility for the leased asset, the legal title of property
remains with the lessor. In addition to this, leases often contain various op-
tions and obligations regarding the use of the leased asset at the end of the
contract. Leasing owes much of its popularity to the favourable effect it has
on liquidity, the flexibility it affords for the lessee and the chance it offers to
realise off-balance sheet finance as well as a favourable tax treatment. For this
reason, leasing has experienced substantial growth over several decades with
automotive leasing accounting for a substantial portion of newly contracted business. To give some figures, in Germany, for example, the newly contracted lease
volume exceeded 50 bn Euro in 2005, with automotive leasing accounting for
over 26 bn Euro which is equivalent to about 60% of the total equipment
leasing market [7]. The 1.1 m vehicles that were leased in 2005 accounted for
almost one third of new vehicle registrations in the German market.
While there is a great variety in leasing contracts in general, automotive
leasing is a rather standardised product. After choosing a vehicle and deciding
on the duration of the contract and the mileage to be driven under the lease,
the lessee is charged a monthly instalment. It is true that contracts differ
as to the use of the vehicle at the end of the lease. However, two types of
contract predominate. Either the lessee is obliged to return the vehicle or he
can exercise an option to purchase it at the end of the contract at the contrac-
tual residual value. While the latter is normally the case in the US, leasing
contracts in Germany as a rule provide for the return of the leased vehicle.
However, from a finance perspective leases display a considerably more
complex risk structure than loans. In addition to default and recovery risk,
automotive leases are regularly subject to residual value risk. The residual
value, i.e. the future market value of the leased vehicle has to be estimated at
the initiation of the lease and is generally regarded as a fairly unpredictable
quantity. However, it is a central element in the calculation of the lessor and
has a major impact on the level of instalments. These instalments include an
interest as well as an amortization component. If instalments are calculated
based on a high residual value, i.e. if the vehicle is assumed to lose little of
its original value over the term of the lease, the level of instalments declines.
The opposite is the case if the residual value is assumed to be low. It should
be seen, however, that in general the lessor does not reach full amortization
of his investment in the leased asset during the term of the lease. He therefore
depends on the proceeds from the leased vehicle at the end of the contract to
make a profit. If these proceeds fall short of the estimated residual value, the
lessor incurs a residual value loss. As a result lessors face a trade-off between
competitive pricing and the risk taken. While a conservative approach to set-
ting residual values will reduce the danger of incurring losses, it is bound to
result in higher instalments which in turn have an adverse effect on the vol-
ume of newly contracted business. In contrast to this, a sales focused residual
value policy reduces instalments and thus promises higher sales in the market
while increasing the risk of residual value losses.
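A stylised numerical example (our own illustration; contract terms, rates and values are hypothetical and the instalment formula is deliberately simplified) shows how the assumed residual value drives both the instalment and the potential residual value loss.

```python
# Hypothetical 36-month lease on a 30,000 Euro vehicle at 6% annual interest.
price, months, annual_rate = 30_000.0, 36, 0.06
monthly_rate = annual_rate / 12

def instalment(residual_value: float) -> float:
    # Simplified split into an amortisation and an interest component:
    # straight-line amortisation of the value gap plus interest on the average balance.
    amortisation = (price - residual_value) / months
    interest = (price + residual_value) / 2 * monthly_rate
    return amortisation + interest

realised_value = 0.55 * price                    # assumed market value of the vehicle at turn-in
for rv_share in (0.50, 0.62, 0.72):              # conservative vs. aggressive residual value policies
    rv = rv_share * price
    loss = max(rv - realised_value, 0.0)
    print(f"assumed residual value {rv_share:.0%}: "
          f"monthly instalment {instalment(rv):7.2f} Euro, "
          f"loss on a returned vehicle {loss:8.2f} Euro")
```

The aggressive residual value assumption buys a visibly lower instalment at the price of a much larger loss once the vehicle comes back.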
Strictly speaking, residual value risk consists of two components. The first
critical element is the volatility in the market for used vehicles. The second is
the turn-in behaviour of the lessees. Both are closely-related phenomena. In
case of a downturn in the used vehicle market the gap between actual and con-
tractual residuals tends to grow, causing the number of lessees who decide to
return their vehicles to grow as well. As experience in the US leasing business
has shown, this can easily lead to a downward spiral where the residual losses
incurred in phase one of the process will subsequently be reinforced by the
need to remarket an increasing number of turned-in vehicles, which depresses
the price level of the used vehicle market further. Under these circumstances,
residual value losses can quickly reach a significant dimension. It is of some
interest to look at the experience gained in the US market in this context
because it provides an example of the pitfalls that should be considered by
risk managers.

3.3 Industry Experience

For a considerable time, captives and leasing companies were the exclusive
providers of automotive leasing in the US. However, during the 90s automotive
leasing became more and more popular and developed into a highly profitable
business. Anxious not to miss the chance for profitable growth this business
offered, several large banks and finance companies entered the market. As a
result the competitive pressure in the leasing market began to increase. As all
competitors were concerned about losing profitable business, they reviewed
their residual value policies. Since the early 90s, residual values had steadily
increased. Facing the strong economic situation of the mid-90s, lessors as-
sumed this trend to continue. Therefore, contractual residual values were not
only adjusted to actual levels, rather their historic increase was extrapolated.
Considering that on average residual values had already increased from 42%
in 1990 to 62% in 1997, this extrapolation had a substantial effect. Several
market participants calculated average residuals up to 72% [9]. As a result
automotive leasing experienced an impressive boom.
However, in 1996 and 1997 the sales figures for SUVs began to decline. To counter this effect, manufacturers started incentive programs to further new vehicle sales. Soon the trend of increasing new vehicle prices came to an end across the market. Nevertheless, leasing continued to grow from
US $ 86 bn in 1997 to a new record level in 1999. In the same year the decline
in new vehicle prices spilt over to the used vehicle market. In response to the
apparent downturn in used vehicle values and the exceptionally good deals
available on new vehicles, a growing number of lessees chose to return their
leased vehicles at the end of the contract. Dealers showed little interest in
pre-empting the returned vehicles and passed them on to the lessors where
the inflated residual value estimates, the fall in used vehicle values and the increase in turn-in rates combined into heavy losses (Table 5). Considering the fact that lease terms had increased to up to 60 months, these losses were to affect their financial performance for several years to come. In 2000, residual losses reached the US $ 10 bn mark [3]. By 2003 they had accumulated to over US $ 20 bn [4], leading to a substantial consolidation in the market.
In 1999 Wachovia and First Union discontinued their auto lease operations.
GE Capital and National City followed in 2000 [9]. Key Corp was next and
quit leasing in 2001. Many smaller lessors followed their example or vanished
completely. Captives decided to keep up leasing which by then accounted for

Table 5. Return rates & RV-losses [1]

Year 1998 1999 2000 2001 2002 2003 2004 2005E


Return rate .40 .44 .49 .56 .61 .65 .56 .52
Av. RV-loss (US $) 1,160 1,663 2,516 2,532 2,914 3,187 2,740 3,194

Table 6. Market shares: US automotive finance [1]

Loans
Year 2000 2001 2002 2003 2004 2005E
Captive .35 .37 .41 .49 .44 .45
Bank .33 .33 .30 .28 .30 .30
Finance Co. .16 .15 .15 .11 .13 .13
Credit Union .14 .14 .13 .10 .10 .10
Other .02 .02 .01 .02 .03 .03

Leases
Year 2000 2001 2002 2003 2004 2005E
Captive .46 .55 .50 .48 .51 .55
Bank .33 .26 .25 .21 .23 .20
Finance Co. .15 .13 .19 .23 .18 .17
Credit Union .04 .04 .03 .03 .03 .02
Other .03 .03 .03 .03 .03 .05

about 30% of all passenger cars put on the road but reacted by adopting
a more conservative residual value policy. In the years to follow, auto lease
volumes declined and took until 2004 to show first signs of recovery.

4 Conclusion
The risk inherent in automotive finance is to a considerable extent driven
by industry and market dynamics. As a result, the broad economic perspec-
tive regularly taken when assessing default and recovery risk in diversified
bank portfolios proves to be inappropriate. The correct assessment and the
successful management of risk in this area of finance require considerable ex-
pertise. This is particularly true of the leasing business where residual values
are the central link between risk and return. Given the growing economic
importance of automotive finance as well as the increasing competitive pres-
sure in the market, there is substantial need for further research. This will
inevitably involve a great deal of empirical effort as well as the development of sophisticated models tailored to industry specifics. In this context, the complex risk structure to be accounted for in the leasing business remains the greatest challenge.

References
[1] Adesa Inc. (2005), Global Vehicle Remarketing 04-05.
[2] DaimlerChrysler Financial Services AG (2005), Company Slides 2005.
[3] Fahey J (2003), Residual Risk, Forbes Magazine, Vol. 171, Iss. 13.
[4] Manheim (2004), The Used Car Market Report, 2004 Edition.
[5] Manheim (2006), The Used Car Market Report, 2006 Edition.
[6] Miczeznikowski J, Hirsch E, Reppa P (2003), Rebundling of the Auto
Finance Industry. Booz Allen Hamilton.
[7] Städtler A (2005), Besseres Investitionsklima stärkt Leasingwachstum, ifo
Schnelldienst No. 23/2005.
[8] Volkswagen Financial Services AG (2004), Investor Relations Presenta-
tion.
[9] Wood D (2001), Auto Leasing Becomes Demolition Derby, ERisk
02.08.2001.
Evidence on Time-Varying Factor Models
for Equity Portfolio Construction

Markus Ebner¹ and Thorsten Neumann²

¹ Union PanAgora Asset Management, Frankfurt, Germany
² Union Investment Institutional, Frankfurt, Germany, Thorsten.Neumann@union-investment.de†

1 Introduction

Many practitioners derive the variance-covariance matrix (VCM) for mean-variance optimization from some risk model or apply a simple historical estimate. A common problem with these approaches is the stability of the variance-covariance matrix. In turbulent market phases, risk estimates from various risk models are well known to be unreliable. One reason for their poor risk forecasting ability is the fact that financial markets are subject to substantial structural change that applied risk models do not account for. In our paper we account for structural changes by deriving VCMs from time-varying estimates of the single-factor model, i.e., the market model. We demonstrate the advantages of this approach with respect to risk estimation, portfolio selection and investment performance by means of simulated trading strategies.
The problem of choosing an adequate risk model has come to the attention of researchers and practitioners only recently. While research has focused on forecasting returns for a long time, there is a lack of evidence in evaluating the performance of different risk models and the consequences for portfolio optimization. Besides the well-known sensitivity of mean-variance optimization with respect to assumed expected returns, the benefits promised by this approach also heavily depend on the accuracy in estimating the VCM (see, for example, [1] and [4]). Given the well-known difficulty of estimating expected returns, the most important improvement on MV optimization can be made in the VCM estimation, which is mainly based on financial econometrics. However, on the performance of alternative risk models and optimization procedures there is only limited scientific evidence, such as [3, 9, 10, 13, 18] among others.
The capital asset pricing model (CAPM) due to [17] and [14] assumes
stock returns to be a linear function of a single factor, namely the market
return. Stock betas, i.e., stock return elasticities with respect to the market

† Corresponding author: Thorsten Neumann, Union Investment Institutional

return, have been widely used to evaluate systematic risk, i.e., the return risk
associated with market movements.
When estimating the CAPM it is common practice to assume stock betas
to be invariant over time. However, this stability assumption has been ques-
tioned and a considerable amount of empirical evidence reports important
beta variation over time (see among others, [2, 5, 8, 12, 16, 19], as well as [7]).
Beta variation over time goes hand in hand with unstable correlations among
stock returns and time-varying VCMs. This might have serious consequences
for the outcomes of portfolio optimization which are not yet widely recognized.
In [6] we consider VCMs that are derived from time-varying beta estimates
for mean-variance optimization. When estimating time-varying betas we rely
on a time-varying market model given by
$$y_{i,t} = \alpha_{i,t} + \beta_{i,t}\, x_t + u_{i,t}, \qquad u_{i,t} \sim N(0, \sigma_{u,i}^2), \qquad i = 1, \dots, N, \; t = 1, \dots, T,$$
with yi,t denoting the return of stock i at period t and xt the market return,
respectively. The error term ui,t captures specific risk of stock i measured by
the standard deviation σu,i , and the slope coefficient βi,t measures the stock’s
return sensitivity with respect to xt . The coefficient αi,t denotes the stock
specific return component at time t.
For estimating time-varying coefficients βi,t we employ three well es-
tablished estimation approaches, namely (i) Moving Window Least Squares
(MWLS); (ii) Flexible Least Squares (FLS) and (iii) the Random Walk Model
(RWM). See [11] and [15] for an illustration of the estimation methods. We
compare estimation results of these approaches with those generated by the time-invariant Recursive Least Squares (RLS) approach.
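To illustrate the mechanics of the simplest of these estimators, the following sketch (our own illustration, not the authors' code; tickers and return data are simulated) computes Moving Window Least Squares betas and assembles the VCM implied by the single-factor structure, Sigma = beta beta' var(x) + diag(var(u)).

```python
import numpy as np
import pandas as pd

def mwls_betas(stock_returns: pd.DataFrame, market_returns: pd.Series, window: int = 60):
    """Moving Window Least Squares: re-estimate the market model over a rolling window."""
    betas, alphas, resid_var = {}, {}, {}
    for col in stock_returns:
        y, x = stock_returns[col], market_returns
        b = y.rolling(window).cov(x) / x.rolling(window).var()
        a = y.rolling(window).mean() - b * x.rolling(window).mean()
        u = y - a - b * x
        betas[col], alphas[col], resid_var[col] = b, a, u.rolling(window).var()
    return pd.DataFrame(betas), pd.DataFrame(alphas), pd.DataFrame(resid_var)

def single_factor_vcm(beta_t: pd.Series, market_var: float, resid_var_t: pd.Series) -> pd.DataFrame:
    """VCM implied by the market model: Sigma = beta beta' * var(x) + diag(var(u))."""
    b = beta_t.values.reshape(-1, 1)
    sigma = b @ b.T * market_var + np.diag(resid_var_t.values)
    return pd.DataFrame(sigma, index=beta_t.index, columns=beta_t.index)

# Example with simulated data (hypothetical tickers):
rng = np.random.default_rng(1)
dates = pd.date_range("2000-01-03", periods=500, freq="B")
market = pd.Series(rng.normal(0.0003, 0.01, len(dates)), index=dates)
true_beta = np.array([0.8, 1.0, 1.3])
stocks = pd.DataFrame(market.values[:, None] * true_beta
                      + rng.normal(0, 0.015, (len(dates), 3)),
                      index=dates, columns=["AAA", "BBB", "CCC"])

betas, _, resid_var = mwls_betas(stocks, market, window=60)
vcm = single_factor_vcm(betas.iloc[-1], market.rolling(60).var().iloc[-1], resid_var.iloc[-1])
print(vcm.round(6))
```

FLS and the random walk model replace the rolling window by an explicit penalty on, or a stochastic law of motion for, the coefficient path, but the resulting betas feed into the VCM construction in the same way.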
Our empirical findings for the U.S. suggest that betas, stock correlations
and, hence, VCMs are subject to significant variation in the short run as well as
in the long run. In fact, important benefits arise from time-varying estimation
of the market model when compared to time-invariant estimation via RLS.
Moreover, we examine the outcomes from mean-variance portfolio selec-
tion strategies based on variance-covariance matrices derived from these esti-
mates. We obtain improved ex-ante risk estimates as well as portfolios that
have superior risk and return characteristics while being well diversified. For
the estimation techniques considered in this paper, we find the same rank-
ing for nearly all investigated criteria. According to our results, FLS is the best method, followed by RWM, MWLS and RLS. The FLS procedure deliv-
ers the most precise beta estimates as well as the most precise portfolio risk
estimates. Moreover, efficient frontiers suggest higher returns for given volatil-
ities, trading strategies show the highest Sharpe Ratios and finally, portfolios
are the most diversified.
To summarize, the portfolio performances found in our empirical anal-
ysis indicate a strong need for the application of time-varying estimation
approaches for estimating correlations in risk analysis and portfolio construc-
tion. According to our results, the FLS estimate is the preferable method for doing so.

References

[1] Best, M.J. and Grauer, R.R. (1991) On the Sensitivity of Mean-Variance-Efficient Portfolios to Changes in Asset Means: Some Analytical and Computational Results. The Review of Financial Studies 4, 2, 315-342.
[2] Bos, T. and Newbold, P. (1984) An Empirical Investigation of the Pos-
sibility of Systematic Stochastic Risk in the Market Model. Journal of
Business 57, 35-41.
[3] Chan, L.K.C., Karceski, J. and Lakonishok, J. (1999) On Portfolio Op-
timization: Forecasting Covariances and Choosing the Risk Model. The
Review of Financial Studies 5, 937-974.
[4] Chopra, Vijay K. and William T. Ziemba (1993) The Effect of Errors
in Means, Variances and Covariances on Optimal Portfolio Choice. The
Journal of Portfolio Management, Winter 1993, 6-11.
[5] Collins, D.W., Ledolter, J. and Rayburn, J. (1987) Some further Evidence
on the Stochastic Properties of Systematic Risk. Journal of Business 60,
425-448.
[6] Ebner, Markus and Thorsten Neumann (2008) Time-Varying Factor
Models for Equity Portfolio Construction. The European Journal of Fi-
nance 14, 381-395.
[7] Ebner, Markus and Thorsten Neumann (2005) Time-Varying Betas of
German Stock Returns. Journal of Financial Markets and Portfolio Man-
agement. 19, 1, 29-46.
[8] Fabozzi, F.J. and Francis, J.C. (1978) Beta as a Random Coefficient.
Journal of Financial and Quantitative Analysis 13, 101-115.
[9] Jacquier, E. and Marcus, A.J. (2001) Asset Allocation Models and Mar-
ket Volatility. Financial Analysts Journal, 16-29.
[10] Jagannathan, R. and Ma, T. (2003) Risk Reduction in Large Portfolios:
Why Imposing the Wrong Constraints Helps. Journal of Finance 58,
1651-1683.
[11] Kalaba, Robert E. and L. Tesfatsion (1989) Time-Varying Linear Re-
gression via Flexible Least Squares. Computers and Mathematics with
Applications 17, 1215-1245.
[12] Kim, D. (1993) The Extent of Non-Stationarity of Beta. Review of Quan-
titative Finance and Accounting 3, 241-254.
[13] Ledoit, Olivier and Michael Wolf (2002) Improved Estimation of the
Covariance Matrix of Stock Returns With an Application to Portfolio
Selection. Working paper. University of California, Los Angeles.
[14] Lintner, J. (1965) The Valuation of Risk Assets and the Selection of Risky In-
vestments in Stock Portfolios and Capital Budgets. Review of Economics
and Statistics 47, 13-37.
[15] Neumann, T. (2003) Time-Varying Coefficient Models: A Comparison of
Alternative Estimation Strategies. Allgemeines Statistisches Archiv 87,
257-281.
[16] Schwert, G.W. and Seguin, P.J. (1990) Heteroskedasticity in Stock Re-
turns. Journal of Finance 45, 1129-1155.
[17] Sharpe, W.F. (1964) Capital Asset Prices: A Theory of Market Equilibrium
under Conditions of Risk. The Journal of Finance 19, 3, 425-442.
[18] Shukla, R., Trzcinka, C. and Winston, K. (1995) Prediction Portfolio
Variance: Firm Specific and Macroeconomic Factors. Working Paper,
http://ssrn.com/abstract=6901.
[19] Sunder, S. (1980) Stationarity of Market Risk: Random Coefficients for
Individual Stocks. Journal of Finance 35, 4, 883-896.
Time Dependent Relative Risk Aversion

Enzo Giacomini¹, Michael Handel², and Wolfgang K. Härdle¹

¹ CASE - Center for Applied Statistics and Economics, Humboldt-University of Berlin, Germany, giacomini@wiwi.hu-berlin.de, haerdle@wiwi.hu-berlin.de
² Dr. Nagler & Company GmbH, Munich, Germany, michael.handel@nagler-company.com

1 Introduction
Risk management has developed in the recent decades to be one of the most
fundamental issues in quantitative finance. Various models are being devel-
oped and applied by researchers as well as financial institutions. By modeling
price fluctuations of assets in a portfolio, the loss can be estimated using
statistical methods. Different measures of risk, such as standard deviation of
returns or confidence interval Value at Risk, have been suggested. These mea-
sures are based on the probability distributions of assets’ returns extracted
from the data-generating process of the asset.
However, an actual one dollar loss is not always valued in practice as a
one dollar loss. Purely statistical estimation of loss has the disadvantage of
ignoring the circumstances of the loss. Hence the notion of an investor’s utility
has been introduced. Arrow [2] and [10] were the first to introduce elementary
securities to formalize economics of uncertainty. The so-called Arrow-Debreu
securities are the starting point of all modern financial asset pricing theories.
Arrow–Debreu securities entitle their holder to a payoff of 1$ in one specific
state of the world, and 0 in all other states of the world. The price of such a security is determined by the market on which it is traded and results from a supply and demand equilibrium. Moreover, these prices contain
information about investors’ preferences due to their dependence on the con-
ditional probabilities of the state of the world at maturity and due to the
imposition of market-clearing and general equilibrium conditions. The prices
reflect investors' beliefs about the future, and the fact that they are priced differently in different states of the world implies that a one-dollar gain is not always worth the same; in fact, its value is exactly the price of the security.
A very simple security that demonstrates the concept of Arrow–Debreu
securities is a European option. The payoff function of a call option at maturity
T is
$$\psi(S_T) \stackrel{\mathrm{def}}{=} (S_T - K)^+ = \max(S_T - K, 0) \qquad (1)$$
where K is the strike price, T is maturity and ST is the asset's price at maturity.
Since an option is a state-dependent contingent claim, it can be valued
using the concept of Arrow–Debreu securities. Bearing in mind that Arrow–Debreu prices can be perceived as a distribution (when the interest rate is 0, they are non-negative and sum up to one), the option price is the discounted
expectation of random payoffs received at maturity. Since the payoff equals
the value of the claim at maturity time (to eliminate arbitrage opportunities),
the value process is by definition a martingale. Introducing a new probability
measure Q, such that the discounted value process is a martingale, we can
write
$$C_t \stackrel{\mathrm{def}}{=} e^{-r(T-t)}\, E^Q_t[\psi(S_T)] = e^{-r(T-t)} \sum_s q_s\, \psi_s(S_T) \qquad (2)$$

where r is the interest rate and qs is the price of an Arrow–Debreu security if r = 0, paying 1$ in state s and nothing in any other state. The superscript
Q denotes the expectation based on the risk neutral probability measure,
the subscript t means that the expectation is conditioned on the information
known at time t. The continuous counterpart of the Arrow–Debreu state con-
tingent claims will be defined in the next section as the risk-neutral density
or in its more commonly used name, the State Price Density (SPD).
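As a small worked illustration of (2) (our own example with made-up numbers), take two terminal states and r = 0, with Arrow–Debreu prices q_up = 0.45 and q_down = 0.55. A call with strike K = 100 paying 20 in the up state and 0 in the down state then costs

$$C_t = q_{up}\,\psi_{up} + q_{down}\,\psi_{down} = 0.45 \cdot 20 + 0.55 \cdot 0 = 9.$$

The subjective probability of the up state may well differ from 0.45; the wedge between the two is exactly what the pricing kernel introduced below captures.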
Based on the relations between the actual data generating process of a
major stock index and its risk-neutral probability measure, we can derive
measures that help us learn a lot about investors’ beliefs and get an idea of
the forces which drive them. This work aims at investigating the dynamics of
investors’ beliefs.

2 Black and Scholes and Macroeconomic Asset-Pricing Models

The distinction between the actual data generating process of an asset and
the market valuations is the essence of macroeconomic dynamic equilibrium
asset-pricing models, in which market forces and investors’ beliefs are key
factors to value an asset with uncertain payoffs.
A standard dynamic exchange economy as discussed by [20], [29] and many
others, imposes that securities markets are complete, that they consist of one
consumption good and that the investors, who have no exogenous income
other than from trading the goods, seek to maximize their state-dependent
utility function. There is one risky stock St in the economy, corresponding to
the market portfolio in a total normalized supply. In addition, the economy is
endowed by a riskless bond with a continuously compounded rate of return r.
The stock price follows the stochastic process
$$\frac{dS_t}{S_t} = \mu\, dt + \sigma\, dW_t \qquad (3)$$
where µ denotes the drift, σ is the volatility and Wt is a standard Brownian motion. The drift and volatility could be functions of the asset price, time and
many other factors. However, for simplicity, they are considered constant in
this section. The conditional density of the stock price, which is implied by (3),
is denoted by pt (ST |St ). In this setting, due to continuous dividend payments,
the discounted process with cumulative dividend reinvestments should be a
martingale and is denoted by
$$\tilde S_t \stackrel{\mathrm{def}}{=} e^{-(r+\delta)t}\, S_t \qquad (4)$$

Since we are dealing with corrected data and in order to simplify the theoretic
explanations, we will consider δ = 0 from now on and omit the dividends from
the equations.
Taking the total differential yields
$$
\begin{aligned}
d\tilde S_t &= d(e^{-rt} S_t) \\
&= -r e^{-rt} S_t\, dt + e^{-rt}\, dS_t \\
&= -r e^{-rt} S_t\, dt + e^{-rt} [\mu S_t\, dt + \sigma S_t\, dW_t] \\
&= (\mu - r)\tilde S_t\, dt + \sigma \tilde S_t\, dW_t \\
&= \sigma \tilde S_t\, d\tilde W_t \qquad (5)
\end{aligned}
$$
where $\tilde W_t \stackrel{\mathrm{def}}{=} W_t + \frac{\mu - r}{\sigma}\, t$ can be perceived as a Brownian motion on the probability space corresponding to the risk-neutral measure Q. The term $\frac{\mu - r}{\sigma}$ is called the market price of risk; it measures the excess return per unit of risk
borne by the investor and hence it vanishes under Q, justifying the name risk-
neutral pricing. Risk-neutral pricing can be understood as the pricing done
by a risk-neutral investor, an investor who is indifferent to risk and hence
not willing to pay the extra premium. The conditional risk-neutral density
of the stock price under Q, implied by (5) and denoted as qt (ST |St ), is the
state-price density which was described as the continuous counterpart of the
Arrow–Debreu prices from (2). The basic theorem of asset pricing states, that
absence of arbitrage implies the existence of a positive linear pricing rule ([8]),
and if the market is complete and indeed arbitrage-free, it can be shown that
the risk-neutral measure Q is unique.
In order to relate the subjective and risk-neutral densities to macroe-
conomic factors, we first need to review some of the basic concepts and
definitions of macroeconomic theory. Under some specific assumptions, it
is well known that a representative agent exists. The original representa-
tive agent model includes utility functions which are based on consumption
(see, for example, [21]). However, introducing labor income or intermediate consumption does not affect the results significantly and hence, without loss of
generality, we review the concept of marginal rate of substitution with the help
of a simple consumption based asset pricing model. The fundamental desire
for more consumption is described by an intertemporal two-period utility function as

$$U(c_t, c_{s_{t+1}}) \stackrel{\mathrm{def}}{=} u(c_t) + \beta\, E_t[u(c_{s_{t+1}})] = u(c_t) + \beta \sum_s u(c_{s_{t+1}})\, p_t(s_{t+1} \mid s_t) \qquad (6)$$

where st denotes the state of the world at time t, ct denotes the consumption
at time t, cst+1 denotes consumption at the unknown state of the world at
time t + 1, pt (st+1 |st ) is the probability of the state of the world at time
t + 1 conditioned on information at time t, u(c) is the one-period utility of
consumption and β is a subjective discount factor. We further assume that
an agent can buy or sell as much as he wants from an asset with payoff ψst+1
at price Pt . If Yt is the agent’s wealth (endowment) at t and ξ is the amount
of asset he chooses to buy, then the optimization problem is

$$\max_{\{\xi\}}\; \big\{ u(c_t) + E_t[\beta\, u(c_{s_{t+1}})] \big\}$$

subject to

$$c_t = Y_t - P_t \cdot \xi, \qquad c_{s_{t+1}} = Y_{s_{t+1}} + \psi_{s_{t+1}} \cdot \xi$$

The first constraint is the budget constraint at time t: the agent's endowment
at time t is divided between his consumption and the amount of asset he
chooses to buy. The budget constraint at time t + 1 sustains the Walrasian
property, i.e. the agent consumes all of his endowment and asset’s payoff at
the last period. The first order condition of this problem yields
$$P_t = E_t\!\left[\beta\, \frac{u'(c_{s_{t+1}})}{u'(c_t)}\, \psi_{s_{t+1}}\right] \qquad (7)$$
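For completeness, (7) is simply the first-order condition spelled out: substituting the two budget constraints into the objective and differentiating with respect to ξ gives

$$\frac{d}{d\xi}\Big\{u(Y_t - P_t\,\xi) + E_t\big[\beta\, u(Y_{s_{t+1}} + \psi_{s_{t+1}}\,\xi)\big]\Big\} = -P_t\, u'(c_t) + E_t\big[\beta\, u'(c_{s_{t+1}})\, \psi_{s_{t+1}}\big] = 0,$$

and dividing by $u'(c_t)$ yields (7).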
We define $\mathrm{MRS}_t \stackrel{\mathrm{def}}{=} \beta\, E_t\!\left[\frac{u'(c_{s_{t+1}})}{u'(c_t)}\right]$ as the Marginal Rate of Substitution at t,
meaning the rate at which the investor is willing to substitute consumption
at t + 1 for consumption at t. If consumption at t + 1 depends on the state of
the world (which is the case discussed here), the MRS is also referred to as a
stochastic discount factor.
Famous works like [20] or [24] address the asset pricing models in a more
general manner. The utility function depends on the agent’s wealth Yt at
time t and the payoff function depends on the underlying asset St . According
to [24], in equilibrium, the optimal solution is to invest in the risky stock at
every t < T and then consume the final value of the stock, i.e. Yt = St for
∀t < T and YT = ST = cT . This is a multi-period generalization of the model
introduced before (6), where period T corresponds to t + 1 in the previous
section. Defining time to maturity as τ = T − t, the date t price of an asset with a liquidating payoff of ψ(ST) is path independent, as the marginal utilities in the periods prior to maturity cancel out. This price is given by
$$P_t = e^{-r\tau} \int_0^\infty \psi(S_T)\, \lambda\, \frac{U'(S_T)}{U'(S_t)}\, p_t(S_T \mid S_t)\, dS_T \qquad (8)$$

where $\lambda e^{-r\tau} = \beta$ to correspond to (7), with λ being a constant independent of the index level, for scaling purposes.
Considering the call option price under the unique risk-neutral probabil-
ity measure in (2) and the existence of a positive linear pricing rule in the
absence of arbitrage, we argue that the price of any asset can be expressed
as a discounted expected payoff (discounted at the risk-free rate) as long as
we calculate the expectation with respect to the risk-neutral density. Since a
risk-neutral agent always has the same marginal utility of wealth, the ratio of
marginal utilities in (8) vanishes under Q, and (8) can be rewritten as
$$P_t = e^{-r\tau} \int_0^\infty \psi(S_T)\, q_t(S_T \mid S_t)\, dS_T = e^{-r\tau}\, E^Q_t[\psi(S_T)] \qquad (9)$$

where qt (ST |St ) is the State Price Density and the expectation $E^Q_t[\psi(S_T)]$
is taken with respect to the risk-neutral probability measure Q and not the
subjective probability measure, thus reflecting an objective belief about the
future states of the world.
Combining (8) and (9) we can define the pricing kernel Mt (ST ), which
relates to the state price density qt (ST |St ), the subjective probability and the
utility function as
$$M_t(S_T) \stackrel{\mathrm{def}}{=} \frac{q_t(S_T \mid S_t)}{p_t(S_T \mid S_t)} = \lambda\, \frac{U'(S_T)}{U'(S_t)} \qquad (10)$$

and therefore MRSt = e−rτ Et [Mt (ST )]. Substituting out the qt (ST |St ) in (9)
and using (10) yields the Lucas asset pricing equation:

$$
\begin{aligned}
P_t &= e^{-r\tau}\, E^Q_t[\psi(S_T)] \\
&= e^{-r\tau} \int_0^\infty M_t(S_T) \cdot \psi(S_T)\, p_t(S_T \mid S_t)\, dS_T \\
&= e^{-r\tau}\, E_t[M_t(S_T) \cdot \psi(S_T)] \qquad (11)
\end{aligned}
$$

The dependence of the pricing kernel on the investor’s utility function has
urged researchers to try and estimate distributions based on various utility
functions. Arrow [3] and [26] showed a connection between the pricing kernel
and the representative agent’s measure of risk aversion. The agent’s risk aver-
sion is a measure of the curvature of the agent’s utility function. The higher
the agent’s risk aversion is, the more curved his utility function becomes. If the
agent were risk-neutral, the utility function would be linear. In order to keep
a fixed scale in measuring the risk aversion, the curvature is multiplied by the
level of the asset (the argument of the utility function), i.e. the representative
agent's coefficient of Relative Risk Aversion (RRA) is defined as
$$\rho_t(S_T) \stackrel{\mathrm{def}}{=} -\frac{S_T\, u''(S_T)}{u'(S_T)} \qquad (12)$$

According to (10) the pricing kernel is related to the marginal utilities as
$$M_t(S_T) = \lambda\, \frac{U'(S_T)}{U'(S_t)} \quad\Rightarrow\quad M_t'(S_T) = \lambda\, \frac{U''(S_T)}{U'(S_t)} \qquad (13)$$

Substituting out the first and second derivatives of the utility function in (12)
using (13) yields
$$\rho_t(S_T) = -\frac{S_T\, \lambda\, M_t'(S_T)\, U'(S_t)}{\lambda\, M_t(S_T)\, U'(S_t)} = -\frac{S_T\, M_t'(S_T)}{M_t(S_T)} \qquad (14)$$

Using equation (10) we can express the RRA as
$$
\begin{aligned}
\rho_t(S_T) &= -\frac{S_T\,\big[q_t(S_T \mid S_t)/p_t(S_T \mid S_t)\big]'}{q_t(S_T \mid S_t)/p_t(S_T \mid S_t)} \\
&= -S_T\, \frac{\big[q_t'(S_T \mid S_t)\, p_t(S_T \mid S_t) - p_t'(S_T \mid S_t)\, q_t(S_T \mid S_t)\big]/p_t^2(S_T \mid S_t)}{q_t(S_T \mid S_t)/p_t(S_T \mid S_t)} \\
&= -S_T\, \frac{q_t'(S_T \mid S_t)\, p_t(S_T \mid S_t) - p_t'(S_T \mid S_t)\, q_t(S_T \mid S_t)}{q_t(S_T \mid S_t)\, p_t(S_T \mid S_t)} \\
&= S_T \left[ \frac{p_t'(S_T \mid S_t)}{p_t(S_T \mid S_t)} - \frac{q_t'(S_T \mid S_t)}{q_t(S_T \mid S_t)} \right] \qquad (15)
\end{aligned}
$$

We now have a method of deriving the investor’s pricing kernel and his risk
aversion just by knowing, or being able to estimate, the subjective and the
risk-neutral densities. As an example, we consider the popular power utility function
$$u(c_t) = \begin{cases} \frac{1}{1-\gamma}\, c_t^{1-\gamma} & \text{for } 0 < \gamma \neq 1 \\ \log(c_t) & \text{for } \gamma = 1 \end{cases} \qquad (16)$$
Rubinstein [29] showed, that for such a utility function, aggregate consump-
tion is proportional to aggregate wealth, corresponding to the utility of wealth
or asset prices discussed above. It can be seen, that as γ → 0 the utility is
reduced to a linear function. The logarithmic utility function when γ = 1 is
obtained by applying the L’Hospital rule.
The marginal rate of substitution of an investor with a power utility func-
tion is
$$\mathrm{MRS}_t = \beta\, E_t\!\left[\frac{u'(c_T)}{u'(c_t)}\right] = \beta\, E_t\!\left[\left(\frac{c_T}{c_t}\right)^{-\gamma}\right] \qquad (17)$$
which means that it is a function of consumption growth and is easy to relate to empirical data. The relative risk aversion of an investor with a
power utility can be calculated using (12), with consumption instead of wealth
as an argument, as the utility function is utility of consumption
$$\rho(c_T) = -c_T\, \frac{-\gamma\, (c_T)^{-\gamma-1}}{(c_T)^{-\gamma}} = \gamma \qquad (18)$$
This equation shows that the RRA turns out to be a constant, and for the
logarithmic utility case, the risk aversion is 1.
Jackwerth [18] argues that due to the risk aversion of the investor with a
power utility function, the pricing kernel is a monotonically decreasing func-
tion of aggregate wealth. He estimates q and p using data on the S&P500
index returns, as it is common to assume that this index represents the aggre-
gate wealth held by investors, and computes the pricing kernel according to
(10). However, he finds out that the pricing kernel is not a monotonically de-
creasing function as expected. Plotted against the return on the S&P500, the
pricing kernel according to [18] is locally increasing, implying an increasing
marginal utility and a convex utility function. It is referred to as the Pric-
ing Kernel Puzzle. The shape of the pricing kernel does not correspond to
the basic assumption of asset pricing theory. Although [18] tends to rule out
methodological errors, he never proves that the ratio of two estimators equals
the estimate of the ratio. He assumes that if q and p are estimated correctly,
then their ratio should yield a good estimator for the pricing kernel. This
assumption still needs to be proved, but dealing with it is beyond the scope
of this work.
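To make the estimation idea concrete, the following sketch (our own illustration, not the procedure of [18]; the two densities here are simple parametric stand-ins for the estimated ones) evaluates the pricing kernel (10) and the RRA (15) on a grid of terminal prices using numerical derivatives.

```python
import numpy as np

# Stand-in densities on a terminal-price grid: a lognormal subjective density p with
# drift mu and a lognormal risk-neutral density q with drift r (parameters hypothetical).
S0, tau, sigma, mu, r = 100.0, 0.25, 0.2, 0.08, 0.03
ST = np.linspace(70.0, 140.0, 1401)

def lognormal_density(x, drift):
    return (1.0 / (x * np.sqrt(2 * np.pi * sigma**2 * tau))
            * np.exp(-(np.log(x / S0) - (drift - 0.5 * sigma**2) * tau)**2 / (2 * sigma**2 * tau)))

p = lognormal_density(ST, mu)            # subjective density
q = lognormal_density(ST, r)             # state-price (risk-neutral) density

M = q / p                                # pricing kernel (10), up to the scaling constant lambda
rho = ST * (np.gradient(p, ST) / p - np.gradient(q, ST) / q)   # RRA from (15)

atm = np.argmin(np.abs(ST - S0))
print("pricing kernel at the money:", round(float(M[atm]), 4))
print("RRA from (15): %.3f (theory for this lognormal pair: (mu - r)/sigma^2 = %.3f)"
      % (rho[atm], (mu - r) / sigma**2))
```

With empirical estimates of p and q in place of the lognormal stand-ins, a locally increasing M is exactly the pricing kernel puzzle described above.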
Under the assumptions of the well-known [4] model, the price of a plain
vanilla call option with a payoff function as in (1) is given by the Black and
Scholes formula
$$C^{BS}(S_t, t, K, T, \sigma, r, \delta) = e^{-\delta\tau} S_t\, \Phi(d_1) - e^{-r\tau} K\, \Phi(d_2) \qquad (19)$$
where δ is the continuous dividend rate, r is a constant riskless interest rate,
τ is time to maturity, Φ(u) is the cumulative standard normal distribution
function and
$$d_1 = \frac{\ln(S_t/K) + (r - \delta + 0.5\sigma^2)\tau}{\sigma\sqrt{\tau}} \qquad \text{and} \qquad d_2 = d_1 - \sigma\sqrt{\tau} \qquad (20)$$
where we assume δ = 0 for the remainder of this work, as mentioned before.
Furthermore, the [4] implied volatility is assumed to be constant and the
corresponding risk-neutral density is log-normal with mean (r − 0.5σ 2 )τ and
variance σ 2 τ .
A famous work by [5] proved the following relation, which also holds when
the assumptions of the [4] model do not:
$$e^{r\tau} \left.\frac{\partial^2 C(S_t, K, \tau)}{\partial K^2}\right|_{K = S_T} = q_t(S_T) = \mathrm{SPD} \qquad (21)$$

Sustaining the assumptions of the [4] model and plugging (19) into (21)
yields
$$q^{BS}(S_T \mid S_t) = \frac{1}{S_T \sqrt{2\pi\sigma^2\tau}} \cdot e^{-\frac{[\ln(S_T/S_t) - (r - 0.5\sigma^2)\tau]^2}{2\sigma^2\tau}} \qquad (22)$$
meaning that the underlying asset price follows the stochastic process
$$\frac{dS_t}{S_t} = r \cdot dt + \sigma \cdot dW_t \qquad (23)$$
i.e. the stock price in a [4] world follows a geometric Brownian motion un-
der both probability measures, only with different drifts. Since the subjective
probability under the [4] is also log-normal but with drift µ, plugging the SPD
from (22) and the log-normal subjective density into (10) yields a closed-form
solution for the investor's pricing kernel
$$M_t^{BS}(S_T) = \left(\frac{S_T}{S_t}\right)^{-\frac{\mu-r}{\sigma^2}} \cdot e^{\frac{(\mu-r)(\mu+r-\sigma^2)\tau}{2\sigma^2}} \qquad (24)$$

The only non-constant term in this expression is $\frac{S_T}{S_t}$, which corresponds to
consumption growth in a pure exchange economy. Since the pricing kernel in
(24) is also the ratio of the marginal utility functions (10), the investor’s utility
function can be derived by solving the differential equation. If we consider the
following constants
$$\gamma = \frac{\mu - r}{\sigma^2}, \qquad \lambda = e^{\frac{(\mu-r)(\mu+r-\sigma^2)\tau}{2\sigma^2}} \qquad (25)$$

we can rewrite (24) as
$$M_t^{BS}(S_T) = \lambda \left(\frac{S_T}{S_t}\right)^{-\gamma} \qquad (26)$$
which corresponds to a power utility function. The B&S utility function is
therefore
$$u^{BS}(S_t) = \left(1 - \frac{\mu-r}{\sigma^2}\right)^{-1} \cdot S_t^{\left(1 - \frac{\mu-r}{\sigma^2}\right)} \qquad (27)$$
the subjective discount factor of intertemporal utility is
$$\beta^{BS} = \lambda\, e^{-r\tau} = e^{\frac{(\mu-r)(\mu+r-\sigma^2)\tau}{2\sigma^2} - r\tau} \qquad (28)$$

and the relative risk aversion is constant
$$\rho_t^{BS}(S_T) = \gamma = \frac{\mu - r}{\sigma^2} \qquad (29)$$
The above equations prove that a constant RRA utility function sustains the
[4] model, as was shown by [29], [5] and many others.
Referring again to the stochastic process in (5), in which the Brownian motion $\tilde W_t$ is defined on the probability space corresponding to the risk-
neutral measure, the Brownian motion under the assumptions of the [4] model
with a constant RRA can be expressed as
$$\tilde W_t = W_t + \frac{\mu - r}{\sigma}\, t = W_t + \sigma\gamma\, t \qquad (30)$$
whereas the stochastic process of the corrected stock price can be expressed as a direct function of the investor's relative risk aversion
$$d\tilde S_t = \sigma \tilde S_t\, d\tilde W_t = \sigma \tilde S_t\, dW_t + \sigma^2 \tilde S_t\, \gamma\, dt \qquad (31)$$
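The Breeden–Litzenberger relation (21) can be checked numerically in this Black–Scholes setting: the sketch below (our own illustration, hypothetical parameters, δ = 0) prices calls (19) on a strike grid, takes the second finite difference in K, and compares the result with the lognormal density (22).

```python
import numpy as np
from scipy.stats import norm

def bs_call(S, K, tau, sigma, r):
    """Black-Scholes call price (19) with delta = 0."""
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * np.sqrt(tau))
    d2 = d1 - sigma * np.sqrt(tau)
    return S * norm.cdf(d1) - np.exp(-r * tau) * K * norm.cdf(d2)

S, tau, sigma, r = 100.0, 0.5, 0.2, 0.03      # hypothetical parameters
K = np.linspace(60.0, 160.0, 2001)
dK = K[1] - K[0]

calls = bs_call(S, K, tau, sigma, r)
spd = np.exp(r * tau) * np.diff(calls, 2) / dK**2     # (21) via a central second difference
K_mid = K[1:-1]

q_bs = (1.0 / (K_mid * np.sqrt(2 * np.pi * sigma**2 * tau))      # closed form (22) at S_T = K
        * np.exp(-(np.log(K_mid / S) - (r - 0.5 * sigma**2) * tau)**2 / (2 * sigma**2 * tau)))

print("max abs difference between (21) and (22):", np.abs(spd - q_bs).max())
```

The difference is of the order of the finite-difference error only; with market option prices in place of model prices, the same second difference delivers a nonparametric SPD estimate.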

3 A Static Model: Daily Estimation


It is well known that the assumptions of the [4] model do not hold in practice.
Transaction costs, taxes, restrictions on short-selling and non-continuous trad-
ing violate the model’s assumptions. Moreover, the stochastic process does not
necessarily follow a Brownian motion and the implied volatility is not constant
and exhibits a smile. Consequently, the SPD does not have a closed form
solution and has to be estimated numerically. Rubinstein [30] showed, that
an estimated subjective probability together with a good estimation of the
SPD enable an assessment of the representative agent’s preferences. Hence,
the model presented in this section aims at estimating the pricing kernel us-
ing the ratio between the subjective density and the SPD, and it disregards
the issue of whether a ratio of two estimates is a good approximation for the
estimated ratio itself.
This section is divided into four parts. The first part provides a short
description of the database used in this work. The static model for estimating
the pricing kernel and relative risk aversion on a daily basis is introduced in
the following parts of this section. When the densities and preferences are
known for every day, the dynamics of the time-series can be examined. The
results of this examination are reported in the next section.

3.1 The Database

The database used for this work consists of intraday DAX and options data
which has undergone a thorough preparation scheme. The data was obtained
from the MD*Base, maintained at the Center for Applied Statistics and Eco-
nomics (CASE) at the Humboldt-University of Berlin. The first trading day in
the database is January 4th 1999 and the last one is April 30th 2002, i.e. more
than three years of intraday data and 2,921,181 observations. The options
data contains tick statistics on the DAX index options and is provided by
the German–Swiss Futures Exchange EUREX. Each single contract is doc-
umented and contains the future value of the DAX (corresponding to the
maturity and corrected for dividends according to (4)), the strike, the interest
rate (linearly interpolated to approximate a “riskless” interest rate for the
specific option’s time to maturity), the maturity of the contract, the closing
price, the type of the option, calculated future moneyness, calculated Black
and Scholes implied volatility, the exact time of the trade (in hundredths of
seconds after midnight), the number of contracts and the date.
In order to exclude outliers at the boundaries, only observations with a
maturity of more than one day, implied volatility of less than 0.7 and future
moneyness between 0.74 and 1.22 are considered, leaving 2,719,640
observations on 843 trading days. For every single trading day starting April
1999, the static model described in the following section is run and the results
are collected. The daily estimation begins 3 months after the first trading day
in the database because part of the estimation process is conducted on his-
torical data, and the history “window” is chosen to be 3 months, as explained
in the next section.

3.2 Subjective Density Estimation


The subjective density is estimated using a simulated GARCH model, the
parameters of which are estimated based on historical data. This method was
shown by [18] and others to resemble the actual subjective density.
The first step is to extract the data from the 3 months preceding the date
of the daily assessment. That is the reason for starting the daily process in
April instead of January 1999. The intraday options data from the preceding
3 months are replaced by daily averages of the stock index and the interest
rate, averaged over the specific day. When we have a 3 months history of daily
asset prices, we can fit a GARCH (1,1) model to the data. A strong GARCH
(1,1) model is described by
$$\varepsilon_t = \sigma_t Z_t, \qquad \sigma_t^2 = \omega + \alpha\, \varepsilon_{t-1}^2 + \beta\, \sigma_{t-1}^2 \qquad (32)$$
where Zt is an independent identically distributed innovation with a standard
normal distribution. The logarithmic returns of the daily asset prices are cal-
culated according to εt = ∆ log(St ) = log(St ) − log(St−1 ), and this time series
together with its daily standard deviation σt are the input of the GARCH es-
timation. The parameters ω, α and β are estimated using the quasi maximum
likelihood method, an extension of maximum likelihood that remains valid
when the innovation distribution is misspecified, at the cost of some estimator
efficiency.
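To make this step concrete, the following Python sketch implements a quasi maximum likelihood fit of the GARCH(1,1) recursion in (32). It is not part of the original study; the starting values, the variance initialization and the optimizer settings are illustrative assumptions (a dedicated package could equally be used).

```python
# Minimal sketch of a GARCH(1,1) quasi maximum likelihood fit, assuming the
# 3-month history of daily DAX averages is passed in as `prices`.
import numpy as np
from scipy.optimize import minimize

def garch11_neg_loglik(params, eps):
    """Negative Gaussian (quasi) log-likelihood of the GARCH(1,1) recursion (32)."""
    omega, alpha, beta = params
    sigma2 = np.empty(eps.size)
    sigma2[0] = eps.var()                          # one simple choice for the starting variance
    for t in range(1, eps.size):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    return 0.5 * np.sum(np.log(2 * np.pi * sigma2) + eps ** 2 / sigma2)

def fit_garch11(prices):
    eps = np.diff(np.log(prices))                  # logarithmic returns used as input
    x0 = np.array([1e-6, 0.05, 0.90])              # illustrative starting values
    bounds = [(1e-12, None), (0.0, 1.0), (0.0, 1.0)]
    res = minimize(garch11_neg_loglik, x0, args=(eps,), bounds=bounds, method="L-BFGS-B")
    return res.x                                   # estimated omega, alpha, beta
```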
After the parameters of the GARCH process have been estimated, a sim-
ulation of a new GARCH process is conducted, starting on the date of the
daily assessment. Equations (32) are used for the simulation, but this time
the unknown variables are the time series σt and εt , while the parameters
ω, α and β are the ones estimated from the historical data. The simulation
creates a T days long time series, and is run N times. The simulated DAX is
calculated as
$$S_t = S_{t-1}\, e^{\varepsilon_t}, \qquad \forall t \in \{1, \ldots, T\} \qquad (33)$$

where $S_0$ is the present level of the index on the day of the daily assessment.
Our aim is to estimate the subjective density in some fixed time points,
which correspond to specific maturities used for the SPD estimation discussed
next. Therefore, after the simulation has been completed, the simulated data
on the dates, which correspond to the desired maturities, is extracted, and the
daily subjective density is estimated using a kernel regression on the desired
moneyness grid, which corresponds to the asset's gross return. The transformation from the simulated $S_t$ to the moneyness grid is achieved using $e^{-rT} S_T / S_0$ for each desired horizon $T$, where $r$ is the daily average risk-free rate on the
present day. The subjective density is estimated for every trading day included
in the database. In Fig. 1 we plot the simulated subjective densities on four
different trading days for four different maturities.
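The simulation and smoothing step can be sketched as follows; the number of paths, the grid limits and the use of a Gaussian kernel density estimator are illustrative assumptions and are not prescribed by the text.

```python
# Minimal sketch: simulate N GARCH(1,1) paths of length T with the fitted
# parameters, map the terminal values to the moneyness grid via e^{-rT} S_T/S_0,
# and smooth them into a subjective density. All settings are illustrative.
import numpy as np
from scipy.stats import gaussian_kde

def subjective_density(omega, alpha, beta, s0, r, horizon, n_paths=10_000, seed=0):
    rng = np.random.default_rng(seed)
    sigma2 = np.full(n_paths, omega / (1.0 - alpha - beta))   # start at the unconditional variance
    s = np.full(n_paths, s0)
    for _ in range(horizon):                                  # daily steps up to the maturity
        z = rng.standard_normal(n_paths)
        eps = np.sqrt(sigma2) * z
        s = s * np.exp(eps)                                   # eq. (33)
        sigma2 = omega + alpha * eps ** 2 + beta * sigma2     # eq. (32)
    moneyness = np.exp(-r * horizon) * s / s0                 # discounted gross return
    grid = np.linspace(0.74, 1.22, 200)
    return grid, gaussian_kde(moneyness)(grid)                # smoothed subjective density
```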

Fig. 1. Subjective density for different maturities (30, 60, 90, 120 days) on different trading days (panels: 19990416, 20000403, 20010911, 20020228; x-axis: future moneyness, y-axis: density)

It can be seen in Fig. 1 that the distribution resembles a log normal distri-
bution, which is more spread the longer the maturity is. A well known feature
of financial data is that equity index return volatility is stochastic, mean-
reverting and responds asymmetrically to positive and negative returns, due
to the leverage effect. Therefore, this GARCH(1,1) model, which produces a slight positive skewness, is an adequate model for the index returns, and it resembles the nonparametric subjective densities estimated by [1] and [6].

3.3 State-Price Density Estimation

There is a vast literature on estimating the SPD using nonparametric and semiparametric methods. Aït-Sahalia and Lo [1], for example, suggest a semi-
parametric approach using the nonparametric kernel regression discussed in
[14]. They propose a call pricing function according to [4], but with a non-
parametric function for the volatility. The volatility is estimated using a two
dimensional kernel estimator
n κ−κi τ −τi
i=1 kκ ( hκ )kτ ( hτ )σi
(κ, τ ) = n
σ κ−κi τ −τi (34)
i=1 kκ ( hκ )kτ ( hτ )

def
where κ = erτKSt is future moneyness, τ is time to maturity and σi is the im-
plied volatility. The kernel functions kκ and kτ together with the appropriate
bandwidths hκ and hτ are chosen such that the asymptotic properties of the
second derivative of the call price are optimized. The kernel function measures the drop in the likelihood that the true function passes through a certain point when that point does not coincide with an observation. The price of the
call is then calculated using the [4] formula but with the estimated volatility,
and the SPD is estimated using (21).
A major advantage of such a method compared to fully nonparametric ones is
that only the volatility needs to be estimated using a nonparametric regres-
sion. The other variables are parametric, thus reducing the size of the problem
significantly. Other important qualities of kernel estimators are a well devel-
oped and tractable statistical inference and the fact that kernel estimators
take advantage of past data, as well as future data, when estimating the cur-
rent distribution. The problem of kernel based SPDs is that they could, for
certain dates, yield a poor fit to the cross-section of option prices, although
for other dates the fit could be quite good.
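For illustration, the two-dimensional Nadaraya-Watson smoother in (34) can be written in a few lines; the Gaussian kernels and the bandwidths below are illustrative assumptions, not values from the study.

```python
# Minimal sketch of the Nadaraya-Watson estimator (34): kappa_i, tau_i, iv_i are
# the observed future moneyness, time to maturity and implied volatility of each
# option trade; bandwidths are illustrative assumptions.
import numpy as np

def nw_implied_vol(kappa, tau, kappa_i, tau_i, iv_i, h_kappa=0.02, h_tau=0.05):
    gauss = lambda u: np.exp(-0.5 * u ** 2)          # Gaussian kernels k_kappa and k_tau
    w = gauss((kappa - kappa_i) / h_kappa) * gauss((tau - tau_i) / h_tau)
    return np.sum(w * iv_i) / np.sum(w)              # locally weighted average of observed IVs
```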
The state-price density in this work is estimated using a local polynomial
regression as proposed by [27] and described thoroughly in [17]. The choice
of Nadaraya–Watson type smoothers, used by [1], is inferior to local poly-
nomial kernel smoothing. More accurately, the Nadaraya-Watson estimator
is actually a local polynomial kernel smoother of degree 0. If we use higher
order polynomial smoothing methods, we can obtain better estimates of the
functions. Local polynomial kernel smoothing also provides a convenient and

effective way to estimate the partial derivatives of a function of interest, which


is exactly what we look for when estimating SPDs.
The first step is to calculate the implied volatility for each given maturity
and moneyness in the daily data (based on the B&S formula when prices
are given and σ is the unknown). Then a local polynomial regression is used
to smooth the implied volatility points and to create the implied volatility
surface from which the SPD can be derived. The basic idea of local polynomial
regression is based on a locally weighted least squares regression, where the
weights are determined by the choice of a kernel function, the distance of
an observation from a certain estimated point defining the surface/line at
this coordinate and the chosen bandwidth vector. The use of the moneyness
measure and time to maturity reduces the regression to two dimensions and
enables freedom in estimating the surface in fictional points that do not exist
in the database.
The concept of local polynomial estimation is quite straightforward. The
input data at this stage is a trivariate data, a given grid of moneyness (κ),
time to maturity (τ ) and the implied volatility (σ BS (κ, τ )). We now consider
the following process for the implied volatility surface
$$\hat{\sigma} = \phi(\kappa, \tau) + \sigma^{BS}(\kappa, \tau)\, \varepsilon \qquad (35)$$
where φ(κ, τ ) is an unknown function, which is three times continuously dif-
ferentiable, and ε is a Gaussian white noise. Then a Taylor expansion for the
function φ(κ, τ ) in the neighborhood of (κ0 , τ0 ) is
$$
\begin{aligned}
\phi(\kappa, \tau) \approx \phi(\kappa_0, \tau_0)
&+ \left.\frac{\partial \phi}{\partial \kappa}\right|_{\kappa_0,\tau_0} (\kappa - \kappa_0)
+ \frac{1}{2} \left.\frac{\partial^2 \phi}{\partial \kappa^2}\right|_{\kappa_0,\tau_0} (\kappa - \kappa_0)^2 \\
&+ \left.\frac{\partial \phi}{\partial \tau}\right|_{\kappa_0,\tau_0} (\tau - \tau_0)
+ \frac{1}{2} \left.\frac{\partial^2 \phi}{\partial \tau^2}\right|_{\kappa_0,\tau_0} (\tau - \tau_0)^2 \\
&+ \frac{1}{2} \left.\frac{\partial^2 \phi}{\partial \kappa\, \partial \tau}\right|_{\kappa_0,\tau_0} (\kappa - \kappa_0)(\tau - \tau_0)
\end{aligned}
\qquad (36)
$$

Minimizing the expression
$$
\sum_{j=1}^{n} \Big\{ \sigma^{BS}(\kappa_j, \tau_j) - \big[\beta_0 + \beta_1 (\kappa_j - \kappa_0) + \beta_2 (\kappa_j - \kappa_0)^2 + \beta_3 (\tau_j - \tau_0) + \beta_4 (\tau_j - \tau_0)^2 + \beta_5 (\kappa_j - \kappa_0)(\tau_j - \tau_0)\big] \Big\}^2 K_h(\kappa_j - \kappa_0, \tau_j - \tau_0) \qquad (37)
$$
yields the estimated implied volatility surface and its first two derivatives at the same time, as $\left.\frac{\partial \phi}{\partial \kappa}\right|_{\kappa_0,\tau_0} = \beta_1$ and $\left.\frac{\partial^2 \phi}{\partial \kappa^2}\right|_{\kappa_0,\tau_0} = 2\beta_2$. This is a very useful feature, as the second derivative is used to calculate the SPD for a certain fixed maturity. A detailed derivation of $\frac{\partial^2 C}{\partial K^2}$ (used for the SPD according to [5]) as a function of $\frac{\partial \sigma}{\partial \kappa}$ and $\frac{\partial^2 \sigma}{\partial \kappa^2}$ (which are obtained from the implied volatility surface estimation) is given, for example, by [17].
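The weighted least squares fit behind (37) can be sketched as follows for a single grid point (κ0, τ0); the product Gaussian kernel and the bandwidths are illustrative assumptions, and the returned derivatives are the ingredients of the SPD formula of [5].

```python
# Minimal sketch of the local quadratic fit in (37) around one grid point.
import numpy as np

def local_quadratic(kappa0, tau0, kappa_i, tau_i, iv_i, h_kappa=0.04, h_tau=0.1):
    dk, dt = kappa_i - kappa0, tau_i - tau0
    # design matrix matching the bracketed expansion in (37)
    X = np.column_stack([np.ones_like(dk), dk, dk ** 2, dt, dt ** 2, dk * dt])
    w = np.exp(-0.5 * ((dk / h_kappa) ** 2 + (dt / h_tau) ** 2))   # kernel weights K_h
    W = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * W[:, None], iv_i * W, rcond=None)
    phi_hat = beta[0]                  # smoothed implied volatility at (kappa0, tau0)
    d_phi_dk = beta[1]                 # first derivative with respect to moneyness
    d2_phi_dk2 = 2.0 * beta[2]         # second derivative with respect to moneyness
    return phi_hat, d_phi_dk, d2_phi_dk2
```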

Fig. 2. State-Price density for different maturities (30, 60, 90, 120 days) on different trading days (panels: 19990416, 20000403, 20010911, 20020228; x-axis: future moneyness, y-axis: density)

The estimated risk neutral densities for the same dates and the same
maturities as in Fig. 1 are depicted in Fig. 2. The SPD is estimated on a
future moneyness scale, thus reducing the number of parameters that need to
be estimated.
One of the trading days plotted in Fig. 2 is September 11th 2001. It is interesting to see that the options data on this trading day reflect an increased belief among investors that the market will go down in the long run.
Similar behavior is found in the trading days following that particular day as
well as in other days of crisis. The highly volatile SPD for negative returns,
which could be explained, for example, by the leverage effect or the correlation
effect, could reflect a dynamic demand for insurance against a market crash.
This phenomenon is more apparent in days of crisis and was reported by [18]
as well.

3.4 Deriving the Pricing Kernel and Risk Aversion

At this stage, we have the estimated subjective and state-price densities for the
same maturities and spread over the same grid. The next step is to calculate
the daily estimates for the pricing kernel and risk aversion.
The pricing kernel is calculated using (10), where the estimated subjective
density and the estimated SPD replace p(ST |St ) and q(ST |St ) in the equation,
respectively. Since the grid is a moneyness grid, and the p and q are estimated
on the moneyness grid, the estimated pricing kernel is actually Mt (κT ). The
coefficient of relative risk aversion is then computed by numerically estimating
the derivative of the estimated pricing kernel with respect to the moneyness
and then according to (14).
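Assuming that (10) takes the familiar form of a density ratio and that (14) is the familiar coefficient −κ M′(κ)/M(κ), this last step can be sketched in a few lines; the numerical differentiation scheme is an illustrative choice.

```python
# Minimal sketch: pricing kernel as the ratio of the two estimated densities on
# the common moneyness grid, and relative risk aversion from its derivative.
# Discounting constants are omitted here for simplicity.
import numpy as np

def pricing_kernel_and_rra(grid, spd, subjective):
    mt = spd / subjective                 # estimated pricing kernel M_t(kappa)
    dmt = np.gradient(mt, grid)           # numerical derivative with respect to moneyness
    rra = -grid * dmt / mt                # coefficient of relative risk aversion
    return mt, rra
```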
Fig. 3. Estimated Pricing Kernel for different maturities (30, 60, 90, 120 days) on different trading days (panels: 19990416, 20000403, 20010911, 20020228; x-axis: future moneyness, y-axis: EPK)

The estimated pricing kernels depicted in Fig. 3 for different trading days and different maturities bear similar characteristics to those reported by [1], [18], [28] and others, who conducted a similar process on the S&P500 index.
The pricing kernel is not a monotonically decreasing function, as suggested
in classic macroeconomic theory. It is more volatile and steeply upward slop-
ing for large negative return states, and moderately downward sloping for
large positive return states. Moreover, the pricing kernel contains a region of
increasing marginal utility at the money (around κ = 1), implying a nega-
tive risk aversion. This feature can clearly be seen in Fig. 4 which depicts the
coefficient of relative risk aversion and shows clearly, that the minimal risk
aversion is obtained around the ATM region and the relative risk aversion
is negative. The negative risk aversion around the ATM region implies the
possible existence of risk seeking investors, whose utility functions are locally
convex.
Jackwerth [18] named this phenomenon the pricing kernel puzzle and suggested some possible explanations for it.

Fig. 4. Estimated relative risk aversion for different maturities (30, 60, 90, 120 days) on different trading days (panels: 19990416, 20000403, 20010911, 20020228; x-axis: future moneyness, y-axis: RRA)

One possible explanation is that
a broad index (DAX in this work, S&P500 in his work) might not be a good
proxy for the market portfolio and, as such, the results differ significantly from those implied by standard macroeconomic theory. In addition to
the poor fit of the index, the assumptions for the existence of a representative
agent might not hold, meaning that markets are not complete or the utility
function is not strictly state-independent or time-separable.
Another possibility is that historically realized returns are not reliable
indicators for subjective probabilities, or that the subjective distribution is
not well approximated by the actual one. This deviation stems from the fact
that investors first observe historical returns without considering crash possi-
bilities, and only afterwards incorporate crash possibilities, which make their
subjective distribution look quite different than the one estimated here. The
historical estimation or the log-normal distribution assumptions ignore the
well known volatility clustering of financial data.
From another point of view, investors might make mistakes in deriving their own subjective distributions from the actual objective one, thus leading to mispricing of options. Jackwerth [18] claims that mispricing of options in the market is the most plausible explanation for the negative risk aversion and the increasing marginal utility function.
This work does not aim, however, at finding a solution to the pricing
kernel puzzle. The implicit assumption in this work is that some frictions
in the market lead to a contradiction of standard macroeconomic theory,
resulting in a region of increasing marginal utility. In the following section, a
dynamic analysis of the pricing kernel and relative risk aversion is conducted
along the three-year time frame.

4 A Dynamic Model: Time-Series Analysis


Since the process described above is conducted on a daily basis and in most
of the trading days, the GARCH and local polynomial estimations produce a
good fit to the data, three-year long time-series data of pricing kernel and rela-
tive risk aversion are obtained. In this section we will analyze these time-series
and show their moments. A principal component analysis will be conducted on
the stationary series and the principal components will be tested as response
variables in a GLS regression.

4.1 Moments of the Pricing Kernel and Relative Risk Aversion

In order to explore the characteristics of the pricing kernel and the relative risk
aversion, their first four moments at any trading day have to be computed,
i.e. the mean (µt ), standard deviation (σt ), skewness (Skewt ) and kurtosis
(Kurtt ) of the functions across the moneyness grid. In addition, the daily
values of the estimated functions at the money (ATM) are calculated and
analyzed. Including this additional moment could prove essential as it was

shown before that the functions behave quite differently at the money than in
other regions. Each of the estimates (pricing kernel and relative risk aversion)
is a function of moneyness and time to maturity, which was chosen to be a
vector of four predetermined maturities, and as in the previous section we
concentrate on τ = (30, 60, 90, 120) days.
Figures 5, 6, 7 and 8 depict the time-series of the ATM values and mean
values of the pricing kernel and the relative risk aversion, each estimated for
four different maturities on 589 trading days between April 1999 and April
2002. The trading days, on which the GARCH model does not fit the data, or
the local polynomial estimation experiences some negative volatilities, were
dropped. Time-series of the daily standard deviation, skewness and kurtosis,
as well as the differences time-series, were collected but not included in this
paper.
The plots show that the pricing kernel at the money (Fig. 5) behaves simi-
larly across different maturities and bears similar characteristics to its general
mean (Fig. 6). This result implies, that characterizing the pricing kernel using
the four first moments of its distribution is adequate. Contrary to the pricing
kernel, the relative risk aversion at the money (Fig. 7) looks quite different
than its general mean (Fig. 8). The ATM relative risk aversion is mostly neg-
ative, as detected already in the daily estimated relative risk aversion. The
mean relative risk aversion, however, is mostly positive. Another feature of the
relative risk aversion is that it becomes less volatile the longer the maturity is, implying the existence of more nervous investors for assets with short maturities.

Fig. 5. ATM Pricing Kernel for different maturities (30, 60, 90, 120 days)

Fig. 6. Mean of Pricing Kernel for different maturities (30, 60, 90, 120 days)

Fig. 7. ATM Relative Risk Aversion for different maturities (30, 60, 90, 120 days)

Fig. 8. Mean of Relative Risk Aversion for different maturities (30, 60, 90, 120 days)

The main conclusion we can draw from the relative risk aversion plots
is that the four first moments of the distribution do not necessarily represent
all the features of the relative risk aversion correctly, and the collection of
the extra details regarding the ATM behavior is justified, as it will be shown
by the principal component analysis. After describing the characteristics of
the different time-series, and before we concentrate on specific time-series for
further analysis, it is essential to determine which of the time-series are sta-
tionary. The test chosen to check for stationarity is the KPSS test, originally
suggested by [19].
Conducting stationarity tests for the various functions has shown that the moments of the time-series themselves are in most cases not stationary, and that the logarithmic differences of the moments are not always defined, due to
the existence of negative values. Contrary to that, the absolute differences of
all moments and across all maturities were found to be stationary. Therefore,
we concentrate from now on only on the absolute differences of the moments.
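The stationarity screening can be sketched with the KPSS implementation in statsmodels; the deterministic specification and the lag selection below are illustrative assumptions.

```python
# Minimal sketch of the KPSS screening of a daily moment series and its first
# differences; under the KPSS null the series is stationary.
import numpy as np
from statsmodels.tsa.stattools import kpss

def kpss_level_and_differences(moment_series):
    levels = np.asarray(moment_series)
    diffs = np.diff(levels)                               # absolute (first) differences
    stat_lvl, p_lvl, *_ = kpss(levels, regression="c", nlags="auto")
    stat_dif, p_dif, *_ = kpss(diffs, regression="c", nlags="auto")
    return {"levels": (stat_lvl, p_lvl), "differences": (stat_dif, p_dif)}
```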

4.2 Principal Component Analysis

In the following, we will focus on a principal component analysis (PCA) of the time-series in order to explain the variation of the time-series using a small number of influential factors. As stated before, the only time-series to be considered are the differences of the moments, found to be stationary.


The PCA process starts with the definition of the following data matrix for
pricing kernel differences
$$
\mathcal{X} = \begin{pmatrix}
\Delta PK_2^{ATM} & \Delta\mu_2 & \Delta\sigma_2 & \Delta Skew_2 & \Delta Kurt_2 \\
\Delta PK_3^{ATM} & \Delta\mu_3 & \Delta\sigma_3 & \Delta Skew_3 & \Delta Kurt_3 \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
\Delta PK_n^{ATM} & \Delta\mu_n & \Delta\sigma_n & \Delta Skew_n & \Delta Kurt_n
\end{pmatrix} \qquad (38)
$$

for each maturity 30, 60 and 90 days, where the differences are defined e.g. as $\Delta\mu_t \stackrel{\mathrm{def}}{=} \mu_t - \mu_{t-1}$ and similarly for the other columns of the matrix $\mathcal{X}$. A similar
matrix is defined for the differences of the relative risk aversion. PCA can
be conducted either on the covariance matrix of the variables or on their
correlation matrix. If the variation were of the same scale, the covariance
matrix could be used for the PCA. However, the data is not scale-invariant,
hence a standardized PCA must be applied, i.e. conducting the PCA on the
correlation matrix.
The principal components can explain the variability of the data. The
proportion of variance explained by a certain principal component is the ratio
of the corresponding eigenvalue of the correlation matrix to the sum of all
eigenvalues, whereas the proportion of variance explained by the first few
principal components is the sum of the proportions of variance explained by
each of them.
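For illustration, the standardized PCA can be sketched as follows; X is assumed to be a data matrix of the form (38), and the eigen-decomposition of its correlation matrix delivers the loadings, the component time-series and the explained-variance proportions discussed below.

```python
# Minimal sketch of a standardized PCA (PCA on the correlation matrix).
import numpy as np

def standardized_pca(X):
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)      # standardize each moment series
    R = np.corrcoef(X, rowvar=False)                      # correlation matrix of the moments
    eigval, eigvec = np.linalg.eigh(R)
    order = np.argsort(eigval)[::-1]                      # sort by decreasing eigenvalue
    eigval, eigvec = eigval[order], eigvec[:, order]
    scores = Z @ eigvec                                   # principal component series y_{j,t}
    explained = eigval / eigval.sum()                     # proportion of variance per component
    return scores, eigvec, explained
```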
The principal component analysis shows that three principal components
could explain about 85% of the total variability. Nevertheless, the second
and third principal components were found to be correlated, and in order to
perform a univariate analysis on the principal components, they have to be
orthogonal to each other. Therefore, only the first two principal components
of the pricing kernel and relative risk aversion differences are considered from
now on. The first two principal components explain approximately 80% of the
variability of the pricing kernel differences (the first factor explains 60% and
the second explains 20%), and approximately 70% of the variability of the
relative risk aversion differences (divided equally among the two factors).
The j th eigenvector expresses the weights used in the linear combination
of the original data in the j th principal component. Since we are considering
only two principal components, the first two eigenvectors are of interest. More
specifically, we can construct the first principal components for each of the
examined time-series. The following demonstrates the weights of the moments
in the principal components of the differences of the pricing kernel with a
maturity of 60 days
$$
\begin{aligned}
y_{1,t}(\tau = 60) &= 0.06\,\Delta PK_t^{ATM} + 0.92\,\Delta\mu_t + 0.38\,\Delta\sigma_t + 0.05\,\Delta Skew_t - 0.03\,\Delta Kurt_t \\
y_{2,t}(\tau = 60) &= 0.47\,\Delta PK_t^{ATM} + 0.24\,\Delta\mu_t - 0.58\,\Delta\sigma_t - 0.54\,\Delta Skew_t + 0.29\,\Delta Kurt_t
\end{aligned}
$$

It can clearly be seen, that the dominant factors in the first principal com-
ponent are the changes in mean and standard deviation, whereas the dominant
factors in the second principal component are the changes in skewness and
standard deviation. The equations do not change much when other maturi-
ties are considered. As for the moments of the relative risk aversion, the first
principal component is dominated solely by the changes in standard deviation
and the second principal component is mainly dominated by the change in
relative risk aversion at the money.
We therefore conclude that the variation of the pricing kernel and relative
risk aversion differences can be explained by two factors. The first factor of
pricing kernel differences explains 60% of the variability and can be perceived
as a central mass movement factor, consisting of the changes in expectation
and standard deviation. The second factor explains additional 20% of the
variability and can be perceived as a change of tendency factor, consisting
of changes in skewness and standard deviation. The principal components of
the relative risk aversion are a little different. The first one explains approx-
imately 35% of the variability and can be perceived as a dispersion change
factor, dominated by the change in standard deviation. The contribution of
the second principal component to the total variability is 35% as well and it
is dominated by the change in relative risk aversion of the investors at the
money. The mean of relative risk aversion differences seems to play no role in
examining the variability of the relative risk aversion.
The correlation between the ith moment and the j th principal component
is calculated as
$$
r_{X_i, Y_j} = g_{ij} \sqrt{\frac{\ell_j}{s_{X_i X_i}}} \qquad (39)
$$
where $g_{ij}$ is the $i$th element of the $j$th eigenvector, $\ell_j$ is the corresponding eigenvalue and $s_{X_i X_i}$ is the variance of the $i$th moment $X_i$.
Descriptive statistics of the principal components time-series and their
correlations with the moments are given in Tables 1 and 2 for the pricing
kernel and relative risk aversion, respectively. The means of the principal
components are very close to zero, as they are linear combinations of the
differences of the moments, which are themselves approximately zero mean.
Table 1. Descriptive statistics, principal components of the pricing kernel differences

Principal   Mean     Standard    Correlation with
component   ×10^4    deviation   ∆PK_t^ATM   ∆µ_t    ∆σ_t    ∆Skew_t   ∆Kurt_t
τ = 30
y_{1,t}     -2.46    0.76        -0.02        0.42    0.62    0.02     -0.02
y_{2,t}     -4.39    4.15         0.21        0.25   -0.16    0.29      0.08
τ = 60
y_{1,t}      4.34    0.44         0.06        0.74    0.30    0.04     -0.03
y_{2,t}      8.53    4.06         0.22        0.11   -0.27   -0.25      0.13
τ = 90
y_{1,t}      2.80    0.55         0.09       -0.61    0.46    0.11     -0.05
y_{2,t}      9.20    2.04         0.23       -0.19   -0.21   -0.32      0.11

Table 2. Descriptive statistics, principal components of the relative risk aversion differences

Principal   Mean     Standard    Correlation with
component   ×10^3    deviation   ∆RRA_t^ATM   ∆µ_t    ∆σ_t    ∆Skew_t   ∆Kurt_t
τ = 30
y_{1,t}     11.5     14.75        0.03         0.04    0.61    0.00      0.01
y_{2,t}      0.55     9.36        0.33        -0.22   -0.02   -0.32      0.26
τ = 60
y_{1,t}     -2.57    26.90        0.10         0.04    0.60   -0.02      0.03
y_{2,t}      1.60    13.75        0.36         0.20   -0.06   -0.24     -0.35
τ = 90
y_{1,t}      1.72    28.60       -0.08         0.15    0.63    0.05      0.04
y_{2,t}      3.71     9.22        0.18         0.36   -0.05   -0.27      0.20

The moments highly correlated with the principal components are, not surprisingly, the ones which were reported to be dominant when constructing the principal components. Nevertheless, Table 1 implies an inconsistent behavior of the different moments across maturities. The first principal components of the pricing kernel differences (the first rows for each of the maturities in Table 1) are positively correlated with the changes in mean and standard deviation (the dominating moments) for short term maturities, but negatively correlated with the mean differences of 90 days maturity pricing kernels. The second principal components of pricing kernel differences (the second rows for each of the maturities in Table 1) are negatively correlated with the change of standard deviation for all maturities, but their correlations with the change of skewness are not consistent across maturities, implying a bad fit. Since the
first principal component of the pricing kernel differences could explain ap-
proximately 60% of the variability, whereas the second factor can explain only
20%, the inconsistent behavior could be justified by the poor contribution of
the second principal component to the total variability.
The correlations of the first and second principal components of the rela-
tive risk aversion differences with their dominant factors (Table 2) are found
to be consistent across maturities. The first principal component is positively
correlated with its most dominant moment, the changes in the relative risk
aversion standard deviation. This correlation means essentially, that the less
homoscedastic the relative risk aversion is, i.e. the larger the changes in stan-
dard deviation are, the larger the first principal component of the relative risk
aversion differences becomes. The second principal component of the relative
risk aversion differences is positively correlated with its most dominant mo-
ment, the behavior at the money. The more volatile the relative risk aversion
at the money is, the higher the second principal component is. Both principal

components of the relative risk aversion differences contribute more than 30%
of the variability and imply a good fit of the principal components to the
data. After constructing principal components, which explain the variability
of the time-series, it is essential to check the autocorrelation and the partial
autocorrelation functions of the time-dependent principal components. This
is illustrated in Fig. 9 for the pricing kernel differences. The same functions
for the principal components of the relative risk aversion differences have
similar characteristics and hence not reported here. Since the principal com-
ponents have similar autocorrelation and partial autocorrelation functions for
all different maturities, a maturity of 60 days was arbitrarily chosen to be pre-
sented.

Fig. 9. Autocorrelation function (left panel) and partial autocorrelation function (right panel) of the principal components of pricing kernel differences (τ = 60 days). The autocorrelation functions of the principal components of relative risk aversion differences behave similarly, exhibiting a MA(1) process

It can be seen that the autocorrelation function drops abruptly after the first order autocorrelation, whereas the partial autocorrelation function decays gradually. These characteristics imply a MA(1) behavior (Chap. 11 in
[13]) and we therefore concentrate on fitting a model with a moving aver-
age component to the principal components. A calculation of the Akaike and
Schwarz information criteria confirms that the best-fitted models for the first
principal components are ARMA (1,1), whereas the second principal compo-
nents follow a MA(1) process. As expected, all principal components have an
autocorrelated error term.
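The identification step can be sketched as follows; the candidate orders mirror those discussed in the text, and the estimation settings are the statsmodels defaults, which is an illustrative choice.

```python
# Minimal sketch: inspect the ACF/PACF of a principal component series and
# compare ARMA(1,1) and MA(1) specifications by AIC and BIC.
import numpy as np
from statsmodels.tsa.stattools import acf, pacf
from statsmodels.tsa.arima.model import ARIMA

def identify_arma(pc_series, max_lag=30):
    autocorr = acf(pc_series, nlags=max_lag)
    partial = pacf(pc_series, nlags=max_lag)
    candidates = {(1, 1): None, (0, 1): None}          # ARMA(1,1) and MA(1), as in the text
    for (p, q) in candidates:
        fit = ARIMA(pc_series, order=(p, 0, q)).fit()
        candidates[(p, q)] = (fit.aic, fit.bic)        # information criteria for comparison
    return autocorr, partial, candidates
```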

4.3 GLS Regression Model for the Principal Components

The last test conducted in this work is to detect a possible relation between
the principal components and easily observed data, such as changes in the
DAX level and in implied volatility at the money. It is well known that the
simplest relation between an explanatory variable and a response variable can
be described and examined using a simple linear regression model

$$y = X\beta + \varepsilon \qquad (40)$$

where $y$ is a $n \times 1$ response vector, $X$ is a $n \times p$ explanatory matrix, $\beta$ is a $p \times 1$ vector of parameters to estimate and $\varepsilon$ is a $n \times 1$ vector of errors. If the errors were normally distributed and uncorrelated, i.e. $\varepsilon \sim N_n(0, \sigma^2 I_n)$, then the regression would result in the familiar ordinary least squares (OLS) estimator
$$\hat{\beta}_{OLS} = (X'X)^{-1} X' y \qquad (41)$$
with a covariance matrix
$$\mathrm{Cov}(\hat{\beta}_{OLS}) = \sigma^2 (X'X)^{-1} \qquad (42)$$

Introducing autocorrelated errors as described above, the relation between


the explanatory variable and the response variable can be modeled using the
generalized least squares (GLS) estimator. In the previous section, we found
evidence of autocorrelated errors of order 1, meaning that the error process
could be modeled using the following AR(1) process
$$\varepsilon_t = \rho\, \varepsilon_{t-1} + u_t \qquad (43)$$

for all t ∈ {1, . . . , n} with ut ∼ Nn (0, σu2 In ) as i.i.d. white noise and |ρ| < 1 for
stability. We could choose autoregressive processes of higher order, but since
most principal components were found to have an autocorrelated error term
of order 1, we concentrate here on AR(1) processes.
Iterating (43) from time 0 onwards yields
$$
\varepsilon_t = \lim_{n \to \infty} \left( \rho^{n+1} \varepsilon_{t-n-1} + \sum_{s=0}^{n} \rho^s u_{t-s} \right) = \sum_{s=0}^{\infty} \rho^s u_{t-s} \qquad (44)
$$

and hence $E[\varepsilon_t] = 0$ and the covariance matrix of the error term is
$$
\mathrm{Cov}(\varepsilon) = \sigma_u^2\, \Omega = \frac{\sigma_u^2}{1 - \rho^2}
\begin{pmatrix}
1 & \rho & \rho^2 & \cdots & \rho^{n-1} \\
\rho & 1 & \rho & \cdots & \rho^{n-2} \\
\rho^2 & \rho & 1 & \cdots & \rho^{n-3} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\rho^{n-1} & \rho^{n-2} & \rho^{n-3} & \cdots & 1
\end{pmatrix} \qquad (45)
$$
However, in a real application like the model discussed in this work, the
error-covariance matrix is not known and must be estimated from the data
along with the regression coefficients $\hat{\beta}$. If the generating process is stationary, which is the case in the model discussed here, a commonly used algorithm for estimating these errors is normally referred to as the [25] procedure. This algorithm begins with running a standard OLS regression and examining the residuals. The error vector of the OLS regression is obtained simply by plugging $\hat{\beta}$ into (40). Considering the residuals' first order autocorrelations from
the preliminary OLS regression can suggest a reasonable form for the error-
generating process. These first order autocorrelations can be estimated as
$$
\hat{\rho} = \frac{\sum_{t=2}^{n} \hat{\varepsilon}_t\, \hat{\varepsilon}_{t-1}}{\sum_{t=1}^{n} \hat{\varepsilon}_t^2} \qquad (46)
$$

Replacing the $\rho$'s in (45) with the $\hat{\rho}$'s from (46) results in the estimated matrix $\hat{\Omega}$. The best linear unbiased estimator in that case would be the estimated generalized least squares estimator
$$
\hat{\beta}_{GLS} = (X' \hat{\Omega}^{-1} X)^{-1} X' \hat{\Omega}^{-1} y \qquad (47)
$$
The [25] algorithm may seem to be a simple model, but it involves a compu-
tationally challenging estimation of Ω. Therefore, an alternative algorithm,
suggested by [31] is presented here. We define the following matrix as
$$
\Psi = \begin{pmatrix}
\sqrt{1 - \hat{\rho}^2} & 0 & 0 & \cdots & 0 & 0 & 0 \\
-\hat{\rho} & 1 & 0 & \cdots & 0 & 0 & 0 \\
0 & -\hat{\rho} & 1 & \cdots & 0 & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & -\hat{\rho} & 1 & 0 \\
0 & 0 & 0 & \cdots & 0 & -\hat{\rho} & 1
\end{pmatrix} \qquad (48)
$$
It can be shown that this matrix, multiplied by its transpose and the matrix $\hat{\Omega}$ (which is defined by (45)), is proportional to the unit matrix,
$$
\frac{1}{1 - \hat{\rho}^2}\, \Psi' \Psi\, \hat{\Omega} = I_n
$$
and hence the matrix $\Psi$ has the following property
$$
\Psi' \Psi = (1 - \hat{\rho}^2)\, \hat{\Omega}^{-1} \qquad (49)
$$

Since least squares estimation is not affected by scalar multiplication, we multiply the regression model by $\sqrt{1 - \hat{\rho}^2}$. Expressing $\hat{\Omega}^{-1}$ in (47) using (49) leads to the following GLS estimator
$$
\hat{\beta}_{GLS} = (X' \Psi' \Psi X)^{-1} X' \Psi' \Psi y = [(\Psi X)'(\Psi X)]^{-1} (\Psi X)' (\Psi y) \qquad (50)
$$

which is actually an OLS estimator of the original variables multiplied by a scalar. The transformed model can be described as
$$
y_t - \hat{\rho}\, y_{t-1} = \sum_{j=0}^{p} (x_{tj} - \hat{\rho}\, x_{t-1,j})\, \beta_j + u_t \qquad (51)
$$

for $t \in \{2, \ldots, n\}$, $u_t$ being a Gaussian noise. For $t = 1$ it is simply
$$
\sqrt{1 - \hat{\rho}^2}\; y_1 = \sqrt{1 - \hat{\rho}^2} \sum_{j=0}^{p} \beta_j x_{1j} + \sqrt{1 - \hat{\rho}^2}\; \varepsilon_1 \qquad (52)
$$
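The feasible GLS step can be sketched as follows; the sketch estimates rho from preliminary OLS residuals as in (46) and then applies the transformation of (51) and (52), which is algebraically equivalent to (50). It is an illustrative implementation, not the code used in the study.

```python
# Minimal sketch of the Prais-Winsten style feasible GLS regression.
import numpy as np

def prais_winsten(y, X):
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_ols
    rho = np.sum(resid[1:] * resid[:-1]) / np.sum(resid ** 2)        # eq. (46)
    # transformed data: first observation scaled as in (52), the rest as in (51)
    y_t = np.concatenate(([np.sqrt(1 - rho ** 2) * y[0]], y[1:] - rho * y[:-1]))
    X_t = np.vstack((np.sqrt(1 - rho ** 2) * X[0], X[1:] - rho * X[:-1]))
    beta_gls, *_ = np.linalg.lstsq(X_t, y_t, rcond=None)             # eq. (50)
    return beta_gls, rho
```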

As stated in the beginning of the current section, the changes in the DAX
level ($S_t$) and the changes of ATM implied volatility ($IV_t^{ATM}$) were chosen
to be tested as explanatory variables (X), whereas the first two principal
components of the pricing kernel and relative risk aversion differences for
different maturities were the dependent variables for the different models (y).
Since the dependency on the explanatory variable does not have to be linear,
different functions of the explanatory variables were tested. For each of the
explanatory variables the differences, the squared differences, the logarithmic
differences and the squared logarithmic differences were tested. The examined
models consisted of all possible combinations between the functions stated
above, as well as checking for interactions in each of the proposed models.
Since no interaction was ever found to be significant, they were dropped from
the model. The criterion for choosing the best model was a maximal value of
the F-statistic.
Table 3 describes the best fitted models for each of the principal com-
ponents (based on (51)). For this analysis, we consider a confidence level of
95%, i.e. any regression or regression coefficient yielding a Pvalue > 5% is re-
garded as non significant. The Pvalues for the regressions’ coefficients appear
in brackets.
The first principal component of the pricing kernel differences, which was
described before as a central mass movement factor, dominated by the changes
in the mean pricing kernel and the pricing kernel’s standard deviation, is
found to depend significantly on the logarithmic differences of ATM implied
volatility. This regression is only significant for short term maturities, and the
impact of the explanatory variables is positive and log-linear. The impact of
the DAX log return is not significant for a short term maturity, meaning the
first principal component of the pricing kernel differences is mainly influenced
by the logarithmic changes in the implied volatility at the money. Therefore,
Table 3. Estimated parameters of the regression suggested in (51): $y_t - \hat{\rho} y_{t-1} = \beta_0 + \beta_1 (x_{t,1} - \hat{\rho} x_{t-1,1}) + \beta_2 (x_{t,2} - \hat{\rho} x_{t-1,2}) + u_t$, where $\Delta IV_t^{ATM} \stackrel{\mathrm{def}}{=} IV_t^{ATM} - IV_{t-1}^{ATM}$, $\Delta S_t \stackrel{\mathrm{def}}{=} S_t - S_{t-1}$ and $\beta_0 = 0$ (the constant is never significant due to the zero-mean property of the principal components). P-values appear in brackets.

Pricing kernel differences
PC  Maturity  ρ̂              β̂1              x_{t,1}                 β̂2              x_{t,2}                              F
1   30        -0.43 (0.289)   -1.80 (0.000)   log(S_t/S_{t-1})         1.76 (0.000)   log(IV_t^ATM/IV_{t-1}^ATM)          18.96
    60        -0.47 (0.005)    2.71 (0.001)   log(S_t/S_{t-1})         0.98 (0.000)   log(IV_t^ATM/IV_{t-1}^ATM)          10.78
    90        Not significant
2   30        -0.47 (0.001)   30.21 (0.000)   log(S_t/S_{t-1})        12.77 (0.000)   log(IV_t^ATM/IV_{t-1}^ATM)          20.72
    60        Not significant
    90        Not significant

Relative risk aversion differences
PC  Maturity  ρ̂              β̂1              x_{t,1}    β̂2                x_{t,2}          F
1   30        -0.54 (0.000)    0.03 (0.000)   ∆S_t        145.34 (0.000)   ∆IV_t^ATM       11.56
    60        -0.46 (0.001)    0.03 (0.000)   ∆S_t        286.43 (0.000)   ∆IV_t^ATM       18.05
    90        -0.51 (0.028)    0.02 (0.000)   ∆S_t        224.27 (0.000)   ∆IV_t^ATM       10.67
2   30        Not significant
    60        -0.46 (0.042)   -0.01 (0.000)   ∆S_t        -92.15 (0.000)   ∆IV_t^ATM        7.22
    90        -0.50 (0.020)    0.01 (0.011)   ∆S_t         35.72 (0.018)   ∆IV_t^ATM        4.03

we can deduce the following: The larger the changes in ATM implied volatility
are and the higher the DAX log returns are (only for maturities of 60 days),
the more volatile the pricing kernel becomes, with bigger daily changes in its
mean and standard deviation.
We cannot find a significant relationship between the second principal
component of the pricing kernel differences and the explanatory variables
(other than for very short maturities), a result that supports the second prin-
cipal component’s smaller contribution to the variability of pricing kernel
differences. The pricing kernel differences have one dominant factor which
explains approximately 60% of their variance and depends mainly on the log-
arithmic changes of the ATM implied volatility. The regression coefficients are
positive, as are the correlations of the first principal component with ∆µt (P K)
and ∆σt (P K) for the respective maturities.
The results regarding the principal components of the relative risk aver-
sion differences are quite different. These principal components are related to
the absolute changes in the DAX level and in ATM implied volatility. The
dependence is not log-linear, but strictly linear.
According to Table 2 in the previous section, the correlations of the first
principal components of the relative risk aversion differences with their dom-
inant moments are positive. The first principal component is a dispersion
factor, dominated by the change in the relative risk aversion standard devia-
tion. According to the regression, large changes in the DAX level and the ATM
implied volatility yield a larger principal component, which is associated with
a larger change in risk aversion standard deviation. This result implies the
existence of more uncertain investors with a more heteroscedastic risk aver-
sion, when the DAX level and ATM implied volatility are more time-varying.
This relation could be explained by the dispersion of information sets among
investors. Veldkamp [32] examines the impact of information markets on as-
sets prices. She basically claims, that information markets, not assets mar-
kets, are the source of frenzies and herds in assets prices. However, the price
fluctuations on the market affect these information sets and determine the
information prices, which are incorporated in the investors’ subjective beliefs.
More volatile markets lead necessarily to a higher risk and to less informa-
tion, which increases the demand for information in a competitive market.
Hence, more volatile markets cause more information to be provided at a
lower price. When less information is involved, individual agents are will-
ing to pay for information, and the information sets of the individual agents
become more dispersed. More dispersed information sets could increase het-
eroscedasticity of the aggregate relative risk aversion as a function of assets’
returns.
The results regarding the second principal component of the relative risk
aversion differences are slightly different. The second principal components
are positively correlated to the change of relative risk aversion at the money.
Nevertheless, the linear regression is not significant for a very short term
maturity of 30 days. For long term maturities the coefficients of the regression

are positive, whereas for medium term maturities, they are negative. That
could be interpreted as follows: When the changes in DAX level and ATM
implied volatility are larger, the relative risk aversion at the money is more
volatile for long term maturities, but is less volatile for the medium term
maturities.
From this section we can conclude that the principal components model
fits the relative risk aversion differences better than it fits the pricing kernel
differences. We were able to fit an autocorrelated regression model to the first
principal component of pricing kernel differences for short and medium term
maturities, and to both principal components of relative risk aversion differ-
ences. The autocorrelation is indeed found to be quite large (approximately
-0.5) for all of the above models, implying the existence of an autocorrelated
error as detected already.

5 Final Statements

This work focused on estimating the subjective density and the state-price
density of the stochastic process associated with the DAX. Based on the work
of [30], a good estimation of those two measures is sufficient for deriving the
investors’ preferences. However, this work did not include a direct approxima-
tion of the utility function based on empirical data, but rather an estimation
of the pricing kernel and the relative risk aversion as functions of the return
states. The utility function could be approximated numerically by solving the
differential equations discussed in Sect. 2, after the pricing kernel and relative
risk aversion function have been estimated. Nevertheless, this work aimed at
examining the dynamics of these two measures, characterizing the investors’
behavior, rather than deriving their implied utility function.
The daily estimated pricing kernel and relative risk aversion were found to
have similar characteristics to those reported by [18] and [1]. The pricing ker-
nel was shown not to be a strictly decreasing function as suggested by classical
macroeconomic theory, and the relative risk aversion experienced some nega-
tive values at the money. These findings were apparent throughout the three
year long database, implying existence of risk seeking investors with a locally
convex utility function, possibly due to some frictions in the representative
agent’s model.
The variability of the stationary daily changes in pricing kernel and rel-
ative risk aversion was found to be well explained by two factors. Since the
factors experienced some evident autocorrelation, the principal components
were tested as the response variable in a GLS regression model, which re-
gressed each of the principal components on the daily changes in the DAX
and in ATM implied volatility.
We found that large changes in ATM implied volatility lead to a more
volatile and time-varying pricing kernel. The absence of a significant fitted
regression model for the second principal component of the pricing kernel

differences was in accordance with its smaller contribution to the explained


variability. In addition, we found evidence for the existence of more uncertain
investors with a more heteroscedastic risk aversion, when the daily changes
in the DAX and the ATM implied volatility were larger. This result was
explained by possibly more dispersed information sets among investors.

References
[1] Aït-Sahalia, Y. and Lo, A. W. [2000], ‘Nonparametric risk management
and implied risk aversion’, Journal of Econometrics 94, 9–51.
[2] Arrow, K. J. [1964], ‘The role of securities in the optimal allocation of
risk bearing’, Review of Economic Studies 31, 91–96.
[3] Arrow, K. J. [1965], ‘Aspects of the theory of risk-bearing’, Yrjö
Hahnsson Foundation, Helsinki.
[4] Black, F. and Scholes, M. [1973], ‘The pricing of options and corporate
liabilities’, Journal of Political Economy 81, 637–654.
[5] Breeden, D. and Litzenberger, R. [1978], ‘Prices of state-contingent
claims implicit in option prices’, Journal of Business 51, 621–651.
[6] Brown, D. P. and Jackwerth, J. C. [2004], ‘The pricing kernel puzzle:
Reconciling index option data and economic theory’, Working Paper,
University of Konstanz / University of Wisconsin.
[7] Campbell, J. and Cochrane, J. H. [1999], ‘A consumption-based explana-
tion of aggregate stock market behavior’, Journal of Political Economy
107.
[8] Cochrane, J. H. [2001], ‘Asset pricing’, Princeton University Press,
Princeton.
[9] Constantinides, G. [1982], ‘Intertemporal asset pricing with heteroge-
neous consumers and without demand aggregation’, Journal of Business
55, 253–268.
[10] Debreu, G. [1959], ‘The theory of value’, Wiley, New York.
[11] Derman, E. and Kani, I. [1994], ‘The volatility smile and its implied tree’,
Quantitative strategies research notes, Goldman Sachs.
[12] Fengler, M. R. [2005], ‘Semiparametric modelling of implied volatility’,
Springer, Berlin.
[13] Franke, J., Härdle, W. and Hafner, C. [2004], ‘Statistics of financial mar-
kets’, Springer, Heidelberg.
[14] Härdle, W. [1990], ‘Applied nonparametric regression’, Cambridge Uni-
versity Press, Cambridge.
[15] Härdle, W. and Hlávka, Z. [2005], ‘Dynamics of state price densities’, SFB
649 Discussion paper 2005-021, CASE, Humboldt University, Berlin.
[16] Härdle, W. and Zheng, J. [2002], ‘How precise are distributions predicted
by implied binomial trees?’, in: W. Härdle, T. Kleinow and G. Stahl
(eds.), Applied Quantitative Finance, Springer, Berlin, Ch. 7.

[17] Huynh, K., Kervella, P. and Zheng, J. [2002], ‘Estimating state-price densities with nonparametric regression’, in: W. Härdle, T. Kleinow and
G. Stahl (eds.), Applied Quantitative Finance, Springer, Berlin, Ch. 8.
[18] Jackwerth, J. C. [2000], ‘Recovering risk aversion from option prices and
realized returns’, Review of Financial Studies 13, 433–451.
[19] Kwiatkowski, D., Phillips, P., Schmidt, P. and Shin, Y. [1992], ‘Test-
ing the null hypothesis of stationarity against the alternative of a unit
root: How sure are we that economic series have a unit root?’, Journal of
Econometrics 54, 159–178.
[20] Lucas, R. E. [1978], ‘Asset prices in an exchange economy’, Econometrica
46, 1429–1446.
[21] Mas-Colell, A., Whinston, M. and Green, J. [1995], ‘Microeconomic the-
ory’, Oxford University Press.
[22] McGrattan, E. and Prescott, E. [2003], ‘Taxes, regulations and the value
of us and uk corporations’, Federal Reserve Bank of Minneapolis, Re-
search Department Staff Report 309.
[23] Mehra, R. and Prescott, E. [1985], ‘The equity premium - a puzzle’,
Journal of Monetary Economics 15.
[24] Merton, R. [1973], ‘Theory of rational option pricing’, Bell Journal of Economics and Management Science 4, 141–183.
[25] Prais, S. J. and Winsten, C. B. [1954], ‘Trend estimators and serial cor-
relation’, Cowles Commission Discussion Paper 383, Chicago.
[26] Pratt, J. [1964], ‘Risk aversion in the small and in the large’, Economet-
rica 32.
[27] Rookley, C. [1997], ‘Fully exploiting the information content of intra
day option quotes: Applications in option pricing and risk management’,
University of Arizona.
[28] Rosenberg, J. V. and Engle, R. F. [2002], ‘Empirical pricing kernels’,
Journal of Financial Economics 64, 341–372.
[29] Rubinstein, M. [1976], ‘The valuation of uncertain income streams and
the pricing of options’, Bell Journal of Economics 7, 407–425.
[30] Rubinstein, M. [1994], ‘Implied binomial trees’, Journal of Finance
49, 771–818.
[31] Sen, A. and Srivastava, M. [1990], ‘Regression analysis: Theory, methods
and applications’, Springer, New York.
[32] Veldkamp, L. [2005], ‘Media frenzies in markets for financial information’,
American Economic Review, 96(3), 577–601.
[33] Weil, P. [1989], ‘The equity premium puzzle and the risk-free rate puzzle’,
Journal of Monetary Economics 24, 401–421.
Portfolio Selection with Common Correlation
Mixture Models

Markus Haas1 and Stefan Mittnik2


1 Department of Statistics, University of Munich, Germany, haas@stat.uni-muenchen.de
2 Department of Statistics, University of Munich, Germany, finmetrics@stat.uni-muenchen.de

1 Introduction

The estimation of the covariance matrix of returns on financial assets is a considerable problem in applications of the traditional mean-variance approach
to portfolio selection. If the number of assets is large, as is often the case in
reality, the estimation error in the (sample) covariance matrix, the number
of elements of which increases at a quadratic rate with the number of assets,
can seriously distort “optimal” portfolio decisions [8, 31, 32]. In order to
mitigate this problem, several alternative approaches have been proposed to
filter out the systematic information from historic correlations, e.g., use of
factor structures [12], shrinkage techniques [36, 37], and others (see [8] for an
overview).
While it is generally found that these methods help to predict return corre-
lations more precisely, empirical research on the distributional characteristics
of asset returns comes up with a further challenge for the classical portfolio
theory developed by Markowitz [43]. In particular, it has long been known that
the distribution of stock returns sampled at a daily, weekly or even monthly
frequency is not well described by a (stationary) normal distribution [45].
The empirical return distributions tend to be leptokurtic, that is, they are
more peaked and fatter tailed than the normal distribution, properties that
are of great importance for risk management. In addition, recent evidence
suggests that there are two types of asymmetries in the (joint) distribution
of stock returns. The first is skewness in the marginal distribution of the
returns of individual stocks [29, 46, 33]. The second relates to the joint dis-
tribution of stock returns and is an asymmetry in the dependence between
assets. Namely, stock returns appear to be more highly correlated during
high-volatility periods, which are often associated with market downturns,
i.e., bear markets. Evidence for the asymmetric dependence phenomenon has
been reported, among others, in [17, 34, 49, 38, 5, 6, 10, 47, 20].

It is clear that these findings have important implications for financial de-
cisions. In general, if the return distribution is not Gaussian, standard mean-
variance analysis may not be reconcilable with expected utility theory, as
properties such as skewness and kurtosis will also affect investors’ decisions.
Moreover, as stressed in [11], the phenomenon of asymmetric dependencies,
with higher correlations in bear markets, is also of considerable relevance for
investment analysis, because it is in times of adverse market conditions that
the benefits from diversification are most urgently needed. However, models
not taking into account the state-dependent correlation structure will tend
to overestimate the benefits from diversification in bear markets, and, conse-
quently, they will underestimate the risk during such periods.
We attempt to tackle all these issues by assuming that returns are gen-
erated by multivariate normal mixture distributions. It is well-known that
normal mixture densities can capture the skewness and kurtosis observed in
empirical return distributions rather well. Moreover, regime-dependent corre-
lation structures are incorporated into the model in a natural and intuitively
appealing manner. By adopting the Markov-switching approach popularized
by Hamilton [24], we also allow for predictability of market regimes, which is
of great importance for portfolio selection.
It is clear, however, that the curse of dimensionality referred to at the
beginning of this section is even more burdensome in the mixture than in the
traditional framework, because we have as many covariance matrices to esti-
mate as we have mixture components. To effectively overcome this drawback,
we introduce a parsimonious parametrization of the regime-specific correlation
matrices by generalizing the common correlation model (CCM) of Elton and
Gruber [15] to the mixture of common correlation models (MCCM). Despite
its simplicity, the CCM has been shown in a number of previous studies to
deliver highly competitive correlation forecasts.
This article is organized as follows. Section 2 motivates the use of normal
mixture distributions to model asset returns. Section 3 introduces the MCCM
and develops parameter estimation via the EM algorithm. Section 4 presents
an application to international stock market returns, and Sect. 5 concludes
and identifies issues for further research.

2 Normal Mixture Models for Asset Returns


The idea of the normal mixture approach to modeling asset returns is that
the distribution of returns depends on an unobserved state (or regime) of the
market. For example, expected returns as well as variances and correlations
may differ in bull and bear markets. Assume that there are k different states
of the market and that, given that the market is in state j at time t, the
N × 1 vector of returns under consideration, rt , has a multivariate normal
distribution with mean µj and covariance matrix Σj , so that its density is
given by
$$
f(r_t \mid s_t = j) = \frac{1}{(2\pi)^{N/2} |\Sigma_j|^{1/2}} \exp\left\{ -\frac{1}{2} (r_t - \mu_j)' \Sigma_j^{-1} (r_t - \mu_j) \right\}, \qquad (1)
$$

where $|A|$ denotes the determinant of a square matrix $A$, and $s_t \in \{1, \ldots, k\}$ is a variable indicating the market regime at time $t$.
Assume, furthermore, that, at time $t$, the market is in state $j$ with probability $\pi_{jt}$, i.e.,
$$\Pr(s_t = j) = \pi_{jt}, \qquad j = 1, \ldots, k. \qquad (2)$$
Then the distribution of $r_t$ at time $t$ is a $k$-component finite normal mixture distribution, with density
$$f(r_t) = \sum_{j=1}^{k} \pi_{jt}\, \phi(r_t; \mu_j, \Sigma_j), \qquad (3)$$

where φ(·; µj , Σj ) denotes the normal density with mean µj and covariance
matrix Σj , as given in (1). In (3), the πjt ’s are the (conditional) mixing
weights, and the φ(·; µj , Σj ) are the component densities, or mixture com-
ponents, with component means µj , and component covariance matrices Σj ,
j = 1, . . . , k. The normal mixture has finite moments of all orders, which are
easily found using the properties of the normal distribution. For example, the
mean and the covariance matrix are given by
$$\mu := E(r_t) = \sum_{j=1}^{k} \pi_{jt}\, \mu_j \qquad (4)$$
and
$$\mathrm{Var}(r_t) = \sum_{j=1}^{k} \pi_{jt}\, \Sigma_j + \sum_{j=1}^{k} \pi_{jt}\, (\mu_j - \mu)(\mu_j - \mu)', \qquad (5)$$

respectively.
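For illustration, (4) and (5) can be computed directly from the mixing weights and the component parameters; the following Python sketch is illustrative and not part of the chapter.

```python
# Minimal sketch of the mixture mean (4) and covariance matrix (5).
import numpy as np

def mixture_moments(weights, means, covs):
    """weights: (k,), means: (k, N), covs: (k, N, N)."""
    mu = np.einsum("j,jn->n", weights, means)                     # eq. (4)
    within = np.einsum("j,jmn->mn", weights, covs)                # weighted component covariances
    dev = means - mu
    between = np.einsum("j,jm,jn->mn", weights, dev, dev)         # dispersion of component means
    return mu, within + between                                   # eq. (5)
```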
A finite mixture of a few normal distributions, say two or three, is capa-
ble of capturing the skewness and excess kurtosis detected in empirical asset
return distributions. While a general discussion of the moments of mixture
models may be found in [41, 54], let us briefly illustrate the skewness and
kurtosis properties in the univariate case, where N = 1. Then the centered
third moment of the mixture distribution is
$$
\begin{aligned}
E(r_t - \mu)^3 &= \sum_{j=1}^{k} \pi_{jt} (\mu_j - \mu)^3 + 3 \sum_{j=1}^{k} \pi_{jt}\, \sigma_j^2 (\mu_j - \mu) \qquad (6) \\
&= \sum_{j=1}^{k} \pi_{jt} (\mu_j - \mu)^3 + 3 \sum_{j=1}^{k} \sum_{i<j} \pi_{jt} \pi_{it} (\sigma_j^2 - \sigma_i^2)(\mu_j - \mu_i).
\end{aligned}
$$

It is instructive to consider (6) for the two-component case, where k = 2. Then (6) becomes
$$
E(r_t - \mu)^3 = \pi_{1t}\pi_{2t}\left[(\pi_{2t} - \pi_{1t})(\mu_1 - \mu_2)^3 + 3(\mu_1 - \mu_2)(\sigma_1^2 - \sigma_2^2)\right]. \qquad (7)
$$

In practice, we often find a bear market regime, say Regime 2, with a relatively
low regime probability (π2t < π1t ), a high variance (σ22 > σ12 ), and a low mean
return (µ2 < µ1 ). Then (7) implies that such a combination results in negative
skewness, which is often observed in the distribution of asset returns.
With respect to kurtosis, consider the case of equal component means, i.e.,
µ1 = · · · = µk = µ. Then the coefficient of excess kurtosis over the normal
distribution, $\kappa$, is
$$
\kappa := \frac{E(r_t - \mu)^4}{E^2(r_t - \mu)^2} - 3 = 3\, \frac{\sum_j \pi_{jt} \sigma_j^4 - \left(\sum_j \pi_{jt} \sigma_j^2\right)^2}{\left(\sum_j \pi_{jt} \sigma_j^2\right)^2} = 3\, \frac{\mathrm{Var}(\sigma^2)}{E^2(\sigma^2)} > 0. \qquad (8)
$$

Although it is well-known that the moment-based measures of skewness and kurtosis must be interpreted with care, it can be shown that the mixture density with equal component means is in fact leptokurtic, i.e., it has fatter tails and higher peaks than the normal distribution with the same variance.
In addition to flexibly accommodating nonnormalities of the unconditional
return distribution, the mixture approach is able to account for regime-specific
dependence structures in a very natural way, while still appealing to correla-
tion matrices in the context of (conditionally) normally distributed returns,
which will appeal to portfolio managers used to thinking in these terms.
Finally, the normal mixture has a very attractive property in applications
to portfolio selection. Namely, if the return vector, rt , has a k-component
multivariate mixed normal distribution as in (3), then it is straightforward to
see that the return, $r_t^p$, on a portfolio formed from these assets, i.e., $r_t^p = w'r_t$, where $w$ is an $N \times 1$ vector of portfolio weights, has a k-component univariate normal mixture distribution,¹ i.e., it has density
\[
f(r_t^p) = \sum_{j=1}^{k} \frac{\pi_{jt}}{\sqrt{2\pi}\,\sigma_j} \exp\left\{-\frac{1}{2}\left(\frac{r_t^p - \mu_j}{\sigma_j}\right)^{2}\right\}, \tag{9}
\]
where $\mu_j = w'\mu_j$ and $\sigma_j^2 = w'\Sigma_j w$.
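As an illustration of (9), the following sketch evaluates the portfolio-return density under a multivariate normal mixture; all inputs below (weights, regime parameters) are hypothetical placeholders, not values from the chapter.

```python
import numpy as np

def portfolio_mixture_density(r_p, w, pi, mus, Sigmas):
    """Density (9) of the portfolio return r_p = w'r under a k-component
    multivariate normal mixture with weights pi, means mus, covariances Sigmas."""
    dens = 0.0
    for pij, mu_j, Sigma_j in zip(pi, mus, Sigmas):
        m_j = w @ mu_j                      # component mean of w'r
        s_j = np.sqrt(w @ Sigma_j @ w)      # component std. dev. of w'r
        dens += pij / (np.sqrt(2 * np.pi) * s_j) * np.exp(-0.5 * ((r_p - m_j) / s_j) ** 2)
    return dens

# Hypothetical two-regime, two-asset example
w = np.array([0.5, 0.5])
pi = [0.7, 0.3]
mus = [np.array([0.2, 0.1]), np.array([-0.1, -0.2])]
Sigmas = [np.array([[1.0, 0.3], [0.3, 1.5]]), np.array([[4.0, 2.0], [2.0, 5.0]])]
print(portfolio_mixture_density(0.0, w, pi, mus, Sigmas))
```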
There is considerable evidence for the presence of market regimes with
distinctly different stochastic properties of stock returns, e.g., [56, 28, 49, 57,
5, 6, 9, 20, 21, 22, 23, 3, 7, 48]. Often researchers identify a bull market regime
with high expected returns and relatively low variances, and a bear market
regime with lower returns and higher variances. Moreover, when mixture mod-
els are applied to multivariate return series, significantly different correlation
structures are usually found, where the correlations are higher in the bear
market regime [49, 13, 5, 6, 20, 23].

¹ This can be seen, for example, by just writing down the moment generating function of the portfolio return.

To complete the formulation of the mixture model given by (1) and (2), we
need to specify the stochastic process generating the market regimes, i.e., the
evolution of the mixing weights (2). While the independent mixture model,
where the regime probabilities are constant over time, has recently attracted
some interest in the context of normal mixture GARCH models [21, 3, 58], it
seems more likely that regimes are persistent. That is, if we are currently in a bull market, the probability of being in a bull market in the next period will be larger than if the current regime were a bear market. If regimes are persistent, this persistence should clearly be incorporated into the model, because it implies that the regimes are predictable, and such predictability can be exploited for asset allocation purposes.
To allow for predictability of regimes, we adopt the Markov-switching tech-
nique which has become very popular in econometrics and empirical finance
since the seminal work of Hamilton [24]. In this model, it is assumed that the
probability of being in regime j at time t depends on the regime at time t − 1
via the time-invariant transition probabilities pij , defined by

pij := Pr(st = j|st−1 = i), j = 1, . . . , k − 1, (10)


and $p_{ik} = 1 - \sum_{j=1}^{k-1} p_{ij}$, $i = 1,\dots,k$. It will be useful to collect the transition probabilities in the $k \times k$ transition matrix $P$,
\[
P = \begin{pmatrix} p_{11} & p_{21} & \cdots & p_{k1} \\ p_{12} & p_{22} & \cdots & p_{k2} \\ \vdots & \vdots & & \vdots \\ p_{1k} & p_{2k} & \cdots & p_{kk} \end{pmatrix}. \tag{11}
\]

In general, if we are in regime j at time t, we anticipate that regime j will


continue with probability pjj . Thus, if regimes are persistent, this will be
reflected in rather large diagonal elements of the transition matrix P , which
can also be characterized as the “staying probabilities”.
It is worthwhile to note that, when regimes are persistent, volatility clus-
tering is also accommodated, i.e., the observation that “large [price] changes
tend to be followed by large changes – of either sign – and small changes tend
to be followed by small changes” [39]. Intuitively, if the “staying probabilities”
are large, then high-volatility regimes tend to be followed by high-volatility
regimes, and low-volatility regimes tend to be followed by low-volatility
regimes. To be more precise, let us consider the univariate case (i.e., N = 1)
with two regimes (see [54] for a more general discussion of the moments of
Markov-switching models). Then it can be shown that the autocovariance
function of the squared returns is given by

\[
\operatorname{Cov}(r_t^2, r_{t-\tau}^2) = \pi_{1,\infty}(1-\pi_{1,\infty})\,\delta^{\tau}\,(\sigma_1^2 - \sigma_2^2 + \mu_1^2 - \mu_2^2)^2,
\]

where π1,∞ = (1 − p22 )/(2 − p11 − p22 ) is the unconditional probability of the
first regime (to be defined more precisely in (15)), and δ = p11 + p22 − 1 may

be viewed as a measure for the regimes’ persistence. Thus, the persistence in


the second moments increases with the persistence of the regimes.
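The quantities just introduced are easy to compute from a transition matrix. The sketch below, using hypothetical transition probabilities, obtains the stationary regime probabilities (cf. (15)) and the persistence measure $\delta = p_{11} + p_{22} - 1$ for the two-regime case.

```python
import numpy as np

# Hypothetical 2x2 transition matrix, column i = Pr(s_t = . | s_{t-1} = i),
# following the column convention of (11)
P = np.array([[0.97, 0.10],
              [0.03, 0.90]])

# Stationary distribution: eigenvector of P for eigenvalue 1, normalized to sum to 1
eigval, eigvec = np.linalg.eig(P)
pi_inf = np.real(eigvec[:, np.argmin(np.abs(eigval - 1))])
pi_inf = pi_inf / pi_inf.sum()

# For k = 2 this agrees with pi_{1,inf} = (1 - p22) / (2 - p11 - p22)
p11, p22 = P[0, 0], P[1, 1]
print(pi_inf[0], (1 - p22) / (2 - p11 - p22))

# Persistence measure entering the autocovariance of squared returns
delta = p11 + p22 - 1
print(delta)
```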
Unfortunately, the regimes are not observable, so that we cannot use the
transition probabilities directly to produce forecasts of the future regimes.
However, we can use the observed return history to compute regime infer-
ences once we have estimated the vector of model parameters, θ, consisting
of the component means and the independent elements of the component
covariance matrices and the transition matrix. Simple algorithms for calcu-
lating regime inferences have been developed in the literature and are briefly
reviewed in the next subsection.

2.1 Inference About Market Regimes

As the market regimes are not directly observable, we can only use observed
returns to make probability statements about the market’s past, current, or
future states. Forecasts of future market regimes are needed for optimal out-
of-sample portfolio choices, and regime inferences are also an ingredient of the
EM algorithm for parameter estimation, which will be discussed in Sect. 3.1.
[24, 35] have developed algorithms to calculate such probabilities, and we
briefly summarize their results here (see also [26]).
To this end, we introduce, for each point of time, t, a new (unobserved)
k-dimensional random vector zt = (z1t , . . . , zkt ) , t = 1, . . . , T , with elements
zjt such that
\[
z_{jt} = \begin{cases} 1 & \text{if } s_t = j \\ 0 & \text{if } s_t \neq j \end{cases} \qquad j = 1,\dots,k, \; t = 1,\dots,T. \tag{12}
\]

That is, zjt equals one or zero according to whether or not the return vector at time t, rt, has been generated by the jth component of the mixture.
Moreover, let Rτ be the return history up to time τ , i.e., Rτ = (r1 , . . . , rτ ),
τ = 1, . . . , T , and let θ be the vector of model parameters. Then our
probability inference of being in state j at time t, based on the return his-
tory up to time τ , Rτ , and the parameter vector, θ, will be denoted by
Pr(zjt = 1|Rτ , θ) = zjt|τ , and zt|τ = (z1t|τ , . . . , zkt|τ ) .
Using these definitions, [26] shows that zt|t and zt+1|t can be recursively
computed via
\[
z_{t|t} = \frac{z_{t|t-1} \odot \eta_t}{\mathbf{1}_k'(z_{t|t-1} \odot \eta_t)} \tag{13}
\]
\[
z_{t+1|t} = P\,z_{t|t}, \tag{14}
\]
where $\odot$ is the Hadamard product, denoting the elementwise multiplication of conformable matrices, $\mathbf{1}_k$ is a k-dimensional column of ones, and the $k \times 1$ vector $\eta_t$ is given by $\eta_t = (f(r_t|s_t=1),\dots,f(r_t|s_t=k))'$, with $f(r_t|s_t=j)$ as given in (1).

Note that, when we use the information up to time t to compute the


conditional return density at time t + 1, then the elements of zt+1|t defined
in (14) are the relevant regime probabilities, πjt , to be used in (3). More
generally, the τ -step regime forecasts can be obtained from

\[
z_{t+\tau|t} = P^{\tau} z_{t|t}.
\]

It can be shown that, under general conditions (usually satisfied in practice),


there exists a vector π∞ = (π1,∞ , . . . , πk,∞ ) , which does not depend on zt|t ,
such that
\[
\lim_{\tau\to\infty} z_{t+\tau|t} = \lim_{\tau\to\infty} P^{\tau} z_{t|t} = \pi_{\infty}. \tag{15}
\]

Consequently, πj,∞ , j = 1, . . . , k, is referred to as the stationary, or uncondi-


tional, probability of regime j.
To initialize the recursion given by (13) and (14), a vector of initial probabilities, $\varrho$, with elements
\[
\varrho_j := z_{j1|0} = \Pr(s_1 = j), \qquad j = 1,\dots,k, \tag{16}
\]

needs to be either fixed or estimated. The EM algorithm discussed in Sect. 3.1


leads to a natural estimate of the $\varrho_j$'s, which is given in (32).
Having specified an initial probability, the algorithm (13) and (14) can be
used to compute the likelihood function of the sample of observed data, RT ,
at a value of the parameter vector, θ, as


\[
\log L(\theta|R_T) = \sum_{t=1}^{T} \log f(r_t|R_{t-1},\theta) = \sum_{t=1}^{T} \log\bigl[\mathbf{1}_k'(z_{t|t-1} \odot \eta_t)\bigr]. \tag{17}
\]

We also need the so-called smoothed regime inferences, i.e., the regime
probabilities conditional on the entire return history, RT . [35] derived a con-
venient algorithm for this purpose, which works backwards through

\[
z_{t|T} = z_{t|t} \odot \bigl[P'(z_{t+1|T} \oslash z_{t+1|t})\bigr], \tag{18}
\]

where $\oslash$ denotes element-by-element division.


A final quantity that is required as an input for the EM algorithm is the
(smoothed) joint probability

Pr(zi,t−1 = 1, zjt = 1|RT , θ) =: zij,t|T , i, j = 1, . . . , k, t = 2, . . . , T, (19)

for which we have [35, 26]


\[
z_{ij,t|T} = \frac{p_{ij}\, z_{i,t-1|t-1}\, z_{jt|T}}{z_{j,t|t-1}}. \tag{20}
\]
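A compact sketch of the filtering recursion (13)–(14), the log likelihood (17), and the backward smoother (18) might look as follows; the routine expects precomputed component densities $\eta_t$ and is only an illustration under the column convention of (11), not the authors' code.

```python
import numpy as np

def filter_smooth(eta, P, varrho):
    """Hamilton filter (13)-(14), log likelihood (17) and smoother (18).

    eta    : (T, k) array, eta[t, j] = f(r_t | s_t = j)
    P      : (k, k) transition matrix with the column convention of (11)
    varrho : (k,) initial regime probabilities, cf. (16)
    """
    T, k = eta.shape
    z_tt = np.zeros((T, k))      # z_{t|t}
    z_tp = np.zeros((T, k))      # z_{t|t-1}
    loglik = 0.0
    pred = varrho
    for t in range(T):
        z_tp[t] = pred
        num = pred * eta[t]                  # Hadamard product in (13)
        denom = num.sum()
        loglik += np.log(denom)              # contribution to (17)
        z_tt[t] = num / denom
        pred = P @ z_tt[t]                   # (14)
    # Backward smoothing, cf. (18)
    z_tT = np.zeros((T, k))
    z_tT[-1] = z_tt[-1]
    for t in range(T - 2, -1, -1):
        z_tT[t] = z_tt[t] * (P.T @ (z_tT[t + 1] / z_tp[t + 1]))
    return z_tt, z_tp, z_tT, loglik
```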

3 The Markov-Mixture of Common Correlation Models


As discussed in Sect. 1, estimating the correlation structure of returns for
portfolio selection has always been one of the most intricate hurdles to the
practical application of standard mean-variance theory, because the sample
covariance matrix may be estimated with substantial error, especially for high-
dimensional problems.
This is an even more severe concern in the context of mixture models,
because, in the presence of k mixture components, we have to estimate k
component covariance matrices. Clearly, the remedy is, just as in the tradi-
tional framework, to impose some structure on these matrices. This gives rise
to a more parsimonious parametrization of the regime-dependent correlation
matrices which, hopefully, helps to efficiently filter out the systematic infor-
mation from observed return series, and thus to reduce estimation error and
forecast future comovements.
Here, we consider a generalization of the common correlation model
(CCM) originally proposed in [15] to the mixture framework. The underlying
idea of the CCM is that “historical data only contain information concerning
the mean correlation coefficients and that observed pairwise differences from
the average are random or sufficiently unstable” [16], so that the best way
to forecast future correlations is to use the average of the observed historical
sample correlation coefficients. Thus, this model, in its simplest form, reduces
the number of parameters to be estimated for the correlation matrix from
N (N − 1)/2 to one. Despite its striking simplicity, and perhaps somewhat
surprisingly, the model has been found in a number of studies to deliver su-
perior out-of-sample forecasts of return comovements when compared to the
sample covariance matrix or the single index model [15, 16, 18].
Given the competitive performance of the CCM, it seems to be a promising
task to embed this approach to parsimoniously parameterizing the correlation
matrix into the normal mixture framework. In this way we can combine the
remarkable simplicity of the original approach with the undeniable nonnor-
malities in the distribution of asset returns, which renders the use of the
standard CCM somewhat unsatisfactory. The resulting model, coupled with a
Markov chain generating the market regimes, will be termed Markov-mixture
of CCMs, or, in short, Markov-MCCM.
As explained above, the CCM assumes that the correlation between all
pairs of securities is the same. In the Markov-mixture of CCMs, we will assume
that, within each regime, there is a common correlation, ρj , between all pairs
of stocks. That is, the covariance matrix of component j, Σj , can be written as
Σj = Dj Rj Dj , j = 1, . . . , k, (21)
where $D_j = \operatorname{diag}(\sigma_{1j},\dots,\sigma_{Nj})$ is a diagonal matrix with the standard deviations of the jth component on the main diagonal, and $R_j$ is the correlation matrix with ones on the main diagonal and $\rho_j$ elsewhere, i.e.,
\[
R_j = (1-\rho_j)\,I_N + \rho_j\,\mathbf{1}_N\mathbf{1}_N', \tag{22}
\]

where IN is the identity matrix of dimension N , and 1N is an N -dimensional


column of ones. We will need the fact that

\[
|R_j| = (1-\rho_j)^{N-1}\bigl[(N-1)\rho_j + 1\bigr], \tag{23}
\]
and that $R_j^{-1}$ is a matrix with
\[
\bar r_j := \frac{1+(N-2)\rho_j}{1+(N-2)\rho_j-(N-1)\rho_j^2} \tag{24}
\]
on the main diagonal and
\[
r_j := \frac{-\rho_j}{1+(N-2)\rho_j-(N-1)\rho_j^2} \tag{25}
\]

elsewhere.2 By a simple induction, we note from (23) that we require


\[
-\frac{1}{N-1} < \rho_j < 1, \qquad j = 1,\dots,k, \tag{26}
\]
for Rj to be a valid (positive definite) correlation matrix. As noted by
Samuelson [51], (26) is a nice formula because it shows that, in accordance
with intuition, “although there is no limit on the degree to which all invest-
ments can be positively intercorrelated, it is impossible for all to be strongly
negatively correlated”.
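The algebra in (22)–(26) is easy to verify numerically. The sketch below builds a common correlation matrix for an arbitrary hypothetical dimension N and correlation ρ and checks the determinant and inverse formulas.

```python
import numpy as np

def ccm_matrix(N, rho):
    """Common correlation matrix R = (1 - rho) I_N + rho 1_N 1_N', cf. (22)."""
    return (1 - rho) * np.eye(N) + rho * np.ones((N, N))

N, rho = 5, 0.4                      # hypothetical dimension and common correlation
R = ccm_matrix(N, rho)

# Determinant formula (23)
det_closed = (1 - rho) ** (N - 1) * ((N - 1) * rho + 1)
print(np.isclose(np.linalg.det(R), det_closed))

# Diagonal and off-diagonal elements of R^{-1}, cf. (24)-(25)
q = 1 + (N - 2) * rho - (N - 1) * rho ** 2
r_bar = (1 + (N - 2) * rho) / q
r_off = -rho / q
R_inv = np.linalg.inv(R)
print(np.isclose(R_inv[0, 0], r_bar), np.isclose(R_inv[0, 1], r_off))

# Positive definiteness holds only for -1/(N-1) < rho < 1, cf. (26)
print(np.all(np.linalg.eigvalsh(ccm_matrix(N, -1 / (N - 1) + 1e-6)) > 0))
```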
Note that, in the mixture of CCMs, the overall, or unconditional, correla-
tion matrix of returns need not be of the common correlation type, because,
from (5), the unconditional covariance matrix is


\[
\operatorname{Cov}(r_t) = \sum_{j=1}^{k} \pi_{j,\infty}\,\Sigma_j + \sum_{j=1}^{k} \pi_{j,\infty}\,(\mu_j-\mu)(\mu_j-\mu)', \tag{27}
\]

where the πj,∞ ’s are the unconditional regime probabilities defined in (15).
This shows that the pairwise correlations may differ due to different regime
means. In applications to financial returns, however, the differences between
the regime means tend to be small, relative to those between the vari-
ances, so that the overall covariance matrix (27) is approximately common
correlation-like.
The parameters of the standard, single-regime CCM are usually estimated
by simply equating (average) sample moments and theoretical quantities. That
is, estimate the individual means and standard deviations by their sample
analogues, and equate the common correlation coefficient to the average of the
² This follows from (22) and the Sherman–Morrison formulas for the determinant and the inverse, respectively, stating that, for an invertible matrix $A$ and conformable vectors $u$ and $v$, $|A + uv'| = |A|(1 + v'A^{-1}u)$, and $(A+uv')^{-1} = A^{-1} - A^{-1}uv'A^{-1}(1+v'A^{-1}u)^{-1}$ [53].

N (N − 1)/2 sample pairwise correlations [15, 4, 37]. This procedure is not fea-
sible for the mixture of CCMs, because we do not observe sample counterparts
of the regime-specific means, variances and correlation coefficients. However,
estimation can be carried out in a fast and stable manner by employing the
Expectation–Maximization (EM) algorithm of [14], which will be developed
in the next two subsections.
Note that the parameter vector, θ, for this model consists of the compo-
nent means, variances, and correlation coefficients, as well as of the k(k − 1)
independent elements of the transition matrix (11), i.e., it is of dimension
k(2N + k), which increases linearly with the number of assets under study.

3.1 Parameter Estimation via the EM Algorithm

In this section we discuss the computation of the maximum likelihood esti-


mator (MLE) of the parameters of the Markov-MCCM via the EM algorithm
of [14]. This is a broadly applicable approach that provides a convenient pro-
cedure for iteratively finding the MLE in situations that can be described as
missing-data problems, and where ML estimation, but for the absence of some
additional data, would be straightforward.3
As in Sect. 2.1, we denote by RT the sample of observed return data, that
is, RT = (r1 , . . . , rT ), having density f (RT ; θ), where θ is the parameter to
be estimated. Let ZT = (z1 , . . . , zT ) denote the additional data, referred to
as the unobserved or missing data, and let XT denote the so-called complete
data, i.e., XT = (RT , ZT ). Its density will be denoted by f c (XT ; θ), and
the complete-data log likelihood that could be formed for θ if XT were fully
observable is given by log Lc (θ|XT ) = log f c (XT ; θ).
The EM algorithm proceeds as follows. Let θ(n−1) be the estimate of θ
that has been determined on the (n − 1)th iteration of the algorithm. The nth
iteration consists of two steps, the E-step and the M-step (E for “expectation”
and M for “maximization”). On the E-step, compute the expectation
\[
Q(\theta;\theta^{(n-1)}) := E\left[\log L^c(\theta|X_T)\,\middle|\,R_T,\, \theta^{(n-1)}\right], \tag{28}
\]

that is, compute the expectation of the complete-data log likelihood with
respect to the missing data, given the observed data and the current fit, θ(n−1) .
Next, on the M-step, solve the complete-data likelihood equations
\[
\frac{\partial Q(\theta;\theta^{(n-1)})}{\partial \theta} = 0, \tag{29}
\]
to find an update θ(n) for θ. The E-step (28) and the M-step (29) are alter-
nated repeatedly until convergence is achieved, that is, until the difference
³ [40] provides an exhaustive presentation of the EM algorithm, its theory and numerous examples. A discussion of EM with special emphasis on mixture models is given in [50]. The EM algorithm for a general class of Markov-switching models was derived in [25].

θ(n) − θ(n−1) and/or log L(θ(n) |RT ) − log L(θ(n−1) |RT ) does not exceed a
prespecified (small) value, where log L(θ|RT ) = log f (RT ; θ) is the observed-
data log likelihood (17), i.e., the function to be maximized. A well-known
result of [14] is that log L(θ(n) |RT ) ≥ log L(θ(n−1) |RT ), i.e., the likelihood is
not decreased after an EM iteration.4
It turns out that, for the Markov-MCCM, we need an extension of the
EM algorithm, namely, the ECM algorithm of [44]. The ECM algorithm is an
extension of the EM algorithm, where the maximization on the M-step is bro-
ken into a number of conditional maximization (CM) steps. This procedure is
preferable if the complete-data maximization in the M-step of the EM algo-
rithm is complicated, and the CM-steps are simple. [44] shows that the ECM
algorithm shares the monotone convergence property of the EM algorithm.
To formulate the ECM algorithm for the Markov-MCCM, we let the un-
observed zt = (z1t , . . . , zkt ) , t = 1, . . . , T , be given by the quantities defined
in (12).
Under this missing-data formulation, the complete-data density is given by

\[
f^c(X_T;\theta) = \prod_{j=1}^{k} (\varrho_j\,\phi_{j1})^{z_{j1}} \prod_{t=2}^{T} \prod_{j=1}^{k} \prod_{i=1}^{k} (p_{ij}\,\phi_{jt})^{z_{jt} z_{i,t-1}}, \tag{30}
\]

where φjt is short-hand notation for φ(rt ; µj , Σj ), denoting the normal density
with mean µj and covariance matrix Σj . Consequently, the complete-data log
likelihood is given by


\[
\log L^c(\theta|X_T) = \sum_{j=1}^{k} z_{j1}\log\varrho_j + \sum_{t=2}^{T}\sum_{j=1}^{k}\sum_{i=1}^{k} z_{ij,t}\log p_{ij}
\]
\[
\qquad\qquad + \sum_{t=1}^{T}\sum_{j=1}^{k} z_{jt}\log\phi_{jt}, \tag{31}
\]

where zij,t = zjt zi,t−1 . As can be seen from (31), the E-step of the nth iter-
ation requires the evaluation of the conditional expectation of the zjt ’s and
zij,t ’s, given the observed data, RT , and the current fit, θ(n−1) , which can be
accomplished using (18) and (20).
The updating formulas for the initial probabilities, $\varrho_j$, $j = 1,\dots,k$,⁵ and the elements of the transition matrix, $p_{ij}$, $i, j = 1,\dots,k$, have been derived in [25]. They are given by
\[
\varrho_j^{(n)} = z_{j1|T}^{(n)}, \quad \text{and} \quad p_{ij}^{(n)} = \frac{\sum_{t=2}^{T} z_{ij,t|T}^{(n)}}{\sum_{t=2}^{T} z_{i,t-1|T}^{(n)}}, \qquad i,j = 1,\dots,k, \tag{32}
\]

⁴ Besides its theoretical appeal, this is a useful property in practice, as it may help to detect programming errors by monitoring the change in log likelihood after each iteration.
⁵ Recall the definition of $\varrho_j$ in (16).
where $z_{jt|T}^{(n)}$ and $z_{ij,t|T}^{(n)}$ are given by (18) and (20), respectively, with $\theta$ fixed at $\theta^{(n-1)}$. Moreover, the updating formula for the component means is given by
\[
\mu_j^{(n)} = \left(\sum_{t=1}^{T} z_{jt|T}^{(n)}\right)^{-1} \sum_{t=1}^{T} z_{jt|T}^{(n)}\, r_t, \qquad j = 1,\dots,k. \tag{33}
\]

A similar formula holds for the covariance matrices in the unrestricted


Markov-switching model. However, while the update equations (32) and (33)
remain valid for the Markov-MCCM, the CM-steps for the elements of the
covariance matrices need development, which is pursued next.
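Given smoothed regime probabilities (computed, e.g., with a filter/smoother as in Sect. 2.1), the updates (32) and (33) are simple weighted averages. The following sketch takes generic smoothed-probability arrays as inputs; all names and shapes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def ecm_updates_basic(returns, z_tT, z_ij_tT):
    """Updates (32)-(33): initial probabilities, transition matrix, component means.

    returns : (T, N) array of return vectors r_t
    z_tT    : (T, k) smoothed probabilities z_{jt|T}, cf. (18)
    z_ij_tT : (T, k, k) joint smoothed probabilities z_{ij,t|T}, cf. (20),
              indexed as [t, i, j]; entries for t = 0 are not used
    """
    T, k = z_tT.shape
    varrho_new = z_tT[0]                                      # (32), first part
    num = z_ij_tT[1:].sum(axis=0)                             # sum_t z_{ij,t|T}
    denom = z_tT[:-1].sum(axis=0)                             # sum_t z_{i,t-1|T}
    P_new = (num / denom[:, None]).T                          # column convention of (11)
    mu_new = (z_tT.T @ returns) / z_tT.sum(axis=0)[:, None]   # (33), row j = mu_j'
    return varrho_new, P_new, mu_new
```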

3.2 The ECM Algorithm for the MCCM

The E-step of the ECM algorithm is given by (28); and the M-step for the
initial probabilities and the elements of the transition matrix is given by (32).
Thus, to derive the M-step for the parameters of the component densities, we
only need to consider the term in the second line of (31). To this end, let

\[
\sigma_j = (\sigma_{1j},\dots,\sigma_{Nj})', \qquad j = 1,\dots,k, \tag{34}
\]

be the N × 1-vector collecting the standard deviations of the jth component,


so that, in (21),
Dj = diag(σj ), j = 1, . . . , k.
Then, maximization of the term in the second line of (31) requires maximizing
\[
Q_{j2}(\mu_j,\sigma_j,\rho_j) := -\frac{1}{2}\log|D_j R_j D_j| - \frac{1}{2}\Bigl(\sum_{t=1}^{T} z_{jt|T}^{(n)}\Bigr)^{-1} \sum_{t=1}^{T} z_{jt|T}^{(n)} (r_t-\mu_j)' D_j^{-1} R_j^{-1} D_j^{-1} (r_t-\mu_j) \tag{35}
\]
with respect to $\mu_j$, $\sigma_j$, and $\rho_j$, separately for $j = 1,\dots,k$. As $\mu_j^{(n)}$ is given by (33), (35) becomes
\[
Q_{j2}(\mu_j,\sigma_j,\rho_j) = -\log|D_j| - \frac{1}{2}\log|R_j| - \frac{1}{2}\operatorname{tr}\bigl[D_j^{-1} R_j^{-1} D_j^{-1} S_j\bigr], \tag{36}
\]
where
\[
S_j = \Bigl(\sum_{t=1}^{T} z_{jt|T}^{(n)}\Bigr)^{-1} \sum_{t=1}^{T} z_{jt|T}^{(n)} \bigl(r_t - \mu_j^{(n)}\bigr)\bigl(r_t - \mu_j^{(n)}\bigr)'. \tag{37}
\]

Denote a typical element of the $N \times N$ matrix $S_j$ by $s_{vw}^{j}$, $v, w = 1,\dots,N$, $j = 1,\dots,k$.
As explained above, the ECM-algorithm replaces the M-step of the
EM-algorithm with several CM-steps. In the current application, on the

first CM-step, we maximize the complete-data log likelihood with respect to $\sigma_j = (\sigma_{1j},\dots,\sigma_{Nj})'$, with $\rho_j$ fixed at $\rho_j^{(n-1)}$. Then, on the second CM-step, an update of $\rho_j$ is calculated, with $\sigma_j = \sigma_j^{(n)}$ given from the first CM-step.
The log likelihood for the elements of σj , given ρj , is


\[
\log L(\sigma_j) = -\sum_{i=1}^{N} \log \sigma_{ij} - \frac{\bar r_j}{2} \sum_{i=1}^{N} \frac{s_{ii}^{j}}{\sigma_{ij}^2} - \frac{r_j}{2} \sum_{i=1}^{N} \sum_{m \neq i} \frac{s_{im}^{j}}{\sigma_{ij}\sigma_{mj}}. \tag{38}
\]

Define ψij = 1/σij , and ψj = (ψ1j , . . . , ψN j ) , j = 1, . . . , k. Then (38) can be


written as

N
r̄j  j 2
N
rj   j
N
log L(ψj ) = log ψij − sii ψij − sim ψij ψmj . (39)
i=1
2 i=1 2 i=1
m=i

The first order conditions with respect to ψj are

\[
\frac{\partial \log L(\psi_j)}{\partial \psi_{\ell j}} = \frac{1}{\psi_{\ell j}} - \bar r_j\, s_{\ell\ell}^{j}\,\psi_{\ell j} - r_j \sum_{i \neq \ell} s_{\ell i}^{j}\,\psi_{ij} = 0, \qquad \ell = 1,\dots,N. \tag{40}
\]

Equation (40) may be written more compactly as


\[
\frac{\partial \log L(\psi_j)}{\partial \psi_j} = \Bigl[(\operatorname{diag}(\psi_j))^{-1} - \bigl(R_j^{-1} \odot S_j\bigr)\operatorname{diag}(\psi_j)\Bigr]\mathbf{1}_N = 0_N, \tag{41}
\]
where $0_N$ and $\mathbf{1}_N$ denote $N \times 1$ vectors of zeros and ones, respectively. The Hessian matrix is given by
\[
\frac{\partial^2 \log L(\psi_j)}{\partial \psi_j\,\partial \psi_j'} = -(\operatorname{diag}(\psi_j))^{-2} - R_j^{-1} \odot S_j. \tag{42}
\]

As Rj−1 and Sj are positive definite, their Hadamard product is positive defi-
nite by the Schur product theorem (cf. [30], Theorem 5.2.1); hence, the Hessian
of (39) is negative definite for all ψj , and the unique solution of (41) is the
global maximum of the log likelihood. To solve (41), we multiply (40) by $\psi_{\ell j}$ to observe that we require
\[
\bar r_j\, s_{\ell\ell}^{j}\,\psi_{\ell j}^2 + r_j \sum_{i\neq\ell} s_{\ell i}^{j}\,\psi_{ij}\,\psi_{\ell j} - 1 = 0, \qquad \ell = 1,\dots,N. \tag{43}
\]

Hence, we may solve (41) by iterating on


\[
\psi_{\ell j}^{[d]} = \frac{1}{2\,\bar r_j\, s_{\ell\ell}^{j}}\left(-\,r_j \sum_{i\neq\ell} s_{\ell i}^{j}\,\psi_{ij}^{[d-1]} + \sqrt{r_j^2\Bigl(\sum_{i\neq\ell} s_{\ell i}^{j}\,\psi_{ij}^{[d-1]}\Bigr)^{2} + 4\,\bar r_j\, s_{\ell\ell}^{j}}\right), \qquad \ell = 1,\dots,N. \tag{44}
\]

In (44), we have used brackets instead of parentheses for the superscript in order to distinguish $\psi_{\ell j}^{[d]}$ from the final EM-update on the nth iteration of the algorithm, i.e., $\psi_{\ell j}^{(n)} = \lim_{d\to\infty} \psi_{\ell j}^{[d]}$.
So far this iteration scheme has always converged on test data. A partial theoretical justification for the use of (44) can also be provided by showing that it defines a local contraction at the MLE, i.e., at $\psi_j^{(n)}$.⁶
Once ψj (and hence σj ) is determined, we can write the log likelihood for
ρj as

\[
\log L(\rho_j) = -\log|R_j| - \bar r_j \sum_{i=1}^{N} \frac{s_{ii}^{j}}{\sigma_{ij}^2} - 2\,r_j \sum_{i=1}^{N}\sum_{m<i} \frac{s_{im}^{j}}{\sigma_{ij}\sigma_{mj}} = -\log|R_j| - \bar r_j\, a_{j1} - r_j\, a_{j2}, \tag{45}
\]
where
\[
a_{j1} = \psi_j'\operatorname{diag}(S_j)\psi_j, \qquad a_{j2} = \psi_j'\bigl[S_j - \operatorname{diag}(S_j)\bigr]\psi_j. \tag{46}
\]
Using (23), (24) and (25), we compute

\[
\frac{d\bar r_j}{d\rho_j} = \frac{2(N-1)\rho_j + (N-1)(N-2)\rho_j^2}{\bigl[1+(N-2)\rho_j-(N-1)\rho_j^2\bigr]^2}, \tag{47}
\]
\[
\frac{d r_j}{d\rho_j} = -\frac{1+(N-1)\rho_j^2}{\bigl[1+(N-2)\rho_j-(N-1)\rho_j^2\bigr]^2}, \tag{48}
\]
and
\[
\frac{d\log|R_j|}{d\rho_j} = \frac{-N(N-1)\rho_j}{1+(N-2)\rho_j-(N-1)\rho_j^2}. \tag{49}
\]
By differentiating (45), inserting (47), (48) and (49), and a few additional algebraic manipulations, we find that $\rho_j^{(n)}$ is a solution of the cubic equation
\[
\mathcal{P}(\rho_j) = \sum_{i=1}^{4} b_i\,\rho_j^{4-i} = 0, \tag{50}
\]
where
\[
b_1 = -N(N-1)^2, \quad b_2 = (N-1)\bigl[(N-2)(N-a_{j1}) + a_{j2}\bigr], \quad b_3 = (N-1)(N - 2a_{j1}), \quad b_4 = a_{j2}.
\]

Calculations show that


1 N N
P − = (aj1 + aj2 ) = ψ  Sj ψj > 0,
N −1 N −1 (N − 1) j
⁶ The details are available from the authors upon request.

and

\[
\mathcal{P}(1) = -N\bigl[(N-1)\,a_{j1} - a_{j2}\bigr]
\]
\[
\qquad = -N\Bigl[(N-1)\sum_{i=1}^{N}\psi_{ij}^2\,s_{ii}^{j} - 2\sum_{i=1}^{N}\sum_{m<i}\psi_{ij}\psi_{mj}\,s_{im}^{j}\Bigr]
\]
\[
\qquad = -N\sum_{i=1}^{N}\sum_{m<i}\bigl(\psi_{ij}^2\,s_{ii}^{j} + \psi_{mj}^2\,s_{mm}^{j} - 2\,\psi_{ij}\psi_{mj}\,s_{im}^{j}\bigr)
\]
\[
\qquad = -N\sum_{i=1}^{N}\sum_{m<i}\begin{pmatrix}\psi_{ij} & \psi_{mj}\end{pmatrix}\begin{pmatrix} s_{ii}^{j} & -s_{im}^{j}\\ -s_{im}^{j} & s_{mm}^{j}\end{pmatrix}\begin{pmatrix}\psi_{ij}\\ \psi_{mj}\end{pmatrix} < 0.
\]

Hence, (50) has at least one root in the admissible interval (−1/(N − 1), 1). It
may happen that (50) has three roots in this interval, but we did not observe
this in practice so far.7 If this occurs, one would choose the root with the
largest corresponding value of (45).
This completes the formulation of the ECM algorithm for the Markov-
MCCM.
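The two CM-steps can be coded directly from (44) and (50). The sketch below iterates (44) to convergence for fixed $\rho_j$ and then updates $\rho_j$ as the admissible root of (50) with the largest value of (45); the function names and starting values are assumptions made for illustration only.

```python
import numpy as np

def cm_step_psi(S, rho, tol=1e-10, max_iter=1000):
    """Iterate (44): inverse standard deviations psi for fixed common correlation rho."""
    N = S.shape[0]
    q = 1 + (N - 2) * rho - (N - 1) * rho ** 2
    r_bar, r_off = (1 + (N - 2) * rho) / q, -rho / q          # (24)-(25)
    psi = 1.0 / np.sqrt(np.diag(S))                           # starting values
    for _ in range(max_iter):
        off = S @ psi - np.diag(S) * psi                      # sum_{i != l} s_li psi_i
        psi_new = (-r_off * off +
                   np.sqrt(r_off ** 2 * off ** 2 + 4 * r_bar * np.diag(S))) \
                  / (2 * r_bar * np.diag(S))
        if np.max(np.abs(psi_new - psi)) < tol:
            return psi_new
        psi = psi_new
    return psi

def cm_step_rho(S, psi):
    """Update rho as the root of (50) in (-1/(N-1), 1) maximizing (45)."""
    N = S.shape[0]
    a1 = psi @ (np.diag(np.diag(S)) @ psi)                    # (46)
    a2 = psi @ ((S - np.diag(np.diag(S))) @ psi)
    b = [-N * (N - 1) ** 2,
         (N - 1) * ((N - 2) * (N - a1) + a2),
         (N - 1) * (N - 2 * a1),
         a2]
    roots = np.roots(b)
    cand = [r.real for r in roots
            if abs(r.imag) < 1e-10 and -1 / (N - 1) < r.real < 1]

    def loglik(rho):                                           # (45)
        q = 1 + (N - 2) * rho - (N - 1) * rho ** 2
        logdetR = (N - 1) * np.log(1 - rho) + np.log(1 + (N - 1) * rho)
        return -logdetR - (1 + (N - 2) * rho) / q * a1 + rho / q * a2

    return max(cand, key=loglik)
```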

3.3 Computing Standard Errors

Approximate standard errors for the Markov-MCCM can be computed by


using a general technique developed in [27] for calculating the gradient of the
log likelihood of a Markov-switching model. To outline the procedure for our
model, we partition the parameter vector as θ = (θ1 , θ2 ) , where θ1 consists of
the component means and the parameters of the covariance matrices, and θ2
contains the independent elements of the transition matrix. Then it follows
from the results in [27] that

\[
\frac{\partial \log f(r_t|R_{t-1},\theta)}{\partial\theta_1} = \sum_{j=1}^{k} z_{jt|t}\,\frac{\partial \log f(r_t|s_t=j)}{\partial\theta_1} \tag{51}
\]
\[
\qquad\qquad + \sum_{\tau=1}^{t-1}\sum_{j=1}^{k} \frac{\partial \log f(r_\tau|s_\tau=j)}{\partial\theta_1}\,\bigl(z_{j\tau|t} - z_{j\tau|t-1}\bigr),
\qquad t = 1,\dots,T,
\]

where f (rt |Rt−1 , θ) is the conditional density of rt , given the history of returns
up to time t − 1; the quantities of the form zjτ |t can be computed using (18);
and the second line of (51) is set to zero for t = 1.

⁷ However, numerical examples can be constructed where this occurs.

Similarly, for the elements of the transition matrix, we have

\[
\frac{\partial \log f(r_t|R_{t-1},\theta)}{\partial p_{ij}} = \frac{1}{p_{ij}}\,z_{ij,t|t} - \frac{1}{p_{ik}}\,z_{ik,t|t} \tag{52}
\]
\[
\qquad + \frac{1}{p_{ij}} \sum_{\tau=2}^{t-1} \bigl(z_{ij,\tau|t} - z_{ij,\tau|t-1}\bigr)
\]
\[
\qquad - \frac{1}{p_{ik}} \sum_{\tau=2}^{t-1} \bigl(z_{ik,\tau|t} - z_{ik,\tau|t-1}\bigr),
\qquad i = 1,\dots,k, \quad j = 1,\dots,k-1, \quad t = 2,\dots,T,
\]

where the quantities of the form zij,τ |t can be obtained from (20); and the
second and third line of (52) are set to zero for t = 2.
An estimate of the information matrix can be constructed from the scores’
average outer product, i.e.,
\[
I(\hat\theta) = \frac{1}{T}\sum_{t=1}^{T}\left(\frac{\partial \log f(r_t|R_{t-1},\theta)}{\partial\theta}\right)\left(\frac{\partial \log f(r_t|R_{t-1},\theta)}{\partial\theta}\right)'\Bigg|_{\theta=\hat\theta},
\]

where $\hat\theta$ is the MLE of $\theta$. Inference can then be based on the approximation
\[
(\hat\theta - \theta) \approx N\bigl(0,\; T^{-1} I(\hat\theta)^{-1}\bigr). \tag{53}
\]

Note that the computation of the derivatives in (51) simplifies considerably


in our model, because all the parameters in θ1 appear in only one regime. Thus,
we only need

\[
\frac{\partial \log f(r_t|s_t=j)}{\partial\mu_j} = \frac{\partial \log\phi_{jt}}{\partial\mu_j} = \Sigma_j^{-1}\delta_{jt}, \qquad j=1,\dots,k,\; t=1,\dots,T,
\]
where $\delta_{jt} = r_t - \mu_j$,
\[
\frac{\partial \log f(r_t|s_t=j)}{\partial\sigma_j} = \frac{\partial \log\phi_{jt}}{\partial\sigma_j} = -D_j^{-1}\mathbf{1}_N + (D_j^{-1}\delta_{jt}) \odot (\Sigma_j^{-1}\delta_{jt}), \qquad j=1,\dots,k,\; t=1,\dots,T,
\]
where $\sigma_j$ is defined in (34). Finally, with $\xi_{jt} = \delta_{jt} \oslash \sigma_j$,
\[
\frac{\partial \log f(r_t|s_t=j)}{\partial\rho_j} = \frac{\partial \log\phi_{jt}}{\partial\rho_j} = -\frac{1}{2}\,\frac{d\log|R_j|}{d\rho_j} - \frac{1}{2}\,\frac{d\bar r_j}{d\rho_j}\,\xi_{jt}'\xi_{jt} - \frac{1}{2}\,\frac{d r_j}{d\rho_j}\,\mathbf{1}_N'\bigl(\xi_{jt}\xi_{jt}' - \operatorname{diag}(\xi_{jt}\xi_{jt}')\bigr)\mathbf{1}_N,
\qquad j=1,\dots,k,\; t=1,\dots,T.
\]
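Once the scores (51)–(52) have been evaluated at the MLE for every observation, the outer-product estimate of the information matrix and the resulting standard errors follow in two lines. A minimal sketch, assuming a generic (T, p) score matrix as input:

```python
import numpy as np

def opg_standard_errors(scores):
    """Outer-product-of-gradients standard errors, cf. (53).

    scores : (T, p) array whose t-th row is the score of observation t
             evaluated at the MLE.
    """
    T = scores.shape[0]
    info = scores.T @ scores / T          # average outer product I(theta_hat)
    cov = np.linalg.inv(info) / T         # approximate covariance of theta_hat
    return np.sqrt(np.diag(cov))
```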

4 Application to International Stock Market Returns


Although our Markov-MCCM is specifically designed for modeling high-
dimensional return vectors, the illustrative application presented in this sec-
tion is based on a three-variable system in order to assess the consequences of
regime-switching for asset allocation. We consider discrete dollar-denominated
weekly percentage returns of the S&P500, FTSE, and DAX indices over the
period from March 1991 to August 2005, a sample of 754 observations.8 We
use weekly as opposed to monthly data because we need a sample long enough
to be able to estimate the different regimes, but without the potential noise
of daily data. Moreover, returns are calculated from Thursday to Thursday in
order to avoid issues such as possible abnormal Monday or Friday returns.
Our in-sample period includes approximately the first ten years of data,
i.e., 520 observations, while the remaining 234 observations are retained for
out-of-sample portfolio construction.

4.1 Estimation Results and Regime-Evidence

In this subsection, we present the estimation results for the in-sample period,
covering the period from March 1991 to February 2001, with T = 520 obser-
vations. We denote the return vector at time t by rt = (r1t , r2t , r3t ) , where
r1t , r2t , and r3t are the time-t returns of the S&P500, the FTSE, and the
DAX, respectively. A few descriptive statistics of the three series, along with
the Jarque–Bera test for normality, are summarized in Table 1. The S&P500 has the largest mean and the smallest variance, but also the greatest kurtosis, while the FTSE has the lowest mean and the lowest kurtosis.

Table 1. Distributional properties of stock market returns. p-values are given in parentheses. "skewness" denotes the coefficient of skewness $\gamma = m_3/m_2^{3/2}$ and "kurtosis" the coefficient of kurtosis $\kappa = m_4/m_2^2$, where $m_i := T^{-1}\sum_t (r_t - \bar r)^i$, $i = 2, 3, 4$, and $\bar r = T^{-1}\sum_t r_t$. The p-values reported for these quantities are based on the result that, under normality, $\gamma$ and $\kappa$ are asymptotically normal with mean 0 and 3, and standard deviation $\sqrt{6/T}$ and $\sqrt{24/T}$, respectively. Thus, $T\gamma^2/6 \sim \chi^2(1)$ and $T(\kappa-3)^2/24 \sim \chi^2(1)$ asymptotically. JB is the Jarque–Bera test for normality, i.e., $JB = T\gamma^2/6 + T(\kappa-3)^2/24 \overset{asy}{\sim} \chi^2(2)$ (see, e.g., [2], p. 286)

                    correlation/covariance matrix
           mean    S&P500    FTSE      DAX     skewness  kurtosis      JB
S&P500    0.251     3.827    0.540    0.501     -0.100     4.321    38.675
                                                (0.352)   (0.000)   (0.000)
FTSE      0.144     2.241    4.494    0.599      0.045     3.203     1.067
                                                (0.676)   (0.345)   (0.587)
DAX       0.235     2.541    3.289    6.715     -0.094     4.075    25.797
                                                (0.384)   (0.000)   (0.000)

⁸ All data have been obtained from Datastream.

The skewness coefficients are not significant for any of the series. Interestingly,
while the Jarque–Bera test strongly rejects normality for the S&P500 and the
DAX series due to their excess kurtosis, this is not the case for the FTSE. The
relatively mild deviations from normality reflected in the statistics reported
in Table 1 make the data a natural candidate to be modeled by a normal
mixture distribution.
We also note that the stock return series display a considerable degree
of comovement, with pairwise correlation coefficients ranging from approxi-
mately 0.5 to 0.6.
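The statistics in Table 1 are straightforward to reproduce from a return series. A short sketch of the moment-based measures and the Jarque–Bera statistic exactly as defined in the table caption, applied here to simulated placeholder data rather than the index returns used in the chapter:

```python
import numpy as np
from scipy import stats

def descriptive_stats(r):
    """Skewness, kurtosis and Jarque-Bera statistic as defined in Table 1."""
    T = len(r)
    m = lambda i: np.mean((r - r.mean()) ** i)
    gamma = m(3) / m(2) ** 1.5
    kappa = m(4) / m(2) ** 2
    jb = T * gamma ** 2 / 6 + T * (kappa - 3) ** 2 / 24
    p_jb = 1 - stats.chi2.cdf(jb, df=2)
    return gamma, kappa, jb, p_jb

r = np.random.default_rng(0).standard_t(df=5, size=520)   # placeholder data
print(descriptive_stats(r))
```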
The Markov-MCCM developed in Sect. 3 will be specified with two and
three components. Parameter estimates for these models are reported in
Table 2,9 where the models are ordered with respect to a decreasing uncondi-
tional probability, i.e., π1,∞ > π2,∞ > π3,∞ .
The parameter estimates for the models have a rather striking pattern.
For example, for the two-component model, i.e., k = 2, the first regime is
characterized by a smaller correlation coefficient and, for all series, higher

Table 2. Parameter estimates for international stock returns. k refers to the number of mixture components in the Markov-MCCM introduced in Sect. 3. Standard errors are given in parentheses

                    k = 2                              k = 3
µ1       (0.266, 0.191, 0.281)              (0.196, 0.136, 0.233)
         (0.086) (0.108) (0.114)            (0.103) (0.141) (0.150)
σ1       (1.469, 1.844, 1.947)              (1.345, 1.961, 2.109)
         (0.053) (0.071) (0.074)            (0.063) (0.093) (0.102)
ρ1        0.383                              0.397
         (0.040)                            (0.045)
π1,∞      0.602                              0.441
µ2       (0.228, 0.074, 0.165)              (0.247, 0.094, 0.206)
         (0.188) (0.180) (0.237)            (0.176) (0.166) (0.222)
σ2       (2.553, 2.419, 3.326)              (2.515, 2.395, 3.273)
         (0.125) (0.131) (0.162)            (0.115) (0.118) (0.155)
ρ2        0.649                              0.643
         (0.032)                            (0.030)
π2,∞      0.398                              0.425
µ3        –                                 (0.442, 0.333, 0.328)
                                            (0.207) (0.172) (0.148)
σ3        –                                 (1.691, 1.346, 1.153)
                                            (0.139) (0.132) (0.117)
ρ3        –                                  0.289
                                            (0.087)
π3,∞      0                                  0.134
P         0.996   0.006                      0.995   0.005   0.000
         (0.003) (0.007)                    (0.004) (0.006) (0.005)
          0.004   0.994                      0.000   0.995   0.015
         (0.003) (0.007)                    (0.003) (0.006) (0.019)
                                             0.005   0.000   0.985
                                            (0.008) (0.004) (0.018)

⁹ The ECM algorithm was terminated, i.e., convergence was considered to be reached, when $\log L(\theta^{(n)}|R_T) - \log L(\theta^{(n-1)}|R_T) \leq 10^{-8}$ and $\|\theta^{(n)} - \theta^{(n-1)}\| \leq 10^{-8}$, where $\|x\| = \sqrt{x'x}$.

means and lower standard deviations than the second regime. Thus, we can
classify the first regime as a bull and the second as a bear market regime. It
should be noted, however, that the standard errors of the component means
are relatively large, so that their differences may not be statistically significant.
We also note that both the bull and the bear market regimes are highly
persistent, with both of the “staying probabilities” being larger than 99%.
In the long run, however, the market is more often in the bullish regime, as
reflected in its unconditional probability of approximately 60%.
The higher kurtosis of the S&P500 and the DAX, when compared to the
FTSE, as reported in Table 1, is reflected in Table 2 by the fact that
the regime-specific standard deviations are more different across regimes for
the former two indices. In fact, recall from (8) that, in normal mixture models, the coefficient of excess kurtosis is proportional to the squared coefficient of variation of the component variances.
While some of the results for the two-component model carry over to the
specification with three components, there are also some differences, which
we discuss briefly. In particular, we still have a bull market regime with high
mean returns, low variances and a small correlation coefficient. However, this
regime now has the smallest unconditional probability and is, thus, classified
as Regime 3 in the three-component model. Instead, in this model, the regime
with the largest stationary probability is a “medium” regime with intermedi-
ate standard deviations (except for the S&P500) and correlation coefficient.
It seems that the bull market regime of the two-component model has been
split into a rather optimistic bull market of the new Regime 3 and the new
“business-as-usual” Regime 1, while the stochastic properties of the second
(bear) market component, as well as its unconditional probability, remain
essentially unchanged.10
Before we turn to out-of-sample portfolio selection, we shall also assess, for
the in-sample-period, the significance of both the presence of regimes as well
as regime-specific correlation coefficients. To do so, we estimate, in addition to
the models above, the standard single-regime CCM as well as Markov-MCCMs
with regime-independent correlation coefficients, i.e., with the restriction that
ρ1 = ρ2 (= ρ3 ).11 Table 3 reports likelihood-based goodness-of-fit measures for
the models. Specifically, we report the values of the maximized log likelihood
function, log L, and the AIC [1] and BIC [52] criteria, respectively.
The improvement in log likelihood when passing from the single-regime
to the two-regime model with regime-specific correlations is 52.46. Although
it is difficult to formally test for the number of regimes, this may be deemed
a significant improvement, in particular as the two-component model ranks
higher with respect to the rather conservative BIC. In this regard it may

¹⁰ Note, however, that the mean of the S&P500 is higher in Regime 2 than in Regime 1, although this may not be statistically significant.
¹¹ The ECM algorithm developed in Sect. 3.2 can easily be modified to incorporate such restrictions. Details are available from the authors upon request.

Table 3. Likelihood-based goodness-of-fit measures. The first row of the table indicates the model, where k refers to the number of components in the Markov-MCCM. The model with k = 1 is the standard single-regime CCM. The notation "k = 2 (ρ1 = ρ2)" is used to denote the two-component Markov-MCCM with equal correlation structure in both components, and an analogous notation is used for the three-component model. K denotes the number of parameters of a model, log L is the log likelihood, AIC = −2 log L + 2K, and BIC = −2 log L + K log T, where T is the number of observations. Smaller values of AIC and BIC are preferred. Entries marked with an asterisk indicate the best model for the particular criterion

           k = 1    k = 2 (ρ1=ρ2)    k = 2    k = 3 (ρ1=ρ2=ρ3)    k = 3
K             7          15            16           25              27
log L    −3227.2     −3188.2       −3174.8      −3164.4         −3151.8
AIC       6468.5      6406.3        6381.6       6378.8          6357.7*
BIC       6498.3      6470.1        6449.6*      6485.1          6472.6

be worthwhile to mention that, in the literature on mixture models, there is


some evidence that the BIC provides a reasonably good indication for the
appropriate number of components (e.g., [19]; see [41] for a survey and fur-
ther references). Note that the single-regime model ranks lowest according
to the BIC, while the two-component model with regime-specific correlations
ranks best.
Formal likelihood-ratio tests (LRT) can be conducted for models with
fixed k, i.e., for MCCMs with regime-independent correlations against those
where the correlations are allowed to switch. The corresponding LRT test
statistics, computed as two times the difference in log likelihood, are 26.77
and 25.09 for the models with k = 2 and k = 3, respectively. Both values
are highly significant when compared to conventional critical values given by
the asymptotically valid χ2 distribution with one and two degrees of freedom,
respectively.
Summarizing the results in Table 3, we conclude that both regimes as well
as regime-dependent correlation structures are important features of the joint
distribution of the international stock returns under study.

4.2 Out-of-Sample Portfolio Selection

Next, we investigate the implications of regime-switching in stock market


returns for portfolio choice decisions. We restrict our analysis to the simplest
case of one-period-ahead all-equity portfolios. That is, for our 234 out-of-
sample return observations, rt , t = 521, . . . , 754, we estimate the models with
one, two, and three components based on data up to time t − 1 with a rolling
window of length 520, and use the parameter estimates to select an optimal
portfolio from the three stock market indices under study to be held until
the end of period t. In this manner, we obtain, for each model, 234 realized

one-period-ahead portfolio returns, the distributional properties of which can


be investigated.
To select portfolios, we adopt an expected utility approach. We assume that the investor's utility function, $U(\cdot)$, can reasonably be approximated by
\[
U(r_t^p) = -\exp\{-c\,r_t^p\}, \tag{54}
\]
where $c > 0$ is the coefficient of risk aversion, and $r_t^p$ is the portfolio return at time $t$, i.e., $r_t^p = w_t'r_t$, with $w_t$ being the portfolio weight vector, satisfying $\mathbf{1}_3'w_t = 1$ and $w_t \geq 0$. We will not attempt to argue that (54) is a "realistic" description of real-world investors' preferences. Since (54) is characterized by a single parameter, $c$, we can investigate the impact of increasing risk aversion on portfolio choice decisions. Moreover, in combination with mixed normally distributed asset returns, it allows a closed-form computation and straightforward optimization of expected utility.
Using (54), and excluding short sales, a Gaussian investor will solve the
quadratic programming problem

\[
\max_{w_t}\; w_t'\mu_t - 0.5\,c\,w_t'\Sigma_t w_t \quad \text{s.t.}\quad \mathbf{1}_3'w_t = 1 \;\text{and}\; w_t \geq 0, \qquad t = 521,\dots,754, \tag{55}
\]
whereas, in view of (9), an investor assuming that returns are generated by a mixture distribution will maximize¹²
\[
E[U(r_t^p)] = -\sum_{j=1}^{k} \pi_{jt}\exp\left\{-c\,w_t'\mu_{jt} + \frac{c^2}{2}\,w_t'\Sigma_{jt}w_t\right\} \quad \text{s.t.}\quad \mathbf{1}_3'w_t = 1 \;\text{and}\; w_t \geq 0, \qquad t = 521,\dots,754. \tag{56}
\]

Note that, both in (55) and (56), the (component) means and (component)
covariance matrices depend on t, because the parameter estimates are updated
every week. We also mention that the mixing weights, πjt , to be used in (56)
are the one-period-ahead predictive regime inferences given in (14).
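The optimization in (56) can be carried out with any constrained optimizer; the chapter itself uses Matlab's quadprog and fmincon (see footnote 12). As an illustration only, the following sketch maximizes the mixture expected utility with scipy.optimize.minimize under the budget and no-short-sale constraints; all regime parameters and weights shown are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def mixture_expected_utility_weights(pi, mus, Sigmas, c):
    """Maximize the CARA expected utility (56) under a normal mixture,
    subject to 1'w = 1 and w >= 0."""
    N = len(mus[0])

    def neg_expected_utility(w):
        eu = 0.0
        for pij, mu_j, Sigma_j in zip(pi, mus, Sigmas):
            eu -= pij * np.exp(-c * w @ mu_j + 0.5 * c ** 2 * w @ Sigma_j @ w)
        return -eu                                   # minimize the negative of E[U]

    cons = [{"type": "eq", "fun": lambda w: w.sum() - 1.0}]
    bounds = [(0.0, 1.0)] * N
    res = minimize(neg_expected_utility, np.full(N, 1.0 / N),
                   bounds=bounds, constraints=cons, method="SLSQP")
    return res.x

# Hypothetical two-regime, three-asset inputs
pi = [0.85, 0.15]
mus = [np.array([0.25, 0.15, 0.25]), np.array([0.2, 0.05, 0.15])]
Sigmas = [np.diag([2.0, 3.5, 4.0]), np.diag([6.5, 6.0, 11.0])]
print(mixture_expected_utility_weights(pi, mus, Sigmas, c=0.5))
```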

Results

Now we turn to the distributional properties of the 234 out-of-sample portfolio


returns realized by the different models. Summary statistics for the respective
distributions are reported in Table 4 for selected values of the risk aversion
parameter, c, ranging from 0.025 to 2. The results in Table 4 show that the
differences between the distributions of realized portfolio returns obtained
from the single-regime and the mixture models are rather small, except for
the lowest values of the risk aversion parameter, i.e., for c ranging from 0.025
to 0.1. This is particularly the case for the three-regime model, which has, for
¹² The functions quadprog and fmincon in Matlab 6.5 are used to carry out the optimizations in (55) and (56), respectively.

Table 4. Summary statistics for the out-of-sample portfolio returns, covering the
period from February 2001 to August 2005 (234 observations)

single-regime CCM
c =        0.025    0.05    0.1     0.25    0.5     0.75    1       1.25    1.5     2
mean       0.001    0.000   0.023   0.043   0.049   0.051   0.052   0.053   0.053   0.054
variance   5.386    5.247   4.928   4.768   4.739   4.732   4.729   4.728   4.726   4.725
skewness  −0.379   −0.366  −0.342  −0.305  −0.291  −0.287  −0.284  −0.283  −0.282  −0.281
kurtosis   5.293    5.457   5.753   5.785   5.772   5.765   5.761   5.758   5.756   5.754

two-regime Markov-MCCM
c =        0.025    0.05    0.1     0.25    0.5     0.75    1       1.25    1.5     2
mean       0.029    0.002   0.016   0.039   0.047   0.050   0.053   0.056   0.058   0.058
variance   6.175    5.504   4.972   4.750   4.704   4.695   4.697   4.702   4.704   4.707
skewness  −0.329   −0.316  −0.278  −0.258  −0.256  −0.254  −0.255  −0.255  −0.255  −0.255
kurtosis   4.353    4.879   5.323   5.592   5.691   5.717   5.719   5.715   5.715   5.713

three-regime Markov-MCCM
c =        0.025    0.05    0.1     0.25    0.5     0.75    1       1.25    1.5     2
mean       0.144    0.113   0.059   0.026   0.029   0.032   0.038   0.047   0.053   0.059
variance   6.954    6.413   5.347   4.815   4.677   4.648   4.650   4.670   4.689   4.692
skewness  −0.303   −0.310  −0.289  −0.261  −0.260  −0.257  −0.255  −0.255  −0.257  −0.260
kurtosis   3.922    4.161   4.807   5.456   5.723   5.796   5.803   5.772   5.741   5.743

these values of c, a considerably higher mean return, a higher variance, less pronounced negative skewness, and lower kurtosis than the Gaussian model. For the two-component model, this is only true for c smaller than 0.1.
We suppose that the reason for this finding is that, in the current appli-
cation, the major benefit of accounting for the presence of (highly persistent)
regimes in asset returns is that it enables the investor to exploit the pre-
dictability of regimes by aggressively selecting stocks with a (conditionally)
high expected return, which may come at the cost of accepting a higher vari-
ance. Then it is clear that investors with a higher degree of risk aversion
cannot benefit from regime predictability in the same way as those with lower
risk aversion.
To illustrate, let us first note that all the forecasting strategies reported in
Table 4 display relatively low overall mean returns. This is clearly due to the
falling stock markets over the first two years of our out-of-sample period, i.e.,
from the beginning of 2001 to the beginning of 2003, which could not have been
predicted by any of the models. For example, Table 5 presents summary statis-
tics of the stock returns both for the entire out-of-sample period from February
2001 to August 2005, as well as for the period covering the last 75 observa-
tions, i.e., from March 2004 to August 2005, a period characterized by rising
stock markets and low volatility.13 In addition, Table 6 documents, for the pe-
¹³ Interestingly, over the period from March 2004 to August 2005, all the series do not display any excess kurtosis. Note that this does not contradict the assumption of an underlying normal mixture distribution, because, as can be learned from Fig. 1, the markets have been in a bull market regime during most of this period. Rather, this supports the normal mixture hypothesis, which assumes that returns are normally distributed within a given regime, and excess kurtosis results from switching variances. But note that it is difficult to make inferences about kurtosis from just 75 observations.

Table 5. Distributional properties of stock returns over the out-of-sample period.


The left part of the table refers to the entire out-of-sample period from February
2001 to August 2005; the right part reports the same statistics for the period from
March 2004 to August 2005, i.e., the last 75 observations

             Feb. 2001 to Aug. 2005         Mar. 2004 to Aug. 2005
             S&P500   FTSE     DAX          S&P500   FTSE     DAX
mean          0.022   0.088    0.110         0.162   0.302    0.395
variance      5.224   5.628   12.40          1.703   2.256    3.851
skewness     –0.274  –0.241   –0.349         0.093  –0.093    0.001
kurtosis      5.250   4.999    5.794         2.630   2.854    2.736

Table 6. Summary statistics for the out-of-sample portfolio returns, covering the
period from March 2004 to August 2005 (75 observations)

single-regime CCM
c= 0.025 0.05 0.1 0.25 0.5 0.75 1 1.25 1.5 2
mean 0.169 0.185 0.213 0.230 0.236 0.238 0.238 0.239 0.239 0.240
variance 1.707 1.612 1.582 1.611 1.629 1.636 1.639 1.642 1.643 1.645
skewness 0.082 0.074 0.068 0.040 0.028 0.023 0.021 0.020 0.019 0.018
kurtosis 2.606 2.546 2.598 2.677 2.706 2.716 2.721 2.724 2.726 2.728
two-regime Markov-MCCM
c= 0.025 0.05 0.1 0.25 0.5 0.75 1 1.25 1.5 2
mean 0.299 0.248 0.225 0.232 0.233 0.234 0.237 0.241 0.243 0.243
variance 2.825 2.295 1.986 1.748 1.673 1.668 1.660 1.661 1.661 1.661
skewness −0.150 −0.044 0.057 0.073 0.069 0.048 0.019 0.007 0.003 0.001
kurtosis 2.581 2.363 2.366 2.581 2.677 2.718 2.762 2.801 2.815 2.820
three-regime Markov-MCCM
c= 0.025 0.05 0.1 0.25 0.5 0.75 1 1.25 1.5 2
mean 0.432 0.396 0.281 0.195 0.197 0.199 0.205 0.217 0.228 0.241
variance 3.755 3.216 2.362 1.867 1.690 1.664 1.663 1.670 1.677 1.658
skewness −0.064 −0.127 −0.016 0.025 0.053 0.061 0.056 0.036 0.014 0.010
kurtosis 2.732 2.548 2.345 2.330 2.415 2.500 2.587 2.685 2.736 2.771

riod from March 2004 to August 2005, the portfolio selection performance of
the models under study. While, just as in Table 4, the differences between the
models are negligible for the higher risk aversion coefficients, the differences


for the lower degrees of risk aversion are even more striking, in particular for
the three-regime model.
Figure 1 shows, for the entire out-of-sample period, the three return series
along with the one-step-ahead predictive regime probabilities, as given by (14),
for the bull and bear market regimes, as calculated from the two-component

[Figure 1: panels showing the S&P500, FTSE, and DAX weekly return series (2002–2005) and the one-step-ahead predictive probabilities of the bull and bear market regimes from the two-component MCCM; see the caption below.]

Fig. 1. Shown are, in the top three panels, the three return series under study
over the out-of-sample period from February 2001 to August 2005. The bottom
panel displays the corresponding one-step-ahead predictive regime probabilities, as
given by (14), calculated from the two-regime Markov-MCCM. The solid line is the
probability of the bull market regime, and the dash-dot line is the probability of the
bear market regime

model. Recall that these are the mixing weights which enter the optimization
problem (56). Clearly the market was predicted to be in a bear state over the
first few years of the out-of-sample period, while it was predicted to be in the
bullish state in the last period.
Thus it seems that most of the “excess returns” over the single component
model have been realized during the last 75 weeks of the out-of-sample period
under study. To confirm this, we plot, in Fig. 2, for each c-value considered
in Tables 4 and 6, and for each model, the average portfolio weights for the
three different stock indices over the last 75 weeks. Figure 2 reveals that there
are significant differences between the single-regime and the mixture models
only for the lower risk aversion coefficients. Namely, for small values of c,
the mixture investors (with low risk aversion) use their inferences about the
prevailing regime to put a large part of their wealth into the high-return
German market, which, on average, generated the highest returns during the
period under study (cf. Table 5). But as can be seen from Table 5, the DAX
was also a high-volatility index, which prompts more risk-averse investors to
abstain from allocating a large fraction of their wealth to the DAX. In this
regard, it is important to note that the conditional probabilities of the bear
market shown in Fig. 1 are not zero (or even close to zero) in the last period of
the sample, so that investors still assign a positive probability to the markets
being bearish. Clearly the importance of the bear market regime for asset
allocation decisions increases with the degree of risk aversion, c, and when
c becomes large, the portfolio choice will be entirely based on this regime.
Consequently, the (average) portfolio weights for larger values of c are very
similar for the single-regime and the multi-regime models.

5 Conclusions
Our results show that investors can benefit from accounting for regimes in
asset returns. The analysis should be extended in various ways. For exam-
ple, while we considered all-equity portfolios, investors could be allowed to
put part of their wealth into a risk-free asset, which would be relevant in
particular in times of falling stock markets. This would perhaps improve the
overall performance of the investment strategies, which, as seen in Table 4,
suffered from the bear market regime at the beginning of our out-of-sample
period. Presumably, the benefits from the opportunity to invest in a risk-free
asset will be especially significant for the multi-regime models, because the
fraction of wealth to be invested in the risk-free asset can be made dependent
on the prevailing regime, to hedge against low returns and high correlations
in the bear market. In addition, risk considerations may be taken into ac-
count. For example, Value-at-Risk or expected-shortfall restrictions could be
incorporated into the portfolio optimization. Then the mixture approach is
expected to further improve upon the traditional Gaussian approach, as it
better accommodates the excess kurtosis in the return distribution.

[Figure 2: average portfolio weights for the S&P500, FTSE, and DAX as functions of the risk aversion coefficient c (0 to 2), for the single-regime CCM, the two-regime MCCM, and the three-regime MCCM over March 2004 to August 2005; see the caption below.]

Fig. 2. Shown are, for the period from March 2004 to August 2005 (i.e., the last
75 observations of the out-of-sample period), and for values of the coefficient of
risk aversion, c, ranging from 0.025 to 2, the average portfolio weights for the three
different stock indices, as implied by the single-regime (top panel ) model, as well as
by the two- (center panel ) and three-regime (bottom panel ) MCCM

Finally, we intend to apply the model to high-dimensional asset allocation


problems, for which it was originally constructed. Here it will be useful to
compare the mixture of CCMs with alternative parsimonious parameteriza-
tions of high-dimensional mixture models, such as the mixtures of principal
components and factor analyzers [55, 42], which have not been applied to port-
folio analysis so far. Also, in some situations, the assumption of a common
correlation coefficient between all pairs of assets may be too restrictive, and
grouping techniques will be more appropriate. These extensions are currently
under investigation.

References

[1] Akaike H (1973) Information Theory and an Extension of the Maximum


Likelihood Principle. In: Petrov BN, Csaki F (eds) 2nd International
Symposium on Information Theory. Akademiai Kiado, Budapest
[2] Alexander C (2001) Market Models. A Guide to Financial Data Analysis.
Wiley, Chichester
[3] Alexander C, Lazar E (2006) Normal Mixture GARCH(1,1): Applica-
tions to Exchange Rate Modelling. Journal of Applied Econometrics 21:
307–336
[4] Aneja YP, Chandra R, Gunay E (1989) A Portfolio Approach to Esti-
mating the Average Correlation Coefficient for the Constant Correlation
Model. Journal of Finance 44:1435–1438
[5] Ang A, Bekaert G (2002) International Asset Allocation with Regime
Shifts. Review of Financial Studies 15:1137–1187
[6] Ang A, Chen J (2002) Asymmetric Correlations of Equity Portfolios.
Journal of Financial Economics 63:443–494
[7] Bauwens L, Preminger A, Rombouts J (2006) Regime-Switching GARCH
Models. CORE Discussion Paper 2006/11
[8] Brandt MW (2005) Portfolio Choice Problems. In: Aït-Sahalia Y,
Hansen LP (eds) Handbook of Financial Econometrics. North-Holland,
Amsterdam
[9] Brannolte C (2002) Nichtlineare Regimewechselmodelle. Theoretische
und empirische Evidenz am deutschen Kapitalmarkt. Pro Business,
Berlin
[10] Butler KC, Joaquin DC (2002) Are the Gains from International Portfolio
Diversification Exaggerated? The Influence of Downside Risk in Bear
Markets. Journal of International Money and Finance 21:981–1011
[11] Campbell R, Koedijk K, Kofman P (2002) Increased Correlation in Bear
Markets. Financial Analysts Journal 58:87–94
[12] Chan L, Karceski KCJ, Lakonishok J (1999) On Portfolio Optimization:
Forecasting Covariances and Choosing the Risk Model. Review of Finan-
cial Studies 12:937–974

[13] Chesnay F, Jondeau E (2001) Does Correlation between Stock Returns


Really Increase during Turbulent Periods? Economic Notes 30:53–80
[14] Dempster AP, Laird NM, Rubin DB (1977) Maximum Likelihood from
Incomplete Data via the EM Algorithm. Journal of the Royal Statistical
Society B 39:1–38
[15] Elton EJ, Gruber MJ (1973) Estimating the Dependence Structure of
Share Prices – Implications for Portfolio Selection. Journal of Finance
28:1203–1232
[16] Elton EJ, Gruber MJ, Urich TJ (1978) Are Betas Best? Journal of
Finance 33:1375–1384
[17] Erb CB, Harvey CR, Viskanta TE (1994) Forecasting International
Equity Correlations. Financial Analysts Journal 50:32–45
[18] Eun CS, Resnick BG (1984) Estimating the Correlation Structure of
International Share Prices. Journal of Finance 39:1311–1324
[19] Fraley C, Raftery AE (1998) How Many Clusters? Which Clustering
Method? Answers via Model-based Cluster Analysis. Computer Journal
41:578–588
[20] Guidolin M, Timmermann A (2005) Economic Implications of Bull
and Bear Regimes in UK Stock and Bond Returns. Economic Journal
115:111–143
[21] Haas M, Mittnik S, Paolella MS (2004) Mixed Normal Conditional
Heteroskedasticity. Journal of Financial Econometrics 2:211–250
[22] Haas M, Mittnik S, Paolella MS (2004) A New Approach to Markov-
switching GARCH Models. Journal of Financial Econometrics 2:493–530
[23] Haas M, Mittnik S, Paolella MS (2006) Multivariate Normal Mixture
GARCH. Center for Financial Studies Working Paper 2006/09
[24] Hamilton JD (1989) A New Approach to the Economic Analysis of Non-
stationary Time Series and the Business Cycle. Econometrica 57:357–384
[25] Hamilton JD (1990) Analysis of Time Series Subject to Changes in
Regime. Journal of Econometrics 45:39–70
[26] Hamilton JD (1994) Time Series Analysis. Princeton University Press,
Princeton
[27] Hamilton JD (1996) Specification Testing in Markov-switching Time
Series Models. Journal of Econometrics 70:127–157
[28] Hamilton JD, Susmel R (1994) Autoregressive Conditional Heteroskedas-
ticity and Changes in Regime. Journal of Econometrics 64:307–333
[29] Harvey CR, Siddique A (1999) Autoregressive Conditional Skewness.
Journal of Financial and Quantitative Analysis 34:465–487
[30] Horn RA and Johnson CR (1991) Topics in Matrix Analysis. Cambridge
University Press, Cambridge
[31] Jobson JD, Korkie B (1980) Estimation of Markowitz Efficient Portfolios.
Journal of the American Statistical Association 75:544–554
[32] Jobson JD, Korkie B (1981) Putting Markowitz Theory to Work. Journal
of Portfolio Management 7:70–74

[33] Jondeau E, Rockinger M (2003) Conditional Volatility, Skewness, and


Kurtosis: Existence, Persistence, and Comovements. Journal of Economic
Dynamics and Control 27:1699–1737
[34] Karolyi GA, Stulz RM (1996) Why do Markets Move Together? An Inves-
tigation of U.S.-Japan Stock Return Comovements. Journal of Finance
51: 951–986
[35] Kim CJ (1994) Dynamic Linear Models with Markov-switching. Journal
of Econometrics 60:1–22
[36] Ledoit O, Wolf M (2003) Improved Estimation of the Covariance Matrix
of Stock Returns with an Application to Portfolio Selection. Journal of
Empirical Finance 10:603–621
[37] Ledoit O, Wolf M (2004) Honey, I Shrunk the Sample Covariance Matrix.
Journal of Portfolio Management 31:110–119
[38] Longin F, Solnik B (2001) Extreme Correlation of International Equity
Markets. Journal of Finance 56:649–676
[39] Mandelbrot B (1963) The Variation of certain Speculative Prices. Journal
of Business 36:394–419
[40] McLachlan GJ, Krishnan T (1997) The EM Algorithm and Extensions.
Wiley, New York
[41] McLachlan GJ, Peel D (2000) Finite Mixture Models. Wiley, New York
[42] McLachlan GJ, Peel D, Bean RW (2003) Modelling High-Dimensional
Data by Mixtures of Factor Analyzers. Computational Statistics and
Data Analysis 41:379–388
[43] Markowitz HM (1952) Portfolio Selection. Journal of Finance 7:77–91
[44] Meng XL, Rubin DB (1993) Maximum Likelihood Estimation via the
ECM Algorithm: A General Framework. Biometrika 80:267–278
[45] Pagan A (1996) The Econometrics of Financial Markets. Journal of
Emprirical Finance 3:15–102
[46] Paolella MS (1999) Tail Estimation and Conditional Modeling of Het-
eroscedastic Time Series. Pro Business, Berlin
[47] Patton AJ (2004) On the Out-of-Sample Importance of Skewness and
Asymmetric Dependence for Asset Allocation. Journal of Financial
Econometrics 2:130–168
[48] Pelletier D (2006) Regime Switching for Dynamic Correlations. Journal
of Econometrics 131:445–473
[49] Ramchand L, Susmel R (1998) Volatility and Cross Correlation Across
Major Stock Markets. Journal of Empirical Finance 5:397–416
[50] Redner RA, Walker HF (1984) Mixture Densities, Maximum Likelihood
and the EM Algorithm. SIAM Review 26: 195–239
[51] Samuelson PA (1967) General Proof that Diversification Pays. Journal
of Financial and Quantitative Analysis 2:1–13
[52] Schwarz G (1978) Estimating the Dimension of a Model. Annals of Statis-
tics 6:461–464
[53] Searle SR (1982) Matrix Algebra Useful for Statistics. Wiley, New York
76 M. Haas and S. Mittnik

[54] Timmermann A (2000) Moments of Markov Switching Models. Journal


of Econometrics 96:75–111
[55] Tipping ME, Bishop CM (1999) Mixtures of Probabilistic Principal Com-
ponent Analyzers. Neural Computation 11:443–482
[56] Turner CM, Startz R, Nelson CR (1989) A Markov Model of Het-
eroskedasticity, Risk, and Learning in the Stock Market. Journal of
Financial Economics 25:3–22
[57] Whitelaw RF (2000) Stock Market Risk and Return. Review of Financial
Studies 13:521–547
[58] Zhang Z, Li WK, Yuen KC (2006) On a Mixture GARCH Time Series
Model. Journal of Time Series Analysis 27:577–597
A New Tempered Stable Distribution
and Its Application to Finance

Young Shin Kim¹, Svetlozar T. Rachev², Michele Leonardo Bianchi³,
and Frank J. Fabozzi⁴

¹ Department of Econometrics, Statistics and Mathematical Finance, University of
  Karlsruhe, Germany, aaron.kim@statistik.uni-karlsruhe.de
² Department of Econometrics, Statistics and Mathematical Finance, University of
  Karlsruhe, Germany, rachev@statistik.uni-karlsruhe.de
³ Department of Mathematics, Statistics, Computer Science and Applications,
  University of Bergamo, Italy
⁴ Yale School of Management, New Haven CT, USA, frank.fabozzi@yale.edu

In this paper, we discuss a parametric approach to risk-neutral density
extraction from option prices based on the knowledge of the estimated
historical density. A flexible distribution is needed in order to find an
equivalent change of measure and, at the same time, take into account the
historical estimates. To this end, we introduce a new tempered stable
distribution that we refer to as the KR distribution.
Some properties of this distribution will be discussed in this paper, along
with the advantages in applying it to financial modeling. Since the KR distri-
bution is infinitely divisible, a Lévy process can be induced from it. Further-
more, we can develop an exponential Lévy model, called the exponential KR
model, and prove that it is an extension of the Carr, Geman, Madan, and Yor
(CGMY) model.
The risk-neutral process is fitted by matching model prices to market prices
of options using nonlinear least squares. The easy form of the characteristic
function of the KR distribution allows one to obtain a suitable solution to the
calibration problem. To demonstrate the advantages of the exponential KR
model, we present the results of the parameter estimation for the S&P 500
Index and option prices.

1 Introduction

Since Mandelbrot introduced the Lévy stable (or α-stable) distribution to
model the empirical distribution of asset prices in [17], the α-stable distribution
became the most popular alternative to the normal distribution, which has
been rejected by numerous empirical studies that have found financial return
series to be heavy-tailed and possibly skewed. Rachev and Mittnik [21] and
Rachev et al. [22] have developed financial models with α-stable distributions
and applied them to market and credit risk management, option pricing, and
portfolio selection. They also discuss the major attack on the α-stable models
in the 1970s and 1980s. That is, while the empirical evidence does not support
the normal distribution, it is also not consistent with an α-stable distribution.
The distribution of returns for assets has heavier tails relative to the normal
distribution and thinner tails than the α-stable distribution. Partly in re-
sponse to those empirical inconsistencies, various alternatives to the α-stable
distribution were proposed in the literature. Two examples are the “CGMY”
(or “KoBoL”) distribution (Carr et al. [7], Koponen [14], and Boyarchenko
and Levendorskiĭ [6]) and the “Modified Tempered Stable” distribution (Kim
et al. [12]). These two distributions, sometimes called the tempered stable
distributions, have not only heavier tails than the normal distribution and
thinner than the α-stable distribution, but also have finite moments for all
orders. Recently, Rosiński [23] generalized the CGMY distributions and clas-
sified them using the “spectral” (or Rosiński) measure.
In this paper, we will introduce an extension of the CGMY distribution
named the “KR tempered stable” (or simply “KR”) distribution. The KR
distribution is characterized by a new spectral measure. We believe that the
simple form of the characteristic function, the exponential decayed tails, and
other desirable properties of the KR distribution will result in its use in the-
oretical and empirical finance, such as modeling asset return processes, port-
folio analysis, risk management, derivative pricing, and econometrics in the
presence of heavy-tailed innovations.
In the Black-Scholes model [5], the stock price process is described by the
exponential of Brownian motion with drift : St = S0 eXt where Xt = µt + σBt
and the process Bt is Brownian motion. Replacing the driving process Xt by
a Lévy process we obtain the class of exponential Lévy models. For example,
if Xt is replaced by the CGMY process then one can obtain the exponential
CGMY model (Carr et al. [7]). In the exponential Lévy model, the equiva-
lent martingale measure (EMM) of a given market measure is not unique in
general. For this reason, we have to find a method to select one of them.
One classical method to choose an EMM is the Esscher transform; another
reasonable method is finding the “minimal entropy martingale measure”, as
presented by Fujiwara and Miyahara [20]. However, while these methods are
mathematically elegant and have a financial meaning in a utility maximization
problem, the model prices obtained from the EMM did not match the market
prices observed for options. The other method for handling the problem is
to estimate the risk-neutral measure by using current option price data inde-
pendent of the historical underlying distribution. This method can fit model
prices to market prices directly, but it has a problem: the historical market
measure and the risk-neutral measure need not be equivalent, which conflicts
with the no-arbitrage property for option prices. To overcome these drawbacks,
one must estimate the market measure and the risk-neutral measure
simultaneously, and preserve the equivalence between the two measures.
One method for doing so is suggested by Cont and Tankov [9]. Basically, the
method finds an EMM of the market measure such that it minimizes the least
squares error of the model option prices relative to the market option prices.
In this paper, we will discuss the last method to find an EMM. We will con-
sider the exponential Lévy model, replacing the driving process Xt by the
KR process. Since the change of measure between two KR processes has more
freedom than that of the CGMY, we can find the parameters of the EMM
such that the least squares error of the KR model prices can be smaller than
the error of the CGMY model prices.
The remainder of this paper is organized as follows. Section 2 reviews
the tempered stable distribution introduced by Rosiński. The definition and
properties of the KR distribution and the change of measure between two
KR processes are given in Section 3. Section 4 explains the advantage of the
exponential KR model in the calibration problem. In that section, we will show
the estimation results for the market parameters for the historical distribution
of the log-returns of the S&P 500 index, and compare the performance of the
calibration of the risk-neutral distribution for the CGMY model and the KR
model.

2 Tempered Stable Distributions

In this section we review the definition and properties of the tempered stable
distributions introduced by Rosiński [23]. The polar-coordinates representation
of a measure ν = ν(dx) on R^d_0 := R^d \ {0} is the measure ν = ν(dv, du) on
(0, ∞) × S^{d−1} obtained by the bijection x ↦ (‖x‖, x/‖x‖). Let the Lévy
measure M_0 of an α-stable distribution on R^d in polar coordinates be of the form
\[
M_0(dv, du) = v^{-\alpha-1}\, dv\, \sigma(du), \tag{1}
\]
where α ∈ (0, 2) and σ is a finite measure on S^{d−1}. A tempered α-stable
distribution is defined by tempering the radial term of M0 as follows:
Definition 2.1 (Definition 2.1 in [23]) Let α ∈ (0, 2) and let σ be a finite
measure on S^{d−1}. A probability measure on R^d is called tempered α-stable
(denoted TαS) if it is infinitely divisible without Gaussian part and its Lévy
measure M can be written in polar coordinates as
\[
M(dv, du) = v^{-\alpha-1} q(v, u)\, dv\, \sigma(du), \tag{2}
\]
where q : (0, ∞) × S^{d−1} → (0, ∞) is a Borel function such that q(·, u) is
completely monotone with q(∞, u) = 0 for each u ∈ S^{d−1}. A TαS distribution
is called a proper TαS distribution if lim_{v→0+} q(v, u) = 1 for each u ∈ S^{d−1}.
The complete monotonicity of q(·, u) means that (−1)^n (d^n/dv^n) q(v, u) > 0 for
all v > 0, u ∈ S^{d−1}, and n = 0, 1, 2, .... The tempering function q can be
represented as the Laplace transform
\[
q(v, u) = \int_0^\infty e^{-vs}\, Q(ds\,|\,u), \tag{3}
\]
where {Q(·|u)}_{u∈S^{d−1}} is a measurable family of Borel measures on (0, ∞).
Define a measure Q on R^d by
\[
Q(A) := \int_{S^{d-1}}\int_0^\infty I_A(vu)\, Q(dv\,|\,u)\,\sigma(du), \qquad A \in \mathcal{B}(\mathbb{R}^d). \tag{4}
\]

We also define a measure R by
\[
R(A) := \int_{\mathbb{R}^d} I_A\!\left(\frac{x}{\|x\|^2}\right) \|x\|^\alpha\, Q(dx), \qquad A \in \mathcal{B}(\mathbb{R}^d). \tag{5}
\]
Clearly R({0}) = 0 and Q({0}) = 0, and Q can be expressed in terms of the
measure R as follows:
\[
Q(A) = \int_{\mathbb{R}^d_0} I_A\!\left(\frac{x}{\|x\|^2}\right) \|x\|^\alpha\, R(dx), \qquad A \in \mathcal{B}(\mathbb{R}^d). \tag{6}
\]

Theorem 2.2 (Theorem 2.3 in [23]) The Lévy measure M of a TαS distribution
can be written in the form
\[
M(A) = \int_{\mathbb{R}^d_0}\int_0^\infty I_A(tx)\,\alpha t^{-\alpha-1} e^{-t}\, dt\, R(dx), \qquad A \in \mathcal{B}(\mathbb{R}^d), \tag{7}
\]
where R is a unique measure on R^d such that
\[
R(\{0\}) = 0 \quad \text{and} \quad \int_{\mathbb{R}^d} \big(\|x\|^2 \wedge \|x\|^\alpha\big)\, R(dx) < \infty. \tag{8}
\]
If M is as in (2), then R is given by (5).
Conversely, if R is a measure satisfying (8), then (7) defines the Lévy measure
of a TαS distribution. M corresponds to a proper TαS distribution if and only if
\[
\int_{\mathbb{R}^d} \|x\|^\alpha\, R(dx) < \infty. \tag{9}
\]
The measure R is called a “spectral measure” of the corresponding TαS
distribution. By Theorem 2.9 in [23], the following definition is well defined.
Definition 2.3 Let X be a random vector having a TαS distribution with the
spectral measure R.
(i) If α ∈ (0, 2) and E[‖X‖] < ∞, then we write X ∼ TS_α(R, b) to indicate
that the characteristic function φ of X is given by
\[
\phi(u) = \exp\!\left( \int_{\mathbb{R}^d_0} \psi_\alpha(\langle u, x\rangle)\, R(dx) + i\langle u, b\rangle \right), \tag{10}
\]
where
\[
\psi_\alpha(y) = \begin{cases} \Gamma(-\alpha)\big((1-iy)^\alpha - 1 + i\alpha y\big), & \text{if } \alpha \neq 1 \\ (1-iy)\log(1-iy) + iy, & \text{if } \alpha = 1 \end{cases} \tag{11}
\]
and b = E[X].
(ii) If α ∈ (0, 1) and
\[
\int_{\|x\|\le 1} \|x\|\, R(dx) < \infty \tag{12}
\]
holds, then X ∼ TS^0_α(R, b_0) means that the characteristic function φ_0 of X is
of the form
\[
\phi_0(u) = \exp\!\left( \int_{\mathbb{R}^d_0} \psi^0_\alpha(\langle u, x\rangle)\, R(dx) + i\langle u, b_0\rangle \right), \tag{13}
\]
where
\[
\psi^0_\alpha(y) = \Gamma(-\alpha)\big((1-iy)^\alpha - 1\big) \tag{14}
\]
and b_0 ∈ R^d is the drift vector (i.e., b_0 = \int_{\|x\|\le 1} x\, M(dx)).

Remark 2.4 Let X be a TαS distributed random vector with the spectral
measure R. By Proposition 2.7 in [23], we can say the following:
1. In the above definition, E[‖X‖] < ∞ if and only if α ∈ (1, 2), or
\[
\alpha = 1 \ \text{ and } \int_{\|x\|>1} \|x\| \log\|x\|\, R(dx) < \infty, \tag{15}
\]
or
\[
\alpha \in (0,1) \ \text{ and } \int_{\|x\|>1} \|x\|\, R(dx) < \infty. \tag{16}
\]
2. If α ∈ (0, 1) and ∫_{R^d} ‖x‖ R(dx) < ∞, then both forms (10) and (13) are
valid for X. Therefore X ∼ TS^0_α(R, b_0) and X ∼ TS_α(R, b), where
b = b_0 + Γ(1 − α) ∫_{R^d} x R(dx).

The following lemma shows some relations between the spectral measure R of
the TαS distribution and the Lévy measure of the α-stable distribution given
by (1).

Lemma 2.5 (Lemma 2.14 in [23]) Let M be the Lévy measure of a proper TαS
distribution, as in (2), with spectral measure R. Let M_0 be the Lévy measure
of the α-stable distribution given by (1). Then
\[
M_0(A) = \int_{\mathbb{R}^d_0}\int_0^\infty I_A(tx)\, t^{-\alpha-1}\, dt\, R(dx), \qquad A \in \mathcal{B}(\mathbb{R}^d). \tag{17}
\]
Furthermore,
\[
\sigma(B) = \int_{\mathbb{R}^d_0} I_B\!\left(\frac{x}{\|x\|}\right) \|x\|^\alpha\, R(dx), \qquad B \in \mathcal{B}(S^{d-1}). \tag{18}
\]
Let X be an α-stable random vector with Lévy measure M_0 given by (17). We have
\[
E[e^{i\langle u, X\rangle}] = \exp\!\left( \int_{S^{d-1}} \bar\psi_\alpha(\langle u, x\rangle)\,\sigma(dx) + i\langle u, a\rangle \right)
\]
for some suitable a ∈ R^d, where
\[
\bar\psi_\alpha(y) = \begin{cases} \Gamma(-\alpha)\cos\!\big(\tfrac{\alpha\pi}{2}\big)\, |y|^\alpha \big(1 - i\tan\!\big(\tfrac{\alpha\pi}{2}\big)\,\mathrm{sgn}(y)\big), & \text{if } \alpha \neq 1 \\[4pt] -\tfrac{2}{\pi}\big(|y| + i\,\tfrac{\pi}{2}\, y\log(y)\big), & \text{if } \alpha = 1 \end{cases}
\]
(see [24, Theorem 14.10]). In this case, we write X ∼ S_α(σ, a).
Since TαS is infinitely divisible, there is a Lévy process (X_t)_{t≥0} in R^d such
that X_1 has a TαS (proper TαS) distribution. The process (X_t)_{t≥0} will be
called a TαS (proper TαS) Lévy process.
Let Ω be the set of all càdlàg functions on [0, ∞) into R^d, and let (X_t)_{t≥0}
be the canonical process on Ω (i.e., X_t(ω) = ω(t), t ≥ 0, ω ∈ Ω). Consider a
filtered probability space (Ω, F, (F_t)_{t≥0}) where
\[
\mathcal{F} = \sigma\{X_s;\ s \ge 0\}, \qquad \mathcal{F}_t = \bigcap_{s>t} \sigma\{X_u : u \le s\}, \quad t \ge 0.
\]
(F_t)_{t≥0} is the right-continuous natural filtration. The canonical process
(X_t)_{t≥0} is characterized by a probability measure P on (Ω, F, (F_t)_{t≥0}).
Theorem 2.6 (Theorem 4.1. [23]) In the above setting, consider two prob-
ability measures P0 and P on (Ω, F) such that the canonical process (Xt )t≥0
under P0 is an α-stable process while under P it is a proper TαS Lévy process.
Specifically, assume that under P0 , X1 ∼ Sα (σ, a), where σ is related to R
by (18) and α ∈ (0, 2), while under P, X1 ∼ T Sα0 (R, b) when α ∈ (0, 1) and
X1 ∼ T Sα (R, b) when α ∈ [1, 2). Let M , the Lévy measure corresponding to
R, be as in (2), where q(0+ , u) = 1 for all u ∈ S d−1 . Then P0 |Ft and P|Ft are
mutually absolutely continuous for every t > 0 if and only if
\[
\int_{S^{d-1}}\int_0^1 \big(1 - q(v, u)\big)^2\, v^{-\alpha-1}\, dv\,\sigma(du) < \infty \tag{19}
\]
and
\[
b - a = \begin{cases} 0, & \text{if } \alpha \in (0,1) \\ \int_{\mathbb{R}^d} x\big(\log\|x\| - 1\big)\, R(dx), & \text{if } \alpha = 1 \\ \Gamma(1-\alpha)\int_{\mathbb{R}^d} x\, R(dx), & \text{if } \alpha \in (1,2). \end{cases} \tag{20}
\]
Condition (19) implies that the integral in (20) exists. Furthermore, if either
(19) or (20) fails, then P_0|_{F_t} and P|_{F_t} are singular for all t > 0.
3 KR Tempered Stable Distribution

Consider the proper TαS distribution on R whose Lévy measure M in polar
coordinates is
\[
M(ds, du) = s^{-\alpha-1} q(s, u)\, ds\, \sigma(du), \tag{21}
\]
where
\[
\sigma(A) = \frac{k_+ r_+^\alpha}{\alpha + p_+}\, I_A(1) + \frac{k_- r_-^\alpha}{\alpha + p_-}\, I_A(-1), \qquad A \subset S^0,
\]
and
\[
q(v, 1) = (\alpha + p_+)\, r_+^{-\alpha-p_+} \int_0^{r_+} e^{-v/s}\, s^{\alpha+p_+-1}\, ds, \qquad
q(v, -1) = (\alpha + p_-)\, r_-^{-\alpha-p_-} \int_0^{r_-} e^{-v/s}\, s^{\alpha+p_--1}\, ds,
\]
with α ∈ (0, 2), k_+, k_−, r_+, r_− > 0 and p_+, p_− > −α. Then the spectral
measure R corresponding to the Lévy measure M can be deduced as
\[
R(dx) = \big( k_+ r_+^{-p_+}\, I_{(0,r_+)}(x)\, |x|^{p_+-1} + k_- r_-^{-p_-}\, I_{(-r_-,0)}(x)\, |x|^{p_--1} \big)\, dx. \tag{22}
\]

Lemma 3.1 If M and R are given by (21) and (22), respectively, we have:
(i) R({0}) = 0, ∫_R |x|^α R(dx) < ∞ and ∫_{|x|>1} |x| R(dx) < ∞ for all α ∈ (0, 2).
(ii) By Theorem 2.2, M can be written in the form
\[
M(A) = k_+ r_+^{-p_+} \int_0^{r_+}\!\!\int_0^\infty I_A(tx)\, t^{-\alpha-1} e^{-t}\, dt\, x^{p_+-1}\, dx
+ k_- r_-^{-p_-} \int_0^{r_-}\!\!\int_0^\infty I_A(-tx)\, t^{-\alpha-1} e^{-t}\, dt\, x^{p_--1}\, dx, \quad A \in \mathcal{B}(\mathbb{R}_0). \tag{23}
\]
(iii) If α = 1 then
\[
\int_{|x|>1} |x| \log|x|\, R(dx) < \infty,
\]
and if α ∈ (0, 1),
\[
\int_{|x|<1} |x|\, R(dx) < \infty.
\]
Hence, if X is a TαS distributed random variable with Lévy measure M, then
E[|X|] < ∞.

Proof. (iii) follows from Proposition 2.7 in [23].
Proposition 3.2 (Exponential Moments) Let X be a random variable with
the proper TαS distribution corresponding to the spectral measure R defined
in (22). Then E[e^{θX}] < ∞ if and only if −r_−^{-1} ≤ θ ≤ r_+^{-1}.
Proof. Note that E[eθX ] < ∞ if and only if |x|>1
eθx M (dx) < ∞. We have
  r+  ∞
−p+
eθx M (dx) = k+ r+ eθtx I(1,∞) (tx)t−α−1 e−t dt xp+ −1 dx
|x|>1 0 0
 r−  ∞
−p−
+ k− r− e−θtx I(−∞,−1) (−tx)t−α−1 e−t dt xp− −1 dx
 r+
0
 ∞
0
−p+
= k+ r+ et(θx−1) t−α−1 dt xp+ −1 dx
0 1/x
 r−  ∞
−p
+ k− r− − et(−θx−1) t−α−1 dt xp− −1 dx
0 1/x

−1
If θ ≤ r+ then θx − 1 ≤ 0 where x ∈ (0, r+ ), and hence
 r+  ∞
−p+
k+ r+ et(θx−1) t−α−1 dt xp+ −1 dx
0 1/x
 r+  ∞
−p+
≤ k+ r+ t−α−1 dt xp+ −1 dx
0 1/x


−p
r+
xα+p+ −1 α
k+ r+
= k+ r+ + dx = ,
0 α α(α + p+ )
−1
Similarly if −r− ≤ θ then −θx − 1 ≤ 0 where x ∈ (0, r− ), and hence
 r−  ∞
−p
k− r− − et(−θx−1) t−α−1 dt xp− −1 dx
0 1/x
 r−  ∞
−p−
≤ k− r− t−α−1 dt xp− −1 dx
0 1/x

−p−
r−
xα+p− −1 α
k− r−
= k− r− dx = ,
0 α α(α + p− )
−1 −1
0
Thus, if −r−
≤ θ ≤ r+ then |x|>1 eθx M (dx) < ∞.
−1 −1
Conversely, if θ > r+ then θx − 1 > r+ x − 1 > 0 for all x ∈ (0, r+ ), so
−1
there is  such that 0 <  < r+ x − 1 for all h ∈ (0, r+ ). Hence
 r+  ∞
−p
k+ r+ + et(θx−1) t−α−1 dt xp+ −1 dx
0 1/x
 r+  ∞
−p+
> k+ r+ et t−α−1 dt xp+ −1 dx = ∞.
0 1/x

−1
Similarly, we can prove that, if θ < −r− then
 r−  ∞
−p
k− r− − et(−θx−1) t−α−1 dt xp− −1 dx = ∞.
0 1/x
Lemma 3.3 Let α ∈ (0, 2), p ∈ (−α, ∞) \ {−1, 0}, h > 0, and u ∈ R. Then
we have, if α ≠ 1,
\[
\int_0^h x^{p-1} (1-iux)^\alpha\, dx = \frac{h^p}{p}\, F(p, -\alpha; 1+p; iuh), \tag{24}
\]
and, if α = 1,
\[
\int_0^h \big( (1-iux)\log(1-iux) + iux \big)\, x^{p-1}\, dx
= h^p \Big( \frac{ihu}{1+p} + \frac{hu}{2+3p+p^2}\big( hu\, F(2+p, 1; 3+p; ihu) - i(2+p)\log(1-ihu) \big) \Big) \tag{25}
\]
\[
\qquad + \frac{(ihu)^{-p}}{p}\Big( (p - ihu)\, F_{3,2}(1, 1, 1-p; 2, 2; 1-ihu) - \big(1 - (ihu)^p\big)\log(1-ihu) \Big),
\]
where F(a, b; c; x) denotes the hypergeometric function and
F_{p,q}(a_1, ..., a_p; b_1, ..., b_q; x) the generalized hypergeometric function. (The
hypergeometric function and the generalized hypergeometric function are
described in [3].)
Proof. Suppose |iux| < 1 and α = 1. Since
d ab
F (a, b; c; x) = F (a + 1, b + 1; c + 1; x),
du c

 (p)n (−α)n (iux)n ∞
p (iux)n
= (−α)n
n=0
(p + 1)n n! n=0
p+n n!
and
∞ ∞
(p + 1)n (−α)n+1 (iux)n+1 (p + 1)n−1 (−α)n (iux)n
=
n=0
(p + 1)n+1 n! n=1
(p + 1)n (n − 1)!

 n (iux)n
= (−α)n .
n=1
p+n n!
we have
d xp
F (p, −α, 1 + p; iux)
dx p
xp p(−α)
= xp−1 F (p, −α; 1 + p; iux) + F (p + 1, 1 − α; p + 2; iux)iu
p 1+p
∞ ∞
!
 (p)n (−α)n (iux)n  (p + 1)n (−α)n+1 (iux)n+1
= xp−1 + .
n=0
(p + 1)n n! n=0
(p + 1)n+1 n!

!
 (iux)n p n
= xp−1 1+ (−α)n +
n=1
n! p+n p+n

!
 (iux)n
= xp−1 1+ (−α)n
n=1
n!
= xp−1 (1 − iux)α .
Hence, (24) is proved if |iux| < 1 and this result can be extended analytically
if −1 < Re(iux) < 1, so (24) is true for all real u. Equation (25) can be proved
by the same method.  
Theorem 3.4 Let X be a random variable with the proper TαS distribution
corresponding to the spectral measure R defined in (22), with p_± ≠ 0 and
p_± ≠ −1, and let m = E[X]. Then the characteristic function E[e^{iuX}],
u ∈ R, is given as follows:
(i) if α ≠ 1,
\[
E[e^{iuX}] = \exp\!\Big( H_\alpha(u; k_+, r_+, p_+) + H_\alpha(-u; k_-, r_-, p_-)
 + iu\Big( m + \alpha\Gamma(-\alpha)\Big(\frac{k_+ r_+}{p_+ + 1} - \frac{k_- r_-}{p_- + 1}\Big)\Big)\Big), \tag{26}
\]
where
\[
H_\alpha(u; a, h, p) = \frac{a\,\Gamma(-\alpha)}{p}\big( F(p, -\alpha; 1+p; ihu) - 1 \big);
\]
(ii) if α = 1,
\[
E[e^{iuX}] = \exp\!\Big( G_\alpha(u; k_+, r_+, p_+) + G_\alpha(-u; k_-, r_-, p_-)
 + iu\Big( m + \frac{k_+ r_+}{p_+ + 1} - \frac{k_- r_-}{p_- + 1}\Big)\Big), \tag{27}
\]
where
\[
G_\alpha(u; a, h, p) = \frac{ahu}{2 + 3p + p^2}\big( hu\, F(2+p, 1; 3+p; ihu) - i(2+p)\log(1-ihu) \big)
 + \frac{a(ihu)^{-p}}{p}\big( (p - ihu)\, F_{3,2}(1, 1, 1-p; 2, 2; 1-ihu) - (1 - (ihu)^p)\log(1-ihu) \big).
\]
Proof. By Lemma 3.1 (vi), m ≡ E[X] < ∞. By Definition 2.3, we have
⎧


⎨ Γ (−α)((1 − iux) − 1 + iαux)R(dx) + imu if α = 1
α
R
log E[eiuX
]= 


⎩ ((1 − iux) log(1 − iux) + iux)R(dx) + imu if α = 1
R

In case α = 1, we have

Γ (−α)((1 − iux)α − 1 + iαux)R(dx) + imu
R
 r+
−p
= k+ r+ + Γ (−α) ((1 − iux)α − 1 − iαux)xp+ −1 dx
0
 r−
−p−
+ k− r− Γ (−α) ((1 + iux)α − 1 + iαux)xp− −1 dx + imu.
0
By (24), (26) is obtained. Similarly, In case α = 1, we have



((1 − iux) log(1 − iux) + iux)R(dx) + imu
R
 r+
−p
= k+ r+ + ((1 − iux) log(1 − iux) + iux)xp+ −1 dx
0
 r−
−p−
+ k− r− ((1 + iux) log(1 + iux) − iux)xp− −1 dx + imu,
0

and by (25), (27) is obtained.


Now, let us define the KR distribution.

Definition 3.5 Let α ∈ (0, 2), k_+, k_−, r_+, r_− > 0, p_+, p_− ∈ (−α, ∞) \ {−1, 0},
and m ∈ R. A tempered stable distribution is said to be the KR tempered
stable distribution (or KR distribution) with parameters (α, k_+, k_−, r_+, r_−,
p_+, p_−, m) if its characteristic function is given by equations (26) and (27).
If a random variable X follows the KR distribution, then we denote
X ∼ KR(α, k_+, k_−, r_+, r_−, p_+, p_−, m).
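To make the closed form (26) concrete, the following is a minimal Python sketch (not part of the original paper) of the KR characteristic function for the case α ≠ 1. It relies on mpmath, whose Gauss hypergeometric function accepts complex arguments; the helper name kr_cf and the parameter ordering are our own illustrative choices.

```python
import mpmath as mp

def kr_cf(u, alpha, k_p, k_m, r_p, r_m, p_p, p_m, m):
    """E[exp(i*u*X)] for X ~ KR(alpha, k+, k-, r+, r-, p+, p-, m), alpha != 1, cf. (26)."""
    def H(v, a, h, p):
        # H_alpha(v; a, h, p) = a*Gamma(-alpha)/p * (2F1(p, -alpha; 1+p; i*h*v) - 1)
        return a * mp.gamma(-alpha) / p * (mp.hyp2f1(p, -alpha, 1 + p, 1j * h * v) - 1)
    drift = m + alpha * mp.gamma(-alpha) * (k_p * r_p / (p_p + 1) - k_m * r_m / (p_m + 1))
    return mp.exp(H(u, k_p, r_p, p_p) + H(-u, k_m, r_m, p_m) + 1j * u * drift)
```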
The cumulants of the KR distribution can be obtained using the following
lemma.

Lemma 3.6 Let X ∼ KR(α, k_+, k_−, r_+, r_−, p_+, p_−, m) and α ≠ 1. Then we have
\[
\frac{d^n}{du^n}\log E[e^{iuX}]
= \Gamma(n-\alpha)\Big( \frac{k_+\, i^n r_+^n}{p_+ + n}\, F(p_+ + n,\, n-\alpha;\, p_+ + n + 1;\, iur_+)
+ \frac{k_-\, (-i)^n r_-^n}{p_- + n}\, F(p_- + n,\, n-\alpha;\, p_- + n + 1;\, -iur_-) \Big) \tag{28}
\]
\[
\qquad + i\Big( m + \alpha\Gamma(-\alpha)\Big(\frac{k_+ r_+}{p_+ + 1} - \frac{k_- r_-}{p_- + 1}\Big)\Big)\, I_{\{1\}}(n).
\]
Proof. Since
\[
\frac{d^n}{dx^n} F(a, b; c; x) = \frac{(a)_n (b)_n}{(c)_n}\, F(a+n, b+n; c+n; x)
\]
and
\[
\Gamma(-\alpha)(-\alpha)_n = \Gamma(-\alpha)\,\frac{\Gamma(-\alpha+n)}{\Gamma(-\alpha)} = \Gamma(-\alpha+n),
\]
we have
\[
\frac{d^n}{du^n}\, \frac{\Gamma(-\alpha)\, k_\pm}{p_\pm}\, F(p_\pm, -\alpha; 1+p_\pm; iuh_\pm)
= \frac{k_\pm \Gamma(-\alpha)\, i^n h_\pm^n}{p_\pm}\, \frac{(p_\pm)_n (-\alpha)_n}{(p_\pm+1)_n}\, F(p_\pm+n, n-\alpha; p_\pm+n+1; iuh_\pm)
\]
\[
= \frac{k_\pm \Gamma(-\alpha)(-\alpha)_n\, i^n h_\pm^n}{p_\pm + n}\, F(p_\pm+n, n-\alpha; p_\pm+n+1; iuh_\pm)
= \frac{k_\pm \Gamma(n-\alpha)\, i^n h_\pm^n}{p_\pm + n}\, F(p_\pm+n, n-\alpha; p_\pm+n+1; iuh_\pm).
\]
Thus, (28) is obtained.
Proposition 3.7 Let X ∼ KR(α, k_+, k_−, r_+, r_−, p_+, p_−, m) with α ≠ 1. Then
the cumulants c_k(X) ≡ \frac{1}{i^k}\frac{d^k}{du^k}\log E[e^{iuX}]\big|_{u=0} are given by c_1(X) = m and
\[
c_k(X) = \Gamma(k-\alpha)\left( \frac{k_+ r_+^k}{p_+ + k} + (-1)^k\, \frac{k_- r_-^k}{p_- + k} \right), \qquad k \ge 2.
\]
Remark 3.8 Let X ∼ KR(α, k_+, k_−, r_+, r_−, p_+, p_−, m) with α ≠ 1. By
Proposition 3.7, we obtain the mean, variance, skewness and excess kurtosis
of X, which are given as follows:
1. E[X] = c_1(X) = m
2. \mathrm{Var}(X) = c_2(X) = \Gamma(2-\alpha)\Big( \dfrac{k_+ r_+^2}{p_+ + 2} + \dfrac{k_- r_-^2}{p_- + 2} \Big)
3. s(X) = \dfrac{c_3(X)}{c_2(X)^{3/2}} = \dfrac{\Gamma(3-\alpha)\Big( \frac{k_+ r_+^3}{p_+ + 3} - \frac{k_- r_-^3}{p_- + 3} \Big)}{\Gamma(2-\alpha)^{3/2}\Big( \frac{k_+ r_+^2}{p_+ + 2} + \frac{k_- r_-^2}{p_- + 2} \Big)^{3/2}}
4. k(X) = \dfrac{c_4(X)}{c_2(X)^{2}} = \dfrac{\Gamma(4-\alpha)\Big( \frac{k_+ r_+^4}{p_+ + 4} + \frac{k_- r_-^4}{p_- + 4} \Big)}{\Gamma(2-\alpha)^{2}\Big( \frac{k_+ r_+^2}{p_+ + 2} + \frac{k_- r_-^2}{p_- + 2} \Big)^{2}}
The CGMY distribution is a particular case of the KR distribution.


Proposition 3.9 The KR distribution with parameters (α, k+ , k− , r+ , r− ,
p+ , p− , m) converges weakly to the CGMY distribution as p± → ∞ provided
−α
that α = 1 and k± = c(α + p± )r± for c > 0.
Proof. By the Lévy theorem, it suffices to prove the convergence of the char-
acteristic function. We have
k+ Γ (−α)
lim (F (p+ , −α; 1 + p+ ; ir+ u) − 1)
p+ →∞ p+

−α α + p+  (p+ )n (−α)n (iur+ )n
= cΓ (−α)r+ lim
p+ →∞ p+ n=1 (1 + p+ )n n!
∞
−α (α + p+ )(−α)n (iur+ )n
= cΓ (−α)r+ lim
p+ →∞ p+ + n n!
n=1


−α (iur+ )n
= cΓ (−α)r+ (−α)n
n=1
n!
−α α
= cΓ (−α)r+ (−iur+ )n
n
n=1
−α
= cΓ (−α)r+ ((1 − iur+ )α − 1)
) −1 −α
*
= cΓ (−α) (r+ − iu)α − r+ .

Similarly, we have

k− Γ (−α)
lim (F (p− , −α; 1 + p− ; −ir− u) − 1)
p− →∞ p−
) −1 −α
*
= cΓ (−α) (r− + iu)α − r− .

Moreover, we have
k+ r+ k− r−
µ ≡ m + lim αΓ (−α) − lim αΓ (−α)
p+ →∞ p+ + 1 p− →∞ p− + 1
1−α 1−α
c(α + p+ )r+ c(α + p− )r−
= m + lim αΓ (−α) − lim αΓ (−α)
p+ →∞ p+ + 1 p− →∞ p− + 1
1−α
= m + cαΓ (−α)(r+ − r−
1−α
).

In all, we have

lim E[eiuX ]
p+ ,p− →∞
) )) −1 −α
* ) −1 −α
***
= exp iµu + cΓ (−α) (r+ − iu)α − r+ + (r− + iu)α − r− .

where X ∼ KR(α, k+ , k− , r+ , r− , p+ , p− , m). That completes the proof. 




Figure 1 shows that the KR distributions converge to the CGMY distri-


bution when parameter p = p+ = p− increases.
Definition 3.10 Let X ∼ KR(α, k_+, k_−, r_+, r_−, p_+, p_−, m) with α ≠ 1. If the
parameters satisfy m = 0 and
\[
k_+ = b\, \frac{p_+ + 2}{\Gamma(2-\alpha)\, r_+^2}, \qquad k_- = (1-b)\, \frac{p_- + 2}{\Gamma(2-\alpha)\, r_-^2},
\]
then X is said to be standard KR tempered stable distributed (or standard KR
distributed), and we denote X ∼ StdKR(α, r_+, r_−, p_+, p_−, b).
Since the KR distribution is infinitely divisible, we can define a Lévy process.
Definition 3.11 A Lévy process X = (Xt )t≥0 is said to be a KR tempered
stable process (or a KR process) with parameters (α, k+ , k− , r+ , r− , p+ ,
p− , m) if X1 ∼ KR(α, k+ , k− , r+ , r− , p+ , p− , m).
Proposition 3.12 The process (Xt )t≥0 ∼ KR(α, k+ , k− , r+ , r− , p+ , p− , m)
has finite variation if α ∈ (0, 1) and infinite variation if α ∈ [1, 2).
Fig. 1. Probability density of the CGMY distribution with parameters C = 0.01,
G = 2, M = 10, Y = 1.25, and the KR distributions with α = Y, k_± = C(Y + p)\, r_±^{-α},
r_+ = 1/M, r_− = 1/G, where p = p_+ = p_− ∈ {−0.25, 1, 10}

Proof. We have
\[
\int_{|x|<1} |x|\, M(dx) = k_+ r_+^{-p_+} \int_0^{r_+}\!\!\int_0^\infty tx\, I_{(0,1)}(tx)\, t^{-\alpha-1} e^{-t}\, dt\, x^{p_+-1}\, dx
+ k_- r_-^{-p_-} \int_0^{r_-}\!\!\int_0^\infty tx\, I_{(-1,0)}(-tx)\, t^{-\alpha-1} e^{-t}\, dt\, x^{p_--1}\, dx
\]
\[
= k_+ r_+^{-p_+} \int_0^{r_+}\!\!\int_0^{1/x} t^{-\alpha} e^{-t}\, dt\, x^{p_+}\, dx
+ k_- r_-^{-p_-} \int_0^{r_-}\!\!\int_0^{1/x} t^{-\alpha} e^{-t}\, dt\, x^{p_-}\, dx.
\]
For fixed x > 0, if α ∈ (0, 1) then
\[
\int_0^{1/x} t^{-\alpha} e^{-t}\, dt \le \int_0^\infty t^{-\alpha} e^{-t}\, dt = \Gamma(1-\alpha) < \infty,
\]
and if α ∈ [1, 2) then
\[
\int_0^{1/x} t^{-\alpha} e^{-t}\, dt = \infty.
\]
Thus
\[
\int_{|x|<1} |x|\, M(dx) \begin{cases} < \infty & \text{if } \alpha \in (0,1) \\ = \infty & \text{if } \alpha \in [1,2). \end{cases}
\]

3.1 Tail Behavior

In this section, we discuss the probability tails of the KR distribution.
Although the exact asymptotic behavior of its tails is difficult to obtain, unlike
that of the stable distribution, it is possible to calculate upper and lower bounds.
In the following, we provide an upper bound for the probability tails by means
of the well-known Chebyshev inequality.

Proposition 3.13 Let X be a random variable with KR tempered stable
distribution, X ∼ KR(α, k_+, k_−, r_+, r_−, p_+, p_−, m) with α ≠ 1. Then the
following inequality is fulfilled:
\[
P(|X - m| \ge \lambda) \le \frac{C}{\lambda^2},
\]
where C does not depend on λ.

Proof. By Remark 3.8, X has a mean and variance; therefore we consider the
Chebyshev inequality
\[
P(|X - m| \ge \lambda) \le \frac{1}{\lambda^2}\,\mathrm{Var}(X).
\]
We obtain
\[
P(|X - m| \ge \lambda) \le \frac{1}{\lambda^2}\,\Gamma(2-\alpha)\left( \frac{k_+ r_+^2}{p_+ + 2} + \frac{k_- r_-^2}{p_- + 2} \right)
\]
and the result is proved.

A natural further interest is a lower bound on the probability tails. Following
the approach of [11], below we give a lower bound. We consider the following
result:

Proposition 3.14 Let X be an infinitely divisible random variable in R with
Lévy triplet (b, 0, M(dx)). Then we have
\[
P(|X - m| \ge \lambda) \ge \tfrac{1}{4}\Big( 1 - \exp\big(-M(\{u \in \mathbb{R} : |u| \ge 2\lambda\})\big) \Big), \qquad \lambda > 0, \tag{29}
\]
for all m ∈ R.

Proof. See Lemma 5.4 of [4].



For further analysis, we need an auxiliary result.

Lemma 3.15 For a ∈ R_+, the following equality holds:
\[
\int_\beta^\infty s^{-a-1} e^{-s}\, ds = \beta^{-a-1} e^{-\beta} + o(\beta^{-a-1} e^{-\beta})
\]
as β → ∞.

Proof. By integration by parts, if β > 0, we obtain
\[
\int_\beta^\infty s^{-a-1} e^{-s}\, ds = \beta^{-a-1} e^{-\beta} - (a+1)\int_\beta^\infty s^{-a-2} e^{-s}\, ds \le \beta^{-a-1} e^{-\beta}
\]
and
\[
\int_\beta^\infty s^{-a-1} e^{-s}\, ds
= \beta^{-a-1} e^{-\beta} - (a+1)\beta^{-a-2} e^{-\beta} + (a+1)(a+2)\int_\beta^\infty s^{-a-3} e^{-s}\, ds
\ge \beta^{-a-1} e^{-\beta} - (a+1)\beta^{-a-2} e^{-\beta};
\]
letting β → ∞, the result is proved.

Taking into account Proposition 3.14 and Lemma 3.15, we can prove the
following result.

Proposition 3.16 Let X be a random variable with KR tempered stable
distribution, X ∼ KR(α, k_+, k_−, r_+, r_−, p_+, p_−, m) with α ≠ 1. Then the
following inequality is fulfilled:
\[
P(|X - m| \ge \lambda) \ge C\, \frac{e^{-2\lambda/\bar r}}{\lambda^{\alpha+2}}
\]
as λ → ∞, where C does not depend on λ and r̄ = max(r_+, r_−).

Proof. Applying the elementary fact
\[
1 - \exp(-z) \sim z, \qquad z \to 0,
\]
and according to (29) and Lemma 3.15, we obtain
\[
P(|X - m| \ge \lambda) \ge \frac{1}{4}\left( 1 - \exp\!\Big( -\int_{\mathbb{R}_0}\int_{2\lambda/|x|}^\infty s^{-\alpha-1} e^{-s}\, ds\, R(dx) \Big) \right) \tag{30}
\]
\[
\sim \frac{\lambda^{-\alpha-1}}{2^{\alpha+3}} \int_{\mathbb{R}_0} |x|^{\alpha+1} e^{-2\lambda/|x|}\, R(dx) \tag{31}
\]
as λ → ∞. By using equality (22) and Lemma 3.15, the integral can be written as
\[
\int_{\mathbb{R}_0} |x|^{\alpha+1} e^{-2\lambda/|x|}\, R(dx)
= k_+ r_+^{-p_+} \int_0^{r_+} x^{\alpha+p_+} e^{-2\lambda/x}\, dx + k_- r_-^{-p_-} \int_0^{r_-} x^{\alpha+p_-} e^{-2\lambda/x}\, dx
\]
\[
= (2\lambda)^{\alpha+p_++1}\, k_+ r_+^{-p_+} \int_{2\lambda/r_+}^\infty t^{-\alpha-p_+-2} e^{-t}\, dt
+ (2\lambda)^{\alpha+p_-+1}\, k_- r_-^{-p_-} \int_{2\lambda/r_-}^\infty t^{-\alpha-p_--2} e^{-t}\, dt
\]
\[
\sim (2\lambda)^{-1}\, k_+ r_+^{\alpha+2}\, e^{-2\lambda/r_+} + (2\lambda)^{-1}\, k_- r_-^{\alpha+2}\, e^{-2\lambda/r_-}
\sim \bar C\, (2\lambda)^{-1} e^{-2\lambda/\bar r}
\]
as λ → ∞, where r̄ = max(r_+, r_−). Combining this with (30), we get
\[
P(|X - m| \ge \lambda) \ge C\, \frac{e^{-2\lambda/\bar r}}{\lambda^{\alpha+2}}.
\]

3.2 Absolute Continuity

Let (X_t)_{t≥0} be the canonical process on Ω, the set of all càdlàg functions on
[0, ∞) into R, and consider a space (Ω, F, (F_t)_{t≥0}), where
\[
\mathcal{F} = \sigma\{X_s;\ s \ge 0\}, \qquad \mathcal{F}_t = \bigcap_{s>t} \sigma\{X_u : u \le s\}, \quad t \ge 0.
\]

Theorem 3.17 Consider two probability measures P_1, P_2 and the canonical
process (X_t)_{t≥0} on (Ω, F, (F_t)_{t≥0}) given above. For each j = 1, 2, suppose
(X_t)_{t≥0} is the KR tempered stable process under P_j with parameters (α_j, k_{j,+},
k_{j,−}, r_{j,+}, r_{j,−}, p_{j,+}, p_{j,−}, m_j) and
\[
p_{j,\pm} > \tfrac12 - \alpha_j \ \text{ if } \alpha_j \in (0,1), \qquad p_{j,\pm} > 1 - \alpha_j \ \text{ if } \alpha_j \in [1,2).
\]
Then P_1|_{F_t} and P_2|_{F_t} are equivalent for every t > 0 if and only if
\[
\alpha := \alpha_1 = \alpha_2, \tag{32}
\]
\[
\frac{k_{1,+}\, r_{1,+}^\alpha}{\alpha + p_{1,+}} = \frac{k_{2,+}\, r_{2,+}^\alpha}{\alpha + p_{2,+}}, \qquad
\frac{k_{1,-}\, r_{1,-}^\alpha}{\alpha + p_{1,-}} = \frac{k_{2,-}\, r_{2,-}^\alpha}{\alpha + p_{2,-}}, \tag{33}
\]
and
\[
m_2 - m_1 = \begin{cases}
\displaystyle \sum_{j=1,2} (-1)^j \left( \frac{k_{j,+}\, r_{j,+}}{p_{j,+}+1}\Big(\log r_{j,+} - \frac{p_{j,+}+2}{p_{j,+}+1}\Big)
 - \frac{k_{j,-}\, r_{j,-}}{p_{j,-}+1}\Big(\log r_{j,-} - \frac{p_{j,-}+2}{p_{j,-}+1}\Big) \right), & \text{if } \alpha = 1 \\[12pt]
\displaystyle \Gamma(1-\alpha) \sum_{j=1,2} (-1)^j \left( \frac{k_{j,+}\, r_{j,+}}{p_{j,+}+1} - \frac{k_{j,-}\, r_{j,-}}{p_{j,-}+1} \right), & \text{if } \alpha \neq 1.
\end{cases} \tag{34}
\]

Proof. In KR(αj , kj,+ , kj,− , rj,+ , rj,− , pj,+ , pj,− , mj ), the spectral measure Rj
is equal to
−p −p
Rj (dx) = (kj,+ rj,+j,+ Ix∈(0,rj,+ ) |x|pj,+ −1 + kj,− rj,−j,− Ix∈(0,rj,− ) |x|pj,− −1 )dx
and the polar coordinated Lévy measure Mj is equal to
Mj (dv, du) = v −αj −1 qj (v, u)dvσj (du)
where
α
j j α
kj,+ rj,+ kj,− rj,−
σj (A) = 11∈A + 1−1∈A , A ⊂ S0
αj + pj,+ αj + pj,−
and
 rj,±
−α −pj,±
qj (v, ±1) = (αj + pj,± )rj,±j e−v/s sαj +pj,± −1 ds
0

By Remark 2.4, we have



T Sα0 (Rj , bj ), αj ∈ (0, 1)
X1 ∼
T Sα (Rj , bj ), αj ∈ [1, 2)
where  0
mj − Γ (1 − α) R
xRj (dx), αj ∈ (0, 1)
bj =
mj , αj ∈ [1, 2)

0under Pj . Indeed, by Lemma 3.1 iii), EPj [|X1 |] < ∞ if αj ∈ (0, 2) and
|x|<1
|x|Rj (dx) < ∞ if αj ∈ (0, 1).
If pj,± > 12 − αj then we have
 rj,±
d −αj −pj,±
qj (v, ±1) = −(αj + pj,± )rj,± e−v/s sαj +pj,± −2 ds
dv
 ∞
0
−αj −pj,±
= −(αj + pj,± )rj,± e−vt t−αj −pj,± dt
1/rj,±
 ∞
−α −p 1
≥ −(αj + pj,± )rj,±j j,± √ t−αj −pj,± dt
1/rj,± vt
αj + pj,± − 12
= −√ 1 v .
rj,± (αj + pj,± − 2 )
If pj,± > 1 − αj , then we have


 rj,±
d −α −p
qj (v, ±1) = −(αj + pj,± )rj,±j j,± e−v/s sαj +pj,± −2 ds
dv
0 rj,±
−α −p
≥ −(αj + pj,± )rj,±j j,± sαj +pj,± −2 ds
0
αj + pj,±
=− .
rj,± (αj + pj,± − 1)

Let
⎧ & '
⎨ min − √ αj +pj,+
, − √
αj +pj,−
, αj ∈ (0, 1)
& rj,+ (αj +pj,+ −1/2) '−1/2)
rj,− (αj +pj,−
Kj =
⎩ min − α +p α +p
rj,+ (αj +pj,+ −1) , − rj,− (αj +pj,− −1) , αj ∈ [1, 2)
j j,+ j j,−

then 
d Kj v −1/2 , αj ∈ (0, 1)
0> qj (v, ±1) ≥ .
dv Kj , αj ∈ [1, 2)
By the integration of the last inequality on the interval (0, v), we obtain

2Kj v 1/2 , αj ∈ (0, 1)
0 ≥ qj (v, ±1) − 1 = qj (v, ±1) − qj (0, ±1) ≥ .
Kj v, αj ∈ [1, 2)

Hence,
  1
(1 − qj (v, u))2 v −αj −1 dv σ(du)
S0
% 00 01
4K 2 v −αj dv σ(du), αj ∈ (0, 1)
≤ 0S 0 001 2j −α +1
S0 0
Kj v j dv σ(du), αj ∈ [1, 2)

⎨ 4Kj 0 σ(du), α ∈ (0, 1)
2

1−αj S 0 j
⎩ Kj 0 0 σ(du), αj ∈ [1, 2)
= 2

2−αj S

< ∞.

By Theorem 2.6, there is a measure P0j such that P0j |Ft and Pj |Ft are equivalent
for every t > 0 and (Xt )t≥0 is an α-stable process with X1 ∼ Sαj (σj , aj ) under
P0j where

⎨ bj 0 if α ∈ (0, 1)
aj = bj − R x(log |x| 0 − 1)R j (dx) if α = 1

bj − Γ (1 − α) R xRj (dx) if α ∈ (1, 2)
 0
mj − R x(log |x| 0 − 1)Rj (dx) if α = 1 .
=
mj − Γ (1 − α) R xRj (dx) if α = 1
Note that, if p > −1 and y > 0,


 y  p+1 y  y
x 1 y p+1 y p+1
p
x log x dx = log x − xp dx = log y − ,
0 p+1 0 p+1 0 p+1 (p + 1)2

by the integration by parts. If α = 1, then pj,± > 0 and



x(log |x| − 1)Rj (dx)
R
 rj,+  rj,−
−p −p
= kj,+ rj,+j,+ (log x − 1)xpj,+ dx − kj,− rj,−j,− (log x − 1)xpj,− dx
0 0
kj,+ rj,+ pj,+ + 2 kj,− rj,− pj,− + 2
= log rj,+ − − log rj,− − ,
pj,+ + 1 pj,+ + 1 pj,− + 1 pj,− + 1

and if α = 1, then
  rj,+  rj,−
−pj,+ pj,+ −p
xRj (dx) = kj,+ rj,+ x dx + kj,− rj,−j,− xpj,− dx
R 0 0
kj,+ rj,+ kj,− rj,−
= −
pj,− + 1 pj,− + 1

Since P01 |Ft and P02 |Ft are equivalent for every t > 0 if and only if α1 = α2 ,
σ1 = σ2 , and a1 = a2 , we obtain the result that P1 |Ft and P2 |Ft are equivalent
for every t > 0 if and only if the parameters satisfy (32), (33) and (34).

4 KR Tempered Stable Market Model

In the remainder of this paper, let us denote the time horizon by T > 0 and the
risk-free rate by r > 0. Let Ω be the set of all càdlàg functions on [0, T] into
R, and let (X_t)_{t∈[0,T]} be the canonical process on Ω (i.e., X_t(ω) = ω(t), t ∈ [0, T],
ω ∈ Ω). Consider a filtered probability space (Ω, F_T, (F_t)_{t∈[0,T]}) where
\[
\mathcal{F}_T = \sigma\{X_s;\ s \in [0,T]\}, \qquad \mathcal{F}_t = \bigcap_{s\in(t,T]} \sigma\{X_u : u \le s\}, \quad t \in [0,T].
\]

(Ft )t∈[0,T ] is the right continuous natural filtration. The continuous-time mar-
ket is modeled by a probability space (Ω, FT , (Ft )t∈[0,T ] , P), for some measure
P named the market measure. In the market, the stock price is given by the
random variable St = S0 eXt , t ∈ [0, T ] for some initial value of the stock
price S0 > 0, and the discounted stock price S̃t of St is given by S̃t = e−rt St ,
t ∈ [0, T ]. The processes (St )t∈[0,T ] and (S̃t )t∈[0,T ] are called the stock price
process and the discounted (stock) price process, respectively. The process
(Xt )t∈[0,T ] is called the driving process of (St )t∈[0,T ] . The driving process
(Xt )t∈[0,T ] is completely described by the market measure P. If (Xt )t∈[0,T ]
is a Lévy process under the measure P, we say that the stock price process
follows the exponential Lévy model. Assume a stock buyer receives continu-
ous dividend yield d. A probability measure Q equivalent to P is called an
equivalent martingale measure (EMM) of P if the stock price process net of
the cost of carry (Lewis [15]) is a Q-martingale; that is EQ [St ] = e(r−d)t S0 or
EQ [eXt ] = 1.
Now, we will define the KR model. For convenience, we exclude the case
α = 1 and define a function
\[
\psi_\alpha(u; k_+, k_-, r_+, r_-, p_+, p_-, m) = H_\alpha(u; k_+, r_+, p_+) + H_\alpha(-u; k_-, r_-, p_-)
+ iu\left( m + \alpha\Gamma(-\alpha)\left( \frac{k_+ r_+}{p_+ + 1} - \frac{k_- r_-}{p_- + 1} \right) \right)
\]
on u ∈ {z ∈ C | −Im(z) ∈ (−r_−^{-1}, r_+^{-1})}, which is the same as the exponent
of (26).
Definition 4.1 In the above setting, if (X_t)_{t∈[0,T]} is the KR process with
parameters (α, k_+, k_−, r_+, r_−, p_+, p_−, m) where
    α ∈ (0, 1) ∪ (1, 2),
    k_+, k_−, r_− ∈ (0, ∞),
    r_+ ∈ (0, 1),
    p_+, p_− ∈ (1/2 − α, ∞) \ {0}, if α ∈ (0, 1),
    p_+, p_− ∈ (1 − α, ∞) \ {0}, if α ∈ (1, 2),
and m = µ − ψ_α(−i; k_+, k_−, r_+, r_−, p_+, p_−, 0) for some µ ∈ R, then the process
(S_t)_{t∈[0,T]} is called the KR price process with parameters (α, k_+, k_−, r_+, r_−,
p_+, p_−, µ) and we say that the stock price process follows the exponential KR
model.

Remark 4.2
1. We have the condition r_+ ∈ (0, 1) so that ψ_α(−i; k_+, k_−, r_+, r_−, p_+, p_−, 0) and
E[e^{X_t}] are well defined.
2. By the condition p_+, p_− ∈ (1/2 − α, ∞) \ {0} if α ∈ (0, 1), and
p_+, p_− ∈ (1 − α, ∞) \ {0} if α ∈ (1, 2), we are able to use Theorem 3.17 for
finding an equivalent measure.
3. Since m = µ − ψ_α(−i; k_+, k_−, r_+, r_−, p_+, p_−, 0), we have
E[S_t] = S_0 E[e^{X_t}] = S_0 e^{µt}.

Theorem 4.3 Assume that (S_t)_{t∈[0,T]} is the KR price process with parameters
(α, k_+, k_−, r_+, r_−, p_+, p_−, µ) under the market measure P, and with
parameters (α̃, k̃_+, k̃_−, r̃_+, r̃_−, p̃_+, p̃_−, r − d) under a measure Q. Then Q is an
EMM of P if and only if
\[
\alpha = \tilde\alpha, \tag{35}
\]
\[
\frac{k_+ r_+^\alpha}{\alpha + p_+} = \frac{\tilde k_+ \tilde r_+^\alpha}{\alpha + \tilde p_+}, \qquad
\frac{k_- r_-^\alpha}{\alpha + p_-} = \frac{\tilde k_- \tilde r_-^\alpha}{\alpha + \tilde p_-}, \tag{36}
\]
and
\[
\mu - (r - d) = H_\alpha(-i; k_+, r_+, p_+) + H_\alpha(i; k_-, r_-, p_-)
 - H_\alpha(-i; \tilde k_+, \tilde r_+, \tilde p_+) - H_\alpha(i; \tilde k_-, \tilde r_-, \tilde p_-). \tag{37}
\]

Proof. This follows from Definition 4.1 and Theorem 3.17.

4.1 Estimation of Market Parameters

In this section, we will present the estimation results of the fit of our model
to the historical log-returns of the S&P 500 Index. In order to compare the
KR model with other well-known models, let us consider the normal, CGMY,
and KR density fit. The CGMY process is defined in the Appendix and in
[7]. In our empirical study, we focus on two sets of data. We estimated the
market parameters from time-series data on the S&P 500 Index over the period
January 1, 1992 to April 18, 2002, with ñ = 2573 closing prices (Data1), and
over the period January 1, 1984 to January 1, 1994, with n̄ = 2498 closing
prices (Data2). The estimation of market parameters based on Data1 will be
used to extract the risk-neutral density by using observed option prices, while
the historical series Data2 is selected to demonstrate the benefit of the KR
distribution in fitting historical log-returns containing extreme events (Black
Monday, October 19, 1987).
Our estimation procedure follows the classical maximum likelihood esti-
mation (MLE) method (see Table 1). The discrete Fourier transform (DFT) is
used to invert the characteristic function and evaluate the likelihood function
in the CGMY and KR cases.
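As an illustration of this step, the following is a small sketch of our own (the paper itself inverts the characteristic function with a DFT; here plain quadrature is used, and the grid and truncation values are illustrative assumptions) that recovers a density from a characteristic function and evaluates the log-likelihood.

```python
import numpy as np

def pdf_from_cf(phi, x, u_max=500.0, n_grid=4096):
    """f(x) = (1/(2*pi)) * integral of exp(-i*u*x) * phi(u) du, by trapezoidal quadrature."""
    u = np.linspace(-u_max, u_max, n_grid)
    kernel = np.exp(-1j * np.outer(np.atleast_1d(x), u)) * phi(u)
    return np.real(np.trapz(kernel, u, axis=1)) / (2.0 * np.pi)

def log_likelihood(phi, returns):
    """Log-likelihood of the observed returns under the density implied by phi."""
    dens = np.maximum(pdf_from_cf(phi, returns), 1e-300)  # guard against numerical zeros
    return float(np.sum(np.log(dens)))
```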
In order to compare how the stock market process can be explained by
these different models, Figs. 2 and 3 show the results of density fits.
Let (Ω, A, P) be a probability space and {X_i}_{1≤i≤n} a given set of independent
and identically distributed real random variables. In the following, let us
consider X_i(ω) = x_i for each i = 1, ..., n. Let F be the distribution of X_i,
and x_1 ≤ x_2 ≤ ... ≤ x_n. The empirical cumulative distribution function F̂_n(x)
is defined by
\[
\hat F_n(x) = \frac{\text{no. of observations} \le x}{n} =
\begin{cases} 0, & x < x_1 \\ \frac{i}{n}, & x_i \le x \le x_{i+1}, \quad i = 1, \dots, n-1 \\ 1, & x_n \le x. \end{cases}
\]

A statistic measuring the difference between F̂n (x) and F (x) is called the
empirical distribution function (EDF) statistic [10]. These statistics include
the Kolmogorov-Smirnov (KS) statistic [10, 18, 26] and Anderson-Darling
(AD) statistic [1, 2, 19]. Our goal is to test if the empirical distribution

Table 1. S&P 500 Index MLE density fit

S&P 500 Index from January 1, 1992 to April 18, 2002


Parameters
µ σ
Normal 0.096364 0.15756
C G M Y m
CGMY 10.161 97.455 98.891 0.5634 0.1135
k+ k− r+ r− p+ p− α µ
KR 3286.1 2124.8 0.0090 0.0113 17.736 17.736 0.5103 0.1252

S&P 500 Index from January 1, 1984 to January 1, 1994


Parameters
µ σ
Normal 0.11644 0.15008
C G M Y m
CGMY 0.41077 59.078 49.663 1.0781 0.1274
k+ k− r+ r− p+ p− α µ
KR 598.38 694.71 0.0222 0.0183 20.662 20.662 1.0416 0.1840

Fig. 2. S&P 500 Index (from January 1, 1992 to April 18, 2002) MLE density fit.
Circles are the densities of the market data. The solid curve is the KR fit, the dotted
curve is the CGMY fit and the dashed curve is the normal fit.

Fig. 3. S&P 500 Index (from January 1, 1984 to January 1, 1994) MLE density fit.
Circles are the densities of the market data. The solid curve is the KR fit, the
dotted curve is the CGMY fit and the dashed curve is the normal fit.

function of an observed data sample belongs to a family of hypothesized
distributions, i.e.,
\[
H_0 : F = F_0 \quad \text{vs} \quad H_1 : F \neq F_0. \tag{38}
\]
Suppose a test statistic D takes the value d, the p-value of the statistic
will then be the value
p-value = P (D ≥ d).
We reject the hypothesis H0 if the p-value is less than a given level of signifi-
cance, which we take to be equal to 0.05. Let us consider a test for hypotheses
of the type (38) concerning continuous cumulative distribution function, the
Kolmogorov-Smirnov test. The KS statistic Dn measures the absolute value of
the maximum distance between the empirical distribution function F̂ and the
theoretical distribution function F , putting equal weight on each observation,

\[
D_n = \sup_{x_i} \big| F(x_i) - \hat F_n(x_i) \big| \tag{39}
\]

where {xi }1≤i≤n is a given set of observations. Using the procedure of [18],
we can easily evaluate the distribution of Dn and find the p-value for our test.
It might be of interest to test the ability of the model to forecast ex-
treme events. To this end, we also provide the AD statistics. We consider two
different versions of the AD statistic. In its simplest version, it is a variance-
weighted KS statistic
\[
AD_n = \sup_{x_i} \frac{|F(x_i) - \hat F(x_i)|}{\sqrt{F(x_i)\,(1 - F(x_i))}} \tag{40}
\]
Since the distribution of ADn is not known in closed form, p-values were
obtained via 1000 Monte Carlo simulations.
A more generally used version of this statistic belongs to the quadratic
class defined by the Cramér-von Mises family [10], i.e.,
\[
AD_n^2 = n \int_{-\infty}^{\infty} \frac{(\hat F_n(x) - F(x))^2}{F(x)\,(1 - F(x))}\, dF(x) \tag{41}
\]

and by the Probability Integral Transformation (PIT) formula [10], we obtain
the computing formula for the AD_n^2 statistic,
\[
AD_n^2 = -n + \frac{1}{n}\sum_{i=1}^n (1 - 2i)\log(z_i) - \frac{1}{n}\sum_{i=1}^n \big(1 + 2(n-i)\big)\log(1 - z_i),
\]

where zi is zi = F (xi ), with i = 1, . . . , n. To evaluate the distribution of the


ADn2 statistic, we use the procedure described in [19]. As in the KS case, the
distribution of ADn2 does not depend on F . Results of our tests are shown in
Tables 2 and 3. Following the approach of [18, 19], p-values can be obtained
with a computational time much less than Monte Carlo simulations.
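For completeness, here is a plain sketch of the three EDF statistics (39)–(41) for a fitted distribution function F. It is our own helper, not the procedures of [18, 19]; the clipping constant and the two-sided evaluation of the empirical step function are assumptions made for numerical robustness.

```python
import numpy as np

def edf_statistics(sample, cdf):
    """KS, variance-weighted AD, and quadratic AD^2 statistics; cdf is the fitted F."""
    x = np.sort(np.asarray(sample))
    n = len(x)
    z = np.clip(cdf(x), 1e-12, 1 - 1e-12)        # z_i = F(x_(i)), guarded for the logs
    i = np.arange(1, n + 1)
    gap = np.maximum(np.abs(z - i / n), np.abs(z - (i - 1) / n))
    ks = gap.max()                                # Kolmogorov-Smirnov statistic
    ad = (gap / np.sqrt(z * (1 - z))).max()       # variance-weighted KS statistic
    ad2 = -n - np.mean((2 * i - 1) * (np.log(z) + np.log(1 - z[::-1])))
    return ks, ad, ad2
```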
A parametric procedure for testing the goodness of fit is the χ2 -test. We
define the null hypotheses as follows:
H0normal : The daily returns follow the normal distribution.

Table 2. χ2 , KS, AD and AD2 statistics (degrees of freedom in round brackets)

S&P 500 Index from


January 1, 1992 to April 18, 2002
Model χ2 KS AD AD2
Normal 546.49(288) 0.0663 2180.7 23.762
CGMY 273.4(255) 0.0103 0.2945 0.6130
KR 268.91(252) 0.0109 0.2315 0.3367

p-value
Theoretical 
Monte Carlo‡
Model χ2 KS AD2 χ2 KS AD AD2
Normal 0 0 0 0 0 0 0
CGMY 0.2045 0.9450 0.6356 0.43 0.908 0.098 0.656
KR 0.2216 0.9165 0.9082 0.53 0.875 0.242 0.916

Theoretical p-values were obtained from [18, 19] and χ2 distribution

Monte Carlo p-values were obtained via 1,000 simulations
Table 3. χ2 , KS, AD and AD2 statistics (degrees of freedom in round brackets)


S&P 500 Index from
January 1, 1984 to January 1, 1994
Model χ2 KS AD AD2
Normal 482.39(202) 0.0699 3.9e+6 33.654
CGMY 191.68(179) 0.0191 0.1527 2.0475
KR 180.07(181) 0.0107 0.1302 0.9719

p-value
Theoretical Monte Carlo‡
Model χ2 KS AD2 χ2 KS AD AD2
Normal 0 0 0 0 0 0 0
CGMY 0.2451 0.3180 0.0865 0.893 0.305 0.696 0.086
KR 0.5055 0.9343 0.3723 0.974 0.875 0.872 0.361

Theoretical p-values were obtained from [18, 19] and χ2 distribution

Monte Carlo p-values were obtained via 1000 simulations

H0CGM Y : The daily returns follow the CGMY distribution.


H0KR : The daily returns follow the KR distribution.
Let us consider a partition P = {A1 , . . . , Am } of the support of our distribu-
tion. Let Nk , with k = 1, . . . , m, be the number of observations xi falling into
the interval Ak . We will compare these numbers with the theoretical frequency
distribution πk , defined by

πk = P (X ∈ Ak ) k = 1, . . . , m

through the Pearson statistic
\[
\hat\chi^2 = \sum_{k=1}^m \frac{(N_k - n\pi_k)^2}{n\pi_k}.
\]

If necessary, we collapse outer cells Ak , so that the expected value nπk of the
observations always becomes greater than 5 [25].
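A minimal sketch of this statistic (our own helper; merging of the outer cells so that nπ_k > 5 is assumed to be done when the bin edges are chosen) is:

```python
import numpy as np

def pearson_chi2(sample, cdf, bin_edges):
    """Pearson chi-square statistic for a fitted cdf on the partition given by bin_edges."""
    n = len(sample)
    counts, _ = np.histogram(sample, bins=bin_edges)
    pi = np.diff(cdf(np.asarray(bin_edges)))      # theoretical cell probabilities pi_k
    expected = n * pi
    return np.sum((counts - expected) ** 2 / expected)
```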
From the results reported in Tables 2 and 3, we conclude that H0normal is
rejected but H0CGM Y and H0KR are not rejected. QQ-plots (see Figs. 4 and 5)
show that the empirical density strongly deviated from the theoretical density
for the normal model, but this deviation almost disappears in both the CGMY
and KR cases.

Fig. 4. QQ-plots of S&P 500 Index (from January 1, 1992 to April 18, 2002) MLE
density fit. Normal model (left), CGMY model (right) and KR model (bottom).

Fig. 5. QQ-plots of S&P 500 Index (from January 1, 1984 to January 1, 1994) MLE
density fit. Normal model (left), CGMY model (right) and KR model (bottom).

4.2 Estimation of Risk-Neutral Parameters

In this section, we will discuss a parametric approach to risk-neutral density
extraction from option prices based on knowledge of the estimated historical
density. Therefore, taking into account the estimation results of Section 4.1
under the market probability measure, we want to estimate parameters under
a risk-neutral measure.
Let us consider a given market model and observed prices Ĉi of call options
with maturities Ti and strikes Ki , i ∈ {1, . . . , N }, where N is the number of
options on a fixed day. The risk-neutral process is fitted by matching model
prices to market prices using nonlinear least squares. Hence, to obtain a prac-
tical solution to the calibration problem, our purpose is to find a parameter
set θ̃, such that the optimization problem
\[
\min_{\tilde\theta} \sum_{i=1}^N \big( \hat C_i - C^{\tilde\theta}(T_i, K_i) \big)^2 \tag{42}
\]

is solved, where by Ĉi we denote the price of an option as observed in the


market and by Ciθ̃ the price computed according to a pricing formula in a
chosen model with a parameter set θ̃.
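The following sketch (our own, assuming scipy is available) shows the structure of this least-squares calibration; the equality constraints discussed below are assumed to be imposed through the chosen parameterization inside `model_price` and are not handled explicitly here.

```python
import numpy as np
from scipy.optimize import least_squares

def calibrate(theta0, quotes, model_price, bounds=(-np.inf, np.inf)):
    """Nonlinear least-squares calibration of (42).
    `quotes` is a list of (C_hat, T, K) market observations and
    `model_price(theta, T, K)` is any pricing routine (e.g. a Fourier-integral pricer)."""
    def residuals(theta):
        return np.array([model_price(theta, T, K) - c_hat for c_hat, T, K in quotes])
    fit = least_squares(residuals, theta0, bounds=bounds)
    return fit.x
```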
By Proposition 3.9, we obtain that the KR model is an extension of the
CGMY model. Therefore, to demonstrate the advantages of the KR tempered
stable distribution model, we will compare it with the well-known CGMY
model. To find an equivalent change of measure in the CGMY model, we
consider the result reported in the Appendix.
By Proposition 5.2, we can consider the historical estimation for param-
eters Ỹ and C̃ and find a solution to the minimization problem (42) which
satisfies condition (43). Therefore, we can estimate parameters M̃ and G̃ under
a risk-neutral measure. The optimization procedure involves four parameters
except r and three equality constraints. Consequently we have only one free
parameter to solve (42).
If we consider the exponential KR model, according to Definition 4.1 and
Theorem 4.3, we can find parameters k̃_+, k̃_−, r̃_+ and r̃_−, such that conditions
(35), (36), and (37) are satisfied and (42) is solved. We have seven parameters
except r and four equality constraints, namely three free parameters to
minimize (42), i.e.,
\[
\alpha = \tilde\alpha, \qquad
\tilde p_+ = \frac{\tilde k_+ \tilde r_+^\alpha}{k_+ r_+^\alpha}\,(\alpha + p_+) - \alpha, \qquad
\tilde p_- = \frac{\tilde k_- \tilde r_-^\alpha}{k_- r_-^\alpha}\,(\alpha + p_-) - \alpha,
\]
and
\[
\mu - r = H_\alpha(-i; k_+, r_+, p_+) + H_\alpha(i; k_-, r_-, p_-)
 - H_\alpha(-i; \tilde k_+, \tilde r_+, \tilde p_+) - H_\alpha(i; \tilde k_-, \tilde r_-, \tilde p_-).
\]
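For illustration, the constraint (36) solved for the risk-neutral exponent, as used above, reads in code (a small helper of our own):

```python
def risk_neutral_p(k, r, p, k_tilde, r_tilde, alpha):
    """Solve k*r**alpha/(alpha+p) = k~*r~**alpha/(alpha+p~) for the risk-neutral p~."""
    return (k_tilde * r_tilde**alpha) / (k * r**alpha) * (alpha + p) - alpha
```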
In the CGMY case we have only one free parameter but in the KR case we
have 3 free parameters to fit model prices to market prices; therefore, we can
obtain a better solution to the optimization problem. The KR distribution
is more flexible in order to find an equivalent change of measure and, at the
same time, takes into account the historical estimates.
The time-series data were for the period January 1, 1992–April 18, 2002,
while the option data are from April 18, 2002.
Contrary to the classical Black-Scholes case, in the exponential Lévy models
there is no explicit formula for call option prices, since the probability density
of a Lévy process is typically not known in closed form. Due to the easy form
of the characteristic functions of the CGMY and KR distributions, we follow
the generally used pricing method for standard vanilla options, which can be
applied in general when the characteristic function of the risk-neutral stock-
price process is known [8, 25]. Let ρ be a positive constant such that the ρ-th
moment of the price exists and φ be the characteristic function of the random
variable log S_T. A value of ρ = 0.75 will typically do fine [25]. Carr and Madan
[8, 25] then showed that
\[
C(K, T) = \frac{\exp(-\rho \log K)}{\pi} \int_0^\infty \exp(-iv\log K)\,\varrho(v)\, dv,
\]
where
\[
\varrho(v) = \frac{\exp(-rT)\,\phi\big(v - (\rho+1)i\big)}{\rho^2 + \rho - v^2 + i(2\rho+1)v}.
\]
Furthermore, we need to guarantee the analyticity of the integrand function
in the horizontal strip of the complex plane, on which the line Lρ = {x + iρ ∈
C| − ∞ < x < ∞} lies [15, 16]. If we consider the exponential KR model, we
obtain the following additional inequality constraint,
\[
r_+^{-1} \ge 1 + \rho,
\]

by Proposition 3.2. Since α is less than 1 in the estimated market parameter


for the given time-series data, we have to consider an additional condition

p+ , p− ∈ (1/2 − α, ∞),

by Remark 4.2.
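A minimal quadrature sketch of the Carr–Madan formula above (our own; the truncation point v_max and the use of scipy's quad are illustrative assumptions, whereas the FFT route of [8] is the standard implementation) is:

```python
import numpy as np
from scipy.integrate import quad

def call_price(phi, K, T, r, rho=0.75, v_max=200.0):
    """Call price from the characteristic function phi of log S_T via the damped Fourier integral."""
    k = np.log(K)
    def integrand(v):
        num = np.exp(-r * T) * phi(v - (rho + 1) * 1j)
        den = rho**2 + rho - v**2 + 1j * (2 * rho + 1) * v
        return (np.exp(-1j * v * k) * num / den).real
    integral, _ = quad(integrand, 0.0, v_max, limit=500)
    return np.exp(-rho * k) / np.pi * integral
```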
Each maturity has been calibrated separately (see Table 4). Unfortunately,
due to the independence and stationarity of their increments, exponential Lévy
models perform poorly when calibrating several maturities at the same time
[9]. In Table 5, we summarize the error estimators of our option price fits. If we
consider the exponential CGMY or KR models, we can estimate simultane-
ously market and risk-neutral parameters using historical prices and observed
option prices. The flexibility of the KR distribution allows one to obtain a
suitable solution to the calibration problem (see Table 5).
Table 4. Estimated risk-neutral parameters

CGMY KR
T M̃ G̃ k̃+ k̃− r̃+ r̃−
0.0880 106.5827 96.1341 5325.8 33.727 0.0065 0.0330
0.1840 103.4463 93.3887 9126.3 33.024 0.0066 0.034
0.4360 92.4701 83.7430 4757.3 31.327 0.0074 0.0381
0.6920 89.4576 81.0851 3866.4 30.776 0.0076 0.0395
0.9360 90.0040 81.5675 6655.4 30.78 0.0075 0.03953
1.1920 82.6216 75.0354 9896.7 29.483 0.0079 0.0430
1.7080 77.3594 70.3609 10000 28.468 0.0084 0.046

Table 5. Error estimators

T Model APE AEE RMSE ARPE


0.0880
CGMY 0.0149 0.4019 0.4613 0.0175
KR 0.0030 0.0826 0.1023 0.0035
0.1840
CGMY 0.0341 1.0998 1.4270 0.0442
KR 0.0234 0.7541 0.9937 0.0295
0.4360
CGMY 0.0437 3.1727 3.5159 0.0788
KR 0.0361 2.6249 2.8972 0.0651
0.6920
CGMY 0.0577 4.4063 5.0448 0.1093
KR 0.0503 3.8468 4.4086 0.0953
0.9360
CGMY 0.0802 4.4772 5.2826 0.1378
KR 0.0717 4.0071 4.7401 0.1233
1.1920
CGMY 0.0898 6.7185 7.5797 0.2003
KR 0.0820 6.1366 6.9289 0.1825
1.7080
CGMY 0.1238 9.0494 9.8394 0.2588
KR 0.1156 8.4512 9.1809 0.2409

5 Conclusion

In this paper, we introduce a new tempered stable distribution named the KR


distribution. Theoretically, the KR distribution is a proper tempered stable
distribution with a simple closed form for the characteristic function. One can
easily calculate the moments of the distribution and observe the behavior of
the tails. Moreover, it is an extension of the well-known CGMY distribution
and the change of measure for the KR distributions has more freedom than
that for the CGMY distributions.
Empirically, we find that there are advantages supporting the KR distri-


bution in the fitting of the historical distribution and the calibration of the
risk-neutral distribution. In the fitting of S&P 500 index returns, the χ2 and
KS tests do not reject the KR distribution, but they do reject the normal
distribution. The p-values of χ2 and KS statistic for the KR distribution are
similar to (sometimes better than) those of the CGMY distribution which is
also not rejected. Furthermore, the p-values of AD and AD2 statistic for the
KR distribution fitting exceed those of the CGMY distribution fitting, sug-
gesting that the KR distribution can capture extreme events better than the
CGMY distribution. In the calibration of the risk-neutral distribution using
the S&P 500 index option prices, the performance of the calibration for the
exponential KR model is better than the CGMY model. The relatively flexible
change of measure for the KR distribution seems to generate the result.
As mentioned at the outset of this paper, the KR distribution can be ap-
plied to other areas within finance. For example, it can be used in risk manage-
ment because of its tail property. If we apply it to the modeling of innovation
processes of the GARCH model, we can obtain an enhanced GARCH model.
Since the KR distribution has the exponential moment with proper condition,
we can calculate prices for exotic options with the partial integro-differential
equation method. Finally, we can study asset pricing models and portfolio
analysis with the KR distribution.

References

[1] Anderson, T. W. and Darling, D. A. (1952). Asymptotic Theory of Certain
‘Goodness of Fit’ Criteria Based on Stochastic Processes, Annals of
Mathematical Statistics, 23, 2, 193–212.
[2] Anderson, T. W. and Darling, D. A. (1954). A Test of Goodness of Fit,
Journal of the American Statistical Association, 49, 268, 765–769.
[3] Andrews, L. D. (1998). Special Functions of Mathematics for Engineers,
2nd Edn, Oxford University Press, Oxford.
[4] Breton, J. C., Houdré, C. and Privault, N. (2007). Dimension Free and
Infinite Variance Tail Estimates on Poisson Space, in Acta Applicandae
Mathematicae, 95, 151–203.
[5] Black, F. and Scholes M. (1973). The Pricing of Options and Corporate
Liabilities, Journal of Political Economy, 81, 3, 637–654.
[6] Boyarchenko, S. I. and Levendorskiĭ, S. Z. (2000). Option Pricing for Trun-
cated Lévy Processes, International Journal of Theoretical and Applied
Finance, 3, 3, 549–552.
[7] Carr, P., Geman, H., Madan, D. and Yor M. (2002). The Fine Structure
of Asset Returns: An Empirical Investigation, Journal of Business, 75,
2, 305–332.
[8] Carr, P. and Madan, D. B. (1999). Option Valuation Using the Fast
Fourier Transform, Journal of Computational Finance, 2, 4, 61–73.
[9] Cont, R. and Tankov P. (2004). Financial Modelling with Jump Processes,
Chapman & Hall/CRC, London.
[10] D’Agostino, R. B. and Stephens, M. A. (1986). Goodness of Fit Tech-
niques, Dekker, New York.
[11] Kawai, R. (2004). Contributions to Infinite Divisibility for Financial mod-
eling, Ph.D. thesis, http://hdl.handle.net/1853/4888.
[12] Kim, Y. S., Rachev, S. T., Chung, D. M., and Bianchi. M. L. The Modi-
fied Tempered Stable Distribution, GARCH-Models and Option Pricing,
Probability and Mathematical Statistics, to appear.
[13] Kim, Y. S. and Lee, J. H. (2007). The Relative Entropy in CGMY Pro-
cesses and Its Applications to Finance, to appear in Mathematical Meth-
ods of Operations Research.
[14] Koponen, I. (1995). Analytic Approach to the Problem of Convergence of
Truncated Lévy Flights towards the Gaussian Stochastic Process, Phys-
ical Review E, 52, 1197–1199.
[15] Lewis, A. L. (2001). A Simple Option Formula for General Jump-
Diffusion and Other Exponential Levy Processes, avaible from http://
www.optioncity.net.
[16] Lukacs, E. (1970). Characteristic Functions, 2nd Ed, Griffin, London.
[17] Mandelbrot, B. B. (1963). New Methods in Statistical Economics, Jour-
nal of Political Economy, 71, 421–440.
[18] Marsaglia, G., Tsang, W. W. and Wang, G. (2003). Evaluating Kol-
mogorov’s Distribution, Journal of Statistical Software, 8, 18.
[19] Marsaglia, G. and Marsaglia, J. (2004). Evaluating the Anderson-Darling
Distribution, Journal of Statistical Software, 9, 2.
[20] Fujiwara, T. and Miyahara, Y. (2003). The Minimal Entropy Martingale
Measures for Geometric Lévy Processes, Finance & Stochastics 7, 509–
531.
[21] Rachev, S. and Mittnik, S. (2000). Stable Paretian Models in Finance,
Wiley, New York.
[22] Rachev, S., Menn C., and Fabozzi F. J. (2005). Fat-Tailed and Skewed
Asset Return Distributions: Implications for Risk Management, Portfolio
selection, and Option Pricing, Wiley, New York.
[23] Rosiński, J. (2006). Tempering Stable Processes, Working Paper,
http://www.math.utk.edu/˜rosinski/Manuscripts/tstableF.pdf.
[24] Sato, K. (1999). Lévy Processes and Infinitely Divisible Distributions,
Cambridge University Press, Cambridge.
[25] Schoutens, W. (2003). Lévy Processes in Finance: Pricing Financial
Derivatives, Wiley.
[26] Shao, J. (2003). Mathematical Statistics, 2nd Ed, Springer, Berlin
Heidelberg New York.
Appendix
Exponential CGMY Model

The CGMY process is a pure jump process, introduced by Carr et al. [7].

Definition 5.1 A Lévy process (X_t)_{t≥0} is called a CGMY process with
parameters (C, G, M, Y, m) if the characteristic function of X_t is given by
\[
\phi_{X_t}(u; C, G, M, Y, m) = \exp\!\big( iumt + tC\Gamma(-Y)\big( (M-iu)^Y - M^Y + (G+iu)^Y - G^Y \big) \big), \quad u \in \mathbb{R},
\]
where C, M, G > 0, Y ∈ (0, 2) and m ∈ R.
For convenience, let us denote
\[
\Psi_0(u; C, G, M, Y) \equiv C\Gamma(-Y)\big( (M-iu)^Y - M^Y + (G+iu)^Y - G^Y \big).
\]
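A direct transcription of this characteristic function (our own sketch, with numpy and scipy assumed available) is:

```python
import numpy as np
from scipy.special import gamma

def cgmy_cf(u, t, C, G, M, Y, m):
    """Characteristic function of X_t for a CGMY process, as in Definition 5.1."""
    u = np.asarray(u, dtype=complex)
    psi0 = C * gamma(-Y) * ((M - 1j * u)**Y - M**Y + (G + 1j * u)**Y - G**Y)
    return np.exp(1j * u * m * t + t * psi0)
```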

Now, we focus on a way to find an equivalent measure for CGMY processes.


Proposition 5.2 Let (Xt )t∈[0,T ] be CGMY processes with parameters (C, G,
M , Y , m) and (C̃, G̃, M̃ , Ỹ , m̃) under P and Q, respectively. Then P|Ft and
Q|Ft are equivalent for all t > 0 if and only if C = C̃, Y = Ỹ and m = m̃.

Proof. See Corollary 3 in [13].

The exponential CGMY model is defined under the continuous-time mar-


ket as follows.
Definition 5.3 Let C > 0, G > 0, M > 1, Y ∈ (0, 2) and µ > 0. In
the continuous-time market, if the driving process (Xt )t∈[0,T ] of (St )t∈[0,T ]
is a CGMY process with parameters (C, G, M , Y , m) and m = µ −
Ψ0 (−i; C, G, M, Y ), then (St )t∈[0,T ] is called the CGMY stock price process
with parameters (C, G, M , Y ,µ) and we say that the stock price process fol-
lows the exponential CGMY model.
The function Ψ_0(−i; C, G, M, Y) is well defined under the condition M > 1, and hence E[S_t] = S_0 e^{µt}, t ∈ [0, T].
If we apply Proposition 5.2 to the exponential CGMY model, we obtain
the following proposition.
Theorem 5.4 Assume that (St )t∈[0,T ] is the CGMY stock price process with
parameters (C, G, M, Y, µ) under the market measure P, and with parameters
(C̃, G̃, M̃ , Ỹ , r − d) under a measure Q. Then Q is an EMM of P if and only
if C̃ = C, Ỹ = Y , and

r − d − Ψ_0(−i; C, G̃, M̃, Y) = µ − Ψ_0(−i; C, G, M, Y).   (43)

Proof. See [13].


Estimation of α-Stable Sub-Gaussian
Distributions for Asset Returns

Sebastian Kring¹, Svetlozar T. Rachev², Markus Höchstötter³, and Frank J. Fabozzi⁴

¹ Department of Econometrics, Statistics and Mathematical Finance, University of Karlsruhe, Germany, sebastian.kring@statistik.uni-karlsruhe.de
² Department of Econometrics, Statistics and Mathematical Finance, University of Karlsruhe, Germany, zari.rachev@statistik.uni-karlsruhe.de
³ Department of Econometrics, Statistics and Mathematical Finance, University of Karlsruhe, Germany, markus.hoechstoetter@statistik.uni-karlsruhe.de
⁴ Yale School of Management, New Haven CT, USA, frank.fabozzi@yale.edu

Fitting multivariate α-stable distributions to data is still not feasible in higher dimensions, since the (non-parametric) spectral measure of the characteristic function is extremely difficult to estimate in dimensions higher than 2. This was shown by [3] and [15]. α-stable sub-Gaussian distributions are a particular (parametric) subclass of the multivariate α-stable distributions. We present and extend a method based on [16] to estimate the dispersion matrix of an α-stable sub-Gaussian distribution and to estimate the tail index α of the distribution. In particular, we develop an estimator for the off-diagonal entries of the dispersion matrix that has statistical properties superior to the normal off-diagonal estimator based on the covariation. Furthermore, this approach allows estimation of the dispersion matrix of any normal variance mixture distribution up to a scale parameter. We demonstrate the behavior of these estimators by fitting an α-stable sub-Gaussian distribution to the DAX30 components. Finally, we conduct a stable principal component analysis and calculate the coefficient of tail dependence of the principal components.

1 Introduction
Classical models in financial risk management and portfolio optimization such as the Markowitz portfolio optimization approach are based on the assumption that risk factor returns and stock returns are normally distributed. Since the seminal work of [9] and further investigations by [3], [5], [6], [10], [11], [13], and [17], there has been overwhelming empirical evidence that the normal distribution must be rejected. These investigations led to the conclusion that marginal distributions of risk factors and stock returns exhibit skewness and leptokurtosis, i.e., phenomena that cannot be explained by the normal distribution.
Stable or α-stable distributions have been suggested by the authors above for modeling these peculiarities of financial time series. Besides the fact that α-stable distributions capture these phenomena very well, they have further attractive features which allow them to generalize Gaussian-based financial theory. First, they have the property of stability, meaning that a finite sum of independent and identically distributed (i.i.d.) α-stable random variables is again α-stable. Second, this class of distributions allows for the generalized Central Limit Theorem: a normalized sum of i.i.d. random variables converges in distribution to an α-stable random variable.
A drawback of stable distributions is that, with a few exceptions, no analytic expressions for their densities are known. In the univariate case, this obstacle can be overcome by numerical approximation based on new computational possibilities. These new possibilities make the α-stable distribution accessible for practitioners in the financial sector, at least in the univariate case. The multivariate α-stable case is much more complex, allowing for a very rich dependence structure, which is represented by the so-called spectral measure. In general, the spectral measure is very difficult to estimate even in low dimensions. This is certainly one of the main reasons why multivariate α-stable distributions have not been used in many financial applications.
In financial risk management as well as in portfolio optimization, all the models are inherently multivariate, as stressed by [14]. The multivariate normal distribution is not appropriate to capture the complex dependence structure between assets, since it does not allow for modeling tail dependencies between the assets, nor the leptokurtosis and heavy tails of the marginal return distributions. In many models for market risk management, multivariate elliptical distributions, e.g., the t-distribution or symmetric generalized hyperbolic distributions, are applied. They model the dependence structure of assets better than the multivariate normal distributions (MNDs) and offer an efficient estimation procedure. In general, elliptical distributions (EDs) are an extension of MNDs since they are also elliptically contoured and characterized by the so-called dispersion matrix. The dispersion matrix equals the variance-covariance matrix up to a scaling constant if second moments of the distributions exist, and has a similar interpretation as the variance-covariance matrix for MNDs. Empirical studies (see [14] for further information) show that data of multivariate asset returns in particular are roughly elliptically contoured.
In this paper, we focus on multivariate α-stable sub-Gaussian distributions (MSSDs). In two respects they are a very natural extension of MNDs. First, they have the stability property and allow for the generalized Central Limit Theorem, important features making them attractive for financial theory. Second, they belong to the class of EDs, implying that any linear combination of an α-stable sub-Gaussian random vector remains α-stable sub-Gaussian and therefore the Markowitz portfolio optimization approach is applicable to them.
We derive two methods to estimate the dispersion matrix of an α-stable
sub-Gaussian random vector and analyze them empirically. The first method
is based on the covariation and the second one is a moment-type estimator.
We will see that the second one outperforms the first one. We conclude the
paper with an empirical analysis of the DAX30 using α-stable sub-Gaussian
random vectors.
In Sect. 2 we introduce α-stable distributions and MSSDs, respectively. In
Sect. 3 we provide background information about EDs and normal variance
mixture distributions, as well as outline their role in modern quantitative mar-
ket risk management and modeling. In Sect. 4 we present our main theoretical
results: we derive two new moments estimators for the dispersion matrix of
an MSSD and show the consistency of the estimators. In Sect. 5 we analyze
the estimators empirically using boxplots. In Sect. 6 we fit, as far as we know,
for the first time an α-stable sub-Gaussian distribution to the DAX30 and
conduct a principal component analysis of the stable dispersion matrix. We
compare our results with the normal distribution case. In Sect. 7 we summarize
our findings.

2 α-Stable Distribution: Definitions and Properties

2.1 Univariate α-Stable Distribution

The application of α-stable distributions to financial data stems from the fact that they generalize the normal (Gaussian) distribution and allow for the heavy tails and skewness frequently observed in financial data.
There are several ways to define stable distributions.
Definition 1. Let X, X_1, X_2, ..., X_n be i.i.d. random variables. If the equation

X_1 + X_2 + ... + X_n =^d c_n X + d_n

holds for all n ∈ N with c_n > 0 and d_n ∈ R, then we call X stable or α-stable distributed. (Here =^d denotes equality in distribution.)
The definition justifies the term stable because the sum of i.i.d. random variables has the same distribution as X up to a scale and a shift parameter. One can show that the constant c_n in Definition 1 equals n^{1/α}.
The next definition represents univariate α-stable distributions in terms of their characteristic functions and determines the parametric family which describes univariate stable distributions.

Definition 2. A random variable X is α-stable if the characteristic function of X is

E(exp(itX)) = exp( −σ^α |t|^α (1 − iβ tan(πα/2) sign(t)) + iµt ),        α ≠ 1,
E(exp(itX)) = exp( −σ|t| (1 + iβ (2/π) sign(t) ln|t|) + iµt ),           α = 1,

where α ∈ (0, 2], β ∈ [−1, 1], σ ∈ (0, ∞) and µ ∈ R.
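For readers who want to experiment with these four parameters, the characteristic function can be evaluated numerically as follows (a small sketch, not part of the original text):

```python
import numpy as np

def stable_cf(t, alpha, beta, sigma, mu):
    """Characteristic function of S_alpha(sigma, beta, mu) as in Definition 2."""
    t = np.asarray(t, dtype=float)
    if alpha != 1.0:
        expo = -sigma**alpha * np.abs(t)**alpha * (1 - 1j*beta*np.tan(np.pi*alpha/2)*np.sign(t)) + 1j*mu*t
    else:
        # ln|t| is undefined at t = 0; the whole exponent tends to 0 there
        safe_t = np.where(t != 0, t, 1.0)
        expo = -sigma*np.abs(t)*(1 + 1j*beta*(2/np.pi)*np.sign(t)*np.log(np.abs(safe_t))) + 1j*mu*t
    return np.exp(expo)

print(stable_cf(np.array([-1.0, 0.0, 2.0]), alpha=1.7, beta=0.0, sigma=1.0, mu=0.0))
```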
The probability densities of α-stable random variables exist and are con-
tinuous but, with a few exceptions, they are not known in closed forms. These
exceptions are the Gaussian distribution for α = 2, the Cauchy distribution
for α = 1, and the Lévy distribution for α = 1/2. (For further information,
see [19], where the equivalence of these definitions is shown). The parameter
α is called the index of the law, the index of stability or the characteristic
exponent. The parameter β is called skewness of the law. If β = 0, then the
law is symmetric, if β > 0, it is skewed to the right, if β < 0, it is skewed
to the left. The parameter σ is the scale parameter. Finally, the parameter µ
is the location parameter. The parameters α and β determine the shape of
the distribution. Since the characteristic function of an α-stable random variable is determined by these four parameters, we denote stable distributions by S_α(σ, β, µ) and write X ∼ S_α(σ, β, µ) to indicate that the random variable X has the stable distribution S_α(σ, β, µ). The next definition of an α-stable distribution, which is equivalent to the previous definitions, is based on the generalized Central Limit Theorem:
Definition 3. A random variable X is said to have a stable distribution if it has a domain of attraction, i.e., if there is a sequence of i.i.d. random variables Y_1, Y_2, ... and sequences of positive numbers (d_n)_{n∈N} and real numbers (a_n)_{n∈N}, such that

(Y_1 + Y_2 + ... + Y_n)/d_n + a_n →^d X.

The notation →^d denotes convergence in distribution.
sequence of random variables (Yi )i∈N has second moments, we obtain the
ordinary Central Limit Theorem (CLT). In classical financial theory, the CLT
is the theoretical justification for the Gaussian approach, i.e., it is assumed
that the price process (St ) follows a log-normal distribution. If we assume
that the log-returns log(Sti /Sti−1 ), i = 1, . . . , n, are i.i.d. and have second
moments, we conclude that log(St ) is approximately normally distributed.
This is a result of the ordinary CLT since the stock price can be written as
the sum of independent innovations, i.e.,

log(S_t) = Σ_{i=1}^{n} ( log(S_{t_i}) − log(S_{t_{i−1}}) ) = Σ_{i=1}^{n} log( S_{t_i} / S_{t_{i−1}} ),

where tn = t, t0 = 0, S0 = 1 and ti − ti−1 = 1/n. If we relax the assumption


that stock returns have second moments, we derive from the generalized CLT,
that log(St ) is approximately α-stable distributed. With respect to the CLT,
α-stable distributions are the natural extension of the normal approach. The
tail parameter α has an important meaning for α-stable distributions. First,
α determines the tail behavior of a stable distribution, i.e.,

lim λα P (X > λ) → C+
λ→∞
lim λα P (X < λ) → C− .
λ→−∞

Second, the parameter α characterizes the distributions in the domain of attraction of a stable law. If X is a random variable with lim_{λ→∞} λ^α P(|X| > λ) = C > 0 for some 0 < α < 2, then X is in the domain of attraction of a stable law. Many authors claim that the returns of assets should follow an infinitely divisible law, i.e., for all n ∈ N there exists a sequence of i.i.d. random variables (X_{n,k})_{k=1,...,n} satisfying

X =^d Σ_{k=1}^{n} X_{n,k} .

This property is desirable for models of asset returns in efficient markets since the dynamics of stock prices are driven by continuously arriving but independent pieces of information. From Definition 3, it is obvious that α-stable distributions are infinitely divisible.
The next lemma is useful for deriving an estimator for the scale parame-
ter σ.

Lemma 1. Let X ∼ S_α(σ, β, µ), 1 < α < 2 and β = 0. Then for any 0 < p < α there exists a constant c_{α,β}(p) such that

E(|X − µ|^p)^{1/p} = c_{α,β}(p) σ,

where c_{α,β}(p) = (E|X_0|^p)^{1/p}, X_0 ∼ S_α(1, β, 0).

Proof. See [19]. □

To get a first feeling for the sort of data we are dealing with, we display
in Fig. 1 the kernel density plots of the empirical returns, the Gaussian fit
and the α-stable fit of some representative stocks. We can clearly discern
the individual areas in the plot where the normal fit causes problems. It is
around the mode where the empirical peak is too high to be captured by
the Gaussian parameters. Moreover, in the mediocre parts of the tails, the
empirical distribution attributes less weight than the Gaussian distribution.
And finally, the tails are underestimated, again. In contrast to the Gaussian,
the stable distribution appears to account for all these features of the empirical
distribution quite well.

Fig. 1. Kernel density plots of Adidas AG: empirical, normal, and stable fits

Another means of presenting the aptitude of the stable class to represent


stock returns is the quantile plot. In Fig. 2, we match the empirical stock
return percentiles of Adidas AG with simulated percentiles for the normal
and stable distributions, for the respective estimated parameter tuples. The
stable distribution is liable to produce almost absurd extreme values compared
to the empirical data. Hence, we need to discard the most extreme quantile
pairs. However, the overall position of the line of the joint empirical-stable
percentiles with respect to the interquartile line appears quite convincingly in
favor of the stable distribution.2

2.2 Multivariate α-Stable Distributions

Multivariate stable distributions are the distributions of stable random vec-


tors. They are defined by simply extending the definition of stable random
variables to Rd . As in the univariate case, multivariate Gaussian distribution
is a particular case of multivariate stable distributions. Any linear combina-
tion of stable random vectors is a stable random variate. This is an important
property in terms of portfolio modeling. Multivariate stable cumulative distri-
bution functions or density functions are usually not known in closed form and
therefore, one works with their characteristic functions. The representation of
these characteristic functions include a finite measure on the unit sphere, the
so-called spectral measure. This measure describes the dependence structure

² In Fig. 2 we remove the two most extreme points in the upper and lower tails, respectively.
Fig. 2. Adidas AG quantile plots of empirical return percentiles vs. normal (top) and stable (bottom) fits

of the stable random vector. In general, stable random vectors are difficult
to use for financial modeling, because the spectral measure is difficult to esti-
mate even in low dimensions. For stable financial model building, one has to
focus on certain subclasses of stable random vectors where the spectral mea-
sure has an easier representation. Such a subclass is the multivariate α-stable
sub-Gaussian law. They are obtained by multiplying a Gaussian vector by
W 1/2 where W is a stable random variable totally skewed to the right. Stable

sub-Gaussian distributions inherit their dependence structure from the under-


lying Gaussian vector. In the next section we will see that the distribution of
multivariate stable sub-Gaussian random vectors belongs to the class of ellip-
tical distributions. The definition of stability in R^d is analogous to that in R.
Definition 4. A random vector X = (X_1, . . . , X_d)' is said to be a stable random vector in R^d if for any positive numbers A and B there is a positive number C and a vector D ∈ R^d such that

AX^{(1)} + BX^{(2)} =^d CX + D,

where X^{(1)} and X^{(2)} are independent copies of X.


Note that an α-stable random vector X is called symmetric stable if X
satisfies

P (X ∈ A) = P (−X ∈ A)

for all Borel-sets A in Rd .


Theorem 1. Let X be a stable (respectively symmetric stable) vector in R^d. Then there is a constant α ∈ (0, 2] such that, in Definition 4, C = (A^α + B^α)^{1/α}. Moreover, any linear combination of the components of X of the type Y = Σ_{i=1}^{d} b_i X_i = b'X is an α-stable (respectively symmetric stable) random variable.

Proof. A proof is given in [19]. □

The parameter α in Theorem 1 is called the index of stability. It deter-
mines the tail behavior of a stable random vector, i.e., the α-stable random
vector is regularly varying with tail index α.3 For portfolio analysis and risk
management, it is very important that stable random vectors are closed under
linear combinations of the components due to Theorem 1. In the next section
we will see that elliptically distributed random vectors have this desirable
feature as well.
The next theorem determines α-stable random vectors in terms of the
characteristic function. Since there is a lack of formulas for stable densities
and distribution functions, the characteristic function is the main device to
fit stable random vectors to data.
Theorem 2. The random vector X = (X_1, . . . , X_d)' is an α-stable random vector in R^d if there exist a unique finite measure Γ on the unit sphere S^{d−1}, the so-called spectral measure, and a unique vector µ ∈ R^d such that:
(i) If α ≠ 1,

E(e^{it'X}) = exp{ − ∫_{S^{d−1}} |(t, s)|^α (1 − i sign((t, s)) tan(πα/2)) Γ(ds) + i(t, µ) }.
³ For further information about regularly varying random vectors, see [18].

(ii) If α = 1,

E(e^{it'X}) = exp{ − ∫_{S^{d−1}} |(t, s)| (1 + i (2/π) sign((t, s)) ln|(t, s)|) Γ(ds) + i(t, µ) }.
In contrast to the univariate case, stable random vectors have not been
applied frequently in financial modeling. The reason is that the spectral mea-
sure, as a measure on the unit sphere S d−1 , is extremely difficult to estimate
even in low dimensions. (For further information see [17] and [15].)
Another way to describe stable random vectors is in terms of linear projections. We know from Theorem 1 that any linear combination

(b, X) = Σ_{i=1}^{d} b_i X_i

has an α-stable distribution S_α(σ(b), β(b), µ(b)). By using Theorem 2 we obtain for the parameters σ(b), β(b) and µ(b):

σ(b) = ( ∫_{S^{d−1}} |(b, s)|^α Γ(ds) )^{1/α},

β(b) = ∫_{S^{d−1}} |(b, s)|^α sign((b, s)) Γ(ds) / ∫_{S^{d−1}} |(b, s)|^α Γ(ds),

and

µ(b) = (b, µ)                                                     if α ≠ 1,
µ(b) = (b, µ) − (2/π) ∫_{S^{d−1}} (b, s) ln|(b, s)| Γ(ds)          if α = 1.
The parameters σ(b), β(b), and µ(b) are also called the projection parameters
and σ(.), β(.) and µ(.) are called the projection parameter functions. If one
knows the values of the projection functions for several directions, one can
reconstruct approximately the dependence structure of an α-stable random
vector by estimating the spectral measure. Because of the complexity of this
measure, the method is still not very efficient. But for specific subclasses of
stable random vectors where the spectral measure has a much simpler form,
we can use this technique to fit stable random vectors to data.
Another quantity for characterizing the dependence structure between two
stable random vectors is the covariation.
Definition 5. Let X_1 and X_2 be jointly symmetric stable random variables with α > 1 and let Γ be the spectral measure of the random vector (X_1, X_2)'. The covariation of X_1 on X_2 is the real number

[X_1, X_2]_α = ∫_{S^1} s_1 s_2^{<α−1>} Γ(ds),   (1)

where the signed power a^{<p>} equals a^{<p>} = |a|^p sign(a).

The covariance between two normal random variables X and Y can be interpreted as the inner product in the space L^2(Ω, A, P). The covariation is the analogue for two α-stable random variables X and Y in the space L^α(Ω, A, P).
Unfortunately, Lα (Ω, A, P) is not a Hilbert space and this is why it lacks some
of the desirable and strong properties of the covariance. It follows immediately
from the definition that the covariation is linear in the first argument. Unfor-
tunately, this statement is not true for the second argument. In the case of
α = 2, the covariation equals the covariance.

Proposition 1. Let (X, Y) be jointly symmetric stable random variables with α > 1. Then for all 1 < p < α,

E(X Y^{<p−1>}) / E(|Y|^p) = [X, Y]_α / ||Y||_α^α ,

where ||Y||_α denotes the scale parameter of Y.

Proof. For the proof, see [19]. □

In particular, we apply Proposition 1 in Sect. 4.1 in order to derive an esti-


mator for the dispersion matrix of an α-stable sub-Gaussian distribution.

2.3 α-Stable Sub-Gaussian Random Vectors

In general, as pointed out in the last section, α-stable random vectors have
a complex dependence structure defined by the spectral measure. Since this
measure is very difficult to estimate even in low dimensions, we have to restrict ourselves to certain subclasses where the spectral measure becomes simpler. One of
these special classes is the multivariate α-stable sub-Gaussian distribution.

Definition 6. Let Z be a zero mean Gaussian random vector with variance-covariance matrix Σ and W ∼ S_{α/2}((cos(πα/4))^{2/α}, 1, 0) a totally skewed stable random variable independent of Z. The random vector

X = µ + √W Z

is said to be a sub-Gaussian α-stable random vector. The distribution of X is called the multivariate α-stable sub-Gaussian distribution.

An α-stable sub-Gaussian random vector inherits its dependence structure


from the underlying Gaussian random vector. The matrix Σ is also called the
dispersion matrix. The following theorem and proposition show properties of
α-stable sub-Gaussian random vectors. We need these properties to derive
estimators for the dispersion matrix.

Theorem 3. The sub-Gaussian α-stable random vector X with location parameter µ ∈ R^d has the characteristic function

E(e^{it'X}) = e^{it'µ} e^{−((1/2) t'Σt)^{α/2}},

where Σ_{ij} = E(Z_i Z_j), i, j = 1, . . . , d, are the covariances of the underlying Gaussian random vector (Z_1, . . . , Z_d)'.

For α-stable sub-Gaussian random vectors, we do not need the spectral mea-
sure in the characteristic functions. This fact simplifies the calculation of the
projection functions.

Proposition 2. Let X ∈ R^d be an α-stable sub-Gaussian random vector with location parameter µ ∈ R^d and dispersion matrix Σ. Then, for all a ∈ R^d, we have a'X ∼ S_α(σ(a), β(a), µ(a)), where
(i) σ(a) = ((1/2) a'Σa)^{1/2},
(ii) β(a) = 0,
(iii) µ(a) = a'µ.

Proof. It is well known that the distribution of a'X is determined by its characteristic function:

E(exp(it(a'X))) = E(exp(i(ta)'X))
                = exp(ita'µ) exp(−|(1/2)(ta)'Σ(ta)|^{α/2})
                = exp(ita'µ) exp(−|(1/2) t² a'Σa|^{α/2})
                = exp(−|t|^α ((1/2) a'Σa)^{α/2} + ita'µ).

If we choose σ(a) = ((1/2) a'Σa)^{1/2}, β(a) = 0 and µ(a) = a'µ, then for all t ∈ R we have

E(exp(it(a'X))) = exp( −σ(a)^α |t|^α (1 − iβ(a) tan(πα/2) sign(t)) + iµ(a)t ). □
In particular, we can calculate the entries of the dispersion matrix directly.

Corollary 1. Let X = (X_1, . . . , X_n)' be an α-stable sub-Gaussian random vector with dispersion matrix Σ. Then we obtain
(i) σ_{ii} = 2σ(e_i)²,
(ii) σ_{ij} = (σ²(e_i + e_j) − σ²(e_i − e_j)) / 2.

Since α-stable sub-Gaussian random vectors inherit their dependence structure from the underlying Gaussian vector, we can interpret σ_{ii} as the quasi-variance of the component X_i and σ_{ij} as the quasi-covariance between X_i and X_j.

Proof. It follows from Proposition 2 that σ(e_i) = ((1/2) σ_{ii})^{1/2}. Furthermore, if we set a = e_i + e_j with i ≠ j, we obtain σ(e_i + e_j) = ((1/2)(σ_{ii} + 2σ_{ij} + σ_{jj}))^{1/2}, and for b = e_i − e_j we obtain σ(e_i − e_j) = ((1/2)(σ_{ii} − 2σ_{ij} + σ_{jj}))^{1/2}. Hence, we have

σ_{ij} = (σ²(e_i + e_j) − σ²(e_i − e_j)) / 2. □
Proposition 3. Let X = (X_1, . . . , X_n)' be a zero mean α-stable sub-Gaussian random vector with dispersion matrix Σ. Then it follows that

[X_i, X_j]_α = 2^{−α/2} σ_{ij} σ_{jj}^{(α−2)/2}.

Proof. For a proof see [19]. 




3 α-Stable Sub-Gaussian Distributions as Elliptical Distributions
Many important properties of α-stable sub-Gaussian distributions with re-
spect to risk management, portfolio optimization, and principal component
analysis can be understood very well, if we regard them as elliptical or normal
variance mixture distributions. Elliptical distributions are a natural extension
of the normal distribution which is a special case of this class. They obtain
their name because of the fact that, their densities are constant on ellipsoids.
Furthermore, they constitute a kind of ideal environment for standard risk
management, see [4]. First, correlation and covariance have a very similar in-
terpretation as in the Gaussian world and describe the dependence structure
of risk factors. Second, the Markowitz optimization approach is applicable.
Third, value-at-risk is a coherent risk measure. Fourth, they are closed under
linear combinations, an important property in terms of portfolio optimization. And finally, in the elliptical world, minimizing the risk of a portfolio with
respect to any coherent risk measures leads to the same optimal portfolio.
Empirical investigations have shown that multivariate return data for
groups of similar assets often look roughly elliptical and in market risk man-
agement the elliptical hypothesis can be justified. Elliptical distributions can-
not be applied in credit risk or operational risk, since the hypothesis of ellip-
tical risk factors is found to be rejected.

3.1 Elliptical Distributions and Basic Properties

Definition 7. A random vector X = (X_1, . . . , X_d)' has
(i) a spherical distribution if, for every orthogonal matrix U ∈ R^{d×d},

UX =^d X;

Fig. 3. Bivariate scatterplots of BMW vs. DaimlerChrysler (a) and Commerzbank vs. Deutsche Bank (b). Depicted are daily log-returns from May 6, 2002 through March 31, 2006

(ii) an elliptical distribution if

X =^d µ + AY,

where Y is a spherical random variable and A ∈ R^{d×K} and µ ∈ R^d are a matrix and a vector of constants, respectively.
Elliptical distributions are obtained by multivariate affine transformations of spherical distributions. Figure 3a,b depicts bivariate scatterplots of BMW vs. DaimlerChrysler and Commerzbank vs. Deutsche Bank log-returns. Both scatterplots are roughly elliptically contoured.
Theorem 4. The following statements are equivalent:
(i) X is spherical.
(ii) There exists a function ψ of a scalar variable such that, for all t ∈ R^d,
     φ_X(t) = E(e^{it'X}) = ψ(t't) = ψ(t_1² + ... + t_d²).
(iii) For all a ∈ R^d, we have a'X =^d ||a|| X_1.
(iv) X can be represented as X =^d RS, where S is uniformly distributed on S^{d−1} = {x ∈ R^d : x'x = 1} and R ≥ 0 is a radial random variable independent of S.
Proof. See [14]. □

ψ is called the characteristic generator of the spherical distribution and we use the notation X ∈ S_d(ψ).
Corollary 2. Let X be a d-dimensional elliptical distribution with X =^d µ + AY, where Y is spherical and has the characteristic generator ψ. Then, the characteristic function of X is given by

φ_X(t) := E(e^{it'X}) = e^{it'µ} ψ(t'Σt),

where Σ = AA'.
Furthermore, X can be represented by

X =^d µ + RAS,

where S is uniformly distributed on S^{d−1} and R ≥ 0 is a radial random variable.
Proof. We notice that

φ_X(t) = E(e^{it'X}) = E(e^{it'(µ+AY)}) = e^{it'µ} E(e^{i(A't)'Y}) = e^{it'µ} ψ((A't)'(A't)) = e^{it'µ} ψ(t'AA't). □
Since the characteristic function of a random variate determines the distribution, we denote an elliptical distribution by X ∼ E_d(µ, Σ, ψ). Because of

µ + RAS = µ + (cR)(A/c)S,
the representation of the elliptical distribution in (2) is not unique. We call the
vector µ the location parameter and Σ the dispersion matrix of an elliptical
distribution, since first and second moments of elliptical distributions do not
necessarily exist. But if they exist, the location parameter equals the mean and
the dispersion matrix equals the covariance matrix up to a scale parameter.
In order to have uniqueness for the dispersion matrix, we demand det(Σ) = 1.
If we take any affine linear combination of an elliptical random vector, then,
this combination remains elliptical with the same characteristic generator ψ.
Let X ∼ Ed (µ, Σ, ψ), then it can be shown with similar arguments as in
Corollary 2 that
BX + b ∼ E_k(Bµ + b, BΣB', ψ),
where B ∈ Rk×d and b ∈ Rk .
Let X be an elliptical distribution whose density f(x), x ∈ R^d, exists. Then f is a function of the quadratic form

f(x) = det(Σ)^{−1/2} g(Q)   with   Q := (x − µ)'Σ^{−1}(x − µ).

g is the density of the spherical distribution Y in Definition 7. We call g the density generator of X. As a consequence, since Y has a unimodal density, so does X, and clearly the joint density f is constant on the sets H_c = {x ∈ R^d : Q(x) = c}, c > 0. These sets H_c are elliptically contoured.

Example 1. An α-stable sub-Gaussian random vector is an elliptical random vector. The random vector √W Z is spherical, where W ∼ S_{α/2}((cos(πα/4))^{2/α}, 1, 0) and Z ∼ N(0, I_d), because

√W Z =^d U √W Z

for any orthogonal matrix U. The equation is true since Z is rotationally symmetric. Hence any linear combination of √W Z is an elliptical random vector. The characteristic function of an α-stable sub-Gaussian random vector is given by

E(e^{it'X}) = e^{it'µ} e^{−((1/2) t'Σt)^{α/2}}

due to Theorem 3. Thus, the characteristic generator of an α-stable sub-Gaussian random vector equals

ψ_sub(s, α) = e^{−((1/2) s)^{α/2}}.

Using the characteristic generator, we can derive directly that an α-stable sub-Gaussian random vector is infinitely divisible, since we have

ψ_sub(s, α) = e^{−((1/2) s)^{α/2}} = ( e^{−((1/2) s/n^{2/α})^{α/2}} )^n = ( ψ_sub(s/n^{2/α}, α) )^n.

3.2 Normal Variance Mixture Distributions

Normal variance mixture distributions are a subclass of elliptical distributions.


We will see that they inherit their dependence structure from the underlying
Gaussian random vector. Important distributions in risk management such as
the multivariate t-, generalized hyperbolic, or α-stable sub-Gaussian distribu-
tion belong to this class of distributions.
Definition 8. The random vector X is said to have a (multivariate) normal variance mixture distribution (NVMD) if

X = µ + W^{1/2} AZ,

where
(i) Z ∼ N_d(0, I_d);
(ii) W ≥ 0 is a non-negative, scalar-valued random variable which is independent of Z; and
(iii) A ∈ R^{d×d} and µ ∈ R^d are a matrix and a vector of constants, respectively.

We call a random variable X with NVMD a normal variance mixture (NVM). We observe that X_w = (X|W = w) ∼ N_d(µ, wΣ), where Σ = AA'.

We can interpret the distribution of X as a composite distribution. According to the law of W, we draw normal random vectors X_w with mean µ and covariance matrix wΣ at random. In the context of modeling asset returns or risk factor returns with normal variance mixtures, the mixing variable W can be thought of as a shock that arises from new information and influences the volatility of all stocks.
Since U√W Z =^d √W Z for all U ∈ O(d), every normal variance mixture distribution is an elliptical distribution. The distribution F of W is called the mixing law. Normal variance mixtures are closed under affine linear combinations, since they are elliptical. This can also be seen directly from

BX + µ_1 =^d B(√W AZ + µ_0) + µ_1 = √W BAZ + (Bµ_0 + µ_1) = √W ÃZ + µ̃.
This property makes NVMDs and, in particular, MSSDs applicable to portfo-
lio theory. The class of NVMD has the advantage that structural information
about the mixing law W can be transferred to the mixture law. This is true,
for example, for the property of infinite divisibility. If the mixing law is in-
finitely divisible, then so is the mixture law. (For further information see [2].)
It is obvious from the definition that an α-stable sub-Gaussian random vector
is also a normal variance mixture with mixing law W ∼ Sα ((cos πα 4 )
2/α
, 1, 0).

3.3 Market Risk Management with Elliptical Distributions


In this section, we discuss the properties of elliptical distributions in terms
of market risk management and portfolio optimization. In risk management,
one is mainly interested in modeling the extreme losses which can occur.
From empirical investigations, we know that an extreme loss in one asset very
often occurs with high losses in many other assets. We show that this market
behavior cannot be modeled by the normal distribution but, with certain
elliptical distributions, e.g. α-stable sub-Gaussian distribution, we can capture
this behavior.
Markowitz's portfolio optimization approach, which is originally based on the normal assumption, can be extended to the class of elliptical distributions. Also, statistical dimensionality reduction methods such as principal component analysis are applicable to them. But one must be careful: in contrast to the normal case, the principal components are not independent.
Let F be the distribution function of the random variable X. Then we call

F^←(α) = inf{x ∈ R : F(x) ≥ α}

the quantile function. F^← is also called the generalized inverse, since we have F(F^←(α)) = α for any df F.

Definition 9. Let X_1 and X_2 be random variables with dfs F_1 and F_2. The coefficient of upper tail dependence of X_1 and X_2 is

λ_u := λ_u(X_1, X_2) := lim_{q→1⁻} P(X_2 > F_2^←(q) | X_1 > F_1^←(q)),   (2)

provided a limit λ_u ∈ [0, 1] exists. If λ_u ∈ (0, 1], then X_1 and X_2 are said to show upper tail dependence; if λ_u = 0, they are asymptotically independent in the upper tail. Analogously, the coefficient of lower tail dependence is

λ_l := λ_l(X_1, X_2) := lim_{q→0⁺} P(X_2 ≤ F_2^←(q) | X_1 ≤ F_1^←(q)),   (3)

provided a limit λ_l ∈ [0, 1] exists.


For a better understanding of tail dependence we introduce the concept of
copulas.
Definition 10. A d-dimensional copula is a distribution function on [0, 1]d .
It is easy to show that for U ∼ U (0, 1), we have P (F ← (U ) ≤ x) = F (x) and
if the random variable Y has a continuous df G, then G(Y ) ∼ U (0, 1). The
concept of copulas gained its importance because of Sklar’s Theorem.
Theorem 5. Let F be a joint distribution function with margins F1 , . . . , Fd .
Then, there exists a copula C : [0, 1]^d → [0, 1] such that for all x_1, . . . , x_d in R̄ = [−∞, ∞],

F(x_1, . . . , x_d) = C(F_1(x_1), . . . , F_d(x_d)).   (4)

If the margins are continuous, then C is unique; otherwise C is uniquely


determined on F1 (R) × F2 (R) × . . . × Fd (R). Conversely, if C is a copula and
F1 , ..., Fd are univariate distribution functions, the function F defined in (4)
is a joint distribution function with margins F1 , . . . , Fd .
This fundamental theorem in the field of copulas shows that any multivariate distribution F can be decomposed into a copula C and the marginal distributions of F. Vice versa, we can use a copula C and univariate dfs to construct a multivariate distribution function.
With this short excursion into the theory of copulas we obtain a simpler expression for the upper and lower tail dependencies, i.e.,

λ_l = lim_{q→0⁺} P(X_2 ≤ F_2^←(q), X_1 ≤ F_1^←(q)) / P(X_1 ≤ F_1^←(q)) = lim_{q→0⁺} C(q, q)/q.
Elliptical distributions are radially symmetric, i.e., µ − X =^d µ + X; hence the coefficient of lower tail dependence λ_l equals the coefficient of upper tail dependence λ_u. We denote by λ the coefficient of tail dependence.

We call a measurable function f : R_+ → R_+ regularly varying (at ∞) with index α ∈ R if, for any t > 0, lim_{x→∞} f(tx)/f(x) = t^α. It is now important to notice that regularly varying functions with index α ∈ R behave asymptotically like a power function. An elliptically distributed random vector X = RAU is said to be regularly varying with tail index α if the function f(x) = P(R ≥ x) is regularly varying with tail index α (see [18]). The
following theorem shows the relation between the tail dependence coefficient
and the tail index of elliptical distributions.
Theorem 6. Let X ∼ E_d(µ, Σ, ψ) be regularly varying with tail index α > 0 and Σ a positive definite dispersion matrix. Then, every pair of components of X, say X_i and X_j, is tail dependent and the coefficient of tail dependence corresponds to

λ(X_i, X_j; α, ρ_ij) = ∫_0^{f(ρ_ij)} s^α/√(1−s²) ds / ∫_0^1 s^α/√(1−s²) ds,   (5)

where f(ρ_ij) = √((1+ρ_ij)/2) and ρ_ij = σ_ij/√(σ_ii σ_jj).

Proof. See [20]. □
It is not difficult to show that an α-stable sub-Gaussian distribution is
regularly varying with tail index α. The coefficient of tail dependence between
two components, say Xi and Xj , is determined by equation (5) in Theorem 6.
In the next example, we demonstrate that the coefficient of tail dependence
of a normal distribution is zero.
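The integrals in equation (5) have no closed form but are easy to evaluate numerically. The following sketch (an illustration, not part of the original text) uses the substitution s = sin θ, which removes the singularity at s = 1:

```python
import numpy as np
from scipy.integrate import quad

def tail_dependence(alpha, rho):
    """Coefficient of tail dependence from equation (5), via s = sin(theta)."""
    g = lambda theta: np.sin(theta)**alpha
    upper = np.arcsin(np.sqrt((1.0 + rho) / 2.0))
    num, _ = quad(g, 0.0, upper)
    den, _ = quad(g, 0.0, np.pi / 2.0)
    return num / den

print(tail_dependence(alpha=1.7, rho=0.5))   # strictly positive, unlike the Gaussian case below
```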
Example 2. Let (X_1, X_2) be a bivariate normal random vector with correlation ρ ∈ (−1, 1) and standard normal marginals. Let C_ρ be the corresponding Gaussian copula due to Sklar's theorem. Then, by the l'Hôpital rule,

λ = lim_{q→0⁺} C_ρ(q, q)/q = lim_{q→0⁺} dC_ρ(q, q)/dq
  = lim_{q→0⁺} lim_{h→0⁺} ( C_ρ(q + h, q + h) − C_ρ(q, q) ) / h
  = lim_{q→0⁺} lim_{h→0⁺} ( C_ρ(q + h, q + h) − C_ρ(q + h, q) + C_ρ(q + h, q) − C_ρ(q, q) ) / h
  = lim_{q→0⁺} lim_{h→0⁺} P(U_1 ≤ q + h, q ≤ U_2 ≤ q + h) / P(q ≤ U_2 ≤ q + h)
    + lim_{q→0⁺} lim_{h→0⁺} P(q ≤ U_1 ≤ q + h, U_2 ≤ q) / P(q ≤ U_1 ≤ q + h)
  = lim_{q→0⁺} P(U_2 ≤ q | U_1 = q) + lim_{q→0⁺} P(U_1 ≤ q | U_2 = q)
  = 2 lim_{q→0⁺} P(U_2 ≤ q | U_1 = q)
  = 2 lim_{q→0⁺} P(Φ^{−1}(U_2) ≤ Φ^{−1}(q) | Φ^{−1}(U_1) = Φ^{−1}(q))
  = 2 lim_{x→−∞} P(X_2 ≤ x | X_1 = x).

Since we have X_2 | X_1 = x ∼ N(ρx, 1 − ρ²), we obtain

λ = 2 lim_{x→−∞} Φ( x √(1 − ρ)/√(1 + ρ) ) = 0.   (6)

Equation (6) shows that, besides the fact that a normal distribution is not heavy tailed, its components are asymptotically independent. This, again, contradicts empirical investigations of market behavior. Especially in
extreme market situations, when a financial market declines in value, market
participants tend to behave homogeneously, i.e., they leave the market and sell
their assets. This behavior causes losses in many assets simultaneously. This
phenomenon can only be captured by distributions which are asymptotically
dependent.
[12] optimizes the risk and return behavior of a portfolio based on the
expected returns and the covariances of the returns in the considered asset
universe. The risk of a portfolio consisting of these assets is measured by the
variance of the portfolio return. In addition, he assumes that the asset returns
follow a multivariate normal distribution with mean µ and covariance Σ. This
approach leads to the following optimization problem

min_{w∈R^d} w'Σw,

subject to

w'µ = µ_p,
w'1 = 1.

This approach can be extended in two ways. First, we can replace the
assumption of normally distributed asset returns by elliptically distributed
asset returns and second, instead of using the variance as the risk measure,
we can apply any positive-homogeneous, translation-invariant measure of risk
to rank risk or to determine the optimal risk-minimizing portfolio. In general,
due to the work of [1], a risk measure is a real-valued function ϱ : M → R, where M ⊂ L^0(Ω, F, P) is a convex cone. L^0(Ω, F, P) is the set of all almost surely finite random variables. The risk measure ϱ is translation invariant if for all L ∈ M and every l ∈ R, we have ϱ(L + l) = ϱ(L) + l. It is positive-homogeneous if for all λ > 0, we have ϱ(λL) = λϱ(L). Note that value-at-risk (VaR) as well as conditional value-at-risk (CVaR) fulfill these two properties.

Theorem 7. Let the random vector of asset returns X be E_d(µ, Σ, ψ). We denote by W = {w ∈ R^d : Σ_{i=1}^{d} w_i = 1} the set of portfolio weights. Assume that the current value of the portfolio is V and let L(w) = V Σ_{i=1}^{d} w_i X_i be the (linearized) portfolio loss. Let ϱ be a real-valued risk measure depending only on the distribution of a risk. Suppose ϱ is positive homogeneous and translation invariant and let Y = {w ∈ W : −w'µ = m} be the subset of portfolios giving expected return m. Then, argmin_{w∈Y} ϱ(L(w)) = argmin_{w∈Y} w'Σw.

Proof. See [14]. 




The last theorem stresses that the dispersion matrix contains all the informa-
tion for the management of risk. In particular, the tail index of an elliptical
random vector has no influence on optimizing risk. Of course, the index has
an impact on the value of the particular risk measure like VaR or CVaR, but
not on the weights of the optimal portfolio, due to the Markowitz approach.
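The practical content of Theorem 7 is that, under the elliptical hypothesis, the optimal weights are computed exactly as in the classical Markowitz setting, with the dispersion matrix in place of the covariance matrix. A minimal sketch of the quadratic program with its two linear constraints (the numbers are purely illustrative and not taken from the chapter):

```python
import numpy as np

def min_dispersion_weights(Sigma, mu, m):
    """Minimize w' Sigma w subject to w'mu = m and w'1 = 1 (Lagrange multipliers)."""
    A = np.column_stack([mu, np.ones(len(mu))])        # constraint matrix
    SinvA = np.linalg.solve(Sigma, A)                  # Sigma^{-1} A
    lam = np.linalg.solve(A.T @ SinvA, np.array([m, 1.0]))
    return SinvA @ lam

Sigma = np.array([[5.0, 11.0], [11.0, 25.0]])          # a positive definite dispersion matrix
mu = np.array([0.05, 0.08])
w = min_dispersion_weights(Sigma, mu, m=0.06)
print(w, w @ mu, w.sum())
```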
In risk management, we very often have to deal with portfolios consisting of many different assets. In many of these cases it is important to reduce the dimensionality of the problem in order not only to understand the portfolio's risk but also to forecast the risk. A classical method to reduce the dimensionality of a portfolio whose assets are highly correlated is principal component analysis (PCA). PCA is based on the spectral decomposition theorem. Any symmetric positive definite matrix Σ can be decomposed as

Σ = PDP',

where P is an orthogonal matrix whose columns are the eigenvectors of Σ and D is a diagonal matrix of the eigenvalues of Σ. In addition, we demand λ_i ≥ λ_{i+1}, i = 1, . . . , d−1, for the eigenvalues of Σ in D. If we apply the spectral decomposition theorem to the dispersion matrix of an elliptical random vector X with distribution E_d(µ, Σ, ψ), we can interpret the principal components, which are defined by

Y_i = P_i'(X − µ), i = 1, . . . , d,   (7)

as the main statistical risk factors of the distribution of X in the following sense:

P_1'ΣP_1 = max{w'Σw : w'w = 1}.   (8)

More generally,

P_i'ΣP_i = max{w'Σw : w ∈ {P_1, . . . , P_{i−1}}^⊥, w'w = 1}.

From equation (8), we can derive that the linear combination Y_1 = P_1'(X − µ) has the highest dispersion of all linear combinations, and P_i'X has the highest dispersion in the linear subspace {P_1, ..., P_{i−1}}^⊥. If we interpret trace Σ = Σ_{j=1}^{d} σ_{jj} as a measure of the total variability in X and since we have

Σ_{i=1}^{d} P_i'ΣP_i = Σ_{i=1}^{d} λ_i = trace Σ = Σ_{i=1}^{d} σ_{ii},

we can measure the ability of the first k principal components to explain the variability of X by the ratio Σ_{j=1}^{k} λ_j / Σ_{j=1}^{d} λ_j.
Furthermore, we can use the principal components to construct a statistical
factor model. Due to equation (7), we have

Y = P'(X − µ),

which can be inverted to

X = µ + PY.

If we partition Y into (Y_1, Y_2)', where Y_1 ∈ R^k and Y_2 ∈ R^{d−k}, and also P into (P_1, P_2), where P_1 ∈ R^{d×k} and P_2 ∈ R^{d×(d−k)}, we obtain the representation

X = µ + P_1 Y_1 + P_2 Y_2 = µ + P_1 Y_1 + ε.

But one has to be careful. In contrast to the normal distribution case, the principal components are only quasi-uncorrelated but not independent. Furthermore, we obtain for the coefficient of tail dependence between two principal components, say Y_i and Y_j,

λ(Y_i, Y_j; α, 0) = ∫_0^{√(1/2)} s^α/√(1−s²) ds / ∫_0^1 s^α/√(1−s²) ds.
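Computationally, the stable principal component analysis is identical to the Gaussian one once the dispersion matrix has been estimated; only the interpretation of the components changes. A short sketch with an arbitrary illustrative matrix:

```python
import numpy as np

def dispersion_pca(Sigma):
    """Spectral decomposition Sigma = P D P' with eigenvalues in descending order,
    plus the cumulative explained-variability ratios sum_{j<=k} lambda_j / sum_j lambda_j."""
    eigval, eigvec = np.linalg.eigh(Sigma)
    order = np.argsort(eigval)[::-1]
    eigval, eigvec = eigval[order], eigvec[:, order]
    return eigval, eigvec, np.cumsum(eigval) / eigval.sum()

Sigma = np.array([[4.0, 1.5, 0.5], [1.5, 3.0, 1.0], [0.5, 1.0, 2.0]])
lam, P, explained = dispersion_pca(Sigma)
print(lam, explained)
```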

4 Estimation of α-Stable Sub-Gaussian Distributions

In contrast to the general case of multivariate α-stable distributions, we show


that the estimation of the parameters of an α-stable sub-Gaussian distribution
is feasible. As shown in the last section, α-stable sub-Gaussian distributions
belong to the class of elliptical distributions. In general, one can apply a two-
step estimation procedure for the elliptical class. In the first step, we estimate
independently the location parameter µ ∈ Rd and the positive definite disper-
sion matrix Σ up to a scale parameter. In the second step, we estimate the
parameter of the radial random variable W .
We apply this idea to α-stable sub-Gaussian distributions. In Sects. 4.1
and 4.2 we present our main theoretical results, deriving estimators for the
dispersion matrix and proving their consistency. In Sect. 4.3 we present a
new procedure to estimate the parameter α of an α-stable sub-Gaussian
distribution.

4.1 Estimation of the Dispersion Matrix with Covariation

In Sect. 2.2, we introduced the covariation of a multivariate α-stable random vector. This quantity allows us to derive a consistent estimator for an α-stable dispersion matrix. In order to shorten the notation, we denote by σ_j = σ(e_j) the scale parameter of the jth component of an α-stable random vector X = (X_1, . . . , X_d)' ∈ R^d.

Proposition 4. (a) Let X = (X_1, . . . , X_d)' ∈ R^d be a zero mean α-stable sub-Gaussian random vector with positive definite dispersion matrix Σ ∈ R^{d×d}. Then, we have

σ_ij = (2 / c_{α,0}(p)^p) σ(e_j)^{2−p} E(X_i X_j^{<p−1>}),   (9)

where p ∈ (1, α), c_{α,0}(p) = E(|Y|^p)^{1/p} > 0 and Y ∼ S_α(1, 0, 0).
(b) Let X_1, X_2, . . . , X_n be independent and identically distributed samples with the same distribution as the random vector X. Let σ̂_j be a consistent estimator for σ_j, the scale parameter of the jth component of X. Then, the estimator σ̂_ij^{(2)}(n, p), defined as

σ̂_ij^{(2)}(n, p) = (2 / c_{α,0}(p)^p) σ̂_j^{2−p} (1/n) Σ_{t=1}^{n} X_{ti} X_{tj}^{<p−1>},   (10)

is a consistent estimator for σ_ij, where X_{ti} refers to the ith entry of the observation X_t, t = 1, . . . , n, c_{α,0}(p) = E(|Y|^p)^{1/p} and Y ∼ S_α(1, 0, 0).
Proof. (a) Due to Proposition 3 we have

σ_ij = 2^{α/2} σ_jj^{(2−α)/2} [X_i, X_j]_α
     = 2^{α/2} σ_jj^{(2−α)/2} E(X_i X_j^{<p−1>}) σ_j^α / E(|X_j|^p)              (by Proposition 1)
     = 2^{α/2} σ_jj^{(2−α)/2} E(X_i X_j^{<p−1>}) σ_j^α / (c_{α,0}(p)^p σ_j^p)     (by Lemma 1)
     = 2^{p/2} σ_jj^{(2−p)/2} E(X_i X_j^{<p−1>}) / c_{α,0}(p)^p.                  (by Corollary 1(i))

(b) The estimator σ̂_j is consistent and f(x) = x^{2−p} is continuous; hence the estimator σ̂_j^{2−p} is consistent for σ_j^{2−p}. By the law of large numbers, (1/n) Σ_{t=1}^{n} X_{ti} X_{tj}^{<p−1>} is consistent for E(X_i X_j^{<p−1>}). Since the product of two consistent estimators is consistent, the estimator

σ̂_ij^{(2)}(n, p) = (2 / c_{α,0}(p)^p) σ̂_j^{2−p} (1/n) Σ_{t=1}^{n} X_{ti} X_{tj}^{<p−1>}

is consistent. □


4.2 Estimation of the Dispersion Matrix with Moment-Type Estimators

In this section, we present an approach for estimating the dispersion matrix
up to a scale parameter which is applicable to the class of normal variance
mixtures. In particular, we will see that if we know the tail parameter of an
α-stable sub-Gaussian random vector X ∈ Rd , this approach allows us to
estimate the dispersion matrix of X.
We denote with (Wθ )θ∈Θ a parametric family of positive random variables.

Lemma 2. Let Z ∈ R^d be a normally distributed random vector with mean zero and positive definite dispersion matrix Σ ∈ R^{d×d} and let X_θ = µ + √(W_θ) Z, θ ∈ Θ, be a d-dimensional normal variance mixture with location parameter µ ∈ R^d. Furthermore, we assume that √(W_θ) has tail parameter α(θ), θ ∈ Θ.⁴ Then, there exists a function c : {(θ, p) ∈ Θ × R : p ∈ (0, α(θ))} → (0, ∞) such that, for all a ∈ R^d \ {0}, we have

E(|a'(X_θ − µ)|^p) = c(θ, p)^p (a'Σa)^{p/2}.   (11)

The function c is given by

c(θ, p)^p = E(W_θ^{p/2}) E(|Z̃|^p),

where the random variable Z̃ is standard normally distributed. Furthermore, c satisfies

lim_{p→0} c(θ, p) = 1,   (12)

for all θ ∈ Θ.

We see from (11) that the covariance matrix of Z determines the dispersion
matrix of Xθ up to a scaling constant.

Proof. Let θ ∈ Θ, p ∈ (0, α(θ)) and a ∈ R^d \ {0}. Then we have

E(|a'(X_θ − µ)|^p) = E(|a' W_θ^{1/2} Z|^p) = E(W_θ^{p/2}) E(|a'Z/(a'Σa)^{1/2}|^p) (a'Σa)^{p/2},

where the product of the first two factors on the right-hand side is c(θ, p)^p. Note that Z̃ = a'Z/(a'Σa)^{1/2} is standard normally distributed; hence c(θ, p) does not depend on a. Since E(W_θ^{p/2}) > 0 and E(|a'Z/(a'Σa)^{1/2}|^p) > 0, we also have c(θ, p) > 0.
Since x^p ≤ max{1, x^{α(θ)}} for p ∈ (0, α(θ)) and x > 0, it follows from Lebesgue's theorem that

lim_{p→0} E(W_θ^{p/2}) E(|Z̃|^p) = E(lim_{p→0} W_θ^{p/2}) E(lim_{p→0} |Z̃|^p) = E(1)E(1) = 1. □


⁴ Note: if the random variable X has tail parameter α, then E(|X|^p) < ∞ for all p < α and E(|X|^p) = ∞ for all p ≥ α (see [19]).

Theorem 8. Let Z, X_θ, θ ∈ Θ, and c : {(θ, p) ∈ Θ × R : p ∈ (0, α(θ))} → (0, ∞) be as in Lemma 2. Let X_1, . . . , X_n ∈ R^d be i.i.d. samples with the same distribution as X_θ. The estimator

σ̂_n(p, a) = (1/n) Σ_{i=1}^{n} |a'(X_i − µ)|^p / c(θ, p)^p   (13)

(i) is unbiased, i.e.,

E(σ̂_n(p, a)) = (a'Σa)^{p/2} for all a ∈ R^d;

(ii) is consistent, i.e.,

P(|σ̂_n(p, a) − (a'Σa)^{p/2}| > ε) → 0   (n → ∞),

if p < α(θ)/2.

Proof. (i) follows directly from Lemma 2. For statement (ii), we have to show that

P_ε(n) := P(|σ̂_n(p, a) − (a'Σa)^{p/2}| > ε) → 0   (n → ∞).

But this holds because of

P_ε(n) ≤ (1/ε²) Var(σ̂_n(p, a))                                                        (∗)
      = (1/(ε² n² c(θ, p)^{2p})) Var( Σ_{i=1}^{n} |a'(X_i − µ)|^p )
      = (1/(ε² n c(θ, p)^{2p})) Var(|a'(X − µ)|^p)
      = (1/(ε² n c(θ, p)^{2p})) ( E(|a'(X − µ)|^{2p}) − E(|a'(X − µ)|^p)² )
      = (1/(ε² n c(θ, p)^{2p})) ( c(θ, 2p)^{2p} (a'Σa)^p − c(θ, p)^{2p} (a'Σa)^p )
      = (1/(ε² n)) ( (c(θ, 2p)/c(θ, p))^{2p} − 1 ) (a'Σa)^p → 0   (n → ∞).

The inequality (∗) holds because of Chebyshev's inequality, and E(|a'(X − µ)|^{2p}) < ∞ because of the assumption p < α(θ)/2. □

Note that σ̂_n(p, a)^{2/p}, a ∈ R^d, is a biased but consistent estimator for (a'Σa). However, since in general we cannot determine c(θ, p) > 0, we have to use

σ̂_n(p, a) c(θ, p)^p = (1/n) Σ_{i=1}^{n} |a'(X_i − µ)|^p   (14)

as the estimator. But then, Theorem 8 allows us to estimate the dispersion matrix only up to a scaling constant, using linear combinations a'X_1, . . . , a'X_n, a ∈ R^d, of the observations X_1, . . . , X_n. We can apply two different approaches to do this.
The first approach is based on the fact that the equation

σ_ij = ( (e_i + e_j)'Σ(e_i + e_j) − (e_i − e_j)'Σ(e_i − e_j) ) / 4

holds for all 1 ≤ i < j ≤ d. Then, we can conclude that the estimator

σ̂_ij(n, p) := ( c(θ, p) σ̂_n(p, e_i + e_j)^{2/p} − c(θ, p) σ̂_n(p, e_i − e_j)^{2/p} ) / 4   (15)

is a consistent estimator for σ_ij up to the scaling constant c(θ, p), which is the same for all 1 ≤ i < j ≤ d.
For the second approach we use different linear projections a_i'X_1, . . . , a_i'X_n, a_i ∈ R^d, i = 1, . . . , m, of the observations in order to reconstruct Σ through the following optimization problem:

Σ̂(n, p) = argmin_{Σ∈R^{d×d}, sym.} Σ_{i=1}^{m} ( c(θ, p) σ̂_n(p, a_i)^{2/p} − a_i'Σa_i )².   (16)

It is important to note that the optimization problem (16) can be solved by ordinary least squares regression.
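A minimal sketch of how (16) can be set up as a least squares problem in the free entries of a symmetric matrix; the directions a_i and the target values below are placeholders (in practice the targets would be the scaled projection estimates c(θ, p)σ̂_n(p, a_i)^{2/p}):

```python
import numpy as np

def reconstruct_dispersion(directions, targets, d):
    """Least squares solution of (16): find a symmetric Sigma with a_i' Sigma a_i ~ targets."""
    idx = np.triu_indices(d)
    rows = []
    for a in directions:
        outer = np.outer(a, a)
        # off-diagonal entries of Sigma enter a' Sigma a twice
        rows.append(np.where(idx[0] == idx[1], 1.0, 2.0) * outer[idx])
    coef, *_ = np.linalg.lstsq(np.array(rows), np.asarray(targets), rcond=None)
    Sigma = np.zeros((d, d))
    Sigma[idx] = coef
    return Sigma + np.triu(Sigma, 1).T

# toy usage with arbitrary directions (placeholders, not from the chapter)
d = 3
rng = np.random.default_rng(0)
dirs = [v / np.linalg.norm(v) for v in rng.normal(size=(10, d))]
true_Sigma = np.array([[2.0, 0.5, 0.2], [0.5, 1.5, 0.3], [0.2, 0.3, 1.0]])
targets = [a @ true_Sigma @ a for a in dirs]
print(reconstruct_dispersion(dirs, targets, d))
```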
In the next theorem, we present an estimator that is based on the following observation. Letting X_θ, X_1, X_2, X_3, ... be a sequence of i.i.d. normal variance mixtures, we have

lim_{n→∞} lim_{p→0} ( (1/n) Σ_{i=1}^{n} (|a'X_i − µ(a)| / c(θ, p))^p )^{1/p}  =(∗)  lim_{n→∞} ∏_{i=1}^{n} |a'X_i − µ(a)|^{1/n} = (a'Σa)^{1/2}.

The last equation is true because of (ii) of the following theorem. The proof of the equality (∗) can be found in [21].

Theorem 9. Let Z, X_θ, θ ∈ Θ, and c : {(θ, p) ∈ Θ × R : p ∈ (0, α(θ))} → (0, ∞) be as in Lemma 2 and let X_1, . . . , X_n ∈ R^d be i.i.d. samples with the same distribution as X_θ. The estimator

σ̂_n(a) = (1/c(θ, 1/n)) ∏_{i=1}^{n} |a'(X_i − µ)|^{1/n}

(i) is unbiased, i.e.,

E(σ̂_n(a)) = (a'Σa)^{1/2} for all a ∈ R^d;



(ii) is consistent, i.e.,

P(|σ̂_n(a) − (a'Σa)^{1/2}| > ε) → 0   (n → ∞).

Proof. (i) follows directly from Lemma 2. For statement (ii), we have to show that

P_ε(n) := P(|σ̂_n(a) − (a'Σa)^{1/2}| > ε) → 0   (n → ∞).

But this holds because of

P_ε(n) ≤ (1/ε²) Var(σ̂_n(a))                                                          (∗)
      = (1/(ε² c(θ, 1/n)²)) Var( ∏_{i=1}^{n} |a'(X_i − µ)|^{1/n} )
      = (1/(ε² c(θ, 1/n)²)) ( ∏_{i=1}^{n} E(|a'(X_i − µ)|^{2/n}) − ∏_{i=1}^{n} E(|a'(X_i − µ)|^{1/n})² )
      = (1/(ε² c(θ, 1/n)²)) ( E(|a'(X − µ)|^{2/n})^n − E(|a'(X − µ)|^{1/n})^{2n} )
      = (1/(ε² c(θ, 1/n)²)) ( c(θ, 2/n)² (a'Σa) − c(θ, 1/n)² (a'Σa) )
      = (1/ε²) ( (c(θ, 2/n)/c(θ, 1/n))² − 1 ) (a'Σa) → 0   (n → ∞).

The inequality (∗) holds because of Chebyshev's inequality. Then (ii) follows from (12) in Lemma 2. □

Note that σ̂_n²(a), a ∈ R^d, is a biased but consistent estimator for (a'Σa).
For the rest of this section we concentrate on α-stable sub-Gaussian random vectors. In this case, the family of positive random variables (W_θ)_{θ∈Θ} is given by

(W_α)_{α∈(0,2)}   with   W_α ∼ S_{α/2}((cos(πα/4))^{2/α}, 1, 0).

Furthermore, the scaling function c(., .) defined in Lemma 2 satisfies

c(α, p)^p = 2^p Γ((p+1)/2) Γ(1 − p/α) / (Γ(1 − p/2) √π)
          = (2/π) sin(πp/2) Γ(p) Γ(1 − p/α),   (17)

where Γ(.) is the Gamma function. For the proof of (17), see [7] and [21].
With Theorems 8 and 9, we derive two estimators for the scale parameter σ(a) of the linear projection a'X for an α-stable sub-Gaussian random vector X. The first one is

σ̂_n(p, a) = ( (2/π) sin(πp/2) Γ(p) Γ(1 − p/α) )^{−1} (1/n) Σ_{i=1}^{n} |a'X_i − µ(a)|^p,

based on Theorem 8. The second one is

σ̂_n(a) = (1/c(α, 1/n)) ∏_{i=1}^{n} |a'X_i − µ(a)|^{1/n}
        = ( (2/π) sin(π/(2n)) Γ(1/n) Γ(1 − 1/(nα)) )^{−n} ∏_{i=1}^{n} |a'X_i − µ(a)|^{1/n},

based on Theorem 9. We can reconstruct the stable dispersion matrix from the linear projections as shown in (15) and (16).
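The constant in (17) and the first estimator are straightforward to code. The following sketch (not part of the original text) assumes centered projections a'X_i − µ(a) and 0 < p < α/2:

```python
import numpy as np
from scipy.special import gamma

def c_pow_p(alpha, p):
    """c(alpha, p)^p from equation (17)."""
    return (2.0 / np.pi) * np.sin(np.pi * p / 2.0) * gamma(p) * gamma(1.0 - p / alpha)

def sigma_hat_n(proj, alpha, p):
    """First estimator above: by Theorem 8 a consistent estimate of (a'Sigma a)^{p/2},
    where proj holds the centered projections a'X_i - mu(a)."""
    return np.mean(np.abs(proj) ** p) / c_pow_p(alpha, p)
```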

4.3 Estimation of the Parameter α

We assume that the data X1 , . . . , Xn ∈ Rd follow a sub-Gaussian α-stable


distribution. We propose the following algorithm to obtain the underlying
parameter α of the distribution.
(i) Generate i.i.d. samples u_1, u_2, . . . , u_n according to the uniform distribution on the unit hypersphere S^{d−1}.
(ii) For all i from 1 to n, estimate the index of stability α̂_i with respect to the data u_i'X_1, u_i'X_2, . . . , u_i'X_n, using an unbiased and fast estimator α̂ for the index.
(iii) Calculate the index of stability of the distribution by

α̂ = (1/n) Σ_{k=1}^{n} α̂_k.
The algorithm converges to the index of stability α of the distribution.


(For further information we refer to [17].)
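A possible implementation of steps (i)–(iii) is sketched below; estimate_alpha_1d is a placeholder for whatever univariate index estimator one prefers (scipy.stats.levy_stable.fit is one slow possibility; faster quantile-based estimators would typically be used in practice):

```python
import numpy as np
from scipy.stats import levy_stable

def estimate_alpha_1d(x):
    # placeholder univariate index estimator; maximum likelihood via scipy (slow)
    alpha, beta, loc, scale = levy_stable.fit(x)
    return alpha

def estimate_alpha(X, n_directions=100, seed=0):
    """Average univariate index estimates over random projections (steps (i)-(iii))."""
    rng = np.random.default_rng(seed)
    alphas = []
    for _ in range(n_directions):
        u = rng.normal(size=X.shape[1])
        u /= np.linalg.norm(u)            # uniformly distributed direction on S^{d-1}
        alphas.append(estimate_alpha_1d(X @ u))
    return float(np.mean(alphas))
```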

4.4 Simulation of α-Stable Sub-Gaussian Distributions

Efficient and fast multivariate random number generators are indispensable


for modern portfolio investigations. They are important for Monte-Carlo sim-
ulations for VaR, which have to be sampled in a reasonable time frame. For
the class of elliptical distributions we present a fast and efficient algorithm
which will be used for the simulation of α-stable sub-Gaussian distributions in
the next section. We assume the dispersion matrix Σ to be positive definite.
Hence we obtain from the Cholesky decomposition Σ = AA' a unique full-rank lower-triangular matrix A ∈ R^{d×d}. We present a generic algorithm for gen-
erating multivariate elliptically-distributed random vectors. The algorithm is
based on the stochastic representation of Corollary 2. For the generation of
our samples, we use the following algorithm:

Algorithm for E_d(µ, Σ, ψ_sub) simulation

(i) Set Σ = AA', via Cholesky decomposition.
(ii) Sample a random number from W.
(iii) Sample d independent random numbers Z_1, . . . , Z_d from a N_1(0, 1) law.
(iv) Set U = Z/||Z|| with Z = (Z_1, . . . , Z_d)'.
(v) Return X = µ + √W AU.

If we want to generate random numbers with an E_d(µ, Σ, ψ_sub) law with this algorithm, we choose W =^d S_{α/2}((cos(πα/4))^{2/α}, 1, 0) · ||Z||², where Z is N_d(0, I_d) distributed. It can be shown that ||Z||² is independent of both W as well as Z/||Z||.
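Equivalently, one can simulate directly from the representation X = µ + √W AZ of Definition 6, as in the following sketch; scipy.stats.levy_stable is assumed to be available, and its parameterization should be checked against [19] before serious use:

```python
import numpy as np
from scipy.stats import levy_stable

def simulate_sub_gaussian(mu, Sigma, alpha, n, seed=None):
    """Draw n samples X = mu + sqrt(W) * A Z with
    W ~ S_{alpha/2}((cos(pi*alpha/4))^{2/alpha}, 1, 0) totally skewed to the right."""
    rng = np.random.default_rng(seed)
    A = np.linalg.cholesky(Sigma)
    scale = np.cos(np.pi * alpha / 4.0) ** (2.0 / alpha)
    W = levy_stable.rvs(alpha / 2.0, 1.0, loc=0.0, scale=scale, size=n, random_state=rng)
    Z = rng.standard_normal((n, len(mu)))
    return mu + np.sqrt(W)[:, None] * (Z @ A.T)

X = simulate_sub_gaussian(np.zeros(2), np.array([[5.0, 11.0], [11.0, 25.0]]), alpha=1.7, n=1000, seed=1)
```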

5 Empirical Analysis of the Estimators


In this section, we evaluate two different estimators for the dispersion matrix
of an α-stable sub-Gaussian distribution using boxplots. We are primarily
interested in estimating the off-diagonal entries, since the diagonal entries σii
are essentially only the square of the scale parameter σ. Estimators for the
scale parameter σ have been analyzed in numerous studies. Due to Corollary 1
and Theorem 9, the estimator
σ̂_ij^{(1)}(n) = ( σ̂_n(e_i + e_j)² − σ̂_n(e_i − e_j)² ) / 2   (18)

is a consistent estimator for σ_ij, and the second estimator

σ̂_ij^{(2)}(n, p) = (2 / c_{α,0}(p)^p) σ̂_n(e_j)^{2−p} (1/n) Σ_{k=1}^{n} X_{ki} X_{kj}^{<p−1>}   (19)

is consistent for i ≠ j because of Proposition 4. We analyze the estimators empirically.
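Both estimators can be written down in a few lines. The sketch below is illustrative only; for the projection scale parameters it plugs in a fractional-moment estimate based on Lemma 1 and equation (17), whereas the study in this section uses the estimator of Theorem 9:

```python
import numpy as np
from scipy.special import gamma

def c_pow_p(alpha, p):
    # c_{alpha,0}(p)^p = E|X_0|^p for X_0 ~ S_alpha(1, 0, 0), cf. Lemma 1 and equation (17)
    return (2.0 / np.pi) * np.sin(np.pi * p / 2.0) * gamma(p) * gamma(1.0 - p / alpha)

def scale_est(x, alpha, p=0.5):
    # Lemma 1: sigma = (E|X|^p)^{1/p} / c_{alpha,0}(p); the data x are assumed centered
    return (np.mean(np.abs(x) ** p) / c_pow_p(alpha, p)) ** (1.0 / p)

def sigma_ij_proj(X, i, j, alpha):
    """Estimator (18), using the scale parameters of the projections onto e_i +/- e_j."""
    s_plus = scale_est(X[:, i] + X[:, j], alpha)
    s_minus = scale_est(X[:, i] - X[:, j], alpha)
    return (s_plus**2 - s_minus**2) / 2.0

def sigma_ij_cov(X, i, j, alpha, p=1.0001):
    """Estimator (19), the covariation-based estimate; requires 1 < p < alpha."""
    s_j = scale_est(X[:, j], alpha)
    signed_pow = np.abs(X[:, j]) ** (p - 1.0) * np.sign(X[:, j])
    return 2.0 / c_pow_p(alpha, p) * s_j ** (2.0 - p) * np.mean(X[:, i] * signed_pow)
```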
For an empirical evaluation of the estimators described above, it is suffi-
cient to exploit the two-dimensional sub-Gaussian law since for estimating σij
we only need the ith and jth component of the data X1 , X2 , . . . , Xn ∈ Rd .
For a better understanding of the speed of convergence of the estimators, we
choose different sample sizes (n = 100, 300, 500, 1000). Due to the fact that
asset returns exhibit an index of stability in the range between 1.5 and 2, we
only consider the values α = 1.5, 1.6, . . . , 1.9. For the empirical analysis of the
estimators, we choose the matrix
$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}.$$
The corresponding dispersion matrix is
$$\Sigma = AA^{\top} = \begin{pmatrix} 5 & 11 \\ 11 & 25 \end{pmatrix}.$$
5.1 Empirical Analysis of σ̂ij^(1)(n)

For the empirical analysis of σ̂ij^(1)(n), we generate samples as described in the previous paragraph and use the algorithm described in Sect. 4.4. The generated samples follow an α-stable sub-Gaussian distribution, i.e., Xi ∼ E2(0, Σ, ψsub(·, α)), i = 1, . . . , n, where A is defined above. Hence, the value of the off-diagonal entry of the dispersion matrix σ12 is 11.
In Figs. 4 through 7, we illustrate the behavior of the estimator σ̂ij^(1)(n) for
several sample sizes and various values for the tail index, i.e., α = 1.5, 1.6, . . . , 1.9.
We demonstrate the behavior of the estimator using boxplots based on 1,000
sample runs for each setting of sample length and parameter value.
In general, one can see that for all values of α the estimators are median-
unbiased. By analyzing the figures, we can additionally conclude that all es-
timators are slightly skewed to the right. Turning our attention to the rate
of convergence of the estimates towards the median value of 11, we examine
the boxplots. Figure 4 reveals that for a sample size of n = 100 the interquar-
tile range is roughly equal to four for all values of α. The range diminishes
gradually for increasing sample sizes, as can be seen in Figs. 4–7.

Fig. 4. Sample size 100

Fig. 5. Sample size 300


Fig. 6. Sample size 500

Fig. 7. Sample size 1000

Finally, in Fig. 7, the interquartile range is equal to about 1.45 for all values of α. The rate of decay is roughly n^(−1/2). For small sample sizes, extreme outliers larger than twice the median can be observed, regardless of the value of α. For n = 1,000, the maximal error is around 1.5 times the median.
Due to right-skewness, extreme values are observed mostly to the right of the
median.

5.2 Empirical Analysis of σ̂ij^(2)(n, p)

We examine the consistency behavior of the second estimator as defined in (19), again using boxplots. In Figs. 8 through 15 we depict the statistical behavior of the estimator. For generating independent samples of various lengths for α = 1.5, 1.6, 1.7, 1.8, and 1.9, and two different values of p, we use the algorithm described in Sect. 4.4.5 For the values of p, we select 1.00001 and 1.3, respectively. A value for p closer to one leads to improved properties of the estimator, as will be seen.
5 In most of these plots, extreme estimates had to be removed to provide for a clear display of the boxplots.
Fig. 8. Sample size 100, p = 1.00001 (980 of 1,000 estimates shown)

Fig. 9. Sample size 100, p = 1.3 (980 of 1,000 estimates shown)

In general, we can observe that the estimates are strongly skewed. This
is more pronounced for lower values of α while skewness vanishes slightly for
increasing α. All figures display a noticeable bias in the median towards low
values. Finally, as will be seen, σ̂ij^(1)(n) seems more appealing than σ̂ij^(2)(n, p).
For a sample length of n = 100, Figs. 8 and 9 show that the bodies of the boxplots, which are represented by the interquartile ranges, are as high as 4.5 for a lower value of p and α. As α increases, this effect vanishes slightly.
However, results are worse for p = 1.3 as already indicated. For sample lengths
of n = 300, Figs. 10 and 11 show interquartile ranges between 1.9 and 2.4 for
lower values of p. Again, results are worse for p = 1.3. For n = 500, Figs. 12
and 13 reveal ranges between 1.3 and 2.3 as α increases. Again, this worsens
when p increases. And finally for samples of length n = 1, 000, Figs. 14 and
15 indicate that for p = 1.00001 the interquartile ranges extend between 1
for α = 1.9 and 1.5 for α = 1.5. Depending on α, the same pattern but on a
worse level is displayed for p = 1.3.
Fig. 10. Sample size 300, p = 1.00001 (980 of 1,000 estimates shown)

Fig. 11. Sample size 300, p = 1.3 (980 of 1,000 estimates shown)

Fig. 12. Sample size 500, p = 1.00001 (980 of 1,000 estimates shown)


It is clear from the statistical analysis that, concerning skewness and median bias, the estimator σ̂ij^(1)(n) has properties superior to estimator σ̂ij^(2)(n, p) for both values of p. Hence, we use estimator σ̂ij^(1)(n).
Fig. 13. Sample size 500, p = 1.3 (980 of 1,000 estimates shown)


Fig. 14. Sample size 1000, p = 1.00001 (980 of 1,000 estimates shown)


Fig. 15. Sample size 1000, p = 1.3 (980 of 1,000 estimates shown)



6 Application to the DAX 30


For the empirical analysis of the DAX30 index, we use the data from the
Karlsruher Kapitaldatenbank. We analyze data from May 6, 2002 to March
31, 2006. For each company listed in the DAX30, we consider 1, 000 daily
log-returns in the study period.6

6.1 Model Check and Estimation of the Parameter α

Before fitting an α-stable sub-Gaussian distribution, we assessed if the data


are appropriate for a sub-Gaussian model. This can be done with at least two
different methods. In the first method, we analyze the data by pursuing the
following steps (also [16]):
(i) For every stock Xi , we estimate θ̂ = (α̂i , β̂i , σ̂i , µ̂i ), i = 1, . . . , d.
(ii) The estimated α̂i ’s should not differ much from each other.
(iii) The estimated β̂i ’s should be close to zero.
(iv) Bivariate scatterplots of the components should be elliptically contoured.
(v) If the data fulfill criteria (ii)-(iv), a sub-Gaussian model can be justified.
If there is a strong discrepancy to one of these criteria we have to reject
a sub-Gaussian model.
In Table 1, we depict the maximum likelihood estimates for the DAX30
components. The estimated α̂i , i = 1, . . . , 29, are significantly below 2, indi-
cating leptokurtosis. We calculate the average to be ᾱ = 1.6. These estimates
agree with earlier results from [8]. In that work, stocks of the DAX30 are
analyzed during the period 1988 through 2002. Although using different es-
timation procedures, the results coincide in most cases. The estimated β̂i ,
i = 1, . . . , 29, are between −0.1756 and 0.1963 and the average, β̄, equals
−0.0129. Observe the substantial variability in the α’s and that not all β’s
are close to zero. These results agree with [16] who analyzed the Dow Jones
Industrial Average. Concerning item (iv), it is certainly not feasible to look
at each bivariate scatterplot of the data. Figure 16 depicts randomly chosen
bivariate plots. Both scatterplots are roughly elliptical contoured.
The second method to analyze if a dataset allows for a sub-Gaussian model
is quite similar to the first one. Instead of considering the components of the
DAX30 directly, we examine randomly chosen linear combinations of the com-
ponents. We only demand that the Euclidean norm of the weights of the linear
combination is 1. Due to the theory of α-stable sub-Gaussian distributions,
the index of stability is invariant under linear combinations. Furthermore,
the estimated β̂ of linear combination should be close to zero under the sub-
Gaussian assumption. These considerations lead us to the following model
check procedure:
6 During our period of analysis Hypo Real Estate Holding AG was in the DAX for
only 630 days. Therefore we exclude this company from further treatment leaving
us with 29 stocks.
Table 1. Stable parameter estimates using the maximum likelihood estimator

Name              Ticker symbol   α̂       β̂        σ̂      µ̂
Adidas            ADS             1.716    0.196    0.009   0.001
Allianz           ALV             1.515   −0.176    0.013  −0.001
Altana            ALT             1.419    0.012    0.009   0.000
BASF              BAS             1.674   −0.070    0.009   0.000
BMW               BMW             1.595   −0.108    0.010   0.000
Bayer             BAY             1.576   −0.077    0.011   0.000
Commerzbank       CBK             1.534    0.054    0.012   0.001
Continental       CON             1.766    0.012    0.011   0.002
Daimler-Chrysler  DCX             1.675   −0.013    0.011   0.000
Deutsche Bank     DBK             1.634   −0.084    0.011   0.000
Deutsche Börse    DB1             1.741    0.049    0.010   0.001
Deutsche Post     DPW             1.778   −0.071    0.011   0.000
Telekom           DTE             1.350    0.030    0.009   0.000
Eon               EOA             1.594   −0.069    0.009   0.000
FresenMed         FME             1.487    0.029    0.010   0.001
Henkel            HEN3            1.634    0.103    0.008   0.000
Infineon          IFX             1.618    0.019    0.017  −0.001
Linde             LIN             1.534    0.063    0.009   0.000
Lufthansa         LHA             1.670    0.030    0.012  −0.001
Man               MAN             1.684   −0.074    0.013   0.001
Metro             MEO             1.526    0.125    0.011   0.001
Münchner Rück     MUV2            1.376   −0.070    0.011  −0.001
RWE               RWE             1.744   −0.004    0.010   0.000
SAP               SAP             1.415   −0.093    0.011  −0.001
Schering          SCH             1.494   −0.045    0.009   0.000
Siemens           SIE             1.574   −0.125    0.011   0.000
Thyssen           TKA             1.650   −0.027    0.011   0.000
Tui               TUI             1.538    0.035    0.012  −0.001
Volkswagen        VOW             1.690   −0.024    0.012   0.000
Average values                    ᾱ = 1.6  β̄ = −0.0129

(i) Generate i.i.d. samples u1 , . . . , un ∈ Rd according to the uniform distri-


bution on the hypersphere Sd−1 .
(ii) For each linear combination ui X, i = 1, . . . , n, estimate θi = (α̂i , β̂i ,
σ̂i , µ̂i ).
(iii) The estimated α̂i ’s should not differ much from each other.
(iv) The estimated β̂i ’s should be close to zero.
(v) Bivariate scatterplots of the components should be elliptically contoured.
(vi) If the data fulfill criteria (ii)–(v) a sub-Gaussian model can be justified.
If we conclude after the model check that our data are sub-Gaussian distributed, we estimate the α of the distribution by taking the mean ᾱ = (1/n) Σᵢ₌₁ⁿ α̂ᵢ. This approach has the advantage compared to the former one
Fig. 16. Bivariate scatterplots of BASF and Lufthansa in (a); and of Continental and MAN in (b)

2 1
1.9 0.8
1.8 0.6
1.7 0.4
1.6 0.2
1.5 0
1.4 −0.2
1.3 −0.4
1.2 −0.6
1.1 −0.8
1 −1
0 20 40 60 80 100 0 20 40 60 80 100
Linear combinations
α β

Fig. 17. Scatterplot of the estimated α’s and β’s for 100 linear combinations

that we incorporate more information from the dataset and we can generate
more sample estimates α̂i and β̂i . In the former approach, we analyze only
the marginal distributions.
Figure 17 depicts the maximum likelihood estimates for 100 linear combinations according to step (ii). We observe that the estimated α̂i, i = 1, . . . , n, range
from 1.5 to 1.84. The average, ᾱ, equals 1.69. Compared to the first approach,
the tail indices increase, meaning less leptokurtosis, but the range of the es-
timates decreases. The estimated β̂i ’s, i = 1, . . . , n, lie in a range of −0.4 and
0.4 and the average, β̄, is −0.0129. In contrast to the first approach, the vari-
ability in the β’s increases. It is certainly not to be expected that the DAX30
log-returns follow a pure i.i.d. α stable sub-Gaussian model, since we do not
account for time dependencies of the returns. The variability of the estimated
α̂’s might be explained with GARCH-effects such as clustering of volatility.

The observed skewness in the data7 cannot be captured by a sub-Gaussian


or any elliptical model. Nevertheless, we observe that the mean of the β’s is
close to zero.

6.2 Estimation of the Stable DAX30 Dispersion Matrix

In this section, we focus on estimating the sample dispersion matrix of an α-


stable sub-Gaussian distribution based on the DAX30 data. For the estimation procedure, we use the estimator σ̂ij^(1)(n), i ≠ j, presented in Sect. 5. Before applying this estimator, we center each time series by subtracting its sample mean. Estimator σ̂ij^(1)(n) has the disadvantage that it cannot handle zeros.
But after centering the data, there are no zero log-returns in the time series.
In general, this is a point which has to be considered carefully.
For the sake of clarity, we display the sample dispersion matrix and covari-
ance matrix as heat maps, respectively. Figure 18 is a heat map of the sample
dispersion matrix of the α-stable sub-Gaussian distribution. The sample dis-
persion matrix is positive definite and has a very similar shape and structure
as the sample covariance matrix which is depicted in Fig. 19. Dark blue colors
correspond to low values, whereas dark red colors depict high values.

Fig. 18. Heat map of the sample dispersion matrix. Dark blue colors correspond to low values (min = 0.0000278), shading through blue, green, and yellow to red for high values (max = 0.00051)8

7 The estimated β̂'s differ sometimes significantly from zero.
8 To obtain the heat map in color, please contact the authors.
Fig. 19. Heat map of the sample covariance matrix. Dark blue colors correspond to low values (min = 0.000053), shading through blue, green, and yellow to red for high values (max = 0.00097)9

Fig. 20. Barplots (a) and (b) depict the eigenvalues of the sample dispersion matrix and the sample covariance matrix

Figure 20a,b illustrate the eigenvalues λi , i = 1, . . . , 29, of the sample


dispersion matrix and covariance matrix, respectively. In both figures, the first eigenvalue is significantly larger than the others. The remaining eigenvalues decline in a similar fashion.

9 To obtain the heat map in color, please contact the authors.
Fig. 21. Barplots (a) and (b) show the cumulative proportion of the total dispersion and variance explained by the components, i.e., Σᵢ₌₁ᵏ λᵢ / Σᵢ₌₁²⁹ λᵢ

Figure 21 (a) and (b) depict the cumulative proportion of the total vari-
ability explained by the first k principal components corresponding to the k
largest eigenvalues. In both figures, more than 50% is explained by the first
principal component. We observe that the first principal component in the sta-
ble case explains slightly more variability than in the ordinary case, e.g. 70%
of the total amount of dispersion is captured by the first six stable components
whereas in the normal case, only 65% is explained. In contrast to the normal
PCA the stable components are not independent but quasi-uncorrelated. Fur-
thermore, in the case of α = 1.69, the coefficient of tail dependence for two
principal components, say Yi and Yj , is
$$
\lambda(Y_i, Y_j, 0, 1.69) \;=\; \frac{\displaystyle\int_0^{\sqrt{1/2}} \frac{s^{1.69}}{\sqrt{1-s^2}}\,ds}{\displaystyle\int_0^{1} \frac{s^{1.69}}{\sqrt{1-s^2}}\,ds} \;\approx\; 0.21
$$

due to Theorem 6 for all i ≠ j, i, j = 1, . . . , 29.
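The value 0.21 can be checked numerically. The small sketch below evaluates the two integrals after the substitution s = sin(t), which removes the integrable singularity at s = 1; the parameter rho only indicates how a nonzero correlation would enter the standard elliptical tail-dependence formula and is an assumption beyond the quoted special case.

```python
import numpy as np
from scipy import integrate

def tail_dependence_elliptical(alpha, rho=0.0):
    """Tail-dependence coefficient lambda(Y_i, Y_j; rho, alpha) of an elliptical
    (here: sub-Gaussian stable) pair via the integral formula above."""
    h = np.arcsin(np.sqrt((1.0 + rho) / 2.0))   # rho = 0 gives upper limit sqrt(1/2)
    f = lambda t: np.sin(t) ** alpha
    num, _ = integrate.quad(f, 0.0, h)
    den, _ = integrate.quad(f, 0.0, np.pi / 2.0)
    return num / den

print(round(tail_dependence_elliptical(1.69), 2))   # 0.21
```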


In Fig. 22a–d we show the first four eigenvectors of the sample dispersion
matrix, the so-called vectors of loadings. The first vector is positively weighted
for all stocks and can be thought of as describing a kind of index portfolio.
The weights of this vector do not sum to one but they can be scaled to
be so. The second vector has positive weights for technology titles such as
Deutsche Telekom, Infineon, SAP, Siemens and also to the non-technology
companies Allianz, Commerzbank, and Tui. The second principal component
can be regarded as a trading strategy of buying technology titles and selling
the other DAX30 stocks except for Allianz, Commerzbank, and Tui. The first
two principal components explain around 56% of the total variability. The
vectors of loadings in (c) and (d) correspond to the third and fourth principal
component, respectively. It is slightly difficult to interpret this with respect to
any economic meaning, hence, we consider them as pure statistical quantities.
In conclusion, the estimator σ̂ij^(1)(n), i ≠ j, offers a simple way to estimate
Fig. 22. Barplots summarizing the loadings vectors g1, g2, g3 and g4 defining the first four principal components: (a) factor 1 loadings; (b) factor 2 loadings; (c) factor 3 loadings; and (d) factor 4 loadings

the dispersion matrix in an i.i.d. α-stable sub-Gaussian model. The results


delivered by the estimator are reasonable and consistent with economic theory.
Finally, we stress that a stable PCA is feasible.

7 Conclusion
In this paper we present different estimators which allow one to estimate the
dispersion matrix of any normal variance mixture distribution. We analyze
the estimators theoretically and show their consistency. We find empirically that the estimator σ̂ij^(1)(n) has better statistical properties than the estimator σ̂ij^(2)(n, p) for i ≠ j. We fit an α-stable sub-Gaussian distribution to the
DAX30 components for the first time. The sub-Gaussian model is certainly
more realistic than a normal model, since it captures tail dependencies. But
it has still the drawback that it cannot incorporate time dependencies.

Acknowledgement

The authors would like to thank Stoyan Stoyanov and Borjana Racheva-Iotova
from FinAnalytica Inc for providing ML-estimators encoded in MATLAB. For
further information, see [22].

References

[1] Artzner, P., F. Delbaen, J.M. Eber and D. Heath. 1999. Coherent Measure
of Risk. Mathematical Finance 9, 203–228.
[2] Bingham, N.H., R. Kiesel and R. Schmidt. 2003. A Semi-parametric Ap-
proach to Risk Management. Quantitative Finance 3, 241–250.
[3] Cheng, B.N., S.T. Rachev. 1995. Multivariate Stable Securities in Finan-
cial Markets. Mathematical Finance 54, 133–153.
[4] Embrechts, P., A. McNeil and D. Straumann. 1999. Correlation: Pitfalls
and Alternatives. Risk 5, 69–71.
[5] Fama, E. 1965. The Behavior of Stock Market Prices. Journal of Business,
38, 34–105.
[6] Fama, E. 1965. Portfolio Analysis in a Stable Paretian market. Manage-
ment Science, 11, 404–419.
[7] Hardin Jr., C.D. 1984. Skewed Stable Variables and Processes. Technical
report 79, Center for Stochastics Processes at the University of North
Carolina, Chapel Hill.
[8] Höchstötter, M., F.J. Fabozzi and S.T. Rachev. 2005. Distributional Anal-
ysis of the Stocks Comprising the DAX 30. Probability and Mathematical
Statistics, 25, 363–383.
[9] Mandelbrot, B.B. 1963. New Methods in Statistical Economics. Journal
of Political Economy, 71, 421–440.
[10] Mandelbrot, B.B. 1963. The Variation of Certain Speculative Prices.
Journal of Business, 36, 394–419.
[11] Mandelbrot, B.B. 1963. New Methods in Statistical Economics. Journal
of Political Economy, 71, 421–440.
[12] Markowitz, H.M. 1952. Portfolio Selection. Journal of Finance 7, (1),
77–91.
[13] McCulloch, J.H. 1996. Financial Applications of Stable Distributions.
Handbook of Statistics – Statistical Methods in Finance, 14, 393–425.
Elsevier Science B.V, Amsterdam.
[14] McNeil, A.J., R. Frey and P. Embrechts. 2005. Quantitative Risk Man-
agement, Princeton University Press, Princeton.
[15] Nolan, J.P., A.K. Panorska, J.H. McCulloch. 2000. Estimation of Stable
Spectral Measure. Mathematical and Computer Modeling 34, 1113–1122.
[16] Nolan, J.P. 2005. Multivariate Stable Densities and Distribution Func-
tions: General and Elliptical Case. Deutsche Bundesbank’s 2005 Annual
Fall Conference.
152 S. Kring et al.

[17] Rachev, S.T., S. Mittnik. 2000. Stable Paretian Models in Finance. Wiley,
New York.
[18] Resnick, S.I. 1987. Extreme Values, Regular Variation, and Point Pro-
cesses. Springer, Berlin.
[19] Samorodnitsky, G., M. Taqqu. 1994. Stable Non-Gaussian Random Processes, Chapman & Hall, New York.
[20] Schmidt, R. 2002. Tail Dependence for Elliptically Contoured Distribu-
tions. Mathematical Methods of Operations Research 55, 301–327.
[21] Stoyanov, S.V. 2005. Optimal Portfolio Management in Highly Volatile
Markets. Ph.D. thesis, University of Karlsruhe, Germany.
[22] Stoyanov, S.V., B. Racheva-Iotova 2004. Univariate Stable Laws in the
Fields of Finance-Approximations of Density and Distribution Functions.
Journal of Concrete and Applicable Mathematics, 2/1, 38–57.
Risk Measures for Portfolio Vectors
and Allocation of Risks

Ludger Rüschendorf

Department of Mathematical Statistics, University of Freiburg, Germany,


ruschen@stochastik.uni-freiburg.de

1 Introduction
In this paper we survey some recent developments on risk measures for port-
folio vectors and on the allocation of risk problem. The main purpose to study
risk measures for portfolio vectors X = (X1 , . . . , Xd ) is to measure not only
the risk of the marginals separately but to measure the joint risk of X caused
by the variation of the components and their possible dependence.
Thus an important property of risk measures for portfolio vectors is con-
sistency with respect to various classes of convex and dependence orderings.
It turns out that axiomatically defined convex risk measures are consistent
w.r.t. multivariate convex ordering. Two types of examples of risk measures for portfolio vectors are introduced and their consistency properties are in-
vestigated w.r.t. various types of convex resp. dependence orderings.
We introduce the general class of convex risk measures for portfolio vectors.
These have a representation result based on penalized scenario measures. It
turns out that maximal correlation risk measures play in the portfolio case the
same role that average value at risk measures have in one dimensional case.
The second part is concerned with applications of risk measures to the op-
timal risk allocation problem. The optimal risk allocation problem or, equiv-
alently, the problem of risk sharing is the problem to allocate a risk in an
optimal way to n traders endowed with risk measures ρ1, . . . , ρn. This prob-
lem has a long history in mathematical economics and insurance. We show
that the optimal risk allocation problem is well defined only under an equi-
librium condition. This condition can be characterized by the existence of a
common scenario measure. A meaningful modification of the optimal risk allo-
cation problem can be given also for markets without assuming the equilibrium
condition. Optimal solutions are characterized by a suitable dual formulation.
The basic idea of this extension is to restrict the class of admissible allocations
in a proper way. We also discuss briefly some variants of the risk allocation
problem as the capital allocation problem.
2 Representation of Convex Risk Measures for Portfolio Vectors

Convex risk measures for real risk variables have been axiomatically intro-
duced and studied in the mathematical finance literature by Artzner et al.
(1998), Delbaen (2002), Föllmer and Schied (2004) and many others while
there are independent and earlier studies of various aspects of risk measures
and related premium principles in the economics and insurance literature.
Various important subclasses of risk measures have been characterized. Law
invariant, convex risk measures on L∞ (P ) (resp. Lr (P ), r ≥ 1) have been
characterized by a Kusuoka type representation of the form
+ ,
(X) = sup AV @Rλ (X)µ(dλ) − β(µ) (1)
µ∈M1 ([0,1]) (0,1)

where λ (X) = AV @Rλ (X) is the average value at risk 0 (also called expected
shortfall or conditional value at risk), β(µ) = supX∈A (0,1] AV @Rλ (X)µ(dλ)
is the penalty function, and A = {X ∈ L∞ (P ); (X) ≤ 0} is the acceptance
set of  (see Kusuoka (2001) and Föllmer and Schied (2004)). Thus in di-
mension d = 1 the average value at risk measures λ are the basic building
blocks of the class of law invariant convex risk measures. For some recent
developments in the area of risk measures see [21].
For portfolio vectors X = (X1 , . . . , Xd ) ∈ L∞ d (P ) on (Ω, A, P ) a risk
measure  : L∞ d (P ) → R 1
is called convex risk measure if
M1) X ≥ Y ⇒ (X) ≤ (Y )
M2) (X + mei ) = −m + (X), m ∈ R1
M3) (αX + (1 − α)Y ) ≤ α(X) + (1 − α)(Y ) for all α ∈ (0, 1);
thus  is a monotone translation invariant, convex risk functional (see [8, 17]).
Like in d = 1 (X) denotes the smallest amount m to be added to the portfolio
vector X such that X +me1 is acceptable. ei denotes here the i-th unit vector.
A subset A ⊂ L∞ d (P ) with R not contained in A is called (convex)
d

acceptance set, if
(A1) A is closed (and convex)
(A2) Y ∈ A and Y ≤ X implies X ∈ A
(A3) X + mei ∈ A ⇔ X + mej ∈ A.
With
A (X) := inf{m ∈ R; X + me1 ∈ A}
risk measures are identified with their acceptance sets:
(a) If A is a convex acceptance set, then A is a convex risk measure
(b) If  is a convex risk measure, then A = {X ∈ L∞d ; (X) ≤ 0} is a convex
acceptance set.
Let ba_d(P) denote the set of finitely additive, normed, positive measures on L∞_d(P). Convex risk measures on portfolio vectors allow a representation similar to d = 1.

Theorem 1 (see [8]). ρ : L∞_d(P) → R¹ is a convex risk measure if and only if there exists some function α : ba_d(P) → (−∞, ∞] such that
$$\rho(X) = \sup_{Q\in ba_d(P)}\big(E_Q(-X) - \alpha(Q)\big). \qquad (2)$$
α can be chosen as the Legendre–Fenchel inverse
$$\alpha(Q) = \sup_{X\in L^\infty_d(P)}\big(E_Q(-X) - \rho(X)\big) = \sup_{X\in\mathcal{A}_\rho} E_Q(-X).$$

For risk measures ρ which are Fatou-continuous, i.e. Xn →_P X, (Xn) uniformly bounded implies ρ(X) ≤ lim inf ρ(Xn), ba_d(P) can be replaced by the class M^d_1(P) of P-continuous, σ-additive normed measures, which can be identified with the class of P-densities D = {(Y1, . . . , Yd); Yi ≥ 0, E_P Yi = 1, 1 ≤ i ≤ d}.
For coherent risk measures, i.e. homogeneous, subadditive, monotone, translation invariant risk measures, the representation in (2) simplifies to
$$\rho(X) = \sup_{Q\in\mathcal{P}} E_Q(-X), \qquad (3)$$
where P ⊂ ba_d(P), resp. P ⊂ M^d_1(P) if the Fatou property holds, can be interpreted as a class of scenario measures.
For law invariant convex risk measures, i.e. X =_d X̃ implies ρ(X) = ρ(X̃), it has been found recently that maximal correlation risk measures play the role of basic building blocks as the average value at risk measures do in the Kusuoka representation result. Let for some density vector Y ∈ D, ψ_Y(X) = E X · Y denote the correlation of X and Y (up to normalization) and define
$$\Psi_Y(X) = \sup\{\psi_Y(\tilde X);\ \tilde X =_d X\} \qquad (4)$$
the maximal correlation risk measure (in direction Y).

Theorem 2 (see [26]). Let Ψ be a Fatou continuous convex risk measure on L∞_d(P) with penalty function α. Then it holds:
Ψ is law invariant ⇔ Ψ has a representation of the form
$$\Psi(X) = \sup_{Y\in D_0}\big(\Psi_Y(X) - \alpha(Y)\big) \qquad (5)$$
with law invariant penalty function α and D0 = {Y ∈ D; α(Y) < ∞}.


Remark 3 a) In particular, the law invariant coherent risk measures on L∞_d(P) have a representation of the form
$$\Psi(X) = \sup_{Y\in\mathcal{A}} \Psi_Y(X) \qquad (6)$$
for some subset A ⊂ D. Thus the maximal correlation risk measures Ψ_Y are the basic building blocks of all law invariant convex risk measures on portfolio vectors.
b) For d = 1 the representation in (5) can be shown to be equivalent to the Kusuoka representation result in (1). For d ≥ 1 the optimal couplings which arise in the definition of the maximal correlation risk measure Ψ_Y have been
characterized in Rüschendorf and Rachev (1990). There are some examples
where ΨY can be calculated in explicit form but in general one does not have
explicit formulas. Therefore, it is useful to give more explicit constructions
of risk measures for portfolio vectors which generalize the known classes of
one dimensional risk measures. For some partial extensions of distortion
type risk measures see [8, 26].
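For d = 1 the maximal correlation risk measure can in fact be evaluated empirically by a comonotone rearrangement, which attains the supremum in (4); the sketch below (with an arbitrary illustrative density sample Y) makes this concrete. For d ≥ 1 no such simple recipe is available, as noted above.

```python
import numpy as np

def max_correlation_risk_1d(x_sample, y_density_sample):
    """Empirical maximal correlation risk measure Psi_Y(X) for d = 1.

    The supremum of E[X~ * Y] over rearrangements X~ =_d X is attained by the
    comonotone coupling, so on equally weighted samples it reduces to sorting
    both samples and pairing them.  y_density_sample is a sample of the
    scenario density Y >= 0 with mean 1 (illustrative input)."""
    x = np.sort(np.asarray(x_sample, dtype=float))
    y = np.sort(np.asarray(y_density_sample, dtype=float))
    return float(np.mean(x * y))

# toy usage with simulated data
rng = np.random.default_rng(1)
x = rng.standard_normal(10_000)
y = rng.exponential(1.0, 10_000)
y /= y.mean()                      # crude density sample normalized to mean 1
print(max_correlation_risk_1d(x, y))
```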

3 Consistency w.r.t. Convex Orderings and some Classes of Examples
For some class of functions F ⊂ {f : Rd → R1 } the ordering ≤F is defined
for random vectors X, Y by

X ≤F Y if Ef (X) ≤ Ef (Y ), ∀f ∈ F, (7)

such that the integrals exist.


In particular for the class of nondecreasing functions this leads to the
stochastic ordering ≤st , for the class Fcx of convex functions this leads to the
convex ordering ≤cx . Interesting dependence orderings are by the classes Fdcx
of directionally convex functions, Fsm the class of supermodular functions,
F∆ the class of ∆-monotone functions. The corresponding orderings are de-
noted by ≤dcx , ≤sm , ≤∆ (see Müller and Stoyan (2002) for details on these
orderings).
From Strassen's well-known representation result it follows that any risk measure ρ on L∞_d(P) which satisfies the monotonicity condition M1) is consistent w.r.t. the stochastic ordering ≤st, i.e.
$$X \le_{st} Y \;\Rightarrow\; \rho(Y) \le \rho(X). \qquad (8)$$

It is of particular interest to study consistency of risk measures w.r.t. the


above mentioned various convexity and dependence orderings. Let ≤decx , ≤icx
denote the ordering by decreasing resp. increasing convex functions. Then it
turns out that all law invariant axiomatically defined convex risk measures
are consistent w.r.t. decreasing convex ordering ≤decx (see [8]).
Theorem 4. Let ρ be a law invariant, Fatou continuous convex risk measure on L∞_d(P). Then ρ is consistent w.r.t. ≤decx, i.e.
$$X \le_{decx} Y \;\Rightarrow\; \rho(X) \le \rho(Y). \qquad (9)$$

Since X ≤decx Y is equivalent to Y ≤icv X, with ≤icv the ordering by increasing concave functions, (9) is equivalent for d = 1 to consistency w.r.t. second order stochastic dominance. The proof of Theorem 4 is based essentially on the following important property: For all X, Y ∈ L∞_d(P) it holds that
$$\rho(X) \ge \rho(E(X \mid Y)), \qquad (10)$$

i.e. smoothing by conditional expectation reduces the risk (for d = 1 see Schied
(2004) or Föllmer and Schied (2004)).
In insurance mathematics the monotonicity axiom M1) of a risk measure
has to be changed to monotonicity in the usual componentwise ordering. We
shall use the notation Ψ (X) for risk measures satisfying this kind of mono-
tonicity. The relation Ψ(X) = ρ(−X) gives a one to one relation between risk measures ρ in the financial context and risk measures Ψ in the insurance
context.
A natural idea to construct risk measures for portfolio vectors X is to
measure the risk of some real aggregation of the risk vector like the joint
portfolio or the maximal risk, i.e. to consider
$$\Psi(X) = \Psi_1\Big(\sum_{i=1}^{d} X_i\Big) \qquad\text{or}\qquad \Psi(X) = \Psi_1\big(\max_i X_i\big), \qquad (11)$$

where Ψ1 is a suitable one dimensional risk measure like expected shortfall


or some distortion type risk measure. More generally for some class of real
aggregation functions F0 = {fα ; α ∈ A} the following classes of risk measures
have been introduced in Burgert and Rüschendorf (2006). Define

ΨA (X) = sup Ψ1 (fα (X)), (12)


α∈A

ΨM (X) = sup Ψ1 (fα (X))dµ(α), (13)
µ∈M

where M ⊂ Mσ (A) is a class of weighting measures on A. ΨA (X) is the


maximal risk of some class of aggregation functions, while ΨM (X) considers
the maximum risk over some weighted average. If for example A = ∆ = {α ∈
d
Rd+ ; 0 i=1 αi = 1}, then one gets in this 0way risk measures like supα∈∆ Ψ1 (α ·
X), Ψ1 (α·X)dµ(α), Ψ1 (maxi αi Xi ) or ∆ Ψ1 (max αi Xi )dµ(α) measuring the
risk in all positive directions α.
It is important to assume that Ψ1 is consistent with respect to ≤icx – the
increasing convex ordering. This is e.g. the case for distortion risk measures
Ψ1(X) = ∫₀^∞ g(F̄_X(t)) dt, where g is a concave distortion function and F̄_X(t) =
1 − FX (t) is the survival function. Then the following consistency results hold
true (see [8]):

a) If F0 ⊂ Ficx , then ΨA , ΨM are consistent w.r.t. ≤icx . (14)


b) If F0 ⊂ Fism (Fidcx), then ΨA, ΨM are consistent w.r.t. ≤ism (≤idcx). (15)

As consequence of a), b) one gets that more positive dependent risk vectors
have higher risks. This extends some classical results on comparison of risk
vectors. Let Fi−1 denote the generalized inverse of the distribution function
Fi of Xi , then


$$\sum_{i=1}^{d} X_i \;\le_{icx}\; \sum_{i=1}^{d} F_i^{-1}(U), \qquad (16)$$
where U is uniformly distributed on [0, 1] (see Meilijson and Nadas (1979), Rüschendorf (1983)). Further, with the comonotonic vector X^c := (F_1^{-1}(U), . . . , F_d^{-1}(U)) the following basic comparison result holds, which extends (16):
$$X \le_{sm} X^c \quad\text{and}\quad X \le_{\Delta} X^c \qquad (17)$$

(see Tchen (1980) and Rüschendorf (1980)). Thus as consequence of (15) and
(17) we conclude under the conditions of (14), (15)

Ψ_M(X) ≤ Ψ_M(X^c),  Ψ_A(X) ≤ Ψ_A(X^c); (18)

the comonotonic risk vector leads to the highest possible risk under all risk
measures of type ΨM , ΨA . Extensions of (17) to compare risks also of two risk
vectors X, Y are given in [11, 24]. For a review of this type of comparison
results for risk vectors see the survey paper [25].
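A small simulation sketch of the comparison (18) is given below; the aggregation Ψ1 (a tail mean of the joint portfolio) and all numerical parameters are illustrative choices, not prescribed by the text.

```python
import numpy as np

def comonotonic_version(X):
    """Sample of the comonotonic vector X^c = (F_1^{-1}(U), ..., F_d^{-1}(U))
    with (approximately) the same marginals as the columns of X."""
    n, d = X.shape
    U = np.random.default_rng(2).uniform(size=n)
    return np.column_stack([np.quantile(X[:, j], U) for j in range(d)])

def psi_sum_tail_mean(X, lam=0.05):
    """Psi(X) = Psi_1(sum_i X_i) with Psi_1 the mean of the worst lam-fraction
    of the aggregated risk (one admissible aggregation-type measure as in (11),
    in the insurance convention where large values are bad)."""
    s = np.sort(X.sum(axis=1))
    k = max(1, int(np.ceil(lam * s.size)))
    return float(s[-k:].mean())

# toy check of (18): the comonotonic coupling increases the aggregated risk
rng = np.random.default_rng(3)
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.2], [0.2, 1.0]], size=20_000)
print(psi_sum_tail_mean(X), "<=", psi_sum_tail_mean(comonotonic_version(X)))
```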

4 Risk Allocation and Equilibrium

The classical risk sharing problem is to consider a market, described by some probability space (Ω, A, P), and n traders in the market supplied with risk measures ρ1, . . . , ρn. The problem is to allocate a risk X ∈ L∞(P) in an optimal way to the traders, X = Σ_{i=1}^n Xi, such that the risk vector (ρi(Xi)) is Pareto optimal in the class of all allocations or such that the total risk Σ_{i=1}^n ρi(Xi) is minimal under all allocations.
This problem goes back to early work in the economics and insurance
literature (see the early contributions of Borch (1960a,b, 1962), Bühlmann

and Jewell (1979), Chevallier and Müller (1994), and many others). It was
later on extended to risk allocations in financial context (see e.g. Barrieu and
El Karoui (2005) and references therein).
An interesting point is that for translation invariant risk measures i ,
1 ≤ i ≤ n, the principle of Pareto optimal risk allocations is equivalent to
minimizing the total risk. This follows from the separating hyperplane theo-
rem and some simple arguments involving translation invariance. In particular
solutions are not unique and several additional (game theoretic) postulates
like fairness have been introduced to single out specific solutions of the risk
sharing problem. For example Chevallier and Müller (1994) single out condi-
tions which yield as possible solutions only portfolio insurance, tactical asset
allocation, and collar strategies. Classical results are the derivation of linear
quota sharing rules and of stop loss contracts as optimal sharing rules.
We discuss in the following some developments on the risk allocation
problem in the case where ρi are coherent risk measures with representation ρi(X) = sup_{Q∈Pi} E_Q(−X) and scenario measures Pi. The more general
case of convex risk measures is discussed in [7, 9].
There is a naturally associated equilibrium condition coming from similar
equilibria conditions in game theory saying that in a balance of supply and
demand it is not possible to lower some risks without increasing others. In
formal terms this condition is formulated as:

(E) If Xi ∈ L∞(P) satisfy Σ_{i=1}^n Xi = 0 and ρi(Xi) ≤ 0, ∀i, then ρi(Xi) = 0, ∀i.

To investigate this equilibrium condition we introduce two naturally asso-


ciated risk measures to the risk allocation problem. The first one is

$$\Psi(X) = \inf\{m : X + m \in \mathcal{A}\}, \qquad (19)$$
with A the closed cone generated by the union of the acceptance sets A_i of ρ_i, A = cone(∪_{i=1}^n A_i). W.r.t. Ψ every risk is acceptable which is acceptable
to any one of the traders in the market. Thus Ψ corresponds to some kind of
optimistic view towards risk.
The second related risk measure is the infimal convolution ρ̂ = ρ1 ∧ · · · ∧ ρn,
$$\hat\rho(X) = \inf\Big\{\sum_{i=1}^{n}\rho_i(X_i);\ \sum_{i=1}^{n} X_i = X\Big\}, \qquad (20)$$

which describes the optimal reachable total risk of an allocation. Both risk
measures have been considered in the literature (see [12, 2]).
It turns out (see [7]) that

ρ̂ is a coherent risk measure ⇔ ρ̂(0) = 0 (21)
⇔ the equilibrium condition (E) holds true
⇔ Ψ is a coherent risk measure
and in this case ρ̂ = Ψ and the scenario set P ∼ ρ̂ satisfies
$$\mathcal{P} = \mathcal{P}_{\hat\rho} = \mathcal{P}_{\Psi} = \bigcap_{i=1}^{n} \mathcal{P}_i. \qquad (22)$$

As consequence one obtains an interesting result of Heath and Ku (2004)


(derived there for finite spaces Ω) saying: The equilibrium condition (E) is
equivalent to
$$\bigcap_{i=1}^{n} \mathcal{P}_i \neq \emptyset, \qquad (23)$$

i.e. to the existence of a common scenario measure of all traders. In particular


(21) implies that the optimal risk allocation problem makes sense only under
the equilibrium condition (E). Without (E) it is not possible to determine
Pareto optimal allocation rules or allocation rules which minimize the total
risk and a natural question is what to do in case the equilibrium condition
does not hold true.
To consider a useful version of the optimal risk allocation problem we
define for X ∈ L∞ (P )
$$\mathcal{A}(X) = \Big\{(X_i);\ X = \sum_{i=1}^{n} X_i,\ (X_i)\ \text{admissible}\Big\}, \qquad (24)$$
where (Xi) is called an admissible allocation of X if
$$X(\omega) \ge 0 \;\Rightarrow\; X_i(\omega) \ge 0, \qquad X(\omega) \le 0 \;\Rightarrow\; X_i(\omega) \le 0. \qquad (25)$$

The idea of introducing restrictions as above on the class of decompositions


is similar to portfolio optimization theory, where restrictions on the trading
strategies are introduced in order to prevent doubling strategies and thus to
prevent the possibility of arbitrage. In the risk sharing problem we want to
prevent risk arbitrage by restricting the class of admissible allocations.
We define the admissible infimal convolution ρ* by
$$\rho^*(X) = \inf\Big\{\sum_{i=1}^{n}\rho_i(X_i);\ (X_i) \in \mathcal{A}(X)\Big\}. \qquad (26)$$
Considering the connection with multiple decision problems and using a non-convex version of the minimax theorem we get the following dual representation of ρ*, which essentially simplifies the calculation (see Burgert and Rüschendorf (2005)). Let X−, X+ denote the negative (positive) parts of X and ∨Pj, ∧Pj denote the lattice supremum resp. infimum of Pj. Thus in the case of P-continuous probability measures Pj, ∨Pj and ∧Pj are the probability measures with the max resp. min of the P-densities as their density with respect to P.
Theorem 5. For coherent risk measures ρ_i = ρ_{P_i} it holds that
a) ρ*(X) = sup{ ∫ X− d(∨_j P_j) − ∫ X+ d(∧_j P_j); P_j ∈ P_j, 1 ≤ j ≤ n }
b) A_{ρ*} = { X ∈ L∞(P); ∫ X− d(∨_j P_j) ≤ ∫ X+ d(∧_j P_j), ∀P_j ∈ P_j }.

The choice of restrictions in the definition of admissibility is justified by the following theorem, which is based on Theorem 5.

Theorem 6 (see [7]). Define the coherent admissible infimal convolution
ρ̂*(X) = inf{m ∈ R; X + m ∈ A_{ρ*}} = inf{m ∈ R; ρ*(X + m) ≤ 0}.
a) Under the equilibrium condition (E) it holds that ρ̂* = ρ̂ = Ψ.
b) ρ̂* is the largest coherent risk measure ρ ≤ min_i ρ_i.

Part b) says that our chosen restrictions on decompositions are not too restrictive since as a result of them we get the largest possible coherent risk measure below ρ_i. Several related classes of restrictions can be given which
lead to the same coherent risk measure. In particular we get a new useful
coherent risk measure describing the value of the total risk of the optimal
modified risk allocation problem.
A different new type of restrictions on the allocation problem has been introduced in a recent paper by Filipovic and Kupper (2006) who consider for a given risk allocation X = Σ_{i=1}^n C_i as admissible risk transfers only allocations of the form
$$X = \sum_{i=1}^{n} X_i \quad\text{with}\quad X_i = C_i + x_i \cdot Z, \qquad (27)$$
where Z = (Z1, . . . , Zd) is a finite vector of d fixed random instruments in the market and xi ∈ Rd are admissible allocation vectors such that Σ_{i=1}^n xi · Z ≤ 0. Thus the optimal restricted risk allocation problem
$$\sum_{i=1}^{n}\rho_i(C_i + x_i\cdot Z) = \inf_{x_i\ \text{admissible}} \qquad (28)$$
leads to an optimization problem with vector valued variables x1, . . . , xn ∈ Rd, and methods from game theory can be applied to characterize optimal solutions. Problem (28) can be seen as a variant of the classical portfolio optimization problem, i.e. to minimize the risk ρ(x · Z) over all portfolio vectors x = (x1, . . . , xd), xi ≥ 0, Σ_{i=1}^d xi = 1.
There is an alternative related form of the risk allocation problem which
may be called the capital allocation problem (see [12, Chapter 9]). For a
firm with N trading units there are expected future wealths X1, . . . , XN ∈ L∞(P). If risk is measured by a risk measure ρ, then k = ρ(Σ_{i=1}^N Xi) is the necessary capital the firm needs to cover the total risk. The problem is to find a fair allocation of the risk capital k = k1 + · · · + kN to the N trading units. Alternatively, for subadditive risk measures ρ one can see this as the problem to distribute the gain of diversification Σ_{i=1}^N ρ(Xi) − ρ(Σ_{i=1}^N Xi) ≥ 0 over the
different business units of a financial institution.
An allocation k1 , . . . , kN of the diversification gain is called fair if


$$\sum_{i=1}^{N} k_i = \rho\Big(\sum_{i=1}^{N} X_i\Big) \qquad (29)$$
and for all J ⊂ {1, . . . , N} it holds that
$$\sum_{j\in J} k_j \;\le\; \rho\Big(\sum_{j\in J} X_j\Big). \qquad (30)$$

The existence of fair allocations (Bondareva–Shapley theorem for risk measures) is proved in Delbaen (2000) [12, Theorem 22] for coherent risk measures. Assuming continuity of ρ from below (see [15, p. 167]) we get a simple proof of this existence result and more information on the fair allocation. Let P (see [15, p. 165]) denote the maximal representation set of scenario measures in the representation of ρ.

Theorem 7. Let ρ be a coherent risk measure continuous from below and let X1, . . . , XN be N wealth variables with k = ρ(Σ_{i=1}^N Xi). Then there exists some scenario measure Q* ∈ P such that k1*, . . . , kN* with ki* := E_{Q*}(−Xi) is a fair allocation of the risk capital k.

Proof. By the representation of ρ we have
$$k = \rho\Big(\sum_{i=1}^{N} X_i\Big) = \sup_{Q\in\mathcal{P}} E_Q\Big(-\sum_{i=1}^{N} X_i\Big). \qquad (31)$$
Using that ρ is continuous from below, Corollary 4.35 of Föllmer and Schied (2004) implies the existence of some Q* ∈ P such that the supremum in (31) is attained in Q*, and with ki* = E_{Q*}(−Xi) it holds that
$$k = E_{Q^*}\Big(-\sum_{i=1}^{N} X_i\Big) = \sum_{i=1}^{N} k_i^*. \qquad (32)$$
Further, for any J ⊂ {1, . . . , N},
$$\rho\Big(\sum_{j\in J} X_j\Big) \;\ge\; E_{Q^*}\Big(-\sum_{j\in J} X_j\Big) = \sum_{j\in J} k_j^*.$$
Thus k1*, . . . , kN* is a fair allocation of the risk capital.
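On a finite sample space with a finite generating scenario set, the construction in the proof can be carried out directly. The following sketch (with invented toy numbers) is only an illustration of Theorem 7 under these simplifying assumptions, not a general implementation.

```python
import numpy as np

def fair_allocation(X, scenarios):
    """Fair capital allocation on a finite sample space (sketch of Theorem 7).

    X         : (N, m) array, row i = future wealth of unit i in the m states.
    scenarios : (S, m) array; each row is a probability vector Q from a finite
                generating set P of the coherent risk measure rho(Y) = max_Q E_Q[-Y].
    Returns (k, k_i) with k_i = E_{Q*}[-X_i] for a maximizing scenario Q*.
    """
    X = np.asarray(X, dtype=float)
    Q = np.asarray(scenarios, dtype=float)
    total = X.sum(axis=0)                 # aggregate wealth in every state
    values = Q @ (-total)                 # E_Q[-sum_i X_i] for each scenario
    q_star = Q[np.argmax(values)]         # scenario attaining the supremum (31)
    k_i = -(X @ q_star)                   # k_i* = E_{Q*}[-X_i]
    return float(values.max()), k_i

# toy usage: two units, three states, three scenario measures (invented numbers)
X = np.array([[1.0, -2.0, 0.5],
              [0.5,  1.0, -1.5]])
P = np.array([[1/3, 1/3, 1/3],
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])
k, ki = fair_allocation(X, P)
print(k, ki, ki.sum())                    # ki sums to k, reproducing (32)
```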

References

[1] P. Artzner, F. Delbaen, J.-M. Eber, and D. Heath. Coherent measures of


risk. Finance and Stochastics, 9:203–228, 1998.
[2] P. Barrieu and N. El Karoui. Inf-convolution of risk measures and optimal
risk transfer. Finance and Stochastics, 9:269–298, 2005.
[3] K. Borch. Reciprocal reinsurance treaties. ASTIN Bulletin, 1:170–191,
1960a.
[4] K. Borch. The safety loading of reinsurance premiums. Skand. Aktuari-
etidskr., 1:163–184, 1960b.
[5] K. Borch. Equilibrium in a reinsurance market. Econometrica, 30:
424–444, 1962.
[6] H. Bühlmann and W. S. Jewell. Optimal risk exchanges. ASTIN Bulletin,
10:243–263, 1979.
[7] C. Burgert and L. Rüschendorf. Allocations of risks and equilibrium in
markets with finitely many traders. Preprint, University Freiburg, 2005.
[8] C. Burgert and L. Rüschendorf. Consistent risk measures for portfolio
vectors. Insurance: Mathematics and Economics, 38:289–297, 2006.
[9] C. Burgert and L. Rüschendorf. On the optimal risk allocation problem.
Statistics & Decisions, 24(1), 2006, 153–172.
[10] E. Chevallier and H. H. Müller. Risk allocation in capital markets: Port-
folio insurance tactical asset allocation and collar strategies. ASTIN
Bulletin, 24:5–18, 1994.
[11] C. Christofides and E. Vaggelatou. A connection between supermod-
ular ordering and positive, negative association. Journal Multivariate
Analysis, 88:138–151, 2004.
[12] F. Delbaen. Coherent risk measures. Cattedra Galileiana. Scuola Normale
Superiore, Classe di Scienze, Pisa, 2000.
[13] F. Delbaen. Coherent risk measures on general probability spaces. In
Klaus Sandmann et al., editors, Advances in Finance and Stochastics.
Essays in Honour of Dieter Sondermann, pages 1–37. Springer, 2002.
[14] D. Filipovic and M. Kupper. Optimal capital and risk transfers for group
diversification. Preprint, 2006.
[15] H. Föllmer and A. Schied. Stochastic Finance. de Gruyter, 2nd edition,
2004.
[16] D. Heath and H. Ku. Pareto equilibria with coherent measures of risk.
Mathematical Finance, 14:163–172, 2004.
[17] E. Jouini, M. Meddeb, and N. Touzi. Vector-valued coherent risk
measures. Finance and Stochastics, 4:531–552, 2004.
[18] S. Kusuoka. On law-invariant coherent risk measures. Advances in Math-
ematical Economics, 3:83–95, 2001.
[19] I. Meilijson and A. Nadas. Convex majorization with an application to
the length of critical paths. Journal of Applied Probability, 16:671–677,
1979.
164 L. Rüschendorf

[20] A. Müller and D. Stoyan. Comparison Methods for Stochastic Models and
Risks. Wiley, 2002.
[21] Risk Measures and Their Applications. Special volume, L. Rüschendorf
(ed.). Statistics & Decisions, vol. 24(1), 2006.
[22] L. Rüschendorf. Inequalities for the expectation of ∆-monotone func-
tions. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete,
54:341–349, 1980.
[23] L. Rüschendorf. Solution of statistical optimization problem by rear-
rangement methods. Metrika, 30:55–61, 1983.
[24] L. Rüschendorf. Comparison of multivariate risks and positive depen-
dence. J. Appl. Probab., 41:391–406, 2004.
[25] L. Rüschendorf. Stochastic ordering of risks, influence of dependence and
a.s. constructions. In N. Balakrishnan, I. G. Bairamov, and O. L. Gebi-
zlioglu, editors, Advances in Models, Characterizations and Applications,
volume 180 of Statistics: Textbooks and Monographs, pages 19–56. CRC
Press, 2005.
[26] L. Rüschendorf. Law invariant risk measures for portfolio vectors.
Statistics & Decisions, 24(1), 2006, 97–108.
[27] L. Rüschendorf and S. T. Rachev. A characterization of random variables
with minimum L2 -distance. Journal of Multivariate Analysis, 1:48–54,
1990.
[28] A. Schied. On the Neyman–Pearson problem for law invariant risk mea-
sures and robust utility functionals. Ann. Appl. Prob., 3:1398–1423, 2004.
[29] A. H. Tchen. Inequalities for distributions with given marginals. Ann.
Prob., 8:814–827, 1980.
The Road to Hedge Fund Replication:
The Very First Steps

Lars Jaeger

Partners Group, Baar/Zug, Switzerland, lars.jaeger@partnersgroup.net

1 Introduction
The debate on sources of hedge fund returns is one of the subjects creating the
most heated discussion within the hedge fund industry. The industry thereby
appears to be split in two camps: Following results of substantial research, the
proponents on the one side claim that the essential part of hedge fund returns
come from the funds’ exposure to systematic risks, i.e. comes from their betas.
Conversely, the “alpha protagonists” argue that hedge fund returns depend
mostly on the specific skill of the hedge fund managers, a claim that they
express in characterising the hedge fund industry as an “absolute return” or
“alpha generation” industry. As usual, the truth is likely to fall within the
two extremes. Based on an increasing amount of empirical evidence, we can
identify hedge fund returns as a (time-varying) mixture of both, systematic
risk exposures (beta) and skill based absolute returns (alpha). However, the
fundamental question is: How much is beta, and how much is alpha?
There is no consensus definition of ‘alpha’, and correspondingly there is
no consensus model in the hedge fund industry for directly describing the
alpha part of hedge fund returns. We define alpha as the part of the return
that cannot be explained by the exposure to systematic risk factors in the
global capital markets and is thus the return part that stems from the unique
ability and skill set of the hedge fund manager. There is more agreement in
modeling the beta returns, i.e. the systematic risk exposures of hedge funds,
which will give us a starting point for decomposition of hedge fund returns
into ‘alpha’ and ‘beta’ components. We begin with stating the obvious: It is
generally not easy to isolate the alpha from the beta in any active investment
strategy. But for hedge funds it is not just difficult to separate the two, it is
already quite troublesome to distinguish them. We are simply not in a position
to give the precise breakdown yet. In other words, the current excitement
about hedge funds has not yet been subject to the necessary amount and
depth of academic scrutiny. However, we argue that the better part of the
confusion around hedge fund returns arises from the inability of conventional

risk measures and theories to properly measure the diverse risk factors of hedge
funds. This is why only recently progress in academic research has started to
provide us with a better idea about the different systematic risk exposures of
hedge funds and thus give us more precise insights into their return sources.1
Academic research and investors alike begin to realize that the “search of
alpha” must begin with the “understanding of beta,” the latter constituting
an important – if not the most important - source of hedge fund returns.2
However, at the same time we are starting to realize that hedge fund beta
is different from traditional beta. While both are the result of exposures to
systematic risks in the global capital markets hedge fund beta is more com-
plex than traditional beta. Some investors can live with a rather simple but
illustrative scheme suggested by C. Asness3 : If the specific return is available
only to a handful of investors and the scheme of extracting it cannot be simply
specified by a systematic process, then it is most likely real alpha. If it can
be specified in a systematic way, but it involves non-conventional techniques
such as short selling, leverage and the use of derivatives (techniques which are
often used to specifically characterize hedge funds), then it is possibly beta,
however in an alternative form, which we will refer to as “alternative beta.”
In the hedge fund industry “alternative beta” is often sold as alpha, but is
not real alpha as defined here (and elsewhere). If finally extracting the returns
does not require any of these special “hedge fund techniques” but rather “long
only investing,” then it is “traditional beta.”
But how do we model hedge fund returns explicitly and break them down
into alpha, alternative beta and traditional beta? Ultimately, what we are
looking for is a general equilibrium model, which relates hedge fund re-
turns to their systematic risk exposures represented by directly observable
market prices in the financial markets, similar to the Capital Asset Pricing
Model for the equity markets.4 This model does not exist yet in its entirety,
but there exists today a growing amount of academic literature on systematic
risk factors and hedge funds’ exposure to them (i.e. their factor loadings), in-
cluding a variety of “alternative beta factors.” We acknowledge that the qual-
ity of the offered model differs strongly for the different hedge fund strategy

1 See the recently published book by Jaeger (2005) and references therein.
2 Martin (2004) makes the pertinent point that measures of alpha inextricably depend on the definition of benchmarks or beta components, going on to identify ways in which techniques for measuring ‘alpha’ in a traditional asset management environment are inappropriate or otherwise undermined by the specific characteristics of hedge fund exposures. Moreover, most techniques for measuring hedge fund alpha tend to reward fund managers for model and benchmark misspecification, as imperfect specification of benchmark or ‘beta’ exposure tends to inflate alpha.
3 Asness (2004).
4 While the CAPM is considered “dead” by most academics, there are extensions of it in various forms that continue to be the subject of research. Further, the CAPM is still in extensive use by practitioners.

sectors. In other words, there is a variable degree of explanatory power for


(the variation of) hedge fund returns that factor models can offer across dif-
ferent strategy sectors. While Long/Short Equity has been well modeled in
academic research,5 models for some other strategies like Arbitrage strate-
gies (Equity Market Neutral, Convertible Arbitrage) display rather limited
explanatory power (i.e. low R-squared values).
This article aims to give reference to this academic effort and provide a co-
herent discussion of the current status of the “beta vs. alpha” controversy in the
hedge fund industry. Literature references are given extensively. However, it
goes further than what has been discussed in most academic papers in that it
describes some of the implications we can draw from recognizing that there
is likely more beta than alpha in hedge funds. We will discuss the possibil-
ity and reality of constructing passive, investable hedge fund indices thereof,
and finally provide some remarks on the controversy of the future investment
capacity for hedge funds.
The article is structured as follows: The first part gives a review of the
structure of the currently available return factor models for hedge funds.
The second part discusses the problems and pitfalls of hedge fund indices,
before the third and fourth part provides some concrete asset based factor
models for the various hedge fund strategy sectors. The fifth part discusses
how one can construct real benchmarks and possibly passive and investable
hedge fund indices. The subsequent two sections discuss the future of hedge
funds alphas and the entire industry’s investment capacity, before we provide
some concluding remarks.

2 Factor Models for Hedge Fund Strategies: Revisiting Sharpe’s Approach

In 1992 W. Sharpe introduced a unifying framework for such style models in an


effort to describe active management strategies in equity mutual funds.6 In his
model, he describes a certain active investment style as a linear combination
of a set of asset class indices. In other words, an active investment strategy is
a linear combination of passive, i.e. long-only, buy-and-hold, strategies. The
models Sharpe introduced are successful in explaining the lion’s share of the
performance of mutual funds.

5 W. Fung, D. Hsieh, “Extracting Portable Alpha from Equity Long/Short Hedge Funds” (2004).
6 See “Asset Allocation: Management Style and Performance Measurement” (1992)
by William Sharpe and the articles by Eugene Fama and Kenneth French “Mul-
tifactor explanations of Asset Pricing Anomalies” (1993) and “Common risk fac-
tors in the return of stocks and bonds” (1993). More information can also be
found at the websites of William Sharpe, www.wsharpe.com, and Ken French,
http://mba.tuck.dartmouth.edu/pages/faculty/ken.french/.
168 L. Jaeger

Fung and Hsieh were the first to extend Sharpe’s model to hedge funds
in 1997.7 They employed techniques similar to those Sharpe had applied to
mutual funds five years earlier, but introduced short selling, leverage and
derivatives – three important techniques employed by hedge funds - into their
model. The resulting factor equation would account for all hedge fund return
variation that derives from risk exposure to the risk factors of various asset
classes. Adding alpha to the equation, it allows us to decompose hedge fund
return as:
Hedge fund excess return = Manager’s alpha + Σ (βi * Factori ) + random
fluctuations
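To make this decomposition concrete, the following minimal sketch (Python; the fund and factor series are randomly generated placeholders, not data from any of the studies cited here) estimates alpha and the betas jointly by an ordinary least-squares regression of fund excess returns on factor excess returns:

```python
import numpy as np

# Hypothetical monthly excess returns; in practice these would be a hedge fund
# index and the asset class factor series described in the text.
rng = np.random.default_rng(0)
T = 60                                    # five years of monthly data
factors = rng.normal(0.0, 0.03, (T, 3))   # e.g. equity, small cap spread, credit
true_betas = np.array([0.4, 0.2, 0.3])
fund = 0.002 + factors @ true_betas + rng.normal(0.0, 0.01, T)

# Regress the fund's excess returns on the factors plus an intercept (the alpha).
X = np.column_stack([np.ones(T), factors])
coef, *_ = np.linalg.lstsq(X, fund, rcond=None)
alpha, betas = coef[0], coef[1:]
print(f"monthly alpha: {alpha:.4f}, betas: {np.round(betas, 2)}")
```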
Fung and Hsieh performed multifactor regressions of hedge fund returns on
eight asset class indices: US equities, non-US equities, emerging market equi-
ties, US government bonds, non-US government bonds, one-month Eurodollar
deposit rate, gold, and the trade-weighted value of the US dollar. They
identified five risk factors (referred to as style factors), which they defined
as modelling Global Macro, Systematic Trend-Following, Systematic Oppor-
tunistic, Value, Distressed Securities. They further argued that hedge fund
strategies are highly dynamic and create option-like, non-linear, contingent
return profiles. These non-linear profiles, they argued, cannot be modelled
in simple asset class factor models. In their later research they explicitly in-
corporate assets with contingent payout profiles, e.g. options.8 Most of the
studies which have followed show results consistent with Fung and Hsieh.9
The recent literature offers an increasing number of studies around the ques-
tion of common style factor exposure and contingency in payoff profile for
hedge funds.10
As the formula above describes, we infer the hedge funds’ alphas by mea-
suring and subtracting out the betas times the beta factors. We can look at
alpha as the “dark matter” of the hedge fund universe. It can only be mea-
sured by separating everything else out and seeing what is left. In other words,
alpha is never directly observable, but is measured jointly with beta. It can
only be indirectly quantified by separating the beta components out. The ob-
tained value of alpha therefore depends on the chosen risk factors. If we leave out a relevant factor in the model, the alpha will come out artificially high.
To draw another analogy, we can equally say that alpha is the garbage bag
of the regression: We account for everything we can, and whatever is left gets
7 Fung, W., Hsieh, D. (1997).
8 The idea of option factors for the purpose of hedge fund modeling was already introduced in the earliest work on hedge fund models by W. Fung and D. Hsieh (1997), and was since then discussed by many academic studies. See their recent work: W. Fung, D. Hsieh (2002).
9 See e.g. the article by S. Brown and W. Goetzmann (2003). The authors identify eight style factors, i.e. three more than Fung and Hsieh in their research.
10 See W. Fung, D. Hsieh, (2003); (2001); (2001); (2002); V. Agarwal, N. Naik, (2000); D. Capocci, G. Hübner, (2004).

put into alpha. As a consequence, some of the returns not accounted for by
these models are unaccounted beta rather than alpha. Surely, an incomplete
model of systematic risk factors doesn’t mean those additional risk factors do
not exist; only that we do not yet know how to model them. To draw another
image from astronomy, the outer planets of our solar system existed and ex-
erted their gravitational pull long before we had telescopes sensitive enough to
see them. Therefore the formula above on hedge fund returns should actually
read as follows:
Hedge fund return = Manager's alpha + Σ (βi * Factori(modelled)) + Σ (βi * Factori(unmodelled)) + random fluctuations.
A simple example illustrates the problem: Consider a put writing strat-
egy on the S&P 500, or equivalently a covered call writing strategy, as e.g.
represented by the Chicago Board Options Exchange's BXM index. To be precise, we
write monthly at-the-money call options on existing equity positions with one
month maturities. On regressing the BXM index against the S&P 500 over a
period of 11 years from 1994 to 2004 we obtain a statistically significant alpha
(i.e. a y-intercept of the regression) of around 0.4% per month, or almost 5%
p.a. There is surely not much true skill driven alpha in writing put options on
equities.11 All or most of the 0.4% is what we refer to as spurious or “phan-
tom” alpha, which results from the imperfect specification of the chosen model
(regression against the S&P 500). So we should not confuse pure manager skill
with an imperfect model. This is a common problem of multi-factor models
in the literature which claim to prove high alphas. We must therefore always
take any statistics of alpha with a grain of salt.
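As a rough numerical illustration of such phantom alpha, the following sketch (Python; the index returns are simulated stand-ins and the covered-call payoff is deliberately stylized, so the numbers will not match the BXM figures quoted above) regresses a capped-upside, premium-collecting return series on the underlying index alone and reads off the intercept:

```python
import numpy as np

rng = np.random.default_rng(1)
months = 132                                  # roughly an eleven-year window
sp500 = rng.normal(0.007, 0.045, months)      # simulated monthly index returns

# Stylized covered-call payoff: index return capped at a strike gain, plus a
# collected option premium (a simplification, not the actual BXM construction).
premium, cap = 0.015, 0.02
bxm_like = np.minimum(sp500, cap) + premium

# OLS of the strategy on the index: slope ~ beta, intercept ~ "alpha".
X = np.column_stack([np.ones(months), sp500])
(alpha, beta), *_ = np.linalg.lstsq(X, bxm_like, rcond=None)
print(f"beta: {beta:.2f}, monthly 'alpha': {alpha:.4f}")
# The positive intercept is phantom alpha: the short-option payoff is non-linear
# in the index, so a single linear factor misattributes the premium to skill.
```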

3 The Problem with Hedge Fund Indices


There is some more bad news for alpha: Hedge fund databases and thus the
indices constructed thereof are subject to various biases which make their
returns and thus the obtained alpha in a regression analysis based on these
indices look bigger than they really are.12 The lack of transparency and uni-
form reporting standards in the hedge fund industry are disreputable sources
of measurement errors that plague any hedge fund performance analysis. The
most important of these are the survivorship and the backfilling bias. The
consensus view of studies on this subject is that these effects account for at
least 3-4% of the reported hedge fund out-performance. A recent study by
B. Malkiel and A. Saha gives an idea about the performance upward biases
in hedge fund indices.13
11 Writing put options and investing the collateral in cash is identical to writing covered calls, a property that is known as "put call parity" in option theory.
12 See the discussion in Chap. 7 and Chap. 9 in L. Jaeger, "Through the Alpha Smoke Screens: A Guide to Hedge Fund Return Sources" (2005).
13 B. Malkiel, A. Saha, "Hedge Funds: Risk and Return," Working Paper (2004).

There is in fact little widely published data on historical hedge fund per-
formance, so industry analysis relies mostly on aggregated returns as provided
by a dozen different index providers which differentiate hedge fund perfor-
mance across the various strategy sectors. Although these indices constitute an
important tool for comparison and possibly benchmarking within and outside
the hedge fund industry, measuring manager performance, classifying invest-
ment styles, and generally creating a higher degree of transparency in this still
rather opaque hedge fund industry, the results of these efforts vary significantly be-
tween providers and depend more on “committee decisions” regarding index
construction criteria - such as asset weighting, fund selection and chosen sta-
tistical adjustments - than on objectively determined rules. Although this is
also somewhat of a problem in traditional asset class indices, it is severely ex-
acerbated in the hedge fund space by the diverse, dynamic and opaque nature
of the hedge fund universe.
However, the built-in flaws of existing indices have as much to do with
the built-in complexities of hedge funds as with any fault of the index devel-
opers. It is simply more difficult to create unambiguous index construction
guidelines for the heterogeneous hedge fund universe. In particular, while the
construction of traditional asset class indices rests on the reasonably well
founded assumptions that the underlying assets are homogeneous, and that
the investor follows a “buy and hold” strategy, hedge funds are diverse and
subject to dynamic change. In traditional asset classes, the average return
of the underlying securities in an index has a strong theoretical basis. It is
constructed to be the return of the “market portfolio,” which is the asset-
weighted combination of all investable assets in that class or a representative
proxy thereof. According to asset pricing theory – e.g. Sharpe's Capital Asset Pricing Model (CAPM) – this market portfolio represents exactly the combi-
nation of assets with the optimal risk-return trade-off in market equilibrium.
It is therefore not surprising that traditional equity indices became vehicles
for passive investment only after the development of a clear theoretical foun-
dation in the form of the CAPM.14 Traditional indices are designed to capture
directly a clearly defined risk premium available to investors willing to expose
themselves to the systematic risk of the asset class. So an investor in the S&P

14 It is worth noting here that equity indices remained almost solely performance analysis tools rather than investment vehicles for many years. The first asset weighted index tracker fund (on the S&P index) started in 1973, only about five years after the CAPM became broadly accepted. The very first tracker fund was launched in 1971 and was equally weighted (on the NYSE). The problem with equally weighted indices is that they require constant rebalancing to maintain those weightings, and in the pre-1975 period (i.e. prior to deregulation of stock commissions) such rebalancing was extremely costly. Wells Fargo launched a cap-weighted tracker fund in 1973 which enabled them to reduce transaction costs. Some argue that the predominance of the S&P 500 as a benchmark owes more to the ease of replication than to an inherent confidence in the theoretical justification for cap-weighting; see Schoenfeld's book "Active Index Investing" (2004).

500 index knows exactly what he is getting; broad exposure to the risks and
risk premia of the US large cap equities market. In other words, there exists
a general equilibrium model in the case of the stock markets. However, such
a model is still missing for the asset class hedge funds.
The standard way to construct a hedge fund index has so far been to use
the average performance of a set of managers15 . However, indices constructed
from averaging single hedge funds inherit the errors and problems of the un-
derlying databases. Therefore they face several performance biases that limit
the usefulness of the result.16 These biases include (but are not limited to):
Survivorship. The survivorship bias is a result of unsuccessful managers
leaving the industry, thus removing unsuccessful funds ex post from the repre-
sentative index. Only their successful counterparts remain, creating a positive bias. In the most extreme case this is like lining up a number of monkeys, letting them trade in the markets, taking out all those that lost money, and then checking the performance of the rest. The survivors may all be in good shape,
but they hardly represent the performance of the entire original group! Many
hedge fund databases only provide information on currently operating funds,
i.e. funds that have ceased operation are considered uninteresting for the in-
vestor and are purged from the database. This leads to an upwards bias in
the index performance, since the performance of the disappearing funds is
most likely worse than the performance of the surviving funds.17 Consensus
estimates about the size of the survivorship bias in hedge fund databases vary
from 2% to 4%. We note that hedge fund indices are only subject to this bias
to the extent that they are constructed after the fact/inception of the index.
Today index providers do not restate index returns on a going forward basis
as managers drop in and out of their database. Index users should only use
‘live’ index data rather than all historical pro forma data.
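A small simulation makes the size of this effect tangible. The sketch below (Python; the return distribution and the attrition rule are invented purely for illustration and are not calibrated to any database) compares the average return of the full fund universe with the average computed from survivors only:

```python
import numpy as np

rng = np.random.default_rng(2)
n_funds, n_months = 500, 36
# Hypothetical fund returns with zero true average skill.
returns = rng.normal(0.0, 0.04, (n_funds, n_months))

full_universe_mean = returns.mean(axis=0)      # all funds, dead or alive
cumulative = (1 + returns).prod(axis=1)
survivors = cumulative > 0.9                    # drop funds that lost more than 10%
survivor_mean = returns[survivors].mean(axis=0)

ann = lambda m: (1 + m.mean()) ** 12 - 1
print(f"full universe annualized mean: {ann(full_universe_mean):+.2%}")
print(f"survivors-only annualized mean: {ann(survivor_mean):+.2%}")
# The survivors-only average is biased upward even though true skill is zero.
```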
Backfilling. A variation of the survivorship bias can occur when a new fund
is included into the index and its past performance is added or "backfilled" into the database. This induces another upward bias: New managers enter
the database only after a period of good performance, when entry seems most

attractive. Since fewer managers enter during periods of bad performance, bad performance is rarely backfilled into the averages.18 Again, hedge fund indices are only subject to this bias to the extent that they are constructed after the fact/inception of the index.

15 Indices based on the average performance of a set of managers have generally well known pitfalls, already in traditional asset classes. See the article by Jeffrey Bailey, "Are Manager Universes Acceptable Performance Benchmarks?," Spring 1992.
16 Most of these issues are well known by practitioners and are discussed in detail in Chap. 9 of L. Jaeger, "Through the Alpha Smoke Screens." A good overview of the problems can be found in A. Kohler, "Hedge Fund Indexing: A Square Peg in a Round Hole," State Street Global Advisors (2003). See also "Hedge Fund Indices" by G. Crowder and L. Hennessee, Journal of Alternative Investments (2001); "A Review of Alternative Hedge Fund Indices" by Schneeweis Partners (2001); "Welcome to the Dark Side: Hedge Fund Attrition and Survivorship Bias over the Period 1994-2001" by G. Amin et al. (2001).
17 The survivorship bias is also well known in the world of mutual funds; see for example the paper by S. Brown et al., "Survivorship Bias in Performance Studies" (1992).
Selection. Unlike public information used to compose equity and bond indices, hedge fund index providers often rely on hedge fund managers to voluntarily and correctly submit return data on their funds. Hedge funds are private investment vehicles and are thus not required to make public disclosure of their activities. Some bluntly refuse to submit data to any index provider. This "self-selection bias" causes significant distortions in the construction of the index and often skews the index towards a certain set of managers and strategies on a going forward basis. Sampling differences produce much of the performance deviation between the different fund indices. Hedge fund indices draw their data from different providers, the largest of which are the TASS, Hedge Fund Research (HFR) and CISDM (formerly MAR) databases. These databases have surprisingly few funds in common, as most hedge funds report their data – if at all – only to a subset of the databases. Counting studies have shown that less than one out of three hedge funds in any one database contributes to the reported returns of all major hedge fund indices.19
Autocorrelation. Time lags in the valuation of securities (especially for less liquid strategies like Distressed Securities) held by hedge funds may induce a smoothing of monthly returns which leads to volatility and correlations being significantly underestimated. Statistically this effect expresses itself in significant autocorrelation in hedge fund returns (as will be shown below).
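The following sketch (Python; the smoothing weight is an assumed illustrative value, not an estimate from hedge fund data) shows how such return smoothing simultaneously lowers measured volatility and produces positive first-order autocorrelation:

```python
import numpy as np

rng = np.random.default_rng(3)
true_returns = rng.normal(0.008, 0.03, 240)    # hypothetical "economic" returns

# Reported returns as a moving average of current and lagged true returns
# (a stylized version of lagged pricing / return smoothing).
theta = 0.4                                     # assumed weight on the lagged month
reported = (1 - theta) * true_returns
reported[1:] += theta * true_returns[:-1]

ann_vol = lambda r: r.std(ddof=1) * np.sqrt(12)
lag1_corr = np.corrcoef(reported[:-1], reported[1:])[0, 1]
print(f"true annualized vol:     {ann_vol(true_returns):.2%}")
print(f"reported annualized vol: {ann_vol(reported):.2%}")
print(f"reported lag-1 autocorrelation: {lag1_corr:.2f}")
# Smoothing lowers the measured volatility and shows up as positive
# first-order autocorrelation in the reported series.
```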
Ironically, the theoretical and practical problems described above do not
disappear when the index is designed to be investable. Some problems are
actually exacerbated. A prerequisite for creating an investment vehicle is that
the underlying managers provide sufficient capacity for new investments. This
creates a severe selection bias, as hedge funds at full capacity (closed) are a
priori not considered in the index. In traditional assets, an investor in the
Dow Jones Industrial Average Index does not need to worry that IBM is
closed for further investment.20 But for hedge fund indices, capacity with top
18 R. Ibbotson estimates this bias to account for a total of up to 4% of reported hedge fund performance (presentation at GAIM conference 2004). See also: Brown, S., Goetzmann, W., Ibbotson, R., "Offshore hedge funds: Survival and performance 1989–1995" (1999). A recent estimate of the backfilling bias is given by B. Malkiel et al. in their paper "Hedge Funds: Risk and Return" (2004), where the backfilling bias is estimated in the same region as by Ibbotson.
19 See the study by W. Fung and D. Hsieh, "Hedge Fund Benchmarks: A Risk Based Approach" (2004).
20 To be more precise, IBM stocks are in fact "closed for further investments" as there are only a finite number of shares available (assuming no capital increase). In this way they actually resemble closed hedge funds. However, any investor who desires can freely purchase IBM shares in the secondary markets (stock markets) due to their high degree of liquidity (that is what stock markets are all about). In this sense the comparison serves us well here.
Fig. 1. Comparison of cumulative performance for the HFR investable indices vs. their non-investable counterparts since inception of the former (panels: HFRI vs. HFRX for Macro, Event Driven, Equity Hedge, Convertible Arbitrage, Distressed, Merger Arbitrage and Equity Market Neutral). The last graph shows the index referring to the global hedge fund industry (HFRI Fund Weighted Index vs. HFRX Global Index)
managers is a main issue. There is a clear trade-off between making an index representative and making it investable. Fig. 1 shows the divergence of the various strategy sector indices: it compares the performance of the HFRX, the investable counterpart of the HFRI index, to the latter since inception of the former. The deviation is eye-catching: Let us just have a specific look at the Equity Hedge indices. The average monthly underperformance of the investable version relative to the HFRI index is 62 bps, which translates into an average annual underperformance of 7.7%! We conjecture that this is about selection bias in the investable versions of the index more than survivorship bias in the non-investable one.
Investable indices depend directly on the services of particular “access
providers.” The selection of the index participants is biased towards the ac-
cess these service providers have to various hedge funds. This “access bias”
can lead to a severe distortion in the index. The investment capacity of hedge
fund managers (at least those which are actually in a position to provide per-
sistent alpha) is a scarce resource, for which investable index providers must
compete with other investors, e.g. funds of funds. An investor in a traditional
S&P 500 index fund does not have to worry that stocks in IBM will not be
available for purchase. But for an investable hedge fund index, availability of
specific funds is indeed an issue (as for any other investor). In such non-public
markets as those in which hedge funds do their offering, access is not deter-
mined by market price, but by the investors’ ability to get and keep direct
access to the individual fund manager. Often this is determined by personal
relationships and other “soft factors.” Therefore the distinction between in-
dices and regular funds of funds disappears upon a closer look for most index
providers.21 The indices struggle for capacity, must perform due diligence on
hedge fund managers, and have similar subjective means to select and assign
weights to hedge funds. It is thus not surprising that they often charge similar
levels of fees as funds of funds and in almost all cases actually also operate
as such. We can essentially identify them as disguised funds of funds that have
discovered the marketing value of the “index” label.22 They currently offer
neither low fee structures nor the clearly defined risk profiles comparable to
a passive index fund in traditional asset classes.23
The true test of whether a hedge fund index is a valid investment vehicle
is whether there is a secondary market for hedge funds, whether one can

21 The distinction between investable index providers and funds of funds is/should be about systematic methodology and goals for manager selection. Most index providers have virtually no selection methodology, and to that extent they are just funds of funds. Those that do have well founded methodologies that are implemented can, without demurring, be called indices. The biggest problem really is that the index provider and the asset manager are in fact identical – this is unlike the case for US equity indices, but not unlike the case for the most well regarded bond indices (e.g. Lehman).
22 One important difference between the index provider and a fund of hedge funds remains, though: The fund of funds manager is actively searching for alpha and trading talent, which justifies the comparably high fee level charged. He is not in the business of "averaging the alpha," an undertaking which almost by construction will lead to lower results in the case of hedge funds. Note that alpha extraction is on a global scale a "zero sum game."
23 The reader is referred to the following article for another discussion on the problems and pitfalls of hedge fund indices: L. Jaeger, "Hedge Fund Indices – A New Way to Invest in Absolute Return Strategies?" (June 2004).

construct derivatives from it and whether it can be sold short. The possibility
of short selling and constructing synthetic positions based on derivatives (in
a cost efficient way) creates the prospect of arbitrage opportunities using
the hedge fund indices. Ironically such arbitrage opportunities would most
likely be exercised by hedge funds, in a sort of Klein bottle of investments
that contain themselves. Whether or not such trades emerge will eventually
prove whether hedge fund indices can sustain market forces, which ultimately
enforce an arbitrage-free market equilibrium. Today, there is an active market
for structured products referencing hedge fund indices, including delta one
products that allow investors to synthetically short some of the investable
hedge fund indices.

4 Modelling Hedge Fund Returns: A First Simple Example

Figure 2 provides a first insight into how a combination of simple systematic strategies, each of which mirrors particular "beta factors" (risk premia), tracks the performance of a multi-strategy hedge fund portfolio. It displays the return of an equally weighted combination of three simple strategies, each tracking different risk premia:

Fig. 2. Performance of an equally weighted combination of three strategies: the sgfi trend following index, the BXM covered call writing index, and long the Credit Suisse High Yield Bond Index (annualised return: 10.3%, annualised volatility: 5.6%). For comparison, we show the performance of the HFR Composite (annualised return: 11.7%, annualised volatility: 7.2%), the HFR Fund of Funds Index (annualised return: 7.9%, annualised volatility: 5.8%) and the S&P 500

1. A simple trend following model on 25 liquid futures markets summarized in what is known as the "sgfi index" (Bloomberg ticker "SGFII <Index>");24
2. The BXM index – an index defined by the Chicago Board Options Exchange for a simple "buy write" strategy on the S&P 500 (Bloomberg ticker "BXM <Index>");25
3. The Credit Suisse High Yield Bond Index (Bloomberg ticker "CSHY <Index>").
There are no restrictions and only limited fees for investing into these three
strategies. Prices are readily available on information systems like Bloomberg.
Figure 2 also displays the returns of the HFR Composite Hedge Fund Index,
a broad aggregate across all hedge fund strategies, the Hedge Fund Research
Fund of Funds Index, which mirrors the performance of fund of funds man-
agers, and finally the S&P 500 index.
The return of this simple strategy combination over the 11-year period from 1996 to 2005 stands at 10.1% with a volatility of 5.6% and a Sharpe ratio of around 1. Compare this to an 11.1% return for the HFR Composite Index (volatility 7.1%, Sharpe ratio: 0.9) and 7.2% (volatility 5.9%, Sharpe ratio: 0.5) for the HFR Fund of Funds Index. Surprisingly, our simple strategy combination outperforms both hedge fund indices on a risk-adjusted basis. It even fares better than the HFR Fund of Funds index on
a total return basis and has only marginally lower absolute returns than the
HFR Composite Index. The fact that a combination of such simple strategies
already beats hedge fund averages illustrates the key role of risk premia in
hedge fund returns overall. This clearly justifies a deeper search into the risk
premia of individual hedge fund strategies.
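For readers who want to reproduce this type of comparison, a minimal sketch is given below (Python; the three monthly return series are random placeholders for the sgfi, BXM and Credit Suisse High Yield data, which are not reproduced here, and the risk-free rate is an assumed constant):

```python
import numpy as np

rng = np.random.default_rng(4)
months = 120
# Placeholder monthly returns for the three systematic strategies
# (trend following, covered call writing, high yield bonds).
strategies = rng.normal([0.009, 0.008, 0.007], [0.04, 0.025, 0.02], (months, 3))

combo = strategies.mean(axis=1)              # equally weighted, rebalanced monthly
risk_free = 0.003                            # assumed flat monthly risk-free rate

ann_return = (1 + combo).prod() ** (12 / months) - 1
ann_vol = combo.std(ddof=1) * np.sqrt(12)
sharpe = (combo.mean() - risk_free) / combo.std(ddof=1) * np.sqrt(12)
print(f"annualized return {ann_return:.1%}, vol {ann_vol:.1%}, Sharpe {sharpe:.2f}")
```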

5 Regression of Hedge Fund Returns on Systematic Risk Factors

In the following we perform some modelling of hedge fund strategies based on various regressions on systematic risk factors. For lack of better data we must rely on the publicly available hedge fund indices despite
their shortcomings mentioned above. One might suggest that a better choice
would be to perform the analysis on the investable indices as these do not
come with these upward biases. However, as discussed above, these often lack
the necessary degree of representativeness due to their own selection biases.
Furthermore, their history is too short to perform a meaningful regression.
And we claim that non-investable hedge fund indices themselves serve better as the dependent variables in a risk factor analysis than it might seem at first sight.

24 See L. Jaeger et al., "Case study: The sGFI Futures Index," Journal of Alternative Investments (Summer 2002).
25 "Buy write" refers to holding long the underlying – in this case the S&P 500 index – and simultaneously selling a call. This combination is economically identical to selling a put on the S&P 500 plus holding an equivalent amount of cash.


Their discussed shortcomings refer mostly to the absolute level of perfor-
mance and not to their risk characteristics. While non-investable indices fail
when used as absolute performance measures, they may very well do their
service when it comes to describing the typical risk exposure characteristics
of the diverse strategies.26 In other words, the biases such as survivorship and
backfilling bias have their effects mostly on the y-intercept, i.e. the alpha, and
less so on the sensitivities, i.e. the betas, of the regression. In order to illus-
trate this statement, we performed an analysis identical to the one above on
extended sets of individual managers as provided by the TASS database. The
thus obtained R-squares can be expected to be much lower due to the hetero-
geneity of hedge fund managers even within the same sector,27 but the obtained average values for the sensitivities are generally quite similar. Figure 3 illustrates this for the case of Long/Short Equity managers, where we display the histograms of the obtained factor sensitivities in our regression analysis for 483 Long/Short Equity managers in the period from 1998 to 2004. These results
should be compared to the results in the first row in the following Table 1.
Table 1 summarizes the results of a multifactor regression on the vari-
ous hedge fund strategy sector indices provided by the data provider Hedge
Fig. 3. Histograms of the factor exposures ("betas") of Long/Short Equity managers using the independent variables as in Table 1 (panels: Convertible, Small-Cap Spread, CPPI and AR(1) exposures). Data: TASS
26 Which is actually what linear regression models do: they explain variance, not absolute return.
27 This is already the case when performing a regression on individual stocks. The reason is simply idiosyncratic risk!

Table 1. Alternative factors

Fund Research (HFR). Returns are calculated on monthly data as geometric


averages (cumulative returns) of the log-differences of consecutive (monthly)
prices. Further the risk free rate of return was explicitly subtracted from all
independent as well as the dependent variables, evidently with the exception
of spread factors (as a risk free rate we chose US 3 month Libor). Note that the
regression models include the AR(1) factor (the autocorrelation factor, which

is the one-month lagged time series of the dependent variable) as an independent variable where significant. The reason for this is simply lagged marking of assets in several hedge fund strategies: prices do not adjust instantly to changing prices of the underlying instruments but with a delay, either because the underlying markets they trade in are less liquid or because, as has been hypothesized elsewhere, hedge fund managers actively smooth their reported returns over time.28
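A minimal sketch of a regression that includes such an AR(1) term (Python; the factor and index series are simulated placeholders, and this is not the exact specification behind Table 1) looks as follows:

```python
import numpy as np

rng = np.random.default_rng(5)
T = 120
factors = rng.normal(0.0, 0.03, (T, 2))         # placeholder risk factor returns
index = 0.001 + factors @ np.array([0.3, 0.2]) + rng.normal(0.0, 0.01, T)
index[1:] = 0.7 * index[1:] + 0.3 * index[:-1]  # crude smoothing to create a lag

# Regress month t returns on the factors and on the index's own month t-1 value.
y = index[1:]
X = np.column_stack([np.ones(T - 1), factors[1:], index[:-1]])   # AR(1) term last
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha, beta_factors, beta_ar1 = coef[0], coef[1:-1], coef[-1]
print(f"alpha {alpha:.4f}, factor betas {np.round(beta_factors, 2)}, AR(1) {beta_ar1:.2f}")
# A sizeable AR(1) coefficient indicates lagged pricing; part of the factor
# exposure arrives with a one-month delay and would otherwise be booked as alpha.
```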
Overall, the set of factors captures a large percentage of the hedge fund
return characteristics, which expresses itself in high R2 values of 60% on average. But at the same time this means that although we
can explain a substantial part of the variation of hedge fund returns by these
factor models, a substantial part is still missing. Furthermore, the regressions
are much more successful at explaining some hedge fund strategies than oth-
ers. They do well at explaining Long/Short Equity, Short Selling, and Event
Driven strategies. On the other hand, they do a poorer job with the strategies
Equity Market Neutral, Merger Arbitrage, and Managed Futures. We realize
that hedge funds earn a substantial part of their returns by taking system-
atic risks that our statistical methods allow us to measure. But the nature of
these risks often diverges from the standard notion of systematic (broad mar-
ket) risk. In the case of equity risk factors, it is often small cap risk (Russell
2000), non-linear risk (convertible bonds, BXM), or default risk (high yield,
emerging markets) rather than the risk of the overall stock market. In the
case of bond market risks, it is specifically credit risk that is assumed by
many hedge funds (Event Driven, Distressed Debt, Fixed Income Arbitrage,
Convertible Arbitrage).
Note the significance of the autoregressive term AR(1) in the regression in
five out of ten strategies. We can interpret the autocorrelation shown in the
results as a sign of persistent price lags in the valuation of hedge funds. This
implies that simple measures of risk like Sharpe ratio, volatility, correlation
with market indices etc. significantly underestimate the true market risk in
hedge fund strategies. Indeed positive autocorrelation has two effects: it drives
down estimated volatility and it means that suddenly changing market con-
ditions and shocks – as measured by the risk factors – distribute over several
periods. The AR(1) factor thus measures some lagged beta. Excluding this
factor would cause some unaccounted beta to be misinterpreted as alpha.
The regression results discussed above merit a more detailed look at
some of the statistics we obtained, specifically on the stability of our mod-
els, a subject which is surprisingly little covered in the literature. For this
purpose we performed a CUSUM test which is designed to test whether the
obtained regression models are stable to any statistically significant degree.

28 A thorough discussion of the autoregressive factor can be found in Getmansky, M., Lo, A. W., Makarov, I., "An Econometric Model of Serial Correlation and Illiquidity in Hedge Fund Returns" (2004). See also the paper by C. Asness et al., "Do Hedge Funds Hedge?" (2001).

The CUSUM test considers the cumulated sum of the (normalized) recursive residuals ωr,

Wt = Σ_{r=K+1}^{t} ωr / σ̂,

(where the denominator σ̂ is the predicted standard deviation of the error term of the regression).
In order to perform the test, Wt is plotted as a function of the time variable t. The null hypothesis of model stability can be rejected when Wt breaks the straight lines passing through the points (K, ±a·√(T−K)) and (T, ±3a·√(T−K)), where a is a parameter dependent on the chosen level of significance. Figure 4 displays the cumulated residuals for all models. We observe that for none of our models do the cumulated residuals Wt break the confidence levels. Therefore the null hypothesis of model stability cannot be rejected for any of our models.
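For illustration, the test statistic can be computed as in the following sketch (Python; it uses ordinary one-step-ahead recursive residuals on simulated data, which is one common implementation and not necessarily the exact variant used here):

```python
import numpy as np

rng = np.random.default_rng(6)
T, k = 120, 3
X = np.column_stack([np.ones(T), rng.normal(0.0, 0.03, (T, k - 1))])
beta_true = np.array([0.002, 0.4, 0.2])
y = X @ beta_true + rng.normal(0.0, 0.01, T)

# Recursive residuals: one-step-ahead forecast errors, scaled to unit variance.
w = []
for t in range(k, T):
    b, *_ = np.linalg.lstsq(X[:t], y[:t], rcond=None)        # fit on data up to t-1
    x_t = X[t]
    f_t = 1 + x_t @ np.linalg.pinv(X[:t].T @ X[:t]) @ x_t    # forecast-variance factor
    w.append((y[t] - x_t @ b) / np.sqrt(f_t))
w = np.array(w)

sigma_hat = w.std(ddof=1)
W = np.cumsum(w) / sigma_hat           # CUSUM statistic W_t
a = 0.948                               # 5% significance parameter
bounds = a * np.sqrt(T - k) * (1 + 2 * np.arange(len(W)) / (T - k))
print("model stable at 5%:", bool(np.all(np.abs(W) < bounds)))
```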
A second test for model stability is to plot the obtained factor sensitivities over time in a rolling regression. We performed this analysis as well, and the results likewise indicate a generally high degree of stability of these factors. Figure 5 shows the results for all our strategies.

Fig. 4. Results of a CUSUM stability test for the regression models in Table 1 (one panel per strategy: Equity Hedge, Short Selling, Event Driven, Distressed, Macro, Equity Market Neutral, Merger Arbitrage, Fixed Income, Convertible Arbitrage)

6 Mimicking Hedge Fund Strategies: Can We Create Better Indices?

The obvious question arises: Can we use the insights given by the models and the factor exposure discussed above to create better benchmarks? These would aim at mimicking the particular hedge fund strategies, and possibly constitute investable alternatives to the currently offered hedge fund indices (a provocative thought which we already hinted at in Fig. 2). The very goal would be to accurately separate systematic risk exposure from true manager alpha. The former constitutes what an index is all about while the latter by definition should not be part of an index/benchmark.
The idea of using strategy replications to model hedge fund returns in a
factor model setting was developed in a paper by Fung and Hsieh in 2001
for Managed Futures strategies.29 Fung and Hsieh modelled the performance
of a generic trend-following strategy using look-back straddles. Since then
they and others have applied this type of modelling to a variety of other
hedge fund styles,30 including Merger Arbitrage,31 Fixed Income Arbitrage,32
and Long Short Equity.33 The hedge fund firm Bridgewater, for example,
has conducted some simple but interesting research along these lines.34 In
most of these studies the authors used simple trading strategies for modelling
Managed Futures, Long/Short Equity, Merger Arbitrage, Fixed Income Arbi-
trage, Distressed Securities, Emerging Markets, and Short Selling strategies
and generally reached good correspondence with the broadly used hedge fund
sub-indices of the corresponding strategy sector.
In the following we calculate the performance of a strategy which invests
directly into the factor exposures taken from the regression, i.e. we explicitly
calculate the cumulative returns
Return(t) = Σ (βi * Factori(t)).
The factors chosen for this analysis are the same as in the regression above.
We refer to these returns as the “Replicating Factor Strategy” returns (in the
following referred to as simply “RFS” returns) and compare them to the
realized returns displayed by the corresponding hedge fund indices. In order
29 See W. Fung, D. Hsieh, "The Risk in Hedge Fund Strategies: Theory and Evidence from Trend-Followers" (2001).
30 See W. Fung, D. Hsieh, "The Risk in Hedge Fund Strategies: Alternative Alphas and Alternative Betas" in L. Jaeger (ed.), "The New Generation of Risk Management for Hedge Funds and Private Equity Investment" (2003).
31 M. Mitchell, T. Pulvino, "Characteristics of Risk in Risk Arbitrage" (2001).
32 W. Fung, D. Hsieh, "The Risk in Fixed Income Hedge Fund Styles" (2002).
33 W. Fung, D. Hsieh, "The Risk in Long/Short Equity Hedge Funds" (2004); V. Agarwal, N. Naik, "Performance Evaluation of Hedge Funds with Option-Based and Buy-and-Hold Strategies" (2003).
34 See the publication by G. Jensen and J. Rotenberg, "Hedge Funds Selling Beta as Alpha" (2003).

to avoid the problem of data mining and in-sample over-fitting, the factors chosen for the RFS were calculated on a rolling, forward-looking basis. To be precise, the RFS returns in a given month were calculated using factors obtained by a regression over data for the previous five years ending with the previous month. The RFS are in spirit similar to what Jensen et al.35 describe as a generic replication of a hedge fund strategy, with the difference, however, that the chosen factors/sub-strategies are explicitly modelled in the regression setup.
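A minimal sketch of such a rolling, out-of-sample replication (Python; the window length matches the five-year choice described above, but the factor and index series are simulated placeholders) is:

```python
import numpy as np

rng = np.random.default_rng(7)
T, window = 180, 60                      # monthly data, five-year estimation window
factors = rng.normal(0.0, 0.03, (T, 3))
hf_index = 0.002 + factors @ np.array([0.35, 0.2, 0.25]) + rng.normal(0.0, 0.01, T)

rfs = np.full(T, np.nan)
for t in range(window, T):
    # Estimate betas on the previous five years ending with the previous month...
    X = np.column_stack([np.ones(window), factors[t - window:t]])
    coef, *_ = np.linalg.lstsq(X, hf_index[t - window:t], rcond=None)
    betas = coef[1:]                     # drop the intercept: alpha is not replicated
    # ...and apply them to the current month's factor returns.
    rfs[t] = factors[t] @ betas

live = slice(window, T)
print(f"correlation RFS vs. index: {np.corrcoef(rfs[live], hf_index[live])[0, 1]:.2f}")
print(f"cumulative RFS return:   {(1 + rfs[live]).prod() - 1:+.1%}")
print(f"cumulative index return: {(1 + hf_index[live]).prod() - 1:+.1%}")
```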
The results for the most recent three years (since inception of the investable
indices) are rather astonishing: The cumulative replicating strategy’s returns
are often superior to the returns of the hedge fund indices, especially when
considering their investable versions. For the latter, the performance of the RFS is better for every single strategy sector with the exception of the Distressed
strategy.
Interpreting our results leads us to a schematic illustration of where hedge
fund returns come from (Fig. 6). A long-only manager (represented by the left
bar) has two sources of returns: the market exposure and the manager excess
return, his “alpha” (which is negative for most managers in this domain).
The difference between long-only investing and hedge funds is largely that the
hedge fund will hedge away all or part of the broad market exposure. In order
to achieve this risk reduction, the hedge fund manager employs a variety of

Fig. 6. A schematic model for hedge fund return sources based on results in Table 1. The figure contrasts an active long-only bonds/equity fund with a hedge fund; its building blocks are manager skill/alpha (security selection, timing, execution), broad market risk (equity, credit, yield curve), hedging and leverage, and risk premia such as small firm, value stocks, liquidity, short option, complexity, commodity, convergence, event and FX rate risk
35 G. Jensen and J. Rotenberg, "Hedge Funds Selling Beta as Alpha" (2003), updated in 2004 and 2005.

techniques and instruments not typically used by the long-only fund manager
including short selling and the use of derivatives. This results in what appears
as a “pure alpha” product with low expected returns and low expected risk.
But in order to be attractive as a stand-alone investment, the hedge fund
manager has to conform to the market standard for return. This leads him
to scale the risk by using leverage, which provides the desired magnification
of return and risk. In this magnified configuration, systematic elements of
risk and return that before were hidden in the “Alpha” are suddenly large
enough to be analysed separately. In other words, we now have the necessary
magnifying glass to separate out the “beta in alpha’s clothing.” We estimate
that up to 80% of the returns from hedge funds originate as the result of beta
exposure (i.e. exposure to systematic risk factors) with the balance accounting
for manager skill based alpha (or not yet identified risk factors).
In the following we discuss our results for the individual strategy sec-
tors, the summary of which is presented in Table 2 in comparison with the
investable and non investable indices from Hedge Fund Research.

6.1 Long/Short Equity

Most Long/Short Equity managers have exposure to both the broad equity
market and particularly to small cap stocks. Managers may find it easier to
find opportunities in a rising market, and it may also be easier to short sell
large cap and buy small cap stocks. Our risk factor model in Table 1 confirms
these results. The most significant factors are related to broad equity and small
cap equity markets. Fung and Hsieh obtain similar results in a specific study
on the Long/Short Equity strategy.36 They choose as independent variables
the S&P 500 index and the difference between the Wilshire 1750 index and
the Wilshire 750 index as a proxy for the small cap risk factor. We obtained

Table 2. Cumulated performance of the RFS and of the HFRX and HFRI strategy indices, data from March 2003 to August 2005

Strategy RFS HFRX HFRI


Equity Hedge 27.8% 16.0% 32.8%
Market Neutral 6.2% −3.9% 10.9%
Short Selling −28.2% N/A −23.0%
Event Driven 29.8% 24.1% 40.0%
Distressed 20.1% 23.3% 44.8%
Merger Arbitrage 13.0% 10.9% 15.3%
Fixed Income 7.8% N/A 16.3%
Convertible Arbitrage 7.6% −5.3% 2.4%
Global Macro 16.7% 10.1% 24.6%
Managed Futures 9.2% N/A N/A

36 See W. Fung, D. Hsieh, "The Risk in Long/Short Equity Hedge Funds" (2004).

very similar results (having chosen the Russell 2000 and Russell 1000 for the
calculation of the small cap spread).
However, a closer look reveals that the exposure of Long/Short Equity
hedge funds has a strongly non-linear profile. This non-linear exposure is
reflected in the fact that the most explanatory independent variable is a con-
vertible bond index.37 Apparently, this profile models the Long/Short Equity
strategy well: Less participation on the upside, protection on the downside to
a certain point, but with more pronounced losses in a severe downturn of the equity markets (when convertible bonds lose their bond floor). The substi-
tution of an equity factor with a convertible bond factor thus yields a better
model than a simple equity factor.38 However there is another equity related
factor that comes into play: Hedge funds tend to decrease their exposure in
falling equity markets and increase it in rising markets, similar to a “Constant
Proportion Portfolio Insurance” strategy often employed in capital protected
structures. We simulate this behaviour by including such a CPPI factor based
on the rolling 12 month performance of the S&P 500. Figure 7 presents the
performance of the RFS next to the HFR non-investable (HFRI) and the in-
vestable versions of HFR (HFRX) and S&P indices since inception of the
HFRX (inception of the S&P index occurred later, and at its inception it was
taken to the same level as the HFRX in the graph). The chart confirms what
the numbers already indicated: We can very well replicate the performance
of the average Long/Short Equity manager in the index with an RFS model

Fig. 7. Returns (monthly and cumulated) of the non-investable HFRI Equity Hedge Index, the investable HFRX Equity Hedge Index, and the (investable) S&P Long/Short Equity Index (all in light color) vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details)
37 The convertible bond index primarily serves as a proxy for high tech and small cap stocks. If we include the S&P 500 and Russell 2000 index, a lot (but not all) of the explanatory power of the convertible bond index goes away.
38 We would like to note here, however, that the substitution of the Convertible factor with a straight equity risk factor such as the S&P 500 yields R-squares which are only about 10% below the values reported here. The convertible bond index thus can be considered as a proxy for small cap (and possibly Telecom/Technology) exposure.

with similar performance and volatility. The RFS performs in line with the HFRI index despite some alpha displayed in Table 1. Figure 17 sheds some light on this discrepancy: Table 1 displays the average alpha over the regression period, which, as Fig. 17 indicates, declines quite rapidly over time. Figure 7 in contrast only matches the most recent performance since 2003. There is only little alpha shown by Long/Short Equity managers in the most recent period, as Fig. 17 indicates. Finally, the RFS outperforms both investable indices (HFRX and S&P) significantly.

Equity Market Neutral

Equity Market Neutral strategies aim at zero exposure to specific equity mar-
ket factors. Correspondingly, the model in Table 1 shows only a small (however
statistically significant) exposure to broad equity markets. However, the re-
sults indicate that the Equity Market Neutral style carries sensitivity to the
Fama-French momentum factor UMD and the value factor (the spread of the
MSCI value and growth indices). The R2 value of the regression for Equity Market Neutral comes out lowest of all strategy sectors next to Managed Futures. In other words, simple linear models fall short of explaining a significant part of the variation of returns for this hedge fund style. However, to get the combination of systematic risk exposures of Equity Market Neutral strategies right, we must distinguish two distinctly different sub-styles of this strategy. The one (often system based) approach buys undervalued stocks
and sells short overvalued stocks according to a value and momentum based
analysis. The second more short term oriented approach (also referred to as
“Statistical Arbitrage”) trades in pairs based on a statistical analysis of rel-
ative performance deviation of similar stocks. Both styles naturally have a
different exposure to the factors examined here.
Figure 8 confirms what the numbers in Table 1 indicate: The RFS un-
derperforms the HFRI index by some margin reflecting the positive alpha in
Table 1. However, it outperforms the HFRX investable index significantly.
Fig. 8. Returns (monthly and cumulated) of the non-investable HFRI Equity Mar-
ket Neutral Index and the investable HFRX Equity Market Neutral Index (in light
color) vs. the RFS cumulative return (in dark color) based on the factor returns (see
text for details)

Short Selling

The main exposure of the Short Selling strategy is, quite obviously, being short
the equity market. Interestingly, the exposure to the broad equity markets can
best be modeled with the same factor as for the Long/Short equity managers,
the Convertible Bond Index. This indicates the same type of non-linear expo-
sure as for the Long/Short Equity strategy, however with the signs inversed.
The strategy displays positive sensitivity to value stocks as measured by the spread between the MSCI value and growth indices. The alpha value
for Short Selling strategies stands at around 4-5% p.a. This indicates that
the short side does offer some profit opportunities, possibly explained in part
by most investors being restricted from selling short. However, the alpha of
this strategy must be high in order for the strategy to generate any profits
at all. This is because from the perspective of risk factor exposure, shorting
the equity markets starts off with an expected negative 4-7% return (long
term performance of the equity markets minus short rebate for the short posi-
tions). As a result Short Selling is the only hedge fund strategy with negative
past performance over the last 15 years. This is also reflected in Fig. 9 for the
more recent period. We observe that the Short Selling strategy can be well
replicated by the RFS model.

Event Driven

Event Driven hedge funds constitute an ensemble of various investment strate-


gies around company specific events including restructuring, distress and
mergers. According to our factor model in Table 1 the average Event Driven
strategy comes with a rather simple exposure to the broad equity market,
small cap stocks and the high yield bond market. Further the AR(1) factor
indicates autocorrelation in returns reflecting liquidity risk and possible lagged
pricing of the underlying securities. Our model explains an astonishing 80% of

Fig. 9. Returns (monthly and cumulated) of the non-investable HFRI Dedicated Short Bias Index vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details). Note: An investable version of the HFR index does not exist for dedicated short hedge funds
Fig. 10. Returns (monthly and cumulated) of the non-investable HFRI Event
Driven Index and the investable HFRX Event Driven Index (in light color) vs.
the RFS cumulative return (in dark color) based on the factor returns (see text for
details)

the variation of Event Driven returns. Alpha is the highest for any strategy in
the hedge fund universe with roughly 5% p.a. over the analyzed period. This
is also reflected in Fig. 10, where we see that the RFS model yields roughly two thirds of the return of the Event Driven managers in the HFRI
index. However, again, the RFS outperforms the HFRX and S&P investable
index version significantly.

Distressed Securities

Distressed Securities strategies come with a simple set of exposures to credit,


equity, particularly small cap equity, and liquidity risks. These are exactly
the factors which show up in Table 1. The AR(1) factor bears the largest
sensitivity, reflecting the low degree of liquidity offered in Distressed Securi-
ties investing. A lack of regular pricing and valuation induces autocorrelation
in the return streams. The partly rather illiquid strategies closely resemble
the return sources of private equity investment. The investor provides an im-
portant funding source for companies without access to traditional capital
sources during important phases of their development; usually times of dis-
tress. In contrast to investors in regular stocks, an investor in distressed debt
or equity just like a private equity investor has no direct access to his capi-
tal for several years. He is further exposed to uncertainty about the size and
timing of future cash flows.
Not surprisingly the level of alpha for Distressed hedge funds managers is
around 3-4% p.a. which is along with its peers in other Event Driven sectors
(e.g. Merger Arbitrage) among the highest in the hedge fund industry. This is
also reflected in Fig. 11, where we see that the RFS model yields roughly half of the return of the Distressed managers in the HFRI index. Even
the investable HFRX index outperforms the RFS.
Fig. 11. Returns (monthly and cumulated) of the non-investable HFRI Distressed
Index and the investable HFRX Distressed Index (in light color) vs. the RFS cumu-
lative return (in dark color) based on the factor returns (see text for details)

Merger Arbitrage
In their seminal paper on the Merger Arbitrage strategy, Mitchell and Pulvino39 examine the conditional correlation properties of this strategy: Merger Arbitrage strategies display rather high correlations to the equity markets when the latter decline and comparably low correlations when stocks trade up or sideways. This corresponds to a correlation profile similar
to that of a sold put on equities. As a matter of fact, the payout profile
of Merger Arbitrage strategies corresponds directly to a sold put option on
announced merger deals. This short put profile is reflected in the significance
of the BXM factor in Table 1. Shorting put options provides limited upside
but full participation on the downside (less the option premium). This argu-
ment extends beyond the immediate exposure to merger deals breaking up:
When the stock market falls sharply, merger deals are more likely to break.
In addition, a sharp stock market decline will reduce the likelihood of revised
(higher) bids and/or bidding competition for merger targets. Falling stock
markets also tend to reduce the overall number of mergers, which increases
the competition for investment opportunities and may thereby reduce the
expected risk premium. The strategy therefore has a slightly positive stock
market beta, however strongly non-linear. This overall exposure profile to
equity markets comes more from the correlation between the event risk and
the market than from the individual positions.
Mitchell and Pulvino calculated the historical track record of a simple
rule-based merger arbitrage strategy that at any time invests in each an-
nounced merger deal, both cash and stock-swap, with a pre-specified entry
and exit rule.40 They conducted this calculation for 4,750 merger transac-
tions from 1963 to 1998. The hedge fund manager Bridgewater performed a
very similar study but constrained themselves to the ten largest mergers at
any point in time. In both cases the resulting simulated returns came very
39
See M. Mitchell and T. Pulvino, "Characteristics of Risk in Risk Arbitrage" (2001).
40
See M. Mitchell and T. Pulvino, "Characteristics of Risk in Risk Arbitrage" (2001).
[Chart: monthly returns and cumulative index levels for the RFS, HFRI and HFRX Merger Arbitrage series over 2003-2005; axis labels omitted]
Fig. 12. Returns (monthly and cumulated) of the non-investable HFRI Merger Arbitrage Index and the investable HFRX Merger Arbitrage Index (in light color) vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details)

close to the returns of the Merger Arbitrage hedge fund indices (HFR and
Tremont). We included a strategy which invests exactly along the lines of the Mitchell/Pulvino study, the publicly available "Merger Fund."41 Our regression shows what we expected: exposure to the equity markets, in particular the small cap segment (and furthermore the value sector), the BXM index, and the Merger Fund. However, the explanatory strength of the model is not that high (considering that these factors should reflect very well what the strategy is about). Just as with other Event Driven strategies, the alpha
value is above average for this strategy with around 4% p.a. However, a com-
parison with the performance of the RFS in Fig. 12 shows that the skill based
component of returns has declined in recent years, as the RFS tracks the
performance of the HFRI Merger Arbitrage rather closely. Again, the RFS
outperforms the investable version of the HFR index by a safe margin.

General Relative Value

Relative Value strategies–represented here by Fixed Income Arbitrage and


Convertible Arbitrage – have three types of systematic exposure. They first
capitalize on price spreads between two or more related financial instruments
which often represent a compensation for particular risks such as credit risk,
interest rate term structure risk, liquidity risk, or exchange rate risk. Sec-
ondly, they provide liquidity and price transparency in complex instruments, employing proprietary valuation models to value these instruments. Related returns can be referred to as liquidity and "complexity" pre-
mia. The latter is related to the risk of mis-modeling the complexity of the
underlying financial instrument. The hedge fund manager is short an option
which turns strongly into the money when his valuation model is inaccurate.
Finally, Relative Value Hedge fund managers have a preference for negatively
41
Bloomberg ticker: MERFX US Equity.

skewed return distribution, where steady but small gains are countered with
rare but large losses. In other words, the managers are short some sort of
volatility, which makes the return profile resemble the payout profile of a
short option position.

Fixed Income Arbitrage


Fixed Income Arbitrage strategies often expose themselves to a combination of
liquidity, credit and term structure risks, e.g. through credit barbell strategies
(long short-term debt of lower credit quality and short long term government
bonds), yield curve spread trades, or on-the-run vs. off-the-run treasury bond
positions. Exposure to credit risk, convertible bonds and emerging market bond securities is most prevalent, as Table 1 indicates. The significance of the AR(1) term indicates autocorrelation in returns, signaling lagged pricing of the underlying securities, and reflects liquidity risk. According to our factor model the alpha value for Fixed Income strategies is in the region of 2.5% p.a., and the model explains around 41% of the variation of returns.
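To make the mechanics of such a factor regression concrete, here is a minimal sketch (Python with numpy; all series are synthetic stand-ins rather than the actual factor data behind Table 1) in which a monthly index return is regressed on a few factor returns plus its own lagged value, one common way of implementing an AR(1) factor, and the implied annualized alpha and R2 are read off.

    import numpy as np

    rng = np.random.default_rng(0)

    # synthetic stand-ins for monthly factor and index series (illustrative only)
    T = 120
    credit = rng.normal(0.003, 0.02, T)        # high-yield / credit factor
    convert = rng.normal(0.004, 0.03, T)       # convertible bond factor
    emerging = rng.normal(0.005, 0.04, T)      # emerging market bond factor
    index = (0.002                             # "true" monthly alpha
             + 0.4 * credit + 0.2 * convert + 0.15 * emerging
             + rng.normal(0.0, 0.01, T))
    index[1:] += 0.25 * index[:-1]             # induce autocorrelation (lagged pricing)

    # OLS: index_t = alpha + b'*factors_t + phi*index_{t-1}
    y = index[1:]
    X = np.column_stack([np.ones(T - 1), credit[1:], convert[1:], emerging[1:], index[:-1]])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)

    fitted = X @ coef
    r2 = 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
    print("annualized alpha: %.1f%%   AR(1) loading: %.2f   R2: %.2f"
          % (12 * 100 * coef[0], coef[-1], r2))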
Fung and Hsieh42 chose another–but similar–set of factors including op-
tions on interest spreads (they call these “ABS factors”) to model various
Fixed Income Arbitrage trading styles. They obtain slightly higher R2 values
than presented in our study here.
Their and our results explain why the heaviest losses of this style occurred
in “flight to quality” scenarios, when credit spreads suddenly widen, liquidity
evaporates and emerging markets fall sharply. Events like the summer of 1998 remind us that the strategy bears a risk profile similar to a short option, with the risk of significant losses but otherwise steady returns. It is inherently difficult to model the exposure to these extreme events, as they are so rare that their true likelihood is hard to calculate. Nevertheless, the hedge fund investor should keep this exposure in mind.
Figure 13 shows that the RFS returns cannot quite keep up with the HFRI returns, consistent with our results in Table 1.
[Chart: monthly returns and cumulative index levels for the RFS and HFRI Fixed Income series over 2003-2005; axis labels omitted]
Fig. 13. Returns (monthly and cumulated) of the non-investable HFRI Fixed Income Index vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details). Note: An investable version of the HFR index does not exist for Fixed Income Arbitrage hedge funds
42
See W. Fung, D. Hsieh, “The Risk in Fixed Income Hedge Fund Styles” (2002).

Convertible Arbitrage

Convertible Arbitrage hedge funds are exposed to a variety of different risk


factors: Credit risk, equity market and equity volatility risk, and liquidity risk.
These factors – the high yield factor, convertible and equity factor, and the
AR(1) factor – also appear as the relevant factors in Table 1. As for Fixed
Income Arbitrage, the Convertible Arbitrage model shows a significant AR(1) term, which indicates autocorrelation in returns for this strategy as well. This signals a lack of consistent and timely pricing of the underlying convertible securities and reflects exposure to liquidity risk and valuation risk.
To mix the right combination of these risks, however, we must distinguish two distinctly different sub-styles of Convertible Arbitrage strategies. The option-based Convertible Arbitrage style simply buys the convertible bond, sells short the underlying equity and re-establishes a delta hedge frequently, a trading technique referred to as gamma trading. This style tries to hedge out credit risk as much as possible and thus cares little about the credit markets. The second, credit-oriented, style makes an explicit assessment of the issuer's creditworthiness and deliberately takes on overpriced credit risk. Both styles naturally have a different exposure to the credit markets.
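To give a feel for the gamma-trading mechanics just described, the following rough sketch (Python; a plain Black-Scholes call stands in for the convertible's embedded option, and all parameters are invented for illustration) holds a long option, shorts the delta-equivalent amount of stock and re-hedges daily. The resulting P&L tends to be positive when realised volatility exceeds the volatility at which the position is priced and hedged, which is the essence of the gamma-trading sub-style.

    import numpy as np
    from statistics import NormalDist

    N = NormalDist().cdf

    def bs_call(S, K, T, r, sigma):
        """Black-Scholes call price and delta (the call is only a stand-in
        for the convertible's embedded equity option)."""
        if T < 1e-12:
            return max(S - K, 0.0), float(S > K)
        d1 = (np.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * np.sqrt(T))
        d2 = d1 - sigma * np.sqrt(T)
        return S * N(d1) - K * np.exp(-r * T) * N(d2), N(d1)

    def gamma_trading_pnl(path, K, T, r, hedge_vol, dt):
        """Buy the option, short `delta` shares, and re-establish the delta
        hedge at every step; return the final P&L of the hedged position."""
        price, delta = bs_call(path[0], K, T, r, hedge_vol)
        cash = -price + delta * path[0]          # pay for option, proceeds from short stock
        for S in path[1:]:
            T -= dt
            cash *= np.exp(r * dt)
            new_price, new_delta = bs_call(S, K, T, r, hedge_vol)
            cash += (new_delta - delta) * S      # adjust the short stock position
            price, delta = new_price, new_delta
        return cash + price - delta * path[-1]   # unwind option and buy back the shares

    rng = np.random.default_rng(3)
    dt, steps, realised_vol = 1 / 252, 252, 0.35   # realised vol above the 25% hedge vol
    increments = rng.normal(-0.5 * realised_vol ** 2 * dt, realised_vol * np.sqrt(dt), steps)
    path = np.concatenate([[100.0], 100.0 * np.exp(np.cumsum(increments))])
    print("gamma-trading P&L: %.2f" % gamma_trading_pnl(path, 100.0, 1.0, 0.0, 0.25, dt))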
Naturally, the credit-oriented sub-style of Convertible Arbitrage carries a
significant exposure to credit risk, while the option-based sub-style does not.
As credit risk is correlated with equity markets the second style has a less
well-defined sensitivity to falling equities. Increasing volatility helps the strat-
egy, but widening credit spreads hurt it. The option-based gamma trading
style, in contrast, performs better in a volatile environment in which equi-
ties are falling, which explains the overall negative correlation of Convertible
Arbitrage hedge funds to the equity markets in Table 1. Declining volatility
leads this strategy to under-perform during the period of decline. The dual
nature of Convertible Arbitrage hedge funds led to an interesting development
in 2003 which confused some investors. In an environment of simultaneously
rapidly declining credit spreads and equity volatility, credit oriented Con-
vertible Arbitrage strategies displayed stellar performance while the gamma
traders displayed disappointing returns that hovered near zero.
This divergence in style is currently not reflected in the available hedge
fund indices, which makes it more difficult for factor models to capture the
sensitivities of the style.
To correctly evaluate these two variants of Convertible Arbitrage, we
would need a separate index for each sub-style. In a recent research pa-
per43 , V. Agarwal et al. separate the key risk factors in Convertible Arbi-
trage strategies: equity (and volatility) risk, credit risk, and interest rate risk.
Consequently they design three “primitive trading strategies” to explain the
returns of the strategy in terms of the key risk factors and premia captured by
these strategies: positive carry, credit risk premium (“credit arbitrage”) and
43
V. Agarwal, W. Fung, Y. Loon, N. Naik, “Risks in Hedge Fund Strategies: Case
of Convertible Arbitrage” (2004).
[Chart: monthly returns and cumulative index levels for the RFS, HFRI and HFRX Convertible Arbitrage series over 2003-2005; axis labels omitted]
Fig. 14. Returns (monthly and cumulated) of the non-investable HFRI Convertible Arbitrage Index and the investable HFRX Convertible Arbitrage Index (in light color) vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details)

gamma trading (“volatility arbitrage”). They investigate these factors in the


US and Japanese convertible market. These factors can explain up to 54% of
the return variation of Convertible Arbitrage indices.
According to our factor model the alpha value for Convertible Arbitrage strategies is in the region of 2% p.a., and the model explains around 65% of the variation of returns. However, we observe for the more recent period that the RFS model outperforms the HFRI Convertible Arbitrage strategy slightly, with significantly less volatility, as shown in Fig. 14. The outperformance becomes even more striking when considering the investable HFRX index.

Global Macro

Global Macro managers of all types do better in strong bond markets, as


indicated by the strong sensitivity to the bond market index shown in Table 1.
Other exposures are less obvious: exposure to the risks characteristic of trend-following strategies (the sGFI factor) and some non-linear exposure to the broad equity market (convertible bond factor).
The R2 value for the regression of Global Macro comes out relatively low
(50%). We assume this is due to the heterogeneity of the strategy. Global
Macro trading includes a wide range of different trading approaches, and a
broad index does not reflect this diversity. A manager-based analysis would
be more appropriate here. More than a broad asset class based index or a
generic trading strategy, it is the particular markets traded by the individual
manager and his particular investment techniques that define the available
risk premia and inefficiencies targeted. However, note that our model gives an alpha value of around 3% p.a. for the average Global Macro manager. This is correspondingly reflected in Fig. 15, showing an underperformance of the RFS of around 3-4% p.a. But again, the investable version underperforms the RFS.
[Chart: monthly returns and cumulative index levels for the RFS, HFRI and HFRX Global Macro series over 2003-2005; axis labels omitted]
Fig. 15. Returns (monthly and cumulated) of the non-investable HFRI Global Macro Index and the investable HFRX Global Macro Index (in light color) vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details)

Managed Futures

Managed Futures hedge funds are the main speculative agents in the global
futures markets, thus capturing what we referred to as the “commodity hedg-
ing demand premium.” A simple trend following trading rule (sGFII) applied
to the major global futures markets captures a large part of these returns and
shows up as the most dominant term in the regression in Table 1. Several
different studies have independently obtained this result.44 The sGFII index
is designed to model the return of trend following strategies with a simple rule
based momentum approach. It is a volatility weighted combination of trend
following strategies on 25 liquid futures contracts on commodities, bonds, and
currencies. This index shows a 48% correlation with the CISDM trend follow-
ing index, and equally a 48% correlation with the CSFB/Tremont index. Based
on the regression in Table 1 the average CTA in the CISDM Trendfollower
index displays negative alpha. Schneeweis/Spurgin and Jensen and Rotenberg
(Bridgewater) use similar trend following indicators on a much more restricted
set of contracts45 . They obtain an even higher correlation coefficient to the
CSFB-Tremont Managed Futures index (71% in the case of Bridgewater) or
the CISDM Managed Futures Indices (79% against the CISDM Trend follow-
ing index for Schneeweis/Spurgin). The lower correlation of the sGFII index
is possibly due to its comparably high exposure to commodity contracts compared to Bridgewater's and Schneeweis/Spurgin's models (which overweight financial futures contracts).

44
See L. Jaeger et al., “Case study: The sGFI Futures Index” (Summer 2002);
Jensen, G., Rotenberg, J., "Hedge Funds Selling Beta as Alpha" (2003);
R. Spurgin, “A Benchmark on Commodity Trading Advisor Performance” (1999).
45
T. Schneeweis and R. Spurgin, “Multifactor Analysis of Hedge Funds, Managed
Futures, and Mutual Fund Returns and Risk Characteristics” (1998); G. Jensen
and J. Rotenberg “Hedge Funds Selling Beta as Alpha” (2003).
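To illustrate what such a simple rule-based momentum approach looks like in its most stripped-down form, the sketch below (Python; a single simulated contract with a generic sign-of-trailing-return rule, not the actual volatility-weighted 25-contract sGFII construction) goes long when the price is above its level a fixed number of days ago and short otherwise.

    import numpy as np

    def momentum_signal(prices, lookback=60):
        """+1 if today's price is above its level `lookback` days ago, else -1:
        the simplest form of a rule-based trend-following indicator."""
        prices = np.asarray(prices, dtype=float)
        sig = np.zeros(len(prices))
        sig[lookback:] = np.sign(prices[lookback:] - prices[:-lookback])
        return sig

    def trend_following_returns(prices, lookback=60):
        """Daily strategy returns: yesterday's signal applied to today's price
        change (long when trending up, short when trending down)."""
        prices = np.asarray(prices, dtype=float)
        rets = np.diff(prices) / prices[:-1]
        sig = momentum_signal(prices, lookback)[:-1]
        return sig * rets

    # illustrative: a drifting random walk stands in for one futures contract
    rng = np.random.default_rng(7)
    path = 100 * np.exp(np.cumsum(rng.normal(0.0004, 0.01, 1000)))
    strat = trend_following_returns(path)
    print("annualised strategy return: %.1f%%" % (100 * 252 * strat.mean()))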
[Chart: monthly returns and cumulative index levels for the RFS, the CISDM Managed Futures Qualified Universe Index and the S&P Managed Futures Index over 2003-2005; axis labels omitted]
Fig. 16. Returns (monthly and cumulated) of the non-investable (!) CISDM Managed Futures Qualified Universe Index (in grey color) vs. the RFS cumulative return (in dark color) based on the factor returns (see text for details)

An interesting model for trend-following strategies was proposed by Fung


and Hsieh. They constructed their trend-following factor using look back
straddle payout profiles on 26 liquid global futures contracts and the cor-
responding options (across equities, bonds, currencies and commodities). A
look back straddle pays the difference between the highest and lowest price
of the reference asset in the period of time until maturity of the option, mim-
icking the payout of a trend-follower with perfect foresight. The degree of
explanatory power of their model is around R2 =48%, higher than all three
models described above.
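A minimal illustration of the lookback straddle payout just described (Python; the price path and holding period are purely illustrative):

    import numpy as np

    def lookback_straddle_payoff(prices):
        """Payout of a lookback straddle held to maturity: the difference between
        the highest and lowest price observed over the life of the position,
        i.e. the payout of a trend-follower with perfect foresight."""
        prices = np.asarray(prices, dtype=float)
        return prices.max() - prices.min()

    # one month of daily futures prices following a random walk
    rng = np.random.default_rng(42)
    path = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, 21)))
    print("high-low range captured:", round(lookback_straddle_payoff(path), 2))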
Note that the Managed Futures strategy is the only hedge fund sector
which displays negative alpha (albeit not at a statistically significant level).
We can observe the corresponding performance pattern of CTAs compared to
the RFS in Fig. 16: The performance of the RFS and the average CTA in the
CISDM Managed Futures Qualified Universe Futures Index are very well in
line, while the investable S&P Managed Futures index underperforms both
by a significant margin.

7 The Future of Alpha

There is good reason to believe that generally the average alpha extracted by
hedge fund managers is destined to decline. As a matter of fact, we can al-
ready today observe that alpha has grown smaller over time, as Fig. 17 indicates for the most obvious strategy, Long/Short Equity, where we display the alpha of a rolling regression over a 60-month time window. Independently of our research, the attenuation of alpha has been observed elsewhere; Fung et al. report the same phenomenon in one of their more recent research papers.46 One possible explanation for this phenomenon comes quickly to mind: As more

46
W. Fung, D. Hsieh, N. Naik, T. Ramadorai, “Hedge Fund: Performance, Risk and
Capital Formation”, Preprint (2005)
[Chart: rolling monthly alpha for Long/Short Equity, Jan-00 to Jul-05, y-axis 0.00% to 1.20%; average alpha: 0.56%]
Fig. 17. The development of alpha for Long/Short Equity funds (HFR sub-index) based on a rolling regression over a 60 month time window. The risk factors were chosen as in Table 1

money chases a limited number of market inefficiencies, those inefficiencies should decrease or even disappear entirely. In other words, the capacity for alpha is limited. However, there is no good reason to believe that the global "capacity for alpha", which is ultimately a function of how many inefficiencies the average global investor (and the corresponding regulatory agencies) will tolerate, has actually decreased that dramatically over time. While hedge funds grow strongly and possibly have to compete harder with other "alpha chasers", they remain a rather small portion of global investment activity. Another, parallel explanation for the displayed decrease in alpha is the quality of the average hedge fund manager. The number of managers has multiplied in recent years, and it is reasonable to assume that today's low entry barriers to starting a hedge fund attract numerous managers with a lower level of skill. These tend to dilute the average performance and thus the average alpha of the entire hedge fund industry. An interesting research topic, which we leave for future efforts, is to test for the average alpha in the top percentile of managers.
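The rolling-regression construction behind Fig. 17 can be sketched in a few lines (Python with numpy; the fund and factor series below are synthetic stand-ins for the HFR sub-index and the Table 1 risk factors, with a deliberately decaying alpha built in):

    import numpy as np

    def rolling_alpha(fund, factors, window=60):
        """Intercept (alpha) of an OLS regression of fund returns on factor
        returns, re-estimated over a rolling window of `window` months."""
        T = len(fund)
        alphas = []
        for t in range(window, T + 1):
            y = fund[t - window:t]
            X = np.column_stack([np.ones(window), factors[t - window:t]])
            coef, *_ = np.linalg.lstsq(X, y, rcond=None)
            alphas.append(coef[0])
        return np.array(alphas)

    # synthetic data: one equity factor and a skill component shrinking over time
    rng = np.random.default_rng(1)
    T = 180
    factor = rng.normal(0.005, 0.04, (T, 1))
    true_alpha = np.linspace(0.010, 0.003, T)
    fund = true_alpha + 0.5 * factor[:, 0] + rng.normal(0, 0.01, T)

    series = rolling_alpha(fund, factor)
    print("first window alpha: %.3f%%   last window alpha: %.3f%%"
          % (100 * series[0], 100 * series[-1]))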
Will the “alpha” in hedge funds disappear entirely? Probably not, but it
will become harder to identify and isolate it in the growing jungle of hedge
funds. However, we have seen that alpha constitutes a statistically significant
variable (though decreasing over time) in most of our regression models. We
might be missing explanatory variables in our models, and future modeling
effort will hopefully lead us to better models to answer this question.
Another approach is to model the behavior of the alpha output of our mod-
els in changing market conditions as well as over time. Alpha might depend on
market related variables other than prices which are not so easily captured in

our risk based models, such as trading volume, open short interest on stocks, insider activity, leverage financing policies of prime brokers, etc. A direct dependency of the hedge fund managers' alpha creation on these variables would lead us to a better understanding of the time variability that we empirically observe in our models. This would ultimately lead us to an understanding of the very alpha creation process of hedge funds, the part of hedge fund returns which still remains in the dark for most investors. However, little effort has been put into this task so far.
The main task of the investor will be to define what he wants from hedge
funds. Alpha is and will continue to be ultimately the most attractive sort of
return, as it comes with no systematic risk and no correlation to other asset
classes. But investors should realize both the scarcity of true alpha and the
power of alternative beta. It is the power of diversification into orthogonal
risk factors which will ensure that hedge funds remain broadly attractive for
investors. And when it comes to the hedge funds' beta, there is surely a great deal more capacity available to investors than in the case of alpha. In fact, the future growth prospects of the hedge fund industry look quite compelling considering that we are far from any limit with respect to "beta capacity". While the search for alpha surely remains attractive, we believe it is investment in alternative betas which will more and more be the key to successful hedge fund investing in the future.

8 The Future of Hedge Fund Capacity


Now that we are in a position to provide a rough breakdown of hedge funds' return sources, we can approach a question which lies at the heart of future hedge fund growth: the issue of capacity. For this purpose we perform a set of rather simple calculations:47 We know that the global market capitalization of all public stocks and debt is around 88'000 billion USD (about 51'300 billion USD in bonds, 36'700 billion USD in equity).48 Generating alpha in the global capital markets is an overall zero sum game, i.e. if hedge fund managers win this game, i.e. generate positive alpha, there must be other market participants on the losing end. We must thus assume an average tolerance level for inefficiencies, i.e. negative alpha, by equity and bond investors worldwide before competitive (or regulatory) forces step in to keep this number from getting larger. We estimate this number to be in the range of 0.25% p.a. on average
across all equity and bonds investors.49 With this number we can calculate
47
Note that this calculation is very similar in spirit and takes some of its concepts
from the work of H. Till, “The capacity implications of the search of alpha”
(2004).
48
Source: www.fibv.com/publications/Focus0605.pdf and
http://www.imf.org/external/pubs/ft/GFSR/2005/01/index.htm.
49
H. Till uses another number but aggregates the overall size of the market only
over the holdings of HNWI, mutual funds and institutional funds. Considering
our base number of 88000 billion USD the assumptions are rather similar.

the overall alpha in the global equity and bond market to be USD 220 billion. We must further assume that hedge funds can participate in this "alpha pie" only to a certain extent, next to other professional players which are likely to be "positive alpha players" and thus compete with hedge funds for alpha (proprietary trading operations, large institutions, mutual funds – before their fees, etc.). It seems realistic to assume that hedge funds can take one fourth of that pie50 (a proportion which might grow larger over time, however, as more players from the other "alpha parties" move into the hedge fund space). This implies that there are USD 55 billion of pure alpha available to hedge funds each year. Further, we assume that hedge fund investors require at least a 15% p.a. return gross of fees (before management, performance, trading fees, etc.), which amounts to a net return of around 8%-10% and is probably the minimum investors would require from hedge funds. This implies an overall capacity of hedge funds based on alpha only of

USD 55 billion/0.15 = 366.6 billion USD,
about one third of the actual size of assets in the hedge fund industry. Even
with different, more beneficial assumptions on the overall investor tolerance
for inefficiencies and on how much hedge funds can participate in the total
“alpha pie”51 , we would not come up with a capacity significantly higher than
the current size of the industry. As a result, based on the assumption of inefficiencies in the global capital markets alone, we are not just lacking a satisfying economic explanation of hedge fund return sources, we also find ourselves unable to explain the current size of the industry!
But by now we understand that a large portion of hedge fund returns
is not related to pure alpha, but rather to “alternative beta.” The analysis
in our research suggests that a large part of the average hedge fund return
stems from alternative beta rather than alpha. We now consider our estimate
for that part to be as high as 80%. This raises the bar for hedge fund capacity significantly. Going along with our conclusion and estimating that only 20% of industry returns are related to pure alpha, we can calculate the capacity of the industry to be
366.6 billion USD/0.2 = 1’833 billion USD,
about twice its current size. However, as large as this number seems, it is exceeded by some of the estimates given by industry protagonists as to the level to which the industry will grow within the coming years. How can this growth be managed considering our numbers? The answer is obvious: only by including a larger share of alternative beta in the overall return scheme of hedge funds. Assuming that the ratio of alpha to alternative beta falls to 10%, the capacity reaches 3'670 billion USD (assuming that the capacity of alternative beta is not limited at these levels, a fair assumption in our view).

50
The reader is invited to perform the calculation with different numbers.
51
The reader may use his own set of assumptions.
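The capacity arithmetic above can be condensed into a few lines (Python; the inputs are exactly the assumptions stated in the text, and, as the footnotes suggest, the reader may substitute his own numbers):

    # back-of-the-envelope hedge fund capacity calculation from the text
    market_cap = 88_000e9            # global public stocks and bonds, USD
    inefficiency_tolerance = 0.0025  # negative alpha tolerated by investors p.a.
    hedge_fund_share = 0.25          # share of the "alpha pie" going to hedge funds
    required_gross_return = 0.15     # minimum gross return demanded by investors

    alpha_pie = market_cap * inefficiency_tolerance          # ~220 bn USD
    hf_alpha = alpha_pie * hedge_fund_share                  # ~55 bn USD p.a.
    capacity_alpha_only = hf_alpha / required_gross_return   # ~367 bn USD

    for alpha_fraction in (1.0, 0.2, 0.1):   # share of returns that is pure alpha
        capacity = capacity_alpha_only / alpha_fraction
        print(f"alpha share {alpha_fraction:>4.0%}: capacity ~ {capacity / 1e9:,.0f} bn USD")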

Summarizing, there is indeed plenty of room for the hedge fund industry
to grow, albeit only at the expense of becoming more and more beta driven.
This development will inevitably occur with the future growth of hedge funds.
As a matter of fact, recent performance suggests that this process has already
started.

9 Summary and Conclusion


The key to the hedge fund ‘black box’ is the understanding that hedge funds
generate returns primarily through risk premia and only secondarily by ex-
ploiting inefficiencies in imperfect markets. Conceptually hedge funds are
therefore nothing really new in that just as an equity mutual fund extracts
the equity risk premium, a hedge fund may try to extract various other risk
premia awarded for, say, credit risk, interest rate risk or liquidity risk. The im-
portant difference however is, that the underlying risk premia are more diverse
than those in traditional asset classes (which led us to refer to these premia as
“alternative betas”). This insight is slowly spreading among the most sophis-
ticated circles in the hedge fund industry. The underlying systematic risks can
be readily analyzed and understood by investors, while the remaining parts
of returns from inefficiencies are more difficult to describe in an unambiguous
way. The risk premia available to hedge fund managers are the same as those
available to other investors. However, extracting those premia in markets un-
familiar to most investors requires special expertise. Like the mining engineer
who can profitably extract gold from low-grade ore that would previously
have been left in the ground, skilled fund managers are simply more efficient
in identifying existing risk premia, and trading with minimal undesired risk
exposure and transaction costs to extract them.
One of the pitfalls of hedge funds is that alpha and beta currently do not come separately but in an uncontrolled and perhaps undesired combina-
tion. Traditional portfolio management has developed a setting, which could
equally be applicable for hedge fund investors: the “core-satellite” framework.
Here, alpha generation and beta extraction are well separated - and very dif-
ferently compensated. We believe hedge fund investors will want to walk down
the same road. Hedge fund product providers might have to find a way to iso-
late and extract the alpha from the beta in hedge funds. This is the idea of
“portable alpha”: Isolate alpha in one asset class and transfer it into the port-
folio consisting of other types of assets. If a fund manager claims to produce alpha, why not take out the beta part of his returns with an active hedging overlay approach and keep only the alpha? A recent paper by W. Fung and D. Hsieh52 provides some interesting insights into a possible implementation
of that idea and also gives some useful estimates about size and distributional
properties of the “alpha returns” for Long Short Equity strategies.
52
W. Fung, D. Hsieh “Extracting Portable Alpha from Equity Long/Short Hedge
Funds,” Journal of Investment Management (2004).

Currently available indices or benchmarks which rely on manager and peer


group averages do not necessarily provide a sufficiently accurate picture of
the industry or strategy sector performance due to various well known biases.
The situation does not become much better when the indices are designed
to be investable. At the same time, the demand and necessity of hedge fund
indices for the purpose of measuring manager performance, classifying invest-
ment styles, and generally creating a higher degree of transparency is high
and increasing. Some index providers actually claim that funds of funds have
started to invest in investable indices to gain the desired exposure. While
the authors are not aware of such behavior, they can surely not exclude that
some of the less sophisticated funds of funds have bought the marketing story of the index providers. But if we acknowledge that the currently existing investable indices are not a valid choice, what can we do? One way suggested in this article is to create synthetic benchmarks based on the factor exposure of hedge fund strategies to the underlying risk factors. This could potentially be a much better choice for funds of funds and other investors to gain the desired broad exposure to the hedge fund styles. At the same time these replicating factor strategies (RFS) can serve funds of funds as a benchmarking tool to judge the performance, or more precisely the alpha generation, of
their managers. First results described here and elsewhere look promising for
some strategy sectors. However, a great deal of work remains to be done for
other strategies. We observe that a corresponding replication of hedge fund
indices by “replicating factor strategies” (RFS) lives up to the returns of the
(non-investable) hedge fund strategy sector indices for some strategy sectors,
in particular Long/Short Equity, Merger Arbitrage, Managed Futures, and
Convertible Arbitrage. These strategies make up significantly more than 50%
of the assets allocated to hedge funds! But as we emphasized in this article,
these non-investable indices are actually not a good measure of the hedge fund returns that an investor would actually obtain on average, but overestimate them significantly. In contrast to the non-investable hedge fund indices, the RFS can be made investable without impacting their returns. When we
compare the returns of the RFS with the corresponding version of the in-
vestable indices, their outperformance becomes even more striking: The RFS
actually outperform the entire range of investable indices by a safe margin
with the one exception of the Distressed sector. One must wonder why this
is so. The flippant but accurate answer is: fees. Taking out an average of 2%
management fees and a share of 20% performance fees for the single hedge fund manager actually eats up all, and often more, of the skill-based returns hedge fund managers offer on average. We emphasize that the last two words are important: "on average." With the inflation of new, often mediocre managers, average alpha has been coming down. However, we acknowledge that there continue to exist highly skilled hedge fund managers who generate persistent alpha even after their (hefty) fees. It remains the task of the experienced hedge fund investor or fund of funds to find and invest in them.

At the end of this report we would like to point out a further direction of research that is possibly not sufficiently covered here. Our analysis suggests that the factor loadings of hedge fund strategies are adequately modelled as stationary. However, there is good reason to believe (and recent research provides some evidence53) that sudden, structural breaks occur in the systematic risk exposures of hedge funds that cannot be modelled well enough in a linear model context. Examples are easy to find: the blow-up of LTCM in the summer of 1998, the burst of the stock market bubble in the spring of 2000, the turn in the equity market in March 2003. A closer look at Fig. 4 reveals some evidence for such breaks, which our analysis here does not account for. In order to model hedge fund exposure during these breaks, which occur in extreme market environments, we need non-linear exposure models. We will leave this topic for future research.
Generally, the recent progress in understanding the generic sources of hedge fund returns leads us to the conclusion that investable benchmarks constructed by a joint venture of financial engineers and quant groups, based on risk factor analysis and replication, have the potential to offer a valid, theoretically more sound, and cheaper alternative to the hedge fund index products offered today. It is evident that once these indices become more broadly recognized, the hedge fund industry will be turned upside down. This will have some further important consequences for how hedge funds are categorized by investors. So far, most consider them a separate asset class. Realizing that, with regard to their exposure to systematic risk factors, hedge funds are conceptually not that different from traditional types of investments, investors may find it easier to integrate them into their overall asset allocation.

References
[1] Agarwal, V., Naik, N., “Performance Evaluation of Hedge Funds with
Option-Based and Buy-and-Hold Strategies,” Working paper (2001),
published under the title: “Risks and Portfolio Decisions involving Hedge
Funds”, Review of Financial Studies, 17, p. 63 (2004)
[2] Agarwal, V., Fung, W., Loon, Y., Naik, N., “Risks in Hedge Fund Strate-
gies: Case of Convertible Arbitrage,” Working Paper, London Business
School (2004)
[3] Amin, G., Kat, H., “Welcome to the Dark Side: Hedge Fund Attrition
and Survivorship Bias over the Period 1994-2001,” Journal of Alternative
Investments, (Summer 2003)
[4] Asness, C., Krail, R., Liew, J., “Do hedge funds hedge?”; Journal of
Portfolio Management, 28, 1 (Fall 2001)
53
W. Fung, D. Hsieh, N. Naik, T. Ramadorai, “Hedge Fund: Performance, Risk and
Capital Formation”, Preprint (2005)

[5] Asness, C., “An Alternative Future, I & II” Journal of Portfolio Manage-
ment, (October 2004)
[6] Bailey, J., “Are Manager Universes Acceptable Performance Bench-
marks,” The Journal of Portfolio Management, (Spring 1992).
[7] Brown, S., Goetzmann, W., Ibbotson, R., Ross, S., "Survivorship Bias in
Performance Studies", Review of Financial Studies, 5, 4 (1992)
[8] Brown, S, Goetzmann, W., Ibbotson, R., “Offshore hedge funds: Survival
and performance 1989-1995”, Journal of Business, 92 (1999)
[9] Brown, S., Goetzmann, W. (2003), “Hedge Funds With Style,” Journal
of Portfolio Management, 29, 101-112
[10] Capocci, D., Hübner, G., “Analysis of hedge fund performance,” Journal
of Empirical Finance, 11 (2004)
[11] Crowder, G., “Hedge Fund Indices”, Journal of Alternative Investments,
(Summer 2001)
[12] Fama, E., French, K., “Common risk factors in the return of stocks and
bonds”, Journal of Financial Economics, 33 (1993)
[13] Fama, E., French, K., “Multifactor explanations of Asset Pricing Anoma-
lies”, Journal of Finance, 51, 55 (1996)
[14] Fung, W., Hsieh, D., Naik, N., Ramadorai, T., "Hedge Funds: Performance,
Risk and Capital Formation" (July 19, 2006). AFA 2007 Chicago
Meetings Paper, available at SSRN: http://ssrn.com/abstract=778124
[15] Fung, W., Hsieh, D., “Empirical Characteristics of Dynamic Trading
Strategies: The Case of Hedge Funds”, The Review of Financial Stud-
ies, 10, 2 (1997)
[16] Fung, W., Hsieh, D., "The Risk in Hedge Fund Strategies: Theory and
Evidence from Trend-Followers," The Review of Financial Studies, 14, 2,
p. 313 (2001)
[17] Fung, W., Hsieh, D., “Benchmarks of Hedge Fund Performance: Infor-
mation Content and Measurement Biases,” Financial Analyst Journal
(2001).
[18] Fung, W., Hsieh, D., “The Risk in Fixed Income Hedge Fund Styles”,
Journal of Fixed Income, 12, 2 (2002)
[19] Fung, W., Hsieh, D., “The Risk in Hedge Fund Strategies: Alternative
Alphas and Alternative Betas” in L. Jaeger (ed.), “The new genera-
tion of risk management for hedge funds and private equity investment”,
Euromoney (2003);
[20] Fung, W., Hsieh, D., “Hedge Fund Benchmarks: A Risk Based
Approach”, Working Paper (2004)
[21] Fung, W., Hsieh, D., “The Risk in Long/Short Equity Hedge Funds”,
Working Paper, London Business School, Duke University (2004)
[22] Fung, W., Hsieh, D., “Extracting Portable Alpha from Equity Long/
Short Hedge Funds” (2004), Journal of Investment Management, 2, 4,
1-19

[23] Getmansky, M., Lo, A. W., Makarov, I., "An Econometric Model of Serial
Correlation and Illiquidity in Hedge Fund Returns", Journal of Financial
Economics, 74 (3), 529-610 (2004)
[24] Jaeger, L., "Through the Alpha Smoke Screens: A Guide to Hedge Fund
Return Sources", Euromoney Institutional Investors (2005)
[25] Jaeger, L., “Hedge Fund Indices – A new way to invest in absolute returns
strategies?”, AIMA Newsletter (June 2004)
[26] Jaeger, L. (ed.), “The new generation of risk management for hedge funds
and private equity investment”, Euromoney (2003)
[27] Jaeger, L., “Managing Risk in Alternative Investment Strategies”,
Financial Times/Prentice Hall (May 2002)
[28] Jaeger, Lars, “Sources of Return for Hedge funds and Managed Futures”,
The Capital Guide to Hedge Funds 2003, ISIPublications (Nov. 2002)
[29] Jaeger, L., Jacquemai, M., Cittadini, P., “Case study: The sGFI Futures
Index,” The Journal of Alternative Investment (Summer 2002).
[30] Jensen, J., Rotenberg, J., “Hedge Funds Selling Beta as Alpha”
Bridgewater (2003, updated 2004 and 2005)
[31] Kohler, A., “Hedge Fund Indexing: A square Peg in a round hole”, State
Street Global Advisors (2003).
[32] Malkiel, B., Saha, A., “Hedge Funds: Risk and Return”, Working Paper
(2004)
[33] Mitchell, M., Pulvino, T., "Characteristics of Risk in Risk Arbitrage", Jour-
nal of Finance, 56, 6, 2135 (2001)
[34] Schneeweis, T., “A Review of Alternative Hedge Fund Indices.”
Schneeweis Partners (2001);
[35] Steven A. Schoenfeld, “Active Index Investing”, Wiley, New York (2004)
[36] Spurgin, R., “A Benchmark on Commodity Trading Advisor Perfor-
mance”, Journal of Alternative Investments (Fall 1999)
[37] Sharpe, W., “Asset Allocation: Management style and performance mea-
surement”, Journal of Portfolio Management, 2, 18 (Winter 1991)
[38] Till, H., “The capacity implications of the search of alpha”, AIMA
Newsletter (2004)
Asset Securitisation as a Profits Management
Instrument

Markus Schmidtchen

KfW Bankengruppe, Frankfurt, Germany, markus.schmidtchen@kfw.de†

1 Introduction
The credit derivatives market has enjoyed a strong growth in liquidity for
loan products and credit derivatives for some years now. A growing number
of players populate both the supply and demand sides of the market, leading
to product diversification on the one hand and a broadening of demand on the
other. Rapid growth has been observed, in particular, in the single name credit
default swap market, through which institutions can hedge against credit de-
fault by large, well-known enterprises.
However, the instrument that can be used to hedge against credit default
by small and medium-sized enterprises (SMEs) is portfolio securitisation,
through which loans or loan default risk are transferred as a package. One
of the reasons for making a bundled transfer is the limited exposure to indi-
vidual SMEs.
In parallel with the development in the capital markets, many credit in-
stitutions are re-organising their credit risk management units. This is being
driven, on the one hand, by the regulatory demands in credit risk measure-
ment under Basel II. On the other hand, greater capital market liquidity is
making loan products more mobile. Banks are conducting increasingly active
credit portfolio management. For example, the capital market may be used to
deliberately increase credit exposure by means of an investment or to delib-
erately reduce risk by loan securitisation. This enables a bank to optimise its
credit portfolio, which in turn has a positive impact on its profits position.
Even if there are usually a number of different reasons for portfolio se-
curitisation, it is clear that economically a transaction is appropriate when
the economic capital released by risk reduction and reinvested in new lending
business generates enough profits to cover the cost of the securitisation.


This article presents the author’s opinion only and not an official statement of
KfW views.

This article compares the implications of different securitisation strategies for both the bank's overall portfolio risk and its return on capital. It shows
that under specific assumptions, a securitisation strategy in which both the
first loss position and the senior tranche are retained by the placing institution
turns out to be optimal.
This optimal securitisation strategy is discussed below, taking the example
of an SME bank. First, the risk situation of the bank before securitisation is
presented. Then both the risk effects and the profits effects of complete risk
placement and optimal risk placement are measured and compared.
The calculations on which the results are based were derived by applying
a Monte Carlo model that is well established in the capital market.1
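By way of illustration, the sketch below (Python) implements a deliberately simplified one-factor Gaussian copula Monte Carlo on a homogeneous pool, using the parameters described in this section and in footnote 1 (0.8% annual default probability, 50% recovery, roughly 8% asset correlation). It is not the model actually used for the calculations in this article, merely a sketch of the general technique; with these inputs it reproduces the expected loss of about 0.4%, while a stable estimate of the 99.97% quantile would require many more scenarios and the bank's actual, non-homogeneous portfolio.

    import numpy as np
    from statistics import NormalDist

    def simulate_loss_distribution(n_loans=1000, pd=0.008, recovery=0.5,
                                   asset_corr=0.08, n_scenarios=50_000, seed=0):
        """One-factor Gaussian copula Monte Carlo for a homogeneous loan pool
        (equal exposures, identical default probability, recovery and asset
        correlation). Returns simulated one-year losses as a fraction of the
        total pool volume."""
        rng = np.random.default_rng(seed)
        threshold = NormalDist().inv_cdf(pd)       # asset-value default threshold
        losses = np.empty(n_scenarios)
        for s in range(n_scenarios):
            m = rng.standard_normal()              # systematic (market) factor
            idio = rng.standard_normal(n_loans)    # idiosyncratic factors
            assets = np.sqrt(asset_corr) * m + np.sqrt(1.0 - asset_corr) * idio
            losses[s] = (assets < threshold).mean() * (1.0 - recovery)
        return losses

    losses = simulate_loss_distribution()
    print("expected loss:   %.2f%% of pool volume" % (100 * losses.mean()))
    print("99.97%% quantile: %.2f%% of pool volume" % (100 * np.quantile(losses, 0.9997)))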

2 Situation Before Securitisation


The analysis takes as its starting point a credit institution whose loan portfolio
loss distribution before securitisation is shown in Fig. 1. It is assumed that this
bank has a loan portfolio of 2.1 billion Euros. The portfolio has an average credit rating of "Ba1", corresponding to a default probability of 0.8% per annum, which is typical for SMEs, and is 50% secured. These portfolio

[Chart: loss distribution of the bank's loan portfolio, probability of loss severity vs. loss severity (0.0%-4.5%), with the expected loss and the unexpected loss resp. capital commitment marked]
Pool characteristics: Volume 2.1 bn; Avg. rating Ba1; Avg. recovery rate 50%; Exp. loss (1 year) 0.4%
Capital utilisation before securitisation: Volume 2,110,000,000; Expected loss 0.4% (9,114,585); 99.97% quantile 3.9% (82,290,000); Capital commitment 3.5% (73,175,415)
Fig. 1. The bank's loss distribution before securitisation (one-year horizon)

1
The Monte Carlo model is based on a Gaussian copula function and simulates the loss distribution of the credit institution's loan portfolio, thereby depicting the bank's risk situation. To derive this loss distribution, it is assumed that the bank expects an average asset correlation within the loan portfolio of around 8%. This value is at the lower limit of the correlation assumptions used by Basel II to derive the risk weightings for small and medium-sized enterprises.

characteristics produce an expected loss of 0.4% per annum. The unexpected


loss or the economic risk that corresponds to the loan portfolio is derived,
however, from the targeted solvency level and the composition of the bank’s
portfolio. The targeted solvency level is first set at 99.97%, which corresponds to an "Aa" probability of default.
Figure 1 shows the loss distribution of the loan portfolio before securiti-
sation. This distribution attributes a probability of occurrence (ordinate) to
each potential loss amount (abscissa). The mean of this distribution, which
determines the value of the expected loss, is around 9.1 million Euros, i.e.
0.4% of the pool volume. The 99.97% quantile of the distribution is around
82 million Euros or 3.9% of the loan volume.
The 99.97% quantile means that the bank needs to secure the loan portfolio
with a total of 3.9% capital in order to achieve the targeted credit standing.
As is customary in banking practice, it is assumed that this capital backing
comprises 0.4 percentage point standard risk costs and 3.5 percentage points
economic capital.
It is assumed that the standard risk costs are fully included in the credit
margins. This ensures that the expected loss on the loan portfolio, which
corresponds to the standard risk costs, is borne by future margin income.
By contrast, the economic capital must be held available by the credit
institution. It is used to ensure that the institution remains solvent if unex-
pectedly high losses are incurred. For further analysis it is assumed that the
credit institution requires 11% per annum return on the economic capital and
is invariably in a position to enforce the required credit margins.

3 Securitisation Pool
In order to measure the effects of a securitisation, it is presumed that the credit
institution extracts a sub-portfolio worth 350 million Euros from its existing
loan portfolio with the intention of securitising this sub-portfolio synthetically
over a period of 5 years.
The randomly selected portfolio also has an average credit rating of “Ba1”
and 50% collateralisation. The loss distribution over the securitisation period
for this portfolio and the ensuing tranching are shown in Fig. 2.
The table in Fig. 2 shows that the pool has been subdivided into seven
tranches. The first loss piece (FLP) accounts for 2.35% of the volume. The
“Aaa” tranche accounts for 89% and is thus by far the largest securitisation
tranche.
In order to obtain as clear a distinction as possible between the placing of expected and unexpected losses when analysing the securitisation effects, the credit quality of the next tranche above the FLP is set at a very low "B3" rating. This ensures that the FLP consists mainly of expected losses. This can be seen, for example, from the ratio of the expected losses of the securitisation pool over 5 years (1.8%) to the size of the FLP (2.35%), which is around 76%.
[Chart: loss distribution of the securitisation pool, probability of loss severity vs. loss severity (0.0%-9.0%)]
Pool characteristics: Volume 350 m; Avg. rating Ba1; Avg. recovery rate 50%; Exp. loss (5 years) 1.80%
Capital structure:
Rating        FLP      B3      Ba2     Baa2    A2      Aa2     Aaa
Volume        2.35%    1.15%   1.50%   1.75%   2.00%   2.25%   89.00%
Spread p.a.   25.00%   8.00%   2.50%   0.75%   0.50%   0.32%   0.10%

Fig. 2. Loss distribution of the securitisation pool (five-year horizon)

In addition to the sizes of the individual tranches, the table also shows
the assumptions with regard to the spreads that the institution carrying out
the securitisation has to pay to the capital market investors for assuming the
risk. The spreads for categories “Aaa” to “Ba2” are based on observed SME
securitisations. The price for the FLP and the "B3" tranche has been selected in such a way that, considering the expected losses of these tranches, the investor achieves a return on the investment of around 12% for the FLP and around 5% for the "B3" tranche. Similar prices can currently also be observed in the capital market.
It is also expected that the securitisation generates transaction costs to-
talling some 1.3 million Euros. These costs include payments to the arranger, the rating agency, lawyers, etc., some of which have to be paid upfront while others are running fees.

4 Effects of Full Portfolio Securitisation

The impact on the bank’s portfolio loss distribution after placing the entire se-
curitisation pool is shown in Table 1. For the purpose of a comparative static analysis, the risk situation before securitisation is also shown in Table 1.
The table shows that after securitisation the bank’s total risk exposure is
reduced by the amount of the placed volume. In addition, the relative expected
loss increases marginally and the 99.97% quantile is now 4% of the remaining
volume. Overall, this is accompanied by a 0.08 percentage point increase in
the relative economic capital commitment.
The slight increase in the relative economic capital can be attributed to
the fact that, first, the quality of the loan portfolio retained by the bank has

Table 1. The bank's risk situation in case of full securitisation

Capital utilisation (1-year view)
                         Volume           Expected loss          99.97% quantile         Capital commitment
Before securitisation    2,110,000,000    0.43%   9,114,585      3.90%   82,290,000      3.47%   73,175,415
After securitisation     1,760,000,000    0.45%   7,900,933      4.00%   70,400,000      3.55%   62,499,067

worsened slightly. This can be seen from the increase in the relative expected
loss and is due to the fact that the randomly selected securitised loans have
a credit rating that is slightly above average. Second, the placing reduces
the granularity of the bank’s loan portfolio, leading to an increase in the
probability of extreme losses. More economic capital must be retained to cover
this possibility. While the relative moments of the loss distribution increase
slightly, both the absolute expected loss and the absolute capital commitment
are obviously reduced.
The impact of placing the entire risk on the bank’s returns can be seen in
the simplified income statement presented below (see Table 2). The income
statement lists the total profit and expenditure at the securitising institution
over the entire securitisation transaction period. This approach is needed to
illustrate the profitability of a securitisation covering several periods in full.2
The expenses generated by this securitisation strategy comprise transac-
tion costs and capital market costs, which cover payments to the investors.
On the profits side, the institution can record the reduction in expected
loss. It was assumed that the expected loss, i.e. the standard risk costs, is part of the credit margin. Owing to the placing of the expected loss, this flow of payments to the bank can now be considered entirely as profits.
Further profits from securitisation are derived from freeing up economic capital. This item is calculated as interest on the released economic capital (roughly 10.7 million Euros) paid at the target return on economic capital (11%) over the entire transaction period. If the income and expenditure sides are netted against each other, total profits are roughly −3.6 million Euros. Consequently, this securitisation strategy is not economical under the given assumptions.

2
In order to quantify the intertemporal effects of securitisation, some simplifying
assumptions have been made. For example, it has been assumed that the risk
reduction effects that are presented in Table 1 and that are calculated for the first
year are also valid in the subsequent years. In addition, these future amounts are
not discounted.

Table 2. Profits on full risk placement^a

Income statement (5-year view)
Profits                                        Expenditure
Reduction in expected loss       6,068,262     Transaction costs                     1,298,463
Economic capital release         5,871,991     Capital market costs
                                                 Junior (FLP)                        9,906,372
                                                 Mezzanine (B3 to Aa2)               2,796,938
                                                 Senior (Aaa)                        1,557,500
Total                           11,940,253     Total                                15,559,272
Total profits                   −3,619,019
a
The individual amounts on the expenditure side are generated as follows. The transaction costs have been set by the author and are based on currently applicable values. The costs of the junior tranche are derived on the assumption that the investor bears the full expected loss on the FLP, which is roughly 5.4 million Euros, and is fully compensated for this by the margin. In addition, he receives a 12% per annum return on his investment.
The costs of the remainder of the capital structure are derived by multiplying the nominal volume of the individual tranches by the respective spreads and the transaction period. Obviously, this is another simplification, because the effects of the expected losses within the tranches on the transaction costs are neglected.
On the profits side, the reduction in the expected losses can be derived directly from Table 1. The reduction in Year 1 of some 1.2 million Euros is simply multiplied by the transaction period. Much the same applies to the profits from releasing economic capital; the reduction in Year 1 is also multiplied by the transaction period. Then this amount is multiplied by the required return on economic capital of 11%.
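The capital market costs on the expenditure side can be reproduced directly from the capital structure and spreads shown in Fig. 2 (a small sketch in Python; the FLP cost and the items on the profits side are taken as given from the text rather than recomputed):

    # nominal tranche volume x spread p.a. x 5-year transaction period,
    # for the tranches placed in the capital market (see Fig. 2)
    pool = 350e6
    years = 5
    tranches = {            # tranche: (share of pool volume, spread p.a.)
        "B3":   (0.0115, 0.0800),
        "Ba2":  (0.0150, 0.0250),
        "Baa2": (0.0175, 0.0075),
        "A2":   (0.0200, 0.0050),
        "Aa2":  (0.0225, 0.0032),
        "Aaa":  (0.8900, 0.0010),
    }
    mezzanine = sum(pool * share * spread * years
                    for name, (share, spread) in tranches.items() if name != "Aaa")
    senior = pool * tranches["Aaa"][0] * tranches["Aaa"][1] * years
    print(f"mezzanine tranches (B3 through Aa2): {mezzanine:>12,.0f} Euros")
    print(f"senior tranche (Aaa):                {senior:>12,.0f} Euros")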

5 Effects of Optimal Portfolio Securitisation

Within the model, securitisation is shown to be optimal when only the mez-
zanine part of the portfolio is issued on the capital market. The FLP and the
“Aaa” part are retained by the credit institution. The risk situation arising
from this strategy is illustrated in Table 3.
The table shows that securitisation reduces the bank’s total risk exposure
by the amount of the placed volume only. In addition, there is a marginal
reduction in the expected loss. This can be attributed to the retention of the
FLP, which accounts for by far the largest portion of the expected losses on
the tranched portfolio.
In contrast to the first moment of the loss distribution, there is a clear
nominal and relative reduction in the 99.97% quantile, with the result that
the capital commitment also decreases markedly and falls to around 64 million
Euros. This value is only slightly above the nominal capital commitment when
the securitisation pool is fully placed (see Table 1).3
3 It can be shown that this somewhat higher capital commitment can be attributed
solely to the retention of the FLP. According to the model calculations, roughly
66% of the FLP or 5.4 million Euros are expected losses, which from the bank's
perspective are not an economic risk. By contrast, the remainder of the FLP,
amounting to roughly 1.6 million Euros, is a high-risk position, almost all of
which must be deducted from the economic capital.

Table 3. The bank’s risk situation when the FLP and the “Aaa” tranche are retained

Capital utilisation (1-year view)
                         Volume          Expected loss        99.97% quantile       Capital commitment
Before securitisation    2,110,000,000   0.43%   9,114,585    3.90%   82,290,000    3.47%   73,175,415
After securitisation     2,079,725,000   0.44%   9,101,805    3.50%   72,790,375    3.06%   63,688,570

Table 4. Profits in case of the retention of the FLP and the “Aaa” tranche

Income statement (5-year view)

Profits                                    Expenditure
Reduction in expected loss       63,902    Transaction costs             1,298,463
Economic capital release      5,217,765    Capital market costs:
                                             Junior (FLP)
                                             Mezzanine (Ba3 to Aa2)      2,796,938
                                             Senior (Aaa)
Total                         5,281,667    Total                         4,095,400

Total profits                 1,186,267

Consequently, most of the economic risk has been placed with the mezzanine
tranches, and the retention of the senior tranches is not associated with
any significant risk for the bank. This result is also intuitively plausible:
the senior tranche has a higher credit rating than the confidence level
underlying the bank's VaR, and hence by the time the senior tranche defaults,
the bank itself is likely to have defaulted already.
The income statement effect, which derives from this securitisation strat-
egy, is presented in Table 4.
Compared with Table 2, the income statement expenditure in Table 4
is reduced by the capital market costs that are saved by not placing the
junior and senior parts of the portfolio. In addition to expenditure, profits also
decline. However, the profits do not decline as strongly as the expenditure.
The profits are reduced by roughly 6 million Euros by retaining the FLP,
i.e. through the expected losses that are no longer placed. By contrast, costs amounting to
some 9.9 million Euros are saved by not placing the FLP. This imbalance
can be explained, inter alia, by the fact that an FLP investor – in contrast
to the securitising institution – is invariably entitled to interest for assuming
the expected losses. In this case, this is 12% per annum. This leads to the

conclusion that in the model used here it is not economically sound to place
the FLP.
The second central result is that placing the senior tranche is not eco-
nomically sound. Retaining this tranche saves around 1.6 million Euros in
expenditure. By contrast, the profits from released economic capital fall by
comparison with full securitisation by roughly 0.7 million Euros only.
Overall, the securitisation strategy of retaining the junior and senior risks
yields positive profits of around 1.2 million Euros.
The economic success of the securitisation strategy presented here is es-
sentially driven by the bank’s targeted return on economic capital and the
targeted solvency level. Whereas the targeted return on economic capital de-
termines the opportunity costs of the use of capital, the solvency level affects
the absolute amount of capital commitment for a specific credit risk. If the
solvency level is set at the 99.97% quantile as in the calculations in the exam-
ple, the total profits from the securitisation strategy rise the more the return
on capital increases. This situation is shown by the solid line in Fig. 3.
However, if the bank's targeted solvency level falls from 99.97% to 99.90%,
which corresponds roughly to an "A" rating, the line in Fig. 3 shifts to the right.
In this case, the break-even return on economic capital is roughly 11%, as
opposed to roughly 8.5% in the previous case.
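As a rough cross-check, these break-even figures can be reproduced from Tables 3 and 4 alone: setting total profits to zero and solving for the return on economic capital gives roughly 8.5% for the 99.97% case. The following sketch (in Python, taking the released capital as the drop in capital commitment from Table 3 and assuming the 5-year horizon used above) illustrates the arithmetic; it is an illustration of the text's calculation, not part of the original model code.

    # Back-of-the-envelope check of the break-even return on economic capital,
    # using the figures from Tables 3 and 4 (99.97% solvency case).
    years = 5
    transaction_costs = 1_298_463              # Table 4, expenditure side
    mezzanine_costs = 2_796_938                # Table 4, capital market costs
    expected_loss_reduction = 63_902           # Table 4, profits side
    released_capital = 73_175_415 - 63_688_570 # Table 3: capital commitment before minus after

    # Profits = EL reduction + released capital * ROE * years - costs.
    # Setting profits to zero and solving for the return on economic capital:
    costs = transaction_costs + mezzanine_costs
    breakeven_roe = (costs - expected_loss_reduction) / (released_capital * years)
    print(f"Break-even return on economic capital: {breakeven_roe:.1%}")  # roughly 8.5%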
This result can be attributed to the fact that the lower target credit rating
leads to lower capital commitment before securitisation and hence to overall
less economic capital being released through the securitisation. Accordingly,
the return on economic capital must be higher to ensure that securitisation
makes sense in terms of profits.4

Fig. 3. Connection between return on capital, credit rating and profits. Earnings (in millions) are plotted against the return on equity (7% to 17%) for the strategy of retaining the FLP and the senior tranche, at the 99.97% and the 99.90% solvency level.

4 The same qualitative effect is achieved if the securitising bank anticipates an
average asset correlation of less than 8% (see footnote 2). In this case, too, less
economic capital would need to be retained before securitisation.

6 Conclusion

The analysis has shown that a credit institution which specialises in SMEs
can significantly affect both the risk situation and the profits situation by
using the instrument of portfolio securitisation. In the model presented above
a portfolio strategy in which only the mezzanine part of a portfolio, which
contains most of the economic risk, is securitised is shown to be particularly
effective.
Net profits from this strategy depend, inter alia, on the target credit rat-
ing of the securitising institution, which affects the absolute amount of the
economic capitalisation. The better the target credit rating, the higher the
capitalisation before securitisation and the higher the amount of economic
capital released.
In addition to the target credit rating, the return on economic capital re-
quired is a further important factor behind the profitability of a transaction
of this kind. The return on economic capital required determines the oppor-
tunity costs of the capital utilisation. Together with the absolute amount of
economic capital released, it thus determines the profits from the optimal
securitisation strategy.
The fact that the economic success of the securitisation strategy depends
on the target credit rating and the return on economic capital means, in particular,
that institutions which set themselves a high solvency level and require a
substantial return on economic capital can use the instrument of loan securi-
tisation to optimise profits.
Recent Advances in Credit Risk Management

Frances Cowell1, Borjana Racheva2, and Stefan Trück3

1 Morley Fund Management, London, England, frances.cowell@morleyfm.com
2 FinAnalytica Inc., Sofia, Bulgaria, borjana.racheva@finanalytica.com
3 School of Economics and Finance, Queensland University of Technology, Australia, strueck@efs.mq.edu.au

1 Introduction
In the last decade, the market for credit related products as well as tech-
niques for credit risk management have undergone several changes. Financial
crises and a high number of defaults during the late 1990s have stimulated
not only public interest in credit risk management, but also awareness of
its importance in today's investment environment. Also the market for credit
derivatives has exhibited impressive growth rates. Active trading of credit
derivatives only started in the mid 1990s, but since then has become one of
the most dynamic financial markets. The dynamic expansion of the market
requires new techniques and advances in credit derivative modelling, and especially
in modelling the dependence among the drivers of credit risk. Finally, the upcoming new
capital accord (Basel II) encourages banks to base their capital requirement
for credit risk on internal or external rating systems [4]. This regulatory framework,
developed under the Bank for International Settlements (BIS) and becoming effective
in 2007, aims to strengthen the risk management systems of international financial
institutions. As a result, the majority of internationally operating banks focuses on
an internal-rating based approach to determine capital requirements for their
loan or bond portfolios. Another consequence is that due to new regulatory
requirements there is an increasing demand by holders of securitisable assets
to sell or to transfer risks of their assets.
Recent research suggests that while a variety of advances have been made,
there are still several fallacies both in banks’ internal credit risk management
systems and industry wide used solutions. As [15] point out, the use of the
normal distribution for modelling the returns of assets or risk factors is not
adequate since they generally exhibit heavy tails, excess kurtosis and skewness.
All these features cannot be captured by the normal distribution. Also the
notion of correlation as the only measure of dependence between risk factors
or asset returns has recently been examined in empirical studies, for example
[7]. Using the wrong dependence structure may lead to severe underestimation
of the risk for a credit portfolio. The concept of copulas [13] allowing for more

diversity in the dependence structure between defaults as well as the drivers


of credit risk could be a cure to these deficiencies.
Further, we suggest alternatives to Value-at-Risk, which is often sug-
gested as the only risk measure to be considered. We also relate these consider-
ations to the idea of a coherent measure of risk as introduced by [3]. Thus, the
article extends the framework of risk management to expected
tail loss (ETL) and argues for the informational effectiveness of this
statistic. Finally, the quite dramatic effects of the business cycle on credit
migration behavior have been investigated more thoroughly in recent years
[2, 22]. Alternative and more adequate models suggest the use of conditional
instead of average historical migration matrices for determining credit VaR.
The rest of the paper is set up as follows. Section 2 provides insight into
sound modelling of the returns of risk factors and assets using alternatives
to the Gaussian distribution. Section 3 focuses on dependence modelling with
the concept of copulas. Section 4 extends the framework of risk management
to alternative risk measures such as expected tail loss (ETL). Section 5 illus-
trates the necessity of using conditional migration matrices instead of average
historical ones. Section 6 describes how these features can
be integrated in a credit risk management system; Section 7 concludes.

2 Adequate Modelling of Market and Risk Factors


In this section, we discuss how to generate scenarios for asset returns of the
obligors or for changes in the market risk factors. The dynamic of financial risk
factors is well known to often exhibit some of the following phenomena: heavy
tails, skewness and high-kurtotic residuals. The recognition and description of
the latter phenomena goes back to the seminal papers of [11] and [8]. To cap-
ture these features, we will introduce the α-stable distribution as an extension
of the normal distribution. Due to its summation stability and the fact that it
generalizes the Gaussian distribution, the class of stable distributions seems
to be an ideal candidate to describe the return distribution of the considered
risk factors. For an extensive description of the stable distribution and its
application in financial theory see [17] or [15].
Let us first briefly review some of the main features of the stable dis-
tribution as the natural extension of the Gaussian distribution. An α-stable
distributed random variable can be defined in the following way [17]:
A random variable X with a stable distribution can be characterized in two
equivalent ways: via a stability property under addition and via its characteristic function.
Definition 2 A random variable X follows a stable distribution, if for any
positive numbers A and B there exists a positive number C and a real number
D such that
AX1 + BX2 =d CX + D (1)

where X1 and X2 are independent copies of X and "=d" denotes equality in


distribution.
The stable distribution can also be defined by its characteristic function:
Definition 3 A random variable X has a stable distribution if there are pa-
rameters 0 < α ≤ 2, σ ≥ 0, −1 ≤ β ≤ 1, and µ real such that its characteristic
function has the following form:

E(e^{iXt}) = exp(−σ^α |t|^α [1 − iβ sign(t) tan(πα/2)] + iµt),   if α ≠ 1,
E(e^{iXt}) = exp(−σ |t| [1 + iβ (2/π) sign(t) ln |t|] + iµt),     if α = 1.   (2)
The family of stable distributions contains as a special case the Gaussian
(Normal) distribution. However, non-Gaussian stable models do not possess
the limitations of the normal one and all share a similar feature that differen-
tiates them from the Gaussian one – heavy probability tails. Thus they can
model greater variety of empirical distributions including skewed ones.
The dependence of a stable random variable X on its parameters is indicated
by writing
X ∼ Sα(β, σ, µ).
The parameters α, β, σ and µ of a stable Paretian distribution describe the
stability, skewness, scale and drift and satisfy the following constraints:
α is the index of stability (0 < α ≤ 2): for values of α lower than 2
the distribution becomes more leptokurtic in comparison to the normal
distribution. This means that the peak of the density becomes higher and
the tails heavier. When α > 1, the location parameter µ is the mean of the
distribution.
β is the skewness parameter (−1 ≤ β ≤ 1): a stable distribution with
β = µ = 0 is called a symmetric α-stable distribution (SαS). If β < 0, the
distribution is skewed to the left, if β > 0, the distribution is skewed to the
right. We conclude that the stable distribution can also capture asymmetric
asset returns.
σ is the scale parameter (σ ≥ 0): the scale parameter σ allows any
stable random variable X to be written as X = σX0, where X0 has a unit scale parameter
and α and β are the same for X and X0 .
µ is the drift (µ ∈ R): note that for 1 < α ≤ 2, the shift parameter µ
equals the mean.
Obviously the stable distribution offers more parameters to model empir-
ically observed risk factors than e.g. the normal distribution. The word stable
is used because the shape is preserved (apart from scale and shift) under
addition such as in Equation 1. A very important advantage is that stable
distributions form a family that contains the normal distribution as a special
case. It is actually reduced to the Normal distribution if α = 2 and β = 0.
Thus, most of the beneficial properties of the normal distribution which make
it so popular within financial theory are also valid for the stable distributions:

• The sum of independent identically distributed (iid) stable random vari-


ables is again stable. This property allows us to build portfolios, for ex-
ample.
• Stable distributions are the only distributional family that has its own
domain of attraction - that is a large sum of i.i.d. random variables will
have a distribution that converges to a stable one. This is a unique feature,
which means that if a given stock price/rate is reflected by many small
shocks, then the limiting distribution of the stock price can only be stable
(that is Gaussian or non-Gaussian stable).
It is a widely accepted critique of the normal distribution that it fails to
explain certain properties of financial variables - fat tails and excess kurtosis.
Therefore, the stable distributions provide much more realistic models for
financial variables which can capture the kurtosis and the heavy-tailed nature
of financial data, see e.g [15]. Figure 1 illustrates the superior density fit of a
stable (non-Gaussian) distribution in comparison to a Gaussian (normal) to
the empirical distribution of the 1 week EURIBOR rate.
Based on the superior fit to empirical data and the possibility to capture
skewness, heavy tails and high-kurtotic residuals in the distribution the sta-
ble distribution has some advantages over the Gaussian model. Therefore, it
should be favorable to assume that the probability model for asset returns
and risk factors in credit risk modelling is described by the family of stable
laws.

Fig. 1. Density fit of a Gaussian (normal) and a stable (non-Gaussian) distribution to the empirical (sample) distribution of the 1-week EURIBOR rate (kernel estimate, stable fit and Gaussian fit)
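A comparison of this kind can be sketched with standard statistical software. The following Python fragment is only an illustration under stated assumptions: it assumes scipy's levy_stable distribution and uses a simulated heavy-tailed sample as a stand-in for the EURIBOR data, which are not reproduced here. It fits both a Gaussian and a stable model and compares the implied probability of a five-sigma loss.

    # Sketch: stable vs. Gaussian fit, in the spirit of Fig. 1.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    returns = stats.t.rvs(df=3, scale=0.01, size=500, random_state=rng)  # stand-in sample

    mu, sigma = stats.norm.fit(returns)
    alpha, beta, loc, scale = stats.levy_stable.fit(returns)  # 4-parameter MLE; can be slow

    # Compare the probability of a 5-sigma loss under both fitted models.
    threshold = mu - 5 * sigma
    print("Gaussian 5-sigma loss prob:", stats.norm.cdf(threshold, mu, sigma))
    print("Stable 5-sigma loss prob:  ", stats.levy_stable.cdf(threshold, alpha, beta, loc, scale))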

X2 is identical in both models as well as their marginal distributions – X1 and


X2 are normally distributed. Yet it is clear that the dependence structure of
the two models is qualitatively different. If we interpret the random variables
as financial loss, then adopting the first model could lead to underestimation
of the probability of having extreme losses. On the contrary, according to the
second model extreme losses have a stronger tendency to occur together. The
example motivates the idea to model the dependence structure with a method
more general than the correlation approach.
The correlation is a widespread concept in modern finance and insurance
and stands for a measure of dependence between two random variables. How-
ever, this term is very often incorrectly used to mean any notion of depen-
dence. Actually, correlation is one particular measure of dependence among
many. Of course in the world of multivariate normal distribution and, more
generally in the world of spherical and elliptical distributions, it is the ac-
cepted measure. Yet empirical research shows that real data seldom seems to
have been generated from a distribution belonging to this class.
There are at least three major drawbacks of the correlation method. Let
us therefore consider the case of two real-valued random variables X and Y :
The variances of X and Y must be finite or the correlation is not defined.
This assumption causes problems when working with heavy-tailed data.
For instance the variances of the components of a bivariate t(n) distributed
random vector for n ≤ 2 are infinite, hence the correlation between them is
not defined.
Independence of two random variables implies correlation equal to zero, the
opposite, generally speaking, is not correct – zero correlation does not imply
independence.
A simple example is the following: Let X ∼ N (0, 1) and Y = X 2 . Since
the third moment of the standard normal distribution is zero, the correlation
between X and Y is zero despite the fact that Y is a function of X which
means that they are dependent. Indeed, in the case of a multivariate normal
distribution uncorrelatedness and independence are interchangeable notions.
This statement is, however, not valid if only the marginal distributions are
normal and the joint distribution is non-normal. The example on Fig. 2 illus-
trates this fact.
The correlation is not invariant under non-linear strictly increasing trans-
formations T : R → R.
This is a serious disadvantage, since in general
corr(T(X), T(Y)) ≠ corr(X, Y).
A more prevalent approach is to model dependency using copulas [13].
Let us consider a real-valued random vector X = (X1 , . . . , Xn )t . The depen-
dence structure of the random vector is completely determined by the joint
distribution function
F (x1 , . . . , xn ) = P (X1 ≤ x1 , . . . , Xn ≤ xn ). (3)

It is possible to transform the distribution function and as a result to


have a new function which completely describes the dependence between the
components of the random vector and is not dependent on the marginal dis-
tributions. This function is called copula.
Suppose we transform the random vector X = (X1 , . . . , Xn )t component-
wise to have standard-uniform marginal distributions U (0, 1). Each random
variable Xi has a marginal distribution of Fi that is assumed to be continuous
for simplicity. Recall that the transformation of a continuous random variable
X with its own distribution function F results in a random variable F (X)
which is uniformly distributed on (0, 1). Thus, transforming equation (3)
component-wise yields
F (x1 , . . . , xn ) = P (X1 ≤ x1 , . . . , Xn ≤ xn )
= P [F1 (X1 ) ≤ F1 (x1 ), . . . , Fn (Xn ) ≤ Fn (xn )]
= C(F1 (x1 ), . . . , Fn (xn )), (4)
where the function C can be identified as a joint distribution function
with standard uniform marginals – the copula of the random vector X. In
equation (4), it can be clearly seen, how the copula combines the marginals
to the joint distribution.
Sklar’s theorem provides a theoretic foundation for the copula concept [18]:
Theorem 4 Let F be a joint distribution function with continuous margins
F1 , . . . , Fn . Then there exists a unique copula C : [0, 1]n → [0, 1] such that for
all x1 , . . . , xn in R = [−∞, ∞] (4) holds. Conversely, if C is a copula and
F1 , . . . , Fn are distribution functions, then the function F given by (4) is a
joint distribution function with margins F1 , . . . , Fn .
For the case that the marginals Fi are not all continuous, it can be shown [18]
that the joint distribution function can still be expressed like in equation (4).
However, the copula C is no longer unique in this case.
For risk management, the use of copulas offers the following advantages:
• The nature of dependency that can be modelled is more general. In com-
parison, only linear dependence can be explained by the correlation.
• Dependence of extreme events might be modelled.
• Copulas are invariant under increasing continuous transformations (not
only linear ones, as is the case for correlations):
If (X1 , . . . , Xn )t has a copula C and T1 . . . , Tn are increasing continuous
functions, then (T1 (X1 ), . . . , Tn (Xn ))t also has the copula C.
The last statement may be quite important in asset-value models for credit
risk, because this property postulates that the asset values of two companies
shall have exactly the same copula as the stock prices of these two companies.
The latter is true if we consider the stock price of a company as a call op-
tion on its assets and if the option pricing function giving the stock price is
continuously increasing with respect to the asset values.

Overall, we conclude that the use of copulas as a more general measure


of dependence has several advantages over the use of correlations only. Since
especially in credit risk a nonlinear dependence structure between different risk
factors, asset values and credit events may be assumed, the concept should
be included in an adequate risk management approach.
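To illustrate the practical difference, the following Python sketch (illustrative only, and not the implementation discussed later in Sect. 6) simulates a Gaussian copula and a Student's t copula with the same linear correlation and compares the probability that two risk factors fall into their worst 1% jointly; the t copula assigns this joint extreme event a markedly higher probability.

    # Sketch: same linear correlation, different copulas, different joint tails.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n, rho, nu = 200_000, 0.5, 3
    cov = np.array([[1.0, rho], [rho, 1.0]])
    L = np.linalg.cholesky(cov)

    # Gaussian copula: correlated normals mapped to uniforms by the normal cdf.
    z = rng.standard_normal((n, 2)) @ L.T
    u_gauss = stats.norm.cdf(z)

    # Student's t copula: divide the same correlated normals by sqrt(chi2/nu),
    # then map to uniforms by the t cdf.
    w = rng.chisquare(nu, size=(n, 1)) / nu
    t_sample = z / np.sqrt(w)
    u_t = stats.t.cdf(t_sample, df=nu)

    # Probability that both margins fall into their worst 1% simultaneously.
    q = 0.01
    print("Gaussian copula joint tail:", np.mean((u_gauss < q).all(axis=1)))
    print("t copula joint tail:       ", np.mean((u_t < q).all(axis=1)))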

4 Alternative Risk Measures


Once the portfolio value scenarios are generated, an estimate for the distri-
bution of the portfolio values can be obtained. We may then choose to report
any number of descriptive statistics for this distribution. For example, mean
and standard deviation could be obtained from the simulated portfolio values
using sample statistics. However, because of the skewed nature of the portfolio
distribution, the mean and standard deviation may not be good measures of
risk. Since the distribution of values is not normal, it is not optimal to infer
percentile levels from the standard deviation. Given the simulated portfolio
values, we can compute better measures, for example empirical quantiles (VaR
at different confidence levels), or expected shortfall (ES) and expected tail loss
(ETL) risk statistics.
The VaR framework, though well-established in the industry, has been
subject to various criticism. In their seminal paper, [3] point out that the
VaR concept has to be regarded with care and should not be the only concept
for risk evaluation. Firstly, VaR creates severe aggregation problems and does
not behave nicely with respect to the addition of risks, even if the risks are
independent. Further, the use of value at risk does not consider diversification
effects adequately. Hence, alternative risk measures should be considered as
it comes to evaluation of portfolio credit risk.
A more adequate measure of risk could be the conditional value-at-risk
(CVaR), also known as tail value-at-risk, expected shortfall or expected tail
loss (ETL). It is defined as:

ETLα(X) = (1/α) ∫_0^α VaRq(X) dq (5)

where VaRq(X) = − inf{x | P(X ≤ x) ≥ q} is the VaR of the random
variable X, interpreted as a financial asset return, so −X is the loss, and
ETLα(X) = −E(X | X ≤ −VaRα(X)) when X has a continuous distribu-
tion. ETL is defined as a conditional loss, i.e. the av-
erage of the losses provided these are larger than the predicted VaR threshold
at given confidence level. Thus compared to VaR which is a point estimate of
risk, ETL reflects all information contained in the left tail of the asset returns
probability distribution. This fact makes ETL a much more reliable and infor-
mation effective risk statistics. Managing risk and/or optimizing portfolios on

the basis of ETL leads to higher risk-adjusted returns. ETL compared to VaR
possesses a number of advantages, among others ETL is a smooth function
which can be readily optimized. Moreover, ETL reflects only the downside and
does not penalize for upside potential of the portfolio/asset returns, which is
not true for the standard deviation.
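For portfolio values obtained by simulation, both statistics can be computed directly from the scenario sample. The following minimal Python sketch (with illustrative stand-in scenarios) computes the empirical VaR and ETL at a given tail probability, following the sign convention of equation (5).

    # Sketch: empirical VaR and ETL from simulated portfolio returns.
    import numpy as np

    def var_etl(returns, alpha=0.01):
        """Empirical VaR and ETL at tail probability alpha (X is a return, -X the loss)."""
        q = np.quantile(returns, alpha)      # alpha-quantile of the return distribution
        var = -q                             # VaR is the negated quantile
        etl = -returns[returns <= q].mean()  # average loss beyond the VaR threshold
        return var, etl

    rng = np.random.default_rng(2)
    sim_returns = rng.standard_t(df=3, size=100_000) * 0.01   # stand-in scenarios
    print(var_etl(sim_returns, alpha=0.01))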
Recently, a variety of alternative risk measures has been introduced in the
literature. One may also like to consider individual assets and to ascertain
how much risk each asset contributes to the portfolio. Hence, also marginal
(incremental) statistics should be considered. For an overview on desirable
properties of a risk measure see for example [3], [19], [20] or [16].

5 Conditional Migration Behaviour

It is generally agreed that assigned ratings and corresponding default probabil-


ities but also the probabilities for rating changes are important determinants
of a bank’s credit risk management. Unfortunately, due to cyclical behavior
of the economy, credit spreads and migrations are not constant through time.
[9] as well as [1] have shown that default rates and credit spreads clearly de-
pend on the stage of the business cycle. [12] provided insight that probability
transition matrices of bond ratings also vary with the state of the economy.
Further investigating the issue, [22] show that such changes in migration or
default behavior through time lead to substantial effects on risk figures for
credit portfolios. Thus, to measure and forecast changes in migration behav-
ior as well as determining adequate estimators for transition matrices can be
considered as a major issue in rating based credit risk modelling.
Still, despite the obvious importance of recognizing the impact of business
cycles on rating transitions, the literature is sparse on this issue. The first
model developed to explicitly link business cycles to rating transitions was
CreditPortfolioView (CPV), introduced in 1997 by [25] and McKinsey and Company. [6] as
well as [10] use a one-factor model whereby ratings respond to business cycle
shifts. The model is extended to a multifactor credit migration model by [23].
Finally, [12] propose an ordered probit model which permits migration matri-
ces to be conditioned on the industry, country domicile and the business cycle.
In this section we will summarize the main ideas for two of the approaches on
adjusting migration matrices to the business cycle: The CreditPortfolioView
Model (CPV) by [25] and factor models initially suggested by [6] and [10].
In the macro simulation approach by [25] a time series model for the
business cycle is used to determine a conditional migration matrix. Let Yj,t
be the macro-economic index for rating class j at time t. Then Yj,t is derived
from a multi-factor time-series model of the form:

Yj,t = βj,0 + βj,1 X1,t + βj,2 X2,t + ..... + βj,m Xm,t + vj,t . (6)

According to the model the index Yj,t is dependent on economic variables


Xk with k = 1, . . . , m where vj,t represents an error term. The error term vj,t
is interpreted as the index innovation vector and assumed to be independent
of the Xk,t and identically normally distributed, vj,t ∼ N (0, σj ) for every t,
and independent for every j and we write vj ∼ N (0, Σv ).
The macroeconomic factors Xk are assumed to follow an auto-regressive
process of order 2 (AR2):

Xk,t = γk,0 + γk,1 Xk,t−1 + γk,2 Xk,t−2 + ek,t . (7)

Hereby, Xk,t−1 and Xk,t−2 denote the lagged values of the variable Xk ,
while ek,t denotes an error term that is assumed to be i.i.d, i.e. ek,t ∼ N (0, σe ).
Obviously, based on parameter estimates for equations (6)-(7) a macro-
economic index also for future periods can be estimated. This index can be
used to determine conditional default probabilities pj,t for rating class j in
period t. The author suggests a logit model of the form:
pj,t = 1 / (1 + exp(−Yj,t)) (8)
while other models could be applied.
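A minimal numerical sketch of equations (6) and (8) is given below (Python, with made-up coefficients and factor values purely for illustration): the macro-economic index for a segment is formed as a linear combination of the factors and mapped to a conditional default probability by the logistic function.

    # Sketch of equations (6) and (8): macro index -> conditional default probability.
    import numpy as np

    # Illustrative (made-up) coefficients for one rating segment j.
    beta = np.array([-2.0, 0.8, -1.5])   # beta_j0, beta_j1, beta_j2
    x_t = np.array([1.0, 0.3, 1.2])      # intercept, GDP growth, unemployment (example values)

    y_jt = beta @ x_t                    # macro-economic index, equation (6)
    pd_jt = 1.0 / (1.0 + np.exp(-y_jt))  # logit link, equation (8)
    print(f"Conditional default probability: {pd_jt:.2%}")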
Finally, for estimation of the conditional migration matrix a shifting pro-
cedure is used that redistributes the probability mass within each row of the
unconditional migration matrix [24]. The shift operator is written in terms of
a matrix S = {Sij } and the shift procedure is accomplished by

Pcond = (I + τ S)Puncond (9)

where τ denotes the amplitude of the shift in segment j and is a function of


the estimated conditional default probability. For further conditions imposed
on the factor τ we refer to [24].
Alternative models for adjustment of migration matrices to business cycle
variables are approaches based on factor models including a systematic and
idiosyncratic risk component [6, 10, 23]. In these approaches, a one-factor
model is adopted to incorporate credit cycle dynamics into the transition
matrix.
First a so-called credit cycle index Zt defining the credit state based on
macroeconomic conditions shared by all obligors during period t is estimated.
The index is designed to be positive in good days and to be negative in
bad days. A positive index implies a lower probability of default (PD) and
downgrading probability but a higher upgrading probability and vice versa.
To calibrate the index, PDs of speculative grade bonds are used, since often
PDs of higher rated bonds are rather insensitive to the economic state, see
e.g. [5]. Instead of the logit model suggested in [24], here a probit model is
used.
Further it is assumed that ratings transitions reflect an underlying, con-
tinuous credit-change indicator Y following a standard normal distribution.

This credit-change indicator is assumed to be influenced by both a systematic


and unsystematic risk component. Therefore, Yt has a linear relationship with
the systematic credit cycle index Zt and an idiosyncratic error term εt . Thus,
the typical one-factor model parametrisation is obtained for the credit-change
indicator:
Yt = ρ Zt + √(1 − ρ²) εt. (10)
Since both Zt and εt follow the standard normal distribution and the
weights are chosen to be ρ and √(1 − ρ²), Yt is also standard normal. Note
that ρ represents the correlation between the credit-change indicator Yt
and the systematic credit cycle index Zt. The probability distribution for the
rating change for a company then takes place according to the outcome of
the systematic risk index. To apply this scheme to a multi-rating system, it
is assumed that conditional on an initial credit rating i at the beginning of
a year, one partitions values of the credit change indicator Y into a set of
disjoint bins. The bins are defined in a way that the probability of Yt falling
in a given interval equals the corresponding historical average transition rate.
This can be done simply by inverting the cumulative normal distribution
function starting from the default column, as illustrated in Fig. 3.
Using the bins calculated from the average transition matrix it is then
straightforward to calculate the conditional transition probability on the
credit cycle index. On average days one obtains Zt = 0 for the systematic
risk index and the credit-change indicator Yt follows a standard normal dis-
tribution. A positive outcome of the credit cycle index Zt shifts the credit-
change indicator to the right-hand side while in the case of a bad outcome
of the systematic credit cycle index the distribution moves to the left hand
side. Thus, in any year, the observed transition rates will deviate from the
average migration matrix; we have to find a shift such that the probabilities

Fig. 3. Corresponding credit scores to transition probabilities for a company with BBB rating (compare [6])

associated with the bins defined above best approximate the given year’s ob-
served transition rates. The estimation problem then results in determining
ρ such that the distance between the forecasted conditional transition matrix
and empirically observed migrations is minimized, see e.g. [10]. Note that [23]
extends the one-factor model representation by a multi-factor, Markov chain
model for rating migrations and credit spreads.
The different approaches point out the importance of incorporating busi-
ness cycle effects into the estimation of credit migration matrices. [10], [12]
and more recently [21] show that the conditional approach outperforms a
naive approach of simply taking historical average or previous year’s transi-
tion matrices.
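The threshold construction and the conditional shift of the credit-change indicator can be sketched in a few lines. The following Python fragment is an illustration under stated assumptions: the average transition row and the correlation parameter ρ are hypothetical, the thresholds are derived by inverting the normal distribution function from the default column, and a transition row conditional on a good or bad outcome of the credit cycle index Z is then computed from model (10).

    # Sketch: thresholds from an average transition row and a conditional row
    # given a credit cycle outcome Z, following the one-factor model (10).
    import numpy as np
    from scipy.stats import norm

    # Illustrative average 1-year transition row for a BBB obligor
    # (states ordered from best to worst, default last); probabilities sum to 1.
    row = np.array([0.0002, 0.003, 0.059, 0.874, 0.053, 0.008, 0.0015, 0.0013])

    # Thresholds: invert the normal cdf on cumulative probabilities counted from
    # the default column upwards, so that P(Y falls in a bin) matches the average rate.
    cum_from_default = np.cumsum(row[::-1])[::-1]            # P(migrate to state k or worse)
    thresholds = norm.ppf(np.clip(cum_from_default, 0, 1))   # upper bin edges on the Y scale

    def conditional_row(z, rho=0.3):
        """Transition row conditional on the credit cycle index Z = z."""
        cond_cum = norm.cdf((thresholds - rho * z) / np.sqrt(1 - rho**2))
        return -np.diff(np.append(cond_cum, 0.0))            # back to per-state probabilities

    print("average default prob:  ", row[-1])
    print("bad-year default prob: ", conditional_row(z=-2.0)[-1])
    print("good-year default prob:", conditional_row(z=+2.0)[-1])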

6 Integration of the Advanced Technologies in a Credit Risk Management System
This section outlines the steps of the application of integrated credit and mar-
ket risk management in practice. The introduction follows the implementation
algorithm of these steps in the Cognity software system. The system basically
incorporates two models for credit risk measurement – Asset Value Model
(AVM) which is an extension of CreditMetrics and Stochastic Default Rate
(SDR) model that serves as an enhancement to McKinsey's CreditPort-
folioView, and integrates these in a general framework of risk management
where the credit quality of the obligors is modelled dependent on the move-
ments in market risk drivers as well. Most software systems on the market offer
either market or credit risk measurement/management in separate products
and it is a well known fact that their users consistently experience difficulties
when trying to merge the results and build a comprehensive risk picture of
the portfolio.

6.1 The Asset-Value Approach in Cognity Credit Risk

There are four key steps in the Monte Carlo approach to credit risk modelling
in the asset value model:
Step 1. Modelling the dependence structure between market risk factors
and the credit risk drivers.
Step 2. Scenario Generation - each scenario corresponds to a possible “state
of the world” at the end of the risk horizon for which the portfolio risk is
estimated. For purposes of this article, the “state of the world” is just the credit rating
of each of the obligors in the portfolio and the corresponding values of the
market risk factors affecting the portfolio.
Step 3. Portfolio valuation - for each scenario, the software evaluates the
portfolio to reflect the new credit ratings and the values of the market risk
factors. This step creates a large number of possible future portfolio values.

Step 4. Summarize results - having the scenarios generated in the previous


steps, an estimate for the distribution of the portfolio value is produced. The
user may then choose to report any number of descriptive statistics for this
distribution.
The general methodology described below is valid for every Monte Carlo
approach to credit risk modelling in the AVM. We will now describe the
improvements that have been introduced to the first two components of this
class of models.

Modelling the Dependence Structure Between the Market Risk Factors and the Credit Risk Drivers

Under the asset value models, the general assumption is that the driver of
credit events is the asset value of a company. The dependence structure be-
tween the asset values of two firms can be approximated by the dependence
structure between the stock prices of those firms. In case there is no stock price
information for a given obligor we employ the idea of segmentation described
in CreditMetrics. The essence of this approach is that the user determines the
percentage obligor volatility allocation among the volatilities of certain mar-
ket indices and explains the dependence between obligors by the dependence
of the market indices that drive obligors’ volatilities.
As discussed in Sect. 3 modelling the dependence structure requires a
greater flexibility than the one offered by the correlation concept. Hence, Cog-
nity Credit Risk Module supplies flexible dependence structure models:
• A copula approach
• A subordinated model approach
• A simplified approach using correlations as a measure for the dependency
for comparison purposes
A copula suitable for modelling dependencies between financial variables
and credit drivers in particular, should be flexible enough to capture the
dependence of extreme events and also asymmetries in dependence. There
are few flexible multivariate copula functions which can be applied to large-
dimensional problems. Examples include the Gaussian copula (the one behind
the multivariate Gaussian distribution), the multivariate Student’s t-copula,
etc. Cognity utilizes a flexible copula model which contains the multivariate
Student’s t-copula as a special case and allows for asymmetry in the depen-
dence model as well as for dependence in the extreme events. The copula model
is based on an asymmetric version of the multivariate Student’s t-distribution
and is flexible enough for all market conditions including severe crises in which
the asymmetric dependence is most pronounced.
The subordinated approach arises from the so-called subordinated distri-
butions. The symmetric stable distribution discussed in Section 2 is one rep-
resentative of this class. In particular, a random variable X is said to be
subordinated if its distribution allows the following stochastic representation:

X = Y ·Z, where Y is a positive random variable called subordinator, Z has


a normal distribution and Y is independent of Z. In case X is a vector, Y and
Z are vectors as well and the multiplication is defined as element-by-element.
The subordinated models construct a rich and flexible class containing all
random volatility models and can be extended to include skewed representa-
tives. The concept of dependence within the subordinate models is introduced
in the Gaussian component Z and the dependence between the components of
the subordinators. This dependence model can be interpreted in the following
way: The central part of the distribution is dominated by the Gaussian com-
ponent and, therefore, is described by the covariance structure. The extreme
events are triggered by the subordinators and, as a result, their dependence or
independence is a consequence of the dependence or independence of the com-
ponents of the vector of subordinators. Cognity system distinguishes between
two categories - dependent subordinators and independent subordinators.
Special cases of the dependent subordinators model are: multivariate Stu-
dent’s t when the estimated degrees of freedom for all financial variables are
the same; sub-Gaussian stable when all estimated indices of stability are the
same; and of course, the multivariate Gaussian distribution which appears as
a special case in both the dependent and independent subordinators cases.
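As a small illustration of the subordinated construction X = Y·Z, the following Python sketch generates a Student's t distributed variable (one of the special cases mentioned above) by multiplying a Gaussian component with the subordinator Y = √(ν/χ²_ν), and compares the resulting excess kurtosis with that of the Gaussian component; the parameters are illustrative only.

    # Sketch: subordinated representation X = Y * Z, Student's t special case.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    nu, n = 6, 200_000

    z = rng.standard_normal(n)                    # Gaussian component Z
    y = np.sqrt(nu / rng.chisquare(nu, size=n))   # positive subordinator Y, independent of Z
    x = y * z                                     # X = Y * Z is Student's t with nu degrees of freedom

    # Compare sample excess kurtosis of X with that of the Gaussian component.
    print("excess kurtosis of Z:", stats.kurtosis(z))
    print("excess kurtosis of X:", stats.kurtosis(x))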

Scenario Generation

In this section, we discuss how to simultaneously generate scenarios for future


credit ratings of the obligors in the portfolio and for the changes in the market
risk factors values. Each set of future credit ratings and market risk factors
values corresponds to a possible ’state of the world’ at the end of our risk
horizon.
The scenario generation procedure under the Asset-Value model is as fol-
lows:
1. Establish asset-return thresholds for the obligors in the portfolio. The
thresholds define migration from one credit rating to another.
2. Generate scenarios for asset returns and market risk factors values using
an appropriate distribution - this is an assumption to be imposed.
3. Map the asset returns scenarios to credit ratings scenarios.
As discussed in Sect. 2, a heavy-tailed model is needed to properly describe
the behaviour of assets returns. Utilizing extended subordinated models the
Cognity framework allows for selecting among several heavy-tailed distribu-
tional models: (1) the stable distributions discussed in Sect. 2, (2) the Stu-
dent’s t-distribution. This is in fact a location- and scale-enhanced version of
the Student’s t-distribution. It is a symmetric heavy-tailed distribution which
allows for a subordinated representation. The normal distribution appears
asymptotically as the ‘degrees of freedom’ parameter increases indefinitely,
(3) the asymmetric Student’s t-distribution. There are many ways to arrive

at an asymmetric version of the traditional Student’s t-distribution. We have


selected an asymmetric version which allows for representation following an
extension of the classical subordinated model of the form

X = µ + γY + g(Y )Z (11)

where µ and γ are constants and g : R+ → R+ is a function.


Note that other classes of distributions which fit in the selected framework
like the generalized hyperbolic distribution could be easily included in the
framework. For further information we refer to [14] or [15].
If dependency is modelled using copulas then the marginal distribution can
also follow any of the univariate forms of the distributions described above.
Both the subordinated and the copula-based Cognity models allow for rela-
tively easy generation of random samples. Once the scenarios for the asset
values are generated, one only needs to assign credit ratings for each scenario.
This is done by comparing the asset value in each scenario to the rating thresh-
olds. Rating thresholds are estimated based on a migration matrix. Note that
some of the conditional migration probability approaches discussed in Sect. 5
can be embedded in this model.
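Steps 2 and 3 of this procedure can be sketched as follows (Python). This is an illustration only: the bivariate Student's t sample is a simple stand-in for the stable or subordinated models discussed above, and the thresholds and rating labels are hypothetical. Dependent heavy-tailed asset-return scenarios are generated, and each scenario is mapped to a rating by comparing the return with the thresholds.

    # Sketch of scenario generation steps 2-3: simulate heavy-tailed asset-value
    # scenarios for two obligors and map each scenario to a credit rating.
    import numpy as np

    rng = np.random.default_rng(4)
    n_scen, rho, nu = 50_000, 0.4, 6

    # Step 2: dependent, heavy-tailed asset returns (bivariate t as a stand-in).
    z = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n_scen)
    returns = z / np.sqrt(rng.chisquare(nu, size=(n_scen, 1)) / nu)

    # Step 3: map asset returns to ratings using illustrative thresholds
    # (lowest bin = default, highest bin = upgrade).
    thresholds = np.array([-2.75, -2.0, -1.0, 2.5])     # bin edges, hypothetical
    labels = np.array(["D", "CCC", "BB", "BBB", "A"])   # resulting rating states
    ratings = labels[np.searchsorted(thresholds, returns)]

    joint_default = np.mean((ratings == "D").all(axis=1))
    print("joint default frequency:", joint_default)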

Evaluation on the Portfolio Level and Summarizing the Results

For non-default scenarios, the portfolio valuation step consists of applying a


valuation model for each particular position within the portfolio over each
scenario. The yield curve corresponding to the credit rating of the obligor for
this particular scenario should be used. For default scenarios, a model for the
recovery rates is required. As discussed in many empirical analyses recovery
rates are not deterministic quantities but rather exhibit large variations. Such
variation of value in the case of default is a significant contributor to risk.
Recovery rates can be modelled using the Beta distribution with a specified
mean and standard deviation. In this case, for each default scenario for a
given obligor, we should generate a random recovery rate for each particular
transaction with the defaulted obligor. The value of a given position will thus
differ depending on which default scenario is realized.
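A sketch of such a recovery model is given below (Python): the specified mean and standard deviation are converted into the two Beta shape parameters by the method of moments, and one recovery rate is drawn per default scenario. The numerical values are illustrative only.

    # Sketch: stochastic recovery rates from a Beta distribution specified by (mean, std).
    import numpy as np

    def beta_params(mean, std):
        """Convert a (mean, std) pair on (0, 1) into Beta shape parameters."""
        var = std**2
        common = mean * (1 - mean) / var - 1   # requires var < mean * (1 - mean)
        return mean * common, (1 - mean) * common

    rng = np.random.default_rng(5)
    a, b = beta_params(mean=0.45, std=0.25)     # example assumptions for one seniority class
    recoveries = rng.beta(a, b, size=10_000)    # one draw per default scenario
    print(a, b, recoveries.mean(), recoveries.std())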
Having the portfolio value scenarios generated in the previous steps, we
obtain an estimate for the distribution of the portfolio values. We may then
choose to report any set of descriptive statistics for this distribution. The
calculation of statistics is the same for both Cognity models. For example,
mean and standard deviation of future portfolio value can be obtained from
the simulated portfolio values using sample statistics. Because of the skewed
nature of the portfolio distribution, the mean and the standard deviation may
not be good measures of risk. Given the simulated portfolio values, we can
compute better measures, for example empirical quantiles, or Expected Tail
Loss discussed in Sect. 4.

6.2 The Stochastic Default Rate Approach in Cognity Credit Risk

Credit risk modelling based on the Stochastic Default Rate (SDR) approach comprises five key steps:
1. Build the econometric models for the default rates and for the explanatory
variables. Default probability of a given segment is described based on
an econometric model using explanatory variables such as macro-factors,
indices, etc. It is fit using historical data for default frequencies in a given
segment and historical time series for the explanatory variables.
2. Generate scenarios. Each scenario corresponds to a possible ‘state of the
world’ at the end of our risk horizon. Here, the ‘state of the world’ is a set
of values for the market variables and for the explanatory variable defined
in step 1.
3. Estimate default probabilities for the segments under each scenario based
on the scenario values for the explanatory variables and the model esti-
mated in step 1. Then the migration matrix is adjusted. Simulate sub-
scenarios for the status of each obligor.
4. Portfolio valuation. For each scenario, reevaluate the portfolio to reflect
the new credit status of the obligor and the values of the market risk
factors. This step generates a large number of possible future portfolio
values.
5. Summarize results. Having the scenarios generated in the previous steps,
we possess an estimate for the distribution of portfolio values. We may
then choose to report any descriptive statistics for this distribution.
The last two parts are the same for the Asset Value Model and for the
Stochastic Default Rate Model, so we will concentrate on the first three com-
ponents of the model.

Building the Econometric Models

Two models should be defined and estimated under the SDR approach: the
first model provides an econometric approach for default probabilities of a
segment based on explanatory variables like macro-factors, indices, etc. The
second model deals with a time series approach for the explanatory variables.
Default probability models are evaluated for each user-defined segment.
The segment definitions can be flexible based on criteria like the credit rating,
the industry, the region and the size of the company, provided that the time
series of default rates are available for each of the segments. The explanatory
variables that might be appropriate to represent the systematic risk of the
default rates in the chosen country/industry/segment depend on the nature
of the portfolio and might comprise industry indices, macro variables (GDP,
unemployment rate) as well as long-term interest rates or exchange rates,
etc. When defining the model for the default probability of a segment based
on explanatory variables (macro-factors, indices, etc.) we use historical data

for default frequencies in a given segment and historical time series for the
explanatory variables. The idea is similar to the CreditPortfolioView described
in Sect. 5. Hereby, a function f is chosen and estimated such that

DFs,t = f (X1,t , . . . , XN,t ) + ut (12)

where DFs,t is the default frequency in the segment s for the time period t; Xi,t
is the value of the i-th explanatory variable at time t, i = 1, . . . , N . It should
be mentioned that in general explanatory variables can be observable factors
but also factors estimated by the means of fundamental factor analysis based
on stock returns in a given segment or latent variables coming from statistical
factor models.
The second model is a time-series model for the explanatory variables.
The usual way to model dependent variables (as suggested also in CreditPort-
folioView) is to employ some kind of ARMA(p, q) model. That is the same
as assuming that


Xt = a0 + Σ_{i=1}^p ai Xt−i + Σ_{j=1}^q bj εt−j + εt, (13)

where εt ∼ N(0, σ²). It is important to note that a sound modelling of the


default rate will depend very much on the proper modelling of the dependent
variables.
There are numerous empirical studies showing that the real distribution of
residuals deviates from the assumption of the model - residuals are not nor-
mal. They are usually skewed, with fatter tails and volatility clustering. Thus
the improper use of normal residuals may end up with ‘incorrect’ scenarios
(simulations) for the possible default rates. For additional information, see
e.g. [14] or [15].
For the modelling of macro-factors, Cognity system proposes the following
more general Vector-AR(1)-GARCH type model with heavy-tailed residuals.
The model takes the following form:

Xt = A1 Xt−1 + Et (14)

where Xt = (X1,t , . . . , Xn,t ) is the vector of explanatory variables, A1 is an


n × n matrix and Et = (ε1,t , . . . , εn,t ) is the vector of residuals which are
modelled by a multivariate heavy-tailed GARCH-type model.
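The interplay of the two models can be sketched as follows (Python). All numerical inputs are illustrative assumptions: the VAR(1) coefficients, the default-rate regression coefficients, a logistic link standing in for the function f in equation (12), and Student's t residuals standing in for the heavy-tailed GARCH residuals. Macro-factor paths are simulated with equation (14) and each path is mapped to a segment default rate.

    # Sketch: one set of simulation paths for the SDR building blocks (equations 12 and 14).
    import numpy as np

    rng = np.random.default_rng(6)
    A1 = np.array([[0.7, 0.1],
                   [0.0, 0.8]])              # illustrative VAR(1) coefficient matrix
    x_now = np.array([0.02, 0.05])           # current GDP growth, unemployment gap (example)
    beta = np.array([-3.5, -25.0, 10.0])     # illustrative default-rate regression coefficients

    horizon, n_paths = 4, 10_000
    default_rates = np.empty((n_paths, horizon))
    for p in range(n_paths):
        x = x_now.copy()
        for h in range(horizon):
            eps = rng.standard_t(df=4, size=2) * 0.01          # heavy-tailed residuals E_t
            x = A1 @ x + eps                                    # equation (14)
            index = beta[0] + beta[1:] @ x                      # linear index for segment s
            default_rates[p, h] = 1.0 / (1.0 + np.exp(-index))  # logistic link as a stand-in for f
    print("simulated mean default rate per year:", default_rates.mean(axis=0).round(4))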

The Monte Carlo Approach in the SDR Model

There are five key steps in the Monte Carlo approach to credit risk modelling
based on stochastic modelling of the default rate:
Step 1. Build econometric models for default rates and explanatory risk
variables. Based on the explanatory risk variables (macro-factors, indices, etc.)

an econometric model for the default probability for each segment is fitted
using historical data for default probabilities in a given segment and historical
time series data for the explanatory variables.
Step 2. Generate scenarios – each scenario corresponds to a possible “state
of the world” at the end of the risk horizon. Here, the “state of the world” is
a set of values for the market and explanatory risk variables defined.
Step 3. Estimate default probabilities under each scenario for each segment
using the scenario (simulation) values of the explanatory variables and the
model estimated in step 1. Sample a new default rate for each obligor and
adjust the respective migration probabilities based on the new default rate.
Determine the credit rating status of each obligor based on the new migration
and default probabilities. Technically this is accomplished by making use of
a uniform (0,1) random variable, which is drawn for each counterparty and
each simulation of the default rate.
Step 4. Portfolio valuation – for each scenario, revalue the portfolio to
reflect the new credit status of the obligor and the values of the market risk
factors. This step generates a large number of possible future portfolio values.
Step 5. Summarize results – once the scenarios in the previous steps are
generated, we come up with an estimate for the distribution of portfolio values.
We may then choose to report any descriptive statistics for this distribution.
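A sketch of the uniform-draw mechanism in Step 3 is given below (Python, with illustrative default rates): for each scenario and each counterparty, a uniform (0, 1) draw is compared with the simulated segment default rate to decide whether the obligor defaults.

    # Sketch of Step 3: uniform (0,1) draws decide each obligor's status per scenario.
    import numpy as np

    rng = np.random.default_rng(7)
    n_scen, n_obligors = 20_000, 50
    segment_pd = rng.uniform(0.01, 0.05, size=(n_scen, 1))   # simulated segment default rates
    u = rng.uniform(size=(n_scen, n_obligors))               # one draw per obligor and scenario
    defaulted = u < segment_pd                                # True where the obligor defaults
    print("average portfolio default count:", defaulted.sum(axis=1).mean())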

7 Conclusion

In this paper we reviewed recent advances in credit risk management. The


upcoming new Basel capital accord, recent periods of high default rates and
the substantial growth in the credit derivatives markets, have led to a high
awareness of necessary improvements in credit risk modelling.
We provided an overview of the most common fallacies embedded in sev-
eral industry wide used solutions. The first concept under criticism was the
use of the normal distributions to model asset returns. Recent research by
e.g. Rachev and Mittnik (2000) suggests that the use of the normal distri-
bution for modelling the returns of an asset or macroeconomic risk factors
is not adequate. The normal distribution cannot capture important features
like heavy tails, excess kurtosis and skewness exhibited by the variables. We
also reviewed the deficiencies of the use of correlation as dependence measure
between risk factors or asset returns (Embrechts et al, 2001). We argue that
using the wrong dependence structure may lead to severe underestimation of
the risk for a credit portfolio and recommend the use of copulas (Sklar, 1959)
as alternative concept. Copulas allow for more diversity in the dependence
structure between defaults as well as the drivers of credit risk and should be
incorporated in advanced credit risk management systems.
Further we suggested the additional use of alternative risk measures next to
the industry standard of Value-at-Risk. The idea of coherent risk measures,
initially introduced by Artzner et al. (1999), provides theorems for the construction

of more adequate measures. Hence, we propose to consider not only a single


quantile of the loss distribution of a credit portfolio, but to include several
risk measures including expected shortfall (ES) and expected tail loss (ETL).
Finally, as a result of the quite dramatic effects of the business cycle on credit
migration behaviour, we point out the importance of using conditional instead
of historical average migration matrices. In recent years, research and empir-
ical studies (e.g. Allen and Saunders, 2003; Trück and Rachev, 2005) suggest
that business cycle effects have substantial impact on CVaR and should not
be ignored. We propose different methods that can be used for the estimation
of conditional transition matrices.
Finally, a case study, using the FinAnalytica Inc. Cognity software system,
provides some insight into how the discussed features can be incorporated in
an up-to-date credit risk management system. Hereby, two different classes of
credit risk models are considered - an extension of the classic Asset Value
Model (AVM) and an advanced Stochastic Default Rate (SDR) model.

Acknowledgement
The authors are grateful to Georgi Mitov (FinAnalytica) and Dobrin Penchev
(FinAnalytica) for the fruitful comments, suggestions and computational as-
sistance. We also thank Zari Rachev (University of Karlsruhe, UCSB and
FinAnalytica) and Stoyan Stoyanov (FinAnalytica) for helpful discussions.

References
[1] Alessandrini, F., 1999. Credit Risk, Interest Rate Risk, and the Business
Cycle. Journal of Fixed Income 9 (2), 42–53.
[2] Allen, L., Saunders, A., 2003. A Survey of Cyclical Effects in Credit Risk
Measurement Model. BIS Working Paper 126.
[3] Artzner, P., Delbaen, F., Eber, J.-M., Heath, D., 1999. Coherent measures
of risk. Mathematical Finance 9 (3), 203–228.
[4] Basel Committee on Banking Supervision, 2001. The new Basel Capital
Accord, Second Consultative Document.
[5] Belkin, B., Forest, L., Suchower, S., 1998. The Effect of Systematic Credit
Risk on Loan Portfolio Value-at-Risk and Loan Pricing. CreditMetrics
Monitor.
[6] Belkin, B., Forest, L., Suchower, S., 1998. A one-parameter Representa-
tion of Credit Risk and Transition Matrices. CreditMetrics Monitor.
[7] Embrechts, P., McNeil, A., Straumann, D., 1999. Correlation and Depen-
dence in Risk Management: Properties and Pitfalls. In: Risk management:
value at risk and beyond, ed. Dempster, M.
[8] Fama, E., 1965. The Behaviour of Stock Market Prices. Journal of Busi-
ness 38, 34–105.

[9] Helwege, J., Kleiman, P., 1997. Understanding aggregate default rates of
high-yield bonds. Journal of Fixed Income 7(1), 55–61.
[10] Kim, J., 1999. Conditioning the Transition Matrix. Risk Credit Risk
Special Report, 37–40.
[11] Mandelbrot, B., 1963. The Variation of certain speculative Prices. Jour-
nal of Business 36, 394–419.
[12] Nickell, P., Perraudin, W., Varotto, S., 2000. Stability of Rating Transi-
tions. Journal of Banking and Finance 1-2, 203–227.
[13] Sklar, A., 1959. Fonctions de répartition à n dimensions et leurs marges.
Publications de l’Institut de Statistique de l’Université de Paris 8, 229–231.
[14] Rachev, S., Martin, R., Racheva, B., Stoyanov, S., 2006. Stable ETL
Portfolios and Extreme Risk Management. Working Paper.
[15] Rachev, S., Mittnik, S., 2000. Stable Paretian Models in Finance. Wiley,
New York.
[16] Rachev, S., Ortobelli, S., Stoyanov, S., Fabozzi, F., Biglova, A., 2006. De-
sirable Properties of an Ideal Risk Measure in Portfolio Theory. Working
Paper.
[17] Samorodnitsky, G., Taqqu, M., 1994. Stable Non-Gaussian Random Pro-
cesses. Chapman & Hall, New York.
[18] Schweizer, B., Sklar, A., 1983. Probabilistic Metric Spaces. North Holland
Elsevier, New York.
[19] Szegö, G., 2002. Measures of Risk. Journal of Banking and Finance 26(7),
1253–1272.
[20] Szegö, G., 2004. Risk Measures for the 21st Century. Wiley, Chichester.
[21] Trück, S., 2008. Forecasting Credit Migration Matrices with Business
Cycle Effects - A Model Comparison. European Journal of Finance 14(5),
359–379.
[22] Trück, S., Rachev, S., 2005. Credit Portfolio Risk and PD Confidence
Sets through the Business Cycle. Journal of Credit Risk 1(4).
[23] Wei, J., 2003. A Multi-Factor, Credit Migration Model for Sovereign
and Corporate Debts. Journal of International Money and Finance 22,
709–735.
[24] Wilson, T., 1997. Measuring and Managing Credit Portfolio Risk.
McKinsey & Company.
[25] Wilson, T., 1997. Portfolio Credit Risk I/II. Risk 10.
Stable ETL Optimal Portfolios and Extreme
Risk Management

Svetlozar T. Rachev1, R. Douglas Martin2, Borjana Racheva3, and Stoyan Stoyanov4

1 FinAnalytica Inc., Sofia, Bulgaria, zari.rachev@finanalytica.com
2 FinAnalytica Inc., Seattle WA, USA, doug.martin@finanalytica.com
3 FinAnalytica Inc., Sofia, Bulgaria, borjana.racheva@finanalytica.com
4 FinAnalytica Inc., Sofia, Bulgaria, stoyan.stoyanov@finanalytica.com

1 Introduction
We introduce a practical alternative to Gaussian risk factor distributions
based on Svetlozar Rachev’s work on Stable Paretian Models in Finance (see
[4]) and called the Stable Distribution Framework. In contrast to normal dis-
tributions, stable distributions capture the fat tails and the asymmetries of
real-world risk factor distributions. In addition, we make use of copulas, a gen-
eralization of overly restrictive linear correlation models, to account for the
dependencies between risk factors during extreme events, and multivariate
ARCH-type processes with stable innovations to account for joint volatility
clustering. We demonstrate that the application of these techniques results in
more accurate modeling of extreme risk event probabilities, and consequently
delivers more accurate risk measures for both trading and risk management.
Using these superior models, VaR becomes a much more accurate measure of
downside risk. More importantly Stable Expected Tail Loss (SETL) can be
accurately calculated and used as a more informative risk measure for both
market and credit portfolios. Along with being a superior risk measure, SETL
enables an elegant approach to portfolio optimization via convex optimiza-
tion that can be solved using standard scalable linear programming software.
We show that SETL portfolio optimization yields superior risk adjusted re-
turns relative to Markowitz portfolios. Finally, we introduce an alternative investment performance measurement tool: the Stable Tail Adjusted Return Ratio (STARR), which is a generalization of the Sharpe ratio in the Stable Distribution Framework.
“When anyone asks me how I can describe my experience of nearly 40
years at sea, I merely say uneventful. Of course there have been winter gales
and storms and fog and the like, but in all my experience, I have never been
in an accident of any sort worth speaking about. I have seen but one vessel in

distress in all my years at sea (...) I never saw a wreck and have never been
wrecked, nor was I ever in any predicament that threatened to end in disaster
of any sort.”
E.J. Smith, Captain, 1907, RMS Titanic

2 Extreme Asset Returns Demand New Solutions

Professor Paul Wilmott (www.wilmott.com) likes to recount the ritual by which he questions his undergraduate students on the likelihood of Black Monday 1987. Under the commonly accepted Gaussian risk factor distribution assumption, they consistently reply that there should be no such event in the entire existence of the universe and beyond!
The last two decades have witnessed a considerable increase in the fat-tailed kurtosis and skewness of asset returns at all levels: individual assets, portfolios, and market indices. Extreme events are the corollary of the increased kurtosis. Legacy risk and portfolio management systems have done a reasonable job at managing ordinary financial events. However, up to now, very few institutions or vendors have demonstrated the systematic ability to deal with the unusual or extreme event, the one that should almost never happen under conventional modeling approaches. Therefore, one can reasonably question the soundness of some of the current risk management practices and tools used on Wall Street as far as extreme risk is concerned.
The two main conventional approaches to modeling asset returns are based
either on a historical or a normal (Gaussian) distribution for returns. Neither
approach adequately captures unusual asset price and return behaviors. The
historical model is bounded by the extent of the available observations and
the normal model inherently cannot produce atypical returns. The financial
industry is beleaguered with both under-optimized portfolios with often-poor
ex-post risk-adjusted returns, as well as overly optimistic aggregate risk indi-
cators (e.g. VaR) that lead to substantial unexpected losses.
The inadequacy of the normal distribution is well recognized by the risk
management community. Yet up to now, no consistent and comprehensive
alternative has adequately addressed unusual returns. To quote one major
vendor:

“It has often been argued that the true distributions of returns (even
after standardizing by the volatility) imply a larger probability of
extreme returns than that implied from the normal distribution. Al-
though we could try to specify a distribution that fits returns better, it
would be a daunting task, especially if we consider that the new distri-
bution would have to provide a good fit across all asset classes.” (Tech-
nical Manual, RMG, 2001, http://www.riskmetrics.com/publications/
index.html).

In response to the challenge, we use generalized multivariate stable


(GMstable) distributions and generalized risk-factor dependencies, thereby
creating a paradigm shift to consistent and uniform use of the most viable
class of non-normal probability models in finance. This approach leads to
distinctly improved financial risk management and portfolio optimization
solutions for assets with extreme events.

3 The Stable Distribution Framework


3.1 Stable Distributions

In spite of wide-spread awareness that most risk factor distributions are heavy-
tailed, to date, risk management systems have essentially relied either on his-
torical, or on univariate and multivariate normal (or Gaussian) distributions
for Monte Carlo scenario generation. Unfortunately, historical scenarios only
capture conditions actually observed in the past, and in effect use empirical
probabilities that are zero outside the range of the observed data, a clearly
undesirable feature. On the other hand Gaussian Monte Carlo scenarios have
probability densities that converge to zero too quickly (exponentially fast) to
accurately model real-world risk factor distributions that generate extreme
losses. When such large returns occur separately from the bulk of the data
they are often called outliers.
Figure 1 below shows quantile–quantile (qq)-plots of daily returns versus
the best-fit normal distribution of nine randomly selected microcap stocks for
the two-year period 2000–2001. If the returns were normally distributed, the
quantile points in the qq-plots would all fall close to a straight line. Instead
they all deviate significantly from a straight line (particularly in the tails),
reflecting a higher probability of occurrence of extreme values than predicted
by the normal distribution, and showing several outliers.
Such behavior occurs in many asset and risk factor classes, including well-
known indices such as the S&P 500, and corporate bond prices. The latter
are well known to have quite non-Gaussian distributions that have substantial
negative skews to reflect down-grading and default events. For such returns,
non-normal distribution models are required to accurately model the tail be-
havior and compute probabilities of extreme returns.
Various non-normal distributions have been proposed for modeling ex-
treme events, including:
• Mixtures of two or more normal distributions
• t-distributions, hyperbolic distributions, and other scale mixtures of nor-
mal distributions
• Gamma distributions
• Extreme value distributions
• Stable non-Gaussian distributions (also known as Lévy-stable and Pareto-
stable distributions)
Fig. 1. Quantile–quantile (qq)-plots of daily returns for nine microcap stocks versus the best-fit normal distribution

Among the above, only stable distributions have attractive enough math-
ematical properties to be a viable alternative to normal distributions in trad-
ing, optimization and risk management systems. A major drawback of all
alternative models is their lack of stability. Benoit Mandelbrot [3] demon-
strated that the stability property is highly desirable for asset returns. These
advantages are particularly evident in the context of portfolio analysis and
risk management.
An attractive feature of stable models, not shared by other distribution models, is that they allow generalization of Gaussian-based financial theories and thus allow construction of a coherent and general framework for financial modeling. These generalizations are possible only because of specific proba-
bilistic properties that are unique to (Gaussian and non-Gaussian) stable laws,
namely: the stability property, the central limit theorem, and the invariance
principle for stable processes.
Benoit Mandelbrot [3], then Eugene Fama [2], provided seminal evidence
that stable distributions are good models for capturing the heavy-tailed
(leptokurtic) returns of securities. Many follow-on studies came to the same
conclusion, and the overall stable distributions theory for finance is provided
in the definitive work of Rachev and Mittnik [4], see also [5, 6, 9].

But in spite of the convincing evidence, stable distributions have seen vir-
tually no use in capital markets. There have been several barriers to the
application of stable models, both conceptual and technical:
• Except for three special cases, described below, stable distributions have
no closed form expressions for their probability densities.
• Except for normal distributions, which are a limiting case of stable distributions (with α = 2 and β = 0), stable distributions have infinite variance and only a mean value for α > 1.
• Without a general expression for stable probability densities, one cannot
directly implement maximum likelihood methods for fitting these densities,
even in the case of a single (univariate) set of returns.
The lack of practical techniques for fitting univariate and multivariate stable distributions to asset and risk factor returns has been the main barrier to the progress of stable distributions in finance. Only the recent development of advanced numerical methods has removed this obstacle. These
patent-protected methods are at the foundation of the CognityTM risk man-
agement and portfolio optimization software system (see further comments in
Sect. 5.6).

Univariate Stable Distributions

A stable distribution for a random risk factor X is defined by its characteristic function:
$$ F(t) = E\left[e^{itX}\right] = \int e^{itx} f_{\mu,\sigma}(x)\,dx, $$
where
$$ f_{\mu,\sigma}(x) = \frac{1}{\sigma}\, f\!\left(\frac{x-\mu}{\sigma}\right) $$
is any probability density function in a location-scale family for X, and
$$ \log F(t) = \begin{cases} -\sigma^{\alpha}|t|^{\alpha}\left(1 - i\beta\,\mathrm{sgn}(t)\tan\frac{\pi\alpha}{2}\right) + i\mu t, & \alpha \neq 1 \\[4pt] -\sigma|t|\left(1 - i\beta\,\frac{2}{\pi}\,\mathrm{sgn}(t)\log|t|\right) + i\mu t, & \alpha = 1 \end{cases} $$

A stable distribution is therefore determined by the four key parameters:


1. α determines the density's kurtosis (i.e., the tail weight), with 0 < α ≤ 2
2. β determines the density's skewness, with −1 ≤ β ≤ 1
3. σ is a scale parameter (in the Gaussian case, α = 2 and 2σ² is the variance)
4. µ is a location parameter (µ is the mean if 1 < α ≤ 2)

Stable distributions for risk factors allow for skewed distributions when β ≠ 0 and fat tails relative to the Gaussian distribution when α < 2. The graph in Fig. 2 shows the effect of α on tail thickness of the density as well as peakedness at the origin relative to the normal distribution (collectively the “kurtosis” of the density), for the case of β = 0, µ = 0, and σ = 1.
As the values of α decrease, the distribution exhibits fatter tails and more peakedness at the origin.

Fig. 2. Symmetric stable densities (α = 0.5, 1, 1.5, 2)
The case of α = 2 and β = 0, with the reparameterization in scale $\tilde{\sigma} = \sqrt{2}\,\sigma$, yields the Gaussian distribution, whose density is given by:
$$ f_{\mu,\tilde{\sigma}}(x) = \frac{1}{\sqrt{2\pi}\,\tilde{\sigma}}\, e^{-\frac{(x-\mu)^2}{2\tilde{\sigma}^2}}. $$
The case α = 1 and β = 0 yields the Cauchy distribution, with much fatter tails than the Gaussian, whose density is given by:
$$ f_{\mu,\sigma}(x) = \frac{1}{\pi\,\sigma}\left(1 + \left(\frac{x-\mu}{\sigma}\right)^{2}\right)^{-1}. $$

Figure 3 below illustrates the influence of β on the skewness of the density for α = 1.5, µ = 0 and σ = 1. Increasing (decreasing) values of β result in skewness to the right (left).

Fitting Stable and Normal Distributions: DJIA Example

Aside from the Gaussian, Cauchy, and one other special case of stable distri-
bution for a positive random variable with α = 0.5, there is no closed form
expression for the probability density of a stable random variable.
Thus one is not able to directly estimate the parameters of a stable distribution by the method of maximum likelihood.

Fig. 3. Skewed stable densities (α = 1.5, µ = 0, σ = 1; β = 0, 0.25, 0.5, 0.75, 1)

Fig. 4. DJIA daily returns from January 1, 1990 to February 14, 2003

To estimate the four parameters

of the stable laws, the CognityTM system uses a special patent-pending ver-
sion of the FFT (Fast Fourier Transform) approach to numerically calculate
the densities with high accuracy, and then applies maximum likelihood esti-
mation (MLE) to estimate the parameters.
The results from applying the CognityTM stable distribution modeling to
the DJIA daily returns from January 1, 1990 to February 14, 2003 is displayed
in Fig. 4. The figure shows the left-hand tail detail of the resulting stable den-
sity, along with that of a normal density fitted using the sample mean and
sample standard deviation, and that of a non-parametric kernel density esti-
mate (labeled “Empirical” in the plot legend). The parameter estimates are:
• Stable parameters: α̂ = 1.699, β̂ = −0.120, µ̂ = 0.0002, and σ̂ = 0.006
• Normal density parameter estimates: µ̂ = 0.0003 and σ̂ = 0.010
Note that the stable density tail behavior is reasonably consistent with the
empirical non-parametric density estimate, indicating the existence of some
extreme returns. At the same time it is clear from the figure that the tail of
the normal density is much too thin, and will provide inaccurate estimates of
tail probabilities for the DJIA returns. The table below shows just how bad
the normal tail probabilities are for several negative returns values.

Probability (DJIA Return < x )


x −0.04 −0.05 −0.06 −0.07
Stable fit 0.0066 0.0043 0.0031 0.0023
Normal fit 0.000056 0.0000007 3.68E−09 7.86E−12

A daily return smaller than −0.04 occurs with probability 0.0066 under the stable fit, or roughly seven times every four years, whereas under the normal fit such a return occurs on the order of once every seventy years. Similarly, a return smaller than −0.05 occurs about once per year under the stable fit, but only about once every several thousand years under the normal fit. Clearly the normal distribution fit is an exceedingly optimistic predictor of DJIA tail return values.
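As a rough illustration of the fit-and-compare exercise above, the following Python sketch fits a stable and a normal distribution to a return series with SciPy and compares the left-tail probabilities. This is not the authors' patented FFT-based MLE procedure: SciPy's generic maximum likelihood fit for levy_stable is slower and less refined, and the simulated input series and all numbers are stand-ins rather than DJIA data.

```python
# Hedged sketch: compare stable and normal left-tail probabilities for a return series.
import numpy as np
from scipy.stats import levy_stable, norm

rng = np.random.default_rng(0)
# stand-in for a daily return series (roughly DJIA-like stable parameters)
returns = levy_stable.rvs(1.7, -0.1, loc=0.0002, scale=0.006, size=2000, random_state=rng)

alpha, beta, loc_s, scale_s = levy_stable.fit(returns)   # stable MLE fit (can be slow)
mu, sigma = returns.mean(), returns.std(ddof=1)          # normal fit via sample moments

for x in (-0.04, -0.05, -0.06, -0.07):
    p_stable = levy_stable.cdf(x, alpha, beta, loc=loc_s, scale=scale_s)
    p_normal = norm.cdf(x, loc=mu, scale=sigma)
    print(f"P(R < {x:+.2f}):  stable {p_stable:.2e}   normal {p_normal:.2e}")
```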
Figure 5 below displays the central portion of the fitted densities as well
as the tails, and shows that the normal fit is not nearly peaked enough near

the origin as compared with the empirical density estimate (even though the GARCH model was applied), while the stable distribution matches the empirical estimate quite well in the center as well as in the tails.

Fig. 5. The fitted stable and normal densities together with the empirical density

Fitting Stable Distributions: Micro-Caps Example


Noting that micro-cap stock returns are consistently strongly non-normal (see
sample of normal qq-plots at the beginning of this section), we fit stable
distributions to a random sample of 182 micro-cap daily returns for the two-
year period 2000–2001. The results are displayed in the box plot in Fig. 6.
The median of the estimated alphas is 1.57, and the upper and lower quar-
tiles are 1.69 and 1.46 respectively. Somewhat surprisingly, the distribution of
the estimated alphas turns out to be quite normal.

Generalized Multivariate Stable Distribution Modeling


Generalized stable distribution (GMstable) modeling is based on fitting uni-
variate stable distributions for each one dimensional set of returns or risk
factors, each with its own parameter estimates αi , βi , µi , σi, i=1,2,. . . ,K,
where K is the number of risk factors, along with a dependency structure.
One way to produce the cross-sectional dependency structure is through
a scale mixing process (called a “subordinated” process in the mathematical
finance literature) as follows. First compute a robust mean vector and covari-
ance matrix estimate of the risk factors to get rid of the outliers, and have a
good covariance matrix estimate for the central bulk of the data. Next we gen-
erate multivariate normal scenarios with this mean vector and covariance ma-
trix. Then we multiply each of random variable component of the scenarios

ESTIMATED ALPHAS OF 182 MICRO-CAP STOCKS

1.1 1.3 1.5 1.7 1.9


ESTIMATED ALPHAS

Fig. 6. A box-plot of estimated alphas


244 S.T. Rachev et al.

by a strictly positive stable random variable with index αi /2, i=1,2,. . . ,K.
The vector of stable random variable scale multipliers is usually independent
of the normal scenario vectors, but it can also be dependent. See for example
Rachev and Mittnik [4], and [5, 6, 9].
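A minimal sketch of this scale-mixing construction is given below, assuming scipy.stats.levy_stable for the positive stable multipliers. The subordinator scale constant is one common sub-Gaussian choice (exact scaling conventions vary), plain sample moments stand in for the robust estimates mentioned above, and all inputs are illustrative rather than the authors' production procedure.

```python
# Hedged sketch of GMstable scenario generation by scale mixing ("subordination"):
# Gaussian scenarios whose i-th component is multiplied by the square root of a
# positive alpha_i/2-stable random variable (requires alpha_i < 2).
import numpy as np
from scipy.stats import levy_stable

def gmstable_scenarios(mean, cov, alphas, n_scen, rng):
    k = len(alphas)
    z = rng.multivariate_normal(np.zeros(k), cov, size=n_scen)      # Gaussian core
    scen = np.empty_like(z)
    for i, a in enumerate(alphas):
        c = np.cos(np.pi * a / 4.0) ** (2.0 / a)                    # subordinator scale (one convention)
        w = levy_stable.rvs(a / 2.0, 1.0, loc=0.0, scale=c,
                            size=n_scen, random_state=rng)          # totally skewed positive stable
        scen[:, i] = mean[i] + np.sqrt(w) * z[:, i]
    return scen

rng = np.random.default_rng(1)
cov = 1e-4 * np.array([[1.0, 0.6], [0.6, 1.0]])
scenarios = gmstable_scenarios(np.zeros(2), cov, alphas=[1.7, 1.5], n_scen=10_000, rng=rng)
```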
Another very promising approach to building the cross-sectional depen-
dence model is through the use of copulas, an approach that is quite attrac-
tive because it allows for modeling higher correlations during extreme market
movements, thereby accurately reflecting lower portfolio diversification at such
times. The next section briefly discusses copulas.

3.2 Copula Multivariate Dependence Models

Why Copulas?

Classical correlations and covariances are quite limited measures of depen-


dence, and are only adequate in the case of multivariate Gaussian distribu-
tions. A key failure of correlations is that, for non-Gaussian distributions, zero
correlation does not imply independence, a phenomenon that arises in the con-
text of time-varying volatilities represented by ARCH and GARCH models. The
reason we use copulas is that we need more general models of dependence,
ones which:
• Are not tied to the elliptical character of the multivariate normal
distribution.
• Have multivariate contours and corresponding data behavior that reflect
the local variation in dependence that is related to the level of returns,
in particular, those shapes that correspond to higher correlations with ex-
treme co-movements in returns than with small to modest co-movements.

What are Copulas?

A copula may be defined as a multivariate cumulative distribution function


with uniform marginal distributions:

C(u1 , u2 , · · · , un ), ui ∈ [0, 1] for i = 1, 2, · · · , n

where
C(1, . . . , 1, ui , 1, . . . , 1) = ui for i = 1, 2, · · · , n.
It is known that for any multivariate cumulative distribution function:

F (x1 , x2 , · · · , xn ) = P (X1 ≤ x1 , X2 ≤ x2 , · · · Xn ≤ xn )

there exists a copula C such that

F (x1 , x2 , · · · , xn ) = C(F1 (x1 ), F2 (x2 ), · · · , Fn (xn ))



where the Fi (xi ) are the marginal distributions of F (x1 , x2 , · · · , xn ), and


conversely, for any copula C the right-hand side of the above equation defines a
multivariate distribution function F (x1 , x2 , · · · , xn ). See for example, Bradley
and Taqqu [1] and Sklar [8].
The main idea behind the use of copulas is that one can first specify
the marginal distributions in whatever way makes sense, e.g. fitting marginal
distribution models to risk factor data, and then specify a copula C to capture
the multivariate dependency structure in the best suited manner.
There are many classes of copulas, particularly for the special case of bivariate distributions. For more than two risk factors, besides the traditional Gaussian copula, the t-copula is very tractable for implementation and provides a possibility to model dependencies of extreme events. It is defined as:
$$ C_{\nu,c}(u_1, u_2, \cdots, u_n) = \frac{\Gamma\left((\nu+n)/2\right)}{\Gamma\left(\nu/2\right)\sqrt{|c|\,(\nu\pi)^{n}}} \int_{-\infty}^{t_\nu^{-1}(u_1)} \cdots \int_{-\infty}^{t_\nu^{-1}(u_n)} \left(1 + \frac{s'\, c^{-1} s}{\nu}\right)^{-\frac{\nu+n}{2}} ds, $$
where c is a correlation matrix and $t_\nu^{-1}$ is the inverse of the univariate t distribution function with ν degrees of freedom.


A sample of 2,000 bivariate simulated risk factors generated by a t-copula
with 1.5 degrees of freedom and normal marginal distributions is displayed in
Fig. 7.
Fig. 7. Bivariate simulations obtained by using a t-copula with 1.5 degrees of freedom and normal marginals

The example illustrates that these two risk factors are somewhat uncorrelated for small to moderately large returns, but are highly correlated for the
infrequent occurrence of very large returns. This can be seen by noting that
the density contours of points in the scatter plot are somewhat elliptical near
the origin, but are nowhere close to elliptical for more extreme events. This
situation is in contrast to a Gaussian linear dependency relationship where
the density contours are expected to be elliptical.
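The construction behind Fig. 7 can be sketched as follows, assuming the standard recipe of drawing multivariate t variates, mapping them to uniforms through the univariate t distribution function, and then applying the desired marginals. The correlation matrix and sample size are illustrative inputs, not the ones used for the figure.

```python
# Hedged sketch: simulate from a t-copula with normal marginals (cf. Fig. 7).
import numpy as np
from scipy.stats import t, norm

def t_copula_normal_marginals(corr, dof, n, rng):
    k = corr.shape[0]
    z = rng.standard_normal((n, k)) @ np.linalg.cholesky(corr).T   # correlated normals
    w = rng.chisquare(dof, size=(n, 1)) / dof
    u = t.cdf(z / np.sqrt(w), df=dof)                              # copula (uniform) scale
    u = np.clip(u, 1e-12, 1 - 1e-12)                               # guard the extreme tails
    return norm.ppf(u)                                             # normal marginals

rng = np.random.default_rng(2)
corr = np.array([[1.0, 0.5], [0.5, 1.0]])
sample = t_copula_normal_marginals(corr, dof=1.5, n=2000, rng=rng)
```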

3.3 Volatility Clustering Models and Stable VaR

It is well known that asset returns and risk factor returns exhibit volatility
clustering, and that even after adjusting for such clustering the returns will
still be non-normal and contain extreme values. There may also be some serial
dependency effects to account for. In order to adequately model these collec-
tive behaviors we recommend using ARIMA models with an ARCH/GARCH
“time-varying” volatility input, where the latter has non-normal stable inno-
vations. This approach is more flexible and accurate than the commonly used
simple exponentially weighted moving average (EWMA) volatility model, and
provides accurate time-varying estimates of VaR and expected tail loss (ETL)
risk measures. See Sect. 4 for discussion of ETL vs. VaR that emphasizes the
advantages of ETL. However, we stress that those who must use VaR to sat-
isfy regulatory requirements will get much more accurate results with stable
VaR than with normal VaR, as the following example vividly shows.
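Before turning to that example, a minimal sketch of the "GARCH filter with stable innovations" idea just described is shown below. The GARCH(1,1) parameters are fixed illustrative values rather than estimates, and the simple two-step procedure (filter the volatility, then fit a stable law to the standardized residuals) is a stand-in for the authors' implementation.

```python
# Hedged sketch: GARCH(1,1) volatility filter with stable innovations and a
# one-step-ahead 99% stable VaR. omega, a, b are illustrative, not estimated.
import numpy as np
from scipy.stats import levy_stable

def stable_garch_var(returns, omega, a, b, tail_prob=0.01):
    r = np.asarray(returns, dtype=float)
    r = r - r.mean()
    sigma2 = np.empty(len(r) + 1)
    sigma2[0] = r.var()
    for i in range(len(r)):                               # GARCH(1,1) variance recursion
        sigma2[i + 1] = omega + a * r[i] ** 2 + b * sigma2[i]
    resid = r / np.sqrt(sigma2[:-1])                      # standardized residuals
    alpha, beta, loc, scale = levy_stable.fit(resid)      # stable innovation fit (can be slow)
    q = levy_stable.ppf(tail_prob, alpha, beta, loc=loc, scale=scale)
    return -(np.mean(returns) + np.sqrt(sigma2[-1]) * q)  # VaR reported as a positive loss

rng = np.random.default_rng(3)
rets = 0.01 * rng.standard_t(4, size=750)                 # stand-in return series
print("one-step-ahead 99% stable VaR:", stable_garch_var(rets, omega=1e-6, a=0.08, b=0.90))
```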
Consider the following portfolio of Brady bonds:
• Brazil C 04/14
• Brazil EIB 04/06
• Venezuela DCB Floater 12/07
• Samsung KRW Ord Shares
• Thai Farmers Bank THB
We have run normal, historical and stable 99% (1% tail probability) VaR
calculations for one-year of daily data from January 9, 2001 to January 9, 2002.
We used a moving window with 250 historical observations for the normal
VaR model, 500 for the historical VaR model and 700 for the stable VaR
model. For each of these cases we used a GARCH(1,1) model for volatility
clustering of the risk factors, with stable innovations. We back-tested these
VaR calculations by using the VaR values as one-step ahead predictors, and
got the results shown in Fig. 8.
The figure shows: the returns of the Brady bond portfolio (top curve); the
normal+EWMA (a la RiskMetrics) VaR (curve with jumpy behavior, just
below the returns); the historical VaR (the smoother curve mostly below but
sometimes crossing the normal+EWMA VaR); the stable+GARCH VaR (the
bottom curve). The results with regard to exceedances of the 99% VaR, and
keeping in mind Basel II guidelines, may be summarized as follows:
• Normal 99% VaR produced 12 exceedances (red zone)
• Historical 99% VaR produced 9 exceedances (on upper edge of yellow zone)
• Stable 99% VaR produced 1 exceedance, and nearly two (well in the green zone)

Fig. 8. A VaR back-test example
Clearly stable (+GARCH) 99% VaR produces much better results with
regard to Basel II compliance. This comes at the price of higher initial cap-
ital reserves, but results in a much safer level of capital reserves and a very
clean bill of health with regard to compliance. Note that organizations in the
red zone will have to increase their capital reserves by 33%, which at some
times for some portfolios will result in larger capital reserves than when using
the stable VaR, this in addition to being viewed as having inadequate risk
measures relative to the organization using stable VaR.
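The back-test bookkeeping used above can be sketched in a few lines. The green/yellow/red thresholds below are the standard Basel traffic-light counts for 250 daily observations at the 99% level.

```python
# Hedged sketch: count 99% VaR exceedances and map them to Basel traffic-light zones
# (green 0-4, yellow 5-9, red 10 or more, per 250 observations).
import numpy as np

def basel_zone(returns, var_forecasts):
    """VaR forecasts are positive loss numbers aligned with the return series."""
    n_exceed = int(np.sum(np.asarray(returns) < -np.asarray(var_forecasts)))
    zone = "green" if n_exceed <= 4 else ("yellow" if n_exceed <= 9 else "red")
    return n_exceed, zone
```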

4 ETL is the Next Generation Risk Measure


4.1 Why Not Value-at-Risk (VaR)?
There is no doubt that VaR’s popularity is in large part due to its simplicity
and its ease of calculation for 1–5% confidence levels. However, there is a price
to be paid for the simplicity of VaR in the form of several limitations:

• VaR does not give any indication of the risk beyond the quantile, and so
provides very weak information on downside risk.
• VaR portfolio optimization is a non-convex, non-smooth problem with
multiple local minima that can result in portfolio composition disconti-
nuities. Furthermore it requires complex calculation techniques such as
integer programming.
• VaR is not sub-additive; i.e. the VaR of the aggregated portfolio can be
larger than the sum of the VaR’s of the sub-portfolios.
• Historical VaR limits the range of the scenarios to data values that have
actually been observed, while normal Monte Carlo tends to seriously un-
derestimate the probability of extreme returns. In either case, the prob-
ability functions beyond the sample range are either zero or excessively
close to zero.

4.2 ETL and Stable versus Normal Distributions

Expected Tail Loss (ETL) is simply the average (or expected value) loss con-
ditioned on the loss being larger than VaR. ETL is also known as Conditional
Value-at-Risk (CVaR), or Expected Shortfall (ES). (We assume that the un-
derlying return distributions are absolutely continuous, and therefore, ETL
is equal to CVaR). As such ETL is intuitively much more informative than
VaR. We note however that ETL offers little benefit to investors who use a
normal distribution to calculate VaR at the usual 99% confidence limit (1%
tail probability). The reason is that the resulting VaR and ETL values differ
by very little, specifically:
• For CI = 1%, VaR = 2.326 and ETL = 2.667
ETL really comes into its own when coupled with stable distribution mod-
els that capture leptokurtic tails (“fat tails”). In this case ETL and VaR values
will be quite different, with the resulting ETL often being much larger than
the VaR.
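The normal-distribution figures quoted above can be reproduced directly: for a 1% tail probability, the standard normal VaR is the 99% quantile and the ETL is the conditional mean beyond it, φ(z)/ε.

```python
# Quick check of the standard normal 1% VaR and ETL quoted in the text.
from scipy.stats import norm

eps = 0.01
var_normal = norm.ppf(1 - eps)            # approximately 2.326
etl_normal = norm.pdf(var_normal) / eps   # approximately 2.667
print(var_normal, etl_normal)
```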
As in the graph in Fig. 9, consider the time series of daily returns for the
stock OXM from January 2000 to December 2001. Observe the occurrences
of extreme values.
While this series also displays obvious volatility clustering that deserves
to be modeled as described in Sect. 3.3, we shall ignore this aspect for the
moment. Rather, here we provide a compelling example of the difference be-
tween ETL and VaR based on a well-fitting stable distribution, as compared
with a poor fitting normal distribution.
Figure 10 shows a histogram of the OXM returns with a normal density
fitted using the sample mean and sample standard deviation, and a stable
density fitted using maximum-likelihood estimates of the stable distribution
parameters. The stable density is shown by the solid line and the normal
density is shown by the dashed line. The former is obviously a better fit than the latter, when using the histogram of the data values as a reference.

Fig. 9. The daily returns of OXM

Fig. 10. The stable and normal 99% VaR for OXM

The
estimated stable tail thickness index is α̂ = 1.62. The 1% VaR values for the
normal and stable fitted densities are 0.047 and 0.059 respectively, a ratio of
1.26 which reflects the heavier-tailed nature of the stable fit.
Figure 11 displays the same histogram and fitted densities with 1% ETL
values instead of the 1% VaR values. The 1% ETL values for the normal
and stable fitted densities are 0.054 and 0.174 respectively, a ratio of a little
over three-to-one. This larger ratio is due to the stable density’s heavy tail
contribution to ETL relative to the normal density fit.
Fig. 11. The stable and normal 99% ETL for OXM

Unlike VaR, ETL has a number of attractive properties:


• ETL gives an informed view of losses beyond VaR.
• ETL is a convex, smooth function of portfolio weights, and is therefore attractive for portfolio optimization (see [7]). This point is vividly illustrated
in the subsection below on ETL and Portfolio Optimization.
• ETL is sub-additive and satisfies a set of intuitively appealing coherent
risk measure properties.
• ETL is a form of expected loss (i.e. a conditional expected loss) and is a
very convenient form for use in scenario-based portfolio optimization. It
is also quite a natural risk-adjustment to expected return (see STARR, or
Stable Tail Adjusted Return Ratio).
The limitations of current normal risk factor models and the absence of
regulator blessing have held back the widespread use of ETL, in spite of its
highly attractive properties. However, we expect ETL to be a widely accepted
risk measure as portfolio and risk managers become more familiar with its
attractive properties.
For portfolio optimization, we recommend the use of Stable distribution
ETL (SETL), and limiting the use of historical, normal or stable VaR to
required regulatory reporting purposes only. Finally, organizations should con-
sider the advantages of Stable ETL for risk assessment purposes and non-
regulatory reporting purposes.

4.3 Portfolio Optimization and ETL Versus VaR

To the surprise of many, portfolio optimization with ETL turns out to be a


smooth, convex problem with a unique solution [7]. These properties are in
sharp contrast to the non-convex, rough VaR optimization problem.
The contrast between VAR and ETL portfolio optimization surfaces is
illustrated in Fig. 12 for a simple two-asset portfolio. The horizontal axes
show one of the portfolio weights (from 0% to 100%) and the vertical axes
display portfolio VAR and ETL respectively. The data consist of 200 simulated
uncorrelated returns.
The VAR objective function is quite rough with respect to varying the
portfolio weight(s), while that of the ETL objective function is smooth and
convex. One can see that optimizing with ETL is a much more tractable
problem than optimizing with VaR.
Rockafellar and Uryasev [7] show that the ETL optimal portfolio weight vector can be obtained based on historical (or scenario) returns data by minimizing a relatively simple convex function (Rockafellar and Uryasev used the term CVaR, whereas we use the less confusing synonym ETL). Assuming p assets with single-period returns ri = (ri1, ri2, · · · , rip) for period i, and a portfolio weight vector w = (w1, w2, . . . , wp), the function to be minimized is
$$ F(w, \gamma) = \gamma + \frac{1}{\varepsilon \cdot n} \sum_{i=1}^{n} \left[\, -w' r_i - \gamma \,\right]^{+}, $$

where [x]^+ denotes the positive part of x. This function is to be minimized jointly with respect to w and γ, where ε is the tail probability for which the expected tail loss is computed. Typically ε = 0.05 or 0.01, but larger values may be useful, as we discuss in Sect. 5.5. The authors further show that this optimization problem can be cast as an LP (linear programming) problem, solvable using any high-quality LP software.

Fig. 12. VaR and ETL surfaces as functions of portfolio weights
CognityTM combines this approach with fitting GMstable distribution
models for scenario generation. The stable scenarios provide accurate and
well-behaved estimates of ETL for the optimization problem.
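A minimal sketch of the Rockafellar-Uryasev linear program (not the Cognity implementation) is shown below using scipy.optimize.linprog. The scenario matrix, tail probability, and return target are illustrative inputs, and long-only, fully invested weights are assumed.

```python
# Hedged sketch: minimize ETL over scenarios via the Rockafellar-Uryasev LP.
# Variables: weights w (p), threshold gamma (1), auxiliary u_i >= [-w'r_i - gamma]^+.
import numpy as np
from scipy.optimize import linprog

def etl_optimal_weights(R, eps, mu0):
    """R: n x p matrix of scenario returns, eps: tail probability, mu0: return target."""
    n, p = R.shape
    c = np.concatenate([np.zeros(p), [1.0], np.full(n, 1.0 / (eps * n))])   # objective
    # u_i >= -w'r_i - gamma   <=>   -R w - gamma - u <= 0
    A_ub = np.hstack([-R, -np.ones((n, 1)), -np.eye(n)])
    b_ub = np.zeros(n)
    # expected return target: mean(R) w >= mu0
    A_ub = np.vstack([A_ub, np.concatenate([-R.mean(axis=0), [0.0], np.zeros(n)])])
    b_ub = np.append(b_ub, -mu0)
    A_eq = np.concatenate([np.ones(p), [0.0], np.zeros(n)]).reshape(1, -1)  # full investment
    b_eq = [1.0]
    bounds = [(0, None)] * p + [(None, None)] + [(0, None)] * n             # long-only
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    return res.x[:p], res.x[p]   # optimal weights and the VaR threshold gamma

rng = np.random.default_rng(4)
R = 0.0005 + 0.01 * rng.standard_t(4, size=(2000, 5))    # stand-in scenario returns
weights, gamma = etl_optimal_weights(R, eps=0.01, mu0=0.0004)
```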

4.4 Stable ETL Leads to Higher Risk Adjusted Returns

ETL portfolio optimization based on GMstable distribution modeling, which


we refer to as SETL portfolios, can lead to significant improvements in risk
adjusted return as compared to the conventional Markowitz mean–variance
portfolio optimization.
Figures 13 and 14 are supplied to illustrate the claim that stable ETL
optimal portfolios produce consistently better risk-adjusted returns. These
figures show the risk adjusted return MU/VaR (mean return divided by VaR)
and MU/ETL (mean return divided by ETL) for 1% VaR optimal portfolios
and ETL optimal portfolios, and using a multi-period fixed-mix optimization
in all cases.
In this simple example, the portfolio to be optimized consists of two assets,
cash and the S&P 500. The example is based on monthly data from February
1965 to December 1999. Since we assume full investment, the VaR optimal
portfolio depends only on a single portfolio weight, and the optimal weight is found by a simple grid search on the interval 0 to 1. The use of a grid search technique overcomes the problems with non-convex and non-smooth VaR optimization. In this example the optimizer is maximizing MU − c · VaR and MU − c · ETL, where c is the risk aversion parameter and VaR or ETL is the penalty function.
Figure 13 shows that even using the VaR optimal portfolio, one gets a
significant relative gain in risk-adjusted return using stable scenarios when
compared to normal scenarios, and with the relative gain increasing with
increasing risk aversion. The reason for the latter behavior is that with sta-
ble distributions the optimization pays more attention to the S&P returns
distribution tails, and allocates less investment to the S&P under stable dis-
tributions than under normal distributions as risk aversion increases.
Figure 14 for the risk-adjusted return for the ETL optimal portfolio has the
same vertical axis range as the previous plot for the VaR optimal portfolio.
The figure below shows that the use of ETL results in much greater gain
under the stable distribution relative to the normal than in the case of the
VaR optimal portfolio.
At every level of risk aversion, the investment in the S&P 500 is even less
in the ETL optimal portfolio than in the case of the VaR optimal portfolio.
This behavior is to be expected because the ETL approach pays attention to the losses beyond VaR (the expected value of the extreme loss), which in the stable case are much greater than in the normal case.

Fig. 13. Risk aversion versus risk-adjusted return (MU/VaR), VaR optimal portfolios

Fig. 14. Risk aversion versus risk-adjusted return (MU/ETL), ETL optimal portfolios

5 The Stable ETL Paradigm

5.1 The Stable ETL Framework

Our risk management and portfolio optimization framework uses multi-


dimensional asset and risk factor returns models based on GMstable distri-
butions, and stresses the use of Stable ETL (SETL) as the risk measure of
choice. These stable distribution models incorporate generalized dependence
structure with copulas, and include time varying volatilities based on GARCH
models with stable innovations. Henceforth we use the term GMstable distribu-
tion to include the generalized dependence structure and volatility clustering
model aspects of the model. Collectively, these modeling foundations form the
basis of a new and powerful overall basis for investment decisions that we call
the SETL Framework.
Currently the SETL framework has the following basic components:
• SETL scenario engines
• SETL factor models
• SETL integrated market risk and credit risk
• SETL optimal portfolios and efficient frontiers
• SETL derivative pricing
Going forward, additional classes of SETL investment decision models will
be developed, such as SETL betas and SETL asset liability models. The rich
structure of these models will encompass the heavy-tailed distributions of the asset returns, stochastic trends, heteroscedasticity, short- and long-range dependence, and more. We use the term “SETL model” to describe any such model in order to keep in mind the importance of the stable tail-thickness parameter α and skewness parameter β, along with volatility clustering and general dependence models, in financial investment decisions.
It is essential to keep in mind the following SETL fundamental principles
concerning risk factors:
(P1) Asset and risk factor returns have stable distributions where each
asset or risk factor typically has a different stable tail-index αi and
skewness parameter βi .
(P2) Asset and risk factor returns are associated through models that
describe the dependence between the individual factors more accu-
rately than classical correlations. Often these will be copula models.
(P3) Asset and risk factor modeling typically includes a SETL econo-
metric model in the form of multivariate ARIMA-GARCH processes

with residuals driven by fractional stable innovations. The SETL


econometric model captures clustering and long-range dependence of
the volatility.

5.2 Stable ETL Optimal Portfolios

A SETL optimal portfolio is one that minimizes portfolio expected tail loss
subject to a constraint of achieving expected portfolio returns at least as large
as an investor defined level, along with other typical constraints on weights,
where both quantities are evaluated in the SETL framework. Alternatively,
a SETL optimal portfolio solves the dual problem of maximizing portfolio
expected return subject to a constraint that portfolio expected tail loss is
not greater than an investor defined level, where again both quantities are
evaluated in the SETL framework. In order to define the above ETL precisely
we use the following quantities:
Rp : the random return of portfolio p
SERp : the stable distribution expected return of portfolio p
Lp = −Rp + SERp : the loss of portfolio p relative to its expected return
ε: a tail probability of the SETL distribution Lp
SV aRp (ε): the stable distribution Value-at-Risk for portfolio p
The latter is defined by the equation

Pr[Lp > SV aRp (ε)] = ε

where the probability is calculated in the SETL framework, that is SV aRp (ε)
is the ε-quantile of the stable distribution of Lp . In the value-at-risk literature
(1 − ε) × 100% is called the confidence level. Here we prefer to use the simpler,
unambiguous term tail probability. Now we define SETL of a portfolio p as

SET Lp (ε) = E[Lp |Lp > SV aRp (ε) ]

where the conditional expectation is also computed in the SETL framework.


We use the “S” in SERp , SV aRp (ε) and SET Lp (ε) as a reminder that stable
distributions are a key aspect of the framework (but not the only aspect!).
Proponents of normal distribution VaR typically use tail probabilities of
0.01 or 0.05. When using SET Lp (ε) risk managers may wish to use other tail
probabilities such as 0.1, 0.15, 0.20, 0.25, or 0.5. We note that use of different
tail probabilities is similar in spirit to using different utility functions.
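For a single fitted stable law, SVaR and SETL can be approximated by simple Monte Carlo as sketched below; the parameter values are illustrative and the sample mean of the simulated returns stands in for SERp.

```python
# Hedged sketch: Monte Carlo estimates of SVaR_p(eps) and SETL_p(eps) under a fitted
# stable return distribution, with losses measured relative to the expected return.
import numpy as np
from scipy.stats import levy_stable

def svar_setl(alpha, beta, loc, scale, eps, n=200_000, seed=5):
    rng = np.random.default_rng(seed)
    r = levy_stable.rvs(alpha, beta, loc=loc, scale=scale, size=n, random_state=rng)
    losses = -r + r.mean()                      # L_p = -R_p + SER_p (sample proxy)
    svar = np.quantile(losses, 1 - eps)         # epsilon-tail quantile of the loss
    setl = losses[losses > svar].mean()         # conditional mean loss beyond SVaR
    return svar, setl

print(svar_setl(alpha=1.7, beta=-0.1, loc=0.0002, scale=0.006, eps=0.05))
```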
The following assumptions are in force for the SETL investor:
(A1) The universe of assets is Q (the set of mandate admissible port-
folios)
(A2) The investor may borrow or deposit at the risk-free rate rf with-
out restriction
(A3) The portfolio is optimized under a set of asset allocation con-
straints λ
(A4) The investor seeks an expected return of at least µ

To simplify the notation we shall let A3 be implicit in the following dis-


cussion. At times we shall also suppress the ε when its value is taken as fixed
and understood.
The SETL investor's optimal portfolio is
$$ \omega_\alpha(\mu\,|\,\varepsilon) = \arg\min_{q \in Q} SETL_q(\varepsilon) \quad \text{subject to} \quad SER_q \ge \mu. $$
Here we use ωα to mean either the resulting portfolio weights or the label
for the portfolio itself, depending upon the context. The subscript α is a reminder that we are using a GMstable distribution modeling approach (which entails
different stable distribution parameters for each asset and risk factor). In other
words the SETL optimum portfolio ωα minimizes the expected tail loss among
all portfolios with mean return at least µ , for fixed tail probability ε and asset
allocation constraints λ. Alternatively, the SETL optimum portfolio ωα solves
the dual problem
$$ \omega_\alpha(\eta\,|\,\varepsilon) = \arg\max_{q \in Q} SER_q \quad \text{subject to} \quad SETL_q(\varepsilon) \le \eta. $$
The SETL efficient frontier is given by ωα(µ|ε) as a function of µ for fixed ε, as indicated in Fig. 15. If the portfolio includes a cash account with risk-free rate rf, then the SETL efficient frontier will be the SETL capital market line (CMLα) that connects the risk-free rate on the vertical axis with the SETL tangency portfolio (Tα), as indicated in the figure.
We now have a SETL separation principle analogous to the classical separation principle: the tangency portfolio Tα can be computed without reference to the risk-return preferences of any investor. Then an investor chooses a portfolio along the SETL capital market line CMLα according to his/her risk-return preference.

Fig. 15. The SETL efficient frontier and the capital market line

Keep in mind that in practice, when working with a finite sample of returns, one ends up with a SETL efficient frontier, tangency portfolio, and capital market line that are estimates of the true values of these quantities.

5.3 Markowitz Portfolios are Sub-Optimal

While the SETL investor has optimal portfolios described above, the
Markowitz investor is not aware of the SETL framework and constructs
a mean-variance optimal portfolio. We assume that the Markowitz investor
operates under the same assumptions A1-A4 as the SETL investor. Let ERq
be the expected return and σq the standard deviation of the returns of a
portfolio q. The Markowitz investor’s optimal portfolio is

$$ \omega_2(\mu) = \arg\min_{q \in Q} \sigma_q \quad \text{subject to} \quad ER_q \ge \mu, $$
along with the other constraints λ.
The Markowitz optimal portfolio can also be constructed by solving the
obvious dual optimization problem.
The subscript 2 is used in ω2 as a reminder that when α = 2 you have the
limiting Gaussian distribution member of the stable distribution family, and
in that case the Markowitz portfolio is optimal. Alternatively you can think
of the subscript 2 as a reminder that the Markowitz optimal portfolio is a
second-order optimal portfolio, i.e., an optimal portfolio based on only first
and second moments.
The Markowitz investor ends up with a different portfolio, i.e., a differ-
ent set of portfolio weights with different risk versus return characteristics,
than the SETL investor. It is important to note that the performance of the
Markowitz portfolio, like that of the SETL portfolio, is evaluated under a
GMstable distributional model. If in fact the distribution of the returns were
exactly multivariate normal (which they never are) then the SETL investor
and the Markowitz investor would end up with one and the same optimal
portfolio. However, when the returns are non-Gaussian SETL returns, the
Markowitz portfolio is sub-optimal. This is because the SETL investor con-
structs his/her optimal portfolio using the correct distribution model, while
the Markowitz investor does not. Thus the Markowitz investor's frontier lies
below and to the right of the SETL efficient frontier, as shown in Fig. 16, along
with the Markowitz tangency portfolio T2 and Markowitz capital market line
CM L2 .
As an example of the performance improvement achievable with the SETL
optimal portfolio approach, we computed the SETL efficient frontier and the
Markowitz frontier for a portfolio of 47 micro-cap stocks with the smallest
alphas from the random selection of 182 micro-caps in Sect. 3.1. The results are displayed in Fig. 17. The results are based on 3,000 scenarios from the fitted GMstable distribution model based on two years of daily data during years 2000 and 2001. We note that, as is generally the case, each of the 47 stock returns has its own estimated stable tail index α̂i, i = 1, 2, . . . , 47.

Fig. 16. The SETL and the Markowitz efficient frontiers

Fig. 17. SETL and Markowitz efficient portfolios, a micro-cap example: daily returns of 47 micro-caps, 2000–2001 (expected return in basis points per day versus tail risk at 1% tail probability)
Here we have plotted values of TailRisk = ε · SETL(ε), for ε = 0.01, as a natural decision theoretic risk measure, rather than SETL(ε) itself. We note that over a considerable range of tail risk the SETL efficient frontier dominates the Markowitz frontier by 14–20 bps daily!

We note that the 47 micro-caps with the smallest alphas used for this
example have quite heavy tails as indicated by the box plot of their estimated
alphas shown below.
Here the median of the estimated alphas is 1.38, while the upper and lower
quartiles are 1.43 and 1.28 respectively. Evidently there is a fair amount of
information in the non-Gaussian tails of such micro-caps that can be exploited
by the SETL approach.

5.4 From Sharpe to STARR-Performance and R-Performance Measures

The Sharpe Ratio for a given portfolio p is defined as follows:


$$ SR_p = \frac{ER_p - r_f}{\sigma_p} \qquad (1) $$
where ER p is the portfolio expected return, σ p is the portfolio return standard
deviation as a measure of portfolio risk, and rf is the risk-free rate. While the
Sharpe ratio is the single most widely used portfolio performance measure,
it has several disadvantages due to its use of the standard deviation as risk
measure:
• σp is a symmetric measure that does not focus on downside risk
• σp is not a coherent measure of risk (see Artzner et al. 1999)
• σp has an infinite value for non-Gaussian stable distributions

Stable Tail Adjusted Return Ratio

As an alternative performance measure that does not suffer these disadvan-


tages, we propose the Stable Tail Adjusted Return Ratio (STARR) defined as:
$$ STARR_p(\varepsilon) = \frac{SER_p - r_f}{SETL_p(\varepsilon)}. \qquad (2) $$
Referring to the first figure in Sect. 5.3, one sees that a SETL optimal
portfolio produces the maximum STARR under a SETL distribution model,
and that this maximum STARR is just the slope of the SETL capital market
line CM Lα . On the other hand the maximum STARR of a Markowitz port-
folio is equal to the slope of the Markowitz capital market line CM L2 . The
latter is always dominated by CM Lα , and is equal to CM Lα only in the case
where the returns distribution is multivariate normal in which case α = 2 for
all asset and risk factor returns. Referring to the second figure of Sect. 5.3,
one sees that for a relatively high risk-free rate of 5 bps per day, the STARR
for the SETL portfolio dominates that of the Markowitz portfolio. Further-
more this dominance appears quite likely to persist if the efficient frontiers
were calculated for lower risk and return positions and smaller risk-free rates
were used.

We conclude that the risk adjusted return of the SETL optimal portfolio
ωα is generally superior to the risk adjusted return of the Markowitz mean
variance optimal portfolio ω2 . The SETL framework results in improved in-
vestment performance.

Rachev Ratio

The Rachev Ratio (R-ratio) is the ratio between the expected excess tail-
return at a given confidence level and the expected excess tail loss at another
confidence level:
$$ \rho(r) = \frac{ETL_{\gamma_1}\!\left(x'(r_f - r)\right)}{ETL_{\gamma_2}\!\left(x'(r - r_f)\right)} $$
Here the levels γ1 and γ2 are in [0,1], x is the vector of asset allocations
and r − rf is the vector of asset excess returns. Recall that if r is the portfolio
return, and L = −r is the portfolio loss, we define the expected tail loss as $ETL_{\alpha\%}(r) = E(L \,|\, L > VaR_{\alpha\%})$, where $P(L > VaR_{\alpha\%}) = \alpha$ and α is in (0,1). The R-Ratio is a generalization of the STARR. Choosing appropriate levels γ1 and γ2 in optimizing the R-Ratio, the investor can seek the best risk/return
profile of her portfolio. For example, an investor with portfolio allocation
maximizing the R-Ratio with γ1 = γ2 =0.01 is seeking exceptionally high
returns and protection against high losses.
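With scenario (or historical) portfolio returns in hand, both ratios can be estimated empirically as sketched below; the risk-free rate and tail levels are illustrative, and the empirical ETL here simply averages the worst ε-fraction of outcomes.

```python
# Hedged sketch: empirical STARR and Rachev ratio from a series of portfolio returns.
import numpy as np

def empirical_etl(x, level):
    """Average loss in the worst `level` tail of the series x (losses = -x)."""
    losses = -np.asarray(x, dtype=float)
    var = np.quantile(losses, 1 - level)
    return losses[losses >= var].mean()

def starr(returns, rf, eps):
    excess = np.asarray(returns) - rf
    return excess.mean() / empirical_etl(excess, eps)

def rachev_ratio(returns, rf, g1, g2):
    excess = np.asarray(returns) - rf
    # expected excess tail gain over expected excess tail loss
    return empirical_etl(-excess, g1) / empirical_etl(excess, g2)

rng = np.random.default_rng(6)
r = 0.0005 + 0.01 * rng.standard_t(4, size=5000)     # stand-in portfolio returns
print(starr(r, rf=0.0001, eps=0.05), rachev_ratio(r, rf=0.0001, g1=0.05, g2=0.05))
```

Note that, for simplicity, this sketch uses the ETL of excess returns in the STARR denominator, whereas (2) defines SETL on losses relative to the stable expected return.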

5.5 The Choice of Tail Probability

We mentioned earlier that when using SET Lp (ε) rather than V aRp (ε), risk
managers and portfolio optimizers may wish to use other values of ε than the
conventional VaR values of .01 or .05, for example values such as 0.1, 0.15,
0.2, 0.25 and 0.5 may be of interest. The choice of a particular ε amounts to
a choice of particular risk measure in the SETL family of measures, and such
a choice is equivalent to the choice of a utility function. The tail probability
parameter ε is at the asset manager’s disposal to choose according to his/her
asset management and risk control objectives.
Note that choosing a tail probability ε is not the same as choosing a risk
aversion parameter. Maximizing

SERp − c · SET Lp (ε)

for various choices of risk aversion parameter c for a fixed value of ε merely
corresponds to choosing different points along the SETL efficient frontier. On
the other hand changing ε results in different shapes and locations of the SETL
efficient frontier, and corresponding different SETL excess profits relative to
a Markowitz portfolio.
It is intuitively clear that increasing ε will decrease the degree to which a
SETL optimal portfolio depends on extreme tail losses. In the limit of ε = 0.5,
which may well be of interest to some managers since it uses the average

loss below zero of Lp as its penalty function, small to moderate losses are
mixed in with extreme losses in determining the optimal portfolio. There is
some concern that some of the excess profit advantage relative to Markowitz
portfolios will be given up as ε increases. Our studies to date indicate, not
surprisingly, that this effect is most noticeable for portfolios with smaller
stable tail index values.
It will be interesting to see going forward what values of ε will be used by
fund managers of various types and styles.
A generalization of the SETL efficient frontier is the R-efficient frontier, obtained by replacing the stable portfolio expected return SERp in SERp − c · SETLp(ε) by the excess tail return, the numerator in the R-ratio. The R-efficient frontier allows for fine tuning of the tradeoff between high excess mean returns and protection against large losses.

5.6 The Cognity Implementation of the SETL Framework

The SETL framework described in this paper has been implemented in the
CognityTM Risk Management and Portfolio Optimization product. This
product contains solution modules for Market Risk, Credit Risk (with inte-
grated Market and Credit Risk), Portfolio Optimization, and Fund-of-Funds
portfolio management, with integrated factor models. CognityTM is imple-
mented in a modern Java based server architecture to support both desktop
and Web delivery. For further details see www.finanalytica.com.

Acknowledgements

The authors gratefully acknowledge the extensive help provided by Stephen


Elston and Frederic Siboulet in the preparation of this paper. The authors
owe a special debt to Paul Wilmott for extensive suggestions on an earlier
version of our work, great understanding and encouragement.

References
[1] Bradley, B. O. and Taqqu, M. S. (2003). “Financial Risk and Heavy
Tails”, in Handbook of Heavy Tailed Distributions in Finance, edited by
S. T. Rachev, Elsevier/North-Holland, Amsterdam
[2] Fama, E. (1963). “Mandelbrot and the Stable Paretian Hypothesis”,
Journal of Business, 36, 420–429
[3] Mandelbrot, B. B. (1963). “The Variation in Certain Speculative Prices”,
Journal of Business, 36, 394–419
[4] Rachev, S. and Mittnik, S. (2000). Stable Paretian Models in Finance.
Wiley, New York

[5] Racheva-Iotova B., Stoyanov, S., and Rachev S. (2003). Stable Non-
Gaussian Credit Risk Model; The Cognity Approach, in Credit Risk
(Measurement, Evaluations and Management), edited by G. Bol,
G. Nakhaheizadeh, S. Rachev, T. Rieder, K-H. Vollmer, Physica-Verlag
Series: Contributions to Economics, Springer, Heidelberg, NY, 179–198
[6] Rachev, S., Menn, C., and Fabozzi, F.J. (2005). Fat Tailed and
Skewed Asset Return Distributions: Implications for Risk, Wiley-Finance,
Hoboken
[7] Rockafellar, R. T. and Uryasev, S. (2000). “Optimization of Conditional
Value-at-Risk”, Journal of Risk, 3, 21–41
[8] Sklar, A. (1996). “Random Variables, Distribution Functions, and Copu-
las – a Personal Look Backward and Forward”, in Distributions with Fixed
Marginals and Related Topics, edited by Ruschendorff et. al., Institute of
Mathematical Sciences, Hayward, CA
[9] Stoyanov, S., Racheva-Iotova, B. (2004). “Univariate Stable Laws in the Field of Finance - Parameter Estimation”, Journal of Concrete and Applied Mathematics, 2, 369–396
Pricing Tranches of a CDO and a CDS Index:
Recent Advances and Future Research

Dezhong Wang (1), Svetlozar T. Rachev (2), and Frank J. Fabozzi (3)

(1) Department of Applied Probability and Statistics, University of California, Santa Barbara CA, USA, dezhongwang@aol.com
(2) Department of Econometrics, Statistics and Mathematical Finance, University of Karlsruhe, Germany, and Department of Applied Probability and Statistics, University of California, Santa Barbara CA, USA, rachev@statistik.uni-karlsruhe.de
(3) Yale School of Management, New Haven CT, USA, frank.fabozzi@yale.edu

1 Introduction
In recent years, the market for credit derivatives has developed rapidly with
the introduction of new contracts and the standardization of trade documen-
tation. These include credit default swaps, basket default swaps, credit default
swap indexes, collateralized debt obligations, and credit default swap index
tranches. Along with the introduction of new products comes the issue of how
to price them. For single-name credit default swaps, there are several factor
models (one-factor and two-factor models) proposed in the literature. How-
ever, for credit portfolios, much work has to be done in formulating models
that fit market data. The difficulty in modeling lies in estimating the correla-
tion risk for a portfolio of credits. In an April 16, 2004 article in the Financial
Times [5], Darrell Duffie made the following comment on modeling portfo-
lio credit risk: “Banks, insurance companies and other financial institutions
managing portfolios of credit risk need an integrated model, one that reflects
correlations in default and changes in market spreads. Yet no such model
exists.” Almost a year later, a March 2005 publication by the Bank for Inter-
national Settlements noted that while a few models have been proposed, the
modeling of these correlations is “complex and not yet fully developed.” [1].
In this paper, first we review three methodologies for pricing CDO
tranches. They are the one-factor copula model, the structural model, and
the loss process model. Then we propose how the models can be improved.
The paper is structured as follows. In the next section we review credit de-
fault swaps and in Sect. 3 we review collateralized debt obligations and credit
default swap index tranches. The three pricing models are reviewed in Sects. 4
(one-factor copula model), 5 (structural model), and 6 (loss process model).
Our proposed models are provided in Sect. 7 and a summary is provided in
the final section, Sect. 8.

2 Overview of Credit Default Swaps

The major risk-transferring instrument developed in the past few years has
been the credit default swap. This derivative contract permits market partici-
pants to transfer credit risk for individual credits and credit portfolios. Credit
default swaps are classified as follows: single-name swaps, basket swaps, and
credit default index swaps.

2.1 Single-Name Credit Default Swap

A single-name credit default swap (CDS) involves two parties: a protection seller and a protection buyer. The protection buyer pays the protection seller
a swap premium on a specified amount of face value of bonds (the notional
principal) for an individual company (reference entity/reference credit). In re-
turn the protection seller pays the protection buyer an amount to compensate
for the loss of the protection buyer upon the occurrence of a credit event with
respect to the underlying reference entity.
In the documentation of a CDS contract, a credit event is defined. The list
of credit events in a CDS contract may include one or more of the following:
bankruptcy or insolvency of the reference entity, failure to pay an amount
above a specified threshold over a specified period, and financial or debt re-
structuring. The swap premium is paid on a series of dates, usually quarterly
in arrears based on the actual/360 day count convention.
In the absence of a credit event, the protection buyer will make a quarterly
swap premium payment until the expiration of a CDS contract. If a credit
event occurs, two things happen. First, the protection buyer pays the accrued
premium from the last payment date to the time of the credit event to the
seller (on a day-count fraction basis). After that payment, there are no further
payments of the swap premium by the protection buyer to the protection
seller. Second, the protection seller makes a payment to the protection buyer.
There can be either cash settlement or physical settlement. In cash settlement,
the protection seller pays the protection buyer an amount of cash equal to the
difference between the notional principal and the present value of an amount
of bonds, whose face value equals the notional principal, after a credit event.
In physical settlement, the protection seller pays the protection buyer the
notional principal, and the protection buyer delivers to the protection seller
bonds whose face value equals the notional principal. At the time of this
writing, the market practice is physical settlement.

2.2 Basket Default Swap

A basket default swap is a credit derivative on a portfolio of reference entities. The simplest basket default swaps are first-to-default swaps, second-to-default
swaps, and nth-to-default swaps. With respect to a basket of reference entities,
a first-to-default swap provides insurance for only the first default, a second-to-
default swap provides insurance for only the second default, an nth-to-default
swap provides insurance for only the nth default. For example, in an nth-to-
default swap, the protection seller does not make a payment to the protection
buyer for the first n − 1 defaulted reference entities, and makes a payment for
the nth defaulted reference entity. Once there is a payment upon the default
of the nth defaulted reference entity, the swap terminates. Unlike a single-
name CDS, the preferred settlement method for a basket default swap is cash
settlement.

2.3 Credit Default Swap Index

A credit default swap index (denoted by CDX) contract provides protection against the credit risk of a standardized basket of reference entities. The me-
chanics of a CDX are slightly different from that of a single-name CDS. If a
credit event occurs, the swap premium payment ceases in the case of a single-
name CDS. In contrast, for a CDX the swap premium payment continues to
be made by the protection buyer but based on a reduced notional amount
since less reference entities are being protected. As of this writing, settlement
for a CDX is physical settlement.4
Currently, there are two families of standardized indexes: the Dow Jones
CDX5 and the International Index Company iTraxx.6 The former includes
reference entities in North America and emerging markets, while the latter
includes reference entities in markets in Europe and Asia. Both families of in-
dexes are standardized in terms of the index composition procedure, premium
payment, and maturity.
The two most actively traded indexes are the Dow Jones CDX NA IG
index and the iTraxx Europe index. The former includes 125 North American
investment-grade companies. The latter includes 125 European investment-
grade companies. For both indexes, each company is equally weighted. Also
for these two indexes, CDX contracts with 3-, 5-, 7- and 10-year maturities
are available.
The composition of reference entities included in a CDX is renewed every six months based on a vote of participating dealers. The start date of a new version index is referred to as the roll date. The roll date is March 20 and September 20 of a calendar year, or the following business days if these days are
not business days. A new version index will be “on-the-run” for the next six months. The composition of each version of a CDX remains static in its lifetime if no default occurs to the underlying reference entities, and the defaulted reference entities are eliminated from the index.

4 The market is considering moving to cash settlement because of the cost of delivering an odd lot in the case of a credit event for a reference entity. For example, if the notional amount of a contract is $20 million and a credit event occurs, the protection buyer would have to deliver to the protection seller bonds of the reference entity with a face value of $160,000. Neither the protection buyer nor the protection seller likes to deal with such a small position.
5 www.djindexes.com/mdsidx/?index=cdx.
6 www.indexco.com.
There are two kinds of contracts on CDXs: unfunded and funded. An
unfunded contract is a CDS on a portfolio of names. This kind of contract is
traded on all the Dow Jones CDX and the iTraxx indexes. For some CDXs
such as the Dow Jones CDX NA HY index and its sub-indexes7 and the
iTraxx Europe index, the funded contract is traded. A funded contract is a
credit-linked note (CLN), allowing investors who, because of client-imposed or regulatory restrictions, are not permitted to invest in derivatives to gain risk
exposure to the CDX market. The funded contract works like a corporate bond
with some slight differences. A corporate bond ceases when a default occurs
to the reference entity. If a default occurs to a reference entity in an index,
the reference entity is removed from the index (and also from the funded
contract). The funded contract continues with a reduced notional principal
for the surviving reference entities in the index. Unlike the unfunded contract
which uses physical settlement, the settlement method for the funded contract
is cash settlement.
The index swap premium of a new version index is determined before the roll date and is unchanged over its lifetime; it is referred to as the coupon or the deal spread. The price difference between the prevailing market spread
and the deal spread is paid upfront. If the prevailing market spread is higher
than the deal spread, the protection buyer pays the price difference to the
protection seller. If the prevailing market spread is less than the deal spread,
the protection seller pays the price difference to the protection buyer. The
index premium payments are standardized quarterly in arrears on the 20th of
March, June, September, and December of each calendar year.
The CDXs have many attractive properties for investors. Compared with
the single-name swaps, the CDXs have the advantages of diversification and
efficiency. Compared with basket default swaps and collateralized debt obli-
gations, the CDXs have the advantages of standardization and transparency.
The CDXs are traded more actively than the single-name CDSs, with low
bid–ask spreads.

3 CDOs and CDS Index Tranches


Based on the technology of basket default swaps, the layer protection tech-
nology is developed for protecting portfolio credit risk. Basket default swaps
provide protection against a single default in a portfolio of reference entities, for example, the first default, the second default, and the nth default. Correspondingly, there are the first layer protection, the second layer protection, and the nth layer protection. These protection layers work like basket default swaps with some differences. The main difference is that the nth basket default swap protects the nth default in a portfolio, while the nth protection layer protects the nth layer of the principal of a portfolio, which is specified by a range of percentages, for example 15–20%. The layer protection derivative products include collateralized debt obligations and CDS index tranches.

7 The Dow Jones CDX NA HY index includes 100 equal-weighted North America High Yield reference entities. Its sub-indexes include the CDX NA HY B (B-rated), CDX NA HY BB (BB-rated), and CDX NA HY HB (High Beta) indexes.

3.1 Collateralized Debt Obligation

A collateralized debt obligation (CDO) is a security backed by a diversified pool of one or more kinds of debt obligations such as bonds, loans, credit de-
fault swaps or structured products (mortgage-backed securities, asset-backed
securities, and even other CDOs). A CDO is initiated by a sponsor which can
be banks, nonbank financial institutions, and asset management companies.
The sponsor of a CDO creates an entity called a special purpose vehicle (SPV).
The SPV works as an independent entity. In this way, CDO investors are iso-
lated from the credit risk of the sponsor. Moreover, the SPV is responsible for
the administration. The SPV obtains the credit risk exposure by purchasing
debt obligations (bonds or residential and commercial loans) or selling CDSs;
it transfers the credit risk by issuing debt obligations (tranches/credit-linked
notes). The investors in the tranches of a CDO have the ultimate credit risk
exposure to the underlying reference entities.
Figure 1 shows the basic structure of a CDO backed by a portfolio of bonds.
The SPV issues four kinds of CLNs referred to as tranches. Each tranche has
an attachment percentage and a detachment percentage. When the cumulative
percentage loss of the portfolio of bonds reaches the attachment percentage,
investors in the tranche start to lose their principal, and when the cumulative

percentage loss of principal reaches the detachment percentage, the investors in the tranche lose all their principal and no further loss can occur to them.

Fig. 1. Structure of a collateralized debt obligation: the collateral pool of bonds (Bond 1, Bond 2, ..., Bond n) passes coupons and principal to the SPV, which issues four tranches: Tranche 1 (0–5%), Tranche 2 (5–15%), Tranche 3 (15–30%), and Tranche 4 (30–70%)
For example, in Fig. 1 the second tranche has an attachment percentage of
5% and a detachment percentage of 15%. The tranche will be used to cover
the cumulative loss during the life of a CDO in excess of 5% (its attachment
percentage) and up to a maximum of 15% (its detachment percentage).
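To make the attachment and detachment mechanics concrete, the following minimal Python sketch (our own illustration, not taken from any of the cited models; the function name and the 8% loss figure are hypothetical) computes the loss absorbed by a tranche for a given cumulative portfolio loss:

def tranche_loss(portfolio_loss, attachment, detachment):
    # loss absorbed by the tranche, as a fraction of the portfolio notional
    # (all arguments are decimals, e.g. 0.05 for 5%)
    return min(max(portfolio_loss - attachment, 0.0), detachment - attachment)

# The 5-15% tranche of Fig. 1 when the cumulative portfolio loss is 8%:
print(tranche_loss(0.08, 0.05, 0.15))   # 0.03, i.e. 3% of the portfolio notional,
                                        # which is 30% of the tranche principal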
In the literature, tranches of a CDO are classified as subordinate/equity
tranche, mezzanine tranches, and senior tranches according to their subordi-
nate levels (see [12]). For example, in Fig. 1 tranche 1 is an equity tranche,
tranches 2 and 3 are mezzanine tranches, and tranche 4 is a senior tranche.
Because the equity tranche is extremely risky, the sponsor of a CDO is one of
the holders of the equity tranche and the SPV sells other tranches to investors.
If the SPV of a CDO actually owns the underlying debt obligations, the
CDO is referred to as a cash CDO. Cash CDOs can be classified as collat-
eralized bond obligations (CBO) and collateralized loan obligation (CLO).
The former have only bonds in their pool of debt obligations, and the latter
have only commercial loans in their pool of debt obligations. If the SPV of
a CDO does not own the debt obligations, instead obtaining the credit risk
exposure by selling CDSs on the debt obligations of reference entities, the
CDO is referred to as a synthetic CDO.
Based on the motivation of sponsors, CDOs can be classified as balance
sheet CDOs and arbitrage CDOs. The motivation for balance sheet CDOs
(primarily CLO) is to transfer the risk of loans in a sponsoring bank’s port-
folio in order to reduce regulatory capital requirements. The motivation for
arbitrage CDOs is to arbitrage the interest difference between the underlying
pool of debt obligations and CDO tranches.

3.2 CDS Index Tranches

With the innovation of CDXs, the synthetic CDO technology is applied to slice CDXs, and standardized tranches with different subordinate levels are
created to satisfy investors with different risk appetites. The tranches of an
index provide the layer protections to the underlying portfolio of the index in
the same way as the tranches of a CDO provide the layer protections to the
underlying portfolio of the CDO as explained earlier.
Both of the most actively traded indexes – the Dow Jones CDX NA IG
and the iTraxx Europe – are sliced into five tranches: equity tranche, junior
mezzanine tranche, senior mezzanine tranche, junior senior tranche, and super
senior tranche. The standard tranche structure of the Dow Jones CDX NA IG
is 0–3%, 3–7%, 7–10%, 10–15%, and 15–30%. The standard tranche structure
of the iTraxx Europe is 0–3%, 3–6%, 6–9%, 9–12%, and 12–22%.
Table 1 shows the index and tranches market quotes for the CDX NA IG
and the iTraxx Europe on August 4, 2004. For both indexes, the swap pre-
mium of the equity tranche is paid differently from the non-equity tranches. It
includes two parts: (1) the upfront percentage payment and (2) the fixed 500
Table 1. CDS index and tranche market quotes – August 4, 2004

iTraxx Europe (5 year)
Index    0–3%     3–6%    6–9%    9–12%   12–22%
42       27.6%    168     70      43      20

CDX NA IG (5 year)
Index    0–3%     3–7%    7–10%   10–15%  15–30%
63.25    48.1%    347     135.5   47.5    14.5

Index and non-equity tranche quotes are in basis points; the equity tranche (0–3%) quote is the upfront percentage payment. Data are collected by GFI Group Inc. and used in [6]

basis points premium per annum. The market quote is the upfront percent-
age payment. For example, the market quote of 27.6% for the iTraxx equity tranche means that the protection buyer pays the protection seller 27.6% of the
notional principal upfront. In addition to the upfront payment, the protection
buyer also pays the protection seller the fixed 500 basis points premium per
annum on the outstanding notional principal. For all the non-equity tranches,
the market quotes are the premium in basis points, paid quarterly in arrears.
Just like the indexes, the premium payments for the tranches (with the ex-
ception of the upfront percentage payment of the equity tranche) are made
on the 20th of March, June, September, and December of each calendar year.
Following the commonly accepted definition for a synthetic CDO, CDX
tranches are not part of a synthetic CDO because they are not backed by a
portfolio of bonds or CDSs [6]. In addition, CDX tranches are unfunded and
they are insurance contracts, while synthetic CDO tranches are funded and
they are CLNs. However, the net cash flows of index tranches are the same as those of synthetic CDO tranches, and index tranches can be priced in the same way as a synthetic CDO.

4 One-Factor Copula Model

The critical input for pricing synthetic CDO and CDS index tranches is an es-
timate of the default dependence (default correlation) between the underlying
assets. One popular method for estimating the dependence structure is using
copula functions, a method first applied in actuarial science. While there are
several types of copula function models, Li [10, 11] introduces the one-factor
Gaussian copula model for the case of two companies and Laurent and Gre-
gory [9] extend the model to the case of N companies. Several extensions to
the one-factor Gaussian copula model were subsequently introduced into the
literature. In this section, we provide a general description of the one-factor
copula function, introduce the market standard model, and review both the
one-factor double t copula model [6] and the one-factor normal inverse Gaus-
sian copula model [8].
Suppose that a CDO includes n assets i = 1, 2, . . . , n and the default time
τi of the ith asset follows a Poisson process with a parameter λi . The λi is the
default intensity of the ith asset. Then the probability of a default occurring
before time t is
P (τi < t) = 1 − exp(−λi t). (1)
In a one-factor copula model, it is assumed that the default time τi for the
ith company is related to a random variable Xi with a zero mean and a unit
variance. For any given time t, there is a corresponding value x such that

P (Xi < x) = P (τi < t), i = 1, 2, . . . , n. (2)

Moreover, the one-factor copula model assumes that each random variable Xi
is the sum of two components
X_i = a_i M + \sqrt{1 - a_i^2}\, Z_i, \qquad i = 1, 2, \ldots, n,   (3)

where Z_i is the idiosyncratic component of company i, and M is the common component of the market. It is assumed that the M and Z_i's are mutually
independent random variables. For simplicity, it is also assumed that the ran-
dom variables M and Z_i's are identically distributed. The factor a_i satisfies −1 ≤ a_i ≤ 1. The default correlation between X_i and X_j is a_i a_j (i ≠ j).
Let F denote the cumulative distribution of the Zi ’s and G denote the
cumulative distribution of the Xi ’s. Then given the market condition M = m,
we have
P(X_i < x \mid M = m) = F\!\left(\frac{x - a_i m}{\sqrt{1 - a_i^2}}\right),   (4)
and the conditional default probability is
P(\tau_i < t \mid M = m) = F\!\left(\frac{G^{-1}[P(\tau_i < t)] - a_i m}{\sqrt{1 - a_i^2}}\right).   (5)

For simplicity, the following two assumptions are made:


• All the companies have the same default intensity, i.e., λ_i = λ.
• The pairwise default correlations are the same, i.e., in (3), a_i = a.
The second assumption means that the contribution of the market compo-
nent is the same for all the companies and the correlation between any two
companies is constant, β = a2 .
Under these assumptions, given the market situation M = m, all the
companies have the same cumulative risk-neutral default probability Dt|m .
Moreover, for a given value of the market component M , the defaults are
mutually independent for all the underlying companies. Letting Nt|m be the
total defaults that have occurred by time t conditional on the market condition
M = m, then Nt|m follows a binomial distribution Bin(n, Dt|m ), and

P(N_{t|m} = j) = \frac{n!}{j!(n-j)!}\, D_{t|m}^{j}\,(1 - D_{t|m})^{n-j}, \qquad j = 0, 1, 2, \ldots, n.   (6)
The probability that there will be exactly j defaults by time t is
P(N_t = j) = E^{M}\big[P(N_{t|m} = j)\big] = \int_{-\infty}^{\infty} P(N_{t|m} = j)\, f_M(m)\, dm,   (7)

where fM (m) is the probability density function (pdf) of the random vari-
able M .
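As an illustration of equations (1)–(7), the following minimal Python sketch (our own illustration; the portfolio size, horizon, default intensity, and factor loading are hypothetical) evaluates the loss distribution when F and G are taken to be standard normal, the Gaussian specialization discussed in the next subsection, approximating the integral over M in (7) by Gauss-Hermite quadrature:

import numpy as np
from scipy.stats import norm, binom

n, t, lam, a = 125, 5.0, 0.01, 0.5     # hypothetical inputs

p_t = 1.0 - np.exp(-lam * t)           # eq (1): P(tau_i < t)
x_t = norm.ppf(p_t)                    # eq (2) with G the standard normal cdf

# nodes/weights for integrating against the standard normal density of M
m_nodes, m_weights = np.polynomial.hermite_e.hermegauss(60)
m_weights = m_weights / np.sqrt(2.0 * np.pi)

# eq (5): conditional default probability D_{t|m} at each node
d_tm = norm.cdf((x_t - a * m_nodes) / np.sqrt(1.0 - a**2))

# eqs (6)-(7): mix the conditional binomial loss distributions over M
j = np.arange(n + 1)
loss_dist = np.array([binom.pmf(j, n, d) for d in d_tm])
p_defaults = m_weights @ loss_dist     # P(N_t = j), j = 0, ..., n

print("P(no default by t):", p_defaults[0])
print("expected defaults :", p_defaults @ j, " (equals n * p_t =", n * p_t, ")")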

4.1 Market Standard Model

Li [10, 11] was the first to suggest that the Gaussian copula can be employed
in credit risk modeling to estimate the default correlation. In a one-factor
Gaussian copula model, the distributions of the common market component
M and the individual component Zi ’s in (3) are standard normal Gaussian
distributions. Because the sum of two independent Gaussian distributions is
still a Gaussian distribution, the Xi ’s in (3) have a closed form. It can be
verified that the Xi ’s have a standard normal distribution.
The one-factor Gaussian copula model is the market standard
model when implemented under the following assumptions:
• A fixed recovery rate of 40%
• The same CDS spreads for all of the underlying reference entities
• The same pairwise correlations
• The same default intensities for all the underlying reference entities
The market standard model does not appear to fit market data well (see
[6, 8]). In practice, market practitioners use implied correlations and base cor-
relations.
The implied correlation for a CDO tranche is the correlation that makes
the value of a contract on the CDO tranche zero when pricing the CDO with
the market standard model. For a CDO tranche, when inputting its implied
correlation into the market standard model, the simulated price of the tranche
should be its market price.
McGinty, Beinstein, Ahluwalia, and Watts [14] introduced base correla-
tions in CDO pricing. To understand base correlations, let’s use an example.
Recalling the CDX NA IG tranches 0–3%, 3–7%, 7–10%, 10–15%, and 15–30%,
and assuming there exists a sequence of equity tranches 0–3%, 0–7%, 0–10%,
0–15%, and 0–30%, the premium payment on an equity tranche is a combina-
tion of the premium payment of the CDX NA IG tranches that are included
in the corresponding equity tranche. For example, the equity tranche 0–10%
includes three CDX NA IG tranches: 0–3%, 3–7%, and 7–10%. The premium
payment on the equity tranche 0–10% includes three parts. The part of 0–3%
is paid the same way as the CDX NA IG tranche 0–3%, the part of 3–7% is
paid the same way as the CDX NA IG tranche 3–7%, and the part of 7–10%
is paid the same way as the CDX NA IG tranche 7–10%. Then the definition of
base correlation is the correlation input that make the prices of the contracts
on these series of equity tranches zero. For example, the base correlation for
the CDX NA IG tranche 7–10% is the implied correlation that makes the price
of a contract on the equity tranche 0-10% zero.

4.2 One-Factor Double t Copula Model

The natural extension to a one-factor Gaussian copula model uses heavy-tailed distributions. Hull and White [6] propose a one-factor double t copula model.
In the model, the common market component M and the individual compo-
nents Zi in (3) are assumed to have a normalized Student’s t distribution

M = \sqrt{(n_M - 2)/n_M}\; T_{n_M}, \qquad T_{n_M} \sim T(n_M),
Z_i = \sqrt{(n_i - 2)/n_i}\; T_{n_i}, \qquad T_{n_i} \sim T(n_i),   (8)

where Tn is a Student’s t distribution with degrees of freedom n = 3, 4, 5, . . . .


In the model, the distributions of Xi ’s do not have a closed form but
instead must be calculated numerically.
Hull and White [6] find that the one-factor double t copula model fits
market prices well when using the Student’s t distribution with 4 degrees of
freedom for M and Zi ’s.
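Because the X_i's in the double t copula have no closed-form distribution G, a common practical route is to tabulate G by simulation. The sketch below (our own illustration; the correlation parameter, sample size, and 5% probability are hypothetical) builds the normalized t(4) components of (8) and estimates the quantile G^{-1}[P(tau_i < t)] that enters (5) empirically:

import numpy as np

rng = np.random.default_rng(0)
nu, a, n_sims = 4, 0.5, 200_000

def normalized_t(df, size):
    # normalized Student's t as in eq (8): unit variance for df > 2
    return np.sqrt((df - 2.0) / df) * rng.standard_t(df, size)

M = normalized_t(nu, n_sims)
Z = normalized_t(nu, n_sims)
X = a * M + np.sqrt(1.0 - a**2) * Z     # eq (3) with double t components

p = 0.05                                # hypothetical P(tau_i < t)
print("empirical G^{-1}(0.05):", np.quantile(X, p))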

4.3 One-Factor Normal Inverse Gaussian Copula Model

Kalemanova, Schmid, and Werner [8] propose utilizing normal inverse Gaus-
sian distributions in a one-factor copula model. A normal inverse Gaussian
distribution is a mixture of normal and inverse Gaussian distributions.
An inverse Gaussian distribution has the following density function
f_{IG}(x; \zeta, \eta) = \begin{cases} \frac{\zeta}{\sqrt{2\pi\eta}}\, x^{-3/2} \exp\!\left(-\frac{(\zeta - \eta x)^2}{2\eta x}\right), & \text{if } x > 0, \\ 0, & \text{if } x \le 0, \end{cases}   (9)

where ζ > 0 and η > 0 are two parameters. We denote the inverse Gaussian
distribution as IG(ζ, η).
Suppose Y is an inverse Gaussian random variable. A Gaussian random variable X ∼ N(υ, σ²) is a normal inverse Gaussian (NIG) random variable when
its mean υ and variance σ 2 are random variables as given below

\upsilon = \mu + \beta Y, \qquad \sigma^2 = Y, \qquad Y \sim IG(\delta\gamma, \gamma^2),   (10)
where δ > 0, 0 ≤ |β| < α, and γ := \sqrt{\alpha^2 - \beta^2}. The distribution of the random variable X is denoted by X ∼ NIG(α, β, µ, δ). The density of X is
f(x; \alpha, \beta, \mu, \delta) = \frac{\delta\alpha \exp(\delta\gamma + \beta(x - \mu))}{\pi \sqrt{\delta^2 + (x - \mu)^2}}\, K\!\left(\alpha\sqrt{\delta^2 + (x - \mu)^2}\right),   (11)
where K(.) is the modified Bessel function of the third kind as defined below

K(\omega) := \frac{1}{2} \int_{0}^{\infty} \exp\!\left(-\frac{1}{2}\,\omega\,(t + t^{-1})\right) dt.   (12)
The mean and variance of the NIG distribution X are respectively
E(X) = \mu + \frac{\delta\beta}{\gamma}, \qquad \mathrm{Var}(X) = \frac{\delta\alpha^2}{\gamma^3}.   (13)
The family of NIG distributions has two main properties. One is the closure
under the scale transition
X \sim NIG(\alpha, \beta, \mu, \delta) \;\Rightarrow\; cX \sim NIG\!\left(\frac{\alpha}{c}, \frac{\beta}{c}, c\mu, c\delta\right).   (14)
The other is that if two independent NIG random variables X and Y have
the same α and β parameters, then the sum of these two variables is still an
NIG variable as shown below
X \sim NIG(\alpha, \beta, \mu_1, \delta_1),\; Y \sim NIG(\alpha, \beta, \mu_2, \delta_2) \;\Rightarrow\; X + Y \sim NIG(\alpha, \beta, \mu_1 + \mu_2, \delta_1 + \delta_2).   (15)
When using NIG distributions in a one-factor copula model, the model
is referred to as a one-factor normal inverse Gaussian copula model. The
distributions for M and Zi ’s in (3) are given below
M \sim NIG\!\left(\alpha,\ \beta,\ -\frac{\alpha\beta}{\sqrt{\alpha^2 - \beta^2}},\ \alpha\right),
Z_i \sim NIG\!\left(\frac{\sqrt{1 - a_i^2}}{a_i}\,\alpha,\ \frac{\sqrt{1 - a_i^2}}{a_i}\,\beta,\ -\frac{\sqrt{1 - a_i^2}}{a_i}\,\frac{\alpha\beta}{\sqrt{\alpha^2 - \beta^2}},\ \frac{\sqrt{1 - a_i^2}}{a_i}\,\alpha\right).   (16)

The distributions of the X_i's in (3) are
X_i \sim NIG\!\left(\frac{\alpha}{a_i},\ \frac{\beta}{a_i},\ -\frac{1}{a_i}\,\frac{\alpha\beta}{\sqrt{\alpha^2 - \beta^2}},\ \frac{\alpha}{a_i}\right).   (17)
The selection of the parameters makes the variables Xi ’s, M , and Zi ’s have a
zero mean and a unit variance when β = 0.
The one-factor normal inverse Gaussian copula model fits market data a
little bit better than the one-factor double t copula model. The advantage
of the one-factor normal inverse Gaussian copula model is that the Xi ’s in
the model have a closed form. This reduces the computing time significantly
compared with that of the one-factor double t copula model. The former is
about five times faster than the latter.
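The mixture representation (9)–(10) also gives a direct way to sample an NIG variable and check the moment formulas in (13). The sketch below is our own illustration with hypothetical parameters; it uses the fact that, under the parametrization in (9), IG(ζ, η) has mean ζ/η and shape ζ²/η, which is the (mean, scale) pair expected by numpy's wald sampler:

import numpy as np

rng = np.random.default_rng(1)
alpha, beta, mu, delta = 2.0, 0.3, -0.1, 1.5     # hypothetical NIG parameters
gamma = np.sqrt(alpha**2 - beta**2)

# eq (10): Y ~ IG(delta*gamma, gamma^2), i.e. mean delta/gamma and shape delta^2
n = 500_000
Y = rng.wald(mean=delta / gamma, scale=delta**2, size=n)
X = mu + beta * Y + np.sqrt(Y) * rng.standard_normal(n)   # X | Y ~ N(mu + beta*Y, Y)

print("sample mean:", X.mean(), " theory:", mu + delta * beta / gamma)    # eq (13)
print("sample var :", X.var(),  " theory:", delta * alpha**2 / gamma**3)  # eq (13)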
5 Structural Model
Hull, Predescu, and White [7] propose the structural model to price the default
correlation in tranches of a CDO or an index. The idea is based on Merton’s
model [17] and its extension by Black and Cox [3]. It is assumed that the value
of a company follows a stochastic process, and if the value of the company
goes below a minimum value (barrier), the company defaults.
In the model, N different companies are assumed and the value of company
i (1 ≤ i ≤ N ) at time t is denoted by Vi . The value of the company follows a
stochastic process as shown below

dVi = µi Vi dt + σi Vi dXi , (18)

where µi is the expected growth rate of the value of company i, σi is the volatil-
ity of the value of company i, and Xi (t) is a variable following a continuous-
time Gaussian stochastic process (Wiener process). The barrier for company
i is denoted by Bi . Whenever the value of company i goes below the barrier
Bi , it defaults.
Without loss of generality, it is assumed that X_i(0) = 0. Applying Ito's formula to ln V_i, it is easy to show that
X_i(t) = \frac{\ln V_i(t) - \ln V_i(0) - (\mu_i - \sigma_i^2/2)\,t}{\sigma_i}.   (19)
Corresponding to Bi , there is a barrier Bi∗ for the variable Xi as given below

B_i^* = \frac{\ln B_i - \ln V_i(0) - (\mu_i - \sigma_i^2/2)\,t}{\sigma_i}.   (20)
When X_i falls below B_i^*, company i defaults. Denote
\beta_i = \frac{\ln B_i - \ln V_i(0)}{\sigma_i}, \qquad \gamma_i = -\,\frac{\mu_i - \sigma_i^2/2}{\sigma_i},   (21)
then B_i^* = \beta_i + \gamma_i t.
To model the default correlation, it is assumed that each Wiener process Xi
follows a two-component process which includes a common Wiener process
M and an idiosyncratic Wiener process Zi . It is expressed as
dX_i(t) = a_i(t)\, dM(t) + \sqrt{1 - a_i^2(t)}\; dZ_i(t),   (22)

where the variable a_i, −1 ≤ a_i ≤ 1, is used to control the weight of the two-
component process. The Wiener processes M and Zi ’s are uncorrelated with
each other. In this model, the default correlation between two companies i
and j is ai aj .
The model can be implemented by Monte Carlo simulation. Hull, Predescu,
and White [7] implement the model in three different ways:
• Base case. Constant correlation and constant recovery rate.


• Stochastic Corr. Stochastic correlation and constant recovery rate.
• Stochastic RR. Stochastic correlation and stochastic recovery rate.
Two comparisons between the base-case structural model and the one-factor Gaussian copula model are provided. One is to calculate the joint default probabilities of two companies by both models. The other is to simulate the iTraxx Europe index tranche market quotes by both models. In both cases, the results of the two models are very close when the same default time correlations are input. While the one-factor Gaussian copula is thus a good approximation to the base-case structural model, the structural model has two advantages: it is a dynamic model and it has a clear economic rationale.
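A minimal Monte Carlo sketch of the base-case structural model is given below; it is our own illustration, the factor loading, barrier parameters, time grid, and portfolio size are hypothetical, and checking the barrier only on a discrete grid is a crude approximation of the continuous first-passage time:

import numpy as np

rng = np.random.default_rng(2)
n_names, n_paths, T, steps = 10, 20_000, 5.0, 250
dt = T / steps
a, beta_i, gamma_i = 0.5, -2.0, 0.1     # B*_i(t) = beta_i + gamma_i * t, eq (21)

default_time = np.full((n_paths, n_names), np.inf)
X = np.zeros((n_paths, n_names))
for k in range(1, steps + 1):
    dM = rng.standard_normal((n_paths, 1)) * np.sqrt(dt)        # common factor
    dZ = rng.standard_normal((n_paths, n_names)) * np.sqrt(dt)  # idiosyncratic
    X += a * dM + np.sqrt(1.0 - a**2) * dZ                      # eq (22)
    barrier = beta_i + gamma_i * k * dt
    newly_defaulted = (X <= barrier) & np.isinf(default_time)
    default_time[newly_defaulted] = k * dt

n_defaults = np.isfinite(default_time).sum(axis=1)
print("mean number of defaults by T :", n_defaults.mean())
print("P(at least 3 defaults by T)  :", (n_defaults >= 3).mean())

From the simulated default times, the tranche cash flows described in Sect. 3 can then be valued path by path.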

6 Loss Process Model

Loss process models for pricing correlation risk have been developed by
Schönbucher [20], Sidenius et al. [21], Di Graziano and Rogers [4], and Bennani
[2]. Here we introduce the basic idea of the loss process model as discussed by
Schönbucher. We omit the mathematical details.

6.1 Model Setup

The model is set up in the probability space (Ω, (Ft )0≤t≤T , Q), where Q is
a spot martingale measure, (Ft )0≤t≤T is the filtration satisfying the common
definitions, and Ω is the sample space. Assume that there are N company
names in a portfolio. Each name has the same notional principal in the port-
folio. Under the assumption of a homogeneous recovery rate for all the companies, all companies have an identical loss in default, which is normalized to one. The cumulative default loss process is defined by
L(t) = \sum_{k=1}^{N} \mathbf{1}_{\{\tau_k \le t\}},   (23)

where τk is the default time of company k, and the default indicator 1{τk ≤t}
is 1 when τk ≤ t and 0 when τk > t. The loss process is an N -bounded,
integer-valued, non-decreasing Markov chain. Under Q-measure, the proba-
bility distribution of L(T ) at time t < T is denoted by the vector p(t, T ) :=
(p0 (t, T ), . . . , pN (t, T )) , where the pi ’s are conditional probabilities

p_i(t, T) := P[L(T) = i \mid \mathcal{F}_t], \qquad i = 0, 1, \ldots, N, \; t \le T.   (24)

The conditional probability p_i(t, T) is the implied probability of L(T) = i, T ≥ t, given the information up to time t. p(t, ·) is referred to as the loss distribution
at time t.
6.2 Static Loss Process

To price a CDO, it is necessary to determine an implied initial loss distribution p(0, T). The implied initial loss distribution can be found by solving the
evolution of the loss process L(t). As the loss process L(t) is an inhomoge-
neous Markov chain in a finite state space with N + 1 states {0, 1, 2, . . . , N },
its transition probabilities are uniquely determined by its generator matrix.
Assuming that there is only one-step transition at any given time t, the
generator matrix of the loss process has the following form
A(t) = \begin{pmatrix} -\lambda_0(t) & \lambda_0(t) & 0 & \cdots & 0 & 0 \\ 0 & -\lambda_1(t) & \lambda_1(t) & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -\lambda_{N-1}(t) & \lambda_{N-1}(t) \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{pmatrix},   (25)
where the λ_i(t)'s are the transition rates, i = 0, 1, \ldots, N − 1. The state N is


an absorbing state.
The probability transition matrix, defined by Pij (t, T ) := P [L(T ) =
j|L(t) = i], satisfies the following Kolmogorov equations
\frac{d}{dT} P_{i,0}(t, T) = -\lambda_0(T)\, P_{i,0}(t, T),
\frac{d}{dT} P_{i,j}(t, T) = -\lambda_j(T)\, P_{i,j}(t, T) + \lambda_{j-1}(T)\, P_{i,j-1}(t, T),   (26)
\frac{d}{dT} P_{i,N}(t, T) = \lambda_{N-1}(T)\, P_{i,N-1}(t, T),
for all i, j = 0, 1, . . . , N and 0 ≤ t ≤ T . The initial conditions are Pi,j (t, t) =
1{i=j} . The solution of the Kolmogorov equations in (26) is given below

P_{i,j}(t, T) = \begin{cases} 0 & \text{for } i > j, \\ \exp\{-\int_t^T \lambda_i(t, s)\, ds\} & \text{for } i = j, \\ \int_t^T P_{i,j-1}(t, s)\, \lambda_{j-1}(t, s)\, e^{-\int_s^T \lambda_j(t, u)\, du}\, ds & \text{for } i < j. \end{cases}   (27)

The representation of the implied loss distribution at time t is simply

pi (t, T ) = P [L(T ) = i|Ft ] = PL(t),i (t, T ). (28)

For example, if L(t) = k, then the implied loss distribution at time t is

pi (t, T ) = Pk,i (t, T ). (29)
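As a minimal numerical illustration of the static loss process (our own sketch; the portfolio size and the time-homogeneous transition rates are hypothetical), the Kolmogorov equations (26) can be integrated with a simple Euler scheme to obtain the loss distribution p(0, T) via (27)–(29):

import numpy as np

N, T, steps = 10, 5.0, 2000
dt = T / steps

# hypothetical transition rates lambda_0, ..., lambda_{N-1}; letting them grow
# with the number of defaults builds default clustering into the loss process
lam = 0.3 * (1.0 + 0.5 * np.arange(N))

A = np.zeros((N + 1, N + 1))            # generator matrix, eq (25)
for i in range(N):
    A[i, i], A[i, i + 1] = -lam[i], lam[i]

P = np.eye(N + 1)                       # P(t, t) = I
for _ in range(steps):
    P = P + dt * (P @ A)                # Euler step for dP/dT = P A, eq (26)

p0T = P[0]                              # eq (28) with L(0) = 0
print("loss distribution P(L(T) = j):", np.round(p0T, 4))
print("expected number of defaults  :", p0T @ np.arange(N + 1))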

6.3 Dynamic Loss Process

In the dynamic version of the loss process model, the loss process follows
a Poisson process with time- and state-dependent inhomogeneous default in-
tensities λL(t) (t), L(t) = 0, . . . , N − 1, which are the transition rates in the
generator matrix in (25). The aggregate default intensity λL(t) (t) can be ex-
pressed in terms of the individual intensities λk (t)

\lambda_{L(t)}(t) = \sum_{k \in S(t)} \lambda_k(t),   (30)

where S(t) := {1 ≤ k ≤ N |τk > t} is the set of companies that have not
defaulted by time t.
The loss process is assumed to follow a Poisson process with stochastic intensity, a process referred to as a Cox process. The forward transition rates are assumed to follow the dynamics
d\lambda_i(t, T) = \mu_i(t, T)\, dT + \sigma_i(t, T)\, dB(t), \qquad i = 0, \ldots, N - 1,   (31)
where B(t) is a d-dimensional Q-Brownian motion, the µ_i(t, T)'s are the drifts of the stochastic processes, and the σ_i(t, T)'s are the d-dimensional volatilities
of the stochastic processes. To keep the stochastic processes consistent with
the loss process L(t), the following conditions must be satisfied
PL(t),i (t, T )µi (t, T ) = σi (t, T )υL(t),i (t, T ), 0 ≤ i ≤ N − 1, t ≤ T, (32)
where the υ_{i,j}(t, T)'s are given by
\upsilon_{i,j}(t, T) = \begin{cases} 0 & \text{for } i > j, \\ P_{i,j}(t, T)\left\{-\int_t^T \sigma_i(t, s)\, ds\right\} & \text{for } i = j, \\ \int_t^T e^{-\int_s^T \lambda_j(t, u)\, du}\left[\sigma^{Pa}_{i,j-1}(t, s) - P_{i,j}(t, s)\, \sigma_j(t, s)\right] ds & \text{for } i < j, \end{cases}   (33)
with
\sigma^{Pa}_{i,j-1}(t, T) = P_{i,j-1}(t, T) + \lambda_{j-1}(t, T)\, \upsilon_{i,j-1}(t, T).   (34)
σi,j−1 (t, T ) = Pi,j−1 (t, T ) + λm−1 (t, T )υn,m−1 (t, T ). (34)

6.4 Default Correlation


In the loss process model, the default correlations between companies can
arise from both the transition rates of the loss process and the volatilities of
the stochastic processes. To understand the default dependence by the tran-
sition rates, recall the concept of default correlation. The default correlation
is the phenomenon of joint defaults and a clustering of defaults. After one or
more companies default, the individual default intensities of the surviving
companies increase. The dependence of individual default intensities on the
default number (loss process L(t)) can be reflected by a proper selection of the
transition rates λi (t), i = 0, 1, . . . , N − 1. This is the way that the transition
rates can cause the default dependence between companies.
The default dependence by the volatilities can be explained by considering
the case of a one-dimensional driving Brownian motion. For non-zero transition
rate volatilities
σi (t, T ) > 0 for all 0 ≤ i ≤ N − 1, (35)
Brownian motion works like an indicator of the common market condition. If
its value is positive, the market condition is bad and all the transition rates
are larger; if its value is negative, the market condition is good and all the
transition rates are smaller.
6.5 Implementation of Dynamic Loss Process Model

The model can be implemented by a Monte Carlo method. For pricing a CDO
with a maturity T , the procedure is as follows:
1. Initial condition: t = 0, L(0) = 0 (p0 (0, 0) = 1), and specify λi (0, 0)’s and
σi (0, .)’s.
2. Simulate a Brownian motion trial.
3. s → s + ∆s: (until s = T )
• Calculate P0,m (0, s) from (27), and υ0,j (0, s) from (33), and use them
and σi (0, s) to calculate µi (0, s) from (32).
• Calculate λi (0, s+∆s) using the Euler scheme and µi (0,s) and σi (0, s).
• In an Euler scheme, calculate the loss distribution p_i(0, s + ∆s) from (27), using the representation of the loss distribution in (29).
• The loss distribution pi (0, .) on the time period of (0, T ) is then
calculated.
4. Repeat steps 2–4 until the average loss distributions pi (0, .) of all the trials
converge.
5. Using the average loss distributions pi (0, .) to price a CDO.
The loss process model can also be used to price other portfolio credit
derivatives such as basket default swaps, options on CDS indexes, and options
on CDS index tranches.

7 Models for Pricing Correlation Risk


In this section, we give our suggestions for future research. It includes two
parts. In the first part, we analyze the shortcomings of the one-factor dou-
ble t copula model, and then propose four new heavy-tailed one-factor copula
models. In the second part, we give our proposal for improving the structural
model and the loss process model.

7.1 Heavy-Tailed Copula Models

Hull and White [6] first use heavy-tailed distributions (Student’s t distribu-
tions) in a one-factor copula model. In their so-called one-factor double t
copula model, Hull and White use the t distribution with ν degrees of free-
dom for the market component M and the individual components Zi ’s in
equation (3). The degrees of freedom parameter ν of the t distribution can
be 3, 4, 5, . . . . When the degrees of freedom parameter ν is equal to 3, the copula function has the maximum tail-fatness. When the degrees of freedom parameter ν increases, the tail-fatness of the copula function decreases.
As mentioned before, Hull and White find that the double t copula model
fits market data well when the degrees of freedom parameter ν is equal to 4.
But the simulation by Kalemanova et al. [8] shows a different result. When
Kalemanova et al. compare their model with the double t copula model, in
addition to the simulation results by their own model, they also give the
simulation results by the double t copula model for both the cases of the
degrees of freedom parameter ν equal to 3 and 4. These simulation results
show that the double t copula model fits market data better when ν = 3 than
ν = 4. One difference in these two works is that different market data are
used in the simulation. Hull and White use market data for the 5-year iTraxx
Europe tranches on August 4, 2004, while Kalemanova et al. use market data
on April 12, 2006. Therefore, the disagreement over how many degrees of freedom make the double t copula fit market data well may simply suggest that market data from different periods call for double t copula models with different tail-fatness.
The drawbacks of the double t copula are that its tail-fatness cannot be
changed continuously and the maximum tail-fatness occurs when the degrees
of freedom parameter ν is equal to 3. In order to fit market data well over
time, it is necessary that the tail-fatness of a one-factor copula model can be
adjusted continuously and can be much larger than the maximum tail-fatness
of the one-factor double t copula model.
In the following, we suggest four one-factor heavy-tailed copula models.
Each model has (1) a tail-fatness parameter that can be changed continuously
and (2) a maximum tail-fatness much larger than that of the one-factor double
t copula model.

One-Factor Double Mixture Gaussian Copula Model

The mixture Gaussian distribution is a mixture distribution of two or more Gaussian distributions. For simplicity, we consider the case of the mixture dis-
tribution of two Gaussian distributions which have a zero mean. If the random
variable Y is such a mixture Gaussian distribution, then it can be expressed as

Y = \begin{cases} X_1 & \text{with probability } p, \\ X_2 & \text{with probability } 1 - p, \end{cases}   (36)
where X_1 and X_2 are independent Gaussian random variables with zero mean:
E X_1 = E X_2 = 0, \qquad \mathrm{Var}\, X_1 = \sigma_1^2, \qquad \mathrm{Var}\, X_2 = \sigma_2^2,   (37)
with σ1 > σ2 . The mixture Gaussian distribution Y has a zero mean. Its
variance is
\mathrm{Var}\, Y = p\sigma_1^2 + (1 - p)\sigma_2^2.   (38)
The pdf of the distribution Y is
f_Y(y) = \frac{p}{\sqrt{2\pi}\,\sigma_1} \exp\!\left(-\frac{y^2}{2\sigma_1^2}\right) + \frac{1-p}{\sqrt{2\pi}\,\sigma_2} \exp\!\left(-\frac{y^2}{2\sigma_2^2}\right).   (39)
The mixture Gaussian distribution Y can be normalized by the following transition
\tilde{Y} = \frac{1}{\sqrt{p\sigma_1^2 + (1-p)\sigma_2^2}}\; Y.   (40)
The pdf of \tilde{Y} is
f_{\tilde{Y}}(y) = \frac{p\,\sqrt{p\sigma_1^2 + (1-p)\sigma_2^2}}{\sqrt{2\pi}\,\sigma_1} \exp\!\left(-\frac{y^2\,(p\sigma_1^2 + (1-p)\sigma_2^2)}{2\sigma_1^2}\right) + \frac{(1-p)\,\sqrt{p\sigma_1^2 + (1-p)\sigma_2^2}}{\sqrt{2\pi}\,\sigma_2} \exp\!\left(-\frac{y^2\,(p\sigma_1^2 + (1-p)\sigma_2^2)}{2\sigma_2^2}\right).   (41)

Using the standardized mixture Gaussian distribution in (41) as the distri-


bution of the M and Zi ’s in (3), we obtain our first extension to the one-factor
Gaussian copula model which we refer to as a double mixture Gaussian dis-
tribution copula model. In this model, the tail-fatness of the M and Z’s is
determined by the parameters σ1 , σ2 , and p. In the implementation of the
model, we can fix the parameters σ1 and σ2 , and make the parameter p the
only parameter to control the tail-fatness of the copula function.
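A minimal sampling sketch for this factor distribution (our own illustration; the values of σ1, σ2, and p are hypothetical) draws from (36) and normalizes by the standard deviation implied by (38):

import numpy as np

rng = np.random.default_rng(3)
sigma1, sigma2, p, n = 3.0, 0.8, 0.1, 500_000

pick_fat = rng.random(n) < p                            # eq (36): component indicator
Y = np.where(pick_fat, sigma1, sigma2) * rng.standard_normal(n)
Y = Y / np.sqrt(p * sigma1**2 + (1.0 - p) * sigma2**2)  # normalize, eqs (38) and (40)

print("variance       :", Y.var())                      # close to 1
print("excess kurtosis:", (Y**4).mean() - 3.0)          # > 0: fatter tails than Gaussian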

One-Factor Double t Distribution with Fractional Degrees of Freedom Copula Model

The pdf of the gamma(α, β) distribution is


f(x \mid \alpha, \beta) = \frac{1}{\Gamma(\alpha)\,\beta^{\alpha}}\; x^{\alpha-1} \exp(-x/\beta), \qquad 0 < x < \infty,\ \alpha > 0,\ \beta > 0.   (42)
Setting α = ν/2 and β = 2, we obtain an important special case of the gamma
distribution, the Chi-square distribution, which has the following pdf:
f(x \mid \nu) = \frac{1}{\Gamma(\nu/2)\, 2^{\nu/2}}\; x^{\nu/2-1} \exp(-x/2), \qquad 0 < x < \infty,\ \nu > 0.   (43)
If the degrees of freedom parameter ν is an integer, equation (43) is the
Chi-square distribution with ν degrees of freedom. However, the degrees of
freedom parameter ν need not be an integer. When ν is extended to a posi-
tive real number, we get the Chi-square distribution with ν fractional degrees
of freedom.
If U is a standard normal random variable, V is a Chi-square random variable with ν fractional degrees of freedom, and U and V are independent, then T = U/\sqrt{V/\nu} has the following pdf

f_T(t \mid \nu) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right)\sqrt{\nu\pi}}\, \left(1 + t^2/\nu\right)^{-(\nu+1)/2}, \qquad -\infty < t < \infty,\ \nu > 0.   (44)

This is the Student’s t distribution with ν fractional degrees of freedom (see


[13]. Its mean and variance are respectively
E\,T = 0,\ \nu > 1; \qquad \mathrm{Var}\,T = \frac{\nu}{\nu - 2},\ \nu > 2.   (45)
For ν > 2, the Student's t distribution in (44) can be normalized by making the transition
X = \sqrt{(\nu - 2)/\nu}\; T, \qquad \nu > 2.   (46)
The normalized Student’s t distribution with ν(ν > 2) fractional degrees of
freedom has the following pdf
f_X(x \mid \nu) = \sqrt{\frac{\nu}{\nu-2}}\; \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right)\sqrt{\nu\pi}}\, \left(1 + \frac{x^2}{\nu-2}\right)^{-(\nu+1)/2}, \qquad -\infty < x < \infty,\ \nu > 2.   (47)
Using the normalized Student’s t distribution with fractional degrees of
freedom as the distribution of the M and Zi ’s in (3), we get our second
extension to the one-factor Gaussian copula model which we refer to as a
double t distribution with fractional degrees of freedom copula model. In this
model, the tail-fatness of the M and Zi ’s can be changed continuously by
adjusting the fractional degrees of freedom parameter ν.
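Sampling this factor distribution is straightforward via the construction T = U/\sqrt{V/\nu}, with V drawn from the gamma density (42) with α = ν/2 and β = 2; the sketch below is our own illustration with a hypothetical non-integer ν:

import numpy as np

rng = np.random.default_rng(4)
nu, n = 2.7, 500_000                                # hypothetical fractional dof

U = rng.standard_normal(n)
V = rng.gamma(shape=nu / 2.0, scale=2.0, size=n)    # Chi-square with nu dof, eq (43)
T = U / np.sqrt(V / nu)                             # Student's t, eq (44)
X = np.sqrt((nu - 2.0) / nu) * T                    # normalization, eq (46)

print("sample variance:", X.var())                  # close to 1 for nu > 2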

One-Factor Double Mixture Distribution of t and Gaussian Distribution Copula Model
In the previous model, the tail-fatness of the M and Zi ’s is controlled by the
fractional degrees of freedom parameter of the Student’s t distribution. Here,
we introduce another distribution function for the M and Zi ’s, the mixture
distribution of the Student’s t and the Gaussian distributions. Assume U is a
normalized Student’s t distribution with fractional degrees of freedom, and V
is a standard normal distribution. We can express a mixture distribution X as

X = \begin{cases} U & \text{with probability } 1 - p, \\ V & \text{with probability } p, \end{cases} \qquad 0 \le p \le 1,   (48)
where p is the proportion of the Gaussian component in the mixture distri-
bution X. The pdf of X is
f(x) = \frac{p}{\sqrt{2\pi}}\, \exp(-x^2/2) + (1 - p)\, \sqrt{\frac{\nu}{\nu-2}}\; \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma(\nu/2)}\, \left(1 + \frac{x^2}{\nu-2}\right)^{-(\nu+1)/2},   (49)

where ν is the fractional degrees of freedom of the Student’s t distribution.


Using the mixture distribution of Student’s t and Gaussian distributions
in (3) as the distribution of the M and Zi ’s, we get our third extension to
the one-factor Gaussian copula model which we refer to as a double mixture
distribution of Student’s t and Gaussian distribution copula model. In this
model, the tail-fatness of the M and Z’s is controlled by the parameter p
when the parameter ν is fixed.
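Sampling the mixture in (48) is equally direct; in the sketch below (our own illustration) the values of p and ν are hypothetical:

import numpy as np

rng = np.random.default_rng(5)
nu, p, n = 3.5, 0.4, 500_000

U = np.sqrt((nu - 2.0) / nu) * rng.standard_t(nu, n)   # normalized t component
V = rng.standard_normal(n)                             # Gaussian component
X = np.where(rng.random(n) < p, V, U)                  # eq (48)

print("sample variance:", X.var())                     # unit variance by construction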

One-Factor Double Smoothly Truncated Stable Copula Model


In this part, we first introduce the stable distribution and the smoothly trun-
cated stable distribution, and then provide our proposed model.
Stable Distribution
A non-trivial distribution g is a stable distribution if and only if for a sequence of independent, identically distributed random variables X_i, i = 1, 2, 3, \ldots, n, with distribution g, constants c_n > 0 and d_n can always be found for any n > 1 such that
c_n(X_1 + X_2 + \cdots + X_n) + d_n \stackrel{d}{=} X_1.
In general, a stable distribution cannot be expressed in a closed form except
for three special cases: the Gaussian, Cauchy, and Lévy distributions. However,
the characteristic function always exists and can be expressed in a closed
form. For a random variable X with a stable distribution g, the characteristic
function of the X can be expressed in the following form

\varphi_X(t) = E\, e^{itX} = \begin{cases} \exp\!\left(-\gamma^{\alpha}|t|^{\alpha}\left[1 - i\beta\,\mathrm{sign}(t)\tan\frac{\pi\alpha}{2}\right] + i\delta t\right), & \alpha \neq 1, \\ \exp\!\left(-\gamma|t|\left[1 + i\beta\,\frac{2}{\pi}\,\mathrm{sign}(t)\ln|t|\right] + i\delta t\right), & \alpha = 1, \end{cases}   (50)
where 0 < α ≤ 2, γ ≥ 0, −1 ≤ β ≤ 1, and −∞ ≤ δ ≤ ∞, and the function of
sign(t) is 1 when t > 0, 0 when t = 0, and −1 when t < 0.
There are four characteristic parameters to describe a stable distribution.
They are: (1) the index of stability or the shape parameter α, (2) the scale
parameter γ, (3) the skewness parameter β, and (4) the location parameter
δ. A stable distribution g is called an α stable distribution and is denoted S_α(γ, β, δ).
The family of α stable distributions has three attractive properties:
• The sum of independent α stable distributions is still an α stable distri-
bution, a property referred to as stability.
• α stable distributions can be skewed.
• Compared with the normal distribution, α stable distributions can have a
fatter tail and a high peak around its center, a property which is referred
to as leptokurtosis.
Real-world financial market data indicate that asset returns tend to be fat-tailed, skewed, and peaked around the center. For this reason α stable distributions have been a popular choice in modeling asset returns (see [19]).
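For numerical work with α stable laws one typically relies on a library implementation of the characteristic function (50). The sketch below assumes scipy.stats.levy_stable is available and uses hypothetical parameter values simply to contrast the tail probability with the Gaussian one:

from scipy.stats import levy_stable, norm

alpha, beta = 1.7, 0.0                       # hypothetical stability and skewness
print("alpha stable P(X > 5):", levy_stable.sf(5.0, alpha, beta))
print("Gaussian     P(X > 5):", norm.sf(5.0))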

Smoothly Truncated α Stable Distribution


One inconvenience of a stable distribution is that it has an infinite variance except in the case of α = 2. A new class of heavy-tailed distributions is proposed by Menn and Rachev [15, 16]: the smoothly truncated α stable distribution.
A smoothly truncated α stable distribution is an α stable distribution with
its two tails replaced by the tails of the Gaussian distribution. The pdf can
be expressed as
f(x) = \begin{cases} h_1(x) & \text{for } x < a, \\ g_{\theta}(x) & \text{for } a \le x \le b, \\ h_2(x) & \text{for } x > b, \end{cases}   (51)
where hi (x), i = 1, 2 are the pdf of two normal distributions with means µi
and standard deviations σi , and gθ (x) is the pdf of an α stable distribution
with its parameter vector θ = (α, γ, β, δ). To secure a well-defined smooth
probability distribution, the following regularities are imposed:
h_1(a) = g_\theta(a), \qquad h_2(b) = g_\theta(b),
p_1 := \int_{-\infty}^{a} h_1(x)\, dx = \int_{-\infty}^{a} g_\theta(x)\, dx,
p_2 := \int_{b}^{\infty} h_2(x)\, dx = \int_{b}^{\infty} g_\theta(x)\, dx,   (52)
\sigma_1 = \frac{\psi(\varphi^{-1}(p_1))}{g_\theta(a)}, \qquad \mu_1 = a - \sigma_1\, \varphi^{-1}(p_1),
\sigma_2 = \frac{\psi(\varphi^{-1}(p_2))}{g_\theta(b)}, \qquad \mu_2 = b + \sigma_2\, \varphi^{-1}(p_2),

where ψ and ϕ denote the density and distribution functions of the standard
normal distribution, respectively. A smoothly truncated α stable distribution is referred to as an STS-distribution and denoted by S_α^{[a,b]}(γ, β, δ). The proba-
bilities p1 and p2 are referred to as the cut-off probabilities. The real numbers
a and b are referred to as the cut-off points.
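A minimal sketch of the tail-matching conditions in (52) is given below; it is our own illustration, the stable parameters and cut-off points are hypothetical, and scipy.stats.levy_stable is assumed for evaluating the stable pdf and cdf:

import numpy as np
from scipy.stats import levy_stable, norm

alpha, gamma, beta, delta = 1.8, 1.0, 0.0, 0.0     # hypothetical theta
a, b = -4.0, 4.0                                   # hypothetical cut-off points

def g(x):                                          # stable body g_theta
    return levy_stable.pdf(x, alpha, beta, loc=delta, scale=gamma)

p1 = levy_stable.cdf(a, alpha, beta, loc=delta, scale=gamma)   # cut-off probabilities
p2 = levy_stable.sf(b, alpha, beta, loc=delta, scale=gamma)

sigma1 = norm.pdf(norm.ppf(p1)) / g(a)             # eq (52)
mu1 = a - sigma1 * norm.ppf(p1)
sigma2 = norm.pdf(norm.ppf(p2)) / g(b)
mu2 = b + sigma2 * norm.ppf(p2)

def sts_pdf(x):                                    # eq (51)
    x = np.asarray(x, dtype=float)
    return np.where(x < a, norm.pdf(x, mu1, sigma1),
                    np.where(x > b, norm.pdf(x, mu2, sigma2), g(x)))

print(sts_pdf(np.array([a, 0.0, b])))              # continuous at the cut-off points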
The family of STS-distributions has two important properties. The first is
that it is closed under the scale and location transitions. This means that if
the distribution X is an STS-distribution, then for c, d ∈ R, the distribution Y := cX + d is an STS-distribution. If X follows S_α^{[a,b]}(γ, β, δ), then Y follows S_{\tilde{\alpha}}^{[\tilde{a},\tilde{b}]}(\tilde{\gamma}, \tilde{\beta}, \tilde{\delta}) with
\tilde{a} = ca + d, \quad \tilde{b} = cb + d, \quad \tilde{\alpha} = \alpha, \quad \tilde{\gamma} = |c|\gamma, \quad \tilde{\beta} = \mathrm{sign}(c)\,\beta, \quad \tilde{\delta} = \begin{cases} c\delta + d, & \alpha \neq 1, \\ c\delta - \frac{2}{\pi}\, c \ln|c|\, \gamma\beta + d, & \alpha = 1. \end{cases}   (53)

The other important property of the STS-distribution is that with respect to an α stable distribution S_α(γ, β, δ), there is a unique normalized STS-distribution S_α^{[a,b]}(γ, β, δ) whose cut-off points a and b are uniquely determined by the four parameters α, γ, β, and δ. Because of the uniqueness of the cut-off points, the normalized STS-distribution can be denoted by the NSTS-distribution S_α(γ, β, δ).

One-Factor Double Smoothly Truncated Stable Copula Model

In the one-factor copula model given in (3), using the NSTS-distribution S_α(γ, β, δ) for the distribution of the market component M and the individual
components Zi ’s, we obtain the fourth extension to the one-factor Gaussian
copula model. We refer to the model as a one-factor double smoothly trun-
cated α stable copula model. In the model, we can fix the parameters γ, β, and
δ, and make the parameter α the only parameter to control the tail-fatness of
the copula function. When the parameter α = 2, the model becomes the one-
factor Gaussian copula model. When α decreases, the tail-fatness increases.
7.2 Suggestions for Structural Model and Loss Process Model

The base-case structural model suggested by Hull et al. [7] can be an alternative to the one-factor Gaussian copula model, and the results of the two models are close. Given that the one-factor double t copula model fits market data much better than the one-factor Gaussian copula model according to Hull and White [6], a natural way to enhance the structural model is to apply heavy-tailed distributions.
Unlike the one-factor copula model, where any continuous distribution
with a zero mean and a unit variance can be used, in the structural model there
is a strong constraint imposed on the distribution of the underlying stochastic
processes. The distribution for the common driving process M (t) and the
individual driving process Zi ’s in (22) must satisfy a property of closure under
summation. This means that if two independent random variables follow a
given distribution, then the sum of these two variables still follows the same
distribution. As explained earlier, the α stable distribution has this property
and has been used in financial modeling (see [18]). We suggest using the α
stable distribution in the structural model.
The non-Gaussian α stable distribution has a drawback. Its variance does
not exist. The STS distribution is a good candidate to overcome this problem.
For an STS distribution, if the two cut-off points a and b are far away from the
peak, the STS distribution is approximately closed under summation. Based
on this, employing the STS distribution in the structural model should be the
subject of future research.
In the dynamic loss process model, the default intensities λi ’s follow
stochastic processes as shown in (31). It is also a possible research direc-
tion to use the α stable distribution and the STS distribution for the driving
processes.

8 Summary

In this paper, we review three models for pricing portfolio risk: the one-factor
copula model, the structural model, and the loss process model. We then
propose how to improve these models by using heavy-tailed functions. For
the one-factor copula model, we suggest using (1) a double mixture Gaussian
copula, (2) a double t distribution with fractional degrees of freedom copula, (3) a double mixture
distribution of t and Gaussian distributions copula, and (4) a double smoothly
truncated α stable copula. In each of these four new extensions to the one-
factor Gaussian copula model, one parameter is introduced to control the
tail-fatness of the copula function. To improve the structural and loss process
models, we suggest using the stable distribution and the smoothly truncated
stable distribution for the underlying stochastic driving processes.
References

[1] Amato J, Gyntelberg J (2005) CDS index tranches and the pricing of
credit risk correlations. BIS Quarterly Review, March 2005, pp 73–87
[2] Bennani N (2005) The forward loss model: a dynamic term structure
approach for the pricing of portfolio credit derivatives. Working paper,
available at http://www.defaultrisk.com/pp_crdrv_95.htm
[3] Black F, Cox J (1976) Valuing corporate securities: some effects of bond
indenture provision. The Journal of Finance, vol 31, pp 351–367
[4] Di Graziano G, Rogers C (2005) A new approach to the modeling and
pricing of correlation credit derivatives. Working paper, available at
www.defaultrisk.com/pp_crdrv_88.htm
[5] Duffie D (2004) Comments: irresistible reasons for better models of credit
risk. Financial Times, April 16, 2004
[6] Hull J, White A (2004) Valuation of a CDO and nth to default CDS
without Monte Carlo simulation. The Journal of Derivatives, vol 2, pp
8–23
[7] Hull J, Predescu M, White A (2005) The valuation of correlation-
dependent credit derivatives using a structural model. Working paper,
Joseph L. Rotman School of Management, University of Toronto, avail-
able at http://www.defaultrisk.com/pp_crdrv_68.htm
[8] Kalemanova A, Schmid B, Werner R (2005) The normal inverse Gaus-
sian distribution for synthetic CDO pricing. Working paper, available at
http://www.defaultrisk.com/pp_crdrv_91.htm
[9] Laurent JP, Gregory J (2003) Basket default swaps, CDOs and factor cop-
ulas. Working paper, ISFA Actuarial School, University of Lyon, available
at http://www.defaultrisk.com/pp_crdrv_26.htm
[10] Li DX (1999) The valuation of basket credit derivatives. CreditMetrics
Monitor, April 1999, pp 34–50
[11] Li DX (2000) On default correlation: a copula function approach. The
Journal of Fixed Income, vol 9, pp 43–54
[12] Lucas DJ, Goodman LS, Fabozzi FJ (2006) Collateralized debt obliga-
tions: structures and analysis, 2nd edn. Wiley Finance, Hoboken
[13] Mardia K, Zemroch P (1978) Tables of the F- and related distributions
with algorithms. Academic, New York
[14] McGinty L, Beinstein E, Ahluwalia R, Watts M (2004) Credit correlation:
a guide. Credit Derivatives Strategy, JP Morgan, London, March 12, 2004
[15] Menn C, Rachev S (2005) A GARCH option pricing model with alpha-
stable innovations. European Journal of Operational Research, vol 163,
pp 201–209
[16] Menn C, Rachev S (2005) Smoothly truncated stable distribu-
tions, GARCH-models, and option pricing. Working paper, University
of Karlsruhe and UCSB, available at http://www.statistik.uni-karlsruhe.de/download/tr_smoothly_truncated.pdf
[17] Merton R (1974) On the pricing of corporate debt: the risk structure of
interest rates. The Journal of Finance, vol 29, pp 449–470
[18] Rachev S, Mittnik S (2000) Stable Paretian models in finance. John Wi-
ley, Series in Financial Economics and Quantitative Analysis, Chichester
[19] Rachev S, Menn C, Fabozzi FJ (2005) Fat-tailed and skewed asset return
distributions: implications for risk management, portfolio selection, and
option pricing. Wiley Finance, Hoboken
[20] Schönbucher P (2005) Portfolio losses and the term structure of loss
transition rates: a new methodology for the pricing of portfolio credit
derivatives. Working paper, available at http://www.defaultrisk.com/pp_model_74.htm
[21] Sidenius J, Piterbarg V, Andersen L (2005) A new framework for
dynamic credit portfolio loss modeling. Working paper, available at
http://www.defaultrisk.com/pp_model_83.htm
