Foundations of Quantitative
Finance
Chapman & Hall/CRC Financial Mathematics Series
Series Editors

M.A.H. Dempster
Centre for Financial Research
Department of Pure Mathematics and Statistics
University of Cambridge, UK

Dilip B. Madan
Robert H. Smith School of Business
University of Maryland, USA

Rama Cont
Department of Mathematics
Imperial College, UK

Robert A. Jarrow
Lynch Professor of Investment Management
Johnson Graduate School of Management
Cornell University, USA

Recently Published Titles

Machine Learning for Factor Investing: Python Version


Guillaume Coqueret and Tony Guida

Introduction to Stochastic Finance with Market Examples, Second Edition


Nicolas Privault

Commodities: Fundamental Theory of Futures, Forwards, and Derivatives Pricing, Second Edition
Edited by M.A.H. Dempster and Ke Tang

Introducing Financial Mathematics: Theory, Binomial Models, and Applications


Mladen Victor Wickerhauser

Financial Mathematics: From Discrete to Continuous Time


Kevin J. Hastings

Financial Mathematics: A Comprehensive Treatment in Discrete Time


Giuseppe Campolieti and Roman N. Makarov

Introduction to Financial Derivatives with Python


Elisa Alòs and Raúl Merino

The Handbook of Price Impact Modeling


Kevin T. Webster

Sustainable Life Insurance: Managing Risk Appetite for Insurance Savings & Retirement Products
Aymeric Kalife with Saad Mouti, Ludovic Goudenege, Xiaolu Tan, and Mounir Bellmane

Geometry of Derivation with Applications


Norman L. Johnson

For more information about this series please visit: https://www.crcpress.com/Chapman-and-HallCRC-Financial-Mathematics-Series/book-series/CHFINANCMTH
Foundations of Quantitative
Finance
Book IV: Distribution Functions and
Expectations

Robert R. Reitano
Brandeis International Business School
Waltham, MA
First edition published 2023
by CRC Press
6000 Broken Sound Parkway NW, Suite 300, Boca Raton, FL 33487-2742

and by CRC Press


4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN

CRC Press is an imprint of Taylor & Francis Group, LLC

© 2024 Robert R. Reitano

Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright
holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowl-
edged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or
utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including pho-
tocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission
from the publishers.

For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the
Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are
not available on CCC please contact mpkbookspermissions@tandf.co.uk

Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used only for
identification and explanation without intent to infringe.

ISBN: 978-1-032-20653-0 (hbk)


ISBN: 978-1-032-20652-3 (pbk)
ISBN: 978-1-003-26458-3 (ebk)

DOI: 10.1201/9781003264583

Typeset in CMR10
by KnowledgeWorks Global Ltd.

Publisher’s note: This book has been prepared from camera-ready copy provided by the authors.
to Dorothy and Domenic
Contents

Preface xi

Author xiii

Introduction xv

1 Distribution and Density Functions 1


1.1 Summary of Book II Results . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Distribution Functions on R . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.2 Distribution Functions on Rn . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Decomposition of Distribution Functions on R . . . . . . . . . . . . . . . . 6
1.3 Density Functions on R . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.1 The Lebesgue Approach . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.2 Riemann Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.3.3 Riemann-Stieltjes Framework . . . . . . . . . . . . . . . . . . . . . . 14
1.4 Examples of Distribution Functions on R . . . . . . . . . . . . . . . . . . . 16
1.4.1 Discrete Distribution Functions . . . . . . . . . . . . . . . . . . . . . 16
1.4.2 Continuous Distribution Functions . . . . . . . . . . . . . . . . . . . 19
1.4.3 Mixed Distribution Functions . . . . . . . . . . . . . . . . . . . . . . 25

2 Transformed Random Variables 29


2.1 Monotonic Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.2 Sums of Independent Random Variables . . . . . . . . . . . . . . . . . . . . 32
2.2.1 Distribution Functions of Sums . . . . . . . . . . . . . . . . . . . . . 32
2.2.2 Density Functions of Sums . . . . . . . . . . . . . . . . . . . . . . . 39
2.3 Ratios of Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.3.1 Independent Random Variables . . . . . . . . . . . . . . . . . . . . . 44
2.3.2 Example without Independence . . . . . . . . . . . . . . . . . . . . . 53

3 Order Statistics 57
3.1 M -Samples and Order Statistics . . . . . . . . . . . . . . . . . . . . . . . . 57
3.2 Distribution Functions of Order Statistics . . . . . . . . . . . . . . . . . . . 59
3.3 Density Functions of Order Statistics . . . . . . . . . . . . . . . . . . . . . 60
3.4 Joint Distribution of All Order Statistics . . . . . . . . . . . . . . . . . . . 62
3.5 Density Functions on Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.6 Multivariate Order Functions . . . . . . . . . . . . . . . . . . . . . . . . . . 69
3.6.1 Joint Density of All Order Statistics . . . . . . . . . . . . . . . . . . 69
3.6.2 Marginal Densities and Distributions . . . . . . . . . . . . . . . . . . 70
3.6.3 Conditional Densities and Distributions . . . . . . . . . . . . . . . . 73
3.7 The Rényi Representation Theorem . . . . . . . . . . . . . . . . . . . . . . 75


4 Expectations of Random Variables 1 81


4.1 General Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.1.1 Is Expectation Well Defined? . . . . . . . . . . . . . . . . . . . . . . 83
4.1.2 Formal Resolution of Well-Definedness . . . . . . . . . . . . . . . . . 85
4.2 Moments of Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.2.1 Common Types of Moments . . . . . . . . . . . . . . . . . . . . . . . 88
4.2.2 Moment Generating Function . . . . . . . . . . . . . . . . . . . . . . 89
4.2.3 Moments of Sums – Theory . . . . . . . . . . . . . . . . . . . . . . . 91
4.2.4 Moments of Sums – Applications . . . . . . . . . . . . . . . . . . . . 94
4.2.5 Properties of Moments . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.2.6 Examples–Discrete Distributions . . . . . . . . . . . . . . . . . . . . 102
4.2.7 Examples–Continuous Distributions . . . . . . . . . . . . . . . . . . 104
4.3 Moment Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
4.3.1 Chebyshev’s Inequality . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.3.2 Jensen’s Inequality . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.3.3 Kolmogorov’s Inequality . . . . . . . . . . . . . . . . . . . . . . . . . 113
4.3.4 Cauchy-Schwarz Inequality . . . . . . . . . . . . . . . . . . . . . . . 114
4.3.5 Hölder and Lyapunov Inequalities . . . . . . . . . . . . . . . . . . . 117
4.4 Uniqueness of Moments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
4.4.1 Applications of Moment Uniqueness . . . . . . . . . . . . . . . . . . 123
4.5 Weak Convergence and Moment Limits . . . . . . . . . . . . . . . . . . . . 125

5 Simulating Samples of RVs – Examples 135


5.1 Random Samples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.1.1 Discrete Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.1.2 Simpler Continuous Distributions . . . . . . . . . . . . . . . . . . . . 140
5.1.3 Normal and Lognormal Distributions . . . . . . . . . . . . . . . . . . 141
5.1.4 Student T Distribution . . . . . . . . . . . . . . . . . . . . . . . . . 144
5.2 Ordered Random Samples . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.2.1 Direct Approaches . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.2.2 The Rényi Representation . . . . . . . . . . . . . . . . . . . . . . . . 149

6 Limit Theorems 153


6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.2 Weak Convergence of Distributions . . . . . . . . . . . . . . . . . . . . . . 156
6.2.1 Student T to Normal . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6.2.2 Poisson Limit Theorem . . . . . . . . . . . . . . . . . . . . . . . . . 158
6.2.3 “Weak Law of Small Numbers” . . . . . . . . . . . . . . . . . . . . . 159
6.2.4 De Moivre-Laplace Theorem . . . . . . . . . . . . . . . . . . . . . . 161
6.2.5 The Central Limit Theorem 1 . . . . . . . . . . . . . . . . . . . . . . 165
6.2.6 Smirnov’s Theorem on Uniform Order Statistics . . . . . . . . . . . 168
6.2.7 A Limit Theorem on General Quantiles . . . . . . . . . . . . . . . . 172
6.2.8 A Limit Theorem on Exponential Order Statistics . . . . . . . . . . 174
6.3 Laws of Large Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
6.3.1 Tail Events and Kolmogorov’s 0-1 Law . . . . . . . . . . . . . . . . . 176
6.3.2 Weak Laws of Large Numbers . . . . . . . . . . . . . . . . . . . . . . 178
6.3.3 Strong Laws of Large Numbers . . . . . . . . . . . . . . . . . . . . . 183
6.3.4 A Limit Theorem in EVT . . . . . . . . . . . . . . . . . . . . . . . . 187
6.4 Convergence of Empirical Distributions . . . . . . . . . . . . . . . . . . . . 189
6.4.1 Definition and Basic Properties . . . . . . . . . . . . . . . . . . . . . 190
6.4.2 The Glivenko-Cantelli Theorem . . . . . . . . . . . . . . . . . . . . . 195
6.4.3 Distributional Estimates for Dn (s) . . . . . . . . . . . . . . . . . . . 197

7 Estimating Tail Events 2 201


7.1 Large Deviation Theory 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
7.1.1 Chernoff Bound . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
7.1.2 Cramér-Chernoff Theorem . . . . . . . . . . . . . . . . . . . . . . . . 208
7.2 Extreme Value Theory 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
7.2.1 Fisher-Tippett-Gnedenko Theorem . . . . . . . . . . . . . . . . . . . 211
7.2.2 The Hill Estimator, γ > 0 . . . . . . . . . . . . . . . . . . . . . . . . 212
7.2.3 F ∈ D(Gγ ) is Asymptotically Pareto for γ > 0 . . . . . . . . . . . . 217
7.2.4 F ∈ D(Gγ ), γ > 0, then γ H ≈ γ . . . . . . . . . . . . . . . . . . . . 226
7.2.5 F ∈ D(Gγ ), γ > 0, then γ H →1 γ . . . . . . . . . . . . . . . . . . . . 232
7.2.6 Asymptotic Normality of the Hill Estimator . . . . . . . . . . . . . . 236
7.2.7 The Pickands-Balkema-de Haan Theorem: γ > 0 . . . . . . . . . . . 237

Bibliography 243

Index 247
Preface

The idea for a reference book on the mathematical foundations of quantitative finance
has been with me throughout my professional and academic careers in this field, but the
commitment to finally write it didn’t materialize until completing my first “introductory”
book in 2010.
My original academic studies were in “pure” mathematics in a field of mathematical
analysis, and neither applications generally nor finance in particular were then even on
my mind. But on completion of my degree, I decided to temporarily investigate a career in
applied math, becoming an actuary, and in short order became enamored with mathematical
applications in finance.
One of my first inquiries was into better understanding yield curve risk management, ulti-
mately introducing the notion of partial durations and related immunization strategies. This
experience led me to recognize the power of greater precision in the mathematical specifica-
tion and solution of even an age-old problem. From there my commitment to mathematical
finance was complete, and my temporary investigation into this field became permanent.
In my personal studies, I found that there were a great many books in finance that
focused on markets, instruments, models and strategies, and which typically provided an
informal acknowledgement of the background mathematics. There were also many books
in mathematical finance focusing on more advanced mathematical models and methods,
and typically written at a level of mathematical sophistication requiring a reader to have
significant formal training and the time and motivation to derive omitted details.
The challenge of acquiring expertise is compounded by the fact that the field of quanti-
tative finance utilizes advanced mathematical theories and models from a number of fields.
While there are many good references on any of these topics, most are again written at
a level beyond many students, practitioners and even researchers of quantitative finance.
Such books develop materials with an eye to comprehensiveness in the given subject matter,
rather than with an eye toward efficiently curating and developing the theories needed for
applications in quantitative finance.
Thus the overriding goal I have for this collection of books is to provide a complete and
detailed development of the many foundational mathematical theories and results one finds
referenced in popular resources in finance and quantitative finance. The included topics
have been curated from a vast mathematics and finance literature for the express purpose
of supporting applications in quantitative finance.
I originally budgeted 700 pages per book, in two volumes. It soon became obvious
this was too limiting, and two volumes ultimately turned into ten. In the end, each book
was dedicated to a specific area of mathematics or probability theory, with a variety of
applications to finance that are relevant to the needs of financial mathematicians.
My target readers are students, practitioners and researchers in finance who are quantita-
tively literate, and recognize the need for the materials and formal developments presented.
My hope is that the approach taken in these books will motivate readers to navigate these
details and master these materials.
Most importantly for a reference work, all ten volumes are extensively self-referenced.
The reader can enter the collection at any point of interest, and then using the references


cited, work backwards to prior books to fill in needed details. This approach also works for
a course on a given volume’s subject matter, with earlier books used for reference, and for
both course-based and self-study approaches to sequential studies.
The reader will find that the developments herein are presented at a much greater level
of detail than most advanced quantitative finance books. Such developments are of necessity
typically longer, more meticulously reasoned, and therefore can be more demanding on the
reader. Thus before committing to a detailed line-by-line study of a given result, it is always
more efficient to first scan the derivation once or twice to better understand the overall logic
flow.
I hope the additional details presented will support your journey to better understanding.
I am grateful for the support of my family: Lisa, Michael, David, and Jeffrey, as well as
the support of friends and colleagues at Brandeis International Business School.

Robert R. Reitano
Brandeis International Business School
Author

Robert R. Reitano is Professor of the Practice of Finance at the Brandeis International


Business School where he specializes in risk management and quantitative finance. He pre-
viously served as MSF Program Director, and Senior Academic Director. He has a PhD in
mathematics from MIT, is a fellow of the Society of Actuaries, and a Chartered Enterprise
Risk Analyst. Dr. Reitano consults in investment strategy and asset/liability risk manage-
ment, and previously had a 29-year career at John Hancock/Manulife in investment strategy
and asset/liability management, advancing to Executive Vice President & Chief Investment
Strategist. His research papers have appeared in a number of journals and have won an
Annual Prize of the Society of Actuaries and two F.M. Redington Prizes of the Investment
Section of the Society of the Actuaries. Dr. Reitano serves on various not-for-profit boards
and investment committees.

Introduction

Foundations of Quantitative Finance is structured as follows:


Book I: Measure Spaces and Measurable Functions
Book II: Probability Spaces and Random Variables
Book III: The Integrals of Riemann, Lebesgue, and (Riemann-)Stieltjes
Book IV: Distribution Functions and Expectations
Book V: General Measure and Integration Theory
Book VI: Densities, Transformed Distributions, and Limit Theorems
Book VII: Brownian Motion and Other Stochastic Processes
Book VIII: Itô Integration and Stochastic Calculus 1
Book IX: Stochastic Calculus 2 and Stochastic Differential Equations
Book X: Classical Models and Applications in Finance

The series is logically sequential. Books I, III, and V develop foundational mathematical
results needed for the probability theory and finance applications of Books II, IV, and
VI, respectively. Then Books VII, VIII, and IX develop results in the theory of stochastic
processes. While these latter three books introduce ideas from finance as appropriate, the
final realization of the applications of these stochastic models to finance is deferred to Book
X.
This Book IV, Distribution Functions and Expectations, extends the investigations of
Book II using the formidable tools afforded by the Riemann, Lebesgue, and Riemann-
Stieltjes integration theories of Book III.
To set the stage, Chapter 1 opens with a short review of the key results on distribution
functions from Books I and II. The focus here is on the connections between distribution
functions of random variables and random vectors and distribution functions induced by
Borel measures on R and Rn . A complete functional characterization of distribution func-
tions on R is then derived, providing a natural link between general probability theory and
the discrete and continuous theories commonly encountered. This leads to an investigation
into the existence of density functions associated with various distribution functions. Here
the integration theories from Book III are recalled to frame this investigation, and the gen-
eral results to be seen in Book VI using the Book V integration theory are introduced. The
chapter ends with a catalog of many common distribution and density functions from the
discrete and continuous probability theories.
Chapter 2 investigates transformations of random variables. For example, given a random
variable X and associated distribution/density function, what is the distribution/density
function of the random variable g(X) given a Borel measurable function g(x)? More gen-
erally, what are the distribution functions and densities of sums and ratios of random vari-
ables, where now g(x) is a multivariate function? The first section addresses the distribution
function question for strictly monotonic g(x) and the density question when such g(x) is
differentiable. More general transformations are deferred to Book VI using the change of
variable results from the integration theory of Book V. A number of results are then de-
rived for the distribution and density functions of sums of independent random variables
using the integration theories of Book III. The various forms of such distribution functions
then reflect the assumptions made on the underlying distribution functions and/or density


functions. Examples and exercises connect the theory with the Chapter 1 catalog of distri-
bution functions. The chapter ends with an investigation into ratios of independent random
variables, as well as an example using dependent random variables.
A special example of an order statistic was introduced in Chapter 9 of Book II on
extreme value theory, where this random variable was defined as the maximum of a collection
of independent, identically distributed random variables. Order statistics, the subject of
Chapter 3, generalize this notion, converting such a collection into ordered random variables.
Distribution and density functions of such variates are first derived, before turning to the
joint distribution function of all order statistics. This latter derivation introduces needed
combinatorial ideas as well as results on multivariate integration from Book III and a more
general result from Book V. Various density functions of order statistics are then derived,
beginning with the joint density and then proceeding to the various marginal and conditional
densities of these random variables. The final investigation is into the Rényi representation
theorem for the order statistics of an exponential distribution. While seemingly of narrow
applicability, being a result about exponential variables, this theorem will be seen to be more
widely applicable.
Expectations of random variables and transformed random variables are introduced in
Chapter 4 in the general context of a Riemann-Stieltjes integral. In the special case of dis-
crete or continuous probability theory, this definition reduces to the familiar notions from
these theories using Book III results. But this definition also raises existence and consis-
tency questions. The roadmap to a final solution is outlined, foretelling needed results from
the integration theory of Book V and the final detailed resolution in Book VI. Various
moments, the moment generating function, and properties of such are then developed, as
well as examples from the distribution functions introduced earlier. Moment inequalities of
Chebyshev, Jensen, Kolmogorov, Cauchy-Schwarz, Hölder, and Lyapunov are derived, be-
fore turning to the question of uniqueness of moments and the moment generating function.
The chapter ends with an investigation of weak convergence of distributions and moment
limits, developing a number of results underlying the “method of moments.”
Given a random variable X defined on a probability space, Chapter 4 of Book II derived
the theoretical basis for, and several constructions of, a probability space on which could be
defined a countable collection of independent random variables, identically distributed with
X. Such spaces provide a rigorous framework for the laws of large numbers of that book, and
the limit theorems of this book’s Chapter 6. This framework is in the background for Chapter
5, but the focus here is on the actual generation of random sample collections using the
previous theory and the various distribution functions introduced in previous chapters. The
various sections then exemplify simulation approaches for discrete distributions, and then
continuous distributions, using the left-continuous inverse function F ∗ (y) and independent,
continuous uniform variates commonly provided by various mathematical software. For
generating normal, lognormal, and Student’s T variates, these constructions are, at best,
approximate, and the chapter derives the exact constructions underlying the Box-Muller
transform and the Bailey transform, respectively. The final section turns to the simulation
of order statistics, both directly and with the aid of the Rényi representation theorem.
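As a preview of that construction, the following is a minimal sketch of the inverse transform method in Python (an illustration added here; the four-point distribution and all names are hypothetical, chosen only to show the mechanics of F ∗ (y)):

    import numpy as np

    # Minimal sketch of inverse transform sampling for a discrete distribution:
    # F*(u) = min{x : F(x) >= u}, the left-continuous inverse of F.
    rng = np.random.default_rng(seed=42)

    outcomes = np.array([0, 1, 2, 3])            # illustrative support points
    probs = np.array([0.4, 0.3, 0.2, 0.1])       # P(X = x_n), summing to 1
    cdf = np.cumsum(probs)                       # F at the support points

    u = rng.uniform(size=10_000)                 # independent U(0,1) variates
    samples = outcomes[np.searchsorted(cdf, u)]  # evaluates F*(u) by binary search

    # Empirical frequencies approximate probs for large samples.
    print(np.bincount(samples) / samples.size)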
Chapter 6 begins with a more formal short review of the theoretical framework of Book
II for the construction of a probability space on which a countable collection of independent,
identically distributed random variables can be defined, and thus on which limit theorems
of various types can be addressed. The first section then addresses weak convergence of
various distribution function sequences. Among those studied are the Student’s T, Poisson,
DeMoivre-Laplace, and a first version of the central limit theorem, as well as Smirnov’s result
on uniform order statistics, a general result on exponential order statistics, and finally a limit
theorem on quantiles. The next section generalizes the study of laws of large numbers of
Book II using moment defined limits, and proves a limit theorem on extreme value theory

identified in that book. The final section studies empirical distribution functions, and in
particular, derives the Glivenko-Cantelli theorem on convergence of empirical distributions
to the underlying distribution function. Kolmogorov’s theorem on the limiting distribution
of the maximum error in an empirical distribution is also discussed, as are related results.
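To make this empirical convergence concrete, here is a minimal sketch in Python (an illustration added here, with the standard exponential chosen arbitrarily) of the maximum error Dn = sup_x |Fn (x) − F (x)| between the empirical and underlying distribution functions:

    import numpy as np

    # Sketch: maximum error D_n between the empirical d.f. F_n of an
    # exponential sample and the true F(x) = 1 - exp(-x).
    rng = np.random.default_rng(seed=0)

    for n in (100, 1_000, 10_000):
        x = np.sort(rng.exponential(scale=1.0, size=n))
        F = 1.0 - np.exp(-x)              # true d.f. at the order statistics
        F_n = np.arange(1, n + 1) / n     # empirical d.f. just after each jump
        # The supremum is attained at a jump of F_n; check both sides of each jump.
        D_n = max(np.abs(F_n - F).max(), np.abs(F_n - 1.0 / n - F).max())
        print(n, float(D_n))              # D_n -> 0, per Glivenko-Cantelli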
Continuing the study initiated in Chapter 9 of Book II, Chapter 7 again has two main
themes. The first topic is large deviation theory. Following a summary of the main result
and open questions of Book II, the section introduces and exemplifies the Chernoff bound,
which requires the existence of the moment generating function. Following an analysis of
properties of this bound, and introducing tilted distributions and their relevant properties,
the section concludes with the Cramér-Chernoff theorem, which conclusively settles the
open questions of Book II. The second major section is on extreme value theory and focuses
on two matters. The first is a study of the Hill estimator for the extreme value index γ for
γ > 0, the index values most commonly encountered in finance applications. This estimator
is introduced and exemplified in the context of Pareto distributions, and the Hill result of
convergence with probability 1 derived, along with a variety of related results. For this,
earlier developments in order statistics will play a prominent role, as does a representation
theorem of Karamata. The second major investigation is into the Pickands-Balkema-de
Haan theorem, a result that identifies the limiting distribution of certain conditional tail
distributions. This final result was approximated in the Book II development, but here it can
be derived in detail with another representation theorem of Karamata. Using an example,
it is then shown that the convergence promised by this result need not be fast.
I hope this book and the other books in the collection serve you well.

Notation 0.1 (Referencing within FQF Series) To simplify the referencing of results
from other books in this series, we use the following convention.
A reference to “Proposition I.3.33” is a reference to Proposition 3.33 of Book I, while
“Chapter III.4” is a reference to Chapter 4 of Book III, and “II.(8.5)” is a reference to
formula (8.5) of Book II, and so forth.
1 Distribution and Density Functions

1.1 Summary of Book II Results


As the goal of this book is to study distribution functions, it is perhaps worthwhile to
summarize some of the key results of earlier books.

Notation 1.1 (µ → λ) In Book II, probability spaces were generally denoted by (S, E, µ),
where S is the measure space, here also called a “sample” space, E is the sigma algebra of
measurable sets, here also called the collection of “events,” and µ is the probability measure
defined on all sets in E. In this book we retain most of this notational convention. However,
because µ will often be called upon in later chapters to represent the “mean” of a given
distribution as is conventional, we will represent the probability measure herein by λ or by
another Greek letter.

1.1.1 Distribution Functions on R


Book II introduced definitions and basic properties. Beginning with Definition II.3.1:

Definition 1.2 (Random variable) Given a probability space (S, E, λ), a random vari-
able (r.v.) is a real-valued function:

X : S −→ R,

such that for any bounded or unbounded interval, (a, b) ⊂ R :

X −1 (a, b) ∈ E.

The distribution function (d.f.), or cumulative distribution function (c.d.f.),


associated with X, denoted by F or FX , is defined on R by

F (x) = λ[X −1 (−∞, x]]. (1.1)

Properties of such functions were summarized in Proposition II.6.1.

Proposition 1.3 (Properties of a d.f. F (x)) Given a random variable X on a proba-


bility space (S, E, λ), the distribution function F (x) associated with X has the following
properties:

1. F (x) is a nonnegative, increasing function on R which is Borel, and hence, Lebesgue


measurable.
2. For all x :

lim_{y→x+} F (y) = F (x), (1.2)

and thus F (x) is right continuous, and:

lim_{y→x−} F (y) = F (x) − λ({X(s) = x}), (1.3)

so F (x) has left limits.


3. F (x) has, at most, countably many discontinuities.
4. The limits of F (x) exist as x → ±∞ :

lim_{y→−∞} F (y) = 0. (1.4)

lim_{y→∞} F (y) = 1. (1.5)
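For a simple illustration of (1.3) (an example added here; see also Section 1.4.1), let X be a Bernoulli variable with λ({X = 1}) = p and λ({X = 0}) = 1 − p. Then F (x) = 0 for x < 0, F (x) = 1 − p for 0 ≤ x < 1, and F (x) = 1 for x ≥ 1. At x = 1:

lim_{y→1−} F (y) = 1 − p = F (1) − λ({X(s) = 1}),

and F has exactly the two discontinuities x = 0 and x = 1.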

This result also reflects the characterizing properties of such distribution functions. A
function F : R → R can be identified as a distribution function by Proposition II.3.6:
Proposition 1.4 (Identifying a distribution function) Let F (x) be an increasing
function, which is right continuous and satisfies F (−∞) = 0 and F (∞) = 1, defined as
limits.
Then there exists a probability space (S, E, λ) and random variable X so that F (x) =
λ[X −1 (−∞, x]]. In other words, every such function is the distribution function of a random
variable.
Distribution functions are also intimately linked with Borel measures by Proposition
II.6.3. Recall that A0 denoted the semi-algebra of right semi-closed intervals (a, b], A the
associated algebra of finite disjoint unions of such sets, and B(R) the Borel sigma algebra
of Definition I.2.13. By definition, A ⊂ B(R).
The outer measure λ∗A is defined in I.(5.8) and induced by the set function λA defined on
A0 by (1.6) and extended to A by finite additivity. A set A is said to be λ∗A -measurable, or
Carathéodory measurable after Constantin Carathéodory (1873–1950), if for every
set E ⊂ R :

λ∗A (E) = λ∗A (A ∩ E) + λ∗A (Ã ∩ E).

Proposition 1.5 (The Borel measure λF induced by d.f. F : R → R) Given a prob-


ability space (S, E, λ) and random variable X : S −→ R, the associated distribution function
F defined on R by (1.1), induces a unique probability measure λF on the Borel sigma
algebra B (R) that is defined on A0 by:
λF [(a, b]] = F (b) − F (a). (1.6)
In detail, with MF (R) defined as the collection of λ∗A -measurable sets:
1. A ⊂ MF (R), and for all A ∈ A :
λ∗A (A) = λA (A).

2. MF (R) is a complete sigma algebra and thus contains every set A ⊂ R with λ∗A (A) = 0.
3. MF (R) contains the Borel sigma algebra, B(R) ⊂ MF (R).
4. If λF denotes the restriction of λ∗A to MF (R), then λF is a probability measure and
hence (R, MF (R), λF ) is a complete probability space.
5. The probability measure λF is the unique extension of λA from A to the smallest sigma
algebra generated by A, which is B(R) by Proposition I.8.1.
6. For all A ∈ B(R) :

λF (A) = λ[X −1 (A)]. (1.7)
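For example (an illustration added here), with F (x) = 1 − e^{−x} for x ≥ 0 and F (x) = 0 otherwise, the standard exponential distribution function, (1.6) obtains for 0 ≤ a < b:

λF [(a, b]] = F (b) − F (a) = e^{−a} − e^{−b},

which by (1.7) is the probability that X ∈ (a, b].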

1.1.2 Distribution Functions on Rn


Definitions II.3.28 and II.3.30 provided equivalent formulations for the notion of a joint
distribution of a random vector. This equivalence was proved in Proposition II.3.32.

Definition 1.6 (Random vector 1) If Xj : S −→ R are random variables on a probabil-


ity space (S, E, λ), j = 1, 2, ..., n, define the random vector X = (X1 , X2 , ..., Xn ) as the
vector-valued function:
X : S −→ Rn ,
defined on s ∈ S by:
X(s) = (X1 (s), X2 (s), ..., Xn (s)).
The joint distribution function (d.f.), or joint cumulative distribution func-
tion (c.d.f.), associated with X, denoted by F or FX , is then defined on (x1 , x2 , ..., xn ) ∈
Rn by:

F (x1 , x2 , ..., xn ) = λ[∩_{j=1}^n Xj−1 (−∞, xj ]]. (1.8)

Definition 1.7 (Random vector 2) Given a probability space (S, E, λ), a random vector is a mapping X : S −→ Rn , so that for all A ∈ B(Rn ), the sigma algebra of Borel measurable sets on Rn :

X −1 (A) ∈ E. (1.9)

The joint distribution function (d.f.), or joint cumulative distribution function (c.d.f.), associated with X, denoted by F or FX , is then defined on (x1 , x2 , ..., xn ) ∈ Rn by:

F (x1 , x2 , ..., xn ) = λ[X −1 (∏_{j=1}^n (−∞, xj ])]. (1.10)

Properties of such functions, and the link to Borel measures on Rn , were summarized
in Proposition II.6.9. But first, recall the definitions of continuous from above and n-
increasing.

Definition 1.8 (Continuous from above; n-increasing) A function F : Rn → R is said to be continuous from above if given x = (x1 , ..., xn ) and a sequence x^{(m)} = (x_1^{(m)} , ..., x_n^{(m)} ) with x_i^{(m)} ≥ xi for all i and m, and x^{(m)} → x as m → ∞, then:

F (x) = lim_{m→∞} F (x^{(m)} ). (1.11)

Further, F is said to be n-increasing, or to satisfy the n-increasing condition, if for any bounded right semi-closed rectangle ∏_{i=1}^n (ai , bi ] :

∑_x sgn(x)F (x) ≥ 0. (1.12)

Each x = (x1 , ..., xn ) in the summation is one of the 2^n vertices of this rectangle, so xi = ai or xi = bi , and sgn(x) is defined as −1 if the number of ai -components of x is odd, and +1 otherwise.
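For example (an illustration added here), when n = 2 the sum in (1.12) over the four vertices of (a1 , b1 ] × (a2 , b2 ] reads:

F (b1 , b2 ) − F (a1 , b2 ) − F (b1 , a2 ) + F (a1 , a2 ) ≥ 0,

which by (1.13) below is precisely the λF -measure of this rectangle. For n = 1 the condition reduces to F (b) − F (a) ≥ 0, that F is increasing.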

These properties are common to all joint distribution functions, and are exactly the
properties needed to generate Borel measures, by Proposition II.6.9. Below we quote this
result and add properties from Propositions I.8.15 and I.8.16.

Proposition 1.9 (The Borel measure λF induced by d.f. F : Rn → R) Let X : S −→ Rn be a random vector defined on (S, E, λ), and F : Rn → R the joint distribution function defined in (1.8) or (1.10). Then F is continuous from above and satisfies the n-increasing condition.
In addition, F (−∞) = 0 and F (∞) = 1 defined as limits, meaning that for any ε > 0, there exists N so that F (x1 , x2 , ..., xn ) ≥ 1 − ε if all xi ≥ N, and F (x1 , x2 , ..., xn ) ≤ ε if all xi ≤ −N.
Thus by Propositions I.8.15 and I.8.16, F induces a unique probability measure λF on the Borel sigma algebra B(Rn ), so that for all bounded right semi-closed rectangles ∏_{i=1}^n (ai , bi ] ⊂ Rn :

λF [∏_{i=1}^n (ai , bi ]] = ∑_x sgn(x)F (x). (1.13)

In addition, for all A ∈ B(Rn ) :

λF (A) = λ[X −1 (A)], (1.14)

and thus by (1.10), for A = ∏_{j=1}^n (−∞, xj ] :

λF [∏_{j=1}^n (−∞, xj ]] = F (x).

We can identify distribution functions by Proposition II.6.10.

Proposition 1.10 ((S, E, λ) and X induced by d.f. F : Rn → R) Given a function F :


Rn → R that is n-increasing and continuous from above, with F (−∞) = 0 and F (∞) = 1
defined as limits, there exists a probability space (S, E, λ) and a random vector X so that F
is the joint distribution function of X.

Joint distribution functions induce both marginal distribution functions and con-
ditional distribution functions. We recall Definitions II.3.34 and II.3.39.

Definition 1.11 (Marginal distribution function) Let X : S −→ Rn be the random


vector X = (X1 , X2 , ..., Xn ) defined on (S, E, λ), where Xj : S −→ R are random variables
on (S, E, λ) for j = 1, 2, ..., n.

1. Special Case n = 2 : Given the joint distribution function F (x1 , x2 ) defined on R2 , the two marginal distribution functions on R, F1 (x1 ) and F2 (x2 ), are defined by:

F1 (x1 ) ≡ lim_{x2 →∞} F (x1 , x2 ), F2 (x2 ) ≡ lim_{x1 →∞} F (x1 , x2 ). (1.15)

2. General Case: Given F (x1 , x2 , ..., xn ) and I = {i1 , ..., im } ⊂ {1, 2, ..., n}, let xJ ≡ (xj1 , xj2 , ..., xjn−m ) for jk ∈ J ≡ Ĩ. The marginal distribution function FI (xI ) ≡ FI (xi1 , xi2 , ..., xim ) is defined on Rm by:

FI (xI ) ≡ lim_{xJ →∞} F (x1 , x2 , ..., xn ). (1.16)

Definition 1.12 (Conditional distribution function) Let X : S −→ Rn be the random vector X = (X1 , X2 , ..., Xn ) defined on (S, E, λ), J ≡ {j1 , ..., jm } ⊂ {1, 2, ..., n} and XJ ≡ (Xj1 , Xj2 , ..., Xjm ). Given a Borel set B ∈ B(Rm ) with λ[XJ−1 (B)] ≠ 0, define the conditional distribution function of X given XJ ∈ B, denoted by F (x|XJ ∈ B) ≡ F (x1 , x2 , ..., xn |XJ ∈ B), in terms of the conditional probability measure:

F (x|XJ ∈ B) ≡ λ[X −1 (∏_{i=1}^n (−∞, xi ]) | XJ−1 (B)].

Thus by Definition II.1.31,

F (x|XJ ∈ B) = λ[X −1 (∏_{i=1}^n (−∞, xi ]) ∩ XJ−1 (B)] / λ[XJ−1 (B)]. (1.17)

This distribution function is sometimes denoted by FJ|B (x).

Finally, recall the notion of independent random variables from Definition II.3.47,
and the characterization of the joint distribution function of such variables by Proposition
II.3.53.

Definition 1.13 (Independent random variables/vectors) If Xj : S −→ R are random variables on (S, E, λ), j = 1, 2, ..., n, we say that {Xj }_{j=1}^n are independent random variables if {σ(Xj )}_{j=1}^n are independent sigma algebras in the sense of Definition II.1.15. That is, given {Bj }_{j=1}^n with Bj ∈ σ(Xj ) :

λ[∩_{j=1}^n Bj ] = ∏_{j=1}^n λ(Bj ). (1.18)

Equivalently, given {Aj }_{j=1}^n ⊂ B(R) :

λ[∩_{j=1}^n Xj−1 (Aj )] = ∏_{j=1}^n λ[Xj−1 (Aj )]. (1.19)

If Xj : S −→ R^{nj} are random vectors on (S, E, λ), j = 1, 2, ..., n, we say that {Xj }_{j=1}^n are independent random vectors if {σ(Xj )}_{j=1}^n are independent sigma algebras. That is, given {Bj }_{j=1}^n with Bj ∈ σ(Xj ) :

λ[∩_{j=1}^n Bj ] = ∏_{j=1}^n λ(Bj ). (1.20)

Equivalently, given {Aj }_{j=1}^n with Aj ∈ B(R^{nj}) :

λ[∩_{j=1}^n Xj−1 (Aj )] = ∏_{j=1}^n λ[Xj−1 (Aj )]. (1.21)

A countable collection of random variables {Xj }_{j=1}^∞ defined on (S, E, λ) are said to be independent random variables if given any finite index subcollection, J = (j(1), j(2), ..., j(n)), {Xj(i) }_{i=1}^n are independent random variables. The analogous definition of independence applies to a countable collection of random vectors.

Proposition 1.14 (Characterization of the joint D.F.) Let Xj : S −→ R, j = 1, 2, ..., n, be random variables on a probability space (S, E, λ) with distribution functions Fj (x). Let the random vector X : S −→ Rn be defined on this space by X(s) = (X1 (s), X2 (s), ..., Xn (s)) and with joint distribution function F (x).
Then {Xj }_{j=1}^n are independent random variables if and only if:

F (x1 , x2 , ..., xn ) = ∏_{j=1}^n Fj (xj ). (1.22)

Countably many random variables {Xj }_{j=1}^∞ defined on (S, E, λ) are independent random variables if and only if for every finite index subcollection, J = (j(1), j(2), ..., j(n)) :

F (xj(1) , xj(2) , ..., xj(n) ) = ∏_{i=1}^n Fj(i) (xj(i) ). (1.23)

These results are valid for random vectors Xj : S −→ R^{nj}, noting that Fj (xj ) are joint distribution functions on R^{nj} as given in (1.8) or (1.10).
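For example (an illustration added here), if X1 and X2 are independent standard exponential variables, then by (1.22):

F (x1 , x2 ) = (1 − e^{−x1})(1 − e^{−x2}),

for x1 , x2 ≥ 0, and F (x1 , x2 ) = 0 otherwise.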

1.2 Decomposition of Distribution Functions on R


In this section, we utilize some of the results from Chapter III.3 to derive what is sometimes
called the canonical decomposition of the distribution function of a random variable.
This result identifies three components in the decomposition, two of which will be recog-
nized within the discrete and continuous probability theories. The third component, while
fascinating to contemplate, has not found compelling applications in finance so far.
To formally state this proposition requires some terminology. The following definition in-
troduces saltus functions, which are generalized step functions with potentially countably
many steps defined by a collection {xn }∞ n=1 . The word saltus is from Latin, meaning “leap”
or “jump,” and reflects a break in continuity. The general definition also allows two jumps
at each domain point xn , one of size un reflecting a discontinuity from the left, and one of
size vn , reflecting the discontinuity from the right, where un and vn can in this definition
be positive or negative.
Since distribution functions are right continuous and increasing by Proposition 1.3, for
the application below it will always be the case that un > 0 and vn = 0.

Definition 1.15 (Saltus function) Given {xn }_{n=1}^∞ ⊂ R and real sequences {un }_{n=1}^∞ , {vn }_{n=1}^∞ which are absolutely convergent:

∑_{n=1}^∞ |un | < ∞, ∑_{n=1}^∞ |vn | < ∞,

a saltus function f (x) is defined as:

f (x) = ∑_{n=1}^∞ fn (x),

where:

fn (x) = 0 for x < xn ; fn (x) = un for x = xn ; and fn (x) = un + vn for x > xn .

In other words,

f (x) = ∑_{xn ≤x} un + ∑_{xn <x} vn .
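For example (an illustration added here), the Poisson distribution function with parameter μ > 0 (λ being reserved for the probability measure per Notation 1.1) is a saltus function in this notation, with xn = n for n = 0, 1, 2, ..., jumps un = e^{−μ} μ^n /n!, and all vn = 0:

F (x) = ∑_{n≤x} e^{−μ} μ^n /n!,

so that F is right continuous with jumps un , as required of a distribution function.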

In addition to saltus functions, recall the following functions from Definitions III.3.49
and III.3.54.

Definition 1.16 (Singular function; absolutely continuous function) A function f (x) is singular on the interval [a, b] if f (x) is continuous, monotonically increasing with f (b) > f (a), and f ′ (x) = 0 almost everywhere.
A function f (x) is absolutely continuous on the interval [a, b] if for any ε > 0 there is a δ so that:

∑_{i=1}^n |f (xi ) − f (x′i )| < ε,

for any finite collection of disjoint subintervals, {(x′i , xi )}_{i=1}^n ⊂ [a, b], with:

∑_{i=1}^n |xi − x′i | < δ.

Remark 1.17 The relevant facts from Book III on such functions are:

• Proposition III.3.53: Singular functions exist, and this proposition illustrates this
with the Cantor function of Definition III.3.51, named for Georg Cantor (1845–
1918).
• Proposition III.3.62: Absolutely continuous functions are characterized as follows:
A function f (x) is absolutely continuous on [a, b] if and only if f (x) equals the Lebesgue integral of its derivative on this interval:

f (x) = f (a) + (L)∫_a^x f ′ (y)dy.

Implicit in this result is that f ′ (x) exists almost everywhere and is Lebesgue integrable. Any continuously differentiable function f (x) is absolutely continuous by the mean value theorem (Remark III.3.56), and then the above representation is also valid as a Riemann integral.
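For intuition on how a singular function can increase while having a zero derivative almost everywhere, the following is a minimal numerical sketch of the Cantor function in Python (an illustration added here, not part of the Book III development), exploiting the function's ternary self-similarity:

    # Sketch: approximate the Cantor function c(x) on [0,1]: continuous,
    # increasing from 0 to 1, yet c'(x) = 0 almost everywhere.
    def cantor(x: float, depth: int = 40) -> float:
        value, scale = 0.0, 1.0
        for _ in range(depth):
            scale /= 2.0
            if x < 1.0 / 3.0:
                x = 3.0 * x               # left third: rescale, contribute 0
            elif x > 2.0 / 3.0:
                value += scale            # right third: contribute 2^(-k)
                x = 3.0 * x - 2.0
            else:
                return value + scale      # middle third: c is constant here
        return value

    print(cantor(0.25), cantor(0.5), cantor(1.0))  # ~1/3, 1/2, ~1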

We are now ready for the main result.

Proposition 1.18 (Decomposition of F (x)) Let X be a random variable defined on a probability space (S, E, λ) with distribution function F (x) ≡ λ[X −1 (−∞, x]]. Then F (x) is differentiable almost everywhere and:

F (x) = αFSLT (x) + βFAC (x) + γFSN (x), (1.24)

where:

• α, β, γ are nonnegative, with α + β + γ = 1.

• FSLT (x) is a saltus distribution function.


• FAC (x) is an absolutely continuous distribution function.
• FSN (x) is a singular distribution function.

Proof. By Proposition 1.3, F (x) is increasing and thus differentiable almost everywhere by
Proposition III.3.12. In addition, F (x) is continuous from the right, has left limits, and has
at most countably many points of discontinuity which we denote {xn }_{n=1}^∞ . At such points define:

un = F (xn ) − F (xn− ),

where F (x− ) ≡ lim_{y→x−} F (y). Hence from (1.3):

un = λ(X −1 (xn )) > 0,

and so 0 ≤ ∑_{n=1}^∞ un ≤ 1. Let α = 0 if this sum is 0, and otherwise define α = ∑_{n=1}^∞ un , and:

FSLT (x) ≡ ∑_{xn ≤x} un /α. (1.25)

Then FSLT (x) is increasing, and by definition FSLT (−∞) = 0 and FSLT (∞) = 1 defined as limits. To prove right continuity, let x and ε > 0 be given. Then for x ≤ y :

FSLT (y) − FSLT (x) = ∑_{x<xn ≤y} un /α,

where this summation is finite or countable. In the first case, there is δ so that for y ≤ x + δ this summation is zero and this difference is then bounded by ε. If countable, then since ∑_{n=1}^∞ un ≤ 1, the summation is convergent and can be made arbitrarily small by eliminating finitely many terms. Reducing y to eliminate this finite set, the above difference can again be made arbitrarily small. Thus FSLT (x) is a saltus distribution function by Definition 1.15 and Proposition 1.4.
Next, let G(x) ≡ F (x) − αFSLT (x). If y > x then, since F is increasing, the sum of its jumps over (x, y] is bounded by its total increase there:

αFSLT (y) − αFSLT (x) ≡ ∑_{x<xn ≤y} [F (xn ) − F (xn− )] ≤ F (y) − F (x).

This obtains G(y) − G(x) ≥ 0 and so G(x) is increasing. Also by consideration of the
component functions, G(−∞) = 0 and G(∞) = 1 − α defined as limits.
Further, G(x) is continuous. First, right continuity follows from right continuity of the
components. If x is a left discontinuity of G(x), so G(x) − G(x− ) > 0, then this implies:

[F (x) − αFSLT (x)] − [F (x− ) − αFSLT (x− )]
= F (x) − F (x− ) − lim_{y→x−} ∑_{y<xn ≤x} un
> 0.

But this is a contradiction. Either x is a continuity point of F so F (x) = F (x− ) and this sum converges to 0, or x = xn is a left discontinuity of F and this sum converges to un = F (xn ) − F (xn− ) by construction.
By Proposition III.3.12, increasing G(x) is differentiable almost everywhere, G′ (x) ≥ 0 by Corollary III.3.13, and G′ (x) is Lebesgue integrable on every interval [a, b] by Proposition III.3.19 with:

(L)∫_a^b G′ (y)dy ≤ G(b) − G(a). (1)

It follows that the value of this integral increases as a → −∞ and/or b → ∞, and that the limit exists since bounded by:

0 ≤ (L)∫_{−∞}^∞ G′ (y)dy ≤ 1 − α.

If this improper integral equals 0 then define β = 0, and otherwise, let β ≡ (L)∫_{−∞}^∞ G′ (y)dy. Define:

F̃AC (x) ≡ (L)∫_{−∞}^x G′ (y)dy. (2)

By Proposition III.3.58, for any a :

F̃AC (x) − F̃AC (a) = (L)∫_a^x G′ (y)dy,

is absolutely continuous for x ∈ [a, b] for any b.


It then follows that FAC (x)−FAC (a) equals the Lebesgue integral of its derivative almost
everywhere by Proposition III.3.62:
Z x
0
F̃AC (x) − F̃AC (a) = (L) F̃AC (y)dy, a.e.,
a
Decomposition of Distribution Functions on R 9
0
and F̃AC (y) = G0 (y) a.e. by the proof of this result. As this integral identity is true for all
a and F̃AC (−∞) = 0 by definition:
Z x
0
F̃AC (x) = (L) F̃AC (y)dy. (3)
−∞

0 0
By (2) and
R ∞ (3),0 the integrals of G (y) and F̃AC (y) agree over (−∞, x] for all x, and so also
β ≡ (L) −∞ F̃AC dy.
Now define:

FAC (x) ≡ [(L)∫_{−∞}^x F̃′AC (y)dy]/β. (1.26)

Then FAC (x) = F̃AC (x)/β is absolutely continuous, and is a distribution function since FAC (−∞) = 0 and FAC (∞) = 1.
Finally consider H(x) ≡ G(x) − βFAC (x). Note that H(−∞) = 0 and H(∞) = 1 − α − β. As a difference of continuous increasing functions, H(x) is continuous and of bounded variation (Proposition III.3.29), and differentiable almost everywhere by Corollary III.3.32. But H′ (x) ≡ G′ (x) − βF′AC (x) = 0 a.e., and this obtains by item 3 of Proposition III.2.49 and (1) :

βFAC (b) − βFAC (a) = β∫_a^b F′AC (y)dy = ∫_a^b G′ (y)dy ≤ G(b) − G(a).

Hence H(a) ≤ H(b) and H(x) is increasing.


If H(x) is constant it must be identically 0 since H(x) → 0 as x → −∞, and, in this
case define γ = 0. Otherwise, there exists an interval [a, b] for which H(a) < H(b), and then
by definition H(x) is a singular function. In this case let γ = H(∞) − H(−∞) defined as
limits, then γ = 1 − α − β and we set:

FSN (x) ≡ H(x)/γ. (1.27)
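For example (an illustration added here; mixed distribution functions of this type are cataloged in Section 1.4.3), let X = 0 with probability 1/2 and standard exponential with probability 1/2, so that:

F (x) = 1/2 + (1/2)(1 − e^{−x}) for x ≥ 0, and F (x) = 0 for x < 0.

Then α = 1/2 with FSLT (x) the unit step at x = 0, β = 1/2 with FAC (x) = 1 − e^{−x} for x ≥ 0, and γ = 0.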

Remark 1.19 (Decomposition of joint distributions on Rn ) There does not appear


to be an n-dimensional analog to Proposition 1.18 for joint distribution functions
F (x1 , x2 , ..., xn ). By Sklar’s theorem of Proposition II.7.41, named for Abe Sklar (1925–
2020), who published these results in 1959, any such distribution function can be represented
by:
F (x1 , x2 , ..., xn ) = C(F1 (x1 ) , F2 (x2 ) , ..., Fn (xn )). (1.28)
Here, {Fj (x)}nj=1 are the marginal distribution functions of F (x1 , x2 , ..., xn ), and
C(u1 , u2 , ..., un ) is a distribution function defined on [0, 1]n with uniform marginal dis-
tributions.
By uniform marginals is meant that given I = {i} ⊂ {1, 2, ..., n}, and uJ ≡ (uj1 , uj2 , ..., ujn−1 ) for jk ∈ J ≡ Ĩ, the marginal distribution function Ci (ui ) defined on R by Definition 1.11:

Ci (ui ) ≡ lim_{uJ →∞} C(u1 , u2 , ..., un ),

satisfies:

Ci (ui ) = ui .
The joint distribution functions C(u1 , u2 , ..., un ) are called copulas.
While the above proposition applies to the marginals {Fj (x)}nj=1 of F (x1 , x2 , ..., xn ),
there does not appear to be a feasible way to use this to decompose F (x1 , x2 , ..., xn ).

1.3 Density Functions on R


In this section, we investigate the density functions associated with the distribution func-
tions of Proposition 1.3, when these densities exist. There are several approaches for iden-
tifying these functions, which we call the “Lebesgue approach,” the “Riemann approach,”
and the “Riemann-Stieltjes framework,” for reasons that will become apparent.
The Lebesgue approach is introduced first because it will be the more enduring model
that will see various generalizations in Section 4.2.3 and in Book V. This approach leads to
a precise answer to the question of existence of a density function: a density exists if and
only if the distribution function is absolutely continuous.
The Riemann approach is then seen to obtain a specialized subset of the Lebesgue
approach, which provides the basis for continuous probability theory. This approach iden-
tifies densities only for the subset of absolutely continuous distribution functions that are
continuously differentiable.
The Riemann-Stieltjes framework expands the final Riemann result. Representing distri-
bution functions as such integrals connects the continuous and discrete probability theories,
and obtains densities for both continuously differentiable and discrete distribution functions
using Book III results.
See Sections 3.5 and 4.2.3 for a discussion of density functions on Rn .

1.3.1 The Lebesgue Approach


Let X be a random variable on a probability space (S, E, λ) with distribution function F (x) ≡ λ[X −1 (−∞, x]], and let f (x) be a nonnegative Lebesgue measurable function defined on R with:

(L)∫_{−∞}^∞ f (y)dy = 1.

We say that f (x) is a density function associated with F (x) in the Lebesgue sense if for all x:

F (x) = (L)∫_{−∞}^x f (y)dy. (1.29)

A density f (x) cannot be unique since if g(x) = f (x) a.e., then g(x) is also a density for
F (x) by item 3 of Proposition III.2.31.
This then leads to the question:
Which distribution functions, or equivalently, which random variables X on (S, E, λ),
have density functions in the Lebesgue sense?
A density function can also be defined relative to the measure λF induced by F (x) of Proposition 1.5, and this will prove useful for generalizations in Book V. We say f (x) is a density function associated with λF if for all x and Ax ≡ (−∞, x]:

λF (Ax ) = (L)∫_{Ax} f (y)dy ≡ (L)∫_{−∞}^x f (y)dy.

By definition such a function satisfies the integrability condition above since λF (R) = 1.
By finite additivity of measures and item 7 of Proposition III.2.31, it then follows that
for any right, semi-closed interval A(a,b] = (a, b] :

λF ((a, b]) = (L)∫_{A(a,b]} f (y)dy ≡ (L)∫_a^b f (y)dy,

and this raises the question:



What is the connection between λF (A) and the density f (x) for other Borel sets A ∈
B (R)?
A very good start to an answer would be to investigate the set function λf defined on the Borel sigma algebra B (R) by:

λf (A) = (L)∫_A f (y)dy. (1.30)

The set function λf is well-defined since such Lebesgue integrals are well-defined by Defi-
nition III.2.9.
By definition, the set function λf and the Borel measure λF induced by F agree on
A0 , the semi-algebra of right, semi-closed intervals. By extension they also agree on the
associated algebra A of finite disjoint unions of such sets. The uniqueness of extensions
theorem of Proposition I.6.14 then assures that for all A ∈ B (R):

λf (A) = λF (A),

since B (R) is the smallest sigma algebra that contains A by Proposition I.8.1.
Generalizing, it will be proved in Book V that given any Lebesgue integrable function
f (x), the set function λf of (1.30) is always a measure on B (R) .
Returning to the existence question on a density function associated with the distribution
function F (x), there are two ways to frame an answer.

Summary 1.20 (Existence of a Lebesgue density) When does a distribution function F (x), or the induced measure λF , have a density function f (x) in the above sense?

1. By Proposition III.3.62, a Lebesgue measurable density f (x) that satisfies (1.29) exists if and only if F (x) is an absolutely continuous function on every interval [a, b].
This follows because (1.29) and item 7 of Proposition III.2.31 obtain for all a:

F (x) = F (a) + (L)∫_a^x f (y)dy.

This proposition also states that (1.29) is satisfied with f (x) ≡ F ′ (x), and then by item 8 of Proposition III.2.31, it is also satisfied by any measurable function f (x) such that f (x) = F ′ (x) a.e.
Thus F (x) has a density function f (x) in the Lebesgue sense if and only if F (x) is absolutely continuous.

2. By item 3 of Proposition III.2.31, if the induced measure λF has a density function, then λF (A) = 0 for every measurable set with m(A) = 0, where m denotes Lebesgue measure. This follows from (1.30) and λF = λf , since by Definition III.2.9:

∫_A f (y)dy ≡ ∫ χA (y)f (y)dy.

Here χA (y) = 1 on A and is 0 otherwise, and thus λF (A) = 0 since χA (y)f (y) = 0 a.e.
Thus if the measure λF induced by F (x) has a density function f (x), then λF (A) = 0 for every set for which m(A) = 0.

Remark 1.21 (Radon-Nikodým theorem) At the moment, item 2 provides a weaker


“if” characterization than does item 1, which obtains “if and only if.” But it will be seen in
Book V that this characterization is again “if and only if,” in that if λF (A) = 0 for every
set for which m(A) = 0, then λF will have a density function in the above sense.

This will follow from a deep and general result known as the Radon-Nikodým the-
orem, named for Johann Radon (1887–1956) who proved this result on Rn , and Otto
Nikodým (1887–1974) who generalized Radon’s result to all σ-finite measure spaces.
A consequence of this result, which will see generalizations in many ways, is then:
The measure λF induced by F (x) has a density function f (x) if and only if λF (A) = 0
for every set for which m(A) = 0.

The following proposition summarizes the implications of these observations in terms of


the decomposition of Proposition 1.18.

Proposition 1.22 (Decomposition of F (x): Lebesgue density functions) Let X be a random variable defined on a probability space (S, E, λ) with distribution function F (x) ≡ λ[X −1 (−∞, x]], and decomposition in (1.24):

F (x) = αFSLT (x) + βFAC (x) + γFSN (x).

Then only FAC (x) has a density function in the Lebesgue sense, with:

fAC (x) = F′AC (x) a.e.

Proof. This result is Proposition III.3.62, which states that F (x) has a density if and only
if F (x) is absolutely continuous.

Remark 1.23 (Lebesgue decomposition theorem, Book V) As implied by Remark


1.21, the current discussion touches some very general ideas to be developed in Book V,
there in the form of the Radon-Nikodým theorem. The decomposition of Proposition
1.18 also has a measure-theoretic interpretation, which we introduce here.
To simplify notation, assume that the constants α, β, and γ have been incorporated into
the distribution functions FSLT , FAC , and FSN , and thus these are now increasing, right
continuous functions with all F# (−∞) = 0 and F# (∞) equal to α, β, and γ, respectively.
In Book V, the conclusion of the second item of Summary 1.20, that λFAC (A) = 0 for
every set for which m(A) = 0, will be denoted:

λFAC ≪ m.

This notation is read: The Borel measure λFAC is absolutely continuous with respect to
Lebesgue measure m. While this notation is suggestive, that m(A) = 0 forces λFAC (A) = 0,
it is not intended to imply any other relationship between these measures on other sets.
For FSLT , let E1 ≡ {xn }_{n=1}^∞ , which in the notation of the proof of Proposition 1.18 are the discontinuities of F (x). Then m(E1 ) = λFSLT (Ẽ1 ) = 0, where Ẽ1 ≡ R − E1 denotes the complement of E1 . This follows because in the notation of Proposition 1.3,

λFSLT (xn ) ≡ FSLT (xn ) − FSLT (xn− ) = un ,

and this measure is 0 on every set that is outside E1 . For FSN , let E2 be defined as the set of measure 0 on which F′SN (x) ≠ 0. Then by definition m(E2 ) = 0, while λFSN (Ẽ2 ) = 0 by Proposition I.5.30.
Thus, the sets on which m and λFSLT are “supported,” meaning on which they have
nonzero measure, are complementary. The same is true for m and λFSN . In Book V this
relationship will be denoted:
m ⊥ λFSLT , m ⊥ λFSN .
This reads, Lebesgue measure m and the Borel measure λFSLT (respectively λFSN ) are mu-
tually singular.
Now define E = E1 ∪ E2 . Then m(E) = λFSLT (Ẽ) = λFSN (Ẽ) = 0, and thus:

m ⊥ (λFSLT + λFSN ).

By the uniqueness theorem of Proposition I.6.14 it follows that:

λF = λFAC + λFSLT + λFSN ,

and we have derived a special case of Lebesgue’s decomposition theorem, named for
Henri Lebesgue (1875–1941).

• Lebesgue's decomposition theorem: Given a σ-finite measure space (Definition I.5.34), here (R, B(R), m), and a σ-finite measure defined on B(R), here λ_F, there exist measures ν_ac and ν_s defined on B(R) so that ν_ac ≪ m, ν_s ⊥ m, and:

λ_F = ν_ac + ν_s.

By the discussion above, when λ_F is the Borel measure induced by a distribution function F(x), then:

ν_ac = λ_{F_AC},  ν_s = λ_{F_SLT} + λ_{F_SN},
is such a decomposition.
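The following minimal Python sketch illustrates this decomposition on intervals (a, b]. The mixture is hypothetical: an absolutely continuous exponential part with weight 0.5 and a single point mass of 0.5 at x = 1, so ν_s here is purely saltus:

import math

# Mixed distribution: weight 0.5 on an absolutely continuous exponential
# part, plus a point mass of 0.5 at x = 1 (a saltus part; no singular
# continuous part in this example).
def F(x):
    ac = 0.5 * (1.0 - math.exp(-x)) if x >= 0 else 0.0
    return ac + (0.5 if x >= 1 else 0.0)

def nu_ac(a, b):
    # Absolutely continuous component on (a, b]: the integral of 0.5 e^{-y}.
    lo, hi = max(a, 0.0), max(b, 0.0)
    return 0.5 * (math.exp(-lo) - math.exp(-hi))

def nu_s(a, b):
    # Singular component on (a, b]: here just the point mass at x = 1.
    return 0.5 if a < 1 <= b else 0.0

a, b = 0.25, 2.0
print(F(b) - F(a))                # lambda_F((a, b])
print(nu_ac(a, b) + nu_s(a, b))   # the same value: nu_ac + nu_s

Since ν_ac ≪ m and ν_s ⊥ m (ν_s is carried by the m-null set {1}), the two printed values agree, as the theorem asserts.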

1.3.2 Riemann Approach


As noted in the introduction, the Riemann approach is a subset of the Lebesgue approach. Let X be a random variable on a probability space (S, E, λ) with distribution function F(x) ≡ λ(X⁻¹(−∞, x]), and let f(x) be a nonnegative Riemann integrable function defined on R with:

(R) ∫_{−∞}^∞ f(y) dy = 1.

We say that f(x) is a density function associated with F(x) in the Riemann sense if for all x:

F(x) = (R) ∫_{−∞}^x f(y) dy.    (1.31)

Again in this context, such a density f (x) cannot be unique, at least if bounded. By the
Lebesgue existence theorem of Proposition III.1.22, if the above f (x) is bounded, then it
must be continuous almost everywhere on every interval [a, b], and thus continuous almost
everywhere. If g(x) is bounded and continuous almost everywhere, and g(x) = f (x) a.e.,
then g(x) is also a density for F (x). This follows because by the proof of Proposition III.1.22,
the value of the Riemann integral is determined by the continuity points of the integrand.
Proposition III.1.33 then states that such a distribution function F(x) is differentiable almost everywhere, and F′(x) = f(x) at each continuity point of f(x).
For bounded densities, it follows from Proposition III.2.56 that the Lebesgue integral of
f (x) over R is 1, and that F (x) is also definable as a Lebesgue integral. Thus every density
in the Riemann sense is a density in the Lebesgue sense.
Continuous probability theory further specializes the above discussion to continuous
densities f (x). Such densities are unique within the class of continuous functions, and now
Proposition III.1.33 assures that F′(x) = f(x) for all x. This result also obtains that the
only distribution functions that have continuous densities are the continuously differentiable
distribution functions.

Exercise 1.24 (Continuous densities are unique) Prove that if a distribution function F(x) has two continuous density functions f(x) and f̃(x) in the Riemann sense, then f(x) = f̃(x) for all x. Hint: If f(x_0) > f̃(x_0), then by continuity this inequality applies on (x_0 − ε, x_0 + ε) for some ε > 0. Calculate F(x_0 + ε) − F(x_0 − ε).

1.3.3 Riemann-Stieltjes Framework


While the above Lebesgue approach provides the framework that will support significant
generalizations in Book V, the conclusion of Proposition 1.22 is a bit disappointing. The
reader likely already knows that the saltus distribution functions of discrete probability
theory do have well-defined density functions, and may wonder if singular distributions
do too. Certainly, neither type of distribution function can have a density function in the
Lebesgue sense, but perhaps in some other sense. The Riemann-Stieltjes framework provides
an approach to connecting the density functions of continuous and discrete probability
theory.
Let X be a random variable on a probability space (S, E, λ) with distribution function F(x) ≡ λ(X⁻¹(−∞, x]). Then since F is increasing and bounded:

G(x) ≡ ∫_{−∞}^x dF,

exists as a Riemann-Stieltjes integral for all x by Proposition III.4.21. Further by item 4 of Proposition III.4.24:

G(b) − G(a) = ∫_a^b dF.

But an evaluation with Riemann-Stieltjes sums of Definition III.4.3 obtains for all [a, b]:

G(b) − G(a) = F(b) − F(a).

Letting a → −∞, then G(−∞) = F(−∞) = 0 obtains that G(b) = F(b) for all b, and thus:

F(x) ≡ ∫_{−∞}^x dF.    (1.32)

By Proposition 1.18:

F(x) = αF_SLT(x) + βF_AC(x) + γF_SN(x),

and each of these distribution functions can be represented as in (1.32). Thus by item 5 of Proposition III.4.24:

F(x) = α ∫_{−∞}^x dF_SLT + β ∫_{−∞}^x dF_AC + γ ∫_{−∞}^x dF_SN.    (1.33)
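Although (1.32) has integrand 1 and its Riemann-Stieltjes sums telescope exactly, the same machinery applies to a continuous integrand g, and the decomposition (1.33) then splits such an integral into saltus and absolutely continuous contributions. A minimal Python sketch follows; the mixed F below, an exponential part plus a jump at x = 1, and the integrand g(x) = x² are illustrative choices, not taken from the text:

import math

# Mixed F: weight 0.5 on an exponential AC part, plus a jump of 0.5 at x = 1.
def F(x):
    ac = 0.5 * (1.0 - math.exp(-x)) if x >= 0 else 0.0
    return ac + (0.5 if x >= 1 else 0.0)

def g(x):
    return x * x  # any continuous integrand

# Riemann-Stieltjes sum over [0, 10] with left-endpoint tags.
n, a, b = 200000, 0.0, 10.0
dx = (b - a) / n
rs_sum = sum(g(a + i * dx) * (F(a + (i + 1) * dx) - F(a + i * dx))
             for i in range(n))

# Exact value: 0.5 times the Riemann integral of x^2 e^{-x} on [0, 10],
# plus the jump contribution 0.5 * g(1).
exact = 0.5 * (2.0 - (10.0**2 + 2 * 10.0 + 2.0) * math.exp(-10.0)) + 0.5 * g(1.0)
print(rs_sum, exact)  # the two values agree closely

The jump at x = 1 contributes g(1) times the jump size, previewing the saltus density of Proposition 1.25.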

We now can obtain density functions for two of these distribution functions by Proposition III.4.28. Further, the density functions in (1.34) and (1.35) are the density functions of discrete probability theory and continuous probability theory, respectively.

Proposition 1.25 (Decomposition of F(x); R-S density functions) Let X be a random variable defined on a probability space (S, E, λ) with distribution function F(x) ≡ λ(X⁻¹(−∞, x]), and decomposition in (1.33). Then:


1. If F_SLT(x) is defined as in (1.25), and {x_n}_{n=1}^∞ have no accumulation points, then:

∫_{−∞}^x dF_SLT = Σ_{x_n ≤ x} u_n / α.

In this case, we define the density function f_SLT of F_SLT by:

f_SLT(x_n) = u_n / α,

and thus:

F_SLT(x) = Σ_{x_n ≤ x} f_SLT(x_n).    (1.34)

2. If F_AC(x) is continuously differentiable, then:

∫_{−∞}^x dF_AC = (R) ∫_{−∞}^x F′_AC(y) dy,

defined as a Riemann integral. In this case, we define the unique continuous density function f_AC of F_AC by:

f_AC(x) = F′_AC(x),

and thus:

F_AC(x) = (R) ∫_{−∞}^x f_AC(y) dy.    (1.35)

Proof. Proposition III.4.28.

Remark 1.26 (On Proposition 1.25) A few comments on the above result.

1. If F_SLT(x) is defined as in (1.25) and {x_n}_{n=1}^∞ has accumulation points, it is conventional to extend (1.34) to this case even though this does not follow directly from Proposition III.4.28. Instead, we can use right continuity of distribution functions. For any x_n, it follows by definition that:

F_SLT(x_n) = Σ_{x′_n ≤ x_n} f_SLT(x′_n).    (1)

Given an accumulation point x ≠ x_n for any n, consider {x_n > x}. If x_n → x, then by right continuity of F_SLT and (1):

F_SLT(x) = lim_{n→∞} F_SLT(x_n) = Σ_{x′_n ≤ x} f_SLT(x′_n).

2. If F_AC(x) is continuously differentiable then it is absolutely continuous by the mean value theorem (Remark III.3.56), and thus item 2 of Proposition 1.25 is a special case of the general Lebesgue model above. And by Proposition III.2.56, the Riemann and Lebesgue integrals of f_AC(y) = F′_AC(y) agree over every interval (−∞, x].

3. Finally, there is little hope of defining a density function for a singular distribution function F_SN(x). Since F′_SN(x) = 0 almost everywhere, the Lebesgue integral of this function is zero, and (1.35) cannot be valid. Similarly, a representation as in (1.34) would in general not be feasible, summing over all x_α ≤ x for which F′_SN(x_α) ≠ 0. For the Cantor function of Definition III.3.51, for example, such points are uncountable.
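To make item 3 tangible, the Cantor function can be evaluated directly from the base-3 expansion of its argument. A small Python sketch of this standard construction (the truncation depth 50 is an arbitrary choice; this is an illustration, not the text's algorithm):

def cantor_F(x, depth=50):
    # Evaluate the Cantor function on [0, 1] via the base-3 digits of x:
    # digit 0 contributes nothing, digit 2 contributes the current binary
    # weight, and landing in a middle third (digit 1) fixes the value.
    total, weight = 0.0, 0.5
    for _ in range(depth):
        if x < 1.0 / 3.0:
            x = 3.0 * x
        elif x > 2.0 / 3.0:
            total += weight
            x = 3.0 * x - 2.0
        else:
            return total + weight  # F is constant on this middle third
        weight /= 2.0
    return total

print(cantor_F(0.25))                            # 1/3, a known value
print(cantor_F(1.0 / 3.0), cantor_F(2.0 / 3.0))  # both 1/2

This function increases from 0 to 1 while its derivative is 0 off the Cantor set, which is why no density representation as in (1.34) or (1.35) is available.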

1.4 Examples of Distribution Functions on R


The above sections provide a framework for thinking about the distribution functions and density functions of a random variable. In this section, we present a variety of popular examples of such functions within this framework, adding to the discussion in Section II.1.3.

1.4.1 Discrete Distribution Functions


Discrete probability theory studies random variables for which β = γ = 0 in (1.24), and so α = Σ_{n=1}^∞ u_n = 1 and:

F(x) = F_SLT(x).    (1.36)

Since u_n = λ(X⁻¹(x_n)) > 0, it is conventional as noted in Proposition 1.25 to define the probability density function (p.d.f.) associated with a discrete random variable by:

f(x) = u_n for x = x_n, and f(x) = 0 otherwise.

The distribution function (d.f.) is then given by:

F(x) = Σ_{x_n ≤ x} f(x_n).    (1.37)

In virtually all cases in discrete probability theory, it is the probability density func-
tions that are explicitly defined in a given application. Frequently encountered examples of
discrete distribution functions are the discrete rectangular, binomial, geometric, negative
binomial, and Poisson distribution functions.
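As a minimal illustration of (1.37), here is a Python sketch using the binomial p.d.f. of Example 1.27 below (math.comb requires Python 3.8+; the parameter values are arbitrary). The d.f. is the running sum of the p.d.f., and the total mass is 1:

from math import comb

n, p = 10, 0.3

def f(j):
    # Binomial p.d.f. as in (1.40): C(n, j) p^j (1 - p)^(n - j).
    return comb(n, j) * p**j * (1 - p)**(n - j)

def F(x):
    # Distribution function (1.37): sum the p.d.f. over points x_n = j <= x.
    return sum(f(j) for j in range(n + 1) if j <= x)

print(F(3))             # Pr(X <= 3)
print(F(n))             # 1.0 up to rounding, since the u_n sum to 1
print(F(2.5) == F(2))   # True: F is a right continuous step function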

Example 1.27 1. Discrete Rectangular Distribution: The defining collection {x_j}_{j=1}^n for this distribution is finite and can otherwise be arbitrary. However, this collection is conventionally taken as {j/n}_{j=1}^n, and so {x_j}_{j=1}^n ⊂ [0, 1], and the discrete rectangular random variable is modeled by X_R : S → [0, 1].

For given n, the probability density function of the discrete rectangular distribution, also called the discrete uniform distribution, is defined on {j/n}_{j=1}^n by:

f_R(j/n) = 1/n, j = 1, 2, ..., n.    (1.38)

In effect, this random variable splits S = ∪_{j=1}^n S_j into n sets with S_j ≡ X⁻¹(j/n) and:

λ(S_j) = 1/n.

By rescaling, this distribution can be translated to any interval [a, b], defining:

Y_R = (b − a)X_R + a.

2. Binomial Distribution: For given p, 0 < p < 1, the standard binomial random variable is defined X_1^B : S → {0, 1}, where the associated p.d.f. is defined by:

f(1) = p,  f(0) = p′ ≡ 1 − p.

This is often expressed:

X_1^B = 1 with Pr = p, and X_1^B = 0 with Pr = p′,

or to emphasize the associated p.d.f.:

f_B1(j) = p for j = 1, and f_B1(j) = p′ for j = 0.    (1.39)

A simple application for this random variable is as a model for a single coin flip. So S = {H, T}, a probability measure is defined on S by λ(H) = p and λ(T) = p′, and the random variable defined by X_1^B(H) ≡ 1 and X_1^B(T) ≡ 0. This random variable is sometimes referred to as a Bernoulli trial, and the associated distribution function as the Bernoulli distribution, after Jakob Bernoulli (1654–1705).
This standard formulation is then translated to a shifted standard binomial random variable, Y_1^B = b + (a − b)X_1^B, which is defined:

Y_1^B = a with Pr = p, and Y_1^B = b with Pr = p′,

where the choice b = −a is common in discrete time asset price modeling.
3. General Binomial: The binomial model can also be extended to accommodate sample spaces of n-coin flips, producing the general binomial random variable with two parameters, p and n ∈ N. That is, S = {(F_1, F_2, ..., F_n) | F_j ∈ {H, T}}, and X_n^B is defined as the "head counting" random variable:

X_n^B(F_1, F_2, ..., F_n) = Σ_{j=1}^n X_1^B(F_j).

It is apparent that X_n^B assumes values 0, 1, 2, ..., n, and using a standard combinatorial analysis that the associated probabilities are given for j = 0, 1, ..., n by:

X_n^B = j with Pr = C(n, j) p^j (1 − p)^{n−j}.


To emphasize the associated p.d.f.:

f_Bn(j) = C(n, j) p^j (1 − p)^{n−j}, j = 0, 1, 2, ..., n.    (1.40)

Recall that C(n, j) denotes the binomial coefficient defined by:

C(n, j) = n! / [(n − j)! j!],    (1.41)

where 0! = 1 by convention. This expression is sometimes denoted nCj and read, "n choose j."
The name "binomial coefficient" follows from the expansion of a binomial a + b raised to the power n, producing the binomial theorem:

(a + b)^n = Σ_{j=0}^n C(n, j) a^j b^{n−j}.    (1.42)

4. Geometric Distribution: For given p, 0 < p < 1, the geometric distribution is defined on the nonnegative integers, and its p.d.f. is given by:

f_G(j) = p(1 − p)^j, j = 0, 1, 2, ...    (1.43)



and thus by summation:

F_G(j) = 1 − (1 − p)^{j+1}, j = 0, 1, 2, ...    (1.44)

This distribution is related to the binomial distribution in a natural way.


The underlying sample space can be envisioned as the collection of all coin-flip sequences
which terminate on the first H. So:

S = {H, TH, TTH, TTTH, ...},

and the random variable X_G defined as the number of flips before the first H. Consequently, f_G(j) above is the probability in S of the sequence of j-Ts and then 1-H. That is, the probability that the first H occurs after j-Ts. Of course, f_G(j) is indeed a p.d.f. in that Σ_{j=0}^∞ p(1 − p)^j = 1, as follows from (1.44), letting j → ∞.
The geometric distribution is sometimes parametrized as:

f_G′(j) = p(1 − p)^{j−1}, j = 1, 2, ...,

and then represents the probability of the first head in a coin flip sequence appearing
on flip j. These representations are conceptually equivalent, but mathematically distinct
due to the shift in domain.
One way of generalizing the geometric distribution is to allow the probability of a head to vary with the sequential number of the coin flip. This is the basic model in all financial calculations relating to payments contingent on death or survival, as well as to various other vitality-based outcomes. Specifically, if

Pr[H | kth flip] = p_k,

then with a simplifying change in notation to exclude the case j = 0, a generalized geometric distribution can be defined by:

f_GG(j) = p_j ∏_{k=1}^{j−1} (1 − p_k), j = 1, 2, 3, ...,    (1.45)

where f_GG(j) is the probability of the first head appearing on flip j. By convention, ∏_{k=1}^0 (1 − p_k) ≡ 1 when j = 1.
If p_k = p > 0 for all k, then f_G(j) is a p.d.f. as noted above. With nonconstant probabilities, this conclusion is also true with some restrictions to assure that the distribution function is bounded. For example, if 0 < a ≤ p_k ≤ b < 1 for all k, then the summation is finite, since f_GG(j) ≤ b(1 − a)^{j−1} and thus Σ_{j=1}^∞ f_GG(j) ≤ b/a by a geometric series summation. Additional restrictions are required to make this sum equal to 1. Details are left to the interested reader, or see Reitano (2010) pp. 314–319 for additional discussion.
5. Negative Binomial Distribution: The name of this distribution calls out yet another connection to the binomial distribution, and here we generalize the idea behind the geometric distribution. There, f_G(j) was defined as the probability of j-Ts before the first H. The negative binomial, f_NB(j), introduces another parameter k, and is defined as the probability of j-Ts before the kth-H. So when k = 1, the negative binomial is the same as the geometric.

The p.d.f. is then defined with parameters p with 0 < p < 1 and k ∈ N by:

f_NB(j) = C(j + k − 1, k − 1) p^k (1 − p)^j, j = 0, 1, 2, ...    (1.46)

This formula can be derived analogously to that for the geometric by considering the sample space of coin flip sequences, each terminated on the occurrence of the kth-H. The probability of a sequence with j-Ts and k-Hs is then p^k(1 − p)^j. Next, we must determine the number of such sequences in the sample space. Since every such sequence terminates with an H, there are only the first j + k − 1 positions that need to be addressed. Each such sequence is then determined by the placement of the first (k − 1)-Hs, and so the total count of these sequences is C(j + k − 1, k − 1). Multiplying the probability and the count, we have (1.46).
6. Poisson Distribution: The Poisson distribution is named for Siméon-Denis Poisson (1781–1840) who discovered this distribution and studied its properties. This distribution is characterized by a single parameter λ > 0, and is defined on the nonnegative integers by:

f_P(j) = e^{−λ} λ^j / j!, j = 0, 1, 2, ....    (1.47)

That Σ_{j=0}^∞ f_P(j) = 1 is an application of the Taylor series expansion for e^λ:

e^λ = Σ_{j=0}^∞ λ^j / j!.

One important application of the Poisson distribution is provided by the Poisson Limit theorem of Proposition II.1.11. See also Proposition 6.5. This result states that with λ = np fixed:

lim_{n→∞} C(n, j) p^j (1 − p)^{n−j} = e^{−λ} λ^j / j!.

Thus when the binomial parameter p is small and n is large, the binomial probabilities in (1.40) can be approximated:

C(n, j) p^j (1 − p)^{n−j} ≈ e^{−np} (np)^j / j!.    (1.48)

"p small and n large" is usually taken to mean n ≥ 100 and np ≤ 10. (A numerical check of this approximation follows this example.)
Another important property of the Poisson distribution is that it characterizes “arrivals”
during a given period of time under reasonable and frequently encountered assumptions.
For example, the model might be one of automobile arrivals at a stop light; or telephone
calls to a switchboard; or internet searches to a server; or radio-active particles to a
Geiger counter; or insurance claims of any type (injuries, deaths, automobile accidents,
etc.) from a large group of policyholders; or defaults from a large portfolio of loans and
bonds; etc. For this result, see pp. 300–301 in Reitano (2010).
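As noted under (1.48), here is a quick numerical check of the Poisson approximation in Python (a sketch only; the parameters n = 100, p = 0.05 are illustrative and satisfy the n ≥ 100, np ≤ 10 rule of thumb):

from math import comb, exp, factorial

n, p = 100, 0.05   # n >= 100 and np = 5 <= 10
lam = n * p

for j in [0, 2, 5, 10]:
    binom = comb(n, j) * p**j * (1 - p)**(n - j)
    poisson = exp(-lam) * lam**j / factorial(j)
    print(j, binom, poisson)  # the two columns agree to two or three decimals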

1.4.2 Continuous Distribution Functions


Continuous probability theory studies a subset of the absolutely continuous distribution functions, so α = γ = 0 and β = 1 in (1.24), and for which:

F(x) ≡ F_AC(x),

is also assumed to be continuously differentiable. Then f(x) ≡ F′(x) exists everywhere, is continuous, and by the fundamental theorem of calculus of Proposition III.1.32:

F(x) = (R) ∫_{−∞}^x f(y) dy.    (1.49)

A random variable X : S → R with such a distribution function is said to have a continuous density function, and the function f(x) is called the probability density function (p.d.f.) associated with the distribution function F. A continuous density function is unique among Riemann integrable functions that satisfy (1.49) for given F(x), since the fundamental theorem of calculus of Proposition III.1.33 obtains that for all x:

F′(x) = f(x).    (1.50)

There are many distributions with continuous density functions used in finance and in
other applications, but some of the most common are the uniform, exponential, gamma
(including chi-squared), beta, normal, lognormal, and Cauchy. In addition, other examples
are found with the extreme value distributions discussed in Chapter II.9 and Chapter
7, as well as Student’s T (also called Student T) and F distributions introduced in Example
2.28.
It should be noted that it is not required that a continuous density function f (x) be
continuous on R, but only on its domain of definition. Then by Proposition III.1.33, F′(x) = f(x) at every such continuity point.
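As a numerical illustration of (1.49) and (1.50) (a sketch only; the standard exponential of Example 1.28 below is used because F is available in closed form), a centered difference quotient of F recovers f at continuity points:

import math

def F(x):
    # Standard exponential d.f.: F(x) = 1 - e^{-x} for x >= 0, else 0.
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

def f(x):
    # Its continuous density on [0, infinity): f(x) = e^{-x}.
    return math.exp(-x) if x >= 0 else 0.0

h = 1e-6
for x in [0.5, 1.0, 3.0]:
    diff_quotient = (F(x + h) - F(x - h)) / (2 * h)
    print(x, diff_quotient, f(x))  # F'(x) matches f(x)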

Example 1.28 1. Continuous Uniform Distribution: Perhaps the simplest continuous probability density that can be imagined is one that is constant on its domain. The domain of this distribution is arbitrary, though of necessity bounded, and is conventionally denoted as the interval [a, b]. The p.d.f. of the continuous uniform distribution, sometimes called the continuous rectangular distribution, is defined on [a, b] by the density function:

f_U(x) = 1/(b − a), x ∈ [a, b],    (1.51)

and f_U(x) = 0 otherwise.

This distribution is called "uniform" because if a ≤ s < t ≤ b and X_U : S → R denotes the underlying random variable, then:

λ{X_U⁻¹[s, t]} = F_U(t) − F_U(s) = (t − s)/(b − a).

This probability is thus translation invariant within [a, b], meaning that [s, t] and [s + ε, t + ε] have the same measure when [s + ε, t + ε] ⊂ [a, b], and this justifies the uniform label. It is then also the case that the range of X_U under these induced probabilities can be identified with the probability space ((a, b), B(a, b), m), where m is Lebesgue measure and B(a, b) is the Borel sigma algebra, noting that nothing is lost in omitting the endpoints, which have probability 0.
Defined on [0, 1], this distribution is more important than it may first appear due to
the results in Chapter II.4. To summarize these results below, we first recall Definition
II.3.12 on the left-continuous inverse of a distribution function, or any increasing func-
tion. This inverse function was proved to be increasing and left continuous in Proposition
II.3.16.

Definition 1.29 (Left-continuous inverse) Let F : R → R be a distribution or other increasing function. The left-continuous inverse of F, denoted by F*, is defined:

F*(y) = inf{x | y ≤ F(x)}.    (1.52)

By convention, if {x | F(x) ≥ y} = ∅ then we define F*(y) = ∞, while if {x | F(x) ≥ y} = R, then F*(y) = −∞ by definition.

For the following Chapter II.4 results, let (S, E, λ) be given and X : (S, E, λ) → (R, B(R), m) a random variable with distribution function F(x) and left-continuous inverse F*(y). Let (S′, E′, λ′) be given and X_U : (S′, E′, λ′) → ((0, 1), B((0, 1)), m) a random variable with continuous uniform distribution function. Then:
Proposition II.4.5: If F(x) is continuous and strictly increasing, so F* = F⁻¹ by Proposition II.3.22, then:

(a) F(X) : (S, E, λ) → ((0, 1), B((0, 1)), m), defined by F(X)(s) = F(X(s)), is a random variable on S which has a continuous uniform distribution on (0, 1), meaning F_{F(X)}(x) = x.

(b) F⁻¹(X_U) : (S′, E′, λ′) → (R, B(R), m), defined by F⁻¹(X_U)(s′) = F⁻¹(X_U(s′)), is a random variable on S′ with distribution function F(x).

Proposition II.4.8: For general increasing F(x):

(a) F(X) : (S, E, λ) → ((0, 1), B((0, 1)), m) defined as above has distribution function F_{F(X)}(x) ≤ x, and this function is continuous uniform with F_{F(X)}(x) = x if and only if F(x) is continuous;

(b) F*(X_U) : (S′, E′, λ′) → (R, B(R), m) defined as above is a random variable on S′ with distribution function F(x).

Proposition II.4.9:

(a) If {U_j}_{j=1}^n are independent, continuous uniformly distributed random variables, then {X_j}_{j=1}^n ≡ {F*(U_j)}_{j=1}^n are independent random variables with distribution function F(x).

(b) If F(x) is continuous and {X_j}_{j=1}^n are independent random variables with distribution function F(x), then {U_j}_{j=1}^n ≡ {F(X_j)}_{j=1}^n are independent, continuous uniformly distributed random variables.

The significance of these Book II results is that one can convert a uniformly distributed random sample {U_j}_{j=1}^n into a sample of the {X_j}_{j=1}^n variables by defining:

X_j = F*(U_j),

with F*(U_j) defined as in (1.52). And note that despite being defined as an infimum, the value of F*(U_j) is truly in the domain of the random variable X because F(x) is right continuous. (A small simulation in this spirit follows this example.)

We will return to an application of these results in Chapter 5.
2. Exponential Distribution: The exponential density function is defined with parameter λ > 0:

f_E(x) = λe^{−λx}, x ≥ 0,    (1.53)

and f_E(x) = 0 for x < 0. The associated distribution function is then:

F_E(x) = 1 − exp(−λx), x ≥ 0,    (1.54)

and F_E(x) = 0 for x < 0. When λ = 1, this is often called the standard exponential distribution.

The exponential distribution is important in the generation of ordered random samples. See Chapter 3.

3. Gamma Distribution: The gamma distribution is a two parameter generalization of the exponential, with α > 0 and λ > 0, and density function f_Γ(x) defined by:

f_Γ(x) = (1/Γ(α)) λ^α x^{α−1} e^{−λx}, x ≥ 0,    (1.55)

and f_Γ(x) = 0 for x < 0. When 0 < α < 1, f_Γ(x) is unbounded at x = 0 but is integrable. The gamma function, Γ(α), is defined by:

Γ(α) = ∫_0^∞ x^{α−1} e^{−x} dx,    (1.56)

and thus ∫_0^∞ f_Γ(x) dx = 1 with a change of variables. The gamma function is continuous in α > 0, meaning that Γ(α) → Γ(α_0) if α → α_0. Since this integral equals the Lebesgue counterpart by Proposition III.2.56, this can be proved using Lebesgue's dominated convergence theorem of Proposition III.2.52. Details are left as an exercise.
An integration by parts shows that for α > 0, the gamma function satisfies:

Γ(α + 1) = αΓ(α), (1.57)

and since Γ(1) = 1, this function generalizes the factorial function in that for any integer
n:
Γ(n) = (n − 1)! (1.58)

This identity also provides the logical foundation for defining 0! = 1, as noted in the
discussion of the general binomial distribution.
Another interesting identity for the gamma function is:

Γ(1/2) = √π.    (1.59)

From (1.56) and the substitution x = y²/2:

Γ(1/2) = ∫_0^∞ x^{−1/2} e^{−x} dx = √2 ∫_0^∞ e^{−y²/2} dy.

This integral equals √(2π)/2 by symmetry and the discussion on the normal distribution in item 5.
When the parameter α = k is a positive integer, the probability density function becomes by (1.58):

f_Γ(x) = (1/(k − 1)!) λ^k x^{k−1} e^{−λx}, x ≥ 0.

The associated distribution function is:

F_Γ(x) = Σ_{j=k}^∞ e^{−λx} (λx)^j / j!,    (1.60)

and it can be verified by differentiation that F′_Γ(x) = f_Γ(x).

Remark 1.30 (Chi-squared) When a random variable Y has a gamma distribution with λ = 1/2 and α = n/2, it is said to have a chi-squared distribution with n degrees of freedom, which is sometimes denoted χ²_{n d.f.}. See Examples 2.4 and 2.28.

4. Beta Distribution: The beta distribution contains two shape parameters, v > 0 and w > 0, and is defined on the interval [0, 1] by the density function:

f_β(x) = (1/B(v, w)) x^{v−1} (1 − x)^{w−1}.    (1.61)

The beta function B(v, w) is defined by a definite integral which in general requires numerical evaluation:

B(v, w) = ∫_0^1 y^{v−1} (1 − y)^{w−1} dy.    (1.62)

By definition, therefore, ∫_0^1 f_β(x) dx = 1.
If v or w or both parameters are less than 1, the beta density and integrand of B(v, w) are unbounded at x = 0 or x = 1 or both. However, the integral converges in the limit as an improper integral:

B(v, w) = lim_{a→0+, b→1−} ∫_a^b y^{v−1} (1 − y)^{w−1} dy,

because the exponents of the y and 1 − y terms exceed −1.


If both parameters are greater than 1, this density function is 0 at the interval endpoints,
and has a unique maximum at x = (v − 1) / (v + w − 2) . When both parameters equal
1, this distribution reduces to the continuous uniform distribution on [0, 1].
The beta function is closely related to the gamma function above. Substituting z = y/(1 − y) into the integral in (1.62):

B(v, w) = ∫_0^∞ z^{v−1} / (1 + z)^{v+w} dz.    (1)

Letting α = v + w in (1.56) obtains by substitution x = (1 + z)y:

1/(1 + z)^{v+w} = (1/Γ(v + w)) ∫_0^∞ y^{v+w−1} e^{−(1+z)y} dy.

Multiplying by z^{v−1}, integrating, and applying (1) obtains:

B(v, w) = (1/Γ(v + w)) ∫_0^∞ y^{v+w−1} e^{−y} (∫_0^∞ e^{−zy} z^{v−1} dz) dy.

A final substitution x = zy in the inner integral, and two applications of (1.56), produces the identity:

B(v, w) = Γ(v)Γ(w) / Γ(v + w).    (1.63)
Thus for integer n and m, it follows from (1.58) that:

B(n, m) = (n − 1)!(m − 1)! / (n + m − 1)!.    (1.64)

5. Normal Distribution: The normal distribution is defined on (−∞, ∞), depends on a location parameter µ ∈ R and a scale parameter σ > 0, and is given by the probability density function:

f_N(x) = (1/(σ√(2π))) exp(−(x − µ)²/2σ²),    (1.65)

where exp A ≡ e^A to simplify notation. In many books, the parametrization is defined in terms of µ and σ², where σ is taken as the positive square root in (1.65). We use these specifications interchangeably.
When µ = 0 and σ = 1, this is known as the standard normal distribution or unit normal distribution and denoted φ(x):

φ(x) = (1/√(2π)) exp(−x²/2).    (1.66)

A change of variables in the associated integral shows that if a random variable X is
normally distributed with parameters µ and σ, then (X − µ) /σ has a standard normal
distribution. Conversely, if X is standard normal, then σX + µ is normally distributed
with parameters µ and σ².
Perhaps the greatest significance of the normal distribution is that it can be used as
an approximating distribution to the distribution of sums and averages of a sample of
“scaled” random variables under relatively mild assumptions.

(a) When applied to approximate the binomial distribution, this result is called the De
Moivre-Laplace theorem, named for Abraham de Moivre (1667–1754), who
demonstrated the special case of p = 1/2, and Pierre-Simon Laplace (1749–
1827), who years later generalized to all p, 0 < p < 1. See Proposition 6.11.
(b) In the most general cases, this result is known as the central limit theorem. See
Proposition 6.13 for one example, and Book VI for others.
Remark 1.31 (∫_{−∞}^∞ f_N(x)dx = 1) By change of variable, ∫_{−∞}^∞ f_N(x)dx = 1 if and only if ∫_{−∞}^∞ φ(x)dx = 1. While there are no elementary proofs of the latter identity, there is a clever derivation that involves embedding the integral ∫_{−∞}^∞ φ(x)dx into a 2-dimensional Riemann integral, and the ability to move back and forth between 2-dimensional and iterated Riemann integrals. This is largely justified in Corollary III.1.77, but the earlier result requires a generalization to improper integrals. The derivation below also requires a change of variables to polar coordinates which allows direct calculation, a step that is perhaps familiar to the reader but one that will not be formally justified until Book V. In detail:

∫_{−∞}^∞ φ(x)dx ∫_{−∞}^∞ φ(y)dy = (1/2π) ∫_{−∞}^∞ ∫_{−∞}^∞ exp(−(x² + y²)/2) dx dy,

which is an iterated integral that equals a 2-dimensional Riemann integral over R².

The change of variables x = r cos θ and y = r sin θ for r ∈ [0, ∞) and θ ∈ [0, 2π) then obtains dx dy = r dθ dr, and a 2-dimensional Riemann integral over the rectangle R = [0, 2π) × [0, ∞). Thus:

∫_{−∞}^∞ φ(x)dx ∫_{−∞}^∞ φ(y)dy = (1/2π) ∫∫_R r e^{−r²/2} dθ dr.

Returning to iterated integrals obtains:

∫_{−∞}^∞ φ(x)dx ∫_{−∞}^∞ φ(y)dy = (1/2π) ∫_0^{2π} ∫_0^∞ r e^{−r²/2} dr dθ = 1.

6. Lognormal Distribution: The lognormal distribution is defined on [0, ∞), depends on a location parameter µ ∈ R and a shape parameter σ > 0, and unsurprisingly is intimately related to the normal distribution above. However, to some, the name "lognormal" appears to be opposite to the relationship that exists.

Stated one way, a random variable X is lognormal with parameters (µ, σ²) if X = e^Z where Z is normal with the same parameters. So X can be understood as an exponentiated normal. Stated another way, a random variable X is lognormal with parameters (µ, σ²) if ln X is normal with the same parameters. Thus the name comes from the second interpretation, that the log of a lognormal variate is a normal variate.

The probability density function of the lognormal is defined as follows, again using exp A ≡ e^A to simplify notation:

f_L(x) = (1/(σx√(2π))) exp(−(ln x − µ)²/2σ²).    (1.67)

By change of variable, ∫_0^∞ f_L(x)dx = 1 if and only if ∫_{−∞}^∞ f_N(x)dx = 1.
7. Cauchy Distribution: The Cauchy distribution is named for Augustin Louis Cauchy (1789–1857). It is of interest as an example of a distribution function that has no finite moments, as will be seen in item 4 of Section 4.2.7, and has another surprising property to be illustrated in Example 2.18.

The associated density function is defined on R as a function of a location parameter x_0 ∈ R and a scale parameter γ > 0, by:

f_C(x) = (1/πγ) · 1/(1 + ([x − x_0]/γ)²).    (1.68)

The standard Cauchy distribution is parameterized with x_0 = 0 and γ = 1:

f_C(x) = (1/π) · 1/(1 + x²).    (1.69)

A substitution into the integral of tan z = (x − x_0)/γ shows that this probability function integrates to 1, as expected.
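As flagged in item 1, Propositions II.4.8 and II.4.9 underlie a standard simulation technique: map uniform variates through F*. A minimal Python sketch follows (the rate parameter is an arbitrary choice; for the exponential of item 2, F is continuous and strictly increasing on [0, ∞), so F*(u) = F⁻¹(u) = −ln(1 − u)/λ):

import math
import random

lam = 2.0  # exponential rate parameter (illustrative)

def F_star(u):
    # Left-continuous inverse of F_E(x) = 1 - exp(-lam * x) on (0, 1):
    # solving u = F_E(x) gives x = -ln(1 - u) / lam.
    return -math.log(1.0 - u) / lam

random.seed(1)
sample = [F_star(random.random()) for _ in range(100000)]

# Compare the empirical d.f. of the sample with F_E at a few points.
for x in [0.1, 0.5, 1.0]:
    empirical = sum(1 for s in sample if s <= x) / len(sample)
    exact = 1.0 - math.exp(-lam * x)
    print(x, empirical, exact)  # close agreement, as the propositions predict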

1.4.3 Mixed Distribution Functions


Mixed distribution functions, for which γ = 0 in (1.24), are commonly encountered in finance and defined:

F(x) ≡ αF_SLT(x) + (1 − α)F_AC(x),    (1.70)

where again it is assumed that F′_AC(x) = f(x) is continuous. In other words, such a distribution function represents a random variable with both continuously distributed and discrete parts.
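A small Python sketch of (1.70) follows (hypothetical parameters: a point mass of weight α at 0 for the saltus part and a standard exponential absolutely continuous part), showing the jump at 0 and continuity elsewhere:

import math

alpha = 0.6  # weight of the saltus part: a single point mass at x = 0

def F(x):
    # Mixed d.f. as in (1.70): F_SLT is a unit step at 0, F_AC exponential.
    if x < 0:
        return 0.0
    F_SLT = 1.0
    F_AC = 1.0 - math.exp(-x)
    return alpha * F_SLT + (1 - alpha) * F_AC

print(F(-1e-9), F(0.0))         # jump of size alpha at x = 0
print(F(1.0) - F(1.0 - 1e-9))   # ~0: F is continuous for x > 0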

Example 1.32 (Example II.1.7) Example II.1.7 examined default experience on a loan
portfolio, and of particular interest was the modeling of default losses. For a portfolio of n
loans, if X denotes the number of defaults, then X : S → {0, 1, 2, ..., n} is discrete, while if
Y : S → R denotes the dollars of loss, then the model for Y will typically be “mixed.”

This follows because with fixed default probability p say, the probability of no defaults assuming independence is given by:

Pr{Y = 0} = Pr{X = 0} = (1 − p)^n.

This probability will typically be quite large compared to Pr{0 < Y ≤ ε}, which is the probability of one or several defaults but very low loss amounts.
For example, assume that each bond's loss given default is uniformly distributed on [0, L_j], where L_j denotes the loan amount on the jth bond. Then F(y) will have a discrete part at y = 0, and then a continuous or mixed distribution for y > 0. In the former case, which we confirm below:

F(y) ≡ (1 − p)^n χ_{[0,∞)}(y) + (1 − (1 − p)^n) F_AC(y),

where χ_{[0,∞)}(y) = 1 for y ∈ [0, ∞) and is 0 otherwise, and F_AC(y) is absolutely continuous with F_AC(0) = 0.
As F(y) is continuous from the right as a distribution, and certainly not left continuous at y = 0, to investigate left continuity for y > 0, recall the law of total probability in Proposition II.1.35. If Y denotes total losses and 0 < ε < y, then noting that Pr{y − ε < Y ≤ y | 0 defaults} = 0:

Pr{y − ε < Y ≤ y} = Σ_{k=1}^n Pr{y − ε < Y ≤ y | k def.} Pr{k def.}.    (1.71)

Since:

Pr{k defaults} = C(n, k) p^k (1 − p)^{n−k},

this can be stated in terms of F(y):

F(y) − F(y − ε) = Σ_{k=1}^n C(n, k) p^k (1 − p)^{n−k} [F_k(y) − F_k(y − ε)],    (1)

where F_k(y) is the conditional distribution function of Y given k losses.


Put another way, Fk (y) is the distribution function of the sum of k random variables,
defined as the losses on the k defaulted bonds. If loss given default is uniformly distributed
on [0, Lj ], where Lj denotes the loan amount on the jth bond, then this distribution function
again reflects which subset of k bonds actually defaulted, assuming {Lj }nj=1 are not constant.
Again by the law of total probability, assuming all Nk ≡ nk collections of bonds are equally


likely to default:
1 XNk h (i) (i)
i
Fk (y) − Fk (y − ) = Fk (y) − Fk (y − ) , (2)
Nk i=1

where now F_k^{(i)}(y) is the distribution function of the sum of k losses from the ith collection of k bonds.

In other words, F_k^{(i)}(y) is the distribution function of the sum of k continuously distributed variables, here assumed to be uniformly distributed. We claim each such F_k^{(i)}(y) is continuous for y > 0, and thus by (1), (2) and (1.71), so too is F(y), and thus F(y) is a mixed distribution as claimed.
By (2.13), if X and Y are independent random variables on a probability space (S, E, λ)
with continuously differentiable distribution functions FX and FY with continuous density
Another random document with
no related content on Scribd:
was a veritable citadel in course of construction, with armoured
trenches, sandbag emplacements for big guns, barbed-wire
entanglements; in fact, everything that modern military science can
contrive to insure impregnability.
The whole place was teeming with activity, and looked like a
gigantic ant-heap; on all sides soldiers were to be seen at work, and
it was evident that those in charge of this important position were
determined to leave nothing to luck. The little that nature had left
unprotected was being made good by the untiring efforts and genius
of the Italians, and the Austrian chances of ever capturing the place
are practically nil. Its curious configuration largely contributes to its
impregnability and power of resistance if ever besieged.
Behind its line of armoured trenches is a deep hollow, which could
shelter an army corps if necessary; and here, under complete cover,
are well-built, barrack-like buildings, in which the troops can be
comfortably quartered during the long winter months when the fort is
buried under yards of snow and practically isolated from the outer
world.
As he whirled past in the big car (see page 50)
To face page 88

The position on the Forcola is probably unique in the world, as it is


situated exactly at a point where three frontiers meet: the Italian,
Austrian and Swiss. From its sandbag ramparts on the front facing
the Austrians one has the most sublime vista of mountain scenery it
would be possible to conceive. It is impossible in mere words to
attempt to convey anything but the faintest impression of it, yet it
would be a sin of omission not to endeavour to.
As I gazed in front of me, the marvellous beauty of the scene held
me in rapt suspense, and for a few moments the war passed from
my mind.
The Austrian Tyrol was before me, a panorama of wondrous
mountain peaks stretching away into the mist of the far distance, and
towering above the highest was the mighty Ortler, crowned with
eternal snow, and positively awe-inspiring in its stately grandeur.
My reverie was abruptly disturbed by the boom of a big gun. I was
back again amongst realities, yet how puny did the biggest efforts of
mankind at war appear in comparison with all this splendour of
nature. Had it not been for the echoes produced by these giant
peaks the report of even the heavy artillery would probably have
scarcely been heard.
The Swiss and Austrian frontiers meet on the summit of a
Brobdingnagian cliff of rock of strange formation, which towers
above the Forcola. Through field glasses the frontier guards and the
blockhouses are plainly discernable.
This overhanging proximity of the enemy strikes me as constituting
a constant menace to the Italian position, as every movement within
its enceinte is visible from the height above. The fact also of the
Austrian and Swiss frontier guards being so close to each other as to
be able to fraternise must inevitably conduce to espionage.
Doubtless, however, all this has been well considered by the Italians,
and they are not likely to be caught napping.
Rateau and I had a very cordial reception, and the officer in
command of the position took a visible pride in showing us round it
and explaining everything, whilst I made a lot of interesting sketches.
The ability and rapidity with which it had been constructed and
fortified were worthy of the very highest praise. Such a fortress
brought into being in so short a time and at such an altitude was in
itself such a marvel of military capacity that one was lost in wonder
at it all. Evidently no obstacle presented by nature deterred its
accomplishment.
In the emplacements were guns of a calibre one certainly never
expected to see except in the valley, and you were lost in conjecture
how the feat of getting them up here was achieved.
Everything was approaching completion, and by the time the snow
set in and the position would be practically cut off from the lower
world it would be a big, self-contained arsenal with all that was
necessary for carrying on its share of the general scheme of
operations on the Frontier without extraneous assistance should the
rigours of the Alpine winter render it unapproachable.
In the warfare in the mountains, positions develop more or less
into isolated communities, as the men seldom have an opportunity of
going down to the busy world below, besides which the summer is so
brief at these altitudes. Even on the date we were at the Forcola, the
twentieth of August, there were already unmistakable signs of the
approach of winter; the air was decidedly frosty—there had been a
fall of snow a few hours previously, and most of the peaks were
powdered with silvery white.
We gladly accepted the invitation to have some lunch in the mess-
room, for the keen air had given both of us healthy appetites, and
while we were doing justice to a well-cooked steak with fried
potatoes and a flask of very excellent Valtellino. I had a chat with
some of the younger officers, and learned to my surprise that they
had not stirred from the place since they had come up nearly three
months before, and they had no hope of getting leave for a long time
to come as things were developing and the winter coming along. A
visit from two civilians like ourselves, who could give them some
news of the outside world, was therefore a veritable red-letter day for
them.
Yet, in spite of the monotony of their existence, these cheery
fellows did not complain. There was always plenty to occupy their
minds, they said, and prevent them from brooding over old times. To
defend the Forcola at all hazards was now their sole pre-occupation;
and after all, they added, with an attempt at mirth, they might easily
have been stationed in a still more isolated spot and with fewer
companions. Such admirable equanimity was only what I expected
to find now that I knew the Italian soldier, so it did not surprise me at
all.
After lunch, as the Austrian batteries seemed to be getting busier,
we strolled up to the “observation post,” a sort of tunnel in which was
a telephone installation and an instrument known as a “goniometre,”
a powerful telescope in miniature, combined with a novel kind of
range-finder, through which the slightest movement of the enemy
can be instantly detected and telephoned to the officers in command
of the different gun emplacements. The machine is always in
readiness, as it is so fixed that once it is focussed it requires no re-
adjustment.
There was a small, irregular hole at the end of the tunnel that
faced the enemy’s line, and through this the goniometre pointed.
Two soldiers were on duty, one to keep his eye on the opposite
mountain, the other to manipulate the telephone. It evidently required
some practice to use the telescope, as I had a good look through it
and could scarcely make out anything.
When we visited the post the Italian batteries were not yet replying
to what I was informed were only the usual daily greetings of the
Austrians, but their response would doubtless be sent in due course,
judging from the conversation of the operator with the telephone.
Everything we saw was so absorbingly interesting that we should
have liked to remain on the Forcola many hours more, but time was
getting on by now, and we had to think of getting back.
As may be imagined, after my experience coming up I was
particularly dreading this moment to arrive. I thought it best,
however, to say nothing and trust to luck in getting down without
another attack of vertigo. When we said goodbye to our genial hosts,
several soldiers were about to descend also, so we were to have
company.
“Three minutes interval between each man and go as fast as
possible,” called out an officer, and off went everyone at a given
signal. Rateau was just before me and, as it turned out, I was last.
I felt like the prisoners in the Conciergerie during the reign of terror
must have felt as they waited their turn to go out to the fatal tumbril.
Through the opening in the sandbags only a bit of the narrow
pathway was visible, as it turned sharply to the right and went down
the face of the cliff beyond. It was like looking out on limitless space.
“Well, goodbye and a pleasant journey,” said the officer to me
when my turn came.
Out I went, putting my hand over my left eye to avoid looking into
the void, and I managed to run like this all the way down.
It was getting late when we got back to Bormio, so we decided to
remain another night, and were glad we did, as an amusing incident
occurred during the evening.
Whilst we were finishing dinner at the hotel, we received a note
inviting us afterwards to smoke a cigar and take a glass of wine with
the sergeants of the regiment quartered in the town. Of course we
accepted and duly turned up.
The reception—for such it was—took place in a large private room
of the hotel we were staying in, and we were greeted with the utmost
cordiality.
There was a big display of a certain very famous brand of
champagne on a side table, and the corks soon began to fly merrily;
toasts were given as usual, and everything was pleasant. During a
pause a sergeant next to me, who spoke French fluently, asked me
how I liked the wine.
In jocular vein I replied that it was excellent, but it was a pity it is
German, as it is well known that the owner of the vineyards near
Rheims is at present interned in France. To my surprise he took my
words seriously; there was an icy moment as he communicated my
remarks to his comrades, and then, as though with one accord, there
was a crash of broken glass, and we had to finish up the evening for
patriotic reasons on Asti Spumante.
CHAPTER IX

From Brescia to Verona—Absence of military movement in rural


districts—Verona—No time for sightseeing—The axis of the Trentino
—Roveretto, the focus of operations—Fort Pozzachio—A “dummy
fortress”—Wasted labour—Interesting incident—Excursion to Ala—
Lunch to the correspondents—Ingenious ferry-boat on River Adige—
The Valley of the Adige—Wonderful panorama—“No sketching
allowed”—Curious finish of incident—Austrian positions—Desperate
fighting—From Verona to Vicenza—The positions of Fiera di
Primiero—Capture of Monte Marmolada—The Dolomites—Their
weird fascination—A striking incident—The attempted suicide—The
Col di Lana—Up the mountains on mules—Sturdy Alpini—Method of
getting guns and supplies to these great heights—The observation
post and telephone cabin on summit—The Colonel of Artillery—What
it would have cost to capture the Col di Lana then—The Colonel has
an idea—The idea put into execution—The development of the idea
—Effect on the Col di Lana—An object lesson—The Colonel gets
into hot water—The return down the mountains—Caprili—Under fire
—We make for shelter—The village muck-heap—Unpleasant
position—A fine example of coolness—The wounded mule—An
impromptu dressing.
CHAPTER IX
It has been said that “who holds the Trentino holds not merely the
line of the Alps and the Passes, but the mouths of the Passes and
the villages which debouch into the Lombard Plain.”
The significance of this statement was being continually brought
home to me here on this northern frontier of Italy, and you could not
shut your eyes to the fact that the very safety of the whole country
depended on the army making good its “tiger spring” in the first
hours of the war.
It was not so much the necessity for an aggressive movement, but
for what one might term a successful defensive—offensive, and it
cannot be gainsaid—and even the Austrians themselves would
admit it, that in this respect the Italians scored everywhere along the
line.
General Cadorna’s remarkable power of intuition was evidenced
by every movement of the army from the outset, but nowhere more
noticeably than in the Trentino sector at this early stage of the war,
when the slightest miscalculation on his part would most assuredly
have spelt irretrievable disaster for Italy.
We were to have abundant proof of what his organizing genius,
combined with the patriotic ardour of the troops, had been able to
accomplish in the short space of three months.
After eight days spent in and around Brescia we motored to
Verona, the next stage as arranged on our programme. Our road
was across country, and therefore some considerable distance from
the Front, so beyond being a delightful trip through glorious scenery,
there was nothing very special about it; touring motorists having
done it hundreds of times.
There was a noteworthy absence of any signs of military
movements in the rural districts, and the peasants were apparently
going about their usual peaceful avocations as unconcernedly as
though the war were in another country instead of being a
comparatively short distance away. In this respect, however, one felt
that the motor journeys we were scheduled to make from centre to
centre would prove exceedingly interesting, as they would afford us
an insight of the conditions prevailing in the rear of the Front, not an
unimportant factor in forming one’s impressions.
In Verona, had one been holiday making, many hours might have
been profitably spent in “doing” the place. As it was, my time was
fully occupied from the hour we arrived till the moment we left, and it
was, I am grieved to have to confess it, only by accident that I was
able to snatch the time to see anything of the artistic treasures of the
famous old city.
As a matter of fact, you scarcely had a moment to yourself if you
wanted to get any work done, as we only remained three days in
Verona.
The reason for thus curtailing our stay did not transpire. In this
sector of the Front the most important operations in the Trentino
were taking place, and the Austrians were straining every nerve in
order to stay the victorious progress of the Italians, but the lightning
rapidity of their advance had proved irresistible, and had forced a
retirement to their second line. To dislodge them from this was the
tack the Italians had before them when we were in Verona.
The axis of the Trentino is obviously Trent, and in due course of
time it will doubtless fall into the hands of the Italians, but the date of
that event is on the knees of the gods.
Meanwhile the focus of the operations in August 1915, was the
fortified position of Roveretto, which has been described as the
“strategic heart” of the Trentino, and which guards the Austrian
portion of the valley of the Adige. Enclosed within several rings of
entrenchments and an outer chain of modern forts of the most
formidable character, it presented a redoubtable barrier to the
advance of the Italians into the Trentino in this direction.
But the lightning-like strategy of Cadorna upset all the plans of the
Austrian generals and, formidable though these defences were, they
were gradually being mastered.
Fort Pozzachio, which might have proved the most serious
obstacle of all, and have involved a long siege before it was
captured, turned out to be little more than a dummy fortress in so far
as defensive possibilities were concerned, and had to be abandoned
at the commencement of the war. This step being decided on by the
Austrians in consequence, as they stated in their communiqué, of its
not being in readiness to offer any prolonged resistance if besieged.
It transpired later that, although years had been spent working on
it and vast sums of money expended, it was so far from being
completed when war was declared that its heavy armament had not
yet arrived. It had been intended to make of it a stronghold which
would be practically impregnable.
Even now, it is a veritable modern Ehrenbreitstein, but with this
difference: it is not built on a rock but excavated out of the summit of
a lofty craig, which is quite inaccessible from the Italian side.
Although only about four miles from Roveretto, its surrender did not
help the Italians over much, in so far as the operations in that zone
were immediately concerned, but its loss must have been a severe
blow to Austrian pride.
It was said that it had been the intention of the Austrians to blow it
up rather than let the Italians reap the advantage of all the labour
that they had wasted on it, and in this connection there was a story
going round at the time that seemed circumstantial enough to be
worth recounting.
On the day of the occupation of the fortress an engineer officer,
strolling about examining the construction of the place, happened to
catch his foot in what he took to be a loose telephone wire, which
had apparently been accidentally pulled in from outside.
In disengaging it his attention was attracted by a peculiar object
attached to the wire, when to his surprise he found that it was an
electrical contrivance connected with a live fuse leading to a positive
mine of dynamite in one of the lower galleries.
A small splinter of rock had somehow got mixed up with the
detonator, and thus, as though by a miracle, the fortress and the
Italian troops in it had been saved from destruction. Almost needless
to add, the wire led from the nearest Austrian position.
We only made one excursion from Verona, but it was of extreme
interest in view of what was taking place at the time in this sector of
the Front. It was to Ala, a small Austrian town in the valley of the
Adige, which had been captured a few weeks previously.
There had been some sharp fighting in the streets, as many of the
houses bore witness to, but its chief interest to us lay in the fact that
it was in redeemed territory, and actually within the portals of
Austrian Trentino. Like, however, most of the liberated towns I had
visited, Ala was more Italian than Austrian.
The whole region was positively alive with warlike energy
(see page 75)
To face page 100

The Mayor offered a lunch to the correspondents, and the usual


patriotic toasts followed. Afterwards we motored to the nearest
position, which was only a short distance from the town. Crossing
the river Adige on our way by an ingenious ferry-boat, constructed by
the engineers, the Austrians having destroyed the only bridge in the
vicinity. The “ferry” consisted of two big barges clamped together,
then boarded over and steered by an immense paddle projecting
from the after part. It was worked on the fixed-rope and sliding-pulley
principle, the swift current supplying the motive power.
We had all been looking forward to getting a good conception of
the operations, which just then were of vital importance, but we were
to be disappointed; we were only to be permitted a long range view.
On reaching a small hamlet on the bank of the river a few hundred
yards further on we were informed we could not proceed beyond this
point. We had, therefore, to be content with what we could see from
the roadway, which overlooked the river.
The coup d’oeil was magnificent, though not so impressive as
what we had seen previously. Before us stretched the broad valley of
the Adige; its swiftly running stream divided up here by numerous
gravel islets. On the opposite bank was the railway line to Trent and
the town of Seravalle; whilst facing it on our side was Chizzola.
Away in the distance bathed in the effulgence of the glorious
summer afternoon, were the Stivo and other high hills, on which are
the forts guarding Roveretto, hidden from our view by a bend in the
river.
Now and again one saw a tiny piff of white smoke, and heard the
muffled boom of artillery, but this was the only indication that any
operations were in progress. It was all so vast and so swamped as it
were by the immensity of the landscape that it was difficult to grasp
what was taking place.
A few hundred yards further down the road where we were
standing was the picturesque village of Pilcante, almost hidden in
luxuriant foliage.
In the immediate foreground and standing out in discordant detail
was a barbed-wire entanglement barricading the road, and guarded
by a detachment of infantry; whilst immediately below the parapet
which skirted the pathway, a cottage and a small garden on a spot of
ground jutting into the river had been transformed into a sort of
miniature “position” with an armoured trench, disguised by small
trees stuck in the walls.
It occurred to me that all this would make an interesting sketch; as
a matter of fact, it was the only subject that had appealed to me that
day, so I got out my book, and had just finished it when I felt a touch
on my shoulder and someone whispered in my ear:
“Be careful, sketching is not allowed here.” I looked round—it was
one of the officers accompanying us.
“Not allowed?” I queried; “surely there must be some mistake.”
“Not at all,” he replied; “special instructions were issued by the
General that no sketches or photographs were to be made here.”
As this was the first time I had heard of any restrictions since we
had started on the tour, my surprise may be imagined, and the more
especially as nothing apparently could have been more innocent
than the subject I had chosen.
“Well, I’m sorry, but what you tell me comes too late as I have
already made a sketch,” said I, showing it to him. “What shall I do?”
The officer, a very good fellow, laughed and shrugged his
shoulders: “Put your book back in your pocket and don’t let anyone
see it; there are several staff officers about.”
The finish of the incident was equally curious. I worked up a
double page drawing from the sketch in question, and, of course,
submitted it to the censors. It aroused a good deal of comment, but it
was eventually “passed” on condition that I altered the title and took
out all the names of the towns, mountains, etc.; only the vaguest
suggestion as to where I had made it being permitted.
In spite of the fact that the Austrians had the geographical
advantage of position almost everywhere, and that their frontier was
comparatively so close to many important Italian cities, the intrepid
advance of the Italian troops upset all the calculations of the Austrian
generals, and, instead of advancing into Italian territory, they found
themselves forced to act on the defensive some distance to the rear
of their first line positions, and well inside their own frontier. But it
was no easy task for the Italians, and tested their valour and
endurance to the utmost.
The fighting in the ravines and on the sides of the mountains was
of the most desperate character, for in this warfare at close quarters
it is man to man, and individual courage tells more than it does down
on the plains.
Here, in the fastnesses of nature, every clump of trees or isolated
rocks are potential ambuscades. So it requires the utmost caution,
combined with almost reckless daring, to advance at any time.
The Austrians, though well provided with heavy artillery, were quite
unable to hold on to their positions. It was brute force pitted against
skill and enthusiastic courage, and brute force was worsted as it
generally is under such conditions.
Our two next “stages,” Vicenza and Belluno, brought us into the
very heart of the fighting on the line of the Italian advance in the
Eastern Trentino towards Bolzano and the region round Monte
Cristallo.
We halted a couple of days at Vicenza to enable us to visit the
positions of Fiera di Primero. The Italian lines here were some
distance inside Austrian territory, so we had a good opportunity of
judging for ourselves the difficulties that had to be overcome to have
advanced so far, as well as the preparations that had been made by
the Austrians for their proposed invasion of Italy.
Cunningly concealed trenches, barbed-wire entanglements and
gun emplacements commanded every approach, whilst protecting
the advance of troops. It seemed incredible that such well planned
works should have been abandoned.
But here as elsewhere the lightning strategy of Cadorna left the
Austrian commanders no option. Monte Marmolada, 11,000 ft. high,
and other mountains on which the Austrians had placed heavy
artillery, were captured by degrees. The strategic value of these
positions was incontestable.
Unless one has seen the Dolomites it is impossible to form any
conception of what these successes mean or the terrible difficulties
that had to be surmounted to gain them.
Neither Dante nor Doré in their wildest and most fantastic
compositions ever conceived anything more awe-inspiring than
warfare amidst these towering peaks.
At all times they exercise a kind of weird fascination which is
positively uncanny; add the thunder of modern artillery and the effect
is supernatural. You try hard to realize what it means to fight amongst these jagged pinnacles and on the edges of the awful precipices.
Death, however, has little terror for the men, judging from the look on the faces of the mortally wounded one saw brought down from the trenches from time to time.
A little incident related to me by Calza Bedolo brings home the
spirit of Italy’s soldiers.
He had shortly before come upon a stretcher-party carrying down
from the mountains a very dangerously wounded man. Upon enquiry
as to how the wound had been caused he was informed that it was a
case of attempted suicide.
What had led up to this desperate act? It appeared that for some
trivial breach of discipline the man had been deprived of the privilege
of a place in the front trenches and sent to a position in the rear!
The most important of all the strategic points at that time was the Col di Lana, which dominates the Falzarego and Livinallongo passes, close to Cortina d’Ampezzo. Here the Austrians were putting up a defence which was taxing the strength and resources of the Italians to their utmost, but it was gradually being overcome.
Every mountain which commanded the position was being
mounted with guns of the heaviest calibre, and big events were said
to be looming in the near future.
As a matter of fact, it was only six months later that the Italians
succeeded in capturing the Col di Lana, so strongly were the
Austrians entrenched on it. A young engineer conceived the idea of
mining it, and so successful was he that the entire summit of the
mountain, with the Austrian positions, was literally blown away.
One of the most interesting of the excursions the English group of
correspondents made was to the top of a mountain facing it. As it
would have been a very trying climb for amateur mountaineers like
ourselves, mules were considerately supplied by the General of the
division. So we accomplished the ascent in easy fashion, for it was
certain that very few of us would have tackled it on foot.
The sturdy Alpini who accompanied us apparently treated the excursion as a good sort of joke, and plodded steadily alongside us in the best of spirits, laughing now and again at our vain efforts to keep our steeds from walking on the extreme edge of the precipices.
This ride gave us a splendid opportunity of seeing how the Italians
have surmounted the difficulty of getting heavy artillery to the very
summits of mountains, where no human foot had trodden before the
war broke out. Rough and terribly steep in places though the road
was, still it was a real roadway and not a mere track as one might
have expected to find considering how rapidly it had been made.
Men were still at work consolidating it at the turns on scientific
principles, and in a few weeks, with the continual traffic passing up
and down it, it would present all the appearance of an old
established road.
It is the method of getting the guns and supplies up these great
heights in the first instance that “starts” the road, as it were. Nothing
could be simpler or more efficacious.
It consists in actually cutting the track for the guns just in advance of them as they are gradually pushed and hauled forward, the position and angle of the track being settled beforehand by the engineers.
This, of course, takes time at first, especially when the acclivity is
very steep, but it has the advantage of breaking the way for
whatever follows. The rough track is then gradually improved upon
by the succeeding gun teams, and so a well constructed zig-zag
military roadway gradually comes into being.
We left the mules a short distance from the summit and had to
climb the rest of the way. Instead of the artillery position we had expected to find, it was an observation post, with a telephone cabin built in a gap in the rocks, and a hut for half a dozen soldiers on duty.
The little station was quite hidden from the Austrians, although
only a couple of thousand yards distant. It was a most important
spot, as from here the fire of all the batteries round about was
controlled.
We were received by the Colonel commanding this sector of the
artillery, a grizzled warrior, wearing a knitted woollen sleeping cap
pulled well down over his ears, which gave him a somewhat quaint
and unmilitary appearance.
The “observation post” was merely a small hole through the rocks,
and so awkward to get at that only two people could look through it
at the same time. Immediately facing you across a shallow valley
was a barren hill of no great elevation (of course it must be
remembered we were here several thousand feet up). There was no
sign of life or vegetation, and it looked so singularly bare and
uninteresting that unless you had been told to look at it your attention
would never have been attracted to it.
Yet this was the much talked of Col di Lana. Seen, however,
through field glasses its aspect altered considerably, and you could
not fail to notice what appeared to be row upon row of battered stone
walls, and also that the ridge was very much broken up, shewing
patches everywhere of red sand.
The “stone walls,” the Colonel told us, were what remained of the
Austrian trenches and the patches of sand were caused by the
incessant bombardment by the Italians. At that moment there was
not the slightest sign of military activity anywhere, no sound of a gun
disturbed the still air.
It seemed incredible that we were gazing on the most redoubtable
position on the whole Front, and one that for weeks had barred the
Italian advance in this direction.
Someone remarked that it did not look so very formidable after all,
and asked the Colonel if it would really mean a very big effort to
capture it.
“To take that innocent looking summit now,” he replied gravely,
“would necessitate attacking it with a couple of hundred thousand
men and being prepared to lose half of them. We shall get it by other
means, but it will take some time; meanwhile every yard of it is
covered by my batteries.”
We continued to gaze on the silent landscape with increasing
interest, when suddenly, as though an idea had occurred to him, the
Colonel said that if we did not mind waiting twenty minutes or so he
would show us what his gunners could do. Of course we asked for
nothing better. So he went up to the telephone cabin and was there a
little while; he then came back and told us to follow him.
He led the way down a ravine enclosed by lofty cliffs close by. At
the foot of it were large boulders, some with sandbags spread on
them. This was his sharpshooters’ lair, he informed us, but for the
moment they were not there.
We were then told to hide ourselves as much as possible behind
the rocks and watch what was going to happen on the Col di Lana,
which was in full view from here.
“We are right under fire here, but you are fairly safe if you keep
well under cover,” he added, as a sort of final recommendation when
he saw us all placed.
The stillness of death reigned for the next ten minutes perhaps.
We kept our eyes glued on the fateful hill opposite, not exactly
knowing what was going to happen, when all of a sudden there was
the crash of a big gun and we heard the shriek of a shell as it passed
overhead; then, with scarcely an interval, this was followed by such a succession of firing that it sounded like a thunderstorm let loose.
The effect on the Col di Lana was startling: it was as though a series of volcanoes had burst into activity; all along the summit and just below it, fantastic columns of smoke and dust rose high into the air.
As the Colonel had truly said, every yard of the hill was under the fire
of his batteries.
It was an object lesson in precision of aim, and one almost felt
sorry for the men who were thus, without the slightest warning,
deluged with high explosives. Meanwhile the Austrian batteries did
not fire a shot in reply.
The bombardment lasted exactly ten minutes, and ceased as
abruptly as it had started.
“Wonderful,” we all exclaimed when we were reassembled at the
station. The Colonel looked delighted with the way his instructions
had been carried out.
At that moment we heard the telephone bell ringing violently; he
excused himself and hurried to the box, and was there some
minutes. When he returned the look of elation on his face had
disappeared.
“That was the General ringing up,” he explained. “He had heard the firing and wanted to know what had suddenly happened. He is in an awful rage at my giving you this entertainment.”
Of course we were all very sorry that he should have got into
trouble on our account, but he seemed to make light of it, and
evidently had no fear of unpleasant consequences. We then left the
place and retraced our footsteps.
There was no mule-riding going down the mountain unless you wanted to break your neck. It was far too steep; we therefore had to walk the whole way, a very long and tiring job.
In the valley below was the village of Caprile, where we had
arranged to meet our cars. A mountain stream ran past the village,
and there was a broad, open space of ground facing the houses, in
which was a large encampment with long sheds and hundreds of
horses and mules picketed.
As we were walking across to the inn, where we were going to lunch, we heard the dull boom of a gun in the distance, and in a few seconds the approaching wail of a projectile, followed by the report of an explosion a short distance away; we saw that the shell had burst on the hillside a few feet from the Red Cross Hospital.