Thesis

Parameter Estimation in Stochastic Volatility Models Via
Approximate Bayesian Computing
A Thesis
Presented in Partial Fulfillment of the Requirements for the Degree

Master of Science in the Graduate School of The Ohio State
University
By
Achal Awasthi, B.S.
Graduate Program in Department of Statistics
The Ohio State University
2018
Master’s Examination Committee:

Radu Herbei, Ph.D., Advisor
Laura S. Kubatko, Ph.D.

© Copyright by
Achal Awasthi
2018
Abstract
In this thesis, we propose a generalized Heston model as a tool for estimating
volatility. We use Approximate Bayesian Computing to estimate the parameters of
the generalized Heston model, and we apply the model to the daily closing
prices of the Shanghai Stock Exchange and the NIKKEI 225 indices. We found that
the model fits well over shorter time periods around financial crises. Over longer
time periods, it fails to capture the volatility in detail.
This is dedicated to my grandmothers, Radhika and Prabha, who have had a
significant impact on my life.
Acknowledgments
I would like to thank my thesis supervisor, Dr. Radu Herbei, for his help and his
availability throughout the development of this project. I am also grateful to
Dr. Laura Kubatko for agreeing to serve on the defense committee. My gratitude goes
to my parents; without their support and education I would not have had the chance
to study around the world. I would also like to express my gratitude to my uncles,
Kuldeep and Tapan, and to Mr. Richard Rose for helping me transition smoothly to
life in a different country. In addition, my deepest appreciation goes to my friends
in the Department of Statistics, who have been there for me since my first day of
class at The Ohio State University. Finally, I am extremely thankful to my housemates
for bearing with me during the past year.
Vita
2016: B.S. Physics
2016-present: Graduate Teaching Associate, The Ohio State University.
Publications
Fields of Study
Major Field: Department of Statistics
Contents
Abstract
Dedication
Acknowledgments
Vita
List of Tables
List of Figures
1. Introduction
1.1 Motivation
1.2 Emerging Markets during Financial Crisis
1.3 Structure of Thesis
2. Background
2.1 Introduction
2.2 Brownian Motion
2.3 Geometric Brownian Motion (GBM)
2.3.1 Parameter Estimation for the GBM process using Maximum Likelihood Estimation
2.4 The Ornstein-Uhlenbeck Process
2.4.1 Simulation of the OU Process
2.4.2 Parameter Estimation for OU Process using Maximum Likelihood
2.4.3 Parameter Estimation for OU Process using Ordinary Least Squares
2.5 Cox-Ingersoll-Ross Process
2.5.1 Simulation of CIR process
2.5.2 Parameter Estimation for CIR Process using Maximum Likelihood
2.6 Generalized Cox-Ingersoll-Ross model
2.6.1 Parameter Estimation for generalized CIR Process using Maximum Likelihood
2.6.2 Distribution of $\int_{t_1}^{t_2} W(s)\,ds$
3. Approximate Bayesian Computing for Stochastic Volatility Models
3.1 Heston Model
3.1.1 Simulation of sample paths of the Heston Model
3.1.2 Euler-Maruyama (EM) Approximation
3.1.3 Euler-Maruyama scheme with Lord et al.'s modification
3.1.4 Milstein scheme
3.1.5 Broadie and Kaya's Exact Algorithm
3.2 A generalized Heston Model
3.2.1 Simulation of sample paths of the generalized Heston model
3.3 Approximate Bayesian Computing (ABC)
3.3.1 ABC for Heston Model
3.3.2 ABC for generalized Heston Model
4. Application: Modeling Volatility in Financial Markets
4.1 Introduction
4.1.1 Stock Index
4.2 Exploratory Data Analysis
4.3 Parameter estimation of the Generalized Heston model using ABC
4.3.1 Parameter estimation using ABC for SSE
4.3.2 Parameter estimation using ABC for NIKKEI 225
5. Contributions and Future Work
5.1 Results Overview
5.2 Future Work
5.2.1 Moments of generalized Heston model
Bibliography
Chapter 1: Introduction
1.1 Motivation
Physicists, statisticians and mathematicians have long been interested in theories
related to finance. The tools developed in statistical physics, statistics and theoretical
mathematics can be used to model complex financial systems. Many changes took
place in the world of finance in the latter half of the last century. For example,
in 1973 currencies began to be traded in financial markets. Their values
were determined by the foreign exchange markets, which are active 24 hours a day all
over the world. Among the other changes are new models for estimating volatility,
which is an essential ingredient in pricing European options. The
Black-Scholes model (BSM) was among the first successful models to price options.
However, this model is based on several assumptions that are not representative of
the real world. In particular, the BSM assumes that volatility is deterministic and
remains constant through the option's life, which clearly contradicts the behavior
observed in financial markets. While the BSM framework can be adapted to obtain
reasonable prices for plain vanilla options, the constant volatility assumption may
lead to significant mispricing when used to evaluate options with non-conventional
or exotic features.
During the last few decades, several alternatives have been proposed to improve
volatility modeling in the context of derivatives pricing. One such approach is to
model volatility as a stochastic quantity. By introducing uncertainty in the behavior
of volatility, the evolution of financial assets can be estimated more realistically.
In addition, with appropriate parameters, stochastic volatility models can be
calibrated to reproduce the market prices of liquid options and other derivatives
contracts. One of the most widely used stochastic volatility models was proposed by
Heston in 1993. The Heston model introduces dynamics for the underlying asset that
can take into account the asymmetry and excess kurtosis typically observed in
financial asset returns. It also provides a closed-form valuation formula that can be
used to efficiently price plain vanilla options. This is particularly useful in the
calibration process, where many option prices must usually be computed in order to
find the optimal parameters that reproduce market prices.
1.2 Emerging Markets during Financial Crisis
Previous research shows that well-functioning stock markets have a considerable
effect on the growth of an economy, especially a developing one. Over the past few
decades, researchers around the globe have studied stock market efficiency, and the
conflicting results have made it difficult to comment on the status of the stock
market of any particular country. We therefore focus our attention on stock market
behavior in developing countries, which are not considered to be as stable as the
developed ones. Their markets are unlikely to be fully information-efficient, partly
because institutional barriers restrict information flows to the market and partly
because market participants lack the experience to rapidly incorporate new
information into security prices. It is therefore interesting to investigate the last
20 years, a period that covers both the Global Financial Crisis and the Chinese
crisis, and their effects on the fast-emerging economies of India and China. The
recession crumpled economies worldwide, but these two were relatively unaffected
and hence are of particular interest.
The BRICS economies (Brazil, Russia, India, China, South Africa), currently among
the fastest growing, were affected primarily through four channels: trade, finance,
commodities, and confidence. The slump in export demand and tighter trade credit
caused a slowdown in aggregate demand. The global financial crisis inflicted
significant loss in output in all these countries. However, real GDP growth in India
and China remained impressive even though they witnessed some moderation due to
weakening global demand. The crisis also exposed the structural weakness of the
global financial and real sectors. The BRICS were able to recover quickly with the
support of domestic demand. The reversal of capital flows led to equity market losses
and currency depreciations, resulting in lower external credit flows. The banking
sectors of the BRICS economies performed relatively well [20].
Since our analysis revolves around the two recent financial crises, we need to
understand their effects as well. A financial crisis is a disruption to financial
markets in which adverse selection and moral hazard problems become much worse, so
that financial markets are unable to efficiently channel funds to those who have the
most productive investment opportunities. As a result, a financial crisis can drive
the economy away from an equilibrium with high output, in which financial markets
perform well, to one in which output declines sharply [19]. The global financial
crisis, whose onset at the end of 2007 and the beginning of 2008 brought disorder to
financial markets around the world, is the first crisis considered in our study. The
instability in the global stock markets began with a shortfall of liquid assets in
the US banking system and a continual fall in stock prices on news that Lehman
Brothers, Merrill Lynch and many other investment banks and companies were
collapsing. Stock markets around the globe suffered huge losses, and the Indian stock
market was no exception. The SENSEX, which had reached historically high levels at
the beginning of 2008, fell back to its level of about three years earlier, and the
S&P CNX NIFTY followed a similar trend. Economic growth in India decelerated in
2008-09 to 6.7 percent, a decline of 2.1 percentage points from the average growth
rate of 8.8 percent in the previous five years. China was not one of the countries
hardest hit by the crisis, nor was it as insulated as many had assumed. This
can be seen from the fact that China continued to have one of the highest rates of
economic growth across the globe, recording 9.6% in 2008 and 9.2% in 2009.
While most countries would be delighted to have such growth rates, these rates
reflected a substantial drop from the 14.2% growth in 2007. In terms of short-term
impact on China, the most visible damage was inflicted on its export-oriented light
industry in southern China. Thousands of companies went bust, tens of thousands of
workers were laid off, and official statistics revealed that 10 million migrant
workers had returned to their home provinces. In the financial sector, the stock
market crash that started in late 2007 wiped out more than two thirds of market
value, although this dramatic collapse was not without home-made causes [16]. The
Chinese banks, for all their profitability, witnessed the sudden pull-out of many of
their Western partners (Bank of America, UBS, RBS), which sold their minority stakes
in order to retrieve capital. Another massive blow was to China's fledgling
sovereign wealth fund, China Investment Corporation.
The second crisis considered in our study is the Chinese stock market crash,
which began with the popping of the stock market bubble on 12 June 2015. A third
of the value of A-shares on the Shanghai Stock Exchange was lost within one month
of mid-June. By 8-9 July 2015, the Shanghai stock market had fallen
30 percent over three weeks as 1,400 companies, or more than half of those listed,
filed for a trading halt in an attempt to prevent further losses. The crisis had been
building over the major part of 2014-15: investors, led by retail investors, kept
putting more and more money into Chinese stocks, encouraged by falling borrowing
costs as the central bank loosened monetary policy, even though economic growth and
company profits were weak.
1.3 Structure of Thesis
This thesis is organized as follows. In chapter 2, we present the stochastic models
most commonly encountered in finance, together with their simulation and parameter
estimation. Chapter 3 is devoted to the estimation of the parameters of the Heston
model using Approximate Bayesian Computing; there we also present a new model for
estimating volatility, namely the generalized Heston model. In chapter 4, we fit the
generalized Heston model to data from the Shanghai Stock Exchange and the
NIKKEI 225. In chapter 5, we discuss some of the results and outline future work.
Chapter 2: Background
2.1 Introduction
In this chapter, we introduce the basic concepts from probability theory and its
applications in the field of finance. We also introduce several important and widely
used stochastic processes. In addition to their definitions, we describe a statistical
approach to estimating the parameters defining these processes.
Definition 1. Let Ω be a non-empty set, and let F be a collection of subsets of Ω.
F is a σ−algebra if it satisfies,
1. ∅ ∈ F,
2. If a set A ∈ F, then Ac ∈ F,
3. If a sequence of sets A1 , A2 , · · · ∈ F, then $\bigcup_{n=1}^{\infty} A_n \in F$.
Definition 2. Let Ω be a non-empty set, and let F be a σ−algebra over Ω. A
probability measure P is a function that, to every set A ∈ F assigns a number in
[0, 1]. This number is called the probability of A and is represented as P(A).
The measure P should satisfy the following properties,
1. P(Ω) = 1, and
2. If A1 , A2 , . . . is a sequence of disjoint sets such that An ∈ F for all n ≥ 1, then
$$P\left(\bigcup_{n=1}^{\infty} A_n\right) = \sum_{n=1}^{\infty} P(A_n) \tag{2.1}$$
The triple (Ω, F, P) is called a probability space.
Definition 3. Let F be a σ−algebra and Ω the space of outcomes which are specific
to an experiment. A function X : (Ω, F) → R is a random variable if, for every
r ∈ R, the set Fr = {ω : X(ω) ≤ r} satisfies Fr ∈ F.
A random variable X is called a discrete random variable if its range {X(ω) : ω
∈ Ω} is countable. A random variable X is called a continuous random variable if its
range is a continuous subset of R. A continuous random variable has a cumulative
distribution function (CDF) which is absolutely continuous. On the other hand, the
CDF of a discrete random variable is a step function with discontinuities at the values
taken on by the random variable.
Definition 4. Let T ⊆ [0, ∞). A family of random variables {Xt }t∈T is called a
stochastic process. If T ⊆ N, then the stochastic process is a discrete-time process,
and if T is an interval of [0, ∞), it is a continuous-time process.
For example, let {X(t) : t = 0, 1, 2, . . .} be a stochastic process that evolves according
to the following rule: X(0) = 0 and, for t ≥ 0,

X(t + 1) = X(t) + 1 with probability p,
X(t + 1) = X(t) − 1 with probability 1 − p.

Then, the stochastic process {X(t) : t = 0, 1, 2, . . .} is called a random walk. If
p = 1/2, i.e., we are equally likely to move forward or backward, then the random walk
is called a symmetric random walk. If p ≠ 1/2, i.e., we have a preferred direction, then
the random walk is called a biased random walk. The random walk process has the
following properties,
• If p = 1/2 all states of a random walk are recurrent. If p ≠ 1/2 all states are
transient.
• Each state of a random walk has period 2 except for the first and last states, if
the process is assumed to live in 1, 2, . . . , k for some positive integer k.
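The rule above can be sketched in a few lines of Python. This is a minimal illustration, not code from the thesis: the function name `simulate_random_walk`, the seed and the path length are assumptions chosen for the example.

```python
import random


def simulate_random_walk(n_steps, p=0.5, seed=None):
    """Return the path X(0), X(1), ..., X(n_steps) of a simple random walk."""
    rng = random.Random(seed)
    path = [0]  # X(0) = 0
    for _ in range(n_steps):
        # Move up with probability p, down with probability 1 - p.
        step = 1 if rng.random() < p else -1
        path.append(path[-1] + step)
    return path


# Symmetric walk (p = 1/2); the seed and length are arbitrary choices.
path = simulate_random_walk(1000, p=0.5, seed=42)
```

Because every step changes the state by exactly one, X(t) always has the same parity as t, which is the period-2 behavior noted in the second property.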
2.2 Brownian Motion
Brownian Motion (BM) was first observed by biologist Robert Brown [9] in 1827
while studying pollen particles. He observed that when seen under a microscope, the
pollen particles floating in water exhibited a zig-zag jittery motion. He repeated the
experiment with particles of dust and concluded that the motion was not due to the
pollen being alive, but he could not explain the source of this random motion. The
first mathematical theory of BM was given by French mathematician Louis Bachelier in
his PhD thesis titled "Theory of Speculation" [7]. In 1905, the renowned physicist
Albert Einstein used probabilistic arguments to explain BM. He argued that water
molecules, given sufficient kinetic energy, move randomly, and that this accounts for
the movement of pollen that Robert Brown had described.
The theory of BM has been applied in a variety of fields, including biology, physics,
economics, mathematics and finance. Stock market researchers were battling with a
problem similar to the one Robert Brown had encountered in 1827: they were able to
trace the path of the market price, but they did not know the reason behind it. They
could not determine who was buying, who was selling, and how demand and supply
were affecting price movements.
Definition 5. Let (Ω, F, P) be a probability space. A stochastic process {W (t) : t ≥ 0}
is said to be a standard Brownian motion process if,
• W (0) = 0 almost surely;
• The increments for non-overlapping time intervals are independent.
• W (t) − W (s) ∼ N (0, t − s) for s < t,
• cov(W (s), W (t)) = min(s, t).
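As an illustration of Definition 5, the sketch below builds Brownian paths on a grid by summing independent N(0, ∆t) increments and then checks the covariance property empirically at s = 0.5 and t = 1. The grid size, number of paths, seed and the helper name `simulate_brownian_motion` are assumptions made for this example only.

```python
import math
import random


def simulate_brownian_motion(T, n_steps, rng):
    """Return W(0), W(dt), ..., W(T) on an equally spaced grid."""
    dt = T / n_steps
    w = [0.0]  # W(0) = 0
    for _ in range(n_steps):
        # Independent increment W(t + dt) - W(t) ~ N(0, dt).
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w


rng = random.Random(0)
n_paths, n_steps, T = 4000, 20, 1.0
paths = [simulate_brownian_motion(T, n_steps, rng) for _ in range(n_paths)]

# Sample covariance of W(0.5) and W(1.0); the definition gives min(0.5, 1.0) = 0.5.
i_s, i_t = n_steps // 2, n_steps
mean_s = sum(p[i_s] for p in paths) / n_paths
mean_t = sum(p[i_t] for p in paths) / n_paths
cov_st = sum((p[i_s] - mean_s) * (p[i_t] - mean_t) for p in paths) / (n_paths - 1)
```

With a few thousand paths, `cov_st` should lie close to 0.5, up to Monte Carlo error.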
Next, we briefly introduce the concept of a stochastic differential equation (SDE).
Let {X(t) : t ≥ 0} be a stochastic process and assume that the process satisfies the
following equation,
$$X(t) = X(0) + \int_0^t a(X(s), s)\,ds + \int_0^t b(X(s), s)\,dW(s), \tag{2.2}$$
where a(·, ·) and b(·, ·) are known functions and {W (t) : t ≥ 0} is a standard Brownian
motion. In the equation above, the integral
$$\int_0^t a(X(s), s)\,ds$$
is a Riemann integral whereas the integral
$$\int_0^t b(X(s), s)\,dW(s)$$
is an Itô integral. Throughout this dissertation, we will assume that the functions
a(·, ·) and b(·, ·) satisfy sufficient conditions for such integrals to exist and to be finite
almost surely. Such conditions can be found in [15]. If a process X(t) satisfies equation
(2.2), we say that X(t) is a diffusion process. Equation (2.2) can be briefly written
as,
$$dX(t) = a(X(t), t)\,dt + b(X(t), t)\,dW(t) \tag{2.3}$$
The term a(·, ·) is called the drift term while the function b(·, ·) is called the diffusion
coefficient. In this dissertation we only briefly review some of the necessary tools and
processes from this area. The equation (2.3) is referred to as a stochastic differential
equation (SDE).
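To make the roles of the drift and the diffusion coefficient concrete, the following sketch discretizes (2.3) with the Euler-Maruyama scheme, which replaces dt by a small step ∆t and dW(t) by an N(0, ∆t) draw. The function name `euler_maruyama` and the example coefficients a(x, t) = −x and b(x, t) = 0.5 are illustrative assumptions, not a model used in this chapter.

```python
import math
import random


def euler_maruyama(a, b, x0, T, n_steps, rng):
    """Approximate a path of dX = a(X, t) dt + b(X, t) dW on [0, T]."""
    dt = T / n_steps
    x, t = x0, 0.0
    path = [x0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        x = x + a(x, t) * dt + b(x, t) * dw
        t += dt
        path.append(x)
    return path


# Illustrative coefficients: drift a(x, t) = -x, constant diffusion b(x, t) = 0.5.
rng = random.Random(1)
path = euler_maruyama(lambda x, t: -x, lambda x, t: 0.5,
                      x0=1.0, T=1.0, n_steps=1000, rng=rng)
```

Setting b to zero turns the scheme into the ordinary Euler method for dX/dt = a(X, t), which gives a simple sanity check against a known deterministic solution.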
Proposition 1. Itô's Lemma - Let X(t) be a stochastic process which satisfies the
following stochastic differential equation,
$$dX(t) = a(X(t), t)\,dt + b(X(t), t)\,dW(t)$$
and let f (x, t) be any twice differentiable scalar function of two real variables x and t,
then Itô's lemma states that,
$$df(X(t), t) = \left[\frac{\partial f(X, t)}{\partial t} + a(X, t)\frac{\partial f(X, t)}{\partial x} + \frac{b^2(X, t)}{2}\frac{\partial^2 f(X, t)}{\partial x^2}\right]dt + b(X, t)\frac{\partial f(X, t)}{\partial x}\,dW(t).$$
A proof of this lemma can be found in [15].
2.3 Geometric Brownian Motion (GBM)
Definition 6. Let {W (t) : t ≥ 0} be a stochastic process that describes a Brownian
Motion. Let S(0) > 0 and µ ∈ R and σ ∈ R+ be constants. If S(t) satisfies the
following stochastic differential equation,
dS(t) = µS(t)dt + σS(t)dW (t) (2.4)
then it is said to be a Geometric Brownian Motion (GBM).
The solution of (2.4) is,
$$S(t) = S(0) \cdot \exp\left\{(\mu - 0.5\sigma^2)t + \sigma W(t)\right\}$$
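As a worked step, and assuming nothing beyond Itô's lemma (Proposition 1) and the SDE (2.4), this solution can be recovered by applying the lemma to f (x, t) = log x with a = µS and b = σS:

```latex
\begin{aligned}
\frac{\partial f}{\partial t} &= 0, \qquad
\frac{\partial f}{\partial x} = \frac{1}{x}, \qquad
\frac{\partial^2 f}{\partial x^2} = -\frac{1}{x^2}, \\
d\log S(t) &= \left[\mu S(t)\cdot\frac{1}{S(t)}
  - \frac{\sigma^2 S(t)^2}{2}\cdot\frac{1}{S(t)^2}\right]dt
  + \sigma S(t)\cdot\frac{1}{S(t)}\,dW(t) \\
 &= \left(\mu - 0.5\sigma^2\right)dt + \sigma\,dW(t), \\
\log S(t) - \log S(0) &= \left(\mu - 0.5\sigma^2\right)t + \sigma W(t).
\end{aligned}
```

Exponentiating both sides of the last line gives the stated expression for S(t).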
For a small increase in time from t to t + ∆t, the ratio S(t + ∆t)/S(t) is
$$\frac{S(t + \Delta t)}{S(t)} = \exp\left\{(\mu - 0.5\sigma^2)\Delta t + \sigma\big(W(t + \Delta t) - W(t)\big)\right\}$$
where W (t + ∆t) − W (t) ∼ N (0, ∆t). From this expression, it follows that S(t) cannot
be zero at any point of time. If σ (the volatility) equals zero, then the solution of
equation (2.4) reduces to
$$S(t) = S(0)\exp(\mu t).$$
This implies that, given S(0) > 0 and µ > 0, S(t) is an increasing function of time t.
As noted, for any particular time interval ∆t,
$$S(t + \Delta t) = S(t) \cdot \exp\left\{(\mu - 0.5\sigma^2)\Delta t + \sigma\big(W(t + \Delta t) - W(t)\big)\right\} \tag{2.5}$$
If we take logarithms on both sides, we obtain the following equation,
$$\log(S(t + \Delta t)) - \log(S(t)) = (\mu - 0.5\sigma^2)\Delta t + \sigma[W(t + \Delta t) - W(t)]$$
where W (t + ∆t) − W (t) ∼ N (0, ∆t). So, σ[W (t + ∆t) − W (t)] ∼ N (0, σ²∆t).
It follows that
$$(\mu - 0.5\sigma^2)\Delta t + \sigma[W(t + \Delta t) - W(t)] \sim N\big((\mu - 0.5\sigma^2)\Delta t,\ \sigma^2\Delta t\big).$$
Consequently, conditionally on log(S(t)),
$$\log(S(t + \Delta t)) \sim N\big(\log(S(t)) + (\mu - 0.5\sigma^2)\Delta t,\ \sigma^2\Delta t\big).$$
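The expectation of this process is,
$$\begin{aligned}
E(S(t) \mid S(0)) &= E\left[S(0) \cdot \exp\left\{\sigma W(t) + (\mu - 0.5\sigma^2)t\right\} \,\middle|\, S(0)\right] \\
&= S(0) \cdot \exp\left\{(\mu - 0.5\sigma^2)t\right\} \cdot E[\exp(\sigma W(t))] \\
&= S(0) \cdot \exp\left\{(\mu - 0.5\sigma^2)t\right\} \cdot \exp\{0.5\sigma^2 t\} \\
&= S(0) \cdot \exp(\mu t)
\end{aligned}$$
Here, we have used the fact that $E[\exp\{cW(t)\}] = \exp(c^2 t/2)$, where c ∈ R.
Similarly, the variance of S(t) is,
$$Var(S(t) \mid S(0)) = S(0)^2 \cdot \exp(2\mu t) \cdot \left(\exp(\sigma^2 t) - 1\right).$$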
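The two moments can be checked numerically. The sketch below draws S(t) directly from the closed-form solution and compares the sample mean and variance with S(0) exp(µt) and S(0)² exp(2µt)(exp(σ²t) − 1). The parameter values, the seed and the helper name `gbm_terminal` are illustrative assumptions.

```python
import math
import random


def gbm_terminal(s0, mu, sigma, t, rng):
    """Draw S(t) from the closed-form GBM solution, using W(t) ~ N(0, t)."""
    w_t = rng.gauss(0.0, math.sqrt(t))
    return s0 * math.exp((mu - 0.5 * sigma ** 2) * t + sigma * w_t)


# Illustrative values, not taken from the thesis.
s0, mu, sigma, t = 1.0, 0.1, 0.2, 1.0
rng = random.Random(7)
samples = [gbm_terminal(s0, mu, sigma, t, rng) for _ in range(100_000)]

mc_mean = sum(samples) / len(samples)
mc_var = sum((x - mc_mean) ** 2 for x in samples) / (len(samples) - 1)

exact_mean = s0 * math.exp(mu * t)
exact_var = s0 ** 2 * math.exp(2 * mu * t) * (math.exp(sigma ** 2 * t) - 1.0)
```

The Monte Carlo estimates should agree with the exact values up to sampling error, which shrinks as the number of draws grows.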
This stochastic process has been used to model quantities that must be positive. In
figure 2.1, we show 500 simulated paths of a GBM process, which have been obtained
according to algorithm 1.
Algorithm 1. (Simulation of the GBM process)
• Set the process parameters i.e. total time period (T) = 10, number of steps (N)
= 1000, number of simulations (n) = 500, β = 1.5, θ = 0.15, σ = 0.1.
• Let ∆t = T /N and initialize the process by setting S(0).
• Recursively simulate S(t + ∆t) using (2.5), where W (t + ∆t) − W (t) ∼ N(0,∆t)
is independent of everything else.
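The steps of algorithm 1 can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the code used for Figure 2.1: the drift µ = 0.15 and the starting value S(0) = 20 are assumptions chosen for the example, and the function name `simulate_gbm` is hypothetical.

```python
import math
import random


def simulate_gbm(s0, mu, sigma, T, n_steps, rng):
    """Simulate one GBM path on [0, T] using the exact update (2.5)."""
    dt = T / n_steps
    path = [s0]
    for _ in range(n_steps):
        # Brownian increment W(t + dt) - W(t) ~ N(0, dt), independent of the past
        dw = rng.gauss(0.0, math.sqrt(dt))
        path.append(path[-1] * math.exp((mu - 0.5 * sigma**2) * dt + sigma * dw))
    return path


rng = random.Random(0)
# Assumed values: mu = 0.15 and S(0) = 20 are illustrative, not taken from the text.
paths = [simulate_gbm(20.0, 0.15, 0.1, T=10.0, n_steps=1000, rng=rng) for _ in range(500)]
mean_terminal = sum(p[-1] for p in paths) / len(paths)
print(mean_terminal)  # should be close to S(0) * exp(mu * T)
```

Because the update uses the exact solution rather than a discretization of (2.4), every simulated value stays strictly positive regardless of the step size.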
[Plot: S(t) against t]
Figure 2.1: Simulated paths of the GBM process with parameters as described in
algorithm 1
[Plot: Histogram of ln(S(t=50dt)); Frequency against Value]
Figure 2.2: Histogram of log of GBM at the 50th time-step. The orange curve
represents the superimposed normal density curve with parameters obtained from
simulated data at the 50th time-step.
2.3.1 Parameter Estimation for the GBM process using Maximum Likelihood Estimation
Let {X(t) : t ≥ 0} be a stochastic process that satisfies the Markov property. Assume
that we observe this process at a discrete collection of time points {t0, t1, . . . , tn}
where, t0 = 0, ti = iT /n for i = 1, 2, . . . , n. Let X = {X(t0 ), X(t1 ), . . . , X(tn )} be the
available data. For simplicity, we use Xi = X(ti ). Let θ be the parameters defining
the process {X(t) : t ≥ 0}. The likelihood function is defined as,

\[
L(\boldsymbol{\theta} \mid X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} f_{\boldsymbol{\theta}}(X_i \mid X_{i-1})
\]
where fθ (Xi |Xi−1 ) is called the transition density, and X0 is assumed to be fixed. We
make this assumption throughout this document. For the GBM process the transition
density is,
\[
f(X_i \mid X_{i-1}) = \frac{1}{\sigma X_i \sqrt{2\pi\tau}} \exp\left( -\frac{(\log(X_i / X_{i-1}) - \nu\tau)^2}{2\sigma^2\tau} \right)
\]
where ν = µ − σ 2 /2 and τ = T /n. Thus, the likelihood function is,

\[
L(\mu, \sigma \mid X) = \prod_{i=1}^{n} \frac{1}{\sigma X_i \sqrt{2\pi\tau}} \exp\left( -\frac{(\log(X_i / X_{i-1}) - \nu\tau)^2}{2\sigma^2\tau} \right)
\]
Instead of maximizing the likelihood function, we maximize the log likelihood func-
tion l(µ, σ|X).
For a simulation study, we generated a data set according to algorithm 1 using the
same parameter values as above. Based on such data, we used the built in mini-
mization function from Python to estimate the parameter values by minimizing the
negative of log-likelihood. This process is repeated 500 times and the histogram of all
estimates of the parameter µ is presented in Figure 2.3. The dashed red line repre-
sents the true value of the parameter µ. Similarly, Figure 2.4 displays the histogram
of all estimates of the parameter σ and the dashed red line represents the true value
of the parameter σ.
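For the GBM likelihood the maximizer is available in closed form, because the log-returns log(Xi/Xi−1) are independent N(ντ, σ²τ) variables. The sketch below uses that closed form instead of the numerical minimizer described above, so it is a cross-check rather than a reproduction of the study. The values µ = 0.15 and S(0) = 20 and the name `gbm_mle` are assumptions made for the example.

```python
import math
import random


def gbm_mle(prices, tau):
    """Closed-form maximizer of the GBM log-likelihood for equally spaced data."""
    returns = [math.log(b / a) for a, b in zip(prices[:-1], prices[1:])]
    n = len(returns)
    mean_r = sum(returns) / n
    var_r = sum((r - mean_r) ** 2 for r in returns) / n  # MLE divides by n, not n - 1
    sigma_hat = math.sqrt(var_r / tau)
    mu_hat = mean_r / tau + 0.5 * sigma_hat**2  # because nu = mu - sigma^2 / 2
    return mu_hat, sigma_hat


# Simulate one path with the exact update (2.5); mu = 0.15 and S(0) = 20 are assumed values.
rng = random.Random(1)
T, n_steps, mu, sigma = 10.0, 1000, 0.15, 0.1
tau = T / n_steps
prices = [20.0]
for _ in range(n_steps):
    dw = rng.gauss(0.0, math.sqrt(tau))
    prices.append(prices[-1] * math.exp((mu - 0.5 * sigma**2) * tau + sigma * dw))

mu_hat, sigma_hat = gbm_mle(prices, tau)
print(mu_hat, sigma_hat)
```

The estimate of σ is typically much tighter than the estimate of µ: the precision of σ̂ grows with the number of observations, while the precision of µ̂ depends mainly on the total time span T.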
[Plot: Histogram of est_mu; Frequency against Value]
Figure 2.3: Histogram of estimated values of µ of the GBM as simulated above. The
dashed red line represents the true value of the parameter.
[Plot: Histogram of est_sigma; Frequency against Value]
Figure 2.4: Histogram of estimated values of σ of the GBM as simulated above. The
dashed red line represents the true value of the parameter.
2.4 The Ornstein-Uhlenbeck Process
The Ornstein-Uhlenbeck (OU) process is a stochastic process that was introduced
to model the velocity of a particle that is undergoing a Brownian Motion [22].
Modeling the velocity directly was particularly important because, if the position of
a particle is given by a Brownian Motion, then its time derivative does not exist.
The OU process overcomes this difficulty.
In addition, the OU process was one of the first models used for no-arbitrage
interest rates, as it has favorable properties such as mean reversion. Better models
were developed later because this model can assume negative values with positive
probability, whereas the quantities it was used to model, like the no-arbitrage interest
rates, were assumed never to take negative values. In the financial literature, it is also
known as the Vasicek model.
Definition 7. Let {X(t) : t ≥ 0} be a stochastic process and θ ∈ R and β, σ ∈ R+ be
constants. If {X(t) : t ≥ 0} satisfies the following stochastic differential equation,
dX(t) = −β(X(t) − θ)dt + σdW (t), β, σ ∈ R+ , θ ∈ R (2.6)
then X(t) is said to be an OU process.
In (2.6) above, the term dX(t) is called the infinitesimal change in X(t), β > 0 is
called the rate of mean reversion and θ is the long term mean of the OU process. The
parameter σ > 0 is called the volatility and dW (t) is Gaussian Noise. In (2.6) the
−β(X(t) − θ)dt term is known as the drift term and the term σdW (t) is known as
the diffusion term.
The OU process is a mean reverting process, i.e., even though the process is stochastic,
it has a tendency to revert to an equilibrium value. The OU process is very helpful in
modeling the interest rates or volatility as these quantities are assumed to fluctuate
around an equilibrium quantity. As can be seen from (2.6), if σ = 0, we get an
ordinary differential equation. Let X(0) = 0. When σ = 0, (2.6) reduces to,
dX(t) = −β(X(t) − θ)dt
which can be solved to get
X(t) = θ − θ exp(−βt)
As t → ∞, this solution converges to θ. So, with the addition of the term
σdW (t), we are merely adding random fluctuations about the equilibrium position θ.
If X(t) is very far from the equilibrium position θ, then the mean reversion term
−β(X(t) − θ)dt becomes larger and pushes X(t) towards the equilibrium position θ.
2.4.1 Simulation of the OU Process
Euler-Maruyama Approximation for OU Process - Let h > 0 be the step size.
The Euler-Maruyama (EM) approximation for OU process is,
X(t + h) − X(t) ≈ −β(X(t) − θ)h + σ(W (t + h) − W (t)), β, σ ∈ R+ , θ ∈ R (2.7)
This approximation leads to the following transition distribution,
[X(t + h)|X(t)] ∼ N (X(t) − β(X(t) − θ)h, σ²h).
It can be shown that the exact transition density for an OU process is,
\[
[X(t + h) \mid X(t)] \sim N\left( \theta + (X(t) - \theta)\exp(-\beta h),\; \frac{\sigma^2 (1 - \exp(-2\beta h))}{2\beta} \right) \tag{2.8}
\]
For a fixed t and a large h > 0, [X(t + h)|X(t)] is approximately normally distributed
with mean θ and variance σ²/(2β), which is the stationary distribution of the process.
In Figure 2.5, we show 50 simulated paths according to algorithm 2.
Algorithm 2. (Simulation of the OU process)
• Set the process parameters, i.e., total time period (T), number of steps (N)
= 100, number of simulations (n) = 1000, β = 3.5, θ = 0.7, σ = 0.1.
• Let ∆t = T /N and initialize the process by setting X(0) = 0.7.
• Recursively simulate X(t + ∆t) using the distribution given in (2.8).
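Algorithm 2 can be sketched as follows with the standard library. The total time T = 10 is an assumption (the value is not stated above), and the function name `simulate_ou` is hypothetical. Since the draw comes from the exact transition distribution (2.8), there is no discretization error at the grid points.

```python
import math
import random


def simulate_ou(x0, beta, theta, sigma, T, n_steps, rng):
    """Simulate one OU path on [0, T] from the exact transition distribution (2.8)."""
    h = T / n_steps
    decay = math.exp(-beta * h)
    sd = math.sqrt(sigma**2 * (1.0 - math.exp(-2.0 * beta * h)) / (2.0 * beta))
    path = [x0]
    for _ in range(n_steps):
        mean = theta + (path[-1] - theta) * decay
        path.append(rng.gauss(mean, sd))
    return path


rng = random.Random(0)
# T = 10 is an assumed value; the remaining parameters follow algorithm 2.
paths = [simulate_ou(0.7, 3.5, 0.7, 0.1, T=10.0, n_steps=100, rng=rng) for _ in range(1000)]
terminal = [p[-1] for p in paths]
mean_T = sum(terminal) / len(terminal)
var_T = sum((x - mean_T) ** 2 for x in terminal) / len(terminal)
print(mean_T, var_T)  # compare with theta and sigma^2 / (2 * beta)
```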
[Plot: X(t) against t]
Figure 2.5: Simulated paths of the OU process with parameters as described above
2.4.2 Parameter Estimation for OU Process using Maximum Likelihood
Let {X(t) : t ≥ 0} be an OU stochastic process as defined in (2.6). Assume that
we observe this process at a discrete collection of time points {t0 , t1 , . . . , tn } where,
t0 = 0, ti = iT /n for i = 1, 2, . . . , n. Let X = {X(t0 ), X(t1 ), . . . , X(tn )} be the data.
For simplicity, we use Xi = X(ti). Let \boldsymbol{\theta} = (β, θ, σ) be the parameter vector. Given that
this process satisfies the Markov property, the likelihood function is defined as,
\[
L(\boldsymbol{\theta} \mid X) = \prod_{i=1}^{n} f(X_i \mid X_{i-1})
\]
where f(Xi|Xi−1) is the transition density. For the OU process the transition density
is,
\[
f(X_i \mid X_{i-1}) = \frac{1}{\sqrt{2\pi\eta}} \cdot \exp\left( \frac{-(X_i - \alpha_{i-1})^2}{2\eta} \right)
\]
where α_{i−1} = θ + (X_{i−1} − θ) · exp(−βh) and η = (σ²/2β) · (1 − exp(−2βh)). Thus,
the likelihood function can be written as,

\[
L(\theta, \beta, \sigma \mid X) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\eta}} \cdot \exp\left( \frac{-(X_i - \alpha_{i-1})^2}{2\eta} \right) \tag{2.9}
\]
The log likelihood function is,

\[
l(\theta, \beta, \sigma \mid X) = -\frac{n}{2}\log(2\pi\eta) - \sum_{i=1}^{n} \frac{(X_i - \alpha_{i-1})^2}{2\eta} \tag{2.10}
\]
For a simulation study, we generated a data set according to algorithm 2 using the
same parameter values as above. Based on such data, we used the built in mini-
mization function from Python to estimate the parameter values by minimizing the
negative of log-likelihood. This process is repeated 500 times and the histogram of
all estimates of the parameter β is presented in Figure 2.6. The dashed red line
represents the true value of the parameter β. Similarly, Figures 2.7 and 2.8 display the
histograms of all estimates of the parameters θ and σ, respectively. The dashed red
lines represent the true value of the parameters θ and σ.
[Plot: Histogram of est_beta; Frequency against Value]
Figure 2.6: Histogram of estimated values of β of the OU process as simulated above.
The dashed red line represents the true value of the parameter.
[Plot: Histogram of est_theta; Frequency against Value]
Figure 2.7: Histogram of estimated values of θ of the OU process as simulated above.
[Plot: Frequency against Value]
Figure 2.8: Histogram of estimated values of σ of the OU process as simulated above.
2.4.3 Parameter Estimation for OU Process using Ordinary Least Squares
We consider an OU process as represented by (2.6). Using the EM discretization
procedure, we can approximate the OU process as (2.7). This can be further simplified
as,
\[
X_{t+dt} = X_t (1 - \beta\,dt) + \beta\theta\,dt + \sigma\sqrt{dt}\,Z \tag{2.11}
\]
where Z ∼ N(0, 1) is a standard normal random variable. When represented
this way, equation (2.11) can be thought of as a normal linear model with independent
errors. This normal linear model is of the form Y = βX + ε, where Y is an N×1 vector
of Xt+dt values. Thus, we can estimate the coefficient vector β and then use that to
estimate the parameters of the OU process. If we compare (2.11) to an AR(1) model
whose equation is of the form Xi+1 = β0 + β1 Xi + ε, then we get β0 = βθdt and
β1 = (1 − βdt). In this case, we would get the same estimates of β0 and β1
as we would get from using the maximum likelihood procedure. This is true
because we have a normal linear model, and in the case of a normal linear model
β̂ols = β̂mle, i.e., the estimator obtained using ordinary least squares is the same as
the estimator obtained using maximum likelihood estimation. However, we would lose
some information, as the least squares estimates only use information from the second
observation onwards, whereas the maximum likelihood estimates use information from
the first observation itself.
Let ε̂i = Xi+1 − (β0 + β1 Xi) be the ith residual. The sum of squares of residuals (SSE)
is defined as,
\[
SSE = \sum_{i=1}^{N} \hat{\epsilon}_i^{\,2} = \sum_{i=1}^{N} X_{i+1}^2 + \sum_{i=1}^{N} (\beta_0 + \beta_1 X_i)^2 - 2 \sum_{i=1}^{N} X_{i+1} (\beta_0 + \beta_1 X_i) \tag{2.12}
\]
Now, we minimize the SSE in equation (2.12) with respect to the parameters β0 and β1. To do
this, we differentiate SSE with respect to each parameter and set the derivatives equal to zero.
On doing so, we obtain,
\[
\hat{\beta}_0 = \frac{\sum_{i=1}^{N} X_{i+1} - \hat{\beta}_1 \sum_{i=1}^{N} X_i}{N} \tag{2.13}
\]
\[
\hat{\beta}_1 = \frac{N \sum_{i=1}^{N} X_{i+1} X_i - \sum_{i=1}^{N} X_i \sum_{i=1}^{N} X_{i+1}}{N \sum_{i=1}^{N} X_i^2 - \left( \sum_{i=1}^{N} X_i \right)^2} \tag{2.14}
\]
The data generation process and the true parameter values used to generate data
were identical to those in the previous section. After getting the least squares
estimates, the estimates of the OU process parameters were obtained as follows:
β̂ = (1 − β̂1)/dt, θ̂ = β̂0/(1 − β̂1), σ̂ = se(ε̂)/√dt, where se(ε̂) is the standard
error of the residuals, which by (2.11) estimates σ√dt.
[Plot: Frequency against Value]
Figure 2.9: Histogram of estimated values of β of the OU process using least squares
approximation. The dashed red line represents the true value of the parameter.
[Plot: Histogram of est_theta; Frequency against Value]
Figure 2.10: Histogram of estimated values of θ of the OU process using least squares
approximation. The dashed red line represents the true value of the parameter.
[Plot: Frequency against Value]
Figure 2.11: Histogram of estimated values of σ of the OU process using least squares
approximation. The dashed red line represents the true value of the parameter.
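The least squares recipe of Section 2.4.3, i.e., equations (2.13) and (2.14) followed by the mapping back to (β, θ, σ), can be sketched as below. Several choices here are assumptions made for the example: the data are generated from the recursion (2.11) itself with dt = 0.01, the residual standard error is divided by √dt to recover σ (as (2.11) suggests), and the name `ou_ols` is hypothetical.

```python
import math
import random


def ou_ols(x, dt):
    """Fit X[i+1] = b0 + b1 * X[i] + eps by least squares, then map to (beta, theta, sigma)."""
    prev, nxt = x[:-1], x[1:]
    n = len(prev)
    sx, sy = sum(prev), sum(nxt)
    sxx = sum(a * a for a in prev)
    sxy = sum(a * b for a, b in zip(prev, nxt))
    b1 = (n * sxy - sx * sy) / (n * sxx - sx**2)  # equation (2.14)
    b0 = (sy - b1 * sx) / n                       # equation (2.13)
    resid = [b - (b0 + b1 * a) for a, b in zip(prev, nxt)]
    se = math.sqrt(sum(r * r for r in resid) / (n - 2))
    beta_hat = (1.0 - b1) / dt
    theta_hat = b0 / (1.0 - b1)
    sigma_hat = se / math.sqrt(dt)  # residual sd estimates sigma * sqrt(dt) under (2.11)
    return beta_hat, theta_hat, sigma_hat


# Data from the Euler-Maruyama recursion (2.11); dt and the path length are assumed values.
rng = random.Random(2)
beta, theta, sigma, dt = 3.5, 0.7, 0.1, 0.01
x = [theta]
for _ in range(20000):
    noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    x.append(x[-1] * (1.0 - beta * dt) + beta * theta * dt + noise)

beta_hat, theta_hat, sigma_hat = ou_ols(x, dt)
print(beta_hat, theta_hat, sigma_hat)
```

When the data come from the exact OU transition rather than from (2.11), the same regression is still valid, but the mapping β̂ = (1 − β̂1)/dt is only a first-order approximation of β̂ = −log(β̂1)/dt, so some bias is to be expected for coarse time steps.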
2.5 Cox-Ingersoll-Ross Process
The Cox-Ingersoll-Ross (CIR) model [11] was introduced in 1985 by John C. Cox,
Jonathan E. Ingersoll and Stephen A. Ross in order to improve the existing Vasicek
model which allowed for negative interest rates. Earlier, the OU model was used to
model interest rates rt . But, the fundamental problem with that approach was that
the change in rt assumed a constant volatility σ regardless of what happened in the
economy. There is empirical evidence that suggests that ∆rt is more volatile, if rt is
high and it is not so volatile if rt is low, i.e. the change in interest rates would be more
volatile if the interest rates themselves are very high and that change is relatively less
volatile if the interest rates are relatively lower. Also, the interest rates can never be
negative but if modeled using an OU process, they can assume negative values with
some positive probability. For these reasons, the CIR model was preferred for modeling
interest rates: its volatility scales with the level of the rate, and it does not produce
negative rates.
Definition 8. Let X(t) be a stochastic process and α, β, σ ∈ R+ be constants.
If X(t) satisfies the following stochastic differential equation,
\[
dX(t) = \alpha(\beta - X(t))\,dt + \sigma\sqrt{X(t)}\,dW(t), \quad \alpha, \beta, \sigma \in \mathbb{R}^{+} \tag{2.15}
\]
then X(t) is said to be a CIR process.
In equation (2.15), dX(t) is the infinitesimal change in X(t), α is the rate of mean
reversion, β is the long term mean of the process, which is also known as the asymptotic
mean, σ > 0 is the volatility and dW(t) is Gaussian noise. The drift function is linear
and has a mean reverting tendency, because of which the CIR process is also a mean
reverting process. The diffusion function is proportional to √X(t) and thus helps in
ensuring that the process never becomes negative. If all the process parameters, i.e.,
σ, α and β, are positive and 2αβ ≥ σ² (Feller’s condition), then the CIR process is
well-defined and remains strictly positive.
The transition density of X(t) given X(s) is,

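\[
f(X(t) \mid X(s)) = c \exp(-u - v) \left( \frac{v}{u} \right)^{q/2} I_q(2\sqrt{uv}), \quad s < t \tag{2.16}
\]
where,
\[
c = \frac{2\alpha}{\sigma^2 [1 - \exp(-\alpha(t - s))]}, \quad
u = c X(s) e^{-\alpha(t - s)}, \quad
v = c X(t), \quad
q = \frac{2\alpha\beta}{\sigma^2} - 1,
\]
and I_q(2√(uv)) is the modified Bessel function of the first kind and of order q. We use
the transformation S(t) = 2cX(t). Thus, the transition density of S(t) given S(s) is,
\[
f(S(t) \mid S(s)) = \frac{1}{2c} f(X(t) \mid X(s)), \quad s < t.
\]
Here, f(S(t)|S(s)) is the density of a non-central χ² distribution with 2q + 2 degrees of
freedom and 2u as the non-centrality parameter, where the non-centrality parameter is
measured as the sum of squared means (this corresponds to λ = u in the convention of
Proposition 3).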
2.5.1 Simulation of CIR process
Proposition 2. Let Z1, Z2, . . . , Zk ∼ N(0, 1) be independent random variables, then
U = Z1² + Z2² + . . . + Zk² ∼ χ²_k(0), where χ²_k(0) is a (central) chi-squared distribution
with k degrees of freedom.
Let U ∼ χ²_k(0). Then, the probability density function of the random variable U
is,
\[
f_U(u) = \frac{u^{k/2 - 1} \exp(-u/2)}{2^{k/2}\, \Gamma(k/2)}, \quad u > 0
\]
where Γ(x) = ∫_0^∞ t^{x−1} exp(−t) dt is the gamma function. It is known that Γ(n) =
(n − 1)! for an integer n > 0. The moment generating function of U is,
\[
M_U(t) = E(\exp(tU)) = (1 - 2t)^{-k/2}, \quad |t| < 1/2.
\]
Proposition 3. Let Zj ∼ N(µj, 1) for j = 1, 2, . . . , k be independent
random variables, then U = Z1² + Z2² + . . . + Zk² ∼ χ²_k(λ), where χ²_k(λ) is a non-central
chi-squared distribution with k degrees of freedom and non-centrality parameter λ,
where
\[
\lambda = \frac{1}{2} \sum_{j=1}^{k} \mu_j^2 .
\]
Let V ∼ χ²_k(λ). Then, the probability density function of the random variable V
is,
\[
f_V(v) = \sum_{j=0}^{\infty} \left[ \frac{\exp(-\lambda)\,\lambda^j}{j!} \cdot \frac{v^{(j + k/2) - 1} \exp(-v/2)}{2^{\,j + k/2}\, \Gamma(j + k/2)} \right], \quad v > 0 \tag{2.17}
\]
The moment generating function of V is,
\[
M_V(t) = E(\exp(tV)) = \exp\left( \frac{2\lambda t}{1 - 2t} \right) (1 - 2t)^{-k/2}, \quad |t| < 1/2
\]
We note that equation (2.17) is a mixture of Poisson and Gamma distributions. The
non-centrality parameter λ is equal to 0 if and only if µj = 0 for all j = 1, 2, . . . , k.
Note that a random variable V ∼ χ²_k(λ) can be simulated using the following hierarchy:
\[
V \mid Y \sim \chi^2_{k + 2Y}(0), \qquad Y \sim \mathrm{Poisson}(\lambda)
\]
We can use the law of iterated expectations to calculate E(V ) and V ar(V ).
E(V ) = E[E(V |Y )] = E(k + 2Y ) = k + 2E(Y ) = k + 2λ
Similarly, the variance of V is,
V ar(V ) = V ar(E(V |Y )) + E(V ar(V |Y ))
= V ar(k + 2Y ) + E(2(k + 2Y ))
= 4λ + 2k + 4λ = 2(k + 4λ)
The characteristic function of V is,
\[
\phi(t) = E(\exp\{itV\}) = \frac{\exp\{2\lambda i t / (1 - 2it)\}}{(1 - 2it)^{k/2}}, \quad |2it| < 1 \tag{2.18}
\]
It can be shown using equation (2.18) that if we have two independent random
variables V1 ∼ χ²_{k1}(λ1) and V2 ∼ χ²_{k2}(λ2) then,
\[
V_1 + V_2 \overset{d}{=} \chi^2_{k_1 + k_2}(\lambda_1 + \lambda_2) \tag{2.19}
\]
The above also holds true for any finite number of independent non-central
chi-squared random variables. Equation (2.19) implies that the sum of independent
random variables which follow non-central chi-squared distributions is equal in
distribution to another random variable which follows a non-central chi-squared
distribution. In particular, if we have a random variable V ∼ χ²_k(λ) then,
\[
V \overset{d}{=} \chi^2_1(\lambda) + \chi^2_{k-1}(0), \quad k > 1 \tag{2.20}
\]
It is important to understand that, with λ defined as in Proposition 3,
\[
\chi^2_1(\lambda) \overset{d}{=} \left[ N(\sqrt{2\lambda}, 1) \right]^2 \overset{d}{=} \left( N(0, 1) + \sqrt{2\lambda} \right)^2
\]
Equation (2.20) implies that a random variable which follows a non-central
chi-squared distribution is equal in distribution to the sum of two independent random
variables: one following a central chi-squared distribution and the other being the
square of a shifted standard normal random variable.
Proposition 4. Assume that k > 1 and let Z ∼ N(0, 1) be independent of the
chi-squared term. Then, it is true that,
\[
\chi^2_k(\lambda) \overset{d}{=} \left( Z + \sqrt{2\lambda} \right)^2 + \chi^2_{k-1}(0) .
\]
Therefore, when the degrees of freedom k > 1, sampling from a non-central
chi-squared distribution is equivalent to sampling from a central chi-squared
distribution and an independent normal distribution. This sampling method is not
computationally intensive and is generally efficient. When 0 < k < 1, we cannot use the
above mentioned method to sample from a non-central chi-squared distribution, because
the central term would need k − 1 < 0 degrees of freedom. If
0 < k < 1, a non-central chi-squared distribution can be sampled using a central
chi-squared distribution with random degrees of freedom.
Let Y ∼ Poisson(λ) be a random variable. The probability mass function (pmf) of Y
is,
\[
P\{Y = y\} = \exp(-\lambda)\, \frac{\lambda^y}{y!}, \quad y = 0, 1, 2, \ldots
\]
Conditional on the value of Y = y, let U follow a central chi-squared distribution
with k + 2y degrees of freedom, i.e., U | Y ∼ χ²_{k+2Y}(0), whose CDF is,
\[
P\{U \le u \mid Y = y\} = \frac{1}{2^{(k/2) + y}\, \Gamma[(k/2) + y]} \int_0^u \exp\{-z/2\}\, z^{(k/2) + y - 1}\, dz \tag{2.21}
\]
The unconditional cumulative distribution of U is,
\[
P\{U \le u\} = \sum_{y=0}^{\infty} P\{Y = y\}\, P\{U \le u \mid Y = y\} = \sum_{y=0}^{\infty} \exp(-\lambda)\, \frac{\lambda^y}{y!}\, P\{U \le u \mid Y = y\} \tag{2.22}
\]
Equation (2.22) is the CDF of a non-central chi-squared distribution with k degrees
of freedom and non-centrality parameter λ.
Proposition 5. Assume that 0 < k < 1 and Y ∼ Poisson(λ). Then, it is true that,
\[
\chi^2_k(\lambda) \overset{d}{=} \chi^2_{k + 2Y}(0) .
\]
Therefore, when the degrees of freedom are less than 1, we can sample from a
non-central chi-squared distribution by first generating a Poisson random variable Y with
parameter λ and then sampling from a central chi-squared distribution with k + 2Y
degrees of freedom. Even though this hierarchical model to sample from a non-central
chi-squared distribution produces unbiased results, it is usually computationally
intensive.
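The two sampling routes can be compared side by side. The sketch below is illustrative only: `lam` follows the convention of Proposition 3 (half the sum of squared means, so the Poisson mixing variable has mean `lam` and the normal shift is √(2·lam)), the function names are hypothetical, and the Poisson generator is Knuth's simple multiplication method, which is only adequate for small `lam`.

```python
import math
import random


def central_chi2(k, rng):
    """Central chi-squared with k > 0 degrees of freedom, i.e. Gamma(shape k/2, scale 2)."""
    return rng.gammavariate(k / 2.0, 2.0)


def poisson(lam, rng):
    """Knuth's multiplication method; suitable for small lam only."""
    limit, count, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count


def ncx2_normal(k, lam, rng):
    """Route for k > 1: squared shifted normal plus an independent central chi-squared."""
    z = rng.gauss(0.0, 1.0)
    return (z + math.sqrt(2.0 * lam)) ** 2 + central_chi2(k - 1.0, rng)


def ncx2_poisson(k, lam, rng):
    """Hierarchical route: central chi-squared with k + 2Y degrees of freedom, Y ~ Poisson(lam)."""
    return central_chi2(k + 2.0 * poisson(lam, rng), rng)


rng = random.Random(3)
k, lam, n = 3.0, 2.0, 20000
draws_normal = [ncx2_normal(k, lam, rng) for _ in range(n)]
draws_poisson = [ncx2_poisson(k, lam, rng) for _ in range(n)]
mean_normal = sum(draws_normal) / n
mean_poisson = sum(draws_poisson) / n
print(mean_normal, mean_poisson)  # both should be close to E(V) = k + 2 * lam
```

The hierarchical route also works for k > 1; the normal-shift route is usually preferred there because it avoids generating the Poisson variable.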
Algorithm 3. (Simulation of the CIR process)
• Set the process parameters, i.e., total time period (T), number of steps (N)
= 1000, number of simulations (n) = 500, α = 0.9, β = 4.0, σ = 1.5.
• Recursively simulate X(t + ∆t) using the transition distribution given in (2.16).
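One step of the exact CIR recursion can be sketched as follows, using the fact stated earlier that 2cX(t + ∆t) given X(t) is non-central chi-squared with 2q + 2 degrees of freedom. The sketch assumes more than one degree of freedom (true for the parameters of algorithm 3), so the normal-shift decomposition applies. T = 10, X(0) = β and the reduced number of paths are assumptions made for the example, and `cir_step` is a hypothetical name.

```python
import math
import random


def cir_step(x, alpha, beta, sigma, dt, rng):
    """Draw X(t + dt) given X(t) = x from the exact CIR transition law."""
    c = 2.0 * alpha / (sigma**2 * (1.0 - math.exp(-alpha * dt)))
    u = c * x * math.exp(-alpha * dt)
    dof = 4.0 * alpha * beta / sigma**2  # equals 2q + 2
    if dof <= 1.0:
        raise ValueError("this sketch only covers more than one degree of freedom")
    z = rng.gauss(0.0, 1.0)
    # S = 2cX(t + dt): squared shifted normal plus central chi-squared with dof - 1
    s = (z + math.sqrt(2.0 * u)) ** 2 + rng.gammavariate((dof - 1.0) / 2.0, 2.0)
    return s / (2.0 * c)


rng = random.Random(4)
alpha, beta, sigma = 0.9, 4.0, 1.5
T, n_steps, n_paths = 10.0, 1000, 400  # T and the path count are assumed values
dt = T / n_steps
terminal = []
for _ in range(n_paths):
    x = beta  # assumed starting value X(0)
    for _ in range(n_steps):
        x = cir_step(x, alpha, beta, sigma, dt, rng)
    terminal.append(x)
mean_terminal = sum(terminal) / n_paths
print(mean_terminal)  # the conditional mean stays at beta when X(0) = beta
```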
[Plot: X(t) against t]
Figure 2.12: Simulated paths of the CIR process with parameters as described above.
2.5.2 Parameter Estimation for CIR Process using Maximum Likelihood
Let {X(t) : t ≥ 0} be a CIR process as defined in (2.15). Assume that we
observe this process at a discrete collection of time points {t0 , t1 , . . . , tn } where, t0 =
0, ti = iT /n for i = 1, 2, . . . , n. Let X = {X(t0 ), X(t1 ), . . . , X(tn )} be the data. For
simplicity, we use Xi = X(ti ). Let θ = (α, β, σ). Given that this process is Markovian,
the likelihood function is,

\[
L(\boldsymbol{\theta} \mid X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} f(X_i \mid X_{i-1})
\]
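where f(Xi|Xi−1) is the transition density. The transition density for a CIR process
is,
\[
f(X_i \mid X_{i-1}) = c \exp(-u_{i-1} - v_i) \left( \frac{v_i}{u_{i-1}} \right)^{q/2} I_q(2\sqrt{u_{i-1} v_i}) \tag{2.23}
\]
where,
\[
c = \frac{2\alpha}{\sigma^2 [1 - \exp(-\alpha\,dt)]}, \quad
u_{i-1} = c X_{i-1} e^{-\alpha\,dt}, \quad
v_i = c X_i, \quad
q = \frac{2\alpha\beta}{\sigma^2} - 1,
\]
and I_q is the modified Bessel function of the first kind and of order q. The log
likelihood function is,
\[
\begin{aligned}
l(\alpha, \beta, \sigma \mid X) &= \sum_{i=1}^{n} \log f(X_i \mid X_{i-1}) \\
&= n \log(c) + \sum_{i=1}^{n} \left[ -u_{i-1} - v_i + \frac{q}{2} \log\frac{v_i}{u_{i-1}} + \log\left( I_q(2\sqrt{u_{i-1} v_i}) \right) \right]
\end{aligned} \tag{2.24}
\]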
where c, ui−1, vi = cXi, q and Iq are as defined above. For a simulation study, we
generated a data set according to algorithm 3 using θ = (0.9, 4.0, 1.5). Based on such
data, we used the built in minimization function from Python to estimate the
parameter values by minimizing the negative of log-likelihood. This process is repeated 500
times and the histogram of all estimates of the parameter α is presented in Figure
2.13. The dashed red line represents the true value of the parameter α. Similarly,
Figures 2.14 and 2.15 display the histograms of all estimates of the parameters β and
σ, respectively. The dashed red lines represent the true values of the parameters β and
σ.
[Plot: Histogram of est_alpha; Frequency against Value]
Figure 2.13: Histogram of estimated values of α of the CIR process. The dashed red
line represents the true value of the parameter.
[Plot: Frequency against Value]
Figure 2.14: Histogram of estimated values of β of the CIR process. The dashed red
line represents the true value of the parameter.
[Plot: Frequency against Value]
Figure 2.15: Histogram of estimated values of σ of the CIR process. The dashed red
line represents the true value of the parameter.
2.6 Generalized Cox-Ingersoll-Ross model
Definition 9. Let Xt be a stochastic process and α, β, σ ∈ R+ and γ ∈ (0, 1) be
constants. If Xt satisfies the following stochastic differential equation,
\[
dX_t = \alpha(\beta - X_t)\,dt + \sigma X_t^{\gamma}\,dW_t, \quad \alpha, \beta, \sigma \in \mathbb{R}^{+},\; \gamma \in (0, 1) \tag{2.25}
\]
then Xt is said to be a generalized CIR process [10].
Let {X(t) : t ≥ 0} be a stochastic process. Assume that we observe this process
at a discrete collection of time points {t0 , t1 , . . . , tn } where, t0 = 0, ti = iT /n for
i = 1, 2, . . . , n. Let X = {X(t0 ), X(t1 ), . . . , X(tn )} be the data. For simplicity, we
use Xi = X(ti ). Let θ = (α, β, γ, σ). The likelihood function is,

\[
L(\boldsymbol{\theta} \mid X) = \prod_{i=1}^{n} f(X_i \mid X_{i-1})
\]
where f(Xi|Xi−1) is the transition density.
The exact likelihood function does not have a closed form, so we
use a Gaussian approximation, which works relatively well for small intervals of time
(∆t). In order to get accurate results, we would like the time
interval (∆t) to be as small as possible.
So, using the Gaussian approximation we have,
\[
X_{i+1} \mid X_i \sim N\left( X_i + \alpha(\beta - X_i)\,dt,\; \sigma^2 X_i^{2\gamma}\,dt \right). \tag{2.26}
\]
This holds because we assume that W(t + dt) − W(t) ∼ N(0, dt).
Algorithm 4. (Simulation of the generalized CIR process)
• Set the process parameters, i.e., total time period (T), number of steps (N)
= 1000, number of simulations (n) = 1000, α = 0.5, β = 3, σ = 0.1, γ = 0.2.
• Recursively calculate X(t + ∆t) using the distribution given in (2.26).
In Figure 2.16, we show 50 simulated paths according to algorithm 4.
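Algorithm 4 can be sketched with the Gaussian approximation (2.26). T = 10, X(0) = β and the number of paths are assumptions made for the example; the truncation max(x, 0) inside the power is also an added safeguard (not part of (2.26)) so that a rare negative value does not produce a complex number. The name `simulate_gcir` is hypothetical.

```python
import math
import random


def simulate_gcir(x0, alpha, beta, sigma, gamma, T, n_steps, rng):
    """Simulate one generalized CIR path using the Gaussian approximation (2.26)."""
    dt = T / n_steps
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        mean = x + alpha * (beta - x) * dt
        # max(x, 0.0) is an added safeguard, not part of (2.26)
        sd = sigma * max(x, 0.0) ** gamma * math.sqrt(dt)
        path.append(rng.gauss(mean, sd))
    return path


rng = random.Random(5)
# T = 10 and X(0) = 3 are assumed values; the other parameters follow algorithm 4.
paths = [simulate_gcir(3.0, 0.5, 3.0, 0.1, 0.2, T=10.0, n_steps=1000, rng=rng) for _ in range(200)]
mean_terminal = sum(p[-1] for p in paths) / len(paths)
print(mean_terminal)  # should stay near the long term mean beta = 3
```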
[Plot: X(t) against t]
Figure 2.16: Simulated paths of the generalized CIR process with parameters as
described above
2.6.1 Parameter Estimation for generalized CIR Process using Maximum Likelihood
We calculate the likelihood function using this Gaussian approximation. Thus,
the likelihood function can be written as,
\[
L(\alpha, \beta, \sigma, \gamma \mid X) = \prod_{i=1}^{N} \frac{1}{\sqrt{\pi \eta X_i^{2\gamma}}} \cdot \exp\left( \frac{-(X_{i+1} - X_i - \alpha(\beta - X_i)\,dt)^2}{\eta X_i^{2\gamma}} \right) \tag{2.27}
\]
where η = 2σ²dt. But instead of maximizing the likelihood function, we maximize
the log likelihood function l(α, β, σ, γ|X). The log likelihood function is,
\[
l(\alpha, \beta, \sigma, \gamma \mid X) = -\frac{N}{2} \log(\pi\eta) - \sum_{i=1}^{N} \left( \frac{(X_{i+1} - X_i - \alpha(\beta - X_i)\,dt)^2}{\eta X_i^{2\gamma}} + \gamma \log(X_i) \right) \tag{2.28}
\]
We maximize l(α, β, γ, σ|X) in equation (2.28) to get estimates for the parameters.
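Figure 2.17: Histogram of estimated values of α of the generalized CIR process
using normal approximation. The dashed red line represents the true value of the
parameter.
Figure 2.18: Histogram of estimated values of β of the generalized CIR process
using normal approximation. The dashed red line represents the true value of the
parameter.
Figure 2.19: Histogram of estimated values of σ of the generalized CIR process
using normal approximation. The dashed red line represents the true value of the
parameter.
Figure 2.20: Histogram of estimated values of γ of the generalized CIR process
using normal approximation. The dashed red line represents the true value of the
parameter.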
2.6.2 Distribution of ∫_{t1}^{t2} W(s) ds
In this subsection, we derive the distribution of the quantity ∫_{t1}^{t2} W(s) ds, where
{W(t)} is a standard BM process. This distribution will play a significant role later
in this thesis. The distribution of ∫_0^t W(s) ds is a special case of the distribution
of ∫_{t1}^{t2} W(s) ds. Let us start by finding the mean and variance of ∫_0^t W(s) ds. Letting
f(x) = x³ and applying Itô’s Lemma (proposition 1), we get,
\[
W^3(t) = 3 \int_0^t W^2(s)\, dW(s) + 3 \int_0^t W(s)\, ds
\]
\[
\int_0^t W(s)\, ds = \frac{1}{3} W^3(t) - \int_0^t W^2(s)\, dW(s).
\]
Since E[W³(t)] = 0 and the Itô integral has mean zero, the mean of ∫_0^t W(s) ds is,
\[
E\left[ \int_0^t W(s)\, ds \right] = 0.
\]
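The variance of ∫_0^t W(s) ds is,
\[
\begin{aligned}
Var\left[ \int_0^t W(s)\, ds \right] &= E\left[ \left( \int_0^t W(s)\, ds \right)^2 \right] = E\left[ \int_0^t \int_0^t W(s) W(u)\, du\, ds \right] \\
&= \int_0^t \int_0^t E[W(s) W(u)]\, du\, ds = \int_0^t \int_0^t \min(s, u)\, du\, ds \\
&= \int_0^t \int_0^s u\, du\, ds + \int_0^t \int_s^t s\, du\, ds = t^3/3.
\end{aligned}
\]
Thus, ∫_0^t W(s) ds is a random variable that has mean 0 and variance t³/3. Moreover,
∫_0^t W(s) ds is a random variable that has a normal distribution, so,
\[
\int_0^t W(s)\, ds \sim N\left( 0, \frac{t^3}{3} \right).
\]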
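Once we’ve found the mean and variance of ∫_0^t W(s) ds, we move on to the more
general case of finding the mean and variance of ∫_{t1}^{t2} W(s) ds. The mean of ∫_{t1}^{t2} W(s) ds
is,
\[
E\left[ \int_{t_1}^{t_2} W(s)\, ds \right] = \int_{t_1}^{t_2} E[W(s)]\, ds = 0.
\]
The variance of ∫_{t1}^{t2} W(s) ds is,
\[
\begin{aligned}
Var\left[ \int_{t_1}^{t_2} W(s)\, ds \right] &= E\left[ \left( \int_{t_1}^{t_2} W(s)\, ds \right)^2 \right] \\
&= \int_{t_1}^{t_2} \int_{t_1}^{t_2} E[W(s) W(u)]\, du\, ds = \int_{t_1}^{t_2} \int_{t_1}^{t_2} \min(s, u)\, du\, ds \\
&= \int_{t_1}^{t_2} \int_{t_1}^{s} u\, du\, ds + \int_{t_1}^{t_2} \int_{s}^{t_2} s\, du\, ds \\
&= \int_{t_1}^{t_2} \left( \frac{s^2}{2} - \frac{t_1^2}{2} \right) ds + \int_{t_1}^{t_2} s (t_2 - s)\, ds \\
&= \frac{2 t_1^3}{3} + \frac{t_2^3}{3} - t_1^2 t_2
\end{aligned}
\]
Thus, the variance of ∫_{t1}^{t2} W(s) ds is
\[
\frac{2 t_1^3}{3} + \frac{t_2^3}{3} - t_1^2 t_2. \tag{2.29}
\]
We note that,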
• If we let t1 = 0 and t2 = t then equation (2.29) reduces to the variance of
the special case described earlier, i.e., t³/3.
• If we let t1 = t and t2 = t then equation (2.29) reduces to 0.
Thus,
\[
\int_{t_1}^{t_2} W(s)\, ds \sim N\left( 0,\; \frac{2 t_1^3}{3} + \frac{t_2^3}{3} - t_1^2 t_2 \right).
\]
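The variance formula (2.29) can be checked numerically. The sketch below simulates Brownian paths on [t1, t2], approximates the integral with the trapezoidal rule, and compares the sample variance with the closed form. The interval [1, 2], the grid size and the function names are choices made for the example only.

```python
import math
import random


def integral_variance(t1, t2):
    """Closed-form variance (2.29) of the integral of W over [t1, t2]."""
    return 2.0 * t1**3 / 3.0 + t2**3 / 3.0 - t1**2 * t2


def sample_integral(t1, t2, n_steps, rng):
    """One draw of the integral of W over [t1, t2], approximated by the trapezoidal rule."""
    dt = (t2 - t1) / n_steps
    w = rng.gauss(0.0, math.sqrt(t1))  # W(t1) ~ N(0, t1)
    total = 0.0
    for _ in range(n_steps):
        w_next = w + rng.gauss(0.0, math.sqrt(dt))
        total += 0.5 * (w + w_next) * dt
        w = w_next
    return total


rng = random.Random(6)
t1, t2 = 1.0, 2.0
draws = [sample_integral(t1, t2, 100, rng) for _ in range(4000)]
sample_mean = sum(draws) / len(draws)
sample_var = sum((d - sample_mean) ** 2 for d in draws) / len(draws)
print(sample_mean, sample_var, integral_variance(t1, t2))
```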
Chapter 3: Approximate Bayesian Computing for Stochastic Volatility Models
3.1 Heston Model
In his 1993 paper “A Closed-Form Solution for Options with Stochastic Volatility
with Applications to Bond and Currency Options” [13], Heston proposed a new
stochastic volatility model, which now carries his name. The Heston model is used
extensively in estimating the volatility of financial assets or derivatives. This model
is an extension of the Black-Scholes model, i.e., the assumption is that underlying
asset price still evolves according to the Black-Scholes model but it also introduces a
stochastic behavior for the volatility component. That is, the model assumes that the
volatility component in the Black-Scholes model is not fixed but rather is governed
by another stochastic differential equation. In particular, the Heston model uses a
mean reverting CIR model to describe the evolution of the volatility.
Definition 10. Let {S(t) : t ≥ 0} be the price of the asset and {ν(t) : t ≥ 0} be the
variance process. The equations governing the Heston model are,
\[
dS(t) = \mu S(t)\,dt + \sqrt{\nu(t)}\, S(t)\, dW^S(t) \tag{3.1}
\]
\[
d\nu(t) = \alpha(\beta - \nu(t))\,dt + \sigma\sqrt{\nu(t)}\, dW^{\nu}(t) \tag{3.2}
\]
where W S (t) and W ν (t) are correlated standard BM processes with the correlation
between them given by ρ ∈ [−1, 1], µ is called the risk-free rate, dS(t) is the infinites-
imal change in S(t), the price of the underlying asset, α is the rate of mean reversion,
β is the long term mean of the CIR process which is also known as the asymptotic
mean, σ > 0 is the volatility of the CIR process.
The Heston model has certain desirable properties which make it a useful model.
Under the Heston model, volatility is modeled as a mean reverting process. This
assumption of the Heston model is also corroborated by observing its behavior in the
financial markets. If the volatility of an asset was not mean reverting, there would be
many assets whose volatility would be close to zero or very high. However, in practice
the probability of occurrence of these cases is very low and short lived.
The Heston model also associates asset prices with volatility by introducing correlated
shocks between the two. This assumption is particularly useful as it helps us to model
the statistical dependence between an asset and its volatility. Empirical evidence [21],
[14] shows that in an equity market, the volatility and the change in price of an asset
are inversely related, i.e., falls in asset prices tend to be accompanied by increases in volatility.
However, the flexibility that the Heston framework provides comes at the expense of
increased model complexity. It is generally difficult to implement the Heston model
as compared to the Black-Scholes model and there is always a tradeoff between the
two models in terms of complexity and accuracy. The Heston model is generally more
complex but also more accurate.
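Proposition 6. Let dW^ν(t) ∼ N(0, dt) and dW^S(t) = ρ dW^ν(t) + √(1 − ρ²) dZ(t),
where dZ(t) ∼ N(0, dt) is independent of dW^ν(t).
Then,
\[
\begin{aligned}
Var[dW^{\nu}(t)] &= dt \\
Var[dW^{S}(t)] &= \rho^2\, Var[dW^{\nu}(t)] + (1 - \rho^2)\, Var[dZ(t)] = dt \\
Cov[dW^{S}(t), dW^{\nu}(t)] &= \rho\, Cov[dW^{\nu}(t), dW^{\nu}(t)] + \sqrt{1 - \rho^2}\, Cov[dW^{\nu}(t), dZ(t)] \\
&= \rho\, Var[dW^{\nu}(t)] \\
&= \rho\, dt
\end{aligned}
\]
Hence the correlation between dW^S(t) and dW^ν(t) is equal to ρ. Let X(t) = log(S(t)).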
Using Itô’s Lemma (proposition 1) we can rewrite the equation (3.1) as,
\[
dX(t) = \left[ \mu S(t) \cdot \frac{1}{S(t)} + \frac{\nu(t) S^2(t)}{2} \cdot \left( \frac{-1}{S^2(t)} \right) \right] dt + \sqrt{\nu(t)} \cdot S(t) \cdot \frac{1}{S(t)}\, dW^X(t)
\]
\[
dX(t) = \left( \mu - \frac{\nu(t)}{2} \right) dt + \sqrt{\nu(t)}\, dW^X(t)
\]
Thus, after using Itô’s lemma we get the following set of equations,
\[
dX(t) = \left( \mu - \frac{\nu(t)}{2} \right) dt + \sqrt{\nu(t)}\, dW^X(t) \tag{3.3}
\]
\[
d\nu(t) = \alpha(\beta - \nu(t))\,dt + \sigma\sqrt{\nu(t)}\, dW^{\nu}(t) \tag{3.4}
\]
where dW^X(t) = dW^S(t) and all the other parameters have the usual meanings.
Feller’s Condition - It can be seen from equations (3.3) and (3.4) that ν(t) is
under the square root sign. Thus, we require ν(t) to be non-negative. Feller proposed
a condition which guarantees that ν(t) would be non-negative. If 2αβ ≥ σ 2 , then ν(t)
takes non-negative values.
3.1.1 Simulation of sample paths of the Heston Model
There have been extensive studies on how to simulate sample paths of a Heston
model. The basic idea is to partition a time interval into equally spaced intervals and
then simulate asset price paths for a given partition. Apart from the generic E-M
discretization and the Milstein scheme, Broadie and Kaya’s [8] algorithm is also
popular. There have been several modifications to Broadie and Kaya’s algorithm, such as
Smith’s approximation [23], Broadie and Kaya’s drift interpolation [25], Andersen’s
quadratic exponential scheme [6], and Tse and Wan’s inverse Gaussian scheme [24]. In this
project, we use the exact scheme by Broadie and Kaya [8] but we estimate the integrals using
Riemann sums. This is slightly different from the work done by A. Van Haastrecht
and A. Pelsser [25], who use the trapezoidal rule to estimate the integrals.
3.1.2 Euler-Maruyama (EM) Approximation
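The Euler-Maruyama (EM) algorithm is an easily implementable approximation
which can be used to approximate any SDE. The original process X(t) is approximated
by another process X̃(t) which is defined in the following way,
\[
\tilde{X}(t + \Delta t) = \tilde{X}(t) + \left[ \mu - \frac{1}{2}\tilde{\nu}(t) \right] \Delta t + \sqrt{\tilde{\nu}(t)\,\Delta t}\; Z_X
\]
\[
\tilde{\nu}(t + \Delta t) = \tilde{\nu}(t) + \alpha\left[ \beta - \tilde{\nu}(t) \right] \Delta t + \sigma\sqrt{\tilde{\nu}(t)\,\Delta t}\; Z_{\nu}
\]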
where ν̃(t) is another process approximating the process ν(t). In between any two time
points t, t+∆t, the processes X̃(·) and ν̃(·) are defined via a linear-interpolation of the
values defined through the above equations. Above, ZX and Zν are standard normal
random variables such that the correlation between them is ρ i.e. Corr(ZX , Zν ) = ρ.
In practice, this algorithm is not robust. The approximating variance process ν̃ has
a positive probability of becoming negative, particularly when Feller’s condition is
violated, and the square root in the update is then undefined. In addition, the
Gaussian approximation above is valid only
when ∆t is very small. To circumvent this problem, Lord, Koekkoek and van Dijk
[17] propose a modification to the EM algorithm.
3.1.3 Euler-Maruyama scheme with Lord et al.’s modification
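The equations of the modified EM algorithm are,
\[
\tilde{X}(t + \Delta t) = \tilde{X}(t) + \left[ \mu - \frac{1}{2} f(\tilde{\nu}(t)) \right] \Delta t + \sqrt{f(\tilde{\nu}(t))\,\Delta t}\; Z_X
\]
\[
\tilde{\nu}(t + \Delta t) = \tilde{\nu}(t) + \alpha\left[ \beta - f(\tilde{\nu}(t)) \right] \Delta t + \sigma\sqrt{f(\tilde{\nu}(t))\,\Delta t}\; Z_{\nu}
\]
where f(z) = max(0, z). If the variance process ν̃ becomes negative, it corrects itself
with a deterministic upward drift of αβ.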
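A minimal sketch of the modified scheme is given below, with the correlated normals built as in Proposition 6. All parameter values are hypothetical and chosen so that Feller's condition is violated, which makes negative excursions of ν̃ likely; the square roots use f(ν̃) = max(0, ν̃), and the function name `heston_full_truncation` is an assumption of this example.

```python
import math
import random


def heston_full_truncation(x0, v0, mu, alpha, beta, sigma, rho, dt, n_steps, rng):
    """Simulate log-price X and variance nu with the truncated Euler-Maruyama update."""
    x, v = x0, v0
    xs, vs = [x], [v]
    for _ in range(n_steps):
        z_nu = rng.gauss(0.0, 1.0)
        # Corr(Z_X, Z_nu) = rho, constructed as in Proposition 6
        z_x = rho * z_nu + math.sqrt(1.0 - rho**2) * rng.gauss(0.0, 1.0)
        v_plus = max(v, 0.0)  # f(z) = max(0, z)
        x += (mu - 0.5 * v_plus) * dt + math.sqrt(v_plus * dt) * z_x
        v += alpha * (beta - v_plus) * dt + sigma * math.sqrt(v_plus * dt) * z_nu
        xs.append(x)
        vs.append(v)
    return xs, vs


rng = random.Random(7)
# Hypothetical parameter values; 2 * alpha * beta < sigma^2, so Feller's condition fails.
mu, alpha, beta, sigma, rho = 0.05, 2.0, 0.04, 1.0, -0.7
n_steps, dt = 1000, 0.01
xs, vs = heston_full_truncation(math.log(100.0), 0.04, mu, alpha, beta, sigma, rho, dt, n_steps, rng)
print(min(vs), xs[-1])
```

Whenever ν̃ is negative, both the diffusion term and the mean-reverting part of the drift are switched off, so the next value is exactly ν̃ + αβ∆t, which is the deterministic upward drift described above.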
3.1.4 Milstein scheme
The Milstein scheme is very similar to the EM algorithm. However, the Milstein
scheme uses a second-order approximation to the SDE whereas the EM algorithm
uses a first-order approximation or linear approximation to the SDE.
The algorithm under the Milstein scheme is,

\[
\tilde{X}(t+\Delta t) = \tilde{X}(t) + \left[\mu - \frac{1}{2}\tilde{\nu}(t)\right]\Delta t + \sqrt{\tilde{\nu}(t)\,\Delta t}\; Z_X ,
\]
\[
\tilde{\nu}(t+\Delta t) = \tilde{\nu}(t) + \alpha\left[\beta - f(\tilde{\nu}(t))\right]\Delta t + \sigma\sqrt{\tilde{\nu}(t)\,\Delta t}\; Z_\nu + \frac{\sigma^2}{4}\left(Z_\nu^2 - 1\right)\Delta t ,
\]
where f(z) = max(0, z). It is important to know that ν(t + ∆t) > 0 if ν(t) > 0
and 4αβ ≥ σ². This fact was stated by Gartner in [12]. When this inequality is
not satisfied, it can still be shown that the occurrence of negative realizations of ν̃ is
greatly reduced as compared to the EM algorithm.
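A single Milstein update of the variance can be sketched as below. The sketch uses the standard Milstein correction (σ²/4)(Z² − 1)∆t for a diffusion coefficient σ√ν, and it assumes f is also applied under the square root; the function name and the numbers in the check are illustrative only. The check rewrites the update, for ν̃(t) > 0, as a completed square plus the terms (αβ − σ²/4)∆t − αν̃(t)∆t, which shows where the quantity 4αβ − σ² in the inequality above comes from.

```python
import numpy as np

def milstein_variance_step(v, alpha, beta, sigma, dt, z):
    """One Milstein step for the CIR variance; z is a standard normal draw."""
    v_plus = np.maximum(v, 0.0)  # f(nu~(t)), also used under the square root (assumption)
    return (v + alpha * (beta - v_plus) * dt
            + sigma * np.sqrt(v_plus * dt) * z
            + 0.25 * sigma**2 * dt * (z**2 - 1.0))

# Compare with the completed-square form on a grid of z values (made-up inputs)
v0, alpha, beta, sigma, dt = 0.04, 1.5, 0.04, 0.3, 0.01
z = np.linspace(-4.0, 4.0, 9)
v_next = milstein_variance_step(v0, alpha, beta, sigma, dt, z)
square_form = ((np.sqrt(v0) + 0.5 * sigma * np.sqrt(dt) * z) ** 2
               + (alpha * beta - 0.25 * sigma**2) * dt
               - alpha * v0 * dt)
```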
3.1.5 Broadie and Kaya’s Exact Algorithm
An exact simulation algorithm to simulate the Heston model is proposed by
Broadie and Kaya [8]. However, this algorithm is rarely used in practice as it is
computationally intensive. The solution to (3.1) can be written as,

\[
S(t+\Delta t) = S(t)\exp\left(\mu\Delta t - \frac{1}{2}\int_t^{t+\Delta t}\nu(u)\,du + \int_t^{t+\Delta t}\sqrt{\nu(u)}\,dW_S(u)\right)
\]
Using this and the transformation X = log(S), we get the following explicit solution
for X(t),
\[
X(t+\Delta t) = X(t) + \mu\Delta t - \frac{1}{2}\int_t^{t+\Delta t}\nu(u)\,du
+ \rho\int_t^{t+\Delta t}\sqrt{\nu(u)}\,dW_\nu(u) + \sqrt{1-\rho^2}\int_t^{t+\Delta t}\sqrt{\nu(u)}\,dW_X(u) \tag{3.5}
\]
where W_ν(u) and W_X(u) are values from two independent Brownian motions at time
u. If we integrate (3.4), we get,

\[
\nu(t+\Delta t) = \nu(t) + \int_t^{t+\Delta t}\alpha(\beta-\nu(u))\,du + \sigma\int_t^{t+\Delta t}\sqrt{\nu(u)}\,dW_\nu(u) \tag{3.6}
\]
Equation (3.6) can be re-written as,

\[
\int_t^{t+\Delta t}\sqrt{\nu(u)}\,dW_\nu(u) = \sigma^{-1}\left[\nu(t+\Delta t) - \nu(t) - \alpha\beta\Delta t + \alpha\int_t^{t+\Delta t}\nu(u)\,du\right]
\]
and then if we substitute the value of ∫_t^{t+∆t} √ν(u) dW_ν(u) into equation (3.5), we get,
\[
X(t+\Delta t) = X(t) + \mu\Delta t - \frac{1}{2}\int_t^{t+\Delta t}\nu(u)\,du + \frac{\rho}{\sigma}\left[\nu(t+\Delta t) - \nu(t) - \alpha\beta\Delta t\right]
+ \frac{\alpha\rho}{\sigma}\int_t^{t+\Delta t}\nu(u)\,du + \sqrt{1-\rho^2}\int_t^{t+\Delta t}\sqrt{\nu(u)}\,dW_X(u)
\]
Thus, we have to sample the following quantities in the required order,
1. ν(t + ∆t) given ν(t)
2. ∫_t^{t+∆t} ν(u) du given ν(t + ∆t), ν(t)
3. ∫_t^{t+∆t} √ν(u) dW_ν(u) given ∫_t^{t+∆t} ν(u) du
We know that a transformation of ν(t + dt) follows a scaled χ² distribution. So,
\[
\frac{n(dt)\,\nu(t+dt)}{\exp\{-\alpha\,dt\}}
\]
has a non-central χ² distribution with λ(t) as the non-centrality parameter and
\[
d = \frac{4\alpha\beta}{\sigma^2}
\]
degrees of freedom. Here,
\[
\lambda(t) = n(dt)\,\nu(t), \qquad
n(dt) = \frac{4\alpha\exp\{-\alpha\,dt\}}{\sigma^2\left(1-\exp\{-\alpha\,dt\}\right)} .
\]
To get a value for a future time step (t + dt), we sample from a non-central χ²
distribution with λ(t) as the non-centrality parameter and d as the degrees of freedom.
We use a built-in random number generator in the numpy module to achieve this.
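One way to carry out this draw with numpy is sketched below, using `Generator.noncentral_chisquare`. The function name `cir_exact_step` is hypothetical, and the numerical values in the usage lines (the parameters quoted in the caption of Figure 3.1, a starting variance of 0.3 and dt = 1) are only an illustration. The usage compares the sample mean with the known conditional mean of the CIR process, β + (ν(t) − β)exp{−α dt}.

```python
import numpy as np

def cir_exact_step(v, alpha, beta, sigma, dt, rng):
    """Draw nu(t + dt) given nu(t) = v from the scaled non-central chi-square law."""
    decay = np.exp(-alpha * dt)
    n = 4.0 * alpha * decay / (sigma**2 * (1.0 - decay))  # n(dt)
    d = 4.0 * alpha * beta / sigma**2                     # degrees of freedom
    lam = n * v                                           # non-centrality parameter
    # n(dt) * nu(t + dt) / exp(-alpha dt) is non-central chi-square(d, lam),
    # so the draw is rescaled by exp(-alpha dt) / n(dt)
    return decay / n * rng.noncentral_chisquare(d, lam)

rng = np.random.default_rng(0)
draws = cir_exact_step(np.full(200_000, 0.3), 0.09, 0.145, 0.055, 1.0, rng)
expected_mean = 0.145 + (0.3 - 0.145) * np.exp(-0.09)
```

Passing an array for `v` draws one value per entry, which is convenient when many paths are advanced at once.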
Algorithm 5. The sample paths of the Heston Model can be simulated using the following
algorithm,
1. Sample ν̂(t + ∆t) given ν̂(t) from a non-central χ² distribution.
2. Given ν̂(t + ∆t) and ν̂(t), we estimate ∫_t^{t+∆t} ν(u) du. For this we use the
trapezoidal rule and estimate the integrated variance as,
\[
\widehat{IV}(t, t+\Delta t) \approx \frac{\hat{\nu}(t+\Delta t) + \hat{\nu}(t)}{2}\,\Delta t .
\]
3. Generate a random observation Z_x from an independent standard Gaussian random
variable.
4. Use the following exact scheme to get the different values of a sample path.
\[
\hat{X}(t+\Delta t) = \hat{X}(t) + \mu\Delta t + \frac{\alpha\rho}{\sigma}\,\widehat{IV}(t, t+\Delta t) - \frac{\widehat{IV}(t, t+\Delta t)}{2}
+ \frac{\rho}{\sigma}\left[\hat{\nu}(t+\Delta t) - \hat{\nu}(t) - \alpha\beta\Delta t\right] + \sqrt{1-\rho^2}\; Z_x \sqrt{\widehat{IV}(t, t+\Delta t)} \tag{3.7}
\]
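The four steps of Algorithm 5 can be sketched in Python as follows. This is an illustrative sketch and not the code used for the thesis: the function name, the starting values `x0` and `v0`, and the seed are assumptions. It multiplies the trapezoidal average by ∆t and uses the factor √(1 − ρ²), in line with equation (3.5). The usage lines borrow the parameter values quoted in Section 3.3.1.

```python
import numpy as np

def heston_algorithm5(x0, v0, mu, alpha, beta, sigma, rho, T, n_steps, rng):
    """Simulate (X, nu) on a grid of n_steps intervals following Algorithm 5."""
    dt = T / n_steps
    decay = np.exp(-alpha * dt)
    n = 4.0 * alpha * decay / (sigma**2 * (1.0 - decay))
    d = 4.0 * alpha * beta / sigma**2
    x = np.empty(n_steps + 1)
    v = np.empty(n_steps + 1)
    x[0], v[0] = x0, v0
    for k in range(n_steps):
        # Step 1: draw nu(t + dt) from the scaled non-central chi-square law
        v[k + 1] = decay / n * rng.noncentral_chisquare(d, n * v[k])
        # Step 2: trapezoidal estimate of the integrated variance
        iv = 0.5 * (v[k] + v[k + 1]) * dt
        # Step 3: independent standard normal draw
        z = rng.standard_normal()
        # Step 4: update of the log-price, equation (3.7)
        x[k + 1] = (x[k] + mu * dt + (alpha * rho / sigma - 0.5) * iv
                    + rho / sigma * (v[k + 1] - v[k] - alpha * beta * dt)
                    + np.sqrt((1.0 - rho**2) * iv) * z)
    return x, v

rng = np.random.default_rng(0)
# x0 = 1.0 and v0 = beta are assumed starting values
x, v = heston_algorithm5(1.0, 0.445, 0.1, 0.290, 0.445, 0.055, -0.2, 1.0, 100, rng)
```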
Figure 3.1: Simulation of a path of CIR process with N = 252, α = 0.09, β = 0.145
and σ = 0.055. [Plot: V(t) vs time t.]
Figure 3.2: Simulation of a path of Heston process with N = 252, α = 0.09, β =
0.145, µ = 0.009 and σ = 0.055. [Plot: X(t) vs time t.]
3.2 A generalized Heston Model
In this section, we propose a generalization of the Heston model. We extend
the Heston model (10) by allowing the drift µ to be governed by another stochastic
process. The rationale behind this idea is that there are some local variations in the
drift component which we feel might be captured by the generalized Heston model.
As far as we know, all the models that have been proposed in the literature assume
the interest rates to be a strictly positive quantity. But, there have been instances
when the interest rates have been negative [5]. We feel the generalized Heston model
would be more appropriate to estimate the volatility in these markets.
Definition 11. Let {S(t) : t ≥ 0} be the price of the asset and {ν(t) : t ≥ 0} be
the variance process. The equations governing the generalized Heston model are as
follows:
\[
dS(t) = \mu(t)S(t)\,dt + \sqrt{\nu(t)}\,S(t)\,dW^{S}(t) \tag{3.8}
\]
\[
d\mu(t) = \alpha_1(\beta_1 - \mu(t))\,dt + \sigma_1\,dW^{\mu}(t) \tag{3.9}
\]
\[
d\nu(t) = \alpha_2(\beta_2 - \nu(t))\,dt + \sigma_2\sqrt{\nu(t)}\,dW^{\nu}(t) \tag{3.10}
\]
where dW^µ(t) is uncorrelated with both dW^S(t) and dW^ν(t) by construction. The
rest of the parameters have their usual meanings as defined earlier. From proposition
(2.2) we know that equation (3.9) can be written as,

\[
\mu(t) = \mu(0) + \int_0^t \alpha_1(\beta_1 - \mu(s))\,ds + \int_0^t \sigma_1\,dW^{\mu}(s). \tag{3.11}
\]
Using the transformation f(X, t) = X(t) = log(S(t)) and Itô’s lemma (proposition
1), equation (3.8) can be written as,

\[
dX(t) = \left[\mu(t)S(t)\cdot\frac{1}{S(t)} + \frac{\nu(t)S^2(t)}{2}\cdot\left(\frac{-1}{S^2(t)}\right)\right]dt + \sqrt{\nu(t)}\cdot S(t)\cdot\frac{1}{S(t)}\,dW^{X}(t)
\]
\[
dX(t) = \left[\mu(t) - \frac{\nu(t)}{2}\right]dt + \sqrt{\nu(t)}\,dW^{X}(t),
\]
where µ(t) follows a mean reverting OU process given by equation (3.9) and ν(t)
follows a CIR process given by equation (3.10).
Using equation (2.2), X(t) can be written as,

\[
X(t) = X(0) + \int_0^t \left[\mu(s) - \frac{\nu(s)}{2}\right]ds + \int_0^t \sqrt{\nu(s)}\,dW^{X}(s). \tag{3.12}
\]
For any two times t1, t2 such that t2 > t1, equation (3.12) translates to,
\[
X(t_2) = X(t_1) + \int_{t_1}^{t_2}\left[\mu(s) - \frac{\nu(s)}{2}\right]ds + \int_{t_1}^{t_2}\sqrt{\nu(s)}\,dW^{X}(s).
\]
This can be further simplified as,
\[
X(t_2) = X(t_1) + \int_{t_1}^{t_2}\mu(s)\,ds - \int_{t_1}^{t_2}\frac{\nu(s)}{2}\,ds + \int_{t_1}^{t_2}\sqrt{\nu(s)}\,dW^{X}(s).
\]
Using proposition 6 we get,

\[
X(t_2) = X(t_1) + \int_{t_1}^{t_2}\mu(s)\,ds - \int_{t_1}^{t_2}\frac{\nu(s)}{2}\,ds + \rho\int_{t_1}^{t_2}\sqrt{\nu(s)}\,dW^{\nu}(s)
+ \sqrt{1-\rho^2}\int_{t_1}^{t_2}\sqrt{\nu(s)}\,dW^{Z}(s), \tag{3.13}
\]
where dW µ (s) and dW Z (s) are independent of each other.
3.2.1 Simulation of sample paths of the generalized Heston model
The sample paths of the modified Heston model can be simulated using the fol-
lowing multistep procedure.
Step-I
Set the process parameters, i.e., total time period (T)= 1.0, number of steps (N ) =
100, ρ = −0.6. Let s be the number of intermediate points between ti and ti+1 .
Figure 3.3 illustrates this with s=4. For our simulations, we choose s as 100.
Figure 3.3: s=4 intermediate points between ti and ti+1 .
We need to simulate both the CIR process and the OU process in order to simulate a
path of the generalized Heston model. We simulate the OU process using algorithm
2. Similarly, the CIR process is simulated using the algorithm described in Chapter 2.
Figure 3.4: Simulated path of the CIR process with parameters α2 = 0.221, β2 =
0.601, σ2 = 0.055. Every (s + 1)th value has been chosen for the plot, where s has
been defined in step-I. [Plot: V(t) against t.]
Figure 3.5: Simulated path of the OU process with parameters α1 = 0.14, β1 =
0.861, σ1 = 0.009. Every (s + 1)th value has been chosen for the plot, where s has
been defined in step-I. [Plot: M(t) against t.]
Step-II
We estimate the integral ∫_{t1}^{t2} ν(s) ds using the Riemann sum,
\[
\int_{t_1}^{t_2}\nu(s)\,ds \approx \widehat{IV} = \left(\nu(t_1) + \sum_{i=1}^{s}\nu(s_i)\right)\Delta,
\]
where s = 100 is the number of divisions between t1 and t2 and
\[
\Delta = \frac{t_2 - t_1}{s}.
\]
ν(t1) and ν(si) have already been simulated in Step-I as part of simulating the CIR
process. In this step, we just add up all the simulated values of the CIR
process between the time points t1 and t2 and multiply the sum by ∆.
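The Riemann sum can be sketched in a few lines of numpy. The sketch assumes one particular layout, which is not stated in the text: the fine path stores s equally spaced sub-steps per coarse interval (N·s + 1 values in total), and each integral is approximated from the s left endpoints of its interval. The function name is hypothetical, and the constant path is used only because its integral is known exactly.

```python
import numpy as np

def riemann_integrals(fine_path, n_coarse, s, T):
    """Left-endpoint Riemann sums of a fine path over each of the n_coarse intervals."""
    delta = T / (n_coarse * s)  # width of one sub-step
    left = np.asarray(fine_path)[:-1].reshape(n_coarse, s)
    return left.sum(axis=1) * delta

# A constant path nu = 0.6 on [0, 1] with N = 100 and s = 100
t = np.linspace(0.0, 1.0, 100 * 100 + 1)
iv = riemann_integrals(np.full_like(t, 0.6), 100, 100, 1.0)
```

For this constant path every entry of `iv` equals 0.6 × 0.01, the exact integral over a coarse interval of length 0.01.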
Figure 3.6: Simulated path of the estimates of ∫_{t1}^{t2} ν(s) ds at different time points.
[Plot: Value against N.]
The X axis here represents the number of divisions between 0 and the total time
period T. If N = 100, then there would be 99 estimated values of the integral at
different times, i.e., ∫_{t1}^{t2} ν(s) ds, ∫_{t2}^{t3} ν(s) ds, ..., ∫_{t99}^{t100} ν(s) ds, and the first value is just ν_{t1}.
Step-III
We estimate the integral ∫_{t1}^{t2} µ(s) ds using the Riemann sum,
\[
\int_{t_1}^{t_2}\mu(s)\,ds \approx \widehat{I\mu} = \left(\mu(t_1) + \sum_{i=1}^{s}\mu(s_i)\right)\Delta,
\]
where s and ∆ have their usual meanings.
Figure 3.7: Simulated path of the estimate of ∫_{t1}^{t2} µ(s) ds. [Plot: Value against N.]
The X axis here represents the number of divisions between 0 and the total time
period T. If N = 100, then there would be 99 estimated values of the integral at
different times, i.e., ∫_{t1}^{t2} µ(s) ds, ∫_{t2}^{t3} µ(s) ds, ..., ∫_{t99}^{t100} µ(s) ds, and the first value is just µ_{t1}.
Step-IV
The solution to the CIR process simulated in Step-I is given as,

\[
\nu(t_2) = \nu(t_1) + \int_{t_1}^{t_2}\alpha_2(\beta_2 - \nu(u))\,du + \sigma_2\int_{t_1}^{t_2}\sqrt{\nu(u)}\,dW^{\nu}(u).
\]
We estimate the integral ∫_{t1}^{t2} √ν(s) dW^ν(s) by rearranging this equation,
\[
\int_{t_1}^{t_2}\sqrt{\nu(u)}\,dW^{\nu}(u) = \left[\nu(t_2) - \nu(t_1) - \int_{t_1}^{t_2}\alpha_2(\beta_2 - \nu(u))\,du\right]\sigma_2^{-1}.
\]
We have already simulated all the terms on the right hand side and thus, we know
the value of ∫_{t1}^{t2} √ν(u) dW^ν(u).
[Figure: Simulated path of the estimates of ∫_{t1}^{t2} √ν(s) dW^ν(s). Plot: Value against N.]
The interpretation of the X-axis is similar to the one described above.
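Step-IV amounts to solving the integrated CIR equation for the stochastic integral, which gives [ν(t2) − ν(t1) − α2(β2(t2 − t1) − ∫ν(u) du)]/σ2 on each interval. A minimal Python sketch is given below; the function name and all numbers are made up for illustration. The round-trip check builds a variance path from known values of the stochastic integral and confirms that the function recovers them.

```python
import numpy as np

def sqrt_nu_dW(v_coarse, int_nu, alpha2, beta2, sigma2, dt):
    """Recover the integral of sqrt(nu) dW^nu on each interval from the CIR path."""
    dv = np.diff(v_coarse)                             # nu(t2) - nu(t1)
    drift = alpha2 * (beta2 * dt - np.asarray(int_nu))  # integral of alpha2 * (beta2 - nu(u)) du
    return (dv - drift) / sigma2

# Round-trip check on made-up numbers
alpha2, beta2, sigma2, dt = 0.221, 0.601, 0.055, 0.01
int_nu = np.array([0.0060, 0.0061, 0.0059])            # assumed integrated variances
true_integrals = np.array([0.05, -0.02, 0.01])         # assumed stochastic integrals
steps = alpha2 * (beta2 * dt - int_nu) + sigma2 * true_integrals
v_coarse = 0.6 + np.concatenate(([0.0], np.cumsum(steps)))
recovered = sqrt_nu_dW(v_coarse, int_nu, alpha2, beta2, sigma2, dt)
```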
Step-V
We estimate the integral ∫_{t1}^{t2} √ν(s) dW^Z(s) as,
\[
\int_{t_1}^{t_2}\sqrt{\nu(s)}\,dW^{Z}(s) \approx Z\sqrt{\widehat{IV}},
\]
where Z is a value from a standard normal random variable and $\widehat{IV}$ has already been
explained in Step-II.
[Figure: Simulated path of the estimates of ∫_{t1}^{t2} √ν(s) dW^Z(s). Plot: Value against N.]
The interpretation of the X-axis is similar to the one described above. We now have
all the estimated integrals. We assume that the initial value X(0) ∼ N (1, 1). Using
X(0) and (3.13) we can now simulate a sample path of the generalized Heston model.
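Putting the pieces together, the sample path follows from accumulating the increments in (3.13). The sketch below assumes the per-interval estimates from Steps II to IV are already available as arrays; the function name and the toy inputs are hypothetical. The Step-V term is generated inside the function as Z times the square root of the integrated variance.

```python
import numpy as np

def assemble_log_price(x0, int_mu, int_nu, int_sqrt_nu_dW, rho, rng):
    """Accumulate the increments of equation (3.13) into a path of X."""
    int_mu = np.asarray(int_mu)
    int_nu = np.asarray(int_nu)
    z = rng.standard_normal(len(int_nu))
    indep = z * np.sqrt(int_nu)                  # Step-V estimate
    increments = (int_mu - 0.5 * int_nu
                  + rho * np.asarray(int_sqrt_nu_dW)
                  + np.sqrt(1.0 - rho**2) * indep)
    return x0 + np.concatenate(([0.0], np.cumsum(increments)))

# Degenerate toy input: with zero variance only the drift integrals remain
rng = np.random.default_rng(0)
x = assemble_log_price(1.0, np.full(4, 0.1), np.zeros(4), np.zeros(4), -0.6, rng)
```

In the degenerate example the path is simply 1.0, 1.1, 1.2, 1.3, 1.4, which is a quick way to confirm that the bookkeeping of the increments is right before feeding in simulated integrals.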
Figure 3.10: Simulated sample path of the generalized Heston model. [Plot: X(t) against t.]
3.3 Approximate Bayesian Computing (ABC)
Approximate Bayesian Computing (ABC) is a computational technique used when
an analytical formula for the likelihood function is difficult to derive or is computa-
tionally costly to evaluate. Assume we want to perform Bayesian inference and wish to
explore an intractable posterior density P (θ|D0 ) where θ is the parameter of interest
and D0 is a generic notation for “observed data”.
Algorithm 6. The ABC algorithm performs the following steps,
1. Sample a new value of the parameter θ* from the prior distribution P (·).
2. Simulate a data set D* from the likelihood model f (·|θ*).
3. Compare the newly simulated data D* to the observed data D0 using a well
defined distance function d and tolerance ε ≥ 0. The tolerance ε is the desired
level of closeness or agreement between D* and D0.
4. If d(D*, D0) ≤ ε, we accept θ*; else we reject θ*.
Repeat the process until R such parameters sampled from P (·) have been accepted.
The accepted parameters represent a sample from P (θ|d(D*, D0) ≤ ε). For a
sufficiently small ε, the distribution P (θ|d(D*, D0) ≤ ε) would be a very good
approximation to the “true” distribution P (θ|D0).
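The rejection loop of Algorithm 6 can be sketched generically in Python. Everything model-specific below is a toy assumption made for illustration: the data are Gaussian with unknown mean, the prior is N(0, 5²), and the distance compares sample means rather than whole paths. The function name `abc_rejection` and the `max_draws` safeguard are also hypothetical.

```python
import numpy as np

def abc_rejection(observed, sample_prior, simulate, distance, eps, n_accept, rng,
                  max_draws=100_000):
    """Repeat steps 1-4 until n_accept parameters are accepted or max_draws is reached."""
    accepted = []
    for _ in range(max_draws):
        theta = sample_prior(rng)                  # step 1: draw from the prior
        simulated = simulate(theta, rng)           # step 2: simulate a data set
        if distance(simulated, observed) <= eps:   # steps 3 and 4: compare and accept
            accepted.append(theta)
            if len(accepted) == n_accept:
                break
    return np.array(accepted)

rng = np.random.default_rng(0)
observed = rng.normal(2.0, 1.0, size=50)           # toy "observed data", true mean 2
posterior = abc_rejection(
    observed,
    sample_prior=lambda r: r.normal(0.0, 5.0),
    simulate=lambda theta, r: r.normal(theta, 1.0, size=50),
    distance=lambda a, b: abs(a.mean() - b.mean()),
    eps=0.1, n_accept=200, rng=rng)
```

Shrinking `eps` tightens the approximation to the posterior but lowers the acceptance rate, so more prior draws are needed for the same number of accepted parameters.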
3.3.1 ABC for Heston Model
In this section, we estimate the parameters of the Heston Model using ABC. The
observed data D0 is simulated using algorithm 5. We use Gaussian distribution priors
for parameters that have no restrictions and Gamma priors for parameters that are
restricted to be positive. The parameters used to generate the observed data are,
α = 0.290, β = 0.445, σ = 0.055, µ = 0.1, ρ = −0.2, T = 1.0, N = 100.
Here, the parameters have their usual meanings as described in section 3.1.
Algorithm 7. The ABC algorithm for the Heston model performs the following steps,
1. Let θ* = (α*, β*, σ*, µ*, ρ*). Sample α*, β* from a Gamma(1, 1) distribution,
σ* and µ* from a Gamma(0.45, 0.45) distribution. We sample ρ* from a
Gamma(0.3, 0.3) distribution and then multiply the selected value by −1 because
of the additional constraints on ρ as described above.
2. Simulate a data set D* in accordance with a simulation framework f (D|θ*), which
is described in Algorithm 5.
3. Let the distance function be,
\[
d = \sum_{i=1}^{N}\left|D_{0,i} - D^{*}_{i}\right|
\]
where D_{0,i} = X(t_i) is the observed value of the process at time t_i.
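Steps 1 and 3 of Algorithm 7 can be sketched as below. Two points are assumptions rather than statements from the text: Gamma(a, b) is read as shape a and scale b, which is numpy's convention for `Generator.gamma` (the text does not say whether b is a scale or a rate), and the sketch leaves open how draws with ρ* < −1 are handled. The function names are hypothetical.

```python
import numpy as np

def sample_heston_prior(rng):
    """One draw of theta* = (alpha, beta, sigma, mu, rho) as in step 1."""
    return {
        "alpha": rng.gamma(1.0, 1.0),
        "beta": rng.gamma(1.0, 1.0),
        "sigma": rng.gamma(0.45, 0.45),
        "mu": rng.gamma(0.45, 0.45),
        "rho": -rng.gamma(0.3, 0.3),  # sign flipped so that rho is negative
    }

def l1_distance(observed, simulated):
    """Distance of step 3: sum over time points of |D0_i - D*_i|."""
    return float(np.sum(np.abs(np.asarray(observed) - np.asarray(simulated))))

theta = sample_heston_prior(np.random.default_rng(0))
d = l1_distance([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])  # 0.5 + 0.0 + 1.0
```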
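Table 3.1: Number of simulations vs. number of accepted parameters for ε = 100.

S.No.   Number of simulations (n)   Number of accepted parameters (R)
1)      100                         54
2)      1,000                       656
3)      10,000                      5488
4)      100,000                     65766

S.No.   Number of simulations (n)   Number of accepted parameters (R)
1)      100                         70
2)      1,000                       740
3)      10,000                      7321
4)      100,000                     74217

S.No.   Number of simulations (n)   Number of accepted parameters (R)
1)      100                         84
2)      1,000                       767
3)      10,000                      7873
4)      100,000                     79502

S.No.   Number of simulations (n)   Number of accepted parameters (R)
1)      100                         85
2)      1,000                       816
3)      10,000                      8129
4)      100,000                     81515

S.No.   Number of simulations (n)   Number of accepted parameters (R)
1)      100                         86
2)      1,000                       845
3)      10,000                      8303
4)      100,000                     82981

S.No.   Number of simulations (n)   Number of accepted parameters (R)
1)      100                         88
2)      1,000                       858
3)      10,000                      8410
4)      100,000                     84712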
For the same number of simulations, we observe that the number of accepted parameters
increases with the tolerance (ε) level. This is in accordance with our expectations,
as a higher tolerance level (ε) corresponds to a weaker constraint which
allows more parameters to be accepted. Similarly, for the same tolerance (ε) level, the
number of accepted parameters increases with the number of simulations. Below, we
show the histograms of accepted values of the parameters of the Heston Model for
different numbers of simulations and different tolerance (ε) levels.
Figure 3.11: Histograms of accepted values of the parameters of the Heston Model
for ε = 100 and 1000 simulations. The dashed red lines represent the true values of
the parameters. [Panels, each showing Frequency against Value: (a) Histogram of
accepted values of α. (b) Histogram of accepted values of β. (c) Histogram of accepted
values of σ. (d) Histogram of accepted values of µ. (e) Histogram of accepted values of ρ.]
[Figures 3.12 to 3.28: Histograms of accepted values of the parameters α, β, σ, µ and ρ
of the Heston Model (Frequency against Value) for the remaining combinations of
tolerance ε and number of simulations, including ε = 100 with 10,000 simulations and
ε = 200 with 1,000 simulations. The dashed red lines represent the true values of
the parameters.]
We observe that in almost all the histograms, the bin with the highest frequency
contains the dashed red line, i.e., the largest number of accepted values lies very close
to the true parameter value for α, β, σ and µ. This is not the case for ρ. Next, we
look at the implementation of the ABC algorithm for the generalized Heston model.
3.3.2 ABC for generalized Heston Model
In this section, we estimate the parameters of the generalized Heston Model using
ABC. The observed data D0 is simulated using the process described in section 3.2.
We use Gaussian distribution priors for parameters that have no restrictions and
Gamma priors for parameters that are restricted to be positive. The parameters used
to generate the observed data are,
α1 = 0.283, β1 = 0.661, σ1 = 0.009, α2 = 0.221, β2 = 0.601, σ2 = 0.055,
ρ = −0.6, T = 1.0, N = 100.
Here, the parameters have their usual meanings as described in section 3.2.
Algorithm 8. The ABC algorithm for the generalized Heston model performs the
following steps,
1. Let θ* = (α1*, β1*, σ1*, α2*, β2*, σ2*, ρ*). Sample α1*, α2*, β2* from a Gamma(1, 1)
distribution, σ1* and σ2* from a Gamma(0.25, 0.25) distribution, β1* from a
Normal(1, 1) distribution. We sample ρ* from a Gamma(1, 1) distribution and
then multiply the selected value by −1 because of the additional constraints on
ρ as described above.
2. Simulate a data set D* in accordance with a simulation framework f (D|θ*), which
is described in Section 3.2.
3. Let the distance function be,
\[
d = \sum_{i=1}^{N}\left|D_{0,i} - D^{*}_{i}\right| \tag{3.14}
\]
where D_{0,i} = X(t_i) is the observed value of the process at time t_i. Compare
the simulated data D* to the observed data D0 using a well-defined distance
function d and tolerance ε ≥ 0.
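Table 3.7: Table showing the number of simulations vs number of accepted parameters
for ε = 100.

S.No.   Number of simulations (n)   Number of accepted parameters (R)
1)      100                         1
2)      1,000                       14
3)      10,000                      148

Table 3.8: Table showing the number of simulations vs number of accepted parameters
for ε = 200.

S.No.   Number of simulations (n)   Number of accepted parameters (R)
1)      100                         2
2)      1,000                       53
3)      10,000                      440

Table 3.9: Table showing the number of simulations vs number of accepted parameters
for ε = 500.

S.No.   Number of simulations (n)   Number of accepted parameters (R)
1)      100                         20
2)      1,000                       240
3)      10,000                      2236

Table 3.10: Table showing the number of simulations vs number of accepted parameters
for ε = 800.

S.No.   Number of simulations (n)   Number of accepted parameters (R)
1)      100                         36
2)      1,000                       246
3)      10,000                      2671

Table 3.11: Table showing the number of simulations vs number of accepted parameters
for ε = 1,000.

S.No.   Number of simulations (n)   Number of accepted parameters (R)
1)      100                         31
2)      1,000                       302
3)      10,000                      2889

Table 3.12: Table showing the number of simulations vs number of accepted parameters
for ε = 1,500.

S.No.   Number of simulations (n)   Number of accepted parameters (R)
1)      100                         43
2)      1,000                       284
3)      10,000                      3189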
We observe that for the same number of simulations, the number of accepted
parameters generally increases with the tolerance (ε) level. Similarly, for the same
tolerance (ε) level, the number of accepted parameters increases with the number of
simulations. For the same number of simulations and same tolerance (ε) level, the
number of accepted parameters for the Heston model is greater than the number of
accepted parameters for the generalized Heston model. This is due to the fact that the
Heston model has fewer parameters than the generalized Heston model. As we increase
the complexity of the model, it becomes harder to find the right set of parameters that
satisfy the constraints of ABC. Below, we show the histograms of accepted values of the
parameters of the generalized Heston Model for different numbers of simulations and
different tolerance (ε) levels.
Figure 3.29: Histograms of estimated values of the parameters of the generalized
Heston Model for ε = 100 and 1000 simulations. The dashed red lines represent the
true values of the parameters. [Panels, each showing Frequency against Value:
(a) Histogram of estimated values of α1. (b) Histogram of estimated values of β1.
(c) Histogram of estimated values of σ1. (d) Histogram of estimated values of α2.
(e) Histogram of estimated values of β2. (f) Histogram of estimated values of σ2.
(g) Histogram of estimated values of ρ.]
[Figures 3.30 onward: Histograms of estimated values of the parameters α1, β1, σ1, α2,
β2, σ2 and ρ of the generalized Heston Model (Frequency against Value) for the
remaining combinations of tolerance ε and number of simulations, including ε = 200
with 10,000 simulations and ε = 500, ε = 800 and ε = 1,000 with 1,000 simulations.
The dashed red lines represent the true values of the parameters.]
We observe that in almost all the histograms, the bin with the highest frequency
contains the dashed red line, i.e., the largest number of accepted values lies very close
to the true parameter value for α1, β1, σ1, σ2 and α2. This is not the case for ρ and
β2.
Chapter 4: Application: Modeling Volatility in Financial Markets
4.1 Introduction
In this section, we define a stock index and its purpose, and describe different stock
indices of the world. We also describe the importance of volatility in this section.
4.1.1 Stock Index
A stock index is defined as a collection of stocks that are grouped together using
specific criteria so that a particular sector, market, commodity, bond, currency or any
other asset can be monitored [3]. It is difficult to track every asset, so a statistical
measuring tool like an index is really useful.
Standard & Poor’s 500
The Standard & Poor’s 500, commonly abbreviated as S&P 500, is an index based
on the American stock market. It consists of the largest 500 companies listed on the
New York Stock Exchange (NYSE) based on market capitalization [4]. The S&P 500
is one of the most followed indices globally and many economists believe it to be a fair
and apt representation of the US stock market and an index that acts as a bellwether
for the economy of the United States.
S&P BSE 200
The S&P BSE 200 Index is a free-float weighted index of 200 companies selected
from the Specified and Non-Specified lists of the BSE India Exchange on the basis of
their market capitalization. It started as a cap-weighted index with a base value of
100 and base year 1989-90. Effective 8/16/05, it was changed to a free-float
index. Although the S&P BSE SENSEX quantified price movements and reflected the
sensitivity of the market effectively, the rapid growth of the market necessitated a
new broad-based index series that would reflect market trends more effectively and
better represent the increased number of equity stocks, the larger market
capitalization and the new industry groups. Accordingly, on 27th May 1994 BSE
launched two new index series, S&P BSE 200 and S&P Dollex 200. The equity shares
of 200 selected companies from the specified and non-specified lists of BSE were
considered for inclusion in the sample for ‘S&P BSE 200’. The selection of companies
was primarily based on the current market capitalization of the listed scrips. In
addition, the market activity of the companies, as reflected by turnover volumes, and
certain fundamental factors were considered for the final selection of the 200
companies [1].
Shanghai Stock Exchange (SSE)
The Chinese stock market has been one of the fastest growing emerging capital
markets, and is now the second largest in Asia, behind only Japan. The Shanghai
Stock Exchange Composite Index is a capitalization-weighted index. The index
tracks the daily price performance of all A-shares and B-shares listed on the Shanghai
Stock Exchange. The index was developed on December 19, 1990 with a base value
of 100. The first day of reporting was July 15, 1991 [18]. A-shares are shares
denominated in Renminbi that are purchased and traded on the Shanghai and
Shenzhen stock exchanges. This is in contrast to Renminbi B-shares, which are owned
by foreigners who cannot purchase A-shares due to Chinese government restrictions.
B-shares (officially Domestically Listed Foreign Investment Shares) on the Shanghai
and Shenzhen stock exchanges are those that are traded in foreign currencies.
The composite figure can be calculated using the formula:
\[
\text{Current Index} = \frac{\text{Market Cap of Composite Numbers}}{\text{Base Period}} \times \text{Base Value}
\]
The B-share stocks are generally denominated in US dollars for calculation purposes.
For the calculation of other indices, B-share stock prices are converted to RMB at the
applicable exchange rate (the middle price of the US dollar on the last trading day of
each week) at the China Foreign Exchange Trading Center and then published by the
exchange.
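The index formula can be illustrated with a short sketch. The function name and the numbers are hypothetical, and "Base Period" is read here as the market capitalization in the base period, which is an assumption about the formula rather than something stated in the text.

```python
def composite_index(current_market_cap, base_period_market_cap, base_value=100.0):
    """Return (current market cap / base-period market cap) * base value.

    Assumption: 'Base Period' in the formula denotes the total market
    capitalization of the constituents in the base period.
    """
    if base_period_market_cap <= 0:
        raise ValueError("base-period market capitalization must be positive")
    return current_market_cap / base_period_market_cap * base_value


# Hypothetical figures: the market cap has grown 30-fold since the base period.
index_level = composite_index(3.0e12, 1.0e11)
print(index_level)
```

With a base value of 100, a 30-fold increase in market capitalization corresponds to an index level of 3000.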
Nikkei 225
The Nikkei 225 is Japan’s top stock index. It consists of the top 225 blue chip
companies that are listed on the Tokyo Stock Exchange [2]. The Nikkei 225 is the
oldest index in Asia.
According to the US National Bureau of Economic Research (the official arbiter
of US recessions), the US recession began in December 2007 and ended in June 2009,
and thus extended over 19 months. In order to see the impact of the recent financial
crises and obtain time-varying results, the total data is divided into four sub-periods:
before the financial crisis (period I, January 1996 - November 2007), during the
recession (period II, December 2007 - June 2009), after the recession and before the
Chinese crisis (period III, July 2009 - May 2015), and from the start of the Chinese
crisis to the end of the sample (period IV, June 2015 - April 2016). We assume the
sample period is sufficient to evaluate the information asymmetry, especially after the
large investments by Foreign Institutional Investors in stock markets, the sub-prime
crisis and the recent financial crisis.
Most classic equations in finance like the Black-Scholes equation that is used
to price options consider volatility to be a fixed quantity. But empirical studies
have shown that volatility varies over time. Capturing volatility is very important to
predict the price of stocks and commodities. Having some information about volatility
also helps an investor make informed decisions. There have been some studies aimed
at this but very few of them pertain to emerging markets. The goal of this project is to
be able to predict volatility for emerging markets so that investors can make informed
decisions. Volatility estimation is of utmost importance for option valuation.
Volatility is defined as the uncertainty or dispersion in stock price movements,
or the variability in the returns. Regulated utilities and blue chip companies that are
expected to grow slowly but steadily over time have usually been associated with
low volatility. Investing in the stocks of these companies tends to be a viable
investment in the long run. On the other hand, the stock prices of companies with
higher volatility vary rapidly over a short period of time. Start-ups are a prime
example of this type of company. A very recent example is the price of
cryptocurrencies, especially Bitcoin. The price movement of Bitcoin over the past
year indicates that it has a high volatility. Investing in the stocks of companies with
higher volatility can result in short-term gains, but going long on these stocks is
generally not a feasible option.
Volatility is an important indicator, and most companies estimate their volatility
by means of three measures: historical volatility, implied volatility, and historical or
implied volatility computed from a subset of peer companies. Historical volatility is
the actual variability that was observed in the past during a specific time period. Its
disadvantage is that it is calculated from past stock prices and is therefore of limited
use for predicting the future. Even though it provides a rough estimate of volatility,
a better estimate would be beneficial. Implied volatility, on the other hand, is the
volatility that gives the theoretical trading price of an option that is traded in the
market. When the value of implied volatility is plugged into the Black-Scholes
equation, we get the theoretical trading price. The only problem is that implied
volatility is rarely available for all time periods or for all companies. It is also subject
to short-term market fluctuations. To circumvent this problem, the companies that
have usable option data make use of a combination of historical volatility and implied
volatility. Many companies, however, have to rely exclusively on historical volatility,
primarily because they do not have usable option data.
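As a minimal sketch of how historical volatility is commonly computed, the snippet below takes the sample standard deviation of daily log returns and annualizes it. The function name, the sample prices and the choice of 252 trading days per year are assumptions made for illustration; the thesis does not prescribe this particular estimator.

```python
import math


def historical_volatility(prices, periods_per_year=252):
    """Annualized sample standard deviation of log returns.

    Assumption: 252 trading days per year, a common convention.
    """
    log_returns = [math.log(b / a) for a, b in zip(prices[:-1], prices[1:])]
    n = len(log_returns)
    if n < 2:
        raise ValueError("need at least three prices")
    mean = sum(log_returns) / n
    variance = sum((r - mean) ** 2 for r in log_returns) / (n - 1)
    return math.sqrt(variance * periods_per_year)


prices = [100.0, 101.0, 99.5, 100.2, 102.0]  # hypothetical closing prices
vol = historical_volatility(prices)
print(vol)
```

A price series that grows at a constant rate has identical log returns and therefore a historical volatility of (numerically) zero, which illustrates that this measure captures dispersion rather than trend.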
4.2 Exploratory Data Analysis
Daily closing prices of the Shanghai Stock Exchange (SSE) Composite Index for the
period January 1, 1996 to April 8, 2016 are considered for the study. The data for
SSE was retrieved from Yahoo! Finance. For this purpose, we have used the daily
adjusted closing prices for the SSE Composite Index. We have also considered the
daily adjusted closing price of Nikkei 225 from January 5, 2015 to July 24, 2018, which
corresponds to 927 days. We have downloaded the Nikkei 225 data from the Federal
Reserve Economic Data (FRED) database, which is maintained by the Research
division of the Federal Reserve Bank of St. Louis. These indices are considered
because of their popularity around the world and because they represent their
respective markets. In order to apply our model, we use the log adjusted closing
prices. Figures 4.1 - 4.4 show the daily adjusted closing prices and daily log adjusted
closing prices of the two indices for the time periods considered. The daily log closing
prices are calculated as
\[
X(t) = \log S(t),
\]
where S(t) is the adjusted closing price on day t.
If the adjusted closing price is missing for a particular day, the adjusted closing price
of the preceding day is used for that day [26].
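The preprocessing described here (carrying the previous day's price forward when a value is missing, then taking logarithms) can be sketched as follows. The function name and the sample values are hypothetical, and missing prices are represented by None purely for illustration.

```python
import math


def log_prices_with_forward_fill(prices):
    """Replace missing prices (None) by the preceding day's price, then take logs."""
    filled = []
    previous = None
    for price in prices:
        if price is None:
            if previous is None:
                raise ValueError("the first observation cannot be missing")
            price = previous  # carry the last available price forward
        filled.append(price)
        previous = price
    return [math.log(p) for p in filled]


closing = [3000.0, None, 3030.0, None, None, 2990.0]  # hypothetical values
x = log_prices_with_forward_fill(closing)
print(x)
```

Consecutive missing days all inherit the last observed price, so the log price series is flat over such gaps.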
Figure 4.1: Daily Adjusted Closing Price of SSE from 01/01/96 to 04/08/16. (Axes: time (no. of days) vs. adjusted closing price (in Renminbi).)
Figure 4.2: Daily Log Adjusted Closing Price of SSE from 01/01/96 to 04/08/16. (Axes: time (no. of days) vs. log adjusted closing price.)
Figure 4.3: Daily Adjusted Closing Price of NIKKEI 225 from 01/05/15 to 07/24/18. (Axes: time (no. of days) vs. adjusted closing price (in Yen).)
Figure 4.4: Daily Log Adjusted Closing Price of NIKKEI 225 from 01/05/15 to 07/24/18. (Axes: time (no. of days) vs. log adjusted closing price.)
4.3 Parameter estimation of the Generalized Heston model using ABC
4.3.1 Parameter estimation using ABC for SSE
We fit the generalized Heston model to the data from SSE and estimate the
parameters using ABC. In order to test the fit of our model, we divide the daily log
adjusted closing prices into two parts, the training dataset and the testing dataset.
The first 4800 data points form the training dataset and the remaining 353 form
the testing dataset. We tried a combination of normal and gamma priors. Given
a tolerance level ε, the ABC algorithm accepts many numerical values for a single
parameter. The table below summarizes the results.
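Table 4.1: Table showing the number of accepted values of the parameters for ε = 10,000.

ε         Number of simulations (n)   Number of accepted parameters (R)
10,000    100                         50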
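The accept/reject logic behind these counts can be sketched with a generic ABC rejection sampler. This is a toy illustration and not the implementation used in this thesis: the model (normal observations with unknown mean), the normal prior, the distance (absolute difference of sample means), the tolerance values and all function names are assumptions chosen so that the example is short and self-contained.

```python
import random
import statistics


def abc_rejection(observed, simulate, sample_prior, distance, epsilon,
                  n_simulations, seed=0):
    """Keep every prior draw whose simulated data lie within epsilon of the data."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_simulations):
        theta = sample_prior(rng)                       # draw a candidate parameter
        synthetic = simulate(theta, len(observed), rng)  # simulate a dataset from it
        if distance(synthetic, observed) <= epsilon:    # accept if close enough
            accepted.append(theta)
    return accepted


# Toy model (hypothetical): observations are Normal(theta, 1) draws.
def sample_prior(rng):
    return rng.gauss(0.0, 3.0)


def simulate(theta, size, rng):
    return [rng.gauss(theta, 1.0) for _ in range(size)]


def distance(a, b):
    return abs(statistics.fmean(a) - statistics.fmean(b))


observed = simulate(2.0, 200, random.Random(42))

loose = abc_rejection(observed, simulate, sample_prior, distance,
                      epsilon=1.0, n_simulations=500)
tight = abc_rejection(observed, simulate, sample_prior, distance,
                      epsilon=0.2, n_simulations=500)
print(len(loose), len(tight))
```

Because both runs reuse the same seed, every value accepted at the tight tolerance is also accepted at the loose one. With independent runs the counts need not be ordered in this way.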
Figure 4.5: Histograms of accepted values of the parameters for ε = 10,000 and 100 simulations. (Panels: (a) α1, (b) β1, (c) σ1, (d) α2, (e) β2, (f) σ2, (g) ρ. Axes: Value vs. Frequency.)
Since the histograms of accepted values of the parameters appear to be skewed, we
have used the median as a point estimate of each parameter. The estimated parameters
of the model fit on SSE data from January 01, 1996 to April 08, 2016 are
\[
\hat{\alpha}_1 = 2.497,\ \hat{\beta}_1 = 0.066,\ \hat{\sigma}_1 = 0.211,\ \hat{\alpha}_2 = 1.812,\ \hat{\beta}_2 = 0.428,\ \hat{\sigma}_2 = 0.233,\ \hat{\rho} = -0.358.
\]
Using the estimated parameters given above, we simulate a synthetic dataset and
compare it with the testing dataset. To assess the goodness of fit of our model, we
use the same distance metric as was used in implementing the ABC algorithm. The
smaller the distance between the dataset simulated using the estimated parameters
and the testing dataset, the better the model fits. Figure 4.6 shows the
simulated dataset and the testing dataset. The distance between them was 67.63
units.
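The two steps used in this evaluation, taking the median of skewed accepted values and measuring the distance between a simulated and a testing series, can be sketched as below. The Euclidean distance is an assumption made only for this illustration; the metric actually used is the one defined with the ABC algorithm earlier in the thesis. The accepted values and series are hypothetical.

```python
import math
import statistics


def point_estimate(accepted_values):
    """Median of the accepted values; less sensitive to skew than the mean."""
    return statistics.median(accepted_values)


def euclidean_distance(simulated, testing):
    """Assumed metric for illustration: Euclidean distance between two series."""
    if len(simulated) != len(testing):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(simulated, testing)))


accepted_alpha = [0.5, 0.7, 0.9, 1.1, 6.0]   # hypothetical, right-skewed
estimate = point_estimate(accepted_alpha)
mean_value = statistics.fmean(accepted_alpha)

d = euclidean_distance([8.0, 8.1, 8.2], [8.0, 8.0, 8.0])
print(estimate, mean_value, d)
```

In this hypothetical sample the single large value pulls the mean well above the median, which is the reason a median is a natural point estimate for skewed histograms.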
Figure 4.6: Comparison between simulated dataset and testing dataset. (Axes: data point index vs. log adjusted closing price.)
Parameter estimation using ABC for SSE during period 1
Using the method described above, we estimate the parameters using ABC.
In order to test the fit of our model, we divide the daily log adjusted closing prices
into two parts, the training dataset and the testing dataset. The first 2800 data
points form the training dataset and the remaining 292 form the testing dataset. We
tried a combination of normal and gamma priors. The table below shows the number
of accepted values of the parameters for different tolerance levels ε.

Table 4.2: Table showing the number of accepted values of the parameters for different ε levels.

ε         Number of simulations (n)   Number of accepted parameters (R)
10,000    100                         45
5,000     100                         34
1,000     100                         0

It can be seen from Table 4.2 that the number of accepted values of the parameters
increases with the tolerance level ε. For ε = 1,000, the ABC algorithm does not
accept any parameter values. Next, we look at the histograms of the accepted values
of the parameters.
Figures 4.7-4.8: Histograms of accepted values of the parameters for the first period and 100 simulations, one figure per tolerance level ε. (Panels: α1, β1, σ1, α2, β2, σ2, ρ. Axes: Value vs. Frequency.)
We have used the median as a point estimate of each parameter, as the histograms of
the accepted values appeared to be skewed. The estimated parameters of the model
fit on SSE data during period 1, from January 01, 1996 to November 30, 2007, for
different ε levels are given in Table 4.3.

Table 4.3: Table showing the estimated parameters for different ε levels (100 simulations).

ε         α̂1       β̂1      σ̂1      α̂2      β̂2      σ̂2      ρ̂
10,000    19.283   1.349   0.173   5.774   0.414   0.222   -0.305
5,000     21.608   1.567   0.163   7.417   0.315   0.232   -0.286

Using the estimated parameters given above, we simulate a synthetic dataset and
compare the simulated dataset with the testing dataset using the methods described
above. For the ε levels in increasing order, i.e., 5,000 and 10,000, Figures 4.9-4.10
show the simulated dataset and the testing dataset. The distance between them was
83.26 and 111.48 units, respectively.
Figure 4.9: Comparison between simulated dataset and testing dataset for ε = 5,000 for the first period.
Figure 4.10: Comparison between simulated dataset and testing dataset for ε = 10,000 for the first period.
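Parameter estimation using ABC for SSE during period 2
Using the method described above, we estimate the parameters using ABC.
In order to test the fit of our model, we divide the daily log adjusted closing prices
into two parts, the training dataset and the testing dataset. The first 299 data points
form the training dataset and the remaining 100 form the testing dataset. We tried
a combination of normal and gamma priors.

Table 4.4: Table showing the number of accepted values of the parameters for different ε levels.

ε         Number of simulations (n)   Number of accepted parameters (R)
10,000    100                         72
5,000     100                         82
1,000     100                         59

It can be seen from Table 4.4 that the number of accepted values of the parameters
is lowest for the smallest tolerance level (ε = 1,000), although it is not monotone in ε:
82 values are accepted for ε = 5,000 against 72 for ε = 10,000. Next, we look at the
histograms of the accepted values of the parameters.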
Figures 4.11-4.13: Histograms of accepted values of the parameters for the second period and 100 simulations, one figure per tolerance level ε. (Panels: α1, β1, σ1, α2, β2, σ2, ρ. Axes: Value vs. Frequency.)
We have used the median as a point estimate of each parameter. The estimated
parameters of the model fit on SSE data during period 2, from December 03, 2007 to
June 30, 2009, for different ε levels are given in Table 4.5.

Table 4.5: Table showing the estimated parameters for different ε levels (100 simulations).

ε         α̂1      β̂1      σ̂1      α̂2      β̂2      σ̂2      ρ̂
10,000    5.285   2.461   0.078   5.596   0.588   0.065   -0.045
5,000     5.877   2.803   0.055   5.545   0.387   0.074   -0.043
1,000     6.217   2.412   0.080   4.866   0.508   0.059   -0.031

Using the estimated parameters given above, we simulate a synthetic dataset and
compare the simulated dataset with the testing dataset using the methods described
above. For the ε levels in increasing order, i.e., 1,000, 5,000 and 10,000, Figures
4.14-4.16 show the simulated dataset and the testing dataset. The distance between
them was 9.78, 9.67 and 10.62 units, respectively.
Figure 4.14: Comparison between simulated dataset and testing dataset for ε = 1,000 for the second period.
Figure 4.15: Comparison between simulated dataset and testing dataset for ε = 5,000 for the second period.
Figure 4.16: Comparison between simulated dataset and testing dataset for ε = 10,000 for the second period.
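Parameter estimation using ABC for SSE during period 3
Using the method described above, we estimate the parameters using ABC and
divide the daily log adjusted closing prices into a training dataset and a testing dataset.
The first 1200 data points form the training dataset and the remaining 265 form
the testing dataset. We tried a combination of normal and gamma priors.

Table 4.6: Table showing the number of accepted values of the parameters for different ε levels.

ε         Number of simulations (n)   Number of accepted parameters (R)
10,000    100                         80
5,000     100                         64
1,000     100                         33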
Figures 4.17-4.19: Histograms of accepted values of the parameters for the third period and 100 simulations, one figure per tolerance level ε. (Panels: α1, β1, σ1, α2, β2, σ2, ρ. Axes: Value vs. Frequency.)
The estimated parameters of the model fit on SSE data during period 3, from July
01, 2009 to May 29, 2015, for different ε levels are given in Table 4.7.

Table 4.7: Table showing the estimated parameters for different ε levels (100 simulations).

ε         α̂1      β̂1      σ̂1      α̂2      β̂2      σ̂2      ρ̂
10,000    0.644   1.101   0.132   2.523   0.777   0.139   -0.374
5,000     0.728   0.981   0.121   2.484   0.547   0.133   -0.356
1,000     0.567   0.838   0.122   2.016   0.176   0.142   -0.315

We have used the median as a point estimate of each parameter, as the histograms
appear to be skewed. Using the estimated parameters given above, we simulate a
synthetic dataset and compare the simulated dataset with the testing dataset using
the methods described above. For the ε levels in increasing order, i.e., 1,000, 5,000
and 10,000, Figures 4.20-4.22 show the simulated dataset and the testing dataset.
The distance between them was 54.76, 57.45 and 30.00 units, respectively.
Figure 4.20: Comparison between simulated dataset and testing dataset for ε = 1,000 for the third period.
Figure 4.21: Comparison between simulated dataset and testing dataset for ε = 5,000 for the third period.
Figure 4.22: Comparison between simulated dataset and testing dataset for ε = 10,000 for the third period.
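Parameter estimation using ABC for SSE during period 4
Using the method described above, we estimate the parameters using ABC.
In order to test the fit of our model, we divide the daily log adjusted closing prices
into two parts, the training dataset and the testing dataset. The first 150 data points
form the training dataset and the remaining 47 form the testing dataset. We tried a
combination of normal and gamma priors.

Table 4.8: Table showing the number of accepted values of the parameters for different ε levels.

ε         Number of simulations (n)   Number of accepted parameters (R)
10,000    100                         99
5,000     100                         94
1,000     100                         81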
Figures 4.23-4.24: Histograms of accepted values of the parameters for the fourth period and 100 simulations. (Panels: α1, β1, σ1, α2, β2, σ2, ρ. Axes: Value vs. Frequency.)
Figure 4.25: Histograms of estimated values of the parameters for ε = 1,000 and 100 simulations. (Panels: (a) α1, (b) β1, (c) σ1, (d) α2, (e) β2, (f) σ2, (g) ρ. Axes: Value vs. Frequency.)
As the histograms of the accepted values of the parameters appear to be skewed, we
have used the median as a point estimate of each parameter. The estimated parameters
of the model fit on SSE data during period 4, from June 01, 2015 to April 08, 2016,
for different ε levels are given in Table 4.9.

Table 4.9: Table showing the estimated parameters for different ε levels.

ε         α̂1      β̂1      σ̂1      α̂2      β̂2      σ̂2      ρ̂
10,000    3.756   1.063   0.006   0.823   0.602   0.044   -0.052
5,000     3.082   0.845   0.014   0.575   0.581   0.044   -0.079
1,000     4.109   1.157   0.020   0.731   0.544   0.039   -0.032

Using the estimated parameters given above, we simulate a synthetic dataset and
compare the simulated dataset with the testing dataset using the methods described
above. For the ε levels in increasing order, i.e., 1,000, 5,000 and 10,000, Figures
4.26-4.28 show the simulated dataset and the testing dataset. The distance between
them was 2.51, 9.19 and 3.67 units, respectively.
Figure 4.26: Comparison between simulated dataset and testing dataset for ε = 1,000 for the fourth period.
Figure 4.27: Comparison between simulated dataset and testing dataset for ε = 5,000 for the fourth period.
Figure 4.28: Comparison between simulated dataset and testing dataset for ε = 10,000 for the fourth period.
4.3.2 Parameter estimation using ABC for NIKKEI 225
We fit the generalized Heston model to the data from NIKKEI 225 and estimate
the parameters using ABC. In order to test the fit of our model, we divide the daily
log adjusted closing prices into two parts, the training dataset and the testing dataset.
The first 700 data points form the training dataset and the remaining 227 form the
testing dataset. We tried a combination of normal and gamma priors. Given a
tolerance level ε, the ABC algorithm accepts many numerical values for a single
parameter.

Table 4.10: Table showing the number of accepted values of the parameters for different ε levels.

ε         Number of simulations (n)   Number of accepted parameters (R)
10,000    100                         96
5,000     100                         86
1,000     100                         74
Figures 4.29-4.31: Histograms of accepted values of the parameters for NIKKEI 225 and 100 simulations, one figure per tolerance level ε. (Panels: α1, β1, σ1, α2, β2, σ2, ρ. Axes: Value vs. Frequency.)
We have used the median as a point estimate of each parameter, as all the histograms
of accepted parameters appeared to be skewed. The estimated parameters of the model
fit on NIKKEI 225 data from January 05, 2015 to July 24, 2018 are given in Table 4.11.

Table 4.11: Table showing the estimated parameters for different ε levels.

ε         α̂1      β̂1      σ̂1      α̂2      β̂2      σ̂2      ρ̂
10,000    0.642   1.021   0.152   0.691   0.328   0.142   -0.386
5,000     0.856   1.135   0.141   0.663   0.268   0.129   -0.340
1,000     0.702   0.716   0.157   1.081   0.236   0.146   -0.360

Using the estimated parameters given above, we simulate a synthetic dataset and
compare the simulated dataset with the testing dataset. To assess the goodness of fit
of our model, we use the same distance metric as was used in implementing the ABC
algorithm. The smaller the distance between the dataset simulated using the
estimated parameters and the testing dataset, the better the model fits. For the ε
levels in increasing order, i.e., 1,000, 5,000 and 10,000, Figures 4.32-4.34
show the simulated dataset and the testing dataset. The distance between them was
33.88, 42.38 and 39.21 units, respectively.
Figure 4.32: Simulated dataset vs. testing dataset for NIKKEI 225 for ε = 1,000.
Figure 4.33: Simulated dataset vs. testing dataset for NIKKEI 225 for ε = 5,000.
Figure 4.34: Simulated dataset vs. testing dataset for NIKKEI 225 for ε = 10,000.
Chapter 5: Contributions and Future Work
5.1 Results Overview
In Chapter 4, the generalized Heston model was fit to indices from two of the most
important Asian markets, namely the Shanghai Stock Exchange (SSE) Composite
Index and the NIKKEI 225. We used the ABC algorithm to estimate the
parameters of the model. As the histograms of the accepted values appeared to be
skewed for almost all the parameters, we used the median as a point estimate for
the parameters. For the SSE, within a particular period the number of accepted
parameters generally increased as the tolerance level ε went up, and it was highest
for ε = 10,000 in every period except period 2. As mentioned in Chapter 4, the data
from SSE was divided into 4 separate periods. Among these periods, the number of
accepted parameters tended to be higher for the shorter ones, i.e., period 2 and
period 4. Overall, the generalized Heston model was a good fit for the SSE data from
01 January, 1996 to 08 April, 2016. This is evident from Figure 4.6: not only was the
simulated dataset very close to the testing dataset, but it also captured the variations
in the testing dataset. All of the estimated parameters fall within reasonable values,
and we believe these parameters to be estimates of the true market parameters.
For the SSE data for period 1, there are no accepted parameters for ε = 1,000.
For tolerance levels ε of 5,000 and 10,000, the distances between the synthetic
dataset simulated using the estimated parameters of the generalized Heston model
and the testing dataset were 83.26 and 111.48 units, respectively. Figures 4.9-4.10
suggest that the model has intermediate predictive power for this period. The
predictive power is considerably higher during the second period, which covers the
time right before and right after the 2008 financial crisis. For tolerance levels ε of
1,000, 5,000 and 10,000, the distances between the synthetic dataset and the testing
dataset were only 9.78, 9.69 and 10.62 units, respectively. The model captured the
trend for this period well, as is evident from Figures 4.14-4.16. This is one example
of a shorter time period for which the generalized Heston model had high predictive
power. For the SSE data for period 3, the distances between the synthetic dataset
and the testing dataset for increasing tolerance levels ε were 54.76, 57.45 and
30.00 units. Compared to period 2, these distances are quite large.
From Figures 4.20-4.22 we observe that, even though the model captured the general
trend during this period, its predictive power for period 3 was not on par with that
for period 2. The model performance for period 4 is quite different from that for
period 3. As can be seen from Figures 4.26-4.28, the generalized Heston model has
very good predictive power for this period. We should note that the SSE data for
period 4 runs from 01 June, 2015 to 08 April, 2016, which coincides with the Chinese
stock market crisis. For period 4, the distances between the synthetic dataset and the
testing dataset for increasing tolerance levels ε were 2.51, 9.19 and 3.67 units,
respectively. Of the four periods into which the SSE data was divided, we feel the
model performed best for period 2 and period 4, i.e., the periods around the 2008
financial crisis and around the Chinese stock market crisis. We believe that the
generalized Heston model has good predictive power for shorter time periods and
during financial crunches or crises.
We applied the model to the NIKKEI 225 data from January 05, 2015 to July 24,
2018. This dataset was important because interest rates in the Japanese economy
were negative several times during this period. The advantage of the
generalized Heston model over a Heston-type model is that it allows interest
rates to be negative and can capture other local variations in the interest rates as
well. The distances between the synthetic dataset simulated using the
estimated parameters of the generalized Heston model and the testing dataset for
increasing tolerance levels ε were 33.88, 42.38 and 39.21 units. This model was a
good fit and had good predictive power for this data.
Even though we proposed this model in a financial setting, we would like to explore
other applications of this model as well. This is explained in more detail in the next
section.
5.2 Future Work
In this project we proposed a new model, the generalized Heston model, to predict
the volatility of financial assets. The decision to build the generalized Heston model
was motivated by the fact that some economies and markets can have negative
interest rates. We used ABC to estimate the parameters of the generalized
Heston model and to test the validity of the model. Going forward, we would like
to approach this model from a more theoretical point of view: we would like to
calculate the moments of the generalized Heston model and estimate its parameters
using a likelihood-based approach. The long-term behavior of the model is also of
interest. Moreover, we would like to explore other applications of the generalized
Heston model. In addition, we would like to test this model on a different market
during a crisis period and see how it performs in comparison to the standard models.
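For readers who want a reminder of the ABC rejection step referred to above, the following is a minimal sketch on a toy problem. It does not use the generalized Heston model: the Gaussian simulator, the uniform prior, the sample-mean summary statistic and the tolerance values are all illustrative assumptions, and the function names are hypothetical.

```python
import numpy as np

def abc_distances(observed, simulate, prior_sample, n_draws, rng):
    """Draw parameters from the prior and record each draw's distance to the data."""
    thetas = np.empty(n_draws)
    distances = np.empty(n_draws)
    for i in range(n_draws):
        theta = prior_sample(rng)                        # candidate parameter
        synthetic = simulate(theta, observed.size, rng)  # synthetic dataset
        thetas[i] = theta
        # Assumed distance: absolute difference of sample means.
        distances[i] = abs(synthetic.mean() - observed.mean())
    return thetas, distances

def simulate_toy(theta, n, rng):
    """Toy simulator (assumption): i.i.d. N(theta, 1) observations."""
    return rng.normal(theta, 1.0, size=n)

def prior_toy(rng):
    """Toy prior (assumption): theta ~ Uniform(-5, 5)."""
    return rng.uniform(-5.0, 5.0)

rng = np.random.default_rng(0)
observed = rng.normal(2.0, 1.0, size=50)  # stand-in for the observed data
thetas, distances = abc_distances(observed, simulate_toy, prior_toy, 5000, rng)

# A draw is accepted when its distance is at most the tolerance epsilon.
for epsilon in (0.1, 0.5, 1.0):
    accepted = thetas[distances <= epsilon]
    print(f"epsilon={epsilon}: accepted {accepted.size}, mean {accepted.mean():.3f}")
```

Because the same prior draws are reused for every tolerance, the number of accepted parameters can only grow as ε increases, while the accepted values become less concentrated around the data.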
5.2.1 Moments of generalized Heston model
In chapter 3 we stopped at equation (5.1) for simulation purposes. Moving forward,
we could use the fact that $\mu(t)$ follows an OU process. For any two times $t_1, t_2$
such that $t_2 > t_1$, equation (3.12) translates to,
\[
X(t_2) = X(t_1) + \int_{t_1}^{t_2} \left( \mu(s) - \frac{\nu(s)}{2} \right) ds + \int_{t_1}^{t_2} \sqrt{\nu(s)}\, dW^{X}(s). \tag{5.1}
\]
This can be further simplified as,
\[
X(t_2) = X(t_1) + \int_{t_1}^{t_2} \mu(s)\, ds - \int_{t_1}^{t_2} \frac{\nu(s)}{2}\, ds + \int_{t_1}^{t_2} \sqrt{\nu(s)}\, dW^{X}(s).
\]
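Using (3.11), $\int_{t_1}^{t_2} \mu(s)\, ds$ is,
\[
\begin{aligned}
\int_{t_1}^{t_2} \mu(s)\, ds &= \int_{t_1}^{t_2} \left[ \mu(t_1) + \alpha_1 \int_{t_1}^{s} (\beta_1 - \mu(p))\, dp + \sigma_1 \int_{t_1}^{s} dW^{\mu}(p) \right] ds \\
&= \int_{t_1}^{t_2} \mu(t_1)\, ds + \alpha_1 \int_{t_1}^{t_2} \int_{t_1}^{s} (\beta_1 - \mu(p))\, dp\, ds + \sigma_1 \int_{t_1}^{t_2} \int_{t_1}^{s} dW^{\mu}(p)\, ds.
\end{aligned}
\]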
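Assuming that $\mu(t_1)$ is known to us at time $t_2$ and letting $t_2 - t_1 = \Delta t$ we get,
\[
\int_{t_1}^{t_2} \mu(s)\, ds = \mu(t_1)\Delta t + \alpha_1 \int_{t_1}^{t_2} \int_{t_1}^{s} (\beta_1 - \mu(p))\, dp\, ds + \sigma_1 \int_{t_1}^{t_2} \int_{t_1}^{s} dW^{\mu}(p)\, ds. \tag{5.2}
\]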
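Equation (5.2) can be checked numerically. The sketch below simulates the OU process by an Euler-Maruyama scheme and compares the Monte Carlo mean of the time integral of $\mu$ with the closed form $\beta_1 \Delta t + (\mu(t_1) - \beta_1)(1 - e^{-\alpha_1 \Delta t})/\alpha_1$, which follows from the standard OU conditional mean. The parameter values, the discretization and the function names are illustrative assumptions and are not taken from the fitted models in this thesis.

```python
import numpy as np

def integrated_ou_mean(mu0, alpha1, beta1, dt_total):
    """Closed-form E[ int_{t1}^{t2} mu(s) ds | mu(t1) = mu0 ] for an OU process."""
    return beta1 * dt_total + (mu0 - beta1) * (1.0 - np.exp(-alpha1 * dt_total)) / alpha1

def simulate_integrated_ou(mu0, alpha1, beta1, sigma1, dt_total, n_steps, n_paths, rng):
    """Euler-Maruyama paths of d mu = alpha1 (beta1 - mu) dt + sigma1 dW,
    returning a left-endpoint Riemann sum of mu over [t1, t2] for each path."""
    h = dt_total / n_steps
    mu = np.full(n_paths, mu0, dtype=float)
    integral = np.zeros(n_paths)
    for _ in range(n_steps):
        integral += mu * h  # accumulate the time integral of mu
        noise = rng.standard_normal(n_paths)
        mu += alpha1 * (beta1 - mu) * h + sigma1 * np.sqrt(h) * noise
    return integral

# Hypothetical parameter values chosen only for illustration.
mu0, alpha1, beta1, sigma1, dt_total = 0.05, 2.0, -0.01, 0.1, 1.0

rng = np.random.default_rng(1)
samples = simulate_integrated_ou(mu0, alpha1, beta1, sigma1, dt_total, 500, 20000, rng)
print("Monte Carlo mean:", samples.mean())
print("Closed form     :", integrated_ou_mean(mu0, alpha1, beta1, dt_total))
```

The two printed values should agree up to Monte Carlo and discretization error. Note that the long-run mean $\beta_1$ is set to a negative value here, which the OU specification permits.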
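After plugging the value of $\int_{t_1}^{t_2} \mu(s)\, ds$ from equation (5.2) into equation (5.1) we get,
\[
\begin{aligned}
X(t_2) = X(t_1) &+ \mu(t_1)\Delta t + \alpha_1 \int_{t_1}^{t_2} \int_{t_1}^{s} (\beta_1 - \mu(p))\, dp\, ds + \sigma_1 \int_{t_1}^{t_2} \int_{t_1}^{s} dW^{\mu}(p)\, ds \\
&- \int_{t_1}^{t_2} \frac{\nu(s)}{2}\, ds + \int_{t_1}^{t_2} \sqrt{\nu(s)}\, dW^{X}(s).
\end{aligned} \tag{5.3}
\]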
Using proposition 6, we get,
\[
\begin{aligned}
X(t_2) = X(t_1) &+ \mu(t_1)\Delta t + \alpha_1 \int_{t_1}^{t_2} \int_{t_1}^{s} (\beta_1 - \mu(p))\, dp\, ds + \sigma_1 \int_{t_1}^{t_2} \int_{t_1}^{s} dW^{\mu}(p)\, ds \\
&- \int_{t_1}^{t_2} \frac{\nu(s)}{2}\, ds + \rho \int_{t_1}^{t_2} \sqrt{\nu(s)}\, dW^{\nu}(s) + \sqrt{1 - \rho^2} \int_{t_1}^{t_2} \sqrt{\nu(s)}\, dW^{Z}(s),
\end{aligned} \tag{5.4}
\]
where $dW^{Z}$ and $dW^{\nu}$ are independent of each other. The moments can be
calculated using the properties of expectation and variance.
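As an illustration of this last step, the first conditional moment can be sketched as follows. Here $\mathbb{E}_{t_1}[\cdot]$ denotes expectation given the information available at time $t_1$. We assume that $\mathbb{E}_{t_1}\left[\int_{t_1}^{t_2} \nu(s)\, ds\right] < \infty$, so that the stochastic integrals in (5.4) have zero mean, and that expectation and time integration can be interchanged. These are working assumptions for this sketch, not results established in this thesis.

```latex
% The dW^{\nu}, dW^{Z} and dW^{\mu} terms in (5.4) vanish in expectation:
\mathbb{E}_{t_1}[X(t_2)]
  = X(t_1) + \mu(t_1)\Delta t
  + \alpha_1 \int_{t_1}^{t_2} \int_{t_1}^{s}
      \left( \beta_1 - \mathbb{E}_{t_1}[\mu(p)] \right) dp\, ds
  - \frac{1}{2} \int_{t_1}^{t_2} \mathbb{E}_{t_1}[\nu(s)]\, ds .

% Conditional mean of the OU process:
\mathbb{E}_{t_1}[\mu(p)] = \beta_1 + (\mu(t_1) - \beta_1)\, e^{-\alpha_1 (p - t_1)} .

% Substituting and integrating twice:
\mathbb{E}_{t_1}[X(t_2)]
  = X(t_1) + \beta_1 \Delta t
  + (\mu(t_1) - \beta_1)\, \frac{1 - e^{-\alpha_1 \Delta t}}{\alpha_1}
  - \frac{1}{2} \int_{t_1}^{t_2} \mathbb{E}_{t_1}[\nu(s)]\, ds .
```

The remaining integral depends on the dynamics of $\nu$ and is left unevaluated here.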
Bibliography
[1] https://www.bseindia.com.
[2] https://www.investopedia.com/articles/investing/102114/guide-japans-nikkei-
225-index.asp.
[3] https://www.investopedia.com/terms/i/index.asp.
[4] https://www.sec.gov/fast-answers/answersindiceshtm.html.
[5] https://www.cnbc.com/2018/03/13/investing-japan-regional-banks-hit-by-
negative-interest-rates.html. 2018.
[6] Leif BG Andersen. Efficient simulation of the heston stochastic volatility model.
2007.
[7] Louis Bachelier. Theory of speculation. Dimson, E. and M. Mussavian (1998), A
brief history of market efficiency, European Financial Management, 4(1):91–193,
1900.
[8] Mark Broadie and Özgür Kaya. Exact simulation of stochastic volatility and
other affine jump diffusion processes. Operations research, 54(2):217–231, 2006.
147
[9] Robert Brown. A brief account of microscopical observations made in the months
of June, July and August 1827, on the particles contained in the pollen of plants;
and on the general existence of active molecules in organic and inorganic bodies.
The Philosophical Magazine, 4(21):161–173, 1828.
[10] Kalok C Chan, G Andrew Karolyi, Francis A Longstaff, and Anthony B Sanders.
An empirical comparison of alternative models of the short-term interest rate.
The journal of finance, 47(3):1209–1227, 1992.
[11] John C Cox, Jonathan E Ingersoll Jr, and Stephen A Ross. An intertemporal
general equilibrium model of asset prices. Econometrica: Journal of the Econo-
metric Society, pages 363–384, 1985.
[12] Jim Gatheral. The volatility surface: a practitioner’s guide, volume 357. John
Wiley & Sons, 2011.
[13] Steven L Heston. A closed-form solution for options with stochastic volatility
with applications to bond and currency options. The review of financial studies,
6(2):327–343, 1993.
[14] Jens Carsten Jackwerth and Mark Rubinstein. Recovering probability distribu-
tions from option prices. The Journal of Finance, 51(5):1611–1631, 1996.
[15] Ioannis Karatzas and Steven Shreve. Brownian motion and stochastic calculus,
volume 113. Springer Science & Business Media, 2012.
[16] Linyue Li, Thomas D Willett, and Nan Zhang. The effects of the global finan-
cial crisis on china’s financial market and macroeconomy. Economics Research
International, 2012, 2012.
148
[17] Roger Lord, Remmert Koekkoek, and Dick Van Dijk. A comparison of bi-
ased simulation schemes for stochastic volatility models. Quantitative Finance,
10(2):177–194, 2010.
[18] Oleg Malafeyev, Achal Awasthi, and Kaustubh S Kambekar. Random walks
and market efficiency in chinese and indian equity markets. arXiv preprint
arXiv:1709.04059, 2017.
[19] Frederic S Mishkin. Anatomy of a financial crisis. Journal of evolutionary Eco-
nomics, 2(2):115–130, 1992.
[20] India. Ministry of Finance. The BRICS Report: A Study of Brazil, Russia, India,
China, and South Africa with Special Focus on Synergies and Complementarities.
Oxford University Press, 2012.
[21] Mark Rubinstein. Implied binomial trees. The Journal of Finance, 49(3):771–
818, 1994.
[22] Steven E Shreve. Stochastic calculus for finance II: Continuous-time models,
volume 11. Springer Science & Business Media, 2004.
[23] Robert D Smith. An almost exact simulation method for the heston model.
Journal of Computational Finance, 11(1):115, 2007.
[24] Shu Tong Tse and Justin WL Wan. Low-bias simulation scheme for the heston
model by inverse gaussian approximation. Quantitative finance, 13(6):919–937,
2013.
149
[25] Alexander Van Haastrecht and Antoon Pelsser. Efficient, almost exact simulation
of the heston stochastic volatility model. International Journal of Theoretical and
Applied Finance, 13(01):1–43, 2010.
[26] Diane Wilcox and Tim Gebbie. An analysis of cross-correlations in an emerging
market. Physica A: Statistical Mechanics and its Applications, 375(2):584–598,
2007.
150
