(2024) Deep Learning Approach for Multi-Asset Option Pricing (Noguer i Alonso, HAIDA)
Abstract
We introduce an algorithm for solving the semi-linear high-dimensional partial
differential equations (PDEs) that arise in multi-asset option pricing. The approach
draws an analogy between forward-backward stochastic differential equations (FBSDEs)
and deep learning (DL), in which the gradient of the solution plays the role of a
policy function in the sense of a reinforcement learning problem. The loss function
quantifies the discrepancy between the prescribed terminal condition and the terminal
value of the FBSDE's solution. To approximate the policy function efficiently, we
employ several time-dependent neural networks. Using Python and TensorFlow, we
conducted a series of numerical experiments to assess the efficiency and accuracy of
the proposed algorithm on several multi-asset option pricing models: the
100-dimensional Black-Scholes model, and extensions that incorporate different
interest rates for lending and borrowing and the risk of default.
1 Introduction
Option pricing remains a pivotal challenge in the financial sector. Central to this task
is the risk-neutral pricing framework, which has traditionally been addressed through
solutions to partial differential equations (PDEs), a method established by the
seminal work of Black and Scholes in 1973. Practical financial operations demand
the valuation of multi-asset options such as basket options, leading to the formula-
tion of high-dimensional PDEs. Traditional resolution methods like finite difference
and finite element methods are hindered by the curse of dimensionality, limiting their
effectiveness for high-dimensional PDEs. Consequently, the focus has shifted towards
developing novel PDE solvers capable of handling the complexities of higher dimensions.
While linear models have played a foundational role in option pricing, they are not
without limitations. These models typically presume idealized market conditions and
may not fully account for the multifaceted interactions present in financial markets.
Even with the introduction of nonlinear terms to better represent these complexities,
the necessity to consider a substantial number of assets linked to the options remains.
To address this, we must turn to more advanced methodologies, such as stochastic
analysis and deep learning (DL) techniques. These approaches are better equipped to
handle the nonlinear high-dimensional nature of the problem, capturing elements like
varying interest rates, the potential for default, and sudden jumps in asset prices, which
are critical to accurately reflecting the nuanced dynamics of actual financial markets.
The advent of artificial intelligence (AI) and its integration into the financial sector
has revolutionized the approach to these high-dimensional problems. AI, particularly
deep learning, has shown remarkable ability to circumvent the curse of dimensionality,
offering scalable and efficient solutions for complex derivative pricing. By leveraging
the power of neural networks, AI-driven models can approximate the solutions to high-
dimensional PDEs with a level of precision and speed unattainable by conventional
numerical methods. This paradigm shift not only enhances the accuracy of option
pricing but also opens new avenues for innovation in financial risk management and
investment strategies.
Electronic copy available at: https://ssrn.com/abstract=4739091
problems that are absent in lower-dimensional scenarios.
Consider the problem of pricing an option on 100 assets through a simple equation
such as the 100-dimensional Black-Scholes partial differential equation. If we were to
represent each dimension with just 10 points, we would need $10^{100}$ points for the
mesh. To put this in perspective, there are only about $10^{80}$ atoms in the observable
universe. Even if we used every atom as a storage unit, it would not be sufficient
to represent the problem, let alone perform the required calculations.
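The arithmetic behind this comparison can be checked directly. The short sketch below uses Python's arbitrary-precision integers; the helper name `mesh_size` is introduced here for illustration only.

```python
def mesh_size(points_per_axis: int, dimension: int) -> int:
    """Number of nodes in a tensor-product grid."""
    return points_per_axis ** dimension

grid_points = mesh_size(10, 100)      # exact, since Python ints do not overflow
atoms_in_universe = 10 ** 80          # order-of-magnitude figure quoted in the text
ratio = grid_points // atoms_in_universe

print(f"grid points: 10^{len(str(grid_points)) - 1}")
print(f"grid points per atom: 10^{len(str(ratio)) - 1}")
```

Even with one grid value stored per atom, the mesh would exceed the available storage by a factor of $10^{20}$.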
Fortunately, neural networks have demonstrated the ability to overcome the curse
of dimensionality, and they do not rely on grids to represent solutions. Among the
supporting results is Barron's proof that the approximation error of a neural network
depends not on the dimension of the problem but on the number of the network's
parameters.
2 Theoretical Framework
2.1 Formulating the Pricing Problem: PDEs and SDEs
Two key frameworks used in the context of pricing financial options are Partial
Differential Equations (PDEs) and Stochastic Differential Equations (SDEs). These
frameworks are fundamental for understanding the evolution of financial instruments
over time.
where $C = C(t, S_t)$ is the option price. Assuming no dividend yield and a constant
interest rate $r$, which translates to $\mu = r$, the PDE (3) must be considered with the
terminal condition
$$C(T, S_T) = g(S_T) \tag{4}$$
with g as a simple claim.
FBSDEs offer a powerful framework for modeling and analysing dynamic systems,
where the evolution of a process depends on both forward and backward dynamics.
This characteristic is particularly relevant in situations where the future behaviour of
a system depends not only on the present state but also on future contingencies. By
incorporating backward components, FBSDEs enable the consideration of anticipatory
effects, which can be vital in modeling financial phenomena with inherent memory.
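where:
$\mu : [0, T] \times \mathbb{R}^n \to \mathbb{R}^n$
$\sigma : [0, T] \times \mathbb{R}^n \to \mathbb{R}^{n \times n}$
$f : [0, T] \times \mathbb{R}^n \times \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}$
$g : \mathbb{R}^n \to \mathbb{R}$
$\xi = X_0$ and $Y_T = g(X_T)$
For more general forms of FBSDEs and a full treatment of the existence and
uniqueness of their solutions, one may refer to the work of Ma, Protter, and Yong [13].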
Under some regularity assumptions [13], the nonlinear PDE
$$\frac{\partial C}{\partial t}(t, x) + \frac{1}{2}\,\mathrm{Tr}\left[\sigma\sigma^{T}(t, x)\,\nabla_{xx} C(t, x)\right] + \langle \mu(t, x), \nabla_x C(t, x) \rangle + f\big(t, x, C(t, x), \sigma^{T}(t, x)\,\nabla_x C(t, x)\big) = 0 \tag{7}$$
is related to the FBSDE (6) in the sense that for all $t \in [0, T]$ it holds that
• Time Discretization
$$\tilde{X}_{d+1} = \tilde{X}_d + \mu(t_d, \tilde{X}_d)\,(t_{d+1} - t_d) + \sigma(t_d, \tilde{X}_d)\,(W_{t_{d+1}} - W_{t_d}), \tag{9}$$
where $\tilde{X}_d \approx X_{t_d}$ and $d = 0, 1, \dots, N$.
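The Euler scheme (9) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `euler_maruyama`, the uniform time grid, and the geometric Brownian motion coefficients (`r`, `vol`) are assumptions chosen for the example.

```python
import numpy as np

def euler_maruyama(x0, mu, sigma, T, N, rng):
    """Simulate one path of dX = mu(t, X) dt + sigma(t, X) dW on a uniform grid.

    Returns the path, shape (N + 1, n), and the Brownian increments, shape (N, n).
    sigma(t, x) must return an (n, n) matrix.
    """
    n = x0.shape[0]
    dt = T / N
    dW = rng.normal(0.0, np.sqrt(dt), size=(N, n))   # W_{t_{d+1}} - W_{t_d}
    X = np.empty((N + 1, n))
    X[0] = x0
    for d in range(N):
        t_d = d * dt
        X[d + 1] = X[d] + mu(t_d, X[d]) * dt + sigma(t_d, X[d]) @ dW[d]
    return X, dW

# Illustrative coefficients (assumptions): uncorrelated geometric Brownian motions.
r, vol = 0.05, 0.2
mu = lambda t, x: r * x
sigma = lambda t, x: vol * np.diag(x)

rng = np.random.default_rng(0)
X, dW = euler_maruyama(np.full(100, 100.0), mu, sigma, T=1.0, N=40, rng=rng)
print(X.shape, dW.shape)
```

The increments `dW` are returned alongside the path because the backward equation reuses the same Brownian increments.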
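Analogously, we obtain a discretization of the FBSDE as
$$\tilde{Y}_{d+1} = \tilde{Y}_d - f(t_d, \tilde{X}_d, \tilde{Y}_d, \tilde{Z}_d)\,(t_{d+1} - t_d) + \tilde{Z}_d \cdot (W_{t_{d+1}} - W_{t_d}), \tag{10}$$
where, similarly, $\tilde{Y}_d \approx Y_{t_d}$ and $\tilde{Z}_d \approx Z_{t_d}$. By the result of the last theorem, we can
replace $\tilde{Y}_d$ by $C(t_d, \tilde{X}_d)$ and $\tilde{Z}_d$ by $(\sigma^{T} \nabla_x C)(t_d, \tilde{X}_d)$ (refer to [10]) and obtain the
main equation used for the iterations:
$$C(t_{d+1}, \tilde{X}_{d+1}) = C(t_d, \tilde{X}_d) - f\big(t_d, \tilde{X}_d, C(t_d, \tilde{X}_d), (\sigma^{T} \nabla_x C)(t_d, \tilde{X}_d)\big)\,(t_{d+1} - t_d) + (\sigma^{T} \nabla_x C)(t_d, \tilde{X}_d) \cdot (W_{t_{d+1}} - W_{t_d}) \tag{11}$$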
not the $C(t_d, x)$, because approximations of the latter can be computed using
expression (11) together with the deep learning-based approximations for (13).
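One step of the discretized backward equation (10) can be sketched as follows. The check uses a toy case with a known solution, chosen purely for illustration: $\mu = 0$, $\sigma(t, x) = \mathrm{vol} \cdot \mathrm{diag}(x)$, $f = 0$ and $g(x) = \sum_i x_i$, for which $C(t, x) = \sum_i x_i$ and $\sigma^{T} \nabla_x C = \mathrm{vol} \cdot x$. The names `bsde_step` and `f_zero` are hypothetical.

```python
import numpy as np

def bsde_step(y, x, z, dW, t, dt, f):
    """One step of (10): Y_{d+1} = Y_d - f(t_d, X_d, Y_d, Z_d) * dt + Z_d . dW_d."""
    return y - f(t, x, y, z) * dt + z @ dW

# Toy setting (assumptions): zero drift, zero driver, payoff g(x) = sum(x).
n, N, T, vol = 5, 50, 1.0, 0.2
dt = T / N
rng = np.random.default_rng(1)
f_zero = lambda t, x, y, z: 0.0

x = np.full(n, 100.0)
y = x.sum()                                    # exact initial value C(0, X_0)
for d in range(N):
    dW = rng.normal(0.0, np.sqrt(dt), size=n)
    z = vol * x                                # exact (sigma^T grad_x C)(t_d, X_d)
    y = bsde_step(y, x, z, dW, d * dt, dt, f_zero)
    x = x + vol * x * dW                       # Euler step (9) with mu = 0

print(abs(y - x.sum()))                        # terminal mismatch |Y_N - g(X_N)|
```

With the exact initial value and gradient, the terminal mismatch vanishes up to rounding. In the algorithm both quantities are unknown and are learned by driving this mismatch towards zero.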
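Let $p$ denote the number of parameters in the deep neural network. For every
$\theta \in \mathbb{R}^p$, let $C^{\theta} \in \mathbb{R}$; for every $s \in \mathbb{R}^r$, $d \in \{0, 1, \dots, N - 1\}$, $i \in \mathbb{N}_0$,
let $V^{\theta,s}_{d,i} : (\mathbb{R}^n)^N \to \mathbb{R}^n$ be a function; and let
$X^{m,i} : \{0, 1, \dots, N\} \times \Omega \to \mathbb{R}^n$ and $Y^{\theta,s,m,i} : \{0, 1, \dots, N\} \times \Omega \to \mathbb{R}$
be two stochastic processes which satisfy,
for all $\theta \in \mathbb{R}^p$, $s \in \mathbb{R}^r$ and $d \in \{0, 1, \dots, N - 1\}$:
for all $\theta \in \mathbb{R}^p$.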
• Stochastic Optimization
which is defined as the mean of the squared difference between the target value
$g(X_N)$ and the predicted value $C(t_N, X_{t_N}) = Y^{\theta}_N$. Here, $X_T$ represents the state
variable generated at time $t = t_N = T$.
The loss function represents the discrepancy between the true target value and
the value predicted by the neural network. By minimizing this loss function, the
neural network aims to improve its ability to accurately approximate the ideal
parameters θ.
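As a minimal sketch of the loss described above, the expectation can be estimated by an average over a batch of simulated paths. The function name `terminal_loss` and the two-asset sample values are illustrative assumptions; the minimum payoff is the one given in (27).

```python
import numpy as np

def terminal_loss(y_N, x_N, g):
    """Monte Carlo estimate of E[(g(X_N) - Y_N)^2] over a batch of paths.

    y_N: (batch,) simulated terminal values Y_N^theta
    x_N: (batch, n) simulated terminal states X_N
    g:   payoff acting on the last axis
    """
    return np.mean((g(x_N) - y_N) ** 2)

g_min = lambda x: np.min(x, axis=-1)            # payoff of the form (27)
x_N = np.array([[90.0, 110.0], [105.0, 95.0]])  # toy batch: 2 paths, 2 assets
y_N = np.array([88.0, 96.0])
print(terminal_loss(y_N, x_N, g_min))           # (2^2 + 1^2) / 2 = 2.5
```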
– $(C(t_d, X_{t_d}), \nabla_x C(t_d, X_{t_d}), W_{t_{d+1}} - W_{t_d}) \longrightarrow C(t_{d+1}, X_{t_{d+1}})$ represents the
forward iteration process in the neural network. Here, the network takes three
inputs: $C(t_d, X_{t_d})$ (the current approximation of the solution), $\nabla_x C(t_d, X_{t_d})$
(the approximation of the spatial gradient at time $t_d$), and $W_{t_{d+1}} - W_{t_d}$ (the
increment of the Wiener process between time steps $t_d$ and $t_{d+1}$).
The forward iteration process uses this information to compute the
approximation $C(t_{d+1}, X_{t_{d+1}})$ of the solution at the next time
step $t_{d+1}$. This step is essential for the network to iteratively build its
approximation of $C(t_N, X_{t_N})$, which is the final output of the
network, and is fully characterized by equation.
Unlike the previous connection type, there are no parameters (such as weights
or biases) to be optimized in this specific connection. The iteration process
relies on the input data and the network’s current approximation to generate
the next estimation iteratively.
Figure 1: The architecture sketch for the algorithm
For the neural network architecture, we used an input layer with 100 neurons,
two hidden layers with 120 neurons each, and an output layer with 100 neurons. We used
the Adaptive Moment Estimation (Adam) optimizer with a learning rate of $8 \cdot 10^{-3}$.
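To give a sense of scale, the layer sizes above can be written out as a plain NumPy forward pass. This is only a sketch: the text does not state the activation function, the initialisation, or whether normalisation layers are used, so the ReLU activation, Gaussian initialisation and bias terms below are assumptions, and the experiments themselves were run in TensorFlow.

```python
import numpy as np

LAYER_SIZES = [100, 120, 120, 100]   # input, two hidden layers, output (from the text)

def init_params(sizes, rng):
    """Dense layers with biases; the Gaussian initialisation is an assumption."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, k)), np.zeros(k))
            for m, k in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Map a batch of states (batch, 100) to gradient estimates (batch, 100)."""
    for W, b in params[:-1]:
        x = np.maximum(x @ W + b, 0.0)   # ReLU is an assumption, not stated in the text
    W, b = params[-1]
    return x @ W + b                     # linear output layer

rng = np.random.default_rng(0)
params = init_params(LAYER_SIZES, rng)
n_params = sum(W.size + b.size for W, b in params)
z = forward(params, np.full((4, 100), 100.0))
print(n_params, z.shape)   # 38740 (4, 100)
```

Under these assumptions one such sub-network has 38,740 trainable parameters. Since the method uses a separate network per time step, the total grows linearly with the number of time steps.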
Number of iterations   Mean of C(0, S_0)   L1 error   Mean of the loss   Runtime (seconds)
0                      0.000               0.999      3556.34            1
4000                   28.836              0.534      1122.13            611
10000                  58.087              0.012      35.727             1476
20000                  59.485              0.005      27.800             2933
30000                  59.443              0.005      21.664             42101
Figure 2: The evolution of the approximated solution (right) and its loss (left) during
training history for equation (19)
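Assuming the same settings as in the last example, we replace the fixed interest
rate term $r \cdot C(t, S_t)$ with
$$(R^b - R^l) \max\left\{0, \left[\frac{1}{\sigma} \sum_{i=1}^{n} \frac{\partial C}{\partial S_i}\right] - C\right\} - R^l C - \frac{(\mu - R^l)}{\sigma} \sum_{i=1}^{n} \frac{\partial C}{\partial S_i} \tag{21}$$
where $R^l$ and $R^b$ are the interest rates for lending and borrowing, respectively. Thus, the
$n$-dimensional multi-asset pricing equation becomes
$$\frac{\partial C}{\partial t} + \frac{\sigma^2}{2} \sum_{i=1}^{n} |S_i|^2 \frac{\partial^2 C}{\partial S_i^2}(t, S) + \max\left\{ R^b \left( \left[\sum_{i=1}^{n} S_i \left(\frac{\partial C}{\partial S_i}\right)(t, S)\right] - C(t, S) \right),\; R^l \left( \left[\sum_{i=1}^{n} S_i \left(\frac{\partial C}{\partial S_i}\right)(t, S)\right] - C(t, S) \right) \right\} = 0 \tag{22}$$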
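Expression (21) translates directly into a small function. The sketch below is illustrative: the name `f_different_rates` and the numerical values in the usage lines are assumptions, not parameters reported in the text.

```python
import numpy as np

def f_different_rates(c, grad_c, mu, sigma, r_lend, r_borrow):
    """Expression (21): nonlinear term for different lending and borrowing rates.

    c:      option value C(t, S), scalar
    grad_c: vector of partial derivatives dC/dS_i, shape (n,)
    """
    s = np.sum(grad_c) / sigma                     # (1 / sigma) * sum_i dC/dS_i
    return ((r_borrow - r_lend) * max(0.0, s - c)
            - r_lend * c
            - (mu - r_lend) * s)

# Illustrative values (assumptions).
grad = np.full(100, 0.02)                          # sum = 2, so s = 10 for sigma = 0.2
print(f_different_rates(5.0, grad, mu=0.06, sigma=0.2, r_lend=0.04, r_borrow=0.06))
# With equal rates and mu equal to that rate, only the linear term -r * C remains.
print(f_different_rates(10.0, grad, mu=0.05, sigma=0.2, r_lend=0.05, r_borrow=0.05))
```

The second call shows that the expression collapses to a single linear discounting term when the lending and borrowing rates coincide, so the nonlinearity comes entirely from the spread $R^b - R^l$.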
• The Algorithm Results
Number of iterations   Mean of C(0, S_0)   L1 error   Mean of the loss   Runtime (seconds)
0                      0.000               0.999      1002.34            1
400                    7.065               0.143      805.33             122
1000                   7.298               0.012      246.87             414
2000                   7.471               0.005      212.74             830
3000                   7.421               0.005      207.97             1216
Figure 3: The evolution of the approximated solution and its loss during training history
for equation (22)
Credit risk and the ongoing European sovereign debt crisis have highlighted a
significant risk that was overlooked in the original Black-Scholes model: default risk
[refer to —]. The Black-Scholes pricing model for multi-asset options can be extended
to incorporate this factor, so as to include defaultable securities, which
introduces a nonlinear component into the pricing model.
We investigate the fair price of a European claim that is linked to $n$ underlying
assets, assuming no default has occurred yet. In the event of the claim's issuer defaulting,
the claim holder receives only a fraction $\alpha \in [0, 1[$ of the current value.
This possible default is modeled by the first jump time of a Poisson process with
intensity $P$, a decreasing function of the current value; i.e., default becomes more
likely when the claim's value is low. The value process can then be expressed by adding
the following nonlinear term to our PDE
$$P(C) = \mathbf{1}_{]-\infty, v^h[}(C)\,\gamma^h + \mathbf{1}_{[v^l, +\infty[}(C)\,\gamma^l + \mathbf{1}_{[v^h, v^l[}(C) \left[ \frac{(\gamma^h - \gamma^l)}{(v^h - v^l)} (C - v^h) + \gamma^h \right] \tag{25}$$
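The intensity (25) is a piecewise-linear function that equals $\gamma^h$ below $v^h$, equals $\gamma^l$ above $v^l$, and interpolates linearly in between. A minimal sketch follows; the function name `default_intensity` and the threshold and intensity values are illustrative assumptions, since the text does not state them here.

```python
import numpy as np

def default_intensity(c, v_h, v_l, gamma_h, gamma_l):
    """Piecewise-linear intensity P(C) of (25), with v_h < v_l and gamma_h > gamma_l."""
    c = np.asarray(c, dtype=float)
    slope = (gamma_h - gamma_l) / (v_h - v_l)      # negative: intensity falls as C rises
    middle = slope * (c - v_h) + gamma_h
    return np.where(c < v_h, gamma_h, np.where(c >= v_l, gamma_l, middle))

# Illustrative thresholds and intensities (assumptions, not taken from the text).
values = default_intensity([40.0, 60.0, 80.0], v_h=50.0, v_l=70.0,
                           gamma_h=0.2, gamma_l=0.02)
print(values)
```

The linear segment matches $\gamma^h$ at $v^h$ and $\gamma^l$ at $v^l$, so the intensity is continuous and non-increasing in the claim's value, in line with the description above.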
$$g(S_T) = \min\{S_T^1, \dots, S_T^{100}\} \quad \text{for } (S_T^1, \dots, S_T^{100}) \in \mathbb{R}^{100} \tag{27}$$
and we consider a basket of stocks with initial prices $S_0 = (S_0^1, S_0^2, \dots, S_0^{100}) = (100, \dots, 100)$.
The table below summarizes the evolution of the training history for 10 independent
runs.
Number of iterations   Mean of C(0, S_0)   L1 error   Mean of the loss   Runtime (seconds)
0                      0.000               0.999      4270.85            1
4000                   28.129              0.509      908.75             663
10000                  52.868              0.077      29.709             1724
20000                  53.912              0.005      23.455             3081
30000                  53.969              0.005      20.440             4407
Figure 4: The evolution of the approximated solution and its loss during training history
for equation (26)
4 Conclusions
This paper presents a novel algorithm leveraging the intriguing connection between FB-
SDEs and deep learning for efficiently solving semi-linear high-dimensional PDEs en-
countered in multi-asset option pricing. The proposed method utilizes time-dependent
neural networks to approximate the solution’s gradient, analogous to a policy function
in reinforcement learning. Extensive numerical experiments on various models, includ-
ing the 100-dimensional Black-Scholes model with diverse interest rates and default
risks, demonstrate the algorithm’s effectiveness and accuracy.
Looking ahead, several exciting avenues warrant further exploration. Firstly, the
framework can be extended by incorporating two additional neural networks, enabling
the model to price options at any point in time, not just the initial condition. This
would significantly enhance its practical applicability in real-world financial markets.
Secondly, the algorithm has the potential to be adapted for solving second-order PDEs
arising in complex financial settings by employing Doubly FBSDEs. This expansion
would broaden the scope of the method and its ability to tackle even more intricate
financial problems.
This work establishes a promising direction for applying deep learning techniques
to solve complex PDEs in financial engineering. The proposed algorithm demonstrates
remarkable efficiency and accuracy, paving the way for further advancements towards
a powerful and versatile framework for option pricing and beyond.
References
[1] R. Bellman, Dynamic Programming, Princeton University Press, Princeton, NJ,
2010. Reprint of the 1957 edition.
[2] D. Duffie, M. Schroder, C. Skiadas, Recursive valuation of defaultable securities and
the timing of resolution of uncertainty, The Annals of Applied Probability, 6(4),
1075–1090, 1996.
[3] C. Bender, N. Schweizer, J. Zhuo, A primal-dual algorithm for BSDEs,
arXiv:1310.3694, 2014.
[4] S. L. Heston, A Closed-Form Solution for Options with Stochastic Volatility with
Applications to Bond and Currency Options, The Review of Financial Studies, 6(2),
327–343, 1993.
[5] J. Hull, A. White, The Pricing of Options on Assets with Stochastic Volatilities,
The Journal of Finance, 42(2), 281–300, 1987.
[6] F. A. Longstaff, E. S. Schwartz, Interest Rate Volatility and the Term Structure: A
Two-Factor General Equilibrium Model, The Journal of Finance, 47(4), 1259–1282, 1992.
[7] Y. Z. Bergman, Option pricing with different interest rates, The Review of Financial
Studies, 8(2), 475–500, 1995.
[8] D. Duffie, K. J. Singleton, Modeling Term Structures of Defaultable Bonds, The Review
of Financial Studies, 12(4), 687–720, 1999.
[9] R. C. Merton, Option pricing when underlying stock returns are discontinuous,
Journal of Financial Economics, 3(1–2), 125–144, 1976.
[10] É. Pardoux, S. Peng, Backward Stochastic Differential Equations and Quasi-Linear
Parabolic Partial Differential Equations, Lecture Notes in Control and Information
Sciences, Springer, Berlin, 1992, pp. 200–217.
[11] É. Pardoux, S. G. Peng, Adapted Solution of a Backward Stochastic Differential
Equation, Systems & Control Letters, 14, 1990.
[12] É. Pardoux, S. Tang, Forward-Backward Stochastic Differential Equations and
Quasilinear Parabolic PDEs, Probability Theory and Related Fields, 114(2),
123–150, 1999.
[13] J. Ma, P. Protter, J. Yong, Solving Forward-Backward Stochastic Differential
Equations Explicitly - a Four Step Scheme, Probability Theory and Related Fields,
98, 339–359, 1994.