(2024) Deep Learning Approach for Multi-Asset Option Pricing (Noguer i Alonso, HAIDA)

Deep Learning Approach for Multi-Asset Option Pricing
Miquel Noguer i Alonso, Ayoub Haida

Artificial Intelligence Finance Institute
February 28, 2024
Abstract
We introduce an algorithm for solving the semi-linear high-dimensional partial
differential equations (PDEs) that arise in multi-asset option pricing. The approach
draws an analogy between forward-backward stochastic differential equations (FBSDEs)
and deep learning (DL), in which the gradient of the solution plays the role of a policy
function in the sense of a reinforcement learning problem. The loss function quantifies
the discrepancy between the prescribed terminal condition and the terminal value of the
FBSDE solution. To approximate the policy function, we employ several time-dependent
neural networks. Using Python and TensorFlow, we conducted a series of numerical
experiments to assess the efficiency and accuracy of the proposed algorithm on several
multi-asset option pricing models: the 100-dimensional Black-Scholes model, as well as
extensions with different borrowing and lending interest rates and with default risk.
1 Introduction
Option pricing remains a pivotal challenge in the financial sector. Central to this task
is the risk-neutral pricing framework, which has traditionally been addressed through
solutions to partial differential equations (PDEs), an approach established by the
seminal work of Black and Scholes in 1973. Practical financial operations demand the
valuation of multi-asset options such as basket options, which leads to high-dimensional
PDEs. Traditional methods such as finite differences and finite elements suffer from the
curse of dimensionality, which limits their effectiveness for high-dimensional PDEs.
Consequently, the focus has shifted towards developing PDE solvers capable of handling
higher dimensions.
While linear models have played a foundational role in option pricing, they are not
without limitations. These models typically presume idealized market conditions and
may not fully account for the multifaceted interactions present in financial markets.
Even with the introduction of nonlinear terms to better represent these complexities,
the necessity to consider a substantial number of assets linked to the options remains.
To address this, we must turn to more advanced methodologies, such as stochastic
analysis and deep learning (DL) techniques. These approaches are better equipped to
handle the nonlinear high-dimensional nature of the problem, capturing elements like
varying interest rates, the potential for default, and sudden jumps in asset prices, which
are critical to accurately reflecting the nuanced dynamics of actual financial markets.
The advent of artificial intelligence (AI) and its integration into the financial sector
has revolutionized the approach to these high-dimensional problems. AI, particularly
deep learning, has shown remarkable ability to circumvent the curse of dimensionality,
offering scalable and efficient solutions for complex derivative pricing. By leveraging
the power of neural networks, AI-driven models can approximate the solutions to high-
dimensional PDEs with a level of precision and speed unattainable by conventional
numerical methods. This paradigm shift not only enhances the accuracy of option
pricing but also opens new avenues for innovation in financial risk management and
investment strategies.
1.1 The Curse of Dimensionality in Financial Modeling

The term "curse of dimensionality", introduced by Richard Bellman [1], refers to a set
of challenges that arise when analyzing data in high-dimensional spaces and that are
absent in lower-dimensional scenarios.
Electronic copy available at: https://ssrn.com/abstract=4739091
When employing finite difference methods to solve partial differential equations, it is
essential to create a mesh of points covering the domain. This mesh allows us to
approximate partial derivatives. In low-dimensional domains, this approach is
straightforward to implement, and discretization errors can be reduced by increasing the
number of mesh points. However, as the dimension of the problem increases, the number of
required mesh points grows exponentially, and so does the computational cost. To make
this concrete, consider an option pricing problem.
Consider pricing an option on 100 assets through the 100-dimensional Black-Scholes
partial differential equation. If we represented each dimension with just 10 points, the
mesh would need $10^{100}$ points. To put this in perspective, there are only about
$10^{80}$ atoms in the observable universe. Even if every atom were used as a storage
unit, this would not suffice to represent the problem, let alone perform the required
calculations.
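The arithmetic above can be checked directly. The short Python sketch below is
illustrative only; the helper name `mesh_points` and the constant `ATOMS_IN_UNIVERSE`
(the rough order of magnitude quoted above) are assumptions introduced for this example.

```python
# Illustrative check of the mesh-size arithmetic; all names are hypothetical.

ATOMS_IN_UNIVERSE = 10**80  # rough order of magnitude quoted in the text


def mesh_points(points_per_axis: int, dimension: int) -> int:
    """Number of nodes in a full tensor-product grid."""
    return points_per_axis ** dimension


if __name__ == "__main__":
    for dim in (1, 2, 3, 10, 100):
        n = mesh_points(10, dim)
        # len(str(10**k)) - 1 == k, so this prints the exponent.
        print(f"dimension {dim:>3}: 10^{len(str(n)) - 1} mesh points")
    ratio = mesh_points(10, 100) // ATOMS_IN_UNIVERSE
    print(f"mesh points per atom: 10^{len(str(ratio)) - 1}")
```

Python integers have arbitrary precision, so the count is exact rather than a
floating-point approximation.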
Fortunately, neural networks do not rely on grids to represent solutions and have
demonstrated the ability to overcome the curse of dimensionality. Among the results
supporting this is Barron's theorem, which shows that, for a suitable class of functions,
the approximation error of a neural network is controlled by the number of network
parameters rather than by the dimension of the problem.
1.2 The Market Imperfections

Market imperfections play a critical role in the pricing of financial derivatives, necessi-
tating models that can capture the realities of real-world markets. Traditional models
often assume constant interest rates; however, in practice, interest rates vary across dif-
ferent maturities and credit qualities, leading to term structure and credit spread risks
([6]). Default risk is another significant factor, as it introduces non-linearity into pricing
models, reflecting the possibility of a sudden loss of value if an issuer fails to meet its
obligations ([8]). Additionally, asset prices are subject to jumps, which can be caused
by economic events or market news, resulting in discontinuous price movements that
linear models fail to address ([9]). These nonlinearities, including stochastic volatility
and correlation, are essential for a more accurate representation of market behavior
and have been the focus of extensive research in the field of financial mathematics ([4],
[5]).
2 Theoretical Framework
2.1 Formulating the Pricing Problem: PDEs and SDEs
Two key frameworks used in this context of pricing financial options are Partial Differ-
ential Equations (PDEs) and Stochastic Differential Equations (SDEs). These frame-
works are fundamental for understanding the evolution of financial instruments over
time.
As discussed above, the Black-Scholes model plays a central role in option pricing.
Consider a portfolio composed of one option and $n$ underlying assets, and let $S_i$,
$i = 1, 2, \dots, n$, be the price processes of the underlying assets, each of which
follows the stochastic differential equation (SDE)
$$ dS_i = \mu_i S_i \, dt + \sigma_i S_i \, dW_i, \tag{1} $$
where $W_i$, $i = 1, 2, \dots, n$, are $n$ Brownian motions correlated by
$$ \langle dW_i, dW_j \rangle = \rho_{ij} \, dt. \tag{2} $$
Then the Black-Scholes partial differential equation in the multi-asset case is given by
$$ \frac{\partial C}{\partial t} + \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \rho_{ij} \sigma_i \sigma_j S_i S_j \frac{\partial^2 C}{\partial S_i \partial S_j} + r \sum_{i=1}^{n} S_i \frac{\partial C}{\partial S_i} - rC = 0, \tag{3} $$
where $C = C(t, S_t)$ is the option price. Here we assume no dividend yield and a constant
interest rate $r$, which translates to $\mu_i = r$. The PDE (3) must be considered
together with the terminal condition
$$ C(T, S_T) = g(S_T), \tag{4} $$
where $g$ is a simple claim.
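The dynamics (1)-(2) can be simulated with a few lines of NumPy. The sketch below is a
minimal illustration, not the authors' code: it uses the exact log-normal update of
geometric Brownian motion and a Cholesky factor of the correlation matrix to produce
correlated increments. The function name `simulate_correlated_gbm` and the example
parameter values are assumptions made for this illustration.

```python
import numpy as np


def simulate_correlated_gbm(s0, mu, sigma, rho, horizon, n_steps, rng):
    """Simulate one path of n correlated geometric Brownian motions.

    s0, mu, sigma : arrays of shape (n,) with initial prices, drifts, volatilities
    rho           : array of shape (n, n), correlation matrix of the drivers
    Returns an array of shape (n_steps + 1, n).
    """
    s0, mu, sigma = (np.asarray(a, dtype=float) for a in (s0, mu, sigma))
    dt = horizon / n_steps
    # Cholesky factor L with L @ L.T == rho turns independent normals into
    # increments whose covariance is rho * dt, matching equation (2).
    chol = np.linalg.cholesky(np.asarray(rho, dtype=float))
    path = np.empty((n_steps + 1, s0.size))
    path[0] = s0
    for k in range(n_steps):
        dw = np.sqrt(dt) * (chol @ rng.standard_normal(s0.size))
        # Exact log-normal update of dS_i = mu_i S_i dt + sigma_i S_i dW_i.
        path[k + 1] = path[k] * np.exp((mu - 0.5 * sigma**2) * dt + sigma * dw)
    return path


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rho = np.array([[1.0, 0.5], [0.5, 1.0]])
    path = simulate_correlated_gbm([100.0, 100.0], [0.06, 0.06], [0.2, 0.2],
                                   rho, horizon=1.0, n_steps=250, rng=rng)
    print(path[-1])
```

Setting `mu` equal to the interest rate for every asset reproduces the risk-neutral
setting assumed for the PDE (3).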
2.2 Forward-Backward Stochastic Differential Equations

One of the primary motivations for studying forward-backward stochastic differential
equations (FBSDEs) is their relation to PDEs and their ability to provide solutions to a
wide range of problems in finance. By establishing connections between FBSDEs and PDEs,
we can leverage the theory and techniques of both fields to tackle problems that are not
easily solvable with traditional approaches.
FBSDEs offer a powerful framework for modeling and analyzing dynamic systems in which
the evolution of a process depends on both forward and backward dynamics. This
characteristic is particularly relevant when the behavior of a system depends not only on
its present state but also on future contingencies. By incorporating backward
components, FBSDEs make it possible to account for anticipatory effects, which can be
vital in modeling financial phenomena with inherent memory.
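We consider FBSDEs of the special form
$$ \begin{cases} X_t = \xi + \displaystyle\int_0^t \mu(s, X_s)\, ds + \int_0^t \sigma(s, X_s)\, dW_s, \\[4pt] Y_t = g(X_T) + \displaystyle\int_t^T f(s, X_s, Y_s, Z_s)\, ds - \int_t^T Z_s \cdot dW_s, \end{cases} \tag{5} $$
where
$$ \mu : [0, T] \times \mathbb{R}^n \to \mathbb{R}^n, \quad \sigma : [0, T] \times \mathbb{R}^n \to \mathbb{R}^{n \times n}, \quad f : [0, T] \times \mathbb{R}^n \times \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}, \quad g : \mathbb{R}^n \to \mathbb{R}, $$
with $\xi = X_0$ and $Y_T = g(X_T)$. For more general forms of FBSDEs and a full treatment
of the existence and uniqueness of solutions, see the work of Ma, Protter, and Yong [13].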
2.2.1 The Link between FBSDEs and PDEs

We now present a result that generalizes the Feynman-Kac formula to a nonlinear setting.
It establishes a connection between the classical solution of a semilinear partial
differential equation (PDE) and a process-based solution of the corresponding
forward-backward stochastic differential equation (FBSDE).
Theorem 2.1 (Pardoux and Peng 1992 [10]). Let $Y : [0, T] \times \Omega \to \mathbb{R}$ and
$Z : [0, T] \times \Omega \to \mathbb{R}^n$ be adapted stochastic processes with continuous
sample paths which satisfy, for all $t \in [0, T]$,
$$ Y_t = g(X_T) + \int_t^T f(s, X_s, Y_s, Z_s)\, ds - \int_t^T Z_s \cdot dW_s. \tag{6} $$
Under some regularity assumptions [13], the nonlinear PDE
$$ \frac{\partial C}{\partial t}(t, x) + \frac{1}{2} \mathrm{Tr}\big[\sigma\sigma^{T}(t, x)\, \nabla_{xx} C(t, x)\big] + \langle \mu(t, x), \nabla_x C(t, x) \rangle + f\big(t, x, C(t, x), \sigma^{T}(t, x) \nabla_x C(t, x)\big) = 0 \tag{7} $$
is related to the FBSDE (6) in the sense that, for all $t \in [0, T]$,
$$ Y_t = C(t, X_t), \qquad Z_t = \sigma^{T}(t, X_t)\, \nabla_x C(t, X_t), \qquad 0 \le t \le T. \tag{8} $$
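To see why the identification (8) is plausible, one can apply Itô's formula to
$C(t, X_t)$. The following is a formal sketch only: it assumes that $C$ is smooth enough
for Itô's formula to apply and that $C(T, x) = g(x)$, in line with the terminal
condition (4).

```latex
\begin{align*}
dC(t, X_t)
  &= \Big( \frac{\partial C}{\partial t}
        + \langle \mu, \nabla_x C \rangle
        + \tfrac{1}{2}\,\mathrm{Tr}\big[\sigma\sigma^{T}\nabla_{xx} C\big]
     \Big)(t, X_t)\,dt
   + \big(\sigma^{T}\nabla_x C\big)(t, X_t) \cdot dW_t \\
  &= -f\big(t, X_t, C(t, X_t), (\sigma^{T}\nabla_x C)(t, X_t)\big)\,dt
   + \big(\sigma^{T}\nabla_x C\big)(t, X_t) \cdot dW_t
   && \text{by (7)}.
\end{align*}
% Integrating from t to T and using C(T, X_T) = g(X_T):
\begin{align*}
C(t, X_t)
  = g(X_T)
  + \int_t^T f\big(s, X_s, C(s, X_s), (\sigma^{T}\nabla_x C)(s, X_s)\big)\,ds
  - \int_t^T \big(\sigma^{T}\nabla_x C\big)(s, X_s) \cdot dW_s .
\end{align*}
```

Comparing the last line with (6) term by term gives $Y_t = C(t, X_t)$ and
$Z_t = (\sigma^{T} \nabla_x C)(t, X_t)$.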
2.3 The Deep Learning Approach
• Time Discretization
We discretize the time interval $[0, T]$ into $N$ equidistant subintervals
$0 = t_0 < t_1 < \dots < t_N = T$ with constant step size $\Delta t = T/N$. To approximate
$\{X_t\}_{t \in [0, T]}$, we employ the Euler scheme for the forward SDE, which yields
$$ \tilde{X}_{d+1} = \tilde{X}_d + \mu(t_d, \tilde{X}_d)(t_{d+1} - t_d) + \sigma(t_d, \tilde{X}_d)(W_{t_{d+1}} - W_{t_d}), \tag{9} $$
where $\tilde{X}_d \approx X_{t_d}$ and $d = 0, 1, \dots, N - 1$.
Analogously, we obtain a discretization of the backward equation as
$$ \tilde{Y}_{d+1} = \tilde{Y}_d - f(t_d, \tilde{X}_d, \tilde{Y}_d, \tilde{Z}_d)(t_{d+1} - t_d) + \tilde{Z}_d \cdot (W_{t_{d+1}} - W_{t_d}), \tag{10} $$
where, similarly, $\tilde{Y}_d \approx Y_{t_d}$ and $\tilde{Z}_d \approx Z_{t_d}$. By Theorem 2.1,
we can replace $\tilde{Y}_d$ by $C(t_d, \tilde{X}_d)$ and $\tilde{Z}_d$ by
$(\sigma^{T} \nabla_x C)(t_d, \tilde{X}_d)$ (see [10]) to obtain the main equation used for
the iterations:
$$ \begin{aligned} C(t_{d+1}, \tilde{X}_{d+1}) = {} & C(t_d, \tilde{X}_d) - f\big(t_d, \tilde{X}_d, C(t_d, \tilde{X}_d), (\sigma^{T} \nabla_x C)(t_d, \tilde{X}_d)\big)(t_{d+1} - t_d) \\ & + (\sigma^{T} \nabla_x C)(t_d, \tilde{X}_d) \cdot (W_{t_{d+1}} - W_{t_d}). \end{aligned} \tag{11} $$
We emphasize that we are interested in the value $\tilde{Y}_0 \approx Y_0 = C(0, X_0)$, and
note that the increments of a Brownian motion are normally distributed:
$$ W_{t_{d+1}} - W_{t_d} \sim \mathcal{N}\big(0, (t_{d+1} - t_d)\, I_{n \times n}\big). \tag{12} $$
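The two update rules (9) and (10) translate almost literally into code. The NumPy sketch
below is a hedged illustration rather than the authors' implementation: the function
names are hypothetical, and the check at the bottom uses a toy problem chosen for this
example ($\mu = 0$, $\sigma = I$, $f = 0$, $g(x) = |x|^2$), for which
$C(t, x) = |x|^2 + n(T - t)$ solves (7) and the gradient $2x$ is known exactly.

```python
import numpy as np


def step_x(x, t, dt, dw, mu, sigma):
    """Euler step (9) for the forward process."""
    return x + mu(t, x) * dt + sigma(t, x) @ dw


def step_y(y, z, x, t, dt, dw, f):
    """Step (10) for the backward process, run forward in time."""
    return y - f(t, x, y, z) * dt + z @ dw


def rollout(x0, y0, grad_c, mu, sigma, f, horizon, n_steps, rng):
    """Iterate (9)-(11) along one Brownian path; returns (X_N, Y_N)."""
    dt = horizon / n_steps
    x, y = np.asarray(x0, dtype=float), float(y0)
    for d in range(n_steps):
        t = d * dt
        dw = np.sqrt(dt) * rng.standard_normal(x.size)  # increment, see (12)
        z = sigma(t, x).T @ grad_c(t, x)                # Z = sigma^T grad C
        y = step_y(y, z, x, t, dt, dw, f)               # uses X_d, not X_{d+1}
        x = step_x(x, t, dt, dw, mu, sigma)
    return x, y


if __name__ == "__main__":
    # Toy problem (an assumption of this example, not taken from the paper).
    n, horizon = 2, 1.0
    x0 = np.ones(n)
    x_T, y_T = rollout(
        x0, y0=x0 @ x0 + n * horizon,
        grad_c=lambda t, x: 2.0 * x,
        mu=lambda t, x: np.zeros_like(x),
        sigma=lambda t, x: np.eye(x.size),
        f=lambda t, x, y, z: 0.0,
        horizon=horizon, n_steps=2000, rng=np.random.default_rng(0),
    )
    print(y_T, x_T @ x_T)  # the two numbers should be close
```

When the exact initial value and gradient are supplied, the terminal value $Y_N$ tracks
$g(X_N)$ up to the time-discretization error; this mismatch is what the training
procedure described later turns into a loss.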

• Deep Learning-Based Approximations
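In the next step, we use deep learning to approximate
$$ (\nabla_x C)(t_d, x) \in \mathbb{R}^n, \qquad x \in \mathbb{R}^n, \quad d \in \{0, 1, \dots, N\}, \tag{13} $$
rather than $C(t_d, x)$ itself, because approximations of the latter can be computed from
expression (11) together with the deep learning-based approximations of (13).
More formally, we formulate the algorithm as follows.
Let $T > 0$, let $n, p, v, N, r \in \mathbb{N}$, $\xi \in \mathbb{R}^n$, let
$f : [0, T] \times \mathbb{R}^n \times \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}$,
$g : \mathbb{R}^n \to \mathbb{R}$, and
$H : [0, T]^2 \times \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ be functions, and let
$W^{m,i} : [0, T] \times \Omega \to \mathbb{R}^n$, $i, m \in \mathbb{N}_0$, be independent
$n$-dimensional Brownian motions. Let $t_0, t_1, \dots, t_N \in [0, T]$ be real numbers with
$$ 0 = t_0 < t_1 < \dots < t_N = T. $$
We regard $p$ as the number of parameters in the deep neural network. For every
$\theta \in \mathbb{R}^p$, let $C^{\theta} \in \mathbb{R}$; for every $\theta \in \mathbb{R}^p$,
$s \in \mathbb{R}^r$, $d \in \{0, 1, \dots, N - 1\}$, $i \in \mathbb{N}_0$, let
$V^{\theta,s}_{d,i} : (\mathbb{R}^n)^{\mathbb{N}} \to \mathbb{R}^n$ be a function; and let
$X^{m,i} : \{0, 1, \dots, N\} \times \Omega \to \mathbb{R}^n$ and
$Y^{\theta,s,m,i} : \{0, 1, \dots, N\} \times \Omega \to \mathbb{R}$ be stochastic processes
which satisfy, for all $\theta \in \mathbb{R}^p$, $s \in \mathbb{R}^r$ and
$d \in \{0, 1, \dots, N - 1\}$,
$$ X_0^{m,i} = \xi, \qquad Y_0^{\theta,s,m,i} = C^{\theta}, \qquad X_{d+1}^{m,i} = H\big(t_d, t_{d+1}, X_d^{m,i}, W^{m,i}_{t_{d+1}} - W^{m,i}_{t_d}\big), \tag{14} $$
$$ \begin{aligned} Y_{d+1}^{\theta,s,m,i} = {} & Y_d^{\theta,s,m,i} - f\big(t_d, X_d^{m,i}, Y_d^{\theta,s,m,i}, V^{\theta,s}_{d,i}(\{X_d^{m,j}\}_{j \in \mathbb{N}})\big)(t_{d+1} - t_d) \\ & + V^{\theta,s}_{d,i}(\{X_d^{m,j}\}_{j \in \mathbb{N}}) \cdot \big(W^{m,i}_{t_{d+1}} - W^{m,i}_{t_d}\big). \end{aligned} \tag{15} $$
We consider
$$ C^{\theta} \approx C(0, \xi) \quad \text{and} \quad V_d^{\theta}(x) \approx (\nabla_x C)(t_d, x), \qquad d \in \{0, 1, \dots, N - 1\}, $$
for all $\theta \in \mathbb{R}^p$.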
• Stochastic Optimization
In order to estimate a suitable parameter $\theta \in \mathbb{R}^p$ of the neural network,
we can employ any standard stochastic optimizer. The objective is to minimize the
expected loss, for which we choose the mean squared error
$$ L : \mathbb{R}^p \to [0, \infty[, \qquad L(\theta) = \mathbb{E}\big[\, |Y_N^{\theta} - g(X_N)|^2 \,\big], \tag{16} $$
that is, the mean of the squared difference between the target value $g(X_N)$ and the
predicted value $Y_N^{\theta}$ of $C(t_N, X_{t_N})$. Here $X_N$ denotes the state variable
generated at time $t = t_N = T$.
The loss function measures the discrepancy between the true target value and the value
predicted by the neural network. By minimizing it, the training procedure drives the
parameters $\theta$ towards values for which the terminal condition is matched as closely
as possible.
We use a stochastic gradient descent-type algorithm to approximate the parameter vector
$\theta \in \mathbb{R}^p$. This yields random approximations
$\theta_0, \theta_1, \dots, \theta_m : \Omega \to \mathbb{R}^p$, and for sufficiently large
$N, p, m \in \mathbb{N}$ we regard $C^{\theta_m} : \Omega \to \mathbb{R}$ as a suitable
approximation of $C(0, X_0)$:
$$ C^{\theta_m} \approx C(0, \xi). \tag{17} $$
Likewise, we use the random variable $V_d^{\theta_m}(x)$ as an approximation of
$(\nabla_x C)(t_d, x)$:
$$ V_d^{\theta_m}(x) \approx (\nabla_x C)(t_d, x). \tag{18} $$
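The optimization loop can be illustrated on a deliberately reduced problem. The sketch
below is not the authors' TensorFlow implementation. It assumes the toy setting
$\mu = 0$, $\sigma = I$, $f = 0$, $g(x) = |x|^2$, for which $C(t, x) = |x|^2 + n(T - t)$
and $\nabla_x C = 2x$ are known in closed form; it plugs in this exact gradient where the
method uses the networks $V_d^{\theta}$, and trains only the scalar $C^{\theta}$ with
plain mini-batch stochastic gradient descent on the loss (16). All function names and
hyperparameters are hypothetical.

```python
import numpy as np


def terminal_pair(c0, xi, horizon, n_steps, batch, rng):
    """Simulate a batch of (Y_N, g(X_N)) for the toy problem.

    Toy assumptions: mu = 0, sigma = I, f = 0, g(x) = |x|^2, and the exact
    gradient 2x is used in place of trained sub-networks.
    """
    dt = horizon / n_steps
    x = np.tile(np.asarray(xi, dtype=float), (batch, 1))
    y = np.full(batch, float(c0))
    for _ in range(n_steps):
        dw = np.sqrt(dt) * rng.standard_normal(x.shape)
        y = y + np.sum(2.0 * x * dw, axis=1)   # step (15) with f = 0
        x = x + dw                             # step (14): Euler with mu = 0
    return y, np.sum(x * x, axis=1)


def train_c0(xi, horizon=1.0, n_steps=50, batch=64, n_iter=300, lr=0.05, seed=0):
    """Mini-batch SGD on L(c0) = E|Y_N - g(X_N)|^2 over the scalar c0 only."""
    rng = np.random.default_rng(seed)
    c0 = 0.0
    for _ in range(n_iter):
        y_n, target = terminal_pair(c0, xi, horizon, n_steps, batch, rng)
        # Since f = 0, Y_N depends on c0 only through Y_0 = c0, so the
        # gradient dL/dc0 is estimated by 2 * mean(Y_N - g(X_N)).
        grad = 2.0 * np.mean(y_n - target)
        c0 -= lr * grad
    return c0


if __name__ == "__main__":
    xi = np.ones(2)
    # For this toy problem the exact value is |xi|^2 + n * T = 4.
    print(train_c0(xi))
```

In the full method the same loss is differentiated with respect to all network weights
as well, which is where automatic differentiation becomes necessary.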
• The Architecture of the Neural Network
For simplicity, we focus on the case where the diffusion coefficient $\sigma$ in
equation (7) satisfies $\sigma(x) = \mathrm{Id}_{\mathbb{R}^n}$ for all $x \in \mathbb{R}^n$
(the identity matrix of dimension $n$).
Figure 1 illustrates the network architecture employed. Its main components approximate
two quantities: $\nabla_x C(t_d, X_{t_d})$ and $C(t_d, X_{t_d})$. The gradient
$\nabla_x C(t_d, X_{t_d})$ is approximated directly by sub-networks within the
architecture, whereas $C(t_d, X_{t_d})$ is computed iteratively within the network.
The network consists of three types of connections:
– $X_{t_d} \to H_d^1 \to H_d^2 \to \dots \to H_d^h \to \nabla_x C(t_d, X_{t_d})$ is a
multilayer feedforward neural network that approximates the spatial gradient at time
$t = t_d$. Its parameters $\theta_d$ are optimized during training.
– $(C(t_d, X_{t_d}), \nabla_x C(t_d, X_{t_d}), W_{t_{d+1}} - W_{t_d}) \to C(t_{d+1}, X_{t_{d+1}})$
is the forward iteration. It takes three inputs: $C(t_d, X_{t_d})$ (the current
approximation of the solution), $\nabla_x C(t_d, X_{t_d})$ (the approximation of the
spatial gradient at time $t_d$), and $W_{t_{d+1}} - W_{t_d}$ (the increment of the Wiener
process between $t_d$ and $t_{d+1}$). From these it computes the approximation
$C(t_{d+1}, X_{t_{d+1}})$ of the solution at the next time step $t_{d+1}$, as
characterized by equation (11). Repeating this step yields the approximation of
$C(t_N, X_{t_N})$, which is the final output of the network. Unlike the previous
connection type, this connection has no parameters (such as weights or biases) to be
optimized.
– $(X_{t_d}, W_{t_{d+1}} - W_{t_d}) \to X_{t_{d+1}}$ is the shortcut connection between
blocks at different time steps, characterized by equation (9). It takes two inputs,
$X_{t_d}$ (the value at time $t_d$) and $W_{t_{d+1}} - W_{t_d}$, and propagates the state
directly from one time step to the next. This connection also has no parameters to
optimize.
Figure 1: The architecture sketch for the algorithm
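A single time step of this architecture can be sketched in NumPy as follows. This is a
schematic forward pass only, assuming $\sigma = \mathrm{Id}$ as above. The ReLU
activation, the random initialization, the toy drift and nonlinearity in the usage
example, and all names (`init_subnetwork`, `subnetwork`, `time_step`) are assumptions
made for illustration, since the text does not specify them.

```python
import numpy as np


def init_subnetwork(sizes, rng):
    """Random weights and zero biases for a dense network with given layer sizes."""
    return [(rng.standard_normal((m, k)) / np.sqrt(m), np.zeros(k))
            for m, k in zip(sizes[:-1], sizes[1:])]


def subnetwork(params, x):
    """Connection type 1: X_{t_d} -> H^1 -> ... -> H^h -> approximate gradient."""
    h = x
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU hidden layers (an assumption)
    w, b = params[-1]
    return h @ w + b                    # linear output layer


def time_step(params, c, x, t, dt, dw, f, mu):
    """Advance (C, X) from t_d to t_{d+1}, assuming sigma = identity."""
    z = subnetwork(params, x)                  # trainable part
    c_next = c - f(t, x, c, z) * dt + z @ dw   # type 2: equation (11), no parameters
    x_next = x + mu(t, x) * dt + dw            # type 3: equation (9), no parameters
    return c_next, x_next


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, dt = 4, 0.05
    params = init_subnetwork([n, 8, 8, n], rng)
    c, x = 1.0, np.ones(n)
    dw = np.sqrt(dt) * rng.standard_normal(n)
    # Toy drift and nonlinearity, chosen only to exercise the code path.
    c, x = time_step(params, c, x, 0.0, dt, dw,
                     f=lambda t, x, y, z: -0.06 * y,
                     mu=lambda t, x: 0.06 * x)
    print(c, x)
```

Only `subnetwork` carries trainable parameters; the other two connections are fixed
arithmetic, which matches the description of the three connection types.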
3 Application: Pricing Multi-Asset Options

The numerical experiments detailed below were executed using Python with TensorFlow on a
Windows PC with an AMD Athlon Silver 3050U processor running at 2.30 GHz and 16 GB of
2400 MHz DDR4 RAM.
For the neural network architecture, we used an input layer with 100 neurons, two hidden
layers with 120 neurons each, and an output layer with 100 neurons. We used the Adaptive
Moment Estimation (Adam) optimizer with a learning rate of $8 \cdot 10^{-3}$.
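For reference, a single Adam update with the learning rate quoted above can be written in
a few lines of NumPy. The sketch follows the standard Adam update rule. The values
`beta1 = 0.9`, `beta2 = 0.999` and `eps = 1e-8` are commonly used defaults and are
assumptions here, as the text only specifies the learning rate. The function name
`adam_step` is hypothetical, and this is not TensorFlow's implementation.

```python
import numpy as np


def adam_step(theta, grad, m, v, t, lr=8e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based iteration counter."""
    m = beta1 * m + (1.0 - beta1) * grad       # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad**2    # second-moment estimate
    m_hat = m / (1.0 - beta1**t)               # bias corrections
    v_hat = v / (1.0 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v


if __name__ == "__main__":
    # Minimise the toy objective |theta|^2, whose gradient is 2 * theta.
    theta = np.array([1.0, -2.0])
    m, v = np.zeros_like(theta), np.zeros_like(theta)
    for t in range(1, 2001):
        theta, m, v = adam_step(theta, 2.0 * theta, m, v, t)
    print(theta)
```

On the first iteration the bias correction makes the step size approximately equal to
the learning rate in every coordinate, regardless of the gradient's magnitude.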
3.1 High-Dimensional Geometric Brownian Motions

We start with the simplest and most popular multi-asset pricing equation,
$$ \frac{\partial C}{\partial t} + \mu S \cdot \frac{\partial C}{\partial S} + \frac{\sigma^2}{2} \sum_{i=1}^{n} |S_i|^2 \frac{\partial^2 C}{\partial S_i^2}(t, S) - rC = 0. \tag{19} $$
• The Algorithm Results
For the numerical implementation, we take $n = 100$, $T = 1$, $\mu = r = 0.06$ and, for
simplicity, a fixed volatility $\sigma = 0.2$. For the terminal condition we choose the
claim
$$ g(S_T) = \min\{S_T^1, \dots, S_T^{100}\} \quad \text{for } (S_T^1, \dots, S_T^{100}) \in \mathbb{R}^{100}. \tag{20} $$
Number of iterations   Mean of C(0, S0)   L1 error   Mean of the loss   Runtime (seconds)
                   0              0.000      0.999            3556.34                   1
                4000             28.836      0.534            1122.13                 611
               10000             58.087      0.012             35.727                1476
               20000             59.485      0.005             27.800                2933
               30000             59.443      0.005             21.664               42101
Table 1: Training history of solving equation (19) with 10 independent runs
Figure 2: The evolution of the approximated solution (right) and its loss (left) during
training history for equation (19)
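Because equation (19) is linear, its solution also admits the classical
discounted-expectation representation $C(0, S_0) = e^{-rT}\, \mathbb{E}[g(S_T)]$ under the
dynamics (1) with $\mu = r$, which gives an independent way to sanity-check the deep
learning estimate. The Monte Carlo sketch below assumes independent Brownian motions and
an initial price of 100 for every asset. The text does not state $S_0$ for this
experiment, so that value (borrowed from Section 3.3) and the function name
`mc_min_claim` are assumptions.

```python
import numpy as np


def mc_min_claim(s0=100.0, n_assets=100, r=0.06, sigma=0.2, horizon=1.0,
                 n_paths=20_000, seed=0):
    """Monte Carlo estimate of exp(-rT) * E[min_i S_T^i] for independent GBMs."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_paths, n_assets))
    # Exact terminal law of geometric Brownian motion with drift r.
    s_T = s0 * np.exp((r - 0.5 * sigma**2) * horizon
                      + sigma * np.sqrt(horizon) * z)
    payoff = s_T.min(axis=1)                    # terminal condition (20)
    discounted = np.exp(-r * horizon) * payoff
    stderr = discounted.std(ddof=1) / np.sqrt(n_paths)
    return discounted.mean(), stderr


if __name__ == "__main__":
    price, stderr = mc_min_claim()
    print(f"Monte Carlo reference: {price:.3f} +/- {stderr:.3f}")
```

Under these assumptions the estimate can be compared with the values of $C(0, S_0)$
reported in Table 1. This shortcut is available only for the linear case; the nonlinear
examples that follow have no such direct representation.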
3.2 Accounting for Varying Interest Rates

In this subsection, we use our framework to tackle the pricing of a European multi-asset
option in a financial market that features a risk-free bank account, which is commonly
used for hedging purposes. What sets this market apart is that the bank account offers
different interest rates for borrowing and lending. Typically, the interest rate for
borrowing funds from the bank account is higher than the interest rate for lending funds
to it.
We reformulate the classical Black-Scholes equation, which assumes a constant risk-free
interest rate, to accommodate different interest rates for borrowing and lending. We
assume the same setting as in the previous example and replace the fixed interest rate
expression $r \cdot C(t, S_t)$ with
$$ (R^b - R^l) \max\Big\{ 0, \Big[ \frac{1}{\sigma} \sum_{i=1}^{n} \frac{\partial C}{\partial S_i} \Big] - C \Big\} - R^l C - \frac{(\mu - R^l)}{\sigma} \sum_{i=1}^{n} \frac{\partial C}{\partial S_i}, \tag{21} $$
where $R^l$ and $R^b$ are the interest rates for lending and borrowing, respectively.
Thus, the $n$-dimensional multi-asset pricing equation becomes
$$ \begin{aligned} & \frac{\partial C}{\partial t} + \frac{\sigma^2}{2} \sum_{i=1}^{n} |S_i|^2 \frac{\partial^2 C}{\partial S_i^2}(t, S) \\ & \quad + \max\Big\{ R^b \Big( \Big[ \sum_{i=1}^{n} S_i \frac{\partial C}{\partial S_i}(t, S) \Big] - C(t, S) \Big),\; R^l \Big( \Big[ \sum_{i=1}^{n} S_i \frac{\partial C}{\partial S_i}(t, S) \Big] - C(t, S) \Big) \Big\} = 0. \end{aligned} \tag{22} $$
For the numerical implementation, we take $n = 100$, $T = 1/2$, $\mu = R^b = 0.06$,
$R^l = 0.04$ and, for simplicity, a fixed volatility $\sigma = 0.2$. For the terminal
condition we choose the claim
$$ g(S_T) = \max\Big\{ \max_{1 \le i \le 100} S_T^i - 120,\; 0 \Big\} - 2 \max\Big\{ \max_{1 \le i \le 100} S_T^i - 150,\; 0 \Big\}. \tag{23} $$
Number of iterations   Mean of C(0, S0)   L1 error   Mean of the loss   Runtime (seconds)
                   0              0.000      0.999            1002.34                   1
                 400              7.065      0.143             805.33                 122
                1000              7.298      0.012             246.87                 414
                2000              7.471      0.005             212.74                 830
                3000              7.421      0.005             207.97                1216
Table 2: Training history of solving equation (22) with 10 independent runs
Figure 3: The evolution of the approximated solution and its loss during training history
for equation (22)
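The terminal claim (23) and expression (21) are straightforward to evaluate numerically.
The NumPy sketch below transcribes the two formulas with the parameter values used in
this subsection. The function names `payoff_23` and `expression_21` are hypothetical, and
`dc` stands for the vector of partial derivatives $\partial C / \partial S_i$ that the
sub-networks would supply.

```python
import numpy as np

R_B, R_L, MU, SIGMA = 0.06, 0.04, 0.06, 0.2   # parameter values from the text


def payoff_23(s_T):
    """Terminal claim (23): two call-type payoffs on the maximum of the assets."""
    top = np.max(s_T, axis=-1)
    return np.maximum(top - 120.0, 0.0) - 2.0 * np.maximum(top - 150.0, 0.0)


def expression_21(c, dc):
    """Evaluate expression (21) for a value c and a vector dc of partial derivatives."""
    total = np.sum(dc, axis=-1)
    return ((R_B - R_L) * np.maximum(0.0, total / SIGMA - c)
            - R_L * c
            - (MU - R_L) / SIGMA * total)


if __name__ == "__main__":
    print(payoff_23(np.array([100.0, 130.0, 90.0])))   # maximum 130
    print(payoff_23(np.array([160.0, 100.0, 90.0])))   # maximum 160
    print(expression_21(10.0, np.zeros(3)))
```

Both functions accept a leading batch dimension, so they can be applied to a whole batch
of simulated paths at once.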
3.3 Incorporating the Risk of Default in Option Pricing
The credit risk and the ongoing European sovereign debt crisis have highlighted a
significant risk that was overlooked in the original Black-Scholes model: default risk
[refer to —]. The Black-Scholes pricing model for multi-asset options can be extended to
include defaultable securities, which introduces a nonlinear component into the pricing
model.
We investigate the fair price of a European claim linked to $n$ underlying assets,
assuming no default has occurred yet. If the issuer of the claim defaults, the claim
holder receives only a fraction $\alpha \in [0, 1[$ of the current value.
The possible default is modeled by the first jump time of a Poisson process with
intensity $P$, a decreasing function of the current value; that is, default becomes more
likely when the value of the claim is low. The value process can then be described by
adding the following nonlinear term to our PDE:
$$ f(C) = -(1 - \alpha)\, P(C(t, S_t))\, C(t, S_t). \tag{24} $$
We assume that the underlying asset prices $S_t$ follow geometric Brownian motions, and
choose the intensity function $P$ as a piecewise linear function of the current value
with three regions:
$$ P(C) = \mathbf{1}_{]-\infty, v^h[}(C)\, \gamma^h + \mathbf{1}_{[v^l, +\infty[}(C)\, \gamma^l + \mathbf{1}_{[v^h, v^l[}(C) \Big[ \frac{(\gamma^h - \gamma^l)}{(v^h - v^l)} (C - v^h) + \gamma^h \Big]. \tag{25} $$
The associated nonlinear $n$-dimensional Black-Scholes equation in
$[0, T] \times \mathbb{R}^n$ becomes
$$ \frac{\partial C}{\partial t} + \mu S \cdot \frac{\partial C}{\partial S} + \frac{\sigma^2}{2} \sum_{i=1}^{n} |S_i|^2 \frac{\partial^2 C}{\partial S_i^2}(t, S) - (1 - \alpha) P(C)\, C - rC = 0. \tag{26} $$
For this problem, we choose $n = 100$, $T = 1$, $\alpha = 2/3$, $r = \mu = 0.06$,
$\sigma = 0.2$, $v^h = 40$, $v^l = 80$, $\gamma^h = 0.5$, $\gamma^l = 0.05$, and the terminal
condition
$$ g(S_T) = \min\{S_T^1, \dots, S_T^{100}\} \quad \text{for } (S_T^1, \dots, S_T^{100}) \in \mathbb{R}^{100}. \tag{27} $$
We consider a basket of stocks with initial prices
$S_0 = (S_0^1, S_0^2, \dots, S_0^{100}) = (100, \dots, 100)$. The table below summarizes the
training history for 10 independent runs.
Number of iterations   Mean of C(0, S0)   L1 error   Mean of the loss   Runtime (seconds)
                   0              0.000      0.999            4270.85                   1
                4000             28.129      0.509             908.75                 663
               10000             52.868      0.077             29.709                1724
               20000             53.912      0.005             23.455                3081
               30000             53.969      0.005             20.440                4407
Table 3: Training history of solving equation (26) with 10 independent runs
Figure 4: The evolution of the approximated solution and its loss during training history
for equation (26)
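The piecewise linear intensity (25) and the nonlinear term (24) can be evaluated as
follows. This is a minimal NumPy sketch using the thresholds and intensities listed
above and taking the recovery fraction to be $\alpha = 2/3$; the function names
`intensity` and `default_term` are hypothetical.

```python
import numpy as np

ALPHA = 2.0 / 3.0                # recovery fraction, taken to be 2/3 here
V_H, V_L = 40.0, 80.0            # thresholds v^h < v^l
GAMMA_H, GAMMA_L = 0.5, 0.05     # intensities for low and high claim values


def intensity(c):
    """Piecewise linear default intensity P(C) of equation (25)."""
    # np.interp is constant outside [V_H, V_L] and linear in between,
    # which reproduces the three regions of (25).
    return np.interp(c, [V_H, V_L], [GAMMA_H, GAMMA_L])


def default_term(c):
    """Nonlinear term f(C) = -(1 - alpha) P(C) C of equation (24)."""
    return -(1.0 - ALPHA) * intensity(c) * c


if __name__ == "__main__":
    for value in (30.0, 60.0, 90.0):
        print(value, intensity(value), default_term(value))
```

The intensity decreases from $\gamma^h$ to $\gamma^l$ as the claim value rises from $v^h$
to $v^l$, so default is more likely when the claim is worth little.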
4 Conclusions
This paper presents a novel algorithm leveraging the intriguing connection between FB-
SDEs and deep learning for efficiently solving semi-linear high-dimensional PDEs en-
countered in multi-asset option pricing. The proposed method utilizes time-dependent
neural networks to approximate the solution’s gradient, analogous to a policy function
in reinforcement learning. Extensive numerical experiments on various models, includ-
ing the 100-dimensional Black-Scholes model with diverse interest rates and default
risks, demonstrate the algorithm’s effectiveness and accuracy.
Looking ahead, several exciting avenues warrant further exploration. Firstly, the
framework can be extended by incorporating two additional neural networks, enabling
the model to price options at any point in time, not just the initial condition. This
would significantly enhance its practical applicability in real-world financial markets.
Secondly, the algorithm has the potential to be adapted for solving second-order PDEs
arising in complex financial settings by employing Doubly FBSDEs. This expansion
would broaden the scope of the method and its ability to tackle even more intricate
financial problems.
This work establishes a promising direction for applying deep learning techniques
to solve complex PDEs in financial engineering. The proposed algorithm demonstrates
remarkable efficiency and accuracy, paving the way for further advancements towards
a powerful and versatile framework for option pricing and beyond.
References
[1] R. Bellman, Dynamic Programming, Princeton University Press, Princeton, NJ,
2010. Reprint of the 1957 edition.
[2] D. Duffie, M. Schroder, C. Skiadas, Recursive valuation of defaultable securities and
the timing of resolution of uncertainty, The Annals of Applied Probability, 6(4),
1075–1090, 1996.
[3] C. Bender, N. Schweizer, J. Zhuo, A primal-dual algorithm for BSDEs,
arXiv:1310.3694, 2014.
[4] S. L. Heston, A Closed-Form Solution for Options with Stochastic Volatility with
Applications to Bond and Currency Options, The Review of Financial Studies.
[5] J. Hull, A. White, The Pricing of Options on Assets with Stochastic Volatilities,
The Journal of Finance, 42(2), 281–300.
[6] F. A. Longstaff, E. S. Schwartz, Interest Rate Volatility and the Term Structure: A
Two-Factor General Equilibrium Model, The Journal of Finance, 47(4), 1259–1282.
[7] Y. Z. Bergman, Option pricing with different interest rates, Review of Financial
Studies, 8(2), 475–500, 1995.
[8] D. Duffie, K. J. Singleton, Modeling Term Structures of Defaultable Bonds, Review
of Financial Studies, 12(4), 687–720.
[9] R. C. Merton, Option pricing when underlying stock returns are discontinuous,
Journal of Financial Economics, 3(1-2), 125–144.
[10] É. Pardoux, S. Peng, Backward Stochastic Differential Equations and Quasi-Linear
Parabolic Partial Differential Equations, Lecture Notes in Control and Inform. Sci.,
Springer, Berlin, 1992, pp. 200–217.
[11] É. Pardoux, S. G. Peng, Adapted Solution of a Backward Stochastic Differential
Equation, Systems & Control Letters, vol. 14, 1990.
[12] É. Pardoux, S. Tang, Forward-Backward Stochastic Differential Equations and
Quasilinear Parabolic PDEs, Probab. Theory Related Fields, 114(2), 123–150, 1999.
[13] J. Ma, P. Protter, J. Yong, Solving Forward-Backward Stochastic Differential
Equations Explicitly - a Four Step Scheme, Probability Theory and Related Fields,
1998, 339–359.
