
Price Sensitivity Analysis for Mortgage-Backed Securities and

Estimating Value-at-Risk

Geoffrey Simmons∗
Rajat Goyal†
Xiaohuang Wu‡
Yu-Ying Lee§
Zhuozhou Liu¶
(in alphabetical order)

October 17, 2013

Abstract

Value-at-Risk is a widely used measure of the potential risk of loss on a portfolio of financial assets.
A full revaluation historical VaR engine is often the default choice for capturing risk across most strate-
gies/instruments. However, MBS valuations are complex and computationally intensive. Consequently,
computing historical VaR by full revaluation is not practical for these securities. In this project, we at-
tempt to develop a framework to estimate VaR for agency mortgage-backed securities that is much more
computationally efficient than a full revaluation VaR engine. We explore Taylor Series, Linear Regression,
Automatic Linear Model Selection and Neural Network approaches to estimate the change in price of an
MBS security given its sensitivities to market parameters such as key rates, implied volatility, mortgage
spread and option-adjusted spread. We are able to develop accurate prediction models by using the market
parameters mentioned above. Even though developing an accurate prediction model without using OAS is
difficult, we uncover a promising direction for future research in the form of the Neural Network Without
OAS model.

Acknowledgements

This work is a collaboration between the Master of Financial Engineering (MFE) program at the Haas School
of Business, University of California Berkeley and Royal Bank of Canada (RBC) Capital Markets. We thank
Linda Kreitzman, MFE Executive Director for giving us permission to pursue this excellent opportunity to
work with RBC. We convey our heartfelt gratitude to our project supervisor Serkan Eren, Vice President,
Global Arbitrage and Trading at Royal Bank of Canada Capital Markets for his regular suggestions and
encouragement. We also thank Bobby Koupparis, Vice President, Global Arbitrage and Trading at Royal
Bank of Canada Capital Markets for providing us with innovative insight throughout the project. We are
deeply indebted to Royal Bank of Canada Capital Markets for providing us with the data needed for our
analytical work.
∗ Geoffrey Simmons [geoffrey_simmons@mfe.berkeley.edu] is a Master of Financial Engineering Candidate 2014 at the Haas School of Business, University of California Berkeley
† Rajat Goyal [rajat_goyal@mfe.berkeley.edu] is a Master of Financial Engineering Candidate 2014 at the Haas School of Business, University of California Berkeley
‡ Xiaohuang Wu [xiaohuang_wu@mfe.berkeley.edu] is a Master of Financial Engineering Candidate 2014 at the Haas School of Business, University of California Berkeley
§ Yu-Ying Lee [yuying_lee@mfe.berkeley.edu] is a Master of Financial Engineering Candidate 2014 at the Haas School of Business, University of California Berkeley
¶ Zhuozhou Liu [zhuozhou_liu@mfe.berkeley.edu] is a Master of Financial Engineering Candidate 2014 at the Haas School of Business, University of California Berkeley

Contents

1 Goals 4

2 Literature Survey 4

3 Data 5

4 Expressing Derivatives in terms of Duration and Convexity 5

5 Nomenclature of Approaches 7

6 Description of Approaches 7

6.1 Taylor Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

6.2 Linear Regression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

6.3 Automatic Linear Model Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

6.4 Neural Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

7 Results of With OAS Approaches 12

7.1 Taylor Series With OAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

7.2 Linear Regression With OAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

7.3 Automatic Linear Model Selection With OAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

7.4 Neural Network With OAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

7.5 A Promising Detour: Neural Network Without OAS . . . . . . . . . . . . . . . . . . . . . . . . 24

8 Comparison between With OAS Approaches 27

8.1 15Y Fannie Low Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

8.2 15Y Fannie Mid Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

8.3 15Y Fannie High Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

8.4 30Y Fannie Low Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

8.5 30Y Fannie Mid Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

8.6 30Y Fannie High Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

8.7 Ginnie Low Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

8.8 Ginnie Mid Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

8.9 Ginnie High Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

8.10 Interest Only Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

8.11 Principal Only Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

8.12 Estimated Distribution of Change in Cumulative PnL at Portfolio Level . . . . . . . . . . . . . 36

9 Out-of-Sample Performance of Approaches 38

9.1 Taylor Series With OAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

9.2 Taylor Series Without OAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

9.3 Linear Regression With OAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

9.4 Linear Regression Without OAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

9.5 Automatic Linear Model Selection With OAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

9.6 Automatic Linear Model Selection Without OAS . . . . . . . . . . . . . . . . . . . . . . . . . . 41

9.7 Neural Network With OAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

9.8 Neural Network Model Without OAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

10 Conclusions 42

References 42

A Results of Without OAS Approaches 44

A.1 Taylor Series Without OAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

A.2 Linear Regression Without OAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

A.3 Automatic Linear Model Selection Without OAS . . . . . . . . . . . . . . . . . . . . . . . . . . 49

A.4 Neural Network Without OAS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

B Comparison between Without OAS Approaches 56

B.1 15Y Fannie Low Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

B.2 15Y Fannie Mid Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

B.3 15Y Fannie High Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

B.4 30Y Fannie Low Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

B.5 30Y Fannie Mid Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

B.6 30Y Fannie High Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

B.7 Ginnie Low Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

B.8 Ginnie Mid Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

B.9 Ginnie High Coupon Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

B.10 Interest Only Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

B.11 Principal Only Pools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

B.12 Estimated Distribution of Change in Cumulative PnL at Portfolio Level . . . . . . . . . . . . . 64

1 Goals

Value-at-Risk, abbreviated as VaR, is a widely used measure of the potential risk of loss on a portfolio of
financial assets. For a given portfolio, probability and time horizon, VaR is defined as a threshold value such
that the probability that the mark-to-market loss on the portfolio over the given time horizon exceeds this
value (assuming normal markets and no trading in the portfolio) is the given probability level.1 VaR is one of
the most popular measures for financial risk control because it summarizes all the market risks a portfolio is
exposed to in a single statistic.
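As a concrete illustration of the definition, historical-simulation VaR is just an empirical quantile of the portfolio's P&L distribution. A minimal sketch in Python (the P&L numbers below are made up for illustration; they are not from our data):

```python
import numpy as np

# Hypothetical daily P&L history for a portfolio (in dollars).
pnl = np.array([-120.0, 35.0, -60.0, 80.0, -15.0, 22.0, -95.0, 40.0, -10.0, 5.0])

# 1-day 95% historical VaR: the loss threshold exceeded with 5%
# probability, i.e. the sign-flipped 5th percentile of the empirical
# P&L distribution.
var_95 = -float(np.quantile(pnl, 0.05))
```

A full revaluation engine would recompute `pnl` by repricing every instrument under each historical scenario; the approaches in this report replace that repricing with cheap sensitivity-based estimates.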

A full revaluation historical VaR engine is often the default choice for capturing risk across most strategies/in-
struments. However, MBS valuations are complex and computationally intensive. Consequently, computing
historical VaR by full revaluation is not practical for these securities. In this project, we attempt to develop
a framework to estimate VaR for agency mortgage-backed securities that is much more computationally ef-
ficient than a full revaluation VaR engine. We explore Taylor Series, Linear Regression, Automatic Linear
Model Selection and Neural Network approaches to estimate the change in price of an MBS security given its
sensitivities to market parameters such as key rates, implied volatility, mortgage spread and option-adjusted
spread. Our models can be used in a Monte-Carlo VaR engine that generates scenarios for market parameters
and estimates MBS VaR under the generated set of scenarios. We compare the predictive power of the models
developed on a range of different MBS instruments and benchmark model performance both in-sample and
out-of-sample.

2 Literature Survey

The uncertain nature of cash flows passed on to the holder of a mortgage-backed security makes the valuation
of these cash flows computationally expensive. The variability in the value of these securities comes not
only from changes in interest rates, but also as a result of changes in the speed of prepayment. Any suitable
approach for mortgage-backed security valuation must include a prepayment model complex enough to capture
this dynamic borrower behavior. Typically, this requires the use of Monte Carlo simulation to evaluate the
behavior of borrowers over a large number of possible interest rate scenarios, which comes at a substantial
computational cost. Because of these modeling requirements, when calculating the Value-at-Risk of a portfolio
containing a large number of mortgage-backed securities, it is essential to avoid any VaR method which requires
a significant number of revaluations.

It is precisely due to this issue that Ta and Mccoy (2001) outline and discuss a number of different methods
to approximate VaR for mortgage-backed securities. The first method discussed, static cash flow analysis, is
primarily a way to measure the interest rate risk of mortgage-backed securities. By first assuming zero
volatility in the term structure, static cash flow analysis measures the interest rate spread required to
discount the expected cash flows to the observed market price. The resulting spread can be used to measure
interest rate risk, and therefore Value-at-Risk. However, the simplicity of this method leads to its inability to
capture anything other than interest rate risk.

In order to alleviate some of the issues with a simple static cash flow approach, Ta and Mccoy (2001) next
discuss using equivalent duration bonds to estimate VaR. By utilizing the effective duration of a mortgage-
backed security, this measure begins to incorporate some degree of prepayment risk, measuring both the value
effects and cash flow effects of changes in the term structure. In low interest rate volatility environments, or
for securities with limited ranges of prepayment speeds, this approach can produce reasonable
results. However, as the variability of cash flows and interest rates increases, the ability of this method to
accurately approximate VaR decreases.

Continuing with the idea of replication, Jakobsen (1996) recommends implementing a Delta Equivalent Cash
Flow (DECF) procedure to estimate VaR for mortgage-backed securities. This method employs a sensitivity
approximation based on shifting specific points along the term structure and calculating the resulting
price change. Given the sensitivity to key points along the curve, the mortgage-backed security cash flows are
replaced by zero-coupon bonds with equivalent sensitivities. However, zero-coupon rates for long maturities
are not directly observable, and Jakobsen shows that if zero-coupon correlations are not estimated correctly,
VaR estimates produced with this approach can understate the true risk of mortgage-backed securities.
1 http://en.wikipedia.org/wiki/Value_at_risk

Similar in objectives to the replication approaches, the arbitrage-free bond canonical decomposition (ABCD)
method outlined by Ta and Mccoy (2001) attempts to imitate the cash flows of a mortgage-backed security
instead of matching the sensitivities. By decomposing the security into a portfolio of zero-coupon bonds, caps,
and floors, this method attempts to mimic the variability in cash flows as a result of prepayment. Due to
this decomposition, the VaR estimation procedure becomes a valuation exercise of these simple fixed income
instruments, significantly reducing the computational requirements. However, this method captures only an
indirect exposure of cash flows to the level of interest rates, since prepayment-driven mortgage-backed cash
flows are themselves only indirectly related to rates.

Studied by Delianedis, Lagnado, and Geske (2000) and further discussed by Ta and Mccoy (2001), and Risk-
Metrics (2010), the Delta-Gamma VaR approach utilizes a multi-variable Taylor-series expansion in order to
approximate VaR for mortgage-backed securities. This approach uses empirically estimated values for convex-
ities and durations with respect to a number of different points along the yield curve. Arcidiacono, Cordell,
Davidson, and Levin (2013) elaborate on this method by discussing the three primary risks to which mortgage-
backed securities are exposed: interest rate risk, prepayment model risk, and spread risk. A number of relevant
sensitivities are discussed that can be used with the Delta-Gamma approach to capture these risks, such as key
interest rate durations and convexities and spread durations and convexities. Once these sensitivities have
been estimated, Arcidiacono et al. discuss methods available to capture the volatility of each factor, such as
using swaption volatilities to compute the volatilities of the interest rate factors. Mieczkowski (2009) proposes
the addition of key rate vegas as risk factors in the expansion in order to capture the effects of changes in
interest rate volatility. In general, due to its ability to estimate price changes using only these sensitivities,
the Delta-Gamma approach provides an efficient means to approximate VaR that does not require constant
revaluation of the mortgage-backed security portfolio. The simplicity and efficiency of this approach seems to
lead to its current popularity with regards to approximating the VaR of mortgage-backed securities.

3 Data

For this project, we are given panel data from 2013/01/22 to 2013/07/22. We are given the 3M market-observed
Treasury rate as well as 6M, 1Y, 2Y, 3Y, 5Y, 10Y and 30Y market-observed swap rates. We use all 8 rates
available in our data as key rates. We use the 3Yx10Y swaption Black implied volatility as our volatility estimate.
We are also given the 30Y MBS rate, which we use to define the mortgage spread: specifically, the mortgage spread
is the difference between the 30Y MBS rate and the 10Y swap rate. The list below shows the instruments for
which we are given duration, convexity and option-adjusted spread data.

15Y Fannie: 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0
30Y Fannie: 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0
Ginnie: 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5
IO/PO: Fannie IOS 3.5 2010 IO, Fannie IOS 4.0 2010 IO, Fannie POS 3.5 2010 PO, Fannie POS 4.5 2010 PO, Fannie POS 5.0 2010 PO, Ginnie 4.0 2010 IO, Ginnie 4.0 2010 PO, Ginnie 4.5 2010 PO, Ginnie 5.0 2010 PO

4 Expressing Derivatives in terms of Duration and Convexity

For all the price estimation approaches described in the next section, we use the partial sensitivities of price to
key rates ($\frac{\partial P}{\partial kr_i}$, $\frac{\partial^2 P}{\partial kr_i^2}$), mortgage spread ($\frac{\partial P}{\partial ms}$, $\frac{\partial^2 P}{\partial ms^2}$), volatility ($\frac{\partial P}{\partial \sigma}$, $\frac{\partial^2 P}{\partial \sigma^2}$) and option-adjusted spread ($\frac{\partial P}{\partial oas}$, $\frac{\partial^2 P}{\partial oas^2}$) as our
explanatory variables. However, we are only given data for the duration and convexity of each MBS instrument
with respect to key rates, mortgage spread, volatility and option-adjusted spread.

We can convert from duration to partial first-order sensitivities as follows:

$$\frac{\partial P}{\partial kr_i}(\$) = -\frac{D_{kr_i} \cdot P(\$)}{100}$$

$$\frac{\partial P}{\partial ms}(\$) = -\frac{D_{ms} \cdot P(\$)}{100}$$

$$\frac{\partial P}{\partial \sigma}(\$) = D_{\sigma}(\$)$$

$$\frac{\partial P}{\partial oas}(\$) = -\frac{D_{oas} \cdot P(\$)}{100}$$
The partial first-order dollar contributions to change in price from changes in key rates $\Delta P^{(1)}_{kr_i}$, mortgage spread
$\Delta P^{(1)}_{ms}$, volatility $\Delta P^{(1)}_{\sigma}$ and option-adjusted spread $\Delta P^{(1)}_{oas}$ can be expressed as follows:

$$\Delta P^{(1)}_{kr_i}(\$) = \frac{\partial P}{\partial kr_i}(\$) \cdot \Delta kr_i = -\frac{D_{kr_i} \cdot P(\$)}{100} \cdot \Delta kr_i$$

$$\Delta P^{(1)}_{ms}(\$) = \frac{\partial P}{\partial ms}(\$) \cdot \Delta ms = -\frac{D_{ms} \cdot P(\$)}{100} \cdot \Delta ms$$

$$\Delta P^{(1)}_{\sigma}(\$) = \frac{\partial P}{\partial \sigma}(\$) \cdot \Delta \sigma = D_{\sigma}(\$) \cdot \Delta \sigma$$

$$\Delta P^{(1)}_{oas}(\$) = \frac{\partial P}{\partial oas}(\$) \cdot \Delta oas = -\frac{D_{oas} \cdot P(\$)}{100} \cdot \frac{\Delta oas(\mathrm{bp})}{100}$$

We can convert from convexity to partial second-order sensitivities as follows:

$$\frac{\partial^2 P}{\partial kr_i^2}(\$) = \frac{C_{kr_i} \cdot P(\$)}{100}$$

$$\frac{\partial^2 P}{\partial ms^2}(\$) = \frac{C_{ms} \cdot P(\$)}{100}$$

$$\frac{\partial^2 P}{\partial \sigma^2}(\$) = -C_{\sigma}(\$)$$

$$\frac{\partial^2 P}{\partial oas^2}(\$) = \frac{C_{oas} \cdot P(\$)}{100}$$
The partial second-order dollar contributions to change in price from changes in key rates $\Delta P^{(2)}_{kr_i}$, mortgage
spread $\Delta P^{(2)}_{ms}$, volatility $\Delta P^{(2)}_{\sigma}$ and option-adjusted spread $\Delta P^{(2)}_{oas}$ can be expressed as follows:

$$\Delta P^{(2)}_{kr_i}(\$) = 0.5\,\frac{\partial^2 P}{\partial kr_i^2}(\$) \cdot (\Delta kr_i)^2 = 0.5\,\frac{C_{kr_i} \cdot P(\$)}{100} \cdot (\Delta kr_i)^2$$

$$\Delta P^{(2)}_{ms}(\$) = 0.5\,\frac{\partial^2 P}{\partial ms^2}(\$) \cdot (\Delta ms)^2 = 0.5\,\frac{C_{ms} \cdot P(\$)}{100} \cdot (\Delta ms)^2$$

$$\Delta P^{(2)}_{\sigma}(\$) = 0.5\,\frac{\partial^2 P}{\partial \sigma^2}(\$) \cdot (\Delta \sigma)^2 = -0.5 \cdot C_{\sigma}(\$) \cdot (\Delta \sigma)^2$$

$$\Delta P^{(2)}_{oas}(\$) = 0.5\,\frac{\partial^2 P}{\partial oas^2}(\$) \cdot (\Delta oas)^2 = 0.5\,\frac{C_{oas} \cdot P(\$)}{100} \cdot \left(\frac{\Delta oas(\mathrm{bp})}{100}\right)^2$$
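The conversions above can be collected into one small routine. The sketch below is written in Python for illustration (the report's own tooling is Matlab); the factor names and example units are our assumptions. It returns the combined first- plus second-order dollar contribution per risk factor:

```python
def dollar_deltas(price, durations, convexities, moves):
    """Combined first- plus second-order dollar contribution per risk
    factor, using the duration/convexity conversions above.  Dicts are
    keyed by (hypothetical) factor names; the volatility sensitivities
    are already in dollars, and the OAS move is quoted in basis points,
    hence the extra division by 100."""
    out = {}
    for factor, dx in moves.items():
        if factor == "vol":
            d1 = durations[factor]            # D_sigma($)
            d2 = -convexities[factor]         # d2P/dsigma2($) = -C_sigma($)
        else:
            if factor == "oas":
                dx = dx / 100.0               # delta-oas given in bp
            d1 = -durations[factor] * price / 100.0
            d2 = convexities[factor] * price / 100.0
        out[factor] = d1 * dx + 0.5 * d2 * dx ** 2
    return out
```

For example, a key rate with duration 5 and convexity 80 on a $100 price, under a 1bp-style move of 0.01, contributes $-0.05$ from duration and $+0.004$ from convexity.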

5 Nomenclature of Approaches

In this project, we worked on 8 different approaches for price estimation which we name as follows:

• Taylor Series With OAS
• Taylor Series Without OAS
• Linear Regression With OAS
• Linear Regression Without OAS
• Automatic Linear Model Selection With OAS
• Automatic Linear Model Selection Without OAS
• Neural Network With OAS
• Neural Network Without OAS

The With OAS and Without OAS cases of the same approach differ only in that one case uses option-adjusted
spread as an explanatory variable in prediction of price and the other does not. The underlying working of
both cases is exactly the same. However, we notice significantly different results between the With OAS and
Without OAS cases of the same model and hence name them as two different approaches, referring to them
by their full names wherever appropriate to maintain complete clarity.

The term With OAS mentioned in isolation refers to the subset of the afore-mentioned 8 approaches that use
OAS as an explanatory variable, that is, Taylor Series With OAS, Linear Regression With OAS, Automatic
Linear Model Selection With OAS and Neural Network With OAS.

Similarly, the term Without OAS mentioned in isolation refers to the subset of the afore-mentioned 8 approaches
that do not use OAS as an explanatory variable, that is, Taylor Series Without OAS, Linear Regression Without
OAS, Automatic Linear Model Selection Without OAS and Neural Network Without OAS.

6 Description of Approaches

6.1 Taylor Series

This is the simplest approach, in which we use the standard Taylor Series expansion to estimate the change in
price. Specifically, if we assume the change in price $\Delta P$ to be a function of some variables $x_1, x_2, \ldots, x_n$, then

$$\Delta P(t) = \sum_{i=1}^{n} \frac{\partial P}{\partial x_i}(t-1)\,\Delta x_i(t) + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\partial^2 P}{\partial x_i \partial x_j}(t-1)\,\Delta x_i(t)\,\Delta x_j(t) + \text{H.O.T.}$$

Since we do not have cross-convexity data, we ignore the cross-convexity and higher-order terms. The
estimation expression then reduces to

$$\Delta P(t) = \sum_{i=1}^{n} \frac{\partial P}{\partial x_i}(t-1)\,\Delta x_i(t) + \frac{1}{2}\sum_{i=1}^{n} \frac{\partial^2 P}{\partial x_i^2}(t-1)\,(\Delta x_i(t))^2$$

Should OAS be used or not? The option-adjusted spread of a mortgage-backed security is a theoretical,
model-dependent measure designed to capture market sentiment about the riskiness and pricing of securities. It
gives an indication of the cheapness of a security with respect to its projected model cash flows: higher OAS
numbers are interpreted as the security being relatively cheaper, or riskier, than others. Due to its all-encapsulating
nature, changes in OAS are difficult to explain. As such, any approach that gives us the required accuracy in
prediction without using OAS is preferred over one that achieves comparable accuracy using OAS. Keeping this
in mind, we sub-divide the Taylor Series approach into two sub-approaches that rely on different assumptions.

The Taylor Series Without OAS approach assumes that ∆P is a function of the given key rates, mortgage
spread and volatility but not option-adjusted spread.

$$\begin{aligned}
\Delta P(t) ={} & \sum_{i=1}^{n_{rates}} \frac{\partial P}{\partial kr_i}(t-1)\,\Delta kr_i(t) + \frac{1}{2}\sum_{i=1}^{n_{rates}} \frac{\partial^2 P}{\partial kr_i^2}(t-1)\,(\Delta kr_i(t))^2 \\
& + \frac{\partial P}{\partial ms}(t-1)\,\Delta ms(t) + \frac{1}{2}\,\frac{\partial^2 P}{\partial ms^2}(t-1)\,(\Delta ms(t))^2 \\
& + \frac{\partial P}{\partial \sigma}(t-1)\,\Delta \sigma(t) + \frac{1}{2}\,\frac{\partial^2 P}{\partial \sigma^2}(t-1)\,(\Delta \sigma(t))^2
\end{aligned}$$

The Taylor Series With OAS approach also includes option-adjusted spread as an explanatory variable for
∆P apart from the variables being used for the Without OAS approach.

$$\begin{aligned}
\Delta P(t) ={} & \sum_{i=1}^{n_{rates}} \frac{\partial P}{\partial kr_i}(t-1)\,\Delta kr_i(t) + \frac{1}{2}\sum_{i=1}^{n_{rates}} \frac{\partial^2 P}{\partial kr_i^2}(t-1)\,(\Delta kr_i(t))^2 \\
& + \frac{\partial P}{\partial ms}(t-1)\,\Delta ms(t) + \frac{1}{2}\,\frac{\partial^2 P}{\partial ms^2}(t-1)\,(\Delta ms(t))^2 \\
& + \frac{\partial P}{\partial \sigma}(t-1)\,\Delta \sigma(t) + \frac{1}{2}\,\frac{\partial^2 P}{\partial \sigma^2}(t-1)\,(\Delta \sigma(t))^2 \\
& + \frac{\partial P}{\partial oas}(t-1)\,\Delta oas(t) + \frac{1}{2}\,\frac{\partial^2 P}{\partial oas^2}(t-1)\,(\Delta oas(t))^2
\end{aligned}$$
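The truncated expansion can be evaluated mechanically once the partial sensitivities and the factor moves are known. A minimal Python sketch with hypothetical inputs (the With OAS case simply appends the OAS terms to the input vectors):

```python
import numpy as np

def taylor_delta_p(first_derivs, second_derivs, dx):
    """Second-order Taylor estimate of the change in price, ignoring
    cross terms: sum_i dP/dx_i * dx_i + 0.5 * d2P/dx_i^2 * dx_i^2.
    Inputs are the previous day's partial sensitivities and the current
    day's factor moves, stacked over key rates, mortgage spread,
    volatility (and, in the With OAS case, OAS)."""
    d1 = np.asarray(first_derivs, dtype=float)
    d2 = np.asarray(second_derivs, dtype=float)
    dx = np.asarray(dx, dtype=float)
    return float(d1 @ dx + 0.5 * d2 @ dx ** 2)
```

For two factors with first derivatives $(1, -2)$, second derivatives $(10, 4)$ and moves $(0.1, 0.05)$, the first-order terms cancel and the estimate is the pure convexity contribution, $0.055$.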

6.2 Linear Regression

This approach adds a layer of complexity over the Taylor Series approach. Specifically, we regress the individual
terms in the Taylor Series expansion against actual change in price to come up with a weighting scheme that
generates the best in-sample fit. If we assume the change in price $\Delta P$ to be a function of some variables
$x_1, x_2, \ldots, x_n$, we run the following regression
$$\Delta P(t) = \sum_{i=1}^{n} a^{(1)}_i \frac{\partial P}{\partial x_i}(t-1)\,\Delta x_i(t) + \sum_{i=1}^{n} a^{(2)}_i \frac{\partial^2 P}{\partial x_i^2}(t-1)\,(\Delta x_i(t))^2 + \epsilon(t)$$

to obtain values for the parameters $a^{(1)}_i$ and $a^{(2)}_i$.

The estimation expression can now be written as

$$\Delta P(t) = \sum_{i=1}^{n} a^{(1)}_i \frac{\partial P}{\partial x_i}(t-1)\,\Delta x_i(t) + \sum_{i=1}^{n} a^{(2)}_i \frac{\partial^2 P}{\partial x_i^2}(t-1)\,(\Delta x_i(t))^2$$

We sub-divide the Linear Regression approach into two sub-approaches that rely on different assumptions.

The Linear Regression Without OAS approach assumes that ∆P is a function of the given key rates, mortgage
spread and volatility but not option-adjusted spread.

$$\begin{aligned}
\Delta P(t) ={} & \sum_{i=1}^{n_{rates}} a^{(1),rates}_i \frac{\partial P}{\partial kr_i}(t-1)\,\Delta kr_i(t) + \sum_{i=1}^{n_{rates}} a^{(2),rates}_i \frac{\partial^2 P}{\partial kr_i^2}(t-1)\,(\Delta kr_i(t))^2 \\
& + a^{(1),ms}\,\frac{\partial P}{\partial ms}(t-1)\,\Delta ms(t) + a^{(2),ms}\,\frac{\partial^2 P}{\partial ms^2}(t-1)\,(\Delta ms(t))^2 \\
& + a^{(1),vol}\,\frac{\partial P}{\partial \sigma}(t-1)\,\Delta \sigma(t) + a^{(2),vol}\,\frac{\partial^2 P}{\partial \sigma^2}(t-1)\,(\Delta \sigma(t))^2
\end{aligned}$$

The Linear Regression With OAS approach also includes option-adjusted spread as an explanatory variable
for ∆P apart from the variables being used for the Without OAS approach.

$$\begin{aligned}
\Delta P(t) ={} & \sum_{i=1}^{n_{rates}} a^{(1),rates}_i \frac{\partial P}{\partial kr_i}(t-1)\,\Delta kr_i(t) + \sum_{i=1}^{n_{rates}} a^{(2),rates}_i \frac{\partial^2 P}{\partial kr_i^2}(t-1)\,(\Delta kr_i(t))^2 \\
& + a^{(1),ms}\,\frac{\partial P}{\partial ms}(t-1)\,\Delta ms(t) + a^{(2),ms}\,\frac{\partial^2 P}{\partial ms^2}(t-1)\,(\Delta ms(t))^2 \\
& + a^{(1),vol}\,\frac{\partial P}{\partial \sigma}(t-1)\,\Delta \sigma(t) + a^{(2),vol}\,\frac{\partial^2 P}{\partial \sigma^2}(t-1)\,(\Delta \sigma(t))^2 \\
& + a^{(1),oas}\,\frac{\partial P}{\partial oas}(t-1)\,\Delta oas(t) + a^{(2),oas}\,\frac{\partial^2 P}{\partial oas^2}(t-1)\,(\Delta oas(t))^2
\end{aligned}$$
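The regression step amounts to ordinary least squares on the precomputed Taylor terms. An illustrative Python sketch with synthetic data (the report's own implementation used Matlab; the weights and sizes below are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical panel: T days by k Taylor-expansion terms.  In the report's
# setting, each column would be a precomputed term such as
# dP/dkr_i(t-1) * delta_kr_i(t) or d2P/dkr_i^2(t-1) * delta_kr_i(t)^2.
T, k = 120, 6
X = rng.normal(size=(T, k))
true_w = np.array([1.0, 0.9, 1.1, 1.0, 0.8, 1.2])    # synthetic weights
y = X @ true_w + rng.normal(scale=0.01, size=T)      # synthetic delta-P series

# Least-squares estimate of the weights a_i: the regression re-weights
# the raw Taylor terms for the best in-sample fit.
a, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ a
```

Note that a pure Taylor approach would fix all weights at 1; the regression lets the data pull them away from 1 where the sensitivities are stale or noisy.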

6.3 Automatic Linear Model Selection

This approach adds a layer of complexity over the Linear Regression approach. The aim of performing this
exercise is to come up with transformations of the explanatory variables used in the Linear Regression approach
such that we can reach an optimal balance between the dual competing objectives of good in-sample fit and
good out-of-sample generalization. We expect this approach to give better in-sample predictability than the
Linear Regression approach while avoiding overfitting to as large an extent as possible.

The first step in this two-step process is to choose transformation functions. The intuition behind the trans-
formation functions we selected was to come up with different kinds of shapes that cover as large a spectrum
of possible explanatory relationships as possible. Specifically, we add the following explanatory variables over
and above the ones already present in the Linear Regression approach:

1. $F(t)^2$, which gives us a parabolic shape
2. $\sin(F(t))$, which gives us a wave shape
3. $\exp(F(t))$, which gives us a monotonically increasing shape ranging from 0 to $\infty$
4. $\exp(-F(t))$, which gives us a monotonically decreasing shape ranging from $\infty$ to 0
5. $\mathrm{tansig}(F(t)) = \frac{1-\exp(-F(t))}{1+\exp(-F(t))}$, which gives us a monotonically increasing shape bounded between -1 and 1

where $F(t) = \frac{\partial P}{\partial kr_i}(t-1)\,\Delta kr_i(t)$ or $F(t) = \frac{\partial^2 P}{\partial kr_i^2}(t-1)\,(\Delta kr_i(t))^2$ for all $i \in \{1, 2, \ldots, n_{rates}\}$, or $F(t) = \frac{\partial P}{\partial ms}(t-1)\,\Delta ms(t)$, or $F(t) = \frac{\partial^2 P}{\partial ms^2}(t-1)\,(\Delta ms(t))^2$, or $F(t) = \frac{\partial P}{\partial \sigma}(t-1)\,\Delta \sigma(t)$, or $F(t) = \frac{\partial^2 P}{\partial \sigma^2}(t-1)\,(\Delta \sigma(t))^2$, or $F(t) = \frac{\partial P}{\partial oas}(t-1)\,\Delta oas(t)$, or $F(t) = \frac{\partial^2 P}{\partial oas^2}(t-1)\,(\Delta oas(t))^2$.

We also add approximate cross-convexity explanatory variables of the form $F_i(t)F_j(t)$ where $i \neq j$ and
$F(t) = \frac{\partial P}{\partial kr_k}(t-1)\,\Delta kr_k(t)$ for all $k \in \{1, 2, \ldots, n_{rates}\}$, or $F(t) = \frac{\partial P}{\partial ms}(t-1)\,\Delta ms(t)$, or $F(t) = \frac{\partial P}{\partial \sigma}(t-1)\,\Delta \sigma(t)$, or $F(t) = \frac{\partial P}{\partial oas}(t-1)\,\Delta oas(t)$.
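The candidate-feature construction can be sketched as follows, in Python for illustration, assuming the raw first- and second-order Taylor terms have already been stacked into a matrix. The tansig transform is computed via the identity $\mathrm{tansig}(x) = \tanh(x/2)$:

```python
import numpy as np

def expand_features(F):
    """Given a (T, m) matrix whose columns are the first- and second-order
    Taylor terms, build the candidate pool: the raw terms, their squares,
    sin, exp, exp(-), and tansig transforms, plus pairwise products
    F_i * F_j (i < j) as approximate cross-convexity terms."""
    T, m = F.shape
    # tansig(x) = (1 - e^-x) / (1 + e^-x) = tanh(x / 2)
    blocks = [F, F ** 2, np.sin(F), np.exp(F), np.exp(-F), np.tanh(F / 2.0)]
    cross = [(F[:, i] * F[:, j])[:, None]
             for i in range(m) for j in range(i + 1, m)]
    return np.hstack(blocks + cross)
```

With $m$ raw terms this yields $6m + \binom{m}{2}$ candidate columns; the exact counts in the report (165 and 187) depend on which raw terms enter the pool.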

Given the vastly expanded set of explanatory variables, the second step is to use an optimization criterion to
choose a subset of the explanatory variables that gives the best in-sample fit and out-of-sample generalization. For
this task, we use Matlab's standard LinearModel.fit function. Further details of the optimization criterion
used are given under each of the sub-approaches. Given the explanatory features $F_1, F_2, \ldots, F_n$ chosen by the
optimization criterion, we run the following regression

$$\Delta P(t) = \sum_{i=1}^{n} a_i F_i(t) + \epsilon(t)$$

to obtain values for the parameters $a_i$.

The estimation expression can now be written as

$$\Delta P(t) = \sum_{i=1}^{n} a_i F_i(t)$$

We sub-divide the Automatic Linear Model Selection approach into two sub-approaches that rely on different
assumptions.

The Automatic Linear Model Selection Without OAS approach assumes that $\Delta P$ is a function of the given
key rates, mortgage spread and volatility but not option-adjusted spread. As such, explanatory variables
that make use of either the OAS duration or OAS convexity are not included in the initial set of explanatory
variables, leaving a total of 165 initial explanatory variables. For this approach, we use the Akaike
Information Criterion (AIC) as our optimization criterion.

The Automatic Linear Model Selection With OAS approach also includes explanatory variables derived from
OAS duration and OAS convexity, for a total of 187 initial explanatory variables. For this approach, we use
the Bayesian Information Criterion (BIC) as our optimization criterion.

The Akaike Information Criterion penalizes the addition of an extra explanatory variable into the model much
less than the Bayesian Information Criterion. Due to the lack of highly explanatory OAS features in the Without
OAS case, no single explanatory variable offers a large in-sample fit improvement in this case. Hence, we
noticed that BIC, due to its greater penalty, did not add any explanatory variable and simply returned a constant
model. This is undesirable, and to get around this issue we use AIC for the Without OAS case, since
it imposes a smaller penalty than BIC.
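The selection step can be approximated by greedy forward selection under an information criterion. The Python sketch below uses a Gaussian AIC; it is a simplified stand-in for Matlab's LinearModel.fit stepwise machinery, and replacing the $2k$ penalty with $k\ln T$ gives BIC:

```python
import numpy as np

def aic(y, pred, n_params):
    # Gaussian log-likelihood AIC up to an additive constant:
    # T*log(RSS/T) + 2k.  Swapping 2*k for log(T)*k gives BIC.
    T = len(y)
    rss = float(np.sum((y - pred) ** 2))
    return T * np.log(rss / T) + 2.0 * n_params

def forward_select(X, y, max_steps=10):
    """Greedy forward selection: start from a constant-only model and
    repeatedly add the candidate column that most reduces AIC; stop
    when no addition improves it."""
    T, m = X.shape
    ones = np.ones((T, 1))
    chosen = []
    best = aic(y, np.full(T, y.mean()), 1)
    for _ in range(max_steps):
        step_j, step_aic = None, best
        for j in range(m):
            if j in chosen:
                continue
            Xj = np.hstack([ones, X[:, chosen + [j]]])
            beta, *_ = np.linalg.lstsq(Xj, y, rcond=None)
            cand = aic(y, Xj @ beta, Xj.shape[1])
            if cand < step_aic:
                step_j, step_aic = j, cand
        if step_j is None:
            break
        chosen.append(step_j)
        best = step_aic
    return chosen
```

A stricter penalty, as with BIC, raises the bar for each addition; with weak candidate features it can stop at the constant model, which is exactly the behavior described above.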

6.4 Neural Network

The Neural Network approach adds a final layer of complexity and sophistication over the previous approaches.
Instead of using Linear Regression tools, we construct a non-linear model to capture the non-linear relation
between the change in price and explanatory variables. The utility of artificial Neural Network models lies in
the fact that they can be used to infer a functional form from given supervised data.

For our Neural Network, we use one hidden layer with three neurons and one neuron in the output layer.
All explanatory variables $X$ have connections to all neurons in the hidden layer, hence the name "network".
A neuron's network function $F_i(X)$ is defined as a linear combination of the explanatory variables passed
through a non-linear activation function. Mathematically, $F_i(X) = K\left(w^{(i)}_0 + \sum_{j=1}^{n} w^{(i)}_j X_j\right)$,
where $K$ can be any predefined function. We choose the tan-sigmoid as the activation function, that is,
$K(x) = \frac{1-\exp(-x)}{1+\exp(-x)}$. In the second layer, $L(X)$ is just a linear combination of $F_1(X)$ to $F_3(X)$, that is,
$L(X) = w_0 + \sum_{i=1}^{3} w_i F_i(X)$.
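The forward pass of this 3-neuron architecture is small enough to write out directly. An illustrative Python sketch (the weight shapes are our own convention, not the report's):

```python
import numpy as np

def tansig(x):
    # tansig(x) = (1 - e^-x) / (1 + e^-x); bounded in (-1, 1)
    return (1.0 - np.exp(-x)) / (1.0 + np.exp(-x))

def forward(X, W1, b1, w2, b2):
    """One-hidden-layer network with three tansig neurons and a linear
    output neuron.  X: (n,), W1: (3, n), b1: (3,), w2: (3,), b2: scalar."""
    F = tansig(W1 @ X + b1)    # hidden activations F_1(X)..F_3(X)
    return float(b2 + w2 @ F)  # linear second layer L(X)
```

Because the hidden activations saturate, the network can bend the price response in rate moves far more flexibly than a quadratic Taylor form.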

To use a Neural Network, the parameters $w^{(i)}_j$ and $w_i$ need to be trained. Due to its complexity, training Neural
Networks is much more demanding than training linear models. The most elementary training algorithm is
the backpropagation algorithm, an abbreviation for "backward propagation of errors". Training can be done by
iterating through the following steps for a preset number of epochs, in our case 500.

Phase 1: Propagation.2 Each propagation involves the following steps:

1. Forward propagation of every training pattern’s input through the Neural Network in order to generate
the pattern’s output activations.
2. Backward propagation of every pattern’s output activations through the Neural Network using the pat-
tern’s supervised target in order to generate the errors of all output and hidden neurons.

Phase 2: Weight Update. For each weight-synapse:

1. Multiply its output delta and input activation to get the gradient of the weight.
2. Subtract a ratio (percentage) of the gradient from the weight. This ratio influences the speed
and quality of learning and is called the learning rate. The greater the ratio, the faster the neuron trains;
the lower the ratio, the more accurate the training. The sign of the gradient of a weight indicates
where the error is increasing, which is why the weight must be updated in the opposite direction.
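For the small network above, the two phases reduce to a few lines of calculus. The Python sketch below implements one plain gradient-descent backpropagation step on squared error; it illustrates the elementary algorithm described here, not the Levenberg-Marquardt variant actually used in the report:

```python
import numpy as np

def tansig(x):
    # tansig(x) = (1 - e^-x) / (1 + e^-x), equivalently tanh(x / 2)
    return (1.0 - np.exp(-x)) / (1.0 + np.exp(-x))

def backprop_step(X, y, W1, b1, w2, b2, lr=0.01):
    """One gradient-descent backpropagation step on squared error for
    the 3-neuron tansig network.  Shapes: X (n,), W1 (3, n), b1 (3,),
    w2 (3,), b2 scalar."""
    # Phase 1: forward pass, then the error at the output
    z = W1 @ X + b1
    F = tansig(z)
    out = b2 + w2 @ F
    err = out - y                          # dE/d(out) for E = 0.5 * err^2
    # Phase 1 (continued): propagate the error back to the hidden layer;
    # tansig'(z) = (1 - tansig(z)^2) / 2
    d_hidden = err * w2 * (1.0 - F ** 2) / 2.0
    # Phase 2: gradient = output delta times input activation, then step
    # against the gradient, scaled by the learning rate
    grad_w2, grad_b2 = err * F, err
    grad_W1, grad_b1 = np.outer(d_hidden, X), d_hidden
    return (W1 - lr * grad_W1, b1 - lr * grad_b1,
            w2 - lr * grad_w2, b2 - lr * grad_b2)
```

Iterating this step drives the squared error down on a training pattern; Levenberg-Marquardt accelerates the same idea with approximate second-order curvature information.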

For training our Neural Network, we use a modified version of the elementary backpropagation algorithm
known as the Levenberg-Marquardt backpropagation algorithm. It is the fastest and most effective algorithm in
the Matlab toolbox as of the time of writing this report.

It is also important to note here that Neural Network training can get stuck in a local minimum instead of
reaching the global minimum. To alleviate this problem, we repeat training for every instrument 5 times
with a randomly chosen initial guess for the parameters. The model that gives us the best fit in terms of R^2
is chosen as the final model.

We sub-divide the Neural Network approach into two sub-approaches that rely on different assumptions. For
both sub-approaches, we only use first-order terms as explanatory variables.

The Neural Network Without OAS approach assumes that ∆P is a function of the given key rates, mortgage
spread and volatility but not option-adjusted spread.
2 http://en.wikipedia.org/wiki/Backpropagation

The Neural Network With OAS approach also includes option-adjusted spread as an explanatory variable for
∆P apart from the variables being used for the Without OAS approach.

For all the approaches described above, we noticed that the With OAS case performed much better than the
Without OAS case. Hence, in the section below, we only present results of the With OAS approaches. Please
refer to Appendix A for results of the Without OAS approaches.

7 Results of With OAS Approaches

7.1 Taylor Series With OAS

The figure below shows the regression fit for change in price for 30Y Fannie 4.5 pools. To keep the report
concise, we only show results for one of the 33 instruments that constitute our portfolio. With an R^2 of
92.07%, it is evident that the fit is good.

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL for 30Y
Fannie 4.5 pools. The fit is good, especially since the estimated distribution accurately captures the left fat
tail evident in the actual distribution.

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL at the
portfolio level. Again, it is good, especially since the estimated distribution accurately captures the left fat
tail evident in the actual distribution.

The table below shows the R^2 statistics for the regression fit for change in price at the instrument level. It
confirms that the Taylor Series approach is successful in estimating change in price accurately
when OAS is used as an explanatory variable. If OAS data is not used, the approach loses a significant amount
of explanatory power. This is studied in Appendix A, Section A.1.

R^2 with Outliers R^2 without Outliers


15Y Fannie 2.5 0.9721 0.9910
15Y Fannie 3.0 0.9128 0.9671
15Y Fannie 3.5 0.8231 0.8556
15Y Fannie 4.0 0.2853 0.4501
15Y Fannie 4.5 0.4304 0.8278
15Y Fannie 5.0 0.1691 0.5391
15Y Fannie 5.5 0.3502 0.6143
15Y Fannie 6.0 0.8171 0.9756
30Y Fannie 2.5 0.9858 0.9951
30Y Fannie 3.0 0.9779 0.9918
30Y Fannie 3.5 0.9654 0.9871
30Y Fannie 4.0 0.9431 0.9735
30Y Fannie 4.5 0.9207 0.9688
30Y Fannie 5.0 0.7186 0.8654
30Y Fannie 5.5 0.5531 0.7822
30Y Fannie 6.0 0.7335 0.8895
Ginnie 3.0 0.9680 0.9873
Ginnie 3.5 0.9622 0.9832
Ginnie 4.0 0.9449 0.9768
Ginnie 4.5 0.8987 0.9605
Ginnie 5.0 0.8400 0.9282
Ginnie 5.5 0.8743 0.9092
Ginnie 6.0 0.9471 0.9733
Ginnie 6.5 0.6742 0.8222
Fannie IOS 4.0 2010 IO 0.9226 0.9732
Fannie IOS 3.5 2010 IO 0.8829 0.9583
Fannie POS 5.0 2010 PO 0.9790 0.9948
Fannie POS 4.5 2010 PO 0.9841 0.9975
Fannie POS 3.5 2010 PO 0.9793 0.9942
Ginnie 4.0 2010 IO 0.7632 0.8117
Ginnie 4.0 2010 PO 0.9734 0.9942
Ginnie 5.0 2010 PO 0.9781 0.9966
Ginnie 4.5 2010 PO 0.9764 0.9956
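For reference, the two columns can be reproduced by computing R^2 once on the full sample and once after discarding large-residual days. The sketch below assumes a 3-standard-deviation residual cutoff, which is an illustrative choice rather than necessarily the exact outlier rule used for the tables:

```python
import numpy as np

def r2_with_without_outliers(actual, predicted, n_sigma=3.0):
    """R^2 on the full sample and on the sample with large residuals removed."""
    actual = np.asarray(actual, float)
    predicted = np.asarray(predicted, float)
    def r2(y, f):
        return 1.0 - np.sum((y - f) ** 2) / np.sum((y - y.mean()) ** 2)
    resid = actual - predicted
    # keep days whose residual lies within n_sigma standard deviations
    keep = np.abs(resid - resid.mean()) <= n_sigma * resid.std()
    return r2(actual, predicted), r2(actual[keep], predicted[keep])
```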

Training Complexity The Taylor Series approach is extremely fast because of its simplicity. Since it has
no free parameters, it requires no training and can be used out-of-sample directly.
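Using the approach out-of-sample amounts to plugging the day's observed factor moves into the expansion. A minimal first-order sketch follows; the field names and sign conventions are illustrative, and the production expansion also includes second-order (convexity) terms:

```python
def taylor_delta_price(sens, shocks):
    """First-order Taylor estimate of an MBS price change.

    sens:   dict of sensitivities, e.g. key-rate durations, OAS duration, vega
    shocks: dict of observed one-day factor moves, keyed the same way
    """
    dP = 0.0
    for factor, sensitivity in sens.items():
        move = shocks.get(factor, 0.0)
        if factor == "vega":
            dP += sensitivity * move   # price rises with volatility here
        else:
            dP -= sensitivity * move   # durations: price falls as rates/spreads rise
    return dP
```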

Conclusions The Taylor Series approach is a good starting point. However, it is a very simple approach
and adding further sophistication to it is expected to be worthwhile.

7.2 Linear Regression With OAS

The figure below shows the regression fit for change in price for 30Y Fannie 4.5 pools. To keep the report
concise, we only show results for one of the 33 instruments that constitute our portfolio. With an R^2 of
98.35%, it is evident that the fit is good.

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL for 30Y
Fannie 4.5 pools. Again, it is good, especially since the estimated distribution accurately captures the left fat
tail evident in the actual distribution.

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL at the
portfolio level. Again, it is good, especially since the estimated distribution accurately captures the left fat
tail evident in the actual distribution.

The table below shows the R^2 statistics for the regression fit for change in price at the instrument level.
It confirms that the Linear Regression approach is successful in estimating change in price
accurately when OAS is used as an explanatory variable. If OAS data is not used, the approach loses a
significant amount of explanatory power. This is studied in Appendix A, Section A.2.

R^2 with Outliers R^2 without Outliers
15Y Fannie 2.5 0.9934 0.9973
15Y Fannie 3.0 0.9792 0.9912
15Y Fannie 3.5 0.9305 0.9647
15Y Fannie 4.0 0.4715 0.4334
15Y Fannie 4.5 0.7020 0.7659
15Y Fannie 5.0 0.3121 0.4567
15Y Fannie 5.5 0.6144 0.4769
15Y Fannie 6.0 0.9168 0.9532
30Y Fannie 2.5 0.9977 0.9993
30Y Fannie 3.0 0.9968 0.9988
30Y Fannie 3.5 0.9945 0.9981
30Y Fannie 4.0 0.9909 0.9960
30Y Fannie 4.5 0.9835 0.9932
30Y Fannie 5.0 0.9528 0.9701
30Y Fannie 5.5 0.9380 0.9581
30Y Fannie 6.0 0.9182 0.9568
Ginnie 3.0 0.9963 0.9985
Ginnie 3.5 0.9957 0.9983
Ginnie 4.0 0.9910 0.9965
Ginnie 4.5 0.9839 0.9928
Ginnie 5.0 0.9669 0.9880
Ginnie 5.5 0.9604 0.9796
Ginnie 6.0 0.9893 0.9939
Ginnie 6.5 1.0000 1.0000
Fannie IOS 4.0 2010 IO 0.9895 0.9954
Fannie IOS 3.5 2010 IO 0.9764 0.9900
Fannie POS 5.0 2010 PO 0.9931 0.9984
Fannie POS 4.5 2010 PO 0.9933 0.9984
Fannie POS 3.5 2010 PO 0.9934 0.9978
Ginnie 4.0 2010 IO 0.9719 0.9851
Ginnie 4.0 2010 PO 0.9939 0.9980
Ginnie 5.0 2010 PO 0.9914 0.9975
Ginnie 4.5 2010 PO 0.9919 0.9972

Training Complexity The Linear Regression approach contains 23 parameters. On the particular run that
we timed, training all 33 instruments took 8.6540 seconds on a Windows 7 Intel Core i5 PC. The time includes
operating system multi-tasking delays and will vary slightly on every run. However, it could be taken as a
reasonably accurate indicator of training complexity.

The Linear Regression approach trains extremely fast and should be preferred in situations where training
is time-sensitive.
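The per-instrument fit can be sketched as an ordinary least squares regression. Feature construction is omitted here, and the columns of `X` stand for the sensitivity-based factor terms described earlier; this is an illustration, not the production code:

```python
import numpy as np

def fit_linear_regression(X, dP):
    """OLS fit of daily price changes on sensitivity-based factor terms.

    X:  (n_days, n_terms) matrix of factor terms
    dP: (n_days,) vector of observed price changes
    Returns the coefficient vector (intercept first) and in-sample R^2.
    """
    X1 = np.column_stack([np.ones(len(dP)), X])      # prepend intercept column
    beta, *_ = np.linalg.lstsq(X1, dP, rcond=None)
    fitted = X1 @ beta
    r2 = 1.0 - np.sum((dP - fitted) ** 2) / np.sum((dP - dP.mean()) ** 2)
    return beta, r2
```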

Conclusions The Linear Regression approach is a step-up over the Taylor Series approach. It gives a
marginal improvement in the With OAS case over the Taylor Series approach. However, that is to be expected
given how well the Taylor Series With OAS approach already performs.

7.3 Automatic Linear Model Selection With OAS

The figure below shows the regression fit for change in price for 30Y Fannie 4.5 pools. To keep the report
concise, we only show results for one of the 33 instruments that constitute our portfolio. With an R^2 of
98.82%, it is evident that the fit is good.

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL for 30Y
Fannie 4.5 pools. Again, it is good, especially since the estimated distribution accurately captures the left fat
tail evident in the actual distribution.

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL at the
portfolio level. Again, it is good, especially since the estimated distribution accurately captures the left fat
tail evident in the actual distribution.

The table below shows the R^2 statistics for the regression fit for change in price at the instrument level. It
confirms that the Automatic Linear Model Selection approach is successful in estimating change
in price accurately when OAS is used as an explanatory variable. If OAS data is not used, the approach loses
a significant amount of explanatory power. This is studied in Appendix A, Section A.3.

R^2 with Outliers R^2 without Outliers
15Y Fannie 2.5 0.9933 0.9973
15Y Fannie 3.0 0.9740 0.9898
15Y Fannie 3.5 0.9735 0.9879
15Y Fannie 4.0 0.9502 0.9571
15Y Fannie 4.5 0.9680 0.9884
15Y Fannie 5.0 0.9862 0.9908
15Y Fannie 5.5 0.9944 0.9973
15Y Fannie 6.0 0.9914 0.9944
30Y Fannie 2.5 0.9976 0.9992
30Y Fannie 3.0 0.9971 0.9985
30Y Fannie 3.5 0.9934 0.9975
30Y Fannie 4.0 0.9900 0.9948
30Y Fannie 4.5 0.9882 0.9934
30Y Fannie 5.0 0.9875 0.9922
30Y Fannie 5.5 0.9804 0.9862
30Y Fannie 6.0 0.9800 0.9871
Ginnie 3.0 0.9962 0.9981
Ginnie 3.5 0.9956 0.9979
Ginnie 4.0 0.9905 0.9961
Ginnie 4.5 0.9942 0.9965
Ginnie 5.0 0.9679 0.9882
Ginnie 5.5 0.9633 0.9806
Ginnie 6.0 0.9926 0.9949
Ginnie 6.5 0.6681 0.7847
Fannie IOS 4.0 2010 IO 0.9975 0.9984
Fannie IOS 3.5 2010 IO 0.9904 0.9957
Fannie POS 5.0 2010 PO 0.9918 0.9984
Fannie POS 4.5 2010 PO 0.9926 0.9978
Fannie POS 3.5 2010 PO 0.9935 0.9984
Ginnie 4.0 2010 IO 0.9817 0.9919
Ginnie 4.0 2010 PO 0.9951 0.9981
Ginnie 5.0 2010 PO 0.9898 0.9981
Ginnie 4.5 2010 PO 0.9972 0.9987

Training Complexity The Automatic Linear Model Selection approach has a variable number of parameters,
which can range from 1 to 188 and is chosen by the optimization criterion. On the particular run that we
timed, training all 33 instruments took 1492.9 seconds on a Windows 7 Intel Core i5 PC. The time includes
operating system multi-tasking delays and will vary slightly on every run. However, it can be taken as a
reasonably accurate indicator of training complexity.

As the training time shows, the Automatic Linear Model Selection approach is extremely computationally
intensive. This is to be expected: in each phase of parameter selection, the optimization engine runs several
regressions of the form used in the Linear Regression approach to pick the best parameter to include. With an
average of about 10 parameters per instrument selected out of a maximum of 188 possible parameters, the
approach runs roughly 187*10 = 1870 regressions of steadily increasing complexity per instrument. One should
only use this technique if training is not a time-critical task and/or the approach gives a very good improvement
in prediction over the other approaches. As an example, we present a logging snippet from a training run on
one of the instruments using this approach.
-------------------------------------------
- - - - - - - - - - - - - RUNNING fncl_4dot5_pools NOW - - - - - - - - - - - - - -
-------------------------------------------
1. Adding oasdur , BIC = -186.1151
2. Adding krd5y , BIC = -290.3902
3. Adding msdur_invexp , BIC = -328.0333
4. Adding krd10y_invexp , BIC = -362.3132
5. Adding krc2y_sqr , BIC = -382.8609

6. Adding msdur_sqr , BIC = -385.7417
7. Adding krd6m + msdur , BIC = -395.5006
8. Adding oasdur + msdur , BIC = -401.6139
9. Adding oascvx_sqr , BIC = -404.6971
10. Adding krd30y + msdur , BIC = -405.0707
11. Adding krd3m + krd10y , BIC = -406.072
12. Adding oasdur + krd3m , BIC = -409.6148
13. Adding vega + msdur , BIC = -412.5993
14. Adding krd6m_sqr , BIC = -412.9877
15. Adding krd3m + krd5y , BIC = -415.7529
16. Removing krc2y_sqr , BIC = -419.31
17. Adding krc3m_tansig , BIC = -419.8953

model =

Linear regression model:

    ACTUAL ~ [Linear formula with 16 terms in 15 predictors]

Estimated Coefficients:
                        Estimate          SE       tStat       pValue
    (Intercept)           1.9166     0.14836      12.918    4.0417e-23
    oasdur               0.94377    0.017033      55.408    2.0566e-77
    krd5y                 1.5066     0.14601      10.318    1.7884e-17
    krd6m_sqr             -51848       18316     -2.8307     0.0056048
    msdur_sqr            -2.0838     0.29953     -6.9571    3.5353e-10
    oascvx_sqr           -147.81      47.086     -3.1392     0.0022217
    krd10y_invexp        -1.2997     0.13107     -9.9165    1.3704e-16
    msdur_invexp        -0.60373    0.068214     -8.8505    3.0327e-14
    krc3m_tansig     -1.0001e+05       45376     -2.2041       0.02979
    oasdur + krd3m        -499.6      105.65     -4.7287    7.3453e-06
    oasdur + msdur       -1.0427     0.17478     -5.9654     3.629e-08
    krd3m + krd5y         1933.4      674.06      2.8683      0.005025
    krd3m + krd10y       -5377.4      943.95     -5.6967    1.2097e-07
    krd6m + msdur        -1170.7      205.38     -5.6999    1.1926e-07
    krd30y + msdur       -26.129      2.7222     -9.5983    6.8862e-16
    vega + msdur          -6.821      1.3707     -4.9761    2.6716e-06

Number of observations: 117, Error degrees of freedom: 101

Root Mean Squared Error: 0.0311
R-squared: 0.988, Adjusted R-Squared: 0.986
F-statistic vs. constant model: 562, p-value = 4.85e-90
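The greedy loop visible in the log (add the candidate term that most improves BIC, stopping when no candidate helps; the actual engine also allows removals, as in step 16) can be sketched as a forward-selection routine. The candidate construction is simplified here, with plain columns standing in for the transformed terms:

```python
import numpy as np

def bic(y, fitted, k):
    """Bayesian Information Criterion for a Gaussian regression with k parameters."""
    n = len(y)
    rss = np.sum((y - fitted) ** 2)
    return n * np.log(rss / n) + k * np.log(n)

def stepwise_bic(X, y, names):
    """Greedy forward selection of regressors by BIC."""
    chosen = []
    best = bic(y, np.full_like(y, y.mean()), 1)   # intercept-only baseline
    improved = True
    while improved:
        improved = False
        for j in range(X.shape[1]):
            if j in chosen:
                continue
            cols = chosen + [j]
            Xc = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
            beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
            score = bic(y, Xc @ beta, len(cols) + 1)
            if score < best - 1e-9:               # strict improvement required
                best, best_j = score, j
                improved = True
        if improved:
            chosen.append(best_j)                 # commit the sweep's best term
    return [names[j] for j in chosen], best
```

The log(n) penalty per parameter is what makes the selection stop: a candidate enters only when its fit improvement outweighs the BIC complexity charge.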

Conclusions The Automatic Linear Model Selection approach is an innovative sophistication added to the
Linear Regression approach. It gives a marginal improvement in the With OAS case over the Linear Regression
approach. However, that is to be expected given how well the Linear Regression With OAS approach already
performs.

7.4 Neural Network With OAS

The figure below shows the regression fit for change in price for 30Y Fannie 4.5 pools. To keep the report
concise, we only show results for one of the 33 instruments that constitute our portfolio. With an R^2 of
99.19%, it is evident that the fit is good.

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL for 30Y
Fannie 4.5 pools. Again, it is good, especially since the estimated distribution accurately captures the left fat
tail evident in the actual distribution.

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL at the
portfolio level. Again, it is good, especially since the estimated distribution accurately captures the left fat
tail evident in the actual distribution.

The table below shows the R^2 statistics for the regression fit for change in price at the instrument level.
R^2 with Outliers R^2 without Outliers
15Y Fannie 2.5 0.9982 0.9989
15Y Fannie 3.0 0.9893 0.9963
15Y Fannie 3.5 0.9901 0.9962
15Y Fannie 4.0 0.9636 0.9875
15Y Fannie 4.5 0.9739 0.9902
15Y Fannie 5.0 0.9830 0.9928
15Y Fannie 5.5 0.9896 0.9952
15Y Fannie 6.0 0.9957 0.9976
30Y Fannie 2.5 0.9987 0.9993
30Y Fannie 3.0 0.9980 0.9991
30Y Fannie 3.5 0.9981 0.9989
30Y Fannie 4.0 0.9978 0.9988
30Y Fannie 4.5 0.9919 0.9951
30Y Fannie 5.0 0.9843 0.9913
30Y Fannie 5.5 0.9852 0.9916
30Y Fannie 6.0 0.9855 0.9896
Ginnie 3.0 0.9982 0.9992
Ginnie 3.5 0.9979 0.9985
Ginnie 4.0 0.9949 0.9974
Ginnie 4.5 0.9939 0.9970
Ginnie 5.0 0.9692 0.9858
Ginnie 5.5 0.9916 0.9931
Ginnie 6.0 0.9978 0.9985
Ginnie 6.5 1.0000 1.0000
Fannie IOS 4.0 2010 IO 0.9941 0.9975
Fannie IOS 3.5 2010 IO 0.9922 0.9966
Fannie POS 5.0 2010 PO 0.9962 0.9983
Fannie POS 4.5 2010 PO 0.9968 0.9988
Fannie POS 3.5 2010 PO 0.9962 0.9988
Ginnie 4.0 2010 IO 0.9901 0.9959
Ginnie 4.0 2010 PO 0.9954 0.9983
Ginnie 5.0 2010 PO 0.9943 0.9977
Ginnie 4.5 2010 PO 0.9934 0.9973

Training Complexity The Neural Network approach contains 40 parameters. On the particular run that
we timed, training all 33 instruments took 233.0759 seconds on a Windows 7 Intel Core i5 PC. The time
includes operating system multi-tasking delays and will vary slightly on every run. However, it could be taken
as a reasonably accurate indicator of training complexity.

The Neural Network approach offers a good balance between training complexity and prediction accuracy: it
improves on the prediction accuracy of the Automatic Linear Model Selection approach while training more
than 6 times faster.

Conclusions The Neural Network approach is an innovative sophistication added to the previous linear
model approaches, that is, Taylor Series, Linear Regression and Automatic Linear Model Selection. It gives a
marginal improvement in the With OAS case over the Automatic Linear Model Selection approach. However,
that is to be expected given how well the Automatic Linear Model Selection With OAS approach already
performs.

7.5 A Promising Detour: Neural Network Without OAS

The figure below shows the regression fit for change in price for 30Y Fannie 4.5 pools. To keep the report
concise, we only show results for one of the 33 instruments that constitute our portfolio. With an R^2 of
80.51%, the fit is good and, in fact, accurate enough to be used in production. The approach
gives us a marked improvement over the Automatic Linear Model Selection Without OAS approach, which has
an R^2 of 55.52% for 30Y Fannie 4.5 pools.

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL for 30Y
Fannie 4.5 pools. Again, it is good, especially since the estimated distribution accurately captures the left fat
tail evident in the actual distribution.

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL at the
portfolio level. We notice that the estimated distribution does not capture the fat tails evident in the actual
distribution.

The table below shows the R^2 statistics for the regression fit for change in price at the instrument level.

R^2 with Outliers R^2 without Outliers
15Y Fannie 2.5 0.7925 0.9158
15Y Fannie 3.0 0.8450 0.8544
15Y Fannie 3.5 0.7331 0.8514
15Y Fannie 4.0 0.8603 0.9289
15Y Fannie 4.5 0.7711 0.8709
15Y Fannie 5.0 0.6343 0.7677
15Y Fannie 5.5 0.7366 0.8333
15Y Fannie 6.0 0.3344 0.8028
30Y Fannie 2.5 0.5115 0.6729
30Y Fannie 3.0 0.5129 0.7123
30Y Fannie 3.5 0.7849 0.7911
30Y Fannie 4.0 0.8854 0.9558
30Y Fannie 4.5 0.8051 0.9157
30Y Fannie 5.0 0.5913 0.7143
30Y Fannie 5.5 0.6364 0.7945
30Y Fannie 6.0 0.5785 0.6834
Ginnie 3.0 0.7678 0.8768
Ginnie 3.5 0.7445 0.8600
Ginnie 4.0 0.8208 0.9271
Ginnie 4.5 0.7839 0.8958
Ginnie 5.0 0.5742 0.6849
Ginnie 5.5 0.6364 0.7926
Ginnie 6.0 0.8762 0.9866
Ginnie 6.5 1.0000 1.0000
Fannie IOS 4.0 2010 IO 0.7119 0.8551
Fannie IOS 3.5 2010 IO 0.6649 0.7666
Fannie POS 5.0 2010 PO 0.8889 0.9329
Fannie POS 4.5 2010 PO 0.9168 0.9551
Fannie POS 3.5 2010 PO 0.6691 0.7253
Ginnie 4.0 2010 IO 0.4769 0.7008
Ginnie 4.0 2010 PO 0.7216 0.8267
Ginnie 5.0 2010 PO 0.7808 0.7774
Ginnie 4.5 2010 PO 0.8376 0.9124

Conclusions The Neural Network approach is an innovative sophistication added to the previous linear
model approaches. It gives a vast improvement in performance over the linear model approaches in the
Without OAS case. Out of the 33 instruments that constitute our portfolio, only 7 instruments have a
no-outlier R^2 below 75% in the Without OAS case. Hence, overall, we conclude that the Neural Network
approach is promising and could be researched further. It is also a model that could probably be employed in
production.

We make one final observation on the approaches presented in this section: obtaining good performance is far
easier when OAS is used as an explanatory variable than when it is not. This is evident even from the With
OAS performance of the most naive approach, Taylor Series. Please refer to Appendix A for the performance
of the Without OAS approaches.

8 Comparison between With OAS Approaches

In this section, we compare the With OAS approaches presented in the previous section. Specifically, we
compare Taylor Series With OAS (top left), Linear Regression With OAS (top right), Automatic Linear
Model Selection With OAS (bottom left) and Neural Network With OAS (bottom right). Please refer to
Appendix B for a similar comparison between the Without OAS approaches.

We can see from the examples below that performance is good for all approaches. While the Neural Network
With OAS approach gives marginally better results than the other approaches, it also contains the most
parameters and hence carries the greatest risk of overfitting. For the With OAS case, we would therefore
prefer either Taylor Series or Linear Regression.

The Taylor Series approach works reasonably well for low and mid coupon securities under the given prepayment
model and the corresponding sensitivities data. However, it seems to miss outliers for high coupon securities,
either due to insufficient model calibration or erratic market movements in these securities. The good news is
that the Automatic Linear Model Selection approach, by choosing appropriate functional transformations,
provides a good alternative for high coupon securities. It might be worth investigating which functional
transformations are chosen for each explanatory variable of individual securities to identify model calibration
opportunities or to support trading decisions.

8.1 15Y Fannie Low Coupon Pools

The figure below shows the regression fit for change in price for 15Y Fannie 2.5 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

R^2 with Outliers R^2 without Outliers


Taylor Series With OAS 0.9721 0.9910
Linear Regression With OAS 0.9934 0.9973
Automatic Linear Model Selection With OAS 0.9933 0.9973
Neural Network With OAS 0.9982 0.9989

8.2 15Y Fannie Mid Coupon Pools

The figure below shows the regression fit for change in price for 15Y Fannie 3.5 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

R^2 with Outliers R^2 without Outliers


Taylor Series With OAS 0.8231 0.8556
Linear Regression With OAS 0.9305 0.9647
Automatic Linear Model Selection With OAS 0.9735 0.9879
Neural Network With OAS 0.9901 0.9962

8.3 15Y Fannie High Coupon Pools

The figure below shows the regression fit for change in price for 15Y Fannie 5.0 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

R^2 with Outliers R^2 without Outliers


Taylor Series With OAS 0.1691 0.5391
Linear Regression With OAS 0.3121 0.4567
Automatic Linear Model Selection With OAS 0.9862 0.9908
Neural Network With OAS 0.9830 0.9928

8.4 30Y Fannie Low Coupon Pools

The figure below shows the regression fit for change in price for 30Y Fannie 2.5 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

R^2 with Outliers R^2 without Outliers


Taylor Series With OAS 0.9858 0.9951
Linear Regression With OAS 0.9977 0.9993
Automatic Linear Model Selection With OAS 0.9976 0.9992
Neural Network With OAS 0.9987 0.9993

8.5 30Y Fannie Mid Coupon Pools

The figure below shows the regression fit for change in price for 30Y Fannie 4.5 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

R^2 with Outliers R^2 without Outliers


Taylor Series With OAS 0.9207 0.9688
Linear Regression With OAS 0.9835 0.9932
Automatic Linear Model Selection With OAS 0.9882 0.9934
Neural Network With OAS 0.9919 0.9951

8.6 30Y Fannie High Coupon Pools

The figure below shows the regression fit for change in price for 30Y Fannie 5.5 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

R^2 with Outliers R^2 without Outliers


Taylor Series With OAS 0.5531 0.7822
Linear Regression With OAS 0.9380 0.9581
Automatic Linear Model Selection With OAS 0.9804 0.9862
Neural Network With OAS 0.9852 0.9916

8.7 Ginnie Low Coupon Pools

The figure below shows the regression fit for change in price for Ginnie 3.0 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

R^2 with Outliers R^2 without Outliers


Taylor Series With OAS 0.9680 0.9873
Linear Regression With OAS 0.9963 0.9985
Automatic Linear Model Selection With OAS 0.9962 0.9981
Neural Network With OAS 0.9982 0.9992

8.8 Ginnie Mid Coupon Pools

The figure below shows the regression fit for change in price for Ginnie 4.0 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

R^2 with Outliers R^2 without Outliers


Taylor Series With OAS 0.9449 0.9768
Linear Regression With OAS 0.9910 0.9965
Automatic Linear Model Selection With OAS 0.9905 0.9961
Neural Network With OAS 0.9949 0.9974

8.9 Ginnie High Coupon Pools

The figure below shows the regression fit for change in price for Ginnie 5.5 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

R^2 with Outliers R^2 without Outliers


Taylor Series With OAS 0.8743 0.9092
Linear Regression With OAS 0.9604 0.9796
Automatic Linear Model Selection With OAS 0.9633 0.9806
Neural Network With OAS 0.9916 0.9931

8.10 Interest Only Pools

The figure below shows the regression fit for change in price for Fannie IOS 4.0 2010 IO pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

R^2 with Outliers R^2 without Outliers


Taylor Series With OAS 0.9226 0.9732
Linear Regression With OAS 0.9895 0.9954
Automatic Linear Model Selection With OAS 0.9975 0.9984
Neural Network With OAS 0.9941 0.9975

8.11 Principal Only Pools

The figure below shows the regression fit for change in price for Fannie POS 4.5 2010 PO pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

R^2 with Outliers R^2 without Outliers


Taylor Series With OAS 0.9841 0.9975
Linear Regression With OAS 0.9933 0.9984
Automatic Linear Model Selection With OAS 0.9926 0.9978
Neural Network With OAS 0.9968 0.9988

8.12 Estimated Distribution of Change in Cumulative PnL at Portfolio Level

The figure below shows the estimated vs. actual historical distributions of change in cumulative PnL at the
portfolio level obtained using the different approaches. The top four histograms are the estimated distributions
obtained, in order, using Taylor Series With OAS, Linear Regression With OAS, Automatic Linear Model
Selection With OAS and Neural Network With OAS. The last histogram is the actual distribution.
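Once an approach produces an estimated distribution of daily portfolio PnL changes, the VaR itself is read off as a lower percentile of that distribution. A minimal sketch follows, reporting VaR as a positive loss; the 99% confidence level is an illustrative choice:

```python
import numpy as np

def historical_var(pnl_changes, confidence=0.99):
    """Historical VaR: the loss at the (1 - confidence) quantile of PnL changes."""
    pnl = np.asarray(pnl_changes, float)
    return -np.percentile(pnl, 100.0 * (1.0 - confidence))
```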

9 Out-of-Sample Performance of Approaches

In the previous section, we presented the in-sample performance of the different approaches. Now we move to
an out-of-sample test. Given the number of parameters in each of our approaches (please see the table below)
and under the assumption that the number of training samples should be at least an order of magnitude larger
than the number of parameters, we conclude that at least 400 training points are probably needed for decent
training.

Approach Number of Parameters


Taylor Series Without OAS 0
Taylor Series With OAS 0
Linear Regression Without OAS 21
Linear Regression With OAS 23
Automatic Linear Model Selection Without OAS approx. 15-20 (can vary from 1 to 166)
Automatic Linear Model Selection With OAS approx. 10-15 (can vary from 1 to 188)
Neural Network Without OAS 37
Neural Network With OAS 40

Since only a limited number of samples (124) are available to us per instrument, we cannot split instrument level
data into training and testing datasets to test out-of-sample performance. As an alternative, we integrate all
the instrument datasets into one large dataset that contains 4242 samples and test out-of-sample performance
of our approaches on this dataset by dividing it into 2828 training samples and 1414 testing samples.
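The pooling-and-splitting setup can be sketched as follows; the 2:1 train/test ratio follows the text, while the random shuffling is our illustrative choice:

```python
import numpy as np

def pooled_train_test_split(datasets, train_frac=2/3, seed=0):
    """Stack per-instrument (X, y) datasets and split the pool for an out-of-sample test.

    datasets: list of (X_i, y_i) pairs, one per instrument
    """
    X = np.vstack([Xi for Xi, _ in datasets])
    y = np.concatenate([yi for _, yi in datasets])
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))                 # shuffle the pooled rows
    n_train = int(round(train_frac * len(y)))
    tr, te = idx[:n_train], idx[n_train:]
    return X[tr], y[tr], X[te], y[te]
```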

From the analysis below, we see that most of the With OAS approaches do well, while none of the Without
OAS approaches have explanatory power. Looking at the data, we notice that OAS can be consistently positive
for some bonds and negative for others. This implies that either the underlying MBS valuation model is biased
in different directions for each bond, or some bonds are riskier than others, so investors demand a different
level of spread. Given this, we cannot expect to succeed by using a single parameter to correct model deviation
across all instruments. The option-adjusted spread acts as the correction of model price with respect to market
price. Hence, after including OAS as an explanatory variable in our models, we have effectively factored in the
differences between instruments and obtain good explanatory power.

9.1 Taylor Series With OAS

The figure below shows the performance of Taylor Series With OAS on out-of-sample data. With an R^2 of
96% with outliers and 98% without outliers, the model has good explanatory power.

9.2 Taylor Series Without OAS

The figure below shows the performance of Taylor Series Without OAS in out-of-sample data. The R^2 of
the model is less than 1%. It is clear that the explanatory power of the Taylor Series approach comes mainly
from OAS.

9.3 Linear Regression With OAS

The figure below shows the performance of Linear Regression With OAS on out-of-sample data. With an R^2
of 96% with outliers and 99% without outliers, the model fits the data well.

9.4 Linear Regression Without OAS

The figure below shows the performance of Linear Regression Without OAS on out-of-sample data. The R^2
of the model is less than 1%. It is clear that the explanatory power of the Linear Regression approach comes
mainly from OAS.

9.5 Automatic Linear Model Selection With OAS

The figure below shows the performance of Automatic Linear Model Selection With OAS on out-of-sample data.
The model has an R^2 of 24% with outliers and 92% without outliers. We can see from the figure that the model
fits most of the data well but fails to predict extreme price changes. Since the approach works well in-sample
for individual instruments yet misses many outliers in the out-of-sample test on the combined data, we
conclude that the difference between the functional transformations selected for each security is a significant
contributor to the accuracy of this approach. Hence, the Automatic Linear Model Selection approach should
not be used for prediction if trained on combined data.

9.6 Automatic Linear Model Selection Without OAS

The figure below shows the performance of Automatic Linear Model Selection Without OAS in out-of-sample
data. It is clear that the explanatory power of the Automatic Linear Model Selection approach comes mainly
from OAS.

9.7 Neural Network With OAS

The figure below shows the performance of Neural Network With OAS on out-of-sample data. With an R^2 of
85% with outliers and 98% without outliers, the model fits the data well.

9.8 Neural Network Without OAS

The figure below shows the performance of Neural Network Without OAS on out-of-sample data. The R^2
of the model is less than 1%. It is clear that the explanatory power of the Neural Network approach comes
mainly from OAS.

10 Conclusions

Value-at-Risk is a widely used measure of the potential risk of loss on a portfolio of financial assets. A full reval-
uation historical VaR engine is often the default choice for capturing risk across most strategies/instruments.
However, MBS valuations are complex and computationally intensive, so computing historical VaR by full
revaluation is not practical for these securities. In this project, we attempted to develop a framework to
estimate VaR for agency mortgage-backed securities that is much more computationally efficient than a full
revaluation VaR engine. In particular, we explored Taylor Series, Linear Regression, Automatic Linear Model
Selection and Neural Network approaches to estimate the change in price of an MBS security given its
sensitivities to market parameters such as key rates, implied volatility, mortgage spread and option-adjusted
spread.

We notice that all approaches, from the naive Taylor Series to the most sophisticated Neural Network, work
well when option-adjusted spread is used as an explanatory variable. They give price estimation R^2 above
90% for most instruments and accurately capture the left fat tail in the actual historical change in price
distribution. On the other hand, all the Without OAS approaches are quite weak in prediction, although we
notice substantial improvement in prediction performance as the sophistication of our models increases. In
particular, our innovative Neural Network Without OAS approach does quite well in the Without OAS category,
and we feel it is a promising direction that deserves further exploration and research. In terms of training
complexity, we conclude that the Automatic Linear Model Selection approach is prohibitively computationally
intensive, the Taylor Series and Linear Regression approaches are extremely fast, and the Neural Network
approach offers the best balance between accuracy and complexity.

Due to lack of data, we were unable to perform an out-of-sample test for our models instrument-by-instrument.
Instead, we merged all the individual instrument data into one large dataset and split it for training and
testing. We noticed that the With OAS approaches give satisfactory out-of-sample performance while the
Without OAS approaches fail the out-of-sample test. Overall, we feel that this project is a major step towards
developing an accurate and computationally efficient VaR engine for mortgage-backed securities. Please refer
to the conclusions within the individual sections above for further details.

References
[1] Nicholas Arcidiacono, Larry Cordell, Andrew Davidson, and Alex Levin. Understanding and measuring
risks in agency CMOs. 2013.

[2] Gordon Delianedis, Ronald Lagnado, and Teri Geske. Measuring VaR for portfolios containing mortgage-backed
securities. Discussion Paper, 2000.

[3] RiskMetrics Group. Pricing and modeling risk of agency MBS. White Paper, 2010.

[4] Svend Jakobsen. Measuring value-at-risk for mortgage-backed securities. Springer, 1996.

[5] David Mieczkowski. An efficient price estimation method and applications to fixed income value-at-risk.
FactSet White Paper, 2009.

[6] Thomas Ta and William McCoy. Comparing methods to approximate mortgage-backed security VaR.
Working Paper, 2001.

A Results of Without OAS Approaches

A.1 Taylor Series Without OAS

The figure below shows the regression fit for change in price for 30Y Fannie 4.5 pools. To keep the report
concise, we only show results for one of the 33 instruments that constitute our portfolio. With an R² of
8.12%, the fit is evidently poor.
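As a sketch of the kind of second-order expansion the Taylor Series approach uses, the snippet below approximates a price change from key-rate durations, convexity, vega and mortgage-spread duration. The sensitivity names and sign conventions here are illustrative assumptions, not the report's exact inputs.

```python
import numpy as np

def taylor_price_change(krd, key_rate_moves, convexity, vega, dvol,
                        mtg_spread_dur, dspread):
    """Second-order Taylor approximation of an MBS price change.

    krd: key-rate durations; key_rate_moves: rate moves at those tenors;
    convexity is applied to the parallel component of the move. All names
    and conventions here are illustrative, not the report's exact inputs."""
    parallel = key_rate_moves.sum()
    dp = -np.dot(krd, key_rate_moves)       # first-order rate effect
    dp += 0.5 * convexity * parallel ** 2   # second-order (convexity) effect
    dp += vega * dvol                       # implied-volatility effect
    dp -= mtg_spread_dur * dspread          # mortgage-spread effect
    return dp
```

Because the expansion is local and omits OAS, large or jointly correlated factor moves fall outside its range of validity, which is consistent with the weak fit observed here.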

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL for
30Y Fannie 4.5 pools. Accurately capturing the left fat tail of the actual distribution is essential for an
accurate VaR estimate. The estimated distribution is therefore poor, since it fails to capture the fat tails
evident in the actual distribution.
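For reference, once an estimated change-in-price distribution is available, historical VaR is read off as an empirical quantile of the PnL series; a minimal sketch:

```python
import numpy as np

def historical_var(pnl, confidence=0.99):
    """Historical VaR: the loss at the (1 - confidence) empirical quantile
    of the PnL distribution, reported as a positive number."""
    return -np.percentile(pnl, 100.0 * (1.0 - confidence))

# With heavy-tailed PnL, the 99% VaR is driven by the left fat tail,
# which is why an estimated distribution that misses the tail misstates VaR.
rng = np.random.default_rng(0)
pnl = rng.standard_t(df=3, size=5000)  # illustrative fat-tailed PnL series
print(historical_var(pnl, 0.99))
```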

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL at the
portfolio level. Again, the fit is poor: the estimated distribution does not capture the fat tails evident in the
actual distribution.

The table below shows the R² statistics for the regression fit for change in price at the instrument level. It
confirms that this approach does not estimate change in price accurately. Since the Taylor Series With OAS
approach performs quite well, the most intuitive explanation for this lack of performance is the absence of
option-adjusted spread as an explanatory variable.

Instrument  R² with Outliers  R² without Outliers

15Y Fannie 2.5 0.0086 0.0243
15Y Fannie 3.0 0.0182 0.0032
15Y Fannie 3.5 0.0343 0.0000
15Y Fannie 4.0 0.0371 0.0058
15Y Fannie 4.5 0.0344 0.0003
15Y Fannie 5.0 0.0388 0.0018
15Y Fannie 5.5 0.0008 0.0030
15Y Fannie 6.0 0.0071 0.0104
30Y Fannie 2.5 0.0101 0.0096
30Y Fannie 3.0 0.0135 0.0147
30Y Fannie 3.5 0.0229 0.0116
30Y Fannie 4.0 0.0453 0.0064
30Y Fannie 4.5 0.0812 0.0765
30Y Fannie 5.0 0.0418 0.0359
30Y Fannie 5.5 0.0007 0.0006
30Y Fannie 6.0 0.0000 0.0093
Ginnie 3.0 0.0158 0.0002
Ginnie 3.5 0.0201 0.0076
Ginnie 4.0 0.0362 0.0014
Ginnie 4.5 0.0543 0.0366
Ginnie 5.0 0.0312 0.0276
Ginnie 5.5 0.0685 0.0062
Ginnie 6.0 0.0010 0.0000
Ginnie 6.5 0.4706 0.0427
Fannie IOS 4.0 2010 IO 0.0020 0.0114
Fannie IOS 3.5 2010 IO 0.0036 0.0065
Fannie POS 5.0 2010 PO 0.0018 0.0425
Fannie POS 4.5 2010 PO 0.0010 0.0399
Fannie POS 3.5 2010 PO 0.0000 0.0211
Ginnie 4.0 2010 IO 0.0020 0.0118
Ginnie 4.0 2010 PO 0.0005 0.0234
Ginnie 5.0 2010 PO 0.0000 0.0594
Ginnie 4.5 2010 PO 0.0009 0.0316

Conclusions The Taylor Series approach is a good starting point but does not give accurate results in the
Without OAS case. However, it is a very simple approach, and adding further sophistication to it should
prove worthwhile.

A.2 Linear Regression Without OAS

The figure below shows the regression fit for change in price for 30Y Fannie 4.5 pools. To keep the report
concise, we only show results for one of the 33 instruments that constitute our portfolio. With an R² of
43.24%, the fit is still poor, although it is a large improvement over the Taylor Series Without OAS approach,
whose R² for 30Y Fannie 4.5 pools is 8.12%.
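The Linear Regression approach amounts to an ordinary least squares fit of daily price changes on risk-factor moves, with the OAS column excluded. A self-contained sketch on simulated data (the factor count, coefficients and noise level are placeholders, not the report's regressors):

```python
import numpy as np

# Sketch: regress daily price changes on risk-factor moves (no OAS column),
# in the spirit of the Linear Regression Without OAS approach.
rng = np.random.default_rng(42)
n = 250
X = rng.normal(size=(n, 4))                # e.g. key-rate, vol, spread moves
beta_true = np.array([-4.0, -2.5, 0.8, -1.2])
dp = X @ beta_true + rng.normal(scale=0.5, size=n)  # simulated price changes

X1 = np.column_stack([np.ones(n), X])      # add intercept column
beta, *_ = np.linalg.lstsq(X1, dp, rcond=None)
fitted = X1 @ beta
r2 = 1 - np.sum((dp - fitted) ** 2) / np.sum((dp - dp.mean()) ** 2)
print(f"in-sample R^2: {r2:.4f}")
```

On real MBS data the omitted OAS factor leaves a large unexplained component, which is why the observed R² values are far below this idealized simulation.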

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL for 30Y
Fannie 4.5 pools. Again, the fit is poor: the estimated distribution does not capture the fat tails evident in
the actual distribution.

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL at the
portfolio level. Again, the fit is poor: the estimated distribution does not capture the fat tails evident in the
actual distribution.

The table below shows the R² statistics for the regression fit for change in price at the instrument level. It
confirms that this approach does not estimate change in price accurately. Since the Linear Regression With
OAS approach performs quite well, the most intuitive explanation for this lack of performance is the absence
of option-adjusted spread as an explanatory variable.

Instrument  R² with Outliers  R² without Outliers
15Y Fannie 2.5 0.2497 0.1879
15Y Fannie 3.0 0.2607 0.2483
15Y Fannie 3.5 0.3578 0.3323
15Y Fannie 4.0 0.3529 0.2568
15Y Fannie 4.5 0.2992 0.1624
15Y Fannie 5.0 0.2034 0.1967
15Y Fannie 5.5 0.2475 0.1020
15Y Fannie 6.0 0.1599 0.2559
30Y Fannie 2.5 0.2380 0.0504
30Y Fannie 3.0 0.1899 0.0911
30Y Fannie 3.5 0.1574 0.0368
30Y Fannie 4.0 0.3276 0.2390
30Y Fannie 4.5 0.4324 0.2690
30Y Fannie 5.0 0.3782 0.3623
30Y Fannie 5.5 0.3306 0.4908
30Y Fannie 6.0 0.3233 0.3830
Ginnie 3.0 0.2603 0.2209
Ginnie 3.5 0.1889 0.0710
Ginnie 4.0 0.4073 0.3545
Ginnie 4.5 0.3361 0.3595
Ginnie 5.0 0.3641 0.4765
Ginnie 5.5 0.3619 0.4745
Ginnie 6.0 0.3917 0.6342
Ginnie 6.5 1.0000 1.0000
Fannie IOS 4.0 2010 IO 0.3319 0.1717
Fannie IOS 3.5 2010 IO 0.1879 0.1773
Fannie POS 5.0 2010 PO 0.3230 0.0530
Fannie POS 4.5 2010 PO 0.4074 0.2103
Fannie POS 3.5 2010 PO 0.2665 0.2459
Ginnie 4.0 2010 IO 0.1906 0.1802
Ginnie 4.0 2010 PO 0.2866 0.2414
Ginnie 5.0 2010 PO 0.4078 0.1714
Ginnie 4.5 2010 PO 0.3501 0.1880

Conclusions The Linear Regression approach is a reasonable step up from the Taylor Series approach and
improves Without OAS performance. However, that performance is still not good enough for production use.
We believe that adding further innovation and sophistication to the Linear Regression approach would be
fruitful.

A.3 Automatic Linear Model Selection Without OAS

The figure below shows the regression fit for change in price for 30Y Fannie 4.5 pools. To keep the report
concise, we only show results for one of the 33 instruments that constitute our portfolio. With an R² of
55.52%, the fit is still poor, although it is an improvement over the Linear Regression Without OAS approach,
whose R² for 30Y Fannie 4.5 pools is 43.24%.
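Automatic model selection searches over candidate sets of regressors and keeps the best-scoring model. The sketch below uses exhaustive subset search scored by adjusted R², a simplified stand-in for the report's procedure (which may also consider transformed or interacted terms); it also illustrates why this approach is computationally heavy, since the number of candidate models grows combinatorially in the number of terms.

```python
import numpy as np
from itertools import combinations

def best_subset(X, y, max_terms=3):
    """Pick the regressor subset (up to max_terms columns of X) with the
    highest adjusted R^2. A simplified stand-in for the report's Automatic
    Linear Model Selection procedure."""
    n, p = X.shape
    best_adj, best_cols = -np.inf, None
    for k in range(1, max_terms + 1):
        for cols in combinations(range(p), k):      # combinatorial search
            Xs = np.column_stack([np.ones(n), X[:, list(cols)]])
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            resid = y - Xs @ beta
            tss = np.sum((y - y.mean()) ** 2)
            r2 = 1.0 - resid @ resid / tss
            adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)  # penalize size
            if adj > best_adj:
                best_adj, best_cols = adj, cols
    return best_adj, best_cols
```

With p candidate terms there are on the order of 2^p models to score, which is consistent with the report's finding that this approach is prohibitively expensive to train.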

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL for 30Y
Fannie 4.5 pools. Again, the fit is poor: the estimated distribution does not capture the fat tails evident in
the actual distribution.

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL at the
portfolio level. Again, the fit is poor: the estimated distribution does not capture the fat tails evident in the
actual distribution.

The table below shows the R² statistics for the regression fit for change in price at the instrument level. It
confirms that this approach does not estimate change in price accurately. Since the Automatic Linear Model
Selection With OAS approach performs quite well, the most intuitive explanation for this lack of performance
is the absence of option-adjusted spread as an explanatory variable.

Instrument  R² with Outliers  R² without Outliers
15Y Fannie 2.5 0.3947 0.2693
15Y Fannie 3.0 0.4849 0.5547
15Y Fannie 3.5 0.3716 0.5926
15Y Fannie 4.0 0.4355 0.5904
15Y Fannie 4.5 0.2932 0.4434
15Y Fannie 5.0 0.2250 0.1624
15Y Fannie 5.5 0.0829 0.0506
15Y Fannie 6.0 0.0000 0.0000
30Y Fannie 2.5 0.5342 0.6228
30Y Fannie 3.0 0.5267 0.6589
30Y Fannie 3.5 0.2996 0.1093
30Y Fannie 4.0 0.6498 0.7737
30Y Fannie 4.5 0.5552 0.6534
30Y Fannie 5.0 0.6513 0.7446
30Y Fannie 5.5 0.6833 0.7486
30Y Fannie 6.0 0.3664 0.3884
Ginnie 3.0 0.5653 0.6661
Ginnie 3.5 0.8833 0.8711
Ginnie 4.0 0.4179 0.6890
Ginnie 4.5 0.5423 0.6481
Ginnie 5.0 0.6009 0.7247
Ginnie 5.5 0.5585 0.7034
Ginnie 6.0 0.7622 0.7738
Ginnie 6.5 0.6681 0.7847
Fannie IOS 4.0 2010 IO 0.4345 0.3914
Fannie IOS 3.5 2010 IO 0.3073 0.3190
Fannie POS 5.0 2010 PO 0.8767 0.8916
Fannie POS 4.5 2010 PO 0.8979 0.9346
Fannie POS 3.5 2010 PO 0.4092 0.3667
Ginnie 4.0 2010 IO 0.1628 0.2054
Ginnie 4.0 2010 PO 0.8339 0.8549
Ginnie 5.0 2010 PO 0.8811 0.9194
Ginnie 4.5 2010 PO 0.8104 0.8651

Conclusions The Automatic Linear Model Selection approach adds an innovative layer of sophistication to
the Linear Regression approach and improves Without OAS performance. However, that performance is still
not good enough for production use, so we decided to explore further.

A.4 Neural Network Without OAS

The figure below shows the regression fit for change in price for 30Y Fannie 4.5 pools. To keep the report
concise, we only show results for one of the 33 instruments that constitute our portfolio. With an R² of
80.51%, the fit is good and, in fact, accurate enough to be used in production. This approach gives a marked
improvement over the Automatic Linear Model Selection Without OAS approach, whose R² for 30Y Fannie
4.5 pools is 55.52%.
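To make the approach concrete, the sketch below trains a one-hidden-layer network with plain full-batch gradient descent on a simulated nonlinear price-change surface. The architecture, target function and training schedule are illustrative assumptions; the report's actual network may differ.

```python
import numpy as np

# Minimal one-hidden-layer network fit to a nonlinear target, sketching the
# Neural Network Without OAS idea. All modeling choices here are illustrative.
rng = np.random.default_rng(0)
n, p, h = 400, 3, 8
X = rng.normal(size=(n, p))
y = np.tanh(X[:, 0]) - 0.5 * X[:, 1] ** 2 + 0.3 * X[:, 2]  # nonlinear target

W1 = rng.normal(scale=0.5, size=(p, h)); b1 = np.zeros(h)
W2 = rng.normal(scale=0.5, size=h);      b2 = 0.0
lr = 0.05
for _ in range(5000):                       # full-batch gradient descent
    Z = np.tanh(X @ W1 + b1)                # hidden-layer activations
    pred = Z @ W2 + b2
    err = pred - y                          # gradient of 0.5*mean(err^2)
    gW2 = Z.T @ err / n; gb2 = err.mean()
    dZ = np.outer(err, W2) * (1 - Z ** 2)   # backprop through tanh
    gW1 = X.T @ dZ / n; gb1 = dZ.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

r2 = 1 - np.sum(err ** 2) / np.sum((y - y.mean()) ** 2)
print(f"in-sample R^2: {r2:.3f}")
```

The hidden layer lets the model capture curvature (here, the quadratic term) that no linear combination of the raw factors can represent, which is one plausible reason the Neural Network outperforms the linear approaches in the Without OAS case.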

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL for 30Y
Fannie 4.5 pools. The fit is good, especially since the estimated distribution accurately captures the left fat
tail evident in the actual distribution.

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL at the
portfolio level. Here, however, the estimated distribution does not capture the fat tails evident in the actual
distribution.

The table below shows the R² statistics for the regression fit for change in price at the instrument level.

Instrument  R² with Outliers  R² without Outliers
15Y Fannie 2.5 0.7925 0.9158
15Y Fannie 3.0 0.8450 0.8544
15Y Fannie 3.5 0.7331 0.8514
15Y Fannie 4.0 0.8603 0.9289
15Y Fannie 4.5 0.7711 0.8709
15Y Fannie 5.0 0.6343 0.7677
15Y Fannie 5.5 0.7366 0.8333
15Y Fannie 6.0 0.3344 0.8028
30Y Fannie 2.5 0.5115 0.6729
30Y Fannie 3.0 0.5129 0.7123
30Y Fannie 3.5 0.7849 0.7911
30Y Fannie 4.0 0.8854 0.9558
30Y Fannie 4.5 0.8051 0.9157
30Y Fannie 5.0 0.5913 0.7143
30Y Fannie 5.5 0.6364 0.7945
30Y Fannie 6.0 0.5785 0.6834
Ginnie 3.0 0.7678 0.8768
Ginnie 3.5 0.7445 0.8600
Ginnie 4.0 0.8208 0.9271
Ginnie 4.5 0.7839 0.8958
Ginnie 5.0 0.5742 0.6849
Ginnie 5.5 0.6364 0.7926
Ginnie 6.0 0.8762 0.9866
Ginnie 6.5 1.0000 1.0000
Fannie IOS 4.0 2010 IO 0.7119 0.8551
Fannie IOS 3.5 2010 IO 0.6649 0.7666
Fannie POS 5.0 2010 PO 0.8889 0.9329
Fannie POS 4.5 2010 PO 0.9168 0.9551
Fannie POS 3.5 2010 PO 0.6691 0.7253
Ginnie 4.0 2010 IO 0.4769 0.7008
Ginnie 4.0 2010 PO 0.7216 0.8267
Ginnie 5.0 2010 PO 0.7808 0.7774
Ginnie 4.5 2010 PO 0.8376 0.9124

Conclusions The Neural Network approach adds an innovative layer of sophistication to the preceding linear
model approaches and gives a vast improvement over them in the Without OAS case. Out of the 33 instruments
that constitute our portfolio, only 7 have a no-outlier R² below 75% in the Without OAS case. Overall, we
conclude that the Neural Network approach is promising, deserves further research, and could probably be
employed in production.

B Comparison between Without OAS Approaches

In this section, we compare the Without OAS approaches presented in Appendix A. Specifically, we compare
Taylor Series Without OAS (top left), Linear Regression Without OAS (top right), Automatic Linear Model
Selection Without OAS (bottom left) and Neural Network Without OAS (bottom right).

The examples below show that in most cases performance improves steadily as the sophistication of the
approach increases. In particular, Neural Network Without OAS performs best while Taylor Series Without
OAS performs worst.
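Throughout these tables, R² is reported both with and without outliers. The report does not restate its outlier rule in this appendix, so the sketch below assumes a simple filter that drops observations whose residual lies more than three residual standard deviations from the mean:

```python
import numpy as np

def r2(y, yhat):
    """Coefficient of determination of predictions yhat against y."""
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

def r2_without_outliers(y, yhat, k=3.0):
    """Recompute R^2 after dropping points whose residual is more than k
    residual standard deviations from the mean residual. The k-sigma rule
    is an assumption, not the report's stated procedure."""
    resid = y - yhat
    keep = np.abs(resid - resid.mean()) <= k * resid.std()
    return r2(y[keep], yhat[keep])
```

Dropping such points typically raises R² when a few extreme days dominate the residual sum of squares, which matches the pattern in most of the tables below.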

B.1 15Y Fannie Low Coupon Pools

The figure below shows the regression fit for change in price for 15Y Fannie 2.5 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

Approach  R² with Outliers  R² without Outliers

Taylor Series Without OAS 0.0086 0.0243
Linear Regression Without OAS 0.2497 0.1879
Automatic Linear Model Selection Without OAS 0.3947 0.2693
Neural Network Without OAS 0.7925 0.9158

B.2 15Y Fannie Mid Coupon Pools

The figure below shows the regression fit for change in price for 15Y Fannie 3.5 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

Approach  R² with Outliers  R² without Outliers

Taylor Series Without OAS 0.0343 0.0000
Linear Regression Without OAS 0.3578 0.3323
Automatic Linear Model Selection Without OAS 0.3716 0.5926
Neural Network Without OAS 0.7331 0.8514

B.3 15Y Fannie High Coupon Pools

The figure below shows the regression fit for change in price for 15Y Fannie 5.0 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

Approach  R² with Outliers  R² without Outliers

Taylor Series Without OAS 0.0388 0.0018
Linear Regression Without OAS 0.2034 0.1967
Automatic Linear Model Selection Without OAS 0.2250 0.1624
Neural Network Without OAS 0.6343 0.7677

B.4 30Y Fannie Low Coupon Pools

The figure below shows the regression fit for change in price for 30Y Fannie 2.5 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

Approach  R² with Outliers  R² without Outliers

Taylor Series Without OAS 0.0101 0.0096
Linear Regression Without OAS 0.2380 0.0504
Automatic Linear Model Selection Without OAS 0.5342 0.6228
Neural Network Without OAS 0.5115 0.6729

B.5 30Y Fannie Mid Coupon Pools

The figure below shows the regression fit for change in price for 30Y Fannie 4.5 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

Approach  R² with Outliers  R² without Outliers

Taylor Series Without OAS 0.0812 0.0765
Linear Regression Without OAS 0.4324 0.2690
Automatic Linear Model Selection Without OAS 0.5552 0.6534
Neural Network Without OAS 0.8051 0.9157

B.6 30Y Fannie High Coupon Pools

The figure below shows the regression fit for change in price for 30Y Fannie 5.5 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

Approach  R² with Outliers  R² without Outliers

Taylor Series Without OAS 0.0007 0.0006
Linear Regression Without OAS 0.3306 0.4908
Automatic Linear Model Selection Without OAS 0.6833 0.7486
Neural Network Without OAS 0.6364 0.7945

B.7 Ginnie Low Coupon Pools

The figure below shows the regression fit for change in price for Ginnie 3.0 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

Approach  R² with Outliers  R² without Outliers

Taylor Series Without OAS 0.0158 0.0002
Linear Regression Without OAS 0.2603 0.2209
Automatic Linear Model Selection Without OAS 0.5653 0.6661
Neural Network Without OAS 0.7678 0.8768

B.8 Ginnie Mid Coupon Pools

The figure below shows the regression fit for change in price for Ginnie 4.0 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

Approach  R² with Outliers  R² without Outliers

Taylor Series Without OAS 0.0362 0.0014
Linear Regression Without OAS 0.4073 0.3545
Automatic Linear Model Selection Without OAS 0.4179 0.6890
Neural Network Without OAS 0.8208 0.9271

B.9 Ginnie High Coupon Pools

The figure below shows the regression fit for change in price for Ginnie 5.5 pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

Approach  R² with Outliers  R² without Outliers

Taylor Series Without OAS 0.0685 0.0062
Linear Regression Without OAS 0.3619 0.4745
Automatic Linear Model Selection Without OAS 0.5585 0.7034
Neural Network Without OAS 0.6364 0.7926

B.10 Interest Only Pools

The figure below shows the regression fit for change in price for Fannie IOS 4.0 2010 IO pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

Approach  R² with Outliers  R² without Outliers

Taylor Series Without OAS 0.0020 0.0114
Linear Regression Without OAS 0.3319 0.1717
Automatic Linear Model Selection Without OAS 0.4345 0.3914
Neural Network Without OAS 0.7119 0.8551

B.11 Principal Only Pools

The figure below shows the regression fit for change in price for Fannie POS 4.5 2010 PO pools.

The table below shows the R^2 statistics obtained for the regressions in the figure above.

Approach  R² with Outliers  R² without Outliers

Taylor Series Without OAS 0.0010 0.0399
Linear Regression Without OAS 0.4074 0.2103
Automatic Linear Model Selection Without OAS 0.8979 0.9346
Neural Network Without OAS 0.9168 0.9551

B.12 Estimated Distribution of Change in Cumulative PnL at Portfolio Level

The figure below shows the estimated vs. actual historical distribution of change in cumulative PnL at the
portfolio level obtained using the different approaches. The top four histograms are the estimated distributions
obtained, in order, using Taylor Series Without OAS, Linear Regression Without OAS, Automatic Linear
Model Selection Without OAS and Neural Network Without OAS; the last histogram is the actual distribution.

