Machine learning based US corporate bond return prediction:
An empirical study with extensive bond-specific liquidity parameters.

Nurlan Avazlı*
MSc Thesis Finance
University of Groningen**
Supervisor: Dr. Jules Tinang
June 2, 2021

Abstract
This paper extends the work of Bianchi et al. (2020) by applying machine learning
techniques to the US corporate bond market. Using 4,563 investment-grade bonds with intra-day
transactions from January 2012 to February 2021, I find that artificial neural networks produce
considerably smaller prediction errors than the historical average. Additionally, I find that the
inputs suggested by Merton (1974) do not help to accurately predict the direction of intra-day
prices: model accuracy is around 50% and not statistically significant. On the other hand, the
accuracy of predictions increases to statistically significant levels of around 60% when the
frequency of the data is changed from intra-day to end-of-day close prices. Lastly, I conclude
that adding bond-specific liquidity parameters does not boost predictive power, chiefly because
the parameters offered by credit risk models already capture the illiquidity premium.

Word count: 9802


Key words: Neural networks, credit risk, corporate bond, liquidity, prediction.
*Student number: S4111540, E-mail: n.avazli@student.rug.nl
**Faculty of Economics and Business
1. Introduction
More often than not, predictions play an essential role in our daily lives. Whether it is game
matches, presidential elections or weather forecasts, the predictions are mirrored by the
expectations set upon the outcome. Financial markets are no exception in the sense that starting
from company earnings to central bank interest rate policies, analysts, economists, and
professionals are constantly making forecasts which partially establish the foundation of the
market’s expectations. These prediction methods differ considerably in their nature, approach,
and level of difficulty. Predicting stock prices accurately has historically been a demanding and
exciting area. While it is argued that, at least in the short term, accurate and consistent stock
price prediction is not possible due to the stochastic nature of price movements (Merton, 1974),
this has not restrained finance professionals from publicly sharing their forecasts of near-future
stock prices, which in turn establish the expected returns of the assets.
Macroeconomics has incorporated the expectation factor in the formulation of nominal interest
rates; thus, the nominal interest rate is a function of interest rate expectation and real interest
rate (Blanchard et al., 2003). Developing an advanced quantitative method for predicting asset
prices may yield an economic gain for market players. Green and Figlewski (1999) find
evidence that using sophisticated statistical methods, such as GARCH instead of historical
average, to predict implied volatility substantially decreases option writer’s overall risk
exposure and leads to an economic gain from market-making strategies. It is for this reason
that improving a forecast model carries tremendous importance for financial players.
The area of interest on which this paper is based is which methods or models can be used to
make reliable corporate bond return predictions. Therefore, we consult the latest developments
in computer science to try to answer whether those models, on average, yield better results
compared to the traditional method. Machine learning has well proved its usability in vast areas
of our lives and is successfully employed by big tech firms for advertisement selection. One of
the first widely known applications of machine learning was detecting spam emails. Nowadays,
however, it has advanced to the point that it can even detect diabetes in a particular person
(Geron, 2017).
The application of sophisticated statistical models in predicting asset prices in the financial
literature is abundant. Yet, most of them are focused on risk management, model development,
or equity price prediction. Interestingly, however, despite its popularity, there is limited
research on and application of machine learning techniques in corporate fixed income markets.
There are several potential reasons for this. First and foremost is the lack of high-quality
corporate bond data: only recently have brokers gained access to corporate bond trading after
its immense surge in volume. Corporate bonds have gained tremendous weight and importance
in the financial markets and, until recently, in the financial literature as well. While in 2005 the
amount of outstanding investment-grade corporate bonds stood at around 1.5 trillion euros,
today that number is almost five-fold higher (Kaufmann et al., 2021). The total outstanding
amount of US corporate bonds has become comparable to the size of the US equity market
(Bai et al., 2020). Secondly, the surge of corporate bonds in volume and their increasing weight
in the financial markets is a relatively recent event. The final potential reason is the historically
lesser focus on, and popularity of, quantitative investing in corporate bond markets.
Categorically, bonds can be divided into three general qualities: highest rated risk-free
bonds usually referred to as Treasury bonds, investment-grade bonds, and high yield
speculative bonds. The financial literature has shown that while high-rated bonds offer
Treasury-like returns and price action, speculative bonds bear more similarity to stocks in their
return and price behaviour. The similarity of price action between low-grade bonds and stocks
should not be surprising, since noise and speculation are abundant in both. For this reason,
investment-grade corporate bonds, situated between risk-free and high-yield debt, lend
themselves especially well to a machine learning approach because of all the additional risk
factors associated with them.
This paper extends the work of Bianchi et al. (2020) by expanding the machine learning
application to US corporate bonds, while the benchmark paper focuses solely on US Treasury
bonds. Additionally, this paper employs liquidity factors instead of macro variables as inputs.
Similar to Chen et al. (2007), we extensively employ bond-specific liquidity factors as
explanatory variables and aim to answer whether adding liquidity parameters to our machine
learning models increases predictive power. To the best of our knowledge, no research has been
conducted to predict the returns of corporate bonds using machine learning techniques while
employing extensive bond-specific liquidity parameters. Bianchi et al. (2020) apply machine
learning to predict Treasury bond returns, and Chen et al. (2007) and Lin et al. (2014) employ
firm-specific liquidity parameters to predict corporate bond returns. Following these, I merge
these strands to predict corporate bond returns using machine learning techniques with firm-
specific liquidity parameters. This means there is no reliable reference to confirm or contradict
our results with. All in all, with the surge of corporate bonds in the financial markets and the
recent success of machine learning techniques, it is our interest to test whether machine
learning-based corporate bond risk premia predictions can outperform traditional methods.
The structure of this thesis continues as follows. Section 2 explores the literature review,
searching how far previous researchers have come in corporate bond return prediction and
forms the hypotheses. Section 3 explains in detail the data collection and cleaning for the model.
Section 4 demonstrates the theory and algorithm behind the machine learning techniques this
thesis uses. Section 5 presents the results, and section 6 concludes the findings.

2. Literature Review and Hypothesis Development

Financial literature has focused extensively on the cross-sectional correlation as well as the
return prediction of stocks and bonds (Campbell and Ammer, 1993; Fama and French, 1993).
Intuition suggests that a corporation's bond returns and its equity returns should not differ
immensely, since both represent the same company. Although Merton (1974) quantitatively
linked the credit spread and equity to contingent claims on the assets of the firm, a robust
correlation between the credit spread and the equity premium of the same corporation remains
elusive. The question of which inputs predict corporate bond returns is still largely debatable.
In the following sub-sections, theoretical models, the Fama-French factors and equity factors
are discussed.

2.1 Machine learning applications in finance


Three factors can be associated with the growing adoption of machine learning techniques
in finance. Firstly, the computational power of machines has improved considerably over time,
which provides the incentive to deploy advanced techniques without excessive computation
costs. Secondly, vast amounts of high-quality data have become more available, which paves
the way for back-testing the models; the availability of big data enables training the models to
predict better than traditional methods. Lastly, the ever-increasing interest in and demand for
accurate algorithmic predictions opens the door for machine learning applications in finance
(Geron, 2017). Yet despite all this, we have seen limited application of machine learning
techniques to fixed income until recently. In contrast, most machine learning techniques are
being applied to equity markets (Bali et al., 2020).
A number of papers apply several machine learning methods, such as Artificial Neural
Networks (ANN), Random Forests, Support Vector Machines (SVM) and Gradient Boosting
Machines (GBM), to predict the daily return of stock market indices, the S&P 500 being the
favourite choice due to its importance. These papers conclude with the statistically significant
outperformance of machine learning techniques over the buy-and-hold strategy. Similar papers
have concluded the significant predictive power of machine learning techniques, with GBM
and ANN performing best among the models (Heaton et al., 2016; Henrique et al., 2018;
Rasekhschaffe and Jones, 2019; Wolf and Echterling, 2020; Kaufmann et al., 2021).
Furthermore, machine learning techniques outperform the traditional forecasting methods not
only in predicting stock price returns but also in predicting volatility (Wong et al., 2016).
Atkins et al. (2018) apply machine learning techniques to dissect financial news and predict
future volatility. Furthermore, machine learning has been applied by banks for credit scoring,
using a client's previous track record to produce a probability of default. While methods like
Altman's Z-score had previously been used to predict bankruptcy probability, machine learning
techniques exhibit strong outperformance in detecting potential bankruptcies (Geron, 2017).
Therefore, I proceed to form the first hypothesis, on the outperformance of machine learning.

H1: Machine learning techniques produce smaller prediction errors on US corporate bonds
compared to the historical average method and a linear regression model.

2.2 Default risk: priced more in equity or bond price?


There has been a debate on whether including stock returns as an input to bond return
prediction models increases predictive power. Equity and bond prices are driven by different
factors, even though some factors, such as contingent claims on the company's assets, are
common to both. The financial literature has coined the term 'distress puzzle' for the lack of a
robust correlation between equity and bond returns (Modigliani and Miller, 1958; Penman et
al., 2007). In other words, it is still vague whether an increasing equity value of the corporation
indicates a decreasing probability of default. While Fama and French (1993) claim that time to
maturity and default risk are the main drivers of bond returns, Elton et al. (2001) find that the
expected default rate is surprisingly not the primary explanatory variable when it comes to
predicting the credit spread. Eom et al. (2004) find that the credit spreads implied by structural
credit models underestimate the spreads compared to market pricing. While factors explaining
equity (bond) returns can also partly explain bond (equity) returns, there are still contradicting
findings that further press on the "distress puzzle." Duffresne et al. (2001) find that the
traditional credit spread determinants proposed by structural-form theory (Merton, 1974) have
low explanatory power for the credit risk premium. Friewald et al. (2014) conclude that there
is a positive relationship between a firm's equity return and changes in its credit spread.
However, that paper uses credit derivatives like the Credit Default Swap (CDS) as a proxy for
the bond premium, and new findings show biased and inefficient pricing of these instruments
(Du and Zhu, 2017). Perhaps the lesser focus on the pricing of the default risk premium in
equity and on liquidity factors in firm-specific bonds can partly explain the "distress puzzle".
Typically, the default risk has been searched for and accounted for in the credit spread, while
Vassalou and Xing (2004) find significant predictive power once the credit spread is adjusted
for the pricing of default risk in equity. Furthermore, Goldberg and Nozawa (2021) find that
corporate bond returns can be predicted when accounting for firm-specific credit supply shocks.
It is possible that the origins of the "distress puzzle" lie not in the quantifiable variables
but in the very nature of the buyers and sellers of these instruments. The work of Bai et al.
(2018) provides potential reasons behind this puzzle. To start with, the investor profiles of the
two instruments are different. While stocks are mostly traded by individual investors and retail
traders, bonds are chiefly traded by pension funds and institutional investors. Typical activities
also vary in the frequency of transactions: most bondholders employ buy-and-hold strategies
rather than active buying and selling. This partly explains why corporate bonds are typically
less liquid than stocks. Furthermore, the two groups of market players react to news and model
risk differently (Chordia et al., 2017). Secondly, the price action of the stock and the bond
representing the same company can simply be a result of flows from one to the other (Campbell
and Taksler, 2003). For example, if the company is performing well and earnings beat
expectations, Merton's (1974) model would suggest that the bond price should go up, since the
business risk of the firm decreases, which in turn lowers the credit spread. However, in reality,
bond prices can drop after a better-than-expected result simply because of flows from the
corporate's bonds to its equity in expectation of higher returns. The period between 1990 and
2000 in the US illustrates this phenomenon well: while equity prices surged, credit spreads over
Treasuries widened in parallel, against the predictions of the structural models.

2.3 Merton Model of corporate bond risk premium


The first quantitative technique to derive the credit spread of a company was introduced
by Merton (1974). Intuitively, corporate bond and equity returns should be in sync since both
represent the same company. The model assumes that the value of the firm at time t follows a
Geometric Brownian Motion (GBM) with drift $\mu$ and volatility $\sigma$:

$\frac{dV_t}{V_t} = \mu\, dt + \sigma\, dW_t$    (1)

Where $W_t$ represents the standard Wiener process. The model assumes a risk-neutral
environment. The model proposed by Merton (1974) suggests that the credit spread, or in other
words the company's default premium on its debt, can be modelled by treating the company's
assets and liabilities as a real options problem. Specifically, its equity is a call option on the
company's assets, and the value of its debt is an inverse function of a put option on the value
of those assets. From put-call parity, the value of the corporate debt as a function of risk-free
debt can be derived as follows:

$D(t, T) = P(t, T) - \text{Put}(V, F, T-t, \sigma)$    (2)

Where $P(t, T)$ represents the duration-matched risk-free Treasury bond. The equation above
shows that the credit spread of the corporate bond derives from the put option on the assets of
the company. However, this put option is an abstract value; it is not traded on exchanges. Credit
derivatives, such as the Credit Default Swap, attempt to price precisely the value of the put
option in the equation above. Equation (2) clarifies that the higher the value of the put option,
the lower the value of the company's debt. A decrease in the value of a bond corresponds to a
higher yield, implying a higher additional premium for the debt. Here, $V$ stands for the value
of the company at time $t$, $F$ represents the leverage ratio of the company, which is equal to
total debt divided by equity, and $T-t$ and $\sigma$ represent the time to maturity and the volatility,
respectively. Bektic et al. (2019) refer to the volatility term as the "business risk of the
company's assets". Section 3.2 explains how this term is taken into account in our models.
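To make equation (2) concrete, the sketch below values the put component with a standard Black-Scholes-Merton put formula under the model's constant-volatility, risk-neutral assumptions. The input values (firm value, face value of debt, risk-free rate, maturity, asset volatility) are hypothetical illustration values, not taken from the thesis data.

```python
# Minimal sketch: pricing the Merton put component of equation (2)
# under Black-Scholes-Merton assumptions. All inputs are hypothetical.
from math import log, sqrt, exp
from scipy.stats import norm

def merton_put(V, F, r, tau, sigma):
    """European put on firm value V with strike F (face value of debt)."""
    d1 = (log(V / F) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return F * exp(-r * tau) * norm.cdf(-d2) - V * norm.cdf(-d1)

V, F, r, tau, sigma = 120.0, 100.0, 0.01, 5.0, 0.25   # hypothetical inputs
risk_free_bond = F * exp(-r * tau)                     # P(t, T) for a zero-coupon bond
risky_debt = risk_free_bond - merton_put(V, F, r, tau, sigma)   # equation (2)
credit_spread = -log(risky_debt / F) / tau - r         # continuously compounded spread
print(round(risky_debt, 2), round(credit_spread, 4))
```

A riskier firm (higher $\sigma$ or higher leverage $F/V$) raises the put value, lowers the debt value, and widens the implied credit spread, which is the mechanism the section describes.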
It becomes clear that a higher premium for the company's debt is associated with its higher
business risk, which implies a higher probability of default. Merton's structural credit model
states that a portfolio with a single corporate bond and a put option on the assets of the company
is equivalent to a portfolio holding a single risk-free bond of the same duration. Brownian
motion in the bond price action implies randomness, yet the theory does not shed light on the
time-series frequency at which this randomness dominates. Intuitively, the accuracy of
predictions where randomness dominates the price action should be close to a coin flip.
Therefore, it is our interest to test whether the randomness appears more in intra-day price
movements. The hypothesis relies on the assumption that the end-of-day close price should
bear more of the critical information necessary for the algorithm to learn the corporate bond
price pattern than intra-day price action.

H2: Intraday corporate bond predictions are inaccurate and random, while in comparison,
the daily close price predictions are more accurate.

As much as Merton's (1974) model is groundbreaking for the understanding of the credit
spread and its potential determinants, the model makes eight assumptions that are far from real
markets. Notable among them are the assumptions of zero transaction costs, zero taxes,
continuous trading of the company's assets, normally distributed stock returns, and constant
volatility and risk-free rates. None of these assumptions holds true in real markets. The
illiquidity problem in corporate bonds, which is discussed thoroughly in section 2.6, appears to
exhibit a considerable effect on the price of the bond. Green and Figlewski (1999) discuss how
the assumption of constant implied volatility, the expected variance of the returns, causes value
destruction for option writers. Furthermore, more often than not, corporate bonds are issued
with embedded options, irregular coupons, or floating-rate coupons, which are not included in
the model. Despite all these flaws, the Merton (1974) model provides a framework
for the understanding of the credit spread. Lastly, corporate bonds should by default have a
higher yield than Treasury bonds due to the taxes applied to them; the Merton (1974) credit
model does not take these taxes into account. Formula (2) would imply that a perfectly safe
theoretical company would have a zero put value, which would make its corporate bond
equivalent to a risk-free bond. However, the coupon income from corporate bonds is subject to
state income taxes, while Treasury notes are exempt from state income taxes, which
automatically requires a higher yield on the corporate bond to compensate for the taxes.

2.4 Fama-French factors in corporate bonds
Starting in 1993, Eugene Fama and Kenneth French published a groundbreaking series of
papers on the explanatory variables for the cross-section of stock returns, coined the term
"Fama-French factors", and laid the foundations of factor investing. In the first paper of the
series, the three-factor model with the market excess return, Small minus Big (SMB), and High
minus Low (HML) explains as much as 70% of equity returns. SMB and HML can be
interpreted as size and book-to-market factors, and the excess return is the difference between
the market return and the risk-free rate. Some of the results were unexpected at the time: the
Fama-French three-factor model concluded that large caps actually underperform small caps
and that value stocks outperform growth stocks (Fama and French, 1993). In 2014, Fama and
French added two new factors, profitability (RMW) and investment (CMA), and the new
five-factor model outperforms the three-factor model (Fama and French, 2014). The main
contribution of the five-factor model to explaining average stock returns is that companies with
higher expected earnings have higher stock returns.
The Fama-French factors have been studied extensively in the financial literature and
applied to both developed and emerging markets. Yet there is a gap in whether the factors apply
to corporate bond markets. Bektic et al. (2019) examine four factors in corporate bond returns:
SMB, HML, RMW (profitability), and CMA (investment). The paper finds that factors
explaining equity market returns do not perform well for corporate bond returns. Results are
mixed and somewhat counterintuitive to the structural credit risk models. For instance, while
the structural models suggest a positive relationship between the profitability factor and bond
returns, they find a negative correlation between the profitability of a firm and its bond returns.
Additionally, the research concludes that no factors are significant for investment-grade bonds,
while the empirical findings show some significance of the factors for high-yield bonds. That
is, as the authors mention, expected, since the financial literature has well documented how
speculative bonds behave like stocks while high-grade bonds imitate risk-free bonds (Bektic et
al., 2019).

2.5 Credit derivatives: Credit Default Swap


Credit spreads of corporate bonds are difficult to quantify, and so far the best financial
instrument to track as a proxy for the credit spread of a bond is the Credit Default Swap (CDS).
By nature, it is an insurance contract with a third party, and typically insurance companies are
the issuers. Buyers of these swaps are guaranteed to be paid if the underlying company declares
bankruptcy. The Credit Default Swap is a tradable asset, and supposedly the demand for these
insurance instruments should increase once investors start pricing in a higher probability of
default. It is quoted in terms of spreads: a spread of one basis point corresponds to 1,000
currency units of annual premium per 10 million currency units of notional. For example, a
CDS spread of 300 translates to 300,000 per 10 million of notional, which makes 3 percent. It
follows that the total yield investors should demand on the issuance of the company's debt
should be the sum of the risk-free rate and this 3 percent.
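As a quick illustration of this quoting convention, the following sketch uses purely hypothetical numbers (including the risk-free rate) rather than the thesis data:

```python
# Convert a CDS spread quoted in basis points into an annual premium
# and a percentage yield add-on. Numbers are purely illustrative.
notional = 10_000_000          # currency units of protected debt
cds_spread_bps = 300           # quoted CDS spread in basis points

annual_premium = notional * cds_spread_bps / 10_000    # 300,000 per year
spread_pct = cds_spread_bps / 100                      # 3.0 percent

risk_free_rate_pct = 1.5                                # hypothetical risk-free rate
required_yield_pct = risk_free_rate_pct + spread_pct    # total yield investors demand
print(annual_premium, spread_pct, required_yield_pct)
```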

While it is tempting to use the CDS as a proxy for the credit spread, recent findings show
that the CDS market suffers from structural flaws. Due to its insurance nature, the CDS auction
mechanism leads to inefficient and biased prices (Du and Zhu, 2017). Additionally, since CDS
are traded over the counter (OTC), they are also exposed to market liquidity-related issues.
Chen et al. (2010) find evidence that, due to liquidity issues in CDS markets, the credit spread
is not efficiently priced in the CDS. Longstaff et al. (2005) find the existence of an illiquidity
component in bond spreads: they divide the spreads into default and non-default components
and find evidence that while the default component of the bonds is primarily priced in the CDS,
the "non-default component is time-varying and strongly related to measures of bond-specific
illiquidity".
CDS are exposed to additional insurance-related variation, such as flows from pension
funds as a hedging activity, which distorts their function as a signal of the pure risk premium
of the bond. The idea is that while the supply and demand for these instruments is supposed to
move in line with the market's implied default probability, a number of pension funds need to
allocate funds to CDS as a means of decreasing their total risk exposure, since shorting bonds
is not allowed for many such funds. This leaves the CDS market exposed to inefficiency in
pricing the true credit risk premium. Recent research by Jiang et al. (2021) verifies the statement
above and concludes that due to idiosyncratic liquidity issues, the CDS fails to produce efficient
prices. Additionally, bonds are often issued with embedded options, making them callable
bonds. Nerin and Huang (2002) point out that the effect of the embedded options is not
translated into the CDS, once again diverting it from its purpose of signalling the credit
premium on the bond. Zhu (2014) states that no causal evidence has been found between the
credit spread and the CDS, even though theory would suggest so. This paper uses the Option
Adjusted Spread (OAS) as a proxy for the credit spread, which does not suffer from the
inefficiencies of CDS markets. Section 3.2 explains the OAS in detail.

2.6 Importance of liquidity


More often than not, and as a last resort, illiquidity tends to be the attributed reason when
the traditional parameters fail to explain the determinants of the credit spread (Longstaff et al.,
2005; Duffee, 1999). The potential reason for this is the fact that most quantitative models,
including Merton (1974), do not take transaction costs into account, where the cost of a
transaction mirrors the liquidity of the instrument. Inputs proposed by the structural models
tend to have little to no ability to explain the price behaviour of credit spreads (Duffresne et al.,
2001). Historically, the liquidity parameter has been overlooked and rather left in the shadows
by the financial literature, chiefly due to the lack of a robust liquidity theory with which to
measure an instrument's liquidity profile. Generally, liquidity can be defined as a market
player's ability to buy and sell an asset as soon as possible without incurring additional losses.
Quantitative models in asset pricing like Merton (1974) have traditionally ignored liquidity
factors and assumed the asset to be always tradable in the market. Even when empirical studies
started to adjust for liquidity factors, in other words to include transaction costs, these costs
were assumed to be constant and typically set equal to 12 basis points (Jostova et al., 2013) or
0.5% for equities (Brandt et al., 2009). Here, the lack of a robust quantitative theory on the
measurement of liquidity makes it difficult to create a liquidity variable based on robust
parameters. For this reason, there are a number of different proxies that can be used to measure
the liquidity profile of an instrument. The first and probably most prevalent liquidity measure
is the bid-ask spread: the difference between the best ask and bid prices, divided by the
mid-price, yields a liquidity percentage. The cost of liquidation is typically calculated as half
of this liquidity percentage (Hull, 2018). Moreover, I proceed to form the hypothesis that, due
to the lack of robust theories on liquidity, it is not fully priced in by the market; therefore,
including liquidity parameters potentially uncovers new information which will boost the
predictive power of the models.

H3: Adding liquidity parameters to the inputs boosts predictive power compared to
yield-only prediction.

A number of notable papers in finance do not employ liquidity as an input in their corporate
bond spread analysis; Elton et al. (2001) and Bai et al. (2018) use close prices only. This can
potentially lead to information loss, because the close price takes the mid-point between the
best ask and bid offers. It is possible that the spread between the best ask and bid moves
symmetrically such that the mid-point does not change, implying the same close price but
missing the potential information behind the ask-bid move.

3. Data collection

A great deal of effort has been spent to obtain clean and high-quality data. The fixed income
data sample used in this paper has been obtained from 7 Chord Inc., a fixed-income trading
startup with access to Trade Reporting and Compliance Engine (TRACE) transaction data. The
sample contains firm-specific bond data for the top 100 liquid United States-based corporations.
The corporate bond data include intraday tick prices across a variety of maturities and coupons.
All of the corporate bonds in the sample are investment grade (IG) and US Dollar denominated.
The starting date is January 2012, and the end date is the 28th of February 2021. Additionally,
the data are further enriched with a number of additional bond-related variables as well as the
industry of the corporations. Sorted by year, the total number of bond observations exceeds
18,000, of which 4,563 are unique bonds; in other words, the same bonds appear in the datasets
of different years. The sample data used in this paper provide rich access to intraday tick prices
with the exact time of each transaction, allowing us to enrich the sample with liquidity proxies.
This paper does not employ firm-specific credit ratings but only bond-specific liquidity
parameters. Chen et al. (2007) find that credit spread changes can be explained more by taking
liquidity into account than ratings: "…we find that liquidity changes explain more of the
variation in yield spread changes than do changes in the credit rating".
We divide the bond-related information into two groups: price and liquidity-related data.
With regard to price, the sample contains intraday ask-bid price spreads, the ask-bid spread of
the option-adjusted spreads, ask-bid yield to maturity, and ask-bid spreads over the duration-
matched Treasury bond. The liquidity parameters include the spread between the best ask and
bid offers of the price, the OAS, the yield to maturity and the duration-matched Treasury bond
spread. Additionally, following Chen et al. (2007), the average number of times the bond has
been traded during the day and over the last 60 days, and the number of days the bond has not
been traded, are also included in the sample. Following Kaufmann et al. (2021), only bonds
with an initial notional value above $50 million are included. Similarly, this paper excludes the
accrued interest to obtain the clean price. The sample does not include floating-rate bonds,
mortgage bonds, convertible bonds, or bonds with variable or surprise coupons. Moreover,
since bonds trade even after the company has defaulted, the sample data include the last tick
price of the bond at the moment of default, thus achieving survivorship-bias-free prices.

3.1 Cleaning the data


Obtaining clean and reliable fixed income data is of utmost importance; as highlighted
above, the lack of high-quality corporate fixed income data is associated with the historical
disinterest in this area. To this end, we perform cleaning similar to Kaufmann et al. (2021): we
exclude bonds with price changes of 50% and higher and maturities longer than ten years.
Goldberg and Nozawa (2021) argue that liquidity in bonds tends to fade towards maturity.
Additionally, Bao and Hou (2017) find that longer-maturity bonds tend to drop from senior to
junior notes as they near maturity, and evidence of drying liquidity has been found. The sample
data used in this paper include only senior debt instruments.
The return of the bond can be formulated as the change between the sum of the price and
accrued interest today and one business day before, plus coupons, if any. The formula is as
follows:

$r_t = \frac{(P_t + A_t) + C_t - (P_{t-1} + A_{t-1})}{P_{t-1} + A_{t-1}}$    (3)

Where $P_t$, $A_t$, and $C_t$ stand for the price, accrued interest and coupon, if any, at time $t$.
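The following sketch implements equation (3) on a pandas DataFrame; the column names (price, accrued, coupon) are hypothetical and only illustrate how the clean-price return could be computed per bond.

```python
# Minimal sketch of equation (3): one-day bond returns from price,
# accrued interest and coupon payments. Column names are hypothetical.
import pandas as pd

def bond_returns(df: pd.DataFrame) -> pd.Series:
    """r_t = ((P_t + A_t) + C_t - (P_{t-1} + A_{t-1})) / (P_{t-1} + A_{t-1})."""
    dirty = df["price"] + df["accrued"]          # dirty price P_t + A_t
    prev = dirty.shift(1)                        # previous business day's dirty price
    coupon = df["coupon"].fillna(0.0)            # coupon paid at t, zero if none
    return (dirty + coupon - prev) / prev

sample = pd.DataFrame({
    "price":   [101.2, 101.5, 100.9],
    "accrued": [0.30, 0.35, 0.00],
    "coupon":  [0.0, 0.0, 0.40],
})
print(bond_returns(sample))
```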

3.2 Option Adjusted Spread (OAS)


Bonds come with options embedded in their term structure, which can change the price.
To mitigate this problem, we use the Option Adjusted Spread as an input instead of Credit
Default Swaps, for the reasons explained in section 2.5. It is intuitively tempting to assume that
the spread over the duration-matched Treasury bond would serve as a proxy for the credit
spread, as Elton et al. (2001) use. However, one distinction is that they take the zero-coupon
corporate bond spread, the spread over Treasury (SOT), over the duration-matched zero-coupon
Treasury bond, while the bond sample in this paper includes bonds with coupons. We conclude
that we should use the OAS, not the SOT, as our credit spread proxy. This ensures that we are
comparing a credit spread adjusted for all of the optionality in the bond's terms and conditions,
whereas the SOT is more of a yield differential that is affected by many other factors, such as
coupon frequency. Moreover, the value of the embedded options also fluctuates with the
market's expectation of volatility, which means the OAS captures the volatility term included
in equation (2).
The OAS is denoted in basis points, while the price is expressed as a percentage of par. For
example, a price of 100 means 100% of the bond's par value, which is 1,000 for the bonds in
our universe. An OAS of 1 basis point is 0.01 percent. For example, a 5-year maturity, 2 percent
coupon Apple bond maturing in May 2020 had an average OAS of 9.2 basis points over its
lifetime.

3.3 Proxies for liquidity


As much as liquidity has historically received less focus in the financial literature, the
papers that do employ liquidity in their empirical models use aggregate-level liquidity indices
(Duffie and Singleton, 1997; Campbell and Taksler, 2003). While the most used proxy for
liquidity is the bid-ask spread, Charles et al. (1999) introduce the number of zero returns as a
new liquidity proxy. The idea is that a new price tick occurs as soon as new information brings
more utility than the liquidation cost it incurs, and the frequency of this series of occurrences
can be used as a liquidity parameter. Bekaert et al. (2006) confirm the validity of this liquidity
proxy. In our sample, we achieve this by including an input that tracks the difference in time to
maturity between consecutive ticks, which in itself bears information about the frequency of
new price ticks.

3.3.1 Bid-Ask spreads


This paper examines more than 4,500 US corporate bonds with extensive bond-specific
liquidity parameters. The sample includes the best bid and ask offer data for each transaction.
The price is calculated as the mid-point between the ask and the bid. Furthermore, the bid-ask
spreads of the Option-Adjusted Spread, derived from the ask and bid prices, are also included.
Here, naturally, the OAS ask value is lower than the bid.
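As a sketch of how these proxies can be computed from tick data, following the bid-ask spread and half-spread liquidation cost described in section 2.6 (the column names ask and bid are hypothetical):

```python
# Minimal sketch: mid price, relative bid-ask spread and half-spread
# liquidation cost (Hull, 2018) from hypothetical tick data.
import pandas as pd

ticks = pd.DataFrame({
    "ask": [101.40, 101.55, 101.20],
    "bid": [101.10, 101.25, 100.95],
})

ticks["mid"] = (ticks["ask"] + ticks["bid"]) / 2
ticks["rel_spread"] = (ticks["ask"] - ticks["bid"]) / ticks["mid"]  # liquidity percentage
ticks["liquidation_cost"] = ticks["rel_spread"] / 2                  # half the spread
print(ticks[["mid", "rel_spread", "liquidation_cost"]])
```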

TABLE 1: Distribution of sample bond data according to their characteristics, year by year

The table displays the sample bond data employed in this paper. The average ask-bid spread column reports the spread between the ask and bid prices
of the bonds relative to their prices, in percent. It can be seen that during the 2020 COVID-19 crisis, the liquidity spread and the OAS spread both
widened considerably. Since the 2021 data consist of only the first two months, they are concatenated to the end of 2020, meaning that 2020
represents 14 months.

Year   # of bonds   Avg. coupon rate   Avg. price (in $)   Avg. OAS (in bps)   Avg. YTM   Avg. ask-bid spread (x10^-2)   Avg. maturity (in years)
2012   852          5.36%              110.14              232.11              3.90%      15.24%                         7.81
2013   1137         5.10%              106.02              167.91              3.68%      14.96%                         5.46
2014   1413         4.89%              105.65              127.99              3.38%      14.30%                         6.54
2015   1743         4.68%              103.51              160.45              3.33%      13.93%                         7.53
2016   2115         4.53%              103.33              181.76              3.34%      13.75%                         5.88
2017   1665         4.19%              102.91              129.68              3.37%      13.50%                         5.21
2018   2258         4.58%              99.63               115.48              3.91%      12.18%                         6.29
2019   4593         4.82%              103.76              144.75              3.58%      13.15%                         6.36
2020   2679         4.83%              106.52              319.84              4.20%      41.32%                         7.11

4. Methodology

Machine learning is a broad term for training a machine to learn on its own. The idea is
that inputs are provided to the machine and, through sophisticated algorithms, the machine can
train itself on the sample and start making predictions for new, unseen input data (Geron, 2017).
Looking broadly at machine learning theory, the algorithms can be divided into two groups:
supervised and unsupervised learning. Supervised learning has similarities to the regression
model, where the model is trained by first providing the real values and allowing the model to
fit to the provided data. In supervised learning algorithms, the training set already includes the
actual values, and the model is required to assign weights and biases to predict the test values.
Unsupervised learning algorithms, in contrast, train the model while the outputs are not
explicitly given. This paper uses supervised learning algorithms where the actual values in the
training dataset are the historical corporate bond returns, with the variety of inputs suggested
by the structural models.

4.1 Artificial neural networks and deep learning


Artificial Neural Networks (ANN) are inspired by the way the human brain works. While
each biological neuron stores information and passes it to other neurons, such that the collection
of all neurons makes the decision, the ANN system imitates this model by treating the inputs
with random weights and learning from experience (Geron, 2017). At the start of training, each
neuron is assigned a random weight; the network compares its output with the actual output
and aims to minimize the loss. The typical loss function chosen in ANN systems is the mean
squared error. Figure 1 depicts the simplest form of the ANN system: the perceptron. First,
random weights are assigned to the inputs, yielding the weighted sum, which can be formulated
as follows:

$z = w_1 x_1 + w_2 x_2 + \dots + w_n x_n = \mathbf{w}^{\top}\mathbf{x}$    (4)

The model then applies an activation function to the weighted sum. Several activation
functions exist, such as the Heaviside step function, logistic activation, linear activation, the
Rectified Linear Unit (ReLU), and the sigmoid function. Each function serves a different
modelling need. This flexibility of machine learning, in activation functions and techniques,
allows it to be employed for various problems ranging from company revenue prediction to
diabetes forecasting. The figure below displays the Heaviside activation function. This paper
uses the ReLU activation, which is calculated as follows:

$h_{W,b}(x) = \max(0,\; Wx + b)$    (5)

Where $b$ represents the bias. It can be seen that the ReLU function does not allow negative
outputs, in line with our model, since bond prices can only be positive numbers. Bianchi et
al. (2020) set forward rates as the bias yet find no increase in predictive power. This paper
does not set a bias value in the model.

Figure 1: The simplest artificial neuron with three weights and a single output. The figure is
taken from Geron (2017).

The perceptron is the simplest form of an ANN, representing a single neuron. To expand,
Figure 2 portrays a full ANN, where each neuron is a single perceptron. Since the model assigns
random weights at first, the loss is considerably higher at the start, and the model then changes
the weights in each epoch according to the gradient descent function. The perceptron proceeds
to assign weights to the inputs and export the outcome to another neuron, as depicted in
Figure 2. The weights, however, get updated in each epoch to achieve a minimum value of the
loss function. The algorithm for assigning new weights, the Gradient Descent Function (GDF),
proceeds as follows:

$w_{ij}^{(\text{next step})} = w_{ij} + \eta\,(y_j - \hat{y}_j)\,x_i$    (6)

Where $w_{ij}$ denotes the weight connecting the $i$th input and the $j$th output neuron, $\eta$ is the
learning rate, and $y_j$ and $\hat{y}_j$ are the given actual output and the output from the trained neuron,
respectively. Finally, $x_i$ represents the training value of the input to the $i$th neuron. In each
complete pass, which is called an epoch, the GDF completes a set of updates and starts training
a new epoch with backpropagation, where new weights are assigned until the algorithm reaches
a minimum value of the loss function. It is crucial to keep the learning rate constant, since the
length of each step towards the minimum already shrinks; decreasing the learning rate in
addition to the reduction in step size risks the algorithm saturating at a point without proceeding.
Note that too high a learning rate also risks missing the minimum value of the loss function.
Typically, the learning rate is kept around 1-2%. In this paper, the learning rate is set to 0.1%.
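A minimal sketch of the perceptron-style update rule in equation (6), run on hypothetical toy data; it is not the network architecture used in the thesis, only an illustration of how the weights move towards a lower loss.

```python
# Minimal sketch of the perceptron-style weight update in equation (6):
# w <- w + eta * (y - y_hat) * x, with a ReLU output.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))              # hypothetical inputs
true_w = np.array([0.5, -0.2, 0.1])
y = np.maximum(0.0, X @ true_w)            # hypothetical targets

w = rng.normal(size=3)                     # random initial weights
eta = 0.001                                # learning rate (0.1%, as in the paper)

for epoch in range(50):
    for i in range(len(X)):                # per-observation update, equation (6)
        y_hat_i = max(0.0, X[i] @ w)       # ReLU activation
        w += eta * (y[i] - y_hat_i) * X[i]
    mse = np.mean((y - np.maximum(0.0, X @ w)) ** 2)
print(np.round(w, 3), round(float(mse), 6))
```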

Figure 2: The figure plots an example of the artificial neural networks with five inputs and two output
neurons. The whole artificial neural network setup can be divided into three layers: input, hidden and
output layer. The information flow starts from the left side and proceeds to the right side. The number
of neurons on the input layer is equal to the number of inputs in the model, while the output layer is the
desired number of outputs for the testing data. This paper uses one-hidden layered networks similar to
Bianchi et al. (2020).

4.2 Underfitting
Underfitting occurs when the algorithm fails to deduce any relationship due to the lack of
relevant variables. An example of underfitting would be predicting a company's revenue with
an irrelevant input, such as birth rates in another country. Underfitting is observed when the
loss function does not decrease on the training data. Recall that the neural network assigns
weights to each input to compute a loss value and tries to minimize it with the gradient descent
function. In the case of underfitting, the training and test loss curves are flat and do not decrease.
Figure 5 portrays how the loss function decreases in each epoch, clearly indicating that the
model does not suffer from an underfitting problem, thanks to the structural credit risk models,
which allowed us to choose inputs relevant for return prediction.

4.3 Overfitting
Overfitting can be thought of as the opposite of underfitting. In the case of overfitting, the
model fits the training data quite accurately yet fails to produce accurate results for the testing
data. Detecting whether the model overfits is straightforward: if the training and test loss curves
diverge, the algorithm overfits. Aside from choosing the inputs suggested by the credit models,
we randomly shuffled the training and test data to restrain the model from overfitting. While
the raw sample data consist of continuous time-series tick prices, after shuffling the training
and test data become discontinuous, each observation bearing information about a specific
bond. Figure 3 illustrates an example of overfitting when the data are not shuffled: despite
repeated epochs, the testing loss does not converge to the training loss, clearly signalling that
the model has overfitted. Additionally, a stochastic gradient descent function can be employed
instead of a batch gradient descent function, which helps with the speed and overfitting of the
algorithm.
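A sketch of the shuffling step using scikit-learn's train_test_split; the tooling and split ratio are assumptions for illustration rather than the thesis's documented implementation.

```python
# Minimal sketch: randomly shuffle observations into training and test
# sets so the model cannot exploit the time-series ordering of ticks.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 22))   # 22 hypothetical input features per bond tick
y = rng.normal(size=1000)         # hypothetical bond returns

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=42
)
print(X_train.shape, X_test.shape)
```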

Figure 3: The performance of the model when the inputs are not shuffled. As we can see, while the
training loss is considerably small, the testing loss diverges immensely, indicating the model suffers
from overfitting.

4.4 Feature selection


Motivated by the structural credit risk models, we provide the following inputs to our
neural network: the company name, industry sector, bond CUSIP id, bond maturity, coupon
rate, time to maturity, OAS ask-bid spread, ask-bid price, ask-bid yield-to-maturity, ask-bid
spread over the duration-matched Treasury bond, the number of times the bond was traded in
a day, the average number of times the bond was traded over the last 60 days, the number of
times the bond was not traded, and the previous 30 days' median error of the OAS. Dummy
variables are assigned to the first three features, the company name, its sector and the bond's id,
to allow the algorithm to distinguish bond-specific information. Initially, all the inputs implied
by the credit risk models are provided in addition to the liquidity parameters. The importance
of the features is assessed with a Random Forest Regressor and further discussed in section 5.
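A minimal sketch of how feature importance could be extracted with scikit-learn's RandomForestRegressor; the feature names and data below are hypothetical and the thesis's exact configuration is not reproduced here.

```python
# Minimal sketch: one-hot encode categorical bond features and rank
# feature importance with a Random Forest Regressor. Feature names
# and data are hypothetical.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
df = pd.DataFrame({
    "sector": rng.choice(["tech", "energy", "utilities"], size=500),
    "coupon_rate": rng.uniform(2, 6, size=500),
    "time_to_maturity": rng.uniform(0, 10, size=500),
    "oas_bid_ask_spread": rng.uniform(0, 50, size=500),
    "trades_per_day": rng.integers(1, 40, size=500),
})
y = rng.normal(size=500)                      # hypothetical bond returns

X = pd.get_dummies(df, columns=["sector"])    # dummy variables for categoricals
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

importance = pd.Series(model.feature_importances_, index=X.columns)
print(importance.sort_values(ascending=False))
```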

4.5 Feature range


The sample input data come in different ranges; for instance, while the time to maturity is
expressed between 0 and 1, the OAS and the price are factors of ten larger. This translates into
the need to rescale and reduce the dimensionality of the data for neural network-based models.
To achieve this, we use a Principal Component Analysis (PCA) type transformation to reduce
the dimensionality without losing the necessary information in the data. One should note that
only the inputs are processed through this transformation.
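A sketch of such a transformation with scikit-learn, standardization followed by PCA; the number of retained components is a hypothetical choice for illustration.

```python
# Minimal sketch: standardize the inputs and apply PCA so features on
# very different scales do not dominate the neural network training.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
X = np.column_stack([
    rng.uniform(0, 1, 500),        # time to maturity (already in [0, 1])
    rng.uniform(80, 120, 500),     # price, in percent of par
    rng.uniform(50, 400, 500),     # OAS, in basis points
])

X_scaled = StandardScaler().fit_transform(X)     # zero mean, unit variance
X_reduced = PCA(n_components=2).fit_transform(X_scaled)
print(X_reduced.shape)
```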

4.6 Evaluation of out-of-sample performance


Similar to Bianchi et al. (2020) and Lin et al. (2014), we calculate the model performance
using the out-of-sample R-squared suggested by Campbell and Thomson (2007). It resembles
the in-sample R-squared and is as follows:

$R^2_{OOS} = 1 - \frac{\sum_{t=1}^{T}(y_t - \hat{y}_t)^2}{\sum_{t=1}^{T}(y_t - \bar{y}_t)^2}$    (7)

Where $T$ is the whole sample size and $y_t$, $\hat{y}_t$ and $\bar{y}_t$ are the realized bond price, the predicted
bond price and the historical average of the bond price until time $t-1$, respectively. If the
out-of-sample R-squared is positive, it implies that the model performs better than the historical
average price. The above-mentioned papers typically achieve a positive out-of-sample
R-squared when employing neural networks. The comparison of the out-of-sample performance
of the predictive models is conducted with the Diebold-Mariano Test.
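A minimal numpy sketch of equation (7), where the benchmark forecast is the expanding historical average up to t-1; the arrays below are hypothetical.

```python
# Minimal sketch of the out-of-sample R-squared in equation (7):
# 1 minus the ratio of model squared errors to historical-average squared errors.
import numpy as np

def r2_oos(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Expanding historical average up to t-1 as the benchmark forecast.
    cummean = np.cumsum(y_true) / np.arange(1, len(y_true) + 1)
    bench = cummean[:-1]                      # benchmark forecast for t = 2, ..., T
    err_model = y_true[1:] - y_pred[1:]
    err_bench = y_true[1:] - bench
    return 1.0 - np.sum(err_model ** 2) / np.sum(err_bench ** 2)

rng = np.random.default_rng(4)
y = rng.normal(size=200)
y_hat = y + rng.normal(scale=0.5, size=200)   # hypothetical model forecasts
print(round(r2_oos(y, y_hat), 4))
```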

4.7 Comparison of predictions: The Diebold-Mariano Test


In finance, predictions play a vital role in modelling; for instance, the implied volatility
used in option modelling relies on future volatility predictions. As much as it is important to
make an accurate prediction, it is also important to be able to compare various prediction
models. The Diebold-Mariano Test suits this purpose, comparing the predictive power of two
forecasts (Diebold and Mariano, 1995). Consider a time series with actual values
$\{y_t;\ t = 1, \dots, T\}$ and two sets of forecasts $\{\hat{y}_{1t};\ t = 1, \dots, T\}$ and $\{\hat{y}_{2t};\ t = 1, \dots, T\}$.
The question of interest is whether the two forecasts are equally accurate. To test this, define
the forecast error as:

$e_{it} = \hat{y}_{it} - y_t, \quad i = 1, 2$    (8)

Furthermore, we apply a loss function $g(\cdot)$ to the error term, where the loss function should
satisfy the following three properties. Firstly, it should yield zero when the forecast makes no
error. Secondly, it should always remain non-negative, as no distinction is made between
positive and negative forecast errors. Lastly, it should give greater weight to errors of large
magnitude, meaning that larger errors are weighted more heavily than smaller errors, similar to
the calculation of volatility. Typically, the loss function is the squared error, as in the
(root-)mean-squared error loss used in the neural networks. Finally, the difference between the
two loss series is given as:

$d_t = g(e_{1t}) - g(e_{2t})$    (9)

The null hypothesis claims no difference in the accuracy of the two forecast methods,
$H_0: E(d_t) = 0$, against the alternative $H_a: E(d_t) \neq 0,\ \forall t$.
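A minimal sketch of the Diebold-Mariano statistic with a squared-error loss and the asymptotic normal approximation, ignoring the small-sample and autocorrelation corrections used in more careful implementations; the forecasts below are hypothetical.

```python
# Minimal sketch of the Diebold-Mariano test with squared-error loss.
# The statistic is the mean loss differential divided by its standard error;
# under H0 it is approximately standard normal for one-step-ahead forecasts.
import numpy as np
from scipy.stats import norm

def diebold_mariano(y_true, f1, f2):
    e1 = np.asarray(f1) - np.asarray(y_true)     # errors of forecast 1
    e2 = np.asarray(f2) - np.asarray(y_true)     # errors of forecast 2
    d = e1**2 - e2**2                            # loss differential d_t
    dm = d.mean() / np.sqrt(d.var(ddof=1) / len(d))
    p_value = 2 * (1 - norm.cdf(abs(dm)))        # two-sided test
    return dm, p_value

rng = np.random.default_rng(5)
y = rng.normal(size=300)
f_nn = y + rng.normal(scale=0.3, size=300)       # hypothetical neural network forecast
f_lr = y + rng.normal(scale=0.6, size=300)       # hypothetical linear regression forecast
print(diebold_mariano(y, f_nn, f_lr))
```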

4.8 The model


To our best knowledge, there have been no papers studying machine learning-based
corporate bond returns with extensive bond-specific liquidity parameters. The inputs inferred
from the structural credit models are included, as well as other alternative variables, such as
the liquidity proxies. In total, 22 inputs are provided for each bond, including company, sector,
credit spread, and liquidity proxies. The testing results are divided into two groups: year-by-year
test results and input-sorted test results.
This paper follows four steps. The first step is to build a neural network to predict the
returns of corporate bonds and create the data frame. The second step is to follow the same
procedure with the linear regression model and add its results to the existing data frame. The
third step involves calculating the out-of-sample R-squared for each method and measuring the
mean squared errors. The fourth step is to perform the Diebold-Mariano Test to check the
hypothesis that the two forecast methods are equally accurate; if this is rejected, the forecast
method with the higher out-of-sample R-squared is taken to be the more accurate one.
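A minimal sketch of the first step, a one-hidden-layer network with ReLU activation and a mean-squared-error loss as described in section 4.1; the layer size, optimizer settings and Keras tooling are assumptions for illustration, not the thesis's documented configuration.

```python
# Minimal sketch: a one-hidden-layer neural network with ReLU activation,
# MSE loss and a 0.1% learning rate, trained on hypothetical data.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(6)
X_train = rng.normal(size=(2000, 22)).astype("float32")   # 22 inputs per bond
y_train = rng.normal(size=(2000, 1)).astype("float32")    # bond returns

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(22,)),  # hidden layer
    tf.keras.layers.Dense(1),                                         # predicted return
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.001),
              loss="mse")
model.fit(X_train, y_train, epochs=20, batch_size=64, verbose=0)
print(model.evaluate(X_train, y_train, verbose=0))
```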

5. Results

Much of this paper relies on the methodology followed by Bianchi et al. (2020) and on
trial-and-error techniques. While the cited papers focus on predicting returns using a single
technique at a time, and are thereby able to compare the accuracies of different machine
learning methods, in this paper the comparison between machine learning models is not
studied, chiefly due to time and computing power constraints. To recap, this paper examines
three hypotheses. The first is whether machine learning techniques generate higher predictive
power compared to a linear regression model. The second concerns the information flow,
tested by changing the frequency of the data from intraday to end of day. Lastly, I test the
importance of the liquidity parameters by comparing whether prediction accuracy is indeed
higher when they are included in the inputs than when they are not.

5.1 The Comparison between Artificial Neural Networks and Linear Regression-based
Predictions.
The preliminary results show a higher accuracy of predictions compared to linear
regression. Linear regression is chosen as the basis for our comparison since it is heavily used
in econometrics and finance. While some steps of neural networks are similar to linear
regression, such as finding the optimal weights for error minimization, the rest of the procedure
allows one to set flexible rules on the model, such as choosing activation functions according
to the desired outputs. Table 2 presents the overall findings of the neural networks and compares
them with linear regression. On the left, it displays whether the out-of-sample R-squared is
positive, along with its statistical significance. As can be seen, the out-of-sample R-squared for
the neural networks is strictly positive and statistically significant at the 1% level in all tested
years, meaning the neural network-based predictions outperform the historical average. The
results are tested against the findings of linear regression as the comparison model in the
Diebold-Mariano Test. In contrast, the linear regression does not achieve a positive
out-of-sample R-squared in any tested year. The right side of the table displays the mean
squared errors for the neural networks. For clarity, the MSE for linear regression is not shown
in the table but is used only for the Diebold-Mariano Test.
Moreover, the comparison between neural networks and linear regression is conducted
with different inputs as well. To back-test the hypothesis, two sets of inputs are constructed
and tested separately. "Yield only" means the neural networks are provided only with the
historical yield data as well as other standard inputs such as the coupon and time to maturity.
As the table illustrates, when the same model is trained and tested with the liquidity parameters
attached, such as ask-bid spreads and days of zero returns, the overall result still holds. Neural
networks significantly outperform the linear regression model even when the extensive
bond-specific liquidity parameters are not provided. To illustrate the findings, Figure 4 shows
the plots of predicted versus realized prices and the error distributions of the neural network
and linear regression-based predictions. With the results in Table 2, I reject the null hypothesis
and conclude that the neural network-based predictions produce considerably smaller
prediction errors compared to the historical mean and linear regression.

5.2 The impact of frequency of time series data on the information flow
An equally important factor in time series analysis is the frequency of the data. All cited
works in this paper use daily or monthly close prices. Since the sample data frequency used in
this paper is intraday, it also gives an opportunity to transform the data from intraday to daily
close prices. The daily close price is the last transaction that occurred during the day, regardless
of when the transaction took place. To test whether the changing frequency of the data impacts
the accuracy and errors of the predictions, the Classification setup has been implemented. As
such, there are two neurons in the output, as illustrated on Figure 2. The binary values are either
1 or 0, marking whether the next price is upwards or downwards, respectively. This section
tests whether the prediction of the direction of the next transaction price varies with the
frequency of data.
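A sketch of how the binary direction labels and a two-neuron classification output could be set up; the Keras tooling, layer sizes and data are assumptions for illustration.

```python
# Minimal sketch: build up/down direction labels from a price series and
# train a two-neuron softmax classifier, mirroring the classification setup.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(7)
prices = 100 + np.cumsum(rng.normal(scale=0.2, size=1001))
X = rng.normal(size=(1000, 22)).astype("float32")          # hypothetical inputs
y = (np.diff(prices) > 0).astype("int32")                  # 1 = up, 0 = down

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(22,)),
    tf.keras.layers.Dense(2, activation="softmax"),        # two output neurons
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=10, batch_size=64, verbose=0)
print(model.evaluate(X, y, verbose=0))
```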

TABLE 2: Descriptive results of out-of-sample R-squared and mean squared errors

The table reports the out-of-sample R-squared results for the two forecast methods, using a 0.1% learning rate and shuffled training data. On the left,
an X denotes a positive $R^2_{OOS}$. The out-of-sample R-squared results are tested with the Diebold-Mariano Test. Significance of the parameters is
denoted as follows: * p < 0.10, ** p < 0.05, *** p < 0.01.

            $R^2_{OOS}$ (Neural Networks)             MSE (Neural Networks)
Years       Yield only      Yield & liquidity         Yield only      Yield & liquidity

2012        X ***           X ***                     0.34 ***        0.33 ***
2013        X ***           X ***                     0.39 ***        0.41 ***
2014        X ***           X ***                     0.42 ***        0.45 ***
2015        X ***           X ***                     0.41 ***        0.39 ***
2016        X ***           X ***                     0.40 ***        0.39 ***
2017        X ***           X ***                     0.36 ***        0.42 ***
2018        X ***           X ***                     0.38 ***        0.44 ***
2019        X ***           X ***                     0.37 ***        0.36 ***
2020        X ***           X ***                     0.44 ***        0.39 ***

Figure 4: The figure illustrates the predictions of the neural networks and the distribution of errors of
the neural networks and linear regression. The figure on the left portrays the realized price against the
neural network predicted price. The figure on the right shows the distributions of errors, where it can be
seen that linear regression produces significantly higher errors.

The motivation for this hypothesis originates from the historically low interpretability of
machine learning techniques, often referred to as a "black box". Back-testing of the model is
thus vital for interpretability. The predictions exhibited low errors in the previous section. The
question, however, is whether this translates into a successful model, since the neural networks
could merely have learned to assign small random errors to the next tick price. To test this, a
classification model is necessary to assess whether the neural networks indeed learn meaningful
information from the inputs. Table 3 presents the accuracy of direction, tested for intraday and
end-of-day data. The Diebold-Mariano Test has been implemented against a randomly assigned
array of 1s and 0s to test whether the neural network-based directional accuracy is statistically
different from randomness. The table indicates that intra-day predictions are not different from
randomly assigned arrays of 1s and 0s, whereas the directional accuracy for the next day's close
price is statistically different from randomness at, on average, the 5% level. Therefore, we
reject the null hypothesis and conclude that the algorithm does not learn meaningful information
when trained with intraday data, but that meaningful information does exist in daily close
prices. This finding also helps interpret the results of the previous section, where the test is
implemented on an intraday basis: the impressively low intraday prediction errors are largely
randomly assigned values produced by the neural networks.

5.3 The importance of liquidity parameters
Section 2.4 explains the importance of liquidity parameters in detail. While liquidity cannot be
defined by a single parameter, several proxies have been used to provide the neural networks with
liquidity-related information. Section 5.1 has already partially answered whether adding liquidity
parameters boosts predictive power; the result indicated that adding liquidity does not increase the
predictive power of the neural networks. Figure 5 plots the evolution of the loss over the training
epochs. It becomes clear that adding liquidity parameters only slows down training and produces
errors that are almost identical to those obtained when liquidity parameters are absent from the
model. This suggests that the liquidity parameters offer no additional information. Chen et al. (2007)
find that liquidity is already priced into credit spreads, which could explain why adding liquidity
parameters does not boost predictive power. To test this further, this paper employs a Random Forest
Regressor to rank the importance of the features and relate this ranking to predictive power. The
resulting feature-importance ranking is shown in Appendix B. Once again, the liquidity parameters
show almost no importance. Therefore, I cannot reject the null hypothesis and conclude that adding
liquidity parameters does not improve the predictive power.
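
As an illustration of this ranking exercise, the sketch below fits scikit-learn's RandomForestRegressor on a small synthetic feature matrix whose column names mirror the inputs listed in Appendix A and reads off the impurity-based feature importances. The data and the next-price target are fabricated for demonstration only; the ranking in Appendix B is obtained from the actual bond data.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 5_000

# Synthetic stand-in for the transaction-level feature matrix (columns mirror Appendix A).
X = pd.DataFrame({
    "bond_price":     100 + rng.normal(0, 2, n),
    "coupon":         rng.uniform(1.5, 5.0, n),
    "ytm":            rng.uniform(0.5, 4.0, n),
    "oas":            rng.uniform(50, 250, n),
    "ask_bid_spread": rng.uniform(0.0002, 0.0020, n),
    "ttm":            rng.uniform(0.0, 1.0, n),
})
# Hypothetical target: the next observed price (last row forward-filled to keep lengths equal).
y = X["bond_price"].shift(-1).ffill()

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X, y)

# Impurity-based importances, sorted as in the Appendix B ranking.
importance = pd.Series(rf.feature_importances_, index=X.columns).sort_values(ascending=False)
print(importance)
```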

Figure 5: The plot displays the mean-absolute error as a function of the epoch. The left-hand side
illustrates the loss function when liquidity parameters are included, while the right-hand side plot
shows the loss function with yield-only inputs. While the yield-only loss function converges quickly
after about 10 epochs, it takes more epochs to lower the loss when liquidity is included. The model
keeps learning from the liquidity-related information at each epoch, whereas the yield-only model
saturates around the 10th epoch and does not learn in the subsequent epochs.
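
For concreteness, the sketch below shows one way such loss-per-epoch curves can be produced: a small Keras network trained on mean-absolute error at a 0.1% learning rate, once with yield-only inputs and once with additional liquidity proxies. The architecture, the synthetic data and all names are illustrative assumptions rather than the exact network used in this paper.

```python
import numpy as np
import tensorflow as tf

def build_model(n_features):
    # Small feed-forward network trained on mean-absolute error (illustrative, not the thesis architecture).
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss="mae")
    return model

rng = np.random.default_rng(0)
n = 2_000
X_yield = rng.normal(size=(n, 4))                          # hypothetical yield-related inputs
X_liquid = np.hstack([X_yield, rng.normal(size=(n, 3))])   # plus hypothetical liquidity proxies
y = rng.normal(size=(n, 1))                                # hypothetical next-price target

history_yield = build_model(X_yield.shape[1]).fit(X_yield, y, epochs=30, verbose=0)
history_liquid = build_model(X_liquid.shape[1]).fit(X_liquid, y, epochs=30, verbose=0)

# The two loss curves can then be plotted against the epoch index, as in Figure 5.
print(history_yield.history["loss"][:5])
print(history_liquid.history["loss"][:5])
```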

TABLE 3: The results of the Classification test for intra-day and end-of-day close prices
The table illustrates the accuracy of the prediction of the price direction, tested on both an end-of-day and an intra-day basis. The results
indicate that the algorithm learns meaningful information only for the end-of-day close prices, as it is then able to classify the direction of
movements successfully. The results are tested against a randomly distributed array of 1s and 0s. Significance is denoted as follows: * p <
0.10, ** p < 0.05, *** p < 0.01.

                 Accuracy of direction in %              MSE
Years       End of day     Intra-day            End of day    Intra-day

2012        56.96 **       50.09                0.83          0.33
2013        53.99 *        49.78                0.78          0.41
2014        57.64 **       51.11                0.84          0.45
2015        64.77 ***      50.88                0.77          0.39
2016        61.22 ***      50.09                0.78          0.39
2017        61.30 ***      50.92                0.81          0.42
2018        60.09 ***      49.78                0.73          0.44
2019        59.60 ***      50.13                0.83          0.36
2020        55.76 **       50.72                0.89          0.39

6 Conclusion

This paper examines US investment-grade corporate bond return predictability using machine
learning techniques. It extends Bianchi et al. (2020) by applying machine learning techniques,
artificial neural networks in our case, to corporate bond return prediction. While previous papers
focused either on Treasury bond return prediction with machine learning or on corporate bond
return prediction with Fama-French factors, this paper differentiates itself by employing extensive
intra-day bond-specific liquidity parameters. Machine learning techniques have historically not been
a favourite choice in the financial literature due to their low interpretability, often acting as a black
box. To alleviate this, this paper back-tests the findings with varying inputs and data frequencies
and re-confirms them with another machine learning technique recognized for higher interpretability
but lower predictive power.
This work contributes to the financial literature in the following three ways. First, by
employing artificial neural networks, one of the most widely used branches of machine learning,
we find a significant increase in predictive power compared to traditional methods such as
linear regression. We achieve a positive out-of-sample R-squared, and all our results are significant
at the 1% level. The average mean-squared errors are below 1% for all years and are significantly
lower than their linear regression counterparts. This finding is back-tested with different inputs, and
the result still holds: neural networks considerably outperform linear regression in prediction.
Secondly, to check the validity of these impressive predictive results and increase the
interpretability of our models, a Classification model is implemented to test whether the neural
networks truly learn meaningful information from the data. To achieve this, a setup with binary
outputs is constructed to assess whether the algorithms can predict if the next move will be
up or down. This check is essential for assessing the model's validity, since the neural
networks may have learned only to assign minor errors to the subsequent tick data. We find that
the intra-day bond price movement is random and the neural networks do not learn meaningful insights
from the data, since they fail to correctly categorize the direction of the next movement. The average
accuracy is 50% and not significant. This, however, changes once the frequency of the data is
shifted from intra-day to end-of-day close prices. The findings indicate that the neural networks indeed
learn meaningful information and successfully categorize the direction of the next day’s close price,
even though the mean squared errors increase slightly. The average accuracy is 60% and
statistically significant when tested against randomly assigned arrays. Given the low volatility
of investment-grade bonds and the low transaction costs implied by tight ask-bid spreads (Appendix
D), this translates into an economic gain.

Lastly, we test whether the liquidity parameters boost the predictive power of the neural networks.
The findings indicate that no increase in predictive power is achieved from including bond-specific
liquidity parameters. A potential reason is that standard parameters such as price, credit spread,
coupon and time to maturity already capture the necessary information, so the liquidity factors
convey no new marginal information. This finding is in line with Chen et al. (2007), who conclude
that the liquidity premium is already priced into traditional inputs. Bianchi et al. (2020) use no
liquidity parameters and still find a significant increase in accuracy when deploying machine
learning. We conclude that the predictive power originates from changing the model used for
prediction rather than from adding alternative inputs. Similarly, Green and Figlewski (1999) find an
economic gain from predicting implied volatility by switching from the historical mean to
comparatively advanced models. The conclusion is that using neural networks with daily close
prices achieves higher accuracy and translates into an economic gain.
Much of the improvement accomplished in this paper originates from applying machine learning
techniques that lend themselves well to capturing the non-linearities in the model. Therefore, future
research may focus on solving the distress puzzle, where there is an intense debate, using machine
learning techniques. I speculate that applying a machine learning model with back-tests to
improve interpretability may shed new light on the decades-old distress puzzle: the vague
relationship between credit spread and equity return.

Appendix A Corporate Bond Character Description

1. AGE: The time since the bond was issued.

2. ASK-BID SPREAD: The spread between the best offered ask and bid prices, and the most prevalent
measure of liquidity: the wider the spread, the more illiquid the instrument. It is calculated as
(a short computational sketch follows at the end of this appendix):

$$\text{Ask-Bid Spread}_t = \frac{P_t^{ask} - P_t^{bid}}{\tfrac{1}{2}\left(P_t^{ask} + P_t^{bid}\right)}$$

3. CONVEXITY: The sensitivity of the bond's duration to changes in interest rates. Convexity can be
thought of as similar to the gamma of an option, taking the second derivative of the price.

4. CREDIT RATING: One of the most critical features of a corporate bond is its rating. Bonds are
usually ranked into 10 categories, with AAA denoting the highest and C the lowest rated bond.
The core of credit rating models relies on the probability of default.

5. CREDIT SPREAD: The premium the bond issuer has to pay for its credit risk. The
riskier the bond issuer, the higher the premium is required by the investors.

6. CREDIT DEFAULT SWAP (CDS): An insurance contract giving the buyer the right to claim the
notional if the issuer of the underlying debt declares bankruptcy. It is taken as a proxy for the
credit spread of the debt.

7. COUPON: The rate at which the bond pays interest to its holders. All bonds used in this paper
pay coupons at a semiannual frequency.

8. DURATION: The sensitivity of the bond price to interest rate changes. Duration can be
thought of as similar to the delta of an option, taking the first derivative of the price.

9. OPTION ADJUSTED SPREAD (OAS): The spread between a bond with an embedded option and an
otherwise identical bond without one. It can be taken as a proxy for the credit spread of the bond.
If there is no embedded option on the bond, the option-adjusted spread is equal to the CDS.

10. YIELD TO MATURITY (YTM): The yield earned by the bondholder until maturity if the bond is
purchased at the current market price.

11. TIME TO MATURITY (TTM): The time remaining to maturity. This is expressed as a number between
0 and 1, denoting the fraction of the bond's lifetime remaining until maturity. When the
bond is issued, TTM equals 1 and slowly decays to 0. Following Lesmond et al. (1999),
non-traded days are taken as an alternative measure of illiquidity. To capture this,
TTM is also denoted in days, and the difference between each trading day is included in a
separate column. In this way, the algorithm is provided with the number of zero-return days
(a short computational sketch follows at the end of this appendix):

$$\text{Zero-return days}_t = \max\!\left(\frac{\text{difference between each transaction date}}{1\ \text{business day}} - 1,\; 0\right)$$
12. MATURITY: The date when the principal amount is paid to the debt holders

13. RISK-FREE RATE: The rate on a default-free instrument, proxied here by the Treasury bond.

14. CLASSIFICATION: Instead of training the algorithm on historical output prices, the model is
trained on binary output values of 1 (up) and 0 (down). The rest of the parameters remain the same.

15. BOND PRICE DIRECTION ACCURACY: The prediction of whether the next tick
price will be up or down. The accuracy is calculated as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP, TN, FP and FN denote true positives, true negatives, false positives and false
negatives, respectively.
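
As a complement to items 2 and 11 above, the short pandas sketch below computes the relative ask-bid spread and the count of non-traded business days between consecutive transactions for a toy set of trades. The field names and numbers are purely illustrative and do not come from the transaction data used in the paper.

```python
import numpy as np
import pandas as pd

# Toy transaction records for a single bond; field names and values are illustrative only.
trades = pd.DataFrame({
    "trade_date": pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-08", "2020-01-09"]),
    "ask":        [101.30, 101.45, 101.10, 101.20],
    "bid":        [101.10, 101.28, 100.95, 101.05],
})

# Item 2: relative ask-bid spread, scaled by the mid price.
trades["ask_bid_spread"] = (trades["ask"] - trades["bid"]) / (0.5 * (trades["ask"] + trades["bid"]))

# Item 11: non-traded business days between consecutive transactions (Lesmond et al., 1999 idea).
dates = trades["trade_date"].values.astype("datetime64[D]")
gaps = np.busday_count(dates[:-1], dates[1:])          # business days between successive trades
trades["non_traded_days"] = np.r_[0, np.maximum(gaps - 1, 0)]

print(trades)
```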

Appendix B Feature importance ranking

TABLE 4: Random Forest Regressor based feature importance ranking


The table reports the ranking of features based on the Random Forest Regressor (RFR).
Among machine learning models, the RFR is regarded as highly interpretable yet comparatively
weak in prediction, which makes it well suited for ranking features. The null hypothesis on the
importance of liquidity cannot be rejected, and adding liquidity factors does not improve the
model's predictive power. As the table below illustrates, the Ask-Bid Spread receives an importance
of only 0.0002. Features with an importance below 0.0001 are not listed.

Feature                    Importance

Error median OAS 30        0.7794
Bond price                 0.1687
Coupon                     0.0373
YTM                        0.0008
Ask-Bid Spread             0.0002
TTM                        0.0002
OAS                        0.0001
Bond Maturity              0.0001

Appendix C Evolution of accuracy on each epoch.

FIGURE 6: The figure illustrates the evolution of accuracy at each step of optimizing the
weights. The evolution of accuracy and that of the loss function bear clear similarities across
epochs. The accuracy plot below is based on the 2020 data.

Appendix D Year 2012 descriptive summary and corporate bond description

The correlation matrix belongs to Ford Motor Credit Company LLC, 6.625% coupon. Issued: 04/08/2010.
Maturity: 15/08/2017. Cusip code: 345397VP5

2012                            <5 year maturity   5 year maturity   7 year maturity   10 year maturity
Number of bonds                 0                  55                11                318
Average coupon                  0                  2.37%             4.39%             4.89%
Ask-bid spreads                 0                  0.07%             0.11%             0.125%
Average OAS                     0                  123.59            211.83            166.77
Avg. # of days traded in year   0                  96                81                126

                        ASK/BID SPREAD   OAS SPREAD   BOND return   YTM     Error median OAS 30
ASK/BID SPREAD          1
OAS SPREAD              -0.07            1
BOND return             0.07             -0.92        1
YTM                     -0.06            0.96         -0.91         1
Error median OAS 30     -0.04            0.22         -0.26         0.22    1

Appendix E Year 2013 descriptive summary, plots and corporate bond
description

The correlation matrix belongs to American Tower Corporation, 4.50% coupon. Issued: 07/12/2010. Maturity:
15/01/2018. Cusip code: 029912BD3

2013                            <5 year maturity   5 year maturity   7 year maturity   10 year maturity
Number of bonds                 122                116               20                400
Average coupon                  2.10%              2.14%             3.88%             4.65%
Ask-bid spreads                 0.06%              0.07%             0.08%             0.12%
Average OAS                     86.56              87.59             132.33            132.03
Avg. # of days traded in year   124                128               92                138

                        ASK/BID SPREAD   OAS SPREAD   BOND return   YTM     Error median OAS 30
ASK/BID SPREAD          1
OAS SPREAD              0.02             1
BOND return             -0.06            -0.74        1
YTM                     0.00             0.75         -0.99         1
Error median OAS 30     0.01             0.03         -0.06         0.06    1

Appendix F Year 2014 descriptive summary, plots and corporate bond
description

The correlation matrix belongs to Gilead Sciences Inc, 2.05% coupon. Issued: 07/01/2014. Maturity:
04/01/2019. Cusip code: 375558AV5

2014                            <5 year maturity   5 year maturity   7 year maturity   10 year maturity
Number of bonds                 39                 172               44                475
Average coupon                  1.70%              2.19%             3.94%             4.55%
Ask-bid spreads                 0.05%              0.07%             0.09%             0.12%
Average OAS                     74.74              67.15             123.00            107.09
Avg. # of days traded in year   91                 138               100               128

                        ASK/BID SPREAD   OAS SPREAD   BOND return   YTM     Error median OAS 30
ASK/BID SPREAD          1
OAS SPREAD              -0.42            1
BOND return             -0.75            0.52         1
YTM                     -0.31            0.57         0.34          1
Error median OAS 30     -0.24            0.13         0.265         0.16    1

Appendix G Year 2015 descriptive summary, plot and corporate bond
description

The correlation matrix belongs to AT&T, 5.8% coupon. Issued: 03/02/2009. Maturity: 15/02/2019. Cusip code:
00206RAR3

2015                            <5 year maturity   5 year maturity   7 year maturity   10 year maturity
Number of bonds                 88                 237               77                546
Average coupon                  1.82%              2.30%             3.58%             4.43%
Ask-bid spreads                 0.04%              0.06%             0.09%             0.11%
Average OAS                     86.80              88.09             133.32            130.99
Avg. # of days traded in year   103                136               121               127

                        ASK/BID SPREAD   OAS SPREAD   BOND return   YTM     Error median OAS 30
ASK/BID SPREAD          1
OAS SPREAD              0.57             1
BOND return             -0.43            -0.70        1
YTM                     0.64             0.86         -0.88         1
Error median OAS 30     0.34             0.47         -0.28         0.42    1

Appendix H Year 2016 descriptive summary, plots and corporate bond
description

The correlation matrix belongs to American Express Credit Corporation, 2.375% coupon. Issued: 26/05/2015.
Maturity: 26/05/2020. Cusip code: 0258M0DT3

2016                            <5 year maturity   5 year maturity   7 year maturity   10 year maturity
Number of bonds                 156                307               102               639
Average coupon                  1.93%              2.35%             3.45%             4.27%
Ask-bid spreads                 0.04%              0.06%             0.09%             0.10%
Average OAS                     84.22              92.03             135.26            137.61
Avg. # of days traded in year   108                135               137               126

                        ASK/BID SPREAD   OAS SPREAD   BOND return   YTM     Error median OAS 30
ASK/BID SPREAD          1
OAS SPREAD              0.96             1
BOND return             -0.44            -0.57        1
YTM                     0.96             -0.97        -0.61         1
Error median OAS 30     -0.22            -0.24        0.16          -0.23   1

Appendix I Year 2017 descriptive summary, plots and corporate bond
description

The correlation matrix belongs to Citigroup Inc, 4.45% coupon. Issued: 29/09/2015. Maturity: 29/09/2027.
Cusip code: 3172967KAB

2017                            <5 year maturity   5 year maturity   7 year maturity   10 year maturity
Number of bonds                 168                293               112               550
Average coupon                  2.09%              2.39%             3.24%             3.96%
Ask-bid spreads                 0.04%              0.05%             0.09%             0.10%
Average OAS                     56.51              63.05             87.39             95.61
Avg. # of days traded in year   106                137               144               142

                        ASK/BID SPREAD   OAS SPREAD   BOND return   YTM     Error median OAS 30
ASK/BID SPREAD          1
OAS SPREAD              -0.87            1
BOND return             0.89             -0.90        1
YTM                     -0.54            0.69         -0.86         1
Error median OAS 30     0.07             -0.08        0.05          -0.02   1

Appendix J Year 2018 descriptive summary, plots and corporate bond
description

The correlation matrix belongs to Dell International LLC, 4.42% coupon. Issued: 01/06/2016. Maturity:
15/06/2021. Cusip code: 325272KAA1

2018                            <5 year maturity   5 year maturity   7 year maturity   10 year maturity
Number of bonds                 183                289               132               640
Average coupon                  2.44%              2.66%             3.27%             3.92%
Ask-bid spreads                 0.04%              0.05%             0.08%             0.10%
Average OAS                     56.27              63.20             87.74             97.73
Avg. # of days traded in year   120                168               175               162

                        ASK/BID SPREAD   OAS SPREAD   BOND return   YTM     Error median OAS 30
ASK/BID SPREAD          1
OAS SPREAD              0.43             1
BOND return             0.79             -0.10        1
YTM                     -0.25            0.65         -0.73         1
Error median OAS 30     -0.23            0.00         -0.24         0.15    1

Appendix K Year 2019 descriptive summary, plots and corporate bond
description

The correlation matrix belongs to Enterprise Products Operating LLC, 3.35% coupon. Issued: 18/03/2013.
Maturity: 18/03/2023. Cusip code: 29279VAZ6

2019                            <5 year maturity   5 year maturity   7 year maturity   10 year maturity
Number of bonds                 328                505               292               1049
Average coupon                  2.93%              2.96%             3.34%             3.94%
Ask-bid spreads                 0.04%              0.05%             0.08%             0.10%
Average OAS                     75.17              92.24             118.55            144.17
Avg. # of days traded in year   55                 88                89                105

                        ASK/BID SPREAD   OAS SPREAD   BOND return   YTM     Error median OAS 30
ASK/BID SPREAD          1
OAS SPREAD              0.03             1
BOND return             0.12             0.28         1
YTM                     0.13             0.44         -0.15         1
Error median OAS 30     0.01             0.00         0.06          0.01    1

Appendix L Year 2020 descriptive summary, plots and corporate bond
description

The correlation matrix belongs to Apple Inc, 3.45% coupon. Issued: 06/05/2014. Maturity: 06/05/2024. Cusip
code: 037833AS9

2020                            <5 year maturity   5 year maturity   7 year maturity   10 year maturity
Number of bonds                 124                261               157               688
Average coupon                  3.10%              3.01%             3.43%             3.89%
Ask-bid spreads                 0.04%              0.05%             0.07%             0.10%
Average OAS                     133                148.82            173.52            230.91
Avg. # of days traded in year   36                 44                44                45

                        ASK/BID SPREAD   OAS SPREAD   BOND return   YTM     Error median OAS 30
ASK/BID SPREAD          1
OAS SPREAD              0.01             1
BOND return             0.20             -0.01        1
YTM                     0.13             0.24         0.08          1
Error median OAS 30     -0.10            -0.02        0.09          0.00    1

References

Atkins, A., Niranjan, M., & Gerding, E., 2018. Financial News Predicts Stock Market Volatility Better Than
Close Price. The Journal of Finance and Data Science, 4. doi:10.1016/j.jfds.2018.02.002.

Aunon-Nerin, D., Cossin, D., Hricko, T., & Huang, Z., 2002. Exploring for the Determinants of Credit Risk in
Credit Default Swap Transaction Data: Is Fixed-Income Markets' Information Sufficient to Evaluate Credit
Risk? SSRN Electronic Journal. doi:10.2139/ssrn.375563.

Bai, J., Bali, T. G., & Wen, Q., 2018. Common Risk Factors in the Cross-Section of Corporate Bond
Returns. Journal of Financial Economics.

Bao, J., & Hou, K., 2017. De Facto Seniority, Credit Risk, and Corporate Bond Prices. The Review of
Financial Studies, 30(11), 4038-4080.

Bekaert, G., Harvey, C., & Lundblad, C., 2006. Liquidity and Expected Returns: Lessons from Emerging
Markets. Review of Financial Studies, 20. doi:10.2139/ssrn.424480.

Bektic, D., Wenzler, J.-S., Wegener, M., Schiereck, D., & Spielmann, T., 2016. Extending Fama-French
Factors to Corporate Bond Markets. Journal of Portfolio Management. 45 (3), 141-158

Bianchi, D., Büchner, M., Hoogteijling, T., & Tamoni, A., 2020. Bond Risk Premiums with Machine
Learning. The Review of Financial Studies.

Blanchard, O., Melino, A., & Johnson, D. R., 2003. Macroeconomics. Toronto: Prentice Hall.

Brandt, M., Santa-Clara, P., & Valkanov, R., 2009. Parametric Portfolio Policies: Exploiting Characteristics
in the Cross-Section of Equity Returns. The Review of Financial Studies, 22(9), 3411-3447.

Campbell, J., & Ammer, J., 1993. What Moves the Stock and Bond Markets? A Variance Decomposition
for Long-Term Asset Returns. The Journal of Finance, 48(1), 3-37.

Campbell, J. Y., & Taksler, G. B., 2003. Equity Volatility and Corporate Bond Yields. The Journal of
Finance, 58(6), 2321–2350.

Campbell, J., & Thompson, S., 2008. Predicting Excess Stock Returns out of Sample: Can Anything Beat
the Historical Average? The Review of Financial Studies, 21(4), 1509-1531.

Chen, L., Lesmond, D. A., & Wei, J., 2007. Corporate Yield Spreads and Bond Liquidity. The Journal of
Finance, 62(1), 119–149.

Chen, R.-R., Fabozzi, F. J., & Sverdlove, R., 2010. Corporate Credit Default Swap Liquidity and Its
Implicationsfor Corporate Bond Spreads. The Journal of Fixed Income, 20(2), 31–57.

Chordia, T., Goyal, A., Nozawa, Y., Subrahmanyam, A., & Tong, Q., 2017. Are capital market anomalies
common to equity and corporate bond markets? Journal of Financial and Quantitative Analysis, 52(4),
1301-1342.

Collin-Dufresne, P., Goldstein, R., & Martin, S., 2001. The Determinants of Credit Spread Changes. The
Journal of Finance, 56, 2177-2207.

Du, S., & Zhu, H., 2017. Are CDS Auctions Biased and Inefficient? The Journal of Finance, 72(6), 2589–
2628.

Duffee, G., 1998. The Relation between Treasury Yields and Corporate Bond Yield Spreads. The Journal
of Finance, 53(6), 2225-2241.

Duffie, D. & Singleton, K., 1997. An Econometric Model of the Term Structure of Interest-Rate Swap
Yields, Journal of Finance, 52, issue 4, p. 1287-1321.

Elton, E. J., Gruber, M. J., Agrawal, D., & Mann, C., 2001. Explaining the Rate Spread on Corporate Bonds.
The Journal of Finance, 56(1), 247–277.

Eom, Y. H., Helwege, J., & Huang, J.-Z., 2004. Structural Models of Corporate Bond Pricing: An Empirical
Analysis. Review of Financial Studies, 17(2), 499–544.

Fama, E., & Bliss, R., 1987. The Information in Long-Maturity Forward Rates. The American Economic
Review, 77(4), 680-692.

Fama, E. F., & French, K. R., 1993. Common risk factors in the returns on stocks and bonds. Journal of
Financial Economics, 33(1), 3–56.

Fama, E. F., & French, K. R., 2015. A five-factor asset pricing model. Journal of Financial Economics,
116(1), 1–22.

Friewald, N., Wagner, C., & Zechner, J., 2014. The Cross-Section of Credit Risk Premia and Equity
Returns. The Journal of Finance, 69(6), 2419–2469.

Géron, A., 2017. Hands-on machine learning with Scikit-Learn and TensorFlow : concepts, tools, and
techniques to build intelligent systems. Sebastopol, CA: O'Reilly Media.

Goldberg, J., & Nozawa, Y., 2020. Liquidity Supply in the Corporate Bond Market. The Journal of Finance.

Green, T. C., & Figlewski, S., 1999. Market Risk and Model Risk for a Financial Institution Writing
Options. The Journal of Finance, 54(4), 1465–1499.

Heaton, J. B., Polson, N. G., & Witte, J. H., 2016. Deep learning for finance: deep portfolios. Applied
Stochastic Models in Business and Industry, 33(1), 3–12.

Henrique, B., Sobreiro, V., & Kimura, H., 2019. Literature Review: Machine Learning Techniques Applied
to Financial Market Prediction. Expert Systems with Applications, 124. doi:10.1016/j.eswa.2019.01.012.

Hull, J. C., 2018. Risk Management and Financial Institutions. Wiley Finance.

Jostova, G., Nikolova, S., Philipov, A., & Stahel, C., 2013. Momentum in Corporate Bond Returns. The
Review of Financial Studies, 26(7), 1649-1693.

Kaufmann, H., Messow, P., & Vogt, J., 2021. Boosting the equity momentum factor in Credit. SSRN
Working Paper.

Lin, H., Wang, J., & Wu, C., 2014. Predictions of corporate bond excess returns. Journal of Financial
Markets, 21, 123–152.

Longstaff, F. A., Mithal, S., & Neis, E., 2005. Corporate Yield Spreads: Default Risk or Liquidity? New
Evidence from the Credit Default Swap Market. The Journal of Finance, 60(5), 2213–2253.

Merton, R. C., 1974. On The Pricing of Corporate Debt: The Risk Structure of Interest Rates*. The Journal
of Finance, 29(2), 449–470.

Modigliani, F., & Miller, M., 1958. The Cost of Capital, Corporation Finance, and the Theory of
Investment. American Economic Review, 48, 261-297.

Nozawa, Y., 2017. What Drives the Cross-Section of Credit Spreads?: A Variance Decomposition
Approach. The Journal of Finance, 72(5), 2045–2072.

Penman, S., Richardson, S., & Tuna, I., 2007. The Book-to-Price Effect in Stock Returns: Accounting for
Leverage. Journal of Accounting Research, 45, 427-467.

Rasekhschaffe, K., & Jones, R., 2019. Machine Learning for Stock Selection. Financial Analysts
Journal, 75(1). doi:10.1080/0015198X.2019.1596678.

Shiller, R., 1981. Do Stock Prices Move Too Much to be Justified by Subsequent Changes in
Dividends? The American Economic Review, 71(3), 421-436.

Lesmond, D. A., Ogden, J. P., & Trzcinka, C. A., 1999. A New Estimate of Transaction Costs.
Review of Financial Studies, 12(5), 1113-1141.

Vassalou, M., & Xing, Y., 2004. Default Risk in Equity Returns. The Journal of Finance, 59(2), 831–868.

Wolff, D. & Echterling, F., 2020. Stock Picking with Machine Learning. Available at
SSRN: https://ssrn.com/abstract=3607845 or http://dx.doi.org/10.2139/ssrn.3607845

Wong, Z. Y., Chin, W. C., & Tan, S. H., 2016. Daily value-at-risk modeling and forecast evaluation: The
realized volatility approach. The Journal of Finance and Data Science, 2(3), 171–187.

Zhu, Z., & Jiang, W., 2016. Mutual Fund Holdings of Credit Default Swaps: Liquidity Management and
Risk Taking. Journal of Finance.

Zhu, F., 2014. Corporate Governance and the Cost of Capital: An International Study. International Review
of Finance, 14(3), 393–429.
