
Russian Journal of
MONEY & FINANCE
v. 79 №1, MARCH 2020

Russian Journal of
MONEY AND FINANCE
Founded and published by the Bank of Russia
Editor
Ksenia Yudaeva, Bank of Russia

Deputy Editors
Konstantin Styrin, New Economic School, Bank of Russia
Nadezhda Ivanova, Bank of Russia

Editorial Board
Alex Boulatov, International College of Economics and Finance
Maksim Nikitin, International College of Economics and Finance
Charles Wyplosz, The Graduate Institute Geneva (IHEID)
Anna Obizhaeva, New Economic School
Philipp Kartaev, Lomonosov Moscow State University
Andrei Simonov, Michigan State University
Iikka Korhonen, Bank of Finland Institute for Economies in Transition (BOFIT)
Sergey Slobodyan, Center for Economic Research and Graduate Education – Economics Institute (CERGE-EI)
Dmitry Makarov, International College of Economics and Finance
Valery Charnavoki, New Economic School
Alexander Muravyev, Higher School of Economics – St. Petersburg
Galina Hale, Federal Reserve Bank of San Francisco

Managing Editor
Olga Kuvshinova, Bank of Russia

Deputy Managing Editor
Nadezhda Petrova, Bank of Russia

Assistant Editor
Nina Bodrova, Bank of Russia

Editorial contacts:
Address: 12 Neglinnaya Street, Moscow, 107016 Russia
E-mail: journal@mail.cbr.ru

© The Central Bank of the Russian Federation, 2019

The articles with additional materials provided by authors are freely available at:
rjmf.econs.online/en

The opinions and views set out in the articles are those of the authors
and do not necessarily reflect the official opinion of the Bank of Russia.

ISSN 2618-6799
Russian Journal of
MONEY AND FINANCE
v. 79 №1, March 2020

CONTENTS
Financial Stability
FISS – A Factor-based Index of Systemic Stress
in the Financial System
Tibor Szendrei, Katalin Varga............................................................................... 3

Machine Learning Forecasting Methods


Use of Machine Learning Methods
to Forecast Investment in Russia
Mikhail Gareev.................................................................................................... 35

Forecasting Inflation in Russia Using Neural Networks


Evgeny Pavlov...................................................................................................... 57

Financial Markets
Announcements of Sanctions and the Russian Equity Market:
An Event Study Approach
Pavel Dovbnya..................................................................................................... 74

Households’ Income and Labor Market


Expected and Unexpected Consequences
of Russian Pension Increase in 2010
Ivan Suvorov........................................................................................................ 93

All the articles are freely available at: rjmf.econs.online/en


The Russian Journal of Money and Finance
is a scholarly peer-reviewed journal published by the Bank of Russia.
Since 2018, it has been issued quarterly in both Russian and English.

The Journal was established in 1938 (it was not published from mid-1941 to the end of 1945).

The Journal is included in the list of leading scientific journals of the Higher Attestation Commission and in the Web of Science – RSCI.

Submission guidelines:
rjmf.econs.online/en/information;
rjmf.econs.online/en
FISS – A Factor-based Index of Systemic Stress in the Financial System¹
Tibor Szendrei, Central Bank of Hungary; Heriot-Watt University
ts136@hw.ac.uk
Katalin Varga, Central Bank of Hungary
vargaka@mnb.hu

Tracking and monitoring stress within the financial system is a key component of macroprudential policy. This paper introduces a
new measure of contemporaneous stress: the Factor-based Index of
Systemic Stress (FISS). The aim of the index is to capture the common
components of data describing the financial system. This new index is
calculated with a dynamic Bayesian factor model methodology, which
compresses the available high frequency and high dimensional dataset
into stochastic trends. Aggregating the extracted four factors into a
single index is possible in a multitude of ways, but averaging yields
satisfactory results. The contribution of this paper is the usage of the
dynamic Bayesian framework to measure financial stress, as well as
producing the measure in a timely manner without the need for deep
option markets. Applied to Hungarian data, the FISS is a key element of the macroprudential toolkit.

Keywords: systemic stress, financial stress index, Dynamic Bayesian Factor Model, financial system, macroprudential toolkit

JEL Codes: G01, G10, G20, E44

Citation: Szendrei, T. and Varga, K. (2020). FISS – A Factor-based Index of Systemic Stress in the Financial System. Russian Journal of Money and Finance, 79(1), pp. 3–34. doi: 10.31477/rjmf.202001.03

1. Introduction

Tracking and monitoring financial stress is a key component of financial stability. There are many policy instruments (e.g. release of countercyclical capital
buffers) that rely on knowledge of the level of financial stress in the financial
system. Using these policy instruments when the stress is not adequately high
carries the cost of an inability to implement the same policy in the near future if

¹ The opinions expressed in this paper are those of the authors and do not necessarily represent the official views of the Central Bank of Hungary. The paper has benefited from the helpful insights of Péter Elek, Zsuzsanna Hosszú, Gergely Lakos, Miklós Sallay and Jan Klacso, as well as the guidance of László Márkus. Any omissions are the responsibility of the authors.

the need arises. The aim of the Factor-based Index of Systemic Stress (hereinafter
FISS) is to capture the current level of systemic stress prevalent in the Hungarian
financial system.
The literature in recent years has introduced several econometric methods
for developing financial stress indices (hereinafter FSI) and financial conditions
indices (hereinafter FCI). Both types of indices compress a wide range of financial
variables into a single series, but FCIs and FSIs are inversely related during financial
instability: increasing financial stress means worsening financial conditions. An
FSI narrowly focuses on measuring financial stress (e.g. Hakkio and Keeton, 2009),
while an FCI focuses on broad measures of financial conditions (e.g. Koop and
Korobilis, 2014). Due to this wider range, FCIs frequently include macroeconomic
variables as well as financial variables. The FISS falls in the category of an FSI, as
its aim is to measure the level of financial stress.
There are multiple ways to estimate an FSI, ranging from simple weighted averaging to more complex econometric methods. This diversity in methodologies is driven by the need to include more financial variables, a need which has been steadily increasing in the past decade. Coupled with advances in computing power, more
complex econometric models have become commonplace in the literature. In this
paper the FISS uses a dynamic Bayesian factor model framework to compress
information from 19 financial variables into four factors. The choice of four
factors was supported on both intuitive grounds (data from four financial markets
was used), and empirical grounds (four factors yielded an explanatory power of
around 85%). Furthermore, the factors were allowed to follow non-stationary paths.
As such, the factors are the common stochastic trends of the data. This was done
because differencing the daily data would lead to a potential loss of information.
To aggregate the four factors, the usual weighting scheme used in principal component analysis could not be applied, as the factors are non-orthogonal
after the imposition of a dynamic structure on them. On account of this, several
methodologies were tested: simple average, quantile regression (hereinafter QR),
information value (hereinafter IV), and a simulation procedure. The final results
did not yield significant differences from the simple average and as such it was
used for aggregation.
This paper contributes to the literature in two aspects. First, it creates a non-
stationary factor model to capture financial instabilities. Such an approach has not, to the knowledge of the authors, been taken in the FSI literature. Further, the paper uses Hungarian data, signifying that the methodology of the FISS can be used for emerging economies. This latter contribution is particularly important
because it highlights that accurate FSIs can be constructed in the absence of data
from deep option markets, which emerging markets lack.
Section 2 of the paper discusses the concept of financial stress and systemic stress,
and gives a brief literature review of other FSIs available. The aim of this Section is to
give an economic intuition of what an FSI aims to capture. Section 3 gives an overview
of the variables selected in the model, categorised by the financial market the variable
aims to capture. This Section also discusses how the data were transformed, and
the motivation for each transformation. Section 4 discusses the dynamic Bayesian
factor model framework and briefly discusses the aggregation methods considered.
Section 5 shows the results of the FISS. Finally, Section 6 concludes.

2. Financial stress

Financial stress can be interpreted as the amount of risk which has materialised. This definition implies that financial stress can be measured by a
continuous variable with its extreme values representing crisis events. The most
common indication of these events is an acute shift in investors’ intention to
hold less risky assets, called flight to quality, and liquid assets, called flight to
liquidity (Caballero and Krishnamurthy, 2008; Hakkio and Keeton, 2009). This
shift in intention is primarily driven by increased uncertainty in the market,
increased information asymmetry between market participants and changing risk
preferences. It is important that an FSI not only includes variables that capture
flight to quality and flight to liquidity, but also indicators that measure increased
uncertainty and information asymmetry, to better measure the continuous range
of financial stress.
Increased uncertainty in the financial market can stem from two sources:
asset valuation and the behaviour of other market participants. Asset valuation
involves a lot of ambiguity and some unexpected development – often referred
to as a Minsky moment² – can trigger market participants to re-evaluate their
previous estimates. This in turn increases the disagreement about the underlying
assets’ value, because the actors’ asset valuation differs. This is embodied by
increasing volatility in the market.
Perhaps the more pertinent source of uncertainty after a Minsky moment is
the behaviour of other market participants. The price of an asset is not influenced
only by the underlying dividend stream, but also by the selling pressure of other
actors. Increased uncertainty about the behaviour of others might induce a trader
to liquidate her position early. Fire sales can develop when such a decision is
taken by multiple actors in the face of increased uncertainty.
Information asymmetry in the financial markets occurs when one party has
more information about a product than the other. There are a variety of cases when
information asymmetry can occur in financial markets (e.g. borrowers have more
information than lenders, sellers on secondary markets have more information
about the asset than buyers, participants in a swap deal have more information
about themselves than about the respective counterparty, etc.). Information gaps
can worsen during financial stress due to market participants becoming doubtful

² ‘Minsky moment’ is a name derived from Minsky’s financial instability hypothesis (see Kindleberger and Aliber, 2011).

about the accuracy of their information about the other parties. This is further
exacerbated by the fact that during periods of high financial stress the value of
potential collaterals, which aims to tackle information asymmetry on financial
markets, is expected to decline (Gorton, 2009). This manifests itself in a higher cost
of borrowing and lower lending activity. The decline of lending activity is further
worsened by firms not knowing for certain their banks’ solvency situation and, on
account of this, not knowing if they can count on their credit lines. On secondary
markets the increased information gap lowers the average price of the asset.
Risk preferences are non-constant in time. It is well documented that market
participants tend to underestimate risks during bull markets and overestimate
risks during episodes of high financial stress (Kindleberger and Aliber, 2011;
Freixas et al., 2015). This changing risk preference further reduces market activity
for riskier assets during stressful periods.
Shifting exposures to more liquid and less risky assets is rational at an
individual level, as it can limit the risk of potential losses that could arise due
to the aforementioned effects. On account of the flight to quality, the spread
between risky and less risky assets widens as market participants flock to safe
assets (Caballero and Kurlat, 2008).
Flight to liquidity is a result of market participants’ fear of not being able to
sell illiquid assets close to their fundamental value when the market participants
need cash quickly.³ The effect of flight to liquidity is an increase in the liquidity
risk premia demanded for illiquid assets and a widening of the spread between
liquid and less liquid assets’ returns.
Systemic stress is a subcategory of financial stress and can be defined as the
situation where financial stress becomes so widespread that it impairs several
financial institutions and markets in an economy (De Bandt and Hartmann, 2000).
This contagion-like spread of financial stress is not limited to the borders
of a country. Due to tight cross-border financial interlinkages, financial
stress in the market of one country can spread across a region. Accounting
for these elements is important when trying to measure the financial stress
of a small, open economy.
Systemic stress on the financial markets can be damaging to the real economy
as well, because it impedes financial intermediation. During periods of high
systemic stress, financial market participants have limited possibilities to hedge
on the market, which may force them to limit their activity to only the most liquid
and least risky markets. This raises the cost of doing business for all firms on the
market and the net effect is a decrease in investment. As such, quantifying and
monitoring systemic stress is of key importance.
The financial markets of each economy are different in their level of
development and depth. As a result of this, there exist several financial stress

3
  For more information about the aspects of liquidity that differentiate markets, see Páles and
Varga (2008).

indices in the literature. What follows is a non-exhaustive survey of some of the FSIs that have been created so far.
Illing and Liu (2006) make an influential contribution to the literature on
financial indices. The authors test several aggregation methods to create a daily
FSI that best represents the Canadian financial markets. The authors identify
episodes of high financial stress based on a survey they conducted among Bank
of Canada policy makers. Their best performing FSI uses 11 financial variables
and is aggregated using a weighted average methodology where the weights are
determined by the relative size of the financial market the indicator in question
is associated with.
Nelson and Perli (2007) create a weekly FSI for the US economy using 12
variables. The authors employ a two-step approach to create their desired measure
of financial stress. In the first step, they create three summary indicators: level
factor, rate of change factor and correlation factor. The level factor is constructed
by calculating the weighted average of the variance-normalised indicators. The rate
of change factor is the rolling 8-week percentage change in the level factor. The
correlation factor is the first principal component of the 11 variables on a rolling
26-week window. The three constructed factors are aggregated into one index with
the help of a probit regression on a pre-defined binary crisis indicator.
Hakkio and Keeton (2009) build their indicator using 11 daily financial
market variables. They do so in one step using principal component analysis on
the standardised variables. The idea is that financial stress is the factor responsible
for most of the variance observed in the series. Hatzius et al. (2010) use the idea
of factor analysis as well and apply it to 45 distinct data series in an effort to create
an FCI. The authors test how their FCI fares in forecasting GDP. The conclusion of
the paper is that, while multiple factors are necessary to capture the co-movement
of the financial variables, only the first factor is helpful in forecasting real activity.
These results are crucial, as they highlight how modelling financial systems might
only be possible with the help of multiple factors.
The next two indicators presented here are the Composite Index of Systemic
Stress (CISS) created by Holló et al. (2012) and the System Wide Financial Stress
Indicator (SWFSI) developed by Holló (2012). The CISS and SWFSI use the
same methodology, with the only difference being that the CISS was constructed
for the euro area and the SWFSI was made to capture the stress levels of the
Hungarian economy. This methodology requires the chosen variables to be
transformed using sample CDF, so that all the variables are on the same (0,1]
ordinal scale. These variables are then categorised into the financial markets
they represent. The sub-indices are the arithmetic average of these categorised
transformed variables. To create a single index out of the sub-indices, the CISS
and SWFSI use portfolio theory to give more weight to stress events where the
indices are correlated. This correlation matrix is time varying. Furthermore,
the sub-indices also receive an additional weight according to how much the
sub-index influences the real economy, measured by industrial production (hereinafter IP).⁴
As mentioned, the CISS is constructed for the EU market and the SWFSI was
developed with the Hungarian economy in mind. This difference influences the
selection of variables and the number of sub-indices created. The CISS aggregates
15 variables into five sub-indices, while the SWFSI uses 27 indicators to create
six sub-indices.
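To make this portfolio-theoretic aggregation concrete, the sketch below evaluates a CISS-style quadratic form. All inputs (sub-index values, weights, correlation matrix) are made up for illustration; the exact time-varying specification is given in Holló et al. (2012).

    import numpy as np

    def ciss_aggregate(s, C, w):
        """CISS-style aggregation: a portfolio-variance-like quadratic form in
        the weighted sub-indices, so that stress episodes in which the
        sub-indices are correlated receive more weight."""
        ws = w * s                     # weighted sub-indices
        return float(ws @ C @ ws)      # quadratic form through the correlation matrix

    # toy example with three sub-indices (all inputs hypothetical)
    s = np.array([0.8, 0.7, 0.6])                  # sub-index stress levels
    w = np.array([0.4, 0.3, 0.3])                  # weights, e.g. from impact on IP
    C = np.array([[1.0, 0.9, 0.8],
                  [0.9, 1.0, 0.7],
                  [0.8, 0.7, 1.0]])                # (time-varying) correlation matrix
    print(ciss_aggregate(s, C, w))

The quadratic form rises both when the individual sub-indices rise and when their co-movement strengthens, which is exactly the systemic feature the CISS and SWFSI target.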
Apart from the SWFSI, there have been other attempts at creating an FSI
for emerging markets. One such paper is Cevik et al. (2013), where the authors
construct an index for Bulgaria, the Czech Republic, Hungary, Poland, and
Russia. In creating the FSIs, the paper uses six core components to gauge financial
stress: banking sector fragility, security market risk, currency risk, external debt,
sovereign risk and trade finance. These six variables are aggregated using principal
component analysis and the first factors are used as the FSIs of each country.
One important conclusion of the literature review is that there exists no
consensus in how to create an FSI: the indices currently available differ in their
methodology as well as their variable selection. It is therefore important to have
a rigorous data selection procedure that is guided by economic considerations.
This also implies that, on account of different market structures across different
countries, the selection of indicators will vary. Furthermore, financial markets
are elusive and ever-changing, necessitating frequent and varying attempts at
measuring financial stress.

3. Data selection and transformation

The aim of the FISS is to capture financial stress in the financial system of
Hungary. Thus, following SWFSI and CISS, different segments of the financial
system were defined for data collection. Four segments were identified: the
government bond market, the foreign exchange market, the bank segment and
interbank market, and the capital market. To get a good mix of financial data that
is not influenced excessively by market deepening, the starting point was set at
1 January 2005.
The selection of potential variables was constrained by further data requirements.
First of all, the aim of the FISS is to quantify financial stress in a timely manner,
thus only daily data were considered. Second, movements in the indicator should
capture developments that are market wide. Finally, the variables in the model
should capture some feature of financial stress outlined in Section 2. Additionally, as
the Hungarian financial markets are open and closely connected to their European
counterparts, variables that capture increased uncertainty in the European markets
were considered as indicators of potential cross-border contagion effects.

⁴ This is calculated with the help of impulse response functions in a vector autoregression (VAR). These weights can be time varying as well, to account for structural changes in the economy if necessary.

Table 1. List of variables included in the FISS

Government bond market
  Risk premium on 5-year bond – Flight to quality
  Yield on 3-month bond – Increased uncertainty
  Yield on 10-year bond – Increased uncertainty

Foreign exchange market
  EUR/HUF volatility (α = 0.95) – Increased uncertainty
  USD/HUF volatility (α = 0.95) – Increased uncertainty
  CHF/HUF volatility (α = 0.95) – Increased uncertainty
  GBP/HUF volatility (α = 0.95) – Increased uncertainty
  Bid-ask spread (EUR) – Flight to liquidity

Capital market
  CMAX of BUX (60-day window) – Increased uncertainty, flight to quality
  CMAX of BUMIX (60-day window) – Increased uncertainty, flight to quality
  CMAX of CETOP20 (60-day window) – Increased uncertainty, contagion
  CMAX of DAX (60-day window) – Increased uncertainty, contagion
  VDAX – Increased uncertainty, contagion

Bank segment and interbank market
  Harmonic distance – Increased information asymmetry, liquidity drought
  Domestic Banks PD – Increased uncertainty
  Foreign Banks PD – Increased uncertainty, contagion
  3-month BUBOR – Increased uncertainty, increased information asymmetry, flight to liquidity
  Overnight rate of HUFONIA – Increased uncertainty, increased information asymmetry, flight to liquidity
  Turnover of HUFONIA – Increased information asymmetry, liquidity drought

Out of the considered variables, 19 were chosen that maximise the explained
variance. In doing so, all possible combinations of the variables were tested to see
which selection yields the best explanatory power while not including variables
that offer limited information. Although factor analysis is capable of handling the
inclusion of all the variables, Boivin and Ng (2006) found that including fewer,
but pre-screened, variables in the factor model yielded results as good as, if not
better than, using all the available series. The final list of variables is shown in
Table 1. The following Sections detail the final variable selection. For the full list
of variables considered see Appendix 1.

3.1. Government bond market

The government bond market is captured by three variables: risk premium on the 5-year government bond compared to the German 5-year government bond,
reference yield on the 3-month bond and the reference yield on the 10-year bond.
The risk premium measure is chosen as it can capture episodes of flight
to quality. As such, it yields information about traders’ expectations about the
Hungarian economy compared to Germany. For the calculation of this risk
premium, the German government bond was chosen as it is the closest substitute
to a euro area bond. The choice of Germany as a base is further supported by the
fact that the Hungarian economy is intertwined with the German economy. Thus,
using German bond yields as a base gives a better measure of the idiosyncratic
stress of the Hungarian government bond market.
The yields on the 3-month government bond and the 10-year government
bond together represent the yield curve. In effect, during elevated levels of
financial stress, the short-term outlook of an economy becomes uncertain,
raising the short-term yield. This can lead to an inversion of the yield curve, a
situation where short-term bond yields exceed long-term yields, which captures
the increased uncertainty in the market.
These two price-based measures capture forward-looking aspects of
financial stress. This is useful for an FSI, as it can measure how materialised risk
influences actors’ expectations about future risks. The FSI will still be a measure
of materialised risks, but, with the inclusion of forward-looking measures, the
full effect can be captured, as stress is derived not only from past events, but also
from market participants’ uncertainty about future outcomes.

3.2. Foreign exchange market

Five variables encapsulate the foreign exchange market in the model: EUR/HUF spot market volatility, USD/HUF spot market volatility, CHF/HUF spot market volatility, GBP/HUF spot market volatility, and the bid-ask spread of EUR offers on the spot market.
of EUR offers on the spot market.
The volatility measures together aim to capture uncertainty surrounding the
Hungarian forint. Multiple currency pairs were included so as to make sure that only
volatility related to the Hungarian markets is captured. Out of all the currencies, the
volatility of the Hungarian forint with respect to the euro is of key importance due
to the economic links between the Hungarian economy and the euro zone.
To calculate the volatility of the currency, an exponentially weighted standard deviation (hereinafter EWSD) of the daily log change with a decay factor of 0.95 was chosen. Standard deviation on daily log changes of
currencies is commonly used to measure exchange rate volatility (see Hooper
and Kohlhagen, 1978; Akhtar and Hilton, 1984; Gotur, 1985) due to its simplicity
in calculation and the fact that it requires no assumptions to be made to fit the
model. Because the aim of the FISS is to capture financial stress in a timely
manner, exponential weighting was imposed on the standard deviation so
that older observations have a lower impact on the current level of standard
deviation. The EWSD was calculated with the following formula:

$$\mathrm{EWSD} = \sqrt{\sum_{i=1}^{n} w_i \left(x_i - \bar{x}^*\right)^2} \tag{1}$$

where $w_i$ is the weight and $\bar{x}^*$ is the exponentially weighted moving average using the same weighting setup.
Although a generalised autoregressive conditional heteroskedasticity model
(GARCH) or an approach using variance from the long-term trend might be
better at identifying episodes of high exchange rate volatility, there is no guarantee
that the same model will hold as newer data is added to the sample. Due to this
robustness concern, the simpler EWSD approach was favoured.
In deciding on the decay factor, the following values were considered: 0.85, 0.9, and 0.95. The EWSD for each decay factor is shown in Figure 1. Upon inspecting Figure 1, it is visible that the EWSDs with decay factors of 0.9 and 0.85 fall back more sharply after reaching their maximum value, only to rebound instantly during the crisis. Such erratic movement is not desirable, and as such the decay factor of 0.95 was chosen. Values larger than 0.95 were not considered, because in that case weights (albeit small ones) are given to observations that occurred over 90 days ago. One major drawback of using this measure of volatility is that it is backward-looking and has no information about the expectations of investors.

Figure 1. EUR/HUF exponentially weighted standard deviation with different decay factors

[Figure: EWSD of EUR/HUF daily log changes, January 2005 – August 2019, for decay factors 0.85, 0.9, and 0.95.]
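As an illustration of equation (1), the following sketch computes the EWSD of a simulated exchange rate series; only the weighting scheme follows the text, and the price path is invented.

    import numpy as np

    def ewsd(x, alpha=0.95):
        """Exponentially weighted standard deviation, equation (1): observations
        receive weight proportional to alpha**age, so older values matter less."""
        x = np.asarray(x, dtype=float)
        ages = np.arange(len(x))[::-1]        # age 0 for the newest observation
        w = alpha ** ages
        w /= w.sum()                          # normalise the weights to sum to one
        xbar = np.sum(w * x)                  # exponentially weighted moving average
        return np.sqrt(np.sum(w * (x - xbar) ** 2))

    # toy example: EWSD of simulated daily log changes of an exchange rate
    rng = np.random.default_rng(0)
    rate = 300 * np.exp(np.cumsum(rng.normal(0, 0.005, size=500)))
    log_changes = np.diff(np.log(rate))
    print(ewsd(log_changes, alpha=0.95))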

Bid-ask spreads are frequently used to measure the liquidity risk of a market
(Holló, 2012; Páles and Varga, 2008). In periods of low financial stress, the bid-ask
spread is low, which shows a low transaction cost associated with doing business.
A high bid-ask spread is associated with increased liquidity risk during periods of
adverse market conditions. This is because, during high stress periods, market
participants are less inclined to enter an illiquid market, or will do so only at a
price below its fundamental value, leading to lower bid prices being registered.
At the same time, market participants partaking in a flight to liquidity want to
sell the asset as close to its fundamental value as possible to not incur needless
losses. It should be noted that a spike in bid-ask spread can occur during periods
of low financial stress as well, due to structural factors (e.g. due to new issuance,
market concentration, etc.). As such, it is important to have other measures of
the foreign exchange market stress level to maximise the information content of
the bid-ask spread.

3.3. Capital market

Several stock market indices were included to capture the movement of the Hungarian capital market. These are the BUX, the BUMIX, and the CETOP20.
The BUX is the primary capital market index of Hungary and includes the
14 major Hungarian companies trading on the Budapest Stock Exchange
(hereinafter BSE). In its aggregation method it is equivalent to the American
Dow Jones Industrial Average and the German DAX. The BUMIX is a stock
market index of 25 small- and medium-sized companies listed on the BSE.
The calculation is the same as for the BUX, with the exception that only
companies with market capitalisation less than 125 billion Forints are included.
The BUMIX and the BUX together can capture flight-to-quality aspects, as
it is expected that the BUMIX will react to financial stress faster than the BUX,
which includes larger firms in its composition.
The CETOP20 follows the performance of the 20 companies with the biggest
market capitalisation in the Central European region. There is a condition applied
during the construction of the index, which is that only seven companies from one
stock exchange may be included in the index. The index serves as a benchmark
for the region. The index was included in the variable selection to capture aspects of region-specific stress.
The CMAX methodology was imposed on these stock market indices, which
is common practice in the literature (Illing and Liu, 2006; Holló et al., 2012;
Holló,  2012). It uses the following formula to construct a series to capture
cumulative losses on the stock market:

$$\mathrm{CMAX}_t = \frac{x_t}{\max\left\{x_{t-j} \mid j = 0, 1, \ldots, T\right\}} \tag{2}$$

where $x_t$ is the stock market index at time t and T is the moving average window.
The idea behind the CMAX is that it compares the current value of the stock
market to its maximum value over the rolling window. This value is then subtracted
from 1 to get an index that increases when the cumulative losses increase.
The CMAX’s strength is that it is a hybrid loss-volatility measure, as high values
do not just mean cumulated losses, but also increased uncertainty in the market.
Following the SWFSI, the size of the rolling window was chosen to be smaller
(60 days) than is common in the literature, where it is ordinarily 1–2 years. This was
done on account of the FISS aiming to capture financial stress, as opposed to trying
to identify stock market crashes. The smaller window means that the values from
the CMAX represent only the most recent market developments.
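A minimal sketch of the transformation, assuming a simulated index series; it implements equation (2) followed by the subtraction from 1 described above, with the 60-day window.

    import numpy as np

    def cmax_loss(index_values, window=60):
        """1 - CMAX of equation (2): current index value relative to its maximum
        over the trailing window, subtracted from 1 so the series rises as
        cumulative losses (and uncertainty) increase."""
        x = np.asarray(index_values, dtype=float)
        out = np.empty(len(x))
        for t in range(len(x)):
            start = max(0, t - window + 1)
            out[t] = 1.0 - x[t] / x[start:t + 1].max()
        return out

    # toy example on a simulated stock index
    rng = np.random.default_rng(1)
    index = 10_000 * np.exp(np.cumsum(rng.normal(0, 0.01, size=750)))
    stress = cmax_loss(index, window=60)
    print(stress.max())      # deepest 60-day drawdown in the sample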
The VDAXNEW (hereinafter VDAX) was also included as a variable to measure the stress of the capital markets. The VDAX measures the level
of volatility to be expected on the DAX in the next 30 days. The calculation
methodology is equivalent to that of the VIX index, which measures the implied
volatility of the S&P 500. The strength of the VDAX is that it is a completely
forward-looking index and can capture market participants’ expectations about
the near future. The inclusion of the VDAX and CMAX of the DAX in the model
is motivated by the desire to capture contagion effects. As mentioned previously,
the two economies are intertwined, and difficulties in the German capital markets
can influence companies with Hungarian subsidiaries, which in turn can have a
knock-on effect on the Hungarian capital markets.

3.4. Bank segment and interbank market

The bank segment is the largest market in the model, composed of six
variables: harmonic distance measure of bank network, Domestic Banks’
Probability of Default (hereinafter PD), Foreign Banks’ PD, the average daily
rate on 3-month unsecured interbank loans as indicated on the BUBOR, the
average daily overnight rate on unsecured interbank loans as measured by
the Hungarian Forint Overnight Index Average (hereinafter HUFONIA), and
total turnover per day of the HUFONIA. This was done intentionally: to get an
accurate representation of the stress level of the Hungarian financial market, an
accurate measure of bank segment and interbank stress is necessary on account
of the financing of the Hungarian economy being bank based.
The aim of using a network measure was to capture tensions present in
the banking system as a whole. Such measures indicate when banks’ lending
behaviour reverses on the interbank market: banks stop lending to each other and
instead focus on paying back their loans. This is a clear indication of increased
information asymmetry. Four measures of this type were tested: harmonic
distance, betweenness, closeness, and the sum of the degrees. Fukker (2017)
shows that, for the Hungarian banking system, betweenness performs worse than
the other three measures using a factor model as a way to evaluate the different
measures. We follow Fukker (2017) in choosing harmonic distance out of the aforementioned variables, as this measure increases the explained variance while exercising an effect on more factors. The idea of including a network variable in an FSI is
not novel and has been suggested in Hałaj and Kok (2013).
It has to be noted that the network measures perform well partly due to their
close link to global turnover. On account of this, network indicators are also good
indicators of liquidity droughts, which occur when market liquidity abruptly
decreases during episodes of high financial stress.
HUFONIA turnover was also included alongside the harmonic distance
variable to give an explicit measure of decreases in market liquidity. The HUFONIA
is an unsecured market and elevated financial stress can be captured well on this
market. This variable is also a measure of information asymmetry but measures
the overall lending activity of the interbank market, instead of measuring the
shifts in the behaviour of banks which harmonic distance captures.
Bank PD measures the vulnerability of a bank and the uncertainty associated
with the bank in question. The banks were grouped according to whether their
ownership is domestic or foreign. This was done for two reasons: for banks which are foreign subsidiaries, CDS spreads were available for their parent banks; and foreign banks' lending activity may be influenced by the level of financial stress in the
markets of their parent companies. Due to the latter, it is deemed necessary to
separate foreign and domestic banks, as this way the model can capture cross-
border contagion effects on the interbank market. For the domestic banks,
the PD was calculated using the Merton (1974) model.⁵ The individual banks
were weighted according to their asset size to get a single PD for domestic and
foreign banks.
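For intuition only, the sketch below computes a textbook Merton-type risk-neutral PD; the balance-sheet inputs are hypothetical, and the exact specification used for the FISS is described in Appendix 3.

    import numpy as np
    from scipy.stats import norm

    def merton_pd(V, sigma_V, D, r, T=1.0):
        """Risk-neutral default probability in a Merton (1974)-style model: the
        probability that the asset value V ends below the debt face value D at
        horizon T, given asset volatility sigma_V and risk-free rate r."""
        d2 = (np.log(V / D) + (r - 0.5 * sigma_V ** 2) * T) / (sigma_V * np.sqrt(T))
        return norm.cdf(-d2)

    # hypothetical bank: assets 100, asset volatility 8%, debt 90, risk-free rate 2%
    print(merton_pd(V=100.0, sigma_V=0.08, D=90.0, r=0.02))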
The final two measures in this category are the daily rates on the unsecured
interbank market. These are similar to the measure of turnover in that, due to the
unsecured nature, financial stress appears in these indicators early. Including both
the overnight and the 3-month rates helps the model gauge market participants’
expectations about the near future. Furthermore, these price variables, together
with the PD indicators, carry information about the prevalence of information
asymmetry in the market.

4. Methodology

4.1. Dynamic factor model

The literature on factor models in macroeconomics is extensive: Bernanke and Boivin (2003) introduce factor methods among macroeconomists, while Geweke and Zhou (1996), Otrok and Whiteman (1998) and Kose et al. (2003),
among others, implement Bayesian estimation within a factor model framework.
Later, Bernanke et al. (2004) and Stock and Watson (2005) enhance the factor
approach with VAR methods, yielding the FAVAR.
Dynamic factor models (DFMs) were originally introduced by Geweke (1977),
who extended factor models to time series data. The core idea of DFMs is that
the temporal structure of the data induces persistence which must be taken into
account when compressing the numerous variables. The explanatory power of
DFMs is best portrayed in Sargent and Sims (1977), whose model explains a large
fraction of the variance of the most important macroeconomic variables of the
United States.

⁵ More information on the Merton model is provided in Appendix 3.

In the early 2000s, economists became more interested in large-dimensional DFMs. The attractiveness of these models is in their ability to handle the persistent
nature of the data and to analyse large panels of time series while handling the
curse of dimensionality. These types of models proved capable of forecasting and of performing policy analysis by studying the impulse response functions in, e.g.,
a FAVAR setting (see Stock and Watson, 2002, 2005; Giannone et al., 2006, 2008;
Forni et al., 2000, 2005, 2009).
Until recently, large-dimensional factor models were mainly used in
stationary settings. This is at odds with the time-series literature, where the
bulk of the time series describing systemic stress are I(1) (Barigozzi et al., 2020).
As such, there are fundamentally two possibilities. The first option is to follow
the Box-Jenkins methodology in differencing the data until stationarity is
achieved and then applying standard principal component analysis (PCA) to
consistently estimate the model (Stock and Watson, 2002; Bai and Ng, 2002;
Bai, 2003). This method has the advantage of simplicity and statistical
consistency, but has a serious drawback, as it purges any persistence in the
data which might be of importance when wanting to draw conclusions about
financial stress.
The second option is to fit a DFM in order to capture the autocorrelation
of the time series. The core motivation for taking this route is that differencing
the series purges the non-stationary common component. Statistically, this
component is the common stochastic trend of the data series, namely the stress
factor of interest (Bai, 2004; Barigozzi et al., 2020). Therefore, to capture the
systemic stress factors, a factor model has to be fit without transforming the data to stationarity. This solution was first introduced by Engle and Watson (1981)
and later extended by Bai (2004) with the so-called common stochastic trend
approach. In this case, the model can be consistently estimated with maximum
likelihood by applying the Expectation Maximisation algorithm. The FISS builds
upon the recent developments of non-stationary factor models. The decision to
do so was primarily driven by the above-mentioned statistical evidence and the
fact that the aim is to get an index that captures the level of financial stress rather
than the change in financial stress. By differencing the data, the information content of the indicators changes, which is undesirable (e.g. harmonic distance gives information about the tendency among banks to focus on repayment of current loans, while the first difference of this variable captures the change in this effect which, while informative, is a different object).
The FISS utilises standardised time series data which are often autocorrelated
and persistent, thus dynamic effects must be considered. This is underpinned
by the unit root tests (see Appendix 2) showing the importance of dynamics
in the data set. Although not all of the chosen variables portray non-stationary
tendencies, this is not a problem, as the factors that are gained all have unit-
root tendencies and as such can be explained well with a random walk structure.
Furthermore, the algorithm is capable of handling scenarios where some factors are stationary while the others are not (Eickmeier, 2005). Additionally,
Sims (1988) and Uhlig (1994) argue that Bayesian procedures are simpler and
often more reasonable in the presence of unit roots, as they avoid the drawbacks
that arise due to classical distribution theory. This underpins the choice of using
a Bayesian DFM.
Although applying dynamics to a factor model is enticing, it is not
costless: orthogonality of factors, information on explained variance and
factor interpretability are given up. These caveats have to be kept in mind when
creating a DFM, as each aspect carries different consequences. The lack of
orthogonality of factors means that special care has to be taken when multiple
factors are obtained. If the factors are too correlated, there is a possibility that
the added factors do not yield additional information. Giving up information
about explained variance is costly because it makes aggregating multiple
factors into one comprehensive index a non-trivial task. Sacrificing factor
interpretability means that it is hard to determine what each factor captures.

In a DFM, the common behaviour of a high dimensional vector of time series variables
is described by a few latent dynamic factors together with a vector of zero mean
idiosyncratic disturbances. The idiosyncratic noise term arises from measurement
error and from specific features of the individual data series. The latent
factors usually follow a VAR structure of lag order 𝑝. This is shown in equations
(3) and (4) below:

$$y_{it} = \lambda_{i0} + \lambda_i' f_t + \varepsilon_{it} \tag{3}$$

$$f_t = \phi_1 f_{t-1} + \dots + \phi_p f_{t-p} + \varepsilon_t^f \tag{4}$$

Equation (3) is the observation equation, signifying that each series y_it can be described by an independent regression i. The disturbance term of this equation, ε_it, is assumed to be independent and identically distributed (i.i.d.). The FISS uses a simplified version of the DFM, since the possible AR structure of these disturbance terms is ignored (see Koop and Korobilis, 2010).

Equation (4) plays the role of the state equation, describing the structure of the latent factors. In the FISS, the vector of latent factors follows a VAR structure. The disturbance term of the state equation, ε_t^f, is assumed to be i.i.d. In most cases Σ_f is diagonal, and for the model presented in this paper this condition is fulfilled. One last assumption is that the two disturbance terms, ε_t^f and ε_it, are independent of each other.
This system of equations turns out to be a regular form of a state space model.
This is an important observation because it allows the application of the Carter-
Kohn algorithm (see Carter and Kohn, 1994) to draw factors. Furthermore, all
methods for posterior simulation for state space models are available, because the model follows the structure described above.
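To make the state-space form of equations (3) and (4) concrete, the sketch below fits a small dynamic factor model on simulated data using the maximum likelihood routine in statsmodels; this illustrates only the model structure, not the paper's Bayesian estimator, and all dimensions and data are invented.

    import numpy as np
    import statsmodels.api as sm

    # simulate persistent factors (state equation) and noisy observations
    # (observation equation); dimensions kept small so the ML fit is quick
    rng = np.random.default_rng(2)
    T, n_series, k = 300, 8, 2
    factors = np.zeros((T, k))
    for t in range(1, T):
        factors[t] = 0.9 * factors[t - 1] + rng.normal(size=k)   # AR(1) factors
    loadings = rng.normal(size=(n_series, k))
    data = factors @ loadings.T + rng.normal(size=(T, n_series))

    # state-space dynamic factor model with a VAR(1) factor equation
    model = sm.tsa.DynamicFactor(data, k_factors=k, factor_order=1)
    result = model.fit(disp=False)
    print(result.factors.filtered.shape)    # estimated latent factor paths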

4.2. The choice of the factors and the estimation of the common stochastic trend

The first problem to be addressed is the choice of the number of factors. There
are two main guidelines on how to choose the optimal factor structure: the first
is to find the statistically optimal factor set (see Bai and Ng, 2002; Lopes and
West, 2004), the other is to predefine the number of factors based on intuition
or empirical aspects (see Engle and Watson, 1981). When choosing the number
of factors, the authors tried to merge the two ideas and four factors were chosen.
As described in Section 2, the indicators were categorised by the financial market
they represent. On account of this, the intuitive number of factors would be the
number of categories constructed. The ultimate goal of the FISS is to capture
as much information from the different sources of financial stress as possible,
and four factors satisfy this goal, with an explained variance of around 85%.⁶
Furthermore, these factors are not highly correlated, which highlights how they
capture different aspects of financial stress.
Section 2 showed that it is very common for an FSI based on factor methodology to use only the first factor. This was unfeasible for the FISS, as using only one
factor captured a mere 55% of the total variance in the data. This low explanatory
power would have violated the first guideline mentioned above.
The second problem that had to be tackled is the persistence of the dataset.
The factors in the model are estimated, following Bai (2004), as so-called
stochastic trends: non-stationary dynamics are allowed in the state equation.
In this way, the model can fully capture the information contained in the financial
data. Nevertheless, it is imperative to check the relations between the factors to
avert any spurious regressions in the model.

Table 2. Coefficient matrix of the state equation

            F1        F2        F3        F4
F1t–1    0.989     0.009     0.003     0.001
F2t–1   −0.001     1.001     0.002    −0.007
F3t–1   −0.012     0.029     0.976    −0.016
F4t–1    0.005    −0.016     0.018     1.004

Table 2 shows the coefficients of the state equation matrix. This matrix
is close to a unit matrix, which highlights how the factors of the model behave

⁶ Explained variance for the model was obtained from principal component analysis before applying dynamics to the system.

like independent random walks. This simple structure makes it unnecessary to


deal with the possible linkages between the factors such as cointegration (see
Banerjee et al., 2014). On the other hand, it has to be noted that the predictive
power of a model described by such dynamics is very limited. Thus, the expansion
of this modelling framework in the direction of a forecasting approach is not
straightforward.

4.3. Bayesian model specification

As mentioned before, due to the regular state-space form of the FISS, the
standard algorithms can be used to draw factors. The observation equations
conditional on the factors are just normal linear regression models, which
are independent from each other. Thus, simulating the equations one by one
is possible.
An identification issue arises when applying the PCA type factor model
as a starting point: the matrix of factor loadings and factors is unique only
if rotations are ruled out. One way of surmounting this issue is imposing
restrictions on the loadings (see Lopes and West, 2004). As such, the matrix of loadings was transformed into a block lower triangular form with strictly positive elements on the main diagonal. After performing this transformation, the simulation procedure can commence without any problems. The Carter and Kohn algorithm was used to draw the factors, with a Gibbs sampler simulating the parameters σ_i², Σ_f, λ_i0, λ_i and φ_j.
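As an illustration of the factor-drawing step, here is a self-contained forward-filter, backward-sample (Carter-Kohn) draw for the simplest possible case: a single latent random-walk factor observed with noise and known variances. The paper's sampler operates on the full multivariate system, but the logic is the same.

    import numpy as np

    def ffbs_draw(y, obs_var, state_var, rng):
        """One Carter-Kohn (1994) draw of the latent state in the local level
        model y_t = f_t + eps_t, f_t = f_{t-1} + eta_t: a forward Kalman filter
        pass, then backward sampling from the smoothing distribution."""
        T = len(y)
        m = np.zeros(T)                  # filtered means
        P = np.zeros(T)                  # filtered variances
        m_prev, P_prev = 0.0, 1e6        # diffuse initial state
        for t in range(T):
            P_pred = P_prev + state_var              # predict step
            K = P_pred / (P_pred + obs_var)          # Kalman gain
            m[t] = m_prev + K * (y[t] - m_prev)      # update step
            P[t] = (1.0 - K) * P_pred
            m_prev, P_prev = m[t], P[t]
        f = np.zeros(T)
        f[-1] = rng.normal(m[-1], np.sqrt(P[-1]))    # sample the final state
        for t in range(T - 2, -1, -1):               # backward sampling pass
            G = P[t] / (P[t] + state_var)
            mean = m[t] + G * (f[t + 1] - m[t])
            f[t] = rng.normal(mean, np.sqrt(P[t] * (1.0 - G)))
        return f

    rng = np.random.default_rng(3)
    truth = np.cumsum(rng.normal(size=400))          # latent random walk
    y = truth + rng.normal(0, 0.5, size=400)         # noisy observations
    draw = ffbs_draw(y, obs_var=0.25, state_var=1.0, rng=rng)
    print(np.corrcoef(truth, draw)[0, 1])            # the draw tracks the truth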
The prior for the coefficients in the observation equations is normal-inverse gamma, and the elements of Σ are inverse gamma. Since the observation equations are independent, the posterior parameters σ_i², λ_i0 and λ_i can be simulated separately in the i-th dimension (see Geweke and Zhou, 1996). The matrix of factor loadings is normally distributed with restrictions on the block structure of the matrix and with positive, truncated normally distributed elements on the diagonal.
The prior for the state equation is normal-inverse Wishart. The state equation is in a VAR form, so the Gibbs sampler with no Metropolis-Hastings steps can be applied (see Kadiyala and Karlsson, 1997).
The prior selection is supported by Uhlig (1994), where it is stated that in the
presence of a unit root the choice of a Normal-Wishart prior centred at the unit
root yields the best results.
The number of simulations was set at 10,000, with a burn-in of 2,000 and a thinning value of 7. This was done to get the most accurate estimates for the factors.
Detailed evaluation of the convergence of the algorithm used is discussed in
Appendix 6. The factors obtained with these specifications are shown in Figure 2.

Figure 2. Factors of the FISS


[Figure: time series of the four extracted factors F1–F4, daily, January 2005 – August 2019.]

Several things can be observed in Figure 2. First of all, the factors can take on
a negative value, and the factor level values range from ‒2 to +5.5. Although not
necessary, it is useful to convert the final FISS to a range between 0 and 1. This was
achieved by shifting each factor up by 2. After the factors were aggregated (more
on this in Section 4.4), the index values were divided by the historical maximum,
giving the final FISS. It has to be emphasised that these transformations are not necessary: the factors are informative without them, and they were applied only so that the final index lies on a scale between 0 and 1. Another important feature of the
factors is that they seem to be correlated during certain periods. This is important
because we expect multiple financial markets to show signs of instability during
periods of high systemic stress.
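The shift-and-rescale just described amounts to the following sketch, where factors stands in for the T×4 matrix of extracted factor paths (simulated here) and the weights argument anticipates the alternatives discussed in Section 4.4.

    import numpy as np

    def build_fiss(factors, shift=2.0, weights=None):
        """Shift the factors so they are non-negative, average them (equal
        weights unless alternatives are supplied), and divide by the
        historical maximum so the final index lies on a [0, 1] scale."""
        F = np.asarray(factors, dtype=float) + shift
        if weights is None:
            weights = np.full(F.shape[1], 1.0 / F.shape[1])   # unweighted average
        index = F @ np.asarray(weights)
        return index / index.max()                            # scale by historical max

    # toy example with four simulated factor paths
    rng = np.random.default_rng(4)
    factors = np.cumsum(rng.normal(0, 0.05, size=(1000, 4)), axis=0)
    fiss = build_fiss(factors)
    print(fiss.min(), fiss.max())      # maximum is 1 by construction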

4.4. Aggregation of the factors

Aggregating the factors into a single index is not trivial. As mentioned before,
the factors in question are not orthogonal, and due to this, using the common
practice in principal component analysis, namely the usage of the square root
of the eigenvalues as weights, is not possible. On account of this, a handful of
aggregation methods were tested: unweighted average, weighted average where
the weights are obtained using information value (IV) methodology, weighted
average where the weights are obtained from QR, and weighted average where
the weights are simulated with no prior information imposed.

Unweighted average
While the simplicity of the unweighted average is appealing, that is not its only strength. The primary consideration is that no ex-ante assumptions are required about when the index should peak or about which variables it should interact with. Nevertheless, there is a drawback associated with aggregation
using unweighted averages: it implicitly assumes that all factors are equally
informative. The validity of this assumption depends on whether the alternative
weighting schemes produce different dynamics in the FISS. These considerations
make the unweighted average a good benchmark to compare the alternative
weighting systems.

Weighted average using quantile regression


The use of QR to obtain weights is motivated by the fact that certain factors may be more informative about financial stress than others.
This can be measured by the factors’ effects on IP. Following the intuition of
De Bandt and Hartmann (2000), the information content of the factors, in
relation to the real economy, is contained in their extreme values. As such,
a quantile regression measures the interaction between the tail events of the
factors and the IP. Nevertheless, there are drawbacks to utilising a QR. First of
all, the IP is recorded only monthly, while the FISS is recorded daily. Conversion
of the FISS is not an issue, but there is an undeniable loss of information in
doing so. Furthermore, IP is a non-stationary process by definition. As such
it has to be differenced along with the factors to utilise QR, which further
depresses the information content of the factors obtained.⁷ In addition, the aim
of the FISS is to be a thermometer of stress, and tying its weighting too closely to
the tail events of IP can also lead to loss of information. For more information
on the methodology see Appendix 4.
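A sketch of the idea, using simulated stand-ins for the monthly differenced factors and IP growth (both hypothetical here; the paper's exact procedure is given in Appendix 4):

    import numpy as np
    import statsmodels.api as sm

    # simulated stand-ins for monthly IP growth and the differenced factors
    rng = np.random.default_rng(5)
    T = 600
    dfactors = rng.normal(size=(T, 4))
    ip_growth = dfactors @ np.array([0.3, 0.2, 0.1, 0.4]) + rng.normal(size=T)

    # quantile regression at the 99th quantile: the interaction between the
    # tail events of the factors and real activity
    qr_fit = sm.QuantReg(ip_growth, sm.add_constant(dfactors)).fit(q=0.99)
    weights = np.abs(qr_fit.params[1:])     # slope magnitudes, constant dropped
    weights /= weights.sum()                # normalise into aggregation weights
    print(weights)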

Weighted average using information value


The next methodology considered is the IV, which is widely used to
measure the separation power of a variable in creating PD and LGD models
(Siddiqi, 2012). The approach relies on the variable indicating a problem at the
right time, so in the case of an FSI, whether the indicator gives high values in bad
times and low values in good times. Using the IV methodology has the benefit that no conversion is required, unlike in the previous scenario. Furthermore, the
IV is not restricted to the IP and as such no information is lost about financial
markets that did not have much effect on the real economy. However, this comes
at the cost of having to define crisis and evaluation periods ex-ante to measure
the ratio of good and bad signals. A further drawback is that bins need to be
manually defined after checking the ratio of good and bad observations in each
decile. These introduce ad-hoc elements to the weighting structure which can
bias the final FISS to represent preconceptions about financial stress episodes in
the market. For more information on the methodology see Appendix 5.
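A minimal sketch of the IV computation in the scorecard sense, with hypothetical crisis flags and automatic decile bins (the paper's implementation, with manually defined bins, is given in Appendix 5):

    import numpy as np

    def information_value(signal, crisis, n_bins=10):
        """Scorecard-style information value (cf. Siddiqi, 2012): sum over bins
        of (share of 'good' obs - share of 'bad' obs) * ln(good / bad), where
        'bad' observations fall inside the pre-defined crisis periods."""
        edges = np.quantile(signal, np.linspace(0, 1, n_bins + 1))
        bins = np.digitize(signal, edges[1:-1])     # bin index 0 .. n_bins-1
        iv = 0.0
        for b in range(n_bins):
            in_bin = bins == b
            good = (in_bin & ~crisis).sum() / max((~crisis).sum(), 1)
            bad = (in_bin & crisis).sum() / max(crisis.sum(), 1)
            if good > 0 and bad > 0:                # skip empty cells, avoid log(0)
                iv += (good - bad) * np.log(good / bad)
        return iv

    # toy example: a factor that runs high during a flagged crisis window
    rng = np.random.default_rng(6)
    factor = rng.normal(size=1000)
    crisis = np.zeros(1000, dtype=bool)
    crisis[400:500] = True
    factor[crisis] += 2.0
    print(information_value(factor, crisis))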

⁷ Only the weights from the 99th quantile estimates are used, as the slope coefficients are most likely to be different from 0 for this quantile only. It has to be noted that the sign of the coefficients is the opposite of the expected sign; as such, the QR is only presented for illustrative purposes.

Simulated weights
Finally, weights were also simulated without imposing any prior information on the distribution of the draws. To this end, a grid over the unit simplex was constructed to obtain an estimate for every possible weight combination. In total, 18,500 potential weight pairings were calculated. The aim of this exercise was to check the sensitivity of the final index with regard to the weighting. It also shows how much information the other weighting methods add compared to this uninformed simulation. Only the middle 80% of combinations is presented, so as to maintain reasonable bands around the benchmark estimate.
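A sketch of this sensitivity check on simulated factor paths; uniform draws from the unit simplex are used here as one natural way to generate the weight combinations.

    import numpy as np

    rng = np.random.default_rng(7)
    F = 2.0 + np.cumsum(rng.normal(0, 0.05, size=(1000, 4)), axis=0)  # shifted factors
    W = rng.dirichlet(np.ones(4), size=18_500)    # 18,500 weight vectors summing to 1
    candidates = F @ W.T                          # one candidate index per weight vector
    lo, hi = np.percentile(candidates, [10, 90], axis=1)   # middle 80% band
    benchmark = F.mean(axis=1)                    # unweighted-average benchmark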

Comparison of results
Table 3 shows the weights gained from the different estimations. It has to
be noted that, because the QR method utilises differenced data and yields slope
estimates close to zero, it is only shown for completeness (for more information
see Appendix 4 and footnote 7). One striking feature is that the weights are very
close to the unweighted benchmark case, with the IV method portraying the
biggest deviation from it for the fourth factor.

Table 3. Weights gained by the different estimation procedures, %

F1 F2 F3 F4
Unweighted 25.00 25.00 25.00 25.00

QR 37.46 21.53 11.70 29.32

IV 25.81 45.07 27.47 1.65

The FISS series computed with the weights of Table 3 are shown in Figure 3. All indices were
transformed onto a [0,1] scale as described in Section 4.3. From this figure it can
be seen that the different weights do not change the index very much. The figure
reinforces the findings of Table 3 that the different weight estimations do not
result in significantly different indices. There is no real way to determine which
version represents reality the best with such small differences. On account of
this, the unweighted average will be used for the remainder of the paper.
Furthermore, the bands around the average estimate show that the possible
weight combinations do not diverge from the average estimate very much
during periods when the index increases suddenly and dramatically. The fact that the ‘confidence’ band around the average has positive width is further proof that the factors capture different aspects of financial stress, while the observation that the band becomes extremely narrow during sudden increases highlights how the aggregation of the factors yields a good measure of systemic stress.
Additionally, this gives further justification for the usage of the simple average
as an aggregation method.

Figure 3. Weighted and unweighted averages of the FISS

[Chart: FISS on a [0, 1] scale, January 2005 – August 2019. Series: 80% of combinations; FISS (unweighted); FISS (weights from IV); FISS (weights from QR).]

Figure 3 also portrays how, due to the dynamic nature and the Bayesian methodology, the index can reach high levels quickly, while it takes longer to decrease. From a policy maker's perspective, this property is important, as it decreases the probability of the index breaching crisis-level thresholds only for the signal to disappear the following week. One interesting observation from the figure is that the volatility of the index seems to display heteroscedastic tendencies. Accounting for this could be a further improvement to the FISS.

5. Results

5.1. Performance of the index

Having an objective, empirical way to determine the performance of the FISS is difficult. Measuring financial stress means that the FISS has to reflect all material
disturbances in the financial sector, irrespective of whether the connected risks
materialise in the real economy. Thus, the only objective, numeric measures of
FISS would be the very indicators it aggregates.
Following Holló (2012), Hakkio and Keeton (2009) and Holló et al. (2012),
the performance of the constructed index is verified by whether its peaks match
up with events of importance for the financial markets. Due to the large size
of the dataset (T = 3126 after adjustments, 3 January 2005 – 30 June 2017),
the index is broken down into two subsamples for this Section: one from 2005
until 2010, shown in Figure 4; and the other from 2010 until 2017, shown in
Figure 5. Furthermore, the two figures were each broken down into two periods.
Two periods of Figure 4 are the events before the Global Financial Crisis
(hereinafter GFC; January 2005 – September 2008) and the GFC (September 2008 – March 2010). Two periods of Figure 5 are the European sovereign bond crisis
(March 2010 – September 2013) and the period encompassing the post-crisis
events (September 2013 – February 2017).

Figure 4. Values of the FISS to 28 February 2010

[Chart: FISS on a [0, 1] scale, January 2005 – February 2010, with stress events I./1–I./4 and II./1–II./3 marked. Series: stress events; FISS.]

I./1: Introduction of the fiscal austerity package in Hungary on 12 June 2006. The FISS starts to increase a month before the programme was announced, as markets expected that some austerity measures would be introduced, but the type and severity of the measures were unclear. This increased uncertainty in the markets led the FISS to increase sharply during May and June.
I./2: Bear Stearns announces that two of its subprime hedge funds lost all value on 16 July 2007. This announcement can be viewed as the start of the subprime debacle that eventually led to the collapse of Lehman Brothers. It increased both information asymmetry and uncertainty in the financial markets. Following this date, the FISS's value increases as capital markets and foreign exchange markets reacted to the news. The third event in Figure 4 (I./3) also concerned Bear Stearns: on 15 November 2007, Standard & Poor's downgraded the investment bank amid concerns about its solvency.
I./4: Turbulence in the Hungarian government bond market during March 2008. Although Hungarian banks did not have large exposures to Mortgage-Backed Securities (hereinafter MBS), the decreased risk appetite of international traders narrowed Hungary's opportunities to access funding, raising the price of funds. This effect was amplified as the government bond market came under selling pressure amid concerns about government financing due to deteriorating macroeconomic fundamentals. These events are captured in the FISS, as its level rose sharply during the month. Although the index value drops off after the month, the aftermath is higher levels of stress, as the underlying fundamentals did not improve. The fact that the model is able to capture this persistence in
uncertainty is partly due to the dynamic structure, and it yields a more intuitive
picture of how market participants internalise such events.
II./1: Bankruptcy of Lehman Brothers on 15 September 2008. The following day, Moody's and Standard & Poor's downgraded the ratings of AIG due to concerns about the compounding losses of MBS, which triggered fears that the company might be insolvent. These events generated widespread uncertainty in the financial markets. The aftermath was the deepening of the GFC, which led to a sudden spike in the FISS. Concerns regarding the solvency of holders of unhedged foreign-currency-denominated debt, as well as concerns about the stability of the overall banking system, led to further fears about the financing of the Hungarian economy. This was further worsened by the fact that 6–10 October (the sixth event in Figure 4; II./2) was the worst week for the stock market in the United States in 75 years (the Dow Jones dropped 22.1%, while the S&P 500 dropped 18.2%). The aftermath of these events led to the FISS recording its highest level, as increased uncertainty led traders around the globe to shift their preferences, flocking to safer investment options (flight to quality). The value of the FISS eventually dropped off as Hungary signed the credit-line agreement with the IMF in late October. Nevertheless, the level of stress did not reach pre-Lehman levels.
II./3: Turbulence in the Hungarian foreign exchange spot market during January 2009. During this turbulence, the EUR/HUF exchange rate breached the 300 level, eventually reaching a value of 316 on 6 March of the same year. This was coupled with increased volatility in the spot market, which caused the FISS to spike momentarily. The increased volatility also affected the Hungarian banking system due to its large exposure to holders of foreign-currency-denominated debt, whose solvency came under heavy scrutiny.
III./1: Downgrade of Greece to junk bond category by Standard & Poor's on 27 April 2010. Portugal was also downgraded on the same day, but it kept its investment-grade status. This event is the first in Figure 5 and can be viewed as the start of the euro area sovereign debt crisis. Furthermore, there was uncertainty surrounding the newly formed Hungarian government, and statements comparing the situation of Hungary to that of Greece unsettled markets (see the article published on portfolio.hu: 'Kósa: "szűk esélyünk van arra, hogy elkerüljük Görögország helyzetét"' ['Kósa: "we have a slim chance of avoiding Greece's situation"'], 2010). The value of the FISS rises, as it captures this increased uncertainty through the increased volatility in the exchange rate market and the government bond market.
III./2: Turbulence in the Hungarian foreign exchange swap market during
December 2010. This turbulence was mainly driven by the foreign-currency-
denominated debt problem’s not having been adequately addressed. Furthermore,
the government introduced several new policies to stabilise government debt.
The market reacted to these events through the government debt market. Coupled with the increased uncertainty in the euro area, which caused exchange rate volatility to rise, these issues led the FISS to increase.
III./3: Standard & Poor’s raises the possibility of downgrading Italy on
21 May 2011. This had a profound impact on the FISS as some Italian banks
have subsidiaries in the Hungarian market that are systemically important. As
the PD of these banks increased, the FISS started to climb gradually, signifying
the increased uncertainty in the interbank market.

Figure 5. Values of the FISS from 1 March 2010

[Chart: FISS on a [0, 1] scale, March 2010 – November 2019, with stress events III./1–III./7 and IV./1–IV./7 marked. Series: stress events; FISS.]

III./4: Euro area sovereign debt problems worsen in mid-July 2011. During this time, Greece was downgraded further by Standard & Poor's to
speculative grade on 27 July 2011. Furthermore, Standard & Poor’s announced
that it would closely monitor the sovereign debt process of the United States on
25 July 2011, eventually downgrading the United States on 5 August 2011. This
caused investors to scramble to perceived safe-haven assets, such as Swiss bonds
(flight to quality). This had the effect of increasing the CHF/HUF exchange rate
to a level of 230. Apart from the increased volatility in the exchange rate market, this
further impaired Swiss franc-denominated debt holders’ position. This increased
concerns about the solvency and liquidity of Hungarian banks. The increase in
the FISS is attributed to turbulence in the foreign exchange market and the
government bond market.
III./5: Government announces plans related to early repayment of foreign-
currency-denominated loans at a fixed exchange rate on 9 September 2011.
This was done to tackle the concerns about the solvency of these debtholders. This
announcement increased the FISS as it forced the Hungarian banking system to
absorb the losses that arose from the early repayment of the loans at a non-market
exchange rate. The final repayment opportunity was open until February 2012. The programme cost the Hungarian banking system over HUF 250 billion (see the article published on portfolio.hu: 'Vége a végtörlesztésnek' ['The end of early repayment'], 2012). This increased the PD of all banks and led to an uptick in the FISS.
III./6: Papademos resigns on 11 April 2012. After reaching a post-Lehmann
peak in January 2012, the index starts to gradually decrease as the financial
markets of Europe seemed to stabilise. This tendency halted when concerns about
the solvency of Greece re-emerged. The resignation of the Prime Minister of
Greece L. Papademos further intensified these concerns, making the FISS rise
as uncertainty on the financial markets increased. The increased uncertainty
occurred in the government bond market and the foreign exchange market.
III./7: Cypriot financial crisis during March 2013. There was a slight uptick
in the FISS during this time, but the Cypriot financial crisis was dealt with swiftly.
Furthermore, the Hungarian financial system did not have exposures to Cyprus,
and the uptick in the FISS is mostly due to the increased exchange rate volatility.
IV./1: Hungarian government bonds come under selling pressure during
January 2014. This selling pressure came from Templeton drastically decreasing
its holdings of Hungarian government bonds. This selling pressure affected the
risk premium charged on Hungarian government bonds, leading to a slight
uptick in the FISS.
The next two events are the economic sanctions imposed on Russia by the EU on 29–31 July 2014 (IV./2) and 9 September 2014 (IV./3). These sanctions were met with Russian embargoes on imports from the EU. Since Hungary had export exposure to Russia in agricultural goods, these countermeasures hurt the general economy of Hungary, raising the FISS moderately as the foreign exchange market, capital markets and the government bond market reacted to the news.
IV./4: The start of the Quaestor debacle on 10 March 2015. The core of the
breakdown was the revelation that Quaestor operated with fictitious bonds: of
the HUF 210 billion bond stock, HUF 150 billion was issued without permission.
This event increased the information asymmetry in the Hungarian financial
system (especially in the interbank market), which increased the FISS gradually
until the summer of 2015.
IV./5: Greek debt repayment debacle in June 2015. Tsipras announced a
referendum on the bailout agreement, which the Greek parliament approved on
28 June 2015. Furthermore, capital controls were imposed, and the Greek banks
were forced to close on account of a lack of liquidity and remained closed until
20 July 2015. These events increased the volatility of the euro, which resulted in
a local maximum in the FISS.
The final two events in Figure 5 are the Brexit vote on 23 June 2016 (IV./6) and the US presidential election on 8 November 2016 (IV./7). These last two events are minor in terms of their effects on the Hungarian financial markets; the upticks are due to the increased volatility in the exchange rates of the euro and the British pound around the first of these events, and of the US dollar around the latter.

IV./8 shows a moderate increase in the stress index over a period of a few months in 2018Q2. This was on account of the lower risk appetite of investors, which led to gradually increasing yields on government bonds. The FISS captures this reduced risk appetite through a gradual increase in its value over the course of a month. The markets stabilised by the end of the quarter, and from the start of 2018Q3 the FISS starts to gradually drop. Although the increase in Figure 5 looks stark, the value of the FISS never reached the heights it recorded during the financial crisis.

Table 4. Ranking the effects of the events on the FISS

Event                                Minimum            Maximum            Change            Rank
Number   Source   Anticipation    Date        Value   Date        Value   In %     Raw      By %   By raw
I./1     HU       Yes             09-May-06   0.33    15-Jun-06   0.43    30.46    0.1009    9     12
I./2     US       No              16-Jul-07   0.25    17-Aug-07   0.37    48.42    0.1209    4      7
I./3     US       Yes             31-Oct-07   0.28    27-Nov-07   0.33    19.47    0.0544   15     16
I./4     HU       No              18-Feb-08   0.42    10-Mar-08   0.52    21.91    0.0931   13     13
II./1    US       Yes             18-Jul-08   0.36    17-Sep-08   0.51    42.25    0.1517    5      4
II./2    US       Yes             26-Sep-08   0.50    31-Oct-08   0.99    96.70    0.4861    1      1
II./3    HU       No              09-Jan-09   0.76    12-Mar-09   0.87    14.90    0.1130   17     10
III./1   HU/EU    Yes             13-Apr-10   0.37    09-Jun-10   0.69    85.27    0.3172    2      2
III./2   HU       Yes             05-Oct-10   0.43    01-Dec-10   0.56    28.09    0.1217   10      5
III./3   EU       No              11-Apr-11   0.35    25-May-11   0.47    34.96    0.1218    7      6
III./4   EU/US    No              11-Jul-11   0.47    10-Aug-11   0.58    24.71    0.1149   12      9
III./5   HU       No              01-Sep-11   0.52    01-Dec-11   0.68    30.88    0.1615    8      3
III./6   EU       Yes             20-Mar-12   0.52    01-Jun-12   0.61    16.52    0.0866   16     14
III./7   EU       No              13-Feb-13   0.40    26-Mar-13   0.44     9.16    0.0370   22     20
IV./1    HU       No              17-Jan-14   0.31    13-Mar-14   0.39    24.93    0.0771   11     15
IV./2    EU/RU    No              29-Jul-14   0.27    12-Aug-14   0.31    13.29    0.0362   18     21
IV./3    EU/RU    No              09-Sep-14   0.29    20-Oct-14   0.32    10.24    0.0298   21     22
IV./4    HU       No              10-Mar-15   0.33    08-Jun-15   0.37    12.69    0.0414   20     19
IV./5    EU       Yes             22-Apr-15   0.33    16-Jun-15   0.37    13.01    0.0427   19     18
IV./6    EU       Yes             19-Apr-16   0.26    27-Jun-16   0.37    40.39    0.1057    6     11
IV./7    US       Yes             26-Oct-16   0.25    25-Nov-16   0.30    21.73    0.0537   14     17
IV./8    HU       Yes             11-May-18   0.22    27-Jun-18   0.34    55.45    0.1204    3      8

In order to evaluate the effectiveness of the FISS, a summary table was created (see Table 4) highlighting the events that caused a material change in the index. Here the events are categorised by whether they had an anticipation period, in other words, whether the event came as a surprise or not, and by the geographical source of the disturbance. The first striking feature is that there appears to be no difference in the movement of the index with respect to the source of the disturbance. This is reassuring, as the aim of the model is to capture financial stress regardless of the source of the event.
It has to be noted that market anticipation only matters because the events shown are imperfect proxies of stress events. The announcement of the Hungarian government's austerity programme (I./1), for example, did not happen in a vacuum: market participants had already started to react to the worsening finances of the Hungarian government. As such, the true start of the stress period is not the announcement, but the moment when the financial markets started to react to the worsening economic fundamentals in Hungary. However, this start of the event is impossible to determine ex-ante. In Table 4, the start of the anticipation was determined as the minimum point in the FISS around the marked event.
An interesting question is whether the FISS should be interpreted on a log scale. As noted before, the value of the FISS seems to be more volatile the higher the level of the index, which could be due to functional misspecification. Table 4 tries to address this concern by looking at the change in the index at the events described in the previous Section and ranking them by raw as well as percent change. From the table, it can be seen that the percent change ranking puts too much weight on the changes occurring at the end of the sample, as opposed to the events that occurred during high-stress periods. As such, it can be concluded that there is no functional form misspecification and that the FISS is informative in raw levels. This is further supported by the fact that the included variables were kept in raw form and were not converted to log form.

5.2. Robustness

The FISS is not run recursively, thus it is important to make sure that the
model does not change drastically as more data is added to the sample. To
test this, the model was re-run on several sample sizes to verify that the peaks
of the model capture the important stress events of the Hungarian financial
markets in levels, while capturing the minor events in dynamics. Figure 6
shows the results of this robustness test. The following sample sizes were tested:
the FISS with the last 500 observations not taken into account, the last 1500
observations not taken into account, the FISS with the first 500 observations
not taken into account, and the FISS with the first 1500 observations not taken
into account.
Inspection of Figure 6 reveals that the estimates are extremely robust when
observations are taken off the end of the sample. The dynamics match up almost
identically and the difference in the levels is minimal. This level of robustness
is maintained also when the first 500 observations are taken off the estimation
sample. The only subsample where the model yields somewhat different results
is when the first 1500 observations are ignored. This is due to the fact that the
biggest shock in the data, the 2008 crisis, is not accounted for in this set up, and
as such the factor identification for financial stress is hindered. However, even
with this caveat the model performs admirably in the given subsample.

Figure 6. The FISS on different sample sizes


[Chart: FISS estimated on different sample sizes, on a [0, 1] scale, January 2005 – August 2019. Series: stress events; FISS; FISS (T – 500); FISS (T + 500); FISS (T – 1500); FISS (T + 1500).]

Note: for scaling the (T + 1500) series the maximum of the full-length estimation was used.

This level of robustness is partly attributed to the Bayesian structure of the model: even when almost half of the sample is not taken into account, the model
reaches its historic maximum in the same week. It is possible to create a recursive
variant of the FISS, but Figure 6 shows that it is not necessary, as the model
correctly identifies the underlying factors of financial stress.

6. Conclusion

This paper presented the FISS as a measure of financial stress in the Hungarian
financial system. This FSI aims to capture the financial instabilities in the four
core segments of the financial system: the foreign exchange market, the interbank
and bank segment, the government bond market, and the capital market. In total,
19 variables were used across the four segments, compressed into four common
stochastic trends.
To aggregate the four factors, different weight estimation procedures were
tested, namely a quantile regression, information value as well as a simulation
method. It was found that the simple average yields satisfactory results and the
more complex methods do not increase the information content of the FISS.
The time-varying methodology of the factors allows the compositions of
the underlying common stochastic trends to change as the financial markets
further develop. Nevertheless, as the financial market of Hungary deepens, the
indicators of the FISS should be revisited to make sure the data selection reflects
the underlying structure of the financial system.

The FISS is capable of capturing the core dynamics of financial instability in the Hungarian financial markets and will be a useful FSI for future policy use.
As such, the FISS has been part of the broader macroprudential toolkit of
the Hungarian Central Bank since August 2017. During this time, the index
has proven to be sensitive enough to give information about financial stress,
without being too sensitive and giving false positives. As such, the proposed
methodology meets the aim presented in the beginning of the paper: it creates
a sufficiently sensitive measure of financial risk for an economy that lacks deep
derivative markets.
The FISS fits in the countercyclical capital buffer (CCyB) framework of the
macroprudential toolkit. As opposed to the widely used early warning indicators,
the FISS provides information about the current level of tension in the financial
system of Hungary, acting as a thermometer of stress and making it particularly
useful when a decision has to be made about the release of the CCyB. The FISS
provides crucial information when releasing the CCyB, as the best practice is to
release built-up capital promptly to make sure the banking sector has enough
capital to cushion the impact of financial stress (Detken et al., 2014). The CCyB is
built up over a period of two years, so a precise signal about the current financial
stress is important, because an erroneous release of the buffer is costly. The FISS
satisfies the condition of being a good variable for the release of the CCyB as it
only reaches extreme heights during the 2008 global financial crisis. Additionally,
since the FISS is a continuous variable, it is capable of giving information about
periods that portray only moderate levels of financial stress, such as the European
bond crisis of the 2010s.
Policy evaluation is another important aspect of central banking. Having
a better understanding of how effective monetary policy is in normal times
and times of stress helps tune the interest rate setting of the Central Bank.
The continuous nature of the FISS allows it to be a threshold variable in bigger
VAR models that try to capture the dynamics of the economy across stress and
normal regimes. As such the FISS is a useful tool, not only in the CCyB framework,
but is also useful as a threshold variable for policy tuning and evaluation.
One potential area of improvement to the FISS stems from the data and the factors displaying significantly higher volatility during times of financial stress, that is, from their heteroscedastic features. It follows that the model can be extended and improved by accounting for this. Del Negro and Otrok (2008) outline such an approach: using time-varying parameters and stochastic volatility in both the factors and the loadings, the heteroscedastic features of the factors can be tamed.

Appendices are available at http://rjmf.econs.online/en; dx.doi.org/10.31477/rjmf.202001.03

References

Akhtar, M. and Hilton, R. S. (1984). Effects of Exchange Rate Uncertainty on German and U.S. Trade. Federal Reserve Bank of New York Quarterly Review, 9(1), pp. 7–16.
Bai, J. (2003). Inferential Theory for Factor Models of Large Dimensions. Econometrica,
71(1), pp. 135–171. doi:10.1111/1468-0262.00392
Bai, J. (2004). Estimating Cross-Section Common Stochastic Trends in Nonstationary
Panel Data. Journal of Econometrics, 122(1), pp. 137–183.
Bai, J. and Ng, S. (2002). Determining the Number of Factors in Approximate Factor
Models. Econometrica, 70(1), pp. 191–221. doi:10.1111/1468-0262.00273
Banerjee, A., Marcellino, M. and Masten, I. (2014). Forecasting with Factor-Augmented
Error Correction Models. International Journal of Forecasting, 30(3), pp. 589–612.
Barigozzi, M., Lippi, M. and Luciani, M. (2020). Cointegration and Error Correction
Mechanisms for Singular Stochastic Vectors. Econometrics, 8(1), 3.
Bernanke, B. S. and Boivin, J. (2003). Monetary Policy in a Data-Rich Environment.
Journal of Monetary Economics, 50(3), pp. 525–546.
Bernanke, B. S., Boivin, J. and Eliasz, P. (2004). Measuring the Effects of Monetary Policy:
a Factor-augmented Vector Autoregressive (FAVAR) Approach. NBER Working Papers,
N 10220.
Boivin, J. and Ng, S. (2006). Are More Data Always Better for Factor Analysis?
Journal of Econometrics, 132(1), pp. 169–194.
Caballero, R. J. and Krishnamurthy, A. (2008). Collective Risk Management
in a Flight to Quality Episode. The Journal of Finance, 63(5), pp. 2195–2230.
doi:10.1111/j.1540-6261.2008.01394.x
Caballero, R. J. and Kurlat, P. D. (2008). Flight to Quality and Bailouts: Policy Remarks
and a Literature Review. MIT Department of Economics Working Paper, N 21.
Carter, C. K. and Kohn, R. (1994). On Gibbs Sampling for State Space Models.
Biometrika, 81(3), pp. 541–553.
Cevik, E. I., Dibooglu, S. and Kutan, A. M. (2013). Measuring Financial Stress
in Transition Economies. Journal of Financial Stability, 9(4), pp. 597–611.
Crosbie, P. and Bohn, J. (2003). Modeling Default Risk. Moody’s KMV Company
Technical Paper.
De Bandt, O. and Hartmann, P. (2000). Systemic Risk: A Survey. ECB Working Paper, N 35.
Del Negro, M. and Otrok, C. (2008). Dynamic Factor Models with Time-Varying
Parameters: Measuring Changes in International Business Cycles. Federal Reserve Bank
of New York Staff Reports, N 326.
Detken, C., Weeken, O., Alessi, L., Bonfim, D., Boucinha, M. M., Castro, C., Frontczak, S.,
Giordana, G., Giese, J., Jahn, N., Kakes, J., Klaus, B., Lang, J. H., Puzanova, N.
and Welz, P. (2014). Operationalising the Countercyclical Capital Buffer: Indicator
Selection, Threshold Identification and Calibration Options. ESRB Occasional Paper, N 5.

Eickmeier, S. (2005). Common Stationary and Non-Stationary Factors in the Euro Area
Analysed in a Large Scale Factor Model. Deutsche Bundesbank Discussion Paper
Series 1: Studies of the Economic Research Centre, N 2.

Engle, R. and Watson, M. (1981). A One-Factor Multivariate Time Series Model of Metropolitan Wage Rates. Journal of the American Statistical Association, 76(376), pp. 774–781.
Forni, M., Hallin, M., Lippi, M. and Reichlin, L. (2000). The Generalized Dynamic
Factor Model: Identification and Estimation. The Review of Economics and Statistics,
82(4), pp. 540–554.
Forni, M., Hallin, M., Lippi, M. and Reichlin, L. (2005). The Generalized Dynamic
Factor Model: One-Sided Estimation and Forecasting. Journal of the American
Statistical Association, 100(471), pp. 830–840.
Forni, M., Giannone, D., Lippi, M. and Reichlin, L. (2009). Opening the Black Box:
Structural Factor Models with Large Cross Sections. Econometric Theory, 25(5),
pp. 1319–1347.
Freixas, X., Laeven, L. and Peydró, J.-L. (2015). Systemic Risk, Crises,
and Macroprudential Regulation. MIT Press.
Fukker, G. (2017). Harmonic Distances and Systemic Stability in Heterogeneous Interbank
Networks. MNB Working Papers, N 1.
Geweke, J. (1977). The Dynamic Factor Analysis of Economic Time Series. In: D. J. Aigner
and A. S. Goldberger, eds., Latent Variables in Socio-Economic Models.
Amsterdam: North Holland.
Geweke, J. and Zhou, G. (1996). Measuring the Pricing Error of the Arbitrage Pricing
Theory. Review of Financial Studies, 9(2), pp. 557–587.
Giannone, D., Reichlin, L. and Sala, L. (2006). VARs, Common Factors and the Empirical
Validation of Equilibrium Business Cycle Models. Journal of Econometrics, 132(1),
pp. 257–279.
Giannone, D., Reichlin, L. and Small, D. (2008). Nowcasting: The Real-Time
Informational Content of Macroeconomic Data. Journal of Monetary Economics,
55(4), pp. 665–676.
Gorton, G. B. (2009). Information, Liquidity, and the (Ongoing) Panic of 2007.
The American Economic Review, 99(2), pp. 567–572.
Gotur, P. (1985). Effects of Exchange Rate Volatility on Trade: Some Further Evidence.
IMF Economic Review, 32(3), pp. 475–512.
Hakkio, C. S. and Keeton, W. R. (2009). Financial Stress: What Is It, How Can It Be
Measured, and Why Does It Matter? Federal Reserve Bank of Kansas City Economic
Review, Second Quarter, pp. 5–50.
Hałaj, G. and Kok, C. (2013). Assessing Interbank Contagion Using Simulated Networks.
Computational Management Science, 10(2–3), pp. 157–186.
Hatzius, J., Hooper, P., Mishkin, F. S., Schoenholtz, K. L. and Watson, M. W. (2010).
Financial Conditions Indexes: A Fresh Look After the Financial Crisis. NBER Working
Papers, N 16150.
Holló, D. (2012). A System-wide Financial Stress Indicator for the Hungarian Financial
System. MNB Occasional Papers, N 105.
Holló, D., Kremer, M. and Lo Duca, M. (2012). CISS – A Composite Indicator of Systemic
Stress in the Financial System. ECB Working Paper, N 1426.

Hooper, P. and Kohlhagen, S. W. (1978). The Effect of Exchange Rate Uncertainty on the
Prices and Volume of International Trade. Journal of International Economics, 8(4),
pp. 483–511.
Illing, M. and Liu, Y. (2006). Measuring Financial Stress in a Developed Country:
An Application to Canada. Journal of Financial Stability, 2(3), pp. 243–265.
Kadiyala, K. R. and Karlsson, S. (1997). Numerical Methods for Estimation and Inference
in Bayesian VAR-Models. Applied Econometrics, 12(2), pp. 99–132.
Kindleberger, C. P. and Aliber, R. Z. (2011). Manias, Panics and Crashes: A History
of Financial Crises. London: Palgrave Macmillan.
Koop, G. and Korobilis, D. (2010). Bayesian Multivariate Time Series Methods for
Empirical Macroeconomics. Foundations and Trends in Econometrics, 3(4),
pp. 267–358.
Koop, G. and Korobilis, D. (2014). A New Index of Financial Conditions. European
Economic Review, 71, pp. 101–116.
Kósa: “Szűk Esélyünk Van Arra, Hogy Elkerüljük Görögország Helyzetét” [Kósa: “We Have a Slim Chance of Avoiding Greece's Situation”] (2010). Portfolio, 3 June. Available at: https://www.portfolio.hu/gazdasag/20100603/kosa-szuk-eselyunk-van-arra-hogy-elkeruljuk-gorogorszag-helyzetet-134056 [accessed on 4 February 2020].

Kose, M. A., Otrok, C. and Whiteman, C. H. (2003). International Business Cycles: World, Region and Country-Specific Factors. American Economic Review, 93(4), pp. 1216–1239. doi: 10.1257/000282803769206278
Lopes, H. F. and West, M. (2004). Bayesian Model Assessment in Factor Analysis.
Statistica Sinica, 14(1), pp. 41–67.
Merton, R. C. (1974). On the Pricing of Corporate Debt: The Risk Structure of Interest
Rates. The Journal of Finance, 29(2), pp. 449–470.
Nelson, W. R. and Perli, R. (2007). Selected Indicators of Financial Stability.
Risk Measurement and Systemic Risk, 4, pp. 343–372.
Otrok, C. and Whiteman, C. H. (1998). Bayesian Leading Indicators: Measuring and
Predicting Economic Conditions in Iowa. International Economic Review, 39(4),
pp. 997–1014.
Páles, K. and Varga, L. (2008). Trends in The Liquidity of Hungarian Financial Markets –
What Does the MNB’s New Liquidity Index Show? MNB Bulletin, 3(1), pp. 44–51.
Available at: http://www.mnb.hu/letoltes/mnb-bull-2008-04-judit-pales-lorant-varga-
en.pdf [accessed on 4 February 2020].

Sargent, T. J. and Sims, C. A. (1977). Business Cycle Modeling Without Pretending to Have Too Much A Priori Economic Theory. In: C. A. Sims, ed., New Methods in Business Cycle Research. Minneapolis: Federal Reserve Bank of Minneapolis, pp. 145–168.

Siddiqi, N. (2012). Credit Risk Scorecards: Developing and Implementing Intelligent Credit
Scoring. John Wiley and Sons.
Sims, C. (1988). Bayesian Skepticism on Unit Root Econometrics. Journal of Economic
Dynamics and Control, 12(2–3), pp. 463–474.

Stock, J. H. and Watson, M. W. (2002). Forecasting Using Principal Components from a Large Number of Predictors. Journal of the American Statistical Association, 97(460), pp. 1167–1179.
Stock, J. H. and Watson, M. W. (2005). Implications of Dynamic Factor Models for VAR
Analysis. NBER Working Paper, N 11467.
Uhlig, H. (1994). What Macroeconomists Should Know about Unit Roots: A Bayesian
Perspective. Econometric Theory, 10(3–4), pp. 645–671.
Vége a Végtörlesztésnek: Több Mint 260 Milliárdot Buktak a Bankok [The End of Early Repayment: Banks Lost More Than 260 Billion] (2012). Portfolio, 12 March. Available at: https://www.portfolio.hu/bank/20120312/vege-a-vegtorlesztesnek-tobb-mint-260-milliardot-buktak-a-bankok-164188 [accessed on 4 February 2020].

Use of Machine Learning Methods to Forecast Investment in Russia1
Mikhail Gareev, Russian Presidential Academy of National Economy
and Public Administration (RANEPA)
mkhlgrv@gmail.com

This work forecasts the growth rate of quarterly gross fixed capital
formation in Russia using machine learning methods (regularisation
methods, ensemble methods) over a horizon of up to eight quarters.
The methods tested show higher quality in terms of RMSFE than that of
simple alternative models (autoregressive model, random walk model),
with ensemble methods (boosting and random forest) leading in quality.
The last statement is consistent with the results of other research on the
application of big data in macroeconomics. It was found that removing
observations from the sample which relate to the time before the 1998
crisis and that are atypical for the subsequent period of time does
not worsen the short-term forecasts of machine learning methods.
Estimates of the coefficients of generally accepted key investment factors
obtained using regularisation methods are, on the whole, consistent
with economic theory. The forecasts of the author’s models are superior
in quality to the annual forecasts of growth rates of gross fixed capital
formation published by the Ministry of Economic Development.

Keywords: investment forecasts, machine learning, LASSO, boosting, random forest
JEL Codes: C53, E22

Citation: Gareev, M. (2020). Use of Machine Learning Methods to Forecast Investment in Russia. Russian Journal of Money and Finance, 79(1), pp. 35–56.
doi: 10.31477/rjmf.202001.35

1. Introduction
Currently, huge arrays of macroeconomic and financial data are available,
and many applied studies in economic research come down to retrieving useful
information from them. Extensive research has been dedicated to the application
of various methods for working with big data volumes in macroeconomics.
Models have been developed that make it possible to retrieve information from
hundreds of variables and address the challenges of overfitting and the so-called 'curse of dimensionality' (see Stock and Watson, 2011). However, the outcomes
obtained using many popular machine learning methods are often difficult

1  The author would like to express his gratitude to Andrey Polbin (RANEPA; Gaidar Institute for Economic Policy) and anonymous reviewers for their invaluable remarks and corrections.

or impossible to interpret. Researchers must understand how adequate predictive models are and how consistent they are with economic theory, so that the use of regularisation (i.e. penalising unreasonably high coefficients on predictors) can be justified. Regularisation methods like LASSO (Least Absolute Shrinkage and Selection Operator) or Ridge, or their modifications, on the one hand mitigate the risks associated with overfitting to some extent, and, on the other hand, preserve interpretability. However, methods with weak interpretability very often demonstrate the highest forecast quality in applied research (e.g., ensemble methods like boosting and random forest).
Investments are one of the most important drivers of growth in the long run and frequently become the focus of discussions among economists. Research work
by Kudrin and Gurvich (2014) focuses on the decrease of investment activity in
Russia in 2007–2013, and the restriction of foreign capital inflow and technology
imports due to events in Crimea and the South-East of Ukraine. A slowdown in
investment activity leads to lagging behind the rest of the world. The low investment
attractiveness of the Russian economy, which the authors of the paper believe is
primarily caused by weak market mechanisms, is the major obstacle to sustainable
growth. Oreshkin (2018) points out that, considering the existing demographic
limitations, the only way to boost economic growth rates is to increase the volume
and quality of investments. According to estimates of the Ministry of Economic
Development (MED), the target GDP growth rate of 3–3.7% is achievable only
through a 25–30% increase in the share of investments in GDP, and an increase
in household savings must be the primary driver of this transformation. In their
paper, Idrisov and Sinelnikov-Murylev (2014) pay special attention to institutional
drivers for boosting investment prospects in Russia. Among the possible areas
of accelerated investment growth, scholars mention advanced technologies
(Aganbegyan, 2016) and infrastructure (Oreshkin, 2018).
It is clear that, like any other economic indicator, investments are influenced by
many factors, and because it is fundamentally impossible to consider all of them,
relatively simple models are often used for forecasting purposes (e.g., an accelerator
model implying that the level of investments is defined by the current level of output
and its lags). However, it may be possible to extract information from the extensive
data of macroeconomic statistics which will help to generate relatively consistent
forecasts for investment growth rates. The author hopes that this work, which
explores the application of machine learning methods to forecast growth rates for
gross fixed capital formation for a horizon of up to 8 quarters, will somehow fill the
gap that exists in this area.
There are plenty of examples that demonstrate the successful application
of machine learning in macroeconomic forecasts. Thus, Li and Chen (2014)
examined the capabilities of LASSO, Elastic Net and Group LASSO in the field
of macroeconomic forecasting compared with dynamic factor models (DFMs).

The data include 107 various monthly indicators for the American economy
during the period 1959–2008, and the authors carry out a detailed examination of
the results obtained for the 20 most significant indicators. It was found that, for one-step-ahead forecasting, regularisation methods on the whole showed superior quality (in terms of MSFE) compared with DFMs, for 18, 15 and 19 variables out of 20 for LASSO, Elastic Net and Group LASSO, respectively. In addition, it was found that, for each of the 20 indicators, combining regularisation methods and DFMs ensured higher-quality forecasts compared to using DFMs alone.
Among other things, Bai and Ng (2008) demonstrate that the application of
regularisation methods may be useful in terms of determining the factors that impact
inflation in the US. On the whole, models that support changes in the set of explanatory
variables over time (variables were selected using LASSO, among other things)
demonstrated superior quality as compared to models with a fixed set of factors.
In his work, Baybuza (2018) investigates methods to process big data for the
purposes of forecasting inflation in Russia. The author uses 92 macroeconomic
indicators for the period of 2012–2016 as predictors and generates out-of-sample
inflation forecasts for a horizon of 1–24 months. In addition to regularisation methods, ensemble machine learning methods are also considered (random forest and boosting). Models with regularisation yield lower-quality forecasts relative to the benchmarks (random walk model, AR(1) and AR(p)) and to the ensemble methods. However, the combination of AR(1) and LASSO demonstrates superior quality for forecasts one month ahead.
A paper by Fokin and Polbin (2019) is dedicated to combining vector autoregression with LASSO regularisation (VAR-LASSO) to forecast the major indicators of the Russian economy (GDP, consumption, and fixed capital investment) based on exogenous oil price shocks. Based on the testing results, the model offered by the authors demonstrates good predictive power compared to an ordinary VAR model and to the forecasts of the MED. Furthermore, impulse response functions to shocks in oil prices were constructed.
The rest of the paper is organised as follows. Section 2 describes the research methodology: first, the machine learning methods used (Ridge, LASSO, Post-LASSO, Adaptive LASSO, Elastic Net, Spike and Slab, random forest, boosting) and the alternative forecasting methods used for quality comparison; second, the data, together with the methods of transformation and forecast building. Section 3 contains the empirical results and a discussion. The author developed a web application2 that enables both the reproduction of the results and the configuration of one's own model specification (the boundaries of the training sample and the forecasting horizons), as well as the creation of forecasts for investment growth rates. Section 4 concludes.

2  The application is available at https://mkhlgrv.shinyapps.io/investment_forecasting/

2. Methodology

We can highlight several groups of methods for working with the big volumes
of data utilised to forecast macroeconomic variables. These differ in terms of their
approaches to addressing the ‘curse of dimensionality’, which is a major obstacle
when it comes to using a high number of explanatory variables.
One of the approaches to working with a large number of predictors involves the
use of regularisation methods. As we have already mentioned before, the idea is to
use a penalty function for increases in the size or number of estimated parameters.
The two most common and widely examined regularisers in research papers are LASSO, or the 𝑙1 regulariser (Tibshirani, 1996), and Ridge regression, also known as Tikhonov regularisation, or the 𝑙2 regulariser (Hoerl and Kennard, 1970).
A great number of poorly interpretable machine learning methods are also
available. This research pays special attention to ensemble methods – random
forest and boosting.

2.1. Regularisation methods

Let us assume that a standard linear regression model is defined:

$y_i = x_i'\beta + \varepsilon_i$,    (1)

where for observation 𝑖 = 1, … , 𝑛, y𝑖 are the values of the explained variable,


𝑥𝑖  ∈  ℝ𝑝 are the values of the 𝑝 explanatory variables, β ∈ ℝ𝑝 is the vector of 𝑝
coefficients, ε𝑖 ~ N(0, σ2) are independent and identically distributed random
errors. All variables are standardised, i.e. they have zero mean and unit variance.
In matrix form, the model is as follows:

$Y = X\beta + \mathcal{E}$,    (2)

where Y ∈ ℝ𝑛 are the values of the explained variable, X ∈ ℝ𝑛×𝑝 is the matrix of
values of explanatory variables, β ∈ ℝ𝑝 is the vector of coefficients, and ℇ ∈ ℝ𝑛 is
the vector of independent and identically distributed random errors.
The model is called a sparse linear model with high dimensionality (Belloni
and Chernozhukov, 2011a; Belloni and Chernozhukov, 2011b) if it is possible that
𝑝 ≥ 𝑛, but only s < 𝑛 elements of β are not zero. High dimensionality means that
estimation of the model by means of standard OLS may either cause non-robust
estimates of the coefficients, or be completely impossible, if 𝑝 ≥ 𝑛.
If the linear model has high dimensionality, it would be useful to have a method
of estimation of coefficients with a smaller variance (even if biased) than when
using OLS. However, if the model is a sparse model, then it would be useful to
have a way of identifying and discarding the zero elements of vector β in addition to estimating coefficients. One of the possible ways to address these challenges is


to use regularisation methods. A common idea for such methods is to introduce a
certain penalty (regularisation operator) that would prevent overfitting caused by
unreasonably sizeable estimates of the coefficients, shrinking them toward zero.

2.1.1. Ridge
OLS estimations of models with low dimensionality usually have low bias
but high variance. However, Ridge regression allows a decrease in variance across
estimates of the coefficients, at the expense of making them biased. Formally, the
problem looks as follows:

$\hat{\beta}^{Ridge} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} (y_i - x_i'\beta)^2 + \lambda \|\beta\|_2^2 \right\}$.    (3)

The standard minimised least-squares functional is augmented with a penalty that consists of the squared 𝑙2-norm of vector β multiplied by a penalty parameter λ. As λ increases, the estimates of the coefficients tend to zero, while λ = 0 corresponds to standard OLS. An important aspect of this approach is the existence of an analytical solution:

$\hat{\beta}^{Ridge} = (X'X + \lambda I_p)^{-1} X'Y$.    (4)

If rank(X) < 𝑝, OLS has no solution because the matrix X'X is not invertible. However, a solution exists for the Ridge model whenever λ ≠ 0.
The disadvantage of this approach is that it cannot perform variable selection in a sparse model: no coefficient is ever set exactly to zero.
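For illustration, a minimal sketch evaluating the closed-form solution (4) directly; the simulated data below are placeholders, not the dataset used in this paper.

```python
import numpy as np

def ridge(X, Y, lam):
    """Closed-form Ridge estimate (X'X + lambda*I)^(-1) X'Y from equation (4).
    Assumes X and Y are already standardised, as in the text."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ Y)

# Even with p > n the system is solvable as long as lambda > 0:
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 60))              # n = 40 observations, p = 60 predictors
beta_true = np.zeros(60); beta_true[:5] = 1.0  # sparse true coefficients
Y = X @ beta_true + 0.1 * rng.standard_normal(40)
print(ridge(X, Y, lam=1.0)[:5])                # shrunken estimates of the 5 true non-zeros
```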

2.1.2. LASSO
The LASSO method became popular after Tibshirani (1996); however, it had already appeared in earlier publications (Santosa and Symes, 1986). A LASSO-based estimate looks as follows:

$\hat{\beta}^{LASSO} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} (y_i - x_i'\beta)^2 + \lambda \|\beta\|_1 \right\}$.    (5)

In this case, the sum of the absolute values of the model coefficients
multiplied by the parameter λ acts as the regularisation operator. As in Ridge,
the penalty value is not substantial when λ values are low enough, and the results
of estimation are similar to those of standard linear regression. However, if the λ values are high enough, the model contains no explanatory variables at all. Unlike
Ridge, the use of LASSO allows the selection of just some of the most important
variables out of the total set, and the discarding of the rest. However, no analytical
solution exists for this model.
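Since no closed form exists, LASSO is estimated numerically (e.g. by coordinate descent). A minimal sketch using scikit-learn follows; note that the library's alpha corresponds to λ only up to its scaling of the squared-error term, and plain cross-validation is used purely for illustration (for time series, the folds should respect temporal ordering).

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 50))
beta_true = np.zeros(50); beta_true[:3] = [2.0, -1.5, 1.0]   # sparse truth
y = X @ beta_true + 0.5 * rng.standard_normal(100)

model = LassoCV(cv=5).fit(X, y)            # penalty chosen by cross-validation
selected = np.flatnonzero(model.coef_)     # indices of the non-zero coefficients
print(model.alpha_, selected)
```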
Regularisation methods yield explicit estimates of the coefficients on the predictors. But can they be interpreted accurately? It is well known that coefficients estimated using standard LASSO are often unstable over time
and can change drastically when new observations are added. This is confirmed, for
instance, in Zou and Hastie (2005), and De Mol et al. (2008). There are a number
of modifications of classic LASSO that are intended to improve its statistical
properties, like Post-LASSO and Adaptive LASSO.

2.1.3. Post-LASSO
Normally, estimates by means of regularisation methods will be biased.
Belloni and Chernozhukov (2011a) demonstrated that the Post-LASSO method
may provide estimates that are potentially less biased than those created by
LASSO. The LASSO operator allows the removal of unnecessary variables from
the scope of consideration. In this case, it would be quite natural to additionally
consider ordinary linear regression after selecting the variables, using only those
predictors for which the coefficients were not equal to zero. Thus, we obtain the
Post-LASSO estimate. Formally, this may be recorded in the following way:

$\hat{\beta}^{Post} = \arg\min_{\beta} \sum_{i=1}^{n} (y_i - x_i'\beta)^2$, where $\beta_j = 0$ if $\hat{\beta}_j^{LASSO} = 0$.    (6)

Such estimates may be less biased; however, as Belloni and Chernozhukov (2011a) demonstrated by means of simulation, high data noisiness makes the model unstable, and the estimates of the coefficients may be even farther from their true values than those obtained by LASSO.
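A sketch of the two-step procedure, reusing the simulated X and y from the LASSO example above: LASSO selects the support, and OLS is then re-estimated on the surviving predictors only.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

def post_lasso(X, y):
    """Equation (6): OLS restricted to the LASSO-selected support."""
    support = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)   # variables LASSO kept
    beta = np.zeros(X.shape[1])
    if support.size > 0:
        beta[support] = LinearRegression().fit(X[:, support], y).coef_
    return beta, support

beta_pl, support = post_lasso(X, y)
```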

2.1.4. Adaptive LASSO


Zou (2006) demonstrates that in some situations LASSO may incorrectly discard variables (i.e. estimate their coefficients as zero). In some cases, the penalty
parameter value of λ which shows the highest estimation quality results in the
selection of irrelevant variables. However, a situation is also possible in which
variables are selected correctly but the non-zero coefficients are unreasonably
sizeable, which results in relatively poor forecasts.
Due to this, another LASSO version is offered, Adaptive LASSO, which already
uses a weighted penalty function. In this case, subject to certain pre-conditions, the
method selects the correct variables, and moreover, we can talk about the consistency
of coefficient estimates obtained this way. The improvement suggested by Zou (2006)
consists of the preliminary weighting of the coefficient vector β within the regularisation operator $\frac{\lambda}{n}\|\beta\|_1$. The problem of finding estimates of the vector of coefficients using Adaptive LASSO is formally described as follows:

$\hat{\beta}^{AL} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} (y_i - x_i'\beta)^2 + \frac{\lambda}{n} \sum_{j=1}^{p} w_j |\beta_j| \right\}$,    (7)

where w ∈ ℝ𝑝 is the vector of coefficient weights. According to the results obtained by the authors, if the weights are chosen correctly, the estimates obtained by this method are consistent. But how do we choose the weights? The following formula
may be used:
$w_j = \frac{1}{|\hat{\beta}_j^{init}|^{\gamma}}$,    (8)

where $\hat{\beta}_j^{init}$ is the initial estimate of the coefficient $\beta_j$ obtained, for instance, using Ridge, and γ is an additional parameter. As γ increases, the importance of the penalty weighting grows (if γ = 0, the problem comes down to ordinary LASSO). The weights make it possible to flag, within the LASSO operator, the variables whose addition to the model is not desirable: the lower the absolute value of the initial estimate of coefficient j, the higher the penalty for that coefficient's appearance in the model.
The parameter γ could be chosen using cross-validation, though this
would not really be reasonable if the available sample is small. According to
recommendations provided by the creator of the method, γ = 0.5 is used, and
initial weight values are calculated based on Ridge estimates.
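Adaptive LASSO can be implemented with a standard reweighting trick: the columns of X are rescaled by the weights from (8), and ordinary LASSO is run on the rescaled matrix. A sketch with Ridge initial estimates and γ = 0.5, as in the text (the Ridge penalty of 1.0 and the guard constant are arbitrary choices):

```python
import numpy as np
from sklearn.linear_model import LassoCV, Ridge

def adaptive_lasso(X, y, gamma=0.5):
    """Weighted problem (7) via column rescaling: LASSO on X_j / w_j is
    equivalent to penalising |beta_j| with weight w_j from equation (8)."""
    beta_init = Ridge(alpha=1.0).fit(X, y).coef_     # initial Ridge estimates
    w = 1.0 / (np.abs(beta_init) ** gamma + 1e-8)    # small constant guards against division by zero
    lasso = LassoCV(cv=5).fit(X / w, y)              # broadcasting rescales each column
    return lasso.coef_ / w                           # undo the rescaling

beta_al = adaptive_lasso(X, y)
```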

2.1.5. Elastic Net


Zou and Hastie (2005) demonstrated the non-robustness of LASSO variable selection caused by parameter uncertainty when estimating the covariance matrix. To address this issue, the authors suggested the Elastic Net method, which combines the LASSO and Ridge methods. The central features of these methods are as follows: LASSO allows the nullification of coefficients, but its estimates may change substantially when new observations are added, whereas Ridge yields more robust estimates for highly correlated variables. The use of the Elastic Net method, on the one hand, allows the nullification of coefficients (the advantage of LASSO), while on the other hand its estimates may be relatively robust (the advantage of Ridge). The optimisation problem associated with Elastic Net looks as follows:

$\hat{\beta}^{EN} = \arg\min_{\beta} \left\{ \sum_{i=1}^{n} (y_i - x_i'\beta)^2 + \lambda \big( \alpha \|\beta\|_1 + (1 - \alpha) \|\beta\|_2^2 \big) \right\}$.    (9)

The regularisation operator consists of the weighted sum of the LASSO and Ridge operators. This formulation already involves two parameters: λ, as above, is responsible for the importance of the penalties, while α weighs LASSO and Ridge against each other. With α = 1, the model comes down to LASSO, and with α = 0, to Ridge. The α parameter may be selected by means of cross-validation; however, this is not really justified when the sample is small, so in this research its value is set at 0.5, equidistant from the two extremes. The same value was selected, for instance, in Baybuza (2018).
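A sketch in scikit-learn, continuing with the simulated data above; l1_ratio plays the role of α (fixed at 0.5, as in the text), and only the penalty strength is tuned by cross-validation:

```python
from sklearn.linear_model import ElasticNetCV

# l1_ratio = 0.5 weighs the LASSO and Ridge penalties equally.
enet = ElasticNetCV(l1_ratio=0.5, cv=5).fit(X, y)
print(enet.alpha_, (enet.coef_ != 0).sum())   # chosen penalty and number of kept variables
```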

2.1.6. Spike and Slab


In econometrics, the Bayesian approach assumes the existence of a prior
distribution for each model parameter, the combination of which with the available data yields a posterior distribution. This approach, for instance, makes it possible to obtain estimates of the parameters even if the number of variables exceeds the number of observations, or to directly pose the question of the probability that any of the model's coefficients equals zero. This research uses the Bayesian Spike and Slab method.
It is important to note that regularisation methods may also be described
through the Bayesian approach. In the case of a normal prior distribution, the Bayesian model comes down to Ridge regression, and when a Laplace distribution is used, to LASSO regression (see De Mol et al., 2008).
In a generic form, the representation of the model looks as follows. Let us
assume equation (2) holds and the following prior beliefs about parameters of
the model are assumed:

$(Y \mid X, \beta, \sigma^2) \sim N(X\beta, \sigma^2 I_n)$,  $(\beta \mid \gamma) \sim N(0, \Gamma)$,  $\gamma \sim \pi(d\gamma)$,  $\sigma^2 \sim \mu(d\sigma^2)$,    (10)

where σ2 > 0 is the variance of the error term, Γ = γ ⋅ Ik, Ik is an identity matrix, π(⋅)
and μ(⋅) are prior distributions of respective parameters.
Various versions of prior distributions exist. The author uses the Spike and Slab
method as described in Ishwaran and Rao (2005), which makes the following
assumptions:

$(\gamma_j \mid v_0, w, \tau_j^2) \sim (1 - w)\,\delta_{v_0 \tau_j^2}(\cdot) + w\,\delta_{\tau_j^2}(\cdot)$,  $\tau_j^{-2} \sim \mathrm{Gamma}(\alpha_1, \alpha_2)$,  $w \sim \mathrm{Uniform}[0, 1]$,    (11)

where $\delta_x$ is the discrete measure concentrated at point x, and $v_0$ is a number close to zero. The number $v_0$ and the parameters $\alpha_1$ and $\alpha_2$ of the Gamma distribution are selected in such a way that the prior for the coefficient variance $\gamma_j$ has a peak at zero and a right-side tail. Unlike earlier versions of the model, this specification assumes the continuity of all distributions used instead of defining them as piecewise functions, which ensures greater flexibility of the model. In accordance with these assumptions, minimisation of the posterior sample mean of errors is performed, which is standard for the Bayesian approach.

2.2. Ensemble methods

2.2.1. Random forest


Random forest falls under the category of ensemble methods. The general idea behind ensemble methods is to use a large number of 'simple' regression or classification methods within a single model and to generate predicted values based on their averaged predictions. The random forest
method described by Liaw and Wiener (2002) is based on building decision trees.
What is the motivation for their use? Linear estimation methods feature a number of advantages: they can be trained quickly, they can handle a large number of features, and they can be regularised. However, they have a serious disadvantage: they are only able to capture linear dependencies between variables (of course, the model specification may be changed and non-linear components added, but such changes must be substantiated, and the opportunities are limited anyway). Decision trees address this challenge to some extent. Though initially they were used for classification tasks (for instance, the classical task solved using trees is the binary classification of potential debtors by a bank: whether or not the debtor will repay the credit), they may also be used for regression tasks. However, single trees are currently seldom used on their own; they are usually combined into a composition.
Before we proceed with a description of the random forest method, we give a formal definition of a binary decision tree. Let us assume that a vector of the explained variable, Y, and a matrix of explanatory variables, X, are specified. The tree consists of internal and terminal (leaf) nodes. Each internal node $v$ has a function (predicate) $\beta_v : X \to \{0, 1\}$ assigned to it, and each terminal node $u$ has a forecast $y_u \in Y$ assigned to it. Moreover, an algorithm over X is defined, which starts from the initial node $v_0$ and transitions to the left-hand child node if $\beta_{v_0}(X) = 0$, or to the right-hand child node if $\beta_{v_0}(X) = 1$. This continues until the algorithm reaches a terminal node, at which point the explained variable is forecast. Usually, one predicate uses just one variable from the set of explanatory variables.
How are the trees constructed? At each node, the variable and its split value
are selected in such a way as to ensure the maximum improvement of a pre-defined
quality functional. For instance, the residual sum of squares
RSS = ∑_{i=1}^{n} (y_i − ŷ_i)² may act as such a functional, and the mean value
or median of the explained variable in the sub-sample may be used as the predicted
value ŷ_i. The stopping criterion is checked at each node. If the check fails, the node
is declared an internal node, and splitting continues. If the check is successful, the
node is recognised as a terminal node and is assigned forecast ŷ𝑖. The maximum
tree depth, the minimum number of objects at a node, the maximum growth of the
quality functional, or a combination of the above, may be used as a stopping criterion.
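
To make the construction concrete, a single regression tree of this kind can be grown in R, the language used for the calculations in this paper; the rpart package is an illustrative choice (it is not among the packages listed in Section 2.5), and the data frame df with outcome y is hypothetical:

    library(rpart)
    # Splits are chosen to maximise the reduction in RSS; a node becomes
    # terminal once further splitting would leave a child node with fewer
    # than 5 observations (minbucket).
    tree  <- rpart(y ~ ., data = df, method = "anova",
                   control = rpart.control(minbucket = 5))
    y_hat <- predict(tree, newdata = df)  # mean of y in the terminal node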
It is easy to notice that trees are prone to overfitting (for instance, in a trivial
case, you can develop an error-free tree on a training sample if each terminal
node contains only one object). Random forest is not so susceptible to this issue.
‘Planting’ decision trees into a random forest consists of two steps:
1. N sub-samples are drawn from the data randomly with replacement. The size
of each sub-sample is equal to the size of the entire training sample, but a
certain part of the observations is absent from a given sub-sample, while other
observations appear multiple times. A decision tree is built on each
sub-sample. However, when building the tree on a single sub-sample, only
p₀ variables, randomly selected from the p predictor variables individually
for each sub-sample, are available.
2. The value ŷ_ij averaged across all trees j (j = 1, … , N) is selected as the
predicted value ŷ_i.
This approach allows the achievement of high predictive power, and, as we
have already mentioned above, using multiple trees secures the model against
overfitting to some extent. However, meaningful economic interpretation of the
results of random forest building is quite difficult or completely impossible.
Moreover, the use of decision trees involves an important restriction: predicted
values cannot go beyond the values that were observed in the training sample, while
such a problem does not exist for linear models. This restriction may be especially
important in terms of forecasting macroeconomic or financial data.

2.2.2. Boosting
Like random forest, gradient boosting, as proposed in the research by
Friedman (2001), belongs to the category of ensemble methods (i.e. it is
a composition of several models of the same type). However, unlike random
forest, boosting allows the iterative improvement of the model using information
obtained from previous learning iteration.
Boosting is actually a general methodology of developing methods, so it can
use almost any type of models as base models. However, as with random forest, the
base model is often a decision tree. This is the approach that is used in this paper.
In the course of boosting, the base model M₁ is first trained on the training sample;
then the model's errors are computed, and a new model m₂ is trained on them.
The new model is added to the previous one with the multiplier η ∈ (0, 1):

M₂(X) = M₁(X) + η·m₂(X).   (12)

The parameter η is responsible for the speed of learning (if the speed is too
high, the model is prone to overfitting; if too low, more iterations will
be needed to obtain high-quality predictions).
The result of N iterations is the model M_N = M₁ + ∑_{i=2}^{N} η·m_i.
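
For illustration, such a model can be fitted with the xgboost package used in this paper; X_train, y_train and X_test below are hypothetical objects, and the parameter values merely echo those discussed in Section 2.5:

    library(xgboost)
    # eta is the multiplier from equation (12); nrounds is the number of
    # boosting iterations N; the base models are regression trees.
    bst <- xgboost(data = X_train, label = y_train,
                   eta = 0.3, nrounds = 100,
                   objective = "reg:squarederror", verbose = 0)
    y_hat <- predict(bst, newdata = X_test)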

2.3. Alternative models

This subsection describes alternative models used in the paper which are not
related to machine learning methods.

2.3.1. Random walk


A simple random walk model is used as the baseline. The random walk is a
simple and transparent benchmark that is frequently used in applied economic
research. This model assumes that process increments over time are white noise.
Formally, a forecast of h steps ahead may be recorded in the following way:

ŷ_{t+h} = y_t,   h = 1, 2, …   (13)

2.3.2. Autoregression
Another model frequently used as a quality benchmark in time series
forecasting is the autoregressive model of order p, AR(p). This model assumes
that the value of the process at a given moment in time t is fully explained by its p
lagged values and a random error ε_t :

y_t = c + φ₁·y_{t−1} + … + φ_p·y_{t−p} + ε_t.   (14)

In practice, the number of lags is usually selected using information criteria,
such as AIC or BIC.
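
As a minimal sketch (using base R's ar() for illustration rather than the author's exact code; y is a hypothetical time series):

    # Fit AR(p) with p selected by AIC, then produce a recursive
    # forecast 8 steps ahead.
    ar_fit <- ar(y, aic = TRUE, order.max = 4)
    fc     <- predict(ar_fit, n.ahead = 8)$pred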

2.4. Data

Growth rate values for gross fixed capital formation are used as a predicted
variable. This variable is constructed from three series of gross fixed capital
formation provided by Rosstat for different periods (from 1995Q1 to 2002Q4, from
2003Q1 to 2010Q4, and from 2011Q1 to 2019Q1) by calculating the
first log difference relative to the corresponding quarter of the previous year. This
approach allows the removal of seasonality in the series of gross fixed capital
formation. Moreover, it accounts for possible multiplicativity and structural
shifts in seasonality. The augmented Dickey-Fuller test reveals no non-
stationarity in the transformed series. After the transformation, each time series
loses its first four observations and starts from 1996Q1.
A set of 36 macroeconomic indicators of the Russian economy (37, together
with the predicted series itself) was used as predictors.
Unfortunately, many of the indicators frequently used in macroeconomic
forecasting appeared in Russia relatively late, in the mid-2000s. This includes
the well-known PMI survey indices, which reflect the sentiment of management
across various industries. If they were included in the data, then there would be
very few quarterly observation models available for learning and quality check,
so the author had to sacrifice the number of predictors in order to extend the
sample: the source values of all series used in the research are available starting
from at least 1995Q1. Where required, they were transformed to stationary form
by calculating the first log difference either relative to the previous quarter
(for series without seasonality) or relative to the corresponding quarter of the
previous year (for series with seasonality).
After all transformations, 1996Q1 was set as the earliest point for the data used
in the research. A full list of the variables, their sources and their methods of
transformation are given in Table 6 in the Appendix. Figure 1 in the Appendix shows
a graph of the standardised³ variables, in stationary form, used in the research,
along with the predicted series shown separately.
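
The two transformations can be sketched in R as follows (x is a hypothetical quarterly ts object):

    dx_q <- diff(log(x), lag = 1)  # vs the previous quarter (no seasonality)
    dx_y <- diff(log(x), lag = 4)  # vs the same quarter of the previous year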

2.5. Generating forecasts

In this research, several specifications were examined for each model.


We analysed two start dates (the left-hand boundaries of the training
sample): 1996Q1 and 2000Q1. The first date was chosen only because many
macroeconomic indicators were unavailable before 1995, and after the
transformations performed with the data, the minimum available date shifts to
1996Q1. The second date was chosen for the following reasons: we can presume
that adding information for the period that preceded the 1998 crisis (for 2000Q1,
the earliest data used for learning refers to 1999Q1) does not improve the quality
of the models. If this hypothesis receives indirect proof, this may also be useful
for further research.
For each specification, the regularisation models were trained on the data
between the left-hand and right-hand boundaries of the training sample, where
the right-hand boundary takes values from 2012Q1 to 2018Q4,
separately for each forecasting horizon h of 1–8 quarters.
In this case, for models with h = 1, the end date for the training sample is
2018Q3; for models providing forecasts for 2 quarters, this is 2018Q2, etc.; for
models providing forecasts for 8 quarters, this is 2016Q4. Therefore, the range of
the training sample extended from 65 to 84 (h = 8) and 91 (h = 1) observations for
the first start date (1996Q1), and from 49 to 68 (h = 8) and 75 (h = 1) observations
for the second start date (2000Q1).
After training was complete, out-of-sample forecasts were generated for each
training sample on the horizon of h = 1, 2 … , 8 quarters for the first observation
following the training sample, and for the new iteration, the boundary of the
training sample shifted one observation ahead. Therefore, to obtain the forecast
for quarter t + h, only the values of the variables in quarter t were used.
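
Schematically, the exercise looks as follows in R; here lm() is only a stand-in for any of the models considered, X and y_h are hypothetical objects (y_h being the target already shifted h quarters ahead, so that row t of X is paired with the value forecast for t + h), and t0, t1 index the boundaries 2012Q1 and 2018Q4:

    preds <- numeric(0)
    for (t in t0:t1) {
      df_t  <- data.frame(y = y_h[1:t], X[1:t, ])       # training sample
      fit   <- lm(y ~ ., data = df_t)                   # stand-in model
      preds <- c(preds,                                 # forecast for t + 1
                 predict(fit, newdata = data.frame(X[t + 1, , drop = FALSE])))
    }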
Parameters of the penalty in corresponding regularisation models were
selected using cross-validation in the training sample for each individual
specification. Cross-validation inside the training sample was performed in a
fixed sliding window of 40 quarters with increments of 1.
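
A sketch of such cross-validation for the LASSO penalty is given below (glmnet is one of the packages listed at the end of this section; X and y are assumed to be already aligned for the horizon in question, and the penalty grid is illustrative):

    library(glmnet)
    window  <- 40                                  # fixed sliding window
    lambdas <- 10^seq(1, -3, length.out = 50)      # candidate penalties
    cv_err  <- sapply(lambdas, function(lmb) {
      errs <- sapply(seq_len(nrow(X) - window), function(s) {
        idx <- s:(s + window - 1)                  # window slides by 1
        fit <- glmnet(X[idx, ], y[idx], alpha = 1, lambda = lmb)
        (y[s + window] - predict(fit, X[s + window, , drop = FALSE]))^2
      })
      mean(errs)
    })
    best_lambda <- lambdas[which.min(cv_err)]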
The methodology of forecast generation for random forest and boosting models
is not much different from that of forecast generation for regularisation models,
except for the fact that cross-validation was not used for selecting parameters.

³ In this figure, standardisation is only required for convenience of graphic representation. In the
course of model estimation, standardisation of data was performed only for regularisation methods;
it is not required for ensemble methods.
Random forest parameters were selected as follows. The number of variables
available for the development of a single decision tree, p₀, is set at the level
recommended by Liaw and Wiener (2002) for regression tasks, p₀ = p/3,
where p is the number of variables. Thus, p₀ = 12. When selecting the number
of trees N, the following guidelines must be observed. With a small number of
trees, we cannot be sure that all observations are used in the model; however,
as the number of trees increases, the computational complexity also grows.
Four different values for N are used in this research work: 100, 500, 1000, 2000.
As shown below, the quality of the results for these four models is similar,
and none of them dominates across all the specifications. However, the
difference between N = 1000 and N = 2000 is almost imperceptible, so a further
increase in N is pointless. This research uses the standard stopping criterion for tree
development, which is recommended by the authors of the method. A node is
recognised as a terminal node if further splitting results in one of the child nodes
containing less than 5 observations.
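
In terms of the randomForest package, these choices correspond to a call of roughly the following form (X_train, y_train and X_test are hypothetical; note that randomForest draws the p₀ candidate variables at each split):

    library(randomForest)
    fit <- randomForest(x = X_train, y = y_train,
                        ntree    = 500,  # N: 100, 500, 1000 or 2000
                        mtry     = 12,   # p0 = p/3 (Liaw and Wiener, 2002)
                        nodesize = 5)    # minimum size of terminal nodes
    y_hat <- predict(fit, newdata = X_test)  # average over all trees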
Let us proceed with the boosting parameters. The standard value for learning
speed is set at η = 0.3. Though this value usually results in quite a conservative
model, specifications with η = 0.4, η = 0.2 and η = 0.1 are additionally tested. It is
not possible to unambiguously select the best parameter value across the various
specifications.
The parameter N is responsible for the number of boosting iterations. Usually,
when N is selected, one stops at a level after which the model predictions no
longer change. For the purposes of this work, setting N = 100 was sufficient.
The development of benchmark models is worth a special mention. Forecasts
made by the random walk (h steps ahead) model were merely the current values
of the predicted series.
To build the autoregression model, we first selected the number of lags used
for each specification (usually no greater than 2) by means of AIC. After this, we
estimated the coefficients and generated a recursive forecast 8 steps ahead, i.e. we
performed sequential estimation of all 8 predicted values instead of estimating
8 individual equations. As we can see in Faust and Wright (2013), this approach
ensures more precise forecasts.
The quality metric used in the research is RMSFE (Root Mean Squared
Forecast Error):

RMSFE = √((1/T)·∑_{t=1}^{T} (y_t − ŷ_t)²)   (15)
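
In R, this metric is a one-liner (y and y_hat being vectors of realised and forecast values):

    rmsfe <- function(y, y_hat) sqrt(mean((y - y_hat)^2))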

This metric can often be seen when solving regression tasks (it is used, for
instance, in Fokin and Polbin, 2019; Baybuza, 2018). All calculations⁴ were
performed in R using the following packages:
– glmnet for Ridge, LASSO, Post-LASSO, Adaptive LASSO, Elastic Net,
– spikeslab for Spike and Slab,
– randomForest for random forest,
– forecast for the AR model,
– xgboost for the boosting model.

⁴ The code for calculation is available at https://github.com/mkhlgrv/investment_forecasting/

3. Results and discussion

3.1. Quality of models

After generating the forecasts, RMSFE values were calculated for each model.
These values, relative to the quality indicator of the naïve
forecast (the random walk process), for forecasting horizons of 1–8 quarters are listed
in Table 1 for the left-hand boundary of the training sample at 1996Q1, and in
Table 2 for the left-hand boundary of the training sample at 2000Q1.

Table 1. RMSFE (1996Q1 is used as the left-hand boundary of the training sample)

Model h=1 h=2 h=3 h=4 h=5 h=6 h=7 h=8


Random walk 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
AR 0.92 0.85 0.79 0.73 0.68 0.67 0.68 0.61
Adaptive LASSO 0.99 0.91 0.79 0.84 0.75 0.77 0.61 0.50
Elastic Net 1.00 0.85 0.84 0.80 0.76 0.74 0.69 0.56
LASSO 1.01 0.89 0.83 0.81 0.82 0.72 0.71 0.55
Post-LASSO 1.04 0.95 0.80 0.93 0.79 0.78 0.72 0.54
Ridge 0.98 0.88 0.88 0.87 0.81 0.79 0.70 0.61
Spike and Slab 1.00 0.84 0.82 0.79 0.77 0.74 0.63 0.58
Boosting (η = 0.1) 0.91 0.74 0.81 0.66 0.67 0.53 0.45 0.62
Boosting (η = 0.2) 0.87 0.76 0.79 0.71 0.68 0.46 0.49 0.59
Boosting (η = 0.3) 0.96 0.79 0.69 0.71 0.73 0.52 0.53 0.60
Boosting (η = 0.4) 1.01 0.70 0.82 0.69 0.61 0.54 0.48 0.59
Random forest (N = 100) 0.90 0.72 0.75 0.65 0.70 0.56 0.54 0.57
Random forest (N = 500) 0.88 0.70 0.75 0.66 0.66 0.58 0.53 0.56
Random forest (N = 1000) 0.90 0.70 0.76 0.66 0.66 0.58 0.55 0.55
Random forest (N = 2000) 0.90 0.70 0.75 0.66 0.67 0.58 0.54 0.56

Random forest and boosting models almost always produce the best results
for the two different start dates in the training sample. This was to be expected;
poorly interpretable machine learning methods often demonstrate better results in
terms of forecast quality in comparison with regularisation methods. For instance,
this pattern was seen in Baybuza (2018), where random forest and boosting
methods turned out to be the winners for almost all specifications in the course
of inflation forecasting, or in Kvisgaard (2018), where regularisation methods
performed worse than other machine learning methods in the context of GDP
and inflation forecasting. Selecting different values for the N parameter produces
an insignificant deviation in the forecasting quality of the random forest model,
which allows us to assert that a further increase in the number of decision trees
will not improve the quality of predictions.
It is worth noting that the use of LASSO modifications is not justified:
Adaptive LASSO and Post-LASSO do not offer better forecasting quality as
compared to the ‘parent’ model. The Bayesian approach to regularisation in the
context of this research did not provide any significant improvements to the
forecasting quality. Interestingly, the Elastic Net method, which in theory uses
the advantages of both Ridge and LASSO, is quite often worse than at least one
of these models. However, they are all quite similar in terms of forecasting errors.
Overall, all the tested models almost always demonstrate at least the same quality
as the random walk model. However, when making a forecast for a horizon of one
quarter, many models perform worse than the naïve forecast.
The second benchmark, the autoregression model, performs better than some
regularisation methods if the full data set is used. That said, it is always worse than
other models if the left-hand boundary of the training sample is established at 2000Q1.

Table 2. RMSFE (2000Q1 is used as the left-hand boundary of the training sample)

Model h=1 h=2 h=3 h=4 h=5 h=6 h=7 h=8


Random walk 1 1 1 1 1 1 1 1
AR 1.00 1.09 1.20 1.17 1.16 1.13 1.02 0.88
Adaptive LASSO 1.00 0.64 0.97 0.84 0.89 0.89 0.63 0.63
Elastic Net 0.78 0.62 0.96 0.76 0.81 0.82 0.67 0.59
LASSO 0.91 0.58 0.99 0.74 0.88 0.83 0.66 0.57
Post-LASSO 0.94 0.72 1.03 0.91 1.04 0.98 0.75 0.63
Ridge 0.88 0.64 0.79 0.78 0.76 0.79 0.68 0.66
Spike and Slab 0.85 0.64 0.80 0.80 0.91 0.83 0.66 0.60
Boosting (η = 0.1) 0.97 0.64 0.70 0.60 0.68 0.59 0.60 0.44
Boosting (η = 0.2) 0.90 0.63 0.65 0.58 0.69 0.63 0.63 0.57
Boosting (η = 0.3) 1.10 0.69 0.66 0.66 0.63 0.59 0.62 0.52
Boosting (η = 0.4) 1.08 0.66 0.72 0.56 0.67 0.66 0.56 0.50
Random forest (N = 100) 0.75 0.62 0.73 0.62 0.67 0.63 0.62 0.56
Random forest (N = 500) 0.81 0.61 0.73 0.64 0.67 0.63 0.61 0.58
Random forest (N = 1000) 0.78 0.62 0.72 0.63 0.66 0.62 0.59 0.57
Random forest (N = 2000) 0.78 0.62 0.73 0.64 0.66 0.62 0.59 0.58

Tables 1 and 2, which demonstrate the quality of the models, may be complemented
by the Diebold-Mariano test, described in Diebold and Mariano (1995, 2002), in
order to compare the models pairwise. Since the size of test samples is quite
small (17–25 observations), the statistics for the Diebold-Mariano test adjusted for
small samples were used, as suggested by Harvey et al. (1997). The null hypothesis
of the test implies that two models have the same quality of forecasts. An alternative
hypothesis for each specification is single-sided, i.e. it implies that the model with
the least RMSFE across the sample produces higher-quality forecasts.
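
The forecast package provides this adjusted test directly; below, e1 and e2 are hypothetical vectors of forecast errors from the two competing models:

    library(forecast)
    # One-sided test that the model behind e1 is more accurate than the
    # one behind e2 at horizon h; dm.test applies the small-sample
    # correction of Harvey et al. (1997).
    dm.test(e1, e2, alternative = "less", h = 4, power = 2)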
Figures 3 and 4 in Appendix demonstrate the results of Diebold-Mariano tests
for different models trained based on the same sample and having the same forecast
horizon. The colour green in the figures means that the model corresponding to
the row is better than that corresponding to the column. The colour yellow denotes
superiority of the model corresponding to the column. The lighter the colour,
the greater the 𝑝-value is according to Diebold-Mariano test statistics, and hence,
the smaller the difference between the two forecasts. If the 𝑝-value is higher
than 0.1, the difference between the two models is statistically insignificant, which
is indicated by the colour white in the figures. It is clear that the location of the
yellow and green cells in the square is symmetrical in relation to the main diagonal.
It can be seen that as the forecasting horizon increases (with h > 4), the forecasts
produced by the models are more and more different from each other, and ensemble
methods show superiority over regularisation methods more frequently. However,
even with h = 8, the forecasts produced by almost all models are virtually identical
statistically. The only exceptions are the AR model, whose quality deteriorates
dramatically when transitioning to the trimmed sample, and the random walk model,
which falls behind the majority of models across most forecasting horizons.
Despite the fact that the quality of machine learning methods increases as
the forecasting horizon increases compared to random walk methods, the ability
to retrieve useful information from the data seems to worsen. When average
horizons are used (h = 4, … , 6), all ensemble methods are dramatically better
than regularisation methods. Nonetheless, in the case of two-year forecasting, this
advantage becomes statistically imperceptible. In fact, this means that the data
contain insufficient information for the models to generate two-year forecasts
that are significantly different from the merely average level.
Figures 3 and 4 in Appendix also show how transitioning to a trimmed sample
influences various methods: high-quality models (boosting and random forest)
become even better, while the quality of regularisation methods, which are on the
whole worse, does not show the same improvement rate, or even deteriorates.
Figure 2 in Appendix shows out-of-sample forecasts on horizons up to one year
(h ≤ 4), from 2013Q1 to 2019Q1, for the trimmed data set. Each separate forecast
line on the graph corresponds to a single model training session. The difference
in the quality of the models in terms of RMSFE is also confirmed by the graph:
it looks like boosting and random forest are better at reproducing the shifts in
the forecasted series. The graph displays the forecasts produced by boosting and
random forest models for a single specification only, because the forecasts
produced by the same model under different parameter values are visually similar
to each other. The graph does not display any random walk forecasts because they are
trivial and are obtained by means of shifting observations to the right.

3.1.1. Two data sets


Tables 1 and 2 show that the data from the pre-crisis period almost never help
to improve the forecasts. Why is that? This is due to the fact that the pre-crisis
observations are very volatile and their variable values are atypical of subsequent
observations; as a result, the models trained on them produce
lower-quality forecasts. Figure 1 in the Appendix
demonstrates the values of all variables used in the research. We can see that data
variability declines from the beginning of the century (as expected, the series
become volatile again during the crisis periods of 2008–2009 and 2014–2015).
The Diebold-Mariano test may be used to perform a more rigorous check of
this forecast improvement hypothesis.
Table 3 displays changes in RMSFE relative to the random walk model
when transitioning to the trimmed sample, and the brackets enclose the values
of Diebold-Mariano statistics for checking the hypothesis of the congruence
between two models. The alternative hypothesis is single-sided, i.e. it implies that
the model with the smallest RMSFE indicator demonstrates the highest quality.
Machine learning models that were trained based on the trimmed data produce
forecasts at least as good as similar models based on the enlarged data set for the one-
year horizon (h ≤ 4). The only exception is the boosting model, which produces
significantly worse forecasts with h = 1.

Table 3. Diebold-Mariano test: two samples

Model h=1 h=2 h=3 h=4 h=5 h=6 h=7 h=8


AR ‒0.082 ‒0.243. ‒0.413* ‒0.441** ‒0.477** ‒0.458** ‒0.337* ‒0.271*
(0.195) (0.081) (0.018) (0.002) (0.002) (0.008) (0.021) (0.021)
Adaptive LASSO ‒0.004. 0.264* ‒0.181. ‒0.002 ‒0.144* ‒0.121 ‒0.013 ‒0.128
(0.080) (0.025) (0.073) (0.186) (0.015) (0.461) (0.490) (0.126)
Elastic Net 0.218 0.230* ‒0.116 0.041 ‒0.050 ‒0.072 0.027 ‒0.037
(0.412) (0.046) (0.128) (0.453) (0.421) (0.379) (0.249) (0.250)
LASSO 0.104 0.309* ‒0.153 0.074 ‒0.056 ‒0.102 0.055 ‒0.021
(0.472) (0.037) (0.113) (0.343) (0.403) (0.421) (0.130) (0.288)
Post-LASSO 0.105 0.229. ‒0.229 0.022 ‒0.243* ‒0.201 ‒0.025 ‒0.095
(0.298) (0.054) (0.126) (0.176) (0.014) (0.325) (0.298) (0.128)
Ridge 0.098 0.240* 0.087 0.092 0.041 ‒0.005 0.025 ‒0.051
(0.365) (0.050) (0.246) (0.189) (0.162) (0.249) (0.484) (0.217)
Spike and Slab 0.142 0.207. 0.022 ‒0.013 ‒0.139* ‒0.083 ‒0.024 ‒0.021
(0.479) (0.063) (0.302) (0.418) (0.045) (0.381) (0.292) (0.442)
Boosting (η = 0.1) ‒0.058 0.100. 0.113 0.066. ‒0.011 ‒0.054. ‒0.147** 0.178*
(0.109) (0.083) (0.107) (0.069) (0.421) (0.083) (0.006) (0.034)
Boosting (η = 0.2) ‒0.033 0.131. 0.139* 0.132 ‒0.011 ‒0.162* ‒0.136** 0.024
(0.175) (0.062) (0.023) (0.128) (0.493) (0.050) (0.006) (0.134)
Boosting (η = 0.3) ‒0.136* 0.097 0.022 0.047. 0.099 ‒0.064 ‒0.091 0.082
(0.042) (0.220) (0.224) (0.055) (0.112) (0.117) (0.120) (0.114)
Boosting (η = 0.4) ‒0.067 0.048 0.092 0.126. ‒0.061. ‒0.118* ‒0.081 0.091
(0.102) (0.207) (0.170) (0.089) (0.079) (0.021) (0.325) (0.185)
Random forest (N = 100) 0.156 0.096* 0.024 0.026 0.032 ‒0.071 ‒0.076 0.009
(0.269) (0.041) (0.275) (0.124) (0.203) (0.269) (0.219) (0.147)
Random forest (N = 500) 0.074 0.096* 0.026 0.019. ‒0.011 ‒0.052 ‒0.071. ‒0.020
(0.448) (0.036) (0.353) (0.093) (0.392) (0.200) (0.079) (0.356)
Random forest (N = 1000) 0.114 0.075* 0.034 0.029. ‒0.005 ‒0.044 ‒0.047 ‒0.019
(0.308) (0.040) (0.279) (0.093) (0.486) (0.224) (0.256) (0.369)
Random forest (N = 2000) 0.124 0.083* 0.027 0.027 0.008 ‒0.044 ‒0.040 ‒0.013
(0.227) (0.037) (0.337) (0.101) (0.385) (0.213) (0.199) (0.335)

Note: the table shows change in RMSFE relative to the random walk model when switching to the trimmed sample.
Inside the brackets is the p-value of Diebold-Mariano statistics (H 0: increasing the training sample by adding data from
before the 1998 crisis does not affect the quality of the models). p-value: *** ≤ 0.001 < ** ≤ 0.01 < * ≤ 0.05 < . ≤ 0.1.
With regard to two-year forecasts, the quality of several models also
deteriorates with h = 5, 6, but neither of the two specifications is consistently
superior to the other. However, considering that an increase in h, i.e. in the gap
between the date on which and the date for which the forecast is made, naturally
reduces the informativeness of predictions, we can posit that removing the
pre-1998 observations, which are atypical of subsequent periods, makes it possible
to produce more robust and adequate results when generating forecasts for at
least one year ahead.
Out of the regularisation methods, the Ridge model was the most robust with
respect to sample trimming: either the deterioration in quality was insignificant
(h = 6, 8), or the quality actually improved. This was to be expected, considering
that Ridge produces quite robust estimates.
However, the autoregression model shows significant deterioration with
a smaller sample. It seems that the effect of insufficient information in the course of
sample trimming in this case exceeds the effect of adding atypical observations
to the sample.

3.1.2. Selection of variables in the LASSO model


As we have already mentioned above, among the methods considered, only the
LASSO-based ones allow the selection of the most important predictors. Despite
the fact that LASSO did not perform well in the forecasting experiment for this
research, the results of selecting the relevant predictor variables turned out to be
interpretable and consistent with the basic theoretical idea of what the investment
function may look like. In this regard, it is important to briefly introduce the
results of such selection.
Let us start with showing the number of variables that were selected by
LASSO for various forecasting horizons and various training samples. Figure 5
in Appendix shows the number of LASSO-selected variables for the two different
left-hand boundaries of the training sample.
The number of selected variables is quite unstable, which, according to Zou and
Hastie (2005) and De Mol et al. (2008), is typical of LASSO models. In this case, we
can see that for h = 3, 5, the number of selected variables is more robust for the models
that were trained based on the data from 1996Q1 (based on the larger sample), and
LASSO shows quality deterioration for these horizons if the sample is reduced (see
Table 3). For h = 2, on the contrary, the number of coefficients in the model that was
trained based on a trimmed sample does not show as much fluctuation as the model
trained based on the larger sample. This is consistent with the statistically significant
improvement in model quality in the case of sample trimming.
However, what predictors are actually selected? Tables 4 and 5 show the
five largest (in absolute value) standardised coefficients of the LASSO model
for various forecasting horizons with the start date of
2000Q1. The coefficients are standardised, so their values do not have a direct
interpretation, but the greater the predictor's contribution to the changes in
investment, the greater the absolute value of the coefficient.
In the case of generating forecasts for a horizon of one quarter, the major
predictors are the current GDP and the current lag of the explained variable,
which is consistent with accelerator theory as described by Clark (1917) and
Guitton (1955). According to this theory, there is an optimal ratio μ of capital to
output, and companies aiming to maximise their profit strive to achieve this ratio.
Adjustment occurs with lags, so the current level of investments must depend not
only on the current output, but also on previous investments and previous output.
Key predictors for the remaining horizons h can be found in Tables 4 and 5;
however, forecast interpretability seems to decrease as the horizon increases. The
most important variables for mid-term forecasting are CPI and accounts payable.

Table 4. Primary predictors in the LASSO model for h = 1, … , 4
(2000Q1 as the left-hand boundary of the training sample)

h=1 h=2 h=3 h=4


1 GDP in constant prices M2 (as of the end of the Index of real retail trade Consumer price index
0.046 quarter) turnover (CPI)
0.023 0.228 ‒0.292
2 Gross fixed capital Household real money Accounts payable Accounts payable
formation income index (average per quarter) (average per quarter)
0.035 0.023 ‒0.059 ‒0.083
3 Household real money Real salary index M2 (as of the end of the Real salary index
income index ‒0.015 quarter) ‒0.068
0.024 0.052
4 M2 (as of the end Construction price index Real salary index Household real money
of the quarter) 0.014 ‒0.042 income index
0.010 0.056
5 Construction price index Import Share of gross fixed Construction price index
0.007 0.016 capital formation in GDP ‒0.051
(nominal value)
‒0.036

Table 5. Primary predictors in the LASSO model for h = 5, … , 8
(2000Q1 as the left-hand boundary of the training sample)

h=5 h=6 h=7 h=8


1 Consumer price index Consumer price index Consumer price index Accounts payable
(CPI) (CPI) (CPI) (average per quarter)
‒0.022 ‒0.04 ‒0.019 ‒0.013
2 Accounts payable Accounts payable Accounts payable Accounts receivable
(average per quarter) (average per quarter) (average per quarter) (average per quarter)
‒0.007 ‒0.01 ‒0.011 0.009
3 Household real money Construction price index Construction price index Consumer price index
income index ‒0.003 ‒0.006 (CPI)
0.003 0.007
4 Construction price index Accounts receivable Accounts receivable Index of real retail trade
‒0.003 (average per quarter) (average per quarter) turnover
0.003 0.006 0.004
5 Overdue accounts Industrial producer Share of gross fixed Industrial producer
receivable prices index capital formation in GDP prices index
(average per quarter) ‒0.002 (nominal value) ‒0.004
0.001 ‒0.003

It is worth paying attention to the variable of the share of gross fixed capital
formation in GDP, which is important for several horizons with a negative sign.
The sign of this variable means that if, in some quarter, the growth of investments
is higher than the growth of GDP, so that the share of investments in GDP grows,
then a significant negative effect on the growth of the forecasted variable is
expected several quarters ahead.
Despite the fact that the coefficients and their signs selected by LASSO do
not contradict economic theory, the set of predictors is generally not stable.
The same picture is observed for the LASSO modifications. This instability
probably explains why the forecasts they produce are inferior in quality to those
of the ensemble methods.

3.2. Comparison with the MED forecast

The MED publishes extensive forecasts of social and economic development
indicators⁵ each autumn. The author compares the forecasting results produced
by his own models with the forecasts of gross fixed capital formation growth
rates released by the MED for the following year. The MED forecasts changes in
percentage terms, so the author's forecasts were also re-calculated as percentage
changes relative to the previous year. Thus, Q3⁶ of year t was used as the
forecast date for year t + 1, i.e. the forecast of annual change was calculated
based on the forecasts for h = 2 (corresponding to Q1 of year t + 1) through h = 5
(corresponding to Q4 of year t + 1).
The earliest date from which out-of-sample annual forecasts are available is
2013Q3, and thus comparison with the MED forecast is possible starting from 2014.
Figure 6 in Appendix shows the forecasts issued by the MED and the author's
models. Of all the specifications of ensemble methods, only one specification
per method is displayed. The forecasts produced with different parameter values
selected are visually similar to each other. We can see that, on the whole, neither
of the two forecasting approaches is an undisputed winner; however, boosting and
random forest models are almost always close to the actual values of the change in
investments, which is consistent with their superior quality in terms of RMSFE.

4. Conclusion

In the research, forecasts for the gross fixed capital formation index in Russia
were generated using a range of machine learning methods and an extensive set
of predictors. The highest quality is demonstrated by ensemble methods (random
forest and boosting), which outperform both simple reference models and
regularisation methods. This is consistent with the literature on macroeconomic
forecasting. Using an enlarged data set that includes observations for the period
before the 1998 crisis almost never allows the improvement of forecasts as compared
with the trimmed data set. The relatively low quality of the forecasts produced by
regularisation methods corresponds to the instability of the specifications they
produce over time. According to the results of comparing forecasts obtained by the
author and by the MED, some of the machine learning models yield much higher-
quality forecasts of short-term changes in investments.
A promising area for further research could be to extend the set of forecasting
models and perform nowcasting of investments taking into account the non-
simultaneity in the release of different economic statistics data (the so-called 'ragged
edge' issue).

⁵ MED forecasts are available at https://economy.gov.ru/material/directions/makroec/
⁶ The forecast date (Q3) is established so as to ensure that the information used to train the
forecasting models roughly corresponds to the information available as of the date of publication of the
MED forecast (this usually takes place in November). For the purpose of clarity, the author assumes
that the values of all predictors in Q3 are published no later than in Q4.

Appendix is available at
http://rjmf.econs.online/en;
dx.doi.org/10.31477/rjmf.202001.35

References
Aganbegyan, A. (2016). Reduction of Investments – Death for the Economy, Investment
Growth – its Salvation. Economicheskie Strategii, 18(4), pp. 74–83. [In Russian].
Available at: http://www.inesnet.ru/article/sokrashhenie-investicij-gibel-dlya-
ekonomiki-podem-investicij-ee-spasenie/ [accessed on 11 February 2020].

Bai, J. and Ng, S. (2008). Forecasting Economic Time Series Using Targeted Predictors.
Journal of Econometrics, 146(2), pp. 304–317.
Baybuza, I. (2018). Inflation Forecasting Using Machine Learning Methods. Russian
Journal of Money and Finance, 77(4), pp. 42–59. doi: 10.31477/rjmf.201804.42
Belloni, A. and Chernozhukov, V. (2011a). High Dimensional Sparse Econometric
Models: An Introduction. In: P. Alquier, E. Gautier and G. Stoltz, eds., Inverse
Problems and High-Dimensional Estimation. Springer Verlag Berlin Heidelberg,
pp. 121–156.

Belloni, A. and Chernozhukov, V. (2011b). l1-penalized Quantile Regression
in High-dimensional Sparse Models. Annals of Statistics, 39(1), pp. 82–130.
doi: 10.1214/10-AOS827
Clark, J. M. (1917). Business Acceleration and the Law of Demand: A Technical Factor
in Economic Cycles. Journal of Political Economy, 25(3), pp. 217–235.
De Mol, C., Giannone, D. and Reichlin, L. (2008). Forecasting Using a Large Number
of Predictors: Is Bayesian Shrinkage a Valid Alternative to Principal Components?
Journal of Econometrics, 146(2), pp. 318–328.
Diebold, F. X. and Mariano, R. S. (1995). Comparing Predictive Accuracy.
Journal of Business and Economic Statistics, 13(3), pp. 253–263.
doi: 10.1080/07350015.1995.10524599
Diebold, F. X. and Mariano, R. S. (2002). Comparing Predictive Accuracy.
Journal of Business and Economic Statistics, 20(1), pp. 134–144.
doi: 10.1198/073500102753410444
Faust, J. and Wright, J. H. (2013). Forecasting Inflation. In: Elliott G. and Timmerman A.,
eds., Handbook of Economic Forecasting, Vol. 2. North-Holland: Elsevier, pp. 2–56.
Fokin, N. and Polbin, A. (2019). Forecasting Russia's Key Macroeconomic Indicators with
the VAR-LASSO Model. Russian Journal of Money and Finance, 78(2),
pp. 67–93. doi: 10.31477/rjmf.201902.67
Friedman, J. H. (2001). Greedy Function Approximation: A Gradient Boosting Machine.
Annals of Statistics, 29(5), pp. 1189–1232.
Guitton, H. (1955). Koyck (L.M.) – Distributed Lags and Investment Analysis.
Revue Économique, 6(1), pp. 127–128.
Harvey, D., Leybourne, S. and Newbold, P. (1997). Testing the Equality of Prediction
Mean Squared Errors. International Journal of Forecasting, 13(2), pp. 281–291.
Hoerl, A. E. and Kennard, R. W. (1970). Ridge Regression: Biased Estimation For
Nonorthogonal Problems. Technometrics, 12(1), pp. 55–67.
doi: 10.1080/00401706.1970.10488634
Idrisov, G. and Sinelnikov-Murylev, S. (2014). Forming Sources of Long-run Growth:
How to Understand Them? Voprosy Ekonomiki, 3, pp. 4–20. [In Russian].
doi: 10.32609/0042-8736-2014-3-4-20
Ishwaran, H. and Rao, J. S. (2005). Spike and Slab Variable Selection:
Frequentist and Bayesian Strategies. Annals of Statistics, 33(2), pp. 730–773.
doi:10.1214/009053604000001147
Kudrin, A. and Gurvich, E. (2014). A New Growth Model for the Russian Economy.
Voprosy Ekonomiki, 12, pp. 4–36. [In Russian]. doi: 10.32609/0042-8736-2014-12-4-36
Kvisgaard, V. H. (2018) Predicting the Future Past. How Useful Is Machine Learning
in Economic Short-Term Forecasting? Master thesis, University of Oslo.
Li, J. and Chen, W. (2014). Forecasting Macroeconomic Time Series: LASSO-Based
Approaches and Their Forecast Combinations with Dynamic Factor Models.
International Journal of Forecasting, 30(4), pp. 996–1015.
Liaw, A. and Wiener, M. (2002). Classification and Regression by randomForest. R News,
2(3), pp. 18–22.
Oreshkin, M. (2018). Prospects of Economic Policy. Ekonomicheskaya Politika, 2018,
13(3), pp. 8–27. [In Russian]. doi: 10.18288/1994-5124-2018-3-01
Santosa, F. and Symes, W. W. (1986). Linear Inversion of Band-Limited Reflection
Seismograms. SIAM Journal on Scientific and Statistical Computing, 7(4),
pp. 1307–1330. doi: 10.1137/0907087
Stock, J. H. and Watson, M. (2011). Dynamic Factor Models. In: M. Clements and
D. Hendry, eds., The Oxford Handbook of Economic Forecasting. Oxford University Press.
Tibshirani, R. (1996). Regression Shrinkage and Selection Via the Lasso. Journal
of the Royal Statistical Society: Series B (Methodological), 58(1), pp. 267–288.
doi: 10.1111/j.2517-6161.1996.tb02080.x
Zou, H. (2006). The Adaptive Lasso and Its Oracle Properties. Journal of the American
Statistical Association, 101 (476), pp. 1418–1429. doi: 10.1198/016214506000000735
Zou, H. and Hastie, T. (2005). Regularization and Variable Selection via the Elastic Net.
Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2),
pp. 301–320. doi: 10.1111/j.1467-9868.2005.00503.x
Forecasting Inflation in Russia Using Neural Networks
Evgeny Pavlov, New Economic School
eepavlov@nes.ru

Forecasting Russian inflation is an important practical task. This
paper applies two benchmark machine learning models to this task.
Although machine learning in general has been an active area of
research for the past 20 years, those methods began gaining popularity
in the literature on inflation forecasting only recently. In this paper,
I employ neural networks and support-vector machines to forecast
inflation in Russia. I also apply Shapley decomposition to obtain
economic interpretation of inflation forecasts. The performance of
these two models is then compared with the performance of more
conventional approaches that serve as benchmark forecasts. These are an
autoregression and a linear regression with regularisation (a.k.a. ridge
regression). My empirical findings suggest that both machine learning
models forecast inflation no worse than the conventional benchmarks
and that the Shapley decomposition is a suitable framework that yields
a meaningful interpretation to the neural network forecast. I conclude
that machine learning methods offer a promising tool for inflation
forecasting.

Keywords: inflation forecast, machine learning, ridge regression,
neural networks, support-vector machines
JEL Codes: C53, E37

Citation: Pavlov, E. (2020). Forecasting Inflation in Russia
Using Neural Networks. Russian Journal of Money and Finance,
79(1), pp. 57–73.
doi: 10.31477/rjmf.202001.57

1. Introduction
Inflation is an important macroeconomic variable that economic agents
take into account in their decisions. There are a number of various long-term
obligations, such as salaries or loan rates for example, which are generally
expressed in nominal prices. Therefore, both households and companies need to
forecast inflation accurately. Inflation forecast is also an important input to the
monetary policy rule.
One of the key tasks faced by central banks is maintaining price stability,
i.e. low and predictable inflation rates. Over the last three decades, in an effort
to maintain low and stable inflation, the central banks of many advanced and
developing economies have been adopting, in one form or another, the policy of
inflation targeting. The Bank of Russia completed the transition to the inflation
targeting policy regime in December 2014. The effectiveness of this policy
framework depends heavily on trust with regard to monetary authorities on the
part of economic agents. To maintain the trust, central banks regularly announce
their forecasts of inflation and other macroeconomic indicators and explain what
actions they will take under various possible scenarios. Consequently, the need
arose to create a sufficiently reliable and accurate forecasting method. This paper
attempts to expand the existing toolkit.
A number of studies show that short-term inflation forecasting (up to two
years) is best carried out using simple methods, which are based on the time series
of inflation alone (for example, Unobserved Component – Stochastic Volatility
Model, UC-SV) and do not require collecting a large number of potential external
predictors such as, for instance, the unemployment rate (Stock and Watson, 2008;
Faust and Wright, 2013). There are certain restrictions involved in forecasting
for Russia that are based on a large number of predictors. Although using a
large number of different macroeconomic indicators in forecasting models is
potentially justified, the accessibility of data significantly restricts the set of time
series considered. Thus, in my research, I use only 10 macroeconomic indicators
that reflect the state of the Russian economy and its various sectors. Restricted
by data availability, the monthly time series total no more than
approximately 200 observations.
A high ratio between the number of predictors and observations leads to
a high chance of model overfitting. In this context, overfitting implies that the
algorithm is fitted to random short-term patterns within a training sample that
are not typical for the general population. In other words, a low forecasting error
demonstrated by the in-sample model is not preserved if this method is used
out-of-sample. This problem is one of the key drawbacks of the models with large
number of predictors and, particularly, when such models are used for inflation
forecasting.
There are various ways of addressing the problem of overfitting. One of the
approaches involves selecting variables based on their theoretical significance.
However, this strategy does not necessarily work well over time: the
forecasting value of different predictors may change with economic
conditions. Using this method for other countries or time frames may
impair the forecasting power. In addition, such a selection of predictors hinders
the creation of a universal forecasting method for different forecast horizons.
The predictors that have a high forecasting content with regard to prices over two
months are likely to differ substantially from those most valuable over two years.
Accordingly, this approach is more focused on obtaining good results for a specific
sample, rather than on creating a universal algorithm with high accuracy.
The overfitting problem is a serious challenge faced by the machine
learning (ML) developers. Over the last few decades, various ML models have
been created: neural networks, decision trees and random forests, support-vector
machines (SVM), boosting, etc. These models have become widely used in all fields
of data processing and have a high development potential. As far as economics is
concerned, it should be noted that ML models can also demonstrate promising
results with regard to panel data and time series forecasting. Not surprisingly,
with the development of models and growth in their popularity, they started to
be applied in the forecasting of macroeconomic indicators. A number of research
papers on using ML for inflation forecasting in other countries have already been
published (Chakraborty and Joseph, 2017), and two papers are dedicated to the
application of these methods in Russia (Andreev, 2016; Baybuza, 2018).
Therefore, this paper examines the applicability of ML methods for inflation
forecasting in Russia, in particular neural networks. In this research I use a neural
network with one hidden layer. To assess model quality, the neural network
output is compared with a standard econometric benchmark, an autoregression of
order one (AR(1)), and with the simplest ML method, ridge regression.
It should be noted that, due to its high complexity, a model such as a
neural network is heavily criticised because its results are hard to interpret.
The model determines important predictors, but the researcher analysing the
results cannot use economic logic to determine the ways these predictors affect
the predicted variable. The researcher's inability to tell what is going on inside the
applied model also leads to additional criticism of neural networks. As it has been
noted in Kleinberg et al. (2015), ‘we built them, but we don't understand them.’
To address this problem, I integrate the Shapley decomposition method
with ML algorithms (Joseph, 2019). This approach makes it possible to present
the results of complex non-linear models in the form customary for a linear
regression without losing their significance: the higher the absolute value of the
predictor coefficient is, the greater impact it has on the predicted variable. Using
and analysing the effectiveness of this interpretation method is also one of the
goals of this study.

2. Literature review

To forecast inflation using neural networks, it is crucial to understand not
only the ML methods, but also the conventional econometric approaches to
inflation forecasting. In their recent survey, Faust and Wright (2013) summarise
and compare all the models currently used for inflation forecasting in practice.
The authors analyse 17 different models, including autoregression model (AR),
UC-SV, random walk, Phillips curve, vector autoregression, Bayesian model
averaging, etc. Inflation is forecasted in pseudo-real time within the expanding
window, starting from the current quarter and on a quarterly basis to a
maximum of eight quarters. Root Mean Square Error (RMSE) versus the value
of the forecast error in the reference AR(1) model is used as a quality criterion.
The authors have found that the highest forecasting power is demonstrated by
the unobserved component model that incorporates information from inflation
expectations surveys (primarily the Survey of Professional Forecasters, SPF).
Moreover, simple methods such as random walk or AR have proved adequate
in inflation forecasting. This paper confirms that complex multi-factor models
systematically come short of the classic one-factor models, which are based on
the inflation time series alone. Following Faust and Wright (2013), I use AR(1)
as a benchmark.
The paper of Chakraborty and Joseph (2017) provides a review of the
benchmark ML models applied in addressing certain practical tasks faced by
central banks in the field of macroeconomic forecasting. Inflation forecasting in
England is studied here as one of the empirical applications of ML. The authors
make a forecast with a medium-term horizon of eight quarters. Benchmark
macroeconomic indicators (GDP, unemployment rate, money supply, money
market rate and other indicators) are used as predicting variables within the
window from the start of 1988 until late 2015. All models are trained within
the initial window from 1990Q1 to 2004Q4, with further quarterly expansion
up until the end of 2013. Mean absolute error (MAE) is used as a measure of
quality. The authors also separately consider the value of the forecast error
before and after the 2008 crisis. The research paper covers the following models:
Bayesian classifier, nearest neighbours model, decision tree and random forest,
neural networks, SVM and various clustering methods. AR(1) is used as a
baseline model. The authors have found that the combined method of SVM and
neural networks delivers superior forecasts before the crisis, while the random
forest method yields the best forecast after the crisis. These models have not only
demonstrated better results as compared to other ML methods, but have also
surpassed traditional benchmarks such as AR(1) in terms of forecast quality.
Joseph (2019) applies the Shapley decomposition for conceptual
interpretation of macroeconomic forecasts obtained by ML methods. The author
considers the Shapley value model and its adaptivity to complex non-linear ML
methods. The Shapley value is a linear mapping, which makes it possible to
preserve the value of ML regression coefficients. To train the neural networks,
SVM and random forest, the author uses quarterly data for the United Kingdom
and the United States, covering the key macroeconomic indicators over 62
and 52 years, respectively. OLS regression with L2 regularisation is used as a
benchmark. RMSE serves as a measure of quality. In most cases, ML methods
have largely surpassed the OLS reference in terms of the quality of forecasting.
Usage of the Shapley decomposition made it possible to ascertain that the most
valuable predictors in different models are comparable. The author concludes
that ML methods are highly effective for macroeconomic forecasting. Using the
Shapley decomposition has turned out to be a useful exercise for simplifying
the interpretation of the results of ML regressions.
Szafranek (2017) tries to create an optimum ML method for inflation
forecasting in a resource-based economy, taking Poland as an example. This
paper is useful for understanding specific issues when applying ML methods
for an economy similar to Russia's. The author takes a family of feed-forward
neural network models with one hidden layer as the basis, using the basic model
architecture in which the number of neurons within the hidden layer is drawn
from a Poisson distribution. A large number
of single-layer neural networks are trained, after which the results are averaged
by means of bagging (bootstrap aggregating). The model is trained based on
Polish data, including 188 predictors for the period from January 1999 to
December 2016. Performing the exercise, the author makes 72 pseudo-out-of-
sample forecasts in a recursive manner starting from January 2011. The forecasts
are made for horizons of 1 to 12 months. The quality of forecasts is assessed
using the RMSE. The author has found that bootstrap-averaged single-layer
neural networks have a better forecast quality than standard models, especially
over a long-term horizon.
Andreev (2016) describes the combined method used by the Bank of Russia
for inflation forecasting. This algorithm is designed to combine various models,
making it possible not to discard potentially useful predictors in advance. This
method includes the following models: random walk, autoregression with
linear trend, unobservable component model, classical and Bayesian vector
autoregression model, as well as linear regression. After training each of the
models, they are aggregated. First, the multi-factor models, which were
trained on n different samples, are aggregated. After that,
the multi-factor and one-factor models are aggregated into a final forecast.
This approach makes it possible to use the strengths of different models
while engaging all the available predictors. Given that ML methods are also
capable of combining various algorithms, creating an effective neural network
may become the next stage in the evolution of methods used for forecasting
inflation in Russia.

3. Data description and methodology

The data set includes ten macroeconomic time series, including the
series of consumer prices. For the purpose of this study, the following main
macroeconomic indicators were selected: real GDP, labour productivity (the
ratio of real GDP to employment), money supply aggregate M2, real credit
growth, unemployment rate, exports, the price of oil in US dollars, real
disposable income, money market interest rate, and consumer prices. The
consumer price index (CPI), in monthly percentage increments, computed by
Rosstat serves as the measure of consumer price inflation. The time series in my
sample cover the period from January 2002 to August 2018 (200 observations
altogether). All the data is normalised into monthly growth rates. The physical
volume index of GDP computed by Rosstat serves as a proxy for quarterly
growth rate of real GDP. Monthly growth rate of real GDP is obtained by linear
interpolation. All series are seasonally adjusted, transformed to a stationary
form, and standardised.
After seasonal adjustment, the consumer price series was transformed into
the first difference of the logarithm to obtain inflation:

π_t = ln(P_t) − ln(P_{t−1}).
The monthly inflation is shown in Figure 3 in the Appendix. The augmented
Dickey–Fuller test confirms that the inflation time series is stationary.
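
For illustration, these steps can be sketched in R (cpi is a hypothetical seasonally adjusted price-level series; the tseries package is an illustrative choice):

    pi_t <- diff(log(cpi))               # first difference of the log price level
    pi_t <- scale(pi_t)                  # standardise
    tseries::adf.test(as.numeric(pi_t))  # augmented Dickey-Fuller test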
Different forecasting models are assessed using the quality functional, which
is equal to the RMSE between the real and predicted inflation series:

RMSE = √((1/T)·∑_{t=1}^{T} (π_t − π̂_t)²).
The price level is predicted in pseudo-real time out-of-sample. The need for
out-of-sample forecasts is determined by the high vulnerability of ML models
to overfitting. For this reason, the model quality is hard to assess without out-
of-sample forecasts. Inflation is predicted over the horizons of 1, 2, 6, 12 and 24
months following the most recently available reading of monthly inflation at a
pseudo-real moment of time. Models are then compared in terms of their forecast
quality separately for each forecast horizon.
Basic AR(1) regression is taken as a benchmark. Accordingly, if the RMSE
value for a specific model in a specific month is smaller than the RMSE value of
the benchmark, this model is judged to be superior to the benchmark model in
terms of the forecast quality, and vice versa.
For a better interpretation of the neural network performance results, the
Shapley decomposition is used.

3.1. Shapley decomposition

One of the important tasks in applied macroeconomic forecasting is not
only to build a model that produces highly accurate forecasts, but also to
be able to interpret its predictions. The accuracy of inflation forecasting is never
absolute, because a large number of factors influence future inflation and there is
no hope of explicitly accounting for all of them. However, explaining the reasons
why the model produced a particular forecast may help researchers find key
ways to improve the forecast by adding information about the developments
in those sectors of the economy that the model considers to be the most
important.
The Shapley value is a principal solution concept in co-operative game theory
that yields an optimal allocation of surplus among the players. This allocation
is based on the notion that each player's payoff equals the player's average
contribution to the welfare of the total coalition under a certain mechanism of
its formation.
For a formal definition of the Shapley value, let us consider a co-operative
game with an ordered set of N players. Let K𝑖 denote a subset of first 𝑖 players in
such an ordering. The contribution of the 𝑖-th player is defined as υ(K𝑖) – υ(K𝑖–1),
where υ is a characteristic function of this cooperative game. The Shapley value
in a cooperative game reflects the surplus distribution, where each player receives
a mathematical expectation of their contributions to the coalition K𝑖 with equally
probable emergence of orderings:

Φ_𝑖(υ) = Σ_{K ⊆ N∖{𝑖}} (𝑘!(𝑛 − 𝑘 − 1)!/𝑛!) · (υ(K ∪ {𝑖}) − υ(K)),
where 𝑛 is the total number of players, and 𝑘 is the number of coalition members.
The Shapley value satisfies the following properties:
Linearity. The mapping Φ(υ) is a linear operator. That is, for any two games
with characteristic functions υ and ω and for each α the following is true:

Φ(αυ + ω) = αΦ(υ) + Φ(ω).
Symmetry. The player ranking does not affect the payoff received by such
player. In other words, if the players are rearranged, the Shapley values are
transformed with a similar rearrangement of elements.
Dummy player axiom. The Shapley value for a player not contributing to any
coalition, will always be zero.
Efficiency. The Shapley value makes it possible to fully distribute the surplus
of the total coalition. In other words, the sum of components Φ(υ) of the Shapley
value is equal to υ(N).
Strict monotonicity. If the contribution of a player to the welfare of a coalition
increases or does not change, his/her Shapley value does not decrease.
Cooperative game theory proves that the surplus allocation among players
as defined by the Shapley value is unique and always exists. The Shapley
decomposition is a complex exercise that, due to the exponential complexity
of the computation, significantly slows down the working of the algorithm.
In my study I use a Shapley additive explanation model, following Lundberg
and Lee (2017). This model is the most efficient way to compute the Shapley
decomposition for its implementation in ML.
The Shapley decomposition is needed for clear interpretation of the results. In the case of linear regression, the interpretation of coefficients is straightforward:
the higher the absolute value of the predictor coefficient, the more important it is.
This approach, however, may not work for ML methods with a complex structure
of parameters.
The procedure I use in this study involves the following steps. First,
the values of predictor coefficients are calculated. Each such a coefficient is
equivalent to the respective player's contribution, which is required to calculate
the Shapley vector. Next, the Shapley value for the 𝑖-th variable is calculated
using different sub-samples of data in all possible combinations of predictors,
after which the absolute values obtained are summed up to obtain the ultimate
importance score of the 𝑖-th predictor. As a result, the complex and non-obvious
coefficients produced by the neural network are presented in linear form, which,
as mentioned earlier, is transparent for interpretation: the higher the absolute
value of the predictor coefficient is, the greater forecasting value it has with
regard to the predicted variable.
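This procedure can be sketched with the shap package, which implements the Shapley additive explanation model of Lundberg and Lee (2017); model and X below are hypothetical placeholders for a trained forecasting model and the matrix of standardised predictors:

    import numpy as np
    import shap

    # Kernel SHAP approximates the Shapley value of each predictor;
    # a small background sample plays the role of 'absent' players
    explainer = shap.KernelExplainer(model.predict, shap.sample(X, 50))
    shap_values = explainer.shap_values(X)   # one value per predictor per observation

    # ultimate importance score: summed absolute Shapley values
    importance = np.abs(shap_values).sum(axis=0)
    ranking = np.argsort(importance)[::-1]   # predictors by forecasting value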

4. Models

This section describes the models I use in this study. First, the traditional AR
model is described, which is based only on the previous dynamics of the time
series. It is followed by the linear regression model estimated by the regularised
OLS (the simplest model in the family of ML methods). Then follows a description
of the SVM and neural network – two main ML methods presented in this paper.

4.1. Autoregression (AR)

I use the AR model to make an iterated multi-step forecast. That is, to make a forecast for 𝑘 periods ahead, inflation values are predicted successively between
the moments of time t and t + 𝑘. The iterated multi-step method demonstrates
better results than the direct one for predicting the inflation value at t + 𝑘 (Faust
and Wright, 2013). The number of lags 𝑝 in the AR(𝑝) model is determined
using the Bayesian information criterion (BIC). Mathematically, the AR model
is formulated as follows:

π_t = α + Σ_{𝑖=1}^{𝑝} φ_𝑖 π_{t−𝑖} + ε_t.
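A minimal sketch of this benchmark with statsmodels (infl is a placeholder for the stationary monthly inflation series):

    from statsmodels.tsa.ar_model import AutoReg, ar_select_order

    # choose the number of lags p by the Bayesian information criterion
    sel = ar_select_order(infl, maxlag=12, ic='bic')
    res = AutoReg(infl, lags=sel.ar_lags).fit()

    # iterated multi-step forecast: values for t+1, ..., t+k are
    # predicted successively, each feeding into the next step
    forecast = res.forecast(steps=24)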
4.2. Regularised OLS

ML methods often encounter the problem of overfitting. This is handled by means of regularisation. Regularisation is used to add penalties on coefficients in the benchmark model. As a result, the algorithm has to find a balance between
the accuracy that follows from overfitting and the minimisation of regularisation
penalties. OLS regression together with regularisation is a special case of the so-
called shrinkage estimator. In this study, a specification with L2 regulariser is used,
which is called ridge regression. In mathematical terms, the model is formulated
as follows:

Q(𝑥) + 𝑎 · R(𝑥) → min_𝑥,
where Q(𝑥) is the quality functional, which is the mean squared error, R(𝑥) is a
penalty for overfitting, and 𝑎 is the hyperparameter responsible for the specific
weight of regularisation penalties, which is optimally found out-of-sample by
means of cross-validation. The ridge regulariser is:

R(𝑥) = ‖𝑥‖₂² = Σ_𝑗 𝑥_𝑗².
Unlike the model with L1 regularisation, it does not set the coefficients of the majority of explanatory variables to zero. This regularisation model is chosen for comparison with the neural network model and the Shapley decomposition, which employ the full set of potential predictors.
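A sketch of the ridge model with the penalty weight chosen out-of-sample by cross-validation, using scikit-learn (X and y are placeholders for the standardised predictors and the inflation series):

    import numpy as np
    from sklearn.linear_model import RidgeCV

    # RidgeCV minimises MSE + a * ||beta||^2 and selects the weight a
    # (called alpha in scikit-learn) by cross-validation
    ridge = RidgeCV(alphas=np.logspace(-3, 3, 25), cv=10).fit(X, y)

    print(ridge.alpha_)  # selected regularisation weight
    print(ridge.coef_)   # coefficients are shrunk towards, but not to, zero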

4.3. Support-vector machines (SVM)

SVMs are a powerful tool for solving classification and regression problems, though they are more frequently used for classification. Generally, a two-class
classification problem is solved with the help of a logistic regression. Input data
are assessed in accordance with their position with regard to the hyperplane in
the feature space and are projected onto interval (0; 1), which is interpreted as a
probability of being classified within a certain class.
Using SVM helps address two potential problems that arise from using
logistic regression. Firstly, the classification data may be linearly inseparable,
that is, the decision boundary may not take the form of a hyperplane. Secondly,
the exact separation between the classes may be unknown, even if the exact
number of members in classes is known. To address the problem of linear
inseparability, data may be projected onto a space with a higher number of
dimensions. After that, linear separation becomes feasible using an appropriate
transformation. Such data projection is the key element of the SVM algorithm.
The second question is how to choose the best separation margin. SVM model
chooses the decision boundary in such a way that the vertical distance to the
nearest observation is maximised. Thus, only the observations nearest to the
decision boundary will be taken when choosing the line which determines the
classification across the entire data space. Such observations are called support
vectors, which give the entire model its name. Therefore, the SVM algorithm is a
kind of improved logistic regression based on data transformation and geometry
of observations.

Figure 1. Graphic interpretation of the two-class classification problem

[Two panels: the original feature space and, after the kernel trick, the transformed feature space in which the classes are separated by a hyperplane with the maximum margin.]

Source: Chakraborty and Joseph (2017)

The idea of two-class separation is shown in Figure 1. The SVM regression will
be addressed at the end of this section. The left panel of Figure 1 shows two types of
observations (green and orange points) in a certain feature space. SVM algorithm
will try to find a decision boundary leading to the maximum distance from the
nearest observations (marked in yellow). Finding the decision boundary (black
line) may be a nontrivial task if the data are linearly inseparable. In this case,
the data should be transformed into a new space, where the green and orange
points can be separated by a hyperplane, as shown on the right panel of Figure 1.
Thus, SVM algorithm first tries to find a new space for the transformed data in
which they can be linearly separable, and then determines the margin based on
the maximum distance to the nearest observations.
Let us consider the mathematical formulation of this model. Let us suppose
that there is a training sample, (𝑥1, y1), …, (𝑥m, ym), 𝑥𝑖 ∈ ℝⁿ, y𝑖 ∈ {‒1, 1}. SVM model
builds the classification function F, which takes the form of F(𝑥) = sign(‹w, 𝑥› + b),
where ‹ , › is the scalar product operator, w is the normal vector to the separation
hyperplane, and b is an auxiliary parameter. The objects in which the F value is
equal to 1 are classified into one class, and those in which the F value is equal to ‒1
in the other class. This specific classification function is chosen on the basis that any
hyperplane may be specified in the form of ‹w, 𝑥› + b, for certain w and b.
Hereinafter, such w and b should be chosen so that the distance to each class
is maximised. In this case, the optimisation problem takes the following form:
min_{w,b} ½‖w‖²   subject to   y_𝑖(‹w, 𝑥_𝑖› + b) ≥ 1,  𝑖 = 1, …, m.

This problem is solved by the method of Lagrange multipliers. A regularisation parameter C is also introduced as an upper bound on the Lagrange multipliers (0 ≤ 𝑎_𝑖 ≤ C), which is necessary so that the model does not try to find a clean separation where there is none (if the classes are mixed).
It is obvious that the quality of class separation at the second stage directly depends on how successfully the model transforms the data. In case of linear
inseparability, we can transform the data by expanding the dimensionality of
the feature space. In other words, the data are mapped from a lower-dimensional space into a higher-dimensional space in order for the data to become linearly
separable. Finding such transformation is a nontrivial problem. However, there
is a mathematical solution to this problem. As data enter the Lagrangian in the
form of a scalar product, it becomes possible to use the transformation T(.) using
the so-called kernel:

K(𝑥, 𝑥′) = ‹T(𝑥), T(𝑥′)›.
The function most commonly used in ML as a kernel transition function is the radial basis function (Gaussian kernel):

K(𝑥, 𝑥′) = exp(−γ‖𝑥 − 𝑥′‖²).
The SVM can also be used to create a regression based on a continuous series of variable values. The operating principle of the regression is similar to the classification mechanism. In this case, the model is similar to a linear regression, with the only difference being that the input data z have been transformed by means of the kernel transition:

F(𝑥) = ‹w, z› + b,  where z = T(𝑥).
Thus, the model is a linear regression with a non-linear data transformation. Like many complex models, SVMs face the problem of interpretation. Although the predictors and the dependent variable are numerical, using the Shapley value is still reasonable: with a non-linear kernel, methods that are conventional for linear models will not help find out what is going on inside the model.
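A sketch of the SVM regression with the Gaussian kernel in scikit-learn; the hyperparameters C and gamma, corresponding to the regularisation parameter and the kernel width above, are tuned by cross-validation (X, y and X_test are placeholders):

    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVR

    # SVM regression with the radial basis function kernel; the data are
    # implicitly mapped into a higher-dimensional space by the kernel
    grid = {'C': [0.1, 1, 10], 'gamma': ['scale', 0.01, 0.1]}
    svr = GridSearchCV(SVR(kernel='rbf'), grid, cv=10).fit(X, y)
    y_hat = svr.predict(X_test)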

4.4. Neural network (NN)

Neural networks are one of the most powerful statistical algorithms in ML.
Initially designed for simulating the work of a human brain, neural networks have become a popular tool in machine learning to solve classification and regression problems. For the purpose of this article, the class of models called multi-layer
perceptrons have been chosen from among a wide range of neural networks.
I have chosen this type of model due to its good ability to approximate continuous
functions. Neural networks may be described as semi-parametric models. On the
one hand, model parameters are set using the base network structure. On the other
hand, input data are used in the learning process to decide which parameters are
significant. This effect does not manifest itself much in this study, but is rather
noticeable when using large and complex neural networks on large data arrays.

Figure 2. Graphic interpretation of the neural network structure

[Input layer → weight matrix W1 → hidden layer → weight matrix W2 → output layer (CPI); bias (intercept) nodes feed the input and hidden layers.]

Source: Chakraborty and Joseph (2017)

Macroeconomic data enter the neural network for inflation modelling. There
is an input layer with a node for each incoming object. The coefficient matrix W1
(weight matrix) combines the input layer with the hidden layer in the middle
of the network. At every node in the hidden layer, the combined signal from
the input layer is assessed by the activation function to generate a standardised
output signal to the next layer intended for retrieving the final result. This data
circulating inside the network are called derived objects. The weight matrix W2
combines the derived objects with the output layer. The coefficients at the nodes
of the activation function are the weights of the weight matrix W. Thus, for the
𝑖-th node in the layer 𝑘, the incoming object in the activation function will be a
derived object of the (𝑘–1)-th layer and the 𝑖-th row of coefficients in the weight
matrix. NN usually uses the activation function that makes it possible to model
a non-linear relation between the predictors and the predicted variable. Such
functions may include sigmoid and logarithmic functions. In this study, we use
the ReLU (Rectified Linear Unit) activation function. The neural network shown
in Figure 2 may be formalised as follows:
Y = g(X × W1ᵀ) × W2ᵀ,
where (X × W1ᵀ) is the inner product of the feature matrix X and the transposed weight matrix W1, and g(·) is the activation function applied element-wise. With 𝑛 variables, the input objects, the weight matrices and the output objects X, W1ᵀ, W2ᵀ and Y have the dimensions (𝑛 × (𝑛 + 1)), ((𝑛 + 1) × (𝑛 + 1)), ((𝑛 + 1) × 1) and (𝑚 × 1), respectively. The ‘+1’ in the column count reflects the bias (intercept) node shown in Figure 2. Note that if certain entries in the coefficient matrix are zero, the bias node does not feed into the corresponding input. Thus, with 10 explanatory variables, the model has (10 × 11) + 11 = 11² = 121 parameters. Accordingly, for a limited amount of analysed data, the number of layers in the neural network is very restricted.
The activation function is performed by ReLU, which is formally expressed as

g(z) = max(0, z)

and is responsible for the threshold transition at point zero. This function is computationally cheap: because it zeroes out negative activations and does not saturate, it facilitates better performance of the algorithm. Though this choice carries the risk that updated weights could cause a neuron to never be activated again, this problem can be avoided by a moderate learning rate.
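A sketch of such a network in scikit-learn, with one hidden layer of 11 nodes and ReLU activation, matching the parameter count above (X, y and X_test are placeholders):

    from sklearn.neural_network import MLPRegressor

    # one hidden layer with g(z) = max(0, z) activation;
    # a moderate learning rate mitigates the risk of permanently
    # inactive neurons noted above
    nn = MLPRegressor(hidden_layer_sizes=(11,), activation='relu',
                      solver='adam', learning_rate_init=1e-3,
                      max_iter=2000, random_state=0).fit(X, y)
    y_hat = nn.predict(X_test)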
Despite the numerous advantages of neural networks, their potential
drawback is the lack of interpretability. In particular, economic intuition cannot
be readily applied to the weight matrix W. Moreover, when working with neural
networks in general, different algorithms can give different dimensionality of the
weight matrix for the same size of input array. One of the approaches used to solve
this problem is to apply the Shapley value to explain the results of ML algorithms.

4.5. Cross-validation and training

In this study, I use cross-validation. Data are randomly broken down into 10
divisions (in the proportion 90% to 10%) for training and testing, respectively.
The quality of forecasts is assessed based on the results obtained out-of-sample.
Each division is also broken down in the proportion 90% to 10% for calibrating the
model hyper-parameters and training, respectively. This procedure is repeated in
15 bootstrap iterations to ensure the stability of results. More bootstrap iterations
would result in a higher accuracy of model-based forecasts and less overfitting,
but would substantially slow down the algorithm.
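A sketch of this resampling scheme (fit_model is a hypothetical training routine; the inner 90%/10% hyper-parameter split is omitted for brevity):

    import numpy as np
    from sklearn.model_selection import train_test_split

    rmses = []
    for b in range(15):  # 15 bootstrap iterations
        # random 90% / 10% split into training and testing parts
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.1,
                                                  random_state=b)
        model = fit_model(X_tr, y_tr)
        err = y_te - model.predict(X_te)
        rmses.append(np.sqrt(np.mean(err ** 2)))  # out-of-sample RMSE

    print(np.mean(rmses), np.std(rmses))  # accuracy and stability of results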

5. Results

Table 1 provides the RMSE values for all models in the forecast horizons
analysed: 1, 2, 6, 12 and 24 months. Moreover, Table 3 in the Appendix shows the values of the generalisation error (GE), which is equal to the ratio of the difference between the RMSE values for the training and testing samples to the RMSE value for the testing sample. The data obtained allow us to draw the following conclusions:
1. Neural networks and SVM have better forecast accuracy compared to
benchmark one-factor models.
2. Except at the one-month horizon, neural networks have a higher forecasting quality than the AR model, with an average RMSE value that is 7% lower (9% lower for the forecasting horizons of over one month).
3. Neural networks and SVM have proved insensitive to changes in forecasting
horizons of over two months.
4. Neural networks have turned out to be the most prone to overfitting
compared to all other models.
5. There is practically no difference between neural networks and SVM in
terms of forecast accuracy over a horizon of over two months. However,
for the horizon of one month, SVM produce higher-quality forecasts than
neural networks and are close to the AR model.
6. The ridge regression model has turned out to be the most sensitive among
the ML models to choosing a forecasting horizon. In terms of the quality
of forecasts, it has proved to be the worst among the models considered.
Hereinafter is a detailed overview of the results.

Table 1. Forecasting errors

                     RMSE by forecast month
Model        1         2         6         12        24
AR(1)      0.2998    0.4500    0.4278    0.4286    0.4369
NN         0.3460    0.3880    0.3980    0.3990    0.4010
SVM        0.3100    0.3910    0.3980    0.3940    0.4070
Ridge      0.3180    0.3910    0.4180    0.4310    0.4230

Note: the best models for each horizon are highlighted in green, and the worst models – in yellow.

The main model used in this paper, the neural network, has demonstrated better
results than the benchmark model in the majority of cases. With the exception of
switching from a one-month to a two-month horizon, the model is characterised
by quite stable forecast error when changing the forecasting horizon. The average
forecast error is about 0.4, which corresponds to an increase in the accuracy of the
forecast by an average of 9% as compared to the benchmark model.
One reason why non-linear methods have an advantage over linear
benchmarks when increasing the forecasting horizon is obvious from the forecast
graph over a six-month horizon (see Figure 5 in Appendix). The graph shows
a sharp surge in inflation in 2014, which was caused by the depreciation of the
ruble. The AR model memorises and reproduces this surge with a certain lag in
the forecasting (equal to six months in this case), which decreases the forecast
quality. In contrast, the neural network, which does not use the inflation lag
for training, proves to be less susceptible to sharp surges and has therefore an
advantage in forecasting if short-term surges are present.
The models analysed do not include the inflation lag. Since they use both
the contemporaneous inflation series and the key macroeconomic indicators for
forecasting, they turn out to be less sensitive than linear models to changes in
the forecasting horizon. Analysing the relation between the predictors and the
inflation series, we do not know the exact quantitative effects captured by these
variables. A response to possible critics on the need to increase the number of
potential predictors for a more accurate description of the state of the economy
(which presumably will allow the neural network to identify an overall pattern of
interrelations inside the economy) is a small number of available time observations.
Such a unfavourable ratio would inevitably lead to the overfitting of the neural
network, which is observed already even in the case of 10 predictors. It should be
noted that the upsurge reproduced by the neural network in 2014 in a one-year
horizon forecast does not prove the model's ability to forecast sudden shocks, and
is more likely a consequence of model overfitting. Therefore, if we increase the
number of bootstrap samples, this upsurge should disappear. In comparison with
other ML models, neural networks turn out to be more susceptible to overfitting.
The GE value (see Table 3 in Appendix) when training the neural network is 0.4
in a one-month horizon and 0.212 on average, with some fluctuations for further
horizons. This is evidence of significant overfitting of the model, or in other words,
its adjustment to peculiarities of the available data. However, this overfitting
volume is acceptable for the neural network and does not substantially affect
the quality of the model. The other ML models used demonstrate practically no
overfitting, but, as the graphs show, they cannot predict local inflation upsurges
and downfalls as the neural network does, but instead tend to average the existing
series. Thus, although neural networks and SVM preserve the balance between the
ability to predict local inflation changes and the absence of overfitting in different
ways, both models feature close RMSE values, which indicates a similar level of
accuracy for both methods.
From Table 5 in Appendix, which reports the Shapley decomposition for the
neural network performance results in various forecasting horizons, it is clear
which predictors had the highest impact on the forecast, according to the neural
network. For a one-month horizon, the most informative predictors were the
GDP, the volume of loans and the inflation rate. Although the contemporaneous
inflation has a significant effect on the one-month forecast, it is less than it should
be, based on the intuitive understanding of the inflation’s nature. Given the
neural network’s propensity to overfitting for this horizon, we should be careful
about the results of the model's performance for this horizon. Later, when the
forecast horizon is increased to two months, this problem disappears and increases can be noticed in the weights of predictors that reflect the state of key
aspects of the Russian economy.
If the forecast horizon is increased to six months, the money supply
and the interest rate become the most significant variables. The contemporaneous
inflation preserves its status as an important variable for forecasting. The forecast
chart for six months (see Figure 5 in Appendix) suggests that the neural network
predicts some of the local changes quite well. As the degree of overfitting is not
too high, the predictors used are most likely to have a decisive influence on the
changes in inflation in these local episodes. It is also likely that the remaining
local peaks are explained by other predictors not covered by this study.
For the one-year forecast, the most significant predictors were the interest
rate, the volume of loans issued and the price of oil. This means that from a neural
network perspective, in the case of rather long-term forecasting, the impact of the
current inflation evens out and the forecasting power shifts to other indicators
that reflect the overall state of the Russian economy and monetary policy.
In the case of a forecast over a two-year horizon, many of the predictors cease
to be informative. The interest rate, the unemployment rate and, surprisingly, the contemporaneous inflation have the highest impact on the forecast. Such
results can be attributed to the overall complexity of the long-term forecast. Thus,
Figure 6 in Appendix shows that the neural network generally strives to follow
the mean level, rather than predict sudden peaks. Perhaps using valid forecasts
of important macroeconomic indicators as predictors may direct the neural
network towards the path of economic development expected by economists.
Finding ways to adjust the model to the Russian reality without overfitting is an
important and complex challenge for future researchers of this subject matter.
The neural network and SVM demonstrate similar RMSE values. Since the SVM practically does not reproduce local changes in inflation and aims, by its design, to predict the trend, this method will start losing out to the neural network as the number of observations increases. This is due to the
fact that the neural network will be able to identify potential key peculiarities of
separate episodes and, owing to this, will substantially improve the accuracy of
the forecast.
The ridge regression has turned out to be the worst among the ML models,
but the quality of the ridge regression and AR models’ forecasts has proved to
be comparable. The possible problem of overfitting and the model's inability to
ignore insignificant variables result in rather poor forecasts.

6. Conclusion

The primary goal of this study is to examine the applicability of neural networks in inflation forecasting in Russia and to compare the results of complex
ML methods with the benchmark forecasting models. The neural network and SVM
have generally proved more efficient than one-factor and simple linear models.
The models analysed have demonstrated good results over the forecasting
horizon of longer than one month, which makes them a promising instrument
for short- and medium-term forecasting.
Using the Shapley decomposition to interpret the forecast of the neural
network trained on a macroeconomic data sample has made it possible to identify
key predictors in the neural network approach and to determine the scale of their
impact on the inflation forecast.
The neural network framework used in the paper is one of the simplest in the
family of neural networks. Despite this, it has already shown promising results.
Over time, computing power is expected to improve and the amount of training
data will increase. This will make it possible to create more complex models trained on a
greater number of variables, which will further increase the usefulness of ML
methods to improve the understanding of the quantitative relationships between
key macroeconomic indicators.
Appendix is available at
http://rjmf.econs.online/en;
dx.doi.org/10.31477/rjmf.202001.57

References
Andreyev, A. (2016). Integrated Inflation Forecasting at the Bank of Russia. Bank of Russia Series of Economic Research Reports, N 14. [In Russian]. Available at: https://www.cbr.ru/Content/Document/File/16726/wps_14.pdf [accessed on 5 December 2019].
Baybuza, I. (2018). Inflation Forecasting Using Machine Learning Methods. Russian Journal
of Money and Finance, 77(4), pp. 42–59.
Chakraborty, C. and Joseph, A. (2017). Machine Learning at Central Banks. Bank of
England Working Papers, N 674.
Faust, J. and Wright, J. H. (2013). Forecasting Inflation. In: Elliott G. and Timmerman A.,
eds., Handbook of Economic Forecasting, Vol. 2. North-Holland: Elsevier, pp. 2–56.
Joseph, A. (2019). Shapley Regressions: A Framework for Statistical Inference on Machine
Learning Models. Bank of England Working Paper, N 784.
Kleinberg, J., Ludwig, J., Mullainathan, S. and Obermeyer, Z. (2015). Prediction Policy
Problems. American Economic Review: Papers and Proceedings, 105(5), pp. 491–495.
doi: 10.1257/aer.p20151023
Lundberg, S. and Lee, S.-I. (2017). A Unified Approach to Interpreting Model Predictions.
arXiv preprint arXiv:1705.07874. Available at: https://arxiv.org/abs/1705.07874
[accessed on 5 December 2019].
Stock, J. H. and Watson, M. W. (2008). Phillips Curve Inflation Forecasts. NBER Working
Paper, N 14322.
Szafranek, K. (2017). Bagged Artificial Neural Networks in Forecasting Inflation:
An Extensive Comparison with Current Modelling Frameworks. NBP Working Paper,
N 262.

Announcements of Sanctions
and the Russian Equity Market:
An Event Study Approach
Pavel Dovbnya, New Economic School1
pdovbnya@nes.ru

1  The author is a winner of the 2019 Economic Research Competition organised by the Bank of Russia and the Russian Journal of Money and Finance for students and postgraduates.

The research question raised in this paper is an investigation of the effect of the announcement of US and EU sanctions on the stock
returns of the targeted companies listed on the Moscow Exchange.
The strategy for identification is based on firm-specific and multivariate
short-term event studies. Firm-specific event study of eight sanctions
that targeted 14 entities at different times results in a statistically
significant ‒5.4% estimate of the expected cumulative abnormal return
within a window of seven trading days.

Keywords: sanctions, Russian equity market, event studies, abnormal returns, MOEX

JEL Codes: G14, F51

Citation: Dovbnya, P. (2020). Announcements of Sanctions and the Russian Equity Market: An Event Study Approach. Russian Journal of Money and Finance, 79(1), pp. 74–92. doi: 10.31477/rjmf.202001.74

1. Introduction

The question of the effectiveness of the Western sanctions against Russia remains unresolved. Both in the US and Russia there is still a heated debate over
this issue, which makes a study of sanctions relevant to the discussion of public
policy. The research question raised in this paper is an investigation of the effect
of the announcements of US and EU sanctions on the stock returns of the targeted
companies listed on the Moscow Exchange.
The identification strategy is based on short-term event studies. Event study
methodology is designed to identify the short-term average cumulative impact
of an event on the equity returns of the public companies that were directly
or indirectly affected by the event. In the case of sanctions, an important
division arises. On the one hand, an announcement of sanctions is a type of
announcement in and of itself. Thus, it should be studied across events. On the other
hand, due to the idiosyncratic nature of specific sanctions, their announcements
may be perceived as separate events which should be studied separately. Special
approaches exist for each of these two situations: firm-specific event studies, for
analysis across events, and multivariate event studies, for the analysis of particular
events. A firm-specific event study is designed to answer the question: what
is the expected pattern of cumulative abnormal returns of a company affected
by a typical announcement of this particular type of event (announcement of
sanctions)? It estimates the expectation of the cumulative abnormal return path
conditional on the event window around a typical announcement. In turn, the
question raised by a multivariate event study is different: what is the expected
pattern of cumulative abnormal returns of a company affected by the event
under consideration? It estimates the expected cumulative abnormal return path
conditional on a particular event. For more information about these methods, see
Section 4 and Appendices B and C.
In this work, the firm-specific event study method is applied to a set of
eight sanction packages which, among other things, targeted either publicly
traded Russian companies themselves or their subsidiaries, CEOs, or major
shareholders. The multivariate event studies are used to disentangle the effect
of inclusion in the US Office of Foreign Assets Control’s (OFAC)2 Sectoral
Sanctions Identifications (SSI) and Specially Designated Nationals (SDN) lists
by investigating the two most extensive announcements of sanctions: (1) the
inclusion of eight exchange-listed companies in the SSI list on 12 September
2014 and (2) the inclusion of six exchange-listed companies in the SDN list
on 6 April 2018. For more information on these types of sanctions and the
announcements, see Section 3.
The firm-specific event study demonstrates that, on average, the
announcement of the imposition of sanctions that target a Russian public
company has a negative impact on its equity returns. The average effect is
measured as the estimate of the expected cumulative abnormal return within
a window of seven trading days after a sanction announcement and amounts
to  ‒5.4%. The multivariate event study analysis of the two most extensive
sanctions imposed by the US on the Russian economy – one on 12 September
2014 and the other on 6 April 2018 – shows that the effect of the announcement
of the former was statistically equal to zero, whereas for the latter, estimates
of the expected cumulative abnormal return already on the 5th day after the
announcement are steadily around ‒20%.

2  The Office of Foreign Assets Control (OFAC) is a special agency of the US Treasury Department that develops, administers and enforces all of the US sanctions. See https://www.treasury.gov/resource-center/sanctions/sdn-list/pages/ssi_list.aspx for the SSI list, and https://www.treasury.gov/resource-center/sanctions/sdn-list/Pages/default.aspx for the SDN list.

The limitations of the present study naturally include the small number of
events being considered. There were only 8 sanction packages announced after
March 2014 which, among other things, targeted Russian public companies.
Moreover, only two of these sanction packages are extensive enough to be
analysed in the framework of multivariate event studies. Nevertheless, even with
the small number of events, the results are statistically robust, which means that
one can draw some basic, though cautious, conclusions about the effectiveness
of sanctions against Russia.
The rest of the paper is organized as follows. Section 2 provides a review of
the literature on the topic of sanctions both in general and in the case of Russia
specifically. Section 3 describes the data used in the event study analysis: the
market data used in the regressions and the sanctions’ events data set under
consideration. A brief discussion of the methodology implemented is presented
in Section 4. The main results of the event study are described in Section 5.
Section 6 concludes the paper. The Appendix includes a broad presentation of the results of the event study in the form of graphs and tables, as well as a complete description of the mathematics behind the event study methods used in this work.

2. Literature overview

The literature on economic sanctions is diverse and can be divided into two broad groups: monographs and scientific articles. A comprehensive history
of economic sanctions since World War I written by Hufbauer et al. (1990)
demonstrates the complexity of sanctions as an economic phenomenon and
that every instance requires its own approach when analysed. Zarate (2013) is a
prominent monograph written by the former Deputy National Security advisor
about the emergence, establishment and organisation of the US Treasury’s
international activity in the field of economic sanctions.
Scientific articles cover both theoretical and empirical questions. While
empirical articles seek to answer questions about the effectiveness of sanctions
and their effects on the targeted entities, theoretical papers focus on their micro-
foundations. Galtung (1967) was the first to perform comprehensive analysis of
a particular case of sanctions – the British oil embargo against Rhodesia – from
both theoretical and empirical point of view, in order to determine whether these
sanctions were effective. Kirshner (1997) introduces a microeconomic model of
sanctions trying to answer the question of how sanctions work.
This paper contributes to the literature on the sanctions against Russia
since March 2014 and their impact on the Russian economy (and the Russian
financial market in particular). Gurvich and Prilepskiy (2016) investigate
the effect of the Western financial sanctions on the foreign funding of
the sanctioned Russian state-controlled banks and the oil, gas and arms
companies: the authors estimate the potential loss from sanctions at USD 280
billion in gross capital inflow over 2014–2017, but do not address the issue of
identification. Nivorozhkin and Castagneto-Gissey (2016) study the dynamic
relationship between the Russian and the global equity markets after the
outbreak of the Ukrainian crisis in 2014. They find a considerable decrease
in synchronicity between the markets, reflected by a significant decrease in
the cross-correlation of market returns. Naidenova and Novikova (2018) use
standard firm-specific event studies to detect the effect of the announcement
of sanctions on Russian public companies’ stock prices. However, their pool of
events includes diplomatic, personal, economic and other types of sanctions
in addition to sanctions targeting public companies. Moreover, they look at
the stock returns of all major Russian public companies, regardless of whether
a company was targeted by sanctions or not. This approach may potentially
introduce additional noise to the results. In contrast, in my research I focus
specifically on sanctions targeting companies listed on the Moscow Exchange
and consider the stock returns only of those companies.
Using cointegrated vector autoregression (VAR) models, Dreger et al. (2016)
compare the relative impact of the drop in oil prices and the imposition of
sanctions on the fluctuations of the Russian ruble. They construct a composite
index of sanctions and estimate the VAR model. Their main result is that most
of the depreciation of the ruble is explained by the decline in oil prices, whereas
the effect of sanctions is hardly significant. A similar index is used in Kholodilin
and Netšunajev (2019) as the main component of a structural VAR model
aimed at separating the long-run effects of the shock of sanctions and the oil
price shock on the real sector of the Russian economy. The authors do not find
evidence supporting the hypothesis that the slowdown in Russian GDP growth
was caused by sanctions.
Finally, Ankudinov et al. (2017) perform a statistical comparative analysis
of extreme movements (heavy-tailedness) in the returns of Russian stock indices
before and after the imposition of sanctions. The breakpoint in their analysis
is March 2014. Despite the fact that the authors detect structural breaks in
heavy-tailedness for some Russian indices (Metals and Mining and Finance
MOEX indices), they emphasize that the effect cannot be directly linked to the
announcement of sanctions, since there was a significant contemporaneous drop
in oil prices in the summer of 2014.
Identification of the long-run effects is challenging, because the Russian
financial markets depend heavily on global economic trends. The main goal
of this paper is to contribute to the broad debate about the effectiveness of the
sanctions against Russia by estimating their short-run effect on the equity
returns of the targeted public companies. This approach allows the more accurate
detection of the effect of sanctions, but it is silent on their long-run effects.

3. Data description

3.1. Moscow Exchange data set

The official Moscow Exchange data set of all trades and best orders (type B)3
on the securities market (stocks and bonds) is the main source of data used in
the estimation of parameters.
The data set covers the period from 6 January 2014 to 31 July 2018. For each
trading day, it includes the ticker symbol (the security’s code), price, an indicator
column that shows whether this price is the new best bid, new best ask or
transaction price, time (down to microseconds) and volume. There are two
major advantages of this data set that are important in the context of the further
analysis. First, the data set is provided by the Moscow Exchange itself, which
does not raise questions about its accuracy or credibility. Second, in contrast
to end-of-day market data, type B intra-day data allows for the computation of
volume-weighted prices. The use of last-3-minute, volume-weighted transaction
prices instead of standard end-of-day transaction prices potentially leads to a
more accurate calculation of daily returns. This becomes especially important
in the analysis of volatile and high-volume trading periods.
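For one trading day, the computation can be sketched in pandas as follows (the column names are assumptions, not the exchange's official schema):

    import pandas as pd

    # trades: one day of type B records with (assumed) columns
    # ['ticker', 'time', 'price', 'volume'], transaction rows only
    cutoff = trades['time'].max() - pd.Timedelta(minutes=3)
    last3 = trades[trades['time'] >= cutoff]

    # volume-weighted price over the last 3 minutes of the session
    vwap = (last3['price'] * last3['volume']).groupby(last3['ticker']).sum() \
           / last3['volume'].groupby(last3['ticker']).sum()

Daily returns are then computed from the series of such end-of-day prices.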
When it comes to the estimation of the market model, for the purposes
of a robustness check the daily returns of two indices are used: the MOEX
Russia Index (IMOEX) and the MSCI Emerging Markets index (MSCI EM).
The IMOEX is the major free-float index calculated by the Moscow Exchange
on the basis of the most liquid stocks (depending on many factors, the average
number of stocks is 45). The MSCI EM is a free-float index that covers the equity
markets of 24 developing countries, including Russia. The advantage of this
index over the IMOEX in the analysis is that the Russian equity market does not
constitute a considerable part of the total market capitalisation of the MSCI EM,
which makes the returns of the index less correlated with the returns of large-
cap Russian stocks. This feature allows more accurate detection and estimation
of abnormal returns on Russian stocks, which is especially important in cases of
companies targeted by sanctions, since the stocks of these companies are among
the most liquid on the Moscow Exchange. An important point in working with
the MSCI EM is that this index is denominated in dollars, whereas Russian stock
prices are denominated in rubles. Therefore, the MSCI series was first converted
to rubles according to the daily RUB/USD exchange rate, and only then were the
rates of return calculated. The IMOEX daily time series was downloaded from
the official Moscow Exchange website, whereas the data on the MSCI EM were
downloaded from Thomson Reuters.
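The conversion step can be sketched as follows (msci_usd and usdrub are placeholder daily series aligned by date):

    import numpy as np

    # convert the dollar-denominated index into rubles first,
    # then compute daily log returns
    msci_rub = msci_usd * usdrub
    market_ret = np.log(msci_rub).diff().dropna()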

3  The Moscow Exchange has three types of data: type A stands for the full limit order book, type B stands for the top of the book and type C for the end-of-day data. See the Market Data description section on www.moex.com/en.
3.2. Sanction-related stocks

I first created a list that would include all of the major sanctions introduced
and their classification. For this purpose, I made a complete list of all the sanctions
imposed on Russia by the US and the EU over the period from March 2014 to
November 2018 (see Appendix F). This list also includes all the counter-sanctions
imposed by Russia on the US and the EU.
There was no existing comprehensive record of sanctions imposed at the
time this paper was written (at least in free sources). The list was prepared
manually on the basis of news articles from the RBC and RIA News agencies,
and also on the basis of official US Treasury and EU announcements.
The sanctions against Russia can be classified into the following five categories:
diplomatic (e.g., the expulsion of diplomats), personal (targeting specific
persons), corporate (targeting particular companies), economic (e.g., a ban
on the export of technologies to a specific Russian industry) and extending
(sanctions that extend the enforcement of previous sanctions). The sanctions can
also be classified according to the reason for their imposition: Ukraine-, Syria-,
Skripal- or US elections-related.
In this paper, I focus on the effect of the announcement of sanctions on
the returns of stocks on the Moscow Exchange. In order to make the results
more specific, I study only sanctions that directly affect companies listed on
the Moscow Exchange. I consider a set of sanctions that includes personal,
corporate and economic sanctions, and a set of companies that includes those
targeted by some of these sanctions and whose stocks were traded on the Moscow
Exchange at that time. Based on the sanctions list described above, the record
of the Moscow Exchange codes and the complete list of all the companies
under sanctions prepared by RBC4 (as of December 2018), I compiled a list of
the sanctions that, among other things, targeted either publicly-traded Russian
companies themselves or their subsidiaries, CEOs, or major shareholders.
The list of sanctions and companies is presented in Table 1. This set of companies
and sanctions will be the focus of further analysis.
The sanctions under consideration can be divided into two groups. The first
group of sanctions adds the target to the OFAC’s SSI list. The second group
of sanctions adds the target to the OFAC’s SDN list. The SDN list is the most
important of the OFAC’s sanction lists and is designed to target any person or
company in the world. The SSI list was created in July 2014 specifically for the
Ukraine-related sanctions against Russia. The key difference between these lists
is that the SDN list prohibits US citizens from doing any kind of business with
the persons/entities included in it, and also requires the freezing of all assets
belonging to the target, while the SSI list allows the elaboration of partial sanctions

4  See https://s0.rbk.ru/v6_top_pics/media/file/1/59/755452933522591.pdf

that restrict connections with the target only in particular areas. Both lists follow
the so-called ‘50 Percent Rule’, which says that if a targeted person has more than
a 50 percent stake in some company, then the company is automatically added
to the list. That is why Polyus – a public company controlled by S.A. Kerimov –
is included in event (8) of Table 1.
Basic intuition and interesting insights can be obtained from simple graphical
analysis. Figure 1 shows that normalized stock prices in the 14-day event windows
of the announcement of sanctions (seven trading days before the announcement
and seven trading days after) generally have a downward pattern. Two sanctions
stand in sharp contrast to the others: those of 12 September 2014 and 6 April 2018. These sanctions are the most extensive: they target eight and six public companies, respectively.
Yet, the sanctions of 12 September 2014 and 6 April 2018 differ considerably.
First, on 12 September 2014, the companies were added to the SSI list, whereas
on 6 April 2018 the companies were added to the SDN list. Second, Figure 1
demonstrates that there is no clear pattern in the time series of prices for the
event window around 12 September 2014. In contrast, the practically unchanging
path of prices before 6 April 2018 is followed by a dramatic drop on 6 and 9 April
(6 April was a Friday) for four of the six companies.
Figure 8 and Figure 9 in Appendix A show that the daily trading volume and
daily realised volatility for stocks in each of the event windows exhibit patterns
similar to that described above.
In general, the visualized data show the intuitively expected result: on
average, it seems that sanctions are accompanied by a drop in stock prices,
increased trading volume and volatility of returns. Yet, the magnitudes and
patterns are clearly different, especially in the cases of sanction packages of
12 September 2014 and 6 April 2018. Such different behaviour can probably be
explained by two factors: (1) whether the sanction corresponds to inclusion in
the SSI or SDN list; and (2), whether the sanction was expected by the market
or not. While the first factor seems to be plausible enough, since a restriction
of financing is not the same as a prohibition of doing business, the second
factor is likely to be less important: analysis of the news in the weeks prior
to 12 September 2014 and 6 April 2018 shows that in both cases there was
information about the upcoming sanctions, but there was no information about
the targeted companies. 5 Nevertheless, in order to draw robust and credible
scientific conclusions about the reaction of the market to the announcement
of sanctions, one must apply a suitable econometric methodology and properly
adjust it for risks.

5  See, for example, https://www.nytimes.com/2014/09/05/world/as-nato-meets-ukraine-conflict-holds-center-stage.html. In addition, see https://www.reuters.com/article/us-usa-russia-sanctions/u-s-plans-to-sanction-russian-oligarchs-this-week-sources-idUSKCN1HB34U
Figure 1. Normalised prices of stocks in the event window of a sanction announcement

[Eight panels, one per sanction announcement from Table 1: (1) 28 Apr 2014, (2) 17 July 2014, (3) 30 July 2014, (4) 12 Sept 2014, (5) 23 Sept 2015, (6) 2 Sept 2016, (7) 26 Jan 2018 and (8) 6 Apr 2018. Each panel plots the normalised prices of the targeted stocks over event days −7 to +7.]

Source: author’s calculations


Table 1. List of sanctions imposed on Moscow Exchange-listed companies

№ Date/Country Codes Description/Link

(1) 28 April 2014, US ROSN The company’s CEO, I.I. Sechin, was added to the OFAC’s SDN List.
https://www.treasury.gov/resource-center/sanctions/ofac-
enforcement/pages/20140428.aspx

(2) 17 July 2014, US GAZP, Gazprombank (35% owned by GAZP), ROSN and NVTK were
NVTK, restricted in their access to 90-day+ maturity US financing (SSI list).
ROSN https://www.treasury.gov/press-center/press-releases/pages/
jl2572.aspx

(3) 30 July 2014, US VTBR VTBR and the Bank of Moscow (having been taken over by VTBR)
were restricted in their access to 90-day+ maturity US financing (SSI).
https://www.treasury.gov/resource-center/sanctions/OFAC-
Enforcement/Pages/20140729.aspx

(4) 12 Sept 2014, US/EU GAZP, (1) Gazprombank, SBER and VTBR were restricted in their access to
LKOH, 30-day+ maturity financing from the US (SSI list);
ROSN, (2) US entities were prohibited to export goods, services, or
SBER, technology to GAZP, LKOH, ROSN, SIBN and TRNFP in support of
SIBN, exploration for or production of Russian deepwater, Arctic offshore, or
SNGS, shale projects that have the potential to produce oil;
TRNFP,
VTBR (3) SIBN and TRNFP were restricted in their access to 90-day+
maturity US financing (SSI list);
(4) SIBN, ROSN and TRNFP were restricted in their access to 90-day+
maturity EU financing.
https://www.treasury.gov/press-center/press-releases/Pages/
jl2629.aspx
https://eur-lex.europa.eu/legal-content/GA/
TXT/?uri=CELEX:32014R0960

(5) 23 Sept 2015, US SBER, A number of SBER and VTBR subsidiaries were added to the OFAC’s
VTBR SSI list.
https://www.treasury.gov/resource-center/sanctions/OFAC-
Enforcement/Pages/20151222.aspx

(6) 2 Sept 2016, US MSTT MSTT was added to the OFAC’s SDN list.
https://www.treasury.gov/resource-center/sanctions/OFAC-
Enforcement/Pages/20160901.aspx

(7) 26 Jan 2018, US SNGS A number of SNGS subsidiaries were added to the OFAC’s SSI list.
https://www.treasury.gov/resource-center/sanctions/OFAC-
Enforcement/Pages/20180126.aspx

(8) 6 April 2018, US ENPL, (1) The CEOs of GAZP, RUAL and VTBR – A.B. Miller,
GAZA, O.V. Deripaska, A.L. Kostin respectively – were added
GAZP, to the OFAC’s SDN List;
PLZL, (2) The major shareholder of PLZL – S.A. Kerimov –
RUAL, was added to the OFAC’s SDN List;
VTBR
(3) ENPL, GAZA and RUAL were added to the OFAC’s SDN List.
https://www.treasury.gov/resource-center/sanctions/OFAC-
Enforcement/Pages/20180406.aspx

Note: links to the official US Treasury and EU announcements are provided. MOEX ordinary shares’ codes stand
for: ROSN – Rosneft, GAZP – Gazprom, NVTK – Novatek, VTBR – VTB Bank, LKOH – Lukoil, SBER – Sberbank,
SIBN – Gazprom Neft, SNGS – Surgutneftegaz, TRNFP – Transneft, MSTT – Mostotrest, ENPL – EN+, GAZA –
GAZ Group, PLZL – Polyus, RUAL – Rusal.
4. Methodology

In financial econometrics, the classic method to determine the effect of a news announcement on the returns of stocks is the event study approach.
Its major advantage is that it solves the issue of identification: given that there
is no confounding event prior to the announcement of a sanction, it guarantees
that the effect is correctly identified. However, it also has a significant drawback:
the bigger the event window is, the higher the variance of the effect studied.
In other words, event studies perform well only over short horizons. The main
goal of any event study is to estimate the effect of a specific type of announcement
(or of some particular announcement) on stock returns by estimating the
cumulative average abnormal returns (CAAR) in the event window around the
announcement.
There is an important peculiarity of the framework we are working in. On the
one hand, sanction announcement is a type of announcement by itself. Thus, it
should be studied across events. On the other hand, due to the differences between
the sanctions described above, sanction announcements can be perceived as
events that should be studied separately. In the context of event studies, a special
approach exists for each of these two situations.
First, firm-specific event study – the standard – is designed to identify
the effect of a particular type of shock, such as earnings announcements or
announcements about some macroeconomic indicators. In other words, it
addresses the question: What is the expected pattern of cumulative abnormal
returns of a company affected by a typical announcement of this particular
type of event (in our case, the announcement of sanctions)? Formally speaking,
it estimates the expectation of the cumulative abnormal return conditional on
the event window around the typical announcement. The main requirement of
this framework is that the events in the data set are gathered in such a way that
there is no overlap between the event windows. This requirement asserts that
there is no correlation between the abnormal returns of stocks (clustering), a
problem that firm-specific event studies are not designed to address. In cases of
significant cross-correlation of abnormal returns, the variance of CAARs will be
estimated incorrectly, which will lead to inaccurate confidence intervals (CIs).
For more information about the methodology of firm-specific event studies, see
Chapter 4 in Campbell et al. (1997), and also Kothari and Warner (2007).
Second, the approach used to analyse the effect of a particular announcement
is called multivariate event study. It is designed to tackle the problem of cross-
sectional correlation between abnormal returns, allowing the identification of
the effect of an event that had a simultaneous impact on a number of firms.
However, in contrast to firm-specific event studies, this approach cannot deal
with different event windows at the same time. In our case, this drawback makes
the approach applicable only to the analysis of sanctions that were announced

either on 12 September 2014 or on 6 April 2018. The question addressed by a multivariate event study is: What is the expected pattern of cumulative abnormal
returns of a company affected by the event under consideration?
Fama et al. (1969) were the first to introduce event study methodology as a
tool of financial econometrics. Multivariate event studies were first implemented
by Binder (1985). However, the author notes that the distribution of test statistics
in multivariate event studies is not well approximated by the asymptotic approach,
since the assumption of independent and identically distributed error terms
is often violated in the context of stock returns. This problem gave rise to the
application of bootstrapping in multivariate event studies. In the case where the test statistic is asymptotically pivotal (i.e., its asymptotic distribution does not depend on the parameters), bootstrapping ensures asymptotic refinement, that is, a better approximation of the actual distribution of the test statistic. For the purposes of robustness, I use
bootstrapping together with an asymptotic approach in conducting multivariate
event studies. For the general bootstrapping methodology, see Anatolyev (2007).
The application of bootstrapping in multivariate event studies is developed in
Chou (2004) and Hein and Westfall (2004), and then revised in Lefebvre (2007).
Note that the main difference between the general bootstrapping approach and
the one that is used in event studies is that the latter does not account for the
recentering of bootstrap statistics, since the bootstrap samples are constructed
in such a way that there is no event at all.
A complete and self-contained description of the econometric, statistical and
bootstrap methodology used in this work is given in Appendix B and Appendix C.
All estimates are implemented in Python, using the NumPy, Pandas, SciPy, and Matplotlib packages.
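
For illustration, the following is a minimal, self-contained sketch of a single-firm event study with a residual-based bootstrap. It uses synthetic returns and a simple percentile interval rather than the pivotized statistics of Appendices B and C, so the window widths and the return process are purely assumed for the example.

import numpy as np

rng = np.random.default_rng(0)
l, h = 250, 15                       # estimation and event window widths (assumed)
r_m = rng.normal(0, 0.01, l + h)     # synthetic market (index) returns
r_i = 0.0002 + 1.1 * r_m + rng.normal(0, 0.015, l + h)   # synthetic stock returns

# 1. Market model fitted on the estimation window only
X = np.column_stack([np.ones(l), r_m[:l]])
alpha, beta = np.linalg.lstsq(X, r_i[:l], rcond=None)[0]
resid = r_i[:l] - alpha - beta * r_m[:l]

# 2. Abnormal returns and the cumulative abnormal return over the event window
ar = r_i[l:] - alpha - beta * r_m[l:]
car = ar.cumsum()

# 3. Bootstrap under the no-event null: resample estimation-window residuals;
#    no recentering is needed because the resampled paths contain no event
B = 9999
boot = np.empty((B, h))
for b in range(B):
    boot[b] = rng.choice(resid, size=h, replace=True).cumsum()
lo, hi = np.percentile(boot, [5, 95], axis=0)
print(car[-1], (lo[-1], hi[-1]))     # CAR on the last event day and its 90% CI

The CAARs reported below average such paths across the sampled companies.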
5. Results

5.1. Firm-specific event studies
The core problem is that there are only 8 observations of events in Table 1.
There are 1 · 3 · 1 · 8 · 2 · 1 · 1 · 6 = 288 possible samples of companies such that
the event windows do not overlap (i.e. one company per announcement). Figure 2
clearly demonstrates that the resulting CAARs and their significance depend
crucially on the choice of companies in the sample.
The main obstacle in conducting a firm-specific event study is the choice
of companies to be included in the sample. To make the result robust to the
sample choice, one can conduct an event study for each of the 288 possible
combinations of companies and average the results. From the standpoint of
formal mathematical statistics, this approach is incorrect, because the CIs
for the ‘average’ CAAR are not the averaged CIs of the resulting CAARs.
Nevertheless, this approach gives an intuitive result about the average effect
that can be identified by conducting numerous firm-specific event studies
in the sample of sanction-targeted companies.
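
Schematically, this ‘average event study’ reduces to an enumeration over the admissible samples. The sketch below is illustrative: the candidate lists are placeholders for the per-announcement lists of Table 1 (of sizes 1, 3, 1, 8, 2, 1, 1 and 6), and run_event_study stands in for the firm-specific procedure above.

from itertools import product
import numpy as np

# Placeholder candidate companies per announcement; the actual lists are in Table 1
candidates = [['A1'], ['B1', 'B2', 'B3'], ['C1'],
              [f'D{k}' for k in range(8)], ['E1', 'E2'],
              ['F1'], ['G1'], [f'H{k}' for k in range(6)]]

def run_event_study(sample):
    # Stand-in for the firm-specific event study; returns a CAAR path over 15 days
    return np.zeros(15)

caars = [run_event_study(s) for s in product(*candidates)]
assert len(caars) == 288            # 1 * 3 * 1 * 8 * 2 * 1 * 1 * 6 admissible samples
avg_caar = np.mean(caars, axis=0)   # the 'average event study' plotted in Figure 3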
Figure 2. Firm-specific event studies for different combinations of companies

(Four panels, each plotting the CAAR by event day from –7 to +7 for one sample: ROSN, GAZP, VTBR, LKOH, SBER, MSTT, SNGS, GAZA; ROSN, NVTK, VTBR, SBER, TRNFP, MSTT, SNGS, PLZL; ROSN, NVTK, VTBR, SBER, GAZP, MSTT, SNGS, RUAL; ROSN, GAZP, VTBR, LKOH, SIBN, MSTT, SNGS, ENPL.)

Note: event studies that have CAAR plots with yellow (green) fill colour of asymptotic CI have statistically significant (insignificant) negative CAARs after the announcement of sanctions. The market model is estimated on the basis of the MSCI EM index.

Figure 3. Average firm-specific event study of the announcement of sanctions

(Left panel: average CAAR with its CI by event day, –7 to +7, h = 15. Right panel: average CAAR at τ = 7 for h ∈ {15, 17, ..., 41}.)

Note: the left side plots the CAAR and its CI averaged across 288 firm-specific event studies with h = 15 (for each of the combinations of companies in Table 1). The market model is estimated on the basis of the MSCI EM index. The right side plots the CAAR and its CI on the seventh day after the sanction announcement averaged across 288 firm-specific event studies with h ∈ {15, 17, ..., 41}. The black dashed line is the average estimate across different h values. The black dotted lines are the averages of CI across different h values. For both sides, the CIs are constructed at 10% significance.
Figure 3 shows that, on average, a firm-specific event study results in a
downward-sloping CAAR pattern after a typical sanction announcement. This
result is consistent with intuition. Yet, if the event window’s width h is 15, it appears
that the result is (on average) statistically significant at 10% only in the cases of
CAARs for the 5th, 6th and 7th days after the sanction announcement. For other
event days, the CAARs are not (on average) statistically different from zero.
The next step is to check for robustness and examine the sensitivity of the
CAAR at event day τ = 7 to the width of the event window h. The key idea
is that when we increase h, we also increase the number of average abnormal
returns that are included in the calculation of the CAAR at τ = 7. Intuitively, this
means that the wider the event window is, the more the CAAR at τ = 7 takes
into account the possible effect of trading based on inside information about the
future sanction which took place prior to the announcement.6 Obviously, this
concept will work correctly only if there are no confounding events.
The right-hand side subplot of Figure 3 presents a robustness check conducted
by repeating the same ‘average event study’ procedure for h ∈ {15, 17, ... , 41}.
It demonstrates several facts. First, the estimate for the CAAR at day τ = 7
is rather stable for different h values, with a mean of ‒0.054. Second, the CAAR
at day τ = 7 is, on average, borderline statistically significant at 10% for
different h. Thus, the average estimated cumulative effect of a typical sanction
announcement on the stock return of a targeted company is ‒5.4% within
a window of seven trading days after the announcement. On the one hand, one
should be careful with this conclusion, since it was made on the basis of incorrect
‘average’ CIs. On the other hand, this was the only way to overcome the lack of
event observations.
5.2. Multivariate event studies
There are two events in Table 1 to which the multivariate event study
methodology can be applied: the sanction announcements on 12 September 2014
and 6 April 2018. As was discussed in Section 2, these announcements differ
considerably: the former sanction was the most extensive in terms of inclusion in the
SSI list, whereas the latter was the most extensive in terms of inclusion in the SDN
list. Conducting multivariate event study on these two sanction announcements
could answer the question of the average effects of inclusion in the SSI and SDN
lists on the stock returns of a targeted company. For the purposes of robustness,
I conduct both studies on the basis of the MSCI EM as well as IMOEX indices.

6 Incidentally, this is the reason why CAARs are better estimates of the effect of sanctions than simple AARs.

Figure 4. Multivariate event study for the sanction announcement of 12 September 2014, estimated CAARs of the targeted companies, h = 29

((a) Market model: IMOEX index; (b) Market model: MSCI EM index. Each panel plots the CAAR by event day, –14 to +14, with asymptotic and bootstrap CIs.)

Figure 5. Multivariate event study for the sanction announcement of 6 April 2018, estimated CAARs of the targeted companies, h = 29

((a) Market model: IMOEX index; (b) Market model: MSCI EM index. Each panel plots the CAAR by event day, –14 to +14, with asymptotic and bootstrap CIs.)

Figure 6. Multivariate event study for the sanction announcement of 6 April 2018, estimated CAARs for different sectors of the Russian economy, h = 15

(Panels, CAAR by event day from –7 to +7: (a) Wide market, (b) Oil & Gas, (c) Power, (d) Telecom, (e) Metals & Mining, (f) Finance, (g) Consumption, (h) Chemicals, (i) Transport.)

Note: due to the similarity between asymptotic and bootstrap CIs, only bootstrap CIs are presented. Sectors with yellow (green) fill colour of CI have statistically significant (insignificant) negative CAARs after the sanction announcement. The market model is estimated on the basis of the MSCI EM index. Samples of companies in each case are similar to the samples for the calculation of Moscow Exchange indices on 6 April 2018: (a) MICEXBMI (100 companies), (b) MICEXO&G, (c) MICEXPWR, (d) MICEXTLC, (e) MICEXM&M, (f) MICEXFNL, (g) MICEXCGS, (h) MICEXCHM, (i) MICEXTRN.

The results for 12 September 2014 are provided in Table 2 (MSCI EM)
and Table 3 (IMOEX) in Appendix D. The results for 6 April 2018 are given
in Table 4 (MSCI EM) and Table 5 (IMOEX) in Appendix E. The estimates in
the case of IMOEX are consistently lower than in the case of MSCI. As was
mentioned in Section 3.1, this is because the IMOEX is highly correlated
with the returns of targeted companies with large capitalisation. A graphical
presentation of the findings can be found in Figure 4 for 12 September 2014
and Figure 5 for 6 April 2018.
The analysis shows that the sanctions that were imposed on 12 September
2014 had an insignificant effect on stock returns, whereas the impact of the
sanctions announced on 6 April 2018 is substantial and equals about 20%. None
of the four major event study hypotheses explained and defined in Appendix C
was rejected in the first case, while all were rejected with almost zero 𝑝-value
in the second. The results may be explained as follows: (1) the SSI list is not as
restrictive as the SDN list; (2) investors’ negative expectations about the scale
of the upcoming sanctions were not eventually justified on 12 September 2014;
(3) investors underestimated the scale of the future sanctions prior to 6 April 2018
and were caught off guard.
Multivariate event studies are also useful for estimating the effect of a sanction
announcement on the market as a whole and on its different sectors separately.
Figure 6 graphically presents the results for 6 April 2018 (see also Figure 11
for h = 29 in Appendix E). Figure 10 represents the reaction of the market on
12 September 2014 (Appendix D). On 12 September 2014, the pattern of CAARs
is insignificantly positive for the wide market and for each of the sectors, except
for Oil and Gas, where returns are significantly positive. When it comes to 6 April 2018, the
analysis shows that the greatest negative impact was made on such sectors as
metals and mining, finance, consumption and transport.
One of the standard problems in financial econometrics is that sometimes
the estimate of β in the market model appears to be sensitive to the choice of the
estimation window. Thus, the results of an event study can potentially depend
on the width of the estimation window l (see Appendix C). Moreover, the results
can also differ depending on the choice of h, the width of the event window.
In order to check that the result of the multivariate event study for 6 April 2018 is
robust to the choice of l and h, I conducted calculations for l ∈ {150, 200, 250, 300}
and h ∈ {11, 13, 15, ..., 41}. The results for the CAAR at day τ = 1 (right after the
sanction announcement) are summarised in Figure 7. In Appendix E, Figure 12
presents similar results for τ = 5. In both cases, the estimates prove to be more
sensitive to changes in h than to changes in 𝑙. This is logical, because the wider
the event window is, the more abnormal returns are included in the calculation
of CAARs. Moreover, with higher h, the bootstrap CIs become wider, since they
account for the variation in the larger number of the previous abnormal returns.
Overall, the robustness check demonstrates that the optimal choice for h is 23,
because it ensures that the estimates of the CAARs are almost identical to the
averages of the estimates across different h values.
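
The computation behind this check is essentially a double loop over window widths; in the sketch below, estimate_caar is a hypothetical stand-in for the multivariate event study with the given windows.

def estimate_caar(l, h, tau=1):
    # Stand-in: multivariate event study CAAR from the start of the event
    # window up to day tau, with an estimation window of width l
    return 0.0

grid = {(l, h): estimate_caar(l, h)
        for l in (150, 200, 250, 300)
        for h in range(11, 42, 2)}
# For each l, the estimates are compared with their average over h;
# h = 23 is picked because its estimate is closest to that average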

Figure 7. Robustness check for the CAAR right after the sanction announcement (τ = 1)

(Four panels: CAAR at τ = 1 against h ∈ {13, ..., 41} for l = 150, 200, 250 and 300.)

Note: CAARs from the beginning of the event window up to the 1st day after the sanction announcement are calculated for different event (h) and estimation (l) windows. The market model is based on the MSCI EM index. Bootstrap CIs at 5% significance for each CAAR are plotted as the vertical lines with dashes on the ends. The horizontal black dashed line represents the average value of the CAAR for different h values. The horizontal black dotted lines represent the averages of the CIs.

6. Conclusion

On the basis of a firm-specific event study of eight sanction announcements targeting 14 entities at different times, the following conclusion is made.
On average, the announcement of the imposition of a sanction that targets a Russian
public company has a negative impact on its equity returns. The average effect is
measured as the estimate of the expected cumulative abnormal return within a
window of seven trading days after the announcement of sanctions and amounts
to ‒5.4%. The estimate can be considered as statistically significant at the 10% level.
At the same time, one should take into account that this result depends crucially
on the presence of the 6 April 2018 sanction announcement in the sample of events.
Without this data, the statistical significance of the results would be much lower.

Multivariate event study analysis of the two most extensive sanctions imposed
by the US on the Russian economy, on 12 September 2014 and on 6 April 2018,
shows that the effect of the announcement of the former was statistically equal
to zero, whereas for the latter, estimates of the expected cumulative abnormal
return stabilise at around ‒20% as early as the 5th day after the announcement.
On the one hand, this result indicates the failure (in the context of equity markets)
of sectoral sanctions, which were designed specifically to restrict the foreign
financing of large-cap state-affiliated Russian public companies. On the other
hand, the result demonstrates the effectiveness of the standard time-tested OFAC
procedure of inclusion in the SDN list.
When it comes to areas for further research, it would be of substantial scientific
interest to investigate the immediate intra-day reaction of the market to the
announcement of sanctions. The complete list of sanctions described in Section 3.2
contains the exact time (down to the minute) of each of the announcements, and the
Moscow Exchange Type B data set covers all intra-day transactions, which makes it
possible to do such research in the framework of the market microstructure theory.

Appendices are available at http://rjmf.econs.online/en; dx.doi.org/10.31477/rjmf.202001.74

References

Anatolyev, S. (2007). The Basics of Bootstrapping. Quantile, 3, pp. 1–2. [In Russian].
Available at: http://www.quantile.ru/03/03-SA.pdf [accessed on 28 December 2019].
Ankudinov, A., Ibragimov, R. and Lebedev, O. (2017). Sanctions and the Russian Stock
Market. Research in International Business and Finance, 40, pp. 150–162.
Binder, J. J. (1985). On the Use of the Multivariate Regression Model in Event Studies.
Journal of Accounting Research, 23(1), pp. 370–383.
Campbell, J. Y., Lo, A. W. and MacKinlay, A. C. (1997). The Econometrics of Financial
Markets, Vol. 2. Princeton, NJ: Princeton University Press.
Chou, P.-H. (2004). Bootstrap Tests for Multivariate Event Studies. Review of Quantitative
Finance and Accounting, 23(3), pp. 275–290.
Dreger, C., Kholodilin, K. A., Ulbricht, D. and Fidrmuc, J. (2016). Between the Hammer
and the Anvil: The Impact of Economic Sanctions and Oil Prices on Russia’s Ruble.
Journal of Comparative Economics, 44(2), pp. 295–308.
Fama, E. F., Fisher, L., Jensen, M. C. and Roll, R. (1969). The Adjustment of Stock Prices
to New Information. International Economic Review, 10(1), pp. 1–21.
Galtung, J. (1967). On the Effects of International Economic Sanctions, with Examples
from the Case of Rhodesia. World Politics, 19(3), pp. 378–416.
Gurvich, E. and Prilepskiy, I. (2016). The Impact of Financial Sanctions on the Russian
Economy. Voprosy Economiki, 1, pp. 5–35. [In Russian].
doi: 10.32609/0042-8736-2016-1-5-35
Hein, S. E. and Westfall, P. (2004). Improving Tests of Abnormal Returns by Bootstrapping the Multivariate Regression Model with Event Parameters.
Journal of Financial Econometrics, 2(3), pp. 451–471. doi: 10.1093/jjfinec/nbh018
Hufbauer, G. C., Schott, J. J. and Elliott, K. A. (1990). Economic Sanctions Reconsidered:
History and Current Policy, Vol. 1. Washington, DC: Peterson Institute.
Kholodilin, K. A. and Netšunajev, A. (2019). Crimea and Punishment: The Impact
of Sanctions on Russian Economy and Economies of the Euro Area.
Baltic Journal of Economics, 19(1), pp. 39–51.
Kirshner, J. (1997). The Microfoundations of Economic Sanctions. Security Studies, 6(3),
pp. 32–64.
Kothari, S. and Warner, J. B. (2007). Econometrics of Event Studies. In: B. E. Eckbo,
ed., Handbook of Corporate Finance: Empirical Corporate Finance, Vol. 1. Elsevier,
pp. 3–36.
Lefebvre, J. (2007). The Bootstrap in Event Study Methodology. M. Sc. Thesis, Université
de Montréal. Available at: https://papyrus.bib.umontreal.ca/xmlui/handle/1866/1503
[accessed on 28 December 2019].
Naidenova, J. and Novikova, A. (2018). The Reaction of Russian Public Companies’ Stock
Prices to Sanctions against Russia. Journal of Corporate Finance Research, 12(3),
pp. 27–38.
Nivorozhkin, E. and Castagneto-Gissey, G. (2016). Russian Stock Market in the
Aftermath of the Ukrainian Crisis. Russian Journal of Economics, 2(1), pp. 23–40.
Zarate, J. (2013). Treasury’s War: The Unleashing of a New Era of Financial Warfare.
London, UK: Hachette.

Expected and Unexpected Consequences of Russian Pension Increase in 2010
Ivan Suvorov, University of North Carolina at Chapel Hill1
isuvorov@live.unc.edu

This paper studies the effects of the pension increase in Russia in 2010 on the labour force participation decisions and living arrangements
of senior people and their family members. There is not much research
on the effects of pension rises in Russia. In particular, researchers have
not yet analysed the influence of pension increases in Russia on non-
elderly people nor the heterogeneity of this influence. The increase in
pensions in 2010 is of particular interest due to its unique magnitude, its
relative independence from economic trends in Russia at that time, and
its plausible exogeneity for pensioners.
This study provides evidence that this jump in pensions caused an
approximately 5 percentage point increase in the share of seniors
who chose to retire. The effect was stronger in the two biggest
cities of Russia, namely Moscow and Saint Petersburg, where before 2010
a substantial number of people continued to work after reaching the
pension age. One out of four employed pensioners living in these cities
left the labour force in 2010. In addition, this paper shows a relatively
unexpected external effect on younger individuals. The labour force
participation decisions of younger people who lived with pension receivers
were influenced considerably. The non-seniors who lived with pensioners,
compared with their peers, were less likely to work or to look for jobs.
The change in pensions also affected living arrangements. The rate of
pension receivers living with their children and grandchildren went up
significantly. Thus, the evidence from the 2010 pension increase highlights
the fact that policies might have an impact not only on the target group of
population, but on the family members of this group as well.

Keywords: labour force participation, pensions, retirement
JEL Codes: J14, J26

Citation: Suvorov, I. (2020). Expected and Unexpected Consequences of Russian Pension Increase in 2010. Russian Journal of Money and Finance, 79(1), pp. 93–112. doi: 10.31477/rjmf.202001.92

1 The author is a New Economic School (MAE’19) graduate, a winner of the 2019 Economic Research Competition organised by the Bank of Russia and the Russian Journal of Money and Finance for students and postgraduates, and a PhD student of the University of North Carolina at Chapel Hill.

1. Introduction
The pension system is an important issue in Russia. In 2010, total retirement
benefits in Russia were as large as 10% of Russian GDP.2 The Russian Federation
inherited a pay-as-you-go system from the Union of Soviet Socialist Republics.
During his presidency Vladimir Putin has already introduced two significant
reforms of the Russian pension system (in 2002 and 2005). Recently, in 2018,
President Putin started a new reform. According to many economists, the
reform was long overdue. The reform is expected to be significant enough to
have a considerable impact on the Russian economy as a whole. To understand
the potential consequences of this reform, it is important to analyse the previous
changes in the Russian pension system.
A number of articles have been devoted to the Russian pension system and its
changes. My research concentrates on the study of the consequences of pension
increases in Russia. There is little research on this particular topic.
Some articles analyse the pension system during Russia’s transition from
a state-planned to a market-based economy and provide ideas for further
modifications. De Castello Branco (1998), who performed such an analysis,
proposed improving the existing pay-as-you-go system and only then focusing
on a more complex structure.
The 2002 Russian pension reform is a relatively well-studied case.
Gontmakher  (2009) stresses the complexity of the pre-reform situation.
In Gel’man (2017), the 2002 Russian pension reform is described as a compromise
between quite different policy positions. Despite the fact that the compromise was
achieved, the reform did not tackle the problems of the pension system. Afanasiev
(2003) highlights the drawbacks of the 2002 pension reform. On the other hand,
Kovrova (2007) manages to find positive effects of that reform as well as of the 2005
Russian pension reform.
Overall, even after the 2002 and 2005 Russian pension reforms, researchers
have held a consensus about the presence of significant unsolved problems
in the Russian pension system. Economists repeatedly suggested an increase
of the pension age in Russia (Sinyavskaya, 2005; Hauner, 2008; Maleva and
Sinyavskaya, 2010; Gurvich and Sonina, 2012). It is only today that the pension
age is finally increasing in the country. The 2018 pension reform gradually
increases the current pension age from 55 to 60 for females and from 60 to 65
for males. Supporters of the reform underline that the pension age increase will
allow a significant increase in the size of pensions.3
In contrast to the researchers, Russian society did not expect the pension age to
increase.4 The initial public reaction to the reform was unambiguously negative.5

2 See https://www.vesti.ru/doc.html?id=316634
3 See https://tass.ru/ekonomika/5290888
4 See https://carnegie.ru/commentary/77015
5 See https://fom.ru/posts/14043

President Putin made an attempt to change public opinion and spoke to the
nation about this issue personally.6 This did not help change the situation. Local
elections were held in September 2018. In spite of the support of the president,
many pro-government candidates failed to win the elections. Some political
observers claim that it was the pension reform which caused these failures.7

6 See https://www.youtube.com/watch?v=1UYMdAABvnw
7 See https://www.pravda.ru/news/politics/1392918-edinorossy/
Overall, the existing research on the topic had the following goals: analysis
of the 2002 and 2005 pension reforms, the identification of the effective
pension age (Sinyavskaya, 2005), the examination of the problems of the
Russian pension system (Gurvich and Sonina, 2012) and the reasons for them
(Solovyev, 2013), discussion of the pros and cons of possible pension age increase
(Maleva and Sinyavskaya, 2010), and the study of the determinants of senior
people’s participation in the labour force (Danielyan, 2016; Lyashok, 2017).
What is completely missing in the research in this area so far is the analysis
of the economic and social effects of a pension increase on other groups of the
population and the study of the heterogeneity of these effects. My research aims
to fill this gap.
The rise in pensions in 2010 was of a unique magnitude. The outstanding scale
of that rise was initially underlined in the research by Lyashok (2017). In addition
to the magnitude, Lyashok highlighted the exogeneity of the increase (from the
viewpoint of pensioners) and the independence of that rise from economic
trends as a rationale to study this particular rise. I furthermore underline the
independence of that pension increase from economic trends in Russia at that
time. The average real incomes of non-retired and retired people followed
common trends throughout all recent years except for the jump in 2010Q1 (see
Figure 1). The 2010 pension increase allows the separation of the effect of pension
rises from the influence of other variables correlated with such changes. The
unique magnitude of the 2010 pension increase allows the removal of possible
bias due to omitted variables. One such example is an individual’s health:
a person in better health is more likely to stay in the labour force and
to have higher wages and a higher pension, ceteris paribus. Concentration on the sharp
pension increase of 2010 makes it possible to see the effects of pension increase
per se. The consequences of a sharp increase in pension can provide an idea about
the possible aftermath of other changes in the Russian pension system, including
the current reform.
My work relates to the literature addressing the relationship between
retirement and pensions (see e.g. Malkova, 2020; Lyashok, 2017; Manoli and
Weber, 2016). I expected to see that a sharp increase in retirement benefits led
to the retirement of a higher percentage of senior people. Indeed, my research
proved this to be true. Furthermore, the research of Schröder-Butterfill (2004)
demonstrates that inter-generational family support provided by older people can
influence the degree of economic independence of their children. With higher


pensions, elderly people are capable of providing more significant support to
their relatives. Therefore, I expected to find an external effect caused by the
2010 pension increase on employment among younger adults living with their
retired parents. I found that this is true. Additionally, I expected to see evidence
to support the result of Gruber et al. (2009) about the absence of substitution
between older and younger workers. Again, the findings met expectations.
In addition, I found an effect on living arrangements. The existing evidence is
ambiguous on this issue. Bianchi (1987) and Young (1987) show that large family
economic resources increase the probability of coresidence. Similarly, Manacorda
and Moretti (2006) show that higher income of seniors raised the proportion of
men coresiding with their parents. However, Goldscheider and DaVanzo (1989)
suggest that parents’ large income increases the likelihood of children leaving
parents’ homes. Specifically in the Russian case, Lokshin et al. (2000) show that
in times of economic downturn, single mothers prefer to coreside. My research
in turn indicates that the effect of Russian seniors’ acute increase in income on
coresidence with younger generations is a positive one.
My research emphasises the fact that a policy might have an effect not only
on the intended group, but also on their relatives and other family members.
In  particular, my findings are relevant to the current 2018 pension reform in
Russia. In order to make predictions about the consequences of this reform, one
has to analyse the aftermath of previous significant changes in the Russian pension
system. The results of my research provide an idea about possible consequences of
the current reform and elucidate some characteristics of Russian society in general.
While enriching the literature on the Russian pension system, my research also
contributes to the research on labour and family economics.
The rest of this paper is organised as follows. Section 2 provides background
information. Section 3 describes the data. Section 4 discusses the empirical
strategy. Section 5 presents the main results. Section 6 makes robustness checks,
provides some historical evidence, and explores heterogeneity. Section 7 discusses
the findings. Section 8 concludes.

2. Background

Speaking about federal expenditures on retirement benefits in 2010, Vladimir Putin pointed out that these expenditures were the highest in Russian history.8
Despite the fact that the 2010 pension increase was of a unique magnitude, the
authorities had announced it just a few months before9 and senior people did not
expect such a sharp rise in pensions. This increase in pensions was based on the
law on labour pension and did not influence special groups of people who retired
earlier, such as military officials. The 2010 pension increase was calculated based
on employment history and the cost of living in the region.10 Some economists
expressed concern about whether the government would be able to maintain the high
level of pensions reached in 2010.11 In nominal terms, the average pension
increased from 4909.8 rubles at the start of 2009 to 8156.8 rubles (66% larger) at
the beginning of 2011.12

8 See https://www.vesti.ru/doc.html?id=316634
9 See https://ria.ru/20091130/196272604.html
It is important to note that the state pension is a major part of the income
of most retired Russian individuals. Despite the 2010 rise and regular increases
in pensions,13 these particular social benefits are considered to be quite small –
Russians say that the pensions they deserve should be more than twice as high as
they are now.14 However, the majority of Russians believe that the pension will be
their major source of income when they reach the pension age.15
Pensions and the opinions of retired people carry considerable weight
in Russian politics.16 Opportunistic cycles in pensions could therefore be expected.
On the other hand, the general public and the media tend to become more
interested in retired individuals during the anniversaries of the victory in World
War II. Before any anniversary, the media usually compares the well-being of
retirees in Russia and Germany.17
The celebrations of the victory in World War II, the so-called ‘Victory Days’,
are becoming more and more significant in Russia,18 so an influence of
World War II victory anniversaries on pension size can be anticipated. Durante
and Zhuravskaya (2016) demonstrate that politicians can exploit special events
to conceal their misdeeds. ‘Victory Days’ illustrate that politicians
also exploit such special occasions to show off.

3. Data

In my study, I utilise two datasets. The first dataset being used is the
quarterly data from the Household Budget Survey (HHBS) by Rosstat.19
The HHBS is a nationally representative survey. I use the data on observations
between 2003 and 2015. Rosstat provides databases for each quarter of the
year: one with information about income, a separate one with information about
sources of income, and another with the characteristics of households’ members.
More than 100,000 individuals within approximately 50,000 households are surveyed
each quarter.

10 See http://xn--b1agvbq6g.xn--p1ai/news/economy/valorisation/
11 e.g. see https://echo.msk.ru/blog/gontmaher/637242-echo/
12 See https://www.fedstat.ru/indicator/31455
13 See https://tass.ru/info/4853990
14 e.g. see https://tass.ru/ekonomika/7060617
15 See https://fom.ru/Ekonomika/14079
16 See https://www.bbc.com/russian/russia/2010/09/100908_putin_pensions_politics
17 e.g. see https://www.newsru.com/finance/09may2013/pensii.html
18 See https://youtu.be/nZs2ajTKpRw
19 See https://gks.ru/folder/13397
The second dataset being used is the Russian Longitudinal Monitoring
Survey – Higher School of Economics (RLMS-HSE). 20 The RLMS-HSE is
an annually run nation-wide representative study designed to analyse the
consequences of the Russian reforms on the welfare of Russian households and
individuals. The RLMS-HSE contains questions about individuals’ objective
and subjective well-being. Each year approximately 15,000 individuals who
comprise more than 5,000 households are surveyed. The survey is conducted
in the period between October and December. I use the data on observations
between 2000 and 2018.
Several terms used in this study are specific to this paper.
Senior people are respondents of pension age, i.e. men older than 60 and women
older than 55. A retired household is a household which has pension as the
main source of income. The HHBS contains a direct question about the major
source of income. As for the RLMS, I choose retired households based on the
criterion that their income from pension is more than half of their total income.
Respondents of working age are people who are younger than the pension age
but who are older than 18 at the time of survey. NLFJ means the percentage of
the respondents of working age who both did not work and did not look for a
job for the whole quarter. The head of a household is the member of a household
who brings in the major part of the household’s income. Household employment
is the employment of the head of a house and/or her/his spouse. Relative income
is defined in this paper as the income of an individual divided by the average
income of all respondents that year or a quarter of that year. The CPI21 is used in
order to account for inflation.
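
To make these definitions concrete, relative income and CPI deflation might be computed along the following lines (the column names and CPI values are hypothetical):

import pandas as pd

df = pd.DataFrame({'year':   [2009, 2009, 2010, 2010],
                   'income': [8000., 12000., 9000., 15000.],
                   'cpi':    [1.000, 1.000, 1.088, 1.088]})   # illustrative CPI levels

df['real_income'] = df['income'] / df['cpi']    # deflate nominal income by the CPI
# Relative income: individual income divided by the average income
# of all respondents in the same year (or quarter)
df['rel_income'] = df['income'] / df.groupby('year')['income'].transform('mean')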
The HHBS dataset has its limitations. Only the total income of a household
(HH) is specified. The question about the main source of income is asked in this
survey and a pension is one of the possible answers. This allows me to consider
retired households. However, the reasons for a HH member to get a pension are
not listed in this question about the major source of income. As a result, the
main source of income for some retired households could be, for example, a
military pension due to the recent military service of HH members. The income
of such HHs was not considerably influenced by the 2010 pension increase, if
it was influenced at all. I analyse these retired households only so as to support
the evidence of the significance of the 2010 pension increase and I also use
these households one more time so as to show an increase in the number of
households that live on pensions. In both cases, the presence of a group which
was not influenced by the 2010 pension increase is expected to make the changes
I am looking at less significant. The fact that these changes are apparent even
with the presence of a group who were not considerably influenced by the 2010
pension increase speaks for the significance of these changes.

20 See https://www.hse.ru/en/rlms/
21 See http://www.gks.ru/free_doc/new_site/prices/potr/tab-potr1.htm

4. Empirical strategy

My identification strategy is based on a regression discontinuity (RD) design and a difference-in-differences technique (see Angrist and Pischke, 2009; Cameron
and Trivedi, 2005; Lee and Lemieux, 2010). RD estimates treatment effects in
a quasi-experimental setting where treatment is determined by whether some
assignment variable exceeds a known cutoff point. In my case, the treatment is the
pension increase in 2010 and the assignment variable is time. Under the reasonable
assumption that other variables which could influence the variables of interest
do not jump in 2010Q1 (see the discussion of this issue in Introduction), RD can
identify pension increase and its effects on Russians. Based on this technique,
I explore jumps in 2010Q1 and in the years of anniversaries of the World War II
victory using the following specification:

Yit = β0 + β1 It≥T* + f(t) + f(t) × It≥T* + g(t) + εit,  with T* = 2010Q1 or an anniversary year,

where Y𝑖t is the variable of interest, f(t) stands for the linear trend variable, and
g(t) stands for the time-fixed effects (seasonal dummies). β1 is the main coefficient
in these regressions: it shows a change in the variable of interest in 2010Q1 or in an
anniversary year relative to the trend and seasonal fluctuations.
The difference-in-differences technique studies the differential effect of a
treatment on a treatment group versus a control group in a natural experiment.
In my case, my treatment group are younger people living in households with
retired senior people and my control group are those living in households without
retired seniors. The technique is implemented by the following specification:

Yit = β0 + β1 (IRetiredInHH × It≥2010) + β2 IRetiredInHH + β3 It≥2010 + f(t) + f(t) × It≥2010 + g(t) + εit,

where IRetiredInHH is an indicator for the presence in the household of a person who
receives a pension. Variables of interest, Y𝑖t, stand for the indicator that illustrates
that a person worked for the whole quarter of the year (see Table 2) and the
indicator that shows that a person was out of labour force for a whole quarter of the
year (see Table 3). β1 is the main coefficient. It shows a change in the variable of
interest in 2010Q1 relative to the trend and seasonal fluctuations among the
respondents of a treatment group in comparison to a control group. Figures 10
and 11 confirm the parallel pre-trends assumption which is necessary for the
identification of true β1. Standard errors in all regressions are clustered by regions
and the types of settlement.
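
Under these assumptions, both specifications amount to cluster-robust OLS. The following sketch uses synthetic data and hypothetical variable names; it is not the exact estimation code, but it reproduces the structure of the two regressions:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 4000
df = pd.DataFrame({'trend':   rng.integers(0, 52, n),    # quarter index, 2003Q1 = 0
                   'retired': rng.integers(0, 2, n),     # 1{pensioner in the household}
                   'cluster': rng.integers(0, 80, n)})   # region x settlement type
df['quarter'] = df['trend'] % 4                          # seasonal dummies g(t)
df['post'] = (df['trend'] >= 28).astype(int)             # 1{t >= 2010Q1}
df['y'] = (rng.random(n) < 0.15 + 0.02 * df['retired'] * df['post']).astype(int)

# RD-style jump regression: level shift plus a trend break and seasonal dummies
rd = smf.ols('y ~ post + trend + trend:post + C(quarter)', data=df).fit(
    cov_type='cluster', cov_kwds={'groups': df['cluster']})

# Difference-in-differences: beta_1 is the coefficient on retired:post
did = smf.ols('y ~ retired*post + trend + trend:post + C(quarter)', data=df).fit(
    cov_type='cluster', cov_kwds={'groups': df['cluster']})
print(did.params['retired:post'])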

5. Main results

Lyashok (2017) identified and analysed the sharp increase in pensions in 2010 with the RLMS only. I support the identification and perform analysis of
that jump in pensions using the more frequent HHBS in addition to the RLMS.
Both graphs and regressions provide evidence of a discontinuity in the income of
retired households and individuals in 2010Q1 (see Table 1, Figure 2). Regressions
show that the jump in income of retired households and individuals was around
30%, which is in line with Lyashok’s results.
Lyashok (2017) concentrated on the influence of the 2010 pension rise
on retirement decisions. I continue Lyashok’s analysis in several directions.
Firstly, I check for the heterogeneity of that pension increase with respect to
the pensioners’ place of living. Next, I explore the heterogeneity of the effects of
that pension rise. Finally, I study the impact of that 2010 jump in pensions on
other population groups.
As Figure 3 shows, retired individuals living in cities are always better-off than
those residing in rural areas in terms of real income. The 2010 pension increase
appears to be homogeneous with respect to place of living. The discontinuity is
clear in the relative income of both types of retired individuals, rural and urban
(see Figure 4).
Lyashok (2017) analysed changes in retirement decisions using RLMS data.
I show the discontinuity in the labour force participation of pensioners via the more
frequent HHBS data (see Figure 5). In 2010, the percentage of seniors who were not
working went up by 5 percentage point (p.p.), from 69% to 74%, which is similar to
the results of Lyashok (2017). Rural labour markets were hardly influenced by the
2010 pension increase, while those of Moscow and Saint Petersburg were influenced
dramatically (see Figures 6 and 7). Due to these changes in retirement decisions,
the average age of working pension receivers changed. The discontinuity in the
average age of working male pensioners is considerable (see Figure 8). Retirement
was hardly forced. Respondents almost always claim that retirement was their own
decision (see Figure 9). All of the above demonstrates that the 2010 rise
in pensions resulted in a jump in retirement.
The pension increase in 2010 influenced employment not only among
pension receivers, but among younger people as well. Respondents of working
age who live with pensioners were affected most (see Figures 10 and 11).
The difference-in-differences technique is used in order to see the impact
of coresidence with pensioners on the employment behaviour of such
respondents (see Tables 2 and 3). All these changes led to a jump in the percentage
of retired households (see Figure 12).

Table 1. The discontinuity in the real income of retired individuals and HHs in 2010

Variables               RLMS retired individuals,   HHBS retired HHs,
                        log (real income)           log (real income)
It>=2010                0.208*** (0.0133)           0.197*** (0.0159)
Trend                   0.0943*** (0.00207)         0.103*** (0.00476)
Trend × It>=2010        ‒0.0930*** (0.00314)        ‒0.0714*** (0.00605)
Constant                6.804*** (0.0343)           8.702*** (0.0149)
Seasonal Dummies        –                           Yes
Observations            52,189                      633,894
R-squared               0.504                       0.276

Note: robust standard errors in parentheses; *** – p < 0.01.

Table 2. Difference-in-differences estimation of the pension increase effect on the employment of respondents of working age

Dependent variable: worked for the whole quarter
IRetiredInHH × It>=2010    ‒0.0285*** (0.00565)
IRetiredInHH               ‒0.0544*** (0.00519)
It>=2010                   ‒0.0113*** (0.00422)
Trend                      0.00589*** (0.00171)
Trend × It>=2010           0.00337 (0.00211)
Constant                   0.771*** (0.00615)
Seasonal Dummies           Yes
Observations               3,860,070
R-squared                  0.006

Note: robust standard errors in parentheses; *** – p < 0.01.
Source: HHBS, author's calculations

Table 3. Difference-in-differences estimation of the pension increase effect on the desire of non-employed respondents of working age not to look for a job

Dependent variable: did not work and did not look for a job for the whole quarter
IRetiredInHH × It>=2010    0.0214*** (0.00468)
IRetiredInHH               0.0458*** (0.00401)
It>=2010                   0.000513 (0.00340)
Trend                      ‒0.00537*** (0.00126)
Trend × It>=2010           0.000525 (0.00155)
Constant                   0.193*** (0.00445)
Seasonal Dummies           Yes
Observations               3,860,070
R-squared                  0.005

Note: robust standard errors in parentheses; *** – p < 0.01.
Source: HHBS, author's calculations



Figure 1. The average real income of HHs without seniors and that of households living on pensions
(%; quarterly averages and trends, 2005–2015)
Source: HHBS, author's calculations

Figure 2. The real income of retired households
(Real rubles with respect to 2003; HHBS and RLMS quarterly averages and trends, 2003–2015)
Source: RLMS, HHBS, author's calculations

Figure 3. The real income of retired people
(Real rubles with respect to 2000; urban and rural annual averages and trends, 2000–2016)
Source: RLMS, author's calculations



Figure 4. The relative income of retired people with respect to all respondents
(Rate; urban and rural annual averages and trends, 2000–2016)
Source: RLMS, author's calculations

Figure 5. The percentages of seniors who did not work for the whole quarter
(%, 2003–2015)
Source: HHBS, author's calculations

Figure 6. The percentages of seniors who worked for the whole quarter and lived in rural areas, Moscow or Saint Petersburg
(%, 2005–2015)
Source: HHBS, author's calculations



Figure 7. The log percentage of seniors who worked for the whole quarter and lived in rural areas, Moscow or Saint Petersburg
(Log %, 2005–2015)
Source: HHBS, author's calculations

Figure 8. The average age of working male seniors
(Age, 2005–2015)
Note: the graph shows the average age of males who are older than ‘retirement age’ and continue to work.
Source: HHBS, author's calculations

Figure 9. The rate of seniors who were forced to leave a job by their employer as a share of seniors who left their jobs
(Rate, 2007–2017)
Note: see Appendix C for details.
Source: RLMS, author's calculations



Figure 10. The percentages of respondents of working age who did not work for the whole quarter
(%; HHs with and without seniors, quarterly averages and trends, 2003–2015)
Source: HHBS, author's calculations

Figure 11. The percentages of respondents of working age who did not work and did not look for a job for the whole quarter
(%; HHs with and without seniors, quarterly averages and trends, 2005–2015)
Source: HHBS, author's calculations

Figure 12. The rate for retired HHs as a share of HHs with seniors
(Rate, 2005–2015)
Source: HHBS, author's calculations



Figure 13. The rate for HHs with only seniors as a share of HHs with seniors
(Rate, 2004–2014)
Source: HHBS, author's calculations

Figure 14. The rate for HHs with children and seniors as a share of HHs with seniors
(Rate, 2004–2014)
Note: children are people younger than 18 years old.
Source: HHBS, author's calculations

Figure 15. The average number of people in retired households
(Average number of people, 2006–2014)
Source: HHBS, author's calculations

Figure 16. The real income of retired households
(Real rubles with respect to 2003; total and per capita, 2005–2015)
Source: HHBS, author's calculations



The pension increase in 2010 also had an influence on Russians’ living arrangement decisions. Considering the ratio of households with pensioners only to
households where there are other individuals, one can see a drop in the percentage
of pensioners living alone (see Figure 13). Furthermore, in the households
with pension receivers, the presence of children increased (see Figure  14).
The consequence of these changes was an increased number of members of retired
households (see Figure 15). Remarkably, the real income of retired households
per capita did not have a discontinuity in contrast to the total real income of
retired households (see Figure 16).

6. Some historical evidence, robustness checks, and heterogeneity

The pension increase in 2010Q1 is part of a long, ongoing series of rises.
Pensioners’ income changes have a very special pattern. The relative income of
retired individuals and households follows the cycles of the anniversaries of the
victory in World War II (see Figure A1, Appendix A). Regressions (see Table A1,
Appendix A) show that there are peaks in the anniversaries relative to the trend
and the years around anniversaries.
I check for the absence of discontinuities in predetermined covariates
that may potentially confound my results.
of the population increased smoothly between the years 2004 and 2015 (see
Figure A2, Appendix A). I also look at the percentage of pension receivers with
higher education. Figure A3, Appendix A, illustrates that nothing extraordinary
happened with this percentage in the period of pension increase.
I do robustness checks for three of my major results. First, I do a robustness
check for extra entry into retirement. Naturally, the subgroups of pensioners
are expected to experience the influence of pension increase as well. Figure A4,
Appendix A, shows a discontinuity in the percentage of retired people among
people older than 65. Thus, the result of extra entry into retirement is robust to
the choice of this older age group.
Next, I do a robustness check for the jump in NLFJ (see Table A2, Appendix A).
I include the indicators of jumps in 2009 and 2011, and I also include the products
of these indicators of jumps and the indicators of the presence of pensioners
in households in the basic regression. The coefficients for these products of
indicators are statistically insignificant. However, the coefficient for the product
of the indicator of a jump in 2010 and the indicator of a pensioner’s presence
in a household remains statistically significant. This means that my result on the
discontinuity in NLFJ is robust to this check.
Finally, I do a robustness check for the jump in the number of retired
households. To do this, I analyse a specific industry. It is natural to expect the
effect of the pension increase on employment to be most obvious for unskilled
labour. I look at wholesale and retail sales as an example of such work. Based on the
HHBS data, I measure the employment only of the head of household and
her/his spouse. I concentrate on the employment of the head of household.
Figure A5, Appendix A, shows a discontinuity in the employment of heads of
household working in wholesale and retail sales for households with pension
receivers and an absence of any jumps in that for households without pension
receivers. Thus, my result on the discontinuity in the number of retired
households is robust to this check.
Then I check for the heterogeneity of the effect with respect to the type
of organisation a person works in and with respect to the person’s education.
In order to see the former type of heterogeneity, I look at the following ratio: the
number of employed retirees living in a certain type of household to all retirees.
I consider households where the head and the spouse of the head work only in
the public sector, only in the private sector, and in both the public and private
sectors. I see that in absolute terms the public sector was influenced the most (see
Figure B1, Appendix B). However, this is driven by the size of the public sector
in the Russian economy. If one looks at log percentages, it becomes apparent that
the influence of pension increase was similar across different sectors in relative
terms (see Figure B2, Appendix B). Lastly, I consider heterogeneity with respect to
pensioners’ education. The effect of pension increase on the labour participation
of retirees with and without higher education is very similar both in absolute and
relative terms (see Figures B3, B4, Appendix B).

7. Discussion

The relative income of retirees follows cycles which are driven by the
anniversaries of the victory in World War II. The relative income of other
groups of people who receive money from the federal budget depends on these
anniversaries much less. The possible opportunistic cycles for pensions are a
minor issue compared to these ‘anniversary’ cycles. Thus, the Russian government
does not appear to use increases in pensions specifically to win elections.
As noted, the most significant increase by magnitude happened in 2010,
the 65th anniversary of the victory in World War II. The absolute real income of
pension receivers tended to rise before 2010. Then there was a significant jump in
the income of retired households and pensioners in 2010. Later, the real income
of pensioners did not change significantly.
One of the most significant consequences of the 2010 pension increase is a
jump in the percentage of pensioners who chose to retire. From 2003 to 2015,
there was a tendency for pensioners to be increasingly employed. Only
the sharp 2010 pension rise was capable of changing this tendency. This signals
the presence of a reference point in pensioners’ minds when they consider labour
force participation decisions.

Due to the pre-existing situation (see Figure 7) and different employment incentives (e.g. consider the possibility of household production for rural
dwellers), the effect on labour markets in rural areas and in the megalopolises was
quite heterogeneous. The labour markets of the largest Russian cities, Moscow
and Saint Petersburg, were influenced more than those of rural areas both in
absolute and relative terms. However, for cases when employment incentives were
not so obviously different, the effect was rather homogeneous.
It was not only pension receivers who stopped working and looking for jobs, but
also people who lived with them during the pension increase. As the comparison
of total and per capita real income of retired HHs shows (see Figure 16), younger
people absorbed all additional money from the government by changing their
living arrangements. The comparison of the labour force participation of younger
individuals living separately from pensioners and those living together with them
implies (see Figure 11) that younger people used this additional money so as
not to work, at least for some period of time.
What can policy makers learn from my findings? My research shows that
pension increases influence not only pensioners, but people living with them, too.
The share of persons living with pensioners among all people not eligible for pensions
throughout 2005–2015 was quite large, specifically 17%–19%. My findings suggest
that an increase in pensions can result in falling labour force participation among
this group. If some pension increase in Russia leads to an X p.p. rise in retirement,
it will also lead to a 0.4·X p.p. drop in labour force participation among younger
individuals who live together with pension receivers or who will start living together
with pensioners after this pension increase (in 2010, the 5 p.p. rise in retirement
came with a roughly 2 p.p. drop among such individuals). In addition, it appears that the effect of
changes in the pension system influences rural areas and large cities very differently.
As stated above, a jump in the percentage of people who were not working –
among individuals below the pension age and living with pensioners – was
identified. However, the trends of employment of these individuals and their peers
living separately from pensioners did not change. These two results combined can
be used in the debates about the current pension reform.
Some politicians say that senior people who stay at work longer due to
the new pension age will take the jobs of younger people.22 The results of my
research speak against this view. In particular, my paper illustrates that younger
individuals do not massively take job positions which became vacant after senior
people’s retirement. In contradiction with the view of substitution, some group of
younger people, namely those living with pensioners, quit jobs, too.
The evidence from the 2010 Russian pension increase also supports the idea
that significant increases in income have a positive influence on the coresidence
of generations in the country. Since people react well to financial incentives, the
government can exploit this relationship of living arrangements and financial

  e.g. see https://rg.ru/2015/03/12/kvoti.html


22
110 Russian Journal of Money and Finance march 2020

incentives to nudge younger people to live together with seniors. Coresidence
with other people is important for older generations: loneliness is often listed as
one of the biggest fears for this group.23

22 e.g. see https://rg.ru/2015/03/12/kvoti.html
23 See https://fom.ru/nastroeniya/12596

8. Conclusion

The paper studies the effects of the pension increase in Russia in 2010 on the
labour force participation decisions and living arrangements of senior people
and their family members. The rise led to an approximately 5 p.p. increase in the
share of pension receivers who chose to retire. The effect on the labour market
depended heavily on the type of settlement. In the biggest Russian cities the
effect was greater. One out of four employed pensioners left the labour force
in Moscow and Saint Petersburg. Also, I find an external effect on people who
live with pensioners. The regression shows that in 2010Q1 there was no jump in
the percentage of people not working and not looking for a job for a whole quarter
among those who lived separately from pensioners. Meanwhile, the increase
in this percentage among those who lived with pension receivers was slightly
above 2 p.p. Additionally, the coresidence of pensioners with other generations
went up. Thus, the paper shows that the pension increase influenced people who
did not receive pensions themselves.
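For readers who want to see the shape of this before/after comparison, the sketch below reproduces its logic on synthetic data. It is a minimal illustration, not the author's actual specification: the variable names, the synthetic quarterly series, and the built-in 2 p.p. jump are hypothetical stand-ins for the survey data used in the paper.

```python
# Minimal sketch of the jump comparison described above (assumptions:
# synthetic data, hypothetical variable names; not the paper's model).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)

n_q = 20                                      # quarters, 2008Q1-2012Q4
t = np.tile(np.arange(n_q), 2)                # quarter index, two groups
coresident = np.repeat([1, 0], n_q)           # lives with a pensioner?
post2010 = (t >= 8).astype(int)               # 1 from 2010Q1 onwards

share_not_working = (
    0.20
    + 0.02 * post2010 * coresident            # built-in ~2 p.p. jump
    + rng.normal(0, 0.003, 2 * n_q)           # sampling noise
)
df = pd.DataFrame({
    "share_not_working": share_not_working,
    "post2010": post2010,
    "coresident": coresident,
})

# Difference-in-differences style regression: the interaction term
# measures the extra 2010Q1 jump among those living with pensioners.
model = smf.ols("share_not_working ~ post2010 * coresident", data=df)
result = model.fit(cov_type="HC1")

# post2010 alone is ~0 (no jump for those living separately); the
# interaction is ~0.02, i.e. the extra ~2 p.p. jump for coresidents.
print(result.params[["post2010", "post2010:coresident"]])
```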
The research can be extended in different ways. Since the labour markets
of Moscow and Saint Petersburg experienced the largest influence of the 2010
pension increase, a more rigorous analysis of its effects on these cities' labour
markets is of particular interest. In addition, the consequences of the increase
can be studied through a structural model of interactions between spouses of
senior age and through dynamic models of retirement decisions.

Appendices are available at http://rjmf.econs.online/en;
dx.doi.org/10.31477/rjmf.202001.92

Call for Papers for students

The Bank of Russia and the Russian Journal of Money and Finance invite students
and PhD fellows to submit papers for the 2nd Economic Research Competition.

Submission topics: central bank's policy; inflation; financial stability; banking;
data analysis: methodology and results.

Key requirements for the research:
- Applied research
- Meeting academic criteria
- The study should have been carried out no earlier than 2017
- When carrying out the research, the author should have possessed a Bachelor's
or Master's degree, or else been working towards a PhD

How to apply:
1. Submit a paper in Russian or English to journal@mail.cbr.ru, putting
'Application for 2020 Bank of Russia Research Competition for students and
doctoral researchers' in the subject line. Give your full name, course of study,
and educational institution.
2. Attach a cover letter containing the research objective, the contribution made
by the authors (the originality of the results when compared against specific
existing studies), results, and conclusions. Papers are accepted in PDF format
(file size not more than 5 MB).

Competition dates: the closing date for submission is 15 May 2020; the winners
will be announced on 30 July 2020.

Competition jury: editors and the Editorial Board of the Russian Journal of Money
and Finance; researchers from the Research and Forecasting Department,
the Monetary Policy Department, and the Financial Stability Department
of the Bank of Russia.

Publication of research papers: the Russian Journal of Money and Finance will
consider the winners' papers for publication.

The competition winners will be invited to participate in the Bank of Russia
Macroeconomics Summer School, the Bank of Russia International Conference,
and the International Financial Congress in 2021 in St Petersburg (travel and
accommodation costs covered).

For more detailed information, please see cbr.ru/eng/ec_research,
rjmf.econs.online/en
