
International Review of Economics and Finance 45 (2016) 559–571

Contents lists available at ScienceDirect

International Review of Economics and Finance


journal homepage: www.elsevier.com/locate/iref

Forecasting the volatility of the Dow Jones Islamic Stock Market Index: Long memory vs. regime switching☆
Adnen Ben Nasr a, Thomas Lux b,c,*, Ahdi Noomen Ajmi d, Rangan Gupta e

a BESTMOD, Institut Supérieur de Gestion de Tunis, Université de Tunis, Tunisia
b Department of Economics, University of Kiel, Germany
c Banco de España Chair in Computational Economics, University Jaume I, Castellon, Spain
d College of Sciences and Humanities in Slayel, Salman bin Abdulaziz University, Saudi Arabia
e Department of Economics, University of Pretoria, South Africa

ARTICLE INFO

Article history: Received 31 July 2014; Received in revised form 22 July 2016; Accepted 25 July 2016; Available online 1 August 2016
JEL classification: G15; G17; G23
Keywords: Islamic finance; Volatility dynamics; Long memory; Multifractals

ABSTRACT

The financial crisis has fueled interest in alternatives to traditional asset classes that might be less affected by large market gyrations and, thus, provide for a less volatile development of a portfolio. One attempt at selecting stocks that are less prone to extreme risks is observance of Islamic Sharia rules. In this light, we investigate the statistical properties of the Dow Jones Islamic Stock Market Index (DJIM) and explore its volatility dynamics using a number of up-to-date statistical models allowing for long memory and regime-switching dynamics. We find that the DJIM shares all stylized facts of traditional asset classes, and estimation results and forecasting performance for various volatility models are also in line with prevalent findings in the literature. Given this proximity to standard asset classes, investments in the DJIM could hardly provide a cushion against extreme market fluctuations. Among the various models, the relatively new Markov-switching multifractal model performs best under the majority of time horizons and loss criteria. Long memory GARCH-type models (FIGARCH and FITVGARCH) always improve upon the short-memory GARCH specification; additionally allowing for regime changes can further improve their performance and also enhances the accuracy of value-at-risk forecasts.
© 2016 Elsevier Inc. All rights reserved.

1. Introduction

In the wake of the recent global financial crisis, it is now well-established that enormous negative impacts have been felt by conventional institutions and markets. Understandably, a need has been felt for exploring alternatives to conventional financial practices that reduce investment risks, increase returns, enhance financial stability, and reassure investors and financial markets. In this context, one has observed a tremendous growth in Islamic Finance products and instruments. These services now sum up to a global industry of around US$2 trillion in assets, 80% of which is accounted for by Islamic banks or Islamic

☆ Thomas Lux gratefully acknowledges financial support from the Spanish Ministry of Science and Innovation (ECO2011-23634), from Universitat Jaume I (P1.1B2012-27), and from the European Union Seventh Framework Programme, under grant agreement no. 612955. All authors acknowledge helpful comments by the reviewer and the Editor-in-charge of the previous version of this paper.
* Corresponding author at: Department of Economics, University of Kiel, Germany.
E-mail addresses: adnen.bennasr@isg.rnu.tn (A. Nasr), lux@bwl.uni-kiel.de (T. Lux), ajmi.ahdi.noomen@gmail.com (A. Ajmi), rangan.gupta@up.ac.za
(R. Gupta).

http://dx.doi.org/10.1016/j.iref.2016.07.014
1059-0560/© 2016 Elsevier Inc. All rights reserved.

windows of conventional banks, 15% Sukuk (Islamic bonds), 4% Islamic mutual funds, and 1% Takaful (Islamic insurance) (Abedifar, Ebrahim, Molyneux, & Tarazi, 2015). As indicated by Abedifar et al. (2015), even though Islamic banking and financial assets comprise less than 1% of total global financial assets, they have grown faster than conventional finance since the outbreak of the crisis. More importantly, this trend is expected to continue into the near future (Abedifar et al., 2015). In addition to the growth in
banking assets, there is increasing competition between major financial centers like London, Kuala Lumpur and Dubai, to take
the lead in the Islamic bonds issuance and to develop a broader array of Islamic investment products (Abedifar et al., 2015;
Nazlioglu, Hammoudeh, & Gupta, 2015).
To understand this tremendous growth in Islamic Finance, following the crisis, one needs to understand the two underlying
principles that define this market, namely the prohibition of interest (Riba) and adherence to other Islamic law (Shariá) require-
ments (Abedifar et al., 2015). As noted by Hammoudeh, Mensi, Reboredo, and Nguyen (2014), Shariá compliance imposes two
types of restrictions on Islamic equity finance: (i) Any companies with involvement in alcohol, tobacco, pork-related products,
gambling, entertainment, weapons, and conventional financial services are screened out, and (ii) Financial ratios are used to
remove companies based on debt and interest income levels. The Shariá-based principle thus aims to prevent speculative financial transactions. For instance, Islamic finance prohibits financial derivatives which do not have any underlying real transactions (like futures and options), government debt issues with a fixed coupon rate, hedging by forward-sale, interest-rate swaps, and any other transactions, like short-sales, that involve items not physically owned by the seller (Hammoudeh et al., 2014).
Against this backdrop, in this paper, we aim to model and forecast conditional volatility of the returns of the Dow Jones
Islamic Market World Index (DJIM), accounting for both the possibility of long memory and structural changes in the volatility
process. The choice of the DJIM is justified by the fact that it is the most widely used, and most comprehensive representative
time series for the Islamic stock market (Hammoudeh et al., 2014). Appropriate modeling and forecasting of volatility of the DJIM
is important due to the fact that, when volatility is interpreted as uncertainty, it becomes a key input to investment decisions
and portfolio choices. The basic idea behind Islamic Finance rests on the belief that it is less risky than conventional equity markets, given the type of investments it involves. However, the statistical evidence in the literature regarding this view is, at best, mixed (Abedifar et al., 2015). Hence, our analysis of the volatility (perceived as uncertainty) of the DJIM is of paramount importance: by comparing our results with the widely available evidence on volatility modeling and forecasting for conventional equity markets based on the same models we use here, we can determine whether Islamic Finance is indeed different from conventional markets or not. Understandably, if it tends to behave similarly to conventional equity markets, it too can have wide repercussions on the economy as a whole, via its effect on real economic activity and public confidence.
Hence, estimates of market volatility can serve as a measure for the vulnerability of the DJIM. While there is a rich literature on
volatility modeling and forecasting of conventional financial assets, not much evidence exists to date with respect to the Islamic
stock market. We try to fill part of this gap using some of the most advanced tools available in contemporaneous econometric
literature.
There exists a large literature on evidence of long memory in the conditional volatility of various financial time series (Andersen & Bollerslev, 1997; Baillie, Bollerslev, & Mikkelsen, 1996; Bollerslev & Mikkelsen, 1996; Davidson, 2004; Ding, Granger, & Engle, 1993; Lobato & Savin, 1998). At the same time, there is also another literature that finds evidence of structural changes in the volatility process of financial variables (Andreou & Ghysels, 2002; Bos, Franses, & Ooms, 1999; Rapach, Strauss, & Wohar, 2008). Not surprisingly, a parallel literature exists that emphasizes the simultaneous role of both long memory and structural changes in characterizing financial returns volatility (Baillie & Morana, 2007; Beine & Laurent, 2000; Lobato & Savin, 1998; Martens, van Dijk, & de Pooter, 2004; Morana & Beltratti, 2004).
Given this line of research on the co-existence of both long memory and structural change in the volatility processes of
financial market data, and following Ben Nasr, Boutahar, and Trabelsi (2010) and Ben Nasr, Ajmi, and Gupta (2014), we estimate
a model for the DJIM returns that allows the volatility of the returns to accommodate both of these features. The idea is to allow for time-dependent parameters in the conditional variance equation of a Fractionally Integrated Generalized Autoregressive Conditional Heteroskedasticity (FIGARCH) model. Specifically, the change of the parameters is assumed to evolve smoothly over time via a logistic transition function, yielding the so-called Fractionally Integrated Time Varying Generalized Autoregressive Conditional Heteroskedasticity (FITVGARCH) model.
Further, a related line of research on long memory and structural changes in volatility discusses the connection between
these phenomena. In fact, volatility persistence may be due to switching of regimes in the volatility process, as first suggested
by Diebold (1986) and Lamoureux and Lastrapes (1990). This literature concludes that it could be very difficult to distinguish
between true and spurious long memory processes. This ambiguity motivates us to include a relatively new type of Markov-
switching model in addition to our array of volatility models (i.e., GARCH, FIGARCH, FITVGARCH) — the Markov-switching
multifractal (MSM) model of Calvet and Fisher (2001). Despite allowing for a large number of regimes, this model is more par-
simonious in parameterization than other regime-switching models. It is furthermore well-known to give rise to apparent long
memory over a bounded interval of lags (Calvet & Fisher, 2004) and it has limiting cases in which it converges to a ‘true’ long
memory process. To the best of our knowledge, this is the first attempt in forecasting the volatility process for the DJIM returns
using a wide variety of advanced volatility models trying to capture long-memory, structural breaks and the fact that structural
breaks can lead to the spurious impression of long-memory. The only closely related paper is that of Ben Nasr et al. (2014),
which compared the in-sample fits of a FITVGARCH with a FIGARCH model, to show the superiority of the former. However, as
is well-known, a better in-sample fit does not guarantee superiority of a model in out-of-sample forecasting, which is, more importantly, what is required for portfolio allocation. Further, unlike us, Ben Nasr et al. (2014) did not include the powerful MSM modeling approach, which, as we show, plays an important role in appropriately modeling and forecasting volatility of the DJIM
returns. The rest of the paper is organized as follows: Section 2 provides basic information on GARCH, FIGARCH, FITVGARCH and
MSM models, while Section 3 presents the data and the empirical results. Finally, Section 4 concludes.

2. GARCH, FIGARCH, FITVGARCH and MSM volatility models

Univariate models of volatility usually consider the following specification of financial returns measured over equally spaced
discrete points in time t = 1, . . . , T:

yt = μt + σt ut,   (1)

where yt = pt − pt−1 with pt = ln Pt the logarithmic asset price, and μt = E[yt | Ft−1] and σt² = Var[yt | Ft−1] the conditional mean and the conditional variance (volatility), respectively. The information set Ft−1 is assumed to contain all relevant information up to period t − 1. Moreover, ut is an independently and identically distributed disturbance with mean zero and variance one. Although ut can be drawn from various stationary distributions, in this study we let ut ∼ N(0, 1). The return components μt and σt can be specified according to the assumed data generating process. For the purpose of this study we use the simple specification μt = μ + ρyt−1. Defining rt = yt − μt, the 'centered' returns can be modeled as

rt = σt ut.   (2)

Now we turn to volatility modeling. Returns in financial markets are typically found to be heteroskedastic with high autocorrelation of all measures of volatility (e.g., squared or absolute returns). To capture this feature, the literature has developed the time-honored class of models with autoregressive conditional heteroskedasticity. As the benchmark version of this class of models, the GARCH(1,1) model of Bollerslev (1986) assumes that the volatility dynamics are governed by

σt² = ω + α rt−1² + β σt−1²,   (3)

where the restrictions on the parameters are ω > 0, α, β ≥ 0 and α + β < 1.
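As a concrete illustration, the recursion in Eq. (3) can be sketched in a few lines of Python (the function name and the parameter values are our own illustrative choices, not estimates from the paper):

```python
import numpy as np

def garch11_filter(r, omega, alpha, beta):
    """GARCH(1,1) recursion of Eq. (3): sigma2_t = omega + alpha*r_{t-1}^2 + beta*sigma2_{t-1},
    initialized at the unconditional variance omega / (1 - alpha - beta)."""
    assert omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1
    sigma2 = np.empty(len(r))
    sigma2[0] = omega / (1.0 - alpha - beta)   # unconditional variance as starting value
    for t in range(1, len(r)):
        sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

rng = np.random.default_rng(0)
r = rng.standard_normal(1000)                  # placeholder returns
s2 = garch11_filter(r, omega=0.01, alpha=0.05, beta=0.90)
```

The parameter restrictions quoted above guarantee that every conditional variance produced by the recursion is positive and that the process is covariance stationary.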


The FIGARCH model introduced by Baillie et al. (1996) expands the GARCH variance equation by considering fractional dif-
ferences. As in the case of the GARCH model, we restrict our attention to one lag in both the autoregressive term and in the
moving average term. The FIGARCH(1,d,1) model is, then, given by

σt² = ω + [1 − βL − (1 − φL)(1 − L)^d] rt² + β σt−1²,   (4)

where L is the lag operator, d is the parameter of fractional differentiation, and the (sufficient) restrictions on the parameters are β − d ≤ φ ≤ (2 − d)/3 and d(φ − (1 − d)/2) ≤ β(d − β + φ). In the case of d = 0, the FIGARCH model reduces to the standard GARCH(1,1) model. For 0 < d < 1 the binomial expansion of the fractional difference operator introduces an infinite number of past lags with hyperbolically decaying coefficients. Note that, in practice, the infinite number of lags in the FIGARCH model with 0 < d < 1 must be truncated. We employ a lag truncation of 1000 steps.
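The hyperbolic decay and the truncation can be made concrete with the standard recursion for the coefficients of (1 − L)^d (a sketch; the function name is ours):

```python
import numpy as np

def frac_diff_weights(d, n_lags=1000):
    """Coefficients pi_j in (1 - L)^d = sum_{j>=0} pi_j L^j, truncated at n_lags.
    Standard recursion: pi_0 = 1, pi_j = pi_{j-1} * (j - 1 - d) / j."""
    w = np.empty(n_lags + 1)
    w[0] = 1.0
    for j in range(1, n_lags + 1):
        w[j] = w[j - 1] * (j - 1 - d) / j
    return w

# For 0 < d < 1 all weights beyond lag 0 are negative and decay
# hyperbolically (roughly like j^{-1-d}), the source of long memory.
w = frac_diff_weights(0.48, n_lags=1000)
```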
The FITVGARCH model was introduced by Ben Nasr et al. (2010) and has recently been used by Ben Nasr et al. (2014), who showed that it outperforms the GARCH and FIGARCH models in terms of in-sample fit of the volatility of DJIM returns. It expands the FIGARCH model of Baillie et al. (1996) by allowing the conditional variance parameters to change over time. The FITVGARCH(1, d, 1) model is given by

[1 − φt L](1 − L)^d rt² = ωt + [1 − βt L] vt,   (5)

where vt = rt² − σt², ωt = ω1 + ω2 F(t*; γ, c), φt = φ1,1 + φ2,1 F(t*; γ, c), βt = β1,1 + β2,1 F(t*; γ, c), and F(t*; γ, c) is a logistic smooth transition function defined as

F(t*; γ, c) = (1 + exp{−γ ∏_{k=1}^{K} (t* − ck)})⁻¹,   (6)

with the constraints γ > 0 and c1 ≤ c2 ≤ . . . ≤ cK for the transition points in the standardized time variable t* = t/T, with T the sample size. The transition function F(t*; γ, c) is continuous and bounded between 0 and 1. The parameter γ governs the speed of transition between the two regimes, while the parameters ck, known as the threshold parameters, indicate when, within the range of t*, the transitions take place.
The roots of [1 − φt L] and [1 − βt L] should lie outside the unit circle for all t. This implies that 1 − φt > 0 and 1 − βt > 0. With K = 1, the parameters of the FITVGARCH model change smoothly over time from (ω1, φ1,1, β1,1) to (ω1 + ω2, φ1,1 + φ2,1, β1,1 + β2,1). The transition between regimes happens almost instantaneously at t* = c1 when γ is large. When γ → 0, the FITVGARCH(1, d, 1) model in Eqs. (5) and (6) nests the FIGARCH(1, d, 1) model of Baillie et al. (1996), since the logistic transition function becomes constant and equal to 1/2.
Estimation of the GARCH, FIGARCH and FITVGARCH models can be done via the Quasi Maximum Likelihood (QML) method. The l-period-ahead forecasts σ̂²t+l|t for these models can be obtained most easily by recursive substitution of one-step-ahead forecasts σ̂²t+1|t. Note that one obtains volatility forecasts from FITVGARCH in much the same way as for FIGARCH, using the regime active at time t. The advantage of FITVGARCH consists in detecting a possible regime switch within the in-sample data used for estimation, so that the set of parameters might be different from those of a FIGARCH model without regime switching estimated on the same series.
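For the plain GARCH(1,1) case, the recursive substitution takes a particularly simple first-order form (a sketch with illustrative inputs of our own; FIGARCH and FITVGARCH forecasts instead iterate over the truncated lag polynomial):

```python
import numpy as np

def garch11_forecast(r_t, sigma2_t, omega, alpha, beta, horizon):
    """l-step-ahead GARCH(1,1) forecasts by recursive substitution:
    sigma2_{t+1|t} = omega + alpha*r_t^2 + beta*sigma2_t, and for l >= 2
    sigma2_{t+l|t} = omega + (alpha + beta) * sigma2_{t+l-1|t}."""
    f = np.empty(horizon)
    f[0] = omega + alpha * r_t ** 2 + beta * sigma2_t
    for l in range(1, horizon):
        f[l] = omega + (alpha + beta) * f[l - 1]
    return f

# Forecasts revert geometrically toward the unconditional variance omega / (1 - alpha - beta).
fc = garch11_forecast(r_t=1.2, sigma2_t=0.8, omega=0.01, alpha=0.05, beta=0.90, horizon=100)
```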
We now turn to a description of the MSM model. An in-depth analysis of this model can be found in Calvet and Fisher (2004) and Lux (2008). In the MSM model, instantaneous volatility is determined by the product of k volatility components or multipliers Mt(1), Mt(2), . . . , Mt(k) and a scale factor σ²:

σt² = σ² ∏_{i=1}^{k} Mt(i).   (7)

Following the basic hierarchical principle of the multifractal approach, each volatility component Mt(i) will be renewed at time t with a probability γi depending on its rank within the hierarchy of multipliers, and will remain unchanged with probability 1 − γi. Convergence of the discrete-time MSM to a Poisson process in the continuous-time limit requires transition probabilities to be formalized as

γi = 1 − (1 − γk)^(b^(i−k)),   (8)

with γk and b parameters to be estimated (Calvet & Fisher, 2001). Since we are not interested in the continuous-time limit in this article, we follow Lux (2008) and use pre-specified transition probabilities γi = 2^(i−k) rather than the specification of Eq. (8). Neglecting a more flexible parametrization of the transition probabilities can also be motivated by the finding that the in-sample fit and out-of-sample forecasting performance of both alternatives are almost invariant compared to the influence of the other (estimated) parameters (Lux, 2008).
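A simulation sketch of the Binomial MSM under the pre-specified probabilities γi = 2^(i−k) may help fix ideas (our own illustrative code with placeholder parameter values, not the estimation routine used in the paper):

```python
import numpy as np

def simulate_bmsm_volatility(k, m0, sigma, T, seed=0):
    """Simulate sigma2_t = sigma^2 * prod_i M_t^(i) (Eq. (7)) for the Binomial MSM:
    each M^(i) is drawn from {m0, 2 - m0} with equal probability and is renewed
    at time t with probability gamma_i = 2^(i - k)."""
    rng = np.random.default_rng(seed)
    gammas = 2.0 ** (np.arange(1, k + 1) - k)   # level k renews every period
    M = rng.choice([m0, 2.0 - m0], size=k)
    vol2 = np.empty(T)
    for t in range(T):
        renew = rng.random(k) < gammas           # independent renewal draws per level
        M[renew] = rng.choice([m0, 2.0 - m0], size=int(renew.sum()))
        vol2[t] = sigma ** 2 * M.prod()
    return vol2

v2 = simulate_bmsm_volatility(k=8, m0=1.687, sigma=0.809, T=2000)
```

Note that E[M] = 1 holds automatically here, since (m0 + (2 − m0))/2 = 1.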
The MSM model is fully specified once we have determined the distribution of the volatility components. It is usually assumed that the multipliers Mt(i) follow either a Binomial or a Lognormal distribution. In the MSM framework, only one parameter has to be estimated for the distribution of volatility components, since one would normalize the distribution so that E[Mt(i)] = 1.

Here we use both the Lognormal MSM (LMSM) and the Binomial MSM (BMSM) model. In the former, multipliers are determined by random draws from a Lognormal distribution with parameters λ and ν, i.e.

Mt(i) ∼ LN(−λ, ν²).   (9)

Normalization via E[Mt(i)] = 1 leads to

exp(−λ + 0.5ν²) = 1,   (10)

from which a restriction on the shape parameter ν can be inferred: ν = √(2λ). Hence, the distribution of volatility components corresponds to a one-parameter family of Lognormals, with the normalization restricting the choice of the shape parameter. Thus, the LMSM parameters to be estimated are just λ and σ. In the Binomial model the multipliers are drawn with equal probabilities of 0.5 from two values only, denoted m0 and 2 − m0.
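The normalization in Eqs. (9)–(10) can be verified numerically (λ = 1.464 is used here purely as an illustrative value):

```python
import numpy as np

lam = 1.464                      # illustrative Lognormal location parameter
nu = np.sqrt(2.0 * lam)          # shape parameter implied by the normalization
# Analytical mean of M ~ LN(-lam, nu^2): exp(-lam + 0.5 * nu^2) = exp(0) = 1.
mean_exact = np.exp(-lam + 0.5 * nu ** 2)

# Monte Carlo check that E[M] = 1 under this restriction:
rng = np.random.default_rng(1)
M = np.exp(rng.normal(-lam, nu, size=1_000_000))
```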
Calvet and Fisher (2004) first estimated the BMSM model via maximum likelihood, whereas Lux (2008) introduced a GMM estimator that is universally applicable to all possible specifications of MSM processes. Maximum likelihood estimation requires a finite state space and is, therefore, not applicable for LMSM. While, in principle, the ML approach for BMSM is the same as for any Markov-switching model, the high number of states (2^k in the Binomial setting) makes it computationally demanding for practical research. It is, therefore, applicable only to one-digit choices of k, and we set k = 8 for the likelihood approach when estimating the parameters v = (m0, σ) of BMSM. Since ML estimation is not feasible for the Lognormal model, we resort to GMM along the lines of Lux (2008). In the GMM framework the unknown parameter vector of the LMSM model, v = (λ, σ), is obtained by minimizing the distance of empirical moments from their theoretical counterparts, i.e.

v̂T = arg min_{v∈V} fT(v)′ AT fT(v),   (11)

with V the parameter space, fT(v) the vector of differences between sample moments and analytical moments, and AT a positive definite and possibly random weighting matrix. Under standard regularity conditions that are routinely satisfied by MSM models, the GMM estimator v̂T is consistent and asymptotically normal.1
In order to account for the proximity to long memory characterizing MSM models, we follow Lux (2008) in using logarithmic differences of absolute returns together with the pertinent analytical moment conditions, i.e.

ηt,T = ln |rt| − ln |rt−T|.   (12)

Using Eqs. (2) and (7) in Eq. (12) we get the expression

ηt,T = 0.5 ∑_{i=1}^{k} (mt(i) − mt−T(i)) + ln |ut| − ln |ut−T|,   (13)

where mt(i) = ln Mt(i). The variable ηt,T only has nonzero autocovariances over a limited number of lags. To exploit the temporal scaling properties of the MSM model, covariances of various orders q over different time horizons are chosen as moment conditions, i.e.

Mom(T, q) = E[η^q_{t+T,T} · η^q_{t,T}],   (14)

for q = 1, 2 and T = 1, 5, 10, 20, together with E[rt²] = σ² for identification of σ². In the GMM framework, we consider different specifications of the MSM model, varying k from 2 through 15, and choose the one at which the objective function no longer improves by more than a very small difference. The consideration of a high number of multipliers k is motivated by previous findings showing that even levels beyond k > 10 may improve the forecasting capabilities of the MSM for some series and bring it closer to the temporal scaling of empirical data (Liu, di Matteo, & Lux, 2007). Indeed, having 'too many' multipliers is harmless, as the other parameter estimates remain unchanged beyond some threshold and 'superfluous' multipliers with very long life times merely absorb part of the scale parameter.
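The sample counterparts of the moment conditions in Eqs. (12) and (14) are straightforward to compute (a sketch; the small constant guarding against log(0) is our own addition):

```python
import numpy as np

def empirical_moments(r, lags=(1, 5, 10, 20), powers=(1, 2)):
    """Sample analogues of Mom(T, q) = E[eta_{t+T,T}^q * eta_{t,T}^q] (Eq. (14)),
    with eta_{t,T} = ln|r_t| - ln|r_{t-T}| (Eq. (12))."""
    logabs = np.log(np.abs(np.asarray(r)) + 1e-12)   # guard against zero returns
    moments = {}
    for T in lags:
        eta = logabs[T:] - logabs[:-T]               # eta_{t,T} for t = T, ..., n-1
        for q in powers:
            x, y = eta[T:] ** q, eta[:-T] ** q       # pairs (eta_{t+T,T}, eta_{t,T})
            moments[(T, q)] = np.mean(x * y)
    moments["variance"] = np.mean(np.asarray(r) ** 2)  # identifies sigma^2
    return moments

rng = np.random.default_rng(2)
m = empirical_moments(rng.standard_normal(5000))
```

For i.i.d. Gaussian returns, Mom(T, 1) is negative (≈ −π²/8), since consecutive log-differences share one term with opposite sign; MSM dynamics alter these moments in a way that identifies λ.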
Forecasting of volatility can be implemented within an ML framework by taking stock of the conditional probabilities at the
forecast origin. Iteration via the transition matrix of the Markov process thus provides predictions for the subsequent periods.
Out-of-sample forecasting for the MSM model estimated via GMM is performed for the zero-mean time series Yt = rt² − σ̂² at l-step-ahead horizons, by means of best linear forecasts computed with the generalized Levinson-Durbin algorithm developed by Brockwell and Dahlhaus (2004).

3. Empirical analysis

In this section, we present the results of our empirical study, starting with the description of the data, the in-sample
estimation results, and then proceed to the out-of-sample forecast comparison of the different volatility models discussed above.

3.1. Data

The various volatility models are estimated using daily data of the Global Dow Jones Islamic Market World Index (DJIM). The
DJIM index measures the performance of the global universe of investable equities that have been screened for Sharia compli-
ance. The companies in this index pass the industry and financial ratio screens. The regional allocation for DJIM is classified as
follows: 60.14% for the United States; 24.33% for Europe and South Africa; and 15.53% for Asia (Hammoudeh et al., 2014). Our
data spans the period January 1, 1996 to September 2, 2013, implying a total of 5750 observations. Note that the start and end dates for the index are governed purely by data availability at the time of writing this paper. The time series for the index is sourced from Bloomberg. In order to get a preliminary idea about the data set, we present, in Fig. 1, the daily index in levels and returns. Note that daily returns are normalized by taking 100 times the first difference of the natural log of the index. Table 1
gives some basic statistics for the DJIM index returns. Inspecting the first four moments of the data (prior to normalization), we find pronounced negative skewness (probably due to the inclusion of the financial crisis period) and the excess kurtosis that typically characterizes financial time series.
The Jarque-Bera test rejects the hypothesis of Normally distributed returns at any level of significance. Similarly, conducting a series of Box-Ljung tests for different lags, independence of raw, squared and absolute returns is rejected at any conventional level of significance. In line with practically all other such analyses of financial time series, the Box-Ljung statistics are orders of magnitude larger for both squared and absolute returns than for the raw data. Hence, there is much higher persistence in the

1 The standard regularity conditions are problematic for the preceding 'first generation' multifractal model of Mandelbrot, Fisher, and Calvet (1997) because of its restriction to a bounded time interval. This is not an issue for the 'second generation' MSM of Calvet and Fisher (2001), which by its very nature is a variant of a Markov-switching model.

[Figure: two panels showing the daily DJIM index level (top) and daily DJIM returns (bottom) over 1996–2013.]

Fig. 1. Daily DJIM index in levels and returns.

measures of volatility than in the raw returns themselves. It is this very feature that gave rise to the development of volatility models with autoregressive dependence and long memory. We finally also show the so-called Hill estimator for the tail index, i.e. the decay of the unconditional distribution of returns in its outer, extremal region. The literature unequivocally finds the extreme part of the distribution to follow a power law, i.e. Prob(|rt| > x) ∼ x^(−α), with the 'tail index' α typically around 3 (which has been denominated the 'cubic law' of large returns), cf. Cont (2001). Applying the conditional maximum likelihood estimator of Hill (1975), we report results for the decay parameter α together with its 95% confidence intervals in Table 1. Tail index

Table 1
Time series statistics of DJIM returns.

Mean: 0.000
Std. dev: 0.010
Skewness: −0.367
Kurtosis: 8.889
Bera-Jarque statistics: 19,055.047 (0.000)

Box-Ljung statistics for: Returns Squared returns Absolute returns

Lag 8 91.039 3545.467 2657.314
(0.000) (0.000) (0.000)
Lag 12 92.981 5318.485 3339.935
(0.000) (0.000) (0.000)
Lag 16 103.141 6217.137 4691.161
(0.000) (0.000) (0.000)
Hill tail index at 10% tail 2.754 (2.529–2.979)
5% tail 3.133 (2.770–3.495)
1% tail 3.575 (2.647–4.503)

Note: For the Bera-Jarque and Box-Ljung statistics, p-values are given in brackets; for the tail index estimates, the
brackets contain the 95% confidence intervals of the point estimates based upon the limiting distribution of the
estimator.

estimates for the assumed extremal tail regions are in the vicinity of 3, as has been found for essentially all other financial series scrutinized in the literature. The DJIM data also display the slight decrease of the estimate for larger tail sizes that is presumably due to increasing contamination of the tail observations with those from the central part of the distribution.
So far, we do not find anything out of the ordinary. Returns from the DJIM seem to share all basic stylized facts of stock market data and more broadly defined classes of financial assets. In particular, the Box-Ljung test points to a very similar structure of the volatility dynamics (already visible in the clear clustering of volatility in Fig. 1), and the observance of the 'cubic law' of large returns indicates that Islamic stocks share the same risk structure as more conventional assets.
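The tail-index estimates in Table 1 rest on the standard Hill procedure; a sketch of that estimator with its asymptotic confidence interval is given below (tested here on synthetic Pareto data with a true tail index of 3, not on the DJIM series):

```python
import numpy as np

def hill_estimator(r, tail_fraction):
    """Hill (1975) estimator of alpha in Prob(|r| > x) ~ x^{-alpha}, using the
    largest `tail_fraction` share of absolute observations; the 95% confidence
    interval follows from asymptotic normality: alpha * (1 +/- 1.96/sqrt(k))."""
    x = np.sort(np.abs(np.asarray(r)))[::-1]        # descending order statistics
    k = max(int(tail_fraction * len(x)), 2)
    alpha = 1.0 / np.mean(np.log(x[:k] / x[k]))
    half = 1.96 * alpha / np.sqrt(k)
    return alpha, (alpha - half, alpha + half)

rng = np.random.default_rng(3)
pareto_sample = rng.pareto(3.0, 100_000) + 1.0      # classical Pareto, alpha = 3
alpha_hat, ci = hill_estimator(pareto_sample, tail_fraction=0.05)
```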
Fig. 1 shows that daily returns are, in general, highly volatile, with volatility being highest during October 2008. Given this, we used the first 3926 observations (until December 29, 2006) for in-sample estimation, and the remaining 1824 observations for out-of-sample forecasting of the volatility of DJIM returns and for value-at-risk predictions, with the out-of-sample period chosen to coincide with the global financial crisis, a period of high volatility.

3.2. Estimation

We now turn to estimation of our four volatility models introduced above. In-sample estimation results of the different models are reported in Table 2. The results for the GARCH-type models, i.e., GARCH(1,1), FIGARCH(1,d,1) and FITVGARCH(1,d,1), indicate that the constant parameter (ω̂) is significant at the 5% level. Also, we observe a statistically significant effect of past volatility on current volatility (β̂) and of past squared innovations on current volatility (α̂ and φ̂, respectively) in all three models at the 5% significance level. In addition, estimation results of both the FIGARCH and FITVGARCH models indicate strong evidence of long memory in the conditional variance of the DJIM index returns, with statistically significant estimates of d of almost the same value in both models (≈ 0.48). FITVGARCH estimation results show that the regime-specific FIGARCH parameters experience significant changes. The estimated threshold parameter is significant at the 5% significance level and equal to 0.6114, indicating that the change in the volatility structure took place at about the time point t̂ = 0.6114 × T ≈ 2400, corresponding to mid-August 2002. The high estimated value of the smoothness parameter γ implies a sudden change in the parameters of the variance equation. We now turn to the estimation of the Lognormal MSM model. The in-sample selection procedure discussed above results in the choice of k = 14 multipliers. Hence, a high number of hierarchical levels is required to reach saturation of the GMM objective function in Eq. (11). The estimates of the Lognormal parameter (λ̂) and the scale factor (σ̂) are 1.464 and 0.809, respectively. In BMSM, k = 8 is fixed from the outset, and we obtain a parameter estimate of m̂0 = 1.687. It is worth remarking that both multifractal parameters are at the upper end of, or beyond, the spectrum that has so far been reported in the literature for various assets. This speaks for a high degree of multifractality, i.e. strong variation of fluctuations over time, in the DJIM. In a sense, this could be seen as the contrary of what the index is expected to deliver: relatively more riskiness rather than fewer extreme fluctuations.

3.3. Out-of-sample analysis

In this section, the out-of-sample forecasting accuracy of the considered volatility models is analyzed. Daily volatility forecasts were computed for the out-of-sample period from January 2, 2007 to September 2, 2013. For each day in the forecasting period, forecasts are computed for horizons of various lengths: 1, 5, 10, 20, 60, 70, 80, 90 and 100 days. We have used the set of in-sample parameter estimates and have kept it fixed over the out-of-sample period, as it is computationally demanding and time consuming to re-estimate the FI(TV)GARCH models via maximum likelihood. The h-steps-ahead forecast σ̂²t+h|t is obtained by appropriate substitution based on the conditional volatility specification, and the forecast errors are given by

et+h|t = σ̃²t+h − σ̂²t+h|t,   (15)

where σ̃²t+h denotes the ex-post volatility proxy.

Table 2
Estimated parameters of four models for DJIM daily index returns.

Parameter      GARCH             FIGARCH           LMSM      BMSM      FITVGARCH
d̂              –                 0.4831 (0.0627)   –         –         0.4888 (0.0450)
ĉ1             –                 –                 –         –         0.6114 (0.0278)
γ̂              –                 –                 –         –         500 (3268.62)
ω̂ / ω̂1         0.0031 (0.0009)   0.0089 (0.0042)   –         –         0.0159 (0.0032)
α̂ / φ̂ / φ̂1,1   0.0418 (0.0049)   0.3186 (0.0526)   –         –         0.3030 (0.0238)
β̂ / β̂1,1       0.9540 (0.0054)   0.7467 (0.0377)   –         –         0.7314 (0.0295)
ω̂2             –                 –                 –         –         −0.0139 (0.0034)
φ̂2,1           –                 –                 –         –         0.0691 (0.0344)
β̂2,1           –                 –                 –         –         0.0771 (0.0169)
λ̂              –                 –                 1.4639    –         –
m̂0             –                 –                 –         1.6870    –
σ̂              –                 –                 0.8089    0.8089    –

Note: Standard errors are given in parentheses. For the LMSM model, k = 14 provided the lowest value of the GMM objective function, while for BMSM k = 8 was imposed from the outset. Since the ML estimator for BMSM showed a large deviation of the estimated scale factor σ from the sample standard deviation, we resorted to estimating σ by its naive sample counterpart and only obtained the estimate of m0 from the ML algorithm.

For forecast evaluation, we use both the mean squared forecast error (MSE) and the mean absolute forecast error (MAE) criteria. The null hypothesis of equal forecast performance of two models is tested in pairwise comparisons using the Diebold and Mariano (1995) (DM) test or the modified DM-type test statistic for nested models of Clark and West (2007), depending on the models to be compared. Furthermore, we use the superior predictive ability (SPA) test introduced by Hansen (2005), which tests one benchmark model simultaneously against a whole group of alternatives.
Table 3 reports the MSE and MAE of volatility forecasts for the five volatility models described previously, relative to the MSE and MAE obtained with a naive forecast using the constant historical volatility (computed as the average squared or absolute return) of the in-sample period. A value below 1 thus indicates that the pertinent model improves upon historical volatility under the respective criterion. Based on the MSE criterion, the long memory model with time-varying parameters, FITVGARCH, appears to be the best model for certain short-horizon volatility forecasts, such as 1 and 20 days, while FIGARCH turns out to be the best model at horizons of 5 and 10 days. For longer horizons, 30 days and beyond, the BMSM model is the best one, with LMSM and FITVGARCH coming next. We also find that the simple GARCH model cannot outperform the long memory-type GARCH models at any horizon. According to the MAE criterion, results are somewhat different: the MSM-type models dominate all other models at most horizons. With respect to the GARCH-type models, the FIGARCH model outperforms FITVGARCH only at horizons of 1 and 5 days. For longer horizons, 10 days and beyond, the FITVGARCH model performs better than the GARCH and FIGARCH models.
Comparison with historical volatility shows, however, that none of the models improves upon the naive forecast (HV) under
the MAE criterion, while only the MSM models consistently outperform historical volatility at all time horizons under the MSE
criterion. This difference in the performance under the MSE and MAE criterion indicates that time-series models are gener-
ally better suited to forecast large realizations of volatility than average sized ones (as the former have a higher influence on
the average MSE compared to the average MAE). Similar results have been found for other time series before (e.g. Leövey &
Lux, 2012), and they appear to some extent plausible and even reassuring since it is the occurrence of large clusters of highly
autocorrelated fluctuations that has motivated the development of modern asset pricing models like the ones used in this study.
Table 4 contains results of pairwise forecast comparisons for all models with the Diebold and Mariano (1995) test, using both squared and absolute forecast error loss functions. For the cases of nested models, we apply the modified Diebold-Mariano test by Clark and West (2007). Note that in the hierarchy of GARCH-type models the simpler ones are always nested in the more complex ones, and historical volatility is nested in all time series models. In contrast, MSM and any of the GARCH-type models are non-nested, and neither of the two MSM models is nested in the other. We show results for the adjusted test only for the MSE criterion, as it applies to quadratic loss criteria only by design. The results represent the p-values of the null hypothesis that the forecast performance of model 1 at horizon l is equal to that of model 2, against the one-sided alternative that model 2's forecast performance is superior to that of model 1. At the 10% level of significance and in terms of the squared error loss function, the MSM models are outperformed by the other models at lower horizons (l ≤ 20), but BMSM dominates all GARCH-type models when the forecast horizon exceeds 40 days, and the same holds for LMSM at forecast horizons of 50 days or more. We also find that the FITVGARCH model seems to outperform the FIGARCH model for l ≥ 30. In terms of the absolute error loss function, LMSM outperforms the other models at all horizons, while BMSM is superior to the GARCH family at long horizons only, but is significantly inferior to LMSM at all horizons.
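The Diebold-Mariano comparison underlying Table 4 can be sketched as follows. This is an illustrative implementation, not the authors' code; it uses a simple rectangular HAC variance with h − 1 autocovariance lags, as is common for h-step-ahead forecasts:

```python
import math
import numpy as np

def diebold_mariano(loss1, loss2, h=1):
    """Diebold-Mariano (1995) test on the loss differential d_t = L1_t - L2_t.
    Returns the test statistic and the one-sided p-value for the alternative
    that model 2 is more accurate, i.e. E[d] > 0."""
    d = np.asarray(loss1) - np.asarray(loss2)
    T = d.shape[0]
    dbar = d.mean()
    dc = d - dbar
    lrv = np.sum(dc * dc) / T                    # lag-0 autocovariance
    for k in range(1, h):                        # add 2 * gamma_k for lags k < h
        lrv += 2.0 * np.sum(dc[k:] * dc[:-k]) / T
    lrv = max(lrv, 1e-12)                        # guard against a non-positive HAC estimate
    stat = dbar / math.sqrt(lrv / T)
    # one-sided p-value, 1 - Phi(stat), via the complementary error function
    return stat, 0.5 * math.erfc(stat / math.sqrt(2.0))
```

For nested pairs, the Clark and West (2007) adjustment would additionally correct the squared-error differential for the estimation noise of the larger model.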
We also apply the SPA test of Hansen (2005) using the same two loss functions, MSE and MAE. Hansen's test allows us to compare one model's performance to that of a whole set of competitors. The null hypothesis of the test is that a particular (benchmark) model is not inferior to any of the other candidate models. Table 5 presents the SPA test for each model, including historical volatility in the comparison. We find that, with the MSE criterion used as loss function in the SPA test, the long-memory models FIGARCH and FITVGARCH perform quite similarly, in that neither can be significantly outperformed at any horizon up to 30 days. Beyond this horizon, they can be outperformed by other models at any conventional level of significance. The BMSM model, in contrast, can be outperformed at short horizons (l ≤ 10 days) but not at longer horizons,

Table 3
Forecast evaluation for DJIM return volatility based on MSE and MAE criteria.

GARCH FIGARCH FITVGARCH LMSM BMSM

Horizon MSE MAE MSE MAE MSE MAE MSE MAE MSE MAE

1 0.7976 1.0585 0.7874 1.0581 0.7752 1.0694 0.8643 1.0058 0.8479 1.3944
5 0.8037 1.0636 0.7751 1.0569 0.7924 1.0583 0.9021 1.0015 0.8448 1.3398
10 0.8577 1.0922 0.8368 1.0910 0.8406 1.0830 0.9389 1.0148 0.8968 1.3089
20 0.9378 1.1427 0.9165 1.1404 0.9152 1.1287 0.9682 1.0248 0.9342 1.2427
30 0.9986 1.1831 0.9682 1.1774 0.9657 1.1608 0.9811 1.0286 0.9515 1.1896
40 1.0394 1.2006 1.0027 1.1969 0.9988 1.1776 0.9877 1.0302 0.9621 1.1535
50 1.0503 1.2060 1.0135 1.2065 1.0087 1.1842 0.9906 1.0310 0.9680 1.1280
60 1.0475 1.1988 1.0139 1.2055 1.0092 1.1807 0.9920 1.0306 0.9730 1.1074
70 1.0422 1.1977 1.0130 1.2111 1.0083 1.1844 0.9931 1.0303 0.9770 1.0926
80 1.0363 1.1908 1.0116 1.2114 1.0075 1.1844 0.9939 1.0301 0.9792 1.0793
90 1.0340 1.1923 1.0126 1.2170 1.0087 1.1876 0.9948 1.0302 0.9815 1.0694
100 1.0356 1.1940 1.0157 1.2217 1.0119 1.1911 0.9957 1.0300 0.9833 1.0607

Note: MSE and MAE for all five models are displayed relative to the MSE and MAE of a constant forecast using historical volatility as estimated from the in-sample series. Entries in italics represent the best performing model for the pertinent loss function and forecasting horizon.

Table 4
Diebold and Mariano test results.

Horizon 1 5 10 20 30 40 50 60 70 80 90 100

Model 1 Model 2

Clark and West test (MSE)


HV GARCH 0.000 0.000 0.004 0.023 0.028 0.024 0.042 0.075 0.103 0.117 0.131 0.136
FIGARCH 0.000 0.000 0.004 0.019 0.021 0.016 0.029 0.056 0.078 0.089 0.099 0.100
FITVGARCH 0.000 0.000 0.003 0.017 0.018 0.013 0.026 0.051 0.074 0.085 0.095 0.096
LMSM 0.001 0.000 0.002 0.010 0.011 0.022 0.063 0.107 0.145 0.175 0.210 0.256
BMSM 0.000 0.000 0.000 0.002 0.005 0.007 0.011 0.013 0.017 0.022 0.026 0.027
GARCH FIGARCH 0.059 0.001 0.018 0.096 0.108 0.108 0.112 0.113 0.112 0.111 0.109 0.115
FITVGARCH 0.282 0.038 0.037 0.095 0.099 0.097 0.101 0.101 0.099 0.098 0.095 0.100
FIGARCH FITVGARCH 0.360 0.928 0.706 0.398 0.180 0.041 0.028 0.020 0.011 0.014 0.016 0.012

Diebold and Mariano test (MSE)


GARCH LMSM 0.970 0.972 0.864 0.651 0.380 0.173 0.128 0.100 0.078 0.068 0.055 0.063
BMSM 0.914 0.929 0.777 0.478 0.173 0.099 0.088 0.075 0.058 0.047 0.032 0.030
FIGARCH LMSM 0.984 0.985 0.921 0.784 0.641 0.202 0.050 0.013 0.012 0.033 0.045 0.052
BMSM 0.965 0.989 0.960 0.641 0.230 0.046 0.045 0.040 0.028 0.020 0.009 0.004
FITVGARCH LMSM 0.914 0.987 0.923 0.796 0.668 0.253 0.069 0.019 0.040 0.053 0.089 0.082
BMSM 0.980 0.978 0.895 0.650 0.243 0.056 0.052 0.048 0.042 0.028 0.019 0.008
LMSM GARCH 0.030 0.028 0.136 0.349 0.620 0.827 0.872 0.900 0.922 0.932 0.945 0.937
FIGARCH 0.016 0.015 0.079 0.216 0.359 0.798 0.950 0.988 0.989 0.967 0.955 0.948
FITVGARCH 0.086 0.013 0.077 0.204 0.332 0.747 0.932 0.981 0.960 0.947 0.911 0.918
BMSM 0.365 0.078 0.147 0.169 0.159 0.150 0.149 0.145 0.144 0.138 0.135 0.122
BMSM GARCH 0.086 0.071 0.222 0.522 0.827 0.901 0.912 0.925 0.942 0.953 0.968 0.970
FIGARCH 0.035 0.011 0.094 0.359 0.769 0.954 0.955 0.960 0.972 0.980 0.991 0.996
FITVGARCH 0.020 0.022 0.105 0.350 0.757 0.944 0.948 0.952 0.958 0.972 0.981 0.992
LMSM 0.635 0.922 0.853 0.831 0.841 0.850 0.851 0.855 0.856 0.862 0.865 0.878

Diebold and Mariano test (MAE)


GARCH LMSM 0.005 0.021 0.071 0.079 0.070 0.070 0.064 0.062 0.057 0.055 0.057 0.059
BMSM 1.000 1.000 1.000 0.873 0.523 0.352 0.263 0.216 0.174 0.148 0.129 0.114
FIGARCH LMSM 0.003 0.008 0.026 0.021 0.012 0.011 0.008 0.008 0.008 0.010 0.012 0.014
BMSM 1.000 1.000 1.000 0.956 0.567 0.294 0.165 0.107 0.073 0.058 0.048 0.042
FITVGARCH LMSM 0.000 0.009 0.046 0.034 0.022 0.021 0.018 0.019 0.020 0.023 0.027 0.031
BMSM 1.000 1.000 1.000 0.970 0.652 0.380 0.240 0.174 0.128 0.104 0.089 0.079
LMSM GARCH 0.995 0.979 0.929 0.921 0.930 0.930 0.936 0.938 0.943 0.945 0.943 0.941
FIGARCH 0.997 0.992 0.974 0.979 0.988 0.989 0.992 0.992 0.992 0.990 0.988 0.986
FITVGARCH 1.000 0.991 0.955 0.966 0.978 0.979 0.982 0.981 0.980 0.977 0.973 0.969
BMSM 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
BMSM GARCH 0.000 0.000 0.000 0.127 0.477 0.648 0.737 0.784 0.826 0.852 0.871 0.886
FIGARCH 0.000 0.000 0.000 0.044 0.433 0.706 0.835 0.893 0.927 0.942 0.952 0.956
FITVGARCH 0.000 0.000 0.000 0.030 0.348 0.620 0.760 0.826 0.872 0.896 0.911 0.921
LMSM 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001

Note: Table entries represent the one-sided p-values of the Clark and West (2007) test and the Diebold and Mariano (1995) test (the latter based on both squared and absolute prediction errors). The Clark-West test is used for nested models, the Diebold-Mariano test for non-nested models. For both tests, the null hypothesis is that the forecast performance of model 2 at horizon h equals that of model 1, against the one-sided alternative that model 2's forecast performance is superior to that of model 1.

l ≥ 20. LMSM appears to perform poorly based on the SPA results, but this is mainly because BMSM performs significantly better than LMSM at longer horizons. Were BMSM left out of this competition, LMSM would take over its leading position vis-à-vis the GARCH-type models. Under the MAE criterion we observe a clear dominance of historical volatility over all models.
For the GARCH model, we find that it is inferior to alternative models under MSE at all horizons but l = 1. Interestingly, historical volatility is also clearly outperformed by some alternative forecasts at all horizons at any significance level, so that the application of our battery of time series models adds value in terms of forecasting accuracy under the MSE criterion. This is different under the MAE criterion, where the null of non-inferiority of HV to the alternative forecasts is never rejected. If we excluded HV from the competition, the SPA test would indicate superiority of LMSM over the GARCH-type models at all forecast horizons, in line with the results of Tables 3 and 4. Under the MAE criterion, however, none of our time series models adds value over the naive approach of using the in-sample average as a predictor of future volatility.
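A simplified version of the bootstrap behind Hansen's SPA test can be sketched as follows. This is illustrative only: it uses a plain circular block bootstrap and omits Hansen's refinements (the studentization choices and the recentering of clearly inferior models), so it is closer to a reality-check-style test than to the published procedure:

```python
import numpy as np

def spa_pvalue(loss_bench, loss_alts, n_boot=2000, block=10, seed=1):
    """SPA-style test in the spirit of Hansen (2005). H0: the benchmark
    cannot be outperformed by any of the m alternatives. The statistic is
    the largest studentized mean loss differential d_{t,k} = L_bench,t - L_k,t,
    and its null distribution is approximated by a circular block bootstrap
    of the recentered differentials."""
    d = np.asarray(loss_bench)[:, None] - np.asarray(loss_alts)  # shape (T, m)
    T = d.shape[0]
    dbar = d.mean(axis=0)
    se = d.std(axis=0, ddof=1) / np.sqrt(T)
    stat = np.max(dbar / se)
    rng = np.random.default_rng(seed)
    n_blocks = -(-T // block)                    # ceil(T / block)
    exceed = 0
    for _ in range(n_boot):
        starts = rng.integers(0, T, n_blocks)
        idx = (starts[:, None] + np.arange(block)).ravel()[:T] % T
        stat_b = np.max((d[idx].mean(axis=0) - dbar) / se)
        exceed += stat_b >= stat
    return exceed / n_boot                       # one-sided bootstrap p-value
```

A small p-value rejects the null that the benchmark is not inferior to all alternatives.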
Tables 6 and 7 provide performance statistics for value-at-risk predictions derived from our volatility forecasts. For all GARCH-type models, the computation of the value-at-risk (VaR) at any forward-looking time horizon is straightforward. Since the marginal distribution is Gaussian, the VaR threshold is simply given by VaR^α_{t+h} = Φ⁻¹(α) σ̂_{t+h|t}, with α the prescribed value-at-risk probability, Φ⁻¹(α) the inverse of the standard Gaussian cumulative distribution function, which determines the α-percent quantile, and σ̂_{t+h|t} the h-period forecast of the pertinent model given information at the forecast origin t. Matters are more intricate for the MSM class of models, as their marginal distribution consists of a mixture of 2^k Gaussians rather than a single one. Best linear forecasts on the basis of GMM estimation provide us only with an estimate of the overall variance at some future

Table 5
Superior predictive ability (SPA) test results.

Horizon HV GARCH FIGARCH FITVGARCH LMSM BMSM

Squared errors
1 0.000 0.253 0.625 0.728 0.065 0.020
5 0.000 0.003 1.000 0.113 0.030 0.030
10 0.000 0.003 0.708 0.292 0.033 0.048
20 0.000 0.020 0.700 0.757 0.020 0.278
30 0.000 0.003 0.400 0.438 0.007 0.785
40 0.000 0.003 0.000 0.015 0.003 1.000
50 0.000 0.000 0.000 0.000 0.000 1.000
60 0.000 0.003 0.000 0.000 0.000 1.000
70 0.000 0.003 0.000 0.010 0.003 1.000
80 0.000 0.000 0.000 0.005 0.000 1.000
90 0.000 0.005 0.000 0.003 0.000 1.000
100 0.000 0.005 0.000 0.000 0.000 1.000

Absolute errors
1 0.680 0.048 0.013 0.005 0.320 0.000
5 0.535 0.048 0.033 0.022 0.465 0.000
10 1.000 0.018 0.003 0.005 0.020 0.000
20 1.000 0.000 0.000 0.000 0.000 0.000
30 1.000 0.000 0.000 0.000 0.000 0.000
40 1.000 0.000 0.000 0.000 0.000 0.000
50 1.000 0.000 0.000 0.000 0.000 0.000
60 1.000 0.000 0.000 0.000 0.000 0.000
70 1.000 0.000 0.000 0.000 0.000 0.000
80 1.000 0.000 0.000 0.000 0.000 0.000
90 1.000 0.000 0.000 0.000 0.000 0.000
100 1.000 0.000 0.000 0.000 0.000 0.000

Note: The table entries represent the p-values of the SPA test of Hansen (2005) using two loss functions (MSE and MAE). The null hypothesis is that a particular (benchmark) model cannot be outperformed by the other candidate models. Each column shows the outcome of this test in terms of one-sided p-values for the pertinent model against all alternatives.

time t + h characterizing this compound distribution. Given the absence of additional information, we can only perform a Gaussian approximation to the predictive density, using the same formula as with the GARCH-type models.
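For the Gaussian case, the VaR threshold and the empirical failure rates reported in Table 6 reduce to a few lines. The sketch below is illustrative (function and argument names are not from the paper's code):

```python
import numpy as np
from statistics import NormalDist

def gaussian_var(sigma_forecast, alpha=0.01):
    """VaR threshold under a zero-mean Gaussian predictive density:
    VaR^alpha_{t+h} = Phi^{-1}(alpha) * sigma_hat_{t+h|t}. For alpha = 0.01
    the multiplier Phi^{-1}(0.01) is about -2.326, so the threshold is a
    negative number."""
    return NormalDist().inv_cdf(alpha) * sigma_forecast

def failure_rate(returns, var_thresholds):
    """Empirical failure rate (cf. Table 6): fraction of days on which the
    realized return falls below the (negative) VaR threshold."""
    return float(np.mean(np.asarray(returns) < np.asarray(var_thresholds)))
```

Ideally, the empirical failure rate should coincide with the nominal level α.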
Circumstances are more convenient in the case of the more computationally intensive ML approach. Here, parameter estimates come along with estimates of conditional state probabilities, which can be used to compute expected future state probabilities via multiplication with the transition matrix of the process. Since the resulting distribution is too complicated for an analytical representation, we can only approximate it via simulations. To this end, we draw a large sample of realizations of

Table 6
Failure rates of VaR forecasts.

Horizon 1 5 10 20 30 40 50 60 70 80 90 100

Model VaR percent

HV 1% 0.052 0.054 0.054 0.054 0.055 0.055 0.055 0.055 0.055 0.056 0.056 0.056
5% 0.097 0.097 0.097 0.098 0.098 0.099 0.099 0.099 0.099 0.100 0.100 0.101

GARCH 1% 0.025 0.029 0.030 0.033 0.036 0.034 0.039 0.039 0.041 0.041 0.045 0.046
5% 0.069 0.074 0.076 0.078 0.077 0.080 0.082 0.078 0.079 0.084 0.084 0.084

FIGARCH 1% 0.020 0.025 0.026 0.030 0.031 0.033 0.035 0.035 0.036 0.040 0.040 0.042
5% 0.067 0.071 0.073 0.070 0.070 0.071 0.073 0.066 0.066 0.069 0.072 0.073

FITVGARCH 1% 0.024 0.025 0.026 0.030 0.034 0.034 0.037 0.036 0.040 0.044 0.047 0.048
5% 0.072 0.069 0.069 0.069 0.070 0.072 0.072 0.067 0.068 0.073 0.076 0.076

LMSM 1% 0.025 0.036 0.037 0.040 0.041 0.044 0.046 0.045 0.045 0.045 0.046 0.047
5% 0.074 0.078 0.078 0.082 0.086 0.087 0.088 0.089 0.090 0.091 0.092 0.093

BMSM 1% 0.003 0.003 0.004+ 0.008∗+ 0.011∗+ 0.011∗ 0.012∗ 0.015∗ 0.016 0.017 0.017 0.020
5% 0.056∗ 0.059∗+ 0.068 0.083 0.095 0.102 0.108 0.113 0.116 0.123 0.127 0.131

Combined 1% 0.005∗+ 0.004 0.008∗+ 0.013∗+ 0.017 0.017 0.018 0.019 0.022 0.022 0.025 0.026
FIGARCH & BMSM 5% 0.055∗+ 0.063 0.071 0.076 0.079 0.086 0.089 0.087 0.087 0.093 0.098 0.100

Note: The table reports the empirical failure rates of VaR forecasts for the 1 and 5% quantiles for all models considered, as well as for combined forecasts of the FIGARCH and BMSM models. Superscripts ∗ and + denote non-rejection at the 5% significance level of the two-sided tests of unconditional coverage (Kupiec, 1995) and conditional coverage (Christoffersen, 1998), respectively.

Table 7
Superior predictive ability test results for VaR-based loss function.

Horizon HV GARCH FIGARCH FITVGARCH LMSM BMSM

VaR 1%
1 0.000 0.383 0.618 0.025 0.018 0.003
5 0.000 0.022 1.000 0.043 0.007 0.007
10 0.000 0.080 0.665 0.003 0.005 0.367
20 0.000 0.433 0.487 0.013 0.010 0.770
30 0.000 0.158 0.123 0.000 0.020 1.000
40 0.000 0.010 0.007 0.000 0.013 1.000
50 0.000 0.000 0.000 0.000 0.003 1.000
60 0.000 0.000 0.000 0.000 0.005 1.000
70 0.000 0.000 0.000 0.000 0.003 1.000
80 0.000 0.000 0.000 0.000 0.000 1.000
90 0.000 0.000 0.000 0.000 0.000 1.000
100 0.000 0.000 0.000 0.000 0.000 1.000

VaR 5%
1 0.000 0.000 1.000 0.005 0.028 0.105
5 0.000 0.000 0.780 0.020 0.003 0.237
10 0.000 0.020 0.858 0.135 0.003 0.198
20 0.000 0.040 0.880 0.120 0.045 0.080
30 0.000 0.140 0.915 0.020 0.185 0.200
40 0.007 0.040 0.818 0.000 0.352 0.068
50 0.003 0.018 0.465 0.000 0.603 0.005
60 0.007 0.007 0.560 0.000 0.500 0.003
70 0.010 0.003 0.500 0.000 0.583 0.000
80 0.018 0.037 0.265 0.000 0.782 0.000
90 0.048 0.102 0.155 0.000 0.870 0.000
100 0.092 0.193 0.125 0.000 0.892 0.000

Note: The table reports the p-values of the SPA test of Hansen (2005) using the VaR-based loss function of Eq. (16) with parameter m = 5. Additional experiments showed that the results remained the same, up to the digits shown here, for any higher value of m. The null hypothesis is that a particular (benchmark) model cannot be outperformed by the other candidate models. Each column shows the outcome of this test in terms of one-sided p-values for the pertinent model against all alternatives.

the states given their predicted probabilities at time t + h and multiply these with standard Normal innovations. The α-percent quantiles are then simply inferred from this numerical approximation and used as VaR predictions in our comparison with the other models. In Tables 6 and 7, the label LMSM indicates the Lognormal variant of the MSM model estimated via GMM, for which we compute VaR forecasts on the basis of best linear volatility predictions and a Gaussian approximation of the predicted density. BMSM indicates the Binomial multifractal model estimated via ML, with the numerical approach based on predicted state probabilities used for the VaR forecasts. We apply these forecasts for VaR probabilities α of 5 and 1%, and in the numerical approximation for BMSM we use 20,000 random draws to approximate the predictive density.
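The simulation-based approximation just described can be sketched as follows, with `p_high` standing in for the ML-filtered, transition-updated probability of each hierarchical level being in state m0. These inputs and names are illustrative placeholders, not the authors' code:

```python
import numpy as np

def bmsm_var_by_simulation(p_high, m0, sigma, alpha=0.01, n_draws=20000, seed=2):
    """Numerical VaR for the Binomial MSM predictive density. The marginal is
    a mixture of 2^k Gaussians, so we draw the k multiplier states from their
    predicted probabilities, scale standard-normal innovations with the
    resulting volatility and read off the empirical alpha-quantile."""
    rng = np.random.default_rng(seed)
    p_high = np.asarray(p_high, dtype=float)
    k = p_high.shape[0]
    # each level's multiplier is m0 with probability p_high[i], else 2 - m0
    states = rng.random((n_draws, k)) < p_high
    mult = np.where(states, m0, 2.0 - m0)
    vol = sigma * np.sqrt(np.prod(mult, axis=1))
    draws = vol * rng.standard_normal(n_draws)
    return np.quantile(draws, alpha)             # alpha-percent VaR threshold
```

With m0 = 1 all multipliers equal one and the mixture collapses to a single Gaussian, so the simulated quantile approaches the closed-form Gaussian VaR.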
Table 6 shows the failure rates for all models, i.e. the fraction of cases in which the returns at horizon h during the out-of-sample period fall below the α-percent threshold which, except for VaRs based on historical volatility, changes over time with the volatility dynamics captured by the various models. Ideally, empirical failure rates should closely correspond to the nominal VaR probabilities. Various tests are available for the null hypothesis that the empirical fraction of failures, say α∗, is identical to the nominal coverage ratio α, i.e. H0: α = α∗ against H1: α ≠ α∗. The simplest is the likelihood ratio test proposed by Kupiec (1995), which unconditionally tests for identity of the hypothesized coverage ratio with the realized one. Since this test does not exclude dependencies in VaR exceedances (which should be absent for a correctly specified VaR forecast), a more involved test combines both unconditional coverage and absence of first-order serial dependence in its null hypothesis. In Table 6 we indicate non-rejection of the null of the Kupiec test as well as of Christoffersen's (1998) test, which also includes absence of first-order dependence in its null, both at the 5% level, together with the empirical fractions of VaR violations. We note that results for time horizons beyond one day have to be taken with a grain of salt because of the dependence of the sequence of VaR forecasts, which is not reflected in the distribution of the test statistics we use. We nevertheless believe that the results are indicative of the general tendencies of the outcome of our exercise.
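The unconditional coverage test of Kupiec (1995) can be sketched as below; this is an illustrative implementation in which the chi-square(1) p-value is obtained via the complementary error function:

```python
import math

def kupiec_pvalue(n_obs, n_fail, alpha):
    """Kupiec (1995) likelihood-ratio test of unconditional coverage.
    H0: the true failure probability equals the nominal VaR level alpha.
    The LR statistic is asymptotically chi-square with one degree of
    freedom."""
    pi_hat = n_fail / n_obs
    if pi_hat in (0.0, 1.0):             # degenerate MLE: 0*log(0) terms vanish
        ll_alt = 0.0
    else:
        ll_alt = (n_obs - n_fail) * math.log(1.0 - pi_hat) + n_fail * math.log(pi_hat)
    ll_null = (n_obs - n_fail) * math.log(1.0 - alpha) + n_fail * math.log(alpha)
    lr = max(-2.0 * (ll_null - ll_alt), 0.0)
    # chi2(1) survival function: P(X > lr) = erfc(sqrt(lr / 2))
    return math.erfc(math.sqrt(lr / 2.0))
```

Christoffersen's (1998) conditional coverage test adds an independence component to this likelihood ratio, testing whether violations cluster in time.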
We actually do not find too many non-rejections of the nulls of the Kupiec and Christoffersen tests. This is to some extent due to our large out-of-sample record, which allows relatively small differences between hypothesized and actual coverage ratios to be detected. The basic tendencies in Table 6 are the following: the worst results are obtained with simple VaR predictions based on historical volatility, which by far overestimate the probability of adverse outcomes. GARCH, FIGARCH and FITVGARCH are also too conservative throughout, but improve upon HV, with the long memory models showing smaller deviations from target than GARCH. The Gaussian approximation for the LMSM model shows roughly the same performance as the GARCH family, while the VaR built upon the numerical approximation of the compound marginal distribution for BMSM often gets much closer to the target. In fact, it is the only model that exhibits a certain number of non-rejections of correct coverage, particularly for longer horizons at the 1% threshold and shorter horizons at the 5% threshold. It is also the only model that exhibits cases of

underestimation of the risk of large returns (at the lower horizons for α = 0.01). Table 6 also displays results from naively combining the best GARCH-type and MSM-type models (FIGARCH and BMSM), which, however, do not provide much overall improvement compared to the best single models.
Table 7 explores the question of superior predictive ability of our models vis-à-vis all others under the task of VaR prediction. The approach pursued is the same as in the analyses reported in Table 5, but using a different loss function.
In order to capture the overall performance of VaR forecasts, we follow González-Rivera, Lee, and Mishra (2004) by adopting the following loss function:

MVaR(α) = (T − h)⁻¹ Σ_{t=1}^{T−h+1} [α − g_m(r_{t+h}, VaR^α_{t+h})] (r_{t+h} − VaR^α_{t+h})    (16)

with r_{t+h}, t = 1, . . . , T − h + 1, the returns observed in the out-of-sample period, VaR^α_{t+h} the pertinent forecasts, and the function g_m defined as g_m(y, z) = [1 + exp(m(y − z))]⁻¹, which for large m converges to the indicator function that assigns the value one when (negative) returns exceed the VaR threshold and zero otherwise. While using this differentiable function is convenient to satisfy the regularity conditions of the SPA test, the results are actually completely identical, up to the digits reported here, for values of the parameter m beyond some threshold, as well as for the non-differentiable loss function obtained by replacing g_m with the indicator function. Note that the asymmetry of this loss function is motivated by the objective of weighting violations of the VaR forecasts more heavily than the bulk of moderate returns above this boundary. The results
confirm broadly those inferred from Table 6: historical volatility, GARCH and FITVGARCH are clearly inferior to alternative models except at a few selected horizons. Models for which superior predictive capacity cannot be rejected are the following: the FIGARCH model for α = 0.01 and horizons up to 30 days, as well as BMSM for horizons of twenty days and more. For α = 0.05, we find non-rejection of superior performance for the FIGARCH model at all horizons, for BMSM at short and for LMSM at long horizons.
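The loss function of Eq. (16) used in these SPA comparisons can be sketched as follows (illustrative implementation; the clipping of the exponent merely guards against floating-point overflow, and the sample mean is used as normalization):

```python
import numpy as np

def var_loss(returns, var_forecasts, alpha, m=5.0):
    """Asymmetric VaR loss of Gonzalez-Rivera et al. (2004), Eq. (16):
    mean of (alpha - g_m(r, VaR)) * (r - VaR), where
    g_m(y, z) = 1 / (1 + exp(m * (y - z))) smoothly approximates the
    indicator of a VaR violation (return below the threshold) as m grows,
    so violations receive the heavy weight (1 - alpha)."""
    r = np.asarray(returns, dtype=float)
    v = np.asarray(var_forecasts, dtype=float)
    g = 1.0 / (1.0 + np.exp(np.clip(m * (r - v), -700.0, 700.0)))
    return float(np.mean((alpha - g) * (r - v)))
```

Both branches of the loss are non-negative: a violation contributes roughly (1 − α)|r − VaR|, a non-violation only α(r − VaR).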
Overall, it thus appears that, while no model provides completely satisfactory coverage of large risks, a portfolio of FIGARCH and multifractal models appears most suitable for both volatility and VaR predictions. Since these models seem to have different strengths and weaknesses in the various tasks explored here, an important question to be pursued would be their optimal combination. We also note that the combination of long memory and structural changes in the form of the FITVGARCH model does not contribute additional explanatory power on top of the older GARCH-type models in this application, while the more non-linear combination of these aspects within the MSM family seems more successful.

4. Conclusions

In the wake of the recent global financial crisis, a need has emerged for a reconsideration of many facets of the existing
financial system. Among other developments, this has also led to a renewal of interest in Islamic finance. In essence, Islamic
finance attempts to provide financial products and instruments that are consistent with certain principles such as social responsibility, ethical and moral values, and sustainability. Given the prevalent interest in such products, we have investigated the statistical properties of the Dow Jones Islamic Market World Index (DJIM) and have applied up-to-date volatility models to model and forecast the conditional volatility of DJIM returns, accounting for long memory and structural changes in the volatility process, as well as for the possibility that apparent volatility persistence may in fact be due to structural breaks. To this end, we use four different types of volatility models, namely the Generalized Autoregressive Conditional Heteroskedasticity (GARCH), Fractionally Integrated GARCH (FIGARCH), Fractionally Integrated Time Varying GARCH (FITVGARCH) and Markov-switching multifractal (MSM) models, the last in two different forms and with different estimation and forecasting techniques. While the GARCH model serves as our benchmark volatility model, FIGARCH allows for long memory, FITVGARCH covers both long memory and structural breaks simultaneously, and the MSM models capture regime switching that might generate spurious time-series characteristics close to 'true' long memory. The choice of the DJIM is justified by the fact that it is the most widely used and most comprehensive representative of the Islamic stock market, with the most adequate time series available.
Our results show that the MSM models appear superior to the other models considered for forecasting the volatility of DJIM returns, especially at longer horizons and with absolute errors as loss criterion, and that they outperform GARCH, FIGARCH and FITVGARCH in most of the out-of-sample forecast comparison tests. However, this superiority over GARCH-type models only has economic value under the MSE criterion; under the MAE loss function, all time-series models show predictive capabilities inferior to historical volatility. Not surprisingly, the classical GARCH model is the worst performing model in terms of forecasting future volatility among those considered. Using the same set of models for prediction of value-at-risk, we find a clear dominance of the MSM-type models together with FIGARCH over historical volatility, simple GARCH and FITVGARCH. All in all, these results are not too different from those of previous studies of the comparative performance of volatility models: Calvet and Fisher (2004), Lux (2008), Lux and Kaizoji (2007) and Lux, Morales-Arias, and Sattarhoff (2014) all found certain gains in forecastability of volatility and related tasks with MSM compared to GARCH-type models. The DJIM seems no exception and also shows complete agreement with more traditional asset classes in terms of its basic statistical features. This, however, casts doubt on whether investment in the stocks represented in the DJIM could provide any safeguard against extreme market gyrations like those observed over the last couple of years.

References

Abedifar, P., Ebrahim, S.M., Molyneux, P., & Tarazi, A. (2015). Islamic banking and finance: Recent empirical literature and directions for future research. Journal
of Economic Surveys, 29(4), 637–670.
Andersen, T.G., & Bollerslev, T. (1997). Heterogeneous information arrivals and return volatility dynamics: Uncovering the long-run in high frequency returns.
Journal of Finance, 52, 975–1005.
Andreou, E., & Ghysels, E. (2002). Detecting multiple breaks in financial market volatility dynamics. Journal of Applied Econometrics, 17, 579–600.
Baillie, R., Bollerslev, T., & Mikkelsen, H. (1996). Fractionally integrated generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 74,
3–30.
Baillie, R., & Morana, C. (2007). Modeling long memory and structural breaks in conditional variances: An adaptive FIGARCH approach. ICER Working Paper No. 11/2007.
Beine, M., & Laurent, S. (2000). Structural change and long memory in volatility: New evidence from daily exchange rates. University of Liège working paper.
Ben Nasr, A., Ajmi, A., & Gupta, R. (2014). Modeling the volatility of the Dow Jones Islamic Market World Index using a Fractionally Integrated Time Varying
GARCH (FITVGARCH) model. Applied Financial Economics, 24(14), 993–1004.
Ben Nasr, A., Boutahar, M., & Trabelsi, A. (2010). Fractionally integrated time varying GARCH model. Statistical Methods and Applications, 19(3), 399–430.
Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 31(3), 307–327.
Bollerslev, T., & Mikkelsen, H. (1996). Modelling and pricing long memory in stock market volatility. Journal of Econometrics, 73, 151–184.
Bos, C., Franses, P., & Ooms, M. (1999). Long memory and level shifts: Re-analyzing inflation rates. Empirical Economics, 24, 427–449.
Brockwell, P., & Dahlhaus, R. (2004). Generalized Levinson-Durbin and Burg algorithms. Journal of Econometrics, 118(1-2), 129–149.
Calvet, L., & Fisher, A. (2001). Forecasting multifractal volatility. Journal of Econometrics, 105, 27–58.
Calvet, L., & Fisher, A. (2004). Regime switching and the estimation of multifractal processes. Journal of Financial Econometrics, 2, 49–83.
Christoffersen, P. (1998). Evaluating interval forecasts. International Economic Review, 39, 841–862.
Clark, T., & West, K. (2007). Approximately normal tests for equal predictive accuracy in nested models. Journal of Econometrics, 138, 291–311.
Cont, R. (2001). Empirical properties of asset returns: Stylized facts and statistical issues. Quantitative Finance, 1, 223–236.
Davidson, J. (2004). Conditional heteroskedasticity models and a new model. Journal of Business and Economic Statistics, 22, 16–29.
Diebold, F. (1986). Comment on “Modelling the persistence of conditional variances” by R. Engle and T. Bollerslev. Econometric Reviews, 5, 51–56.
Diebold, F., & Mariano, R. (1995). Comparing predictive accuracy. Journal of Business and Economic Statistics, 13(3), 253–263.
Ding, Z., Granger, C., & Engle, R. (1993). A long memory property of stock market returns and a new model. Journal of Empirical Finance, 1, 83–106.
González-Rivera, G., Lee, T.-H., & Mishra, S. (2004). Forecasting volatility: A reality check based on option pricing, utility function, value-at-risk, and predictive likelihood. International Journal of Forecasting, 20, 629–645.
Hammoudeh, S., Mensi, W., Reboredo, J., & Nguyen, D. (2014). Dynamic dependence of the global Islamic equity index with global conventional equity market indices and risk factors. Pacific-Basin Finance Journal, 30(1), 189–206.
Hansen, P.R. (2005). A test for superior predictive ability. Journal of Business and Economic Statistics, 23, 365–380.
Hill, B. (1975). A simple general approach to inference about the tail of a distribution. The Annals of Statistics, 3(5), 1163–1174.
Kupiec, P. (1995). Techniques for verifying the accuracy of risk measurement models. Journal of Derivatives, 3, 73–84.
Lamoureux, C.G., & Lastrapes, W. (1990). Persistence in variance, structural change and the GARCH model. Journal of Business and Economic Statistics, 8, 225–234.
Leövey, A., & Lux, T. (2012). Parameter estimation and forecasting for multiplicative log-normal cascades. Physical Review E, 85, 046114.
Liu, R., di Matteo, T., & Lux, T. (2007). True and apparent scaling: The proximity of the Markov-switching multifractal model to long-range dependence. Physica A, 383, 35–42.
Lobato, I.N., & Savin, N.E. (1998). Real and spurious long memory properties of stock market data. Journal of Business and Economic Statistics, 16, 261–268.
Lux, T. (2008). The Markov-switching multifractal model of asset returns: GMM estimation and linear forecasting of volatility. Journal of Business and Economic Statistics, 26, 194–210.
Lux, T., & Kaizoji, T. (2007). Forecasting volatility and volume in the Tokyo stock market: Long memory, fractality and regime switching. Journal of Economic Dynamics and Control, 31, 1808–1843.
Lux, T., Morales-Arias, L., & Sattarhoff, C. (2014). Forecasting daily variations of stock index returns with a multifractal model of realized volatility. Journal of Forecasting, 33(7), 532–541.
Mandelbrot, B., Fisher, A., & Calvet, L. E. (1997). A multifractal model of asset returns. Mimeo: Cowles Foundation for Research in Economics.
Martens, M., van Dijk, D., & de Pooter, M. (2004). Modeling and forecasting S&P 500 volatility: Long memory, structural breaks and nonlinearity. Tinbergen Institute Discussion Paper 04-067/4.
Morana, C., & Beltratti, A. (2004). Structural change and long-range dependence in volatility of exchange rates: Either, neither or both? Journal of Empirical Finance, 11, 629–658.
Nazlioglu, S., Hammoudeh, S., & Gupta, R. (2015). Structural breaks and GARCH models of exchange rate volatility. Applied Economics, 47(46), 4996–5011.
Rapach, D. E., Strauss, J., & Wohar, M. (2008). Forecasting stock return volatility in the presence of structural breaks. In D. E. Rapach, & M. E. Wohar (Eds.), Forecasting in the presence of structural breaks and model uncertainty. Frontiers of Economics and Globalization, Vol. 3 (pp. 381–416). Bingley, United Kingdom: Emerald.