Hassani 2012
J. Forecast. (2012)
Published online in Wiley Online Library
(wileyonlinelibrary.com) DOI: 10.1002/for.2244
ABSTRACT
In recent years the singular spectrum analysis (SSA) technique has been further
developed and applied to many practical problems. The aim of this research
is to extend and apply the SSA method, using the UK Industrial Production
series. The performance of the SSA and multivariate SSA (MSSA) techniques
was assessed by applying them to eight series measuring the monthly seasonally
unadjusted industrial production of the main sectors of the UK economy. The
results are compared with those obtained using the autoregressive integrated
moving average and vector autoregressive models.
We also develop the concept of causal relationship between two time series
based on the SSA techniques. We introduce several criteria which characterize
this causality. The criteria and tests are based on the forecasting accuracy and
predictability of the direction of change. The proposed tests are then applied and
examined using the UK industrial production series. Copyright © 2012 John
Wiley & Sons, Ltd.
INTRODUCTION
Econometric methods have been widely used to forecast the evolution of quarterly and yearly national
account data. However, the standard time series forecasting models may fail to accurately predict
economic and financial time series. This may be due to technological advances, and change in
government policies and consumer preferences, causing structural breaks and non-stationarity in the
data. Furthermore, for standard forecasting methods such as autoregressive integrated moving average
(ARIMA), assuming stationarity for the data, linearity for the model and normality for the residuals
can provide only an approximation of the true situation. Therefore a method that does not depend
on these assumptions could be very useful for modelling and forecasting economic data. Singular
spectrum analysis (SSA) is a very good example of such a technique (Golyandina et al., 2001).
* Correspondence to: Hossein Hassani, The Business School, Bournemouth University, Bournemouth BH8 8EB, UK.
E-mail: hhassani@bournemouth.ac.uk
In this paper, our aim is to predict the monthly industrial production indices for the UK using
multivariate singular spectrum analysis (MSSA). We are motivated to use MSSA in order to assess the
potential role of cross-industry information for improving forecasts. We also use MSSA to investigate
the causality between these industries. Contrary to the traditional methods of time series forecasting,
SSA is a non-parametric method and makes no prior assumptions about the data.
The eight series examined are interesting and important since they range from traditional industrial
sectors (Basic Metals) to Food and Electricity/Gas. These eight time series contribute at least
50% to aggregate industrial production in the UK economy. Previously, these series have been
analysed in linear and nonlinear contexts in Hassani et al. (2009a), Osborn et al. (1999), Heravi
et al. (2004) and Zhigljavsky et al. (2009). These studies generally found that the production series
are nonlinear and nonstationary. Osborn et al. (1999) considered the extent and nature of seasonality
in these series and found that, for the UK data, seasonality accounts for at least 80% of the
variation in all series except Vehicles. In our recent research (Hassani et al., 2009a), we
used SSA, ARIMA and Holt–Winters methods for forecasting seasonally unadjusted monthly data
on industrial production indicators in Germany, France and the UK. We showed that the quality
of one-step-ahead forecasts is similar for ARIMA and SSA, with Holt–Winters forecasts being slightly
worse. The quality of SSA forecasts at horizons $h = 3$, 6 and 12 is much better than that of
ARIMA and Holt–Winters forecasts, and as $h$ increases the quality of the ARIMA and Holt–Winters
forecasts deteriorates. The standard deviation of the ARIMA and Holt–Winters forecasts also increases
almost linearly with $h$, up to 12 months ahead. The situation was totally different for the SSA
forecasts: their quality was almost independent of the value of $h$. In Hassani et al.
(2009a) we also showed that the SSA, ARIMA and Holt–Winters methods perform similarly well
in predicting the direction of change for small $h$. However, SSA outperforms Holt–Winters and
ARIMA at longer horizons and hence can be considered a reliable method for predicting recessions
and expansions.
The structure of this paper is as follows. A brief introduction to the SSA method is given in the
next section. The descriptive statistics of the series and the results of various tests (such as normality
and nonlinearity tests) are presented in the third section. The performance of ARIMA, SSA, MSSA and vector
autoregressive (VAR) models is considered in the fourth section. A causality test based on the SSA technique
is considered in the fifth section. Finally, the sixth section presents a summary of the study and some
concluding remarks.
SINGULAR SPECTRUM ANALYSIS

SSA decomposes the original time series into a sum of a small number of interpretable components
such as slowly varying trend, oscillatory components and noise. The basic SSA method consists of
two complementary stages: decomposition and reconstruction; both stages include two separate steps.
At the first stage we decompose the series and at the second stage we reconstruct the original series
and use the reconstructed series for forecasting. The key ideas of SSA can be traced back to the 18th
century (de Prony, 1795). However, the commencement of SSA is usually associated with the publications
of Broomhead and King (1986a, 1986b), Fraedrich (1986), Vautard and Ghil
(1989), Vautard et al. (1992) and Allen and Smith (1996). The three books (Golyandina et al., 2001;
Elsner and Tsonis, 1996; Danilov and Zhigljavsky, 1997) are fully devoted to SSA. For a comparison
of SSA with classical methods (ARIMA, the ARAR algorithm, GARCH and Holt–Winters), see
Hassani et al. (2009a, 2009b, 2010), Hassani (2007), Patterson et al. (2010) and Hassani and Zhigljavsky
(2009). For a recent development see Zhigljavsky (2010). A thorough review of SSA for the analysis and
forecasting of economic and financial time series is given in Hassani and Thomakos (2010), and two
new versions of SSA, based on the minimum variance estimator and perturbation theory, have been
introduced in Hassani (2010) and Hassani et al. (2011). Here, a short description of the SSA technique
is given (for more information see Golyandina et al., 2001).
Univariate SSA
Stage 1: Decomposition
Step 1: Embedding and the trajectory matrix
Consider a univariate stochastic process $\{Y_t\}_{t \in \mathbb{Z}}$ and suppose that a realization of size $T$ from this
process is available: $Y_T = [y_1, y_2, \ldots, y_T]$. The single parameter of the embedding is the window
length $L$, an integer such that $2 \le L \le T$. Embedding can be regarded as a mapping operation that
transfers a one-dimensional time series $Y_T$ into a multidimensional series $X_1, \ldots, X_K$ with vectors
$$X_i = [y_i, y_{i+1}, y_{i+2}, \ldots, y_{i+L-1}]' \quad (1)$$
for $i = 1, 2, \ldots, K$, where $K = T - L + 1$ and the prime symbol denotes transposition. These vectors
group together $L$ time-adjacent observations which are supposed to describe the local state of the
underlying process. The vectors $X_i$ are called $L$-lagged vectors (or, simply, lagged vectors). The result of
this step is the trajectory matrix
$$\mathbf{X} = [X_1, \ldots, X_K] = (x_{ij})_{i,j=1}^{L,K} =
\begin{pmatrix}
y_1 & y_2 & y_3 & \cdots & y_K \\
y_2 & y_3 & y_4 & \cdots & y_{K+1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
y_L & y_{L+1} & y_{L+2} & \cdots & y_T
\end{pmatrix} \quad (2)$$
Note that the trajectory matrix $\mathbf{X}$ is a Hankel matrix, which means that all the elements along the
diagonal $i + j = \mathrm{const}$ are equal.
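To make the embedding step concrete, here is a minimal NumPy sketch (our own illustration, not code from the paper) that builds the $L \times K$ trajectory matrix of equation (2):

```python
import numpy as np

def trajectory_matrix(y, L):
    """Embed a 1-D series into its L x K trajectory (Hankel) matrix."""
    T = len(y)
    if not 2 <= L <= T:
        raise ValueError("window length must satisfy 2 <= L <= T")
    K = T - L + 1
    # Column i is the lagged vector X_i = (y_i, ..., y_{i+L-1})'
    return np.column_stack([y[i:i + L] for i in range(K)])

y = np.arange(1.0, 8.0)          # T = 7
X = trajectory_matrix(y, L=3)    # 3 x 5 Hankel matrix
```

Each anti-diagonal of `X` holds a single repeated observation, which is exactly the Hankel property noted above.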
Stage 2: Reconstruction
Step 1: Grouping
The grouping step corresponds to splitting the elementary matrices $\mathbf{X}_i$ (obtained from the singular
value decomposition $\mathbf{X} = \mathbf{X}_1 + \cdots + \mathbf{X}_d$, where $\mathbf{X}_i = \sqrt{\lambda_i}\, U_i V_i'$ and $d = \operatorname{rank}\mathbf{X}$) into several groups and
summing the matrices within each group. Let $I = \{i_1, \ldots, i_p\}$ be a group of indices. Then the matrix
$\mathbf{X}_I$ corresponding to the group $I$ is defined as $\mathbf{X}_I = \mathbf{X}_{i_1} + \cdots + \mathbf{X}_{i_p}$. The procedure of splitting
the set of indices $\{1, \ldots, d\}$ into the disjoint subsets $I_1, \ldots, I_m$ is called the grouping. For a given group $I$, the contribution
of the component $\mathbf{X}_I$ is measured by the share of the corresponding eigenvalues: $\sum_{i \in I} \lambda_i \big/ \sum_{i=1}^{d} \lambda_i$.
Step 2: Diagonal averaging
Diagonal averaging of each matrix $\mathbf{X}_{I_j}$ yields another expansion: $\mathbf{X} = \widetilde{\mathbf{X}}_{I_1} + \cdots + \widetilde{\mathbf{X}}_{I_m}$, where
$\widetilde{\mathbf{X}}_{I_j}$ is the diagonalized version of the matrix $\mathbf{X}_{I_j}$.
This is equivalent to the decomposition of the initial series $Y_T = (y_1, \ldots, y_T)$ into a sum of $m$ series:
$$y_t = \sum_{j=1}^{m} \widetilde{y}_t^{(j)}$$
where $\widetilde{Y}_T^{(j)} = \big(\widetilde{y}_1^{(j)}, \ldots, \widetilde{y}_T^{(j)}\big)$ corresponds to the matrix $\mathbf{X}_{I_j}$. In what follows, we
use two groups of indices, $I_1 = \{1, \ldots, r\}$ and $I_2 = \{r+1, \ldots, L\}$, and associate the group $I_1$
with the signal component and the group $I_2$ with noise.
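The decomposition, grouping and diagonal-averaging steps above can be sketched as follows; this is an illustrative implementation (function names are ours) that takes the leading $r$ eigentriples as the signal group $I_1$:

```python
import numpy as np

def diagonal_average(M):
    """Hankelize a matrix: average its anti-diagonals into a series of length L+K-1."""
    L, K = M.shape
    s = np.zeros(L + K - 1)
    counts = np.zeros(L + K - 1)
    for i in range(L):
        for j in range(K):
            s[i + j] += M[i, j]
            counts[i + j] += 1
    return s / counts

def ssa_reconstruct(y, L, r):
    """Reconstruct the 'signal' component from the leading r eigentriples (I1 = {1..r})."""
    T = len(y)
    K = T - L + 1
    X = np.column_stack([y[i:i + L] for i in range(K)])   # trajectory matrix
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    XI = (U[:, :r] * s[:r]) @ Vt[:r]   # sum of the first r elementary matrices
    return diagonal_average(XI)

t = np.arange(100)
rng = np.random.default_rng(0)
y = np.sin(2 * np.pi * t / 12) + 0.01 * rng.normal(size=100)
signal = ssa_reconstruct(y, L=24, r=2)   # a pure sinusoid has rank 2
```

Keeping all $L$ eigentriples reproduces the original series exactly, since diagonal averaging of the full trajectory matrix simply reads the series back off its anti-diagonals.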
Forecasting
An important advantage of SSA is that it allows the production of forecasts for either the individual
components of the series or the reconstructed series itself. SSA can be applied to forecasting
time series that approximately satisfy the linear recurrent formula (LRF)
$$y_{i+d} = \sum_{k=1}^{d} \alpha_k\, y_{i+d-k}, \qquad 1 \le i \le T - d \quad (4)$$
of some dimension $d$ with coefficients $\alpha_1, \ldots, \alpha_d$. It should be noted that $d = \operatorname{rank}\mathbf{X} = \operatorname{rank}\mathbf{X}\mathbf{X}'$
(the order of the SVD of the trajectory matrix $\mathbf{X}$). An important property of the SSA
decomposition is that, if the original time series $Y_T$ satisfies the LRF (4), then for any $T$ and $L$ there
are at most $d$ nonzero singular values in the SVD of the trajectory matrix $\mathbf{X}$; therefore, even if the
window length $L$ and $K = T - L + 1$ are larger than $d$, we only need at most $d$ matrices $\mathbf{X}_i$ to
reconstruct the series. Let us now formally describe the algorithm for the SSA forecasting method.
The SSA forecasting algorithm, as proposed in Golyandina et al. (2001), computes the vector of LRF coefficients
$$A = \frac{1}{1 - v^2} \sum_{i=1}^{r} \pi_i U_i^{\nabla}$$
where $U^{\nabla} \in \mathbb{R}^{L-1}$ is the vector consisting of the first $L-1$ components of the vector $U \in \mathbb{R}^{L}$,
$\pi_i$ is the last component of the eigenvector $U_i$, and $v^2 = \sum_{i=1}^{r} \pi_i^2$.
It can be proved that the last component $y_L$ of any vector $Y = (y_1, \ldots, y_L)' \in \mathcal{L}_r$ is a linear
combination of the first $L-1$ components, i.e.
$$y_L = \alpha_1 y_{L-1} + \cdots + \alpha_{L-1} y_1$$
and this does not depend on the choice of the basis $U_1, \ldots, U_r$ in the linear space $\mathcal{L}_r$.
The forecasts are then given by the time series $Y_{T+h} = (y_1, \ldots, y_{T+h})$ defined by the formula
$$y_i = \begin{cases} \widetilde{y}_i & \text{for } i = 1, \ldots, T \\[2pt] \sum_{j=1}^{L-1} \alpha_j\, y_{i-j} & \text{for } i = T+1, \ldots, T+h \end{cases} \quad (5)$$
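The recurrent forecasting step can be sketched as follows: compute the LRF coefficients $A = \frac{1}{1-v^2}\sum_i \pi_i U_i^{\nabla}$ from the leading $r$ eigenvectors, then iterate (5) for $h$ steps. This helper is our own illustration of the standard algorithm, not the authors' code:

```python
import numpy as np

def ssa_forecast(y, L, r, h):
    """Recurrent SSA forecast of h future values from the leading r eigentriples."""
    T = len(y)
    K = T - L + 1
    X = np.column_stack([y[i:i + L] for i in range(K)])   # trajectory matrix
    U = np.linalg.svd(X, full_matrices=False)[0][:, :r]   # leading eigenvectors
    pi = U[-1, :]                      # last components of the eigenvectors
    v2 = float(pi @ pi)                # verticality coefficient, must be < 1
    A = (U[:-1, :] @ pi) / (1.0 - v2)  # LRF coefficients, natural index order
    series = list(y)
    for _ in range(h):
        window = np.array(series[-(L - 1):])  # last L-1 values, in time order
        series.append(float(A @ window))      # apply the LRF once per step
    return np.array(series[T:])

t = np.arange(96)
y = np.sin(2 * np.pi * t / 12)        # rank-2 series satisfying an exact LRF
fc = ssa_forecast(y, L=24, r=2, h=12)
```

For a series that satisfies an exact LRF, such as the sinusoid above, the recurrent forecast continues the series exactly (up to numerical precision).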
THE DATA
The data in this study are taken from the Office for National Statistics (ONS) and represent eight major
components of industrial production in the UK. The series examined in this research are seasonally
unadjusted monthly indices for real output in Food Products, Chemicals, Basic Metals, Fabricated
Metals, Machinery, Electrical Machinery, Vehicles and Electricity/Gas Industries.
The same eight series, ending in 1995 and 2007, have been previously analysed in Osborn et al.
(1999), Heravi et al. (2004) and Hassani et al. (2009a). As explained in these papers, these industries
have been chosen primarily because of their importance. Here we have updated the data; in
all cases the sample period starts in January 1978 and ends in July 2009, giving a long series of
380 observations. The movements in the series are dominated by seasonality and broadly represent
periods of growth in the 1980s and during 2000–2007 and periods of stagnation or recessions during
the early 1990s and 2008–2009.
In line with the usual convention for the economic time series and for comparability, all series are
analysed in the logarithmic form. To investigate short-term movements and the degree of seasonality,
Table I shows the descriptive statistics for the first difference of each series. Sample means and sample
standard deviations are scaled by 100 and, since the data are logarithmically transformed, they effectively
refer to monthly percentage changes in the original series. Table I shows that, with the exception
of Food, Chemicals and Electricity/Gas, the production series have declined over the period. In particular,
Chemicals show substantial growth over the period, with an average increase of about 0.2%
per month (2.24% per year) and Basic Metals and Vehicles show an average decline of 0.17% per
month or 2% per year. The sample standard deviations indicate that the series have broadly similar
variability, with Food less volatile and Vehicles more volatile than the others. In the table, R2 is a
measure of seasonality and is calculated from a regression of the first differences against 12 monthly
dummy variables. The highest seasonality for these series is for production of Electricity and Gas.
These values, which are obtained on longer time series, are lower than the R2 reported in Osborn
et al. (1999).
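The seasonality measure described above, the $R^2$ of a regression of the first differences on 12 monthly dummies, reduces to comparing per-month means with the overall mean, since the fitted values of a dummy-only regression are the month-specific means. A small sketch (the function name is ours):

```python
import numpy as np

def seasonal_r2(dy, month):
    """R^2 from regressing first differences dy on 12 monthly dummy variables.

    month[i] in {0,...,11} gives the calendar month of observation i; the fitted
    value for each observation is simply the mean of dy within its month.
    """
    dy = np.asarray(dy, dtype=float)
    month = np.asarray(month)
    fitted = np.array([dy[month == m].mean() for m in month])
    ss_res = np.sum((dy - fitted) ** 2)
    ss_tot = np.sum((dy - dy.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# A perfectly seasonal difference series gives R^2 = 1
months = np.tile(np.arange(12), 10)
dy = np.sin(2 * np.pi * months / 12)
r2 = seasonal_r2(dy, months)
```

A series whose first differences are fully explained by the calendar month yields $R^2$ near 1, as reported for Electricity and Gas.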
A detailed and comprehensive statistical analysis of the data (such as various tests of normality,
nonlinearity and cross-correlation analysis) can be found in Zhigljavsky et al. (2009). The results
indicate that there is strong evidence of non-normality and there is a significant degree of nonlinear
dependence among the series. The cross-correlation analysis of these series shows that the strongest
correlations are generally at lag zero or at very short lags.
FORECASTING RESULTS
In this section, the performance of the SSA and MSSA techniques is assessed by applying it to the
UK industrial production series. The results are compared with those obtained using ARIMA and
VAR models.
Here, we use the ratio of the root mean squared errors (RRMSE) and the percentage of forecasts
that correctly predict the direction of change to measure forecast accuracy. We also computed
other measures based on the magnitude of forecast errors, such as the relative root mean absolute error.
These measures yield results similar to the RRMSE; we therefore do not report them.
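The two accuracy measures used throughout this section can be computed as follows (an illustrative sketch; the paper gives no code, and the function names are ours):

```python
import numpy as np

def rrmse(err_a, err_b):
    """Ratio of root mean squared forecast errors, method A over method B.

    Values below 1 favour method A (e.g. SSA over ARIMA).
    """
    return np.sqrt(np.mean(np.square(err_a)) / np.mean(np.square(err_b)))

def direction_hit_rate(actual, forecast, last_observed):
    """Share of forecasts that predict the sign of the change correctly."""
    actual_sign = np.sign(np.asarray(actual) - np.asarray(last_observed))
    forecast_sign = np.sign(np.asarray(forecast) - np.asarray(last_observed))
    return np.mean(actual_sign == forecast_sign)
```

For instance, `rrmse` applied to two error series whose RMSEs are 1 and 2 returns 0.5, the kind of value summarized in Table II.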
In computing Box–Jenkins ARIMA forecasts, we need to choose the lags, the degree of
differencing and the degree of seasonality (p, d, q), (P, D, Q)_s, where s = 12. To do this we use
the maximum order of lags set by the software (we used the computer packages SPSS and EViews) and
apply the Bayesian information criterion (BIC). The SSA parameters, the window length L and the
number of eigentriples r, are chosen based on the eigenvalue spectra and separability. The parameters
(L, r) of SSA and the orders (p, d, q), (P, D, Q)_s of the ARIMA models are those obtained when the
models are estimated using data up to the end of 2007. In SSA and MSSA, we used L = 24 and
r = 14 for all series.
Table II shows the out-of-sample RMSE ratios. Some summary statistics are also given at the
bottom of the table. The summary statistics are the average RRMSE and the score. The score is
the number of times when the first forecasting technique yields lower RMSE than the second. For
the VAR model, we use the same series as a supportive series which provides the minimum RMSE
in forecasting one step ahead; these series are shown in parentheses. For example, for forecasting
the Food products series in the VAR model the Chemical series was used. This series was chosen
as the results of the one-step-ahead forecasts for the Food series give minimum RMSE if we use
the Chemical series as the second series. The same series is also used for MSSA. Note also that the
Granger causality test confirms that there exists Granger causality between the two series considered
here; this is indicated by a symbol in Table II.
Table II shows that the SSA forecasts are much better than the forecasts obtained by the ARIMA model.
The averages and the scores for one- and three-step-ahead forecasts show that the VAR forecasts are
also much better than the forecasts obtained by the ARIMA model. However, the advantage of the
VAR, relative to ARIMA, shrinks at horizons greater than 3. The scores also confirm
that the SSA forecasts outperform the forecasts produced by the ARIMA model at all horizons. Across
the eight series, SSA outperforms ARIMA 7, 7, 7 and 6 times at the horizons h = 1, 3, 6 and 12
respectively, while the VAR outperforms the ARIMA 8, 5, 3 and 2 times. MSSA also outperforms the
ARIMA model 7, 7, 8 and 8 times and the VAR model 5, 8, 8 and 7 times at h = 1, 3, 6
and 12.
As can be seen from Table II, MSSA outperforms ARIMA by a larger margin than it outperforms the VAR model. On
average, the results obtained by MSSA are up to 43%, 38%, 24% and 18% better than the ARIMA
model. The results also indicate that MSSA outperforms the VAR model: on average, the results
obtained by MSSA are up to 15%, 37%, 34% and 26% better than the VAR model.
The results of Table II confirm that the quality of the SSA forecasts (both univariate and multivariate)
at horizons h = 1, 3, 6 and 12 is much better than the quality of the ARIMA and VAR forecasts. This
observation supports the following facts: (i) most of the series considered here have
a complex structure of trend and seasonality, and this structure is well recovered by SSA; (ii) SSA
forecasts are more robust than ARIMA and VAR forecasts with respect to the presence of shocks in
the series. Note that the current recession started within the period that we considered as the
out-of-sample forecast period.
Using the modified Diebold–Mariano statistic given in Harvey et al. (1997), we test the
statistical significance of the forecasting results. An asterisk indicates results significant at the 5%
level or less. Comparing the SSA forecasts with ARIMA at the 5% significance level
or less, SSA outperforms ARIMA significantly 6, 5, 3 and 5 times at the horizons h = 1, 3, 6 and 12, respectively.
Table II (extract): RRMSE scores. The score is the number of series (out of eight) for which the first method yields a lower RMSE than the second.

h     SSA/ARIMA   VAR/ARIMA   MSSA/ARIMA   MSSA/VAR
1         7           8           7           5
3         7           5           7           8
6         7           3           8           8
12        6           2           8           7
The second column represents the standard deviation (SD) of the forecast errors. As the results
show, MSSA provides the minimum SD among all the methods, at all horizons. For example, for one-step-ahead
forecasts, the SD of the out-of-sample forecast errors for MSSA is 0.0704, whereas it
is 0.0838 for the VAR model. Note also that the performance of MSSA is much better than that of the
VAR model at longer horizons. Similar results are obtained for SSA compared with the ARIMA model.
Median, minimum and maximum values are also reported in the table, showing similar results. The
overall conclusion is that, without exception, SSA outperforms ARIMA in univariate forecasting,
and MSSA outperforms the VAR model in multivariate forecasting.
Figure 1. Empirical cumulative distribution functions for absolute forecasting errors of ARIMA, SSA, VAR
and MSSA
SSA-BASED CAUSALITY TESTS

A question that frequently arises in time series analysis is whether one economic variable can
help in predicting another economic variable. One way to address this question was proposed by
Granger (1969), in which he formalized the causality concept. Here, we develop new tests and
criteria which are based on the forecasting accuracy and predictability of the direction of change
of the SSA algorithms.
If $F^{(h,d)}_{Y|X} < F^{(h,d)}_{X|Y}$, we conclude that $X$ is more supportive to $Y$ than is $Y$ to $X$. Otherwise, if $F^{(h,d)}_{X|Y} < F^{(h,d)}_{Y|X}$, we conclude
that $Y$ is more supportive to $X$ than is $X$ to $Y$.
However, if $F^{(h,d)}_{Y|X} < 1$ and $F^{(h,d)}_{X|Y} < 1$, we then conclude that $X$ and $Y$ are mutually supportive. We
shall call this F-feedback (forecasting feedback), which means that the use of a multivariate system
improves the forecasting for both series.
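The chunk does not reproduce the formal definition of $F^{(h,d)}_{X|Y}$; a natural reading, consistent with the text, is the ratio of the RMSE of the multivariate forecast of $X$ (helped by $Y$) to the RMSE of the univariate forecast of $X$. A sketch under that assumption, with our own function name:

```python
import numpy as np

def f_criterion(err_multi, err_uni):
    """F_{X|Y}: RMSE of the multivariate forecast of X (using Y) over the RMSE
    of the univariate forecast of X. Values below 1 mean Y supports X.

    NOTE: this RMSE-ratio form is our assumption; the paper's formal
    definition is not shown in this excerpt.
    """
    return np.sqrt(np.mean(np.square(err_multi)) / np.mean(np.square(err_uni)))

# F-feedback: both ratios below 1 means the two series are mutually supportive
f_xy = f_criterion([0.8, 1.0], [1.0, 1.4])   # forecasting X with Y's help
f_yx = f_criterion([0.9, 1.1], [1.0, 1.2])   # forecasting Y with X's help
mutually_supportive = (f_xy < 1) and (f_yx < 1)
```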
To check whether the discrepancy between the two forecasting procedures is statistically significant,
we apply the Diebold and Mariano (1995) test statistic, with the corrections suggested in Harvey
et al. (1997).
Let $Z_{X,i}$ take the value 1 if the forecast correctly predicts the direction of change and 0 otherwise. Then
$\bar{Z}_X = \sum_{i=1}^{n} Z_{X,i}/n$ is the proportion of forecasts that correctly predict the direction of the
series. The Moivre–Laplace central limit theorem implies that, for large samples, the test statistic
$2(\bar{Z}_X - 0.5)\, n^{1/2}$ is approximately distributed as standard normal. When $\bar{Z}_X$ is significantly larger
than 0.5, the method is said to have the ability to predict the direction of change. Alternatively, if $\bar{Z}_X$
is significantly smaller than 0.5, the method tends to give the wrong direction of change.
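The Moivre–Laplace test above is straightforward to implement; a small sketch (ours, standard library only):

```python
import math

def direction_change_stat(hits):
    """Test whether the proportion of correct direction forecasts exceeds 0.5.

    Under the null of no directional predictability, 2*(Zbar - 0.5)*sqrt(n)
    is approximately standard normal for large n.
    """
    n = len(hits)
    zbar = sum(hits) / n
    stat = 2.0 * (zbar - 0.5) * math.sqrt(n)
    # two-sided p-value from the standard normal CDF
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(stat) / math.sqrt(2.0))))
    return stat, p

stat, p = direction_change_stat([1] * 70 + [0] * 30)  # 70% correct out of 100
```

With 70 correct calls out of 100, the statistic is 4.0 and the null of no directional predictability is clearly rejected.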
For the multivariate case, let $Z_{X|Y,i}$ take the value 1 if the forecast of the direction of change of
the series $X$, having information about the series $Y$, is correct and 0 otherwise. Then we define the
following criterion: $D^{(h,d)}_{X|Y} = \bar{Z}_X / \bar{Z}_{X|Y}$, where $h$ and $d$ have the same interpretation as for $F^{(h,d)}_{X|Y}$.
The criterion $D^{(h,d)}_{X|Y}$ characterizes the improvement we obtain from the information contained in
$Y_{T+h}$ (or $X_{T+h}$) for forecasting the direction of change in the $h$-step-ahead forecast.
If $D^{(h,d)}_{X|Y} < 1$, then having information about the series $Y$ helps us to obtain better predictions of
the direction of change for the series $X$. Alternatively, if $D^{(h,d)}_{X|Y} > 1$, then the univariate SSA is better
than the multivariate version. Similarly to the forecasting accuracy criterion, we can define a feedback
system and check which series is more supportive based on the predictability of the direction
of change.
To test the significance of the value of $D^{(h,d)}_{X|Y}$, we use the test developed in Zhigljavsky et al.
(2009). As in the comparison of two proportions, when we test a hypothesis concerning the difference
between two proportions, we first need to know whether the two proportions are dependent;
the test differs depending on whether the proportions are independent or dependent. In our case,
$Z_X$ and $Z_{X|Y}$ are obviously dependent. Let us consider the test statistic for the difference between
the forecast values of $Z_X$ and $Z_{X|Y}$, as shown in Table V.
The estimated proportion of correct forecasts using the multivariate system is $P_{X|Y} = (a + b)/n$,
and the estimated proportion using the univariate version is $P_X = (a + c)/n$. The difference between
the two estimated proportions is
$$\hat{\Delta} = P_{X|Y} - P_X = \frac{a+b}{n} - \frac{a+c}{n} = \frac{b-c}{n} \quad (6)$$
Since the two population probabilities are dependent, we estimate the standard error as follows:
$$SE(\hat{\Delta}) = \frac{1}{n}\sqrt{(b+c) - \frac{(b-c)^2}{n}} \quad (7)$$
The hypotheses of interest are
$$H_0: \Delta_d = 0 \qquad H_a: \Delta_d \neq 0 \quad (8)$$
The test statistic, assuming that the sample size is large enough for the normal approximation
to the binomial distribution, is
$$T_d = \frac{\hat{\Delta} - \Delta_0 - 1/n}{SE(\hat{\Delta})} \quad (9)$$
where $1/n$ is the continuity correction. In our case $\Delta_0 = 0$, and the test statistic becomes
$$T_d = \frac{(b-c)/n - 1/n}{\frac{1}{n}\sqrt{(b+c) - (b-c)^2/n}} = \frac{b-c-1}{\sqrt{(b+c) - (b-c)^2/n}} \quad (10)$$
Under the null hypothesis of no difference between the samples $Z_X$ and $Z_{X|Y}$, $T_d$ has a standard
normal distribution.
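The statistic (10) is a McNemar-style test for dependent proportions; a compact sketch (our own helper name), where `b` and `c` are the discordant counts (multivariate correct/univariate wrong, and vice versa):

```python
import math

def dependent_proportions_stat(b, c, n):
    """T_d of equation (10): dependent-proportions test with the 1/n
    continuity correction: (b - c - 1) / sqrt((b + c) - (b - c)**2 / n)."""
    return (b - c - 1) / math.sqrt((b + c) - (b - c) ** 2 / n)

# e.g. out of n = 100 forecasts, the multivariate system alone is right 20
# times (b) and the univariate alone 8 times (c)
t_d = dependent_proportions_stat(b=20, c=8, n=100)
```

Compared against the standard normal, `t_d` above exceeds 1.96, so the improvement from the multivariate system would be significant at the 5% level in this illustrative example.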
Empirical results
The results presented in the previous section showed that the VAR model is not the best choice in
predicting the UK industrial production series, as MSSA decisively outperforms the VAR model.
We have also shown that these series are nonlinear, non-stationary and not normally distributed.
The Granger causality test reported in Zhigljavsky et al. (2009) confirms that there is a Granger
causality between certain series considered in the multivariate forecasting approach. The Food and the
Chemical series have been found as the series where Granger causality was most apparent (the linear
and non-linear correlations for these two series are 0.75 and 0.85, respectively).
We performed h = 1, 3, 6 and 12-step-ahead forecasts based on the most up-to-date information
available at the time of the forecast. We use d-step-ahead information about the Food series as additional
information in forecasting h steps ahead for the Chemical series, and vice versa. We denote the
corresponding statistics $F^{(h,d)}_{F|C}$ and $F^{(h,d)}_{C|F}$, respectively. Note that we select window length 24 for both
univariate and multivariate SSA. In model selection for the VAR model we use the Akaike information
criterion (AIC) and the Schwarz information criterion (SIC) to identify the VAR model
order. An asterisk indicates significant results at the 1% level.
As we mentioned earlier, the Granger causality test results confirm that there is a significant relationship
between the Food and the Chemical series. To examine this using the SSA technique, we next
consider MSSA with d additional observations for one series. For example, we use the Food series
up to time t and the Chemical series up to time t + d in forecasting h steps ahead for the Food series
to compute $F^{(h,d)}_{F|C}$. We use a similar procedure in forecasting the Chemical series. We expect this
additional information to give better results in both forecasting accuracy and direction of change
prediction.
Let us first consider the results obtained for the Food series using extra information from the
Chemical series. As can be seen from Table VI, the accuracy of the results obtained using MSSA
is better than that obtained using SSA for h ≥ 6, with respect to the $F^{(h,d)}_{F|C}$ column (MSSA/SSA). MSSA
also improved the predictability of the direction of change of the Food series for all considered horizons
(according to the $D^{(h,d)}_{F|C}$ column (MSSA/SSA) in Table VII). The results show that, for the Food series, we have
improved both accuracy and direction of change of the forecasts for h ≥ 6, while the
prediction of the direction of change has also been improved for h < 6. This indicates that we can at least
improve the direction of change predictability for the Food series using extra information from the
Chemical series.
Note also that $F^{(h,d)}_{F|C}$ for h = 1, ..., 6 is equal to 1 or slightly greater than 1, which indicates that
the multivariate version could not help in short-horizon forecasting. To examine this more precisely,
we would need a longer forecasting period. On average, we were able to improve both forecasting
accuracy and direction of change predictability by 4%. As the results show, the VAR model cannot
help us in forecasting the Food series using extra information about the Chemical series, according to
the results provided in the second column of Tables VI and VII. We have also compared the VAR results
with those obtained by MSSA. As can be seen from the tables, the results obtained by MSSA are much
better than those provided by the VAR model. On average, the MSSA results are up to 50% and
13% better than the VAR model in forecasting accuracy and predictability of the direction of change, respectively.
Let us now consider the results of forecasting the Chemical series having extra observations for
the Food series. As can be observed from the columns $F^{(h,d)}_{C|F}$ and $D^{(h,d)}_{C|F}$, the errors for the MSSA forecast
and direction of change, with d additional observations, are much smaller than those obtained for
the univariate SSA. These results are also better than those obtained using the multivariate approach
with zero lag difference. This is not surprising, as the additional data used are highly correlated with
the series we are forecasting. As the results show, the accuracy of MSSA has increased
significantly. However, it seems that this correlation, whether linear or nonlinear, is not
properly captured by the VAR model.
For example, in forecasting one step ahead of the Chemical series with extra information on
the Food series, compared to the univariate case we improved the accuracy and the direction of
change of the forecasts by up to 45% and 35% (column 4 of Tables VI and VII), respectively.
On average, the forecasting accuracy has been improved by up to 39% and 22% using MSSA
and the VAR model, respectively. However, the direction of change results presented in Table VII for
the Chemical series confirm that, even though the VAR model could help us to improve the forecasting
accuracy, it fails to improve the predictability of the direction of change. Comparing MSSA with the
VAR model, the MSSA technique is up to 24% and 9% better in terms of direction of change prediction
and forecasting accuracy, respectively.
Moreover, $F^{(h,d)}_{C|F} < F^{(h,d)}_{F|C}$ indicates that the Food series is more supportive than the Chemical
series in terms of forecasting accuracy. Furthermore, the results of Table VII confirm that there exists
D-feedback between the Food and Chemical series for h = 1, ..., 12. This means that, considering
both the Food and Chemical series simultaneously and using SSA, we are able to improve the
predictability of the direction of change. On average, the MSSA results have improved the percentage
of correct prediction of the series movement by up to 4% and 22% for the Food and the Chemical
series, respectively. Moreover, there exists F-feedback between the Food and the Chemical series for
h = 6, ..., 12.
Finally, the results obtained by the VAR and ARIMA models, in terms of both forecasting accuracy
and direction of change predictability, confirm that for the Food series the VAR model cannot help us to
improve our results, even though there exists Granger causality between the series. However, as the results
show, the VAR model can improve the results obtained for the Chemical series except for h = 7 and
9 (probably because our out-of-sample size was small). On average, for the Chemical
series, the MSSA and VAR models enable us to improve the forecasting accuracy by up to 34% and 12%,
respectively. However, there is no improvement in the direction of change prediction using the VAR
model for either the Food or the Chemical series.
SUMMARY AND CONCLUSIONS

In this paper, we have described the methodology of SSA and demonstrated that SSA can be successfully
applied to the analysis and forecasting of the industrial production series for the UK.
This research has illustrated that the SSA technique performs well in the simultaneous extraction of
harmonics and trend components, even in circumstances where there is a structural break in the series.
A comparison of forecasting results showed that SSA is more accurate than the ARIMA model, confirming
the results obtained in Hassani et al. (2009a). We were motivated to use MSSA because these
production series are highly correlated with each other. The multivariate SSA also outperformed the
VAR model in predicting the values and direction of the production series, according to the RMSE
criterion and the direction of change results.
The empirical cumulative distribution function and descriptive statistics of the errors were obtained
by SSA, MSSA, ARIMA and the VAR model. The results confirmed that the errors obtained by
SSA were stochastically smaller than those obtained by ARIMA in the univariate case. Similarly,
MSSA forecast errors were stochastically smaller than those obtained by the VAR model. The results
obtained from the empirical cumulative distribution functions were in line with those concluded from
the descriptive statistics. The overall conclusions according to these criteria are that SSA outperforms
the ARIMA model in the univariate case, and MSSA outperforms the VAR model in the multivariate
situation.
We also developed a new approach to testing for causality between two arbitrary univariate time
series, introducing several test statistics and criteria. We believe that
our approach is superior to the criteria used for testing Granger causality, which are based on an
autoregressive integrated moving average (p, d, q) or multivariate vector autoregressive (VAR) representation
of the data. Our criteria are based on the idea of minimizing a loss function, forecasting
accuracy and predictability of the direction of change.
The performance of the proposed tests was examined using the Food and the Chemical
production series. It was shown that, on average, the Chemical series causes the Food series and
the Food series causes the Chemical series in terms of the predictability of the direction of change. We
also found that there exists two-way causality between the Food and the Chemical series in terms of
the forecasting accuracy criterion for horizons longer than 6 months. On the other hand, the VAR model
fails to exploit the dependence between the Food and the Chemical series, as the VAR only marginally
improves the accuracy of the forecasts.
REFERENCES
Allen MR, Smith LA. 1996. Monte Carlo SSA: detecting irregular oscillations in the presence of colored noise.
Journal of Climate 9: 3373–3404.
Broomhead DS, King GP. 1986a. Extracting qualitative dynamics from experimental data. Physica D 20: 217–236.
Broomhead DS, King GP. 1986b. On the qualitative analysis of experimental dynamical systems. In Nonlinear
Phenomena and Chaos, Sarkar S (ed.). Adam Hilger: Bristol; 113–144.
Danilov D, Zhigljavsky A (eds). 1997. Principal components of time series: The ‘Caterpillar’ method. University
of St Petersburg: St Petersburg (in Russian).
de Prony GS. 1795. Essai expérimental et analytique: sur les lois de la dilatabilité des fluides élastiques et sur
celles de la force expansive de la vapeur de l'alkool, à différentes températures. Journal de l'Ecole Polytechnique
1(22): 24–76.
Diebold FX, Mariano RS. 1995. Comparing predictive accuracy. Journal of Business and Economic Statistics 13(3):
253–263.
Elsner JB, Tsonis AA. 1996. Singular spectrum analysis: A new tool in time series analysis. Plenum Press: New
York.
Fraedrich K. 1986. Estimating the dimension of weather and climate attractors. Journal of Atmospheric Sciences
43: 419–432.
Golyandina N, Nekrutkin V, Zhigljavsky A. 2001. Analysis of time series structure: SSA and related techniques.
Chapman & Hall/CRC: New York.
Granger CWJ. 1969. Investigating causal relations by econometric models and cross-spectral methods. Economet-
rica 37(3): 424–438.
Harvey DI, Leybourne SJ, Newbold P. 1997. Testing the equality of prediction mean squared errors. International
Journal of Forecasting 13: 281–291.
Hassani H. 2007. Singular spectrum analysis: methodology and comparison. Journal of Data Science 5(2):
239–257.
Hassani H. 2010. Singular spectrum analysis based on the minimum variance estimator. Nonlinear Analysis: Real
World Applications 11(3): 2065–2077.
Hassani H, Thomakos D. 2010. A review on singular spectrum analysis for economic and financial time series.
Statistics and its Interface 3(3): 377–397.
Hassani H, Zhigljavsky A. 2009. Singular spectrum analysis: methodology and application to economics data.
Journal of Systems Science and Complexity 22(3): 372–394.
Hassani H, Heravi S, Zhigljavsky A. 2009a. Forecasting European industrial production with singular spectrum
analysis. International Journal of Forecasting 25(1): 103–118.
Hassani H, Soofi A, Zhigljavsky A. 2009b. Predicting daily exchange rate with singular spectrum analysis.
Nonlinear Analysis: Real World Applications 11(3): 2023–2034.
Hassani H, Dionisio A, Ghodsi M. 2010. The effect of noise reduction in measuring the linear and nonlinear
dependency of financial markets. Nonlinear Analysis: Real World Applications 11(1): 492–502.
Hassani H, Zhigljavsky A, Zhengyuan X. 2011. Singular spectrum analysis based on the perturbation theory.
Nonlinear Analysis: Real World Applications 12(5): 2752–2766.
Heravi S, Osborn DR, Birchenhall CR. 2004. Linear versus neural network forecasts for European industrial
production series. International Journal of Forecasting 20: 435–446.
Osborn DR, Heravi S, Birchenhall CR. 1999. Seasonal unit roots and forecasts of two-digit European industrial
production. International Journal of Forecasting 15: 27–47.
Patterson K, Hassani H, Heravi S, Zhigljavsky A. 2010. Forecasting the final vintage of the industrial production
series. Journal of Applied Statistics 38(10): 2183–2211.
Vautard R, Ghil M. 1989. Singular spectrum analysis in nonlinear dynamics, with applications to paleoclimatic time
series. Physica D 35: 395–424.
Vautard R, Yiou P, Ghil M. 1992. Singular-spectrum analysis: a toolkit for short, noisy chaotic signals. Physica D
58: 95–126.
Zhigljavsky A. 2010. Singular spectrum analysis for time series: introduction to this special issue. Statistics and its
Interface 3(3): 255–258.
Zhigljavsky A, Hassani H, Heravi S. 2009. Forecasting European industrial production with multi-
variate singular spectrum analysis. International Institute of Forecasters, 2008–2009 SAS/IIF Grant.
Available: http://forecasters.org/pdfs/SAS-Report.pdf [accessed on 10 February 2012].
Authors’ biographies:
Hossein Hassani is a Senior Lecturer in Statistics and Econometrics at the Business School, Bournemouth University.
His research interests include singular spectrum analysis, time series analysis and forecasting, econometrics, data
mining and applied statistics.
Saeed Heravi is a Reader in Quantitative Methods at Cardiff Business School, Cardiff University. His research
interests are in time series analysis, forecasting, applied econometrics and index numbers.
Anatoly Zhigljavsky is a Professor in Statistics at the School of Mathematics, Cardiff University. His research
interests are in time series analysis, experimental design, stochastic global optimization, dynamical systems in
search and optimization, and applied statistics.
Authors’ addresses:
Hossein Hassani, The Business School, Bournemouth University, Bournemouth BH8 8EB, UK; Statistical
Research and Training Center (SRTC), Tehran, Iran.
Saeed Heravi, Cardiff Business School, Cardiff University, Cardiff CF10 3EU, UK.
Anatoly Zhigljavsky, Cardiff School of Mathematics, Cardiff University, Cardiff CF24 4AG, UK.