Carlos3 CS

Econometric Game 2013 Team 4
April 10, 2013
Contents
1 Introduction
2 Data Description
3 Methodology and Results of the Baseline Model
  3.1 ARIMA Model
  3.2 Dynamic Factor Models
    3.2.1 Factor Estimation
    3.2.2 Estimating the number of factors
  3.3 Lasso Regression Method
4 Results and Forecasts Evaluation
  4.1 DFM Results
  4.2 Lasso Results
  4.3 Forecast Evaluation
  4.4 2013 Forecasts
5 Conclusions
References
1 Introduction
Forecasting the real GDP growth rate is crucially important for countries in order to take efficient public policy decisions. This is in general a very challenging task. It becomes even more difficult in times of crisis, when governments need to make key public interventions in order to correct the negative dynamics of the economy.
Nowadays countries in Southern Europe are experiencing the biggest fiscal imbalances in recent economic history. The current situation seems to call for fiscal stabilization policies, which have to be properly assessed by also taking into account the growth prospects of the countries. Forcing a fiscal adjustment during a crisis might generate vicious circles that are difficult to escape from. There is indeed some empirical evidence in the literature on non-linearities in the way fiscal policy affects the economy. To this extent, it is of crucial importance to be able to forecast GDP properly.
Recent approaches implemented to forecast key macroeconomic variables take advantage of today's rich databases. The possibility of extracting value from the additional information available can significantly improve the forecasts. Because of the nature of these datasets, the classical OLS framework is not feasible for estimation, as the number of regressors (and therefore of parameters) is usually bigger than the number of observations. However, there are different ways to deal with this dimensionality problem. The general approaches that have been used in the literature are: a) hard thresholding; b) soft thresholding; c) index models; d) forecast combination.
In the following we are interested in forecasting the real GDP growth rate for Spain in 2013 by using a rich dataset from the OECD Economic Outlook. In the next section we give a more detailed description of the data. Section 3 presents the models and the estimation results. First, we construct a naive benchmark model. Afterwards, we develop different forecasting approaches exploiting the rich data environment. Finally, in Section 4 we compare and evaluate the forecasts. Section 5 concludes.
2 Data Description
We use a very rich dataset for Spain coming from the OECD Economic Outlook. It includes all relevant macroeconomic variables for the period 1970-2012 at quarterly frequency. The information includes time series of GDP, prices, expenditures, current accounts, exports, imports, exchange rates, deflators, employment and interest rates. We have a total of 70 variables. Several variables are, however, repeated at different price levels, or both in value and in volume. Whenever possible, we keep the variables at 2005 prices in USD and in volume rather than in value. We also add the appropriate deflators. As our target is to forecast the volume of GDP, this strategy does not generate any information loss and at the same time prevents us from overfitting the model with redundant variables. After deleting the observations with missing values we are left with a balanced panel spanning the period 1977-2012, with 143 observations. We work with a total of 45 variables.
All the time series are plotted in Figure 2.1. We can observe that almost all of them are clearly trending over time, while many present a dynamic evolution which could be consistent with a white noise process. Before using these variables in our forecasting model, we therefore need to test for their stationarity. We run on each time series the Augmented Dickey-Fuller (ADF) test, whose null hypothesis is that the process has a unit root. Given that the data are quarterly, we use 4 lags to take into account the likely high correlation between observations within the same year. We can observe from Table 1 that the MacKinnon approximate p-values are very high, and we cannot reject the null hypothesis at the 5% level for all but two of the variables (the prices of non-commodity imports and of non-commodity exports). This is an expected result when dealing with macroeconomic variables and we can easily tackle this difficulty. We compute growth rates of all the series rather than log-differences (as we have several negative values), solving the non-stationarity in this way. Price levels and deflators are the only problematic variables, as the plots of their growth rates still leave us doubtful about their stationarity. We follow the common practice of taking growth rates again, in order to make sure that the series are stationary.
The presence of large outliers in the time series might distort the inference of our analysis. In order to control for this potential threat, we replace the extreme values of each time series (above the 97.5th and below the 2.5th percentile) by the mean of the neighbouring values (linear interpolation). To this extent, we simply follow the strategy of Bech, Marcellino and Banerjee (2011).
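The two transformations described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the growth rate is assumed to be the plain ratio (x_t - x_{t-1}) / x_{t-1} (the report does not say how negative levels are treated), and "neighbouring values" is read as the nearest non-extreme observation on each side.

```python
import math


def growth_rates(levels):
    """Period-on-period growth rates (x_t - x_{t-1}) / x_{t-1}."""
    return [(curr - prev) / prev for prev, curr in zip(levels, levels[1:])]


def percentile(sorted_values, pct):
    """Percentile by linear interpolation between order statistics."""
    pos = pct / 100.0 * (len(sorted_values) - 1)
    low = math.floor(pos)
    high = min(low + 1, len(sorted_values) - 1)
    frac = pos - low
    return sorted_values[low] + frac * (sorted_values[high] - sorted_values[low])


def replace_outliers(series, lower_pct=2.5, upper_pct=97.5):
    """Replace values outside the [lower_pct, upper_pct] percentile band
    by the mean of the nearest non-extreme neighbours (assumed reading)."""
    ordered = sorted(series)
    lo = percentile(ordered, lower_pct)
    hi = percentile(ordered, upper_pct)
    extreme = [x < lo or x > hi for x in series]
    cleaned = list(series)
    for i, flagged in enumerate(extreme):
        if not flagged:
            continue
        left = next((series[j] for j in range(i - 1, -1, -1) if not extreme[j]), None)
        right = next((series[j] for j in range(i + 1, len(series)) if not extreme[j]), None)
        neighbours = [v for v in (left, right) if v is not None]
        if neighbours:  # leave the value untouched if every observation is flagged
            cleaned[i] = sum(neighbours) / len(neighbours)
    return cleaned
```

Applying `growth_rates` twice corresponds to the second round of differencing used for price levels and deflators.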
Figure 2.1. Time series of relevant macroeconomic variables for Spain (1977-2012). Source: OECD Economic Outlook.
[One panel per series, horizontal axis in years (1970 to 2015). Panels: CBD, CBGDPR, CGV, CPV, EE, ES, ET, ET_NA, EXCH, EXCHEB, FDDV, GDPVD, IBGV, IRL, IRS, ITISKV, ITV, LF, MGSVD, NTRD, PCG, PCP, PFDD, PGDP, PIT, PITISK, PMGS, PMGSX, PMNW, PTDD, PXGS, PXGSX, PXNW, RPMGS, RPXGS, SHTGSVD, TDDV, TEV, TEVD, UNR, XGS, XGSV, XGSVD, XMKT, XPERF.]
Table 1. Augmented Dickey-Fuller tests performed on each time series.

T statistic  MacKinnon p-value  Variable (label)
      -1.70               0.43  Final domestic expenditure, deflator
      -1.61               0.48  GDP deflator
      -1.03               0.74  Gross total fixed capital formation deflator
      -1.14               0.70  Gross capital formation deflator
      -1.91               0.33  Imports of goods and services deflator
      -1.74               0.41  Total domestic expenditure deflator
      -2.20               0.21  Exports of goods and services deflator
      -1.02               0.74  Dependent employment
      -0.72               0.84  Total self-employed
      -1.15               0.70  Total employment
      -1.01               0.75  Total employment, National Accounts basis
      -0.51               0.89  Labour force
      -1.86               0.35  Unemployment rate
      -0.98               0.76  Long-term interest rate on government bonds
      -0.69               0.85  Short-term interest rate
      -1.61               0.48  Export performance for goods and services, volume
      -1.19               0.68  Government final consumption expenditure
      -0.71               0.84  Private final consumption expenditure
      -1.18               0.68  Final domestic expenditure
      -0.66               0.86  Gross domestic product, volume, at 2005 PPP, USD
      -1.34               0.61  Private non-residential and government fixed capital formation
      -1.46               0.56  Gross capital formation
      -1.43               0.57  Gross fixed capital formation
      -1.19               0.68  Total domestic expenditure
      -0.58               0.87  Total expenditure
      -0.58               0.87  Total expenditure, volume, 2005 USD
       1.11               1.00  Exports of goods and services, value, National Accounts basis
       0.81               0.99  Exports of goods and services, volume, National Accounts basis
      -2.00               0.28  Government final consumption expenditure deflator
      -1.71               0.43  Private final consumption expenditure deflator
      -1.80               0.38  Current account balance, value in USD
      -2.33               0.16  Current account balance, as a percentage of GDP
      -2.31               0.17  Exchange rate, USD per National currency
      -2.39               0.14  Nominal effective exchange rate, chain-linked, overall weights
      -0.74               0.84  Imports of goods and services, volume, USD, 2005 prices
      -0.76               0.83  Net current international transfers, value, balance of payments basis, USD
      -3.07               0.03  Price of non-commodity imports of goods and services
       0.84               0.99  Price of commodity imports
      -3.72               0.00  Price of non-commodity exports of goods and services
       0.29               0.98  Price of commodity exports
      -2.17               0.22  Relative price of imported goods and services
      -2.20               0.21  Relative price of exported goods and services
      -1.38               0.59  Share of country's trade expressed in USD volume (2005 prices)
       0.81               0.99  Exports of goods and services, volume, USD, 2005 prices
       1.31               1.00  Export market for goods and services, volume, USD, 2005 prices
Figure 3.1. Forecasts of GDP growth for the period 2003.1-2012.4. ARIMA model estimated on 1977.3-2002.4.
[Line chart, horizontal axis 2003 to 2012, vertical axis from -.02 to .03. Series: GDP growth, with ± 2 S.E. bands.]
Forecast evaluation statistics reported with the figure:
Root Mean Squared Error       0.005725
Mean Absolute Error           0.003868
Mean Abs. Percent Error       173.8783
Theil Inequality Coefficient  0.415270
  Bias Proportion             0.154562
  Variance Proportion         0.353654
  Covariance Proportion       0.491785
3 Methodology and Results of the Baseline Model
As a baseline we estimate an ARIMA model for GDP growth. Further, we use other approaches that are able to exploit the rich data more efficiently. In particular, we use diffusion indexes obtained by estimating a dynamic factor model à la Stock and Watson (2002), and the Lasso. We prefer to use different models in order to have some room for making comparisons. In this section we provide some of the results and a brief introduction to Dynamic Factor Models and the Lasso.
3.1 ARIMA Model
Based on information criteria (AIC and BIC) and the Q-statistic autocorrelation test, we selected an AR(4) model. The model was estimated for the period 1977.3-2002.4 and used to forecast the period 2003.1-2012.4, as presented in Figure 3.1. Forecast accuracy indicators were computed in order to compare them with those of the other models in the following subsections.
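A minimal sketch of this baseline, assuming the AR(p) model is estimated by OLS with an intercept and that coefficients are held fixed over the forecast period (the report does not state the estimator or software used). The function names `fit_ar` and `one_step_forecasts` are hypothetical.

```python
import numpy as np


def fit_ar(y, p=4):
    """OLS estimates (intercept, phi_1, ..., phi_p) of an AR(p) model."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Row for target y[t] holds 1, y[t-1], ..., y[t-p], for t = p, ..., n-1.
    columns = [np.ones(n - p)] + [y[p - k:n - k] for k in range(1, p + 1)]
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef


def one_step_forecasts(y, coef, start):
    """One-step-ahead forecasts of y[start:], using observed lags and
    coefficients estimated on the earlier sample (requires start >= p)."""
    y = np.asarray(y, dtype=float)
    p = len(coef) - 1
    return np.array(
        [coef[0] + coef[1:] @ y[t - p:t][::-1] for t in range(start, len(y))]
    )
```

With quarterly growth rates in `y`, `fit_ar(y[:n_est], p=4)` followed by `one_step_forecasts(y, coef, start=n_est)` mirrors the split between an estimation sample and an evaluation sample, where `n_est` is the (hypothetical) number of estimation observations.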
3.2 Dynamic Factor Models
Dynamic factor models (DFMs) were initially proposed by Geweke (1977) as the time-series extension of factor models previously designed for cross-sectional data. The starting point of DFMs is that the dynamics of a high-dimensional ($n$) time-series vector ($X_t$) are driven by a few ($q$) common factors $f_t$ and an idiosyncratic $n$-vector of disturbances $e_t$. The use of DFMs in economics became widespread after Geweke (1977) and Sargent and Sims (1977), who allowed both the factors and the idiosyncratic errors to be serially correlated. The factors ($f_t$) are usually assumed to follow a VAR process, whereas the idiosyncratic disturbances ($e_t$) are assumed to follow univariate autoregressive processes. Thus, DFMs can be written as:

$$X_t = \lambda(L) f_t + e_t \tag{3.1}$$

$$\Psi(L) f_t = \eta_t \tag{3.2}$$

where the lag polynomials $\lambda_i(L)$ are the dynamic factor loadings of each series in $X_t$. Assume initially that both equations (3.1) and (3.2) are stationary. The idiosyncratic error $e_t$ is assumed to be uncorrelated with the factors' innovations at all leads and lags ($E(e_t \eta_{t-k}') = 0$ for all $k$). In the exact dynamic factor model it is also assumed that the idiosyncratic disturbances are mutually uncorrelated at all leads and lags, that is, $E(e_{it} e_{js}) = 0$ for all $s$ if $i \neq j$.
As noted by Stock and Watson (2011), when the factors are known and the errors ($e_t$ and $\eta_t$) are Gaussian, an individual variable can be efficiently forecast by regressing it on the lagged factors and lags of the variable itself, so that we do not need to include all the $n$ variables in the regression. Thus, in the words of Stock and Watson (2006), DFMs allow us to turn dimensionality from a curse into a blessing. However, not only are the factors unknown, but we also do not know how many of them are driving the data.
3.2.1 Factor Estimation
Denoting the $r \times 1$ vector $(f_t', \ldots, f_{t-p}')'$ as $F_t$ and the $n \times r$ matrix $(\lambda_0, \ldots, \lambda_p)$ as $\Lambda$, where $\lambda_i$ is the $n \times q$ matrix of coefficients on the $i$th lag in $\lambda(L)$, the DFM can be re-written in its static form as:

$$X_t = \Lambda F_t + e_t \tag{3.3}$$

$$A(L) F_t = G \eta_t \tag{3.4}$$

where $A(L)$ contains 1s, 0s and elements of $\Psi(L)$, and $G$ is composed of 1s and 0s. Note that the number of static factors will be $r \leq (p+1)q$ because some lagged factors could be redundant. As will become evident below, this state-space formulation has important advantages for estimation.
As indicated by Stock and Watson (2011), estimation methods can be divided into three classes. The first class considers a small number of series, so that the factors and the model's parameters can be estimated using Gaussian maximum likelihood (MLE) and the Kalman filter. The need for a small number of parameters comes from the fact that the procedure requires non-linear optimization. The second class of approaches uses non-parametric estimation via some averaging method, among which principal components is the most usual. Finally, as factors can be consistently estimated by principal components (for large $n$), in the last class of methods these estimates are used to estimate the parameters of the state-space model, solving the dimensionality issue of the first class of approaches.
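As a small worked illustration of the stacking into static form (a sketch with assumed dimensions, not taken from the original analysis), consider one dynamic factor ($q = 1$) that loads with one lag ($p = 1$) and follows an AR(1) with a hypothetical coefficient $\psi$:

```latex
\begin{align*}
X_t &= \lambda_0 f_t + \lambda_1 f_{t-1} + e_t, \qquad f_t = \psi f_{t-1} + \eta_t, \\
F_t &= \begin{pmatrix} f_t \\ f_{t-1} \end{pmatrix}, \qquad
\Lambda = \begin{pmatrix} \lambda_0 & \lambda_1 \end{pmatrix}, \qquad
X_t = \Lambda F_t + e_t, \\
F_t &= \begin{pmatrix} \psi & 0 \\ 1 & 0 \end{pmatrix} F_{t-1}
      + \begin{pmatrix} 1 \\ 0 \end{pmatrix} \eta_t
\;\Longrightarrow\;
A(L) = I_2 - \begin{pmatrix} \psi & 0 \\ 1 & 0 \end{pmatrix} L, \qquad
G = \begin{pmatrix} 1 \\ 0 \end{pmatrix}.
\end{align*}
```

Here the stacked vector has $(p+1)q = 2$ elements; if $\lambda_1 = 0$, the lagged factor is redundant and a single static factor suffices.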
Principal Components
As noted by Stock and Watson (2011), an important motivation of these approaches is that in a (weighted) cross-sectional average of $X_t$, the idiosyncratic disturbances will converge to zero, so that only the linear combinations of the factors will remain. The assumptions required for averaging to work are just:

$$\lim_{n \to \infty} n^{-1} \Lambda' \Lambda = D \tag{3.5}$$

$$\operatorname{maxeval}(\Sigma_e) \leq c < \infty \tag{3.6}$$

where $D$ is $r \times r$ and has full rank, maxeval denotes the maximum eigenvalue and $\Sigma_e$ is the covariance matrix of $e_t$.
Consider a weighting matrix $W$ (with $W'W/n = I$) such that the factors are estimated as $\hat{F}_t = n^{-1} W' X_t$. If $\lim_{n \to \infty} n^{-1} W' \Lambda = H$, where $H$ is $r \times r$ and has full rank, and conditions (3.5) and (3.6) are satisfied, then:

$$\hat{F}_t = n^{-1} W' \Lambda F_t + n^{-1} W' e_t \xrightarrow{p} H F_t \quad \text{as } n \to \infty \tag{3.7}$$

where it was used that $n^{-1} W' e_t \xrightarrow{p} 0$ by the weak law of large numbers. Note that without imposing some additional restrictions $F_t$ and $\Lambda$ are not identified, because the matrix $H$ is unknown. Since $H$ is $r \times r$, we need $r^2$ restrictions to identify the factors and their loadings. The usual normalization assumption $n^{-1} \Lambda' \Lambda = I_r$ provides $r(r+1)/2$ restrictions. The remaining $r(r-1)/2$ restrictions are obtained by imposing $F'F$ to be diagonal, where $F = (F_1, \ldots, F_T)'$.
The matrix $W$ is not unique and can be selected in many different ways.
In the Principal Components approach $W$ is the matrix of (scaled) eigenvectors of the sample covariance matrix of $X_t$. Specifically, for a given number of factors $k$ (not necessarily equal to $r$), the principal components method estimates the factors and loadings by solving the optimization problem:

$$\min_{F_1, \ldots, F_T, \Lambda} S_k \tag{3.8}$$

with $S_k = (nT)^{-1} \sum_{t=1}^{T} (X_t - \Lambda F_t)'(X_t - \Lambda F_t)$, subject to the normalization $n^{-1} \Lambda' \Lambda = I_k$ and to the restriction that $F'F$ be diagonal (which is automatically satisfied). The problem can be solved by concentrating out $F_t$. This gives the least squares estimator of $F_t$ given $\Lambda$, so that $\hat{F}_t(\Lambda) = (\Lambda' \Lambda)^{-1} \Lambda' X_t$. Then, (3.8) can be rewritten as

$$\min_{\Lambda} T^{-1} \sum_{t=1}^{T} X_t' [I - \Lambda (\Lambda' \Lambda)^{-1} \Lambda'] X_t \tag{3.9}$$

But this new problem is equivalent to:

$$\max_{\Lambda} \operatorname{tr}\Big\{ (\Lambda' \Lambda)^{-1/2} \Lambda' \Big( T^{-1} \sum_{t=1}^{T} X_t X_t' \Big) \Lambda (\Lambda' \Lambda)^{-1/2} \Big\} \tag{3.10}$$

which is also equivalent to

$$\max_{\Lambda} \operatorname{tr}\Big\{ \Lambda' \Big( T^{-1} \sum_{t=1}^{T} X_t X_t' \Big) \Lambda \Big\} \tag{3.11}$$

subject to $n^{-1} \Lambda' \Lambda = I_k$. This final problem is the starting point of principal components analysis, whose solution is to set $\hat{\Lambda}$ equal to the (scaled) eigenvectors of $\hat{\Sigma}_{XX} = T^{-1} \sum_{t=1}^{T} X_t X_t'$ corresponding to its $k$ largest eigenvalues. Next, as $\hat{\Lambda}' \hat{\Lambda} = n I_k$, we get $\hat{F}_t = n^{-1} \hat{\Lambda}' X_t$, which are the scaled principal components.
Bai and Ng (2008) summarize the properties of the estimated factors and loadings. Briefly, as proved by Bai and Ng (2002), both estimators are consistent (the average squared deviation between the estimated factors and the space spanned by the true factors vanishes at rate $\min[N, T]$), and they converge to normal distributions. Moreover, for each $t$, the estimated factors are $\sqrt{N}$-consistent for the true factor space, while for each $i$, the estimated factor loadings are $\sqrt{T}$-consistent for the space spanned by the true factor loadings.
Finally, given that the covariance matrix of $e_t$ is not assumed to be diagonal, generalized principal components methods have been proposed to take this feature into account. Several approaches have been proposed to make this procedure feasible (see Forni et al., 2005; Boivin and Ng, 2003; and Stock and Watson, 2005). Nonetheless, empirical applications to real and simulated data do not show that the generalized method systematically produces better forecasting results (see e.g. Boivin et al., 2005; D'Agostino and Giannone, 2006; or Forni et al., 2005).
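The principal components estimator described in this subsection can be sketched with NumPy as below. It is an illustrative implementation under stated assumptions, not the authors' code: the input `X` is assumed to be a T x n array of already standardized, stationary series, and the function name `pc_factors` is hypothetical.

```python
import numpy as np


def pc_factors(X, k):
    """Principal-components estimates of k static factors.

    X : (T, n) array of standardized, stationary series (assumed).
    Returns (F_hat, Lambda_hat, S_k) with Lambda_hat' Lambda_hat / n = I_k.
    """
    X = np.asarray(X, dtype=float)
    T, n = X.shape
    sigma_xx = X.T @ X / T                      # n x n sample second-moment matrix
    eigval, eigvec = np.linalg.eigh(sigma_xx)   # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:k]        # indices of the k largest eigenvalues
    lam = np.sqrt(n) * eigvec[:, order]         # scale so that lam' lam / n = I_k
    F = X @ lam / n                             # F_t = lam' X_t / n for every t
    resid = X - F @ lam.T
    S_k = float((resid ** 2).sum() / (n * T))   # objective value in the S_k formula
    return F, lam, S_k
```

Because the columns of `lam` are orthogonal eigenvectors of `X'X`, the resulting `F.T @ F` is diagonal, which is the second identifying restriction mentioned in the text.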
3.2.2 Estimating the number of factors
In their survey of large dimensional factor analysis, Bai and Ng (2008) highlight two possible information criteria for determining the number of factors:

$$PCP(k) = S_k + k \hat{\sigma}^2 g(n, T) \tag{3.12}$$

$$IC(k) = \ln(S_k) + k g(n, T) \tag{3.13}$$

where $g(n, T)$ is a penalty function, $S_k$ is given by (3.8) and $\hat{\sigma}^2 = S_{kmax}$ for a certain value of $kmax$. In both cases $k$ is determined by minimizing the information criterion over $k$. The authors show that when $g(n, T) \to 0$ and $\min(n, T)\, g(n, T) \to \infty$ as $n, T \to \infty$, the probability of selecting the correct number of factors tends to one for both criteria. A usual penalty function is $g(n, T) = \frac{n+T}{nT} \ln[\min(n, T)]$; however, Bai and Ng (2008) consider three additional possibilities for this function.
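The selection rule based on the IC criterion can be sketched as follows, using only the penalty function written out above. The function names are hypothetical, and the values of S_k are assumed to have been computed beforehand for k = 1, ..., kmax (for example by a principal components routine).

```python
import math


def penalty(n, T):
    """Penalty g(n, T) = (n + T) / (n T) * ln(min(n, T))."""
    return (n + T) / (n * T) * math.log(min(n, T))


def select_num_factors(S, n, T):
    """Pick k minimizing IC(k) = ln(S_k) + k * g(n, T).

    S : sequence with S[k - 1] = S_k for k = 1, ..., kmax.
    Returns (best_k, {k: IC(k)}).
    """
    g = penalty(n, T)
    ic = {k: math.log(s_k) + k * g for k, s_k in enumerate(S, start=1)}
    best_k = min(ic, key=ic.get)
    return best_k, ic
```

Swapping in a different penalty only requires replacing `penalty`; the minimization step is unchanged.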
3.3 Lasso Regression Method
Another solution to the dimensionality problem in forecasting is the Least Absolute Shrinkage and Selection Operator (Lasso), proposed by Tibshirani (1996). The Lasso method is a regularized version of least squares, which adds the constraint that the $L_1$ norm of the parameter vector, $\|\beta\|_1$, is no greater than a given threshold. As is well known, one can write the constrained problem as an unconstrained one using the Lagrangian form of the problem. Hence, the Lasso estimator can be seen as the solution of the least-squares problem with the penalty $\lambda \|\beta\|_1$ added, where $\lambda$ is a given constant. More formally, the Lasso estimate is the solution to

$$\min_{\beta} \frac{1}{n} (Y - X\beta)'(Y - X\beta) + \lambda \sum_{i=1}^{p} |\beta_i|$$

where $p$ is the number of regressors and $0 \leq \lambda < \infty$. If $\lambda$ is equal to 0, we have the OLS problem, and as $\lambda$ gets bigger, more parameters are shrunk to 0, and hence more regressors are excluded from the model. Knight and Fu (2000) studied the asymptotic properties of Lasso-type estimators. They showed that under appropriate conditions, the Lasso estimators are consistent for estimating the regression coefficients. Moreover, Tibshirani (1996) demonstrated that the Lasso is more stable and accurate than traditional variable selection methods such as best subset selection.
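To make the shrinkage mechanism concrete, the sketch below minimizes the objective above, (1/n)(Y - Xβ)'(Y - Xβ) + λ Σ|β_i|, by cyclic coordinate descent with soft thresholding. Coordinate descent is a swapped-in, standard solver chosen for illustration; the report does not say which algorithm was used, and the function names are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold, returning 0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Lasso estimate for the objective (1/n)||y - X b||^2 + lam * sum|b_j|.

    X : list of n rows, each a list of p regressors; y : list of n values.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta = 0 initially
    z = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if z[j] == 0.0:
                continue  # an all-zero column cannot be identified
            # Correlation of column j with the partial residual that excludes j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + z[j] * beta[j]
            # The 1/n scaling of the squared loss gives a threshold of lam / 2.
            new_bj = soft_threshold(rho, lam / 2.0) / z[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta
```

With `lam=0.0` the iteration converges to the least-squares solution when it is unique; increasing `lam` sets the weakest coefficients exactly to zero first.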
4 Results and Forecasts Evaluation
4.1 DFM Results
Following the recommendation of Bai (2004) to work with stationary data, we estimate the space spanned by the factors using the principal components approach (see Stock and Watson, 2002) with the stationary transformation of the variables. For estimating the factors we used the usual normalization criteria (see Bai and Ng, 2008, for a discussion). We used standardized data (not including Spanish GDP). For deciding the number of factors we used the IC information criterion with the three penalty functions discussed in Bai and Ng (2008), and kmax was set equal to 6. The three penalty functions lead to different numbers of factors (1, 4 and 5), so we consider these three possibilities for the forecasting exercise. In the three cases we consider the following model for producing one-step-ahead forecasts of GDP's growth rate:

$$y_{t+h} = c + \beta(L) F_t + \gamma(L) y_t,$$

where $\beta(L)$ and $\gamma(L)$ are polynomials in the lag operator.
4.2 Lasso Results
Given the above-mentioned advantages of the Lasso methodology, we also use it to forecast GDP growth. We consider 8 lags of both GDP and the other macroeconomic variables as covariates, summing to 360 regressors, more than 2 times the number of observations available. We normalize all the variables to have mean 0 and variance 1, and hence we do not include an intercept in the model.
A crucial step to enjoy the nice properties of the Lasso estimator is to choose the tuning parameter $\lambda$ optimally. We follow two approaches: first, we set it arbitrarily to the value 0.5. Alternatively, we use cross-validation, which sets $\lambda$ to approximately 0.1.
It is important to notice that, regardless of the choice of the tuning parameter, we only select first-lag variables. With the ad hoc value of $\lambda$ we select only two variables (in first lag): total employment (National Accounts basis) and private final consumption expenditure (volume). As expected, once we reduce the penalty the number of selected variables increases: on top of the aforementioned variables, the model selects export market for goods and services (volume, USD, 2005 prices) and GDP deflator (market prices). All the estimates have the expected signs: higher employment, inflation, exports and consumption lead to higher GDP.
An interesting feature is that the Lasso constrains the lags of GDP to zero. Nonetheless, we expect that this variable would improve the forecast accuracy of the model, and hence we introduce the extra restriction that the first lag of GDP must be different from zero.

Table 2. Forecast accuracy indicators for all the models

Criterion                     AR(4)     Factor Model 1  Factor Model 4  Factor Model 5  LASSO 2   LASSO 4
Root Mean Squared Error       0.0057    0.0039          0.0042          0.0040          0.0050    0.0044
Mean Absolute Error           0.0039    0.0031          0.0033          0.0032          0.0036    0.0031
Mean Abs. Percent Error       173.8783  119.1316        134.1213        113.6162        166.9765  133.2918
Theil Inequality Coefficient  0.4153    0.2523          0.2660          0.2596          0.3197    0.2817
Bias Proportion               0.1546    0.3156          0.3484          0.3365          0.4985    0.4163
Variance Proportion           0.3537    0.0586          0.0197          0.0543          0.1922    0.1090
Covariance Proportion         0.4918    0.6258          0.6319          0.6091          0.3092    0.4747
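The regressor count quoted in this subsection (45 series times 8 lags = 360 columns) can be reproduced with a small sketch of the design-matrix construction. The helper names and the dictionary layout are assumptions made for illustration; the report does not describe how the lagged matrix was built.

```python
from statistics import mean, pstdev


def standardize(series):
    """Rescale a series to mean 0 and (population) variance 1."""
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series]


def lagged_regressors(data, n_lags=8):
    """Build lagged regressors from a dict {name: list of T observations}.

    Returns (names, rows): one row per period t = n_lags, ..., T - 1,
    holding x[t-1], ..., x[t-n_lags] for every series, in dict order.
    """
    names = [f"{name}_lag{k}" for name in data for k in range(1, n_lags + 1)]
    n_obs = len(next(iter(data.values())))
    rows = [
        [series[t - k] for series in data.values() for k in range(1, n_lags + 1)]
        for t in range(n_lags, n_obs)
    ]
    return names, rows
```

Each series would typically be passed through `standardize` before stacking, in line with the normalization described above.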
4.3 Forecast Evaluation
In this subsection we discuss the forecasting power of the aforementioned models. For comparing the models we consider one-step-ahead forecast errors for the period 2003.1-2012.4. Note that in this exercise we are producing true out-of-sample forecasts, given that the models are estimated using data up to 2002.4. Forecast accuracy is measured through the root mean squared error and the mean absolute error, as is common practice in the forecasting literature. Additionally, to be able to compare these measures statistically, we used the Diebold-Mariano test.
Table 3 shows the sign of the difference between the mean squared errors across the different models; the asterisks denote statistical significance. To read the table, (+) means that the model in the row has a higher mean squared error than the one in the corresponding column.
Table 3. Diebold-Mariano forecast comparison test

                 Factor Model 1  Factor Model 4  Factor Model 5  LASSO 2   LASSO 4   Combination
AR(4)            (+)**           (+)**           (+)**           (+)*      (+)**     (+)***
Factor Model 1                   (-)***          (-)**           (-)**     (-)       (+)**
Factor Model 4                                   (+)***          (+)***    (-)       (+)***
Factor Model 5                                                   (+)***    (-)       (+)***
LASSO 2                                                                    (+)***    (+)***
LASSO 4                                                                              (+)***
Table 4. Point estimates for forecast of GDP growth. All models.

          Factor Model 1  LASSO 4  Combination
2013q1     0.09            0.28     0.11
2013q2    -0.22            0.07    -0.11
2013q3    -0.10            0.20     0.01
2013q4    -0.01            0.14     0.09
2013      -0.24            0.88     0.08
From Tables 2 and 3 we conclude that the model with one factor and the Lasso model with 4 variables are the best options when compared with the other models. These are the models with the smallest root mean squared error, and from the Diebold-Mariano tests we conclude that these differences are statistically significant (although we cannot reject that Factor Model 1 and LASSO 4 have the same forecasting power). Additionally, the combination of forecasts seems to be the best option overall.
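The accuracy measures and the comparison test used in this subsection can be sketched as below. Several choices here are assumptions rather than details confirmed by the report: the Theil coefficient follows the common definition RMSE / (sqrt(mean f^2) + sqrt(mean a^2)), and the Diebold-Mariano statistic uses squared-error loss for one-step-ahead forecasts, with no autocovariance terms, no small-sample correction and a normal approximation for the p-value.

```python
import math


def rmse(actual, forecast):
    """Root mean squared forecast error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute forecast error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def theil_u(actual, forecast):
    """Theil inequality coefficient, between 0 (perfect fit) and 1."""
    scale_f = math.sqrt(sum(f ** 2 for f in forecast) / len(forecast))
    scale_a = math.sqrt(sum(a ** 2 for a in actual) / len(actual))
    return rmse(actual, forecast) / (scale_f + scale_a)


def diebold_mariano(actual, forecast_1, forecast_2):
    """DM statistic for equal squared-error loss (one-step-ahead case).

    A positive statistic means forecast_1 has the higher mean squared error.
    Returns (statistic, two-sided p-value from the normal approximation).
    """
    d = [(a - f1) ** 2 - (a - f2) ** 2
         for a, f1, f2 in zip(actual, forecast_1, forecast_2)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / n
    stat = mean_d / math.sqrt(var_d / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value
```

The sign convention matches the reading rule given for Table 3: calling `diebold_mariano(actual, row_model, column_model)` yields a positive statistic when the row model has the higher mean squared error.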
4.4 2013 Forecasts
We now consider the forecasts of GDP growth produced by our models. The four-period-ahead forecast with the Lasso model presents an additional difficulty, since we need to forecast the explanatory variables. In order to do this without losing the rich information contained in the dataset, we do it in a Factor Augmented VAR (FAVAR; see inter alia Bernanke, Boivin and Eliasz, 2005). As discussed by Banerjee and Marcellino (2009), the FAVAR approach has the drawback of not considering the equilibrium correction term. Unfortunately, due to the time constraint we were not able to implement the FVEC model.
In the next figures and tables we study the forecasts of GDP growth. All the values are in percentage points. Overall, the models forecast growth very close to zero. The only exception is the Lasso model, which forecasts positive growth of about 0.8% (significantly different from zero).
Figure 4.1. Forecast of GDP growth. All models.
[Chart of quarterly forecasts for 2013q1 to 2013q4, vertical axis from -0.30 to 0.40. Series: Factor Model 1, LASSO 4, Combination.]
Figure 4.2. 95% Confidence Intervals of forecast of GDP growth. All models.

          Factor Model 1              LASSO 4                     Combination
          Lower limit  Upper limit    Lower limit  Upper limit    Lower limit  Upper limit
2013q1     0.081        0.107          0.271        0.294          0.094       -0.285
2013q2    -0.232       -0.207          0.059        0.083         -0.126       -0.494
2013q3    -0.115       -0.085          0.183        0.208         -0.002       -0.299
2013q4    -0.023        0.009          0.130        0.155          0.074       -0.196
2013      -0.251       -0.223          0.865        0.889          0.068       -0.261
5 Conclusions
Forecasting the real GDP growth rate becomes even more important in times of crisis, when governments need to choose public interventions with much more care to restore macroeconomic equilibrium. For instance, economies in Southern Europe are currently experiencing a historical peak in debt-to-GDP ratios. Hence, public measures have to be properly implemented by taking into account the growth prospects of the countries.
In this report we have analyzed some of the possible models to forecast GDP growth for Spain. We provide some theoretical background on the different specifications and provide our own forecasts for 2013.
Our forecast results for the GDP growth rate in 2013 are very close to zero. This means that Spain is still not recovering from the crisis. Even in the absence of growth, one should find it comforting that the models do not forecast any further recession. Regarding the fiscal crisis, this result supports the arguments for a smoother adjustment that can also be found in the IMF's country report for Spain of July 2012: "The deficit path envisaged in the SGP should be less front-loaded, in agreement with European partners. The medium-term targets are broadly appropriate, but a smoother path would be more desirable during a period of extreme weakness, when multipliers are likely to be particularly large and the tax base soft, to reduce the risk of creating a negative feedback loop with growth and NPLs, which may also undermine market confidence, especially if targets are missed. Such a smoother path should also be embedded in a prudent macroeconomic framework."
From a methodological point of view, we find that using high-dimensional models is important. This allows a more efficient use of all the information contained in large datasets, and this is reflected in significantly more accurate forecasts. Moreover, these results are robust to testing for data snooping.
References
[1] Jushan Bai. Estimating Cross-Section Common Stochastic Trends in Nonstationary Panel Data. Journal of Econometrics, 122(1), 2004.
[2] Jushan Bai and Serena Ng. A Panic Attack on Unit Roots and Cointegration. Econometrica, 72(4), 2004.
[3] Jushan Bai and Serena Ng. Large Dimensional Factor Analysis. Foundations and Trends in Econometrics, 3(2), 2008.
[4] A. Banerjee, M. Marcellino, and I. Masten. Forecasting with Factor-Augmented Error Correction Models. CEPR Discussion Paper, 2010.
[5] A. Banerjee and Massimiliano Marcellino. Factor-Augmented Error Correction Models. Mimeo, 2009.
[6] Jean Boivin and Serena Ng. Are More Data Always Better for Factor Analysis? Journal of Econometrics, 132(1), 2006.
[7] Keith Knight and Wenjiang Fu. Asymptotics for Lasso-Type Estimators. Annals of Statistics, 28(5), 2000.
[8] James H. Stock and Mark W. Watson. Forecasting with Many Predictors. Handbook of Economic Forecasting.
[9] James H. Stock and Mark W. Watson. Macroeconomic Forecasting Using Diffusion Indexes. Journal of Business and Economic Statistics, 20(2), 2002.
[10] James H. Stock and Mark W. Watson. The Evolution of National and Regional Factors in US Housing Construction. Mimeo, 2008.
[11] James H. Stock and Mark W. Watson. Dynamic Factor Models. Oxford Handbook of Economic Forecasting, 2011.
[12] Robert Tibshirani. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 1996.
