Contents

1 Introduction
2 Data Description
3 Models
  3.1 ARIMA Model
  3.2 Dynamic Factor Models
  3.3 Lasso
4 Results
  4.1 DFM Results
  4.2 Lasso Results
  4.3 Forecast Evaluation
  4.4 2013 Forecasts
5 Conclusions
References
Introduction
Forecasting the real GDP growth rate is crucially important for countries in order to take efficient public policy decisions. In general this is a very challenging task, and it becomes even more difficult in times of crisis, when governments need to make key public interventions in order to correct the negative dynamics of the economy.
Nowadays countries in Southern Europe are experiencing the biggest fiscal imbalances in recent economic history. The current situation seems to call for fiscal stabilization policies, which have to be properly assessed by also taking into account the growth perspective of the countries. Forcing a fiscal adjustment during a crisis might generate vicious circles that are difficult to escape from; there is indeed some empirical evidence in the literature on non-linearities in the way fiscal policy affects the economy. To this extent, it is therefore of crucial importance to be able to forecast GDP properly.
Recent approaches implemented to forecast key macroeconomic variables take advantage of today's rich databases. The possibility of extracting value from the additional information available can significantly improve the forecasts. Because of the nature of these datasets, the classical OLS framework is not feasible for estimation, as the number of regressors (and therefore of parameters) is usually larger than the number of observations. However, there are different ways to deal with this dimensionality problem. The main approaches used in the literature are: a) hard thresholding; b) soft thresholding; c) index models; d) forecast combination. In the following we are interested in forecasting the real GDP growth rate for Spain in 2013 by using a rich dataset from the OECD Economic Outlook. In the next section we provide a more detailed data description. Section 3 presents the models and the estimation results: first we construct a naive benchmark model, and afterwards we develop different forecasting approaches exploiting the rich data environment. Finally, in section 4 we perform forecast comparison and evaluation. Section 5 concludes.
Data Description
We use a very rich dataset for Spain coming from the OECD Economic Outlook. It includes all relevant macroeconomic variables for the period 1970-2012 at quarterly frequency. The information includes time series of GDP, prices, expenditures, current accounts, exports, imports, exchange rates, deflators, employment and interest rates, for a total of 70 variables. Several variables are, however, repeated at different price levels, or both in value and in volume. Whenever possible, we decide to keep the variables at 2005 prices in USD and in volume rather than in value. We also add the appropriate deflators. As our target is to forecast the volume of GDP, this strategy does not generate any information loss and at the same time it prevents us from overfitting the model with redundant variables. After deleting the observations with missing values we are left with a balanced panel spanning the period 1977-2012, with 143 observations. We work with a total of 45 variables.
We can observe that almost all of the variables are clearly trending over time, while a few present a dynamic evolution which could be consistent with a white noise process. Before using these variables in our forecasting model, we therefore need to test for their stationarity. We run the Augmented Dickey-Fuller (ADF) test on each time series; its null hypothesis is that the process has a unit root. Given that the data are quarterly, we use 4 lags to take into account the likely high correlation between observations within the same year. We can observe from Table 1 that the MacKinnon approximate p-values are very high, and we cannot reject the null hypothesis for almost all of the variables. This is an expected result when dealing with macroeconomic variables, and we can easily tackle this difficulty: we compute growth rates of all the series rather than log-differences (as we have several negative values), solving the non-stationarity in this way. Price levels and deflators are the only problematic variables, as the plots of their growth rates leave us still doubtful about their stationarity. For these we follow the common practice of taking growth rates again, in order to make sure we have stationary series.
The presence of large outliers in the time series might distort the inference of our analysis. In order to control for this potential threat, we decide to replace the extreme values of each time series (above the 97.5th and below the 2.5th percentile) by the mean of the neighbouring values (linear interpolation). In this we simply follow the strategy of Bech, Marcellino and Banerjee (2011).
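The outlier rule just described can be sketched with pandas. This is an illustrative implementation under our reading of the text (mask values outside the 2.5-97.5 percentile band, then interpolate linearly); the function name is ours, not the paper's.

```python
# Illustrative sketch of the outlier treatment: values outside the
# 2.5th-97.5th percentile band are dropped and replaced by linear
# interpolation of the neighbouring observations.
import numpy as np
import pandas as pd

def trim_outliers(series: pd.Series) -> pd.Series:
    lo, hi = series.quantile([0.025, 0.975])
    # keep values inside the band; outliers become NaN
    cleaned = series.where((series >= lo) & (series <= hi))
    # NaNs left by .where() are filled by interpolating between neighbours
    return cleaned.interpolate(limit_direction="both")

x = pd.Series([1.0, 2.0, 100.0, 3.0, 4.0, -50.0, 5.0, 6.0, 7.0, 8.0])
print(trim_outliers(x).tolist())
# the spikes 100.0 and -50.0 are replaced by 2.5 and 4.5
```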
[Figure 2.1. Time series of relevant macroeconomic variables for Spain (1977-2012). Source: OECD Economic Outlook. Panels, by OECD variable code: CBD, CBGDPR, CGV, CPV, EE, ES, ET, ET_NA, EXCH, EXCHEB, FDDV, GDPVD, IBGV, IRL, IRS, ITISKV, ITV, LF, MGSVD, NTRD, PCG, PCP, PFDD, PGDP, PIT, PITISK, PMGS, PMGSX, PMNW, PTDD, PXGS, PXGSX, PXNW, RPMGS, RPXGS, SHTGSVD, TDDV, TEV, TEVD, UNR, XGS, XGSV, XGSVD, XMKT, XPERF.]
Table 1. Augmented Dickey-Fuller tests (4 lags) on the variables in levels.

Variable (label); T statistic; MacKinnon p-value
Final domestic expenditure, deflator; -1.70; 0.43
GDP deflator; -1.61; 0.48
Gross total fixed capital formation deflator; -1.03; 0.74
Gross capital formation deflator; -1.14; 0.70
Imports of goods and services deflator; -1.91; 0.33
Total domestic expenditure deflator; -1.74; 0.41
Exports of goods and services deflator; -2.20; 0.21
Dependent employment; -1.02; 0.74
Total self-employed; -0.72; 0.84
Total employment; -1.15; 0.70
Total employment, National Accounts basis; -1.01; 0.75
Labour force; -0.51; 0.89
Unemployment rate; -1.86; 0.35
Long-term interest rate on government bonds; -0.98; 0.76
Short-term interest rate; -0.69; 0.85
Export performance for goods and services, volume; -1.61; 0.48
Government final consumption expenditure; -1.19; 0.68
Private final consumption expenditure; -0.71; 0.84
Final domestic expenditure; -1.18; 0.68
Gross domestic product, volume, at 2005 PPP, USD; -0.66; 0.86
Private non-residential and government fixed capital formation; -1.34; 0.61
Gross capital formation; -1.46; 0.56
Gross fixed capital formation; -1.43; 0.57
Total domestic expenditure; -1.19; 0.68
Total expenditure; -0.58; 0.87
Total expenditure, volume, 2005 USD; -0.58; 0.87
Exports of goods and services, value, National Accounts basis; 1.11; 1.00
Exports of goods and services, volume, National Accounts basis; 0.81; 0.99
Government final consumption expenditure deflator; -2.00; 0.28
Private final consumption expenditure deflator; -1.71; 0.43
Current account balance, value in USD; -1.80; 0.38
Current account balance, as a percentage of GDP; -2.33; 0.16
Exchange rate, USD per National currency; -2.31; 0.17
Nominal effective exchange rate, chain-linked, overall weights; -2.39; 0.14
Imports of goods and services, volume, USD, 2005 prices; -0.74; 0.84
Net current international transfers, value, balance of payments basis, USD; -0.76; 0.83
Price of non-commodity imports of goods and services; -3.07; 0.03
Price of commodity imports; 0.84; 0.99
Price of non-commodity exports of goods and services; -3.72; 0.00
Price of commodity exports; 0.29; 0.98
Relative price of imported goods and services; -2.17; 0.22
Relative price of exported goods and services; -2.20; 0.21
Share of country's trade expressed in USD volume (2005 prices); -1.38; 0.59
Exports of goods and services, volume, USD, 2005 prices; 0.81; 0.99
Export market for goods and services, volume, USD, 2005 prices; 1.31; 1.00
[Figure 3.1. Forecasts of GDP growth, ARIMA model estimated over 1977.3-2002.4. The panel shows one-step-ahead forecasts for 2003-2012 with ±2 S.E. bands and the forecast evaluation statistics (root mean squared error, mean absolute error, mean absolute percent error, Theil inequality coefficient, bias, variance and covariance proportions).]
Models

As a baseline we estimate an ARIMA model for GDP growth. We then use other approaches able to exploit the rich data in a more efficient way: diffusion indexes via the estimation of a dynamic factor model à la Stock and Watson (2002), and the Lasso. We prefer to use different models in order to have some room for making comparisons. In this section we provide some of the results and a brief introduction to dynamic factor models and the Lasso.
3.1 ARIMA Model
Based on information criteria (AIC and BIC) and the Q-statistic autocorrelation test, we selected an AR(4) model. The model was estimated over the period 1977.3-2002.4 and used to forecast the period 2003.1-2012.4, as presented in Figure 3.1. Forecast accuracy indicators were computed in order to compare this benchmark with the models of the following subsections.
3.2 Dynamic Factor Models
Dynamic factor models (DFMs) were initially proposed by Geweke (1977) as the time-series extension of factor models previously designed for cross-sectional data. The starting point of DFMs is that the dynamics of a high-dimensional (n) time-series vector X_t are driven by a few (q) common factors f_t and an idiosyncratic n-vector of disturbances e_t. The use of DFMs in economics became widespread after Geweke (1977) and Sargent and Sims (1977), who allowed both the factors and the idiosyncratic errors to be serially correlated. The factors f_t are usually assumed to follow a VAR process, whereas the idiosyncratic disturbances e_t are assumed to follow univariate autoregressive processes. Thus, DFMs can be written as:
X_t = λ(L) f_t + e_t   (3.1)

Ψ(L) f_t = η_t   (3.2)

where λ(L) is an n × q matrix of lag polynomials and both (3.1) and (3.2) are stationary. The idiosyncratic errors are assumed to be uncorrelated with the factor innovations at all leads and lags (E(e_t η'_{t-k}) = 0 for all k). In the exact dynamic factor model it is also assumed that the idiosyncratic disturbances are mutually uncorrelated (E(e_it e_jt) = 0 for i ≠ j).
As noted by Stock and Watson (2011), when the factors are known and the errors (e_t and η_t) are Gaussian, an individual variable can be efficiently forecast by regressing it on the lagged factors and on lags of the variable itself, so that we do not need to include all n variables in the regression. Thus, in the words of Stock and Watson (2006), DFMs allow us to turn dimensionality from a curse into a blessing. However, not only are the factors unknown, but we also do not know how many of them are driving the data.
3.2.1 Factor Estimation

The DFM in (3.1)-(3.2) can be cast in static form by stacking the current and lagged factors as F_t, so that Λ is the n × r matrix of coefficients collecting the loadings on each lag in λ(L):

X_t = Λ F_t + e_t   (3.3)

A(L) F_t = G η_t   (3.4)

where A(L) contains 1s, 0s and the elements of Ψ(L), and G is composed of 1s and 0s. The number of static factors will be r ≤ pq, with p the degree of λ(L), because some lagged factors could be redundant. As will become evident below, this state-space representation underlies several of the estimation methods.
As indicated by Stock and Watson (2011), estimation methods can be divided into three classes. The first class considers a small number of series, so that the factors and the model's parameters can be estimated by Gaussian maximum likelihood (MLE) and the Kalman filter; the need for a small number of parameters comes from the fact that the procedure requires non-linear optimization. The second class of approaches uses non-parametric estimation via some averaging method, among which principal components is the most usual. Finally, as the factors can be consistently estimated by principal components (for large n), in the last class of methods these estimates are used to estimate the parameters of the state-space model, solving the dimensionality issue of the first approach.
Principal Components
As noted by Stock and Watson (2011), an important motivation for these approaches is that in a (weighted) cross-sectional average of X_t the idiosyncratic disturbances will converge to zero, so that only the linear combinations of the factors remain. This requires:

lim_{n→∞} n^{-1} Λ'Λ = D   (3.5)

maxeval(Σ_e) ≤ c < ∞ for all n   (3.6)

where D is a full-rank r × r matrix and Σ_e is the covariance matrix of e_t. Consider a weighting matrix W (with W'W/n = I) and the averaging estimator F̂_t = n^{-1} W'X_t. If lim_{n→∞} n^{-1} W'Λ = H, with H an r × r matrix of full rank, and conditions (3.5) and (3.6) are satisfied, then:

F̂_t = n^{-1} W'Λ F_t + n^{-1} W'e_t →p H F_t as n → ∞   (3.7)

since n^{-1} W'e_t →p 0
by the weak law of large numbers. Note that without imposing some additional restrictions, the decomposition into F_t and Λ is not identified. Since H is r × r, we need r² restrictions to identify the factors and their loadings. The usual normalization assumption n^{-1} Λ'Λ = I_r provides r(r + 1)/2 restrictions. The remaining r(r - 1)/2 restrictions are obtained by requiring F'F to be diagonal, where F = (F_1, ..., F_T)'.
The matrix Λ collects the loadings of the static representation of X_t. For a given number of factors k (not necessarily equal to r), the principal components method estimates the factors and loadings by solving the optimization problem:

min_{F_1,...,F_T, Λ} S_k   (3.8)

with S_k = (nT)^{-1} Σ_{t=1}^{T} (X_t − Λ F_t)'(X_t − Λ F_t), subject to the restrictions n^{-1} Λ'Λ = I_k and F'F diagonal. Minimizing first with respect to F_t given Λ yields F̂_t(Λ) = (Λ'Λ)^{-1} Λ'X_t, so that the problem becomes:

min_Λ T^{-1} Σ_{t=1}^{T} X_t'[I − Λ(Λ'Λ)^{-1}Λ']X_t   (3.9)

or, equivalently,

max_Λ tr{(Λ'Λ)^{-1/2} Λ'(Σ_{t=1}^{T} X_t X_t') Λ (Λ'Λ)^{-1/2}}   (3.10)

subject to the restriction

n^{-1} Λ'Λ = I_k.   (3.11)
This final problem is the starting point of principal components analysis, whose solution sets Λ̂ equal to √n times the eigenvectors of Σ̂ = T^{-1} Σ_t X_t X_t' corresponding to its k largest eigenvalues. With the normalization Λ̂'Λ̂ = n I_r, we get F̂_t = n^{-1} Λ̂'X_t, the first k principal components of X_t. Bai and Ng (2008) summarize the properties of the estimated factors and loadings. Briefly, as proved by Bai and Ng (2002), both estimators are consistent (the average squared deviation between the estimated and the true factor space vanishes) at rate min[N, T], and they converge to normal distributions: for each t the estimated factors are asymptotically normal, while for each i so are the estimated loadings.
When the covariance matrix of e_t is not assumed to be diagonal, generalized principal components methods can be used instead. Several approaches have been proposed to make this procedure feasible (see Forni et al., 2005; Boivin and Ng, 2003; and Stock and Watson, 2005). Nonetheless, empirical applications to real and simulated data do not show the generalized method to systematically produce better forecasting results (see e.g. Boivin et al., 2005; D'Agostino and Giannone, 2006; or Forni et al., 2005).
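As a minimal sketch, the principal components estimator described above reduces to an eigendecomposition. The example below is illustrative (synthetic data, k chosen by hand, numpy only); it is not the paper's code.

```python
# Principal-components factor estimation: eigenvectors of the sample
# second-moment matrix give the loadings (normalized so Lam'Lam = n*I),
# and the factors follow as F_t = n^{-1} Lam' X_t.
import numpy as np

rng = np.random.default_rng(2)
T, n, k = 120, 40, 2

# generate data with k true common factors plus idiosyncratic noise
F_true = rng.normal(size=(T, k))
Lam_true = rng.normal(size=(n, k))
X = F_true @ Lam_true.T + 0.5 * rng.normal(size=(T, n))
X = (X - X.mean(0)) / X.std(0)            # standardize, as in the text

S = X.T @ X / T                           # sample second-moment matrix
eigval, eigvec = np.linalg.eigh(S)        # eigh returns ascending order
order = np.argsort(eigval)[::-1]
Lam_hat = np.sqrt(n) * eigvec[:, order[:k]]   # normalization Lam'Lam = n*I
F_hat = X @ Lam_hat / n                   # estimated factors, T x k

print(F_hat.shape)  # (120, 2)
```

The estimated factors are identified only up to sign and rotation, which is irrelevant for forecasting since they enter a subsequent regression.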
3.2.2 Number of Factors

In their survey on large dimensional factor analysis, Bai and Ng (2008) highlight two possible information criteria for determining the number of factors:

PCP(k) = S_k + k σ̂² g(n, T)   (3.12)

IC(k) = ln(S_k) + k g(n, T)   (3.13)

where g(n, T) is a penalty function, S_k is given by (3.8) and σ̂² = S_kmax for a certain value of kmax. In both cases the number of factors is chosen as the value of k that minimizes the criterion. The authors show that when n, T → ∞, the probability of selecting the correct number of factors tends to one for both criteria. A usual penalty function is g(n, T) = ((n + T)/(nT)) ln[min(n, T)]; Bai and Ng (2008) consider three additional possibilities for this function.
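The IC criterion with the usual penalty can be computed directly from the eigenvalues of the data's second-moment matrix, since S_k equals the sum of the discarded eigenvalues divided by n. The sketch below is illustrative (simulated data with 3 true factors, kmax = 6 as in the paper's exercise); the function name is ours.

```python
# Sketch of factor-number selection with IC(k) = ln(S_k) + k*g(n,T),
# using the usual penalty g(n,T) = ((n+T)/(nT)) * ln(min(n,T)).
import numpy as np

def ic_number_of_factors(X: np.ndarray, kmax: int) -> int:
    T, n = X.shape
    eigval = np.sort(np.linalg.eigvalsh(X.T @ X / T))[::-1]
    g = (n + T) / (n * T) * np.log(min(n, T))
    best_k, best_ic = 1, np.inf
    for k in range(1, kmax + 1):
        # S_k: average squared residual after removing k principal
        # components = sum of the discarded eigenvalues divided by n
        S_k = eigval[k:].sum() / n
        ic = np.log(S_k) + k * g
        if ic < best_ic:
            best_k, best_ic = k, ic
    return best_k

rng = np.random.default_rng(3)
F = rng.normal(size=(120, 3))
L = rng.normal(size=(40, 3)) * 2.0        # 3 strong common factors
X = F @ L.T + rng.normal(size=(120, 40))
X = (X - X.mean(0)) / X.std(0)
print(ic_number_of_factors(X, kmax=6))    # selects 3
```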
3.3 Lasso

Another solution to the dimensionality problem in forecasting is the Least Absolute Shrinkage and Selection Operator (Lasso), proposed by Tibshirani (1996). The Lasso is a regularized version of least squares which adds the constraint that the L1 norm of the coefficient vector, ||β||₁, is not larger than a given value. Equivalently, the constrained problem can be written as an unconstrained one in its Lagrange form: the Lasso estimator is then the solution of the least-squares problem with the penalty λ||β||₁ added, where λ ≥ 0 is a tuning parameter. If λ is equal to 0, the Lasso coincides with OLS; as λ grows, more coefficients are shrunk exactly to 0, and hence more regressors are excluded from the model. Knight and Fu (2000) studied the asymptotic properties of Lasso-type estimators and showed that, under appropriate conditions, they are consistent for the regression coefficients. Moreover, Tibshirani (1996) demonstrated that the Lasso is more stable and accurate than traditional variable selection methods such as best subset selection.
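The shrinkage-then-selection behaviour just described can be illustrated with scikit-learn's `Lasso`. The data below are simulated (more regressors than observations, as in the paper's setting, but not the paper's dataset), and the penalty values are arbitrary.

```python
# A hedged illustration of Lasso variable selection: as the penalty
# alpha grows, more coefficients are set exactly to zero.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(4)
n_obs, n_reg = 120, 360                   # more regressors than observations
X = rng.normal(size=(n_obs, n_reg))
beta = np.zeros(n_reg)
beta[:4] = [1.5, -1.0, 0.8, 0.6]          # only 4 regressors truly matter
y = X @ beta + 0.3 * rng.normal(size=n_obs)

for alpha in (0.05, 0.5):
    fit = Lasso(alpha=alpha).fit(X, y)
    n_selected = int(np.sum(fit.coef_ != 0))
    print(f"alpha={alpha}: {n_selected} non-zero coefficients")
```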
Results

4.1 DFM Results
Following the recommendations of Bai (2004) to work with stationary data, we estimate the space spanned by the factors using the principal components approach (see Stock and Watson, 2002) applied to the stationary transformation of the variables. For estimating the factors we used the usual normalization criteria (see Bai and Ng, 2008, for a discussion) on standardized data (not including Spanish GDP). For deciding the number of factors we used the IC information criterion with the three penalty functions discussed in Bai and Ng (2008), with kmax set equal to 6. The three penalty functions lead to different numbers of factors (1, 4 and 5), so we consider these three possibilities in the forecasting exercise. In the three cases we consider the following model for producing the one-step-ahead forecast of the GDP growth rate: y_{t+h} = c + β(L) F̂_t + γ(L) y_t, where β(L) and γ(L) are lag polynomials.
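A diffusion-index forecast of this kind can be sketched in a few lines: extract a factor by principal components, then run an OLS regression of next-period growth on the factor and the lagged target. Everything below is simulated and the lag polynomials are truncated to a single lag, so this is a simplified sketch of the equation above, not the paper's estimation.

```python
# Sketch of the diffusion-index forecasting equation
# y_{t+1} = c + beta * F_t + gamma * y_t + error, with F estimated
# as the first principal component of the panel.
import numpy as np

rng = np.random.default_rng(5)
T, n = 140, 45
F = np.zeros(T)
for t in range(1, T):
    F[t] = 0.7 * F[t - 1] + rng.normal()   # persistent common factor
X = np.outer(F, rng.normal(size=n)) + rng.normal(size=(T, n))
y = 0.3 * F + 0.2 * rng.normal(size=T)     # target driven by the factor

# factor estimate: first principal component of the standardized panel
Z = (X - X.mean(0)) / X.std(0)
eigval, eigvec = np.linalg.eigh(Z.T @ Z / T)
F_hat = Z @ eigvec[:, -1]                  # largest eigenvalue comes last

# OLS of y_{t+1} on a constant, F_hat_t and y_t
R = np.column_stack([np.ones(T - 1), F_hat[:-1], y[:-1]])
coef, *_ = np.linalg.lstsq(R, y[1:], rcond=None)
forecast = coef @ np.array([1.0, F_hat[-1], y[-1]])
print(f"one-step-ahead forecast: {forecast:.3f}")
```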
4.2 Lasso Results
Given the above-mentioned advantages of the Lasso methodology, we also use it to forecast GDP growth. We consider 8 lags of both GDP and the other macroeconomic variables as covariates, summing to 360 regressors, more than twice the number of observations available. We normalize all variables to have mean 0 and variance 1, and hence we do not consider an intercept in the model.
A crucial step to enjoy the nice properties of the Lasso estimator is to choose the tuning parameter λ optimally. We follow two approaches: first, we set it arbitrarily to the value 0.5; alternatively, we use cross-validation, which sets λ to approximately 0.1.

[Table 2. Forecast accuracy of the models: root mean squared error, mean absolute error, mean absolute percent error, Theil inequality coefficient, and bias, variance and covariance proportions.]
It is important to notice that, regardless of the choice of the tuning parameter, only first-lag variables are selected. With the ad-hoc value of λ we select only two variables (in first lag): total employment (National Accounts basis) and private final consumption expenditure. As expected, once we reduce the penalty the number of selected variables increases: on top of the aforementioned variables, the export market for goods and services (volume, USD, 2005 prices) and the GDP deflator enter the model. All the estimates have the expected signs: higher employment, inflation, exports and consumption lead to higher GDP.

An interesting feature is that the Lasso constrains the lags of GDP to zero. We believe that including this variable would improve the forecast accuracy of the model, and hence we introduce the extra restriction that the first lag of GDP must be different from zero.
4.3 Forecast Evaluation
In this subsection we discuss the forecasting power of the aforementioned models. For comparing the models we consider one-step-ahead forecast errors for the period 2003.1-2012.4. Note that in this exercise we are producing true out-of-sample forecasts, given that the models are estimated using data up to 2002.4. Forecast accuracy is measured through the root mean squared error and the mean absolute error, as is common practice in the forecasting literature. Additionally, to be able to compare these measures statistically, we consider the Diebold-Mariano test.

Table 3 shows the sign of the difference between the mean squared errors across the different models; the asterisks indicate statistical significance. For reading the table, (+) means that the model in the row has a higher mean squared error than the one in the corresponding column.
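A basic version of the Diebold-Mariano statistic can be sketched as below. This simplified illustration uses the plain sample variance of the loss differential rather than the HAC (Newey-West) variance usually applied in practice, and the forecast errors are simulated.

```python
# Minimal Diebold-Mariano sketch: loss differential of squared errors,
# standardized by its (non-HAC) sample variance.
import numpy as np

def diebold_mariano(e1: np.ndarray, e2: np.ndarray) -> float:
    """DM statistic for H0 of equal squared-error loss.
    Positive values mean model 1 has the larger errors."""
    d = e1 ** 2 - e2 ** 2                 # loss differential
    T = d.shape[0]
    return d.mean() / np.sqrt(d.var(ddof=1) / T)

rng = np.random.default_rng(6)
errors_a = rng.normal(scale=1.0, size=40)   # 2003.1-2012.4 -> 40 quarters
errors_b = rng.normal(scale=2.0, size=40)   # deliberately worse model
print(f"DM statistic: {diebold_mariano(errors_a, errors_b):.2f}")
```

Under the null the statistic is asymptotically standard normal, so values beyond roughly ±1.96 indicate a significant difference in forecast accuracy.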
[Table 3. Sign of the differences in mean squared errors across models: AR(4), Factor Model 1, Factor Model 4, Factor Model 5, LASSO 2, LASSO 4; asterisks denote Diebold-Mariano significance.]
Factor Model 1 forecasts; 2013q1: 0.09; 2013q2: -0.22; 2013q3: -0.10; 2013q4: -0.01; 2013: -0.24
From Tables 2 and 3 we conclude that the model with one factor and the Lasso model with 4 variables are the best options when compared with the other models. These are the models with the smallest root mean squared errors, and from the Diebold-Mariano tests we conclude that these differences are statistically significant (although we cannot reject that Factor Model 1 and LASSO 4 have the same forecasting power). Additionally, the combination of forecasts seems to be the best option overall.
4.4 2013 Forecasts
We now consider the forecasts of GDP growth produced by our models. The four-periods-ahead forecast with the Lasso model presents an additional difficulty, since we need to forecast the explanatory variables. In order to do this without losing the rich information contained in the dataset, we do it in a Factor-Augmented VAR (see, inter alia, Bernanke, Boivin and Eliasz, 2005). As discussed by Banerjee and Marcellino (2009), the FAVAR approach has the drawback of not considering the equilibrium correction term. Unfortunately, due to the time constraint we were not able to implement the FVEC model.
In the next figures and tables we study the forecasts of GDP growth. Overall, all models forecast growth very close to zero.

[Figure 4.1. Point forecasts of GDP growth for 2013q1-2013q4, all models.]
Figure 4.2. 95% confidence intervals of the forecasts of GDP growth, all models.

Model; limit; 2013q1; 2013q2; 2013q3; 2013q4; 2013
Factor Model 1; lower; 0.081; -0.232; -0.115; -0.023; -0.251
Factor Model 1; upper; 0.107; -0.207; -0.085; 0.009; -0.223
LASSO 4; lower; 0.271; 0.059; 0.183; 0.130; 0.865
LASSO 4; upper; 0.294; 0.083; 0.208; 0.155; 0.889
Combination; lower; 0.094; -0.126; -0.002; 0.074; 0.068
Combination; upper; -0.285; -0.494; -0.299; -0.196; -0.261
Conclusions
Forecasting the real GDP growth rate becomes even more important in times of crisis, when governments need to choose public interventions with much more care in order to restore the macroeconomic equilibrium. For instance, economies in Southern Europe are currently experiencing a historical peak in debt-to-GDP ratios. Hence, public measures have to be properly implemented by taking into account the growth perspective of the countries.
In this report we have analyzed some of the possible models to forecast GDP growth for Spain. We provide some theoretical background on the different specifications and our own forecasts for 2013.

Our forecast results for the GDP growth rate in 2013 are very close to zero, suggesting an economy still struggling to recover from the crisis. Even in the absence of growth, one should find it comforting that the models don't forecast any further recession. Regarding the fiscal crisis, this result supports the arguments for a smoother adjustment that can also be found in the IMF's country report for Spain of July 2012: "The deficit path envisaged in the SGP should be less front-loaded, in agreement with European partners. The medium-term targets are broadly appropriate, but a smoother path would be more desirable during a period of extreme weakness, when multipliers are likely to be particularly large and the tax base soft, to reduce the risk of creating a negative feedback loop with growth and NPLs, which may also undermine market confidence, especially if targets are missed. Such a smoother path should also be embedded in a prudent macroeconomic framework."
From a methodological point of view, we find that using high-dimensional models is important. They allow a more efficient use of all the information contained in large datasets, and this is reflected in significantly more accurate forecasts. Moreover, these results are robust to testing for data snooping.
References

[1] Jushan Bai. Estimating Cross-Section Common Stochastic Trends in Nonstationary Panel Data. Journal of Econometrics, 122(1), 2004.
[2] Jushan Bai and Serena Ng. A PANIC Attack on Unit Roots and Cointegration. Econometrica, 72(4), 2004.
[3] Jushan Bai and Serena Ng. Large Dimensional Factor Analysis. Foundations and Trends in Econometrics, 3(2), 2008.
[4] Anindya Banerjee and Massimo Marcellino. Factor-augmented Error Correction Models. CEPR Discussion Paper, 2009.
[5] Mimeo, 2009.
[6] Jean Boivin and Serena Ng. Are More Data Always Better for Factor Analysis? Journal of Econometrics, 132(1), 2006.
[7] Keith Knight and Wenjiang Fu. Asymptotics for Lasso-type Estimators. Annals of Statistics, 28(5), 2000.
[8] James H. Stock and Mark W. Watson. Forecasting with Many Predictors. In Handbook of Economic Forecasting, 2006.
[9] James H. Stock and Mark W. Watson. Macroeconomic Forecasting Using Diffusion Indexes. Journal of Business and Economic Statistics, 20(2), 2002.
[10] James H. Stock and Mark W. Watson. The Evolution of National and Regional Factors in US Housing Construction. Mimeo, 2008.
[11] James H. Stock and Mark W. Watson. Dynamic Factor Models. In Oxford Handbook of Economic Forecasting, 2011.
[12] Robert Tibshirani. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society, Series B, 58(1), 1996.