Multivariate State Space Plus VAR

Multivariate State Space Models
Siem Jan Koopman

http://staff.feweb.vu.nl/koopman
Department of Econometrics
VU University Amsterdam
Tinbergen Institute
2010
Multivariate State Space Models – p. 1

Multivariate local level model
Seemingly Unrelated Time Series Equations model:
yt = µt + εt , εt ∼ N ID(0, Σε ),
µt+1 = µt + ηt , ηt ∼ N ID(0, Ση ).
• Observations are p × 1 vectors;

• The disturbances εt , ηs are independent for all s, t;
• The p different time series are related through correlations in the
disturbances.
For a full discussion, see Harvey and Koopman (1997).
A pdf version (scanned) at http://staff.feweb.vu.nl/koopman
under section “Publications” and subsection “Published articles as
contributions to books”.

Multivariate LL Model
The multivariate LL model is given by
yt = µt + εt , εt ∼ N ID(0, Σε ),
µt+1 = µt + ηt , ηt ∼ N ID(0, Ση ).
• First difference
∆yt = ηt−1 + ∆εt
is stationary;
• Reduced form: ∆yt is VMA(1) or VAR(∞);

Multivariate LL Model
• Stochastic properties are multivariate analogues of the univariate
case:
Γ0 = E(∆yt ∆yt′ ) = Ση + 2Σε ,
Γ1 = E(∆yt ∆y′t−1 ) = −Σε ,
Γτ = E(∆yt ∆y′t−τ ) = 0, τ ≥ 2;
• The unrestricted vector MA(1) process has p² + p(p + 1)/2
parameters, the SUTSE has p × (p + 1);
• Such multivariate reduced form representations can also be
established for general models.
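
The autocovariances of ∆yt stated for the multivariate LL model can be checked by simulation. The following is a minimal sketch, assuming NumPy; the function names and the covariance values are illustrative choices and do not come from the slides.

```python
import numpy as np

def simulate_local_level(n, sigma_eps, sigma_eta, seed=0):
    """Simulate y_t = mu_t + eps_t, mu_{t+1} = mu_t + eta_t, with mu_1 = 0."""
    rng = np.random.default_rng(seed)
    p = sigma_eps.shape[0]
    eps = rng.multivariate_normal(np.zeros(p), sigma_eps, size=n)
    eta = rng.multivariate_normal(np.zeros(p), sigma_eta, size=n)
    # mu_1 = 0 and mu_t = eta_1 + ... + eta_{t-1} for t > 1
    mu = np.vstack([np.zeros((1, p)), np.cumsum(eta[:-1], axis=0)])
    return mu + eps

def sample_autocov(x, lag):
    """Sample estimate of E(x_t x_{t-lag}') for a zero-mean series."""
    n = x.shape[0]
    return x[lag:].T @ x[:n - lag] / n

# Illustrative (hypothetical) covariance matrices for p = 2
sigma_eps = np.array([[1.0, 0.5], [0.5, 2.0]])
sigma_eta = np.array([[0.5, 0.2], [0.2, 0.3]])

y = simulate_local_level(200_000, sigma_eps, sigma_eta)
dy = np.diff(y, axis=0)

gamma0 = sample_autocov(dy, 0)  # compare with sigma_eta + 2 * sigma_eps
gamma1 = sample_autocov(dy, 1)  # compare with -sigma_eps
gamma2 = sample_autocov(dy, 2)  # compare with the zero matrix
```

With a long simulated sample the three estimates should lie close to Ση + 2Σε, −Σε and 0 respectively, up to sampling error.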

Homogeneous Multivariate LL Model
The homogeneous multivariate LL model is given by
yt = µt + εt , εt ∼ N ID(0, Σε ),
µt+1 = µt + ηt , ηt ∼ N ID(0, qΣε ),
where q is a non-negative scalar. This implies that Ση = qΣε .

• The model is restricted, all series in yt have the same dynamic
properties (the same acf).
• Not so relevant in practical work apart from forecasting. It is the
model representation for exponentially weighted moving average
(EWMA) forecasting of multiple time series.
• This can be generalised to more general components models.
• Easy to estimate: only a set of univariate Kalman filters is
required.
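
The link between the homogeneous model and EWMA forecasting can be illustrated numerically. The sketch below, assuming NumPy, iterates the Riccati recursion of the multivariate LL model with Ση = qΣε and compares the resulting gain matrix with the scalar steady-state gain of the univariate local level model. The function names and the values of q and Σε are hypothetical.

```python
import numpy as np

def ewma_weight(q):
    """Steady-state Kalman gain of the univariate local level model
    with signal-to-noise ratio q = var(eta) / var(eps)."""
    p_bar = 0.5 * (q + np.sqrt(q * q + 4.0 * q))  # solves p^2 - q p - q = 0
    return p_bar / (p_bar + 1.0)

def steady_state_gain(sigma_eps, sigma_eta, n_iter=500):
    """Iterate the Riccati recursion of the multivariate local level model."""
    P = np.eye(sigma_eps.shape[0])      # arbitrary positive definite start
    for _ in range(n_iter):
        F = P + sigma_eps               # prediction error variance
        K = P @ np.linalg.inv(F)        # Kalman gain (transition and loading are I)
        P = P - K @ P + sigma_eta       # state prediction variance for next period
    return K

q = 0.5                                  # illustrative value
sigma_eps = np.array([[1.0, 0.5], [0.5, 2.0]])
K = steady_state_gain(sigma_eps, q * sigma_eps)
k = ewma_weight(q)
```

In this sketch K converges to k times the identity matrix, so the one-step forecast recursion a(t+1) = a(t) + K(yt − a(t)) becomes the same EWMA scheme, with weight k, for every series.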

Common Levels
The common local level model is given by
yt = µt + εt , εt ∼ N ID(0, Σε ),
µt+1 = µt + ηt , ηt ∼ N ID(0, Ση ),
where rank(Ση ) = r < p.

• the model can be described by r underlying level components, the
common levels;
• Ση = AΣc A′ , where A is p × r and Σc is r × r of full rank;
• interpretation of A: factor loading matrix.

Common Levels
The common local level model
yt = µt + εt , εt ∼ N ID(0, Σε ),
µt+1 = µt + ηt , ηt ∼ N ID(0, AΣc A′ ),
can be rewritten in terms of underlying levels:
yt = a + Aµct + εt ,
µct+1 = µct + ηtc , ηtc ∼ N ID(0, Σc ),
so that
µt = a + Aµct , ηt = Aηtc .

Common Levels
For the common local level model
yt = µt + εt , εt ∼ N ID(0, Σε ),
µt+1 = µt + ηt , ηt ∼ N ID(0, AΣc A′ ),
notice that
• decomposition Ση = AΣc A′ is not unique;
• identification restrictions: Σc is diagonal, Choleski decomposition,
principal components (based on economic theory);
• more interesting interpretation can be obtained by factor rotations;
• can be interpreted as dynamic factor analysis, see later.
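
The non-uniqueness of the decomposition Ση = AΣc A′ can be made concrete with a small numerical sketch, assuming NumPy. The matrices A, Σc and the rotation M below are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical loading matrix (p = 3, r = 2) and diagonal common-level variance
A = np.array([[1.0, 0.0],
              [0.5, 1.0],
              [0.3, -0.2]])
sigma_c = np.diag([2.0, 0.5])
sigma_eta = A @ sigma_c @ A.T          # p x p matrix of rank r < p

# Any nonsingular r x r matrix M gives an observationally equivalent pair
M = np.array([[1.0, 0.4],
              [0.0, 1.0]])
A_alt = A @ np.linalg.inv(M)
sigma_c_alt = M @ sigma_c @ M.T

rank = np.linalg.matrix_rank(sigma_eta, tol=1e-10)
same = np.allclose(A_alt @ sigma_c_alt @ A_alt.T, sigma_eta)
```

Both pairs (A, Σc) and (A_alt, Σc_alt) imply the same Ση, which is why restrictions such as a diagonal Σc or a triangular structure in A are needed for identification.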

Common components
Common dynamic factors:

• are useful for interpretation → cointegration;
• have consequences for inference and forecasting (the dimension of
the parameter space is reduced as a result);
• the common local level model can generally be represented as a
VAR(∞) or VECM model; details can be provided upon request.

Multivariate components
• So far, we have concentrated on multivariate variants of the local

level model;
• Similar considerations can be applied to other components such
as the slope of the trend, seasonal and cycle components and
other time-varying features in the multiple time series.
• Harvey and Koopman (1997) review such extensions.
• In particular, they define the similar cycle component, see
Exercises.

Common and idiosyncratic factors
Multiple trends can also be decomposed into one common factor and
multiple idiosyncratic factors:
yt = µt + εt , εt ∼ N ID(0, Σε ),
µt+1 = µt + ηt , ηt ∼ N ID(0, Ση ),
where Ση = δδ′ + Dη with vector δ and diagonal matrix Dη . This
implies that the level can be represented by
µt = δµct + µ∗t , ηt = δηtc + ηt∗ ,
with common level (scalar) µct and "independent" levels µ∗it generated by
∆µct+1 = ηtc ∼ N ID(0, 1), ∆µ∗t+1 = ηt∗ ∼ N ID(0, Dη ).

Multivariate Kalman filter
The Kalman filter is valid for the general multivariate state space
model.
Computationally it is not convenient when p becomes very large.
Each step of the Kalman filter requires the inversion of the p × p
matrix Ft . This is no problem when p = 1 (univariate) but when
p > 20, say, it will slow down the Kalman filter considerably.
However, we can treat each element in the p × 1 observation vector yt
as a single realisation. In other words, we can "update" each
single element of yt within the Kalman filter.
The arguments are given in DK book §6.4.
The same applies to smoothing.

Univariate treatment of Kalman filter
• Consider the standard model yt = Zt αt + εt and αt+1 = Tt αt + Rt ηt ,
where Var(εt ) = Ht is diagonal.
• The observation vector yt = (yt,1 , . . . , yt,pt )′ is treated element by
element: we view the observation model as a set of pt separate equations.
• We then have yt,i = Zt,i αt,i + εt,i with αt,i = αt for i = 1, . . . , pt .
• The associated transition equations become αt,i+1 = αt,i for
i = 1, . . . , pt − 1 and αt+1,1 = Tt αt,pt + Rt ηt for t = 1, . . . , n.
• This disentangled model can be treated by the Kalman filter and
smoother equations straightforwardly.
• Innovations are now relative to the past and the “previous”
observations inside yt !
• A non-diagonal matrix Ht can be treated by data transformation or
by including εt in the state vector αt .
• More details in DK book §6.4.
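
The element-by-element treatment can be sketched for a single measurement update. The code below, assuming NumPy, compares the usual multivariate update (which inverts the p × p matrix Ft) with p scalar updates when Ht is diagonal. The function names, the dimensions and the random inputs are hypothetical; the sketch covers only the updating step within one time period, after which the usual prediction step with Tt follows.

```python
import numpy as np

def update_multivariate(a, P, y, Z, H):
    """Standard measurement update: requires the inverse of the p x p matrix F."""
    v = y - Z @ a
    F = Z @ P @ Z.T + H
    K = P @ Z.T @ np.linalg.inv(F)
    return a + K @ v, P - K @ Z @ P

def update_univariate(a, P, y, Z, h_diag):
    """Process the elements of y one at a time; only scalar divisions are needed."""
    a, P = a.copy(), P.copy()
    for i in range(y.size):
        z = Z[i]                      # i-th row of Z
        v = y[i] - z @ a              # innovation given past and previous elements
        Pz = P @ z
        f = z @ Pz + h_diag[i]        # scalar innovation variance
        a = a + Pz * (v / f)
        P = P - np.outer(Pz, Pz) / f
    return a, P

rng = np.random.default_rng(1)
p, m = 6, 2                           # hypothetical dimensions
Z = rng.normal(size=(p, m))
h_diag = rng.uniform(0.5, 2.0, size=p)
a0, P0 = np.zeros(m), np.eye(m)
y = rng.normal(size=p)

a_mv, P_mv = update_multivariate(a0, P0, y, Z, np.diag(h_diag))
a_uv, P_uv = update_univariate(a0, P0, y, Z, h_diag)
```

Both routes give the same updated state mean and variance; the univariate route avoids the matrix inversion, which is where the computational gain for large p comes from.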

VAR representation of multivariate LL model
Consider the common local level (CLL) model as given by
yt = µt + εt , εt ∼ N ID(0, Σε ),
µt+1 = µt + ηt , ηt ∼ N ID(0, Ση ),
where rank(Ση ) = r < p.

Rewrite CLL model as yt = a + Aµct + εt and Ση = AΣc A′ .
The relevant Kalman filter (KF) equations are
vt = yt − ct − Zt at , at+1 = Tt at + Kt vt , t = 1, . . . , n.
In case of CLL model: ct = a, Zt = A and µ̃ct|t−1 ≡ at .
The model can therefore be written in innovation form:
yt = a + Aµ̃ct|t−1 + vt , vt ∼ N ID(0, F ),
where µ̃ct|t−1 and vt are obtained from KF applied to the CLL model.

VAR(∞) Representation
The Kalman filter (in steady state) is
\[
\begin{aligned}
\tilde\mu^c_{t+1|t} &= \tilde\mu^c_{t|t-1} + K v_t \\
&= (I - KA)\,\tilde\mu^c_{t|t-1} + K(y_t - a) \\
&= [I - (I - KA)L]^{-1} K y_t - (KA)^{-1} K a,
\end{aligned}
\]
where L is the lag operator and K is the Kalman gain matrix.

This leads to the alternative innovation form
\[
y_t = [I - A(KA)^{-1}K]\,a + A[I - (I - KA)L]^{-1} K L\, y_t + v_t,
\]
which is the VAR(∞) representation:
\[
\Phi(L)\, y_t = \Phi(1)\, a + v_t, \qquad
\Phi(L) = I - A[I - (I - KA)L]^{-1} K L.
\]

VECM Representation
The VECM is based on the decomposition
Φk (L) = Φ(1)L + Φ∗k−1 (L)∆,
where the Φ∗j coefficients are functions of the Φk (L) coefficients.

This can be applied to the VAR(∞) representation of the common LL
model:
Φ∗ (L)∆yt = m + BC′ yt−1 + εt ,
with singular matrix
Φ(1) = BC′ = I − A(KA)⁻¹K,
where B and C are p × (p − r) matrices and r is the rank of Ση .

Note that Φ(1)A = 0 and KΦ(1) = 0.
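
The properties of Φ(1) can be verified numerically. The sketch below, assuming NumPy, obtains a steady-state gain K for a common local level model by iterating the Riccati recursion, forms Φ(1) = I − A(KA)⁻¹K and sums the VAR coefficients A(I − KA)^(j−1)K implied by Φ(L). The function name and all numerical values are hypothetical.

```python
import numpy as np

def steady_state_gain(A, sigma_eps, sigma_c, n_iter=2000):
    """Riccati iteration for y_t = a + A mu_t + eps_t, mu_{t+1} = mu_t + eta_t."""
    P = np.eye(A.shape[1])                 # arbitrary positive definite start
    for _ in range(n_iter):
        F = A @ P @ A.T + sigma_eps        # innovation variance
        K = P @ A.T @ np.linalg.inv(F)     # r x p gain (transition matrix is I)
        P = P - K @ A @ P + sigma_c
    return K

# Hypothetical model with p = 3 series and r = 2 common levels
A = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.4]])
sigma_eps = np.diag([1.0, 0.5, 2.0])
sigma_c = np.array([[0.5, 0.1], [0.1, 0.3]])

K = steady_state_gain(A, sigma_eps, sigma_c)
p, r = A.shape
Phi1 = np.eye(p) - A @ np.linalg.inv(K @ A) @ K

# Truncated sum of the VAR coefficients A (I - KA)^(j-1) K, j = 1, 2, ...
M = np.eye(r) - K @ A
coef_sum = np.zeros((p, p))
power = np.eye(r)
for _ in range(500):
    coef_sum += A @ power @ K
    power = power @ M

rank_Phi1 = np.linalg.matrix_rank(Phi1, tol=1e-8)
```

In this example Φ(1)A and KΦ(1) are zero up to rounding error, I minus the coefficient sum reproduces Φ(1), and Φ(1) has rank p − r = 1. The two identities hold algebraically for any K for which KA is invertible.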

VAR(∞) Representation
• The VAR(∞) representation of the common LL model is
consistent with a VECM of a cointegrating system;
• Short-term dynamics are modelled differently, that is, through the
polynomial Φ∗ (L);
• Computations of VAR(∞) and VECM coefficients: Koopman &
Harvey (JEDC, 2003).

Illustration: US Monthly Housing Starts and Sold
Reinsel (1996) adopts a bivariate cointegrated system for US monthly
housing starts and housing sold (SA) for the period 1965 – 1974.
He considers the VARMA model
(I − ΦL)∆₁₂ yt = (I + ΘL¹²)ut ,
where Θ is estimated as −I: seasonality is treated as deterministic.

As VAR(1):
(I − Φ∗ L)yt = m + γt + ut ,
with deterministic seasonal γt .
We use PcGive to obtain the VECM representation
∆yt = m + γt + (−0.524, 0.141)′ (1.00, −1.873)yt−1 + ut ,
where we used Johansen’s cointegration test to conclude that matrix
I − Φ has a rank close to 1.

US Monthly Housing Starts and Sold: PcGive output
----PcGive 10.0b session----
eigenvalue    loglik for rank
              -355.032    0
0.425082      -322.097    1
0.0182852     -320.999    2

Ho:rank=p    -Tlog(1-\mu)    using T-nm    95%
p == 0       65.87**         64.76**       14.1
p <= 1       2.196           2.159         3.8

standardized \beta’ eigenvectors
Hstarts      Hsold
1.0000       -1.8730
0.52946      1.0000

US Monthly Housing Starts and Sold: PcGive output
standardized \alpha coefficients
Hstarts    -0.52419    -0.027212
Hsold      0.14073     -0.023996

long-run matrix Po=\alpha*\beta’, rank 2
           Hstarts     Hsold
Hstarts    -0.53860    0.95459
Hsold      0.12803     -0.28758

US Monthly Housing Starts and Sold: UC Model
In STAMP we estimate the bivariate UC model:
yt = µt + γt + εt ,
with µt as RW and γt as seasonal component.

Estimation shows that seasonal component is fixed and a common
level exists:
µt = a + Aµct , µct+1 = µct + ηtc .
Next we estimate common level model:
yt = (0.000, 2.581)′ + (1.00, 0.535)′ µct + γt + εt ,
with var(ηtc ) estimated as 4.518.

US Monthly Housing Starts and Sold: STAMP output
----Stamp 6.20 session----
Summary statistics
             Hstarts    Hsold
Normality    0.41928    3.4801
Q( 9, 7)     9.1474     5.5896
Rs^2         0.18158    0.14433

Eq 1 : Estimated covariance matrices. (Lvl rank = two)
(upper-triangular = correlations)
Irr disturbance    11.095     -0.70665
                   -3.8864    2.7263
Lvl disturbance    19.388     0.96901
                   10.766     6.3667
Sea disturbance    0.00000    0.00000
                   0.00000    0.00000

US Monthly Housing Starts and Sold: STAMP output
Eq 2 : Estimated covariance matrices. (Lvl rank = one)
(upper-triangular = correlations)
Irr disturbance    12.778     -0.75708
                   -5.1727    3.6533
Lvl disturbance    20.411     1.0000
                   10.923     5.8456

Eq 2 : Diagonal and Load matrices.
Diag matrix Lvl    4.5178
Load matrix Lvl    1.0000     0.5352
Constant Lvl       0.0000     2.5807

Comparing the results of PcGive (VECM analysis) and STAMP (UC
analysis), we observe that
C = (1.00, −1.873)′ , A = (1.00, 0.535)′ → C′A ≈ 0.
VECM and UC results are consistent with each other.
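
The arithmetic behind this comparison can be reproduced directly from the reported numbers. The short sketch below, assuming NumPy, multiplies the standardized α and β′ matrices from the PcGive output and evaluates C′A with the cointegrating vector and the STAMP loading vector; only the variable names are introduced here.

```python
import numpy as np

# Standardized alpha (columns) and beta' (rows) as reported by PcGive
alpha = np.array([[-0.52419, -0.027212],
                  [0.14073, -0.023996]])
beta_t = np.array([[1.0000, -1.8730],
                   [0.52946, 1.0000]])
long_run = alpha @ beta_t          # compare with the reported long-run matrix Po

# Cointegrating vector C (PcGive) and factor loading vector A (STAMP)
C = np.array([1.00, -1.873])
A = np.array([1.00, 0.535])
c_dot_a = C @ A                    # close to zero when VECM and UC results agree
```

The product reproduces the reported long-run matrix to the printed precision, and C′A is about −0.002.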

Illustration: the Euro business cycle
[Figure: time series plot, 1985 to 2000; vertical axis from 13.90 to 14.25.]

Topics in Business Cycle Analysis
• dating of business cycles (Markov-switching models)
• principal components analysis
(Stock and Watson, Forni, Hallin, Lippi and Reichlin)
• convergence and synchronisation
(economic theory, empirical studies)
• asymmetry and nonlinearities (econometrics)
• coincident and leading indicators (economics)
In this illustration the aim is to detect the business cycle. Available approaches:

• Detrending methods (Hodrick-Prescott);
• Bandpass filtering methods (Baxter-King, Christiano-Fitzgerald);
• Model-based, univariate (Beveridge-Nelson, Clark,
Harvey-Jaeger);
• Model-based, multivariate, common cycles (VAR model, UC
model).

Different Univariate Trend-Cycle Decompositions
[Figure: trend (left panels) and cycle (right panels) estimates, 1985 to 2000, for three decompositions: HP, STAMP and AKR.]

Univariate UC Trend-Cycle Decomposition
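yt = µt + ψt + εt
• Trend µt : ∆ᵈ µt = ηt ;
• Irregular εt : White Noise;
• Cycle ψt : AR(2) with complex roots as in Clark (1987) or with
stochastic trigonometric functions as in Harvey (1985, 1989).
Trigonometric specification:
\[
\begin{pmatrix} \psi_{t+1} \\ \psi^{+}_{t+1} \end{pmatrix}
= \phi
\begin{bmatrix} \cos\lambda & \sin\lambda \\ -\sin\lambda & \cos\lambda \end{bmatrix}
\begin{pmatrix} \psi_{t} \\ \psi^{+}_{t} \end{pmatrix}
+ \begin{pmatrix} \kappa_{t} \\ \kappa^{+}_{t} \end{pmatrix},
\qquad \kappa_t, \kappa^{+}_t \sim NID(0, \sigma^2_\kappa).
\]
Signal extraction is about (locally) weighting observations. The Kalman
filter gives the optimal weights for the given model.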
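
The trigonometric cycle specification can be sketched in a few lines, assuming NumPy. The eigenvalues of the transition matrix are φ·exp(±iλ), so φ acts as the damping factor and 2π/λ as the period. The function names and the values φ = 0.95 and σκ = 0.001 are hypothetical; λ = 0.06545 is the monthly frequency used in the Euro area illustration in these slides.

```python
import numpy as np

def cycle_transition(phi, lam):
    """Transition matrix of the stochastic trigonometric cycle."""
    return phi * np.array([[np.cos(lam), np.sin(lam)],
                           [-np.sin(lam), np.cos(lam)]])

def simulate_cycle(n, phi, lam, sigma_kappa, seed=0):
    """Simulate psi_t from the two-dimensional cycle recursion, starting at zero."""
    rng = np.random.default_rng(seed)
    T = cycle_transition(phi, lam)
    state = np.zeros(2)
    psi = np.empty(n)
    for t in range(n):
        psi[t] = state[0]
        state = T @ state + rng.normal(scale=sigma_kappa, size=2)
    return psi

phi, lam = 0.95, 0.06545
eig = np.linalg.eigvals(cycle_transition(phi, lam))
period = 2 * np.pi / lam           # about 96 periods (months)
psi = simulate_cycle(300, phi, lam, sigma_kappa=0.001)
```

The complex eigenvalue pair is what makes ψt an ARMA(2,1) process with complex autoregressive roots, in line with the AR(2) formulation mentioned above.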

Weights and Gain Functions of Components
[Figure: six panels. Top row: time series plots, 1985 to 2000. Middle row: weights plotted against lags −20 to 20. Bottom row: gain functions plotted against frequency from 0 to 1.]

Band-pass Properties
"Band-pass" refers to frequency domain properties of polynomial lag
functions of time series (filters).
In business cycle analysis, one is interested in filters for trend and
cycle such that the trend captures only the low frequencies, the cycle
the mid frequencies and the irregular the high frequencies.
[Figure: three gain functions, labelled TREND, CYCLE and IRREGULAR, plotted against frequency from 0 to 1.]

Butterworth Filters for Trend
Butterworth trend filters can be considered; they have a model-based
representation and can be put in the state space framework; see Gomez
(JBES, 2001).
The m-th order stochastic trend is \(\mu_t = \mu^{(m)}_t\) where
\[
\Delta^m \mu^{(m)}_{t+1} = \zeta_t, \qquad \zeta_t \sim NID(0, \sigma^2_\zeta),
\]
or
\[
\mu^{(j)}_{t+1} = \mu^{(j)}_t + \mu^{(j-1)}_t, \qquad j = m, m-1, \ldots, 1,
\]
with \(\mu^{(0)}_t = \zeta_t\).
For m = 2 we have the IRW with \(\beta_t = \mu^{(1)}_t\).
A higher value of m gives a low-pass gain function with a sharper
cut-off at a given low frequency.
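
The effect of the order m on the sharpness of the cut-off can be sketched with the gain function of the doubly infinite trend filter. The code below assumes NumPy and a simplified model without a cycle, yt = µt + εt with ∆ᵐµt = ζt, for which the gain is 1 / (1 + (2 − 2 cos ω)ᵐ / q) with q = σζ² / σε². The function names and the cut-off frequency 2π/96 are hypothetical choices.

```python
import numpy as np

def trend_gain(omega, m, q):
    """Gain of the doubly infinite trend filter for y_t = mu_t + eps_t,
    Delta^m mu_t = zeta_t, with q = var(zeta) / var(eps)."""
    return 1.0 / (1.0 + (2.0 - 2.0 * np.cos(omega)) ** m / q)

def q_for_cutoff(omega_c, m):
    """Signal-to-noise ratio that places the 50% gain at frequency omega_c."""
    return (2.0 - 2.0 * np.cos(omega_c)) ** m

omega_c = 2 * np.pi / 96                  # illustrative cut-off frequency
omega = np.linspace(0.0, np.pi, 513)
gains = {m: trend_gain(omega, m, q_for_cutoff(omega_c, m)) for m in (1, 2, 4)}
```

With the cut-off held fixed, the gain for larger m stays closer to one below the cut-off and drops faster above it, which is the sharper cut-off referred to above.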

Generalised Cycle
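The same principles can be applied to the cycle. The generalised kth order
cycle is given by \(\psi_t = \psi^{(k)}_t\), where
\[
\begin{pmatrix} \psi^{(j)}_{t+1} \\ \psi^{+(j)}_{t+1} \end{pmatrix}
= \phi
\begin{bmatrix} \cos\lambda & \sin\lambda \\ -\sin\lambda & \cos\lambda \end{bmatrix}
\begin{pmatrix} \psi^{(j)}_{t} \\ \psi^{+(j)}_{t} \end{pmatrix}
+ \begin{pmatrix} \psi^{(j-1)}_{t} \\ \psi^{+(j-1)}_{t} \end{pmatrix},
\qquad j = 1, \ldots, k,
\]
with
\[
\begin{pmatrix} \psi^{(0)}_{t} \\ \psi^{+(0)}_{t} \end{pmatrix}
= \begin{pmatrix} \kappa_{t} \\ \kappa^{+}_{t} \end{pmatrix}.
\]
Higher orders ensure smoother transitions. Further details:
Harvey & Trimbur (REStat, 2003).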

Weights and Gain Functions of Components
[Figure: six panels. Top row: time series plots, 1985 to 2000. Middle row: weights plotted against lags −20 to 20. Bottom row: gain functions plotted against frequency from 0 to 1.]

Measuring a common cycle from a multiple time series
• Analysis is based on a multivariate model
• The data set includes series that lead or lag GDP
• We prefer not to choose leads & lags a-priori
• Common cycle will be allowed to shift for individual time series
using techniques developed by Rünstler (EctJ, 2004).
[Figure: estimated cycles, 1980 to 1995. Top panel: gdp (red) versus cons confidence (blue). Bottom panel: gdp (red) versus shifted cons confidence (blue).]

Shifted cycles
In the standard case, cycle ψt is generated by
\[
\begin{pmatrix} \psi_{t+1} \\ \psi^{+}_{t+1} \end{pmatrix}
= \phi
\begin{bmatrix} \cos\lambda & \sin\lambda \\ -\sin\lambda & \cos\lambda \end{bmatrix}
\begin{pmatrix} \psi_{t} \\ \psi^{+}_{t} \end{pmatrix}
+ \begin{pmatrix} \kappa_{t} \\ \kappa^{+}_{t} \end{pmatrix}
\]
• The cycle
cos(ξλ)ψt + sin(ξλ)ψt+ ,
is shifted ξ time periods to the right (when ξ > 0) or to the left
(when ξ < 0);
• Here, −½π < ξ0 λ < ½π (shift is w.r.t. ψt );
• More details in Rünstler (EctJ, 2004) for the idea of shifting cycles in
multivariate unobserved components time series models;
• Also some details in the paper of Koopman and Azevedo (2003), see
http://staff.feweb.vu.nl/koopman.
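
The mechanics of the shift can be illustrated in the noise-free, undamped special case (φ = 1 and no disturbances), where the recursion started at ψ0 = 1, ψ0+ = 0 gives ψt = cos(λt) and ψt+ = −sin(λt). The sketch below, assuming NumPy, evaluates the combination cos(ξλ)ψt + sin(ξλ)ψt+ for a hypothetical shift ξ = 5 and compares it with cos(λ(t + ξ)); the variable names are illustrative.

```python
import numpy as np

lam = 0.06545          # cycle frequency used in the Euro area illustration
xi = 5.0               # hypothetical shift, so that |xi * lam| < pi / 2
t = np.arange(200)

# Noise-free, undamped recursion (phi = 1, kappa = 0)
rotation = np.array([[np.cos(lam), np.sin(lam)],
                     [-np.sin(lam), np.cos(lam)]])
state = np.array([1.0, 0.0])       # psi_0 = 1, psi^+_0 = 0
psi = np.empty(t.size)
psi_plus = np.empty(t.size)
for i in t:
    psi[i], psi_plus[i] = state
    state = rotation @ state

shifted = np.cos(xi * lam) * psi + np.sin(xi * lam) * psi_plus
```

In this special case the combination equals cos(λ(t + ξ)), that is, the same cycle with its phase displaced by ξ periods, and ξ does not need to be an integer.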

The basic multivariate model
Panel of N economic time series, yit , with basic model:
\[
y_{it} = \mu_{it} + \delta_i \psi_t + \varepsilon_{it}, \qquad
\varepsilon_{it} \sim NID(0, \sigma^2_{i,\varepsilon}).
\]
The time series have mixed frequencies (quarterly and monthly).

The final model is with shifted and generalised cycles:
\[
y_{it} = \mu^{(k)}_{it}
+ \delta_i \left\{ \cos(\xi_i\lambda)\,\psi^{(m)}_t + \sin(\xi_i\lambda)\,\psi^{+(m)}_t \right\}
+ \varepsilon_{it}.
\]
Thus we have
• generalised individual trend \(\mu^{(k)}_{it}\),
• generalised common cycle \(\psi^{(m)}_t\) with possible shifts;
• irregular εit .

Business cycle
Stock and Watson (1999):
“ . . . fluctuations in aggregate output are at the core of the

business cycle so the cyclical component of real GDP is a
useful proxy for the overall business cycle . . . ”
We therefore impose a unit common cycle loading and zero phase shift
on real GDP of the Euro area.
Time series 1986 – 2002:
* quarterly GDP
* industrial production
* unemployment (countercyclical, lagging)
* industrial confidence
* construction confidence
* retail trade confidence
* consumer confidence
* retail sales
* interest rate spread (leading)

Eurozone Economic Indicators
[Figure: Eurozone economic indicators, 1990 to 2000. Series shown: GDP, retail sales, IPI, unemployment, interest rate spread, industrial confidence indicator, construction confidence indicator, retail trade confidence indicator, consumer confidence indicator.]

Details of model, estimation
• we have set m = 2 and k = 6 for the generalised components;
• this leads to trend and cycle estimates with band-pass
properties, see Baxter and King (1999);
• the cycle frequency is fixed at λ = 0.06545 (a period of 96 months, or
8 years), see Stock and Watson (1999) for the U.S. and ECB (2001) for the
Euro area;
• the shifts ξi are estimated;
• the number of parameters for each equation is four (σ²i,ζ , δi , ξi , σ²i,ε )
and for the common cycle it is two (φ and σ²κ );
• the total number is 4N = 4 × 9 = 36: the two common cycle parameters
are offset by the two restrictions (unit loading, zero shift) imposed on GDP.

Decomposition of real GDP
[Figure: decomposition of real GDP, 1990 to 2000, in four panels: GDP Euro Area with trend, slope, cycle, irregular.]

The business cycle coincident indicator
Noteworthy features:
• GDP is quarterly, estimated components are monthly
• Euro area potential growth declined after the major recession of
1993: before, growth was around 3.7% in annualised terms; after, it
was 2.4%, which falls within the 2.0 – 2.5% range underlying ECB monetary
policy
• GDP cycle in line with common wisdom regarding Euro area
business cycle, ECB (2002)
• business cycle tracks the turning points well
• historical minimum value is observed in Aug 1993, falls in most
severe recession period of Euro area
• maximum value is in Jan 2001

The business cycle coincident indicator
Selected estimation results
series             load     shift    R²d
gdp                −−       −−       0.31
industrial prod    1.18     6.85     0.67
unemployment       −0.42    −15.9    0.78
industrial c       2.46     7.84     0.47
construction c     0.77     1.86     0.51
retail sales c     0.26     −0.22    0.67
consumer c         1.12     3.76     0.33
retail sales       0.11     −4.70    0.86
int rate spr       0.57     16.8     0.22

Coincident indicator for Euro area business cycle
[Figure: coincident indicator for the Euro area business cycle, 1990 to 2000; vertical axis from −0.015 to 0.010.]

Revisions
Real-time reliability of business cycle and growth indicators:

• in practice, indicators are subject to revisions over time due to
data revisions and to their re-computation
• not possible to evaluate consequences of first potential source of
revisions
• we assess the second one by comparing smoothed and filtered
versions of indicator

Revisions
[Figure: top panel shows the smoothed cycle and the filtered cycle, 1990 to 2000; bottom panel shows the revisions, 1990 to 2000.]

Revisions, continued
Some revision statistics for cycle
period         sd ratio    corr    sign
1989 – 2002    0.84        0.55    0.72
1993 – 2002    0.66        0.75    0.84

• take into account that it is hard to estimate the output gap in real time
• only with the passage of time can one be more accurate about the
cyclical position
• Orphanides and van Norden (REStat, 2002) say that, whatever method
is used, reliability is quite low

Reliability statistics for business cycle indicators
Some comparisons for the quarterly frequency:
Method    correlation      noise-signal     sign concord
          86-02   93-02    86-02   93-02    86-02   93-02
H-P       0.33    0.70     1.23    0.99     0.47    0.55
C-F       0.54    0.76     0.90    0.67     0.55    0.65
H-C       0.42    0.69     0.91    0.75     0.61    0.63
A-K-R     0.58    0.80     0.81    0.61     0.69    0.83

Correlation is the contemporaneous correlation between the real-time
(filtered) estimates and the final (smoothed) estimates of the business
cycle. Noise-to-signal ratio is the ratio of the standard deviation of the
revisions against the standard deviation of the final estimates. Sign
concordance is the percentage of times that the signs of the real-time
and final estimates are equal.

Comparison of four different business cycle indicators
[Figure: the H-C, A-K-R, C-F and H-P business cycle indicators, 1985 to 2000.]

Filtered (dotted) and smoothed (solid) cycle estimates
[Figure: four panels, 1985 to 2000, one for each of H-P, C-F, H-C and A-K-R.]

Dynamic factor models
Consider a basic example of the dynamic factor model, for p × 1
observation vector yt and r × 1 latent factor vector ft , as given by
yt = Λft + εt , ft+1 = Φft + ζt , t = 1, . . . , n,
where εt ∼ N ID(0, Σε ) and ζt ∼ N ID(0, Σζ ). For identification
purposes, we have vec(Σζ ) = (I − Φ ⊗ Φ) vec(I), that is, Σζ = I − ΦΦ′.
In other words, factors ft are standardized: Var(ft ) = I.

The basic example
For p × 1 data vector yt and r factors in ft , the basic DFM is
yt = Λft + εt , ft+1 = Φft + ζt , t = 1, . . . , n.
Cross-section dimension p is typically high and time series length n is

moderate.
• We are possibly interested in p >> n.
• Estimation concentrates on Λ, Σε and Φ.
• However, first we concentrate on

◦ signal extraction of ft ,
◦ likelihood evaluation,
for given values of Λ, Σε and Φ.

State space formulation
The dynamic factor model for p × 1 observation vector yt and with r × 1
latent factor vector ft
yt = Λft + εt , ft+1 = Φft + ζt , t = 1, . . . , n,
and disturbances
εt ∼ N ID(0, Σε ), ζt ∼ N ID(0, Σζ ),
can obviously be represented in the familiar (partially time-invariant)
state space model
yt = Zαt + εt , αt+1 = Tt αt + Rt ηt ,
with
εt ∼ N ID(0, H), ηt ∼ N ID(0, Qt ).
Let’s adopt the state space representation.

Signal extraction
Consider the dynamic factor model
with high-dimensional yt and low-dimensional αt .

Likelihood evaluation can be based on the prediction error decomposition
\[
\ell = p(y_1) \prod_{t=2}^{n} p(y_t \mid y_1, \ldots, y_{t-1}),
\]
and can be routinely computed by the Kalman filter. Evaluation of
\[
\tilde\alpha_t = E(\alpha_t \mid y_1, \ldots, y_s), \qquad
Var(\alpha_t \mid y_1, \ldots, y_s), \qquad s = t-1, \ldots, n,
\]
for t = 1, . . . , n is then carried out by the Kalman filter and related methods.

Kalman filter methods are often dismissed as p becomes very large :(

Transformation by regression
However, huge computational gains can be obtained as follows :)

Apply the GLS regression lemma to the observation equation yt = Zαt + εt for every t:
\[
\hat\alpha_t = P y_t, \qquad P = (Z' H^{-1} Z)^{-1} Z' H^{-1}.
\]
Then transform the model for yt to a model for \(\hat\alpha_t\), that is
\[
\hat\alpha_t = \alpha_t + e_t,
\]
with \(e_t = P\varepsilon_t \sim NID\{0, (Z' H^{-1} Z)^{-1}\}\). It can be shown that
\[
\tilde\alpha_t = E(\alpha_t \mid y_1, \ldots, y_s)
= E(\alpha_t \mid \hat\alpha_1, \ldots, \hat\alpha_s), \qquad t, s = 1, \ldots, n.
\]
It implies that the dimension of the observation equation reduces from p to r.

Two-step method
Consider the state space model for known system matrices.
Signal extraction for αt is carried out in two steps:
1. Cross-section step (GLS):
\[
\hat\alpha_t = (Z' H^{-1} Z)^{-1} Z' H^{-1} y_t .
\]
2. Time series step: use Kalman filter methods to evaluate
\(\tilde\alpha_t = E(\alpha_t \mid y_1, \ldots, y_s)\) based on the low-dimensional model
\[
\hat\alpha_t = \alpha_t + e_t, \qquad e_t \sim NID\{0, (Z' H^{-1} Z)^{-1}\}.
\]
It turns out that all inference can be based on this model for \(\hat\alpha_t\),
including the evaluation of the likelihood function.
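
The two-step method can be sketched as follows, assuming NumPy. The code simulates a small dynamic factor model, runs a Kalman filter on the p-dimensional observations, and runs the same filter on the r-dimensional GLS estimates, whose observation equation has loading matrix I and variance (Z′H⁻¹Z)⁻¹. The function name, the dimensions and all parameter values are hypothetical, and the filter is a plain textbook implementation rather than an optimised one.

```python
import numpy as np

def kalman_filter(y, Z, H, T, Q, a1, P1):
    """Return the filtered means E(alpha_t | y_1, ..., y_t) for t = 1, ..., n."""
    n = y.shape[0]
    a, P = a1.copy(), P1.copy()
    filtered = np.empty((n, a1.size))
    for t in range(n):
        v = y[t] - Z @ a                      # prediction error
        F = Z @ P @ Z.T + H                   # prediction error variance
        PZt = P @ Z.T
        a_filt = a + PZt @ np.linalg.solve(F, v)
        P_filt = P - PZt @ np.linalg.solve(F, PZt.T)
        filtered[t] = a_filt
        a = T @ a_filt                        # prediction step
        P = T @ P_filt @ T.T + Q
    return filtered

rng = np.random.default_rng(0)
n, p, r = 100, 20, 2                          # hypothetical dimensions
Z = rng.normal(size=(p, r))
H = np.diag(rng.uniform(0.5, 2.0, size=p))
T = np.diag([0.8, 0.5])
Q = np.eye(r) - T @ T.T                       # gives Var(alpha_t) = I
a1, P1 = np.zeros(r), np.eye(r)

# Simulate states and observations
alpha = np.empty((n, r))
state = rng.multivariate_normal(a1, P1)
for t in range(n):
    alpha[t] = state
    state = T @ state + rng.multivariate_normal(np.zeros(r), Q)
y = alpha @ Z.T + rng.multivariate_normal(np.zeros(p), H, size=n)

# Step 1: cross-section GLS for every t
H_inv = np.linalg.inv(H)
H_low = np.linalg.inv(Z.T @ H_inv @ Z)        # variance of e_t
alpha_hat = y @ (H_low @ Z.T @ H_inv).T

# Step 2: Kalman filter on the r-dimensional model
filt_full = kalman_filter(y, Z, H, T, Q, a1, P1)
filt_low = kalman_filter(alpha_hat, np.eye(r), H_low, T, Q, a1, P1)
```

The filtered states from the two runs agree up to rounding error, while the second run only handles r × r matrices in each step.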

Transforming the observation vector
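Consider model yt = Zαt + εt with εt ∼ N ID(0, H).

Transform \(y^+_t = A y_t\), for t = 1, . . . , n, for some non-singular matrix A:
MMSLEs are not affected and the loglikelihood function differs only by the
Jacobian term \(\log |A|^n\).
\[
A = \begin{bmatrix} A^L \\ A^H \end{bmatrix}, \qquad
y^+_t = \begin{pmatrix} y^L_t \\ y^H_t \end{pmatrix},
\]
where \(y^L_t = A^L y_t\), \(y^H_t = A^H y_t\). Choose A s.t.
\[
y^L_t = A^L Z \alpha_t + e^L_t, \qquad y^H_t = e^H_t,
\]
\[
\begin{pmatrix} e^L_t \\ e^H_t \end{pmatrix}
\sim NID\left\{ \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{bmatrix} \Sigma_L & 0 \\ 0 & \Sigma_H \end{bmatrix} \right\},
\qquad \text{with} \quad
\Sigma_L = A^L H A^{L\prime}, \quad \Sigma_H = A^H H A^{H\prime}.
\]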

Conditions for transformation
A suitable matrix A needs to fulfil the following conditions:

1. A is of full rank, which prevents any loss of information;
2. A^H H A^L′ = 0, which ensures that both equations are independent;
3. Row{A^H } = Col{Z}⊥ , which implies that y^H_t does not depend on αt (can
be weakened).
LEMMA 1:
Matrix A satisfies these conditions if and only if
A^L = CZ′H⁻¹ ,
for any nonsingular r × r matrix C.

For this matrix A^L , we can always find a matrix A^H that satisfies 1, 2, 3.
However, we will not need to compute A^H , see below.

An additional condition for convenience
A suitable matrix A needs to fulfil the following conditions:

1. A is of full rank, which prevents any loss of information;
2. A^H H A^L′ = 0, which ensures that both equations are independent;
3. Row{A^H } = Col{Z}⊥ , which implies that y^H_t does not depend on αt ;
4. |ΣH | = 1 where ΣH = A^H H A^H′ .
The additional fourth condition is not restrictive; it is about scaling and
it simplifies various calculations.
For example, from the fourth condition, it follows that
|A|² = |H|⁻¹ |AHA′ | = |H|⁻¹ |A^L H A^L′ | |A^H H A^H′ | = |H|⁻¹ |ΣL |.
This is particularly convenient for likelihood evaluation, next.

Likelihood evaluation
Gaussian likelihood (GL) based on transformation via A is
\[
\ell(y; \psi) = \ell(y^L; \psi) + \ell(y^H; \psi) + n \log |A|, \qquad
|A|^2 = |H|^{-1} |\Sigma_L|.
\]
The first term \(\ell(y^L; \psi)\) can be evaluated by the Kalman filter.

The second term is
\[
\ell(y^H; \psi) = -\frac{(p - r)n}{2} \log 2\pi
- \frac{1}{2} \sum_{t=1}^{n} y^{H\prime}_t \Sigma_H^{-1} y^H_t,
\]
as the log-determinantal term vanishes since |ΣH | = 1 (condition 4).

LEMMA 2:
\[
y^{H\prime}_t \Sigma_H^{-1} y^H_t = e_t' H^{-1} e_t,
\]
where \(e_t = y_t - Z(Z' H^{-1} Z)^{-1} Z' H^{-1} y_t\) is the GLS residual for
data vector yt , covariate matrix Z and variance matrix H.
Choice of C is irrelevant.

Likelihood evaluation
Gaussian likelihood (GL) can now be expressed as
\[
\ell(y; \psi) = c + \ell(y^L; \psi) - \frac{n}{2} \log \frac{|H|}{|\Sigma_L|}
- \frac{1}{2} \sum_{t=1}^{n} e_t' H^{-1} e_t,
\]
where c is a constant that depends on neither y nor ψ.
It follows that for the evaluation of the loglikelihood, computation of
matrix A^H and vectors y^H_t , for t = 1, . . . , n, is not required.
Matrix H is often treated as diagonal or has another strong
structure (blocks, bands, spatial). Term |ΣL | is delivered by KFS.
This GL expression is instrumental for a computationally feasible
approach to a quasi-likelihood based analysis of the dynamic factor
model.
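
The likelihood expression can be checked numerically. The sketch below, assuming NumPy, takes C = (Z′H⁻¹Z)⁻¹ so that y^L_t is the GLS estimate and ΣL = (Z′H⁻¹Z)⁻¹, uses c = −(p − r)n/2 · log 2π, and compares the result with the loglikelihood from a Kalman filter run on the full p-dimensional data. The function name, dimensions and parameter values are hypothetical.

```python
import numpy as np

def kf_loglik(y, Z, H, T, Q, a1, P1):
    """Gaussian loglikelihood via the prediction error decomposition."""
    n, p = y.shape
    a, P = a1.copy(), P1.copy()
    loglik = 0.0
    for t in range(n):
        v = y[t] - Z @ a
        F = Z @ P @ Z.T + H
        F_inv = np.linalg.inv(F)
        loglik -= 0.5 * (p * np.log(2 * np.pi) + np.linalg.slogdet(F)[1] + v @ F_inv @ v)
        K = T @ P @ Z.T @ F_inv
        a = T @ a + K @ v
        P = T @ P @ (T - K @ Z).T + Q
    return loglik

rng = np.random.default_rng(0)
n, p, r = 100, 20, 2                          # hypothetical dimensions
Z = rng.normal(size=(p, r))
H = np.diag(rng.uniform(0.5, 2.0, size=p))
T = np.diag([0.8, 0.5])
Q = np.eye(r) - T @ T.T
a1, P1 = np.zeros(r), np.eye(r)

alpha = np.empty((n, r))
state = rng.multivariate_normal(a1, P1)
for t in range(n):
    alpha[t] = state
    state = T @ state + rng.multivariate_normal(np.zeros(r), Q)
y = alpha @ Z.T + rng.multivariate_normal(np.zeros(p), H, size=n)

# Low-dimensional series y^L_t with C = (Z' H^{-1} Z)^{-1}
H_inv = np.linalg.inv(H)
sigma_L = np.linalg.inv(Z.T @ H_inv @ Z)
y_low = y @ (sigma_L @ Z.T @ H_inv).T
resid = y - y_low @ Z.T                       # GLS residuals e_t

const = -0.5 * (p - r) * n * np.log(2 * np.pi)
log_ratio = np.linalg.slogdet(H)[1] - np.linalg.slogdet(sigma_L)[1]
quad = np.einsum('ti,ij,tj->', resid, H_inv, resid)

ll_collapsed = (const + kf_loglik(y_low, np.eye(r), sigma_L, T, Q, a1, P1)
                - 0.5 * n * log_ratio - 0.5 * quad)
ll_full = kf_loglik(y, Z, H, T, Q, a1, P1)
```

The two loglikelihood values coincide up to rounding error, and the collapsed version never forms A^H or y^H_t.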

Exercise 1
Consider the common trends model of Harvey and Koopman (1997,
§§9.4.1 and 9.4.2).
1. Put the common trends model with (possibly common) stochastic
slopes and based on equations (21)-(23) in state space form.
2. Put the common trends model with (possibly common) stochastic
slopes and based on equations (24)-(26) in state space form.
Define all vectors and matrices precisely.
3. Discuss the generalisation of Ση ≠ 0 and the consequences for
the state space formulation of the model as in 2.

Exercise 2
Consider the multivariate trend model of Harvey and Koopman (1997).
1. Consider a multiple data set of N time series yt . The aim is to

decompose the time series into trend and stationary components.
It is further required that the multiple trend can be decomposed
into a common single trend (common to all N time series) and
idiosyncratic trends (specific to the individual time series).
• Formulate a model for such a decomposition.
• Discuss the identification of the different trends.
• Express the model in state space form.
2. Once multiple trend models are expressed in state space form,
we need to estimate the parameters of the model.
Please describe briefly some relevant issues of maximum
likelihood estimation. Is it feasible? What problems can you
expect? Any recommendations for a successful implementation?

Exercise 3
This exercise is based on the paper of Harvey and Koopman (1997),

see my website.
Consider the similar cycle model of Harvey and Koopman (1997) with
observation equation
yt = ψt + εt , εt ∼ N (0, Σε ),
where yt is a 3 × 1 observation vector. Cycle ψt represents a common

similar cycle component of rank 2.
1. Please provide the state space representation of this model.
2. Comment on the restrictive nature of the similar cycle model.
3. How would you modify the similar cycle model so that each time
series in yt has a different cycle frequency λ?
4. Can you apply the univariate Kalman filter of DK §6.4 in case Σε
is diagonal? What if Σε is not diagonal? Give details.

Exercise 4
Consider the dynamic factor model and the transformation approach.

Can you propose a transformation matrix A that satisfies all three
conditions and that leads to a model for ytL with a diagonal variance
matrix ΣL ?
In this case, the univariate treatment of the Kalman filter can be
considered.
