14 MULTIVARIATE TIME SERIES ANALYSIS

Multivariate time series analysis involves the use of stochastic models to describe and
analyze the relationships among several time series. While the focus in most of the earlier
chapters has been on univariate methods, we will now assume that 𝑘 time series, denoted
as 𝑧1𝑡 , 𝑧2𝑡 , … , 𝑧𝑘𝑡, are to be analyzed, and we let 𝒁 𝑡 = (𝑧1𝑡 , … , 𝑧𝑘𝑡 )′ denote the time series
vector at time 𝑡, for 𝑡 = 0, ±1, …. Such multivariate processes are of interest in a variety of
fields such as economics, business, the social sciences, earth sciences (e.g., meteorology
and geophysics), environmental sciences, and engineering. For example, in an engineering
setting, one may be interested in the study of the simultaneous behavior over time of current
and voltage, or of pressure, temperature, and volume. In economics, we may be interested
in the variations of interest rates, money supply, unemployment, and so on, while sales
volume, prices, and advertising expenditures for a particular commodity may be of interest
in a business context. Multiple time series of this type may be contemporaneously related,
some series may lead other series, or there may exist feedback relationships between the
series.
In the study of multivariate processes, a framework is needed for describing not only
the properties of the individual series but also the possible cross relationships among the
series. Two key purposes for analyzing and modeling the series jointly are:

1. To understand the dynamic relationships over time among the series.


2. To improve the accuracy of forecasts for the individual series by utilizing the additional
information available from the related series.

With these objectives in mind, we begin this chapter by introducing some basic concepts
and tools that are needed for modeling multivariate time series. We then describe the vector
autoregressive, or VAR, models that are widely used in applied work. The properties of
these models are examined and methods for model identification, parameter estimation, and
model checking are described. This is followed by a discussion of vector moving average
and mixed vector autoregressive--moving average models, along with associated modeling
tools. A brief discussion of nonstationary unit-root models and cointegration among vector
time series is also included. We find that most of the basic concepts and results from
univariate time series analysis extend to the multivariate case. However, new problems and
challenges arise in the modeling of multivariate time series due to the greater complexity
of models and parametrizations in the vector case. Methods designed to overcome such
challenges are discussed. For a more detailed coverage of various aspects of multivariate
time series analysis, see, for example, Reinsel (1997), Lütkepohl (2006), and Tsay (2014).

14.1 STATIONARY MULTIVARIATE TIME SERIES

Let 𝒁 𝑡 = (𝑧1𝑡 , … , 𝑧𝑘𝑡 )′ , 𝑡 = 0, ±1, ±2, …, denote a 𝑘-dimensional time series vector of
random variables of interest. The choice of the univariate component time series 𝑧𝑖𝑡 that
are included in 𝒁 𝑡 will depend on the subject matter area and an understanding of the
system under study, but it is implicit that the component series will be interrelated both
contemporaneously and across time lags. The representation and modeling of these dynamic
interrelationships is of main interest in multivariate time series analysis. Similar to the
univariate case, an important concept in the model representation and analysis, which
enables useful modeling results to be obtained from a finite sample realization of the series,
is that of stationarity.
The vector process {𝒁 𝑡 } is (strictly) stationary if the probability distributions of the
random vectors (𝒁 𝑡1 , 𝒁 𝑡2 , … , 𝒁 𝑡𝑚 ) and (𝒁 𝑡1 +𝑙 , 𝒁 𝑡2 +𝑙 , … , 𝒁 𝑡𝑚 +𝑙 ) are the same for arbitrary
times 𝑡1 , 𝑡2 , … , 𝑡𝑚 , all 𝑚, and all lags or leads 𝑙 = 0, ±1, ±2, …. Thus, the probability
distribution of observations from a stationary vector process is invariant with respect to
shifts in time. Hence, assuming finite first and second moments exist, for a stationary
process we must have 𝐸[𝒁 𝑡 ] = 𝝁, constant for all 𝑡, where 𝝁 = (𝜇1 , 𝜇2 , … , 𝜇𝑘 )′ is the
mean vector of the process. Also, the vectors 𝒁 𝑡 must have a constant covariance matrix
for all 𝑡, which we denote by 𝚺𝑧 ≡ 𝚪(0) = 𝐸[(𝒁 𝑡 − 𝝁)(𝒁 𝑡 − 𝝁)′ ]. A less stringent definition
of second-order, or covariance, stationarity will be provided below.

14.1.1 Cross-Covariance and Cross-Correlation Matrices


For a stationary process {𝒁 𝑡 } the covariance between 𝑧𝑖𝑡 and 𝑧𝑗,𝑡+𝑙 must depend only on
the lag 𝑙, not on time 𝑡, for 𝑖, 𝑗 = 1, … , 𝑘, 𝑙 = 0, ±1, ±2, …. Hence, similar to definitions
used in Section 12.1.1, we define the cross-covariance between the series 𝑧𝑖𝑡 and 𝑧𝑗𝑡 at lag
𝑙 as

𝛾𝑖𝑗 (𝑙) = cov[𝑧𝑖𝑡 , 𝑧𝑗,𝑡+𝑙 ] = 𝐸[(𝑧𝑖𝑡 − 𝜇𝑖 )(𝑧𝑗,𝑡+𝑙 − 𝜇𝑗 )]

and denote the 𝑘 × 𝑘 matrix of cross-covariances at lag 𝑙 as

𝚪(𝑙) = 𝐸[(𝒁 𝑡 − 𝝁)(𝒁 𝑡+𝑙 − 𝝁)′ ] =
⎡𝛾11 (𝑙)  𝛾12 (𝑙)  …  𝛾1𝑘 (𝑙)⎤
⎢𝛾21 (𝑙)  𝛾22 (𝑙)  …  𝛾2𝑘 (𝑙)⎥
⎢   ⋮        ⋮      ⋱     ⋮   ⎥          (14.1.1)
⎣𝛾𝑘1 (𝑙)  𝛾𝑘2 (𝑙)  …  𝛾𝑘𝑘 (𝑙)⎦

for 𝑙 = 0, ±1, ±2, … . The corresponding cross-correlations at lag 𝑙 are

𝜌𝑖𝑗 (𝑙) = corr[𝑧𝑖𝑡 , 𝑧𝑗,𝑡+𝑙 ] = 𝛾𝑖𝑗 (𝑙) ∕ {𝛾𝑖𝑖 (0)𝛾𝑗𝑗 (0)}^1∕2

with 𝛾𝑖𝑖 (0) = var[𝑧𝑖𝑡 ]. Thus, for 𝑖 = 𝑗, 𝜌𝑖𝑖 (𝑙) = 𝜌𝑖𝑖 (−𝑙) denotes the autocorrelation function
of the 𝑖th series 𝑧𝑖𝑡 , and for 𝑖 ≠ 𝑗, 𝜌𝑖𝑗 (𝑙) = 𝜌𝑗𝑖 (−𝑙) denotes the cross-correlation function
between the series 𝑧𝑖𝑡 and 𝑧𝑗𝑡 . The 𝑘 × 𝑘 cross-correlation matrix 𝝆(𝑙) at lag 𝑙, with (𝑖, 𝑗)th
element equal to 𝜌𝑖𝑗 (𝑙), is given by

𝝆(𝑙) = 𝐕−1∕2 𝚪(𝑙)𝐕−1∕2 = {𝜌𝑖𝑗 (𝑙)} (14.1.2)

for 𝑙 = 0, ±1, ±2, …, where 𝐕−1∕2 = diag{𝛾11 (0)−1∕2 , … , 𝛾𝑘𝑘(0)−1∕2}. Note that 𝚪(𝑙)′ =
𝚪(−𝑙) and 𝝆(𝑙)′ = 𝝆(−𝑙), since 𝛾𝑖𝑗 (𝑙) = 𝛾𝑗𝑖 (−𝑙). In addition, the cross-covariance matrices
𝚪(𝑙) and cross-correlation matrices 𝝆(𝑙) are nonnegative definite, since
var[∑_{𝑖=1}^{𝑛} 𝒃′𝑖 𝒁 𝑡−𝑖 ] = ∑_{𝑖=1}^{𝑛} ∑_{𝑗=1}^{𝑛} 𝒃′𝑖 𝚪(𝑖 − 𝑗)𝒃𝑗 ≥ 0

for all positive integers 𝑛 and all 𝑘-dimensional constant vectors 𝒃1 , … , 𝒃𝑛 .
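As a small illustration of (14.1.2), the following base R sketch converts a lag-𝑙 cross-covariance matrix into the corresponding cross-correlation matrix; the numerical matrices used here are hypothetical placeholders rather than values taken from the text.

# Sketch: cross-correlation matrix rho(l) from Gamma(l) via (14.1.2).
# Gamma0 and Gamma_l are hypothetical example inputs with k = 2.
Gamma0  <- matrix(c(4.0, 1.2, 1.2, 2.5), 2, 2)    # Gamma(0)
Gamma_l <- matrix(c(1.5, 0.8, -0.3, 0.9), 2, 2)   # Gamma(l) at some lag l
Vinv_half <- diag(1 / sqrt(diag(Gamma0)))         # V^{-1/2} = diag{gamma_ii(0)^{-1/2}}
rho_l <- Vinv_half %*% Gamma_l %*% Vinv_half      # rho(l) = V^{-1/2} Gamma(l) V^{-1/2}
round(rho_l, 3)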

14.1.2 Covariance Stationarity


The definition of stationarity given above is usually referred to as strict or strong stationarity.
In general, a process {𝒁 𝑡 } that possesses finite first and second moments and that satisfies
the conditions that 𝐸[𝒁 𝑡 ] = 𝝁 does not depend on 𝑡 and 𝐸[(𝒁 𝑡 − 𝝁)(𝒁 𝑡+𝑙 − 𝝁)′ ] depends
only on 𝑙 is referred to as weak, second-order, or covariance stationary. In this chapter, the
term stationary will generally be used in this latter sense of weak stationarity. For a sta-
tionary vector process, the cross-covariance and cross-correlation matrices provide useful
summary information on the dynamic interrelations among the components of the pro-
cess. However, because of the higher dimensionality 𝑘 > 1 of the vector process, the
cross-correlation matrices generally have more complicated structures and can be much
more difficult to interpret than the autocorrelation functions in the univariate case. In
Sections 14.2-14.4, we will examine the covariance properties implied by vector autoregres-
sive, moving average, and mixed autoregressive-moving average models.

14.1.3 Vector White Noise Process


The simplest example of a stationary vector process is the vector white noise process,
which plays a fundamental role as a building block for general vector processes. The
vector white noise process is defined as a sequence of random vectors … , 𝒂1 , … , 𝒂𝑡 , …
with 𝒂𝑡 = (𝑎1𝑡 , … , 𝑎𝑘𝑡)′ , such that 𝐸[𝒂𝑡 ] = 𝟎, 𝐸[𝒂𝑡 𝒂′𝑡 ] = 𝚺, and 𝐸[𝒂𝑡 𝒂′𝑡+𝑙 ] = 𝟎, for 𝑙 ≠ 0.
Hence, its covariance matrices 𝚪(𝑙) are given by
𝚪(𝑙) = 𝐸[𝒂𝑡 𝒂′𝑡+𝑙 ] = { 𝚺 for 𝑙 = 0;  𝟎 for 𝑙 ≠ 0 }          (14.1.3)

The 𝑘 × 𝑘 covariance matrix 𝚺 is assumed to be positive definite, since the dimension 𝑘 of
the process could be reduced otherwise. Sometimes, additional properties will be assumed
for the 𝒂𝑡 , such as normality or mutual independence over different time periods.
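As a brief illustration, Gaussian vector white noise with a prescribed covariance matrix 𝚺 can be simulated in base R from independent standard normal draws and a Cholesky factor of 𝚺; the sketch below uses the same 𝚺 as the numerical example of Section 14.2.4 and is meant only to illustrate the definition.

# Sketch: simulate n Gaussian white noise vectors a_t with covariance Sigma.
set.seed(1)
n     <- 500
Sigma <- matrix(c(4, 1, 1, 2), 2, 2)            # example innovation covariance
L     <- t(chol(Sigma))                         # lower triangular, L %*% t(L) = Sigma
a     <- t(L %*% matrix(rnorm(2 * n), 2, n))    # n x 2 matrix; row t is a_t'
round(cov(a), 2)                                # approximately Sigma (lag 0)
round(cor(a[-1, ], a[-n, ]), 2)                 # lag-1 cross-correlations, approximately 0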

14.1.4 Moving Average Representation of a Stationary Vector Process


A multivariate generalization of Wold’s theorem states that if {𝒁 𝑡 } is a purely nondeter-
ministic (i.e., 𝒁 𝑡 does not contain a purely deterministic component process whose future
values can be perfectly predicted from the past values) stationary process with mean vector
𝝁, then 𝒁 𝑡 can be represented as an infinite vector moving average (MA) process,


𝒁 𝑡 = 𝝁 + ∑_{𝑗=0}^{∞} 𝚿𝑗 𝒂𝑡−𝑗 = 𝝁 + 𝚿(𝐵)𝒂𝑡 ,    𝚿0 = 𝐈          (14.1.4)

where 𝚿(𝐵) = ∑_{𝑗=0}^{∞} 𝚿𝑗 𝐵^𝑗 is a 𝑘 × 𝑘 matrix in the backshift operator 𝐵 such that
𝐵^𝑗 𝒂𝑡 = 𝒂𝑡−𝑗 , and the 𝑘 × 𝑘 coefficient matrices 𝚿𝑗 satisfy the condition ∑_{𝑗=0}^{∞} ‖𝚿𝑗 ‖^2 < ∞, where
‖𝚿𝑗 ‖ denotes the norm of 𝚿𝑗 . The 𝒂𝑡 form a vector white noise process with mean 𝟎 and
covariances given by (14.1.3). The covariance matrix of 𝒁 𝑡 is then given by

Cov(𝒁 𝑡 ) = ∑_{𝑗=0}^{∞} 𝚿𝑗 𝚺𝚿′𝑗

The Wold representation in (14.1.4) is obtained by defining 𝒂𝑡 as the error 𝒂𝑡 =
𝒁 𝑡 − 𝒁̂ 𝑡−1 (1) of the best (i.e., minimum mean square error) one-step-ahead linear predictor
𝒁̂ 𝑡−1 (1) of 𝒁 𝑡 based on the infinite past 𝒁 𝑡−1 , 𝒁 𝑡−2 , … . Thus, the 𝒂𝑡 are mutually
uncorrelated by construction since 𝒂𝑡 is uncorrelated with 𝒁 𝑡−𝑗 for all 𝑗 ≥ 1 and, hence,
is uncorrelated with 𝒂𝑡−𝑗 for all 𝑗 ≥ 1, and the 𝒂𝑡 have a constant covariance matrix by
stationarity of the process {𝒁 𝑡 }. The best one-step-ahead linear predictor can be expressed
as

𝒁̂ 𝑡−1 (1) = 𝝁 + ∑_{𝑗=1}^{∞} 𝚿𝑗 {𝒁 𝑡−𝑗 − 𝒁̂ 𝑡−𝑗−1 (1)} = 𝝁 + ∑_{𝑗=1}^{∞} 𝚿𝑗 𝒂𝑡−𝑗

Consequently, the coefficient matrices 𝚿𝑗 in (14.1.4) have the interpretation of the linear
regression matrices of 𝒁 𝑡 on the 𝒂𝑡−𝑗 in that 𝚿𝑗 = cov[𝒁 𝑡 , 𝒂𝑡−𝑗 ]𝚺−𝟏 .
In what follows, we will assume that 𝚿(𝐵) can be represented (at least approximately, in
practice) as the product 𝚽−1 (𝐵)𝚯(𝐵), where 𝚽(𝐵) and 𝚯(𝐵) are finite autoregressive and
moving average matrix polynomials of orders 𝑝 and 𝑞, respectively. This leads to a class of
linear models for vector time series 𝒁 𝑡 defined by a relation of the form 𝚽(𝐵)(𝒁 𝑡 − 𝜇) =
𝚯(𝐵)𝐚𝑡 , or
(𝒁 𝑡 − 𝝁) − ∑_{𝑗=1}^{𝑝} 𝚽𝑗 (𝒁 𝑡−𝑗 − 𝝁) = 𝒂𝑡 − ∑_{𝑗=1}^{𝑞} 𝚯𝑗 𝒂𝑡−𝑗          (14.1.5)

A process {𝒁 𝑡 } is referred to as a vector autoregressive--moving average, or VARMA(𝑝, 𝑞),
process if it satisfies the relations (14.1.5) for a given white noise sequence {𝒂𝑡 }.
We begin the discussion of this class of vector models by examining the special case
when 𝑞 is zero so that the process follows a pure vector autoregressive model of order 𝑝.

The discussion will focus on time-domain methods for analyzing vector time series and
spectral methods will not be used. However, a brief summary of the spectral characteristics
of stationary vector processes is provided in Appendix A14.1.

14.2 VECTOR AUTOREGRESSIVE MODELS

Among multivariate time series models, vector autoregressive models are the most widely
used in practice. A major reason for this is their similarity to ordinary regression models
and the relative ease of fitting these models to actual time series. For example, the param-
eters can be estimated using least-squares methods that yield closed-form expressions for
the estimates. Other methods from multivariate regression analysis can be used at other
steps of the analysis. Vector autoregressive models are widely used in econometrics, for
example, to describe the dynamic behavior of economic and financial time series and to
produce forecasts. This section examines the properties of vector autoregressive models
and describes methods for order specification, parameter estimation, and model checking
that can be used to develop these models in practice.

14.2.1 VAR(𝒑) Model


A vector autoregressive model of order 𝑝, or VAR(𝑝) model, is defined as

𝚽(𝐵)(𝒁 𝑡 − 𝝁) = 𝒂𝑡

where 𝚽(𝐵) = 𝐈 − 𝚽1 𝐵 − 𝚽2 𝐵^2 − ⋯ − 𝚽𝑝 𝐵^𝑝 , 𝚽𝑖 is a 𝑘 × 𝑘 parameter matrix, and 𝒂𝑡 is
a white noise sequence with mean 𝟎 and covariance matrix 𝚺. The model can equivalently
be written as
(𝒁 𝑡 − 𝝁) = ∑_{𝑗=1}^{𝑝} 𝚽𝑗 (𝒁 𝑡−𝑗 − 𝝁) + 𝒂𝑡          (14.2.1)

The behavior of the process is determined by the roots of the determinantal equation
det{𝚽(𝐵)} = 0. In particular, the process is stationary if all the roots of this equation are
greater than one in absolute value; that is, lie outside the unit circle (e.g., Reinsel,1997,
Chapter 2). When this condition is met, {𝒁 𝑡 } has the infinite moving average representation


𝒁 𝑡 = 𝝁 + ∑_{𝑗=0}^{∞} 𝚿𝑗 𝒂𝑡−𝑗          (14.2.2)

or 𝒁 𝑡 = 𝝁 + 𝚿(𝐵)𝒂𝑡 , where 𝚿(𝐵) = 𝚽^−1 (𝐵) and the coefficient matrices 𝚿𝑗 satisfy the
condition ∑_{𝑗=0}^{∞} ‖𝚿𝑗 ‖ < ∞. Then, since 𝚽(𝐵)𝚿(𝐵) = 𝐈, the coefficient matrices can be
calculated recursively from

𝚿𝑗 = 𝚽1 𝚿𝑗−1 + · · · + 𝚽𝑝 𝚿𝑗−𝑝 (14.2.3)

with 𝚿0 = 𝐈 and 𝚿𝑗 = 𝟎, for 𝑗 < 0.
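The recursion (14.2.3) is easy to carry out numerically. A minimal base R sketch is given below; the function name psi_weights and the VAR(2) coefficient matrices used to call it are illustrative and not taken from the text.

# Sketch: Psi_j weights of a VAR(p) from (14.2.3),
# Psi_j = Phi_1 Psi_{j-1} + ... + Phi_p Psi_{j-p}, with Psi_0 = I and Psi_j = 0 for j < 0.
psi_weights <- function(Phi, max_lag) {
  k   <- nrow(Phi[[1]])                      # Phi: list of k x k matrices Phi_1, ..., Phi_p
  p   <- length(Phi)
  Psi <- vector("list", max_lag + 1)
  Psi[[1]] <- diag(k)                        # Psi_0 = I (stored at index 1)
  for (j in 1:max_lag) {
    S <- matrix(0, k, k)
    for (i in 1:p) {
      if (j - i >= 0) S <- S + Phi[[i]] %*% Psi[[j - i + 1]]
    }
    Psi[[j + 1]] <- S
  }
  Psi
}

# Hypothetical VAR(2) example:
Phi <- list(matrix(c(0.5, 0.1, 0.2, 0.4), 2, 2),
            matrix(c(0.2, 0.0, -0.1, 0.1), 2, 2))
psi_weights(Phi, max_lag = 4)[[3]]           # Psi_2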


The moving average representation (14.2.2) is useful for examining the covariance
properties of the process and it has a number of other applications. As in the univariate
case, it is useful for studying forecast errors when the VAR(𝑝) model is used for forecasting.

It is also used in impulse response analysis to determine how current or future values of
the series are impacted by past changes or ‘‘shocks’’ to the system. The coefficient matrix
𝚿𝑗 shows the expected impact of a past shock 𝒂𝑡−𝑗 on the current value 𝒁 𝑡 . The response of a
specific variable to a shock in another variable is often of interest in applied work. However,
since the components of 𝒂𝑡−𝑗 are typically correlated, the individual elements of the 𝚿𝑗
can be difficult to interpret. To aid the interpretation, the covariance matrix 𝚺 of 𝒂𝑡 can
be diagonalized using a Cholesky decomposition 𝚺 = 𝑳𝑳′ , where 𝑳 is a lower triangular
matrix with positive diagonal elements. Then, letting 𝒃𝑡 = 𝑳−𝟏 𝒂𝑡 , we have Cov(𝒃𝑡 ) = 𝐈𝑘 ,
and the model can be rewritten as


𝒁 𝑡 = 𝝁 + ∑_{𝑗=0}^{∞} 𝚿∗𝑗 𝒃𝑡−𝑗

where 𝚿∗0 = 𝑳 and 𝚿∗𝑗 = 𝚿𝑗 𝑳 for 𝑗 > 0. The matrices 𝚿∗𝑗 are called the impulse response
weights with respect to the orthogonal innovations 𝒃𝑡 . Since 𝑳 is a lower triangular matrix,
the ordering of the variables will, however, matter in this case. For further discussion and
for applications of impulse response analysis, see Lütkepohl (2006, Chapter 2) and Tsay
(2014, Chapter 2).
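A minimal sketch of this orthogonalization for a VAR(1), for which 𝚿𝑗 = 𝚽^𝑗, is given below; it uses the 𝚽 and 𝚺 of the numerical example in Section 14.2.4, and the variable names are illustrative.

# Sketch: orthogonalized impulse responses Psi*_j = Psi_j %*% L for a VAR(1),
# where Sigma = L L' (L lower triangular) and Psi_j = Phi^j.
Phi   <- matrix(c(0.8, -0.4, 0.7, 0.6), 2, 2)
Sigma <- matrix(c(4, 1, 1, 2), 2, 2)
L     <- t(chol(Sigma))                      # lower triangular Cholesky factor
Psi_star      <- vector("list", 7)
Psi_star[[1]] <- L                           # Psi*_0 = L
Psi_j <- diag(2)
for (j in 1:6) {
  Psi_j <- Phi %*% Psi_j                     # Psi_j = Phi^j
  Psi_star[[j + 1]] <- Psi_j %*% L           # Psi*_j = Psi_j L
}
Psi_star[[2]]                                # one-step responses to a unit orthogonal shock b_t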

Reduced and Structural Forms. It is sometimes useful to express the VAR(𝑝) process in
(14.2.1) in the following slightly different form. Since the matrix 𝚺 = 𝐸[𝒂𝑡 𝒂′𝑡 ] is assumed
to be positive definite, there exists a lower triangular matrix 𝚽#0 with ones on the diagonal

such that 𝚽#0 𝚺(𝚽#0 )′ = 𝚺# is a diagonal matrix with positive diagonal elements. Hence, by
premultiplying (14.2.1) by 𝚽#0 , we obtain the following representation:

𝚽#0 (𝒁 𝑡 − 𝝁) = ∑_{𝑗=1}^{𝑝} 𝚽#𝑗 (𝒁 𝑡−𝑗 − 𝝁) + 𝒃𝑡          (14.2.4)

where 𝚽#𝑗 = 𝚽#0 𝚽𝑗 and 𝒃𝑡 = 𝚽#0 𝒂𝑡 with Cov[𝒃𝑡 ] = 𝚺# . This model displays the concurrent
dependence among the components of 𝒁 𝑡 through the lower triangular matrix 𝚽#0 and is
sometimes referred to as the structural form of the VAR(𝑝) model. The model (14.2.1) that
includes the concurrent relationships in the covariance matrix 𝚺 of the errors and does not
show them explicitly is referred to as the standard or reduced form of the VAR(𝑝) model.
Note that a diagonalizing transformation of this type was already used in the impulse
response analysis described above, where the innovations 𝒃𝑡 ’s were further normalized to
have unit variance.
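One convenient way to construct 𝚽#0 and 𝚺# in practice is from the Cholesky factor of 𝚺; the following base R sketch shows this particular construction (an assumption of this illustration rather than a prescription of the text), again using the 𝚺 of Section 14.2.4.

# Sketch: structural-form matrices from Sigma via its Cholesky factor.
# With Sigma = L L' (L lower triangular), Phi0 = diag(diag(L)) %*% solve(L) is lower
# triangular with ones on the diagonal, and Phi0 %*% Sigma %*% t(Phi0) is diagonal.
Sigma <- matrix(c(4, 1, 1, 2), 2, 2)
L     <- t(chol(Sigma))
Phi0  <- diag(diag(L)) %*% solve(L)          # candidate Phi#_0 (unit lower triangular)
Sigma_sharp <- Phi0 %*% Sigma %*% t(Phi0)    # Sigma# (diagonal with positive elements)
round(Phi0, 4)
round(Sigma_sharp, 4)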

14.2.2 Moment Equations and Yule--Walker Estimates


For the VAR(𝑝) model, the covariance matrices 𝚪(𝑙) = Cov(𝒁 𝑡 , 𝒁 𝑡+𝑙 ) = Cov(𝒁 𝑡−𝑙 , 𝒁 𝑡 ) =
𝐸[(𝒁 𝑡−𝑙 − 𝝁)(𝒁 𝑡 − 𝝁)′ ] satisfy the matrix equations

𝚪(𝑙) = ∑_{𝑗=1}^{𝑝} 𝚪(𝑙 − 𝑗)𝚽′𝑗          (14.2.5)

for 𝑙 = 1, 2, …, with 𝚪(0) = ∑_{𝑗=1}^{𝑝} 𝚪(−𝑗)𝚽′𝑗 + 𝚺. This result is readily derived using

(14.2.1), noting that 𝐸[(𝒁 𝑡−𝑙 − 𝝁)𝒂𝑡−𝑗 ] = 𝟎, for 𝑗 < 𝑙. The matrix equations (14.2.5) are
commonly referred to as the multivariate Yule--Walker equations for the VAR(𝑝) model.
For 𝑙 = 0, … , 𝑝, these equations can be used to solve for the 𝚪(𝑙) simultaneously in terms
of the AR parameter matrices 𝚽𝑗 and 𝚺.
Conversely, the AR coefficient matrices 𝚽1 , … , 𝚽𝑝 and 𝚺 can also be determined
from the 𝚪’s by first solving the Yule--Walker equations, for 𝑙 = 1, … , 𝑝, to obtain the
parameters 𝚽𝑗 . These equations can be written in matrix form as 𝚪𝑝 𝚽(𝑝) = 𝚪(𝑝) , with
solution 𝚽(𝑝) = 𝚪−1𝑝 𝚪(𝑝) , where

𝚽(𝑝) = [𝚽1 , … , 𝚽𝑝 ]′ 𝚪(𝑝) = [𝚪(1)′ , … , 𝚪(𝑝)′ ]′

and 𝚪𝑝 is a 𝑘𝑝 × 𝑘𝑝 matrix with (𝑖, 𝑗)th block of elements equal to 𝚪(𝑖 − 𝑗). Once the 𝚽𝑗
are determined from this, 𝚺 can be obtained as

𝚺 = 𝚪(0) − ∑_{𝑗=1}^{𝑝} 𝚪(−𝑗)𝚽′𝑗 ≡ 𝚪(0) − 𝚪′(𝑝) 𝚽(𝑝) = 𝚪(0) − 𝚽′(𝑝) 𝚪𝑝 𝚽(𝑝)

In practical applications, these results can be used to derive Yule--Walker estimates of the
parameters in the VAR(𝑝) model by replacing the variance and covariance matrices by their
estimates.
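A base R sketch of this calculation is given below: it builds the block matrix 𝚪𝑝, solves 𝚪𝑝 𝚽(𝑝) = 𝚪(𝑝) for the coefficient matrices, and then recovers 𝚺. The function name yw_var and the use of a list of cross-covariance matrices 𝚪(0), …, 𝚪(𝑝) as input are illustrative choices, not notation from the text.

# Sketch: Yule-Walker solution for a VAR(p).
# G is a list with G[[1]] = Gamma(0), G[[2]] = Gamma(1), ..., G[[p+1]] = Gamma(p);
# recall that Gamma(-l) = Gamma(l)'.
yw_var <- function(G) {
  p   <- length(G) - 1
  k   <- nrow(G[[1]])
  Gam <- function(l) if (l >= 0) G[[l + 1]] else t(G[[-l + 1]])
  Gp  <- matrix(0, k * p, k * p)             # (i, j)th k x k block is Gamma(i - j)
  for (i in 1:p) for (j in 1:p)
    Gp[(i - 1) * k + 1:k, (j - 1) * k + 1:k] <- Gam(i - j)
  Gvec  <- do.call(rbind, lapply(1:p, Gam))  # Gamma(1), ..., Gamma(p) stacked vertically
  Phi_p <- solve(Gp, Gvec)                   # kp x k matrix [Phi_1, ..., Phi_p]'
  Sigma <- G[[1]] - t(Gvec) %*% Phi_p        # Sigma = Gamma(0) - Gamma'_(p) Phi_(p)
  Phi   <- lapply(1:p, function(j) t(Phi_p[(j - 1) * k + 1:k, , drop = FALSE]))
  list(Phi = Phi, Sigma = Sigma)
}

Applied to sample cross-covariance matrices, this yields the Yule--Walker estimates; applied to the theoretical 𝚪(0) and 𝚪(1) of the VAR(1) example in Section 14.2.4, it recovers 𝚽 and 𝚺 up to the rounding of the printed values.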

14.2.3 Special Case: VAR(1) Model


To examine the properties of VAR models in more detail, we will consider the VAR(1)
model,

𝒁 𝑡 = 𝚽𝒁 𝑡−1 + 𝒂𝑡

where the mean vector 𝝁 is assumed to be zero for convenience. For 𝑘 = 2, we have the
bivariate VAR(1) process
𝒁 𝑡 = ⎡𝜙11  𝜙12 ⎤ 𝒁 𝑡−1 + ⎡𝑎1𝑡 ⎤
      ⎣𝜙21  𝜙22 ⎦          ⎣𝑎2𝑡 ⎦

or equivalently

𝑧1𝑡 = 𝜙11 𝑧1,𝑡−1 + 𝜙12 𝑧2,𝑡−1 + 𝑎1𝑡


𝑧2𝑡 = 𝜙21 𝑧1,𝑡−1 + 𝜙22 𝑧2,𝑡−1 + 𝑎2𝑡

where 𝜙11 and 𝜙22 reflect the dependence of each component on its own past. The parameter
𝜙12 shows the dependence of 𝑧1𝑡 on 𝑧2,𝑡−1 in the presence of 𝑧1,𝑡−1 , while 𝜙21 shows the
dependence of 𝑧2𝑡 on 𝑧1,𝑡−1 in the presence of 𝑧2,𝑡−1 . Thus, if 𝜙12 ≠ 0 and 𝜙21 ≠ 0, then
there is a feedback relationship between the two components. On the other hand, if the off-
diagonal elements of the parameter matrix Φ are zero, that is, 𝜙12 = 𝜙21 = 0, then 𝑧1𝑡 and
𝑧2𝑡 are not dynamically correlated. However, they are still contemporaneously correlated
unless 𝚺 is a diagonal matrix.

Relationship to Transfer Function Model. If 𝜙12 = 0, but 𝜙21 ≠ 0, then 𝑧1𝑡 does not
depend on past values of 𝑧2𝑡 but 𝑧2𝑡 depends on past values of 𝑧1𝑡 . A transfer function
relationship then exists with 𝑧1𝑡 acting as an input variable and 𝑧2𝑡 as an output variable.
However, unless 𝑧1𝑡 is uncorrelated with 𝑎2𝑡 , the resulting model is not in the standard
transfer function form discussed in Chapter 12. To obtain the standard transfer function
model, we let 𝑎1𝑡 = 𝑏1𝑡 and 𝑎2𝑡 = 𝛽𝑎1𝑡 + 𝑏2𝑡 , where 𝛽 is the regression coefficient of 𝑎2𝑡 on
𝑎1𝑡 . Under normality, the error term 𝑏2𝑡 is then independent of 𝑎1𝑡 and hence of 𝑏1𝑡 . The
unidirectional transfer function model is obtained by rewriting the equations for 𝑧1𝑡 and 𝑧2𝑡
above in terms of the orthogonal innovations 𝑏1𝑡 and 𝑏2𝑡 . This yields

(1 − 𝜙22 𝐵)𝑧2𝑡 = {𝛽 + (𝜙21 − 𝛽𝜙11 )𝐵}𝑧1𝑡 + 𝑏2𝑡

where the input variable 𝑧1𝑡 does not depend on the noise term 𝑏2𝑡 .
Hence, the bivariate transfer function model emerges as a special case of the bivariate
AR model, in which a unidirectional relationship exists between the variables. In general,
for a VAR(1) model in higher dimensions, 𝑘 > 2, if the 𝑘 series can be arranged so that the
matrix 𝚽 is lower triangular, then the VAR(1) model can also be expressed in the form of
unidirectional transfer function equations.

Stationarity Conditions for VAR(1) Model. The VAR(1) process is stationary if the roots
of det{𝐈 − 𝚽𝐵} = 0 exceed one in absolute value. Since det{𝐈 − 𝚽𝐵} = 0 if and only
if det{𝜆𝐈 − 𝚽} = 0 with 𝜆 = 1∕𝐵, it follows that the stationarity condition for the AR(1)
model is equivalent to requiring that the eigenvalues of 𝚽 be less than one in absolute value.
When this condition is met, the process has the convergent infinite MA representation
(14.2.2) with MA coefficient matrices 𝚿𝑗 = 𝚽^𝑗 , since from (14.2.3) the 𝚿𝑗 now satisfy

𝚿𝑗 = 𝚽𝚿𝑗−1 ≡ 𝚽^𝑗 𝚿0

To look at the stationarity for a 𝑘-dimensional VAR(1) model further, we note that for
arbitrary 𝑛 > 0, by 𝑡 + 𝑛 successive substitutions in the right-hand side of 𝒁 𝑡 = 𝚽𝒁 𝑡−1 + 𝒂𝑡
we obtain
𝒁 𝑡 = ∑_{𝑗=0}^{𝑡+𝑛} 𝚽^𝑗 𝒂𝑡−𝑗 + 𝚽^(𝑡+𝑛+1) 𝒁 −𝑛−1

Hence, provided that all eigenvalues of 𝚽 are less than one in absolute value, as
𝑛 → ∞ this will converge to the infinite MA representation 𝒁 𝑡 = ∑_{𝑗=0}^{∞} 𝚽^𝑗 𝒂𝑡−𝑗 , with
∑_{𝑗=0}^{∞} ‖𝚽^𝑗 ‖ < ∞, which is stationary. For example, suppose that 𝚽 has 𝑘 distinct eigen-
values 𝜆1 , … , 𝜆𝑘 , so there is a 𝑘 × 𝑘 nonsingular matrix 𝐏 such that 𝐏^−1 𝚽𝐏 = 𝚲 =
diag(𝜆1 , … , 𝜆𝑘 ). Then 𝚽 = 𝐏𝚲𝐏^−1 and 𝚽^𝑗 = 𝐏𝚲^𝑗 𝐏^−1 , where 𝚲^𝑗 = diag(𝜆^𝑗_1 , … , 𝜆^𝑗_𝑘 ), so
when all |𝜆𝑖 | < 1, ∑_{𝑗=0}^{∞} ‖𝚽^𝑗 ‖ < ∞ since then ∑_{𝑗=0}^{∞} ‖𝚲^𝑗 ‖ < ∞.

Moment Equations. For the VAR(1) model, the matrix Yule--Walker equations (14.2.5)
simplify to

𝚪(𝑙) = 𝚪(𝑙 − 1)𝚽′ for 𝑙 ≥ 1



so 𝚪(1) = 𝚪(0)𝚽′ , in particular, with

𝚪(0) = 𝚪(−1)𝚽′ + Σ = 𝚽𝚪(0)𝚽′ + 𝚺

Hence, 𝚽′ can be determined from 𝚪(0) and 𝚪(1) as 𝚽′ = 𝚪(0)^−1 𝚪(1), and also 𝚪(𝑙) =
𝚪(0)(𝚽′)^𝑙 . This last relation illustrates that the behavior of all correlations in 𝝆(𝑙), ob-
tained using (14.1.2), will be controlled by the behavior of the 𝜆^𝑙_𝑖 , 𝑖 = 1, … , 𝑘, where
𝜆1 , … , 𝜆𝑘 are the eigenvalues of 𝚽, and shows that even the simple VAR(1) model is
capable of fairly general correlation structures (e.g., mixtures of exponential decaying and
damping sinusoidal terms) for dimensions 𝑘 > 1. (For more details, see Reinsel, 1997,
Section 2.2.3).
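For the VAR(1) case, the equation 𝚪(0) = 𝚽𝚪(0)𝚽′ + 𝚺 is linear in the elements of 𝚪(0) and can be solved directly with the vec/Kronecker identity vec(𝚪(0)) = (𝐈 − 𝚽 ⊗ 𝚽)^−1 vec(𝚺). The base R sketch below (the function name gamma_var1 is illustrative) uses this together with 𝚪(𝑙) = 𝚪(𝑙 − 1)𝚽′, and reproduces the 𝚪(𝑙) obtained in the numerical example of Section 14.2.4 below.

# Sketch: theoretical Gamma(0), ..., Gamma(max_lag) of a stationary VAR(1).
gamma_var1 <- function(Phi, Sigma, max_lag = 5) {
  k  <- nrow(Phi)
  g0 <- solve(diag(k * k) - kronecker(Phi, Phi), as.vector(Sigma))  # vec(Gamma(0))
  G  <- vector("list", max_lag + 1)
  G[[1]] <- matrix(g0, k, k)                               # Gamma(0)
  for (l in 1:max_lag) G[[l + 1]] <- G[[l]] %*% t(Phi)     # Gamma(l) = Gamma(l-1) Phi'
  G
}

# With the Phi and Sigma of the example in Section 14.2.4:
Phi   <- matrix(c(0.8, -0.4, 0.7, 0.6), 2, 2)
Sigma <- matrix(c(4, 1, 1, 2), 2, 2)
round(gamma_var1(Phi, Sigma)[[1]], 3)                      # Gamma(0): cf. 18.536, -1.500, 8.884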

14.2.4 Numerical Example


Consider the bivariate (𝑘 = 2) AR(1) model (𝐈 − 𝚽𝐵)𝒁 𝑡 = 𝒂𝑡 with
𝚽 = ⎡ 0.8  0.7⎤          𝚺 = ⎡4  1⎤
    ⎣−0.4  0.6⎦               ⎣1  2⎦

The roots of det{𝜆𝐈 − 𝚽} = 𝜆^2 − 1.4𝜆 + 0.76 = 0 are 𝜆 = 0.7 ± 0.5196𝑖, with absolute
value equal to (0.76)^1∕2 ; hence, the AR(1) model is stationary. Since the roots are complex,
the correlations of this AR(1) process will exhibit damped sinusoidal behavior. The co-
variance matrix 𝚪(0) is determined by solving the linear equations 𝚪(0) − 𝚽𝚪(0)𝚽′ = 𝚺.
Together with 𝚪(𝑙) = 𝚪(𝑙 − 1)𝚽′ , these lead to the covariance matrices
𝚪(0) = ⎡ 18.536  −1.500⎤          𝚪(1) = ⎡13.779  −8.315⎤
       ⎣ −1.500   8.884⎦                  ⎣ 5.019   5.931⎦

𝚪(2) = ⎡ 5.203  −10.500⎤          𝚪(3) = ⎡−3.188  −8.381⎤
       ⎣ 8.166    1.551⎦                  ⎣ 7.619  −2.336⎦

𝚪(4) = ⎡−8.417  −3.754⎤           𝚪(5) = ⎡−9.361   1.115⎤
       ⎣ 4.460  −4.449⎦                   ⎣ 0.453  −4.453⎦

The corresponding correlation matrices are obtained from 𝝆(𝑙) = 𝐕^−1∕2 𝚪(𝑙)𝐕^−1∕2 , where
𝐕^−1∕2 = diag(18.536^−1∕2 , 8.884^−1∕2 ). The autocorrelations and cross-correlations of this
process are displayed up to 18 lags in Figure 14.1. We note that the correlation patterns are
rather involved and correlations do not die out very quickly. The coefficients 𝚿𝑗 = 𝚽^𝑗 , 𝑗 ≥
1, in the infinite MA representation for this AR(1) process are


𝚿1 = ⎡ 0.80  0.70⎤     𝚿2 = ⎡ 0.36  0.98⎤     𝚿3 = ⎡−0.10   0.84⎤
     ⎣−0.40  0.60⎦          ⎣−0.56  0.08⎦          ⎣−0.48  −0.34⎦

𝚿4 = ⎡−0.42   0.43⎤    𝚿5 = ⎡−0.51  −0.03⎤    𝚿6 = ⎡−0.39  −0.38⎤
     ⎣−0.25  −0.54⎦         ⎣ 0.02  −0.50⎦         ⎣ 0.22  −0.28⎦

So the elements of the 𝚿𝑗 matrices are also persistent and exhibit damped sinusoidal
behavior similar to that of the correlations.

FIGURE 14.1 Theoretical autocorrelations and cross-correlations, 𝜌𝑖𝑗 (𝑙), for the bivariate VAR(1)
process example: (a) autocorrelations 𝜌11 (𝑙) and 𝜌22 (𝑙) and (b) cross-correlations 𝜌12 (𝑙).

Finally, since det{𝜆𝐈 − 𝚽} = 𝜆^2 − 1.4𝜆 + 0.76 = 0, it follows from Reinsel (1997,
Section 2.2.4) that each individual series 𝑧𝑖𝑡 has a univariate ARMA(2, 1) model rep-
resentation as (1 − 1.4𝐵 + 0.76𝐵^2 )𝑧𝑖𝑡 = (1 − 𝜂𝑖 𝐵)𝜀𝑖𝑡 , 𝜎^2_𝜀𝑖 = var[𝜀𝑖𝑡 ], where 𝜂𝑖 and 𝜎^2_𝜀𝑖 are
readily determined. For a 𝑘-dimensional VAR(𝑝) model, it can be shown that each indi-
vidual component 𝑧𝑖𝑡 follows a univariate ARMA of maximum order (𝑘𝑝, (𝑘 − 1)𝑝). The
order can be much less if the AR and MA polynomials have common factors (e.g., Wei,
2006, Chapter 16).

Computations in R. The covariance matrices 𝚪(𝑙) and the 𝚿 matrices shown above can be
reproduced using the MTS package in R as follows:

> library(MTS)                             # multivariate time series package
> phi1=matrix(c(0.8,-0.4,0.7,0.6),2,2)     # Phi, filled column by column
> sig=matrix(c(4,1,1,2),2,2)               # Sigma
> eigen(phi1)                              # eigenvalues of Phi (stationarity check)
> m1=VARMAcov(Phi=phi1, Sigma=sig, lag=5)  # theoretical covariance and correlation matrices
> names(m1)
[1] "autocov" "ccm"
> autocov=t(m1$autocov)
