Chapter 6

Time series analysis

Applied Econometrics
WS 2020/21

Prof. Dr. Simone Maxand

Humboldt University Berlin
6.1 Introduction

6.2 Stochastic processes

6.2.1 Basic concepts

6.2.2 Stationarity and ergodicity

6.2.3 Linear processes

6.3 ARMA models

6.3.1 Autoregressive and other ARMA processes

6.3.2 Estimation and forecasting

6.3.3 The Box-Jenkins program

6.4 Nonstationary processes

6.4.1 Unit root processes

6.4.2 Unit root tests

6.4.3 An empirical application with R

Chapter 6: Time series analysis
6.1 Introduction
I Time Series (TS): sequence (set) of observations yt of a

random variable over time

. Values of a variable are observed at successive time points.

. Observation at time t ∈ T : yt
I Notation: Write {yt }t∈T
. Sometimes shortly: {yt } or yt (if obvious that the TS and not
the observation at time t is meant)

I Example: Monthly US industrial production index


I T discrete set ⇒ {yt }t∈T TS in discrete time

. Data per hour, day, week, month, quarter, year, etc.

. Special case: index set \mathcal{T} finite, equidistant points in time, i.e.

\[ \mathcal{T} = \{1, \dots, T\} \;\Rightarrow\; \{y_t\}_{t \in \mathcal{T}} = \{y_1, \dots, y_T\} \]

I TS in continuous time: Observations are recorded continuously

over some time interval, e.g. T = [0, 1].

. We use then the notation y (t) rather than yt .

I In theory: It is often assumed that the TS started in the (infinite) past (t ≤ 0) and continues into the (infinite) future (t > T), i.e. \{y_t\}_{t=-\infty}^{\infty}.

. \{y_t\}_{t=1}^{T} is considered a finite segment of that infinite series.


Typical characteristics of TS data

I yt is typically not independent of yt−1 !
. Strength of dependence is an essential characteristic of TS.

. Examples:
 - Independence of y_t for all t = 1, ..., T.
 - Dependence under stationarity: y_t = φ y_{t−1} + ε_t with |φ| < 1
 - Integrated process (stochastic trend): y_t = y_{t−1} + ε_t
 - Deterministic (linear) trend: y_t = β · t + ε_t

I TS data may have a time-varying variance.

I TS data are often governed by a trend (deterministic/stoch.?).

I TS data have seasonal/cyclic components.

I TS data may have structural breaks.


Goals of time series analysis

I Generally, TS data can be used to answer quantitative

questions for which cross-sectional data are inadequate.

I Description/estimation of dynamic properties

. to gain a better understanding of the DGP:

 Are there regularities or structures in the data?

. to check economic theory:

 E.g. Quantity Theory of Money: money supply has a direct

proportional relationship to the price level,

. to forecast the future development of an economic variable:

 - What is next month's inflation rate, interest rate, stock price, etc.?



I (Dynamic) causal dependences between variables:

. How does yt depend on xt−1 ?
. How does xt depend on yt−1 ?
. How does xt depend on yt ?
. Example: What will be the present and future implications of a
change in income for consumption and investment?
. Require Multivariate Time Series Analysis (not in this course).

I Forecasts
. Predict yt based on yt−1 , yt−2 ,...
. Predict yt based on xt−1 , xt−2 ,...
. Example: Forecast of the inflation rate y by means of its own past
 - or ADL model: the inflation rate y is additionally influenced by the
unemployment rate and its lagged values

. Forecasts make sense even without causal interpretation (e.g.

in case of omitted variables).


Time series plots

I A first impression of the behavior of the TS is provided by

a graphical representation (TS plot).

I A TS plot provides information about

. trends,

. seasonal patterns,

. structural breaks,

. conditional heteroscedasticity,

. outliers, etc.

I Note: When dealing with outliers, common sense is often

more important than statistical theory.


Quarterly German GDP and first differences

[Figure, two panels, 1983-2013: quarterly German log(real GDP); first differences (income growth)]


Daily exchange rate BRA-USD


Daily DAX returns (in %)

[Figure: DAX, daily change in %, 2000-2009. Source: Thomson Reuters Datastream]


Monthly car registrations in Germany


The lag operator

I Applying some operator to a TS (or sequence of (random)

variables) provides a new TS (or sequence).

I The lag (backshift) operator L is defined by:

\[ L y_t := y_{t-1} \quad \text{(first lag of } y_t\text{)} \]

. Convention: L^0 y_t = y_t

I Powers of L are defined recursively:

\[ L^j y_t = L(L^{j-1} y_t) \quad (j \ge 2) \]
⇒ L^j y_t = y_{t−j}, j ≥ 0 (j-th lag of y_t)
I For any constant c and integers j, k:
\[ L^j c = c \quad \text{and} \quad L^j L^k y_t = L^{j+k} y_t \]
Lag polynomials
I The lag operator is a linear operator:

L(cxt + yt ) = cLxt + Lyt

I Lag polynomial: for some index set I ⊆ Z,
\[ c(L) = \sum_{j \in I} c_j L^j \]
I A lag polynomial describes a linear filter:
\[ y_t^* := c(L) y_t = \sum_{j \in I} c_j L^j y_t = \sum_{j \in I} c_j y_{t-j} \]
. Convention: c(1) = \sum_{j \in I} c_j
I Algebra of lag polynomials is isomorphic to the algebra of

usual polynomials (in real or complex variables).
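
A lag polynomial can be applied to observed data mechanically. The following is a minimal sketch in plain Python; the function name `apply_lag_polynomial` and the dictionary representation of the coefficients are illustrative assumptions, not part of the course material. Values of y*_t are only computed where every required y_{t−j} is available.

```python
def apply_lag_polynomial(coeffs, y):
    """Apply c(L) = sum_j c_j L^j to the list y.

    coeffs maps each lag j (a negative j is a lead) to its coefficient c_j.
    Returns {t: y*_t} for every index t at which all required y_{t-j} exist.
    """
    out = {}
    for t in range(len(y)):
        if all(0 <= t - j < len(y) for j in coeffs):
            out[t] = sum(c * y[t - j] for j, c in coeffs.items())
    return out


y = [1.0, 4.0, 9.0, 16.0, 25.0]

# c(L) = 1 - L: each value minus its first lag
first_difference = apply_lag_polynomial({0: 1.0, 1: -1.0}, y)

# c(L) = (L^{-1} + 1 + L) / 3: centered three-term average
centered_average = apply_lag_polynomial({-1: 1 / 3, 0: 1 / 3, 1: 1 / 3}, y)
```

Storing the coefficients by lag index keeps two-sided filters (with leads) and one-sided filters in the same representation.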


Difference operator
I The difference operator ∆ of first order is defined by:

\[ \Delta y_t = y_t - y_{t-1}. \]
⇒ y_t^* = ∆y_t = y_t − y_{t−1} is a linear filter (difference filter).

⇒ ∆ = 1 − L, i.e. ∆y_t = y_t − y_{t−1} = (1 − L) y_t

I The difference operator ∆^p of order p ≥ 1 is defined recursively by

\[ \Delta^p y_t := \Delta(\Delta^{p-1} y_t) = \Delta^{p-1} y_t - \Delta^{p-1} y_{t-1}, \qquad \Delta^0 y_t = y_t. \]

I Polynomials in L and ∆ may be manipulated in the same way

as polynomials in real or complex variables, e.g.:

\[ \Delta^p = (1 - L)^p = \sum_{j=0}^{p} \binom{p}{j} (-1)^j L^j. \]


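I Difference operator of order 2:

\[ \Delta^2 y_t = \Delta(\Delta y_t) = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}) = y_t - 2 y_{t-1} + y_{t-2} = (1 - 2L + L^2) y_t = (1 - L)^2 y_t \]

I ∆^p removes a polynomial trend of order p (degree p − 1).

I For the example with p = 2:

\[ \Delta(\alpha + \beta t) = [\alpha + \beta t] - [\alpha + \beta (t - 1)] = \beta \]

\[ \Rightarrow \Delta^2(\alpha + \beta t) = \Delta[\Delta(\alpha + \beta t)] = \Delta(\beta) = 0 \]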
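
The effect of differencing on a linear trend can be checked numerically. This is a small sketch in plain Python; the helper `diff` and the values of α and β are hypothetical choices made only for illustration.

```python
def diff(y, order=1):
    """Apply the difference operator (1 - L) 'order' times to the list y."""
    for _ in range(order):
        y = [y[t] - y[t - 1] for t in range(1, len(y))]
    return y


alpha, beta = 2.0, 0.5  # illustrative trend parameters
trend = [alpha + beta * t for t in range(10)]

d1 = diff(trend)           # first difference: constant beta
d2 = diff(trend, order=2)  # second difference: identically zero
```

Each application of the operator shortens the series by one observation, which is why `d2` has two fewer elements than `trend`.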


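I The first difference of the log TS describes the growth rate:

\[ \Delta \ln(y_t) = \ln(y_t) - \ln(y_{t-1}) = \ln\left(\frac{y_t}{y_{t-1}}\right) = \ln\left(1 + \frac{y_t - y_{t-1}}{y_{t-1}}\right) \approx \frac{y_t - y_{t-1}}{y_{t-1}} = \frac{\Delta y_t}{y_{t-1}}. \]
I Moving average filter (of order q):
\[ y_t^* = \frac{1}{2q+1} \sum_{j=-q}^{q} y_{t-j} = c(L) y_t, \quad \text{where} \quad c_j = \begin{cases} \frac{1}{2q+1}, & \text{if } |j| \le q \\ 0, & \text{otherwise.} \end{cases} \]

I Seasonal difference filter (for seasonality s):

\[ \Delta_s y_t = y_t - y_{t-s} = (1 - L^s) y_t. \]
. For example, s = 4 in case of quarterly data.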
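
How close the log difference is to the exact growth rate can be seen on a short series. The numbers below are made up for illustration; the sketch only compares the two quantities.

```python
import math

levels = [100.0, 102.0, 103.5, 101.0]  # hypothetical level data

# Delta ln(y_t) = ln(y_t) - ln(y_{t-1})
log_growth = [math.log(levels[t]) - math.log(levels[t - 1])
              for t in range(1, len(levels))]

# (y_t - y_{t-1}) / y_{t-1}
exact_growth = [(levels[t] - levels[t - 1]) / levels[t - 1]
                for t in range(1, len(levels))]

gaps = [abs(a - b) for a, b in zip(log_growth, exact_growth)]
```

For growth rates of a few percent the gap is small; it widens as the period-to-period changes become larger, because ln(1 + x) ≈ x only holds for small x.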


Classical decomposition
I Many economic TS exhibit trends and seasonal patterns that

are informative but often not of interest for the study.

I Classical decomposition: y_t can be written as the sum of a

trend (t_t), seasonal (s_t) and random (r_t) component:

\[ y_t = t_t + s_t + r_t \]
⇒ Detrended and deseasonalized time series:

\[ \hat{r}_t = y_t - \hat{t}_t - \hat{s}_t. \]

I Estimate the trend parametrically (e.g. linear, t_t = α_0 + α_1 t) or by

filtering (see last slide); estimate seasonality by trigonometric

functions → seasonally adjusted data are often available.


Autocovariance and autocorrelation

I Assume: yt is realization of real-valued random variable Yt .
I Autocovariance:

γ(t, s) := Cov[Yt , Ys ] = E[(Yt − E[Yt ])(Ys − E[Ys ])]

I Autocorrelation:

\[ \rho(t, s) := \mathrm{Corr}[Y_t, Y_s] = \frac{\mathrm{Cov}[Y_t, Y_s]}{\sqrt{V[Y_t]}\,\sqrt{V[Y_s]}} = \frac{\gamma(t, s)}{\sqrt{\gamma(t, t)}\,\sqrt{\gamma(s, s)}}, \]

where γ(t, t) = V[Y_t] := E[(Y_t − E[Y_t])²].


I Describing the dependence of a variable between different

time points is a main issue in time series analysis.

I Autocovariance/autocorrelation describes linear dependence!

I Obviously it holds: γ(s, t) = γ(t, s).

I γ(s, t) = 0 ⇒ Ys and Yt are uncorrelated, but can nonetheless

(strongly) depend on each other.

I Special case: If (Y_t, Y_s)' follows a bivariate normal distribution, then

γ(s, t) = 0 ⇔ Y_s and Y_t are stochastically independent.


Sample moments
I Besides plots, sample moments may serve as exploratory tools.

. Meaningful in particular for so-called stationary time series

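I Sample mean describes the central location of the series:

\[ \bar{y} = \frac{1}{T} \sum_{t=1}^{T} y_t. \]

I Sample variance describes the variation of the series:

\[ \hat{\sigma}^2 = \frac{1}{T} \sum_{t=1}^{T} (y_t - \bar{y})^2. \]
I Sample standard deviation: \hat{\sigma} = \sqrt{\hat{\sigma}^2}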


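I Sample autocovariance function at lag h (h = 0, 1, ..., T − 1):

\[ \hat{\gamma}_h = \frac{1}{T} \sum_{t=h+1}^{T} (y_t - \bar{y})(y_{t-h} - \bar{y}) \quad \text{or} \quad \hat{\gamma}_h = \frac{1}{T-h} \sum_{t=h+1}^{T} (y_t - \bar{y})(y_{t-h} - \bar{y}). \]

I Sample autocorrelation function (measures dependence over time):

\[ \hat{\rho}_h = \hat{\gamma}_h / \hat{\gamma}_0 \quad (|\hat{\rho}_h| \le 1). \]
I (Auto-)correlogram: Plot \hat{\rho}_h against h.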
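
The sample autocorrelations behind a correlogram can be computed directly from these definitions. The sketch below uses the 1/T normalization for the sample autocovariance; the function name `sample_acf` and the toy series are assumptions made for illustration.

```python
def sample_acf(y, max_lag):
    """Sample autocorrelations rho_hat_0, ..., rho_hat_max_lag (1/T normalization)."""
    T = len(y)
    ybar = sum(y) / T

    def gamma_hat(h):
        # sum over t = h+1, ..., T in the slide's notation (0-based indices here)
        return sum((y[t] - ybar) * (y[t - h] - ybar) for t in range(h, T)) / T

    g0 = gamma_hat(0)
    return [gamma_hat(h) / g0 for h in range(max_lag + 1)]


y = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]  # toy alternating series
rho = sample_acf(y, 2)                 # approximately [1, -5/6, 2/3]
```

Plotting `rho` against the lag index gives the correlogram; with the 1/T normalization the estimated autocorrelations shrink toward zero as the lag approaches T.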

6.2 Stochastic processes

6.2.1 Basic concepts
I The analysis of time series (TS) requires a suitable

mathematical model for the data.

I Each observation yt of the TS is considered as a realization of

a random variable (RV) Yt .

I The observed TS {yt }t∈T0 is a realization of the family of RVs

{Yt }t∈T0 .
I The observed TS is (part of ) a realization of a stochastic

process {Yt }t∈T , T0 ⊆ T .


Stochastic process
I Definition. A stochastic process (SP) is a family of RVs

{Y_t}_{t∈T} defined on a probability space (Ω, A, P).

I Here: T = Z and T_0 = {1, ..., T}.

I {Y_t}_{t∈T} is defined on Ω × T, with values in some space E.
. In this course: E = R (univariate TS).

. Or: E = R^K (multivariate TS).

I For each fixed t ∈ T, Y_t is a RV, i.e. Y_t = Y_t(·) : Ω → E.

I For each fixed ω ∈ Ω, Y_·(ω) : T → E is a function of time.


SP, cont.
I The functions {Y· (ω)}ω∈Ω on T are called realizations (or

trajectories) of the SP {Yt }t∈T .

I The image space E of the SP {Yt }t∈T is called state (or

phase) space.

I Frequently:

. The term TS is used for both the data and the SP,

. there is no distinction in notation between the RV Yt and its

realization yt = Yt (ω), if meaning is clear from the context.


Example 1
I Let X ∼ N(0, 1) (defined on some space Ω) and define a SP

{Y_t}_{t∈N} by
\[ Y_t = (-1)^t X \]
(i.e., more explicitly, Y_t(ω) = (−1)^t X(ω), ω ∈ Ω, t ∈ N).
I Realizations of this SP: functions of t obtained by fixing ω:

\[ y_t = (-1)^t x, \quad \text{where } x = X(\omega) \]

I One realization of the SP: x = 0.45, t = 1, ..., 20


Example 2 (Binary process)

I Let Y_t, t = 1, 2, ..., be a sequence of i.i.d. random variables with

\[ P(Y_t = 1) = P(Y_t = -1) = \tfrac{1}{2} \quad \text{for all } t. \]
⇒ It is not as obvious as in Example 1 that there exists a

probability space (Ω, A, P) with RVs Y_t defined on Ω having

the required joint distributions, i.e. such that for all n ∈ N and

all i_1, ..., i_n ∈ {−1, 1}:

\[ P(Y_1 = i_1, \dots, Y_n = i_n) = 2^{-n} \]
(finite-dimensional distributions). The existence of such a SP

is however guaranteed by Kolmogorov's theorem.


Finite-dimensional distributions
I An important characteristic of a (real-valued) SP is the

collection of its finite-dimensional distribution functions

\[ F_{t_1, \dots, t_n}(\cdot, \dots, \cdot), \]

which are defined for all t_1, ..., t_n with

t_1 < t_2 < ... < t_n, n = 1, 2, ... by

\[ F_{t_1, \dots, t_n}(y_1, \dots, y_n) = P(Y_{t_1} \le y_1, \dots, Y_{t_n} \le y_n). \]

I In Example 1, we were able to define Y_t(ω) quite explicitly for

each t and ω.
I In contrast, investigations frequently start by specifying the

collection of all finite-dimensional distributions, cp. Example 2.


Fundamental problem of time series analysis

I Aim: Inference about properties/characteristics of {Yt }.
. We try to explain a variable with regard to its own past (and
the history of random disturbances).

I But: We observe only one single trajectory of the TS.

. We cannot go back in time, let history run again, and observe
another realization of the TS!

I If we could do that several times:

⇒ For each year: several observations
⇒ Averaging over this cross-sectional dimension would provide
a consistent estimator of E[Yt ] for each year t.
. Similarly, other features (higher moments or the distribution
itself, etc.) of the process could be estimated for each year.


I Sound inference requires:

1. Yt 's have common (or similar) characteristics: If distribution of

Yt does not change over time, the observed values yt of
history can be viewed as realizations of the same distribution.
⇒ Concept of stationarity

2. Observations over time can be used to infer on properties of

each Yt (population properties): If the SP is not too
persistent, each observation yt contains information not
available from the other elements.
⇒ Concept of ergodicity
I 1. & 2. ⇒ The TS average over time (i.e. \frac{1}{T} \sum_{t=1}^{T} Y_t) is a
consistent estimator of the population average E[Y_t].


6.2.2 Stationarity and ergodicity

Weak stationarity

I Definition. The time series {Y_t}_{t∈Z} is called weakly

stationary (or covariance stationary), if:

. E|Y_t|² < ∞ ∀ t ∈ Z,
. E[Y_t] = µ ∀ t ∈ Z, and

. γ(s, t) = γ(s + r, t + r) ∀ s, t, r ∈ Z.


Autocovariance function under stationarity

I If {Y_t}_{t∈Z} is weakly stationary, then

\[ \gamma(s, t) = \gamma(t - s, 0) = \gamma(h, 0) = \gamma(-h, 0) \quad \forall\, s, t, \text{ with } h := t - s. \]
⇒ γ(s, t) depends only on |t − s|.
I Redefinition of the autocovariance function (at lag h):
\[ \gamma(h) := \gamma(h, 0) = \mathrm{Cov}[Y_t, Y_{t-h}] \quad \forall\, t, h \in \mathbb{Z} \]
with γ(−h) = γ(h) and V[Y_t] := γ(0) (for all t ∈ Z).
I Autocorrelation function (ACF): ρ(h) := γ(h)/γ(0).
. The ACF describes the short-term dynamics of a TS (in
contrast, the trend characterizes the long-run behavior).


Empirical autocorrelation function

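\[ \hat{\rho}(h) = \frac{\hat{\gamma}(h)}{\hat{\gamma}(0)} = \frac{\sum_{t=h+1}^{T} (y_t - \bar{y})(y_{t-h} - \bar{y})}{\sum_{t=1}^{T} (y_t - \bar{y})^2}, \]
where
\[ \bar{y} := \frac{1}{T} \sum_{t=1}^{T} y_t, \qquad \hat{\gamma}(0) = \hat{\sigma}_y^2 := \frac{1}{T} \sum_{t=1}^{T} (y_t - \bar{y})^2. \]

I Compare Slide 22 for a first definition.

. Now, the notation has slightly changed.

. Under stationarity, the theoretical ACF is estimated!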


Strict stationarity

I Definition. The TS {Y_t}_{t∈Z} is called strictly stationary, if for

all n ∈ N and for all t_1, ..., t_n, h ∈ Z the (finite-dimensional
marginal) distributions of (Y_{t_1}, ..., Y_{t_n}) and
(Y_{t_1+h}, ..., Y_{t_n+h}) are identical, i.e.
\[ (Y_{t_1}, \dots, Y_{t_n})' \overset{d}{=} (Y_{t_1+h}, \dots, Y_{t_n+h})'. \]

. That is, the finite-dimensional marginal distributions of a

strictly stationary process are shift-invariant.


Relations between stationarity concepts

I If {Y_t} is strictly stationary with E|Y_t|² < ∞, then {Y_t} is

also weakly stationary.

I The converse of the above implication does not hold in

general, but it holds for Gaussian processes.

I {Y_t} is called a Gaussian process if its finite-dimensional

distributions are multivariate normal (Gaussian) distributions.


I Let ε_t ∼ N(0, σ²) i.i.d. Then:

1. Y_t = µ + ε_t is (strictly and weakly) stationary.

2. Y_t = βt + ε_t is not stationary (because of the time trend).

3. Y_t = Y_{t−1} + ε_t with initial condition Y_0 = 0 is not stationary.

 - Random walk
 - E[Y_t] ≡ 0
 - γ(s, t) = σ² min(s, t)

I In what follows, the focus will be on second order processes

{Y_t} (i.e. with E|Y_t|² < ∞) and on weak stationarity.


I We have only one realization of the SP and thus only the
time average \bar{Y} = \frac{1}{T} \sum_{t=1}^{T} Y_t.

⇒ When does this converge to µ = E[Y_t]?

I Definition. A weakly stationary SP {Y_t}_{t∈Z} is called ergodic

for the mean µ = E[Y_t], if

\[ \bar{Y} = \frac{1}{T} \sum_{t=1}^{T} Y_t \overset{p}{\to} \mu. \]
. Requires that γ(h) goes to 0 as h → ∞.
. Sufficient condition:

\[ \sum_{h=0}^{\infty} |\gamma(h)| < \infty. \tag{1} \]


I Definition. A weakly stationary process {Y_t} is called ergodic

for the second moments, if

\[ \frac{1}{T-h} \sum_{t=h+1}^{T} (Y_t - \mu)(Y_{t-h} - \mu) \overset{p}{\to} \gamma(h) \quad \forall\, h. \]

I If {Y_t} is a Gaussian process, (1) implies ergodicity

for all moments.

I Often, stationarity and ergodicity have the same requirements.


I Assume

\[ Y_{it} = \mu + \lambda_i + \nu_{it}, \quad i = 1, \dots, I; \; t = 1, \dots, T, \]

where the λ_i and ν_it are independent for all i, t with

λ_i ∼ N(0, σ_λ²) i.i.d. and

ν_it ∼ N(0, σ_ν²) i.i.d.

I The process {Y_it}_{t∈Z} is weakly stationary, but it is not ergodic.
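
A short simulation illustrates why the time average of one path of this process does not recover µ: the component λ_i never averages out along a single path. The sketch below uses plain Python; the values of µ, λ_i, the path length and the seed are hypothetical, and λ_i is held fixed at one illustrative value instead of being drawn.

```python
import random

random.seed(0)

mu, T = 2.0, 20000
lam = 1.5  # one hypothetical draw of lambda_i, constant along the whole path

# One path Y_{i1}, ..., Y_{iT} with nu_{it} ~ N(0, 1)
path = [mu + lam + random.gauss(0.0, 1.0) for _ in range(T)]

time_average = sum(path) / T  # settles near mu + lam, not near mu
```

Averaging over many units i at a fixed t would instead average out λ_i and recover µ, which is the cross-sectional average that a single time series cannot provide.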



6.2.3 Linear processes

White noise processes
I White noise processes are basic building blocks for many other processes.

1. A SP {ε_t} is said to be a white noise process (with mean zero

and variance σ²), written {ε_t} ∼ WN(0, σ²), if

\[ E[\varepsilon_t] \equiv 0 \quad \text{and} \tag{2} \]

\[ E[\varepsilon_t \varepsilon_s] = \begin{cases} \sigma^2, & \text{if } t = s \\ 0, & \text{otherwise.} \end{cases} \tag{3} \]

. Obviously, a WN process is weakly stationary with

autocovariance function γ(h) given by (3) with h = t − s.
White noise processes, cont.

2. If, in addition to (2) and (3), ε_t and ε_s are independent for t ≠ s,
then {ε_t} is an independent white noise process, written

{ε_t} ∼ IWN(0, σ²).
3. If ε_t ∼ N(0, σ²) i.i.d., then {ε_t} is a Gaussian white noise

process, written

{ε_t} ∼ GWN(0, σ²).
I Clearly,

{ε_t} ∼ GWN ⇒ {ε_t} ∼ IID ⇒ {ε_t} ∼ IWN ⇒ {ε_t} ∼ WN

I The designation "white" originates from the analogy with

white light: it indicates that all possible periodic oscillations

are present with equal strength.


Simulated GWN(0,1) process

[Figure: simulated path, t = 0, ..., 700]



Linear processes
I Let {εt } ∼ WN(0, σ 2 ).
I Let {c_j}_{j∈Z} be a sequence of real-valued, absolutely summable

coefficients, i.e.

\[ \sum_{j=-\infty}^{\infty} |c_j| := \sum_{j=0}^{\infty} |c_j| + \sum_{j=1}^{\infty} |c_{-j}| < \infty. \tag{4} \]

I Then {c_j}_{j∈Z}, or the associated lag polynomial

\[ c(L) = \sum_{j=-\infty}^{\infty} c_j L^j, \]

is called an absolutely summable linear filter.


I Applying a linear filter to a WN process (and possibly adding

a constant) provides a general linear process:

\[ Y_t = (\mu +)\; c(L) \varepsilon_t = (\mu +) \sum_{j=-\infty}^{\infty} c_j \varepsilon_{t-j} = (\mu +) \sum_{j=0}^{\infty} c_j \varepsilon_{t-j} + \sum_{j=1}^{\infty} c_{-j} \varepsilon_{t+j} \]

I Linear filters could be defined for arbitrary processes {ε_t}_{t∈Z};

but WN processes are of particular interest in applications.


Existence of a linear process

I Existence is no problem as long as c_j ≠ 0 holds only for a

finite number of the coefficients.

I Otherwise (4) assures existence, because then:

\[ \sum_{j=-\infty}^{\infty} |c_j \varepsilon_{t-j}| < \infty \quad \text{with probability one for } t \in \mathbb{Z}. \]
⇒ The sequence \sum_{j=-n}^{n} c_j ε_{t−j} converges almost surely to the
corresponding limiting value Y_t (resp. Y_t − µ).
I This a.s. limit coincides with the mean square limit, which

exists even under the weaker condition

\[ \sum_{j=-\infty}^{\infty} |c_j|^2 < \infty. \]

Weak stationarity of linear processes

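I Let
(i) {ε_t}_{t∈Z} be weakly stationary with E[ε_t] = µ_ε and
autocovariance function γ_ε, and
(ii) {c_j}_{j∈Z} be absolutely summable.

I Then

\[ Y_t = \sum_{j=-\infty}^{\infty} c_j \varepsilon_{t-j} = c(L) \varepsilon_t \]
is weakly stationary with
\[ \mu_Y = E[Y_t] = \Big( \sum_{j=-\infty}^{\infty} c_j \Big) \mu_\varepsilon = c(1) \mu_\varepsilon, \]
\[ \gamma_Y(h) = \sum_{j=-\infty}^{\infty} \sum_{i=-\infty}^{\infty} c_i c_j \gamma_\varepsilon(h + i - j). \]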
Causality and ergodicity

I An absolutely summable filter {c_j}_{j∈Z} is called causal, if

c_j = 0 ∀ j < 0.

I If {ε_t}_{t∈Z} ∼ WN(0, σ²) and {c_j}_{j∈Z} is absolutely summable

and causal, then the SP {Y_t}_{t∈Z} defined by

\[ Y_t := c(L) \varepsilon_t = \sum_{j=0}^{\infty} c_j \varepsilon_{t-j} \]

is weakly stationary and causal.

. Causality means that Y_t depends on ε_t, ε_{t−1}, ..., but not on

ε_{t+1}, ε_{t+2}, ....
. More precisely, {Y_t}_{t∈Z} is causal w.r.t. {ε_t}_{t∈Z}.


I Then it holds that

µ_Y = E[Y_t] ≡ 0, and

\[ \gamma_Y(h) = E[Y_t Y_{t-h}] = \sigma^2 \sum_{j=0}^{\infty} c_j c_{j+h} = \gamma_Y(-h) \quad (h \in \mathbb{N}). \]

I Ergodicity for the mean follows from the absolute summability

of the filter {c_j}.

I Under (4), ergodicity for the second moments follows e.g. if

ε_t ∼ (0, σ²) i.i.d. and E[ε_t^4] < ∞.

I The SP {Y_t} is ergodic for all moments, if

{ε_t} ∼ GWN(0, σ²).

6.3.1 AR and other ARMA processes

Autoregressive (AR) processes
I An autoregressive process {Y_t}_{t∈Z} of order p, denoted AR(p),

satisfies the following difference equation (for every t):

\[ Y_t = c + \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + \varepsilon_t, \quad \{\varepsilon_t\} \sim WN(0, \sigma^2). \]
I Lag operator notation:

\[ \Phi_p(L) Y_t = c + \varepsilon_t, \quad \text{where} \quad \Phi_p(L) = 1 - \phi_1 L - \dots - \phi_p L^p \]
is the autoregressive (AR) polynomial.

I Usually, only a stationary solution to the AR(p) equations is

called an AR(p) process.


I AR model relates a TS to its past values and a current shock.

I E.g., forecasting the inflation rate X_t is of interest for

. investors at the stock market (how much to pay for bonds?),
. central banks (to decide about monetary policy), or
. firms (to forecast sales of their products).

I Fitting an AR(1) model for quarterly changes Y_t = ∆X_t of the

U.S. inflation rate (1962-2004, see Stock & Watson, 2007):

\[ \hat{y}_t = 0.017 - 0.238\, y_{t-1}. \]

. Increase of the inflation rate in one quarter ⇒ decrease of the inflation
rate in the next quarter.
. x_T = 3.5 (%), x_{T−1} = 1.6 ⇒ y_T = 1.9 (with T = 2004:4)
⇒ \hat{y}_{T+1|T} = 0.017 − 0.238 y_T = −0.43 ≈ −0.4
⇒ Forecast of X_{T+1}: \hat{x}_{T+1|T} = x_T + \hat{y}_{T+1|T} = 3.5 − 0.4 = 3.1
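
The forecast arithmetic can be retraced in a few lines. The sketch uses the rounded coefficients printed above; the variable names are illustrative. With these rounded values the predicted change comes out at about −0.435, which agrees with the slide after rounding to one decimal; the small gap to the reported −0.43 presumably reflects rounding of the published coefficients.

```python
# Rounded AR(1) estimates for the change in inflation, as printed on the slide
c_hat, phi_hat = 0.017, -0.238

x_T, x_T_minus_1 = 3.5, 1.6   # inflation in 2004:4 and in the quarter before
y_T = x_T - x_T_minus_1       # last observed change in inflation

y_forecast = c_hat + phi_hat * y_T   # predicted change for T+1
x_forecast = x_T + y_forecast        # predicted inflation level for T+1
```

The level forecast is obtained by adding the predicted change back to the last observed level, because the model was fitted to first differences.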


Moving average (MA) processes

I A moving average process {Y_t}_{t∈Z} of order q, denoted

MA(q), is given by:

\[ Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}, \quad \{\varepsilon_t\} \sim WN(0, \sigma^2). \]

I Lag operator notation:

\[ Y_t = \mu + \Theta_q(L) \varepsilon_t, \quad \text{where} \quad \Theta_q(L) = 1 + \theta_1 L + \dots + \theta_q L^q \]

is the moving average (MA) polynomial.

. The value of the TS is influenced by current and past shocks.


Autoregressive-moving average (ARMA)

I An autoregressive-moving average process {Y_t}_{t∈Z} of order

(p, q), denoted ARMA(p, q), satisfies (for every t):

\[ Y_t = c + \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}, \]
where {ε_t} ∼ WN(0, σ²).
I Lag operator notation:

\[ \Phi_p(L) Y_t = c + \Theta_q(L) \varepsilon_t, \quad \text{where} \]

\[ \Phi_p(L) = 1 - \phi_1 L - \dots - \phi_p L^p, \qquad \Theta_q(L) = 1 + \theta_1 L + \dots + \theta_q L^q. \]
. Again, only a stationary solution to the ARMA(p, q) equations
is usually called an ARMA(p, q) process.
Weak stationarity of an AR(1) process

I Is there a stationary process {Y_t} satisfying the AR(1) equation

\[ Y_t = c + \phi Y_{t-1} + \varepsilon_t, \quad \{\varepsilon_t\} \sim WN(0, \sigma^2)? \tag{5} \]

I If |φ| < 1 (stability condition), then there exists a unique,

weakly stationary and causal solution to (5):

\[ Y_t = \mu + \sum_{j=0}^{\infty} \phi^j \varepsilon_{t-j}, \qquad \mu = \frac{c}{1 - \phi}. \]

I This is an MA(∞) representation of the process with

absolutely summable coefficients, which is obviously causal

(and mean ergodic, since the coefficients are absolutely summable).


I Existence and uniqueness hold e.g. in the mean square sense.

I For |φ| = 1, there is no stationary solution.

I If |φ| > 1 (explosive case), then there exists a unique and

weakly stationary solution to (5) given by

\[ Y_t = \mu - \sum_{j=1}^{\infty} \phi^{-j} \varepsilon_{t+j}. \]

. But this solution is not causal!

. Do not confuse this solution with the non-stationary solution

obtained when starting with any RV Y_0 which is uncorrelated
with the WN!


Moments of the AR(1) process

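I For |φ| < 1:

\[ E[Y_t] := \mu = c \sum_{j=0}^{\infty} \phi^j = \frac{c}{1 - \phi} \quad \forall\, t, \]

\[ V[Y_t] = \sigma^2 \sum_{j=0}^{\infty} \phi^{2j} = \frac{\sigma^2}{1 - \phi^2} \quad \forall\, t, \]

\[ \gamma(h) = \sigma^2 \frac{\phi^{|h|}}{1 - \phi^2}, \qquad \rho(h) = \frac{\gamma(h)}{\gamma(0)} = \phi^{|h|}. \]
I Note: The ACF satisfies the difference equation (for h > 0):
ρ(h) = φρ(h − 1).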
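
These closed-form moments are easy to tabulate. The following sketch evaluates them for given parameters; the function name `ar1_moments` and the chosen parameter values are assumptions for illustration.

```python
def ar1_moments(c, phi, sigma2, max_lag):
    """Mean, variance, autocovariances and ACF of a stationary causal AR(1)."""
    if abs(phi) >= 1:
        raise ValueError("the formulas require |phi| < 1")
    mean = c / (1 - phi)
    variance = sigma2 / (1 - phi ** 2)
    gamma = [variance * phi ** h for h in range(max_lag + 1)]  # gamma(0..max_lag)
    rho = [g / gamma[0] for g in gamma]                        # equals phi ** h
    return mean, variance, gamma, rho


mean, variance, gamma, rho = ar1_moments(c=1.0, phi=0.5, sigma2=1.0, max_lag=3)
```

The list `rho` satisfies the difference equation ρ(h) = φρ(h − 1) by construction, so it can serve as a reference when reading sample correlograms of simulated AR(1) paths.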
The partial autocorrelation function (PACF)

I Let {Yt } be a weakly stationary process.

I Its partial autocorrelation function (PACF) α(h) at lag h

. is the correlation between Y_t and Y_{t+h} adjusted for the
observations Z := (Y_{t+1}, ..., Y_{t+h−1})', i.e. for h ≥ 2:

\[ \alpha(h) = \mathrm{Corr}\left[ Y_t - \hat{E}(Y_t \mid Z),\; Y_{t+h} - \hat{E}(Y_{t+h} \mid Z) \right], \]

where \hat{E}(Y_t | Z) is the best linear prediction of Y_t based on Z,
. or, equivalently, the last coefficient in a linear projection of Y_t
on its h most recent values.

⇒ α(h) = φ_hh (h = 1, 2, ...) in the following AR(h) regression:

\[ Y_t = c + \phi_{1h} Y_{t-1} + \dots + \phi_{hh} Y_{t-h} + \varepsilon_t, \]

which allows its estimation by OLS.
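
Given an ACF, the coefficients φ_hh can also be obtained without running regressions. The sketch below uses the Durbin-Levinson recursion, which is a different route than the OLS approach named above and is shown here only as an illustration; the function name `pacf_from_acf` is an assumption.

```python
def pacf_from_acf(rho):
    """PACF alpha(1), ..., alpha(H) from rho = [1, rho(1), ..., rho(H)].

    Uses the Durbin-Levinson recursion; phi_prev holds the coefficients
    phi_{k-1,1}, ..., phi_{k-1,k-1} of the previous projection.
    """
    H = len(rho) - 1
    pacf = []
    phi_prev = []
    for k in range(1, H + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # update phi_{k,j} = phi_{k-1,j} - phi_kk * phi_{k-1,k-j}
        phi_prev = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                    for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


# Theoretical ACF of an AR(1) with phi = 0.5: rho(h) = 0.5 ** h
alpha = pacf_from_acf([0.5 ** h for h in range(5)])
```

For the AR(1) input the recursion returns φ at lag 1 and zero at all higher lags, the cut-off pattern that is used to identify the order of an AR process from its PACF.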
Simulated AR(1) process with c = 0, φ = 0.5, {ε_t} ∼ GWN(0, 1)

[Figure: simulated path, t = 0, ..., 700]



Simulated AR(1) process: Correlogram

[Figure: ACF and partial ACF of series Y, lags 0 to 25]

Lag    ACF      PACF
1      0.510    0.510
2      0.209   -0.070
3      0.073   -0.007
4      0.036    0.019
5      0.018   -0.003
6      0.019    0.013
7      0.016    0.002
8      0.006   -0.006
9     -0.002   -0.004
10    -0.036   -0.044
11    -0.021    0.022
12    -0.042   -0.046
13    -0.049   -0.015
14    -0.005    0.046
15    -0.017   -0.044



Simulated AR(1) process: Now φ = 0.99

[Figure: simulated path, t = 0, ..., 700]



Simulated AR(1) process: Correlogram

[Figure: ACF and partial ACF of series Y, lags 0 to 25]

Lag    ACF      PACF
1      0.962    0.962
2      0.927    0.018
3      0.895    0.024
4      0.863   -0.013
5      0.832   -0.002
6      0.802    0.005
7      0.773   -0.011
8      0.745    0.001
9      0.712   -0.077
10     0.678   -0.036
11     0.643   -0.042
12     0.608   -0.025
13     0.575    0.005
14     0.542   -0.014
15     0.507   -0.053



Simulated AR(1) process: Now φ = −0.9

[Figure: simulated path, t = 0, ..., 700]



Simulated AR(1) process: Correlogram

[Figure: ACF and partial ACF of series Y, lags 0 to 25]

Lag    ACF      PACF
1     -0.899   -0.899
2      0.808    0.002
3     -0.730   -0.018
4      0.666    0.038
5     -0.603    0.028
6      0.542   -0.019
7     -0.488   -0.008
8      0.436   -0.025
9     -0.381    0.041
10     0.337    0.021
11    -0.290    0.044
12     0.244   -0.028
13    -0.198    0.031
14     0.158   -0.003
15    -0.125    0.002



Weak stationarity of AR(p) processes

I Let {ε_t} ∼ WN(0, σ²). Then there is a unique, weakly

stationary and causal solution to the AR(p) equations

\[ Y_t = c + \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + \varepsilon_t, \quad \text{i.e.} \quad \Phi_p(L) Y_t = c + \varepsilon_t, \quad \text{with } \Phi_p(L) = 1 - \sum_{j=1}^{p} \phi_j L^j, \]

if all roots z_1, ..., z_p of the (characteristic) AR polynomial

\[ \Phi_p(z) = 1 - \phi_1 z - \dots - \phi_p z^p \quad (z \in \mathbb{C}) \]
lie outside the unit circle, i.e. |z_j| > 1 for j = 1, ..., p.
. This condition is called the stability condition and can be
equivalently expressed as

\[ \Phi_p(z) \neq 0 \quad \forall\, |z| \le 1. \]
I Stability condition ⇒ MA(∞) representation:

\[ Y_t = \mu + \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j}, \qquad \sum_{j=0}^{\infty} |\psi_j| < \infty, \qquad \mu = \frac{c}{\Phi_p(1)}. \]
I Ψ(L) = \sum_{j=0}^{\infty} ψ_j L^j is the inverse filter of Φ_p(L), i.e.

\[ \Phi_p(z) \Psi(z) = 1 \quad \text{for all } |z| \le 1. \]

I Factorization of the characteristic polynomial:
\[ \Phi_p(z) = \prod_{j=1}^{p} \left( 1 - \frac{1}{z_j} z \right) \]

I The process is nonstationary if |z_j| = 1 for some j ∈ {1, ..., p}.

I |z_j| ≠ 1 for all j = 1, ..., p but |z_k| < 1 for at least one k
⇒ The AR(p) equations have a weakly stationary, non-causal solution.


Stationarity check for AR(2) processes: Examples

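1. Y_t = 2 + (5/6) Y_{t−1} − (1/6) Y_{t−2} + ε_t, {ε_t} ∼ WN(0, σ²).

\[ \Phi_2(z) = 1 - \tfrac{5}{6} z + \tfrac{1}{6} z^2 = 0 \;\Leftrightarrow\; z^2 - 5z + 6 = 0 \;\Leftrightarrow\; z_{1,2} = \tfrac{5}{2} \pm \sqrt{\tfrac{25}{4} - 6} = \tfrac{5}{2} \pm \tfrac{1}{2} \;\Leftrightarrow\; z_1 = 3,\; z_2 = 2 \]

⇒ {Y_t} is weakly stationary and causal, since |z_i| > 1 for i = 1, 2.

2. Y_t = 1 + (9/4) Y_{t−1} − (1/2) Y_{t−2} + ε_t, {ε_t} ∼ WN(0, σ²).

\[ \Phi_2(z) = 1 - \tfrac{9}{4} z + \tfrac{1}{2} z^2 = 0 \;\Leftrightarrow\; z^2 - \tfrac{9}{2} z + 2 = 0 \;\Leftrightarrow\; z_{1,2} = \tfrac{9}{4} \pm \sqrt{\tfrac{81}{16} - 2} = \tfrac{9}{4} \pm \tfrac{7}{4} \;\Leftrightarrow\; z_1 = 4,\; z_2 = \tfrac{1}{2} \]

⇒ {Y_t} is weakly stationary (|z_i| ≠ 1), but non-causal (|z_2| < 1).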
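
The root check for an AR(2) process can be automated with the quadratic formula. The sketch below is plain Python; the function names `ar2_roots` and `classify` and the tolerance for detecting a unit root are illustrative assumptions.

```python
import cmath


def ar2_roots(phi1, phi2):
    """Roots of Phi_2(z) = 1 - phi1*z - phi2*z^2 (requires phi2 != 0)."""
    # Equivalent quadratic: phi2*z^2 + phi1*z - 1 = 0
    disc = cmath.sqrt(phi1 ** 2 + 4 * phi2)
    return ((-phi1 + disc) / (2 * phi2), (-phi1 - disc) / (2 * phi2))


def classify(roots, tol=1e-12):
    """Classify the AR equations by the moduli of the characteristic roots."""
    moduli = [abs(z) for z in roots]
    if any(abs(m - 1) < tol for m in moduli):
        return "no stationary solution"
    if all(m > 1 for m in moduli):
        return "stationary and causal"
    return "stationary but non-causal"


roots_1 = ar2_roots(5 / 6, -1 / 6)   # Example 1: roots 2 and 3
roots_2 = ar2_roots(9 / 4, -1 / 2)   # Example 2: roots 1/2 and 4
label_1 = classify(roots_1)
label_2 = classify(roots_2)
```

Using `cmath.sqrt` keeps the function valid when the discriminant is negative, in which case the roots are complex conjugates and only their modulus matters for the classification.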


Moments of the AR(p) process

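I Under the stability condition it holds:
\[ E[Y_t] := \mu = \frac{c}{1 - \sum_{j=1}^{p} \phi_j} = \frac{c}{\Phi_p(1)}, \]
\[ V[Y_t] = \sigma^2 + \sum_{j=1}^{p} \phi_j \gamma(j), \]
\[ \gamma(h) = \sum_{j=1}^{p} \phi_j \gamma(h - j) \quad \text{for } h > 0. \]

⇒ Yule-Walker equations (difference equations of order p):

\[ \rho(h) = \sum_{j=1}^{p} \phi_j \rho(h - j) \quad \text{for } h \ge 1 \tag{6} \]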
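MA process
I MA(q) process: Y_t = µ + \sum_{j=1}^{q} θ_j ε_{t−j} + ε_t = µ + Θ_q(L) ε_t
I Moments: E[Y_t] := µ and, with θ_0 := 1,
\[ \gamma(h) = \begin{cases} \sigma^2 \sum_{k=0}^{q-|h|} \theta_k \theta_{k+|h|}, & \text{if } |h| \le q \\ 0, & \text{if } |h| > q \end{cases} \]
I Without any condition on the parameters, the process exists, is

weakly stationary, causal and ergodic for the mean.

I Special case q = 1:
\[ \gamma(0) = V[Y_t] = (1 + \theta^2) \sigma^2, \qquad \gamma(1) = \theta \sigma^2, \qquad \gamma(h) = 0 \text{ for } h > 1, \]
\[ \rho(1) = \frac{\theta}{1 + \theta^2}, \qquad -\frac{1}{2} \le \rho(1) \le \frac{1}{2}. \]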
Simulated MA(1) process with µ = 0, θ = 0.9, {ε_t} ∼ GWN(0, 1)

[Figure: simulated path, t = 0, ..., 700]



Simulated MA(1) process: Correlogram

[Figure: ACF and partial ACF of series Y, lags 0 to 25]

Lag    ACF      PACF
1      0.479    0.479
2     -0.042   -0.352
3     -0.024    0.254
4      0.057   -0.109
5      0.103    0.179
6      0.044   -0.142
7      0.043    0.202
8      0.066   -0.124
9      0.035    0.123
10     0.049   -0.041
11     0.048    0.060
12     0.002   -0.084
13    -0.039    0.016
14    -0.044   -0.055
15    -0.013    0.028



Invertibility of ARMA processes

I Stationary and causal AR(MA) processes have an MA(∞)
representation with absolutely summable coefficients.

I Invertible ARMA processes have an AR(∞) representation with

absolutely summable coefficients.

I Definition. The ARMA(p, q) process Φ(L)Y_t = Θ(L)ε_t is invertible,

if there exists a sequence of constants {π_j} with

\[ \sum_{j=0}^{\infty} |\pi_j| < \infty \quad \text{(absolute summability) and} \]

\[ \varepsilon_t + \nu = \sum_{j=0}^{\infty} \pi_j Y_{t-j}, \quad \pi_0 = 1 \quad (\nu \text{ is some constant}). \]
\[ \Big( \Leftrightarrow\; Y_t = \nu - \sum_{j=1}^{\infty} \pi_j Y_{t-j} + \varepsilon_t \Big) \]
One can show that...

I If the roots of the MA polynomial

\[ \Theta_q(z) = 1 + \theta_1 z + \dots + \theta_q z^q \]
are outside the unit circle, then the ARMA(p, q) process,

\[ Y_t = c + \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}, \]

is invertible.

I Moreover, the ARMA(p, q) process is weakly stationary and

causal, if the roots of the AR polynomial

\[ \Phi_p(z) = 1 - \phi_1 z - \dots - \phi_p z^p \]
lie outside the unit circle (stability condition).

. Weak stationarity follows if Φ_p(z) ≠ 0 ∀ |z| = 1.

6.3.2 Estimation and forecasting

I Consider a weakly stationary and causal AR(p) process:
\[ y_t = c + \sum_{j=1}^{p} \phi_j y_{t-j} + \varepsilon_t = x_t' \beta + \varepsilon_t, \quad \{\varepsilon_t\} \sim WN(0, \sigma^2), \]

with x_t = (1, y_{t−1}, ..., y_{t−p})' and β = (c, φ_1, ..., φ_p)'.

I Regressors y_{t−j}, j = 1, ..., p, do not depend on ε_t, ε_{t+1}, ...
⇒ Regressors are not strictly exogenous but predetermined.

⇒ x_t and ε_t are uncorrelated: E[x_t ε_t] = 0.

⇒ The OLS estimator of β is biased, but (under mild conditions)

consistent and asymptotically normal.

I Alternative estimator: Use Yule-Walker equations (6).


On the consistency of the OLS estimator

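I Let c = 0, x_t = (y_{t−1}, ..., y_{t−p})' and φ = (φ_1, ..., φ_p)':
\[ y_t = \sum_{j=1}^{p} \phi_j y_{t-j} + \varepsilon_t = x_t' \phi + \varepsilon_t \;\Leftrightarrow\; y = X\phi + \varepsilon \]

I Under mild conditions on ε_t, {y_t} is ergodic for the 2nd moments

and z_t = x_t ε_t is stationary and (mean) ergodic.

\[ \Rightarrow\; \frac{X'X}{T} = \frac{1}{T} \sum_{t=1}^{T} x_t x_t' \overset{p}{\to} E[x_t x_t'] = \big(\gamma(h - k)\big)_{h,k=1}^{p} =: \Gamma_p \]
\[ \text{and} \quad \frac{X'\varepsilon}{T} = \frac{1}{T} \sum_{t=1}^{T} x_t \varepsilon_t \overset{p}{\to} E[x_t \varepsilon_t] = 0 \]
\[ \Rightarrow\; \hat{\phi} = (X'X)^{-1} X'y = \phi + \left( \frac{X'X}{T} \right)^{-1} \frac{X'\varepsilon}{T} \overset{p}{\to} \phi \]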
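
The consistency argument can be illustrated by simulation for p = 1, where the OLS estimator reduces to a ratio of two sums. This is a sketch under stated assumptions: Gaussian errors with unit variance, a start value of zero instead of a draw from the stationary distribution, and an arbitrary seed and sample size.

```python
import random

random.seed(1)

phi, T = 0.5, 5000

# Simulate y_t = phi * y_{t-1} + eps_t, started at y_0 = 0 (assumption)
y = [0.0]
for _ in range(T):
    y.append(phi * y[-1] + random.gauss(0.0, 1.0))

# OLS without intercept: phi_hat = sum(y_t * y_{t-1}) / sum(y_{t-1}^2)
num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
phi_hat = num / den
```

With several thousand observations `phi_hat` lies close to the true value; repeating the experiment with a small T would show the finite-sample bias mentioned for the OLS estimator.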
Estimation of ARMA models

I ARMA(p, q) model (q ≥ 1): Maximum Likelihood (difficult)

I For an ARMA(p, q) model (q ≠ 0), the OLS estimator is not

directly implementable, since the lagged errors are not observed. In practice, one also uses the

following simple approach:

I Step 1: Approximate the ARMA(p, q) by an AR(r) (with

r >> max{p, q}), and apply OLS.

⇒ OLS residuals e_1, ..., e_T.
I Step 2: Use OLS to estimate the model:

\[ y_t = c + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \theta_1 e_{t-1} + \dots + \theta_q e_{t-q} + \varepsilon_t. \]

⇒ Consistent estimator of (c, φ_1, ..., φ_p, θ_1, ..., θ_q)


The general problem of prediction

I Assume that all (functions of) random variables have finite

second moments.

I Aim: prediction of Y on the basis of X = (X_1, ..., X_k)'.

I Suppose that Y is predicted by \hat{Y} = \hat{Y}(X).
⇒ Prediction error: Y − \hat{Y}
I Performance measure: Mean Squared Error of Prediction

\[ \mathrm{MSEP}(\hat{Y}) = E(Y - \hat{Y})^2 \]


Result 1: Best (mean square) prediction

I The (population) regression function f^*(X) := E[Y|X] is the
best (mean square) prediction of Y on the basis of X, i.e.

\[ E(Y - E[Y \mid X])^2 = \min_{f:\; E f(X)^2 < \infty} E[Y - f(X)]^2. \]

I Note that E[Y|X] is an unbiased prediction, i.e.

\[ E(Y - E[Y \mid X]) = 0. \]

⇒ MSEP(E[Y|X]) = E(Y − E[Y|X])² is just the variance of the

prediction error (Y − E[Y|X]).


Result 2: Best linear prediction

I Linear predictions are easier to obtain than E[Y |X ].
I If V[X ] is nonsingular, then the linear (population) regression

\[ \ell^*(X) = \hat E[Y|X] := E[Y] + \mathrm{Cov}(Y, X)(V[X])^{-1}(X - E[X]) \]

is the best linear (mean square) prediction of Y on the basis of X , i.e.

\[ E(Y - \hat E[Y|X])^2 = \min_{\beta_0,\ldots,\beta_k} E\Big[\Big(Y - \beta_0 - \sum_{j=1}^{k} \beta_j X_j\Big)^2\Big] . \]

I Note that Ê[Y |X ] is also an unbiased prediction.
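
As a small numerical illustration of Result 2, the sample analogue of Cov(Y , X )(V[X ])−1 can be compared with OLS coefficients. The data-generating process and all names below are hypothetical and chosen only for this sketch.

```r
# Best linear predictor from sample moments versus OLS (illustrative sketch).
set.seed(2)
n <- 5000
X <- cbind(X1 = rnorm(n), X2 = rnorm(n))
Y <- 1 + 2 * X[, 1] - X[, 2] + rnorm(n)     # hypothetical linear model

# Sample analogue of V[X]^{-1} Cov(X, Y) and of the intercept E[Y] - beta' E[X].
beta  <- solve(var(X), cov(X, Y))           # k x 1 slope vector
beta0 <- mean(Y) - sum(colMeans(X) * beta)

ols <- coef(lm(Y ~ X))                      # OLS with intercept
print(cbind(moment_based = c(beta0, beta), ols = ols))
```

Both columns coincide up to numerical precision, which reflects that OLS with an intercept is the sample version of the linear population regression.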


Forecasting with ARMA models

I Aim: Prediction of YT +h based on X = (Y1 , . . . , YT )0
. h denotes the forecast horizon.

I By Result 1, YT +h|T := E[YT +h |YT , . . . , Y1 ] is the best

h-step-ahead forecast.

I In practice, it is easier to derive best linear forecasts (Result 2).

I But in case of linear processes such as ARMA models, one can

often proceed as follows:

. We derive the best forecast under a stronger assumption such

as IWN error terms.

. If this forecast is linear, it must be the (unique!) best linear

forecast under the weaker assumption of WN errors.


Example: Forecasting an AR(p) process

I Let {Yt } ∼ AR(p) be weakly stationary and causal.

I Assume: {εt } ∼ IWN(0, σ 2 ), h = 1, T ≥ p ⇒

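\[
\begin{aligned}
Y_{T+1|T} &= E[Y_{T+1} \mid Y_T, \ldots, Y_1] \\
&= E\Big[c + \sum_{j=1}^{p} \phi_j Y_{T+1-j} + \varepsilon_{T+1} \,\Big|\, Y_T, \ldots, Y_1\Big] \\
&= c + \sum_{j=1}^{p} \phi_j E[Y_{T+1-j} \mid Y_T, \ldots, Y_1] + E[\varepsilon_{T+1} \mid Y_T, \ldots, Y_1] \\
&= c + \phi_1 Y_T + \ldots + \phi_p Y_{T+1-p}
\end{aligned}
\]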
⇒ This is the best linear 1-step-ahead forecast under WN errors.

. YT +h|T can be obtained recursively (h = 2, 3, . . .).
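
The recursion for h = 2, 3, . . . replaces unknown future values by their own forecasts. A minimal sketch, assuming known parameters; the function name ar_forecast, the sample y and the coefficient values are hypothetical.

```r
# Recursive h-step forecasts for an AR(p) with known (assumed) parameters.
ar_forecast <- function(y, c0, phi, h) {
  p    <- length(phi)
  path <- c(tail(y, p), numeric(h))      # last p observations, then forecasts
  for (i in seq_len(h)) {
    lags <- path[(p + i - 1):i]          # Y_{T+i-1}, ..., Y_{T+i-p} (forecasts where needed)
    path[p + i] <- c0 + sum(phi * lags)
  }
  tail(path, h)
}

y  <- c(1.2, 0.7, 1.5, 0.9)              # hypothetical sample, T = 4
fc <- ar_forecast(y, c0 = 0.5, phi = c(0.6, -0.2), h = 3)
print(fc)                                # 0.74, 0.764, 0.8104
```

In applications the parameters would be replaced by estimates, for example from OLS as in Section 6.3.2.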

Example: Forecasting an ARMA(1,1) process

I AR(∞) representation of an (invertible) ARMA(1,1) process:

\[ Y_t - \mu = (\phi + \theta) \sum_{i=1}^{\infty} (-\theta)^{i-1} (Y_{t-i} - \mu) + \varepsilon_t \]

I Assume again: {εt } ∼ IWN(0, σ 2 ), h = 1

⇒ Optimal forecast of YT +1 based on the infinite past:

\[ Y^*_{T+1|T} := E[Y_{T+1} \mid Y_T, Y_{T-1}, \ldots] = \mu + (\phi + \theta) \sum_{i=1}^{\infty} (-\theta)^{i-1} (Y_{T+1-i} - \mu). \]

I Approximate Y∗T +1|T by truncating the infinite sum at T.
I In practice, we replace the parameters by estimates ⇒ ŶT +h|T .
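
The truncated forecast can be sketched as follows. The parameter values (µ = 2, φ = 0.6, θ = 0.3), the simulated sample and the function name arma11_forecast are assumptions for illustration; the parameters are treated as known.

```r
# One-step forecast from the truncated AR(infinity) form of an ARMA(1,1).
arma11_forecast <- function(y, mu, phi, theta) {
  n <- length(y)
  i <- seq_len(n)                         # truncate the infinite sum at T
  w <- (phi + theta) * (-theta)^(i - 1)   # weights on Y_{T+1-i} - mu
  mu + sum(w * (rev(y) - mu))
}

set.seed(3)
y <- as.numeric(arima.sim(list(ar = 0.6, ma = 0.3), n = 300)) + 2   # mu = 2
f_trunc <- arma11_forecast(y, mu = 2, phi = 0.6, theta = 0.3)
print(f_trunc)
```

Truncating at T is equivalent to setting all pre-sample deviations from µ to zero, so the result agrees with the recursion µ + φ(YT − µ) + θεT when the errors are computed recursively from zero starting values.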


6.3.3 The Box-Jenkins program

(1) If necessary, transform the data, so that the assumption of

weak stationarity is reasonable (see also Section 6.4).

. Main tool of Box & Jenkins (1976): (seasonal) differencing

(2) Model identification: Propose suitable lag orders p and q for

the ARMA model.

(3) Estimate the parameters of the ARMA(p , q) model.

. see Section 6.3.2

(4) Model validation/diagnostic check: is the model consistent

with the observed features of the data?

(5) Forecasting: Prediction of future values of the process (see

Section 6.3.2) and forecast evaluation.


(2) Model identification

I Evaluate the empirical (P)ACF.

. The ACF of an MA(q ) process is zero beyond lag q.

. The PACF of an AR(p ) process is zero beyond lag p.

I Make an initial guess of small values for lag order p and q for

a suitable ARMA model.

I For specifying an appropriate model, you could also use some

information criterion (cf. step (4))
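
The cut-off behaviour of the theoretical ACF and PACF can be checked with ARMAacf() from the stats package. The MA(2) and AR(2) coefficients below are illustrative choices, not taken from the slides.

```r
# Theoretical ACF of an MA(2) and PACF of an AR(2) (illustrative parameters).
acf_ma2  <- ARMAacf(ma = c(0.5, 0.4), lag.max = 5)               # lags 0..5
pacf_ar2 <- ARMAacf(ar = c(0.5, 0.3), lag.max = 5, pacf = TRUE)  # lags 1..5
print(round(acf_ma2, 3))    # zero beyond lag q = 2
print(round(pacf_ar2, 3))   # zero beyond lag p = 2
```

In samples, acf() and pacf() give the empirical counterparts, which are only approximately zero beyond the respective order.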


(4) Model validation/ Diagnostic analysis

I Calculation of residuals

\[ e_t := \frac{\hat\Phi_p(L)}{\hat\Theta_q(L)} Y_t - \frac{\hat c}{\hat\Theta_q(1)} . \]

Example: For AR(p ): $e_t = Y_t - \hat c - \hat\phi_1 Y_{t-1} - \ldots - \hat\phi_p Y_{t-p}$.

I Under WN/IWN/GWN assumption: The et are approximately

uncorrelated/independent/independently normally distributed.

I Analysis of the ACF/PACF and if necessary Jarque-Bera test.

I Plot (standardized) residuals:

. Roughly 95 % should be within ±1.96.


Portmanteau tests (for autocorrelation)

I Test for the nonexistence of autocorrelation among the errors εt :
H0 : ρε (1) = . . . = ρε (h) = 0
H1 : ρε (j) ≠ 0 for some j ∈ {1, . . . , h}
I Portmanteau statistic (Box and Pierce, 1970):

\[ Q_{BP}(h) = T \sum_{j=1}^{h} \hat\rho_\varepsilon(j)^2 . \]

I If εt are i.i.d., then we have (approximately, under H0 )

\[ Q_{BP}(h) \sim \chi^2_{h-m} , \]

where m (= p + q) is the number of parameters to be estimated.

I Ljung and Box (1978) (Q-statistic ):

\[ Q_{LB}(h) = T(T+2) \sum_{j=1}^{h} \frac{\hat\rho_\varepsilon(j)^2}{T-j} \overset{H_0}{\sim} \chi^2_{h-m} \]

I Note that ρ̂ε (j) estimates the true correlations between the

εt 's by means of residuals after fitting the ARMA model.

I The Ljung-Box statistic has greater power in smaller samples.

I Reject H0 ⇔ $Q_{LB}(h) > \chi^2_{h-m,\,1-\alpha}$ (the (1 − α) quantile of the χ2 distribution with h − m degrees of freedom)


I Attention: Rejecting the null hypothesis can also be due to

nonlinear dependences (Ex: GARCH effects)!
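
Both portmanteau statistics can be computed directly from the residual autocorrelations and compared with Box.test() from the stats package. The simulated AR(1) data, the horizon h = 10 and the variable names are assumptions for this sketch.

```r
# Box-Pierce and Ljung-Box statistics by hand (illustrative sketch).
set.seed(4)
y   <- arima.sim(list(ar = 0.5), n = 200)
fit <- arima(y, order = c(1, 0, 0))
e   <- as.numeric(residuals(fit))
n <- length(e); h <- 10; m <- 1          # m = p + q = 1 for the AR(1)

rho  <- acf(e, lag.max = h, plot = FALSE)$acf[-1]   # rho_hat(1), ..., rho_hat(h)
Q_BP <- n * sum(rho^2)
Q_LB <- n * (n + 2) * sum(rho^2 / (n - seq_len(h)))
p_LB <- pchisq(Q_LB, df = h - m, lower.tail = FALSE)

# Cross-check against the built-in implementation; fitdf subtracts m from the df.
bt <- Box.test(e, lag = h, type = "Ljung-Box", fitdf = m)
print(c(Q_BP = Q_BP, Q_LB = Q_LB, p_LB = p_LB))
```

Since (T + 2)/(T − j) > 1 for every j, the Ljung-Box statistic is always at least as large as the Box-Pierce statistic.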


Breusch-Godfrey LM Test for autocorrelation

I Assume, e.g., an AR(p) model for Yt :
Yt = φ1 Yt−1 + . . . + φp Yt−p + εt .

I Test H0 : {εt } ∼ WN against H1 : {εt } ∼ AR(r ).

1. Perform the auxiliary regression for residuals et (after fitting the

AR(p ) model):

et = α1 Yt−1 + . . . + αp Yt−p + β1 et−1 + . . . + βr et−r + νt .

2. Calculate the test statistic

\[ LM = T \cdot R^2 \sim \chi^2_r , \]

where R2 is the coefficient of determination of the auxiliary regression.

3. Reject H0 ⇔ $LM > \chi^2_{r,\,1-\alpha}$.
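
The three steps can be sketched with lm() as follows. The simulated AR(1) data (so that H0 holds), the choices p = 1 and r = 2, the use of the uncentered R2 (the regressions contain no intercept, as on the slide) and all names are assumptions for illustration.

```r
# Breusch-Godfrey LM test by hand (illustrative sketch).
set.seed(5)
n <- 300
y <- as.numeric(arima.sim(list(ar = 0.5), n = n))   # AR(1) with WN errors
p <- 1; r <- 2

# Fit the AR(p) by OLS (no constant) and keep the residuals.
Z   <- embed(y, p + 1)                   # column 1: Y_t, then Y_{t-1}, ..., Y_{t-p}
fit <- lm(Z[, 1] ~ Z[, -1] - 1)
e   <- resid(fit)

# Step 1: auxiliary regression of e_t on the lagged Y's and r lagged residuals.
E   <- embed(e, r + 1)                   # e_t and e_{t-1}, ..., e_{t-r}
Yl  <- Z[-seq_len(r), -1, drop = FALSE]  # matching rows of the lagged Y's
aux <- lm(E[, 1] ~ Yl + E[, -1] - 1)

# Step 2: LM = (number of observations) x R^2 of the auxiliary regression.
R2 <- 1 - sum(resid(aux)^2) / sum(E[, 1]^2)
LM <- nrow(E) * R2

# Step 3: compare with the chi-square(r) distribution.
p_value <- pchisq(LM, df = r, lower.tail = FALSE)
print(c(LM = LM, p_value = p_value))
```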

Model comparison
I Akaike Information Criterion (AIC):

\[ \mathrm{AIC} := -2 \ln[L(\hat\theta)] + 2m \]

I Bayes Information Criterion (BIC):

\[ \mathrm{BIC} := -2 \ln[L(\hat\theta)] + m \ln(T) \]

where L(θ̂) is the likelihood function at the point θ̂ (MLE),
and m denotes the number of model parameters.
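
As a worked example, the two formulas can be applied to the log-likelihoods reported in the arima() outputs of Section 6.4.3. Two assumptions are made here: m counts the error variance as a parameter (this reproduces the reported aic values up to rounding), and the number of observations of y = ∆x is taken to be 103.

```r
# AIC and BIC from reported log-likelihoods (worked arithmetic).
logL <- c(AR1 = -234.13, AR4 = -230.4, AR4_restricted = -230.93)
# Free parameters including sigma^2 (assumption, see above).
m <- c(AR1 = 3, AR4 = 6, AR4_restricted = 4)
n_obs <- 103   # assumption: y = diff(x) with T = 104 observations on x

aic <- -2 * logL + 2 * m
bic <- -2 * logL + m * log(n_obs)
print(round(rbind(aic, bic), 2))
```

Under these assumptions both criteria are smallest for the restricted AR(4) model, in line with the remark "smallest AIC/BIC" in the application.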


(5) Forecast evaluation

I Out-of-sample prediction:
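. h-periods-ahead forecast of Yt : Ŷt+h|t
. Prediction error: et+h|t := Ŷt+h|t − Yt+h .

I Evaluation of T ∗ forecasts for $t \in \mathcal{T}^*$ with $|\mathcal{T}^*| = T^*$:

. Mean Absolute Prediction Error - MAPE:

\[ \mathrm{MAPE} = \frac{1}{T^*} \sum_{t \in \mathcal{T}^*} |\hat Y_{t+h|t} - Y_{t+h}| = \frac{1}{T^*} \sum_{t \in \mathcal{T}^*} |e_{t+h|t}| . \]

. Root Mean Square Prediction Error - RMSPE:

\[ \mathrm{RMSPE} = \sqrt{\frac{1}{T^*} \sum_{t \in \mathcal{T}^*} (\hat Y_{t+h|t} - Y_{t+h})^2} = \sqrt{\frac{1}{T^*} \sum_{t \in \mathcal{T}^*} (e_{t+h|t})^2} . \]

I Pseudo-out-of-sample: e.g.

\[ \mathcal{T}^* = \{T - h, \ldots, T - h - T^* + 1\} \]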
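
A pseudo-out-of-sample evaluation along these lines can be sketched as follows. The simulated AR(1) series, the horizon h = 1, the number of evaluation forecasts (n_eval, playing the role of T ∗) and the re-estimation at every forecast origin are assumptions for this example.

```r
# Pseudo-out-of-sample forecast evaluation (illustrative sketch).
set.seed(6)
y <- as.numeric(arima.sim(list(ar = 0.6), n = 160))
n <- length(y); h <- 1; n_eval <- 40

origins <- (n - h - n_eval + 1):(n - h)   # forecast origins t in the evaluation set
err <- sapply(origins, function(t) {
  fit <- arima(y[1:t], order = c(1, 0, 0))        # re-estimate with data up to t
  predict(fit, n.ahead = h)$pred[h] - y[t + h]    # e_{t+h|t} = forecast - outcome
})

mape  <- mean(abs(err))
rmspe <- sqrt(mean(err^2))
print(c(MAPE = mape, RMSPE = rmspe))
```

The RMSPE is never smaller than the MAPE, because it weights large errors more heavily.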
6.4 Nonstationary processes

6.4.1 Unit root processes
I Many economic TS show a trending behavior (e.g. German

⇒ Stationarity assumption is unrealistic.

I Possible reasons for nonstationarity are, for example:

1. A nonstable mean due to deterministic trends, seasonality,

breaks in deterministic components etc. (deterministic
nonstationarity), or

2. a root of the AR-polynomial of the process that lies on the

unit circle (e.g. a unit root; stochastic nonstationarity).


Trend-stationary processes
I A trend-stationary process is given by

Yt = δ0 + δt + Ψ(L)εt , {εt } ∼ WN(0, σ 2 ),


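\[ \Psi(L)\varepsilon_t = \psi_0 \varepsilon_t + \psi_1 \varepsilon_{t-1} + \ldots, \quad \psi_0 = 1, \quad \sum_{j=0}^{\infty} |\psi_j| < \infty . \]

I $\tilde Y_t := Y_t - \delta_0 - \delta t = \Psi(L)\varepsilon_t$ is a weakly stationary and causal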

process with zero mean.

. Any weakly stationary and causal ARMA process (with mean

zero) has such an MA(∞) representation!

⇒ E[Yt ] = δ0 + δt.


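Example: Trend-stationary AR(1) process

I Let Xt be a zero-mean weakly stationary and causal AR(1) process:

Xt = φXt−1 + εt , |φ| < 1, {εt } ∼ WN(0, σ 2 )

\[ \Rightarrow\ X_t = \Psi(L)\varepsilon_t = \sum_{j=0}^{\infty} \phi^j \varepsilon_{t-j} \]

I Trend-stationary AR(1) process:

\[
\begin{aligned}
Y_t &= \delta_0 + \delta t + X_t \\
&= \delta_0 + \delta t + \phi \underbrace{X_{t-1}}_{= Y_{t-1} - \delta_0 - \delta(t-1)} + \varepsilon_t \\
&= [\delta_0(1-\phi) + \delta\phi] + \delta(1-\phi) t + \phi Y_{t-1} + \varepsilon_t
\end{aligned}
\]
I Last representation: AR(1) process around a linear trend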
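
The equivalence of the two representations can be verified numerically. The sketch uses δ0 = −0.05, δ = 0.05 and φ = 0.7 (the values of the simulated example on the following slide); the starting value X0 = 0 and the variable names are assumptions.

```r
# Two representations of a trend-stationary AR(1) give the same path.
set.seed(7)
n <- 700; d0 <- -0.05; d <- 0.05; phi <- 0.7
eps <- rnorm(n)

# Form 1: deterministic trend plus zero-mean AR(1) deviations X_t (X_0 = 0 assumed).
x <- numeric(n); x_prev <- 0
for (t in 1:n) { x[t] <- phi * x_prev + eps[t]; x_prev <- x[t] }
y1 <- d0 + d * (1:n) + x

# Form 2: AR(1) around a linear trend with the implied coefficients.
a <- d0 * (1 - phi) + d * phi     # intercept: 0.02
b <- d * (1 - phi)                # trend slope: 0.015
y2 <- numeric(n); y_prev <- d0    # Y_0 = d0 + d * 0 + X_0
for (t in 1:n) { y2[t] <- a + b * t + phi * y_prev + eps[t]; y_prev <- y2[t] }

print(c(intercept = a, slope = b, max_abs_diff = max(abs(y1 - y2))))
```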


Simulated trend-stationary AR(1) process

Yt = −0.05 + 0.05t + Xt , Xt = 0.7Xt−1 + εt , t = 1, . . . , 700
⇔ Yt = 0.02 + 0.015t + 0.7Yt−1 + εt , {εt } ∼ GWN(0, 1)



[Figure: simulated path of Yt , t = 1, . . . , 700]

Simulated trend-stationary AR(1) process:

Correlogram (Sample ACF/PACF of series Y)

Lag   ACF     PACF
1     0.990   0.990
2     0.982   0.104
3     0.974   0.021
4     0.968   0.092
5     0.963   0.071
6     0.959   0.029
7     0.954   0.028
8     0.950   0.010
9     0.946   0.054
10    0.942  -0.001
11    0.937  -0.038
12    0.932  -0.012
13    0.927  -0.042
14    0.923   0.066
15    0.919   0.007

[Figure: sample ACF and partial ACF of series Y, lags 0 to 25]



Integrated processes
I Definition: For d ∈ N0 , a time series $\{Y_t\}_{t=-\infty}^{\infty}$ is called
integrated of order d , denoted {Yt } ∼ I (d), if $\{\Delta^d Y_t\}$ is a

(weakly) stationary process, whereas $\{\Delta^{d-1} Y_t\}$ is not (trend)

stationary.

I Representation: $\Delta^d Y_t = c + \Psi(L)\varepsilon_t$,
. $\sum_{j=0}^{\infty} |\psi_j| < \infty$, ψ0 = 1, Ψ(1) ≠ 0

I Example: ARIMA(p, d, q ) process:

$\Phi_p(L)(1 - L)^d Y_t = c + \Theta_q(L)\varepsilon_t$,

where $W_t \equiv (1 - L)^d Y_t$ is the d -th difference of Yt and

follows a (stationary) ARMA(p, q) process.


Simulated ARIMA(1,1,0) process

(Yt − Yt−1 ) = 0.7(Yt−1 − Yt−2 ) + εt , {εt } ∼ GWN(0, 1)

[Figure: simulated path of Yt , t = 1, . . . , 700]



Simulated ARIMA(1,1,0) process: Correlogram (sample ACF/PACF of series Y)

Lag   ACF     PACF
1     0.995   0.995
2     0.984  -0.692
3     0.967  -0.056
4     0.947  -0.037
5     0.925   0.004
6     0.901   0.086
7     0.877  -0.036
8     0.852  -0.017
9     0.827  -0.020
10    0.802  -0.003
11    0.777  -0.026
12    0.752  -0.040
13    0.727  -0.022
14    0.701   0.063
15    0.676   0.023

[Figure: sample ACF and partial ACF of series Y, lags 0 to 25]



Example of random walk (with drift c )

I Random Walk with drift:

Yt = c + Yt−1 + εt , {εt } ∼ WN(0, σ 2 ).

I With (constant) initial value Y0 : $Y_t = Y_0 + ct + \sum_{j=1}^{t} \varepsilon_j$

E[Yt ] = t · c + Y0
V[Yt ] = t · σ 2
$\rho(t, t-h) = \sqrt{1 - h/t}$ (for 0 ≤ h < t).
I The drift c generates a linear trend!

I For c = 0, the process is called Random Walk.

I Then: {Yt } ∼ ARIMA(0, 1, 0).
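
The moments of the random walk with drift can be checked by simulation. The number of replications, the path length, the drift c = 0.02, σ = 1, Y0 = 0 and the chosen pair (t, h) = (100, 36) are assumptions for this sketch; the simulated values only approximate the theoretical ones.

```r
# Monte Carlo check of mean, variance and correlation of a random walk with drift.
set.seed(8)
R <- 5000; n <- 100; c0 <- 0.02; sigma <- 1
eps   <- matrix(rnorm(R * n, sd = sigma), nrow = R)   # one path per row
paths <- t(apply(eps, 1, cumsum)) +
         matrix(c0 * (1:n), R, n, byrow = TRUE)       # Y_t = c t + sum of shocks, Y_0 = 0

t0 <- 100; h <- 36
m_sim <- mean(paths[, t0])                  # theory: c * t = 2
v_sim <- var(paths[, t0])                   # theory: t * sigma^2 = 100
r_sim <- cor(paths[, t0], paths[, t0 - h])  # theory: sqrt(1 - h/t) = 0.8
print(round(c(mean = m_sim, var = v_sim, cor = r_sim), 3))
```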

Simulated random walk (RW)

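\[ Y_t = Y_{t-1} + \varepsilon_t = Y_0 + \sum_{j=1}^{t} \varepsilon_j , \quad t = 1, \ldots, 700, \quad \{\varepsilon_t\} \sim \mathrm{GWN}(0, 1) \]

[Figure: simulated path of Yt , t = 1, . . . , 700]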


Simulated random walk: Correlogram (sample ACF/PACF of series Y)

Lag   ACF     PACF
1     0.987   0.987
2     0.972  -0.054
3     0.958   0.015
4     0.944  -0.002
5     0.930  -0.002
6     0.914  -0.068
7     0.900   0.023
8     0.886   0.010
9     0.873   0.059
10    0.861  -0.004
11    0.848  -0.037
12    0.836   0.054
13    0.826   0.037
14    0.819   0.081
15    0.810  -0.030

[Figure: sample ACF and partial ACF of series Y, lags 0 to 25]



Simulated random walk with drift c = 0.02

\[ Y_t = 0.02 + Y_{t-1} + \varepsilon_t = Y_0 + 0.02\,t + \sum_{j=1}^{t} \varepsilon_j , \quad t = 1, \ldots, 700 \]

[Figure: simulated path of Yt , t = 1, . . . , 700]


Simulated RW with drift: Correlogram (sample ACF/PACF of series Y)

Lag   ACF     PACF
1     0.996   0.996
2     0.991  -0.005
3     0.987  -0.017
4     0.982   0.017
5     0.978  -0.019
6     0.973   0.018
7     0.969   0.008
8     0.965  -0.014
9     0.961   0.025
10    0.956   0.007
11    0.952  -0.015
12    0.948   0.004
13    0.944   0.021
14    0.940  -0.012
15    0.936   0.013

[Figure: sample ACF and partial ACF of series Y, lags 0 to 25]



Integrated processes, I(0) vs I(1)

I(0)                                 | I(1)
stationary process                   | e.g. random walk
mean reverting                       | wanders widely, stochastic trend
effect of error term is temporary    | effect of error term is infinite

I Unit root process ≙ I(1) ⊆ non-stationary.

I 'Near-I(1)' processes: stationary but highly persistent, often

approximated by I(1) processes in empirical applications.


6.4.2 Unit root tests

I Box-Jenkins program assumes that data has been transformed,

if necessary, so that stationarity assumption is reasonable.

I Modelling of trend: Deterministic or stochastic trend?

I Stochastic trend ⇒ (in contrast to deterministic trend):

. the series does not revert to a long-term trend line,

. innovations/shocks have a permanent (non-vanishing) effect,
. the forecast variance does not converge, but increases
(linearly) with the forecast horizon.

⇒ Integration order of the process is of great importance for the

analysis (economic interpretation).

⇒ Interest is in tests that can detect a unit root in the AR

polynomial of an AR(MA) process.

The Dickey Fuller (DF) test

I Consider an AR(1) model: yt = φ1 yt−1 + εt .
I DF-regression:

∆yt = (φ1 − 1)yt−1 + εt or ∆yt = αyt−1 + εt .

I Test H0 : α = 0 (φ1 = 1) vs H1 : α < 0 (φ1 < 1).
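I The asymptotic null distribution of the DF t−statistic is not

standard normal:

\[ DF = \frac{\hat\phi_1 - 1}{\sqrt{\mathrm{Var}(\hat\phi_1)}} = \frac{\hat\alpha}{\sqrt{\mathrm{Var}(\hat\alpha)}} \xrightarrow{d} \frac{\frac{1}{2}\,(W(1)^2 - 1)}{\sqrt{\int_0^1 W(r)^2\,dr}} , \]

where W (·) is a standard Brownian motion and W (1) is its value at r = 1.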

I The asymptotic distribution is called Dickey-Fuller distribution.
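
The non-standard null distribution can be illustrated by simulation. The sketch computes the DF t-statistic (regression without constant) for simulated random walks; the number of replications, the sample length and the function name df_stat are assumptions.

```r
# Monte Carlo distribution of the DF t-statistic under H0 (illustrative sketch).
set.seed(9)
df_stat <- function(y) {
  dy   <- diff(y)
  ylag <- y[-length(y)]
  a    <- sum(ylag * dy) / sum(ylag^2)             # OLS estimate of alpha (no constant)
  s2   <- sum((dy - a * ylag)^2) / (length(dy) - 1)
  a / sqrt(s2 / sum(ylag^2))                       # t-statistic for H0: alpha = 0
}

# Random walks with Y_0 = 0 and Gaussian white noise errors.
stats <- replicate(5000, df_stat(c(0, cumsum(rnorm(250)))))
q05   <- unname(quantile(stats, 0.05))
print(round(c(simulated_5pct = q05, normal_5pct = qnorm(0.05)), 2))
```

The simulated 5 % quantile should come out near the commonly tabulated value of about −1.95 for this case, clearly below the normal quantile −1.645, so using normal critical values would reject a true unit root too often.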


The augmented Dickey-Fuller (ADF) test

I Consider an AR(p ) model for {Yt } without deterministic

terms:

(1 − φ1 L − . . . − φp Lp )Yt = εt ,

I Alternative formulation:

Yt = ρYt−1 + ζ1 ∆Yt−1 + . . . + ζp−1 ∆Yt−p+1 + εt , (7)

where ∆Yt := Yt − Yt−1 , ρ := φ1 + . . . + φp

ζj := −[φj+1 + . . . + φp ] for j = 1, 2, . . . , p − 1.


I If {Yt } ∼ I (1), then there exists a solution to the equation

Φp (z) = (1 − φ1 z − . . . − φp z p ) = 0
that is equal to one (and {Yt } is called a unit root process), i.e.
Φp (1) = 1 − φ1 − . . . − φp = 0 ⇔ ρ = 1.
I In this case the process in (7) is not stationary; if, however, one unit root is

removed from Φp (L), then the resulting process is stationary, provided that

1 was the only root on the unit circle.

I Therefore the ADF test regression equation is given by:

∆Yt = αYt−1 + ζ1 ∆Yt−1 + . . . + ζp−1 ∆Yt−p+1 + εt , (8)

where α = ρ − 1 = φ1 + . . . + φp − 1 = −Φp (1).
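
The mapping from the AR coefficients to ρ, ζ1 , . . . , ζp−1 and α can be checked numerically. The AR(3) coefficients and the variable names below are illustrative assumptions.

```r
# Reparameterization of an AR(p) into the ADF form (illustrative sketch).
phi   <- c(0.5, 0.3, 0.1)                  # illustrative AR(3) coefficients
p     <- length(phi)
rho   <- sum(phi)
zeta  <- -rev(cumsum(rev(phi)))[-1]        # zeta_j = -(phi_{j+1} + ... + phi_p)
alpha <- rho - 1

set.seed(10)
y     <- rnorm(p + 1)                      # arbitrary values Y_{t-3}, ..., Y_t
lags  <- rev(y)[-1]                        # Y_{t-1}, Y_{t-2}, Y_{t-3}
dlags <- -diff(lags)                       # Delta Y_{t-1}, Delta Y_{t-2}

ar_part  <- sum(phi * lags)                # phi_1 Y_{t-1} + ... + phi_p Y_{t-p}
adf_part <- rho * lags[1] + sum(zeta * dlags)
print(c(rho = rho, zeta = zeta, alpha = alpha, difference = ar_part - adf_part))
```

Both forms give the same systematic part for any values of the lagged observations, so (7) is only a rewriting of the AR(p ) model.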


I The test for a unit root considers

H0 : α = 0 vs. H1 : α < 0
and is performed by means of a simple t -test statistic for α.
. If {Yt } is stationary and causal, then α < 0.
. Under H1 , {Yt } might be nonstationary or stationary and
non-causal; but these cases are of little practical relevance.

I Because under H0 the regressor Yt−1 is not stationary, the resulting

t -statistic is neither t -distributed nor asymptotically normal
distributed; it follows a non-standard (Dickey-Fuller) distribution.

I Instead, adapted (simulated) critical values developed by

MacKinnon (1991) have to be used.

I Adding a constant or a linear trend to the ADF regression (8) leads

to different (non-standard) distributions of the t -statistic.


Deterministic terms in ADF regression

I Deterministic terms have different effects under H0 and H1 :
E.g., under H0 (unit root) a constant generates a linear trend

(in contrast to trend stationary case). ⇒ Proposed solution:

. If the data has a nonzero mean (and shows prolonged upward

and downward patterns but no clear overall trend direction),
then include a constant in the regression.

. If the series has a clear (linear) trend direction, then include a

linear term in the regression.

 Exception: E.g. interest rates (no economic theory for trend).

⇒ The model is correctly specified under H1 .

 Under H0 : the corresponding parameter estimates should be
close to zero.


6.4.3 An empirical application with R

I Application of the Box-Jenkins Program to a data set:

I Data: Quarterly U.S. Fixed Investment, from 1947:1 until

1972:4 (xt , t = 1, . . . , T , T = 104), see slides 10, 11, 19, 20.

I It turns out, e.g. by unit root testing, that {xt } ∼ I (1) (unit

root process).

⇒ Select and estimate an appropriate model for quarterly

changes of U.S. fixed investment, yt = ∆xt .

I Check the adequacy of the selected model by diagnostic tests

of residuals.

I Forecast the future values (4 quarters) of xt (nonstationary).


Data: U.S. fixed investment (x )

I Quarterly observations, 1947:1–1972:4 (T = 104)

[Figure: time series plot of U.S. Fixed Investment]



Sample ACF and PACF for U.S. fixed investment

[Figure: sample ACF and partial ACF of series x]

ADF Test for U.S. fixed investment

I ADF regression with linear time trend since time series shows
upward trend:

> library(fUnitRoots)  # adfTest() is provided by the fUnitRoots package
> adfTest(x, type="ct")

Augmented Dickey-Fuller Test

Test Results:
Lag Order: 1
Dickey-Fuller: -2.4072
Quarterly changes of U.S. fixed investment (y )

[Figure: time series plot of Quarterly Changes in U.S. Fixed Investment]



Sample ACF and PACF for quarterly changes of U.S. fixed investment

[Figure: sample ACF and partial ACF of series y]

ADF test for quarterly changes in U.S. fixed investment

I ADF regression with constant since time series shows nonzero mean:

> adfTest(y, type="c")

Augmented Dickey-Fuller Test

Test Results:
Lag Order: 1
Dickey-Fuller: -5.3516
Estimation of an AR(1) model for y

> ar1<-arima(y,order=c(1,0,0))
> ar1

arima(x = y, order = c(1, 0, 0))

ar1 intercept
0.5019 1.0885
s.e. 0.0899 0.4994

sigma^2 estimated as 6.308: log likelihood = -234.13,

aic = 474.25
Estimation of an AR(4) model for y

I Reasonable if PACF of Yt is nonzero.

> ar4<-arima(y,order=c(4,0,0))
> ar4

arima(x = y, order = c(4, 0, 0))

ar1 ar2 ar3 ar4 intercept
0.5264 -0.0968 -0.0155 -0.2043 1.0125
s.e. 0.1015 0.1146 0.1149 0.1023 0.3085

sigma^2 estimated as 5.84: log likelihood = -230.4,

aic = 472.81
Estimation of a restricted AR(4) model for y

I Model: yt = c + ϕ1 yt−1 + ϕ4 yt−4 + εt (smallest AIC/BIC)

> ar4r<-arima(y,order=c(4,0,0),fixed=c(NA,0,0,NA,NA))
> ar4r

arima(x = y, order = c(4, 0, 0), fixed = c(NA, 0, 0, NA, NA))

ar1 ar2 ar3 ar4 intercept
0.4750 0 0 -0.2292 1.0150
s.e. 0.0879 0 0 0.0889 0.3247

sigma^2 estimated as 5.903: log likelihood = -230.93,

aic = 469.86
Some diagnostics for the selected model

[Figure with three panels: Standardized Residuals; ACF of Residuals; p values for Ljung-Box statistic]

Distribution of residuals in selected model

[Figure: Histogram of Residuals]

X-squared 2.8564686
df 2.0000000
p-value 0.2397318



Forecasting next 4 quarters of U.S. fixed investment

# Estimate the selected model in terms of x (not y): a linear trend in x is needed to obtain a constant for y = diff(x)!
> time<-seq(0,length(x)-1,1)
> ar4r_dx<-arima(x,order=c(4,1,0),xreg=time,fixed=c(NA,0,0,NA,NA))
> ar4r_dx
arima(x = x, order = c(4, 1, 0), xreg = time, fixed = c(NA, 0, 0, NA, NA))

ar1 ar2 ar3 ar4 time
0.4750 0 0 -0.2292 1.0150
s.e. 0.0884 0 0 0.0894 0.3263

sigma^2 estimated as 5.963: log likelihood = -228.62, aic = 465.25

> time_new=seq(length(x),length(x)+3,1)

# Forecasting US fixed investment (x)

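# Note: predict() for arima fits takes the forecast horizon as n.ahead. The argument
# ahead=4 is not matched, so the values below appear to be the 1-step forecast plus
# successive trend increments of 1.0150 (and only one standard error is shown).
# For genuine 1- to 4-step forecasts use: predict(ar4r_dx, n.ahead=4, newxreg=time_new)
> predict(ar4r_dx,ahead=4,newxreg=time_new)
         Qtr1     Qtr2     Qtr3     Qtr4
1972          178.0686 179.0836 180.0986
1973 181.1136
1972 2.441928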
# True observations in 1973, Qtr2-Qtr4: 176.1 178.2 186.7