
Introduction to Time Series Analysis. Lecture 7.

Peter Bartlett
Last lecture:
1. ARMA(p,q) models: stationarity, causality, invertibility
2. The linear process representation of ARMA processes: X_t = ψ(B)W_t.
3. Autocovariance of an ARMA process.
4. Homogeneous linear difference equations.

Introduction to Time Series Analysis. Lecture 7.


Peter Bartlett
1. Review: ARMA(p,q) models and their properties
2. Review: Autocovariance of an ARMA process.
3. Homogeneous linear difference equations.
Forecasting
1. Linear prediction.
2. Projection in Hilbert space.

Review: Autoregressive moving average models


An ARMA(p,q) process {X_t} is a stationary process that satisfies

φ(B)X_t = θ(B)W_t,

where φ and θ are polynomials of degree p and q, and {W_t} ∼ WN(0, σ_w²).
We'll insist that the polynomials φ and θ have no common factors.

Review: Properties of ARMA(p,q) models


Theorem: If φ and θ have no common factors, a (unique) stationary solution {X_t} to φ(B)X_t = θ(B)W_t exists iff

φ(z) = 1 − φ_1 z − ⋯ − φ_p z^p = 0   ⟹   |z| ≠ 1.

This ARMA(p,q) process is causal iff

φ(z) = 1 − φ_1 z − ⋯ − φ_p z^p = 0   ⟹   |z| > 1.

It is invertible iff

θ(z) = 1 + θ_1 z + ⋯ + θ_q z^q = 0   ⟹   |z| > 1.

Review: Properties of ARMA(p,q) models

φ(B)X_t = θ(B)W_t,   X_t = ψ(B)W_t   ⟹   θ(B) = ψ(B)φ(B),

so

1 = ψ_0,
θ_1 = ψ_1 − φ_1 ψ_0,
θ_2 = ψ_2 − φ_1 ψ_1 − φ_2 ψ_0,
⋮

We need to solve the linear difference equation φ(B)ψ_j = θ_j, with θ_0 = 1 and θ_j = 0 for j < 0, j > q.
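As a concrete illustration of this recursion (an addition to these notes, not from the original slides), here is a minimal NumPy sketch; the helper name arma_psi_weights and the truncation length are arbitrary choices. It computes ψ_j from φ(B)ψ_j = θ_j and reproduces the ψ-weights of the ARMA(2,1) example used later in this lecture.

```python
import numpy as np

def arma_psi_weights(phi, theta, n_terms=20):
    """psi-weights of the causal linear process X_t = sum_j psi_j W_{t-j},
    from theta(B) = psi(B) phi(B), i.e. psi_j = theta_j + sum_i phi_i psi_{j-i}."""
    phi = np.asarray(phi, dtype=float)      # (phi_1, ..., phi_p)
    theta = np.asarray(theta, dtype=float)  # (theta_1, ..., theta_q)
    psi = np.zeros(n_terms)
    psi[0] = 1.0                            # psi_0 = theta_0 = 1
    for j in range(1, n_terms):
        th = theta[j - 1] if j - 1 < len(theta) else 0.0
        ar = sum(phi[i - 1] * psi[j - i]
                 for i in range(1, len(phi) + 1) if j - i >= 0)
        psi[j] = th + ar
    return psi

# The ARMA(2,1) example used later: (1 + 0.25 B^2) X_t = (1 + 0.2 B) W_t,
# i.e. phi_1 = 0, phi_2 = -0.25, theta_1 = 0.2.
print(arma_psi_weights(phi=[0.0, -0.25], theta=[0.2], n_terms=8))
# [1, 1/5, -1/4, -1/20, 1/16, 1/80, -1/64, -1/320]
```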

Review: Autocovariance functions of ARMA processes

φ(B)X_t = θ(B)W_t,   X_t = ψ(B)W_t,

γ(h) − φ_1 γ(h−1) − ⋯ − φ_p γ(h−p) = σ_w² Σ_{j=h}^{q} θ_j ψ_{j−h}.

We need to solve the homogeneous linear difference equation

φ(B)γ(h) = 0   (h > q),

with initial conditions

γ(h) − φ_1 γ(h−1) − ⋯ − φ_p γ(h−p) = σ_w² Σ_{j=h}^{q} θ_j ψ_{j−h},   h = 0, . . . , q.

Introduction to Time Series Analysis. Lecture 7.


1. Review: ARMA(p,q) models and their properties
2. Review: Autocovariance of an ARMA process.
3. Homogeneous linear difference equations.
Forecasting
1. Linear prediction.
2. Projection in Hilbert space.

Homogeneous linear diff eqns with constant coefficients

a_0 x_t + a_1 x_{t−1} + ⋯ + a_k x_{t−k} = 0
⟺ (a_0 + a_1 B + ⋯ + a_k B^k) x_t = 0
⟺ a(B) x_t = 0

Auxiliary equation:

a_0 + a_1 z + ⋯ + a_k z^k = 0
⟺ (z − z_1)(z − z_2) ⋯ (z − z_k) = 0,

where z_1, z_2, . . . , z_k ∈ ℂ are the roots of this characteristic polynomial.

Thus,

a(B) x_t = 0   ⟺   (B − z_1)(B − z_2) ⋯ (B − z_k) x_t = 0.

Homogeneous linear diff eqns with constant coefficients

a(B) x_t = 0   ⟺   (B − z_1)(B − z_2) ⋯ (B − z_k) x_t = 0.

So any {x_t} satisfying (B − z_i) x_t = 0 for some i also satisfies a(B) x_t = 0.


Three cases:
1. The zi are real and distinct.
2. The zi are complex and distinct.
3. Some zi are repeated.

Homogeneous linear diff eqns with constant coefficients


1. The z_i are real and distinct.
a(B) x_t = 0
⟺ (B − z_1)(B − z_2) ⋯ (B − z_k) x_t = 0
⟺ x_t is a linear combination of solutions to
(B − z_1) x_t = 0, (B − z_2) x_t = 0, . . . , (B − z_k) x_t = 0
⟺ x_t = c_1 z_1^{−t} + c_2 z_2^{−t} + ⋯ + c_k z_k^{−t},
for some constants c_1, . . . , c_k.

Homogeneous linear diff eqns with constant coefficients


1. The z_i are real and distinct. E.g., z_1 = 1.2, z_2 = 1.3.

[Figure: c_1 z_1^{−t} + c_2 z_2^{−t} plotted against t for (c_1, c_2) = (1, 0), (0, 1), and (0.8, 0.2).]

Reminder: Complex exponentials

a + ib = r e^{iθ} = r(cos θ + i sin θ),

where r = |a + ib| = √(a² + b²) and θ = tan⁻¹(b/a) ∈ (−π, π].

Thus,

r_1 e^{iθ_1} · r_2 e^{iθ_2} = (r_1 r_2) e^{i(θ_1 + θ_2)},   z z̄ = |z|².

Homogeneous linear diff eqns with constant coefficients


2. The z_i are complex and distinct.
a(B) x_t = 0

As before,

x_t = c_1 z_1^{−t} + c_2 z_2^{−t} + ⋯ + c_k z_k^{−t}.

If z_1 ∉ ℝ, since a_1, . . . , a_k are real, we must have the complex conjugate root, z_j = z̄_1. And for x_t to be real, we must have c_j = c̄_1. For example:

x_t = c z_1^{−t} + c̄ z̄_1^{−t}
    = r e^{iθ} |z_1|^{−t} e^{−iωt} + r e^{−iθ} |z_1|^{−t} e^{iωt}
    = r |z_1|^{−t} (e^{−i(ωt − θ)} + e^{i(ωt − θ)})
    = 2 r |z_1|^{−t} cos(ωt − θ),

where z_1 = |z_1| e^{iω} and c = r e^{iθ}.

Homogeneous linear diff eqns with constant coefficients


2. The z_i are complex and distinct. E.g., z_1 = 1.2 + i, z_2 = 1.2 − i.

[Figure: c z_1^{−t} + c̄ z̄_1^{−t} plotted against t for c = 1.0 + 0.0i, 0.0 + 1.0i, and 0.8 − 0.2i.]

Homogeneous linear diff eqns with constant coefficients


2. The z_i are complex and distinct. E.g., z_1 = 1 + 0.1i, z_2 = 1 − 0.1i.

[Figure: c z_1^{−t} + c̄ z̄_1^{−t} plotted against t for c = 1.0 + 0.0i, 0.0 + 1.0i, and 0.8 − 0.2i.]

Homogeneous linear diff eqns with constant coefficients


3. Some z_i are repeated.
a(B) x_t = 0
⟺ (B − z_1)(B − z_2) ⋯ (B − z_k) x_t = 0.

If z_1 = z_2,

(B − z_1)(B − z_2) x_t = 0   ⟺   (B − z_1)² x_t = 0.

We can check that (c_1 + c_2 t) z_1^{−t} is a solution...

More generally, (B − z_1)^m x_t = 0 has the solution

(c_1 + c_2 t + ⋯ + c_m t^{m−1}) z_1^{−t}.

Homogeneous linear diff eqns with constant coefficients


3. Some z_i are repeated. E.g., z_1 = z_2 = 1.5.

[Figure: (c_1 + c_2 t) z_1^{−t} plotted against t for (c_1, c_2) = (1, 0), (0, 2), and (0.2, 0.8).]

Solving linear diff eqns with constant coefficients


Solve:

a_0 x_t + a_1 x_{t−1} + ⋯ + a_k x_{t−k} = 0,

with initial conditions x_1, . . . , x_k.

Auxiliary equation in z ∈ ℂ:

a_0 + a_1 z + ⋯ + a_k z^k = 0
⟺ (z − z_1)^{m_1} (z − z_2)^{m_2} ⋯ (z − z_l)^{m_l} = 0,

where z_1, z_2, . . . , z_l ∈ ℂ are the roots of the characteristic polynomial, and z_i occurs with multiplicity m_i.

Solutions: c_1(t) z_1^{−t} + c_2(t) z_2^{−t} + ⋯ + c_l(t) z_l^{−t},
where c_i(t) is a polynomial in t of degree m_i − 1.

We determine the coefficients of the c_i(t) using the initial conditions (which might be linear constraints on the initial values x_1, . . . , x_k).
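This recipe can be carried out numerically. The sketch below is an illustration added to these notes (not from the slides): np.roots finds the roots of the auxiliary polynomial and, assuming the roots are distinct, the constants c_i come from solving the linear system given by the initial conditions.

```python
import numpy as np

def solve_homog_diff_eqn(a, x_init, t_max=20):
    """Solve a_0 x_t + a_1 x_{t-1} + ... + a_k x_{t-k} = 0, assuming the roots
    of the auxiliary equation are distinct, so x_t = sum_i c_i z_i^{-t}."""
    a = np.asarray(a, dtype=float)
    k = len(a) - 1
    z = np.roots(a[::-1])                   # roots of a_0 + a_1 z + ... + a_k z^k
    t_init = np.arange(1, k + 1)
    V = z[np.newaxis, :] ** (-t_init[:, np.newaxis])   # V[t, i] = z_i^{-t}
    c = np.linalg.solve(V, np.asarray(x_init, dtype=complex))
    t = np.arange(1, t_max + 1)
    x = (z[np.newaxis, :] ** (-t[:, np.newaxis])) @ c
    return z, c, x.real                     # x_t is real for real a_j and x_init

# e.g. x_t - 2.5 x_{t-1} + x_{t-2} = 0 (roots 0.5 and 2), with x_1 = 3, x_2 = 5:
z, c, x = solve_homog_diff_eqn([1.0, -2.5, 1.0], x_init=[3.0, 5.0], t_max=6)
print(np.round(x, 4))   # each x_t satisfies x_t = 2.5 x_{t-1} - x_{t-2}
```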

Autocovariance functions of ARMA processes: Example


(1 + 0.25B²) X_t = (1 + 0.2B) W_t,   X_t = ψ(B) W_t,

ψ_j = 1, 1/5, −1/4, −1/20, 1/16, 1/80, −1/64, −1/320, . . . .

γ(h) − φ_1 γ(h−1) − φ_2 γ(h−2) = σ_w² Σ_{j=0}^{q−h} θ_{h+j} ψ_j

⟹ γ(h) + 0.25 γ(h−2) =
    σ_w² (ψ_0 + 0.2 ψ_1)   if h = 0,
    0.2 σ_w²                if h = 1,
    0                       otherwise.

Autocovariance functions of ARMA processes: Example


We have the homogeneous linear difference equation

γ(h) + 0.25 γ(h−2) = 0

for h ≥ 2, with initial conditions

γ(0) + 0.25 γ(−2) = σ_w² (1 + 1/25)
γ(1) + 0.25 γ(−1) = σ_w² / 5.

Autocovariance functions of ARMA processes: Example


Homogeneous lin. diff. eqn:

γ(h) + 0.25 γ(h−2) = 0.

The characteristic polynomial is

1 + 0.25 z² = (1/4)(4 + z²) = (1/4)(z − 2i)(z + 2i),

which has roots at z_1 = 2e^{iπ/2}, z̄_1 = 2e^{−iπ/2}.

The solution is of the form

γ(h) = c z_1^{−h} + c̄ z̄_1^{−h}.

Autocovariance functions of ARMA processes: Example


z_1 = 2e^{iπ/2}, z̄_1 = 2e^{−iπ/2}, c = |c| e^{iθ}.

We have

γ(h) = c z_1^{−h} + c̄ z̄_1^{−h}
     = 2^{−h} (|c| e^{i(θ − hπ/2)} + |c| e^{−i(θ − hπ/2)})
     = c_1 2^{−h} cos(hπ/2 − θ).

And we determine c_1, θ from the initial conditions

γ(0) + 0.25 γ(−2) = σ_w² (1 + 1/25)
γ(1) + 0.25 γ(−1) = σ_w² / 5.

Autocovariance functions of ARMA processes: Example


We determine c_1, θ from the initial conditions:

γ(0) = c_1 cos(θ),
γ(1) = (c_1 / 2) sin(θ),
γ(2) = −(c_1 / 4) cos(θ).

We plug these into

γ(0) + 0.25 γ(2) = σ_w² (1 + 1/25)
1.25 γ(1) = σ_w² / 5.
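As a numerical sanity check (added here, not part of the slides), γ(h) can also be computed directly from a truncated linear-process representation, γ(h) = σ_w² Σ_j ψ_j ψ_{j+h}; σ_w² = 1 is assumed below. Both the homogeneous recursion and the two initial conditions can then be verified.

```python
import numpy as np

sigma2 = 1.0                       # sigma_w^2 (assumed equal to 1 here)
n = 200                            # truncation length for the psi-weights
psi = np.zeros(n)
psi[0], psi[1] = 1.0, 0.2          # psi_0 = 1, psi_1 = 1/5
for j in range(2, n):
    psi[j] = -0.25 * psi[j - 2]    # from (1 + 0.25 B^2) psi(B) = 1 + 0.2 B

def gamma(h):
    # gamma(h) = sigma_w^2 sum_j psi_j psi_{j+h} for the causal linear process
    return sigma2 * np.dot(psi[:n - h], psi[h:])

for h in range(2, 8):
    print(h, gamma(h) + 0.25 * gamma(h - 2))         # ~ 0: homogeneous recursion
print(gamma(0) + 0.25 * gamma(2), 1.25 * gamma(1))   # ~ 1.04 and 0.2: initial conditions
```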

Autocovariance functions of ARMA processes: Example


[Figure: the autocovariance function γ(h) of this example, plotted for h = −10, . . . , 10; it decays geometrically in |h| and oscillates with period 4.]

Introduction to Time Series Analysis. Lecture 7.


1. Review: ARMA(p,q) models and their properties
2. Review: Autocovariance of an ARMA process.
3. Homogeneous linear difference equations.
Forecasting
1. Linear prediction.
2. Projection in Hilbert space.


Review: least squares linear prediction


Consider a linear predictor of X_{n+h} given X_n = x_n:

f(x_n) = α_0 + α_1 x_n.

For a stationary time series {X_t}, the best linear predictor is
f(x_n) = μ(1 − ρ(h)) + ρ(h) x_n:

E(X_{n+h} − (α_0 + α_1 X_n))² ≥ E(X_{n+h} − f(X_n))² = σ² (1 − ρ(h)²).

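A quick simulation sketch (an illustration added to these notes, assuming an AR(1) with φ = 0.6, for which μ = 0, ρ(h) = φ^h and σ² = γ(0) = σ_w²/(1 − φ²)) confirms that the predictor ρ(h)x_n attains mean squared error σ²(1 − ρ(h)²):

```python
import numpy as np

rng = np.random.default_rng(0)
phi, sigma_w, h, n = 0.6, 1.0, 2, 100_000
w = rng.normal(scale=sigma_w, size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + w[t]           # AR(1): X_t = phi X_{t-1} + W_t

mu, rho_h = 0.0, phi**h                    # mean and ACF rho(h) = phi^h
sigma2 = sigma_w**2 / (1 - phi**2)         # gamma(0)

pred = mu * (1 - rho_h) + rho_h * x[:-h]   # best linear predictor of X_{t+h} from X_t
print(np.mean((x[h:] - pred) ** 2))        # empirical mean squared error
print(sigma2 * (1 - rho_h**2))             # sigma^2 (1 - rho(h)^2) = 1.36
```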

Linear prediction
Given X_1, X_2, . . . , X_n, the best linear predictor

X^n_{n+m} = α_0 + Σ_{i=1}^{n} α_i X_i

of X_{n+m} satisfies the prediction equations

E(X_{n+m} − X^n_{n+m}) = 0,
E[(X_{n+m} − X^n_{n+m}) X_i] = 0   for i = 1, . . . , n.

This is a special case of the projection theorem.


Projection Theorem
If H is a Hilbert space,
M is a closed linear subspace of H,
and y ∈ H,
then there is a point Py ∈ M
(the projection of y on M)
satisfying
1. ‖Py − y‖ ≤ ‖w − y‖ for w ∈ M,
2. ‖Py − y‖ < ‖w − y‖ for w ∈ M, w ≠ Py,
3. ⟨y − Py, w⟩ = 0 for w ∈ M.


[Figure: y, its projection Py onto M, and the orthogonal residual y − Py.]

Hilbert spaces
Hilbert space = complete inner product space.
Inner product space: vector space, with inner product ⟨a, b⟩:
⟨a, b⟩ = ⟨b, a⟩,
⟨α_1 a_1 + α_2 a_2, b⟩ = α_1 ⟨a_1, b⟩ + α_2 ⟨a_2, b⟩,
⟨a, a⟩ = 0 ⟺ a = 0.
Norm: ‖a‖² = ⟨a, a⟩.
Complete = limits of Cauchy sequences are in the space.

Examples:
1. ℝⁿ, with the Euclidean inner product ⟨x, y⟩ = Σ_i x_i y_i.
2. {random variables X : EX² < ∞},
with inner product ⟨X, Y⟩ = E(XY).
(Strictly, equivalence classes of a.s. equal r.v.s.)


Projection theorem
Example: Linear regression
Given y = (y_1, y_2, . . . , y_n)′ ∈ ℝⁿ and Z = (z_1, . . . , z_q) ∈ ℝ^{n×q},
choose β = (β_1, . . . , β_q)′ ∈ ℝ^q to minimize ‖y − Zβ‖².

Here, H = ℝⁿ, with ⟨a, b⟩ = Σ_i a_i b_i, and
M = {Zβ : β ∈ ℝ^q} = sp{z_1, . . . , z_q}.


Projection theorem
If H is a Hilbert space,
M is a closed subspace of H,
and y ∈ H,
then there is a point Py ∈ M
(the projection of y on M)
satisfying
1. ‖Py − y‖ ≤ ‖w − y‖ for w ∈ M,
2. ⟨y − Py, w⟩ = 0 for w ∈ M.



Projection theorem
⟨y − Py, w⟩ = 0

⟹ ⟨y − Zβ, z_i⟩ = 0 for each i
⟺ Z′Z β = Z′y
⟺ β = (Z′Z)⁻¹ Z′y

— the normal equations.

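An illustrative NumPy sketch of the normal equations (the data below are simulated; none of the numbers come from the lecture): β is computed as (Z′Z)⁻¹Z′y, compared with a library least-squares solve, and the orthogonality ⟨y − Zβ, z_i⟩ = 0 is checked numerically.

```python
import numpy as np

rng = np.random.default_rng(1)
n, q = 50, 3
Z = rng.normal(size=(n, q))                   # columns z_1, ..., z_q
y = Z @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=n)

# normal equations: Z'Z beta = Z'y (projection of y onto sp{z_1, ..., z_q})
beta = np.linalg.solve(Z.T @ Z, Z.T @ y)
print(beta)
print(np.linalg.lstsq(Z, y, rcond=None)[0])   # same projection via least squares

# orthogonality: the residual y - Z beta is orthogonal to every column of Z
print(np.round(Z.T @ (y - Z @ beta), 10))     # ~ 0
```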

Projection theorem
Example: Linear prediction


Given r.v.s 1, X_1, X_2, . . . , X_n (each with EX² < ∞),
choose α_0, α_1, . . . , α_n ∈ ℝ
so that Z = α_0 + Σ_{i=1}^{n} α_i X_i minimizes E(X_{n+m} − Z)².

Here, ⟨X, Y⟩ = E(XY),
M = {Z = α_0 + Σ_{i=1}^{n} α_i X_i : α_i ∈ ℝ} = sp{1, X_1, . . . , X_n}, and
y = X_{n+m}.


Projection theorem
If H is a Hilbert space,
M is a closed subspace of H,
and y ∈ H,
then there is a point Py ∈ M
(the projection of y on M)
satisfying
1. ‖Py − y‖ ≤ ‖w − y‖ for w ∈ M,
2. ⟨y − Py, w⟩ = 0 for w ∈ M.



Projection theorem: Linear prediction


Let X^n_{n+m} denote the best linear predictor:

‖X^n_{n+m} − X_{n+m}‖² ≤ ‖Z − X_{n+m}‖²   for all Z ∈ M.

The projection theorem implies the orthogonality

⟨X^n_{n+m} − X_{n+m}, Z⟩ = 0   for all Z ∈ M
⟺ ⟨X^n_{n+m} − X_{n+m}, Z⟩ = 0   for all Z ∈ {1, X_1, . . . , X_n}
⟺ E(X_{n+m} − X^n_{n+m}) = 0   and   E[(X_{n+m} − X^n_{n+m}) X_i] = 0 for i = 1, . . . , n.

That is, the prediction errors (X_{n+m} − X^n_{n+m}) are uncorrelated with the prediction variables (1, X_1, . . . , X_n).


Linear prediction
The error (X_{n+m} − X^n_{n+m}) is uncorrelated with the prediction variable 1:

E(X_{n+m} − X^n_{n+m}) = 0
⟺ E(α_0 + Σ_i α_i X_i − X_{n+m}) = 0
⟺ μ (1 − Σ_i α_i) = α_0.

Linear prediction

Substituting for α_0 in

X^n_{n+m} = α_0 + Σ_i α_i X_i,

we get

X^n_{n+m} = μ + Σ_i α_i (X_i − μ).

So we can subtract μ from all variables:

X^n_{n+m} − μ = Σ_i α_i (X_i − μ).

Thus, for forecasting, we can assume μ = 0. So we'll ignore α_0.

One-step-ahead linear prediction

Write

X^n_{n+1} = φ_{n1} X_n + φ_{n2} X_{n−1} + ⋯ + φ_{nn} X_1.

Prediction equations:

E[(X_{n+1} − X^n_{n+1}) X_i] = 0,   for i = 1, . . . , n
⟺ Σ_{j=1}^{n} φ_{nj} E(X_{n+1−j} X_i) = E(X_{n+1} X_i)
⟺ Σ_{j=1}^{n} φ_{nj} γ(i − j) = γ(i)
⟺ Γ_n φ_n = γ_n.

One-step-ahead linear prediction


Prediction equations:   Γ_n φ_n = γ_n,

        [ γ(0)      γ(1)      ⋯   γ(n−1) ]
        [ γ(1)      γ(0)      ⋯   γ(n−2) ]
Γ_n  =  [   ⋮         ⋮       ⋱     ⋮    ]
        [ γ(n−1)    γ(n−2)    ⋯   γ(0)   ]

φ_n = (φ_{n1}, φ_{n2}, . . . , φ_{nn})′,   γ_n = (γ(1), γ(2), . . . , γ(n))′.
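As a sketch of how these equations are used (an addition to the notes, assuming an MA(1) example with θ = 0.5 and σ_w² = 1, so that γ(0) = σ_w²(1 + θ²), γ(1) = θσ_w², and γ(h) = 0 for h ≥ 2), the prediction coefficients can be obtained by solving Γ_n φ_n = γ_n directly:

```python
import numpy as np

theta, sigma2, n = 0.5, 1.0, 5     # MA(1): X_t = W_t + theta W_{t-1} (assumed example)

def gamma(h):
    """Autocovariance of the MA(1) process."""
    h = abs(h)
    if h == 0:
        return sigma2 * (1 + theta**2)
    return sigma2 * theta if h == 1 else 0.0

Gamma_n = np.array([[gamma(i - j) for j in range(n)] for i in range(n)])
gamma_n = np.array([gamma(h) for h in range(1, n + 1)])

phi_n = np.linalg.solve(Gamma_n, gamma_n)   # prediction equations Gamma_n phi_n = gamma_n
print(phi_n)   # coefficients of X^n_{n+1} = phi_n1 X_n + ... + phi_nn X_1
```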

Mean squared error of one-step-ahead linear prediction


P^n_{n+1} = E(X_{n+1} − X^n_{n+1})²
          = E[(X_{n+1} − X^n_{n+1})(X_{n+1} − X^n_{n+1})]
          = E[(X_{n+1} − X^n_{n+1}) X_{n+1}]      (the error is orthogonal to X^n_{n+1})
          = γ(0) − E(φ_n′ X X_{n+1})
          = γ(0) − γ_n′ Γ_n⁻¹ γ_n,

where X = (X_n, X_{n−1}, . . . , X_1)′.



Mean squared error of one-step-ahead linear prediction


Variance is reduced:

P^n_{n+1} = E(X_{n+1} − X^n_{n+1})²
          = γ(0) − γ_n′ Γ_n⁻¹ γ_n
          = Var(X_{n+1}) − Cov(X_{n+1}, X) Cov(X, X)⁻¹ Cov(X, X_{n+1})
          = E(X_{n+1} − 0)² − Cov(X_{n+1}, X) Cov(X, X)⁻¹ Cov(X, X_{n+1}),

where X = (X_n, X_{n−1}, . . . , X_1)′.
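Continuing the illustrative MA(1) sketch from above (θ = 0.5, σ_w² = 1; an assumed example, not from the slides), the following lines show the reduction numerically: P^n_{n+1} stays below Var(X_{n+1}) = γ(0) = 1.25 and decreases toward σ_w² = 1 as n grows.

```python
import numpy as np

theta, sigma2 = 0.5, 1.0          # MA(1): X_t = W_t + theta W_{t-1} (assumed example)

def gamma(h):
    h = abs(h)
    if h == 0:
        return sigma2 * (1 + theta**2)
    return sigma2 * theta if h == 1 else 0.0

for n in (1, 2, 5, 20):
    Gamma_n = np.array([[gamma(i - j) for j in range(n)] for i in range(n)])
    gamma_n = np.array([gamma(h) for h in range(1, n + 1)])
    P = gamma(0) - gamma_n @ np.linalg.solve(Gamma_n, gamma_n)
    print(n, P)                   # always below gamma(0) = 1.25, tending to sigma_w^2 = 1
```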

Introduction to Time Series Analysis. Lecture 7.


Peter Bartlett
1. Review: ARMA(p,q) models and their properties
2. Review: Autocovariance of an ARMA process.
3. Homogeneous linear difference equations.
Forecasting
1. Linear prediction.
2. Projection in Hilbert space.

