EC2019 Econometrics II: Seminar 4 Solution
where c is a non-zero constant, and the error term ε_t satisfies the usual white-noise assumptions:

E(ε_t) = 0    ∀t
E(ε_t^2) = σ^2    ∀t
E(ε_t ε_{t−s}) = 0    ∀s ≠ 0
Let ŷ_{t+h|t} denote the optimal linear prediction of y_{t+h} made at time t. Demonstrate that ŷ_{t+h|t} approaches the unconditional mean of the process as h → ∞. Let e_h denote the prediction error, such that

e_h = y_{t+h} − ŷ_{t+h|t}    (2)

Show that the prediction error variance approaches the unconditional variance of the process as h → ∞.
Recall the unconditional mean and variance of an AR(1) with no deterministic components:

E(y_t) = 0    (3)

V(y_t) = σ^2 / (1 − φ^2)    (4)
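To make (4) concrete with illustrative numbers (mine, not from the question): if φ = 0.5 and σ^2 = 1, then

V(y_t) = 1 / (1 − 0.5^2) = 1/0.75 = 4/3 ≈ 1.33

so persistence (φ ≠ 0) inflates the unconditional variance of y_t above the innovation variance σ^2.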
Optimal linear predictions are derived by taking the conditional expectation of future values at time t. For h = 1, the actual y_{t+1} is generated by

y_{t+1} = φ y_t + ε_{t+1}    (5)

Taking the conditional expectation at time t gives its prediction:

ŷ_{t+1|t} = E(y_{t+1} | I_t) = φ y_t    (6)

y_t is known at time t and hence part of the information set I_t. However, ε_{t+1} in (5) is a future error term. Given that every ε_t is independent of ε_{t−s}, ∀s ≠ 0, E(ε_{t+1} | I_t) = E(ε_{t+1}) = 0; i.e. having the past information does not help predict future random disturbances any better. The forecast error e_1 is the difference between the actual (5) and its prediction (6), which is obviously ε_{t+1}.
It is clear that the general prediction formula for the process of (1) is given by:

ŷ_{t+h|t} = φ ŷ_{t+h−1|t} = φ^h y_t    (11)

Since |φ| < 1 under the stationarity condition, φ^h y_t → 0 = E(y_t) as h → ∞, so the prediction approaches the unconditional mean of the process, as required.
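As an aside (not part of the original solution), the formula in (11) is easy to evaluate; a minimal Python sketch with arbitrary values φ = 0.9 and y_t = 5:

# yhat_{t+h|t} = phi^h * y_t, for illustrative values phi = 0.9, y_t = 5.0
phi, y_t = 0.9, 5.0
for h in (1, 2, 3, 20):
    print(h, round(phi**h * y_t, 4))   # 4.5, 4.05, 3.645, ..., 0.6079

The forecast decays geometrically toward the unconditional mean of zero.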
As for the prediction error and its variance: in order to compute the prediction error variance, we need to express the forecast errors in terms of the ε_t's, since this is the random variable whose statistical properties we know (in this instance, by assumption). The one-step-ahead prediction error is obviously the difference between (5) and (6), which is ε_{t+1}. For h > 1, we need to express y_{t+h} in terms of future errors. For h = 2:

y_{t+2} = φ y_{t+1} + ε_{t+2} = φ^2 y_t + φ ε_{t+1} + ε_{t+2}

For h = 3:

y_{t+3} = φ y_{t+2} + ε_{t+3} = φ^3 y_t + φ^2 ε_{t+1} + φ ε_{t+2} + ε_{t+3}

By following the same pattern, it is clear that the actual observation y_{t+h} can be expressed as:

y_{t+h} = φ^h y_t + φ^{h−1} ε_{t+1} + φ^{h−2} ε_{t+2} + · · · + ε_{t+h}    (12)
Writing the actual and the predicted side by side makes evaluating the prediction error much easier:

y_{t+h} = φ^h y_t + φ^{h−1} ε_{t+1} + · · · + ε_{t+h}        ŷ_{t+h|t} = φ^h y_t
The prediction errors are merely the difference between y_{t+h} and ŷ_{t+h|t}, therefore:

e_1 = ε_{t+1}
e_2 = φ ε_{t+1} + ε_{t+2}
e_3 = φ^2 ε_{t+1} + φ ε_{t+2} + ε_{t+3}
e_4 = φ^3 ε_{t+1} + φ^2 ε_{t+2} + φ ε_{t+3} + ε_{t+4}
...
e_h = φ^{h−1} ε_{t+1} + φ^{h−2} ε_{t+2} + · · · + ε_{t+h}    (13)
V(e_1) = E(ε_{t+1}^2) = σ^2

V(e_2) = E[(φ ε_{t+1} + ε_{t+2})^2]
       = φ^2 E(ε_{t+1}^2) + E(ε_{t+2}^2)
       = σ^2 (1 + φ^2)

where the cross-product term vanishes because E(ε_t ε_{t−s}) = 0 for s ≠ 0. Likewise:

V(e_3) = E[(φ^2 ε_{t+1} + φ ε_{t+2} + ε_{t+3})^2] = σ^2 (1 + φ^2 + φ^4)

V(e_4) = E[(φ^3 ε_{t+1} + φ^2 ε_{t+2} + φ ε_{t+3} + ε_{t+4})^2] = σ^2 (1 + φ^2 + φ^4 + φ^6)
...
V(e_h) = σ^2 (1 + φ^2 + φ^4 + · · · + φ^{2(h−1)})
The sum in parentheses is a geometric series with common ratio φ^2. Since 0 ≤ φ^2 < 1 due to the stationarity condition:

lim_{h→∞} V(e_h) = σ^2 / (1 − φ^2) = V(y_t)    (14)

which is the unconditional variance of the process, as required.
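As a numerical illustration (my own sketch, not part of the solution; parameter values are arbitrary), a small Monte Carlo in Python reproduces (14):

import numpy as np

rng = np.random.default_rng(0)
phi, sigma = 0.8, 1.0            # illustrative values: V(y_t) = 1/0.36 ≈ 2.78
burn, H, R = 200, 50, 20_000     # burn-in length, forecast horizon, replications

errors = np.empty(R)
for r in range(R):
    y = 0.0
    for _ in range(burn):        # burn-in so y_t is approximately stationary
        y = phi * y + sigma * rng.standard_normal()
    y_t = y
    for _ in range(H):           # propagate the process H steps ahead
        y = phi * y + sigma * rng.standard_normal()
    errors[r] = y - phi**H * y_t # e_H = y_{t+H} - yhat_{t+H|t}, with yhat = phi^H * y_t

print(errors.var())              # approx. 2.78
print(sigma**2 / (1 - phi**2))   # 2.777..., the unconditional variance V(y_t)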
Consider the AR(2) process

y_t = φ_1 y_{t−1} + φ_2 y_{t−2} + ε_t    (15)

where the error term ε_t follows a normal distribution with the following properties:

E(ε_t) = 0    ∀t
E(ε_t^2) = σ^2    ∀t
E(ε_t ε_{t−s}) = 0    ∀s ≠ 0
(15) and its reparameterisation in first-difference form,

Δy_t = ρ y_{t−1} + π Δy_{t−1} + ε_t    (17)

are mathematically equivalent, with:

ρ = φ_1 + φ_2 − 1,    π = −φ_2    (18)
regardless of the presence/absence of a unit root in y_t. Consider the case where the process y_t generated by (15) contains a unit root, so that one of the roots of the polynomial

φ(z) = 1 − φ_1 z − φ_2 z^2 = 0

is equal to unity. Factorising the polynomial:

φ(z) = 1 − φ_1 z − φ_2 z^2 = (1 − λ_1 z)(1 − λ_2 z) = 1 − (λ_1 + λ_2) z + λ_1 λ_2 z^2
This enables us to express the AR parameters in terms of the roots of the polynomial as:

φ_1 = λ_1 + λ_2,    φ_2 = −λ_1 λ_2

so that

ρ = φ_1 + φ_2 − 1 = λ_1 + λ_2 − λ_1 λ_2 − 1
Now let λ_1 = 1:

ρ = φ_1 + φ_2 − 1 = 1 + λ_2 − λ_2 − 1 = 0
Similarly, let λ_2 = 1:

ρ = λ_1 + 1 − λ_1 − 1 = 0

Therefore, when either of the roots is 1, ρ = 0, and the AR(2) process of (17) becomes:

Δy_t = π Δy_{t−1} + ε_t    (19)
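The algebra above is easy to verify symbolically; a small sketch using sympy (variable names are mine, and nothing here is from the solution):

import sympy as sp

l1, l2 = sp.symbols('lambda1 lambda2')
phi1 = l1 + l2              # coefficient matching from the factorised polynomial
phi2 = -l1 * l2
rho = phi1 + phi2 - 1       # definition (18)

print(sp.factor(rho))       # -(lambda1 - 1)*(lambda2 - 1)
print(rho.subs(l1, 1), rho.subs(l2, 1))   # 0 0: either unit root forces rho = 0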
Consider the process

y_t = c + y_{t−1} + ε_t − θ ε_{t−1}    (20)

where the error term ε_t follows a normal distribution with the following properties:

E(ε_t) = 0    ∀t
E(ε_t^2) = σ^2    ∀t
E(ε_t ε_{t−s}) = 0    ∀s ≠ 0
Suppose you have drawn T observations, y_1, y_2, · · · , y_T, from this process. Discuss how you can estimate the model of (20) and compute optimal linear forecasts for the level y_{T+h}, h = 1, 2, 3.
The process of (20) is an I(1) process: it has an AR unit root. Recall from the discussion on spurious regression that the parameters of a unit-root process cannot be consistently estimated. However, we know that the first difference of an I(1) process is stationary. In this case, simply subtracting y_{t−1} from both sides of (20) gives:

Δy_t = c + ε_t − θ ε_{t−1}    (21)

which is a stationary MA(1) in Δy_t. Therefore, using the conditional maximum likelihood estimation method, the unknown parameters of the process (20), c and θ, can be consistently estimated, provided that the invertibility condition |θ| < 1 holds.
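In practice this estimation step can be delegated to a standard library. A sketch using statsmodels (assuming it is available; the series is simulated here purely for illustration, with parameter values that are not from the question):

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

# Simulate T observations from a process like (20):
# y_t = c + y_{t-1} + eps_t - theta*eps_{t-1}, with illustrative c = 0.5, theta = 0.4.
rng = np.random.default_rng(1)
T, c, theta = 500, 0.5, 0.4
eps = rng.standard_normal(T + 1)
dy = c + eps[1:] - theta * eps[:-1]   # the stationary MA(1) in (21)
y = np.cumsum(dy)

# Fit an MA(1) with a constant to the first differences, as in (21).
# Note: statsmodels writes the MA(1) as dy_t = const + e_t + ma.L1 * e_{t-1},
# so ma.L1 estimates -theta here, and it maximises the exact (state-space)
# likelihood rather than the conditional likelihood derived below.
fit = ARIMA(np.diff(y), order=(0, 0, 1), trend='c').fit()
print(fit.params)   # const ~ 0.5, ma.L1 ~ -0.4, sigma2 ~ 1.0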
Following the procedure detailed in Chapter 3, Section 2 of my lecture notes, the conditional likelihood function, conditional on the assumption that ε_1 is a known constant, is given by:

f(Δy_2 ∩ Δy_3 ∩ · · · ∩ Δy_T) = ∏_{t=2}^{T} (1/√(2πσ^2)) exp[ −(Δy_t − c + θ ε_{t−1})^2 / (2σ^2) ]    (22)
Note that, given the set of observations y_1, y_2, · · · , y_T as the question says, Δy_1 = y_1 − y_0 cannot be computed, since y_0 obviously does not exist. Hence the first usable observation for Δy_t is Δy_2. The log-likelihood function is then given by:
ll(θ) = −[(T − 1)/2] ln(2π) − [(T − 1)/2] ln σ^2 − (1/(2σ^2)) ∑_{t=2}^{T} (Δy_t − c + θ ε_{t−1})^2    (23)
where θ is a vector containing the unknown parameters of the process, in this case θ = (c, θ, σ^2)′.
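Alternatively, (23) can be coded directly and maximised numerically; a minimal sketch (my own, not the solution's code), with ε_1 set to zero, the usual choice of "known constant":

import numpy as np
from scipy.optimize import minimize

def neg_cond_ll(params, dy):
    # Negative of the conditional log-likelihood (23); dy holds the observed
    # first differences (dy_2, ..., dy_T), and eps_1 is fixed at 0.
    c, theta, sigma2 = params
    if sigma2 <= 0 or abs(theta) >= 1:   # positive variance and invertibility
        return np.inf
    eps, ss = 0.0, 0.0
    for d in dy:
        eps = d - c + theta * eps        # eps_t = dy_t - c + theta*eps_{t-1}
        ss += eps ** 2
    n = len(dy)                          # T - 1 terms in the sum
    return 0.5 * n * np.log(2 * np.pi * sigma2) + ss / (2 * sigma2)

# Usage (dy from data, e.g. dy = np.diff(y); starting values are rough guesses):
# res = minimize(neg_cond_ll, x0=[0.0, 0.1, 1.0], args=(dy,), method='Nelder-Mead')
# c_mle, theta_mle, sigma2_mle = res.x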
Let c̃, θ̃, and σ̃^2 denote the maximum likelihood estimates of c, θ, and σ^2, respectively. Using these, we can compute forecasts for the level y_t by simply using equation (20). Let ŷ_{t+h|t} = E(y_{t+h} | I_t) denote the optimal forecast of y_{t+h} computed at time t.
y_{T+1} = c + y_T + ε_{T+1} − θ ε_T
ŷ_{T+1|T} = c̃ + y_T − θ̃ ε̃_T

At time T, the present value ε_T is known: in practice, once the parameter estimates c̃, θ̃, and σ̃^2 are obtained, the residual series {ε̃_t} can always be used in its place.
y_{T+2} = c + y_{T+1} + ε_{T+2} − θ ε_{T+1}
ŷ_{T+2|T} = c̃ + ŷ_{T+1|T} = 2c̃ + y_T − θ̃ ε̃_T

y_{T+3} = c + y_{T+2} + ε_{T+3} − θ ε_{T+2}
ŷ_{T+3|T} = c̃ + ŷ_{T+2|T} = 3c̃ + y_T − θ̃ ε̃_T
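In code, the three forecasts follow the same recursion (a sketch; the estimate values below are hypothetical, and in practice ε̃_T would come from the fitted residual series):

# Hypothetical estimates and data: c~ = 0.5, theta~ = 0.4, last residual -0.3.
c_t, theta_t = 0.5, 0.4
y_T, eps_T = 10.0, -0.3

yhat = y_T
for h in (1, 2, 3):
    # only the first step carries the MA term; E(eps_{T+h} | I_T) = 0 for h >= 1
    yhat = c_t + yhat - (theta_t * eps_T if h == 1 else 0.0)
    print(h, round(yhat, 2))   # 10.62, 11.12, 11.62 = h*c~ + y_T - theta~*eps_T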