
Forecasting and predictive analytics – Exercises – part 2

Teacher: Pierre Jacob (jacob@essec.edu)

Master in Management – ESSEC Business School – 2022

Each exercise is worth 25% of the total grade. Have fun!

Exercise 2.1. (Sunspots). Download the sunspot dataset (sunspot.csv on Moodle). The file contains a series of
n = 289 (yearly averaged) numbers of sunspots observed from 1700 to 1988. Denote the time series by (y_t)_{t=1}^n.

1. Plot the series (y_t) against time t, its autocorrelogram (acf in R) and partial autocorrelogram (pacf in R),
and comment.
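Question 1 can also be reproduced outside R. Below is a minimal Python sketch (numpy only) of the sample autocorrelation that R's acf computes; the commented lines show how it would be applied to sunspot.csv once downloaded, and the pacf can be obtained similarly (e.g. via the Durbin–Levinson recursion).

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations r(0), ..., r(max_lag), with the usual
    divisor n, matching what R's acf() reports."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    ybar = y.mean()
    c0 = np.sum((y - ybar) ** 2) / n
    return np.array([np.sum((y[h:] - ybar) * (y[: n - h] - ybar)) / (n * c0)
                     for h in range(max_lag + 1)])

# Hypothetical usage once sunspot.csv is downloaded from Moodle:
# y = np.loadtxt("sunspot.csv", delimiter=",")
# r = sample_acf(y, 40)  # then plot y against t, and r against the lag h
```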

2. Consider an AR(1) process Y_t = µ + ϕ1 (Y_{t−1} − µ) + W_t, where (W_t) are i.i.d. Normal(0, σ²) variables, and
assume that (Y_t) is stationary. The parameters are ϕ1, µ and σ². Discuss whether this model is well adapted
to the sunspot observations, based on the plots obtained in the first question.

3. We continue with the AR(1) process, and we assume that this process is stationary. Express the expectation
E[Y_t] and the variance V[Y_t], at any time t, as functions of µ, ϕ1 and σ².
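A simulation is a handy sanity check for question 3. The sketch below (Python, with arbitrary illustrative parameter values) simulates a long stationary AR(1) path and compares the empirical mean and variance with µ and σ²/(1 − ϕ1²), the values the calculation should deliver.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, phi1, sigma2 = 50.0, 0.6, 4.0   # arbitrary illustrative values
n = 200_000

y = np.empty(n)
# draw Y_1 from the stationary distribution so the whole path is stationary
y[0] = mu + rng.normal(0.0, np.sqrt(sigma2 / (1 - phi1 ** 2)))
w = rng.normal(0.0, np.sqrt(sigma2), size=n)
for t in range(1, n):
    y[t] = mu + phi1 * (y[t - 1] - mu) + w[t]

print(y.mean())  # close to mu = 50
print(y.var())   # close to sigma2 / (1 - phi1**2) = 6.25
```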

4. Consider the following pseudo-code, which takes a value of ϕ1 as input.


• Compute µ̂ = n⁻¹ Σ_{t=1}^{n} y_t.
• Compute σ̂² = n⁻¹ Σ_{t=1}^{n} (y_t − µ̂)² × (1 − ϕ1²).
• Define v = 0.
• For t = 2, …, n, redefine v as v − (1/2) log(2πσ̂²) − (1/(2σ̂²)) ((y_t − µ̂) − ϕ1 (y_{t−1} − µ̂))².
• Return v.

We denote by A(ϕ1) the result of the above. Explain how the function ϕ1 ↦ A(ϕ1) relates to the likelihood
function of the AR(1) process, and explain why µ̂ and σ̂² are defined in such a way.

5. Plot A(ϕ1) against a grid of values of ϕ1 between 0.5 and 1, and determine numerically the value ϕ1⋆ that
maximizes the function ϕ1 ↦ A(ϕ1).
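Questions 4 and 5 fit together naturally in code. The sketch below implements A(ϕ1) exactly as in the pseudo-code and maximizes it over a grid; since the sunspot series is not reproduced here, it is illustrated on a simulated AR(1) series instead (the grid stops just below 1 because σ̂² vanishes at ϕ1 = 1).

```python
import numpy as np

def A(phi1, y):
    """The criterion of question 4, computed exactly as in the pseudo-code."""
    mu_hat = y.mean()
    sigma2_hat = np.mean((y - mu_hat) ** 2) * (1 - phi1 ** 2)
    resid = (y[1:] - mu_hat) - phi1 * (y[:-1] - mu_hat)
    return np.sum(-0.5 * np.log(2 * np.pi * sigma2_hat)
                  - resid ** 2 / (2 * sigma2_hat))

# Simulated AR(1) data as a stand-in for the sunspot series
rng = np.random.default_rng(1)
phi_true, mu, sigma = 0.8, 50.0, 15.0
n = 1000
y = np.empty(n)
y[0] = mu
for t in range(1, n):
    y[t] = mu + phi_true * (y[t - 1] - mu) + rng.normal(0.0, sigma)

grid = np.linspace(0.5, 0.99, 200)
values = [A(p, y) for p in grid]
phi_star = grid[int(np.argmax(values))]
print(phi_star)  # close to phi_true on this long simulated series
```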

6. Using the parameter estimate ϕ1⋆, describe how you would construct a 95% prediction interval for the next
observation y_{n+1}.
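One possible answer to question 6, sketched in Python: under the fitted Gaussian AR(1) model, Y_{n+1} given y_n is Normal with mean µ̂ + ϕ1⋆(y_n − µ̂) and variance σ̂², which gives the plug-in interval below (this ignores parameter-estimation uncertainty, one of several reasonable choices).

```python
import numpy as np

def ar1_prediction_interval(y, phi_star, z=1.96):
    """Plug-in 95% prediction interval for y_{n+1} under the fitted AR(1)."""
    mu_hat = y.mean()
    sigma2_hat = np.mean((y - mu_hat) ** 2) * (1 - phi_star ** 2)
    point = mu_hat + phi_star * (y[-1] - mu_hat)   # conditional mean forecast
    half = z * np.sqrt(sigma2_hat)                 # 1.96 x conditional std dev
    return point - half, point + half
```

With the sunspot data, y would be the loaded series and phi_star the maximizer found in question 5.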

Exercise 2.2. (Best predictions versus best linear predictions). Consider the problem of predicting Y using a
function g of X, where X and Y are (possibly dependent) random variables. In the lectures, we have seen that in
order to minimize the mean squared error (MSE)

MSE = E[(Y − g(X))²],

we should choose the function g to be the conditional expectation, i.e. g(X) = E[Y | X].

1. Suppose that we consider only linear functions of X: g is of the form g(x) = a + bx, for a, b ∈ R. What are the
values of a and b minimizing E[(Y − (a + bX))²]? Express a and b in terms of expectations and covariances
of X and Y.

2. Let X and Z be independent Normal variables with mean 0 and variance 1. Let Y = X³ + Z². What is the
conditional expectation E[Y | X] in this case, and what is the associated MSE E[(Y − E[Y | X])²]?

3. Focusing on linear functions of X as in question 1., what are the values of a and b minimizing the MSE
E[(Y − (a + bX))²] when X, Y are defined as in question 2.? What is the associated MSE? Compare with the
value obtained in question 2.
   
Hint: recall that, if Z is a standard Normal variable, then E[Z^k] = 0 if k is an odd number, and E[Z^k] = (k − 1)!!
if k is even, where k!! = k × (k − 2) × (k − 4) × … × 1. Thus E[Z²] = 1, E[Z⁴] = 3!! = 3 × 1 = 3, E[Z⁶] = 5!! =
5 × 3 × 1 = 15.
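Once candidate answers to questions 1–3 are in hand, a Monte Carlo check is straightforward. The sketch below uses the conditional-mean predictor X³ + 1 (a claim to be confirmed by your own derivation in question 2) and estimates the best linear coefficients a and b directly from the sample, so the gap between the two MSEs can be seen numerically.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000
X = rng.normal(size=n)
Z = rng.normal(size=n)
Y = X ** 3 + Z ** 2

# Candidate conditional-mean predictor (to be derived in question 2)
mse_cond = np.mean((Y - (X ** 3 + 1)) ** 2)

# Best linear predictor, with a and b estimated from the sample
b = np.cov(X, Y)[0, 1] / X.var()      # sample version of Cov(X, Y) / V[X]
a = Y.mean() - b * X.mean()
mse_lin = np.mean((Y - (a + b * X)) ** 2)

print(mse_cond, mse_lin)  # the gap is the price of restricting g to be linear
```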

Exercise 2.3. (Autocorrelation of AR(1) and AR(2)). Consider a stationary AR(1) process: Y_t = (3/4) Y_{t−1} + W_t,
for all t ∈ Z, where (W_t)_{t∈Z} are uncorrelated variables with mean zero and variance one.
1. Show that, for each t ∈ Z, the variable Y_t has mean zero and variance equal to 16/7.

2. Show that the autocovariance function γ_Y of (Y_t)_{t∈Z} satisfies, for all h ≥ 1, γ_Y(h) = (3/4) γ_Y(h − 1).
3. Deduce that the autocorrelation function satisfies ρ_Y(h) = (3/4)^h for all h ≥ 0.
Consider a weakly stationary AR(2) process: V_t = (1/3) V_{t−1} + (2/9) V_{t−2} + W_t, where (W_t)_{t∈Z} is defined as above.
4. Show that the autocorrelation of (V_t)_{t∈Z} satisfies ρ_V(h) − (1/3) ρ_V(h − 1) − (2/9) ρ_V(h − 2) = 0 for h ≥ 2.
Hint: start by working out E[V_t V_{t−h}] for h ≥ 2.
5. Show that the function ρ_V(h) = (16/21) (2/3)^h + (5/21) (−1/3)^h, for all h ≥ 0, satisfies the equation in question
4., and verify the result using simulations.
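Both parts of question 5 can be checked in code: the recursion of question 4 is a deterministic identity in the candidate ρ_V, and the simulation part compares sample autocorrelations of a long AR(2) path with the formula. A sketch (arbitrary seed, burn-in left implicit):

```python
import numpy as np

# Candidate autocorrelation function from question 5
rho = lambda h: (16 / 21) * (2 / 3) ** h + (5 / 21) * (-1 / 3) ** h

# Deterministic check of the recursion from question 4, plus rho(0) = 1
recursion_gap = max(abs(rho(h) - rho(h - 1) / 3 - 2 * rho(h - 2) / 9)
                    for h in range(2, 20))
print(recursion_gap, rho(0))  # gap is zero up to rounding, rho(0) = 1

# Simulation check: sample autocorrelations of the AR(2) process
rng = np.random.default_rng(3)
n = 200_000
w = rng.normal(size=n)
v = np.zeros(n)
for t in range(2, n):
    v[t] = v[t - 1] / 3 + 2 * v[t - 2] / 9 + w[t]
v -= v.mean()
c0 = np.mean(v * v)
for h in (1, 2, 3):
    print(h, np.mean(v[h:] * v[:-h]) / c0, rho(h))  # sample vs formula
```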

Exercise 2.4. (The Town). Following S. J. Deutsch & F. B. Alt in an article published in 1977, we study the
number of armed robberies reported each month in Boston, Massachusetts, from January 1966 to October 1975
(n = 118 months in total). The series (y_t)_{t=1}^n is in bostonrobberies.csv on Moodle.
1. Plot y_t against t, and discuss whether the time series could be adequately modeled with a stationary process.
2. The Box–Cox transform is a function f_λ defined as

f_λ : y ↦ (|y|^λ − 1) / λ,  for all y ∈ R,

where λ ∈ R is a parameter. Show that, for all y > 0, f_λ(y) → log(y) as λ → 0.


Hint: recall that exp(x) ≈ 1 + x + x²/(2!) + x³/(3!) + … and that x^a = exp(a log x).
3. For all z > −1/λ, find y > 0 such that f_λ(y) = z, as an expression involving z and λ.
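A quick numerical check for questions 2 and 3 (the f_inv below is a candidate answer to question 3, valid for λ > 0 and z > −1/λ; treat it as a claim to be confirmed by your own algebra):

```python
import numpy as np

def f(lam, y):
    """Box-Cox transform for y > 0 (so |y| = y)."""
    return (y ** lam - 1) / lam

def f_inv(lam, z):
    """Candidate inverse: solves f(lam, y) = z for y > 0, given z > -1/lam."""
    return (1 + lam * z) ** (1 / lam)

y = 7.3
print(f(1e-6, y), np.log(y))       # small lambda: f approaches log(y)
print(f_inv(0.5, f(0.5, y)))       # round trip recovers y = 7.3
```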
4. Using λ = 0.5, compute z_t = f_λ(y_t) for t = 1, …, n, fit a linear regression z_t = α + βt + ε_t, extract the residuals
ε̂_t for t = 1, …, n, plot the residuals against t, present their autocorrelogram and the partial autocorrelogram, and
comment.
5. We consider the modeling of the residuals (ε̂_t) using an AR(1) process, ε̂_t = ϕ1 ε̂_{t−1} + W_t, with W_t ∼
Normal(0, σ²). Report estimates ϕ̂1, σ̂² of the parameters ϕ1, σ², and describe how you obtain these estimates.
6. Describe how you would construct a 95% prediction interval for the next observation y_{n+1}, using the approach
outlined in questions 4. and 5., and the result of question 3.
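The whole pipeline of questions 4–6 can be sketched end to end. Since bostonrobberies.csv is not reproduced here, the sketch below runs on a simulated positive series of the same length; the AR(1) estimates use simple least squares on the residuals (one reasonable choice among several), and the interval is built on the transformed scale and mapped back with the inverse from question 3.

```python
import numpy as np

lam = 0.5
f = lambda y: (y ** lam - 1) / lam            # Box-Cox, y > 0
f_inv = lambda z: (1 + lam * z) ** (1 / lam)  # inverse, z > -1/lam

# Simulated stand-in for the Boston robberies series (positive, trending)
rng = np.random.default_rng(4)
n = 118
t = np.arange(1, n + 1)
y = (2.0 + 0.1 * t + rng.normal(0.0, 0.5, n)) ** 2

# Question 4: transform, then fit z_t = alpha + beta t by least squares
z = f(y)
X = np.column_stack([np.ones(n), t])
alpha_hat, beta_hat = np.linalg.lstsq(X, z, rcond=None)[0]
eps = z - (alpha_hat + beta_hat * t)          # residuals

# Question 5: AR(1) fit on the residuals by least squares
phi_hat = np.sum(eps[1:] * eps[:-1]) / np.sum(eps[:-1] ** 2)
sigma2_hat = np.mean((eps[1:] - phi_hat * eps[:-1]) ** 2)

# Question 6: 95% interval on the z-scale, mapped back to the y-scale
z_next = alpha_hat + beta_hat * (n + 1) + phi_hat * eps[-1]
half = 1.96 * np.sqrt(sigma2_hat)
lo, hi = f_inv(z_next - half), f_inv(z_next + half)
print(lo, hi)  # asymmetric on the y-scale, since f_inv is nonlinear
```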
