Solutions 1
Tutorial Sheet 1
Solution: There is evidence of seasonality: retailer sales spike each December and slump in January, coinciding with the holiday season in the U.S. There also appears to be a strong positive trend in this variable that looks roughly linear.
2. Explain why trends and seasonality can be problematic in time series analysis.
Solution: Trends and seasonality can lead to the spurious regression problem. This is the problem where
we find a relationship (e.g. through regression) between two or more variables that is purely due to the
fact that all/some of them are trending or follow similar seasonal patterns.
3. Explain how we could de-seasonalize and de-trend the retailer sales variable so that it can be used in time series
analysis.
1 You can find this (and many other good data series) at https://fred.stlouisfed.org/graph/?id=RETAILSMNSA
Solution: One way to do this is to “partial out” (see Chapter 3) a linear trend and monthly dummies.
Of course there could be higher-order trends or seasonality other than monthly, but based on the figure it
seems like a good idea to use a linear trend and monthly dummies. We'd regress sales_t on a constant, a linear trend t, and monthly dummies (with January as the base month), and keep the residuals:

ê_t = sales_t − α̂_0 − α̂_1 t − α̂_2 feb_t − α̂_3 mar_t − ... − α̂_12 dec_t
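As a rough sketch of this partialling-out step (the series below is simulated, since the FRED data itself is not included here; the function name and all parameters are my own choices, not from the tutorial):

```python
# Sketch: de-trend and de-seasonalize a monthly series by regressing it on a
# constant, a linear trend, and month dummies (Jan = base), keeping residuals.
import numpy as np

def detrend_deseasonalize(y, months):
    """Return OLS residuals e_hat from regressing y on a constant,
    a linear trend, and dummies for months 2..12 (January is the base)."""
    n = len(y)
    t = np.arange(1, n + 1)
    cols = [np.ones(n), t]
    for m in range(2, 13):                       # feb..dec dummies
        cols.append((months == m).astype(float))
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta                          # e_hat = y - X @ beta_hat

# tiny demo on a simulated trending, seasonal series (made-up parameters)
rng = np.random.default_rng(0)
months = np.tile(np.arange(1, 13), 10)           # 10 years of monthly data
trend = 0.01 * np.arange(len(months))
seasonal = 0.2 * (months == 12)                  # December spike
y = 10 + trend + seasonal + 0.05 * rng.standard_normal(len(months))
e_hat = detrend_deseasonalize(y, months)
```

Because the regression includes a constant, the residuals e_hat average to zero and have the trend and monthly pattern removed.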
4. We now try to estimate the effect of the unemployment rate (unratensa) on retailer sales (retailsmnsa). Is it
the case that higher unemployment means less disposable income and thus lower sales? Consider the regression
results shown in column 1 in Table 1. Comment on how important it is to account for trends and seasonality.
Table 1: Retailer Sales and Unemployment

                            (1)            (2)             (3)            (4)
Dependent variable:     log(Retail    de-trended &     log(Retail     log(Retail
                          Sales)      de-seasonalized    Sales)         Sales)
                                      log(Retail Sales)
---------------------------------------------------------------------------------
Unemployment rate        -0.0093        -0.0163          -0.0168        -0.0252
                         (0.0097)       (0.0017)***      (0.0018)***    (0.0053)***
t                                                         0.0031         0.0031
                                                         (0.0000)***    (0.0000)***
2.month                                                  -0.0091        -0.0215
                                                         (0.0154)       (0.0162)
3.month                                                   0.1102         0.0977
                                                         (0.0154)***    (0.0162)***
4.month                                                   0.0857         0.0736
                                                         (0.0154)***    (0.0162)***
5.month                                                   0.1438         0.1327
                                                         (0.0154)***    (0.0160)***
6.month                                                   0.1258         0.1181
                                                         (0.0154)***    (0.0156)***
7.month                                                   0.1225         0.1115
                                                         (0.0154)***    (0.0160)***
8.month                                                   0.1383         0.1247
                                                         (0.0154)***    (0.0164)***
9.month                                                   0.0645         0.0515
                                                         (0.0154)***    (0.0164)***
10.month                                                  0.0916         0.0796
                                                         (0.0155)***    (0.0162)***
11.month                                                  0.1079         0.0977
                                                         (0.0156)***    (0.0161)***
12.month                                                  0.2622         0.2519
                                                         (0.0156)***    (0.0161)***
L.Unemployment rate                                                      0.0091
                                                                        (0.0053)*
Constant                 12.6232         0.0957          10.8252        10.8347
                         (0.0597)***    (0.0106)***      (0.0229)***    (0.0230)***
---------------------------------------------------------------------------------
Observations                346            346              346            345
R2                        0.003          0.206            0.969          0.969

Standard errors in parentheses
* p < 0.10, ** p < 0.05, *** p < 0.01
Solution: In the first column we see the results of a simple regression of the natural logarithm of retail sales (log = ln in economics) on the contemporaneous unemployment rate. The coefficient is slightly negative, but it is not statistically significant at any conventional level (10%, 5%, or 1%). In column 2 we use the ê_t from above instead of log retail sales as the dependent variable.² The coefficient changes drastically: it is now much more negative and statistically significant.
We get almost exactly the same result if we directly regress log retail sales on the unemployment rate while including a trend and monthly dummies, as shown in column 3.
Clearly it matters a lot for the results whether or not we account for trends and seasonality. Based on the first figure, we would probably trust the results in columns 2 and 3 much more than those in column 1.
Is this conclusive evidence that higher unemployment leads to less disposable income and thus lower retail sales? Probably not. There are many factors not included in this regression which could be correlated with both unemployment and retail sales, so we could have an omitted variable bias problem. The results in this table should not be considered causal; instead, we have a descriptive regression. But we can turn this around: the evidence in this table is consistent with a strong effect of unemployment on disposable income and hence retail sales (it is one potential factor among many).
5. Compare columns 3 and 4 in Table 1: first we only include contemporaneous unemployment, then also its first
lag. What is the impact propensity of a permanent one-unit increase in unemployment on (log) retail sales?
What is the long-run propensity?
Solution: In column 3 the estimated impact propensity and the long-run propensity are both simply -0.0168, because we only include the contemporaneous unemployment rate. In column 4 we have to distinguish between the two: the impact propensity of a permanent one-unit increase in the unemployment rate is -0.0252, while the long-run propensity is -0.0252 + 0.0091 = -0.0161. Once we include lags it is a bit more tricky to say what 'the effect of x on y' is. The first lag is weakly statistically significant and the contemporaneous coefficient changes quite a bit, so it could be important to include a lag.
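The arithmetic behind the two propensities is simple enough to state directly, using the column 4 point estimates from Table 1 (the variable names below are just illustrative labels):

```python
# Finite distributed lag model: y_t = a + d0*x_t + d1*x_{t-1} + e_t.
# Point estimates taken from column 4 of Table 1.
d0 = -0.0252   # coefficient on contemporaneous unemployment
d1 = 0.0091    # coefficient on the first lag of unemployment

impact_propensity = d0           # effect in the period of the increase
long_run_propensity = d0 + d1    # total effect once all included lags have acted
```

For a permanent one-unit increase in x, the impact propensity is the period-0 coefficient and the long-run propensity is the sum of all lag coefficients.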
6. Consider the MA(1) process x_t = e_t + α e_{t-1}, where {e_t} is an i.i.d. sequence with mean zero and variance σ². Show that the process is covariance stationary and weakly dependent.
Solution: To show that the process is covariance stationary we have to show that the mean, the variance, and the covariance between two terms do not depend on time.
x_t = e_t + α e_{t-1}
E(x_t) = E(e_t + α e_{t-1}) = E(e_t) + α E(e_{t-1}) = 0.
Since x_t is only ever a function of terms which have a mean of zero, its expected value will always be zero. Therefore its expected value does not depend on time.
x_t = e_t + α e_{t-1}
Var(x_t) = Var(e_t + α e_{t-1})
        = Var(e_t) + α² Var(e_{t-1}) + 2 Cov(e_t, α e_{t-1})
        = σ² + α² σ²        (the covariance term is zero because the e_t are i.i.d.)
        = (1 + α²) σ².
Since x_t is only ever a function of terms which have the same variance, its variance will be constant over time.
For the covariance, take two adjacent terms:

Cov(x_t, x_{t+1}) = Cov(e_t + α e_{t-1}, e_{t+1} + α e_t)
                 = α Cov(e_t, e_t) = α σ².

Again, the covariance term does not depend on t. This series is covariance stationary.
To show that the series is weakly dependent we need to show that the correlation between terms goes to zero as the terms get further apart. Start with two adjacent terms:

Corr(x_t, x_{t+1}) = Cov(x_t, x_{t+1}) / (SD(x_t) · SD(x_{t+1}))
                  = α σ² / ((1 + α²) σ²)
                  = α / (1 + α²).

Remember that this correlation is for terms one period apart. If we look at terms two periods apart, say x_t and x_{t+2}, it is very easy to show that their covariance is zero (try it using the formula above: e_t + α e_{t-1} and e_{t+2} + α e_{t+1} share no error terms). Therefore the correlation between terms is zero once they are two or more periods apart, and the MA(1) process is weakly dependent.
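The two autocorrelation results can be checked numerically. Here is a small simulation sketch (the value α = 0.5 and the sample size are arbitrary choices, not from the tutorial):

```python
# Simulate an MA(1) process x_t = e_t + alpha*e_{t-1} and check that the
# sample autocorrelation is near alpha/(1+alpha^2) at lag 1 and near zero
# at lag 2, matching the derivation above.
import numpy as np

def sample_autocorr(x, h):
    """Sample correlation between x_t and x_{t+h}."""
    x = np.asarray(x, dtype=float)
    return float(np.corrcoef(x[:-h], x[h:])[0, 1])

rng = np.random.default_rng(42)
alpha, n = 0.5, 200_000
e = rng.standard_normal(n + 1)
x = e[1:] + alpha * e[:-1]          # x[i] = e[i+1] + alpha*e[i]

rho1_theory = alpha / (1 + alpha**2)   # = 0.4 for alpha = 0.5
rho1 = sample_autocorr(x, 1)           # should be close to 0.4
rho2 = sample_autocorr(x, 2)           # should be close to 0
```

With 200,000 observations the sampling error in the autocorrelations is on the order of 1/sqrt(n), so both checks are quite tight.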
7. Consider the time series plots in Figure 2. For each of the plots, do you believe the time series is covariance stationary? If not, which conditions are violated?
Figure 2: Time series plots from four different data generating processes. [Four panels, A-D, each plotting x against t for t = 0 to 500; plots not reproduced in this text version.]
Solution: The three conditions for covariance stationarity are:
(i) E[y_t] = μ: the mean is constant for all t
(ii) Var(y_t) = σ²: the variance is constant for all t
(iii) Cov(y_t, y_{t+h}) = f(h): for a given h, the covariance is constant for all t
Panel A: Covariance stationary, as the mean, variance, and covariance all appear reasonably constant for all t.
Panel B: There appears to be a linear time trend, so (i) does not hold.
Panel C: It seems to matter where we start in terms of getting a negative or positive covariance: specifically, in the first half of the sample there is a positive time trend and in the second half there is a negative trend. This violates both (i) and (iii): both the mean and the covariance depend on time.
Panel D: There is a clear increase in the variance as time increases, which is a violation of condition (ii).
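The exact data generating processes behind the four panels are not given, so the following is only an illustrative guess at processes that would produce similar plots (all parameters made up): stationary noise, a linear trend, a trend break, and growing variance.

```python
# Hypothetical DGPs mimicking Panels A-D over t = 0..499.
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(500)

panel_a = rng.standard_normal(500)                  # stationary: (i)-(iii) hold
panel_b = 0.01 * t + rng.standard_normal(500)       # linear trend: violates (i)
panel_c = (np.where(t < 250, 0.02 * t, 10 - 0.02 * t)
           + rng.standard_normal(500))              # trend break: violates (i), (iii)
panel_d = (1 + t / 100) * rng.standard_normal(500)  # growing variance: violates (ii)
```

Comparing the first and second halves of each simulated series (means for panels B and C, standard deviations for panel D) reproduces the violations described above.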
8. Consider the AR(1) process

y_t = ρ y_{t-1} + e_t,   t = 1, 2, ...,

where the starting point in the sequence is y_0 (at t = 0), {e_t : t = 0, 1, ...} is an i.i.d. sequence with zero mean and variance σ_e², and |ρ| < 1. We also assume that the e_t are independent of y_0 and that E(y_0) = 0. Assuming the process is covariance stationary, find the mean, variance, and covariance, and show that the process is weakly dependent.
Solution: We know from the question that the process is covariance stationary and that E(y_0) = 0, so it should be the case that E(y_1) = 0 as well. To see this, look at the first observation:

y_1 = ρ y_0 + e_1
E(y_1) = E(ρ y_0 + e_1) = ρ E(y_0) + E(e_1) = 0.

Repeating the argument for t = 2, 3, ... shows that E(y_t) = 0 for all t, so the mean does not depend on time.
y_t = ρ y_{t-1} + e_t
Var(y_t) = Var(ρ y_{t-1} + e_t)
        = Var(ρ y_{t-1}) + Var(e_t)
        = ρ² Var(y_{t-1}) + σ_e².
Why is there no covariance between y_{t-1} and e_t? Remember that y_{t-1} can be represented as the sum of the y_0 term and all subsequent error terms,

y_{t-1} = ρ^{t-1} y_0 + ρ^{t-2} e_1 + ... + ρ e_{t-2} + e_{t-1},

all of which are independent of e_t. Because we are told in the question that y_t is covariance stationary, we can set the variance of y_t equal to the variance of y_{t-1} and solve:

Var(y_t) = ρ² Var(y_t) + σ_e²
Var(y_t) = σ_e² / (1 − ρ²).
y_t = ρ y_{t-1} + e_t
y_{t+h} = ρ^h y_t + ρ^{h-1} e_{t+1} + ρ^{h-2} e_{t+2} + ... + ρ e_{t+h-1} + e_{t+h}
Remember that yt+h can be represented as yt and the sum of all subsequent error terms. Now calculate
the covariance:
Cov(y_t, y_{t+h}) = Cov(y_t, ρ^h y_t + ρ^{h-1} e_{t+1} + ... + e_{t+h})
                 = Cov(y_t, ρ^h y_t)        (the future errors are independent of y_t)
                 = ρ^h Var(y_t)
Again, the covariance term does not depend on t. This series is covariance stationary.
To show that the series is weakly dependent we need to show that the correlation between terms goes to zero as the terms get further apart. For two terms h periods apart:
Corr(y_t, y_{t+h}) = Cov(y_t, y_{t+h}) / (SD(y_t) · SD(y_{t+h}))
                  = Cov(y_t, y_{t+h}) / (SD(y_t) · SD(y_t))
                  = Cov(y_t, y_{t+h}) / Var(y_t)
                  = ρ^h Var(y_t) / Var(y_t)
                  = ρ^h.
This shows that even though yt and yt+h are correlated for any h ≥ 1 this correlation becomes very small
for large values of h. Therefore, this AR(1) process is weakly dependent.
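Both AR(1) results (the stationary variance σ_e²/(1 − ρ²) and the autocorrelation ρ^h) can be verified by simulation. A minimal sketch, with ρ = 0.8 and the other settings chosen arbitrarily:

```python
# Simulate a stationary AR(1) y_t = rho*y_{t-1} + e_t and compare the sample
# variance with sigma_e^2/(1-rho^2) and the lag-3 autocorrelation with rho^3.
import numpy as np

rng = np.random.default_rng(7)
rho, n = 0.8, 200_000
e = rng.standard_normal(n)              # sigma_e^2 = 1

y = np.empty(n)
y[0] = e[0] / np.sqrt(1 - rho**2)       # draw y_0 from the stationary distribution
for i in range(1, n):
    y[i] = rho * y[i - 1] + e[i]

var_theory = 1 / (1 - rho**2)                           # = 2.777...
corr_h3 = float(np.corrcoef(y[:-3], y[3:])[0, 1])       # should be near 0.8^3 = 0.512
```

Drawing y_0 from the stationary distribution (rather than setting y_0 = 0) makes the whole simulated path covariance stationary, so no burn-in period is needed.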