
Applications of Econometrics

Tutorial Sheet 1

Trends and Seasonality


1. Consider the following graph showing retailer sales (retailsmnsa) in the U.S.1 Comment on trends and seasonality
in this variable.

Figure 1: Retailer Sales (FRED)

Solution: There is evidence for seasonality: We can see retailer sales spike in December each year and
slump in January, coinciding with festive activities in the U.S. It also looks like there is a strong positive
trend in this variable that looks quite linear.

2. Explain why trends and seasonality can be problematic in time series analysis.

Solution: Trends and seasonality can lead to the spurious regression problem. This is the problem where
we find a relationship (e.g. through regression) between two or more variables that is purely due to the
fact that all/some of them are trending or follow similar seasonal patterns.
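A quick simulation illustrates the problem. The sketch below (Python with NumPy; all magnitudes are invented for illustration) generates two series that are independent apart from each having a linear trend, and still finds a large, strong-looking relationship between them:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500
t = np.arange(T)

# Two series that are completely unrelated except that both trend upward
# (trend slopes and noise scales are made up for illustration).
x = 0.05 * t + rng.normal(size=T)
y = 0.03 * t + rng.normal(size=T)

# OLS of y on a constant and x
X = np.column_stack([np.ones(T), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
corr = np.corrcoef(x, y)[0, 1]
print(f"slope = {beta[1]:.3f}, correlation = {corr:.3f}")
```

The slope and correlation are large even though the two noise processes are independent; de-trending both series first (or including a trend in the regression) would shrink the slope toward zero.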

3. Explain how we could de-seasonalize and de-trend the retailer sales variable so that it can be used in time series
analysis.

1 You can find this (and many other good data series) at https://fred.stlouisfed.org/graph/?id=RETAILSMNSA
Solution: One way to do this is to “partial out” (see Chapter 3) a linear trend and monthly dummies.
Of course there could be higher-order trends or seasonality other than monthly, but based on the figure it
seems like a good idea to use a linear trend and monthly dummies. We’d run the following regression

salest = α0 + α1 t + α2 febt + α3 mart + . . . + α12 dect + et

(NB: we did not include a dummy for January. Why? With a constant in the regression, including all twelve monthly dummies would be perfectly collinear with the constant, the so-called dummy variable trap, so one month must be left out as the base category.)


Then we calculate the residuals

êt = salest − α̂0 − α̂1 t − α̂2 febt − α̂3 mart − . . . − α̂12 dect

We then use êt as a de-trended and de-seasonalized retailer sales variable.


There are other ways to approach this. We could simply just include trends and dummies in any regressions
involving this variable. But there could also be deeper issues. For example, the trend could come from
inflation. Then it might make sense to look at real retailer sales, which might not have a trend. Or the
trend could come from population growth. Then it could make sense to look at sales per capita, which
again might not have a trend. Both of these adjustments would change the interpretation of the variable
since we would be looking at a different measure. Our partialling-out approach does not change the variable
itself.
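As a sketch of this partialling-out recipe, the following Python snippet (simulated data; the trend and seasonal magnitudes are invented for illustration) regresses a sales-like series on a constant, a linear trend, and eleven monthly dummies, then keeps the residuals:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 240                       # 20 years of monthly data (simulated)
t = np.arange(T)
month = t % 12                # 0 = January, ..., 11 = December

# A sales-like series with a linear trend, a December spike, and noise
# (all magnitudes invented for illustration).
seasonal = np.zeros(12)
seasonal[11] = 2.0            # December spike
sales = 5.0 + 0.02 * t + seasonal[month] + rng.normal(scale=0.3, size=T)

# Regressors: constant, trend, and 11 monthly dummies. January is omitted
# as the base month; with a constant included, all 12 dummies would be
# perfectly collinear.
D = np.zeros((T, 11))
for m in range(1, 12):
    D[month == m, m - 1] = 1.0
X = np.column_stack([np.ones(T), t, D])

# Fit by OLS and keep the residuals e_hat: the de-trended, de-seasonalized series.
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
e_hat = sales - X @ coef

# OLS residuals are orthogonal to the regressors, so no trend correlation remains.
print(np.corrcoef(t, e_hat)[0, 1])
```

The printed correlation between êt and the trend is zero up to floating-point error, which is exactly what "de-trended" means here.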

4. We now try to estimate the effect of the unemployment rate (unratensa) on retailer sales (retailsmnsa). Is it
the case that higher unemployment means less disposable income and thus lower sales? Consider the regression
results shown in column 1 in Table 1. Comment on how important it is to account for trends and seasonality.

Table 1: Retailer Sales and Unemployment

Dependent variable: log(Retail Sales) in columns (1), (3), and (4);
de-trended and de-seasonalized log(Retail Sales) in column (2).

                         (1)          (2)          (3)          (4)
Unemployment rate     -0.0093      -0.0163      -0.0168      -0.0252
                      (0.0097)     (0.0017)***  (0.0018)***  (0.0053)***
t                                                0.0031       0.0031
                                                (0.0000)***  (0.0000)***
2.month                                         -0.0091      -0.0215
                                                (0.0154)     (0.0162)
3.month                                          0.1102       0.0977
                                                (0.0154)***  (0.0162)***
4.month                                          0.0857       0.0736
                                                (0.0154)***  (0.0162)***
5.month                                          0.1438       0.1327
                                                (0.0154)***  (0.0160)***
6.month                                          0.1258       0.1181
                                                (0.0154)***  (0.0156)***
7.month                                          0.1225       0.1115
                                                (0.0154)***  (0.0160)***
8.month                                          0.1383       0.1247
                                                (0.0154)***  (0.0164)***
9.month                                          0.0645       0.0515
                                                (0.0154)***  (0.0164)***
10.month                                         0.0916       0.0796
                                                (0.0155)***  (0.0162)***
11.month                                         0.1079       0.0977
                                                (0.0156)***  (0.0161)***
12.month                                         0.2622       0.2519
                                                (0.0156)***  (0.0161)***
L.Unemployment rate                                           0.0091
                                                             (0.0053)*
Constant              12.6232       0.0957      10.8252      10.8347
                      (0.0597)***  (0.0106)***  (0.0229)***  (0.0230)***
Observations             346          346          346          345
R2                     0.003        0.206        0.969        0.969

Standard errors in parentheses
* p < 0.10, ** p < 0.05, *** p < 0.01

Solution: In the first column we see the results of a simple regression of the natural logarithm of retail
sales (log = ln in economics) on the contemporaneous unemployment rate. We can see a slightly negative
coefficient, but it is not statistically significant at any conventional level (10%, 5%, or 1%). In column 2 we
use the êt from above instead of retail sales as the dependent variable. We can see a drastic change in the
coefficient: it is now much more negative and statistically significant.

We get almost exactly the same result if we directly regress retail sales on the unemployment rate while
including a trend and monthly dummies, as shown in column 3.
Clearly it matters a lot for the results whether or not we account for trends and seasonality. Based on the
first figure we would probably trust the results in columns 2 and 3 much more than column 1.
Is this conclusive evidence that higher unemployment leads to less disposable income and thus lower retail
sales? Probably not. There are many factors we have not included in this regression which could be
correlated with both unemployment and retail sales, so we could have an omitted variable bias problem. The
results in this table should not be considered causal; instead, we have a descriptive regression. But we
can turn this around: the evidence in this table is consistent with a strong effect of unemployment on
disposable income and hence retail sales (it is one potential factor among many).
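The reason columns 2 and 3 are so close is the Frisch-Waugh-Lovell logic behind partialling out: partialling the controls out of both the dependent variable and the regressor reproduces the multiple-regression coefficient exactly, which is what column 3 does implicitly, while column 2 only partials them out of the dependent variable, so the match is close but not exact. A minimal sketch with simulated data (names and magnitudes are hypothetical, and only a trend is used as the control for brevity):

```python
import numpy as np

rng = np.random.default_rng(2)
T = 300
t = np.arange(T, dtype=float)

# Simulated stand-ins (names and magnitudes are hypothetical):
u = 5 + 0.01 * t + rng.normal(size=T)                           # trending regressor
y = 10 + 0.003 * t - 0.02 * u + rng.normal(scale=0.1, size=T)   # trending outcome

ones = np.ones(T)
Z = np.column_stack([ones, t])     # the controls: constant and linear trend

def resid(v, Z):
    """Residuals from an OLS regression of v on Z (the partialling-out step)."""
    b, *_ = np.linalg.lstsq(Z, v, rcond=None)
    return v - Z @ b

# (a) Multiple regression of y on a constant, u, and the trend (column 3 style).
X = np.column_stack([ones, u, t])
b_full, *_ = np.linalg.lstsq(X, y, rcond=None)

# (b) Partial the controls out of BOTH y and u, then regress residual on residual.
ry, ru = resid(y, Z), resid(u, Z)
b_partial = (ru @ ry) / (ru @ ru)

print(b_full[1], b_partial)  # identical up to floating-point error
```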

5. Compare columns 3 and 4 in Table 1: first we only include contemporaneous unemployment, then also its first
lag. What is the impact propensity of a permanent one-unit increase in unemployment on (log) retail sales?
What is the long-run propensity?

Solution: In column 3 the estimated impact propensity and long-run propensity are both -0.0168, because
we only include the contemporaneous unemployment rate. In column 4 we have to distinguish: the
impact propensity of a permanent one-unit increase in the unemployment rate is -0.0252 and the long-run
propensity is -0.0252 + 0.0091 = -0.0161. Once we include lags it is a bit more tricky to say what 'the effect
of x on y' is. The first lag is weakly statistically significant and the contemporaneous coefficient changes quite
a bit, so it could be important to include a lag.
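The column 4 arithmetic can be checked directly; the two coefficients below are copied from Table 1:

```python
# Finite distributed lag from column 4 of Table 1:
# log(sales_t) = a + d0 * unem_t + d1 * unem_{t-1} + (trend, dummies) + e_t
d0, d1 = -0.0252, 0.0091          # estimates from Table 1, column 4
impact_propensity = d0            # effect in the period of the increase
long_run_propensity = d0 + d1     # effect once the lag has worked through
print(impact_propensity, round(long_run_propensity, 4))  # -0.0252 -0.0161
```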

Stationarity and Weak Dependence


6. Consider the following sequence,
xt = et + αet−1 , t = 1, 2, ...,
where {et : t = 0, 1, . . .} is an i.i.d. sequence with zero mean and variance σ 2 . The process {xt } is called a
moving average process of order one [MA(1)]: xt is a weighted average of et and et−1 . Show that this MA(1)
process is both covariance stationary and weakly dependent.

Solution: To show that the process is covariance stationary we have to show that the mean, variance, and
the covariance between two terms do not depend on time.

xt = et + αet−1
E(xt ) = E(et + αet−1 )
= E(et ) + αE(et−1 ) = 0.

Since xt is only ever a function of terms which have a mean of zero, its expected value will always be zero.
Therefore its expected value does not depend on time.

xt = et + αet−1
V ar(xt ) = V ar(et + αet−1 )
= V ar(et ) + α2 V ar(et−1 ) + 2Cov(et , αet−1 )
= σ 2 + α2 σ 2
V ar(xt ) = (1 + α2 )σ 2 .

The covariance term drops out because the et are independent of each other. Since xt is only ever a
function of terms which have the same variance, its variance will be constant over time.

Cov(xt , xt+1 ) = Cov(et + αet−1 , et+1 + αet )
= Cov(et , et+1 ) + Cov(et , αet ) + Cov(αet−1 , et+1 ) + Cov(αet−1 , αet )
= ασ 2 .

Only Cov(et , αet ) = ασ 2 survives: the other three terms pair up independent errors and are therefore
zero. Again, the covariance does not depend on t. This series is covariance stationary.
To show that the series is weakly dependent we need to show that the correlation between the terms goes
to zero the further apart the terms get. Start with two terms which are adjacent to one another:

Corr(xt , xt+1 ) = Cov(xt , xt+1 )/V ar(xt )
= ασ 2 /((1 + α2 )σ 2 )
= α/(1 + α2 ).

Remember that this correlation is for terms that are one period apart. If we looked at terms that were two
periods apart, say xt and xt+2 , it is very easy to show that the covariance between these terms is zero (you
should try this out using the formula above). Therefore, the correlation between the terms goes to zero
once the terms are two or more periods apart and the MA(1) process is weakly dependent.
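A simulation can be used to check these results. The sketch below (Python/NumPy, with α = 0.5 chosen arbitrarily and σ = 1) simulates a long MA(1) series and compares the sample autocorrelations at lags 1 and 2 with the theoretical values α/(1 + α2 ) and 0:

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, T = 0.5, 200_000            # alpha chosen arbitrarily for the check
e = rng.normal(size=T + 1)         # i.i.d. errors with sigma = 1
x = e[1:] + alpha * e[:-1]         # MA(1): x_t = e_t + alpha * e_{t-1}

def sample_corr(x, h):
    """Sample correlation between x_t and x_{t+h}."""
    return np.corrcoef(x[:-h], x[h:])[0, 1]

theory_lag1 = alpha / (1 + alpha**2)
print(round(sample_corr(x, 1), 3), round(theory_lag1, 3))  # close to each other
print(round(sample_corr(x, 2), 3))                         # close to zero
```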

7. Consider the time series plots in Figure 2. For each of the plots, do you believe the time series is covariance
stationary? If not, which conditions are violated?

Figure 2: Time series plots from four different data generating processes (Panels A-D; each panel plots x against t = 0, . . . , 500)

Solution: The three conditions for covariance stationarity are:
(i) E[yt ] = µ : Mean is constant for all t
(ii) V ar(yt ) = σ 2 : Variance is constant for all t
(iii) Cov(yt , yt+h ) = f (h): Covariance for given h is constant for all t
Panel A: Is covariance stationary, as the mean, variance, and covariance appear reasonably constant for
all t.
Panel B: There appears to be a linear time trend, so (i) does not hold.
Panel C: It seems to matter where we start: in the first half of the series there is a positive time trend
and in the second half there is a negative trend. This violates both (i) and (iii), as both the mean and the
covariance depend on time.
Panel D: There is a clear increase in the variance as time increases, which is a violation of condition (ii).
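Processes like those in Panels A, B, and D are easy to generate, and comparing simple statistics across the two halves of the sample makes the violations visible. A sketch (Python/NumPy; the data generating processes below are invented stand-ins for the panels, not the ones actually used for the figure):

```python
import numpy as np

rng = np.random.default_rng(4)
T = 500
t = np.arange(T)

# Invented stand-ins for the panels (not the actual DGPs behind the figure):
panel_a = rng.normal(size=T)                          # i.i.d. noise: stationary
panel_b = 0.01 * t + rng.normal(size=T)               # linear trend: mean grows with t
panel_d = rng.normal(size=T) * (0.2 + 0.01 * t)       # variance grows with t

def half_means(x):
    """Sample means of the first and second half of the series."""
    return x[:T // 2].mean(), x[T // 2:].mean()

def half_stds(x):
    """Sample standard deviations of the first and second half."""
    return x[:T // 2].std(), x[T // 2:].std()

print(half_means(panel_a))   # roughly equal: condition (i) plausible
print(half_means(panel_b))   # clearly different: violates (i)
print(half_stds(panel_d))    # clearly different: violates (ii)
```

Splitting the sample like this is only an informal eyeball check, but it mirrors what the plots show.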

8. Consider the following autoregressive process of order one [AR(1)]:

yt = ρyt−1 + et t = 1, 2, ...,

where the starting point of the sequence is y0 (at t = 0), {et : t = 0, 1, . . .} is an i.i.d. sequence with zero mean
and variance σe2 , and |ρ| < 1. We also assume that the et are independent of y0 and that E(y0 ) = 0. Assuming
the process is covariance stationary, find the mean, variance, and covariance, and show that the process is
weakly dependent.

Solution: We know from the question that the process is covariance stationary and that E(y0 ) = 0. So it
should also be the case that E(y1 ) = 0 as well. To see this, look at the first observation,

y1 = ρy0 + e1
E(y1 ) = E(ρy0 + e1 )
= ρE(y0 ) + E(e1 ) = 0.

Now we can calculate the variance of yt

yt = ρyt−1 + et
V ar(yt ) = V ar(ρyt−1 + et )
= V ar(ρyt−1 ) + V ar(et )
= ρ2 V ar(yt−1 ) + σe2 .

Why is there no covariance between yt−1 and et ? Remember that yt−1 can be represented as the sum of the
y0 term and all subsequent error terms,

yt−1 = ρt−1 y0 + ρt−2 e1 + ρt−3 e2 + ... + et−1

all of which are independent of et . Because we are told in the question that yt is covariance stationary we
can set the variance of yt equal to the variance of yt−1 and solve:

V ar(yt ) = ρ2 V ar(yt ) + σe2
(1 − ρ2 )V ar(yt ) = σe2
V ar(yt ) = σe2 /(1 − ρ2 )
The final part is to find the covariance. We will do this by calculating the covariance between yt and yt+h :

yt = ρyt−1 + et
yt+h = ρh yt + ρh−1 et+1 + ρh−2 et+2 + ... + ρ1 et+h−1 + et+h

Remember that yt+h can be represented as yt and the sum of all subsequent error terms. Now calculate
the covariance:

Cov(yt , yt+h ) = Cov(yt , ρh yt + ρh−1 et+1 + ρh−2 et+2 + ... + ρ1 et+h−1 + et+h )
Cov(yt , yt+h ) = Cov(yt , ρh yt )
Cov(yt , yt+h ) = ρh V ar(yt )

Again, the covariance term does not depend on t. This series is covariance stationary.
To show that the series is weakly dependent we need to show that the correlation between terms goes
to zero as the terms get further apart. Here we can compute the correlation between yt and yt+h for a
general gap h directly:

Corr(yt , yt+h ) = Cov(yt , yt+h )/(SD(yt ) · SD(yt+h ))
= Cov(yt , yt+h )/(SD(yt ) · SD(yt ))
= Cov(yt , yt+h )/V ar(yt )
= ρh V ar(yt )/V ar(yt )
= ρh .

This shows that even though yt and yt+h are correlated for any h ≥ 1, this correlation becomes very small
for large values of h. Therefore, this AR(1) process is weakly dependent.
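Again a simulation can confirm the algebra. The sketch below (ρ = 0.8 and σe = 1 chosen arbitrarily) draws y0 from the stationary distribution, simulates a long AR(1) series, and compares the sample variance and autocorrelations with σe2 /(1 − ρ2 ) and ρh :

```python
import numpy as np

rng = np.random.default_rng(5)
rho, sigma_e, T = 0.8, 1.0, 200_000

# Draw y_0 from the stationary distribution N(0, sigma_e^2 / (1 - rho^2)),
# then iterate y_t = rho * y_{t-1} + e_t.
y = np.empty(T)
y[0] = rng.normal(scale=sigma_e / np.sqrt(1 - rho**2))
e = rng.normal(scale=sigma_e, size=T)
for s in range(1, T):
    y[s] = rho * y[s - 1] + e[s]

var_theory = sigma_e**2 / (1 - rho**2)
print(round(y.var(), 2), round(var_theory, 2))
for h in (1, 2, 5):
    sample = np.corrcoef(y[:-h], y[h:])[0, 1]
    print(h, round(sample, 3), round(rho**h, 3))  # sample vs rho^h
```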
