
Time Series Analysis

EC3090 Econometrics
Gaia Narciso
Trinity College Dublin

Time Series vs. Cross Sectional
• Time series data has a temporal ordering, unlike cross-sectional data
Time Series vs. Cross Sectional
• Question: how do we think about randomness in time series data?
• Consider the series of Irish GDP data. We can interpret the observation for each year as a realization of a stochastic process.
• We only observe one realization, because we cannot start the process over again. If history had been different, we would have obtained a different realization of the stochastic (random) process.
Time Series vs. Cross Sectional
• A stochastic process (or time series process) is a sequence of random variables
• When we collect time series data, we actually collect one realization of this stochastic process
• Sample size: the number of time periods over which we observe the variables of interest
Time Series vs. Cross Sectional
• To sum up:
Cross-sectional data: Population → Sample
Time series data: Stochastic process (time series process) → Realization
We are going to consider different stochastic processes. Before doing so, we give some definitions.
Stationary Time Series
• Covariance stationary:
$E(Y_t) = \mu$
$\mathrm{Var}(Y_t) = \sigma^2$
$\mathrm{Cov}(Y_t, Y_{t+k}) = \gamma_k$
Autoregressive Process AR
• AR(1):
$Y_t = \rho Y_{t-1} + u_t$, where $u_t$ is white noise
If $-1 < \rho < 1$, then the AR(1) is stationary
$E(Y_t) = E[\rho Y_{t-1} + u_t]$
$E(Y_t) = 0$
Autoregressive Process AR
• $E(Y_t)$ derivation:
$E(Y_t) = E[\rho Y_{t-1} + u_t] = \rho E[Y_{t-1}] = \rho E[Y_t]$ (by stationarity)
$(1-\rho)E(Y_t) = 0 \;\Rightarrow\; E(Y_t) = 0$
Autoregressive Process AR
• AR(1):
$\mathrm{Var}(Y_t) = \mathrm{Var}(\rho Y_{t-1} + u_t)$
$\mathrm{Var}(Y_t) = \dfrac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• $\mathrm{Var}(Y_t)$ derivation:
$\mathrm{Var}(Y_t) = \mathrm{Var}(\rho Y_{t-1} + u_t) = \rho^2\,\mathrm{Var}(Y_{t-1}) + \sigma_u^2$
$(1-\rho^2)\,\mathrm{Var}(Y_t) = \sigma_u^2$
$\mathrm{Var}(Y_t) = \dfrac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• AR(1):
$\mathrm{Cov}(Y_t, Y_{t+1}) = ?$
$Y_{t+1} = \rho Y_t + u_{t+1}$
$\mathrm{Cov}(Y_t, Y_{t+1}) = E(Y_t Y_{t+1})$
$\mathrm{Cov}(Y_t, Y_{t+1}) = \rho\,\dfrac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• Covariance derivation:
$\mathrm{Cov}(Y_t, Y_{t+1}) = E(Y_t Y_{t+1}) = E[Y_t(\rho Y_t + u_{t+1})] = \rho E(Y_t^2)$
$\mathrm{Cov}(Y_t, Y_{t+1}) = \rho\,\dfrac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• AR(1):
$\mathrm{Cov}(Y_t, Y_{t+2}) = E(Y_t Y_{t+2}) = \rho^2\,\dfrac{\sigma_u^2}{1-\rho^2}$
We can generalize:
$\mathrm{Cov}(Y_t, Y_{t+k}) = \rho^k\,\dfrac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• Covariance derivation:
$\mathrm{Cov}(Y_t, Y_{t+2}) = E(Y_t Y_{t+2}) = E[Y_t(\rho Y_{t+1} + u_{t+2})]$
$= \rho E[Y_t Y_{t+1}] + E[Y_t u_{t+2}] = \rho E[Y_t(\rho Y_t + u_{t+1})]$
$= \rho^2 E[Y_t^2]$
Autoregressive Process AR
Covariance derivation (contd.)
$\mathrm{Cov}(Y_t, Y_{t+2}) = \rho^2 E(Y_t^2) = \rho^2\,\dfrac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• AR(1):
$-1 < \rho < 1$, so the covariance between two terms that are very distant in time is low:
$\rho^k \to 0$ as $k \to \infty$
Autoregressive Process AR
• AR(1):
$\mathrm{Corr}(Y_t, Y_{t+k}) = \dfrac{\mathrm{Cov}(Y_t, Y_{t+k})}{\mathrm{Var}(Y_t)} = \dfrac{\rho^k \sigma_u^2/(1-\rho^2)}{\sigma_u^2/(1-\rho^2)} = \rho^k$
Look: since $-1 < \rho < 1$, as $k \to \infty$ the correlation $\to 0$
Moving Average Process MA
• MA(1):
$Y_t = u_t + \theta u_{t-1}$
$E(Y_t) = 0$
$\mathrm{Var}(Y_t) = \mathrm{Var}(u_t + \theta u_{t-1}) = \sigma_u^2 + \theta^2 \sigma_u^2$
Moving Average Process MA
Variance derivation
$\mathrm{Var}(Y_t) = \mathrm{Var}(u_t + \theta u_{t-1}) = (1 + \theta^2)\sigma_u^2 = \sigma_u^2 + \theta^2 \sigma_u^2$
Moving Average Process MA
• MA(1):
$\mathrm{Cov}(Y_t, Y_{t-1}) = E[(u_t + \theta u_{t-1})(u_{t-1} + \theta u_{t-2})] = \theta \sigma_u^2$
Moving Average Process MA
• Covariance derivation (contd.):
$\mathrm{Cov}(Y_t, Y_{t-1}) = E[(u_t + \theta u_{t-1})(u_{t-1} + \theta u_{t-2})]$
$= E[u_t u_{t-1} + \theta u_t u_{t-2} + \theta u_{t-1}^2 + \theta^2 u_{t-1} u_{t-2}]$
$= \theta \sigma_u^2$
Autocorrelation
$\mathrm{Cov}(Y_t, Y_{t-1}) = E[(u_t + \theta u_{t-1})(u_{t-1} + \theta u_{t-2})] = \theta \sigma_u^2$
$\mathrm{Cov}(Y_t, Y_{t-2}) = E[(u_t + \theta u_{t-1})(u_{t-2} + \theta u_{t-3})] = 0$
Autocorrelation
MA(k)
MA(1): $\mathrm{Cov}(Y_t, Y_{t-1}) = \theta\sigma_u^2$; $\mathrm{Cov}(Y_t, Y_{t-2}) = 0$
MA(k): $\mathrm{Cov}(Y_t, Y_{t-j}) \neq 0$ up to lag $k$
$\mathrm{Cov}(Y_t, Y_{t-s}) = 0$ if $s > k$
Autoregressive Process
• AR(1): $Y_t = \rho Y_{t-1} + u_t$, $u_t$ white noise, $-1 < \rho < 1$
• $E(Y_t) = 0$
• $\mathrm{Var}(Y_t) = \dfrac{\sigma_u^2}{1-\rho^2}$
• $\mathrm{Cov}(Y_t, Y_{t+k}) = \rho^k\,\dfrac{\sigma_u^2}{1-\rho^2}$
• $\mathrm{Corr}(Y_t, Y_{t+k}) = \rho^k$, and $\rho^k \to 0$ as $k \to \infty$
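These moments are easy to verify numerically. A minimal sketch (not from the original slides), assuming Python with numpy; the values of rho, sigma_u, and the burn-in length are illustrative choices:

```python
# Sketch: simulate an AR(1) and check the moments summarized above.
import numpy as np

rng = np.random.default_rng(0)
rho, sigma_u, n = 0.7, 1.0, 200_000

u = rng.normal(0.0, sigma_u, n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = rho * y[t - 1] + u[t]      # Y_t = rho * Y_{t-1} + u_t

y = y[1000:]                          # drop burn-in so the process is near stationarity
print("mean      :", round(y.mean(), 3), "(theory: 0)")
print("variance  :", round(y.var(), 3), "(theory:", sigma_u**2 / (1 - rho**2), ")")
for k in (1, 2, 3):
    corr = np.corrcoef(y[:-k], y[k:])[0, 1]
    print(f"corr lag {k}:", round(corr, 3), "(theory:", round(rho**k, 3), ")")
```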
Moving Average Process
• MA(1): $Y_t = u_t + \theta u_{t-1}$, $t = 1, 2, \ldots$
• $E(Y_t) = 0$
• $\mathrm{Var}(Y_t) = \sigma_u^2 + \theta^2 \sigma_u^2$
How Do You Calculate Autocorrelation?
$\mathrm{Corr}(Y_t, Y_{t-k}) = \rho_k = \dfrac{\mathrm{Cov}(Y_t, Y_{t-k})}{\sqrt{\mathrm{Var}(Y_t)\,\mathrm{Var}(Y_{t-k})}}$
Under stationarity $\mathrm{Var}(Y_t) = \mathrm{Var}(Y_{t-k})$, so
$\rho_k = \dfrac{\mathrm{Cov}(Y_t, Y_{t-k})}{\mathrm{Var}(Y_t)} = \dfrac{\gamma_k}{\gamma_0}$
How Do You Calculate Autocorrelation?
AR(1):
$\rho_k = \dfrac{\rho^k \sigma_u^2/(1-\rho^2)}{\sigma_u^2/(1-\rho^2)}$
$\rho_k = \mathrm{corr}(Y_t, Y_{t-k}) = \rho^k$
Since $-1 < \rho < 1$, $\rho^k$ decreases exponentially as $k \to \infty$
How Do You Calculate Autocorrelation?
MA(1):
$\rho_k = \dfrac{\mathrm{Cov}(Y_t, Y_{t-k})}{\mathrm{Var}(Y_t)}$
$\rho_1 = \dfrac{\theta \sigma_u^2}{(1+\theta^2)\sigma_u^2} = \dfrac{\theta}{1+\theta^2}$
$\rho_2 = \dfrac{\mathrm{Cov}(Y_t, Y_{t-2})}{\mathrm{Var}(Y_t)} = \dfrac{0}{(1+\theta^2)\sigma_u^2} = 0$
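Again, a quick numerical check of these formulas; a sketch assuming numpy, with theta = 0.6 purely illustrative:

```python
# Sketch: sample autocorrelations of an MA(1) against the formulas above.
import numpy as np

rng = np.random.default_rng(1)
theta, n = 0.6, 200_000

u = rng.standard_normal(n + 1)
y = u[1:] + theta * u[:-1]            # Y_t = u_t + theta * u_{t-1}

for k in (1, 2, 3):
    corr = np.corrcoef(y[:-k], y[k:])[0, 1]
    print(f"rho_{k}: {corr:.3f}")
print("theory: rho_1 =", round(theta / (1 + theta**2), 3), "; rho_k = 0 for k > 1")
```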
Autocorrelation

We are going to consider the autocorrelation function: it gives the relationship between the autocorrelation and the time distance between the variables.
Autocorrelation
By looking at the autocorrelation function you can try to understand what the underlying stochastic process is.

ARMA(1,1): $Y_t = \rho Y_{t-1} + u_t + \theta u_{t-1}$
ARMA(p,q): $Y_t = \rho_1 Y_{t-1} + \rho_2 Y_{t-2} + \cdots + \rho_p Y_{t-p} + u_t + \theta_1 u_{t-1} + \cdots + \theta_q u_{t-q}$
Box-Jenkins Methodology
Box-Jenkins consists of 4 steps

1. Identification
The ACF and the correlogram are the relevant tools

2. Estimation
Estimate the parameters
Box-Jenkins Methodology
Box-Jenkins consists of 4 steps (continued)

3. Diagnostic checking
Does the model fit the data?
Are the estimated residuals white noise?
If the answer is no: start again

4. Forecasting
This is the reason we like time series!
Box-Jenkins Methodology

1. Identification
We have seen the autocorrelation function:
$\rho_k = \dfrac{\mathrm{Cov}(Y_t, Y_{t-k})}{\mathrm{Var}(Y_t)} = \dfrac{\gamma_k}{\gamma_0}$
Box-Jenkins Methodology
1. Identification
We can calculate the sample autocorrelation function:
$\hat{\gamma}_k = \dfrac{\sum_t (Y_t - \bar{Y})(Y_{t+k} - \bar{Y})}{n}$
$\hat{\gamma}_0 = \dfrac{\sum_t (Y_t - \bar{Y})^2}{n}$
$\hat{\rho}_k = \dfrac{\hat{\gamma}_k}{\hat{\gamma}_0}$
Then we plot $\hat{\rho}_k$ against the lag $k$ (the correlogram)
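A direct implementation of these formulas; a sketch assuming numpy, with the random-walk input purely illustrative:

```python
# Sketch of the sample ACF defined above: rho_hat_k = gamma_hat_k / gamma_hat_0.
import numpy as np

def sample_acf(y, max_lag):
    """Return rho_hat_1 ... rho_hat_max_lag using the slide's formulas."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    dev = y - y.mean()
    gamma0 = np.sum(dev**2) / n                      # gamma_hat_0 (sample variance)
    acf = []
    for k in range(1, max_lag + 1):
        gamma_k = np.sum(dev[:-k] * dev[k:]) / n     # gamma_hat_k
        acf.append(gamma_k / gamma0)                 # rho_hat_k
    return np.array(acf)

rng = np.random.default_rng(2)
y = rng.standard_normal(500).cumsum()                # toy series; replace with real data
print(sample_acf(y, max_lag=5))
```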
Box-Jenkins Methodology
1. Identification

2. Estimation

3. Diagnostic Checking

4. Forecasting
Box-Jenkins Methodology
1. Identification: Autocorrelation Function
[Figure: sample ACFs of an AR(1) and an MA(1); $\mathrm{Corr}(Y_t, Y_{t-1})$, $\mathrm{Corr}(Y_t, Y_{t-2})$, … plotted against lags 1–5]

AR(p): the ACF decays exponentially, with a damped sine wave pattern, or both
Box-Jenkins Methodology
1. Identification
[Figure: ACF with spikes at lags 1 through q, then cutting off]

MA(q): the ACF spikes through q lags
Box-Jenkins Methodology
a) Autocorrelation function

b) Partial autocorrelation function, $\phi_{kk}$
It measures the correlation between time series observations that are $k$ time periods apart, after controlling for the correlations at the intermediate lags
Box-Jenkins Methodology
Consider $Y_t$ and $Y_{t-k}$.
When you calculate the correlation between these two observations, you pick up the effect of the intermediate Y's: $Y_{t-1}, Y_{t-2}, \ldots, Y_{t-k+1}$ have an impact on $\mathrm{corr}(Y_t, Y_{t-k})$.
When you calculate the PACF, you net this out.
How do you estimate it? Run the regression (see the sketch below):
$Y_t = \alpha_0 + \alpha_1 Y_{t-1} + \alpha_2 Y_{t-2} + e_t$
The estimated coefficient on the longest lag ($\hat{\alpha}_2$ here) is the partial autocorrelation at lag 2.
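A sketch of this regression approach to the PACF, assuming only numpy; pacf_by_regression is a hypothetical helper name, not from the slides:

```python
# Sketch: PACF at lag k as the coefficient on Y_{t-k} in an OLS regression
# of Y_t on Y_{t-1}, ..., Y_{t-k}, as the slide describes.
import numpy as np

def pacf_by_regression(y, max_lag):
    y = np.asarray(y, dtype=float)
    pacf = []
    for k in range(1, max_lag + 1):
        # Regressors [1, Y_{t-1}, ..., Y_{t-k}] for t = k, ..., n-1
        X = np.column_stack([np.ones(len(y) - k)] +
                            [y[k - j : len(y) - j] for j in range(1, k + 1)])
        coef, *_ = np.linalg.lstsq(X, y[k:], rcond=None)   # OLS
        pacf.append(coef[-1])         # phi_kk: coefficient on the longest lag
    return np.array(pacf)

y = np.random.default_rng(11).standard_normal(500)
print(pacf_by_regression(y, max_lag=4))
```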
Box-Jenkins Methodology
What does it look like?
[Figure: PACF $\phi_{kk}$ of an AR(2), with spikes at lags 1 and 2 and nothing after; PACF of an MA(1), declining gradually over lags 1–5]
Box-Jenkins Methodology
Model       ACF                                      PACF
AR(p)       Declines exponentially, or with a        Spikes for p lags,
            damped sine wave pattern, or both        then it drops
MA(q)       Spikes through q lags                    Declines exponentially
ARMA(p,q)   Exponential decay                        Exponential decay
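In practice the two correlograms are plotted rather than tabulated. A sketch assuming the statsmodels and matplotlib packages are installed (an assumption; they are not part of the slides):

```python
# Sketch: inspect ACF and PACF side by side to identify the model.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

rng = np.random.default_rng(3)
u = rng.standard_normal(501)
y = u[1:] + 0.6 * u[:-1]            # illustrative MA(1); use your own data here

fig, axes = plt.subplots(2, 1)
plot_acf(y, lags=20, ax=axes[0])    # MA(1): should spike at lag 1 only
plot_pacf(y, lags=20, ax=axes[1])   # MA(1): should decline gradually
plt.show()
```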
Box-Jenkins Methodology
1. Identification

2. Estimation
• As usual

3. Diagnostic checking
• Plot the residuals
• Look at the ACF and PACF of the residuals $\hat{u}_t$

4. Forecasting
Non-Stationary Process
(Weak) Stationarity

$E(Y_t) = \mu$ — constant over time
$\mathrm{Var}(Y_t) = \sigma^2$ — constant over time
$\mathrm{Cov}(Y_t, Y_{t+k}) = \gamma_k$ — depends only on the time distance between the two observations
Non-Stationary Process
3 types of non-stationary processes:

1) Random walk

2) Random walk with drift

3) Trend stationary
Non-Stationary Process
What is the difference?
Their mean and variance change over time.

What does it mean that the mean/variance/covariance are constant?
It means that whenever the series is hit by a shock, it diverges from its mean, but then it goes back to it. The fluctuations around the mean have a constant amplitude.
Non-Stationary Process
Non-stationary time series:
They vary, but if the mean and variance change over time, each set of time series data is only a particular episode.
Non-Stationary Stochastic Process
A) Random walk:
$Y_t = Y_{t-1} + u_t$
Similar to an AR(1), $Y_t = \rho Y_{t-1} + u_t$, with $\rho = 1$
Non-Stationary Stochastic Process:
Random Walk
Let's see why it is non-stationary:
$Y_1 = Y_0 + u_1$
$Y_2 = Y_1 + u_2 = Y_0 + u_1 + u_2$
$Y_3 = Y_2 + u_3 = Y_0 + u_1 + u_2 + u_3 = Y_0 + \sum_{i=1}^{3} u_i$
$Y_t = Y_0 + \sum_{i=1}^{t} u_i$
Non-Stationary Stochastic Process:
Random Walk
Mean:
$E(Y_t) = E[Y_0 + \sum_{i=1}^{t} u_i] = Y_0$
since $\{u_t\}$ is white noise, $u_t \sim (0, \sigma_u^2)$
Non-Stationary Stochastic Process:
Random Walk
Variance:
$\mathrm{var}(Y_t) = \mathrm{var}[Y_0 + \sum_{i=1}^{t} u_i] = t\sigma_u^2$

• The variance increases with time
• Note one feature of the RW: random shocks are persistent
Non-Stationary Stochastic Process:
Random Walk
Variance derivation:
$\mathrm{var}(Y_t) = \mathrm{var}[Y_0 + \sum_{i=1}^{t} u_i] = \mathrm{var}[u_1 + u_2 + \cdots + u_t]$
$= \sigma_u^2 + \sigma_u^2 + \cdots + \sigma_u^2$ ($t$ times)
$= t\sigma_u^2$
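A simulation check that $\mathrm{var}(Y_t) = t\sigma_u^2$; a sketch assuming numpy, with the number of paths an illustrative choice:

```python
# Sketch: across many simulated random walks, var(Y_t) grows like t * sigma_u^2.
import numpy as np

rng = np.random.default_rng(4)
n_paths, T, sigma_u = 20_000, 100, 1.0

paths = rng.normal(0.0, sigma_u, (n_paths, T)).cumsum(axis=1)  # Y_t = sum of shocks
for t in (10, 50, 100):
    print(f"t={t}: var={paths[:, t - 1].var():.1f} (theory: {t * sigma_u**2})")
```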
Non-Stationary Stochastic Process
B) Random walk with drift:
$Y_t = \delta + Y_{t-1} + u_t$
$\delta$: drift parameter
Non-Stationary Stochastic Process:
Random Walk With Drift
$Y_t = \delta + Y_{t-1} + u_t$
$Y_1 = \delta + Y_0 + u_1$
$Y_2 = \delta + Y_1 + u_2 = \delta + (\delta + Y_0 + u_1) + u_2 = 2\delta + Y_0 + u_1 + u_2$
$Y_3 = \delta + Y_2 + u_3 = \delta + [2\delta + Y_0 + u_1 + u_2] + u_3 = 3\delta + Y_0 + \sum_{i=1}^{3} u_i$
$Y_t = t\delta + Y_0 + \sum_{i=1}^{t} u_i$
Non-Stationary Stochastic Process:
Random Walk With Drift
Mean:
$E(Y_t) = E[t\delta + Y_0 + \sum_{i=1}^{t} u_i] = t\delta + Y_0$

• The mean depends on time
Non-Stationary Stochastic Process:
Random Walk With Drift
Variance:
$\mathrm{var}(Y_t) = \mathrm{var}[t\delta + Y_0 + \sum_{i=1}^{t} u_i] = t\sigma_u^2$
Non-Stationary Stochastic Process
C) Trend stationary:
$Y_t = \beta_1 + \beta_2 t + u_t$
$\beta_2 t$: deterministic trend
Non-Stationary Stochastic Process:
Trend Stationary
Mean:
$E(Y_t) = E(\beta_1 + \beta_2 t + u_t) = \beta_1 + \beta_2 t$ — not constant

Variance:
$\mathrm{var}(Y_t) = \sigma_u^2$ — constant
Non-Stationary Stochastic Process:
Differencing
What shall we do? Differencing.
Say that you have a RW:
$Y_t = Y_{t-1} + u_t$
Subtract $Y_{t-1}$:
$Y_t - Y_{t-1} = u_t$
$\Delta Y_t = u_t$ — stationary
While $Y_t$ is non-stationary, $\Delta Y_t$ is stationary
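A sketch of differencing in practice, assuming numpy:

```python
# Sketch: first-differencing a simulated random walk yields a stationary series.
import numpy as np

rng = np.random.default_rng(5)
u = rng.standard_normal(1_000)
y = u.cumsum()                 # random walk: Y_t = Y_{t-1} + u_t
dy = np.diff(y)                # Delta Y_t = u_t

# The level wanders; the difference recovers white noise.
print("var of level     :", round(y.var(), 1))
print("var of difference:", round(dy.var(), 3), "(close to sigma_u^2 = 1)")
```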
Non-Stationary Stochastic Process:
Differencing
Differencing a RW with drift:
$Y_t = \delta + Y_{t-1} + u_t$
$\Delta Y_t = Y_t - Y_{t-1} = \delta + u_t$
You can take the first difference: it is stationary
Non-Stationary Stochastic Process:
Trend Stationary
Detrending
The mean of a trend stationary process is:
$Y_t = \beta_1 + \beta_2 t + u_t$
$E(Y_t) = \beta_1 + \beta_2 t$
If we subtract the mean of $Y_t$ from $Y_t$, the series is stationary: detrending
Non-Stationary Stochastic Process:
Trend Stationary
Detrending
$Y_t = \beta_1 + \beta_2 t + u_t$
$\hat{Y}_t = \hat{\beta}_1 + \hat{\beta}_2 t$
$Y_t - \hat{Y}_t$ — stationary
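A sketch of detrending by OLS, assuming numpy; the trend coefficients are illustrative:

```python
# Sketch: estimate the deterministic trend by OLS and subtract the fit.
import numpy as np

rng = np.random.default_rng(6)
T = 500
t = np.arange(T)
y = 2.0 + 0.05 * t + rng.standard_normal(T)   # trend stationary: beta1 + beta2*t + u_t

b2, b1 = np.polyfit(t, y, deg=1)              # OLS fit of a linear trend
detrended = y - (b1 + b2 * t)                 # Y_t - Y_hat_t: stationary residual
print("estimated trend:", round(b1, 2), "+", round(b2, 3), "* t")
print("mean of detrended series:", round(detrended.mean(), 4))
```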
Integrated Process
A non-stationary process is said to be integrated of order 1 if it has to be differenced once to make it stationary.

$Y \sim I(d)$: it has to be differenced $d$ times to make it stationary.
Integrated Process
Some properties:
1) $X_t \sim I(0)$ (stationary), $Y_t \sim I(1)$ (non-stationary)
$\Rightarrow Z_t = X_t + Y_t \sim I(1)$

2) $X_t \sim I(d) \Rightarrow Z_t = a + bX_t \sim I(d)$
Integrated Process
Some properties:
3) $X_t \sim I(d_1)$, $Y_t \sim I(d_2)$, $d_2 > d_1$
$\Rightarrow Z_t = aX_t + bY_t \sim I(d_2)$

4) $X_t \sim I(d)$, $Y_t \sim I(d)$
$\Rightarrow Z_t = aX_t + bY_t \sim I(d^*)$, with $d^* \le d$: when $d^* < d$, cointegration
Spurious Regressions
Say that you have two variables, $Y_t \sim I(1)$ and $X_t \sim I(1)$. It means that they are highly trended.
$X_t = X_{t-1} + a_t$, $a_t \sim (0, \sigma_a^2)$
$Y_t = Y_{t-1} + b_t$, $b_t \sim (0, \sigma_b^2)$
Two RW processes, where $a_t$ and $b_t$ are independent.
Suppose that you run:
$Y_t = \beta_0 + \beta_1 X_t + u_t$
Spurious Regressions (Continued)
Say that you know they are totally independent.
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
You would expect not to reject $H_0$.
But….
The t-test will reject the null hypothesis far more often than it should.
This is the spurious regression problem.
Spurious Regressions (Continued)
Why does it occur?
$Y_t = \beta_0 + \beta_1 X_t + u_t$
$u_t$ should be homoscedastic and not serially correlated.

Look at $u_t$:
$u_t = Y_t - \beta_0 - \beta_1 X_t$
Spurious Regressions (Continued)
$u_t$ has always been assumed to be well-behaved, but in this case it is a linear combination of two integrated processes:
It is non-stationary
The Gauss-Markov assumptions are violated
The t-statistic doesn't have a limiting standard normal distribution
$R^2$ will be high (e.g. 0.99)
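The problem is easy to reproduce by Monte Carlo. A sketch assuming numpy; 100 observations and 2,000 replications are illustrative choices:

```python
# Sketch: spurious regression Monte Carlo. Two independent random walks,
# yet |t| > 1.96 far more often than the nominal 5%.
import numpy as np

rng = np.random.default_rng(7)
T, reps, rejections = 100, 2_000, 0

for _ in range(reps):
    x = rng.standard_normal(T).cumsum()       # X_t = X_{t-1} + a_t
    y = rng.standard_normal(T).cumsum()       # Y_t = Y_{t-1} + b_t, independent of X
    X = np.column_stack([np.ones(T), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    s2 = resid @ resid / (T - 2)
    se_b1 = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    if abs(beta[1] / se_b1) > 1.96:           # naive t-test on the slope
        rejections += 1

print("rejection rate:", rejections / reps)   # typically well above 0.05
```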
Random Walk
A random walk process (with/without drift) is an example of a unit root process.
$Y_t = \rho Y_{t-1} + u_t$, $-1 \le \rho \le 1$
• If $\rho = 1$: random walk process, $Y_t = Y_{t-1} + u_t$
• Mean is constant over time; variance is not constant
• A unit root process
• If $|\rho| < 1$: AR(1), it is stationary
Dickey-Fuller Test
How do we detect whether a process is a unit root process or not?
The Dickey-Fuller test.
Start with the general process:
$Y_t = \rho Y_{t-1} + u_t$ (*)
We would like to know whether $\rho = 1$.
Play with (*) and subtract $Y_{t-1}$:
$Y_t - Y_{t-1} = \rho Y_{t-1} - Y_{t-1} + u_t$
$\Delta Y_t = (\rho - 1) Y_{t-1} + u_t$
$\Delta Y_t = \delta Y_{t-1} + u_t$, where $\delta = \rho - 1$
Dickey-Fuller Test (Continued)

$H_0: \delta = 0$ (i.e. $\rho = 1$)
$H_a: \delta < 0$
If $\delta = 0$, then $\Delta Y_t = u_t$: the first difference is stationary.

If you cannot reject the null hypothesis, we cannot reject that $Y_t$ is non-stationary: evidence in favour of non-stationarity.

If we reject $H_0$: evidence in favour of stationarity.
Dickey-Fuller Test (Continued)
Can we use the t-test?

No! Why?
Under the null hypothesis, the model is non-stationary.
The t-statistic of the estimated coefficient doesn't follow a t distribution, not even in large samples.
Dickey-Fuller Test (Continued)
What should we do then?

Dickey and Fuller created tables of critical values using Monte Carlo simulation.
You use the tau statistic.

The table of critical values is made of three panels. Why?
The critical values depend on the model used.
Dickey-Fuller Test (Continued)
Upper panel:
$\Delta Y_t = \delta Y_{t-1} + u_t$ — random walk

Middle panel:
$\Delta Y_t = \beta_1 + \delta Y_{t-1} + u_t$ — RW with drift

Lower panel:
$\Delta Y_t = \beta_1 + \beta_2 t + \delta Y_{t-1} + u_t$ — RW with drift and trend
Dickey-Fuller Test (Continued)
The null hypothesis is the same:
$H_0: \delta = 0$; $H_a: \delta < 0$
tau statistic: $\tau = \dfrac{\hat{\delta}}{se(\hat{\delta})}$
But the critical values depend on the model.
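In software, the tau statistic and the matching critical values come packaged together. A sketch assuming the statsmodels package (an assumption, not part of the slides); its adfuller function implements the (augmented) Dickey-Fuller test, and the regression argument selects the panel ("n" no constant, "c" drift, "ct" drift and trend):

```python
# Sketch: Dickey-Fuller test on a simulated random walk.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(8)
y = rng.standard_normal(300).cumsum()         # random walk: should not reject H0

stat, pvalue, *_ = adfuller(y, maxlag=0, regression="c")
print("tau statistic:", round(stat, 3), "p-value:", round(pvalue, 3))
# Large p-value: cannot reject H0 (delta = 0), evidence of non-stationarity.
```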
Augmented Dickey-Fuller Test
We have seen that a way of carrying out the unit root test is to write the RW as:
$\Delta Y_t = \alpha + \delta Y_{t-1} + e_t$
$H_0: \delta = 0$
$H_a: \delta < 0$
We could also have more complicated models with additional lags:
$\Delta Y_t = \alpha + \delta Y_{t-1} + \gamma_1 \Delta Y_{t-1} + e_t$, where $|\gamma_1| < 1$
Under $H_0: \delta = 0$, $\{\Delta Y_t\}$ follows a stable AR(1) model.
Augmented Dickey-Fuller Test
(Continued)
We can also add more lags: run a regression of
$\Delta Y_t$ on $Y_{t-1}, \Delta Y_{t-1}, \Delta Y_{t-2}, \ldots, \Delta Y_{t-p}$
and carry out the $\tau$ test on $\hat{\delta}$ as before.

This is the Augmented Dickey-Fuller test.
Cointegration and Error Correction
Models
We have talked about the fact that if $X_t \sim I(1)$ and $Y_t \sim I(1)$, a linear combination of the two is I(1).
However, there are exceptions!
It is possible that there exists a $\lambda$ such that:
$Z_t = Y_t - \lambda X_t \sim I(0)$
In this case we say that the two variables are cointegrated.
If such a $\lambda$ exists, then $\lambda$ is the cointegration parameter.
Cointegration and Error Correction
Models

Two variables are said to be cointegrated if they have a long-term, or equilibrium, relationship between them.

It means that they drift together: they share a common trend.
Testing For Cointegration
How do you test for cointegration?
1) If $\lambda$ is known:
$Z_t = Y_t - \lambda X_t$
$Z_t = \rho Z_{t-1} + e_t$
$\Delta Z_t = (\rho - 1) Z_{t-1} + e_t$ — test whether $Z_t$ has a unit root

Apply the Dickey-Fuller test to $Z_t$.
Testing For Cointegration
a) If we find evidence that $Z_t$ is stationary (we reject the null hypothesis of non-stationarity):
the two variables share a common stochastic trend — they are cointegrated.

b) If we find evidence that $Z_t$ is non-stationary (we cannot reject the null hypothesis of non-stationarity):
the two variables are not cointegrated.
Testing For Cointegration
2) If $\lambda$ is unknown:
We have to rely on the residuals.

a) Consider the case of a spurious regression: in a spurious regression the errors are non-stationary.

b) Then, in the presence of cointegration it must be that the errors are stationary.
Testing For Cointegration
$Y_t = \alpha + \lambda X_t + u_t$
Estimate it:
$Y_t = \hat{\alpha} + \hat{\lambda} X_t + \hat{u}_t$
$\hat{u}_t = Y_t - \hat{\alpha} - \hat{\lambda} X_t$
If the two variables are cointegrated, the $\hat{u}_t$ are stationary.
$\hat{u}_t$ is a linear combination of the two series.
Testing For Cointegration
Test:
$\hat{u}_t = \rho \hat{u}_{t-1} + e_t$, where $e_t$ is a white noise process

and use the Dickey-Fuller critical values for cointegration.
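A sketch of this residual-based (Engle-Granger) procedure, assuming the statsmodels package (an assumption); its coint function runs the first-stage OLS and the residual unit-root test with the appropriate cointegration critical values. The simulated data are illustrative:

```python
# Sketch: two-step cointegration test on a simulated cointegrated pair.
import numpy as np
from statsmodels.tsa.stattools import coint

rng = np.random.default_rng(9)
T = 300
x = rng.standard_normal(T).cumsum()           # X_t ~ I(1)
y = 2.0 * x + rng.standard_normal(T)          # Y_t = lambda * X_t + stationary error

t_stat, pvalue, _ = coint(y, x)               # step 1: OLS; step 2: ADF on residuals
print("t-stat:", round(t_stat, 2), "p-value:", round(pvalue, 4))
# Small p-value: residuals are stationary, so Y and X are cointegrated.
```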
Testing For Cointegration
If $Y_t$ and $X_t$ are I(1) and they are not cointegrated:
• A regression of $Y_t$ on $X_t$ does not mean anything
• You can still take the first differences of each variable and work with $\Delta Y_t$, $\Delta X_t$

If $Y_t$ and $X_t$ are cointegrated:
• OK — it means that a long-run equilibrium relationship exists.
Forecasting

We are at time t. We want to forecast $Y_{t+1}$. What do we do?

We have an information set at time t, $I_t$: we know all the previous values taken by $Y_t$ and by the other variables.
Forecasting
Consider an AR(1):
$Y_t = \alpha_0 + \alpha_1 Y_{t-1} + e_t$
Estimate it and get the estimated values $\hat{\alpha}_0, \hat{\alpha}_1$.
Now update the process:
$Y_{t+1} = \alpha_0 + \alpha_1 Y_t + e_{t+1}$
$E(Y_{t+1} \mid I_t) = \hat{\alpha}_0 + \hat{\alpha}_1 Y_t$
We can forecast forward:
$Y_{t+2} = \alpha_0 + \alpha_1 Y_{t+1} + e_{t+2}$
$E_t Y_{t+2} = \hat{\alpha}_0 + \hat{\alpha}_1 E_t(Y_{t+1}) = \hat{\alpha}_0 + \hat{\alpha}_1[\hat{\alpha}_0 + \hat{\alpha}_1 Y_t]$
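A sketch of this forecasting recursion, assuming numpy; the simulated AR(1) with $\alpha_0 = 1$, $\alpha_1 = 0.8$ is illustrative:

```python
# Sketch: estimate an AR(1) by OLS and iterate the forecast forward,
# exactly as in the recursion above.
import numpy as np

rng = np.random.default_rng(10)
y = [0.0]
for _ in range(499):                          # simulate Y_t = 1 + 0.8*Y_{t-1} + e_t
    y.append(1.0 + 0.8 * y[-1] + rng.standard_normal())
y = np.array(y)

X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
a0_hat, a1_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]   # OLS estimates

forecast = y[-1]                              # condition on the last observed value
for h in (1, 2, 3):
    forecast = a0_hat + a1_hat * forecast     # E_t(Y_{t+h}) = a0 + a1 * E_t(Y_{t+h-1})
    print(f"{h}-step-ahead forecast:", round(forecast, 3))
```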
Forecasting

The quality of the forecast deteriorates as we forecast farther out into the future.

Of course forecasts are not exact, so we have to consider the forecast error.
1-step-ahead forecast error:
$\hat{e}_t(1) = Y_{t+1} - E_t(Y_{t+1})$
