
Time Series Analysis

EC3090 Econometrics
Gaia Narciso
Trinity College Dublin

Time Series vs. Cross Sectional
• Time series data has a temporal ordering, unlike cross-sectional data
Time Series vs. Cross Sectional
• Question: how do we think about randomness in time series data?
• Consider the series of Irish GDP data. We can interpret the observation for each year as a realization of a stochastic process.
• We only observe one realization, because we cannot start the process over again. If history had been different, we would have obtained a different realization of the stochastic (random) process.
Time Series vs. Cross Sectional
• A stochastic process (or time series process) is a sequence of random variables
• When we collect time series data, we actually collect one realization of this stochastic process
• Sample size: the number of time periods over which we observe the variables of interest
Time Series vs. Cross Sectional
• To sum up:
Cross-sectional data: Population → Sample
Time series data: Stochastic process (time series process) → Realization
We are going to consider different stochastic processes. Before doing so, we give some definitions.
Stationary Time Series
• Covariance stationary:
$E(Y_t) = \mu$
$\mathrm{Var}(Y_t) = \sigma^2$
$\mathrm{Cov}(Y_t, Y_{t+k}) = \gamma_k$
Autoregressive Process AR
• AR(1):
$Y_t = \rho Y_{t-1} + u_t$, where $u_t$ is white noise
If $-1 < \rho < 1$, then the AR(1) is stationary
$E(Y_t) = E[\rho Y_{t-1} + u_t]$
$E(Y_t) = 0$
Autoregressive Process AR
• $E(Y_t)$ derivation:
$E(Y_t) = E[\rho Y_{t-1} + u_t] = \rho E[Y_{t-1}] = \rho E[Y_t]$ (by stationarity)
$(1-\rho)E(Y_t) = 0 \;\Rightarrow\; E(Y_t) = 0$
Autoregressive Process AR
• AR(1):
$\mathrm{Var}(Y_t) = \mathrm{Var}(\rho Y_{t-1} + u_t)$
$\mathrm{Var}(Y_t) = \dfrac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• $\mathrm{Var}(Y_t)$ derivation:
$\mathrm{Var}(Y_t) = \mathrm{Var}(\rho Y_{t-1} + u_t) = \rho^2\,\mathrm{Var}(Y_{t-1}) + \sigma_u^2$
$(1-\rho^2)\,\mathrm{Var}(Y_t) = \sigma_u^2$
$\mathrm{Var}(Y_t) = \dfrac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• AR(1):
$\mathrm{Cov}(Y_t, Y_{t+1}) = ?$
$Y_{t+1} = \rho Y_t + u_{t+1}$
$\mathrm{Cov}(Y_t, Y_{t+1}) = E(Y_t Y_{t+1})$
$\mathrm{Cov}(Y_t, Y_{t+1}) = \rho\,\dfrac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• Covariance derivation:
$\mathrm{Cov}(Y_t, Y_{t+1}) = E(Y_t Y_{t+1}) = E[Y_t(\rho Y_t + u_{t+1})] = \rho E(Y_t^2)$
$\mathrm{Cov}(Y_t, Y_{t+1}) = \rho\,\dfrac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• AR(1):
$\mathrm{Cov}(Y_t, Y_{t+2}) = E(Y_t Y_{t+2}) = \rho^2\,\dfrac{\sigma_u^2}{1-\rho^2}$
We can generalize:
$\mathrm{Cov}(Y_t, Y_{t+k}) = \rho^k\,\dfrac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• Covariance derivation:
$\mathrm{Cov}(Y_t, Y_{t+2}) = E(Y_t Y_{t+2}) = E[Y_t(\rho Y_{t+1} + u_{t+2})]$
$= \rho E[Y_t Y_{t+1}] + E[Y_t u_{t+2}] = \rho E[Y_t(\rho Y_t + u_{t+1})]$
$= \rho^2 E[Y_t^2]$
Autoregressive Process AR
Covariance derivation (contd.)
$\mathrm{Cov}(Y_t, Y_{t+2}) = \rho^2 E(Y_t^2) = \rho^2\,\dfrac{\sigma_u^2}{1-\rho^2}$
Autoregressive Process AR
• AR(1):
$-1 < \rho < 1$, so the covariance between two terms that are very distant in time is low:
$\rho^k \to 0$ as $k \to \infty$
Autoregressive Process AR
• AR(1):
$\mathrm{Corr}(Y_t, Y_{t+k}) = \dfrac{\mathrm{Cov}(Y_t, Y_{t+k})}{\mathrm{Var}(Y_t)} = \dfrac{\rho^k \sigma_u^2/(1-\rho^2)}{\sigma_u^2/(1-\rho^2)} = \rho^k$
Look: since $-1 < \rho < 1$, as $k \to \infty$ the correlation $\to 0$
Moving Average Process MA
• MA(1):
$Y_t = u_t + \theta u_{t-1}$
$E(Y_t) = 0$
$\mathrm{Var}(Y_t) = \mathrm{Var}(u_t + \theta u_{t-1}) = \sigma_u^2 + \theta^2 \sigma_u^2$
Moving Average Process MA
Variance derivation
$\mathrm{Var}(Y_t) = \mathrm{Var}(u_t + \theta u_{t-1}) = (1 + \theta^2)\sigma_u^2 = \sigma_u^2 + \theta^2 \sigma_u^2$
Moving Average Process MA
• MA(1):
$\mathrm{Cov}(Y_t, Y_{t-1}) = E[(u_t + \theta u_{t-1})(u_{t-1} + \theta u_{t-2})] = \theta \sigma_u^2$
Moving Average Process MA
• Covariance derivation (contd.):
$\mathrm{Cov}(Y_t, Y_{t-1}) = E[(u_t + \theta u_{t-1})(u_{t-1} + \theta u_{t-2})]$
$= E[u_t u_{t-1} + \theta u_t u_{t-2} + \theta u_{t-1}^2 + \theta^2 u_{t-1} u_{t-2}]$
$= \theta \sigma_u^2$
Autocorrelation
$\mathrm{Cov}(Y_t, Y_{t-1}) = E[(u_t + \theta u_{t-1})(u_{t-1} + \theta u_{t-2})] = \theta \sigma_u^2$
$\mathrm{Cov}(Y_t, Y_{t-2}) = E[(u_t + \theta u_{t-1})(u_{t-2} + \theta u_{t-3})] = 0$
Autocorrelation
MA(k)
MA(1): $\mathrm{Cov}(Y_t, Y_{t-1}) = \theta\sigma_u^2$; $\mathrm{Cov}(Y_t, Y_{t-2}) = 0$
MA(k): $\mathrm{Cov}(Y_t, Y_{t-j}) \neq 0$ up to lag $k$
$\mathrm{Cov}(Y_t, Y_{t-s}) = 0$ if $s > k$
Autoregressive Process
• AR(1): $Y_t = \rho Y_{t-1} + u_t$, $u_t$ white noise, $-1 < \rho < 1$
• $E(Y_t) = 0$
• $\mathrm{Var}(Y_t) = \dfrac{\sigma_u^2}{1-\rho^2}$
• $\mathrm{Cov}(Y_t, Y_{t+k}) = \rho^k\,\dfrac{\sigma_u^2}{1-\rho^2}$
• $\mathrm{Corr}(Y_t, Y_{t+k}) = \rho^k$, and $\rho^k \to 0$ as $k \to \infty$
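These moments are easy to verify numerically. A minimal sketch (not from the original slides), assuming Python with numpy; the values of rho, sigma_u, and the burn-in length are illustrative choices:

```python
# Sketch: simulate an AR(1) and check the moments summarized above.
import numpy as np

rng = np.random.default_rng(0)
rho, sigma_u, n = 0.7, 1.0, 200_000

u = rng.normal(0.0, sigma_u, n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = rho * y[t - 1] + u[t]      # Y_t = rho * Y_{t-1} + u_t

y = y[1000:]                          # drop burn-in so the process is near stationarity
print("mean      :", round(y.mean(), 3), "(theory: 0)")
print("variance  :", round(y.var(), 3), "(theory:", sigma_u**2 / (1 - rho**2), ")")
for k in (1, 2, 3):
    corr = np.corrcoef(y[:-k], y[k:])[0, 1]
    print(f"corr lag {k}:", round(corr, 3), "(theory:", round(rho**k, 3), ")")
```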
Moving Average Process
• MA(1): $Y_t = u_t + \theta u_{t-1}$, $t = 1, 2, \ldots$
• $E(Y_t) = 0$
• $\mathrm{Var}(Y_t) = \sigma_u^2 + \theta^2 \sigma_u^2$
How Do You Calculate Autocorrelation?
$\mathrm{Corr}(Y_t, Y_{t-k}) = \rho_k = \dfrac{\mathrm{Cov}(Y_t, Y_{t-k})}{\sqrt{\mathrm{Var}(Y_t)\,\mathrm{Var}(Y_{t-k})}}$
Under stationarity $\mathrm{Var}(Y_t) = \mathrm{Var}(Y_{t-k})$, so
$\rho_k = \dfrac{\mathrm{Cov}(Y_t, Y_{t-k})}{\mathrm{Var}(Y_t)} = \dfrac{\gamma_k}{\gamma_0}$
How Do You Calculate Autocorrelation?
AR(1):
$\rho_k = \dfrac{\rho^k \sigma_u^2/(1-\rho^2)}{\sigma_u^2/(1-\rho^2)}$
$\rho_k = \mathrm{corr}(Y_t, Y_{t-k}) = \rho^k$
Since $-1 < \rho < 1$, $\rho^k$ decreases exponentially as $k \to \infty$
How Do You Calculate Autocorrelation?
MA(1):
$\rho_k = \dfrac{\mathrm{Cov}(Y_t, Y_{t-k})}{\mathrm{Var}(Y_t)}$
$\rho_1 = \dfrac{\theta \sigma_u^2}{(1+\theta^2)\sigma_u^2} = \dfrac{\theta}{1+\theta^2}$
$\rho_2 = \dfrac{\mathrm{Cov}(Y_t, Y_{t-2})}{\mathrm{Var}(Y_t)} = \dfrac{0}{(1+\theta^2)\sigma_u^2} = 0$
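Again, a quick numerical check of these formulas; a sketch assuming numpy, with theta = 0.6 purely illustrative:

```python
# Sketch: sample autocorrelations of an MA(1) against the formulas above.
import numpy as np

rng = np.random.default_rng(1)
theta, n = 0.6, 200_000

u = rng.standard_normal(n + 1)
y = u[1:] + theta * u[:-1]            # Y_t = u_t + theta * u_{t-1}

for k in (1, 2, 3):
    corr = np.corrcoef(y[:-k], y[k:])[0, 1]
    print(f"rho_{k}: {corr:.3f}")
print("theory: rho_1 =", round(theta / (1 + theta**2), 3), "; rho_k = 0 for k > 1")
```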
Autocorrelation

We are going to consider the autocorrelation function: it gives the relationship between the autocorrelation and the time distance between the variables.
Autocorrelation
By looking at the autocorrelation function you can try to understand what the underlying stochastic process is.

ARMA(1,1): $Y_t = \rho Y_{t-1} + u_t + \theta u_{t-1}$
ARMA(p,q): $Y_t = \rho_1 Y_{t-1} + \rho_2 Y_{t-2} + \cdots + \rho_p Y_{t-p} + u_t + \theta_1 u_{t-1} + \cdots + \theta_q u_{t-q}$
Box-Jenkins Methodology
Box-Jenkins consists of 4 steps

1. Identification
The ACF and the correlogram are the relevant tools

2. Estimation
Estimate the parameters
Box-Jenkins Methodology
Box-Jenkins consists of 4 steps (continued)

3. Diagnostic checking
Does the model fit the data?
Are the estimated residuals white noise?
If the answer is no: start again

4. Forecasting
This is the reason we like time series!
Box-Jenkins Methodology

1. Identification
We have seen the autocorrelation function:
$\rho_k = \dfrac{\mathrm{Cov}(Y_t, Y_{t-k})}{\mathrm{Var}(Y_t)} = \dfrac{\gamma_k}{\gamma_0}$
Box-Jenkins Methodology
1. Identification
We can calculate the sample autocorrelation function:
$\hat{\gamma}_k = \dfrac{\sum_t (Y_t - \bar{Y})(Y_{t+k} - \bar{Y})}{n}$
$\hat{\gamma}_0 = \dfrac{\sum_t (Y_t - \bar{Y})^2}{n}$
$\hat{\rho}_k = \dfrac{\hat{\gamma}_k}{\hat{\gamma}_0}$
Then we plot $\hat{\rho}_k$ against the lag $k$ (the correlogram)
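A direct implementation of these formulas; a sketch assuming numpy, with the random-walk input purely illustrative:

```python
# Sketch of the sample ACF defined above: rho_hat_k = gamma_hat_k / gamma_hat_0.
import numpy as np

def sample_acf(y, max_lag):
    """Return rho_hat_1 ... rho_hat_max_lag using the slide's formulas."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    dev = y - y.mean()
    gamma0 = np.sum(dev**2) / n                      # gamma_hat_0 (sample variance)
    acf = []
    for k in range(1, max_lag + 1):
        gamma_k = np.sum(dev[:-k] * dev[k:]) / n     # gamma_hat_k
        acf.append(gamma_k / gamma0)                 # rho_hat_k
    return np.array(acf)

rng = np.random.default_rng(2)
y = rng.standard_normal(500).cumsum()                # toy series; replace with real data
print(sample_acf(y, max_lag=5))
```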
Box-Jenkins Methodology
1. Identification

2. Estimation

3. Diagnostic Checking

4. Forecasting
Box-Jenkins Methodology
1. Identification: Autocorrelation Function
[Figure: sample ACFs of an AR(1) and an MA(1); $\mathrm{Corr}(Y_t, Y_{t-1})$, $\mathrm{Corr}(Y_t, Y_{t-2})$, … plotted against lags 1–5]

AR(p): the ACF decays exponentially, with a damped sine wave pattern, or both
Box-Jenkins Methodology
1. Identification
[Figure: ACF with spikes at lags 1 through q, then cutting off]

MA(q): the ACF spikes through q lags
Box-Jenkins Methodology
a) Autocorrelation function

b) Partial autocorrelation function, $\phi_{kk}$
It measures the correlation between time series observations that are $k$ time periods apart, after controlling for the correlations at the intermediate lags
Box-Jenkins Methodology
Consider $Y_t$ and $Y_{t-k}$.
When you calculate the correlation between these two observations, you pick up the effect of the intermediate Y's: $Y_{t-1}, Y_{t-2}, \ldots, Y_{t-k+1}$ have an impact on $\mathrm{corr}(Y_t, Y_{t-k})$.
When you calculate the PACF, you net this out.
How do you estimate it? Run the regression (see the sketch below):
$Y_t = \alpha_0 + \alpha_1 Y_{t-1} + \alpha_2 Y_{t-2} + e_t$
The estimated coefficient on the longest lag ($\hat{\alpha}_2$ here) is the partial autocorrelation at lag 2.
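A sketch of this regression approach to the PACF, assuming only numpy; pacf_by_regression is a hypothetical helper name, not from the slides:

```python
# Sketch: PACF at lag k as the coefficient on Y_{t-k} in an OLS regression
# of Y_t on Y_{t-1}, ..., Y_{t-k}, as the slide describes.
import numpy as np

def pacf_by_regression(y, max_lag):
    y = np.asarray(y, dtype=float)
    pacf = []
    for k in range(1, max_lag + 1):
        # Regressors [1, Y_{t-1}, ..., Y_{t-k}] for t = k, ..., n-1
        X = np.column_stack([np.ones(len(y) - k)] +
                            [y[k - j : len(y) - j] for j in range(1, k + 1)])
        coef, *_ = np.linalg.lstsq(X, y[k:], rcond=None)   # OLS
        pacf.append(coef[-1])         # phi_kk: coefficient on the longest lag
    return np.array(pacf)

y = np.random.default_rng(11).standard_normal(500)
print(pacf_by_regression(y, max_lag=4))
```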
Box-Jenkins Methodology
What does it look like?
[Figure: PACF $\phi_{kk}$ of an AR(2), with spikes at lags 1 and 2 and nothing after; PACF of an MA(1), declining gradually over lags 1–5]
Box-Jenkins Methodology
Model       ACF                                      PACF
AR(p)       Declines exponentially, or with a        Spikes for p lags,
            damped sine wave pattern, or both        then it drops
MA(q)       Spikes through q lags                    Declines exponentially
ARMA(p,q)   Exponential decay                        Exponential decay
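In practice the two correlograms are plotted rather than tabulated. A sketch assuming the statsmodels and matplotlib packages are installed (an assumption; they are not part of the slides):

```python
# Sketch: inspect ACF and PACF side by side to identify the model.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

rng = np.random.default_rng(3)
u = rng.standard_normal(501)
y = u[1:] + 0.6 * u[:-1]            # illustrative MA(1); use your own data here

fig, axes = plt.subplots(2, 1)
plot_acf(y, lags=20, ax=axes[0])    # MA(1): should spike at lag 1 only
plot_pacf(y, lags=20, ax=axes[1])   # MA(1): should decline gradually
plt.show()
```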
Box-Jenkins Methodology
1. Identification

2. Estimation
• As usual

3. Diagnostic checking
• Plot the residuals
• Look at the ACF and PACF of the residuals $\hat{u}_t$

4. Forecasting
Non-Stationary Process
(Weak) Stationarity

$E(Y_t) = \mu$ — constant over time
$\mathrm{Var}(Y_t) = \sigma^2$ — constant over time
$\mathrm{Cov}(Y_t, Y_{t+k}) = \gamma_k$ — depends only on the time distance between the two observations
Non-Stationary Process
3 types of non-stationary processes:

1) Random walk

2) Random walk with drift

3) Trend stationary
Non-Stationary Process
What is the difference?
Their mean and variance change over time.

What does it mean that the mean/variance/covariance are constant?
It means that whenever the series is hit by a shock, it diverges from its mean, but then it goes back to it. The fluctuations around the mean have a constant amplitude.
Non-Stationary Process
Non-stationary time series:
They vary, but if the mean and variance change over time, each set of time series data is only a particular episode.
Non-Stationary Stochastic Process
A) Random walk:
$Y_t = Y_{t-1} + u_t$
Similar to an AR(1), $Y_t = \rho Y_{t-1} + u_t$, with $\rho = 1$
Non-Stationary Stochastic Process:
Random Walk
Let's see why it is non-stationary:
$Y_1 = Y_0 + u_1$
$Y_2 = Y_1 + u_2 = Y_0 + u_1 + u_2$
$Y_3 = Y_2 + u_3 = Y_0 + u_1 + u_2 + u_3 = Y_0 + \sum_{i=1}^{3} u_i$
$Y_t = Y_0 + \sum_{i=1}^{t} u_i$
Non-Stationary Stochastic Process:
Random Walk
Mean:
$E(Y_t) = E[Y_0 + \sum_{i=1}^{t} u_i] = Y_0$
since $\{u_t\}$ is white noise, $u_t \sim (0, \sigma_u^2)$
Non-Stationary Stochastic Process:
Random Walk
Variance:
$\mathrm{var}(Y_t) = \mathrm{var}[Y_0 + \sum_{i=1}^{t} u_i] = t\sigma_u^2$

• The variance increases with time
• Note one feature of the RW: random shocks are persistent
Non-Stationary Stochastic Process:
Random Walk
Variance derivation:
$\mathrm{var}(Y_t) = \mathrm{var}[Y_0 + \sum_{i=1}^{t} u_i] = \mathrm{var}[u_1 + u_2 + \cdots + u_t]$
$= \sigma_u^2 + \sigma_u^2 + \cdots + \sigma_u^2$ ($t$ times)
$= t\sigma_u^2$
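A simulation check that $\mathrm{var}(Y_t) = t\sigma_u^2$; a sketch assuming numpy, with the number of paths an illustrative choice:

```python
# Sketch: across many simulated random walks, var(Y_t) grows like t * sigma_u^2.
import numpy as np

rng = np.random.default_rng(4)
n_paths, T, sigma_u = 20_000, 100, 1.0

paths = rng.normal(0.0, sigma_u, (n_paths, T)).cumsum(axis=1)  # Y_t = sum of shocks
for t in (10, 50, 100):
    print(f"t={t}: var={paths[:, t - 1].var():.1f} (theory: {t * sigma_u**2})")
```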
Non-Stationary Stochastic Process
B) Random walk with drift:
$Y_t = \delta + Y_{t-1} + u_t$
$\delta$: drift parameter
Non-Stationary Stochastic Process:
Random Walk With Drift
$Y_t = \delta + Y_{t-1} + u_t$
$Y_1 = \delta + Y_0 + u_1$
$Y_2 = \delta + Y_1 + u_2 = \delta + (\delta + Y_0 + u_1) + u_2 = 2\delta + Y_0 + u_1 + u_2$
$Y_3 = \delta + Y_2 + u_3 = \delta + [2\delta + Y_0 + u_1 + u_2] + u_3 = 3\delta + Y_0 + \sum_{i=1}^{3} u_i$
$Y_t = t\delta + Y_0 + \sum_{i=1}^{t} u_i$
Non-Stationary Stochastic Process:
Random Walk With Drift
Mean:
$E(Y_t) = E[t\delta + Y_0 + \sum_{i=1}^{t} u_i] = t\delta + Y_0$

• The mean depends on time
Non-Stationary Stochastic Process:
Random Walk With Drift
Variance:
$\mathrm{var}(Y_t) = \mathrm{var}[t\delta + Y_0 + \sum_{i=1}^{t} u_i] = t\sigma_u^2$
Non-Stationary Stochastic Process
C) Trend stationary:
$Y_t = \beta_1 + \beta_2 t + u_t$
$\beta_2 t$: deterministic trend
Non-Stationary Stochastic Process:
Trend Stationary
Mean:
$E(Y_t) = E(\beta_1 + \beta_2 t + u_t) = \beta_1 + \beta_2 t$ — not constant

Variance:
$\mathrm{var}(Y_t) = \sigma_u^2$ — constant
Non-Stationary Stochastic Process:
Differencing
What shall we do? Differencing.
Say that you have a RW:
$Y_t = Y_{t-1} + u_t$
Subtract $Y_{t-1}$:
$Y_t - Y_{t-1} = u_t$
$\Delta Y_t = u_t$ — stationary
While $Y_t$ is non-stationary, $\Delta Y_t$ is stationary
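A sketch of differencing in practice, assuming numpy:

```python
# Sketch: first-differencing a simulated random walk yields a stationary series.
import numpy as np

rng = np.random.default_rng(5)
u = rng.standard_normal(1_000)
y = u.cumsum()                 # random walk: Y_t = Y_{t-1} + u_t
dy = np.diff(y)                # Delta Y_t = u_t

# The level wanders; the difference recovers white noise.
print("var of level     :", round(y.var(), 1))
print("var of difference:", round(dy.var(), 3), "(close to sigma_u^2 = 1)")
```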
Non-Stationary Stochastic Process:
Differencing
Differencing a RW with drift:
$Y_t = \delta + Y_{t-1} + u_t$
$\Delta Y_t = Y_t - Y_{t-1} = \delta + u_t$
You can take the first difference: it is stationary
Non-Stationary Stochastic Process:
Trend Stationary
Detrending
The mean of a trend stationary process is:
$Y_t = \beta_1 + \beta_2 t + u_t$
$E(Y_t) = \beta_1 + \beta_2 t$
If we subtract the mean of $Y_t$ from $Y_t$, the series is stationary: detrending
Non-Stationary Stochastic Process:
Trend Stationary
Detrending
$Y_t = \beta_1 + \beta_2 t + u_t$
$\hat{Y}_t = \hat{\beta}_1 + \hat{\beta}_2 t$
$Y_t - \hat{Y}_t$ — stationary
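A sketch of detrending by OLS, assuming numpy; the trend coefficients are illustrative:

```python
# Sketch: estimate the deterministic trend by OLS and subtract the fit.
import numpy as np

rng = np.random.default_rng(6)
T = 500
t = np.arange(T)
y = 2.0 + 0.05 * t + rng.standard_normal(T)   # trend stationary: beta1 + beta2*t + u_t

b2, b1 = np.polyfit(t, y, deg=1)              # OLS fit of a linear trend
detrended = y - (b1 + b2 * t)                 # Y_t - Y_hat_t: stationary residual
print("estimated trend:", round(b1, 2), "+", round(b2, 3), "* t")
print("mean of detrended series:", round(detrended.mean(), 4))
```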
Integrated Process
A non-stationary process is said to be integrated of order 1 if it has to be differenced once to make it stationary.

$Y \sim I(d)$: it has to be differenced $d$ times to make it stationary.
Integrated Process
Some properties:
1) $X_t \sim I(0)$ (stationary), $Y_t \sim I(1)$ (non-stationary)
$\Rightarrow Z_t = X_t + Y_t \sim I(1)$

2) $X_t \sim I(d) \Rightarrow Z_t = a + bX_t \sim I(d)$
Integrated Process
Some properties:
3) $X_t \sim I(d_1)$, $Y_t \sim I(d_2)$, $d_2 > d_1$
$\Rightarrow Z_t = aX_t + bY_t \sim I(d_2)$

4) $X_t \sim I(d)$, $Y_t \sim I(d)$
$\Rightarrow Z_t = aX_t + bY_t \sim I(d^*)$, with $d^* \le d$: when $d^* < d$, cointegration
Spurious Regressions
Say that you have two variables, $Y_t \sim I(1)$ and $X_t \sim I(1)$. It means that they are highly trended.
$X_t = X_{t-1} + a_t$, $a_t \sim (0, \sigma_a^2)$
$Y_t = Y_{t-1} + b_t$, $b_t \sim (0, \sigma_b^2)$
Two RW processes, where $a_t$ and $b_t$ are independent.
Suppose that you run:
$Y_t = \beta_0 + \beta_1 X_t + u_t$
Spurious Regressions (Continued)
Say that you know they are totally independent.
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
You would expect not to reject $H_0$.
But….
The t-test will reject the null hypothesis far more often than it should.
This is the spurious regression problem.
Spurious Regressions (Continued)
Why does it occur?
$Y_t = \beta_0 + \beta_1 X_t + u_t$
$u_t$ should be homoscedastic and not serially correlated.

Look at $u_t$:
$u_t = Y_t - \beta_0 - \beta_1 X_t$
Spurious Regressions (Continued)
$u_t$ has always been assumed to be well-behaved, but in this case it is a linear combination of two integrated processes:
It is non-stationary
The Gauss-Markov assumptions are violated
The t-statistic doesn't have a limiting standard normal distribution
$R^2$ will be high (e.g. 0.99)
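The problem is easy to reproduce by Monte Carlo. A sketch assuming numpy; 100 observations and 2,000 replications are illustrative choices:

```python
# Sketch: spurious regression Monte Carlo. Two independent random walks,
# yet |t| > 1.96 far more often than the nominal 5%.
import numpy as np

rng = np.random.default_rng(7)
T, reps, rejections = 100, 2_000, 0

for _ in range(reps):
    x = rng.standard_normal(T).cumsum()       # X_t = X_{t-1} + a_t
    y = rng.standard_normal(T).cumsum()       # Y_t = Y_{t-1} + b_t, independent of X
    X = np.column_stack([np.ones(T), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    s2 = resid @ resid / (T - 2)
    se_b1 = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])
    if abs(beta[1] / se_b1) > 1.96:           # naive t-test on the slope
        rejections += 1

print("rejection rate:", rejections / reps)   # typically well above 0.05
```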
Random Walk
A random walk process (with/without drift) is an example of a unit root process.
$Y_t = \rho Y_{t-1} + u_t$, $-1 \le \rho \le 1$
• If $\rho = 1$: random walk process, $Y_t = Y_{t-1} + u_t$
• Mean is constant over time; variance is not constant
• A unit root process
• If $|\rho| < 1$: AR(1), it is stationary
Dickey-Fuller Test
How do we detect whether a process is a unit root process or not?
The Dickey-Fuller test.
Start with the general process:
$Y_t = \rho Y_{t-1} + u_t$ (*)
We would like to know whether $\rho = 1$.
Play with (*) and subtract $Y_{t-1}$:
$Y_t - Y_{t-1} = \rho Y_{t-1} - Y_{t-1} + u_t$
$\Delta Y_t = (\rho - 1) Y_{t-1} + u_t$
$\Delta Y_t = \delta Y_{t-1} + u_t$, where $\delta = \rho - 1$
Dickey-Fuller Test (Continued)

$H_0: \delta = 0$ (i.e. $\rho = 1$)
$H_a: \delta < 0$
If $\delta = 0$, then $\Delta Y_t = u_t$: the first difference is stationary.

If you cannot reject the null hypothesis, we cannot reject that $Y_t$ is non-stationary: evidence in favour of non-stationarity.

If we reject $H_0$: evidence in favour of stationarity.
Dickey-Fuller Test (Continued)
Can we use the t-test?

No! Why?
Under the null hypothesis, the model is non-stationary.
The t-statistic of the estimated coefficient doesn't follow a t distribution, not even in large samples.
Dickey-Fuller Test (Continued)
What should we do then?

Dickey and Fuller created tables of critical values using Monte Carlo simulation.
You use the tau statistic.

The table of critical values is made of three panels. Why?
The critical values depend on the model used.
Dickey-Fuller Test (Continued)
Upper panel:
$\Delta Y_t = \delta Y_{t-1} + u_t$ — random walk

Middle panel:
$\Delta Y_t = \beta_1 + \delta Y_{t-1} + u_t$ — RW with drift

Lower panel:
$\Delta Y_t = \beta_1 + \beta_2 t + \delta Y_{t-1} + u_t$ — RW with drift and trend
Dickey-Fuller Test (Continued)
The null hypothesis is the same:
$H_0: \delta = 0$; $H_a: \delta < 0$
tau statistic: $\tau = \dfrac{\hat{\delta}}{se(\hat{\delta})}$
But the critical values depend on the model.
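In software, the tau statistic and the matching critical values come packaged together. A sketch assuming the statsmodels package (an assumption, not part of the slides); its adfuller function implements the (augmented) Dickey-Fuller test, and the regression argument selects the panel ("n" no constant, "c" drift, "ct" drift and trend):

```python
# Sketch: Dickey-Fuller test on a simulated random walk.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(8)
y = rng.standard_normal(300).cumsum()         # random walk: should not reject H0

stat, pvalue, *_ = adfuller(y, maxlag=0, regression="c")
print("tau statistic:", round(stat, 3), "p-value:", round(pvalue, 3))
# Large p-value: cannot reject H0 (delta = 0), evidence of non-stationarity.
```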
Augmented Dickey-Fuller Test
We have seen that a way of carrying out the unit root test is to write the RW as:
$\Delta Y_t = \alpha + \delta Y_{t-1} + e_t$
$H_0: \delta = 0$
$H_a: \delta < 0$
We could also have more complicated models with additional lags:
$\Delta Y_t = \alpha + \delta Y_{t-1} + \gamma_1 \Delta Y_{t-1} + e_t$, where $|\gamma_1| < 1$
Under $H_0: \delta = 0$, $\{\Delta Y_t\}$ follows a stable AR(1) model.
Augmented Dickey-Fuller Test
(Continued)
We can also add more lags: run a regression of
$\Delta Y_t$ on $Y_{t-1}, \Delta Y_{t-1}, \Delta Y_{t-2}, \ldots, \Delta Y_{t-p}$
and carry out the $\tau$ test on $\hat{\delta}$ as before.

This is the Augmented Dickey-Fuller test.
Cointegration and Error Correction
Models
We have talked about the fact that if $X_t \sim I(1)$ and $Y_t \sim I(1)$, a linear combination of the two is I(1).
However, there are exceptions!
It is possible that there exists a $\lambda$ such that:
$Z_t = Y_t - \lambda X_t \sim I(0)$
In this case we say that the two variables are cointegrated.
If such a $\lambda$ exists, then $\lambda$ is the cointegration parameter.
Cointegration and Error Correction
Models

Two variables are said to be cointegrated if they have a long-term, or equilibrium, relationship between them.

It means that they drift together: they share a common trend.
Testing For Cointegration
How do you test for cointegration?
1) If $\lambda$ is known:
$Z_t = Y_t - \lambda X_t$
$Z_t = \rho Z_{t-1} + e_t$
$\Delta Z_t = (\rho - 1) Z_{t-1} + e_t$ — test whether $Z_t$ has a unit root

Apply the Dickey-Fuller test to $Z_t$.
Testing For Cointegration
a) If we find evidence that $Z_t$ is stationary (we reject the null hypothesis of non-stationarity):
the two variables share a common stochastic trend — they are cointegrated.

b) If we find evidence that $Z_t$ is non-stationary (we cannot reject the null hypothesis of non-stationarity):
the two variables are not cointegrated.
Testing For Cointegration
2) If $\lambda$ is unknown:
We have to rely on the residuals.

a) Consider the case of a spurious regression: in a spurious regression the errors are non-stationary.

b) Then, in the presence of cointegration it must be that the errors are stationary.
Testing For Cointegration
$Y_t = \alpha + \lambda X_t + u_t$
Estimate it:
$Y_t = \hat{\alpha} + \hat{\lambda} X_t + \hat{u}_t$
$\hat{u}_t = Y_t - \hat{\alpha} - \hat{\lambda} X_t$
If the two variables are cointegrated, the $\hat{u}_t$ are stationary.
$\hat{u}_t$ is a linear combination of the two series.
Testing For Cointegration
Test:
$\hat{u}_t = \rho \hat{u}_{t-1} + e_t$, where $e_t$ is a white noise process

and use the Dickey-Fuller critical values for cointegration.
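A sketch of this residual-based (Engle-Granger) procedure, assuming the statsmodels package (an assumption); its coint function runs the first-stage OLS and the residual unit-root test with the appropriate cointegration critical values. The simulated data are illustrative:

```python
# Sketch: two-step cointegration test on a simulated cointegrated pair.
import numpy as np
from statsmodels.tsa.stattools import coint

rng = np.random.default_rng(9)
T = 300
x = rng.standard_normal(T).cumsum()           # X_t ~ I(1)
y = 2.0 * x + rng.standard_normal(T)          # Y_t = lambda * X_t + stationary error

t_stat, pvalue, _ = coint(y, x)               # step 1: OLS; step 2: ADF on residuals
print("t-stat:", round(t_stat, 2), "p-value:", round(pvalue, 4))
# Small p-value: residuals are stationary, so Y and X are cointegrated.
```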
Testing For Cointegration
If $Y_t$ and $X_t$ are I(1) and they are not cointegrated:
• A regression of $Y_t$ on $X_t$ does not mean anything
• You can still take the first differences of each variable and work with $\Delta Y_t$, $\Delta X_t$

If $Y_t$ and $X_t$ are cointegrated:
• OK — it means that a long-run equilibrium relationship exists.
Forecasting

We are at time t. We want to forecast $Y_{t+1}$. What do we do?

We have an information set at time t, $I_t$: we know all the previous values taken by $Y_t$ and by the other variables.
Forecasting
Consider an AR(1):
$Y_t = \alpha_0 + \alpha_1 Y_{t-1} + e_t$
Estimate it and get the estimated values $\hat{\alpha}_0, \hat{\alpha}_1$.
Now update the process:
$Y_{t+1} = \alpha_0 + \alpha_1 Y_t + e_{t+1}$
$E(Y_{t+1} \mid I_t) = \hat{\alpha}_0 + \hat{\alpha}_1 Y_t$
We can forecast forward:
$Y_{t+2} = \alpha_0 + \alpha_1 Y_{t+1} + e_{t+2}$
$E_t Y_{t+2} = \hat{\alpha}_0 + \hat{\alpha}_1 E_t(Y_{t+1}) = \hat{\alpha}_0 + \hat{\alpha}_1[\hat{\alpha}_0 + \hat{\alpha}_1 Y_t]$
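A sketch of this forecasting recursion, assuming numpy; the simulated AR(1) with $\alpha_0 = 1$, $\alpha_1 = 0.8$ is illustrative:

```python
# Sketch: estimate an AR(1) by OLS and iterate the forecast forward,
# exactly as in the recursion above.
import numpy as np

rng = np.random.default_rng(10)
y = [0.0]
for _ in range(499):                          # simulate Y_t = 1 + 0.8*Y_{t-1} + e_t
    y.append(1.0 + 0.8 * y[-1] + rng.standard_normal())
y = np.array(y)

X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
a0_hat, a1_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]   # OLS estimates

forecast = y[-1]                              # condition on the last observed value
for h in (1, 2, 3):
    forecast = a0_hat + a1_hat * forecast     # E_t(Y_{t+h}) = a0 + a1 * E_t(Y_{t+h-1})
    print(f"{h}-step-ahead forecast:", round(forecast, 3))
```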
Forecasting

The quality of the forecast deteriorates as we forecast farther out into the future.

Of course forecasts are not exact, so we have to consider the forecast error.
1-step-ahead forecast error:
$\hat{e}_t(1) = Y_{t+1} - E_t(Y_{t+1})$
