Time-Series Econometrics


Stochastic Process:
 Analysis of time-series data is based on the modelling of stochastic processes.
 A stochastic process is a collection of random variables ordered in time.
 A stochastic process evolves in time according to probabilistic laws.
 An observed time-series is considered to be one realization of a stochastic process.

Types of Stochastic Process: Definition and Functional Specification

Purely Random (White Noise):
 Definition: Each element has an independent and identical distribution with constant (or zero) mean and constant variance.
 Specification: Y_t = u_t with E(u_t) = μ for all t; var(u_t) = σ² for all t; cov(u_t, u_{t−s}) = 0 for all s ≠ 0.
 For simplicity, one may also assume E(u_t) = 0 for all t.
 Example: the random disturbance terms in the CLRM, u_t ~ IIN(0, σ²).

Random Walk:
 Definition: In each period the variable takes a random deviation from its previous value, and the deviations are independently and identically distributed in size.
 Specification: Y_t = Y_{t−1} + u_t, or Y_t − Y_{t−1} = u_t.
 If u_t is white noise with E(u_t) = μ, var(u_t) = σ² and cov(u_t, u_{t−s}) = 0 for all s ≠ 0, then E(Y_t) = tμ and var(Y_t) = t·σ² (assuming Y_0 = 0).

Moving Average (MA) Process:
 Definition: The value of Y at time point t is a moving average of the current and past values of the random disturbance term.
 Specification: MA process of order m: Y_t = β_0·u_t + β_1·u_{t−1} + β_2·u_{t−2} + ... + β_m·u_{t−m}
 Here, u_t is white noise with E(u_t) = 0, var(u_t) = σ², cov(u_t, u_{t−s}) = 0 for all s ≠ 0.
 ⇒ E(Y_t) = 0; var(Y_t) = σ²·Σ_{j=0}^{m} β_j²

Autoregressive (AR) Process:
 Definition: The value of Y at time point t depends on its previous values and the random disturbance at that time point.
 Specification: AR process of order r: Y_t = α_1·Y_{t−1} + α_2·Y_{t−2} + ... + α_r·Y_{t−r} + u_t

ARMA Process:
 Definition: The variable Y has characteristics of both AR and MA.
 Specification: ARMA process of order r and m, i.e., ARMA(r, m):
 Y_t = α_1·Y_{t−1} + α_2·Y_{t−2} + ... + α_r·Y_{t−r} + u_t + β_1·u_{t−1} + β_2·u_{t−2} + ... + β_m·u_{t−m}
 Here, u_t is white noise with E(u_t) = 0, var(u_t) = σ², cov(u_t, u_{t−s}) = 0 for all s ≠ 0.

ARIMA Process:
 Definition: Differencing a non-stationary time-series to make it stationary, and modelling the differenced series as an ARMA process.
 Specification: ARIMA(r, d, m) means the time-series has to be differenced d times to make it stationary, and the stationary series can be modelled as ARMA(r, m).
 ARIMA(r, 0, 0) indicates a purely AR(r) stationary process.
 ARIMA(0, 0, m) indicates a purely MA(m) stationary process.
 ARIMA(0, d, 0) indicates that the time-series is integrated of order d, i.e., I(d).
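The white-noise and MA definitions above are easy to check by simulation. A minimal numpy sketch (the σ, the MA(2) coefficients and the seed are illustrative choices, not from the notes): it draws many independent realizations of Y_t and compares the sample variance with σ²·Σβ_j².

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.5
betas = np.array([1.0, 0.6, 0.3])   # MA(2): beta_0, beta_1, beta_2

# Many independent realizations of Y_t = b0*u_t + b1*u_{t-1} + b2*u_{t-2}
n_rep = 200_000
u = rng.normal(0.0, sigma, size=(n_rep, 3))   # columns: u_{t-2}, u_{t-1}, u_t
y_t = betas[0] * u[:, 2] + betas[1] * u[:, 1] + betas[2] * u[:, 0]

# Theory from the table: E(Y_t) = 0 and var(Y_t) = sigma^2 * sum(beta_j^2)
theory_var = sigma**2 * np.sum(betas**2)
print(round(y_t.mean(), 3), round(y_t.var(), 3), round(theory_var, 4))
```

With 200,000 replications the simulated variance should land within a few hundredths of the theoretical value.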

Basic Model: Y_t = θ_0 + θ_1·t + θ_2·Y_{t−1} + u_t

Random Walk: Econometric Specification

1. Random walk without drift and trend (θ_0 = 0; θ_1 = 0; θ_2 = 1):
 Y_t = Y_{t−1} + u_t. This is an AR(1) model.
 If u_t is white noise with E(u_t) = 0, var(u_t) = σ² and cov(u_t, u_{t−s}) = 0 for all s ≠ 0, then E(Y_t) = 0 and var(Y_t) = tσ² (assuming Y_0 = 0).
 Violates the condition of stationarity.

2. Random walk with drift, but no trend (θ_0 ≠ 0; θ_1 = 0; θ_2 = 1):
 Y_t = θ_0 + Y_{t−1} + u_t. Drift parameter: θ_0. The drift is upward or downward depending on whether θ_0 > 0 or θ_0 < 0.
 If u_t is white noise as above, E(Y_t) = tθ_0 and var(Y_t) = tσ² (assuming Y_0 = 0).
 Violates the condition of stationarity.

3. Deterministic trend (θ_0 ≠ 0; θ_1 ≠ 0; θ_2 = 0):
 Y_t = θ_0 + θ_1·t + u_t. If u_t is white noise as above, E(Y_t) = θ_0 + θ_1·t and var(Y_t) = σ².
 Violates the condition of stationarity (the mean varies with t).

4. Random walk with drift and deterministic trend (θ_0 ≠ 0; θ_1 ≠ 0; θ_2 = 1):
 Y_t = θ_0 + θ_1·t + Y_{t−1} + u_t. Y_t is non-stationary.

5. Random walk with drift and deterministic trend (θ_0 ≠ 0; θ_1 ≠ 0; θ_2 < 1):
 Y_t = θ_0 + θ_1·t + θ_2·Y_{t−1} + u_t. Y_t is stationary around a deterministic trend.

Nature of Time-Series

Stationary Time-Series*:
 Weak or Covariance Stationarity: A time-series process is weakly (covariance) stationary if the mean and variance are constant, and the covariance of any pair of observations depends only on the distance in time (s) between the observations and not on t.
 Strict Stationarity: The joint probability distribution of any pair of observations depends only on the distance in time (s) between the observations, and not on t; i.e., all statistical measures of the process are stationary. The distribution of a strictly stationary stochastic process is the same at time t as at any other time (t+s).

Non-Stationary Time-Series:
 Breakpoint Non-Stationarity: The parameters of the data-generating process change at a particular time point.
 Trend Non-Stationarity: The series has a deterministic trend; the mean of the series varies linearly with time. Example: Y_t = α + βt + u_t

*For a Gaussian process, weak (second-order) stationarity implies strict stationarity as well.
An Example:

Consider the model: Y_t = ρY_{t−1} + u_t

Three possibilities:

(1) When ρ > 1, the series is non-stationary and explosive. Past shocks have a greater impact than current ones.
(2) When ρ = 1, the series is non-stationary: shocks persist at full force and the series is not mean-reverting. This is the random walk model, where the variance increases with t and we have the infinite-variance problem.
(3) When |ρ| < 1, the series is stationary and the effects of shocks die out geometrically at rate ρ. The series reverts to its mean.

Typically, we are interested in the last two scenarios, i.e., ρ = 1 and |ρ| < 1. The question is whether we have a unit root (i.e., ρ = 1, also known as a random walk) or not.

(a) For the model Y_t = Y_{t−1} + u_t, if u_t is white noise with zero mean and constant variance:

 Y_t = Y_0 + Σ_{j=1}^{t} u_j = Y_0 + Σ_{τ=0}^{t−1} u_{t−τ}

 E(Y_t | Y_0) = Y_0 + Σ_{τ=0}^{t−1} E(u_{t−τ}) = Y_0 (conditional mean)

 var(Y_t | Y_0) = var(Σ_{τ=0}^{t−1} u_{t−τ}) = Σ_{τ=0}^{t−1} var(u_{t−τ}) = tσ² (conditional variance)

Thus, the conditional variance varies with time.

Unconditional variance:

 var(Y_t) = var(Σ_{τ=0}^{∞} u_τ) = Σ_{τ=0}^{∞} var(u_τ) = Σ_{τ=0}^{∞} σ² = ∞

Note: tσ² → ∞ as t → ∞
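The var(Y_t | Y_0) = tσ² result can be checked by Monte Carlo: simulate many random walks from Y_0 = 0 and compare the cross-sectional variance at a given horizon with tσ² (the σ, horizon and seed below are arbitrary illustrative choices).

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, t, n_rep = 2.0, 50, 100_000

# Each row is one realization: Y_t = Y_0 + u_1 + ... + u_t with Y_0 = 0
u = rng.normal(0.0, sigma, size=(n_rep, t))
y = np.cumsum(u, axis=1)

# Conditional mean ~ 0; conditional variance ~ t * sigma^2, growing linearly in t
print(round(y[:, -1].mean(), 2), round(y[:, -1].var(), 1), t * sigma**2)
print(round(y[:, 24].var(), 1), 25 * sigma**2)   # half the horizon, half the variance
```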

For the model Y_t = ρY_{t−1} + u_t with |ρ| < 1:

 Y_1 = ρY_0 + u_1

 Y_2 = ρY_1 + u_2 = ρ(ρY_0 + u_1) + u_2 = ρ²Y_0 + ρu_1 + u_2

 Y_3 = ρY_2 + u_3 = ρ(ρ²Y_0 + ρu_1 + u_2) + u_3 = ρ³Y_0 + ρ²u_1 + ρu_2 + u_3

This shows that the current value of Y is the sum of all previous shocks, weighted by coefficients (powers of ρ) that decline exponentially. How fast the effect of these previous errors dies out depends on the value of ρ.

When |ρ| < 1, the time-series is stationary. In this case the series looks jagged and never wanders too far from its mean. The effect of the errors decays and disappears over time: the impact of recent events is relatively more important than what happened a long time ago.

With a given initial value Y_0,

 Y_t = ρ^t·Y_0 + Σ_{τ=0}^{t−1} ρ^τ·u_{t−τ}

Unconditional variance (as t → ∞):

 var(Y_t) = var(Σ_{τ=0}^{∞} ρ^τ·u_{t−τ}) = σ²·Σ_{τ=0}^{∞} ρ^{2τ} = σ²/(1−ρ²) < ∞
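The σ²/(1−ρ²) formula can also be checked by simulation: generate one long AR(1) path, drop a burn-in so the start value does not matter, and compare the sample variance with the theoretical value (ρ, σ, length and seed are illustrative).

```python
import numpy as np

rng = np.random.default_rng(2)
rho, sigma, n = 0.7, 1.0, 200_000

# One long AR(1) path: Y_t = rho*Y_{t-1} + u_t
u = rng.normal(0.0, sigma, size=n)
y = np.empty(n)
y[0] = 0.0
for i in range(1, n):
    y[i] = rho * y[i - 1] + u[i]
y = y[1000:]                       # discard burn-in

theory = sigma**2 / (1 - rho**2)   # stationary variance for |rho| < 1
print(round(y.var(), 3), round(theory, 3))
```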
What happens to var(Y_t) for the following models?

(1) Y_t = α + ρY_{t−1} + u_t with ρ = 1
(2) Y_t = α + ρY_{t−1} + u_t with |ρ| < 1

Model 1:

 Y_t = Y_0 + tα + Σ_{τ=0}^{t−1} u_{t−τ}

 E(Y_t | Y_0) = Y_0 + tα and var(Y_t | Y_0) = tσ²

Both the conditional mean and the conditional variance depend on time.

Model 2:

 Y_t = ρ^t·Y_0 + Σ_{τ=0}^{t−1} ρ^τ·(α + u_{t−τ}) → Σ_{τ=0}^{∞} ρ^τ·(α + u_{t−τ}) as t → ∞

 E(Y_t) = α·Σ_{τ=0}^{∞} ρ^τ = α/(1−ρ) (unconditional mean)

 var(Y_t) = var(Σ_{τ=0}^{∞} ρ^τ·u_{t−τ}) = σ²·Σ_{τ=0}^{∞} ρ^{2τ} = σ²/(1−ρ²) (unconditional variance)

Both the unconditional mean and variance are constant (independent of time).
Importance of Stationarity in Time-Series

 It is necessary to test whether a time-series is stationary (i.e., whether the mean, variance and autocorrelation structure of the series are unchanged over time).

 The null hypothesis is generally defined as the presence of a unit root; the alternative hypothesis is stationarity, trend stationarity or an explosive root, depending on the test used.

 Shocks to a non-stationary series are persistent.

 Problem of spurious regressions: two variables trending over time can have a high R² even if the two are totally unrelated.

 Standard assumptions for asymptotic analysis are not valid for non-stationary variables, so hypothesis testing may not be valid.
Trend Stationary and Difference Stationary

General specification: Y_t = θ_0 + θ_1·t + θ_2·Y_{t−1} + u_t. The nature of the series depends on the sign and magnitude of θ_0, θ_1 and θ_2.

1. Random walk without drift and without trend (θ_0 = 0; θ_1 = 0; θ_2 = 1):
 Y_t = Y_{t−1} + u_t, or ΔY_t = u_t.
 Decision: difference stationary process (DSP).

2. Random walk with drift, but no trend (θ_0 ≠ 0; θ_1 = 0; θ_2 = 1):
 Y_t = θ_0 + Y_{t−1} + u_t, or ΔY_t = θ_0 + u_t.
 Decision: Y has a positive stochastic trend if θ_0 > 0 and a negative stochastic trend if θ_0 < 0. Difference stationary process (DSP).

3. Deterministic trend (θ_0 ≠ 0; θ_1 ≠ 0; θ_2 = 0):
 Y_t = θ_0 + θ_1·t + u_t, or Y_t − E(Y_t) = u_t.
 Decision: trend stationary process (TSP).

4. Random walk with drift and deterministic trend (θ_0 ≠ 0; θ_1 ≠ 0; θ_2 = 1):
 Y_t = θ_0 + θ_1·t + Y_{t−1} + u_t, or ΔY_t = θ_0 + θ_1·t + u_t.
 Decision: Y_t is non-stationary.

5. Random walk with drift and deterministic trend (θ_0 ≠ 0; θ_1 ≠ 0; θ_2 < 1):
 Y_t = θ_0 + θ_1·t + θ_2·Y_{t−1} + u_t, or ΔY_t = θ_0 + θ_1·t + (θ_2 − 1)·Y_{t−1} + u_t.
 Decision: Y_t is stationary around a deterministic trend.
Testing for Unit Roots

Steps for Unit Root Test in STATA

1. Define the data as time-series.
2. Make a log transformation of the variables, if necessary.
   Note: Standard unit root tests assume linearity under both the null and the alternative. Violation of this assumption may cause severe size and power distortions in both finite and large samples.
3. Set the lags and differences of the variables.
4. Set the lag length (for the ADF and PP tests).
5. Carry out the test for the different types of random walk.

Each test is run under three specifications (no drift, no trend; drift, no trend; drift and trend), with γ_2 = θ_2 − 1 and hypotheses H_0: γ_2 = 0 against H_1: γ_2 < 0.

Dickey-Fuller (DF) Test:
 No drift, no trend: Y_t = θ_2·Y_{t−1} + u_t, estimated as ΔY_t = γ_2·Y_{t−1} + u_t
 Drift, no trend: Y_t = θ_0 + θ_2·Y_{t−1} + u_t, estimated as ΔY_t = θ_0 + γ_2·Y_{t−1} + u_t
 Drift and trend: Y_t = θ_0 + θ_1·t + θ_2·Y_{t−1} + u_t, estimated as ΔY_t = θ_0 + θ_1·t + γ_2·Y_{t−1} + u_t

Augmented Dickey-Fuller (ADF) Test (same specifications, each augmented with p lagged differences):
 No drift, no trend: ΔY_t = γ_2·Y_{t−1} + Σ_{j=1}^{p} λ_j·ΔY_{t−j} + u_t
 Drift, no trend: ΔY_t = θ_0 + γ_2·Y_{t−1} + Σ_{j=1}^{p} λ_j·ΔY_{t−j} + u_t
 Drift and trend: ΔY_t = θ_0 + θ_1·t + γ_2·Y_{t−1} + Σ_{j=1}^{p} λ_j·ΔY_{t−j} + u_t

Phillips-Perron (PP) Test (same regressions as the DF test, with a non-parametric correction for serial correlation):
 No drift, no trend: ΔY_t = γ_2·Y_{t−1} + u_t
 Drift, no trend: ΔY_t = θ_0 + γ_2·Y_{t−1} + u_t
 Drift and trend: ΔY_t = θ_0 + θ_1·t + γ_2·Y_{t−1} + u_t
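The DF regression ΔY_t = θ_0 + γ_2·Y_{t−1} + u_t is plain OLS; only the reference distribution of the t-ratio is non-standard. The sketch below (simulated data, illustrative seed and sample size) computes the t-ratio on γ_2 for a stationary AR(1) and for a random walk; comparing it against the Dickey-Fuller critical values, rather than the usual t table, is left to the tables or to software.

```python
import numpy as np

def df_tstat(y):
    """t-ratio on gamma_2 in the OLS regression dY_t = theta_0 + gamma_2*Y_{t-1} + u_t."""
    dy, ylag = np.diff(y), y[:-1]
    X = np.column_stack([np.ones_like(ylag), ylag])
    coef, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ coef
    s2 = resid @ resid / (len(dy) - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return coef[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(3)
u = rng.normal(size=1000)
stationary = np.zeros(1000)
for i in range(1, 1000):                        # Y_t = 0.5*Y_{t-1} + u_t
    stationary[i] = 0.5 * stationary[i - 1] + u[i]
random_walk = np.cumsum(rng.normal(size=1000))  # Y_t = Y_{t-1} + u_t

t_stat_stationary = df_tstat(stationary)
t_stat_rw = df_tstat(random_walk)
print(round(t_stat_stationary, 2), round(t_stat_rw, 2))  # strongly negative vs near zero
```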
 Phillips-Perron Test Statistics

For the regression Y_t = α + ρY_{t−1} + u_t, the PP test statistics are:

 Z(ρ) = n(ρ̂ − 1) − (1/2)·(n²σ̂²/γ̂²)·(λ̂² − γ̂²)

 Z(τ) = √(γ̂²/λ̂²)·((ρ̂ − 1)/σ̂) − (1/2)·(n·σ̂·(λ̂² − γ̂²))/(γ̂·λ̂)

Here, ρ̂ = OLS estimate of ρ; σ̂ = standard error of ρ̂;

 γ̂² = (Σ_{t=1}^{n} û_t²)/(n − k) (sample variance of the least-squares residuals, with γ̂ = √(γ̂²));

 λ̂² = Newey-West long-run variance estimate of the residuals (with λ̂ = √(λ̂²)).
Choice of the Alternative Models:

1. Situation: The time series is flat (i.e., has no trend) and potentially slow-turning around zero.
 Functional form: no drift, no trend.
 Strategy: Non-rejection of the null hypothesis: the series is to be differenced for stationarity. Rejection of the null hypothesis: the series is stationary and need not be differenced.

2. Situation: The time series is flat and potentially slow-turning around a non-zero value.
 Functional form: drift, no trend.
 Strategy: Non-rejection of the null hypothesis: the series is to be differenced for stationarity. Rejection of the null hypothesis: the series is stationary and need not be differenced.

3. Situation: The time series has a trend in it (either up or down) and is potentially slow-turning around a trend line.
 Functional form: drift and trend.
 Strategy: Non-rejection of the null hypothesis: the series is to be differenced for stationarity. Rejection of the null hypothesis: the series is trend stationary and should be analysed using a time trend instead of differencing.

Note: If the series is exponentially trending, a logarithmic transformation of the series is necessary before differencing it.

Summary of the Steps Involved in the Tests

1. Test for unit roots in the process of the variable with drift and time trend.
   If the null hypothesis (H_0: γ_2 = 0) is not rejected, there are unit roots.
   If the null hypothesis is rejected, check for the presence of the time trend. If the corresponding null hypothesis is rejected, it can be concluded that the process is stationary around a time trend.
   If the coefficient of the time variable is significant and the presence of unit roots is not rejected, the variable has unit roots with a time trend.
2. If there is no time trend, test for unit roots with drift.
   Check for unit roots against the Dickey-Fuller critical values. If there are no unit roots, the variable is stationary.
   If the constant term is significant, check the results for unit roots. If the null hypothesis is not rejected, the variable has unit roots with drift. If the null hypothesis is rejected, the variable has no unit roots.
3. If there is no constant, test for unit roots with no drift and no time trend. If there are no unit roots, the process is stationary.

Comparative Analysis among the Alternative Tests of Unit Root

 Unlike the DF test, the ADF test allows for higher-order autoregressive processes by including the term Σ_{j=1}^{p} λ_j·ΔY_{t−j}.

 Although the DF and ADF tests are frequently used in testing for unit roots, they suffer from size distortions and low power. The DF test does not correct for autocorrelation. The ADF test faces the problem of lag-length selection: information criteria such as the AIC or BIC often select a low value of the lag length.

 The PP test is based on an equation similar to that of the DF test (without the lagged difference terms included in the ADF test).

 The PP test incorporates an automatic non-parametric correction for autocorrelated residuals, and usually gives the same conclusions as the ADF test.

 Monte Carlo studies suggest that the PP test has greater power than the ADF test.

6. Interpret the results of the unit root tests.

7. If there is a unit root, make the series stationary: (1) detrend the series for a trend stationary process (TSP); or (2) difference the series for a difference stationary process (DSP).
DF-GLS Test (Elliott, Rothenberg, and Stock, 1996)

 While the ADF test corrects for higher-order serial correlation by adding lagged difference terms as regressors, the PP test makes a non-parametric correction to account for the autocorrelation.
 Monte Carlo studies suggest generally greater power of the PP test than the ADF test.
 The PP test is also robust to general forms of heteroscedasticity.
 The PP test does not require specification of a lag length.

However:

 A drawback of the PP test is that it is based on asymptotic theory, so it works well only in large samples.
 Also, like the ADF test, it is sensitive to structural breaks and has poor small-sample power, often resulting in spurious unit-root conclusions.
 The DF-GLS test is a modified version of the ADF test.
 The DF-GLS test has higher power than the ADF test.

The DF-GLS test takes as its null hypothesis that the series is a random walk. There are two possible alternative hypotheses: (1) the series is stationary about a linear time trend; or (2) it is stationary with a possibly non-zero mean but with no linear time trend.
The DF-GLS test proceeds by first de-trending the series as

 Y_t^d = Y_t − α̂ − β̂·t

(a) Detrending using the quasi-differenced series when there is a constant but no trend:

 Z_t = Y_t for t = 1
 Z_t = Y_t − (1 − 7/T)·Y_{t−1} for t = 2, 3, ..., T

Similarly,

 X_t = 1 for t = 1
 X_t = 7/T for t = 2, 3, ..., T

Regress Z_t on X_t with no intercept:

 Z_t = δ·X_t + v_t

and obtain the de-trended series as

 Y_t^d = Y_t − δ̂

Subsequently, the DF-GLS test uses the specification

 ΔY_t^d = γ_2·Y_{t−1}^d + Σ_{j=1}^{p} λ_j·ΔY_{t−j}^d + u_t

The DF-GLS test has the same null and alternative hypotheses as the ADF test, i.e., H_0: γ_2 = 0; H_1: γ_2 < 0.
(b) Detrending using the quasi-differenced series around a deterministic trend with a constant:

 Z_t = Y_t for t = 1
 Z_t = Y_t − (1 − 13.5/T)·Y_{t−1} for t = 2, 3, ..., T

Similarly,

 X_{1t} = 1 for t = 1
 X_{1t} = 13.5/T for t = 2, 3, ..., T

 X_{2t} = 1 for t = 1
 X_{2t} = t − (1 − 13.5/T)·(t − 1) for t = 2, 3, ..., T

Regress Z_t on X_{1t} and X_{2t} with no intercept:

 Z_t = δ_1·X_{1t} + δ_2·X_{2t} + v_t

and obtain the de-trended series as

 Y_t^d = Y_t − δ̂_1 − δ̂_2·t

Subsequently, the DF-GLS test uses the specification

 ΔY_t^d = γ_2·Y_{t−1}^d + Σ_{j=1}^{p} λ_j·ΔY_{t−j}^d + u_t

with the same null and alternative hypotheses as the traditional ADF test, i.e., H_0: γ_2 = 0; H_1: γ_2 < 0.
Setting Maximum Lag Length

Information Criteria (symbols: LL = log likelihood; T = total observations; t_p = total parameters; p = number of lags; k = number of variables):

 Akaike Information Criterion: AIC(p) = −2·(LL/T) + 2·(t_p/T)

 Schwarz Bayesian Information Criterion: SBIC(p) = −2·(LL/T) + (ln(T)/T)·t_p

 Hannan-Quinn Information Criterion: HQC(p) = −2·(LL/T) + (2·ln(ln(T))/T)·t_p

 Final Prediction Error: FPE(p) = σ̂²·(T + kp + 1)/(T − kp − 1)

Rules of thumb for the maximum lag length:

 Newey-West criterion (1994): p_max = 4·(T/100)^{2/9}

 Schwert criterion (1989): p_max = 12·(T/100)^{1/4}
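The two maximum-lag rules of thumb are trivial to compute. A sketch (truncating the result to an integer follows common practice and is an assumption here, not stated in the notes):

```python
def newey_west_maxlag(T):
    # p_max = 4 * (T/100)^(2/9), truncated to an integer
    return int(4 * (T / 100.0) ** (2.0 / 9.0))

def schwert_maxlag(T):
    # p_max = 12 * (T/100)^(1/4), truncated to an integer
    return int(12 * (T / 100.0) ** (1.0 / 4.0))

for T in (50, 100, 500):
    print(T, newey_west_maxlag(T), schwert_maxlag(T))
```

For T = 100 observations both rules give their base values (4 and 12); longer samples allow more lags.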

Testing for Cointegration

This is to test whether two time-series share a stochastic trend.

 Run the OLS regression
 Estimate the residuals
 Run a unit root test on the residuals, using the ADF test or the DW d statistic

The order of integration of the two time-series must be the same.
If the residual has a unit root (i.e., if the null hypothesis is not rejected), the variables are not cointegrated.
Some Concepts relating to Cointegration:

 If a time-series is stationary, it is integrated of order 0, i.e., I(0).
 If the first difference of a time-series is stationary but its level is not, the series is integrated of order 1, i.e., I(1). Random walks are I(1).
 If the first difference is non-stationary but the second difference is stationary, the series is integrated of order 2, i.e., I(2).
 In practice, most economic time-series are I(0) or I(1), occasionally I(2).
 If a series is integrated of order k, it has k unit roots, and the series is stationary after being differenced k times.
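The "difference k times" point can be seen directly in a toy example: a random walk is the cumulative sum of its shocks, so its first difference is exactly the white-noise shock series, and a double cumulative sum (I(2)) needs two differences. A small numpy illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
u = rng.normal(size=1000)   # white noise: I(0)
y1 = np.cumsum(u)           # random walk: I(1), one unit root
y2 = np.cumsum(y1)          # I(2): must be differenced twice

print(np.allclose(np.diff(y1), u[1:]))       # True: one difference recovers the I(0) shocks
print(np.allclose(np.diff(y2, n=2), u[2:]))  # True: two differences for an I(2) series
```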
Properties of Integrated Series

 If X_t ~ I(0) and Y_t ~ I(1), then Z_t = (X_t + Y_t) ~ I(1).
 Any linear combination of a stationary and a non-stationary series is non-stationary.
 If X_t ~ I(d), then Z_t = (α + βX_t) ~ I(d).
 Any linear transformation of an I(d) series is also I(d).
 If X_t ~ I(d_1) and Y_t ~ I(d_2), then Z_t = (αX_t + βY_t) ~ I(d_2) when d_2 ≥ d_1.
 If X_t ~ I(d) and Y_t ~ I(d), then Z_t = (αX_t + βY_t) ~ I(d*) for some d* ≤ d.

Autocorrelation Coefficient:

 ρ̂_k = γ̂_k/γ̂_0 = Σ(Y_t − Ȳ)(Y_{t+k} − Ȳ)/Σ(Y_t − Ȳ)²

Under the null of no autocorrelation,

 ρ̂_k ~ N(0, 1/n) for k = 1

 ρ̂_k ~ N(0, (1/n)·(1 + 2·Σ_{i=1}^{k−1} ρ̂²(i))) for k > 1 (Brockwell and Davis, 2002)

 The 95 percent confidence interval: ρ̂_k ± 1.96·SE(ρ̂_k)
 If the interval includes zero, the null hypothesis of no autocorrelation is not rejected.
 Choice of lag length for the ACF: roughly one-third to one-quarter of the number of observations.
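The sample ACF and the ±1.96/√n band are straightforward to compute. A sketch using an AR(1) process, for which the theoretical ACF is ρ^k (the ρ, series length and seed are illustrative choices):

```python
import numpy as np

def acf(y, nlags):
    """Sample autocorrelations rho_hat_k = gamma_hat_k / gamma_hat_0, k = 0..nlags."""
    y = y - y.mean()
    denom = y @ y
    return np.array([1.0] + [(y[:-k] @ y[k:]) / denom for k in range(1, nlags + 1)])

rng = np.random.default_rng(5)
rho, n = 0.8, 5000
u = rng.normal(size=n)
y = np.empty(n)
y[0] = 0.0
for i in range(1, n):
    y[i] = rho * y[i - 1] + u[i]   # AR(1): theoretical ACF is rho**k

r = acf(y, nlags=5)
band = 1.96 / np.sqrt(n)           # 95% band under the null of no autocorrelation
print(np.round(r, 3), round(band, 3))
```

The estimated ρ̂_1 and ρ̂_2 should sit close to 0.8 and 0.64, well outside the no-autocorrelation band.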
Partial Autocorrelation Coefficient:

Partial Correlation Coefficient

A partial correlation is a conditional correlation: the correlation between two variables controlling for the influence of other variables. It is defined as

 r = cov(y, x_i | x_{−i}) / √(var(y | x_{−i})·var(x_i | x_{−i}))

Consider the function y = f(x_1, x_2, x_3). The partial correlation (r) between y and x_i (say x_1) is the correlation between the two variables accounting for the influence of the other variables (say x_2 and x_3) on y and also on x_i.

In regression, this partial correlation can be found by correlating the residuals from the following two regressions: (1) y = f(x_{−i}); and (2) x_i = f(x_{−i}).

 For y = f(x_{−i}), the residuals give the part of y that is not predicted by the other variables.
 For x_i = f(x_{−i}), the residuals give the part of x_i that is not predicted by the other variables.

Partial Autocorrelation Coefficient

It is defined recursively as

 r̂_k = (ρ̂_k − Σ_{j=1}^{k−1} r̂_{k−1,j}·ρ̂_{k−j}) / (1 − Σ_{j=1}^{k−1} r̂_{k−1,j}·ρ̂_j)

For the regression equation Y_t = α + Σ_{j=1}^{k} β_j·Y_{t−j} + u_t, the partial autocorrelation coefficient between Y and its k-th lag is

 r_k = β_k

Q Statistic

 Box-Pierce Statistic: Q = n·Σ_{k=1}^{m} ρ̂_k² ~ χ²_m

For large samples, Q is approximately distributed as χ² with m degrees of freedom.

 Ljung-Box Statistic: LB = n(n+2)·Σ_{k=1}^{m} ρ̂_k²/(n−k) ~ χ²_m

The LB statistic has better small-sample properties (more powerful).
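Both statistics follow directly from the sample autocorrelations. A sketch computing them for white noise and for an autocorrelated AR(1) series (parameters and seed are illustrative; the χ² comparison is left out to stay numpy-only, but with m = 10 the 5% critical value is about 18.3):

```python
import numpy as np

def acf(y, nlags):
    y = y - y.mean()
    denom = y @ y
    return np.array([(y[:-k] @ y[k:]) / denom for k in range(1, nlags + 1)])

def box_pierce(y, m):
    n = len(y)
    return n * np.sum(acf(y, m) ** 2)

def ljung_box(y, m):
    n, r = len(y), acf(y, m)
    k = np.arange(1, m + 1)
    return n * (n + 2) * np.sum(r**2 / (n - k))

rng = np.random.default_rng(6)
white = rng.normal(size=2000)           # no autocorrelation: Q should be modest
u = rng.normal(size=2000)
ar1 = np.empty(2000)
ar1[0] = 0.0
for i in range(1, 2000):
    ar1[i] = 0.8 * ar1[i - 1] + u[i]    # strong autocorrelation: Q should be huge

print(round(ljung_box(white, 10), 1), round(ljung_box(ar1, 10), 1))
```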


Revisiting Granger Causality Test

Given the past values of variable Y, if the past values of variable X are useful for predicting Y, variable X is said to Granger-cause variable Y.

Basic Approach
• Regress Y on its own lagged values and on lagged values of X
• Test the null hypothesis that the coefficients of the lagged values of X are jointly zero (restricted F test). Rejection of the null hypothesis implies that X Granger-causes Y.

Basic Equations

(1) Y_t = α + Σ_{i=1}^{p} a_i·Y_{t−i} + Σ_{j=1}^{q} b_j·X_{t−j} + u_t

(2) X_t = β + Σ_{i=1}^{r} c_i·X_{t−i} + Σ_{j=1}^{s} d_j·Y_{t−j} + v_t

Four Possible Cases:

(a) Unidirectional causality from X to Y: the coefficients of lagged X in the first equation are jointly significant (the null hypothesis of the restricted F test is rejected), whereas those of lagged Y in the second equation are not jointly significant (the null hypothesis of the restricted F test is not rejected).

(b) Unidirectional causality from Y to X: the coefficients of lagged Y in the second equation are jointly significant (the null hypothesis of the restricted F test is rejected), whereas those of lagged X in the first equation are not jointly significant (the null hypothesis of the restricted F test is not rejected).

(c) Feedback or bilateral causality: the coefficients of lagged X and lagged Y are jointly significant in both equations.

(d) Independence: the coefficients of lagged X and lagged Y are not jointly significant in either equation.
Decision Matrix on Granger Causality

 Fail to reject H_0: b_1 = b_2 = ... = b_q = 0 and fail to reject H_0: d_1 = d_2 = ... = d_s = 0 → no Granger causality, either from X to Y or from Y to X
 Fail to reject H_0: b_1 = ... = b_q = 0 but reject H_0: d_1 = ... = d_s = 0 → unidirectional Granger causality from Y to X
 Reject H_0: b_1 = ... = b_q = 0 but fail to reject H_0: d_1 = ... = d_s = 0 → unidirectional Granger causality from X to Y
 Reject both → bidirectional Granger causality between X and Y

Decision Matrix on Feedback Effects

 Fail to reject H_0: a_1 = a_2 = ... = a_p = 0 and fail to reject H_0: c_1 = c_2 = ... = c_r = 0 → no feedback effect, either for X or for Y
 Fail to reject H_0: a_1 = ... = a_p = 0 but reject H_0: c_1 = ... = c_r = 0 → feedback effect only for X
 Reject H_0: a_1 = ... = a_p = 0 but fail to reject H_0: c_1 = ... = c_r = 0 → feedback effect only for Y
 Reject both → feedback effect for both X and Y

Steps in Granger Causality Test

 Regressing current Y on lagged Y, but without inclusion of X – Restricted RSS

 Regressing current Y on both lagged Y and on lagged X – Unrestricted RSS

 H0: Lagged X terms do not influence Y

 Carrying out restricted F test

 Rejection of the H0 – X Granger causes Y

 Repetition of steps to test if Y Granger causes X
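The restricted/unrestricted-RSS steps above can be sketched with plain OLS. The data-generating process below, in which X drives Y but not the reverse, and the lag lengths, coefficients and seed, are illustrative assumptions:

```python
import numpy as np

def rss(X, z):
    coef, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ coef
    return resid @ resid

def granger_f(y, x, p=2, q=2):
    """Restricted F test of H0: the q lags of x do not help predict y."""
    m = max(p, q)
    Y = y[m:]
    ylags = np.column_stack([y[m - i:-i] for i in range(1, p + 1)])
    xlags = np.column_stack([x[m - j:-j] for j in range(1, q + 1)])
    ones = np.ones((len(Y), 1))
    rss_r = rss(np.hstack([ones, ylags]), Y)            # restricted: own lags only
    rss_u = rss(np.hstack([ones, ylags, xlags]), Y)     # unrestricted: add lags of x
    df = len(Y) - (1 + p + q)
    return ((rss_r - rss_u) / q) / (rss_u / df)

rng = np.random.default_rng(7)
n = 500
x = np.zeros(n)
y = np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.normal()                # x evolves on its own
    y[t] = 0.3 * y[t - 1] + 0.8 * x[t - 1] + rng.normal()  # lagged x drives y

print(round(granger_f(y, x), 1), round(granger_f(x, y), 1))  # large vs small F
```

Rejection in the first direction and non-rejection in the reverse direction corresponds to case (a): unidirectional causality from X to Y.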


Major Concerns in Granger Causality Test

 Non-stationary nature of the variables


 Lag length influencing the direction of causality
 Autocorrelated error terms in the two equations
Issue of Non-stationarity in Granger Causality Test

 If two time-series X and Y are cointegrated, there must exist Granger causality either from X to Y, or from Y to X or in both the
directions.

 Presence of cointegration among the variables rules out the possibility of spurious regression. If the series are I(1) but not
cointegrated, Granger causality test may give misleading results unless the data are transformed to induce stationarity.

 However, presence of Granger causality in either or both the directions between X and Y does not necessarily imply that the
series will be cointegrated.

 Although cointegration indicates the presence or absence of Granger causality, it does not indicate the direction in which causality runs between the variables. The direction of Granger causality can be detected through the vector error correction model of long-run cointegrating vectors.

Following Oxley and Greasley (1998), a three-stage procedure can be used to test the direction of causality:

 The first step tests for the order of integration (of the natural logarithm) of the variables. If the variables are stationary, the Granger causality test can be carried out.

 If the variables are not stationary, the second stage involves investigating bivariate cointegration between the two variables. If the variables have bivariate cointegration, the Granger causality test can be carried out.

 If bivariate cointegration is rejected, the variables are to be made stationary before carrying out the Granger causality test.
Thus, there are three alternative specifications:

(a) When the variables are individually non-stationary, i.e., I(1), but cointegrated:

 Y_t = α + Σ_{i=1}^{p} a_i·Y_{t−i} + Σ_{j=1}^{q} b_j·X_{t−j} + u_t
 X_t = β + Σ_{i=1}^{r} c_i·X_{t−i} + Σ_{j=1}^{s} d_j·Y_{t−j} + v_t

(b) Cointegrated variables, but using the I(0) series with an error correction term (to capture short-run dynamics):

 ΔY_t = α + Σ_{i=1}^{p} a_i·ΔY_{t−i} + Σ_{j=1}^{q} b_j·ΔX_{t−j} + λ·ECM_{t−1} + u_t
 ΔX_t = β + Σ_{i=1}^{r} c_i·ΔX_{t−i} + Σ_{j=1}^{s} d_j·ΔY_{t−j} + θ·ECM_{t−1} + v_t

(c) When the variables are individually non-stationary, i.e., I(1), and not cointegrated:

 ΔY_t = α + Σ_{i=1}^{p} a_i·ΔY_{t−i} + Σ_{j=1}^{q} b_j·ΔX_{t−j} + u_t
 ΔX_t = β + Σ_{i=1}^{r} c_i·ΔX_{t−i} + Σ_{j=1}^{s} d_j·ΔY_{t−j} + v_t

Error Correction Mechanism (ECM):

Consider the following relationship:

 Y_t = θX_t (Example: long-run consumption function under the Permanent Income Hypothesis)

 ⇒ ln(Y_t) = ln(θ) + ln(X_t) (1)

Here, ln(Y_t) = y_t; ln(θ) = k; ln(X_t) = x_t, so

 y_t = k + x_t and y_{t−1} = k + x_{t−1}

 ⇒ Δy_t = Δx_t

Consider the following ARDL model:

 y_t = α + β_1·y_{t−1} + β_2·x_t + β_3·x_{t−1} + u_t (2)

If x_t = x* and y_t = y* for all t and u_t = 0, (2) can be rewritten as

 y* = α + β_1·y* + β_2·x* + β_3·x*

 ⇒ (1 − β_1)·y* = α + (β_2 + β_3)·x* (3)

If (1 − β_1) = (β_2 + β_3), (3) can be rewritten as

 y* = k + x*, where k = α/(1 − β_1)

Writing (1 − β_1) = γ = (β_2 + β_3), we have β_1 = (1 − γ) and β_3 = (γ − β_2). Accordingly, (2) can be rewritten as

 y_t = α + (1 − γ)·y_{t−1} + β_2·x_t + (γ − β_2)·x_{t−1} + u_t

 ⇒ y_t − y_{t−1} = α + γ·(x_{t−1} − y_{t−1}) + β_2·(x_t − x_{t−1}) + u_t

 ⇒ Δy_t = α + γ·(x_{t−1} − y_{t−1}) + β_2·Δx_t + u_t (4)

This is the structure of a simple ECM. The model relates the change in one variable to the change in another variable and to the gap between the two variables in the previous period.

It captures short-run adjustments guided by long-run theory. Here, the term (x_{t−1} − y_{t−1}) is the previous period's disequilibrium, and γ is the rate at which it is corrected. A test on γ is, therefore, a test for this disequilibrium component.

In a generalized form, (4) can be rewritten as

 Δy_t = α + β_2·Δx_t + γ_1·x_{t−1} + γ_2·y_{t−1} + u_t

Comparing with (4), γ_1 = γ = −γ_2: the error correction rate is γ_1, and the short-run effect is β_2.

The corresponding second equation for the variable X can be written as

 Δx_t = λ_0 + λ_1·Δy_t + λ_2·x_{t−1} + λ_3·y_{t−1} + v_t
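The re-equilibrating role of γ in equation (4) can be seen with a deterministic sketch: switch off the shocks and hold x fixed, and the gap (x − y) shrinks by the factor (1 − γ) each period. The parameter values here are illustrative.

```python
gamma = 0.5   # error correction rate (illustrative)
x = 10.0      # x held at its long-run level, so dx_t = 0
y = 4.0       # start y away from equilibrium (here y* = x, i.e. k = 0)

gaps = []
for _ in range(20):
    dy = gamma * (x - y)   # eq. (4) with alpha = 0, dx_t = 0, u_t = 0
    y += dy
    gaps.append(x - y)

print(round(gaps[0], 3), round(gaps[-1], 8))  # gap halves each period when gamma = 0.5
```

Half the remaining disequilibrium is corrected every period, so y converges to its equilibrium level x.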
Vector Autoregression (VAR):

 Vector autoregression (VAR), introduced by Sims (1980), characterizes the joint dynamic behaviour of a collection of variables without the restrictions needed to identify the underlying structural parameters.
 A typical restriction takes the form of an assumption about the dynamic relationship between a pair of variables.
 A VAR system contains a set of m variables, each of which is expressed as a linear function of p lags of itself and of all of the other m − 1 variables, plus an error term. However, one can also include exogenous variables such as seasonal dummies or time trends in a VAR.
 For two variables, X and Y, a VAR of order p (without exogenous variables) is written as

 Y_t = α + Σ_{i=1}^{p} a_i·Y_{t−i} + Σ_{j=1}^{p} b_j·X_{t−j} + u_t
 X_t = β + Σ_{i=1}^{p} c_i·X_{t−i} + Σ_{j=1}^{p} d_j·Y_{t−j} + v_t

If another variable Z is added to the system, there would be a third equation for Z, and terms involving p lagged values of Z would be added to each of the three equations.

When the variables of a VAR are cointegrated, a vector error correction (VEC) model is estimated. A VEC model for two variables can be expressed as

 ΔY_t = α + Σ_{i=1}^{p} a_i·ΔY_{t−i} + Σ_{j=1}^{p} b_j·ΔX_{t−j} − λ_1·(Y_{t−1} − θ_0 − θ_1·X_{t−1}) + u_t
 ΔX_t = β + Σ_{i=1}^{p} c_i·ΔX_{t−i} + Σ_{j=1}^{p} d_j·ΔY_{t−j} − λ_2·(Y_{t−1} − θ_0 − θ_1·X_{t−1}) + v_t

Here, Y_t = θ_0 + θ_1·X_t stands for the long-run cointegrating relationship between X and Y, and λ_1 and λ_2 are the error correction parameters. These two parameters measure how the variables X and Y react to deviations from the long-run equilibrium.

In a VEC model with more than two variables, there may be more than one cointegrating relationship among the variables. For example, if X, Y and Z tend to be equal in the long run, X_t = Y_t and Y_t = Z_t (or X_t = Z_t) would be the cointegrating relationships.
Impulse Response Function:

The VAR model contains a large number of coefficients, which makes the individual coefficients difficult to interpret. Impulse responses are the main tool for interpreting VAR results.

The impulse responses are the time-paths of the variables in response to shocks (u and v). They are found by a recursion formula and are functions of the estimated VAR coefficients.

In a 2-variable system, there are 4 impulse response functions:

 Effect of a shock to y (u) on y
 Effect of a shock to x (v) on y
 Effect of a shock to y (u) on x
 Effect of a shock to x (v) on x

In a k-variable system, there are k² impulse response functions.

Impulse variable: the source of the shock. Response variable: the variable being affected.

The impulse response is graphed as a function of forward time periods.

Graph: varbasic, Y, X (impact of a Y shock on the time-path of X)

Derivation of Impulse Response Functions:

 x_t = a_11·x_{t−1} + a_12·y_{t−1} + u_t
 y_t = a_21·x_{t−1} + a_22·y_{t−1} + v_t

Use of the lag operator (L): for any variable z, the lag operator is defined as z_{t−j} = L^j·z_t

 ⇒ z_{t−1} = L·z_t; z_{t−2} = L²·z_t; z_{t−3} = L³·z_t; z_{t+1} = L^{−1}·z_t

Since x_{t−1} = L·x_t and y_{t−1} = L·y_t, the system can be written as

 x_t = a_11·L·x_t + a_12·L·y_t + u_t
 y_t = a_21·L·x_t + a_22·L·y_t + v_t

or, collecting terms,

 (1 − a_11·L)·x_t − a_12·L·y_t = u_t
 −a_21·L·x_t + (1 − a_22·L)·y_t = v_t

Inverting the 2×2 lag-polynomial matrix gives

 x_t = [(1 − a_22·L)·u_t + a_12·L·v_t]/Δ
 y_t = [a_21·L·u_t + (1 − a_11·L)·v_t]/Δ

Here, Δ = (1 − a_11·L)(1 − a_22·L) − a_12·a_21·L² = 1 − (a_11 + a_22)·L + (a_11·a_22 − a_12·a_21)·L²

Or, Δ = (1 − λ_1·L)(1 − λ_2·L), with λ_1 and λ_2 being the two roots of the equation

 λ² − (a_11 + a_22)·λ + (a_11·a_22 − a_12·a_21) = 0

A convergent expansion of x and y in terms of u and v requires |λ_1| < 1 and |λ_2| < 1.

When the stability condition is satisfied, x and y can be expressed as functions of the current and lagged values of u and v. These are known as the impulse response functions. They show the current and lagged effects over time of changes in u and v on x and y.
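For a first-order system the recursion is especially simple: the h-step response of (x, y) to a unit shock is the coefficient matrix applied h times to the shock vector, and the stability condition |λ_i| < 1 makes the responses die out. A sketch with illustrative coefficient values:

```python
import numpy as np

A = np.array([[0.5, 0.1],    # a11, a12
              [0.2, 0.4]])   # a21, a22

# Stability: both eigenvalues of A must lie inside the unit circle
lam = np.linalg.eigvals(A)
print(np.round(np.abs(lam), 3))

# Impulse responses: response of (x, y) at horizon h to a one-unit u shock at h = 0
shock_u = np.array([1.0, 0.0])   # shock to the x equation
irf = [shock_u]
for _ in range(24):
    irf.append(A @ irf[-1])      # recursion: next response = A times current response
irf = np.array(irf)
print(np.round(irf[:4], 3))      # responses decay toward zero
```

For these values the eigenvalues are 0.6 and 0.3, so the responses shrink geometrically and are negligible after about two dozen periods.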
Some Questions:

1. Let y_t = βx_t + u_t with u_t = ρu_{t−1} + v_t and −1 < ρ < 1. Express this as an ARDL model.

Ans. y_t = βx_t + ρu_{t−1} + v_t

Again, y_{t−1} = βx_{t−1} + u_{t−1}, or u_{t−1} = y_{t−1} − βx_{t−1}

 ⇒ y_t = βx_t + ρ·(y_{t−1} − βx_{t−1}) + v_t

 ⇒ y_t = βx_t + ρy_{t−1} − ρβx_{t−1} + v_t

2. Express the ARDL model y_t = β_0·x_t + β_1·x_{t−1} + ρy_{t−1} + u_t in terms of the lag operator.

Ans. With the lag operator, y_{t−1} = L·y_t and x_{t−1} = L·x_t, so

 y_t = β_0·x_t + β_1·L·x_t + ρ·L·y_t + u_t

 ⇒ (1 − ρL)·y_t = (β_0 + β_1·L)·x_t + u_t

 ⇒ y_t = [(β_0 + β_1·L)/(1 − ρL)]·x_t + u_t/(1 − ρL)

3. Derive the ECM equivalent of the following ARDL model: y_t = β_0·x_t + β_1·x_{t−1} + ρy_{t−1} + u_t

Ans. y_t = β_0·x_t + β_1·x_{t−1} + ρy_{t−1} + u_t

 ⇒ y_t − y_{t−1} = β_0·x_t + β_1·x_{t−1} + (ρ − 1)·y_{t−1} + u_t

Since Δx_t = x_t − x_{t−1}, we have x_t = Δx_t + x_{t−1}, so

 ⇒ Δy_t = β_0·(Δx_t + x_{t−1}) + β_1·x_{t−1} + γ·y_{t−1} + u_t, where γ = ρ − 1

 ⇒ Δy_t = β_0·Δx_t + (β_0 + β_1)·x_{t−1} + γ·y_{t−1} + u_t

 ⇒ Δy_t = β_0·Δx_t + β_2·x_{t−1} + γ·y_{t−1} + u_t, where β_2 = β_0 + β_1

 ⇒ Δy_t = β_0·Δx_t + γ·(y_{t−1} + (β_2/γ)·x_{t−1}) + u_t

 ⇒ Δy_t = β_0·Δx_t + γ·(y_{t−1} − β_3·x_{t−1}) + u_t, where β_3 = −β_2/γ = −(β_0 + β_1)/γ
4. Write the ECM Δy_t = β_0·Δx_t + γ·(y_{t−1} − β_3·x_{t−1}) + u_t in lag operator form.

Ans. We have the ECM

 y_t − y_{t−1} = β_0·x_t − β_0·x_{t−1} + γ·y_{t−1} − γβ_3·x_{t−1} + u_t

 ⇒ y_t − y_{t−1} − γ·y_{t−1} = β_0·x_t − β_0·x_{t−1} − γβ_3·x_{t−1} + u_t

 ⇒ y_t − L·y_t − γ·L·y_t = β_0·x_t − β_0·L·x_t − γβ_3·L·x_t + u_t

 ⇒ [1 − (1 + γ)·L]·y_t = (β_0 − β_0·L − γβ_3·L)·x_t + u_t

Since γβ_3 = −(β_0 + β_1) and 1 + γ = ρ:

 ⇒ (1 − ρL)·y_t = (β_0 + β_1·L)·x_t + u_t

 ⇒ y_t = [(β_0 + β_1·L)/(1 − ρL)]·x_t + u_t/(1 − ρL)

5. How will you estimate the ECM for y_t = α + βx_t + u_t?

Ans. (1) Engle-Granger Two-step Procedure

(a) Estimate the model y_t = α + βx_t + u_t

(b) Generate the residuals: û_t = y_t − α̂ − β̂x_t, and hence û_{t−1} = y_{t−1} − α̂ − β̂x_{t−1}

(c) Estimate the model Δy_t = θ0 + θ1 Δx_t + θ2 û_{t−1} + v_t

(2) One-step Procedure

Δy_t = θ0 + θ1 Δx_t + θ2 (y_{t−1} − α − βx_{t−1}) + v_t = (θ0 − αθ2) + θ1 Δx_t + θ2 y_{t−1} − θ2 β x_{t−1} + v_t

In this setup, θ2, the coefficient on the lagged dependent variable, is the coefficient on the error-correction mechanism, and θ1 is the short-run effect of X on Y.

6. Interpret the coefficients of the ECM

Δy_t = β0 Δx_t + γ (y_{t−1} − β3 x_{t−1}) + u_t

 The model uses differences of both the dependent and the independent variables.
 Inclusion of the term (y_{t−1} − β3 x_{t−1}) reflects the assumption that X and Y have a long-run equilibrium relationship.
 More specifically, any change in y is the sum of two effects: (i) the short-run impact of the change in x on y, and (ii) the long-run impact of the deviation from equilibrium, adjusted in each period (short-run adjustments) at the rate γ.
 Here, β0 captures the short-run relationship between X and Y. It indicates how much Y immediately changes if X rises by one unit in a period.
 On the other hand, γ gives the rate at which the model re-equilibrates, i.e., the speed at which Y returns to its equilibrium level. Formally, γ tells us the proportion of the disequilibrium that is corrected in each period.
 This coefficient should be negative and less than one in absolute value, indicating its re-equilibrating property. If γ = 0, the process never re-equilibrates; if γ = −1, re-equilibration occurs in a single period.
7. Consider the following models:

y_t + βx_t = u_t; u_t = u_{t−1} + e_t; e_t ~ IN(0, σ1²)
y_t + αx_t = v_t; v_t = ρv_{t−1} + w_t with |ρ| < 1; w_t ~ IN(0, σ2²)

Prove that x_t ~ I(1) and y_t ~ I(1).

Ans. Here, u_t ~ I(1) (a random walk) and v_t ~ I(0) (a stationary AR(1) process). Hence any linear combination z_t = a u_t + b v_t with a ≠ 0 is I(1).

Now, in matrix form (assuming α ≠ β):

[1, β; 1, α] [y_t; x_t] = [u_t; v_t]

By Cramer's rule,

x_t = |1, u_t; 1, v_t| / |1, β; 1, α| = (v_t − u_t)/(α − β) and y_t = |u_t, β; v_t, α| / |1, β; 1, α| = (αu_t − βv_t)/(α − β)

Thus, both x and y are linear combinations of u and v with nonzero weight on the random walk u.

Hence, x_t ~ I(1) and y_t ~ I(1).
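The construction can be checked by simulation; α, β and the AR coefficient below are assumed values:

```python
# Sketch: simulate y_t + beta*x_t = u_t (u_t a random walk, I(1)) and
# y_t + alpha*x_t = v_t (v_t a stationary AR(1), I(0)), then solve for
# x_t and y_t by Cramer's rule as in the answer. Values are assumptions.
import numpy as np

rng = np.random.default_rng(1)
n = 1000
alpha, beta = 2.0, 0.5

u = np.cumsum(rng.normal(size=n))        # I(1): random walk
v = np.zeros(n)
for t in range(1, n):                     # I(0): stationary AR(1), |rho| < 1
    v[t] = 0.8 * v[t - 1] + rng.normal()

# Cramer's rule solution of [[1, beta], [1, alpha]] [y, x]' = [u, v]'
x = (v - u) / (alpha - beta)
y = (alpha * u - beta * v) / (alpha - beta)

# The solution must satisfy both original equations exactly
assert np.allclose(y + beta * x, u)
assert np.allclose(y + alpha * x, v)

# x and y inherit the unit root from u: their variance grows over the sample,
# while the combination y + alpha*x = v stays bounded.
print(np.var(x[: n // 2]), np.var(x[n // 2 :]), np.var(v))
```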

8. Consider the MA(2) process: y_t = β0 u_t + β1 u_{t−1} + β2 u_{t−2}. Derive the invertibility conditions.

Ans. We have the MA process

y_t = β0 u_t + β1 u_{t−1} + β2 u_{t−2}

Using the lag operator for u, u_{t−1} = L u_t and u_{t−2} = L² u_t

 y_t = β0 u_t + β1 L u_t + β2 L² u_t

Scaling β0 to 1 we get y_t = u_t + β1 L u_t + β2 L² u_t = (1 + β1 L + β2 L²) u_t = β(L) u_t

 y_t = [(1 − λ1 L)(1 − λ2 L)] u_t

Here, λ1 and λ2 are the roots of the quadratic equation λ² + β1 λ + β2 = 0

Invertibility requires |λi| < 1

 |(−β1 ± √(β1² − 4β2))/2| < 1

 β1 + β2 > −1; β2 − β1 > −1; |β2| < 1
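A sketch of the invertibility check in code, with assumed values of β1 and β2; the root condition and the three inequalities should agree:

```python
# Sketch: check MA(2) invertibility two ways. The lambda_i are the roots of
# lambda^2 + beta1*lambda + beta2 = 0; invertibility requires |lambda_i| < 1,
# equivalently beta1 + beta2 > -1, beta2 - beta1 > -1, |beta2| < 1.
# The beta values are illustrative assumptions.
import numpy as np

beta1, beta2 = 0.5, 0.06
lambdas = np.roots([1.0, beta1, beta2])
invertible_roots = np.all(np.abs(lambdas) < 1)

invertible_ineq = (beta1 + beta2 > -1) and (beta2 - beta1 > -1) and (abs(beta2) < 1)
print(lambdas, invertible_roots, invertible_ineq)
```

With these values the roots are −0.2 and −0.3, so both criteria report an invertible process.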

9. Consider the AR(2) process: y_t = α1 y_{t−1} + α2 y_{t−2} + u_t. Derive the convergence conditions.

Ans. We have the AR process

y_t = α1 y_{t−1} + α2 y_{t−2} + u_t

Using the lag operator for y, y_{t−1} = L y_t and y_{t−2} = L² y_t

 y_t = α1 L y_t + α2 L² y_t + u_t

 (1 − α1 L − α2 L²) y_t = u_t

 y_t = u_t / [(1 − θ1 L)(1 − θ2 L)]

Here, θ1 and θ2 are the roots of the quadratic equation θ² − α1 θ − α2 = 0

Convergence requires |θi| < 1

 |(α1 ± √(α1² + 4α2))/2| < 1

 α1 + α2 < 1; α1 − α2 > −1; |α2| < 1

For example, y_t = 0.6 y_{t−1} + 0.2 y_{t−2} + u_t satisfies all three conditions (0.6 + 0.2 < 1; 0.6 − 0.2 > −1; |0.2| < 1).
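The conditions can be checked numerically for the example above:

```python
# Sketch: verify the convergence (stationarity) conditions for the AR(2)
# example y_t = 0.6*y_{t-1} + 0.2*y_{t-2} + u_t. The theta_i are the roots
# of theta^2 - alpha1*theta - alpha2 = 0; convergence requires |theta_i| < 1.
import numpy as np

alpha1, alpha2 = 0.6, 0.2
thetas = np.roots([1.0, -alpha1, -alpha2])
stable_roots = np.all(np.abs(thetas) < 1)

stable_ineq = (alpha1 + alpha2 < 1) and (alpha1 - alpha2 > -1) and (abs(alpha2) < 1)
print(thetas, stable_roots, stable_ineq)
```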
Topics discussed in the class on 26/09/2020 (Saturday)

1. DF versus ADF test


2. PP test and its comparison with other tests
3. Steps to be followed for selection of the functional form for unit root test
4. Cointegration

Topics discussed in the class on 28/09/2020 (Monday)

1. Structural breaks versus temporal fluctuations


2. Importance of selecting the appropriate functional specification for unit root test
3. Use of the DW d statistic to test for cointegration – the ADF test on the residuals is one part of the cointegration test; when the DW d statistic is used, carrying out the ADF test in the second stage is not necessary. In this DW test, the null hypothesis is d = 0 (i.e., ρ = 1, so the residuals have a unit root and the variables are not cointegrated). If the null hypothesis is rejected, the variables are cointegrated. The critical values are given by Sargan and Bhargava.

Topics discussed in the class on 01/10/2020 (Thursday)

1. Size and power of a statistical test


 The size of a test is the probability of incorrectly rejecting the null hypothesis when it is true (i.e., the (maximum) probability of committing a Type I error).
 The power of a test is the probability of correctly rejecting the null hypothesis when it is false, i.e., the probability of making the correct decision (to reject the null hypothesis) when the null hypothesis is false.
 Increasing the sample size makes the hypothesis test more sensitive, i.e., more likely to reject the null hypothesis when it is, in fact, false. Thus, it increases the power of the test, and the probability of making a Type II error becomes smaller, not bigger, as the sample size increases.
2. DF-GLS test for unit root – a two-step procedure: (a) detrend the time-series via a generalized least squares (GLS) regression, and (b) carry out the ADF test on the detrended series.
Topics discussed in the class on 05/10/2020 (Monday)

1. Issue of non-linear trend


2. Selection of lag length when the conclusion differs across criteria
3. DF-GLS test in relation to ADF test
Topics discussed in the class on 06/10/2020 (Tuesday)

1. Application of unit root test: Example from the paper on instabilities in market concentration
2. Information criteria for lag length selection
3. Phillips-Perron test statistic (the mathematical details)
4. Autocorrelation function and partial autocorrelation function (PACF)

Topics discussed in the class on 07/10/2020 (Wednesday) – Lab Class

1. Assignment III for the Lab course: (a) testing instabilities in the index of industrial production (by use-based classification of industries); (b) examining the impact of cropping intensity and irrigation facilities on the use of chemical fertilizers in the Indian agriculture sector
2. Focus Areas for (a): (i) Concept of instability, its implications and use of time-series econometrics; (ii) Splicing method (for converting
index number with different bases into a common base); (iii) interpretation of the results
3. Focus Areas for (b): (i) Measurement of the variables and their interpretation and implications; (ii) Different aspects of estimation (to be
discussed further)

Topics discussed in the class on 08/10/2020 (Thursday)

1. Revisiting correlogram (ACF, PACF)


2. Box-Pierce and Ljung-Box Q Statistic
3. Partial correlation coefficient, its computation and interpretation
4. Computation of autocorrelation and partial autocorrelation coefficients
Topics discussed in the class on 09/10/2020 (Friday)

1. Revisiting Granger causality test – issue of unit roots and cointegration


2. Method of OLS and VAR approach to Granger causality test
3. Interpretation of results of Granger causality test
Topics discussed in the class on 10/10/2020 (Saturday)

1. Discussions on the assignments for the Lab course


2. Discussions on the results on Granger causality test
3. Summary of topics covered so far in the course
Topics discussed in the class on 12/10/2020 (Monday)

1. Decision matrix on Granger causality and feedback effects


2. Error correction mechanisms (ECM)
3. Vector autoregression (VAR)
Topics discussed in the class on 14/10/2020 (Wednesday) – Lab Class

1. System of simultaneous difference equations


2. Estimation of VAR
3. Impulse response functions
Topics discussed in the class on 15/10/2020 (Thursday)

1. System of simultaneous differential equations


2. Simultaneous difference equations in relation to VAR
3. Derivation of impulse response functions
Topics discussed in the class on 16/10/2020 (Friday)

1. Answering questions on ARDL, ECM


2. Estimation and interpretation of ECM
Topics discussed in the class on 17/10/2020 (Saturday)

1. Basics of MA, AR, ARMA and ARIMA


2. Stability conditions for MA and AR
[Summary diagram from the class, reconstructed as a list:]

Stochastic process: purely random (white noise); random walk (with or without drift and trend); moving average; autoregressive; ARMA; ARIMA; stationary (weakly or strongly) versus non-stationary.
Tests for unit roots: DF test; ADF test; PP test; DF-GLS test; break-point tests.
Critical aspects: selecting the functional form (no drift and no trend, drift only, drift and trend); selecting the lag length; significance level; ACF, PACF and Q-statistic; structural breaks.
If unit roots are present: differencing the series (DSP); detrending the series (TSP); smoothing of the series; log transformation; implications for the Granger causality test (making the series stationary via TSP and DSP).
Cointegration test: ADF test for the residuals; DW d test.