Stationarity, Cointegration: Arnaud Chevalier University College Dublin January 2004
Arnaud Chevalier
University College Dublin
January 2004
STATIONARITY
Typically, we only observe one set of realisations for any particular series.
However, if yt is stationary, the mean, variance and autocorrelations can
usually be well approximated by sufficiently long time averages based on a
single set of realisations.
A stochastic process having a finite mean and variance is covariance
stationary if for all t and t-s:
$$E(y_t) = E(y_{t-s}) = \mu$$
$$E[(y_t - \mu)^2] = E[(y_{t-s} - \mu)^2] = \sigma_y^2$$
$$E[(y_t - \mu)(y_{t-s} - \mu)] = E[(y_{t-j} - \mu)(y_{t-j-s} - \mu)] = \gamma_s$$
A series is covariance stationary if its mean and all autocovariances are
unaffected by a change of time origin.
For a covariance stationary series, the autocorrelation between yt and yt-s is:
$$\rho_s = \gamma_s / \gamma_0$$
The autocorrelation depends only on the lag s and is independent of time.
* Stationarity conditions for an AR(1) process
$$y_t = a_0 + a_1 y_{t-1} + \varepsilon_t$$
Suppose the process started in period zero, so y0 is a deterministic
initial condition. Iterating backwards to period zero gives:
$$y_t = a_0 \sum_{i=0}^{t-1} a_1^i + a_1^t y_0 + \sum_{i=0}^{t-1} a_1^i \varepsilon_{t-i}$$
$$E y_t = a_0 \sum_{i=0}^{t-1} a_1^i + a_1^t y_0$$
So the mean is time dependent, and this sequence is therefore not
stationary.
If |a1|<1, $a_1^t y_0$ converges to 0 as t→∞.
Also, as t becomes large, we have: $a_0 \sum_{i=0}^{\infty} a_1^i = a_0/(1 - a_1)$
Thus $\lim y_t = a_0/(1 - a_1) + \sum_{i=0}^{\infty} a_1^i \varepsilon_{t-i}$
And for large t, $E y_t = a_0/(1 - a_1)$, which is finite and time independent.
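The limit of the variance, writing $\mu = a_0/(1 - a_1)$, is:
$$E(y_t - \mu)^2 = E[(\varepsilon_t + a_1 \varepsilon_{t-1} + a_1^2 \varepsilon_{t-2} + \dots)^2] = \sigma^2 [1 + a_1^2 + a_1^4 + \dots] = \sigma^2 / (1 - a_1^2)$$
Which is also finite and time-independent.
Similarly, the limiting values of all autocovariances are finite and time-independent:
$$E[(y_t - \mu)(y_{t-s} - \mu)] = E[(\varepsilon_t + a_1 \varepsilon_{t-1} + a_1^2 \varepsilon_{t-2} + \dots)(\varepsilon_{t-s} + a_1 \varepsilon_{t-s-1} + a_1^2 \varepsilon_{t-s-2} + \dots)]$$
$$= \sigma^2 a_1^s [1 + a_1^2 + a_1^4 + \dots] = \sigma^2 a_1^s / (1 - a_1^2)$$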
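From (17) it is easy to see that $E(y_t) = 0$ and $\mathrm{var}(y_t) = \sigma^2 \sum_{i=0}^{\infty} c_i^2$
are finite and time invariant, where $c_i$ is written here for the coefficient on $\varepsilon_{t-i}$ in (17).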
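* AR(1) process: $y_t = a_0 + a_1 y_{t-1} + \varepsilon_t$
$$\gamma_0 = \sigma^2 / (1 - a_1^2)$$
$$\gamma_s = \sigma^2 a_1^s / (1 - a_1^2)$$
Thus, the autocorrelations are $\rho_s = \gamma_s / \gamma_0$, hence $\rho_0 = 1$, $\rho_1 = a_1$, $\rho_s = a_1^s$.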
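The AR(1) results above can be checked numerically. The following is a minimal sketch in Python (standard library only); the function names and the parameter values are illustrative assumptions, not part of the original notes. It evaluates the finite-t mean, the limiting mean and variance, and the autocorrelation $\rho_s = a_1^s$.

```python
def ar1_mean(a0, a1, y0, t):
    """E[y_t] = a0 * sum_{i=0}^{t-1} a1**i + a1**t * y0, with y0 deterministic."""
    return a0 * sum(a1 ** i for i in range(t)) + a1 ** t * y0


def ar1_limits(a0, a1, sigma2):
    """Limiting mean a0/(1-a1) and variance sigma2/(1-a1**2); needs |a1| < 1."""
    if abs(a1) >= 1:
        raise ValueError("the limits exist only when |a1| < 1")
    return a0 / (1 - a1), sigma2 / (1 - a1 ** 2)


def ar1_autocorrelation(a1, s):
    """rho_s = gamma_s / gamma_0 = a1**s for a stationary AR(1)."""
    return a1 ** s


# Illustrative (assumed) parameter values.
a0, a1, y0, sigma2 = 2.0, 0.5, 10.0, 1.0
for t in (1, 5, 50):
    # The mean depends on t for small t and settles down as t grows.
    print(t, ar1_mean(a0, a1, y0, t))
print(ar1_limits(a0, a1, sigma2))
print([ar1_autocorrelation(a1, s) for s in range(4)])
```

With these values the mean starts far from its limit because of the initial condition y0, and the influence of y0 dies out geometrically at rate a1.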
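For an AR(2) process $y_t = a_1 y_{t-1} + a_2 y_{t-2} + \varepsilon_t$, the Yule-Walker equations give:
$$\gamma_0 = a_1 \gamma_1 + a_2 \gamma_2 + \sigma^2$$
$$\gamma_s = a_1 \gamma_{s-1} + a_2 \gamma_{s-2}$$
Dividing γs by γ0 yields:
$$\rho_s = a_1 \rho_{s-1} + a_2 \rho_{s-2} \qquad (19)$$
and we know that ρ0=1.
Stationarity requires that the roots of (19) lie inside the unit circle.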
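The recursion (19) can be turned into a short routine. This is a sketch under stated assumptions: the function name is hypothetical, and the starting value $\rho_1 = a_1/(1 - a_2)$ is obtained by setting s = 1 in (19) and using $\rho_{-1} = \rho_1$, a step that is not spelled out in the notes.

```python
def ar2_acf(a1, a2, max_lag):
    """Autocorrelations rho_0..rho_max_lag of y_t = a1*y_{t-1} + a2*y_{t-2} + e_t.

    rho_0 = 1, rho_1 = a1 / (1 - a2), then rho_s = a1*rho_{s-1} + a2*rho_{s-2}.
    The formulas are meaningful only for a stationary process.
    """
    rho = [1.0, a1 / (1.0 - a2)]
    for s in range(2, max_lag + 1):
        rho.append(a1 * rho[s - 1] + a2 * rho[s - 2])
    return rho[: max_lag + 1]


# Illustrative (assumed) coefficients; a2 = 0 collapses to the AR(1) case.
print(ar2_acf(0.7, -0.49, 6))
print(ar2_acf(0.5, 0.0, 3))
```

Setting a2 = 0 reproduces the AR(1) pattern $\rho_s = a_1^s$, which is a quick sanity check on the recursion.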
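* Autocorrelation function of an MA(1) process: $y_t = \varepsilon_t + \beta \varepsilon_{t-1}$
By the Yule-Walker equations, we get:
$$\gamma_0 = E(y_t y_t) = E[(\varepsilon_t + \beta \varepsilon_{t-1})(\varepsilon_t + \beta \varepsilon_{t-1})] = (1 + \beta^2)\sigma^2$$
$$\gamma_1 = E(y_t y_{t-1}) = E[(\varepsilon_t + \beta \varepsilon_{t-1})(\varepsilon_{t-1} + \beta \varepsilon_{t-2})] = \beta \sigma^2$$
And $\gamma_s = E(y_t y_{t-s}) = E[(\varepsilon_t + \beta \varepsilon_{t-1})(\varepsilon_{t-s} + \beta \varepsilon_{t-s-1})] = 0$ for all $s > 1$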
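For an ARMA(1,1) process $y_t = a_1 y_{t-1} + \varepsilon_t + \beta_1 \varepsilon_{t-1}$:
$$\gamma_1 = \frac{(1 + a_1 \beta_1)(a_1 + \beta_1)}{1 - a_1^2} \sigma^2$$
Hence
$$\rho_1 = \frac{(1 + a_1 \beta_1)(a_1 + \beta_1)}{1 + \beta_1^2 + 2 a_1 \beta_1}$$
And $\rho_s = a_1 \rho_{s-1}$ for all $s \geq 2$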
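As an illustration, the autocorrelation function just derived can be computed directly. The sketch below assumes the process is the ARMA(1,1) $y_t = a_1 y_{t-1} + \varepsilon_t + \beta_1 \varepsilon_{t-1}$ (the function name and the argument name `b1` for $\beta_1$ are hypothetical). Setting `a1 = 0` gives the MA(1) case, where $\rho_1 = \beta/(1+\beta^2)$ and all higher autocorrelations are zero.

```python
def arma11_acf(a1, b1, max_lag):
    """Autocorrelations of y_t = a1*y_{t-1} + e_t + b1*e_{t-1} (assumes |a1| < 1).

    rho_1 = (1 + a1*b1)(a1 + b1) / (1 + b1**2 + 2*a1*b1), rho_s = a1*rho_{s-1}.
    """
    rho1 = (1 + a1 * b1) * (a1 + b1) / (1 + b1 ** 2 + 2 * a1 * b1)
    rho = [1.0, rho1]
    for _ in range(2, max_lag + 1):
        rho.append(a1 * rho[-1])
    return rho[: max_lag + 1]


# Illustrative (assumed) values.
print(arma11_acf(0.0, 0.5, 3))   # pure MA(1): cuts off after lag 1
print(arma11_acf(0.7, 0.0, 3))   # pure AR(1): geometric decay
print(arma11_acf(0.7, -0.4, 3))  # mixed case: geometric decay from lag 1 onwards
```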
6 PARTIAL AUTOCORRELATION
In an AR(1) process, yt and yt-2 are correlated even though yt-2 does not
directly appear in the model. In contrast the partial autocorrelation between
yt and yt-s eliminates the effect of the intervening values. So, for an AR(1)
process, the partial autocorrelation between yt and yt-2 is 0.
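The most direct way to obtain the partial autocorrelations is to form the demeaned series:
$$y_t^* = y_t - \mu$$
and estimate the autoregression
$$y_t^* = \phi_{11} y_{t-1}^* + e_t$$
where $\phi_{11}$ is the partial autocorrelation between yt and yt-1.
Similarly, $\phi_{22}$ gives the partial autocorrelation between yt and yt-2:
$$y_t^* = \phi_{21} y_{t-1}^* + \phi_{22} y_{t-2}^* + e_t$$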
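Partial autocorrelations can also be found from the Yule-Walker equations:
$$\phi_{11} = \rho_1$$
$$\phi_{22} = (\rho_2 - \rho_1^2) / (1 - \rho_1^2)$$
$$\phi_{ss} = \frac{\rho_s - \sum_{j=1}^{s-1} \phi_{s-1,j}\, \rho_{s-j}}{1 - \sum_{j=1}^{s-1} \phi_{s-1,j}\, \rho_j}$$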
For an AR(p) process, there is no direct correlation between yt and yt-s for s>p,
so the partial autocorrelations are zero beyond lag p.
An MA(1) process can be written as an AR(∞), so its partial autocorrelations
never cut off: they are non-zero at every lag and decay gradually.
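The Yule-Walker formula for $\phi_{ss}$ can be applied recursively. The sketch below is one way to do it; the function name is hypothetical, and the update $\phi_{s,j} = \phi_{s-1,j} - \phi_{ss}\,\phi_{s-1,s-j}$ for the intermediate coefficients is the standard Durbin-Levinson step, which the notes do not state explicitly and is therefore an assumption here.

```python
def pacf_from_acf(rho):
    """Partial autocorrelations [phi_11, ..., phi_nn] from rho = [1, rho_1, ..., rho_n]."""
    n = len(rho) - 1
    pacf = []
    prev = []  # prev[j-1] holds phi_{s-1,j} for j = 1..s-1
    for s in range(1, n + 1):
        if s == 1:
            phi_ss = rho[1]
        else:
            num = rho[s] - sum(prev[j - 1] * rho[s - j] for j in range(1, s))
            den = 1.0 - sum(prev[j - 1] * rho[j] for j in range(1, s))
            phi_ss = num / den
        # Durbin-Levinson update of the intermediate coefficients (assumed step).
        prev = [prev[j - 1] - phi_ss * prev[s - j - 1] for j in range(1, s)] + [phi_ss]
        pacf.append(phi_ss)
    return pacf


# AR(1) with a1 = 0.5: rho_s = 0.5**s, so the PACF should cut off after lag 1.
print(pacf_from_acf([1.0, 0.5, 0.25, 0.125]))
# MA(1) with beta = 0.5: rho_1 = 0.4 and rho_s = 0 afterwards; the PACF does not cut off.
print(pacf_from_acf([1.0, 0.4, 0.0, 0.0]))
```

The two calls illustrate the contrast drawn in the text: the AR(1) partial autocorrelations vanish after the first lag, while those of the MA(1) alternate in sign and shrink without ever being exactly zero.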
For an AR(1), the root of $1 - \beta_1 z = 0$ is $z = 1/\beta_1$.
The condition that the root is greater than unity in absolute value is equivalent to
$|\beta_1| < 1$.
If an AR(p) has a root equal to one, the series is said to have a unit (autoregressive)
root. If Yt has a unit root, it contains a stochastic trend and is not stationary (the two
terms can be used interchangeably).
If a series has a unit root, three problems arise: the estimator of the autoregressive
coefficient in an AR(p) is biased towards 0, t-statistics have a non-normal distribution,
and two independent series may appear related.
1) bias towards 0
Suppose that the true model is a random walk ($Y_t = Y_{t-1} + u_t$) but the econometrician
estimates an AR(1) ($Y_t = \beta_1 Y_{t-1} + u_t$).
Since the series is non-stationary, the OLS assumptions are not satisfied and it can be
shown that:
$E(\hat{\beta}_1) = 1 - 5.3/T$. So with 20 years of quarterly data (T = 80), you would expect $\hat{\beta}_1 = 0.934$.
A Monte Carlo experiment with 100 replications gives:
Variable | Obs Mean Std. Dev. Min Max
-------------+-----------------------------------------------------
RES1 | 100 .9270481 .0570009 .7792342 1.010915
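A Monte Carlo experiment of this kind can be sketched as follows. This is not the code that produced the table above: the seed, the number of replications, the standard normal errors and the function names are all assumptions. The estimated regression includes an intercept, which is also an assumption here (the AR(1) in the text is written without one); the 5.3/T approximation is usually stated for the regression with an intercept.

```python
import random


def ols_slope_with_intercept(x, y):
    """Slope from an OLS regression of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def simulate_random_walk(T, rng):
    """Y_t = Y_{t-1} + u_t with Y_0 = 0 and standard normal u_t (assumed)."""
    y = [0.0]
    for _ in range(T):
        y.append(y[-1] + rng.gauss(0.0, 1.0))
    return y


def mean_ar1_estimate(T=80, replications=1000, seed=0):
    """Average OLS estimate of the AR(1) coefficient when the truth is a random walk."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replications):
        y = simulate_random_walk(T, rng)
        total += ols_slope_with_intercept(y[:-1], y[1:])
    return total / replications


print(mean_ar1_estimate(T=80), "compared with 1 - 5.3/80 =", 1 - 5.3 / 80)
```

The average estimate falls visibly below the true value of 1 even though each sample has 80 observations, which is the downward bias described above.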
[Figures: two time-series plots against daten, 1960 to 2000]
7.7.2 Testing for unit root
The most commonly used test in practice is the Dickey and Fuller test.
------------------------------------------------------------------------------
| Robust
dinf | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
inf_1 | -.1559304 .0737577 -2.11 0.038 -.3031924 -.0086683
_cons | 1.07776 .4075892 2.64 0.010 .2639821 1.891538
------------------------------------------------------------------------------
The DF statistic does not have a normal distribution, so the critical values are specific
to the test.
Table 7.1 Critical values for Augmented Dickey and Fuller test
10% 5% 1%
Intercept only -2.57 -2.86 -3.43
Intercept and time trend -3.12 -3.41 -3.96
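In the previous regression the t-statistic on inf_1 is -2.11, which is above even the 10%
critical value (-2.57, intercept only). We therefore cannot reject $\delta = 0$ at any conventional
significance level: the hypothesis that the series has a unit root, and so is not stationary, is not rejected.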
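The mechanics of the test can be sketched in a few lines. The code below is an illustrative assumption, not the procedure used for the output above: it runs the simplest Dickey-Fuller regression $\Delta Y_t = \beta_0 + \delta Y_{t-1} + u_t$ (no lagged differences, homoskedasticity-only standard error rather than the robust one shown in the output) and returns the t-statistic on $\delta$, to be compared with the critical values in Table 7.1. Function names and simulated data are hypothetical.

```python
import math
import random


def dickey_fuller_t(y):
    """t-statistic on delta in dY_t = b0 + delta*Y_{t-1} + u_t (no lagged differences)."""
    x = y[:-1]
    dy = [y[t] - y[t - 1] for t in range(1, len(y))]
    n = len(x)
    mx, md = sum(x) / n, sum(dy) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    delta = sum((xi - mx) * (di - md) for xi, di in zip(x, dy)) / sxx
    b0 = md - delta * mx
    ssr = sum((di - b0 - delta * xi) ** 2 for xi, di in zip(x, dy))
    se = math.sqrt(ssr / (n - 2) / sxx)  # homoskedasticity-only standard error
    return delta / se


def simulate_ar1(a1, T, seed):
    """y_t = a1*y_{t-1} + e_t with standard normal e_t (assumed) and y_0 = 0."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(T):
        y.append(a1 * y[-1] + rng.gauss(0.0, 1.0))
    return y


CRITICAL_5PCT_INTERCEPT_ONLY = -2.86  # from Table 7.1

t_stationary = dickey_fuller_t(simulate_ar1(0.5, 500, seed=0))
t_random_walk = dickey_fuller_t(simulate_ar1(1.0, 500, seed=0))
print("stationary AR(1):", t_stationary, t_stationary < CRITICAL_5PCT_INTERCEPT_ONLY)
print("random walk:", t_random_walk, t_random_walk < CRITICAL_5PCT_INTERCEPT_ONLY)
```

The unit root null is rejected only when the statistic is more negative than the critical value; a value such as -2.11 is not.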
* Dickey-Fuller test in the AR(p) model
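For an AR(p), the Dickey Fuller test is based on the following regression:
$$\Delta Y_t = \beta_0 + \delta Y_{t-1} + \gamma_1 \Delta Y_{t-1} + \gamma_2 \Delta Y_{t-2} + \dots + \gamma_p \Delta Y_{t-p} + u_t \qquad (7.7)$$
H0: $\delta = 0$ vs H1: $\delta < 0$
The ADF statistic is the OLS t-statistic testing $\delta = 0$. If H0 is rejected, Yt
is stationary.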
The number of p-lags needed is unknown. Studies suggest that for the ADF
it is better to have too many lags rather than too few, so it is recommended
to use the AIC to determine the number of lags for the ADF.
* Dickey Fuller allowing for a linear trend
Some series have an obvious linear trend (Japanese GDP), so it is
uninformative to test their stationarity without accounting for the trend.
If the alternative is that Yt is stationary around a deterministic linear trend, the
trend must be added to (7.7), which becomes:
$$\Delta Y_t = \beta_0 + \alpha t + \delta Y_{t-1} + \gamma_1 \Delta Y_{t-1} + \gamma_2 \Delta Y_{t-2} + \dots + \gamma_p \Delta Y_{t-p} + u_t$$
If the series is found to have a unit root, then the first difference of the
series does not have a trend. For example, if $Y_t = \beta_0 + Y_{t-1} + u_t$ then
$\Delta Y_t = \beta_0 + u_t$ is stationary.
Note: The power of a test is equal to the probability of rejecting a false null
hypothesis (1 - prob Type II). Monte Carlo studies have shown that UR tests have
low power: they cannot distinguish between a unit root and a stationary
near-unit-root process. Thus the test will often indicate that a series contains
a UR.
$$y_t = 1.1 y_{t-1} - 0.1 y_{t-2} + \varepsilon_t$$
$$z_t = 1.1 z_{t-1} - 0.15 z_{t-2} + \varepsilon_t$$
Checking for UR:
$$1 - 1.1 y + 0.1 y^2 = 0$$
With the first process, we have: $(y - 1)(0.1 y - 1) = 0$, so
$y = 1$, $y = 10$.
With the second process, the characteristic equation $\lambda^2 - 1.1\lambda + 0.15 = 0$ has roots
0.9405 and 0.1595, both inside the unit circle (equivalently, the roots of $1 - 1.1 z + 0.15 z^2 = 0$
are 1.063 and 6.270, both outside the unit circle).
So the first process has a UR and the second one is stationary.
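The root calculations for these two AR(2) processes can be reproduced with the quadratic formula. In this sketch the function names and the tolerance are assumptions; the polynomial solved is $1 - a_1 z - a_2 z^2 = 0$, whose roots must lie outside the unit circle for stationarity (their reciprocals are the characteristic roots, which must lie inside it).

```python
import cmath


def inverse_characteristic_roots(a1, a2):
    """Roots of 1 - a1*z - a2*z**2 = 0 for y_t = a1*y_{t-1} + a2*y_{t-2} + e_t (a2 != 0)."""
    # Rewritten as a2*z**2 + a1*z - 1 = 0 and solved with the quadratic formula.
    disc = cmath.sqrt(a1 ** 2 + 4 * a2)
    return ((-a1 + disc) / (2 * a2), (-a1 - disc) / (2 * a2))


def classify(a1, a2, tol=1e-8):
    """Label the process by the position of its roots relative to the unit circle."""
    moduli = [abs(z) for z in inverse_characteristic_roots(a1, a2)]
    if any(abs(m - 1.0) < tol for m in moduli):
        return "unit root"
    if all(m > 1.0 for m in moduli):
        return "stationary"
    return "explosive"


for name, a1, a2 in (("y", 1.1, -0.1), ("z", 1.1, -0.15)):
    roots = inverse_characteristic_roots(a1, a2)
    print(name, [abs(r) for r in roots], [1 / abs(r) for r in roots], classify(a1, a2))
```

For the second process the reciprocals of the roots are the 0.9405 and 0.1595 reported above; the larger one is close to 1, which is why the two simulated series are hard to tell apart.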
[Figure: simulated series y and z plotted against t, for t = 0 to 400]
Similarly, it can be difficult to distinguish between a trend stationary and a unit root process with
drift.
$$w_t = 1 + 0.02 t + \varepsilon_t$$
$$x_t = 0.02 + x_{t-1} + \varepsilon_t / 3$$
[Figure: simulated series w and x plotted against t, for t = 0 to 400]
In the short run, the forecasts from stationary and non-stationary models
will be close; however, the long-term forecasts will be quite different.
Also, the power of the unit root test is drastically affected by the data
generating process. If we inappropriately omit the intercept or time
trend, the power of the UR test can go to 0. For example, omitting the
trend leads to an upward bias in the estimated value of $\delta$ in:
$$\Delta Y_t = \beta_0 + \alpha t + \delta Y_{t-1} + \gamma_1 \Delta Y_{t-1} + \gamma_2 \Delta Y_{t-2} + \dots + \gamma_p \Delta Y_{t-p} + u_t \qquad (7.8)$$
Thus a procedure for UR testing can take the following form:
1 - Use the least restrictive model (7.8) to test for a UR.
UR tests have low power to reject H0, so if H0 is rejected there is
no need to proceed further. If not, go to step 2.
2 - Test $\alpha = 0$. If it is rejected, use (7.8) to test for a UR as in step 1.
If it is not rejected, use (7.7) to test for a UR: if H0 is rejected, conclude that there is no unit root; if
not, go to step 3.
3 - Test $\beta_0 = 0$. If it is rejected, go back to step 2.
If it is not rejected, use $\Delta Y_t = \delta Y_{t-1} + \sum_{j=1}^{p} \gamma_j \Delta Y_{t-j} + u_t$ to test for a UR.
7.8 Non stationary: Breaks
A second type of nonstationarity arises when the population regression
function changes over the course of the sample.
A break can arise either from a discrete change in the population
regression coefficients at a distinct date (policy change) or from a
gradual evolution of the coefficients over a longer period of time
(change in the structure of the economy).
If the break is not noticed, estimates will be based on the average
behaviour of the series over the period of time and not the true
relationship at the end of the period, so forecasts will be poor.
7.8.1 Testing for breaks at a known date
To keep it simple, let's consider the ADL(1,1) model. Let $\tau$ denote the period
at which the break is supposed to have happened.
Create a dummy variable (Dt) taking the value 0 before $\tau$ and 1 after $\tau$. D is also
interacted with Yt-1 and Xt-1.
$$Y_t = \beta_0 + \beta_1 Y_{t-1} + \delta_1 X_{t-1} + \gamma_0 D_t + \gamma_1 (D_t \times Y_{t-1}) + \gamma_2 (D_t \times X_{t-1}) + u_t$$
The hypothesis of no break, $\gamma_0 = \gamma_1 = \gamma_2 = 0$, can be tested using an F-test.
Under the alternative of a break, at least one of these coefficients will be
different from 0. This is usually referred to as a Chow test.
This approach can be modified to check for a break in a subset of the
coefficients by including only the binary variable interactions for the subset of
regressors of interest.
7.8.2 Testing for break at an unknown date
Often the date of a possible break is unknown, but you may suspect the range
during which the break took place, say between $\tau_0$ and $\tau_1$. The Chow test is computed
for every candidate break date between $\tau_0$ and $\tau_1$, and the largest of the
resulting F-statistics is used to test for a break at an unknown date. This is often referred
to as the Quandt Likelihood Ratio (QLR) statistic. Since the QLR is the largest of a series of F-statistics,
its distribution is non-standard and depends on the number of restrictions tested, q (the number
of coefficients, including the intercept, allowed to break), and on $\tau_0$ and $\tau_1$, expressed as
fractions of the total sample size. For the large-sample approximation to the
distribution of the QLR to be a good one, $\tau_0$ and $\tau_1$ cannot be too close to the ends
of the sample. For this reason, the QLR is computed over a trimmed range so that
$\tau_0 = 0.15T$ and $\tau_1 = 0.85T$.
The QLR test can detect a single discrete break, multiple discrete breaks and/or
slow evolution of the regression function. If there is a distinct break in the
regression function, the date at which the largest Chow statistic occurs is an
estimator of the break date.
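To make the QLR mechanics concrete, here is a deliberately simplified sketch. It assumes the regression contains only an intercept, so the Chow test at each candidate date compares one mean against two means (q = 1); a full application would also interact the dummy with the regressors as in 7.8.1. The function names and the constructed series are hypothetical, and no critical values are included because the QLR distribution is non-standard.

```python
def ssr_about_mean(values):
    """Sum of squared residuals from regressing the values on a constant."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def chow_f_mean_break(y, k):
    """F-statistic for a break in the intercept, with D = 1 from index k onwards."""
    T = len(y)
    ssr_restricted = ssr_about_mean(y)                              # no break
    ssr_unrestricted = ssr_about_mean(y[:k]) + ssr_about_mean(y[k:])  # break at k
    q = 1                                                           # one restriction
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (T - 2))


def qlr(y, trim=0.15):
    """Largest Chow F over the trimmed range and the index at which it occurs."""
    T = len(y)
    start, end = int(trim * T), int((1 - trim) * T)
    f_stats = {k: chow_f_mean_break(y, k) for k in range(start, end + 1)}
    k_hat = max(f_stats, key=f_stats.get)
    return f_stats[k_hat], k_hat


# Constructed series: fluctuates around 0 for 50 periods, then around 3.
y = [0.5 * (-1) ** t + (3.0 if t >= 50 else 0.0) for t in range(100)]
qlr_stat, break_index = qlr(y)
print(qlr_stat, break_index)
```

The index returned alongside the statistic plays the role of the break date estimator described above.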
Say we want to check the stability of our estimates of the determinants of inflation in the
US over the 1962:1 to 1999:4 period. More specifically, we are concerned
that the intercept and the unemployment coefficients may have changed over time. The first
date at which we can check for a structural break, 0.15T, is 1967:4. So we create a
dummy variable for observations after 1967:4 and interact it with the
unemployment variables:
Source | SS df MS Number of obs = 152
-------------+------------------------------ F( 13, 138) = 7.41
Model | 184.330595 13 14.1792765 Prob > F = 0.0000
Residual | 283.045198 138 1.91246756 R-squared = 0.3944
-------------+------------------------------ Adj R-squared = 0.3412
Total | 467.375793 151 2.90295524 Root MSE = 1.3829
------------------------------------------------------------------------------
dinf | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
dinf_1 | -.4009554 .0824812 -4.86 0.000 -.5639484 -.2379623
dinf_2 | -.3433158 .0892349 -3.85 0.000 -.5196549 -.1669767
dinf_3 | .0545284 .0850863 0.64 0.523 -.1136126 .2226693
dinf_4 | -.038809 .0754606 -0.51 0.608 -.1879284 .1103105
unemp_1 | -1.719641 1.254766 -1.37 0.173 -4.199214 .7599307
unemp_2 | 3.46834 2.364168 1.47 0.144 -1.203546 8.140225
unemp_3 | -3.370699 2.164944 -1.56 0.122 -7.648893 .9074963
unemp_4 | 1.666702 1.155521 1.44 0.151 -.6167486 3.950152
D | 1.775541 1.839904 0.97 0.336 -1.860335 5.411417
D_unemp_1 | -1.225527 1.351754 -0.91 0.366 -3.896758 1.445703
D_unemp_2 | .2032217 2.560099 0.08 0.937 -4.855847 5.26229
D_unemp_3 | 2.394236 2.370403 1.01 0.314 -2.28997 7.078442
D_unemp_4 | -1.668078 1.255425 -1.33 0.186 -4.148952 .8127955
_cons | -.2276938 1.757672 -0.13 0.897 -3.701068 3.245681
------------------------------------------------------------------------------
. testparm D-D_unemp_4
F( 5, 148) = 0.85
Prob > F = 0.5135
Here F = 0.85. We now re-estimate this model with D=1 if t>=1968:1, and so on
for each successive date until 1993:I.
For example, a break at 1981:4 leads to:
Regression with robust standard errors Number of obs = 152
F( 13, 138) = 8.42
Prob > F = 0.0000
R-squared = 0.4223
Root MSE = 1.367
------------------------------------------------------------------------------
| Robust
dinf | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
dinf_1 | -.4075559 .0932063 -4.37 0.000 -.591853 -.2232587
dinf_2 | -.3777853 .0977229 -3.87 0.000 -.5710131 -.1845574
dinf_3 | .0515292 .0798247 0.65 0.520 -.1063085 .2093669
dinf_4 | -.0260024 .0826179 -0.31 0.753 -.1893631 .1373584
unemp_1 | -2.705181 .6911244 -3.91 0.000 -4.071744 -1.338618
unemp_2 | 3.54704 1.300035 2.73 0.007 .9764752 6.117605
unemp_3 | -2.025859 1.188034 -1.71 0.090 -4.374964 .3232453
unemp_4 | .9846463 .5641419 1.75 0.083 -.1308334 2.100126
D | -.0729984 .9544203 -0.08 0.939 -1.960177 1.81418
D_unemp_1 | -.5718067 .8773241 -0.65 0.516 -2.306543 1.162929
D_unemp_2 | .1754026 1.576346 0.11 0.912 -2.941512 3.292317
D_unemp_3 | 2.79729 1.599601 1.75 0.083 -.3656069 5.960186
D_unemp_4 | -2.432152 .8388761 -2.90 0.004 -4.090865 -.7734395
_cons | 1.350888 .733964 1.84 0.068 -.100382 2.802157
------------------------------------------------------------------------------
. testparm D-D_unemp_4
F( 5, 138) = 3.31
Prob > F = 0.0074
7.8.3 Pseudo out of sample forecast
1) Choose the number of observations P for which you will generate pseudo out of
sample forecasts, say P = 10% of the sample. Define s = T - P.
2) Estimate the regression on the shortened sample: t = 1,..,s.
3) Compute the forecast for the first period beyond the shortened sample: $\tilde{Y}_{s+1|s}$
4) Compute the forecast error: $\tilde{u}_{s+1} = Y_{s+1} - \tilde{Y}_{s+1|s}$
5) Repeat steps 2-4 for each date from T-P+1 to T-1 (re-estimating the regression
each time).
6) The pseudo forecast errors can be examined to see if they are consistent with a
stationary relationship.
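The steps above can be sketched as a loop. The example below assumes a simple AR(1) forecasting model with an intercept, rather than the inflation equation used in these notes, and all function names are hypothetical. The t-statistic on the mean forecast error is the naive one and ignores any serial correlation in the errors.

```python
import math
import random


def fit_ar1(y):
    """OLS estimates (b0, b1) of y_t = b0 + b1*y_{t-1} + u_t."""
    x, z = y[:-1], y[1:]
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    b1 = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z)) / sum((xi - mx) ** 2 for xi in x)
    return mz - b1 * mx, b1


def pseudo_out_of_sample_errors(y, P):
    """Forecast errors for the last P observations, re-estimating at every date."""
    T = len(y)
    errors = []
    for s in range(T - P, T):            # s = size of the estimation sample
        b0, b1 = fit_ar1(y[:s])          # step 2: estimate on t = 1..s
        forecast = b0 + b1 * y[s - 1]    # step 3: forecast of observation s+1
        errors.append(y[s] - forecast)   # step 4: forecast error
    return errors


def mean_error_t(errors):
    """Mean forecast error and its naive t-statistic."""
    n = len(errors)
    m = sum(errors) / n
    var = sum((e - m) ** 2 for e in errors) / (n - 1)
    return m, m / math.sqrt(var / n)


# Illustrative simulated data (assumed): a stable AR(1) with no break.
rng = random.Random(0)
y = [0.0]
for _ in range(151):
    y.append(1.0 + 0.5 * y[-1] + rng.gauss(0.0, 1.0))
errors = pseudo_out_of_sample_errors(y, P=24)
print(len(errors), mean_error_t(errors))
```

A mean error far from zero relative to its standard error would point to biased forecasts and hence to an unstable relationship.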
For example, going back to our prediction of inflation: using data up to
1993:4, we can predict inflation for 1994:1. Doing so until 1999:4, we have 24
pseudo forecasts.
Regression with robust standard errors Number of obs = 128
F( 13, 114) = 7.37
Prob > F = 0.0000
R-squared = 0.4210
Root MSE = 1.4729
------------------------------------------------------------------------------
| Robust
dinf | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
dinf_1 | -.4190169 .0998416 -4.20 0.000 -.6168024 -.2212315
dinf_2 | -.3961329 .1031673 -3.84 0.000 -.6005065 -.1917593
dinf_3 | .039491 .0844715 0.47 0.641 -.1278463 .2068283
dinf_4 | -.0449508 .0860523 -0.52 0.602 -.2154198 .1255181
unemp_1 | -2.679112 .6980463 -3.84 0.000 -4.061936 -1.296288
unemp_2 | 3.465039 1.325757 2.61 0.010 .8387247 6.091353
unemp_3 | -1.987951 1.22184 -1.63 0.106 -4.408407 .4325056
unemp_4 | .9924426 .5769953 1.72 0.088 -.1505805 2.135466
D | .4808356 1.389741 0.35 0.730 -2.27223 3.233901
D_unemp_1 | -.9707623 .9465191 -1.03 0.307 -2.845809 .9042847
D_unemp_2 | .6794326 1.700203 0.40 0.690 -2.688656 4.047521
D_unemp_3 | 2.716406 1.821819 1.49 0.139 -.8926028 6.325415
D_unemp_4 | -2.525234 .9671997 -2.61 0.010 -4.441249 -.6092183
_cons | 1.414308 .7407146 1.91 0.059 -.0530417 2.881658
The inflation rate is predicted to rise by 1.9 percentage points. But the true
value is 0.9, so our forecast error is -1 percentage point.
Doing this 24 times, we find that the average forecast error is -0.37, which is significantly
different from 0 (t=-2.71). This suggests that the forecasts were biased over the period,
systematically forecasting higher inflation. This would suggest that the model has been
unstable (break).
7.9 Cointegration
7.9.1 Cointegration and error correction
Series can move together so closely over the long run that they appear to have the same trend
component. For example, the 3-month and 12-month US interest rates.
[Figure: FYFF and FYGM3 plotted against daten, 1959:1 to 2000:4]
Moreover, the spread between the two series does not appear to have a trend.
[Figure: spread between the two series plotted against daten, 1960 to 2000]
The two series have a common stochastic trend; they are said to be cointegrated.
Suppose Xt and Yt are integrated of order 1. If there exists a coefficient $\theta$ such
that $Y_t - \theta X_t$ is integrated of order 0 (stationary), then the two series are said to be
cointegrated with cointegrating coefficient $\theta$.
* Testing for cointegration when $\theta$ is known.
In some cases, economic theory suggests a value of $\theta$. In this case a DF test on
the series $z_t = Y_t - \theta X_t$ is conducted.
In our example, let's assume that theory suggests that $\theta = 1$. There is no trend in dspread, so we
simply estimate:
. reg dspread spread_1 dspread_1 dspread_2 dspread_3 dspread_4
------------------------------------------------------------------------------
dspread | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
spread_1 | -.2506278 .0719562 -3.48 0.001 -.3927548 -.1085007
dspread_1 | -.283247 .091436 -3.10 0.002 -.4638504 -.1026437
dspread_2 | .0230289 .0910197 0.25 0.801 -.1567521 .20281
dspread_3 | -.0599991 .0895151 -0.67 0.504 -.2368085 .1168102
dspread_4 | .048277 .0791148 0.61 0.543 -.1079897 .2045436
_cons | .1548892 .063015 2.46 0.015 .0304227 .2793557
------------------------------------------------------------------------------
Lags AIC
4 -1.049
3 -1.059
2 -1.063
1 -1.072
The t-stat on spread_1 is -3.48, which is below the 1% critical value of the ADF test (-3.43, intercept only), so we reject
the null hypothesis that $\delta = 0$: the spread does not have a unit root and is therefore I(0). The two
interest rate series are cointegrated.
* Testing for cointegration when $\theta$ is unknown.
In general $\theta$ is unknown, so the cointegrating coefficient must be estimated prior to testing for
a unit root. This preliminary step makes it necessary to use different critical values for the
subsequent unit root test.
Step 1: estimate $Y_t = \alpha + \theta X_t + \nu_t$ (7.12)
Step 2: a Dickey Fuller t-test is used to test for a unit root in the residuals from (7.12): $\hat{\nu}_t$
This procedure is called the Engle-Granger Augmented Dickey Fuller (EG-ADF) test. Critical values for
the EG-ADF are:
Number of X in (7.12)          10%      5%      1%
(cointegrated variables)
1                             -3.12   -3.41   -3.96
2                             -3.52   -3.80   -4.36
3                             -3.84   -4.16   -4.73
4                             -4.20   -4.49   -5.07
. reg dnu nu_1 dnu_1 dnu_2 dnu_3 dnu_4
------------------------------------------------------------------------------
dnu | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
nu_1 | -.5739985 .1150186 -4.99 0.000 -.8011821 -.3468149
dnu_1 | -.1574595 .1139771 -1.38 0.169 -.3825858 .0676667
dnu_2 | .0752181 .1052652 0.71 0.476 -.1327006 .2831369
dnu_3 | .0053021 .0974368 0.05 0.957 -.1871541 .1977583
dnu_4 | .1237554 .0782992 1.58 0.116 -.0309003 .278411
_cons | .0016953 .0421806 0.04 0.968 -.0816193 .0850099
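The t-statistic on nu_1 is -4.99, below the 1% EG-ADF critical value for one regressor (-3.96), so we
reject the null hypothesis of a unit root in the residuals: the two series are cointegrated.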
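The two-step procedure can be sketched on simulated data. Everything in the example is an assumption made for illustration: the data generating process (X a random walk, Y = 1 + 2X plus white noise), the function names, and the use of a Dickey-Fuller regression without lagged differences and with homoskedasticity-only standard errors in step 2.

```python
import math
import random


def ols(x, y):
    """Intercept, slope and residuals from an OLS regression of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    intercept = my - slope * mx
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, residuals


def dickey_fuller_t(series):
    """t-statistic on delta in d(series)_t = b0 + delta*series_{t-1} + u_t."""
    lagged = series[:-1]
    diff = [series[t] - series[t - 1] for t in range(1, len(series))]
    b0, delta, resid = ols(lagged, diff)
    n = len(lagged)
    m = sum(lagged) / n
    sxx = sum((v - m) ** 2 for v in lagged)
    se = math.sqrt(sum(e ** 2 for e in resid) / (n - 2) / sxx)
    return delta / se


def engle_granger(x, y):
    """Step 1: estimate theta by OLS. Step 2: DF t-statistic on the residuals."""
    _, theta_hat, residuals = ols(x, y)
    return theta_hat, dickey_fuller_t(residuals)


# Simulated cointegrated pair (assumed): X is a random walk, Y = 1 + 2*X + noise.
rng = random.Random(0)
x = [0.0]
for _ in range(499):
    x.append(x[-1] + rng.gauss(0.0, 1.0))
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]

theta_hat, eg_t = engle_granger(x, y)
print(theta_hat, eg_t, eg_t < -3.41)  # -3.41: 5% EG-ADF critical value, one regressor
```

Because the residuals come from an estimated regression, the statistic is compared with the EG-ADF critical values in the table above rather than with those of Table 7.1.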
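* Error correction model
If Xt and Yt are cointegrated, one way to eliminate the stochastic trend is to compute the series
$Y_t - \theta X_t$, which is stationary and can be used for analysis. The term $Y_t - \theta X_t$ is called the error
correction term.
If two series are cointegrated, then the forecasts of $\Delta Y_t$ and $\Delta X_t$ can be improved by including the
lagged error correction term as an additional regressor:
$$\Delta Y_t = \beta_0 + \beta_1 \Delta Y_{t-1} + \dots + \beta_p \Delta Y_{t-p} + \gamma_1 \Delta X_{t-1} + \dots + \gamma_q \Delta X_{t-q} + \alpha (Y_{t-1} - \theta X_{t-1}) + u_t$$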
------------------------------------------------------------------------------
dfyff | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
dfyff_1 | -.0014132 .2136881 -0.01 0.995 -.4235733 .4207468
dfyff_2 | -.0264828 .2208415 -0.12 0.905 -.4627751 .4098095
dfyff_3 | .1002626 .2129522 0.47 0.638 -.3204438 .5209689
dfyff_4 | .1444413 .1802188 0.80 0.424 -.2115972 .5004798
dfy3m_1 | .0068489 .2541142 0.03 0.979 -.4951767 .5088745
dfy3m_2 | -.1758844 .275382 -0.64 0.524 -.7199263 .3681576
dfy3m_3 | .2220654 .2653096 0.84 0.404 -.3020777 .7462086
dfy3m_4 | -.3159166 .2272404 -1.39 0.166 -.7648506 .1330174
spread_1 | -.4598352 .1585354 -2.90 0.004 -.7730361 -.1466342
_cons | .2955998 .1381308 2.14 0.034 .0227098 .5684897
------------------------------------------------------------------------------
The lagged spread does help to predict the change in the interest rate: the coefficient on spread_1 is
statistically significant (t = -2.90) in the equation for the one-year Treasury bond rate.