
Univariate time-series models are a class of specifications in which one attempts to model and to predict financial variables using only information contained in their own past values and possibly current and past values of an error term.
1. Stationary process: a stationary process is one whose properties and behaviour do not depend on the time at which the series is observed.
1.1 A strictly stationary process:
A series is strictly stationary if the distribution of its values remains the same
as time progresses, implying that the probability that y falls within a particular
interval is the same now as at any time in the past or the future.
1.2 A weakly stationary process:
If a series satisfies conditions (1)–(3) for t = 1, 2, ..., ∞, it is said to be weakly or covariance stationary.

(1) E(y_t) = μ, t = 1, 2, ..., ∞
A stationary process should have a constant mean.
(2) E[(y_t − μ)(y_t − μ)] = σ² < ∞
A stationary process should have a constant variance.
(3) E[(y_{t1} − μ)(y_{t2} − μ)] = γ_{t2−t1} < ∞
A stationary process should have constant autocovariances. The autocovariances determine how y is related to its previous values, and for a stationary series they depend only on the difference between t1 and t2.

The moment
(3′) E[(y_t − E(y_t))(y_{t−s} − E(y_{t−s}))] = γ_s < ∞
is known as the autocovariance function. When s = 0, the autocovariance at lag zero is obtained, which is the autocovariance of y_t with y_t, i.e., the variance of y. These quantities, γ_s, are known as autocovariances since they are the covariances of y with its own previous values. The autocovariances are not a particularly useful measure of the relationship between y and its previous values, however, since their values depend on the units of measurement of y_t, and hence the values they take have no immediate interpretation.
It is thus more convenient to use the autocorrelations, which are the autocovariances normalised by dividing by the variance:
τ_s = γ_s / γ_0, s = 0, 1, 2, ...
The series τ_s now has the standard property of correlation coefficients that the values are bounded to lie between ±1. In the case that s = 0, the autocorrelation at lag zero is obtained, i.e., the correlation of y_t with y_t, which is of course 1. If τ_s is plotted against s = 0, 1, 2, ..., a graph known as the autocorrelation function (acf) or correlogram is obtained.
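The sample autocorrelations can be computed directly from these definitions. A minimal numpy sketch (the series y and the helper sample_acf are illustrative, not part of the text):

import numpy as np

def sample_acf(y, max_lag):
    # tau_s = gamma_s / gamma_0, with gamma_s the sample autocovariance at lag s
    y = np.asarray(y, dtype=float)
    T = len(y)
    y_dem = y - y.mean()
    gamma0 = np.sum(y_dem * y_dem) / T
    taus = [1.0]
    for s in range(1, max_lag + 1):
        gamma_s = np.sum(y_dem[s:] * y_dem[:-s]) / T
        taus.append(gamma_s / gamma0)
    return np.array(taus)

rng = np.random.default_rng(0)
y = rng.standard_normal(200)          # an illustrative series
print(sample_acf(y, 5))               # plotting these values against s gives the correlogram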

Remark:
• Time series with trends or seasonality are non-stationary.
• Stationary time series have no predictable patterns in the long run.
• A time series with cyclic behaviour, but with no trend or seasonality, is also stationary.
• The ACF values of a stationary process decrease quickly towards zero, whereas those of a non-stationary process decrease slowly.



2. White noise process: a white noise process (for the disturbance errors) is a stationary process with no discernible (perceptible) structure:
(1) E(u_t) = μ_u, t = 1, 2, ..., ∞
A white noise process has a constant mean.
(2) var(u_t) = σ_u² < ∞
A white noise process has a constant variance.
(3) γ_s = σ_u² if s = 0; γ_s = 0 if s ≠ 0
A white noise process has zero autocovariances, except at lag zero. Another way to state this last condition is to say that each observation is uncorrelated with all other values in the sequence.

Hence the autocorrelation function for a white noise process will be zero apart from a single peak of 1 at s = 0:
τ_s = γ_s / γ_0 = 1 if s = 0; 0 if s ≠ 0
If μ_u = 0 and the three conditions hold, the process is known as zero-mean white noise.
Furthermore, if it is assumed that u_t is distributed normally, then the sample autocorrelation coefficients are also approximately normally distributed:
τ̂_s ~ N(0, 1/T)
where T is the sample size, and τ̂_s denotes the autocorrelation coefficient at lag s estimated from a sample.


3. Hypothesis testing for τ_s:
3.1 Constructing a non-rejection region:
H_0: τ_s = 0
H_1: τ_s ≠ 0
95% non-rejection region: (−1.96 × 1/√T, +1.96 × 1/√T)

If the sample autocorrelation coefficient τ̂_s falls outside this region for a given value of s, then the null hypothesis that the true value of the coefficient at that lag s is zero is rejected.
Alternatively, we could calculate TS = τ̂_s × √T:
|TS| > Z_{α/2} → Reject H_0
|TS| ≤ Z_{α/2} → Fail to reject H_0
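A small numpy sketch of this decision rule, applied to illustrative coefficient values (the function name and the numbers are only for illustration; the same five coefficients reappear in the worked example of section 3.4):

import numpy as np

def test_autocorrelations(tau_hat, T, z_crit=1.96):
    # reject H0: tau_s = 0 when |TS| = |tau_hat_s * sqrt(T)| exceeds the critical value
    band = z_crit / np.sqrt(T)
    for s, tau in enumerate(tau_hat, start=1):
        ts = tau * np.sqrt(T)
        decision = "reject H0" if abs(ts) > z_crit else "fail to reject H0"
        print(f"lag {s}: tau_hat = {tau:+.3f}, region = +/-{band:.3f}, TS = {ts:+.2f} -> {decision}")

test_autocorrelations([0.207, -0.013, 0.086, 0.005, -0.022], T=100)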

3.2 Box–Pierce test:

H_0: τ_1 = 0, τ_2 = 0, ..., τ_m = 0
H_1: otherwise
TS = Q = T × Σ_{k=1}^{m} τ̂_k² ~ χ²_m
where T = sample size, m = maximum lag length
TS > χ²_{α,m} → Reject H_0
TS ≤ χ²_{α,m} → Fail to reject H_0
As for any joint hypothesis test, only one autocorrelation coefficient needs to be
statistically significant for the test to result in a rejection. However, the Box–
Pierce test has poor small sample properties, implying that it leads to the wrong
decision too frequently for small samples. A variant of the Box–Pierce test,
having better small sample properties, has been developed. The modified
statistic is known as the Ljung–Box (1978) statistic.
3.3 Ljung–Box test
H_0: τ_1 = 0, τ_2 = 0, ..., τ_m = 0
H_1: otherwise
TS = Q* = T × (T + 2) × Σ_{k=1}^{m} τ̂_k²/(T − k) ~ χ²_m

It should be clear from the form of the statistic that asymptotically (that is, as
the sample size increases towards infinity), the (T + 2) and (T − k) terms in
the Ljung–Box formulation will cancel out, so that the statistic is equivalent to
the Box–Pierce test. This statistic is very useful as a portmanteau (general)
test of linear dependence in time series.
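If statsmodels is available, both portmanteau statistics can be obtained in a single call; a minimal sketch (recent statsmodels versions return a table with lb_stat, lb_pvalue, bp_stat and bp_pvalue, though the exact return format varies by version; the simulated series is illustrative):

import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(1)
y = rng.standard_normal(250)                      # an illustrative (white noise) series

# Ljung-Box Q* and Box-Pierce Q for the first 5 lags jointly
print(acorr_ljungbox(y, lags=[5], boxpierce=True))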
3.4 Example:
Suppose that a researcher had estimated the first five autocorrelation coefficients
using a series of length 100 observations, and found them to be
Lag:                          1       2       3       4       5
Autocorrelation coefficient:  0.207   −0.013  0.086   0.005   −0.022
Test each of the individual correlation coefficients for significance, and test all five
jointly using the Box–Pierce and Ljung–Box tests.
1) Constructing a non-rejection region:
95% non-rejection region: (−1.96 × 1/√T, +1.96 × 1/√T)
where T = 100 in this case. The decision rule is thus to reject the null hypothesis that a
given coefficient is zero in the cases where the coefficient lies outside the range
(−0.196,0.196). For this example, it would be concluded that only the first
autocorrelation coefficient is significantly different from zero at the 5% level.
2) Box–Pierce and Ljung–Box tests:
Turning to the joint tests, the null hypothesis is that all of the first five autocorrelation coefficients are jointly zero, i.e.
H_0: τ_1 = 0, τ_2 = 0, τ_3 = 0, τ_4 = 0, τ_5 = 0
The test statistics for the Box–Pierce and Ljung–Box tests are given, respectively, as

Q = T × Σ_{k=1}^{5} τ̂_k²
  = 100 × (0.207² + (−0.013)² + 0.086² + 0.005² + (−0.022)²)
  = 5.09

Q* = T × (T + 2) × Σ_{k=1}^{5} τ̂_k²/(T − k)
   = 100 × 102 × (0.207²/99 + (−0.013)²/98 + 0.086²/97 + 0.005²/96 + (−0.022)²/95)
   = 5.26
The relevant critical values are from a χ² distribution with five degrees of freedom, which are 11.1 at the 5% level and 15.1 at the 1% level.
which are 11.1 at the 5% level, and 15.1 at the 1% level. Clearly, in both cases, the joint
null hypothesis that all of the first five autocorrelation coefficients are zero cannot be
rejected. Note that, in this instance, the individual test caused a rejection while the joint
test did not. This is an unexpected result that may have arisen as a result of the low power
of the joint test when four of the five individual autocorrelation coefficients are
insignificant. Thus the effect of the significant autocorrelation coefficient is diluted in the
joint test by the insignificant coefficients. The sample size used in this example is also
modest relative to those commonly available in finance.
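The two statistics and the 5% critical value can be reproduced numerically; a short sketch assuming numpy and scipy are available:

import numpy as np
from scipy.stats import chi2

tau_hat = np.array([0.207, -0.013, 0.086, 0.005, -0.022])
T, m = 100, len(tau_hat)
k = np.arange(1, m + 1)

Q = T * np.sum(tau_hat**2)                               # Box-Pierce
Q_star = T * (T + 2) * np.sum(tau_hat**2 / (T - k))      # Ljung-Box
crit_5pct = chi2.ppf(0.95, df=m)

print(round(Q, 2), round(Q_star, 2), round(crit_5pct, 2))   # about 5.09, 5.26, 11.07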
Lag operator (L):
y_{t−1} = L y_t
y_{t−2} = L y_{t−1} = L² y_t
y_{t−3} = L y_{t−2} = L³ y_t
...
y_{t−i} = L^i y_t

Differencing (Δ):
Δy_t = y_t − y_{t−1}
Δ²y_t = Δy_t − Δy_{t−1}
Δ³y_t = Δ²y_t − Δ²y_{t−1}
...
Δ^i y_t = Δ^{i−1} y_t − Δ^{i−1} y_{t−1}

Δy_t = y_t − y_{t−1} = y_t − L y_t = (1 − L) y_t
Δ²y_t = Δy_t − Δy_{t−1} = (1 − L) y_t − (1 − L) y_{t−1} = (1 − L)² y_t
Δ³y_t = Δ²y_t − Δ²y_{t−1} = (1 − L)² y_t − (1 − L)² y_{t−1} = (1 − L)³ y_t
Δ^i y_t = (1 − L)^i y_t
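Differencing of this kind is directly available in numpy; a tiny sketch with an illustrative series showing that repeated first differences agree with np.diff:

import numpy as np

y = np.array([10.0, 12.0, 11.0, 15.0, 14.0, 18.0])    # illustrative series

d1 = y[1:] - y[:-1]          # (1 - L) y_t
d2 = d1[1:] - d1[:-1]        # (1 - L)^2 y_t

print(np.allclose(d1, np.diff(y)))         # True
print(np.allclose(d2, np.diff(y, n=2)))    # True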

4. Moving average process MA(q):

Let u_t, t = 1, 2, ..., ∞ be a white noise process such that
E(u_t) = μ_u, t = 1, 2, ..., ∞
var(u_t) = σ_u² < ∞
Then the MA(q) model is defined as:
y_t = μ + u_t + θ_1 u_{t−1} + θ_2 u_{t−2} + ⋯ + θ_q u_{t−q}
    = μ + u_t + Σ_{i=1}^{q} θ_i u_{t−i}
    = μ + u_t + Σ_{i=1}^{q} θ_i L^i u_t
    = μ + (1 + Σ_{i=1}^{q} θ_i L^i) u_t
    = μ + θ(L) u_t

4.1 Properties:

(1) E(y_t) = μ
(2) var(y_t) = γ_0 = (1 + θ_1² + θ_2² + ⋯ + θ_q²) × σ² < ∞
(3) covariances: γ_s = (θ_s + θ_{s+1}θ_1 + θ_{s+2}θ_2 + ⋯ + θ_q θ_{q−s}) σ² for s = 1, 2, ..., q, and γ_s = 0 for s > q
So a moving average process has a constant mean, a constant variance, and autocovariances which may be non-zero up to lag q and will always be zero thereafter.

Consider the following simple MA(2) model:

y_t = u_t + θ_1 u_{t−1} + θ_2 u_{t−2}

where u_t is a zero-mean white noise process with variance σ².
(1) Calculate the mean and variance of y_t.
(2) Derive the autocorrelation function for this process (i.e., express the autocorrelations τ_1, τ_2, τ_3 as functions of the parameters θ_1 and θ_2).
(3) If θ_1 = −0.5 and θ_2 = 0.25, sketch the acf of y_t.

(1) E(u_t) = 0, hence E(u_{t−i}) = 0 for all i.
So the expected value of the error term is zero for all time periods. Taking expectations of both sides of the given equation gives
E(y_t) = E(u_t + θ_1 u_{t−1} + θ_2 u_{t−2}) = E(u_t) + θ_1 E(u_{t−1}) + θ_2 E(u_{t−2}) = 0

var(y_t) = E[(y_t − E(y_t))(y_t − E(y_t))] = E[(y_t)(y_t)]
= E[(u_t + θ_1 u_{t−1} + θ_2 u_{t−2})(u_t + θ_1 u_{t−1} + θ_2 u_{t−2})]
= E[u_t² + θ_1² u_{t−1}² + θ_2² u_{t−2}² + cross-products]

But E(cross-products) = 0 since cov(u_t, u_{t−s}) = 0 for s ≠ 0. 'Cross-products' is thus a catch-all expression for all of the terms in u which have different time subscripts, such as u_{t−1}u_{t−2} or u_{t−5}u_{t−20}, etc. One does not need to worry about these cross-product terms, since they are effectively the autocovariances of u_t, which are all zero by definition, given that u_t is a white noise error process with zero autocovariances except at lag zero.
γ_0 can also be interpreted as the autocovariance at lag zero.
So var(y_t) = γ_0
= E[u_t² + θ_1² u_{t−1}² + θ_2² u_{t−2}²]
= σ² + θ_1²σ² + θ_2²σ²
= σ²(1 + θ_1² + θ_2²)

(2) To calculate the acf of y_t, first determine the autocovariances and then obtain the autocorrelations by dividing the autocovariances by the variance.
The autocovariance at lag 1 is given by
γ_1 = E[(y_t − E(y_t))(y_{t−1} − E(y_{t−1}))] = E[(y_t)(y_{t−1})]
= E[(u_t + θ_1 u_{t−1} + θ_2 u_{t−2})(u_{t−1} + θ_1 u_{t−2} + θ_2 u_{t−3})]
= E[θ_1 u_{t−1}² + θ_1 θ_2 u_{t−2}² + cross-products]
= E[θ_1 u_{t−1}² + θ_1 θ_2 u_{t−2}²]
= θ_1 σ² + θ_1 θ_2 σ²

The autocovariance at lag 2 is given by
γ_2 = E[(y_t − E(y_t))(y_{t−2} − E(y_{t−2}))] = E[(y_t)(y_{t−2})]
= E[(u_t + θ_1 u_{t−1} + θ_2 u_{t−2})(u_{t−2} + θ_1 u_{t−3} + θ_2 u_{t−4})]
= E[θ_2 u_{t−2}² + cross-products]
= E[θ_2 u_{t−2}²]
= θ_2 σ²

The autocovariance at lag 3 is given by
γ_3 = E[(y_t − E(y_t))(y_{t−3} − E(y_{t−3}))] = E[(y_t)(y_{t−3})]
= E[(u_t + θ_1 u_{t−1} + θ_2 u_{t−2})(u_{t−3} + θ_1 u_{t−4} + θ_2 u_{t−5})]
= 0

So γ_s = 0 for s > 2: all autocovariances for the MA(2) process will be zero for any lag length s greater than 2.
The autocorrelation at lag 0 is given by
τ_0 = γ_0 / γ_0 = 1

The autocorrelation at lag 1 is given by
τ_1 = γ_1 / γ_0 = (θ_1 σ² + θ_1 θ_2 σ²) / (σ²(1 + θ_1² + θ_2²)) = (θ_1 + θ_1 θ_2) / (1 + θ_1² + θ_2²)

The autocorrelation at lag 2 is given by
τ_2 = γ_2 / γ_0 = θ_2 σ² / (σ²(1 + θ_1² + θ_2²)) = θ_2 / (1 + θ_1² + θ_2²)

The autocorrelation at lag s is given by
τ_s = 0, ∀ s > 2
(3) For θ_1 = −0.5 and θ_2 = 0.25, substituting these into the formulae above gives the first two autocorrelation coefficients as τ_1 = −0.476 and τ_2 = 0.190. Autocorrelation coefficients for lags greater than 2 will all be zero for an MA(2) model. Thus the acf plot will show a value of 1 at lag 0, −0.476 at lag 1, 0.190 at lag 2, and zero at all higher lags.
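These two values can be checked numerically, either from the formulae just derived or, if statsmodels is available, from the theoretical acf of the same MA(2) process; a small sketch (ArmaProcess expects the lag-polynomial coefficients, including the leading 1):

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

theta1, theta2 = -0.5, 0.25

# from the formulae above
denom = 1 + theta1**2 + theta2**2
print(round((theta1 + theta1 * theta2) / denom, 3))   # -0.476
print(round(theta2 / denom, 3))                       #  0.190

# cross-check: theoretical acf at lags 0..3 for y_t = u_t + theta1 u_{t-1} + theta2 u_{t-2}
ma2 = ArmaProcess(ar=[1], ma=[1, theta1, theta2])
print(np.round(ma2.acf(lags=4), 3))                   # [1, -0.476, 0.19, 0]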


4.2 Process invertibility: MA(1) ~ AR(∞):

The MA(1) model is defined as:
y_t = μ + u_t + θ u_{t−1} = μ + u_t + θL u_t
→ y_t − μ = u_t + θL u_t = (1 + θL) u_t
→ y_t − μ = (1 − (−θL)) u_t
→ u_t = (y_t − μ)/(1 − (−θL)) = (y_t − μ)(1 + (−θL) + (−θL)² + (−θL)³ + ⋯)
      = μ′ + y_t − θ y_{t−1} + θ² y_{t−2} − θ³ y_{t−3} + ⋯   (where μ′ collects the constant terms)
→ y_t = μ′′ + θ y_{t−1} − θ² y_{t−2} + θ³ y_{t−3} − ⋯ + u_t
→ y_t = μ′′ + φ_1 y_{t−1} + φ_2 y_{t−2} + φ_3 y_{t−3} + ⋯ + u_t
which is in fact an AR(∞) model.
Condition for invertibility: |θ| < 1 (so that the infinite sum converges).

5. Autoregressive process AR(p)

Let u_t, t = 1, 2, ..., ∞ be a white noise process such that
E(u_t) = μ_u, t = 1, 2, ..., ∞
var(u_t) = σ_u² < ∞
Then the AR(p) model is defined as:
y_t = μ + u_t + φ_1 y_{t−1} + φ_2 y_{t−2} + ⋯ + φ_p y_{t−p}
    = μ + u_t + Σ_{i=1}^{p} φ_i y_{t−i}
    = μ + u_t + Σ_{i=1}^{p} φ_i L^i y_t
→ y_t − Σ_{i=1}^{p} φ_i L^i y_t = μ + u_t
→ (1 − Σ_{i=1}^{p} φ_i L^i) y_t = μ + u_t
→ φ(L) y_t = μ + u_t

5.1 Stationarity condition:

1) Is the following model stationary?
y_t = y_{t−1} + u_t

y_t = y_{t−1} + u_t → y_t = L y_t + u_t → (1 − L) y_t = u_t
Then the characteristic equation is:
1 − z = 0 → z = 1
which lies on, not outside, the unit circle. In fact, the particular AR(p) model given is a non-stationary process known as a random walk.

2) Is the following model stationary?
y_t = 3y_{t−1} − 2.75y_{t−2} + 0.75y_{t−3} + u_t

y_t = 3y_{t−1} − 2.75y_{t−2} + 0.75y_{t−3} + u_t
→ y_t = 3L y_t − 2.75L² y_t + 0.75L³ y_t + u_t
→ (1 − 3L + 2.75L² − 0.75L³) y_t = u_t
Then the characteristic equation is:
1 − 3z + 2.75z² − 0.75z³ = 0
→ (1 − z)(1 − 1.5z)(1 − 0.5z) = 0
→ z = 1, z = 2/3, z = 2
Only one of these (z = 2) lies outside the unit circle, and hence the process is not stationary.
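The roots of the characteristic equation can be checked with numpy; a short sketch (the helper ar_char_roots is illustrative):

import numpy as np

def ar_char_roots(phis):
    # roots of 1 - phi_1 z - phi_2 z^2 - ... - phi_p z^p = 0
    coeffs_ascending = [1.0] + [-p for p in phis]
    return np.roots(coeffs_ascending[::-1])    # np.roots wants highest degree first

# example 2): y_t = 3 y_{t-1} - 2.75 y_{t-2} + 0.75 y_{t-3} + u_t
roots = ar_char_roots([3.0, -2.75, 0.75])
print(np.round(roots, 4))                      # 2, 1 and 2/3
print(np.all(np.abs(roots) > 1))               # False -> not stationary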

5.2 Wold's Decomposition Theorem

For the AR(p) model given in the following equation (with μ set to zero for simplicity) and expressed using the lag polynomial notation
y_t = u_t + φ_1 y_{t−1} + φ_2 y_{t−2} + ⋯ + φ_p y_{t−p} → φ(L) y_t = u_t
where φ(L) = 1 − Σ_{i=1}^{p} φ_i L^i = 1 − φ_1 L − φ_2 L² − ⋯ − φ_p L^p,
the Wold decomposition is
y_t = φ(L)^{−1} u_t = ψ(L) u_t
where ψ(L) = (1 − φ_1 L − φ_2 L² − ⋯ − φ_p L^p)^{−1}.
More generally, for an AR(p) with intercept μ, the unconditional mean is
E(y_t) = μ / (1 − φ_1 − φ_2 − ⋯ − φ_p)

The autocovariances and autocorrelation functions can be obtained by solving a set of simultaneous equations known as the Yule–Walker equations. The Yule–Walker equations express the correlogram (the τ_s) as a function of the autoregressive coefficients (the φ_i).

For any AR model that is stationary, the autocorrelation function will decay geometrically to zero. These characteristics of an autoregressive process will be derived from first principles below using an illustrative example.

5.3 Example:
Consider the following simple AR(1) model

y_t = μ + φ_1 y_{t−1} + u_t

(1) Calculate the (unconditional) mean of y_t.
For the remainder of the question, set the constant to zero (μ = 0) for simplicity.
(2) Calculate the (unconditional) variance of y_t.
(3) Derive the autocorrelation function for this process.

(1)
The unconditional mean is given by the expected value of the expression:
E(y_t) = E(μ + φ_1 y_{t−1} + u_t)
= μ + φ_1 E(y_{t−1})
= μ + φ_1 (μ + φ_1 E(y_{t−2}))
= μ + φ_1 μ + φ_1² E(y_{t−2})
= μ + φ_1 μ + φ_1² (μ + φ_1 E(y_{t−3}))
= ⋯
= μ(1 + φ_1 + φ_1² + ⋯ + φ_1^{T−1}) + φ_1^T E(y_{t−T})
= μ(1 + φ_1 + φ_1² + ⋯)   (letting T → ∞, with |φ_1| < 1 so that φ_1^T E(y_{t−T}) → 0)
= μ / (1 − φ_1)
Thus the expected or mean value of an autoregressive process of order one is
given by the intercept parameter divided by one minus the autoregressive
coefficient.

(2)
Calculating now the variance of y_t, with μ set to zero:
y_t = φ_1 y_{t−1} + u_t
This can be written equivalently as
(1 − φ_1 L) y_t = u_t
y_t = (1 − φ_1 L)^{−1} u_t = (1 + φ_1 L + φ_1² L² + ⋯) u_t
    = u_t + φ_1 u_{t−1} + φ_1² u_{t−2} + φ_1³ u_{t−3} + ⋯
So long as |φ_1| < 1, i.e., so long as the process for y_t is stationary, this sum will converge.
From the definition of the variance of any random variable y, it is possible to write
var(y_t) = E[(y_t − E(y_t))(y_t − E(y_t))]
= E(y_t y_t)
= E[(u_t + φ_1 u_{t−1} + φ_1² u_{t−2} + φ_1³ u_{t−3} + ⋯)(u_t + φ_1 u_{t−1} + φ_1² u_{t−2} + φ_1³ u_{t−3} + ⋯)]
= E[u_t² + φ_1² u_{t−1}² + φ_1⁴ u_{t−2}² + ⋯]   (cross-products have zero expectation)
= σ²(1 + φ_1² + φ_1⁴ + ⋯)
Provided that |φ_1| < 1, the infinite sum can be written as
var(y_t) = σ² / (1 − φ_1²)

(3)
Turning now to the calculation of the autocorrelation function, the autocovariances must first be calculated. This is achieved by following similar algebraic manipulations as for the variance above, starting with the definition of the autocovariances for a random variable. The autocovariances for lags 1, 2, 3, ..., s will be denoted by γ_1, γ_2, γ_3, ..., γ_s, as previously.

γ_1 = cov(y_t, y_{t−1}) = E[(y_t − E(y_t))(y_{t−1} − E(y_{t−1}))] = E(y_t y_{t−1})
= E[(u_t + φ_1 u_{t−1} + φ_1² u_{t−2} + φ_1³ u_{t−3} + ⋯)(u_{t−1} + φ_1 u_{t−2} + φ_1² u_{t−3} + φ_1³ u_{t−4} + ⋯)]
= E[φ_1 u_{t−1}² + φ_1³ u_{t−2}² + φ_1⁵ u_{t−3}² + ⋯]
= φ_1 σ²(1 + φ_1² + φ_1⁴ + ⋯)
so γ_1 = φ_1 σ² / (1 − φ_1²)

γ_2 = cov(y_t, y_{t−2}) = E[(y_t − E(y_t))(y_{t−2} − E(y_{t−2}))] = E(y_t y_{t−2})
= E[(u_t + φ_1 u_{t−1} + φ_1² u_{t−2} + φ_1³ u_{t−3} + ⋯)(u_{t−2} + φ_1 u_{t−3} + φ_1² u_{t−4} + φ_1³ u_{t−5} + ⋯)]
= E[φ_1² u_{t−2}² + φ_1⁴ u_{t−3}² + φ_1⁶ u_{t−4}² + ⋯]
= φ_1² σ²(1 + φ_1² + φ_1⁴ + ⋯)
so γ_2 = φ_1² σ² / (1 − φ_1²)

Similarly,
γ_3 = φ_1³ σ² / (1 − φ_1²)
...
γ_s = φ_1^s σ² / (1 − φ_1²)
The acf can now be obtained by dividing the covariances by the variance, so that
τ_0 = γ_0 / γ_0 = 1
τ_1 = γ_1 / γ_0 = [φ_1 σ²/(1 − φ_1²)] / [σ²/(1 − φ_1²)] = φ_1
τ_2 = γ_2 / γ_0 = [φ_1² σ²/(1 − φ_1²)] / [σ²/(1 − φ_1²)] = φ_1²
τ_3 = φ_1³
...
τ_s = φ_1^s
which means that corr(y_t, y_{t−s}) = φ_1^s. Note that use of the Yule–Walker equations would have given the same answer.
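A quick simulation check of the result τ_s = φ_1^s, as a rough sketch (the parameter value and sample size are illustrative):

import numpy as np

rng = np.random.default_rng(2)
phi1, T = 0.7, 5000
u = rng.standard_normal(T)

# simulate y_t = phi_1 y_{t-1} + u_t with mu = 0
y = np.zeros(T)
for t in range(1, T):
    y[t] = phi1 * y[t - 1] + u[t]

# compare sample autocorrelations at lags 1..5 with the theoretical phi_1^s
y_dem = y - y.mean()
gamma0 = np.sum(y_dem**2) / T
for s in range(1, 6):
    tau_s = np.sum(y_dem[s:] * y_dem[:-s]) / T / gamma0
    print(s, round(tau_s, 3), round(phi1**s, 3))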

6. The Partial Autocorrelation Function

The PACF is useful for telling the difference between an AR process and an ARMA process.
In the case of an AR(p), there are direct connections between y_t and y_{t−s} only for s ≤ p. So for an AR(p), the theoretical PACF will be zero after lag p.
In the case of an MA(q), if it is invertible (the roots of the characteristic equation θ(z) = 0 lie outside the unit circle), it can be written as an AR(∞), so there are direct connections between y_t and all its previous values. For an MA(q), the theoretical PACF will be geometrically declining.

At lag 1: τ_{11} = τ_1
At lag 2: τ_{22} = (τ_2 − τ_1²) / (1 − τ_1²)
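Sample acf and pacf estimates are available in statsmodels; a minimal sketch on a simulated MA(1) series, whose acf should cut off after lag 1 while its pacf declines geometrically (the MA coefficient 0.6 and the sample size are illustrative):

import numpy as np
from statsmodels.tsa.stattools import acf, pacf

rng = np.random.default_rng(3)
u = rng.standard_normal(3000)
y = u[1:] + 0.6 * u[:-1]               # an MA(1) series

print(np.round(acf(y, nlags=5), 3))    # sizeable at lag 1, roughly zero afterwards
print(np.round(pacf(y, nlags=5), 3))   # declines geometrically in absolute value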


7. ARMA(p,q):

By combining the AR(p) and MA(q) models, an ARMA(p, q) model is obtained.
Such a model states that the current value of some series y depends linearly on
its own previous values plus a combination of current and previous values of a
white noise error term. The model could be written

For φ(L) = 1 − φ_1 L − φ_2 L² − ⋯ − φ_p L^p
and θ(L) = 1 + θ_1 L + θ_2 L² + ⋯ + θ_q L^q, we have

φ(L) y_t = μ + θ(L) u_t, or
y_t = μ + φ_1 y_{t−1} + φ_2 y_{t−2} + ⋯ + φ_p y_{t−p} + θ_1 u_{t−1} + θ_2 u_{t−2} + ⋯ + θ_q u_{t−q} + u_t
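As an illustration of working with such a model in practice, a minimal sketch that simulates an ARMA(1,1) and fits it with statsmodels (an ARMA(p, q) is estimated as an ARIMA(p, 0, q); the coefficient values 0.5 and 0.3 and the sample size are illustrative):

import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA

np.random.seed(4)

# simulate y_t = 0.5 y_{t-1} + u_t + 0.3 u_{t-1}
# (ArmaProcess takes the lag-polynomial coefficients: phi(L) = 1 - 0.5L, theta(L) = 1 + 0.3L)
proc = ArmaProcess(ar=[1, -0.5], ma=[1, 0.3])
y = proc.generate_sample(nsample=2000)

res = ARIMA(y, order=(1, 0, 1)).fit()
print(res.params)      # constant, AR(1) and MA(1) estimates, and the error variance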
