Understanding Time Series Analysis 2021
With Exercises
Lecture Notes
February 2021
Chapter One: Introduction
This chapter introduces the decomposition of a time series into its components and explores the parametric and non-parametric tests for a stationary (no-trend) series. It concludes with the estimation of the stationary (no-trend) model.
1.1 Decomposition
A time series is a collection of observations made sequentially in time, e.g. annual Gross Domestic Product, monthly coffee sales, quarterly tea production, etc.
A time series can be decomposed into four components, namely the trend, seasonal, cyclic and irregular components. There are two approaches to decomposing a series: the additive and the multiplicative models. The additive model assumes a sum of the components while the multiplicative model assumes a product of the components. The two models are expressed as follows:
Additive: yt = Tt + St + Ct + It
Multiplicative: yt = Tt × St × Ct × It
Where:
yt – actual series
Tt – trend
St – seasonal component
Ct – cyclic component
It – irregular term
In an additive model all the components are measured in the same units as the original series (yt).
In the multiplicative model only the trend is measured in the same units as the original series; the other components are unit-free indexes.
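For readers who want to reproduce a decomposition numerically, the sketch below uses Python with the statsmodels library on an invented monthly sales series (the numbers and series name are illustrative only, not data from these notes); seasonal_decompose returns the trend, seasonal and irregular components for either model.

import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Hypothetical monthly sales series (illustrative numbers only)
sales = pd.Series(
    [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118,
     115, 126, 141, 135, 125, 149, 170, 170, 158, 133, 114, 140],
    index=pd.date_range("2018-01-01", periods=24, freq="MS"),
)

# Additive decomposition: y_t = T_t + S_t + I_t (any cycle is absorbed in the trend here)
additive = seasonal_decompose(sales, model="additive", period=12)

# Multiplicative decomposition: y_t = T_t x S_t x I_t
multiplicative = seasonal_decompose(sales, model="multiplicative", period=12)

print(additive.trend.dropna().head())
print(multiplicative.seasonal.head(12))   # seasonal indexes centred around 1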
Trend
The trend is the long-term tendency of the series to rise or fall; it is sometimes known as the secular trend.
Chart 1.1: Plot of the Trend (y against Time)
Seasonality
These are periodic fluctuations in the series within a year. Such fluctuations form a relatively
fixed pattern that tends to repeat year in year out. These fluctuations are attributable to weather
changes and social customs or various institutional arrangements like Christmas, public holidays,
summer etc.
Chart: Plot of the Seasonal Component (y against Time)
Cyclic component
The cyclic component is like the seasonal component in that it is a wave-like pattern with ups and downs. The difference is that cycles are viewed as broad contractions and expansions that take several years, not a single year. The length of time between successive peaks of a cycle is not necessarily fixed, as it is for seasonality.
Chart: Plot of the Cyclic Component (y against Time)
Irregular Term
This is the residual movement after accounting for the trend, seasonality and cyclic component.
Chart: Plot of the Irregular Term (y against Time)
1.2 Stationary Series/No Trend
1.2.1 Definition
A series is said to be stationary if it appears about the same on average irrespective of
when it was observed.
yt = β0 + εt for t = 1, 2, 3, …
Where yt is the actual value of the series, β0 is the constant level and εt is the irregular term.
Chart: Plot of a Stationary Series (y against Time)
A stationary or no-trend model may be appropriate in the following situations:
Stable environment – this is when the forces generating the series have stabilized and the environment in which the series exists is relatively unchanging, e.g. the mature stage of the life cycle of a product such as a new smearing jelly.
Easily correctable trend – stability may be realized by making simple corrections for factors such as population growth and inflation.
Short forecasting horizon – the trend may be present but the period for which the
forecasts are needed is relatively short such that the amount due to trend is negligible.
Transformable series – some series may be mathematically transformed into a stable one
just by taking logarithms, square roots, differencing etc.
Residual analysis – analysis of residual series may result in a horizontal pattern or
stationary series;
Preliminary stages of model development - a simple model may be required for ease of
explanation and interpretation.
There are four non-parametric tests which may be used to determine the existence of stationarity, namely: the runs test, the turning points test, the sign test and Daniel's test.
a) Runs Test
Let R be the number of runs in a random sequence of m pluses and m minuses (a plus is assigned to an observation above the median of the series and a minus to an observation below it).
Hypothesis
H0: the series is stationary (no trend)
H1: the series is non-stationary
Mean of R: μR = m + 1
Standard deviation of R: SR = √(m(m − 1)/(2m − 1))
Compute Z = |R − μR| / SR
Decision criteria: reject H0 if the computed Z is greater than the tabulated Zα/2.
Note that Zα/2 is the upper (α/2 × 100%) point of the standard normal distribution.
Conclusion: If H0 is rejected, the test has shown with (1-α) x 100% confidence that the series is
non-stationary and if H0 is not rejected, the test has shown some support for using a stationary or
no trend model.
Example 1.1: Coffee exports of country X shall be used to test for stationarity using the
“runs” test.
Arrange the series in ascending or descending order and compute the median. The median = 7.85.
The number of runs is R = 12.
Mean: μR = m + 1 = 12 + 1 = 13
Standard deviation: SR = √(m(m − 1)/(2m − 1)) = √(12 × 11/((2 × 12) − 1)) = 2.396
On substituting the mean and standard deviation into the computation of Z:
Z = |R − μR| / SR = |12 − 13| / 2.396 = 0.417
Conclusion: Since 0.417 is less than 1.96, we do not reject the null hypothesis and conclude that there is some support for a stationary series or horizontal model at the 5% significance level.
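A minimal Python sketch of the same runs-about-the-median calculation, using the coffee export figures listed in Table 1.4 (the variable names and script layout are my own):

import numpy as np

# Coffee exports of country X (Table 1.4)
coffee = np.array([7.2, 6.4, 6.2, 8.3, 8.4, 6.9, 7.6, 8.2, 9.3, 8.3, 6.6, 5.9,
                   7.6, 8.5, 6.8, 7.9, 7.8, 6.9, 8.8, 9.5, 7.9, 7.4, 8.7, 9.2])

median = np.median(coffee)                       # 7.85
signs = np.where(coffee > median, 1, -1)         # + above the median, - below
signs = signs[coffee != median]                  # drop any values equal to the median
m = int((signs == 1).sum())                      # m pluses (= m minuses here)

runs = 1 + int(np.sum(signs[1:] != signs[:-1]))  # R = 12
mu_r = m + 1                                     # mean of R
s_r = np.sqrt(m * (m - 1) / (2 * m - 1))         # standard deviation of R
z = abs(runs - mu_r) / s_r                       # 0.417

print(median, runs, mu_r, round(s_r, 3), round(z, 3))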
b) Turning Points Test
A turning point in a time series is a point where the series changes direction. Each turning point represents either a local "peak" or a local "trough" in the series.
In order to determine a turning point, assign a plus or minus to a period depending on whether its first difference yt − yt−1 is positive or negative. A plus indicates that the series went up in the period and a minus implies that it went down. A turning point is a time period whose sign is different from that of the next period.
Let the test statistic U be the number of turning points in the series.
Hypothesis
H0: the series is stationary (no trend)
H1: the series is non-stationary
Mean of U: μU = 2(n − 2)/3
Standard deviation of U: SU = √((16n − 29)/90)
Compute Z = |U − μU| / SU
Decision criteria: reject H0 if the computed Z is greater than the tabulated Zα/2.
Note that Zα/2 is the upper (α/2 × 100%) point of the standard normal distribution.
Conclusion: If H0 is rejected, the test has shown with (1-α) x 100% confidence that the series is
non-stationary and if H0 is not rejected, the test has shown some support for using a stationary or
no trend model.
Example 1.2: Coffee exports shall be used to test for stationarity using the turning points
test.
Count the number of turning points in the series shown on table 1.2, U = 11
Mean of U: μU = 2(n − 2)/3 = 2(24 − 2)/3 = 14.667
Standard deviation of U: SU = √((16n − 29)/90) = √((16 × 24 − 29)/90) = 1.986
Z = |U − μU| / SU = |11 − 14.667| / 1.986 = 1.846
Conclusion: Since 1.846 is less than 1.96, we do not reject the null hypothesis and conclude that there is some support for a stationary series or horizontal model at the 5% significance level.
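The turning points calculation can be sketched in the same way, again using the Table 1.4 coffee series (variable names are mine; the sketch assumes no zero first differences):

import numpy as np

coffee = np.array([7.2, 6.4, 6.2, 8.3, 8.4, 6.9, 7.6, 8.2, 9.3, 8.3, 6.6, 5.9,
                   7.6, 8.5, 6.8, 7.9, 7.8, 6.9, 8.8, 9.5, 7.9, 7.4, 8.7, 9.2])

n = len(coffee)
signs = np.sign(np.diff(coffee))                        # sign of y_t - y_{t-1}
turning_points = int(np.sum(signs[1:] != signs[:-1]))   # U = 11

mu_u = 2 * (n - 2) / 3                                  # 14.667
s_u = np.sqrt((16 * n - 29) / 90)                       # 1.986
z = abs(turning_points - mu_u) / s_u                    # 1.846

print(turning_points, round(mu_u, 3), round(s_u, 3), round(z, 3))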
c) Sign Test
Once the signs of the first differences have been determined as done for the turning points test, a
sign test may be used i.e. assign a plus or minus to a period depending on whether its first
difference yt – yt-1 is positive or negative. A plus indicates that the series went up in the period
and minus implies that it went down.
The test statistic V = the number of positive first differences in the series.
Hypothesis
H0: the series is stationary (no trend)
H1: the series is non-stationary (trended)
Mean of V: μV = n′/2
Standard deviation of V: SV = √n′ / 4
where n′ is the number of first differences.
Compute Z = |V − μV| / SV
Decision criteria: reject H0 if the computed Z is greater than the tabulated Zα/2.
Note that Zα/2 is the upper (α/2 × 100%) point of the standard normal distribution.
Conclusion: If H0 is rejected, the test has shown with (1 − α) × 100% confidence that the series is non-stationary, and if H0 is not rejected, the test has shown some support for using a stationary or no-trend model. If V is greater than μV we conclude that the trend is upward, and if it is smaller the trend is downward.
Example 1.3: Using the first differences computed in Example 1.2, count the number of positive first differences: V = 12.
n′ = 23
Mean of V: μV = n′/2 = 23/2 = 11.5
Standard deviation of V: SV = √n′ / 4 = √23 / 4 = 1.1990
Z = |V − μV| / SV = |12 − 11.5| / 1.1990 = 0.417
Conclusion: Since 0.417 is less than 1.96, we fail to reject the null hypothesis and conclude that the series is stationary at the 5% significance level.
d) Daniels Test
The test statistic is rs = 1 − 6Σdt² / (n(n² − 1)), where dt = t − rank(yt).
Hypothesis
H0: the series is stationary (no trend)
H1: the series is trended
If the sample is small (n < 30), use the table of critical values of rs. Decision criteria: reject H0 if the computed |rs| is greater than the tabulated rα/2.
If the sample is large (n > 30), compute Z = |rs − μr| / Sr
Mean: μr = 0
Standard deviation: Sr = 1/√(n − 1)
Decision criteria: reject H0 if the computed Z is greater than the tabulated Zα/2.
Note that Zα/2 is the upper (α/2 × 100%) point of the standard normal distribution.
Conclusion: If H0 is rejected, the test has shown with (1-α) x 100% confidence that the series is
non-stationary and if H0 is not rejected, the test has shown some support for using a stationary or
no trend model. If the computed rs is negative we conclude that the trend is downward and if it is
positive it is an upward trend.
Example 1.4: Coffee exports shall be used to test for stationarity using Daniels test
rs = 1 − 6Σdt² / (n(n² − 1)) = 1 − (6 × 1402) / (24(576 − 1)) = 0.3904
Conclusion: Since the computed r= 0.3904 < 0.555, we fail to reject the null hypothesis and
conclude with 95% confidence that there is no trend or the series is stationary.
Table 1.4: Exports of coffee (yt) in country X
t   yt   Rank of yt   dt   dt²
1 7.2 8 -7 49
2 6.4 3 -1 1
3 6.2 2 1 1
4 8.3 16 -12 144
5 8.4 18 -13 169
6 6.9 6 0 0
7 7.6 10 -3 9
8 8.2 15 -7 49
9 9.3 23 -14 196
10 8.3 16 -6 36
11 6.6 4 7 49
12 5.9 1 11 121
13 7.6 10 3 9
14 8.5 19 -5 25
15 6.8 5 10 100
16 7.9 13 3 9
17 7.8 12 5 25
18 6.9 6 12 144
19 8.8 21 -2 4
20 9.5 24 -4 16
21 7.9 13 8 64
22 7.4 9 13 169
23 8.7 20 3 9
24 9.2 22 2 4
Σdt² = 1402
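A short Python sketch of Daniel's test on the same data; ties are ranked with the lowest rank so that the result matches the hand computation in Table 1.4 (variable names are my own):

import numpy as np
from scipy.stats import rankdata

coffee = np.array([7.2, 6.4, 6.2, 8.3, 8.4, 6.9, 7.6, 8.2, 9.3, 8.3, 6.6, 5.9,
                   7.6, 8.5, 6.8, 7.9, 7.8, 6.9, 8.8, 9.5, 7.9, 7.4, 8.7, 9.2])
n = len(coffee)
t = np.arange(1, n + 1)

rank_y = rankdata(coffee, method="min")      # ties take the lowest rank, as in Table 1.4
d = t - rank_y
rs = 1 - 6 * np.sum(d ** 2) / (n * (n ** 2 - 1))
print(int(np.sum(d ** 2)), round(rs, 4))     # 1402 and 0.3904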
Pearson's Correlation Test
This is a parametric test that detects primarily the presence of a linear trend; it may not detect a curvilinear trend. The coefficient is computed as follows:
Stt = Σ(t − t̄)² = Σt² − (Σt)²/n
Syy = Σ(y − ȳ)² = Σy² − (Σy)²/n
Sty = Σ(t − t̄)(y − ȳ) = Σty − (Σt)(Σy)/n
r = Sty / √(Stt Syy)
tr = r√(n − 2) / √(1 − r²)
Decision criteria: reject H0 if the computed t is greater than the tabulated tα/2, where tα/2 is the upper (α/2 × 100%) point of Student's t-distribution with n − 2 degrees of freedom.
Conclusion: If H0 is rejected, the test has shown with (1-α) x 100% confidence, that the series is
non-stationary and if H0 is not rejected, the test has shown some support for using a stationary or
no trend model. If the computed tr is negative we conclude that the trend is downward and if it is
positive it is an upward trend.
Example 1.5: Using the coffee exports from Example 1.1, compute Pearson's correlation to test for stationarity of the series.
Stt = Σt² − (Σt)²/n = 4900 − 300²/24 = 1150
Syy = Σy² − (Σy)²/n = 1469.35 − 186.3²/24 = 23.1962
Sty = Σty − (Σt)(Σy)/n = 2393.9 − (300 × 186.3)/24 = 65.15
r = Sty / √(Stt Syy) = 65.15 / √(1150 × 23.1962) = 0.3989
tr = r√(n − 2) / √(1 − r²) = 0.3989√22 / √(1 − 0.3989²) = 2.04
Conclusion: Since the computed t = 2.04 < t0.025(22) = 2.074, we fail to reject the null hypothesis and conclude with 95% confidence that there is no trend, i.e. the series is stationary.
Table 1.5: Exports of coffee (yt) in country X
t   yt   t·yt   yt²   t²
1 7.2 7.2 51.84 1
2 6.4 12.8 40.96 4
3 6.2 18.6 38.44 9
4 8.3 33.2 68.89 16
5 8.4 42.0 70.56 25
6 6.9 41.4 47.61 36
7 7.6 53.2 57.76 49
8 8.2 65.6 67.24 64
9 9.3 83.7 86.49 81
10 8.3 83.0 68.89 100
11 6.6 72.6 43.56 121
12 5.9 70.8 34.81 144
13 7.6 98.8 57.76 169
14 8.5 119.0 72.25 196
15 6.8 102.0 46.24 225
16 7.9 126.4 62.41 256
17 7.8 132.6 60.84 289
18 6.9 124.2 47.61 324
19 8.8 167.2 77.44 361
20 9.5 190.0 90.25 400
21 7.9 165.9 62.41 441
22 7.4 162.8 54.76 484
23 8.7 200.1 75.69 529
24 9.2 220.8 84.64 576
Σt = 300   Σyt = 186.3   Σt·yt = 2393.9   Σyt² = 1469.35   Σt² = 4900
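The Pearson computations of Example 1.5 can be checked with a few lines of Python (same coffee series; names are mine):

import numpy as np

coffee = np.array([7.2, 6.4, 6.2, 8.3, 8.4, 6.9, 7.6, 8.2, 9.3, 8.3, 6.6, 5.9,
                   7.6, 8.5, 6.8, 7.9, 7.8, 6.9, 8.8, 9.5, 7.9, 7.4, 8.7, 9.2])
n = len(coffee)
t = np.arange(1, n + 1)

s_tt = np.sum((t - t.mean()) ** 2)                           # 1150
s_yy = np.sum((coffee - coffee.mean()) ** 2)                 # 23.196
s_ty = np.sum((t - t.mean()) * (coffee - coffee.mean()))     # 65.15
r = s_ty / np.sqrt(s_tt * s_yy)                              # 0.3989
t_r = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)               # 2.04
print(round(r, 4), round(t_r, 2))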
The horizontal (no-trend) model is easy to fit since it has only one form. It is expressed as follows:
yt = β0 + εt
The least squares method is used to compute β0; it requires the estimate of β0 that minimizes the sum of squared forecast errors. The normal equation is:
Σyt = nβo
β0 = Σyt / n …………………………………………………………………..(1)
β̂0 = 186.3/24 = 7.76, hence ŷt = 7.76
Chart 1.6: Coffee exports (yt) in country X with a fitted stationary model
Chapter Two: Trend
This chapter introduces the trend, the tests applicable for the presence of a trend in a series, and concludes with the estimation of the different types of trend models.
2.1 Introduction
A series is said to be trended if the expected value of the series changes over time such that E(yt) = f(β0, β1, β2, …; t), where yt = f(β0, β1, β2, …; t) + εt and f(β0, β1, β2, …; t) represents the trend.
The major causes of trend in economic time series are the following:
Population changes – population growth leads to increased demand for a number of commodities, which is reflected in series such as commodity sales, food consumption, etc.;
Technological changes – technological advancement leads to increased productivity and may lead to a general improvement in the standard of living. New products come onto the market as a result of technological advancement; items which were once luxuries become necessities and old necessities become outdated;
Changes in social customs – social customs change over time: people's tastes and habits change, cigarette smoking may increase or decline, etc.;
Inflation – affects interest rates, salaries and wages, and the purchasing power of the Uganda shilling or any other currency, among others;
Environmental conditions – changes in environmental conditions lead to an increase or reduction of the money required for their management;
Market acceptance – when a new product comes onto the market only a few people know about it; as time goes by more consumers become familiar with it and the time series associated with that product rises.
All the non-parametric and parametric tests used for making a decision about a stationary series are applicable to the trend. The non-parametric tests are the runs test, the turning points test, the sign test and Daniel's test. The parametric test considered in Chapter One is Pearson's test.
2.4 Estimation of Trends
A linear trend is one of the simplest mathematical models to be estimated. The line of best fit may be computed mathematically using least squares: this is the line which minimizes the total squared deviations of the actual observations from the fitted line. The general form of the linear trend equation is as follows:
yt = β0 + β1t + εt
The least squares method is used to compute β0 and β1; it requires deriving the normal equations, which are then solved simultaneously. Solving them gives the following computational formulas:
Stt = Σ(t − t̄)² = Σt² − (Σt)²/n
Syy = Σ(yt − ȳ)² = Σy² − (Σy)²/n
Sty = Σ(t − t̄)(y − ȳ) = Σty − (Σt)(Σy)/n
β̂1 = (nΣty − ΣtΣy) / (nΣt² − (Σt)²) ………………………………………………………2.3
β̂0 = ȳ − β̂1t̄ …………………………………………………………………………...2.4
ŷt = β̂0 + β̂1t
R² = Explained variation / Total variation = Σ(ŷt − ȳ)² / Σ(yt − ȳ)²
Example 2.1: Cotton sales yt for the period 1997 to 2011
β̂1 = (nΣty − ΣtΣy) / (nΣt² − (Σt)²) = (15 × 7421 − 120 × 805) / (15 × 1240 − 120²) = 3.5036
ȳ = 53.67 and t̄ = 8
β̂0 = ȳ − β̂1t̄ = 53.67 − 3.5036 × 8 = 25.64
ŷt = β̂0 + β̂1t = 25.64 + 3.5036t
R² = Σ(ŷt − ȳ)² / Σ(yt − ȳ)² = 0.83
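A small Python sketch of the same least squares computation, working from the summary sums quoted in the example (the raw cotton series is not reproduced in these notes, so only the sums are used; with raw data numpy's polyfit gives the line directly):

import numpy as np

# Summary sums from Example 2.1
n, sum_t, sum_y, sum_ty, sum_t2 = 15, 120, 805, 7421, 1240

b1 = (n * sum_ty - sum_t * sum_y) / (n * sum_t2 - sum_t ** 2)   # 3.5036
b0 = sum_y / n - b1 * sum_t / n                                  # 25.64
print(round(b1, 4), round(b0, 2))

# With the raw series t, y available, numpy gives the same line directly:
# b1, b0 = np.polyfit(t, y, 1)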
Quadratic trend
yt = β0 + β1t + β2t² + εt
We derive the three normal equations required for computation of the coefficients:
Σy = nβ0 + β1Σt + β2Σt²
Σty = β0Σt + β1Σt² + β2Σt³
Σt²y = β0Σt² + β1Σt³ + β2Σt⁴
The measure of the closeness of the fit to the data is the coefficient of determination, computed using the formula:
ŷt = β̂0 + β̂1t + β̂2t²
R² = Explained variation / Total variation = Σ(ŷt − ȳ)² / Σ(yt − ȳ)²
Example 2.2: Cotton sales yt for the period 1997 to 2011
On substituting into the three normal equations we obtain the following:
β0 = 10.90, β1 = 8.7072 and β2 = −0.3252, so that ŷt = 10.90 + 8.7072t − 0.3252t²
R² = Σ(ŷt − ȳ)² / Σ(yt − ȳ)² = 0.94
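A hedged sketch of a quadratic trend fit in Python; since the raw cotton data are not reproduced here, the check below rebuilds an exact quadratic from the coefficients of Example 2.2 and recovers them with numpy.polyfit:

import numpy as np

def quadratic_trend(t, y):
    """Least squares fit of y_t = b0 + b1*t + b2*t**2 (a sketch, not tied to the notes' data)."""
    b2, b1, b0 = np.polyfit(t, y, 2)        # numpy returns the highest power first
    fitted = b0 + b1 * t + b2 * t ** 2
    r2 = np.sum((fitted - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)
    return b0, b1, b2, r2

# Check on an exact quadratic built from the coefficients of Example 2.2
t = np.arange(1, 16)
y = 10.90 + 8.7072 * t - 0.3252 * t ** 2
print([round(v, 4) for v in quadratic_trend(t, y)])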
Simple exponential trend
yt = β0β1^t εt
The model is transformed for ease of computation of the coefficients. The linear (logarithmic) form of the equation is
log yt = log β0 + (log β1)t + log εt
On close observation you will notice that it is similar to the linear trend seen previously. Deriving the normal equations and rearranging, the computational formulas for the coefficients become:
log β1 = (nΣt log y − Σt Σlog y) / (nΣt² − (Σt)²)
log β0 = Σlog yt / n − (log β1) Σt / n
ŷt = β̂0 β̂1^t
R² = Explained variation / Total variation = Σ(ŷt − ȳ)² / Σ(yt − ȳ)²
Note that if the computed β1 is greater than 1 there is growth, while if β1 is less than 1 there is a decline.
Σln yt = 58.731   Σt ln yt = 492.259   Σt = 120   Σt² = 1,240   Σ(ln yt)² = 232.3973
ln β1 = (nΣt ln y − Σt Σln y) / (nΣt² − (Σt)²) = (15 × 492.259 − 120 × 58.731) / (15 × 1240 − 120²) = 0.080
β1 = e^0.080 = 1.0833
ln β0 = Σln yt/n − (ln β1)Σt/n = 58.731/15 − (0.080 × 120)/15 = 3.275
β0 = e^3.275 = 26.45
ŷt = 26.45(1.0833)^t
Comment: The sales were 26.45 in the base period and the growth is 8.3% per annum.
R² = Explained variation / Total variation = Σ(ŷt − ȳ)² / Σ(yt − ȳ)² = 0.73
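The log-linear (exponential trend) computation can be sketched as follows, again working from the summary sums quoted above (variable names are mine):

import numpy as np

# Summary sums for the log-transformed cotton series
n, sum_t, sum_t2 = 15, 120, 1240
sum_lny, sum_t_lny = 58.731, 492.259

ln_b1 = (n * sum_t_lny - sum_t * sum_lny) / (n * sum_t2 - sum_t ** 2)   # 0.080
ln_b0 = sum_lny / n - ln_b1 * sum_t / n                                 # 3.275
b1, b0 = np.exp(ln_b1), np.exp(ln_b0)                                   # 1.0833, 26.45
print(round(b0, 2), round(b1, 4))    # y_hat = 26.45 * 1.0833**t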
Modified exponential trend: yt = β0 + β1β2^t + εt
Divide the time series into three equal parts and compute the partial sums. Let the sum of the first third be S1y, the sum of the second third be S2y and the sum of the last third be S3y, and let n be the number of observations in each third of the series.
β2^n = (S3y − S2y) / (S2y − S1y)
β1 = (S2y − S1y)(β2 − 1) / (β2^n − 1)²
β0 = (1/n)[S1y − β1(β2^n − 1)/(β2 − 1)]
β2^5 = (S3y − S2y)/(S2y − S1y) = (340 − 296)/(296 − 169) = 44/127 = 0.346
β2 = 0.346^(1/5) = 0.809
β1 = (S2y − S1y)(β2 − 1)/(β2^5 − 1)² = (296 − 169)(0.809 − 1)/(0.346 − 1)² = 127 × (−0.447) = −56.80
β0 = (1/n)[S1y − β1(β2^5 − 1)/(β2 − 1)] = (1/5)[169 − (−56.80)(0.346 − 1)/(0.809 − 1)] = 72.663
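A sketch of the three partial-sums fit in Python. The raw series behind the worked example is not reproduced in the notes, so the check below generates an exact modified-exponential curve (with t measured from 0 at the first observation) using the fitted coefficients and recovers them:

import numpy as np

def modified_exponential(y):
    """Three partial-sums fit of y_t = b0 + b1*b2**t (assumes len(y) is divisible by 3)."""
    y = np.asarray(y, dtype=float)
    n = len(y) // 3
    s1, s2, s3 = y[:n].sum(), y[n:2 * n].sum(), y[2 * n:].sum()
    b2_n = (s3 - s2) / (s2 - s1)                 # beta2 ** n
    b2 = b2_n ** (1.0 / n)
    b1 = (s2 - s1) * (b2 - 1) / (b2_n - 1) ** 2
    b0 = (s1 - b1 * (b2_n - 1) / (b2 - 1)) / n
    return b0, b1, b2

# Check on an exact curve built from the worked example's coefficients (t = 0, 1, ..., 14)
t = np.arange(15)
y = 72.663 - 56.80 * 0.809 ** t
print([round(v, 3) for v in modified_exponential(y)])   # ~ (72.663, -56.80, 0.809)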
Chart 2.1: Plot of sales yt and the trend for the period 1997 to 2011
Gompertz curve: yt = β0β1^(β2^t)εt
On transforming by taking logs we obtain
ln yt = ln β0 + (ln β1)β2^t + ln εt
This is a modified exponential trend in the transformed series ln yt, with coefficients ln β0, ln β1 and β2.
Divide the transformed series into three equal parts and compute the partial sums. Let the sum of the first third be S1lny, the sum of the second third be S2lny and the sum of the last third be S3lny, and let n be the number of observations in each third of the series.
β2^n = (S3lny − S2lny)/(S2lny − S1lny)
ln β1 = (S2lny − S1lny)(β2 − 1)/(β2^n − 1)²
ln β0 = (1/n)[S1lny − ln β1(β2^n − 1)/(β2 − 1)]
β2^5 = (21.228 − 20.558)/(20.558 − 17.587) = 0.2255
β2 = 0.2255^(1/5) = 0.7424
ln β1 = (S2lny − S1lny)(β2 − 1)/(β2^5 − 1)² = (20.558 − 17.587)(0.7424 − 1)/(0.2255 − 1)² = −1.276
β1 = e^(−1.276) = 0.2792
ln β0 = (1/n)[S1lny − ln β1(β2^5 − 1)/(β2 − 1)] = (1/5)[17.587 − (−1.276)(0.2255 − 1)/(0.7424 − 1)] = 4.2845
β0 = e^4.2845 = 72.57
ŷt = 72.57(0.2792)^(0.7424^t)
Chart 2.2: Plot of sales yt and the trend for the period 1997 to 2011
2.4.5 Logistic Curve
yt = β0 / (1 + e^(β1 + β2 t)), in the case of natural logarithms (log e)
or
yt = β0 / (1 + 10^(β1 + β2 t)), in the case of log 10
Divide the time series into three equal parts and select three periods t0, t1 and t2 that are equidistant from one another: t0 near the beginning within the first third, t1 in the middle and t2 near the end within the last third of the series. Then select the values of yt corresponding to the three t's, call them y0, y1 and y2, and let n be the number of periods from one selected t to the next.
The computations are as follows:
β0 = (2y0y1y2 − y1²(y0 + y2)) / (y0y2 − y1²)
β1 = log((β0 − y0)/y0)
β2 = (1/n) log[y0(β0 − y1)/(y1(β0 − y0))]
For the worked example:
β0 = (2 × 33 × 59 × 72 − 59²(33 + 72)) / (33 × 72 − 59²) = 77.047
β1 = loge((77.047 − 33)/33) = 0.2888
β2 = (1/5) loge[33(77.047 − 59)/(59(77.047 − 33))] = −0.2947
ŷt = 77.047 / (1 + e^(0.2888 − 0.2947t))
Chart 2.3: Plot of sales yt and the trend for the period 1997 to 2011
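The three-point logistic computation above can be reproduced with a short Python function (the function name is mine; the inputs 33, 59, 72 and n = 5 are the values selected in the worked example, with t measured from the first selected period t0):

import numpy as np

def logistic_three_point(y0, y1, y2, n):
    """Three-point logistic fit y_t = b0 / (1 + exp(b1 + b2*t)), with t measured from t0
    and the three selected points n periods apart."""
    b0 = (2 * y0 * y1 * y2 - y1 ** 2 * (y0 + y2)) / (y0 * y2 - y1 ** 2)
    b1 = np.log((b0 - y0) / y0)
    b2 = np.log(y0 * (b0 - y1) / (y1 * (b0 - y0))) / n
    return b0, b1, b2

b0, b1, b2 = logistic_three_point(33, 59, 72, 5)
print(round(b0, 3), round(b1, 4), round(b2, 4))   # 77.047, 0.2888, -0.2947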
2.5 Estimation of the trends using Excel
Regression Statistics
Multiple R 0.912325935
R Square 0.832338611
Adjusted R Square 0.819441581
Standard Error 7.297680147
Observations 15
ANOVA
df   SS   MS   F   Significance F
Regression 1 3437.004 3437.004 64.53723 2.14E-06
Residual 13 692.3298 53.25614
Total 14 4129.333
   Coefficients   Standard Error   t Stat   P-value
Intercept 25.64 3.965254 6.465688 2.11E-05
X Variable 1 3.5036 0.43612 8.033507 2.14E-06
Linear Trend 𝑦̂𝑡 = 25.64 + 3.5036t
The t values of 6.47 for the intercept and 8.03 for the X variable are high and they reveal that the
two coefficients are significant.
Regression Statistics
Multiple R 0.968508266
R Square 0.938008262
Adjusted R Square 0.927676305
Standard Error 4.618662782
Observations 15
ANOVA
df   SS   MS   F   Significance F
Regression 2 3873.3488 1936.674391 90.78709097 5.68E-08
Residual 12 255.98455 21.3320459
Total 14 4129.3333
   Coefficients   Standard Error   t Stat   P-value
Intercept 10.89450549 4.1139988 2.6481548 0.021251339
X Variable 1 8.707191338 1.1831985 7.3590286 8.74617E-06
X Variable 2 -0.32522624 0.0719096 -4.5227111 0.000698504
The t values of 2.65 for the intercept, 7.36 for X variable 1 and −4.52 for X variable 2 are high and they reveal that the three coefficients are significant.
Regression Statistics
Multiple R 0.857097645
R Square 0.734616372
Adjusted R Square 0.714202247
Standard Error 0.223260997
Observations 15
ANOVA
df   SS   MS   F   Significance F
Regression 1 1.793724 1.793724 35.98569 4.45E-05
Residual 13 0.647991 0.049845
Total 14 2.441715
   Coefficients   Standard Error   t Stat   P-value
Intercept 3.275093963 0.121311 26.99757 8.38E-13
X Variable 1 0.08003847 0.013342 5.998807 4.45E-05
𝑦̂ = 26.45(1.0833)t
The t values of 27.0 for the intercept and 6.0 for the X variable are high and they reveal that the two coefficients are significant.
Exercise 2:
The annual GDP series for country X
Year t GDP(y)
1997 1 8.62
1998 2 9.45
1999 3 10.07
2000 4 10.51
2001 5 11.20
2002 6 11.99
2003 7 12.73
2004 8 13.47
2005 9 14.81
2006 10 15.86
2007 11 17.14
2008 12 18.93
2009 13 19.71
2010 14 20.93
2011 15 22.17
i) Fit a linear trend, quadratic equation and exponential trend and compute their respective
coefficients of determination.
ii) Plot the GDP series and the three trends computed above on one graph
Chapter Three: Seasonal
This chapter introduces the seasonal component, tests for seasonality and the methods for
estimation of the seasonal factors.
3.1 Introduction
Seasonality is a regular pattern of fluctuations that repeats from year to year in a time series
observed at shorter than yearly intervals.
A time series yt observed L times per year at times t = 1, 2, 3, … is said to be seasonal if the average value of the series changes over time such that:
Additive model: E(yt) = f(β0, β1, β2, …; t) + St for t = 1, 2, 3, … and Σ(t=1..L) St = 0
or
Multiplicative model: E(yt) = f(β0, β1, β2, …; t) × St for t = 1, 2, 3, … and Σ(t=1..L) St = L
The major causes of seasonality include the following:
Weather and temperature related factors – sales of goods and services associated with weather patterns, such as purchases of clothing and consumption of heating fuels in temperate regions which experience extreme weather conditions such as winters and summers, normally exhibit seasonality;
Calendar related events which are associated with collective social behavior depict
seasonal patterns. The calendar related factors are activities associated with events such
as public holidays, religious events, school open days or school calendars etc.
3.2 Tests for Seasonality
Non-parametric and parametric tests will be used to decide on the presence or absence of seasonality, and the seasonal factors will then be estimated.
Hypothesis
H0: there is no seasonality in the series
H1: the series is seasonal
The series is decomposed as:
Additive: yt = Tt + St + Ct + It
or
Multiplicative: yt = Tt × St × Ct × It
And the moving average (MA) = Tt + Ct for an additive model and (MA) = Tt x Ct for a
multiplicative model.
In case of a quarterly series yt observed 4 times per year at times t=1, 2, 3, … the moving
average is computed as follows:
The first MA = (y1 + y2 + y3 + y4)/4, the second MA = (y2 + y3 + y4 + y5)/4, third MA = (y3 + y4 + y5
+ y6)/4 and so on.
If L is an even number then the moving averages must be centered to correspond to a specific
quarter. The computation of the centered moving average (CMA) follows:
Step 1: Calculate the L-period moving averages of the original series;
Step 2: Calculate the 2-period moving average of the moving averages resulting from Step 1, giving the centred moving average;
Kruskal-Wallis test
Let n = n1 + n2 + … + nL = Σni, where ni is the number of observations in the ith season, and let Ri be the sum of the ranks of the observations in the ith season.
Let R̄i = average rank in the ith season = Ri / ni and let R̄ = (n + 1)/2 be the overall average rank.
H = [12/(n(n + 1))] [n1(R̄1 − R̄)² + n2(R̄2 − R̄)² + … + nL(R̄L − R̄)²]
  = [12/(n(n + 1))] Σ(Ri²/ni) − 3(n + 1)
Decision criteria: reject H0 if the computed H is greater than χ²α(L − 1), where χ²α(L − 1) is the upper (α × 100%) point of the chi-square distribution with L − 1 degrees of freedom.
Example 3.1: Cotton sales by a Soroti based company.
Year   Quarter   yt   Moving Average (MA)   Centred Moving Average (CMA)   yt/CMA = S×I   Rank
2005 I 130
II 45 71.25
III 20 78.75 75.00 0.267 1
IV 90 82.50 80.63 1.116 11
2006 I 160 83.75 83.13 1.925 16
II 60 92.50 88.13 0.681 5
III 25 90.00 91.25 0.274 2
IV 125 93.75 91.88 1.361 12
2007 I 150 96.25 95.00 1.579 13
II 75 85.00 90.63 0.828 6
III 35 90.00 87.50 0.400 3
IV 80 93.75 91.88 0.871 7
2008 I 170 97.50 95.63 1.778 15
II 90 105.00 101.25 0.889 9
III 50 106.25 105.63 0.473 4
IV 110 106.75 106.50 1.033 10
2009 I 175 104.25 105.50 1.659 14
II 92 105.50 104.88 0.877 8
III 40
IV 115
Σni = 4 + 4 +4 + 4 = 16
H = [12/(16 × 17)] [58²/4 + 28²/4 + 10²/4 + 40²/4] − 3(16 + 1)
= 13.5
Let α = 5%
2
Reject Ho if H > 𝜒0.05 (3) = 7.81
Conclusion: Since H = 13.5 > 7.81, we reject the null hypothesis and conclude that there is seasonality at the 5% significance level.
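The same Kruskal-Wallis statistic can be obtained directly with scipy, feeding in the S×I ratios grouped by quarter from the table above:

from scipy.stats import kruskal

# S x I ratios (y_t / CMA) grouped by quarter, taken from the worked example
q1 = [1.925, 1.579, 1.778, 1.659]
q2 = [0.681, 0.828, 0.889, 0.877]
q3 = [0.267, 0.274, 0.400, 0.473]
q4 = [1.116, 1.361, 0.871, 1.033]

h, p_value = kruskal(q1, q2, q3, q4)
print(round(h, 2), round(p_value, 4))   # H = 13.5; compare with chi-square(3) = 7.81 at 5%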
Year   Quarter   yt   Moving Average (MA)   Centred Moving Average (CMA)   yt/CMA = S×I
2005 I 130
II 45 71.25
III 20 78.75 75.00 0.267
IV 90 82.50 80.63 1.116
2006 I 160 83.75 83.13 1.925
II 60 92.50 88.13 0.681
III 25 90.00 91.25 0.274
IV 125 93.75 91.88 1.361
2007 I 150 96.25 95.00 1.579
II 75 85.00 90.63 0.828
III 35 90.00 87.50 0.400
IV 80 93.75 91.88 0.871
2008 I 170 97.50 95.63 1.778
II 90 105.00 101.25 0.889
III 50 106.25 105.63 0.473
IV 110 106.75 106.50 1.033
2009 I 175 104.25 105.50 1.659
II 92 105.50 104.88 0.877
III 40
IV 115
Rearranging the yt/CMA = S×I values into their respective quarters and computing arithmetic means, the arithmetic means represent the unadjusted seasonal factors; on adjustment they become the adjusted seasonal factors.
Quarter   S×I ratios   Average (unadjusted SF)   Adjusted SF
I 1.6588 1.9248 1.5789 1.7778 1.7351 1.7341
II 0.8772 0.6809 0.8276 0.8889 0.8186 0.8182
III 0.2667 0.2740 0.4000 0.4734 0.3535 0.3533
IV 1.1163 1.3605 0.8707 1.0329 1.0951 1.0945
Sum 4.0023 4.0000
Quarter SF
I 1.7341
II 0.8182
III 0.3533
IV 1.0945
Interpretation: Generally, in quarter I the sales are 73% above the average or trend, in quarter II the sales are 18% below the average or trend, in quarter III the sales are about 65% below the average or trend, and in quarter IV the sales are 9% above the average or trend.
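A pandas sketch of the ratio-to-moving-average calculation that produced the table above (the shift aligns the centred moving average with the quarter it describes, as in the hand computation; variable names are mine):

import pandas as pd

# Quarterly cotton sales from the table above (2005 I through 2009 IV)
y = pd.Series([130, 45, 20, 90, 160, 60, 25, 125, 150, 75, 35, 80,
               170, 90, 50, 110, 175, 92, 40, 115],
              index=pd.period_range("2005Q1", periods=20, freq="Q"))

ma = y.rolling(4).mean()                      # 4-quarter moving average
cma = ma.rolling(2).mean().shift(-2)          # centred moving average, aligned to its quarter
sxi = y / cma                                 # S x I ratios

unadjusted = sxi.groupby(sxi.index.quarter).mean()     # unadjusted seasonal factors
adjusted = unadjusted * 4 / unadjusted.sum()           # rescale so the factors sum to 4
print(adjusted.round(4))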
3.3 Test for Seasonality using Excel
Least squares regression may be used to test for seasonality. The method fits a least squares trend together with seasonal indicator variables to yt (additive model) or to log yt (multiplicative model), and then tests whether the seasonal indicators significantly improve the fit over the trend alone.
Let x1, x2, …, xL be seasonal indicator variables such that xi is 1 in the ith season and 0 otherwise, for i = 1, 2, …, L.
Additive model: yt = β0 + β1t + S1x1 + S2x2 + … + SLxL + εt with ΣSi = 0
Multiplicative model: log yt = log β0 + (log β1)t + (log S1)x1 + (log S2)x2 + … + (log SL)xL + log εt with Σlog Si = 0
There should be L − 1 seasonal indicators in the regression, since the Lth seasonal factor can easily be determined once we have determined the L − 1 indicators.
The regression technique of determining if the seasonal factors are significant is as follows;
Step 1
Using MS Excel, fit a linear trend and L − 1 seasonal dummy variables and obtain:
Variability explained by the trend and seasonal model
= SSR (trend + seasonal)
Variability unexplained by the trend and seasonal model
= SSE (trend + seasonal)
Step 2
Test for the significance of the linear trend coefficient β1 using the t-test:
H0: β1 = 0 (no trend) against H1: β1 ≠ 0
t = β̂1 / S(β̂1)
Decision criteria: reject H0 if the computed |t| is greater than the tabulated tα/2 (df = n − L − 1).
Step 3
If the trend is not significant, refit the model in MS Excel without the trend, that is with only the L − 1 seasonal dummies, and obtain:
Variability explained by the seasonal model = SSR (seasonal)
Variability unexplained by the seasonal model = SSE (seasonal)
Use the F test to test the overall model fit.
Hypothesis to be tested:
H0: S1 = S2 = … = SL−1 = 0 (no seasonality)
H1: at least one Si ≠ 0 (seasonality is present)
Step 4
In case the trend is present, fit a regression with only the linear trend yt = β0 + β1t without the
seasonal dummies.
SSR (trend) = Variability explained by a model with only the linear trend;
Step 5
In order to test for seasonality we compute the F as follows:
Test statistic: F = MSR(seasonal) / MSE(trend + seasonal)
Where:
MSR(seasonal) = SSR(seasonal) / (L − 1)
MSE(trend + seasonal) = SSE(trend + seasonal) / (n − L − 1)
Example 3.2: Quarterly sales of batteries. A linear trend and three quarterly indicator variables (x1, x2 and x3) were fitted to the series (Step 1), giving the following output:
Regression Statistics
Multiple R 0.97959
R Square 0.959596
Adjusted R Square 0.948822
Standard Error 1.113403
Observations 20
ANOVA
df   SS   MS   F   Significance F
Regression 4 441.633 110.4083 89.06285 2.9E-10
Residual 15 18.595 1.239667
Total 19 460.228
   Coefficients   Standard Error   t Stat   P-value
Intercept 8.525 0.72585 11.74485 5.8E-09
t X Variable 1 0.18625 0.044011 4.231885 0.000725
x1 X Variable 2 5.95875 0.716449 8.317058 5.32E-07
x2 X Variable 3 -2.7875 0.709658 -3.92795 0.001343
x3 X Variable 4 -6.93375 0.705552 -9.82741 6.28E-08
Conclusion: Since the computed value 4.23 is greater than the tabulated t0.025 = 2.131, we reject
H0, and conclude that the sales of batteries series has trend.
The trend is present in the series hence we proceed to Step 4;
Fit a regression with only the linear trend yt = β0 + β1t without the seasonal dummies;
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.124363
R Square 0.015466
Adjusted R Square -0.03923
Standard Error 5.017248
Observations 20
ANOVA
df   SS   MS   F   Significance F
Regression 1 7.117955 7.117955 0.282764 0.601397
Residual 18 453.11 25.17278
Total 19 460.228
   Coefficients   Standard Error   t Stat   P-value
Intercept 8.453684 2.33067 3.627148 0.001927
t X Variable 1 0.103459 0.194561 0.531756 0.601397
SSR(trend) = 7.118
The amount of improvement obtained by adding a seasonal component is:
SSR(seasonal) = SSR(trend + seasonal) − SSR(trend) = 441.633 − 7.118 = 434.52
MSR(seasonal) = SSR(seasonal)/(L − 1) = 434.52/(4 − 1) = 144.8
MSE(trend + seasonal) = SSE(trend + seasonal)/(n − L − 1) = 1.2396
Test statistic: F = 144.8/1.2396 = 116.84
Step 5
Hypothesis
H0: S1 = S2 = S3 = 0 (no seasonality); H1: at least one Si ≠ 0
Conclusion: Since the computed F = 116.84 > F0.05(3, 15) = 3.29, we reject H0 and conclude that there is seasonality in the sales of batteries series.
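The Steps 1-5 procedure can be sketched with statsmodels OLS. Because the battery data themselves are not reproduced in these notes, the example below builds an illustrative quarterly series with a trend and a seasonal pattern; the variable names and generated numbers are assumptions:

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Illustrative quarterly series with a trend and a seasonal pattern (not the battery data)
rng = np.random.default_rng(1)
n, L = 20, 4
t = np.arange(1, n + 1)
quarter = (t - 1) % L + 1
y = (8 + 0.2 * t
     + np.select([quarter == 1, quarter == 2, quarter == 3], [6.0, -2.8, -6.9], 0.0)
     + rng.normal(scale=1.0, size=n))

X_full = sm.add_constant(pd.DataFrame({
    "t": t,
    "x1": (quarter == 1).astype(int),
    "x2": (quarter == 2).astype(int),
    "x3": (quarter == 3).astype(int)}))
full = sm.OLS(y, X_full).fit()                                     # trend + L-1 seasonal dummies
trend_only = sm.OLS(y, sm.add_constant(pd.DataFrame({"t": t}))).fit()

# In statsmodels, .ess is the explained SS (the notes' SSR) and .ssr is the residual SS (SSE)
ssr_seasonal = full.ess - trend_only.ess        # improvement from adding seasonal dummies
msr_seasonal = ssr_seasonal / (L - 1)
mse_full = full.ssr / (n - L - 1)               # SSE(trend + seasonal) / (n - L - 1)
f_seasonal = msr_seasonal / mse_full            # compare with F(L-1, n-L-1)
print(full.params.round(3))
print(round(f_seasonal, 2))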
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.771744
R Square 0.595588
Adjusted R Square 0.510449
Standard Error 0.672257
Observations 24
ANOVA
df   SS   MS   F   Significance F
Regression 4 12.6458 3.161451 6.995462 0.001225
Residual 19 8.586647 0.451929
Total 23 21.23245
   Coefficients   Standard Error   t Stat   P-value
Intercept 5.268241 0.392949 13.40693 3.9E-11
t X Variable 1 -0.0335 0.020088 -1.66768 0.111777
x1 X Variable 2 -2.04323 0.392778 -5.20199 5.07E-05
x2 X Variable 3 -1.15684 0.390201 -2.96472 0.007959
x3 X Variable 4 -1.09197 0.388647 -2.80966 0.011186
Conclusion: Since the computed |t| = 1.67 < t0.025 = 2.093, we do not reject H0, and conclude that the series is not trended.
Refit the model without the trend but with only X1, X2 and X3.
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.732388
R Square 0.536392
Adjusted R Square 0.466851
Standard Error 0.701553
Observations 24
ANOVA
df   SS   MS   F   Significance F
Regression 3 11.38892 3.796306 7.7133 0.00129
Residual 20 9.843532 0.492177
Total 23 21.23245
   Coefficients   Standard Error   t Stat   P-value
Intercept 4.799247 0.286408 16.75669 3.07E-13
x1 X Variable 1 -1.94273 0.405042 -4.79636 0.00011
x2 X Variable 2 -1.08984 0.405042 -2.69069 0.014062
x3 X Variable 3 -1.05847 0.405042 -2.61323 0.016643
Conclusion: Since the computed F = 7.71 > 3.10, we reject H0, and conclude that the series has
seasonality.
3.4 Least Squares Seasonal Modeling
In the test for seasonality there were L − 1 seasonal factors instead of L. Since we have the L − 1 indicators and we know that the seasonal factors sum to zero for the additive model (and to L for a multiplicative model), it is easy to determine the one missing factor.
With L − 1 dummies the fitted model is
ŷt = β̂0* + β̂1*t + Ŝ1*x1 + Ŝ2*x2 + … + Ŝ*L−1xL−1
With L dummies it is
ŷt = β̂0 + β̂1t + Ŝ1x1 + Ŝ2x2 + … + ŜLxL
Regression can be used to produce β̂0*, β̂1*, Ŝ1*, Ŝ2*, …, Ŝ*L−1, with ŜL* = 0, so that
ŷt = β̂0* + β̂1*t + Ŝ1*x1 + Ŝ2*x2 + … + Ŝ*L−1xL−1 + ŜL*xL
The sum of Ŝ1*, Ŝ2*, …, ŜL* is not zero as required, which implies that we have to normalize the coefficients. Letting S̄* be the mean of the L starred seasonal coefficients:
Ŝi = Ŝi* − S̄*  for i = 1, 2, …, L − 1
ŜL = −S̄*
β̂0 = β̂0* + S̄*
β̂1 = β̂1*
ŷt = β̂0 + β̂1t + Ŝ1x1 + Ŝ2x2 + … + ŜLxL for an additive model
or
ŷt = β̂0 β̂1^t Ŝ1^x1 Ŝ2^x2 … ŜL^xL for a multiplicative model
Example 3.5: Refer to Example 3.2, for which we fitted a linear trend and seasonal factors, yielding the following regression output:
Regression Statistics
Multiple R 0.97959
R Square 0.959596
Adjusted R Square 0.948822
Standard Error 1.113403
Observations 20
ANOVA
df   SS   MS   F   Significance F
Regression 4 441.633 110.4083 89.06285 2.9E-10
Residual 15 18.595 1.239667
Total 19 460.228
We observed earlier, while testing for trend and seasonality, that the trend and seasonal components are significant. The mean of the starred seasonal coefficients is
S̄* = (5.95875 − 2.7875 − 6.93375 + 0)/4 = −0.9406
Ŝ1 = Ŝ1* − S̄* = 5.95875 − (−0.9406) = 6.8994
Ŝ2 = Ŝ2* − S̄* = −2.7875 − (−0.9406) = −1.8469
Ŝ3 = Ŝ3* − S̄* = −6.93375 − (−0.9406) = −5.9932
Ŝ4 = −S̄* = 0.9406
β̂0 = β̂0* + S̄* = 8.525 + (−0.9406) = 7.5844
β̂1 = β̂1* = 0.18625
ŷt = 7.5844 + 0.18625t + 6.8994x1 − 1.8469x2 − 5.9932x3 + 0.9406x4
In case of a multiplicative model with a simple exponential trend, fit a simple exponential trend
on the cotton series. Regress Ln(yt) on t, X1 , X2 and X3
The results of the regression are as follows:
Regression Statistics
Multiple R 0.8687991
R Square 0.7548119
Adjusted R Square 0.7031933
Standard Error 0.2618031
Observations 24
ANOVA
df   SS   MS   F   Significance F
Regression   4   4.0090582   1.00226455   14.622879   1.2957E-05
Residual   19   1.3022761   0.06854085
Total   23   5.3113343
   Coefficients   Standard Error   t Stat   P-value
Intercept   4.8007481   0.1530298   31.3713257   7.866E-18
t  X Variable 1   0.0253042   0.0078229   3.23464779   0.0043616
X1 X Variable 2   -0.9555109   0.1529631   -6.24667361   5.34E-06
X2 X Variable 3   -0.6138307   0.1519597   -4.039432   0.0007002
X3 X Variable 4   -0.4884804   0.1513544   -3.22739559   0.0044332
𝛽̂0∗ = 4.8007
𝛽̂1∗ = 0.0253
𝑆̂1∗ = -0.9555
𝑆̂2∗ = -0.6138
𝑆̂3∗ = -0.4885
Ŝ1′ = Ŝ1* − S̄* = −0.9555 − (−0.5145) = −0.4411
Ŝ2′ = Ŝ2* − S̄* = −0.6138 − (−0.5145) = −0.0993
Ŝ3′ = Ŝ3* − S̄* = −0.4885 − (−0.5145) = 0.0260
Ŝ4′ = −S̄* = 0.5145
Taking antilogs gives Ŝ1 = e^−0.4411 = 0.6434, Ŝ2 = e^−0.0993 = 0.9055, Ŝ3 = e^0.0260 = 1.0263 and Ŝ4 = e^0.5145 = 1.6728.
The sum of the seasonal factors should be L = 4, hence we adjust them (dividing by their mean S̄ = 1.062) so that they sum to 4:
S1 = Ŝ1/S̄ = 0.6434/1.062 = 0.606
S2 = Ŝ2/S̄ = 0.9055/1.062 = 0.853
S3 = Ŝ3/S̄ = 1.0263/1.062 = 0.966
S4 = Ŝ4/S̄ = 1.6728/1.062 = 1.575
Origin: IV 2004
Units of (t): quarters and
Units (y) : Million tons
Exercise 3: cotton sales in million tons
Chapter Four: Cyclical Component
4.1 Introduction
Cyclic component is defined as long swings away from trend that are due to factors other than
seasonality. It’s important to note that cycles occur over a number of years. The up-down
oscillations of a cycle rarely repeat at fixed intervals of time and the amplitude of the fluctuations
may also vary.
A time series yt is said to have a cyclical component if the average value of yt changes over time such that:
Additive model: E(yt) = f(β0, β1, β2, …; t) + Ct
or
Multiplicative model: E(yt) = f(β0, β1, β2, …; t) × Ct
Psychological forces may contribute to swings in a series for example a series linked to
popular tastes such as food, music or fashions exhibits cyclic behavior. A cycle starts
with a few people who are adherents to the particular fashion which eventually swells and
reaches a peak. Once the peak is reached then disinterest sets in causing a decline in
popularity;
Institutional causes include public policy like a policy to recruit more police to curb the
crime rate. The increase in the number of police officers may lead to a decline in the
crime rate which may again prompt government to reduce the police force because of
increased costs hence contributing to another round of increase in the crime rate;
Replacement cycles arise due to introduction of new products in the market which may
undergo rapid sales at the beginning which will eventually level off when the market
saturation approaches. Eventually, the early customers find themselves with old and
broken, worn out products which they may have to replace, this may result in another
round of growth;
Education - demand for some fields often undergoes cycles because students go for
courses with high prospects of employment. After a few years the demand in those areas
slackens once the demand is met leading to a number of qualified job seekers not getting
jobs causing students to avoid those areas;
Predator/prey relationships, where certain prey are known to have distinct predators. In that case too few predators lead to increased breeding of the prey, increasing the predators' food supply; too many predators eventually compete for and reduce the number of prey, reducing the food supply and leading to a diminished number of predators;
Combined causes arise due to the possibility of more than one cause acting at a time, such as education combined with population cycles, etc.
The test for presence of cycles may be based upon the von Neumann’s ratio test which is a non-
parametric test applied to the ranks of the data set.
Hypothesis
H0: the residuals are independent (no cycles)
H1: the residuals are positively autocorrelated, i.e. cycles are present
Test statistic: RM = Σ(t=1..n−1)(Rt+1 − Rt)² / Σ(Rt − R̄)² = Σ(t=1..n−1)(Rt+1 − Rt)² / [n(n² − 1)/12]
where Rt is the rank of the t-th detrended value.
Decision criteria, Reject H0, if the computed RM is less than the tabulated RMα
Note: RMα is the lower α x 100% point in the distribution of von Neumann’s rank ratio.
Conclusion: If H0 is rejected, we conclude that there is a positive autocorrelation in the residuals
that may be due to cycles and if it is not rejected we conclude that the residuals are independent.
Compute the trend values (column 4) then detrend the series which depends on whether you
assume an additive or a multiplicative model. In case you assume an additive model you obtain
column 5. Rank the detrended series and obtain successive differences between the ranks.
(columns 6 and 7). Lastly, compute the squared differences and sum them as shown in column 8.
RM = Σ(Rt+1 − Rt)² / [n(n² − 1)/12] = 368 / [15(15² − 1)/12] = 368/280 = 1.31
Taking α = 10% in order to test for a cyclic component, the critical value is RM0.10 = 1.36.
Conclusion: Since the computed RM = 1.31 < 1.36, we conclude that there is significant (α = 10%) positive autocorrelation in the residuals, and a cycle may be the cause.
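A small Python sketch of the rank-ratio computation (the detrended series of the worked example is not reproduced here, so the function is shown together with the arithmetic of the example):

import numpy as np
from scipy.stats import rankdata

def von_neumann_rank_ratio(residuals):
    """von Neumann ratio applied to the ranks of a detrended series (sketch, no ties assumed)."""
    r = rankdata(residuals)                    # ranks of the detrended values
    n = len(r)
    num = np.sum(np.diff(r) ** 2)              # sum of squared successive rank differences
    return num / (n * (n ** 2 - 1) / 12)

# With n = 15 and a squared-difference sum of 368 (the worked example),
# the ratio is 368 / 280 = 1.31.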
4.4 Estimation of Cyclical Movements
4.4.1 Residual Method
The residual method may be used to compute cyclic movements. Adjusting monthly data
involves elimination of seasonal variation and trend hence obtaining the cyclic irregular
movements.
Assuming a multiplicative model: yt = Tt x St x Ct x It
Next the data is smoothed in order to obtain cyclical components, which may be referred to as
cyclical relatives since they are always in percentages.
Deseasonalising
The computation of a seasonal index is very important since it may be used also for isolation of
the cyclic movements. The elimination of seasonal variation in a series could be realized by
dividing the original series by the seasonal index. The deseasonalised data may contain the three
components the trend, cyclic and irregular movements.
After removing the seasonal movements the trend component should be eliminated. On detrending the deseasonalised series you obtain the cyclical-irregular series. It does not matter whether you start by eliminating the seasonal component or the trend; as long as the two components are removed, the residual comprises the cyclic and irregular components.
4.4.3 Harmonic Analysis
A sine-cosine curve may be used to estimate cyclic movements if the cycles have about the same duration and amplitude. The sine-cosine curve may be fitted to the cyclical-irregular data after the irregular movements have been smoothed out. Such series are rare in business and the social sciences, so the method is not discussed further here.
Year Yt Year Yt
1991 8.29 2000 10.57
1992 8.00 2001 11.71
1993 8.57 2002 11.71
1994 8.86 2003 10.57
1995 9.71 2004 11.14
1996 9.71 2005 10.57
1997 9.43 2006 11.70
1998 8.86 2007 12.80
1999 9.71
Chapter Five: Exponential Smoothing
Suppose we have a time series denoted over time as Y1, Y2, …, Yt−1, Yt and we require forecasts Ft+1, Ft+2, …, Ft+r of future values of Y.
The single exponential smoothing forecast is
Ft+1 = αYt + (1 − α)Ft
where α is a given weight which shall be selected subject to the constraint 0 < α < 1. Thus Ft+1 is a weighted average of the current observation Yt and the forecast Ft made at the previous time point t − 1.
Repeated substitution gives
Ft+1 = (1 − α)^t F1 + α Σ(j=0..t−1) (1 − α)^j Yt−j
showing that the weight given to past observations Yt, Yt−1, Yt−2, … drops exponentially. The rate at which it drops depends upon α.
The single exponential smoothing needs to be initialized. A simple way to initialize the forecast
is to let:
F2 = Y1.
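A minimal Python sketch of single exponential smoothing with this initialisation (the illustrative series and the value of α are assumptions):

def simple_exponential_smoothing(y, alpha):
    """Single exponential smoothing: F_{t+1} = alpha*Y_t + (1 - alpha)*F_t, with F_2 = Y_1."""
    f = y[0]                      # F_2 initialised to Y_1
    forecasts = [f]
    for obs in y[1:]:
        f = alpha * obs + (1 - alpha) * f
        forecasts.append(f)
    return forecasts              # forecasts[k] is the forecast made after observing y[k]

# Example with an assumed alpha of 0.3 on an arbitrary short series
print(simple_exponential_smoothing([10, 12, 11, 13, 12], alpha=0.3))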
For a series with a trend, Holt's linear method uses two smoothing equations and a forecast equation:
Lt = αYt + (1 − α)(Lt−1 + bt−1)
bt = β(Lt − Lt−1) + (1 − β)bt−1
Ft+m = Lt + bt m
Where Lt is the estimated level and bt the estimated slope at time t, and α and β are smoothing parameters in (0, 1). A simple way to initialize is to set L1 = Y1 and b1 = 0.
If zero is not a typical value of the initial slope then a more careful estimate of the slope may be
needed to ensure that the initial forecasts are realistic.
Holt-Winter's Method, Multiplicative Seasonality
Lt = α(Yt / St−s) + (1 − α)(Lt−1 + bt−1)
bt = β(Lt − Lt−1) + (1 − β)bt−1
St = γ(Yt / Lt) + (1 − γ)St−s
Ft+r = (Lt + bt r) St−s+r
where s is the number of periods in one cycle of seasons e.g. number of months or quarters in a
year.
To initialize we need one complete cycle of data, i.e. s values. Then set
Ls = (Y1 + Y2 + … + Ys) / s
To initialize the trend we use s + k time periods:
bs = (1/k) [ (Ys+1 − Y1)/s + (Ys+2 − Y2)/s + … + (Ys+k − Yk)/s ]
If the series is long enough then a good choice is to make k = s so that two complete cycles are
used. However we can, at a pinch, use k = 1.
Sk = Yk / Ls,   k = 1, 2, …, s
The parameters α, β, γ should lie in the interval (0, 1).
60
5.3.2 Holt-Winter's Method, Additive Seasonality
Lt = α(Yt − St−s) + (1 − α)(Lt−1 + bt−1)
bt = β(Lt − Lt−1) + (1 − β)bt−1
St = γ(Yt − Lt) + (1 − γ)St−s
Ft+r = Lt + bt r + St−s+r
The initial values of Ls and bs can be as in the multiplicative case. The initial seasonal indices
can be taken as
Sk = Yk − Ls,   k = 1, 2, …, s.
The parameters α, β, γ should lie in the interval (0, 1).
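In practice the Holt-Winters recursions are usually run by software. The sketch below uses statsmodels on the quarterly cotton sales of Chapter Three; note that statsmodels chooses the smoothing parameters and initial values by optimisation rather than by the simple initialisation described above (use seasonal="mul" for the multiplicative variant):

import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Quarterly cotton sales from Chapter Three (2005 I to 2009 IV)
y = pd.Series([130, 45, 20, 90, 160, 60, 25, 125, 150, 75, 35, 80,
               170, 90, 50, 110, 175, 92, 40, 115],
              index=pd.date_range("2005-01-01", periods=20, freq="QS"))

# Additive trend with additive seasonality (Holt-Winters)
model = ExponentialSmoothing(y, trend="add", seasonal="add", seasonal_periods=4)
fit = model.fit()                 # alpha, beta, gamma chosen by optimisation
print(fit.forecast(4))            # forecasts for the next four quarters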
Chapter Six: Box-Jenkins Methodology
Box-Jenkins ARIMA modelling of time series was introduced in 1976 by Box and Jenkins.
ARIMA stands for AR = Autoregressive, I = Integrated, MA = Moving Average.
6.1 Stationarity
A key concept underlying time series processes is that of stationarity. A time series is covariance stationary when it has the following three characteristics: a constant mean, a constant variance, and covariances between values that depend only on the distance (lag) separating them and not on time itself.
Stationarity is important because if the series is non-stationary then all the typical results of the
classical regression analysis are not valid. Regressions with non-stationary series may have no
meaning and therefore called “spurious”. The long-term forecast of a stationary series will
converge to the unconditional mean of the series.
Yt = αYt−1 + ut ………………………………………………………(1)
where we do not include a constant in the AR(1) model, |α| < 1, and ut is white noise (an error term).
For equation (1) the constraint is that |α| < 1. If α > 1, then Yt is trended; it will grow bigger and bigger with time and the time series will explode.
Yt = φ1Yt−1 + φ2Yt−2 + ut …………………………………………..… (2)
and, in general, the AR(p) model is Yt = φ1Yt−1 + φ2Yt−2 + … + φpYt−p + ut.
We may use the lag operator L (which has the property L^n Yt = Yt−n) and write the AR(p) model as
Φ(L)Yt = ut, where Φ(L) = 1 − φ1L − φ2L² − … − φpL^p.
The AR(p) process is stationary only if the p roots of the polynomial equation Φ(z) = 0 are greater than 1 in absolute value, where z is a real variable.
Using the polynomial notation, the condition for the AR(1) process reduces to the following:
(1 − αz) = 0 ……………………………………………………(7)
The root of this equation is z = 1/α, so the requirement that it be greater than 1 in absolute value becomes
|1/α| > 1 ……………………………………………………………….(8)
i.e. |α| < 1 …………………………………………………………………….(9)
For the AR(p) model to be stationary, the sum of the p autoregressive coefficients should be less than 1:
φ1 + φ2 + … + φp < 1
Repeated substitution in the AR(1) model gives
Yt = α^t Y0 + (α^(t−1)u1 + α^(t−2)u2 + … + α⁰ut)
and since |α| < 1, α^t will be close to zero for large t. Thus we have that
E(Yt) ≈ 0
and
Var(Yt) = Var(αYt−1 + ut) = α²σY² + σu², so that σY² = σu²/(1 − α²) ……………….. (12)
For the AR(1) model the autocovariance function will be given by
Cov(Yt, Yt−1) = E[YtYt−1] − E(Yt)E(Yt−1) = E(YtYt−1)
= E[(αYt−1 + ut)Yt−1] = αE(Yt−1²) + E(utYt−1) = ασY² ………………………….(15)
and in general:
Cov(Yt, Yt−k) = α^k σY²
Cor(Yt, Yt−k) = Cov(Yt, Yt−k) / √(Var(Yt)Var(Yt−k)) = α^k σY² / σY² = α^k ……..………………(18)
For an AR(1) time series the autocorrelation function (ACF), a graph showing a plot of Cor(Yt, Yt−k) against k, is called the correlogram, and it decays exponentially as k increases. Finally, the partial autocorrelation function (PACF) involves plotting the estimated coefficient of Yt−k from an ordinary least squares (OLS) estimate of an AR(k) process, against k.
If the observations are generated by an AR(p) process then the theoretical partial autocorrelations
will be high and significant for up to p lags and zero for lags beyond p.
Yt = ut + θut−1 ……………………………………………………….(19)
More generally, an MA(q) process can be written as
Yt = (1 + θ1L + θ2L² + … + θqL^q)ut = Θ(L)ut ………………………………….(22)
Because any MA(q) process is, by definition, an average of q stationary white-noise processes, it
follows that every moving average model is stationary, as long as q is finite.
A property often discussed in connection with the moving average processes is that of invertibility.
A time series Yt is invertible if it can be represented by a finite-order MA or convergent
autoregressive process.
Invertibility is important because the use of the ACF and PACF for identification implicitly assumes that the Yt sequence can be well approximated by an autoregressive model.
Yt = ut + θut−1 ………………………….…………………………….(23)
Yt = (1 + θL)ut, so that
Yt / (1 + θL) = ut ………………………………………………………(24)
If |θ| < 1, then the left-hand side of (24) can be considered as the sum of an infinite geometric progression.
For the MA(1) process:
Yt = ut + θut−1 and ut−1 = Yt−1 − θut−2
Repeating this substitution an infinite number of times we finally get
ut = Yt − θYt−1 + θ²Yt−2 − θ³Yt−3 + … ……………………………………….(25)
The MA(1) process has been inverted into an infinite-order AR process with geometrically declining weights.
In general the MA(q) processes are invertible if the roots of the polynomial equation
Θ(z) = 0 ………………………………………………………. (26)
are greater than 1 in absolute value.
The mean of the MA process is clearly equal to zero, as it is the mean of white noise terms. The variance is given by:
Var(Yt) = σu²(1 + θ²) ……………………………………………………. (27)
Cov(Yt, Yt−1) = E[(ut + θut−1)(ut−1 + θut−2)] ……………………. (28)
= E(utut−1) + θE(utut−2) + θE(ut−1²) + θ²E(ut−1ut−2)
= θσu² ……………………………………………………. (29)
From this we can understand that for the MA(1) process the autocorrelation function will be:
Cor(Yt, Yt−k) = Cov(Yt, Yt−k) / √(Var(Yt)Var(Yt−k)) = θσu² / (σu²(1 + θ²)) = θ/(1 + θ²) for k = 1, and 0 for k > 1 ……………….(31)
So, if we have an MA(q) model we expect the correlogram (ACF) to have spikes up to lag q and then drop to zero immediately. The partial autocorrelation function (PACF) for an MA process should decay slowly.
We can combine the two processes to give a new family of models called ARMA(p, q) models. The general form of the ARMA(p, q) model is
Yt = φ1Yt−1 + … + φpYt−p + ut + θ1ut−1 + … + θqut−q
If the series itself is non-stationary it must first be made stationary through differencing or detrending, where the first difference is
ΔYt = Yt − Yt−1 …………………………………..(35)
Most economic and financial time series exhibit trends to some degree, so we commonly end up taking first differences of the time series. If, after first differencing, a series is stationary then the series is called integrated of order one, denoted I(1).
The Box-Jenkins approach to fitting an ARIMA model involves three stages:
(a) identification,
(b) estimation, and
(c) diagnostic checking.
Identification
A comparison of the sample ACF and PACF to those of various theoretical ARIMA processes may
suggest several plausible models. If the series is non-stationary the ACF of the series will not die
down or show signs of decay at all. A common stationarity inducing transformation is to take
logarithms and then first differences of the series. Once we have achieved stationarity, the next step is to identify the p and q orders of the ARIMA model.
Process             ACF                                 PACF
Pure white noise    All autocorrelations are zero       All partial autocorrelations are zero
AR(p)               Decays towards zero                 Cuts off to zero after lag p
MA(q)               Cuts off to zero after lag q        Decays towards zero
ARMA(p, q)          Decays towards zero                 Decays towards zero
Estimation
In this second stage, the candidate models suggested at the identification stage are estimated and compared using the AIC and BIC.
Diagnostic checking
In the final stage the residuals of the chosen model are examined; if they behave like white noise the model is taken to be adequate, otherwise the identification-estimation-checking cycle is repeated with a revised specification.
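A hedged sketch of the three stages in Python using statsmodels (the series is simulated for illustration; with real data, dy would be the differenced or log-differenced series under study):

import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.stattools import acf, pacf

# Illustrative data: a random-walk-like series, made stationary by first differencing
rng = np.random.default_rng(0)
level = pd.Series(rng.normal(size=200)).cumsum()
dy = level.diff().dropna()

# Identification: inspect the sample ACF and PACF of the differenced series
print(acf(dy, nlags=10))
print(pacf(dy, nlags=10))

# Estimation: fit candidate models and compare them with AIC/BIC
candidates = [(1, 0, 0), (0, 0, 1), (1, 0, 1)]
fits = {order: ARIMA(dy, order=order).fit() for order in candidates}
for order, res in fits.items():
    print(order, round(res.aic, 1), round(res.bic, 1))

# Diagnostic checking: residuals of the chosen model should resemble white noise
best = min(fits.values(), key=lambda r: r.aic)
print(acf(best.resid, nlags=10))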
References
Box, George E. P. and Gwilym M. Jenkins (1976). Time Series Analysis. Revised edition. Oakland, CA: Holden-Day.
Croxton, Frederick E., Dudley J. Cowden and Sidney Klein (1967). Applied General Statistics. Third edition. London: Sir Isaac Pitman and Sons Ltd.
Kendall, M. G. and Stuart, A. (1968). The Advanced Theory of Statistics, Volume 3: Design and Analysis, and Time-Series. 2nd edition. London: Griffin.