
Understanding Time Series

With Exercises

Lecture Notes

Yeko Mwanga (PhD)

February 2021

Chapter One: Introduction

This chapter introduces the time series decomposition method and explores the various
parametric and non-parametric tests for a stationary (no-trend) series. It concludes with the
estimation of the stationary model.

1.1 Decomposition

A time series is a collection of observations made sequentially in time. e.g. annual Gross
Domestic Product, monthly coffee sales, quarterly tea production, etc.

The time series can be decomposed into four components, namely: Trend, Seasonal, Cyclic and
Irregular term. There are two approaches to decomposing a series, i.e. the additive and multiplicative
models. The additive model assumes a sum of components while the multiplicative model assumes a
product of components. The additive and multiplicative models are expressed as follows:

Additive: yt = Tt + St + Ct + It
Where:
yt – Actual series
Tt – Trend
St – Seasonal
Ct – Cyclic component
It – Irregular term

Multiplicative: yt = Tt x St x Ct x It

Where:
yt – Actual series
Tt – Trend
St – Seasonal
Ct – Cyclic component
It – Irregular term

In an additive model all the components are measured in the same units as the original series (yt).
In the case of the multiplicative model only the trend is measured in the same units as the
original series, while the other components are unitless since they are indexes.

Trend

It is the long-term tendency of the series to either rise or fall, and it is at times known as the secular
trend.

Chart 1.1: Plot of the Trend

[Line chart of y against Time illustrating a trend]

Seasonality

These are periodic fluctuations in the series within a year. Such fluctuations form a relatively
fixed pattern that tends to repeat year in year out. These fluctuations are attributable to weather
changes and social customs or various institutional arrangements like Christmas, public holidays,
summer etc.

Chart 1.2: Seasonal

[Line chart of y against Time illustrating a repeating seasonal pattern]

Cyclic component

The cyclic component is like the seasonal component in that it is a wave-like pattern with ups and downs.
The difference is that cycles are viewed as broad contractions and expansions that take several
years, not playing out within a year. The length of time between successive peaks of a cycle is not
necessarily fixed as it is for seasonality.

Chart 1.3: Cyclic Component

[Line chart of y against Time illustrating a cyclic pattern]

Irregular Term

This is the residual movement after accounting for the trend, seasonality and cyclic component.

Chart 1.4: Irregular Term

[Line chart of y against Time fluctuating irregularly about zero]

1.2 Stationary Series/No Trend
1.2.1 Definition
A series is said to be stationary if it appears about the same on average irrespective of
when it was observed.

yt = βo + εt for t = 1, 2, 3, …

Where: yt - actual values of the series

βo – average level of the series

εt – random variable (irregular term)

The random variable is assumed to be independent with a mean value of zero.

Chart 1.5: Stationary Series

[Line chart of y against Time fluctuating about a constant level]

1.2.2 Causes of stationary series

A stationary series may arise in the following circumstances:

• Stable environment – this is when the forces generating the series have stabilized and the
environment in which the series exists is relatively unchanging, e.g. the mature stage of the
life cycle of a product such as a new smearing jelly;
• Easily correctable trend – stability may be realized by making simple corrections
for factors such as population growth and inflation;
• Short forecasting horizon – a trend may be present, but the period for which the
forecasts are needed is so short that the amount due to trend is negligible;
• Transformable series – some series may be mathematically transformed into a stable one
by taking logarithms, square roots, differencing, etc.;
• Residual analysis – analysis of a residual series may result in a horizontal pattern or
stationary series;
• Preliminary stages of model development – a simple model may be required for ease of
explanation and interpretation.

1.2.3 Non parametric tests for stationarity/no trend

There are four tests which may be used to determine the existence of stationarity, namely: the runs
test, turning points test, sign test and Daniels test.

a) Runs Test

The median must first be computed; let the median be ỹ. Assign a plus to observations above the
median and a minus to those below it. Then count the number of “runs” or blocks of pluses and minuses.

Let R be the number of runs in a random sequence of m pluses and m minuses.

The mean of R (expected number of runs): μR = m + 1

Standard deviation of R: SR = √[m(m − 1)/(2m − 1)]

Hypothesis

H0: The series is stationary

H1: The series is non-stationary or trended

Compute Z = |R − μR| / SR

Decision criteria: reject H0 if the computed Z is greater than the tabulated Zα/2.

Note that Zα/2 is the upper (α/2 × 100%) point of the standard normal distribution.
Conclusion: If H0 is rejected, the test has shown with (1-α) x 100% confidence that the series is
non-stationary and if H0 is not rejected, the test has shown some support for using a stationary or
no trend model.

Example 1.1: Coffee exports of country X shall be used to test for stationarity using the
“runs” test.
Arrange the series in ascending or descending order and compute the median: the median = 7.85.

The number of runs is R = 12.

The expected number of runs is the mean μR = m + 1 = 12 + 1 = 13

SD: SR = √[m(m − 1)/(2m − 1)] = √[(12 × 11)/((2 × 12) − 1)] = 2.396

Table 1.1: Exports of coffee (yt) in country X


T yt Sign Runs
1 7.2 -
2 6.4 - 1
3 6.2 -
4 8.3 + 2
5 8.4 +
6 6.9 - 3
7 7.6 -
8 8.2 +
9 9.3 + 4
10 8.3 +
11 6.6 -
12 5.9 - 5
13 7.6 -
14 8.5 + 6
15 6.8 - 7
16 7.9 + 8
17 7.8 -
18 6.9 - 9
19 8.8 +
20 9.5 + 10
21 7.9 +
22 7.4 - 11
23 8.7 +
24 9.2 + 12

On substituting for the mean and standard deviation in the computation for Z:

Z = |R − μR| / SR = |12 − 13| / 2.396 = 0.417

Taking α = 5% in order to test for stationarity or no trend, Zα/2 = 1.96.

Conclusion: Since 0.417 is less than 1.96, we do not reject the null hypothesis and conclude that
there is some support for a stationary series or horizontal model at the 95% confidence level.
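To make the procedure concrete, here is a minimal Python sketch of the runs test (the helper name runs_test is not from the notes); it assumes an even number of observations with none equal to the median, so that there are m pluses and m minuses:

```python
# A minimal sketch of the runs test (hypothetical helper, assumes an even
# series length with no observations equal to the median).
import math

def runs_test(y):
    mid = sorted(y)[len(y) // 2 - 1 : len(y) // 2 + 1]
    median = sum(mid) / 2                                 # median = 7.85 here
    signs = ['+' if v > median else '-' for v in y]
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    m = len(y) // 2                                       # m pluses, m minuses
    mu = m + 1                                            # expected runs
    sd = math.sqrt(m * (m - 1) / (2 * m - 1))             # SD of R
    return abs(runs - mu) / sd                            # compare with z(a/2)

y = [7.2, 6.4, 6.2, 8.3, 8.4, 6.9, 7.6, 8.2, 9.3, 8.3, 6.6, 5.9,
     7.6, 8.5, 6.8, 7.9, 7.8, 6.9, 8.8, 9.5, 7.9, 7.4, 8.7, 9.2]
print(runs_test(y))  # ≈ 0.417, below 1.96, so H0 is not rejected
```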

b) Turning Points Test

A turning point in a time series is a point where the series changes direction. Each turning
point represents either a local “peak” or a local “trough” in the series.

In order to determine a turning point assign a plus or minus to a period depending on whether its
first difference yt – yt-1 is positive or negative. A plus indicates that the series went up in the
period and minus implies that it went down. A turning point is a time period whose sign is
different from that of the next period.

The test statistic U = the number of turning points.

Hypothesis

H0: The series is stationary

H1: The series is non-stationary or trended


Compute Z = |U − μU| / SU

Mean of U: μU = 2(n − 2)/3

Standard deviation of U: SU = √[(16n − 29)/90]

Decision criteria: reject H0 if the computed Z is greater than the tabulated Zα/2.

Note that Zα/2 is the upper (α/2 × 100%) point of the standard normal distribution.

Conclusion: If H0 is rejected, the test has shown with (1-α) x 100% confidence that the series is
non-stationary and if H0 is not rejected, the test has shown some support for using a stationary or
no trend model.

Example 1.2: Coffee exports shall be used to test for stationarity using the turning points
test.

Count the number of turning points in the series shown in Table 1.2: U = 11

Mean of U: μU = 2(n − 2)/3 = 2(24 − 2)/3 = 14.667

Standard deviation of U: SU = √[(16n − 29)/90] = √[(16 × 24 − 29)/90] = 1.986

Table 1.2: Exports of coffee (yt) in country X


T yt yt – yt-1 Turning
point
1 7.2 __
2 6.4 -
3 6.2 -
4 8.3 + 1
5 8.4 +
6 6.9 - 2
7 7.6 + 3
8 8.2 +
9 9.3 +
10 8.3 - 4
11 6.6 -
12 5.9 -
13 7.6 + 5
14 8.5 +
15 6.8 - 6
16 7.9 + 7
17 7.8 - 8
18 6.9 -
19 8.8 + 9
20 9.5 +
21 7.9 - 10
22 7.4 -
23 8.7 + 11
24 9.2 +

Z = |U − μU| / SU = |11 − 14.667| / 1.986 = 1.846

Conclusion: Since 1.846 is less than 1.96, we do not reject the null hypothesis and conclude that
there is some support for a stationary series or horizontal model at the 95% confidence level.
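A minimal Python sketch of the turning points test follows (hypothetical helper; it assumes no zero first differences in the series):

```python
# A minimal sketch of the turning points test (hypothetical helper).
import math

def turning_points_test(y):
    signs = [b > a for a, b in zip(y, y[1:])]               # signs of diffs
    u = sum(1 for a, b in zip(signs, signs[1:]) if a != b)  # turning points U
    n = len(y)
    mu = 2 * (n - 2) / 3                                    # mean of U
    sd = math.sqrt((16 * n - 29) / 90)                      # SD of U
    return abs(u - mu) / sd                                 # compare with z(a/2)

y = [7.2, 6.4, 6.2, 8.3, 8.4, 6.9, 7.6, 8.2, 9.3, 8.3, 6.6, 5.9,
     7.6, 8.5, 6.8, 7.9, 7.8, 6.9, 8.8, 9.5, 7.9, 7.4, 8.7, 9.2]
print(turning_points_test(y))  # ≈ 1.846, below 1.96: do not reject H0
```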

c) Sign Test

Once the signs of the first differences have been determined as done for the turning points test, a
sign test may be used i.e. assign a plus or minus to a period depending on whether its first
difference yt – yt-1 is positive or negative. A plus indicates that the series went up in the period
and minus implies that it went down.

The test statistic V = the number of positive first differences in the series.

Hypothesis

H0: The series is stationary

H1: The series is non-stationary or trended

Compute Z = |V − μV| / SV

Mean of V: μV = n′/2

Standard deviation of V: SV = √n′ / 4

𝑛́ is the number of non-zero first differences.

Decision criteria, reject H0, if the computed Z is greater than the tabulated 𝑍𝛼 ,
2

𝛼
Note that 𝑍𝛼 is the upper ( 2 x 100%) point of the standard normal distribution.
2

Conclusion: If H0 is rejected, the test has shown with (1-α) x 100% confidence that the series is
non-stationary, and if H0 is not rejected, the test has shown some support for using a stationary or
no trend model. If V is greater than μV we conclude that the trend is upward, and if it is less the
trend is downward.

Example 1.3:
Using the first differences computed in Example 1.2, count the number of positive first
differences.

The number of positive first differences V = 12

n′ = 23

Mean of V: μV = n′/2 = 23/2 = 11.5

Standard deviation of V: SV = √n′ / 4 = √23 / 4 = 1.1990

Table 1.3: Exports of coffee (yt) in country X


T yt yt – yt-1 Positive 1st
Difference
1 7.2 __
2 6.4 -
3 6.2 -
4 8.3 + 1
5 8.4 + 2
6 6.9 -
7 7.6 + 3
8 8.2 + 4
9 9.3 + 5
10 8.3 -
11 6.6 -
12 5.9 -
13 7.6 + 6
14 8.5 + 7
15 6.8 -
16 7.9 + 8
17 7.8 -
18 6.9 -
19 8.8 + 9
20 9.5 + 10
21 7.9 -
22 7.4 -
23 8.7 + 11
24 9.2 + 12

Compute Z = |V − μV| / SV = |12 − 11.5| / 1.1990 = 0.417

Conclusion: Since 0.417 is less than 1.96, we fail to reject the null hypothesis and conclude that
the series is stationary at the 95% confidence level.
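The sign test admits the same treatment; below is a minimal sketch using the notes' formulas (the helper name is hypothetical, and the SD follows the text, SV = √n′/4):

```python
# A minimal sketch of the sign test (hypothetical helper); SD formula
# follows the notes: SV = sqrt(n')/4, with n' the non-zero differences.
import math

def sign_test(y):
    diffs = [b - a for a, b in zip(y, y[1:]) if b != a]   # non-zero diffs
    n1 = len(diffs)
    v = sum(1 for d in diffs if d > 0)                    # positive diffs V
    return abs(v - n1 / 2) / (math.sqrt(n1) / 4)          # compare with z(a/2)

y = [7.2, 6.4, 6.2, 8.3, 8.4, 6.9, 7.6, 8.2, 9.3, 8.3, 6.6, 5.9,
     7.6, 8.5, 6.8, 7.9, 7.8, 6.9, 8.8, 9.5, 7.9, 7.4, 8.7, 9.2]
print(sign_test(y))  # ≈ 0.417 for the coffee series
```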

d) Daniels Test

This non-parametric test is based upon Spearman's rank correlation coefficient.

The test statistic: rs = 1 − 6Σdt² / [n(n² − 1)]

dt = t − rank(yt)

Hypothesis

H0: The series is stationary

H1: The series is non-stationary or trended

If the sample is small (n < 30), use the r table. Decision criteria: reject H0 if the
computed rs is greater than the tabulated rα/2.

If the sample is large (n > 30), compute Z = |rs − μr| / Sr

Mean: μr = 0

Standard deviation: Sr = 1/√(n − 1)

Decision criteria: reject H0 if the computed Z is greater than the tabulated Zα/2.

Note that Zα/2 is the upper (α/2 × 100%) point of the standard normal distribution.

Conclusion: If H0 is rejected, the test has shown with (1-α) x 100% confidence that the series is
non-stationary and if H0 is not rejected, the test has shown some support for using a stationary or
no trend model. If the computed rs is negative we conclude that the trend is downward and if it is
positive it is an upward trend.

Example 1.4: Coffee exports shall be used to test for stationarity using Daniels test

rs = 1 − 6Σdt² / [n(n² − 1)] = 1 − (6 × 1402) / [24(576 − 1)] = 0.3904

Assume α = 5% and read the r table: r0.025 = 0.555

Conclusion: Since the computed rs = 0.3904 < 0.555, we fail to reject the null hypothesis and
conclude with 95% confidence that there is no trend, i.e. the series is stationary.

Table 1.4: Exports of coffee (yt) in country X
t yt Rank(yt) dt dt²
1 7.2 8 -7 49
2 6.4 3 -1 1
3 6.2 2 1 1
4 8.3 16 -12 144
5 8.4 18 -13 169
6 6.9 6 0 0
7 7.6 10 -3 9
8 8.2 15 -7 49
9 9.3 23 -14 196
10 8.3 16 -6 36
11 6.6 4 7 49
12 5.9 1 11 121
13 7.6 10 3 9
14 8.5 19 -5 25
15 6.8 5 10 100
16 7.9 13 3 9
17 7.8 12 5 25
18 6.9 6 12 144
19 8.8 21 -2 4
20 9.5 24 -4 16
21 7.9 13 8 64
22 7.4 9 13 169
23 8.7 20 3 9
24 9.2 22 2 4
Σdt² = 1402
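For completeness, a minimal sketch of Daniels' test (hypothetical helper; ties are assigned the minimum rank, matching the convention in Table 1.4):

```python
# A minimal sketch of Daniels' test via Spearman's coefficient
# (hypothetical helper; ties receive the minimum rank, as in Table 1.4).
def daniels_test(y):
    n = len(y)
    ranks = [1 + sum(1 for v in y if v < yi) for yi in y]
    d2 = sum((t - r) ** 2 for t, r in zip(range(1, n + 1), ranks))  # sum dt^2
    return 1 - 6 * d2 / (n * (n * n - 1))    # rs; compare with the r table

y = [7.2, 6.4, 6.2, 8.3, 8.4, 6.9, 7.6, 8.2, 9.3, 8.3, 6.6, 5.9,
     7.6, 8.5, 6.8, 7.9, 7.8, 6.9, 8.8, 9.5, 7.9, 7.4, 8.7, 9.2]
print(daniels_test(y))  # ≈ 0.3904 (with sum dt^2 = 1402)
```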

1.2.4 The Pearson’s parametric test for stationarity/no trend


The Pearson's test will be used to determine stationarity or the presence of a trend in the time series.

This test detects primarily the presence of a linear trend; it may not detect a trend that is
curvilinear. The coefficient is computed as follows:

Stt = Σ(t − t̄)² = Σt² − (Σt)²/n

Syy = Σ(y − ȳ)² = Σy² − (Σy)²/n

Sty = Σ(t − t̄)(y − ȳ) = Σty − (Σt)(Σy)/n

r = Sty / √(Stt·Syy)

tr = r√(n − 2) / √(1 − r²)

Decision criteria: reject H0 if the computed tr is greater than the tabulated tα/2,

where tα/2 is the upper (α/2 × 100%) point of the Student's t-distribution with n − 2 degrees of
freedom.

Conclusion: If H0 is rejected, the test has shown with (1-α) x 100% confidence, that the series is
non-stationary and if H0 is not rejected, the test has shown some support for using a stationary or
no trend model. If the computed tr is negative we conclude that the trend is downward and if it is
positive it is an upward trend.

Example 1.5:
Using the coffee exports in Example 1.1, compute the Pearson's correlation to test for
stationarity of the series.

Stt = Σt² − (Σt)²/n = 4900 − 300²/24 = 1150

Syy = Σy² − (Σy)²/n = 1469.35 − 186.3²/24 = 23.1962

Sty = Σty − (Σt)(Σy)/n = 2393.9 − (300 × 186.3)/24 = 65.15

r = Sty / √(Stt·Syy) = 65.15 / √(1150 × 23.1962) = 0.3989

tr = r√(n − 2) / √(1 − r²) = 0.3989√22 / √(1 − 0.3989²) = 2.04

Assume α = 5% and read the t table with n − 2 = 22 degrees of freedom: t0.025 = 2.074

Conclusion: Since the computed tr = 2.04 < 2.074, we fail to reject the null hypothesis and
conclude with 95% confidence that there is no trend, i.e. the series is stationary.
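A minimal sketch of the Pearson test (hypothetical helper) reproducing the computations of Example 1.5:

```python
# A minimal sketch of the Pearson trend test (hypothetical helper).
import math

def pearson_trend_test(y):
    n = len(y)
    t = range(1, n + 1)
    st, sy = sum(t), sum(y)
    stt = sum(v * v for v in t) - st ** 2 / n
    syy = sum(v * v for v in y) - sy ** 2 / n
    sty = sum(a * b for a, b in zip(t, y)) - st * sy / n
    r = sty / math.sqrt(stt * syy)
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)   # t with n-2 df

y = [7.2, 6.4, 6.2, 8.3, 8.4, 6.9, 7.6, 8.2, 9.3, 8.3, 6.6, 5.9,
     7.6, 8.5, 6.8, 7.9, 7.8, 6.9, 8.8, 9.5, 7.9, 7.4, 8.7, 9.2]
print(pearson_trend_test(y))  # ≈ 2.04
```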

Table 1.5: Exports of coffee (yt) in country X
t yt tyt yt2 t2
1 7.2 7.2 51.84 1
2 6.4 12.8 40.96 4
3 6.2 18.6 38.44 9
4 8.3 33.2 68.89 16
5 8.4 42.0 70.56 25
6 6.9 41.4 47.61 36
7 7.6 53.2 57.76 49
8 8.2 65.6 67.24 64
9 9.3 83.7 86.49 81
10 8.3 83.0 68.89 100
11 6.6 72.6 43.56 121
12 5.9 70.8 34.81 144
13 7.6 98.8 57.76 169
14 8.5 119.0 72.25 196
15 6.8 102.0 46.24 225
16 7.9 126.4 62.41 256
17 7.8 132.6 60.84 289
18 6.9 124.2 47.61 324
19 8.8 167.2 77.44 361
20 9.5 190.0 90.25 400
21 7.9 165.9 62.41 441
22 7.4 162.8 54.76 484
23 8.7 200.1 75.69 529
24 9.2 220.8 84.64 576
Σt=300 Σyt=186.3 Σtyt=2393.9 Σyt2=1469.35 Σt2=4900

1.2.5 Fitting a horizontal/stationary model

The horizontal model is easy to fit since it has one form only. The horizontal model is expressed
as follows:

yt = βo + εt

Using the least squares method to compute βo, this method requires the estimation of βo that
minimizes the sum of squared forecast errors.

Σyt = Σβo +Σεt

Σyt = nβo

βo = Σyt / n …………………………………………………………………..(1)

Using table 1.5 to fit a horizontal model, Σ yt =186.3 and n = 24


βo = Σyt / n = 186.3 / 24 = 7.76

ŷt = 7.76

Chart 1.6: Coffee exports (yt) in country X with a fitted stationary model

[Line chart of the coffee exports yt for t = 1 to 24 with the fitted horizontal line ŷt = 7.76]

Exercise 1.1: The annual GDP series for Country X

Year t GDP(y) Year t GDP(y)


1997 1 8.62 2005 9 14.81
1998 2 9.45 2006 10 15.86
1999 3 10.07 2007 11 17.14
2000 4 10.51 2008 12 18.93
2001 5 11.20 2009 13 19.71
2002 6 11.99 2010 14 20.93
2003 7 12.73 2011 15 22.17
2004 8 13.47

(i) Plot the GDP series and comment on it


(ii) Test for presence of trend using the runs test, turning points test, sign
test, Daniel's test and Pearson's test (assume α = 5%)

Chapter Two: Trend

This chapter introduces the trend and the tests applicable for the presence of a trend in a series,
and concludes with the estimation of the different types of trend models.

2.1 Introduction

A series is said to be trended if the expected value of the series changes over time such that:

E( yt) = f(βo, β1, β2,………;t) for t=1,2,3, ……..

Where yt = f(βo, β1, β2,………;t) + εt and f(βo, β1, β2,………;t) represents the trend.

2.2 Causes of Trend

The major causes of trend in economic time series are the following:

• Population changes – population growth leads to increased demand for a number
of commodities, reflected in series such as commodity sales, food consumption, etc.;
• Technological changes – technological advancements lead to increased productivity and as
such may lead to a general improvement in the standard of living. New products come onto
the market as a result of technological advancement; items which were once luxuries
become necessities and necessities become outdated;
• Changes in social customs – social customs change over time: people's tastes and habits
change, cigarette smoking may increase or decline, etc.;
• Inflation – affects interest rates, salaries and wages, and the purchasing power of the
Uganda shilling or any other currency, among others;
• Environmental conditions – changes in environmental conditions lead to an increase or
reduction of the money required for their management;
• Market acceptance – when a new product comes onto the market few people know it; as
time goes by more consumers become familiar with it, and the time series associated with that
product grows.

2.3 Tests for the trend

All the non-parametric and parametric tests used for making a decision about a stationary
series are applicable for the trend. The non-parametric tests are: the runs test, turning points test,
sign test and the Daniel's test. The parametric test considered in the previous chapter is the
Pearson's test.

2.4 Estimation of Trends

2.4.1 Linear Trend

A linear trend is one of the simplest mathematical models to be estimated. The line of best fit may
be computed mathematically using least squares. This is the line which minimizes the total
squared deviations of the actual observations from the calculated line. The general form of a
linear equation is as follows:

yt = βo + β1t+εt

Using the least squares method to compute βo and β1, this method requires that we derive the
normal equations which have to be solved simultaneously.

We derive the two normal equations required for computation of the two coefficients:

Σyt = Σβo + Σβ1t +Σεt

Σtyt = Σβot + Σβ1t2 +Σtεt

The assumptions are that Σεt = 0 and Σtεt = 0.

The normal equations are:

Σyt = nβo + β1Σt ………………………………………………………………………….2.1

Σtyt = βoΣt + β1Σt2 ……………………………………………………………………….2.2

Where n = number of observations in the series

The computational method to be used in deriving the two coefficients is as follows:

Stt = Σ(t − t̄)² = Σt² − (Σt)²/n

Syy = Σ(yt − ȳ)² = Σy² − (Σy)²/n

Sty = Σ(t − t̄)(y − ȳ) = Σty − (Σt)(Σy)/n

β̂1 = (nΣty − Σt·Σy) / (nΣt² − (Σt)²) ………………………………………………………2.3

β̂0 = ȳ − β̂1·t̄ …………………………………………………………………………...2.4

ŷt = β̂0 + β̂1t

R² = Explained variation / Total variation = Σ(ŷt − ȳ)² / Σ(yt − ȳ)²
Example 2.1: Cotton sales yt for the period 1997 to 2011

Year t Sales (yt) tyt t2 y2


1997 1 17 17 1 289
1998 2 28 56 4 784
1999 3 31 93 9 961
2000 4 40 160 16 1,600
2001 5 53 265 25 2,809
2002 6 58 348 36 3,364
2003 7 55 385 49 3,025
2004 8 57 456 64 3,249
2005 9 58 522 81 3,364
2006 10 68 680 100 4,624
2007 11 59 649 121 3,481
2008 12 70 840 144 4,900
2009 13 70 910 169 4,900
2010 14 75 1,050 196 5,625
2011 15 66 990 225 4,356

Σt=120 Σyt=805 Σtyt=7,421 Σt2=1,240 Σy2=47,331

β̂1 = (nΣty − Σt·Σy) / (nΣt² − (Σt)²) = (15 × 7421 − 120 × 805) / (15 × 1240 − 120²) = 3.5036

ȳ = 53.67 and t̄ = 8

β̂0 = ȳ − β̂1·t̄ = 53.67 − 3.5036 × 8 = 25.64

ŷt = β̂0 + β̂1t

Linear Trend: ŷt = 25.64 + 3.5036t

Comment: The sales increase by 3.5036 units per annum.


R² = Explained variation / Total variation = Σ(ŷt − ȳ)² / Σ(yt − ȳ)² = 0.83

The trend model explains 83% of the total variation.
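As a cross-check on Example 2.1, here is a minimal sketch of the least squares computation (the helper name linear_trend is not from the notes):

```python
# A minimal sketch of fitting the linear trend by least squares
# (hypothetical helper), reproducing Example 2.1.
def linear_trend(y):
    n = len(y)
    t = range(1, n + 1)
    st, sy = sum(t), sum(y)
    b1 = (n * sum(a * b for a, b in zip(t, y)) - st * sy) / \
         (n * sum(v * v for v in t) - st * st)
    b0 = sy / n - b1 * st / n
    return b0, b1

sales = [17, 28, 31, 40, 53, 58, 55, 57, 58, 68, 59, 70, 70, 75, 66]
print(linear_trend(sales))  # ≈ (25.64, 3.5036)
```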

2.4.2 Quadratic equation

This is the second degree polynomial and it is of the form:

yt = βo + β1t + β2t² + εt

We derive the three normal equations required for computation of the coefficients:

Σyt = Σβo + Σβ1t + Σβ2t2 +Σεt

Σtyt = Σβot + Σβ1t2 + Σβ2t3 + Σtεt

Σt2yt = Σβot2 + Σβ1t3 + Σβ2t4 + Σt2εt

The assumptions are that Σεt = 0, Σtεt = 0 and Σt²εt = 0.

These may be solved through the three normal equations:

Ʃyt = βon + β1Ʃt + β2Ʃt² ………………………………………………………….……2.5

Ʃtyt = βoƩt + β1Ʃt² + β2Ʃt³ …………….. .……………...………………..………………….. 2.6

Ʃt²yt = βoƩt² + β1Ʃt³ + β2Ʃt⁴ …………….. .………….……………………..………….…... 2.7

The measure of closeness of the fit to the data is the coefficient of determination, which is
computed using the formula:

ŷt = β̂0 + β̂1t + β̂2t²

R² = Explained variation / Total variation = Σ(ŷt − ȳ)² / Σ(yt − ȳ)²

Example 2.2: Cotton sales yt for the period 1997 to 2011

Year t Sales (yt) tyt t2 y2 t3 t4 ytt2

1997 1 17 17 1 289 1 1 17

1998 2 28 56 4 784 8 16 112

1999 3 31 93 9 961 27 81 279

2000 4 40 160 16 1,600 64 256 640

2001 5 53 265 25 2,809 125 625 1,325

2002 6 58 348 36 3,364 216 1,296 2,088

2003 7 55 385 49 3,025 343 2,401 2,695

2004 8 57 456 64 3,249 512 4,096 3,648

2005 9 58 522 81 3,364 729 6,561 4,698

2006 10 68 680 100 4,624 1,000 10,000 6,800

2007 11 59 649 121 3,481 1,331 14,641 7,139

2008 12 70 840 144 4,900 1,728 20,736 10,080

2009 13 70 910 169 4,900 2,197 28,561 11,830

2010 14 75 1,050 196 5,625 2,744 38,416 14,700

2011 15 66 990 225 4,356 3,375 50,625 14,850

Σt=120 Σyt=805 Σtyt=7,421 Σt2=1,240 Σy2=47,331 Σt3=14,400 Σt4=178,312 Σytt2=80,901

On substitution into the three normal equations you obtain the following:

805 = 15βo + 120β1 + 1240β2

7421 = 120βo + 1240β1 + 14400β2

80901 = 1240βo + 14400β1 + 178312β2

On rearranging and simplifying, the three coefficients are:

βo = 10.90, β1 = 8.7072 and β2 = −0.3252

ŷt = 10.90 + 8.7072t − 0.3252t²


R² = Explained variation / Total variation = Σ(ŷt − ȳ)² / Σ(yt − ȳ)² = 0.94
𝑡

The trend model explains 94% of the total variation.
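A minimal sketch of the quadratic fit, here using numpy's polyfit rather than solving the normal equations by hand:

```python
# A minimal sketch of the quadratic trend via numpy's polyfit.
import numpy as np

sales = np.array([17, 28, 31, 40, 53, 58, 55, 57, 58, 68, 59, 70, 70, 75, 66])
t = np.arange(1, 16)
b2, b1, b0 = np.polyfit(t, sales, 2)   # highest power first
print(b0, b1, b2)                      # ≈ 10.90, 8.7072, -0.3252
y_hat = b0 + b1 * t + b2 * t ** 2
r2 = 1 - ((sales - y_hat) ** 2).sum() / ((sales - sales.mean()) ** 2).sum()
print(r2)                              # ≈ 0.94
```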

2.4.3 Simple exponential trend

The simple exponential trend takes the form:

yt = βo(β1)^t εt

Where βo and β1 are coefficients and εt is the error term.

The model is to be transformed for ease of computation of the coefficients. The linear form of
the equation is

log yt = logβo + tlogβ1+ logεt

On close observation, you will notice that it is similar to the linear trend previously seen. In
deriving the normal equations:

Σlogyt = Σlogβo + Σlogβ1t +Σlogεt

Σtlogyt = Σlogβot + Σlogβ1t2 +Σtlogεt

The assumptions are that Σlogεt = 0 and Σtlogεt = 0.

The normal equations are:

Σlogyt = nlogβo + logβ1Σt ……………………………………………………………….2.8

Σtlogyt =logβoΣt + logβ1Σt2 …………………………….……………………………….2.9

After rearranging the equations, the computational methods for the coefficients become:

logβ1 = (nΣt·logy − Σt·Σlogy) / (nΣt² − (Σt)²)

logβo = Σlogyt/n − logβ1·Σt/n

ŷt = β̂o(β̂1)^t

R² = Explained variation / Total variation = Σ(ŷt − ȳ)² / Σ(yt − ȳ)²

Note that if the computed β1 is greater than 1 there is growth, while if β1 is less than 1 there
is a decline.

Example 2.3: Cotton sales yt for the period 1997 to 2011

Year T Sales (yt) Ln (yt) tlnyt t2 lny2


1997 1 17 2.833 2.8332 1 8.0271
1998 2 28 3.332 6.6644 4 11.1036
1999 3 31 3.434 10.3020 9 11.7923
2000 4 40 3.689 14.7555 16 13.6078
2001 5 53 3.970 19.8515 25 15.7632
2002 6 58 4.060 24.3627 36 16.4872
2003 7 55 4.007 28.0513 49 16.0587
2004 8 57 4.043 32.3444 64 16.3463
2005 9 58 4.060 36.5440 81 16.4872
2006 10 68 4.220 42.1951 100 17.8042
2007 11 59 4.078 44.8529 121 16.6263
2008 12 70 4.248 50.9819 144 18.0497
2009 13 70 4.248 55.2304 169 18.0497
2010 14 75 4.317 60.4448 196 18.6407
2011 15 66 4.190 62.8448 225 17.5532

Σt=120 ΣLnyt=58.731 ΣtLnyt=492.2590 Σt2=1,240 Σ(Lnyt)2=232.3973

lnβ1 = (15 × 492.2590 − 120 × 58.731) / (15 × 1240 − 120²) = 0.080

β1 = e^0.080 = 1.0833

lnβo = ΣLnyt/n − lnβ1·Σt/n = 58.731/15 − (0.080 × 120)/15 = 3.275

βo = e^3.275 = 26.45

ŷt = 26.45(1.0833)^t

Comment: The sales were 26.45 in the base period and the growth is 8.3% per annum.

R² = Explained variation / Total variation = Σ(ŷt − ȳ)² / Σ(yt − ȳ)² = 0.73

The trend model explains 73% of the total variation.
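A minimal sketch of the log-linear fit behind the simple exponential trend, again using numpy:

```python
# A minimal sketch of the log-linear fit for the simple exponential trend.
import numpy as np

sales = np.array([17, 28, 31, 40, 53, 58, 55, 57, 58, 68, 59, 70, 70, 75, 66])
t = np.arange(1, 16)
ln_b1, ln_b0 = np.polyfit(t, np.log(sales), 1)  # slope, intercept of ln(y)
print(np.exp(ln_b0), np.exp(ln_b1))             # ≈ 26.45, 1.0833
```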

2.4.4 Modified exponential trend

A modified exponential is an asymptotic growth curve expressed in the form:

yt = βo + β1(β2)^t + εt

Divide the time series into three equal parts and compute the partial sums.
Let the sum of the first third be S1y, the sum of the second third be S2y, and the sum of
the third and last part be S3y. Let n be the number of observations in each third of the series.

The coefficients for the trend are:

β2^n = (S3y − S2y) / (S2y − S1y)

β1 = (S2y − S1y)(β2 − 1) / (β2^n − 1)²

β0 = (1/n)[S1y − ((β2^n − 1)/(β2 − 1))·β1]

Example 2.4: Cotton sales yt for the period 1997 to 2011

Year t Sales (yt)


1997 1 17
1998 2 28
1999 3 31 S1y=169
2000 4 40
2001 5 53
2002 6 58
2003 7 55
2004 8 57 S2y=296
2005 9 58
2006 10 68
2007 11 59
2008 12 70
2009 13 70 S3y=340
2010 14 75
2011 15 66

β2^n = (S3y − S2y) / (S2y − S1y)

β2^5 = (340 − 296) / (296 − 169) = 44/127 = 0.346, hence β2 = 0.346^(1/5) = 0.809

β1 = (S2y − S1y)(β2 − 1) / (β2^n − 1)² = 127 × (0.809 − 1)/(0.346 − 1)² = 127 × (−0.447) = −56.80

β0 = (1/n)[S1y − ((β2^n − 1)/(β2 − 1))·β1] = (1/5)[169 − ((0.346 − 1)/(0.809 − 1)) × (−56.80)] = 72.663

ŷt = 72.663 − 56.80(0.809)^t
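A minimal sketch of the three-partial-sums fit (hypothetical helper; it assumes the series length is an exact multiple of three):

```python
# A minimal sketch of the three-partial-sums fit for the modified
# exponential trend (hypothetical helper; series length must be 3n).
def modified_exponential(y):
    n = len(y) // 3
    s1, s2, s3 = (sum(y[i * n:(i + 1) * n]) for i in range(3))
    b2 = ((s3 - s2) / (s2 - s1)) ** (1 / n)
    b1 = (s2 - s1) * (b2 - 1) / (b2 ** n - 1) ** 2
    b0 = (s1 - (b2 ** n - 1) / (b2 - 1) * b1) / n
    return b0, b1, b2

sales = [17, 28, 31, 40, 53, 58, 55, 57, 58, 68, 59, 70, 70, 75, 66]
print(modified_exponential(sales))  # ≈ (72.66, -56.80, 0.809)
```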

Chart 2.1: Plot of sales yt and the trend for the period 1997 to 2011

[Line chart of the sales yt and the fitted modified exponential trend for t = 1 to 15]

2.4.5 Gompertz Curve

A Gompertz curve is an asymptotic growth curve expressed in the form:

yt = β0(β1)^(β2^t)

On transforming by taking logs we obtain:

lnyt = lnβo + (lnβ1)(β2)^t + lnεt

This is a modified exponential trend in the series lnyt; lnβo, lnβ1 and β2 are the coefficients of
the model.

Divide the transformed series into three equal parts and compute the partial sums.
Let the sum of the first third be S1lny, the sum of the second third be S2lny, and the sum of
the third and last part be S3lny. Let n be the number of observations in each third of the series.

The coefficients for the trend are:

β2^n = (S3lny − S2lny) / (S2lny − S1lny)

lnβ1 = (S2lny − S1lny)(β2 − 1) / (β2^n − 1)²

lnβ0 = (1/n)[S1lny − ((β2^n − 1)/(β2 − 1))·lnβ1]

Example 2.5: Cotton sales yt for the period 1997 to 2011

Year t Sales (yt) ln yt


1997 1 19 2.944
1998 2 30 3.401
1999 3 33 3.497 17.587
2000 4 42 3.738
2001 5 55 4.007
2002 6 60 4.094
2003 7 57 4.043
2004 8 59 4.078 20.558
2005 9 60 4.094
2006 10 70 4.248
2007 11 61 4.111
2008 12 72 4.277
2009 13 72 4.277 21.228
2010 14 77 4.344
2011 15 68 4.220

The coefficients for the trend are:

β2^n = (S3lny − S2lny) / (S2lny − S1lny)

β2^5 = (21.228 − 20.558) / (20.558 − 17.587) = 0.2255, hence β2 = 0.2255^(1/5) = 0.7424

lnβ1 = (S2lny − S1lny)(β2 − 1) / (β2^n − 1)² = (20.558 − 17.587) × (0.7424 − 1)/(0.2255 − 1)² = −1.276

β1 = e^(−1.276) = 0.2792

lnβ0 = (1/n)[S1lny − ((β2^n − 1)/(β2 − 1))·lnβ1] = (1/5)[17.587 − ((0.2255 − 1)/(0.7424 − 1)) × (−1.276)] = 4.2845

β0 = e^4.2845 = 72.57

ŷt = 72.57(0.2792)^(0.7424^t)

Chart 2.2: Plot of sales yt and the trend for the period 1997 to 2011

[Line chart of the sales yt and the fitted Gompertz trend for t = 1 to 15]

2.4.6 Logistic Curve

A logistic curve is an asymptotic growth curve expressed in the form:

yt = β0 / (1 + e^(β1 + β2t)), for the case of log e

or

yt = β0 / (1 + 10^(β1 + β2t)), for the case of log 10

Divide the time series into three equal parts and select three periods t0, t1 and t2 that are
equidistant from one another. The first, t0, should be near the beginning within the first third,
the second, t1, in the middle, and the last, t2, near the end within the last third of the series.
Then select the yt corresponding to the three t's, let them be y0, y1 and y2, and let n be the
number of periods from one t to the next.
The computations are as follows:

βo = [2y0y1y2 − y1²(y0 + y2)] / (y0y2 − y1²)

β1 = log[(β0 − y0)/y0]

β2 = (1/n)·log[y0(β0 − y1) / (y1(β0 − y0))]

Example 2.6: Cotton sales yt for the period 1997 to 2011

Year t Sales (yt)


1997 1 19
1998 2 30
1999 3 33 First
2000 4 42 y0=33
2001 5 55
2002 6 60
2003 7 57
2004 8 59 Middle
2005 9 60 y1=59
2006 10 70
2007 11 61
2008 12 72
2009 13 72 Last
2010 14 77 y2=72
2011 15 68

βo = [2y0y1y2 − y1²(y0 + y2)] / (y0y2 − y1²) = [2 × 33 × 59 × 72 − 59²(33 + 72)] / (33 × 72 − 59²) = 77.0471

β1 = loge[(β0 − y0)/y0] = loge[(77.0471 − 33)/33] = 0.2888

β2 = (1/n)·loge[y0(β0 − y1)/(y1(β0 − y0))] = (1/5)·loge[33(77.047 − 59)/(59(77.047 − 33))] = −0.2947

ŷt = 77.047 / (1 + e^(0.2888 − 0.2947t))

Chart 2.3: Plot of sales yt and the trend for the period 1997 to 2011

[Line chart of the sales yt and the fitted logistic trend for t = 1 to 15]

2.5 Estimation of the trends using Excel

2.5.1 Linear trend


Example 2.1: Cotton sales yt for the period 1997 to 2011

Year t Sales (yt)


1997 1 17
1998 2 28
1999 3 31
2000 4 40
2001 5 53
2002 6 58
2003 7 55
2004 8 57
2005 9 58
2006 10 68
2007 11 59
2008 12 70
2009 13 70
2010 14 75
2011 15 66

Regression Statistics
Multiple R 0.912325935
R Square 0.832338611
Adjusted R
Square 0.819441581
Standard Error 7.297680147
Observations 15

ANOVA
Significance
df SS MS F F
Regression 1 3437.004 3437.004 64.53723 2.14E-06
Residual 13 692.3298 53.25614
Total 14 4129.333

Standard
Coefficients Error t Stat P-value
Intercept 25.64 3.965254 6.465688 2.11E-05
X Variable 1 3.5036 0.43612 8.033507 2.14E-06

Linear Trend 𝑦̂𝑡 = 25.64 + 3.5036t

The t values of 6.47 for the intercept and 8.03 for the X variable are high and they reveal that the
two coefficients are significant.

2.5.2 Quadratic equation


Using the same sales data in Example 2.1 and fitting a quadratic equation using Excel you
obtain:

Regression Statistics
Multiple R 0.968508266
R Square 0.938008262
Adjusted R
Square 0.927676305
Standard Error 4.618662782
Observations 15

ANOVA
Significance
Df SS MS F F
Regression 2 3873.3488 1936.674391 90.78709097 5.68E-08
Residual 12 255.98455 21.3320459
Total 14 4129.3333

Standard
Coefficients Error t Stat P-value
Intercept 10.89450549 4.1139988 2.6481548 0.021251339
X Variable 1 8.707191338 1.1831985 7.3590286 8.74617E-06
X Variable 2 -0.32522624 0.0719096 -4.5227111 0.000698504

Quadratic Equation: 𝑦̂𝑡 = 10.90 + 8.7072t – 0.3252t2

The t values of 2.65 for the intercept, 7.36 for X variable 1 and -4.52 for X variable 2 are high
and they reveal that the three coefficients are significant.

2.5.3 Exponential trend


Using the same sales data in Example 2.1 and fitting an exponential trend using Excel you
obtain:

Regression Statistics
Multiple R 0.857097645
R Square 0.734616372

Adjusted R
Square 0.714202247
Standard Error 0.223260997
Observations 15

ANOVA
Significance
df SS MS F F
Regression 1 1.793724 1.793724 35.98569 4.45E-05
Residual 13 0.647991 0.049845
Total 14 2.441715

Standard
Coefficients Error t Stat P-value
Intercept 3.275093963 0.121311 26.99757 8.38E-13
X Variable 1 0.08003847 0.013342 5.998807 4.45E-05

lnβ1 = 0.080 and β1 = 1.0833

lnβo = 3.275 and βo = 26.45

𝑦̂ = 26.45(1.0833)t

The t values of 27.0 for the intercept and 6.0 for the X variable are high and they reveal that the
two coefficients are significant.
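The same three fits can be reproduced outside Excel; below is a minimal numpy sketch using ordinary least squares (np.linalg.lstsq):

```python
# A minimal sketch reproducing the Excel regression coefficients with
# numpy least squares (an alternative to the Analysis ToolPak).
import numpy as np

sales = np.array([17, 28, 31, 40, 53, 58, 55, 57, 58, 68, 59, 70, 70, 75, 66])
t = np.arange(1, 16)

X = np.column_stack([np.ones_like(t), t])            # linear trend design
print(np.linalg.lstsq(X, sales, rcond=None)[0])      # ≈ [25.64, 3.5036]

Xq = np.column_stack([np.ones_like(t), t, t ** 2])   # quadratic design
print(np.linalg.lstsq(Xq, sales, rcond=None)[0])     # ≈ [10.90, 8.707, -0.325]

# exponential trend: regress ln(sales) on t
print(np.linalg.lstsq(X, np.log(sales), rcond=None)[0])  # ≈ [3.275, 0.080]
```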

Exercise 2:
The annual GDP series for country X
Year t GDP(y)
1997 1 8.62
1998 2 9.45
1999 3 10.07
2000 4 10.51
2001 5 11.20
2002 6 11.99
2003 7 12.73
2004 8 13.47
2005 9 14.81
2006 10 15.86
2007 11 17.14
2008 12 18.93
2009 13 19.71
2010 14 20.93
2011 15 22.17

i) Fit a linear trend, quadratic equation and exponential trend and compute their respective
coefficients of determination.

ii) Plot the GDP series and the three trends computed above on one graph

Chapter Three: Seasonal

This chapter introduces the seasonal component, tests for seasonality and the methods for
estimation of the seasonal factors.

3.1 Introduction

Seasonality is a regular pattern of fluctuations that repeats from year to year in a time series
observed at shorter than yearly intervals.

A time series yt observed L times per year at times t = 1, 2, 3, … is said to be seasonal if the
average value of the series changes over time such that:

Additive model: E(yt) = f(βo, β1, β2, …; t) + St for t = 1, 2, 3, … and Σ₁ᴸSt = 0

or

Multiplicative model: E(yt) = f(βo, β1, β2, …; t) × St for t = 1, 2, 3, … and Σ₁ᴸSt = L

Where: f(βo, β1, β2, …; t) represents the trend (T),

St = St+L = St+2L = ……….,

L is the length of seasonality,

and the St are the seasonal indexes.

The model may be expressed as follows:

Add: yt = Tt + St + εt for t = 1, 2, 3, … and Σ₁ᴸSt = 0

Mult: yt = Tt × St × εt for t = 1, 2, 3, … and Σ₁ᴸSt = L

3.2 Causes of Seasonality

The causes of seasonality are:

• Weather and temperature related factors, such as sales of goods and services associated
with weather patterns; purchases of clothing and consumption of heating fuels in temperate
regions which experience extreme weather conditions such as winters and summers
normally exhibit seasonality;
• Calendar related events, which are associated with collective social behaviour, depict
seasonal patterns. The calendar related factors are activities associated with events such
as public holidays, religious events, school open days or school calendars, etc.

3.3 Tests for Seasonality

The non-parametric and parametric tests will be used for making a decision about the presence or
absence of seasonality, and the seasonal factors will be estimated.

Hypothesis

H0: The series has no seasonality

Ha: The series has seasonality

or

H0: S1 = S2 = S3 = S4 = 0 for additive model

H0: S1 = S2 = S3 = S4 = 1 for multiplicative model

Ha: Si ≠ 0 for some seasons and an additive model

Ha: Si ≠ 1 for some seasons and a multiplicative model

Tests using the moving average (MA) method

Add: yt = Tt + St + Ct + It

Multiplicative: yt = Tt x St x Ct x It

And the moving average (MA) = Tt + Ct for an additive model and (MA) = Tt x Ct for a
multiplicative model.

The specific seasonals are: yt − MA = St + It for the additive model and

yt/MA = St × It for the multiplicative model.

Computation of a moving average

In case of a quarterly series yt observed 4 times per year at times t=1, 2, 3, … the moving
average is computed as follows:

The first MA = (y1 + y2 + y3 + y4)/4, the second MA = (y2 + y3 + y4 + y5)/4, third MA = (y3 + y4 + y5
+ y6)/4 and so on.

If L is an even number then the moving averages must be centered to correspond to a specific
quarter. The computation of the centered moving average (CMA) follows:

Step 1: Calculate the L-period moving average of the original data;

Step 2: Calculate the 2-period moving average of the moving averages resulting from Step 1,
giving a centered moving average.

Kruskal-Wallis test

Ri = the sum of the ranks of the yt’s in the ith season;

ni = number of specific seasonal in the ith season;

n = total number of specific seasonal;

= n1 + n2 + n3 + … + nL
Let R̄i = average rank in the ith season = Ri/ni

Let R̄ = the overall average rank

= ΣRi/n = [n(n + 1)/2]/n = (n + 1)/2

H = [12/(n(n + 1))]·[n1(R̄1 − R̄)² + n2(R̄2 − R̄)² + … + nL(R̄L − R̄)²]

= [12/(n(n + 1))]·Σni(R̄i − R̄)²

= [12/(n(n + 1))]·[Σ(Ri²/ni)] − 3(n + 1)

ni = number of observations in the ith season;

n = total number of specific seasonal;

= Σni

yt′ = specific seasonal for time t,

Ri = Σ Rank(yt′) over the ith season

Reject Ho if H > χ²α(L − 1)

Where χ²α(L − 1) is the upper α × 100% point of the chi-square distribution with L − 1 degrees of
freedom.

Example 3.1: Cotton sales by a Soroti based company.

Year Quarter yt Moving Average (MA) Centred Moving Average (CMA) yt/CMA = S×I Rank
2005 I 130
II 45 71.25
III 20 78.75 75.00 0.267 1
IV 90 82.50 80.63 1.116 11
2006 I 160 83.75 83.13 1.925 16
II 60 92.50 88.13 0.681 5
III 25 90.00 91.25 0.274 2
IV 125 93.75 91.88 1.361 12
2007 I 150 96.25 95.00 1.579 13
II 75 85.00 90.63 0.828 6
III 35 90.00 87.50 0.400 3
IV 80 93.75 91.88 0.871 7
2008 I 170 97.50 95.63 1.778 15
II 90 105.00 101.25 0.889 9
III 50 106.25 105.63 0.473 4
IV 110 106.75 106.50 1.033 10
2009 I 175 104.25 105.50 1.659 14
II 92 105.50 104.88 0.877 8
III 40
IV 115

Rearranging the ranks(y’t) in their respective quarters we obtain:

Quarter Ranks (y’t) Σ(Ri)


I 14 16 13 15 58
II 8 5 6 9 28
III 1 2 3 4 10
IV 11 12 7 10 40

L – 1 = 4 - 1 = 3 and n1= n2= n3= n4 = 4

Σni = 4 + 4 + 4 + 4 = 16

H = [12/(16 × 17)]·[58²/4 + 28²/4 + 10²/4 + 40²/4] − 3(16 + 1)

= 0.04412 × [841 + 196 + 25 + 400] − 51

= 13.5

Let α = 5%.

Reject Ho if H > χ²0.05(3) = 7.81

Conclusion: Since H = 13.5 > 7.81, we reject the null hypothesis and conclude that there is
seasonality, with 95% confidence.
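A minimal sketch of the Kruskal-Wallis computation on the ranked specific seasonals of Example 3.1:

```python
# A minimal sketch of the Kruskal-Wallis test on the specific seasonals;
# the ranks come from the yt/CMA column of Example 3.1.
ranks = {'I': [16, 13, 15, 14], 'II': [5, 6, 9, 8],
         'III': [1, 2, 3, 4],   'IV': [11, 12, 7, 10]}
n = sum(len(r) for r in ranks.values())
H = 12 / (n * (n + 1)) * sum(sum(r) ** 2 / len(r) for r in ranks.values()) \
    - 3 * (n + 1)
print(H)  # 13.5 > chi-square(0.05, 3) = 7.81, so seasonality is present
```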

3.4 Computation of Seasonal Factors

Year Quarter yt Moving Average (MA) Centred Moving Average (CMA) yt/CMA = S×I
2005 I 130
II 45 71.25
III 20 78.75 75.00 0.267
IV 90 82.50 80.63 1.116
2006 I 160 83.75 83.13 1.925
II 60 92.50 88.13 0.681
III 25 90.00 91.25 0.274
IV 125 93.75 91.88 1.361
2007 I 150 96.25 95.00 1.579
II 75 85.00 90.63 0.828
III 35 90.00 87.50 0.400
IV 80 93.75 91.88 0.871
2008 I 170 97.50 95.63 1.778
II 90 105.00 101.25 0.889
III 50 106.25 105.63 0.473
IV 110 106.75 106.50 1.033
2009 I 175 104.25 105.50 1.659
II 92 105.50 104.88 0.877
III 40
IV 115

Rearrange the yt/CMA = S×I values into their respective quarters and compute arithmetic means.
The arithmetic means represent the unadjusted seasonal factors, which on adjustment become the
adjusted seasonal factors.

Quarter S×I Average (unadjusted SF) Adjusted SF
I 1.6588 1.9248 1.5789 1.7778 1.7351 1.7341
II 0.8772 0.6809 0.8276 0.8889 0.8186 0.8182
III 0.2667 0.2740 0.4000 0.4734 0.3535 0.3533
IV 1.1163 1.3605 0.8707 1.0329 1.0951 1.0945
Sum 4.0023 4.0000

The seasonal factors (SF) are:

Quarter SF
I 1.7341
II 0.8182
III 0.3533
IV 1.0945

Interpretation: Generally, in quarter I the sales are 73% above the average or trend, in quarter II
the sales are 18% below the average or trend, in quarter III the sales are about 65% below the
average or trend, and in quarter IV the sales are 9% above the average or trend.
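A minimal sketch of the whole centered-moving-average procedure for a quarterly multiplicative model (the helper name seasonal_factors is hypothetical):

```python
# A minimal sketch of centered-moving-average seasonal factors for a
# quarterly series, multiplicative model (hypothetical helper).
def seasonal_factors(y, L=4):
    ma = [sum(y[i:i + L]) / L for i in range(len(y) - L + 1)]
    cma = [(a + b) / 2 for a, b in zip(ma, ma[1:])]   # centered MA
    ratios = {q: [] for q in range(L)}
    for i, c in enumerate(cma):                       # yt/CMA = S x I
        t = i + L // 2                                # index into y
        ratios[t % L].append(y[t] / c)
    unadj = [sum(r) / len(r) for r in ratios.values()]
    scale = L / sum(unadj)                            # adjust to sum to L
    return [round(s * scale, 4) for s in unadj]

y = [130, 45, 20, 90, 160, 60, 25, 125, 150, 75, 35, 80,
     170, 90, 50, 110, 175, 92, 40, 115]
print(seasonal_factors(y))  # ≈ [1.7341, 0.8182, 0.3533, 1.0945]
```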

3.5 Test for Seasonality using Excel
Least squares regression may be used to test for seasonality. The method fits a least squares
trend and seasonal indicator variables to yt (additive model) or to log yt (multiplicative model);
if the trend is present, then fit the seasonal indicator variables to the residual and test the model fit.

Let x1, x2, …, xL be seasonal indicator variables such that xi is 1 in the ith season and 0 otherwise,
for i = 1, 2, …, L.

Additive model: yt = β0 + β1t + S1x1 + S2x2 + … + SLxL + εt and ΣSi = 0

Multiplicative model: yt = β0(β1)^t·S1^x1·S2^x2·…·SL^xL·εt and ΠSi = 1

On transformation by taking logarithms:

log yt = logβ0 + t·logβ1 + x1·logS1 + x2·logS2 + … + xL·logSL + logεt and ΣlogSi = 0

There should be L − 1 seasonal indicators, since the Lth seasonal indicator can easily be determined
once the L − 1 indicators have been determined.
The regression technique for determining whether the seasonal factors are significant is as follows:

Step 1
Using Ms Excel, fit a linear trend and L − 1 seasonal dummy variables and obtain:
Variability explained by the trend and seasonal model
= SSR (trend + seasonal)
Variability unexplained by the trend and seasonal model
= SSE (trend + seasonal)

Step 2
Test for significance of the linear trend coefficient β1 using the t-test:

t = β̂1 / s(β̂1)

Decision criteria: reject H0 if the computed t is greater than the tabulated tα/2 (df = n − L − 1).

Step 3

If the trend is not significant, use Ms Excel to refit the model without the trend, that is, with only
the L − 1 seasonal dummies, and obtain:
Variability explained by the seasonal-only model
= SSR (seasonal)
Variability unexplained by the seasonal-only model
= SSE (seasonal)

Use the F test to test the overall model fit.

Hypothesis to be tested

H0: S1 = S2 = … = SL−1 = 0 for the additive model

or

H0: S1 = S2 = … = SL−1 = 1 for the multiplicative model

Ha: Si ≠ 0 for some seasons, for an additive model

or

Ha: Si ≠ 1 for some seasons, for a multiplicative model


Test statistic: F = MSR(seasonal only) / MSE(seasonal only)
Where:
MSR(seasonal only) = SSR(seasonal only)/(L − 1)
MSE(seasonal only) = SSE(seasonal only)/(n − L)

Decision Rule: Reject H0 if F > Fα with df = (L − 1, n − L); otherwise do not reject H0.

Conclusion: If Ho is rejected we conclude with (1 − α) × 100% confidence that seasonality is
present; otherwise seasonality is not present.

Step 4

In case the trend is present, fit a regression with only the linear trend yt = β0 + β1t, without the
seasonal dummies.
SSR(trend) = variability explained by a model with only the linear trend.

Obtain the amount of improvement obtained by adding a seasonal component:

SSR(seasonal) = SSR(trend + seasonal) − SSR(trend)

Step 5
In order to test for seasonality we compute the F as follows:

Test statistic: F = MSR(seasonal) / MSE(trend + seasonal)
Where:
MSR(seasonal) = SSR(seasonal)/(L − 1)
MSE(trend + seasonal) = SSE(trend + seasonal)/(n − L − 1)

Decision Rule: Reject H0 if F > Fα with df = (L − 1, n − L − 1); otherwise do not reject H0.

Conclusion: If Ho is rejected we conclude with (1 − α) × 100% confidence that seasonality is
present; otherwise seasonality is not present.

Example 3.2: Sales of Batteries by a certain company in Mbale


year Quarter yt t X1 X2 X3
2005 I 13.6 1 1 0 0
II 5.1 2 0 1 0
III 2.5 3 0 0 1
IV 9.3 4 0 0 0
2006 I 16.3 5 1 0 0
II 6.6 6 0 1 0
III 2.8 7 0 0 1
IV 12.8 8 0 0 0
2007 I 15.5 9 1 0 0
II 7.9 10 0 1 0
III 3.6 11 0 0 1
IV 8.6 12 0 0 0
2008 I 17.6 13 1 0 0
II 9.1 14 0 1 0
III 5 15 0 0 1
IV 11.4 16 0 0 0
2009 I 17.8 17 1 0 0
II 9.3 18 0 1 0
III 4.3 19 0 0 1
IV 11.7 20 0 0 0

Regression Statistics
Multiple R 0.97959
R Square 0.959596
Adjusted R
Square 0.948822
Standard Error 1.113403
Observations 20

ANOVA
Significance
df SS MS F F
Regression 4 441.633 110.4083 89.06285 2.9E-10
Residual 15 18.595 1.239667
Total 19 460.228

Standard
Coefficients Error t Stat P-value
Intercept 8.525 0.72585 11.74485 5.8E-09
t X Variable 1 0.18625 0.044011 4.231885 0.000725
x1 X Variable 2 5.95875 0.716449 8.317058 5.32E-07
x2 X Variable 3 -2.7875 0.709658 -3.92795 0.001343
x3 X Variable 4 -6.93375 0.705552 -9.82741 6.28E-08

The decision rule: α = 5%, df = n-L-1 = 20-4-1 = 15

From print out


The t value for the trend is = 4.23
SSR(trend +seasonal) = 441.633
SSE(trend +seasonal) = 18.595
MSE(trend +seasonal) = 1.2396

Conclusion: Since the computed value 4.23 is greater than the tabulated t0.025 = 2.131, we reject
H0, and conclude that the sales of batteries series has trend.

The trend is present in the series, hence we proceed to Step 4: fit a regression with only the
linear trend yt = β0 + β1t, without the seasonal dummies.
SUMMARY OUTPUT

Regression Statistics
Multiple R 0.124363
R Square 0.015466
Adjusted R
Square -0.03923
Standard Error 5.017248
Observations 20

ANOVA
Significance
df SS MS F F
Regression 1 7.117955 7.117955 0.282764 0.601397
Residual 18 453.11 25.17278
Total 19 460.228

Standard
Coefficients Error t Stat P-value
Intercept 8.453684 2.33067 3.627148 0.001927
t X Variable 1 0.103459 0.194561 0.531756 0.601397

From print out

SSR(trend) = 7.118
The amount of improvement obtained by adding a seasonal component is;
SSR(seasonal) = SSR(trend + seasonal) – SSR(trend)
= 441.633 – 7.118

= 434.52
MSR(seasonal) = SSR(seasonal)/(L − 1) = 434.52/(4 − 1) = 144.8

MSE(trend + seasonal) = SSE(trend + seasonal)/(n − L − 1) = 1.2396

Test statistic: F = 144.8/1.2396 = 116.84

Step 5

Hypothesis

H0: S1 = S2 = S3 = 0

Ha: Si ≠ 0 for some seasons

The decision rule: α = 5%, df = L-1 =3, n-L-1= 15

Decision Rule: Reject H0 if F > F0.05,3,15 = 3.29

Conclusion: Since the computed F = 116.84 > 3.29, we reject H0, and conclude that there is
seasonality in the sales of batteries series.
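The trend-plus-dummies regression of Example 3.2 can be reproduced with a minimal numpy sketch:

```python
# A minimal sketch of the additive trend + seasonal-dummy regression
# for Example 3.2, using numpy least squares.
import numpy as np

y = np.array([13.6, 5.1, 2.5, 9.3, 16.3, 6.6, 2.8, 12.8, 15.5, 7.9,
              3.6, 8.6, 17.6, 9.1, 5.0, 11.4, 17.8, 9.3, 4.3, 11.7])
n = len(y)
t = np.arange(1, n + 1)
x1 = (t % 4 == 1).astype(float)   # quarter I dummy
x2 = (t % 4 == 2).astype(float)   # quarter II dummy
x3 = (t % 4 == 3).astype(float)   # quarter III dummy (IV is the baseline)
X = np.column_stack([np.ones(n), t, x1, x2, x3])
print(np.linalg.lstsq(X, y, rcond=None)[0])
# ≈ [8.525, 0.18625, 5.95875, -2.7875, -6.93375]
```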

Assuming a multiplicative model


Example 3.3: Sales of Batteries by a certain company in Mbale
Year Quarter yt ln(yt) t X1 X2 X3
2005 I 32 3.466 1 1 0 0
II 57 4.043 2 0 1 0
III 110 4.700 3 0 0 1
IV 103 4.635 4 0 0 0
2006 I 45 3.807 5 1 0 0
II 77 4.344 6 0 1 0
III 118 4.771 7 0 0 1
IV 134 4.898 8 0 0 0
2007 I 17 2.833 9 1 0 0
II 38 3.638 10 0 1 0
III 24 3.178 11 0 0 1
IV 89 4.489 12 0 0 0
2008 I 3 1.099 13 1 0 0
II 17 2.833 14 0 1 0
III 12 2.485 15 0 0 1
IV 137 4.920 16 0 0 0
2009 I 18 2.890 17 1 0 0
II 38 3.638 18 0 1 0
III 44 3.784 19 0 0 1
IV 160 5.075 20 0 0 0
2010 I 21 3.045 21 1 0 0
II 43 3.761 22 0 1 0
III 34 3.526 23 0 0 1
IV 119 4.779 24 0 0 0

SUMMARY OUTPUT

Regression Statistics
Multiple R 0.771744
R Square 0.595588
Adjusted R
Square 0.510449
Standard Error 0.672257
Observations 24

ANOVA
Significance
df SS MS F F
Regression 4 12.6458 3.161451 6.995462 0.001225
Residual 19 8.586647 0.451929
Total 23 21.23245

Standard
Coefficients Error t Stat P-value
Intercept 5.268241 0.392949 13.40693 3.9E-11
t X Variable 1 -0.0335 0.020088 -1.66768 0.111777
x1 X Variable 2 -2.04323 0.392778 -5.20199 5.07E-05
x2 X Variable 3 -1.15684 0.390201 -2.96472 0.007959
x3 X Variable 4 -1.09197 0.388647 -2.80966 0.011186

The decision rule: α = 5%, df = n-L-1 = 24-4-1 = 19

From print out


The t value for the trend is = -1.67

Conclusion: Since the computed |t| = 1.67 < t0.025 = 2.093, we do not reject H0, and conclude that
the series is not trended.

Refit the model without the trend but with only X1, X2 and X3.
SUMMARY OUTPUT

Regression Statistics
Multiple R 0.732388
R Square 0.536392
Adjusted R
Square 0.466851
Standard Error 0.701553
Observations 24

ANOVA
Significance
df SS MS F F
Regression 3 11.38892 3.796306 7.7133 0.00129
Residual 20 9.843532 0.492177
Total 23 21.23245

Standard
Coefficients Error t Stat P-value
Intercept 4.799247 0.286408 16.75669 3.07E-13
x1 X Variable 1 -1.94273 0.405042 -4.79636 0.00011
x2 X Variable 2 -1.08984 0.405042 -2.69069 0.014062
x3 X Variable 3 -1.05847 0.405042 -2.61323 0.016643

The decision rule: α = 5%, df = L-1=3, n - L = 20, F0.05,3,20 = 3.10

From print out


F = 7.71

Conclusion: Since the computed F = 7.71 > 3.10, we reject H0, and conclude that the series has
seasonality.

3.6 Least Squares Seasonal Modeling

In the test for seasonality there were L − 1 seasonal factors instead of L. Since we have the L − 1
indicators and we know that the seasonal factors sum to zero for the additive model and to L for a
multiplicative model, it is easy to determine the one missing factor.

Assume the fitted model is:

ŷt = β̂0* + β̂1*t + Ŝ1*x1 + Ŝ2*x2 + … + Ŝ*L−1·xL−1

with L dummies:

ŷt = β̂0 + β̂1t + Ŝ1x1 + Ŝ2x2 + … + ŜLxL

Regression can be used to produce β̂0*, β̂1*, Ŝ1*, Ŝ2*, …, Ŝ*L−1, with ŜL* = 0.

Hence the fitted model may be rewritten as:

ŷt = β̂0* + β̂1*t + Ŝ1*x1 + Ŝ2*x2 + … + Ŝ*L−1·xL−1 + ŜL*xL

The sum of Ŝ1*, Ŝ2*, …, ŜL* is not zero as required, which implies that we should normalize
the coefficients.

Let S̄* = ΣŜi*/L, and

Ŝi = Ŝi* − S̄* for i = 1, 2, …, L

β̂0 = β̂0* + S̄*

β̂1 = β̂1*

and β̂0 + β̂1t + Ŝi = (β̂0* + S̄*) + β̂1*t + (Ŝi* − S̄*) = β̂0* + β̂1*t + Ŝi*

The derived seasonals from the regression are as follows:

Ŝi = Ŝi* − S̄* for i = 1, 2, …, L − 1

ŜL = −S̄*

β̂0 = β̂0* + S̄*

β̂1 = β̂1*

The forecasting models for an additive and a multiplicative model are:

ŷt = β̂0 + β̂1t + Ŝ1x1 + Ŝ2x2 + … + ŜLxL for an additive model

or

ŷt = β̂0 × (β̂1)^t × Ŝ1^x1 × Ŝ2^x2 × … × ŜL^xL for a multiplicative model

Example: 3.5. Refer to Example 3.2 for which we fitted a linear trend and seasonal factors which
yielded the following regression output:

Regression Statistics
Multiple R 0.97959
R Square 0.959596
Adjusted R
Square 0.948822
Standard Error 1.113403
Observations 20

ANOVA
Significance
df SS MS F F
Regression 4 441.633 110.4083 89.06285 2.9E-10
Residual 15 18.595 1.239667
Total 19 460.228

Coefficients Standard Error t Stat P-value
Intercept 8.525 (β̂0*) 0.72585 11.74485 5.8E-09
t X Variable 1 0.18625 (β̂1*) 0.044011 4.231885 0.000725
x1 X Variable 2 5.95875 (Ŝ1*) 0.716449 8.317058 5.32E-07
x2 X Variable 3 -2.7875 (Ŝ2*) 0.709658 -3.92795 0.001343
x3 X Variable 4 -6.93375 (Ŝ3*) 0.705552 -9.82741 6.28E-08

We observed earlier while testing for trend and seasonality, that the trend and seasonal
components are significant.

The actual components are worked out as follows:


S̄* = (5.95875 − 2.7875 − 6.93375)/4 = −3.7625/4 = −0.9406

Ŝ1 = Ŝ1* − S̄* = 5.95875 − (−0.9406) = 6.8994

Ŝ2 = Ŝ2* − S̄* = −2.7875 − (−0.9406) = −1.8469

Ŝ3 = Ŝ3* − S̄* = −6.93375 − (−0.9406) = −5.9932

Ŝ4 = −S̄* = −(−0.9406) = 0.9406

β̂0 = β̂0* + S̄* = 8.525 + (−0.9406) = 7.5844

β̂1 = β̂1* = 0.18625

The forecasting model is therefore:

ŷt = 7.5844 + 0.18625t + 6.8994x1 − 1.8469x2 − 5.9932x3 + 0.9406x4

Units of t: quarters

Units of y: units
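A minimal sketch of the normalization step for the additive model, using the coefficients of Example 3.5:

```python
# A minimal sketch of normalizing the dummy-regression coefficients into
# L seasonal factors for the additive model (Example 3.5 numbers).
b0_star, b1_star = 8.525, 0.18625
s_star = [5.95875, -2.7875, -6.93375]          # S1*, S2*, S3* (S4* = 0)
L = 4
s_bar = sum(s_star) / L
seasonals = [s - s_bar for s in s_star] + [-s_bar]
b0 = b0_star + s_bar
print(b0, b1_star, seasonals)
# ≈ 7.5844 0.18625 [6.8994, -1.8469, -5.9932, 0.9406]
```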

In the case of a multiplicative model with a simple exponential trend, fit a simple exponential trend
on the cotton series: regress ln(yt) on t, X1, X2 and X3.

Example 3.6: Cotton Exports in Million tons


Year Quarter yt ln(yt) t X1 X2 X3
2005 I 35 3.555 1 1 0 0
II 64 4.159 2 0 1 0
III 121 4.796 3 0 0 1
IV 118 4.771 4 0 0 0
2006 I 64 4.159 5 1 0 0
II 100 4.605 6 0 1 0
III 145 4.977 7 0 0 1
IV 165 5.106 8 0 0 0
2007 I 52 3.951 9 1 0 0
II 77 4.344 10 0 1 0
III 67 4.205 11 0 0 1
IV 136 4.913 12 0 0 0
2008 I 54 3.989 13 1 0 0
II 72 4.277 14 0 1 0
III 71 4.263 15 0 0 1
IV 200 5.298 16 0 0 0
2009 I 85 4.443 17 1 0 0
II 109 4.691 18 0 1 0
III 119 4.779 19 0 0 1
IV 239 5.476 20 0 0 0
2010 I 104 4.644 21 1 0 0
II 130 4.868 22 0 1 0
III 125 4.828 23 0 0 1
IV 214 5.366 24 0 0 0

The results of the regression are as follows:

Regression Statistics
Multiple R 0.8687991
R Square 0.7548119
Adjusted R Square 0.7031933
Standard Error 0.2618031
Observations 24

ANOVA
df SS MS F Significance F
Regression 4 4.0090582 1.00226455 14.622879 1.2957E-05
Residual 19 1.3022761 0.06854085
Total 23 5.3113343

Coefficients Standard Error t Stat P-value
Intercept 4.8007481 0.1530298 31.3713257 7.866E-18
t X Variable 1 0.0253042 0.0078229 3.23464779 0.0043616
X1 X Variable 2 -0.9555109 0.1529631 -6.24667361 5.34E-06
X2 X Variable 3 -0.6138307 0.1519597 -4.039432 0.0007002
X3 X Variable 4 -0.4884804 0.1513544 -3.22739559 0.0044332

The coefficients are:

𝛽̂0∗ = 4.8007
𝛽̂1∗ = 0.0253
𝑆̂1∗ = -0.9555
𝑆̂2∗ = -0.6138
𝑆̂3∗ = -0.4885

The logarithmic coefficients are:

𝑆̅ ∗ = (𝑆̂1∗ + 𝑆̂2∗ + 𝑆̂3∗ )/4 = (-0.9555 - 0.6138 - 0.4885)/4 = -0.5145

𝑆̂1′ = 𝑆̂1∗ - 𝑆̅ ∗ = -0.9555- (-0.5145) = -0.4411

𝑆̂2′ = 𝑆̂2∗ - 𝑆̅ ∗ = -0.6138- (-0.5145) = -0.0993

𝑆̂3′ = 𝑆̂3∗ - 𝑆̅ ∗ = -0.4885 - (-0.5145) = 0.026

𝑆̂4′ = - 𝑆̅ ∗ = - (-0.5145) = 0.5145

𝛽̂0′ = 𝛽̂0∗ + 𝑆̅ ∗ = 4.8007+ (-0.5145) = 4.2862

𝛽̂1′ = 𝛽̂1∗ = 0.0253

To convert the actual coefficients we obtain the antilogarithms;


𝛽̂0 = exp(𝛽̂0′ ) = 72.69

𝛽̂1 = exp(𝛽̂1′ ) = 1.0256

𝑆̂1 = exp(𝑆̂1′ ) = 0.6434

𝑆̂2 = exp(𝑆̂2′ ) = 0.9055

𝑆̂3 = exp(𝑆̂3′ ) = 1.0263

𝑆̂4 = exp(𝑆̂4′ ) = 1.6728

The sum of the seasonal factors should be = L = 4, hence we adjust them to sum to 4.

𝑆̅ = (0.6434 + 0.9055 + 1.0263 + 1.6728)/4 = 1.062


S1 = Ŝ1/S̄ = 0.6434/1.062 = 0.606

S2 = Ŝ2/S̄ = 0.9055/1.062 = 0.853

S3 = Ŝ3/S̄ = 1.0263/1.062 = 0.966

S4 = Ŝ4/S̄ = 1.6728/1.062 = 1.575

Forecast: ŷt = 72.69(1.0256)^t × (0.606)^X1 × (0.853)^X2 × (0.966)^X3 × (1.575)^X4

Origin: IV 2004
Units of t: quarters
Units of y: Million tons
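A minimal sketch of the corresponding normalization for the multiplicative model, working on the log-scale coefficients above:

```python
# A minimal sketch of converting log-scale dummy coefficients into
# multiplicative seasonal factors that sum to L (numbers from above).
import math

ln_b0, ln_b1 = 4.8007, 0.0253
ln_s = [-0.9555, -0.6138, -0.4885]                  # lnS1*, lnS2*, lnS3*
L = 4
s_bar = sum(ln_s) / L
factors = [math.exp(s - s_bar) for s in ln_s] + [math.exp(-s_bar)]
scale = L / sum(factors)                            # adjust sum to L
print(math.exp(ln_b0 + s_bar), math.exp(ln_b1),
      [round(f * scale, 3) for f in factors])
# ≈ 72.69 1.0256 [0.606, 0.853, 0.966, 1.575]
```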

Exercise 3: cotton sales in million tons

Year Quarter Sales (million tons )


2005 III 100
IV 98
2006 I 45
II 80
III 125
IV 145
2007 I 32
II 52
III 47
IV 98
2008 I 20
II 30
III 51
IV 160
2009 I 65
II 87
III 99
IV 220
2010 I 80
II 110
III 100
IV 196

(i) Test for presence of seasonality using the Kruskal-Wallis test (assume α = 5%)


(ii) Compute the seasonal factors
(iii) Use Ms Excel to fit a linear (additive model) and simple exponential trend
(multiplicative model) and seasonal indicators and test for presence of trend and seasonal
factors;
(iv) Compute the coefficients and propose a forecasting model in both cases.

Chapter Four: Cyclical Component

4.1 Introduction

Cyclic component is defined as long swings away from trend that are due to factors other than
seasonality. It’s important to note that cycles occur over a number of years. The up-down
oscillations of a cycle rarely repeat at fixed intervals of time and the amplitude of the fluctuations
may also vary.

A time series yt is said to have cyclical component if the average value of yt changes over time
such that;

Additive model: E(yt) = f(βo, β1, β2, …; t) + St + Ct for t = 1, 2, 3, …

or

Multiplicative model: E(yt) = f(βo, β1, β2, …; t) × St × Ct for t = 1, 2, 3, …

Where: f(βo, β1, β2, …; t) represents the trend (T),

St = seasonal component at time t,

Ct = cyclic component at time t.

The model may be expressed as follows:

Add: yt = Tt + St + Ct + εt for t = 1, 2, 3, … and Σ₁ᴸSt = 0

Mult: yt = Tt × St × Ct × εt for t = 1, 2, 3, … and Σ₁ᴸSt = L

4.2 Causes of Cycles in a Series

The possible causes of cycles are:

• Psychological forces may contribute to swings in a series; for example a series linked to
popular tastes, such as food, music or fashions, exhibits cyclic behaviour. A cycle starts
with a few people who are adherents of the particular fashion, which eventually swells and
reaches a peak. Once the peak is reached, disinterest sets in, causing a decline in
popularity;

• Populations generally undergo successive booms and declines. Population dependent
series undergo cycles as a result of events such as wars, famines, epidemics and other
natural disasters which cause declines, and they commonly recover after the event;

• Institutional causes include public policy, like a policy to recruit more police to curb the
crime rate. The increase in the number of police officers may lead to a decline in the
crime rate, which may again prompt government to reduce the police force because of
increased costs, hence contributing to another round of increase in the crime rate;

• Replacement cycles arise due to the introduction of new products in the market, which may
undergo rapid sales at the beginning and eventually level off as market saturation
approaches. Eventually, the early customers find themselves with old, broken and worn out
products which they have to replace; this may result in another round of growth;

• Education – demand for some fields often undergoes cycles because students go for
courses with high prospects of employment. After a few years the demand in those areas
slackens once it is met, leading to a number of qualified job seekers not getting jobs and
causing students to avoid those areas;

• Predator/prey relationships, where certain prey are known to have distinct predators. Too
few predators lead to increased breeding of the prey, increasing the predators' food supply;
too many predators eventually compete for and reduce the number of prey, reducing the
food supply and leading to a diminished number of predators;

• Combined causes arise due to the possibility of more than one cause acting at a time, such
as education combined with population cycles, etc.

4.3 Tests for Cycle

The test for presence of cycles may be based upon the von Neumann’s ratio test which is a non-
parametric test applied to the ranks of the data set.

Hypothesis

H0: The residuals are independent

H1: The residuals are positively auto correlated or cycles are present

Test statistic: RM = Σt=1..n−1 (Rt+1 − Rt)² / Σt=1..n (Rt − R̄)² = Σt=1..n−1 (Rt+1 − Rt)² / [n(n² − 1)/12]

Where: Rt is the rank of yt.

Decision criteria, Reject H0, if the computed RM is less than the tabulated RMα

Otherwise, do not reject H0.

Note: RMα is the lower α x 100% point in the distribution of von Neumann’s rank ratio.

Conclusion: If H0 is rejected, we conclude that there is a positive autocorrelation in the residuals
that may be due to cycles and if it is not rejected we conclude that the residuals are independent.

Example 4.1: The tomato sales in Nakasero Market


Year(1) T(2) Yt(3) Trend(4) De-trended(5) Rank t(6) Rt+1-Rt(7) (Rt+1-Rt)2(8)
1991 1 16 15.775 0.225 10 -7 49
1992 2 15 16.579 -1.579 3 4 16
1993 3 17 17.382 -0.382 7 2 4
1994 4 18 18.186 -0.186 9 4 16
1995 5 21 18.989 2.011 13 -1 1
1996 6 21 19.793 1.207 12 -6 36
1997 7 20 20.596 -0.596 6 -5 25
1998 8 18 21.400 -3.400 1 4 16
1999 9 21 22.204 -1.204 5 6 36
2000 10 24 23.007 0.993 11 4 16
2001 11 28 23.811 4.189 15 -1 1
2002 12 28 24.614 3.386 14 -10 100
2003 13 24 25.418 -1.418 4 4 16
2004 14 26 26.221 -0.221 8 -6 36
2005 15 24 27.025 -3.025 2 _ _
Sum=368

Linear Trend T = 14.971+ 0.803571t


Origin: 1990
Units t = yearly

Compute the trend values (column 4), then detrend the series; the method depends on whether you
assume an additive or a multiplicative model. If you assume an additive model you obtain
column 5. Rank the detrended series and obtain successive differences between the ranks
(columns 6 and 7). Lastly, compute the squared differences and sum them, as shown in column 8.

RM = Σt=1..n−1 (Rt+1 − Rt)² / [n(n² − 1)/12] = 368 / [15(15² − 1)/12]

= 368/280 = 1.31

Taking α = 10% in order to test for a cyclic component, RM0.10 = 1.36.

Conclusion: Since the computed RM = 1.31 < 1.36, we conclude that there is significant (α =
10%) positive autocorrelation in the residuals, and a cycle may be the cause.
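A minimal sketch of the von Neumann rank ratio computation, taking the ranks from column 6 of the table:

```python
# A minimal sketch of the von Neumann rank ratio test on the detrended
# series of Example 4.1 (ranks from column 6 of the table).
ranks = [10, 3, 7, 9, 13, 12, 6, 1, 5, 11, 15, 14, 4, 8, 2]
n = len(ranks)
num = sum((b - a) ** 2 for a, b in zip(ranks, ranks[1:]))   # = 368
rm = num / (n * (n * n - 1) / 12)
print(rm)  # ≈ 1.31; below the 10% critical value, suggesting a cycle
```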

4.4 Estimation Cyclical Movements
4.4.1 Residual Method
The residual method may be used to compute cyclic movements. Adjusting monthly data
involves elimination of seasonal variation and trend hence obtaining the cyclic irregular
movements.
Assuming a multiplicative model: yt = Tt x St x Ct x It

On deseasonalisation: (Tt x St x Ct x It)/ St = Tt x Ct x It

Detrend the residual: (Tt x Ct x It)/ Tt = Ct x It

Next the data is smoothed in order to obtain cyclical components, which may be referred to as
cyclical relatives since they are always in percentages.

Deseasonalising

The computation of a seasonal index is very important since it may also be used for the isolation of the cyclic movements. The elimination of seasonal variation in a series may be achieved by dividing the original series by the seasonal index. The deseasonalised data may then contain three components: the trend, cyclic and irregular movements.

Adjustment for seasonal and trend

After removing the seasonal movements, the trend component should be eliminated. On detrending the deseasonalised series you obtain the cyclical-irregular series. It does not matter whether you eliminate the seasonal component or the trend first; as long as the two components are removed, the residual comprises the cyclic and irregular components.

Smoothing irregular movements


Irregular movements should not be removed completely, since the series could otherwise be over-smoothed. The irregular movements can be smoothed by use of short-term moving averages in order to make the cyclical movements clearer.
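As a sketch of the residual method under the multiplicative model, the following Python fragment strips the seasonal index and the trend and then smooths the result. Here seasonal_index and trend are assumed to have been computed beforehand (for example by the ratio-to-moving-average method and a fitted trend line); the names are illustrative only:

    import numpy as np

    def cyclic_irregular(y, seasonal_index, trend):
        # Multiplicative model: divide out S, then T, leaving C x I.
        # seasonal_index holds one index per observation on a 1.0 = 100% scale.
        deseasonalised = y / seasonal_index    # T x C x I
        return 100 * deseasonalised / trend    # C x I as cyclical relatives (%)

    def smooth(ci, window=3):
        # Short-term moving average to damp the irregular component.
        kernel = np.ones(window) / window
        return np.convolve(ci, kernel, mode="valid")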

4.4.2 Direct Analysis


The method consists of expressing each month as a percentage of the corresponding month of the preceding year. This eliminates the seasonal variation and much of the trend; note, however, that the trend will not be eliminated completely, since the percentages will tend to be above 100 if the trend is upward and below 100 if it is downward. A variant of the direct method expresses each month as a percentage of the average for the corresponding month over several previous years. The number of years considered in computing the average varies with the length of the cycles in the series, so a decision must be made on the length of individual cycles before the cyclic component may be computed.
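A sketch of the basic version in Python (the function name is illustrative): each observation is expressed as a percentage of the corresponding period one seasonal cycle earlier, so the first cycle of data has no predecessor and is dropped:

    import numpy as np

    def direct_relatives(y, s=12):
        # Each value as a percentage of the value s periods (one year) earlier.
        y = np.asarray(y, dtype=float)
        return 100 * y[s:] / y[:-s]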

4.4.3 Harmonic Analysis
A sine-cosine curve may be used to estimate cyclic movements if the cycles have about the same duration and amplitude. The sine-cosine curve may be fitted to the cyclical-irregular data after the irregular movements have been smoothed. Because this type of series is rare in business and the social sciences, we shall not discuss the method further.

Exercise 4.1: Bean Production in Kiboga District

Year Yt Year Yt
1991 8.29 2000 10.57
1992 8.00 2001 11.71
1993 8.57 2002 11.71
1994 8.86 2003 10.57
1995 9.71 2004 11.14
1996 9.71 2005 10.57
1997 9.43 2006 11.70
1998 8.86 2007 12.80
1999 9.71

(i) Fit a linear trend to the bean production series.
(ii) Detrend the series and test for the presence of a cyclic component using von Neumann's ratio test (assume α = 5%).
(iii) Describe the different methods for estimating the cyclic component.

Chapter Five: Exponential Smoothing

Suppose we have a time series observed over time as Y1, Y2, ..., Yt-1, Yt, and let Ft+1, Ft+2, ..., Ft+r denote the forecasts of future values of Y.

5.1 Single exponential smoothing


The formula for single exponential smoothing (SES) is as follows:

Ft+1 = αYt + (1 − α)Ft,

where α is a given weight which shall be selected subject to the constraint 0 < α < 1. Thus Ft+1 is
the weighted average of the current observation, Yt, with the forecast, Ft, made at the previous
time point t − 1.

On repeated application of the above formula you obtain:

t 1
Ft 1  (1   ) F1    (1   ) j Yt  j
t

j 0

showing that the weight given to past observations Yt, Yt-1, Yt-2, ... drops off exponentially. The rate at which it drops depends upon α.

The single exponential smoothing needs to be initialized. A simple way to initialize the forecast
is to let:

F2 = Y1.
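A minimal implementation of SES under this initialization might look as follows (plain Python; the function name is illustrative):

    def ses(y, alpha):
        # Single exponential smoothing: F_{t+1} = alpha*Y_t + (1 - alpha)*F_t,
        # initialised with F_2 = Y_1. Returns the one-step-ahead forecasts
        # F_2, ..., F_{n+1} for a series y of length n.
        forecasts = [y[0]]                # F_2 = Y_1
        for obs in y[1:]:
            forecasts.append(alpha * obs + (1 - alpha) * forecasts[-1])
        return forecasts

For example, ses([16, 15, 17, 18], alpha=0.3) returns the forecasts F2 through F5.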

5.2 Holt's Linear Exponential Smoothing


Holt's linear exponential smoothing is an extension of single exponential smoothing that takes into account a possible linear trend. There are two smoothing constants, α and β. The equations become:

$L_t = \alpha Y_t + (1-\alpha)(L_{t-1} + b_{t-1})$
$b_t = \beta (L_t - L_{t-1}) + (1-\beta) b_{t-1}$
$F_{t+r} = L_t + b_t r$

Where:

Lt - Estimates of the level of the series at time t

bt - Linear trend of the series at time t

Ft+r - Forecast made at time t for r periods ahead.

The initial estimates of L1 and b1 are as follows:

L1 = Y1 and b1 = 0.

If zero is not a typical value of the initial slope then a more careful estimate of the slope may be
needed to ensure that the initial forecasts are realistic.
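Under the simple initialization L1 = Y1 and b1 = 0, Holt's recursions can be sketched as follows (names illustrative):

    def holt(y, alpha, beta, r=1):
        # Holt's linear exponential smoothing with L_1 = Y_1 and b_1 = 0.
        # Returns the r-step-ahead forecast F_{n+r} = L_n + b_n * r.
        level, trend = y[0], 0.0
        for obs in y[1:]:
            prev_level = level
            level = alpha * obs + (1 - alpha) * (level + trend)
            trend = beta * (level - prev_level) + (1 - beta) * trend
        return level + trend * r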

5.3 Holt-Winter's Method


Holt-Winter's method is an extension of Holt's linear exponential smoothing that takes seasonality into account. There are two versions, of which the multiplicative is the more widely used.

5.3.1 Holt-Winter's Method, Multiplicative Seasonality

In this case the equations are:

$L_t = \alpha \dfrac{Y_t}{S_{t-s}} + (1-\alpha)(L_{t-1} + b_{t-1})$
$b_t = \beta (L_t - L_{t-1}) + (1-\beta) b_{t-1}$
$S_t = \gamma \dfrac{Y_t}{L_t} + (1-\gamma) S_{t-s}$
$F_{t+r} = (L_t + b_t r) S_{t-s+r}$

where s is the number of periods in one cycle of seasons e.g. number of months or quarters in a
year.

To initialize we need one complete cycle of data, i.e. s values. Then set

$L_s = \frac{1}{s}(Y_1 + Y_2 + \dots + Y_s)$
To initialize the trend we use s + k time periods:

$b_s = \frac{1}{k}\left(\frac{Y_{s+1}-Y_1}{s} + \frac{Y_{s+2}-Y_2}{s} + \dots + \frac{Y_{s+k}-Y_k}{s}\right)$
If the series is long enough then a good choice is to make k = s so that two complete cycles are
used. However we can, at a pinch, use k = 1.

Initial seasonal indices can be taken as

$S_k = \dfrac{Y_k}{L_s}, \qquad k = 1, 2, \dots, s$
The parameters α, β, γ should lie in the interval (0, 1).
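The recursions and the initialization above translate directly into code. The sketch below assumes at least two complete cycles of data (so that k = s can be used for the trend) and a forecast horizon r ≤ s:

    def holt_winters_mult(y, s, alpha, beta, gamma, r=1):
        # Holt-Winter's method, multiplicative seasonality. Initialisation as
        # in the text: L_s = mean of first cycle, b_s with k = s, S_k = Y_k / L_s.
        level = sum(y[:s]) / s
        trend = sum((y[s + i] - y[i]) / s for i in range(s)) / s
        season = [y[k] / level for k in range(s)]
        for t in range(s, len(y)):
            prev_level = level
            level = alpha * y[t] / season[t - s] + (1 - alpha) * (level + trend)
            trend = beta * (level - prev_level) + (1 - beta) * trend
            season.append(gamma * y[t] / level + (1 - gamma) * season[t - s])
        return (level + trend * r) * season[len(y) - s + r - 1]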

5.3.2 Holt-Winter's Method, Additive Seasonality

The equations are

$L_t = \alpha (Y_t - S_{t-s}) + (1-\alpha)(L_{t-1} + b_{t-1})$
$b_t = \beta (L_t - L_{t-1}) + (1-\beta) b_{t-1}$
$S_t = \gamma (Y_t - L_t) + (1-\gamma) S_{t-s}$
$F_{t+r} = L_t + b_t r + S_{t-s+r}$

where s is the number of periods in one cycle.

The initial values of Ls and bs can be as in the multiplicative case. The initial seasonal indices
can be taken as

$S_k = Y_k - L_s, \qquad k = 1, 2, \dots, s$
The parameters α, β, γ should lie in the interval (0, 1).
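The additive version differs from the multiplicative sketch above only in replacing divisions by the seasonal index with subtractions:

    def holt_winters_add(y, s, alpha, beta, gamma, r=1):
        # Holt-Winter's method, additive seasonality; initialisation mirrors
        # the multiplicative case except S_k = Y_k - L_s. Assumes r <= s and
        # at least two complete cycles of data.
        level = sum(y[:s]) / s
        trend = sum((y[s + i] - y[i]) / s for i in range(s)) / s
        season = [y[k] - level for k in range(s)]
        for t in range(s, len(y)):
            prev_level = level
            level = alpha * (y[t] - season[t - s]) + (1 - alpha) * (level + trend)
            trend = beta * (level - prev_level) + (1 - beta) * trend
            season.append(gamma * (y[t] - level) + (1 - gamma) * season[t - s])
        return level + trend * r + season[len(y) - s + r - 1]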

Chapter Six: Box-Jenkins Methodology

Box-Jenkins ARIMA modelling of time series was introduced in 1976 by Box and Jenkins. ARIMA stands for AR = autoregressive, I = integrated, MA = moving average.

6.1 Stationarity

A key concept underlying time series processes is that of stationarity. A time series is covariance stationary when it has the following three characteristics:

 Exhibits mean reversion in that it fluctuates around a constant long-run mean;


 Has a finite variance that is time-invariant;
 Has a theoretical correlogram that diminishes as the lag length increases.

A time series is said to be stationary if;

(a) E(Yt) = constant for all t;


(b) Var(Yt) = constant for all t;
(c) Cov(Yt, Yt+k) = constant for all t and all k,
or if its mean, its variance and its covariances remain constant over time.

Stationarity is important because if the series is non-stationary then all the typical results of classical regression analysis are invalid. Regressions with non-stationary series may have no meaning and are therefore called “spurious”. The long-term forecast of a stationary series will converge to the unconditional mean of the series.

6.2 Autoregressive (AR) time series models

The AR(1) model

$Y_t = \alpha Y_{t-1} + u_t$  (1)

where we do not include a constant in the AR(1), $|\alpha| < 1$, and $u_t$ is white noise (the error term).

Condition for stationarity

For equation (1), the constraint is that $|\alpha| < 1$. If $|\alpha| \geq 1$, then $Y_t$ is non-stationary; in particular, if $|\alpha| > 1$ it will grow bigger and bigger with time and the series will explode.

The autoregressive AR(p) model

A generalization of the AR(1) model is AR(p) model

An AR(2) model shall be of the form:

$Y_t = \alpha_1 Y_{t-1} + \alpha_2 Y_{t-2} + u_t$  (2)

The AR(p) model takes the form:

$Y_t = \alpha_1 Y_{t-1} + \alpha_2 Y_{t-2} + \dots + \alpha_p Y_{t-p} + u_t$  (3)

or using the summation symbol:


$Y_t = \sum_{i=1}^{p} \alpha_i Y_{t-i} + u_t$  (4)

We may use the lag operator L (which has the property $L^n Y_t = Y_{t-n}$) and write the AR(p) model as:

$Y_t (1 - \alpha_1 L - \alpha_2 L^2 - \dots - \alpha_p L^p) = u_t$  (5)

$\Phi(L) Y_t = u_t$  (6)

where $\Phi(L)$ is a polynomial in the lag operator L.

Stationarity in the AR(P) Model

The AR(p) process is stationary only if the p roots of the polynomial equation $\Phi(z) = 0$ are greater than 1 in absolute value, i.e. lie outside the unit circle.

Using the polynomial notation, the condition for the AR(1) process reduces to the following:

$(1 - \alpha z) = 0$  (7)

The single root is $z = 1/\alpha$; requiring it to be greater than 1 in absolute value gives:

$\left| \dfrac{1}{\alpha} \right| > 1$  (8)

$|\alpha| < 1$  (9)

For the AR(p) model to be stationary, a necessary condition is that the sum of the p autoregressive coefficients be less than 1:

$\sum_{i=1}^{p} \alpha_i < 1$  (10)
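The root condition can be checked numerically. The sketch below (numpy assumed, illustrative coefficients) finds the roots of $\Phi(z) = 1 - \alpha_1 z - \alpha_2 z^2$ for an AR(2):

    import numpy as np

    a1, a2 = 0.5, 0.3                  # illustrative AR(2) coefficients
    # np.roots takes coefficients from the highest power of z downwards:
    roots = np.roots([-a2, -a1, 1.0])  # roots of 1 - a1*z - a2*z**2
    print(np.abs(roots))               # stationary if all moduli exceed 1
    print(a1 + a2 < 1)                 # the summary condition in (10)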

Properties of the AR models

The mean and variance of an AR(1) process are given by:

$E(Y_t) = E(\alpha Y_{t-1} + u_t) = \alpha E(Y_{t-1}) = 0$

where $Y_{t+1} = \alpha Y_t + u_{t+1}$. Substituting repeatedly for lagged $Y_t$ we have:

$Y_{t+1} = \alpha^{t+1} Y_0 + (\alpha^t u_1 + \alpha^{t-1} u_2 + \dots + \alpha^0 u_{t+1})$

and since $|\alpha| < 1$, $\alpha^t$ will be close to zero for large t. Thus we have that:

$E(Y_{t+1}) = 0$  (11)

and
$Var(Y_t) = Var(\alpha Y_{t-1} + u_t) = \alpha^2 \sigma_Y^2 + \sigma_u^2 \;\Rightarrow\; \sigma_Y^2 = \dfrac{\sigma_u^2}{1-\alpha^2}$  (12)

The covariance of two random variables Xt and Zt is defined to be:

$Cov(X_t, Z_t) = E\{[X_t - E(X_t)][Z_t - E(Z_t)]\}$  (13)

$Cov(Y_t, Y_{t-1}) = E\{[Y_t - E(Y_t)][Y_{t-1} - E(Y_{t-1})]\}$  (14)

This is the autocovariance function.

For the AR(1) model the autocovariance function will be given by:

$Cov(Y_t, Y_{t-1}) = E(Y_t Y_{t-1}) - E(Y_t)E(Y_{t-1}) = E(Y_t Y_{t-1})$
$= E[(\alpha Y_{t-1} + u_t) Y_{t-1}] = \alpha E(Y_{t-1}^2) + E(u_t Y_{t-1}) = \alpha \sigma_Y^2$  (15)

It may be shown that:

$Cov(Y_t, Y_{t-2}) = E(Y_t Y_{t-2}) = E[(\alpha Y_{t-1} + u_t) Y_{t-2}] = E\{[\alpha(\alpha Y_{t-2} + u_{t-1}) + u_t] Y_{t-2}\}$
$= E(\alpha^2 Y_{t-2}^2) = \alpha^2 \sigma_Y^2$  (16)

and in general:

$Cov(Y_t, Y_{t-k}) = \alpha^k \sigma_Y^2$  (17)

The autocorrelation function will be given by:

$Cor(Y_t, Y_{t-k}) = \dfrac{Cov(Y_t, Y_{t-k})}{\sqrt{Var(Y_t)Var(Y_{t-k})}} = \dfrac{\alpha^k \sigma_Y^2}{\sigma_Y^2} = \alpha^k$  (18)

For an AR(1) time series, the autocorrelation function (ACF), a plot of Cor(Yt, Yt-k) against k known as the correlogram, decays exponentially as k increases. Finally, the partial autocorrelation function (PACF) plots against k the estimated coefficient on Yt-k from an ordinary least squares (OLS) estimate of an AR(k) process.

If the observations are generated by an AR(p) process then the theoretical partial autocorrelations
will be high and significant for up to p lags and zero for lags beyond p.
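These patterns can be illustrated by simulation. A sketch (numpy assumed; α = 0.7 is illustrative) comparing the sample ACF of a simulated AR(1) with the theoretical α^k decay of equation (18):

    import numpy as np

    rng = np.random.default_rng(0)
    alpha, n = 0.7, 5000
    y = np.zeros(n)
    for t in range(1, n):              # simulate Y_t = alpha*Y_{t-1} + u_t
        y[t] = alpha * y[t - 1] + rng.standard_normal()

    def sample_acf(x, k):
        x = x - x.mean()
        return np.sum(x[k:] * x[:-k]) / np.sum(x * x)

    for k in (1, 2, 3):                # sample ACF vs theoretical alpha**k
        print(k, round(sample_acf(y, k), 3), round(alpha ** k, 3))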

6.3 Moving average (MA) models

The MA(1) model

$Y_t = u_t + \theta u_{t-1}$  (19)

The MA(q) model

$Y_t = u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2} + \dots + \theta_q u_{t-q}$  (20)

which can be rewritten as:

$Y_t = u_t + \sum_{j=1}^{q} \theta_j u_{t-j}$  (21)

Or, using the lag operator:

$Y_t = (1 + \theta_1 L + \theta_2 L^2 + \dots + \theta_q L^q) u_t = \Theta(L) u_t$  (22)

Because any MA(q) process is, by definition, a weighted average of stationary white-noise terms, it follows that every moving average model is stationary, as long as q is finite.

Invertibility in MA(q) models

A property often discussed in connection with moving average processes is that of invertibility. A time series Yt is invertible if it can be represented by a finite-order MA or a convergent autoregressive process.

Invertibility is important because the use of the ACF and PACF for identification implicitly assumes that the Yt sequence can be well approximated by an autoregressive model.

As an example, consider the simple MA(1) model:

$Y_t = u_t + \theta u_{t-1}$  (23)

Using the lag operator this can be rewritten as:

$Y_t = (1 + \theta L) u_t$

$u_t = \dfrac{Y_t}{1 + \theta L}$  (24)

If $|\theta| < 1$, then the right-hand side of (24) can be expanded as the sum of an infinite geometric progression:

$u_t = Y_t (1 - \theta L + \theta^2 L^2 - \theta^3 L^3 + \dots)$  (25)

To see this directly, start from the MA(1) process:

$Y_t = u_t + \theta u_{t-1}$

$u_{t-1} = Y_{t-1} - \theta u_{t-2}$

Substituting this into the original expression we have:

$Y_t = u_t + \theta(Y_{t-1} - \theta u_{t-2}) = u_t + \theta Y_{t-1} - \theta^2 u_{t-2}$

$= u_t + \theta Y_{t-1} - \theta^2 Y_{t-2} + \theta^3 u_{t-3}$

Repeating this an infinite number of times, we finally obtain expression (25). The MA(1) process has been inverted into an infinite-order AR process with geometrically declining weights.

Note that for the MA(1) process to be invertible it is necessary that $|\theta| < 1$.

In general, the MA(q) processes are invertible if the roots of the polynomial equation

$\Theta(z) = 0$  (26)

are greater than 1 in absolute value.

Properties of the MA models

The mean of the MA process will clearly be equal to zero, as it is the mean of white-noise terms. The variance will be given by:

$Var(Y_t) = Var(u_t + \theta u_{t-1}) = \sigma_u^2 + \theta^2 \sigma_u^2 = \sigma_u^2 (1 + \theta^2)$  (27)

The autocovariance will be given by:

$Cov(Y_t, Y_{t-1}) = E[(u_t + \theta u_{t-1})(u_{t-1} + \theta u_{t-2})]$
$= E(u_t u_{t-1}) + \theta E(u_t u_{t-2}) + \theta E(u_{t-1}^2) + \theta^2 E(u_{t-1} u_{t-2})$  (28)

$= \theta \sigma_u^2$  (29)

And since 𝑢𝑡 is serially uncorrelated it is easy to see that:

$Cov(Y_t, Y_{t-k}) = 0 \quad \text{for } k > 1$  (30)

From this we can understand that for the MA(1) process the autocorrelation function will be:

$Cor(Y_t, Y_{t-k}) = \dfrac{Cov(Y_t, Y_{t-k})}{\sqrt{Var(Y_t)Var(Y_{t-k})}} = \begin{cases} \dfrac{\theta \sigma_u^2}{\sigma_u^2 (1+\theta^2)} = \dfrac{\theta}{1+\theta^2} & \text{for } k = 1 \\ 0 & \text{for } k > 1 \end{cases}$  (31)

So, if we have an MA(q) model we expect the correlogram (ACF) to have spikes up to lag k = q and then drop to zero immediately, while the partial autocorrelation function (PACF) for an MA process should decay slowly.
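The single-spike pattern of (31) can likewise be checked by simulation (numpy assumed; θ = 0.6 is illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    theta, n = 0.6, 5000
    u = rng.standard_normal(n + 1)
    y = u[1:] + theta * u[:-1]         # Y_t = u_t + theta*u_{t-1}

    def sample_acf(x, k):
        x = x - x.mean()
        return np.sum(x[k:] * x[:-k]) / np.sum(x * x)

    print(theta / (1 + theta ** 2))    # theoretical lag-1 ACF, about 0.441
    print([round(sample_acf(y, k), 3) for k in (1, 2, 3)])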

6.4 ARMA models

We can have combinations of the two processes to give a new series of models called ARMA(p, q) models. The general form of the ARMA(p, q) model is the following:

$Y_t = \alpha_1 Y_{t-1} + \alpha_2 Y_{t-2} + \dots + \alpha_p Y_{t-p} + u_t + \theta_1 u_{t-1} + \theta_2 u_{t-2} + \dots + \theta_q u_{t-q}$  (32)

which can be rewritten, using summations, as:

$Y_t = \sum_{i=1}^{p} \alpha_i Y_{t-i} + u_t + \sum_{j=1}^{q} \theta_j u_{t-j}$  (33)

or, using the lag operator:

$Y_t (1 - \alpha_1 L - \alpha_2 L^2 - \dots - \alpha_p L^p) = (1 + \theta_1 L + \theta_2 L^2 + \dots + \theta_q L^q) u_t$

$\Phi(L) Y_t = \Theta(L) u_t$  (34)

In the ARMA(p, q) model the condition for stationarity concerns the AR(p) part of the specification only.

Integrated processes and the ARIMA models

A non-stationary series can only be made stationary through differencing or detrending; the first difference is:

$\Delta Y_t = Y_t - Y_{t-1}$  (35)

Most economic and financial time series exhibit trends to some degree, so we commonly end up taking first differences of the series. If, after first differencing, a series is stationary, then the series is said to be integrated of order one, denoted I(1).
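A common way to check the order of integration in practice is an augmented Dickey-Fuller test. The sketch below uses statsmodels' adfuller (whose first two return values are the test statistic and the p-value) on a simulated random walk:

    import numpy as np
    from statsmodels.tsa.stattools import adfuller

    rng = np.random.default_rng(2)
    y = np.cumsum(rng.standard_normal(500))  # random walk: I(1), non-stationary
    print(adfuller(y)[1])                    # large p-value: unit root in levels
    print(adfuller(np.diff(y))[1])           # small p-value: difference is stationary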

6.5 ARIMA models

If a process Yt has an ARIMA(p, d, q) representation, then $\Delta^d Y_t$ has an ARMA(p, q) representation, as presented by the equation below:

$\Delta^d Y_t (1 - \alpha_1 L - \alpha_2 L^2 - \dots - \alpha_p L^p) = (1 + \theta_1 L + \theta_2 L^2 + \dots + \theta_q L^q) u_t$  (36)

Box-Jenkins model selection

In general, Box-Jenkins popularized a three-stage method aimed at selecting an appropriate (parsimonious) ARIMA model for the purpose of estimating and forecasting a univariate time series.

The three stages are:

(a) identification,
(b) estimation, and
(c) diagnostic checking.

Identification

A comparison of the sample ACF and PACF with those of various theoretical ARIMA processes may suggest several plausible models. If the series is non-stationary, its ACF will not die down or show signs of decay at all. A common stationarity-inducing transformation is to take logarithms and then first differences of the series. Once we have achieved stationarity, the next step is to identify the p and q orders of the ARIMA model.

Model ACF PACF

Pure white noise All autocorrelations are zero All partial autocorrelations are zero

MA(1) Single positive spike at lag 1 Damped exponential decay

AR(1) Damped exponential decay Single positive spike at lag 1

ARMA(1,1) Decay beginning at lag 1 Decay beginning at lag 1

ARMA(p,q) Decay beginning at lag q Decay beginning at lag p

Estimation

In this second stage, the estimated models are compared using the AIC and BIC.

Diagnostic checking

In the diagnostic checking stage we examine the goodness of fit of the model. We must be careful here to avoid overfitting (the procedure of adding coefficients that are not appropriate). The special statistics that we use here are the Box-Pierce (BP) statistic and the Ljung-Box (LB) Q-statistic, which serve to test for autocorrelation in the residuals.
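The three stages can be sketched with statsmodels (API as in recent versions; the candidate orders and the simulated data are illustrative): candidate models are compared on AIC/BIC at the estimation stage, and the residuals of each are then checked with the Ljung-Box Q-statistic:

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA
    from statsmodels.stats.diagnostic import acorr_ljungbox

    rng = np.random.default_rng(3)
    y = np.cumsum(rng.standard_normal(300))   # illustrative I(1) series

    for order in [(1, 1, 0), (0, 1, 1), (1, 1, 1)]:
        res = ARIMA(y, order=order).fit()     # estimation stage
        lb = acorr_ljungbox(res.resid, lags=[10], return_df=True)
        print(order, round(res.aic, 1), round(res.bic, 1),
              round(float(lb["lb_pvalue"].iloc[0]), 3))  # p > 0.05: residuals look white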

References
Box, George E. P. and Gwilym M. Jenkins (1976), Time Series Analysis: Forecasting and Control, Revised Edition, Holden-Day, Oakland, CA.

Croxton, Frederick E., Cowden, Dudley J. and Klein, Sidney (1967), Applied General Statistics, Third Edition, Sir Isaac Pitman and Sons Ltd, London.

Kendall, M. G. and Stuart, A. (1968), The Advanced Theory of Statistics, Volume 3: Design and Analysis, and Time-Series, 2nd Edition, Griffin, London.
