Modeling and Forecasting New
N and longitude 5°45′.
Model identification is done by comparing the theoretical
patterns of the ACF and PACF of the various ARIMA models
with the sample ACF and PACF computed from empirical data
(Janacek and Swift, 1993). A suitable model is inferred by
matching these patterns. Generally (Brooks, 2002),
ARIMA (0, d, q) is indicated by spikes up to lag q and a cut
to zero thereafter of the ACF values $\rho_k$, complemented by
an exponential decay or damped sine wave of the PACF values
$\phi_{kk}$. Conversely, ARIMA (p, d, 0) is identified by
exponential decay or damped sine wave of the ACF values
$\rho_k$, complemented by spikes up to lag p and a cut to zero
thereafter of the PACF values $\phi_{kk}$.
When the process is an ARIMA (0, d, q)×(0, D, Q) with
seasonal period s, spikes will be noticed in the ACF up to
lag q+Qs, while ARIMA (p, d, 0)×(P, D, 0) is indicated by
spikes up to lag p+Ps and a cut to zero thereafter of the
PACF.
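The identification rules above rely on the sample ACF and PACF. The following is a minimal sketch of how these could be computed, assuming the usual biased autocorrelation estimator and the Durbin-Levinson recursion for the partial autocorrelations; the function names `sample_acf` and `sample_pacf` and the simulated AR(1) series are illustrative assumptions, not part of the original study.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations r_0..r_max_lag (biased estimator, divisor n)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    c0 = np.dot(x, x)
    lags = [np.dot(x[k:], x[:-k]) / c0 for k in range(1, max_lag + 1)]
    return np.array([1.0] + lags)

def sample_pacf(x, max_lag):
    """Sample partial autocorrelations phi_kk via the Durbin-Levinson recursion."""
    r = sample_acf(x, max_lag)
    pacf = np.zeros(max_lag + 1)
    pacf[0] = 1.0
    phi_prev = np.zeros(0)  # coefficients phi_{k-1,1..k-1} of the previous order
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = r[1]
            phi = np.array([phi_kk])
        else:
            num = r[k] - np.dot(phi_prev, r[k - 1:0:-1])
            den = 1.0 - np.dot(phi_prev, r[1:k])
            phi_kk = num / den
            phi = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pacf[k] = phi_kk
        phi_prev = phi
    return pacf

# Illustration on a simulated AR(1) series (hypothetical data, phi = 0.7):
# the ACF should decay gradually while the PACF should cut off after lag 1.
rng = np.random.default_rng(0)
e = rng.standard_normal(500)
x = np.zeros(500)
for t in range(1, 500):
    x[t] = 0.7 * x[t - 1] + e[t]
print(np.round(sample_acf(x, 5), 2))
print(np.round(sample_pacf(x, 5), 2))
```

For a pure AR(p) process the printed PACF values beyond lag p should lie close to zero, which is the "cut" pattern described above.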
However, the mixed SARIMA model is difficult to identify
from visual inspection of ACF and PACF plots alone. In this
work, we use the model identification discussed above to give
a rough guess of possible values of p, q, P and Q, from which
several models are postulated. We then use the model
selection criteria of Residual Sum of Squares, RSS (Box and
Jenkins, 1976), and Akaike's Information Criterion, AIC
(Akaike, 1974), to choose the best model. The AIC is computed
as
$AIC = -2 \log L + 2m$,
where m = p+q+P+Q is the number of parameters in the model
and L is the likelihood function. The best model is the one
with the lowest AIC value. It is however noted that the
likelihood is likely to increase with the addition of more
parameters to the model. This further reduces the value of
the AIC, leading to the choice of a model with many
parameters. Wei (1990) emphasizes the need for the chosen
model to meet the criteria of model adequacy and parsimony.
For this reason we complement the RSS and AIC with Schwarz's
Bayesian Criterion, SBC (Schwarz, 1978).
The SBC is computed as
$SBC = -2 \log L + m \log n$,
where m = p+q+P+Q is the number of parameters in the model,
n is the number of observations and L is the likelihood
function. The SBC introduces a penalty function to check
excess parameters in the model.
Having identified a suitable SARIMA model, the next stage is
the estimation of the parameters of the identified model,
which is done by the exact maximum likelihood method due to
Melard (1984). Forecasting is by least squares, using an
algorithm due to Brockwell and Davis (1991). When the
estimated parameters are not significant, we do correlation
analysis to remove redundant parameters.
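To illustrate how the two criteria trade fit against parsimony, here is a small sketch of the AIC and SBC formulas given above. The log-likelihood values, parameter counts and sample size in the example are hypothetical and chosen only to show a case in which the two criteria disagree.

```python
import math

def aic(log_likelihood, m):
    """AIC = -2 log L + 2m, with m the number of estimated parameters."""
    return -2.0 * log_likelihood + 2.0 * m

def sbc(log_likelihood, m, n):
    """SBC = -2 log L + m log n, with n the number of observations."""
    return -2.0 * log_likelihood + m * math.log(n)

# Hypothetical comparison: the larger model raises log L by 1.2 for one
# extra parameter. AIC charges 2 per parameter, SBC charges log(n).
n_obs = 150                      # assumed sample size
small = {"loglik": 410.0, "m": 4}
big = {"loglik": 411.2, "m": 5}

aic_small, aic_big = aic(small["loglik"], small["m"]), aic(big["loglik"], big["m"])
sbc_small = sbc(small["loglik"], small["m"], n_obs)
sbc_big = sbc(big["loglik"], big["m"], n_obs)

print("AIC prefers:", "big" if aic_big < aic_small else "small")
print("SBC prefers:", "big" if sbc_big < sbc_small else "small")
```

In this made-up case the AIC favours the larger model while the SBC favours the smaller one, which is the over-parameterisation check that the SBC is meant to provide.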
Research Journal in Engineering and Applied Sciences (ISSN: 2276-8467) 2(5):370-375
Modeling And Forecasting Maximum Temperature Of Warri City- Nigeria
The test for model adequacy requires residual analysis, which
is done by inspecting the ACF of the residuals obtained by
fitting the identified model. If the model is adequate then
the residuals should be a white noise process. Under the
assumption that the residual is a white noise process, the
standard error of the autocorrelation function should be
approximately $1/\sqrt{n}$ (Anderson, 1942). Hence under the
white noise assumption, 95% of the autocorrelation values
should fall within the range $\pm 1.96/\sqrt{n}$. If more than
5% fall outside this range then the residual process is not
white noise. We complement the visual inspection of the
residual ACF with the portmanteau test of Ljung and Box
(1978). This test provides a Q statistic defined by
$Q = n(n+2) \sum_{k=1}^{m} (n-k)^{-1} r_k^2$,    (2)
where $r_k$ is the autocorrelation value of the residual at
lag k, m is here the number of residual autocorrelations used
and n=N-d-D. Q is approximately distributed as
$\chi^2(m-p-q-P-Q)$.
The technique here is to choose a level of significance and
compare the computed Q with the tabulated $\chi^2$ with
m-p-q-P-Q degrees of freedom. If the model is inappropriate,
the Q value will be inflated when compared with the tabulated
$\chi^2$.
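The Q statistic of equation (2) and the white noise band can be sketched as follows. This is a minimal illustration, assuming the residuals are available as a one-dimensional array; the function names `ljung_box_q` and `white_noise_band` are hypothetical, and the tabulated chi-square value still has to be looked up separately.

```python
import numpy as np

def ljung_box_q(residuals, m):
    """Ljung-Box Q = n(n+2) * sum_{k=1..m} r_k^2 / (n-k) for a residual series."""
    e = np.asarray(residuals, dtype=float)
    e = e - e.mean()
    n = e.size
    c0 = np.dot(e, e)
    total = 0.0
    for k in range(1, m + 1):
        r_k = np.dot(e[k:], e[:-k]) / c0   # lag-k residual autocorrelation
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total

def white_noise_band(n):
    """Half-width of the approximate 95% band, 1.96 / sqrt(n)."""
    return 1.96 / np.sqrt(n)

# Tiny worked case: the alternating series has r_1 = -0.75, so with m = 1
# Q = 4 * 6 * 0.75**2 / 3 = 4.5.
print(ljung_box_q([1, -1, 1, -1], m=1))
```

The computed Q would then be compared with the tabulated chi-square value at m-p-q-P-Q degrees of freedom, as described in the text.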
RESULT AND DISCUSSION
To decide on the presence of trend and time-varying variance,
we inspect the time plot of the Warri maximum temperature
data in Fig 1 side by side with the ACF and PACF of the data,
shown in Fig 2 and Fig 3 respectively.
Fig 1. Time Plot of Maximum Temperature (x-axis: MONTH, period 12; y-axis: Max Temp, 26 to 36)
Fig 2. ACF Plot of Max Temp (x-axis: Lag Number, 1 to 16; y-axis: ACF, -1.0 to 1.0; series: Coefficient, Confidence Limits)
Fig 3. PACF Plot of Max Temp (x-axis: Lag Number, 1 to 16; y-axis: Partial ACF, -1.0 to 1.0; series: Coefficient, Confidence Limits)
Examination of Fig 1 clearly shows the presence of
time-varying variance and seasonal variation, while the
failure of the ACF and PACF values to decay in Figs 2 and 3
respectively is an indication of a regular trend. However, we
are unable to decide at this stage on the presence or
otherwise of a seasonal trend. We perform a logarithm
transform and a first regular difference so as to stabilize
the variance and remove the trend. A time plot of max
temperature after the logarithm and first difference
transforms is shown in Fig 4 below.
Fig 4. Time Plot of Maximum Temp (Logarithm Transform and First Difference) (x-axis: MONTH, period 12; y-axis: Max Temp, -.2 to .2)
On inspecting Fig 4, we note the strong presence of
seasonal factors and suspect the presence of seasonal
trend. This is confirmed by very high spikes at and
around seasonal lags of the ACF as shown in Fig 5.
Fig 5. ACF Plot (Logarithm and First Difference Transform) (x-axis: Lag Number, 1 to 58; y-axis: ACF, -1.0 to 1.0; series: Coefficient, Confidence Limits)
We complete the data preparation process by
additionally performing a first order seasonal
difference and the time plot is shown in Fig 6.
Fig 6. Time Plot of Max Temp (Logarithm, First Difference and Seasonal Difference Transform) (x-axis: MONTH, period 12; y-axis: Max Temp, -.2 to .2)
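The three transformations applied to the data (logarithm, first regular difference, and first seasonal difference at period 12) can be sketched as below. The function name `to_stationary` is an illustrative assumption; the sample input is the 1994 and 1995 columns of the appendix, used only to show the mechanics rather than to reproduce the full analysis.

```python
import numpy as np

def to_stationary(y, s=12):
    """Apply log, then (1 - B), then (1 - B^s) to a series y."""
    z = np.log(np.asarray(y, dtype=float))
    z = z[1:] - z[:-1]        # first regular difference, loses 1 observation
    return z[s:] - z[:-s]     # first seasonal difference, loses s observations

# Monthly maximum temperature for 1994 and 1995, copied from the appendix.
y_1994_1995 = [
    33.0, 34.2, 34.2, 33.2, 33.0, 31.1, 29.2, 28.9, 29.8, 31.6, 31.1, 33.7,
    33.4, 34.4, 34.2, 33.4, 32.4, 31.3, 29.1, 29.4, 30.1, 30.8, 33.2, 32.8,
]
x = to_stationary(y_1994_1995)
print(len(y_1994_1995), "observations in,", len(x), "out")
```

Each regular difference removes one observation and each seasonal difference removes s, so a series of N monthly values leaves N - 1 - 12 values after both differences.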
Visual examination of Fig 6 shows that the process is now
stationary. For clarity of discussion, we from now on refer
to the maximum temperature after logarithm, first regular
difference and first seasonal difference transformations as
the Stationary Process of the Maximum Temperature. Hence we
expect a seasonal ARIMA process of the form
$SARIMA(p, 1, q)(P, 1, Q)_{12}$.
The orders of the model parameters p, q, P and Q are
identified by visual inspection of the ACF and PACF of the
stationary process of the maximum temperature, shown in Figs
7 and 8, to propose several possible models; the model
selection criteria of AIC and SBC are then used to pick the
most appropriate model.
We expect the ACF in Fig 7 to cut off at lag q+Qs. We notice
a cut after lag 25 which, with s=12, suggests a moving
average parameter of order one, i.e. q=1, and a seasonal
moving average parameter of order two, i.e. Q=2. Similarly,
from the PACF in Fig 8 we notice a cut at lag 25, suggesting
an AR parameter of order one, i.e. p=1, and a seasonal
autoregressive parameter of order two, i.e. P=2. Since our
strategy is not to have mixed seasonal factors, we postulate
two models from which, based on the model selection criteria
of RSES, AIC and SBC, the best is selected.
Fig 7. ACF Plot of Stationary Process of Max Temp (x-axis: Lag Number, 1 to 58; y-axis: ACF, -1.0 to 1.0; series: Coefficient, Confidence Limits)
Fig 8. PACF of The Stationary Process of Max Temp (x-axis: Lag Number, 1 to 58; y-axis: Partial ACF, -1.0 to 1.0; series: Coefficient, Confidence Limits)
The two models are SARIMA (1, 1, 1) (0, 1, 2) and
SARIMA (1, 1, 1) (2, 1, 0). We extend the search to
models around the two already mentioned. The result
is shown in table 1.
Table 1: Postulated Models and Performance Evaluation
Model                        RSES         AIC           SBC
SARIMA (1, 1, 1)(2, 1, 0)    .07842732    -797.81253    -785.34056
SARIMA (1, 1, 1)(0, 1, 2)    .06904940    -818.31944    -805.84746
SARIMA (1, 1, 0)(1, 1, 2)    .08470936    -783.31059    -770.83862
SARIMA (1, 1, 1)(1, 1, 2)    .06774596    -817.60735    -802.01738
SARIMA (1, 1, 0)(0, 1, 2)    .08598198    -783.37908    -774.0251
SARIMA (0, 1, 1)(1, 1, 2)    .07209436    -809.62574    -797.15376
SARIMA (0, 1, 1)(0, 1, 2)    .07849310    -799.52263    -790.16865
SARIMA (0, 1, 0)(1, 1, 2)    .10477684    -750.3175     -740.96352
SARIMA (0, 1, 0)(0, 1, 2)    .10536211    -751.00055    -744.76457
From table 1, we note that in terms of AIC and SBC,
the SARIMA (1, 1, 1) (0, 1, 2) model performed best.
However it is in competition with SARIMA (1, 1, 1)
(1, 1, 2) that has the lowest RSES. This
notwithstanding, we choose SARIMA (1, 1, 1) (0, 1, 2) as the
best in terms of model parsimony and performance based on
AIC and SBC.
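The selection from Table 1 can be reproduced mechanically. The sketch below copies the table values, picks the minimum of each criterion, and, as an additional consistency check that is our own inference rather than something stated in the paper, backs out the sample size implied by the AIC and SBC formulas of the methodology through SBC - AIC = m(log n - 2).

```python
import math

# (label, (p, q, P, Q), RSES, AIC, SBC), copied from Table 1.
TABLE1 = [
    ("(1,1,1)(2,1,0)", (1, 1, 2, 0), 0.07842732, -797.81253, -785.34056),
    ("(1,1,1)(0,1,2)", (1, 1, 0, 2), 0.06904940, -818.31944, -805.84746),
    ("(1,1,0)(1,1,2)", (1, 0, 1, 2), 0.08470936, -783.31059, -770.83862),
    ("(1,1,1)(1,1,2)", (1, 1, 1, 2), 0.06774596, -817.60735, -802.01738),
    ("(1,1,0)(0,1,2)", (1, 0, 0, 2), 0.08598198, -783.37908, -774.0251),
    ("(0,1,1)(1,1,2)", (0, 1, 1, 2), 0.07209436, -809.62574, -797.15376),
    ("(0,1,1)(0,1,2)", (0, 1, 0, 2), 0.07849310, -799.52263, -790.16865),
    ("(0,1,0)(1,1,2)", (0, 0, 1, 2), 0.10477684, -750.3175, -740.96352),
    ("(0,1,0)(0,1,2)", (0, 0, 0, 2), 0.10536211, -751.00055, -744.76457),
]

best_rses = min(TABLE1, key=lambda row: row[2])[0]
best_aic = min(TABLE1, key=lambda row: row[3])[0]
best_sbc = min(TABLE1, key=lambda row: row[4])[0]

# Implied n from SBC - AIC = m (log n - 2), with m = p + q + P + Q.
implied_n = [
    math.exp((sbc - aic) / sum(orders) + 2.0)
    for _, orders, _, aic, sbc in TABLE1
]

print("lowest RSES:", best_rses)
print("lowest AIC: ", best_aic)
print("lowest SBC: ", best_sbc)
print("implied n:  ", [round(n, 2) for n in implied_n])
```

Every row gives an implied n of about 167, which would be consistent with 180 monthly observations (fifteen years) less the 1 + 12 values lost to regular and seasonal differencing; this reading is an assumption on our part.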
We estimated the parameter values of the chosen
model as shown below.
Table 2: Parameters B in the Model
        B            SEB          T-RATIO      APPROX. PROB.
AR1     .23653389    .07697176     3.072996    .00248460
MA1     .97208876    .03819124    25.453188    .00000000
SMA1    .66546734    .13438867     4.951811    .00000182
SMA2    .22351861    .09816789     2.276901    .02409261
We note that all the parameters are significant.
The chosen model is mathematically of the form
$(1 - 0.2365B)(1 - B)(1 - B^{12}) \log y_t = (1 - 0.9720B)(1 - 0.6655B^{12} - 0.2235B^{24}) a_t$
$(1 - 0.2365B) x_t = (1 - 0.9720B - 0.6655B^{12} - 0.2235B^{24} + 0.6469B^{13} + 0.2172B^{25}) a_t$
$x_t = 0.2365 x_{t-1} + a_t - 0.9720 a_{t-1} - 0.6655 a_{t-12} - 0.2235 a_{t-24} + 0.6469 a_{t-13} + 0.2172 a_{t-25}$
where
$x_t = (1 - B)(1 - B^{12}) \log y_t$
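The expanded moving average polynomial in the model equation can be checked by multiplying the regular and seasonal MA factors. The following sketch does this with a plain polynomial convolution, using the rounded coefficients quoted in the model equation; the variable names are illustrative.

```python
import numpy as np

# Coefficients are indexed by the power of the backshift operator B.
theta = np.array([1.0, -0.9720])     # regular MA factor: 1 - 0.9720 B
Theta = np.zeros(25)                 # seasonal MA factor, powers 0..24
Theta[0] = 1.0
Theta[12] = -0.6655                  # -0.6655 B^12
Theta[24] = -0.2235                  # -0.2235 B^24

ma = np.convolve(theta, Theta)       # product polynomial, powers 0..25

# Print only the non-zero terms of the product.
for power, coef in enumerate(ma):
    if abs(coef) > 0:
        print(f"B^{power}: {coef:+.4f}")
```

The cross terms at B^13 and B^25 are the products 0.9720 × 0.6655 and 0.9720 × 0.2235, which is where the coefficients 0.6469 and 0.2172 in the expanded form come from.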
To verify the suitability of the model, we plot the
autocorrelation values of the residual against lag as
shown in Fig 9.
Fig 9. ACF Plot of Residuals (x-axis: Lag Number, 1 to 58; y-axis: ACF, -1.0 to 1.0; series: Coefficient, Confidence Limits)
We note on inspection of Fig 9 that there is no spike at any
lag, indicating that the residual process is random. We
complement this with the portmanteau test of Ljung and Box.
Computation of the Q value of the portmanteau test, using the
first 25 autocorrelation values of the residual, gives
18.468. When compared with the tabulated chi-square value of
32.7, with 21 degrees of freedom and at the 5% level of
significance, we conclude that the model is a good fit.
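As a worked check of the degrees of freedom and the decision rule, using only the values reported in the text (m = 25 residual autocorrelations and the chosen orders p = 1, q = 1, P = 0, Q = 2), and writing the portmanteau statistic as $Q_{LB}$ to keep it distinct from the seasonal MA order Q:

```latex
\begin{align*}
\text{df} &= m - p - q - P - Q = 25 - 1 - 1 - 0 - 2 = 21,\\
Q_{LB} &= 18.468 \;<\; \chi^2_{0.05}(21) \approx 32.7 .
\end{align*}
```

Since the computed statistic falls below the tabulated value, the white noise hypothesis for the residuals is not rejected at the 5% level.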
Forecast and Model Validation
Below is the 2009 forecast using SARIMA (1, 1, 1) (0, 1, 2) and the empirically observed data for the year.
Table 3: Forecast for 2009
Month        Jan     Feb     March   April   May     June    July    Aug     Sept    Oct     Nov     Dec
Forecast     33.06   34.28   33.98   33.08   32.64   30.82   28.99   29.88   29.87   31.20   32.19   33.39
Observed     33.4    34.4    34.2    33.4    32.4    31.3    29.1    29.4    30.1    30.8    33.2    32.8
Difference   -0.34   -0.12   -0.22   -0.32   0.24    -0.48   -0.11   0.48    -0.23   0.4     -1.01   0.59
A t-test for equality of means shows that the difference
between the forecast and observed means is not significant
at the 1% level of significance. We therefore conclude that
the chosen model can adequately be used to forecast maximum
temperature.
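The text does not say which form of the t-test was used. As one possible reading, the sketch below runs a paired t-test on the twelve monthly differences of Table 3 and also reports the root mean squared error; the paired form and the tabulated two-sided critical value 3.106 for 11 degrees of freedom at the 1% level are our assumptions.

```python
import math

# Forecast and observed values for 2009, copied from Table 3.
forecast = [33.06, 34.28, 33.98, 33.08, 32.64, 30.82,
            28.99, 29.88, 29.87, 31.20, 32.19, 33.39]
observed = [33.4, 34.4, 34.2, 33.4, 32.4, 31.3,
            29.1, 29.4, 30.1, 30.8, 33.2, 32.8]

diffs = [f - o for f, o in zip(forecast, observed)]
n = len(diffs)
mean_diff = sum(diffs) / n
var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)  # sample variance
t_stat = mean_diff / math.sqrt(var_diff / n)                   # paired t statistic
rmse = math.sqrt(sum(d * d for d in diffs) / n)

T_CRIT_1PCT_DF11 = 3.106   # tabulated two-sided 1% critical value, 11 df
significant = abs(t_stat) > T_CRIT_1PCT_DF11

print(f"mean difference = {mean_diff:.4f}")
print(f"t statistic     = {t_stat:.3f}")
print(f"RMSE            = {rmse:.3f}")
print("significant at 1%:", significant)
```

Under this reading the mean forecast error is small relative to its standard error, in line with the conclusion that the difference is not significant.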
CONCLUSION
We have shown that time series ARIMA models can be used to
model and forecast maximum temperature. The identified
SARIMA (1, 1, 1) (0, 1, 2) has proved to be adequate in
forecasting maximum temperature for at least one year.
Researchers will find this result useful in building a
temperature component into a general climatic forecasting
model. Also, environmental managers who require long-term
temperature forecasts will find the identified model very
useful.
However, because only fifteen years of data were available,
we have not been able to identify the changing pattern of
fluctuations of maximum temperature over a century, as this
would require at least one hundred years of data.
REFERENCES
Anderson, R. L. (1942). Distribution of the Serial Correlation Coefficient. Annals of Mathematical Statistics, 13(1), 1-13.
Akaike, H. (1974). A New Look at the Statistical Model Identification. IEEE Transactions on Automatic Control, 19(6), 716-723.
Bindraban, P. S. and Coauthors (2012). Assessing the Impact of Soil Degradation on Food Production. Current Opinion in Environmental Sustainability, 4, 478-488.
Box, G. E. P. and Jenkins, G. M. (1976). Time Series Analysis: Forecasting and Control. Holden-Day, San Francisco, USA.
Brockwell, P. J. and Davis, R. A. (1991). Time Series: Theory and Methods. Springer.
Brooks, C. (2002). Introductory Econometrics for Finance. Cambridge University Press, UK.
Chung, E. S., Park, K. and Lee, K. S. (2011). The Relative Impact of Climate Change and Urbanization on the Hydrological Response of a Korean Urban Watershed. Hydrological Processes, 25, 544-560.
Grace, J. (2004). Understanding and Managing the Global Carbon Cycle. Journal of Ecology, 92, 189-202.
Hipel, K. W., McLeod, A. I. and Lennox, W. (1977). Advances in Box-Jenkins Modeling: Model Construction. Water Resources Research, 13, 567-575.
Ljung, G. M. and Box, G. E. P. (1978). On a Measure of Lack of Fit in Time Series Models. Biometrika, 65, 297-303.
Mayhew, P. J., Jenkins, G. B. and Benton, T. G. (2008). A Long-term Association Between Global Temperature and Biodiversity, Origination and Extinction in the Fossil Record. Proceedings of the Royal Society B, 275, 47-53.
McLeod, A. I. (1995). Diagnostic Checking of Periodic Autoregression Models With Application. The Journal of Time Series Analysis, 15, 221-233.
Melard, G. (1984). A Fast Algorithm for the Exact Likelihood of Autoregressive-Moving Average Models. Applied Statistics, 33(1), 104-119.
Romilly, P. (2005). Time Series Modeling of Global Mean Temperature for Managerial Decision Making. Journal of Environmental Management, 76, 61-70.
Schwarz, G. E. (1978). Estimating the Dimension of a Model. Annals of Statistics, 6(2), 461-464.
Steltzer, H. and Post, E. (2009). Seasons and Life Cycles. Science, 324, 886-887.
APPENDIX: Maximum Temperature (°C) for Warri (1994-2009).
         1994   1995   1996   1997   1998   1999   2000   2001   2002   2003   2004   2005
JAN      33     33.4   32.7   32.8   32.5   32.2   33     32.8   33.1   33.1   32.8   32.3
FEB      34.2   34.4   33.1   34.1   34.2   32.6   33.8   34.1   34.6   34.7   34.2   34.8
MARCH    34.2   34.2   33.3   32.8   33.3   33.5   34.7   33.9   33.2   33.8   34.5   33.5
APRIL    33.2   33.4   32.9   32     32.2   33     33.2   32.7   32.5   33.1   32.7   33.2
MAY      33     32.4   32.5   31.8   31.5   32.5   32.5   32.3   32.3   32.5   31.6   32.7
JUNE     31.1   31.3   30.9   30.1   31.1   30.8   30.6   30.6   30.5   30     30.9   30.6
JULY     29.2   29.1   29.1   28.8   29.1   28.1   28.4   29.7   29.2   28.9   28.7   28.7
AUG      28.9   29.4   29     29.4   29.6   29.5   28.1   27.9   28.7   28.6   28.5   29.4
SEPT     29.8   30.1   28.8   30.4   30.1   28.7   29.7   29.7   28.9   30     30.4   30.9
OCT      31.6   30.8   30.5   31.2   32.4   29.7   29.9   30.1   30.3   32.1   30     31.2
NOV      31.1   33.2   33.2   33.3   32.7   32.5   32.6   32.8   32.6   32.9   31.4   33.6
DEC      33.7   32.8   32.7   32.6   32.3   33.5   33.3   33.4   33.4   32.4   33.2   33.3

         2006   2007   2008   2009
JAN      32.5   33.3   33     33.4
FEB      34.3   34.1   34.2   34.4
MARCH    33.4   33.7   34.2   34.2
APRIL    33     33     33.2   33.4
MAY      32.4   33     33     32.4
JUNE     29.5   31.1   31.1   31.3
JULY     27.8   28.9   29.2   29.1
AUG      27.9   29.1   28.9   29.4
SEPT     29.5   30.5   29.8   30.1
OCT      31.2   32.5   31.6   30.8
NOV      32.8   33.4   31.1   33.2
DEC      33.2   33.1   33.7   32.8