Isiaq Chapter Four New
4.0 INTRODUCTION
This chapter presents and analyzes the data collected on the price of corrugated roofing sheets (Power Hand type) manufactured by Midland Galvanizing Limited, Abeokuta, using the Autoregressive Integrated Moving Average (ARIMA) model so as to predict future prices.
Fig. 4.1: Time series plot of Monthly Power Hand Sales figures
The time series plot in fig. 4.1 shows the price of roofing sheets (Power Hand) over the years. The series indicates non-stationarity, with both upward and downward movements suggesting an irregular pattern. This is also evidenced by the ACF and PACF plots in figs. 4.2 and 4.3.
Figure 4.2: Sample ACF plot of Monthly Power Hand Sales figures
The autocorrelation plot in fig. 4.2 shows significant spikes at lags 1, 2, 5 and 16, as well as irregular variation, with upward spikes at subsequent lags tailing off to zero at lag 23.
Figure 4.3: Sample PACF plot of Monthly Power Hand Sales figures
There are also significant upward spikes at lags 1, 5 and 26, with a slow decay in the PACF.
Table 4.1: Augmented Dickey-Fuller Test for Series Stationarity at Level

Dickey-Fuller   Lag order   P-value
-2.7799         12          0.252
Table 4.1 indicates the presence of a unit root (P-value 0.252 > 0.05 significance level) at the level of the series, as evidenced in figs. 4.1, 4.2 and 4.3 respectively. In view of this, stationarity was achieved by applying first order differencing, as evidenced in figs. 4.4, 4.5 and 4.6; the absence of a unit root was also ascertained by performing the Augmented Dickey-Fuller test of significance, as shown in table 4.2.
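First order differencing replaces each observation with its change from the previous month, which is the "I(1)" step of ARIMA(p, 1, q). A minimal sketch, using hypothetical prices rather than the Midland series:

```python
# Hypothetical monthly prices (illustrative only, not the Midland data).
prices = [5200.0, 5350.0, 5300.0, 5500.0, 5650.0]

# First difference: d_t = y_t - y_(t-1); removes a linear trend so a
# trending series can become stationary.
diff1 = [prices[t] - prices[t - 1] for t in range(1, len(prices))]
print(diff1)  # [150.0, -50.0, 200.0, 150.0]
```

The differenced series is one observation shorter than the original, which is why the ADF test and subsequent modeling are run on the differenced data.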
Fig. 4.4: Time series plot of First Order Differenced Monthly Power Hand Sales figures
It can be seen in figure 4.4 that the trend and irregular variation have disappeared. In the ACF plot of the differenced series (figure 4.5), few spikes are significant: most do not reach the upper and lower bounds of the graph, and the autocorrelation function decays to zero at the respective lags.
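The bounds referred to above are the approximate ±1.96/√n confidence limits drawn on an ACF plot. As an illustrative pure-Python sketch (not the actual R routine) of how the sample autocorrelations behind these plots are computed:

```python
import math

def sample_acf(x, max_lag):
    """Sample autocorrelation r_k = c_k / c_0, where c_k is the lag-k
    sample autocovariance about the series mean."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    acf = []
    for k in range(1, max_lag + 1):
        ck = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / n
        acf.append(ck / c0)
    return acf

# Toy series (not the Midland data): a spike at lag k is judged
# significant when |r_k| exceeds the bound 1.96 / sqrt(n).
x = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
bound = 1.96 / math.sqrt(len(x))
r = sample_acf(x, 2)
```

For this alternating toy series, r_1 ≈ -0.83 lies outside the bound (significant) while r_2 ≈ 0.67 lies inside it, mirroring how spikes are read off figs. 4.2-4.5.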
Figure 4.6: PACF plot of First Order Differenced Monthly Power Hand Sales figures.
Table 4.2: Augmented Dickey-Fuller Test for Series Stationarity at First Order Differencing
Dickey-Fuller   Lag order   P-value
-6.2146         12          0.01563
The stationarity of the series is further confirmed in table 4.2, with an ADF value of -6.2146 and a P-value of 0.01563 < 5% significance level at a lag order of 12. With the series now stationary, we can proceed to model identification. At this point, it is important to identify candidate orders for the AR(p) and MA(q) components; with a few iterations of this model building strategy, we hope to arrive at the best possible model for the series.
The orders of AR(p) and MA(q) were chosen based on the significant spikes in the ACF and PACF plots. The significant spikes in the ACF at lags 1, 3 and 4 in figure 4.5 suggest an MA of order 1, 3 or 4, while the PACF, with significant spikes at lags 1, 2, 4 and 8, suggests the candidate AR orders. To select the best model, we look for the model with the least AIC and σ². Three different models can be identified from the table as giving minimum AIC and estimated sigma square (σ²): ARIMA(4, 1, 5), ARIMA(2, 1, 3) and ARIMA(1, 1, 2). Although ARIMA(4, 1, 5) gives the least AIC (2128.37), estimated σ² (345716) and the highest log-likelihood (-1054.18) of the three, applying the rule of parsimony, which says the simplest model should be chosen provided all necessary conditions are met, resulted in choosing ARIMA(1, 1, 2).
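The AIC used for this comparison penalizes model size; the reported figures are consistent with AIC = 2k - 2·lnL if the nine ARMA coefficients plus the innovation variance are counted as k = 10 for ARIMA(4, 1, 5). A quick check (the parameter count k = 10 is an assumption inferred from the model order, not stated in the text):

```python
def aic(log_likelihood, k):
    """Akaike Information Criterion: AIC = 2k - 2 * lnL."""
    return 2 * k - 2 * log_likelihood

# ARIMA(4, 1, 5): 4 AR + 5 MA coefficients + innovation variance -> k = 10.
value = aic(-1054.18, 10)
print(round(value, 2))  # 2128.36, matching the reported AIC of 2128.37 up to rounding
```

The same formula explains why parsimony matters: a smaller model with only a slightly worse log-likelihood can still win on AIC because its k term is smaller.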
The model specification for ARIMA(1, 1, 2) in table 4.4 is written in the form of the backshift operator as:
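The equation itself did not survive extraction. A plausible reconstruction, using the coefficient estimates reported in the R output for ARIMA(1, 1, 2) (ar1 = 0.4465, ma1 = -1.9649, ma2 = 0.9652) and R's arima() sign convention:

```latex
(1 - \phi_1 B)(1 - B)Y_t = (1 + \theta_1 B + \theta_2 B^2)\varepsilon_t
```

that is, with the fitted values substituted,

```latex
(1 - 0.4465\,B)(1 - B)Y_t = (1 - 1.9649\,B + 0.9652\,B^2)\varepsilon_t
```

where B is the backshift operator (BY_t = Y_{t-1}) and ε_t is white noise.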
Table 4.4 shows that all parameters are significant; that is, for AR(1), MA(1) and MA(2), the absolute value of Z for each parameter estimate is higher than the critical value Z_(α/2) = Z_0.025 = 1.96. This indicates that the model coefficients are efficient in predicting the price of roofing sheets in Nigeria, since all the autoregressive and moving average parameters are statistically significant.
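The significance check above can be sketched directly from the reported estimates and standard errors (an illustrative Python check, not part of the original analysis):

```python
# A parameter is significant at the 5% level when |estimate / s.e.| > 1.96.
# Estimates and standard errors are those reported in the R output for
# the fitted ARIMA(1, 1, 2) model.
params = {
    "ar1": (0.4465, 0.0936),
    "ma1": (-1.9649, 0.0895),
    "ma2": (0.9652, 0.0877),
}

z_scores = {name: est / se for name, (est, se) in params.items()}
significant = {name: abs(z) > 1.96 for name, z in z_scores.items()}
print(z_scores)     # ar1 ~ 4.77, ma1 ~ -21.95, ma2 ~ 11.01
print(significant)  # all True
```

All three |Z| values are far above 1.96, which is what table 4.4 summarizes.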
It is essential to check whether the model is correctly specified, that is, whether the model assumptions are supported by the data. If any of the key assumptions seem to be violated, then a new model should be specified, fitted and checked again until a model that provides an adequate fit to the data is found.
Fig. 4.7: Adequacy Check for ARIMA(1,1,2) using Standardized Residuals, ACF of Residuals and P-values for the Ljung-Box Statistic
Figure 4.7 depicts the standardized residuals, the ACF of the residuals and the P-values for the Ljung-Box statistic. It indicates that the model has achieved goodness of fit, since the spikes of the residual ACF are insignificant and lie within the upper and lower bounds. The P-values for the Ljung-Box statistic give evidence of an efficient and adequate fit.
Here, the normality of the residuals of the fitted model is examined.
Table 4.5: Shapiro-Wilk Test of Normality of Residuals

Shapiro-Wilk (W)   P-value
0.99367            0.8134
The Shapiro-Wilk test of normality in table 4.5 above has a test statistic of W = 0.99367 with a P-value of 0.8134 > 0.05, so the null hypothesis of normality of the residuals of the best fitted ARIMA(1, 1, 2) is not rejected at the 1%, 5% and 10% significance levels. The histogram of the residuals of the best fitted model, with an embedded normal probability curve as evidenced in figure 4.8, also confirms that the residuals are normally and independently distributed.
The Ljung-Box test statistic examines the null hypothesis of independence in the residuals of the fitted model. The result gives a Chi-squared of 8.024944 with 12 degrees of freedom and a P-value of 0.8745.
Chi-squared   df   P-value
8.024944      12   0.8745
The statistic and the large P-value of the Box test above suggest that we fail to reject the null hypothesis that all the autocorrelations at lags 1 to 12 are zero. In other words, we can conclude that there is little or no evidence of non-zero autocorrelations in the forecast errors at lags 1 to 12 of our fitted model. This also indicates that the model has captured the dependence in the series. The AIC and log-likelihood address the fit and parsimony of the model, providing a measure of efficient and parsimonious prediction. In addition, the ARIMA(1, 1, 2) model can be confirmed to be adequate for predicting the price of roofing sheets.
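The portmanteau idea behind the Box test can be sketched in pure Python (illustrative, not the R Box.test implementation): the statistic aggregates the squared residual autocorrelations up to lag m and is compared against a chi-square distribution with m degrees of freedom.

```python
def box_pierce(residuals, m):
    """Box-Pierce statistic Q = n * sum_{k=1..m} r_k^2, where r_k is the
    lag-k sample autocorrelation of the residuals. Under the null of
    independent residuals, Q is approximately chi-square with m df."""
    n = len(residuals)
    mean = sum(residuals) / n
    c0 = sum((v - mean) ** 2 for v in residuals) / n
    q = 0.0
    for k in range(1, m + 1):
        ck = sum((residuals[t] - mean) * (residuals[t + k] - mean)
                 for t in range(n - k)) / n
        q += (ck / c0) ** 2
    return n * q

# Toy residuals (not the fitted model's): strong lag-1 alternation
# inflates Q, whereas white-noise-like residuals keep Q small and
# its P-value large, as observed for the ARIMA(1, 1, 2) fit.
q = box_pierce([1.0, -1.0, 1.0, -1.0], 1)
```

A small Q relative to the chi-square critical value (equivalently, a large P-value such as the 0.8745 reported above) means the residuals behave like white noise.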
The ultimate aim of building any time series model is forecasting; if this objective is not achieved, the work is incomplete. Forecasts are usually based on the assumption that the prevailing condition or variation will persist into the future, and are usually needed over a period known as the lead time, which varies with each problem. In this case, we assume the model parameters are correct and that the true parameters do not change. Forecasts were made for a lead time of two years.
Fig. 4.9: Two (2) Years Forecast from the fitted ARIMA(1, 1, 2) model
Fig. 4.9 shows the two-year forecast. The two shaded zones represent the 80% and 95% (lower and upper) prediction intervals. A closer look indicates that there is going to be a steady rise in the price of roofing sheets in Nigeria, but this steady rise is observed to take a short time to reach past peak values unless there is favourable government intervention to ensure that the nation's exports exceed its imports, thereby creating an avenue for a favourable exchange rate and price reduction.
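The gradual flattening of the forecast path follows from the model's structure: beyond the MA order (two steps ahead), each forecast of the differenced series is just φ₁ times the previous one, so the forecast increments shrink geometrically. A hedged sketch of the point-forecast recursion, with placeholder starting values rather than the actual series:

```python
# Coefficient estimates from the R output for ARIMA(1, 1, 2).
phi, th1, th2 = 0.4465, -1.9649, 0.9652

y_last = 10000.0          # last observed price level (placeholder)
w_last = 150.0            # last first-difference y_t - y_(t-1) (placeholder)
e_t, e_t1 = 30.0, -20.0   # last two residuals (placeholders)

forecasts = []
w_hat = phi * w_last + th1 * e_t + th2 * e_t1  # 1-step-ahead difference
level = y_last + w_hat
forecasts.append(level)
w_hat = phi * w_hat + th2 * e_t                # 2-step ahead: only theta2 term left
level += w_hat
forecasts.append(level)
for _ in range(10):                            # beyond lag q = 2: AR part only
    w_hat = phi * w_hat                        # increments shrink by factor phi
    level += w_hat
    forecasts.append(level)
```

Because |φ₁| < 1, w_hat tends to zero and the forecast level approaches a constant, consistent with the slow approach to past peak prices noted above.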
This study is primarily concerned with building a suitable parsimonious time series model using the monthly price of roofing sheets (Power Hand) collected from Midland Galvanizing Limited, Abeokuta.
The widely recognized Box-Jenkins modeling approach was used to identify and estimate the parameters of the candidate models for the series. The original series was not stationary, with no indication of seasonality; stationarity was achieved after performing a first order differencing.
Non-seasonal models were identified from the ACF and PACF plots, and model adequacy was assessed using the AIC, the variance σ² and the log-likelihood. The plots indicated that an autoregressive AR(1) and moving average MA(2) model would provide a good fit, since the ACF was significant at lags 1 and 2. ARIMA(1, 1, 2) was therefore specified as the most adequate model, considering the significant spike at lag 2 in the sample differenced ACF and at lag 1 in the sample differenced PACF. The AR and MA parameters were significant at the 0.05 significance level, and the diagnostic checks show that the estimated model captures the dependence and that white noise was achieved for the price trend of roofing sheets.
data: isiaq
Dickey-Fuller = -2.7799, Lag order = 12, p-value = 0.252
alternative hypothesis: stationary
data: diffisiaq
Dickey-Fuller = -3.9132, Lag order = 12, p-value = 0.01563
alternative hypothesis: stationary
data: diffisiaq
Dickey-Fuller = -6.2145, Lag order = 6, p-value = 0.01
alternative hypothesis: stationary
Warning message:
In adf.test(diffisiaq, alternative = c("stationary"), k = 6) :
p-value smaller than printed p-value
Call:
arima(x = diffisiaq, order = c(1, 1, 1), method = "ML")
Coefficients:
ar1 ma1
-0.2954 -1.0000
s.e. 0.0824 0.0195
Coefficients:
ar1 ar2 ar3 ar4 ma1 ma2 ma3 ma4
-0.5175 0.2904 0.1663 -0.1104 -0.9923 -0.8278 0.6377 0.183
s.e. 0.3902 0.1882 0.1857 0.1127 0.5204 0.4730 0.4261 0.332
Call:
arima(x = diffisiaq, order = c(4, 1, 5), method = "ML")
Coefficients:
         ar1      ar2      ar3     ar4      ma1      ma2      ma3      ma4     ma5
      -0.6811  -0.6332  -0.2360  0.2049  -0.8094  -0.1088  -0.4692  -0.4944  0.8826
s.e.   0.1236   0.1156   0.1265  0.1068   0.1362   0.1397   0.1227   0.1431  0.1301
Call:
arima(x = diffisiaq, order = c(0, 1, 2), method = "ML")
Coefficients:
ma1 ma2
-1.5553 0.5553
s.e. 0.1159 0.1143
Call:
arima(x = diffisiaq, order = c(2, 1, 0), method = "ML")
Coefficients:
ar1 ar2
-0.8326 -0.4546
s.e. 0.0770 0.0783
Call:
arima(x = diffisiaq, order = c(2, 1, 1), method = "ML")
Coefficients:
ar1 ar2 ma1
-0.3590 -0.2125 -1.0000
s.e. 0.0843 0.0853 0.0204
Call:
arima(x = diffisiaq, order = c(1, 1, 2), method = "ML")
Coefficients:
ar1 ma1 ma2
0.4465 -1.9649 0.9652
s.e. 0.0936 0.0895 0.0877
Coefficients:
ar1 ar2 ma1 ma2 ma3
-0.4024 0.3400 -1.0951 -0.8093 0.9047
s.e. 0.1194 0.1058 NaN 0.1466 NaN
> fit9residual<-resid(fit9)
> fit9residual
Jan Feb Mar Apr May
2006 -0.5249991 1137.4269883 -947.3339781 -229.1416414
2007 -632.3192147 323.8196583 -880.8554593 -367.6009079 32.7717932
2008 -219.9252633 459.1388749 292.0632748 -197.0614655 3.0507456
2009 -295.4132126 -429.6089609 -1338.0256579 522.0744242 -62.3813333
2010 271.9526726 -1014.0476472 560.7128292 476.2184001 776.5839704
2011 -834.8748340 69.7776395 466.3181869 721.2534626 1062.6869573
2012 606.9524617 -291.1073445 35.9107550 -14.4851464 555.4552464
2013 -54.2075806 302.8122656 727.6658058 -344.9377195 202.1621869
2014 -1367.9160232 -453.5036040 533.5412392 426.5542191 -807.0342635
2015 -279.4562061 10.4689032 219.7034652 -956.6968460 331.1787412
2016 -409.5399218 -747.5467853 748.9270376 -1174.0844656 -441.1280780
2017 703.8445280 1386.0005333 677.7587461 636.1393609
Jun Jul Aug Sep Oct
2006 -21.8061608 266.3998514 1427.2301172 144.0740007 -392.1638518
2007 -473.7621862 4.2430985 -353.5574889 -184.0619804 -4.5378862
2008 249.5427530 -905.5406009 -349.3057680 -409.2207183 -1984.9736043
2009 564.8282354 401.8565033 713.2634862 831.3281033 -316.8329093
2010 492.5286052 304.6581539 -161.6720259 1631.1984448 50.0690597
2011 -940.4568420 -144.3481275 -396.1554660 519.0874622 656.7282070
2012 370.7385122 404.8269784 956.4738299 569.8924133 -14.2352768
2013 563.4062106 112.8347501 -586.0787560 308.4646025 -100.2962724
2014 -150.0316882 44.2396446 -78.6472005 967.0528190 974.5712367
2015 223.4302982 -953.1369836 73.2226440 -429.0420853 67.9409620
2016 -211.2220451 -165.1367918 -126.4826981 -13.8095718 436.6371630
2017
Nov Dec
2006 -731.8832167 -858.8209417
2007 -265.1773618 709.1361492
2008 2030.6579299 -200.8575187
2009 -489.1203945 -425.2993362
2010 -279.2842193 970.4033729
2011 -653.2712602 -412.9774244
2012 -802.9547985 -560.2745947
2013 386.8726372 698.5439975
2014 -616.2910550 252.1527100
2015 315.8669477 -829.9089529
2016 -600.7634557 293.4316360
2017
> shapiro.test(fit9residual)
data: fit9residual
W = 0.99367, p-value = 0.8134
> Box.test(fit9residual)
Box-Pierce test
data: fit9residual
X-squared = 8.024944, df = 12, p-value = 0.8745
> Fit9ErrorsPlot<-function(fit9residual)
+ {
+ mybinsize<-IQR(fit9residual)/4
+
+ mysd<-sd(fit9residual)
+ mymin<-min(fit9residual)-mysd*5
+ mymax<-max(fit9residual)+mysd*3
+ mynorm <- rnorm(10000, mean=0, sd=mysd)
+ mymin2 <- min(mynorm)
+ mymax2 <- max(mynorm)
+ if (mymin2 < mymin) { mymin <- mymin2 }
+ if (mymax2 > mymax) { mymax <- mymax2 }
+ mybins <- seq(mymin, mymax, mybinsize)
+ hist(fit9residual, col="green", freq=FALSE, breaks=mybins)
+ myhist <- hist(mynorm, plot=FALSE, breaks=mybins)
+ points(myhist$mids, myhist$density, type="l", col="red", lwd=2)
+ }
> Fit9ErrorsPlot(fit9residual)