Mini Project Based On Time Series Forecasting Methods: Data Used
By Vipul Malpani
1) Objective
2) Setting working directory and loading working data
2.1) TS Plot of working data
2.2) Extracting Data for building model
2.3) Decomposing the data
2.4) De-seasonalizing data
3) Check for stationarity
3.1) ACF and PACF plots
3.2) Differencing the time series data
3.3) ACF and PACF for differenced time series
4) Splitting into training and test sets
5) Arima Model
5.1) Fitting with Auto ARIMA
6) Forecasting & Accuracy
6.1) Forecasting with the ARIMA model
6.2) Accuracy of the forecast
1) Objective
The main objective is to analyze historical Australian gas production data and to build a
model that forecasts gas production for the next 12 months.
#IMPORTING DATA
Working_data=forecast::gas
str(Working_data)
## Time-Series [1:476] from 1956 to 1996: 1709 1646 1794 1878 2173 ...
#Working_data
#FROM THE ABOVE PLOT WE CAN SAY THAT GAS PRODUCTION FROM 1956 TO 1970 IS ROUGHLY
#CONSTANT, BUT FROM 1970 ONWARD WE CAN SEE BOTH TREND AND SEASONALITY IN THE DATA.
2.2) Extracting Data for building model.
str(Gasts)
## Time-Series [1:308] from 1970 to 1996: 1709 1646 1794 1878 2173 ...
plot(Gasts)
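The line that creates `Gasts` is not shown above. One way to take a subset like this is base R's `window()`; the sketch below assumes that approach and uses a placeholder series with the same monthly index as `Working_data` (476 values starting in January 1956), so the values themselves are dummies rather than the gas data.

```r
# Placeholder with the same time index as Working_data (assumption: monthly,
# 476 observations starting January 1956). Values are dummies, not gas data.
placeholder <- ts(seq_len(476), start = c(1956, 1), frequency = 12)

# Keep everything from January 1970 onward.
subset_1970 <- window(placeholder, start = c(1970, 1))

length(subset_1970)   # 308, the length reported by str(Gasts)
start(subset_1970)    # 1970 1
end(subset_1970)      # 1995 8
```

The length of 308 agrees with the `str(Gasts)` output, which suggests the subset covers January 1970 to August 1995.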
2.3) Decomposing the data
Gas_decompose = stl(Gasts, s.window = "periodic")
plot(Gas_decompose)
2.4) De-seasonalizing data
deseasonal_gas=seasadj(Gas_decompose)
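To make the de-seasonalizing step concrete, here is a minimal base R sketch of what a seasonal adjustment from an STL fit amounts to: the original series minus the seasonal component. The series is synthetic (an assumption for illustration), and the names `x`, `fit`, `components` and `adjusted` are hypothetical.

```r
# Synthetic monthly series (assumption): linear trend + seasonal pattern + noise.
set.seed(1)
n <- 120
t_idx <- seq_len(n)
x <- ts(100 + 2 * t_idx + 15 * sin(2 * pi * t_idx / 12) + rnorm(n, sd = 3),
        start = c(2000, 1), frequency = 12)

fit <- stl(x, s.window = "periodic")
components <- fit$time.series          # columns: seasonal, trend, remainder

# Seasonally adjusted series = original minus the seasonal component.
adjusted <- x - components[, "seasonal"]

# Equivalently trend + remainder, because the three components sum to x.
all.equal(as.numeric(adjusted),
          as.numeric(components[, "trend"] + components[, "remainder"]))
```

With `s.window = "periodic"` the seasonal component repeats identically every 12 months, so the adjustment removes a fixed monthly pattern.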
3) Check for stationarity
library(tseries)
adf.test(Gasts)
##
## Augmented Dickey-Fuller Test
##
## data: Gasts
## Dickey-Fuller = 0.73972, Lag order = 6, p-value = 0.99
## alternative hypothesis: stationary
#From the above test, the p-value (0.99) is greater than 0.05, so we cannot reject the null
#hypothesis of a unit root; the time series is not stationary.
3.1) ACF and PACF plots
acf(Gasts,lag.max=50)
pacf(Gasts,lag.max=50)
#From the ACF plot above we can see significant autocorrelation up to lag 50, so values
#from the distant past still influence recent values.
#The PACF plot suggests that there could be monthly seasonality.
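"Significant" in an ACF plot usually means the bar crosses the dashed band at roughly ±1.96/sqrt(n). The sketch below, on a synthetic seasonal series (an assumption, not the gas data), reads the autocorrelations numerically and compares them with that band.

```r
# Synthetic monthly series with a 12-month cycle (assumption for illustration).
set.seed(42)
n <- 240
t_idx <- seq_len(n)
x <- ts(10 * sin(2 * pi * t_idx / 12) + rnorm(n), frequency = 12)

acf_out <- acf(x, lag.max = 24, plot = FALSE)
rho <- as.numeric(acf_out$acf)     # rho[1] is lag 0, rho[13] is lag 12
band <- qnorm(0.975) / sqrt(n)     # approximate 95% bound under white noise

rho[13] > band     # lag 12: one full seasonal cycle, strongly positive
rho[7] < -band     # lag 6: half a cycle, strongly negative
```

A spike at lag 12 that stays well outside the band is the numeric counterpart of the seasonal pattern read off the plots.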
3.2) Differencing the time series data
Gas_data_diff = diff(Gasts, differences = 1)
plot(Gas_data_diff)
adf.test(Gas_data_diff)
##
## Augmented Dickey-Fuller Test
##
## data: Gas_data_diff
## Dickey-Fuller = -15.575, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
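With a p-value of 0.01 (below 0.05), the null hypothesis of a unit root is rejected for the differenced series, which supports using d = 1. As a small illustration of what `diff()` does, the toy series below (made-up numbers) shows first and second differences.

```r
# Toy series (made-up values) to show what differencing computes.
x <- ts(c(5, 7, 10, 14, 19, 25), start = c(2000, 1), frequency = 12)

d1 <- diff(x, differences = 1)   # 2 3 4 5 6: month-to-month changes
d2 <- diff(x, differences = 2)   # 1 1 1 1: changes of the changes

length(d1)                       # 5: each round of differencing drops one value
```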
3.3) ACF and PACF for differenced time series
Acf(Gas_data_diff, main='ACF for Differenced Series')
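`Gas_train` and `Gas_Test` are used from here on, but the split that defines them is not shown. A common pattern for time series is to hold out the most recent observations with `window()`. The sketch below assumes a 12-month hold-out on a placeholder series with the same index as `Gasts`; the split dates are hypothetical and not necessarily the ones used for the results in this report.

```r
# Placeholder with the same index as Gasts (assumption: monthly, 308 values
# from January 1970). Values are dummies, not gas data.
series <- ts(seq_len(308), start = c(1970, 1), frequency = 12)

# Hypothetical split: hold out the final 12 months as the test set.
train <- window(series, end = c(1994, 8))
test  <- window(series, start = c(1994, 9))

length(train)  # 296
length(test)   # 12
```

Splitting by time rather than at random keeps the test set strictly after the training data, which mirrors how the forecast is used in practice.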
5) Arima Model
Gas.Arima=arima(Gas_train,order=c(0,1,1))
Gas.Arima
##
## Call:
## arima(x = Gas_train, order = c(0, 1, 1))
##
## Coefficients:
## ma1
## 0.3797
## s.e. 0.0806
##
## sigma^2 estimated as 1939217: log likelihood = -926.47, aic = 1856.93
gas.arima.fit=fitted(Gas.Arima)
ts.plot(Gas_train,gas.arima.fit,col=c("red","blue"))
#The fitted values are very close to the actual values, which indicates a good in-sample fit;
#out-of-sample accuracy is checked in section 6.2.
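A numeric check can complement the overlay plot. The sketch below uses only base R on a synthetic series (an assumption, not the gas data): it reconstructs the in-sample fitted values as observed minus residual and summarizes the fit with an in-sample RMSE. The names `x`, `fit`, `fitted_vals` and `rmse_in` are hypothetical.

```r
set.seed(7)
# Synthetic ARIMA(0,1,1)-like series (assumption): cumulative sum of MA(1) noise.
e <- rnorm(201)
x <- ts(cumsum(e[-1] + 0.4 * e[-201]), frequency = 12)

fit <- arima(x, order = c(0, 1, 1))

# In-sample one-step-ahead fitted values: observed minus residual.
fitted_vals <- x - residuals(fit)

# In-sample root mean squared error, a numeric complement to the overlay plot.
rmse_in <- sqrt(mean(residuals(fit)^2))
rmse_in
```

A close in-sample fit is necessary but not sufficient, since a model can track the training data well and still forecast poorly on held-out data.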
5.1) Fitting with Auto ARIMA
auto_arima<-auto.arima(Gas_train, seasonal = TRUE)
auto_arima
## Series: Gas_train
## ARIMA(1,0,0)(2,1,0)[12] with drift
##
## Coefficients:
## ar1 sar1 sar2 drift
## 0.6764 -0.4588 -0.3065 165.2297
## s.e. 0.0759 0.1121 0.1118 12.0501
##
## sigma^2 estimated as 604632: log likelihood=-775.44
## AIC=1560.89 AICc=1561.55 BIC=1573.71
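Auto ARIMA selects a different specification, ARIMA(1,0,0)(2,1,0)[12] with drift (p = 1,
d = 0, q = 0, plus seasonal terms), rather than the manual (0,1,1). Its AIC (1560.89) is well
below the manual model's (1856.93), though the two models difference the data differently,
so the AIC values are not strictly comparable.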
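The AIC values printed above can be reproduced from the reported log likelihoods using AIC = -2 log L + 2k, where k counts the estimated parameters including the innovation variance. The helper name `aic_from_loglik` is hypothetical; the inputs are the rounded figures from the two model printouts.

```r
aic_from_loglik <- function(loglik, k) -2 * loglik + 2 * k

# Manual ARIMA(0,1,1): ma1 + sigma^2 -> k = 2
aic_manual <- aic_from_loglik(-926.47, k = 2)   # 1856.94 (reported: 1856.93)

# auto.arima model: ar1, sar1, sar2, drift + sigma^2 -> k = 5
aic_auto <- aic_from_loglik(-775.44, k = 5)     # 1560.88 (reported: 1560.89)
```

The differences of 0.01 come from the log likelihoods being rounded to two decimals in the printed output.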
#Box-Pierce test (the default for Box.test; type = "Ljung-Box" gives the Ljung-Box variant)
H0: Residuals are independent
Ha: Residuals are not independent
library(stats)
Box.test(Gas.Arima$residuals)
##
## Box-Pierce test
##
## data: Gas.Arima$residuals
## X-squared = 0.61408, df = 1, p-value = 0.4333
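Here the p-value (0.4333) is above 0.05, so independence of the residuals is not rejected. Note that `Box.test()` defaults to the Box-Pierce statistic at lag 1, as the output header shows. A minimal sketch of the Ljung-Box variant over more lags follows; the residuals are simulated white noise (an assumption), and the choice of 10 lags is illustrative.

```r
set.seed(123)
resid_like <- rnorm(200)   # stand-in for model residuals (assumption)

# Ljung-Box over the first 10 lags; fitdf = 1 accounts for one estimated
# ARMA coefficient (e.g. the single ma1 term of an ARIMA(0,1,1)).
lb <- Box.test(resid_like, lag = 10, type = "Ljung-Box", fitdf = 1)

lb$method           # "Box-Ljung test"
lb$parameter        # df = 9 (lag - fitdf)
lb$p.value > 0.05   # if TRUE, independence is not rejected at the 5% level
```

Testing several lags at once is more informative for monthly data than lag 1 alone, because leftover seasonal correlation tends to show up at higher lags.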
6) Forecasting & Accuracy
6.1) Forecasting with the ARIMA model
plot(fcast.auto_arima)
6.2) Accuracy of the forecast
f.arima=forecast(Gas.Arima)
accuracy(f.arima, Gas_Test)
f.auto_arima=forecast(auto_arima)
accuracy(f.auto_arima,Gas_Test)
#From the above it is clear that the auto ARIMA model works well, because its MAPE is very
#close to 3%.
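MAPE (mean absolute percentage error) is the average of |actual - forecast| / |actual|, expressed in percent. The sketch below computes it by hand on made-up numbers; the function name `mape` and the input values are hypothetical.

```r
mape <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted), all(actual != 0))
  mean(abs((actual - predicted) / actual)) * 100
}

actual    <- c(100, 200, 400)
predicted <- c(110, 190, 400)
mape(actual, predicted)   # 5: absolute percentage errors are 10, 5 and 0
```

A MAPE near 3 therefore means the forecasts miss the held-out values by about 3% on average.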
#Now we forecast the next 12 months using the auto ARIMA model.
fcast.auto_arima1<- forecast( auto_arima, h=12)
plot(fcast.auto_arima1)
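Because `auto_arima` was fitted on `Gas_train`, `forecast(auto_arima, h=12)` projects the 12 months that follow the end of the training window. If the goal is the 12 months after the last available observation, one option is to refit the chosen specification on the full series first. The sketch below shows that pattern with base R's `arima()` and `predict()` on the built-in `AirPassengers` series, which is only a stand-in; the model order is an example and not the one selected above.

```r
# Stand-in series (assumption): AirPassengers ships with base R and ends in
# December 1960. It is NOT the gas data.
y <- log(AirPassengers)

# Refit on the full series. The (0,1,1)(0,1,1)[12] order is only an example,
# not the specification chosen by auto.arima above.
fit_full <- arima(y, order = c(0, 1, 1),
                  seasonal = list(order = c(0, 1, 1), period = 12))

fc <- predict(fit_full, n.ahead = 12)
point <- exp(fc$pred)   # back-transform to the original scale

start(point)    # 1961 1: first month after the last observation
length(point)   # 12
```

`predict()` also returns standard errors in `fc$se`, which can be used to build approximate prediction intervals on the log scale.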
From the models we built, we conclude that the seasonal ARIMA(1,0,0)(2,1,0)[12] model
with drift selected by auto.arima is the best model and provides good accuracy.