
MINI PROJECT BASED ON TIME SERIES FORECASTING METHODS

By Vipul Malpani

Data used:- Australian monthly gas production


Contents :-

1) Objective
2) Setting working directory and loading working data
2.1) TS Plot of working data
2.2) Extracting Data for building model
2.3) Decomposing the data
2.4) De-seasonalizing data
3) Check for stationarity
3.1) ACF and PACF plots
3.2) Differencing the time series data
3.3) ACF and PACF for differenced time series
4) Splitting into training and test sets
5) Arima Model
5.1) Fitting with Auto ARIMA
6) Forecasting & Accuracy
6.1) Forecasting with the ARIMA model
6.2) Accuracy of the forecast
7) Model Conclusion and Accuracy
1) Objective:-
 The main objective is to analyze past Australian gas production data and build a
model for forecasting gas production over the next 12 months.

2) Setting working directory and loading working data.


setwd("C:/Users/Vipul/Desktop/GL/Project/project 6")
library(forecast)
library(tseries)

#IMPORTING DATA
Working_data=forecast::gas
str(Working_data)

## Time-Series [1:476] from 1956 to 1996: 1709 1646 1794 1878 2173 ...

#Working_data

2.1) TS Plot of working data.


plot.ts(Working_data)

# From the plot above, gas production looks roughly constant from 1956 to 1970;
# from 1970 onward both trend and seasonality are visible in the data.
2.2) Extracting Data for building model.

Gasts=ts(Working_data, start = c(1970,1), end = c(1995,8),frequency = 12)

str(Gasts)

## Time-Series [1:308] from 1970 to 1996: 1709 1646 1794 1878 2173 ...

plot(Gasts)
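
One point worth checking at this step: when ts() is given an existing series together with a new start and end, it keeps the first observations and only re-labels their dates, which is why the str(Gasts) output above begins with the same values (1709 1646 1794 ...) as str(Working_data). window() instead selects observations by date. The sketch below shows the difference on a small made-up monthly series; `toy` and its values are illustrative assumptions, not the gas data.

```r
# Hypothetical monthly series: 48 values labelled Jan 1956 to Dec 1959
toy <- ts(1:48, start = c(1956, 1), frequency = 12)

# ts() with a new start/end keeps the FIRST 24 values and re-labels them
relabelled <- ts(toy, start = c(1958, 1), end = c(1959, 12), frequency = 12)

# window() selects the observations that actually fall in 1958-1959
windowed <- window(toy, start = c(1958, 1), end = c(1959, 12))

print(as.numeric(relabelled)[1:3])  # first values of the original series
print(as.numeric(windowed)[1:3])    # values that belong to Jan-Mar 1958
```

If the intent is to model the observations dated 1970 to 1995, window(Working_data, start = c(1970, 1), end = c(1995, 8)) would be the date-based equivalent. The results printed in this report come from the ts() call as written.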
2.3) Decomposing the data
Gas_decompose = stl(Gasts, s.window = "periodic")
plot(Gas_decompose)

2.4) De-seasonalizing data

# Since the data show seasonality, we de-seasonalize them.

deseasonal_gas=seasadj(Gas_decompose)
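
As a rough illustration of what de-seasonalizing does, the sketch below subtracts the seasonal component of an stl fit from the series, which should match what seasadj() returns for an stl object (trend plus remainder). It uses base R only, and the built-in monthly `co2` series is a stand-in chosen for illustration, not the gas data.

```r
# Stand-in data: the built-in monthly co2 series (an assumption for illustration)
x <- co2

# Same decomposition call as above, with a fixed (periodic) seasonal pattern
fit <- stl(x, s.window = "periodic")
parts <- fit$time.series          # columns: seasonal, trend, remainder

# Seasonally adjusted series = original minus the seasonal component
adjusted <- x - parts[, "seasonal"]

# What is left is trend plus remainder
same <- all.equal(as.numeric(adjusted),
                  as.numeric(parts[, "trend"] + parts[, "remainder"]))
print(same)
```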

3) Check for stationarity


adf.test(Gasts)

##
## Augmented Dickey-Fuller Test
##
## data: Gasts
## Dickey-Fuller = 0.73972, Lag order = 6, p-value = 0.99
## alternative hypothesis: stationary

# The p-value (0.99) is greater than 0.05, so we cannot reject the null hypothesis of a
# unit root; the time series is not stationary.
3.1) ACF and PACF plots
acf(Gasts,lag=50)

pacf(Gasts,lag=50)

# The ACF plot shows significant autocorrelation up to lag 50, so values from the
# distant past still influence recent values.
# The PACF plot suggests there could be monthly seasonality.
3.2) Differencing the time series data
Gas_data_diff = diff(Gasts, differences = 1)
plot(Gas_data_diff)

adf.test(Gas_data_diff, alternative = "stationary")

## Warning in adf.test(Gas_data_diff, alternative = "stationary"): p-value
## smaller than printed p-value

##
## Augmented Dickey-Fuller Test
##
## data: Gas_data_diff
## Dickey-Fuller = -15.575, Lag order = 6, p-value = 0.01
## alternative hypothesis: stationary
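
For intuition, first differencing replaces each observation with its change from the previous month, which removes a steady trend. A minimal sketch on made-up numbers (the series `y` is hypothetical):

```r
# Hypothetical series with a steady upward trend of 5 per month
y <- ts(c(100, 105, 110, 115, 120, 125), start = c(1970, 1), frequency = 12)

# Same kind of call as above: one round of first differencing
y_diff <- diff(y, differences = 1)

# Equivalent by hand: each value minus the one before it
by_hand <- as.numeric(y)[-1] - as.numeric(y)[-length(y)]

print(as.numeric(y_diff))  # the month-to-month changes
print(by_hand)             # same numbers, computed by hand
```

In the ADF output above, the p-value for the differenced series is below 0.05, so the unit-root null hypothesis is rejected and one round of differencing appears sufficient.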
3.3) ACF and PACF for differenced time series
Acf(Gas_data_diff, main='ACF for Differenced Series')

Pacf(Gas_data_diff, main='PACF for Differenced Series')

4) Splitting into training and test sets


Gas_train = window(Gasts, start=c(1985,1),end=c(1993,12))
Gas_Test= window(Gasts, start=c(1994,1))
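
To check the split sizes, the sketch below applies the same window() calls to a placeholder series with the same time index as Gasts (Jan 1970 to Aug 1995, 308 months). The values are dummies and the names `idx`, `train` and `test` are hypothetical; only the dates matter.

```r
# Placeholder series on the same monthly index as Gasts (values are dummies)
idx <- ts(seq_len(308), start = c(1970, 1), frequency = 12)

train <- window(idx, start = c(1985, 1), end = c(1993, 12))
test  <- window(idx, start = c(1994, 1))

print(length(train))  # 108 months: Jan 1985 to Dec 1993
print(length(test))   # 20 months: Jan 1994 to Aug 1995
```

The 20-month test set matches the h = 20 used for the forecasts in section 6.1. Observations labelled before 1985 are not used for fitting.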

5) Arima Model
Gas.Arima=arima(Gas_train,order=c(0,1,1))
Gas.Arima

##
## Call:
## arima(x = Gas_train, order = c(0, 1, 1))
##
## Coefficients:
##          ma1
##       0.3797
## s.e.  0.0806
##
## sigma^2 estimated as 1939217: log likelihood = -926.47, aic = 1856.93

tsdisplay(residuals(Gas.Arima),lag.max = 20,main = "Model_residuals")

gas.arima.fit=fitted(Gas.Arima)
ts.plot(Gas_train,gas.arima.fit,col=c("red","blue"))
# The fitted values track the actual values closely, which suggests the model fits the
# training data reasonably well.
5.1) Fitting with Auto ARIMA
auto_arima<-auto.arima(Gas_train, seasonal = TRUE)
auto_arima

## Series: Gas_train
## ARIMA(1,0,0)(2,1,0)[12] with drift
##
## Coefficients:
##          ar1     sar1     sar2     drift
##       0.6764  -0.4588  -0.3065  165.2297
## s.e.  0.0759   0.1121   0.1118   12.0501
##
## sigma^2 estimated as 604632: log likelihood=-775.44
## AIC=1560.89 AICc=1561.55 BIC=1573.71

tsdisplay(residuals(auto_arima), lag.max=45, main='Auto ARIMA Model Residuals')
gas.fit=fitted(auto_arima)
ts.plot(Gas_train,gas.fit,col=c("red","blue"))

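Auto ARIMA selects a different specification from the manual ARIMA(0,1,1): a seasonal
ARIMA(1,0,0)(2,1,0)[12] with drift. Its AIC (1560.89) is lower than the manual model's
(1856.93), but the two models difference the data differently, so their AIC values are
not directly comparable.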
# Box-Pierce test (the default of Box.test; type = "Ljung-Box" selects the Ljung-Box variant)
# H0: Residuals are independent
# Ha: Residuals are not independent
library(stats)
Box.test(Gas.Arima$residuals)

##
## Box-Pierce test
##
## data: Gas.Arima$residuals
## X-squared = 0.61408, df = 1, p-value = 0.4333

Since the p-value (0.4333) is greater than 0.05, we cannot reject H0: the residuals of the
manual ARIMA(0,1,1) model show no significant autocorrelation at lag 1.
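
The output above is a Box-Pierce test with df = 1, that is, a check of the lag-1 autocorrelation only. A fuller residual check would typically use the Ljung-Box statistic over several lags and reduce the degrees of freedom by the number of estimated ARMA coefficients. The sketch below does this on a simulated series using base R only. The data, the seed and the choice of 12 lags are assumptions for illustration, not results for the gas data.

```r
# Simulated stand-in for the training series (random walk, 108 monthly values)
set.seed(123)
sim <- ts(cumsum(rnorm(108)), start = c(1985, 1), frequency = 12)

# Same model order as Gas.Arima, fitted with base R
fit <- arima(sim, order = c(0, 1, 1))

# Ljung-Box test over 12 lags; fitdf = 1 accounts for the single MA coefficient
lb <- Box.test(residuals(fit), lag = 12, type = "Ljung-Box", fitdf = 1)
print(lb)
```

The same call could be applied to Gas.Arima$residuals, and to residuals(auto_arima) with fitdf set to that model's number of AR and MA coefficients.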


6) Forecasting & Accuracy
6.1) Forecasting with the ARIMA model
fcast.arima <- forecast(Gas.Arima,h=20)
fcast.auto_arima<- forecast(auto_arima, h=20)
plot(fcast.arima)

plot(fcast.auto_arima)
6.2) Accuracy of the forecast
f.arima=forecast(Gas.Arima)
accuracy(f.arima, Gas_Test)

##                     ME      RMSE      MAE        MPE      MAPE      MASE
## Training set  106.4061  1386.096 1092.937  0.6521214  6.686101 0.5294037
## Test set     9704.4827 11573.729 9739.039 28.0056123 28.166844 4.7174555
##                    ACF1 Theil's U
## Training set 0.07540524        NA
## Test set     0.75049455  3.326208

f.auto_arima=forecast(auto_arima)
accuracy(f.auto_arima,Gas_Test)

##                      ME      RMSE       MAE         MPE      MAPE
## Training set   28.00523  717.6744  528.3848  0.09295282  3.229727
## Test set     4350.84815 5196.2061 4350.8482 12.74873170 12.748732
##                   MASE        ACF1 Theil's U
## Training set 0.2559422 -0.01981323        NA
## Test set     2.1074904  0.70679799  1.546872

# Auto ARIMA is the more accurate model: its MAPE is 3.23% on the training set and 12.75%
# on the test set, against 6.69% and 28.17% for the manual ARIMA(0,1,1) model.
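
For reference, the RMSE and MAPE columns reported by accuracy() follow the usual definitions, which can be sketched by hand as below. The numbers are made up for illustration and are not taken from the gas data; `actual` and `forecast_values` are hypothetical names.

```r
# Hypothetical actual and forecast values
actual          <- c(100, 200, 400)
forecast_values <- c(110, 180, 400)

err <- actual - forecast_values

rmse <- sqrt(mean(err^2))               # root mean squared error
mape <- mean(abs(err / actual)) * 100   # mean absolute percentage error, in %

print(c(RMSE = rmse, MAPE = mape))
```

MAPE is scale-free, which makes it a convenient column for comparing the two models here.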
# Now we forecast the next 12 months using the auto ARIMA model.
fcast.auto_arima1<- forecast( auto_arima, h=12)

plot(fcast.auto_arima1)

7) Model Conclusion and Accuracy.

Of the models we built, the seasonal ARIMA(1,0,0)(2,1,0)[12] model with drift selected by
auto.arima is the best: it has the lower error on both the training and test sets, with a
test-set MAPE of 12.75%.

So we can use this ARIMA model for forecasting.
