
ELECTRICITY DEMAND FORECASTING USING

MODERN MODIFIED TECHNIQUES WITH ARMA


AND ARIMA IN TAMILNADU USING
R PROGRAMMING
A PROJECT REPORT

Submitted by
NANDHINIEE K 310617105059
NAVEEN B 310617105060
RAJAHAMSA R 310617105073
YASHWANTH KUMAR SP 310617105106

In partial fulfillment for the award of the degree of

BACHELOR OF ENGINEERING
In
ELECTRICAL AND ELECTRONICS ENGINEERING

EASWARI ENGINEERING COLLEGE, CHENNAI 600089


(Autonomous Institution)
Affiliated to
ANNA UNIVERSITY, CHENNAI 600025
APRIL 2021

EASWARI ENGINEERING COLLEGE, CHENNAI


(AUTONOMOUS INSTITUTION)
AFFILIATED TO ANNA UNIVERSITY, CHENNAI 600025

BONAFIDE CERTIFICATE

Certified that this project report “COMPARISON OF ELECTRICITY DEMAND


FORECASTING WITH MODERN TECHNIQUES OF ARMA AND ARIMA USING R
PROGRAMMING” is the bonafide work of NANDHINIEE.K (310617105059),
NAVEEN.B (310617105060), RAJAHAMSA.R (310617105073), and
YASHWANTH KUMAR.S.P (310617105106) who carried out the project work
under my supervision.

SIGNATURE
Dr. E. KALIAPPAN, M.Tech., Ph.D.,
SUPERVISOR
Professor and Head
Department of Electrical and Electronics Engineering
Easwari Engineering College
Ramapuram, Chennai-89

SIGNATURE
Dr. E. KALIAPPAN, M.Tech., Ph.D.,
HEAD OF THE DEPARTMENT
Department of Electrical and Electronics Engineering
Easwari Engineering College
Ramapuram, Chennai-89

-----------------------------------------------------------------------------------------------

Submitted for Semester Examination held on ____________

INTERNAL EXAMINER EXTERNAL EXAMINER


ACKNOWLEDGEMENT

We would like to extend our gratitude to Dr. R. SHIVAKUMAR, M.D.,

Ph.D., our Chairman, SRM Group of Institutions, Ramapuram Campus.

We express our gratitude and indebtedness to our principal

Dr. R.S. KUMAR, M.Tech., Ph.D., and the management for providing us with

an excellent working environment and facilities throughout our degree course.

We are very much thankful to the Head of the Department and our project

guide Dr. E. KALIAPPAN, M.Tech., Ph.D., for his motivation and support for

completion of our project with perfection. In addition, we thank him for his

valuable suggestions and constant encouragement throughout our project.

We are thankful to Mr. S. SENTHILKUMAR, Assistant Executive Engineer

in Tamil Nadu Transmission Corporation (TANTRANSCO) for guiding us in

collecting the official data pertaining to the load demands from verified official

sources.

We would like to express our gratitude to our Internal Coordinator, Prof.

K. V. THILAGAR, M.E, Ph.D., for his valuable support and encouragement for
the completion of our project with perfection.

TABLE OF CONTENTS

Chapter No. Title Page No.

Abstract 1

List of Figures 2

1. Introduction 4

1.1 Classification of demand Forecasting Techniques 5

1.2 Traditional Forecasting Techniques 6

1.3 Regression Methods 6

1.4 Exponential Smoothing 9

1.5 Iterative Reweighted Least-Square 9

1.6 Modified Traditional Techniques 10

1.7 Adaptive Demand Forecasting 11

1.8 Stochastic Time Series 12

2. AUTOREGRESSIVE MODELS 13

2.1 Auto-Regressive (AR) Model 13

2.2 Auto-Regressive Moving Average (ARMA) Model

2.3 Auto-Regressive Integrated Moving Average (ARIMA) Model 14

2.4 Support Vector Machine Based Techniques 15


2.5 Soft Computing Techniques 16

2.6 Genetic Algorithm 17

2.7 Fuzzy Logic 19

2.8 Neural Networks 21

2.9 Knowledge-Based Expert Systems 22

3. Literature Review 23

4. Methodologies And Comparison 28

4.1 Building Processes of ARMA-Prediction Model 28

4.2 Auto-correlation 29

4.3 Stationary Auto-Regression 30

4.4 Difference between ARMA and ARIMA Models 31

4.5 Comparison of Models 32

4.6 Experiments and Analysis 33

4.7 Comparison Of Results 45

5. Conclusion and Future Scope 52


References 55
ABSTRACT

Electricity demand forecasts are significant for energy providers and other participants in electric energy generation, transmission, distribution and markets. Accurate models for electric power load forecasting are crucial to the operation and planning of a utility. This report presents a review of electricity demand forecasting techniques and the various types of methodologies, together with a comparison of two models: ARMA and ARIMA. Our discussion of the different time series models is supported by experimental forecast results obtained on real time series datasets. While

fitting a model to a dataset, special care is taken to select the most

parsimonious one. To evaluate forecast accuracy as well as to compare among

different models fitted to a time series, we have used specific performance

measures.

To have authenticity as well as clarity in our discussion about comparison of

demand forecasting, we have taken the help of various published research

works from reputed journals and some standard books.

LIST OF FIGURES

Figure 1.1 The Building Stages of the ARMA models


Figure 1.2 Auto-Correlation of Time Series Model

Figure 1.3 Stationary Auto-Regression of the Time Series Model


Figure 2.1 Time vs Demand Data for the Year 2010-2016
Figure 2.2 Aggregated Cycle Data for the Year on Year Trend for

2010-2015
Figure 2.3 Boxplot Across Months to Illustrate Seasonal Effects
Figure 2.4 Complete Auto-correlation function plot in Time Series

Analysis with its lagged values


Figure 2.5 Partial Auto-correlation function plot in Time Series

Analysis with its correlation of its Residuals

Figure 2.6 Plot between Partial auto-correlation function and lag

function

Figure 2.7 Correlogram of the observed data trend obtained with

the function plot.acf()

Figure 2.8 Correlogram of the observed data trend obtained with

the function plot.pacf()


Figure 2.9 Correlogram of the observed residuals trend obtained

with the function acf()


Figure 2.10 Correlogram of the observed residuals trend obtained with the function pacf()
Figure 2.11 ARIMA model to predict the future 5 years with

seasonal components in the ARIMA formulation


Figure 2.12 Predicted ARIMA formulation graph for the next

years:2010 -2022

Figure 3.1 Demand Forecast data of the year 2016 in Tamil Nadu

Figure 3.2 Forecast error data of the year 2016 In Tamil Nadu
Figure 3.3 Demand forecast data of the year 2017 in Tamil Nadu
Figure 3.4 Forecast error data of the year 2017 in Tamil Nadu

Figure 3.5 Forecasted data of the year 2018 in Tamil Nadu


Figure 3.6 Error data of the year 2018 in Tamil Nadu
Figure 3.7 Forecasted data of the year 2019 in Tamil Nadu
Figure 3.8 Error data of the year 2019 in Tamil Nadu

Figure 3.9 Demand forecast of the year 2020 for

January in Tamil Nadu

CHAPTER 1: INTRODUCTION

Load forecasting helps an electric utility to make important decisions including

decisions on purchasing and generating electric power, load switching, and

infrastructure development. This involves the accurate prediction of both the magnitudes and geographical locations of electric load over the different periods of the planning horizon. Electricity demand forecasting is considered one of the critical factors for the economic operation of power systems; accurate load forecasting holds a great saving potential for electric utility corporations.

The maximum savings can be achieved when load forecasting is used to control

operations and decisions like economic dispatch, unit commitment, fuel allocation and on-line network analysis. The operating cost is increased by forecasting errors (either positive or negative). This part of the research work is

necessary to establish the statistical relevance of the proposed research work,

establish a generalized research question, analyze existing methods, and explore

areas of possible improvements.

The study proposes a model based on ARMA (Autoregressive moving average)

and ARIMA (Autoregressive Integrated moving average) decomposition of

monthly electricity consumption forecasting methods. These models are first applied, according to the properties of electricity consumption in each month, to decompose the monthly power consumption time series. This yields a factorization of monthly electricity consumption into seasonal, trend, and random components. Then, the change in the characteristics of the three components

over time is considered. Finally, the appropriate model is selected to predict the

components in the reconstruction of the monthly electricity consumption forecast. A forecasting program is developed based on the R language, and a case study is conducted on the power consumption data of a region containing distributed energy. Results show that the proposed method is reasonable and effective.

1.1 Classification of Demand Forecasting Techniques

There have been many studies relating to demand-forecasting methodology since its origin. Various types of classifications based on the duration of forecasting and the forecasting methods have been proposed in the literature over time. Demand

forecasting methods can be also classified in terms of their degrees of

mathematical analysis used in the forecasting model. These are grouped into two basic types, namely quantitative and qualitative methods. In many cases, historical data are insufficient or not available at all; to forecast accurately in such cases, planners generally use qualitative forecasting methods such as the Delphi method, curve fitting and technological comparisons, among other methods. Other forecasting techniques such as decomposition methods,

regression analysis, exponential smoothing, and the Box-Jenkins approach are

quantitative methods. Based on the various types of studies presented in these

papers, the load forecasting techniques may be grouped broadly in three major

groups: 1. Traditional Forecasting Techniques, 2. Modified Traditional Techniques and 3. Soft Computing Techniques.

1.2 Traditional Forecasting Techniques

One of the most important topics for the planners of the nation is to predict

future load demands for planning the infrastructure, development trends and

index of overall development of the country etc. In early days, these predictions

or forecasts were carried out using traditional/conventional mathematical

techniques. With the development of advanced tools, these techniques have

been augmented with the findings of researchers for more effective forecasting in

various fields of study. The traditional forecasting techniques are as follows:

regression, multiple regression, exponential smoothing and iterative reweighted

least-squares technique.

1.3 Regression Methods

Regression is one of the most widely used statistical techniques and it is often

easy to implement. The regression methods are usually employed to model

the relationship of load consumption and other factors such as weather

conditions, day types and customer classes. This method assumes that the load

can be divided into a standard load trend and a trend linearly dependent on some

factors influencing the load. The mathematical model can be written as:

$$y(t) = B(t) + \sum_{i=1}^{n} a_i x_i(t) + \varepsilon(t)$$

where $B(t)$ is the normal or standard load at time t, $a_i$ are the estimated slowly varying coefficients, $x_i(t)$ are the independent influencing factors such as weather effects, $\varepsilon(t)$ is a white noise component, and n is the number of observations, usually 24 or 168. The method's accuracy relies on the adequate

representation of possible future conditions by historical data but a measure to

detect any unreliable forecast can be easily constructed. The proposed

procedure requires few parameters that can be easily calculated from historical

data by applying the cross-validation technique. In order to forecast the load

precisely throughout a year, one should consider seasonal load change, annual

load growth and the latest daily load change. To deal with these characteristics

in the load forecasting, a transformation technique is presented. This technique

consists of a transformation function with translation and reflection methods.

The transformation function is estimated with the previous year's data points, in

order that the function converts the data points into a set of new data points with

preservation of the shape of temperature-load relationships in the previous year

[4, 5-7].
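As an illustration only (not part of the original report), the following minimal R sketch shows the idea of such a regression-based load model; the data frame hist_df, with hypothetical columns t (time index), temp (temperature) and load, is an assumption made for the example.

# Regression sketch (illustrative): standard load trend plus a weather-dependent term.
# 'hist_df' is a hypothetical data frame of historical observations.
hist_df <- data.frame(t    = 1:168,
                      temp = rnorm(168, mean = 30, sd = 3),
                      load = 11000 + 5 * (1:168) + rnorm(168, sd = 40))
fit <- lm(load ~ t + temp, data = hist_df)   # estimates the coefficients a_i
summary(fit)
new_cond <- data.frame(t = 169:192, temp = rep(31, 24))
predict(fit, newdata = new_cond)             # 24-step-ahead load forecast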

Multiple Regression

Multiple regression is the most popular method and is often used to forecast the

load affected by a number of factors ranging from meteorological effects, per

capita growth, electricity prices, economic growth etc. Multiple Regression

analysis for load forecasting uses the technique of least-squares estimation:

$$Y_t = a_t^{T} v_t + e_t$$

where t is the sampling time, $Y_t$ is the total measured system load, $v_t$ is the vector of adapted variables such as time, temperature, light intensity, wind speed, humidity, day type (workday, weekend), etc., $a_t^{T}$ is the transposed vector of regression coefficients and $e_t$ is the model error at time t.

The data analysis program can select the Polynomial degree of influence of the

variables from 1 to 5. In most cases, linear dependency gives the best results. In

subsequent steps, it uses the maximum of the initial hourly forecast; the most

recent initial peak forecast error and exponentially smoothed errors as variables

in a regression model to produce an adjusted peak forecast. Trend estimation

evaluates growth by the variable transformation technique, while Trend

cancellation removes annual growth by subtraction or division. Lately, the developed model has been modified into an adaptable regression model for 1-day-ahead forecasts, which identifies weather-insensitive and weather-sensitive load components.

1.4 Exponential Smoothing

Exponential smoothing is one of the approaches used for load forecasting. In this method, the load is first modeled based on previous data, and this model is then used to predict the future load. Here, the load at time t, y(t), is modeled using a fitting function and is expressed in the form:

$$y(t) = \beta(t)^{T} f(t) + \varepsilon(t)$$

where $f(t)$ is the fitting function vector of the process, $\beta(t)$ is the coefficient vector, $\varepsilon(t)$ is white noise and $T$ is the transpose operator.

One of the existing exponential smoothing methods has the capacity to analyze seasonal time series directly. It is based on three smoothing constants for stationarity, trend and seasonality.
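As a hedged illustration (the report's own experiments appear later, in section 4.6), the sketch below shows how a seasonal exponential smoothing model with three smoothing constants could be fitted in R with the base HoltWinters() function; it assumes the monthly demand series de.ts constructed in section 4.6.

# Holt-Winters sketch: three smoothing constants for level, trend and seasonality.
# Assumes 'de.ts' is the monthly demand ts object built in section 4.6.
hw_fit <- HoltWinters(de.ts)                 # alpha, beta, gamma estimated from the data
c(hw_fit$alpha, hw_fit$beta, hw_fit$gamma)   # the three smoothing constants
hw_pred <- predict(hw_fit, n.ahead = 12)     # forecast the next 12 months
plot(hw_fit, hw_pred)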

1.5 Iterative Reweighted Least Squares

This method uses an iteratively reweighted least-squares procedure to identify the

model order and parameters. The method uses an operator that controls one

variable at a time and determines the optimal starting point. Autocorrelation

function and the partial autocorrelation function of the resulting differenced past

load data are utilized to identify a suboptimal model of the load dynamics. A

three-way decision variable is formed by the weighting function, the tuning


constants and the weighted sum of the squared residuals in identifying an

optimal model and the subsequent parameter estimates. Consider the parameter

estimation problem involving the linear measurement equation:

$$y = X\beta + \varepsilon$$

where $y$ is an n x 1 vector of observations, $X$ is an n x p matrix of known coefficients (based on previous load data), $\beta$ is a p x 1 vector of the unknown parameters and $\varepsilon$ is an n x 1 vector of random errors. Results are more accurate when the errors are not Gaussian. Iterative methods are used to find $\hat{\beta}$. The Newton method or, alternatively, the Beaton-Tukey iteratively reweighted least-squares (IRLS) algorithm can be applied once the weighting function is known.

1.6 Modified Traditional Techniques

The traditional forecasting techniques have been modified so that they are able

to automatically correct the parameters of forecasting models under changing

environmental conditions. Some of the techniques that are the modified version

of these traditional techniques are adaptive load forecasting, stochastic time

series and support vector machine based techniques.

1.7 Adaptive Demand Forecasting

Demand forecasting model parameters are automatically corrected to keep track

of the changing load conditions. Hence, Demand forecasting is adaptive in

nature and can be used as an on-line software package in the utilities control

system. Next state vector is estimated using current prediction error and the

current weather data acquisition programs. State vector is determined by total

historical data set analysis. Switching between multiple and adaptive regression

analysis is possible in this mode. The same model as in the multiple regression section, given by the equation below, is used:

$$Y_t = a_t^{T} v_t + e_t$$

where t is the sampling time, $Y_t$ is the measured total system load, $v_t$ is the vector of adapted variables such as time, temperature, light intensity, wind speed, humidity, day type (workday, weekend), etc., $a_t^{T}$ is the transposed vector of regression coefficients and $e_t$ is the model error at time t.

This model used a joint Hammerstein nonlinear time-varying functional

relationship between load and temperature. This algorithm performed better

than the commonly used RLS (Recursive Least-square) algorithm. A composite

model for load prediction composed of three components (nominal load, type

load and residual load) was developed. The nominal load is modeled using a Kalman filter, and the parameters of the model are adapted by the

exponentially weighted recursive least-squares method. An adaptive online load

forecasting approach was introduced which automatically adjusts model

parameters according to changing conditions based on time series analysis. This

approach has two unique features: autocorrelation optimization is used for

handling cyclic patterns and, in addition to updating model parameters, the structure and order of the time series model are adaptable to new conditions.

1.8 Stochastic Time Series

Time series methods appear to be among the most popular approaches applied to short-term load forecasting (STLF). They are based on the assumption that the data

have an internal structure, such as autocorrelation, trend or seasonal variation.

The first impetus of the approach is to accurately assemble a pattern matching

available data and then obtain the forecasted value with respect to time using

the established model. The next subsection discusses some of the time series

models used for load forecasting.

CHAPTER 2: AUTOREGRESSIVE MODELS

2.1 Autoregressive (AR) Model

The Auto-Regressive (AR) model can be used to model the load profile if the load is assumed to be a linear combination of previous loads, which is given by Liu [23] as:

$$\hat{L}(k) = \sum_{i=1}^{m} \alpha_i \, L(k-i) + w(k)$$

where $\hat{L}(k)$ is the predicted load at time k (min), $w(k)$ is a random load disturbance, and $\alpha_i$, i = 1, ..., m, are unknown coefficients; the equation above is the autoregressive model of order m. The unknown coefficients in the equation can be tuned on-line using the well-known least mean square (LMS) algorithm.
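For illustration (a sketch under the assumption that the de.ts series of section 4.6 is available), an AR model can be fitted in R with the base ar() function, which also selects the order m by AIC:

# AR sketch: fit an autoregressive model to the demand series and forecast ahead.
# Assumes 'de.ts' is the monthly demand ts object from section 4.6.
ar_fit <- ar(de.ts, order.max = 4)           # order m chosen by AIC, up to 4
ar_fit$order                                 # selected order m
ar_fit$ar                                    # estimated coefficients alpha_i
predict(ar_fit, n.ahead = 12)$pred           # 12-month-ahead prediction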

2.2 Autoregressive Moving-Average (ARMA) Model

The ARMA model represents the current value of the time series y(t) linearly in terms of its values at previous periods and in terms of previous values of a white noise process. For an ARMA model of order (p, q), the model is written as:

$$y(t) = \phi_1 y(t-1) + \cdots + \phi_p y(t-p) + \varepsilon(t) + \theta_1 \varepsilon(t-1) + \cdots + \theta_q \varepsilon(t-q)$$

The parameters are identified using a recursive scheme or a maximum-likelihood approach. In this method, the original time series of monthly peak demands is decomposed into deterministic and stochastic load components, and

the WRLS (Weighted Recursive Least Squares) algorithm can be used to update

the parameters of their adaptive ARMA model. Using minimum mean square

error to derive error-learning coefficients, the adaptive scheme outperformed

conventional ARMA models.
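A minimal sketch of fitting an ARMA(p, q) model in R is given below; it is illustrative only (not the adaptive WRLS scheme described above) and assumes the de.ts series from section 4.6. In R, an ARMA model is simply an ARIMA model with d = 0.

# ARMA(p, q) sketch: maximum-likelihood fit of an example ARMA(2, 1) model.
# Assumes 'de.ts' is the monthly demand ts object from section 4.6.
library(forecast)
arma_fit <- Arima(de.ts, order = c(2, 0, 1))
summary(arma_fit)
forecast(arma_fit, h = 12)                   # 12-month-ahead forecast with intervals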

2.3 Autoregressive Integrated Moving-Average (ARIMA) Model

If the process is dynamic or non-stationary, then a transformation of the series to the stationary form has to be done first. The differencing process can do this transformation. By introducing the differencing operator $\nabla = 1 - B$, where B is the backshift operator, the series $\nabla^{d} y(t)$ is made stationary. For a series that needs to be differenced d times and has orders p and q for the AR and MA components, i.e. ARIMA(p, d, q), the model is written as:

$$\phi(B)\, \nabla^{d} y(t) = \theta(B)\, \varepsilon(t)$$

where $\phi(B)$ and $\theta(B)$ are the AR and MA polynomials of orders p and q, respectively. In application, the trend component is used to forecast the growth in the system load, the weather parameters forecast the weather-sensitive load component, and the ARIMA model produces the non-weather cyclic component of the weekly peak load.
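As a sketch (assuming the de.ts series from section 4.6), the differencing order d and the orders p and q can also be chosen automatically with forecast::auto.arima(), which uses unit-root tests and information criteria:

# ARIMA sketch: automatic selection of (p, d, q) and seasonal orders.
# Assumes 'de.ts' is the monthly demand ts object from section 4.6.
library(forecast)
auto_fit <- auto.arima(log(de.ts), seasonal = TRUE)
summary(auto_fit)                            # reports the selected ARIMA(p, d, q)(P, D, Q)[12]
plot(forecast(auto_fit, h = 24))             # two-year-ahead forecast of the log demand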

2.4 Support Vector Machine based Techniques

The support vector machine (SVM) is a powerful machine learning method based on statistical learning theory (SLT), which analyzes data and recognizes patterns, and is used for

classification and regression analysis. They combine generalization control with

a technique to address the curse of dimensionality. When the risk function of conventional support vector machines was modified so that recent insensitive errors were penalized more heavily than distant insensitive errors, the resulting method was named the C-ascending support vector machine. Tests concluded that the

C-ascending support vector machines with the actually ordered sample data

consistently forecast better than the standard support vector machines.

To estimate the relations between input and output variables Lee & Song further

modified the Support Vector Machine (SVM) by using an empirical inference

model. This method was derived by modifying the risk function of the standard

SVM by using the concept of Locally Weighted Regression. The proposed

method proves useful to be in the field of process monitoring, optimization and

quality control. A new short-term load forecasting method by conjunctive use of

fuzzy C-mean clustering algorithm and weighted support vector machines

(WSVMs) was introduced. They clustered input samples according to the

similarity degree. The SVM-based model provides a more promising approach to forecasting electricity load than an artificial neural network. The model overcomes the disadvantages of a general artificial neural network (ANN), such as difficulty in converging, a tendency to become trapped in local minima, an inability to optimize globally, and poor generalization.

The model is proved able to enhance the accuracy, improve global convergence

ability, and reduce operation time.
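For illustration only (the report itself does not implement an SVM), a minimal sketch of support vector regression on lagged load values could look as follows; it assumes the e1071 package and the de.ts series from section 4.6.

# SVR sketch: predict the current month's load from the previous two months.
# Illustrative only; assumes 'de.ts' from section 4.6 and the e1071 package.
library(e1071)
y  <- as.numeric(de.ts)
df <- data.frame(load = y[3:length(y)],
                 lag1 = y[2:(length(y) - 1)],
                 lag2 = y[1:(length(y) - 2)])
svm_fit <- svm(load ~ lag1 + lag2, data = df,
               type = "eps-regression", kernel = "radial")
predict(svm_fit, newdata = df[nrow(df), ])   # fitted prediction for the most recent month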

2.5 Soft Computing Techniques

It is a fact that every system is pervasively imprecise, uncertain and hard to be

modelled precisely. A flexible approach called Soft Computing technique has

emerged to deal with such models effectively and most efficiently in research

scenarios. It has been very widely in use over the last few decades. Soft

computing is an emerging approach which parallels the remarkable ability of

the human mind to reason and learn in an environment of uncertainty and

imprecision. It is fast emerging as a tool to help computer-based intelligent

systems mimic the ability of the human mind to employ modes of reasoning that

are approximate rather than exact. The basic theme of soft computing is that

precision and certainty carry a cost and that intelligent systems should exploit,

wherever possible, the tolerance for imprecision and uncertainty. Soft

computing constitutes a collection of disciplines that include fuzzy logic (FL),

neural networks (NNs), evolutionary algorithms (EAs) like genetic algorithms

(GAs) etc. Natural intelligence is the product of millions of years of biological


evolution. Simulating complex biological evolutionary processes may lead us to

discover how evolution propels living systems toward higher levels of

intelligence. One of the newer and relatively simple optimization approaches is

the GA, which is based on the evolutionary principle of natural selection.

Perhaps one of the most attractive qualities of GA is that it is a derivative free

optimization tool. Demand/load forecasting techniques have also been developed based on the following soft computing and intelligent techniques. Knowledge-based expert systems have been utilized for this purpose as well.

2.6 Genetic Algorithms

The genetic algorithm (GA) or evolutionary programming (EP) approach is

used to identify the autoregressive moving average with exogenous variable

(ARMAX) model for load demand forecasts. By simulating a natural

evolutionary process, the algorithm offers the capability of converging towards

the global extreme of a complex error surface. A global search technique

simulates the natural evolution process and constitutes a stochastic optimization

algorithm. Since the GA simultaneously evaluates many points in the search

space and need not assume that the search space is differentiable or unimodal, it is

capable of asymptotically converging towards the global optimal solution, and

thus can improve the fitting accuracy of the model.

The general scheme of the Genetic Algorithm process is briefly described here.
The integer or real valued variables to be determined in the genetic algorithm

are represented as a D-dimensional vector P for which a fitness f (p) is assigned.

The initial population of k parent vectors Pi, i = 1, k, is generated from a

randomly generated range in each dimension. Each parent vector then generates

an offspring by merging (crossover) or modifying (mutation) individuals in the

current population. Consequently, 2k new individuals are obtained. Of these, k

individuals are selected randomly, with higher probability of choosing those

with the best fitness values, to become the new parents for the next generation.

This process is repeated until f no longer improves or the maximum number of

generations is reached.

ARMAX form:

$$A(q)\, y(t) = B(q)\, u(t) + C(q)\, e(t)$$

where $y(t)$ is the load at time t, $u(t)$ is the exogenous temperature input at time t, $e(t)$ is white noise at time t, $q^{-1}$ is the back-shift operator, and A(q), B(q), and C(q) are parameters of the autoregressive (AR), exogenous (X), and moving average (MA) parts, respectively.

The fuzzy autoregressive moving average with exogenous variable (FARMAX)

model is used for load demand forecasts. The model is formulated as a combinatorial

optimization problem, and then solved by a combination of heuristics and

evolutionary programming. A genetic algorithm with a newly developed

knowledge augmented mutation-like operator called the forced mutation can

also be introduced. To maximize the efficiency of GAs, the three inherent

parameters of GAs are to be optimized: the mutation probability (Pm), the crossover probability (Pc), and the population size (POPSIZE). For parameter

optimization of GAs, several results have been obtained over the last few years.
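As an illustrative sketch only (the report does not implement a genetic algorithm), the GA package in R could be used to search for AR coefficients that minimize the in-sample squared error; the AR(2) structure, bounds and GA settings below are assumptions made for the example.

# GA sketch: evolve AR(2) coefficients that minimize the sum of squared one-step errors.
# Illustrative only; assumes 'de.ts' from section 4.6 and the GA package.
library(GA)
y <- as.numeric(scale(de.ts))
sse <- function(phi) {
  e <- y[3:length(y)] - phi[1] * y[2:(length(y) - 1)] - phi[2] * y[1:(length(y) - 2)]
  sum(e^2)
}
ga_fit <- ga(type = "real-valued", fitness = function(phi) -sse(phi),
             lower = c(-1, -1), upper = c(1, 1),
             popSize = 50, maxiter = 100)     # population size and number of generations
ga_fit@solution                               # fittest coefficients found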

2.7 Fuzzy Logic

It is well known that a fuzzy logic system with centroid de-fuzzification can

identify and approximate any unknown dynamic system (here load) on the

compact set to arbitrary accuracy. Liu observed that a fuzzy logic system has

great capability in drawing similarities from huge data sets. The similarities in the input data $(L_{-i}, \ldots, L_{0})$ can be identified by their different first-order and second-order differences.

The fuzzy logic-based forecaster works in two stages: training and on-line
forecasting. In the training stages, the metered historical load data are used to

train a 2m-input, 2n-output fuzzy-logic based forecaster to generate patterns

database and a fuzzy rule base by using first and second-order differences of the

data. After enough training, it will be linked with a controller to predict the load

change online. If a most probably matching pattern with the highest possibility

is found, then an output pattern will be generated through a centroid de-

fuzzifier. Several techniques have been developed to represent load models by

fuzzy conditional statements. An expert system was used to do the updating

function. Short-term forecasting was performed and evaluated on the Taiwan

power system. A hybrid approach was proposed which can accurately forecast

on weekdays, public holidays, and days before and after public holidays.

Simulated annealing and the steepest descent method perform the search for the

optimum solution. A hybrid scheme combining fuzzy logic with both neural networks and expert systems has also been applied to load forecasting. Fuzzy load values are inputs

to the neural network, and the output is corrected by a fuzzy rule inference

mechanism. The fuzzy logic approach for next-day load forecasting offers three

advantages. These are namely the ability to (1) handle non-linear curves, (2)

forecast irrespective of day type and (3) provide accurate forecasts in hard-to-

model situations. Automatic model identification is used, utilizing analysis of variance, cluster estimation, and recursive least squares, within a two-phase STLF methodology that also uses orthogonal least squares (OLS) in fuzzy model identification.


2.8 Neural Networks (NN)

Neural networks or artificial neural networks (ANN) have very wide

applications because of their ability to learn. Neural networks offer the potential

to overcome the reliance on a functional form of a forecasting model. There are

many types of neural networks: multilayer perceptron network, self-organizing

network, etc. There are multiple hidden layers in the network. In each hidden

layer, there are many neurons. Inputs are multiplied by weights ωi and are

added to a threshold θ to form an inner product number called the net function.

The net function NET for example, is put through the activation function y, to

produce the unit’s final output, y (NET). The main advantage here is that most

of the forecasting methods seen in the literature do not require a load model.

However, training usually takes a lot of time. Consider a fully connected feed-forward neural network: the network outputs are linear functions of the weights

that connect inputs and hidden units to output units. Therefore, linear equations

can be solved for these output weights. In each iteration through the training

data (epoch), the output weight optimization training method uses conventional

back propagation to improve hidden unit weights, then solves linear equations

for the output weights using the conjugate gradient approach. Attention was

paid to accurately model special events, such as holidays, heat waves, cold
snaps and other conditions that disturb the normal pattern of the load.
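As a hedged illustration (the report itself does not train a neural network), a small feed-forward network on lagged, scaled load values could be fitted with the nnet package; the lag structure and the number of hidden units are assumptions made for the example.

# Feed-forward NN sketch: one hidden layer, linear output unit for regression.
# Illustrative only; assumes 'de.ts' from section 4.6 and the nnet package.
library(nnet)
y  <- as.numeric(scale(de.ts))               # scaling the inputs helps training
df <- data.frame(load = y[3:length(y)],
                 lag1 = y[2:(length(y) - 1)],
                 lag2 = y[1:(length(y) - 2)])
set.seed(1)
nn_fit <- nnet(load ~ lag1 + lag2, data = df,
               size = 5, linout = TRUE, maxit = 500)
predict(nn_fit, newdata = df[nrow(df), ])    # fitted prediction for the most recent month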

2.9 Knowledge-Based Expert Systems

Expert systems are new techniques that have emerged because of advances in

the field of artificial intelligence. An expert system is a computer program that

has the ability to reason, explain and have its knowledge base expanded as new

information becomes available to it. To build the model, the ‘knowledge

engineer’ extracts load-forecasting knowledge from an expert in the field by

what is called the knowledge base component of the expert system. This

knowledge is represented as facts and IF-THEN rules, and consists of the set of relationships between the changes in the system load and changes in natural and forced condition factors that affect the use of electricity. This rule base is used daily to generate the forecasts. Some of the rules do not change over time,

while others have to be updated continually. The logical and syntactical

relationships between weather load and the prevailing daily load shapes have

been widely examined to develop different rules for different approaches. The

typical variables in the process are the season under consideration, day of the

week, the temperature and the change in this temperature.

CHAPTER 3: LITERATURE REVIEW

“Physical algorithm” is a general term referring to models that primarily use

physical data such as temperature, velocity, density, and terrain information

based on a numerical weather prediction (NWP) model to predict wind speeds

in subsequent periods. Spatial correlation models, which are applied to solve

time series forecasting to make up for the shortcomings of physical algorithms,

take the relationships of time series from different locations into consideration.

A classic case is a novel model proposed by Tascikaraoglu et al utilizing a

spatiotemporal method and a wavelet transform, successfully improving the

performance of forecasting compared to other benchmark models. However,

spatial correlation arithmetic is always difficult to use in practice because of its

requirements of strict measurements and a large amount of meticulous

measuring in many spatially related sites.

Traditional prediction methods also include random time series models such as

exponential smoothing, autoregressive (AR) methods, filtering methods,

autoregressive moving average (ARMA) methods, and the well-known

autoregressive integrated moving averages (ARIMA) and seasonal ARIMA

models, mainly focusing on regression analysis. The regression model is aimed


at establishing a relationship between historical data, treated as dependent

variables, and influencing factors, treated as independent variables.

Modern forecasting methods include artificial neural networks (ANNs), support

vector machines (SVMs), fuzzy systems, expert system forecasting methods,

chaotic time series methods, gray models, adaptive models, optimization

algorithms, etc. These modern methods are getting more popular among

researchers when dealing with time series forecasting. These artificial

intelligence models can achieve good forecasting performance because of their

unique characteristics, such as memory, self-learning, and self-adaptability,

since the neural networks are products of biological simulation that follow the

behavior of the human brain. Park showed good performance of this type of

model after first applying ANNs in power load forecasting in 1991. He

concluded that ANNs were highly effective in electrical load forecasting. After

that, many time series forecasting studies were performed using various

artificial neural networks by many researchers. Lou and Dong proved that

electric load forecasting with RFNN showed much higher variability with

hourly data in Macau. Okumus and Dinler integrated ANNs and the adaptive

neuro-fuzzy inference system to predict wind power, and forecasting results

proved that their proposed hybrid model

was better than the classical methods in forecasting accuracy. Hong selected

better parameters for SVR by using the CPSO algorithm, while Che and Wang

established a hybrid model that was a combination of ARIMA and SVM, called
SVRARIMA. Liu et al. built a model integrating EMD, extended extreme

learning machine (ELM), Kalman filter, and PSO algorithm. Although the

hybrid model seemed better than individual classical models, the limitations of

each model due to the nature of the structure seemed inevitable. In order to

solve this problem, a combined forecasting model is proposed. The combined

forecasting theory has been developed through the joint efforts of three

generations of scientists. It was initiated by Bates and Granger and developed

by Diebold and Pauly, then further extended by Pesaran and Timmermann as a

combination of several individual models. Many kinds of ANNs have been

combined into short-term forecasting models in order to fully utilize the

advantages of individual models and at the same time overcome their

shortcomings. There are some typical studies: Zhang et al. successfully obtained

promising results of wind speed forecasting by developing a combined model

that consisted of CEEMDAN, five neural networks, CLSFPA, and no negative

constraint theory (NNCT). Moreover, it costs a lot of computing time and

resources when using NWP models because of their complex calculation

process and high cost. Spatial correlation arithmetic requires detailed

measurements from multiple spatially correlated sites, which increases the

difficulty in searching for electric load data. Moreover, because of the strict

measuring requirements and time delays, the model is always hard to

implement. For conventional statistical arithmetic, mainly known as the linear

model, there are insurmountable shortcomings. Primarily, these models cannot


deal with nonlinear features of electric load time series. Moreover, the

regression method also fails to achieve the expected forecasting accuracy.

Linear regression relies too much on historical data to cope with nonlinear

forecasting problems; as time goes by, the forecasting effect of regression

analysis models will become weaker and weaker. In addition, when faced with

complex objective data, it is hard to choose the appropriate influencing factors.

The exponential smoothing model also has shortcomings, in that it cannot

recognize the turning point of the data and does not perform well in long-term

forecasting. As for the autoregressive moving average model, it only gets results

through historical and current data, ignoring potential influencing factors. In

addition, strong random factors of the data may lead to instability of the model,

which affects the accuracy of the forecasting performance. Overall, none of

these models meets the accuracy required by an electric load forecasting system.

For artificial intelligence arithmetic, although artificial intelligence neural

network performance is superior to traditional forecasting techniques, ANNs are

not impeccable; the defects and shortcomings of their structure cannot be ignored.

There are three major problems. First, it is hard to choose the parameters of

ANN models, as a slight change in parameters may cause huge differences in

the outcomes. Second, ANNs are inclined to fall into local minima owing to

their relatively slow self-learning convergence rate. Lastly, the number of layers

and neurons in a neural network structure has an effect on the forecasting result
and computing time. As to other models, SVM has a high requirement for

storage space and expert systems strongly rely on knowledge databases, while

gray forecasting models can produce decent results only under the condition of

exponential growth trends. To solve these problems, evolutionary algorithms

are applied. When the optimization algorithms are combined with forecasting models, more reasonable parameters will

be selected and results that are more accurate will be obtained.

The main contributions and novelties of our proposed model are summarized as

follows:

The presented model includes all the state-of-the-art statistical methods used in the

demand forecasting fields. From the results, we comprehended that the

ensemble of the results of time-series model and regression-based model

provides an efficient outcome because of nullifying the over-forecasting and

under-forecasting values and by bringing the forecast values near to the actual.

These results are better than those of the individual algorithms used in the two models, viz. ARMA and ARIMA. The experimental results show that our

combined model has high forecasting accuracy and strong stability.

CHAPTER 4: METHODOLOGY AND COMPARISON

4.1 Building Processes of ARMA Prediction Model:

Figure 1.1 The Building stages of the ARMA models

Different techniques, namely regression, multiple regression, exponential

smoothing, iterative reweighted least squares, adaptive load forecasting,

stochastic time series autoregressive, ARMA model, ARIMA model, support

vector machine based, soft computing based models- genetic algorithms, fuzzy

logic, neural networks and knowledge based expert systems etc. have been

applied to load forecasting. The merits and demerits of these techniques are

presented technique wise. From the works reported so far, it can be inferred that

demand-forecasting techniques based on soft computing methods are gaining

major advantages for their effective use. There is also a clear move towards

hybrid methods, which combine two or more of these techniques. The research

has been shifting and replacing old approaches with newer and more efficient

ones.

An ARMA Model is used for forecasting the throughput performance of each

unit passed by electroplating products in the cold rolling production process. An

IDE algorithm is proposed for optimizing parameter estimation of the ARMA

model. Compared with the CDE algorithm, the IDE algorithm is more effective

for improving parameter estimation performance of the ARMA model. The

developed ARMA models based on the IDE algorithm can forecast throughput

performance of acid pickling and rolling mill units with acceptable accuracy

and forecast throughput performance of annealing and electroplating units with

high accuracy. Future study will be extended to handle a variety of products of

plant-wide production in iron and steel enterprises.

4.2 Auto-Correlation

This compares the persistence over time of the random walk. The

autocorrelation seen in figure 1.2 is high, as it moves up and down by a small

amount due to past-accumulated motion. This indicates that there is a long

persistence in the level of electricity demand, making the electricity supply-demand series non-stationary.
The difference between time steps of the autocorrelation can also be determined

to validate if the time series model is stationary or non-stationary, when the

autocorrelation plot does not provide enough information for stationarity

validation.

Figure 1.2 Auto-Correlation of time series model

4.3 Stationary Auto-regression

Stationary auto-regression provides information that verifies if the time series

model is stationary or not. Figure 1.3 shows the stationary auto-regression plot

of the time series model. It shows that the model is random and has the

tendency to move beyond the center and back, which makes the model

nonstationary.

Figure 1.3 Stationary Auto-Regression of the time series model

4.4 Difference between an ARMA model and ARIMA

The two models share many similarities. The AR and MA components are

identical, combining a general autoregressive model AR (p) and general moving

average model MA (q). AR (p) makes predictions using previous values of the

dependent variable. MA (q) makes predictions using the series mean and

previous errors. What sets ARMA and ARIMA apart is differencing. An

ARMA model is a stationary model; if your model is not stationary, then you

can achieve stationarity by taking a series of differences. The “I” in the ARIMA

model stands for integrated; it is a measure of how many nonseasonal

differences are needed to achieve stationarity. If no differencing is involved in

the model, then it becomes simply an ARMA. A model in which a dth difference is taken to fit an ARMA(p, q) model is called an ARIMA process of order (p, d, q). You

can select p, d, and q with a wide range of methods, including AIC, BIC, and empirical autocorrelation measures of accuracy.
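As a sketch of such a selection (assuming the log-transformed de.ts series used in section 4.6), candidate (p, d, q) orders can be compared directly by their AIC values:

# Order-selection sketch: compare candidate ARIMA orders by AIC.
# Assumes log(de.ts) as used in section 4.6.
library(forecast)
candidates <- list(c(1, 1, 1), c(2, 1, 1), c(1, 1, 2), c(0, 1, 1))
aics <- sapply(candidates, function(ord) AIC(Arima(log(de.ts), order = ord)))
names(aics) <- sapply(candidates, paste, collapse = ",")
aics                                         # the smallest AIC indicates the preferred order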

Mean Absolute Percentage Error (MAPE)

We examine the mean error in detail according to the orders p and q to determine their optimal values, computing the difference between the values obtained through the ARMA(p, q) model and those obtained experimentally. However, beyond these values the gain is small. Due to the constrained capabilities of the network devices, the model complexity should be minimized in order to decrease the memory and computational power usage. Median values are used for the prediction, and MAPE is used for error analysis. Using MAPE results in a low profit loss.

◼ Pro: The MAPE measure works well in load forecasting, since load values

are significantly higher than zero

◼ Con: MAPE values are biased for actual electricity price values, which are around zero.
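A minimal sketch of the MAPE computation is given below; the actual and forecast vectors shown are hypothetical example values, not results from the report.

# MAPE sketch: mean absolute percentage error between actual and forecast values.
# 'actual' and 'fcst' are hypothetical example vectors.
mape <- function(actual, fcst) mean(abs((actual - fcst) / actual)) * 100
actual <- c(11819, 12034, 12505)
fcst   <- c(11650, 12210, 12390)
mape(actual, fcst)                           # well behaved here since loads are far from zero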

4.5 Comparison of models

The combination of autoregressive and moving average leads to a very relevant

class of models in the field of load curve forecasting, namely the ARMA model.

One methodology considers temperature and breaks the load down into a deterministic and a stochastic component, the latter modelled as an ARMA process. If two

models are generally similar in terms of their error statistics and other

diagnostics, you should prefer the one that is simpler and/or easier to

understand. The simpler model is likely to be closer to the truth, and others will

usually more easily accept it.

4.6 Experiment and Analysis

ARIMA Model Program with Output in Console

> library(forecast)
> library(fUnitRoots)
> library(timeDate)
> library(timeSeries)
> library(trend)
> library(tseries)
> library(unittest)
> library(urca)
> library(wmtsa)
> library(zoo)
> library(xts)

> de <- demand
> de.ts <- ts(data = de$demand, frequency = 12, start = c(2010,1), end = c(2015,12))
#This is the start of the time series and end of the time series.

#The cycle of this time series is 12months in a year

> class(de.ts)

#This tells you that the data series is in a time series format
[1] "ts"
> summary(de.ts)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   9442   11094   11819   11841   12808   13775

> plot(de.ts)

Figure 2.1 Time vs Demand data for the year 2010-2016

> abline(reg=lm(de.ts~time(de.ts)))

Figure 2.2 Time vs Demand data for the year 2010-2016 with fitted linear trend line

> cycle(de.ts)

#This will print the cycle across years.

Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec

2010 1 2 3 4 5 6 7 8 9 10 11 12

2011 1 2 3 4 5 6 7 8 9 10 11 12

2012 1 2 3 4 5 6 7 8 9 10 11 12

2013 1 2 3 4 5 6 7 8 9 10 11 12

2014 1 2 3 4 5 6 7 8 9 10 11 12

2015 1 2 3 4 5 6 7 8 9 10 11 12

> plot(aggregate(de.ts,FUN=mean))

#This will aggregate the cycles and display a year on year trend

Figure 2.3 Aggregated cycle data for the year on year trend for 2010-2015

Figure 2.4 Boxplot across months to illustrate seasonal effect

> boxplot(de.ts~cycle(de.ts))

#Box plot across months will give us a sense on seasonal effect

Figure 2.5 Complete auto-correlation function plot in time series analysis with its lagged values

>acf(de.ts)

Figure 2.6 Partial auto-correlation function plot in time series analysis with its correlation of the

residuals

>pacf(de.ts)

Figure 2.7 Plot between partial auto-correlation function and lag function

>log(de.ts)

Jan Feb Mar Apr May

2010 9.163249 9.214930 9.207236 9.219003 9.227296

2011 9.301003 9.331052 9.347316 9.363834 9.356344

2012 9.152923 9.316680 9.361085 9.348100 9.360741

2013 9.397732 9.393828 9.451638 9.409765 9.403932

2014 9.510519 9.468697 9.458138 9.457669 9.483797

2015 9.428592 9.477692 9.525443 9.517899 9.490998

Jun Jul Aug Sep Oct

2010 9.206132 9.269269 9.288689 9.202510 9.242323

2011 9.356862 9.354008 9.291183 9.318298 9.309643

2012 9.414342 9.376194 9.378732 9.356257 9.317669


2013 9.427305 9.400713 9.394826 9.416297 9.414750

2014 9.530611 9.499646 9.467383 9.488048 9.445096

2015 9.497772 9.530248 9.522154 9.529303 9.491677

Nov Dec

2010 9.219003 9.274066

2011 9.315601 9.347403

2012 9.303193 9.208839

2013 9.326433 9.396322

2014 9.365633 9.419872

2015 9.432764 9.462887

>diff(de.ts)

Jan Feb Mar Apr May Jun Jul Aug

2010 506 -77 118 84 -213 649 208

2011 291 334 185 191 -87 6 -33 -703

2012 -2027 1680 505 -150 146 640 -459 30

2013 2076 -47 715 -522 -71 287 -326 -71

2014 1457 -553 -136 -6 339 630 -420 -424

2015 108 626 639 -103 -361 90 440 -111

Sep Oct Nov Dec

2010 -893 403 -238 571

2011 298 -96 66 359

2012 -263 -438 -160 -988


2013 261 -19 -1037 813

2014 270 -555 -966 651

2015 98 -508 -758 382

> adf.test(diff(log(de.ts)), alternative="stationary", k=0)

Augmented Dickey-Fuller Test

data: diff(log(de.ts))

Dickey-Fuller = -11.129, Lag order = 0, p-value = 0.01

alternative hypothesis: stationary

Warning message:

In adf.test(diff(log(de.ts)), alternative = "stationary", k = 0) :

p-value smaller than printed p-value

> Acf(log(de.ts))

Figure 2.8 Correlogram of the observed data trend obtained with the function plot.acf()

>pacf(log(de.ts))

Figure 2.9 Correlogram of the observed data trend with the function plot.pacf()

> (fit <- arima(log(de.ts), c(1, 1, 1),seasonal = list(order = c(0, 1, 1), period =

12)))

Call:arima(x = log(de.ts), order = c(1, 1, 1), seasonal = list(order = c(0, 1, 1),

period = 12))

Coefficients:

ar1 ma1 sma1

0.0918 -0.8285 -0.9997

s.e. 0.1855 0.1399 0.2554

sigma^2 estimated as 0.001864: log likelihood = 90.45, aic = -172.9

> Acf(ts(fit$residuals),main="ACF RESIDUALS")

Figure 2.10 Correlogram of the observed residuals trend obtained with the function acf()

>pacf(ts(fit$residuals), main = "PACF RESIDUALS")

Figure 2.11 Correlogram of the observed residuals trend obtained with the function pacf()

>pred <- predict(fit,n.ahead=5*12)

>pred

$pred

Jan Feb Mar Apr May

2016 9.498771 9.541039 9.565774 9.560017 9.561157

2017 9.553885 9.595475 9.620148 9.614385 9.615525

2018 9.608254 9.649843 9.674517 9.668753 9.669893

2019 9.662622 9.704211 9.728885 9.723121 9.724261

2020 9.716990 9.758580 9.783253 9.777490 9.778629

Jun Jul Aug Sep Oct

2016 9.579477 9.578986 9.564468 9.559093 9.544166

2017 9.633845 9.633354 9.618836 9.613461 9.598535

2018 9.688213 9.687722 9.673204 9.667829 9.652903

2019 9.742581 9.742091 9.727573 9.722197 9.707271

2020 9.796949 9.796459 9.781941 9.776565 9.761639

Nov Dec

2016 9.501066 9.525385

2017 9.555434 9.579753

2018 9.609802 9.634122

2019 9.664170 9.688490

2020 9.718538 9.742858

$se

Jan Feb Mar Apr

2016 0.04693807 0.04852130 0.04937250 0.05015693

2017 0.05836789 0.05930557 0.06014828 0.06097223


2018 0.06967634 0.07063173 0.07149689 0.07234482

2019 0.08134547 0.08230943 0.08318718 0.08404907

2020 0.09323996 0.09420805 0.09509323 0.09596369

May Jun Jul Aug

2016 0.05092462 0.05168049 0.05242543 0.05315992

2017 0.06178454 0.06258626 0.06337783 0.06415964

2018 0.07318231 0.07401025 0.07482904 0.07563896

2019 0.08490160 0.08574560 0.08658136 0.08740914

2020 0.09682575 0.09768013 0.09852711 0.09936687

Sep Oct Nov Dec

2016 0.05388442 0.05459945 0.05530673 0.05601933

2017 0.06493205 0.06569559 0.06645258 0.06722340

2018 0.07644033 0.07723363 0.07802159 0.07882967

2019 0.08822918 0.08904195 0.08985045 0.09068385

2020 0.10019962 0.10102581 0.10184866 0.10270012

>ts.plot(de.ts, 2.718^pred$pred, log = "y", lty = c(1,3))

Figure 2.12 ARIMA model to predict the future 5 years with seasonal components in the ARIMA

formulation

Figure 2.13 Predicted ARIMA formulation graph for the next 10 years: 2010-2020

4.7 Comparison of Results

Figure 3.1 Demand forecast data of the year 2016 in Tamil Nadu

Figure 3.2 Forecast error data of year 2016 in Tamil Nadu

Figure 3.3 Demand forecast data of the year 2017 in Tamil Nadu

Figure 3.4 Forecast Error data of the year 2017 in Tamil Nadu

Figure 3.5 Forecasted data of year 2018 in Tamil Nadu

Figure 3.6 Error data of the year 2018 in Tamil Nadu

Figure 3.7 Forecasted data of year 2019 in Tamil Nadu

Figure 3.8 Error Data of the year 2019 in Tamil Nadu


Figure 3.9 Demand forecast of the year 2020 for January in Tamil Nadu

CHAPTER 5: CONCLUSION AND FUTURE SCOPE

This report presents an attempt to forecast the maximum demand of electricity by finding an appropriate time series and regression model. Various classes of time series models, namely ARIMA, Naïve, Seasonal Holt-Winters, ARAR forecast and Regression with ARMA errors, have been considered. Results indicated that AR(2), fitted to the mean-corrected series differenced at lags 12 and 1, emerged as the best model for forecasting the maximum demand of electricity. It is suggested that models incorporating other variables, like the hourly or daily maximum demand or any intervening events, may be useful in forecasting the electricity demand, and this will be looked into in future research.

The integration strategy was developed by drawing inspiration from the

ensemble methodology, namely, boosting. The approach considers the

performance of each model in time and combines the weighted predictions of

the best performing models for the demand forecasting process. Various classes

of time series models, namely ARIMA, Regression with ARMA errors have

been considered.

The statistics related to the last 10 years were used to train the models and the 5

past years were used to forecast. To the best of our knowledge, this is the first

attempt to consolidate the real time data with the ARMA model algorithm, and
different time series methods for demand forecasting systems. After comparison

between the ARMA and ARIMA models, we conclude that the ARIMA model is the most effective due to its lower error.

FUTURE SCOPE

Seasonal based electricity demand forecasting can be an addition. It is

based on the regularities in time series related to the changes in seasons. Based

on the above work, real-time regional monthly electric power consumption demand data can be used. This may be useful in forecasting electricity demand and will be looked into in future research.

REFERENCES

[1] A. K. Singh, Ibraheem, S. Khatoon, Md. Muazzam and D. K. Chaturvedi, "An Overview of Electricity Demand Forecasting Techniques," 2012. https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6508132

[2] C. Nataraja, M. B. Gorawar, G. N. Shilpa and J. Shri Harsha, "Analysis of an adaptive time-series autoregressive moving-average (ARMA) model for short-term load forecasting," 2012.

[3] J. T. Huang, Y. Meng and Y. Yang, "Forecasting of Throughput Performance Using an ARMA Model with Improved Differential Evolution Algorithm," 2017. https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8256132

[4] G. E. P. Box, G. M. Jenkins and G. C. Reinsel, Time Series Analysis: Forecasting and Control, 4th ed., Prentice Hall: Englewood Cliffs, NJ, 2011.

[5] R. Ngabesong and L. McLauchlan, "Implementing 'R' Programming for Time Series Analysis and Forecasting of Electricity Demand," 2019. https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8767131

[6] M. Prapanna, S. Labani and G. Saptarsi, "Study of Effectiveness of Time Series Modelling (ARIMA) in Forecasting Stock Prices," International Journal of Computer Science, Engineering and Applications (IJCSEA), 4(2), 13-29, 2014.

[7] S. Mehrmolaei and M. R. Keyvanpour, "Time series forecasting using improved ARIMA," 2016. https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7529496

[8] J. L. Torres, A. Garcia, M. D. Blas et al., "Forecast of hourly average wind speed with ARMA models in Navarre (Spain)," 2005. https://www.sciencedirect.com/science/article/abs/pii/S0038092X04002877

[9] R. G. Kavasseri and K. Seetharaman, "Day-ahead wind speed forecasting using fARIMA models," Renewable Energy, 2019. https://www.sciencedirect.com/science/article/abs/pii/S0960148108003327

[10] M. H. Alsharif, M. K. Younes and J. Kim, "Time Series ARIMA Model for Prediction of Daily and Monthly Average Global Solar Radiation: The Case Study of Seoul, South Korea," 2019.

[11] Tarno, Subanar, D. Rosadi and Suhartono, "New procedure for determining order of subset autoregressive integrated moving average (ARIMA) based on over-fitting concept," 2012 International Conference on Statistics in Science, Business and Engineering (ICSBE), Langkawi, 2012, pp. 1-5, doi: 10.1109/ICSSBE.2012.6396643.

[12] J. Aston, D. Findley, T. McElroy, K. Wills and D. Martin, "New ARIMA Models for Seasonal Time Series and Their Application to Seasonal Adjustment and Forecasting," 2007.

[13] D. Bartholomew, G. Box and G. M. Jenkins, "Time Series Analysis: Forecasting and Control," Operat. Res. Quart., 22, 199-201, 1976, doi: 10.2307/3008255.

[14] G. Chang, Y. Zhang, D. Yao and Y. Yue, "Short-term traffic flow forecasting methods," ICCTP 2011, 2011.

