Draft 1 Midterm


ABSTRACT

The energy industry is diverse, encompassing many different sectors, units, and processes,
which complicates data collection and analysis because the sources are disparate, difficult
to integrate, and unstable. Energy data analytics is an important operational tool for power
companies to manage load and ensure a low-cost, high-availability, reliable energy supply.
Energy grids require smarter and more flexible solutions, and this is where data analytics
comes in. First, we provide an overview of the present scenario of data analytics research in
the field of energy and show historical literature growth, leading countries in the field, and
the most intensive international collaborations. This paper then provides an overview of the
PJM Interconnection electricity market and describes and classifies its organization and
functioning. We discuss the data sets used, feature selection methods, benchmark methods,
evaluation metrics, and model complexity. Based on the findings from the different areas, we
identify best practices that one area can learn from the others.

We used Python to perform the data analytics for this project because it has a vast
collection of libraries for numerical computation and data manipulation, as well as libraries
for graphics and data visualization. The Python libraries used here include NumPy, Pandas,
and Matplotlib. NumPy stands for ‘Numerical Python’. It supports n-dimensional arrays and
provides numerical computing tools, including linear algebra and Fourier transforms. NumPy
is widely used for data analysis alongside machine learning modules such as scikit-learn, and
hence we use them here. Pandas provides functions to handle missing data, perform
mathematical operations, and analyze data. Matplotlib is commonly used for data
visualization, including interactive plots. We also use supervised machine learning (SML)
algorithms, which learn from labelled training data in order to predict outcomes for new data.

1. INTRODUCTION

Accurate load forecasting is critical in power system planning and operation, for example
in developing unit commitment plans and establishing appropriate spinning reserves and
maintenance plans. Load forecasting is of three types: long-term, medium-term, and
short-term. Medium- and long-term load forecasts cover horizons from several weeks to
several years and are primarily employed in long-term planning and seasonal demand
analysis. In contrast, short-term load forecasting (STLF), which ranges from minutes to one
week ahead, is a critical component of the power grid's daily operation. Calibrated and
validated short-term load forecasts are considered necessary for planning operations such
as hydro-thermal power generation coordination, economic dispatch, energy trading,
predictive frequency control, security analysis, and system restoration.

Energy data analytics uses statistical software, big data, and machine learning techniques
to analyse all aspects of energy usage within the utility industry. Users can forecast
demand, optimize costs, improve the supply chain, better understand customer consumption
patterns, and prepare for future market behaviour. Energy data analytics benefits both
utility companies and their customers. Forecasting changes in power flow, making long-term
strategic decisions, and behavioural modelling all rely on the availability of both good
data and data analytics. Analytics also provides better insight into asset management and
resource planning. It speeds up the settlement process, which is heavily data-intensive,
making it possible to handle increasing volumes of data from smart meters; it enables new
business opportunities, such as smart contracts and energy management across
organizational boundaries; and it supports high-frequency energy trading. Energy data
analytics allows for a more comprehensive and responsive overview of the utility sector, in
response to real-time market fluctuations. In all cases, it helps businesses solve complex
problems and make more efficient business decisions.

2. LITERATURE REVIEW

PJM is a regional transmission organization (RTO) that coordinates the movement of

wholesale electricity serving all or parts of Delaware, Illinois, Indiana, Kentucky,

Maryland, Michigan, New Jersey, North Carolina, Ohio, Pennsylvania, Tennessee,

Virginia, West Virginia, and the District of Columbia.

Figure 1: Representation of ISO and RTO controlled regions respectively

To understand what PJM does, it is helpful to think of the process of getting electricity
from a power plant to your home as having three parts: generation, transmission, and
distribution. Generation is the production of electricity at a power plant; distribution is
a utility routing that electricity to your home. Transmission, then, is the middle portion
of the process and involves sending electricity from power plants to utilities, sometimes
over long distances.

PJM manages the market where power plants bid to provide electricity to utilities

within PJM territory. It monitors the transmission system to ensure that the right

amount of electricity is always supplied. To function properly, our electricity system

must be in balance at every moment between what is being used (consumption) and

what is being produced (generation).

Figure 2: Generation and Load data Representation on the PJM website

Load forecasting is very important for the electrical energy industry in a deregulated
economy. It has many applications, including energy purchasing and generation, load
switching, contract evaluation, and infrastructure development. A great variety of
mathematical methods have been developed for load forecasting.

Over the past 30 years, classic load forecasting techniques have been extensively employed.
These forecasts attribute load demand to the country's economic activity and temperature
fluctuations, while assuming that load is inelastic to price. Based on forecast horizon,
load forecasts can be divided into three categories:

I. STLF: STLF time periods range from a few minutes or hours to one day or a

week ahead. STLF seeks to achieve economical dispatch and optimal

generator unit commitment while addressing real-time control and security

evaluation.

II. MTLF: The duration of MTLF is between one week and one year (possibly

two years). MTLF aims to balance demand and generation via maintenance

scheduling, coordination of load dispatch, and price settlement.

III. LTLF: The LTLF time period ranges from a few years to 10 to 20 years in the

future. LTLF aims to plan the expansion of the system, including generation,

transmission, and distribution. In certain instances, it also affects investments

in new generating units.
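
As a rough illustration of the three categories above, the following Python sketch maps a forecast horizon to STLF, MTLF, or LTLF. The function name `classify_horizon` and the exact cut-offs (one week for STLF, and two years taken as the upper bound of MTLF) are assumptions chosen for illustration; the ranges given in the text are approximate.

```python
from datetime import timedelta

# Assumed cut-offs for illustration; the ranges in the text are approximate.
STLF_MAX = timedelta(weeks=1)       # STLF: minutes up to one week ahead
MTLF_MAX = timedelta(days=2 * 365)  # MTLF: one week up to (possibly) two years


def classify_horizon(horizon: timedelta) -> str:
    """Return the load forecasting category for a forecast horizon."""
    if horizon <= timedelta(0):
        raise ValueError("horizon must be positive")
    if horizon <= STLF_MAX:
        return "STLF"
    if horizon <= MTLF_MAX:
        return "MTLF"
    return "LTLF"


if __name__ == "__main__":
    for h in (timedelta(minutes=5), timedelta(days=90), timedelta(days=3650)):
        print(h, "->", classify_horizon(h))
```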

With the advent of advanced computational technologies, the potential to process data

for the development of predictive models based on machine learning has multiplied.

However, emphasis has quickly shifted away from the model's interpretability due to

a focus on its accuracy alone. Consequently, forecasters and academics require

forecasting analytics that are both accurate and interpretable.

This paper advances the understanding of short-term load forecasting via generalised
regression analysis with high-degree polynomials in order to aid energy forecasters. The
proposed model uses one month of instantaneous load data from the PJM Interconnection,
sampled every five minutes, to forecast the irregularly changing energy demand at the
consumer level.
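
A minimal sketch of polynomial regression on a five-minute load series is given here to make the idea concrete. It is not the model developed in this project: the data are synthetic (a daily sinusoid plus noise standing in for PJM load), and the polynomial degree, the time-of-day feature, and the 25-day/5-day chronological split are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)

# Synthetic stand-in for one month of five-minute load readings (not PJM data).
samples_per_day = 24 * 12
n_days = 30
t = np.arange(n_days * samples_per_day)
hour_of_day = (t % samples_per_day) / 12.0
load_mw = (
    90_000
    + 15_000 * np.sin(2 * np.pi * (hour_of_day - 9) / 24)
    + rng.normal(0, 1_000, size=t.size)
)

# Single feature: time of day scaled to [0, 1) to keep high powers well conditioned.
X = (hour_of_day / 24.0).reshape(-1, 1)

# Chronological split: first 25 days for training, last 5 days for testing.
split = 25 * samples_per_day
X_train, X_test = X[:split], X[split:]
y_train, y_test = load_mw[:split], load_mw[split:]

# Degree 6 is an assumed value, not a tuned one.
model = make_pipeline(PolynomialFeatures(degree=6), LinearRegression())
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Mean absolute percentage error on the held-out days.
mape = float(np.mean(np.abs((y_test - pred) / y_test)) * 100)
print(f"MAPE on held-out days: {mape:.2f}%")
```

Because the fitted coefficients belong to explicit polynomial terms, a model of this kind remains easy to inspect, which is the interpretability argument made above.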

3. OBJECTIVES

1. To comprehend how the PJM Interconnection works and import pertinent data from the
PJM Data Miner.

2. To perform data cleansing and data completion.

3. To conduct exploratory data analysis through graphing.

4. To utilise machine learning techniques by employing Python machine learning libraries
to develop analytical methods and ML models for pattern-based forecasting.
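
The data cleansing and completion objective could be approached along the following lines with Pandas. This is a sketch under assumptions: the column names `timestamp` and `load_mw`, the tiny sample frame, and the choice of time-based linear interpolation for gap filling are all hypothetical and do not describe the actual Data Miner export.

```python
import numpy as np
import pandas as pd

# Hypothetical raw extract with a duplicate row, a missing reading (NaN),
# and a missing timestamp (00:10 is absent).
raw = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2023-01-01 00:00",
                "2023-01-01 00:05",
                "2023-01-01 00:05",
                "2023-01-01 00:15",
                "2023-01-01 00:20",
            ]
        ),
        "load_mw": [100.0, 102.0, 102.0, np.nan, 110.0],
    }
)


def clean_load(df: pd.DataFrame) -> pd.Series:
    """Deduplicate, rebuild the full five-minute grid, and fill gaps."""
    s = (
        df.drop_duplicates(subset="timestamp")
        .set_index("timestamp")["load_mw"]
        .sort_index()
    )
    # Reindex onto a complete five-minute grid so absent timestamps become NaN.
    full_index = pd.date_range(s.index.min(), s.index.max(), freq="5min")
    s = s.reindex(full_index)
    # Fill gaps by linear interpolation in time (an assumed strategy).
    return s.interpolate(method="time")


clean = clean_load(raw)
print(clean)
```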

4. TYPE AND NATURE OF THE PROJECT

This project is mainly based on data analysis using different tools. First, relevant data
is collected from the PJM Interconnection website. This data is then preprocessed as
required. The processed data is then analysed using various AI/ML-based models such as
linear regression, regressor, and random forest, which are trained to forecast load. The
main language used in this project is Python, together with Python libraries for data
analysis such as NumPy, Matplotlib, and Pandas.
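
To illustrate how two of the model types named above might be trained and compared, here is a short scikit-learn sketch. The hourly data are synthetic, and the features (`hour_of_day`, `day_of_week`), the 50-day/10-day chronological split, and the forest size are assumptions for illustration only, not the project's actual configuration or results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(42)

# Synthetic hourly load with a daily cycle and a weekend dip (not PJM data).
n_hours = 24 * 60
hours = np.arange(n_hours)
hour_of_day = hours % 24
day_of_week = (hours // 24) % 7
load = (
    1000
    + 200 * np.sin(2 * np.pi * (hour_of_day - 9) / 24)
    - 100 * (day_of_week >= 5)
    + rng.normal(0, 20, size=n_hours)
)

X = np.column_stack([hour_of_day, day_of_week])

# Chronological split: first 50 days for training, last 10 days for testing.
split = 24 * 50
X_train, X_test = X[:split], X[split:]
y_train, y_test = load[:split], load[split:]

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = mean_absolute_error(y_test, model.predict(X_test))
    print(f"{name}: MAE = {scores[name]:.1f}")
```

On this synthetic series the linear model cannot follow the daily cycle from the raw hour value, whereas the random forest can, which is why tree-based models are often tried alongside linear baselines.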

5. METHODOLOGY

When the electricity market was established, it was unsettled, and market generation
and load could not be analysed or statistically characterised. To address this, we
use AI models to predict consumption and the generation required to meet it.
Predicting these variables with an AI model is computationally demanding, and the
forecasts will vary depending on the input variables and the algorithm.

1. Obtain the Generation and Load Forecast data from the PJM website using their
Data Miner.
3. Aggregate the generation and consumption separately.
4. Obtain the aggregated supply curve and demand curve from the data.
5. Determine the generation and consumption.

Once the variables are determined, an AI model analyses them using classification
and regression, and predictions are then made with an appropriate algorithm.
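
The aggregation steps above can be sketched with Pandas as follows. The long-format table, its column names (`timestamp`, `kind`, `source`, `mw`), and the sample values are hypothetical; the real Data Miner feeds may be structured differently.

```python
import pandas as pd

# Hypothetical long-format records: two generating units and two load zones
# reported at two five-minute timestamps.
records = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2023-01-01 00:00"] * 4 + ["2023-01-01 00:05"] * 4
        ),
        "kind": ["generation", "generation", "load", "load"] * 2,
        "source": ["unit_a", "unit_b", "zone_x", "zone_y"] * 2,
        "mw": [500.0, 300.0, 450.0, 340.0, 520.0, 310.0, 460.0, 365.0],
    }
)

# Aggregate generation and load separately for each timestamp.
totals = records.groupby(["timestamp", "kind"])["mw"].sum().unstack("kind")

# Difference between total generation and total load at each timestamp.
totals["imbalance_mw"] = totals["generation"] - totals["load"]
print(totals)
```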

6. HARDWARE AND SOFTWARE REQUIREMENTS

Software Requirement: Python programming language and additional Python libraries
such as NumPy, Pandas, Matplotlib, and scikit-learn.
Hardware Requirement: Sufficient computing power to support Python and machine
learning algorithms and to conduct simulations on large datasets.

7. RESULTS & ANALYSIS

This section should include:

• Result analysis

• Graphical/tabular form

• The explanation for the graphical/tabulated results

• Significance of the result obtained

• Any deviations from the expected results and their justification

8. SIGNIFICANCE OF THE RESULTS

Short-term load forecasting is extremely important for the control and operation of power
systems: it plays a crucial role in power system planning, including the creation of unit
commitment plans and the implementation of appropriate spinning reserve and maintenance
plans. Electric load forecasting has garnered an increasing amount of attention from
academic researchers and power systems engineers over the years due to its importance to
the efficient and cost-effective operation of power utilities.

Hence, to support the secure and cost-effective operation of the electric power system, an
accurate method of load forecasting is required as the primary input to all daily and
weekly operations scheduling.

9. IMPACT OF THE WORK ON SOCIETY / ENVIRONMENT / OTHER FACTORS

Numerous physical factors determine the accuracy of load forecasting, which in turn has a
significant impact on distribution investments, tariff formulation, and electricity pricing.
By identifying and forecasting the factors that influence load, it is possible to anticipate
future development measures and tariff enhancements. Alternatively, managing these
influencing factors can help establish an economically sustainable state for the electricity
distribution industry.

The load forecasting method presented in this paper could be used to inform distribution
network planning, since urban development requires social services to be provided
concurrently with the development of distribution networks.

10. INDIVIDUAL CONTRIBUTION

Objectives of Amartya
• Data Collection
• Data Exploration
• Exploring various ML techniques for forecasting
• Identifying the suitable ML techniques

Objectives of Ayush
• Data Pre-processing
• Data Segregation
• Implementing different ML techniques
• Identifying the suitable ML techniques
