Draft 1 Midterm
The energy industry is diverse, encompassing many different sectors, units, and processes,
which makes data collection and analysis complicated because the sources are disparate,
difficult to integrate, and unstable. Energy data analytics is an important operational tool
for power companies to manage load and ensure a low-cost, highly available, and reliable
energy supply. Energy grids require smarter and more flexible solutions, and this is where
data analytics comes in. First, we provide an overview of the present state of data analytics
research in the field of energy and show historical literature growth, leading countries in
the field, and the most intensive international market. We then describe and classify the
organization and functioning of the electricity market. We discuss the data sets used,
feature selection methods, benchmark methods, evaluation metrics, and model complexity.
Based on the findings from the different areas, we identify best practices that one area can
learn from the others.
We used Python as the programming language for data analytics in this project. We chose
Python because it has a vast collection of libraries for numerical computation and data
manipulation, as well as libraries for graphics and data visualization to build plots. The
Python libraries used here include NumPy, Pandas, and Matplotlib. NumPy stands for
‘Numerical Python.’ It supports n-dimensional arrays and provides numerical computing
tools. It is useful for linear algebra and Fourier transforms. NumPy, along with machine
learning modules like Scikit-learn, is widely used for data analysis, and hence we use it
here. Pandas provides functions to handle missing data, perform mathematical operations,
and analyze the data. Matplotlib is commonly used for data visualization, including the
creation of interactive visualizations. We will also be using supervised machine learning
(SML) algorithms. An SML algorithm learns from labelled training data, helps you to predict
outcomes for new data, and then successfully builds,
1. INTRODUCTION
Accurate load forecasting is critical in power system planning and operation, such as
developing unit commitment plans and establishing appropriate spinning reserves and
maintenance plans. Short-term, medium-term, and long-term forecasting are the three types
of load forecasting. Medium- and long-term load forecasting ranges from several weeks to
several years ahead and is primarily employed in long-term planning and seasonal demand
analysis. In contrast, short-term load forecasting
(STLF), which ranges from minutes to one week ahead, is a critical component of the
power grid's daily operation. Calibrated and validated short term load forecasts are
Energy data analytics uses statistical software, big data, and machine learning techniques
to analyse all aspects of energy usage within the utility industry. Users can forecast
demand, optimize costs, improve the supply chain, better understand customer
consumption patterns and prepare for future market behaviour. Energy data analytics
benefits both utility companies and their customers. Forecasting and predicting changes in
power flow, making long-term strategic decisions, and behavioural modelling all rely on
the availability of both good data and data analytics. Data analytics also provides better
insight into asset management and resource planning. It speeds up the heavily
data-intensive settlement process, accommodating increasing volumes of data from smart
meters and new business opportunities such as smart contracts and energy management
across organizational boundaries, and it will support high-frequency energy trading. Energy
data analytics allows for a more comprehensive and responsive overview of the utility
sector in response to real-time market fluctuations. In all cases, energy data analytics
helps businesses solve complex problems and make more efficient business decisions.
2. LITERATURE REVIEW
Think of the delivery of electricity from a power plant to your home as having three parts:
generation, transmission, and distribution. Generation takes place at the power plant;
distribution is how a utility routes that electricity to your home. Transmission, then, is
the middle portion of the process and involves sending electricity from power
PJM manages the market where power plants bid to provide electricity to utilities
within PJM territory. It monitors the transmission system to ensure that the right
must be in balance at every moment between what is being used (consumption) and
Load prediction is very important for the electrical energy industry in a
deregulated economy. It has many applications, which include buying and producing
In the past 30 years, classic load forecasting technologies have been extensively
employed. These projections attribute load demand to the country's economic activity
I. STLF: STLF time periods range from a few minutes or hours to one day or a
evaluation.
II. MTLF: The duration of MTLF is between one week and one year (possibly
two years). MTLF aims to balance demand and generation via maintenance
III. LTLF: The LTLF time period ranges from a few years to 10 to 20 years in the
future. LTLF aims to plan the expansion of the system, including generation,
With the advent of advanced computational technologies, the potential to process data
for the development of predictive models based on machine learning has multiplied.
However, emphasis has quickly shifted away from the model's interpretability due to
This paper advances the understanding of short-term load forecasting via generalised
The proposed model utilises a time series of an instantaneous load of every five
minutes for one month of PJM interconnection to forecast the irregularly changing
3. OBJECTIVES
pattern-based forecasting.
4. TYPE AND NATURE OF THE PROJECT
This project is mainly based on data analysis using different tools. First, relevant data
is collected from the PJM Interconnection website. This data is then preprocessed as
required. The processed data is then analysed using various AI/ML-based models such as
linear regression, regressor, and random forest. These models are trained to forecast
load. The main language used in this project is Python. This project also requires
various Python libraries for
5. METHODOLOGY
When the electricity market was first established, it was unsettled, and market
generation and load could not be analysed or statistically characterised. To address
this, we use AI models to predict consumption and, accordingly, generation.
Predicting these variables with an AI model is computationally intensive, and the
forecasts will vary depending on the input variables and the algorithm used.
1. Obtain the Generation and Load Forecast data from the PJM website using their
Data Miner.
3. Aggregate the generation and consumption separately.
4. Obtain the aggregated supply curve and demand curve from the data.
5. Determine the generation and consumption.
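The separate aggregation of generation and consumption in the steps above can be
sketched in pandas. This is only an illustrative sketch, not the project's actual
code: the column names (`timestamp`, `zone`, `generation_mw`, `load_mw`), the
helper `aggregate_by_time`, and all sample values are assumptions made up for the
example, and a real Data Miner export may be laid out differently.

```python
import pandas as pd

# Hypothetical sample records; column names and values are illustrative only.
records = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2023-01-01 00:00", "2023-01-01 00:00",
             "2023-01-01 01:00", "2023-01-01 01:00"]
        ),
        "zone": ["A", "B", "A", "B"],
        "generation_mw": [100.0, 150.0, 110.0, 140.0],
        "load_mw": [90.0, 155.0, 105.0, 150.0],
    }
)


def aggregate_by_time(df: pd.DataFrame, column: str) -> pd.Series:
    """Sum one quantity across all zones for each timestamp."""
    return df.groupby("timestamp")[column].sum()


# Aggregate generation and consumption separately.
total_generation = aggregate_by_time(records, "generation_mw")
total_load = aggregate_by_time(records, "load_mw")

# Combine the two aggregates and compute the generation surplus per interval.
summary = pd.DataFrame({"generation_mw": total_generation, "load_mw": total_load})
summary["surplus_mw"] = summary["generation_mw"] - summary["load_mw"]
print(summary)
```

Keeping the two aggregates as separate series before combining them mirrors the
ordering of the steps and makes it easy to inspect either quantity on its own.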
Once the variables are determined, we use an AI model to analyse them through
classification and regression, and predictions can then be made using an
appropriate algorithm.
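As a rough illustration of the regression step, the sketch below trains a linear
regression and a random forest with Scikit-learn and compares them on held-out
data. It rests on several assumptions that are not taken from the project: the
load series is synthetic (a daily sine cycle plus noise at five-minute
resolution), the inputs are simply the 12 previous load values, the helper
`make_lagged` is hypothetical, and MAPE is chosen here only as one common
evaluation metric.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_percentage_error

# Synthetic five-minute load with a daily cycle plus noise (illustrative only).
rng = np.random.default_rng(0)
steps_per_day = 288  # 24 hours * 12 five-minute intervals
t = np.arange(steps_per_day * 14)
load = 1000 + 200 * np.sin(2 * np.pi * t / steps_per_day) + rng.normal(0, 10, t.size)


def make_lagged(series: np.ndarray, n_lags: int):
    """Build (X, y) where each row of X holds the n_lags values preceding y."""
    X = np.column_stack(
        [series[i : len(series) - n_lags + i] for i in range(n_lags)]
    )
    y = series[n_lags:]
    return X, y


X, y = make_lagged(load, n_lags=12)

# Chronological split: train on the past, test on the most recent day.
X_train, X_test = X[:-steps_per_day], X[-steps_per_day:]
y_train, y_test = y[:-steps_per_day], y[-steps_per_day:]

models = {
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = mean_absolute_percentage_error(y_test, model.predict(X_test))
    print(f"{name}: MAPE = {scores[name]:.4f}")
```

The split is chronological rather than random so that the models are never
evaluated on intervals that precede their training data, which would otherwise
overstate forecast accuracy.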
6. HARDWARE AND SOFTWARE REQUIREMENTS
7. RESULTS & ANALYSIS
Result analysis
Graphical/tabular form
8. SIGNIFICANCE OF THE RESULTS
Short-term load forecasting is extremely important for the control and operation of power
systems: it plays a crucial role in power system planning, including the creation of unit
commitment plans and the implementation of appropriate spinning reserve and maintenance
plans. Electric load forecasting has garnered increasing attention from academic
researchers and power systems engineers over the years due to its importance to the
efficient and cost-effective operation of power utilities.
Hence, to support the secure and cost-effective operation of the electric power system, an
accurate method of load forecasting is required as the primary factor in all daily and
weekly operations scheduling.
9. IMPACT OF THE WORK ON SOCIETY/ENVIRONMENT/OTHER FACTORS
Numerous physical factors determine the accuracy of load forecasting, which has a
significant impact on distribution investments, tariff formulation, and electricity pricing.
By identifying and forecasting the factors that influence load, it is possible to anticipate
future development measures and tariff enhancements. Alternatively, an optimal state for the
economic sustainability of the electricity distribution industry can be established by
steering the influencing factors.
The load forecasting method presented in this paper could be used to inform distribution
network planning, because urban development culminates in the provision of social services
concurrently with the development of distribution networks.
10. INDIVIDUAL CONTRIBUTION
Objectives of Amartya
Data Collection
Data Exploration
Exploring various ML techniques for forecasting
Identifying the suitable ML techniques
Objectives of Ayush
Data Pre-processing
Data Segregation
Implementing different ML techniques
Identifying the suitable ML techniques