
University of Calgary

PRISM: University of Calgary's Digital Repository

Graduate Studies The Vault: Electronic Theses and Dissertations

2017

Developing Energy Forecasting Tools in Power Systems: Application to Microgrids

Chitsaz, Hamed

Chitsaz, H. (2017). Developing Energy Forecasting Tools in Power Systems: Application to Microgrids (Unpublished doctoral thesis). University of Calgary, Calgary, AB.
doi:10.11575/PRISM/25623
http://hdl.handle.net/11023/4259
doctoral thesis

University of Calgary graduate students retain copyright ownership and moral rights for their
thesis. You may use this material in any way that is permitted by the Copyright Act or through
licensing that has been assigned to the document. For uses that are not allowable under
copyright legislation or licensing, you are required to seek permission.
Downloaded from PRISM: https://prism.ucalgary.ca
UNIVERSITY OF CALGARY

Developing Energy Forecasting Tools in Power Systems: Application to Microgrids

by

Hamed Chitsaz

A THESIS

SUBMITTED TO THE FACULTY OF GRADUATE STUDIES

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE

DEGREE OF DOCTOR OF PHILOSOPHY

GRADUATE PROGRAM IN ELECTRICAL AND COMPUTER ENGINEERING

CALGARY, ALBERTA

NOVEMBER, 2017


© Hamed Chitsaz 2017
Abstract

In recent years, distributed energy resources and microgrids have attracted a great deal of interest

in the power industry. The development of microgrids has required engineers and operators to

enhance the efficiency and the energy management of such small-scale power systems. To do so,

energy forecasting plays a key role in their optimal operation. In particular, Short-term Wind Fore-

casting (STWF), Short-term Load Forecasting (STLF) and Short-term Price Forecasting (STPF)

are important tools for reliable operation scheduling of grid-connected microgrids with renewable

energy sources (e.g., wind). The generated energy forecasts are used in an optimization platform

to ensure the most economical operation for microgrids as the end goal of this thesis.

The main focus of this thesis is the development of forecasting models that are tailored for the

application to grid-connected microgrids. An STWF model is developed based on artificial intelli-

gence and an evolutionary algorithm to provide wind forecasts. This model can be applied to gen-

erate wind predictions at the power system, wind farm and/or wind turbine levels. By statistically

analyzing the behavior of electricity consumption at campus/building levels, an STLF methodol-

ogy is developed based on neural networks to provide satisfactory forecasts for volatile electricity

loads in microgrids. Further, an STPF is designed to improve the economics of grid-connected

microgrids by taking advantage of energy arbitrage opportunities with the grid. It is noted that the

microgrid is assumed to be large enough to have transactions based on market prices.

Numerical results in Chapters 2, 3, 4 and 5 of this thesis are provided based on the Alberta, Ontario,

British Columbia, California, Texas and New York power systems, and two campuses. The

simulations show the effectiveness of the proposed neural networks for STWF and STLF. The sta-

tistical and economic evaluations show the satisfactory performance of the developed STPF model

in scheduling a storage system in a microgrid. Moreover, deterministic and probabilistic optimiza-

tion platforms are developed for the optimal operation of microgrids, which can help the operator

apply the most effective approach under different scenarios of generation and market conditions.

Acknowledgments

First and foremost, I would like to express my sincere gratitude to my supervisor, Dr. Hamidreza

Zareipour, and my co-supervisor, Dr. David Wood, for their continuous support in my Ph.D. studies.

Their motivation and passion inspired me to pursue my research and achieve my goals not

only in academia but also in my personal life. I truly appreciate their fantastic mentorship and their

patience in this broad training process. The success and outcome of this thesis required a lot of

guidance and assistance from my supervisors, colleagues, friends and family, and I am extremely

privileged to have had this all along the way.

Also, I would like to thank the members of my committee: Dr. Andrew Knight, Dr. Svetlana

Yanushkevich, Dr. Geoffrey Messier, and Dr. Sherif Faried for their insightful comments.

A very special gratitude goes out to my friend and colleague Dr. Hamid Shaker for the great

collaboration during my PhD studies. I would also like to extend my gratitude to my colleagues

and friends Dr. Payam Zamani-Dehkordi, Dr. Soroush Shafiee, Dr. Ehsan Nasrolahpour, Mr.

Hamidreza Rafieenia, Mr. Saeed Masoumi, Mr. Shahab Esmaeilnejad, Mr. Babatunde Odetayo,

Mr. Juan Arteaga, Mr. Shubhrajit Bhattacharjee, and Dr. Mostafa Kazemi who encouraged,

supported, and assisted me in my research.

I am extremely thankful to Dr. Nima Amjady for encouraging me towards continuing my

education and supporting my PhD application by his strong recommendation. Also, I would like to

thank Mr. David Adair and Mr. Ben Thomas for their help in designing a website for visualization

of price forecasts, and Mr. Gregor Hähner and Mr. Shane Fast for their help in preparing different

databases used in some quantitative analyses along the way.

Last but not least, I owe my deep gratitude to my parents and to my brother and sister

for supporting me spiritually in continuing my higher education and constantly encouraging me

throughout my life.

Dedication

Dedicated to my beloved parents,

Ali-Akbar and Marzieh

Table of Contents

Abstract
Acknowledgments
Dedication
Table of Contents
List of Tables
List of Figures
List of Symbols and Abbreviations
1 Introduction
1.1 Research Motivation
1.2 Research Objectives and Scope
1.3 Research Contributions
1.4 Thesis Organization
2 Wind Power Forecast Using Wavelet Neural Network Trained by Improved Clonal Selection Algorithm
2.1 Introduction
2.2 Literature review
2.3 Methodology
2.3.1 The Developed Wavelet Neural Network
2.3.2 The Proposed Training Strategy
2.4 Numerical results
2.4.1 Numerical Results with 6-hour Updates
2.4.2 Numerical Results with Hourly Updates
2.5 Conclusion
3 Short-term Electricity Load Forecasting of Buildings in Microgrids
3.1 Introduction
3.2 Data analysis
3.3 The forecasting model
3.3.1 Self-Recurrent Wavelet Neural Network
3.3.2 The training algorithm
3.4 Numerical results
3.5 Conclusions
4 Electricity Price Forecasting for Operational Scheduling of Behind-the-meter Storage Systems
4.1 Introduction
4.2 Methodology
4.2.1 Operation of a BESS
4.2.2 Forecasting Strategy
4.3 Case Study
4.3.1 Ontario’s Electricity Market
4.3.2 Statistical Analysis
4.3.3 Economic Analysis
4.4 Conclusions
5 Impact of Uncertainty Modeling on Economic Performance of Microgrids
5.1 Nomenclature
5.2 Introduction
5.3 Forecasting Methodology
5.3.1 Deterministic Forecasting
5.3.1.1 Electricity Load Forecasting
5.3.1.2 Electricity Price Forecasting
5.3.1.3 Wind Power Forecasting
5.3.2 Probabilistic Forecasting
5.4 Optimization Platform
5.4.1 Microgrid Overview
5.4.2 Deterministic Approach
5.4.3 Probabilistic Approach
5.5 Numerical Results
5.5.1 Statistical Analysis
5.5.2 Economic Analysis
5.6 Conclusions
6 Conclusions
6.1 Concluding remarks
6.2 Future Work
Bibliography
A Wind power forecasting at wind farm levels
B Benchmark models
C Formulation of the training algorithm
D Mutual-Information feature selection
E Copyright permission letters

List of Tables

2.1 nRMSE (%) and nMAE (%) results for the four test weeks of year 2012
2.2 Comparison of the proposed method and a WNN with Mexican hat wavelet
2.3 Comparison of the proposed training strategy, i.e. ICSA, with SA, PSO, DE and CSA
2.4 Wind power prediction error results of the proposed method for 10 months

3.1 Comparison of electricity load time series in terms of volatility
3.2 Ramp events in electricity load time series
3.3 Forecasting errors, in %, of SRWNN, WNN and MLP for 10 test months
3.4 Forecasting errors of SRWNN and WNN for two power systems

4.1 Confusion matrices

5.1 Errors of point forecasts in terms of MAE
5.2 Errors of probabilistic forecasts

A.1 Wind power prediction errors

List of Figures and Illustrations

1.1 The platform for energy management of a grid-connected microgrid

2.1 Mexican hat and Morlet wavelet functions
2.2 Architecture of the proposed wind forecasting engine (WNN with Morlet wavelet)
2.3 Representation of step 4 (copy operator) of CSA
2.4 Forecast results of different models for two different days
2.5 Average nRMSE errors of two forecasting models in different look ahead forecasts

3.1 One-year hourly load data of BC’s power system and the building in BCIT
3.2 Distribution of 1-hour and 2-hour ramps
3.3 Architecture of the SRWNN
3.4 Mean absolute error (kW) of different hours of the day in different months
3.5 10-month mean absolute error for different hours of the day
3.6 Samples for bad (a) and good (b) forecasting days
3.7 Forecasting errors for different days of the week

4.1 Graphical representation of Algorithm 1
4.2 Graphical representation of Algorithm 2
4.3 Potential capability of MCPs in spike detection
4.4 Structure of the models
4.5 IRH framework
4.6 Forecasting errors in different months
4.7 Price values for the first week of August 2015
4.8 Total money saved for different number of MCPs
4.9 Monthly total money saved by operating the BESS
4.10 BESS schedules based on different strategies for October 2nd, 2015
4.11 Economic effect of number of cycles for the BESS

5.1 Costs for Scenario #1: low price and low wind
5.2 Costs for Scenario #2: low price and high wind
5.3 Costs for Scenario #3: high price and low wind

List of Symbols and Abbreviations

Symbol Definition
AESO Alberta Electric System Operator

AI Artificial Intelligence

ANFIS Adaptive Neuro-fuzzy Inference Systems

ANN Artificial Neural Network


AR Auto-regressive

ARIMA Auto-regressive Integrated Moving Average

ARX Auto-regressive with exogenous variables

BA Balancing Authority

BC Province of British Columbia


BCIT British Columbia Institute of Technology

BESS Battery Energy Storage Systems

BTM Behind-the-meter
CI Computational Intelligence

CHP Combined Heat and Power


CL Confidence Level
CSA Clonal Selection Algorithm

DA Day-ahead

DE Differential Evolution
DER Distributed Energy Resource

DG Diesel Generator
EMS Energy Management System

ERCOT Electric Reliability Council of Texas

ESS Energy Storage System

FFNN Feed-Forward Neural Network
GE General Electric
HE Hour Ending

HOEP Hourly Ontario Energy Price

HPF Hourly Price Forecast

ICSA Improved Clonal Selection Algorithm

IESO Independent Electricity System Operator

IRH Intra-hour Rolling Horizon

LM Levenberg-Marquardt learning algorithm

MAE Mean Absolute Error


MCC Maximum Correntropy Criterion

MCP Market Clearing Price

MCS Monte Carlo Simulation


MI Mutual Information
MILP Mixed-Integer Linear Programming

MLP Multi-Layer Perceptron

MPIW Mean Prediction Interval Width


MSE Mean Squared Error

nMAE Normalized Mean Absolute Error


nRMSE Normalized Root Mean Squared Error

NSERC Natural Sciences and Engineering Research Council of Canada

NYISO New York Independent System Operator

NYSERDA New York State Energy Research and Development Authority

NWP Numerical Weather Prediction


PDP Pre-Dispatch Price

PI Prediction Interval

PICP Prediction Interval Coverage Probability

PP Pool Price
PSO Particle Swarm Optimization

QRA Quantile Regression Averaging

RBF Radial Basis Function


RES Renewable Energy Source

RH Rolling Horizon

RNN Recurrent Neural Network


RO Robust Optimization

SA Simulated Annealing

SARIMA Seasonal Auto-regressive Integrated Moving Average

SMP System Marginal Price

SPA Spike Prediction Accuracy

SRWNN Self-Recurrent Wavelet Neural Network


STLF Short-term Load Forecast
STPF Short-term Price Forecast
STWF Short-term Wind Forecast
SVR Support Vector Regression

SVM Support Vector Machine

WECC Western Electricity Coordinating Council

WNN Wavelet Neural Network

Chapter 1

Introduction

1.1 Research Motivation

The idea of microgrids was first introduced in the technical literature by Lasseter [1] as a solution

for the reliable integration of Distributed Energy Resources (DERs), including Energy Storage

Systems (ESSs) and controllable loads. Microgrids are integrated energy systems operating as

an autonomous grid, which can be either in parallel to or islanded from the existing power grid.

Despite this general definition of microgrids, the literature remains ambiguous about what is and is not

a microgrid. This vagueness mainly comes from the size and type of energy

resources within such small-scale power systems. For instance, inclusion of a renewable power

generation resource, Combined Heat and Power (CHP), or some form of energy storage as well as

network controls for optimization of generation and loads is one of the criteria for a remote electric

system to be called a microgrid [2].

In a recent report (2nd Quarter of 2017), Navigant Research states that there are currently

more than 1,840 known microgrid projects across 135 countries with a total capacity of 19,279.4

MW [3]. In this report, seven major market segments with their shares are defined: remote (45%),

commercial/industrial (16%), utility distribution (15%), community (10%), institutional/campus

(9%), military (5%), and direct current (less than 1%) microgrids [2]. Although remote projects

constitute the majority of microgrids with a total of 8,708.1 MW, the commercial/industrial sector

is anticipated to become the fastest-growing microgrid sector, representing more than 35% of the

total microgrid market by 2026 [4].

With the fast and worldwide development of microgrids, energy management of such electric

systems becomes important. The optimal utilization of available resources within the microgrid is

essential for operating the system with the lowest possible cost, which requires advanced tools and

energy management techniques [5]. In practice, operating a grid-connected microgrid in the most

economic way can be a challenging task due to a number of reasons. The integration of renewable

energy in microgrids causes an additional complexity in their operation because of the uncertainties

attached to such intermittent energy resources. Another challenge in the operation of microgrids

is the non-smooth behavior of the electricity load at building and/or microgrid levels. Unlike the

smooth variations of electricity loads in power systems, microgrid loads can have severe variations

that negatively impact their predictability [6]. In addition, grid-connected microgrids capable of

trading energy with the main grid are subject to the risks of fluctuations in electricity market prices,

which can affect the economics of the microgrid. Hence, accurate short-term forecasting tools are

required for economic energy management in microgrids [7].

Forecasts are important because they are the main inputs to the optimization platform for the

operation scheduling of resources in the microgrid. In a renewable-powered system, e.g., with

wind energy, the only approach to deal with the generation intermittency in operation time scale

is to forecast the energy over an extended period of time. Wind power forecasts are needed for

operators to control balancing, operation and safety of the system [8]. Additionally, electricity

load forecasting is an indispensable task for the operation of a microgrid, as many operating decisions,

such as dispatch scheduling of generating capacity and demand side management, are based

on load forecasts [9]. Forecasted loads and the forecasted generation

of renewable resources are reported to be the main inputs for optimal energy management [7] and generation

scheduling [10] in microgrids. However, electricity price forecasts also become essential for the

operation of grid-connected microgrids. Accurate price forecasts help the operator with effective

strategies for the energy arbitrage with the main grid. Microgrids tend to rely only on local energy

resources when the electricity price is high, whereas it is economical to purchase energy from the

grid when electricity prices are lower than the cost of generating power locally.
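
The purchase-or-generate rule described above can be illustrated with a minimal sketch. The function name, the flat local generation cost and the sample prices are assumptions made for illustration only; they are not taken from the optimization platform developed in this thesis.

```python
def choose_energy_source(price_forecast, local_cost):
    """Return 'grid' or 'local' for each hour of a price forecast.

    price_forecast: forecast grid prices in $/MWh (hypothetical values).
    local_cost: assumed flat marginal cost of local generation in $/MWh.
    """
    # Buy from the grid only when the forecast price is below the local cost.
    return ["grid" if price < local_cost else "local" for price in price_forecast]


# Hypothetical four-hour price forecast and local generation cost.
schedule = choose_energy_source([20.0, 35.0, 120.0, 60.0], local_cost=50.0)
print(schedule)
```

A real scheduler would also account for storage, unit constraints and forecast errors, which is why a full optimization platform is needed.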

Therefore, this thesis provides methodologies for short-term energy forecasting, with the optimal

operation of microgrids as the end goal. In particular, wind power generation,

electricity loads and electricity market prices are the main focus of this thesis. Energy

forecasts are applied to an optimization platform to provide operational schedules of the resources

in the microgrid. Furthermore, this thesis aims to investigate the most suitable optimization plat-

form to use energy predictions under different scenarios of forecast uncertainties.

1.2 Research Objectives and Scope

The main objective of this thesis is to provide efficient short-term forecasting models required for

the optimal operation of microgrids. In particular, this thesis has four specific objectives. The first

is to develop a short-term wind power forecasting model. The second objective is to implement

a short-term forecasting model for volatile electricity loads in microgrids. The third objective is

to develop a short-term prediction model for electricity market prices required for the economical

operation of a grid-connected microgrid. The last objective of this thesis is to investigate the most

effective optimization approach to mitigate the impact of uncertainties related to forecasts.

[Figure 1.1: block diagram. A database (wind generation and weather data; electricity load, temperature and calendar data; market price, electricity load and supply cushion data) feeds the wind power, electricity load and electricity price forecasters of the energy management system. The forecasts, together with information on the microgrid components (generating units, storage system, grid connection, microgrid load), feed the microgrid operation optimizer, which outputs the microgrid operation schedules.]

Figure 1.1: The platform for energy management of a grid-connected microgrid

Figure 1.1 illustrates a general platform for optimal operation of a grid-connected microgrid.

The energy management system contains forecasting models as well as the optimization frame-

work. The forecasting models are fed by required input data located in a database. Forecasts of

wind power generation, electricity loads and electricity prices along with the operational infor-

mation of all microgrid components are sent to the optimization framework for optimal operation

scheduling of the microgrid. The final outputs are the schedules for dispatchable units within the

microgrid and energy trade with the main grid.

In this manuscript-based thesis, I use publicly available data of different electricity markets,

such as British Columbia, Alberta, Ontario in Canada, as well as electricity markets in New York

and Texas in the U.S. In addition, I use the electricity demand data of a building in British Columbia

Institute of Technology (BCIT) campus and a campus building at the University of Calgary for the

electricity load forecasting purpose. Although the developed forecasting methodologies have been

tailored to the characteristics of the time series of interest, they can be used for other markets or

microgrids with minor adjustments.

In Chapter 2, I focus on building a short-term forecasting model for wind power generation.

According to the literature on wind power forecasting methodologies, nonlinear models such as

artificial neural networks have shown better prediction performance than linear models, e.g., linear

regression models, as wind power is generally a non-linear function of its input features. Therefore,

I base the structure of the model on a state-of-the-art artificial neural network. In addition to the

forecasting engine, the training algorithm plays a key role in the performance of forecasting tools.

Thus, to enable the model to capture high fluctuations of wind power time series, I equip the

forecasting engine with an efficient training mechanism. To do so, I develop an evolutionary

algorithm that optimizes the free parameters of the model in the training phase. This helps achieve

the first objective of this thesis, which is the development of a wind power prediction tool with

high forecasting accuracy. The wind power forecasts are essential in the optimal operation of

microgrids with wind turbines. An improvement in wind power forecasts means the real-time

operation of dispatchable units in a microgrid is closer to their schedules provided in advance.

Therefore, unexpected actual wind generation would not result in operational risks for other units,

load curtailments (in stand-alone mode of operation) or financial risks.

In Chapter 3, the focus is on the development of a short-term forecasting model for electricity

loads in microgrids. First, I perform an analysis to highlight the main challenge for load forecasting

in microgrids. To do so, the characteristics of the historical load of a microgrid are compared with

those of electricity loads in two power systems. In other words, I compare the volatility of the

microgrid load with the volatility of the electricity load in two power systems using two measures.

In addition, I perform another experiment to analyze the ramp events in the electricity loads of

the microgrid versus a power system. Knowing the characteristics of the microgrid loads, I then

propose a forecasting methodology based on a state-of-the-art neural network with high capability

of capturing severe variations in non-smooth load time series. Therefore, the objective of this

chapter is to propose a load forecasting model that can provide accurate forecasts for the optimal

operation of microgrids. An improvement in load forecast accuracy can be translated into less

power mismatch in real-time operation of the microgrid. Overestimating the electricity load brings

more controllable units online, which results in excess power generated, whereas underestimating

the load results in supply shortfall that could result in load curtailments in stand-alone operation or

financial risks in grid-connected operation.
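
As a rough illustration of the kind of analysis described above, the sketch below computes one possible volatility measure (the standard deviation of hour-to-hour relative changes) and counts ramp events above a threshold. Both definitions, the threshold and the sample values are assumptions for illustration; the measures actually used are defined in Chapter 3.

```python
import numpy as np


def relative_change_volatility(load):
    """Standard deviation of hour-to-hour relative changes (assumed measure)."""
    load = np.asarray(load, dtype=float)
    changes = np.diff(load) / load[:-1]
    return float(np.std(changes))


def count_ramps(load, threshold, span=1):
    """Count changes over `span` hours that exceed `threshold` times the peak load."""
    load = np.asarray(load, dtype=float)
    ramps = np.abs(load[span:] - load[:-span]) / load.max()
    return int(np.sum(ramps > threshold))


# Hypothetical hourly loads: a smooth system-level profile and a spiky building profile.
system_load = [100, 102, 104, 103, 101, 100]
building_load = [100, 130, 90, 140, 95, 100]

print(relative_change_volatility(system_load), relative_change_volatility(building_load))
print(count_ramps(system_load, threshold=0.1), count_ramps(building_load, threshold=0.1))
```

With these made-up profiles, the building load shows both the higher volatility and the larger number of ramp events, which is the qualitative pattern the analysis looks for.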

Chapter 4 presents a methodology for short-term electricity price forecasting that is specifically

tailored to enhance the operation of microgrids. When it comes to price forecasting, it is of high

importance to take into account the main application for the prediction tool. In this chapter, the

application is the operation of a behind-the-meter battery energy storage system within a microgrid.

Hence, I first formulate the operation scheduling of the storage system using common sets of

forecasts, i.e., day-ahead forecasts and rolling horizon forecasts. Then, I propose a new operation

framework for the storage system using intra-hour rolling horizon that can potentially enhance the

economics of the microgrid. To achieve this, I perform an analysis to evaluate the potential of high-

resolution market data, i.e., market clearing prices, in capturing the price spikes using the publicly

available data of four North American electricity markets. Accordingly, I build a price forecasting

tool that includes high-resolution and low-resolution market data with high capability of capturing

price spikes. Thus, the objective in this chapter is to construct such a price forecasting model that

can enhance the economics of the grid-connected microgrid by an effective energy arbitrage with

the main grid.

In Chapter 5, I focus on the optimal operation of a microgrid using the generated forecasts.

Since the forecasts are not perfect, the uncertainty related to them can negatively affect the

operation. There are two main approaches to mitigate the forecast uncertainties: the rolling horizon

technique and prediction intervals. In the rolling horizon technique, point forecasts are updated

every hour to provide more accurate prediction for the remaining hours of the operating day. Al-

ternatively, prediction intervals provide a range in which the actual values are expected to fall with

a certain confidence level. Using the prediction intervals, robust optimization, which considers the

worst-case scenario, can be used in the operation of the microgrid. In this chapter, I first provide

simple methodologies for electricity load, electricity price and wind power forecasting for

both point and interval forecasts. Then, I formulate optimization platforms for both deterministic

and probabilistic approaches. Afterwards, each platform is evaluated by three different scenarios

of wind power generation and electricity price. The objective of this chapter is to investigate the

impact of different optimization approaches on the economic performance of the microgrid under

different scenarios of forecast uncertainties.
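
Prediction intervals are commonly assessed with the two indices listed in the abbreviations, PICP and MPIW. The sketch below uses their usual textbook definitions (the share of actual values covered by the interval, and the average interval width); the exact formulation used in Chapter 5, for example any normalization, may differ, and the sample values are hypothetical.

```python
import numpy as np


def picp(actual, lower, upper):
    """Prediction Interval Coverage Probability: share of actual values inside the interval."""
    actual, lower, upper = (np.asarray(a, dtype=float) for a in (actual, lower, upper))
    return float(np.mean((actual >= lower) & (actual <= upper)))


def mpiw(lower, upper):
    """Mean Prediction Interval Width (not normalized)."""
    return float(np.mean(np.asarray(upper, dtype=float) - np.asarray(lower, dtype=float)))


# Hypothetical wind power values (MW) with lower and upper interval bounds.
actual = [10.0, 12.0, 15.0, 9.0]
lower = [8.0, 11.0, 16.0, 7.0]
upper = [12.0, 14.0, 20.0, 10.0]

print(picp(actual, lower, upper), mpiw(lower, upper))
```

A useful interval forecast has a PICP close to the nominal confidence level while keeping the MPIW as small as possible.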

1.3 Research Contributions

To achieve the objectives stated above, several contributions are made to the existing literature. A

summary of the main contributions is provided in the following paragraphs. These contributions

are elaborated in Chapters 2, 3, 4, and 5 of this thesis.

The first contribution of this thesis is to build a short-term forecasting model for wind power

generation. Having access to wind power generation in the province of Alberta, Canada, the pro-

posed methodology is constructed based on aggregated wind power from all wind farms in the

province. This model can be used for providing point forecasts of wind power for power systems,

wind farms or small wind turbines with minor adjustments if needed (please see Appendix A for

further information). For the microgrid application, the operator can generate wind power forecasts

and then apply them into an optimization platform to determine the operational schedules of the

dispatchable units within the microgrid. The forecasting engine is built on a Wavelet Neural Net-

work (WNN) which is an efficient artificial neural network model. Due to the local properties of

wavelets and the ability of adapting the wavelet shape according to the training data set instead of

adapting the parameters of the fixed shape activation function, WNNs offer higher generalization

capability compared to the classical feed forward neural networks [11]. This makes them suitable

methods for predicting volatile time series. To enhance the performance of this forecasting en-

gine, the activation functions of the hidden neurons are constructed based on multi-dimensional

Morlet wavelets. To train the forecasting engine, I propose a new stochastic evolutionary algo-

rithm, named Improved Clonal Selection Algorithm (ICSA), which optimizes the free parameters

of the WNN for wind power prediction. The obtained numerical results confirm the validity of the

developed prediction model.

As the second contribution of this thesis, I develop a Short-term Load Forecasting (STLF)

model for microgrids, which is ultimately used for enhancing the energy management of available

resources in microgrids. First, I perform a data analysis to compare the characteristics of a power

system load versus an electricity load in a microgrid. The case study is British Columbia’s power

system and British Columbia Institute of Technology (BCIT) campus microgrid. As the analy-

sis suggests, the microgrid load has highly volatile behavior and consequently, this non-smooth

characteristic decreases the predictability of such a time series. Therefore, in this thesis, an STLF

prediction model is proposed based on Self-Recurrent Wavelet Neural Network (SRWNN) as the

forecasting engine. In addition, the Levenberg-Marquardt (LM) learning algorithm is implemented

and adapted to train the SRWNN. The numerical results show the effectiveness of this forecasting

model to deal with severe variations of the electricity load in microgrids.

The third contribution of this thesis is to develop a Short-term Price Forecasting (STPF) model

for the operation of a grid-connected microgrid. As a common mode of operation for microgrids,

they can trade energy with the main grid. Therefore, there is a potential for an energy arbitrage.

As mentioned, electricity price forecasting is very dependent on the application. Since the oper-

ation of the microgrid is of interest, I develop a methodology specifically tailored to benefit the

economics of microgrids by optimally operating a behind-the-meter battery energy storage sys-

tem. The forecasting model takes advantage of high-resolution market data, e.g., market clearing

prices, that carries the most recent information about the condition of the electricity market. I first

conduct an analysis to highlight the significance of the candidate high-resolution market data, and

its potential capability to capture sharp variations in market prices. To develop this model, I focus

on the Ontario’s electricity market as the case study. It should be noted that different jurisdictions

have different electricity markets. However, the proposed model can be adapted to other electricity

markets as all have similar high-resolution data. Moreover, an effective algorithm, named Intra-

day Rolling Horizon (IRH), is designed to embed the generated price forecasts in an optimization

platform. This optimization algorithm schedules the operation of the storage system in such a way

that the microgrid can make a profit by trading energy with the main grid.

The fourth and the last contribution of this thesis is to investigate the impact of different uncer-

tainty mitigation approaches on the economic performance of microgrids under different scenarios.

Two main strategies have been introduced in the literature to deal with the uncertainties of fore-

casts, i.e., rolling horizon technique and probabilistic forecasting. The rolling horizon technique

updates the point forecasts on every forecasting step, while the probabilistic forecasting provides

prediction intervals with a certain confidence level as opposed to point forecasts. First, I imple-

ment models to generate point and interval forecasts for wind power generation, electricity loads

and electricity prices. I use the wind power generation data of Taber wind farm in Alberta, the

electricity demand data of a building in University of Calgary and the electricity price of the Al-

berta’s electricity market in this study. Then, I develop deterministic and probabilistic optimization

platforms to feed the forecasts to the operation of the microgrid. Two different scenarios of high

wind and high electricity price are considered. As the main contribution of this chapter, I inves-

tigate the impact of these different approaches on the economic performance of microgrids. This

helps the operator to apply the most effective approach that leads to higher economic benefits of

the microgrid.

1.4 Thesis Organization

In this manuscript-based thesis, Chapters 2, 3, and 4 are works that have been published as journal

papers. Chapter 5 is a journal paper submitted for publication. The articles have been modified for

coherency of the thesis. However, the contents are the same as the papers. The rest of this thesis is

organized as follows:

• Chapter 2 is titled “Wind Power Forecast Using Wavelet Neural Network Trained by

Improved Clonal Selection Algorithm”. This chapter was published in the journal

Energy Conversion and Management [12]. Dr. Amjady is a co-author of this paper.

He provided invaluable comments on the evolutionary optimization algorithm. I

implemented the models, performed the simulations and analyzed the numerical

results along with the paper write-up.

• Chapter 3 is titled “Short-term Electricity Load Forecasting of Buildings in Micro-

grids”. This chapter was published in the journal Energy and Buildings [13]. Dr.

Hamid Shaker is a co-author of this paper. He assisted me in programming the training

algorithm in the forecasting model in MATLAB® software. I implemented the

models, performed the simulations and prepared the analyses along with the paper

write-up.

• Chapter 4 is titled “Electricity Price Forecasting for Operational Scheduling of

Behind-the-meter Storage Systems”. This chapter was published in the journal

IEEE Transactions on Smart Grid [14]. Dr. Payam Zamani-Dehkordi is a co-

author of this paper. He assisted me in modeling the optimization platform of the

battery energy storage system in MATLAB® software. Dr. Palak Parikh is also

a co-author of this paper who provided valuable feedback on this work. I imple-

mented the models, performed the simulations and analyzed the numerical results

along with the paper write-up.

• Chapter 5 is titled “Impact of Uncertainty Modeling on Economic Performance of

Microgrids”. This chapter has been submitted as a manuscript to the journal IEEE

Transactions on Smart Grid. Dr. Soroush Shafiee is a co-author of this paper. He

assisted me in modeling the probabilistic optimization algorithm. All the model

implementation, simulations and data analyses as well as manuscript write-up have

been done by myself.

• The conclusions of this thesis are provided in Chapter 6.

• A list of references is provided in the Bibliography as well.

• Appendices A, B, C and D are also included in the thesis to provide further infor-

mation on the performance of the proposed wind power forecasting methodology at

wind farm levels and benchmark models in Chapter 2, the formulation of the train-

ing algorithm in Chapter 3, and the mutual-information feature selection technique

in Chapter 4.

Chapter 2

Wind Power Forecast Using Wavelet Neural Network Trained

by Improved Clonal Selection Algorithm 1

2.1 Introduction

In recent years, wind power has been the fastest growing renewable electricity generation tech-

nology in the world [15, 16]. The worldwide wind capacity reached approximately 300 GW by

the end of June 2013, out of which 13.9 GW were added in the first six months of 2013 [17]. In

particular, Canada installed 377 MW during the first half of 2013, which is 50 % more than in the

previous period of 2012 [17]. In the Province of Alberta, Canada, in particular, the installed capac-

ity reached 1087 MW in late 2012, and is expected to grow to 2388 MW by 2016 [18]. Despite the

environmental benefits of wind power [19], it has an intermittent nature [20], which could affect

power systems security [21] and reliability [22].

One approach to deal with wind power intermittency in operation time scale is to forecast it

over an extended period of time. Accurate wind power forecasting can improve the economical

and technical integration of large capacities of wind energy into the existing electricity grid [23].

Wind forecasts are important for system operators to control balancing, operation and safety of the

grid [8]. On the other hand, wind power forecast errors might sometimes require system operators

to re-dispatch the system in real time. The costs of re-dispatch affect electricity prices and system

performance [24]. Moreover, reserve requirements are connected to wind forecast uncertainty

[25]. Hence, reducing the costs of re-dispatch and contribution of spinning reserves by more

accurate wind power prediction can effectively increase system operation efficiency. For instance,
1 © 2015 Elsevier Ltd. Reprinted, with permission, from [12]: H. Chitsaz, N. Amjady, and H.
Zareipour, “Wind power forecast using wavelet neural network trained by improved Clonal selection
algorithm”, Energy Conversion and Management, vol. 89, pp. 588-598, January 2015.

the economic benefits of accurate wind forecasting were assessed by GE Energy for the New York

State Energy Research and Development Authority (NYSERDA) and the NYISO - all terms are

defined in the list of symbols in this thesis. In that study, it was estimated that $125 Million, or

36%, of the cost reduction is associated with state-of-the-art wind power forecasting. It was about

80% of the estimated cost reduction that could be achieved with a perfect wind power production

forecast [26].

Hence, various approaches have been proposed to improve wind power forecasting accuracy in

the literature. In one group of models, wind speed and other climate variables are predicted using

Numerical Weather Prediction (NWP) models, and those forecasts are used to predict the wind

power output of a wind turbine or a wind farm [27, 28] using turbine or farm production curves. In

another group of models, the NWP forecasts, or self-generated climate variables forecasts, are fed

into secondary time series models to predict wind power output for a turbine, a farm or system level

wind power production. The time series models may be built based on ensemble forecasting [29],

statistical approaches [30, 31], or artificial intelligence techniques [32, 33]. In a third group of

models, only past power production values are used in univariate models to predict future wind

power values [34, 35]. Despite improvements in wind power forecasting methods, wind power

forecasts still suffer from relatively high errors, ranging from 8% to 22% (in terms of normalized

mean squared error) depending on several factors, such as, forecasting horizon, type of forecasting

model, size of wind farm and geographic location [36].

The contribution of the present paper is to propose a new forecasting technique for short-term

wind power forecasting. In particular, we develop a wind power forecasting engine based on

Wavelet Neural Network (WNN) with multi-dimensional Morlet wavelets as the activation func-

tions of the hidden neurons and maximum correntropy criterion as the error measure of the training

phase. We propose a stochastic search technique, which is an improved version of Clonal search

algorithm, for training the forecasting engine. The significance of the proposed forecasting tech-

nique is that the combination of the WNN and the proposed training strategy is capable of capturing

highly non-linear patterns in the data and result in improved forecast accuracy. Particularly, high

exploitation capability of the proposed training strategy enables it to find more optimal solutions

for the optimization problem of WNN training.

The remaining sections of the paper are organized as follows. A brief literature review on wind

power forecasting models is provided in Section 2.2. The architecture of the wind forecasting

model is introduced in subsection 2.3.1. The proposed training strategy is then presented in sub-

section 2.3.2. The results of the proposed wind forecasting method, obtained for the real-world test

cases, are compared with the results of several other prediction approaches in section 2.4. Section

2.5 concludes the paper.

2.2 Literature review

In this section, a literature review of the existing wind power forecasting models is provided. As

mentioned in section 2.1, the forecasting methods based on time series, either statistical models

or artificial intelligence models, use historical wind power data recorded at the wind farms along

with the historical data of the exogenous meteorological variables such as wind speed, tempera-

ture and humidity, providing that the data is available. Auto-Regressive Moving Average (ARMA)

models [37], Auto-Regressive Integrated Moving Average (ARIMA) model [38], and Fractional

ARIMA (FARIMA) model [31], have already been applied to wind speed and wind power predic-

tion. Although time series models are simple forecasting methods and can be easily implemented,

most of them are linear predictors, while wind power is generally a non-linear function of its input

features.

Artificial intelligence techniques, especially artificial neural networks, have been used in sev-

eral papers to predict wind power generation [39]. Recurrent neural network [40], Radial Basis

Function (RBF) neural network [41], and Multi-Layer Perceptron (MLP) neural networks [35],

have been proposed for wind power forecasting. Although neural networks can model nonlinear

input/output mapping functions, a single neural network with traditional training mechanisms has

limited learning capability and may not be able to correctly learn the complex behavior of wind

signal. To remedy this problem, combinations of neural networks with each other and with fuzzy

inference systems such as Adaptive Neuro-Fuzzy Inference System (ANFIS) [42, 43], and Hy-

brid Iterative Forecast Method (combining MLP neural networks) [44], have also been suggested

for wind speed and power prediction. However, such models and especially fuzzy logic models,

involve high complexity and a long processing time in the case of many rules [45].

Another approach to tackle the complex behavior of wind power time series is using wavelet

transform. In [46], it has been discussed that wavelets can effectively be used for both stationary

and non-stationary time series analysis 2 , and that is one of the reasons for the wide and diverse ap-

plications of wavelets. Wind speed and power prediction approaches based on wavelet transform,

as a preprocessor to decompose wind speed/power time series, and ANFIS [47], Auto Regres-

sive Moving Average (ARMA) [48], Artificial Neural Network (ANN) [49], and Support Vector

Regression (SVR) [50], as forecast engines, have been presented. As for SVM-based models,

they depend heavily on appropriate tuning of parameters and involve a complex optimization

process [45]. Wavelets can also be applied in a more efficient structure called wavelet neural network,

in which wavelet functions are used as the activation functions of the neurons in neural networks.

In [51], wavelet has been used in the form of WNN for wind speed prediction and it is trained

by extended Kalman filter. Since such a model consists of many scaled and shifted wavelets of

the utilized mother wavelet, it requires a powerful training algorithm to efficiently train the model

and not to be trapped in local optima while finding the best input/output mapping function of the

model.

In [52], a wind power prediction strategy including a Modified Hybrid Neural Network and

Enhanced Particle Swarm Optimization (EPSO) has been proposed. In this paper, a developed
2 This footnote was added in response to a question raised by a committee member in the PhD oral examination.
A time series is stationary if its mean, variance and autocorrelation do not change over time, i.e., constant mean and
variance over time. Non-stationary data is often transformed to become stationary. This is to obtain meaningful sample
statistics such as means, variances, and correlations with other variables. Such statistics are useful as descriptors of
future behavior only if the series is stationary. For example, if the series is consistently increasing over time, the
sample mean and variance will grow with the size of the sample, and they will always underestimate the mean and
variance in future periods.

evolutionary algorithm, i.e., EPSO, is presented to empower the training phase of the utilized neural

network, which is generally a combination of three simple MLPs. Taking into consideration the

advantages of wavelet transform in the form of WNN and evolutionary algorithms as the training

algorithms, we propose a wind power prediction model, which is elaborated in the next section.

2.3 Methodology

In this section, we provide the details of the proposed forecasting engine and its training strategy.

Briefly, the proposed forecasting technique is composed of a WNN structure with Morlet wavelet

functions as activation functions in the hidden layer and a new training strategy. These components

are described next.

2.3.1 The Developed Wavelet Neural Network

Wavelet transform has been used in some recent research works for wind forecasting, as a pre-

processor to decompose wind speed/power time series to a set of sub-series [47–50]. The future

values of the sub-series are predicted by ANFIS [47], SVR [50], ARMA [48] and ANN [49] and

then combined by the inverse WT to form the forecast value of wind power/speed. Another ap-

proach to utilize wavelet in a forecast process is through constructing wavelet neural network in

which a wavelet function is used as the activation function of the hidden neurons of an ANN. For

instance, WNNs with Mexican hat and Morlet wavelets, shown in Fig. 2.1, as the activation function

of the hidden neurons have been applied for another application, i.e. price forecast of electric-

ity markets, in [11, 49], respectively. Due to the local properties of wavelets and the ability of

adapting the wavelet shape according to the training data set instead of adapting the parameters of

the fixed shape activation function, WNNs offer higher generalization capability compared to the

classical feed forward ANNs [11]. Recently, a WNN using Mexican hat mother wavelet function

is proposed for wind speed forecast [51]. However, Morlet wavelet has vanishing mean oscillatory

behavior with more diverse oscillations with respect to Mexican hat wavelet, which can be seen

Figure 2.1: Mexican hat and Morlet wavelet functions (two panels, "Mexican hat function" and "Morlet function", each plotting ψ(X) against X for X from -5 to 5)

from Fig. 2.1, and so it can better localize high frequency components in frequency domain and

various changes in time domain of severely non-smooth time series, e.g. wind power. In [53], it

is mentioned that Morlet leads to a better electricity price forecast compared with Mexican hat.

In this paper, we propose a WNN with multi-dimensional Morlet wavelet as the activation func-

tion of the hidden neurons for wind power forecasting. In the proposed method, we implement a

new training algorithm which can efficiently search for the global optimum solution, while a sim-

ple gradient method is used as the training algorithm of the WNN in [53]. In addition, since the

performance of these two activation functions has not been illustrated in [53], we compare these

functions in numerical results in order to demonstrate the effectiveness of Morlet wavelet over

Mexican hat in wind power forecast.

Architecture of the WNN is shown in Fig. 2.2, which is a three-layer feed-forward structure. In

Fig. 2.2, X = [x1 , x2 , ..., xm ] is the input vector of the forecast process and y is the target variable.

The inputs x1 , x2 , ..., xm of the forecasting engine can be from the past values of the target variable

and past and forecast values of the related exogenous variables. For instance, past values of wind

power along with the past and forecast values of wind speed, wind direction, temperature and

humidity can be considered for wind power prediction, provided that their data is available [33].

A feature selection technique can be used to refine these candidate features and select the most

effective inputs for the forecast process. In this research work, we use the feature selection method

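Figure 2.2: Architecture of the proposed wind forecasting engine (WNN with Morlet wavelet). The diagram shows an input layer of identity nodes I for x1, ..., xm, a hidden layer of wavelet nodes F1, ..., Fn, and a summing output node y fed by the hidden weights w1, ..., wn and the direct input weights v1, ..., vm.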

of [52]. This method is based on the information theoretic criterion of mutual information and

selects the most informative inputs for the forecast process by filtering out the irrelevant and re-

dundant candidate features through two stages. In the first stage, so-called irrelevancy filter, mutual

information between each candidate input, i.e. xi (t), and the target variable is computed. The can-

didate input with higher value of mutual information has more common information content with

the target variable. The candidate inputs with calculated mutual information value greater than a

relevancy threshold T H1 are considered as the relevant features of the forecast process, which are

retained for the next stage, while other candidate inputs with mutual information value lower than

T H1 are considered as irrelevant features, which are filtered out. In the second stage, so-called

redundancy filter, redundant features among the selected candidate inputs from the first stage are

found and filtered out. Higher value of mutual information between two selected candidates, e.g.,

xk (t) and xl (t), means more common information between these two candidates and thus, they

have a higher level of redundancy. Therefore, the redundancy of each selected feature xk (t) with

the other candidate inputs is measured. Afterwards, if the measured redundancy becomes greater

than a redundancy threshold T H2 , xk (t) is considered as a redundant candidate input. Hence,

between this candidate and its rival, which has the maximum redundancy with xk (t), one with

lower relevancy should be filtered out [52]. The selected candidate features in the second stage

are considered as the inputs of the wind power forecast engine. Moreover, tuning the values of

the thresholds T H1 and T H2 is performed by cross validation technique. Since this method is not

the focus of this paper, it is not further discussed here. The interested reader can refer to [52] for

details of this feature selection method. In addition, here, the target variable is the wind power of the

next time interval, for which the forecasting engine produces a prediction. Multi-period forecast,

e.g. prediction of wind power for the next forecast steps, is reached via recursion, i.e. by feeding

input variables with the forecaster’s outputs. For instance, forecasted wind power for the first hour

is used as y(t − 1) for wind power prediction of the second hour provided that y(t − 1) is among

the selected candidate inputs of the feature selection technique.
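
The two-stage filter described above can be sketched as follows. This is only an illustration under stated assumptions and not the implementation of [52]: the histogram-based mutual information estimator, the names `mutual_information` and `two_stage_filter`, the bin count, and the choice of the maximum pairwise mutual information as the redundancy measure are all hypothetical choices made for the example.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8):
    """Histogram estimate of mutual information (in nats) between two samples."""
    def bin_index(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # guard against a constant series
        return [min(int((v - lo) / width), bins - 1) for v in values]

    ia, ib = bin_index(a), bin_index(b)
    n = len(a)
    pa, pb, pab = Counter(ia), Counter(ib), Counter(zip(ia, ib))
    # sum over occupied cells of p_ij * log(p_ij / (p_i * p_j))
    return sum(c / n * math.log(c * n / (pa[i] * pb[j]))
               for (i, j), c in pab.items())


def two_stage_filter(candidates, target, th1, th2):
    """candidates: dict mapping a feature name to its list of samples."""
    relevance = {name: mutual_information(col, target)
                 for name, col in candidates.items()}
    # Stage 1 (irrelevancy filter): keep candidates whose MI exceeds TH1.
    kept = [name for name in candidates if relevance[name] > th1]
    # Stage 2 (redundancy filter): drop the less relevant of each redundant pair.
    removed_one = True
    while removed_one and len(kept) > 1:
        removed_one = False
        for k in kept:
            # Rival: the other kept feature sharing the most information with k.
            rival = max((l for l in kept if l != k),
                        key=lambda l: mutual_information(candidates[k], candidates[l]))
            if mutual_information(candidates[k], candidates[rival]) > th2:
                kept.remove(k if relevance[k] < relevance[rival] else rival)
                removed_one = True
                break
    return kept
```

In practice the thresholds `th1` and `th2` would be tuned by cross validation, as noted above, rather than fixed by hand.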

The forecasting engine should construct the input/output mapping function of X ⇒ y. The

activation function of the input layer nodes is the identity function, i.e. I(x) = x. In other words,

the input layer only propagates the inputs of the WNN to the next layers. The activation function

of the hidden layer nodes of the WNN, i.e. multi-dimensional Morlet wavelet, is constructed as

follows:
\[ F_i(x_1, x_2, \ldots, x_m) = \prod_{j=1}^{m} \psi_{a_i,b_i}(x_j), \quad \forall i = 1, 2, \ldots, n \tag{2.1} \]

\[ \psi_{a_i,b_i}(x_j) = \psi\left(\frac{x_j - b_i}{a_i}\right) \tag{2.2} \]

where n indicates the number of hidden neurons of WNN, and one-dimensional Morlet wavelet

ψ(.) is defined as follows:

\[ \psi(x) = e^{-0.5x^2} \cos(5x) \tag{2.3} \]

In (2.1) and (2.2), ψai ,bi is the scaled and shifted version of ψ(.) with ai and bi as the scale and

shift parameters, respectively. Each activation function Fi (.) has its own ai and bi . Based on (2.1),

Fi (.) is m-dimensional wavelet function of x1 , x2 , ..., xm constructed by the tensor product of one-

dimensional Morlet wavelets ψai ,bi . Finally, the output of the WNN, denoted by y in Fig. 2.2, is

computed as follows:
\[ y = \sum_{i=1}^{n} w_i F_i(x_1, x_2, \ldots, x_m) + \sum_{j=1}^{m} v_j x_j \tag{2.4} \]

where, wi is the weight between ith hidden neuron and the output node, and vj is the direct input

weight between j th input and the output node. In other words, the output of the WNN is obtained

by a combination of multi-dimensional wavelet functions, i.e. Fi (x1 , x2 , ..., xm ) , as well as a

combination of inputs, i.e. xj . Thus, the proposed WNN not only can benefit from the capabilities

of wavelet functions, such as their ability to capture cyclical behaviors, but also can capture trends

of the signal. Based on the above formulation, the vector of the free parameters of the WNN,

denoted by Z, is as follows:

Z = [v1 , ..., vm , w1 , ..., wn , a1 , ..., an , b1 , ..., bn ] (2.5)

Therefore, the WNN has N P = m + 3n free parameters, which should be determined by the

proposed training strategy.
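
As an illustration of (2.1)-(2.5), the forward pass of the WNN can be sketched in a few lines. The function names `morlet`, `wnn_output` and `unpack` are hypothetical and chosen only for this example. The sketch evaluates the network for a given parameter vector and does not include any training.

```python
import math


def morlet(x):
    """One-dimensional Morlet mother wavelet of (2.3)."""
    return math.exp(-0.5 * x * x) * math.cos(5.0 * x)


def wnn_output(x, v, w, a, b):
    """Output y of (2.4) for an input vector x of length m and n hidden neurons."""
    # Direct linear part: sum_j v_j * x_j
    y = sum(vj * xj for vj, xj in zip(v, x))
    for wi, ai, bi in zip(w, a, b):
        # F_i of (2.1): product over all inputs of the scaled and shifted
        # wavelet of (2.2), using the neuron's own scale a_i and shift b_i.
        y += wi * math.prod(morlet((xj - bi) / ai) for xj in x)
    return y


def unpack(z, m, n):
    """Split the parameter vector Z of (2.5) into (v, w, a, b)."""
    return z[:m], z[m:m + n], z[m + n:m + 2 * n], z[m + 2 * n:m + 3 * n]
```

A training algorithm would treat the flat vector Z as one candidate solution, call `unpack` on it, and evaluate `wnn_output` on each training sample to obtain the training error.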

2.3.2 The Proposed Training Strategy

We propose a new training strategy to train the developed WNN. This strategy is based on improved

Clonal selection algorithm. In the following, at first, the Clonal Selection Algorithm (CSA) is

briefly introduced. Then, the proposed Improved CSA (ICSA) is presented and adapted as the

training strategy to optimize the free parameters of WNN.

As an antigen (e.g., a virus) invades the human body, the biological immunity system selects

the antibodies that can effectively recognize and destroy the antigen. The selection mechanism of

the immunity system operates based on the affinity of the antibodies with relation to the invading

antigen. CSA is an efficient optimization method, inspired by the biological immunity system

selection mechanism, proposed by De Castro and Von Zuben [54]. This method has successfully

been applied to optimization and pattern recognition domains [54,55] and also unit commitment of

power systems [56] in recent years. The performance of CSA can be summarized as the following

step by step procedure [54]:

Step 1: Randomly produce the initial population of CSA within the allowable ranges. Each indi-

vidual of the population, so called antibody in CSA, is a candidate solution for the optimization

problem including its decision variables, called genes in CSA [54]. Here, each antibody of the

population of CSA includes N P free parameters of WNN, shown in (2.5). The number of antibod-

ies in the population is denoted by N . The generation number g is set to zero (g = 0). In general,

antibody k in generation g is as follows:

\[ Z_k^g = [Z_{k,1}^g, Z_{k,2}^g, \ldots, Z_{k,NP}^g], \quad \forall k = 1, 2, \ldots, N \tag{2.6} \]

where the $l$th gene or decision variable $Z_{k,l}^g$ ($l = 1, \ldots, NP$) can be from v1 , ..., vm or w1 , ..., wn or

a1 , ..., an or b1 , ..., bn as shown in (2.5). The four sets of vj , wi , ai and bi of each individual are

randomly initialized with uniform distribution in the intervals [-1, 1], [-1, 1], [0.5, 2] and [-3, 3],

respectively. It is noted that the interval of [-1, 1] is the most common range for weight and bias

initialization of neural networks [57]. According to Fig. 2.1, the output of the Morlet wavelet function

becomes very close to zero for the input values bigger than 3 and smaller than -3 and therefore, the

initial values for bi are set in the interval of [-3, 3]. Finally, with regard to the initialization of ai , the

interval of [0.5, 2] is chosen as it is not an extreme interval to make the function too spread/dense.

Step 2: Determine the affinity of the antibodies with respect to the antigen. In the optimization

problems, usually there is no explicit antigen population to be recognized, but an objective function

to be optimized (maximized or minimized). Thus, in optimization tasks, an antibody affinity cor-

responds to the evaluation of the objective function for the given antibody [56]. Here, the proposed

WNN is trained, i.e. its free parameters are optimized, by CSA. Thus, training error of the WNN

is considered as the objective function of CSA, which should be minimized. Since training error

of ANN-based forecasting engines is widely measured in terms of Mean Squared Error (MSE), it

is considered as the measure of WNN training error for the present.

Step 3: Sort antibodies based on their training error values in terms of MSE (i.e., the objective

function values) such that the best antibody with the lowest MSE ranks the first.

Step 4: Copy the antibodies based on their position in the sorted population:

\[ nc_k = \mathrm{Round}\left(\frac{\beta N}{k}\right), \quad \forall k = 1, 2, \ldots, N \tag{2.7} \]

where nck is the number of antibodies copied from k th antibody; Round(.) function rounds

up/down its real argument to the nearest integer value; β is a constant coefficient which indi-

cates rate of copy. Thus, an antibody with lower MSE and higher rank (lower k) will be copied

more than those with higher MSE. At the end of this step, the number of copied antibodies will be

N C as follows:
\[ NC = \sum_{k=1}^{N} \mathrm{Round}\left(\frac{\beta N}{k}\right) \tag{2.8} \]

Performance of this step is graphically shown in Fig. 2.3. As seen, the first antibody with the

highest rank is copied nc1 times, the second one nc2 times and so on.

Step 5: Mutate the N C antibodies, produced in step 4, using the hypermutation operator [55].

Step 6: Determine MSE value for each mutated antibody. Among the N C mutated and N original

antibodies, select N S antibodies (N S < N ) with the lowest MSE values. These N S antibodies

enter directly to the next generation.

Step 7: Randomly generate N − N S new antibodies for the next generation. These randomly

generated antibodies enhance search diversity of CSA, and consequently, the algorithm takes the

chance to escape from the local optima.

Step 8: Increment the generation number (g → g + 1). If the termination criterion, such as

maximum number of generations, is satisfied, the algorithm is terminated and the best antibody of

the last generation, owning the lowest MSE, is determined as the final solution of CSA; otherwise,

go back to step 2 and repeat this cycle. The termination criterion used for the training of the WNN

will be described later.
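
The eight steps above can be sketched as a compact minimization loop. This is a minimal illustration of the basic CSA only, under several assumptions: the names `clone_counts` and `csa_minimize`, the default settings, the fixed number of generations used as the termination criterion, and the rounding of halves upwards are hypothetical choices. The hypermutation of step 5 is modeled as additive zero-mean Gaussian noise with constant standard deviation `sigma`, and mutated genes are not clipped back to the initial ranges.

```python
import math
import random


def clone_counts(n, beta):
    """nc_k of (2.7) for ranks k = 1..N (halves rounded up, an assumption)."""
    return [math.floor(beta * n / k + 0.5) for k in range(1, n + 1)]


def csa_minimize(objective, bounds, n=20, ns=15, beta=0.5, sigma=0.1,
                 generations=100, seed=0):
    """Minimize objective over a box; bounds is a list of (low, high) pairs."""
    rng = random.Random(seed)

    def new_antibody():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    population = [new_antibody() for _ in range(n)]            # step 1
    for _ in range(generations):                                # step 8
        population.sort(key=objective)                          # steps 2 and 3
        clones = [list(antibody)                                # step 4
                  for antibody, nc in zip(population, clone_counts(n, beta))
                  for _ in range(nc)]
        mutated = [[gene + rng.gauss(0.0, sigma) for gene in antibody]
                   for antibody in clones]                      # step 5
        survivors = sorted(population + mutated, key=objective)[:ns]  # step 6
        population = survivors + [new_antibody() for _ in range(n - ns)]  # step 7
    return min(population, key=objective)
```

For WNN training, `objective` would map a parameter vector Z of (2.5) to the training MSE, and `bounds` would hold the initialization intervals given in step 1.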

In the above algorithm, N , N S, and β are user-defined settings of CSA. In this algorithm,

antibodies are evolved through the mutation, which is the key operator of CSA. The proposed

Improved CSA (ICSA) includes two enhancements for this operator as follows. De Castro and

Figure 2.3: Representation of step 4 (copy operator) of CSA

Von Zuben [54] discuss that the mutation rate should be inversely proportional to the antigenic

affinity: the higher the affinity, the smaller the mutation rate. Thus, in an optimization problem,

more optimal candidate solutions should be less mutated. This general idea is modeled in the

proposed ICSA based on the following relation:


\[ N_k^{Mut} = \mathrm{Round}\left[ e^{\left(-\rho \frac{MSE_{min}}{MSE_k}\right)} \times NP \right], \quad \forall k = 1, \ldots, N \tag{2.9} \]

where $MSE_k$ is MSE of $k$th antibody and $MSE_{min}$ is the lowest MSE among the antibodies of the

current population; $N_k^{Mut}$ represents number of genes of the antibodies copied from $k$th antibody,

produced in step 4, that should be mutated ($N_k^{Mut} \le NP$); the coefficient ρ controls the mutation

rate such that higher ρ leads to lower values of $N_k^{Mut}$. Thus, fewer decision variables are mutated

in antibodies with lower MSE (more optimal candidate solutions) and more decision variables are

mutated in antibodies with higher MSE (less optimal candidate solutions), and consequently, the

individuals of ICSA population are mutated in a coordinated manner. As a result, not only does

ICSA allow antibodies with lower MSE to be copied more than those with higher MSE, but also

controls the mutation process of copied antibodies in accordance with their MSE using (2.9). Note

22
that although the mutation operation is applied to the N C copied antibodies as shown in step 5,

only N values of NkM ut should be computed, since the antibodies copied from one individual have

the same NkM ut .
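The relation in (2.9) can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the function name `mutation_counts` is hypothetical, all MSE values are assumed to be strictly positive, and Python's built-in `round` (which rounds exact halves to the nearest even integer) stands in for the Round operator, whose tie-breaking rule is not specified in the text.

```python
import math


def mutation_counts(mse_values, rho, n_params):
    """Number of genes to mutate per antibody, following the form of Eq. (2.9)."""
    mse_min = min(mse_values)
    counts = []
    for mse_k in mse_values:
        weight = math.exp(-rho * mse_min / mse_k)  # lies in (0, 1) for rho > 0
        counts.append(round(weight * n_params))
    return counts


# Illustrative numbers only: three antibodies, NP = 10 free parameters, rho = 1.
counts = mutation_counts([1.0, 2.0, 4.0], rho=1.0, n_params=10)
print(counts)
```

With these illustrative values the best antibody receives the fewest mutated genes and the worst one the most, which is the coordinated behaviour described above.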

After determining $N_k^{Mut}$ for the antibodies of ICSA, the set of genes that should be mutated is randomly selected among the decision variables of each antibody. The hypermutation operator of step 5 adds a normal random variable with zero mean and constant variance to each decision variable of the mutating antibody [39]. Here, a more effective mutation operation inspired by the Differential Evolution (DE) algorithm [58] is proposed for ICSA as follows:

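$$Z_{k,l}^{g+1} = \left[ 1 - e^{-\rho \frac{MSE_{min}}{MSE_k}} \right] Z_{k,l}^{g} + e^{-\rho \frac{MSE_{min}}{MSE_k}} \left( Z_{k_1,l}^{g} - Z_{k_2,l}^{g} \right), \qquad (2.10)$$

$$1 \le k \ne k_1 \ne k_2 \le NC, \quad 1 \le l \le NP$$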

where $Z_{k,l}^{g}$ and $Z_{k,l}^{g+1}$ represent gene l of antibody k in two successive generations g and g + 1. As seen, to mutate the kth antibody, this mutation operator uses the values of the decision variables in two other antibodies (i.e. $Z_{k_1,l}^{g}$ and $Z_{k_2,l}^{g}$), as is done in the DE mutation operator.

In [58], it is argued that the DE mutation operator, by computing the difference between two randomly chosen individuals from the population (here, antibodies k1 and k2), determines a function gradient over a given area (not at a single point), and therefore helps prevent the solution from being trapped in a local optimum of the objective function. Moreover, to enhance the search diversity of the proposed mutation operation, it is applied separately to each gene of the mutating antibodies.

Additionally, the exp(.) term of (2.9) and its complement 1 − exp(.) are used in (2.10). If the kth antibody $Z_{k,l}^{g}$ is a good candidate solution, the exp(.) term and its complement become close to 0

and 1, respectively. Thus, the next generation decision variables mainly take their values from the

current generation decision variables and small gradient terms are added to them. Consequently,

ICSA can search promising areas of the solution space with small steps or high resolution, which is

known as exploitation capability. Based on this capability, a stochastic search technique can extract

optimal solutions of the solution space from its promising areas. High exploitation capability of

the proposed ICSA allows it to find more optimal solutions for the optimization problem of WNN

training. In other words, the WNN using ICSA can better learn the complex input/output mapping

function of the wind power forecast process and so predict its future values with higher accuracy.

However, if the kth antibody is a poor candidate solution, the exp(.) term and its complement become

close to 1 and 0, respectively, and so the next generation decision variables mainly obtain their

values from large gradient terms. Hence, ICSA can move out from the non-promising areas of the

solution space by large steps. Finally, note that due to the effect of the mutation operation of (2.10),

even the copied antibodies of the same antibody lead to different individuals after the mutation

operation. Thus, at the end of step 5 of the proposed ICSA, N C diverse candidate solutions are

added to N original antibodies (Fig. 2.3), which further enhance the search ability of ICSA.
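The mutation operation of (2.10) can be sketched as follows. This is an illustrative reading, not the code used in this work. The name `icsa_mutate` is hypothetical, `clone_mse` is assumed to hold the (pre-mutation) MSE of each copied antibody, and the partner antibodies k1 and k2 are assumed to be redrawn for every mutated gene, which is one possible interpretation of applying the operator separately to each gene.

```python
import math
import random


def icsa_mutate(clones, clone_mse, k, n_mut, rho, rng):
    """Mutate n_mut randomly chosen genes of clones[k] using the form of Eq. (2.10)."""
    mse_min = min(clone_mse)
    weight = math.exp(-rho * mse_min / clone_mse[k])  # exp(.) term shared with Eq. (2.9)
    partners = [i for i in range(len(clones)) if i != k]
    mutated = list(clones[k])
    for gene in rng.sample(range(len(mutated)), n_mut):
        # Assumption: a fresh pair k1 != k2 (both different from k) is drawn per gene.
        k1, k2 = rng.sample(partners, 2)
        difference = clones[k1][gene] - clones[k2][gene]
        mutated[gene] = (1.0 - weight) * clones[k][gene] + weight * difference
    return mutated


# Toy usage with three clones of two genes each (values are illustrative only).
clones = [[1.0, 2.0], [3.0, 5.0], [4.0, 9.0]]
clone_mse = [1.0, 2.0, 4.0]
child = icsa_mutate(clones, clone_mse, k=2, n_mut=2, rho=1.0, rng=random.Random(0))
```

Because the weight depends on the MSE of the mutating antibody, a poor antibody (as in this toy call) takes a larger share of its new value from the difference term, while a good one stays close to its current value.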

The termination criterion is an important aspect of the proposed training strategy, as it can affect the performance of the proposed forecasting engine. A low number of ICSA generations may lead to insufficient training of the forecasting engine and cause the WNN to incorrectly learn the input/output mapping function of the forecast process, i.e. X ⇒ y. On the other hand, a large number of ICSA generations increases the computation burden and, more importantly, may lead to over-fitting of the WNN. When over-fitting occurs in a neural-network-based forecast engine, it memorizes the training samples instead of learning the underlying mapping, and thus, while the neural network obtains a very low training error, its generalization capability (i.e. its ability to respond to unseen forecast samples) degrades. To avoid these problems, a termination criterion based on the cross-validation technique is used for the training phase of the proposed forecasting engine. In this technique, the whole gathered historical data is divided into training and validation samples. The WNN is trained by the proposed ICSA, which tries to minimize the MSE of the training samples in each generation. However, at the end of each generation (step 8), the WNN is tested on the unseen validation samples. For instance, the validation samples can be some samples at the end of the historical data interval, i.e. the closest historical data to the forecasting horizon. When the MSE of the unseen validation samples, as a measure of the WNN's forecast error, begins to rise, the generalization capability of the WNN begins to degrade, and therefore the training phase should be terminated. Then, the best individual of the generation leading to the minimum validation error, which is expected to yield the maximum generalization capability of the WNN, is selected as the final solution of the training phase, indicating the free parameters of the WNN.
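The validation-based termination described above can be sketched as a small training loop. This is a hedged illustration only: `evolve` and `validation_error` are hypothetical callables standing in for one ICSA generation and for the evaluation of the WNN on the validation samples, and the `patience` parameter (how many non-improving generations are tolerated) is an assumption that does not appear in the text.

```python
def train_with_validation_stop(evolve, validation_error, max_generations, patience=1):
    """Run generations until the validation error stops improving.

    evolve           -- hypothetical callable: generation index -> best antibody of
                        that generation (trained on the training samples)
    validation_error -- hypothetical callable: antibody -> error on validation samples
    patience         -- assumed number of non-improving generations tolerated
    """
    best = {"params": None, "generation": None, "error": float("inf")}
    stalled = 0
    for generation in range(max_generations):
        params = evolve(generation)
        error = validation_error(params)
        if error < best["error"]:
            best = {"params": params, "generation": generation, "error": error}
            stalled = 0
        else:
            stalled += 1
            if stalled >= patience:
                break  # validation error has begun to rise
    return best


# Toy usage: the "antibody" is just the generation index, with scripted errors.
scripted_errors = [5.0, 4.0, 3.0, 3.5, 2.0]
result = train_with_validation_stop(
    evolve=lambda g: g,
    validation_error=lambda p: scripted_errors[p],
    max_generations=len(scripted_errors),
)
```

In the toy run, training stops at the fourth generation, when the scripted validation error rises, and the individual from the third generation is returned.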

Although MSE has been widely used in forecasting models as a training error measure, its use in training a neural network is optimal only if the probability distribution of the prediction errors is Gaussian. However, wind power forecast error presents a non-Gaussian shape [59]. Minimizing the squared error is equivalent to minimizing the variance of the error distribution. Accordingly, the higher moments (e.g., skewness, kurtosis, etc.) are not captured, although they contain information that should be passed to the free parameters of the neural network instead of remaining in the error distribution. Bessa et al. [59] proposed some new training error criteria based on minimizing the information content of the error distribution (instead of minimizing the variance, as in MSE). The Maximum Correntropy Criterion (MCC) is defined as follows:


$$MCC = \max\left\{ \frac{1}{N_e} \sum_{i=1}^{N_e} G(\varepsilon_i, \sigma^2) \right\} \qquad (2.11)$$

where G is the Gaussian kernel, $N_e$ is the number of training samples, $\varepsilon_i$ is the error for the ith training sample, and $\sigma^2$ is the variance. Therefore, MCC approximates a non-Gaussian shape by a summation of Gaussian functions corresponding to the errors of the training samples. For more detailed information on MCC, refer to [59]. It should be noted that MCC is a maximization problem, while the proposed training algorithm in section 2.3.2, i.e. ICSA, was described as a minimization problem since it was based on MSE. Hence, to adapt this criterion to the proposed training algorithm, minimization of $1/MCC$ or $-MCC$ can be considered instead of maximization of MCC. Note that the value of correntropy is always positive [60]. Moreover, the values of $MSE_k$ and $MSE_{min}$ are simply replaced by $(1/MCC)_k$ and $(1/MCC)_{min}$ (or $(-MCC)_k$ and $(-MCC)_{min}$), respectively, in equations (2.9) and (2.10). MSE was included in the structure of the proposed algorithm because of its popularity and common use as a criterion in the training phase of forecasting models. Moreover, the minimization of errors for training samples is often the objective function of forecasting problems, and thus it is more tangible to deal with a minimization problem in this area.
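The correntropy term of (2.11) and its conversion to a minimization objective can be sketched as follows. The normalized form of the Gaussian kernel, the kernel width `sigma = 1.0` and all function names are assumptions made for this illustration; the exact kernel definition should be taken from [59].

```python
import math


def gaussian_kernel(error, sigma):
    """Gaussian kernel G(error, sigma^2); the normalization constant is an assumption."""
    return math.exp(-error ** 2 / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def correntropy(errors, sigma):
    """Sample mean of the kernel over the training errors (the quantity maximized in Eq. (2.11))."""
    return sum(gaussian_kernel(e, sigma) for e in errors) / len(errors)


def mcc_as_minimization(errors, sigma):
    """1 / correntropy, so that a minimizer such as ICSA can be used unchanged."""
    return 1.0 / correntropy(errors, sigma)


def mse(errors):
    """Mean squared error, for comparison."""
    return sum(e ** 2 for e in errors) / len(errors)


# Illustrative errors: the second set contains one large outlier.
clean = [0.1, -0.1, 0.05, -0.05]
with_outlier = [0.1, -0.1, 0.05, 5.0]
print(mse(clean), mse(with_outlier))
print(mcc_as_minimization(clean, 1.0), mcc_as_minimization(with_outlier, 1.0))
```

In this toy comparison a single outlier inflates the MSE by several orders of magnitude, whereas the correntropy-based objective changes only moderately, because the kernel value of a very large error is close to zero.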

2.4 Numerical results

In order to generate forecasts, two parameters must be decided, namely, forecast interval and forecast horizon. Forecast interval determines the length of each time step into the future (e.g., 10

minutes or one hour), and forecast horizon determines how many time steps into the future are of

interest. Both factors depend on the application of the forecasts. For instance, if the forecasts are

used for very short-term adjustments in operation schedules, the forecast horizon could be as short

as a few hours. Furthermore, the forecast interval may also vary depending on the application. For

example, while a unit commitment algorithm may consider hourly intervals, an economic dispatch

algorithm may look into shorter intervals (e.g., 5 minutes). Note that in generating the forecasts,

selecting the forecast interval is sometimes limited by the availability of the data. For example, in

Alberta, the meteorological towers that measure weather factors at wind farm sites are set to collect

the data for every 10-minute interval. Thus, the forecast interval cannot be less than 10 minutes if the meteorological data are of interest in the models.

In this paper, we have chosen to generate hourly wind power production forecasts for Alberta's power system for up to 6 hours into the future as our test case. Alberta's electricity market is a real-time market with an hourly settlement interval. Although the system marginal price is determined every minute, the supply and demand offers must be submitted to the system operator for hourly intervals. The bids and offers may be changed up to two hours before the operation hour, which is referred to as the T-2 window. The majority of slower generators do not strategically adjust their prices and bid at $0/MWh. Faster units, on the other hand, actively watch the supply-demand balance in the market and adjust their strategies. These units normally act based on short-term developments in the market, usually over the next few hours. In particular, for any given operation hour, forecasts of wind power generated before the start of the T-2 window, when the market participants can still change their bids and offers, are of significant value. In addition, given the real-time nature of the market, and considering the fact that only short lead-time units behave strategically, the system operator is mainly concerned with the supply-demand balance for the next hour or two. Thus, while forecasts for longer horizons are important, the ones for short forecast horizons are particularly valuable for the system operator and market participants in Alberta.

The forecasts of meteorological variables, e.g., wind speed, through NWP models are known

to be useful in wind power forecasts for longer forecast horizons [41]. Thus, we choose not to

include such data into our model since we are focusing on short-term prediction, and generate

forecasts solely based on past power production values. More specifically, 100 hourly lagged

values of wind power are considered as the candidate inputs, which are processed by the feature

selection technique to select a minimum subset of the most informative features for the proposed

forecasting engine. Furthermore, the 60 days prior to each forecast day are considered as the historical data, divided into 59 days as the training set and the one day before the forecast day as the validation set.

The second week of March, June, September, and December of year 2012 are considered as test

weeks. For each test week, the forecasts are updated in two different ways in this paper. In the first

part of the numerical results, we update the forecasts every 6 hours, and thus, 28 sets of forecasts

are generated for each of the representative weeks. In this part, the error measures are evaluated

based on the average of errors for the individual forecasts over the entire week. In the second part

of the results, we test the model by updating the forecasts every hour, i.e., for every test hour, there

will be six versions of forecasts produced at previous hours. For these forecasts, we evaluate the

error measures for each forecast horizon, as further discussed later in this section.

To show an example of input selection using the feature selection technique, the selected features for the third forecasting window of the September test week, i.e. the third 6-hour window of September 8, 2012, are as follows:

X = [x_1, x_2, ..., x_m] = [WP(t−1), WP(t−2), WP(t−3), WP(t−4), WP(t−5), WP(t−6), WP(t−7), WP(t−8), WP(t−10), WP(t−11), WP(t−18), WP(t−19), WP(t−20), WP(t−21), WP(t−22), WP(t−23), WP(t−24), WP(t−25), WP(t−26), WP(t−28)]

These 20 features are selected from 100 candidate inputs WP(t−1), WP(t−2), ..., WP(t−100).
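To make the role of the selected lags concrete, the following sketch builds input/target pairs from an hourly series using the 20 lags listed above. The helper name `lagged_samples` is hypothetical and the series is a dummy ramp rather than real wind power data; the feature selection technique itself is not reproduced here.

```python
def lagged_samples(series, lags):
    """Build (inputs, target) pairs where the inputs are WP(t - lag) for each selected lag."""
    max_lag = max(lags)
    samples = []
    for t in range(max_lag, len(series)):
        inputs = [series[t - lag] for lag in lags]
        samples.append((inputs, series[t]))
    return samples


# The 20 selected lags from the example above; the series is a dummy ramp, not real data.
selected_lags = [1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28]
dummy_series = list(range(40))
samples = lagged_samples(dummy_series, selected_lags)
```

Each sample pairs the 20 lagged values with the value at hour t, so the first usable hour is the one at which the largest lag (28 hours) is available.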

Two error criteria are used in this paper to evaluate forecast errors: normalized Root Mean

Square Error (nRMSE) and normalized Mean Absolute Error (nMAE) defined as follows:

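$$nRMSE = \sqrt{ \frac{1}{NH} \sum_{t=1}^{NH} \left( \frac{WP_{ACT}(t) - WP_{FOR}(t)}{WP_{Cap}} \right)^2 } \times 100 \qquad (2.12)$$

$$nMAE = \frac{1}{NH} \sum_{t=1}^{NH} \left| \frac{WP_{ACT}(t) - WP_{FOR}(t)}{WP_{Cap}} \right| \times 100 \qquad (2.13)$$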

where $WP_{ACT}(t)$ and $WP_{FOR}(t)$ indicate the actual and forecast values of wind power for hour t. Also, NH indicates the number of hours, which is 168 for the test weeks, and $WP_{Cap}$ is the total wind power capacity of the aggregated wind farms, which is 861, 941, 941 and 1087 MW in March, June, September and December of year 2012, respectively, due to the growth of wind power capacity in Alberta.
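The two error criteria in (2.12) and (2.13) translate directly into code. The sketch below is illustrative: the function names are hypothetical, and the values in the usage example are made up for easy arithmetic rather than taken from the Alberta data.

```python
import math


def nrmse(actual, forecast, capacity):
    """Normalized RMSE in percent, following Eq. (2.12)."""
    n = len(actual)
    total = sum(((a - f) / capacity) ** 2 for a, f in zip(actual, forecast))
    return 100.0 * math.sqrt(total / n)


def nmae(actual, forecast, capacity):
    """Normalized MAE in percent, following Eq. (2.13)."""
    n = len(actual)
    total = sum(abs(a - f) / capacity for a, f in zip(actual, forecast))
    return 100.0 * total / n


# Made-up values in MW; 1000 MW is an arbitrary capacity chosen for easy arithmetic.
actual = [100.0, 200.0]
forecast = [110.0, 180.0]
print(nrmse(actual, forecast, 1000.0), nmae(actual, forecast, 1000.0))
```

Normalizing by the installed capacity makes the errors of different months comparable even though the capacity grew during 2012.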

2.4.1 Numerical Results with 6-hour Updates

For these forecasts, at the end of each 6-hour window, when the wind power values of 6 hours

become available, the historical data is updated to perform the wind power prediction of the next 6

hours. Thus, each forecast horizon includes 6 forecast steps. However, one week or 168 hours are

considered as the evaluation period for the error criteria of nRMSE and nMAE to better evaluate

the performance of the method over a longer period.

The results obtained from the proposed forecasting method, i.e. WNN with Morlet wavelet

function, ICSA training algorithm and MCC training criterion, in comparison with the same model

but consisting of MSE training criterion instead of MCC are shown in Table 2.1. We have also

generated the forecasts based on some other popular models, i.e., the persistence method, and RBF

and MLP neural networks. A brief description of these models is presented in B. For the sake of a fair comparison, all of these methods have the same historical data except the persistence method that does not require training samples. Observe from Table 2.1 that WNN with MSE as the training measure outperforms the three other forecast methods. For instance, the average nRMSE and average nMAE of WNN with MSE are (14.95-14.17)/14.95=5.2% and (10.71-10.26)/10.71=4.2% lower than those of the persistence method.

Table 2.1: nRMSE (%) and nMAE (%) results for the four test weeks of year 2012

Model             Error   Mar.    Jun.    Sep.    Dec.    Average
Persistence       nRMSE   13.71   15.14   18.44   12.49   14.95
                  nMAE    10.08   10.79   13.11    8.84   10.71
RBF               nRMSE   18.32   14.57   18.62   14.11   16.40
                  nMAE    13.32   10.45   13.77   10.24   11.95
MLP               nRMSE   15.36   15.62   19.80   12.32   15.78
                  nMAE    12.42   11.56   14.54    9.02   11.89
WNN with MSE      nRMSE   12.38   14.99   17.66   11.65   14.17
                  nMAE     9.36   10.64   12.49    8.53   10.26
Proposed Method   nRMSE   12.23   12.48   16.68   11.58   13.24
                  nMAE     9.22    9.64   11.73    8.22    9.70

Moreover, as seen from the results of Table 2.1, considering MCC as the training error measure

can significantly improve the forecasting accuracy of wind power prediction. For instance, the

average nRMSE and average nMAE for the proposed method are respectively 6.6% and 5.5%

lower than those for WNN with MSE, and 11.4% and 9.4% lower than those for the persistence

method, which clearly show the advantage of using MCC error measure in training phase of a wind

power forecasting model. Note that while persistence forecasts are a useful benchmark, especially for one-step-ahead forecasts, they do not provide any information on variations and ramps in multi-step-ahead forecasting practice. Thus, despite their average errors being relatively close to those of the proposed method, they do not contain ramping information and are therefore less useful.
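The relative improvements quoted above follow from the average errors in Table 2.1. As a quick check, the short script below recomputes them; the helper name `relative_improvement` is hypothetical, and the numbers are copied from the Average column of the table.

```python
def relative_improvement(baseline, candidate):
    """Percentage by which the candidate error is lower than the baseline error."""
    return 100.0 * (baseline - candidate) / baseline


# Average errors copied from Table 2.1 as (nRMSE, nMAE).
persistence = (14.95, 10.71)
wnn_mse = (14.17, 10.26)
proposed = (13.24, 9.70)

comparisons = [
    ("WNN with MSE vs persistence", persistence, wnn_mse),
    ("proposed vs WNN with MSE", wnn_mse, proposed),
    ("proposed vs persistence", persistence, proposed),
]
for name, base, cand in comparisons:
    gains = [round(relative_improvement(b, c), 1) for b, c in zip(base, cand)]
    print(name, gains)
```

Rounded to one decimal place, the script reproduces the 5.2% and 4.2%, 6.6% and 5.5%, and 11.4% and 9.4% figures stated in the text.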

Fig. 2.4 graphically shows the forecast results of different forecasting models for June 10, 2012, in which there is a sharp downward ramp, and June 12, 2012, in which there is an upward ramp. Observe from this figure that the proposed method, shown in red, can satisfactorily follow the trend and ramps of the measured wind power, shown in dotted black, for both days. Neural network based models, e.g., MLP and RBF shown in marked blue, can also follow ramps of wind power to some extent, although they might sometimes miss the correct direction. However, the persistence model, shown in green, cannot provide any ramp information or good forecast values, especially when there is a ramp in the time series; hence, this model can only be useful for very short-term prediction, e.g., up to one hour ahead.

[Figure 2.4 panels: June 10, 2012 and June 12, 2012. x-axis: Hour; y-axis: Wind power production (MW). Series: Proposed, MLP, Persistence, RBF, Measured.]

Figure 2.4: Forecast results of different models for two different days

To illustrate the effectiveness of Morlet wavelet function as the activation function of WNN,

the proposed method, i.e., WNN with Morlet wavelet function, ICSA training algorithm and MCC

training criterion, is compared with the same model but with the Mexican hat wavelet function (i.e.,

WNN with Mexican hat wavelet function, ICSA training algorithm and MCC training criterion)

in Table 2.2. Improved performance of the proposed method can be seen from Table 2.2 such that

wind power forecast accuracy of the proposed method is better than the WNN with Mexican hat

wavelet in all test weeks.

Table 2.2: Comparison of the proposed method and a WNN with Mexican hat wavelet

Model                  Error   Mar.    Jun.    Sep.    Dec.    Average
WNN with Mexican hat   nRMSE   13.77   13.23   17.12   12.22   14.08
                       nMAE    10.13    9.80   12.54    8.71   10.30
Proposed Method        nRMSE   12.23   12.48   16.68   11.58   13.24
                       nMAE     9.22    9.64   11.73    8.22    9.70

In the next numerical experiment, the effectiveness of the proposed training strategy is evaluated. For this purpose, the proposed ICSA is replaced with several other well-known stochastic search techniques including Simulated Annealing (SA), Particle Swarm Optimization (PSO), Differential Evolution (DE), and CSA, while the other parts of the suggested wind power forecasting engine, i.e. WNN with Morlet wavelet function and MCC training criterion, are kept unchanged. As seen from Table 2.3, the proposed ICSA leads to the lowest wind forecasting errors in all test weeks among all stochastic search techniques including SA, PSO, DE and CSA by 20.7%, 25.1%, 20.8% and 9.8% improvement of average nRMSE, and by 22.4%, 25.0%, 22.5% and 9.4% improvement of average nMAE, respectively, demonstrating the effectiveness of the proposed training strategy.

Table 2.3: Comparison of the proposed training strategy, i.e. ICSA, with SA, PSO, DE and CSA

Algorithm                Error   Mar.    Jun.    Sep.    Dec.    Average
SA                       nRMSE   19.95   16.23   16.96   13.70   16.71
                         nMAE    14.73   11.57   13.53   10.18   12.50
PSO                      nRMSE   18.33   17.69   20.70   14.01   17.68
                         nMAE    14.18   13.38   14.37    9.88   12.95
DE                       nRMSE   17.51   17.43   17.94   13.97   16.72
                         nMAE    13.39   13.34   13.08   10.25   12.52
CSA                      nRMSE   13.79   13.47   18.29   13.22   14.69
                         nMAE    10.49   10.29   12.80    9.21   10.70
Proposed Method (ICSA)   nRMSE   12.23   12.48   16.68   11.58   13.24
                         nMAE     9.22    9.64   11.73    8.22    9.70

Finally, we applied the proposed method to predict wind power for the year 2012 so as to have

a comprehensive evaluation of its performance. Hence, 10 months from March to December 2012

have been considered in the last numerical experiment and monthly errors are reported in Table

2.4. Note that the data for the first two months are used as training samples for the prediction of March, and therefore no prediction result can be reported for January and February.

Table 2.4: Wind power prediction error results of the proposed method for 10 months

Test month   nRMSE (%)   nMAE (%)
Mar.         11.89       8.32
Apr.         11.98       8.46
May          12.32       9.26
Jun.         13.69       9.74
Jul.         10.71       7.29
Aug.         12.08       8.05
Sep.         13.26       8.78
Oct.         11.35       7.78
Nov.         12.21       8.64
Dec.         11.52       7.80
Average      12.10       8.41

According to this table, the monthly errors for March and December are very close to those presented earlier for the associated test weeks in these months. For instance, the monthly nRMSE and nMAE are respectively 11.52% and 7.80% for December, and the weekly nRMSE and nMAE are 11.58% and 8.22%, respectively, for the test week of December. The errors for September shown in Table 2.4 are considerably lower than those for the test week of September presented in the previous tables. The reason is that there are more severe ramps in the second week of September, selected as the test week, than in the other weeks of this month, and therefore the average error of the month is lower than the error associated with its second week. On the contrary, the forecasting errors for June presented in Table 2.4 are higher than those presented earlier for the test week of June. Here, the wind power data for the second week of June were more predictable than those of the other weeks in this month. Moreover, the average errors presented in Table 2.4, i.e., 12.10% and 8.41% in terms of nRMSE and nMAE, respectively, are close to, and even lower than, those presented for the four test weeks, i.e., 13.24% and 9.70%. This supports the comparative results presented in Tables 2.1, 2.2 and 2.3 between the proposed method and the other methods based on the test weeks. Furthermore, considering the forecasting errors presented in Tables 2.1, 2.2 and 2.3, no matter which test week is considered (e.g., with high or low predictability), the proposed method demonstrates higher wind power forecast accuracy than the other benchmark models.

2.4.2 Numerical Results with Hourly Updates

Prediction errors can also be calculated for each look-ahead forecast distinctly (i.e., each forecast

hour). In some of the literature, mainly academic, the average of prediction errors for 1-hour ahead

up to the last hour of the forecast horizon is calculated. For instance, in 6-hour ahead prediction,

each forecasting window includes 1-hour to 6-hour ahead forecasts, and the error is the average

of the prediction errors of these 6 forecast values, as performed in subsection 2.4.1. However,

in most commercial forecasting tools, the forecasts are updated after each forecast interval, and

forecast accuracy is evaluated for each look-ahead interval individually. In other words, the values

of wind power for the first 6 hours, i.e., W P (t + 1), ..., W P (t + 6), are predicted at time t. When

the observed value of wind power for the hour t + 1 becomes available, the data is updated and

forecasts for the next 6 hours, i.e., W P (t + 2), ..., W P (t + 7), are predicted at time t + 1.

That is, the data is updated every hour, while the forecast horizon is still 6 hours ahead. In the following numerical experiment, the proposed method is applied to the aggregated wind power forecast of Alberta in 10 test weeks, namely the second weeks of March to December 2012, such that the prediction accuracy of the different look-ahead forecasts within the 6-hour window is evaluated. Fig. 2.5 shows the results of this numerical experiment, including the curves of average nRMSE for the six look-ahead forecasts obtained from the proposed approach. We also present the same measures of accuracy for the third-party forecasts currently employed by the Alberta Electric System Operator [18]. Observe from this figure that the forecast accuracy of both models decreases as the look-ahead time increases, due to higher cumulative errors. For instance, the average forecast error of the proposed method for the 1-hour-ahead forecast, i.e. forecast values generated 1 hour before real time, is 5.04%, while for the 6-hour-ahead forecast, i.e. forecast values generated 6 hours before real time, it is 15.94%.
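The hourly-update evaluation can be sketched as a rolling loop that re-issues a multi-step forecast at every hour and accumulates the error separately for each look-ahead step. This is an illustration under stated assumptions: the function names are hypothetical, the persistence benchmark stands in for the forecasting engine, and the series is a dummy ramp rather than Alberta data.

```python
import math


def persistence(history, horizon):
    """Benchmark forecaster: repeat the last observed value for every step ahead."""
    return [history[-1]] * horizon


def per_horizon_nrmse(series, first_origin, horizon, capacity, forecaster=persistence):
    """nRMSE (%) for each look-ahead step when forecasts are re-issued every hour."""
    squared = [0.0] * horizon
    counts = [0] * horizon
    for origin in range(first_origin, len(series)):
        # The forecaster only sees observations before the forecast origin.
        forecast = forecaster(series[:origin], horizon)
        for step in range(1, horizon + 1):
            target = origin + step - 1
            if target < len(series):
                error = (series[target] - forecast[step - 1]) / capacity
                squared[step - 1] += error ** 2
                counts[step - 1] += 1
    return [100.0 * math.sqrt(s / c) if c else None for s, c in zip(squared, counts)]


# Dummy ramp of 10 MW per hour with an arbitrary 100 MW capacity (not real data).
ramp = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
errors_by_step = per_horizon_nrmse(ramp, first_origin=1, horizon=3, capacity=100.0)
print(errors_by_step)
```

On a steady ramp the persistence error grows linearly with the look-ahead step, which mirrors, in a simplified way, the growth of error with look-ahead time discussed above.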

Compared to the third-party forecasts, the proposed method results in better or comparable forecast accuracy up to 3 hours ahead. For instance, the proposed model improves the accuracy of the 1-hour-ahead forecast by 30.29%. Note that at any settlement interval, the next three hours are particularly important to the system operator in Alberta. This is because the real-time nature of the market requires that the next hour or two be monitored properly to ensure system security and supply-demand balance. Strategic market participants also need to watch the next three hours because they extend just outside the T-2 window. For the three longer look-ahead windows, the third-party forecasts outperform the ones generated by our model. This is mainly because the third-party forecasts include NWP data. Thus, a combination of the forecasts generated by our model for the first three hours and the third-party forecasts for the last three hours would provide a more accurate picture of future wind generation.

[Figure 2.5 axes: x-axis: Look-ahead Forecast (Hour), 1 to 6; y-axis: nRMSE (%). Series: Third-party Forecasts, Proposed Method.]

Figure 2.5: Average nRMSE errors of two forecasting models in different look ahead forecasts

2.5 Conclusion

In this paper, a new wind power prediction strategy is proposed. A WNN with a multi-dimensional Morlet wavelet as the activation function of the hidden neurons and MCC as the training criterion is applied as the forecasting engine to implement the input/output mapping function of the wind prediction process. A new stochastic search technique, named ICSA, which is an improved version of the Clonal Selection Algorithm, is proposed and adapted as the training procedure to optimize the free parameters of the forecasting engine. The effectiveness of the whole proposed wind power forecasting strategy, as well as the effectiveness of its main components including the suggested WNN and training procedure, is extensively evaluated using real-world data for wind power prediction. As regards the training algorithm, the proposed ICSA outperforms the other stochastic search algorithms including SA, PSO, DE and CSA in terms of both nRMSE and nMAE, illustrating the effectiveness of the proposed training strategy. Moreover, the suggested Morlet wavelet function results in more accurate wind power forecasts than the Mexican hat wavelet function, and the MCC training criterion leads to lower wind power prediction errors than the traditional training error measure of MSE.

Chapter 3

Short-term Electricity Load Forecasting of Buildings in Microgrids 1

3.1 Introduction

Micro-grids are integrated energy systems composed of distributed energy resources and multiple electrical loads operating as an autonomous grid, which can be either in parallel to or islanded from the existing power grid. A micro-grid can be considered as a small-scale version of the traditional power grid, whose small scale results in far lower line losses and lower demand on transmission infrastructure. These advantages are consequently motivating an increased demand for micro-grids in a variety of application areas such as campus environments, military operations, community/utility systems, and commercial and industrial markets [61].

Considering the fast and worldwide development of micro-grids, their optimal operation requires advanced tools and techniques. In particular, Short-Term Load Forecast (STLF) is an indispensable task for the operation of a micro-grid. In conventional power systems, STLF is an important tool for reliable and economic operation, as many operating decisions, such as dispatch scheduling of generating capacity, demand side management, security assessment and maintenance scheduling of generators, are based on load forecasts [9, 62–66]. Load forecasts also have significant roles in energy transactions, market shares and profits in competitive electricity markets [66, 67]. Different prediction strategies have already been presented for the STLF of traditional power systems over the years. These methodologies are generally divided into two main groups: classical statistical techniques and computational intelligence techniques. Reviews on some of these strategies can be found in [9, 62, 64–67].

1 © 2015 Elsevier Ltd. Reprinted, with permission, from [13]: H. Chitsaz, H. Shaker, H. Zareipour, D. Wood, and N. Amjady, "Short-term Electricity Load Forecasting of Buildings in Microgrids", Energy and Buildings, vol. 99, pp. 50-60, July 2015.

In a similar way, STLF is a key factor in the operation of micro-grids, for example in energy management for optimal utilization of available resources in order to minimize the operation cost or any environmental impact of a micro-grid [5]. Moreover, STLF for a micro-grid can be used for profitable trade of electric energy within the grid. In other words, it is important for the operator of a micro-grid to determine the amount of power exchanged with a wholesale energy market so as to maximize the total benefit [68]. It has also been discussed that the forecasted loads as well as the forecasted generation of renewable resources are the main inputs for optimal energy management [7, 69] and generation scheduling [10] in micro-grids.

However, modeling and forecasting of micro-grids' loads can be more complex tasks than those usually applied for conventional power systems, as the load time series of micro-grids is more volatile than the load of power systems, as demonstrated later in the present paper. Since a micro-grid is considerably smaller than a traditional power system, the load of a micro-grid includes more fluctuations. In other words, the inertia in small-scale systems is low and therefore the smoothness of the load time series in such systems degrades. Using a criterion to measure the volatility of a time series, it will be shown in this paper that the volatility of the load time series for a micro-grid is considerably higher than that for a conventional power system. As a result, there is a need to adapt a suitable STLF model to the volatile behavior of micro-grid load time series. Despite the importance of STLF for micro-grids, only a few works have been presented in this area. The authors in [70] present an on-line learning model based on Multiple Classifier Systems (MCSs) for short-term load forecasting of micro-grids, and the model was tested on real data of a micro-grid. A bi-level prediction strategy is proposed in [6] for STLF in micro-grids. This strategy is composed of a forecaster including a neural network and an evolutionary algorithm in the lower level and an enhanced differential evolution algorithm in the upper level for optimizing the performance of the forecaster. The model proposed in [6] is designed with the aggregated micro-grid load in mind. However, the present paper focuses on forecasting the individual loads within a micro-grid, which potentially have significantly higher volatility than the aggregated micro-grid load. Forecasting individual micro-grid load components is important for operation scheduling and determining load serving priorities at the feeder level [71].

Some research works have also been presented regarding electricity load prediction for resi-

dential areas and buildings [72–74]. The proper consumption of electricity in buildings leads to

lower operational costs. If the facility manager could predict the electricity demand of the build-

ing, actions could consequently be taken to reduce the amount of energy and therefore, reduce the

operational cost of the building [74]. A few works have been published very recently in the area of

energy prediction of buildings. For instance, long-term energy consumption of a residential area in

South West China has been studied in [75]. In this reference, an Artificial Neural Network (ANN)

model is compared with some other prediction models, including Grey model, regression model,

polynomial model and polynomial regression model, to forecast the total energy consumption of

the residential area, and it is shown that ANN model outperforms the other models. Having access

to detailed data of a six-story multi-family residential building located on the Columbia University

campus in New York City, the authors in [76] were able to conduct a comparative spatial analysis

to forecast the energy consumptions of units, floors and the whole building for different temporal

intervals (e.g., 10-min, hourly and daily). The results indicate that the most effective models are

built with hourly consumption at the floor level providing that high resolution and granular data

is available via advanced smart metering devices. In [77], a Case-Based Reasoning (CBR) model,

categorized as a machine-learning artificial intelligence technique, is proposed to forecast energy

demand in an office building located in Varennes, Quebec, Canada. Three forecasting horizons

of 3-hour, 6-hour and 24-hour ahead have been simulated with hourly prediction resolution, and

the results demonstrate that the prediction capability of the model is improved when the hori-

zon is reduced to 3-hour ahead. Authors in [78] have proposed a new methodology for electrical

consumption forecasting based on end-use decomposition and similar days. Total consumption

forecast is also obtained from end-use consumptions and the data of selected days. In [79], a

building-level neural network-based ensemble model is presented for day-ahead electricity load

forecasting, and it is shown that the presented model outperforms SARIMA (Seasonal Autoregressive

Integrated Moving Average) by up to 50%. However, the comparisons are made only with the SARIMA

model, a linear statistical model that may not be capable of capturing the high nonlinearity

of the building-level electricity load.

To summarize the main points, micro-grids can bring considerable benefits to power systems,

such as supplying loads in remote areas, reducing total system expansion planning cost, reducing

carbon emission through coordinated utilization of Renewable Energy Sources (RESs), provid-

ing cheaper electricity through proper energy management of available resources and energy trade

with the main grid, and improving system reliability and resiliency by providing dispatchable power

for use during peak power conditions or emergency situations. Moreover, it was discussed that a

short-term load forecasting tool is of high importance in optimal energy management and secure

operation of micro-grids. In this way, some research works have been conducted to develop load

forecasting models with higher accuracy. However, as discussed above, few works have focused

on day-ahead load consumption prediction of buildings in micro-grids and consequently, improve-

ment of forecast accuracy is still needed in this area. In the present paper, a forecast method is

proposed for the STLF of micro-grids with the focus of electricity load prediction for individual

buildings. The main contribution of this paper is applying a Self-Recurrent Wavelet Neural Net-

work (SRWNN) forecasting engine for electricity load prediction of micro-grids. Moreover, the

Levenberg-Marquardt (LM) learning algorithm is implemented to train the SRWNN. The proposed

method improves the forecast accuracy for highly volatile and non-smooth time series of micro-

grid electricity load. The higher the forecast accuracy of electricity load, the more efficient energy

management can be achieved in a micro-grid.

The remaining parts of the paper are organized as follows. Section 3.2 provides a data analysis

on different electricity load time series to draw a distinction between the load of a micro-grid and a

power system. The proposed forecasting method consists of the SRWNN as the forecasting engine

Table 3.1: Comparison of electricity load time series in terms of volatility

Volatility index         BCIT    British Columbia’s    California’s
                                 System Load           System Load
Daily volatility (%)     8.34    2.66                  3.18
Weekly volatility (%)    7.09    2.28                  3.15

and LM as the training algorithm, and is presented in Section 3.3. The proposed load forecasting

method is tested on real-world test cases and the results are compared with the results of some

other prediction approaches in Section 3.4. Finally, Section 3.5 concludes the paper.

3.2 Data analysis

A data analysis is presented in this section so as to compare the characteristics of a micro-grid load

time series and electricity load in power systems. The British Columbia Institute of Technology

(BCIT) in Vancouver, the Province of British Columbia (BC), Canada, is considered as the micro-

grid test case studied in this paper. BCIT’s Burnaby campus is Canada’s first Smart Power Micro-

grid comprised of power plants (including renewable resources of wind and photovoltaic modules),

campus loads, command and control (including substation automation, micro-grid control center

and distributed energy management), and communication network [80]. The load data used in this

work is from one building with a peak value of 694 kW from March 2012 to March 2013, within

the BCIT micro-grid. Hereafter, we refer to this load as BCIT. To draw a comparison between the

characteristics of a micro-grid load and power system load level, the load time series of two power

systems, i.e., British Columbia where BCIT micro-grid is located, and California, are analyzed.

Electricity load follows daily and weekly periodicities. In this way, we consider two measures

for volatility analysis, i.e., daily volatility and weekly volatility. These measures are based on the

standard deviation of logarithmic returns over a time window. In general, daily volatility quantifies

the overall change in hourly electricity load from one day to another, and weekly volatility mea-

sures the load changes in subsequent weeks. For more details regarding aforementioned volatility

indices, see [81].
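
As a rough illustration of how such indices could be computed, the following Python sketch takes the standard deviation of logarithmic returns between each hourly load and the load at the same hour one day (24 h) or one week (168 h) earlier. The exact definitions are given in [81]; the lag choice, the use of the whole series as the time window, and the function names are assumptions made here for illustration only.

```python
import math
from statistics import pstdev


def log_return_volatility(load, lag):
    """Standard deviation (in %) of log returns ln(L[t] / L[t - lag]).

    Assumes strictly positive load values and uses the whole series
    as the time window (an assumption, see [81] for the exact index).
    """
    returns = [math.log(load[t] / load[t - lag]) for t in range(lag, len(load))]
    return 100.0 * pstdev(returns)


def daily_volatility(hourly_load):
    # Compare each hour with the same hour of the previous day.
    return log_return_volatility(hourly_load, lag=24)


def weekly_volatility(hourly_load):
    # Compare each hour with the same hour of the previous week.
    return log_return_volatility(hourly_load, lag=168)
```

Under these assumptions, a series that repeats exactly from day to day has zero daily volatility, and the index grows as the day-to-day changes become larger and less regular.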

(a) BC’s power system electricity load (b) Building electricity load in BCIT

Figure 3.1: One-year hourly load data of BC’s power system and the building in BCIT

One year of hourly load data has been considered for British Columbia’s and California’s power

systems for the same period, i.e., from March 2012 to March 2013. Observe from Table 3.1 that

both daily and weekly volatility indices for a micro-grid are considerably higher than those for

power systems, which demonstrates the low smoothness of the micro-grid load time series. For instance,

daily volatility related to the micro-grid is 8.34%, while it is respectively 2.66% and 3.18% for

British Columbia’s and California’s power systems. It means that electricity load of the micro-grid

fluctuates more severely from one day to another compared with that of power systems. Like-

wise, weekly fluctuations are more severe in the micro-grid than those in a power system. As a

result, daily and weekly periodicities of electricity load in a micro-grid are noticeably low, and

consequently, the predictability of such load time series decreases.

Fig. 3.1 illustrates one-year hourly load data of British Columbia’s power system and that of

the building in BCIT. It is noted that the data is normalized to the maximum value. As seen, the

aggregated electricity load in a large area (e.g., the province of British Columbia) is noticeably

different from the aggregated load in a building. For instance, British Columbia’s load follows a

common seasonal pattern, as the load decreases in the spring in April and starts to increase in the

fall in October. The building’s load follows a fairly similar seasonal pattern. From the beginning

of the academic year in September, electricity load starts increasing, and it starts decreasing in

[Figure 3.2, panels (a) 1-hour ramp distribution and (b) 2-hour ramp distribution: number of occurrences versus ramp size in % of the peak load, for the BC power system and the building in BCIT]

Figure 3.2: Distribution of 1-hour and 2-hour ramps

February. Moreover, it is seen that fluctuations of load are more severe for a building compared

with those for a power system. These variations in load time series of the micro-grid graphically

demonstrate its volatility previously shown in Table 3.1 by volatility indices.

To have a better understanding of such severe load fluctuations for a building, hourly changes

of load, i.e., the difference between the two observations at subsequent hours so-called 1-hour

ramps, can be taken into consideration. Fig. 3.2 (a) shows the distribution (with equally spaced

bins of 1% of the peak load) of 1-hour ramps for both the building in BCIT and BC’s power system

loads. Note that the negative values show downward ramps. As seen, hourly upward and downward

ramps with an amplitude of more than 5% of the peak load have occurred more frequently in the

building in BCIT. The most severe ramp in BC’s power system load is a ramp up

with the amplitude of almost 9% of the peak load, while it is a ramp up with more than 20% of the

peak load for the building. Similarly, Fig. 3.2 (b) illustrates the distribution for 2-hour ramps, i.e.,

load variations over a two-hour duration. As a longer time is considered for ramps, larger ramps

Table 3.2: Ramp events in electricity load time series

                                          1-hour                  2-hour
            Interval                Building   BC power     Building   BC power
            (% of the peak load)    in BCIT    system       in BCIT    system
Ramp Up     5% ≤ RU < 10%           556        504          818        971
(RU)        10% ≤ RU < 15%          84         0            362        352
            15% ≤ RU < 20%          12         0            121        55
            RU > 20%                4          0            21         0
Ramp Down   5% ≤ RD < 10%           520        259          1003       1272
(RD)        10% ≤ RD < 15%          48         0            283        141
            15% ≤ RD < 20%          9          0            57         0
            RD > 20%                0          0            11         0

will be detected. Obviously, sharp upward and downward ramps have more frequently happened

in BCIT building load than in BC system load.²

To provide more detailed statistics of ramps, Table 3.2 shows the number of upward and downward

ramps for 1-hour and 2-hour durations. For instance, there have been 100 upward 1-hour ramps of more

than 10% of the peak load in the BCIT building load, while no 1-hour ramp up has occurred with an

amplitude of more than 10% of the peak load in the BC system load. With regard to 2-hour ramps,

there have been 55 ramp ups of more than 15% of the peak load in the BC system load, compared with

142 for the BCIT load. This table also demonstrates that the number of downward ramps is

smaller than the number of upward ramps when large ramps are concerned.
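
The ramp statistics described above can be sketched in Python as follows. This is only an illustrative reconstruction: the function names, the bin labels and the handling of ramps of exactly 20% (placed in the last bin) are assumptions, not details taken from the original analysis.

```python
def ramps(load, hours=1):
    """Load change over `hours` steps, expressed in % of the peak load."""
    peak = max(load)
    return [100.0 * (load[t] - load[t - hours]) / peak
            for t in range(hours, len(load))]


def count_ramps(load, hours=1, edges=(5, 10, 15, 20)):
    """Count ramp-up (RU) and ramp-down (RD) events per magnitude interval."""
    labels = [f"{lo}-{hi}" for lo, hi in zip(edges, edges[1:])] + [f"{edges[-1]}+"]
    counts = {"RU": dict.fromkeys(labels, 0), "RD": dict.fromkeys(labels, 0)}
    for r in ramps(load, hours):
        size = abs(r)
        if size < edges[0]:
            continue  # ramps below the first edge are not counted
        direction = "RU" if r > 0 else "RD"
        # Index of the largest edge that the ramp magnitude reaches.
        idx = max(i for i, e in enumerate(edges) if size >= e)
        counts[direction][labels[idx]] += 1
    return counts
```

Calling `count_ramps(load, hours=1)` and `count_ramps(load, hours=2)` on an hourly series would give counts organised in the same way as the 1-hour and 2-hour columns of Table 3.2.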

Based on the above descriptions, prediction of electricity load time series of a building seems to

be more difficult than that of a power system since high volatility lowers the predictability. Conse-

quently, it is required to adapt a forecasting model so as to cope with the challenging characteristics

of such time series. In the next section, a forecasting model is proposed to capture the dynamic

and volatile behaviour of micro-grid time series.


² This footnote was added in response to a question raised by a committee member in the PhD oral examination.
It should be noted that although such severe ramps might not happen as frequently, it is of high importance to capture
them. This is because missing such sharp ramps might jeopardize the reliability of the system. The risk of power
mismatch could be load curtailment in stand-alone mode of operation. In a grid-connected mode, the risk of power
mismatch is mostly financial as the shortfall/excess power is traded with the grid.

3.3 The forecasting model

The discussion in section 3.2 showed that dealing with micro-grid load time series is a more chal-

lenging task compared with a power system, and therefore, traditional STLF will not result in

satisfactory accuracy in micro-grid load prediction. In this way, the SRWNN forecasting engine

is firstly presented in this section and the training algorithm is then implemented to set the free

parameters of the SRWNN.

3.3.1 Self-Recurrent Wavelet Neural Network

The wavelet theory has been applied through two different approaches for forecast processes. The

first one is using the wavelet transform as a preprocessor to decompose the load time series into

its low and high frequency components. Each component is separately processed by a forecast

engine [82]. The other approach is constructing the wavelet neural network (WNN) in which a

wavelet function is used as the activation function of the hidden neurons of a Feed-Forward Neural

Network (FFNN). The WNN was first introduced in [83] for approximating nonlinear functions.

Due to the local properties of wavelets and the concept of adapting the wavelet shape according to

training data set instead of adapting the parameters of the fixed shape basis function, WNNs have

better generalizability compared to the classical FFNNs, and therefore, these are more appropriate

for the modelling of time series [11].

The SRWNN is a modified model of WNN including the properties of the dynamics of Re-

current Neural Networks (RNNs) [84] and the fast convergence of WNNs, which has successfully

been applied to estimating and controlling nonlinear systems [85]. Since the SRWNN has a self-

recurrent mother wavelet layer, it can store the past information of wavelets and capture

complex nonlinear systems³ well [86]. Having self-feedback loops and input direct terms, SRWNN

has improved capabilities compared to WNN, such as its dynamic response and information storing

ability. Therefore, SRWNN has been applied as a forecast engine in this paper to overcome

the volatile and non-smooth behavior of the load time series in a micro-grid. Moreover, SRWNN

does not include limitations, such as dependency on appropriate tuning of parameters and complex

optimization process, which are likely to be found in models such as Support Vector Machines

(SVMs) [45].

³ This footnote was added in response to a question raised by a committee member in the PhD oral examination.
A linear system is a system in which the output can be represented by a linear combination of inputs. In time series
forecasting, linear regression models are an example of linear systems in which the output (forecast) is a linear
weighted average of a set of inputs. A nonlinear system, however, is a system in which the change of the output is
not proportional to the change of the input. In prediction processes, artificial neural networks are an example of such a
nonlinear mapping function of a set of inputs to the output.

The architecture of the SRWNN, shown in Fig. 3.3, is a feed forward network with four layers.

As seen, $X = [x_1, ..., x_M]$ is the input vector of the forecast engine and $y$ is the target variable.

The inputs $x_1, ..., x_M$ of the forecast engine can be from the past values of the target variable and

past and forecast values of the related exogenous variables. For instance, past values of electricity

load along with the past and forecast values of temperature can be considered for electricity load

prediction, provided that their data is available.

A feature selection technique can be used to refine these candidate features and select the most

effective inputs for the forecast process. In this research work, we use the feature selection method

of [52]. This method is based on the information theoretic criterion of mutual information and

selects the most informative inputs for the forecast process by filtering out the irrelevant and re-

dundant candidate features through two stages. In the first stage, which is called irrelevancy filter,

mutual information between each candidate input, i.e., $x_i(t)$, and the target variable is calculated.

The higher value of mutual information for $x_i(t)$ means the more common information content

of this feature with the target variable. The candidate inputs with computed mutual information

value greater than a relevancy threshold, denoted by $TH_1$, are considered as the relevant features

of the forecast process, which are retained for the next stage. However, other candidate inputs with

mutual information value lower than $TH_1$ are considered as irrelevant features, which are filtered

out. In the second stage, which is called redundancy filter, redundant features among the candidate

inputs selected by the irrelevancy filter are found and filtered out. Two selected candidates, e.g.,

$x_k(t)$ and $x_l(t)$, with a high value of mutual information between them share more information, i.e., have a high

Figure 3.3: Architecture of the SRWNN.

level of redundancy. Thus, the redundancy of each selected feature $x_k(t)$ with the other candidate

inputs is calculated. Then, if the measured redundancy becomes greater than a redundancy threshold,

denoted by $TH_2$, $x_k(t)$ is considered as a redundant candidate input. Hence, between this

candidate and its rival, which has the maximum redundancy with $x_k(t)$, the one with lower relevancy

should be filtered out [52]. The candidate features that remain after the redundancy filter are considered

as the inputs of the load forecasting engine. Moreover, the values of the thresholds

$TH_1$ and $TH_2$ are fine-tuned by cross-validation. Since this method is not the focus of

this paper, it is not further discussed here. The interested reader can refer to [52] for details of this

feature selection method.
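
To make the two-stage idea concrete, the following Python sketch applies an irrelevancy filter and then a redundancy filter using a simple histogram estimate of mutual information. It is not the implementation of [52]: the equal-width binning, the estimator, the function names and the order in which redundant pairs are examined are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def discretize(values, bins=8):
    """Map each value to one of `bins` equal-width bin indices."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant inputs fall into a single bin
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=8):
    """Histogram estimate of the mutual information I(X; Y) in bits."""
    dx, dy = discretize(x, bins), discretize(y, bins)
    n = len(dx)
    pxy, px, py = Counter(zip(dx, dy)), Counter(dx), Counter(dy)
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def select_features(candidates, target, th1, th2):
    """candidates: dict mapping a feature name to its list of samples."""
    # Stage 1 (irrelevancy filter): keep candidates with MI(target) > th1.
    relevance = {k: mutual_information(v, target) for k, v in candidates.items()}
    kept = [k for k in candidates if relevance[k] > th1]
    # Stage 2 (redundancy filter): while some kept feature has redundancy > th2
    # with its closest rival, drop the less relevant of the two.
    changed = True
    while changed and len(kept) > 1:
        changed = False
        for k in kept:
            rival = max((l for l in kept if l != k),
                        key=lambda l: mutual_information(candidates[k], candidates[l]))
            if mutual_information(candidates[k], candidates[rival]) > th2:
                kept.remove(k if relevance[k] < relevance[rival] else rival)
                changed = True
                break
    return kept
```

In this sketch `th1` and `th2` play the roles of the relevancy and redundancy thresholds; in practice they would be tuned, for example by cross-validation as mentioned above.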

Therefore, the target variable is the electricity load of the next time interval, which the forecasting

engine predicts using the past values of electricity load and calendar effects.

Moreover, a multi-period forecast, e.g., load prediction for the next 24 hours, is obtained via recursion,

i.e., by feeding the forecaster’s outputs back as input variables. For instance, the forecasted load for

the first hour is used as y(t − 1) for load prediction of the second hour provided that y(t − 1) is

among the selected candidate inputs of the feature selection technique.
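
The recursive scheme described above can be sketched as follows. The one-step model is passed in as a plain callable, and the names `one_step_model`, `history` and `lags` are hypothetical; calendar inputs are omitted to keep the example short.

```python
def recursive_forecast(one_step_model, history, lags, horizon=24):
    """Forecast `horizon` steps ahead by feeding predictions back as inputs.

    one_step_model: callable mapping a list of lagged values to one forecast
    history:        past observations, most recent last
    lags:           selected lag orders, e.g. [1, 24, 168]
    """
    series = list(history)
    forecasts = []
    for _ in range(horizon):
        inputs = [series[-lag] for lag in lags]
        y_hat = one_step_model(inputs)
        forecasts.append(y_hat)
        # The forecast now plays the role of y(t-1) for the next step.
        series.append(y_hat)
    return forecasts
```

With this structure, forecast errors made in the first hours can propagate to later hours whenever short lags are among the selected inputs, which is one reason the choice of lags matters for day-ahead horizons.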

The input layer of the forecast engine transmits M input variables, which are selected by the

feature selection technique, to the next layer without any changes. The second layer, which is

called the wavelet layer, consists of N × M neurons, each of which has a self-feedback loop. In this

paper, Morlet wavelet function has been considered as the activation function of neurons in the

mother-wavelet layer, which is defined as follows:

$$\psi(x) = e^{-0.5x^{2}} \cos(5x) \tag{3.1}$$

In SRWNN, a wavelet of each node is derived from its mother wavelet as below:

$$\psi_{i,j}(r_{i,j}) = \psi\left(\frac{u_{i,j} - b_i}{a_i}\right), \qquad r_{i,j} = \frac{u_{i,j} - b_i}{a_i} \tag{3.2}$$

where $\psi_{i,j}$ is the scaled and shifted version of Morlet mother wavelet with $a_i$ and $b_i$ as the scale

and shift parameters, respectively. In addition, the inputs of the wavelets in (3.2) are as follows:

$$u_{i,j} = x_j + \psi_{i,j} z^{-1} \cdot \theta_{i,j} \tag{3.3}$$

where $z^{-1}$ is the time delay; thus, the input of this layer contains the memory term $\psi_{i,j} z^{-1}$, which

can store the past information of the network, and $\theta_{i,j}$ denotes the weight of the self-feedback loop,

which represents the rate of information storage. This feature is the main difference between a

SRWNN and a WNN. In fact, the SRWNN is the same as WNN when all $\theta_{i,j}$ are equal to zero.

However, it is noted that the initial values for $\theta_{i,j}$ are usually considered zero, which means there

is no feedback initially.

M-dimensional wavelet functions are constructed by the tensor product of one-dimensional

Morlet wavelets in the third layer as follows:


$$\Psi_i = \prod_{j=1}^{M} \psi_{i,j}, \qquad i = 1, 2, ..., N \tag{3.4}$$

The output of the SRWNN, denoted by y, is finally computed as:


$$y = \sum_{i=1}^{N} w_i \cdot \Psi_i + \sum_{j=1}^{M} v_j \cdot x_j + g \tag{3.5}$$

where $w_i$ is the weight between the $i$th neuron of the product layer and the output node, $v_j$ is the

direct input weight between the $j$th input and the output node, and $g$ is the bias of the output node.

Therefore, the output of SRWNN is obtained from a combination of multi-dimensional wavelet

functions, i.e., $\Psi_i$, as well as a combination of inputs, i.e., $x_j$. In other words, the proposed model

not only can benefit from the capabilities of wavelet functions, such as their ability to capture

cyclical behaviors, but also can capture trends of the signal. In addition, SRWNN can benefit from

its dynamic response by storing the past information of wavelets in self-feedback loops (equation

3.3) to capture complex nonlinearities. Based on the aforementioned formulation, the vector of the

free parameters of the SRWNN is denoted by P as follows:

$$P = [v_j, w_i, a_i, b_i, \theta_{i,j}, g], \qquad i = 1, ..., N, \quad j = 1, ..., M \tag{3.6}$$

Therefore, the SRWNN has $NP = M + 3N + M \times N + 1$ free parameters, which are determined by

the training method. It should also be noted that the SRWNN model presented in this paper differs

from the SRWNN proposed in [86]. There are two differences between these two models. First,

there is an additional external bias (i.e., $g$) to the output layer of the presented SRWNN in this

work. A bias can increase or lower the net input of the activation function, depending on whether

it is positive or negative, respectively [87]. Consequently, biases can enhance the input/output

mapping function by adding another feature to neural networks. Second, Morlet wavelet functions

have been used as the activation functions in the wavelet layer of the SRWNN in this paper, while the

second derivative of the Gaussian function, i.e., the Mexican hat wavelet function, was used in the previous

version of reference [86]. Although the Mexican hat wavelet function has successfully been used in

WNN models for forecasting applications due to its superiorities over Daubechies wavelets [11], it

has been shown that Morlet wavelets outperform Mexican hat wavelets for prediction applications

[12, 53]. Therefore, we applied Morlet mother wavelets as the activation functions in SRWNN in

our paper.
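
The forward computation defined by (3.1) to (3.5), together with the parameter count stated after (3.6), can be sketched in plain Python as below. This is a minimal illustration rather than the implementation used for the reported results: the class name, the initial values of the scales, shifts and weights, and the way the delayed wavelet outputs are stored are assumptions.

```python
import math


def morlet(x):
    """Morlet mother wavelet, Eq. (3.1)."""
    return math.exp(-0.5 * x * x) * math.cos(5.0 * x)


class SRWNN:
    def __init__(self, n_inputs, n_wavelons):
        self.M, self.N = n_inputs, n_wavelons
        self.a = [1.0] * self.N                      # scale parameters a_i (assumed init)
        self.b = [0.0] * self.N                      # shift parameters b_i (assumed init)
        self.theta = [[0.0] * self.M for _ in range(self.N)]  # self-feedback weights, zero initially
        self.w = [0.0] * self.N                      # product-layer to output weights
        self.v = [0.0] * self.M                      # direct input weights
        self.g = 0.0                                 # output bias
        # Wavelet outputs of the previous time step (the z^-1 memory term).
        self.memory = [[0.0] * self.M for _ in range(self.N)]

    def forward(self, x):
        y = self.g + sum(vj * xj for vj, xj in zip(self.v, x))
        for i in range(self.N):
            product = 1.0
            for j in range(self.M):
                u = x[j] + self.memory[i][j] * self.theta[i][j]   # Eq. (3.3)
                psi = morlet((u - self.b[i]) / self.a[i])         # Eq. (3.2)
                self.memory[i][j] = psi                           # stored for the next call
                product *= psi                                    # Eq. (3.4)
            y += self.w[i] * product                              # Eq. (3.5)
        return y

    def n_parameters(self):
        # v_j (M), w_i, a_i, b_i (3N), theta_ij (M*N) and g (1).
        return self.M + 3 * self.N + self.M * self.N + 1
```

With all `theta` values at zero the network behaves as a WNN and repeated calls with the same input return the same output; once a feedback weight is non-zero, the output also depends on the previous call.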

3.3.2 The training algorithm

In this subsection, a training algorithm is implemented to set the free parameters of the SRWNN

denoted by P in (3.6). Since the mother wavelet function used in the SRWNN, i.e. Morlet wavelet

function, is differentiable with respect to all free parameters, the Levenberg-Marquardt (LM) learn-

ing algorithm can be used in this regard. This learning algorithm was applied to train the neural

networks by Hagan and Menhaj in [88]. Due to the advantages of the LM algorithm, such as

accurate training and fast convergence, it has been recommended in many research works, and

therefore, it is implemented for training the SRWNN in this paper. The LM algorithm is briefly

described in Appendix C and its implementation on the SRWNN is then presented.

Moreover, the termination criterion used for the training of the SRWNN is based on early-

stopping technique. Accordingly, the whole available data is divided into training and validation

samples. The SRWNN is trained using the training samples and the error for validation samples is

monitored in each iteration. When the validation error keeps rising for some number of iterations,

usually five, the training phase is stopped and the values of the free parameters relating to the

iteration with the least validation error are stored as the final solution of the training algorithm.
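
A minimal sketch of such an early-stopping loop is given below. The training step itself is abstracted as a callable (`update_step`, which could be one Levenberg-Marquardt iteration), and both callables and the `patience` parameter are hypothetical names. The sketch counts consecutive iterations without improvement over the best validation error, which is one common reading of the criterion and is stated here as an assumption.

```python
import copy


def train_with_early_stopping(params, update_step, validation_error,
                              patience=5, max_iter=1000):
    """Return the parameters with the lowest validation error seen so far."""
    best_params = copy.deepcopy(params)
    best_err = validation_error(params)
    worse = 0
    for _ in range(max_iter):
        params = update_step(params)          # one training iteration
        err = validation_error(params)
        if err < best_err:
            best_params, best_err, worse = copy.deepcopy(params), err, 0
        else:
            worse += 1
            if worse >= patience:             # validation error stopped improving
                break
    return best_params, best_err
```

Returning the stored best parameters, rather than the last ones, corresponds to keeping the values of the iteration with the least validation error.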

3.4 Numerical results

In this paper, we mainly focus on 24-hour ahead load prediction with hourly forecast steps. Day-

ahead load forecasting can bring significant operational advantages for energy management of

micro-grids. For instance, the BCIT micro-grid consists of different types of generating units (e.g.,

thermal, wind and PV units), and day-ahead load predictions are used for energy management pur-

poses. In other words, optimal utilization of available resources is achieved using load forecasting

in order to minimize the operation cost for the BCIT campus micro-grid. Moreover, as this micro-

grid can operate in both stand alone and grid-connected modes, accurate load forecasts can be used

for profitable trade of electric energy within the British Columbia power system.

The same load time series data of the building in BCIT and two power systems are used for nu-

merical experiments of this section. Based on the data analyses presented in section 3.2, electricity

load not only depends on the load profile of the previous day, i.e., daily periodicity, but also the

load pattern of the previous week, i.e., weekly periodicity. To capture such patterns, 192 candidate

inputs have been considered as lagged hourly load data, i.e., $\{L_{t-192}, ..., L_{t-1}\}$, where $L_t$ indicates

the electricity load at time t. The feature selection technique selects the most informative lagged

load values from these candidate inputs. Calendar information is also highly important for a load

forecasting model so as to capture weekly and seasonal patterns. For instance, either considering

the day of the week or differentiating weekends and weekdays is a common way presented in the

literature [5, 64, 89]. Thus, weekends and holidays are considered in this work using a binary vari-

able for detecting weekends and holidays from weekdays. The month of the year is also used in

some cases [89]; however, it is not considered in this paper since the seasonality factor is already

captured, as the model is re-trained every day. Furthermore, temperature data as an exogenous vari-

able has been used to improve load forecasting prediction since temperature time series usually has

high relevancy to electricity consumption time series [64, 66, 67, 90]. Accordingly, based on publicly

available data, seven daily values of temperature for the previous week (i.e., $T_{d-7}, ..., T_{d-1}$),

and the daily forecast value of the temperature for the prediction day (i.e., $T_d$) were first considered

for the model, where $T_d$ represents the average daily temperature for day $d$. However, numerical

experiments for the BCIT test case revealed that low resolution temperature data, i.e. daily data,

cannot improve the accuracy for hourly load forecast. Therefore, we tested historical hourly tem-

perature data (located in Vancouver) and also used the same time series for temperature forecasts,

i.e., perfect forecasts, in order to observe if hourly temperature data can enhance the forecast results

for BCIT test case. For this purpose, lagged hourly temperature data, i.e., $\{T_{t-192}, ..., T_{t-1}\}$, are

considered as 192 candidate inputs that feed the feature selection stage along with the 192 candidate

inputs for load data. The feature selection technique then selects the most informative candidates

among the candidates of load and temperature and transfers them to the model. Only a few

temperature inputs are among the selected inputs, which shows the low correlation

between the temperature time series and the load time series of BCIT. The low correlation results from the

fact that the electric load of this building is mainly lighting. Considering the mild temperatures

in Vancouver, the heating load is not as significant. The numerical results also supported this low

correlation, as hourly temperature data with even perfect forecasts could not improve the forecast

accuracy of the model. Therefore, temperature inputs are not considered for the numerical results

in this paper.
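
The candidate input set described above, i.e., 192 lagged hourly loads plus a binary day-type flag, could be assembled as in the following sketch. The function name and the data layout (one row per target hour, flag appended last) are assumptions; the feature selection stage would then operate on the columns of `inputs`.

```python
def build_samples(load, is_workday, n_lags=192):
    """Build candidate inputs and targets from an hourly load series.

    load:       list of hourly load values
    is_workday: list of 0/1 flags per hour (0 = weekend or holiday, 1 = weekday)
    Returns (inputs, targets); each input row is [L(t-192), ..., L(t-1), flag(t)].
    """
    inputs, targets = [], []
    for t in range(n_lags, len(load)):
        row = list(load[t - n_lags:t]) + [is_workday[t]]
        inputs.append(row)
        targets.append(load[t])
    return inputs, targets
```

Each row therefore holds 193 candidate values, from which only the most informative ones are retained before training.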

To show the effectiveness of different forecasting engines, SRWNN is compared with two other

efficient neural network-based forecasting models, i.e., WNN and Multi-Layer Perceptron (MLP).

It is noted that statistical models (e.g., Autoregressive Integrated Moving Average (ARIMA) model)

are not considered in this paper since such techniques are basically linear methods and have limited

capability to capture nonlinearities in the load series [91, 92]. Therefore, we chose two efficient

Computational Intelligence (CI) based models, i.e., MLP as an efficient Feed Forward Neural

Network (FFNN) and WNN as an effective model combining nonlinear mapping merits of FFNNs

and wavelet functions, as benchmarks in our comparative results.

Hence, 10 test months of hourly load data from the building in BCIT from May 2012 to Febru-

ary 2013 are considered for 24-hour ahead load prediction. It is noted that the first two months

of the historical data are used for training of the forecast engine, and so the results of the first two

months cannot be presented here. Two error criteria are used in this paper to evaluate forecast er-

rors: (i) normalized Root Mean Square Error (nRMSE) and (ii) normalized Mean Absolute Error

(nMAE), defined as follows:


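$$\mathrm{nRMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \left( \frac{L_{ACT}(t) - L_{FOR}(t)}{L_{Peak}} \right)^{2}} \times 100 \tag{3.7}$$

$$\mathrm{nMAE} = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{L_{ACT}(t) - L_{FOR}(t)}{L_{Peak}} \right| \times 100 \tag{3.8}$$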

where $L_{ACT}(t)$ and $L_{FOR}(t)$ indicate the actual and forecast values of electricity load for hour $t$.

Moreover, $N$ indicates number of hours for each month, and $L_{Peak}$ is the peak value of the electricity

load over the year, which is 694 kW for this test case. Observe from Table 3.3 that SRWNN out-

performs the other forecasting models in all test months and in terms of both nRMSE and nMAE.

For instance, the average nRMSE and average nMAE of SRWNN are (5.67-4.98)/5.67=12.1%

Table 3.3: Forecasting errors, in %, of SRWNN, WNN and MLP for 10 test months.

            MLP               WNN               SRWNN
Month       nRMSE   nMAE      nRMSE   nMAE      nRMSE   nMAE
May         8.44    6.22      5.96    4.05      5.23    3.80
Jun.        9.92    7.55      5.44    4.27      4.86    3.80
Jul.        10.41   7.92      7.04    5.26      5.43    4.01
Aug.        10.40   8.14      6.57    4.95      6.46    4.80
Sep.        11.88   8.40      7.83    6.01      6.28    4.82
Oct.        10.45   7.68      4.81    3.83      4.24    3.28
Nov.        6.34    4.89      4.62    3.56      4.30    3.21
Dec.        6.21    4.58      4.54    3.35      4.22    3.05
Jan.        6.93    4.74      4.86    3.40      4.25    3.11
Feb.        6.85    5.29      5.06    3.94      4.58    3.50
Average     8.78    6.54      5.67    4.26      4.98    3.74

and (4.26-3.74)/4.26=12.2% lower than those of WNN, and (8.78-4.98)/8.78=43.2% and (6.54-

3.74)/6.54=42.8% lower than those for MLP, respectively. This table demonstrates that for a highly

volatile time series, i.e. micro-grid electricity load, a SRWNN forecasting model can more effi-

ciently cope with the variations and non-smooth behavior of the time series.
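
The two error criteria and the relative improvements quoted above can be reproduced with a few lines of Python. The function names are hypothetical. Recomputing the improvements from the rounded averages of Table 3.3 gives values very close to those quoted; differences in the last digit would be consistent with rounding of the tabulated averages, which is an assumption rather than something stated in the source.

```python
import math


def nrmse(actual, forecast, peak):
    """Normalized root mean square error in %, following Eq. (3.7)."""
    n = len(actual)
    return 100.0 * math.sqrt(
        sum(((a - f) / peak) ** 2 for a, f in zip(actual, forecast)) / n)


def nmae(actual, forecast, peak):
    """Normalized mean absolute error in %, following Eq. (3.8)."""
    n = len(actual)
    return 100.0 * sum(abs(a - f) / peak for a, f in zip(actual, forecast)) / n


def relative_improvement(baseline, proposed):
    """Error reduction of `proposed` with respect to `baseline`, in %."""
    return 100.0 * (baseline - proposed) / baseline


# Average errors from Table 3.3 (nRMSE, nMAE).
print(round(relative_improvement(5.67, 4.98), 2))  # SRWNN vs. WNN, nRMSE
print(round(relative_improvement(8.78, 4.98), 2))  # SRWNN vs. MLP, nRMSE
```

For the BCIT test case, `peak` would be the yearly peak of 694 kW mentioned above.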

Moreover, Fig. 3.4 illustrates the carpet charts of monthly mean absolute errors for different

hours of the day for SRWNN and WNN on BCIT test case. This figure clearly shows that large

errors for both models usually occur between 12:00 and 16:00, when the load peaks. However,

this colormap shows lower errors during the peak hours for SRWNN in comparison with the

WNN. More importantly, the superiority of SRWNN over WNN is revealed during the upward

ramps in the morning. As analyzed in section 3.2, sharp upward ramps occur more than downward

ramps for BCIT test case, and consequently, any improvements in forecasting ramp up events can

considerably enhance the forecast accuracy of this load time series. Fig. 3.5 demonstrates the

average of mean absolute errors for all 10 months. According to these two curves, SRWNN shows

lower yearly errors during the morning ramp, which usually occurs from 7:00 AM to 12:00 PM. In

addition, there is an improvement in ramp down forecasting from 16:00 to 18:00.

Curves of generated forecasts and real data for a good forecasting day, i.e., November 15,

and a bad forecasting day, i.e., September 7, are shown in Fig. 3.6. Fig. 3.6(a) shows that

(a) SRWNN (b) WNN

Figure 3.4: Mean absolute error (kW) of different hours of the day in different months
[Figure 3.5 plot: mean absolute error (kW) versus hour of the day (1 to 24) for SRWNN and WNN]

Figure 3.5: 10-month mean absolute error for different hours of the day

there are sharp changes and variations on September 7. Sharp spikes could result from high temperatures on specific days, which increase the electricity consumption of the buildings for air conditioning. As a consequence of such severe ramps, the forecasting model has difficulty capturing these large, sudden variations in electricity load. The major error is the magnitude error that occurred during the peak load. In contrast, the variations on November 15, shown in Fig. 3.6(b), are smoother, so the forecasting model could capture the upward ramp well. This illustrates that high volatility and sharp ramps distinguish micro-grid load time series from power system loads and make such time series harder to predict.

In the next experiment, forecasting errors of different days of the week for the same 10-month

Figure 3.6: Samples for (a) a bad forecasting day and (b) a good forecasting day

period are considered separately to observe the users’ behavior. Note that the electricity consumption of the building is mainly from lighting, as mentioned earlier in this section. Here, users’ behavior is represented by including the calendar effect as an input of the model. A binary variable differentiates weekends and holidays from weekdays: zero represents weekends and holidays, while one represents weekdays. Fig. 3.7 demonstrates the forecasting

errors with and without the calendar effects. First, observe that the average of nRMSE considering

the calendar effect, i.e., 4.78%, is lower than that when the calendar effect is not included, i.e.,

5.32%. Moreover, according to the figure, the highest error occurs on Mondays, which is the first

working day at the campus. Calendar inputs can efficiently capture such behavior of the users. For

instance, the forecasting error in terms of nRMSE for Monday considerably decreased from 7.32%

to 5.83% when the calendar effect is taken into account. In addition, the standard deviation of the

error associated with different days of week has decreased from 0.97% to 0.57% using the calendar

effect. In other words, the model performs in a more robust way for predicting different days of

Figure 3.7: Forecasting errors for different days of the week

the week. According to Fig. 3.7, the difference between the maximum and the minimum errors

with calendar, i.e., 1.85%, and without calendar, i.e., 2.83%, also shows the better performance of

the model including the calendar effect. As a result, users’ behavior can be efficiently captured by

considering calendar effects in order to improve the forecast accuracy.
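The binary calendar input described above can be sketched as follows. This is only an illustration, not the implementation used in this work: the function name `calendar_flag` and the holiday set are hypothetical, and the dates are arbitrary examples.

```python
from datetime import date


def calendar_flag(day, holidays=frozenset()):
    """Return 1 for a weekday, 0 for a weekend day or a listed holiday."""
    # date.weekday(): Monday is 0, ..., Saturday is 5, Sunday is 6.
    if day.weekday() >= 5 or day in holidays:
        return 0
    return 1


# Hypothetical holiday list, for illustration only.
example_holidays = frozenset({date(2017, 12, 25)})

wednesday_flag = calendar_flag(date(2017, 11, 15), example_holidays)
saturday_flag = calendar_flag(date(2017, 11, 18), example_holidays)
holiday_flag = calendar_flag(date(2017, 12, 25), example_holidays)
```

The flag would be appended to the other model inputs for the hour being forecast.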

In the last experiment, the proposed forecasting model is applied to predict two power system

time series. The main goal of this numerical experiment is to show how forecast accuracy of

SRWNN improves, compared with WNN, as the volatility of the time series increases. Hence, from

a power system with low volatility to one with higher volatility, forecast accuracy improvements

increase for SRWNN. In this way, the same test cases for British Columbia’s and California’s power

systems are considered. Table 3.4 shows the obtained forecast error results (based on the average

of 10-month error) for both SRWNN and WNN models. Firstly, this table demonstrates noticeably

lower forecast errors of both models for prediction of power systems’ load data compared with

those for a micro-grid illustrated in Table 3.3. For instance, the nRMSE of SRWNN is 4.98% for

the micro-grid compared with 2.29% for British Columbia’s power system. Besides, Table

3.1 shows the volatility for British Columbia’s power system time series is the lowest in terms of

both daily and weekly volatility indices. Consequently, it is expected to have higher predictability

Table 3.4: Forecasting errors (%) of SRWNN and WNN for two power systems.

Power System        WNN nRMSE   WNN nMAE   SRWNN nRMSE   SRWNN nMAE   Improvement nRMSE (%)   Improvement nMAE (%)
British Columbia    2.46        1.81       2.29          1.69         6.9                     6.6
California          3.67        2.57       3.38          2.37         7.9                     7.8

for British Columbia’s power system compared to the micro-grid and California’s power system.

Table 3.4 indicates the forecasting errors for British Columbia are lower than those for California,

e.g., 2.46% compared to 3.67% in terms of nRMSE for WNN.

Secondly, Table 3.4 shows how effective the SRWNN becomes as the volatility of a time series

increases. As seen from the last two columns of Table 3.4, the forecast accuracy improvements

obtained from SRWNN in terms of nRMSE and nMAE are respectively (2.46-2.29)/2.46=6.9%

and (1.81-1.69)/1.81=6.6% for BC’s power system. Similarly, there are 7.9% and 7.8% forecast

accuracy improvements in terms of nRMSE and nMAE for California’s power system, respec-

tively. Therefore, since the volatility of California’s power system is higher than that for British

Columbia’s, SRWNN obtained higher improvement of forecast accuracy compared with WNN for

California’s power system. In other words, California’s load contains higher daily and weekly

volatilities, and consequently, the SRWNN can capture these variations and provide more accurate

forecasts compared with WNN. To have a better sense of these percentage errors, forecast accuracy

improvement in terms of mean absolute error is around 93 MW, which is almost twice as big as the

capacity of Kumeyaay wind farm, i.e. 50 MW, located in San Diego, California [93]. As a result,

as the volatility of the time series increases, the performance of SRWNN improves in comparison

with WNN. As mentioned earlier in this section, load forecast accuracy can be improved using

weather forecast data as exogenous inputs to the forecasting model. For instance, load forecasting

models utilized in California ISO (CAISO) include weather forecasts, such as temperature, dew

point, wind speed and cloud cover, for the next 9 days from 24 weather stations [94]. It is noted

that including such exogenous inputs to the model depends on the availability of the public data.
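The improvement percentages quoted above follow from a simple relative-error formula. The short sketch below recomputes them from the error values in Table 3.4; the function and variable names are illustrative only.

```python
def improvement_pct(baseline_error, new_error):
    """Relative error reduction of the new model over the baseline, in percent."""
    return 100.0 * (baseline_error - new_error) / baseline_error


# (WNN error, SRWNN error) pairs taken from Table 3.4, in percent.
table_3_4 = {
    ("British Columbia", "nRMSE"): (2.46, 2.29),
    ("British Columbia", "nMAE"): (1.81, 1.69),
    ("California", "nRMSE"): (3.67, 3.38),
    ("California", "nMAE"): (2.57, 2.37),
}

improvements = {
    key: round(improvement_pct(wnn, srwnn), 1)
    for key, (wnn, srwnn) in table_3_4.items()
}
```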

The computation time of the SRWNN model for the training phase is less than 35 seconds for one-day prediction for the test cases of this paper, measured on a Mac with a 2.7 GHz Intel Core i5 processor and 12 GB of RAM. Although this computation time is larger than that for WNN, which is less than 11 seconds, it is acceptable within a 24-hour decision-making framework and shows the fast forecasting performance of the proposed method.

3.5 Conclusions

STLF is an important tool for reliable and economic operation of power systems as many operating

decisions are based on load forecast, e.g., dispatch scheduling of generating units, security assess-

ment and demand side management. Likewise, precise STLF for a micro-grid can enhance the

management of its renewable and conventional resources and improve the economics of energy

trade with electricity markets. Considering volatile and non-smooth characteristics of load time

series of micro-grids compared with power systems’ electricity load, a new forecasting method is

proposed to deal with such challenges in this paper. The proposed method has the structure of a

SRWNN as the forecasting engine, in which feedback loops have been added to a WNN so as to

better capture nonlinear complexities of volatile time series. The LM learning algorithm is im-

plemented to train the SRWNN, i.e., adjusting the free parameters of the SRWNN. High volatility

of a micro-grid load was shown by defining a volatility criterion and comparing with the volatil-

ity of two power systems’ load data. The effectiveness of the proposed forecasting method was

demonstrated by real-world load data of a micro-grid and power systems. The results show that the

proposed SRWNN model leads to more accurate forecasts when a volatile time series prediction is

of interest.

Acknowledgements

Partial support for this work came from the Natural Sciences and Engineering Research

Council of Canada (NSERC) and the ENMAX Corporation under the Industrial Research Chairs program.

Moreover, the authors would like to thank Dr. Hassan Farhangi and Dr. Ali Palizan of British

Columbia Institute of Technology (BCIT) for providing data and invaluable insight.

Chapter 4

Electricity Price Forecasting for Operational Scheduling of

Behind-the-meter Storage Systems 1

4.1 Introduction

The market for behind-the-meter (BTM) battery energy storage is growing rapidly. For example,

in the third quarter of 2015, more than 13 MW of such units were installed, which indicates a

15 times year-over-year growth [95]. Behind-the-meter storage enables consumers to have more

control over their own energy usage [96]. Depending on the market structure, the use of BTM

storage may vary. Where flat rates are charged by utilities, a common use of BTM storage is for

demand charge management [97]. In competitive markets where large customers are charged at

wholesale pool price rates, BTM storage could be employed for avoiding peak prices. In such

cases, an insight into price fluctuations is essential for the optimal use of the storage system. The

focus of this work is the operation of a storage system owned by a large consumer that purchases

electricity from the wholesale market, and thus, needs short-term price forecasting (STPF) for

operation scheduling.

Most electricity price forecasting studies in the literature have considered short horizons because of the proximity to real-time operation [98]. Different methodologies have been proposed for point, probabilistic, and threshold forecasting of the electricity market price [98]. While point forecasting approaches provide single-valued predictions [99], probabilistic forecasting models quantify the uncertainties associated with point forecasts using prediction intervals [100–102]. Alternatively, certain applications, e.g., demand-side management,


1 © 2017 IEEE. Reprinted, with permission, from [14]: H. Chitsaz, P. Zamani-Dehkordi, H. Zareipour, P. Parikh, “Electricity Price Forecasting for Operational Scheduling of Behind-the-meter Storage Systems”, IEEE Transactions on Smart Grid, in press.

do not require exact values for future prices but apply pre-specified price thresholds as the ba-

sis for the decision-making process [103]. Therefore, the threshold forecasting models work as

classification problems and simply provide different classes for future prices [104].

Moreover, a few research works have presented methodologies aiming to detect price spikes

using statistical [105–107] or data mining approaches [108–110]. In [105], a recursive dynamic

factor analysis (RDFA) is combined with a Kalman filter model for electricity price forecasting. It is shown that the proposed model has better forecasting accuracy than three other approaches in the presence of price spikes; however, a solid price spike detection analysis has not been provided.

In [106], authors proposed a closed loop prediction mechanism including neural networks and

feature selection techniques to predict both price spike occurrences and values. An autoregres-

sive approach has been used to model the time series of price spikes in the Australian electricity

market in [107]. Classification techniques along with feature selections [108, 110] and similarity

searching methods [109] have also been applied to detect price spikes. However, these proposed

methodologies are based on analyzing the historical data. Such analyses might not be reliable

in practice because the most influential factors creating price spikes are unplanned generation or

transmission line outages, which are neither predictable nor possible to model [103]. In addition, price spikes are identified using fixed price thresholds in the presented works. Although simple to implement, a fixed threshold may not be effective, as price statistics vary significantly from month to month.

Most STPF models developed in the literature use hourly historical/forecast data of different

explanatory variables [111] to predict hourly prices. In [112], the authors showed that using intra-day prices with 30-min resolution could improve the short-term forecasts of base-load electricity prices in the U.K. market. However, the price settlement process in real-time markets has a higher resolution than hourly. For instance, Market Clearing Prices (MCPs) are set every five minutes in Ontario’s [113], California’s [114], Texas’ [115] and New York’s electricity markets [116], and every minute in Alberta’s electricity market [117]. Such high-resolution

market data contains recent updates on the electricity market conditions, and can be used along

with hourly market data for predicting the hourly prices. The major benefit from utilizing higher

resolution data is to capture price variations and detect price spikes efficiently.

In our previous works (e.g., [104, 118, 119]), we demonstrated that unlike load forecasting

where predicting the absolute value of demand is critical in operation scheduling, price forecasting

accuracy should be measured in terms of economic savings/gains. Thus, the main contribution

of the present paper is to propose an intra-hour rolling-horizon electricity price forecasting strat-

egy that is specifically developed to optimize the operation of behind-the-meter Battery Energy

Storage Systems (BESS). Hence, two objectives are of interest in this paper: an efficient price

forecasting tool as well as optimal operation of the BESS using generated price forecasts. The

proposed strategy takes advantage of high-resolution five-minute market clearing price data in the

real-time market to generate hourly market price forecasts. Using intra-hour market clearing price

information enables the method to detect if the current hour is going to have a high price. With the

capability to capture high prices, the battery is discharged accordingly to offset energy purchase

from the grid, and thus save on energy costs. A simple regression-based forecasting model is de-

veloped that is computationally very light and can be effectively used on micro-grid management

firmware systems.

The remaining parts of the paper are organized as follows. The operation scheduling of a BESS

and the proposed price forecasting strategy are described in section 4.2. In section 4.3, Ontario’s

electricity market is briefly presented as the case study in this paper. Afterwards, the proposed

forecasting strategy is evaluated from both statistical and economic perspectives. Finally, section

4.4 concludes the paper.

4.2 Methodology

In this section, the operation of a BESS is outlined considering different forecasting perspectives.

The components of the proposed forecasting strategy are then described.

4.2.1 Operation of a BESS

With a behind-the-meter BESS, energy can be stored during off-peak hours when prices are low

and then injected back to load/grid during peak periods to reduce the amount of energy purchased

from the wholesale market at high prices. The price-based optimal operation scheduling of such a

system can be presented as:


$$\max_{P_t} \; S = \sum_{t=1}^{T} P_t \, \hat{\lambda}_t \tag{4.1}$$

subject to $\Phi$

where the objective function is to maximize the net energy arbitrage saving, denoted by S. T

represents the scheduling horizon (e.g., T = 24 for day-ahead scheduling), λ̂t ($/MWh) is the

price forecast for hour t, and Pt (MW) is the variable showing the power consumed (Pt < 0)

from the grid or injected (Pt > 0) back to the micro-grid by the BESS at hour t. The objective

function is subject to a set of battery operation and technical constraints, denoted by Φ, such as,

charging/discharging rates, rated energy capacity, battery state of charge and number of cycles per

day [120–122], as follows:


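$$E_t = E^{ini} + \sum_{k=1}^{t} P_k \tag{4.2}$$

$$E^{emg} + (1 - DOD) \cdot E^{max} \le E_t \le E^{max} \tag{4.3}$$

$$P^{ch,max} \le P_t \le P^{dis,max} \tag{4.4}$$

$$-M \cdot u_t \le P_t \le M \cdot (1 - u_t) \tag{4.5}$$

$$\sum_{t=1}^{T} u_t \le N_{Cycle} \tag{4.6}$$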

Equations (4.2) and (4.3) ensure that the energy stored in the BESS at hour ending t, denoted

by Et , is within the allowable range. Pt is the amount of scheduled power for charging (Pt < 0)

and discharging (Pt > 0). Eini is the initial stored energy in the battery, Eemg is the energy related

to the emergency load (150 kWh), and Emax is the maximum capacity for the battery (500 kWh).

Pch,max and Pdis,max are maximum charging and discharging rates. Equations (4.5) and (4.6) are

auxiliary constraints to count the number of full cycles per day, denoted by NCycle . M is a large

positive number and ut is defined as a flag variable that is set to ut = 1 when the battery is in the

charging mode. It should be noted that this formulation only includes the basic, most dominant

features of a storage unit in order to show the impact of price forecasts on the BESS’s operation.

The solution of this optimization problem is the scheduled power of the BESS for each hour,

denoted by Pt∗ . The Mixed-Integer Linear Programming (MILP) solver of the optimization toolbox

in MATLAB was used to solve the optimization problem of battery operation. When the actual

prices, λt , are realized, the after-the-fact value of the net energy arbitrage saving (S ∗ ) can be

calculated as:
$$S^* = \sum_{t=1}^{T} P_t^* \, \lambda_t \tag{4.7}$$
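As a small worked illustration of Eq. (4.7), the after-the-fact saving is the sum of the scheduled power times the realized price over the horizon. The sketch below assumes hourly steps, so that MW times $/MWh gives dollars per hour; the function name and the numbers are hypothetical.

```python
def arbitrage_saving(schedule_mw, realized_prices):
    """Net energy arbitrage saving: sum of P_t* times lambda_t over the horizon.

    schedule_mw: scheduled power per hour (negative = charging, positive = discharging).
    realized_prices: actual hourly prices in $/MWh, same length as the schedule.
    """
    if len(schedule_mw) != len(realized_prices):
        raise ValueError("schedule and prices must cover the same hours")
    return sum(p * lam for p, lam in zip(schedule_mw, realized_prices))


# Toy three-hour example: charge at a low price, idle, discharge at a high price.
saving = arbitrage_saving([-0.1, 0.0, 0.1], [10.0, 20.0, 50.0])
```

In this toy case, charging costs 1 $ and discharging avoids 5 $ of purchases, for a net saving of about 4 $.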

In this approach, the schedules are set once for the whole scheduling horizon with no changes. A common practice for dealing with forecast uncertainty is to update the forecasts, and hence the schedules, at each forecasting step (e.g., each hour). This is called the Rolling Horizon (RH) approach, which has successfully been

applied to a few scheduling and energy management problems under load and wind uncertainties

in the literature [123, 124]. Basically, the optimization problem is run on an hourly basis using

the updates on price forecasts. As a result, schedules of the BESS can efficiently be updated once

price forecasts are updated.

This approach is presented in Algorithm (1), where h represents the forecasting origin; that is, the hour at which the forecasts are generated and used in the optimization platform. h = 0 is the last hour of day D − 1, when the first set of forecasts is generated for day D. $\hat{\lambda}_t^{(h)}$ is the price forecast for hour t generated at hour h, and $P_t^{(h)}$ is the power of the BESS for hour t scheduled at

forecasting origin h. According to Algorithm (1), for scheduling the BESS from hour t up to hour

T , the optimization problem is run one hour in advance, hour t − 1. For instance, forecasts are

updated at hour 1, i.e., forecast origin h = 1, and fed into the optimization problem at the same

hour to solve for the optimal scheduling of the BESS for hour 2 up to hour T (t = 2, ..., T ).

With this approach, updates to the BESS schedule lead to a more adaptive and efficient

Algorithm 1 Scheduling based on the conventional RH
1: h = 0
2: while h ≤ T − 1 do
3:    Solve the optimization problem:
         $\max_{P_t} \; S = \sum_{t=h+1}^{T} P_t^{(h)} \, \hat{\lambda}_t^{(h)}$
         subject to $\Phi$
4:    h = h + 1
5: end while
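The control flow of Algorithm 1 can be sketched in a few lines. This is only a skeleton under stated assumptions: `toy_forecast` and `toy_solve` are hypothetical placeholders, and `toy_solve` is a simple rule that stands in for the MILP and ignores the constraint set Φ.

```python
def rolling_horizon(T, forecast, solve):
    """Conventional RH loop: re-forecast and re-optimize at every origin h.

    forecast(h) returns {t: price forecast} for t = h+1..T, generated at origin h.
    solve(prices) returns {t: scheduled power} for the same hours.
    The result maps each hour t to the power scheduled at origin h = t - 1.
    """
    applied = {}
    h = 0
    while h <= T - 1:
        prices = forecast(h)              # forecasts for hours h+1..T
        schedule = solve(prices)          # re-optimize the remaining horizon
        applied[h + 1] = schedule[h + 1]  # only hour h+1 is committed
        h += 1
    return applied


def toy_forecast(h, T=4):
    # Hypothetical forecaster: fixed prices, independent of the origin h.
    base = {1: 20.0, 2: 10.0, 3: 40.0, 4: 30.0}
    return {t: base[t] for t in range(h + 1, T + 1)}


def toy_solve(prices):
    # Placeholder rule, not the MILP: discharge (+1) when the forecast is
    # above the mean of the remaining horizon, charge (-1) otherwise.
    mean_price = sum(prices.values()) / len(prices)
    return {t: (1.0 if lam > mean_price else -1.0) for t, lam in prices.items()}


applied = rolling_horizon(4, toy_forecast, toy_solve)
```

Only the first hour of each solution is committed; later hours are overwritten when the next origin produces a new schedule.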

operation. The after-the-fact value of the net energy arbitrage saving is:
$$S^* = \sum_{t=1}^{T} P_t^{*(h)} \Big|_{h=t-1} \lambda_t \tag{4.8}$$

$P_t^{*(h=t-1)}$ is the power of the BESS for hour t that has been scheduled at forecast origin h (h = t − 1). In

spite of successful application of this conventional RH approach in energy management problems, it may not be the most effective approach for operating a behind-the-meter BESS. The reason is that such an RH approach applies the market data of the previous hour to generate the forecast updates for the current hour. As illustrated in Fig. 4.1, the conventional RH provides the price forecast ($\hat{\lambda}_1^{(0)}$) and the corresponding battery schedule ($P_1^{(0)}$) for the first hour of the current day (t = 1) sometime in the last hour of the previous day (t = 24) at forecasting origin h = 0. The battery schedule is fixed for the whole first hour. Note that the schedules are provided for the whole operation horizon (e.g., up to T = 24); however, the schedules for t = 2, ..., T will be updated once the next forecasts are available (h = 1). This approach may not be able to efficiently capture electricity price spikes, as the most recent market data is not incorporated. Consequently, the operational scheduling of the BESS is negatively affected by missing severe price variations.

In this paper, a forecasting strategy is proposed to provide more informative forecasts to en-

hance the operation of a BESS. In this strategy, severe variations in electricity price can potentially

be captured by predicting the price for any hour during the same hour. The proposed rolling hori-

zon framework is called Intra-hour Rolling Horizon (IRH). The operation scheduling of the BESS

using the proposed time-frame is presented in Algorithm (2) and also illustrated in Fig. 4.2. Ac-

cordingly, the first updates on price forecasts are generated at forecast origin h = 1, which is the

Figure 4.1: Graphical representation of Algorithm 1

first hour of the current day (day D) as opposed to h = 0 (the last hour of day D − 1) presented in

Algorithm (1). As seen in Step 3 of Algorithm (2), the first hour is separated from the remaining

hours of the scheduling horizon when the optimization problem is solved. γ denotes the fraction of time that the forecasting model requires to generate $\hat{\lambda}_t^{(h)}$ in practice. For this fraction of the current

hour, the BESS follows the operation that has been scheduled in the previous update. Therefore,

only the second fraction of the current hour proportional to (1−γ) is considered in the optimization

problem for the current hour.

Algorithm 2 Scheduling based on the proposed IRH
1: h = 1
2: while h ≤ T do
3:    Solve the optimization problem:
         $\max_{P_t} \; (1-\gamma)\left(P_t^{(h)} \, \hat{\lambda}_t^{(h)}\right)\Big|_{t=h} + \sum_{t=h+1}^{T} P_t^{(h)} \, \hat{\lambda}_t^{(h)}$
         subject to $\Phi$
4:    h = h + 1
5: end while

As shown in Fig. 4.2, at the forecasting origin h = 1 that corresponds to the first hour (t = 1) of the current day, the first set of price forecasts is generated, i.e., $\hat{\lambda}_1^{(1)}, \hat{\lambda}_2^{(1)}, \ldots, \hat{\lambda}_T^{(1)}$. In a practical

live forecasting system, it takes a fraction of the first hour, i.e., γ, before the forecasts become

available. This is the time needed for the forecasting tool to fetch new data, process them and

generate new forecasts and communicate them to the optimization platform. All communication

Figure 4.2: Graphical representation of Algorithm 2

delays and latencies are included in this time. Afterwards, the price forecasts are fed into the optimization algorithm to schedule the BESS for the (1 − γ) fraction of the first hour, shown in the first term of step 3, and all remaining hours of the scheduling horizon, shown in the second term of step 3. The optimization platform applies the price forecasts to provide the most economic hourly schedules for the BESS operation up to T. In the first fraction of the second hour (t = 2), of length γ, the battery follows the operation instruction that was scheduled in the first hour for the second hour, i.e., $P_2^{(1)}$. Once the second updates on price forecasts become available (at forecast origin h = 2), the optimization algorithm applies them to update battery schedules for the remaining fraction of the current hour, $P_2^{(2)}$, and the remaining hours, i.e., $P_2^{(2)}, \ldots, P_T^{(2)}$. This process is repeated until the last hour of the operating horizon, T. As a result, the proposed IRH can update the price forecasts, and consequently the operation schedules of the battery, for the current hour up to T while standing in the current hour. In the conventional RH approach, by contrast, the latest update for the operation of the battery for any given hour is prepared one hour in advance. The net energy arbitrage saving is calculated as follows:

$$S^* = (1-\gamma) P_1^{*(1)} \lambda_1 + \sum_{t=2}^{T} \left[ \gamma P_t^{*(h)} \Big|_{h=t-1} + (1-\gamma) P_t^{*(h)} \Big|_{h=t} \right] \lambda_t \tag{4.9}$$

Equation (4.9) suggests that the saving value associated with the first hour of the scheduling period is proportional to $(1-\gamma)P_1^{*(1)}$. In other words, the first fraction of only the first hour of the scheduling horizon is not considered because the algorithm does not take into account the price forecast generated in the last hour of the previous day. For any remaining hour of the scheduling period, the first part of the saving comes from $\gamma P_t^{*(h)} \big|_{h=t-1}$, which includes the operation scheduled in the previous hour. The second part is related to the operation scheduled in the same hour, $(1-\gamma)P_t^{*(h)} \big|_{h=t}$. In this way, the optimization problem can benefit from more recent forecast updates generated during the operation hour. The operation scheduling of the BESS with the proposed strategy is evaluated and compared with the conventional RH approach in section 4.3.
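The blending of the two schedules in Eq. (4.9) can be illustrated with a short sketch. The function name, the dictionary layout and the numerical values, including the value of γ, are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def irh_saving(gamma, prev_sched, curr_sched, prices):
    """After-the-fact saving under the intra-hour rolling horizon, Eq. (4.9).

    prev_sched[t]: power for hour t scheduled at origin h = t - 1 (used for t >= 2).
    curr_sched[t]: power for hour t scheduled at origin h = t.
    prices[t]:     realized price for hour t; all dictionaries are keyed by t = 1..T.
    """
    T = len(prices)
    # First hour: only the (1 - gamma) fraction is scheduled by the algorithm.
    total = (1 - gamma) * curr_sched[1] * prices[1]
    for t in range(2, T + 1):
        # The gamma fraction follows the previous schedule, the rest the updated one.
        blended = gamma * prev_sched[t] + (1 - gamma) * curr_sched[t]
        total += blended * prices[t]
    return total


# Toy two-hour case: the update at h = 2 switches hour 2 from idle to discharging.
saving = irh_saving(
    gamma=0.25,
    prev_sched={2: 0.0},
    curr_sched={1: -1.0, 2: 1.0},
    prices={1: 10.0, 2: 40.0},
)
```

With γ = 0.25, hour 1 contributes 0.75 × (−1) × 10 = −7.5 and hour 2 contributes (0.25 × 0 + 0.75 × 1) × 40 = 30, for a total of 22.5.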

4.2.2 Forecasting Strategy

In general, prediction models consist of three main components: 1) data pre-processing, 2) feature

selection, and 3) model selection [103]. In this section, a brief background on each of these

components is provided, and the methods developed in this work are described.

1) Data Pre-processing: Also known as data cleaning, this component performs preliminary

analyses on the raw data gathered for the prediction purpose, such as dealing with missing values,

removing the outliers, and normalizing the data [125]. In this work, outliers such as price spikes

are limited by defining a threshold calculated from the mean and variance of the time series. Hence, outliers remain in the time series, as they carry important information about its nature, while their abnormal values are reduced so that they do not negatively affect the learning capability of the forecasting model.

2) Feature Selection: This component selects a subset of features among all candidate fea-

tures from the original dataset according to a feature goodness criterion [126]. There are several

features influencing the electricity spot price, including historical load and price, imports/exports,

capacity excess/shortfall, historical reserves, and generation types as well as calendar effects, e.g.,

day of week, month of year, seasonal and holiday effects [111]. In this paper, the Mutual Information

(MI) technique [52] was applied to the original database with hourly resolution to select the best

subset that contains the least number of key features contributing to forecast accuracy, while dis-

carding the remaining insignificant features. This feature selection technique is briefly described

in Appendix D, and the detailed formulations can be found in [52].

3) Model Selection: The forecasting model constructs the input/output mapping function,

where inputs come from the feature selection and the output is the forecast value. A number

of approaches have been presented in the literature for electricity price prediction for various pre-

diction horizons, prediction steps and applications. In general, the point forecasting engines fall

into three main categories of statistical time series models, computational intelligence models, and

hybrid models [111]. For instance, ensemble price forecasts from individual models are gener-

ated using linear regression models for price-directed demand management in smart grids [127].

In [11], an adaptive Wavelet Neural Network (WNN) is proposed for short-term price forecasting

that outperforms a number of prediction approaches such as, statistical models (Auto-regressive

Integrated Moving Average - ARIMA), Multi-Layer Perceptron (MLP) and Radial Basis Func-

tion (RBF) neural networks, and fuzzy neural network (FNN). Each group of models has its own

strengths and weaknesses. Selecting a model depends on the nature of the data and the application.

Weron [98] has provided a comprehensive review of state-of-the-art forecasting models for the

electricity price.

In the present work, the forecasting model itself is not the main focus, and therefore, an autore-

gressive model with exogenous variables (ARX) is implemented, as follows:


$$y_t = \sum_{i=1}^{p} a_i \, y_{t-i} + \sum_{j=1}^{l} b_j \, x_{t-j} + \varepsilon_t \tag{4.10}$$

where yt is the output of the model (the price forecast). The first term on the right-hand side

of the formulation shows the auto-regressive part with lagged values of the target variable and

the corresponding parameters, i.e., $a_i$. The second term represents the exogenous variables and their associated parameters, i.e., $b_j$. The parameters of the model are determined in the training phase. $\varepsilon_t$ is a normal white noise process with zero mean and constant variance. The input features of the

forecasting model are discussed later in this section. Despite the simplicity of the regression model,

it is specifically tailored in the proposed forecasting strategy for application to the operation of a

storage system. In other words, this work aims to build a forecasting strategy to provide informative

forecasts such that the operation of a BESS is optimized. In particular, the forecasting strategy

should be capable of capturing high prices and severe variations in real-time prices as much as

possible. The better price variations can be captured, the better an energy storage control system

can adapt its scheduling strategy. Accordingly, high-resolution market information is used along

with low-resolution market data in order to enable the model to detect sudden price fluctuations.

The low-resolution data includes the hourly market data selected by the feature selection. The

high-resolution data could be from any informative market data with higher resolution than one

hour that is publicly available.
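A one-step forecast with the ARX form of Eq. (4.10) can be sketched as follows. This is a minimal illustration, assuming a single exogenous variable and already-trained coefficients; the noise term is replaced by its expected value of zero, and all names and numbers are hypothetical.

```python
def arx_predict(y_hist, x_hist, a, b):
    """One-step ARX forecast: sum of a_i * y_{t-i} plus sum of b_j * x_{t-j}.

    y_hist: past values of the target, most recent last (at least len(a) values).
    x_hist: past values of one exogenous variable, most recent last.
    a: autoregressive coefficients [a_1, ..., a_p].
    b: exogenous coefficients [b_1, ..., b_l].
    """
    if len(y_hist) < len(a) or len(x_hist) < len(b):
        raise ValueError("not enough history for the requested lags")
    # y_hist[-i] is the value i steps back, i.e. y_{t-i}.
    ar_part = sum(a_i * y_hist[-i] for i, a_i in enumerate(a, start=1))
    ex_part = sum(b_j * x_hist[-j] for j, b_j in enumerate(b, start=1))
    return ar_part + ex_part


# Toy values: p = 2 price lags and l = 1 lag of an exogenous input.
forecast = arx_predict(y_hist=[1.0, 2.0, 3.0], x_hist=[10.0, 20.0], a=[0.5, 0.25], b=[0.5])
```

Here the forecast is 0.5 × 3 + 0.25 × 2 + 0.5 × 20 = 12.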

Our studies showed that market clearing prices carry significant information regarding the price

variations in electricity markets. MCPs are set every five minutes in many electricity markets

[113–116], and the average of twelve MCPs in an hour represents the hourly price. MCPs contain

the information about the most recent state of the electricity market. For instance, in the event

of a contingency in power systems, e.g., generation outages, the consequent power imbalance is

reflected as a change of MCP in supply-demand curve. Here, four North American electricity

markets are studied to demonstrate the effectiveness of MCPs in detecting price variations. MCP

values over the year 2015 are considered for Ontario’s (Independent Electricity System Operator-

IESO), Alberta’s (Alberta Electric System Operator-AESO), Texas’ (Electric Reliability Council

of Texas-ERCOT), and New York’s (New York Independent System Operator-NYISO) electricity

markets. Note that MCPs are set every minute in Alberta’s market; hence, the average of

five one-minute MCPs is calculated to form MCPs with five-minute resolution in this study.
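The two averaging steps mentioned above, from one-minute to five-minute MCPs and from twelve five-minute MCPs to the hourly price, can be sketched as follows. The function names and the sample values are illustrative only.

```python
def to_five_minute(one_minute_mcps):
    """Average consecutive groups of five one-minute MCPs."""
    if len(one_minute_mcps) % 5 != 0:
        raise ValueError("expected a multiple of five one-minute values")
    return [
        sum(one_minute_mcps[i:i + 5]) / 5
        for i in range(0, len(one_minute_mcps), 5)
    ]


def hourly_price(five_minute_mcps):
    """Hourly price as the average of the twelve five-minute MCPs of that hour."""
    if len(five_minute_mcps) != 12:
        raise ValueError("expected twelve five-minute MCPs")
    return sum(five_minute_mcps) / 12


five_minute = to_five_minute([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# A single high MCP late in the hour lifts the hourly average noticeably.
price = hourly_price([10.0] * 11 + [130.0])
```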

Being able to capture high price hours is important in order to increase the profit gained by an

energy storage system. Price spikes are abnormally high prices that can be

identified by statistical methods based on historical data. In [109], a price spike threshold is

defined using the mean (µ) and the standard deviation (σ) of historical prices as, TSpike = µ + 2σ.

Electricity prices greater than TSpike are considered as spikes. Thresholds are calculated for each

month, as the electricity price has seasonal trends. Once the price spikes for each month are

distinguished, the capability of MCPs to potentially detect those high price spikes is of interest in

this experiment. Defining a threshold for MCPs, TMCP , it is investigated if MCPs can potentially

detect a distinguished price spike. To do so, for any hour at which a price spike has been detected,

if MCPs in that hour are greater than TMCP , then they are likely to reflect the price spike in that

given hour.
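The monthly spike threshold µ + 2σ described above can be sketched as follows. This is an illustration rather than the procedure used in this work: the use of the population standard deviation is an assumption, and the function names and sample prices are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def monthly_spike_thresholds(hourly_prices):
    """Return {month: mu + 2 * sigma} from an iterable of (month, hourly price)."""
    by_month = defaultdict(list)
    for month, price in hourly_prices:
        by_month[month].append(price)
    # Population standard deviation is assumed here; the text does not specify it.
    return {m: mean(v) + 2 * pstdev(v) for m, v in by_month.items()}


def is_spike(month, price, thresholds):
    """A price is labelled a spike if it exceeds the threshold of its month."""
    return price > thresholds[month]


# Toy data: month 1 is volatile (mean 30, sigma 40), month 2 is flat.
data = [(1, 10.0)] * 4 + [(1, 110.0)] + [(2, 20.0), (2, 20.0)]
thresholds = monthly_spike_thresholds(data)
```

Because each month has its own threshold, the same price can be a spike in a calm month and unremarkable in a volatile one.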

Here, two important factors should be considered: 1) the value of TMCP and 2) the number of

MCPs in the given hour. Regarding TMCP , two values of TMCP,1 = µ + 2σ and TMCP,2 = µ + σ

are considered. The reason for considering two values is that an MCP with a value greater than

TMCP,1 = TSpike can clearly detect a price spike, while a distinguished price spike can also be

potentially detected having a lower MCP value, i.e., TMCP,2 . With regard to the number of MCPs, it

is evident that the more MCPs are considered during an hour, the higher is the chance of detecting

the price spike in the same hour. On the other hand, the cost of considering more MCPs is the

time that has been lost during an hour waiting for new MCPs to be released. In other words, the

more MCPs are used, the longer the forecasting model has to wait to generate the forecast for a

given hour and therefore, such a late forecast may not be as useful for decision-making. In this

experiment, up to nine MCP values are considered for price spike detection.

Spike Prediction Accuracy (SPA) measures the ability of correctly predicting spike occur-

rences, defined as follows:

\[
\mathrm{SPA}(\%) = \frac{\text{Number of correctly predicted spikes}}{\text{Number of spikes}} \tag{4.11}
\]
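As a minimal sketch of this detection experiment, the following Python snippet computes the thresholds and the SPA of (4.11) for one month of data. The function names, the use of the population standard deviation, and the rule that a spike counts as detected when any of the first few MCPs of that hour exceeds TMCP are assumptions made for illustration; they are not taken from the original implementation.

```python
from statistics import mean, pstdev


def threshold(prices, k):
    """Return mu + k * sigma of the historical hourly prices of one month."""
    # Assumption: population standard deviation; the source does not say which one is used.
    return mean(prices) + k * pstdev(prices)


def spa(hourly_prices, mcps_by_hour, n_mcps, k_mcp=2.0):
    """SPA (%) when only the first n_mcps MCPs of each hour are inspected."""
    t_spike = threshold(hourly_prices, 2.0)   # T_Spike = mu + 2*sigma
    t_mcp = threshold(hourly_prices, k_mcp)   # T_MCP,1 (k=2) or T_MCP,2 (k=1)
    spike_hours = [h for h, p in enumerate(hourly_prices) if p > t_spike]
    if not spike_hours:
        return None  # SPA is undefined when the month contains no spikes
    detected = sum(
        1 for h in spike_hours
        if any(m > t_mcp for m in mcps_by_hour[h][:n_mcps])  # assumed detection rule
    )
    return 100.0 * detected / len(spike_hours)


# Toy data (not market data): nine normal hours and one spike hour.
prices = [20.0] * 9 + [200.0]
mcps = [[20.0, 20.0, 20.0]] * 9 + [[100.0, 150.0, 300.0]]
```

With this toy data, lowering the MCP threshold from µ + 2σ to µ + σ lets the first MCP alone flag the spike hour, which mirrors the motivation for considering TMCP,2.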

Fig. 4.3 illustrates the value of SPA versus the number of MCPs considered for detecting price

spikes in the four electricity markets. SPA is calculated for each month of the year 2015, and the average

over the year is reported. Fig. 4.3(a) takes into account TMCP,1 for evaluating the capability of

MCPs in spike detection. Evidently, the more MCPs are included, the higher the SPA value.

For instance, the chance of detecting a price spike is over 90% considering nine MCPs. However,

even considering the first MCP in each hour results in at least around 40% chance of detecting a

spike (ERCOT). Fig. 4.3(b) shows the same experiment by considering TMCP,2 .

Figure 4.3: Potential capability of MCPs in spike detection. (a) TMCP,1 = µ + 2σ; (b) TMCP,2 = µ + σ.

In fact, the values for MCPs do not need to be as high as TMCP,1 = TSpike to be able to detect the spike. A high MCP

with the value lower than TSpike could also be capable of detecting a high hourly price. SPA values

considering TMCP,2 are expected to be higher than those with TMCP,1 as the threshold. For instance,

there is at least 50% chance of detecting a spike using only the first MCP in an hour. Thus, this

experiment shows the effectiveness of MCP values as high-resolution market data for capturing

high prices. Inspired by this high capability, a forecasting strategy is proposed in this work to

include MCP values as high-resolution market data for price prediction.

The proposed forecasting strategy consists of two separate forecasting models, high-resolution

model (MHR ) and low-resolution model (MLR ), with different purposes, illustrated in Fig. 4.4. The

forecasting models provide day-ahead price predictions with an hourly resolution. MHR generates

the price forecast for the first step of the forecasting horizon, i.e., 1-hour-ahead forecast. Because of

this very short prediction horizon, MHR can take advantage of the high-resolution data available in

the current hour ending. Note that Hour Ending (HE) denotes the time when an hour ends, e.g., HE

23 represents the time period 22:00-23:00. Thus, hourly inputs selected by the feature selection,

$SX^{HR} = \{sx_1^{HR}, ..., sx_m^{HR}\}$, as well as a number of MCPs in the current HE are fed to $M^{HR}$ to

predict the hourly electricity price of the same HE. A vector of high-resolution inputs is denoted

by $X^{MCP} = \{MCP_1, MCP_2, ..., MCP_l\}$, where $MCP_1$ corresponds to the MCP that is set for the

first five-minute interval of the hour, and so forth; $l$ represents the number of MCPs fed to the high-resolution model.

Figure 4.4: Structure of the models. (a) High-resolution model ($M^{HR}$); (b) Low-resolution model ($M^{LR}$).

As mentioned, MHR provides the price prediction for the first step of the forecasting horizon, i.e.,

current HE. If the after-the-fact value of the current hour turns out to be a price spike, there is a

high chance that at least one of the included MCPs has an unusually high value leading to this hourly price

spike. Hence, the inclusion of these MCPs in the model increases the value of the output of the

model, i.e., 1-hour-ahead price spike prediction. Obviously, the more MCP values are included in

the model, the higher the chance would be to detect the hourly spike by any of those high MCP

values. Once the first step is predicted, the forecast is used as an input in MLR , which provides

multi-step-ahead price predictions for the remaining hours of a day recursively. MLR is fed by

hourly input features selected by the feature selection, i.e., $SX^{LR} = \{sx_1^{LR}, ..., sx_k^{LR}\}$. Fig. 4.4

illustrates the structure of both models. Hourly Price Forecast (HPF) denotes the output for both

models.

The forecasting strategy applies the two forecasting models in the IRH framework. This approach

can mitigate the impacts of uncertainties associated with forecasts by updating the forecasts at

each forecasting step over the forecasting horizon. In this work, updates of the price forecasts are

generated once new observations of the required market data are available, i.e., every hour. Fig.

4.5 demonstrates how the IRH performs to update the forecasts employing MHR and MLR . As

seen, the first set of forecasts is generated in HE 01 for all hours of the current day. MHR

generates the price forecast for the first hour, i.e., HPF 1, while MLR provides the forecasts for the

remaining hours of the day, i.e., HPF 2 to HPF 24. Forecasting horizons for MHR and MLR are

shown by red and blue arrows in the figure, respectively. Once the first hour has passed, both models

are used to update the forecasts up to HE 24 of the current day. MHR predicts the price for the

second HE, HPF 2, and MLR provides HPF 3 to HPF 24. This process is repeated every hour to

update the forecasts up to the last hour of the current day. Updates can go beyond the current day if

the optimization problem of the storage system is designed for multiple days. Once the updates are

available, they can be applied to the optimization platform of a behind-the-meter BESS to adjust

charging/discharging strategies.
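The hourly update procedure described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the two models are passed in as plain callables, and the toy models, the dictionary layout of the inputs and the function name irh_forecasts are hypothetical placeholders rather than the forecasting engines used in this work.

```python
def irh_forecasts(m_hr, m_lr, hourly_inputs, mcps_by_he, l=1, hours=24):
    """Yield (he, hpf) pairs; hpf maps each remaining HE to its price forecast."""
    for he in range(1, hours + 1):
        # High-resolution model: forecast for the current HE, using the
        # hourly features and the first l MCPs of that HE.
        hpf = {he: m_hr(hourly_inputs[he], mcps_by_he[he][:l])}
        # Low-resolution model: recursive multi-step forecasts, where each
        # step receives the forecast of the preceding hour as an input.
        previous = hpf[he]
        for step in range(he + 1, hours + 1):
            previous = m_lr(hourly_inputs[step], previous)
            hpf[step] = previous
        yield he, hpf


# Toy stand-ins for the two models (assumptions, not trained models).
def toy_m_hr(feature, mcps):
    return 0.5 * feature + 0.5 * sum(mcps) / len(mcps)


def toy_m_lr(feature, previous_forecast):
    return 0.5 * feature + 0.5 * previous_forecast


# Three-hour toy horizon: one hourly feature and one MCP per HE.
inputs = {1: 10.0, 2: 20.0, 3: 30.0}
mcps = {1: [30.0], 2: [20.0], 3: [10.0]}
updates = list(irh_forecasts(toy_m_hr, toy_m_lr, inputs, mcps, l=1, hours=3))
```

Each element of updates corresponds to one hourly update, and the forecasting horizon shrinks by one hour per update because the scheduling horizon always ends at the last HE of the day.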

Figure 4.5: IRH framework

4.3 Case Study

In this section, first, Ontario’s electricity market is briefly described as a case study. The proposed

price forecasting strategy is then evaluated from both statistical and economic points of view.

4.3.1 Ontario’s Electricity Market

Ontario’s wholesale electricity market consists of real-time energy and operating reserve markets.

Ontario demand is supplied by installed capacity within the province and imports from neighboring

power systems. The IESO accepts the lowest-cost offers to supply electricity until sufficient generation

is available to meet the demand. MCPs are set every five minutes and the Hourly

Ontario Energy Price (HOEP), which is the average of the twelve MCPs in each hour, is published

5 minutes past each hour.

Pre-Dispatch Price (PDP) is the publicly available price prediction published by the IESO.

PDPs are updated hourly and provided according to the conventional rolling horizon approach

explained in section 4.2.1. For instance, the 1-hour-ahead PDP for HE 2 is published in HE 1. This

is the most recent update on future electricity prices available in this market. A data analysis over a

two-year period (January 2014 to December 2015) reveals a significant deviation of 1-hour-ahead

PDP from HOEP values, i.e., over 40% error on average. For optimizing the operation of a behind-

the-meter storage system, this source of price forecasts may not be helpful because high prices are

likely to be missed. For this reason, the proposed price forecasting strategy is applied to Ontario’s

market to improve the forecasting accuracy, and consequently operation of such participants.

In [128], the authors evaluated several potential input variables for predicting HOEP. The re-

sults reveal that none of temperature, predicted shortfalls and predicted transmission constraints

can improve the forecast accuracy of hourly electricity prices in Ontario. The feature selection

selects hourly features including PDP, Ontario’s demand forecast, supply cushion, and lagged values

of HOEP as inputs of the two forecasting models. Although PDP is likely to have an error with

respect to electricity prices in real time, it is regarded as an initial price forecast for the two fore-

casting models. In [129], it is stated that price spikes are usually observed when the value of

supply cushion is very low, e.g., 10%. Supply cushion is an important publicly available variable

that includes the information about forecasts of demand and non-dispatchable generations (e.g.,

wind and solar generations), and import schedules. The number of MCPs considered for the high-

resolution model can be influential on the performance of the forecasting strategy from statistical

and economic aspects. This is studied in the following sections. Note that all the required market

data for this study are publicly available on IESO’s website [130].

4.3.2 Statistical Analysis

In this section, the proposed price forecasting strategy is evaluated from the statistical perspective,

i.e., the prediction errors of the generated price forecasts are analyzed. For this, the proposed

forecasting strategy is tested on publicly available data of Ontario’s electricity market in year 2015.

An error measure of Mean Absolute Error (MAE) is considered to evaluate the forecasting errors

for each month, defined as:


\[
\mathrm{MAE}\ (\$/\mathrm{MWh}) = \frac{1}{N} \sum_{t=1}^{N} \left| P_{ACT(t)} - P_{FOR(t)} \right| \tag{4.12}
\]

where N denotes the number of hours in a month. PACT(t) and PFOR(t) represent the actual and

forecast values of price at hour t, respectively. Monthly forecasting errors in terms of MAE are

shown in Fig. 4.6. In this figure, the proposed strategy is applied with different numbers of MCPs

fed to $M^{HR}$, i.e., l = 1, 2, ..., 6. As expected, the more MCP values are used, the lower the prediction

error, because of a higher chance of detecting price spikes during the predicted hour.

The average forecasting errors in terms of MAE are $7.24/MWh, $6.33/MWh,

$5.98/MWh, $5.80/MWh, $5.65/MWh, and $5.56/MWh for l = 1 to l = 6, respectively. For the sake of comparison,

the 1-hour-ahead PDP published by the IESO is also shown in blue for all months. The proposed

forecasting strategy, even with l = 1, results in lower forecasting errors in all test months compared

to PDP values. The average forecasting error for PDP is $9.02/MWh. Therefore, using only

one MCP in the proposed strategy (from $9.02/MWh to $7.24/MWh) results in an improvement of about 20% in forecast accuracy in comparison

with PDP.
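For illustration, the error measure in (4.12) and the relative improvement quoted above can be reproduced with a few lines of Python. The function names are chosen here for the example only; the two average MAE values are the ones reported in the text.

```python
def mae(actual, forecast):
    """Mean absolute error between actual and forecast prices ($/MWh)."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def improvement_pct(baseline_mae, model_mae):
    """Relative reduction of the MAE with respect to a baseline, in percent."""
    return 100.0 * (baseline_mae - model_mae) / baseline_mae


# Average MAE values reported in the text: PDP versus the proposed strategy with l = 1.
reported_improvement = improvement_pct(9.02, 7.24)  # roughly 19.7%
```

The result of about 19.7% is what the text rounds to a 20% improvement.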

Fig. 4.7 displays hourly actual prices of Ontario’s market against 1-hour-ahead forecasts from

the proposed strategy and PDP values during the first week of August 2015. As seen, the proposed

model can satisfactorily track the price fluctuations and spikes. It should be noted that in many

scheduling and operational optimization problems, it is the occurrence of a price peak, not its exact

value, that is of interest. This is because the control system of a BESS only requires the precise

timing of the price spike rather than its magnitude. For instance, there was a price spike of

$371/MWh at HE 41 of this test week, where the proposed model effectively detected this price

spike, although with a forecast value of $232/MWh.

Figure 4.6: Forecasting errors in different months

To evaluate the performance of the proposed strategy in price spike detection, confusion matrices

for different numbers of MCPs are presented in Table 4.1. Here, the confusion matrix is a

2 × 2 square matrix indicating two classes of prices, i.e., price spikes and normal prices, for actual

and predicted price values. In this experiment, a total of 201 spikes are distinguished from normal

prices using TSpike , defined in section 4.2.2, for each month in 2015. For instance, including

one MCP (l = 1), 102 spikes have been correctly detected, while 99 spikes have been missed.

This results in 50.7% accuracy in spike detection, shown in the last column of the table. In contrast,

only 15 of the 8559 normal prices have incorrectly been detected as price spikes,

which leads to 99.8% accuracy in this class. As the number of MCP values used in the model is

increased, the spike detection accuracy is enhanced significantly, e.g., 76.1% for l = 6, which is

expected according to Fig. 4.3. Although considering more MCPs results in more accurate forecasts,

the economic aspect should also be considered to assess whether more MCPs lead to higher

economic value for a storage system.
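The per-class accuracies in the last column of Table 4.1 follow directly from the counts in each confusion matrix. A short Python check is given below; the tuple layout and the function name are assumptions made for this example, while the counts are copied from the table.

```python
# Counts from Table 4.1 per number of MCPs l:
# (spikes detected, spikes missed, normal flagged as spike, normal predicted as normal)
CONFUSION = {
    1: (102, 99, 15, 8544),
    2: (126, 75, 8, 8551),
    3: (133, 68, 11, 8548),
    4: (140, 61, 10, 8549),
    5: (147, 54, 8, 8551),
    6: (153, 48, 7, 8552),
}


def class_accuracies(tp, fn, fp, tn):
    """Return (spike accuracy, normal accuracy) in percent, rounded to one decimal."""
    spike_accuracy = 100.0 * tp / (tp + fn)    # share of actual spikes detected
    normal_accuracy = 100.0 * tn / (tn + fp)   # share of normal prices kept normal
    return round(spike_accuracy, 1), round(normal_accuracy, 1)


accuracies = {l: class_accuracies(*counts) for l, counts in CONFUSION.items()}
```

Every row sums to 201 actual spikes or 8559 actual normal prices, which is a quick consistency check on the table.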

Figure 4.7: Price values for the first week of August 2015

4.3.3 Economic Analysis

In this section, the performance of the proposed forecasting strategy is evaluated from an economic

point of view. Thus, the generated price forecasts from the proposed strategy are applied to opera-

tion scheduling of a behind-the-meter BESS within a micro-grid facility in Ontario. The micro-grid

is designed to operate as backup power during a power outage of the main grid. A 500kW Li-ion

BESS should provide emergency power for critical loads in a building of this micro-grid. The

remaining capacity of the BESS can be utilized to trade energy with the main grid.

Real-time electricity prices can provide end-users in electricity markets with the opportunity to

reduce their electricity costs by strategically responding to prices that vary across different times

of the day [131]. The size of the battery is very small compared to the total load of the micro-grid,

and thus its operation does not cause any major issues in power flows. Hence, the BESS in the

micro-grid is modeled as an individual agent with the objective of maximizing its profit (or the net

energy arbitrage saving). This can also be interpreted as reducing the amount of energy purchased

from the wholesale market during peak hours when the electricity prices are high. The energy

is stored during off-peak hours when prices are low and injected back into the micro-grid/grid during

peak periods. It should be noted that other factors may impact the operation of a battery system

inside a microgrid (e.g., renewable energy fluctuations or load balance). The microgrid operator

may consider those factors, in addition to prices, when making charging/discharging decisions.

Table 4.1: Confusion matrices

No. MCP   Actual   Predicted Spike   Predicted Normal   Accuracy
l = 1     Spike    102               99                 50.7%
          Normal   15                8544               99.8%
l = 2     Spike    126               75                 62.7%
          Normal   8                 8551               99.9%
l = 3     Spike    133               68                 66.2%
          Normal   11                8548               99.9%
l = 4     Spike    140               61                 69.7%
          Normal   10                8549               99.9%
l = 5     Spike    147               54                 73.1%
          Normal   8                 8551               99.9%
l = 6     Spike    153               48                 76.1%
          Normal   7                 8552               99.9%

A significant input for this optimization problem is electricity price forecasts of Ontario’s mar-

ket, which are provided by the proposed forecasting strategy. The formulation for this optimiza-

tion problem is provided in Algorithm (2). Accordingly, the optimization problem is run every

hour up to the last hour of the day. The value of γ is calculated using the number of MCPs (l),

γ = (l + 1)/12.

In Ontario’s market, MCP values are published 3 minutes past each five-minute interval, and

thus, it should be taken into account in the optimization problem. For instance, given that only

the first MCP is used, the forecasting strategy generates the forecasts eight minutes past each hour.

Then, the forecasts are used in the BESS optimization platform, and hence, about 10 minutes are expected

to have passed by the time the scheduling updates of the BESS are available. The emergency load is 150 kW

and the Depth of Discharge (DOD) for this battery is 70%, i.e., 350 kW of the 500 kW rating is usable. This leaves 200 kW of available capacity

for the operation of this battery in energy arbitrage. Let us assume that the BESS can only have one

full charge-discharge cycle per day. Obviously, the more cycles the battery can be operated per

day, the more profit can be gained; however, it may decrease the battery’s life-time.
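The timing and capacity figures used in this setup can be traced with the small Python sketch below. The generalization of the release time to an arbitrary number of MCPs (five minutes per MCP plus the three-minute publication delay) is an assumption extrapolated from the single example given in the text, and all names are illustrative.

```python
from fractions import Fraction

MCP_INTERVAL_MIN = 5            # MCPs are set every five minutes
MCP_PUBLICATION_DELAY_MIN = 3   # published 3 minutes past each interval


def forecast_ready_minute(l):
    """Minute past the hour at which the l-th MCP is published (assumed rule)."""
    return MCP_INTERVAL_MIN * l + MCP_PUBLICATION_DELAY_MIN


def gamma(l):
    """Fraction of the current hour elapsed before the new schedule applies."""
    return Fraction(l + 1, 12)


def arbitrage_capacity_kw(rated_kw, dod_percent, emergency_kw):
    """Capacity left for arbitrage after the DOD limit and the emergency reserve."""
    return rated_kw * dod_percent // 100 - emergency_kw


ready = forecast_ready_minute(1)                      # 8 minutes past the hour
schedule_minute = gamma(1) * 60                       # 10 minutes past the hour
capacity = arbitrage_capacity_kw(500, 70, 150)        # 200
```

For l = 1 this reproduces the eight-minute forecast release, the ten-minute schedule availability (γ = 1/6) and the 200 kW left for arbitrage.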

In the first experiment, the proposed strategy generates forecasts considering different numbers

of MCPs as inputs (l = 1, ..., 6). The aforementioned operation scheduling is then applied to

calculate the amount of money saved in energy cost. The goal is to find an optimal value for l (or

γ). The total saved money for each month is demonstrated in Fig. 4.8. Interestingly, this figure

shows that more accurate forecasts generated with a higher number of MCPs do not necessarily

lead to higher profits gained by the BESS. The monthly averages of saved money, shown in the

last column, are $254.1, $251.7, $246.5, $238.5, $225.9, and $221.1 for l = 1 to l = 6, respectively.

The relationship between the forecasting performance and corresponding economic values can be

interpreted using the parameter γ. An increase in the value of l means that the forecasting strategy

has to wait longer to use more MCPs as inputs of the model. This consequently leads to a higher

value for γ in the optimization problem. With larger γ, although the generated forecasts for the

current hour are more accurate, a smaller portion of the current hour associated with (1 − γ) uses

such accurate forecasts for scheduling the BESS. This can be seen from Fig. 4.2 when the value

of γ increases and (1 − γ) decreases. According to the same figure, the first portion of the current

hour associated with γ has already been scheduled in the previous hour using the forecasts generated

in that previous hour, which is not the latest forecast update for the current hour. As a result, more

accurate predictions from a higher value of l are not necessarily more effective for the BESS from

the economic point of view. Thus, the optimal value of γ that results in the highest economic value

is γ = 1/6 (γ = (l + 1)/12). This means that only the first MCP of the current hour should be

fed into the model. Thus, the forecasting strategy can be tuned including only the first MCP in its

high-resolution model, because it leads to the highest economic benefits although the forecasting

accuracy is slightly compromised.
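The trade-off between accuracy and economic value can be made explicit with the averages reported above. The following Python lines are a simple illustration; the dictionary and variable names are chosen here, and the values are the average MAE and average monthly savings quoted in the text.

```python
from fractions import Fraction

# Average MAE ($/MWh) and average monthly saving ($) reported for l = 1, ..., 6.
AVG_MAE = {1: 7.24, 2: 6.33, 3: 5.98, 4: 5.80, 5: 5.65, 6: 5.56}
AVG_MONTHLY_SAVING = {1: 254.1, 2: 251.7, 3: 246.5, 4: 238.5, 5: 225.9, 6: 221.1}

most_accurate_l = min(AVG_MAE, key=AVG_MAE.get)                        # lowest error
most_profitable_l = max(AVG_MONTHLY_SAVING, key=AVG_MONTHLY_SAVING.get)  # highest saving
best_gamma = Fraction(most_profitable_l + 1, 12)                        # gamma = (l + 1) / 12
```

The most accurate setting (l = 6) and the most profitable one (l = 1, γ = 1/6) differ, which is the point made in this paragraph.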

For the sake of comparisons, four operational strategies are considered based on available price

forecasts, generated price forecasts, and the historical data as follows:

1. PDP Scheduling: This scheme considers PDP values with hourly updates published by the IESO

as available price forecasts. PDPs are applied to the optimization problem in accordance with the

formulation presented in Algorithm (1).

Figure 4.8: Total money saved for different number of MCPs

2. Proposed Strategy Scheduling: In this scheme, price forecasts with hourly updates generated

by the proposed forecasting strategy (with l = 1) are fed into the optimization platform (Algorithm

(2)). The operational schedules of the BESS are updated every hour as the updates of price fore-

casts are generated. The scheduling horizon is kept the same, up to HE 24 of the day.

3. Ad-hoc Strategies: It is considered that the BESS has an unchanging operation strategy rather

than finding the optimal operation scheduling using price forecasts fed into the optimization prob-

lem. Two different ad-hoc strategies for the operation of the BESS are considered as below:

3.a. Ad-hoc #1: Weighted averages of electricity prices from 2002 to 2013 are calculated for each

month separately. Hours with the lowest and the highest electricity prices are correspondingly con-

sidered for charging and discharging. For instance, the charging hour is HE 4 and the discharging

hours are HE 18 and HE 19 for the month of October.

3.b. Ad-hoc #2: Charging and discharging decisions for the current day are made based on the pro-

file of electricity price in the previous day. Hours with the lowest and the highest electricity prices

in the previous day are considered for charging and discharging in the current day, respectively.
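As a rough sketch of the second ad-hoc rule, the Python function below picks the charging and discharging hours of the current day from the price profile of the previous day. Selecting a single hour for each action and resolving ties in favour of the earliest hour are simplifying assumptions made for this example; the function name is hypothetical.

```python
def adhoc2_schedule(previous_day_prices):
    """Return (charging HE, discharging HE) for the current day, 1-based."""
    if not previous_day_prices:
        raise ValueError("previous-day price profile is empty")
    hours = range(len(previous_day_prices))
    # Charge at the hour that was cheapest yesterday, discharge at the most expensive one.
    charge_he = min(hours, key=previous_day_prices.__getitem__) + 1
    discharge_he = max(hours, key=previous_day_prices.__getitem__) + 1
    return charge_he, discharge_he


# Toy four-hour profile (not market data): cheapest at HE 2, most expensive at HE 3.
toy_schedule = adhoc2_schedule([30.0, 10.0, 50.0, 20.0])
```

Because the rule ignores any forecast, it misses price spikes that occur at a different hour than on the previous day.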

Figure 4.9: Monthly total money saved by operating the BESS

In this experiment, the potential profit that could be achieved by operating the BESS based on

perfect forecasts is calculated. Having the perfect price forecasts, the micro-grid could have potentially

saved $4,688 in total in energy cost by operating the BESS over the year 2015. Applying

the proposed forecasting strategy into the developed optimization platform, 62% of the potential

saving could be captured (total $2,937). Scheduling based on available PDP could only capture

43% of this potential (total $2,019). Hence, applying the proposed forecasting strategy could result

in around 50% improvement over the use of available PDP in the profit gained by operating the

BESS. The results indicate that the total profits based on two ad-hoc strategies are only $1,407 and

$1,120, respectively. Fig. 4.9 illustrates the total amount of money saved by operating the BESS

for each month by adopting different strategies compared with the potential profit. It is also noted

that the size of the battery can affect the total energy arbitrage saving, although this is not the focus of

the present paper.
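The shares of the potential saving quoted above can be recomputed from the annual totals. The Python lines below are only a check of that arithmetic; the dictionary keys are labels chosen for this example.

```python
POTENTIAL_SAVING = 4688  # $ over 2015 with perfect price forecasts

ANNUAL_SAVING = {
    "proposed": 2937,
    "pdp": 2019,
    "adhoc_1": 1407,
    "adhoc_2": 1120,
}


def share_of_potential(saving):
    """Captured share of the perfect-forecast saving, in percent."""
    return 100.0 * saving / POTENTIAL_SAVING


shares = {name: share_of_potential(value) for name, value in ANNUAL_SAVING.items()}

# Relative gain of the proposed strategy over PDP-based scheduling, in percent.
gain_over_pdp = 100.0 * (ANNUAL_SAVING["proposed"] - ANNUAL_SAVING["pdp"]) / ANNUAL_SAVING["pdp"]
```

This gives roughly 62.6% and 43.1% of the potential for the proposed strategy and PDP, respectively, and a relative gain of about 45%, which the text describes as around 50%.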

An example of the operation of the BESS on October 2nd, 2015 is presented for different

strategies in Fig. 4.10. Observe that the proposed strategy could effectively detect the price spike

at HE 10, while PDP failed to do so. Therefore, the BESS is scheduled to discharge when the price

is at its highest value. For both ad-hoc strategies, equal discharging rates are considered. It is also

noted that the charging hour for ad-hoc #2 is at HE 6, which overlaps in the figure with the one for

PDP scheduling.

Figure 4.10: BESS schedules based on different strategies for October 2nd, 2015

The amount of money saved by applying the proposed strategy is $140.58 for

this particular day. This amount is 52 times greater than the corresponding amount obtained from

scheduling based on PDP values, i.e., $2.67. The amounts of saved money are respectively $3.03

and $3.59 for the first and the second adopted ad-hoc strategies. This figure clearly depicts the

effect of accurately detecting price spike occurrences in the operation of a storage system. In this

practical application, it is of high importance to predict the exact timing of price spikes in order

to discharge the battery accordingly. Note that the larger the difference between electricity

prices at charging and discharging hours, the more money is saved by operating the

BESS.

In the last experiment, we evaluate the potential for gaining higher economic values by increas-

ing the number of cycles in the operation of the BESS. It is noted that frequent and deep cycles

accelerate cyclic aging and consequently reduce the life of the battery. For any type of battery,

number of cycles is usually defined as a function of depth of discharge, which can be obtained by

a fitting technique using detailed experimental data provided by manufacturers [132]. However,

since we do not have access to such data, we assume a constant depth of discharge for analyzing the

economic impact of a higher number of cycles. Fig. 4.11 shows the total saved money obtained

by increasing the number of cycles.

Figure 4.11: Economic effect of number of cycles for the BESS

As expected, the higher the number of cycles, the higher the

potential profit would be if perfect price forecasts were available. For instance, the potential profit

increases from $4,688 with one cycle per day to $6,283 with 4 cycles per day. It is observed from

this figure that the total saved money starts to saturate once the battery is operated with more than

two cycles per day. This is because the opportunity for arbitrage is limited during a day, depending

on the price profile, and therefore, an increase in the number of cycles does not necessarily lead to a

significant increase in the total saved money. Higher cycles for the proposed strategy, on the other

hand, do not result in any noticeable changes in the total saved money. This can be justified by

the effect of forecasting errors along with the limited arbitrage capacity during a day. Finally, high

forecasting errors associated with PDP even lead to lower total saved money as the number of

cycles increases. As a result, this BTM BESS is economically better off being operated with one

cycle per day using the proposed strategy.

4.4 Conclusions

In this paper, a forecasting strategy is proposed to provide accurate price forecasts for operation

of behind-the-meter storage systems. This strategy includes two separate forecasting models to

take advantage of high-resolution market data along with hourly data in order to capture price

spikes as much as possible. The proposed intra-hour rolling horizon framework is applied to

update the forecasts on an hourly basis. From statistical analysis, the proposed strategy results in

20% improvement in forecast accuracy compared to available PDPs, and has a high capability of

detecting price spikes. The generated forecasts are fed to an optimization platform for operation

scheduling of the BESS within a micro-grid facility. It is concluded that the BESS can bring more

economic value using the price predictions generated by the proposed forecasting strategy, i.e., 62%

of the potential saving, in comparison with a number of other strategies, e.g., 43% of the potential

saving using PDPs.

Acknowledgment

The authors would like to thank NRGStream Inc. for providing a complimentary license to use

their market data collection platform. We would also like to thank GE Digital Energy for financial

support of this work. Finally, we want to thank Dr. Mostafa Kazemi for his comments on the operation

of energy storage systems.

Chapter 5

Impact of Uncertainty Modeling on Economic Performance of

Microgrids 1

5.1 Nomenclature

Indices and Sets:

i Index of diesel generators from 1 to NG

t Index of hours from 1 to NT

H Index of forecasting origin from 1 to NT

NT Set of scheduling hours

NG Set of diesel generators

Ψ Set of decision variables


Ω Set of uncertain parameters

Parameters:

$E^{B,max}$ Energy capacity of the storage system, (MWh)

$E^{B}_{ini}$ Initial energy level of the storage system, (MWh)

$P^{B,max}$ Power capacity of the storage system, (MW)

$\eta^{B}$ Energy efficiency of the storage system, (%)

$\hat{P}_t^{L}$ Load forecast for hour t, (MW)

$P_t^{L,Low}$ Lower bound of load forecast for hour t, (MW)

$P_t^{L,Up}$ Upper bound of load forecast for hour t, (MW)

$\hat{\lambda}_t$ Price forecast for hour t, ($/MWh)

$\lambda_t^{Low}$ Lower bound of price forecast for hour t, ($/MWh)

$\lambda_t^{Up}$ Upper bound of price forecast for hour t, ($/MWh)

$\hat{P}_t^{W}$ Wind power forecast for hour t, (MW)

$P_t^{W,Low}$ Lower bound of wind forecast for hour t, (MW)

$P_t^{W,Up}$ Upper bound of wind forecast for hour t, (MW)

$P_t^{L}$ Uncertain parameter of load at hour t, (MW)

$\lambda_t$ Uncertain parameter of price at hour t, ($/MWh)

$P_t^{W}$ Uncertain parameter of wind at hour t, (MW)

$SC_i$ Start-up cost of diesel generator i, ($)

Variables:

$P_{i,t}^{Gen}$ Power of diesel generator i at hour t, (MW)

$u_{i,t}$ State of diesel generator i at hour t (1 is ON and 0 is OFF)

$SUC_{i,t}$ Start-up cost of diesel generator i at hour t, ($)

$P_t^{Grid}$ Grid power at hour t, (MW)

$P_t^{B}$ Power of the storage system at hour t, (MW)

$P_t^{B,c}$ Charging power of the storage system at hour t, (MW)

$P_t^{B,d}$ Discharging power of the storage system at hour t, (MW)

$E_t^{B}$ Energy of the storage system at hour t, (MWh)

$u_t^{Grid}$ State of the grid at hour t

1 © 2017 IEEE. Reprinted, with permission, from [133]: H. Chitsaz, S. Shafiee, H. Zareipour, D. Wood, “Impact of Uncertainty Modeling on Economic Performance of Microgrids”, IEEE Transactions on Smart Grid, under review.

5.2 Introduction

With the increasing interest in distributed energy resources and microgrids in recent years, the

optimal operation of such energy systems has become important. In practice, operating a grid-

connected microgrid in the most economic way possible can be a challenging task due to a number

of reasons. Integration of renewable energy in microgrids causes an additional complexity in

their operation because of the uncertainties attached to such intermittent energy resources [134].

In addition, grid-connected microgrids capable of trading energy with the main grid are subject

to the risks of fluctuations in electricity market prices, which can affect the economics of the

microgrid [135]. Another challenge in the operation of microgrids is the non-smooth behavior

of the electricity load at microgrid levels [6]. Unlike the smooth variations of electricity loads in

power systems, microgrid loads can have severe variations that negatively affect their predictability.

Hence, accurate short-term forecasting tools are essential for economic energy management in

microgrids [7].

Many approaches have been presented in the literature for energy management of microgrids.

Typically, point forecasts of the electricity load, price and renewable energy generation are fed to

an optimization problem to schedule the dispatchable units within the microgrid [136]. However,

it has been argued that point forecasting does not provide any information regarding uncertainties

associated with forecasts, and thus cannot be fully relied on for decision-making [101]. Hence,

different strategies have been introduced to mitigate the effect of forecasting errors on the operation

of microgrids.

An efficient approach to deal with the effect of forecasting errors is to update forecasts, and

consequently the set-points of dispatchable units at each scheduling step (e.g., every hour or 15-

min) [137]. This is called the Rolling Horizon (RH) technique that can significantly reduce the

effect of the forecasting errors on microgrid operation [123, 138]. Another approach to deal with

uncertainties associated with forecasts is to generate different scenarios for predictions [134, 139,

140]. For instance, Khodaei [139] considered 1000 scenarios for sensitivity analysis of load, price

and wind forecast errors with uniform random error of 10%, 30% and 30%, respectively. Monte

Carlo simulations were used to generate 10000 scenarios for electricity load in [140]. However,

in addition to a high computational burden [141], such scenarios are usually generated randomly

according to the distribution of random variables [142], which may not be available or even realistic

enough [143].

An alternative way to quantify the prediction uncertainties is to use probabilistic or interval

forecasting. This gives an operator insight into the extent to which forecasts can be trusted

[144]. Prediction intervals are used in a Robust Optimization (RO) formulation that considers the

worst-case scenario for operation scheduling of the microgrid [143,145]. RO is used to incorporate

the uncertainty from wind power generation in the operation of microgrids in [146–148]. To do so,

prediction intervals are considered as lower and upper bounds of forecasts based on the distribution

of uncertain variables.

Despite the introduction of such strategies in mitigating uncertainties, a comparative analysis

of alternative methods and their merits under different circumstances has not been investigated in

the literature. In this paper, as the first approach, the RH technique is applied to update the point

forecasts, and consequently the schedules of the microgrid are updated every hour. In the second

approach, we apply the Quantile Regression Averaging (QRA) probabilistic model to generate pre-

diction intervals (PIs) for the electricity load, price and wind power generation with different con-

fidence intervals. The generated PIs are fed into the RO formulation to find the optimal schedules

of the microgrid components in the worst-case scenario. The third approach is the combination

of the first two strategies, i.e., generating PIs with RH technique for hourly updates. The main

contribution of this paper is exploring the impact of these three approaches on the economic performance

of microgrids under different scenarios of wind power generation levels and electricity

price volatilities. The significance of this work is to determine which approach the operator should

adopt to achieve the highest economic performance of the microgrid under different scenarios.

The remaining parts of the paper are organized as follows. The forecasting methodologies for

electricity load, price and wind power generation are presented in section 5.3. In section 5.4, the

optimization platforms for the operation of the microgrid are formulated. Statistical and economic

analyses are provided in section 5.5. Finally, section 5.6 concludes the paper.

5.3 Forecasting Methodology

In this section, the forecasting methodologies for electricity load, price and wind power are pre-

sented. The generated forecasts are then fed into the optimization algorithm for operation schedul-

ing of the microgrid.

5.3.1 Deterministic Forecasting

Deterministic forecasting models provide point predictions of the target variable of interest. Many

approaches have been presented for short-term point forecasting of electricity load [72], electricity

price [98] and wind power generation [149] in the literature. In general, the forecasting models fall

into three main groups: statistical models, neural network-based models and hybrid models. Each

model has its own strengths and weaknesses and hence, no single model always

outperforms the others across all forecasting criteria. Since the forecasting engine itself is

not the contribution of this paper, we implemented the well-known linear autoregressive model

with exogenous variable (ARX) for predicting electricity loads, market prices and wind power

generation, with the following formulation.


$$ y_t = \sum_{k=1}^{K} a_k\, y_{t-k} + \sum_{l=1}^{L} b_l\, x_{t-l} + \epsilon_t \tag{5.1} $$

where $y_t$ is the output of the model (i.e., the forecast) at hour $t$. The first term on the right-hand

side of the formulation represents the auto-regressive part with lagged values of the target variable

($y_{t-k}$) and the corresponding parameters ($a_k$). The second term includes the exogenous variables

($x_{t-l}$) with the parameters ($b_l$). The parameters of the model are determined in the training phase.

$\epsilon_t$ is a normal white noise process with zero mean and finite variance.
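
As a brief illustration of Equation (5.1), the following Python sketch evaluates a one-step ARX point forecast from coefficients that are assumed to be already trained. The function name `arx_forecast`, the use of a single exogenous series and all numerical values are assumptions made for illustration only; they are not the trained models of this work, and the noise term is set to zero.

```python
from typing import Sequence


def arx_forecast(y_hist: Sequence[float], x_hist: Sequence[float],
                 a: Sequence[float], b: Sequence[float]) -> float:
    """One-step ARX point forecast following Eq. (5.1), noise term set to zero.

    y_hist[-k] plays the role of y_{t-k}, x_hist[-l] the role of x_{t-l};
    a[k-1] is a_k and b[l-1] is b_l.
    """
    if len(y_hist) < len(a) or len(x_hist) < len(b):
        raise ValueError("not enough history for the requested lags")
    # Auto-regressive part: sum over k of a_k * y_{t-k}
    ar_part = sum(a_k * y_hist[-k] for k, a_k in enumerate(a, start=1))
    # Exogenous part: sum over l of b_l * x_{t-l}
    ex_part = sum(b_l * x_hist[-l] for l, b_l in enumerate(b, start=1))
    return ar_part + ex_part


# Hypothetical values, not data from this study.
load = [400.0, 420.0, 410.0]   # y_{t-3}, y_{t-2}, y_{t-1}
temp = [10.0, 12.0]            # x_{t-2}, x_{t-1}
forecast = arx_forecast(load, temp, a=[0.5, 0.25], b=[2.0])
```

In practice the coefficients would be estimated from the training samples (for instance by least squares), which this sketch does not cover.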

Despite the simple structure of the ARX model, an effective forecasting strategy, in which the

forecasting model is implemented, could significantly enhance the performance of such models.

For instance, a forecasting strategy may include: i) an efficient pre-processing stage, in which the

most informative inputs are selected, ii) a precise forecasting timeline that runs the online model

at the exact time when all the essential inputs are available, and iii) a proper post-processing stage

to fine-tune the generated forecasts. The forecasting strategies for electricity load, price and wind

power generation are presented as follows.

5.3.1.1 Electricity Load Forecasting

A few research works have highlighted the challenges of electricity load predictions for residential

buildings and microgrids due to high volatility of such loads compared to electric loads of power

systems [13, 76]. To overcome the non-smooth behavior of the microgrid load, an effective pre-

processing stage is very important. In this paper, the feature selection technique presented in [150]

is implemented to select the most informative input features. The candidate inputs consist of

lagged values of the electricity load, temperature, minimum and maximum loads of the previous

day and the same day in the previous week, and the hours corresponding to occurrences of peak

load and ramp-up for the previous day and the same day in the previous week.

The forecasting strategy also includes two separate models: one for weekdays and one for

weekends and holidays. Generating forecasts for different day types from different models

is expected to improve forecasting accuracy, since electricity consumption depends strongly on

the daily load profile. In the training phase, the parameters of the model, defined

in equation (5.1), are determined. Having used two models for load forecasting, the historical data

are divided into two groups of weekdays, and weekends and holidays.

5.3.1.2 Electricity Price Forecasting

When it comes to short-term electricity price forecasting, the intended application of the generated

forecasts becomes critical. Here, the price forecasting tool is applied to the operation scheduling

of a grid-connected microgrid. A microgrid does not need to submit any offers/bids into the

market, and it is mainly treated as a load with local generation. Hence, the operation and energy ar-

bitrage of microgrids are usually through real-time markets for which the predictions for real-time

electricity prices are required. Note that this depends on the structure of the electricity market of

interest; however, real-time settlements are also performed in markets that have day-ahead

settlements, such as California’s electricity market [114].

Moreover, price forecasting methodologies could vary significantly for different electricity

markets with different structures. Thus, a price prediction tool should be tailored specifically for

a particular electricity market. In this paper, we focus on Alberta’s electricity market in Canada.

Alberta’s real-time electricity market is a Balancing Authority (BA) connected to the Western Electricity

Coordinating Council (WECC). The Alberta Electric System Operator (AESO) oversees the

competitive electricity market and operates the power grid. The electricity price, established every

minute, is the System Marginal Price (SMP) and the Pool Price (PP) is the average of SMPs in

every hour. The AESO publishes SMPs, PPs, as well as price forecasts up to 3 hours ahead on its

website [151]. Using the same feature selection technique [150] as applied to load forecasting, the

main drivers for pool price predictions include lagged pool prices, SMPs, AESO’s price predic-

tions for 3 hours ahead, hourly average of historical pool prices for the last week, and historical

and forecast values of the electricity demand within the province.

Having access to such publicly available market data and using the multi-variate linear regres-

sion model presented in (5.1), a forecasting strategy is proposed to predict hourly pool prices of

Alberta’s market using three different forecasting models, denoted M1P, M2P and M3P. In our previous research work [14], we

showed that high-resolution market data such as market clearing prices (e.g., SMP in Alberta’s

market) are of high importance in capturing severe price variations in a real-time market. In this

paper, we take advantage of the first one-minute SMP along with the pool price and the demand

forecasts published by the AESO to generate the pool price forecast for the first hour of the fore-

casting horizon (e.g., 24-hour-ahead). Thus, M1P is specifically used for 1-hour-ahead pool price

prediction. The AESO’s forecasts can also be used as inputs for generating 2-hour-ahead and 3-

hour-ahead pool price forecasts. To do so, M2P is fed by AESO’s pool price forecasts as well as

the demand forecast and the average of pool price over the last week for a given hour of the day.

Finally, lagged pool prices, the average of pool prices for a given hour of the day over the last

week, and demand forecasts are fed into M3P to generate pool price forecasts from 3 to 24 hours

ahead.

This forecasting strategy was used to effectively generate price forecasts for Alberta’s electric-

ity market as evaluated in Section 5.5.

5.3.1.3 Wind Power Forecasting

A microgrid may consist of a few small wind turbines, which are usually less than 50 kW for

residential areas [152]. The aggregate wind power generation is of interest for short-term predic-

tion. Having applied the same feature selection technique, the selected features with the highest

impact on wind power generation are lagged values of wind power and wind speed, and forecast

values of wind speed and temperature and their squared values. The historical and forecast values

of weather data are usually available online on weather station websites. In this paper, we used

the Environment Canada website to gather the weather data [153]. The regression model (5.1) is fed by

these input features to generate hourly wind power forecasts for the next 24 hours.

5.3.2 Probabilistic Forecasting

Deterministic forecasting does not provide any information about uncertainties of the provided

point forecasts. Alternatively, probabilistic forecasting is a way to quantify such prediction uncer-

tainties. In this way, when prediction intervals with a specific confidence level are generated, an

operator can assess the extent to which these results could be trusted. Various state-of-the-art prob-

abilistic forecasting models have been presented in the literature [101]. Presented by Nowotarski

and Weron in [154], Quantile Regression Averaging (QRA) is used as an efficient method for generating

probabilistic forecasts for the electricity load, electricity price and wind power generation in

this paper. QRA provides prediction intervals using point forecasts from different individual de-

terministic models [155]. To do so, individual models can have different forecasting engines (e.g.,

different statistical models and/or neural networks), or the same forecasting model (e.g., a regres-

sion model) can be applied with different sets of input features to generate various point forecasts.

In this paper, we used the latter approach to generate different point forecasts.

To generate prediction intervals using QRA, the following procedure is performed: i) a number

of individual deterministic models are developed, ii) the individual models are trained using a

number of training samples, iii) point forecasts from the individual models are generated, iv) QRA

is trained using the generated point forecasts, and finally v) prediction intervals for the test samples

are generated. Using a set of point forecasts, the QRA model is trained as follows:

$$ Q(q \mid X_t) = X_t \beta_q = x_{1,t}\beta_{1,q} + x_{2,t}\beta_{2,q} + \dots + x_{n,t}\beta_{n,q} \tag{5.2} $$

where $Q(q \mid \cdot)$ is the conditional $q$-th quantile of the target variable, $X_t$ is the vector of $n$ point forecasts

for a given time $t$, and $\beta_q$ is the vector of parameters for quantile $q$. The parameters are

determined by minimizing the loss function over the vector of parameters for a particular $q$-th quantile,

as follows:

$$ \min_{\beta_q} \left[ \sum_{t:\,A_t \ge X_t\beta_q} q\,\lvert A_t - X_t\beta_q \rvert + \sum_{t:\,A_t < X_t\beta_q} (1-q)\,\lvert A_t - X_t\beta_q \rvert \right] \tag{5.3} $$

where $A_t$ is the actual value of the target variable at time $t$. In this process, the parameters of the

QRA model (i.e., $\beta_q$) are determined.
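
To make the loss in Equation (5.3) concrete, the sketch below evaluates it for a given set of quantile predictions. The names `qra_quantile` and `pinball_loss` and the numbers are illustrative assumptions; the sketch only evaluates the objective and does not perform the minimization over the parameter vector.

```python
from typing import Sequence


def qra_quantile(x_t: Sequence[float], beta_q: Sequence[float]) -> float:
    """Quantile estimate X_t * beta_q of Eq. (5.2) for one time step."""
    return sum(x * b for x, b in zip(x_t, beta_q))


def pinball_loss(actuals: Sequence[float], preds: Sequence[float], q: float) -> float:
    """Total quantile loss of Eq. (5.3) for quantile level q."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    for a_t, p_t in zip(actuals, preds):
        err = a_t - p_t
        # Actual above the prediction is weighted by q,
        # actual below the prediction by (1 - q).
        total += q * err if err >= 0 else (1.0 - q) * (-err)
    return total


# Hypothetical values: for q = 0.05, predictions that sit above the actuals
# are penalized much more heavily than predictions that sit below them.
actuals = [10.0, 12.0, 9.0]
loss_below = pinball_loss(actuals, [8.0, 8.0, 8.0], q=0.05)
loss_above = pinball_loss(actuals, [13.0, 13.0, 13.0], q=0.05)
```

The asymmetry is what pushes the fitted line towards the lower tail for small $q$ and towards the upper tail for large $q$.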

In this work, we consider five confidence levels of 50%, 60%, 70%, 80% and 90% to investigate

their impacts on the operation of the microgrid. With a confidence level of 90%, prediction inter-

vals are expected to cover the real values of the target variable with the probability of 90%. To have

a confidence level of 90%, the lower and the upper quantiles should be 5% and 95%, respectively.

Thus, the value of the quantile should be set to 5% ($q = 0.05$) to estimate the vector of parameters

for the lower bound, $\beta_{0.05}$. Likewise, the value of the quantile is set to 95% ($q = 0.95$) to determine

the vector of parameters for the upper bound, $\beta_{0.95}$. A similar process is performed for other

confidence levels. In addition, similar to the point forecasting models, the prediction intervals are

also updated every hour using the rolling horizon framework.
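
The mapping from a nominal confidence level to the pair of quantiles described above can be sketched as follows. The helper name `quantile_pair` is an assumption for illustration; it assumes a central interval with equal tail probabilities, as in the 5%/95% example for the 90% level.

```python
from typing import Tuple


def quantile_pair(confidence_level: float) -> Tuple[float, float]:
    """Lower and upper quantile levels of a central prediction interval."""
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence level must lie strictly between 0 and 1")
    tail = (1.0 - confidence_level) / 2.0   # probability left in each tail
    return tail, 1.0 - tail


# The five confidence levels considered in this work.
bounds = {cl: quantile_pair(cl) for cl in (0.5, 0.6, 0.7, 0.8, 0.9)}
lower_90, upper_90 = bounds[0.9]
```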

5.4 Optimization Platform

In this section, after a brief overview of microgrids, deterministic and robust optimization platforms

are presented for the operation of a grid-connected microgrid.

5.4.1 Microgrid Overview

A grid-connected microgrid with different components is considered in this paper. The microgrid

includes an electric load, a number of Diesel Generators (DGs), a Battery Energy Storage System

(BESS), as well as wind power generation from a few small wind turbines. The microgrid is

connected to the main grid, and thus it can purchase shortage power from the grid and sell excess

power to the grid as well. The objective is to schedule the units of the microgrid such that the

total cost of serving the microgrid load is minimized. An Energy Management System (EMS)

is the main microgrid controller that sends directives to the dispatchable units. The optimization

platform could be embedded in the EMS. In this way, the required forecasts of the load, price, and

wind power are fed into the optimization problem, and the outputs are schedules of the units.

5.4.2 Deterministic Approach

For the microgrid with components mentioned above, a deterministic operation scheduling prob-

lem of the microgrid is formulated in this section. Deterministic scheduling is one way to schedule

the operation of a microgrid, in which a set of point forecasts (e.g., load, price and wind power) is

fed to the optimization problem, while their associated uncertainties are not considered [141, 142].

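$$ \min_{\Psi} \sum_{t=H}^{N_T} \left[ \hat{\lambda}_t P_t^{Grid} + \sum_{i=1}^{N_G} \left( FC_{i,t}(P_{i,t}^{Gen}) + SUC_{i,t} \right) \right] \tag{5.4} $$

$$ FC_{i,t}(P_{i,t}^{Gen}) = a_i u_{i,t} + b_i P_{i,t}^{Gen} \tag{5.5} $$

s.t.

$$ SUC_{i,t} \ge SC_i \,(u_{i,t} - u_{i,t-1}) \tag{5.6} $$

$$ SUC_{i,t} \ge 0 \tag{5.7} $$

$$ P_t^{Grid} + \sum_{i=1}^{N_G} P_{i,t}^{Gen} + \hat{P}_t^{W} + \eta_B P_t^{B} = \hat{P}_t^{L} \tag{5.8} $$

$$ u_{i,t} P_i^{Gen,min} \le P_{i,t}^{Gen} \le u_{i,t} P_i^{Gen,max} \tag{5.9} $$

$$ u_{i,t} \in \{0, 1\} \tag{5.10} $$

$$ P_t^{B} = P_t^{B,d} - P_t^{B,c} \tag{5.11} $$

$$ E_t^{B} = E_{ini}^{B} + \sum_{k=1}^{t} \left( \eta_B P_k^{B,c} - P_k^{B,d} \right) \tag{5.12} $$

$$ 0 \le E_t^{B} \le E^{B,max} \tag{5.13} $$

$$ 0 \le P_t^{B,c} \le P^{B,max} \tag{5.14} $$

$$ 0 \le P_t^{B,d} \le P^{B,max} \tag{5.15} $$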

In the formulations above, the objective is to minimize the cost of operating the microgrid up

to NT hours into the future, e.g., 24-hour-ahead, shown in Equation (5.4). The total cost includes

the fuel costs and start-up costs of generators, plus the cost of power purchased from the main grid. The

fuel cost, shown in Equation (5.5), is a function of the power generated by the generator. The

objective function is subject to a set of operational constraints. The start-up cost is formulated

using the state of the generator, presented in Equations (5.6) and (5.7). Equation (5.8) shows the

power balance constraint, in which the total generation should be equal to the electric load at each

time. $P_t^{Grid}$ is positive when the microgrid purchases energy from the grid, while it is negative

when the microgrid sells the excess power to the grid. In other words, when $P_t^{Grid}$ is negative,

the grid is considered as a load that is supplied by the local generation in the microgrid. Equa-

tions (5.9) and (5.10) show the operation limits of diesel generators. It should also be noted that

ramp up/down constraints of diesel generators are ignored in this study due to fast ramping rates

of small-scale diesel generators. Equations (5.11)-(5.15) also represent the operational constraints

of the BESS. The outputs of this optimization problem are the optimal values for the set of variables,

i.e., $\Psi = \{P_{i,t}^{Gen}, u_{i,t}, SUC_{i,t}, P_t^{Grid}, P_t^{B}, E_t^{B}\}$. In other words, the solution of the optimization

problem is the schedules of the dispatchable/controllable units in the microgrid. The rolling

horizon framework updates the forecasts at each forecasting origin, i.e., H, and consequently the

schedules every hour to ensure the most economical operation of the microgrid in an operating

day.
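
The individual terms of the deterministic formulation can be evaluated for a candidate schedule as in the sketch below. It does not solve the mixed-integer problem; it only computes the fuel cost (5.5), the smallest start-up cost allowed by (5.6)-(5.7), the grid exchange implied by the balance (5.8) and the stored energy (5.12). All function names and numerical values, including the price in $/kWh, are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def fuel_cost(a_i: float, b_i: float, u: int, p_gen: float) -> float:
    """Fuel cost of one generator in one hour, Eq. (5.5)."""
    return a_i * u + b_i * p_gen


def startup_cost(sc_i: float, u_now: int, u_prev: int) -> float:
    """Smallest SUC satisfying Eqs. (5.6)-(5.7); a cost minimizer settles on this value."""
    return max(0.0, sc_i * (u_now - u_prev))


def grid_power(load_hat: float, wind_hat: float, p_gens: Sequence[float],
               p_batt: float, eta_b: float) -> float:
    """Grid exchange implied by Eq. (5.8); positive means purchase from the grid."""
    return load_hat - wind_hat - sum(p_gens) - eta_b * p_batt


def battery_energy(e_ini: float, eta_b: float, charge: Sequence[float],
                   discharge: Sequence[float]) -> List[float]:
    """Stored energy after each hour, Eq. (5.12)."""
    levels, energy = [], e_ini
    for p_c, p_d in zip(charge, discharge):
        energy += eta_b * p_c - p_d
        levels.append(energy)
    return levels


# Hypothetical one-hour example (kW, kWh, $); not data from this study.
p_grid = grid_power(load_hat=800.0, wind_hat=60.0, p_gens=[300.0, 0.0],
                    p_batt=50.0, eta_b=0.8)
hour_cost = 0.02 * p_grid + fuel_cost(5.0, 0.25, 1, 300.0) + startup_cost(20.0, 1, 0)
soc = battery_energy(e_ini=200.0, eta_b=0.8, charge=[100.0, 0.0], discharge=[0.0, 50.0])
```

A solver-based implementation would treat these quantities as decision variables and constraints rather than computing them directly.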

5.4.3 Probabilistic Approach

Robust optimization is an interval-based approach in which a confidence gap is defined around

the forecast parameters [156]. The worst-case realization of the uncertainties is then evaluated

for the decision-making process. Therefore, RO can be fed by prediction intervals with different

confidence levels for the operation scheduling of the microgrid. The RO formulation is presented

as a min-max optimization problem as follows:

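$$ \min_{\Psi} \max_{\Omega} \sum_{t=H}^{N_T} \left[ \lambda_t P_t^{Grid} + \sum_{i=1}^{N_G} \left( FC_{i,t}(P_{i,t}^{Gen}) + SUC_{i,t} \right) \right] \tag{5.16} $$

s.t.

$$ P_t^{Grid} + \sum_{i=1}^{N_G} P_{i,t}^{Gen} + P_t^{W} + \eta_B P_t^{B} = P_t^{L} \tag{5.17} $$

$$ \lambda_t^{Low} \le \lambda_t \le \lambda_t^{Up} \tag{5.18} $$

$$ P_t^{L,Low} \le P_t^{L} \le P_t^{L,Up} \tag{5.19} $$

$$ P_t^{W,Low} \le P_t^{W} \le P_t^{W,Up} \tag{5.20} $$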

Here, the uncertain parameters in the objective function (Equation (5.16)) and the power balance

constraint (Equation (5.17)) are $\Omega = \{\lambda_t, P_t^{L}, P_t^{W}\}$. The objective is to minimize the cost

with respect to Ψ, the set of decision variables. Meanwhile the cost is maximized with respect

to the set of uncertain parameters, i.e., Ω, in order to reduce the risk of uncertain parameters on

decision making. Observe from Equations (5.18) - (5.20) that uncertainties vary in specific inter-

vals. These are prediction intervals provided by the probabilistic forecasting models. The robust

optimization evaluates the worst case of uncertainties and guarantees a level of cost. Thus, the

actual cost of operating the microgrid would be lower than the guaranteed level if the actual values

of those variables fall into the provided prediction intervals.

To solve this min-max optimization problem using commercial solvers, it must first be recast

as a simple minimization problem. It is observed from Equations (5.16)-(5.20) that the optimization

problem is linear with respect to the uncertain parameters. Therefore, the worst case is

attained at the lower or upper bounds of these uncertain parameters. The worst-case bounds of the

electricity load and the non-dispatchable generation can be easily determined. A

higher load and a lower non-dispatchable generation (i.e., wind) would result in a higher operation

cost. Therefore, the worst-case solution would be obtained when the non-dispatchable generation

is at its lower uncertainty bound (i.e., $P_t^{W,Low}$) and the load is at its upper uncertainty bound (i.e.,

$P_t^{L,Up}$).

$$ P_t^{Grid} + \sum_{i=1}^{N_G} P_{i,t}^{Gen} + P_t^{W,Low} + \eta_B P_t^{B} = P_t^{L,Up} \tag{5.21} $$

Hence, the power balance constraint is accordingly replaced with (5.21), where the upper bound of

load and the lower bound of wind power generation prediction intervals replace their corresponding

uncertain parameters.

However, it is not as easy to determine which bound of the uncertain parameter electricity

price would lead to the worst case. This is because it could be the upper bound in some cases

(e.g., when buying energy from the grid) or the lower bound in other cases (e.g., when injecting

power to the grid). Therefore, a binary variable is first defined, denoted by $u_t^{Grid}$, in order to define

the state of the grid’s power. Thus, $u_t^{Grid} = 1$ when the microgrid buys energy from the grid

($P_t^{Grid} > 0$), while $u_t^{Grid} = 0$ when the microgrid injects power to the grid ($P_t^{Grid} < 0$). This has

been formulated in Equations (5.22) and (5.23). Using $u_t^{Grid}$ and a big enough positive number,

denoted by $M$, we can determine the worst case of the uncertain parameter $\lambda_t$. This method is also

known as the Big M method [156]. $B_t$ is a replacement variable for the nonlinear term $\lambda_t P_t^{Grid}$ in

the objective function. The following equations determine the worst case for the electricity price

as the uncertain parameter.

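$$ u_t^{Grid} \in \{0, 1\} \tag{5.22} $$

$$ P_t^{Grid} \le u_t^{Grid} M \tag{5.23} $$

$$ B_t - M \le \lambda_t^{Up} P_t^{Grid} - u_t^{Grid} M \tag{5.24} $$

$$ B_t + M \ge \lambda_t^{Up} P_t^{Grid} + u_t^{Grid} M \tag{5.25} $$

$$ B_t - M \le \lambda_t^{Low} P_t^{Grid} - (1 - u_t^{Grid}) M \tag{5.26} $$

$$ B_t + M \ge \lambda_t^{Low} P_t^{Grid} + (1 - u_t^{Grid}) M \tag{5.27} $$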

Equations (5.24) and (5.25) ensure that the uncertain parameter $\lambda_t$ is set to $\lambda_t^{Up}$ when the microgrid

purchases energy from the grid, i.e., the worst case. Similarly, Equations (5.26) and (5.27)

make sure that the uncertain parameter is set to $\lambda_t^{Low}$ when the microgrid sells energy to the grid,

i.e., the worst case. Hence, the final objective function can be re-formulated as follows:
$$ \min_{\Psi} \sum_{t=H}^{N_T} \left[ B_t + \sum_{i=1}^{N_G} \left( FC_{i,t}(P_{i,t}^{Gen}) + SUC_{i,t} \right) \right] \tag{5.28} $$

where the inner maximization has been eliminated. Therefore, a simple minimization

problem is formulated in a manner that can be input to any commercial solver. The final objective

function presented in (5.28) is subject to constraints (5.6)-(5.7), (5.9)-(5.15) and (5.21)-(5.27).

With this formulation, the cost of the operation of the microgrid is minimized when the worst case

of uncertain parameters occurs. This suggests that the cost of operation would be lower if the

actual values of uncertain parameters fall within their bounds.
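
The effect of the Big M constraints can be checked numerically. The sketch below computes, for a given grid exchange and grid state, the interval that Equations (5.24)-(5.27) leave for the replacement variable, and shows that it collapses to the upper-bound price times the exchange when buying and to the lower-bound price times the exchange when selling. The function names and numbers are illustrative assumptions, and the value of M is assumed to be much larger than any price-times-power product.

```python
from typing import Tuple


def worst_case_grid_cost(p_grid: float, price_low: float, price_up: float) -> float:
    """Worst-case grid term: buy at the upper price bound, sell at the lower one."""
    price = price_up if p_grid > 0 else price_low
    return price * p_grid


def big_m_interval(p_grid: float, u_grid: int, price_low: float,
                   price_up: float, big_m: float) -> Tuple[float, float]:
    """Interval [lo, hi] that Eqs. (5.24)-(5.27) allow for B_t."""
    hi = min(price_up * p_grid - u_grid * big_m + big_m,            # (5.24)
             price_low * p_grid - (1 - u_grid) * big_m + big_m)     # (5.26)
    lo = max(price_up * p_grid + u_grid * big_m - big_m,            # (5.25)
             price_low * p_grid + (1 - u_grid) * big_m - big_m)     # (5.27)
    return lo, hi


# Hypothetical bounds and exchanges; not data from this study.
M = 1.0e6
buy = big_m_interval(p_grid=100.0, u_grid=1, price_low=10.0, price_up=30.0, big_m=M)
sell = big_m_interval(p_grid=-100.0, u_grid=0, price_low=10.0, price_up=30.0, big_m=M)
```

With these numbers both intervals are single points, so the constraints fix the replacement variable rather than merely bounding it.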

5.5 Numerical Results

In this study, the electricity consumption of a building at the University of Calgary, Calgary,

Alberta, Canada, is considered as the microgrid load. The electricity load data of this campus

building cover January 2016 to December 2016, with a peak value of 968 kW. For the electricity

price, the data of pool prices from Alberta’s electricity market is considered for the year 2016. In

addition, the hourly data of Taber wind farm in Alberta, Canada, in year 2015 is used for generating

wind power forecasts. The wind power data is scaled down to 100 kW to be consistent with a

normal residential capacity. The battery energy storage system has the capacity of 200 kW/400

kWh with the efficiency of 80%. In addition, two diesel generators are considered within the

microgrid, with capacities of 500 kW for generator #1 and 600 kW for generator #2.

5.5.1 Statistical Analysis

In this section, a brief statistical evaluation is provided for generated point and interval forecasts to

show their accuracy. The Mean Absolute Error (MAE) is used to evaluate

the point forecasting errors for each month, defined as:


$$ MAE = \frac{1}{N_h} \sum_{t=1}^{N_h} \lvert ACT(t) - FOR(t) \rvert \tag{5.29} $$

where $N_h$ is the number of hours in the test month, and $ACT(t)$ and $FOR(t)$ are the actual

and forecast values at hour t, respectively. For the electricity load and wind power generation,

the normalized MAE is expressed as a percentage of the peak load and wind power capacity,

respectively. Table 5.1 shows the results of point forecasting errors for 1-hour-ahead predictions for

electricity load, electricity price and wind power generation in terms of MAE for 12 test months.

In addition, the generated price forecasts are compared with forecasts published by the AESO

as a benchmark to show the accuracy of the developed methodology. The average error for

the electricity load is 9.9 kW, which is only 1% of the peak load. The wind power forecasting error

averaged 6.77 kW, which accounts for 6.77% of the total wind capacity. These figures demonstrate a

satisfactory performance for load and wind power prediction models.

Table 5.1: Errors of point forecasts in terms of MAE

Test     Load    Wind    Price ($/MWh)
Month    (%)     (%)     AESO    Proposed
Jan.     0.98    7.01    2.93    1.98
Feb.     0.89    6.89    0.53    0.52
Mar.     0.97    7.20    0.39    0.38
Apr.     0.94    7.52    0.45    0.33
May      1.07    5.91    1.06    0.66
Jun.     0.90    5.63    0.53    0.46
Jul.     0.88    5.42    1.10    0.99
Aug.     0.94    6.50    2.42    1.39
Sep.     1.02    6.62    0.78    0.67
Oct.     1.15    7.45    2.25    1.66
Nov.     1.18    7.82    0.51    0.61
Dec.     1.38    7.38    1.71    1.39
Avg.     1.02    6.77    1.22    0.92

The last two columns show

the effectiveness of the developed price forecasting model for Alberta’s electricity market. As

shown, the average of the error is $1.22/MWh for the price forecasts generated by the AESO, while

the developed model resulted in an MAE of $0.92/MWh. In other words, there is a 25% improvement in

forecasting accuracy using the proposed model.
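
The error measure of Equation (5.29), its normalized form and the relative improvement quoted above can be reproduced with a few lines of Python. The function names and the short series are illustrative assumptions; only the two average price errors (1.22 and 0.92 $/MWh) are taken from Table 5.1.

```python
from typing import Sequence


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean Absolute Error, Eq. (5.29)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def normalized_mae(actual: Sequence[float], forecast: Sequence[float], base: float) -> float:
    """MAE as a percentage of a base value (peak load or wind capacity)."""
    return 100.0 * mae(actual, forecast) / base


def relative_improvement(benchmark_error: float, proposed_error: float) -> float:
    """Percentage reduction of the error with respect to a benchmark."""
    return 100.0 * (benchmark_error - proposed_error) / benchmark_error


# Hypothetical hourly values (kW) for the MAE; peak load of 968 kW as in this study.
err_kw = mae([100.0, 110.0, 90.0], [98.0, 113.0, 95.0])
err_pct = normalized_mae([100.0, 110.0, 90.0], [98.0, 113.0, 95.0], base=968.0)
price_gain = relative_improvement(1.22, 0.92)   # averages from Table 5.1
```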

Two error measures are also used for evaluating the probabilistic forecasts: Prediction Interval

Coverage Probability (PICP) and Mean Prediction Interval Width (MPIW) [98], defined as follows:


$$ PICP = \frac{1}{N_h} \sum_{t=1}^{N_h} c_t; \qquad c_t = \begin{cases} 1, & ACT(t) \in [L_t, U_t] \\ 0, & \text{otherwise} \end{cases} \tag{5.30} $$

$$ MPIW = \frac{1}{N_h} \sum_{t=1}^{N_h} (U_t - L_t) \tag{5.31} $$

where $U_t$ and $L_t$ are upper and lower bounds of the prediction interval for hour $t$, respectively.

Table 5.2 displays the yearly average of errors for probabilistic forecasts of the electricity load,

price and wind power for different Confidence Levels (CL). PICP is an indication of the proportion

of the time that generated prediction intervals contain the actual values of interest.

Table 5.2: Errors of probabilistic forecasts

         Load              Wind              Price
CL       PICP    MPIW      PICP    MPIW      PICP    MPIW
         (%)     (kW)      (%)     (kW)      (%)     ($/MWh)
50%      50.2    14.5      48.7    8.7       52.1    0.62
60%      60.1    18.9      57.9    11.5      61.6    0.95
70%      70.2    23.8      69.5    15.0      71.5    1.46
80%      80.5    31.1      78.4    20.8      80.6    2.21
90%      90.3    42.2      88.9    30.1      90.5    3.48

It is observed

that the coverage probabilities for the different confidence levels are slightly above the nominal

probabilities for electricity load and price. Although PICP values for wind power are slightly

below the nominal probability, the results show an acceptable level of coverage for all confidence

levels. A satisfactorily large PICP can be easily achieved by widening prediction intervals from

either side. However, such intervals are too conservative and less useful in practice, as they do not

show the variation of the targets. Therefore, MPIW is used as a complementary measure to show how wide

the intervals are. This table clearly demonstrates that MPIW increases as the nominal confidence

level increases. However, even for 90% nominal confidence level, the average width of generated

prediction intervals is very small for the electricity load, price and wind power. For instance, the

MPIW for the electricity load prediction with 90% confidence level is 42.2 kW, which is only about 4.4% of

the peak load. The lower the average width of the prediction intervals, the higher the quality of the

generated probabilistic forecasts. It should also be noted that confidence levels less than 70% are

not usually considered in practical applications. The inclusion of 50% and 60% confidence levels

is only to investigate how they would impact the performance of the microgrid from an economic

perspective.
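
The two interval measures of Equations (5.30) and (5.31) can be computed as in the following sketch. The function names and the four-hour series are assumptions for illustration and are not results of this study.

```python
from typing import Sequence


def picp(actual: Sequence[float], lower: Sequence[float], upper: Sequence[float]) -> float:
    """Prediction Interval Coverage Probability in percent, Eq. (5.30)."""
    hits = sum(1 for a, lo, up in zip(actual, lower, upper) if lo <= a <= up)
    return 100.0 * hits / len(actual)


def mpiw(lower: Sequence[float], upper: Sequence[float]) -> float:
    """Mean Prediction Interval Width, Eq. (5.31)."""
    return sum(up - lo for lo, up in zip(lower, upper)) / len(lower)


# Hypothetical four-hour example: two of the four actual values are covered.
actual = [5.0, 7.0, 9.0, 11.0]
lower = [4.0, 8.0, 8.0, 9.0]
upper = [6.0, 10.0, 10.0, 10.0]
coverage = picp(actual, lower, upper)
width = mpiw(lower, upper)
```

Reporting both numbers together reflects the trade-off discussed above: coverage can always be raised by widening the intervals, at the expense of a larger width.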

5.5.2 Economic Analysis

In this section, the performance of the generated forecasts is evaluated from an economic point of

view. To do so, the point forecasts are fed into the deterministic optimization problem presented

in section 5.4.2, and the prediction intervals are used as inputs to the RO problem introduced in

section 5.4.3. The optimization problem schedules the units for the operating day and the cost of

operation is calculated considering actual values of electricity load, price and wind power gener-

ation. The objective is to find which approach leads to better economics of the microgrid under

different scenarios.
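
The evaluation loop described above, in which schedules are computed from forecasts and then costed against actual values, can be outlined as follows for the rolling horizon case. This is a minimal sketch under stated assumptions: the three callables `make_forecast`, `optimize` and `realized_cost` are hypothetical placeholders for the forecasting models, the optimization problem and the settlement against actual data, and hours are indexed from zero.

```python
from typing import Callable, Dict, List

Forecast = Dict[str, List[float]]   # e.g. {"load": [...], "price": [...], "wind": [...]}
Schedule = Dict[str, float]         # set-points for the first hour of the horizon


def rolling_horizon_cost(
    n_hours: int,
    make_forecast: Callable[[int], Forecast],
    optimize: Callable[[Forecast, int], Schedule],
    realized_cost: Callable[[Schedule, int], float],
) -> float:
    """Accumulate the realized cost when forecasts and schedules are refreshed every hour."""
    total = 0.0
    for origin in range(n_hours):
        forecast = make_forecast(origin)           # forecasts for the remaining hours
        schedule = optimize(forecast, origin)      # re-solve the remaining horizon
        total += realized_cost(schedule, origin)   # only the first hour is implemented
    return total


# Toy stand-ins (hypothetical): the "optimizer" simply buys the forecast load.
actual_price = [10.0, 20.0, 30.0]
load_forecast = [1.0, 2.0, 3.0]

total = rolling_horizon_cost(
    n_hours=3,
    make_forecast=lambda h: {"load": load_forecast[h:]},
    optimize=lambda fc, h: {"p_grid": fc["load"][0]},
    realized_cost=lambda sch, h: actual_price[h] * sch["p_grid"],
)
```

A day-ahead evaluation would correspond to calling the forecasting and optimization steps once for the whole day and costing every hour of that single schedule.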

In the first experiment, a scenario is considered with the electricity price of Alberta’s market

in 2016 along with the wind power generation capacity of 100 kW. This wind capacity is almost

10% of the peak load and hence, it is considered a low wind profile in this study. Hourly price

in Alberta’s market averaged $18.28/MWh, which was a remarkable decrease of 45% from 2015.

Even the average pool price during the peak period (i.e., 7 a.m. to 11 p.m.) was only $19.73/MWh

[157]. Thus, this shows a low, smooth price profile for the first scenario. Also, given the low ratio

of total wind power generation to the peak load (i.e., almost 10%), the first scenario investigates

the performance of two approaches with low price and wind profiles.

The total cost of operating the microgrid over a year considering different approaches is shown

in Fig. 5.1. With perfect forecasts, the operation cost of the microgrid could be

as low as $71,215 over the year. While the day-ahead deterministic scheduling resulted in the

total operation cost of $102,327, the RH approach could reduce the cost to $90,403 (i.e., roughly 11.7%

improvement). As seen, the day-ahead probabilistic approach could slightly decrease the total cost

with 50% and 60% confidence levels, whereas any further increase in the confidence level led

to higher operation costs. Moreover, applying both approaches together with all confidence levels

could not outperform the deterministic RH strategy. Due to smooth wind and price profiles, the

point forecasts have high accuracy and thus, applying robust scheduling leads to an unnecessarily

conservative schedule that imposes higher costs on the microgrid.

In the next experiment, a high wind profile with the same price profile is studied to investigate

the impact of high wind power generation. To create a high wind scenario, the wind capacity is

increased to 500 kW, which is approximately 50% of the peak load. As seen from Fig. 5.2, higher

wind power capacity clearly resulted in lower operation cost overall compared to the first scenario.

[Figure 5.1: Costs for Scenario #1: low price and low wind. Vertical axis: Cost (thousand $); horizontal axis: Deterministic and Probabilistic (50%, 60%, 70%, 80%, 90%); series: RH and DA.]

Here, perfect forecasts could lower the total cost to $68,079. Observe that with high wind level,

the combination of RH and probabilistic methods for 50% and 60% confidence levels would lead

to the lowest possible operation cost. Therefore, in case of a smooth price profile, the uncertainty

related to high wind generation can be mitigated with robust scheduling in a more effective way.

The higher the confidence level of interest, the higher the total cost could be. This is because higher

confidence levels lead to more conservative scheduling and consequently higher costs. On the other

hand, low confidence levels might not be very attractive to operators. The proper confidence level

needs to be determined based on the level of wind power penetration in such a scenario.

In the final experiment, the effect of volatile, non-smooth electricity prices on the operation of

the microgrid is assessed. To do so, the price profile of the same market in year 2013 is considered.

Alberta’s market had an average pool price of $80.19/MWh over 2013 [157]. The peak prices

averaged $106.13/MWh, including numerous price spikes (with the price cap of $999.99/MWh) in

2013, which indicates a very high electricity price profile. In terms of volatility, the root mean square

of the price was $187.5/MWh in 2013 compared to $22.2/MWh in 2016. This clearly shows a

volatile price profile in year 2013.

[Figure 5.2: Costs for Scenario #2: low price and high wind. Vertical axis: Cost (thousand $); horizontal axis: Deterministic and Probabilistic (50%, 60%, 70%, 80%, 90%); series: RH and DA.]

Considering a low wind generation profile along with high

prices, Fig. 5.3 demonstrates the operation costs for different strategies. As shown in this figure,

there is a significant cost reduction in the operation of the microgrid due to high potential for energy

arbitrage with the main grid. When the electricity price is low, the microgrid purchases energy from

the grid and also charges the BESS. As the price increases, the excess power generated within the

microgrid is injected into the grid to make a profit. Interestingly, given perfect forecasts, the

total annual cost of the microgrid is -$6,141; in fact, the microgrid makes a profit over the year.

The deterministic RH approach resulted in the lowest operation cost with 92% improvement

over the DA deterministic method. This is mainly due to high capability of the price forecast-

ing model with rolling horizon framework in capturing high prices for timely energy arbitrage.

Although the DA probabilistic method with 50%, 60% and 70% confidence levels and RH proba-

bilistic method with all confidence levels could improve the deterministic DA approach, they failed

to outperform deterministic RH strategy. This is because the RO acts conservatively and might not

take advantage of high price hours. Hence, the deterministic RH approach can significantly enhance the economics

of microgrids when electricity prices contain severe variations and spikes.

[Figure 5.3: Costs for Scenario #3: high price and low wind. Vertical axis: Cost (thousand $); horizontal axis: Deterministic and Probabilistic (50%, 60%, 70%, 80%, 90%); series: RH and DA.]

5.6 Conclusions

In this paper, the operation of a grid-connected microgrid is studied using both deterministic and

probabilistic strategies. Point and interval forecasts are generated for the electricity load, price and

wind power generation. The generated forecasts are fed into an optimization platform to operate

the microgrid with the lowest possible cost. Two strategies to mitigate the uncertainties related

to forecasts, i.e., rolling horizon technique and prediction intervals, are implemented, and their

impact on the economic performance of microgrids is investigated using different scenarios. It

is concluded that the level of wind power generation and the volatility of the electricity price are

the main drivers to select the effective approach for the operation of a microgrid. The numerical

experiments showed that prediction intervals with robust optimization can improve the economics

of the microgrid when wind power generation is considerable with respect to the peak load. How-

ever, when the electricity price is highly volatile, the deterministic rolling horizon strategy is the

most economical way to operate the microgrid, while RO might not fully take advantage of high

price hours and their potential energy arbitrage opportunities.
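The difference between a day-ahead schedule and a rolling horizon schedule can be sketched as follows. This is a simplified illustration, not the platform implemented in this chapter: the price series, the `updated_forecast` function and the threshold rule standing in for the optimizer are all hypothetical.

```python
def day_ahead_schedule(forecast, optimize_fn):
    """Optimize once on the day-ahead forecast and keep the whole plan."""
    return optimize_fn(forecast)


def rolling_horizon_schedule(forecast_fn, optimize_fn, n_steps):
    """Re-forecast and re-optimize every step; commit only the first decision."""
    applied = []
    for t in range(n_steps):
        forecast = forecast_fn(t)     # forecast for steps t, t+1, ..., n_steps-1
        plan = optimize_fn(forecast)  # plan over the remaining horizon
        applied.append(plan[0])       # only the decision for step t is applied
    return applied


# Toy stand-ins (hypothetical, for illustration only).
ACTUAL = [20.0, 25.0, 150.0, 30.0]    # realized prices, $/MWh
DAY_AHEAD = [20.0, 25.0, 40.0, 30.0]  # day-ahead forecast misses the spike


def updated_forecast(t):
    # Assume the current step is known accurately; later steps keep the old forecast.
    return [ACTUAL[t]] + DAY_AHEAD[t + 1:]


def threshold_policy(forecast, sell_above=100.0):
    # Stand-in for the optimizer: sell when the forecast price exceeds a threshold.
    return ["sell" if p > sell_above else "hold" for p in forecast]


print(day_ahead_schedule(DAY_AHEAD, threshold_policy))
# ['hold', 'hold', 'hold', 'hold']
print(rolling_horizon_schedule(updated_forecast, threshold_policy, len(ACTUAL)))
# ['hold', 'hold', 'sell', 'hold']
```

In this toy case the day-ahead plan never sells because its forecast misses the spike, whereas the rolling horizon plan reacts once the updated forecast reveals it.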

Chapter 6

Conclusions

In this thesis, short-term forecasting tools are developed for the operation of microgrids. The

three sets of forecasts required for this purpose are i) the power generation from renewable energy

resources within the microgrid, ii) the electricity load of the microgrid, and iii) the price of the elec-

tricity market in which the microgrid is located and operated. Therefore, a wind power forecasting

model is developed to provide predictions for short-term wind power generation. Then, the special

characteristics of electricity loads in microgrids are analyzed and compared to loads in power sys-

tems. A prediction methodology is accordingly built to accommodate the volatile behavior of loads

at microgrid/building levels. Further, a forecasting model, capable of capturing severe variations,

is proposed for the electricity market price to enable the microgrid to trade energy with the main

grid in the most economical way. To operate the microgrid, forecasts are fed into an optimization

algorithm that provides the operational schedules of the dispatchable resources in the microgrid.

In the last stage of this thesis, different optimization platforms for the operation of microgrids are

implemented. Then, the impacts of different uncertainty mitigation approaches on the economic

performance of microgrids under different scenarios are investigated.

The overall significance of this thesis is that it focuses on supportive tools for the development

of microgrids. The outcome of this thesis could help the microgrid operators to operate their

systems in a more reliable, efficient and economical way. The detailed conclusions of each chapter

of this thesis are summarized in the following section.

6.1 Concluding Remarks

In Chapter 2, a prediction method is developed for wind power generation. The model is based

on an artificial neural network trained by an efficient heuristic algorithm. The performance of

the proposed wind power forecasting model as well as its main components, e.g., the training

procedure, is extensively evaluated by real-world wind power data. The statistical performance of

the proposed model is compared with that of existing wind power forecasts for Alberta’s power

system from a third-party company. The results show the satisfactory performance of the proposed

model for short-term horizons. The contribution of this chapter is the development of an efficient

forecasting model that can provide accurate short-term wind power predictions for aggregated or

individual wind farms. The significance of this work is the application of wind power forecasting

in the optimal operation of a microgrid with wind energy resources.

Chapter 3 proposes a short-term load forecasting model for the operation of microgrids.

Considering the volatile and non-smooth characteristics of load time series of microgrids compared with

power systems’ electricity loads, the proposed forecasting method aims to deal with such chal-

lenges. The model has the structure of a state-of-the-art neural network-based forecasting engine,

i.e., Self-recurrent Wavelet Neural Network (SRWNN), capable of capturing nonlinear complexi-

ties of volatile time series. The Levenberg-Marquardt learning algorithm is implemented to train

the forecasting engine. The effectiveness of the proposed forecasting model is demonstrated using

real-world load data of a microgrid and two power systems. The results show that the proposed

model leads to more accurate forecasts when the prediction of a volatile time series, i.e., microgrid

loads, is of interest. Thus the main contribution of this chapter is the development of a forecasting

model that can capture non-smooth behavior of electricity loads in microgrids. The significance

of this work is that accurate load forecasts can enhance the energy management of both renewable

and conventional resources in the microgrid and also improve the economics of energy trades with

electricity markets.

In Chapter 4, a forecasting strategy is proposed to provide accurate price forecasts for the oper-

ation of behind-the-meter storage systems within microgrids. This strategy includes two separate

forecasting models to take advantage of high-resolution market data along with hourly data in or-

der to capture price spikes as much as possible. Moreover, the forecasting models are embedded

in an intra-hour rolling horizon framework in order to update the forecasts on an hourly basis.

Using real-world price data from Ontario’s electricity market, the proposed strategy is evaluated

from both statistical and economic perspectives. From statistical analysis, the proposed strategy

results in a 20% improvement in forecast accuracy compared to available pre-dispatch prices from

the system operator, and has a high capability of detecting price spikes. For economic assessments,

the generated forecasts are fed to an optimization platform for the operation scheduling of a battery

energy storage system within a microgrid facility. It is concluded that the storage system

can bring better economic return using the price predictions generated by the proposed forecasting

strategy, 62% of the potential saving, in comparison with a number of other strategies, e.g., 43%

of the potential saving using available pre-dispatch prices. Hence, the contribution of this chapter

is to develop a price forecasting strategy that can efficiently capture severe variations in electricity

prices, e.g., price spikes. The significance of this work is that the detection of high electricity

prices can enhance the economics of grid-connected microgrids by timely energy arbitrage with

the main grid.

Chapter 5 summarizes the findings of an investigation into alternative approaches for mitigating

the impact of forecast errors in the operation of microgrids. The operation of a grid-connected

microgrid is considered using both deterministic and probabilistic strategies. Using real-world

data, point and interval forecasts are first generated for the electricity load, price and wind power

generation. The generated forecasts are then fed into an optimization platform to operate the mi-

crogrid with the lowest possible cost. The rolling horizon technique and prediction intervals, as

the two strategies to mitigate the uncertainties related to forecasts, are evaluated and compared

using different scenarios. It is concluded that the level of wind power generation and the volatility

of the electricity price are the main drivers to select the effective approach for the operation of a

microgrid. The numerical experiments show that prediction intervals with robust optimization can

improve the economics of the microgrid when wind power generation is considerable with respect

to the peak load. However, when the electricity price is highly volatile, the deterministic rolling

horizon strategy is the most economical way to operate the microgrid, while robust optimization

might not fully take advantage of high price hours and their potential energy arbitrage opportuni-

ties. Thus, the contribution of this chapter is to explore the most efficient optimization platform

for the operation of a microgrid given a set of deterministic and probabilistic forecasts. The signif-

icance of this work is that it helps the operator to adopt the most economical approach to operate

the microgrid under different scenarios of wind integration levels and market conditions.
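The conservative behaviour of robust optimization noted above can be illustrated with a small Python sketch. The robust formulation of Chapter 5 is not reproduced here; the sketch assumes simple box uncertainty, in which each quantity may lie anywhere inside its prediction interval, and all function names and numbers are hypothetical.

```python
def worst_case_net_load(load_interval, wind_interval):
    """Largest net load (load minus wind) consistent with both intervals."""
    load_lo, load_hi = load_interval
    wind_lo, wind_hi = wind_interval
    return load_hi - wind_lo  # highest load together with lowest wind


def worst_case_trade_cost(energy_mwh, price_interval):
    """Worst-case cost of exchanging energy with the grid.

    energy_mwh > 0 is a purchase, energy_mwh < 0 is a sale
    (a negative cost is a revenue).
    """
    price_lo, price_hi = price_interval
    return max(energy_mwh * price_lo, energy_mwh * price_hi)


# Hypothetical intervals: load 8-10 MW, wind 2-5 MW, price 20-60 $/MWh.
print(worst_case_net_load((8.0, 10.0), (2.0, 5.0)))  # 8.0
print(worst_case_trade_cost(2.0, (20.0, 60.0)))      # 120.0 (buy at the upper bound)
print(worst_case_trade_cost(-2.0, (20.0, 60.0)))     # -40.0 (sell at the lower bound)
```

Under this assumption a sale is always valued at the lower price bound, which is one way to see why an interval-based robust schedule may pass up arbitrage during high-price hours.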

6.2 Future Work

Future extensions to this thesis are as follows.

1. In Chapter 2, a forecasting model is constructed based on wind power generation

data with an hourly resolution. Wind energy could have severe variations during

an hour that might not be effectively captured by the hourly average of the data.

Considering a higher resolution for wind power prediction, e.g., 15 minutes, could

better capture sub-hourly volatility of such time series. As an extension to this work,

the development of a wind power forecasting method with such higher resolutions

could improve the forecast accuracy. However, an optimization platform would also need

to be implemented at the same data resolution so that the higher-resolution forecasts can

potentially enhance the economics of a microgrid with wind energy.

2. In Chapter 3, a forecasting model is developed to cope with highly volatile behav-

ior of the electricity load at microgrid/building levels. Different types of buildings

may have various patterns of energy consumption. For instance, a conventional

campus building might have a totally different load pattern than a building con-

taining numerous laboratories. In particular, with the advent of smart buildings, in

which different technologies are used to increase the energy efficiency (e.g., triple-

glazed windows, occupancy sensors and smart timing schedules), the consumed

energy might continuously change with high magnitudes. Hence, an extension to

this chapter could be investigating different types of buildings with various load

patterns, and developing effective forecasting models accordingly. This could po-

tentially improve the forecast accuracy of the electricity consumption in a build-

ing/microgrid.

3. Chapter 4 of this thesis introduces an effective strategy that includes high-resolution

market data in the proposed intra-hour optimization platform for microgrids. Since

the forecasting engine is not the main focus of this chapter, a simple linear re-

gression model is applied. A potential improvement could be the implementation

of a more efficient forecasting model that can, in particular, provide more accurate

forecasts for longer horizons, e.g., from 6 to 24 hours ahead. This can potentially

enhance the optimal operation of the microgrid.

4. The study in Chapter 5 could be extended by considering behind-the-meter Photo-

voltaic (PV) solar energy units within the microgrid. There will not be significant

changes in modeling the optimization algorithm, since the new source of uncertainty is added in

the same way as the other sources, e.g., wind power generation. PV and wind production are

often anti-correlated, and there could be value in combining them just for that rea-

son. This will reveal which mitigation approach can accommodate the uncertainty

related to this source of energy more efficiently.

5. Since the load forecasting error in Chapter 5 was significantly lower than those for

wind power generation and the electricity price, the research focused only on sce-

narios for high wind penetration and high price volatility. Hence, another extension

to Chapter 5 could be investigating different types and levels of the electricity load,

and their impacts on the economic performance of the microgrid.
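Regarding the first item in the list above, the following Python sketch shows how hourly averaging can hide sub-hourly variation in wind power. The 15-minute series is made up for illustration and the helper names are hypothetical; no thesis data are used.

```python
def hourly_means(samples, per_hour=4):
    """Average consecutive groups of `per_hour` samples (e.g. 4 x 15 min)."""
    return [sum(samples[i:i + per_hour]) / per_hour
            for i in range(0, len(samples) - per_hour + 1, per_hour)]


def within_hour_range(samples, per_hour=4):
    """Max minus min inside each hour: variation that the hourly mean hides."""
    return [max(samples[i:i + per_hour]) - min(samples[i:i + per_hour])
            for i in range(0, len(samples) - per_hour + 1, per_hour)]


# Hypothetical 15-minute wind power (MW) over two hours.
wind_15min = [4.0, 4.0, 4.0, 4.0, 0.0, 8.0, 0.0, 8.0]
print(hourly_means(wind_15min))       # [4.0, 4.0]
print(within_hour_range(wind_15min))  # [0.0, 8.0]
```

Both hours have the same hourly mean, yet the second hour swings by 8 MW, which is information that only a higher-resolution forecast and schedule could use.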

109
Bibliography

[1] B. Lasseter, “Microgrids [distributed power generation],” Power Engineering Society Winter

Meeting, 2001. IEEE, vol. 1, pp. 146–149, January 2001.

[2] [Online]. Available: http://energyaccess.org/news/recent-news/microgrids-mini-grids-

andnanogrids-an-emerging-energy-access-solution-ecosystem

[3] [Online]. Available: https://www.navigant.com/insights/energy/2017/tracking-microgrids-

and-the-challenges

[4] [Online]. Available: https://www.navigant.com/insights/energy/2017/commercial-

industrial-microgrid-market-tips-scales

[5] A. Chaouachi, R. M. Kamel, R. Andoulsi, and K. Nagasaka, “Multiobjective intelligent

energy management for a microgrid,” IEEE Transactions on Industrial Electronics, vol. 60,

no. 4, pp. 1688–1699, April 2013.

[6] N. Amjady, F. Keynia, and H. Zareipour, “Short-term load forecast of microgrids by a new

bilevel prediction strategy,” IEEE Transactions on Smart Grid, vol. 1, no. 3, pp. 286–294,

December 2010.

[7] E. R. Sanseverino, M. L. D. Silvestre, M. G. Ippolito, A. D. Paola, and G. L. Re, “An execu-

tion, monitoring and replanning approach for optimal energy management in microgrids,”

Energy, vol. 36, no. 5, pp. 3429–3436, May 2011.

[8] K. Rohrig and B. Lange, “Improvement of the power system reliability by prediction of

wind power generation,” Power Engineering Society General Meeting, 2007. IEEE, pp. 1–

8, 2007.

[9] E. Paparoditis and T. Sapatinas, “Short-term load forecasting: The similar shape functional

110
time-series predictor,” IEEE Transactions on Power Systems, vol. 28, no. 4, pp. 3818–3825,

November 2013.

[10] M. Eghbal, T. K. Saha, and N. Mahmoudi-Kohan, “Utilizing demand response programs in

day ahead generation scheduling for micro-grids with renewable sources,” 2011 IEEE PES

Innovative Smart Grid Technologies Asia (ISGT), pp. 1–6, November 13-16 2011.

[11] N. M. Pindoriya, S. N. Singh, and S. K. Singh, “An adaptive wavelet neural network-based

energy price forecasting in electricity markets,” IEEE Transaction on Power System, vol. 23,

no. 3, pp. 1423–1432, August 2008.

[12] H. Chitsaz, N. Amjady, and H. Zareipour, “Wind power forecast using wavelet neural net-

work trained by improved clonal selection algorithm,” Energy Conversion and Management,

vol. 89, pp. 588–598, January 2015.

[13] H. Chitsaz, H. Shaker, H. Zareipour, D. Wood, and N. Amjady, “Short-term electricity load

forecasting of buildings in microgrids,” Energy and Buildings, vol. 99, pp. 50–60, July 2015.

[14] H. Chitsaz, P. Zamani-Dehkordi, H. Zareipour, and P. Parikh, “Electricity price forecast-

ing for operational scheduling of behind-the-meter storage systems,” IEEE Transactions on

Smart Grid, 2017.

[15] F. G. Montoya, F. Manzano-Agugliaro, S. Lopez-Marquez, Q. Hernandez-Escobedo, and

C. Gil, “Wind turbine selection for wind farm layout using multi-objective evolutionary

algorithms,” Expert Systems with Applications, vol. 41, no. 15, pp. 6585 – 6595, 2014.

[16] F. Manzano-Agugliaro, A. Alcayde, F. Montoya, A. Zapata-Sierra, and C. Gil, “Scientific

production of renewable energies worldwide: An overview,” Renewable and Sustainable

Energy Reviews, vol. 18, pp. 134 – 143, 2013.

[17] [Online]. Available: www.wwindea.org

111
[18] Aeso. [Online]. Available: https://www.aeso.ca

[19] Q. Hernandez-Escobedo, F. Manzano-Agugliaro, and A. Zapata-Sierra, “The wind power

of Mexico,” Renewable and Sustainable Energy Reviews, vol. 14, no. 9, pp. 2830 – 2840,

2010.

[20] Q. Hernandez-Escobedo, F. Manzano-Agugliaro, J. A. Gazquez-Parra, and A. Zapata-

Sierra, “Is the wind a periodical phenomenon? the case of Mexico,” Renewable and Sus-

tainable Energy Reviews, vol. 15, no. 1, pp. 721 – 728, 2011.

[21] Y. Xu, Z.-Y. Dong, Z. Xu, K. Meng, and K. P. Wong, “An intelligent dynamic security as-

sessment framework for power systems with wind power,” IEEE Transactions on Industrial

Informatics, vol. 8, no. 4, pp. 995–1003, November 2012.

[22] P. Hu, R. Karki, and R. Billinton, “Reliability evaluation of generating systems containing

wind power and energy storage,” Generation, Transmission Distribution, IET, vol. 3, no. 8,

pp. 783–791, 2009.

[23] Q. Hernandez-Escobedo, R. Saldana-Flores, E. Rodriguez-Garcia, and F. Manzano-

Agugliaro, “Wind energy resource in northern Mexico,” Renewable and Sustainable Energy

Reviews, vol. 32, pp. 890 – 914, 2014.

[24] J. Cardell, L. Anderson, and C. Y. Tee, “The effect of wind and demand uncertainty on

electricity prices and system performance,” Transmission and Distribution Conference and

Exposition, 2010 IEEE PES, pp. 1–4, 2010.

[25] M. Black and G. Strbac, “Value of bulk energy storage for managing wind power fluctua-

tions,” IEEE Transactions on Energy Conversion, vol. 22, no. 1, pp. 197–205, 2007.

[26] [Online]. Available: www.nyiso.com

112
[27] N. Chen, Z. Qian, I. Nabney, and X. Meng, “Wind power forecasts using gaussian processes

and numerical weather prediction,” IEEE Transactions on Power Systems, vol. 29, no. 2, pp.

656–665, March 2014.

[28] M. Khalid and A. Savkin, “A method for short-term wind power prediction with multiple

observation points,” IEEE Transactions on Power Systems, vol. 27, no. 2, pp. 579–586, May

2012.

[29] P. Kou, D. Liang, F. Gao, and L. Gao, “Probabilistic wind power forecasting with on-

line model selection and warped gaussian process,” Energy Conversion and Management,

vol. 84, pp. 649 – 663, 2014.

[30] I. J. Ramirez-Rosado, L. A. Fernandez-Jimenez, C. Monteiro, J. Sousa, and R. Bessa, “Com-

parison of two new short-term wind-power forecasting systems,” Renewable Energy, vol. 34,

no. 7, pp. 1848 – 1854, 2009.

[31] R. G. Kavasseri and K. Seetharaman, “Day-ahead wind speed forecasting using f-arima

models,” Renewable Energy, vol. 34, no. 5, pp. 1388 – 1393, 2009.

[32] C. Gonzalez-Mingueza and F. Munoz-Gutierrez, “Wind prediction using weather research

forecasting model (wrf): A case study in peru,” Energy Conversion and Management,

vol. 81, pp. 363 – 373, 2014.

[33] S. Fan, J. Liao, R. Yokoyama, L. Chen, and W. jen Lee, “Forecasting the wind generation

using a two-stage network based on meteorological information,” IEEE Transactions on

energy conversion, vol. 24, no. 2, pp. 474–482, 2009.

[34] M. Milligan, M. Schwartz, and Y. Wan, “Statistical wind power forecasting models:

results for u.s. wind farms,” NREL, Tech. Rep. NREL/CP-500-33956, May 2003. [Online].

Available: http://www.nrel.gov/docs/fy03osti/33956.pdf

113
[35] K. Methaprayoon, C. Yingvivatanapong, W. jen Lee, and J. Liao, “An integration of ann

wind power estimation into unit commitment considering the forecasting uncertainty,” IEEE

Transactions on Industry Applications, vol. 43, no. 6, pp. 1441–1448, 2007.

[36] C. Monteiro, R. Bessa, V. Miranda, A. Botterud, J. Wang, and G. Conzelmann, “Wind

power forecasting: State-of-the-art 2009.” [Online]. Available: http://www.dis.anl.gov/

pubs/65613.pdf

[37] E. Erdem and J. Shi, “Arma based approaches for forecasting the tuple of wind speed and

direction,” Applied Energy, vol. 88, no. 4, pp. 1405 – 1414, 2011.

[38] A. Sfetsos, “A novel approach for the forecasting of mean hourly wind speed time series,”

Renewable Energy, vol. 27, no. 2, pp. 163 – 174, 2002.

[39] R. Banos, F. Manzano-Agugliaro, F. Montoya, C. Gil, A. Alcayde, and J. Gomez, “Opti-

mization methods applied to renewable and sustainable energy: A review,” Renewable and

Sustainable Energy Reviews, vol. 15, no. 4, pp. 1753 – 1766, 2011.

[40] T. Barbounis and J. Theocharis, “Locally recurrent neural networks for long-term wind

speed and power prediction,” Neurocomputing, vol. 69, no. 4-6, pp. 466 – 496, 2006.

[41] G. Sideratos and N. Hatziargyriou, “Using radial basis neural networks to estimate wind

power production,” Power Engineering Society General Meeting, 2007. IEEE, pp. 1–7,

2007.

[42] C. Potter and M. Negnevitsky, “Very short-term wind forecasting for tasmanian power gen-

eration,” IEEE Transactions on Power Systems, vol. 21, no. 2, pp. 965–972, 2006.

[43] H. Pousinho, V. Mendes, and J. Catalao, “A hybrid pso-anfis approach for short-term wind

power prediction in portugal,” Energy Conversion and Management, vol. 52, no. 1, pp. 397

– 402, 2011.

114
[44] N. Amjady, F. Keynia, and H. Zareipour, “A new hybrid iterative method for short-term

wind speed forecasting,” European Transactions on Electrical Power, vol. 21, no. 1, pp.

581–595, 2011.

[45] A. Tascikaraoglu and M. Uzunoglu, “A review of combined approaches for prediction of

short-term wind speed and power,” Renewable and Sustainable Energy Reviews, vol. 34,

pp. 243 – 254, 2014.

[46] G. Nason, Wavelet methods in Statistics with R. Springer, 2008.

[47] J. Catalão, H. M. I. Pousinho, and V. Mendes, “Hybrid intelligent approach for short-term

wind power forecasting in portugal,” Renewable Power Generation, IET, vol. 5, no. 3, pp.

251–257, 2011.

[48] D. Faria, R. Castro, C. Philippart, and A. Gusmao, “Wavelets pre-filtering in wind speed

prediction,” International Conference on Power Engineering, Energy and Electrical Drives,

pp. 168–173, 2009.

[49] R. R. B. de Aquino, M. M. S. Lira, J. de Oliveira, M. Carvalho, O. Neto, and G. de Almeida,

“Application of wavelet and neural network models for wind speed and power generation

forecasting in a brazilian experimental wind park,” International Joint Conference on Neural

Networks, pp. 172–178, 2009.

[50] P. Chen, H. Chen, and R. Ye, “Chaotic wind speed series forecasting based on wavelet

packet decomposition and support vector regression,” IPEC, 2010 Conference Proceedings,

pp. 256–261, 2010.

[51] L. J. Ricalde, G. Catzin, A. Y. Alanis, and E. N. Sanchez, “Higher order wavelet neural

networks with kalman learning for wind speed forecasting,” IEEE Symposium on Computa-

tional Intelligence Applications In Smart Grid (CIASG), pp. 1–6, 2011.

115
[52] N. Amjady, F. Keynia, and H. Zareipour, “Wind power prediction by a new forecast engine

composed of modified hybrid neural network and enhanced particle swarm optimization,”

IEEE Transactions on Sustainable Energy, vol. 2, no. 3, pp. 265–276, July 2011.

[53] L. Wu and M. Shahidehpour, “A hybrid model for day-ahead price forecasting,” IEEE Trans-

actions on Power Systems, vol. 25, no. 3, pp. 1519–1530, 2010.

[54] L. de Castro and F. Von Zuben, “Learning and optimization using the clonal selection prin-

ciple,” IEEE Transactions on Evolutionary Computation, vol. 6, no. 3, pp. 239–251, 2002.

[55] Q. Wang, C. Wang, and X. Gao, “A hybrid optimization algorithm based on clonal selection

principle and particle swarm intelligence,” Sixth International Conference on Intelligent

Systems Design and Applications, vol. 2, pp. 975–979, 2006.

[56] G. C. Liao, “Application of an immune algorithm to the short-term unit commitment prob-

lem in power system operation,” IEEE Proceedings on Generation, Transmission and Dis-

tribution, vol. 153, no. 3, pp. 309–320, 2006.

[57] Mathworks. [Online]. Available: www.mathworks.com

[58] A. Slowik and M. Bialko, “Training of artificial neural networks using differential evolution

algorithm,” Conference on Human System Interactions, pp. 60–65, May 2008.

[59] R. Bessa, V. Miranda, and J. Gama, “Entropy and correntropy against minimum square error

in offline and online three-day ahead wind power forecasting,” IEEE Transactions on Power

Systems, vol. 24, no. 4, pp. 1657–1666, 2009.

[60] W. Liu, P. P. Pokharel, and J. C. Principe, “Correntropy: Properties and applications in cor-

rentropy: Properties and applications in non-gaussian signal processing,” IEEE Transactions

on Signal Processing, vol. 55, no. 11, pp. 5286–5298, November 2007.

[61] (2013) Navigant research. [Online]. Available: http://www.navigantresearch.com/research/

microgrids

116
[62] J. Taylor and P. McSharry, “Short-term load forecasting methods: An evaluation based

on european data,” IEEE Transactions on Power Systems, vol. 22, no. 4, pp. 2213–2219,

November 2007.

[63] T. Hong, M. Gui, M. Baran, and H. Willis, “Modeling and forecasting hourly electric load by

multiple linear regression with interactions,” Power and Energy Society General Meeting,

2010 IEEE, pp. 1–8, July 2010.

[64] H. Hippert, C. Pedreira, and R. Souza, “Neural networks for short-term load forecasting:

a review and evaluation,” IEEE Transactions on Power Systems, vol. 16, no. 1, pp. 44–55,

February 2001.

[65] Y. Wang, Q. Xia, and C. Kang, “Secondary forecasting based on deviation analysis for short-

term load forecasting,” IEEE Transactions on Power Systems, vol. 26, no. 2, pp. 500–507,

May 2011.

[66] E. Ceperic, V. Ceperic, and A. Baric, “A strategy for short-term load forecasting by support

vector regression machines,” IEEE Transactions on Power Systems, vol. 28, no. 4, pp. 4356–

4364, November 2013.

[67] Y. Goude, R. Nedellec, and N. Kong, “Local short and middle term electricity load forecast-

ing with semi-parametric additive models,” IEEE Transactions on Smart Grid, vol. 5, no. 1,

pp. 440–446, January 2014.

[68] E. Mashhour and S. Moghaddas-Tafreshi, “Integration of distributed energy resources into

low voltage grid: A market-based multiperiod optimization model,” Electric Power Systems

Research, vol. 80, no. 4, pp. 473–480, April 2010.

[69] A. Mohamed, V. Salehi, and O. Mohammed, “Real-time energy management algorithm for

mitigation of pulse loads in hybrid microgrids,” IEEE Transactions on Smart Grid, vol. 3,

no. 4, pp. 1911–1922, December 2012.

117
[70] P. Chan, W.-C. Chen, W. Ng, and D. Yeung, “Multiple classifier system for short term

load forecast of microgrid,” Proceedings of the 2011 International Conference on Machine

Learning and Cybernetics, pp. 1268–1273, 10-13 July, 2011.

[71] M. Shahidehpour and M. Khodayar, “Cutting campus energy costs with hierarchical con-

trol,” IEEE Electrification Magazine, vol. 1, no. 1, pp. 40– 56, September 2013.

[72] A. Ahmad, M. Hassan, M. Abdullah, H. Rahman, F. Hussin, H. Abdullah, and R. Saidur,

“A review on applications of ANN and SVM for building electrical energy consumption

forecasting,” Renewable and Sustainable Energy Reviews, vol. 33, pp. 102 – 109, 2014.

[73] G. Escriva-Escriva, C. Alvarez-Bel, C. Roldan-Blay, and M. Alcazar-Ortega, “New artificial

neural network prediction method for electrical consumption forecasting based on building

end-uses,” Energy and Buildings, vol. 43, no. 11, pp. 3112 – 3119, 2011.

[74] A. H. Neto and F. A. S. Fiorelli, “Comparison between detailed model simulation and arti-

ficial neural network for forecasting building energy consumption,” Energy and Buildings,

vol. 40, no. 12, pp. 2169 – 2176, 2008.

[75] S. Farzana, M. Liu, A. Baldwin, and M. U. Hossain, “Multi-model prediction and simulation

of residential building energy in urban areas of chongqing, south west china,” Energy and

Buildings, vol. 81, pp. 161 – 169, October 2014.

[76] R. K. Jain, K. M. Smith, P. J. Culligan, and J. E. Taylor, “Forecasting energy consumption of

multi-family residential buildings using support vector regression: Investigating the impact

of temporal and spatial monitoring granularity on performance accuracy,” Applied Energy,

vol. 123, pp. 168 – 178, June 2014.

[77] D. Monfet, M. Corsi, D. Choiniere, and E. Arkhipova, “Development of an energy pre-

diction tool for commercial buildings using case-based reasoning,” Energy and Buildings,

vol. 81, pp. 152 – 160, October 2014.

118
[78] G. Escriva-Escriva, C. Roldan-Blay, and C. Alvarez-Bel, “Electrical consumption forecast

using actual data of building end-use decomposition,” Energy and Buildings, vol. 82, pp. 73

– 81, 2014.

[79] J. G. Jetcheva, M. Majidpour, and W. P. Chen, “Neural network model ensembles for

building-level electricity load forecasts,” Energy and Buildings, vol. 84, pp. 214 – 223,

2014.

[80] (2014) British Columbia Institute of Technology. [Online]. Available: http://www.bcit.ca/

microgrid/

[81] H. Zareipour, K. Bhattacharya, and C. A. Canizares, “Electricity market price volatility:

The case of Ontario,” Energy Policy, vol. 35, pp. 4739–4748, 2007.

[82] N. Amjady and F. Keynia, “Short-term load forecasting of power systems by combination

of wavelet transform and neuro-evolutionary algorithm,” Energy, vol. 34, no. 1, pp. 46 – 57,

2009.

[83] Q. Zhang and A. Benveniste, “Wavelet networks,” IEEE Transactions on Neural Networks,

vol. 3, no. 6, pp. 889–898, November 1992.

[84] J. Vermaak and E. Botha, “Recurrent neural networks for short-term load forecasting,” IEEE

Transactions on Power Systems, vol. 13, no. 1, pp. 126–132, February 1998.

[85] S. J. Yoo, J. B. Park, and Y. H. Choi, “Adaptive dynamic surface control of flexible-joint

robots using self-recurrent wavelet neural networks,” IEEE Transactions on Systems, Man,

and Cybernetics, Part B: Cybernetics, vol. 36, no. 6, pp. 1342–1355, December 2006.

[86] ——, “Indirect adaptive control of nonlinear dynamic systems using self recurrent wavelet

neural networks via adaptive learning rates,” Information Sciences, vol. 177, no. 15, pp.

3074–3098, August 2007.

119
[87] S. S. Haykin, Neural Networks: A Comprehensive Foundation. Prentice Hall, 1999.

[88] M. T. Hagan and M. B. Menhaj, “Training feedforward networks with the marquardt al-

gorithm,” IEEE Transactions on Neural Networks, vol. 5, no. 6, pp. 989–993, November

1994.

[89] L. Hernandez, C. Baladrón, J. Aguiar, B. Carro, A. Sanchez-Esguevillas, and J. Lloret,

“Short-term load forecasting for microgrids based on artificial neural networks,” Energies,

vol. 6, no. 3, pp. 1385–1408, 2013.

[90] A. Pandey, D. Singh, and S. Sinha, “Intelligent hybrid wavelet models for short-term load

forecasting,” IEEE Transactions on Power Systems, vol. 25, no. 3, pp. 1266–1273, August

2010.

[91] B.-L. Zhang and Z.-Y. Dong, “An adaptive neural-wavelet model for short term load fore-

casting,” Electric Power Systems Research, vol. 59, pp. 121–129, 2001.

[92] N. Amjady and A. Daraeepour, “Mixed price and load forecasting of electricity markets by a

new iterative prediction method,” Electric Power Systems Research, vol. 79, pp. 1329–1336,

2009.

[93] Kumeyaay wind farm. [Online]. Available: http://www.thewindpower.net/windfarm en

2792 kumeyaay.php

[94] California Independent System Operator. [Online]. Available: http://www.caiso.com/1c57/

1c578a8751b30.pdf

[95] (2016) IEEE Smart Grid Newsletter. [Online]. Available: http://smartgrid.ieee.org/

newsletters/january-2016/emergence-of-the-behind-the-meter-energy-storage-market

[96] Y. J. Kim, G. Del-Rosario-Calaf, and L. K. Norford, “Analysis and experimental implemen-

tation of grid frequency regulation using behind-the-meter batteries compensating for fast

120
load demand variations,” IEEE Transactions on Power Systems, vol. 32, no. 1, pp. 484 –

498, May 2017.

[97] J. Neubauer and M. Simpson, “Deployment of behind-the-meter energy storage for

demand charge reduction,” National Renewable Energy Laboratory (NREL), January 2015.

[Online]. Available: http://www.nrel.gov/docs/fy15osti/63162.pdf

[98] R. Weron, “Electricity price forecasting: A review of the state-of-the-art with a look into

the future,” International Journal of Forecasting, vol. 30, no. 4, pp. 1030–1081, December

2014.

[99] P. Mandal, A. U. Haque, J. Meng, A. K. Srivastava, and R. Martinez, “A novel hybrid

approach using wavelet, firefly algorithm, and fuzzy artmap for day-ahead electricity price

forecasting,” IEEE Transaction on Power System, vol. 28, no. 2, pp. 1041–1051, May 2013.

[100] C. Wan, Z. Xu, Y. Wang, Z. Y. Dong, and K. P. Wong, “A hybrid approach for probabilistic

forecasting of electricity price,” IEEE Transactions on Smart Grid, vol. 5, no. 1, pp. 463–

470, January 2014.

[101] T. Hong, P. Pinson, S. Fan, H. Zareipour, A. Troccoli, and R. J. Hyndman, “Probabilistic

energy forecasting: Global energy forecasting competition 2014 and beyond,” International

Journal of Forecasting, vol. 32, no. 3, pp. 896–913, July 2016.

[102] C. Wan, M. Niu, Y. Song, and Z. Xu, “Pareto optimal prediction intervals of electricity

price,” IEEE Transactions on Power Systems, vol. 32, no. 1, pp. 817–819, January 2017.

[103] H. Zareipour, A. Janjani, H. Leung, A. Motamedi, and A. Schellenberg, “Classification of

future electricity market prices,” IEEE Transactions on Power Systems, vol. 26, no. 1, pp.

165–173, February 2011.

[104] D. Huang, H. Zareipour, W. D. Rosehart, and N. Amjady, “Data mining for electricity

price classification and the application to demand-side management,” IEEE Transactions

on Smart Grid, vol. 3, no. 2, pp. 808–817, June 2012.

[105] H. C. Wu, S. C. Chan, K. M. Tsui, and Y. Hou, “A new recursive dynamic factor analysis for

point and interval forecast of electricity price,” IEEE Transactions on Power Systems, vol. 28,

no. 3, pp. 2352–2365, August 2013.

[106] N. Amjady and F. Keynia, “A new prediction strategy for price spike forecasting of day-

ahead electricity markets,” Applied Soft Computing, vol. 11, no. 6, pp. 4246–4256, April

2011.

[107] T. Christensen, A. Hurn, and K. Lindsay, “Forecasting spikes in electricity prices,” Interna-

tional Journal of Forecasting, vol. 28, no. 2, pp. 400 – 411, June 2012.

[108] J. H. Zhao, Z. Y. Dong, X. Li, and K. P. Wong, “A framework for electricity price spike

analysis with advanced data mining methods,” IEEE Transactions on Power Systems, vol. 22,

no. 1, pp. 376–385, February 2007.

[109] X. Lu, Z. Y. Dong, and X. Li, “Electricity market price spike forecast with data mining

techniques,” Electric Power Systems Research, vol. 73, no. 1, pp. 19–29, January 2005.

[110] A. Fragkioudaki, A. Marinakis, and R. Cherkaoui, “Forecasting price spikes in european

day-ahead electricity markets using decision trees,” 12th International Conference on the

European Energy Market (EEM), pp. 1–5, May 2015.

[111] S. K. Aggarwal, L. M. Saini, and A. Kumar, “Electricity price forecasting in deregulated

markets: A review and evaluation,” International Journal of Electrical Power & Energy

Systems, vol. 31, no. 1, pp. 13–22, January 2009.

[112] K. Maciejowska and R. Weron, “Short- and mid-term forecasting of baseload electricity

prices in the u.k.: The impact of intra-day price relationships and market fundamentals,”

IEEE Transactions on Power Systems, vol. 31, no. 2, pp. 994–1005, March 2016.

[113] Independent Electricity System Operator (IESO) - Pricing. [Online]. Available: http://www.ieso.ca/Pages/Ontario’s-Power-System/Electricity-Pricing-in-Ontario/How-Wholesale-Electricity-Price-is-Determined.aspx

[114] California Independent System Operator (CAISO) - pricing. [Online]. Available:

http://www.caiso.com/market/Pages/MarketProcesses.aspx

[115] Electric Reliability Council of Texas (ERCOT) - pricing. [Online]. Available: http:

//www.ercot.com/mktinfo/prices

[116] New York Independent System Operator (NYISO) - Pricing. [Online]. Available:

http://www.nyiso.com

[117] Alberta Electric System Operator (AESO) - Pricing. [Online]. Available: http:

//www.aeso.ca/rulesprocedures/18592.html

[118] H. Zareipour, C. Canizares, and K. Bhattacharya, “Economic impact of electricity market

price forecasting errors: A demand-side analysis,” IEEE Transactions on Power Systems,

vol. 25, no. 1, pp. 254–262, February 2010.

[119] B. Mohammadi-Ivatloo, H. Zareipour, M. Ehsan, and N. Amjady, “Economic impact of

price forecasting inaccuracies on self-scheduling of generation companies,” Electric Power

Systems Research, vol. 81, no. 2, pp. 617–624, February 2011.

[120] E. Nasrolahpour, S. J. Kazempour, H. Zareipour, and W. D. Rosehart, “Strategic sizing of

energy storage facilities in electricity markets,” IEEE Transactions on Sustainable Energy,

vol. 7, no. 4, pp. 1462–1472, October 2016.

[121] M. Kazemi, H. Zareipour, M. Ehsan, and W. D. Rosehart, “A robust linear approach for of-

fering strategy of a hybrid electric energy company,” IEEE Transactions on Power Systems,

vol. 32, no. 3, May 2017.

[122] S. Shafiee, P. Zamani-Dehkordi, H. Zareipour, and A. M. Knight, “Economic assessment of a price-maker energy storage facility in the alberta electricity market,” Energy, vol. 111, pp. 537–547, 2016.

[123] R. Palma-Behnke, C. Benavides, F. Lanas, B. Severino, L. Reyes, J. Llanos, and D. Sáez, “A

microgrid energy management system based on the rolling horizon strategy,” IEEE Trans-

actions on Smart Grid, vol. 4, no. 2, pp. 996–1006, June 2013.

[124] J. Silvente, G. M. Kopanos, E. N. Pistikopoulos, and A. Espuña, “A rolling horizon op-

timization framework for the simultaneous energy supply and demand planning in micro-

grids,” Applied Energy, vol. 155, pp. 485–501, October 2015.

[125] J. Han and M. Kamber, Data Mining: Concepts and Techniques. Morgan Kaufmann

Publishers, 2006.

[126] K. J. Cios, W. Pedrycz, R. W. Swiniarski, and L. A. Kurgan, Data Mining: A Knowledge

Discovery Approach. Springer, 2007.

[127] M. Alamaniotis, D. Bargiotas, N. G. Bourbakis, and L. H. Tsoukalas, “Genetic optimal

regression of relevance vector machines for electricity pricing signal forecasting in smart

grids,” IEEE Transactions on Smart Grid, vol. 6, no. 6, pp. 2997–3005, November 2015.

[128] C. P. Rodriguez and G. J. Anders, “Energy price forecasting in the ontario competitive power

system market,” IEEE Transactions on Power Systems, vol. 19, no. 1, pp. 366–374, February

2004.

[129] H. Zareipour, C. A. Cañizares, K. Bhattacharya, and J. Thomson, “Application of public-domain market information to forecast ontario’s wholesale electricity prices,” IEEE Transactions on Power Systems, vol. 21, no. 4, pp. 1707–1717, November 2006.

[130] (2016) Independent Electric System Operator. [Online]. Available: http://www.ieso.ca/

Pages/Power-Data/default.aspx

[131] A. H. Mohsenian-Rad and A. Leon-Garcia, “Optimal residential load control with price

prediction in real-time electricity pricing environments,” IEEE Transactions on Smart Grid,

vol. 1, no. 2, pp. 120–133, September 2010.

[132] G. He, Q. Chen, C. Kang, P. Pinson, and Q. Xia, “Optimal bidding strategy of battery

storage in power markets considering performance-based regulation and battery cycle life,”

IEEE Transactions on Smart Grid, vol. 7, no. 5, pp. 2359–2367, September 2016.

[133] H. Chitsaz, S. Shafiee, H. Zareipour, and D. Wood, “Impact of uncertainty modeling on economic performance of microgrids,” IEEE Transactions on Smart Grid, under review, 2017.

[134] W. Su, J. Wang, and J. Roh, “Stochastic energy scheduling in microgrids with intermittent

renewable energy resources,” IEEE Transactions on Smart Grid, vol. 5, no. 4, pp. 1876–

1883, July 2014.

[135] C. Ju, P. Wang, L. Goel, and Y. Xu, “A two-layer energy management system for microgrids

with hybrid energy storage considering degradation costs,” IEEE Transactions on Smart

Grid, 2017.

[136] S.-J. Ahn, S.-R. Nam, J.-H. Choi, and S.-I. Moon, “Power scheduling of distributed genera-

tors for economic and stable operation of a microgrid,” IEEE Transactions on Smart Grid,

vol. 4, no. 1, pp. 398–405, March 2013.

[137] A. Parisio, E. Rikos, and L. Glielmo, “A model predictive control approach to microgrid

operation optimization,” IEEE Transactions on Control Systems Technology, vol. 22, no. 5,

pp. 1813–1827, September 2014.

[138] D. Olivares, C. Cañizares, and M. Kazerani, “A centralized energy management system for

isolated microgrids,” IEEE Transactions on Smart Grid, vol. 5, no. 4, pp. 1864–1875, July

2014.

[139] A. Khodaie, “Microgrid optimal scheduling with multi-period islanding constraints,” IEEE

Transactions on Power Systems, vol. 29, no. 3, pp. 1383–1392, May 2014.

[140] M. Tasdighi, H. Ghasemi, and A. Rahimi-Kian, “Residential microgrid scheduling based on

smart meters data and temperature dependent thermal load modeling,” IEEE Transactions

on Smart Grid, vol. 5, no. 1, pp. 349–357, January 2014.

[141] W. Shi, N. Li, C.-C. Chu, and R. Gadh, “Real-time energy management in microgrids,”

IEEE Transactions on Smart Grid, vol. 8, no. 1, pp. 228–238, January 2017.

[142] T. Niknam, F. Golestaneh, and A. Malekpour, “Probabilistic energy and operation manage-

ment of a microgrid containing wind/photovoltaic/fuel cell generation and energy storage

devices based on point estimate method and self-adaptive gravitational search algorithm,”

Energy, vol. 43, pp. 427–437, May 2012.

[143] K. P. Kumar and B. Saravanan, “Recent techniques to model uncertainties in power gener-

ation from renewable energy sources and loads in microgrids – a review,” Renewable and

Sustainable Energy Reviews, vol. 71, pp. 348–358, 2017.

[144] J. Che and J. Wang, “Short-term electricity prices forecasting based on support vector re-

gression and auto-regressive integrated moving average modeling,” Energy Conversion and

Management, vol. 51, no. 10, pp. 1911 – 1917, October 2010.

[145] R. Wang, P. Wang, and G. Xiao, “A robust optimization approach for energy generation

scheduling in microgrids,” Energy Conversion and Management, vol. 106, pp. 597–607,

October 2015.

[146] R. Gupta and N. K. Gupta, “A robust optimization based approach for microgrid operation

in deregulated environment,” Energy Conversion and Management, vol. 93, pp. 121–131,

January 2015.

[147] C. Zhang, Y. Xu, Z. Y. Dong, and J. Ma, “Robust operation of microgrids via two-stage

coordinated energy storage and direct load control,” IEEE Transactions on Power Systems,

vol. 32, no. 4, pp. 2858–2868, July 2017.

[148] E. Kuznetsova, C. Ruiz, Y.-F. Li, and E. Zio, “Analysis of robust optimization for decen-

tralized microgrid energy management under uncertainty,” Electrical Power and Energy

Systems, vol. 64, pp. 815–832, September 2015.

[149] S. Soman, H. Zareipour, O. Malik, and P. Mandal, “A review of wind power and wind speed

forecasting methods with different time horizons,” North American Power Symposium, Ar-

lington, TX, pp. 1–8, 2010.

[150] N. Amjady and F. Keynia, “Day-ahead price forecasting of electricity markets by mutual

information technique and cascaded neuro-evolutionary algorithm,” IEEE Transactions on

Power Systems, vol. 24, no. 1, pp. 306–318, February 2009.

[151] Alberta Electric System Operator (AESO) - Energy Trading System. [Online]. Available:

http://ets.aeso.ca

[152] City of Calgary - Small Wind Turbines. [Online]. Available: www.calgary.ca/CS/CPB/

Pages/Operations-WorkPlace-Centre/Bearspaw-OWC/Wind-need-for-small-turbine.aspx

[153] “Environment Canada - Weather Data.” [Online]. Available: www.weather.gc.ca

[154] J. Nowotarski and R. Weron, “Computing electricity spot price prediction intervals using

quantile regression and forecast averaging,” Computational Statistics, vol. 30, no. 3, pp.

791–803, September 2015.

[155] B. Liu, J. Nowotarski, T. Hong, and R. Weron, “Probabilistic load forecasting via quantile

regression averaging on sister forecasts,” IEEE Transactions on Smart Grid, vol. 8, no. 2,

pp. 730–737, March 2017.

[156] M. Kazemi, H. Zareipour, N. Amjady, W. D. Rosehart, and M. Ehsan, “Operation scheduling

of battery storage systems in joint energy and ancillary services markets,” IEEE Transac-

tions on Sustainable Energy, no. 99, 2017.

[157] AESO - Annual Market Statistics Report 2017. [Online]. Available: https://www.aeso.ca/

market/market-and-system-reporting/annual-market-statistic-reports/

[158] R. Velo, P. Lopez, and F. Maseda, “Wind speed estimation using multilayer perceptron,”

Energy Conversion and Management, vol. 81, pp. 1 – 9, 2014.

Appendix A

Wind power forecasting at wind farm levels

In this section, the performance of the proposed wind power forecasting methodology, presented in Chapter 2, is evaluated at a wind farm level. The objective is to investigate the forecasting accuracy as the size of wind power generation is reduced from a system level to a wind farm level. To do so, the historical wind power generation data of the Taber wind farm located in Alberta, Canada, is used. For the sake of a fair comparison, the same test months presented in Table 2.4 are considered.

Table A.1: Wind power prediction errors

              System Level        Wind Farm Level
Test month    nRMSE    nMAE       nRMSE    nMAE
Mar.          11.89    8.32       12.55    8.76
Apr.          11.98    8.46       12.81    9.12
May           12.32    9.26       12.40    9.25
Jun.          13.69    9.74       12.98    9.49
Jul.          10.71    7.29       10.55    7.17
Aug.          12.08    8.05       12.72    8.33
Sep.          13.26    8.78       13.55    8.99
Oct.          11.35    7.78       11.73    8.11
Nov.          12.21    8.64       13.12    9.07
Dec.          11.52    7.80       12.87    8.71
Average       12.10    8.41       12.53    8.70

As shown in Table A.1, the average forecasting errors in terms of both nMAE and nRMSE are higher at a wind farm level than those for the aggregated wind generation. However, the table also shows that the forecasting errors at the wind farm level are lower in two months, June and July. Moreover, the forecasting errors seem to be noticeably higher in winter months. Overall, this numerical experiment shows that the proposed forecasting methodology performs satisfactorily for wind power prediction of individual wind farms as well.

The higher average of the forecasting error at a wind farm level is because the aggregate wind

generation from different geographically dispersed wind farms is expected to be slightly smoother

than that for an individual wind farm. This could become significant if wind farms are placed in

different geographical locations with potentially different wind regimes.

Appendix B

Benchmark models

Here, three forecasting methods used as benchmarks in Table 2.1, i.e., Persistence, MLP and RBF,

are briefly described.

Persistence is the simplest forecasting model as it assumes the forecast value at a certain time

in the future is the same as the last measured value. Therefore, this naive method is useful for

very short-term prediction purposes, while its forecast error significantly increases as the forecast

horizon increases.
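As a minimal illustration of the persistence model described above, the sketch below repeats the last measured value over the forecast horizon. The function name and the sample values are hypothetical and are not taken from the thesis.

```python
def persistence_forecast(history, horizon):
    """Return `horizon` forecasts, each equal to the last measured value."""
    if not history:
        raise ValueError("history must contain at least one measurement")
    return [history[-1]] * horizon


# Illustrative measurements only (e.g., hourly wind power in MW).
measured = [12.0, 15.5, 14.2]
print(persistence_forecast(measured, 3))
```

Because every forecast step reuses the same value, the error of this benchmark can only grow as the true series drifts away from the last measurement, which is consistent with the remark on longer horizons.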

Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF) networks are well-known Artificial Neural Networks (ANNs), which have successfully been applied to forecasting problems in power systems. MLP is a feed-forward artificial neural network model capable of creating a mapping function between sets of input data and a set of corresponding outputs. It consists of multiple layers of nodes, so-called neurons in neural networks, and each layer is connected to the next one. Neurons are the processing elements of the network and apply activation functions, such as linear, logarithmic sigmoid and hyperbolic tangent sigmoid functions. The last one is used as the activation function of neurons in this thesis, as it results in better performance. The weights connecting the neurons of adjacent layers and the biases connected to each neuron are free parameters that should be adjusted in the training phase.
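The forward pass of such a network can be sketched as follows. This is only an illustration under stated assumptions: a single hidden layer with hyperbolic tangent neurons and one linear output neuron. The layer sizes, the function name and all numerical values are hypothetical and do not reproduce the benchmark configuration used in the thesis.

```python
import math


def mlp_forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One-hidden-layer MLP: tanh hidden neurons followed by a linear output neuron."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Illustrative values only: 2 inputs, 2 hidden neurons, 1 output.
y = mlp_forward(
    x=[0.5, -1.0],
    hidden_weights=[[0.1, 0.2], [-0.3, 0.4]],
    hidden_biases=[0.0, 0.1],
    output_weights=[1.0, -1.0],
    output_bias=0.05,
)
print(y)
```

The weights and biases passed to the function are the free parameters that a training algorithm would adjust.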

In RBF neural networks, radial basis functions are used as the activation functions of neurons in the hidden layer and linear functions for the neurons of the output layer. In this network, the vector distance between the input weight vector and the input vector is calculated (as the norm of their difference) and then multiplied by the bias. Afterwards, the result is passed to the radial basis function. The output of the first layer is then transferred to the second layer, in which a linear function is the activation function of the output neuron. More details about MLP and RBF neural networks can be found in [41, 57, 158].
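The RBF computation described above can be sketched as follows. The sketch assumes a Euclidean distance and a Gaussian basis function of the form exp(-n^2); both are common choices but are assumptions here, as are the function names and the numerical values.

```python
import math


def rbf_neuron(x, center, bias):
    """Hidden RBF neuron: distance to the weight (center) vector, scaled by the bias."""
    distance = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, center)))
    n = distance * bias
    return math.exp(-n * n)  # assumed Gaussian radial basis function


def rbf_network(x, centers, biases, output_weights, output_bias):
    """Two-layer RBF network: RBF hidden layer followed by a linear output neuron."""
    hidden = [rbf_neuron(x, c, b) for c, b in zip(centers, biases)]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Illustrative values only.
print(rbf_network([0.2, 0.4], [[0.0, 0.0], [1.0, 1.0]], [0.8, 0.8], [0.5, -0.5], 0.1))
```

A hidden neuron responds most strongly (output 1) when the input coincides with its weight vector, and the bias controls how quickly the response decays with distance.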

Appendix C

Formulation of the training algorithm

The task of the forecasting engines is to learn the mapping function between a specified set of input/output pairs $\{(X_1, t_1), (X_2, t_2), \ldots, (X_Q, t_Q)\}$, known as training samples. $Q$ indicates the number of training samples. $X_q$ and $t_q$ are the $q$-th input vector and the corresponding target output of the forecasting model, respectively. Mean squared error (MSE) is usually considered to be the performance index for the network. The MSE is calculated by

\[
\mathrm{MSE} = \frac{1}{Q} \sum_{q=1}^{Q} e_q^2, \qquad (e_q = t_q - y_q) \tag{A.1}
\]

where $y_q$ is the output of the forecasting engine when $X_q$ is fed as its input, and $e_q$ is the forecast error of the $q$-th sample.

The LM algorithm is an approximation of Newton’s method, in which the solution is updated as follows:

\[
P_{k+1} = P_k - \left(J^{\top} J + \mu I\right)^{-1} J^{\top} e \tag{A.2}
\]

where $P$ is the vector of the free parameters according to (3.6), $k$ represents the iteration number, $e$ is the vector of the network errors $e_q$, and $I$ is the identity matrix. $J$ is the Jacobian matrix composed of the first derivatives of the network errors with respect to all its free parameters, and $J^{\top} J$ approximates the Hessian matrix. Considering (A.1) as the performance function that should be minimized, the gradient of (A.1) can be shown to be proportional to $J^{\top} e$.
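The gradient expression can be checked directly from (A.1). The short derivation below is a sketch that assumes the Jacobian entries are $J_{q,p} = \partial e_q / \partial P_p$, where the index $p$ (introduced here for illustration) runs over the free parameters:

```latex
\frac{\partial\,\mathrm{MSE}}{\partial P_p}
  = \frac{1}{Q} \sum_{q=1}^{Q} 2\, e_q\, \frac{\partial e_q}{\partial P_p}
  = \frac{2}{Q} \sum_{q=1}^{Q} J_{q,p}\, e_q
  = \frac{2}{Q} \left[ J^{\top} e \right]_p
```

The constant factor $2/Q$ only rescales the step and can be absorbed into $\mu$, which is why the update is written in terms of $J^{\top} e$ alone.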

The main modification of the LM algorithm with respect to Newton’s method is the parameter µ, such that the algorithm becomes Newton’s method if µ is zero in (A.2). When µ is large, the LM algorithm tends to gradient descent with a small step size, i.e., (1/µ), while for small µ the LM algorithm tends to Newton’s method. Since Newton’s method is faster and more accurate than gradient descent, the aim is to shift toward Newton’s method as quickly as possible. Thus, µ is divided by a factor β (β > 1) after each successful step, i.e., a reduction in the MSE given in (A.1). Conversely, µ is multiplied by the factor β when a tentative step increases the MSE. Therefore, the MSE is always reduced at each iteration of the algorithm [88]. The initial value of µ is usually set to 0.01 and β is usually set to 10. For further details regarding the LM training algorithm, the interested reader can refer to [88]. The implementation of the LM learning algorithm for the SRWNN is presented in the following.

Since computation of the Jacobian matrix is the most important part of the LM algorithm, it is required to determine the first derivative of the network errors with respect to each free parameter of (3.6) in the SRWNN, i.e., $v_j$, $w_i$, $a_i$, $b_i$, $\theta_{i,j}$, and $g$. The elements of the Jacobian matrix are calculated by the following equations.

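\[
\frac{\partial e}{\partial v_j} = \frac{\partial (t - y)}{\partial v_j} = -x_j, \qquad j = 1, 2, \ldots, M \tag{A.3}
\]
\[
\frac{\partial e}{\partial w_i} = \frac{\partial (t - y)}{\partial w_i} = -\Psi_i, \qquad i = 1, 2, \ldots, N \tag{A.4}
\]
\[
\frac{\partial e}{\partial a_i} = \frac{\partial (t - y)}{\partial a_i} = -w_i \frac{\partial \Psi_i}{\partial a_i}, \qquad i = 1, 2, \ldots, N \tag{A.5}
\]
\[
\frac{\partial \Psi_i}{\partial a_i} = \sum_{j=1}^{M} \left[ \frac{d\psi_{i,j}}{da_i} \prod_{l=1,\, l \neq j}^{M} \psi(r_{i,l}) \right], \tag{A.6}
\]
\[
\frac{d\psi_{i,j}}{da_i} = \frac{-r_{i,j}}{a_i}\, \psi'(r_{i,j}), \tag{A.7}
\]

where $\psi'(\cdot)$ is the derivative of the Morlet mother wavelet function.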

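\[
\frac{\partial e}{\partial b_i} = \frac{\partial (t - y)}{\partial b_i} = -w_i \frac{\partial \Psi_i}{\partial b_i}, \qquad i = 1, 2, \ldots, N \tag{A.8}
\]
\[
\frac{\partial \Psi_i}{\partial b_i} = \sum_{j=1}^{M} \left[ \frac{d\psi_{i,j}}{db_i} \prod_{l=1,\, l \neq j}^{M} \psi(r_{i,l}) \right], \tag{A.9}
\]
\[
\frac{d\psi_{i,j}}{db_i} = \frac{-1}{a_i}\, \psi'(r_{i,j}), \tag{A.10}
\]
\[
\frac{\partial e}{\partial \theta_{i,j}} = -w_i \frac{\partial \Psi_i}{\partial \theta_{i,j}}, \qquad i = 1, \ldots, N, \quad j = 1, \ldots, M \tag{A.11}
\]
\[
\frac{\partial \Psi_i}{\partial \theta_{i,j}} = \frac{\psi_{i,j} z^{-1}}{a_i}\, \psi'(r_{i,j}) \prod_{l=1,\, l \neq j}^{M} \psi(r_{i,l}) \tag{A.12}
\]
\[
\frac{\partial e}{\partial g} = \frac{\partial (t - y)}{\partial g} = -1 \tag{A.13}
\]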

Therefore, the Jacobian matrix of size $Q \times N_P$ can be computed using (A.3) to (A.13), and all free parameters of the SRWNN are updated using (A.2). The procedure of the LM learning algorithm for training the SRWNN is summarized as follows:

1. Set the iteration number to 1, i.e., $k = 1$. Randomly initialize the free parameters $v_j$, $w_i$, $a_i$, $b_i$, $\theta_{i,j}$ and $g$ of the forecasting engine within their allowable ranges for the first iteration, $P_1$.

2. Present all inputs $X_q$ and compute the corresponding SRWNN outputs $y_q$ using (3.5). Moreover, compute the corresponding errors $e_q$ and the performance index MSE using (A.1).

3. Compute the Jacobian matrix.

4. Update the free parameters of the forecasting engine using (A.2) to obtain $P_{k+1}$.

5. Compute the performance index MSE using $P_{k+1}$. If the new MSE is smaller than the one computed in step 2, divide the parameter µ by the factor β, and save $P_{k+1}$. Otherwise, multiply the parameter µ by β and go back to step 3.

6. Increment $k$, i.e., $k = k + 1$. The training algorithm is terminated when the termination criterion is satisfied. Otherwise, go back to step 3. It is noted that the termination criterion can be the maximum number of iterations. However, the early stopping technique, discussed in Section 3.3.2, is used as the termination criterion of the training algorithm in this thesis, as it monitors the prediction ability of the SRWNN forecast engine for the unseen samples and terminates the training process at the point with the least validation error.
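The procedure above can be sketched on a toy problem. The example below is not the SRWNN: it fits a hypothetical two-parameter model y = p0·tanh(p1·x) so that the update (A.2) reduces to a 2×2 linear solve. It also uses a maximum number of iterations instead of early stopping and adds a small lower bound on µ as a numerical safeguard; these simplifications, the function names and the data are all assumptions made for illustration.

```python
import math


def model(p, x):
    """Hypothetical toy model standing in for the forecasting engine."""
    return p[0] * math.tanh(p[1] * x)


def errors(p, xs, ts):
    return [t - model(p, x) for x, t in zip(xs, ts)]


def mse(e):
    return sum(v * v for v in e) / len(e)


def jacobian(p, xs):
    """Rows hold the derivatives of e = t - y with respect to p0 and p1."""
    rows = []
    for x in xs:
        th = math.tanh(p[1] * x)
        rows.append([-th, -p[0] * x * (1.0 - th * th)])
    return rows


def lm_step(p, J, e, mu):
    """Solve (J^T J + mu I) d = J^T e for two parameters and return p - d."""
    a11 = sum(r[0] * r[0] for r in J) + mu
    a12 = sum(r[0] * r[1] for r in J)
    a22 = sum(r[1] * r[1] for r in J) + mu
    g1 = sum(r[0] * v for r, v in zip(J, e))
    g2 = sum(r[1] * v for r, v in zip(J, e))
    det = a11 * a22 - a12 * a12
    d1 = (a22 * g1 - a12 * g2) / det
    d2 = (a11 * g2 - a12 * g1) / det
    return [p[0] - d1, p[1] - d2]


def train_lm(xs, ts, p, mu=0.01, beta=10.0, max_iter=50):
    e = errors(p, xs, ts)
    best = mse(e)
    history = [best]
    for _ in range(max_iter):
        candidate = lm_step(p, jacobian(p, xs), e, mu)
        e_new = errors(candidate, xs, ts)
        new = mse(e_new)
        if new < best:                      # successful step: accept, shrink mu
            p, e, best = candidate, e_new, new
            mu = max(mu / beta, 1e-12)      # lower bound is an added safeguard
            history.append(best)
        else:                               # failed step: reject, enlarge mu
            mu *= beta
    return p, history


xs = [i / 4.0 for i in range(-8, 9)]
ts = [model([2.0, 0.5], x) for x in xs]     # synthetic targets
p_final, history = train_lm(xs, ts, [1.0, 1.0])
print(p_final, history[-1])
```

Because a tentative step is accepted only when it lowers the MSE, the recorded MSE values never increase, which mirrors the behaviour described for steps 5 and 6.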

Appendix D

Mutual-Information feature selection

This thesis implements the feature selection technique proposed in [52] for electricity price pre-

diction. Constructed on the information theoretic criterion of mutual information, this method

selects the most informative input features for the forecast process by filtering out the irrelevant

and redundant candidate features through two stages. In this work, a vector of candidate features is

formed including market clearing prices, hourly Ontario electric prices, pre-dispatch prices, supply

cushion and Ontario demand forecast.

In the first stage, called the irrelevancy filter, the mutual information between each candidate feature, i.e., $f_i(t)$, and the target variable is calculated. A higher value of mutual information for $f_i(t)$ means that this feature shares more information content with the target variable. Defining a relevancy threshold denoted by $T_{Rel}$, the candidate inputs with a calculated mutual information value greater than $T_{Rel}$ are considered as the relevant features of the forecast process. These features are retained for the second stage, while the other candidate features, whose mutual information values are lower than $T_{Rel}$, are considered irrelevant and are filtered out.

In the second stage, called the redundancy filter, redundant features among the candidate features selected by the irrelevancy filter are detected and filtered out. Two selected candidates from the first stage, e.g., $f_k(t)$ and $f_l(t)$, with a high value of mutual information between them have more common information, meaning a high level of redundancy. Therefore, the redundancy of each selected feature $f_k(t)$ with the other candidate inputs is first calculated. Defining a redundancy threshold denoted by $T_{Red}$, if the measured redundancy is greater than $T_{Red}$, $f_k(t)$ is considered a redundant candidate feature. Thus, between this candidate and the feature that has the maximum redundancy with $f_k(t)$, the one with lower relevancy is filtered out [13].

The candidate features retained by the redundancy filter are considered as the inputs of the price forecasting model. It is noted that a cross-validation technique is used for fine-tuning the values of the thresholds $T_{Rel}$ and $T_{Red}$. Since this method is not the focus of this thesis, the interested reader can refer to [150] for the detailed formulation of the mutual information criterion.
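The two-stage filter can be sketched as follows. This is an illustration under stated assumptions rather than the formulation of [150]: the features are taken to be already discretized, mutual information is estimated from simple frequency counts, and the feature names, threshold values and data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences of equal length."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def select_features(features, target, t_rel, t_red):
    """Irrelevancy filter followed by redundancy filter."""
    relevance = {name: mutual_information(v, target) for name, v in features.items()}
    selected = [name for name in features if relevance[name] > t_rel]

    removed = set()
    for name in selected:
        if name in removed:
            continue
        others = [o for o in selected if o != name and o not in removed]
        if not others:
            break
        # Feature sharing the most information with the current candidate.
        partner = max(others, key=lambda o: mutual_information(features[name], features[o]))
        if mutual_information(features[name], features[partner]) > t_red:
            # Of the redundant pair, drop the one with the lower relevancy.
            removed.add(name if relevance[name] < relevance[partner] else partner)
    return [name for name in selected if name not in removed]


# Illustrative binary data only.
target = [0, 0, 1, 1, 0, 0, 1, 1]
features = {
    "a": [0, 0, 1, 1, 0, 0, 1, 1],  # fully informative
    "b": [0, 0, 1, 1, 0, 0, 1, 0],  # noisy copy of "a" (redundant)
    "c": [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the target (irrelevant)
}
print(select_features(features, target, t_rel=0.1, t_red=0.5))
```

In this toy case the irrelevancy filter discards "c", and the redundancy filter keeps the more relevant member of the pair "a" and "b". In practice the two thresholds would be tuned, for example by cross validation as noted above.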

Appendix E

Copyright permission letters

To Whom It May Concern:

I, Dr. Hamidreza Zareipour, hereby grant permission to Mr. Hamed Chitsaz to reuse the below
three articles in his thesis titled “Developing Energy Forecasting Tools in Power Systems:
Application to Microgrids”.

1. H. Chitsaz, N. Amjady, and H. Zareipour, “Wind power forecast using wavelet neural
network trained by improved clonal selection algorithm,” Energy Conversion and
Management, vol. 89, pp. 588–598, January 2015.

2. H. Chitsaz, H. Shaker, H. Zareipour, D. Wood, and N. Amjady, “Short-term electricity load
forecasting of buildings in microgrids,” Energy and Buildings, vol. 99, pp. 50–60, July
2015.

3. H. Chitsaz, P. Zamani-Dehkordi, H. Zareipour, and P. Parikh, “Electricity price
forecasting for operational scheduling of behind-the-meter storage systems,” IEEE
Transactions on Smart Grid, 2017, in press.

I agree to the terms outlined in the University of Calgary Non-Exclusive Distribution License. I
am aware that all University of Calgary Theses are also archived by the Library and Archives
Canada (LAC) and the University of Calgary Theses may be submitted to ProQuest.

Date:

Signature:

To Whom It May Concern:

I, Dr. David Wood, hereby grant permission to Mr. Hamed Chitsaz to reuse the below article in
his thesis titled “Developing Energy Forecasting Tools in Power Systems: Application to
Microgrids”.

H. Chitsaz, H. Shaker, H. Zareipour, D. Wood, and N. Amjady, “Short-term electricity load
forecasting of buildings in microgrids,” Energy and Buildings, vol. 99, pp. 50–60, July 2015.

I agree to the terms outlined in the University of Calgary Non-Exclusive Distribution License. I
am aware that all University of Calgary Theses are also archived by the Library and Archives
Canada (LAC) and the University of Calgary Theses may be submitted to ProQuest.

Date:

Signature:

To Whom It May Concern:

I, Dr. Nima Amjady, hereby grant permission to Mr. Hamed Chitsaz to reuse the below two articles
in his thesis titled “Developing Energy Forecasting Tools in Power Systems: Application to
Microgrids”.

1. H. Chitsaz, N. Amjady, and H. Zareipour, “Wind power forecast using wavelet neural
network trained by improved clonal selection algorithm,” Energy Conversion and
Management, vol. 89, pp. 588–598, January 2015.

2. H. Chitsaz, H. Shaker, H. Zareipour, D. Wood, and N. Amjady, “Short-term electricity load
forecasting of buildings in microgrids,” Energy and Buildings, vol. 99, pp. 50–60, July
2015.

I agree to the terms outlined in the University of Calgary Non-Exclusive Distribution License. I
am aware that all University of Calgary Theses are also archived by the Library and Archives
Canada (LAC) and the University of Calgary Theses may be submitted to ProQuest.

Date:

Signature:

To Whom It May Concern:

I, Dr. Hamid Shakerardakani, hereby grant permission to Mr. Hamed Chitsaz to reuse the below
article in his thesis titled “Developing Energy Forecasting Tools in Power Systems: Application
to Microgrids”.

H. Chitsaz, H. Shaker, H. Zareipour, D. Wood, and N. Amjady, “Short-term electricity load
forecasting of buildings in microgrids,” Energy and Buildings, vol. 99, pp. 50–60, July 2015.

I agree to the terms outlined in the University of Calgary Non-Exclusive Distribution License. I
am aware that all University of Calgary Theses are also archived by the Library and Archives
Canada (LAC) and the University of Calgary Theses may be submitted to ProQuest.

Date:

Signature:

To Whom It May Concern:

I, Dr. Payam Zamani-Dehkordi, hereby grant permission to Mr. Hamed Chitsaz to reuse the below
article in his thesis titled “Developing Energy Forecasting Tools in Power Systems: Application
to Microgrids”.

H. Chitsaz, P. Zamani-Dehkordi, H. Zareipour, and P. Parikh, “Electricity price forecasting for
operational scheduling of behind-the-meter storage systems,” IEEE Transactions on Smart Grid,
2017, in press.

I agree to the terms outlined in the University of Calgary Non-Exclusive Distribution License. I
am aware that all University of Calgary Theses are also archived by the Library and Archives
Canada (LAC) and the University of Calgary Theses may be submitted to ProQuest.

Date:

Signature:

To Whom It May Concern:

I, Dr. Palak Parikh, hereby grant permission to Mr. Hamed Chitsaz to reuse the below article in
his thesis titled “Developing Energy Forecasting Tools in Power Systems: Application to
Microgrids”.

H. Chitsaz, P. Zamani-Dehkordi, H. Zareipour, and P. Parikh, “Electricity price forecasting for
operational scheduling of behind-the-meter storage systems,” IEEE Transactions on Smart Grid,
2017, in press.

I agree to the terms outlined in the University of Calgary Non-Exclusive Distribution License. I
am aware that all University of Calgary Theses are also archived by the Library and Archives
Canada (LAC) and the University of Calgary Theses may be submitted to ProQuest.

Date:

Signature:

Rightslink® by Copyright Clearance Center (2017-11-29, 9:03 PM)

Title: Wind power forecast using wavelet neural network trained by improved Clonal selection algorithm
Author: Hamed Chitsaz, Nima Amjady, Hamidreza Zareipour
Publication: Energy Conversion and Management
Publisher: Elsevier
Date: 1 January 2015
Copyright © 2014 Elsevier Ltd. All rights reserved.

Please note that, as the author of this Elsevier article, you retain the right to include it in a thesis or
dissertation, provided it is not published commercially. Permission is not required, but please ensure
that you reference the journal as the original source. For more information on this and on your other
retained rights, please visit: https://www.elsevier.com/about/our-business/policies/copyright#Author-rights

Rightslink® by Copyright Clearance Center (2017-11-29, 9:04 PM)

Title: Short-term electricity load forecasting of buildings in microgrids
Author: Hamed Chitsaz, Hamid Shaker, Hamidreza Zareipour, David Wood, Nima Amjady
Publication: Energy and Buildings
Publisher: Elsevier
Date: 15 July 2015
Copyright © 2015 Elsevier B.V. All rights reserved.

Please note that, as the author of this Elsevier article, you retain the right to include it in a thesis or
dissertation, provided it is not published commercially. Permission is not required, but please ensure
that you reference the journal as the original source. For more information on this and on your other
retained rights, please visit: https://www.elsevier.com/about/our-business/policies/copyright#Author-rights

Rightslink® by Copyright Clearance Center (2017-11-24, 2:41 PM)

Title: Electricity Price Forecasting for Operational Scheduling of Behind-the-meter Storage Systems
Author: Hamed Chitsaz
Publication: Smart Grid, IEEE Transactions on
Publisher: IEEE
Date: Dec 31, 1969
Copyright © 1969, IEEE

Thesis / Dissertation Reuse

The IEEE does not require individuals working on a thesis to obtain a formal reuse license, however,
you may print out this statement to be used as a permission grant:

Requirements to be followed when using any portion (e.g., figure, graph, table, or textual material) of an IEEE
copyrighted paper in a thesis:

1) In the case of textual material (e.g., using short quotes or referring to the work within these papers) users
must give full credit to the original source (author, paper, publication) followed by the IEEE copyright line © 2011
IEEE.
2) In the case of illustrations or tabular material, we require that the copyright line © [Year of original
publication] IEEE appear prominently with each reprinted figure and/or table.
3) If a substantial portion of the original paper is to be used, and if you are not the senior author, also obtain the
senior author's approval.

Requirements to be followed when using an entire IEEE copyrighted paper in a thesis:

1) The following IEEE copyright/ credit notice should be placed prominently in the references: © [year of original
publication] IEEE. Reprinted, with permission, from [author names, paper title, IEEE publication title, and
month/year of publication]
2) Only the accepted version of an IEEE copyrighted paper can be used when posting the paper or your thesis
on-line.
3) In placing the thesis on the author's university website, please display the following message in a prominent
place on the website: In reference to IEEE copyrighted material which is used with permission in this thesis, the
IEEE does not endorse any of [university/educational entity's name goes here]'s products or services. Internal or
personal use of this material is permitted. If interested in reprinting/republishing IEEE copyrighted material for
advertising or promotional purposes or for creating new collective works for resale or redistribution, please go to
http://www.ieee.org/publications_standards/publications/rights/rights_link.html to learn how to obtain a License
from RightsLink.

If applicable, University Microfilms and/or ProQuest Library, or the Archives of Canada may supply single copies
of the dissertation.
