Professional Documents
Culture Documents
Learning and Optimization For Price-Based Demand Response of Electric Vehicle Charging
Learning and Optimization For Price-Based Demand Response of Electric Vehicle Charging
Learning and Optimization For Price-Based Demand Response of Electric Vehicle Charging
Abstract— In the context of charging electric vehicles (EVs), demands and price [4]–[8]. Knowing such PBDR patterns of
the price-based demand response (PBDR) is becoming increas- EV customers, the charging station operator can precisely
ingly significant for charging load management. Such response estimate customers’ charging requirements and hence de-
usually encourages cost-sensitive customers to adjust their
energy demand in response to changes in price for financial velop reasonable station-level charging policy to minimize
arXiv:2404.10311v1 [eess.SY] 16 Apr 2024
incentives. Thus, to model and optimize EV charging, it is operation cost while satisfying charging demands.
important for charging station operator to model the PBDR Previous research attempts utilized price-based demand
patterns of EV customers by precisely predicting charging response to encourage EV charging flexibility [9]. [5] turned
demands given price signals. Then the operator refers to to a rule-based approach to design the deadline-differentiated
these demands to optimize charging station power allocation
policy. The standard pipeline involves offline fitting of a PBDR electric power service. Yet such rule-based methods heavily
function based on historical EV charging records, followed rely on a set of pre-defined heuristics or pre-assumed condi-
by applying estimated EV demands in downstream charging tions. An on-off EV demand response strategy was leveraged
station operation optimization. In this work, we propose a new by formulating a binary optimization problem [7], and finite-
decision-focused end-to-end framework for PBDR modeling horizon Markov decision process was used to coordinate
that combines prediction errors and downstream optimization
cost errors in the model learning stage. We evaluate the consumer owned energy storage [8]. These methods however
effectiveness of our method on a simulation of charging station simplified the demand response model such that charging de-
operation with synthetic PBDR patterns of EV customers, mands were assumed known exactly beforehand. Moreover,
and experimental results demonstrate that this framework can both rule-based and optimization-based methods do not take
provide a more reliable prediction model for the ultimate into account the modeling procedure of price-based demand,
optimization process, leading to more effective optimization
solutions in terms of cost savings and charging station operation and the charging demand is either estimated or assumed
objectives with only a few training samples. known exactly. An inaccurate forecast of price responsive
charging load would further incur loss of robustness and
I. INTRODUCTION reliability of station operation [10]. In this paper, we take
Recent fast-growing penetration of electric vehicles (EVs) a specific look into the relationship between charging price
casts unforeseen challenges for charging station and power assigned to each EV owner and EV’s charging demand, and
grid operations. Massive number of charging sessions in- investigate how to optimize charging strategy when such
evitably result in overload due to the cluster effect on relationship is unrevealed to the charging station. We answer
excessively concentrated EV charging stations and charging the following research question:
times [1], [2]. As both charging station and EV usually have How do we optimize EV charging scheduling over unknown
limited capacities, excessive charging requests during peak demands incurred by variable charging price?
times can cause considerable operational challenges. It is We tackle both the demand forecasting and multi-step
thus hard for charging stations to both satisfy consumers’ charging optimization problems in one framework. The pro-
demands and maximize charging profits. posed paradigm holds the promise of adaptively improving
Thus, it is desirable to alleviate the overload of the EV its performance over time when exposed to more data and
charging stations with appropriate charging control. And feedback, making it well-suited for handling complex and
when optimizing the charging control policy, one non- dynamical problems. Standard pipelines involve individually
negligible factor is the price-based demand response (PBDR) fitting of a demand response function using the collected EV
of EV chargers. Cost-sensitive EV customer can actively charging data, followed by the incorporation of predicted
adjust their total amount of charging energy based on the results into the ultimate optimization task [11], [12]. Such
assigned charging prices [3]. Therefore, in order to better workflow neglects the fact that the locally accurate predic-
operate simultaneous EV charging sessions, the price-based tions may not benefit the underlying optimization problem’s
demand response (PBDR) strategy needs to be investigated overall objectives [10], [13]. Moreover, the complex physical
through modeling the inherent relationship between energy constraints make such predict-then-optimize framework pro-
1 duce infeasible solutions. [14] considered demand response
AI Thrust, Information Hub, The Hong Kong
University of Science and Technology (Guangzhou). model in an end-to-end fashion, while the application was
{cgu893,rliu519}@connect.hkust-gz.edu.cn , limited to identifying a subset of demand response model
yizechen@ust.hk parameters rather than solving the optimization problem.
2 Division of Emerging Interdisciplinary Area under IPO,
The Hong Kong University of Science and Technology. Our formulation makes it possible to backpropagate from
yuxin.pan@connect.ust.hk the ultimate charging task-based objective to the learning
Fig. 2: Examples of EV PBDR patterns. Individual EV’s
charging demand is a function of charging price.
X X X X
min g(x) = p(t) xi (t) − ci xi (t)+
x
t i i t
Fig. 1: (1a). Proposed EV demand estimation and charging XX
2
X
x2i (t) (1a)
β ( xi (t) − ei ) +α
station operation model. EV customers have various pat-
i t i,t
terns of demand response to price. (1b). Proposed end- X
to-end charging demand forecasts and charging scheduling s.t. 0≤ xi (t) ≤ x, ∀t (1b)
framework. With optimization task-based loss L, our model i
propagates loss derivative with respect to neural network 0 ≤ xi (t) ≤ ȳi , ∀i, t (1c)
model’s parameter θ to help train the prediction model f (·). xi (t) = 0, f or t ≤ ta,i and t ≥ td,i (1d)
of individual PBDR model. Moreover, it elicits the control ei = fi (ci ); (1e)
actions for EV charging. The schematic of proposed method where x = [x1 (1), x1 (2), ..., x1 (T ), x2 (1), ..., , xN (T )]T is
is shown in Fig. 1. Specifically, to find the price-responsive the control variable; xi (t) represents the charging power
EV charging demands, we learn the complex price-based assigned to EV customer i at time t; p(t) is the electricity
demand response (PBDR) as a function of charging price purchase price from the grid; ci and ei are the electricity
unrevealed to the system operator. We combine prediction selling price and total electricity demand for EV customer i.
errors and downstream optimization cost errors in the model x is maximum charging power at time t for the station. ȳi
learning stage. Experimentally, our method is evaluated on is the maximum charging power for EV customer i. With
noisy demand response samples based on varying charging arrival time ta,i and departure time td,i , and β, α is the
prices. The results demonstrate that our framework can lead coefficient for regularizing demand charging completion and
to a considerable decrease in the operation costs by more non-smooth charging power.
than 20% compared to the standard two-step (predict-then- With known ei , this EV scheduling problem is a con-
optimize) method. This advantage is even more significant strained Quadratic Programming (QP) problem which strong
when the training dataset is small, meaning only limited duality holds, and it can be solved to find optimal charging
customer responses are revealed to the charging stations. policy. However, it can be challenging to solve in practice
II. P ROBLEM F ORMULATION with unknown ei . Since each EV owner has distinct charging
behaviors, while the charging demand can be varying with
In this section, we describe the formulation of charging respect to price. From the charging station side, it cannot
station operation problem, in which we further develop the solve (1a) without knowing such charging demand informa-
relationship between the electricity selling price assigned tion. Customers adjust their electricity demands according
to EV customers and their corresponding total electricity to the selling price signal, and each customer may have its
demand as a price-based demand response (PBDR) model. unique response to the price ei . In order to model this price-
demand map and use it to infer demand given price, we form
A. Charging Station Operation Model
the following demand price-response models. We note that
We first formulate the charging station operation model. in our setting, charging station operator informs customer ci
We suppose the goal of one EV charging station is to find before charging starts, and the optimization of ci along with
the optimal policy to minimize its operation cost, meanwhile charging scheduling is left for future work.
satisfying its customers’ charging demands as much as
possible. Each EV comes to the charging station at various B. Price-based Demand Response Model
timesteps, and reveals their requested energy based on the In this work we include two examples on the explict
charging price (costs in per kWh) along with the departure modeling of price-based EV demand response which is not
time. The charging station can control the charging rate of considered in previous research. We note that our solution
techniques described in Section III are not limited to the challenging to design charging control under such situation.
particular function mapping from charging price to charging In this section, we illustrate how it is possible to include
demand, and we are able to solve the charging station op- the training of price-demand forecasting module as a fully
timization problem under various unknown price-responsive differentiable part in the EV charging optimization model.
demand patterns.
Example 1: Utility-maximization EV demand A. Reformulation of Charging Station Model
Once the EV owner i comes to the charging station and
To simplify the analysis, we first reformulate the optimiza-
gets the price ci revealed, the EV aims at minimizing the
tion problem, and rewrite the charging station optimization
charging costs while maximizing the charging utility, given
problem (1a) as a standard Quadratic Programming (QP):
the prevailing electricity price. The utility function f (ei ) is
usually assumed to be continuous, nondecreasing, and con-
cave [15]. In this paper, the user utility function is modeled as min g(x) =β(Bx − ê)T (Bx − ê) + (pT − cT B)x + αxT x
x
f (ei ) = −(elevel,i − etrip,i )2 , where elevel,i = einit,i + ei ηi . 1
ei , elevel,i , etrip,i , and einit,i denote charging demand, final = xT (2βB T B + 2αI)x + (pT − cT B − 2β êT B)x
2
energy level, desired energy level after charging session, and + β êT ê (4a)
initial energy level, respectively. Given that the users know
−A
0
their energy requirement for an upcoming trip, their utility A r
function represents how well the charging need is met. Such s.t. x≤ (4b)
−I 0
utility maximization problem is formulated as I y
F x = 0. (4c)
min γ1 ci ei + γ2 (elevel,i − etrip,i )2 (2a)
ei
s.t. elevel,i = einit,i + ei ηi (2b) • p = [p(1), p(2), ..., p(T ), ......, p(1), p(2), ..., p(T )]T ∈
RN T ×1 . lists the electricity price for charging xi (t).
SOCmin,i ≤ elevel,i ≤ SOCmax,i (2c)
ei • c = [c1 , c2 , .., cN ]T lists the selling price assigned to
≤ pmax,i (2d) each EV customer i.
(td,i − ta,i )
• ê = [ê1 , ê2 , .., êN ]T lists the estimated demand of each
In the formulation above, Constraint (2b) states that the EV customer i by a Multi-layer Perceptron (MLP)
energy level elevel,i of the ith electric vehicle (EV) is equal model ê = f (c; θ).
to the sum of its initial energy level einit,i and the charging • r = x̄1(T ×1) ∈ RT ×1 represents the charging capacity
demand ei ; (2c) requires that the energy level elevel,i must (i.e. station level maximum charging power).
fall within the battery state-of-charge capacity range; (2d) • y = [ȳ1 1T(T ×1) , ȳ2 1T(T ×1) , ..., ȳN 1T(T ×1) ]T ∈ RN T ×1
specifies that the average charging power rate must not represents the customer level maximum charging power.
exceed the maximum power rate. γ1 and γ2 denote the • A =P [I(T ×T ) , ..., I(T ×T ) ] ∈ RTP ×N T
helps transform
weighting factors of the charging cost and utility term; term i xi (t) into matrix form. i xi (t) = Ax.
ηi ∈ (0, 1) represents the charging efficiency. Generally, the 1TT ×1 0 ... 0
charging demand ei obtained by solving (2a) will have a 0 1 T
... 0
• B = T ×1 ∈ RN ×N T helps
piece-wise linear relationship with price signal ci . Fig. (2)(a) ... ... ... ...
illustrates one example of such pattern. 0 T
Example 2: Piecewise quadratic EV demand P0 P ... 1T ×1 2
Some EV owners are price-sensitive over certain price P P term i2( t xi (t) − êi ) into matrix form.
transform
i( t xi (t) − êi ) = Bx − ê.
range, meaning that an increase in the charging cost results in • F ∈ RH×N T transforms the arrival and departure
a rapid decline in their demand for charging. To approximate constraints (1d) into matrix form F x = 0. H is the
the behaviour of this kind of customers, we adopt a general total number of elements xi (t) which have constraint
price-demand model and construct the piecewise function xi (t) = 0. Each line Fh of F corresponds to one
with quadratic components for their price-response model: particular equality constraint xih (th ) = 0; in this line
(
ei,base , if ci ≤ ci,s only element Fh,ih ×T +th = 1.
ei = (3)
max(ei,base − µi (ci − ci,s )2 , 0), if ci > ci,s Previous works usually suppose that if we have an accurate
estimation ê of actual demand e, the system cost (4a) will
where ei,base denotes the original charging demand of the
also achieve its optimal [16]. Hence, two-step approach
ith EV; µi represents the price sensitive parameter; when
separates prediction and optimization: first to train the MLP
the current price ci is greater then its sensitive price level
f (c; θ) by minimizing root mean squared error (RMSE)
ci,s , the EV user decides to decrease charging demand.
between customer demand prediction and actual charging
Fig. (2)(b) illustrates one example of demand price-response
demand; then to use the trained MLP to predict customer
model based on the definition mentioned above.
demands ê, and based on the predicted demands they solve
III. E ND - TO -E ND L EARNING AND O PTIMIZATION problem (4) to obtain minimum operation cost.
In practice, charging station is faced with unknown price- Our goal is to learn the optimal parameter θ of an
based demand response model for each customer. It is MLP prediction model ê = f (c; θ), which approximates
customer’s demand response to price signal. Based on it we updates to be well-defined. We refer to [17] for a more
can minimize actual charging station operation cost L(θ): detailed discussion on the model differentiability.
min L(θ) = β(Bx∗ (θ) − e)T (Bx∗ (θ) − e) C. Model Learning
θ (5)
T T ∗ ∗ T ∗
+(p − c B)x (θ) + αx (θ) x (θ); In practice, we can solve QP problem (4) in batches
efficiently in deep learning environment like Pytorch. We
in which x∗ (θ) represents the optimal solution of the charg-
adapt the neural network architecture defined in OptNet [13].
ing station operation optimization problem (4). e denotes the
Our approach is inspired by such end-to-end differentiable
actual demand of each EV customer.
optimization framework [10], [13].
In contrast to the existing paradigm, our learning model
In our model, we output optimal solution x∗ and compute
is decision-focused, which means in the learning process we
the charging scheduling task-based loss gradient ∂L ∂θ by
focus on the downstream optimization performance rather
solving Equation (8). Then we backpropagate the gradient
than solely on the prediction accuracy. The overall algorithm
to update parameter θ in MLP model f (ê|c; θ) until the loss
is an end-to-end framework to find the optimal decisions with
converges to an optimal value. Once the model is trained,
price signals as input. Fig. (1) illustrates the architecture
we use it to generate estimated demand ê. Then we forward
of our framework. More specifically, we directly use the
ê to the differentiable optimization module, and it will solve
objective function (5) of the charging station operation model
(4) and output optimal solution x∗ . Given x∗ we use (5) to
as the training loss for the demand forecasting model MLP
compute the actual optimal operation cost L. The whole end-
f (·), rather than the RMSE of demand prediction, and back
to-end model learning process is illustrated in Algorithm. 1.
propagate its gradient with respect to MLP model parameters
∂L ∂L ∂L ∂x∗
∂θ , to adapt the model: ∂θ = ∂x∗ ∂θ . From (5) we can Algorithm 1 Charging Scheduling task based loss optimiza-
∂L
easily obtain ∂x ∗ . Hence, we focus on computing the partial
∂x∗ tion
derivative ∂θ hereafter.
1: Input: price signal c
B. Differentiating the Optimal Solution 2: Initialize parameter θ for MLP model f (ê|c; θ).
Now we analyze the optimal solution x∗ of this standard 3: for Iteration j (j = 1, 2, ..., Jmax ) do
form QP problem (4). Since strong duality holds, x∗ should 4: Estimate ê = f (c|θ).
satisfy the Karush-Kuhn-Tucker (KKT) Conditions: 5: Solve Equation (4) for optimal solution x∗ with esti-
−A
T mated ê. ∗
A 6: With x∗ compute derivative ∂L(x )
by solving Equa-
(2βB T B + 2αI)x∗ +(pT − cT B − 2β êT B)T + λ∗ ∂θ
−I tion (8).
I ∂L ∂x∗
7: Update θj+1 ← θj − η ∂x ∗ ∂θ .
T ∗
+F ν = 0 (6a) 8: end for
F x∗ = 0 (6b)
−A 0 IV. N UMERICAL S IMULATIONS
∗ A ∗ r
diag(λ )( x − ) = 0 (6c) In the simulation, we consider running a real-world charg-
−I 0
I y ing station with electricity price data in Anaheim City,
California [18], which has limited charging capacity and
where λ∗ and ν ∗ are optimal∗dual variables. We compute
the target partial derivative ∂x serves multiple EV customers for each day. We evaluate
∂θ by differentiating Equation
(6). The matrix form of the KKT differentials is: the performance with the overall actual optimal operation
cost L(θ), and 3 sub evaluation metrics: (1) operation profit
2βB T B + 2αI [−AT AT − I I] FT
−A
−A
0
∗ (pT − cT B)x∗ , representing the profits earned by charging
dx 2βB T dê
∗ A A ∗ r station. (2) charging demand completion penalty β(Bx∗ −
∗
diag(λ ) diag( x − ) 0 dλ = 0
−I −I 0 dν ∗
0 ê)T (Bx∗ − ê), reflecting the demand satisfaction rate of EV
I I y
F 0 0 customers. Lower penalty value represents higher demand
satisfaction rate. (3) smooth penalty α(x∗ )T x∗ , reflecting
| {z }
G
(7)
We divide both sides of Equation 7 with differential dθ. the smoothness of charging curves under charging policy x∗ .
And then we ∗have a set of linear equations to solve for target We compare our framework with ground-truth results (i.e.
derivative ∂x
∂θ :
solve (4) for optimal solution x∗ under no demand estimation
dx∗ errors) and the two-step approach (as compared baseline). All
dθ∗ 2βB T dê
dθ models are built and trained with NVIDIA RTX 2080 Ti and
G dλ = 0 (8)
dθ
dν ∗
Intel(R) Xeon(R) Silver 4210 CPU.
0
dθ In the charging station model, we assume that the op-
Matrix G is obviously an upper triangular matrix here. erational horizon of station is T = 24 hours since EV
Hence, it will always be invertible. Linear Equation 8 will charging follows daily commute patterns, and the number
∗
dλ∗ dν ∗
achieve the unique solutions of gradients dx
dθ , dθ , dθ . This of EV customers N = 5. Each EV customer can charge at
gives the guarantee to make the gradients and parameter a particular station at different starting times, so each EV
2-hidden-layer network with a residual connection between
inputs and outputs as the MLP model in both two-step
approach and our decision-focused end-to-end framework.
We respectively use 200, 400, 600, 800 samples in training
dataset generated in Section IV.B to train the MLP model.
For both methods we train the models for 250 iterations. We
evaluate the performance of two methods with the average
operation cost, profit, penalties of demand completion and
(a) (b) smooth on the 100-sample test dataset in Section IV.B. All
Fig. 3: (3a) shows the synthetic price-based demand response the results are benchmarked against ground-truth setting with
(PBDR) patterns used in the simulation; (3b) shows the known individual EV’s charging price and demands.
real-world hourly commercial electricity purchase price [18]
TABLE I: Evaluation on operation cost L ($) with different
with colors indicating different price-time ranges. We set this
number of training samples. Ground-Truth value is -11.69.
peak-valley purchase price as p in our simulation. Samples Decision-focused E2E Two-step Improvement($)
200 3.39±0.45 7.01±1.44 3.62
400 -4.90±0.45 -1.91±0.25 2.99
600 -7.26±0.24 -4.45±0.59 2.81
800 -8.07±1.02 -5.56±0.31 2.51