Charging Scheduling of Electric Vehicles With Local Renewable Energy Under Uncertain Electric Vehicle Arrival and Grid Power Price

2600 IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, VOL. 63, NO. 6, JULY 2014

Tian Zhang, Wei Chen, Senior Member, IEEE, Zhu Han, Fellow, IEEE, and Zhigang Cao, Senior Member, IEEE

Abstract: In this paper, we consider delay-optimal charging scheduling of the electric vehicles (EVs) at a charging station with multiple charge points. The charging station is equipped with renewable energy generation devices and can also buy energy from power grids. The uncertainty of the arrival of the EV, the intermittence of the renewable energy, and the variation of the grid power price are taken into account and described as independent Markov processes. Meanwhile, the required charging energy for each EV is random. The goal is to minimize the mean waiting time for EVs under the long-term constraint on the cost. We propose queue mapping to convert the EV queue to the charging demand queue, and we prove the equivalence between the minimization of the two queues' average length. Then, we focus on the minimization for the average length of the charging demand queue under the long-term cost constraint. We propose a framework of Markov decision process (MDP) to investigate this constrained stochastic optimization problem. The system state includes the charging demand queue length, the charging demand arrival, the energy level in the storage battery of the renewable energy, the renewable energy arrival, and the grid power price. Additionally, the number of charging demands and the allocated energy from the storage battery compose the 2-D policy. We derive two necessary conditions of the optimal policy. Moreover, we discuss the reduction of the 2-D policy to be the number of charging demands only. We give the sets of system states for which charging no demand and charging as many demands as possible are optimal, respectively. Finally, we investigate the proposed policies numerically.

Index Terms: Charging scheduling, electric vehicle (EV), Markov decision process (MDP), renewable energy.

Manuscript received March 25, 2013; revised August 2, 2013 and October 5, 2013; accepted December 14, 2013. Date of publication December 20, 2013; date of current version July 10, 2014. This work was supported in part by the National Basic Research Program of China (973 Program) under Grant 2013CB336600 and Grant 2012CB316000; by the National Natural Science Foundation of China through the Excellent Young Investigator Program under Grant 61322111; by the Chinese Ministry of Education through the New Century Talent Program under Grant NCET-12-0302; by the Beijing Nova Program under Grant Z121101002512051; by the National Science and Technology Key Project under Grant 2013ZX03003006-005 and Grant 2013ZX03003004-002; by the U.S. National Science Foundation under Grant CNS-1265268, Grant CNS-1117560, Grant ECCS-1028782, and Grant CNS-0953377; and by the Electric Power Analytics Consortium. The review of this paper was coordinated by Dr. M. S. Ahmed.
T. Zhang is with the School of Information Science and Engineering, Shandong University, Jinan 250100, China. He was also with Tsinghua University, Beijing 100084, China (e-mail: tianzhang.ee@gmail.com).
W. Chen and Z. Cao are with the State Key Laboratory on Microwave and Digital Communications, Tsinghua National Laboratory for Information Science and Technology, Department of Electronic Engineering, Tsinghua University, Beijing 100084, China (e-mail: wchen@tsinghua.edu.cn; czg-dee@tsinghua.edu.cn).
Z. Han is with the Department of Electrical and Computer Engineering, University of Houston, Houston, TX 77204-4005 USA (e-mail: zhan2@uh.edu).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TVT.2013.2295591

I. INTRODUCTION

As an important method of operation to mitigate the shortage of fossil fuels and severe environmental problems, electric vehicle (EV) technology has attracted much interest in recent years. Compared with conventional vehicles, EVs have the following advantages: energy efficiency, ecological effects, performance benefits, and energy independence [1]. Since EVs are propelled by an electric motor (or motors) that is (are) powered by rechargeable battery packs, EVs need to be charged periodically. Then, EV charging becomes an important topic [2], [3].

In the scheduling of EV charging, cost minimization and service quality improvement are two conflicting aspects. On one hand, there are studies focusing on cost minimization under the service quality constraint. In [4], EV battery charging behavior was optimized with the objective to minimize charging costs and to achieve satisfactory state-of-energy levels and optimal power balancing. In [5], the problem of optimizing the plug-in hybrid EV (PHEV) charge trajectory (i.e., timing and rate of the charging) was studied to reduce the energy costs and battery degradation. In [6], a joint optimal power flow–charging (dynamic) optimization problem was formulated with the goal of minimizing the generation and charging costs while satisfying network, physical, and inelastic-load constraints. By modeling an EV charging system as a cyber–physical system, a decentralized online EV charging scheduling scheme was developed in [7]. In [8], the EV charging scheduling problem was formulated to fill the electric load valley as an optimal control problem, and a decentralized algorithm was derived. In [9], a strategy to coordinate the charging of plug-in EVs (PEVs) was proposed by using noncooperative games [10]. Flexible charging optimization for EVs considering distribution grid constraints, both voltage and power, was investigated in [11].

On the other hand, some works focus on improving the service quality with the cost constraint. For the purpose of improving satisfiability of EVs, a reservation-based scheduling algorithm for the charging station to decide the service order of multiple requests was proposed in [12]. In [13], utilizing the particle swarm optimization, a proposed algorithm optimally managed a large number of PHEVs charging at a municipal parking station. In [14], the minimization of the waiting time for EV charging via scheduling charging activities spatially

0018-9545 © 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Authorized licensed use limited to: Cyprus University of Technology. Downloaded on January 12,2021 at 08:44:05 UTC from IEEE Xplore. Restrictions apply.
ZHANG et al.: CHARGING SCHEDULING OF EVs 2601

and temporally in a large-scale road network was investigated. In [15], the tradeoff between the distribution system load and quality of charging service was considered, and the centralized algorithms to schedule the charging of vehicles were designed.

In all aforementioned works, the charging energy is supplied from the power grid only. However, recent studies, e.g., [16], reveal that a fuel-driven vehicle can produce less CO2 than an EV if the charging energy is entirely produced by coal-fired power plants. Then, the renewable energy (e.g., solar or wind energy [17]) should be the energy source of the EVs fully or at least partially to achieve the real environmental advantages. Accordingly, charging scheduling of EVs in the presence of renewable energy becomes a more practical and interesting research problem. However, it has not been extensively investigated in literature. In [18] and [19], the real-time scheduling policies of EV charging were considered when both renewable energy and energy from the grid are available. In [20], the PEV charging and wind power scheduling were integrated, and the synergistic control algorithm of PEV charging and wind power scheduling was proposed.

In this paper, we focus on the scheduling approach of EV charging at a renewable-energy-aided charging station to minimize waiting time.¹ Greatly different from previous works, we not only consider the renewable energy but the uncertain EV arrival, random required charging energy for each EV, and variable grid power price as well. Moreover, we give an analytical framework to study the more complicated and practical problem. The charging station has multiple charge points, and the charged energy at a charge point during a period is constant and is called an energy block. The charging station is equipped with renewable energy generation devices and a storage battery. Meanwhile, the charging energy can be also purchased from the power grid. Once an EV arrives at the charging station, it waits in a queue before charging. In each period, the charging station chooses some EVs from the head of the queue for charging. At the same time, the station also determines how much energy is supplied from the storage battery (the rest of the required energy is supplied from the power grid). The objective is to minimize the mean waiting time of EVs under the long-term cost constraint.

¹ Compared with traditional internal combustion engine (ICE) vehicles powered by gasoline, EVs need frequent charging and take a long time to charge. The long charging time may further result in a long wait at a charging station. For travel efficiency and driver comfort, it is necessary and important to intelligently schedule the electricity charging of EVs to minimize waiting without disrupting the travel plans or the habits of drivers [14].

Since the amount of charging energy (i.e., the number of energy blocks to charge) for each EV is different and random, the scheduling problem is challenging. We propose the queue mapping method to overcome this challenge. We map the EV queue to a charging demand queue. In the charging demand queue, each demand means an energy block that needs to charge, and some consecutive demands correspond to an EV's required charging energy. We prove that the minimization of the average EV queue length is equivalent to the minimization of the average charging demand queue length. Then, we focus on the minimization of the charging demand queue under the cost constraint. The scheduling problem can be equivalently reconstructed as follows. The demand arrives and waits in the charging demand queue before service (charging). At the beginning of each period, the charging station determines the number of charging demands to be served and the amount of allocated energy from the storage battery (the rest of the required energy is purchased from the power grid). The aim is to minimize the mean length of the charging demand queue under the long-term cost constraint.

Next, we find that the charging demand queue optimization problem can be studied under a Markov decision process (MDP) framework. The system state contains the charging demand queue length, the demand arrival, the energy level in the storage battery of the renewable energy, the renewable energy arrival, and the grid power price. Meanwhile, the number of charging demands and the allocated energy from the storage battery constitute the 2-D policy. The mean length of the charging demand queue minimization problem under the long-term cost constraint is formulated as a constrained MDP [23]. We analyze the optimal 2-D policy of the constrained MDP by transforming to an average cost MDP and its corresponding discount cost MDP thereafter. First, the constrained MDP is converted to an unconstrained MDP by using Lagrangian relaxation. Moreover, we derive that the optimal solution of the unconstrained MDP with a certain Lagrangian multiplier is optimal for the original constrained MDP. Next, the unconstrained MDP can be analyzed by transforming to its corresponding discount cost MDP. We obtain two necessary conditions for the optimal solution. Third, we analyze the relations between the two elements of the 2-D policy and find that the number of charging demands is dominant. Thus, we propose a conjecture that the constrained MDP problem can be reduced to an MDP problem with the policy to be the number of charging demands only. We then derive the conditions of the system state when the policy that charging no demand is optimal. We also obtain the system state conditions when charging as many demands as possible is optimal.

The remainder of this paper is structured as follows. In Section II, the system model is described, and we formulate a stochastic optimization problem that minimizes the mean EV queue. In Section III, by proposing a queue mapping method, the EV queue minimization is equivalently transformed to a demand queue minimization problem. Next, Section IV presents the analysis of the mean demand queue minimization. It is reconstructed as a constrained 2-D MDP problem, and we analyze the optimal policy of the constrained MDP. In Section V, the numerical results are performed. Finally, Section VI concludes this paper. The main symbols utilized in the paper and their meanings are listed in Table I.

II. SYSTEM MODEL AND PROBLEM FORMULATION

Time is divided into periods of length $\tau$ each. The EVs arrive at the charging station according to a finite-state ergodic Markov chain $\{A[n]\}$. The EVs wait in a first-input–first-output queue before charging, as shown in Fig. 1. The charging station has $M$ charge points, i.e., at most $M$ EVs can be charged in each period. The charging station has renewable energy generation devices, and it can also purchase power from


the power grid. The renewable energy is modeled as another finite-state ergodic Markov process $\{E_a[n]\}$.² The renewable energy is viewed as free, and the price for the grid power during the $n$th period is denoted $P[n]$. The grid power price remains static during each period and changes between different periods. The sequence of the price $\{P[n]\}$ is a finite-state ergodic Markov chain. We assume that the energy that one charging point can charge during a period is a constant $E$ and referred to as "energy block."³ Assume that the required charging energies $E_c$ of different EVs are independent of each other, and that $E_c = LE$, with $L$ being a random integer, to denote the required energy block number of an EV.⁴ In the $n$th period, $K[n]$ EVs from the head of the EV queue are scheduled to charge. Meanwhile, the charging station allocates $W[n]$ power from the storage battery, and the rest of the power will be supplied by the power grid. The objective of the charging station is to find a sequence of charging EV number and renewable energy allocation that minimizes the mean EV queue length under an average cost constraint.

² Similarly as in [21] and [22], we use the Markov process to characterize the arrival of harvested renewable energy. Observe that the renewable energy can be generated from different sources (i.e., renewable energy generation devices or energy harvesting devices) in principle. We consider the renewable energy arrivals that may be from different sources as a whole in this paper (since they are all "free"), and we model the arrival of all of the renewable energy as a Markov chain.
³ It is assumed that, if an EV utilizes $m$ charge points simultaneously during a period, the amount of charged energy is $mE$.
⁴ When EVs arrive at a charging station, different EVs may require different amounts of energy to charge due to different remaining electricity energy and different travel plans. Consequently, we consider the required charging energy of each EV (i.e., $L$) to be random in this paper. This is a general and practical scenario. Observe that, for each EV, the required charging energy is determined by the driving plan of the EV. EVs may not charge completely. Charging completely or not, as well as how much to charge, is determined by each EV itself.

TABLE I. MAIN SYMBOLS UTILIZED IN THIS PAPER

Fig. 1. System model.

By denoting the number of EVs in the queue at the beginning of the $n$th period as $Q[n]$, we have $Q[n+1] = Q[n] - K[n] + A[n]$. Denote the capacity of the renewable energy storage battery as $E_{\max}$. The stored battery energy at the beginning of the $n$th period is $E_b[n]$. The battery energy evolution can be expressed as

$$E_b[n+1] = \min\{E_b[n] - W[n]\tau + E_a[n],\, E_{\max}\} := (E_b[n] - W[n]\tau + E_a[n])^{-}. \tag{1}$$

The cost at the $n$th period is given by

$$C[n] = \left( \frac{\left( R^{n-1}_{K[n-1]} - \gamma_{n-1} + \sum_{i=1}^{K[n]-1} R^{n}_{i} + \gamma_{n} \right) E}{\tau} - W[n] \right)^{+} P[n] \tag{2}$$

where $(\cdot)^{+} := \max\{\cdot, 0\}$, $R^{n}_{i}$ is the required energy block number of the $i$th EV among $K[n]$ EVs that are scheduled to charge in the $n$th period, and $\gamma_n \ge 1$ is the energy block number that has been charged for the last EV (i.e., the $K[n]$th EV)


in the $n$th period.⁵ Formally, we have the following stochastic optimization problem:

$$\min_{\{(K[n],\, W[n])\}_{n=1}^{\infty}} \ \limsup_{n \to \infty} \frac{1}{n}\, \mathbb{E}\left[ \sum_{i=0}^{n-1} Q[i] \right] \tag{3}$$

subject to

$$\limsup_{n \to \infty} \frac{1}{n}\, \mathbb{E}\left[ \sum_{i=0}^{n-1} C[i] \right] \le \bar{C} \tag{4a}$$
$$K[i] \le \min\{Q[i], M\} \tag{4b}$$
$$W[i] \le \frac{E_b[i]}{\tau} \tag{4c}$$

where $\mathbb{E}[\cdot]$ is the expectation operation, and $\bar{C}$ is the average cost constraint.⁶

⁵ The remaining required energy block number $(R^{n}_{K[n]} - \gamma_n)$ will be charged in the next period.
⁶ In this paper, we assume that the power from the power grid and the renewable energy generator is sufficient to stabilize the queue length. The stability issue, such as the bounds on the average generation rate of renewable energy or the average EV arrival rate, will be studied in future work.

III. EQUIVALENT TRANSFORMATION TO THE AVERAGE CHARGE DEMAND QUEUE MINIMIZATION PROBLEM

Direct analysis of the stochastic optimization problem in (3) is difficult due to the complexity of the one-step cost expression in (2). To overcome the difficulty, we first propose a method to map the EV queue to the charging demand queue. Correspondingly, the charging EV number becomes the charging demand number in each period. Next, we convert the average EV queue length minimization problem to the average charging demand queue minimization problem. Moreover, we prove that the conversion is equivalent.

The queue mapping method is shown in Fig. 2. Each EV in the EV queue corresponds to several consecutive charging demands (the number of the demands denotes the amount of required energy) in the charging demand queue. A demand means that $E$ energy (i.e., an energy block) needs to be charged. In Fig. 2, the first EV (EV 1) in the EV queue wants to charge $3 \times E$; then, it corresponds to the first three consecutive charging demands in the charging demand queue. The second EV (EV 2) charges $2 \times E$; then, it corresponds to the two consecutive charging demands after the first EV's corresponding charging demands.

Fig. 2. Queue mapping.

The demand arrival can be given by $B[n] = \sum_{i=1}^{A[n]} S^{n}_{i}$, where $S^{n}_{i}$ is the required charging demand number of the $i$th EV among the $A[n]$ EVs that arrive in the $n$th period. Denote the number of charging demands served in the $n$th period as $J[n]$. Let the length of the charging demand queue at the beginning of the $n$th period be $Q_e[n]$; then, the evolution of $Q_e[n]$ is

$$Q_e[n+1] = Q_e[n] - J[n] + B[n]. \tag{5}$$

The cost in the $n$th period can be reexpressed as

$$C[n] = \left( \frac{J[n]E}{\tau} - W[n] \right)^{+} P[n]. \tag{6}$$

The optimization problem of finding an optimal charging demand number and renewable energy allocation sequence to minimize the mean charging demand queue length under the long-term cost constraint can be expressed as

$$\min_{\{(J[n],\, W[n])\}_{n=1}^{\infty}} \ \limsup_{n \to \infty} \frac{1}{n}\, \mathbb{E}\left[ \sum_{i=0}^{n-1} Q_e[i] \right] \tag{7}$$

subject to

$$\limsup_{n \to \infty} \frac{1}{n}\, \mathbb{E}\left[ \sum_{i=0}^{n-1} C[i] \right] \le \bar{C} \tag{8a}$$
$$J[i] \le \min\{Q_e[i], M\} \tag{8b}$$
$$W[i] \le \frac{E_b[i]}{\tau}. \tag{8c}$$

The following lemma proves the equivalence between (7) and (3).
Lemma 1: Under the proposed queue mapping, the minimization of the mean charging demand queue length is equivalent to the minimization of the mean EV queue length. In this sense, (7) is equivalent to (3).
Proof: See Appendix A.
As (7) is equivalent to (3), we focus on the analysis of (7) in the following.

IV. ANALYSIS OF THE CHARGING DEMAND QUEUE MINIMIZATION PROBLEM

Here, we reformulate the charging demand queue minimization problem as a constrained MDP problem. After that, we perform the theoretical study on the optimal policy of the MDP problem. Specifically, we first prove that the constrained MDP can be analyzed through an unconstrained MDP (by using the Lagrangian dynamic programming approach). Then, we focus on the analysis of the unconstrained MDP. We analyze the unconstrained MDP by using its corresponding discount MDP. Next, we consider the dimension reduction of the 2-D policy. Finally, we propose two stationary deterministic policies based on the theoretical results.

⁷ The state includes the demand queue length, demand arrival, energy in the battery, renewable energy arrival, and grid power price. The action includes the charging demand number and the allocated renewable energy.

A. Reconstructed as a Constrained MDP

Let the system state and action be $X[n] = (Q_e[n], B[n], E_b[n], E_a[n], P[n])$ and $Y[n] = (J[n], W[n])$, respectively.⁷ $\{X[n], Y[n]\}$ is an MDP. Denote the state space as $\mathcal{X}$ and


denote the action space as $\mathcal{A}$. The feasible action $(j, w)$ in a state $x = (q_e, b, e_b, e_a, p) \in \mathcal{X}$ belongs to $\mathcal{A}(x) := \{0, 1, \ldots, \min\{q_e, M\}\} \times \{0, 1/\tau, \ldots, e_b/\tau\}$.⁸ Define a policy $\pi = (\pi_0, \pi_1, \cdots)$ with $\pi_n$ generating an action $y[n] = (j[n], w[n]) \in \mathcal{A}$ with a probability at the $n$th period [23], [25]. We denote the set of all policies as $\Pi$. When a policy $\pi = (\psi, \psi, \cdots)$ with $\psi$ being a measurable mapping from $\mathcal{X}$ to $\mathcal{A}$ such that $\psi(x) \in \mathcal{A}(x)$ for each $x \in \mathcal{X}$, it is referred to as a stationary deterministic policy.

⁸ $\times$ is the Cartesian product. The energy has been discretized.

The stochastic optimization problem (7) can be reexpressed as the following constrained MDP problem whose policy is 2-D with the charging demand number and allocated renewable energy as elements:

$$\min_{\pi \in \Pi} \ D^{\pi}_{x} := \limsup_{n \to \infty} \frac{1}{n}\, \mathbb{E}^{\pi}_{x}\left[ \sum_{i=0}^{n-1} Q_e[i] \right] \tag{9}$$
$$\text{s.t.} \quad S^{\pi}_{x} := \limsup_{n \to \infty} \frac{1}{n}\, \mathbb{E}^{\pi}_{x}\left[ \sum_{i=0}^{n-1} C[i] \right] \le \bar{C} \tag{10}$$

where $x = (q_e, b, e_b, e_a, p) \in \mathcal{X}$ is the initial system state. Given an initial system state $x$ and policy $\pi$, we have a stochastic state–action sequence $x, y[0], x[1], y[1], \ldots$; $\mathbb{E}^{\pi}_{x}[\cdot]$ means the expectation related to the stochastic state–action sequence [25]. Formally, $\mathbb{E}^{\pi}_{x}[\varphi(X[n], Y[n])] = \sum_{i \in \mathcal{X},\, a \in \mathcal{A}} \mathbb{P}^{\pi}_{x}[X[n] = i, Y[n] = a]\, \varphi(i, a)$, where $\varphi(\cdot)$ is a function, and $\mathbb{P}^{\pi}_{x}[X[n] = i, Y[n] = a]$ denotes the probability that, at the $n$th period, the state is $i$, and the action is $a$, given that policy $\pi$ is used and $x$ is the initial state.

B. Transformation to the Unconstrained MDP and Discount MDP

Define $f_\beta(x, j, w) := \beta (jE/\tau - w)^{+} p + q_e$ with $\beta > 0$. We have the following unconstrained MDP (i.e., $UP_\beta$):

$$\min_{\pi} \ H^{\pi}_{x}(\beta) := \limsup_{n \to \infty} \frac{1}{n}\, \mathbb{E}^{\pi}_{x}\left[ \sum_{i=0}^{n-1} f_\beta(X[i], J[i], W[i]) \right]. \tag{11}$$

Remark: $UP_\beta$ is an average cost MDP. Its optimal solution is referred to as the average cost optimal policy.
The following lemma reveals the relation between solutions of the constrained MDP and the unconstrained MDP.
Lemma 2: When there exists a $\beta_0 > 0$ where the optimal policy of $UP_{\beta_0}$ has a cost equal to $\bar{C}$, the optimal solution of the unconstrained MDP in (11) (i.e., $UP_{\beta_0}$) is also optimal for the constrained MDP in (9). Otherwise, there exist $\beta^{+}$ and $\beta^{-}$. The optimal policy $\mathcal{P}^{-}$ obtained for $UP_{\beta^{-}}$ has a cost slightly larger than $\bar{C}$. $\beta^{+} > \beta^{-}$ will instead lead to a less aggressive policy $\mathcal{P}^{+}$ with a cost slightly smaller than $\bar{C}$. The optimal policy for the constrained MDP (9) is as follows: At each decision epoch, choose $\mathcal{P}^{-}$ with a certain probability $q$ and $\mathcal{P}^{+}$ with probability $1 - q$, where $q$ depends on $\bar{C}$ and the cost of the two policies.
Proof: See Appendix B.
Remark: The solution of the constrained problem (9) can be obtained by solving the unconstrained $UP_\beta$ with one or two certain $\beta$.

Next, we define a discount cost MDP with discount factor $\alpha \in (0, 1)$ corresponding to $UP_\beta$ for each initial system state $x = (q_e, b, e_b, e_a, p)$, with the following value function:

$$V_\alpha(x) = \min_{\pi} \ \mathbb{E}^{\pi}_{x}\left[ \sum_{i=0}^{\infty} \alpha^{i} f_\beta(X[i], J[i], W[i]) \right]. \tag{12}$$

The optimal solution for the discounted problem is called a discount optimal policy.
The following lemma reveals the existence of the optimal stationary deterministic policy of $UP_\beta$ and, furthermore, how to derive the average cost optimal policy.
Lemma 3: There exists a stationary deterministic policy that solves $UP_\beta$, which can be obtained as a limit of discount optimal policies as $\alpha \to 1$.
Proof: See Appendix C.
Based on the given analysis, we find that the constrained MDP can be analyzed through the defined average cost MDP and its corresponding discount cost MDP thereafter. Hence, we first investigate the solution of the discount cost MDP in the following.

C. Discount Optimal Policy

For state–action pair $(x = (q_e, b, e_b, e_a, p), (j, w)) \in \mathcal{X} \times \mathcal{A}$, let $u = q_e - j$ and $\eta = e_b - w\tau$. Then, $(u(x), \eta(x))$ can also define a stationary deterministic policy. Then, the discounted cost optimality equation [24], [25] is given by

$$V_\alpha(q_e, b, e_b, e_a, p) = \min_{\substack{u \in \{q_e - \min\{q_e, M\}, \ldots, q_e\} \\ \eta \in \{0, 1, \ldots, e_b\}}} \left\{ \beta \left( \frac{(q_e - u)E}{\tau} - \frac{e_b - \eta}{\tau} \right)^{+} p + q_e + \alpha\, \mathbb{E}_{b, e_a, p}\left[ V_\alpha\big(u + B, B, (\eta + E_a)^{-}, E_a, P\big) \right] \right\} \tag{13}$$

and the corresponding value iteration algorithm (or successive approximation method) is

$$V_{\alpha, n}(q_e, b, e_b, e_a, p) = \min_{\substack{u \in \{q_e - \min\{q_e, M\}, \ldots, q_e\} \\ \eta \in \{0, 1, \ldots, e_b\}}} \left\{ \beta \left( \frac{(q_e - u)E}{\tau} - \frac{e_b - \eta}{\tau} \right)^{+} p + q_e + \alpha\, \mathbb{E}_{b, e_a, p}\left[ V_{\alpha, n-1}\big(u + B, B, (\eta + E_a)^{-}, E_a, P\big) \right] \right\} \tag{14}$$

with $V_{\alpha, 0}(q_e, b, e_b, e_a, p) = 0$.
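The successive approximation in (14) can be illustrated numerically. The following is a minimal sketch under simplifying assumptions that are not part of the paper: energy is counted in whole energy blocks with E = τ = 1, the demand arrivals, renewable arrivals, and prices are drawn i.i.d. from small made-up distributions (a special case of the Markov chains in the model, which lets the arrival components drop out of the state), and the queue is truncated at a hypothetical `Q_MAX` so that the value table is finite. All names and numbers (`value_iteration`, `B_DIST`, `BETA`, and so on) are illustrative.

```python
import itertools

# Hypothetical toy parameters (assumptions, not values from the paper).
M, E_MAX, Q_MAX = 2, 3, 6            # charge points, battery cap, queue truncation
BETA, ALPHA = 0.5, 0.9               # Lagrangian multiplier and discount factor
B_DIST = {0: 0.5, 1: 0.3, 2: 0.2}    # demand arrivals B (i.i.d. assumption)
EA_DIST = {0: 0.6, 1: 0.4}           # renewable arrivals E_a (i.i.d. assumption)
P_DIST = {1.0: 0.5, 3.0: 0.5}        # grid prices P (i.i.d. assumption)


def stage_cost(q, e, p, u, eta):
    """One-stage cost f_beta with E = tau = 1, where j = q - u and w = e - eta."""
    j, w = q - u, e - eta
    return BETA * max(j - w, 0) * p + q


def value_iteration(n_iter=100):
    """Iterate the recursion (14) on the reduced state (q, e, p), starting from V = 0."""
    states = list(itertools.product(range(Q_MAX + 1), range(E_MAX + 1), P_DIST))
    # Joint outcomes (b, e_a, next price) with their probabilities.
    outcomes = [
        (b, ea, p2, pb * pe * pp)
        for (b, pb), (ea, pe), (p2, pp) in itertools.product(
            B_DIST.items(), EA_DIST.items(), P_DIST.items()
        )
    ]
    v = {s: 0.0 for s in states}
    for _ in range(n_iter):
        new_v = {}
        for q, e, p in states:
            best = float("inf")
            for u in range(q - min(q, M), q + 1):      # remaining queue after charging
                for eta in range(e + 1):               # remaining battery energy
                    nxt = sum(
                        pr * v[(min(u + b, Q_MAX), min(eta + ea, E_MAX), p2)]
                        for b, ea, p2, pr in outcomes
                    )
                    best = min(best, stage_cost(q, e, p, u, eta) + ALPHA * nxt)
            new_v[(q, e, p)] = best
        v = new_v
    return v
```

In this toy setting the computed table can be checked against Properties 1 and 2 stated next in the text (monotonicity in the queue length and in the battery level); the sketch does not enforce the practical restriction (15), which the positive-part operator makes harmless for the cost.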


First, regarding $V_\alpha(q_e, b, e_b, e_a, p)$, we have the following properties:
Property 1: $V_\alpha(q_e, b, e_b, e_a, p)$ is an increasing function of $q_e$.
Proof: See Appendix D.
Property 2: $V_\alpha(q_e, b, e_b, e_a, p)$ is a nonincreasing function of $e_b$.
Proof: See Appendix E.
In practice, the allocated renewable energy will not surpass the required charging energy. Thus, $jE \ge w\tau$, i.e.,

$$\frac{(q_e - u)E}{\tau} - \frac{e_b - \eta}{\tau} \ge 0. \tag{15}$$

Property 3: $V_\alpha(q_e, b, e_b, e_a, p)$ is convex in $(q_e, e_b)$.
Proof: See Appendix F.
Next, the following two lemmas reveal two necessary conditions for the optimality, respectively.
Lemma 4: In state $x = (q_e, b, e_b, e_a, p)$, $(u(x), \eta(x))$ is not the discount optimal solution if $u(x) > q_e - \min\{q_e, M\}$ and $\eta(x) + e_a > E_{\max}$.
Remark: Lemma 4 reveals the sufficient condition for the nonoptimality, and it can be also viewed as the necessary condition for optimality. In other words, any optimal solutions should not satisfy the condition.
Lemma 5: Denote the discount optimal policy in state $x = (q_e, b, e_b, e_a, p)$ as $(u^{*}(x), \eta^{*}(x))$. Then, $(u^{*}(x), \eta^{*}(x))$ satisfies the following inequality array:⁹

$$Z_1(u^{*}, b, \eta^{*}, e_a, p) \le \beta \frac{E}{\tau} p \le Z_1(u^{*} + 1, b, \eta^{*}, e_a, p) \tag{16}$$
$$Z_2(u^{*}, b, \eta^{*}, e_a, p) \le \beta \frac{-p}{\tau} \le Z_2(u^{*}, b, \eta^{*} + 1, e_a, p) \tag{17}$$
$$Z_3(u^{*}, b, \eta^{*}, e_a, p) \le \beta \frac{p}{\tau}(E - 1) \le Z_3(u^{*} + 1, b, \eta^{*} + 1, e_a, p) \tag{18}$$

where $Z_1(u, b, \eta, e_a, p) = \alpha\, \mathbb{E}_{b, e_a, p}[G_1(u + B, B, (\eta + E_a)^{-}, E_a, P)]$ with $G_1(q_e, b, e_b, e_a, p) = V_\alpha(q_e, b, e_b, e_a, p) - V_\alpha(q_e - 1, b, e_b, e_a, p)$, $Z_2(u, b, \eta, e_a, p) = \alpha\, \mathbb{E}_{b, e_a, p}[V_\alpha(u + B, B, (\eta + E_a)^{-}, E_a, P) - V_\alpha(u + B, B, (\eta - 1 + E_a)^{-}, E_a, P)]$, and $Z_3(u, b, \eta, e_a, p) = \alpha\, \mathbb{E}_{b, e_a, p}[V_\alpha(u + B, B, (\eta + E_a)^{-}, E_a, P) - V_\alpha(u - 1 + B, B, (\eta - 1 + E_a)^{-}, E_a, P)]$.
Proof: See Appendix G.

⁹ Using Property 3, we can derive that $Z_1(u, b, \eta, e_a, p) \le Z_1(u + 1, b, \eta, e_a, p)$, $Z_2(u, b, \eta, e_a, p) \le Z_2(u, b, \eta + 1, e_a, p)$, and $Z_3(u, b, \eta, e_a, p) \le Z_3(u + 1, b, \eta + 1, e_a, p)$.

Remark: Lemma 5 gives the necessary condition of the discount optimality, i.e., the optimal policy (or policies) should be the solution(s) of the inequality array. In particular, if the inequality array has a single solution, that solution is the optimal policy, since an optimal policy exists.

D. Average Cost Optimal Policy

First, Lemma 4 still holds for the average cost MDP. Next, based on Lemmas 3 and 5, we have the following lemma.
Lemma 6: Given state $x = (q_e, b, e_b, e_a, p)$, the average cost optimal policy $(u^{*}(x), \eta^{*}(x))$ should satisfy the following inequality array:

$$\tilde{Z}_1(u^{*}, b, \eta^{*}, e_a, p) \le \beta \frac{E}{\tau} p \le \tilde{Z}_1(u^{*} + 1, b, \eta^{*}, e_a, p) \tag{19}$$
$$\tilde{Z}_2(u^{*}, b, \eta^{*}, e_a, p) \le \beta \frac{-p}{\tau} \le \tilde{Z}_2(u^{*}, b, \eta^{*} + 1, e_a, p) \tag{20}$$
$$\tilde{Z}_3(u^{*}, b, \eta^{*}, e_a, p) \le \beta \frac{p}{\tau}(E - 1) \le \tilde{Z}_3(u^{*} + 1, b, \eta^{*} + 1, e_a, p) \tag{21}$$

where $\tilde{Z}_1(u, b, \eta, e_a, p) = \lim_{\alpha \to 1} Z_1(u, b, \eta, e_a, p)$, $\tilde{Z}_2(u, b, \eta, e_a, p) = \lim_{\alpha \to 1} Z_2(u, b, \eta, e_a, p)$, and $\tilde{Z}_3(u, b, \eta, e_a, p) = \lim_{\alpha \to 1} Z_3(u, b, \eta, e_a, p)$.

E. Reducing the Policy's Dimension

The number of charging demands $j$ and the power allocation from battery $w$ are coupled together; they affect each other. However, if we assume that $j$ has been chosen, then the required total power has been fixed. In this case, to minimize the instant cost, we will allocate as much power as possible from the battery to meet the required total power, i.e., the greedy policy for the battery power allocation. This is because the power from the battery is free [see (2)]. We can have the conjecture that the greedy allocation strategy of battery power is the optimal policy. However, it is difficult to prove. The difficulty lies in the fact that the remaining battery energy will affect the future action and cost [e.g., (13)]. On the other hand, once $w$ has been fixed, the power allocation from the power grid can also affect $j$. In summary, when $j$ is chosen, the optimal $w^{*}$ is the greedy policy. By contrast, if $w$ is fixed and the optimal $j$ is not fixed, we need to solve the power allocation from the power grid to find the optimal $j^{*}$. Thus, we can reduce the policy from $(j, w)$ to $j$. We have the following conjecture.

¹⁰ The state includes the demand queue length, demand arrival, energy in the battery, renewable energy arrival, and grid power price. The action includes the charging demand only.

Conjecture 1: The greedy policy is the optimal battery power allocation policy of the 2-D constrained MDP in (9). Furthermore, view $(X[n], J[n])$ as an MDP with state $X[n]$ and action $J[n]$.¹⁰ The feasible action $j$ in state $x = (q_e, b, e_b, e_a, p)$ belongs to $\{0, 1, \ldots, q_e\}$. Define $\mathcal{P} = (\mathcal{P}[0], \mathcal{P}[1], \ldots)$ to be a policy such that $\mathcal{P}[n]$ generates an action $j[n]$ at $n\tau$; the optimal policy of the following MDP problem is the optimal charging demand policy of (9):

$$\min_{\mathcal{P}} \ \limsup_{n \to \infty} \frac{1}{n}\, \mathbb{E}^{\mathcal{P}}_{x}\left[ \sum_{i=0}^{n-1} Q_e[i] \right] \tag{22}$$
$$\text{s.t.} \quad \limsup_{n \to \infty} \frac{1}{n}\, \mathbb{E}^{\mathcal{P}}_{x}\left[ \sum_{i=0}^{n-1} \left( \frac{J[i]E}{\tau} - \frac{1}{\tau} \min\{J[i]E, E_b[i]\} \right)^{+} P[i] \right] \le \bar{C} \tag{23}$$


where the evolution of energy in the battery becomes

Eb [i + 1] = (Eb [i] − min {J[i]E, Eb [i]} + Ea [i])− . (24)
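The greedy allocation behind (23) and (24) can be sketched in a few lines. This is only an illustration under stated assumptions: energies are plain floats, `block` stands for the energy block E, `tau` for the period length, and the function names `greedy_battery_step` and `grid_cost` are hypothetical helpers rather than anything defined in the paper.

```python
def greedy_battery_step(e_b, j, e_a, e_max, block=1.0):
    """One step of (24): draw min(j*E, e_b) from the battery, add arrivals, cap at e_max.

    Returns the next battery level, the energy drawn from the battery,
    and the energy that must be bought from the grid.
    """
    demand = j * block
    drawn = min(demand, e_b)              # greedy: use as much free energy as possible
    grid = demand - drawn                 # remainder purchased from the power grid
    e_next = min(e_b - drawn + e_a, e_max)
    return e_next, drawn, grid


def grid_cost(e_b, j, price, block=1.0, tau=1.0):
    """Per-period cost inside (23): (J*E/tau - min(J*E, e_b)/tau)^+ * price."""
    demand = j * block
    return max(demand / tau - min(demand, e_b) / tau, 0.0) * price
```

For example, with 1 unit stored, 3 demands of one block each, and 4 units of renewable arrivals into a battery of capacity 3, the step draws 1 unit from the battery, buys 2 units from the grid, and ends the period with a full battery.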

Remark: The policy can be reduced in dimension ((j, w) →


j). If the stated β in Lemma 2 satisfying β  1, Conjecture 1
can be proved based on (13) in addition with Lemmas 3 and 2.
In the following, we discuss the optimal policy after dimension reduction. For the state–action pair (x = (qe, b, eb, ea, p), j), let u = qe − j; then u(x) also defines a stationary deterministic policy. The following lemmas reveal properties of the optimal policy.
Lemma 7: Denote the discount optimal policy in state x = (qe, b, eb, ea, p) as u∗(x). Then, u∗(x) satisfies Z(u∗) ≤ β(E/τ)p ≤ Z(u∗ + 1), where Z(u) = αE_{b,ea,p}[Vα(u + B, B, (η(u) + Ea)^−, Ea, P) − Vα(u − 1 + B, B, (η(u − 1) + Ea)^−, Ea, P)] + β((η(u) − η(u − 1))/τ)p with η(u) := max{0, eb − (qe − u)E}. Furthermore, the average cost optimal policy u∗ satisfies Z̃(u∗) ≤ β(E/τ)p ≤ Z̃(u∗ + 1) with Z̃(u) = lim_{α→1} Z(u).
Proof: See Appendix H. 
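As a small illustration of the quantity η(u) used in Lemma 7, the following Python sketch evaluates the battery energy left after greedily serving j = qe − u demands, and the non-expectation part of Z(u). The function names are hypothetical, and the value-function difference inside Z(u) is deliberately left out because it depends on Vα.

```python
def remaining_battery(u, q_e, e_b, E):
    """eta(u) = max{0, e_b - (q_e - u) * E}."""
    return max(0, e_b - (q_e - u) * E)


def battery_term(u, q_e, e_b, E, tau, p, beta):
    """beta * ((eta(u) - eta(u - 1)) / tau) * p, the second term of Z(u)."""
    diff = remaining_battery(u, q_e, e_b, E) - remaining_battery(u - 1, q_e, e_b, E)
    return beta * diff / tau * p
```

With qe = 5, eb = 25, and E = 10, η(u) is 0 until the served demand drops below the stored energy and then grows by at most E per extra waiting demand.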
Lemma 8: For x = (qe, b, eb, ea, p) satisfying Z(qe − min{qe, M}) > β(E/τ)p, u = qe − min{qe, M} is the discount optimal policy. In addition, for (qe, b, eb, ea, p) satisfying Z(qe) < β(E/τ)p, u = qe is the discount optimal policy.
Proof: See Appendix I. 
Remark: u = qe − min{qe , M }, i.e., j = min{qe , M }
means charging as many demands as possible. If the number of
demands in the queue is less than the charge point number M ,
charge all the demands. Otherwise, charge M demands from
the head of the queue. u = qe , i.e., j = 0 denotes charging no
demand.
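The threshold structure in Lemmas 7 and 8 can be sketched as a simple search over the feasible values of u. This is only an illustration: `Z` is a hypothetical callable supplied by the user (the paper defines Z through Vα, which is not computed here), it is assumed to be nondecreasing in u, and Lemma 7 is a necessary condition, so the sketch returns the first u that meets it.

```python
def select_u(Z, q_e, M, beta, E, tau, p):
    """Pick u (demands left waiting) from the threshold conditions.

    Z is a user-supplied, nondecreasing function of u (an assumption).
    """
    theta = beta * E / tau * p       # threshold beta * (E / tau) * p
    u_min = q_e - min(q_e, M)        # charge as many demands as possible
    if u_min == q_e:                 # empty queue: nothing to decide
        return q_e
    if Z(u_min) > theta:             # Lemma 8, first half
        return u_min
    if Z(q_e) < theta:               # Lemma 8, second half: charge nothing
        return q_e
    for u in range(u_min, q_e):      # Lemma 7: Z(u) <= theta <= Z(u + 1)
        if Z(u) <= theta <= Z(u + 1):
            return u
    return None                      # no u satisfies the condition
```

The number of demands to charge is then j = qe − u.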
Based on Lemma 8 and Lemma 3, we have the following.
Lemma 9: For x = (qe, b, eb, ea, p) satisfying Z̃(qe − min{qe, M}) > β(E/τ)p, u = qe − min{qe, M} is the average cost optimal policy. In addition, for (qe, b, eb, ea, p) satisfying Z̃(qe) < β(E/τ)p, u = qe is the average cost optimal policy.

Fig. 3. Average cost performance versus Ā under different values of Emax. (a) M = 50. (b) M = 8.

F. Two Stationary Deterministic Policies
This paper has derived the structural properties of the optimal policy. In particular, we have proven that the optimal policy exists and is stationary deterministic. We have also proven that the greedy policy may be the optimal battery power allocation policy (i.e., Conjecture 1). Based on these results, we propose the following two specific stationary deterministic policies.
For state x = (qe, b, eb, ea, p), we define the radical policy as

$$(j(x), w(x)) = \left(\min\{q_e, M\},\; \frac{\min\{e_b, jE\}}{\tau}\right).$$

That is to say, we charge as many demands as possible and use the greedy policy for the battery energy allocation, i.e., if the required energy is not greater than the battery energy, then all the energy will be supplied from the storage battery, and no grid power will be used. Otherwise, all the storage battery energy is allocated, and the rest will be supplied from the power grid.
Remark: The radical policy is the optimal policy to minimize the waiting time of EVs when no average cost is considered or the constraint is large enough. Moreover, we find that, given an average cost constraint, when the mean EV arrival, mean renewable energy arrival, and mean grid power price satisfy a certain condition such that the average cost of the radical policy satisfies the constraint, the radical policy is the optimal policy even when taking the average cost constraint into account. (See the analysis of Figs. 3 and 4 as concrete examples.)
In the radical policy, the average cost constraint is not considered. Therefore, we propose another policy that guarantees the average cost constraint by satisfying the cost constraint in each period. We call the following policy the conservative policy:

$$(j(x), w(x)) = \left(\min\left\{q_e, M, \left\lfloor \frac{e_b + \frac{\bar{C}}{p}\tau}{E} \right\rfloor\right\},\; \frac{\min\{e_b, jE\}}{\tau}\right).$$

That is to say, we first guarantee that the cost of charging in each period does not exceed the average cost constraint, then charge as many demands as possible, and utilize the greedy policy for the battery energy allocation.
The focus of this paper is on the formulation of the analytical MDP framework and the structural properties, and the proposed


policies here might not be optimal in general. However, the simulation results that follow demonstrate the effectiveness of the proposed schemes.
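The radical and conservative policies described above can be written as two short Python functions. This is a sketch under stated assumptions: the function and argument names are hypothetical, the grid price p is taken to be strictly positive, and the affordable number of demands in the conservative policy is rounded down to an integer, which is an interpretation of the formula rather than something the text states explicitly.

```python
import math


def radical_policy(q_e, e_b, M, E, tau):
    """Charge as many demands as possible; allocate battery energy greedily."""
    j = min(q_e, M)
    w = min(e_b, j * E) / tau            # battery power used this period
    return j, w


def conservative_policy(q_e, e_b, p, M, E, tau, c_bar):
    """Keep the per-period grid cost within c_bar, then charge greedily."""
    # Largest j with (j * E - e_b) / tau * p <= c_bar (assumes p > 0).
    affordable = math.floor((e_b + c_bar / p * tau) / E)
    j = min(q_e, M, affordable)
    w = min(e_b, j * E) / tau
    return j, w
```

With qe = 12, eb = 30, M = 8, E = 10, τ = 1, and p = 20, the radical policy charges 8 demands, whereas the conservative policy with a cost bound of 100 charges only 3, which the battery can cover without grid power.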
Remark: For MDPs with constraints (especially the general
state space), no satisfactory algorithm is known to find the
optimal policy, even in the class of stationary policies [25].
Unfortunately, (9) belongs to this category. Additionally, the
coupling between the two elements of the policy produces extra
challenges. In summary, the optimal policy of (9) is difficult, if at all possible, to find, both mathematically and in engineering practice. Deriving the optimal policy is left for future work.
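Although the constrained problem is hard, the unconstrained discounted recursion used in the appendices (the form that appears in (26)) can be explored numerically on a very small instance. The Python sketch below is an illustration only, not the authors' method: it assumes i.i.d. arrivals, renewable energy, and prices, truncates the queue at `q_max`, caps the battery at `e_max`, fixes the greedy battery allocation (as in Conjecture 1), and uses made-up toy parameters.

```python
def value_iteration(n_iter=30, alpha=0.9, beta=1.0, E=1, tau=1.0, M=2,
                    q_max=6, e_max=3,
                    arrivals=((0, 0.5), (1, 0.5)),
                    renewables=((0, 0.3), (1, 0.4), (2, 0.3)),
                    prices=((1.0, 0.5), (3.0, 0.5))):
    """Finite-horizon discounted values V_{alpha,n}(q, e, p), starting from V_0 = 0."""
    states = [(q, e) for q in range(q_max + 1) for e in range(e_max + 1)]
    V = {(q, e, p): 0.0 for (q, e) in states for p, _ in prices}
    for _ in range(n_iter):
        # Expected value over the next (i.i.d.) price.
        W = {(q, e): sum(pr * V[(q, e, p)] for p, pr in prices)
             for (q, e) in states}
        new_V = {}
        for (q, e) in states:
            for p, _ in prices:
                best = float("inf")
                for j in range(min(q, M) + 1):     # demands charged now
                    u = q - j                      # demands left waiting
                    used = min(e, j * E)           # greedy battery use
                    eta = e - used                 # battery energy left
                    stage = beta * (j * E - used) / tau * p + q
                    future = sum(
                        pb * pe * W[(min(u + b, q_max), min(eta + ea, e_max))]
                        for b, pb in arrivals for ea, pe in renewables)
                    best = min(best, stage + alpha * future)
                new_V[(q, e, p)] = best
        V = new_V
    return V
```

On such a toy instance, the computed values can be checked against Properties 1 and 2, i.e., they should not decrease in the queue length and should not increase in the battery energy.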

V. NUMERICAL RESULTS
Here, we perform simulations to demonstrate the relations among the mean EV arrival, the mean renewable energy arrival, the upper bound of the average cost, the average cost, and the average EV queue length. Meanwhile, we consider different charge point numbers and capacities of the renewable energy storage battery. The units for energy, power, time, price, and cost are kilowatthours, kilowatts, hours, dimes per kilowatt, and dimes ($0.1), respectively. We omit the units in the following for brevity. In the simulations, the period length is τ = 1 and L = 1 (see footnote 11), and the size of the “energy block” is E = 10.

11 EVs charge the same amounts of energy E (e.g., an EV production company). In this case, we can use “EV” and “demand” interchangeably.

Fig. 3 shows the average cost performance with respect to the mean EV arrival Ā. In the simulations, we utilize the radical policy. We consider the i.i.d. cases of A, Ea, and P. A takes 0 and 2Ā with equal probability of 0.5. Ea takes values {0, 50, 100} with probabilities {0.1, 0.4, 0.5}. P takes values {5, 10, 20} with probabilities {0.2, 0.3, 0.5}. The performance is averaged over 10^5 periods. We set the number of charge points M = 50 and M = 8 in Fig. 3(a) and (b), respectively. Furthermore, we plot the curves for different storage battery capacities: Emax = 100, Emax = 300, and infinite capacity.
In Fig. 3(a), we can see that when Ā is small, the cost is nearly zero. However, when Ā is large (e.g., Ā ≥ 10), the cost increases rapidly and roughly linearly with Ā. This is because, when Ā is small, the required energy is small, and the battery can supply the energy. Thus, no grid power will be consumed, and the cost is zero. Once Ā is larger than a certain value, the required energy is larger than the battery energy, and the grid power will be utilized. As M is large (compared with the considered Ā), i.e., the restriction on the number of charge points will not influence the performance, we have k = min{q, M} = q with a high probability. The grid power consumption will increase with an increase in Ā. Moreover, when Ā is large, the grid power becomes the main energy source. Based on (2), we derive that the cost varies with Ā roughly according to a linear relation.
In Fig. 3(b), we can find that the average cost is zero when Ā is small, and with an increase in Ā, the average cost increases. However, once Ā is larger than a certain value, the average cost remains constant. It can be explained as follows: When Ā is small, the required energy can be supplied by the battery with a very high probability, and no grid power is needed. Then, the average cost is zero. When Ā increases, the required energy increases. Once the battery energy is not enough, the grid power will be consumed to fill the gap between the required energy and battery energy. With the increase in Ā, the grid power consumption increases since the average battery energy is constant. Thus, the average cost increases. However, when Ā is large enough, we get k = min{q, M} = M with a high probability because M is not large in this simulation. Then, the required energy is k × E = M × E, i.e., it becomes a constant. That means the grid power consumption is also a constant. Thus, the cost remains static.

Fig. 4. Average cost under different Ēa and Emax. (a) M = 50. (b) M = 8.

Fig. 4 shows the average cost performance with respect to the mean renewable energy arrival Ēa. The radical policy is applied in the simulations. A takes values 0 and 10 with equal probability of 0.5. Ea takes values {0, (5/7)Ēa, (10/7)Ēa} with probabilities {0.1, 0.4, 0.5}, respectively. P is the same as in Fig. 3. Emax = 100, Emax = 300, and infinite capacity are also considered in the simulations. From the figure, we can find that the cost decreases with an increase in Ēa. However, once Ēa is large enough, the cost almost remains static. First, in the range of small Ēa, when Ēa increases, more free renewable energy will arrive and be stored in the battery.


Then, the cost will decrease. If the battery capacity is large enough, all the arrived renewable energy can be stored in the battery. With the increase in Ēa, the battery energy will increase all the time. Once the battery energy is larger than the required energy for charging, no grid power is needed, and the cost becomes zero from then on. If the battery capacity is not large (e.g., Emax = 100 in the figure), overflow occurs when Ēa is large. That is to say, the battery energy will remain Emax, although we increase Ēa. On the other hand, Emax is smaller than the required charge energy; therefore, grid power is still needed. Consequently, the cost is nonzero and remains static.
In Figs. 3 and 4, we can observe that the larger the battery capacity, the lower the cost. That is because, when Emax is larger, the probability of overflow will be lower (it is zero for infinite capacity). Then, less free renewable energy is wasted, and the cost will be lower. Furthermore, we can derive that, if Ā is less than a certain value or Ēa is larger than a certain value, the average cost can be less than a certain value. Then, we claim that, when Ā is less than a certain value or Ēa is larger than a certain value, the radical policy is also optimal, even when considering the constraint (see footnote 12).

12 Notice that the radical policy is optimal for the mean EV queue delay minimization without the average cost constraint.

Fig. 5. Average EV length performance versus C̄.

Fig. 5 shows the average EV queue length performance with respect to the upper bounds of the average cost when the conservative policy is applied. In the simulations, A chooses values {0, 12} with an equal probability of 0.5. Ea and P have the same settings as in Fig. 3. In the plotting, we consider different values of the battery capacity and charge point number. We can observe that the average length performance improves with an increase in C̄, and when C̄ is larger than a certain value, the average length performance becomes almost constant. The reason is as follows: When C̄ is small

$$k = \min\left\{q, M, \left\lfloor \frac{e_b + \frac{\bar{C}}{p}\tau}{E} \right\rfloor\right\} = \min\left\{q, \left\lfloor \frac{e_b + \frac{\bar{C}}{p}\tau}{E} \right\rfloor\right\} \tag{25}$$

with a high probability, and it increases with an increase in C̄. Thus, the average EV queue length performance improves. Once C̄ is large enough, we get k = min{q, M}, and the average length remains static with respect to C̄. Additionally, by comparing the four curves, we can derive that the larger the capacity or the charge point number, the better the length performance.

Fig. 6. Average EV length performance versus M.

Fig. 6 plots the average EV queue length performance with respect to the charge point number (M) under the conservative policy. The settings of A, Ea, and P are the same as those in Fig. 5. Different values of the upper bound of the average cost and the battery capacity are considered in the simulations. We can find that the EV queue length decreases with the increase in M. However, once M is larger than a certain value, the average EV queue length almost remains static with the increase in M. It can be explained as follows. When M is small

$$k = \min\left\{q, M, \left\lfloor \frac{e_b + \frac{\bar{C}}{p}\tau}{E} \right\rfloor\right\} = M$$

with a high probability, and it increases with the increase in M. Then, the number of remaining EVs at the queue u = q − k decreases, and the average EV queue length decreases. When M is large enough, (11) occurs with a high probability, and k is constant with respect to M. Thus, the EV queue length remains static. In addition, by comparing different curves, we can see that a larger upper bound or battery capacity leads to a better average EV queue length performance.

VI. CONCLUSION
We consider the scheduling of the EVs’ charging at a charging station whose energy is provided from both power grid and local renewable energy. Under the uncertainty of the EV arrival, the renewable energy, the grid power price, and the charging energy of each EV, we study the mean delay optimal scheduling with the average cost constraint. We analyze the optimal policy of the formulated MDP problem. In addition, two specific stationary policies (radical and conservative) are applied in the simulations to reveal the impacts of relevant parameters on performance.
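The simulation setup described for Fig. 3 in Section V can be approximated with a short Monte Carlo sketch in Python. It is not the authors' simulator and is not claimed to reproduce the figures: the function name, the empty initial queue and battery, and the order of events within a period (serve first, then admit arrivals and renewable energy) are assumptions, while the distributions of A, Ea, and P, τ = 1, and E = 10 follow the stated settings.

```python
import random


def simulate_radical(mean_arrival, M, e_max, periods=100_000,
                     E=10, tau=1.0, seed=0):
    """Average cost and average queue length under the radical policy."""
    rng = random.Random(seed)
    q, e_b = 0, 0.0                      # assumed initial queue and battery
    total_cost, total_queue = 0.0, 0
    for _ in range(periods):
        a = rng.choice([0, 2 * mean_arrival])                     # EV arrivals
        e_a = rng.choices([0, 50, 100], weights=[0.1, 0.4, 0.5])[0]
        p = rng.choices([5, 10, 20], weights=[0.2, 0.3, 0.5])[0]
        total_queue += q
        j = min(q, M)                    # radical policy: charge all we can
        from_battery = min(e_b, j * E)   # greedy battery allocation
        total_cost += (j * E - from_battery) / tau * p
        e_b = min(e_b - from_battery + e_a, e_max)   # overflow is discarded
        q = q - j + a
    return total_cost / periods, total_queue / periods
```

Passing `float("inf")` as `e_max` corresponds to the infinite-capacity case; for example, `simulate_radical(5, 50, 100)` returns the simulated average cost and queue length for Ā = 5, M = 50, and Emax = 100.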


APPENDIX A
PROOF OF LEMMA 1
First, the energy demand queue length and the EV queue length have the following relation: $Q_e[n] = \sum_{i=1}^{Q[n]} T_i^n$, with $T_i^n$ being the charging demand number of the ith EV in the EV queue during the nth period. Thus, the average energy demand queue length is $(1/n)\sum_{j=1}^{n} Q_e[j] = (1/n)\sum_{j=1}^{n}\sum_{i=1}^{Q[j]} T_i^j$. Meanwhile, if an EV comes earlier than another EV, it will leave earlier in the EV queue serving. Using the queue mapping mechanism, if an EV arrives earlier, its charging demand will be fulfilled no later (accomplishing this at the same time is possible for two consecutive EVs). That is to say, the queue mapping is an isotonic mapping. Then, we claim that a policy minimizing the mean EV queue length results in minimal mean demand queue length and vice versa.

APPENDIX B
PROOF OF LEMMA 2
The proof is based on the results of [26] and [27]. First, we prove that, if for some β0 > 0, the optimal policy π∗ of the UPβ|β=β0 satisfies the following: 1) π∗ yields $S^{\pi^*}$ and $D^{\pi^*}$ as limits for all x ∈ X; and 2) $S^{\pi^*} = \bar{C}$, then π∗ is optimal for the constrained MDP (9) [27]. The proof is similar to that of [26, Th. 4.3]. Second, if no such β0 exists, the optimal policy of the constrained MDP (9) can be obtained by solving UPβ with two different values of β (i.e., β+ and β−). The proof is similar to that of [26, Th. 4.4].

APPENDIX C
PROOF OF LEMMA 3
First, we derive that the conditions of [28, Prop. 2.1] are satisfied. Then, a discount optimal stationary policy exists. Second, we prove that, for some x0, Vα(x) − Vα(x0) < ∞. Third, there exists a policy π ∈ A and an initial state x ∈ X such that $H_x^{\pi}(\beta) < \infty$ in the practical problem. Otherwise, the cost is infinite for all policies, and any policy is optimal. Accordingly, we can prove the lemma by applying [28, Th. 3.8].

APPENDIX D
PROOF OF PROPERTY 1
We verify the increasing property by induction. According to (14), Vα,0 = 0 and

$$V_{\alpha,1} = \frac{\beta\left((q_e - \min\{q_e, M\})E - e_b\right)^{+} p}{\tau} + q_e.$$

The increasing property in qe holds. Assume Vα,n−1(qe, b, eb, ea, p) is increasing in qe. Depending on the values of M, we have the following two cases.

Case 1: M ≥ qe + 1.
Fix (b, eb, ea, p). In the state (qe + 1, b, eb, ea, p), the set of feasible u is {0, 1, ..., qe + 1}, whereas it is {0, 1, ..., qe} for state (qe, b, eb, ea, p). Consider state (qe + 1, b, eb, ea, p); let the optimal action be (u∗, η∗) with u∗ ∈ {0, 1, ..., qe}. Hence

$$V_{\alpha,n}(q_e+1, b, e_b, e_a, p) = \beta\left(\frac{(q_e+1-u^*)E}{\tau} - \frac{e_b-\eta^*}{\tau}\right)^{+} p + (q_e+1) + \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha,n-1}\left(u^*+B, B, (\eta^*+E_a)^{-}, E_a, P\right)\right]. \tag{26}$$

As (u∗, η∗) is feasible in state (qe, b, eb, ea, p)

$$V_{\alpha,n}(q_e, b, e_b, e_a, p) \le \beta\left(\frac{(q_e-u^*)E}{\tau} - \frac{e_b-\eta^*}{\tau}\right)^{+} p + q_e + \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha,n-1}\left(u^*+B, B, (\eta^*+E_a)^{-}, E_a, P\right)\right] \le V_{\alpha,n}(q_e+1, b, e_b, e_a, p). \tag{27}$$

If the optimal action (u∗, η∗) has u∗ = qe + 1

$$V_{\alpha,n}(q_e+1, b, e_b, e_a, p) = q_e + 1 + \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha,n-1}\left(q_e+1+B, B, (\eta^*+E_a)^{-}, E_a, P\right)\right]. \tag{28}$$

Meanwhile, since (qe, η∗) is feasible in state (qe, b, eb, ea, p)

$$V_{\alpha,n}(q_e, b, e_b, e_a, p) \le q_e + \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha,n-1}\left(q_e+B, B, (\eta^*+E_a)^{-}, E_a, P\right)\right] \overset{(a)}{\le} V_{\alpha,n}(q_e+1, b, e_b, e_a, p) \tag{29}$$

where (a) holds because of the induction hypothesis.

Case 2: M ≤ qe.
The set of feasible u is {0, 1, ..., M} in both states (qe + 1, b, eb, ea, p) and (qe, b, eb, ea, p). Then, we can prove the increasing property of Vα,n(qe, b, eb, ea, p) by using (26) and (27).

APPENDIX E
PROOF OF PROPERTY 2
Based on (14), the property can be proven through induction. First, we have Vα,0 = 0 and Vα,1 = (β((qe − min{qe, M})E − eb)^+ p/τ) + qe. Thus, the nonincreasing property in eb holds for n = 0, 1. Next, assume that Vα,n−1(qe, b, eb, ea, p) is a nonincreasing function of eb. Fix (qe, b, ea, p) for state (qe, b, eb, ea, p), and let (u∗, η∗) be the optimal policy. We get

$$V_{\alpha,n}(q_e, b, e_b, e_a, p) = \beta\left(\frac{(q_e-u^*)E}{\tau} - \frac{e_b-\eta^*}{\tau}\right)^{+} p + q_e + \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha,n-1}\left(u^*+B, B, (\eta^*+E_a)^{-}, E_a, P\right)\right]. \tag{30}$$


Since (u∗, η∗) is feasible in state (qe, b, eb + 1, ea, p), we derive

$$V_{\alpha,n}(q_e, b, e_b+1, e_a, p) \le \beta\left(\frac{(q_e-u^*)E}{\tau} - \frac{e_b+1-\eta^*}{\tau}\right)^{+} p + q_e + \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha,n-1}\left(u^*+B, B, (\eta^*+E_a)^{-}, E_a, P\right)\right]. \tag{31}$$

By combining (30) and (31), we get Vα,n(qe, b, eb, ea, p) ≥ Vα,n(qe, b, eb + 1, ea, p). Then, we complete the proof of the property.

APPENDIX F
PROOF OF PROPERTY 3
First, we prove the following proposition.
Proposition 1: For φ ∈ (0, 1) and ∀x1, x2, y, we have φ min{x1, y} + (1 − φ) min{x2, y} ≤ min{φx1 + (1 − φ)x2, y}.
Proof: The proposition can be verified by considering min{x1, x2} > y, max{x1, x2} < y, and min{x1, x2} ≤ y ≤ max{x1, x2}, respectively.
The convexity is proven by induction. For n = 0, Vα,0 = 0 and is convex. Assume Vα,n−1(qe, b, eb, ea, p) is convex in (qe, eb). Fix (b, ea, p), and let (u1, η1) and (u2, η2) be the optimal policy for (qe1, eb1) and (qe2, eb2), respectively. Then, we get

$$\begin{aligned}
&\varphi V_{\alpha,n}(q_{e1}, b, e_{b1}, e_a, p) + (1-\varphi) V_{\alpha,n}(q_{e2}, b, e_{b2}, e_a, p) \\
&\quad= \varphi\left[\beta\left(\frac{(q_{e1}-u_1)E}{\tau} - \frac{e_{b1}-\eta_1}{\tau}\right)p + q_{e1}\right] + (1-\varphi)\left[\beta\left(\frac{(q_{e2}-u_2)E}{\tau} - \frac{e_{b2}-\eta_2}{\tau}\right)p + q_{e2}\right] \\
&\qquad+ \alpha\mathbb{E}_{b,e_a,p}\left[\varphi V_{\alpha,n-1}\left(u_1+B, B, (\eta_1+E_a)^{-}, E_a, P\right) + (1-\varphi)V_{\alpha,n-1}\left(u_2+B, B, (\eta_2+E_a)^{-}, E_a, P\right)\right] \\
&\quad\overset{(b)}{\ge} \beta\left[\left(\varphi(q_{e1}-u_1) + (1-\varphi)(q_{e2}-u_2)\right)E - \left(\varphi(e_{b1}-\eta_1) + (1-\varphi)(e_{b2}-\eta_2)\right)\right]\frac{p}{\tau} + \left[\varphi q_{e1} + (1-\varphi)q_{e2}\right] \\
&\qquad+ \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha,n-1}\left(\varphi u_1 + (1-\varphi)u_2 + B, B, \varphi(\eta_1+E_a)^{-} + (1-\varphi)(\eta_2+E_a)^{-}, E_a, P\right)\right] \\
&\quad\overset{(c)}{\ge} \beta\left[\left(\varphi(q_{e1}-u_1) + (1-\varphi)(q_{e2}-u_2)\right)E - \left(\varphi(e_{b1}-\eta_1) + (1-\varphi)(e_{b2}-\eta_2)\right)\right]\frac{p}{\tau} + \left[\varphi q_{e1} + (1-\varphi)q_{e2}\right] \\
&\qquad+ \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha,n-1}\left(\varphi u_1 + (1-\varphi)u_2 + B, B, (\varphi\eta_1 + (1-\varphi)\eta_2 + E_a)^{-}, E_a, P\right)\right] \\
&\quad\overset{(d)}{\ge} V_{\alpha,n}\left(\varphi q_{e1} + (1-\varphi)q_{e2}, b, \varphi e_{b1} + (1-\varphi)e_{b2}, e_a, p\right)
\end{aligned} \tag{32}$$

where (b) holds because of the convexity of Vα,n−1(qe, b, eb, ea, p), (c) holds because of Proposition 1 as well as Property 2, and (d) holds since (φu1 + (1 − φ)u2, φη1 + (1 − φ)η2) is feasible for φ(qe1, b, eb1, ea, p) + (1 − φ)(qe2, b, eb2, ea, p).

APPENDIX G
PROOF OF LEMMA 5
Let

$$S(u, \eta) = \beta\left(\frac{(q_e-u)E}{\tau} - \frac{e_b-\eta}{\tau}\right)p + q_e + \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha}\left(u+B, B, (\eta+E_a)^{-}, E_a, P\right)\right]. \tag{33}$$

First, we have

$$S(u+1, \eta) - S(u, \eta) = -\beta\frac{E}{\tau}p + \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha}\left(u+1+B, B, (\eta+E_a)^{-}, E_a, P\right) - V_{\alpha}\left(u+B, B, (\eta+E_a)^{-}, E_a, P\right)\right] \tag{34}$$

$$S(u-1, \eta) - S(u, \eta) = \beta\frac{E}{\tau}p + \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha}\left(u-1+B, B, (\eta+E_a)^{-}, E_a, P\right) - V_{\alpha}\left(u+B, B, (\eta+E_a)^{-}, E_a, P\right)\right]. \tag{35}$$

Then, applying S(u∗ + 1, η∗) − S(u∗, η∗) ≥ 0 and S(u∗ − 1, η∗) − S(u∗, η∗) ≥ 0, we obtain (16). Similarly, as

$$S(u, \eta+1) - S(u, \eta) = \beta\frac{p}{\tau} + \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha}\left(u+B, B, (\eta+1+E_a)^{-}, E_a, P\right) - V_{\alpha}\left(u+B, B, (\eta+E_a)^{-}, E_a, P\right)\right] \tag{36}$$

$$S(u, \eta-1) - S(u, \eta) = \beta\frac{-p}{\tau} + \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha}\left(u+B, B, (\eta-1+E_a)^{-}, E_a, P\right) - V_{\alpha}\left(u+B, B, (\eta+E_a)^{-}, E_a, P\right)\right] \tag{37}$$

we can reach (17) from S(u∗, η∗ + 1) − S(u∗, η∗) ≥ 0 and S(u∗, η∗ − 1) − S(u∗, η∗) ≥ 0. In addition

$$S(u+1, \eta+1) - S(u, \eta) = \beta\frac{p}{\tau}(1-E) + \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha}\left(u+1+B, B, (\eta+1+E_a)^{-}, E_a, P\right) - V_{\alpha}\left(u+B, B, (\eta+E_a)^{-}, E_a, P\right)\right] \tag{38}$$


$$S(u-1, \eta-1) - S(u, \eta) = \beta\frac{p}{\tau}(E-1) + \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha}\left(u-1+B, B, (\eta-1+E_a)^{-}, E_a, P\right) - V_{\alpha}\left(u+B, B, (\eta+E_a)^{-}, E_a, P\right)\right]. \tag{39}$$

Then, (18) can be obtained by applying S(u∗ − 1, η∗ − 1) − S(u∗, η∗) ≥ 0 and S(u∗ + 1, η∗ + 1) − S(u∗, η∗) ≥ 0.

APPENDIX H
PROOF OF LEMMA 7
First, based on Conjecture 1, we only need to consider the policy set Ψ = {(u, η) : (u, η = η(u))} ∩ {(u, η) : (u, η) ≥ (0, 0)}. Consequently

$$S(u, \eta(u)) = \beta\left(\frac{(q_e-u)E}{\tau} - \frac{e_b-\eta(u)}{\tau}\right)p + q_e + \alpha\mathbb{E}_{b,e_a,p}\left[V_{\alpha}\left(u+B, B, (\eta(u)+E_a)^{-}, E_a, P\right)\right]. \tag{40}$$

Then, applying S(u∗ + 1, η(u∗ + 1)) − S(u∗, η(u∗)) ≥ 0 and S(u∗ − 1, η(u∗ − 1)) − S(u∗, η(u∗)) ≥ 0, we get Z(u∗) ≤ β(E/τ)p ≤ Z(u∗ + 1). Finally, using Lemma 3, we reach the second half of the lemma.

APPENDIX I
PROOF OF LEMMA 8
Following the proof of Lemma 7, we can prove the first half of the lemma by contradiction. Specifically, suppose u = qe − min{qe, M} is not the optimal solution; then S(u∗ − 1, η(u∗ − 1)) − S(u∗, η(u∗)) ≥ 0 should hold. We have

$$Z(q_e - \min\{q_e, M\}) \le Z(u^*) \le \beta\frac{E}{\tau}p \tag{41}$$

and the contradiction occurs. We can verify the second half of the lemma similarly by using contradiction. If we assume that u = qe is not the optimal solution, then S(u∗ + 1, η(u∗ + 1)) − S(u∗, η(u∗)) ≥ 0 should be satisfied. Consequently, we get

$$Z(q_e) \ge Z(u^*+1) \ge \beta\frac{E}{\tau}p. \tag{42}$$

The contradiction then occurs.

ACKNOWLEDGMENT
The authors would like to thank the anonymous reviewers and the editor for their constructive comments.

REFERENCES
[1] Y. Li, R. Kaewpuang, P. Wang, D. Niyato, and Z. Han, “An energy efficient solution: Integrating plug-in hybrid electric vehicle in smart grid with renewable energy,” in Proc. IEEE INFOCOM Workshop Comput. Commun., Orlando, FL, USA, Mar. 2012, pp. 73–78.
[2] E. Hossain, Z. Han, and V. Poor, Smart Grid Communications and Networking. Cambridge, U.K.: Cambridge Univ. Press, 2012.
[3] G. B. Shrestha, S. G. Ang, and S. G. Ang, “A study of electric vehicle battery charging demand in the context of Singapore,” in Proc. Int. Power Eng. Conf., Singapore, Dec. 2007, pp. 64–69.
[4] O. Sundstrom and C. Binding, “Optimization methods to plan the charging of electric vehicle fleets,” in Proc. Int. CCPE, Chennai, India, Jul. 2010, pp. 323–328.
[5] S. Bashash, S. J. Moura, and H. K. Fathy, “Charge trajectory optimization of plug-in hybrid electric vehicles for energy cost reduction and battery health enhancement,” in Proc. Amer. Control Conf., Baltimore, MD, USA, Jun. 2010, pp. 5824–5831.
[6] S. Sojoudi and S. H. Low, “Optimal charging of plug-in hybrid electric vehicles in smart grids,” in Proc. IEEE PES Gen. Meet., Detroit, MI, USA, Jul. 2011, pp. 1–6.
[7] R. Jin, B. Wang, P. Zhang, and P. B. Luh, “Decentralised online charging scheduling for large populations of electric vehicles: A cyber-physical system approach,” Int. J. Parallel, Emergent Distrib. Syst., vol. 28, no. 1, pp. 29–45, Feb. 2013.
[8] L. Gan, U. Topcu, and S. H. Low, “Optimal decentralized protocol for electric vehicle charging,” IEEE Trans. Power Syst., vol. 28, no. 2, pp. 940–951, May 2013.
[9] Z. Ma, D. S. Callaway, and I. A. Hiskens, “Decentralized charging control of large populations of plug-in electric vehicles,” IEEE Trans. Control Syst. Technol., vol. 21, no. 1, pp. 67–78, Jan. 2013.
[10] W. Saad, Z. Han, H. V. Poor, and T. Basar, “Game-theoretic methods for the smart grid,” IEEE Signal Process. Mag., vol. 29, no. 5, pp. 86–105, Sep. 2012.
[11] O. Sundstrom and C. Binding, “Flexible charging optimization for electric vehicles considering distribution grid constraints,” IEEE Trans. Smart Grid, vol. 3, no. 1, pp. 26–37, Mar. 2012.
[12] H.-J. Kim, J. Lee, G.-L. Park, M.-J. Kang, and M. Kang, “An efficient scheduling scheme on charging station for smart transportation,” in Proc. Int. Conf. SuComs Grid, Daejon, Korea, Sep. 2010, pp. 274–278.
[13] W. Su and M. Y. Chow, “Performance evaluation of a PHEV parking station using particle swarm optimization,” in Proc. IEEE PES Gen. Meet., Detroit, MI, USA, Jul. 2011, pp. 1–6.
[14] H. Qin and W. Zhang, “Charging scheduling with minimal waiting in a network of electric vehicles and charging stations,” in Proc. ACM VANET, Las Vegas, NV, USA, Sep. 2011, pp. 51–60.
[15] J. Huang, V. Gupta, and Y.-F. Huang, “Scheduling algorithms for PHEV charging in shared parking lots,” in Proc. Amer. Control Conf., Montreal, QC, Canada, Jun. 2012, pp. 276–281.
[16] J. Keiser, M. Lutzenberger, and S. Albayrak, “Wind power-aware vehicle-to-grid algorithms for sustainable EV energy management systems,” in Proc. 1st IEEE IEVC, Greenville, SC, USA, Mar. 2012, pp. 1–7.
[17] L. Xie, P. M. S. Carvalho, L. A. F. M. Ferreira, J. Liu, B. Krogh, N. Popli, and M. D. Ilic, “Wind energy integration in power systems: Operational challenges and possible solutions,” Proc. IEEE, vol. 99, no. 1, pp. 214–232, Jan. 2011.
[18] A. Subramanian, M. Garcia, A. Dominguez-Garcia, D. Callaway, K. Poolla, and P. Varaiya, “Real-time scheduling of deferrable electric loads,” in Proc. Amer. Control Conf., Montreal, QC, Canada, Jun. 2012, pp. 3643–3650.
[19] S. Chen and L. Tong, “iEMS for large scale charging of electric vehicles: Architecture and optimal online scheduling,” in Proc. IEEE SmartGridComm, Tainan, Taiwan, Nov. 2012, pp. 629–634.
[20] C.-T. Li, C. Ahn, H. Peng, and J. S. Sun, “Synergistic control of plug-in vehicle charging and wind power scheduling,” IEEE Trans. Power Syst., vol. 28, no. 2, pp. 1113–1121, May 2013.
[21] C. K. Ho and R. Zhang, “Optimal energy allocation for wireless communications powered by energy harvesters,” in Proc. IEEE Int. Symp. Inf. Theory, Austin, TX, USA, Jun. 2010, pp. 2368–2372.
[22] R. A. Raghuvir and D. Rajan, “Delay bounded rate and power control in energy harvesting wireless networks,” in Proc. IEEE WCNC, Cancun, Mexico, Mar. 2011, pp. 369–374.
[23] E. Altman, Constrained Markov Decision Processes. London, U.K.: Chapman & Hall, 1999.
[24] O. H. Lerma and J. B. Lassere, Discrete-Time Markov Control Processes: Basic Optimality Criteria. New York, NY, USA: Springer-Verlag, 1996.
[25] E. A. Feinberg and A. Shwartz, Handbook of Markov Decision Processes: Methods and Applications. Boston, MA, USA: Kluwer, 2002.
[26] F. J. Beutler and K. W. Ross, “Optimal policies for controlled Markov chains with a constraint,” J. Math. Anal. Appl., vol. 112, no. 1, pp. 236–252, Nov. 1985.
[27] D. J. Ma, A. M. Makowski, and A. Shwartz, “Estimation and optimal control for constrained Markov chains,” in Proc. IEEE Conf. Decision Control, Athens, Greece, Dec. 1986, pp. 994–999.
[28] M. Schal, “Average optimality in dynamic programming with general state space,” Math. Oper. Res., vol. 18, no. 1, pp. 163–172, Feb. 1993.


Tian Zhang received the B.S. and M.S. degrees from Shandong Normal University, Jinan, China, in 2006 and 2009, respectively. Since September 2009, he has been pursuing the Ph.D. degree with the Department of Information Science and Engineering, Shandong University, Jinan, China.
From March 2010 to June 2013, he was also a Visiting Research Staff with Tsinghua University, Beijing, China. Since 2014, he has also been a Visiting Research Assistant with the Chinese University of Hong Kong, Hong Kong. His research interests include wireless communications and smart grid.

Wei Chen (S’05–M’07–SM’13) received the B.S. and Ph.D. degrees in electronic engineering (both with highest honors) from Tsinghua University, Beijing, China, in 2002 and 2007, respectively.
From 2005 to 2007, he was also a visiting research staff with the Hong Kong University of Science and Technology (HKUST). Since July 2007, he has been with the Department of Electronic Engineering, Tsinghua University, where he is a full professor and the Vice Director of the Institute of Communications. He visited the University of Southampton, Southampton, U.K., from June 2010 to September 2010. His research interests are in the broad areas of wireless communications, information theory, and applied optimizations.
Prof. Chen is a 973 Youth Project chief scientist and is supported by the new century talent program of the Ministry of Education, Beijing nova program, and 100 fundamental research talents program of Tsinghua University (also known as the 221 talents Program). He served as an Editor for IEEE WIRELESS COMMUNICATIONS LETTERS, a vice director of the youth committee of the China Institute of Communications, a tutorial Co-chair of the 2013 IEEE International Conference on Communications, a track Co-chair of the wireless track of the 2013 IEEE CCNC, a Technical Program Committee (TPC) Co-chair of the 2011 Spring IEEE Vehicular Technology Conference, the Publication Chair of the 2012 IEEE International Conference on Communications in China (ICCC), a TPC Co-chair of the Wireless Communication Symposium at the 2010 IEEE International Conference on Communications (ICC), and a Student Travel Grant Chair of ICC 2008. He received the 2010 IEEE Comsoc Asia Pacific Board Best Young Researcher Award, the 2009 IEEE Marconi Prize Paper Award, the Best Paper Awards at IEEE ICC 2006, IEEE IWCLD 2007, and IEEE Smart Grid 2012, the 2011 Tsinghua Rising Academic Star Award, the 2012 Tsinghua Young Faculty Teaching Excellence Award, the First Prize at the first national young faculty teaching competition, the First Prize at the Seventh Beijing Young Faculty Teaching Competition, and the First Prize at the Fifth Tsinghua University Young Faculty Teaching Competition. He has received the National May 1st Medal and also holds the honorary title of outstanding teacher in Beijing.

Zhu Han (S’01–M’04–SM’09–F’14) received the B.S. degree in electronic engineering from Tsinghua University, Beijing, China, in 1997 and the M.S. and Ph.D. degrees in electrical engineering from the University of Maryland, College Park, MD, USA, in 1999 and 2003, respectively.
From 2000 to 2002, he was a Research and Development Engineer with JDS Uniphase Corporation, Germantown, MD, USA. From 2003 to 2006, he was a Research Associate with the University of Maryland. From 2006 to 2008, he was an Assistant Professor with Boise State University, Boise, ID, USA. He is currently an Associate Professor with the Department of Electrical and Computer Engineering, University of Houston, Houston, TX, USA. His research interests include wireless resource allocation and management, wireless communications and networking, game theory, wireless multimedia, security, and smart grid communications.
Dr. Han has been an Associate Editor for the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS since 2010. He received the National Science Foundation CAREER Award in 2010 and the IEEE Fred W. Ellersick Prize in 2011.

Zhigang Cao (M’84–SM’85) graduated from Tsinghua University, Beijing, China, in 1962.
Since then, he has been with Tsinghua University, where he is currently a Professor with the Department of Electronic Engineering. From 1984 to 1986, he was a Visiting Scholar with Stanford University, Stanford, CA, USA. In 1997, he was a Visiting Professor with Hong Kong University of Science and Technology, Hong Kong. He is the author of six books and more than 500 papers in the fields of communications and signal processing and is a holder of over 20 patents. His current research interests include mobile communications and satellite communications.
Mr. Cao is a Fellow of the Chinese Institute of Communications; a Senior Member of the Chinese Institute of Engineers; and a member of the Institute of Electronics, Information, and Communication Engineers. He serves as an Editor for China Communications, the Journal of Astronautics, and Frontiers of Electrical and Electronic Engineering, and as an Associate Editor-in-Chief of ACTA Electronica Sinica. He received a golden medal from Tsinghua University in 1962, 11 research awards, and a special grant from the Chinese Government for his outstanding contributions to education and research. He also coreceived several best paper awards, including the 2009 IEEE Marconi Prize Paper Award.
