BESS Aided Renewable Energy Supply Using Deep Rein
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TGCN.2021.3136363, IEEE Transactions on Green Communications and Networking
Abstract—The year 2020 witnessed the unprecedented development of 5G networks, along with the widespread deployment of 5G base stations (BSs). Nevertheless, the enormous energy consumption of BSs and the resulting huge energy cost have become significant concerns for mobile operators. With the continuous decline of the renewable energy cost, equipping the power-hungry BSs with renewable energy generators could be a sustainable solution. In this work, we propose an energy storage aided renewable energy supply solution for the BS, which could supply clean energy to the BS and store surplus energy for backup usage. Specifically, to flexibly regulate the battery's discharging/charging, we propose a deep reinforcement learning based regulating policy, which can adapt to the dynamic renewable energy generation as well as the varying power demands. Our experiments using real-world data on renewable energy generation and power demands demonstrate that our power supply solution can achieve a cost saving ratio of 77.9%, compared to the case with traditional power grid supply.

Index Terms—5G base stations, BESS, renewable energy supply, deep reinforcement learning

H. Yuan and D. Guo are with the Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha, Hunan, China. G. Tang is with the Peng Cheng Laboratory, Shenzhen, Guangdong, China. K. Wu is with the Department of Computer Science, University of Victoria, Victoria, BC, Canada. X. Shao is with the School of Regional Innovation and Social Design Engineering, Kitami Institute of Technology, Kitami, Japan. Keping Yu is with the Global Information and Telecommunication Institute, Waseda University, Shinjuku, Tokyo, Japan. W. Wei is with the School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China.
Guoming Tang and Deke Guo are the corresponding authors.

I. INTRODUCTION

The 5G network is considered a promising technology to significantly improve the way we live [1]. Compared to 4G/LTE, it can provide users with higher bandwidth and lower latency and thus enable various cutting-edge mobile services, such as the Internet of Vehicles [2], Virtual Reality [3], and Smart Medical Home [4]. Nevertheless, because 5G base stations (BSs) adopt high frequency bands, their signal coverage range is much shorter than that of 4G/LTE. Consequently, mobile operators need to deploy a large number of 5G BSs to tackle the problem of poor signal coverage. This would result in an ultra-dense BS deployment, especially in "hotspot" areas, as illustrated in Fig. 1.

Building and operating BSs at such a large scale requires enormous investments and resources. According to a field survey in the cities of Guangzhou and Shenzhen, China, the full-load power consumption of a typical 5G BS is about 2 to 3 times that of a 4G one [5]. Considering the ultra-dense deployment of 5G BSs, this could lead to a tenfold increase in energy consumption. In addition, with the increasing emphasis on environmental protection, many governments have shut down some coal-fired power plants, resulting in severe power shortages in some areas. In this regard, how to effectively reduce energy consumption and improve energy efficiency are critical problems.

Renewable energies like solar and wind energy are eco-friendly with zero carbon emissions and have become popular in more scenarios in recent years [6]. Owing to the continuing price decline in photovoltaic (PV) modules and wind turbines, the installation cost of renewable energy has dramatically decreased over the past decade, e.g., a 61% reduction in the cost of solar equipment from 2010 to 2017 has been reported [7]. Such cost reductions shorten the payback period for the renewable energy investment, from a couple of years to several months [8]. The above observations indicate the great potential of renewable energy for fossil fuel replacement and carbon emission reduction.

This has inspired mobile operators to utilize renewable energy as an auxiliary power supply to tackle the huge power demand at 5G BSs. In some developing countries, solar power has already been applied to supply the BSs, in some cases accounting for over 8% of the total electricity usage [9].
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
[Figure 1: illustration with the labeled elements low-orbit satellite, mobile base station, macro cell, wind generator, solar panel, power line, power grid, and transformer.]
Fig. 1. A vision of the future radio access network (RAN) in 5G and beyond, which consists of macro and small cells, and also includes the mobile and space BSs. For the purpose of green communication, all the BSs could be supplied by both the renewable energy and power grid.
By installing PV panels and wind turbines near the BSs, the maximum power from the solar and wind generators can reach up to 8.5 kW and 6.0 kW, respectively, which could remarkably cut down the communication energy supply from the traditional power grid.

To maximize the utilization of renewable energy, energy storage can be strategically utilized such that the energy can be continuously provided, as renewable (like solar or wind) energy is intermittent and unstable. Meanwhile, most BSs are equipped with backup batteries to safeguard the BS's normal functioning against power outages, making them a natural energy storage. Besides, with the continuous price decline in battery storage in recent years [10], [11], combining the battery storage with renewable energy generators could offer even greater cost-reduction potential. Specifically, i) when the generated renewable power is less than the power demand (e.g., during the peak hours), the battery can be discharged to flatten the peak power demands, and ii) when the generated renewable power is more than the power demand (e.g., during the off-peak hours), the battery can be charged to store the surplus renewable energy.

In this paper, we propose a battery energy storage system (BESS) aided renewable energy supply solution for the 5G network and beyond. Aiming at energy cost reduction for mobile operators, our solution is to maximize the utilization of the renewable energy and thus minimize the utilization of the power grid (i.e., fossil energy). Specifically, the energy charge can be continuously reduced by the generated renewable power, and the demand charge can be reshaped and flattened through strategic battery discharging/charging operations.

When designing the optimal control strategy for battery discharging/charging operations, we are faced with several challenges. Firstly, the renewable energy generation and power demand vary highly in both spatial and temporal dimensions and are thus hard to predict. Secondly, owing to the physical constraints of the battery discharging/charging operations (e.g., discharge/charge efficiency), it is complicated to design the optimal battery controlling policy. Thirdly, as the battery's capacity and lifetime are limited and shortened along with the discharge/charge cycles, it is necessary yet nontrivial to trade off between the cost of the battery's degradation/replacement and the gain of renewable energy storage.

By tackling the above challenges, we make the following contributions in this work:
• We present the BESS aided renewable energy supply paradigm for 5G BS operations, in which the battery discharging/charging operation is modelled as an optimization problem. The model is comprehensive in that it takes into account the practical considerations of dynamic power demand and renewable energy generation, as well as battery specifications and physical constraints.
[Figure 2: three panels plotted against Time (Hour) over 0 to 168 hours: (a) power demand pattern of BSs at resident area; (b) power demand pattern of BSs at office area; (c) power demand pattern of BSs at comprehensive area.]
Fig. 2. Power demand patterns of BSs at different areas in a one-week period [12].
For example, for a commercial data center consuming 10 MW on peak and 6 MW on average, the monthly energy charge and demand charge amount to around $24,000 and $165,500, respectively [13]. The demand charge could be up to 8x the energy charge; therefore, effectively cutting down the demand charge could remarkably reduce the energy cost. However, there seems to be no practical way to flatten the peak power demands of 5G BSs, e.g., shifting the real-time demands from mobile users to the off-peak hours could lead to long delays for some classes of jobs [14].

III. SYSTEM MODEL

In this section, we present the system models, basic assumptions, and problem formulation. For clarity, the major notations used in this paper are explained in Table I.

A. Scenario Overview

As illustrated in Fig. 3, the proposed BESS aided renewable energy supply solution deployed at each 5G BS mainly includes: i) a renewable energy generator, e.g., the PV panel and wind turbine, which is deployed near the 5G BS system and generates renewable energy for the system, ii) a battery storage, which stores the surplus renewable energy and acts as the power source for the BS as needed, and iii) a controller, which can obtain the environment state (i.e., the measurement data) so as to control the battery discharging/charging operations through the control signals. In addition to the standard meter, as shown in Fig. 3, an additional generation meter is installed for the BS power supply system to measure the renewable energy generation. Furthermore, with commands from the controller, the distribution panel is responsible for switching power between the renewable energy and grid energy and ensures a continuous and stable electricity supply for the BS.

As the essential component of the BESS aided renewable energy supply solution, the controller determines how efficient this paradigm is. Specifically, at each scheduling point, the controller needs to decide the amount of power supplied from either the battery or the power grid. The scheduling operations should be made based on the power demands and battery states in real time, so that the utilization of renewable energy can be enhanced and the total energy cost can be minimized.

Note that the feasibility of such an implementation as illustrated by Fig. 3 has been preliminarily verified in practice. According to [15], small integrated renewable energy generators have been provided by some commercial companies for the BS system, and they are easily deployed in both open rural and crowded urban environments.

B. BS Power Supply and Demand

The power of each 5G BS is mainly supplied by three parts: power grid, generated renewable energy, and storage energy. In particular, i) when the generated renewable energy is more than the power demand (e.g., during the off-peak hours), each 5G BS is only supplied by the renewable energy (i.e., off-grid) and the surplus renewable energy is stored in the battery storage, ii) when the generated renewable energy is less than the power demand (e.g., during the peak hours), each 5G BS is supplied by all three parts in a cooperative way.

In this paper, we consider a discrete time model, where the entire billing cycle (e.g., one month) is equally split into $T$ consecutive slots of length $\Delta t$, denoted by $\mathcal{T} = \{1, 2, \cdots, T\}$. For an arbitrary 5G BS, the power demand during the entire billing cycle can be represented by a power demand vector:

$$ \mathbf{d} := [d(1), d(2), \cdots, d(T)] \tag{1} $$

where $d(t)$ is the power demand in time slot $t$, which can be obtained by power meter readings at each BS.

C. Renewable Energy Generation

By harvesting energy from renewable energy resources, the BSs could be powered in an environmentally-friendly and cost-efficient way. In this paper, in order to make the model extensible, we denote the renewable energy generation vector as:

$$ \mathbf{g} := [g(1), g(2), \cdots, g(T)] \tag{2} $$

In this work, we choose two typical renewable energy sources as the auxiliary power supply, i.e., solar energy (i.e., $g^s(t)$) and wind energy (i.e., $g^w(t)$). Accordingly, for an arbitrary time slot $t$, the renewable energy generation can be represented by:

$$ g(t) = g^s(t) + g^w(t) \tag{3} $$
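As a minimal sketch of the discrete-time model in Eqs. (1) to (3), the snippet below builds the total generation per slot and splits the balance against demand into the surplus case (renewable generation above demand) and the deficit case (demand above generation). The function names and the use of plain Python lists are illustrative assumptions, not part of the original work.

```python
from typing import List, Tuple


def total_generation(g_solar: List[float], g_wind: List[float]) -> List[float]:
    """Eq. (3): g(t) = g_s(t) + g_w(t) for every slot t (hypothetical helper)."""
    if len(g_solar) != len(g_wind):
        raise ValueError("solar and wind vectors must cover the same T slots")
    return [s + w for s, w in zip(g_solar, g_wind)]


def surplus_and_deficit(
    demand: List[float], generation: List[float]
) -> Tuple[List[float], List[float]]:
    """Per-slot renewable surplus (g > d) and unmet demand (g < d), both >= 0."""
    if len(demand) != len(generation):
        raise ValueError("demand and generation must cover the same T slots")
    surplus = [max(0.0, g - d) for d, g in zip(demand, generation)]
    deficit = [max(0.0, d - g) for d, g in zip(demand, generation)]
    return surplus, deficit


# Two slots: in the first, generation exceeds demand; in the second, it falls short.
g = total_generation([1.0, 0.0], [0.5, 0.25])
surplus, deficit = surplus_and_deficit([1.0, 1.0], g)
```

In the surplus slots the excess is what the battery could store; in the deficit slots the shortfall has to be covered by the battery and the power grid.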
[Figure 3: block diagram with the labeled elements solar panel, wind turbine, generation meter, battery storage, distribution panel, standard meter, power grid, power demand, and a controller that receives measurement data and issues control signals.]
Fig. 3. An exemplified implementation of the BESS aided renewable energy supply solution for the 5G BS.
We assume that if the total generated renewable energy exceeds the power demand (i.e., $g(t) > d(t)$), the power is supplied in proportion to the renewable energy generated. The generation of both sources varies during a certain period (e.g., one day) and is affected by some similar factors such as weather, temperature, wind speed, and so on.

1) Solar Energy Generation: Power generated by the solar PV system mainly depends on three factors: global horizontal irradiance ($GHI(t)$), outdoor temperature ($Temp(t)$), and time of day ($ToD(t)$). By arranging solar PV cells in series/parallel, solar PV could harvest energy and convert it into DC to charge the battery storage and supply the power demand. The power generated by the solar PV at time slot $t$ can be calculated by the following function:

$$ g^s(t) = F_S(GHI(t), Temp(t), ToD(t)) \tag{4} $$

where $F_S(\cdot)$ is a known, non-linear function defined in PVLIB [16]. Accordingly, the solar energy generation during the entire billing cycle can be represented by a vector:

$$ \mathbf{g}^s := [g^s(1), g^s(2), \cdots, g^s(T)] \tag{5} $$

2) Wind Energy Generation: Power generated by the wind turbine generator fluctuates randomly with time and mainly depends on the wind velocity ($WV(t)$), weather system ($WS(t)$), and hub height ($HH(t)$). The wind turbine typically generates energy in two stages: first, it converts the wind power into mechanical energy and then transforms it into electricity. The amount of power generated by the wind turbine at time slot $t$ can be calculated by the following function:

$$ g^w(t) = F_W(WV(t), WS(t), HH(t)) \tag{6} $$

where $F_W(\cdot)$ is a known, non-linear function defined in [17]. Accordingly, the wind energy generation during the entire billing cycle can be represented by a vector:

$$ \mathbf{g}^w := [g^w(1), g^w(2), \cdots, g^w(T)] \tag{7} $$

D. Battery Specification

At an arbitrary time slot $t$, the state of the battery is modeled as follows:

$$ \chi(t) := \langle SoE(t), SoC(t), DoD(t) \rangle \tag{8} $$

where SoE, SoC, and DoD represent the state of effective capacity, state of charge, and depth of discharge of the battery, respectively. Specifically, i) SoE indicates the current effective capacity of the battery, as a percentage of its initial capacity (denoted as $\pi$), ii) SoC indicates the current energy stored in the battery, as a percentage of the current effective capacity, and iii) DoD indicates how much energy the battery has released, as a percentage of the current effective capacity.

To simplify the optimization problem, we discretize the SoC of a battery into $M$ equal-spaced states (e.g., $M = 10$, i.e., $\{10\%, 20\%, \cdots, 100\%\}$). Accordingly, the DoD is also discretized (e.g., release 10% from 90%, i.e.,
them in detail as follows.
• Energy Charge: charged for the total consumed electricity amount (in kWh) throughout the entire billing cycle, at a unit price in $/kWh denoted by $\lambda_e$.
• Demand Charge: charged for the peak power consumption supplied by the power grid (in kW) during the entire billing cycle, at a unit price in $/kW denoted by $\lambda_d$.

Therefore, the incurred cost of energy charge of the whole system in each time slot $t$ can be represented by:

$$ C^e(t) = \lambda_e \cdot p(t) \cdot \Delta t \tag{16} $$

Accordingly, the incurred cost of demand charge of the whole system in each time slot $t$ can be represented by:

$$ C^d(t) = \lambda_d \cdot \max\{0, p(t) - p_{max}\} \tag{17} $$

where $p_{max}$ records the peak power consumption during the past $t - 1$ time slots. For any arbitrary time slot $t$, if $p(t) - p_{max} > 0$, $p_{max}$ will be updated to $p(t)$ accordingly.

B. Investment Cost

Every usage of the equipment (solar PV, wind turbine, and battery storage) incurs a certain reduction of its lifetime, which matters to the investor. Therefore, it is important to understand and quantify the various factors influencing the performance loss curves. For the accuracy of our model, we quantify the investment cost in every time slot as follows.

1) Renewable Energy Generator Cost: As the modules of a renewable energy generation system age, they gradually lose some performance. In this paper, we assume the decline of the system is linear and positively related to its usage time. We denote the lifetime of the renewable energy generator as $L$, which indicates the total time it can be used. For an arbitrary time slot $t$, the remaining lifetime of the renewable energy generator is denoted as $l(t)$, which is constrained by $0 \le l(t) \le L$. The renewable energy generator has to be discarded and replaced by a new one if $l(t) \le 0$. Given the remaining lifetime of the renewable energy generator at time $t - 1$, the remaining lifetime at time $t$ is updated by:

$$ l(t) = l(t - 1) - \Delta t \cdot u(t) \tag{18} $$

where $u(t)$ is defined by:

$$ u(t) = \begin{cases} 1, & \text{if in use} \\ 0, & \text{if not in use} \end{cases} \tag{19} $$

We formulate the usage cost of the renewable energy generator in each time slot $t$ as:

$$ C^u(t) = \lambda \cdot \frac{\Delta t \cdot u(t)}{L} \tag{20} $$

where $\lambda$ is the investment cost of a new renewable energy generator.

We extend the model of the renewable energy generator to specific systems, i.e., the solar PV system and the wind turbine system. In detail, i) for the solar PV system, we denote the lifetime, the usage cost, and the investment as $l_s(t)$, $C^{us}(t)$, and $\lambda_s$, respectively, and ii) for the wind turbine system, we denote the lifetime, the usage cost, and the investment as $l_w(t)$, $C^{uw}(t)$, and $\lambda_w$, respectively. Accordingly, we can derive the usage cost of the solar PV system and the wind turbine system by replacing the symbols in Eq. 20.

2) Battery Storage Degradation Cost: Every cycle of discharge/charge operation does some "harm" to the battery (typically lead-acid) and reduces its capacity and lifetime. In particular, a deep discharge severely affects its internal structure and can even permanently damage the battery (e.g., an over-discharge). The battery has to be discarded and replaced by a new one when the effective capacity drops down to the "ineffective" level, denoted by $SoE_{ine}$ in this paper.

[Figure 4: plot of Number of Cycles (0 to 60000) against DoD (%) (0 to 100).]
Fig. 4. Relationship between DoD levels and battery lifetime (in number of discharge/charge cycles) for LI battery [20].

As illustrated in Fig. 4, each level of DoD has a corresponding number of discharge/charge cycles, so we can formulate the battery storage degradation cost from the relationship between the two. Given a state of battery at time slot $t$, i.e.,
$\langle SoE(t), SoC(t), DoD(t) \rangle$, the SoE decrease of the battery during this time slot can be measured by:

$$ \Delta SoE(t) = \begin{cases} 0, & \text{if } b(t) \le 0 \\ \dfrac{1 - SoE_{ine}}{h(DoD(t-1) + \Delta DoD(t))}, & \text{if } b(t) > 0 \end{cases} \tag{21} $$

where $h(\cdot)$ maps from an input DoD level to the total number of discharge/charge cycles (exemplified in Fig. 4), and $\Delta DoD(t)$ gives the increase of DoD and can be calculated by:

$$ \Delta DoD(t) = \frac{b(t) \Delta t}{\pi} \tag{22} $$

With the above expression of the SoE decrease in each time slot $t$, we can then formulate the degradation cost of the battery storage at each time slot $t$ as:

$$ C^{ub}(t) = \lambda_b \cdot \Delta SoE(t) \tag{23} $$

where $\lambda_b$ is a coefficient converting the battery degradation to a monetary cost, with the unit of "$/SoE decrease".

To sum up, the total investment cost in each time slot $t$ can be calculated as:

$$ C^u(t) = C^{us}(t) + C^{uw}(t) + C^{ub}(t) \tag{24} $$

C. Optimization Formulation and Difficulty Analysis

The battery discharging/charging operation is controlled by the controller. Given the state $\chi(t-1)$ of the battery storage in time slot $t - 1$, the state in time slot $t$ can be updated by:

$$ \chi(t) \leftarrow \begin{cases} SoE(t) = SoE(t-1) - \Delta SoE(t) \\ SoC(t) = SoC(t-1) - b(t) \Delta t / \pi \\ DoD(t) = DoD(t-1) + \Delta DoD(t) \end{cases} \tag{25} $$

For the entire billing cycle $\mathcal{T}$, we need to find the optimal battery discharging/charging controlling policy that minimizes the total electricity bill during the entire billing cycle, which is defined as follows.

$$ \min_{b(t)} \sum_{t=1}^{T} \left( C^e(t) + C^d(t) + C^u(t) \right) \tag{26a} $$

$$ \text{s.t. } (9), (11), (12), \text{ and } (25), \quad \forall t \in \mathcal{T} \tag{26b} $$

When solving the above optimization problem, however, we are faced with the following three challenges.

1) Uncertainty of Renewable Energy: Renewable energy generation is affected by multiple factors such as outdoor temperature and wind velocity. Owing to the unpredictable and intermittent nature of these factors, it is hard to accurately forecast renewable energy generation (i.e., $g(t)$) and to make the optimal discharging/charging operations (i.e., $b(t)$) of the battery storage without accurate information in advance. Therefore, we need a method to tackle the uncertainty of renewable energy generation.

2) Dynamic of Power Demand: In our modeled problem, we assume the power demand (i.e., $d(t)$) is known in advance, so the problem can essentially be optimized offline. However, such an assumption is unrealistic in practice. In fact, traditional offline optimization methods (e.g., dynamic programming [21], [22]) can hardly find the globally optimal solution, as the power demand can be obtained only when the workload arrives at the 5G BS. Thus, an online method that deals with the dynamic power demands (i.e., $d(t)$) and makes optimal discharging/charging operations (i.e., $b(t)$) is greatly needed.

3) High Computation Complexity: The optimization problem in Eq. 26 has embedded NP-hard sub-problems. Firstly, in every time slot $t$, the controller needs to search the action space (mainly determined by $M$) to find the optimal discharging/charging operation (i.e., $b(t)$). To simplify the optimization problem, in this paper we discretize the SoC of the battery into $M$ equal-spaced states; in real scenarios, however, the state of the battery is continuous, which leads to an enormous search space. Secondly, during the entire billing cycle (i.e., $\mathcal{T}$), it is challenging for the controller to continuously make the optimal discharging/charging operation.

To tackle the above three challenges, we propose an online discharging/charging operation controlling method based on deep reinforcement learning (DRL) in the following section.

V. A DRL-BASED BATTERY OPERATION APPROACH

The recent breakthrough of deep reinforcement learning (DRL) [23] provides a promising technique for enabling effective experience-driven control, which exploits past experience (e.g., historical battery discharging/charging operations) for better decision-making by adapting to the current state of the environment.
We consider DRL particularly suitable for online discharging/charging operations because: i) it is capable of handling a high-dimensional state space (such as in AlphaGo [24]), which is an advantage over traditional Reinforcement Learning (RL) [25], and ii) it is able to deal with highly dynamic time-variant environments such as time-varying power demand and renewable energy generation. Next, we will introduce the basic components and concepts of DRL and the proposed DRL-based battery discharging/charging controlling policy in detail.

A. Components & Concepts

A typical DRL framework consists of five key components: agent, state, action, policy, and reward. The concept and design of each component in our DRL-based battery discharging/charging controlling policy are explained as follows.
• Agent: The role of the agent is to make decisions in every episode by interacting with the environment. Specifically, at the beginning of each time slot, it determines the discharging/charging operations (i.e., $b(t)$) according to the current state (e.g., $d(t)$, $g(t)$ and $\chi(t)$) of the environment. The objective is to find an optimal battery discharging/charging controlling policy to minimize the total electricity bill during the entire billing cycle.
• State: At each episode, the agent first observes the state of the current environment to take an action. In order to take the optimal action at each episode, the current state should cover as much information as possible. In this paper, we define the state vector of the current environment as $s(t) = [d(t), g(t), \chi(t), p_{max}]$, which includes the current information on the power demand, the renewable energy generation, the battery storage, and the peak power consumption.
• Action: After observing the state of the environment, the agent will take an action accordingly. In our problem, the action is to control the battery discharging/charging operations in each time slot, i.e., $b(t)$, specifically, i) whether the battery should be discharged or charged, and ii) how much energy should be discharged or charged. We denote the action taken at time $t$ by $a(t)$, which is equivalent to $b(t)$.
• Policy: The battery discharging/charging controlling policy $\psi(s(t)) : \mathcal{S} \rightarrow \mathcal{A}$ defines the mapping from the state space to the action space, where $\mathcal{S}$ and $\mathcal{A}$ represent the state space and the action space, respectively. Specifically, the controlling policy can be represented by the set of $a(t) = \psi(s(t))$, which maps the state of the environment to the action at time slot $t$.
• Reward: After interacting with the environment, the agent will receive a reward $r(t)$ (calculated by the reward function $R(s(t), a(t))$), which indicates the effect of the action in this episode, so as to update the controlling policy. The objective of the agent is to find a policy $\psi$ that maximizes the total reward through continuous interaction with the environment. The design of the reward function significantly affects the performance of the DRL-based algorithm, and we will introduce it in detail in the next subsection.

To sum up, at each episode, the agent observes the state $s(t)$, takes an action $a(t)$ generated by the policy $\psi$, and receives a reward $r(t)$ calculated by the reward function $R(s(t), a(t))$. The objective of the proposed DRL-based battery discharging/charging controlling policy is to take the optimal action in every episode so as to maximize the total reward.

B. Reward Function Design

At the end of each time slot, the agent evaluates the performance of the action using a reward function, which transforms the performance statistics to a numerical utility value. For an arbitrary time $t$, the agent observes the state $s(t)$, takes the action $a(t)$, and adopts the following reward function to assess the performance of the controlling action:

$$ R(s(t), a(t)) = \exp\left( V^e(t) + V^d(t) + V^u(t) \right) \tag{27} $$

in which:
• $V^e(t) = -C^e(t)$ measures the reward of the incremental energy charge caused by the action in time slot $t$.
• $V^d(t) = -C^d(t)$ measures the reward of the incremental demand charge caused by the action in time slot $t$.
• $V^u(t) = -C^u(t)$ measures the reward of the investment cost caused by the action in time slot $t$.
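The reward in Eq. (27) can be sketched together with the two electricity-bill terms it depends on, Eqs. (16) and (17). This is a minimal illustration under stated assumptions: the function names are hypothetical, the investment cost $C^u(t)$ is passed in as a precomputed number, and units are taken as kW for power, hours for $\Delta t$, $/kWh for $\lambda_e$, and $/kW for $\lambda_d$.

```python
import math
from typing import Tuple


def energy_charge(lambda_e: float, p_kw: float, dt_h: float) -> float:
    """Eq. (16): C_e(t) = lambda_e * p(t) * dt."""
    return lambda_e * p_kw * dt_h


def demand_charge(lambda_d: float, p_kw: float, p_max_kw: float) -> float:
    """Eq. (17): C_d(t) = lambda_d * max(0, p(t) - p_max)."""
    return lambda_d * max(0.0, p_kw - p_max_kw)


def reward(c_e: float, c_d: float, c_u: float) -> float:
    """Eq. (27) with V = -C for each term: exp(-(C_e + C_d + C_u))."""
    return math.exp(-(c_e + c_d + c_u))


def step_reward(
    lambda_e: float, lambda_d: float, p_kw: float,
    p_max_kw: float, dt_h: float, c_u: float,
) -> Tuple[float, float]:
    """Return the slot reward and the updated running peak p_max (hypothetical helper)."""
    c_e = energy_charge(lambda_e, p_kw, dt_h)
    c_d = demand_charge(lambda_d, p_kw, p_max_kw)
    new_p_max = max(p_max_kw, p_kw)  # p_max is raised whenever p(t) exceeds it
    return reward(c_e, c_d, c_u), new_p_max


# A slot that stays below the recorded peak incurs no incremental demand charge.
r, p_max = step_reward(lambda_e=0.1, lambda_d=10.0, p_kw=5.0,
                       p_max_kw=6.0, dt_h=0.5, c_u=0.0)
```

Because the reward is the exponential of a negated cost, it always lies in (0, 1] and shrinks quickly as the per-slot cost grows, so an implementation may want to check the scale of the cost terms before training.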
[Figure 5: diagram of the environment (power demand, renewable energy generation, battery storage) interacting with the controller (replay buffer, target net, loss function), with the exchanged quantities $s$, $a$, $r$, $s'$, $\theta$, and $\nabla\theta$.]
Fig. 5. The framework of the learning process in DQN. For simplicity, we denote $s(t + 1)$ as $s'$. After interacting with the environment, the agent (i.e., controller) will determine the specific discharging/charging operation.
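The building blocks named in Fig. 5 can be sketched in a few lines of plain Python. This is a hedged illustration of the standard DQN ingredients (a replay buffer of $(s, a, r, s')$ transitions, $\epsilon$-greedy action selection, and a bootstrapped target computed from the target net's Q-values), not the authors' implementation; the class and function names are hypothetical, and the neural networks themselves are left out, with Q-values passed in as plain lists.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (s, a, r, s_next) transitions; oldest entries are dropped."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, s, a, r, s_next):
        self.buffer.append((s, a, r, s_next))

    def sample(self, batch_size, rng=random):
        # Uniform sampling breaks the correlation between consecutive transitions.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action index with probability epsilon, else the argmax of Q."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(r, q_next_target, gamma):
    """Standard DQN target: r + gamma * max over a' of the target net's Q(s', a')."""
    return r + gamma * max(q_next_target)


buf = ReplayBuffer(capacity=2)
buf.push("s0", 0, 1.0, "s1")
buf.push("s1", 1, 0.5, "s2")
buf.push("s2", 0, 0.2, "s3")  # capacity is 2, so the first transition is evicted
action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)  # purely greedy choice
target = td_target(1.0, [0.5, 2.0], gamma=0.5)
```

In a full training loop, the main net would be fitted toward `td_target` on mini-batches drawn from the buffer, and the target net's parameters would be overwritten with the main net's at a fixed interval.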
At the end of each time slot, the agent evaluates the performance of the action by the reward $r(t)$ calculated by the reward function $R(s(t), a(t))$. In the DRL-based framework, the objective is to maximize the expected cumulative discounted reward:

$$ r(t) = \mathbb{E}\left[ \sum_{k=t}^{\infty} \gamma^{k} R(s(t), a(t)) \right] \tag{28} $$

where $\gamma \in (0, 1]$ is the discount factor indicating the degree of emphasis on future rewards; a higher $\gamma$ indicates a higher degree of emphasis on future rewards.

C. Learning Process Design

The learning process of the algorithm adopts a deep neural network (DNN) called Deep Q-Network (DQN) to derive the correlation between each state-action pair $(s(t), a(t))$ and its value function $Q(s(t), a(t))$, which is the expected discounted cumulative reward. If the environment is in state $s(t)$ and the agent takes action $a(t)$, the value function of the state-action pair $(s(t), a(t))$ can be represented as:

$$ Q(s(t), a(t)) = \mathbb{E}\left[ r(t) \mid s(t), a(t) \right] \tag{29} $$

After obtaining the value of each state-action pair $(s(t), a(t))$, the agent selects the action $a(t)$ with the $\epsilon$-greedy policy $\psi$, that is, it randomly selects an action with probability $\epsilon$, and chooses the action with the maximum $Q(s(t), a(t))$ with probability $1 - \epsilon$, i.e., $\arg\max_{a(t)} Q(s(t), a(t))$.

As illustrated in Fig. 5, two effective techniques were introduced in [23] to improve stability: replay buffer and target network. Specifically,
• Replay Buffer: Unlike traditional reinforcement learning, DQN applies a replay buffer to store state transition samples in the form of $\langle s(t), a(t), r(t), s(t + 1) \rangle$ collected during learning. Every $\kappa$ time steps, the DRL-based agent updates the DNN with a mini-batch of experiences from the replay buffer by means of stochastic gradient descent (SGD): $\theta_{i+1} = \theta_i - \sigma \nabla_\theta Loss(\theta)$, where $\sigma$ is the learning rate. A higher learning rate leads to faster parameter updates. However, at the same time, the algorithm would be more affected by abnormal data, making it prone to divergence and difficult to converge. Compared with Q-learning (only using immediately collected samples), randomly sampling from the replay buffer allows the DRL-based agent to break the correlation between sequentially generated samples, and learn from more independently and identically distributed past experiences. Thus, the replay buffer can smooth out learning and avoid oscillations or divergence.
• Target Network: There are two neural networks with the same structure but different parameters in DQN, the main net and the target net. $Q(s, a; \theta)$ and $Q(s, a; \tilde{\theta})$ represent the current Q-value and target Q-value generated by the main net and the target net, respectively. The DRL-based agent uses the
target net to estimate the target Q-value Q̃ for training the DQN. Every τ time steps, the target net copies the parameters from the main net, whose parameters are updated in real time. After introducing the target net, the target Q-value remains unchanged for a period of time, which reduces the correlation between the current Q-value and the target Q-value and improves the stability of the algorithm.

Accordingly, the DQN can be trained by the loss:

\[ Loss(\theta) \leftarrow \mathbb{E}\left[ \left( \tilde{Q} - Q(s(t), a(t); \theta) \right)^{2} \right] \tag{30} \]

where θ denotes the network parameters of the main net, and Q̃ is the target Q-value, calculated by:

\[ \tilde{Q} \leftarrow r(t) + \gamma \max_{a(t+1)} Q(s(t+1), a(t+1); \tilde{\theta}) \tag{31} \]

where θ̃ denotes the network parameters of the target net, which are updated every τ time slots by copying from the main net.

To sum up, the learning process is depicted by the pseudo-code in Alg. 1. The controller first initializes the replay buffer and the parameters (i.e., θ and θ̃) of the main net and target net, respectively (Lines 1-3 in Alg. 1). After obtaining the value of each state-action pair (s(t), a(t)), the agent selects the action a(t) with the ε-greedy policy ψ, and then performs the action a(t) and interacts with the environment (Lines 6-7 in Alg. 1). Next, the agent receives the reward r(t) and observes the next state s(t + 1) of the environment, and meanwhile stores the transition ⟨s(t), a(t), r(t), s(t + 1)⟩ into the RB (Lines 8-9 in Alg. 1). Every κ time steps, the agent updates the DNN by Eq. 30 with a mini-batch of experiences from the replay buffer by means of stochastic gradient descent (SGD). The target net copies the parameters of the main net every τ time steps (Lines 10-13 in Alg. 1). During the learning process, we set the learning rate σ to 0.001, the ε in the ε-greedy method to 0.9, the discount factor γ to 0.9, and the step parameters τ and κ both to 2000. For the whole battery discharging/charging scheduling process, the algorithm has an overall computational complexity of O(C_conv · T), where C_conv = \sum_{i=1}^{n} C_{in}^{i} C_{out}^{i} represents the sum, over the n linear layers, of the product of the input channels (neurons) and the output channels (neurons) of the i-th linear layer, leading to a faster convergence speed compared to other DRL algorithms.

VI. PERFORMANCE EVALUATION

We evaluate the performance of the proposed DRL-based battery discharging/charging controlling policy through extensive numerical analysis.

A. Experiment Setup

1) BS and Power Consumption Data: To show the performance of the proposed method, we mainly consider the 5G BSs deployed in three areas, i.e., resident area, office area, and comprehensive area, whose power consumption within a one-week period is illustrated in Fig. 2, and we assume the power consumption of the same type of BS in different cities (e.g., Beijing, Shanghai and Guangzhou) is the same. For simplicity, we denote the BSs deployed in the resident, office, and comprehensive areas as type I, type II, and type III, respectively. We will apply the BESS aided renewable energy supply solution to different types of BSs in different cities under different weather conditions and evaluate its performance through extensive simulation experiments.

Algorithm 1: Battery Controlling Algorithm with DRL
Input: Power demand of BS d(t) and renewable energy generation g(t), 1 ≤ t ≤ T
Output: Discharging/charging actions a(t), 1 ≤ t ≤ T
1  Initialize replay buffer (RB) to capacity N;
2  Initialize main net Q with random weights θ;
3  Initialize target net Q̃ with weights θ̃ = θ;
4  for episode = 1 : MaxLoop do
5      for t = 1 : T do
6          Get environment state s(t);
7          a(t) = argmax_a Q(s(t), a(t); θ) with prob. ε, or a random action with prob. 1 − ε;
8          Execute action a(t) and receive r(t) and s(t + 1);
9          Store ⟨s(t), a(t), r(t), s(t + 1)⟩ into RB;
10         Randomly sample a mini-batch of experience ⟨s(i), a(i), r(i), s(i + 1)⟩ from RB every κ steps;
11         Q̃ = r(t) if the episode terminates at step t + 1, else Q̃ = r(t) + γ max_{a(t+1)} {Q(s(t + 1), a(t + 1); θ̃)};
12         Perform SGD on (Q̃ − Q(s, a; θ))² w.r.t. θ;
13         Set Q̃ = Q every τ steps;
14     end
15 end
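The core steps of Alg. 1 (action selection in line 7, the target of Eq. 31 in line 11, the SGD step in line 12, and the target-net copy in line 13) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: a small NumPy table `theta[s, a]` stands in for the DNN, the states are toy integers, and the helper names `select_action`, `td_target`, `sgd_step`, and `maybe_sync` are hypothetical. The sketch follows the convention written in Alg. 1, where the greedy action is taken with probability ε.

```python
import random

import numpy as np

GAMMA = 0.9    # discount factor gamma, as set in the text
SIGMA = 0.001  # learning rate sigma, as set in the text
EPSILON = 0.9  # probability of the greedy action, following line 7 of Alg. 1
TAU = 2000     # target-net synchronization period, as set in the text


def select_action(q_row, epsilon, rng):
    """Line 7: greedy action with prob. epsilon, random action otherwise."""
    if rng.random() < epsilon:
        return int(np.argmax(q_row))
    return rng.randrange(len(q_row))


def td_target(r, q_next_target, terminal, gamma=GAMMA):
    """Line 11 / Eq. (31): r if terminal, else r + gamma * max_a' Q(s', a'; theta_tilde)."""
    if terminal:
        return r
    return r + gamma * float(np.max(q_next_target))


def sgd_step(theta, s, a, target, sigma=SIGMA):
    """Line 12: one descent step on (target - Q(s, a; theta))^2 for a tabular Q."""
    td_error = target - theta[s, a]
    # The gradient of the squared error w.r.t. theta[s, a] is -2 * td_error,
    # so descending along it adds 2 * sigma * td_error.
    theta[s, a] += 2.0 * sigma * td_error
    return td_error


def maybe_sync(theta, theta_tilde, step, tau=TAU):
    """Line 13: copy main-net parameters into the target net every tau steps."""
    if step % tau == 0:
        theta_tilde[...] = theta


# Toy usage: 2 states, 3 actions (sizes are assumptions for illustration).
rng = random.Random(0)
theta = np.zeros((2, 3))          # main-net stand-in
theta_tilde = theta.copy()        # target-net stand-in
theta_tilde[1] = [1.0, 2.0, 0.5]  # pretend target values for the next state

a = select_action(theta[0], EPSILON, rng)
target = td_target(r=-1.0, q_next_target=theta_tilde[1], terminal=False)
sgd_step(theta, s=0, a=a, target=target)
maybe_sync(theta, theta_tilde, step=TAU)
```

Keeping the target parameters in a separate array makes explicit that the target of Eq. 31 stays fixed between synchronizations while `theta` changes at every update.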
[Figure 6: two line plots against Time (Hour), 0 to 24. Legend entries recovered: Clear Day (panel a); High-wind Day and Low-wind Day (panel b).]
Fig. 6. (a) The solar PV output power patterns under different weather conditions (i.e., GHI(t), Temp(t), and ToD(t)) in a one-day period. (b) The wind turbine output power patterns under different weather conditions (i.e., WV(t), WS(t), and HH(t)) in a one-day period.
into three types under different weather conditions. Accordingly, the weather pattern can be divided into nine types: clear & high-wind day, clear & middle-wind day, clear & low-wind day, partial cloudy & high-wind day, partial cloudy & middle-wind day, partial cloudy & low-wind day, cloudy & high-wind day, cloudy & middle-wind day, and cloudy & low-wind day.

The power supply patterns under different weather conditions in a one-day period of the 5G BS in the resident, office, and comprehensive areas are illustrated in Fig. 8, Fig. 10, and Fig. 11 (in the appendix), respectively. As we can see, the BESS aided renewable energy supply solution could significantly reduce the power drawn from the grid (i.e., the energy charge and the demand charge). Specifically, as radiation and wind velocity increase, renewable energy generation increases accordingly. It could cover most of the power demand and reduce the power supplied from the power grid. In particular, on high-wind days, the power demand could be met entirely by the renewable energy and battery storage, requiring no power from the grid.

After calculating the power supply paradigm under different weather patterns, we can derive the electricity bill of these three types of BSs during the billing cycle in different cities (i.e., different weather patterns, which are illustrated in Fig. 7), and the results from all the set scenarios are summarized in Table III.

Specifically, for a single 5G BS without the proposed power supply paradigm, the energy charge and the demand charge are $45.6 and $22.8, respectively (the type III BS in Table III). However, after utilizing the BESS aided renewable energy supply solution on the 5G BS, the electricity bill is significantly reduced. Especially in Shanghai, which has relatively more clear and high-wind days, the energy charge and the demand charge can be reduced to $3.8 and $9.1, respectively (the type II BS in Table III). Although equipment degradation occurs during the discharge/charge cycles, the investment cost remains at an acceptable level. The highest cost saving for a BS using the proposed power supply paradigm in Beijing, Shanghai, and Guangzhou in one billing cycle is $50.4, $50.7 and $49.5, respectively (the type I BS). Accordingly, the saving ratio of this BS type reaches 74.4%, 74.8% and 73.2%, respectively.

C. Performance under Different Types of BSs

As different types of BSs have diverse power demands, resulting in different energy charges and demand charges, the performance of the BESS aided renewable energy supply solution could differ across BS types.

Specifically, as shown in Table III, the type I BS has the highest cost saving among the three types of BSs, i.e., $50.4 in Beijing, $50.7 in Shanghai, and $49.5 in Guangzhou. This is because the type I BS has the biggest power demand and peak value (near 1450 watts), giving it great potential for energy saving and peak power shaving. Besides, as the type II BS's power demands are relatively small, the generated and stored renewable energy can effectively reduce the power grid supply. Therefore, it has the highest saving ratio, i.e., 76.4% in Beijing, 77.9% in Shanghai, and 75.6% in Guangzhou.

D. ROIs of Different Scenarios

The return on investment (ROI) is a financial metric defined by the benefit (cost saving in our case) divided by the total investment. It indicates the profitability of an investment and has been widely used to evaluate the efficiency of an investment [29]. Typically, a bigger ROI value indicates a higher investment efficiency. With the costs of the renewable energy generator and battery storage (given in Table II), the total investments can be calculated. Accordingly, the ROIs can be derived with the results in Table III.

The ROIs of different types of BSs deployed in different cities are shown in Table IV. Specifically, the type I BS has the highest ROI, which reaches 5.43% in Beijing, 5.46% in Shanghai, and 5.33% in Guangzhou, indicating a relatively high investment efficiency for the operators. This is because the type I BS has the biggest cost saving. As the equipment cost is estimated to decrease dramatically in the future [30], the ROI could rise significantly in 5G and beyond. Additionally, as we can see, a city with more clear and high-wind days obtains a bigger ROI value; thus the proposed solution is more suitable for cities with more sunny and windy days.

It is worth noting that we assume the deployed renewable energy generator and the battery storage only supply power to one single 5G BS, and thus the surplus renewable energy (when the battery is full)
[Figure 7: daily irradiance and wind velocity patterns for Beijing, Shanghai, and Guangzhou over days 1 to 30.]
Fig. 7. The weather data is obtained from [28], and the billing cycle window is from 1st June 2020 to 30th June 2020.
[Figure 8: nine panels plotted against Time (Hour), 0 to 24, showing the power supply pattern under each weather pattern: (a) clear & high-wind day; (b) clear & middle-wind day; (c) clear & low-wind day; (d) partial cloudy & high-wind day; (e) partial cloudy & middle-wind day; (f) partial cloudy & low-wind day; (g) cloudy & high-wind day; (h) cloudy & middle-wind day; (i) cloudy & low-wind day.]
Fig. 8. The power supply pattern of a single 5G BS in the resident area under different power supply methods and different weather conditions in a one-day period.
will be discarded. This actually leads to a relatively low utilization, as given in this work. In practice, the generated renewable energy could be supplied to multiple BSs [5], so that the ROI and utilization of the renewable energy could be further improved.

E. Total Electricity Bill under Different Algorithms

To reflect the performance of the proposed method, we mainly compare the total electricity bill with two baseline algorithms.

• AC: which uses the actor-critic (AC) method [31] to make the discharging/charging scheduling
TABLE III
RESULTS SUMMARY (ONE BILLING CYCLE)

BS Type   Scenario                  Energy Charge ($)  Demand Charge ($)  Investment Cost ($)  Cost Saving ($)  Saving Ratio (%)
Type I    No deployment             44.6               23.1               0                    /                /
Type I    Deployment in Beijing     5.0                12.0               0.4                  50.4             74.4
Type I    Deployment in Shanghai    4.7                12.0               0.4                  50.7             74.8
Type I    Deployment in Guangzhou   5.9                12.0               0.3                  49.5             73.2
Type II   No deployment             40.1               20.2               0                    /                /
Type II   Deployment in Beijing     4.8                9.1                0.3                  46.1             76.4
Type II   Deployment in Shanghai    3.8                9.1                0.4                  47.0             77.9
Type II   Deployment in Guangzhou   5.3                9.1                0.3                  45.6             75.6
Type III  No deployment             45.6               22.8               0                    /                /
Type III  Deployment in Beijing     6.8                13.9               0.3                  47.4             69.3
Type III  Deployment in Shanghai    5.7                13.9               0.4                  48.4             70.8
Type III  Deployment in Guangzhou   7.9                13.9               0.2                  46.4             67.8
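The cost saving and saving ratio columns of Table III can be cross-checked with a short Python sketch. It assumes, as an inference from the table rather than a formula stated in the text, that the cost saving equals the no-deployment bill (energy charge plus demand charge) minus the deployed bill (energy charge plus demand charge plus investment cost), and that the saving ratio is the cost saving divided by the no-deployment bill. The dictionary and function names are hypothetical. Because the table entries are rounded to one decimal, recomputed values can differ from the printed ones by about 0.1.

```python
# Rows transcribed from Table III, in dollars per billing cycle.
# BASELINE: (energy charge, demand charge) with no deployment.
BASELINE = {
    "Type I": (44.6, 23.1),
    "Type II": (40.1, 20.2),
    "Type III": (45.6, 22.8),
}
# DEPLOYED: (energy charge, demand charge, investment cost).
DEPLOYED = {
    ("Type I", "Beijing"): (5.0, 12.0, 0.4),
    ("Type I", "Shanghai"): (4.7, 12.0, 0.4),
    ("Type I", "Guangzhou"): (5.9, 12.0, 0.3),
    ("Type II", "Beijing"): (4.8, 9.1, 0.3),
    ("Type II", "Shanghai"): (3.8, 9.1, 0.4),
    ("Type II", "Guangzhou"): (5.3, 9.1, 0.3),
    ("Type III", "Beijing"): (6.8, 13.9, 0.3),
    ("Type III", "Shanghai"): (5.7, 13.9, 0.4),
    ("Type III", "Guangzhou"): (7.9, 13.9, 0.2),
}


def cost_saving(bs_type, city):
    """Assumed definition: baseline bill minus deployed bill incl. investment."""
    return sum(BASELINE[bs_type]) - sum(DEPLOYED[(bs_type, city)])


def saving_ratio(bs_type, city):
    """Cost saving as a percentage of the no-deployment bill."""
    return 100.0 * cost_saving(bs_type, city) / sum(BASELINE[bs_type])


for (bs_type, city) in DEPLOYED:
    print(f"{bs_type:8s} {city:10s} "
          f"saving=${cost_saving(bs_type, city):.1f} "
          f"ratio={saving_ratio(bs_type, city):.1f}%")
```

Under this assumed definition, every recomputed entry agrees with the corresponding Table III entry to within rounding.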
[Figure: three bar charts of the Total Electricity Bill (vertical axis, 0 to 40) for the DQN, AC, and Max methods in Beijing, Shanghai, and Guangzhou. (a) Total electricity bill of type I BS. (b) Total electricity bill of type II BS. (c) Total electricity bill of type III BS.]
TABLE V
LITERATURE SUMMARY
making the load curve flatter by reducing the peak amount of load and shifting it to times of lower load [35], [36]. Specifically, peak power shaving is achieved by charging the energy storage system when demand is low (off-peak period) and discharging energy when demand is high (on-peak period). For task offloading, the total power consumption can also be reduced by dispatching tasks to BSs with lower loads [37], [38]. The relevant literature is summarized in Table V.

B. Battery Storage Optimal Control

The optimal control of energy storage has been extensively studied in the past. Most related works formulate an optimization problem that aims to maximize the revenue generated by the battery storage co-located with a renewable energy generator. Babacan et al. [39] proposed a convex program to minimize the electricity bill of operators. Ratnam et al. [40] aimed to maximize the daily operational savings that accrue to customers while penalizing large voltage swings stemming from reverse power flow and peak load. Kazhamiaka et al. [41] studied the profitability of residential PV-storage systems in three jurisdictions and set up an integer linear program to determine the battery controlling policy. These works assume that the renewable energy generation and the power demand are known in advance and can be optimized offline. However, these assumptions are impractical in the real world.

Several papers study the optimal control of batteries under uncertainty and randomness. Guan et al. [42] utilized a reinforcement learning method to minimize the homeowner's cost by taking the action that yields the best expected reward. EnergyBoost [18] adds the ability to predict the renewable energy generation and power demand. However, these works are only applied in the home scenario, which generates little demand compared to a 5G BS. Therefore, we propose the DRL-based method to tackle the problem of the large and constrained state and action spaces and the uncertainty of renewable energy generation and power demand.

VIII. CONCLUSIONS

To cope with the ever-increasing electricity bill for mobile operators in the 5G era, we proposed a BESS aided energy supply solution for the 5G BS system, which models the battery discharging/charging as an optimization problem. With our proposed solution, besides the power grid, a BS can be powered by the renewable energy and the battery storage, to cut down the total energy cost. To solve the problem under dynamic power demands and renewable energy generation, we developed a DRL-based approach to the BESS operation that accommodates all factors in the modeling phase and makes decisions in real time. To evaluate the performance of our solution, we chose three cities with different weather patterns for experiments. The experimental results show that our power supply solution can achieve a cost saving ratio of 74.8% during the entire billing cycle and improve the renewable energy utilization.

In the future, with further development of communication technology (e.g., B5G/6G), there will be more mobile BSs and air BSs equipped with more batteries, which could rely heavily on renewable energy. Designing an effective battery discharging/charging policy to ensure the high QoS of mobile networks is also an interesting and challenging problem for future work.

ACKNOWLEDGMENT

This work is partially supported by the National Natural Science Foundation of China (No. 61802421 and No. U19B2024) and National Natural Science Foundation of Hunan Province (No. 2019JJ30029), Telecommunications Advancement
Foundation (Japan) Research Grant, RIEC Nationwide Cooperative Research Projects, Research Institute of Electrical Communication, Tohoku University, Japan, H31/B18, ROIS NII Open Collaborative Research 2021 (21FA03).

REFERENCES

[1] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. Soong, and J. C. Zhang, “What will 5G be?” IEEE Journal on Selected Areas in Communications, vol. 32, no. 6, pp. 1065–1082, 2014.
[2] M. Gerla, E.-K. Lee, G. Pau, and U. Lee, “Internet of vehicles: From intelligent grid to autonomous cars and vehicular clouds,” in 2014 IEEE World Forum on Internet of Things (WF-IoT). IEEE, 2014, pp. 241–246.
[3] G. C. Burdea and P. Coiffet, Virtual reality technology. John Wiley & Sons, 2003.
[4] E. D. Muse, P. M. Barrett, S. R. Steinhubl, and E. J. Topol, “Towards a smart medical home,” The Lancet, vol. 389, no. 10067, p. 358, 2017.
[5] G. Tang, Y. Wang, and H. Lu, “Shiftguard: Towards reliable 5G network by optimal backup power allocation,” in IEEE SmartGridComm, 2020, pp. 1–6.
[6] H. Lund, “Renewable energy strategies for sustainable development,” Energy, vol. 32, no. 6, pp. 912–919, 2007.
[7] R. Fu, D. Feldman, R. Margolis, M. Woodhouse, and K. Ardani, “US solar photovoltaic system cost benchmark: Q1 2017,” EERE Publication and Product Library, Tech. Rep., 2017.
[8] J. A. Turner, “A realizable renewable energy future,” Science, vol. 285, no. 5428, pp. 687–689, 1999.
[9] X. Wang, A. V. Vasilakos, M. Chen, Y. Liu, and T. T. Kwon, “A survey of green mobile networks: Opportunities and challenges,” Mobile Networks and Applications, vol. 17, no. 1, pp. 4–20, 2012.
[10] B. Nykvist and M. Nilsson, “Rapidly falling costs of battery packs for electric vehicles,” Nature Climate Change, vol. 5, no. 4, pp. 329–332, 2015.
[11] A. Mondal, S. Misra, and M. S. Obaidat, “Distributed home energy management system with storage in smart grid using game theory,” IEEE Systems Journal, vol. 11, no. 3, pp. 1857–1866, 2015.
[12] H. Wang, F. Xu, Y. Li, P. Zhang, and D. Jin, “Understanding mobile traffic patterns of large scale cellular towers in urban environment,” in Proceedings of the 2015 Internet Measurement Conference, 2015, pp. 225–238.
[13] H. Xu and B. Li, “Reducing electricity demand charge for data centers with partial execution,” in Proceedings of the 5th International Conference on Future Energy Systems, 2014, pp. 51–61.
[14] M. Dabbagh, B. Hamdaoui, A. Rayes, and M. Guizani, “Shaving data center power demand peaks through energy storage and workload shifting control,” IEEE Transactions on Cloud Computing, 2017.
[15] Qingdao Jinfan Energy Science & Technology Co., Ltd., “Renewable energy generator,” http://www.jinfanenergy.cn, 2019.
[16] W. F. Holmgren, R. W. Andrews, A. T. Lorenzo, and J. S. Stein, “Pvlib python 2015,” in 2015 IEEE 42nd Photovoltaic Specialist Conference (PVSC). IEEE, 2015, pp. 1–5.
[17] A. Jahid, M. S. Hossain, M. K. H. Monju, M. F. Rahman, and M. F. Hossain, “Techno-economic and energy efficiency analysis of optimal power supply solutions for green cellular base stations,” IEEE Access, vol. 8, pp. 43 776–43 795, 2020.
[18] B. Qi, M. Rashedi, and O. Ardakanian, “Energyboost: Learning-based control of home batteries,” in Proceedings of the Tenth ACM International Conference on Future Energy Systems, 2019, pp. 239–250.
[19] Y. Shi, B. Xu, B. Zhang, and D. Wang, “Leveraging energy storage to optimize data center electricity cost in emerging power markets,” in Proceedings of the Seventh International Conference on Future Energy Systems, 2016, pp. 1–13.
[20] B. Aksanli, T. Rosing, and E. Pettis, “Distributed battery control for peak power shaving in datacenters,” in IEEE IGCC, 2013, pp. 1–8.
[21] D. K. Maly and K.-S. Kwan, “Optimal battery energy storage system (BESS) charge scheduling with dynamic programming,” IEE Proceedings-Science, Measurement and Technology, vol. 142, no. 6, pp. 453–458, 1995.
[22] A. Oudalov, R. Cherkaoui, and A. Beguin, “Sizing and optimal operation of battery energy storage system for peak shaving application,” in 2007 IEEE Lausanne Power Tech. IEEE, 2007, pp. 621–625.
[23] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski et al., “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, pp. 529–533, 2015.
[24] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot et al., “Mastering the game of Go with deep neural networks and tree search,” Nature, vol. 529, no. 7587, pp. 484–489, 2016.
[25] R. S. Sutton and A. G. Barto, Reinforcement learning: An introduction. MIT Press, 2018.
[26] Dominion Energy South Carolina, Inc., “Rate 23 - industrial power service,” https://etariff.psc.sc.gov/Organization/TariffDetail/150?OrgId=411, 2020.
[27] US Department of Energy, “Energy storage technology and cost characterization report,” https://www.energy.gov/eere/water/downloads/energy-storage-technology-and-cost-characterization-report, 2019.
[28] China Meteorological Administration, “Historical weather forecast,” http://www.weather.com.cn/, 2020.
[29] Wikipedia, “Return on investment,” https://en.wikipedia.org/wiki/Return_on_investment, 2020.
[30] National Renewable Energy Laboratory (NREL), “Cost projections for utility-scale battery storage,” https://www.nrel.gov/docs/fy19osti/73222.pdf, 2019.
[31] V. R. Konda and J. N. Tsitsiklis, “Actor-critic algorithms,” in Advances in Neural Information Processing Systems, 2000, pp. 1008–1014.
[32] K. Son, H. Kim, Y. Yi, and B. Krishnamachari, “Base station operation and user association mechanisms for energy-delay tradeoffs in green cellular networks,” IEEE Journal on Selected Areas in Communications, vol. 29, no. 8, pp. 1525–1536, 2011.
[33] E. Oh, K. Son, and B. Krishnamachari, “Dynamic base station switching-on/off strategies for green cellular networks,” IEEE Transactions on Wireless Communications, vol. 12, no. 5, pp. 2126–2136, 2013.
[34] C. Liu, B. Natarajan, and H. Xia, “Small cell base station sleep strategies for energy efficiency,” IEEE Transactions on Vehicular Technology, vol. 65, no. 3, pp. 1652–1661, 2015.
[35] E. Reihani, M. Motalleb, R. Ghorbani, and L. S. Saoud, “Load peak shaving and power smoothing of a distribution grid with high renewable energy penetration,” Renewable Energy, vol. 86, pp. 1372–1379, 2016.
[36] C. G. Tse, B. A. Maples, and F. Kreith, “The use of plug-in hybrid electric vehicles for peak shaving,” Journal of Energy Resources Technology, vol. 138, no. 1, 2016.
[37] Y. Bejerano and S.-J. Han, “Cell breathing techniques for load balancing in wireless LANs,” IEEE Transactions on Mobile Computing, vol. 8, no. 6, pp. 735–749, 2009.
[38] A. Sang, X. Wang, M. Madihian, and R. D. Gitlin, “Coordinated load balancing, handoff/cell-site selection, and scheduling in multi-cell packet data systems,” in Proceedings of the 10th Annual International Conference on Mobile Computing and Networking, 2004, pp. 302–314.
[39] O. Babacan, E. L. Ratnam, V. R. Disfani, and J. Kleissl, “Distributed energy storage system scheduling considering tariff structure, energy arbitrage and solar PV penetration,” Applied Energy, vol. 205, pp. 1384–1393, 2017.
[40] E. L. Ratnam, S. R. Weller, and C. M. Kellett, “An optimization-based approach to scheduling residential battery storage with solar PV: Assessing customer benefit,” Renewable Energy, vol. 75, pp. 123–134, 2015.
[41] F. Kazhamiaka, P. Jochem, S. Keshav, and C. Rosenberg, “On the influence of jurisdiction on the profitability of residential photovoltaic-storage systems: A multi-national case study,” Energy Policy, vol. 109, pp. 428–440, 2017.
[42] C. Guan, Y. Wang, X. Lin, S. Nazarian, and M. Pedram, “Reinforcement learning-based control of residential energy storage systems for electric bill minimization,” in 2015 12th Annual IEEE Consumer Communications and Networking Conference (CCNC). IEEE, 2015, pp. 637–642.

Deke Guo received the B.S. degree in industry engineering from the Beijing University of Aeronautics and Astronautics, Beijing, China, in 2001, and the Ph.D. degree in management science and engineering from the National University of Defense Technology, Changsha, China, in 2008. He is currently a Professor with the College of System Engineering, National University of Defense Technology, and is also with the College of Intelligence and Computing, Tianjin University. His research interests include distributed systems, software-defined networking, data center networking, wireless and mobile systems, and interconnection networks. He is a senior member of the IEEE and a member of the ACM.

Kui Wu received the BSc and the MSc degrees in computer science from Wuhan University, China, in 1990 and 1993, respectively, and the PhD degree in computing science from the University of Alberta, Canada, in 2002. He joined the Department of Computer Science, University of Victoria, Canada, in 2002, where he is currently a Full Professor. His research interests include smart grid, mobile and wireless networks, and network performance evaluation. He is a senior member of the IEEE.
APPENDIX A
RESULTS FROM OFFICE & COMPREHENSIVE AREAS

[Figure 10: nine panels showing the power supply pattern under each weather pattern: (a) clear & high-wind day; (b) clear & middle-wind day; (c) clear & low-wind day; (d) partial cloudy & high-wind day; (e) partial cloudy & middle-wind day; (f) partial cloudy & low-wind day; (g) cloudy & high-wind day; (h) cloudy & middle-wind day; (i) cloudy & low-wind day.]
Fig. 10. The power supply pattern of a single 5G BS in the office area under different power supply methods and different weather conditions in a one-day period.
[Figure 11: nine panels showing the power supply pattern under each weather pattern: (a) clear & high-wind day; (b) clear & middle-wind day; (c) clear & low-wind day; (d) partial cloudy & high-wind day; (e) partial cloudy & middle-wind day; (f) partial cloudy & low-wind day; (g) cloudy & high-wind day; (h) cloudy & middle-wind day; (i) cloudy & low-wind day.]
Fig. 11. The power supply pattern of a single 5G BS in the comprehensive area under different power supply methods and different weather conditions in a one-day period.