
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TVT.2019.2926733, IEEE Transactions on Vehicular Technology

Adaptive Hierarchical Energy Management Design for a Plug-in Hybrid Electric Vehicle
Teng Liu, Xiaolin Tang, Hong Wang, Huilong Yu, Xiaosong Hu

Abstract—To promote the real-time application of advanced energy management systems in hybrid electric vehicles (HEVs), this article proposes an adaptive hierarchical energy management strategy for a plug-in HEV. In this work, deep learning (DL) and genetic algorithm (GA) are combined to derive the power split controls between the battery and internal combustion engine (ICE). First, the architecture of the multimode powertrain is established, wherein the particular control actions, state variables, and optimization objective are explained. Then, the hierarchical framework for generating control actions is introduced. GA is utilized to search for the globally optimal controls based on the powertrain model provided in Matlab/Simulink. DL is applied to train the neural network model that connects the inputs and control actions. Finally, the effectiveness of the presented integrated energy management strategy is validated by comparison with the original charge depleting/charge sustaining (CD/CS) policy. Simulation results indicate that the proposed technique can substantially improve the fuel economy. Furthermore, a hardware-in-the-loop (HIL) test is conducted to evaluate the adaptive and real-time characteristics of the designed energy management system.

Index Terms—Chevrolet Volt, hierarchical energy management, deep neural network, genetic algorithm

I. INTRODUCTION

DEVELOPMENT of green, efficient and pollution-free automobiles has been one of the most prevalent research hotspots in the recent decade. The plug-in hybrid electric vehicle ((P)HEV) has the potential to reduce pollutant emissions and fuel consumption and is common in current traffic environments [1-4]. Energy management is a critical technology in PHEVs, which determines the optimal power split controls for multiple onboard energy sources [5, 6]. Nowadays, the most difficult problem in the HEV energy management field is how to achieve real-time application of the derived strategies [7, 8].

In the initial phase of developing HEVs, rule-based methods were useful and powerful because they can be implemented easily. For example, Ali et al. [9] designed a hand-made system to extract fuzzy logic rules from offline optimization results. These rules could be used for unknown power profiles and improve the system efficiency. Furthermore, Ref. [10] formulated Markov chain-based rules to predict the operation modes of an HEV. The acquired rules can realize online corrections of the engine and motor torques and thus enhance fuel economy. However, the dependence on human experience and the inflexibility in uncertain environments restrict extensive application of rule-based energy management strategies (EMSs) [11]. Hence, academic and industrial researchers are exploring more advanced techniques and algorithms to address these drawbacks.

To exploit the deeper potential of HEVs, optimization-based approaches have attracted more and more attention in the last several years. Some of them can obtain globally optimal results for given driving conditions, such as dynamic programming (DP) [12], genetic algorithm (GA) [13], simulated annealing (SA) [14] and game theory (GT) [15]. As alternatives, many methods can generate instantaneous policies for practical implementation, such as model predictive control (MPC) [16], equivalent consumption minimization strategy (ECMS) [17], extremum seeking control [18] and stochastic dynamic programming (SDP) [19]. Different degrees of improvement have been realized using these techniques in hundreds of publications. However, real-time application of the formulated EMSs in complicated driving situations remains a challenge.

With the rapid development of artificial intelligence technologies, learning-based methods are also explored and employed in the energy management field of HEVs. For example, reinforcement learning (RL) algorithms [20-24] have been shown to be efficient tools that train an intelligent controller based on cumulative experience. Recent studies reveal that a combination of transportation information and big data analysis is a promising way to promote fuel economy for many HEVs [25, 26]. As a result, deep RL has been attempted in recent research [27, 28], wherein deep learning is exploited to deal with enormous driving data, and RL is utilized to produce various branches of control action. However, the explainability and feasibility of the generated model need to be further studied.

Copyright (c) 2015 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to
The work was supported in part by National Natural Science Foundation of China (51705044). (Corresponding authors: Xiaolin Tang and Hong Wang)
T. Liu is with Department of Automotive Engineering and the State Key Laboratory of Mechanical Transmission, Chongqing University, Chongqing 400044, China, and also with Department of Mechanical and Mechatronics Engineering, University of Waterloo, N2L 3G1, Canada. (email: tengliu17@
X. Tang is with State Key Laboratory of Mechanical Transmissions, College of Automotive Engineering, Chongqing University, Chongqing, 400044, PR China. (email:
H. Wang and H. Yu are with Mechanical and Mechatronics Engineering Department, Waterloo University, N2L 3G1, Canada. (email:
X. Hu is with the Department of Automotive Engineering and the State Key Laboratory of Mechanical Transmission, Chongqing University, Chongqing 400044, China (e-mail:

0018-9545 (c) 2019 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See for more information.

Fig. 1. DL and GA-enabled adaptive hierarchical EMS framework [29]. (Up-level, GA-based EMS for PHEV: the research vehicle (PHEV) powertrain model in Autonomie, with PG set and power split control, is coupled to GA for parameter optimization, including the parameters for mode choice, using the GA function in Matlab. Low-level, DL-enabled online controller training: driving conditions information from multiple types of driving cycles feeds a multiple artificial NN; the trained model (controller) takes speed, SoC, power demand, … as input and outputs the control actions.)

This work focuses on developing a real-time energy management strategy for a multimode PHEV, the Chevrolet Volt. The control components and powertrain model are provided in the Autonomie software. The property and efficiency maps are tested and collected via standard experiments. Based on these models, two advanced technologies are combined to derive the online EMS through multiple steps.

This paper proposes an adaptive hierarchical energy management strategy for the Chevrolet Volt, as depicted in Fig. 1. First, the powertrain model is built, wherein the optimization goal, control actions, and state variables are defined. Then, DL and GA are incorporated to extract the optimal control logic for different driving conditions. Finally, the derived EMS is validated in simulation tests and a hardware-in-the-loop (HIL) experiment. The real-time characteristics indicate that the presented method is a promising choice for online application.

Three innovations and contributions are underlined in this article: 1) A hierarchical control framework is constructed to decide the adaptive EMS for a plug-in HEV; 2) GA is leveraged to search for the optimal control actions for particular driving conditions and DL is utilized to establish the mapping model connecting the states and actions; 3) The proposed strategy is evaluated in a HIL experiment to demonstrate its effectiveness. To the best of our knowledge, this is a novel attempt to combine learning-based and optimization-based methods in energy management research of HEVs.

The remainder of this article is organized as follows: the powertrain model and optimization control problem are explained in Section II. Section III describes the usage of GA and DL for the optimal power split controls set. The related simulation and HIL experiment results are displayed and analyzed in Section IV. Finally, the conclusion and future work are summarized in Section V.

II. POWERTRAIN MODEL AND ENERGY MANAGEMENT PROBLEM CONSTRUCTION

In this section, the powertrain model is given, which includes a battery, a planetary gear (PG) set and an internal combustion engine (ICE). Because of the existence of three clutches (BK1, CL1, and CL2), the vehicle can operate in four traction modes [29]. The mathematical formulation of these four modes is explained in detail. Based on the established model, the energy management problem is transformed into an optimization control problem with a cost function and specific constraints.

A. Powertrain Architecture

The construction of the studied HEV is shown in Fig. 2 [30].

Fig. 2. Powertrain architecture of the studied Chevrolet Volt [29]. (Labelled components: tank, ICE, generator, battery, motor, clutch CL1, brake BK1, PG ring and carrier, final drive.)

In the battery, the state of charge (SoC) is a significant factor that represents the current electric capacity of the battery. It is chosen as the state variable in the energy management problem and is determined as follows

$$\dot{SoC} = -\frac{I_b}{Q_b} = -\frac{V_{oc} - \sqrt{V_{oc}^2 - 4 r_b P_b}}{2 Q_b r_b} \tag{1}$$

where $I_b$ and $r_b$ are the battery current and internal resistance, and $V_{oc}$, $P_b$ and $Q_b$ are the open-circuit voltage, output power and nominal capacity of the battery, respectively. Note that the battery degradation cost is neglected in this research [29]. $r_b$ and $V_{oc}$ vary with the SoC, and the related functions are determined from experimental data. Since the motor and generator are connected to the battery directly, $P_b$ can be expressed as


$$P_b = P_m \eta_m^{\theta} + P_g \eta_g^{\theta} \tag{2}$$

where $P_m$ and $P_g$ are the powers of the motor and generator, and $\eta_m$ and $\eta_g$ are their efficiencies, respectively. The sign $\theta$ belongs to {-1, 1}, wherein $\theta=1$ denotes recuperation and $\theta=-1$ means motoring. The efficiency maps of the motor and generator, and the fuel and efficiency maps of the ICE, are all obtained from bench tests. Furthermore, the instantaneous fuel consumption rate is computed as

$$\dot{m}_f = f(T_e, \omega_e) \tag{3}$$

where $T_e$ and $\omega_e$ are the torque and rotational speed of the ICE, respectively.

The PG set is composed of the sun gear, ring gear, and carrier. In the Chevrolet Volt's powertrain, the sun gear is directly connected to the motor, the ring gear is linked to the generator via clutch 2 (CL2), the generator and ICE are connected by clutch 1 (CL1), and the carrier is linked to the reduction transmission [31]. Generally, the carrier speed $\omega_c$ can be calculated from the sun speed $\omega_s$ and ring speed $\omega_r$

$$\omega_c = \frac{\rho}{\rho+1}\,\omega_r + \frac{1}{\rho+1}\,\omega_s \tag{4}$$

where $\rho$ is the teeth ratio of the ring and sun gears and is equal to 83/37=2.243. The torques of these three parts are further related by

$$T_s = T_r / \rho = T_c / (\rho + 1) \tag{5}$$

where $T_s$, $T_r$, and $T_c$ are the torques of the sun gear, ring gear and carrier, respectively. Based on the connections described above, these torques can be expressed as follows

$$\begin{cases} T_s = T_m \\ T_c = T_{wh} / i_0 \\ T_r = T_g + CL_1 \cdot T_e \end{cases} \tag{6}$$

where $T_{wh}$ is the required torque at the wheel and $i_0$ is the final transmission ratio. $CL_1 \in \{0, 1\}$ indicates whether the corresponding clutch is open or closed, and the same holds for CL2 and BK1.

As mentioned above, the three clutches decide the working condition of the different power sources, which results in diverse operation modes [32]. A clutch symbol equal to 1 denotes that the clutch is closed, and 0 that it is open. The studied powertrain has four operation modes in total, and they are modeled as follows.

1) Electric Mode 1: In this mode, the clutch states are BK1=1, CL1=0 and CL2=0, which indicates that the ICE and generator set do not provide power to the wheels. The transient dynamics (rotational speed and torque) at the wheels can be written as

$$\omega_{wh} = \omega_m / (1 + \rho) = v\, i_0 / r \tag{7}$$

$$T_{wh} = (1 + \rho)\, T_m - (1 + \rho)^2 J_m \dot{\omega}_{wh} \tag{8}$$

where $\omega_m$ is the motor speed, $v$ is the vehicle speed, $r$ is the wheel radius, $T_m$ is the motor torque and $J_m$=0.0289 is the motor inertia.

2) Electric Mode 2: The clutch states are BK1=0, CL1=0 and CL2=1, which means that the motor and generator can both supply power drawn from the battery. The battery can be charged by regenerative braking. The dynamic equations for this mode are

$$P_{wh} = \eta_m P_m + \eta_g P_g = \eta_m \cdot T_m \omega_m + \eta_g \cdot T_g \omega_g \tag{9}$$

$$T_{wh} = (1 + \rho)\, T_m - (1 + \rho)\, J_m \dot{\omega}_{wh} + (1 + \rho)\, \rho\, J_m \dot{\omega}_g \tag{10}$$

where $P_{wh}$ is the power demand, and $T_g$ and $\omega_g$ are the torque and rotational speed of the generator.

3) Range-extender Mode: In this mode, the ICE and battery provide power in a series configuration and thus the clutch states are BK1=1, CL1=1 and CL2=0. The ICE and generator set can charge the battery and propel the vehicle. Hence, this mode is described by

$$P_{wh} = \eta_m P_m = \eta_g P_g + P_b \tag{11}$$

$$P_g = P_e = T_e \cdot \omega_e \tag{12}$$

$$(J_e + J_g)\, \dot{\omega}_e = T_e + T_g \tag{13}$$

where $P_e$, $T_e$, and $\omega_e$ are the output power, torque and rotational speed of the ICE, respectively. $J_e$=0.1988 and $J_g$=0.0784 are the inertias of the ICE and generator.

4) Hybrid Mode: In this mode, the clutches are switched to BK1=0, CL1=1, and CL2=1 and thus the engine can supply power while the battery is charged. The dynamic expressions of this mode are

$$P_{wh} = P_b + P_e \tag{14}$$

$$(J_e + J_g + \rho^2 J_m)\, \dot{\omega}_e - (1 + \rho)\, \rho\, J_m \dot{\omega}_{wh} = T_e + T_g - T_m \tag{15}$$

TABLE I. Parameters of the main components in the powertrain [25]
Notation    Implication                 Value
M           Vehicle mass                1715 kg
ρ           PG ratio                    2.243
i0          Final transmission ratio    3.02
r           Wheel radius                0.3 m
Pg,max      Generator peak power        53 kW
Pb,max      Battery peak power          110 kW
Pe,max      ICE maximum power           63 kW
Pm,max      Motor peak power            111 kW

As the operation modes are complicated, seven parameters are selected as the control actions in this energy management problem. They are the clutch states BK1, CL1 and CL2, the torque and rotational speed of the ICE Te and ωe, and the torques of the motor and


generator Tm and Tg. The first three actions are used to determine the operation mode and the last four are utilized to calculate the optimal power split. The parameters of the main components in this powertrain are listed in Table I.

C. Optimization Control Problem

After defining the state variable and control actions in the powertrain model, the control objective can be formulated in terms of them. In this research, the cost function is the sum of the fuel cost $C_{fu}$ and electricity cost $C_{el}$

$$J = C_{fu} + C_{el} = \int_{t_0}^{t_f} \dot{m}_f(t)\, dt \cdot P_{fu} + g\big(SoC(t_f), P_{el}\big) \tag{16}$$

where $[t_0, t_f]$ is the simulation time horizon. $P_{fu}$ and $P_{el}$ are the prices of fuel and electricity, which are 0.795 US $/kg and 0.137 US $/kWh, respectively. Function $g$ represents a look-up table, which indicates that the electricity cost depends on the final SoC value and the electricity price. The related table is provided in the software Autonomie [29]. The fuel consumption rate and SoC value are determined by the seven control actions, which are mentioned in Section II.B.

When selecting the optimal control actions at each time instant, several restrictions need to be followed, such as safety and physical constraints. They are formulated as follows

$$\begin{cases} SoC_{\min} \le SoC(t) \le SoC_{\max} \\ P_{b,\min} \le P_b(t) \le P_{b,\max} \end{cases} \tag{17}$$

$$\begin{cases} \omega_{x,\min} \le \omega_x(t) \le \omega_{x,\max}, & x = e, m, g \\ T_{x,\min} \le T_x(t) \le T_{x,\max}, & x = e, m, g \end{cases} \tag{18}$$

where the subscripts min and max denote the lower and upper bounds of the variables. The minimum and maximum values of SoC are set to 0.3 and 0.9 in this problem. The goal of this problem is to design an adaptive EMS that is able to accommodate unknown driving cycles. In the next section, the GA and DL methods are introduced to search for the target policy.

This section focuses on the methodology that is used to explore the desired control actions. First, the principle of GA is explained to obtain optimal control actions offline with respect to the integrated driving cycle. Furthermore, the DL method is applied to train the mapping model that connects the inputs and outputs.

A. GA Exploration

GA is a global search algorithm, and it was first applied to search for adaptive control actions for an integrated driving cycle in our previous work [31]. This cycle combines urban and motorway cycles and is expected to generate appropriate parameters for the testing driving cycle. The comparison analysis indicates that the GA-based EMS has nearly optimal performance and has the potential to be employed for unknown driving cycles. To further improve the control performance and real-time effectiveness, DL is employed to train a model that can unite different GA-based EMSs. Hence, the theory of GA is described first in this section.

GA is named after the principle of natural genetics and represents a heuristic search procedure. The concepts of artificial intelligence and function optimization are included in GA to develop the solutions. First, the initial generation of solutions is chosen randomly. Then, better solutions with higher performance are recombined to form new ones. Finally, the original solutions are replaced by the new ones and this process is repeated to obtain the best solution.

Three representative evolution operators are defined in GA, which are crossover, mutation, and selection [13]. Selection means that some members of the current generation are chosen to reproduce the new offspring. The selection rule is based on the control objective, which means that better control actions are preferred. Crossover guides the production of new offspring: hybridization among the candidate control actions is the core method to generate them. The function of mutation is to avoid locally optimal solutions, which is achieved by random adjustment.

Fig. 3. Operation procedure of GA for energy management problem. (Driving cycle profiles 1 and 2 are each passed to the genetic algorithm (GA) with population size 130, terminal threshold 1e-4, crossover 80%, selection 5% and mutation 1%. Realization: GA function in Matlab, variable constraints in (17) and (18), seven control actions searched simultaneously. Results: optimal control actions (Group 1) and (Group 2), each consisting of the three clutch states, the torque and rotational speed of the ICE, and the torques of the motor and generator.)

The global optimality of the GA-enabled EMSs has been evaluated and analyzed in previous literature [33]. Since the control action space is large (seven control actions), GA is suitable to optimize them simultaneously. Fig. 3 depicts the operation procedure of GA in this work. For a particular known driving cycle, GA can search for the optimal control actions. The first three actions (clutch states) decide the operation mode of the powertrain, and the last four actions determine the power split among the energy sources. The search space of these seven control actions is BK1∈{0, 1}, CL1∈{0, 1}, CL2∈{0, 1}, Te∈[0 200] Nm, ωe∈[0 4500] r/min, Tm∈[-350 350] Nm, Tg∈[-200 200] Nm, respectively. If the driving cycle is unknown, the control actions derived from other cycles can serve as approximate solutions for this cycle. The performance should be evaluated by the cost function in (16).

The computation process is realized offline by combining the GA function in Matlab and the powertrain model in Simulink.
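As an illustration of the selection, crossover and mutation steps described above, the following is a minimal Python sketch rather than the Matlab GA function used in this work. The bounds of the seven control actions and the GA settings (population 130, crossover 80%, mutation 1%, 5% of individuals retained by selection, threshold 1e-4) are taken from the text. Everything else is an assumption made for illustration: `toy_cost` is a hypothetical stand-in for the cost in (16) and does not represent the powertrain model, the two-way tournament is one possible selection rule, and the stall-based stopping test is one possible reading of the terminal threshold.

```python
import random

# Search space from the text: BK1, CL1, CL2 in {0, 1}, then Te, we, Tm, Tg.
BOUNDS = [(0, 1), (0, 1), (0, 1),
          (0.0, 200.0), (0.0, 4500.0), (-350.0, 350.0), (-200.0, 200.0)]
N_BINARY = 3  # the first three genes are clutch states


def random_individual(rng):
    clutches = [rng.randint(0, 1) for _ in range(N_BINARY)]
    reals = [rng.uniform(lo, hi) for lo, hi in BOUNDS[N_BINARY:]]
    return clutches + reals


def toy_cost(ind):
    """Hypothetical stand-in for J in (16); it is NOT the powertrain model."""
    bk1, cl1, cl2, te, we, tm, tg = ind   # bk1 and cl2 are ignored by this toy cost
    demand = 120.0                        # assumed torque request, Nm
    tracking = (tm + cl1 * te - demand) ** 2
    fuel = 1e-4 * cl1 * te * we           # crude fuel proxy, engine counted only if CL1 is closed
    return tracking + fuel + 1e-3 * tg ** 2


def crossover(a, b, rng):
    # Single-point crossover; gene positions are preserved, so types stay valid.
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:]


def mutate(ind, rng, rate):
    # Each gene is resampled inside its bounds with probability `rate`.
    out = list(ind)
    for i, (lo, hi) in enumerate(BOUNDS):
        if rng.random() < rate:
            out[i] = rng.randint(0, 1) if i < N_BINARY else rng.uniform(lo, hi)
    return out


def run_ga(cost, pop_size=130, generations=60, p_cross=0.8, p_mut=0.01,
           elite_frac=0.05, tol=1e-4, stall_limit=15, seed=0):
    rng = random.Random(seed)
    pop = sorted((random_individual(rng) for _ in range(pop_size)), key=cost)
    n_elite = max(1, int(elite_frac * pop_size))
    best_cost, stall = cost(pop[0]), 0
    for _ in range(generations):
        children = [list(ind) for ind in pop[:n_elite]]   # selection: the best survive
        while len(children) < pop_size:
            p1 = min(rng.sample(pop, 2), key=cost)        # assumed tournament selection
            p2 = min(rng.sample(pop, 2), key=cost)
            child = crossover(p1, p2, rng) if rng.random() < p_cross else list(p1)
            children.append(mutate(child, rng, p_mut))
        pop = sorted(children, key=cost)
        new_best = cost(pop[0])
        stall = stall + 1 if best_cost - new_best < tol else 0
        best_cost = new_best
        if stall >= stall_limit:                          # no meaningful progress: stop
            break
    return pop[0], best_cost


if __name__ == "__main__":
    actions, j = run_ga(toy_cost)
    print("best actions:", actions, "cost:", j)
```

Because the best individuals are copied unchanged into the next generation, the best cost in this sketch never increases from one generation to the next. In the actual workflow, the cost evaluation would be a simulation of the powertrain model over the driving cycle.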


The values for population size and terminal threshold are 130 and 1e-4. Furthermore, the probabilities of crossover, mutation, and selection are 80%, 1% and 5%, respectively. The computing platform is a common desktop with a 2.90 GHz processor and 7.83 GB RAM. Specifically, the inputs of the GA function are the cost function, the threshold intervals of the parameters and the constraints in (17) and (18), and the outputs are the total cost and the optimal control actions. The seven control actions are optimized at the same time, and their optimality depends on whether the driving cycle information is known or not. The next subsection explains how to exploit DL to train an online model based on the GA-enabled control actions.

B. DL Training

DL is one of the representation learning approaches and usually consists of multiple levels of representation [34]. The aim of DL is to transform raw data into more abstract and compact representations. As long as enough data is collected, complicated functions can be learned through these transformations. Its advantage in uncovering intricate structures has been demonstrated in many studies, and thus it can be applied in many domains.

Fig. 4. Construction of a neuron in multiple ANN. (Inputs x1, …, xN are multiplied by weights and summed with bias b to form the neuron input y, which passes through the activation function to give the output z.)

A multiple artificial neural network (ANN) [35] is one of the DL methods, which is able to establish a function approximation between the inputs and outputs. It is composed of artificial neurons that form a network, which applies a mathematical model to process information. Its structure can change with the transmitted information. Four characteristics are associated with an ANN [36]: 1) An appropriate network topology is necessary for different problems to avoid model overfitting; 2) ANN is able to handle redundant features because the weights are learned automatically; 3) Training time increases along with the number of hidden layers; 4) ANN is quite sensitive to the presence of noise in the training data.

Three important layers usually exist in one ANN: the input layer, hidden layer, and output layer. The input layer defines the feature data from the raw database, which affects the variation of the output. The hidden layer connects the inputs and outputs via an activation function and is usually not visible. The output layer aggregates the information and produces the final results. For DL methods, the features of these three layers are learned from data instead of being designed by human engineers.

Neurons are the basic components of the multiple ANN, see Fig. 4 as an example. The input of a neuron is y, which is the sum of the weighted parameters (from the input layer) and a bias b. Assuming the input vector is x=[x1, x2, …, xN] and the weights are k1, k2, …, kN, the neuron input is described as

$$y = \sum_{i=1}^{N} k_i x_i + b \tag{19}$$

where N is the total number of inputs from the input layer. An activation function is used to connect the neuron input and output, and the log-sigmoid activation is adopted in this article

$$\begin{cases} z = h(y) \\ h(x) = 1 / (1 + e^{-x}) \end{cases} \tag{20}$$

Finally, the output of the multiple ANN is represented as

$$z_{all} = h^2\Big( \sum_{j=1}^{S} k^2_{1j}\, h^1\Big( \sum_{i=1}^{N} k^1_{ji} z_i + b^1_j \Big) + b^2_{all} \Big) \tag{21}$$

where $h^1$ and $h^2$ are the activation functions of the hidden layer and the output layer. $S$ indicates the total number of neurons in the hidden layer, $k^1_{ji}$ is the weight linking the i-th input and the j-th neuron in the hidden layer, and $k^2_{1j}$ represents the weight connecting the j-th neuron of the hidden layer to the output layer neuron. $b^2_{all}$ denotes the bias of the neuron in the output layer and $b^1_j$ represents the bias of the j-th neuron in the hidden layer.

In this article, the multiple ANN is utilized to connect the inputs and outputs in the energy management of the HEV. The inputs are the vehicle velocity, acceleration, power demand, and SoC values. The outputs are the seven control actions, which are the clutch states, the torque and rotational speed of the ICE, and the torques of the motor and generator. The operational diagram is shown in Fig. 5. The relevant inputs and outputs for different driving cycles are first collected. Then, this data is utilized to train an ANN model. Different types of driving cycles enable the trained model to adapt to various driving conditions.

For different driving conditions, the GA-based EMSs are different, and thus the collected data is not the same. Different driving cycles provided in Autonomie are all employed to gather data, such as NYC-Hev-Taxi, UDDS, WLTC, and JC08. The data size of each driving cycle is about 28 MB and the total size of the training data is about 1.54 GB. The trained model can guide the instantaneous control actions based on the input information. The proposed approach is evaluated in the next section by comparison with the original GA method.

IV. ANALYSIS AND DISCUSSION OF RESULTS

This section evaluates the optimality and adaptability of the proposed energy management strategy. First, a comparative analysis among the signal GA, DL combined with GA, and original charge depleting/charge sustaining (CD/CS) methods is constructed. Signal GA indicates that the control actions are searched by the original GA and do not change afterwards. The optimality is verified by comparing the related total cost in (16). Furthermore, a HIL experiment is formulated to test the online application of the presented policy. The corresponding computation time indicates that the DL and GA method can be applied in real-world environments.
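Referring back to the neuron model of Section III.B, the forward pass in (19) to (21) can be sketched in a few lines of Python. This is a minimal illustration for a single output neuron and one hidden layer, using only the standard library. The weight and bias values in the usage example are arbitrary placeholders rather than trained parameters, and the four inputs are assumed to be normalized versions of velocity, acceleration, power demand and SoC.

```python
import math


def log_sigmoid(x):
    """Log-sigmoid activation, h(x) = 1 / (1 + exp(-x)), as in (20)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(x, k, b, h=log_sigmoid):
    """One neuron: weighted sum plus bias as in (19), then the activation."""
    y = sum(k_i * x_i for k_i, x_i in zip(k, x)) + b
    return h(y)


def network_output(x, k1, b1, k2, b2):
    """Single-hidden-layer, single-output network in the form of (21).

    k1[j][i] links input i to hidden neuron j, b1[j] is that neuron's bias,
    k2[j] links hidden neuron j to the output neuron, b2 is the output bias.
    """
    hidden = [neuron(x, k1_j, b1_j) for k1_j, b1_j in zip(k1, b1)]
    return neuron(hidden, k2, b2)


if __name__ == "__main__":
    # Placeholder inputs: normalized velocity, acceleration, power demand, SoC.
    x = [0.4, 0.1, 0.3, 0.7]
    k1 = [[1.0, -2.0, 0.5, 0.3], [0.2, 0.1, -0.4, 0.9]]  # two hidden neurons
    b1 = [0.1, -0.1]
    k2 = [1.5, -0.7]
    b2 = 0.05
    print(network_output(x, k1, b1, k2, b2))
```

Since the log-sigmoid maps any real number into the interval (0, 1), a network of this form would need one such output per control action and a rescaling to the physical ranges of the torques and speed; how this is done in the trained model is not specified here.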


Fig. 5. Training process of ANN for collected data based on GA. (Collected data groups 1 to n, each containing BK1, CL1, CL2, Te, ωe, Tm and Tg together with the vehicle velocity, acceleration, power demand, and SoC values, are used to train the artificial neural network (ANN). Trained model inputs: vehicle velocity, acceleration, power demand, and SoC values. Outputs: clutch states, torque and rotational speed of the ICE, and torques of the motor and generator.)

A. Comparative Analysis

Three control approaches are evaluated on the same driving cycle to demonstrate the advantages of the proposed technique. The primary CD/CS strategy implies that the vehicle is operated as an electric vehicle before the SoC reaches the minimum threshold. After that, the vehicle runs as an HEV to maintain a constant SoC value. The signal GA policy indicates that the control parameters are trained based on an integrated driving cycle. Finally, the DL combined with GA (DL+GA) method means that a multiple ANN is leveraged to connect the inputs and outputs, wherein the inputs are the vehicle velocity, acceleration, power demand, and SoC value and the outputs are the clutch states, the torque and rotational speed of the ICE, and the torques of the motor and generator.

Fig. 6. Simulation cycle and SoC curves in three control cases. (The transition from CD to CS is marked in the figure.)

The simulation cycle and the SoC curves in the three cases are described in Fig. 6. This cycle is contained in the integrated driving cycle used for training signal GA. As the cycle is long enough, the primary method (CD/CS policy) switches from CD mode to CS mode after 1400 seconds. Then, the SoC value stays near 0.3, which means that the ICE always provides the power demand. Since the simulation cycle is included in the training data, the signal GA-based results are optimal. The SoC in DL+GA is close to that of signal GA, which indicates that the proposed method is nearly optimal. Furthermore, the control actions in DL+GA can change with the driving environment and accommodate many driving cycles as long as the training data is sufficient. Hence, the presented DL+GA can be applied in real time and achieve sub-optimal results.

Fig. 7. Torque split in different control strategies.

The torque splits in the different control approaches are depicted in Fig. 7. The motor, ICE and generator torques are regarded as control actions in the studied HEV powertrain. Hence, Fig. 7 implies that the derived control strategies differ in these three cases. In the CD/CS method, the battery supplies power to propel the vehicle at first, and then the ICE charges the battery and satisfies the power demand simultaneously. However, in the signal GA and DL+GA approaches, the ICE is activated whenever the power demand is large. The hybrid mode is utilized, wherein the battery and ICE powers are managed more reasonably to realize a lower total cost.
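To make the CD/CS baseline and the SoC dynamics in (1) more concrete, the sketch below integrates (1) under a simplified reading of the CD/CS rule: the battery covers the whole power demand until the SoC reaches the 0.3 threshold, after which the engine covers it and the battery power is set to zero. This is an illustrative assumption, not the CD/CS controller provided in Autonomie. The battery parameters (`v_oc`, `r_b`, `q_b`), the constant power demand and the time step are hypothetical placeholders, and `v_oc` and `r_b` are kept constant here although they vary with SoC in the actual model.

```python
import math


def soc_rate(p_b, v_oc, r_b, q_b):
    """SoC derivative from (1); p_b in W, v_oc in V, r_b in ohm, q_b in A*s."""
    disc = v_oc ** 2 - 4.0 * r_b * p_b
    if disc < 0.0:
        raise ValueError("requested battery power exceeds what the circuit model allows")
    return -(v_oc - math.sqrt(disc)) / (2.0 * q_b * r_b)


def cdcs_step(soc, p_demand, in_cs, soc_min=0.3):
    """One step of a simplified CD/CS rule: returns (battery W, engine W, in_cs)."""
    in_cs = in_cs or soc <= soc_min      # latch into charge sustaining at the threshold
    if in_cs:
        return 0.0, p_demand, True       # engine covers the demand, SoC is held
    return p_demand, 0.0, False          # charge depleting: battery covers the demand


def simulate(p_demand=20e3, soc0=0.9, dt=1.0, steps=4000,
             v_oc=360.0, r_b=0.1, q_b=45.0 * 3600.0):
    """Integrate (1) with forward Euler; all default values are placeholders."""
    soc, in_cs, trace = soc0, False, []
    for _ in range(steps):
        p_b, p_e, in_cs = cdcs_step(soc, p_demand, in_cs)
        soc += soc_rate(p_b, v_oc, r_b, q_b) * dt
        trace.append((soc, p_b, p_e))
    return trace


if __name__ == "__main__":
    trace = simulate()
    print("final SoC:", trace[-1][0])
```

With these placeholder values the SoC falls roughly linearly until it crosses the threshold and then stays flat, which mirrors the qualitative shape of the CD/CS curve described for Fig. 6; the switching time depends entirely on the assumed parameters.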


Fig. 8. Status of three clutches in different control cases.

Furthermore, the statuses of the three clutches (BK1, CL1, and CL2) in these three methods are shown in Fig. 8. The differences between the signal GA and DL+GA approaches (purple rectangle) show that the related control actions also differ. The rules for mode switching decide the working conditions of each energy source, which in turn determine the power split control. Hence, the clutch curves in Fig. 8 indicate that the control strategies in signal GA and DL+GA are not always the same.

Finally, the different control policies reflected by Figs. 7 and 8 result in disparate costs and computation times; see Fig. 9 for an illustration. The total costs in DL+GA and signal GA are clearly lower than that in CD/CS, and DL+GA is close to signal GA. This result supports the optimality of the presented method, since signal GA performed well in the previous work [31]. Note that signal GA is better than DL+GA in this section only because the simulation cycle in Fig. 6 is known a priori and is part of the training data of signal GA.

Fig. 9. Total cost and application time of the compared methods.

To further estimate the optimality of the proposed method, the DL+GA, signal GA, CD/CS and globally optimal DP methods are applied to several driving cycles extracted from the software Autonomie [31]. The total costs of these algorithms are listed in Table II. The driving cycles are assumed to be known in DP, so the DP-based result can be regarded as the benchmark. However, the driving cycles were unknown to DL+GA, signal GA and CD/CS, and thus this information was not included in the training data. The performance of DL+GA and signal GA clearly surpasses that of CD/CS. DL+GA is better than signal GA when the driving cycles are unknown, and it is close to DP. Unfortunately, this information is usually unknown in real-world environments. Therefore, whether the proposed approach adapts to different driving conditions should be examined further.

TABLE II
Algorithms*   ARB02     JC08     HYC-HEV   REV2
CD/CS         1.4481&   0.2731   6.2215    0.3607
Signal GA     1.1156    0.2216   5.5103    0.3248
DL+GA         1.0151    0.1996   5.3508    0.2989
DP            0.9852    0.1947   5.1178    0.2746
* A 2.90 GHz microprocessor with 7.83 GB RAM was used.
& Unit is dollar ($).

In Fig. 9, the application time denotes the time taken to apply the derived control strategy to the new simulation cycle. The training process can be implemented offline, so the related time is not considered. The time in DL+GA is longer than in signal GA because the control actions change with the driving conditions in DL+GA, whereas they are all the same in signal GA. Moreover, these calculation times are short enough that the relevant control policy could be applied in real time. The signal GA and DL+GA-based EMSs are evaluated in a HIL experiment for real-time validation in the next section.

B. HIL Validation

To evaluate the performance and adaptability of the proposed method, a HIL platform is established to apply the signal GA and DL+GA policies in real-time situations. The bench sketch of the HIL experiment is displayed in Fig. 10; it consists of two main pieces of hardware. The first is MotoTron, which stores the trained control actions, and the second is RT-Lab, which holds the powertrain model and mimics the running of the real vehicle. Both can convert Matlab/Simulink models into C code and support quick online calibration.

Fig. 10. Experiment platform of the HIL validation (blocks: MotoTron hardware, MotoHawk, upper and lower PCs of RT-Lab, powertrain model in RT-Lab).
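The Table II totals lend themselves to a quick cross-check of how far each method sits from the DP benchmark. The following is a minimal sketch, not part of the original study: the cost values are copied from Table II, while the names `CYCLES`, `TABLE_II` and `gap_to_benchmark` are introduced here purely for illustration.

```python
# Total costs in dollars, copied from Table II (rows: methods, columns: cycles).
CYCLES = ("ARB02", "JC08", "HYC-HEV", "REV2")
TABLE_II = {
    "CD/CS": (1.4481, 0.2731, 6.2215, 0.3607),
    "Signal GA": (1.1156, 0.2216, 5.5103, 0.3248),
    "DL+GA": (1.0151, 0.1996, 5.3508, 0.2989),
    "DP": (0.9852, 0.1947, 5.1178, 0.2746),
}


def gap_to_benchmark(costs, benchmark="DP"):
    """Percentage by which each method's cost exceeds the benchmark, per cycle."""
    ref = costs[benchmark]
    return {
        method: {
            cycle: 100.0 * (c - r) / r
            for cycle, c, r in zip(CYCLES, row, ref)
        }
        for method, row in costs.items()
        if method != benchmark
    }


gaps = gap_to_benchmark(TABLE_II)
for method, row in gaps.items():
    print(method, {cycle: f"{g:.1f}%" for cycle, g in row.items()})
```

On every cycle the computed gap of DL+GA is smaller than that of signal GA, which in turn is smaller than that of CD/CS, in line with the ranking stated in the text.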


For the control strategy part, the matching software is installed on the upper personal computer (PC) of MotoTron. MotoHawk is embedded in Simulink and provides multiple modules to record the control policy, wherein the inputs, outputs, and control logic are defined. MotoTune is used to download the related control strategy into the hardware and to display the real-time values of selected parameters.

For the powertrain modeling part, two PCs are included in RT-Lab. The upper PC stores the powertrain model of the Chevrolet Volt [29], whose inputs are the control actions and whose outputs are the powertrain states. The lower PC exchanges information with the MotoTron through the reticle. The transmitted information between the lower PC and MotoTron comprises the state variables, power demand, vehicle speed, acceleration and control actions. In RT-Lab, the powertrain is assumed to be operated in real-world environments and the sampling time is [...]

After building these two parts, the signal GA and DL+GA-enabled control strategies can be applied in real time. The verified driving cycle and SoC trajectories in these two cases are shown in Fig. 11. To evaluate the adaptability of the proposed method, the simulation cycle combines two parts: the first is a standard cycle from Autonomie (named Unif01) that is used in the training process, and the second is a real collected cycle that is unknown to both methods. The SoC variations are different, and thus the feedback control actions from the controller are not the same. Because DL is used to train the control actions under many driving conditions, the DL+GA-based control strategy can be tuned to accommodate the current driving cycle. Hence, the power split between the battery and ICE can be managed reasonably according to the driving situation, which illustrates the adaptability of the presented approach.

Fig. 11. Combined driving cycle and SoC trajectories in HIL experiment (standard cycle followed by real cycle, with the change to the unknown cycle marked).

According to the definition of operation modes in Section II.B, the ICE does not work in Electric Mode 1 and 2. Conversely, it works in Range-extender Mode and Hybrid Mode. The ICE speeds of signal GA and DL+GA are depicted in Fig. 12. The speeds in these two methods clearly differ, e.g., from 1000 to 1250 s. In signal GA, the speed is zero, which means the operation mode of the powertrain is Electric Mode 1 or 2. However, the operation mode is Range-extender Mode or Hybrid Mode in DL+GA. This difference leads to different power split controls of the energy sources. The performance of these two control policies is reflected by the fuel cost, electricity cost and total cost in (16). After running the driving cycle in Fig. 11, the fuel, electricity and total costs in DL+GA are $0.4924, $0.25 and $0.7424, respectively. The corresponding costs in signal GA are $0.5891, $0.2902 and $0.8793. Comparing the total costs of the two methods, DL+GA is 15.57% lower than signal GA, which implies that the proposed method achieves better performance when the driving cycle is unknown. For online application, the training processes are completed beforehand in both techniques. Thus, the proposed method may be a better choice in real-time environments.

Fig. 12. ICE speed of signal GA and DL+GA in HIL experiment.

To accelerate the online application of the advanced energy management system in plug-in hybrid electric vehicles, this article employs DL and GA to formulate an adaptive hierarchical EMS for the Chevrolet Volt. This paper improves the signal GA method (the previous work [31]) by adding multiple ANNs to connect the state variables and control actions in different driving conditions. Simulation results first assess the optimality of the presented method, indicating that the total cost ($1.4209) of DL+GA is close to that of signal GA ($1.2323) and lower than that of CD/CS ($1.7254). The HIL experiment further demonstrates the adaptability of the proposed method, which means the derived control policy could be applied in real time easily and achieve better performance.
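The cost figures reported for the HIL experiment and the simulation can be cross-checked with a few lines of arithmetic. This is only an illustrative sketch: the dollar values are those quoted in the text, the helper `relative_saving` is a hypothetical name introduced here, and the total is assumed to be the sum of fuel and electricity costs, which is consistent with the reported totals but is not taken from Eq. (16) itself.

```python
def relative_saving(cost, baseline):
    """Percentage by which `cost` is lower than `baseline`."""
    return 100.0 * (baseline - cost) / baseline


# HIL experiment costs in dollars, as quoted in the text.
dl_ga = {"fuel": 0.4924, "electricity": 0.25}
signal_ga = {"fuel": 0.5891, "electricity": 0.2902}

# Assumption: total cost = fuel cost + electricity cost.
total_dl_ga = sum(dl_ga.values())
total_signal_ga = sum(signal_ga.values())

# Saving of DL+GA relative to signal GA on the combined HIL cycle.
hil_saving = relative_saving(total_dl_ga, total_signal_ga)
print(f"HIL: DL+GA total ${total_dl_ga:.4f}, signal GA total ${total_signal_ga:.4f}")
print(f"HIL: DL+GA is {hil_saving:.2f}% lower than signal GA")

# Simulation totals quoted for DL+GA and CD/CS.
sim_saving = relative_saving(1.4209, 1.7254)
print(f"Simulation: DL+GA is {sim_saving:.2f}% lower than CD/CS")
```

Under that assumption the totals come out as $0.7424 and $0.8793, and the relative saving rounds to the 15.57% stated for the HIL experiment.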


Future work will focus on three directions: 1) apply the EMS of this article to a real HEV and improve its effectiveness by adding reinforcement learning techniques; 2) formulate a more efficient EMS based on the information provided by the intelligent transportation system, so that the efficiency and safety of HEVs in the network can be improved simultaneously through information sharing; 3) employ more advanced algorithms to derive human-like EMSs for plug-in HEVs. For example, inverse reinforcement learning (IRL) is a promising method to train a human-like EMS by first providing the expert trajectory.

REFERENCES
[1] T. Liu, and X. Hu, “A bi-level control for energy efficiency improvement of a hybrid tracked vehicle,” IEEE Trans. Ind. Informat., vol. pp, no. 99, pp. 1-1, 2018.
[2] X. Tang, D. Zhang, T. Liu, A. Khajepour, H. Yu, and H. Wang, “Research on the energy control of a dual-motor hybrid vehicle during engine start-stop process,” Energy, vol. 166, pp. 1181-1193, 2019.
[3] Y. Zou, T. Liu, D. X. Liu, and F. C. Sun, “Reinforcement learning-based real-time energy management for a hybrid tracked vehicle,” Appl. Energy, vol. 171, pp. 372-382, 2016.
[4] H. Wang, Y. Huang, and A. Khajepour, “Cyber-Physical Control for Energy Management of Off-road Vehicles with Hybrid Energy Storage Systems,” IEEE/ASME Trans. Mechatronics, DOI: 10.1109/TMECH.2018.2832019.
[5] Z. Chen, L. Li, X. Hu, B. Yan, and C. Yang, “Temporal-Difference Learning-Based Stochastic Energy Management for Plug-in Hybrid Electric Buses,” IEEE Transactions on Intelligent Transportation Systems, 2018, DOI: 10.1109/TITS.2018.2869731.
[6] Y. Zou, Z. Kong, and T. Liu, “A real-time Markov chain driver model for tracked vehicles and its adaptability via stochastic dynamic programming,” IEEE Trans. Veh. Technol., vol. 13, no. 2, pp. 8-16, 2016.
[7] H. Wang, Y. Huang, A. Khajepour, A. Soltani, and D. Cao, “Cyber-Physical Predictive Energy Management for Through-The-Road Hybrid Vehicles,” IEEE Trans. Veh. Technol., 2019.
[8] D. Zhou, A. Al-Durra, I. Matraji, A. Ravey, and F. Gao, “Online energy management strategy of fuel cell hybrid electric vehicles: a fractional-order extremum seeking method,” IEEE Transactions on Industrial Electronics, vol. 65, no. 8, pp. 6787-6799, 2018.
[9] I. Ali, M. Turki, J. Belhadj, and X. Roboam, “Optimized fuzzy rule-based energy management for a battery-less PV/wind-BWRO desalination system,” Energy, vol. 159, pp. 216-228, 2018.
[10] Y. Liu, J. Gao, D. Qin, Y. Zhang, and Z. Lei, “Rule-corrected energy management strategy for hybrid electric vehicles based on operation-mode prediction,” Journal of Cleaner Production, vol. 188, pp. 796-806, 2018.
[11] T. Liu, Y. Zou, and D. Liu, “Energy management for battery electric vehicle with automated mechanical transmission,” International Journal of Vehicle Design (IJVD), vol. 70, no. 1, pp. 98-112, 2016.
[12] Y. Zou, T. Liu, F. Sun, and H. Peng, “Comparative study of dynamic programming and Pontryagin’s minimum principle on energy management for a parallel hybrid electric vehicle,” Energies, vol. 6, no. 4, pp. 2305-2318, 2013.
[13] T. Liu, H. Yu, H. Guo, Y. Qin, and Y. Zou, “Online energy management for multimode plug-in hybrid electric vehicles,” IEEE Trans. Ind. Informat., DOI: 10.1109/TII.2018.2880897.
[14] Z. Song, J. Hou, and H. Hofmann, et al., “Sliding-mode and Lyapunov function-based control for battery/supercapacitor hybrid energy storage system used in electric vehicles,” Energy, vol. 122, pp. 601-612, 2017.
[15] C. Martinez, M. Heucke, and F. Y. Wang, “Driving style recognition for intelligent vehicle control and advanced driver assistance: A Survey,” IEEE Trans. Intell. Transp. Syst., vol. 19, no. 3, pp. 666-676, Aug. 2017.
[16] H. Viot, A. Sempey, L. Mora, J. Batsale, and J. Malvestio, “Model predictive control of a thermally activated building system to improve energy management of an experimental building: Part II-Potential of predictive strategy,” Energy and Buildings, vol. 172, pp. 385-396, 2018.
[17] Y. Zeng, Y. Cai, G. Kou, W. Gao, and D. Qin, “Energy management for plug-in hybrid electric vehicle based on adaptive simplified-ECMS,” Sustainability, vol. 10, no. 6, 2060, 2018.
[18] D. Zhou, and A. Ravey, “A comparative study of extremum seeking methods applied to online energy management strategy of fuel cell hybrid electric vehicles,” Energy Conversion and Management, vol. 151, pp. 778-790, 2017.
[19] X. Wu, X. Hu, X. Yin, and S. Moura, “Stochastic optimal energy management of smart home with PEV energy storage,” IEEE Trans. Smart Grid, vol. 9, no. 3, pp. 2065-2075, 2018.
[20] T. Liu, X. Hu, S. Li, and D. Cao, “Reinforcement learning optimized look-ahead energy management of a parallel hybrid electric vehicle,” IEEE/ASME Trans. Mechatronics, vol. 22, no. 4, pp. 1497-1507, 2017.
[21] E. Foruzan, L. Soh, and S. Asgarpoor, “Reinforcement learning approach for optimal distributed energy management in a microgrid,” IEEE Trans. Power Syst., vol. 33, no. 5, pp. 5749-5758, 2018.
[22] T. Liu, Y. Zou, D. Liu and F. C. Sun, “Reinforcement learning of adaptive energy management with transition probability for a hybrid electric tracked vehicle,” IEEE Trans. Ind. Electron., vol. 62, no. 12, pp. 7837-7846, 2015.
[23] T. Liu, X. Hu, W. Hu, Y. Zou, “A heuristic planning reinforcement learning-based energy management for power-split plug-in hybrid electric vehicles,” IEEE Transactions on Industrial Informatics, DOI: 10.1109/TII.2019.2903098.
[24] T. Liu, B. Wang, and C. Yang, “Online Markov Chain-based energy management for a hybrid tracked vehicle with speedy Q-learning,” Energy, vol. 160, pp. 544-555, 2018.
[25] R. Langari, and J. Won, “Intelligent energy management agent for a parallel hybrid vehicle-part I: system architecture and design of the driving situation identification process,” IEEE Trans. Veh. Technol., vol. 54, no. 3, pp. 925-934, 2005.
[26] J. Won, and R. Langari, “Intelligent energy management agent for a parallel hybrid vehicle-part II: torque distribution, charge sustenance strategies, and performance results,” IEEE Trans. Veh. Technol., vol. 54, no. 3, pp. 935-953, 2005.
[27] Y. Hu, W. Li, K. Xu, T. Zahid, F. Qin, and C. Li, “Energy management strategy for a hybrid electric vehicle based on deep reinforcement learning,” Applied Sciences, vol. 8, no. 2, 187, 2018.
[28] P. Zhao, Y. Wang, N. Chang, Q. Zhu, and X. Lin, “A deep reinforcement learning framework for optimizing fuel economy of hybrid electric vehicles,” in Proc. 2018 IEEE Design Automation Conference (ASP-DAC), pp. 196-202, Jan. 2018.
[29] C. Depature, S. Pagerit, L. Boulon, et al., “IEEE VTS motor vehicles challenge 2018-energy management of a range extender electric vehicle,” in Proc. IEEE Vehicle Power and Propulsion Conference (VPPC), Chicago, Illinois, USA, Aug. 2018.
[30] C. Depature, S. Jemei, L. Boulon, A. Bouscayrol, N. Marx, S. Morando, and A. Castaings, “Energy management in fuel-cell/battery vehicles: key issues identified in the IEEE vehicular technology society motor vehicle challenge 2017,” IEEE Vehicular Technology Magazine, vol. pp, no. 99, June 2018.
[31] T. Liu, H. Yu, and X. Hu, “Robust energy management strategy for a range extender electric vehicle via genetic algorithm,” in Proc. IEEE Vehicle Power and Propulsion Conference (VPPC), Chicago, Illinois, USA, Aug. 2018.
[32] S. Onori, and L. Tribioli, “Adaptive Pontryagin’s minimum principle supervisory controller design for the plug-in hybrid GM Chevrolet Volt,” Appl. Energy, vol. 147, no. 1, pp. 224-234, 2015.
[33] M. Wieczorek, and M. Lewandowski, “A mathematical representation of an energy management strategy for hybrid energy storage system in electric vehicle and real time optimization using a genetic algorithm,” Appl. Energy, vol. 192, pp. 222-233, 2017.
[34] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436-444, May 2015.
[35] A. Afram, F. Janabi-Sharifi, A. Fung, and K. Raahemifar, “Artificial neural network (ANN) based model predictive control (MPC) and optimization of HVAC systems: A state of the art review and case study of a residential HVAC system,” Energy and Buildings, vol. 141, pp. 96-113, 2017.
[36] S. Shanmuganathan, “Artificial neural network modelling: An introduction,” in Artificial neural network modelling, pp. 1-14, Springer, Cham, 2016.


Teng Liu (M’2018) received the B.S. degree in mathematics from Beijing Institute of Technology, Beijing, China, in 2011. He received his Ph.D. degree in automotive engineering from Beijing Institute of Technology (BIT), Beijing, in 2017. His Ph.D. dissertation, under the supervision of Prof. Fengchun Sun, was entitled “Reinforcement learning-based energy management for hybrid electric vehicles.” He worked as a research fellow in Vehicle Intelligence Pioneers Ltd for one year. He is now a member of IEEE VTS, IEEE ITS, IEEE IES, IEEE TEC and IEEE/CAA.
Dr. Liu is now a postdoctoral fellow at the Department of Mechanical and Mechatronics Engineering, University of Waterloo, Ontario N2L3G1, Canada. Dr. Liu has more than 8 years’ research and working experience in renewable vehicles and connected autonomous vehicles. His current research focuses on reinforcement learning (RL)-based energy management in hybrid electric vehicles, RL-based decision making for autonomous vehicles, and CPSS-based parallel driving. He has published over 30 SCI papers and 10 conference papers in these areas. He received the Merit Student of Beijing in 2011, the Teli Xu Scholarship (Highest Honor) of Beijing Institute of Technology in 2015, “Top 10” in the 2018 IEEE VTS Motor Vehicle Challenge and sole outstanding winner in the 2018 ABB Intelligent Technology Competition. Dr. Liu was a workshop co-chair of the 2018 IEEE Intelligent Vehicles Symposium (IV 2018) and has been a reviewer for multiple SCI journals, including IEEE Trans. Industrial Electronics, IEEE Trans. on Intelligent Vehicles, IEEE Trans. Intelligent Transportation Systems, IEEE Transactions on Systems, Man, and Cybernetics: Systems, IEEE Transactions on Industrial Informatics, and Advances in Mechanical Engineering.

Xiaolin Tang received the B.S. degree in Mechanics engineering and the M.S. degree in Vehicle Engineering from Chongqing University, China, in 2006 and 2009, respectively. He received the Ph.D. degree in Mechanical Engineering from Shanghai Jiao Tong University, China, in 2015. He is currently an Assistant Professor at the State Key Laboratory of Mechanical Transmissions and at the Department of Automotive Engineering, Chongqing University, Chongqing, China. His research focuses on Hybrid Electric Vehicles (HEVs), Vehicle Dynamics, Noise and Vibration, and Transmission.

Hong Wang is currently a Research Associate of Mechanical and Mechatronics Engineering with the University of Waterloo. She received her Ph.D. degree from Beijing Institute of Technology, China, in 2015. She is an IEEE/CAA member. Her research focuses on path planning control and ethical decision making for autonomous vehicles; component sizing, modeling of hybrid powertrains and design of power management control strategies for hybrid electric vehicles; and intelligent control theory and application.

Huilong Yu (M’17) received the M.Sc. degree in mechanical engineering from Beijing Institute of Technology, and the Ph.D. degree in mechanical engineering from Politecnico di Milano, Milano, Italy, in 2013 and 2017, respectively. He is now a Research Fellow of advanced vehicle engineering with the University of Waterloo, Waterloo, CA. His research interests include vehicle dynamics, optimal control, closed loop control, and energy management problems of conventional, electric, hybrid electric and autonomous vehicles.

Xiaosong Hu (SM’16) received the Ph.D. degree in Automotive Engineering from Beijing Institute of Technology, China, in 2012. He did scientific research and completed the Ph.D. dissertation in the Automotive Research Center at the University of Michigan, Ann Arbor, USA, between 2010 and 2012. He is currently a professor at the State Key Laboratory of Mechanical Transmissions and at the Department of Automotive Engineering, Chongqing University, Chongqing, China. He was a postdoctoral researcher at the Department of Civil and Environmental Engineering, University of California, Berkeley, USA, between 2014 and 2015, as well as at the Swedish Hybrid Vehicle Center and the Department of Signals and Systems at Chalmers University of Technology, Gothenburg, Sweden, between 2012 and 2014. He was also a visiting postdoctoral researcher in the Institute for Dynamic Systems and Control at Swiss Federal Institute of Technology (ETH), Zurich, Switzerland, in 2014. His research interests include modeling and control of alternative powertrains and energy storage systems.
Dr. Hu has been a recipient of several prestigious awards/honors, including World Emerging Sustainability Leaders Award in 2016, EU Marie Curie Fellowship in 2015, ASME DSCD Energy Systems Best Paper Award in 2015, and Beijing Best Ph.D. Dissertation Award in 2013.
