Energy: Shradhdha Sarvaiya, Sachin Ganesh, Bin Xu

Energy 228 (2021) 120604
Comparative analysis of hybrid vehicle energy management strategies with optimization of fuel economy and battery life
Shradhdha Sarvaiya 1, Sachin Ganesh 1, Bin Xu*
Clemson University, Department of Automotive Engineering, 4 Research Dr, Greenville, SC, 29607, USA
Article info

Article history:
Received 19 November 2020
Received in revised form 5 April 2021
Accepted 6 April 2021
Available online 17 April 2021

Keywords:
Hybrid electric vehicles
Energy management strategy
Battery aging

Abstract

Hybrid Electric Vehicles (HEVs) form an important category of the automotive segment and, most importantly, fill the transition between conventional vehicles powered by internal combustion engines and vehicles powered by electric motors. One of the main selling points of HEVs is improved fuel economy over conventional vehicles. In recent years, considering the longevity of HEV usage, battery life evaluation has been brought to the forefront of research along with fuel economy, and the key method to slow down battery aging is the Energy Management Strategy (EMS). This research paper presents a comparative analysis of battery life optimization with different control strategies in a parallel hybrid vehicle. In the available research work, EMS considering battery aging is still lacking. This research work considers the impact of multiple parameters, including temperature and current, on battery aging, providing more accurate battery life prediction. Four different control strategies are analyzed, namely Thermostat, Fuzzy logic, Adaptive Equivalent Consumption Minimization Strategy (A-ECMS) and Q-learning considering battery aging. Results are compared with respect to battery aging and fuel economy. In this research, ECMS results show a 25% improved fuel economy compared to the rule-based strategy. Also, a market cost analysis is presented to show the monetary savings for each energy management strategy.
© 2021 Elsevier Ltd. All rights reserved.
1. Introduction

With the increasing efforts of reducing global climatic change and greenhouse gases (GHG), the automotive industry is experiencing a paradigm shift from conventional internal combustion engines to electric vehicles [1]. Between internal combustion (IC) engines and electric vehicles lies a major band of powertrain propulsion systems, namely HEVs, which combine the advantages of both powertrain systems.

HEVs have the unique ability to split the driver's torque requirement between two distinct powertrain systems, namely the engine and the electric machine. This feature provides high flexibility for powertrain operation through interchangeable torque demands between the two systems. Also, the hybrid powertrain system offers complex energy conversion and management functionalities such as regenerative braking, engine shut down and power assist [2]. These abilities demand a dedicated Energy Management Strategy (EMS) which can make decisions on the power split, based on the defined objective functions regarding fuel consumption and vehicle performance.

Over the years, several EMS have been introduced to achieve optimum power split at different driving conditions. In Ref. [3] the authors have compared a hybrid electrified vehicle's performance with different energy management strategies including Dynamic Programming (DP), Pontryagin's minimum principle (PMP), and ECMS. In Ref. [4] an adaptive fuzzy logic-based control strategy is developed for an HEV in which the power split is decided based on the battery SOC and the road grade information acquired through Geographical Information System (GIS) maps. While Dynamic Programming offers the global optimum performance in terms of fuel economy and emissions, it is computationally heavy and requires prior knowledge of the driving cycle, making it less suitable for online implementation in the vehicle. For online implementation, strategies such as rule-based control, Thermostat and ECMS are more suitable as they do not require large computation cost or future information of the driving cycle.

Among the components in an HEV, the battery contributes to a major part of the vehicle cost. Yet, when it comes to the existing

* Corresponding author.
E-mail address: xbin@clemson.edu (B. Xu).
1 These authors contributed equally to this work.
https://doi.org/10.1016/j.energy.2021.120604
0360-5442/© 2021 Elsevier Ltd. All rights reserved.
literature on HEV control strategies, most of the research work is focused on achieving the best fuel economy. In Ref. [3] the authors have compared a PHEV's (Plug-in Hybrid Electric Vehicle) performance in real world conditions with control strategies including PMP and genetic algorithm based fuzzy logic, where the performance of the PHEV is evaluated with respect to fuel consumption and emissions, but the effect on Lithium-ion (Li-ion) battery life, which is a critical performance parameter for an HEV, is not analyzed. In Ref. [4] the authors have developed a fuzzy logic control strategy for the power split in a PHEV, and the split is decided by predicting and recognizing the traffic beforehand. The underlying effect of this control strategy is reduced fuel consumption and emissions. In Ref. [5] the authors have developed an energy management strategy based on the genetic algorithm to reduce fuel consumption and emissions. In Ref. [6] the authors have proposed a rule-based control strategy formulated over the energy consumption over the trip and driving information. The performance of the controller is evaluated based on fuel consumption and the number of engine start-stops. In all of this literature, Li-ion battery performance or battery life is not considered in the energy management strategy. Only in recent years have researchers laid emphasis on the importance of tailoring control strategies to improve battery life in HEVs. Therefore, this paper is focused on the development of control strategies for HEVs, with significant importance given to simultaneous optimization of fuel economy and battery life. Also, a cost analysis is performed for all the control strategies to operate the vehicle in Hybrid mode to give a better idea of the economic impact of each of these strategies.

Wang et al. [7] describe the damage accumulation model to predict the battery life, which is a function of temperature, depth of discharge (DOD) and ampere-hour throughput of the battery. The relation derived between battery capacity loss and these parameters is based on tests conducted in laboratory conditions, where the test objects are subjected to a controlled environment and a standard test profile. However, this scenario does not replicate real-world driving conditions. In Ref. [8] the optimization of battery life and fuel economy has been studied through a dynamic programming-based energy management strategy, and the cost function of the strategy includes battery power consumption as a battery life optimization parameter. Similarly, Ref. [9] solves the multi-objective problem of enhancing battery life and fuel economy in an HEV, and the cost function in their dynamic programming strategy includes battery replacement cost as one of the control parameters. However, the effect of temperature, which is a key parameter contributing to battery aging [10], is treated as a constant parameter in both studies, although it varies in the actual operating condition. In Ref. [11], a supervisory control strategy is proposed to solve the power-split optimization problem in an HEV, but the strategy considers only state-of-charge (SOC) variation to split the power, without taking the impact of temperature on battery aging into consideration. Physical-chemical modelling provides a more accurate observation of battery life degradation, based on solid electrolyte interface (SEI) layer growth. In Ref. [12], the electrochemical model-based degradation identification method is presented. This method requires detailed knowledge of battery chemistry and longer computation time. In the previous research [13], the proposed Q-learning strategy considers optimizing only fuel consumption in an HEV; the battery degradation over the drive cycle is not considered. In summary, consideration of battery degradation in HEV EMS is still lacking. Different control strategies considering battery degradation are compared in this research work. The impact of each control strategy is discussed in detail and the results are supported by the trend observed in the engine and electric machine efficiency operating points, battery SOC, operating temperature, fuel consumption and battery capacity loss. The purpose of the analysis is to study the impact of different control strategies on vehicle fuel economy and battery life. The contribution of this study is as follows.

(1) Four different control strategies, including the very widely used fuzzy logic controller, ECMS, Thermostat controller and Q-learning, are discussed in this paper. Further, the models are validated with Environmental Protection Agency (EPA) mandated driving cycles including the Urban Dynamometer Driving Schedule (UDDS) and the Highway Fuel Economy Test (HWFET) to simulate the on-road driving scenario in city and highway driving conditions.
(2) In this research work, a semi-empirical battery aging model is used for battery degradation estimation. This model is less accurate than the electrochemical model, but it is computationally faster and easier to integrate with battery management systems [10]. To estimate the battery capacity loss more accurately, the temperature is considered as a dynamic variable in each control strategy, eliminating the constant temperature assumption found in the literature. A secondary control layer is added to the normal rule-based and ECMS control strategies to show the in-depth impacts of battery temperature control to decelerate aging.
(3) To overcome the drawbacks in gearshift, a quasi-static cost-based gear controller is implemented and the improvements in fuel economy are analyzed.

The rest of the paper is organized as follows. The vehicle modelling, including the vehicle layout, is described in Section 2. The subsections in Section 2 explain the IC engine and electric motor-generator models, the driver model and the transmission model. At the end of Section 2, a simulation model is described which is used in MATLAB and Simulink for simulation of the HEV. The battery model is explained in Section 3, followed by the battery equivalent circuit model and the battery aging models in the consecutive subsections. All the control strategies are discussed in Section 4, followed by the introduction of energy management strategies used to improve the vehicle performance in terms of battery life and fuel economy. Section 5 explains the simulation results, compares each control strategy with the baseline strategy (Thermostat) and discusses the trade-off between the vehicle fuel economy and the battery life. The argument is supported with the battery capacity loss over 100 cycles and battery fuel consumption for better understanding of the trend. At the end of Section 5, a cost analysis is given for all the control strategies to operate the vehicle in Hybrid mode. The paper ends with the conclusion in Section 6, stating the results of this analysis.

2. Vehicle model

2.1. Vehicle architecture

The analysis conducted in this paper is for a four-wheel drive PHEV with a total weight of 1636 kg. The vehicle powertrain components are shown in Fig. 1. The front wheels are driven by the IC engine with a peak power output of 63 kW and the rear wheels are driven by the electric motor-generator pair producing a peak power of 22 kW. The engine is coupled to a five-speed automatic gearbox with gear ratios 5.5, 3.8, 2.2, 1.5 and 0.8. Between the gearbox and the front wheels, there is a final differential with a drive ratio of 4.13. In the rear, the electric machine is powered by a 9 kWh Li-ion battery pack, whose specifications are discussed in detail in Section 3, and is coupled to the rear axle through a constant gearbox having a gear ratio of 6.45 with a transmission efficiency of 90%. The motor provides traction power to the vehicle during acceleration, and it
Fig. 1. Block diagram representation of a hybrid electric vehicle.

regenerates during braking events.

Component                          Parameter                            Value
Vehicle                            Curb weight                          1636 kg
Internal Combustion Engine (ICE)   Maximum power                        63 kW
                                   Maximum torque                       115 Nm
Electric Motor-Generator           Maximum power                        22 kW
                                   Maximum torque                       100 Nm
Li-ion battery pack                Pack energy                          9 kWh
                                   Cell type and capacity               A123 (ANR 26650), 2.5 Ah
5-speed automatic transmission     Transmission gear ratios             5.5, 3.8, 2.2, 1.5, 0.8
                                   Electric motor to wheel gear ratio   6.45
                                   Differential gear ratio              4.13

2.2. Driver model

In this paper, the controller has been tuned for the UDDS and HWFET cycles to ensure versatility over both city and highway driving conditions. For the current model, the controller gains are $K_p = 0.47$, $K_i = 0.003$ and $K_d = 0.07$. The output of the controller, $U(t)$, is normalized in the range [-1, 1] and the $\alpha$ and $\beta$ values are calculated according to the equations,

\[ \alpha(t) = \begin{cases} U(t) & \forall\, U(t) > 0 \\ 0 & \forall\, U(t) \le 0 \end{cases} \tag{2.2.1} \]

\[ \beta(t) = \begin{cases} 0 & \forall\, U(t) \ge 0 \\ U(t) & \forall\, U(t) < 0 \end{cases} \tag{2.2.2} \]

2.3. Transmission model

A cost-based controller has been developed to assist in gearshift, promoting vehicle performance and minimum fuel consumption. At any given instant, t, the controller is designed to choose the gear based on two conflicting objectives, maximizing torque output and minimizing fuel consumption, according to the cost equation shown below.

\[ J(t) = \max \sum_{i=1}^{5} \left[ \tau_{wheel,i}(t) + \frac{1}{\dot{m}_f(t)_{engine,i}} \right] \tag{2.3.1} \]

where i = 1 to 5 is the gear number, and $\tau_{wheel,i}(t)$ and $\dot{m}_f(t)_{engine,i}$ are the engine torque at the wheel and the engine fuel consumption at the ith gear ratio. The control strategies discussed in this paper, Thermostat, Fuzzy logic and Q-learning, employ this gear shifting controller, Eq. (2.3.1). Since ECMS already employs a local minimization function for the selection of engine and motor operating points, a comparison between this cost-based gear controller and the ECMS inbuilt gearshift controller has been analyzed in this paper.

To evaluate the performance of the cost-based gearshift controller, a comparison was made using the Fuzzy logic control strategy. The cost-based controller was compared to an ordinary speed and torque-based controller. The former operates based on Eq. (2.3.1), maximizing the torque and minimizing the fuel consumption at every time step, while the latter switches the gear by comparing the current vehicle speed and torque demands with predefined threshold values. While the ordinary controller was able to achieve 49 MPG, the cost-based controller was able to attain a fuel economy of 54 MPG. Thus, the cost-based gear control strategy shows 8.16% fuel economy improvement compared to the ordinary torque and speed-based controller. Figs. 2 and 3 show the comparison of engine operating points for the two gearshift controllers. The main reason for this improvement in fuel economy is that, with cost-based control, the engine operating points are more concentrated in the 2500 to 4500 RPM range, which is the optimum speed range for best fuel economy, and on the maximum torque (high efficiency) line (Fig. 3). With the torque and speed controller, by contrast, the operating points are scattered over the entire map, leading to more operation in the low efficiency region (Fig. 2) and increased fuel consumption.

Fig. 2. Engine Operating points comparison for Speed and Torque based gearshift control.
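The gear choice described by Eq. (2.3.1) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' Simulink controller: the function name `select_gear`, the per-gear input lists, and the lossless conversion of engine torque to wheel torque (engine torque times gear ratio times final drive ratio) are hypothetical. Only the gear ratios and the final drive ratio of 4.13 are taken from the text, and the two terms are added exactly as in the equation, without weighting or unit normalization.

```python
GEAR_RATIOS = [5.5, 3.8, 2.2, 1.5, 0.8]  # five-speed gearbox ratios from the text
FINAL_DRIVE = 4.13                       # differential ratio from the text


def select_gear(engine_torque_nm, fuel_rate):
    """Return (gear, cost) for the 1-based gear with the highest cost term.

    engine_torque_nm[i] and fuel_rate[i] are the engine torque and fuel rate
    the engine would have in gear i+1 (hypothetical inputs, for example from
    an engine map lookup). Fuel rates are assumed strictly positive.
    """
    best_gear, best_cost = None, float("-inf")
    for i, ratio in enumerate(GEAR_RATIOS):
        # Assumption of this sketch: no driveline losses between engine and wheel.
        wheel_torque = engine_torque_nm[i] * ratio * FINAL_DRIVE
        # Per-gear term of Eq. (2.3.1): wheel torque plus inverse fuel rate.
        cost = wheel_torque + 1.0 / fuel_rate[i]
        if cost > best_cost:
            best_gear, best_cost = i + 1, cost
    return best_gear, best_cost
```

Evaluating all five gears at each time step and keeping the best one mirrors the quasi-static character of the controller: no gear history or shift penalty enters the decision in this sketch.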
Fig. 3. Engine Operating points comparison for Fuel Consumption Cost based controller. The Cost-based gear control strategy shows 8.16% fuel economy improvement compared to the ordinary torque and speed-based controller.

3. Battery model

3.1. Battery pack construction

The Reserve Energy Storage System (RESS) is modelled as a 9 kWh Li-ion battery pack containing 1105 Li-ion cells. The cells used in this model are A123 Systems high-performance lithium iron phosphate (LiFePO4) cells. Each cell has a rated nominal capacity of 2.5 Ah and a nominal voltage of 3.3 V. The battery pack is designed as 17 series cell modules (SCM), each consisting of 65 cells connected in series.

3.2. Cell modelling

For modelling the resistance of the battery pack at different operating conditions, an equivalent circuit method has been used to identify the circuit parameters, using a zeroth-order model. A simple representation of a zeroth-order model is shown in Fig. 4. The battery voltage for the circuit shown in Fig. 4 is governed by the equation,

\[ V_{batt} = V_O - I R_1 \tag{3.2.1} \]

Fig. 4. Equivalent circuit model for battery.

where $V_{batt}$ is the battery voltage under load, $V_O$ is the battery open circuit voltage, $I$ is the overall battery current and $R_1$ is the battery internal resistance.

The cell open circuit voltage and internal resistance are calculated as a function of SOC and temperature. From the charge and discharge characteristic curves provided by the manufacturer, the voltage vs SOC points are collected. The coefficients are calculated by polynomial curve fitting, where the polynomials are,

\[ V_O = a_0 + a_1\,SOC + a_2\,SOC^2 + a_3\,T \tag{3.2.2} \]

\[ R = b_0 + b_1\,SOC + b_2\,SOC^2 + b_3\,T \tag{3.2.3} \]

where $a_0, a_1, \ldots, a_3$ and $b_0, b_1, \ldots, b_3$ are the coefficients calculated using experimental data and $T$ is the cell temperature in Kelvin. After the calibration, the internal resistance is shown in Fig. 5.

Fig. 5. Cell Internal Resistance vs SOC at different temperatures, the overall resistance reduces with the increasing temperature.

3.3. State of charge (SOC) estimation model

For battery pack simulation, it is crucial to model the energy stored in the Reserve Energy Storage System (RESS). For this purpose, many methods have been identified so far in the literature, such as the Open Circuit Voltage (OCV) method, the Electromotive Force (EMF) method, the Coulomb counting method and the internal resistance method [1]. Of these methods, the Coulomb counting method has proven to be simple to implement and less memory intensive [13], thereby making it the obvious choice when it comes to online implementation of the control system. The main governing equation for SOC estimation using the Coulomb counting model is given by,

\[ SOC(t) = 1 - \frac{\int_0^t I(q)\,dq}{C_n} \tag{3.3.1} \]

where $I$ is the battery current and $C_n$ is the battery nominal capacity.

3.4. Cell temperature model

In this paper, the cell temperature is calculated based on a lumped parameter assumption [14]. The thermal model has two heat transfer components, the heat generated by the cell's internal resistance and the convective heat loss between the cell and the surroundings. The overall heat transfer equation is given as,

\[ mc_{cell}\,\frac{dT_{cell}}{dt} = I^2 R_{int} - A h\,(T - T_{amb}) \tag{3.4.1} \]

where $mc_{cell}$ is the thermal mass of the cell, $T$ is the cell
temperature in K, $I$ is the charge/discharge current through the cell (A), $A$ is the surface area of the cell exposed to convection (m^2), $h$ is the convective heat transfer coefficient (W/m^2 K) and $T_{amb}$ is the ambient environment temperature (303 K).

3.5. Battery aging estimation

Battery life is classified into two terms, calendar life and battery cycle life. In this research work, multiple control strategies are discussed to increase the battery cycle life. A semi-empirical battery aging model from Refs. [11,13] is adopted for battery capacity degradation calculation. This model is suitable for control applications where little computation time is available and the model is required to be easily included in the battery management system [10]. Battery life is a highly dynamic factor and is dependent on a wide variety of parameters including current, operating temperature and the depth of discharge (DOD) [7]. The battery life degradation is referred to as battery capacity loss and is quantified by Eq. (3.5.1) [10].

\[ Q_{loss}(r, Ah) = 100 \cdot \frac{Q_{batt}(0) - Q_{batt}(r, Ah)}{Q_{batt}(0)} \tag{3.5.1} \]

where $r$ is the vector of aging factors ($I_c$, $T$, $SOC$) [15] and $Ah$ is the ampere-hour throughput, which is the total amount of charging and discharging current through the battery during the operation. $Q_{batt}(0)$ is the initial capacity of the battery and $Q_{batt}(r, Ah)$ is the capacity of the aged battery. The value of $Q_{loss}(r, Ah)$ is computed from Eq. (3.5.2), adopted from Ref. [10],

\[ Q_{loss}(r, Ah) = \sigma(r) \cdot Ah^{z} \tag{3.5.2} \]

where $z$ shows the dependency of $Q_{loss}$ on the charge throughput. $\sigma(r)$ is the severity factor function, which is expressed in Eq. (3.5.3) [10].

\[ \sigma(r) = (\alpha \cdot SOC + \beta) \cdot \exp\!\left( \frac{-E_a + \eta \cdot I_c}{R_g\,(273.15 + T)} \right) \tag{3.5.3} \]

Here, $I_c$ is the C-rate, which is defined as per Eq. (3.5.4),

\[ I_c = \frac{|I|}{Q_{batt}} \tag{3.5.4} \]

The values of the coefficients are $\alpha = 2694.5$, $\beta = 6022.2$, $E_a = 31500\ \mathrm{J\,mol^{-1}}$ and $\eta = 152.5$, and $T$ is the temperature in Celsius in the given operating condition. To compute the relative aging of the battery with different drive cycles and under different operating conditions, the concept of the severity factor map is discussed in Ref. [15]. This concept is used in Section 4.5 to consider the battery aging factor in the Q-learning strategy. The expression of the severity factor map is given by Eq. (3.5.5).

\[ \sigma_{map} = \frac{\Gamma(I_{c,nom}, T_{nom}, SOC_{nom})}{\gamma(I_c, T, SOC)} = \frac{\int_0^{EOL} |I_{nom}(t)|\,dt}{\int_0^{EOL} |I(t)|\,dt} \tag{3.5.5} \]

where $\sigma_{map}$ is the ratio of the total charge throughput through the battery in the nominal operating condition to the charge throughput in the actual operating condition. $\Gamma$ is the nominal battery life expressed in Ah-throughput for the nominal discharge current $I_{nom}$. To calculate $\Gamma$ it is assumed that the battery capacity left at the end of life is 20% and that the battery is operated in a controlled condition with temperature = 25 °C, $I_{c,nom} = 2.5\ \mathrm{h^{-1}}$ and SOC = 35% [10]. $\gamma(I_c, T, SOC)$ is the total charge throughput through the battery under the actual operating condition with the actual values of $I_c$, $T$ and $SOC$.

Fig. 6 shows the severity factor map, which relates the severity factor to the C-rate value and temperature. $\sigma_{map} > 1$ indicates the high stress zone. If the value of C-rate and/or temperature is high enough, it raises the severity factor value into the high stress zone and the battery capacity loss increases. The reward function in the Q-learning strategy ensures that the battery operation is in the low stress zone.

Fig. 6. Severity factor map for the cell at 35% SOC, battery operation in the region with severity factor >1 leads to the higher battery capacity loss.

4. Energy management strategies (EMS)

4.1. Thermostat strategy

The Thermostat strategy is a map-based strategy that considers the battery SOC limits to decide the electric machine torque mapped to them. It turns on the engine-generator unit when the battery SOC approaches the lower limit and turns off the engine when the battery SOC approaches the upper limit. The upper and lower values of SOC are calculated based on battery internal resistance [16]. In this research, a similar strategy is followed. Fig. 7 shows the graphical representation of the Thermostat strategy followed, in the form of the Thermostat torque versus SOC map, which decides the electric machine torque based on the current battery SOC. $SOC_{low}$ is the battery SOC lower limit, $SOC_{high}$ is the battery SOC upper limit, and $\tau_{EM,max}$ and $\tau_{EM,min}$ are the maximum and minimum torque limits of the motor and generator, respectively.

As represented in Fig. 7, the electric motor torque depends on the battery SOC value. When the motor torque is not sufficient to meet the demanded torque, the engine operates and provides the remaining torque. In the traction region, when the SOC value is above the higher limit, the motor provides the maximum torque. Similarly, when there is a braking action and the battery SOC value goes below the SOC low limit, the electric machine acts as a generator and operates up to the highest torque limit.

4.2. Fuzzy logic controller

This method is based on the utilization of fuzzy logic to split power between the engine and electric machine according to the

conditions of battery SOC and road torque requirement [17], based on the underlying principle that an action is triggered when a threshold is reached. This action may refer to shutting down the engine at high SOC or switching off the motor at low SOC. Different strategies for rule-based control of HEVs are discussed in Ref. [18]. The objectives of the rule-based strategies in this research work are as follows.

1) Providing driver demanded power
2) Reducing fuel consumption and emissions
3) Maintaining the SOC of the battery at the optimum level

Fig. 7. Thermostat electric motor torque map, based on battery SOC.

In general, the performance of rule-based controllers is largely dependent on human expertise and heuristics. One of the advantages of this is that knowledge of a predefined drive cycle profile is not required [19]. In this paper, apart from maintaining the battery SOC level, the rule-based strategy includes two additional layers of control constraints: electric motor torque and temperature. As stated in Section 3.5, battery aging is driven by high current operation and high cell temperature. The additional layers of control ensure that the battery temperature and the battery operating current are limited to the threshold values. If the threshold is reached, the fuzzy logic controller limits the operation of the induction machine until values below the threshold are achieved. This strategy controls battery aging by limiting high temperature and high current operation of the battery. Additionally, the model takes advantage of the hybrid system and the instantaneous torque capabilities of the electric machine, allowing the engine operation to be limited under appropriate conditions. This ensures that idling is eliminated as much as possible, bringing down overall fuel consumption during the drive cycle. Thus, by controlling the operation of the electric machine and engine, appropriate battery life and fuel economy are achieved for the drive cycle.

In this paper the following rules are applied as hard constraints to the powertrain controller.

The electric machine torque boundary condition is,

\[ \tau_{EM,min} \le \tau_{dmd}(t) \le \tau_{EM,max} \tag{4.2.1} \]

Case 1: Traction case; $SOC_{Batt}(t) > SOC_{Low}$

Case 1a: Traction case; $T_{Batt}(t) < T_{Batt,optimal}$

\[ \tau_{dmd}(t) > \tau_{EM,lim}: \quad \tau_{EM} = \tau_{EM,lim}, \quad \tau_{ice} = \tau_{dmd} - \tau_{EM,lim} \tag{4.2.2} \]

\[ \tau_{dmd}(t) < \tau_{EM,lim}: \quad \tau_{EM} = \tau_{dmd}, \quad \tau_{ice} = 0 \tag{4.2.3} \]

Case 1b: $T_{Batt}(t) > T_{Batt,optimal}$

\[ \tau_{dmd}(t) > \tau_{EM,lim}: \quad \tau_{EM} = 0.5\,\tau_{EM,lim}, \quad \tau_{ice} = \tau_{dmd} - 0.5\,\tau_{EM,lim} \tag{4.2.4} \]

\[ \tau_{dmd}(t) < \tau_{EM,lim}: \quad \tau_{EM} = 0.5\,\tau_{dmd}, \quad \tau_{ice} = 0.5\,\tau_{dmd} \tag{4.2.5} \]

Case 2: Traction case; $SOC_{Batt}(t) < SOC_{Low}$

\[ \tau_{EM} = 0 \tag{4.2.6} \]

\[ \tau_{ice} = \tau_{dmd}(t) \tag{4.2.7} \]

Case 3: Regeneration; Braking case; $SOC_{Batt}(t) < SOC_{Low}$

Case 3a: $T_{Batt}(t) < T_{Batt,optimal}$

\[ \tau_{dmd}(t) > \tau_{EM,lim}: \quad \tau_{EM} = \tau_{EM,lim}, \quad \tau_{ice} = 0 \tag{4.2.8} \]

\[ \tau_{dmd}(t) < \tau_{EM,lim}: \quad \tau_{EM} = 0.5\,\tau_{EM,lim}, \quad \tau_{ice} = 0 \tag{4.2.9} \]

Case 3b: $T_{Batt}(t) > T_{Batt,optimal}$

\[ \tau_{EM} = 0, \quad \tau_{ice} = 0 \tag{4.2.10} \]

Case 4: Regeneration; Braking case; $SOC_{Batt}(t) > SOC_{high}$

\[ \tau_{EM} = 0 \tag{4.2.11} \]

\[ \tau_{ice} = 0 \tag{4.2.12} \]

where $SOC_{Batt}$ is the current battery state of charge, $SOC_{Low}$ is the battery lower SOC level, $SOC_{high}$ is the battery higher SOC level, $T_{Batt}$ is the cell temperature in Celsius, $T_{Batt,optimal}$ is the optimal battery temperature for prolonged life, which in this work has been taken as 30 °C, $\tau_{dmd}$ is the demand torque from the vehicle corresponding to the road load at that time instant, $\tau_{EM,lim}$ is the motor torque threshold limit to increase the battery life, which is ±50 Nm in this paper, $\tau_{EM}$ is the electric motor torque demand and $\tau_{ice}$ is the engine torque request. In the case of braking, the electric machine acts as a generator and charges the battery when the battery SOC is lower than the SOC upper limit. However, the generator operation is limited by the temperature threshold limit. When the battery cell temperature is above the threshold limit, the generator operation is limited to half of its maximum torque value. This strategy helps to reduce battery operation in the high temperature zone and eventually contributes to saving battery life. For the electric machine used in this research, the minimum and the maximum torque limits are -100 Nm and 100 Nm, respectively.
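The traction rules (Cases 1a, 1b and 2) can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' controller: the function name `traction_split` and the default `soc_low` value are hypothetical, since the text does not give the SOC limits. Case 1a above the torque limit is coded with the motor at $\tau_{EM,lim}$ so that motor and engine torque add up to the demand, which is a choice made for this sketch, and the handling of exact equality at each threshold is likewise a choice because the rules use strict inequalities. The 30 °C optimal temperature and the 50 Nm motor torque limit are taken from the text.

```python
T_OPT_C = 30.0     # optimal battery temperature from the text, deg C
TAU_EM_LIM = 50.0  # motor torque threshold from the text, Nm


def traction_split(tau_dmd, soc, temp_c, soc_low=0.3):
    """Return (tau_em, tau_ice) in Nm for a traction demand tau_dmd >= 0.

    soc_low=0.3 is a placeholder value, not a number from the paper.
    """
    if soc < soc_low:
        # Case 2: battery depleted, the engine supplies the full demand.
        return 0.0, tau_dmd
    hot = temp_c > T_OPT_C  # True selects Case 1b, False selects Case 1a
    if tau_dmd > TAU_EM_LIM:
        # Demand above the limit: motor at the limit, halved when hot.
        tau_em = 0.5 * TAU_EM_LIM if hot else TAU_EM_LIM
    else:
        # Demand within the limit: motor covers it, halved when hot.
        tau_em = 0.5 * tau_dmd if hot else tau_dmd
    # The engine supplies whatever the motor does not.
    return tau_em, tau_dmd - tau_em
```

For example, an 80 Nm demand at 60% SOC is split as 50 Nm motor and 30 Nm engine at 25 °C, and as 25 Nm motor and 55 Nm engine at 35 °C, which shows how the temperature layer shifts load from the battery to the engine.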
4.3. Thermostat strategy with high voltage battery pack and special
DC-DC converter 0 tice tice;max (4.4.1.6)
The cycle life of a battery is majorly affected by the charging and 0 uice uice;max (4.4.1.7)
discharging of the battery with high amount of current [7]. If the
current passing through the battery is limited, the aging of the
tEM;min tEM tEM;max (4.4.1.8)
battery can be controlled. The same concept is used in this control
strategy to reduce battery aging. To reduce the current throughput
through the battery, the battery pack arrangement is redesigned. 0 uEM uEM;max (4.4.1.9)
The total number of cells in the battery are 1170. The number of
series cells used in this battery pack is 130 and the number of Pbatt;min Pbatt Pbatt;max (4.4.1.10)
parallel cells is 9. It is assumed that the nominal output voltage of
the battery is around 440 V which is step down by the DC-DC where, m_ ice andm_ batt;eq are the instantaneous engine fuel con-
converter at around 220 V to meet the induction motor input sumption and motor equivalent fuel consumption factors. The en-
requirement. A bidirectional buck type of DC-DC converter is used. gine instantaneous fuel consumption is calculated through the
For a buck converter, the output voltage is lower than the input interpolation of the engine fuel map (Fig. 2). The motor equivalent
voltage and, the output current is higher than the input current. In fuel consumption is calculated using Eq. (4.4.1.3). where SCECMS is
EV and HEV, the primary function of the DC-DC converter is to the equivalent fuel consumption factor which accounts for the
boost/reduce the voltage of different component in a vehicle amount of fuel needed to replenish the electrical energy consumed,
depending on the load type and system requirement [20]. In this fSOC is the correction factor for maintaining battery SOC governed
case, the scope of the discussion is limited to the impact of a step- by Eq. (4.4.1.11) and Pbatt is the battery power requirement corre-
down DC-DC converter on battery life. The performance of DC-DC sponding to the motor operating point at time instant t.
converter is limited by its efficiency, electromagnetic interference, The ECMS employed in this paper is tailored to be adaptive like
and stress on internal switches with a higher conversion ratio. DC- the one discussed in Ref. [23]. The controller calculates the battery
DC converter with higher power capacity adds additional weight to cost based on the motor power and battery operating SOC as shown
the vehicle [21]. It is assumed that the performance of DC-DC in Eq. (4.4.1.3). The SOC correction factor fSOC is written as,
converter used here is a function of the converter efficiency, and
the efficiency reduces at higher voltage and increases at higher !3
current at the given voltage. SOCtþ1 SOCdesired
fsocðtþ1Þ ¼ 1 2 (4.4.1.11)
SOChigh SOClow
4.4. Equivalent fuel consumption minimization strategy (ECMS) Pbatt;t

SOCtþ1 ¼ SOCt þ (4.4.1.12)
BatteryCapacityðAhÞ
4.4.1. ECMS without gear control
The function $f_{SOC}$ is a cubic polynomial function. The dynamics of this equation push the controller to operate the engine more until the SOC reaches the desired value when the current SOC is lower than the desired SOC value, and encourage the motor to operate more when the SOC is higher than the desired value, as depicted in Fig. 8. This ensures the robustness of the controller in maintaining the optimum battery SOC range for better performance. The impact of including the SOC correction cost in the ECMS controller is shown in Fig. 9. As shown in this figure, the desired SOC value is set as 50%.

The control strategy discussed in this section is ECMS, which uses a cost-based function that tries to find the local minimum over all possible operating points of the engine and electric machine. This strategy is based on the quasi-static behavior assumption of the system and model. Within the quasi-static time frame, the controller uses the vehicle speed, torque demand and battery SOC to decide the overall cost of operating at a power split ratio ($\tau_{ice}$, $\tau_{EM}$ and gear ratio), which is denoted by $J(t)$ and expanded in Eq. (4.4.1.1). The cost of operating the battery follows from the fact that when battery power is used at one time step, it must be replenished using the engine at a later stage [22]. The ECMS local minimization cost function $J(t)$ is written as,
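\[ J(t) = \int_{0}^{T_f} \dot{m}_{eq}(t)\,dt = \int_{0}^{T_f} \left[\dot{m}_{ice}(t) + \dot{m}_{batt,eq}(t)\right] dt \tag{4.4.1.1} \]

where,

\[ \dot{m}_{ice}(t) = f(\omega_{ice}, \tau_{ice}) \tag{4.4.1.2} \]

\[ \dot{m}_{batt,eq}(t) = SC_{ECMS} \cdot f_{SOC}(SOC_{t+1}) \cdot P_{batt}(\omega_{EM}, \tau_{EM}) \tag{4.4.1.3} \]

subject to the constraints,

\[ 0 \le \tau_{ice} \le \tau_{ice,max} \tag{4.4.1.4} \]

\[ 0 \le \omega_{ice} \le \omega_{ice,max} \tag{4.4.1.5} \]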
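As an illustration only, the instantaneous minimization behind Eqs. (4.4.1.1)-(4.4.1.5) can be sketched as a search over candidate electric machine torques. The fuel map, the battery power model, the torque limits and all function names below are hypothetical placeholders rather than the models used in the paper, the engine and motor are assumed to turn at the same speed, and the SOC correction factor is passed in as a plain number.

```python
def fuel_rate(omega_ice, tau_ice):
    """Hypothetical engine fuel map (g/s); a real controller would use a measured map."""
    power_kw = omega_ice * tau_ice / 1000.0
    return 0.1 + 0.07 * power_kw if tau_ice > 0 else 0.0


def battery_power(omega_em, tau_em, eff=0.9):
    """Hypothetical battery power (W): motoring draws more, regeneration returns less."""
    p_mech = omega_em * tau_em
    return p_mech / eff if p_mech >= 0 else p_mech * eff


def ecms_split(tau_demand, omega, sc_ecms, f_soc,
               tau_ice_max=200.0, tau_em_max=100.0, n=41):
    """Return (cost, tau_ice, tau_em) with the lowest equivalent fuel rate."""
    best = None
    for k in range(n):
        # Candidate motor torque on a uniform grid over [-tau_em_max, tau_em_max].
        tau_em = -tau_em_max + 2.0 * tau_em_max * k / (n - 1)
        tau_ice = tau_demand - tau_em
        # Engine torque constraint, Eq. (4.4.1.4).
        if not (0.0 <= tau_ice <= tau_ice_max):
            continue
        # Engine fuel plus equivalent battery fuel, Eqs. (4.4.1.2) and (4.4.1.3).
        cost = fuel_rate(omega, tau_ice) + sc_ecms * f_soc * battery_power(omega, tau_em)
        if best is None or cost < best[0]:
            best = (cost, tau_ice, tau_em)
    return best
```

Calling `ecms_split(120.0, 200.0, 7e-5, 0.5)` and then repeating the call with `f_soc=1.5` shows the intended effect of the correction factor: a larger factor makes battery energy more expensive, so the selected motor torque does not increase.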
Fig. 8. Adaptive battery cost factor fsoc based on battery SOC.
Fig. 9. Effect of including fsoc factor on battery pack SOC in the ECMS cost function.

Therefore, the battery is discharged to 50% SOC, after which the controller goes into charge sustaining mode, maintaining the SOC level constant.

4.4.2. ECMS with integrated gear shift control

An attempt was made to improve the supervisory energy management control by integrating the gearshift and power split controllers into one common ECMS cost function, building the cost matrix as a 2-D array instead of a 1-D power split vector. The second dimension of the cost matrix accounts for the selection of the optimum gear ratio for the gearbox coupled to the engine. This introduces an additional degree of freedom to the controller, allowing a more optimal powertrain operating point to be selected based on vehicle torque demand.

The updated cost function can be written as

\[ J_t = \int_{0}^{T_f} \dot{m}_{eq}(t)\,dt = \int_{0}^{T_f} \left[\dot{m}_{ice}(t) + \dot{m}_{batt,eq}(t)\right] dt \tag{4.4.2.1} \]

where $\dot{m}_{ice}(t)$ is $f(\tau_{ice}, \text{gear ratio})$.

4.5. Q-learning

Q-learning is a model-free reinforcement learning algorithm. Compared to Thermostat and rule-based strategies, Q-learning is an optimization-based strategy, which generally has higher computation cost and produces better fuel economy. Compared to ECMS, Q-learning is model-free and performs long-horizon optimization. ECMS is a model-based strategy because it requires engine and electric motor models when calculating the minimum fuel consumption. However, Q-learning does not need those models and only needs to interact with the actual vehicle or vehicle plant model (in this study). In the vehicle interaction process, Q-learning takes the real-time vehicle speed, torque demand and gear number signals and determines the optimal power split between engine and electric motor. Additionally, ECMS only finds the minimum fuel consumption for one time step and does not consider the long-term future, while Q-learning utilizes the Bellman Equation and considers the future in the optimization process. In this study, a tabular Q-learning algorithm is used (i.e., Q values are saved in a look-up table rather than approximated by correlations). Q-learning has three states (i.e., vehicle speed, vehicle torque demand, gear number) and one action (i.e., power split between engine and electric motor).

Automotive OEMs need to follow emission regulations when they design new cars. According to the experimental results reported in Ref. [1], CO2 emission is proportional to fuel consumption within 2.5% measurement error. However, the CO2/NOx emission map is different from the engine fuel efficiency map. There is a trade-off among CO2 emission, NOx emission and fuel consumption [2]. In this study, fuel consumption is assumed to be related to CO2 and NOx emissions, and fuel saving assists emission reduction. This study only focuses on fuel consumption and battery aging.

In the Q-learning algorithm, there is a reward generated in the state transition, which is defined as follows.

\[ R = -w_f\left(\dot{m}_{ICE} + \dot{m}_{ESS}\right) - \left(1 - w_f\right)\sigma + 1 \tag{4.5.1} \]

where $\dot{m}_{ICE}$ is the fuel rate of the engine, $\dot{m}_{ESS}$ is the equivalent fuel rate of the battery (i.e., energy storage system (ESS)), $\sigma$ is the battery severity factor, $w_f$ is the weight of fuel consumption, and $(1 - w_f)$ is the weight of the battery severity factor (i.e., battery aging speed). The lower the fuel rate and battery severity factor, the greater the reward. The 1 added at the end of $R$ makes the reward a positive number. The key equation in Q-learning is the update equation as follows.

\[ Q(s_t, a_t) = Q(s_t, a_t) + \alpha\left[R + \gamma \max_{i} Q(s_{t+1}, a_i) - Q(s_t, a_t)\right] \tag{4.5.2} \]

where $Q(s_t, a_t)$ is the Q value at state and action pair $(s_t, a_t)$, subscript $t$ is the time step, $\alpha$ is the learning rate, $\gamma$ is the discount factor, and $\max_{i} Q(s_{t+1}, a_i)$ is the maximum Q value at state $s_{t+1}$ among all the action-state Q values. The term $R + \gamma \max_{i} Q(s_{t+1}, a_i)$ is the observed Q value at state and action pair $(s_t, a_t)$, whereas $Q(s_t, a_t)$ is the estimated Q value. More information about the tabular Q-learning parameter setup and implementation process can be found in reference [4]. The procedures of Q-learning update during the simulation are summarized as follows.

1) Initialize the Q value table with zeros.
2) Run the vehicle simulation in one driving cycle (i.e., one iteration). During the simulation, the action corresponding to the largest Q value at a given state is selected 92% of the time, and a random action is taken the remaining 8% of the time.
3) Update the Q value table based on Eq. (4.5.2) from the first time-step to the last time-step using the data from step 2.
4) Repeat steps 2 and 3 until the sum of rewards no longer increases.

The Q value look-up table is a 4D table. The four inputs are three states and one action. The one output is the power split ratio between the engine and electric motor. Each state is discretized into 5 levels and the action into 20 levels. The state and action discretization information is summarized in Table 1. The sum of rewards for each iteration is shown in Fig. 10. It is observed that the Q-learning converges after 50 iterations.
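The tabular procedure described in this section can be sketched as follows. This is a minimal sketch under stated assumptions: the reward is assumed to take the form $R = -w_f(\dot{m}_{ICE} + \dot{m}_{ESS}) - (1 - w_f)\sigma + 1$ with inputs normalized so that it stays positive, the learning rate and discount factor values are placeholders (the paper defers its parameter setup to reference [4]), and the function names are hypothetical.

```python
import random

# Discretization levels: three states with 5 levels each, one action with 20 levels.
N_SPEED, N_TORQUE, N_GEAR, N_ACTION = 5, 5, 5, 20


def make_q_table():
    """4-D look-up table q[speed][torque][gear][action], initialized with zeros."""
    return [[[[0.0] * N_ACTION for _ in range(N_GEAR)]
             for _ in range(N_TORQUE)] for _ in range(N_SPEED)]


def reward(m_ice, m_ess, sigma, w_f):
    """Assumed reward form: weighted fuel rate and severity factor, offset by 1."""
    return -w_f * (m_ice + m_ess) - (1.0 - w_f) * sigma + 1.0


def select_action(q, state, epsilon=0.08, rng=random):
    """Greedy action 92% of the time, random action the remaining 8%."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTION)
    i, j, k = state
    row = q[i][j][k]
    return max(range(N_ACTION), key=row.__getitem__)


def update(q, state, action, r, next_state, alpha=0.1, gamma=0.9):
    """One tabular update; alpha and gamma are placeholder values."""
    i, j, k = state
    ni, nj, nk = next_state
    observed = r + gamma * max(q[ni][nj][nk])
    q[i][j][k][action] += alpha * (observed - q[i][j][k][action])
```

One iteration would call `select_action` at each time step of the driving cycle, record the resulting state transitions and rewards, and then apply `update` from the first time step to the last.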
Table 1
State and action discretization and resolution data for Q-learning.

States and action | Lower boundary | Upper boundary | Discretization | Resolution
Vehicle speed | 0 m/s | 30 m/s | 5 | 6 m/s
Vehicle torque demand | -100 Nm | 250 Nm | 5 | 70 Nm
Gear number | 1 | 5 | 5 | 1
Electric motor torque | -100 Nm | 100 Nm | 20 | 10 Nm
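To make the table concrete, a continuous signal can be mapped to a table index as sketched below. Uniform bins with clamping at the boundaries are an assumption, since the paper does not describe the binning rule, and the torque lower bounds are assumed to be -100 Nm, which is what the listed resolutions imply. The helper name `to_index` is hypothetical.

```python
def to_index(value, lower, upper, n_bins):
    """Map a continuous value to a bin index in [0, n_bins - 1], clamping at the bounds."""
    if value <= lower:
        return 0
    if value >= upper:
        return n_bins - 1
    width = (upper - lower) / n_bins  # e.g. (30 - 0) / 5 = 6 m/s for vehicle speed
    return int((value - lower) / width)


speed_idx = to_index(13.0, 0.0, 30.0, 5)        # vehicle speed, 6 m/s resolution
torque_idx = to_index(40.0, -100.0, 250.0, 5)   # torque demand, 70 Nm resolution
action_idx = to_index(-35.0, -100.0, 100.0, 20) # motor torque, 10 Nm resolution
```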
Fig. 10. Sum of Q-learning algorithm rewards over 150 iterations.

5. Simulation and results

5.1. Comparative analysis of Thermostat and Thermostat with a high voltage battery pack and a special DC-DC converter strategy

Fig. 11 shows the comparative analysis of the Thermostat control strategy and Thermostat with a high voltage battery pack. As shown in Fig. 11(b), the C-rate in the high voltage battery pack is slightly lower, which results in slightly lower battery temperature and less SOC variation in Fig. 11(c) and (a), respectively.

Fig. 11. Comparative analysis of Thermostat and Thermostat with special DC-DC converter strategy, (a) SOC, (b) C-rate and (c) Temperature comparison. Thermostat with special DC-DC converter strategy has lower C-rate, lower temperature rise and lower SOC variation compared to Thermostat strategy.

Fig. 12 shows the comparison of the electric machine and engine torque, and of the chemical energy from fuel and the electrical energy from the battery, for Thermostat and Thermostat with a high voltage battery pack. From Fig. 12(a), the ICE torque with a high voltage battery pack is comparatively higher and, as per Fig. 12(b), the electric machine torque with Thermostat is comparatively higher. This behavior is because of the lower current operation with the high voltage battery pack, as shown in Fig. 11(b). Fuel energy consumption is relatively higher with a high voltage battery pack, as shown in Fig. 12(c). Due to the lower SOC magnitude through the entire driving cycle, overall battery energy consumption is higher when a high voltage battery pack is used, as shown in Fig. 12(d).

Fig. 12. Comparative analysis of Thermostat and Thermostat with special DC-DC converter strategy, (a) Engine torque (b) Electric machine torque (c) Engine output energy and (d) Electric machine output energy comparison. Compared to Thermostat, Thermostat with special DC-DC converter strategy has overall lower motor torque leading to the lower electrical energy consumption and, higher IC engine torque leading to higher fuel energy consumption.

5.2. Comparative analysis of thermostat and the rule-based strategy

Fig. 13 shows the comparative analysis of the quasi-static plot of electric machine efficiency with the operating points; Fig. 13(a) and (b) show the electric machine efficiency map with the operating points for the Thermostat strategy and the rule-based strategy, respectively. As mentioned in case 1b (Section 4.2), if the demand torque is higher than ±50 Nm or if the temperature exceeds the threshold value, the electric machine operation is limited to half of the maximum/minimum torque it can provide. This behavior can be seen in Fig. 13(b) for the rule-based strategy. This leads to lower
Fig. 13. Electric machine efficiency map with operating points, (a) Thermostat, and (b) rule-based strategy. The rule-based strategy has more operating points in the higher ef-
ficiency region compared to Thermostat strategy.
battery capacity loss and hence, improves the battery life. In the case of Thermostat, the motor has operating points all over the map (Fig. 13(a)). Also, from the efficiency point of view, the rule-based strategy has more points in the high-efficiency region compared to Thermostat. Since the Thermostat strategy is map-based, the motor output torque is proportional to the existing battery SOC level, which limits the motor from operating in the high-efficiency zone. Whereas, for the rule-based strategy, the motor continues to supply torque until the battery SOC falls below the SOC lower limit. This allows the motor to operate in comparatively high-efficiency zones.

Fig. 14 shows the engine efficiency plot with the engine operating points for the Thermostat strategy in Fig. 14(a) and for the rule-based strategy in Fig. 14(b). The rule-based strategy has more operating points on the maximum torque line in Fig. 14(b), compared to Thermostat. This shows that the engine is providing the additional torque to meet the demanded torque when the motor operation is limited to ±50 Nm. This leads to the higher fuel economy with the rule-based strategy. In the Thermostat strategy, most of the engine operating points are in the low-efficiency zone compared to the rule-based strategy. This is because the motor operation with Thermostat is not limited to ±50 Nm like the rule-based strategy. With Thermostat, the motor provides the maximum amount of demanded torque
Fig. 14. Engine efficiency map with the operating points, (a) Thermostat, and (b) rule-based strategy. The rule-based strategy has more operating points on the maximum torque
line compared to Thermostat strategy.
depending on the battery SOC level and the remaining torque is

provided by the engine. As a result, the engine has operating points
in the lower efficiency zone and this leads to the lower fuel econ-
omy with Thermostat strategy compared to the rule-based strategy.
Fig. 15 shows the comparative analysis of engine torque, electric
machine torque and, the battery temperature in the case of rule-
based and Thermostat strategies. The red dotted lines in Fig. 15(b)
and (c) show the threshold limits of ±50 Nm for electric machine
torque and 30.25 °C for temperature, respectively. As shown in
Fig. 15(b), the electric machine torque is limited to the threshold
value for the rule-based strategy. In this situation, the remaining
demand torque is supplied by the engine. Thermostat strategy is
the map-based strategy that will provide the electric machine
torque based on the battery SOC. Comparison of the engine torque
for Thermostat and rule-based strategy is shown in Fig. 15(a).
Fig. 15(c) shows the slow rise in the battery temperature with rule-
based strategy compared to Thermostat due to the control layer
explained in the case 1a, 1b, 3a and 3b in Section 4.2.
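The limiting behavior described here can be illustrated with a simplified sketch. It encodes only the rule quoted in this section (motor torque capped at half of its limit when the demand exceeds ±50 Nm or the battery temperature exceeds 30.25 °C); the full case structure of Section 4.2 is not reproduced, the ±100 Nm motor limit is taken from Table 1, and the function name is hypothetical.

```python
def rule_based_em_torque(tau_demand, batt_temp_c,
                         tau_em_max=100.0, tau_threshold=50.0, temp_threshold_c=30.25):
    """Return (tau_em, tau_rest); tau_rest is left for the engine or the friction brakes."""
    limited = abs(tau_demand) > tau_threshold or batt_temp_c > temp_threshold_c
    # Cap the motor at half of its maximum/minimum torque when a threshold is crossed.
    cap = 0.5 * tau_em_max if limited else tau_em_max
    tau_em = max(-cap, min(cap, tau_demand))
    return tau_em, tau_demand - tau_em
```

For example, a demand of 80 Nm at 25 °C is split into 50 Nm from the motor and 30 Nm from the engine, whereas a demand of 30 Nm at the same temperature is met by the motor alone.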
Fig. 16 shows the comparison of SOC and C-rate for the Thermostat and rule-based strategies. From Fig. 16(a), it can be inferred that the SOC with the rule-based strategy does not change as aggressively as with Thermostat. This is because of the control layers applied in the rule-based strategy, which limit electric machine operation in motoring mode as well as in regeneration mode. This control strategy enhances the battery life as explained earlier in this section. Fig. 16(b) shows a comparatively lower C-rate with the rule-based strategy, which is the result of limited electric machine operation in this strategy.

Fig. 16. Comparative analysis of Thermostat and rule-based strategy, (a) SOC, and (b) C-rate. The rule-based strategy has lower C-rate and SOC variation leading to the lower battery capacity loss compared to Thermostat strategy.
5.3. ECMS without gear control strategy
A comparison of power split parameters is shown in Fig. 17. When compared with Thermostat, ECMS can achieve a better power split between the engine and electric machine. Thermostat tends to load the engine more, making it operate at high torques during hard acceleration, leading to high fuel consumption, which is clearly shown in Fig. 17(b) and (c). These high torque points in the engine throughout the drive cycle increase the overall fuel consumption of Thermostat compared to ECMS. The electric motor torque request for the two strategies is significantly different, with ECMS operating the motor at higher torques (Fig. 17(d)). When it comes to cell operating temperature, ECMS has better control (Fig. 18(b)), maintaining the temperature lower, thereby improving the battery life.

The major battery operating parameters, current and temperature, are shown in Fig. 18. From the figure, ECMS has better control over the peaks in battery current rate, which contributes to slower battery aging (Fig. 18(a)). The battery operating temperature, one of the influential parameters in the battery aging equation shown in Eq. (3.4.3), is also better controlled in ECMS (Fig. 18(b)).

Fig. 15. Comparative analysis of Thermostat and the rule-based strategy, (a) Engine torque (b) Electric machine torque and (c) Temperature comparison. Compared to Thermostat, the rule-based strategy has lower IC engine torque and lower charging torque, leading to the lower temperature rise in the battery.

Fig. 17. Comparison of Thermostat and ECMS strategies in terms of power split between engine and electric machine, (a) UDDS Velocity Profile, (b) ICE Instantaneous torque request, (c) ICE instantaneous fuel consumption, (d) Electric machine torque request. ECMS avoids peaks in the IC engine torque curve by finding optimal power split and thereby reduces overall fuel consumption over the drive cycle.
Fig. 18. Comparison of Thermostat and ECMS strategies in terms of battery operating parameters, (a) Cell current rate, (b) Cell temperature, and (c) Battery SOC. ECMS shows lower C-rate peaks and lower temperature rise than the Thermostat control does.

5.4. ECMS with gear control strategy

Fig. 19(a) and Fig. 19(b) show a comparison of the engine operating parameters for normal ECMS and ECMS with gear control. It is evident from Fig. 19(b) that the two strategies use distinct gear shift patterns. The key point from the gear profiles is that ECMS with gear control mostly operates in the higher gear ratio compared to the normal ECMS. This consequently leads to the constant lower power operation of the engine in ECMS with gear control compared to normal ECMS, as shown in Fig. 19(c). This claim is supported by the engine operating maps shown in Fig. 20. While normal ECMS operates mostly in the medium-speed region (Fig. 20(a)), ECMS with gear control (Fig. 20(b)) mostly operates in the low-speed region of the engine and consequently shifts to high gear numbers.

Also, the number of operating points in the high-efficiency zone is higher in ECMS with gear control. This can be explained by the way these two algorithms work. In the case of normal ECMS, the gear selection is made first in a separate gearshift controller and then the cost of engine efficiency is applied to the ECMS algorithm for local optimization at that gear and time instant. Whereas in the case of ECMS with gear controller, the gear selection is made after the engine operating efficiency cost is applied, leading to the operating point with higher efficiency among all possible gears. This leads to higher fuel economy savings in ECMS with gear control compared to normal ECMS. By controlling the gearshift and engine torque request simultaneously, the ECMS with gear control strategy can manage engine output power effectively, keeping the overall fuel consumption minimal, as shown in Fig. 19(d).

In terms of battery operating parameters, the ECMS with gear control strategy maintains the cell temperature at a lower level throughout the drive cycle, as seen in Fig. 21(d). The electric machine torque and SOC profiles are almost similar except during the initial 200 s of the drive cycle, where normal ECMS tends to operate the electric machine more owing to the power deficit from the IC engine, as seen in the plots shown in Fig. 21(b). The C-rate profile is comparatively smoother for ECMS with gear control, with fewer peaks than normal ECMS, as shown in Fig. 21(c).

Fig. 19. Comparison of engine operating parameters between the two ECMS strategies, (a) ICE instantaneous torque, (b) Gear Shift pattern, (c) ICE mechanical power output, and (d) Cumulative Fuel consumption. ECMS with gear control can operate the vehicle at optimal gear ratios and more frequent gear shifts, reducing the fuel consumption.

5.5. Q-learning results and analysis

Fig. 22 shows the comparative analysis of the electric machine efficiency plot with operating points for Thermostat in Fig. 22(a) and for the Q-learning strategy in Fig. 22(b). There are more operating points in the higher torque zone in Fig. 22(b) compared to Fig. 22(a). Thus, the electric machine is used more intensively with the Q-learning strategy than with the Thermostat strategy. As a result, the C-rate values are higher with the Q-learning strategy (Fig. 25(b)) and hence, the battery capacity loss is higher. This is because the reward function term (Eq. (4.5.1)) in the Q-learning strategy ensures that high fuel consumption leads to a lower reward. As a result, the agent tries to push the motor as much as possible to increase the overall reward. The high motor torque chosen by Q-learning is accompanied by higher current and thus a higher C-rate of the battery, which increases the aging severity factor. From the efficiency perspective, the Q-learning strategy has comparatively more operating points in the high-efficiency region compared to Thermostat. This is because the motor has more opportunity to operate with the Q-learning strategy compared to Thermostat because of the reward function trying to reduce the fuel consumption, as explained earlier.

Fig. 23 shows the comparative analysis of the engine efficiency plot with operating points for Thermostat (Fig. 23(a)) and Q-learning (Fig. 23(b)). The number of operating points in Fig. 23(b) is lower than the number of operating points in Fig. 23(a). As explained earlier, this is because the agent tries to operate the motor as much as possible to reduce fuel consumption and keep the overall reward value high.

Fig. 24 shows the comparison of the Thermostat and Q-learning strategies in terms of the engine torque (Fig. 24(a)), the electric machine torque (Fig. 24(b)), and instantaneous fuel consumption (Fig. 24(c)). It can be concluded that the ICE torque value is higher with the Thermostat strategy compared to Q-learning, which results in higher fuel consumption with Thermostat. With the Q-learning strategy, electric machine usage is higher when compared to Thermostat.

Fig. 25 shows the comparison of SOC (Fig. 25(a)), C-rate (Fig. 25(b)), and temperature (Fig. 25(c)) in the case of the Thermostat and Q-learning strategies. The C-rate value (Fig. 25(b)) and, consequently, the battery temperature (Fig. 25(c)) are higher with the Q-learning strategy. Due to fewer discharging events in the
Fig. 20. (a) Engine Operating points with normal ECMS (b) Engine Operating points with ECMS integrated with gearshift controller. ECMS with gearshift strategy can operate the
engine at comparatively lower speeds and hence lower fuel consumption.
initial 100 s, the SOC level is more sustained with the Q-learning strategy compared to Thermostat in the first 800 s of the cycle. The initial rise of the Q-learning SOC is the result of electric motor charging. As shown by the first spike in Fig. 24(a), the ICE torque of the Q-learning strategy is much higher than the ICE torque of the Thermostat strategy, whereas the EM torque of the Q-learning strategy is negative, which represents charging mode. This charging event aims to increase the SOC level so that the battery and ICE can be used more efficiently during 770 s and 950 s.

Fig. 21. Comparison of battery parameters for normal ECMS and ECMS with integrated gearshift controller, (a) Electric machine instantaneous torque (b) Battery SOC profile (c) Battery Current rate (d) Cell operating temperature. ECMS with gear control is operating at a higher average C-rate and overall higher SOC through the drive cycle.

5.6. Comparative analysis of fuel economy and battery life for all the control strategies

This section presents the results of each strategy compared with one another in terms of fuel economy and battery life. For the calculation of fuel economy, the EPA combined fuel economy calculations are used. The governing equation for this calculation is as follows,

\[ \text{EPA combined fuel economy} = \frac{1}{\dfrac{0.55}{FE_{UDDS}} + \dfrac{0.45}{FE_{HWFET}}} \tag{5.6.1} \]

To achieve more stable results, the results of fuel economy and the battery life are shown for 100 cycles. At the end of 100 cycles, the effect of the change in battery state of charge between the start and end of the simulation can be neglected as it is small compared to the overall energy expenditure.

The fuel economy and battery capacity loss attained by the different strategies and drive cycles are shown in Table 2. This table shows the impact of the drive cycle, and more specifically traffic conditions, on the performance of these strategies. For example, UDDS is an urban cycle with frequent start/stops, analogous to driving around in heavy city traffic. The HWFET cycle has very few stops and high speeds, which can be considered as highway driving with much less traffic. The trend is very similar between the two drive cycles across all strategies. Q-Learning has the highest fuel economy and highest battery capacity loss in both driving conditions. One important trend to be noticed here is the magnitude of the fuel economy and battery capacity loss difference between different strategies in the two drive cycles. In the case of UDDS, where frequent braking and acceleration is involved, the difference in battery capacity loss and fuel economy between strategies is high, whereas it is minimal in the case of the highway cycle. This highlights the influence of the drive cycle on controller efficiency. In UDDS, because of the frequent regenerative braking and acceleration, the ability of the controller plays a major role in
Fig. 22. Electric machine efficiency map with operating points, (a) Thermostat strategy, (b) Q-learning strategy. Q-Learning strategy operates the electric motor at higher efficiency
points compared to Thermostat strategy.
Fig. 23. Engine efficiency map with operating points, (a) Thermostat (b) Q-learning strategy. Q-Learning strategy operates the engine at comparatively higher efficiency operating
points.
deciding the fuel economy and battery life. However, in highway driving conditions, there is very little for the controller to work on, as the vehicle does not brake much, leading to lower regeneration of energy.

Figs. 26 and 27 show the comparison of the battery capacity loss with 100 UDDS cycles for all the control strategies. From Fig. 26, it can be inferred that the Q-learning strategy has the highest battery capacity loss whereas the high voltage battery pack strategy has the least capacity loss. As shown in the figure, the capacity loss is closely related to the Ah-throughput. On the contrary, the fuel consumption of the high voltage battery pack is high.

The rule-based strategy has the second lowest capacity loss in Fig. 27 and lower fuel consumption compared to Thermostat. This can be attributed to a comparatively more sophisticated control strategy. The rule-based strategy restricts the motor operation when the battery operating temperature and the torque request to the electric machine cross the threshold value. This allows intelligent control over the battery capacity loss and hence, the battery capacity loss is lower than with the ECMS, ECMS with gear control, Thermostat, and Q-learning strategies. When the battery SOC is higher than the SOC lower limit, the electric machine is given priority for operation over the engine, which ensures higher fuel
economy with this strategy compared to Thermostat.

Thermostat has higher capacity loss compared to the rule-based strategy, as shown in Figs. 26 and 27, and the second highest fuel consumption compared to the other strategies. This is because Thermostat has the simplest map-based control strategy, which demands power from the electric machine based on the SOC value alone; hence, electric machine operation is considerably restricted with this strategy. This leads to very high fuel consumption with Thermostat. Thermostat does not set any direct control on the motor application when the current passing through the battery is very high or the battery operating temperature is high; hence, the battery capacity loss with this strategy is higher compared to the rule-based strategy.

In the case of ECMS, when gear selection is not considered as a control output from the ECMS controller, the resulting fuel economy and battery capacity loss are poor compared to the ECMS strategy where gearshift is taken as a control output from the controller. ECMS with gearshift controller can achieve 11.7% fuel economy improvement at the cost of 3.2% reduction in battery life. This leads to the conclusion that by giving the ECMS controller an additional degree of freedom in selecting gears, the controller can better split the power between the engine and the motor and position the engine operating points in the low fuel consumption zone. When compared to the baseline Thermostat strategy, ECMS uses the electric machine more, leading to better fuel economy but at the cost of higher battery capacity loss. To arrive at a better conclusion in this case, a cost-benefit analysis is provided in the next section based on battery and fuel costs.

Fig. 24. Comparison of Thermostat and Q-learning strategy, (a) ICE torque, (b) Electric machine torque, and (c) instantaneous fuel consumption. The overall ICE torque with Q-learning strategy is lower compared to the Thermostat strategy, leading to lower fuel consumption with Q-learning strategy.

Fig. 25. Comparison of Thermostat and Q-learning strategy, (a) SOC, (b) C-rate, and (c) battery temperature comparison. Q-learning strategy has higher C-rate than Thermostat strategy, which results in more heat generation and slightly higher battery temperature.

Fig. 26. Comparison of battery capacity loss for all the control strategies with 100 UDDS cycles (the strategies with the highest and the lowest battery capacity losses are pointed out). The red dotted box is zoomed in Fig. 27. Q-learning has the highest battery capacity loss, and Thermostat with the high voltage battery pack has the lowest battery capacity loss. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Fig. 28 shows the comparative analysis of the battery capacity loss and fuel economy with 100 UDDS cycles on an absolute scale. Here, it is assumed that the Q-learning strategy is the reference and has the battery capacity loss and the fuel economy equivalent to 1. All the other strategies show the relative amount of battery capacity loss and fuel economy compared to Q-learning. From the
Table 2
Comparison of EPA combined Fuel Economy (FE) and battery capacity loss (%) values for different strategies over a period of 100 drive cycles of UDDS and HWFET.

Strategy | Thermostat | Rule-based | Thermostat with special DC-DC converter | ECMS no gear | ECMS with gear | Q-Learning
City Driving (UDDS) FE | 52.27 | 49.71 | 49.68 | 58.66 | 64.64 | 80.06
Highway Driving (HWFET) FE | 101.99 | 106.32 | 101.26 | 104.92 | 107.06 | 108.58
EPA Combined FE | 66.96 | 65.38 | 64.42 | 73.18 | 81.79 | 90.79
Battery Capacity Loss % over 100 UDDS cycles | 0.4102 | 0.3893 | 0.2933 | 0.3927 | 0.4054 | 0.5061
Battery Capacity Loss % over 100 HWFET cycles | 0.1982 | 0.1775 | 0.1413 | 0.2056 | 0.2193 | 0.2334
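As a worked check of the combined fuel economy calculation, the sketch below applies the 55/45 harmonic weighting of Eq. (5.6.1) to the city and highway values listed in Table 2 for the Thermostat and Q-learning strategies. The function name is illustrative only.

```python
def epa_combined_fe(fe_udds, fe_hwfet):
    """Harmonic 55/45 weighting of city (UDDS) and highway (HWFET) fuel economy."""
    return 1.0 / (0.55 / fe_udds + 0.45 / fe_hwfet)


thermostat = epa_combined_fe(52.27, 101.99)  # Table 2 lists 66.96
q_learning = epa_combined_fe(80.06, 108.58)  # Table 2 lists 90.79
print(round(thermostat, 2), round(q_learning, 2))
```

Because the weighting is harmonic, the combined value sits closer to the lower city figure than a simple weighted average of the two fuel economies would.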
graph, it can be inferred that there is a trade-off between battery

life improvement and fuel economy. The high voltage battery pack
strategy shows the least capacity loss hence, this strategy works the
best when the battery life is more concerned, but the fuel economy
is proportionally compromised. Q learning strategy provides the
best fuel economy but, the battery life is compromised the most
with this strategy. To better understand the effect of these trade-
offs, a cost-benefit analysis is provided in Section 5.7.
5.7. Cost analysis of all the control strategies considering battery

life and fuel consumption
To identify the best strategy out of the presented strategies, a

cost analysis model was created based on the fuel and battery costs
involved in each of the strategies. Based on the per kWh cost of
batteries proposed by Duffner [24] and keeping the battery
replacement cost of the 2020 Prius, the per kWh cost for the battery
is taken as $130/kWh. Under this cost assumption, the battery cost
to travel 50,000 miles for the different control strategies is
mentioned in Table 3. The average fuel cost in the US is taken from the website of the US Energy Information Administration [25] as $2.183 as of 14th September 2020. Considering this data, the cost of the fuel for 50,000 miles for the different control strategies is listed in Table 3. The last column in the table shows the overall cost of fuel and battery combined.

Fig. 27. Zoomed Section of the battery capacity loss plot from Fig. 26. Here, battery capacity loss curves at the end of 100 UDDS cycles for the four strategies (Thermostat, Rule-based, ECMS with and without gear control) are shown.
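The fuel part of this cost model can be sketched as below. It assumes the fuel economy values are in miles per gallon and the quoted fuel price is per gallon, neither of which is stated explicitly in the text; the battery cost term is left out because its formula is not given in this section. The function name is hypothetical.

```python
def fuel_cost_usd(distance_miles, fe_mpg, price_per_gallon=2.183):
    """Fuel cost for a trip length, assuming FE in miles per gallon and price in USD per gallon."""
    gallons = distance_miles / fe_mpg
    return gallons * price_per_gallon


# EPA combined fuel economy of the Q-learning strategy from Table 2.
cost_q_learning = fuel_cost_usd(50_000, 90.79)
```

Under these assumptions the Q-learning strategy would spend roughly 1200 USD on fuel over 50,000 miles; the entries of Table 3 may differ if the paper used other units or rounding.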
From the results in Table 3 and Fig. 29, it can be inferred that the Q-learning control strategy achieves the lowest overall cost of all the strategies compared. Considering battery life alone, the high voltage battery pack preserves the battery the most. The rule-based controller, which operates according to fixed rules, has the second highest total cost, indicating its inefficiency in optimal power-split decisions. ECMS with integrated gear control can optimize the control to a lower cost than ECMS without gear control, showcasing the significance of including gear selection along with the power split decisions of ECMS. This cost analysis indicates that Q-learning achieves the best trade-off between battery life and fuel economy due to the reinforcement learning methodology.
6. Conclusion
This paper presents an overview of some of the most popular energy management strategies concerning a PHEV, including simple controllers such as Thermostat and rule-based, and the highly efficient ECMS and Q-learning controllers. All these control strategies are discussed in detail, starting from the formulation of the governing equations and the constraints, followed by the simulation results, emphasizing the two critical factors, fuel economy and battery life. From the fuel consumption point of view, Q-Learning strategy produces the best fuel economy result, at the cost of

Fig. 28. Bar Chart showing the comparison of fuel economy and battery capacity loss results for 100 UDDS cycles for the five different strategies discussed in this paper. The values are converted to the absolute scale, with a value of 1 given to Q-learning and the rest of the strategies calculated based on their difference from Q-learning results. Q learning has the highest fuel economy and highest battery capacity loss among all the strategies.
Table 3
Cost analysis for the strategies discussed based on fuel economy, battery capacity loss and additional architecture cost for 50,000 miles. The last column shows the overall
operating cost for 50,000 miles and Q-learning has achieved the lowest operating cost among all the strategies.
16
appeared to influence the work reported in this paper.
References
[1] Hannan MA, Lipu MSH, Hussain A, Mohamed A. A review of lithium-ion

battery state of charge estimation and management system in electric
vehicle applications: challenges and recommendations. Renewable and Sus-
tainable Energy Reviews. Elsevier Ltd; 2017. p. 834e54. https://doi.org/
10.1016/j.rser.2017.05.001.
[2] Cordoba-Arenas A, Onori S, Rizzoni G. A control-oriented lithium-ion battery
pack model for plug-in hybrid electric vehicle cycle-life studies and system
design with consideration of health management. J Power Sources 2015;279:
791e808. https://doi.org/10.1016/j.jpowsour.2014.12.048.
[3] Montazeri-Gh M, Pourbafarani Z, Mahmoodi-k M. Comparative study of
different types of PHEV optimal control strategies in real-world conditions.
Proc Inst Mech Eng - Part D J Automob Eng 2018;232(12):1597e610. https://
doi.org/10.1177/0954407017732858.
Fig. 29. Graphical representation of cost analysis for the different strategies discussed [4] Montazeri-Gh M, Mahmoodi-K M. Optimized predictive energy management
in the paper. Thermostat with a high voltage battery pack has the highest overall cost of plug-in hybrid electric vehicle based on traffic condition. J Clean Prod
and Q-learning has the lowest overall cost with the other strategies, lining up in the 2016;139:935e48. https://doi.org/10.1016/j.jclepro.2016.07.203.
[5] Montazeri-Gh M, Mahmoodi-K M. An optimal energy management develop-
mid-range.
ment for various configuration of plug-in and hybrid electric vehicle. J Cent S
Univ 2015;22(5):1737e47. https://doi.org/10.1007/s11771-015-2692-6.
[6] Padmarajan Bv, McGordon A, Jennings PA. Blended rule-based energy man-
increased battery usage and subsequent battery capacity loss. ECMS agement for PHEV: system structure and strategy. IEEE Trans Veh Technol
strategies come next to Q-Learning providing 17% better fuel 2016;65(10):8757e62. https://doi.org/10.1109/TVT.2015.2504510.
[7] Wang J, Liu P, Hicks-Garner J, Sherman E, Soukiazian S, Verbrugge M,
economy than the low memory rule-based and Thermostat con- Tataria H, Musser J, Finamore P. Cycle-life model for graphite-LiFePO4 cells.
trollers. From the battery life perspective, the high voltage battery J Power Sources 2011;196(8):3942e8. https://doi.org/10.1016/
pack has the lowest battery capacity loss owing to an overall lesser j.jpowsour.2010.11.134.
[8] Anselma PG, Kollmeyer P, Belingardi G, Emadi A. Multitarget evaluation of
current throughput through the battery pack, although the fuel hybrid electric vehicle powertrain architectures considering fuel economy and
economy is compromised due to higher DC-DC conversion losses. battery lifetime. SAE Technical Paper Series 2020;1. https://doi.org/10.4271/
Also presented is a comparison between two different ECMS stra- 2020-37-0015.
[9] Anselma PG, Kollmeyer P, Belingardi G, Emadi A. Multi-objective hybrid
tegies, one considering separate gear and power split controller and electric vehicle control for maximizing fuel economy and battery lifetime.
the other integrating both into a single controller. The integrated 2020. p. 1e6. https://doi.org/10.1109/itec48692.2020.9161518.
controller can achieve better results since the controller has an [10] Suri G, Onori S. A control-oriented cycle-life model for hybrid electric vehicle
lithium-ion batteries. Energy 2016;96:644e53. https://doi.org/10.1016/
additional degree of freedom, gear selection while choosing a local
j.energy.2015.11.075.
minimum. Since monetary savings is a deciding factor in current [11] Malmir F, Xu B, Filipi Z. A heuristic supervisory controller for a 48V hybrid
automotive trends, a market cost analysis has been performed for electric vehicle considering fuel economy and battery aging. SAE Technical
Papers 2019. https://doi.org/10.4271/2019-01-0079. 2019-Janua (January).
each strategy. This analysis put forth Q-Learning as the most cost-
[12] Xiong R, Li L, Li Z, Yu Q, Mu H. An electrochemical model based degradation
effective strategy since it can achieve the right trade-off between state identification method of lithium-ion battery for all-climate electric ve-
vehicle fuel economy and battery life. This is attributed to the dy- hicles application. Appl Energy 2018;219(5):264e75. https://doi.org/10.1016/
namic nature of this strategy, which allows it to learn and improve j.apenergy.2018.03.053.
[13] Xu B, Rathod D, Zhang D, Yebi A, Zhang X, Li X, Filipi Z. Parametric study on
from previous simulation data and experience. reinforcement learning optimized energy management strategy for a hybrid
There are several directions in which this fuel economy and electric vehicle. Appl Energy 2020;259. https://doi.org/10.1016/
battery aging work can be carried forward in the future. First, an j.apenergy.2019.114200.
[14] Ismail NHF, Toha SF, Azubir NAM, Md Ishak NH, Hassan MK, Ksm Ibrahim BS.
experimental validation of the proposed EMS comparison is critical. Simplified heat generation model for lithium ion battery used in electric
Besides, the cost and aging of battery vary among different types of vehicle. IOP Conf Ser Mater Sci Eng 2013;53(1). https://doi.org/10.1088/1757-
cells and chemistry. The generality of the conclusion needs vali- 899X/53/1/012014.
[15] Onori S, Spagnol P, Marano V, Guezennec Y, Rizzoni G. A new life estimation
dation on different battery chemistries and fuel prices. Moreover, method for lithium-ion batteries in plug-in hybrid electric vehicles applica-
the Q-learning results only show a glimpse of reinforcement tions pierfrancesco spagnol vincenzo marano yann guezennec and giorgio
learning. Systematic parametric study and algorithm selection of rizzoni. Int J Power Electron 2012;4(3):302e19.
[16] Kim M, Jung D, Min K. Hybrid thermostat strategy for enhancing fuel economy
reinforcement learning can be conducted in the fuel economy and
of series hybrid intracity bus. IEEE Trans Veh Technol 2014;63(8):3569e79.
battery aging optimization. Also, the driving pattern may have an https://doi.org/10.1109/TVT.2013.2290700.
impact on the battery life and fuel economy of a PHEV and these [17] Antonio S, Back M, Guzzella L. Optimal control of parallel hybrid electric ve-
hicles. In: Proceedings of the 2012 7th IEEE conference on industrial elec-
effects should be analyzed as a future scope of work.
tronics and applications, ICIEA 2012, vol. 12; 2012. p. 423e7. https://doi.org/
10.1109/ICIEA.2012.6360764. 3.
[18] Hmidi ME, ben Salem I, el Amraoui L. Analysis of rule-based parameterized
Credit authors statement control strategy for a HEV hybrid electric vehicle. In: 19th international
conference on sciences and techniques of automatic control and computer
Concept and Methodology: The methodology was proposed by engineering, STA 2019; 2019. p. 112e7. https://doi.org/10.1109/
STA.2019.8717250.
the entire team. Simulation: Simulation was conducted by [19] Çaǧatay Bayindir K, Go € züküçük MA, Teke A. A comprehensive overview of
Shradhdha Sarvaiya and Sachin Ganesh. Manuscript: The draft was hybrid electric vehicle: powertrain configurations, powertrain control tech-
written by Shradhdha Sarvaiya and Sachin Ganesh. Guidance: The niques and electronic control units. Energy Convers Manag 2011;52(2):
1305e13. https://doi.org/10.1016/j.enconman.2010.09.028.
team was guided throughout the process by Bin Xu with his valu- [20] Chakraborty S, Vu HN, Hasan MM, Tran DD, el Baghdadi M, Hegazy O. DC-DC
able inputs in modeling, controls, and manuscript phase. converter topologies for electric vehicles, plug-in hybrid electric vehicles and
fast charging stations: state of the art and future trends. Energies 2019;12(8).
https://doi.org/10.3390/en12081569.
Declaration of competing interest [21] Al M, Van J, Gualous H. DC/DC converters for electric vehicles. Electric Vehicles
- Modelling and Simulations; 2011. https://doi.org/10.5772/17048. 2014.
[22] Pisu P, Rizzoni G. A supervisory control strategy for series hybrid electric
The authors declare that they have no known competing vehicles with two energy storage systems. In: 2005 IEEE vehicle power and
financial interests or personal relationships that could have propulsion conference, VPPC 2005; 2005. p. 65e72. https://doi.org/10.1109/
17
VPPC.2005.1554534. [24] Duffner F, Wentker M, Greenwood M, Leker J. Battery cost modeling: a review
[23] Onori S, Serrao L, Rizzoni G. Adaptive equivalent consumption minimization and directions for future research. Renew Sustain Energy Rev
strategy for hybrid electric vehicles. ASME 2010 Dynamic Systems and Control 2020;127(April):109872. https://doi.org/10.1016/j.rser.2020.109872.
Conference, DSCC2010 2010;1:499e505. https://doi.org/10.1115/DSCC2010- [25] US Gasoline and Diesel Retail Prices. https://www.eia.gov/dnav/pet/pet_pri_
4211. gnd_dcus_nus_w.htm. [Accessed 19 September 2020].
18
