Treatment of Accumulative Variables in Data-Driven Prognostics of Lead-Acid Batteries


9th IFAC Symposium on Fault Detection, Supervision and Safety of Technical Processes
September 2-4, 2015. Arts et Métiers ParisTech, Paris, France

Available online at www.sciencedirect.com
ScienceDirect
IFAC-PapersOnLine 48-21 (2015) 105–112
Erik Frisk and Mattias Krysander

Department of Electrical Engineering
Linköping University, Sweden
e-mail: {frisk,matkr}@isy.liu.se
Abstract

Problems with starter batteries in heavy-duty trucks can cause costly unplanned stops along the road. Frequent battery changes can increase availability but are expensive and sometimes not necessary, since battery degradation is highly dependent on the particular vehicle usage and ambient conditions. The main contribution of this work is a case study where prognostic information on remaining useful life of lead-acid batteries in individual Scania heavy-duty trucks is computed. A data-driven approach using random survival forests is used, where the prognostic algorithm has access to fleet operational data including 291 variables from 33603 vehicles from 5 different European markets. A main implementation aspect that is discussed is the treatment of accumulative variables, such as vehicle age, in the approach. Battery lifetime predictions are computed and evaluated on recorded data from Scania's fleet-management system, and the effect of how accumulative variables are handled is analyzed.

© 2015, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
Keywords: Battery prognostics, reliability, survival analysis, machine learning, classification
* This work was sponsored by Scania and FFI - Strategic Vehicle Research and Innovation (Swedish Governmental Agency for Innovation Systems) and the Swedish Research Council within The Linnaeus Center CADICS.

2405-8963 © 2015, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.
Peer review under responsibility of International Federation of Automatic Control.
10.1016/j.ifacol.2015.09.512

1. INTRODUCTION

To efficiently transport goods by heavy-duty trucks it is important that vehicles have a high degree of availability and, in particular, avoid being left standing by the road unable to continue the transport mission. An unplanned stop by the road is costly not only because of the delay in delivery, but it can also lead to damaged cargo.

One cause of unplanned stops is a failure in the electrical power system, and in particular the lead-acid starter battery. The main purpose of the battery is to power the starter motor to get the diesel engine running, but it is also used to, for example, power auxiliary units such as heating, cooling, and kitchen equipment. High availability can be achieved by changing batteries frequently, but such an approach is expensive both due to unnecessary maintenance actions and due to the cost of the batteries. In addition, battery degradation is highly dependent on the particular battery usage and ambient conditions.

A non-parametric and data-driven prognostics approach was developed in (Frisk et al., 2014) to compute, on an individual vehicle basis, prognostic information on remaining useful life of the lead-acid batteries. Prognostic information is computed by applying a tree based classification method called Random Survival Forests (RSF) (Ishwaran et al., 2008; Ishwaran and Kogalur, 2010) on fleet operational data from the heavy-duty truck manufacturer Scania. The approach can be classified as a reliability function based prognostic approach (Linxia and Köttig, 2014).

The basic idea is to classify vehicles with similar battery degradation and, for each class, estimate a reliability function in a training phase. Then, when prognostics for a specific vehicle is computed, the reliability function can be obtained by identifying which class the vehicle belongs to and computing the corresponding reliability function. The data contains accumulative variables such as driven distance and vehicle age. The accumulative variables will increase over time, and if these variables are used in the classification, a vehicle used in a similar way for its entire life will change class over time, which is not desirable. The main contribution of this work is to investigate how accumulated variables can be handled in RSF and how they affect the result.

The outline of the paper is as follows. First, Section 2 introduces data and briefly recalls a case study (Frisk et al., 2014) based on the same data set. Section 3 states the studied problem. Section 4 recalls how to estimate battery degradation properties based on fleet operational data by using random survival forests. One characteristic of the data set is that it contains variables that are accumulated over time, and how they can be introduced in the approach is discussed in Section 5. Finally, Section 6 analyzes and discusses how the prognostic result depends on the way accumulated variables are included, and then some conclusions are given in Section 7.

2. BACKGROUND

There exist a number of approaches in the literature to do prognostics. A physics based approach is to look for trends in measured or estimated component health status indicators, see e.g. (Heng et al., 2009). Then, extrapolating computed health status indicators gives an indication of the amount of useful life left in the component. Such an approach requires reliable degradation models or measurements closely related to battery health, neither of which are available in this work. An alternative to a physics based approach where the battery health is estimated directly is to rely on recorded data from a large number of vehicles.

[Figure 1. Normalized histogram of time stamp for vehicles with and without battery problems. Horizontal axis: time unit; vertical axis: normalized relative frequency; series: without battery problems, with battery problems.]

[Figure 2. Variable importance. Panel title: VIMP with accumulative variables (min node size 200). Variables listed from top to bottom: BattVoltTemp_I1_p1, SpeedDist_p2, PowerOffSOC_I4_p2, SOCPowerOffTime_I3_Ptail, PowerOffSOC_I6_c2, SOCPowerOffTime_I3_p3, BattVoltTemp_I3_c3, BattVolt_p1, Distance, SpeedTM_pct50, NoKitchen, BattVoltTemp_I1_p2, SpeedDist_p10, Country Id, BattVoltTM_p4, FuelConsumption_c10, BattVoltTM_p2, ChassiNo, Create year, BattVolt_p2, Battery pos., BattVoltTemp_I3_p2, Create month, SnapshotNo, Age. Horizontal axis: 0 to 0.06.]

This paper explores a data-driven approach where the prognostic algorithm has access to fleet operational data and some characteristics of the data are:
• 33603 vehicles logged from 5 different markets.
• 291 variables are logged for each vehicle.
• No time series, only aggregated data like traveled distance, year of delivery, histogram of ambient temperatures.
• Heterogeneous data; mix of numerical values such as temperatures and pressures with categorical data such as battery mount point or wheel configuration.
• Data set includes histogram variables.
• Significant missing data rate (≈ 15%).
• Each vehicle with a replaced battery has logged time of failure.
• There are many vehicles where battery failure has not occurred before the time of observation, i.e., data are right censored.

Figure 1 shows normalized relative frequency of logged time in the data set. The red bars show the time of failure for vehicles with battery problems and the blue bars show time of logged data for vehicles with no battery problem. The histogram for vehicles with no battery problems thus reflects the last time data was logged from the vehicle, which is approximately the age of the vehicle. Time is originally in days but has been scaled to time units to avoid revealing sensitive information. A first observation is that some batteries fail much earlier than others, and in (Frisk et al., 2014) it has been shown that battery usage and vehicle configuration have a big impact on battery degradation. For example, the battery failure rate is significantly different for different vehicles, e.g., a long-haulage vehicle with a large battery, kitchen equipment, and driving in cold weather may experience significantly different battery degradation behavior than a city distribution truck. A more detailed discussion is given in Section 4. Hence there clearly is potential in vehicle individual maintenance plans.

2.1 Prognostics Approach

Let $T$ be the random variable of failure time. Then the reliability function, sometimes referred to as the survival function, is the probability of survival up to time $t$, i.e.,

$$R(t) = P(T \ge t) \tag{1}$$

which is a fundamental object in the prognostics analysis. Since vehicle configuration and usage is important for battery reliability, let $V$ denote configuration and usage data for a vehicle and let $R^V(t)$ denote the reliability function for that particular vehicle. In (Frisk et al., 2014) Random Survival Forests (RSF) (Ishwaran et al., 2008; Ishwaran and Kogalur, 2010) have been used to estimate vehicle specific reliability functions. The key motives for using random survival forests for the available data are

• it handles heterogeneous data; both discrete and continuous valued variables
• it handles missing data
• it is non-parametric, i.e., does not rely on a specific hazard function parameterization like proportional hazards
• it handles censored data

The basic idea of the approach can loosely be stated as utilizing a classifier to cluster vehicles with similar battery degradation properties. Then a non-parametric estimate for the reliability function $R^V(t)$ is computed for a specific vehicle $V$ using only the vehicles in the corresponding vehicle cluster.

The RSF algorithm also automatically computes which variables are important for clustering vehicles with similar battery degradation, i.e., which variables are important for predicting battery degradation. Figure 2 shows a list of the 20 most important variables, when considering also accumulative variables, and their variable importance (VIMP), which is defined and discussed in more detail in (Ishwaran et al., 2008, 2007). The most important variable, and its corresponding strength, is the one at the bottom of Figure 2.

There are configuration variables such as battery position Battery pos. and country index Country Id, and usage variables such as battery voltage BattVolt_p2 and driven distance Distance. Configuration variables describe the vehicle configuration, which does not change with time, while usage variables change as the vehicle is used. A variable named x_pi represents the frequency of a bin in histogram x, x_ci a cumulative sum of bins in histogram x, x_pct50 the median of histogram x, and x_Ptail the weight of the tails of histogram x, see (Frisk et al., 2014) for more details.

2.2 Accumulative Variables

Most of the variables in Figure 2 are more or less constant over time if the vehicle is operated in a similar way over time. However, there are a couple of exceptions. Vehicle age will of course increase with time and if this variable is used as a classification variable there is a risk of estimating
a reliability function based on vehicles observed only with a similar age. Then the reliability function estimate will only be changing values in a tight age interval. If age $t$ would be omitted as a classification variable, it would still be used in the prediction step since $t$ is used to evaluate the reliability function $R^V(t)$, see Section 3. Another example of a variable that is accumulated over time is the traveled distance. Variables that are accumulated over time will be called accumulative variables and this paper investigates how to include accumulative variables in RSF.

3. PROBLEM FORMULATION

The problem studied in this paper is to compute a probabilistic measure of the remaining useful life of a particular vehicle with a well functioning battery at a specified time $t = t_0$. As before, let $T$ be the time of failure for the battery in a specific vehicle and let $V$ denote usage and configuration data for the vehicle. The objective is to estimate the function

$$B(t; t_0, V) = P(T \ge t + t_0 \mid T \ge t_0, V), \quad t \ge 0 \tag{2}$$

which describes, for a specific vehicle $V$, the probability that the battery will be operational at least $t$ time units after $t_0$. This function is closely related to the reliability function $R(t)$. Let $R^V(t)$ be the reliability function for a specific vehicle $V$, then

$$B(t; t_0, V) = P(T \ge t + t_0 \mid T \ge t_0, V) = \frac{P(T \ge t + t_0 \mid V)}{P(T \ge t_0 \mid V)} = \frac{R^V(t + t_0)}{R^V(t_0)} \tag{3}$$

The basic problem is then to, given the usage data for a vehicle $V$, estimate $R^V(t)$ and then compute $B(t; t_0, V)$ according to (3).

A key problem is how to handle accumulative variables in the classification method. The main objectives of the paper are to, in a case study with heavy-duty truck data, analyze and compare the difference in the results obtained with or without including accumulative variables in the classification approach. In particular, the effects on variable importance and the estimate of the reliability function $R^V(t)$ for a specific vehicle $V$ will be studied. Vehicles with similar age and distance but different battery predictions are compared and their differences in operation and configuration are analyzed.

4. PROGNOSTICS WITH RANDOM SURVIVAL FORESTS

This section will briefly outline the algorithm used to estimate the battery prognostics function $B(t; t_0, V)$ as defined in (2). The key step, from (3), is to estimate the reliability function (1). Thus, a reliable estimate of the reliability function $R^V(t)$ for a specific vehicle $V$ makes it possible to compute the prognostics function $B(t; t_0, V)$.

4.1 Reliability Function Estimation

Basic techniques for maximum-likelihood estimation of reliability functions can be found in (Cox and Oakes, 1984). As will be described below, they are not directly applicable to this case, but they are useful, so a brief summary of a basic result is first given for reference. Derivations and details of these expressions can be found in (Cox and Oakes, 1984). Now, assume $N$ vehicles with age $t_i$ and response variable $c_i$ for $i = 1, \ldots, N$. The response variable $c_i = 0$ if the $i$:th vehicle does not have battery problems at time $t = t_i$ and $c_i = 1$ if the vehicle has battery problems. Then, in discrete time, the maximum-likelihood estimator of the hazard function, i.e., immediate hazard-rate, at time-point $t = t_i$ can be found as

$$\hat{h}_i = \frac{d_i}{u_i} \tag{4}$$

where $d_i$ and $u_i$ are the number of battery failures and the number of vehicles at risk at time $t = t_i$, respectively. Here it is explicitly taken into consideration that data might be right censored, i.e., the time of battery failure is unknown but is known to be greater than the time of observation. The Kaplan-Meier (Product-limit) estimator of the reliability function $R(t)$ is then

$$\hat{R}(t_i) = \prod_{t_j < t_i} (1 - \hat{h}_j) \tag{5}$$

This means that expressions (4) and (5) can be used to estimate the reliability function $R(t)$, and thereby the battery prognostic function $B(t; t_0, V)$ using (3).

4.2 Battery Degradation Characteristics

As described in Section 2, the battery failure rate is significantly different in different vehicles, e.g., a long-haulage vehicle with a large battery, kitchen equipment, and driving in cold weather may experience significantly different battery degradation behavior than a city distribution truck. To illustrate this, Figure 3 shows the Kaplan-Meier estimate (5) for the full data set. This estimate would be useful if it were true that the battery degradation is equal for all vehicles, no matter the vehicle configuration or usage. Figure 4 shows corresponding estimates for classes of vehicles with different battery mount position (a) and different temperature statistics (b). The blue curve in Figure 4 a/b corresponds to the full set of vehicles in the database, as shown in Figure 3. Since the estimated reliability functions deviate significantly from the blue curve for different sets of vehicles, it is clear that battery degradation characteristics depend significantly on which set of vehicles is investigated. Further, this means that there is a need to estimate the battery reliability function for each specific vehicle and (5) cannot be directly applied.

[Figure 3. Reliability function estimate for the full data set. Panel title: All vehicles; horizontal axis: time unit; vertical axis: R(t).]

4.3 Reliability Function Estimation for a specific Vehicle V

From the discussion above, the 291 variables that are stored for each vehicle and describe vehicle configuration and
[Figure 4: two panels plotting the estimated reliability function R(t) against time (time unit, 0 to 16). Panel (a), "Battery Mount Position": curves for All, Left hand side, Rear Frame End, and Missing. Panel (b), "Time at low voltages (26-27 V) when cold outside (-5 to -10 deg C)": curves for All, Little time with low voltage during cold, and Significant time with low voltage during cold.]

Figure 4. Reliability function estimation for different battery positions (a) and for vehicles with different amounts of time with low battery voltage during cold ambient temperatures (b).
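The per-class curves in Figure 4 are Kaplan-Meier estimates computed on subsets of the fleet. The following is a minimal sketch of that idea, assuming the standard product-limit form of the estimator; the function names, the record layout, the coding of the event flag (1 for an observed failure, 0 for a censored observation), and the toy data are illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict


def kaplan_meier(durations, events):
    """Product-limit estimate of the reliability function R(t).

    durations: observed time for each unit (failure time or censoring time)
    events:    1 if a failure was observed at that time, 0 if censored
               (this coding is an assumption made for the sketch)
    Returns a list of (time, R(time)) pairs at the distinct failure times.
    """
    failures = defaultdict(int)  # number of failures observed at each time
    leaving = defaultdict(int)   # number of units leaving the risk set at each time
    for t, e in zip(durations, events):
        leaving[t] += 1
        if e:
            failures[t] += 1

    at_risk = len(durations)
    reliability = 1.0
    curve = []
    for t in sorted(leaving):
        d = failures.get(t, 0)
        if d:
            reliability *= 1.0 - d / at_risk
            curve.append((t, reliability))
        at_risk -= leaving[t]
    return curve


def kaplan_meier_by_class(records):
    """records: iterable of (class_label, duration, event) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for label, duration, event in records:
        grouped[label][0].append(duration)
        grouped[label][1].append(event)
    return {label: kaplan_meier(d, e) for label, (d, e) in grouped.items()}


# Hypothetical toy data: (battery position, time, failure flag).
records = [
    ("left", 2, 1), ("left", 3, 0), ("left", 4, 1), ("left", 5, 0),
    ("rear", 1, 1), ("rear", 2, 1), ("rear", 6, 0),
]
curves = kaplan_meier_by_class(records)
for label, curve in sorted(curves.items()):
    print(label, curve)
```

Curves that differ clearly between classes, as in this toy data, are the kind of evidence the text uses to argue that one fleet-wide estimate is not enough.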

usage need to be taken into account when estimating the reliability function. As said in Section 2, the basic idea of the approach can loosely be stated as using a classifier to cluster vehicles with similar battery degradation properties. Then a non-parametric estimate of the reliability function RV(t) is computed for a specific vehicle V using only the vehicles in the corresponding vehicle cluster. The approach is based on Random Survival Forests (Ishwaran et al., 2008; Ishwaran and Kogalur, 2010). A random survival forest is a survival analysis extension of Random Forests (Breiman, 2001), which is a tree-based classifier (Breiman et al., 1984) extended with bootstrap aggregation (Breiman, 1996) techniques.

There are 291 variables stored for each vehicle, and the data includes 17 histograms. The treatment of histogram variables is not described here in detail; the procedure can be found in (Frisk et al., 2014), but the key step is that additional variables are derived to take these histogram variables into account. This results in a total of 1031 variables for each vehicle. To keep computational complexity down when building the random survival forest, the data size is reduced, using the procedure described in (Frisk et al., 2014), to 113 or 116 variables depending on how accumulative variables are handled. The treatment of accumulative variables is further discussed in Section 5. This corresponds to a slightly modified version of the approach from (Frisk et al., 2014), and the flowchart in Figure 5 outlines the procedure. The procedure to build a random survival forest model here then comprises the steps

(1) Collect the data
(2) Handle histogram variables as in (Frisk et al., 2014)
(3) Reduce data size as in (Frisk et al., 2014)
(4) Build the model, see Section 6.1

When built, the random survival forest model is able to predict a reliability function RV(t), and also B(t; t0, V), based on vehicle data V. The experiments are conducted in R (R Core Team, 2014) using the package Random Forests for Survival, Regression and Classification (Ishwaran and Kogalur, 2013).

[Figure 5: flowchart. Fleet data (33603 vehicles, 291 variables) -> Histogram variables -> Fleet data (33603 vehicles, 1031 variables) -> Data reduction -> Fleet data (33603 vehicles, 113/116 variables) -> Build model -> Random Survival Forest Model, which maps vehicle data V to RV(t) and B(t; t0, V).]

Figure 5. Flowchart of the reliability function estimation procedure.

5. ACCUMULATIVE VARIABLES

This section describes how variables, and especially the accumulative variables, are prepared for the classification step in RSF. The basic principle is that a vehicle operated similarly over time should remain in the same class. In that case, vehicles of different ages with similar operation characteristics will be collected in one class, and a reliability function estimate for that type of operation can be computed. Accumulative variables do not have this property and need to be modified.

In the data there are 17 histograms with bin values in units of time, distance, count, and fuel volume, which are all accumulative entities. All 17 histograms are normalized such that the sum of the bin values equals 1. The variable Age is removed from the set of classification variables, but the information on vehicle age is still used for estimating the reliability function. The variable Distance is an accumulative variable and is replaced by distance per day, MileagePerDay.

Figure 6 shows the variables most correlated with age. There is a strong correlation between Create month and Age, which is caused by the way the data has been collected. Data from all vehicles up to a certain age was exported on a single date. This date minus Create month is an upper bound on, and often also a good estimate of, vehicle age. Even though vehicle age can be estimated from Create month, Create month is not accumulated over time and is therefore kept as a classification variable. By using Create month in the classification, seasonal and component quality variations for different months can affect the reliability function estimation.
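The two transformations described above, normalizing each histogram so that its bins sum to 1 and replacing the accumulated distance by a per-day rate, can be sketched as follows. This is only an illustration of the principle that vehicles with the same usage pattern but different ages should end up with the same classification features; the field names (`speed_hist`, `distance`, `age_in_days`), the choice of days as the age unit, and the example numbers are hypothetical.

```python
def normalize_histogram(bin_values):
    """Scale accumulative bin values so that they sum to 1.

    An all-zero histogram is returned as all zeros to avoid dividing by zero.
    """
    total = sum(bin_values)
    if total == 0:
        return [0.0] * len(bin_values)
    return [value / total for value in bin_values]


def mileage_per_day(distance, age_in_days):
    """Replace the accumulated distance by an average daily rate."""
    if age_in_days <= 0:
        raise ValueError("age_in_days must be positive")
    return distance / age_in_days


def classification_features(vehicle):
    """Build age-independent features from one (hypothetical) vehicle record."""
    return normalize_histogram(vehicle["speed_hist"]) + [
        mileage_per_day(vehicle["distance"], vehicle["age_in_days"])
    ]


# Two hypothetical vehicles with the same usage pattern but different ages.
young = {"speed_hist": [10.0, 30.0, 60.0], "distance": 50_000.0, "age_in_days": 250}
old = {"speed_hist": [40.0, 120.0, 240.0], "distance": 200_000.0, "age_in_days": 1000}

young_features = classification_features(young)
old_features = classification_features(old)
print(young_features)
print(old_features)
```

The raw histograms and distances of the two vehicles differ by a factor of four, while the transformed features coincide, which is the property the classification step relies on.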


[Figure 6: horizontal bar chart titled "Correlation with age", horizontal axis from 0 to 1. Variables listed from top to bottom: Age, Create month, Create year, Distance, Battery pos., BrakeStartSpeed_I0, CoolantTempTime_I3, CoolantTempTime_I2, LowerBed, StartMotorTime_I3, InAirTempTime_I7, InAirTempTime_I2, StartMotorTime_I1, InAirTempTime_I8, InAirTempTime_I6, InAirTempTime_I3, RoadSlopePercDist_I4, InletAirTempTime_I5, BrakeStartSpeed_I1, InletAirTempTime_I9, BrakeStartSpeed_I5, SpeedDist_I0, SnapshotNo, InAirTempTime_I1, RoadSlopePercDist_I6.]

Figure 6. Most correlated variables with age.

Another non-accumulative variable that has been removed is the snapshot number SnapshotNo, which is a serial number assigned to each vehicle data download in chronologically increasing order. It has been removed because the data collection method has introduced a correlation with battery failure in the following way. For vehicles with working batteries, the snapshot is taken within a maintenance interval from the date of data collection. The snapshot for vehicles with battery problems is taken at the time of battery failure. Hence a low snapshot number will correlate with battery problems. However, in a real situation the snapshot number will not be correlated with battery failure and should therefore not influence the reliability function estimation.

To conclude this section, the differences between the classification variables used here and those used in (Frisk et al., 2014) are summarized. In (Frisk et al., 2014) the histograms were normalized, so the changes from that work to this work are

• Age, SnapshotNo, and ChassiNo have been removed
• Distance has been replaced by MileagePerDay

6. CASE STUDY: BATTERY PROGNOSTICS

The objective now is to use the methodology described in Section 4 to estimate battery prognostic functions and to analyze the results based on the discussions in Section 5. To avoid revealing sensitive information, the presented data is normalized. A fundamental property when analyzing the data set is that there is no ground truth, i.e., the true battery degradation behaviors for the set of vehicles are not known. Therefore, a discussion of why the predicted battery prognostic functions are reasonable is included at the end of the section.

6.1 Building the models from data

First, the approach from Section 4 is used to build the random survival forest models. Two models are built, one using accumulative variables, denoted Macc, and one that does not, denoted Mno acc. To build the models, the software package (Ishwaran and Kogalur, 2013) is used, and there are 4 main parameters to be chosen in the software package

• number of trees to grow in the forest
• minimum size of terminal nodes
• number of random split variables
• number of random split values

The discussion on these parameters requires some detailed knowledge of random survival forests, and a reader not familiar with the technique can skip this part. Selection of these parameters is important for the result. In (Frisk et al., 2014), an investigation of a proper size of the terminal nodes was conducted, and the size is here also chosen to be 200. Thus, each class in each tree in the grown forest consists of at least 200 vehicles. Further, when building each tree in the forest, a random procedure is used at each node to select the next split variable. A rule of thumb (Ishwaran and Kogalur, 2013) is to randomly try √n variables, where n is the total number of variables. Here, n is 113 and 116, respectively, for the cases with and without accumulative variables. Thus, the number of randomly chosen variables to try at each split would be ≈ 11; here 13 variables are used. To find a split value for the corresponding node in the tree classifier, a randomized procedure could be used to speed up the process, but here a complete search is performed instead.

The fourth and last of the key parameters in the algorithm is the number of trees to grow in the forest. Analysis of the prediction error rate is useful for selecting the number of trees. The error rate measures how well the forest ranks two random individuals in terms of survival; 0 is perfect and 0.5 is no better than guessing. The error rate can be interpreted as the probability of incorrectly ranking the survival of batteries in two random vehicles. Formally, the error rate is 1 − C, where C is Harrell’s concordance index (Harrell et al., 1982). Figure 7 plots the error rate as a function of the number of trees for both models, i.e., the model with accumulative variables, Macc, and the model without accumulative variables, Mno acc. From this plot it is clear that, based on the error rate, there is no reason to grow more than about 200-300 trees in the forest. Here, 300 classification trees are grown in the forest for both models. Another observation is that the model Macc obtains a significantly lower error rate than Mno acc. This is to be expected since the variables in Macc are a superset of the variables in Mno acc. However, this should not immediately be interpreted as Macc being a more accurate model, for the reasons outlined in Sections 3 and 5.

[Figure 7: error rate (0.1 to 0.35) plotted against number of trees (0 to 300), with one curve labeled "With no acc. variables" and one labeled "With acc. variables".]

Figure 7. Error rate as a function of number of trees for two models, one with and one without accumulative variables.

With the parameter values chosen, building the random survival forest models Macc and Mno acc takes about 62


minutes each. The computer used has 128 GB of RAM and 2 Intel Xeon X5675 processors (12M cache, 3.06 GHz), resulting in 12 cores and 24 logical processors. In the experiment, 20 of the 24 logical processors were allocated to the tree computation. Note that training the forest is a one-time task, at least until more data becomes available, and predicting the reliability for a given vehicle takes about 25 seconds.

6.2 Variable Importance Analysis

Figure 2 and Figure 8 show variable importance for the models Macc and Mno acc, respectively. A comparison of these figures shows how variable importance is influenced by different treatments of the accumulative variables.

First, recall that the variables Age, SnapshotNo, ChassiNo, and Distance are not included in Mno acc and hence not included in Figure 8. A comparison of the remaining variables shows that Create month is the most important variable in both cases. It is interesting to note that in Figure 8 Create year and Battery pos. are ranked higher than in Figure 2. It is also interesting to note that the 3 most important variables in Figure 8 are the 3 variables most correlated with age according to Figure 6. The variable Age is important according to Figure 2, and when Age is not used, the most correlated variables Create month, Create year, and Battery pos. can provide information about age.

Some battery-related variables are important in both cases, such as BattVoltTemp_I3_p2 and BattVolt_p2. It is also worth noting that MileagePerDay is ranked higher than Distance. Also, Country Id is more important when accumulative variables are not used.

In conclusion, Create month, Create year, and Battery pos. are the most important variables in Mno acc, but further investigations are needed to understand whether their importance is due to quality variations of batteries over time, to seasonal conditions degrading the battery differently over time, or to their correlation with vehicle age.

[Figure 8: horizontal bar chart titled "VIMP without accumulative variables (min node size 200)", horizontal axis from 0 to 0.018. Variables listed from top to bottom: PowerOffSOC_I3_p3, AtmPress_var, PowerOffSOC_I6_c2, SpeedDist_p2, SOCPowerOffTime_I3_p3, BattVolt_p1, NoKitchen, PowerOff_p3, KickDownRel, PowerOffSOC_I4_p2, SpeedDist_p10, BattVoltTemp_I3_c3, SOCPowerOffTime_I3_Ptail, BattVoltTemp_I1_p2, MileagePerDay, SpeedTM_pct50, FuelConsumption_c10, BattVoltTM_p2, BattVoltTM_p4, Country Id, BattVolt_p2, BattVoltTemp_I3_p2, Battery pos., Create year, Create month.]

Figure 8. VIMP without accumulative variables.

6.3 Battery prognostics

Given the two estimated models, Macc and Mno acc, we can now estimate the battery prognostic function (2) given vehicle data V. It is clear that there is an age component to battery degradation, either directly or indirectly, for example due to longer exposure to low temperatures or vibrations. The objective of the prognostics approach is to find a vehicle-individual maintenance plan that is not based on age or distance. A set of vehicles to analyze further is therefore needed. The set of vehicles to predict and further analyze is selected such that the vehicles have similar age and distance properties, i.e., vehicles that with a fixed maintenance schedule based on age or distance would have a similar time to the next maintenance.

Now, let W0 be the set of vehicles in the original database with no battery problems, and let the functions age(V) and distance(V) give the age and distance traveled, respectively, for a given vehicle V. Then, a set of vehicles with age about 5 time units is extracted as

W1 = {V; V ∈ W0 and 4.85 ≤ age(V) ≤ 5.15}

Let m be the mean distance traveled among the vehicles in W1; then the final set of vehicles W consists of the vehicles in W1 with distance traveled within 10% of the mean distance, i.e.,

W = {V; V ∈ W1 and 0.9 m ≤ distance(V) ≤ 1.1 m}

The resulting set W ⊆ W0 consists of 144 vehicles with no reported battery problems and similar age and traveled distance.

Figure 9 shows the predicted battery prognostic function B(t; t0, V) for the 144 vehicles in W using both models. From Figure 9 it is evident that there is a wide spread among the battery prognoses, and this is true regardless of whether the accumulative variables are included in the model or not. Let T90 denote the maximum time where we have more than 90% confidence that the battery will be operational, i.e.,

T90 = max {t; B(t; t0, V) ≥ 0.9}

In Figure 9(a), with predictions using the model Mno acc, it is clear that T90 varies from about 1.3 time units for the vehicle with the worst prognosis to more than 8 time units for the vehicle with the best prognosis. A similar situation occurs when also using accumulative variables in the prediction. This is interesting since the models predict what Figure 4 showed, namely that vehicle configuration and usage significantly influence the battery prognosis.

To further analyze the results of Figure 9, we identify the vehicles with the most extreme battery prognoses. Let V1 and V2 denote the vehicles in W with the best and worst prognosis using the model Mno acc, i.e., the functions in Figure 9(a) that are highest and lowest, respectively. Figure 10 shows the battery prognostics function for vehicles V1 and V2, where it is evident that the estimated prognoses for these two vehicles are significantly different and that the vehicles need different maintenance plans. Identifying the vehicles with the best and worst prognosis using model Macc instead of model Mno acc shows a similar difference. It turns out that the vehicle with the best predicted prognosis is the same for both models, but the vehicle with the worst predicted prognosis differs between the two models. Denote the vehicle with the worst predicted prognosis using model Macc by V3. Thus, three vehicles with extreme differences in battery prognosis have been identified as:

• V1 - best predicted prognosis using both models Mno acc and Macc.
• V2 - worst predicted prognosis using model Mno acc.
• V3 - worst predicted prognosis using model Macc.

Table 1 shows some detailed data from vehicles V1, V2, and V3. Not all variables are included in the table, only the 25 most important according to Figure 8 together with the distance and age of the vehicles. The data has been normalized such


[Figure 9: two panels plotting B(t; t0, V) (0.2 to 1) against time (time unit, 0 to 8). Panel (a): Mno acc. Panel (b): Macc.]

Figure 9. Estimated battery prognostic function of the 144 vehicles in W using both random survival forest models.
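The T90 values read off the curves in Figure 9 can also be computed directly from a sampled prognostic function. The sketch below assumes that B(t; t0, V) is available as values on a finite time grid and returns the largest grid time at which the value is still at least 0.9; the function name `t90`, the grid, and the example values are hypothetical and only illustrate the definition.

```python
def t90(times, b_values, level=0.9):
    """Largest sampled time t with B(t) >= level, or None if no sample qualifies.

    times:    sampled time points (any order)
    b_values: prognostic function values B(t; t0, V) at those time points
    """
    if len(times) != len(b_values):
        raise ValueError("times and b_values must have the same length")
    qualifying = [t for t, b in zip(times, b_values) if b >= level]
    return max(qualifying) if qualifying else None


# Hypothetical sampled prognostic function for one vehicle.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
b_values = [1.0, 0.97, 0.92, 0.88, 0.80]
print(t90(times, b_values))
```

Because the result is restricted to the sampled grid, a finer grid gives a sharper value; interpolation between samples is deliberately left out of this sketch.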
Table 1. Normalized data for the three vehicles V1, V2, and V3, which correspond to the vehicle with best prognosis, the worst prognosis according to model Mno acc, and the worst prognosis according to model Macc.

Variable                      Vehicle V1      Vehicle V2      Vehicle V3
Distance (norm)               1.00            0.93            0.87
Age (norm)                    1.00            1.00            1.02
Create month                  2011092         2011101         2010122
Create year                   2011            2011            2010
Battery pos.                  Left hand side  Left hand side  Left hand side
BattVoltTemp_I3_p2 (norm)     1.00            10.20           24.48
BattVolt_p2 (norm)            1.00            10.61           26.64
Country Id                    2               1               0
BattVoltTM_p4 (norm)          1.00            5.89            12.34
BattVoltTM_p2 (norm)          1.00            23.25           84.70
FuelConsumption_c10 (norm)    1.00            0.79            0.52
SpeedTM_pct50 (norm)          1.00            1.09            1.36
MileagePerDay (norm)          1.00            0.93            0.86
BattVoltTemp_I1_p2 (norm)     1.00            9.69            28.99
SOCPowerOffTime_I3_Ptail      -               1.0             1.0
BattVoltTemp_I3_c3 (norm)     1.00            11.78           34.94
SpeedDist_p10 (norm)          1.00            0.66            0.35
PowerOffSOC_I4_p2 (norm)      1.00            16.91           28.64
KickDownRel                   -               1.0             1.0
PowerOff_p3 (norm)            1.00            0.64            1.12
NoKitchen                     yes             no              no
BattVolt_p1 (norm)            1.00            11.65           31.22
SOCPowerOffTime_I3_p3         -               1.0             1.0
SpeedDist_p2 (norm)           1.00            0.89            0.76
PowerOffSOC_I6_c2 (norm)      1.00            1.11            1.11
AtmPress_var (norm)           1.00            3.48            1.37
PowerOffSOC_I3_p3             -               1.0             1.0
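The "(norm)" rows in Table 1 express each value relative to the reference vehicle V1. A minimal sketch of such a normalization is given below. The function name, the raw numbers, and the category labels are hypothetical, and the handling of a zero reference value (reporting the reference as missing and leaving the other values unscaled) is an assumption made for the sketch, not a description of how the table was produced.

```python
def normalize_to_reference(raw, reference="V1", categorical=False):
    """Express one variable's raw values relative to a reference vehicle.

    raw:         dict mapping vehicle name to the raw value of the variable
    reference:   vehicle whose value becomes 1 after normalization
    categorical: categorical variables are returned unchanged
    """
    if categorical:
        return dict(raw)
    ref_value = raw[reference]
    if ref_value == 0:
        # Assumption: a ratio is undefined, so the reference is reported as
        # missing (None) and the remaining values are left unscaled.
        return {name: (None if name == reference else value)
                for name, value in raw.items()}
    return {name: value / ref_value for name, value in raw.items()}


# Hypothetical raw values for three vehicles.
low_voltage_time = {"V1": 4.0, "V2": 10.0, "V3": 30.0}
region = {"V1": "A", "V2": "B", "V3": "C"}

print(normalize_to_reference(low_voltage_time))
print(normalize_to_reference(region, categorical=True))
```

Read this way, a normalized value of 10 for a vehicle means that the underlying quantity is ten times that of the reference vehicle.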

that the vehicle V1 with best battery prognosis has variable values 1, except when the unnormalized value is 0 or the variable is categorical, such as Country Id. The variables for vehicles V2 and V3 that are significantly different from vehicle V1 are mainly variables related to low battery voltage, sometimes at specified temperature intervals. For example, BattVoltTemp_I3_p2 is the relative time spent in 10-25 °C with 26-27 V, where the normal voltage is up to 30 V. This is a good temperature for the battery, and low voltages are not expected to be common in this temperature interval. The batteries in V2 and V3 had this condition 10.20 and 24.48 times as frequently as the battery with good prognosis in vehicle V1. Further, comparing vehicles V2 and V3, it is clear that the main differences compared to vehicle V1 are in the same variables. As noted in the beginning of this section, there is no ground truth available, but one conclusion so far is that for vehicles with similar age and distance, the most important differences are related to low battery voltage, independent of the treatment of accumulative variables, which is consistent with engineering experience. Thus, the procedure managed to automatically produce relevant battery prognostic functions, separating vehicles that otherwise would have had the same maintenance schedule.

To further analyze the effect of using accumulative variables, Figure 11 shows the predicted battery prognostic function for vehicles Vi using both models, with and without the accumulative variables. Let B_i^{no acc} and B_i^{acc} denote the battery prognostic function for vehicle Vi using models Mno acc and Macc, respectively. From the figure it is clear that choice of model has significant influence on the


estimate of the prognostic function; compare for example B_3^{no acc} with B_3^{acc}. This indicates that the choice of model is important. In this case, the model Macc produces mostly conservative prognostic estimates, i.e., the dashed lines lie below the corresponding solid lines in the figure for most of the prediction horizon. However, this is not generally true. Consider a vehicle that is old but has been used in a way that is not damaging for the battery, for example operated mainly at +20 °C and at low speeds with low levels of vibrations. That vehicle would, in the Macc model, be placed in the same class as other equally old vehicles. This is because the age variable is so important in the classifier, as shown in Figure 2. This would not be true for the model Mno acc, where the vehicle would be associated with vehicles with similar usage profile and configuration. This is a key difference between the models and the main reason why Mno acc is preferable to Macc.

[Figure 10: B(t; t0, V) (0.2 to 1) plotted against time (time unit, 0 to 8) for vehicles V1 and V2.]

Figure 10. Battery prognostics function for vehicles in W with best and worst predicted prognoses using model Mno acc.

[Figure 11: B(t; t0, V) (0.2 to 1) plotted against time (time unit, 0 to 8), with curves B_i^{no acc} and B_i^{acc} for i = 1, 2, 3.]

Figure 11. Battery prognostic function estimates for vehicles V1, V2, and V3 using models Mno acc and Macc.

7. CONCLUSIONS

A high degree of availability and reliability is important in many businesses, in particular for heavy-duty trucks, and the lead-acid battery is one important component to maintain. The battery is a difficult component to predict since degradation heavily relies on usage profile, vehicle configuration, and ambient conditions.

A contribution is a case study utilizing the data-driven approach random survival forests to compute probabilistic reliability properties for a battery in a specific vehicle. The case study is based on vehicle data from 33603 vehicles. A main contribution of the paper is to analyze and compare the difference in the results obtained with or without including accumulative variables in the classification approach.

A first conclusion is that if the accumulative and most important variable Age is removed, the three variables most strongly correlated with age become the most important. A second conclusion is that the estimated battery prognostic function changes significantly if accumulative variables are omitted. A third conclusion is that when looking at vehicles with the same age and driven distance but with significantly different battery predictions, the main differences are in variables related to battery properties, such as relative time with low voltage or relative time with a certain voltage at a specified temperature interval.

REFERENCES

Breiman, L. (1996). Bagging predictors. Machine learning, 24(2), 123–140.
Breiman, L. (2001). Random forests. Machine learning, 45(1), 5–32.
Breiman, L., Friedman, J., Stone, C.J., and Olshen, R.A. (1984). Classification and regression trees. CRC press.
Cox, D.R. and Oakes, D. (1984). Analysis of survival data, volume 21. CRC Press.
Frisk, E., Krysander, M., and Larsson, E. (2014). Data-driven lead-acid battery prognostics using random survival forests. In Proceedings of the Annual Conference of The Prognostics and Health Management Society. Fort Worth, Texas, USA.
Harrell, F.E., Califf, R.M., Pryor, D.B., Lee, K.L., and Rosati, R.A. (1982). Evaluating the yield of medical tests. JAMA, 247(18), 2543–2546.
Heng, A., Zhang, S., Tan, A.C., and Mathew, J. (2009). Rotating machinery prognostics: State of the art, challenges and opportunities. Mechanical Systems and Signal Processing, 23(3), 724–739.
Ishwaran, H. and Kogalur, U. (2013). Random Forests for Survival, Regression and Classification (RF-SRC). URL http://cran.r-project.org/web/packages/randomForestSRC/. R package version 1.4.
Ishwaran, H. and Kogalur, U.B. (2010). Consistency of random survival forests. Statistics & probability letters, 80(13), 1056–1064.
Ishwaran, H., Kogalur, U.B., Blackstone, E.H., and Lauer, M.S. (2008). Random survival forests. The Annals of Applied Statistics, 841–860.
Ishwaran, H. et al. (2007). Variable importance in binary regression trees and forests. Electronic Journal of Statistics, 1, 519–537.
Linxia, L. and Köttig, F. (2014). Review of hybrid prognostics approaches for remaining useful life prediction of engineered systems, and an application to battery life prediction. IEEE Transactions on Reliability, 63(1), 191–207.
R Core Team (2014). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
