Soualhi 2014

2864 IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, VOL. 61, NO. 6, JUNE 2014

Prognosis of Bearing Failures Using Hidden Markov Models and the Adaptive Neuro-Fuzzy Inference System
Abdenour Soualhi, Student Member, IEEE, Hubert Razik, Senior Member, IEEE,
Guy Clerc, Senior Member, IEEE, and Dinh Dong Doan

Abstract: Prognostics and health management (PHM) play a key role in increasing the reliability and safety of systems especially in key sectors (military, aeronautical, aerospace, nuclear, etc.). This paper presents a new methodology which combines data-driven and experience-based approaches for the PHM of roller bearings. The proposed methodology uses time domain features extracted from vibration signals as health indicators. The degradation states in bearings are detected by an unsupervised classification technique called artificial ant clustering. The imminence of the next degradation state in bearings is given by hidden Markov models, and the estimation of the remaining time before the next degradation state is given by the multistep time series prediction and the adaptive neuro-fuzzy inference system. A set of experimental data collected from bearing failures is used to validate the proposed methodology. Experimental results show that the use of data-driven and experience-based approaches is a suitable strategy to improve the PHM of roller bearings.

Index Terms: Artificial intelligence, feature extraction, fuzzy neural networks, hidden Markov models (HMMs), pattern recognition, prognosis, time domain analysis, vibration analysis.

NOMENCLATURE

Ω         Classes.
M         Number of classes.
n         Number of samples.
π         Initial state probability vector.
A         State-transition probability matrix.
$a_{ij}$  Transition probability from state i to j.
B         Observation probability matrix.
$b_j$     Conditional observation probability of class j.
λ         Set of model parameters.
α         Forward variable.
β         Backward variable.
RTNDS     Remaining time before the next degradation state.
INDS      Imminence of the next degradation state.
AAC       Artificial ant clustering.

Manuscript received November 22, 2012; revised February 5, 2013, March 26, 2013, and June 4, 2013; accepted June 19, 2013. Date of publication July 23, 2013; date of current version December 20, 2013.
A. Soualhi, H. Razik, and G. Clerc are with the Université de Lyon, 69100 Villeurbanne, France, and also with Laboratoire Ampère, CNRS, UMR 5005, 69622 Villeurbanne, France (e-mail: abdenour.soualhi@univ-lyon1; hubert.razik@univ-lyon1; guy.clerc@univ-lyon1).
D. D. Doan is with the Department of Automatic Control and Micro-Mechatronic Systems, Femto-ST Institute, 25044 Besançon Cedex, France (e-mail: dinhdong.doan@femto-st.fr).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIE.2013.2274415
0278-0046 © 2013 IEEE

I. INTRODUCTION

Over the past decade, many diagnosis methodologies have been applied successfully in the domain of the reliability and safety of systems [1]–[5]. However, new methodologies dedicated to failure prediction were developed in order to improve the reliability and safety of systems [6]–[10]. The most promising methodology to provide a reliable failure prediction is called prognostics and health management (PHM) [11], [12]. The purpose of this discipline is to increase both reliability and availability of complex systems as well as to improve maintenance decisions. The overall architecture of a typical bearing PHM system includes four steps: feature extraction, detection, diagnosis, and, finally, prognosis.

Features (also called characteristics and signatures) of bearing PHM systems are parameters extracted from vibration signals and processed to detect and track bearing degradations. These features are extracted by using the following: the time domain analysis, the frequency analysis, and the time–frequency analysis [13]–[15]. The frequency analysis is based on the fact that a localized defect generates a signal with some characteristic frequencies. The time–frequency analysis investigates waveform signals in both time and frequency domains. It identifies time-dependent variations of frequency components within the signal, which makes time–frequency analysis techniques a powerful tool for analyzing nonstationary signals. The time domain analysis is one of the simplest approaches. It uses statistical features extracted from the raw vibration signal (root mean square (rms) of overall acceleration, crest factor, peak acceleration, mean, skewness, and kurtosis) as health indicators. The appearance of a defect changes the values of these features. Therefore, monitoring these features can provide useful diagnostic and prognostic information. This analysis is often unable to identify the faulty component of the bearing. However, it is expected that a high value of these features corresponds to an overall deterioration of the bearing’s health.

Fault detection, diagnosis, and prognosis have mainly evolved upon three major paradigms: model-based, data-driven, and experience-based approaches [16]–[18]. These approaches
SOUALHI et al.: PROGNOSIS OF BEARING FAILURES USING HMMs AND ANFIS 2865

Fig. 1. Bearing PHM system.

have been applied to various fields such as aviation and aerospace to improve maintenance decisions. The model-based approach uses a mathematical representation of the system’s degradation and thus involves a physical understanding of the system and its degradation phenomenon.

The model-based approach is well suited when accurate mathematical models can be constructed from the underlying system. However, in many real-world applications, the system is very complex so that constructing a comprehensive model which takes into account all the coupling effects is complex or, in some cases, impossible. An alternative to the model-based approach is the data-driven approach. This approach uses statistical pattern recognition and machine learning to detect changes in the measured data, thereby enabling diagnosis and prognosis of the system’s degradation.

The data-driven approach includes methods based on computational intelligence techniques such as neuro-fuzzy systems, hidden Markov models (HMMs), and also dynamic Bayesian networks. The last approach is based on the experience and the expert knowledge learned, for example, on the history of failures or degradation mechanisms of the considered system. The experience-based approach is used in statistical reliability applications to predict the probability of a failure at any time.

In this paper, we propose a new methodology for the PHM of bearings, combining the data-driven and experience-based approaches. The major contributions of this paper are in the development of techniques for the detection, diagnosis, and prognosis of bearings’ health. In this context, a description of the proposed methodology is given in Section II. Time domain features extracted from the vibration signal are used as health indicators to track the degradation of bearings. The experience-based approach is used to provide the detection and the diagnosis of bearings by an unsupervised classification technique called the artificial ant clustering (AAC) [19]. The AAC explores a set of historical data collected from bearings in order to detect the different degradation states and to provide an online diagnosis. The data-driven approach is used to detect the “imminence of the next degradation state” (INDS) in bearings by a new HMM-based prognosis method. A detailed discussion about this method and its implementation is included here. In addition to HMMs, a multistep time series prediction combined with the adaptive neuro-fuzzy inference system (ANFIS) is used to provide an estimation of the “remaining time before the next degradation state” (RTNDS). In Section III, we investigate the described methodology on an experimental data set of bearings provided by the Center for Intelligent Maintenance Systems, University of Cincinnati, Cincinnati, OH, USA. The final conclusion is drawn in Section IV.

II. ARCHITECTURE OF PROPOSED PHM METHODOLOGY

The PHM methodology includes the following four steps (see Fig. 1).

1) The feature extraction step extracts a list of features (health indicators) from the bearing being tested by applying one of the analysis techniques (time domain, frequency domain, and time–frequency analyses). These features are filtered in order to remove irrelevant and redundant features.
2) The detection step explores and classifies into M classes a set of historical data collected from bearings with the same characteristics and the same operating conditions of the bearing being tested. The M classes represent the M degradation states, including the failed state. These classes are obtained by an unsupervised classification technique called the AAC.
3) The diagnosis step classifies the new data of the bearing being tested by a similarity function “Sim.”
4) In the prognosis step we have the following.
   a) The INDS in bearings is given by a new HMM-based prognosis method.
   b) The detection of the INDS allows us to estimate the RTNDS. This is done by a multistep prediction technique combined with ANFIS. The ANFIS is used here as an extrapolation tool to predict the trending of the features and to provide an estimation of the RTNDS.

A. Feature Extraction

The extraction of health indicators is the best approach to increase the effectiveness of the PHM of bearings. In our case, the test bench operates under stationary conditions (load and speed). This allows the use of time domain features as health
indicators for bearings. One can distinguish the following: the rms, the average power, the crest factor, and those measuring the signal distribution, such as the mean value (first moment), the skewness (third moment), and the kurtosis (fourth moment) [20]. These features are summarized in Table I, where $n$ is the number of discrete points, $x_i$ is an experimental datum, and $\bar{x}$ is the mean value.

TABLE I
TIME DOMAIN FEATURES

It was reported that the kurtosis can be a good indicator to distinguish between damaged and healthy bearings [21]. The kurtosis of a healthy bearing has a value close to 3. When the bearing deteriorates, this value increases to indicate a failure. However, the kurtosis value decreases when the defect reaches an advanced state of degradation. Therefore, the kurtosis is most effective in detecting early stages of bearing failures. The measurement of signal energy and power can indicate the bearing’s health status. The rms and the average power are two representative features of the energy and the power of the vibration signal. These two features have been applied with limited success to localize defects [22]. However, it has been shown that high rms value and average power correspond to an overall deterioration of the bearing’s health. The crest factor is the ratio of the peak to the rms and indicates the early stages of bearing failures. However, this feature decreases with a progressive failure because the rms generally increases with a progressive failure. The skewness measures the asymmetry of the impulse train generated by bearing defects. The mean acceleration signal is the standard statistical mean value. The mean value of the vibration signal for a bearing in a good condition is close to zero. As the mean value increases, the bearing appears to deteriorate.

B. Detection and Diagnosis Procedures

When we use a classifier in the field of condition monitoring, it is very common to handle unlabeled data. This lack of information constitutes a problem in the detection and diagnosis of faults because the classification quality of any monitoring system is defined by the quality of data used in the input of the classifier. This phenomenon occurs when the measurements have not been identified. This inconvenience causes a malfunction of the monitoring system since the classifier cannot learn from unknown classes. To correct this drawback, we adopt an alternative approach based on the unsupervised classification of data. The development of a new method based on the unsupervised classification provides an attractive alternative to solve the problem of unlabeled data and subsequently generates diagnosis and prognosis models.

Methods based on artificial intelligence are widely used for the detection and diagnosis of bearings. They automatically classify data, based on vibrations, acoustic and/or electrical signatures. The AAC technique belongs to this category; it is an unsupervised classification technique inspired by the behavior of real ants to optimize detection and diagnosis. Given a set of $N$ data (each datum $x_i$ is defined by a set of relevant features $x_i = [\gamma_1, \gamma_2, \ldots, \gamma_6]$, $i \in [1, \ldots, N]$), one seeks a clustering of this data set into $M$ classes $(\Omega_1, \Omega_2, \ldots, \Omega_M)$ so that similar data are grouped in the same classes while dissimilar data are in separated classes. The value of $M$ is unknown. The classification of this data set is obtained by a succession of similarities between ants. The obtained $M$ classes represent the $M$ degradation states. For more details, see [19].

For the diagnosis, a similarity function is used to classify the new data ($x_{new}$). The new data are assigned to the class with the highest similarity. This similarity “Sim” takes into account the weight of each class, given by the intensity of the pheromone accumulated along the detection process, and the distance which separates them

$$\mathrm{Sim}(\Omega_i, x_{new}) = 1 - \frac{[\tau_i]^{e} \cdot [d(g_i, x_{new})]^{f}}{\sum_{A \in (1,\ldots,M)} [\tau_A]^{e} \cdot [d(g_A, x_{new})]^{f}} \tag{1}$$

where $d(g_i, x_{new})$ is the Euclidean distance between a measured datum and the gravity center of one of the obtained classes. $(e, f) \in [0, 1]$ are two parameters which define the importance of pheromones deposited by the ants on $\Omega_i$ and the distance between the data $x_{new}$ and the class $\Omega_i$. $A$ is the $A$th class, and $\tau_i$ is the quantity of the pheromone deposited by ants on $\Omega_i$, with

$$g_i = \frac{\sum_{j=1}^{n_i} x_{ij}}{n_i}, \quad i \in (1, \ldots, M)$$

where $n_i$ is the number of data in the class $\Omega_i$. $N$ is the total number of data. $g_i$ is the gravity center of the class $\Omega_i$, and $x_{ij}$ is the $j$th datum in the $i$th class $\Omega_i$.

The degradation state of the bearing being tested at time $t$, $\Omega_i(t)$, $1 \le i \le M$, is defined by the class which has the highest similarity. The detection and diagnosis steps are shown in Fig. 2.

C. Prognosis Procedure

In this section, we briefly describe HMMs and how it is possible to detect the INDS in bearings with HMMs, and then, we illustrate the integration of the multistep time series prediction and the ANFIS to estimate the RTNDS.
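Before the HMMs are described, the diagnosis rule built on the similarity (1) can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `similarity` and `diagnose`, the toy centers, and the pheromone values are hypothetical, while the defaults `e=0.4` and `f=0.8` are the values reported in Section III-B. The sketch assumes that `x_new` does not coincide with every gravity center at once, since the denominator of (1) would then be zero.

```python
import math

def similarity(x_new, centers, pheromones, e=0.4, f=0.8):
    """Evaluate Sim(Omega_i, x_new) of (1) for every class i.

    centers    -- gravity centers g_i, one tuple per class (hypothetical data)
    pheromones -- pheromone quantities tau_i, one value per class
    """
    # Weighted term [tau_i]^e * [d(g_i, x_new)]^f with Euclidean distance d.
    terms = [
        (tau ** e) * (math.dist(g, x_new) ** f)
        for g, tau in zip(centers, pheromones)
    ]
    total = sum(terms)  # assumed nonzero
    return [1.0 - term / total for term in terms]

def diagnose(x_new, centers, pheromones, e=0.4, f=0.8):
    """Return the index of the class with the highest similarity."""
    sims = similarity(x_new, centers, pheromones, e, f)
    return max(range(len(sims)), key=sims.__getitem__)

# Toy example: two classes on a line, equal pheromone weights.
centers = [(0.0, 0.0), (10.0, 0.0)]
tau = [1.0, 1.0]
state = diagnose((1.0, 0.0), centers, tau)  # nearest class is index 0
```

Because the terms in (1) share one denominator, the similarities of the M classes always sum to M − 1, which gives a quick sanity check on any implementation.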
Fig. 2. Detection and diagnosis procedures.

The HMM was introduced by Rabiner as a powerful tool in modeling and analyzing complicated stochastic processes [23]. In [24] and [25], HMMs are used for fault diagnosis of electrical machines. A set of features is extracted by the time–frequency and time domain analyses. These features are converted into observation sequences in order to estimate the HMM parameters. For this purpose, the observation sequences are grouped into classes. Each class represents the health status of the machine. A set of HMMs is constructed from these classes, one for each class. In order to test the obtained HMMs, the observation sequence of an operation condition was processed by each of the HMMs, and the likelihood calculation was performed on each. The state of the machine was classified according to the maximum of the likelihoods obtained from each HMM. The HMM represents an ideal tool for machine fault diagnosis and prognosis when the input of the model is a time series of observations extracted from features.

The HMM is a statistical method used for modeling a system that evolves through a finite number of states. The degradation states of the bearing that is being modeled are hidden to the observer. However, it is possible to detect the INDS by analyzing the observations (measurements) made on the bearing.

Each HMM is defined by the following.

1) A set of ($M$) hidden states $S = \{S_1, S_2, \ldots, S_M\}$. The state of the model at time $t$ is given by $q_t \in S$, $0 \le t \le T$, where $T$ is the length of the observation sequence and $q_t$ denotes the current state.
2) An initial probability value ($\pi$) that indicates how likely it is for an input measure to start in a given state

$$\pi_i = P(q_0 = S_i), \quad 1 \le i \le M. \tag{2}$$

3) A transition probability matrix ($A = \{a_{ij}\}$) that indicates the likelihood of transitioning from one state $S_i$ at time $t$ to another state $S_j$ at time $t + 1$.

$$a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i), \quad 1 \le i, j \le M. \tag{3}$$

The transition probability should satisfy the following constraints:

$$a_{ij} \ge 0, \quad 1 \le i, j \le M$$

$$\sum_{j=1}^{M} a_{ij} = 1.$$

4) An emission probability matrix ($B = b_j(o_t)$) that indicates how likely it is for a certain observation (measure) value to come from a given state

$$b_j(o_t) = P(o_t \mid q_t = S_j), \quad 1 \le j \le M, \quad 0 \le t \le T \tag{4}$$

where $b_j(o_t)$ is the probability of emitting an observation $o_t$ in the state $S_j$ at time $t$.

The probabilities $A$, $B$, and $\pi$ are necessary to obtain an HMM for each state of bearing degradation. For convenience, each state of degradation ($k$) uses the following notation:

$$\lambda_k = \left(A^{(k)}, B^{(k)}, \pi^{(k)}\right), \quad 1 \le k \le M.$$

The degradation states of bearings, classified by the AAC, are the only information available to detect the INDS of the bearing being tested. Thus, the usual approach is to learn the classes obtained by the AAC.

1) Training: The learning phase consists of extracting a set of measurements of length $T$ from each class, $O_k = (o_{k,0}, o_{k,1}, o_{k,2}, \ldots, o_{k,T})$, $1 \le k \le M$, and adjusting the model parameters $(A^{(k)}, B^{(k)}, \pi^{(k)})$ in order to maximize the probability that each model generates its corresponding observation sequence $P(O_k \mid \lambda_k)$.

Given an initial model $\lambda_k = (A^{(k)}, B^{(k)}, \pi^{(k)})$ and an observation sequence $O_k = (o_{k,0}, o_{k,1}, o_{k,2}, \ldots, o_{k,T})$, $1 \le k \le M$, the Baum–Welch algorithm adjusts the parameters of $A^{(k)}$, $B^{(k)}$, and $\pi^{(k)}$ in order to maximize the likelihood of the observation sequence $O_k$.

By using the Baum–Welch algorithm, we obtain the reestimation formulas to update the HMM parameters $(A^{(k)}, B^{(k)}, \pi^{(k)})$

$$a_{ij}^{(k)} = \frac{\sum_{t=0}^{T-1} \xi_t^{(k)}(i, j)}{\sum_{t=0}^{T-1} \gamma_t^{(k)}(i)}, \quad 1 \le i, j, k \le M \tag{5}$$

$$\pi_i^{(k)} = \gamma_0^{(k)}(i) \tag{6}$$

$$b_j(o_{k,t}) = \frac{\sum_{t=0 \,\cap\, o_t}^{T} \gamma_t^{(k)}(j)}{\sum_{t=0}^{T} \gamma_t^{(k)}(j)} \tag{7}$$

$$\gamma_t^{(k)}(i) = \sum_{j=1}^{M} \xi_t^{(k)}(i, j) \tag{8}$$

$$\xi_t^{(k)}(i, j) = \frac{\alpha_t^{(k)}(i)\, a_{ij}^{(k)}\, b_j(o_{k,t+1})\, \beta_{t+1}^{(k)}(j)}{\sum_{i=1}^{M} \sum_{j=1}^{M} \alpha_t^{(k)}(i)\, a_{ij}^{(k)}\, b_j(o_{k,t+1})\, \beta_{t+1}^{(k)}(j)} \tag{9}$$

where $\alpha_t(i)$ and $\beta_t(i)$ are called, respectively, the forward and backward variables.

The forward variable $\alpha_t(i)$ is defined as the probability of generating a partial observation sequence in the forward
direction (i.e., from the start of the sequence) and finishing at a certain state $i$ at time $t$

$$\alpha_t^{(k)}(i) = P(o_{k,0}, o_{k,1}, o_{k,2}, \ldots, o_{k,t}, q_t = S_i \mid \lambda_k) \tag{10}$$

with

$$\alpha_0^{(k)}(i) = \pi_i^{(k)}\, b_i(o_{k,0})$$

$$\alpha_t^{(k)}(j) = b_j(o_{k,t}) \sum_{i=1}^{M} \alpha_{t-1}^{(k)}(i)\, a_{ij}^{(k)}, \quad 1 \le i, j, k \le M, \quad 1 \le t \le T.$$

The backward variable $\beta_t(i)$ denotes the probability of generating a partial observation sequence in the reverse direction (from the time $T$), given a certain state $i$ at time $t$

$$\beta_t^{(k)}(i) = P(o_{k,t+1}, \ldots, o_{k,T} \mid q_t = S_i, \lambda_k) \tag{11}$$

with

$$\beta_t^{(k)}(i) = \sum_{j=1}^{M} \beta_{t+1}^{(k)}(j)\, b_j(o_{k,t+1})\, a_{ij}^{(k)}, \quad 1 \le i, j, k \le M, \quad 0 \le t \le T - 1$$

$$\beta_T^{(k)}(i) = 1.$$

2) Imminence of Next Probable State: Given an observation sequence $O = (o_0, o_1, o_2, \ldots, o_t)$ and the HMMs $(\lambda_k)_{1 \le k \le M}$ obtained previously, we seek to estimate the imminence of the next probable state at time $t + 1$ by the probability that the HMMs generate the observation sequence $O = (o_0, o_1, o_2, \ldots, o_t, o_{t+1})$ until time $t + 1$.

The probability that an HMM generates the observation sequence $O = (o_0, o_1, o_2, \ldots, o_t)$ is given by

$$P(o_0, o_1, o_2, \ldots, o_t \mid \lambda_k) = \sum_{j=1}^{M} \alpha_t^{(k)}(j)\, \beta_t^{(k)}(j), \quad 1 \le k \le M. \tag{12}$$

The probability that an HMM generates the observation sequence $O = (o_0, o_1, o_2, \ldots, o_t, o_{t+1})$ is given by

$$P(O \mid \lambda_k) = \sum_{j=1}^{M} \alpha_{t+1}^{(k)}(j) \tag{13}$$

with $\beta_{t+1}^{(k)}(j) = 1$

$$P(o_0, o_1, o_2, \ldots, o_{t+1} \mid \lambda_k) = \sum_{j=1}^{M} \alpha_{t+1}^{(k)}(j) = \sum_{j=1}^{M} P(o_0, o_1, o_2, \ldots, o_{t+1}, q_{t+1} = s_j \mid \lambda_k) = \sum_{j=1}^{M} b_j(o_{t+1}) \sum_{i=1}^{M} \alpha_t^{(k)}(i)\, a_{ij}^{(k)}, \quad 1 \le k \le M.$$

The imminence of the next probable state $S_k(t + 1)$, $1 \le k \le M$, is defined by the state which has the highest probability

$$S_k(t + 1) = \arg\max_{1 \le k \le M} \left[P(o_0, o_1, o_2, \ldots, o_{t+1} \mid \lambda_k)\right]. \tag{14}$$

From the last formula, the imminence of the next probable state is obtained by estimating the transition probability $a_{ij}$, the forward variable $\alpha_t(i)$ at time $t$, and the emission probability of a new observation at $t + 1$. However, the observation $o_{t+1}$ is unavailable. To correct this drawback, we replace the emission probability of a new observation $o_{t+1}$ for each state by a coefficient named $C_{S_j}$. Like $b_j(o_{t+1})$, this coefficient is between “0” and “1.” A coefficient of “0” means that the emission probability of a new observation in the state $S_j$ is low. A coefficient of “1” means that there is a high probability that a new observation will be emitted by the state $S_j$. Thus, the INDS can be obtained by giving a coefficient equal to the emission probability of the observation at time $t$ ($C_{s_j} = b_j(o_t)$) for the current state and a coefficient of “1” ($C_{s_j} = 1$) for the next probable states. This strategy promotes the transition from one state at time ($t$) to a new state at time ($t + 1$)

$$P(o_0, o_1, o_2, \ldots, o_{t+1} \mid \lambda_k) = \sum_{j=1}^{M} C_{s_j} \sum_{i=1}^{M} \alpha_t^{(k)}(i)\, a_{ij}^{(k)}, \quad 1 \le k \le M \tag{15}$$

with

$$C_{s_j} = \begin{cases} 1 & \text{if } q_{t+1} \ne q_t \\ b_j(o_t) & \text{if } q_{t+1} = q_t \end{cases}, \quad 1 \le j \le M.$$

The detection of the INDS is shown in Fig. 3.

Fig. 3. HMM-based prognosis method.

The HMM-based prognosis method can be summarized as follows.

1) An observation sequence of length $T$ is extracted from each class obtained by the AAC.
2) An HMM is trained from each observation sequence in order to estimate the following parameters: $A^{(k)}$, $B^{(k)}$, and $\pi^{(k)}$.
3) An observation sequence is extracted from the bearing being tested. The HMM for which the probability given by (15) is the highest indicates the INDS.

With the HMM-based prognosis method, it is difficult to predict the exact moment of the next degradation state [26]. However, it is possible to determine this moment with the ANFIS model [27].

The ANFIS is a fuzzy inference system implemented in the framework of an adaptive neural network. It has been successfully applied for fault detection and diagnosis of induction machines [28]. Recently, ANFIS has been employed successfully
in the prediction of machine condition degradation, where the prediction is carried out via a fuzzy system while its parameters are optimized through an artificial neural network [29]. This paper shows that ANFIS is a reliable and robust predictor.

The ANFIS is used as an extrapolation tool to predict the trending of the statistical features up to a certain limit ($l$), $\hat{o}_{t+l}$. For this purpose, the set of historical data of bearing failures is divided into a training set and a validation set. The training set is used to reestimate the initial parameters of the fuzzy membership functions, and the validation set is used to validate the system. In the training process, the weights and the bias values of the ANFIS are adjusted by minimizing the error between the ANFIS output and the validation set. In our application, the ANFIS input is one of the statistical features extracted from the vibration signal (i.e., rms, crest factor, peak acceleration, mean, skewness, and kurtosis). A sliding window of size ($r$) is used to define the number of previous time steps of the ANFIS input. The output $\hat{o}_{t+1}$ is the prediction of the input feature at time $t + 1$ from the $r$ time steps. In order to predict the trending until time $t + l$, the ANFIS uses the previous predicted values as inputs to forecast the $l$ future values of the input feature (see Fig. 4).

Fig. 4. Multistep prediction with ANFIS.

Given a sliding window of observations $(o_t, o_{t-1}, o_{t-2}, \ldots, o_{t-r})$, the first future value $\hat{o}_{t+1}$ is given as follows:

$$\hat{o}_{t+1} = f(o_t, o_{t-1}, o_{t-2}, \ldots, o_{t-r}) \tag{16}$$

where $r$ is the size of the sliding window.

Then, the process is repeated recursively until a certain value $l$

$$\hat{o}_{t+l} = f(\hat{o}_{t+l-1}, \hat{o}_{t+l-2}, \ldots, \hat{o}_{t+l-r}). \tag{17}$$

The value of $l$ must satisfy the following constraint:

$$\arg\max_{1 \le i \le M} \left[\mathrm{Sim}(\Omega_i, \hat{o}_{t+l})\right] = \arg\max_{1 \le i \le M} \left[\mathrm{Sim}(\Omega_i, \hat{o}_{t+l-1})\right]. \tag{18}$$

An illustration for the estimation of the RTNDS by ANFIS is given in Fig. 5.

Fig. 5. Estimation of the RTNDS by ANFIS.

III. EXPERIMENTAL RESULTS

Many approaches about the prognosis of bearings are available in literature. However, experimental applications do not reflect the natural degradation of bearings. This can be explained by the difficulty of simulating failures. Failures are typically induced by introducing some impurities into the lubricant, by machining with an electrical discharge, or by drilling or scratching the surface of bearings.

In order to reflect the natural degradation of bearings, bearing run-to-failure tests were performed on a test bench designed with support from Rexnord Corp. The experimental data were downloaded from the Prognostics Center of Excellence (http://ti.arc.nasa.gov/tech/dash/pcoe/prognostic-data-repository/). The test bench is composed of four Rexnord ZA-2115 double row bearings installed on one shaft as shown in Fig. 6. The characteristics of these bearings are detailed in Table II. The rotation speed was kept constant at 2000 r/min with a 6000-lb radial load placed onto the shaft and bearings by a spring mechanism.

Fig. 6. Bearing test bench [30].

TABLE II
CHARACTERISTICS OF STUDIED BEARINGS

A magnetic plug installed in the oil feedback pipe collected debris from the oil as evidence of bearing degradations. A PCB 353B33 High Sensitivity Quartz ICP Accelerometer was installed on the horizontal ($X$) and vertical axes ($Y$) for each bearing. The vibration data were recorded every 20 min, with a fixed sampling rate of 20 kHz.

A set of experimental data, extracted from the vibration signal of bearing 4, is used to evaluate the performance of our methodology for the detection of the INDS and the estimation of the RTNDS. For this purpose, we suppose that the set of historical data of bearings (1, 2, and 3) is available. This data set is used to detect the $M$ unknown degradation states and to identify the parameters of HMMs and the ANFIS.
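The recursive scheme of (16) and (17), which is what is later run on the features of bearing 4, can be sketched as follows. This is a minimal sketch under stated assumptions and not the ANFIS itself: the one-step predictor `predict_one` is a hypothetical stand-in (a simple linear extrapolation here) for the trained ANFIS, `classify` is a hypothetical stand-in for the arg max of Sim in (18), and reading the RTNDS as the number of steps until the predicted class changes is an interpretation of (18). The window holds the `r` most recent values, as in (17).

```python
def multistep_forecast(history, predict_one, r, l):
    """Forecast l future values by feeding predictions back as inputs."""
    window = list(history[-r:])          # r most recent values, oldest first
    forecasts = []
    for _ in range(l):
        nxt = predict_one(window)        # one-step prediction, as in (16)
        forecasts.append(nxt)
        window = window[1:] + [nxt]      # slide the window, as in (17)
    return forecasts

def steps_until_class_change(history, predict_one, classify, r, max_steps):
    """Return the first step at which the predicted class differs from the
    class of the last measured value, or None if it never changes."""
    current = classify(history[-1])
    predicted = multistep_forecast(history, predict_one, r, max_steps)
    for step, value in enumerate(predicted, start=1):
        if classify(value) != current:
            return step
    return None

# Hypothetical stand-ins, for illustration only.
linear_extrapolation = lambda w: 2 * w[-1] - w[-2]   # replaces the ANFIS
two_state_classifier = lambda v: 0 if v < 6.5 else 1  # replaces arg max Sim

history = [1, 2, 3, 4]
forecast = multistep_forecast(history, linear_extrapolation, r=2, l=3)
steps = steps_until_class_change(
    history, linear_extrapolation, two_state_classifier, r=2, max_steps=10
)
```

With recordings taken every 20 min, a step count returned by such a routine would convert to a time by multiplying by the recording interval. Errors accumulate in any recursive forecast, so the horizon `max_steps` is worth bounding.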

A. Feature Extraction
The feature extraction technique described in Section II-A
is applied to extract statistical features from the vibration data
of bearings (1, 2, 3, and 4). These features are extracted from
accelerometers installed on the horizontal axis (X) which
provide better results for the tracking of bearing degradation
than the accelerometers installed on the vertical axis (Y ).
The rms, the kurtosis, the mean value, and the average power
of the vibration signal are considered good features for tracking
the degradation of bearings. For each feature, the total number
of observations in the data set was 1944, with 486 observations
per bearing and 14 observations per day.
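As an illustration of this step, the time domain features of one vibration record can be computed as in the following minimal sketch. Table I is not reproduced in this text, so the formulas below are the standard textbook definitions and are an assumption: moments are normalized by n, and the kurtosis is the non-excess form, for which a Gaussian signal gives a value close to 3. The function name `time_domain_features` and the test signal are hypothetical, and the record is assumed to be nonconstant.

```python
import math

def time_domain_features(x):
    """Return time domain health indicators of one vibration record x."""
    n = len(x)
    mean = sum(x) / n
    # Central moments of order 2, 3, and 4, each divided by n.
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    average_power = sum(v * v for v in x) / n
    rms = math.sqrt(average_power)
    peak = max(abs(v) for v in x)
    return {
        "mean": mean,
        "rms": rms,
        "average_power": average_power,
        "peak": peak,
        "crest_factor": peak / rms,      # ratio of the peak to the rms
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,        # non-excess kurtosis
    }

# Synthetic check signal: one full period of a unit sine wave.
signal = [math.sin(2 * math.pi * k / 1000) for k in range(1000)]
features = time_domain_features(signal)
```

For a pure sine wave, the rms is 1/√2 of the amplitude, the crest factor is √2, and the kurtosis is 1.5, which makes it a convenient reference signal for checking an implementation before applying it to measured data.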
Fig. 7(a) and (b) shows, respectively, the average power and
the rms. Note that bearings (1, 2, 3, and 4) were operated at
2000 r/min with 6000-lb radial load for about 35 days of test.
The test stops when the accumulated debris that adheres to the magnetic plug exceeds a certain level and causes an electrical switch to close. After two days of operation (2880 min), the first signs of aging were noticed in the four bearings. After 30 days of operation (43 200 min), the first sign of degradation was noticed in bearing 4. After that, the values of the rms and the average power continued to rise as the degradation increased.
Fig. 7(c) shows the results of the kurtosis feature. Like the
rms, the kurtosis performs well and gives an indication about
the degradation of bearings. The kurtosis indicates a signif-
icant change in the distribution before the appearance of the
degradation. After 28 days of operation (40 320 min), the first
sign of degradation was noticed in bearing 4. After 31 days
of operation (44 640 min), the first visible sign of degradation
was noticed in bearings 3 and 4. It is noted that the kurtosis
reaches its maximum value on the 33rd day (47 520th min).
This implies that the kurtosis is, although a good indicator to
detect the degradation of bearings, not always a good indicator
of the extent of the degradation.
Fig. 7(d) shows the results of the mean value. During the first
two days, the value of the mean is close to zero. This indicates
that the bearings (1, 2, 3, and 4) are in a good state.
After two days of operation (2880 min), the first signs of
aging were noticed. After that, the mean value did not change
significantly.

B. Detection and Diagnosis Procedures


The AAC described previously is used to identify the M -
unknown degradation states from the set of historical data of
bearings (1, 2, and 3). A good knowledge of the component
can help us choose the most appropriate features. Based on
Fig. 7(a)–(d), we choose the rms, the mean value, and the aver-
age power of the vibration signal. The kurtosis is not selected
because, when the deterioration increases, the kurtosis value
decreases [see Fig. 7(c)]. For the selected features (γ1 , γ2 , γ6 ),
several data disturbed by noise can appear. Thus, a class which
regroups data of the same degradation state can be represented
by an area of points. If the features are well selected, each state
of degradation can be represented by a constrained class in the
multidimensional feature space as shown in Fig. 8.

Fig. 7. Vibration features of bearings (1, 2, 3, and 4).
SOUALHI et al.: PROGNOSIS OF BEARING FAILURES USING HMMs AND ANFIS 2871

Fig. 8. Historical data of bearings 1, 2, and 3 in the 3-D feature space after standardization.

The parameters e and f are defined according to the highest value of the criterion (Cq)

$$C_q = \operatorname{trace}\left(S_W^{-1} \cdot S_B\right). \tag{19}$$

This criterion takes into account the distance between the classes defined by the interclass scatter matrix SB and the compactness of the classes defined by the intraclass scatter matrix SW

$$S_W = \frac{1}{N} \sum_{i=1}^{M} \sum_{j=1}^{n_i} (x_{ij} - g_i) \cdot (x_{ij} - g_i)^T \tag{20}$$

$$S_B = \frac{1}{N} \sum_{i=1}^{M} (g_i - g) \cdot (g_i - g)^T \tag{21}$$

where M is the number of classes, ni is the number of data in the class Ωi, N is the total number of measured data, gi is the gravity center of the class Ωi, xij is the jth data in the ith class Ωi, and g is the gravity center of the data set

$$g_i = \frac{\sum_{j=1}^{n_i} x_{ij}}{n_i}, \quad i \in (1, \ldots, M) \tag{22}$$

$$g = \frac{\sum_{i=1}^{N} x_i}{N}. \tag{23}$$

The AAC detected four classes (M = 4) (Ω1, Ω2, Ω3, Ω4) for e = 0.4 and f = 0.8. The first class (Ω1) includes the data set of bearings 1, 2, and 3 recorded during the first two days of operation. This class represents bearings 1, 2, and 3 in a good state of operation because the first signs of ageing were noticed after two days of experiment. The last class (Ω4) includes the data set of bearing 3. This data set corresponds to an inner race defect discovered in bearing 3 [see Fig. 9(a)]. This class represents bearings in a bad state of operation. In addition to (Ω1) and (Ω4), the AAC discovered two other classes, namely, (Ω2) and (Ω3).

Fig. 9. Picture of bearings after 35 days of test. (a) Inner race defect in bearing 3. (b) Roller element defect in bearing 4. (c) Outer race defect in bearing 4 [30].

The class (Ω2) includes the data set of bearings 1, 2, and 3 recorded between the 2nd and 33rd days of experiment. The values in this data set are slightly higher than those of the first class. Thus, this class represents bearings in a medium–good state of operation. The class (Ω3) includes the data set of bearings 2 and 3 recorded between the 33rd and 34th days of experiment. The values in this data set are slightly higher than those of the second class. This class represents bearings in a medium–bad state of operation.

The obtained classes (Ω1, Ω2, Ω3, Ω4) represent, respectively, bearings in a good state, medium–good state, medium–bad state, and bad state of operation.

Each class has a geometric area in the multidimensional feature space. Given a new measurement on bearing 4, the diagnosis consists of assigning this measurement to one of the obtained classes according to (1). The class which has the highest similarity defines the degradation state of bearing 4.

C. Prognosis Procedure

The total number of observations is 1944. As a training set, 1458 observations are used (486 observations for each of bearings 1, 2, and 3), and 486 observations of bearing 4, which are different and independent of the training set, are used as a validation set. The statistical features (γ1, γ2, γ6) of bearings 1, 2, and 3 are used to estimate the parameters of HMMs and the ANFIS.

1) Estimation of HMM Parameters: The hidden states are, respectively, the following: bearings in a good state of operation (S1), bearings in a medium–good state of operation (S2), bearings in a medium–bad state of operation (S3), and bearings in a bad state of operation (S4). From each class, an observation sequence of length (T = 4) is extracted. Each observation is described by the statistical features (γ1, γ2, γ6). It has been shown in [24] that an observation sequence of length (T = 4) is sufficient to train an HMM. The observation sequences used to estimate the parameters of HMMs can be formulated as follows: (Oi = {oi,0, oi,1, ..., oi,T} for 1 ≤ i ≤ M and T = 4).

We define the initial parameters (transition matrix A and the initial state vector π) as follows:

$$
A_{1,2,3,4} =
\begin{array}{c|cccc}
 & \Omega_1 & \Omega_2 & \Omega_3 & \Omega_4 \\ \hline
\Omega_1 & 0.6 & 0.2 & 0.1 & 0.1 \\
\Omega_2 & 0 & 0.5 & 0.25 & 0.25 \\
\Omega_3 & 0 & 0 & 0.5 & 0.5 \\
\Omega_4 & 0 & 0 & 0 & 1
\end{array}
$$

$$
\pi_{1,2,3,4} = \begin{pmatrix} \frac{1}{4} & \frac{1}{4} & \frac{1}{4} & \frac{1}{4} \end{pmatrix}.
$$
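The initial parameters above can be written down and sanity-checked in a few lines. The following is a minimal sketch, not the authors' implementation: the values of A and π are taken from the text, while the names `A_init`, `pi_init`, `check_hmm_init`, and `state_distribution` are illustrative assumptions. It verifies that each row of A sums to 1, that A is upper triangular (no transition back to a healthier state), and that π is a valid distribution.

```python
import numpy as np

# State order follows the four classes: good, medium-good, medium-bad, bad.
STATES = ["good", "medium-good", "medium-bad", "bad"]

# Initial transition matrix A from the text (rows: from-state, columns: to-state).
A_init = np.array([
    [0.6, 0.2, 0.10, 0.10],
    [0.0, 0.5, 0.25, 0.25],
    [0.0, 0.0, 0.50, 0.50],
    [0.0, 0.0, 0.00, 1.00],
])

# Uniform initial state vector pi from the text.
pi_init = np.full(4, 0.25)


def check_hmm_init(A, pi, tol=1e-12):
    """Return True if A is row-stochastic and upper triangular and pi sums to 1."""
    rows_ok = np.allclose(A.sum(axis=1), 1.0, atol=tol)
    no_recovery = np.allclose(A, np.triu(A), atol=tol)
    pi_ok = np.isclose(pi.sum(), 1.0, atol=tol)
    return bool(rows_ok and no_recovery and pi_ok)


def state_distribution(A, pi, steps):
    """Propagate the state distribution forward by `steps` transitions (pi @ A^steps)."""
    return pi @ np.linalg.matrix_power(A, steps)


if __name__ == "__main__":
    print("valid initial parameters:", check_hmm_init(A_init, pi_init))
    print("distribution after one step:", state_distribution(A_init, pi_init, 1))
```

These are only the starting values handed to the training step; the trained parameters are the ones reported in Tables III–VI.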

TABLE III
HMM OF (Ω1)

TABLE IV
HMM OF (Ω2)

TABLE V
HMM OF (Ω3)

TABLE VI
HMM OF (Ω4)

The observation matrix B for (O1, O2, O3, and O4, with T = 4) is given by

$$
b_j(O_i)\Big|_{1 \le i,j \le M} = 1 - \frac{(O_i - \mu_j)\,\mathrm{Cov}^{-1}(E_j)\,(O_i - \mu_j)'}{\sum_{\ell=1}^{M} (O_i - \mu_\ell)\,\mathrm{Cov}^{-1}(E_\ell)\,(O_i - \mu_\ell)'} \tag{24}
$$

where
Ej training data of the class Ωj;
μj gravity center of the class Ωj;
(.)′ matrix transpose.
The covariance matrix Cov is estimated from each class.

The best HMM parameters (Ai, πi) that maximize the probability P(Ok|λk) for k = 1, 2, 3, 4 are given in Tables III–VI.

2) ANFIS Parameters: Three ANFIS are necessary to estimate the future values of the selected features (γ1, γ2, γ6). Each system corresponds to one of the three statistical features. The set of historical data of bearings 1, 2, and 3 has been used to define the parameters of the ANFIS. For each system, 81 fuzzy rules were used to build the fuzzy inference system. Three trapezoidal membership functions were used for each input to train ANFIS. The size of the sliding window is fixed at four inputs.

3) INDS and RTNDS: When the current state Ωi(t), 1 ≤ i ≤ M, has been identified by the AAC, it is possible to detect the INDS and the RTNDS of bearing 4. In Fig. 10(a), the first prediction has detected the imminence of the medium–good state after 5553 min of operation. The emission probability of the good state at time t = 5553 min is b1(ot) = 0.92. For an unknown emission probability at time t + 1, the coefficient Csj takes the same value as the emission probability b1(ot) for the good state at time t (Csj = b1(ot) = 0.92) (j = 1) and a coefficient of 1 (Csj = 1) for the next states (j = 2, 3, 4). Fig. 11(a) shows the probabilities of generating the observation sequence O = (o0, o1, o2, ..., ot, ot+1) at time t = 5553 min for each HMM and for different values of Csj for the good state. As we can see, for Csj = 0.92, the probability of generating the observation sequence O = (o0, o1, o2, ..., ot, ot+1) at time t = 5553 min for the good state is 0.7204. This probability is lower than that of the medium–good state (0.7207). This allows the detection of the imminence of the medium–good state at time t = 5553 min. This indication allows us to estimate the RTNDS by the multistep prediction technique combined with ANFIS. The ANFIS uses the previous predicted values to forecast the l-future values of the input feature. The value of l is obtained by (18). The remaining time before the appearance of the medium–good state is estimated at 3427 min, with a transition to the medium–good state estimated after 8980 min of operation, while the real transition is diagnosed after 9176 min of operation. The prediction given by HMMs provides an indication to the ANFIS about the imminence of the next state, and the output value given by the ANFIS is interpreted as an approximate estimation of the remaining time of the good state.

After the first prediction, the data set of bearing 4 until time t is taken into account in order to train ANFIS. This will provide a more accurate prediction about the remaining time before the appearance of the medium–bad and bad states. The second and third predictions given by HMMs and the ANFIS are shown in Fig. 10(b) and (c), respectively. The second and third predictions detected the imminence of the medium–bad and bad states after 42 667 and 49 910 min of operation, respectively. For the second prediction, the emission probability of the medium–good state at time t = 42 667 min is b2(ot) = 0.92. For an unknown emission probability at time t + 1, the coefficient Csj takes the same value as the emission probability b2(ot) for the medium–good state at time t (Csj = b2(ot) = 0.92) (j = 2) and a coefficient of 1 (Csj = 1) for the next states (j = 3, 4). Fig. 11(b) shows the probabilities of generating the observation sequence O = (o0, o1, o2, ..., ot, ot+1) at time t = 42 667 min for each HMM and for different values of Csj for the medium–good state. As we can see, for Csj = 0.92, the probability of generating the observation sequence O = (o0, o1, o2, ..., ot, ot+1) at time t = 42 667 min for the medium–good state is 0.7216. This probability is lower than that of the medium–bad state (0.7364). This allows the detection of the imminence of the medium–bad state at time t = 42 667 min. The remaining time before the appearance of the medium–bad state is estimated at 187 min, with a transition to the medium–bad state estimated after 42 854 min of operation, while the real transition is diagnosed after 42 854 min
of operation. For the third prediction, the emission probability of the medium–bad state at time t = 49 910 min is b3(ot) = 0.832. For an unknown emission probability at time t + 1, the coefficient Csj takes the same value as the emission probability b3(ot) for the medium–bad state at time t (Csj = b3(ot) = 0.832) (j = 3) and a coefficient of 1 (Csj = 1) for the next state (j = 4). Fig. 11(c) shows the probabilities of generating the observation sequence O = (o0, o1, o2, ..., ot, ot+1) at time t = 49 910 min for each HMM and for different values of Csj for the medium–bad state. As we can see, for Csj = 0.832, the probability of generating the observation sequence O = (o0, o1, o2, ..., ot, ot+1) at time t = 49 910 min for the medium–bad state is 0.4779. This probability is lower than that of the bad state (0.4886). This allows the detection of the imminence of the bad state at time t = 49 910 min. The remaining time before the appearance of the bad state is estimated at 389 min, with a transition estimated after 50 299 min of operation, while the real transition is diagnosed after 50 299 min of operation. In this case, the prediction is more accurate, which suggests that more information about the previous states increases the prediction accuracy. One can note that, after 50 299 min of operation, bearing 4 is considered defective due to a roller element defect and an outer race defect [see Fig. 9(b) and (c)].

Fig. 10. Prediction of the imminence and the remaining time of (a) the medium–good state, (b) the medium–bad state, and (c) the bad state.

Fig. 11. Probability of generating an observation sequence until time t + 1 for different values of Csj for the following: (a) good state, (b) medium–good, and (c) bad state.

IV. CONCLUSION

A new methodology for the PHM of roller bearings has been proposed in this paper. The architecture of this methodology has been described, as well as its implementation in a real case of bearing degradation. For stationary conditions, time domain features are considered as good parameters to track the degradation of bearings. These features have been used in the AAC to detect the degradation states of bearings and to train the parameters of both the HMMs and the ANFIS. However, for nonstationary conditions, the use of time–frequency analysis is considered as the best way to extract other features in order to track the degradation of bearings. A new HMM-based prognosis model has been proposed to detect the INDS. The ANFIS combined with the multistep time series prediction has been used as an extrapolation tool in order to predict the RTNDS. The obtained results show the efficiency of the proposed methodology for the detection, diagnosis, and prognosis of faults in roller bearings.

To apply the proposed methodology, a significant amount of past knowledge of the bearing being tested is required because the corresponding degradation states must be known in advance and must be well described. This allows us to obtain a reliable diagnostic and prognostic tool, but on the other hand, it limits the applicability of this methodology when historical data are difficult to obtain.

ACKNOWLEDGMENT

The authors would like to thank the Intelligent Maintenance System (IMS) Center and Rexnord Technical Services for their database.
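The three predictions reported in Section III-C can be cross-checked with simple arithmetic. The sketch below is only a recap under stated assumptions: every number is copied from the text, while the tuple layout and the names `PREDICTIONS` and `summarize` are illustrative and not part of the original method. For each prediction, it recomputes the estimated transition time as the detection time plus the estimated remaining time, applies the imminence rule (the next-state HMM is more likely than the current-state HMM), and reports the gap to the diagnosed transition.

```python
# Each entry: (next state, detection time [min], estimated remaining time [min],
#              diagnosed transition time [min],
#              sequence probability under the current-state HMM,
#              sequence probability under the next-state HMM)
PREDICTIONS = [
    ("medium-good", 5553, 3427, 9176, 0.7204, 0.7207),
    ("medium-bad", 42667, 187, 42854, 0.7216, 0.7364),
    ("bad", 49910, 389, 50299, 0.4779, 0.4886),
]


def summarize(predictions):
    """Recompute transition estimates and prediction errors from the reported values."""
    rows = []
    for next_state, t_detect, rtnds, t_real, p_current, p_next in predictions:
        t_estimated = t_detect + rtnds
        rows.append({
            "next_state": next_state,
            # Imminence is flagged when the next-state HMM explains the sequence better.
            "imminent": p_next > p_current,
            "estimated_transition": t_estimated,
            "error_min": t_real - t_estimated,
        })
    return rows


summary = summarize(PREDICTIONS)

if __name__ == "__main__":
    for row in summary:
        print(row)
```

Only the first prediction, made before any data from bearing 4 were used to retrain the ANFIS, differs from the diagnosed transition (by 196 min); the second and third estimates coincide with it.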

REFERENCES

[1] L. Frosini and E. Bassi, “Stator current and motor efficiency as indicators for different types of bearing faults in induction motors,” IEEE Trans. Ind. Electron., vol. 57, no. 1, pp. 244–251, Jan. 2010.
[2] F. Immovilli, M. Cocconcelli, A. Bellini, and R. Rubini, “Detection of generalized-roughness bearing fault by spectral-kurtosis energy of vibration or current signals,” IEEE Trans. Ind. Electron., vol. 56, no. 11, pp. 4710–4717, Nov. 2009.
[3] B. Zhang, C. Sconyers, C. Byington, R. Patrick, M. Orchard, and G. Vachtsevanos, “A probabilistic fault detection approach: Application to bearing fault detection,” IEEE Trans. Ind. Electron., vol. 58, no. 5, pp. 2011–2018, May 2011.
[4] F. Immovilli, C. Bianchini, M. Cocconcelli, A. Bellini, and R. Rubini, “Bearing fault model for induction motor with externally induced vibration,” IEEE Trans. Ind. Electron., vol. 60, no. 8, pp. 3408–3418, Aug. 2013.
[5] A. Yazidi, H. Henao, G. A. Capolino, F. Betin, and F. Filippetti, “A Web-based remote laboratory for monitoring and diagnosis of ac electrical machines,” IEEE Trans. Ind. Electron., vol. 58, no. 10, pp. 4950–4959, Oct. 2011.
[6] S. Zaidi, S. Aviyente, M. Salman, K. K. Shin, and E. Strangas, “Prognosis of gear failures in dc starter motors using hidden Markov models,” IEEE Trans. Ind. Electron., vol. 58, no. 5, pp. 1695–1706, May 2011.
[7] C. Chen, B. Zhang, G. Vachtsevanos, and M. Orchard, “Machine condition prediction based on adaptive neuro fuzzy and high-order particle filtering,” IEEE Trans. Ind. Electron., vol. 58, no. 9, pp. 4353–4364, Sep. 2011.
[8] E. Strangas, S. Aviyente, J. Neely, and S. Zaidi, “Improving the reliability of electrical drives through failure prognosis,” in Proc. IEEE SDEMPED, 2011, pp. 172–178.
[9] A. K. Mahamad, S. Saon, and T. Hiyama, “Predicting remaining useful life of rotating machinery based artificial neural network,” Comput. Math. Appl., vol. 60, no. 4, pp. 1078–1087, Aug. 2010.
[10] T. Benkedjouh, K. Medjaher, N. Zerhouni, and S. Rechak, “Fault prognostic of bearings by using support vector data description,” in Proc. IEEE PHM, 2012, pp. 1–7.
[11] P. A. Sandborn and C. Wilkinson, “A maintenance planning and business case development model for the application of prognostics and health management (PHM) to electronic systems,” Microelectron. Reliab., vol. 47, no. 12, pp. 1889–1901, Dec. 2007.
[12] W. Wang and M. Pecht, “Economic analysis of canary-based prognostics and health management,” IEEE Trans. Ind. Electron., vol. 58, no. 7, pp. 3077–3089, Jul. 2011.
[13] O. R. Seryasat, M. A. Shoorehdeli, F. Honarvar, and A. Rahmani, “Multi-fault diagnosis of ball bearing based on features extracted from time-domain and multi-class support vector machine (MSVM),” in Proc. IEEE SMC, 2010, pp. 4300–4303.
[14] K. Chen, X. Li, F. Wang, T. Wang, and C. Wu, “Bearing fault diagnosis using wavelet analysis,” in Proc. IEEE Int. Conf. QRRMSE, 2012, pp. 699–702.
[15] M. D. Prieto, G. Cirrincione, A. G. Espinosa, J. A. Ortega, and H. Henao, “Bearing faults detection by a novel condition monitoring scheme based on statistical-time features and neural networks,” IEEE Trans. Ind. Electron., vol. 60, no. 8, pp. 3398–3407, Aug. 2013.
[16] J. Luo, K. Pattipati, L. Qiao, and S. Chigusa, “Model-based prognostic techniques applied to a suspension system,” IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 38, no. 5, pp. 1156–1168, Sep. 2008.
[17] D. Tobon-Mejia, K. Medjaher, N. Zerhouni, and G. Tripot, “A data-driven failure prognostics method based on mixture of Gaussians hidden Markov models,” IEEE Trans. Rel., vol. 61, no. 2, pp. 491–503, Jun. 2012.
[18] A. Bellini, F. Filippetti, C. Tassoni, and G. A. Capolino, “Advances in diagnostic techniques for induction machines,” IEEE Trans. Ind. Electron., vol. 55, no. 12, pp. 4109–4126, Dec. 2008.
[19] A. Soualhi, G. Clerc, and H. Razik, “Detection and diagnosis of faults in induction motor using an improved artificial ant clustering technique,” IEEE Trans. Ind. Electron., vol. 60, no. 9, pp. 4053–4062, Sep. 2013.
[20] K. Medjaher, F. Camci, and N. Zerhouni, “Feature extraction and evaluation for health assessment and failure prognostics,” in Proc. 1st Eur. Conf. Prognost. Health Manage. Soc., Dresden, Germany, 2012, vol. 3, pp. 111–116.
[21] R. Heng and M. Nor, “Statistical analysis of sound and vibration signals for monitoring rolling element bearing condition,” Appl. Acoust., vol. 53, no. 1–3, pp. 211–226, Jan.–Mar. 1998.
[22] K. Heidarbeigi, H. Ahmadi, and M. Omid, “Fault diagnosis of Massey Ferguson gearbox using power spectral density,” in Proc. ICEM, 2008, pp. 1–4.
[23] L. Rabiner and B. Juang, “An introduction to hidden Markov models,” IEEE ASSP Mag., vol. 3, no. 1, pp. 4–16, Jan. 1986.
[24] A. Soualhi, G. Clerc, H. Razik, and A. Lebaroud, “Fault detection and diagnosis of induction motors based on hidden Markov model,” in Proc. ICEM, Sep. 2012, pp. 1693–1699.
[25] A. Lebaroud and G. Clerc, “Classification of induction machine faults by optimal time–frequency representations,” IEEE Trans. Ind. Electron., vol. 55, no. 12, pp. 4290–4298, Dec. 2008.
[26] D. Tobon-Mejia, K. Medjaher, N. Zerhouni, and G. Tripot, “Hidden Markov models for failure diagnostic and prognostic,” in Proc. PHM, 2011, pp. 1–8.
[27] J. S. Jang and C. T. Sun, “Neuro-fuzzy modeling and control,” Proc. IEEE, vol. 83, no. 3, pp. 378–406, Mar. 1995.
[28] S. Altug, M. Y. Chen, and H. Trussell, “Fuzzy inference systems implemented on neural architectures for motor fault detection and diagnosis,” IEEE Trans. Ind. Electron., vol. 46, no. 6, pp. 1069–1079, Dec. 1999.
[29] W. Q. Wang, M. Golnaraghi, and F. Ismail, “Prognosis of machine health condition using neuro-fuzzy systems,” Mech. Syst. Signal Process., vol. 18, no. 4, pp. 813–831, Jul. 2004.
[30] H. Qiu, J. Lee, J. Lin, and G. Yu, “Wavelet filter-based weak signature detection method and its application on rolling element bearing prognostics,” J. Sound Vib., vol. 289, no. 4/5, pp. 1066–1090, Feb. 2006.

Abdenour Soualhi (S’11) received the Master’s degree in electrical engineering from the “Université Claude Bernard Lyon 1,” Lyon, France, in 2010. He is currently working toward the Ph.D. degree in the Department of Electrical Engineering, Laboratoire Ampère, UMR 5005, Villeurbanne, France.
His research interests are fault diagnosis and prognosis by means of pattern recognition and artificial intelligence techniques.

Hubert Razik (M’98–SM’03) graduated from the Ecole Normale Supérieure, Cachan, France, in 1987 and received the Ph.D. degree in electrical engineering from the Institut Polytechnique de Lorraine, Nancy, France, in 1991.
Since November 1, 2009, he has been a Professor of electrical engineering with the “Université Claude Bernard Lyon 1,” Lyon, France, and with Laboratoire Ampère, UMR 5005, Villeurbanne, France. His fields of research include the monitoring conditions of multiphase induction motor.

Guy Clerc (M’90–SM’10) received the Engineer’s degree and the Ph.D. degree in electrical engineering from the Ecole Centrale de Lyon, Lyon, France, in 1984 and 1989, respectively.
He is a university Professor. He teaches electrical engineering at the “Université Claude Bernard Lyon 1,” Lyon. He carried out research on control and diagnosis of induction machines at Laboratoire Ampère, UMR 5005, Villeurbanne, France.

Dinh Dong Doan received the M.A.Sc. degree in automated system engineering from INSA Lyon, Lyon, France, in 2012. He is currently working toward the Ph.D. degree in the Department of Automatic Control and Micro-Mechatronic Systems, Femto-ST Institute, Besançon Cedex, France.
From February to June 2012, he was with Laboratoire Ampère, “University of Claude Bernard Lyon 1,” Lyon. His research interests are SHM, prognostics and health management, NDT, composite materials, pattern recognition and machine learning, data clustering, and neural networks.
