Application of Relevance Vector Machine and Survival Probability To Machine


Expert Systems with Applications 38 (2011) 2592–2599


Application of relevance vector machine and survival probability to machine degradation assessment

Achmad Widodo a, Bo-Suk Yang b,⇑
a Mechanical Engineering Department, Diponegoro University, Tembalang, Semarang 50275, Indonesia
b School of Mechanical Engineering, Pukyong National University, San 100, Yongdang-dong, Nam-gu, Busan 608-739, South Korea

Article info

Keywords: Machine prognostics; Survival probability; Relevance vector machine; Censored data; Uncensored data

Abstract

Condition monitoring (CM) of machine health, or of industrial components and systems, that can detect, classify and predict impending faults is critical in reducing operating and maintenance costs. Many papers have reported valuable models and methods for prognostic systems. However, few papers deal with censored data, which are common in machine condition monitoring practice. This work deals with the development of a machine degradation assessment system that utilizes both censored and complete data collected from the CM routine. A relevance vector machine (RVM) is selected as the intelligent system and is trained with input data obtained from run-to-failure bearing data and with target vectors of survival probability estimated by the Kaplan–Meier (KM) and probability density function estimators. After validation, the RVM is employed to predict the survival probability of an individual unit of a machine component. The plausibility of the proposed method is shown by applying it to bearing degradation data to predict the survival probability of an individual unit.
© 2010 Elsevier Ltd. All rights reserved.

⇑ Corresponding author. Tel.: +82 51 620 1604; fax: +82 51 620 1405. E-mail address: bsyang@pknu.ac.kr (B.-S. Yang).
0957-4174/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2010.08.049

1. Introduction

Prognostics has emerged as an alternative to traditional reliability prediction, run-to-failure, and scheduled maintenance. It is also an important aspect of surveillance systems for machine components or equipment. Such a system is built from several modules that use devices for data acquisition and that perform condition monitoring, fault diagnostics and prognostics. Condition monitoring and fault diagnostics have been well developed over several decades, while prognostics methods have only recently attracted much attention in engineering maintenance research. One reason for the growing interest in prognostics is that its application offers several advantages, such as reducing production downtime, spare-parts inventory, maintenance cost and safety hazards. Another reason is that the prognostics requirements of modern maintenance systems and safety-critical components have become a mission that presents many challenges for engineering system design. Prognostics usually aims to accurately predict one of several related measures, such as remaining useful life (RUL), time-to-failure (TTF) or probability-of-failure (POF) of machine components or engineering assets. These objectives are essential to support a maintenance process that is able to estimate future equipment health status and to anticipate problems and maintenance routines before downtime occurs. The capability of prediction would enable the maintainer to execute a very beneficial strategy based on the expected future machine condition.

Currently, existing prognostics techniques are developed using TTF data-based, stress-based and effects-based approaches (Hines & Usynin, 2008). The TTF data-based approach uses statistical methods, e.g., Weibull analysis of historical time-to-failure data. This technique typically involves fitting a probabilistic failure distribution to historical data. The logical extension of this method is the correlation of failure event history with more specific health condition data. It estimates the life of an average component under average usage conditions. This method was implemented by Groer (2000), who analyzed TTF with a Weibull model. The suitability of the Weibull distribution for machine failure estimation was studied by Schömig and Rose (2003). The stress-based approach considers the environmental stresses, e.g., temperature, load and vibration, under which the equipment operates. A common method is the proportional hazard model (PHM), which utilizes regression and life-tables as proposed by Cox (1972). This method uses prior observations of explanatory variables such as stress, vibration, temperature, current and voltage, together with the response variable, which is usually failure time, to predict component life. The environmental conditions, termed covariates ($z$), are used to modify a baseline hazard rate ($\lambda_0$) to obtain a new hazard rate. Failure data collected at the covariate operating conditions are used to solve for the unknown parameter $\beta$ using maximum likelihood estimation (MLE). Research works on
PHM were reported, e.g., by Jardine, Anderson, and Mann (1987), Jardine, Ralston, Reid, and Stafford (1989) and Mazucchi and Soyer (1989). The effects-based prognostics approach uses degradation measures to perform a prognostics prediction. These degradation measures are scalar or vector quantities that numerically represent the current ability of the system to perform its designated functions properly. This technique is similar to the data-driven technique in the prognostics literature. The data-driven method is a popular prognostics technique; however, it usually requires a large amount of data to reach high accuracy and good performance of RUL estimation. In this case, time series analysis techniques have been used to predict the future state of machines based on previous states. Examples of machine prognostics research that used the data-driven technique are Yang and Widodo (2008), Tran, Yang, Oh, and Tan (2008) and Niu and Yang (2009).

Among the expert system and intelligent techniques applied in prognostic systems, the artificial neural network (ANN) is one of the popular methods. An ANN learns from examples and aims to capture the relationship among data. The remaining problem of ANNs is that the reasoning behind their decisions is not always evident; nevertheless, they are a feasible tool for practical problems and are easier to build than mathematical models describing the physics of a system (Vachtsevanos, Lewis, Roemer, Hess, & Wu, 2006). ANNs were reported as a tool for prognostics systems by many researchers (Gebraeel, Lawley, Liu, & Parmeshwaran, 2004; Huang et al., 2007; Shao & Nezu, 2000; Tse & Atherton, 1999; Wen & Zhang, 2004). Other reported methods use the support vector machine (Yang & Widodo, 2008), regression trees and neuro-fuzzy systems (Tran et al., 2008; Tran, Yang, & Tan, 2009), and Dempster–Shafer regression (Niu & Yang, 2009).

This paper contributes an intelligent machine prognostics system based on probability estimation from the CM data of historical units when some data are censored, i.e., the units did not undergo failure. This situation commonly occurs in practice when preventive replacements are conducted while the units under study are still operating. Moreover, CM data are integrated with reliability analysis to enable a longer-range prognostic system. The censored data of historical units are rarely considered as prognostic input data and have not been fully utilized, even though censoring is very common in practice, where the system does not consist of a single unit but of a population of units. Therefore, the relation between CM data and the actual survival state of the assets needs to be deduced.

This work complements the intelligent prognostics system of the previous work by Heng, Tan, and Mathew (2008) and utilizes the relevance vector machine (RVM) to predict the survival probability of the units under study. The training inputs for the RVM were generated from simulated and experimental bearing defect degradation data that include censored data. The target vectors were survival probabilities obtained from survival analysis using the Kaplan–Meier (KM) and probability density function (PDF) estimators.

2. Theoretical background

2.1. Survival analysis

Survival analysis is the name for a collection of statistical techniques used to describe and quantify time-to-event data. In survival analysis, we use the term 'failure' to define the occurrence of the event of interest and the term 'survival time' to specify the length of time taken for failure to occur. Situations where survival analysis has been used include prognostics of the lifetime of machine components, time from diagnosis to death in clinical trials, duration of industrial disputes, time from infection to disease onset, etc. Our work uses survival analysis to estimate the remaining useful life (RUL) of machine components. We therefore draw a random sample of these machine components, put them on test, collect and analyze the data, and then make inferences from them. This work employs the KM and PDF estimators to generate survival probabilities as the target vectors of our prognostics system.

The KM estimator, also known as the product-limit estimator of the survivor function, is a non-parametric estimator (Kaplan & Meier, 1958) that uses intervals starting at the observed failure ('death') times. The standard formula of this survivor function is given by

$$\hat{S}(t) = \prod_{j=1}^{k} \frac{n_j - d_j}{n_j} \tag{1}$$

Since, by construction, there are $n_j$ units that survive until just before $t_j$ and $d_j$ failures occurring at $t_j$, the probability that a unit fails in the short interval ending at $t_j$ is estimated by $d_j/n_j$. Thus, the probability of a unit surviving through $[t_j, t_{j+1}]$ is estimated by $(n_j - d_j)/n_j$. The only influence of the censored data is in the computation of the number of units, $n_j$, that survive until just before $t_j$. If a censored survival time occurs simultaneously with one or more unit failures, then the censored survival time is taken to occur immediately after the failure time.

In the case of complete failure data, meaning that the machine components had reached failure when they were removed from the machine, we adopt the previous work of Heng et al. (2008), and the survivor function is calculated by

$$\hat{S}(t+k) = \begin{cases} 1, & 0 \le t+k < T \\ 0, & t+k > T \end{cases} \tag{2}$$

where $T$ is the failure time.

A data set is considered censored if the machine components had not reached the failure threshold when they were removed from the machine. In this work, the standard formula of the KM estimator was modified to produce the cumulative survival probability for an individual unit of a machine component, which is given by

$$\hat{S}(t+k) = \begin{cases} 1, & 0 \le t+k < L \\ \prod_{L \le t_j \le t+k} \dfrac{n_j - d_j}{n_j}, & t+k > L \end{cases} \tag{3}$$

where $L$ denotes the last observed survival time of the unit.

Note that we use the last observed survival time $L$ of each censored unit as the starting time, rather than time 0, to compute the appropriate training target survival probabilities.

The PDF estimator is employed to estimate the survivor function of each unit $j$, which is derived from the CM data $Y_j(t)$ at time $t$. In this case, the estimated survival probability is the successive product of the probabilities of units that have survived the preceding intervals with condition indices higher than the observed index of the item but lower than the threshold. This is given by

$$\hat{S}(t+\delta k) = \prod_{j=1}^{k} \frac{\int_{y_{i,t+\delta k}}^{Y_{\text{threshold}}} f(y \mid t+\delta k)\,dy}{\int_{y_{i,t+\delta k}}^{\infty} f(y \mid t+\delta k)\,dy} \tag{4}$$

where $\delta$ is the time interval.

Finally, the target vectors for training are the mean of the survival probabilities obtained by the above methods.

2.2. Relevance vector machine (RVM)

The RVM is a Bayesian treatment of a generalized linear model with the same functional form as the support vector machine (SVM). It differs from the SVM in that its solution provides a probabilistic interpretation of its outputs (Tipping, 2000). RVM evades the complexity by producing models which have both a structure
and a parameterization process that, together, are appropriate to the information content of the data. As a supervised learning method, the RVM starts with a set of data inputs $\{x_n\}_{n=1}^{N}$ and their corresponding targets $\{t_n\}_{n=1}^{N}$. The aim is to learn a model of the dependency of the targets on the inputs in order to make accurate predictions of $t$ for unseen values of $x$. Typically, the predictions are based on a function $y(x)$ defined over the input space, and learning is the process of inferring the parameters of this function. In the context of SVM, this function takes the form

$$y(x) = \sum_{i=1}^{N} w_i K(x, x_i) + w_0 \tag{5}$$

where $w = \{w_1, w_2, \ldots, w_N\}$ is the weight vector, $w_0$ is the bias and $K(x, x_i)$ is a kernel function.

The RVM seeks to predict the target $t$ for any query $x$ according to

$$t_n = y(x_n) + \epsilon_n \tag{6}$$

where $\epsilon_n$ are independent samples from a noise process with mean 0 and variance $\sigma^2$.

The likelihood of the data set can be written as

$$p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left\{ -\frac{1}{2\sigma^2} \lVert t - \Phi w \rVert^2 \right\} \tag{7}$$

where $\Phi$ is the $N \times (N+1)$ design matrix whose $n$-th row is $[1, K(x_n, x_1), K(x_n, x_2), \ldots, K(x_n, x_N)]$.

Maximum likelihood estimation of $w$ and $\sigma^2$ in Eq. (7) often results in overfitting. Therefore, Tipping (2001) recommended imposing prior constraints on the parameters $w$ by adding a complexity penalty to the likelihood or error function. This a priori information controls the generalization ability of the learning process. Typically, new higher-level parameters are used to constrain an explicit zero-mean Gaussian prior probability distribution over the weights

$$p(w \mid \alpha) = \prod_{i=0}^{N} \mathcal{N}(w_i \mid 0, \alpha_i^{-1}) \tag{8}$$

where $\alpha$ is a vector of $(N+1)$ hyperparameters that control how far from zero each weight is allowed to deviate (Schölkopf & Smola, 2002).

Using Bayes' rule, the posterior over all unknowns could be computed, given the defined non-informative prior distributions

$$p(w, \alpha, \sigma^2 \mid t) = \frac{p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)}{\int p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)\, dw\, d\alpha\, d\sigma^2} \tag{9}$$

However, we cannot compute the posterior in Eq. (9) directly since we cannot perform the normalizing integral $p(t) = \int p(t \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2)\, dw\, d\alpha\, d\sigma^2$. Instead, we decompose the posterior as

$$p(w, \alpha, \sigma^2 \mid t) = p(w \mid t, \alpha, \sigma^2)\, p(\alpha, \sigma^2 \mid t) \tag{10}$$

to facilitate the solution. The posterior distribution of the weights is given by

$$p(w \mid t, \alpha, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{p(t \mid \alpha, \sigma^2)} \tag{11}$$

Eq. (11) has an analytical solution in which the posterior covariance and mean are

$$\Sigma = (\Phi^T B \Phi + A)^{-1} \tag{12}$$

$$\mu = \Sigma \Phi^T B t \tag{13}$$

with $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$ and $B = \sigma^{-2} I$.

Note that $\sigma^2$ is also treated as a hyperparameter, which may be estimated from the data. Therefore, learning becomes a search for the most probable hyperparameter posterior. Predictions for new data are then made by integrating out the weights to obtain the marginal likelihood for the hyperparameters

$$p(t \mid \alpha, \sigma^2) = \int p(t \mid w, \sigma^2)\, p(w \mid \alpha)\, dw = (2\pi)^{-N/2} \left| B^{-1} + \Phi A^{-1} \Phi^T \right|^{-1/2} \exp\left\{ -\frac{1}{2} t^T \left( B^{-1} + \Phi A^{-1} \Phi^T \right)^{-1} t \right\} \tag{14}$$

3. Methodology

The machine degradation assessment methodology is depicted in Fig. 1. It employs the CM data of $j$ machine units obtained from the CM routine. Feature calculation is performed to obtain good features that represent a clear, progressive degradation of the machine. When we deal with multiple features, feature extraction should be performed to map the calculated features from a high-dimensional space onto a lower-dimensional space. We can employ unsupervised learning techniques such as principal or independent component analysis and the self-organizing map for feature extraction. A one-dimensional feature may be obtained by unsupervised learning, from which the survival probability is calculated. The survival probability is then estimated by the KM and PDF estimators and used as target vectors for RVM training and validation. The quality of validation is measured by the root-mean-square error (RMSE), where lower is better, and by the correlation (R). One or more CM data sets from an individual unit can be used to test the performance of the RVM model after validation. The weights obtained from the validation process are saved and then used to test the RVM-based machine degradation assessment.

Fig. 1. Machine degradation assessment method. (Flowchart blocks: CM data of j unit machines; feature calculation and extraction; survival probability estimation producing the target vectors; training RVM and validation with a "Good?" Yes/No decision; testing RVM model; machine degradation assessment.)

Fig. 2. Simulated signal of outer-race defect: (a) time domain plot of raw signal, (b) frequency spectrum of raw signal and (c) fault detection after demodulation. (Panel title: outer-race fault, 100 rpm, BPFO = 4.89 Hz, Gaussian noise 30 dB; the peaks in (c) are marked at 4.88, 9.77, 14.65, 19.53 and 24.41 Hz.)
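As a hedged illustration of the product-limit estimator of Eq. (1) in Section 2.1, which supplies the survival-probability targets used in the methodology, the following Python sketch computes the estimate for a small population that contains censored units. The function name `kaplan_meier` and the toy data are assumptions made for illustration and are not taken from the paper. The modified estimators of Eqs. (2)–(4) are not implemented here.

```python
import numpy as np

def kaplan_meier(times, failed):
    """Product-limit estimate of the survivor function, Eq. (1).

    times  : observed time of each unit (failure or censoring time)
    failed : True if the unit failed at that time, False if censored
    Returns the distinct failure times t_j and the estimate S(t_j).
    """
    times = np.asarray(times, dtype=float)
    failed = np.asarray(failed, dtype=bool)
    event_times = np.unique(times[failed])
    surv = []
    s = 1.0
    for tj in event_times:
        # n_j: units still under observation just before t_j. A unit
        # censored exactly at t_j is still counted, i.e. censoring is
        # taken to occur immediately after the failure time.
        n_j = np.sum(times >= tj)
        d_j = np.sum((times == tj) & failed)
        s *= (n_j - d_j) / n_j
        surv.append(s)
    return event_times, np.array(surv)

# Hypothetical population: six units, two of them censored.
t_obs = [3, 5, 5, 8, 10, 12]
is_failure = [True, True, False, True, False, True]
t_j, s_hat = kaplan_meier(t_obs, is_failure)
for tj, s in zip(t_j, s_hat):
    print(f"t = {tj:4.1f}  S(t) = {s:.4f}")
```

In this toy example the censored units never trigger a drop in the estimate; they only leave the at-risk count $n_j$ once their observation time has passed, which is the only influence of censoring described in Section 2.1.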
Fig. 3. Defective bearing signal simulation. (Three panels of amplitude, from -1 to 1, versus time, 0 to 5 s.)

Fig. 4. Peak, kurtosis and entropy estimation of simulated defective bearing signal. (Three panels versus time-step, 0 to 100.)

Fig. 5. (a) Feature extraction by PCA and (b) presentation of QE obtained from different dataset. (Panel (b): quantization error (QE) versus time step for Datasets 1, 10 and 39.)

4. Application on machine degradation

The proposed method is validated using simulated data of bearing defect degradation and real data obtained from experimental work. In the simulation, we generated vibration CM data that represent the defect propagation of a rolling element bearing with a Matlab program. The properties of the rolling element bearing in the simulation were as follows: a pitch diameter of 23 mm, nine rolling elements, a roller diameter of 8 mm and a contact angle of 0°. We conducted the bearing outer-race defect simulation at a rotating speed of 100 rpm and a sampling frequency of 5 kHz. Fig. 2(a) shows the simulated time domain signal of the bearing with an outer-race defect. This signal was converted to the frequency domain using the fast Fourier transform (FFT), as shown in Fig. 2(b). This figure shows that the spectrum was dominated by high-frequency resonant signals. To separate the bearing fault frequency signal from these dominant signals, the vibration signals were band-pass filtered and rectified. Fig. 2(c) shows that peaks were detected at 4.88 Hz and its multiples, which closely match the calculated outer-race fault frequency indicated at the top of the figure.
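The calculated outer-race fault frequency can be checked with the standard kinematic formula for the ball-pass frequency of the outer race (BPFO) with a stationary outer race, $f_{BPFO} = \frac{n}{2} f_r \left(1 - \frac{d}{D}\cos\alpha\right)$. The paper does not state which formula was used, so the sketch below is an assumption; the function name `bpfo` is hypothetical, and the inputs are the simulation parameters listed above.

```python
import math

def bpfo(shaft_rpm, n_rollers, roller_diameter, pitch_diameter,
         contact_angle_deg=0.0):
    """Ball-pass frequency of the outer race in Hz (fixed outer race).

    roller_diameter and pitch_diameter must share the same unit.
    """
    shaft_hz = shaft_rpm / 60.0
    ratio = (roller_diameter / pitch_diameter
             * math.cos(math.radians(contact_angle_deg)))
    return 0.5 * n_rollers * shaft_hz * (1.0 - ratio)

# Simulation parameters from Section 4: 100 rpm, 9 rolling elements,
# roller diameter 8 mm, pitch diameter 23 mm, contact angle 0 degrees.
f_bpfo = bpfo(shaft_rpm=100, n_rollers=9,
              roller_diameter=8.0, pitch_diameter=23.0)
print(f"BPFO = {f_bpfo:.2f} Hz")
```

Under these assumptions the result agrees with the 4.89 Hz value shown in the title of Fig. 2 and lies close to the 4.88 Hz peak detected after demodulation.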
Fig. 6. Bearing test rig and sensor placement illustration (Qiu et al., 2006).

Fig. 7. Kurtosis of vibration data and threshold of failure condition. (Kurtosis of vibration versus measurement points, 0 to 2000, for Data No. 1-Bearing 3, Data No. 2-Bearing 1 and Data No. 3-Bearing 3, with the failure threshold marked.)

Fig. 8. Validation of RVM training with QE. (Survival probability versus measurement points; RMSE = 2.73e-6, R = 0.98.)

Fig. 9. Overfitting prediction of simulation data. (Actual values and predictions for kernel widths 5e-5, 2.5e-5 and 2.5e-6 versus measurement points.)

Table 1
Performance of RVM testing w.r.t. kernel-width using bearing simulation data.

Kernel-width              RMSE     R
$5.0 \times 10^{-4}$      0.170    0.92
$2.5 \times 10^{-4}$      0.115    0.95
$5.0 \times 10^{-5}$      0.060    0.97
$2.5 \times 10^{-5}$      0.048    0.98
$5.0 \times 10^{-6}$      0.046    0.98
$2.5 \times 10^{-6}$      0.266    0.79
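To make the RMSE and R columns of Table 1 concrete, the following Python sketch evaluates the posterior mean of the weights from Eqs. (12) and (13) for several kernel parameters and scores the resulting fit. It is a simplified illustration under stated assumptions, not the authors' implementation: the hyperparameters `alpha` and `sigma2` are held fixed instead of being re-estimated as in a full RVM, the Gaussian kernel form $K(a, b) = \exp(-\gamma (a - b)^2)$ is assumed, and the data, the `gamma` values and all function names are hypothetical and unrelated to the kernel widths in the table.

```python
import numpy as np

def design_matrix(x, centers, gamma):
    """Rows [1, K(x_n, x_1), ..., K(x_n, x_N)] with an assumed
    Gaussian kernel K(a, b) = exp(-gamma * (a - b)**2)."""
    k = np.exp(-gamma * (x[:, None] - centers[None, :]) ** 2)
    return np.hstack([np.ones((x.size, 1)), k])

def posterior_weights(phi, t, alpha, sigma2):
    """Posterior covariance and mean of the weights, Eqs. (12)-(13),
    for fixed hyperparameters alpha and noise variance sigma2."""
    a = np.diag(alpha)
    b = np.eye(t.size) / sigma2
    cov = np.linalg.inv(phi.T @ b @ phi + a)
    mean = cov @ phi.T @ b @ t
    return cov, mean

def rmse(target, pred):
    """Root-mean-square error between target and prediction."""
    return float(np.sqrt(np.mean((target - pred) ** 2)))

def correlation(target, pred):
    """Pearson correlation coefficient R."""
    return float(np.corrcoef(target, pred)[0, 1])

# Hypothetical 1-D health index and a smooth survival-probability target.
x = np.linspace(0.0, 4.0, 50)
t = 1.0 / (1.0 + np.exp(4.0 * (x - 2.5)))
sigma2 = 1e-2

for gamma in (0.1, 1.0, 10.0):
    phi = design_matrix(x, x, gamma)
    alpha = np.ones(phi.shape[1])  # fixed here, not re-estimated
    _, mu = posterior_weights(phi, t, alpha, sigma2)
    y = phi @ mu
    print(f"gamma={gamma:5.1f}  RMSE={rmse(t, y):.4f}  "
          f"R={correlation(t, y):.4f}")
```

A full RVM would additionally update `alpha` and `sigma2` by maximizing the marginal likelihood of Eq. (14), which prunes most basis functions; that step is omitted to keep the sketch short.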

The simulated signals were repeatedly generated by a computer program based on the equations presented by McFadden and Smith (1984) and Wang and Kootsookos (1998), while the defect severity was increased exponentially with random fluctuations to represent real conditions (Fig. 3). Every simulated signal has defect impulses that increase at different rates and measurement times. The signals were set up to have the same failure threshold, but the time to reach failure was different for each data set. It has been observed in bearing life tests that bearing degradation signals possess an inherent exponential growth (Gebraeel, Lawley, Rong, & Ryan, 2005; Gebraeel, 2006).

We calculated three features from the time domain signals, namely peak, kurtosis and entropy estimation (as depicted in Fig. 4), and then performed feature extraction by means of PCA to reduce the dimensionality of the calculated features. This feature reduction was intended to minimize the input of the RVM network and the training time. After PCA training, the deviations between the mapped features of the simulated signals and the healthy state conditions were calculated. These deviations are regarded as quantization errors (QE), as depicted in Fig. 5(a). Fig. 5(b) shows the QE of different simulated data sets generated from the defective bearing simulation.

We generated 40 datasets and obtained the corresponding QE values. Thirty-six of the 40 datasets were employed for training and the remaining four for testing the system. In the training datasets, one third of the data were set as censored data. The target vectors for the training process were obtained from the KM and PDF estimators.

Fig. 10. Bearing defect degradation of dataset No. 30. (QE versus measurement points; actual failure time marked at X: 100, Y: 4.075.)

Fig. 11. RVM prediction of bearing defect degradation dataset No. 30. (Survival probability versus measurement points; actual values and prediction with kernel width = 5e-6; early overfitting annotated; predicted failure time marked at X: 98, Y: 0.1308.)

Fig. 12. Validation of RVM training with kurtosis of vibration experimental data. (Survival probability versus measurement points, actual and prediction; RMSE = 1.29e-6, R = 0.99.)

Fig. 13. Overfit prediction of experimental bearing data. (Survival probability versus measurement points; predictions with kernel widths 0.5 and 1e-1.)

Fig. 14. Bearing degradation data for testing RVM. (Kurtosis of vibration versus measurement points; threshold reached at X: 692, Y: 46.)

Table 2
Performance of RVM testing w.r.t. kernel-width using experimental bearing data.

Kernel-width    RMSE      R
0.5             28.427    0.25
0.1             8.046     0.31
0.01            0.906     0.47
0.001           0.146     0.88
0.0001          0.044     0.98
0.00001         0.011     0.99

The experimental data were generated from a bearing test rig that is able to produce run-to-failure data. These data were downloaded from the Prognostics Center of Excellence (PCoE) through the prognostic data repository, contributed by the Intelligent Maintenance System (IMS), University of Cincinnati (Lee, Qiu, Yu, Lin, & Rexnord, 2007). The bearing test rig consists of four bearings installed on one shaft, as presented in Fig. 6. The rotation speed of the shaft was kept constant at 2000 rpm, and a radial load of 6000 lb was applied to the shaft and bearings through a spring mechanism. The bearings used were Rexnord ZA-115 double row bearings that have 16 rollers in each row, a pitch diameter of 2.815 in., a roller diameter of 0.311 in., and a tapered contact angle of 15.17°. The vibration signals were acquired by eight accelerometers from PCB 353B33
(high-sensitivity quartz ICP accelerometers) that were installed in the vertical and horizontal directions. Four thermocouples were also installed on the outer race of each bearing to record the bearing temperature for lubrication monitoring purposes. Vibration signals were collected every 20 min by an NI-DAQCard 6062E data acquisition card at a sampling rate of 20 kHz.

The collected vibration data were 12 complete failure datasets with different failure times and four datasets regarded as normal condition. The original data were truncated because of the large number of measurement points that represent the normal condition. In our work, we used 2100 measurement points, which are still able to show the normal condition and the failure event. We calculated and used only a one-dimensional feature, namely kurtosis, for validating the proposed method. The kurtosis of the vibration data and the assumed failure threshold are shown in Fig. 7. In the case of complete failure, we took only eight datasets and calculated the survival probability as target vectors for training. Four datasets were taken to represent the censored data, and the modified KM and PDF estimators were then used to determine the survival probability of the censored data. One remaining dataset was used to test the performance of the system after training the RVM.

5. Result and discussion

In the case of the simulation data, we trained the RVM using inputs from the QE of 36 CM bearing degradation datasets and target vectors obtained by the KM and PDF estimators. In RVM training, we employed a Gaussian kernel and performed 2-fold cross-validation to obtain a proper kernel-width parameter ($\gamma$). We searched the kernel-width value in the range $\{5 \times 10^{-4}, 2.5 \times 10^{-4}, \ldots, 2.5 \times 10^{-6}\}$ to optimize the RVM training process. The validation process is shown in Fig. 8, with acceptable RMSE and R of $2.73 \times 10^{-6}$ and 0.98, respectively.

The effect of an improper kernel-width parameter is presented in Fig. 9, which shows overfitting in the prediction of the survival probability of the bearing data. In addition, Table 1 reports the performance of the testing process after RVM validation with respect to kernel-width. In our work, kernel-width values were studied in the range $\{5 \times 10^{-4}, 2.5 \times 10^{-4}, \ldots, 2.5 \times 10^{-6}\}$, while values higher or lower than this range gave serious overfitting. The overfit prediction of survival probability resulted in over-prediction, as depicted in Fig. 9. The kernel-width obtained from cross-validation was $5 \times 10^{-6}$.

Fig. 10 shows the testing data for the validated RVM, obtained from the QE of bearing dataset No. 30. The actual failure time is located at $t_a = 100$. The RVM prediction is presented in Fig. 11, which gives a good prediction of the failure time at $t_p = 98$. At the early measurement points there is still some overfit prediction; however, it does not significantly reduce the prognostic meaning because the bearing is still in normal condition. The accuracy of the prediction can be simply calculated as

$$\text{Accuracy} = \left(1 - \frac{|t_a - t_p|}{t_a}\right) \times 100\% = \left(1 - \frac{|100 - 98|}{100}\right) \times 100\% = 98.0\%$$

In the case of the experimental data, the RVM was trained with 700 data points of the kurtosis of CM vibration data that represent run-to-failure data. In this case, we also employed a Gaussian kernel and performed 4-fold cross-validation to obtain a proper kernel-width parameter ($\gamma$). We searched the kernel-width value in the range $\{0.5, 0.1, \ldots, 5 \times 10^{-6}\}$ to optimize the RVM training process. The validation process is shown in Fig. 12, with plausible RMSE and R of $1.29 \times 10^{-6}$ and 0.99, respectively.

Improper kernel-width selection leads to overfitting, as presented in Fig. 13. In this case, the selection of a relatively high kernel-width gave a seriously overfit prediction of survival probability in RVM testing. The complete results of the RVM testing performance are summarized in Table 2. The best RVM testing performance was reached at a kernel-width of $1.0 \times 10^{-5}$, with RMSE and R of 0.01 and 0.99, respectively. In addition, the selection of a kernel-width lower than $1.0 \times 10^{-5}$ resulted in high error and low correlation of the survival probability.

Fig. 14 shows the individual bearing data that were used for testing the validated RVM. These data were not involved in training the RVM and reached failure at $t_a = 692$. The RVM-based survival probability prediction is depicted in Fig. 15 and predicts the failure time at $t_p = 664$. The maximum amplitude of the kurtosis of the vibration data, which reached the threshold at $t = 692$, matches the decrease of the survival probability to $S = 0$, which represents the failure condition of the bearing under study. The plausibility of the prediction can be shown from the accuracy given by

$$\text{Accuracy} = \left(1 - \frac{|t_a - t_p|}{t_a}\right) \times 100\% = \left(1 - \frac{|692 - 664|}{692}\right) \times 100\% \approx 95.95\%$$

Fig. 15 also presents overfit prediction at the early measurement points, but this does not significantly decrease the prognostic meaning because the machine is still in normal condition.

Fig. 15. RVM prediction of bearing failure time. (Survival probability versus measurement points; early overfitting annotated; predicted failure time marked at X: 664, Y: 0.007243.)

6. Conclusion

This paper presents a study of machine degradation assessment based on RVM and survival probability. The RVM was trained with simulation and experimental CM data, including censored data, to obtain a good prognostics model. Target vectors were generated by the KM and PDF estimators, which represent the survival probability of the population of machines being studied. The RVM was tested and validated with simulation and experimental data, and it gave plausible failure time predictions. Overfit prediction emerged at the early measurement points of both the simulation and the experimental data. However, this may be acceptable, and the prediction still carries prognostic meaning. The results obtained from the simulation and experimental data suggest that the method is plausible as a machine degradation assessment model.

Acknowledgement

This work was supported by the Brain Korea (BK) 21 project.
References

Cox, D. R. (1972). Regression model and life-tables. Journal the Royal Statistic Society, Series B (Methodological), 34(2), 187–220.
Gebraeel, N., Lawley, M., Liu, R., & Parmeshwaran, V. (2004). Residual life prediction from vibration-based degradation signals: A neural network approach. IEEE Transactions on Industrial Electronics, 51, 694–700.
Gebraeel, N. Z., Lawley, M. A., Rong, Li., & Ryan, J. K. (2005). Residual-life distribution from component degradation signals: A Bayesian approach. IIE Transactions, 37, 543–557.
Gebraeel, N. (2006). Sensory-updated residual life distribution for components with exponential degradation pattern. IEEE Transactions on Automation Science and Engineering, 3(4), 382–393.
Groer, P. G. (2000). Analysis of time-to-failure with a Weibull model. In Proceedings of the maintenance reliability conference, Knoxville, Tennessee, USA.
Heng, A., Tan, A., & Mathew, J. (2008). Asset health prognostics incorporating reliability data and condition monitoring histories. In J. Gao, J. Lee, J. Ni, L. Ma, & J. Mathew (Eds.), Proceeding of the 3rd world congress engineering asset management and intelligent maintenance system (WCEAM-IMS), Beijing, China (pp. 666–672).
Hines, J. W., & Usynin, A. (2008). Current computational trends in equipment prognostics. International Journal of Computational Intelligence System, 1(1), 94–102.
Huang, R., Xi, L., Li, X., Liu, C. R., Qiu, H., & Lee, J. (2007). Residual life predictions for ball bearings based on self-organizing map and back propagation neural network methods. Mechanical System and Signal Processing, 21, 193–207.
Jardine, A. K. S., Anderson, P. M., & Mann, D. S. (1987). Application of the Weibull proportional hazards model to aircraft and marine engine failure data. Quality and Reliability Engineering International, 3, 77–82.
Jardine, A. K. S., Ralston, P., Reid, N., & Stafford, J. (1989). Proportional hazards analysis of diesel engine failure data. Quality and Reliability Engineering International, 5, 207–216.
Kaplan, E. L., & Meier, P. (1958). Nonparametric estimation from incomplete observations. Journal of the American Statistical Association, 53, 457–481.
Lee, J., Qiu, H., Yu, G., Lin, J., & Rexnord (2007). Technical Services 2007. 'Bearing Data Set', IMS, University of Cincinnati. NASA Ames Prognostics Data Repository, NASA Ames, Moffett Field, CA. <http://ti.arc.nasa.gov/project/prognostic-data-repository> Accessed 06.07.09.
Mazucchi, T. A., & Soyer, R. (1989). Assessment of machine tool reliability using a proportional hazards model. Naval Research Logistics, 36(6), 765–777.
McFadden, P. D., & Smith, J. D. (1984). Model for the vibration produced by a single point defect in rolling element bearing. Journal of Sound and Vibration, 96, 69–82.
Niu, G., & Yang, B. S. (2009). Dempster–Shafer regression for multi-step-ahead time-series prediction towards data-driven machinery prognosis. Mechanical System and Signal Processing, 23(3), 740–751.
Qiu, H., Lee, J., Lin, J., & Yu, G. (2006). Wavelet filter-based weak signature detection method and its application on rolling element bearing prognostics. Journal of Sound and Vibration, 289(4-5), 1066–1090.
Schölkopf, B., & Smola, A. J. (2002). Learning with kernels: Support vector machines, regularization, optimization, and beyond. Cambridge, MA: MIT Press.
Schömig, A., & Rose, O. (2003). On the suitability of the Weibull distribution for the approximation of machine failures. In The Proceeding of Industrial Engineering Research Conference, Portland, Oregon, USA.
Shao, Y., & Nezu, K. (2000). Prognosis of remaining bearing life using neural network. Proceedings of the Institution of Mechanical Engineers, Part I: Journal of System and Control Engineering, 214(3), 217–230.
Tipping, M. E. (2000). The relevance vector machine. In S. Solla, T. Leen, & K. R. Muller (Eds.). Advances in neural information processing system (Vol. 12, pp. 287–289). Cambridge, MA: MIT Press.
Tipping, M. E. (2001). Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research, 1, 211–244.
Tran, V. T., Yang, B. S., Oh, M. S., & Tan, A. C. C. (2008). Machine condition prognosis based on regression trees and one-step-ahead prediction. Mechanical System and Signal Processing, 22(5), 1179–1193.
Tran, V. T., Yang, B. S., & Tan, A. C. C. (2009). Multi-step ahead direct prediction for the machine condition prognosis using regression trees and neuro-fuzzy systems. Expert System with Application, 36(5), 9378–9387.
Tse, P., & Atherton, D. (1999). Prediction of machine deterioration using vibration based fault trends and recurrent neural networks. Transaction of the ASME: Journal of Vibration and Acoustics, 121, 255–362.
Vachtsevanos, G., Lewis, F., Roemer, M., Hess, A., & Wu, B. (2006). Intelligent fault diagnosis and prognosis for engineering systems. New Jersey: John Wiley and Sons.
Wang, Y. F., & Kootsookos, P. J. (1998). Modelling of low shaft speed bearing faults for condition monitoring. Mechanical System and Signal Processing, 12(3), 415–426.
Wen, G., & Zhang, X. (2004). Prediction method of machinery condition based recurrent neural network models. Journal of Applied Sciences, 4, 675–679.
Yang, B. S., & Widodo, A. (2008). Support vector machine for machine fault diagnosis and prognosis. Journal of System Design and Dynamics, 2(1), 12–23.
