Mechanical Systems and Signal Processing 144 (2020) 106899

A two-stage method based on extreme learning machine for predicting the remaining useful life of rolling-element bearings
Zuozhou Pan, Zong Meng ⇑, Zijun Chen, Wenqing Gao, Ying Shi
Yanshan University, Qinhuangdao, PR China

Article info

Article history:
Received 29 November 2019
Received in revised form 10 April 2020
Accepted 11 April 2020

Keywords:
Rolling-element bearings
Multivariate feedback extreme learning machine (MFELM)
Small sample
Short-term prediction
Remaining useful life (RUL) prediction

Abstract

Rolling-element bearing is one of the main parts of rotating equipment. In order to avoid the mechanical equipment damage caused by the sudden failure of rolling-element bearings, it is necessary to monitor the condition of bearing and predict its life. Therefore, a two-stage prediction method based on extreme learning machine is proposed to predict the remaining useful life of rolling-element bearings quickly and accurately. This method uses the relative root mean square value (RRMS) to divide the operation stage of the bearing into two stages: normal operation and degradation. Starting from the normal operation stage, according to the principle of univariate prediction, a feedback extreme learning machine model is constructed for real-time short-term prediction of bearing degradation trend. Once the predicted value shows that the bearing has entered the degradation stage, the sensitive features are selected as the input by correlation analysis, and the multivariable feedback extreme learning machine model, which takes into account the dual advantages of multivariable regression and small sample prediction, is constructed to predict the remaining useful life. The experimental results show that the proposed method has higher short-term prediction accuracy and faster operation speed in the case of limited learning sample size.
© 2020 Elsevier Ltd. All rights reserved.

1. Introduction

Bearings are among the most important parts of mechanical equipment and are widely used in rotating machinery. Their performance directly affects the health of the whole equipment [1,2]. However, the bearing service life of many precision machines is not long. The service life of an aeroengine main bearing is only hundreds of hours, and the precision life of a CNC machine tool high-speed spindle bearing is thousands of hours. Once the operation time exceeds the service life, the bearing operation accuracy drops sharply, which can lead to the failure of aeroengines, CNC machine tools, and similar equipment [3]. Therefore, early detection of bearing damage can effectively avoid bearing failure and machine damage and reduce production losses and casualties as much as possible [4]. Accurate remaining useful life (RUL) prediction is also the premise for formulating reasonable performance inspection and maintenance plans [5,6].
Currently, most of the techniques used to estimate the RUL of rolling-element bearings can be roughly divided into model-based and data-driven methods [7]. Model-based bearing health prediction methods use the physical principles governing bearing degradation to develop a mathematical model. The mathematical model is then used to predict the future health condition of the bearing and estimate its RUL. Liao et al. [8] proposed a prediction method based on proportional hazard and logistic regression models to predict the RUL of rolling-element bearings. Tian et al. [9] proposed a proportional hazards model-based method for the RUL prediction of systems consisting of bearings. Liao [10] employed the Paris model combined with a genetic programming method to predict the RUL of bearings, whereas Li et al. [11] used an exponential model together with particle filters to estimate the RUL of bearings. Jantunen et al. [12] constructed a new type of degradation model for bearing RUL prediction by performing simplified trend analysis and curve fitting. Lei et al. [13] constructed a new nonlinear degradation model to describe the degradation process of rolling bearings, which considered four variable sources of the stochastic degradation process and could further improve the prediction accuracy. However, these model-based prediction methods have certain limitations, because they require a specific model for each system, and developing such models can be very expensive [14].

⇑ Corresponding author.
E-mail address: mzysu@ysu.edu.cn (Z. Meng).
https://doi.org/10.1016/j.ymssp.2020.106899
0888-3270/© 2020 Elsevier Ltd. All rights reserved.

In contrast, data-driven methods, which infer the health of the bearing using one or a set of characteristics of the measured data as degradation indicators, are more suitable for diagnosing faults in complex systems that are difficult to model. Gebraeel et al. [15] estimated a bearing’s RUL by training artificial neural networks on measurement data and analyzing the degradation behavior. Mao et al. [16] further improved the accuracy of bearing RUL prediction with a deep learning network that combines deep feature representation with transfer learning.
In addition, different variants of the Kalman filter have been used to estimate the bearing’s RUL [17–19], and neurofuzzy-based methods have also been used to estimate the RUL of bearings [20–22]. Data-driven techniques based on neural networks are highly effective, especially for complex systems where the development of a mathematical model might not always be feasible. However, traditional data-driven methods based on neural networks may need a large number of learning samples and a great deal of learning time [23], while real bearing failure data from factories are very rare and only very limited bearing samples can be obtained in laboratories, at a considerable cost in time and money [24]. At present, a network that can successfully predict RUL under small sample training is the improved LSTM network proposed by Qin et al. [25,26]. The improved network takes multiple sets of characteristics as input and amplifies the input weights and recursive weights of the hidden layer of the traditional LSTM network to different degrees, which increases the utilization of the input signal by the LSTM network. It further improves the prediction accuracy of the network by increasing the learning time of the samples, thus realizing accurate prediction in the small sample training environment. Therefore, how to build an appropriate model under limited state data is the key to accurately estimating the RUL and is also an urgent need of industrial production. Recently, Wasim et al. [27,28] developed a dynamic regression method suitable for life prediction of various bearings. This method determines the evolution trend of the data by measuring the growth rate of degradation indicators, and it uses an iteratively updated quadratic regression model to predict the future value of the degradation indicators. Without a learning process, it can predict the RUL of various bearings. Compared with neural network algorithms, this kind of method can be better used in practical situations. However, because it lacks a training process, large fitting errors inevitably exist in the early stage of prediction.
Based on the above situation, a two-stage prediction method based on the extreme learning machine (ELM) is proposed to predict the characteristic trend and the RUL of rolling-element bearings. As a single-hidden-layer neural network suited to small sample classification and prediction, ELM has unique advantages in training sample size and training speed [29,30]. In addition, the input weights and hidden layer biases in ELM are generated randomly, and the output weights are calculated analytically, so there is no prior condition restriction [31,32]. ELM-based prediction methods have higher prediction accuracy than multivariate function-based prediction methods. At the same time, Xu et al. [33] pointed out that the prediction accuracy of ELM can be further improved by adding a feedback layer to ELM.
In the process of bearing life prediction, this method divides the prediction into two parts:

1. From the normal operation stage of the rolling bearing to its complete failure, a feedback extreme learning machine (FELM) is used to predict the short-term deterioration trend of the bearing according to the relationship between the target signals, and the test samples are updated in real time according to the latest collected signals.
2. When the predicted value shows that the bearing has entered the degradation stage, multiple sensitive indicators of the bearing are used as the input of the multivariate feedback extreme learning machine (MFELM), and the RUL is used as the output to achieve RUL prediction.

The short-term degradation trend prediction process based on the FELM model has the advantages as follows:

1. By short-term prediction of the bearing degradation trend, the FELM model can detect the signs of bearing degradation earlier than other methods, so that the bearing’s RUL prediction process can start earlier.
2. With higher accuracy than the multi-objective RUL prediction method, the short-term degradation trend prediction process based on the FELM model can help the multi-objective RUL prediction method judge the final failure time when few learning samples are available.
3. The multi-objective RUL prediction model cannot describe the degradation trend of bearings, whereas the short-term degradation trend prediction process based on the FELM model can help users predict the degradation state of bearings in advance, so as to formulate a reasonable maintenance plan.

Once the predicted signal shows signs of bearing degradation, the MFELM model can quickly start the RUL prediction process by analyzing the changes of multiple sensitive indicators of the target signal. In general, the whole scheme has a high processing speed and accuracy, and its prediction results can be updated in real time according to the latest collected signal, which makes it well suited to bearing life prediction in real environments.
The main contributions of this article are as follows:

1. A two-stage prediction model is proposed. The residual life of bearings can be obtained directly while short-term prediction of the fault trend is performed.
2. An ELM model with a feedback mechanism is used to make short-term predictions of the states of rolling bearings and thus detect the bearing degradation threshold early. This model achieves short-term prediction under small sample learning conditions by making full use of the correlation between signal data.
3. On the basis of the single-variable RUL prediction method, several relative time-domain features are selected as the input of FELM by analyzing the correlation between the relative time-domain characteristics of bearings and the recession characteristics of bearings. This increases the amount of information contained in the input samples and realizes high-precision RUL prediction in the small sample learning environment.

The rest of this article is organized as follows: Section 2 describes the degradation indicator and threshold of rolling-element bearings. Section 3 presents the univariate prediction principle and the FELM-based short-term prediction method for the degradation trend of rolling-element bearings. Section 4 presents the multivariable prediction principle and the MFELM-based prediction method for the RUL of rolling-element bearings. In Section 5, the prediction methods are tested using accelerated degradation data of rolling-element bearings. Section 6 summarizes this article.

2. Assessment of bearing performance decline

From the beginning of operation to failure, rolling-element bearings generally undergo three stages: normal operation, degradation, and failure. The main task of bearing performance deterioration assessment is to construct an indicator that truly reflects how the bearing operation state changes [34].

2.1. Root mean square

The international standard ISO 2372 [35] gives an industry standard for mechanical vibration: when the root mean square
(RMS) value of the medium mechanical vibration signal reaches 2.0 to 2.2 g, the equipment is in a dangerous state. Root
mean square can be obtained by the following equation:
$$ x_{rms} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(x(i)\right)^{2}} \tag{1} $$

where $x(i)$ is the signal sequence and $i = 1, 2, \ldots, n$ indexes the data points.


The root mean square value, as a response quantity of the mechanical fault characteristic, changes with the fault characteristic. In the early stage of degradation, the RMS value stays nearly flat. In the late stage of degradation, however, the RMS value increases rapidly because the bearing damage has seriously affected the normal operation of the equipment. It is therefore reasonable to use RMS as a degradation indicator to reflect the bearing state. Fig. 1 shows the time-domain graph and the corresponding RMS curve of Bearing 1–1 (the bearing data are described in Section 5).

Fig. 1. Time-domain graph and RMS curve of Bearing 1–1. (a) Time-domain graph (amplitude vs. time in s), (b) RMS curve graph (RMS value vs. data point).

Each data point in the RMS curve is obtained by calculating the RMS value of the data collected in one 10-second interval. This reduces the amount of data from millions of points to thousands. The upward trend of the RMS value coincides with the degradation process of the bearing.
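As a minimal sketch of how Eq. (1) could be applied to turn raw vibration snapshots into an RMS curve, the following Python/NumPy example may help. The function names `rms` and `rms_curve` and the array layout (one snapshot per row) are assumptions made for illustration and do not come from the article.

```python
import numpy as np

def rms(x):
    """Root mean square of one signal segment, as in Eq. (1)."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean(x ** 2)))

def rms_curve(snapshots):
    """RMS value of every snapshot.

    Assumed layout: snapshots has shape (n_snapshots, n_points),
    one recorded vibration snapshot per row.
    """
    snapshots = np.asarray(snapshots, dtype=float)
    return np.sqrt(np.mean(snapshots ** 2, axis=1))

# Illustrative use with synthetic data: three snapshots of a unit-amplitude sine.
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
demo = np.tile(np.sin(2 * np.pi * 5 * t), (3, 1))
demo_curve = rms_curve(demo)  # one RMS value per snapshot
```

Each row collapses to a single number, which is the data reduction described above.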
Although the RMS reflects the operation state of the bearing, differences between individual bearings have a great influence on it. Even under the same test conditions, the RMS of different rolling-element bearings varies greatly [36], as shown in Fig. 2(a). This interferes with judging the bearing stage and makes it difficult to set a threshold for fault prediction. Therefore, the RMS is processed by normalization and linear rectification to obtain the relative RMS (RRMS).

2.2. Relative root mean square

1) Standardization
First, a series of stable RMS values during the normal period is intercepted, and the mean of these RMS values is taken as the standard value. Then, the ratio of each RMS value to the standard value is calculated, which gives the standardized RMS $x_{srms}$:

$$ x_{srms}(i) = \frac{x_{rms}(i)}{\frac{1}{m}\sum_{j=m_1}^{m_1+m} x_{rms}(j)} \tag{2} $$

where $m_1$ is the data point at which the interception starts, and $m$ is the intercepted length. Comparing Fig. 2(a) with Fig. 2(b) shows that, after standardization, the differences between the RMS values of the groups of signals are narrowed in the normal operation stage. At the same time, the peak RMS values of the different groups are balanced, which makes it easier to set the fault threshold in the later stage (the bearing data are described in Section 5).
2) Linear rectification technology
Despite the gradual deterioration of rolling-element bearings over time, degradation indicators may fluctuate and even appear to improve. From the modeling perspective, it is impractical to model these stochastic fluctuations in degradation indicators, which are not a typical phenomenon of the bearing degradation process. In order to reduce the randomness of the degradation indicators, the standardized RMS is smoothed by linear rectification technology (LRT) to obtain the RRMS, and the result is shown in Fig. 3. LRT is defined by Eqs. (3) and (4).

$$ x_{rrms}(i) = \begin{cases} x_{srms}(i), & \forall\, x_{srms}(i-1) \le x_{srms}(i) \le (1+k)\,x_{srms}(i-1) \\ x_{srms}(i-1) + k, & \forall\, x_{srms}(i) < x_{srms}(i-1) \ \lor\ x_{srms}(i) > (1+k)\,x_{srms}(i-1) \end{cases} \tag{3} $$

$$ k = \frac{\sum_{i=1}^{n} \left( x_{srms}(i+1) - x_{srms}(i) \right)}{n} \tag{4} $$
where $k$ represents the growth rate of $x_{srms}$. RRMS has the following advantages as an evaluation indicator of bearing performance degradation: (1) RRMS is sensitive to initial damage and grows steadily with the development of damage (in this article, the bearing is defined to have completely failed when the RRMS value exceeds 2.6); (2) RRMS is easy to calculate; (3) RRMS is not susceptible to individual differences of rolling-element bearings and has good versatility. As can be seen from Fig. 3, the larger the window width n of LRT, the smoother the RRMS curve. Therefore, after selecting an appropriate window width n, the estimation error caused by the randomness of vibration characteristics can be avoided in life prediction (in this article, the value of n is 30).

Fig. 2. The RMS of each group of bearing signals before and after standardization (RMS value vs. data point; bearings 1–2, 2–1, 2–2, 2–4, 2–6, and 3–1). (a) Before standardization. (b) After standardization.

Fig. 3. Relative RMS curves (RRMS value vs. data point) obtained from different window widths in LRT smoothing. The window widths are as follows: (a) n = 0, (b) n = 5, (c) n = 15, (d) n = 30, (e) n = 40, and (f) n = 50.
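The standardization and rectification steps of Eqs. (2) to (4) can be sketched in Python/NumPy as follows. This is only one possible reading, not the authors' implementation: the function names are hypothetical, the baseline is taken as the mean of the m samples starting at index m1, the growth rate k is estimated from the trailing n differences of the already rectified series, each new sample is compared with the previously rectified value, and n = 0 is treated as "no smoothing". The printed equations leave these details open.

```python
import numpy as np

def standardize_rms(rms, m1, m):
    """Eq. (2): divide each RMS value by the mean RMS of a stable segment.

    Assumption: the baseline is the mean of the m samples starting at
    (0-based) index m1.
    """
    rms = np.asarray(rms, dtype=float)
    baseline = rms[m1:m1 + m].mean()
    return rms / baseline

def growth_rate(window):
    """Eq. (4): mean first difference of the samples in the window."""
    window = np.asarray(window, dtype=float)
    if window.size < 2:
        return 0.0
    return float(np.mean(np.diff(window)))

def rectify_step(prev, cur, k):
    """Eq. (3): keep cur if prev <= cur <= (1 + k) * prev, else use prev + k."""
    if prev <= cur <= (1.0 + k) * prev:
        return cur
    return prev + k

def linear_rectify(x_srms, n):
    """Apply the rectification rule sample by sample (assumed chaining)."""
    x = np.asarray(x_srms, dtype=float)
    out = x.copy()
    if n < 1:  # assumption: a window width of 0 leaves the series unchanged
        return out
    for i in range(1, len(x)):
        # Growth rate from up to n differences of the rectified history.
        k = growth_rate(out[max(0, i - n - 1):i])
        out[i] = rectify_step(out[i - 1], x[i], k)
    return out
```

With this chaining, an isolated spike in the standardized RMS is replaced by the previous value plus the current growth rate, which is the smoothing effect described for Fig. 3.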

2.3. Degradation threshold

The RRMS value of the bearing usually stays constant while the bearing operates in a normal state. However, once the bearing degenerates, the RRMS value begins to increase. Depending on the propagation rate and damage growth rate of bearing defects, RRMS values can show a linear or nonlinear trend [37,38]. In this article, the n latest samples of RRMS are used to determine whether the bearing has entered the degradation stage. Once the starting time of bearing degradation is determined, the MFELM model is used to predict the RUL of the bearing.
Because RRMS is obtained by standardization and LRT smoothing of the RMS, RRMS suppresses sudden changes of the RMS. Therefore, the degradation state of the bearing can be judged by analyzing the gradient value of the samples in the latest window [27,28]. The gradient value of the n samples in the window can be obtained as follows:

$$ k_i = \frac{y_i(n) - y_i(1)}{n-1} \tag{5} $$

where $y_i$ is the set of RRMS values in the ith window, and $k_i$ is the gradient value in the ith window. Once the gradient value $k_i$ in the ith window reaches the set threshold, the bearing is considered to have begun to degenerate. After reviewing the degradation indicators of multiple rolling-element bearings and analyzing their stable behavior, 0.005 is selected as the degradation threshold for the bearing (Fig. 4).
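The window-gradient test of Eq. (5) can be illustrated with a short Python/NumPy sketch. The function names and the choice to return the index of the newest sample in the first window that trips the threshold are assumptions for illustration; the threshold 0.005 is the value stated in the text, and the window size used in the example is hypothetical.

```python
import numpy as np

def window_gradient(y):
    """Eq. (5): (last - first) / (n - 1) for the n samples in one window."""
    y = np.asarray(y, dtype=float)
    return (y[-1] - y[0]) / (len(y) - 1)

def detect_degradation_start(rrms, n, threshold=0.005):
    """Slide a window of n samples over the RRMS series.

    Returns the 0-based index of the newest sample in the first window whose
    gradient reaches the threshold, or None if no window does.
    """
    rrms = np.asarray(rrms, dtype=float)
    for end in range(n, len(rrms) + 1):
        if window_gradient(rrms[end - n:end]) >= threshold:
            return end - 1
    return None

# Synthetic example: 50 flat samples followed by a ramp of 0.01 per sample.
series = np.concatenate([np.ones(50), 1.0 + 0.01 * np.arange(1, 51)])
start = detect_degradation_start(series, n=10)  # n = 10 is a hypothetical choice
```

Because the gradient only uses the first and last sample of the window, detection lags the true onset by a few samples; a smaller window reacts faster but is more sensitive to noise.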
Fig. 4. Method for detecting bearing degradation (RRMS value vs. data point; a window of size n spans the 1st to the n-th value, and degradation is flagged when k_i exceeds the threshold value).

3. Short-term prediction model based on univariate prediction principle

3.1. Univariate prediction principle

For the univariate time series $\{x(1), x(2), \ldots, x(N)\}$, the embedding theory [39] assumes that there is a certain functional relationship between a future value of the sequence and the preceding m values. That is,
$$ x(i+m) = F\big(x(i), x(i+1), \ldots, x(i+m-1)\big) \tag{6} $$
If the first n samples are used for training and the remaining (N − n) samples for prediction, the training sample pairs and test sample inputs can be constructed according to (7) and (8).

$$ X_{train} = \begin{pmatrix} x(1) & x(2) & \cdots & x(m) \\ x(2) & x(3) & \cdots & x(m+1) \\ \vdots & \vdots & & \vdots \\ x(n-m) & x(n-m+1) & \cdots & x(n-1) \end{pmatrix}, \quad Y_{train} = \begin{pmatrix} x(m+1) \\ x(m+2) \\ \vdots \\ x(n) \end{pmatrix} \tag{7} $$

$$ X_{test} = \begin{pmatrix} x(n-m+1) & x(n-m+2) & \cdots & x(n) \\ x(n-m+2) & x(n-m+3) & \cdots & x(n+1) \\ \vdots & \vdots & & \vdots \\ x(N-m+1) & x(N-m+2) & \cdots & x(N) \end{pmatrix} \tag{8} $$
$X_{train}$ and $Y_{train}$ form the training sample pair, and $X_{test}$ is the input of the test sample. The univariate prediction method uses the information implicit in the sequence data, which requires that the sequence be sampled at equal intervals, and it does not consider the interaction of multiple variables. This method cannot directly estimate the RUL of rolling-element bearings from the measured values of current vibration, but it achieves a higher prediction accuracy when used to predict the short-term degradation trend of rolling-element bearings.
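The sliding-window construction of (7) and (8) can be sketched as follows in Python/NumPy. The function name `build_univariate_samples` is hypothetical; the indices follow the matrices as printed, translated to 0-based array indexing.

```python
import numpy as np

def build_univariate_samples(x, m, n):
    """Build X_train, Y_train (Eq. 7) and X_test (Eq. 8) from a series x.

    m: number of preceding values used as input; n: number of training samples.
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    if not (0 < m < n <= N):
        raise ValueError("need 0 < m < n <= len(x)")
    # Rows x(1..m), x(2..m+1), ..., x(n-m..n-1) in the paper's 1-based notation.
    X_train = np.stack([x[i:i + m] for i in range(n - m)])
    # Targets x(m+1), ..., x(n).
    Y_train = x[m:n]
    # Rows x(n-m+1..n), ..., x(N-m+1..N); the last row predicts one step past the data.
    X_test = np.stack([x[i:i + m] for i in range(n - m, N - m + 1)])
    return X_train, Y_train, X_test

# Example with x(i) = i, so every entry shows its own 1-based index.
X_train, Y_train, X_test = build_univariate_samples(np.arange(1, 11), m=3, n=6)
```

Each training row is paired with the value that immediately follows it, so a model fitted on these pairs performs one-step-ahead prediction.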

3.2. ELM model

As a single-hidden-layer feed-forward neural network, ELM has the advantages of simple structure, fast learning speed, good global search ability, and excellent generalization performance. The input weights and hidden layer biases of the ELM are generated randomly, while the output weights are calculated analytically. Therefore, the ELM does not need to learn iteratively like a traditional neural network. Its output weights can be obtained by calculating the generalized inverse of the output matrix of the hidden layer. This process greatly simplifies the network structure of the ELM. The basic ELM algorithm is as follows:
Input: Training sample set $\{(x_i, t_i)\}_{i=1}^{N} \subset \mathbb{R}^{n} \times \mathbb{R}^{m}$, testing sample set $\{y_i\}_{i=1}^{M} \subset \mathbb{R}^{n}$, activation function $g(\cdot)$, number of hidden layer nodes $L$.
Step 1: Calculate the output matrix H of the hidden layer:

$$ H(a_1, \ldots, a_L, b_1, \ldots, b_L, x_1, \ldots, x_N) = \begin{bmatrix} g(a_1, b_1, x_1) & \cdots & g(a_L, b_L, x_1) \\ \vdots & \ddots & \vdots \\ g(a_1, b_1, x_N) & \cdots & g(a_L, b_L, x_N) \end{bmatrix}_{N \times L} \tag{9} $$

$(a_i, b_i),\ i = 1, 2, \ldots, L$ are randomly generated hidden node parameters, where $a_i$ is the input weight of the ith hidden layer node, and $b_i$ is the bias of the ith hidden layer node.
Step 2: Calculate the output weight matrix $\beta$ of the hidden layer:
If H is nonsingular, the output weight is calculated using (10).

$$ \beta = H^{+} T \tag{10} $$

where T is the desired output and $H^{+}$ is the Moore-Penrose generalized inverse of the hidden node output matrix H. In the case where the hidden layer output has full column rank, $H^{+} = (H^{T}H)^{-1}H^{T}$.

$$ \beta = \begin{bmatrix} \beta_1^{T} \\ \vdots \\ \beta_L^{T} \end{bmatrix}_{L \times m}, \quad T = \begin{bmatrix} T_1^{T} \\ \vdots \\ T_N^{T} \end{bmatrix}_{N \times m} \tag{11} $$

Output: output weight matrix $\beta$.
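The two ELM steps above (random hidden layer, then a pseudoinverse solve for the output weights) can be sketched in Python/NumPy. The sigmoid activation, the uniform range of the random parameters, and the function names `elm_fit` and `elm_predict` are assumptions chosen for illustration; the article does not specify them in this section.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_fit(X, T, n_hidden=20, seed=0):
    """Train a basic ELM: random (a_i, b_i), then beta = pinv(H) @ T (Eq. 10)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    T = np.asarray(T, dtype=float)
    A = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # hidden biases
    H = sigmoid(X @ A + b)                                   # N x L hidden output (Eq. 9)
    beta = np.linalg.pinv(H) @ T                             # least-squares output weights
    return A, b, beta

def elm_predict(X, A, b, beta):
    """Apply the fixed random hidden layer and the learned output weights."""
    return sigmoid(np.asarray(X, dtype=float) @ A + b) @ beta

# Illustrative use on a toy regression problem (not bearing data).
x = np.linspace(0.0, 3.0, 50).reshape(-1, 1)
t = np.sin(x)
A, b, beta = elm_fit(x, t, n_hidden=20, seed=0)
pred = elm_predict(x, A, b, beta)
```

Only `beta` is learned, and it is obtained in one linear-algebra step, which is why no iterative training loop appears.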

3.3. FELM model and short-term prediction

In order to further improve the prediction accuracy and stability of the ELM model, especially in the small sample case, the FELM network is formed by adding a feedback layer between the output layer and the hidden layer of ELM to memorize the output data of the hidden layer [33].
1) Calculate the Feedback Layer Weight
The feedback extreme learning machine adds I feedback layers to the hidden layer to save the output of the hidden layer. Assuming that the current input is $x(k)$, the Ith layer saves the sample as $g(x(k-I))$. If the feedback weight is $W$, $W \in [0, 1]$, the feedback output weight at the Ith layer is expressed as follows:

$$ W_I = [W_1, W_2, \ldots, W_N], \quad W_i \in [0, 1] \tag{12} $$

The weight of the Ith layer is set to the Ith power of the weight of the first layer, which makes the feedback layer forget the past data.
2) Calculate the Trend Change Rate
Using sliding window technology to extract characteristics, the change rate of the data trend per unit time in the Ith layer is obtained by

$$ \eta_i = [\mu_1, \mu_2, \ldots, \mu_N], \quad i \in [1, 2, \ldots, N] \tag{13} $$

$$ \mu_N = \frac{g(a_i x_{k-i+1} + b_i) - g(a_i x_{k-i} + b_i)}{\Gamma_t} \tag{14} $$

$\Gamma_t$ represents the change in unit time.
3) Calculate the Feedback Layer Output

$$ H'(k) = \sum_{i=1}^{I} \left( \eta_i W_i \cdot g[x(k-i)] \right) \tag{15} $$

The hidden layer output matrix is:

$$ H = H(k) + H'(k) = \begin{bmatrix} g(a_1, b_1, x_1) + H'(1) & \cdots & g(a_L, b_L, x_1) + H'(1) \\ \vdots & \ddots & \vdots \\ g(a_1, b_1, x_N) + H'(N) & \cdots & g(a_L, b_L, x_N) + H'(N) \end{bmatrix}_{N \times L} \tag{16} $$

Substituting (16) into (10) and solving by the least squares method gives the output weight matrix β. The feedback layer ensures that the correlation between data is fully utilized in the prediction process of the FELM model, so the FELM model has a better prediction accuracy than the ELM model even in a small sample learning environment. The forgetting of past data in the feedback layer is based on the fact that the correlation between data gradually decreases as the distance between them increases.
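One possible reading of the feedback computation is sketched below in Python/NumPy. It is an interpretation under stated assumptions rather than the authors' implementation: the hidden activation is a sigmoid, the feedback weight of layer i is a scalar W raised to the power i, the trend change rate is the difference of consecutive hidden outputs divided by a time step `dt`, the products are taken elementwise, and samples without enough history receive no feedback. All names (`feedback_hidden_output`, `n_feedback`, `dt`) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feedback_hidden_output(X, A, b, W=0.5, n_feedback=2, dt=1.0):
    """Hidden-layer output plus a decaying feedback term.

    X: (N, d) inputs in time order; A: (d, L) random input weights;
    b: (L,) hidden biases. Returns an (N, L) matrix.
    """
    G = sigmoid(np.asarray(X, dtype=float) @ A + b)  # plain ELM hidden output
    H = G.copy()
    N = G.shape[0]
    for i in range(1, n_feedback + 1):
        if i >= N:
            break                                # not enough history for this layer
        past = G[:N - i]                         # hidden output i steps back
        trend = (G[1:N - i + 1] - past) / dt     # change rate just after that step
        H[i:] += trend * (W ** i) * past         # older layers are damped by W ** i
    return H

# Illustrative use with random data; beta would then follow from pinv(H) @ T.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 3))
A_demo = rng.uniform(-1.0, 1.0, size=(3, 4))
b_demo = rng.uniform(-1.0, 1.0, size=4)
H_demo = feedback_hidden_output(X_demo, A_demo, b_demo, W=0.5, n_feedback=1)
```

Because W lies in [0, 1], the factor `W ** i` shrinks with i, which mirrors the statement that the correlation between data decreases with distance.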
Combining the univariate prediction principle with the FELM network structure proposed in this section, the specific process of the short-term prediction method based on FELM is as follows.
Step 1: Set the window width to i. The window is shifted over the training data to form a matrix $A_{ki}$. The true values to be predicted for each row of $A_{ki}$ form a vector $B_{kj}$.
Step 2: The matrix A is taken as the input sample set x, and the vector B is taken as the desired output T. According to (16), the output matrix H of the hidden layer can be obtained, and then the output weight matrix β can be obtained from the formula $\beta = H^{+}T$.
Step 3: The latest value in the sample set is passed through the FELM model to obtain the short-term predicted value $\mathrm{prediction}_{k1}$.
As newly acquired signal values are continuously added to the sample set, Steps 1, 2, and 3 are repeated to achieve short-term signal prediction. When the gradient value of the predicted values exceeds the set degradation threshold, the RUL prediction of rolling-element bearings begins (Fig. 5).

4. RUL prediction model based on multivariable prediction principle

4.1. Multivariable prediction principle

When univariate prediction is used as a short-term prediction method, the prediction results are easily disturbed by the randomness of the historical sequence data, and the bearing RUL cannot be directly estimated from the measured values of current vibration [40]. A multivariate RUL prediction method is therefore proposed.
The multivariate prediction process can be simply described as constructing a multivariate prediction model to find the relationship between the prediction variable $z$ and the main influencing factors $x_1, x_2, \ldots, x_M$ (characteristic variables).

$$ z(i) = F\big(x_1(i), x_2(i), \ldots, x_M(i)\big) \tag{17} $$


The training sample pair and the predicted sample input are constructed according to (18) and (19).

$$ X_{train} = \begin{pmatrix} x_1(1) & x_2(1) & \cdots & x_M(1) \\ x_1(2) & x_2(2) & \cdots & x_M(2) \\ \vdots & \vdots & & \vdots \\ x_1(n) & x_2(n) & \cdots & x_M(n) \end{pmatrix}, \quad Y_{train} = \begin{pmatrix} z(1) \\ z(2) \\ \vdots \\ z(n) \end{pmatrix} \tag{18} $$

$$ X_{test} = \begin{pmatrix} x_1(n+1) & x_2(n+1) & \cdots & x_M(n+1) \\ x_1(n+2) & x_2(n+2) & \cdots & x_M(n+2) \\ \vdots & \vdots & & \vdots \\ x_1(N) & x_2(N) & \cdots & x_M(N) \end{pmatrix} \tag{19} $$
$X_{train}$ and $Y_{train}$ form the training sample pair, and $X_{test}$ is the input of the test sample. $X_{train}$ is composed of the characteristic variable values corresponding to the first n time-series points, $Y_{train}$ is composed of the values of the prediction variable corresponding to the first n time-series points, and $X_{test}$ is composed of the characteristic variable values corresponding to the remaining (N − n) time-series points.
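In array terms, the construction in (18) and (19) is a row-wise split of a feature matrix, as the following Python/NumPy sketch shows. The function name and the layout (one time point per row, one characteristic variable per column) are assumptions for illustration.

```python
import numpy as np

def build_multivariate_samples(features, z, n):
    """Split M characteristic variables and the prediction variable z at point n.

    features: (N, M) array, row i holds x_1(i), ..., x_M(i);
    z: (N,) array of the prediction variable.
    Returns X_train (first n rows), Y_train (first n targets), X_test (remaining rows).
    """
    features = np.asarray(features, dtype=float)
    z = np.asarray(z, dtype=float)
    if features.shape[0] != z.shape[0]:
        raise ValueError("features and z must have the same number of rows")
    if not (0 < n <= len(z)):
        raise ValueError("need 0 < n <= len(z)")
    return features[:n], z[:n], features[n:]

# Example: N = 6 time points, M = 2 characteristic variables.
X_train, Y_train, X_test = build_multivariate_samples(
    np.arange(12).reshape(6, 2), np.arange(6), n=4
)
```

Unlike the univariate case, each row uses only values from the same time point, so no lagged window is needed.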

Fig. 5. Feedback extreme learning machine network structure.



4.2. MFELM model and RUL prediction

According to the principle of multivariate prediction, a good health assessment method should take advantage of mutual information from multiple features for system degradation modeling [41]. In order to obtain the relationship between the predictor variable $z$ and the main influencing factors $x_1, x_2, \ldots, x_M$ (multiple characteristic variables), an MFELM model is constructed in this article. Based on the FELM model, the MFELM model changes the input from a univariate time series to multivariable time series with high correlation.
By considering the interaction and constraints between multiple variables, the model makes the best use of all the valid information to expand the number of training samples and achieve residual life prediction under small sample conditions.
The relative time-domain characteristic RRMS is a sensitive indicator that reflects the change of bearing performance well and can be used to predict the RUL of rolling-element bearings. However, a single indicator contains too little information to reflect the current operating state of the bearing comprehensively, so one indicator alone is far from enough to accurately predict the RUL of rolling-element bearings. It is necessary to select several indicators with good trends as the input of the prediction sample set. In addition to RRMS, there are 13 relative time-domain characteristics of vibration signals (Table 1).
The relative characteristics are obtained by normalizing and LRT smoothing the original characteristics. These relative characteristics show different trends as the operating time of rolling-element bearings increases and reflect, to varying degrees, the changes of bearing operating conditions and the development trend of damage. The relative characteristics that are sensitive to bearing damage should be selected for RUL prediction, as this makes full use of the linear and interdependent relationships between the relative characteristics.
Based on RRMS, the sensitive indicators are selected as follows:
Step 1: Calculate the correlation coefficients between the relative time-domain characteristics and RRMS.
Step 2: Preset a threshold. When the correlation coefficient is greater than the threshold, the relative characteristic is retained; otherwise, it is rejected.
Step 3: The sensitive indicators of the training bearing in the degradation stage are taken as input, and the corresponding RUL is taken as output; the input and output are fed to the MFELM model for training. After training is completed, the sensitive indicators of the test bearing at the current moment are fed to the MFELM model to obtain the RUL value of the test bearing at the current moment.
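Steps 1 and 2 can be sketched in Python/NumPy as follows. The text does not say which correlation coefficient is used, so the Pearson coefficient from `numpy.corrcoef` is an assumption, as are the function name, the dictionary layout, and the threshold value 0.9 in the example.

```python
import numpy as np

def select_sensitive_features(features, rrms, threshold):
    """Keep the features whose correlation with RRMS exceeds the threshold.

    features: dict mapping a feature name to a 1-D series of the same length as rrms.
    Returns a dict mapping each retained name to its (assumed Pearson) coefficient.
    """
    rrms = np.asarray(rrms, dtype=float)
    selected = {}
    for name, values in features.items():
        r = np.corrcoef(np.asarray(values, dtype=float), rrms)[0, 1]
        if r > threshold:
            selected[name] = float(r)
    return selected

# Synthetic example: 'rpk' follows RRMS linearly, 'rme' just alternates in sign.
rrms_demo = np.arange(10, dtype=float)
features_demo = {
    "rpk": 2.0 * rrms_demo + 1.0,
    "rme": np.array([1.0, -1.0] * 5),
}
kept = select_sensitive_features(features_demo, rrms_demo, threshold=0.9)  # hypothetical threshold
```

The retained columns would then form the input matrix of the MFELM model, with the RUL as the training target.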

5. Result and discussion

An experimental system named PRONOSTIA [42] is shown in Fig. 6. This system is designed to test methods for bearing fault detection, diagnosis, and RUL prediction. The main objective of PRONOSTIA is to provide real experimental data that characterize the degradation of rolling-element bearings along their whole operational life (until their total failure). This experimental platform makes it possible to degrade a bearing in only a few hours, so a significant number of experiments can be carried out within a week [43]. Many researchers [17,18,44–46] have used the PRONOSTIA dataset as a benchmark to test their RUL estimation algorithms.
In order to conduct accelerated degradation tests of bearings in a few hours, a radial force close to the bearing’s maximum dynamic load (4–5 kN) is applied to the tested bearings. It is worth noting that all bearings used in the experiment are of the same type.
The force is generated by cylinder pressure, which is delivered through a pressure regulator. During the tests, the rotating speed of the bearing is kept between 1500 and 1800 r/min. Accelerometers are fixed on the outer race of the bearing to capture vibration signals. The sampling frequency is 25.6 kHz. Each sample contains 2560 data points, that is, 0.1 s of signal, and the sampling is repeated every 10 s. As given in Table 2, the PRONOSTIA dataset provides accelerated degradation test data for a total of 17 bearings: 7 bearings each for Conditions (1) and (2), and 3 bearings for Condition (3).

5.1. Short-term prediction experiment

The prediction method in this paper consists of two parts. The first part describes short-term prediction of the degradation state of the bearing, so as to realize the early detection of the degradation threshold. First, according to the content in the second section, the RMS values of the target bearing signal are obtained and used as the performance degradation indicator of the target bearing. At the same time, in order to ensure the generalization performance and stability of the indicator, the RMS values are standardized and smoothed.

Table 1
Thirteen relative time-domain characteristics of vibration signals.

Symbol  Meaning                                   Symbol  Meaning
rma     Relative maximum value                    rku     Relative kurtosis
rmi     Relative minimum value                    rsk     Relative skew
rme     Relative mean value                       rrm     Relative root mean square
rpk     Relative peak to peak value               rS      Relative waveform factor
rav     Relative average value of absolute value  rC      Relative peak factor
rva     Relative variance                         rI      Relative pulse factor
rst     Relative standard deviation               rL      Relative margin factor

Fig. 6. The PRONOSTIA platform for bearing accelerated degradation tests.

Table 2
Detail of the bearing operation conditions and the accelerated degradation test data.

Operating conditions  Condition (1)  Condition (2)  Condition (3)
Radial load (N)       4000           4200           5000
Speed (r/min)         1800           1650           1500
Training sets         Bearing 1–1    Bearing 2–1    Bearing 3–1
                      Bearing 1–2    Bearing 2–2    Bearing 3–2
Testing sets          Bearing 1–3    Bearing 2–3    Bearing 3–3
                      Bearing 1–4    Bearing 2–4
                      Bearing 1–5    Bearing 2–5
                      Bearing 1–6    Bearing 2–6
                      Bearing 1–7    Bearing 2–7

Before the prediction, the FELM model was trained with the RRMS values of the bearing in the smooth operation state as training samples. The training sample sizes were 1000 (Bearing 1–1, Bearing 1–3), 500 (Bearing 3–2) and 200 (Bearing 2–1, Bearing 2–2, Bearing 3–1). Taking Bearing 1–1 as an example, the first 1000 RRMS values are used as training samples, and the FELM is used to predict the degradation trend over the short term; the short-term prediction length is 20. To verify the advantages of FELM over traditional neural network prediction methods, the short-term prediction results of the FELM algorithm are compared with those of the ELM, back propagation (BP) and generalized regression neural network (GRNN) algorithms in Fig. 7.
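
To make the univariate short-term prediction concrete, the following sketch trains a plain ELM (not the authors' FELM, whose feedback layer is not reproduced here) on sliding windows of a series and then iterates one-step predictions 20 times. The window width of 5, the 20 hidden nodes, the tanh activation, the class and function names, and the synthetic series standing in for RRMS values are all assumptions made for illustration.

```python
import numpy as np

def make_windows(series, width):
    """Build (window, next value) training pairs from a 1-D series."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[i:i + width] for i in range(len(series) - width)])
    y = series[width:]
    return X, y

class BasicELM:
    """Plain single-hidden-layer ELM: random hidden weights, least-squares output weights."""

    def __init__(self, n_inputs, n_hidden=20, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((n_inputs, n_hidden))  # fixed random input weights
        self.b = rng.standard_normal(n_hidden)              # fixed random biases
        self.beta = None                                     # output weights, set by fit()

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        H = self._hidden(X)
        # Output weights are the least-squares solution of H @ beta = y.
        self.beta, *_ = np.linalg.lstsq(H, y, rcond=None)
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta

def forecast(model, history, width, steps=20):
    """Iterate one-step predictions, feeding each prediction back into the window."""
    window = list(np.asarray(history, dtype=float)[-width:])
    out = []
    for _ in range(steps):
        nxt = float(model.predict(np.array([window]))[0])
        out.append(nxt)
        window = window[1:] + [nxt]
    return np.array(out)

# Synthetic stand-in for 1000 RRMS values (not PRONOSTIA data).
t = np.arange(1000)
rrms = 1.0 + 0.001 * t + 0.05 * np.sin(0.05 * t)

WIDTH = 5  # hypothetical window width
X_train, y_train = make_windows(rrms, WIDTH)
elm = BasicELM(n_inputs=WIDTH).fit(X_train, y_train)
short_term = forecast(elm, rrms, WIDTH, steps=20)
```

Because only the output weights are fitted, and by a single least-squares solve, training needs no iterative back propagation, which is the usual explanation for the short operation times of ELM-type models.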
Fig. 7 shows that, when the training samples are limited, the short-term prediction accuracy of the FELM algorithm is higher than that of the ELM, BP and GRNN algorithms, and as the number of training samples decreases, the advantage of the FELM algorithm over the BP and GRNN algorithms becomes more pronounced. The computational cost and accuracy of the four prediction algorithms are given in Table 3 for a quantitative comparison. The prediction process is the same for all four methods; that is, the program framework and the iterative steps of the four algorithms are kept consistent. The computational cost of each method is measured by its operation time ("time" in Table 3 is the time required for a single prediction). The prediction accuracy of each algorithm is measured by the average normalized mean squared error (ANMSE),

\[ \mathrm{ANMSE} = \frac{1}{n} \sum_{i=1}^{n} \frac{\| x_i - \hat{x}_i \|_2^2}{\| \hat{x}_i \|_2^2} \tag{20} \]
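
A minimal sketch of Eq. (20) follows. It assumes that each x_i is the vector of actual values in the i-th prediction window and that the hatted vector is the corresponding prediction; the text does not define these symbols explicitly, so this reading, the function name `anmse` and the toy numbers are assumptions.

```python
import numpy as np

def anmse(actual, predicted):
    """Average normalized mean squared error over n prediction windows.

    Both arguments have shape (n, window_length). Following Eq. (20) as printed,
    the normalization uses the squared 2-norm of the predicted vector.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    num = np.sum((actual - predicted) ** 2, axis=1)  # squared error norm per window
    den = np.sum(predicted ** 2, axis=1)             # squared norm of each prediction
    return float(np.mean(num / den))

# Toy example: the first window misses one value by 1, the second is exact.
score = anmse([[1.0, 2.0], [3.0, 4.0]], [[1.0, 1.0], [3.0, 4.0]])
```

For the toy input the per-window ratios are 1/2 and 0, so the average is 0.25.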

Fig. 7. Short-term degradation trend prediction results of Bearing1-1 signal by using FELM, ELM, BP and GRNN algorithms.

Table 3
The operation time and ANMSE value of the FELM method compared with those of the ELM, BP and GRNN methods.

Case         FELM method       ELM method        BP method         GRNN method
             ANMSE   Time (s)  ANMSE   Time (s)  ANMSE   Time (s)  ANMSE   Time (s)
Bearing 1–1  0.1221  0.4862    0.1365  0.4568    0.1869  3.7908    0.1346  5.0682
Bearing 1–3  0.1728  0.3927    0.1951  0.3820    0.1750  3.2321    0.1767  5.6192
Bearing 2–1  0.0605  0.1284    0.0922  0.1217    0.1092  0.7436    0.1169  3.8400
Bearing 2–2  0.0414  0.1241    0.0514  0.1106    0.0800  1.2061    0.0891  3.1727
Bearing 3–1  0.0910  0.0980    0.1166  0.0771    0.1442  0.4704    0.1862  3.6369
Bearing 3–2  0.0423  0.2539    0.0926  0.2300    0.2154  1.2782    0.1623  3.8194

Table 3 shows that the FELM algorithm has higher prediction accuracy than the ELM algorithm owing to the added feedback layer, at the cost of a slightly longer operation time. The FELM method is also much faster than the traditional neural networks (BP and GRNN) and more accurate in every case; in particular, when the number of training samples is small, it can still predict the short-term degradation trend of the bearing with high precision.

5.2. RUL prediction experiment

After the degradation threshold of the bearing is obtained, the second part of the prediction method is used to predict the RUL of the bearing. Before predicting the RUL, it is necessary to find the indicators that are sensitive to changes in bearing performance during the degradation stage. Taking Bearing 1–2 as an example, its 13 relative time-domain characteristics are extracted from the degradation stage to the failure stage, and the characteristics whose trend is similar to that of RRMS are identified by correlation analysis. Fig. 8 shows the correlation coefficients between the relative time-domain characteristics. With a threshold of 0.95, six relative characteristics including RRMS are selected as sensitive indicators: RRMS, relative maximum value, relative minimum value, relative peak to peak value, relative average value of absolute value, and relative standard deviation.
According to the description in Section 4, the training sample pairs are composed of the sensitive indicators of Bearing 1–2 during the degradation stage and the corresponding RUL. The sensitive indicators are taken as input and the RUL as output, and these pairs are used to train the MFELM model. After training is completed, the sensitive indicators of the test bearing at the current moment are fed to the MFELM model to obtain the RUL of the test bearing at that moment. The test bearings are the other failed bearings. For each of them, 20 sets of sensitive indicators, taken at intervals of 5% of the time from the beginning to the end of the degradation stage, constitute the input vectors of the test samples used to predict the RUL. The results are shown in Fig. 9 (the abscissa is the current time, and the ordinate is the RUL) and Table 4.
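
The 5% sampling scheme and the relative error reported in Table 4 can be sketched as follows. The "Cal." columns of Table 4 appear consistent with a calculated RUL that decreases linearly by 5% of its initial value at each test point (for example 5450, 5178, ..., 275 s for Bearing 1–1); this reading, the function names and the percentage form of the relative error are inferred from the table rather than stated by the authors.

```python
def calculated_rul_grid(initial_rul, n_points=20):
    """Calculated RUL at n_points test instants spaced 1/n_points of the degradation stage apart."""
    step = initial_rul / n_points
    return [initial_rul - i * step for i in range(n_points)]

def relative_error(calculated, predicted):
    """Relative error of a predicted RUL with respect to the calculated RUL, in percent."""
    return abs(calculated - predicted) / calculated * 100.0

# Bearing 1-1: the degradation stage starts with a calculated RUL of 5450 s (Table 4).
grid = calculated_rul_grid(5450.0)
# First row of Table 4 for Bearing 1-1: calculated 5450 s, predicted 3652.2 s.
first_error = relative_error(5450.0, 3652.2)
```

The second grid value, 5177.5 s, matches the tabulated 5178 s after rounding, and the first relative error rounds to the 33.0% listed in Table 4.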

       rma    rmi    rme    rpk    rav    rva    rst    rku    rsk    rrm    rS     rC     rI     rL
rma    1.00   0.96  -0.19   0.98   0.98   0.94   0.98   0.84  -0.38   0.98   0.89   0.83   0.86   0.87
rmi    0.96   1.00  -0.16   0.99   0.98   0.91   0.99   0.92  -0.42   0.99   0.93   0.86   0.90   0.91
rme   -0.19  -0.16   1.00  -0.17  -0.15  -0.13  -0.15  -0.12   0.13  -0.15  -0.17  -0.23  -0.21  -0.21
rpk    0.98   0.99  -0.17   1.00   0.99   0.92   0.99   0.89  -0.41   0.99   0.92   0.86   0.89   0.90
rav    0.98   0.98  -0.15   0.99   1.00   0.91   1.00   0.96  -0.41   1.00   0.89   0.81   0.85   0.86
rva    0.94   0.91  -0.13   0.92   0.91   1.00   0.92   0.75  -0.34   0.92   0.79   0.71   0.75   0.76
rst    0.98   0.99  -0.15   0.99   1.00   0.92   1.00   0.87  -0.41   1.00   0.91   0.82   0.86   0.87
rku    0.84   0.92  -0.12   0.89   0.96   0.75   0.87   1.00  -0.37   0.87   0.96   0.94   0.96   0.97
rsk   -0.38  -0.42   0.13  -0.41  -0.41  -0.34  -0.41  -0.37   1.00  -0.41  -0.39  -0.39  -0.40  -0.40
rrm    0.98   0.99  -0.15   0.99   1.00   0.92   1.00   0.87  -0.41   1.00   0.91   0.82   0.86   0.87
rS     0.89   0.93  -0.17   0.92   0.89   0.79   0.91   0.96  -0.39   0.91   1.00   0.91   0.95   0.96
rC     0.83   0.86  -0.23   0.86   0.81   0.71   0.82   0.94  -0.39   0.82   0.91   1.00   0.99   0.99
rI     0.86   0.90  -0.21   0.89   0.85   0.75   0.86   0.96  -0.40   0.86   0.95   0.99   1.00   1.00
rL     0.87   0.91  -0.21   0.90   0.86   0.76   0.87   0.97  -0.40   0.87   0.96   0.99   1.00   1.00

Fig. 8. Correlation analysis between the time-domain relative characteristics of Bearing 1–2 signal.

[Figure 9: eight panels (Bearing 1–1, Bearing 1–3, Bearing 1–4, Bearing 1–5, Bearing 2–2, Bearing 2–3, Bearing 3–2 and Bearing 3–3). Each panel plots RUL (sec) against Time (sec) and shows the predicted RUL, the calculated RUL and the 20% confidence interval.]

Fig. 9. The prognostic performance of the proposed method in terms of the α-λ metric for Bearing 1–1, Bearing 1–3, Bearing 1–4, Bearing 1–5, Bearing 2–2, Bearing 2–3, Bearing 3–2, and Bearing 3–3.

Table 4
Detailed prediction results and relative errors of bearing RUL.

Bearing 1–1 Bearing 1–3 Bearing 1–4 Bearing 1–5


Cal. Pre. R.E. Cal. Pre. R.E. Cal. Pre. R.E. Cal. Pre. R.E.
5450 3652.2 33.0% 4690 4313.8 8.0% 190.0 268.5 41.3% 460 370.3 19.5%
5178 5073.1 2.0% 4456 3976.9 10.8% 180.5 199.8 10.7% 437 438.2 0.3%
4905 4929.1 0.5% 4221 4304.5 2.0% 171.0 171.4 0.2% 414 421.5 1.8%
4633 4570.6 1.3% 3987 4429.0 11.1% 161.5 186.1 15.2% 391 387.7 0.8%
4361 4166.8 4.4% 3752 3719.4 0.9% 152.0 145.2 4.5% 368 359.6 2.3%
4088 3302.1 19.2% 3518 3103.9 11.8% 142.5 142.7 0.1% 345 334.6 3.0%
3816 3742.5 1.9% 3283 3644.2 11.0% 133.0 134.1 0.8% 322 353.1 9.7%
3543 4055.2 14.4% 3049 3206.1 5.2% 123.5 125.0 1.2% 299 202.6 32.2%
3271 3510.3 7.3% 2814 2892.4 2.8% 114.0 114.5 0.4% 276 283.3 2.6%
2999 3008.5 0.3% 2580 2643.6 2.5% 104.5 103.5 1.0% 253 213.7 15.5%
2726 2753.3 1.0% 2345 2655.1 13.2% 95.0 95.7 0.7% 230 231.1 0.5%
2454 2529.2 3.1% 2111 2177.2 3.2% 85.5 95.3 11.5% 207 211.2 2.0%
2181 2179.6 0.1% 1876 2078.2 10.8% 76.0 80.4 5.8% 184 183.9 0.1%
1909 2008.5 5.2% 1642 1588.8 3.2% 66.5 64.1 3.6% 161 173.2 7.6%
1637 1707.4 4.3% 1407 1420.5 0.9% 57.0 55.3 3.0% 138 137.7 0.2%
1364 1401.4 2.7% 1173 1111.7 5.2% 47.5 44.7 5.9% 115 112.6 2.1%
1092 1203.6 10.3% 938 949.5 1.2% 38.0 34.9 8.2% 92 90.8 1.3%
819 830.6 1.4% 704 680.2 3.3% 28.5 30.1 5.6% 69 70.7 2.5%
547 505.9 7.5% 469 498.8 6.3% 19.0 17.9 5.8% 46 45.0 2.2%
275 245.7 10.5% 235 237.4 1.2% 9.5 5.2 45.3% 23 24.0 4.3%
Bearing 2–2 Bearing 2–3 Bearing 3–2 Bearing 3–3
Cal. Pre. R.E. Cal. Pre. R.E. Cal. Pre. R.E. Cal. Pre. R.E.
2230 736.4 67.0% 960 788.1 17.9% 410.0 418.6 2.1% 750.0 662.0 11.7%
2119 1891.3 10.7% 912 959.4 5.2% 389.5 403.2 3.5% 712.5 691.6 2.9%
2007 1894.9 5.6% 864 873.1 1.1% 369.0 365.9 0.8% 675.0 684.3 1.4%
1896 1770.6 6.6% 816 826.8 1.3% 348.5 355.1 1.9% 637.5 630.8 1.1%
1784 1683.9 5.6% 768 804.9 4.8% 328.0 292.4 10.9% 600.0 578.1 3.7%
1673 1598.7 4.4% 720 763.4 6.0% 307.5 339.3 10.3% 562.5 559.5 0.5%
1561 1632.2 4.5% 672 698.8 4.0% 287.0 299.1 4.2% 525.0 521.9 0.6%
1450 1631.5 12.5% 624 642.5 3.0% 266.5 268.6 0.8% 487.5 498.5 2.3%
1338 1637.2 22.3% 576 602.1 4.5% 246.0 233.6 5.0% 450.0 448.7 0.3%
1227 1328.7 8.3% 528 564.7 7.0% 225.5 223.8 0.8% 412.5 418.3 1.4%
1115 1341.5 20.3% 480 515.4 7.4% 205.0 167.0 18.5% 375.0 366.2 2.3%
1004 1200.1 19.6% 432 465.3 7.7% 184.5 180.8 2.0% 337.5 341.4 1.2%
892 1061 18.9% 384 430.3 12.1% 164.0 170.4 3.9% 300.0 294.1 2.0%
781 1219.6 56.2% 336 408.3 21.5% 143.5 180.1 25.5% 262.5 261.1 0.5%
669 886.8 32.5% 288 229.2 20.4% 123.0 150.7 22.5% 225.0 237.2 5.4%
558 628.4 12.7% 240 301.1 25.5% 102.5 109.7 7.0% 187.5 199.8 6.6%
446 352.6 21.0% 192 251.4 30.9% 82.0 99.7 21.6% 150.0 160.2 6.8%
335 260.1 22.3% 144 165.4 14.9% 61.5 67.1 9.1% 112.5 115.5 2.7%
223 286.6 28.4% 96 138.8 44.6% 41.0 39.9 2.6% 75.0 103.2 37.6%
112 154.4 38.2% 48 92.17 92.0% 20.5 69.2 237% 37.5 69.6 85.6%

* Cal.: Calculated RUL. Pre.: Predicted RUL. R.E.: Relative Error.

In this article, the α-λ performance [47] is introduced to evaluate the predictive performance of the proposed scheme. The parameter α sets the upper and lower error bounds on the RUL estimate. The parameter λ represents the relative time distance from the end of bearing life to the given point. In this study, the performance of the proposed method is evaluated with α = 20%. Fig. 9 shows that the predicted RUL is within the acceptable error range most of the time, that is, $(1-\alpha)\,r(t_i) \le \hat{r}(t_i) \le (1+\alpha)\,r(t_i)$. Here, $r(t_i)$ denotes the actual RUL and $\hat{r}(t_i)$ denotes the predicted RUL at time $t_i$.
It is noteworthy that the numbers of data points predicted in this article need to be converted to the RUL values of the bearings by (21).

\[ \mathrm{RUL} = k \times \Delta T \tag{21} \]

where k is the number of data points predicted before the failure threshold is reached and ΔT is the interval between data points, which is 10 s for the given data set.
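
The conversion in (21) and the α-bound check can be illustrated with a short sketch. The function names are hypothetical; the five (calculated, predicted) pairs are the first five rows of Table 4 for Bearing 1–1.

```python
def rul_from_points(k, delta_t=10.0):
    """Eq. (21): RUL in seconds from k predicted data points spaced delta_t seconds apart."""
    return k * delta_t

def within_alpha_bounds(actual_rul, predicted_rul, alpha=0.20):
    """True if the prediction lies in [(1 - alpha) * actual, (1 + alpha) * actual]."""
    return (1.0 - alpha) * actual_rul <= predicted_rul <= (1.0 + alpha) * actual_rul

# First five (calculated, predicted) RUL pairs of Bearing 1-1 from Table 4, in seconds.
pairs = [(5450, 3652.2), (5178, 5073.1), (4905, 4929.1), (4633, 4570.6), (4361, 4166.8)]
hits = [within_alpha_bounds(actual, predicted) for actual, predicted in pairs]

# 545 predicted data points at a 10 s interval correspond to an RUL of 5450 s.
rul_seconds = rul_from_points(545)
```

Only the first pair falls outside the 20% band (3652.2 s is below the lower bound of 4360 s), which agrees with its tabulated relative error of 33.0%.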
As can be seen from Fig. 9 and Table 4, the MFELM model trained with Bearing 1–2 as the training sample is used to predict the RUL of the other bearings. Whether the test bearing runs under the same load and speed as the training bearing or under different ones, the prediction error stays within the α = 20% range for most of the time (the experimental results are averages over extensive experiments). Therefore, it can be concluded that, once the MFELM model has been trained on the existing samples, it can also predict the RUL of bearings of the same type operating under different loads and speeds.

Table 5
CRA scores and convergence rates by the proposed method and different RUL estimation methods.

Metric             Case         Paris    Exponential  Improved     Dynamic     Extended   MMALSTM   Proposed
                                model    model        exponential  regression  Kalman     network   method
                                                      model        model       filtering
CRA scores         Bearing 1–1  0.6967   0.7111       0.8696       0.9452      0.8821     0.9557    0.9659
                   Bearing 1–3  0.6074   0.5311       0.7623       0.9201      0.9173     0.9304    0.9521
                   Bearing 1–4  0.6317   0.542        0.8712       0.9261      0.8903     0.9283    0.9509
                   Bearing 1–5  0.7443   0.7463       0.9324       0.8063      0.6847     0.9336    0.9681
Convergence rates  Bearing 1–1  9234.7   9382.3       8993.7       3748.2      29853.6    12865.7   2627.9
                   Bearing 1–3  4406.6   4320.7       4284.6       4764.4      38022.2    10936.9   2247.2
                   Bearing 1–4  307.3    329.9        275.4        162.6       1290.6     476.4     92.3
                   Bearing 1–5  315.3    296.7        290.2        211.9       1599.4     893.5     165.3

To further verify the effectiveness and superiority of the proposed method, it is compared with existing methods for RUL estimation of bearings, including an improved exponential model [11], a dynamic regression model [27,28], an extended Kalman filter [17] and an MMALSTM network [26]. (These methods use the same data set as this article, the PRONOSTIA data set.) A prediction algorithm is expected to converge to the real value quickly, so that the prediction has high confidence over a larger range. The cumulative relative accuracy (CRA) and the convergence rate [48] are therefore used to evaluate the performance of these methods.

\[ \mathrm{CRA}_\lambda = \frac{1}{|p_\lambda|} \sum_{i \in p_\lambda} w(r(i))\, \mathrm{RA}_\lambda \tag{23} \]

\[ C_M = \sqrt{(x_c - t_p)^2 + y_c^2} \tag{24} \]

Here, $w(r(i))$ is a weight factor as a function of the RUL at all time indices before $t_\lambda$; $p_\lambda$ is the set of time indices at which a prediction is made, and $|p_\lambda|$ is the cardinality of this set. RA is the relative accuracy, which measures the error of the estimated RUL $\hat{r}(i_\lambda)$ relative to the actual RUL $r(i_\lambda)$ at a specific time index $i_\lambda$ [48]. $M(i)$ is a nonnegative prediction accuracy or precision metric with a time-varying value, and $(x_c, y_c)$ is the center of mass of the area under the curve $M(i)$ between $t_p$ and $t_{EoUP}$ (time for End of Useful Prediction).
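
The two metrics can be sketched as follows. The relative accuracy is taken here as 1 − |r − r̂| / r, the form commonly associated with [48]; the unit weights, the treatment of M(i) as piecewise constant between consecutive prediction times, and all function names are assumptions of this sketch, not details given in the text.

```python
import math

def relative_accuracy(actual_rul, predicted_rul):
    """RA = 1 - |actual - predicted| / actual (assumed form)."""
    return 1.0 - abs(actual_rul - predicted_rul) / actual_rul

def cumulative_relative_accuracy(actual, predicted, weights=None):
    """Eq. (23): weighted mean of the relative accuracy over all prediction instants."""
    if weights is None:
        weights = [1.0] * len(actual)  # unit weights are an assumption
    terms = [w * relative_accuracy(a, p) for w, a, p in zip(weights, actual, predicted)]
    return sum(terms) / len(terms)

def convergence_distance(times, metric):
    """Eq. (24): distance from (t_p, 0) to the centroid of the area under M(i).

    `times` has one more entry than `metric`; metric[j] is assumed constant on
    [times[j], times[j + 1]], times[0] plays the role of t_p and times[-1] of t_EoUP.
    """
    spans = list(zip(times, times[1:], metric))
    area = sum((t1 - t0) * m for t0, t1, m in spans)
    x_c = 0.5 * sum((t1 ** 2 - t0 ** 2) * m for t0, t1, m in spans) / area
    y_c = 0.5 * sum((t1 - t0) * m ** 2 for t0, t1, m in spans) / area
    return math.hypot(x_c - times[0], y_c)

# Toy checks with made-up numbers (not values from Table 5).
cra = cumulative_relative_accuracy([100.0, 50.0], [90.0, 55.0])
c_m = convergence_distance([0.0, 2.0, 4.0], [2.0, 2.0])
```

For the toy inputs both relative accuracies equal 0.9, and the area under the constant metric is a 4 × 2 rectangle whose centroid (2, 1) lies at a distance of √5 from the origin. A smaller distance means that the error mass is concentrated earlier and lower, that is, faster convergence.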
Table 5 shows the CRA scores and convergence rates obtained by the proposed method and the other RUL estimation methods (both were measured for the RUL estimates of four bearings, i.e., Bearing 1–1, Bearing 1–3, Bearing 1–4 and Bearing 1–5). For all four bearings, the proposed method obtains the highest CRA score and the smallest convergence value, that is, a better CRA score and a faster convergence rate than the other techniques.

6. Conclusion

In this article, RRMS is used as the indicator of the operation state of rolling-element bearings, and the period before bearing failure is divided into a normal operation stage and a degradation stage. During normal operation, the short-term prediction of the degradation trend is based on the univariate prediction principle and the FELM model. The feedback extreme learning machine is a single-hidden-layer feedforward neural network model whose hidden-layer output weights can be found by simple linear regression. Compared with traditional neural network methods, FELM is therefore faster. At the same time, the experimental results show that the prediction accuracy of the FELM model is better than that of traditional neural networks in a small-sample setting. In the degradation stage, when the RUL is predicted by the MFELM model, the correlations between the relative time-domain characteristics are fully considered, and the information contained in the small-sample data is exploited as fully as possible. The α-λ performance shows that the MFELM model can obtain accurate prediction results from the effective information available under small-sample conditions. According to the CRA scores and convergence rates, the proposed method achieves fairly good results compared with the existing methods.

CRediT authorship contribution statement

Zuozhou Pan: Data curation, Writing - original draft, Writing - review & editing. Zong Meng: Conceptualization, Methodology, Investigation, Supervision, Validation. Zijun Chen: Visualization, Writing - review & editing. Wenqing Gao: Software, Visualization. Ying Shi: Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have
appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant 51575472, in part by the Natural Science Foundation of Hebei Province of China under Grant E2019203448, and in part by the Hebei Province Graduate Innovation Funding Project under Grant CXZZBS2020047.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.ymssp.2020.106899.

References

[1] L. Cui, X. Gong, J. Zhang, H. Wang, Double-dictionary matching pursuit for fault extent evaluation of rolling bearing based on the Lempel-Ziv
complexity, J. Sound Vib. 385 (2016) 372–388.
[2] X. Li, K. Yu, H. Ma, L. Cao, Z. Luo, H. Li, L. Che, Analysis of varying contact angles and load distributions in defective angular contact ball bearing, Eng.
Fail. Analy. 91 (2018) 449–464.
[3] L. Cui, J. Huang, F. Zhang, Quantitative and localization diagnosis of a defective ball bearing based on vertical-horizontal synchronization signal
analysis, IEEE Trans. Ind. Electron. 64 (11) (2017) 8695–8706.
[4] Y. Liu, B. He, F. Liu, S. Lu, Y. Zhao, Feature fusion using kernel joint approximate diagonalization of eigen-matrices for rolling bearing fault identification,
J. Sound Vib. 385 (2016) 389–401.
[5] J. Ben-Ali, B. Chebel-Morello, L. Saidi, S. Malinowski, F. Fnaiech, Accurate bearing remaining useful life prediction based on Weibull distribution and
artificial neural network, Mech. Syst. Signal Process. 56–57 (2015) 150–172.
[6] F.D. Maio, K.L. Tsui, E. Zio, Combining Relevance Vector Machines and exponential regression for bearing residual life estimation, Mech. Syst. Signal
Process. 31 (2012) 405–427.
[7] C. Chen, G. Vachtsevenos, M.E. Orchard, Machine remaining useful life prediction: an integrated adaptive neuro-fuzzy and high-order particle filtering
approach, Mech. Syst. Signal Process. 28 (2018) 597–607.
[8] H. Liao, W. Zhao, H. Guo, Predicting remaining useful life of an individual unit using proportional hazards model and logistic regression model, in: Proc.
Annu. Rel. Maintain. Symp. (2006) 127-132.
[9] Z. Tian, H. Liao, Condition based maintenance optimization for multi-component systems using proportional hazards model, Rel. Eng. Syst. Safety 96
(5) (2011) 581–589.
[10] L. Liao, Discovering prognostic features using genetic programming in remaining useful life prediction, IEEE Trans. Ind. Electron. 61 (5) (2014) 2464–
2472.
[11] N. Li, Y. Lei, J. Lin, S.X. Ding, An improved exponential model for predicting remaining useful life of rolling element bearings, IEEE Trans. Ind. Electron.
62 (12) (2015) 7762–7773.
[12] E. Jantunen, J. Hooghoudt, Y. Yang, M. McKay, Predicting the remaining useful life of rolling element bearings, in: IEEE Int. Conf. Ind. Technol. (2018)
2035-2040.
[13] Y. Lei, N. Li, F. Jia, J. Lin, S. Xing, A nonlinear degradation model based method for remaining useful life prediction of rolling element bearings, Prog. Syst.
Health Manage. Conf. (2015) 1–8.
[14] B. Wang, Y. Lei, N. Li, N. Li, A hybrid prognostics approach for estimating remaining useful life of rolling element bearings, IEEE Trans. Rel. (2018) 1–12.
[15] N. Gebraeel, M. Lawley, R. Liu, V. Parmeshwaran, Residual life predictions from vibration-based degradation signals: a neural network approach, IEEE
Trans. Ind. Electron. 51 (3) (2004) 694–700.
[16] W. Mao, J. He, M. Zuo, Predicting remaining useful life of rolling bearings based on deep feature representation and transfer learning, IEEE Trans.
Indust. Measurement (2019) 1–17.
[17] R.K. Singleton, E.G. Strangas, S. Aviyente, Extended Kalman filtering for remaining useful life estimation of bearings, IEEE Trans. Ind. Electron. 62 (3)
(2015) 1781–1790.
[18] Y. Wang, Y. Peng, Y. Zi, X. Jin, K.L. Tsui, A two-stage data-driven-based prognostic approach for bearing degradation problem, IEEE Trans. Ind. Inf. 12 (3)
(2016) 924–932.
[19] C.K.R. Lim, D. Mba, Switching Kalman filter for failure prognostic, Mech. Syst. Sig. Process. 22–23 (2015) 426–435.
[20] A. Soualhi, H. Razik, G. Clerc, D.D. Doan, Prognosis of bearing failures using hidden Markov models and the adaptive neuro-fuzzy inference system, IEEE
Trans. Ind. Electron. 61 (6) (2014) 2864–2874.
[21] C. Chen, B. Zhang, G. Vachtsevanos, M. Orchard, Machine condition prediction based on adaptive neuro–fuzzy and high-order particle filtering, IEEE
Trans Ind. Electron. 58 (9) (2011) 4353–4364.
[22] J. Liu, W. Wang, F. Golnaraghi, A multi-step predictor with a variable input pattern for system state forecasting, Mech. Syst. Sig. Process. 96 (5) (2009)
1586–1599.
[23] J. Yan, C. Guo, X. Wang, A dynamic multi-scale Markov model based methodology for remaining life prediction, Mech. Syst. Sig. Process. 25 (4) (2011)
1364–1376.
[24] Y. Lei, N. Li, L. Guo, N. Li, T. Yan, J. Lin, Machinery health prognostics: a systematic review from data acquisition to RUL prediction, Mech. Syst. Sig.
Process. 104 (2018) 799–834.
[25] S. Xiang, Y. Qin, C. Zhu, Y. Wang, H. Chen, Long short-term memory neural network with weight amplification and its application into gear remaining
useful life prediction, Eng. Appl. Artif. Intell. 91 (2020) 103587.

[26] Y. Qin, S. Xiang, Y. Chai, H. Chen, Macroscopic-microscopic attention in LSTM networks based on fusion features for gear remaining life prediction, IEEE
Trans. Ind. Electron. 1–11 (2019).
[27] W. Ahmad, S.A. Khan, M.M.M. Islam, J.M. Kim, A reliable technique for remaining useful life estimation of rolling element bearings using dynamic
regression models, Rel. Eng. Syst. Safety 184 (2018) 67–76.
[28] W. Ahmad, S.A. Khan, J.M. Kim, A hybrid prognostics technique for rolling element bearings using adaptive predictive models, IEEE Trans. Ind. Electron.
65 (2) (2018) 1577–1584.
[29] J.H. Zhong, P.K. Wong, Z.X. Yang, Fault diagnosis of rotating machinery based on multiple probabilistic classifiers, Mech. Syst. Signal Process. 108
(2018) 99–114.
[30] P. Potocnik, E. Govekar, Semi-supervised vibration-based classification and condition monitoring of compressors, Mech. Syst. Signal Process. 93 (2017)
51–65.
[31] S.G. Soares, R. Araújo, An adaptive ensemble of on-line extreme learning machines with variable forgetting factor for dynamic system prediction,
Neurocomputing 171 (2016) 693–707.
[32] X. Wang, M. Han, Online sequential extreme learning machine with kernels for nonstationary time series prediction, Neurocomputing 145 (2014) 90–
97.
[33] Y. Xu, M. Zhang, L. Ye, Q. Zhu, Z. Geng, Y. He, Y. Han, A novel prediction intervals method integrating an error and self-feedback extreme learning
machine with particle swarm optimization for energy consumption robust prediction, Energy 164 (2018) 137–146.
[34] X. Li, F. Elasha, S. Shanbr, D. Mba, Remaining useful life prediction of rolling element bearings using supervised machine learning, Energies 12 (14)
(2019) 1–17.
[35] M.P. Blake, W.S. Mitchel, Vibration and acoustic measurement, in: Spartan Books, New York, 1972.
[36] L. Guo, N. Li, F. Jia, Y. Lei, J. Lin, A recurrent neural network based health indicator for remaining useful life prediction of bearings, Neurocomputing 240
(2017) 98–109.
[37] M. Elforjani, S. Shanbr, Prognosis of bearing acoustic emission signals using supervised machine learning, IEEE Trans. Ind. Electron. 65 (7) (2017) 5864–
5871.
[38] I. El-Thalji, E. Jantunen, Dynamic modelling of wear evolution in rolling bearings, Tribol. Int. 84 (2015) 90–99.
[39] F. Takens, Detecting strange attractors in turbulence, Dynam. Syst. Turb., Heidelberg (1981).
[40] Z. Shen, X. Chen, Z. He, C. Sun, X. Zhang, Z. Liu, Remaining life predictions of rolling bearing based on relative features and multivariable support vector
machine, J. Mech. Eng. 42 (2) (2013) 183–189.
[41] C. Duan, V. Makis, C. Deng, An integrated framework for health measures prediction and optimal maintenance policy for mechanical systems using a
proportional hazards model, Mech. Syst. Signal Process. 111 (2018) 285–302.
[42] N.Z. Gebraeel, Sensory-updated residual life distributions for components with exponential degradation patterns, IEEE Trans. Autom. Sci. Eng. 3 (4)
(2006) 382–393.
[43] P. Nectoux, PRONOSTIA: an experimental platform for bearings accelerated degradation tests, IEEE int. Conf. Prognostics Health Manage (2012) 1–8.
[44] K. Javed, R. Gouriveau, N. Zerhouni, P. Nectoux, Enabling health monitoring approach based on vibration data for accurate prognostics, IEEE Trans. Ind.
Electron. 62 (1) (2015) 647–656.
[45] K. Medjaher, N. Zerhouni, J. Baklouti, Data-driven prognostics based on health indicator construction: Application to PRONOSTIA’s data, in Proc. Eur.
Control Conf. (2013) 1451–1456.
[46] J. Wu, C. Wu, S. Cao, S. Wing, C. Deng, X. Shao, Degradation data-driven time-to-failure prognostics approach for rolling element bearings in electrical
machines, IEEE Trans. Ind. Electron. 66 (1) (2018) 529–539.
[47] A. Saxena, J. Celaya, E. Balaban, K. Goebel, B. Saha, S. Saha, M. Schwabacher, Metrics for evaluating performance of prognostic techniques, in Proc. Int. Conf. Prognostics Health Manage. (2008) 1–17.
[48] A. Saxena, J. Celaya, B. Saha, S. Saha, K. Goebel, Metrics for offline evaluation of prognostic performance, Int. J. Prognostics Health Manage. 1 (1) (2010) 4–12.
