All content following this page was uploaded by Maxim Shcherbakov on 12 May 2020.
Volgograd State Technical University, Lenin Avenue 28, 400005 Volgograd, Russia
svcuonghvktqs@gmail.com
http://www.vstu.ru
1 Introduction
In modern manufacturing, systems are becoming increasingly complex, machine
systems in particular. This complexity is a source of incidents and faults that
can cause considerable damage to equipment, the environment, and people. The
failure of a single part of the system can affect the entire operation. Under
such conditions, the classical maintenance approaches (corrective and preventive
maintenance) largely lose their effectiveness.
In corrective maintenance, interventions are performed only when the
critical component is fully worn out and fails. This minimizes the number of
unnecessary part replacements or repairs, since maintenance is carried out only
as needed. However, this approach can also lead to unexpected and lengthy
The reported study was supported by RFBR research projects 19-47-340010 r_a.
© Springer Nature Switzerland AG 2019
A. G. Kravets et al. (Eds.): CIT&DS 2019, CCIS 1083, pp. 344–358, 2019.
https://doi.org/10.1007/978-3-030-29743-5_28
Data-Driven Framework for Predictive Maintenance in Industry 4.0 Concept 345
advances and accuracy improvements for RUL prediction. Methods such as the
autoregressive (AR) model [8], deep neural networks [7,17], and support vector
machines [5,15] are used. Yet their conclusions hold only for a particular
situation and for specific data.
In machine learning, the “No Free Lunch” theorem states that no single learning
algorithm is best for all problems and every dataset. It is therefore difficult
to determine which learning algorithm, or which type of algorithm, should be
selected among the many competing candidates. Moreover, a model that is accurate
today may become inaccurate tomorrow, and a model’s predictive accuracy depends
on the relevance, sufficiency, and quality of the training and test data.
Proper preprocessing strategies and model selection are therefore the foundation
for constructing a robust and accurate model.
This paper presents a framework, implemented as an R package, for predictive
maintenance in the Industry 4.0 concept (the PdM framework). The PdM framework
helps engineers and domain experts analyze multiple multivariate time series of
sensor data and rapidly develop and test predictive maintenance models based on
RUL estimation, using machine learning and deep learning algorithms. These
models are intended for proactive decision support systems that optimize the
maintenance and service of the machine.
R is a free software environment for statistical computing and graphics
supported by the R Foundation for Statistical Computing [13]. It is widely used
for machine learning and offers a large number of ready-to-use algorithms, with
many packages that extend its functionality for data processing and predictive
modeling. One of them is caret [6], which provides functions that streamline the
model training process for complex regression and classification problems.
2 Proposed Framework
2.1 Task Statement and Methodology
We assume that sensor readings are available over the total operational life of
similar equipment (run-to-failure data). We denote the set of instances by $ID$.
For an instance $i \in ID$, we consider the multi-sensor time series
$X^{(i)} = \{X_1^{(i)}, X_2^{(i)}, \ldots, X_{T^{(i)}}^{(i)}\}$, where $T^{(i)}$
is the length of the time series, which corresponds to the total operational
life (from start to end of life), and $X_t^{(i)} \in \mathbb{R}^{n+k}$ is an
$(n+k)$-dimensional vector corresponding to the $n$ sensors related to the
equipment state and the $k$ sensors related to operational conditions at time
$t$: $X_t^{(i)} = \{s_1^{(i)}, s_2^{(i)}, \ldots, s_n^{(i)}, c_1^{(i)}, c_2^{(i)}, \ldots, c_k^{(i)}\}$.
Table 1 presents the data table schema for storing such data, which allows the
proposed framework to be applied. Figure 1 shows the architecture of the
proposed framework.
Column            Description
id                Unit identifier: a unique number that identifies an instance
timestamp         A timestamp (days, cycles, ...) at which the data was obtained
s1, s2, ..., sn   The n sensor columns related to the equipment state at each time step
c1, c2, ..., ck   The k sensor columns related to the operating conditions
Because we know when each instance in ID will fail, for an instance $i$ we can
compute the RUL at each time step $t$, defined as the instance’s total lifetime
minus its elapsed life at that time: $RUL_t^{(i)} = T^{(i)} - t$.
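As an illustration of this labelling step, the following base R sketch computes
the RUL column for a toy table that follows the schema of Table 1. The data
frame `run_data` and its values are invented for the example, and it is assumed
that the last recorded time step of each unit is its failure time. It is not the
PdM package's own implementation.

```r
# Toy run-to-failure table following the schema of Table 1
# (two units, one state sensor s1, one operating-condition column c1).
# All values are made up for illustration.
run_data <- data.frame(
  id        = c(1, 1, 1, 1, 2, 2, 2),
  timestamp = c(1, 2, 3, 4, 1, 2, 3),
  s1        = c(0.10, 0.12, 0.18, 0.30, 0.11, 0.20, 0.33),
  c1        = c(1, 1, 1, 1, 2, 2, 2)
)

# T^(i): the last observed time step of each unit, repeated on every row
# that belongs to that unit.
total_life <- ave(run_data$timestamp, run_data$id, FUN = max)

# RUL at time step t for unit i: total lifetime minus elapsed life.
run_data$RUL <- total_life - run_data$timestamp

print(run_data)
```

In this toy table the RUL of each unit counts down to 0 at its final recorded
time step.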
To produce RUL estimates, we must still determine a model $f(\cdot)$ of the
form below that captures the functional relationship between the sensor values
and the RUL at time $t$ for an instance $i$:
$$RUL_i(t) = f\big(s_{i,1}(t), s_{i,2}(t), \ldots, s_{i,n}(t), c_{i,1}(t), c_{i,2}(t), \ldots, c_{i,k}(t)\big)$$
Once we have the historical run-to-failure data with RUL labels at each time
step, we can build and train models using machine learning and deep learning
techniques. The next step is to find a predictive model that accurately
predicts the RUL from new sensor data arriving from currently monitored
operating instances that are similar to the historical instances in ID.
For such a currently monitored instance, the length $T^{(i)}$ corresponds to the
elapsed operational life up to the latest available sensor reading, because the
measurements are truncated some unknown amount of time before the instance
fails.
348 V. C. Sai et al.
Downloading and Installation. To install and use the PdM framework, run
the following code:
> install.packages("devtools")
> devtools::install_github("forvis/PdM")
> library(PdM)
Fig. 3. Sensor readings over time for the first 10 engines in the training set
operating) for the first 10 engines (10 lines in each subplot, one for each
engine) from the training set, plotted against time.
From the data summary and visualization we can draw the following conclusions:
– The variables are not on the same scale, and the sensor readings are noisy.
We should therefore transform the data to best expose its structure to
machine learning algorithms.
– The following variables are constant in the training set, meaning that the
operating condition was fixed and/or the sensor was broken or inactive: c3, s1,
s5, s10, s16, s18, s19.
– The sensor s6 is practically constant.
We can check for these variables and discard them from both the training and
testing datasets by using the function process_data():
# Check and remove variables with zero variance
> c(train_data1, test_data1) %<-% process_data(train_data1,
    test_data1, method='zv')
## The following variables with a zero variance are
## removed: c3, s1, s5, s10, s16, s18, s19
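To make the zero-variance check concrete, here is a minimal base R sketch of the
idea. The helper `zero_variance_cols()` and the toy data frames are hypothetical
and are not part of the PdM package. The sketch assumes that constant columns
are identified on the training set only and then dropped from both sets.

```r
# Hypothetical helper: names of the columns that hold a single distinct value.
zero_variance_cols <- function(df) {
  is_constant <- vapply(df, function(col) length(unique(col)) == 1, logical(1))
  names(df)[is_constant]
}

# Toy data: s1 and c3 never change in the training set.
train <- data.frame(s1 = c(5, 5, 5), s2 = c(1, 2, 3), c3 = c(100, 100, 100))
test  <- data.frame(s1 = c(5, 5),    s2 = c(4, 5),    c3 = c(100, 100))

# Decide on the training set only, then drop the same columns from both sets
# so that the two tables keep identical columns.
drop_cols <- zero_variance_cols(train)
train <- train[, !(names(train) %in% drop_cols), drop = FALSE]
test  <- test[,  !(names(test)  %in% drop_cols), drop = FALSE]

print(drop_cols)
print(names(train))
```

Deciding on the training set alone avoids using information from the test set
during preprocessing.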
The process_data() function also supports a number of other useful data
transformations, selected through its method argument, e.g., the Box-Cox
transform, the Yeo-Johnson transform, MinMax normalization, scaling (dividing
values by the standard deviation), centering (subtracting the mean from values),
and transforming the data to principal components.
Before building any predictive maintenance model for predicting RUL, we
should check that our data contains enough useful information to distinguish
between healthy and failing states of the engines. If it does not, it is
unlikely that any model built with the sensor data will be useful for our
purposes. The function visualize_data() also allows us to compare the
distribution of sensor values in “healthy” engines to a similar set of
measurements taken when the engines are close to failure:
> visualize_data(train_data1, id = 1:100, type = "hf",
    n_step = 20)
Figure 4 shows the distribution of the values of all sensor channels for each
engine in the training set, where healthy values (in green) are those taken from
the first 20 time steps of the engine’s lifetime and failing values are those
from the last 20 time steps (n_step = 20). The two distributions are clearly
different for some sensor channels.
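The healthy-versus-failing comparison can be sketched in base R as follows. The
helper `label_health()` and the toy data are hypothetical and only illustrate
how the first and last time steps of each unit might be selected; the plotting
done by the framework is not reproduced here. The sketch assumes that each unit
has at least twice as many rows as the window length, otherwise the two windows
overlap and the "failing" label takes precedence.

```r
# Hypothetical helper: keep the first and last n_step rows of each unit and
# label them "healthy" and "failing" respectively.
label_health <- function(df, n_step) {
  # Position of each row within its unit, counted from the start and the end.
  from_start <- ave(df$timestamp, df$id,
                    FUN = function(t) rank(t, ties.method = "first"))
  from_end   <- ave(df$timestamp, df$id,
                    FUN = function(t) rank(-t, ties.method = "first"))
  state <- rep(NA_character_, nrow(df))
  state[from_start <= n_step] <- "healthy"
  state[from_end   <= n_step] <- "failing"
  df$state <- state
  df[!is.na(df$state), ]
}

# Toy data: two units with six time steps each and one drifting sensor.
toy <- data.frame(
  id        = rep(1:2, each = 6),
  timestamp = rep(1:6, times = 2),
  s2        = c(1.0, 1.1, 1.3, 1.8, 2.6, 3.1,
                0.9, 1.0, 1.4, 1.9, 2.4, 3.0)
)

labelled <- label_health(toy, n_step = 2)

# Compare the sensor level in the two states.
print(aggregate(s2 ~ state, data = labelled, FUN = mean))
```

A clear gap between the two group means, as in this toy sensor, is the kind of
separation that suggests a channel carries degradation information.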
The correlation between attributes can also be calculated using the
visualize_correlation() function:
> visualize_correlation(train_data1, method='circle')
Figure 5 shows that many of the attributes are strongly correlated. Many
methods perform better if highly correlated attributes are removed.
We can find and remove the highly correlated attributes, using an absolute
correlation cutoff of 0.75, with the find_redundant() function as follows:
> train_data1 <- find_redundant(train_data1, cf = 0.75)
## The following highly corrected features are removed: s4,
## s7, s11, s12, s13, s14
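One possible way to implement such a correlation filter is sketched below in
base R. The helper `drop_correlated()` is hypothetical, and its greedy rule
(drop the member of the most correlated pair that has the larger mean
correlation with the other columns) is an assumption made for illustration; the
rule used inside the framework may differ. Constant columns are assumed to have
been removed beforehand, since their correlation is undefined.

```r
# Hypothetical helper: repeatedly drop one column from the most correlated
# pair until no absolute pairwise correlation exceeds the cutoff.
drop_correlated <- function(df, cutoff = 0.75) {
  corr <- abs(cor(df))
  diag(corr) <- 0
  removed <- character(0)
  while (length(corr) > 1 && max(corr) > cutoff) {
    # Row and column index of the most correlated pair.
    pair <- which(corr == max(corr), arr.ind = TRUE)[1, ]
    # Of the two, drop the column that is more correlated with the rest.
    mean_corr <- rowMeans(corr[pair, , drop = FALSE])
    victim <- rownames(corr)[pair[which.max(mean_corr)]]
    removed <- c(removed, victim)
    keep <- rownames(corr) != victim
    corr <- corr[keep, keep, drop = FALSE]
  }
  list(data = df[, !(names(df) %in% removed), drop = FALSE],
       removed = removed)
}

# Toy data: s3 is almost a linear function of s2, s9 is unrelated.
s2 <- as.numeric(1:10)
features <- data.frame(
  s2 = s2,
  s3 = 2 * s2 + rep(c(0.5, -0.5), 5),
  s9 = c(3, 1, 4, 1, 5, 9, 2, 6, 5, 3)
)

result <- drop_correlated(features, cutoff = 0.75)
print(result$removed)
print(names(result$data))
```

The same list of removed columns should then be dropped from the testing set so
that both tables keep identical features.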
With the missing values handled and the redundant features removed, our
datasets are now ready to undergo variable transformations if required. Here is
an example of using the process_data() function to apply MinMax normalization
to the training and testing datasets:
# MinMax normalization
> c(train_data1, test_data1) %<-% process_data(train_data1,
    test_data1, method='range')
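For reference, MinMax normalization maps each value $x$ of a column to
$(x - \min) / (\max - \min)$. The base R sketch below shows one common way to
apply it to a train/test pair; the helpers `fit_minmax()` and `apply_minmax()`
are hypothetical names, and the assumption that the minimum and range are
learned on the training set and reused for the testing set is ours rather than
a documented property of the framework.

```r
# Hypothetical helper: learn the per-column minimum and range on training data.
fit_minmax <- function(train, cols) {
  mins   <- vapply(train[cols], min, numeric(1))
  ranges <- vapply(train[cols], function(col) max(col) - min(col), numeric(1))
  list(cols = cols, mins = mins, ranges = ranges)
}

# Hypothetical helper: rescale the given data with previously learned values.
# Columns with zero range are assumed to have been removed already.
apply_minmax <- function(df, params) {
  for (col in params$cols) {
    df[[col]] <- (df[[col]] - params$mins[[col]]) / params$ranges[[col]]
  }
  df
}

# Toy data with a single sensor column.
train <- data.frame(id = c(1, 1, 1), s2 = c(10, 20, 30))
test  <- data.frame(id = c(2, 2),    s2 = c(15, 40))

params       <- fit_minmax(train, cols = "s2")
train_scaled <- apply_minmax(train, params)
test_scaled  <- apply_minmax(test, params)

print(train_scaled$s2)
print(test_scaled$s2)
```

With this convention the training values fall in [0, 1], while testing values
outside the training range can fall outside that interval.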
> evaluation$plot
Fig. 6. Visualisation of machine id vs. RUL covering both predicted and actual values
of dataset.
Fig. 7. Prediction-realization diagram for FD001 data. Different colors and marks to
show forecasts relating to different forecasting methods.
> evaluation$prd
The results show that SVM has the lowest errors (MAE, MdAPE, RMSE,
sMAPE). We can look at the default parameters of this algorithm:
> print(models$svm)
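The four error measures named above can be computed as in the following base R
sketch. The function `forecast_errors()` and the sample values are hypothetical,
and the formulas are common textbook definitions; sMAPE in particular has
several variants, so the framework's own definitions may differ. MdAPE is
undefined when an actual value is 0, which the sketch does not handle.

```r
# Hypothetical helper: common definitions of MAE, RMSE, MdAPE, and sMAPE.
# MdAPE and sMAPE are returned as percentages.
forecast_errors <- function(actual, predicted) {
  err <- actual - predicted
  c(
    MAE   = mean(abs(err)),
    RMSE  = sqrt(mean(err^2)),
    MdAPE = median(abs(err / actual)) * 100,
    sMAPE = mean(2 * abs(err) / (abs(actual) + abs(predicted))) * 100
  )
}

# Made-up actual and predicted RUL values for four units.
actual    <- c(100, 50, 20, 10)
predicted <- c(90, 55, 25, 8)

metrics <- forecast_errors(actual, predicted)
print(round(metrics, 2))
```

Comparing these values across candidate models on the same test set is what
allows one model to be ranked above another.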
We can improve the accuracy of the best algorithm (SVM in this case) by
tuning its parameters using grid search:
# Tune the SVM parameters sigma and C
# Use expand.grid to specify the search space
> grid <- expand.grid(sigma=c(.01, .015, 0.2),
    C=seq(1, 10, by=1))
# Train the SVM
> svm_grid <- train_model(RUL, train_data1,
    method="svm",
    tuneGrid=grid)
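The mechanics of grid search can be sketched independently of any particular
model. In the base R example below, `grid_search()` is a hypothetical helper and
`toy_score()` is a stand-in for a real validation error such as cross-validated
RMSE; it is not a property of SVMs and is chosen only so that the example runs
without additional packages.

```r
# The search space from the text: every combination of sigma and C.
grid <- expand.grid(sigma = c(0.01, 0.015, 0.2), C = seq(1, 10, by = 1))

# Hypothetical helper: score every row of the grid and return the row with
# the lowest score together with that score.
grid_search <- function(grid, score_fn) {
  scores <- vapply(seq_len(nrow(grid)),
                   function(i) score_fn(grid$sigma[i], grid$C[i]),
                   numeric(1))
  cbind(grid[which.min(scores), , drop = FALSE], score = min(scores))
}

# Stand-in score (NOT a real validation error): smallest at sigma = 0.015
# and C = 4. In practice this would train a model and return its error.
toy_score <- function(sigma, C) (sigma - 0.015)^2 + (C - 4)^2 / 100

best <- grid_search(grid, toy_score)

print(nrow(grid))
print(best)
```

The cost of the search grows with the number of grid rows, so the candidate
values for each parameter are usually kept short.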
Using print(svm_grid) we can see the optimal model together with the final
parameter values selected for it (sigma and C in this case). Once we have an
accurate model on our test harness, we can save it to a file with saveRDS() and
load it later with readRDS() to make predictions:
# Save the model to disk
> saveRDS(svm_grid, "./final_model.rds")
# Load the model
> model <- readRDS("./final_model.rds")
In addition to machine learning algorithms, the PdM framework also provides
tools for building predictive maintenance models based on deep learning
algorithms, e.g., create_tensor(), train_lstm(), train_cnn(), etc. For more
details about all functions of the PdM framework, use the following code to
open the documentation pages:
> help(package = "PdM")
4 Conclusion
We conclude that the PdM framework proposed in this study can be applied
for proactive decision support. The framework was applied to predict the
remaining useful life of equipment in the context of industrial predictive
maintenance.
In future work, we plan to wrap the functionality of the PdM framework in a
graphical user interface. This will enable users to carry out all steps of the
predictive maintenance model-building workflow from a browser without writing
code.
References
1. Allaire, J., Chollet, F.: R Interface to ‘Keras’. R package version 2.2.4 (2018).
https://CRAN.R-project.org/package=keras
2. Dui, H., Si, S., Zuo, M., Sun, S.: Semi-Markov process-based integrated importance
measure for multi-state systems. IEEE Trans. Reliab. 64(2), 754–765 (2015)
3. Hanachi, H., Liu, J., Banerjee, A., Chen, Y., Koul, A.: A physics-based modeling
approach for performance monitoring in gas turbine engines. IEEE Trans. Reliab.
64(1), 197–205 (2015)
4. Huang, Z., Xu, Z., Wang, W., Sun, Y.: Remaining useful life prediction for a
nonlinear heterogeneous Wiener process model with an adaptive drift. IEEE Trans.
Reliab. 64(2), 687–700 (2015)
5. Khelif, R., Chebel-Morello, B., Malinowski, S., Laajili, E., Fnaiech, F., Zerhouni,
N.: Direct remaining useful life estimation based on support vector regression.
IEEE Trans. Industr. Electron. 64(3), 2276–2285 (2017)
6. Kuhn, M.: caret: Classification and Regression Training. R package version 6.0-82
(2019). https://CRAN.R-project.org/package=caret
7. Li, X., Ding, Q., Sun, J.: Remaining useful life estimation in prognostics using deep
convolution neural networks. Reliab. Eng. Syst. Saf. 172, 1–11 (2018)
8. Long, B., Xian, W., Jiang, L., Liu, Z.: An improved autoregressive model by particle
swarm optimization for prognostics of Lithium-Ion batteries. Microelectron. Reliab.
53(6), 821–831 (2013)
9. Malhi, A., Yan, R., Gao, R.: Prognosis of defect propagation based on recurrent
neural networks. IEEE Trans. Instrum. Meas. 60(3), 703–711 (2011)
10. Qian, Y., Yan, R., Hu, S.: Bearing degradation evaluation using recurrence
quantification analysis and Kalman filter. IEEE Trans. Instrum. Meas. 63(11),
2599–2610 (2014)
11. Soh, S., Radzi, N., Haron, H.: Review on scheduling techniques of preventive
maintenance activities of railway. In: Fourth International Conference on
Computational Intelligence, Modelling and Simulation. IEEE, pp. 310–315,
Kuantan, Malaysia, September 2012. https://doi.org/10.1109/CIMSim.2012.56
12. Si, X., Wang, W., Chen, M., Hu, C., Zhou, D.: A degradation path-dependent
approach for remaining useful life estimation with an exact and closed-form
solution. Eur. J. Oper. Res. 226(1), 53–66 (2013)
13. The Comprehensive R Archive Network. https://cran.r-project.org/. Accessed 20
Jan 2019
14. Turbofan engine degradation simulation data set.
https://c3.nasa.gov/dashlink/resources/139/. Accessed 18 Jan 2019
15. Wang, S., Zhao, L., Su, X., Ma, P.: Prognostics of Lithium-Ion batteries based
on battery performance analysis and flexible support vector regression. Energies
7(10), 6492–6508 (2014)
16. Wickham, H.: ggplot2: Elegant Graphics for Data Analysis. Springer, New York
(2016). https://doi.org/10.1007/978-0-387-98141-3
17. Yu, J., Mo, B., Tang, D., Liu, H., Wan, J.: Remaining useful life prediction for
Lithium-Ion batteries using a quantum particle swarm optimization-based particle
filter. Qual. Eng. 29(3), 536–546 (2017)