ISA Transactions xxx (xxxx) xxx

Contents lists available at ScienceDirect

ISA Transactions

Journal homepage: www.elsevier.com/locate/isatrans

Research article

An automated health indicator construction methodology for prognostics based on multi-criteria optimization

Khanh T.P. Nguyen*, Kamal Medjaher

ENIT, Toulouse INP, 47 Avenue d'Azereix, 65000 Tarbes, France

ARTICLE INFO

Article history:
Received 16 July 2019
Accepted 12 March 2020
Available online xxx

Keywords:
Prognostics and health management
Feature extraction
Health indicator construction
Genetic programming

ABSTRACT

In recent years, the development of autonomous health management systems has received increasing attention from companies worldwide seeking to improve their performance and avoid downtime losses. This can be done, in a first step, by constructing powerful health indicators (HI) from intelligent sensors for system monitoring and for making maintenance decisions. In this context, this paper aims to develop a new methodology that automatically chooses the pertinent measurements among various sources and also handles raw data from high-frequency sensors to extract useful low-level features. Then, it combines these features to create the most appropriate HI following previously defined multiple evaluation criteria. Thanks to the flexibility of genetic programming, the proposed methodology does not require any expert knowledge about system degradation trends but allows easily integrating this information if available. Its performance is then verified on two real application case studies. In addition, an insightful overview of HI evaluation criteria is also discussed in this paper.

© 2020 ISA. Published by Elsevier Ltd. All rights reserved.

1. Introduction

In recent years, companies facing fierce global competition must continuously innovate to improve their performance and avoid downtime and loss of revenue.
One of the levers for achieving these goals is to develop autonomous systems from the viewpoint of system health management [1]. This can be done, in the first step, by using intelligent sensors, which provide reliable solutions for real-time system monitoring. Then, in the second step, the monitoring data are processed and analyzed to extract health indicators (HI) for maintenance and operation decisions.

* Corresponding author.
https://doi.org/10.1016/j.isatra.2020.03.017
0019-0578/© 2020 ISA. Published by Elsevier Ltd. All rights reserved.
Please cite this article as: K.T.P. Nguyen and K. Medjaher, An automated health indicator construction methodology for prognostics based on multi-criteria optimization. ISA Transactions (2020), https://doi.org/10.1016/j.isatra.2020.03.017.

Health indicator construction is generally based on a feature engineering process. This process involves selecting relevant features in the data and transforming them to generate new powerful indicators, which are then used for system health management. In early studies, principal component analysis (PCA) was proposed to find a lower-dimensional representation of features for condition and performance assessment [2]. However, this method, which is based on a linear combination of the original variables, shows its limitations when facing nonlinearities and time-varying behaviors of system degradation. Thus, numerous PCA variations were developed to handle data with nonlinearity and multi-modality properties, such as Kernel-PCA, PCA-based KNN and PCA-based Gaussian mixture model [3]. Besides, other nonlinear combination techniques such as Isomap or Locally Linear Embedding (LLE), which find a manifold embedding of lower dimensionality, are also used to extract useful features for monitoring system states [4,5]. However, the new features created by these statistical projection methodologies are not interpretable, which can be a deal-breaker in some settings.
In addition, they are usually used for diagnostic problems [6,7], but rarely for prognostics, which requires more signal processing techniques [4,5].

Considering HI construction methods that are dedicated to the prognostics domain, they can be generally classified in two groups: mathematical model based and deep learning based. For model based methods, the authors in [9] evaluate the distance between the vibration signals of a degraded bearing and a nominal bearing, and then smooth it by an exponential model. In [10], the authors manually choose the relevant features, and then construct the HI using a weighted average combination of the chosen features. The HI is also developed based on expert knowledge about the physical behaviors of the system [1,11] and about the relevant features used for creating an effective HI [12]. In a recent study [13], the authors propose to use the multivariate state estimation method, which is a non-parametric regression modeling technique, to generate a useful HI. However, the above studies are based on assumptions about degradation forms over time or on expert knowledge about signal processing techniques, data analysis, and system behaviors. In practice, it is usually difficult to obtain this information, especially for complex systems. An automatic end-to-end process is therefore preferable.

Deep learning (DL) models, which provide alternative solutions, have become one of the most popular trends in recent studies. In fact, they automatically extract and create useful features by themselves without expert knowledge, while traditional machine learning approaches require features to be predetermined by users. This ability saves users a significant amount of work. Furthermore, they are capable of using different data formats to train models without manual processing
and still obtain useful results. Once trained properly, a DL model can accurately perform many repetitive tasks within a short time period. Hence, numerous studies in the literature develop different DL models to construct HI for prognostics. For example, the authors in [14] propose an Encoder-Decoder based on Long Short-Term Memory (LSTM), while a Recurrent Neural Network (RNN) Encoder-Decoder is used in [15]. Besides, the Convolutional Neural Network (CNN) is also applied to create HI using raw vibration signals [16,17] or using time-frequency features extracted from data [18,19]. From these studies, it can be seen that DL models can take advantage of abundant data to automatically generate health indicators without much expert knowledge about the system. Nevertheless, the deep features created by these works are difficult to understand and cannot be interpreted as physical characteristics of the system.

Developing an automated HI construction method that requires minimal user effort and creates an interpretable HI is challenging. In [20], the authors present a genetic programming (GP) method to automatically find the best mathematical formulation that combines low-level features to form more abstract high-level prognostic features. This method does not require any analytical knowledge about the HI formulation and is flexible in discovering new mathematical combinations of features for HI construction. The created HI, which is an explicit mathematical function of low-level features, can be interpreted for further studies. In other words, it offers interpretability of the physical meaning of the created HI. Thanks to these advantages, GP can be considered as a promising solution to develop the automated HI construction method. However, as GP finds the optimal mathematical formulation based on the fitness function, the performance of the created solutions strictly depends on the evaluation criteria.
To the best of our knowledge, various evaluation criteria for creating prognostics features are proposed in the literature, but no paper provides an insightful overview of which criteria are appropriate and when a combination of these criteria could be useful. In addition, the existing work based on GP [20] only considers low-level features that are manually extracted using the signal processing experience of the authors, and does not construct an automated feature extraction (FE) phase with various flexible options of FE operators.

Therefore, the first contribution of this work, presented in Section 2, aims to fill this gap in the literature with a brief overview of the HI evaluation criteria for prognostics purposes. It also addresses, through different case studies, an interesting question: whether multiple criteria are necessary to create an HI or a single criterion is sufficient. The second contribution, which is also the main contribution, concerns the development of a new HI construction framework. To the best of our knowledge, this is the first work that proposes a complete automated process, from the extraction of low-level features to the construction of a useful HI, while taking into account multiple evaluation criteria. To inherit positive properties such as the flexibility in creating new mathematical functions and the interpretability of the results, the proposed framework is developed based on two-stage GP. The first stage aims to automatically extract pertinent low-level features from raw sensor measurements, while the second stage is dedicated to constructing an effective HI using the first stage's output. An overview of the proposed framework will be presented in Section 3.1. Then, the details of the feature extraction step and the HI construction step will be respectively described in Sections 3.2 and 3.3. Next, the performance of the proposed methodology will be examined in Section 4.
Finally, the conclusion and further works will be discussed in Section 5.

The choice of appropriate criteria is essential to construct a powerful HI for prognostics purposes. Therefore, this section aims to present an overview of the HI evaluation criteria. Its main points are drawn from our comprehensive survey of the studies that developed HI in the prognostics field.

In summary, according to the final objective, which is prognostics, the HI evaluation criteria can be generally classified in two principal groups. The first group only focuses on the performance assessment of the HI construction phase, while the second group is interested in the effectiveness of the prognostics results (Fig. 1).

In the first group, the performance of HI construction methods can be evaluated through the intrinsic natures of the HI or its external correlations (e.g. its correlation with the Remaining Useful Lifetime (RUL)). For the intrinsic natures of the HI, one can cite the following characteristics:

• Monotonicity: It is the most used criterion in the literature and is designed to evaluate the monotonic increasing or decreasing trend of an HI [10,12,16,20-23]. If all HI trajectories strictly increase or strictly decrease, then the monotonicity equals 1, its maximum value.
• Failure consistency: Considering different HI trajectories, it is necessary to develop a criterion to assess the consistency of their failure threshold [10]. When all HI trajectories reach the same failure threshold, the failure consistency converges to 1, its maximum value.
• Trendability: It is widely used in the literature to evaluate the correlation between the degradation trend of an HI and the system operation time [16,21,22]. The trendability is high when there exists a correlation between the HI trajectories and the system lifetime.
• Scale similarity: It is used to measure the similarity of the range scales among all HI trajectories [16,21,22].
  If all trajectories have the same minimal and maximal values, the scale similarity reaches its maximal value, which equals 1.
• Robustness: It is designed to assess how robust the HI is to random fluctuations, which may arise due to sensor noise, uncertainty of degradation phenomena, or variations in operating conditions [22,23]. As can be seen in Fig. 2, the created HI, which includes high noise, has a low robustness score but a high scale similarity score because its trajectories fluctuate in almost the same range.

[Fig. 1. Classification of the HI evaluation criteria for prognostics. MAE: Mean absolute error, MAPE: Mean absolute percentage error, MSE: Mean square error, MAD: Mean absolute deviation, FP/FN: False positive/false negative.]

[Fig. 2. Illustration of HI having low robustness but high scale similarity score. Axes: HI trajectories versus Time (days); curves: Traj. 1, Traj. 2, Traj. 3.]

Different from the intrinsic natures of the HI, which only reflect its own properties, the external correlation requires RUL information for evaluating its prognosticability. Concretely, the following criteria could be investigated:

• Correlation with RUL: It is a statistical measure that describes the association between HI and RUL. There are several methods for calculating different types of strength of association. Three of the most popular methods are briefly described below:
  - Pearson correlation coefficient. It is the most widely used correlation coefficient, and measures the linear association between continuous variables.
  - Spearman's correlation coefficient. It can be considered as a special case of Pearson applied to non-linear relationships: instead of linear association, it measures monotonic association (only strictly increasing or decreasing, but not mixed) between two variables and relies on the rank order of values. In other words, it looks at the relative order of values for each variable. This property makes it appropriate to use with both continuous and discrete data.
  - Kendall's Tau coefficient. It does not take into account the difference between ranks but uses only the directional agreement to measure the ordinal association between two variables, i.e. the similarity of the data orderings when ranked by each of their quantities. Hence, it is more appropriate for discrete data.
  Among the three correlation coefficients above, the Spearman correlation, which measures non-linear relationships and is appropriate for both continuous and discrete data, is the one most used to evaluate the HI performance [11,24]. Therefore, in this paper, the Spearman correlation is used to measure the correlation between the HI trajectories and the system RUL.
• Mutual information (MI): It is used to measure the mutual dependence between HI and RUL. In detail, it quantifies the "amount of information" obtained about the RUL by observing the HI [25]. It is a non-negative value, equal to zero if and only if the two random variables are independent, while higher values mean higher dependency.
• F-test: This statistical test is useful in feature selection by evaluating the significance of a feature in improving the model [26]. It can therefore be directly used to assess HI performance. Fig. 3 illustrates the difference between the MI and F-test criteria. As the F-test only captures linear relationships between variables, on the left side, the HI having a linear trend obtains a high F-test score but a low MI score as its trajectories have different ranges. In contrast, the HI on the right side has a non-linear trend and its trajectories fluctuate in almost the same range, so it has a low F-test score and a high MI score.

[Fig. 3. Illustration of the difference between F-test and mutual information criterion. (a) High F-test and low MI. (b) Low F-test and high MI.]

In the second group, the HI performance can be indirectly evaluated through the following prognostic performance metrics, which have been considered in numerous papers [15,18, ...]:

• Mean absolute error (MAE): It measures the mean absolute difference between the original and the predicted RUL values.
• Mean absolute percentage error (MAPE): This statistical metric is used to evaluate how accurate a forecast RUL is. It is calculated as the average of the absolute difference between each predicted RUL value and the original RUL value, divided by the original RUL value.
• Mean square error (MSE): It measures the average squared difference between the true RUL and the predicted RUL values.
• Accuracy: This metric is generally used to evaluate classification models. For prognostics, it is defined as the ratio of the number of predicted values that belong to a precision interval identified around the true values to the total number of predicted values.
• Mean absolute deviation (MAD): It is a spread metric used to measure the variability of the difference between the true and predicted RUL values. It is similar to the standard deviation but meant to be more robust to outliers.
• Score: This asymmetric scoring function is provided in [29] to measure the prognostics performance. As late predictions could cause serious system failures in real-life PHM applications, it penalizes late predictions more than early predictions.
• False positive/false negative (FP/FN): These metrics are defined as the ratio of the number of false positive predictions (or false negative predictions) to the total number of predictions. A prediction is false positive (false negative) when it is an early prediction (late prediction) and the absolute difference between the predicted and the true values is greater than a threshold defined by the user.

Discussion on HI evaluation criteria

Table 1 summarizes the characteristics of the HI evaluation criteria for prognostics. As the prognostics performance metrics strictly depend on the effectiveness of the prognostics model and require a lot of time to be evaluated, they are not considered further in this paper.

Next, among the remaining criteria, which directly evaluate the performance of the HI construction step, the monotonicity and trendability criteria are the simplest ones. They can be easily used for all applications because they do not require test-to-failure experiments for acquiring RUL information.

Different from the monotonicity and trendability, which reflect the properties of a single HI [6], the failure consistency, scale similarity and robustness measure the resemblance and correlation among all HI trajectories. Therefore, they are only appropriate for the cases where there exists abundant test-to-failure
Among those three criteria, the robustness requires the most time to extract the trend of Hi trajectories, Finally, the criteria such as Ml, correlation and F-test require many test-to-failure experiments to acquire RUL information. ‘Among them, the correlation with RUL is the fastest evaluated criterion. ‘To have a comprehensive view of the impact of the HI eval- uation criteria on the HI construction step, hereinafter in this Paper we consider the following criteria: monotonicity. trend- ability, failure consistency, scale similarity, robustness, mutual information, Spearman correlation and F-test. Their formulations ‘are summarized in Table 2, Note that, their values are scaled into the range from 0 to 1 t0 facilitate the analysis, the assessment, the comparison and the combination between them. 3, Two stage GP based automated-H-construction methodol- ony ‘The characteristics of the HI construction methods reported in literature are summarized in Table 3. Each method has its ‘own advantages and weaknesses, To inherit the advantages and ‘overcome the weaknesses of these methods, this work aims (0 develop a new methodology that allows: 1. automatically extracting reasonable low-level features, and automatically combining them to form high-level abstract features: ‘ot requiring the expertise knowledge but facilitating its Integration if available: taking into account multi criteria for HI performance eval- uation; creating useful HI that can be interpreted by the physical characteristics of the system in further studies: being easily applied to various systems. 4 5. ‘To satisfy the above requirements, an automated process based ‘on two stage GP is proposed in this Section. 
GP is chosen as a promising solution because of (1) its flexibility in discovering new mathematical expressions under a tree-like structure, (2) its ability to extend the solution length up to hardware limits, (3) its ability to get accurate results without analytical knowledge, and (4) its ability to easily take into account multiple criteria for HI performance evaluation. Besides these advantages, the computational complexity of the proposed methodology should be deeply analyzed and improved in future work. In addition, the manual adjustment of the method's hyper-parameters can also be considered as a weakness. Hence, an automatic process to identify the most appropriate input parameters could be developed as one of the perspectives of this paper.

Table 1
Summary of HI evaluation criteria for prognostics.

Criteria                           Require RUL   Require many tests to failure   Low computational complexity
Monotonicity                       no            no                              yes
Trendability                       no            no                              yes
Failure consistency                no            yes                             yes
Scale similarity                   no            yes                             yes
Robustness                         no            yes                             no
Mutual information                 yes           yes                             no
Correlation with RUL               yes           yes                             yes
F-test                             yes           yes                             no
Prognostics performance metrics    yes           [not legible]                   no

Table 2
Formulation of HI evaluation criteria. The caption defines the following quantities: the number of HI trajectories; the length of the i-th trajectory; the difference between two consecutive points of the i-th trajectory; the time corresponding to the i-th trajectory; the standard deviation of the terminal points of all HI trajectories; the first point of the i-th trajectory; the maximal or minimal value of all HI trajectories; the mean trend of all HI trajectories; and the mutual information between the i-th trajectory and its relevant RUL.
[Table body: formulas for monotonicity, trendability, failure consistency, scale similarity, robustness, mutual information, Spearman correlation with RUL and F-test. The formulas are not legible in the source.]

Table 3
Summary of HI construction methods reported in the literature.
[Table body not legible in the source. The surviving fragments indicate a comparison of statistical projection, mathematical model, deep learning and GP based [20] methods with the proposed method, in terms of applicability, interpretability, automation of the process, required expertise and evaluation criteria.]

3.1. Overview of the proposed methodology

Fig. 4 shows an overall flow chart of the proposed HI construction methodology. It generally includes two stages: feature extraction and HI construction. The first stage aims to extract pertinent low-level features by automatically adjusting the feature extraction (FE) functions and their relevant outputs. It starts with a population of individuals that are tree-like representations of the FE functions, their parameters and the sensor signals. Next, it evaluates them based on some evaluation criteria and generates a new population by applying evolutionary operators to high-scoring individuals and then eliminating the low-scoring ones. The details of this stage will be described in Section 3.2. In the second stage,
GP is used to derive reasonable mathematical formulations of the features extracted in the first stage to create powerful prognostics health indicators (see Section 3.3 for more details). Each individual HI is defined using an expression tree constructed by combining the values of low-level features and a set of mathematical operators.

[Fig. 4. Overall flow chart of the proposed methodology.]

Table 4
Summary of signal processing techniques to extract the useful features (IF: Impulse factor, RMS: Root mean square, EEMD: Ensemble empirical mode decomposition).

Time domain [31-33]: overall form (e.g. IF), statistical value (e.g. RMS)
  Advantages: Classical, fast and simple. Can be generally applied for different failure types of various systems.
  Weaknesses: Not viable for noisy signals. May require pretreatment techniques before the evaluation of features.

Frequency domain: Fast Fourier Transform [7], Envelope analysis [435], Power Spectrum Density [33,35,37]
  Advantages: Effective techniques to detect system anomalies when the fault characteristic frequencies are known.
  Weaknesses: Require information about fault characteristic frequencies. Cannot be broadly applied.

Time-frequency domain: Short-time Fourier Transform [8], Wavelet Transform [2539], Hilbert-Huang Transform [20-42], EEMD [43]
  Advantages: Powerful techniques for analyzing and characterizing the signal spectrum in time. Viable for noisy signals.
  Weaknesses: High computation time. Require experience to choose appropriate hyperparameters of the methods and to extract the useful information from their results.

3.2. Automated feature extraction stage

This subsection aims to present the automated feature extraction step of the proposed methodology. Firstly, an overview of signal processing techniques to extract features is presented in Section 3.2.1. Then, Section 3.2.2 is dedicated to developing an effective algorithm to automatically choose appropriate functions and their relevant parameters to extract the useful features.

3.2.1.
Feature extraction techniques

In the literature, signal processing techniques for feature extraction can be classified into three categories: time, frequency and time-frequency domain; see [44] for a brief review. The properties of these techniques are summarized in Table 4. Among the three categories, the time-frequency domain includes the most powerful techniques for analyzing and characterizing non-stationary signals and therefore can be viable for noisy signals. However, it requires high computation time and also extensive experience to choose appropriate hyperparameters and to get useful information from the results. Next, frequency domain analysis is effective for detecting system anomalies when their fault characteristic frequencies are known. As for time domain analysis, although it may require pretreatment techniques to enhance noisy signals before the extraction of features, its methods can be generally applied to various systems thanks to their simplicity and their fast computation time. As this work aims to develop an automated, fast and effective method that does not require much expert knowledge about signal processing techniques and can be broadly applied to various systems, only the time domain is investigated hereinafter.

The functions to extract time domain features used in this work are summarized in Table 5. Apart from the classical and popular time indicators in the literature (the peak, peak to peak, mean absolute, root mean square, crest factor, skewness, kurtosis, shape factor and impulse factor values, and the remaining indicator of Table 5), we consider one more signal pretreatment technique that is represented by the smooth function. This function aims to extract the signal trend using a polynomial fitting function and therefore removes signal noise.

Table 5
Summary of functions to extract time domain features, evaluated over a time window of $n$ signal values.
[Table body: peak value; peak to peak value; mean absolute value; root mean square value; crest factor; skewness value; kurtosis value; shape factor; impulse factor; one further factor whose name is not legible in the source; smooth function. The formulas are not legible in the source.]

Fig. 5 illustrates how to apply the functions presented in Table 5 to a particular signal. In detail, these functions require one parameter, the time window length $n$, over which the corresponding function will be evaluated. For example, assume that the time window length is fixed at $n = 3$. The window slides along the recorded signal from the beginning ($x_1$) until the time $t$ ($x_t$). Concretely, at the beginning, $x_1$ is copied two times and these values are added on the left side of the recorded signal to evaluate $v_{RMS}$ over the time window $n = 3$ for the first value $x_1$. Next, at the second epoch, there exist two signal values $x_1$ and $x_2$, so only one value $x_1$ is added on the left side of the recorded signal. Finally, at the current time $t$, the function $v_{RMS}$ is evaluated using the three signal values $x_{t-2}$, $x_{t-1}$ and $x_t$ that are located within the defined time window. Note that with this sliding evaluation method, the extracted feature length is equal to the signal length.

[Fig. 5. Illustration of the operator for feature extraction (e.g. $v_{RMS}$ with $n = 3$).]

[Fig. 6. Flow chart of the automated feature extraction algorithm.]

3.2.2. Automated feature extraction algorithm

To extract useful features, it is necessary to identify which functions and which relevant parameters are appropriate for every raw sensor signal. Hence, this subsection aims to propose an algorithm that facilitates this task. The general flow chart of the proposed algorithm is presented in Fig. 6.

The proposed algorithm is based on Genetic Programming.
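As a side illustration of the sliding-window evaluation described in Section 3.2.1 (Fig. 5), the following minimal Python sketch left-pads the signal with copies of its first sample so that the extracted feature has the same length as the signal. It is a sketch under stated assumptions, not the authors' implementation: the helper names `sliding_feature` and `rms` are hypothetical, and the RMS is taken in its usual form (square root of the mean of squares), since the formulas of Table 5 are not legible in the source.

```python
import numpy as np


def rms(window):
    """Root mean square of one window (usual definition, assumed here)."""
    return float(np.sqrt(np.mean(np.square(window))))


def sliding_feature(signal, n, func):
    """Evaluate `func` over a sliding window of length n (hypothetical helper).

    The signal is left-padded with n - 1 copies of its first sample, as in
    the n = 3 example of the text, so that one feature value is produced for
    every signal sample.
    """
    if n < 1:
        raise ValueError("window length n must be at least 1")
    x = np.asarray(signal, dtype=float)
    # Left padding: copy the first sample n - 1 times.
    padded = np.concatenate([np.full(n - 1, x[0]), x])
    # Window i covers padded[i : i + n], i.e. the n samples ending at x[i].
    return np.array([func(padded[i:i + n]) for i in range(len(x))])


# Usage: the feature has the same length as the recorded signal.
features = sliding_feature([3.0, 4.0, 0.0], 3, rms)
print(len(features), features)
```

Any other time domain function that maps a window to a scalar could be passed as `func` in the same way.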
Firstly, an initial population including $n_p$ individuals is randomly created. Every individual, which is a combination of (1) a feature extraction (FE) function, (2) a sensor signal output and (3) a value of the window length parameter $n$, represents a way of extracting features. Its performance is then evaluated through one (or a combination) of the evaluation criteria presented in Table 2.

Table 6
Summary of mathematical operators to create the HI.

Operators                         Formulation
Addition                          x + y
Subtraction                       x - y
Multiplication                    x * y
Protected division                x / y if y > 10^[exponent not legible], otherwise 10^[exponent not legible]
Protected exponential function    exp(x) if x <= 100, otherwise 10^[exponent not legible]
Protected logarithmic function    log(x) if x > 10^[exponent not legible], otherwise -10^[exponent not legible]
Power function                    [formula not legible]
Negative function                 [formula not legible]
Squared function                  [formula not legible]

From the initial population, $n_o$ offspring are generated in different ways through crossover, mutation of the FE function, terminal mutation and reproduction. Concretely, a real number $r \in [0, 1]$ is randomly generated. If its value is less than the crossover probability $p_c$, then two individuals are chosen from the initial population to perform the crossover between them (Fig. 7). Different from the normal crossover existing in the literature, the one proposed in this section only exchanges terminals of the same type, e.g. the sensor signal output of parent 1 cannot be replaced by the window length parameter of parent 2. After the crossover operation, two offspring are created. Unlike crossover, for the mutation and reproduction operators, only one parent is chosen to create one offspring. Note that for all operators, an individual having a better fitness value is chosen with a higher probability. In detail, if $r$ is greater than $p_c$ but less than its cumulative sum with the probability of the FE-function mutation, $p_f$, the FE function of the chosen parent will be replaced by another FE function (Fig. 8(a)). Otherwise,
if $r$ is greater than this sum ($p_c + p_f$) but less than the cumulative sum including the probability of the terminal mutation, $p_t$, the terminal (i.e. the sensor output or the window length parameter $n$) will be replaced by another terminal of the same type (Fig. 8(b)). Finally, if $r$ is greater than the sum of the crossover and mutation probabilities ($p_c + p_f + p_t$), the chosen parent will be copied to create its offspring.

After creating $n_o$ offspring, all individuals, including the $n_p$ individuals of the parent population and the $n_o$ individuals of the offspring population, are evaluated to update the hall of fame (HOF), which holds the best solutions through all generations. The HOF size, which is defined by the user, is the number of extracted features. Among the $n_o + n_p$ individuals, $n_p$ individuals will be randomly chosen as the parents of the next generation. Note that an individual having a better fitness value is kept with a higher probability. For each new generation, the above procedure is repeated until the stopping criterion (the maximal number of generations) is attained.

3.3. Automated health indicator construction step

The second stage of the proposed methodology aims to find the best mathematical functions that combine the low-level features extracted in the first stage to derive a powerful HI. To prevent not-a-number values that can be created by random combinations, several variants of basic mathematical operators are proposed. The operators used in this work are summarized in Table 6.

Fig. 9 presents the flow chart of the automated HI construction algorithm. Different from the first stage, which only finds one-level combinations of FE functions, the individuals of the second stage are multi-level combinations of the mathematical operators defined in Table 6. Thus, the evolutionary operators defined in this step are richer than the ones used in the previous step.
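The idea behind the protected operators of Table 6 can be sketched in Python as follows. This is only an illustrative sketch: the function names are hypothetical, and the constants `EPS` and `FALLBACK` are assumed placeholder values, because the exponents of the thresholds and fallback values in Table 6 are not legible in the source. The bound of 100 for the exponential is read from the table, and the use of the absolute value in the division guard is an additional assumption.

```python
import math

EPS = 1e-6        # assumed threshold (exponent not legible in the source)
FALLBACK = 1e6    # assumed fallback value (exponent not legible in the source)
EXP_MAX = 100.0   # upper bound on the exponent argument, as read from Table 6


def protected_div(x, y, eps=EPS, fallback=FALLBACK):
    """Return x / y unless the denominator is too close to zero."""
    return x / y if abs(y) > eps else fallback


def protected_exp(x, x_max=EXP_MAX, fallback=FALLBACK):
    """Return exp(x) unless x is large enough to risk an overflow."""
    return math.exp(x) if x <= x_max else fallback


def protected_log(x, eps=EPS, fallback=-FALLBACK):
    """Return log(x) unless x is non-positive or too close to zero."""
    return math.log(x) if x > eps else fallback


# Usage: random operator combinations never produce NaN or raise an error.
values = [protected_div(1.0, 0.0), protected_exp(1000.0), protected_log(0.0)]
print(all(math.isfinite(v) for v in values))
```

Because every operator returns a finite number for any finite input, an expression tree built from random combinations of these operators can always be evaluated and scored by the fitness function.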
[Fig. 7. Crossover operator for feature extraction. (a) Same type individuals. (b) Different type individuals.]

[Fig. 8. Mutation operator for feature extraction. (a) Function mutation. (b) Terminal mutation.]

Concretely, these evolutionary operators are summarized as follows:

• Crossover: If the random value $r$ is less than or equal to the probability of crossover ($r \le p_c$), the crossover operator will be performed. It randomly selects one point in each parental individual and exchanges their relevant subtrees (Fig. 10).
• Function mutation: If the random value $r$ is greater than the probability of crossover and less than or equal to the sum of the crossover and function mutation probability, noted p2 (pe

