Lecture 7: History-based Fault Diagnosis with ANN
Farzaneh Abdollahi
Winter 2024
Outline
▶ Example [3]
▶ Data Collection
▶ Feature Extraction
▶ Feature Selection
▶ NN Classifier
▶ Results
ANN-Based Approaches
▶ The output of the BPNN is
$$(y_i)_k = \sigma^{out}\!\left(\sum_{j=1}^{n_H} w_{jk}^{out}\,(x_i^{H})_j + b_k^{out}\right), \qquad k = 1, \ldots, l$$
where $(y_i)_k$ is the predicted output of the $k$th neuron in the output layer, $\sigma^{out}$ is the activation function of the output layer, and $w_{jk}^{out}$, $b_k^{out}$ are, respectively, the weights and biases of the output layer.
▶ The optimization objective of the BPNN, to be minimized, is the error between the output and the target:
$$\min_{w,b}\; E_i = \frac{1}{2}\sum_{k=1}^{l}\big[(d_i)_k - (y_i)_k\big]^2$$
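To make the forward pass and the objective above concrete, here is a minimal NumPy sketch for a one-hidden-layer BPNN. The layer sizes, tanh activations, and random data are illustrative assumptions, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_H, l = 6, 10, 3           # assumed layer sizes
x = rng.normal(size=n_in)         # one input sample x_i
d = np.array([1.0, -1.0, -1.0])   # its target vector d_i (l entries)

# Hidden layer (tanh assumed): x_H is the hidden activation vector
W_h, b_h = rng.normal(size=(n_H, n_in)), np.zeros(n_H)
x_H = np.tanh(W_h @ x + b_h)

# Output layer: (y_i)_k = sigma_out( sum_j w_jk * x_Hj + b_k )
W_out, b_out = rng.normal(size=(l, n_H)), np.zeros(l)
y = np.tanh(W_out @ x_H + b_out)

# Objective E_i = 1/2 * sum_k [ (d_i)_k - (y_i)_k ]^2
E = 0.5 * np.sum((d - y) ** 2)
print(E)
```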
▶ In the Backward Path
▶ The training parameters $w$ and $b$ are updated by gradient descent:
$$w \leftarrow w - \eta\,\frac{\partial E}{\partial w}, \qquad b \leftarrow b - \eta\,\frac{\partial E}{\partial b}$$
where $\eta$ is the learning rate.
▶ The error gradient propagates backward from the output layer to the input layer, updating the training parameters layer by layer (a sketch of one such update follows this list).
▶ Other popular ANN-based approaches include RBF and wavelet neural networks.
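Continuing the NumPy sketch above, one gradient-descent step for the output-layer parameters might look as follows; the chain-rule terms assume the tanh output activation, and all names are illustrative.

```python
eta = 0.1                              # learning rate

# dE/dy = -(d - y); for y = tanh(z), dy/dz = 1 - y**2
delta = -(d - y) * (1.0 - y ** 2)      # dE/dz at the output layer

# Gradient-descent updates: w <- w - eta * dE/dw, b <- b - eta * dE/db
W_out -= eta * np.outer(delta, x_H)    # dE/dW_out = delta * x_H^T
b_out -= eta * delta                   # dE/db_out = delta
```

Propagating `delta` back through `W_out` and the hidden tanh gives the analogous update for the hidden layer, layer by layer.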
[Figure: ANN-based approaches, from [3]]
Example [4]
Feature Extraction
Feature Selection
▶ After feature extraction, there are too many input features, which would require significant computational effort to process.
▶ This may also result in low accuracy in monitoring and fault diagnosis.
▶ Solution: to remove redundant or irrelevant information, the data is mapped into a space of lower dimensionality by principal component (PC) analysis (PCA).
▶ PCA is a multivariate technique that analyzes a data table in which
observations are described by several intercorrelated quantitative
dependent variables.
▶ By introducing an orthogonal linear transformation, the important information in the table is extracted and represented as a set of new orthogonal variables called PCs.
▶ Therefore, the data is transformed into a new coordinate system such that the greatest variance under any projection of the data lies on the first coordinate (the first PC), the second greatest variance on the second coordinate, and so on (see the sketch below).
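As an illustration, here is a minimal PCA sketch in NumPy, assuming a hypothetical feature matrix X (rows = observations, columns = extracted features): it centers the data, eigendecomposes the covariance matrix, and projects onto the top-k PCs. Keeping k = 6 components would match the six network inputs listed later.

```python
import numpy as np

def pca_reduce(X, k):
    """Project X (n_samples x n_features) onto its first k principal components."""
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = np.cov(Xc, rowvar=False)           # covariance matrix of the features
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigendecomposition (symmetric matrix)
    order = np.argsort(eigvals)[::-1]        # sort PCs by decreasing variance
    W = eigvecs[:, order[:k]]                # top-k orthogonal directions
    return Xc @ W                            # coordinates in the new system

# Hypothetical usage: reduce 20 extracted features to 6 inputs
X = np.random.default_rng(0).normal(size=(100, 20))
Z = pca_reduce(X, k=6)
print(Z.shape)  # (100, 6)
```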
NN Classifier
▶ MLP: for the second layer of the network (the MLP layer), a proper learning rule and transfer function should be selected for optimum performance.
▶ tanh is found to be the best choice for the activation function.
▶ Different learning rules are examined: Momentum (MOM), Conjugate
Gradient (CG), Quick Propagation (QP), Delta Bar Delta (DBD),
Levenberg–Marquardt (LM), and Step (STP)
▶ Step size, number of hidden layers, momentum, and learning rate are other parameters to be chosen properly by trial and error (a sketch of such a search follows below).
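The learning rules listed above are not all available in common libraries. As a rough sketch of the same trial-and-error selection, the following scikit-learn snippet grid-searches step size and momentum for a tanh MLP trained with SGD; the data, grid values, and hidden size are placeholders, and plain SGD with momentum only stands in for the MOM/CG/QP/DBD/LM/STP rules.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Placeholder stand-ins for the PCA-reduced features and fault labels
rng = np.random.default_rng(0)
Z = rng.normal(size=(120, 6))
labels = rng.integers(0, 3, size=120)

# Trial-and-error over step size and momentum
param_grid = {
    "learning_rate_init": [0.1, 0.3, 0.4],
    "momentum": [0.5, 0.7, 0.9],
}
mlp = MLPClassifier(hidden_layer_sizes=(35,), activation="tanh",
                    solver="sgd", max_iter=500, random_state=0)
search = GridSearchCV(mlp, param_grid, cv=3)
search.fit(Z, labels)
print(search.best_params_)
```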
▶ Final proposed design for the NN:
▶ number of inputs: 6;
▶ stopping condition: 4500 epochs;
▶ number of hidden layers: 1;
▶ error criterion: L2 norm;
▶ number of cluster centers (RBF hidden layer): 35;
▶ number of connection weights: 489.
▶ Hidden Layer:
▶ Transfer function: Tanh
▶ Step size: 0.3
▶ Momentum: 0.9
▶ Output Layer:
▶ Transfer function: Tanh
▶ Step size: 0.4
▶ Momentum: 0.5
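One plausible reading of the design above is an RBF network: 6 inputs, a hidden layer of 35 Gaussian cluster centers, and a tanh output layer. The NumPy sketch below follows that reading; the centers, widths, weights, and the number of output classes are placeholder assumptions, since the slides do not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_rbf, n_out = 6, 35, 3   # 6 inputs, 35 RBF centers; 3 classes assumed

centers = rng.normal(size=(n_rbf, n_in))  # cluster centers (e.g., from k-means)
widths = np.ones(n_rbf)                   # Gaussian spreads (placeholder)
W, b = rng.normal(size=(n_out, n_rbf)), np.zeros(n_out)

def forward(x):
    # RBF hidden layer: Gaussian response to the distance from each center
    h = np.exp(-np.sum((x - centers) ** 2, axis=1) / (2 * widths ** 2))
    # Output layer with the tanh transfer function from the design table
    return np.tanh(W @ h + b)

print(forward(rng.normal(size=n_in)))
```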
Results
References