13.lec 12 Transfer Learning 2

Outline DTL Solutions Data Collection in DTL Suggestions to select DTL algorithms Further Challen
Intelligent Control and Fault Diagnosis

Lecture 11: Transfer Learning in IFD 2
Farzaneh Abdollahi
Department of Electrical Engineering
Amirkabir University of Technology
Winter 2024
Farzaneh Abdollahi Intelligent Fault Diagnosis Lecture 11 1/53

DTL Solutions
Generalization Performance Improvement
Varying WCs
Different Machines
Partial-Domain FD
Emerging FD
Compound Fault Decoupling
Data Collection in DTL
Suggestions to select DTL algorithms
Further Challenges To Investigate!

DTL Solutions
▶ DTL solutions are provided for four application scenarios:

1. Generalization performance improvement: the label space of the target
domain is identical to the label space of the source domain,
Y_T ≡ Y_S.
2. Partial-domain fault diagnosis: the label space of the target domain is a
proper subset of the label space of the source domain, Y_T ⊂ Y_S.
3. Emerging fault detection: the label space of the target domain is a
proper superset of the label space of the source domain, Y_T ⊃ Y_S.
4. Compound fault decoupling: the label space of the target domain is
different from the label space of the source domain, but each fault in the
target domain is a coupling of multiple single faults from the source
domain.
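
The first three scenarios are defined purely by set relations between the two label spaces, so they can be told apart mechanically. The following is a minimal sketch, not part of the lecture; the function name and the example label names are hypothetical.

```python
def dtl_scenario(source_labels, target_labels):
    """Name the DTL scenario implied by the source and target label spaces."""
    y_s, y_t = set(source_labels), set(target_labels)
    if y_t == y_s:
        return "generalization performance improvement"
    if y_t < y_s:  # proper subset: Y_T is strictly contained in Y_S
        return "partial-domain fault diagnosis"
    if y_t > y_s:  # proper superset: Y_T strictly contains Y_S
        return "emerging fault detection"
    # Neither space contains the other, e.g. compound faults in the target
    return "other (e.g. compound fault decoupling)"


source = ["normal", "inner", "outer"]
print(dtl_scenario(source, ["normal", "inner"]))
print(dtl_scenario(source, ["normal", "inner", "outer", "cage"]))
```

Set comparison cannot confirm the fourth scenario by itself: whether a target fault is a coupling of source faults requires domain knowledge about the fault, not just its label.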

DTL Solutions
1. Generalization performance improvement: In this scenario, the label
space of the target domain is identical to the label space of the source
domain, that is, Y_T ≡ Y_S. This imposes a strict restriction on the
fault types of the domains; the focus is on improving the
generalization performance of the DTL model under varying environments.
[1]

▶ The goal is to learn a robust deep model that can be deployed under
varying environments, such as:
▶ Varying working conditions (WCs)
▶ Different machines
▶ Imbalanced instances
▶ Different sensors, etc.
[1]

Generalization Performance Improvement (Varying WCs)

▶ Instance-based DTL solutions:
▶ One of the most popular instance-reweighting strategies is the fast
TrAdaBoost algorithm [2]. It:
▶ decreases the weights of low-quality instances,
▶ increases the weights of high-quality instances through iterative
updates,
▶ has been successful in enhancing the generalization performance of the fault
diagnosis model under varying WCs.
▶ Instance-based approaches are effective for varying WCs. However,
their performance depends to some extent on the number and
quality of target instances, and they may struggle in more challenging
scenarios with significant discrepancies between the source and
target domains.
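
To make the reweighting idea concrete, here is a sketch of one weight-update round in the classic TrAdaBoost scheme as it is commonly described. The fast variant used in [2] may differ in its details, and the function and argument names are assumptions for illustration.

```python
import math


def tradaboost_update(weights, errors, is_source, n_rounds):
    """One round of the classic TrAdaBoost weight update (illustrative sketch).

    weights   : current instance weights (source and target instances mixed)
    errors    : per-instance error |h(x) - y| in [0, 1] of the current learner
    is_source : True for source-domain instances, False for target-domain ones
    n_rounds  : total number of boosting rounds N
    """
    n_source = sum(is_source)
    # Weighted error of the current learner, measured on target instances only.
    tgt_weight = sum(w for w, s in zip(weights, is_source) if not s)
    eps = sum(w * e for w, e, s in zip(weights, errors, is_source) if not s) / tgt_weight
    eps = min(max(eps, 1e-10), 0.499)  # keep beta_t well defined (eps < 0.5)
    beta_t = eps / (1.0 - eps)
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_source) / n_rounds))
    updated = []
    for w, e, s in zip(weights, errors, is_source):
        if s:
            updated.append(w * beta ** e)       # misclassified source: weight shrinks
        else:
            updated.append(w * beta_t ** (-e))  # misclassified target: weight grows
    total = sum(updated)
    return [w / total for w in updated]
```

Source instances that the current learner misclassifies are treated as low quality and lose weight, while misclassified target instances gain weight, which matches the two bullet points above.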

▶ Model-based DTL solutions:

▶ A popular solution is to pretrain a deep model such as VGG-16, VGG-19,
AlexNet, or another CNN, and then fine-tune it using the labeled
instances under the target WC.
▶ Fine-tuning can be done at different levels (for example, on a
CNN) [3]:
▶ fine-tuning only the classifier,
▶ fine-tuning only the feature extractor,
▶ fine-tuning the whole CNN model.
▶ The experimental validations in [3] show that these transfer
strategies can effectively transfer useful features to the target
task and reportedly achieve the highest accuracy for the generalization problem
across WCs.
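
The three fine-tuning levels differ only in which parameters are allowed to change. The sketch below illustrates this with plain Python; the parameter-name prefixes `features.` and `classifier.` and the function name are assumptions, not the setup of [3].

```python
def trainable_flags(param_names, level):
    """Return {name: bool} marking which parameters are updated during fine-tuning.

    level: "classifier", "feature_extractor" or "all".
    Parameter names are assumed to start with "features." or "classifier.".
    """
    if level not in {"classifier", "feature_extractor", "all"}:
        raise ValueError(f"unknown fine-tuning level: {level}")
    flags = {}
    for name in param_names:
        in_classifier = name.startswith("classifier.")
        if level == "all":
            flags[name] = True               # update every layer
        elif level == "classifier":
            flags[name] = in_classifier      # freeze the feature extractor
        else:
            flags[name] = not in_classifier  # freeze the classifier
    return flags


names = ["features.conv1.weight", "features.conv2.weight", "classifier.fc.weight"]
print(trainable_flags(names, "classifier"))
```

In a framework such as PyTorch, these flags would typically be applied by setting each parameter's `requires_grad` attribute before building the optimizer.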


▶ Another method is to jointly train the source and target tasks in
a multibranch deep model (for example, [4]):
▶ two convolutional classification networks are designed, one for the source
task and one for the target task,
▶ the two networks share weights with each other.
▶ Compared with training a model from scratch,
model-based DTL solutions benefit from faster convergence
and a reduced risk of overfitting.
▶ The inherent limitation of these solutions is that fine-tuning
relies heavily on labeled training data in the target domain.


▶ Feature-based DTL solutions:
▶ For varying WCs, the features learned by deep models are expected to
be insensitive to the WCs (e.g., speed-insensitive and
load-insensitive) so that the model generalizes better.
▶ The main efforts are on how to learn universal features under varying
WCs, from the following three aspects:
▶ Discrepancy-based domain adaptation: a variety of criteria, such as
Maximum Mean Discrepancy (MMD), Conditional MMD (CMMD),
MK-MMD, KL divergence, CORAL, Central Moment Discrepancy
(CMD), and Maximum Classifier Discrepancy (MCD), have been
widely introduced into the objective function to measure the feature
discrepancy between the source and target WCs.
▶ In [5], an MMD term was used in the objective function to reduce the
distribution discrepancy between different WCs. It was the first time
that a transfer learning technique, i.e., domain adaptation, was
applied to train a deep model in the field of IFD.
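
As an illustration of a discrepancy criterion, the sketch below computes the biased empirical estimate of the squared MMD with a single Gaussian kernel. The kernel choice, the bandwidth `sigma`, and the function names are assumptions; [5] may use a different kernel or estimator.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mmd2(source, target, sigma=1.0):
    """Biased estimate of the squared MMD between two sets of feature vectors."""
    def mean_kernel(a, b):
        return sum(gaussian_kernel(x, y, sigma) for x in a for y in b) / (len(a) * len(b))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


features_wc1 = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
features_wc2 = [[3.0, 3.0], [4.0, 3.0], [3.0, 4.0]]  # same shape, shifted
print(mmd2(features_wc1, features_wc1), mmd2(features_wc1, features_wc2))
```

The estimate is zero when both sets coincide and grows as the two feature distributions move apart, which is why adding it to the training loss pushes the network toward WC-insensitive features.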


▶ Adversarial-based domain adaptation: GANs have been employed to help
the deep model learn task-sensitive but domain-insensitive features
for the target tasks, in either a generative or a non-generative
way.
▶ In generative adaptation, deep GANs and their variants have been
used to generate different types of data, such as frequency-domain
data and time–frequency-domain data, with the help of
available source data. The generated and real data are then used
to train an extra deep model, achieving reliable diagnosis results even when
data from the target WCs are not available during model training [6].
▶ The main challenge for generative adaptation is the difficulty of
quantitatively evaluating the quality of the generated data with effective metrics.
▶ In non-generative adaptation, several deep DA frameworks, such as the
Domain Adversarial Transfer Network (DATN), the Domain-Adversarial
Neural Network (DANN), Adversarial Discriminative Domain Adaptation (ADDA), and the
Wasserstein GAN (W-GAN), have been developed for FDI under
varying WCs [7].

▶ Reconstruction-based domain adaptation: some works
use encoder–decoder reconstruction to enhance the generalization
capability of the diagnosis model under different WCs [8].

Generalization Performance Improvement (Diff. Machines)

▶ For varying WCs, the data used for model training and testing are
both measured on the same machine under different WCs.
▶ Transfer across different machines is more complicated because of different
mechanical structures, materials, and sizes.
▶ These factors lead to a more significant distribution shift between
the training and testing data than in the scenario of varying WCs.
▶ There are three typical applications across different machines:
▶ transfer from laboratory to industry
▶ transfer from simulation to reality
▶ transfer from past to future
▶ If the knowledge transfer is done properly, it will
▶ largely eliminate the dependency on fault data collected from the
target machine
▶ potentially reduce the economic cost of maintaining the
target machine

▶ Instance-based DTL solutions:
▶ Usually the loss function is modified to transfer the knowledge.
▶ In [9], a metric named Optimal Transport-embedded Similarity
Measure is introduced for analyzing the transferability of diagnostic
knowledge across machines, in which cluster-conditional distributions
are explored to assign cluster labels to the target instances.
▶ [10] proposed an instance-based discriminative loss to explore
DTL in different-machine scenarios where only normal samples are
available in the dataset of the target machine.


▶ Model-based DTL solutions:
▶ A network pretrained on source data (e.g., non-manufacturing data) is
adopted as the backbone model and then fine-tuned on the target
machines (e.g., manufacturing machines) to transfer common
latent features among different machines [11].
▶ In other approaches, such as [12], a Transferable Convolutional Neural
Network (TCNN) is proposed, which exploits the knowledge learned
from different source machines to improve the generalization
performance of the target task.
▶ Its core idea is that the layers and parameters of the pretrained
TCNN are first subdivided into several blocks, and then each block
is fine-tuned in reverse order.
▶ This strategy is also applicable to DBN, SAE, and LSTM.
▶ The model-based solutions, especially the fine-tuning algorithm, are
comparatively easy to implement in different-machine scenarios.
▶ However, the performance decreases dramatically if the labeled instances
are insufficient or unavailable.

▶ Feature-based DTL solutions:
▶ Discrepancy-based domain adaptation is still one of the most
popular and promising solutions for fault diagnosis
across different machines (e.g., from lab to real industrial machines),
and it has brought clear improvements over traditional DL
methods.
▶ In these scenarios, discrepancy-based domain adaptation is
even more popular than the other feature-based approaches,
i.e., the adversarial-based and reconstruction-based domain
adaptation techniques.


▶ In [13], a domain-shared CNN was trained by simultaneously
minimizing three discrepancies:
▶ the classification discrepancy of the labeled instances in the source
domain,
▶ the classification discrepancy of the unlabeled instances in the target
domain, with the help of pseudo-label learning,
▶ the multilayer MMD discrepancy of the learned representations across
domains.
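
Schematically, the three terms above can be combined into a single training objective. The form below is only a sketch: the symbols, the trade-off weights α and β, and the layer set M are assumptions introduced for illustration and are not the notation of [13].

```latex
\min_{\theta}\; \mathcal{L}(\theta)
  = \underbrace{\mathcal{L}_{\mathrm{cls}}\!\left(X^{S}, Y^{S}; \theta\right)}_{\text{labeled source instances}}
  + \alpha\, \underbrace{\mathcal{L}_{\mathrm{cls}}\!\left(X^{T}, \hat{Y}^{T}; \theta\right)}_{\text{target instances with pseudo labels}}
  + \beta \sum_{l \in \mathcal{M}} \underbrace{\mathrm{MMD}^{2}\!\left(F_{l}^{S}, F_{l}^{T}\right)}_{\text{layer-}l\text{ feature discrepancy}}
```

Here θ denotes the shared CNN parameters, Ŷ^T the pseudo labels predicted for the target instances, and F_l^S, F_l^T the layer-l representations of the source and target data.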

▶ Other scenarios
▶ There are other challenges, such as different types and sampling frequencies of
sensors and different numbers of training instances per class, which result
in imbalanced instances and large distribution differences in realistic
industrial data.
▶ Several approaches have been introduced for these scenarios.
▶ E.g., when the instances of each fault class
are imbalanced during model training, the minority-class
instances can be generated by a GAN to balance the dataset [14].

DTL Solutions
2. Partial-domain FD: the label space of the target domain is a proper subset
of the label space of the source domain, Y_T ⊂ Y_S. This relaxes the
same-label-space requirement and mainly focuses on transferring
knowledge from a large-scale but redundant source domain to an
unknown small-scale target domain.
[1]

Partial-Domain FD
▶ Motivation: to make use of historical labeled data and open-source
industrial data collected from related scenarios.
▶ Goal: to train a diagnosis model that can transfer knowledge from a
large-scale but redundant source domain to an unknown small-scale
target domain.
[1]

Partial-Domain FD
▶ Challenges:
▶ The target-domain data are unlabeled.
▶ Annotating data is expensive and sometimes impossible.
▶ Outlier source faults may lead to negative transfer.
▶ The large-scale but redundant source dataset is assumed to be diverse enough to
subsume all fault classes of the small-scale target dataset.
▶ Directly transferring between the entire source and target domains, as
popular DTL methods do, is neither optimal nor effective.
▶ Solution: align the distributions of the source and target domains so as to
promote positive transfer of the relevant data and alleviate negative
transfer of the irrelevant data.
▶ Model-based approaches inherently rely on the label information of
target instances, so they cannot help in these scenarios.

Partial-Domain FD

▶ Instance-based DTL solutions:
▶ An intuitive solution is to single out the outlier instances in the source domain
that have a negative effect on building the target model.
▶ This idea can be implemented by adopting instance-level or class-level
weighting strategies during model training.
▶ E.g., class weights are calculated from a classifier inconsistency loss;
from these weights the label space of the target domain is estimated, and the
source instances outside the shared label space of the source and target
domains are identified according to the class weights [15].
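
A class-level weighting strategy can be sketched as follows. This is a simplified stand-in that averages the predicted class probabilities over the unlabeled target instances; it is not the classifier-inconsistency estimate of [15], and the function names and the threshold value are assumptions.

```python
def class_weights(target_probs):
    """Average the predicted class probabilities over all target instances
    and scale the result so that the largest weight equals 1."""
    n_classes = len(target_probs[0])
    mean = [sum(p[c] for p in target_probs) / len(target_probs) for c in range(n_classes)]
    top = max(mean)
    return [m / top for m in mean]


def shared_classes(weights, threshold=0.2):
    """Indices of source classes that likely also occur in the target domain."""
    return [c for c, w in enumerate(weights) if w >= threshold]


# Predictions of a source-trained classifier on three unlabeled target instances
probs = [[0.90, 0.08, 0.02],
         [0.10, 0.85, 0.05],
         [0.80, 0.15, 0.05]]
weights = class_weights(probs)
print(weights, shared_classes(weights))
```

Source classes that the target data rarely activate receive a small weight and can be down-weighted or excluded during adaptation, which reduces the risk of negative transfer from outlier source faults.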

Partial-Domain FD

▶ Feature-based DTL solutions:
▶ E.g., inspired by GANs, [16] combines two techniques:
▶ Conditional data alignment, implemented by minimizing the
distribution discrepancy between the source and target domains through
MMD.
▶ Unsupervised prediction consistency, achieved when the multiple
classification modules give the same predictions for the target-domain data after
the adversarial learning between the classification modules and
the discriminator.
▶ Although the approaches in both groups of DTL solutions seem
effective, how to select the labeled source instances and determine the range
of the source domain from numerous low-quality industrial data, according to
the characteristics of the target-domain data, is still
a challenging problem.

DTL Solutions

3. Emerging FD: the label space of the target domain is a proper superset of
the label space of the source domain, Y_T ⊃ Y_S. This relaxes the
assumption of an identical label space and mainly focuses on detecting
new faults that never occurred in the source domain.
[1]

Emerging FD
▶ Motivation: unpredicted faults are prone to occur since the machines
typically operate in complex and uncertain environments during
long-term service.
▶ Goal: to expand the diagnosis knowledge by detecting unknown
faults that are absent from the labeled source dataset and annotating them with
correct labels.
[1]

Emerging FD
▶ Challenges:
▶ No knowledge about the new faults is available.
▶ Separating the known and unknown fault classes in an unsupervised
manner is difficult.
▶ The emerging fault classes may also jeopardize the knowledge
alignment between the source and target domains, because
emerging faults are absent from the source domain.
▶ Negative transfer will happen if the distribution of the target domain is
directly matched with that of the whole source domain.

Emerging FD
▶ Using similarity metric learning

▶ A sparse AE (SAE) extracts features of the known faults, and a Gaussian
prior distribution of these features is fitted on the training
samples. If the distribution of unknown instances deviates from the
prior distribution of the known faults, they are detected as a new fault and
used to retrain the diagnosis model [17].
▶ Cosine similarity can also be used to detect new faults.
▶ However, these methods cannot deal with diagnosis tasks in complex
application scenarios where a distribution shift exists between the
training and testing data.
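
The cosine-similarity variant can be sketched in a few lines. The class prototypes (for example, mean feature vectors of each known fault), the threshold of 0.8, and the function names are assumptions for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def detect(feature, prototypes, threshold=0.8):
    """Return the best-matching known fault, or "unknown" if none is similar enough."""
    best_label, best_sim = max(
        ((label, cosine(feature, proto)) for label, proto in prototypes.items()),
        key=lambda item: item[1],
    )
    return best_label if best_sim >= threshold else "unknown"


prototypes = {"normal": [1.0, 0.0, 0.0], "inner": [0.0, 1.0, 0.0]}
print(detect([0.9, 0.1, 0.0], prototypes))
print(detect([0.0, 0.1, 1.0], prototypes))
```

Because the prototypes are fixed in the training feature space, a distribution shift between training and testing data moves all test features away from them, which is the weakness noted in the last bullet above.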

Emerging FD
▶ E.g., the Deep Adversarial Transfer Learning Network (DATLN) has two
components trained adversarially [18]:
▶ a feature extractor
▶ extracts features from the input data; the classifier then outputs a
(K + 1)-dimensional probability vector,
▶ K: the number of known faults in the source domain; the (K + 1)-th output:
the probability of the unknown fault.
▶ a classifier
▶ builds a decision boundary to recognize the unknown fault in the target
domain.
▶ Limitation: such methods detect all unknown faults as a single category,
even if multiple emerging faults exist.
▶ More complex emerging FD settings, e.g., when transfer across machines and
sensors is involved, are still open to investigation.
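
The K + 1 output convention can be illustrated with a small decision rule. This is only a sketch of how such an output vector would be read; it does not reproduce the network or training procedure of [18], and the function name and fault names are hypothetical.

```python
def open_set_predict(probs, known_faults):
    """Read a (K + 1)-dimensional output: the last entry is the unknown-fault probability."""
    if len(probs) != len(known_faults) + 1:
        raise ValueError("expected K + 1 probabilities")
    idx = max(range(len(probs)), key=probs.__getitem__)
    return "unknown" if idx == len(known_faults) else known_faults[idx]


known = ["normal", "inner", "outer"]          # K = 3 known faults
print(open_set_predict([0.1, 0.7, 0.1, 0.1], known))
print(open_set_predict([0.1, 0.1, 0.2, 0.6], known))
```

Since there is only one extra output, two physically different emerging faults both map to "unknown", which is exactly the limitation stated above.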

DTL Solutions

4. Compound fault decoupling: the label space of the target domain is
different from the label space of the source domain, but each fault in the
target domain is a coupling of multiple single faults from the source
domain.
[1]


▶ Goal: given knowledge of the single faults, decouple the compound
fault intelligently.
[1]

▶ Challenges:
▶ Sometimes the key parts are defective or even damaged.
▶ The fault characteristics of the components are coupled and influence
each other, so decoupling them is difficult.
▶ It is difficult and unrealistic to collect all types of compound fault data
in industrial cases.
▶ A traditional classifier that uses softmax as the activation function of
the last fully connected layer outputs only one label per testing
instance. It therefore treats the compound fault as an independent pattern for
classification and ignores the relationship between the compound fault and its
corresponding single faults.
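
The softmax limitation is easy to see numerically. The sketch below contrasts a single-label softmax decision with an independent per-class sigmoid decision, which is one common multi-label alternative; the logits, fault names, and threshold are made-up values for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def single_label(logits, names):
    """Softmax head: exactly one label per instance."""
    probs = softmax(logits)
    return names[probs.index(max(probs))]


def multi_label(logits, names, threshold=0.5):
    """Independent sigmoid per class: every class above the threshold is reported."""
    return [n for z, n in zip(logits, names) if 1.0 / (1.0 + math.exp(-z)) >= threshold]


names = ["inner", "outer", "ball"]
logits = [2.0, 1.5, -3.0]  # strong evidence for both inner and outer race faults
print(single_label(logits, names))
print(multi_label(logits, names))
```

With softmax, the evidence for the second fault is discarded in the final decision, whereas the per-class decision can report both single faults that make up the compound fault.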


▶ E.g., by introducing MMD into the last layer of the feature
extractor, a Transferable Capsule Network (TCN) is introduced for decoupling
compound faults of rotating machinery under varying WCs [19].
▶ [20] considers that data cannot be obtained in advance
for some special and extreme WCs, and proposes a Deep Adversarial
Capsule Network (DACN), which embeds the domain generalization
task into the intelligent compound fault diagnosis task. It has three
parts:
▶ the feature extractor (FE),
▶ the decoupling classifier (DC),
▶ the multidomain classifier (MC).
▶ Using single-fault data collected under multiple WCs, an
adversarial training strategy is employed to train the DACN.
▶ It can decouple the compound fault intelligently and also
generalizes well to unseen working conditions.

Data Collection in DTL
▶ An advantage of DTL: labeled data from related but different
domains can be used to help train the target model.
▶ Two approaches to collecting the source-domain data:
▶ Use historical labeled data collected from similar machines.
▶ Select similar data from open-source industrial big datasets (e.g.,
PHM).


Suggestions to select DTL algorithms
▶ The proper choice of DTL algorithm depends on three factors:
▶ the availability of labeled data in the target domain
▶ the similarity between the source and target data
▶ the objective of using DTL
▶ Generalization performance improvement
▶ Labeled data are available in the target domain, and the source and
target domains are similar.
▶ The most efficient option is the model-based DTL
algorithms, more specifically, the fine-tuning strategy.
▶ Since labeled data are available, the target model can be trained
in a supervised manner.
▶ Since the gap between the source and target domains is small, it should
be enough to directly merge the source and target data into one
training dataset or to fine-tune the top layers of the pretrained deep
model.


▶ Labeled data are available in the target domain, and the source and
target domains are different.
▶ If the labeled data are sufficient for training the target model, it
is suggested to fine-tune all the layers of the pretrained model.
▶ If the labeled data are insufficient, feature-based DTL algorithms,
such as discrepancy-based and non-generative domain adaptation,
could be useful.
▶ Labeled data are unavailable in the target domain, and the source and
target domains are similar.
▶ Instance-based DTL algorithms, such as weight estimation
based on kernel embedding techniques and heuristic weighting
strategies, would be a good choice for selecting the positive instances in the
source domain to help train the target model.


▶ Labeled data are unavailable in the target domain, and the source and
target domains are different.
▶ Feature-based DTL algorithms (discrepancy-based and
adversarial-based domain adaptation) are suggested.
▶ A hybrid DTL algorithm that combines instance-based and
feature-based domain adaptation is also a promising tool.
▶ Partial Domain Fault Diagnosis

▶ Labeled data are unavailable in the target domain, and source and
▶ The instance-based DTL algorithm, e.g., the class weight-estimation
strategy, is recommended to single out the instances in the shared
classes, which are then used to train the target model.
▶ Considering the demand for reducing the domain discrepancy and
avoiding the negative transfer, the feature-based DTL algorithms,
especially the adversarial-based domain adaptation (SAN or GAN +
PK-MMD), should be given priority.

▶ Emerging Fault Detection

▶ An effective solution would be to apply traditional DL-based
models that detect new faults by calculating a similarity metric
between the testing and the labeled instances.
▶ The Open Set Domain Adaptation (OSDA) algorithm and its variants
(in instance-based approaches) would be more practical and effective
for addressing emerging fault detection.

▶ Compound Fault Decoupling

▶ Even under an identical working condition and on the same machine, it is
challenging to intelligently decouple the compound fault with a
target model trained only on single-fault instances.
▶ If labeled compound fault data are available, a DNN with a
multi-label classifier (MLC) can be trained in a supervised manner
and then applied to compound fault detection.
▶ Otherwise, the deep model can be trained using only the normal and
single-fault data, and then a rule (e.g., triple probabilistic terms) can
be used to restrict the output labels of the classifier. According to results
reported in the literature, the capsule network is the best choice for
compound fault decoupling [21].

▶ Compound Fault Decoupling

▶ Since domain shift is introduced by varying environments,
DNN-based algorithms do not perform well in this situation.
▶ An effective and promising solution can be to combine the capsule
networks with the feature-based DTL algorithms, such as the
discrepancy-based and the adversarial-based domain adaptation.

Further Challenges To Investigate!
▶ Stability and reliability

▶ Current DTL-based IFD methods can only accomplish
well-defined transfer tasks that often come with restrictions on WCs,
machines, and other assumptions, so the IFD
model is not yet robust enough to deal with uncertain
circumstances.
▶ For a trained IFD model, an uncertain change in the input caused by human or
non-human factors could cause a large change in the
output.

▶ Interpretability of deep model

▶ These methods are perceived as “black box” techniques and are
not interpretable.
▶ They lack compelling evidence to convince companies or industry that the
techniques will work repeatedly. Industrial applications have strict
requirements for safety and accuracy, so the
reasonableness of the prediction decisions needs to be explained.
▶ With the help of domain knowledge from physical/statistical
models, the “black box” of the IFD model can be partly opened, making it
easier to understand how decisions are reached step by step.

▶ Hyperparameters of deep model

▶ In most publications, the hyperparameters are selected via
manual setting and experimental validation based on grid search,
which is time-consuming when the goal is to ensure the model achieves its
optimum performance.
▶ Automated machine learning (AutoML) might be an effective solution to
this problem.

▶ Capacity of data processing

▶ Heterogeneous data
▶ An industrial plant is a typical multi-source heterogeneous data
environment. For instance, in wind farms, large amounts of
high-frequency data (current, acoustic emission, and vibration
signals) and low-frequency data (environmental indices, working
condition information, and control parameters) are collected
from the Supervisory Control and Data Acquisition (SCADA) system
and the Condition Monitoring System (CMS).
▶ Since multi-source heterogeneous data provide different information
about the same health states of a machine, it is possible to transfer
diagnosis knowledge from one sensor's data to another's, which may
greatly improve the stability and reliability of IFD algorithms.
▶ Heterogeneous transfer learning between multiple sensors is an
attractive future research topic.


▶ Capacity of data processing
▶ Data privacy and protection
▶ Data privacy is a top priority for any company, and it poses significant obstacles
to applying IFD algorithms in practical industry.
▶ Combining federated learning and DTL to build and train an
effective and accurate IFD model is an interesting research topic.
▶ Challenges in transfer learning
▶ Identifying an appropriate source domain
▶ Identifying an appropriate source domain that includes sufficient training
instances annotated with precise label information for the
target tasks is still a challenging problem in industrial big data.
▶ With the rapid development of digital technologies such as digital
twins, one promising way is to use simulation techniques to
generate training data as the source domain in such a scenario.
▶ Transferring knowledge from multiple source domains has attracted
more and more attention recently.

▶ Challenges in transfer learning

▶ Avoiding Negative Transfer.
▶ One of the effective measures to improve the performance of the
DTL-based IFD model in industrial scenarios is to transfer only the
common knowledge that can contribute to the target learning task
and to avoid negative transfer at the same time.
▶ For example, developing accurate “distance” metrics between the
domains might be a feasible solution for avoiding negative transfer
since existing metrics used in feature-based DTL are not powerful
enough to develop a perfect transfer learning application.

▶ Computation and energy efficiency

▶ DTL-based IFD methods suffer from high requirements on
computational resources and speed.
▶ The capability of real-time monitoring is fundamental to PHM
systems: it can improve the safety of mechanical systems, identify
potential faults as soon as they occur, allow for early maintenance,
and avoid system failures.
▶ Techniques including efficient neural network compression,
incremental learning, and deep reinforcement learning are potential
research directions for improving the real-time capability of DTL-based IFD
algorithms.

References
[1] W. Li, R. Huang, J. Li, Y. Liao, Zh. Chen, G. He, R. Yan, and
K. Gryllias, “A perspective survey on deep transfer learning for
fault diagnosis in industrial scenarios: Theories, applications
and challenges,” Mechanical Systems and Signal Processing,
vol. 167, 2022.
[2] R. L. F. Shen and R. Yan, “Exploring sample/feature hybrid
transfer for gear fault diagnosis under varying working
conditions,” J. Comput. Inf. Sci. Eng., vol. 20, no. 4, 2020.
[3] W. Y. T. Han, C. Liu and D. Jiang, “Learning transferable
features in deep convolutional neural networks for diagnosing
unseen machine conditions,” ISA Trans., vol. 93, pp. 341–353,
2019.
[4] N. Z. X. Cao, B. Chen, “A deep domain adaption model with
multi-task networks for planetary gearbox fault diagnosis,”
Neurocomputing, vol. 409, pp. 173–190, 2020.
[5] Y. C. D. M. J. Y. W. Lu, B. Liang and T. Zhang, “Deep
model based domain adaptation for fault diagnosis,” IEEE
Trans. Ind. Electron., vol. 64, no. 3, pp. 2296–2305, 2017.
[6] Y. Z. Z. Shi, J. Chen and Z. Zhou, “A novel multitask
adversarial network via redundant lifting for multicomponent
intelligent fault detection under sharp speed variation,” IEEE
Trans. Instrum. Meas., vol. 70, pp. 1–10, 2021.
[7] J. L. Z. C. Y. Liao, R. Huang and W. Li, “Deep
semisupervised domain generalization network for rotary
machinery fault diagnosis under variable speed,” IEEE Trans.
Instrum. Meas., vol. 69, no. 10.
[8] H. W. L. C. Z. Liu, L. Jiang and X. Li, “Optimal
transport-based deep domain adaptation approach for fault
diagnosis of rotating machine,” IEEE Trans. Instrum. Meas.,
vol. 70, pp. 1–12, 2021.
[9] S. X. B. Yang, Y. Lei and C. Lee, “An optimal
transport-embedded similarity measure for diagnostic
knowledge transferability analytics across machines,” IEEE
Trans. Ind. Electron., doi: 10.1109/TIE.2021.3095804, 2021.
[10] J. Y. Y. L. R. W. H. Zheng, Y. Yang and M. Xu, “Deep
domain generalization combining a priori diagnosis knowledge
toward cross-domain fault diagnosis of rolling bearing,” IEEE
Trans. Instrum. Meas., vol. 70, 2021.
[11] P. Wang and R. Gao, “Transfer learning for enhanced machine
fault diagnosis in manufacturing,” CIRP Ann., vol. 69, no. 1,
pp. 413–416, 2020.
[12] K. G. Z. Chen and W. Li, “Intelligent fault diagnosis for rotary
machinery using transferable convolutional neural network,”
IEEE Trans. Ind. Informat., vol. 16, no. 1, pp. 339–349, 2020.
[13] F. J. B. Yang, Y. Lei and S. Xing, “An intelligent fault
diagnosis approach based on transfer learning from laboratory
bearings to locomotive bearings,” Mech. Syst. Signal Process.,
vol. 122, pp. 692–706, 2019.
[14] P. S. M. Zareapoor and J. Yang, “Oversampling adversarial
network for class-imbalanced fault diagnosis,” Mech. Syst.
Signal Process., vol. 149, 2021.
[15] J. L. J. Jiao, M. Zhao and C. Ding, “Classifier
inconsistency-based domain adaptation network for partial
transfer intelligent diagnosis,” IEEE Trans. Ind. Informat.,
vol. 16, no. 9, pp. 5965–5974, 2020.
[16] W. Z. X. Li, “Deep learning-based partial domain adaptation
method on intelligent machinery fault diagnostics,” IEEE
Trans. Ind. Electron., vol. 68, no. 5, pp. 4351–4361, 2021.
[17] W. L. J. L. S. Zhang, M. Wang and Z. Lin, “Deep learning
with emerging new labels for fault diagnosis,” IEEE Access,
vol. 7, pp. 6279–6287, 2019.
[18] R. H. J. Li and W. Li, “Intelligent fault diagnosis for bearing
dataset using adversarial transfer learning based on stacked
auto-encoder,” Procedia Manuf., vol. 49, pp. 75–80, 2020.
[19] R. Huang, Z. Wang, J. Li, J. Chen, and W. Li, “A transferable
capsule network for decoupling compound fault of machinery,”
in Proc. IEEE Int. Instrum. Meas. Technol. Conf. (I2MTC),
Dubrovnik, Croatia, pp. 1–6, May 2020.
[20] Y. L. J. C. Z. W. R. Huang, J. Li and W. Li, “Deep adversarial
capsule network for compound fault diagnosis of machinery
toward multidomain generalization task,” IEEE Trans. Instrum.
Meas., vol. 70, pp. 1–11, 3506311, 2021.
[21] “Introduction to capsule neural networks ml.”
https://www.geeksforgeeks.org/capsule-neural-networks-ml/
(available date: May 2024).
