


Intelligenza Artificiale 15 (2021) 35–44
DOI 10.3233/IA-200075
IOS Press

Federated transfer learning: Concept and applications

Sudipan Saha^a,∗ and Tahir Ahmad^b

^a Data Science in Earth Observation, Technical University of Munich, Taufkirchen, Germany
^b Security & Trust Unit, Fondazione Bruno Kessler, Trento, Italy

Abstract. The development of Artificial Intelligence (AI) is inherently tied to the development of data. However, in most industries data exists in the form of isolated islands, with limited scope for sharing between different organizations. This is a hindrance to the further development of AI. In the last few years, federated learning has emerged as a possible solution to this problem that does not compromise user privacy. Among the different variants of federated learning, noteworthy is federated transfer learning (FTL), which allows knowledge to be transferred across domains that do not have many overlapping features and users. In this work we provide a comprehensive survey of the existing works on this topic. In more detail, we study the background of FTL and its different existing applications. We further analyze FTL from the privacy and machine learning perspectives.

Keywords: Federated learning, transfer learning, machine learning, privacy-preservation, privacy



1. Introduction

Currently machine learning is playing an important role in many applications, including commodity recommendation, risk analysis, and image analysis. This is due to its excellent capability to extract insights from data. Machine learning and deep learning techniques strongly depend on the availability of data [1]. Abundance of data has led to the overwhelming success of machine learning in image analysis [2], remote sensing [3], and many other fields. Traditionally, such machine learning or deep learning models are trained over a centralized corpus of data. While it is easy to collect image data, such data are also intuitive to label without strong domain knowledge. Moreover, image data is generally not sensitive and is shared without restriction among different parties. However, training data is not easy to obtain in some industries, e.g., finance and healthcare [4]. Labeling data from such industries requires strong professional expertise. Moreover, data in such industries are generally protected by different privacy- and security-related restrictions [5]. Additionally, there exist practical risks of data abuse once the data are shared with third parties. In such industries, data exists in the form of isolated islands [6]. Competition between different organizations also hinders sharing, and thus the exposure of one's user data to another. Thus, such industries own insufficient data to train reliable machine learning frameworks.

Federated Learning (FL) can be used to overcome the above-mentioned constraints by using data from different organizations to train a machine learning model without violating the different data-related regulations. The FL system was first proposed by Google in 2016 [7–9]. Their method, proposed for mobile devices, enables users to train a centralized model while their data are stored locally. Thus, the federated learning technique can be used to prevent the leakage of private information. While centralized learning needs to collect data from users and store them in a centralized server, federated learning can learn a global model while the data remain distributed on the users' devices.

∗ Corresponding author: Sudipan Saha, Data Science in Earth Observation, Technical University of Munich, Taufkirchen, Germany. E-mail: sudipan.saha@tum.de.

ISSN 1724-8035/$35.00 © 2021 – IOS Press. All rights reserved.


Many other works adopted the federated learning framework [10, 11]. This led to the emergence of sub-groups within federated learning, e.g., horizontal federated learning and vertical federated learning [6].

A constraint imposed by traditional federated learning is that the training data owned by the different organizations need to share the same feature space. In practice, this is rarely the case in industries like finance or healthcare. To mitigate this shortcoming, Federated Transfer Learning (FTL) was proposed [12]. Different participants in FTL can have their own specific feature spaces, which makes it suitable for practical scenarios. FTL takes inspiration from transfer learning, a paradigm already popular in image analysis [13]. In this setting, a machine learning model trained on a large dataset for one problem/domain is applied to a different but related problem/domain [14]. The performance of transfer learning is strongly dependent on the interrelation between the different domains. In federated learning, the stakeholders in the same data federation are usually organizations from the same industry. Thus, it is suitable to apply transfer learning in the federated learning framework.

Federated transfer learning lies at the intersection of two different but fast-evolving fields: machine learning and information privacy. Thus, it is imperative to bridge the gap between them to fully exploit the benefits of federated transfer learning. This motivated us to investigate the different aspects related to federated transfer learning. It is important to understand how transfer learning and federated learning combined to give rise to federated transfer learning. It is critical to understand how FTL has been applied in different real-life applications. It is important to understand the different privacy and machine learning aspects related to FTL. Keeping these in mind, in this paper we present a comprehensive survey of federated transfer learning. Towards this we: 1) outline the definition of FTL, 2) present some case studies on FTL, 3) analyze FTL from the privacy aspect, and 4) analyze FTL from the machine learning aspect.

We briefly discuss horizontal and vertical federated learning in Section 2. Federated transfer learning is defined in Section 3. We detail the case studies on FTL in Section 4. We analyze FTL's privacy aspects in Section 5 and machine learning aspects in Section 6. Datasets used in FTL-related works are briefly presented in Section 7. We conclude the work in Section 8.

2. Related work

Depending on whether the datasets held by different users differ mostly in samples or mostly in features, federated learning can be categorized into horizontal and vertical federated learning. In this section we briefly review them. We also briefly review transfer learning, considering its relation to federated transfer learning.

2.1. Horizontal federated learning

Horizontal federated learning is a system in which all the parties share the same feature space; however, their userbases may be significantly different. Such parties can collaboratively learn a model with the help of a server. Each party locally computes training gradients and masks them with some encryption or privacy-preservation technique [15]. All parties send the encrypted gradients to the server. The server aggregates them and sends the aggregated result to all parties. The parties update their models with the decrypted aggregated gradient, and in this way all parties share the final model parameters [7]. An example of horizontal federated learning is a set of banks located in the same city or region, thus sharing the same set of features but having very few common users. Horizontal federated learning can be used to learn a common model by agglomerating the models learnt in the individual banks. It can be implemented with different machine learning algorithms without changing the main framework.
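The round structure just described for horizontal federated learning (local gradient computation, masked upload, server-side aggregation, broadcast) can be sketched in a few lines of Python. This is a minimal illustration, not any cited system: the additive pairwise masks stand in for the encryption or privacy-preservation techniques referenced above, and all function names and the toy gradients are hypothetical.

```python
import random

def pairwise_masks(n_parties, dim, seed=0):
    """Antisymmetric pairwise masks: masks[i][j] == -masks[j][i],
    so every mask cancels when the server sums the uploads."""
    rng = random.Random(seed)
    masks = [[[0.0] * dim for _ in range(n_parties)] for _ in range(n_parties)]
    for i in range(n_parties):
        for j in range(i + 1, n_parties):
            m = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            masks[i][j] = m
            masks[j][i] = [-x for x in m]
    return masks

def masked_upload(grad, my_masks):
    """A party adds the sum of its pairwise masks before sending,
    so the server never sees the raw gradient."""
    out = list(grad)
    for m in my_masks:
        out = [g + x for g, x in zip(out, m)]
    return out

def server_average(uploads):
    """The server averages the masked uploads; the masks cancel out."""
    n = len(uploads)
    return [sum(u[k] for u in uploads) / n for k in range(len(uploads[0]))]

# three banks sharing one feature space, each holding a local gradient
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
masks = pairwise_masks(n_parties=3, dim=2)
uploads = [masked_upload(grads[i], masks[i]) for i in range(3)]
avg = server_average(uploads)  # the plain average [3.0, 4.0], up to float rounding
```

Because the masks are antisymmetric, they cancel in the server's sum, so the server recovers the exact average gradient without ever seeing any individual party's raw gradient.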
2.2. Vertical federated learning

In vertical federated learning, the participating parties do not expose users that do not overlap among the parties [6, 16]. Overlapping users are found using an encryption-based user ID alignment. Since the different parties have different features corresponding to the common users, vertical federated learning aggregates the different features from the different parties and computes the training loss and gradients in a privacy-preserving manner [16]. The computed gradients are subsequently used to train the model. Vertical federated learning assumes honest participants, and there is no hard requirement of a third party, as illustrated in [17]. However, sometimes, to secure the computations between the participants, an additional party is introduced [6].

An example of vertical federated learning is the cooperation between online retailers and insurers. Each owns its own feature space (and labels); however, they have a significant number of common users. In such cases, vertical federated learning exploits the situation by merging the features together to create a larger feature space for machine learning tasks [18].

2.3. Transfer learning

Most machine learning algorithms assume that the training data and the test data have the same distribution and lie in the same feature space. However, this assumption does not hold in most real-life scenarios. In many practical cases, we intend to analyze one domain of interest while we have sufficient training data only in another domain. As an example, we can consider the problem of classification of a product review [19]. Using a traditional machine learning approach, a sufficient number of product reviews needs to be collected and annotated for training. The distribution of review data varies from product to product, and hence this training data collection and annotation process needs to be performed separately for each product. Separate data collection for each product is time-consuming and challenging. Transfer learning emerged as a framework to address this problem. It provides a mechanism for training a model on one product and reusing it on another product.

Transfer learning allows the tasks, domains, and distributions in training and testing to be different. Pan and Yang [19] noted that research on transfer learning has attracted increasing attention since 1995 and was tackled under different names, e.g., knowledge transfer, inductive transfer, and multi-task learning. Different approaches were adopted for transfer learning [19]:

1. Instance transfer re-weights labeled data in the source domain to reuse it in the target domain [20].
2. Parameter transfer discovers shared parameters between the source and the target domain [21].
3. Feature-representation transfer finds a feature representation that reduces the difference between the source and target domains [22].
4. Relational-knowledge transfer builds a mapping of relational knowledge between the source and target domain [23].

In the last decade, deep learning emerged as a very successful paradigm in machine learning research. However, deep learning is even more data dependent than the previous machine learning algorithms. To tackle this, researchers started adopting deep transfer learning to utilize knowledge from other fields via deep neural networks [24]. In addition to the four approaches defined above, another approach that gained significant attention is adversarial-based deep transfer learning, which introduces adversarial techniques based on the generative adversarial network (GAN) [25] and its variants to find transferable representations applicable to both the source domain and the target domain. Deep transfer learning is closely related to other topics in deep learning, e.g., deep domain adaptation [33]. One important line of investigation in deep transfer learning is which networks are more suitable for transfer and which features are transferable in the deep network [26].

Over the last few years, many interesting applications of transfer learning have emerged where FTL can be potentially useful, e.g., reconstruction of the human gene regulatory network [27], human motion analysis [28], cancer diagnosis [29], and multi-temporal satellite image analysis [30]. In most of these applications, data is owned by different parties/agencies, thus FTL can be useful in such scenarios.

3. Federated transfer learning

Federated transfer learning is a special case of federated learning, different from both horizontal and vertical federated learning. In federated transfer learning, the two datasets differ in feature space. This applies to datasets collected from enterprises of different but similar nature. Due to the differences in the nature of their businesses, such enterprises share only a small overlap in feature space. This is also applicable to enterprises located in different parts of the globe. In such scenarios, the datasets differ both in samples and in feature space. Transfer learning techniques aim to build an effective model for the target domain while leveraging knowledge from the other (source) domains. A typical architecture of federated transfer learning is shown in Fig. 1. Considering two parties A and B, where there is only a small overlap in feature space and sample space between A and B, a model learned on B is transferred to A by leveraging the small set of overlapping data and features.
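The three regimes contrasted so far (large feature overlap in Section 2.1, large user overlap in Section 2.2, and the small overlap in both that defines FTL) can be expressed as a small decision helper. The 0.5 threshold and all names here are illustrative assumptions, not part of any cited method.

```python
def overlap(a, b):
    """Fraction of the smaller collection that is shared with the other."""
    a, b = set(a), set(b)
    return len(a & b) / min(len(a), len(b))

def fl_setting(features_a, users_a, features_b, users_b, thr=0.5):
    """Pick the federated learning setting from the two overlaps."""
    if overlap(features_a, features_b) >= thr:
        return "horizontal"          # shared feature space, different users
    if overlap(users_a, users_b) >= thr:
        return "vertical"            # shared users, different features
    return "federated transfer"      # small overlap in both

# banks in the same region: identical features, disjoint customers
s1 = fl_setting(["age", "income"], ["u1", "u2"],
                ["age", "income"], ["u3", "u4"])   # "horizontal"
# a retailer and an insurer in one city: shared customers, different features
s2 = fl_setting(["basket"], ["u1", "u2", "u3"],
                ["premium"], ["u1", "u2", "u9"])   # "vertical"
# enterprises in different parts of the globe: little overlap in either
s3 = fl_setting(["f1", "f2"], ["u1", "u2", "u3"],
                ["f3", "f4"], ["u7", "u8", "u9"])  # "federated transfer"
```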

We recall that horizontal federated learning is used when there is a large overlap in the feature space between datasets, and vertical federated learning is used when there is a large overlap in the user/sample space between datasets. In contrast to them, FTL is used when there is a small overlap in both feature space and sample space (shown by the dotted box in Fig. 1). FTL ingests a model trained on the source domain samples and feature space. Subsequently, FTL orients the model for reuse in the target space, such that the model is used for the non-overlapping samples, leveraging the knowledge acquired from the non-overlapping features of the source domain. Thus FTL covers the region in the upper right corner of Fig. 1 by transferring knowledge from the non-overlapping features of the source domain to the new samples in the target domain. The ability to use the transferred model on non-overlapping data in A makes FTL different from vertical federated learning.

Fig. 1. Federated transfer learning [6].

4. Use cases of FTL

4.1. Wearable healthcare

Wearable devices are fast becoming part of everyday life for patients and healthcare providers. The features and functionalities of wearable devices include remote patient monitoring, tracking and collecting data, enhancing everyday health and lifestyle patterns, and detecting chronic conditions, among others. Healthcare data, however, are usually fragmented and private, making it difficult to generate robust results across populations. A typical architecture of a wearable healthcare system is shown in Fig. 2. Users use their wearable healthcare devices to measure the various health-related parameters they are concerned about. The user data generated by healthcare devices often exist in the form of isolated islands. For further assessment and analysis, the collected data is uploaded to a remote cloud-based server [31].

This architecture presents several critical challenges that hinder the development of effective analytical approaches for generalizing over healthcare data. This inadequate posture of healthcare data has several root causes, including (i) regulatory restrictions, i.e., the lack of acquisition of massive user data; (ii) security and privacy concerns, which restrict the sharing of data that exist in the form of isolated islands; and (iii) a personalization issue, i.e., the process of training a machine learning model lacks personalization [32].

4.2. EEG signal classification

In addition to Section 4.1, another example of the usage of federated transfer learning in the healthcare domain is electroencephalographic (EEG) signal classification [33]. Brain-Computer Interface (BCI) systems aim to decode participants' brain states. The success of deep learning based BCI models for the classification of EEG recordings is restricted by the lack of large EEG datasets. Due to privacy concerns and high data collection expenses, EEG-BCI data is present in the form of multiple small datasets owned by different entities across the globe.

Towards this, Ju et al. [33] propose a method where the EEG data is represented as a spatial covariance matrix and subsequently fed to a deep learning based federated transfer learning architecture. It is assumed that the architecture of each user's deep classifier is the same. The federated averaging method [7] is adopted to aggregate the models from the different users. A server-client setting is used, where a server acts as the model aggregator. In each round, the updated local models are sent to the server, and the server sends back the updated global model after aggregation. When a client receives the global model, it updates the model with its local data. This work [33] clearly demonstrates that the use of domain adaptation in a federated learning architecture boosts EEG classifier performance.
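The server-client round described above (broadcast the global model, update locally, average the returned models) can be condensed into a toy federated averaging loop in the spirit of [7]. The local update rule below is a deliberate placeholder (a nudge toward a hypothetical local statistic), not the actual deep-learning update used in [33]; all names are illustrative.

```python
def client_update(global_model, local_mean, lr=0.5):
    """A client refines the received global model on its private data.
    The rule is a placeholder: move the weights toward a local statistic."""
    return [w + lr * (local_mean - w) for w in global_model]

def federated_averaging(global_model, client_means, rounds=5):
    """Server-client loop: broadcast, update locally, average the returns."""
    for _ in range(rounds):
        local_models = [client_update(global_model, m) for m in client_means]
        global_model = [sum(ws) / len(ws) for ws in zip(*local_models)]
    return global_model

# four hypothetical EEG data owners, each summarized by one local statistic;
# no raw recordings ever leave the owners' sites
model = federated_averaging([0.0, 0.0], client_means=[1.0, 2.0, 3.0, 4.0])
# after five rounds the global model has drifted toward the client average 2.5
```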
4.3. Autonomous driving

Autonomous driving normally refers to self-driving vehicles or transport systems that move without the intervention of a human driver. As seen in Fig. 3, autonomous driving technology is a complex integration of technologies including sensing, perception, and decision. The cloud platform (vehicular and Internet) provides data storage, simulation, high definition (HD) map generation, and deep learning model training functionalities.


Fig. 2. Architecture of the wearable healthcare system.

Autonomous vehicles are mobile systems, and autonomous driving clouds provide some basic infrastructure support, including distributed computing, distributed storage, and heterogeneous computing [34]. On top of this infrastructure, essential services can be implemented in the form of applications to support autonomous vehicles.

Fig. 3. Layered architecture of autonomous driving.

The dynamic nature of the autonomous driving environment and the uncertainty of real-life scenarios make autonomous driving a special use case for FTL [35].

It is noteworthy that big data analysis is becoming more and more common in autonomous driving [36]. FTL is relevant from the big data perspective, and the characteristics of big data are aligned with the objectives of FTL, e.g., the sheer amount of data, different data owners, and the versatility of data (features) [37].

4.4. Image steganalysis

Image steganography is the technique of hiding information in a digital image without compromising its visual aesthetics. Image steganalysis is the counter technique to image steganography: it aims to detect hidden information in digital images. Towards this, it extracts and analyzes the steganographic features generated by image steganographic algorithms. There is a lack of data for training steganalysis methods due to the unwillingness of steganographers to share data. Furthermore, different data owners may have different preferences for steganographic algorithms and cover images.

Fig. 4. Users A, B, and C denote steganographic image owners and do not share any information (left). For personalized model training, each user individually interacts with the cloud model (right).

The traditional image steganalysis algorithms cannot account for this personalized preference.

To overcome these challenges, Yang et al. [38] proposed a federated transfer learning framework for image steganalysis (FedSteg), shown in Fig. 4. In the proposed framework, different users have their own data (stego and cover images). The users do not leak the data to each other. Instead, it is assumed that there is a cloud model and a personalized local model for each user. The cloud model is trained with the data (cover and stego images) on the cloud side. Then the cloud model is distributed to the users. Each user trains their local model with local data. The trained user models are sent back to the cloud side to help it train a new cloud model. This step shares only encrypted model parameters. In this fashion, each user can keep performing personalized training by consolidating the new cloud model with their previous model and data [38]. However, there will be a distribution discrepancy between the cloud and user data. To mitigate this discrepancy, transfer learning is performed to make the model more suitable for each user. Yang et al. [38] use CNN-based deep transfer learning to train a personalized model for each user.
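The FedSteg-style loop just described (cloud pre-training, distribution to users, local personalization, return of parameters only) reduces to the following skeleton. Only the communication pattern mirrors the description above; the update rules, data, and names are placeholders.

```python
def local_train(cloud_params, local_data, lr=0.1):
    """Personalize the received cloud model on private stego/cover data.
    Placeholder rule: nudge each parameter toward the local data mean."""
    mean = sum(local_data) / len(local_data)
    return [p + lr * (mean - p) for p in cloud_params]

def aggregate(returned_models):
    """Cloud averages the returned user models into a new cloud model;
    in FedSteg only (encrypted) parameters travel, never images."""
    n = len(returned_models)
    return [sum(ps) / n for ps in zip(*returned_models)]

cloud = [0.0, 0.0]                              # model pre-trained on cloud-side data
private = {"A": [1.0, 3.0], "B": [5.0, 7.0]}    # each user's data stays local

for _ in range(3):                              # a few federation rounds
    user_models = {u: local_train(cloud, d) for u, d in private.items()}
    cloud = aggregate(list(user_models.values()))
# user_models now holds each user's personalized model
```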
5. Privacy in FTL

Machine learning relies on the availability of vast amounts of data for training. However, in real-life scenarios (as seen in Section 4), data are mostly scattered across different organizations and cannot be easily integrated due to many legal and practical constraints. Federated transfer learning (FTL) helps to improve statistical modeling under a data federation. The data federation, as in the case of FTL, allows knowledge to be shared without compromising user privacy and enables complementary knowledge to be transferred in the network, thus allowing the target-domain party to develop a more flexible and powerful model by leveraging rich labels from the source-domain party [12].

The broad application of FTL is currently hindered by limited dataset availability for algorithm training and validation, due to the absence of technical and legal approaches to protect users' privacy. FTL has strict privacy-preservation requirements. Therefore, to prevent compromising user privacy while promoting scientific research on large datasets, the implementation of technical solutions and the development of legal frameworks that simultaneously address the demands for data protection and data utilization are mandatory.

Privacy-preserving FTL typically involves multiple parties, with an emphasis on security guarantees, performing machine learning together. Here, we present an overview of the necessary privacy-preserving approaches and techniques in the context of FTL [39, 40].

– Privacy by design: FTL designed from the ground up with privacy in mind. The idea is to take privacy into account throughout the FTL development process. The principles of privacy by design may be applied to all types of sensitive data, and the strength of the implemented privacy measure must depend on the sensitivity of the subject data.

For example: processing only the necessary data, storing data for a minimal period, and limiting accessibility. Optimal privacy preservation requires implementations that are secure by default, require minimal or no data transfer, and provide theoretical and/or technical guarantees of privacy [39, 41].

– Anonymization and pseudonymization: The former refers to the removal of personally identifiable information from a dataset (e.g., removing information related to age and gender), whereas the latter refers to the replacement of personally identifiable information in a dataset with a synthetic entry, with separate storage of the linkage record. For example, for health insurance companies wishing to reduce financial risk, the re-identification of patient records is a lucrative target. Recently, data mining companies have been adopting large-scale re-identification attacks and the sale of re-identified medical records as a business model [42]. Taking that into account, the use of naive anonymization or pseudonymization alone must be viewed as a technically insufficient measure against identity inference [39].

– Differential privacy: The alteration of a dataset to obfuscate individual data points while retaining the ability to interact with the data within a certain scope (privacy budget) and to perform statistical analysis. The approach can also be applied to algorithms. An example is the randomization of data to omit relationships between individuals and their respective data entries. It provides privacy preservation against membership-inference attacks in the model inference stage [43]. However, during model training, FTL approaches based on differential privacy are vulnerable to privacy leakage among the participants [44, 45].

– Homomorphic encryption: A cryptographic technique that preserves the ability to perform mathematical operations on data as if it were unencrypted, i.e., plain text. For example, neural network computations can be performed on encrypted data without the need to first decrypt it. Homomorphic encryption has been studied for private federated logistic regression on vertically partitioned data [16]. More recently, [46] proposed privacy-preserving linear regression on horizontally partitioned data using homomorphic encryption and Yao's Garbled Circuits [47].

– Secure multi-party computation: The technique is based on splitting data among collaborating entities to perform a joint computation while preventing any collaborating entity from gaining knowledge of the data. For example, the common patients of two hospitals can be identified without disclosing the respective hospitals' patient lists. Earlier works [48–50] mostly focus on approaches based on multi-party computation. SecureML [51], a privacy-preserving protocol combining secret sharing [52] and Yao's Garbled Circuit [47], is considered the state-of-the-art protocol for linear regression, logistic regression, and neural networks. SecureNN [53] also proposes a multi-party protocol for efficient neural network training.

– Hardware security implementations: The approach of assuring data and algorithm privacy by utilizing specialized hardware, for example, in the form of secure processors or enclaves implemented in mobile devices [54]. Due to the rising significance of hardware-level deep learning implementations, e.g., tensor processing units [55], it is likely that such system-based privacy guarantees built into edge hardware, e.g., trusted execution environments, will become more common [39].
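As a concrete illustration of the differential privacy item above, the textbook Laplace mechanism releases a bounded aggregate with noise scaled to sensitivity divided by the privacy budget epsilon. This is a generic sketch, not drawn from the cited works; the parameter names are illustrative.

```python
import math
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) by inverse-transform sampling."""
    u = rng.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1.0 - 2.0 * abs(u))

def private_mean(values, lower, upper, epsilon, seed=0):
    """Release the mean of bounded values under epsilon-differential privacy.
    The mean of n values clipped to [lower, upper] has sensitivity
    (upper - lower) / n, so the Laplace scale is sensitivity / epsilon."""
    rng = random.Random(seed)
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    scale = (upper - lower) / (len(clipped) * epsilon)
    return true_mean + laplace_noise(scale, rng)

# each value stands for one participant's local statistic
released = private_mean([0.2, 0.4, 0.6, 0.8], 0.0, 1.0, epsilon=1.0)
# released is a perturbed version of the raw mean 0.5
```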
6. Machine learning in FTL

Most works on FTL [32, 33, 38] have adopted variants of deep learning as the architecture for FTL. In [32], the deep model is formed using a series of 1D convolution layers and fully connected layers. The model also consists of other auxiliary layers, like pooling. A softmax layer is used for classification. A simplified schema of the deep model is shown in Fig. 5. During the model transfer from user to server, the convolution layers are kept frozen and the fully connected layers are updated to learn user- and task-specific parameters.

A similar framework is used in FedSteg [38]. The model consists of 9 convolution layers followed by a fully connected layer. While transferring the model from user to cloud/server, the convolution layers are kept frozen, while the fully connected layer is used for model transfer. For feature alignment between source and target, the correlation alignment (CORAL) loss is used, which adapts the second-order feature statistics [38]. The work in [33] uses both a classification loss and a domain loss, the latter implemented using maximum-mean discrepancy [56].
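The CORAL loss mentioned above aligns second-order feature statistics: it is the squared Frobenius distance between the source and target feature covariance matrices, scaled by 1/(4 d^2) for d features. A minimal generic version (not the exact implementation used in FedSteg) can be written directly; the toy data below is hypothetical.

```python
def covariance(X):
    """Feature covariance of an n-by-d data matrix (list of rows)."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    C = [[0.0] * d for _ in range(d)]
    for row in X:
        c = [row[j] - means[j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                C[i][j] += c[i] * c[j] / (n - 1)
    return C

def coral_loss(Xs, Xt):
    """CORAL: squared Frobenius distance between the source and target
    feature covariances, scaled by 1/(4 d^2)."""
    Cs, Ct = covariance(Xs), covariance(Xt)
    d = len(Cs)
    fro2 = sum((Cs[i][j] - Ct[i][j]) ** 2 for i in range(d) for j in range(d))
    return fro2 / (4 * d * d)

# identical feature distributions give zero loss ...
Xs = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
assert coral_loss(Xs, Xs) == 0.0
# ... and the loss grows as the target's feature statistics drift
Xt = [[0.0, 5.0], [2.0, 1.0], [4.0, 3.0]]
assert coral_loss(Xs, Xt) > 0.0
```

Minimizing this quantity during training pulls the user-side feature statistics toward the cloud-side statistics, which is exactly the role CORAL plays in the FedSteg framework described above.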

Fig. 5. The deep learning model for FTL [38].

In federated autonomous driving [35], reinforcement learning is used. Reinforcement learning [57], differently from other machine learning techniques, can be considered as operating over a collection of observation and action spaces. In [35], each RL agent (user) is trained individually. Following this, the knowledge of the distributed RL agents is shared. An online transfer process is then performed to make alignments on the observations and actions.

In FTL, training with heterogeneous data may present additional challenges, e.g., not all client data distributions may be adequately captured by the model. Furthermore, the quality of a particular local data partition may be significantly different from the rest. To overcome these challenges, Dimitriadis et al. [58] presented a dynamic gradient aggregation (DGA) method, which weights the local gradients during the aggregation step.

In spite of the success of the methods discussed above, most of them do not provide an in-depth analysis of how different feature spaces can be handled for different users. Moreover, most of the above works are restricted to a limited number of users. They are based on the simplistic assumption that convolution layers learn shared features while fully connected layers learn domain-specific/task-specific features. While correct, such assumptions do not fully exploit the tremendous progress made by the deep domain adaptation community [59]. Furthermore, there are very few works that investigate the complicated machine learning issues that can arise in the heterogeneous FTL setting [58].

7. FTL datasets

Most works on FTL have adopted existing machine learning datasets and modified them as per the requirements of FTL.

FedHealth [32] used a human activity recognition dataset, UCI Smartphone [60], consisting of 6 activities collected from 30 users within an age bracket of 19–48 years. To adapt the dataset for FedHealth, 5 subjects were regarded as isolated users (who cannot share data) and the remaining were used to train the cloud model.

Ju et al. [33] used the PhysioNet EEG Motor Imagery (MI) Dataset [61]. The MI dataset is recorded from 109 subjects. Their experiments [33] use 5-fold cross-validation settings, with 4 folds being used for training. FedSteg [38] used two different image datasets. Gao et al. [40] conducted experiments on several public datasets from the UCI repository [62]. Differently from [32], [40] and [33], Liang et al. [35] conducted real-life experiments on RC cars and AirSim.

8. Conclusions

Data isolation and data privacy concerns pose a significant challenge for applying artificial intelligence techniques in real-life applications. Moreover, features and users generally vary from organization to organization. In the last few years, federated learning, and especially FTL, has emerged as a potential solution to overcome the above challenges. In this work, we analyzed the concept of FTL and its applications in several practical domains. Furthermore, we did a detailed review of the machine learning techniques used in FTL. Our analysis shows that while FTL has explored different machine learning techniques, there is still potential for using more sophisticated machine learning techniques along with FTL. Adversarial techniques may provide a promising direction for further strengthening FTL. While FTL has been used in some practical applications, the number of such applications is still small. Moreover, most of the datasets used in FTL works are basic machine learning datasets, not tailored to real-life scenarios. FTL holds the promise to break the barrier between different enterprises. However, there is scope for further advancement, by incorporating advanced machine learning techniques in FTL and applying FTL to more practical applications.

References

[1] I. Goodfellow, Y. Bengio and A. Courville, Deep Learning, Vol. 1, MIT Press, Cambridge, 2016.
[2] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li and L. Fei-Fei, ImageNet: A large-scale hierarchical image database, in: 2009 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, 2009, pp. 248–255.
[3] S. Saha, F. Bovolo and L. Bruzzone, Unsupervised multiple-change detection in VHR multisensor images via deep-learning based adaptation, in: IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium, IEEE, 2019, pp. 5033–5036.
[4] Q. Jing, W. Wang, J. Zhang, H. Tian and K. Chen, Quantifying the performance of federated transfer learning, arXiv preprint arXiv:1912.12795 (2019).
[5] P. Voigt and A. von dem Bussche, The EU General Data Protection Regulation (GDPR): A Practical Guide, 1st ed., Springer International Publishing, Cham, 2017.
[6] Q. Yang, Y. Liu, T. Chen and Y. Tong, Federated machine learning: Concept and applications, ACM Transactions on Intelligent Systems and Technology (TIST) 10(2) (2019), 1–19.
[7] H.B. McMahan, E. Moore, D. Ramage and B.A. y Arcas, Federated learning of deep networks using model averaging, arXiv preprint arXiv:1602.05629 (2016).
[8] J. Konečný, H.B. McMahan, F.X. Yu, P. Richtárik, A.T. Suresh and D. Bacon, Federated learning: Strategies for improving communication efficiency, arXiv preprint arXiv:1610.05492 (2016).
[9] J. Konečný, H.B. McMahan, D. Ramage and P. Richtárik, Federated optimization: Distributed machine learning for on-device intelligence, arXiv preprint arXiv:1610.02527 (2016).
[10] H. Zhu and Y. Jin, Multi-objective evolutionary federated learning, IEEE Transactions on Neural Networks and Learning Systems 31(4) (2019), 1310–1322.
[11] W.Y.B. Lim, N.C. Luong, D.T. Hoang, Y. Jiao, Y.-C. Liang, Q. Yang, D. Niyato and C. Miao, Federated learning in mobile edge networks: A comprehensive survey, IEEE Communications Surveys & Tutorials (2020).
[12] Y. Liu, Y. Kang, C. Xing, T. Chen and Q. Yang, A secure federated transfer learning framework, IEEE Intelligent Systems (2020).
[13] S. Saha, Y.T. Solano-Correa, F. Bovolo and L. Bruzzone, Unsupervised deep transfer learning-based change detection for HR multispectral images, IEEE Geoscience and Remote Sensing Letters (2020).
[14] S. Saha, F. Bovolo and L. Bruzzone, Unsupervised deep change vector analysis for multiple-change detection in VHR images, IEEE Transactions on Geoscience and Remote Sensing 57(6) (2019), 3677–3693.
[15] Y. Aono, T. Hayashi, L. Wang, S. Moriai, et al., Privacy-preserving deep learning via additively homomorphic encryption, IEEE Transactions on Information Forensics and Security 13(5) (2017), 1333–1345.
[16] S. Hardy, W. Henecka, H. Ivey-Law, R. Nock, G. Patrini, G. Smith and B. Thorne, Private federated learning on vertically partitioned data via entity resolution and additively homomorphic encryption, arXiv preprint arXiv:1711.10677 (2017).
[17] S. Yang, B. Ren, X. Zhou and L. Liu, Parallel distributed logistic regression for vertical federated learning without third-party coordinator, arXiv preprint arXiv:1911.09824 (2019).
[18] G. Wang, Interpret federated learning with Shapley values, arXiv preprint arXiv:1905.04519 (2019).
[19] S.J. Pan and Q. Yang, A survey on transfer learning, IEEE Transactions on Knowledge and Data Engineering 22(10) (2009), 1345–1359.
[20] J. Jiang and C. Zhai, Instance weighting for domain adaptation in NLP, in: Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics, 2007, pp. 264–271.
[21] J. Gao, W. Fan, J. Jiang and J. Han, Knowledge transfer via multiple model local structure mapping, in: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008, pp. 283–291.
[22] A. Argyriou, M. Pontil, Y. Ying and C.A. Micchelli, A spectral regularization framework for multi-task structure learning, in: Advances in Neural Information Processing Systems, 2008, pp. 25–32.
[23] L. Mihalkova, T. Huynh and R.J. Mooney, Mapping and revising Markov logic networks for transfer learning, in: AAAI, Vol. 7, 2007, pp. 608–614.
[24] C. Tan, F. Sun, T. Kong, W. Zhang, C. Yang and C. Liu, A survey on deep transfer learning, in: International Conference on Artificial Neural Networks, Springer, 2018, pp. 270–279.
[25] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville and Y. Bengio, Generative adversarial nets, in: Advances in Neural Information Processing Systems, 2014, pp. 2672–2680.
[26] J. Yosinski, J. Clune, Y. Bengio and H. Lipson, How transferable are features in deep neural networks?, in: Advances in Neural Information Processing Systems, 2014, pp. 3320–3328.
[27] P. Mignone, G. Pio, D. D'Elia and M. Ceci, Exploiting transfer learning for the reconstruction of the human gene regulatory network, Bioinformatics 36(5) (2020), 1553–1561.
[28] T. Zhou, H. Fu, C. Gong, J. Shen, L. Shao and F. Porikli, Multi-mutual consistency induced transfer subspace learning for human motion segmentation, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 10277–10286.
[29] H. Chougrad, H. Zouaki and O. Alheyane, Multi-label transfer learning for the early diagnosis of breast cancer, Neurocomputing 392 (2020), 168–180.
[30] S. Saha, L. Mou, X.X. Zhu, F. Bovolo and L. Bruzzone, Semisupervised change detection using graph convolutional network, IEEE Geoscience and Remote Sensing Letters (2020).
[31] F. Sun, W. Zang, R. Gravina, G. Fortino and Y. Li, Gait-based identification for elderly users in wearable healthcare systems, Information Fusion 53 (2020), 134–144.
[32] Y. Chen, X. Qin, J. Wang, C. Yu and W. Gao, FedHealth: A federated transfer learning framework for wearable healthcare, IEEE Intelligent Systems (2020).
[33] C. Ju, D. Gao, R. Mane, B. Tan, Y. Liu and C. Guan, Federated transfer learning for EEG signal classification, arXiv preprint arXiv:2004.12321 (2020).
[34] S. Liu, J. Tang, C. Wang, Q. Wang and J.-L. Gaudiot, Implementing a cloud platform for autonomous driving, arXiv preprint arXiv:1704.02696 (2017).
[35] X. Liang, Y. Liu, T. Chen, M. Liu and Q. Yang, Federated transfer reinforcement learning for autonomous driving, arXiv preprint arXiv:1910.06001 (2019).
[36] D. Fényes, B. Németh and P. Gáspár, A predictive control for autonomous vehicles using big data analysis, IFAC-PapersOnLine 52(5) (2019), 191–196.
[37] Q. Zhang, L.T. Yang and Z. Chen, Privacy preserving deep computation model on cloud for big data feature learning, IEEE Transactions on Computers 65(5) (2015), 1351–1362.
[38] H. Yang, H. He, W. Zhang and X. Cao, FedSteg: A federated transfer learning framework for secure image steganalysis, IEEE Transactions on Network Science and Engineering (2020).
[39] G.A. Kaissis, M.R. Makowski, D. Rückert and R.F. Braren, Secure, privacy-preserving and federated machine learning in medical imaging, Nature Machine Intelligence (2020), 1–7.
[40] D. Gao, Y. Liu, A. Huang, C. Ju, H. Yu and Q. Yang, Privacy-preserving heterogeneous federated transfer learning, in: 2019 IEEE International Conference on Big Data (Big Data), IEEE, 2019, pp. 2552–2559.
[41] A. Cavoukian and M. Prosch, Privacy by ReDesign: Building a better legacy, Information and Privacy Commissioner of Ontario (2011), 1–8.
[42] A. Tanner, Our Bodies, Our Data: How Companies Make Billions Selling Our Medical Records, Beacon Press, 2017.
[43] C. Dwork, F. McSherry, K. Nissim and A. Smith, Calibrating noise to sensitivity in private data analysis, in: Theory of Cryptography Conference, Springer, 2006, pp. 265–284.
[44] Y. Wang, Q. Gu and D. Brown, Differentially private hypothesis transfer learning, in: Joint European Conference on Machine Learning and Knowledge Discovery in Databases, Springer, 2018, pp. 811–826.
[45] Q. Yao, X. Guo, J.T. Kwok, W. Tu, Y. Chen, W. Dai and Q. Yang, Differential private stack generalization with an application to diabetes prediction, arXiv preprint arXiv:1811.09491 (2018).
[46] V. Nikolaenko, U. Weinsberg, S. Ioannidis, M. Joye, D. Boneh and N. Taft, Privacy-preserving ridge regression on hundreds of millions of records, in: 2013 IEEE Symposium on Security and Privacy, IEEE, 2013, pp. 334–348.
[47] S. Yakoubov, A gentle introduction to Yao's garbled circuits, 2019.
[48] M. Kantarcioglu and C. Clifton, Privacy-preserving distributed mining of association rules on horizontally partitioned data, IEEE Transactions on Knowledge and Data Engineering 16(9) (2004), 1026–1037.
[49] L. Wan, W.K. Ng, S. Han and V.C. Lee, Privacy-preservation for gradient descent methods, in: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2007, pp. 775–783.
[50] A.C. Yao, Protocols for secure computations, in: 23rd Annual Symposium on Foundations of Computer Science (SFCS 1982), IEEE, 1982, pp. 160–164.
[51] P. Mohassel and Y. Zhang, SecureML: A system for scalable privacy-preserving machine learning, in: 2017 IEEE Symposium on Security and Privacy (SP), IEEE, 2017, pp. 19–38.
[52] A. Shamir, How to share a secret, Communications of the ACM 22(11) (1979), 612–613.
[53] S. Wagh, D. Gupta and N. Chandran, SecureNN: Efficient and private neural network training, IACR Cryptology ePrint Archive 2018 (2018), 442.
[54] Apple Platform Security, Secure Enclave overview, 2020. Accessed: 24-Sep-2020.
[55] Google, Cloud TPU, 2020. Accessed: 24-Sep-2020.
[56] A. Gretton, K. Borgwardt, M. Rasch, B. Schölkopf and A.J. Smola, A kernel method for the two-sample problem, in: Advances in Neural Information Processing Systems, 2007, pp. 513–520.
[57] Y. Li, Deep reinforcement learning: An overview, arXiv preprint arXiv:1701.07274 (2017).
[58] D. Dimitriadis, K. Kumatani, R. Gmyr, Y. Gaur and S.E. Eskimez, Federated transfer learning with dynamic gradient aggregation, arXiv preprint arXiv:2008.02452 (2020).
[59] G. Wilson and D.J. Cook, A survey of unsupervised deep domain adaptation, ACM Transactions on Intelligent Systems and Technology (TIST) 11(5) (2020), 1–46.
[60] D. Anguita, A. Ghio, L. Oneto, X. Parra and J.L. Reyes-Ortiz, Human activity recognition on smartphones using a multiclass hardware-friendly support vector machine, in: International Workshop on Ambient Assisted Living, Springer, 2012, pp. 216–223.
[61] G. Schalk, D.J. McFarland, T. Hinterberger, N. Birbaumer and J.R. Wolpaw, BCI2000: A general-purpose brain-computer interface (BCI) system, IEEE Transactions on Biomedical Engineering 51(6) (2004), 1034–1043.
[62] A. Asuncion and D. Newman, UCI Machine Learning Repository, 2007.

