Advancing Federated Learning in 6G: A Trusted Architecture with Graph-based Analysis


Wenxuan Ye∗†, Chendi Qian‡, Xueli An∗, Xueqiang Yan§, Georg Carle†
∗Advanced Wireless Technology Laboratory, Munich Research Center, Huawei Technologies Duesseldorf GmbH
† TUM School of Computation, Information and Technology, Technical University of Munich
‡ Computer Science 6: Machine Learning and Reasoning, RWTH Aachen
§ Wireless Technology Lab, 2012 Laboratories, Huawei Technologies Co., Ltd

wenxuan.ye@tum.de, chendi.qian@log.rwth-aachen.de, {xueli.an, yanxueqiang1}@huawei.com, carle@net.in.tum.de


arXiv:2309.05525v1 [cs.NI] 11 Sep 2023

Abstract—Integrating native AI support into the network architecture is an essential objective of 6G. Federated Learning (FL) emerges as a potential paradigm, facilitating decentralized AI model training across a diverse range of devices under the coordination of a central server. However, several challenges hinder its wide application in the 6G context, such as malicious attacks and privacy snooping on local model updates, and centralization pitfalls. This work proposes a trusted architecture for supporting FL, which utilizes Distributed Ledger Technology (DLT) and Graph Neural Network (GNN), including three key features. First, a pre-processing layer employing homomorphic encryption is incorporated to securely aggregate local models, preserving the privacy of individual models. Second, given the distributed nature and graph structure between clients and nodes in the pre-processing layer, GNN is leveraged to identify abnormal local models, enhancing system security. Third, DLT is utilized to decentralize the system by selecting one of the candidates to perform the central server's functions. Additionally, DLT ensures reliable data management by recording data exchanges in an immutable and transparent ledger. The feasibility of the novel architecture is validated through simulations, demonstrating improved performance in anomalous model detection and global model accuracy compared to relevant baselines.

Index Terms—Federated learning, Distributed ledger technology, Graph neural network, Secure aggregation, 6G, Homomorphic encryption

I. INTRODUCTION
6G regards AI as a crucial enabler driving innovation in network architecture. In addition to utilizing AI for intelligent communication networks, 6G aims to achieve native AI support, transcending the traditional role of delivery channels for collected data [1]. Federated Learning (FL), a distributed learning paradigm with efficient communication, emerges as a promising solution for extracting valuable information from dispersed client data while ensuring privacy [2]. A central server distributes the training process across multiple clients, with each client performing local computations on its own data and iteratively exchanging model parameters to improve model quality. It holds diverse potential across vertical industries, including smart cities, healthcare and other domains involving extensive data exchanges and analysis [1]. Recent advancements in the decentralized computing and communication capabilities of mobile networks have further facilitated the provision of FL services [3].

However, several challenges hinder the wide application of FL in the 6G context. From a security perspective, detecting and addressing anomalous models poses notable obstacles due to the restriction of analyzing only the model updates without direct access to the raw data [4]. Adversarial clients may launch poisoning attacks to impair the model performance and disrupt the training process. From a privacy view, while FL offers a considerable improvement over centralized training, it is still susceptible to information leakage through the observation of clients' shared model updates [5]. From an architectural standpoint, with massive connectivity, the central server may struggle to handle the increased load. The centralized design for accessing and processing model parameters is prone to a single point of failure and limited scalability. Besides, such a centralized system lacks a proactive mechanism for demonstrating compliance in data management [1].

To enhance privacy preservation in FL, secure aggregation has been recognized as a pivotal technology, enabling the combination of model updates without disclosing individual contributions [6]. Homomorphic Encryption (HE) is a promising cryptographic method for implementing secure aggregation [7]. It allows the server to perform computations directly on encrypted updates to obtain the desired encrypted result. Recent progress in its utility, coupled with the increased computing capabilities of communication network entities, has paved the way for its widespread application [7].

In addition to secure aggregation for privacy issues, Distributed Ledger Technology (DLT) offers a potential solution to overcome other FL limitations, such as centralization issues and unreliable data management. DLT ensures transparency and immutability through decentralized peers maintaining a shared ledger of blocks [8]. Each block stores multiple transactions and links to the previous block by containing its cryptographic hash, forming a chronological record [9]. The smart contract, a piece of code deployed on a DLT platform, executes automatically when preset conditions are met [9]. It reduces the need for intermediaries and streamlines processes.

Furthermore, considering the complex relationships among data and the distributed nature of FL, Graph Neural Networks (GNNs) have emerged as a powerful tool for modeling graph-structured data [10]. Graphs naturally introduce the inductive bias of connectivity between node pairs, which sets them apart from other data structures such as images or sequential data [11]. A GNN iteratively distinguishes nodes based on their attributes, neighbors and connections. By leveraging the inherent graph structure of data and modeling interactions, GNNs can effectively capture the dependencies and relationships among distributed clients, leading to more accurate and robust models.
To tackle the concerns raised earlier, a novel trusted architecture for FL is proposed. The key contributions are summarized as follows:

1) Privacy: Introducing a pre-processing layer for secure aggregation. This layer employs HE to securely aggregate local model updates and prevent the exposure of clients' private information. By encrypting the updates before aggregation, the server can obtain only the aggregated result.

2) Security: Leveraging GNNs to identify abnormal local models. Considering the graph structure between clients and nodes in the pre-processing layer, along with their complicated features, GNNs are employed for anomalous model identification. To the best of our knowledge, this is the first attempt to apply GNN to FL architecture optimization.

3) Reliability: Utilizing DLT for trustworthy data management and a decentralized architecture. Using smart contracts for automatic access control and logging data exchange operations on the ledger, DLT facilitates transparent and traceable data management. An aggregator is chosen from multiple candidates via smart contracts to execute the central server's tasks, promoting system decentralization and bolstering its resilience.

Moreover, we measure the performance in terms of abnormal model detection, global model accuracy and the time overhead caused by HE. The simulation results demonstrate that the proposed method outperforms the baseline by utilizing the features of graph nodes and their connections, leading to better performance in dealing with anomalous local models. In addition, we analyze the time consumption of system components to provide practical insights for scalability.

II. RELATED WORK

Recent studies have proposed various techniques against privacy leakages, which exploit vulnerabilities to gain unauthorized access to sensitive information. Apart from HE, commonly employed technologies include Differential Privacy (DP) [12] and secret sharing [14]. By adding deliberately designed noise, DP protects individual records while minimizing the impact on data aggregation [12]. In [13], a model aggregation algorithm was proposed to balance the trade-off between privacy loss and model performance. The secret-sharing method distributes copies of secret messages among participants, enabling reconstruction by a sufficient number of participants [14]. A practical secret-sharing method proposed in [6] utilized an innovative double-mask structure to protect privacy. Despite enhanced privacy assurances, these methods pose challenges in identifying anomalous models due to limited access to individual model updates [6].

Aimed at addressing security threats in FL, considerable effort has been devoted to defense mechanisms. The vanilla model aggregation method is Federated Averaging (FedAvg), which performs a (weighted) average of model updates to generate global models [2]. Krum is a robust aggregation rule that replaces the average step with a reliable estimate of the mean [15]. It selects the most trusted model as the aggregated result using majority-based and distance-based analysis. Besides, ML-based approaches have also been extensively explored for anomaly detection, such as the Multi-Layer Perceptron (MLP) [16]. However, these methods require knowledge of the specific data, where encryption is difficult to apply as it alters spatial locations and data distances. Moreover, the implementations rely on a reliable third party or a centralized setting, which is not feasible in some scenarios.
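For intuition on the majority- and distance-based analysis behind Krum, the sketch below (our own illustration of the rule in [15], not code from that work) scores each update by the summed squared distances to its closest neighbors and returns the single lowest-scoring update:

```python
import numpy as np

def krum(updates: np.ndarray, n_byzantine: int) -> np.ndarray:
    """Krum-style selection: return the most 'central' model update.

    updates: shape (n_clients, n_params), one flattened model per row.
    n_byzantine: assumed upper bound f on the number of malicious clients.
    """
    n = len(updates)
    # Pairwise squared Euclidean distances between all updates.
    dists = ((updates[:, None, :] - updates[None, :, :]) ** 2).sum(-1)
    scores = []
    for i in range(n):
        # Sum over the n - f - 2 nearest neighbors (index 0 is the self-distance).
        scores.append(np.sort(dists[i])[1 : n - n_byzantine - 1].sum())
    return updates[int(np.argmin(scores))]
```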
Previous studies have separately explored defense mechanisms against security and privacy attacks in FL, but there is limited focus on developing combined solutions that address both threats simultaneously. An attempt was made in [17], but its approach to anomaly detection involves analyzing raw data, raising privacy concerns.

III. PRELIMINARIES

A. DLT-DDSE

DLT empowers traceability and immutability through decentralized peers that collectively maintain a common ledger. Data storage in DLT is classified into on-chain and off-chain methods. On-chain storage records model parameters directly on the ledger, but this transparency may raise privacy concerns in FL systems. The off-chain method stores only pointers to the raw data on the chain, with the raw data saved in a separate entity, such as a Distributed Data Storage Entity (DDSE) [18]. A DDSE can be implemented through various technologies, including a Distributed Hash Table (DHT), which employs hashes as keys to provide a scalable and efficient lookup mechanism. Please refer to our previous work [19] for further details on the application of DLT in FL.

In this work, we utilize the DLT-DDSE to enable data exchange between clients and the central server (subsequently implemented as aggregator candidates), while ensuring transparent and immutable records. Smart contracts are employed for automated operations, including access control and aggregator selection, thereby improving the system's trustworthiness.
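To make the on-chain/off-chain split concrete, the toy sketch below keeps a model blob in a DHT-style key-value store under its SHA-256 hash and records only that hash as an on-chain pointer; the ledger list and store class are simplified stand-ins, not the DLT-DDSE implementation of [19]:

```python
import hashlib

class ToyDDSE:
    """Minimal DHT-style off-chain store, content-addressed by SHA-256."""
    def __init__(self):
        self._store = {}

    def put(self, blob: bytes) -> str:
        key = hashlib.sha256(blob).hexdigest()  # the hash serves as the lookup key
        self._store[key] = blob
        return key

    def get(self, key: str) -> bytes:
        blob = self._store[key]
        # Consistency check: the content must still match its on-chain pointer.
        assert hashlib.sha256(blob).hexdigest() == key
        return blob

ledger = []  # stand-in for on-chain transactions
ddse = ToyDDSE()
pointer = ddse.put(b"serialized encrypted model update")
ledger.append({"tx": "upload", "pointer": pointer})  # only the hash goes on-chain
```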
B. Homomorphic Encryption

HE allows mathematical or logical operations to be performed directly on encrypted data [7]. Enc(m) denotes the encrypted form of plaintext m. Given encrypted values Enc(m_1) and Enc(m_2), homomorphic addition and multiplication allow the computation of Enc(m_1 + m_2) and Enc(m_1 * m_2), respectively, without the need for decryption. These operations can be denoted as Enc(m_1) ⊕ Enc(m_2) = Enc(m_1 + m_2) for homomorphic addition, and Enc(m_1) ⊗ Enc(m_2) = Enc(m_1 * m_2) for homomorphic multiplication, where ⊕ and ⊗ represent the corresponding homomorphic operations.

The Paillier algorithm is a public-key cryptographic scheme based on composite residuosity classes, where encryption and decryption use modular exponentiation [20]. It permits homomorphic addition of ciphertexts, as well as multiplication of a ciphertext by a plaintext number. A detailed description can be found in [20]. In this work, HE is used for model aggregation and linear projection, and Paillier's properties meet these demands. By contrast, HE schemes that support full homomorphic multiplication necessitate a more complex noise-removal operation after a finite number of multiplications, leading to substantial communication overhead [7]. Given these factors, Paillier is selected as the encryption scheme in the proposed architecture.
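A minimal sketch of these properties, using the open-source python-paillier (phe) package for illustration rather than the paper's own implementation (the 128-bit key mirrors the evaluation setup in Section V; a production deployment would use a much longer key):

```python
from phe import paillier  # pip install phe

# Short key for illustration only; real deployments use >= 2048 bits.
pub, priv = paillier.generate_paillier_keypair(n_length=128)

m1, m2 = 3.5, 4.25
c1, c2 = pub.encrypt(m1), pub.encrypt(m2)

c_sum = c1 + c2     # homomorphic addition: decrypts to m1 + m2
c_scaled = c1 * 2   # ciphertext-by-plaintext multiplication, as Paillier permits

assert priv.decrypt(c_sum) == m1 + m2
assert priv.decrypt(c_scaled) == 2 * m1
```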
C. Graph Neural Network

GNNs have gained considerable interest for their proficiency in processing and analyzing graph-structured data [10]. Normally, a graph consists of nodes and edges that indicate the connectivity of node pairs. A bipartite graph is a specific model in graph theory, in which the nodes are divided into two disjoint sets with edges connecting only nodes in different sets. Message passing between nodes through edges is the core mechanism of GNNs, capturing node relations and interactions. A GNN comprises multiple layers, with each layer creating and aggregating the messages from neighboring nodes to update the representation of each node. This process is repeated across all layers, ultimately generating a rich, high-level representation of the nodes or the whole graph. GNNs have been widely applied in diverse domains, including node classification, where the node embeddings are utilized and a classification head (typically an MLP) is employed.

In the proposed architecture, the model features, the pre-processing layer properties and their connections are essential to assess model quality. These elements structurally make up a bipartite graph, which lacks a fixed form and is not amenable to distance measurements. Therefore, a GNN serves as an effective tool to address this challenge.
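For intuition, the following sketch (our simplification, not the exact model evaluated later) implements one round of mean-aggregated message passing over a client-node bipartite graph in PyTorch:

```python
import torch
import torch.nn as nn

class BipartiteLayer(nn.Module):
    """One message-passing layer between clients and pre-processing nodes."""
    def __init__(self, client_dim: int, node_dim: int, hidden: int = 128):
        super().__init__()
        self.to_node = nn.Linear(client_dim, hidden)   # client -> node messages
        self.to_client = nn.Linear(hidden, hidden)     # node -> client messages
        self.node_update = nn.Linear(node_dim + hidden, hidden)
        self.client_update = nn.Linear(client_dim + hidden, hidden)

    def forward(self, h_c, h_n, edges):
        # edges: (2, E) long tensor of (client_index, node_index) pairs.
        c_idx, n_idx = edges
        # Mean-aggregate client messages at each pre-processing node.
        msg = torch.zeros(h_n.size(0), self.to_node.out_features)
        msg.index_add_(0, n_idx, torch.relu(self.to_node(h_c))[c_idx])
        deg_n = torch.bincount(n_idx, minlength=h_n.size(0)).clamp(min=1).unsqueeze(1)
        h_n_new = torch.relu(self.node_update(torch.cat([h_n, msg / deg_n], dim=1)))
        # Push node messages back to the connected clients.
        back = torch.zeros(h_c.size(0), self.to_client.out_features)
        back.index_add_(0, c_idx, torch.relu(self.to_client(h_n_new))[n_idx])
        deg_c = torch.bincount(c_idx, minlength=h_c.size(0)).clamp(min=1).unsqueeze(1)
        h_c_new = torch.relu(self.client_update(torch.cat([h_c, back / deg_c], dim=1)))
        return h_c_new, h_n_new
```

An anomaly-detection head, e.g. an MLP on the final client embeddings, then classifies each client's model as benign or malicious; the experiments in Section V stack three such layers with 128 hidden dimensions, ReLU and dropout.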
IV. ARCHITECTURE DESIGN

This section outlines the proposed trusted architecture for supporting FL, which aims to enhance both privacy and security. It begins with an introduction of the system components and the threat assumption, then proceeds to detail the proposed approach, including the algorithm presentation.

Fig. 1: A trusted architecture for FL support with graph-based analysis (shown in the green box and implemented in Step 7).

A. System Components and Threat Assumption

The proposed architecture consists of four components and the corresponding interactions, as illustrated in Fig. 1.

Clients in the proposed system download the global model, train it on their own datasets and upload the obtained models. Clients could be, but are not limited to, user equipment or IoT devices capable of data collection and model training, such as smartphones and cars.

The pre-processing layer performs model pre-processing before storage on the DLT-DDSE platform. By leveraging Paillier's homomorphism, this layer implements secure aggregation and model projection with dimensionality reduction, maintaining data similarity while preventing the recovery of the original data. Thus, subsequent detection of anomalous models is facilitated in a secure and privacy-preserving manner.

The DLT-DDSE layer is utilized for data management and aggregator selection. All data access records are stored as transactions in an immutable ledger. To upload or download data, entities require authorization from the DLT and then proceed accordingly to the DDSE where the raw data is stored, with the DDSE confirming consistency with the DLT. In addition to data management, the DLT employs smart contracts for selecting aggregators, for example at random, by round robin, or based on the hash of previous blocks (a minimal sketch of the hash-based option follows below). The selected aggregator assumes the central server's responsibility, thereby decentralizing the entire system and enhancing its resilience. The specific transaction types and data interaction procedures have been clearly defined and described in our previous work [19].
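A minimal sketch of the hash-based option (illustrative only; the actual smart-contract logic is specified in [19]). The previous block hash acts as a shared, tamper-evident seed, so every peer derives the same choice without a coordinator:

```python
import hashlib

def select_aggregator(prev_block_hash: str, candidates: list) -> str:
    """Deterministically map the latest block hash to one aggregator candidate."""
    digest = hashlib.sha256(prev_block_hash.encode()).digest()
    return candidates[int.from_bytes(digest, "big") % len(candidates)]

# Every peer evaluates this on the same ledger state and reaches the same result.
print(select_aggregator("00ab3fd4e5c6", ["agg-1", "agg-2", "agg-3"]))
```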

The selected aggregator undertakes the task of processing the uploaded models from the pre-processing layer. To identify abnormal models, it conducts graph-based analysis based on the features of the clients and the nodes in the pre-processing layer, together with the underlying graph structure. It then generates the global model for the new global epoch.

Nodes in the pre-processing layer, peers in the DLT-DDSE layer and aggregator candidates could reside within mobile communication systems, such as network entities with computational capabilities. Alternatively, they may belong to third-party organizations, where proper registration and authentication procedures are required.

This work categorizes clients as benign or malicious, with a focus on malicious clients that launch untargeted poisoning attacks; targeted model update attacks and evasion attacks during inference are out of scope. Our defense mechanism operates by filtering out malicious updates that deviate sufficiently from those of benign clients. The other components are assumed to be honest but curious, following the system rules but attempting to deduce sensitive information from the received data.

B. Proposed Approach

The architecture design, illustrated in Fig. 1, partitions one global epoch into ten steps. For global epoch t, C_t denotes the set of selected clients, N represents the set of nodes in the pre-processing layer, and C_{n,t} denotes the set of clients connected to node n.
Algorithm 1 Model pre-processing for each node n ∈ N
1: Input: Encrypted local model Enc(w_{t,c_n}), ∀c_n ∈ C_{n,t}
2: Require: Projection function P_pk, Aggregation function A_pk
3: Output: Encrypted projected vector Enc(v_{t,c_n}), ∀c_n ∈ C_{n,t}; Encrypted model semi-aggregation Enc(w_{t,n})
4: Enc(v_{t,c_n}) ← P_pk(Enc(w_{t,c_n}))
5: Enc(w_{t,n}) ← A_pk(Enc(w_{t,c_n}), ∀c_n ∈ C_{n,t})

Algorithm 2 Data processing
1: Input: Encrypted projected vector Enc(v_{t,c_n}), Encrypted model semi-aggregation Enc(w_{t,n}), Graph connection (n, c_n), ∀c_n ∈ C_{n,t}, ∀n ∈ N, History client selection records H(C)
2: Require: Decryption D_sk, GNN function G, Test function T, Aggregation function A
3: Output: Anomalous model detection result A_t, Client selection C_{t+1}, Semi-aggregated model performance {per_{t,n}, ∀n ∈ N}, Global model w_{t+1}
4: for c_n ∈ C_{n,t}, n ∈ N do
5:   w_{t,n} ← D_sk(Enc(w_{t,n}))
6:   per_{t,n} ← T(w_{t,n})
7:   v_{t,c_n} ← D_sk(Enc(v_{t,c_n}))
8: end for
9: A_t, C_{t+1} ← G(v_{t,c_n}, H(C), per_{t,n}, (n, c_n), ∀c_n ∈ C_{n,t}, ∀n ∈ N)
10: w_{t+1} ← A(w_{t,n}, A_t, per_{t,n}, ∀n ∈ N)

HE employs the encryption function E_pk and decryption function D_sk, with public key pk and private key sk. Enc(x) denotes the encrypted form of x. Clients and aggregator candidates hold the private key for decryption and the public key for encryption, while nodes in the pre-processing layer only hold the public key. The projection function P_pk and aggregation function A_pk are implemented through homomorphic properties without the private key, where P_pk involves flattening the model into a vector and multiplying it by a matrix, and A_pk could be achieved through FedAvg.
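Because Paillier offers ciphertext addition and ciphertext-by-plaintext multiplication, both functions can run without the private key: P_pk multiplies an encrypted vector by a plaintext projection matrix, and A_pk averages ciphertexts. A toy-scale sketch under these assumptions, again using the phe package:

```python
from functools import reduce
from operator import add

import numpy as np
from phe import paillier

pub, priv = paillier.generate_paillier_keypair(n_length=128)

def project(enc_vec, matrix):
    # P_pk: plaintext matrix times encrypted vector, entry by entry.
    return [reduce(add, (c * float(m) for m, c in zip(row, enc_vec))) for row in matrix]

def semi_aggregate(enc_models):
    # A_pk: FedAvg over ciphertexts, i.e., a homomorphic sum scaled by 1/k.
    k = len(enc_models)
    return [reduce(add, col) * (1.0 / k) for col in zip(*enc_models)]

rng = np.random.default_rng(0)
models = [rng.normal(size=6) for _ in range(3)]        # three toy "flattened models"
enc_models = [[pub.encrypt(float(x)) for x in w] for w in models]

proj_matrix = rng.normal(size=(2, 6))                  # 6 -> 2 dimensionality reduction
enc_proj = project(enc_models[0], proj_matrix)
enc_agg = semi_aggregate(enc_models)

assert np.allclose([priv.decrypt(c) for c in enc_proj], proj_matrix @ models[0])
assert np.allclose([priv.decrypt(c) for c in enc_agg], np.mean(models, axis=0))
```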
The test function T evaluates model performance on unseen data using the test dataset, yielding results based on various metrics such as accuracy, error rate and F1 score.

The GNN function G is applied to detect anomalous models using a bipartite graph comprised of clients and nodes in the pre-processing layer, as depicted in the green box in Fig. 1. The local model projections and the history client selection results are taken as the client features, and the semi-aggregated model performance serves as the features of the pre-processing layer.

Next, we provide a detailed description of the steps in Fig. 1.

Step 1: Each selected client c ∈ C_t trains the global model w_t on its own dataset to obtain the local model w_{t,c}, and encrypts it through E_pk to get Enc(w_{t,c}).

Step 2: Enc(w_{t,c}) is then uploaded to the corresponding nodes in the pre-processing layer, where node n collects the local models from the associated clients C_{n,t}.

Step 3: Nodes in this layer undertake model pre-processing, with the procedure summarized in Alg. 1. Each node n ∈ N uses the projection function P_pk to map the homomorphically encrypted local model Enc(w_{t,c_n}) into a low-dimensional vector Enc(v_{t,c_n}), e.g., of length 32. Then it applies the aggregation function A_pk to obtain a semi-aggregation Enc(w_{t,n}).

Step 4: The pre-processed models are passed to the DLT-DDSE layer, including Enc(v_{t,c_n}), Enc(w_{t,n}) and the graph connections (n, c_n), ∀c_n ∈ C_{n,t}, ∀n ∈ N.

Step 5: The DLT uses the smart contract for aggregator selection.

Step 6: The selected aggregator obtains the pre-processed models and the history client selection records H(C) = {C_0, C_1, ..., C_t} from the DLT-DDSE layer.

Step 7: The selected aggregator performs data processing, including data decryption and analysis, as shown in Alg. 2. The aggregator decrypts Enc(v_{t,c_n}) and Enc(w_{t,n}). Then, the performance per_{t,n} of the semi-aggregated model w_{t,n} is inferred by the test function T. Subsequently, the aggregator takes (v_{t,c_n}, H(C), per_{t,n}, (n, c_n), ∀c_n ∈ C_{n,t}, ∀n ∈ N) as the input of the GNN function G to detect anomalous models, with the detection results denoted by A_t, on the basis of which the client selection C_{t+1} for the next global epoch t + 1 is performed. Semi-aggregated models with high accuracy are aggregated to form the global model w_{t+1} by referring to A_t.

Step 8: The obtained data, including {per_{t,n}, ∀n ∈ N}, A_t, C_{t+1} and w_{t+1}, are uploaded to the DLT-DDSE layer.

Step 9: The pre-processing layer downloads w_{t+1} and C_{t+1}, and notifies the selected clients for the next global epoch t + 1.

Step 10: At the end of a global epoch, the selected clients C_{t+1} download the latest global model w_{t+1}.

The architecture design prevents any entity from acquiring complete knowledge of a single model and refrains from directly storing individual models on the ledger, safeguarding client information privacy. Simultaneously, system security is enhanced by the utilization of the GNN for anomalous model detection, using graph structures and model characteristics. All data exchanges are immutably and transparently recorded on the DLT platform for further reference and audit.
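As one possible realization of the final aggregation step (line 10 of Alg. 2), the sketch below forms the global model by performance-weighted averaging over the semi-aggregated models whose nodes pass the anomaly check; the acceptance rule and names here are our assumptions, not a fixed specification:

```python
import numpy as np

def form_global_model(semi_aggregates, accepted, performances):
    """Average the semi-aggregated models flagged as clean by A_t.

    semi_aggregates: dict node_id -> flattened model (np.ndarray)
    accepted: dict node_id -> bool, derived from the GNN detection result A_t
    performances: dict node_id -> test accuracy per_{t,n}, used as weights
    """
    kept = [n for n in semi_aggregates if accepted[n]]
    weights = np.array([performances[n] for n in kept], dtype=float)
    weights /= weights.sum()  # performance-weighted FedAvg over accepted nodes
    return sum(w * semi_aggregates[n] for w, n in zip(weights, kept))
```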
V. SIMULATION AND EVALUATION

The evaluation of the proposed trusted architecture encompasses the performance of abnormal model detection, the global model accuracy and the HE time overhead. The evaluation setup is first delineated, followed by the experimental results.

Regarding the performance evaluation of the DLT-DDSE, we have provided a comprehensive analysis in our prior work [19]. However, we acknowledge the significance of this component for providing trusted data management.

A. Evaluation Setup

The whole simulation is conducted on a laptop with an Intel(R) Core(TM) i7-10510U CPU (16 GB RAM), and is implemented with PyTorch 1.13.1.

1) Dataset and model structure for FL: The MNIST dataset [21] is adopted for training and testing the FL models. We set 100 clients as the default, each of which maintains 600 images and trains with a batch size of 64. For the FL models, we employ a Convolutional Neural Network (CNN) with 2 convolutional layers, both with kernel size 5, with 32 and 64 channels respectively, followed by 2 fully connected layers of hidden size 2048. We apply max pooling with stride 2 right after the convolutional layers. The employed CNN model has a total of 28,784 parameters.
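A PyTorch sketch matching this description; the padding and the flattened input size are our assumptions for 28x28 MNIST inputs, so the exact parameter count may differ from the reported 28,784:

```python
import torch.nn as nn

class FLCNN(nn.Module):
    """2 conv layers (kernel 5, 32/64 channels), max pooling, 2 FC layers."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2, stride=2),
            nn.Conv2d(32, 64, kernel_size=5), nn.ReLU(), nn.MaxPool2d(2, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 4 * 4, 2048), nn.ReLU(),  # 28x28 input -> 4x4 feature maps
            nn.Linear(2048, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```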
2) Abnormal local model generation: After the normal training process, local models are trained on a manipulated dataset with incorrect labels for several iterations, termed perturbation steps.
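A sketch of this perturbation; the uniform random relabeling is our assumption, as the paper states only that the labels are incorrect:

```python
import torch

def perturb_labels(labels: torch.Tensor, num_classes: int = 10) -> torch.Tensor:
    """Replace each label with a uniformly random, guaranteed-different class."""
    offset = torch.randint(1, num_classes, labels.shape)  # offset is never zero
    return (labels + offset) % num_classes

# Malicious clients keep training on (images, perturb_labels(labels)) for a few
# extra iterations -- these are the "perturbation steps".
```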
3) Dataset and model architecture for GNN: We use a 32-dimensional vector as the local model features, obtained by projecting the model parameters via random linear transformation matrices with a fixed seed. For the nodes in the pre-processing layer, the test accuracies of the semi-aggregated models are taken as features. In addition, the connections between them are taken into consideration. For the GNN, we employ a heterogeneous message-passing structure consisting of 3 layers, each with 128 hidden dimensions, and a dropout probability of 0.5. Between the layers, we use ReLU activation functions for non-linearity. We use the default Adam optimizer settings with a batch size of 64, and run each experiment for 200 epochs.
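The client features can be reproduced with a seeded random projection along the following lines (a sketch; the seed and the scaling factor are arbitrary choices):

```python
import torch

def model_features(model: torch.nn.Module, dim: int = 32, seed: int = 42):
    """Project all flattened model parameters to a fixed-length feature vector."""
    flat = torch.cat([p.detach().flatten() for p in model.parameters()])
    gen = torch.Generator().manual_seed(seed)  # fixed seed: same matrix every epoch
    proj = torch.randn(dim, flat.numel(), generator=gen) / flat.numel() ** 0.5
    return proj @ flat
```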
4) Basic configuration: For each global epoch, the proposed system consists of 20 selected clients with 10 local epochs, and 10 nodes in the pre-processing layer, each randomly connected to 5 clients. The ratio of anomalous models, i.e., the perturbation ratio, is 0.5, with 3 perturbation steps.

B. Performance of Abnormal Model Detection

1) Baseline: The commonly used MLP model [16] is taken as a baseline for comparison, solely leveraging the projected local model vectors. For a fair comparison, it has the same model capacity as the GNN, i.e., 3 layers with 128 hidden dimensions, accompanied by ReLU functions in between.

2) Variable settings: We examine the effect of independent variables in a controlled experimental setting. The independent variables include the perturbation ratio, the perturbation step, the number of selected clients and the number of client connections per node. The dependent variables are the MLP F1 score for the baseline and the GNN F1 score for the proposed method, with the abnormal local model accuracy as a reference for the degree of attack.

Fig. 2: Abnormal model detection: the proposed method achieves better accuracy than the baseline by making use of node features and connections. (a) Perturbation ratio, (b) Perturbation step, (c) Selected clients, (d) Client connections per node.

3) Simulation result: The comparison of our proposed method with the baseline is presented in Fig. 2, where the primary axis displays the detection accuracy for abnormal models and the secondary axis shows the accuracy of the abnormal local models. Fig. 2a depicts the effect of increasing the ratio of malicious models, with the abnormal local model accuracy being 94%. The proposed method is minimally impacted, with its detection accuracy remaining around 98%. Conversely, the baseline, which relies solely on model similarity, faces a substantial obstacle, with its accuracy dropping to 86%. In Fig. 2b, as the perturbation steps increase, the divergence between anomalous and normal models amplifies, simplifying detection. The novel method achieves increasing accuracy from 95% to 99%, exceeding the baseline performance, which ranges from 88% to 99%. In Fig. 2c and Fig. 2d, changes in client numbers and connections impact the graph structure, but have no influence on the baseline, whose detection accuracy stays at 91%. Higher connectivity enables better information aggregation, and clients that appear only once in the graph may slightly affect detection, with the lowest detection accuracy being 96%. Overall, the proposed method outperforms the baseline by leveraging the features of the clients and the pre-processing layer, along with their connections.

C. Performance of Global Model

1) Baseline: We compare the model aggregation performance under three different scenarios: "non-perturbed", representing normal training without perturbation; "perturbed", with perturbation; and "novel method", with perturbation but applying the proposed method.
2) Scenario settings: Based on the simulation results in Section V-B, we consider application scenarios with a single change to the basic configuration: a perturbation ratio of 0.8, a perturbation step of 10, or 6 client connections per node.

Fig. 3: Global model: the novel method improves model accuracy, and the performance can be close to the non-perturbed scenario. (a) Basic configuration, (b) Perturbation ratio of 0.8, (c) Perturbation step of 10, (d) Client connections per node of 6.
3) Simulation result: In Fig. 3a, the novel method converges more quickly to a higher accuracy level than the perturbed scenario. From the 10th global epoch onwards, the model accuracy obtained with the novel method is close to that in the non-perturbed condition. In Fig. 3b and Fig. 3c, with an increase in the perturbation ratio or step, the entire system receives more aggressive perturbations and the benefits of the novel method are amplified, with accuracy at least 8% higher than in the perturbed condition. The proposed architecture's performance is relatively unaffected by changes in the graph structure, as shown in Fig. 3d. In conclusion, the novel method improves the model accuracy and stability of the FL system, and the performance can be close to the non-perturbed condition.

D. Time Consumption

This subsection presents an analysis of the time consumption of each system component per global epoch. The implemented Paillier scheme utilizes a 128-bit key length. We categorize the time consumption by component, as shown in Fig. 4, and investigate the impact of changes in the number of selected clients.

Fig. 4: Average time consumption of system components per global epoch, where "cli" refers to clients with local model training and encryption, "pre" refers to the pre-processing layer with model projection and semi-aggregation, and "agg" represents the selected aggregators with data decryption and analysis.

The majority of the time is spent on training (around 40%), encryption (around 16%), projection (around 12%) and decryption (around 26%), while semi-aggregation and analysis occupy only 5% and 0.002% of the total time, respectively. An increase in the client number has minimal impact on the system time due to parallel training and encryption across clients, as well as parallel projection by the nodes in the pre-processing layer. Despite the additional work required to decrypt the projected model vectors as the number of clients grows, the impact is limited because of the vectors' low dimensionality of 32. The simulation results indicate that further processing (including pre-processing and decryption) takes longer than encryption on the client side; however, in practice, this does not present a performance bottleneck as it is conducted in the cloud with superior computing resources. Despite the overhead introduced by the Paillier scheme, the system effectively preserves privacy by safeguarding individual clients' model updates, justifying these costs.

VI. CONCLUSION AND OUTLOOK

This work proposes a novel trusted architecture for supporting FL. DLT is leveraged for reliable data management and a decentralized architecture. A pre-processing layer employing Paillier's additive homomorphic encryption is incorporated to enhance client privacy protection through secure aggregation. Considering the graph structure between clients and nodes in the pre-processing layer, GNN is leveraged to identify abnormal models, thereby improving the architecture's security. The proposed architecture enables reliable and secure collaborative machine learning in wireless communications.

This work is an exploration of applying GNN to FL architecture optimization, taking local model projections and history selection results as client features. In future work, we plan to incorporate geographical location and communication network characteristics, such as signal strength, into the GNN function for better prediction.

REFERENCES

[1] W. Tong and P. Zhu, "6G: The Next Horizon: From Connected People and Things to Connected Intelligence," Cambridge University Press, 2021.
[2] B. McMahan, E. Moore, D. Ramage, S. Hampson and B. Agüera y Arcas, "Communication-efficient learning of deep networks from decentralized data," PMLR, 2017.
[3] Y. Liu, et al., "Federated learning for 6G communications: Challenges, methods, and future directions," China Communications, 2020.
[4] Z. Wang, et al., "Beyond inferring class representatives: User-level privacy leakage from federated learning," IEEE INFOCOM, 2019.
[5] P. Kairouz, et al., "Advances and open problems in federated learning," Foundations and Trends in Machine Learning, 14.1-2 (2021): 1-210.
[6] K. Bonawitz, et al., "Practical secure aggregation for privacy-preserving machine learning," Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, 2017.
[7] R. Gilad-Bachrach, et al., "CryptoNets: Applying neural networks to encrypted data with high throughput and accuracy," PMLR, 2016.
[8] S. Nakamoto, "Bitcoin: A peer-to-peer electronic cash system," Cryptography Mailing List at https://metzdowd.com, 2008.
[9] E. Androulaki, et al., "Hyperledger Fabric: A distributed operating system for permissioned blockchains," Proceedings of the Thirteenth EuroSys Conference, 2018.
[10] F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner and G. Monfardini, "The graph neural network model," IEEE Transactions on Neural Networks, 20.1 (2008): 61-80.
[11] P. W. Battaglia, et al., "Relational inductive biases, deep learning, and graph networks," arXiv preprint arXiv:1806.01261, 2018.
[12] C. Dwork, "Differential privacy," Automata, Languages and Programming: 33rd International Colloquium, ICALP, 2006.
[13] R. C. Geyer, T. Klein and M. Nabi, "Differentially private federated learning: A client level perspective," arXiv preprint arXiv:1712.07557, 2017.
[14] A. Shamir, "How to share a secret," Communications of the ACM, 1979.
[15] P. Blanchard, et al., "Machine learning with adversaries: Byzantine tolerant gradient descent," Advances in Neural Information Processing Systems, 2017.
[16] T. Hastie, J. Friedman and R. Tibshirani, "The Elements of Statistical Learning: Data Mining, Inference, and Prediction," Springer, 2009.
[17] M. Shayan, C. Fung, C. J. M. Yoon and I. Beschastnikh, "Biscotti: A blockchain system for private and secure federated learning," IEEE Transactions on Parallel and Distributed Systems, 2020.
[18] G. Zyskind and O. Nathan, "Decentralizing privacy: Using blockchain to protect personal data," IEEE Security and Privacy Workshops, 2015.
[19] W. Ye, X. An, X. Yan, M. Hamad and S. Steinhorst, "FLaaS6G: Federated Learning as a Service in 6G using distributed data management architecture," GLOBECOM, 2022.
[20] P. Paillier, "Public-key cryptosystems based on composite degree residuosity classes," Advances in Cryptology—EUROCRYPT, 1999.
[21] Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, 1998.
