
Quantum federated learning in healthcare: The shift from development to deployment and from models to data

Sabre Kais (kais@purdue.edu)
Purdue University
Amandeep Bhatia
Purdue University
Muhammad Alam
Purdue University

Article

Posted Date: May 15th, 2023

DOI: https://doi.org/10.21203/rs.3.rs-2723753/v1

License: This work is licensed under a Creative Commons Attribution 4.0 International License. Read Full License

Additional Declarations: (Not answered)


Quantum federated learning in healthcare: The shift from development to deployment and from models to data

Amandeep Singh Bhatia 1,∗, Sabre Kais 1,2,∗, Muhammad Ashraful Alam 1
1 School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA
2 Department of Chemistry, Purdue University, West Lafayette, IN, USA
∗ bhatia87@purdue.edu; kais@purdue.edu; alam@purdue.edu

Healthcare organizations hold a high volume of sensitive data, while traditional technologies have limited storage capacity and computational resources. The prospect of sharing healthcare data for machine learning is made more arduous by firm regulations related to patient privacy. The balanced protection of confidentiality, integrity, and availability of healthcare data has become a major concern beyond classical data security considerations. In recent years, federated learning has offered a solution that accelerates distributed machine learning while addressing concerns related to data privacy and governance. Currently, the blend of quantum computing and machine learning has received significant attention from academic institutions and research communities. Quantum computers have shown the potential to bring huge benefits to the healthcare sector through efficient distributed training across several quantum nodes. The ultimate objective of this work is to develop a quantum federated learning (QFL) framework to tackle the optimization, security, and privacy challenges in the healthcare and clinical industries for medical imaging tasks. In this work, we propose federated quantum convolutional neural networks (QCNNs) with distributed training across edge devices. To demonstrate the feasibility of the proposed QFL framework, we performed extensive experiments on medical datasets (Pneumonia MNIST and CT-kidney disease analysis), which are non-independently and non-identically partitioned among the healthcare institutions/clients. The quantum federated global model maintained high classification testing accuracy and generalizability and outperformed the locally trained clients regardless of how unevenly the medical data were distributed among the clients. The global model achieved the best performance compared to the local clients in terms of the area under the receiver operating characteristic curve (AUC-ROC), reaching 0.953 on the pneumonia dataset and an average of 0.98 over all classes on the CT-kidney dataset. Moreover, a client selection mechanism is proposed to reduce the computation overhead at each communication round, which effectively improves the convergence rate. The proposed quantum federated learning framework is validated and assessed via large-scale simulations. Based on our results from numerical simulations, the deployment of distributed and secure quantum machine learning algorithms to enable scalable and privacy-preserving intelligent healthcare applications would be extremely valuable.

I. INTRODUCTION

Emerging technologies like artificial intelligence (AI), big data, and cloud technologies have the maximum impact on healthcare and have shown the potential to improve medical outcomes and population health [1]. The benefits of AI have been consistently shown from mental healthcare to chronic disease management to medical imaging. But training AI systems needs access to a large amount of data, and security presents AI with a new set of challenges [2]. Most healthcare organizations are becoming more vulnerable than ever due to the possibility of data breaches when sending data over networks [3]. Traditionally, there are several challenges in deploying practice management software and managing electronic medical records due to sensitive information related to medical conditions, laboratory results, and personal details of patients. Healthcare data are incredibly valuable assets, and responsible execution of AI in healthcare needs a focus on privacy and security [4]. Due to the increase in the volume of sensitive medical data, the current centralized approach is not effective for health analytics and data learning.

According to World Health Organization (WHO) reports, pneumonia alone has become the world's leading infectious cause of death in children under 5 years of age. Globally, at least one child dies every 45 seconds in developing countries [5]. The main goal is rapid and accurate diagnosis of pneumonia in symptomatic patients. Health workers diagnose it via a physical exam (i.e., checking for abnormal breathing patterns) or using the main medical imaging method, i.e., radiographic data (chest x-ray). However, the interpretation of radiographic findings is often delayed by several hours and heavily affected by the clinical conditions of the patient and past imaging history. Moreover, chronic kidney disease (CKD) has become the leading cause of death among dialysis patients, affecting >10% of the world's population [6]. WHO ranked CKD 10th in 2020, and it is expected to jump to 5th among the leading causes of death by 2040 [7]. Commonly, kidney illnesses that impede its function are caused by renal cell carcinoma (kidney tumor), nephrolithiasis (kidney stone), and cyst formation. Today, healthcare data privacy and security concerns are top barriers to adopting data analytical methodologies, and several organizations hesitate to share data or participate in big data communities.

Health-related data are very sensitive. Federated learning techniques have been introduced as a solution that enhances data privacy without any raw data leaving the devices [8]. It can help radiologists make an accurate and faster diagnosis of pneumonia and mitigate chronic conditions early.

In recent years, the concept of federated learning has emerged as a research paradigm, actively driven by scientists from Google. Since the introduction of Google's federated learning-based Android app called Gboard, it has received a significant response for solving the data island problem [9]. It has been applied in several domains for collaborative learning while protecting data privacy, such as smart cities, finance, and healthcare. Federated learning supports researchers in a common objective: to propose a model that learns from the shared data in the cloud and generalizes to private data at local nodes without transferring the private data explicitly [10, 11]. It has emerged as a promising solution for realizing cost-effective, large-scale technological applications with advanced privacy protection. An overview of the quantum federated learning (QFL) framework for the healthcare sector is described in Fig 1. The basic components in a federated learning process are a central node and several client nodes. In our case, the clients/devices representing hospitals/healthcare institutions hold the medical data. The central node holds the global quantum model and broadcasts it to all clients for training. Each client performs the training on its local medical data and sends the model updates to the central node. After receiving updates from the clients, the server performs an aggregation process. The current state of the global model is thus updated, and the updated model is sent back to all clients for the next round of training. This process repeats until the global model converges.

The realm of quantum machine learning (QML) has become a rapidly growing framework to solve computational problems dealing with quantum data [12, 13]. With the advent of noisy intermediate-scale quantum (NISQ) devices, it has become an effective new tool for advancing estimation capabilities in several industries ranging from healthcare to finance [14]. Since the introduction of variational quantum algorithms, and more generally quantum neural networks, quantum computing approaches have been widely applied to learning algorithms, as witnessed by the rapid increase in highly impactful quantum machine learning research articles [15–17]. The field is in the initial phase of research and development, and its potential applications continue to expand. Variational quantum circuits have received significant attention and triggered an extensive amount of work to improve the performance of classical machine learning algorithms. Researchers have formulated theories and executed machine learning algorithms in almost every field of science, such as physics, chemistry, biomedicine, and healthcare, to solve problems beyond the capacity of classical computers. In recent years, several QML algorithms have been employed in the healthcare domain, such as quantum neural networks [18], variational quantum classifiers [19, 20], quantum support vector machines [21], quantum convolutional neural networks [22–24], quantum Boltzmann machines [25], and many more.

Motivated by the structure of classical convolutional neural networks (CNNs), a QCNN is designed in the quantum domain to mimic the role of CNNs [26]. In the QCNN architecture, a series of convolutional layers is incorporated with pooling operations, followed by a fully-connected unitary layer, as shown in Fig 2. The first step is to encode the image data into a quantum system. There exist several encoding strategies, i.e., angle encoding, amplitude encoding, and basis encoding. An n-qubit input state is given as an input in a Hilbert space H_in, which is sent through quantum circuits using convolutional filters, as schematically shown in Fig 2. In a quantum convolutional layer (U), parameterized one-qubit, two-qubit, and four-qubit gates are applied between neighboring qubits. The structure of the quantum circuits used for creating the two- and four-qubit unitaries is shown in Fig 3. Firstly, Rx(θ1), Ry(θ2), Rz(θ3) rotations are applied to all qubits separately. Further, parametric two-qubit Z⊗Z and X⊗X rotations about ZZ and XX are applied to each pair of qubits, followed by single-qubit rotations. In the quantum pooling operation (V), a fraction of the qubits are measured, and their outcomes control the unitary applied to nearby qubits. The pooling operation can be thought of as a map from a two-qubit Hilbert space H_RS = H_R ⊗ H_S to a single-qubit Hilbert space H_W [27]. Thus, the pooling operations effectively reduce the feature map size while preserving characteristic features of the input state by applying single-qubit rotations and controlled-X gates, as shown in Fig 3b. The convolution layers and pooling operations are applied until the feature space dimension is sufficiently small. After the final pooling operation, a fully-connected unitary (F) is applied to the remaining qubits. Finally, the output of the circuit is measured as the expectation value of some fixed number of output qubits.

In QCNNs, the nonlinearities come from reducing the number of degrees of freedom. A QCNN consists of O(log(n)) variational parameters for n input qubits [28]. Thus, the shallow-depth architecture of QCNNs makes them suitable for noisy intermediate-scale quantum (NISQ) devices. QCNNs exploit entanglement and have the potential to capture local correlations beyond the reach of CNNs. They have been successfully implemented for quantum phase recognition [29], image classification [30], and high-energy physics data analysis [31].
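To make the convolution-plus-pooling structure concrete, the following is a minimal sketch of one QCNN stage. The paper does not name its software stack, so PennyLane is assumed here purely for illustration, and the measure-and-condition pooling unitary is approximated by a controlled rotation, a common NISQ-friendly simplification; the exact parameter counts of the filters in Fig 3 are not reproduced.

```python
# Illustrative QCNN stage in PennyLane (an assumed framework; the paper does
# not specify its implementation). Pooling below replaces the paper's
# measured-and-conditioned unitary with a controlled rotation.
import numpy as np
import pennylane as qml

n_qubits = 8  # enough for a 256-amplitude (2**8) input vector
dev = qml.device("default.qubit", wires=n_qubits)

def conv_block(p, wires):
    # Two-qubit filter in the spirit of Fig 3a: single-qubit rotations
    # followed by ZZ and XX entangling rotations.
    for w in wires:
        qml.RX(p[0], wires=w)
        qml.RY(p[1], wires=w)
        qml.RZ(p[2], wires=w)
    qml.IsingZZ(p[3], wires=wires)
    qml.IsingXX(p[4], wires=wires)

def pool_block(p, source, sink):
    # Reduce two qubits to one: entangle, then ignore `source` downstream.
    qml.CRZ(p[0], wires=[source, sink])
    qml.CNOT(wires=[source, sink])

@qml.qnode(dev)
def qcnn(x, params):
    qml.AmplitudeEmbedding(x, wires=range(n_qubits), normalize=True)
    for i in range(0, n_qubits - 1, 2):          # convolution layer (U)
        conv_block(params[0], wires=[i, i + 1])
    for i in range(0, n_qubits - 1, 2):          # pooling layer (V)
        pool_block(params[1], source=i, sink=i + 1)
    return qml.expval(qml.PauliZ(n_qubits - 1))  # readout on a kept qubit

params = np.random.uniform(0, np.pi, size=(2, 5))
x = np.random.rand(2 ** n_qubits)                # stand-in for a PCA-reduced image
print(qcnn(x, params))
```

Repeating the two layer types until few qubits remain, and ending with a trainable fully connected unitary, yields the full architecture described above.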

[Figure 1: three-panel schematic of the QFL workflow across hospital sites (a-c)]

FIG. 1. An overview of the quantum federated learning framework (QFL) applied in a healthcare sector. Each hospital holds patient-sensitive data, which are never exposed to the global server directly. (a) Initially, the untrained state of the global model (quantum convolutional neural network) is broadcast by the central server to all quantum nodes (hospitals) for training, where the actual data reside. (b) At each hospital, the quantum models are trained locally and send back the gradients/updates (θ1, θ2, ..., θn) of their model to the server. (c) The server performs the secure aggregation and sends back the updated state of the model for the next round of training.

A. Motivation

The main motivation is to unlock the potential benefits of quantum computing and federated learning in the healthcare sector, including creating better diagnostic tools, improving medical image analysis, and ensuring meaningful results for rare diseases when smaller healthcare institutions lack enough data to train an accurate predictive model. Motivated by the efficient training of quantum machine learning models, a generalized quantum federated model can be created to assist frontline physicians worldwide, because it can provide an extraordinary opportunity for data science technologies to enhance the quality of care delivery. It can become a reality for healthcare industries to access more data and efficiently train their models.

The goal of this paper is to offer a federated quantum machine learning framework based on quantum convolutional neural networks performing distributed training across quantum nodes. QCNNs can take advantage of quantum parallelism to speed up the training process, allowing them to learn from large datasets more quickly. By leveraging the principles of quantum computing, certain operations can be performed more efficiently and securely in federated learning settings. The proposed framework has the potential to incorporate privacy, security, and the expedited processing of distributed data. The deployment of such an innovative framework in the healthcare industry can be beneficial for the betterment of healthcare. To the best of our knowledge, no quantum convolutional neural networks have been proposed that attempt to implement federated learning over non-independently and non-identically distributed (non-iid) medical datasets.

The remainder of this paper is organized as follows: Section 2 covers the related work. Section 3 contains the description of quantum federated learning optimization with the non-iid distribution. Section 4 presents the problem formulation and the description of the medical datasets used in the study. Section 5 presents the experimental results on pneumonia detection and kidney abnormality detection, along with discussions of the findings and outcomes. Finally, the concluding remarks are reported.

II. RELATED WORK

Today's implementation of variational quantum algorithms on quantum computers is catered towards the noisy intermediate-scale quantum (NISQ) era. In practice, researchers have produced just a handful of works relating quantum machine learning and federated learning. Initially, Chen et al. [32] demonstrated a quantum federated learning framework built on hybrid quantum-classical transfer learning. The proposed framework was shown to provide efficient training and to converge in fewer communication rounds compared to centralized models. Xia et al. [33] set up a quantum federated learning framework based on quantum neural networks. Li et al. [34] considered a private single- and multi-party quantum protocol for distributed learning with blind quantum computing, which was shown to have the ability to resist several attacks with differential privacy. Yamany et al. [35] proposed a quantum federated learning architecture based on a quantum-behaved particle swarm optimization algorithm for transportation systems, which is claimed to be resilient against adversarial attacks in federated learning.

[Figure 2: circuit diagram of the QCNN with data encoding, convolution (U), pooling (V), and fully connected (Uf) blocks]

FIG. 2. A schematic representation of the quantum convolutional neural network. It consists of four components: encoding of data into a quantum state (in red rectangle), quantum convolution (U) (in green) and pooling operation (V) (in blue) using parameterized quantum gates, and a two-qubit fully connected gate (Uf). U1, U2, U4 represent the unitary filters to extract the features, and l1, l2 denote the number of layers over which the convolutional filters are repeated.

Later, Xia et al. [36] compared the performance of classical and quantum federated learning against Byzantine attacks. Chehimi et al. [37] proposed a quantum federated learning framework by introducing the first quantum federated dataset for distributed learning and training local quantum convolutional neural networks for binary classification. It has been shown that the proposed model has the potential to perform efficiently on non-iid distributed datasets.

Recently, Zhang et al. [38] introduced a secure aggregation strategy based on a quantum entangled state for securing model parameters against eavesdropping in federated settings, demonstrated by implementing quantum neural networks, classical convolutional neural networks, and logistic regression models. Qi [39] implemented a variational quantum circuit using a quantum natural gradient descent optimizer in federated learning settings; the performance of the proposed algorithm was compared with conventional SGD methods and shows faster convergence on handwritten digit datasets.

[Figure 3: circuits for the two-qubit filter (a), the pooling operation (b), and the four-qubit filter (c)]

FIG. 3. Representation of quantum circuits used in the quantum convolutional and pooling operations of the QCNN. (a) The structure of a 2-qubit filter to create a two-qubit unitary, consisting of 14 parameters. (b) The pooling operation circuit to reduce the entanglement from two qubits to one using 6 parameters. (c) The structure of a 4-qubit filter with an entanglement of qubits using 22 parameters. Rx(θ1), Ry(θ2), Rz(θ3) are the single-qubit rotations around the x, y, z axes, respectively, and the parametric 2-qubit Z⊗Z and X⊗X are rotations about ZZ and XX applied to pairs of qubits.

III. QUANTUM FEDERATED LEARNING OPTIMIZATION

In this section, we describe the quantum federated learning algorithm to achieve better medical imaging quantum models for individual sites. Healthcare organizations/hospitals working with centralized servers share their raw data and face privacy challenges in training a machine learning model. By leveraging the power of quantum computers with federated learning, we can enable the creation of robust quantum models that can work well across multiple institutions constrained by sparse or confidential data.
A. Federated averaging quantum convolutional neural network (FedAvg-QCNN)

Motivated by the potential of quantum computers to significantly improve the performance of machine learning systems, our objective is to develop a quantum federated learning algorithm with the potential for practical impact. Decentralized training across several quantum computers could improve the training time and mitigate data privacy concerns, because the majority of the data stays on each client's device.

B. Non-identical Data Distribution

In our data partition task, we follow the Dirichlet distribution to generate the non-identical data partition among the clients. The Dirichlet distribution is parameterized by α and is commonly used to model various real-world phenomena, ranging from weather to stock market prices. The Dirichlet is a type of continuous probability distribution over a probability simplex, i.e., a collection of numbers that add up to 1. It is often called a multivariate beta distribution [40]. In this paper, we draw q_i ∼ Dir(α) and assign a q_{i,j} segment of the instances of class i to client j, where α is a concentration parameter, and we use Dir(.) to represent the data partitioning strategy.

The benefit of using the Dirichlet distribution is that it gives us the flexibility to control the degree of data heterogeneity with a single hyper-parameter. The use of the Dirichlet distribution to model real-world distributions by varying α is visualized in Fig 4. If α is set to a higher value, the distribution is more balanced; an imbalanced or sparse distribution can be created by setting α to a lower value. A Dir(α=1.0) distribution is more strongly non-iid than Dir(α=5.0). This allows us to allocate different amounts of data samples to each client. In quantum federated learning, the Dirichlet distribution is used to sample data for non-iid settings. To examine the impact of α, we run our experiments with different values of α and different numbers of clients in non-iid settings. The probability distribution over a set of n categories [41] is given as

\[ P = \{x_1, x_2, \ldots, x_n\}, \quad \text{where } 0 \le x_i \le 1 \text{ and } \sum_{i=1}^{n} x_i = 1 \tag{1} \]

It is parameterized by n parameters α = {α_1, α_2, ..., α_n} and is formally represented as Pr(x; α) ∼ Dir(α), with

\[ \Pr(x;\alpha) = \frac{1}{B(\alpha)} \prod_{i=1}^{n} x_i^{\alpha_i - 1} \tag{2} \]

where B(α) is a normalization constant, which can be stated in terms of the gamma function (Γ) as

\[ B(\alpha) = \frac{\prod_{i=1}^{n} \Gamma(\alpha_i)}{\Gamma\!\left(\sum_{i=1}^{n} \alpha_i\right)} \tag{3} \]
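As an illustration of this partitioning strategy, the sketch below draws per-class client proportions q_i ~ Dir(α) with NumPy and splits each class's sample indices accordingly; the function name and bookkeeping are ours, not the paper's.

```python
# Minimal sketch of the Dirichlet-based non-iid partition: each class is
# split across clients according to proportions sampled from Dir(alpha).
import numpy as np

def dirichlet_partition(labels, n_clients, alpha, seed=0):
    rng = np.random.default_rng(seed)
    client_indices = [[] for _ in range(n_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        # q_{i,j}: fraction of class `cls` assigned to client j.
        proportions = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for j, part in enumerate(np.split(idx, cuts)):
            client_indices[j].extend(part.tolist())
    return client_indices

labels = np.random.randint(0, 2, size=5856)   # e.g. pneumonia vs. normal
parts = dirichlet_partition(labels, n_clients=4, alpha=0.5)
print([len(p) for p in parts])  # lower alpha -> more unbalanced splits
```

A smaller α concentrates each class on a few clients, reproducing the sparse, imbalanced splits shown in Fig 4.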

[Figure 4: trivariate Dirichlet simplex plots for different values of α]

FIG. 4. Visualization of the Dirichlet distribution to replicate real-world data distributions based on the concentration parameter (α), which determines the shape of the distribution. Setting a lower α generates a sparse distribution, whereas a higher α gives a denser distribution. For example, v1, v2, and v3 are the classes, and their samples are represented by red, green, and blue colors, respectively, in the trivariate Dirichlet distributions. With Dir(α=0.1), a more imbalanced data distribution is created. As α increases, it is more likely to have a balanced distribution over all the classes.

C. Quantum federated dataset

Next, we create a quantum federated dataset based on Dir(α) for several non-iid clients. The set of clients (C_n) can be represented by an orthonormal basis {|C_n^i⟩ ∈ H^{C_n}}_{i=1}^{m}. Each client (c ∈ C_n) holds its own local dataset containing n_c samples from the whole dataset (D), represented as states {|ψ_i^c⟩}_{i=1}^{n_c}. Thus, the quantum federated dataset is a list of client ids, where each client's local dataset is represented as

\[ \rho_c = \frac{1}{n_c} \sum_{i=1}^{n_c} |\psi_i^c\rangle \langle \psi_i^c| \tag{4} \]

The distribution of the dataset is imbalanced among the clients; therefore, the density matrix of each client will differ from the others. For a given classification training task, the dataset consists of labeled inputs in pairs {|ψ_i⟩, y_i}_{i=1}^{n_c} ∈ D, where |ψ_i⟩ are the input states and y_i is the class label (either 0 or 1) for a binary classification task. The next step is to encode each client's data and batch them accordingly.
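A small numerical sketch of Eq. (4): each client's local dataset is summarized by the uniform mixture of its encoded pure states. The helper below is illustrative only.

```python
# Sketch of Eq. (4): rho_c as the uniform mixture of |psi_i^c><psi_i^c|.
import numpy as np

def client_density_matrix(states):
    # states: (n_c, d) array of normalized state vectors |psi_i^c>.
    d = states.shape[1]
    rho = np.zeros((d, d), dtype=complex)
    for psi in states:
        rho += np.outer(psi, psi.conj())
    return rho / len(states)

states = np.random.rand(10, 4) + 1j * np.random.rand(10, 4)
states /= np.linalg.norm(states, axis=1, keepdims=True)
rho = client_density_matrix(states)
print(np.trace(rho).real)  # = 1.0 for a valid density matrix
```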
[Figure 5: three-stage workflow diagram (pre-processing, FedQCNN model, deployment)]

FIG. 5. Schematic representation of the FedAvg-QCNN algorithm workflow. It consists of three main stages: pre-processing (select the data sources, create a list of client ids, and distribute the dataset among the clients based on Dir(α)), federated training of the FedQCNN model (broadcast the initial QCNN model to all clients, encode the local datasets at all clients, perform the training on the local datasets and send back the results, perform the secure aggregation at the server, send back the updated model, and repeat), and deployment of the quantum model.

D. Local quantum models

We consider a quantum federated learning (QFL) framework (Fig 1) with quantum convolutional neural networks (QCNNs) (Fig 2), consisting of several clients (quantum nodes) and a central server. To efficiently simulate quantum circuits, each medical image is reduced to an n-dimensional vector using principal component analysis (PCA) with 90-99% variance. After dimension reduction, the first step is to encode the classical data into a quantum state.

Algorithm 1: Federated averaging (FedAvg-QCNN) algorithm

1: Input: C_n denotes the number of clients such that c = 1, 2, ..., C_n; R is the number of communication rounds; B is the batch size; E represents the local epochs at the client side; s_k is the number of samples available with client k; and η_c, η_s are the learning rates of the client and server, respectively. Based on the QCNN architecture, the learnable parameters of each quantum node are represented as f_θc = (C, P, M), where C, P, M represent the quantum convolution, pooling operation, and measurement on specific qubits, respectively, and θ_c is the vector of model parameters for client c.
2: ClientModelUpdate(c, θ):
3: for i = 1 to E do
4:     for b ∈ B do
5:         θ_c ← θ_c − η_c ∇L_c(θ_c, b)
6:     end for
7: end for
8: return θ to Server
9: SecureServerAggregation:
10: Initialize θ_shared^1
11: for r = 0 to R−1 do
12:     Send the global server parameters θ_shared^r to each client.
13:     for each c ∈ C do
14:         θ_local^{c,r+1} ← ClientModelUpdate(c, θ_shared^r)
15:     end for
16:     θ_shared^{r+1} ← θ_shared^r − η_s (1/C) Σ_{c=1}^{C} θ_local^{c,r+1}
17: end for

We consider the amplitude encoding strategy for the preparation of quantum states in the QCNN, which assumes that V = R^{2^n} and that vector inputs are normalized such that ∥v∥² = Σ_i |v_i|² = 1. It represents the input data as the amplitudes of an n-qubit quantum state |φ(v)⟩ as

\[ |\varphi(v)\rangle = \frac{1}{\lVert v \rVert} \sum_{j=1}^{N} v_j\,|j\rangle \tag{5} \]

where |j⟩ denotes the j-th computational basis state. This enables us to encode a vector of N double-precision numbers into log2(N) qubits.

Suppose there exist C_n healthcare institutions/hospitals/clients where the medical imaging data reside. The local data of each client are distributed based on the Dirichlet distribution, i.e., the federated quantum data are non-iid. Each client has its own local data and a local federated QCNN model with which to perform the medical classification task. After encoding the classical data into federated quantum data, each client's local QCNN (f) consists of a stack of quantum convolutional layers and quantum pooling operations, followed by a fully connected layer. The vector of trainable parameters (θ) is represented by θ_c = (U, V, F) for each client (c), where c = 1, 2, 3, ..., C_n. Finally, the measurement of expectation values is performed over the output state q_out and is represented as f_{U,V,F}(|ψ_n⟩). The quantum circuit parameters are updated to minimize the cost function, represented as:

\[ \mathcal{F}(\theta_c) = \frac{1}{n_c} \sum_{i=1}^{n_c} \left( y_i - f_{\{U,V,F\}}(|\psi_i\rangle) \right)^2 \tag{6} \]

where n_c is the number of samples associated with client (c). In our federated quantum setup, all quantum nodes/clients C_n participate in the collaborative training of the global QCNN model to gain experience from the vast range of data located at different medical sites.
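A minimal sketch of the client-side routine (lines 2-8 of Algorithm 1) with the MSE cost of Eq. (6). Plain mini-batch gradient descent with a finite-difference gradient stands in for the ADAM/parameter-shift machinery a real quantum client would use; `qcnn_forward` is a hypothetical stand-in for the client's QCNN expectation value f_{U,V,F}.

```python
# Sketch of ClientModelUpdate: E local epochs of mini-batch gradient
# descent on the MSE cost of Eq. (6). Illustrative only.
import numpy as np

def mse_cost(params, batch_x, batch_y, qcnn_forward):
    preds = np.array([qcnn_forward(x, params) for x in batch_x])
    return np.mean((batch_y - preds) ** 2)          # Eq. (6)

def numeric_grad(cost, params, eps=1e-4):
    # Finite-difference gradient; a real implementation would use the
    # parameter-shift rule or an autodiff framework instead.
    grad = np.zeros_like(params)
    for i in range(params.size):
        shift = np.zeros_like(params)
        shift.flat[i] = eps
        grad.flat[i] = (cost(params + shift) - cost(params - shift)) / (2 * eps)
    return grad

def client_model_update(params, data_x, data_y, qcnn_forward,
                        epochs=1, batch_size=32, lr=0.001):
    for _ in range(epochs):                          # E local epochs
        for s in range(0, len(data_x), batch_size):  # for b in B
            bx, by = data_x[s:s + batch_size], data_y[s:s + batch_size]
            cost = lambda p: mse_cost(p, bx, by, qcnn_forward)
            params = params - lr * numeric_grad(cost, params)  # line 5
    return params
```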

E. Global model

Initially, all the healthcare institutions or clients hold their local datasets, and the central server holds a global federated QCNN model. In the beginning, the central server broadcasts the untrained quantum model parameters (θ_s^r) to all clients (C_n) for collaborative training. In each communication round (r), the clients perform training on their local datasets to improve the received quantum model and generate the updated parameters using the client's ADAM optimizer with a fixed learning rate (η_c) as

\[ \theta_c^r = \theta_s^r - \eta_c \nabla_{\theta_c^r} \mathcal{F}(\theta_c^r) \tag{7} \]

where F(θ_c^r) denotes the cost evaluated on the quantum circuit parameters received by client (c) at round (r). After training their quantum models locally, the clients send their local updates to the central server. Finally, the server aggregates the parameters of all quantum nodes to update its global QCNN model using the server's ADAM optimizer with a learning rate (η_s). For the next round of training (r+1), the server sends the new model parameters to all C_n clients as

\[ \theta_s^{r+1} \leftarrow \theta_c^r - \eta_s \frac{1}{C_n} \sum_{c=1}^{C_n} \theta_c^{r+1} \tag{8} \]

where θ_c^{r+1} are the parameters at communication round (r+1). The objective is to find an optimal global federated QCNN model by minimizing the global loss function. The steps of the quantum federated learning algorithm based on the QCNN are given in Algorithm 1. The exchange of local and global parameters repeats until the global model converges after some communication rounds. The schematic representation of the algorithm workflow is described in Fig 5. The list of parameters used in the experiments is given in Table I.

TABLE I. List of parameters used in the experiments. Each client locally trains the model for E local epochs before sending updates to the server; C is the total number of clients used in the experiments; R denotes the total number of communication rounds; q is the number of qubits used to encode the classical data; B is the batch size; η_c, η_s are the learning rates of the client and server used to update the parameters; and Dir(α) represents the Dirichlet distribution, i.e., the data partitioning strategy based on α among the clients.

Description | Value
Local epochs per communication round (E) | (1, 3)
Number of clients (C) | (2, 4)
Number of communication rounds (R) | (150, 200)
Number of qubits (q) | 8
Batch size (B) | 32
Server learning rate (η_s) | 0.01
Client learning rate (η_c) | 0.001
Optimizer | ADAM
Dir(α) for pneumonia dataset | (0.5, 5.0)
Dir(α) for CT-kidney dataset | (0.5, 1.0, 10.0)
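One communication round from the server's perspective (Eqs. (7)-(8)) can be sketched as below, under the assumption of simple damped FedAvg-style averaging; the paper's server additionally applies an ADAM optimizer, which is omitted here. `local_update` is any client routine such as the one sketched above.

```python
# Sketch of one secure-aggregation round (Eqs. 7-8), assuming damped
# averaging of client parameters in place of the server-side ADAM step.
import numpy as np

def server_round(global_params, client_data, local_update, eta_s=0.01):
    # Broadcast the current global parameters; collect local updates (Eq. 7).
    local_params = [local_update(global_params.copy(), cx, cy)
                    for (cx, cy) in client_data]
    avg = np.mean(local_params, axis=0)   # (1/C_n) * sum over clients
    # Move the global model toward the client average (our reading of Eq. 8).
    return global_params - eta_s * (global_params - avg)
```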

IV. PROBLEM FORMULATION

In this section, we evaluate our quantum federated learning framework in diagnosing common diseases. The examination of x-ray images is prone to subjective variability and is a difficult task, although centralized methods of detecting pneumonia in x-ray images have proven to be accurate and give patients new hope in their recovery. In practice, it is challenging to build a database with enough samples to train a model, and collaboration between healthcare organizations is often problematic due to privacy concerns. This study aims to develop a quantum federated non-iid dataset consisting of x-ray images to detect pneumonia and CT scans to detect kidney abnormalities using a quantum federated learning framework. Moreover, a client selection scheme is proposed to improve the communication efficiency between the quantum nodes. All the experiments are performed with a non-iid set-up based on the Dirichlet distribution over the local quantum nodes.

A. Datasets description

[Figure 6: sample images from the Pneumonia MNIST and CT-kidney datasets]

FIG. 6. Samples of medical images. (a) Pneumonia MNIST consists of two classes: normal (no sign of infection) and pneumonia (infection caused by bacteria). (b) The CT-kidney dataset for kidney disease analysis consists of four classes: cyst (fluid-filled sac), normal (no infection), stone (tiny crystals), and tumor (abnormal growth of cells in the kidney). The samples of these medical images are taken from [42, 43].

For quantum federated learning benchmarks, we performed the experiments on two medical datasets: Pneumonia MNIST [42] and the CT-kidney dataset [43]. Initially, the evaluation is conducted for the diagnosis of pediatric pneumonia using 5,856 chest x-ray images of Pneumonia MNIST. The task is a binary classification of normal (no sign of infection) and pneumonia (infection caused by bacteria, viruses, or fungi).
For the multi-classification task in radiology and healthcare imaging settings, additional experiments are performed for the diagnosis of kidney diseases on 11,368 images of the CT-kidney dataset. It contains 3,708 cyst (fluid-filled sac), 4,000 normal (no infection), 1,377 stone (tiny crystals), and 2,283 tumor (abnormal growth of cells) images from CT radiography. Some random samples of the pneumonia and CT-kidney datasets are shown in Fig 6.

V. RESULTS AND DISCUSSION

To demonstrate the robustness and generalizability of the proposed quantum federated learning algorithm, we studied the non-iid data partitioning strategy in the experiments based on the Dirichlet distribution. Our initial study aimed to analyze the effectiveness of the QCNN on the task of pneumonia detection in a decentralized training manner. The study encompassed 5,856 x-ray images (including pneumonia and normal) distributed across 2 and 4 client sites. We used 80% of the dataset for training and the remaining 20% for testing. Each client who participated in the training process contains a different number of samples of each class. Thus, the mean of the whole sample size of each client varies from that of the other clients.

A. Application to detect pneumonia

The original size of each x-ray image is 28*28. In order to efficiently simulate quantum circuits, each x-ray image is reduced to 16*16 (i.e., 256 components) using principal component analysis (PCA) while keeping 99% of the variability and maintaining the principal properties of the original image. Further, in pre-processing, each image is normalized to one so that the vectors can be encoded in the amplitudes of a many-qubit superposition state using amplitude encoding (8 qubits).
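The pre-processing just described can be sketched with scikit-learn (an assumed tool; the paper does not name one): flatten each 28*28 image, keep 256 principal components, and L2-normalize so each vector is a valid set of amplitudes for an 8-qubit state.

```python
# Sketch of the PCA + normalization pre-processing step, assuming scikit-learn.
import numpy as np
from sklearn.decomposition import PCA

def preprocess(images):                      # images: (n_samples, 28, 28)
    flat = images.reshape(len(images), -1)   # 784 raw pixel features
    pca = PCA(n_components=256)              # 256 = 2**8 amplitudes (8 qubits)
    reduced = pca.fit_transform(flat)
    norms = np.linalg.norm(reduced, axis=1, keepdims=True)
    return reduced / norms                   # unit vectors, ready to encode

X = np.random.rand(1000, 28, 28)             # stand-in for Pneumonia MNIST
Xq = preprocess(X)
print(Xq.shape, np.linalg.norm(Xq[0]))       # (1000, 256), 1.0
```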
In the experiments, the testing loss against the number of communication rounds or training epochs is used as a metric to measure the quality of the training capability of all clients, including the global FedQCNN model. Fig 7(a-d) describes the steps of the local training process between the four clients and the global quantum model. The ratio of the pneumonia and normal classes in the pneumonia dataset can be found in Fig 7e. The non-iid distribution among two and four clients based on Dir(α) is given in Fig 7(f, g). We intend to measure the influence of each client on the training of the global model. The testing loss of client2 is higher due to insufficient training samples (i.e., 17%) in Fig 7h, leading to lower testing accuracy in comparison to client1 with 83% of the training samples, while the global FedQCNN achieves 90.5% testing accuracy, as shown in Fig 7k. Further, the dataset is distributed randomly among four clients based on α = 0.5, where client1 and client2 hold only 8% and 9% of the samples, and the other two clients hold 42% of the training samples of the normal class. The total local dataset contributions of each class for all clients are presented at the top of the box plots in Fig 7(k-m). Due to data imbalance, the testing losses of client1 and client2 are higher than those of the other two clients, as shown in Fig 7i. Nevertheless, training converges after about 100 communication rounds and achieves around 85% accuracy on the testing set. Fig 7j illustrates that the performance of the global model does not decline when distributing the dataset among four clients. FedQCNN has a higher testing accuracy of 91% and converges faster compared to the four locally trained clients, as shown in Fig 7l.

As depicted in Fig 7g, we also partitioned the dataset by setting Dir(α = 5.0), so that all clients contain an adequate quantity of x-ray images from both classes. In this case, all clients achieve around 88-90% accuracy, and the global model accuracy remains at 91%, as shown in Fig 7m. We also computed the AUC (i.e., area under the curve) for each locally trained model, in Fig 7n. Here, the global model outperforms the locally trained clients with AUC=0.953, while the performance of client1 is lower, with AUC=0.92, in comparison to the other clients. Further, we compared the performance of a classical CNN with the QCNN in federated settings, as shown in Fig 7o. Due to the efficient training of quantum circuits, the QCNN model achieves an accuracy of 91% with only 160 training parameters, although the classical CNN achieves a slightly higher testing accuracy of 92% with a lower loss. The confusion matrix for the testing set of FedQCNN (C = 4, α = 0.5) is given in Fig 7p. The distribution of predicted values on the test set by the global model is depicted in Fig 7q. The global quantum algorithm achieved better results than any client trained on local data and proved to be robust regardless of how imbalanced the data distribution across the clients was.

B. Application to detect kidney abnormalities

Next, we applied the concept of quantum federated learning to the diagnosis of kidney diseases. We aimed to demonstrate the feasibility of training a QCNN for a multi-classification task, i.e., detection of kidney abnormalities in federated settings, which does not require data to be stored centrally or exchanged between centers. A dataset, namely the "CT-Kidney dataset", is collected [43], consisting of 512*512 size images for diagnosing kidney cyst, normal, stone, and tumor. It has a higher dimensionality than the previous example; therefore, great care has to be taken when encoding the classical data into a small number of qubits. Here, we slightly changed our approach to applying the QCNN in order to maintain the spatial structure of the CT-scan image. The illustration of a QCNN for CT-kidney images is given in Fig 8. Each CT-scan image is reduced to 12*12 (i.e., 144 components) using PCA while keeping 99% of the variance and preserving as much variability as possible. Next, we took each 3*3 pixel patch as an input feature and encoded it into a 9-qubit QCNN using a qubit encoding scheme.
[Figure 7: federated training workflow, data distributions, loss/accuracy curves, box plots, ROC, CNN comparison, confusion matrix, and prediction histogram for pneumonia detection]

FIG. 7. FedQCNN for pneumonia detection. (a) The concept of quantum federated learning is described considering four clients with their data. (b) The central model (QCNN) is broadcast to the clients for training on their local datasets. (c) After training, all the clients send their model parameters to the central model. (d) The central/global model performs a secure aggregation and sends back the updated model. (e) Pneumonia MNIST consists of two classes (pneumonia (27%) and normal (73%)). (f-g) Different amounts of data samples allocated to each client using the Dirichlet distribution with parameter α ∈ {0.5, 5.0}. (h-i) Testing loss curves against communication rounds for two and four local clients, when data is distributed using Dir(0.5). (j) Testing accuracy and loss of the FedQCNN global model, when C=4 and α = 5.0. (k-m) Box plots representing the testing accuracy of two and four clients and the global model with a non-iid distribution of the pneumonia dataset. The total local dataset contributions of each class for all clients are presented at the top of the box plots. (n) Area under the receiver operating characteristic curve (AUC-ROC) of the individual clients and FedQCNN with Dir(0.5). (o) Comparison of FedQCNN with a federated classical CNN, where clients=2 and α is set to 0.5. (p) Confusion matrix for the testing dataset of the global model after 200 communication rounds of training. (q) Histogram showing the distribution of predicted values (less than zero for the pneumonia class and greater than zero for the normal class).

This encoding requires only single-qubit rotations, which makes quantum state preparation efficient in terms of time. After rescaling the vector to lie in [0, π/2], we encode the elements of each 3*3 patch in qubits as

\[ \psi_{in} = \cos(x_i^n)\,|0\rangle + \sin(x_i^n)\,|1\rangle \tag{9} \]

Here, we have 16 patches in the whole image, and each patch is of size 3*3. Then, we performed single-qubit measurements on the QCNN outputs and formed a 16-dimensional vector, which is given as an input to a 16-qubit linearly entangled variational quantum circuit for classification.

[Figure 8: hierarchical QCNN architecture operating on 3*3 patches of a 12*12 image]

FIG. 8. The illustration of a QCNN for the CT-kidney dataset. Each image in the dataset is reduced to 12*12 using PCA while keeping 99% of the variability of the original image dataset. The squares at the bottom denote the pixels of an image via a feature map. The 3*3 squares (red) represent the pixels of an image. The 9-qubit QCNN (blue) is applied on 3*3 pixels to extract the features. The top-most block is a 16-qubit linearly entangled quantum circuit to perform the classification. The sphere at the top denotes the label (whether the image belongs to the cyst, normal, stone, or tumor class).
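A sketch of the qubit-encoding step of Eq. (9): rescale each pixel of a 3*3 patch to [0, π/2] and map it to cos(x)|0⟩ + sin(x)|1⟩, one pixel per qubit. The per-patch min-max rescaling convention is our assumption.

```python
# Sketch of the qubit (angle) encoding of Eq. (9): one pixel per qubit.
import numpy as np

def rescale(patch):
    # Map pixel values to [0, pi/2] (assumed per-patch min-max rescaling).
    lo, hi = patch.min(), patch.max()
    return (patch - lo) / (hi - lo + 1e-12) * (np.pi / 2)

def qubit_encode(patch):                 # patch: 3x3 array of pixels
    angles = rescale(patch).ravel()
    # Each column is the state vector (amplitudes of |0> and |1>) of one qubit.
    return np.stack([np.cos(angles), np.sin(angles)])

patch = np.random.rand(3, 3)
states = qubit_encode(patch)             # shape (2, 9): 9 single-qubit states
print(states.shape)
```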
For the CT-kidney dataset, in total, 11,368 images with four classes are used; 80% of the images are used for training and the remaining 20% for testing. The proportion of each class in the dataset is shown in Fig 9a. Here, we performed the experiments with three types of data settings using Dir(α=0.5, 1.0, 10.0) and partitioned the dataset among four clients in a non-iid manner. As depicted in Fig 9(b-d), the data distribution with Dir(0.5) is more strongly non-iid than with Dir(10.0); with lower α, we have an imbalanced or sparse class distribution across the clients. Figure 9e illustrates the testing loss of the global FedQCNN model with four locally trained clients. The class distributions are given at the top of the box plot in Fig 9h, in which client3 and client4 hold less than 1% of the training samples from the tumor and normal classes, respectively. Due to the small number of training images, client3 and client4 experienced fluctuations in loss and achieve a low testing accuracy of 80%, as compared to the other clients. Nevertheless, FedQCNN shows fast convergence, good generalization properties, and achieves 93% testing accuracy.

Further, we examined the robustness and stability of FedQCNN with a significantly insufficient distribution, created using Dir(α=1.0) in Fig 9i, where the four clients hold 34.5%, 18.5%, 34.7%, and 13.2% of the data, respectively. Similarly to the previous results, client2 in particular achieves a higher testing loss and a lower test accuracy of 87% in comparison to the other clients, due to inadequate training images (i.e., only 3% of the stone class), as shown in Fig 9f. Still, FedQCNN achieves better results in comparison to the locally trained clients. For a balanced distribution in Fig 9g, the global model converges in fewer communication rounds compared to the previous distribution results.

Fig 9(h-j) summarizes the performance of the individual clients and the global FedQCNN on the testing set. Further, the dataset is distributed among the clients in a balanced manner using Dir(10.0), such that each client has an adequate quantity of images from each class, as depicted in Fig 9c. All clients hold approximately 25% of the images from the dataset, as shown in Fig 9j. It has been observed that the testing accuracies of all four locally trained clients improved and remain in the range of 90% to 93%. The testing accuracy of the FedQCNN global model is around 94% with a balanced data distribution. The performance on individual classes for all local models and FedQCNN is shown in Fig 9(k-o). The performance of the global model on the testing set with Dir(0.5) is shown in Fig 9p. For Dir(0.5, 1.0), the confusion matrices of FedQCNN show the number of correctly classified samples, in Fig 9(q, r). The histogram in Fig 9s shows the number of communication rounds taken by FedQCNN to reach at least 90% accuracy with different data distributions; the balanced distribution takes fewer rounds compared to data distributions with lower α.

The results shown in Fig 9(e-j) strongly support the success of decentralized quantum optimization with heterogeneous training data distributions. Our experimental observations have shown that the quantum federated learning framework improved the generalization performance of all local client models in the presence of insufficient and unbalanced training data.

C. Increasing computation on clients

In this section, we increase the computing burden on the clients by incrementing the number of local epochs per communication round. Thus, the clients perform local training for the given number of local epochs before sending their model updates to the central server. It has been observed that increasing the computation can significantly decrease the number of communication rounds needed to reach the target accuracy. In Fig 10(a, c), the gray line indicates the target accuracy, i.e., that achieved by the non-federated QCNN.
[Figure 9: class proportions, Dirichlet allocations, loss curves, box plots, ROC curves, confusion matrices, and communication-round histogram for kidney abnormality detection]

FIG. 9. FedQCNN to detect kidney abnormalities. (a) The proportion of each class (cyst (32.6%), normal (35.2%), stone (12.1%), and tumor (20.1%)) in the CT-kidney dataset. (b-d) Different amounts of data samples allocated to each client using the Dirichlet distribution with parameter α ∈ {0.5, 1.0, 10.0}. (e-g) The performance of individual clients with the global FedQCNN, reflected via testing loss and accuracy against the communication rounds. (h-j) Box plots representing the testing accuracy of the four local clients and FedQCNN on the testing set with unbalanced data distributions. The class-wise distribution of each client is presented at the top of the box plots. (k-o) True positive rate versus false positive rate of all clients and the FedQCNN model, respectively, when the data is distributed with Dir(0.5). (p) The performance of the FedQCNN global model with the Dir(0.5) uneven data distribution. (q-r) Confusion matrices showing the number of correctly classified samples by the global model after 150 communication rounds of training, when the dataset is distributed using Dir(0.5) and Dir(1.0), respectively. (s) Histogram showing the total number of communication rounds taken by the global model to reach at least 90% accuracy on the testing set.

For the pneumonia dataset with different clients, it has been observed that the global model achieved the target accuracy in under 50 communication rounds due to the impact of a large E, as shown in Fig 10(a, b). The dotted lines represent the results with E=3, and the solid lines show the experiments with E=1.

Next, we investigated the effect of local training for many epochs (large E) between aggregation steps for the CT-kidney dataset. The speedup for the non-iid distribution with α = 0.5 is marginally smaller compared to a balanced dataset, but still considerable. The reason behind this is that some clients hold more samples of the dataset, which makes the additional local training more valuable. Still, the global model achieved the target accuracy in under 70 communication rounds with α = (0.5, 1.0). However, fewer local epochs (E=1) per communication round result in slow convergence of the global model. Thus, it can be beneficial to maintain the balance between the number of local epochs and the learning rate at the later stage of the training process.

[Figure 10: testing accuracy and loss curves for E=1 (solid) and E=3 (dotted)]

FIG. 10. Increasing the computation load on clients. The effect of setting higher local client epochs (E) before sending the local updates to the central server for aggregation. (a-b) Pneumonia dataset. (c-d) CT-kidney dataset. The solid curves represent the results when each client (C) performed local training once per round. The dotted curves are the results when the local epochs increase from 1 to 3.

D. Efficient client selection scheme

In federated learning, the objective is to reduce the communication rounds without sacrificing the accuracy of the global model during training. One of the main challenges is how to select the most suitable clients before sending the updates to the central server at each communication round. In this section, we analyze the convergence of the FedQCNN global model using a client selection scheme based on the gradients. The purpose is to transmit less information and reduce the overall communication overhead over the network while maintaining the performance of the quantum model. Instead of sending the updates from every client, we compare the gradients computed by the clients and send the gradients with the highest norm to the server, as shown in Fig 11. In the experiments, we compare the performance of the FedQCNN model when taking the updates from all clients versus selecting one client with the highest norm, for 50 communication rounds on the pneumonia and CT-kidney datasets.

[Figure 11: diagram of gradient comparison and selection at the central server]

FIG. 11. Selecting the client based on the highest gradient norm. To accelerate the convergence time, the gradients (θ1, θ2, ..., θn) of all clients (C) are compared, and the one with the highest norm (max||θ||) is sent back to the central server in the proposed quantum federated learning framework.
13

(a)  (b)  FP 682 non-iid manner in federated settings. The quantum fed-
FP
 F 683 erated model learns the difficult task of classifying the
7HVWLQJ$FFXUDF\


F
684 chest x-ray image of a patient as either pneumonia or

7HVWLQJ/RVV

685 normal (no sign of infection). To assess the generaliz-
 
3QHXPRQLD


3QHXPRQLD'DWDVHW

1RUPDO

686 ability of a proposed model for a multi-classification task,
 FP
 687 we employ a FedQCNN for detecting chronic kidney dis-
FP
F  688 eases across multiple clients that comprise local patient
 F

689 data. It has been observed that the proposed collabo-
            690 rative framework can improve timely diagnosis and can
&RPPXQLFDWLRQ5RXQGV &RPPXQLFDWLRQ5RXQGV 691 prevent patients from developing chronic kidney diseases.
(c)  (d)  YP
692 To the author’s knowledge, this is a prior work to inves-

YP 693 tigate the effectiveness and feasibility of quantum fed-
 Y
erated learning to detect pneumonia and predict chronic
7HVWLQJ$FFXUDF\

Y 694

7HVWLQJ/RVV


695 kidney diseases, where collaborative effort is beneficial at
 
696 the time of a higher death rate.
 
YP 697 Our experimental results have shown that quantum
 YP
Y
 698 federated learning has the ability to enhance perfor-
 Y
 699 mance across multiple healthcare institutional models
            700 and reflect collaborative optimization success with non-
&RPPXQLFDWLRQ5RXQGV &RPPXQLFDWLRQ5RXQGV 701 identically distributed data. In our experiments, it has
702 been observed that the testing accuracy of locally trained
FIG. 12. Sending the updates of a client with the highest norm. The impact of selecting the best gradients based on the highest norm (max||θ||) before sending the local updates to the central server. (a-b) Pneumonia dataset; (c-d) CT-kidney dataset. The dotted lines represent the results when all clients participate in the training of the global model by sending local updates. The solid lines are the results when sending the local updates of the client with the highest norm.
When the dataset is distributed with Dir(0.5, 1.0) in a non-iid manner, the FedQCNN achieved a 10% increase in testing accuracy and a 10% decrease in testing loss with the client selection strategy at 50 communication rounds, as shown in Fig. 12(c, d).
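For concreteness, the sketch below shows one way such a Dirichlet-based non-iid partition can be generated, interpreting Dir(0.5, 1.0) as two settings of the concentration parameter α (0.5 and 1.0). The helper name dirichlet_partition, the NumPy implementation, and the stand-in labels are our own illustrative choices, not the authors' released code.

    import numpy as np

    def dirichlet_partition(labels, n_clients, alpha, seed=0):
        """Assign sample indices to clients with per-class proportions
        drawn from Dir(alpha); smaller alpha -> more skewed shards."""
        rng = np.random.default_rng(seed)
        client_indices = [[] for _ in range(n_clients)]
        for c in range(int(labels.max()) + 1):
            idx = np.where(labels == c)[0]
            rng.shuffle(idx)
            # One Dirichlet draw per class gives each client its share of class c.
            shares = rng.dirichlet(alpha * np.ones(n_clients))
            cuts = (np.cumsum(shares)[:-1] * len(idx)).astype(int)
            for client, chunk in enumerate(np.split(idx, cuts)):
                client_indices[client].extend(chunk.tolist())
        return [np.array(ix) for ix in client_indices]

    # Stand-in binary labels (e.g., pneumonia vs. normal); the two alpha
    # values mirror the Dir(0.5, 1.0) settings reported above.
    labels = np.random.randint(0, 2, size=5000)
    for alpha in (0.5, 1.0):
        shards = dirichlet_partition(labels, n_clients=4, alpha=alpha)
        print(alpha, [len(s) for s in shards])

Smaller α concentrates each class on fewer clients and yields the more skewed shards, while α = 1.0 produces a milder imbalance.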
For large unbalanced datasets with many clients, however, a global model built from a small number of highest-norm clients cannot reflect the diverse datasets held by the local clients. Hence, selecting the optimal number of highest-norm clients is crucial in the training process. In our case, the medical datasets were distributed among a small number of clients, so the global model achieves better results with a single-client selection scheme in fewer communication rounds. Overall, the experimental results demonstrate that the proposed scheme accelerates the convergence of quantum federated learning compared with training in which all clients participate, and does so efficiently.
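To make the mechanism concrete, the following is a minimal sketch of one communication round under the two strategies compared above: FedQCNN(θagg), where the server averages every client's update, and FedQCNN(max||θ||), where only the client whose local update has the largest norm is uploaded. The function names, the least-squares surrogate loss, and the synthetic client shards are our illustrative choices rather than the authors' code; whether the norm is taken over the raw parameters or over the local change is likewise an implementation detail, and the sketch uses the change, in line with the "highest gradient norm" wording in the conclusions.

    import numpy as np

    rng = np.random.default_rng(0)

    def local_update(params, data, lr=0.1):
        """Toy stand-in for one round of local QCNN training: a single
        gradient step on a least-squares surrogate loss. In the paper's
        setting this would be the client's variational-circuit training."""
        X, y = data
        grad = X.T @ (X @ params - y) / len(y)
        return params - lr * grad

    def fedavg(updates):
        """Baseline FedQCNN(theta_agg): the server averages every
        client's update into the next global parameter vector."""
        return np.mean(np.stack(updates), axis=0)

    def highest_norm_round(global_params, clients):
        """One round of the proposed FedQCNN(max||theta||) scheme: every
        client trains locally, but only the update whose change has the
        largest L2 norm is uploaded and adopted by the central server."""
        updates = [local_update(global_params, data) for data in clients]
        norms = [np.linalg.norm(u - global_params) for u in updates]
        return updates[int(np.argmax(norms))]

    # Tiny demo: 4 clients with unbalanced synthetic shards.
    dim = 8
    clients = [(rng.normal(size=(n, dim)), rng.normal(size=n))
               for n in (200, 50, 20, 10)]
    theta = np.zeros(dim)
    for _ in range(50):
        theta = highest_norm_round(theta, clients)

Because only one parameter vector is uploaded per round, the per-round communication cost drops from one upload per client to a single upload, which is the source of the savings reported above.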
CONCLUSIONS
In this paper, an effort has been put forward to develop and analyze a new computing model that enhances healthcare analysis with an integrated quantum federated learning framework. To evaluate the effectiveness of the proposed model, we first trained the QCNN on the pneumonia dataset by distributing the data to clients in a non-iid manner in federated settings. The quantum federated model learns the difficult task of classifying the chest X-ray image of a patient as either pneumonia or normal (no sign of infection). To assess the generalizability of the proposed model for a multi-classification task, we employ a FedQCNN for detecting chronic kidney diseases across multiple clients, each holding local patient data. It has been observed that the proposed collaborative framework can improve timely diagnosis and help prevent patients from developing chronic kidney diseases. To the authors' knowledge, this is the first work to investigate the effectiveness and feasibility of quantum federated learning for detecting pneumonia and predicting chronic kidney diseases, conditions for which collaborative effort is especially valuable given their high mortality rates.

Our experimental results have shown that quantum federated learning has the ability to enhance performance across multiple healthcare institutional models and reflects the success of collaborative optimization on non-identically distributed data. In our experiments, the testing accuracy of locally trained clients declines due to insufficient training images, whereas the global FedQCNN model achieves a high classification accuracy even with uneven medical data distributions. This demonstrates the robustness and stability of the QCNN model in federated settings. Furthermore, we modified our existing approach to reduce the computation and communication load on the clients: the number of communication rounds between the clients and the central server can be reduced significantly by selecting the client with the highest gradient norm.

The decentralized quantum algorithm can help distribute the computational resources on NISQ devices. In the future, we also intend to investigate the potential of the proposed quantum federated learning framework on large datasets with privacy-preserving techniques, sharing only partial model updates with the central server. By leveraging the benefits of quantum computers and federated learning, healthcare providers can improve the accuracy and efficiency of image classification, leading to better diagnosis and treatment options for patients. Moreover, the proposed quantum federated learning framework can safeguard the personal records of providers and patients.

DATA AVAILABILITY

The pneumonia MNIST dataset can be accessed at https://medmnist.com/. The CT-kidney dataset is collected from https://www.kaggle.com/datasets/nazmul0087/ct-kidney-dataset-normal-cyst-tumor-and-stone. The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.
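For reference, a minimal sketch of pulling the PneumoniaMNIST splits programmatically with the medmnist Python package is shown below; the class name and keyword arguments reflect the MedMNIST v2 package at the time of writing and should be checked against its current documentation. The CT-kidney dataset must be downloaded manually from the Kaggle page above.

    # pip install medmnist
    from medmnist import PneumoniaMNIST

    # Downloads the 28x28 chest X-ray images (to ~/.medmnist by default).
    train_set = PneumoniaMNIST(split="train", download=True)
    test_set = PneumoniaMNIST(split="test", download=True)

    print(len(train_set), len(test_set))  # samples per split
    image, label = train_set[0]           # PIL image and binary label array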
ACKNOWLEDGMENTS

We would like to thank the Elmore Family School of Electrical and Computer Engineering (ECE) Emerging Frontiers Center at Purdue University for the financial support.

AUTHOR CONTRIBUTIONS

A.S.B., S.K., and M.A.A. conceived and designed the complete study. A.S.B. collected data and performed the experiments under the supervision of S.K. and M.A.A. All authors contributed to conceptualization and revising the manuscript.

COMPETING INTERESTS

The authors declare no competing interests.

REFERENCES

1. Davenport, T. & Kalakota, R. The potential for artificial intelligence in healthcare. Future Healthcare Journal. 6, 94 (2019)
2. He, J., Baxter, S., Xu, J., Xu, J., Zhou, X. & Zhang, K. The practical implementation of artificial intelligence technologies in medicine. Nature Medicine. 25, 30-36 (2019)
3. Seh, A., Zarour, M., Alenezi, M., Sarkar, A., Agrawal, A., Kumar, R. & Ahmad Khan, R. Healthcare data breaches: insights and implications. Healthcare. 8, 133 (2020)
4. Abouelmehdi, K., Beni-Hssane, A., Khaloufi, H. & Saadi, M. Big data security and privacy in healthcare: A review. Procedia Computer Science. 113 pp. 73-80 (2017)
5. Bryce, J., Boschi-Pinto, C., Shibuya, K. & Black, R. WHO estimates of the causes of death in children. The Lancet. 365, 1147-1152 (2005)
6. Webster, A., Nagler, E., Morton, R. & Masson, P. Chronic kidney disease. The Lancet. 389, 1238-1252 (2017)
7. Kovesdy, C. Epidemiology of chronic kidney disease: an update 2022. Kidney International Supplements. 12, 7-11 (2022)
8. Rieke, N., Hancox, J., Li, W., Milletari, F., Roth, H., Albarqouni, S., Bakas, S., Galtier, M., Landman, B., Maier-Hein, K. & Others. The future of digital health with federated learning. NPJ Digital Medicine. 3, 119 (2020)
9. Hard, A., Rao, K., Mathews, R., Ramaswamy, S., Beaufays, F., Augenstein, S., Eichner, H., Kiddon, C. & Ramage, D. Federated learning for mobile keyboard prediction. ArXiv Preprint ArXiv:1811.03604 (2018)
10. Wu, Q., Chen, X., Zhou, Z. & Zhang, J. FedHome: Cloud-edge based personalized federated learning for in-home health monitoring. IEEE Transactions On Mobile Computing. 21, 2818-2832 (2020)
11. Xu, J., Glicksberg, B., Su, C., Walker, P., Bian, J. & Wang, F. Federated learning for healthcare informatics. Journal Of Healthcare Informatics Research. 5 pp. 1-19 (2021)
12. Huang, H., Broughton, M., Mohseni, M., Babbush, R., Boixo, S., Neven, H. & McClean, J. Power of data in quantum machine learning. Nature Communications. 12, 2631 (2021)
13. Biamonte, J., Wittek, P., Pancotti, N., Rebentrost, P., Wiebe, N. & Lloyd, S. Quantum machine learning. Nature. 549, 195-202 (2017)
14. Bharti, K., Cervera-Lierta, A., Kyaw, T., Haug, T., Alperin-Lea, S., Anand, A., Degroote, M., Heimonen, H., Kottmann, J., Menke, T. & Others. Noisy intermediate-scale quantum algorithms. Reviews Of Modern Physics. 94, 015004 (2022)
15. Cerezo, M., Verdon, G., Huang, H., Cincio, L. & Coles, P. Challenges and opportunities in quantum machine learning. Nature Computational Science. 2, 567-576 (2022)
16. Sajjan, M., Li, J., Selvarajan, R., Sureshbabu, S., Kale, S., Gupta, R., Singh, V. & Kais, S. Quantum machine learning for chemistry and physics. Chemical Society Reviews (2022)
17. Xia, R. & Kais, S. Quantum machine learning for electronic structure calculations. Nature Communications. 9, 4195 (2018)
18. Landman, J., Mathur, N., Li, Y., Strahm, M., Kazdaghli, S., Prakash, A. & Kerenidis, I. Quantum methods for neural networks and application to medical image classification. Quantum. 6 pp. 881 (2022)
19. Yano, H., Suzuki, Y., Raymond, R. & Yamamoto, N. Efficient discrete feature encoding for variational quantum classifier. 2020 IEEE International Conference On Quantum Computing And Engineering (QCE). pp. 11-21 (2020)
20. Houssein, E., Abohashima, Z., Elhoseny, M. & Mohamed, W. Hybrid quantum-classical convolutional neural network model for COVID-19 prediction using chest X-ray images. Journal Of Computational Design And Engineering. 9, 343-363 (2022)
21. Sierra-Sosa, D., Arcila-Moreno, J., Garcia-Zapirain, B., Castillo-Olea, C. & Elmaghraby, A. Dementia prediction applying variational quantum classifier. ArXiv Preprint ArXiv:2007.08653 (2020)
22. Parisi, L., Neagu, D., Ma, R. & Campean, F. Quantum ReLU activation for convolutional neural networks to improve diagnosis of Parkinson's disease and COVID-19. Expert Systems With Applications. 187 pp. 115892 (2022)
23. Sengupta, K. & Srivastava, P. Quantum algorithm for quicker clinical prognostic analysis: an application and experimental study using CT scan images of COVID-19 patients. BMC Medical Informatics And Decision Making. 21, 1-14 (2021)
24. Esposito, M., Uehara, G. & Spanias, A. Quantum machine learning for audio classification with applications to healthcare. 2022 13th International Conference On Information, Intelligence, Systems & Applications (IISA). pp. 1-4 (2022)
25. Jain, S., Ziauddin, J., Leonchyk, P., Yenkanchi, S. & Geraci, J. Quantum and classical machine learning for the classification of non-small-cell lung cancer patients. SN Applied Sciences. 2 pp. 1-10 (2020)
26. Cong, I., Choi, S. & Lukin, M. Quantum convolutional neural networks. Nature Physics. 15, 1273-1278 (2019)
27. Pesah, A., Cerezo, M., Wang, S., Volkoff, T., Sornborger, A. & Coles, P. Absence of barren plateaus in quantum convolutional neural networks. Physical Review X. 11, 041011 (2021)
28. Wei, S., Chen, Y., Zhou, Z. & Long, G. A quantum convolutional neural network on NISQ devices. AAPPS Bulletin. 32 pp. 1-11 (2022)
29. Herrmann, J., Llima, S., Remm, A., Zapletal, P., McMahon, N., Scarato, C., Swiadek, F., Andersen, C., Hellings, C., Krinner, S. & Others. Realizing quantum convolutional neural networks on a superconducting quantum processor to recognize quantum phases. Nature Communications. 13, 4144 (2022)
30. Hur, T., Kim, L. & Park, D. Quantum convolutional neural network for classical data classification. Quantum Machine Intelligence. 4, 3 (2022)
31. Chen, S., Wei, T., Zhang, C., Yu, H. & Yoo, S. Quantum convolutional neural networks for high energy physics data analysis. Physical Review Research. 4, 013231 (2022)
32. Chen, S. & Yoo, S. Federated quantum machine learning. Entropy. 23, 460 (2021)
33. Xia, Q. & Li, Q. QuantumFed: A federated learning framework for collaborative quantum training. 2021 IEEE Global Communications Conference (GLOBECOM). pp. 1-6 (2021)
34. Li, W., Lu, S. & Deng, D. Quantum federated learning through blind quantum computing. Science China Physics, Mechanics & Astronomy. 64, 100312 (2021)
35. Yamany, W., Moustafa, N. & Turnbull, B. OQFL: An optimized quantum-based federated learning framework for defending against adversarial attacks in intelligent transportation systems. IEEE Transactions On Intelligent Transportation Systems (2021)
36. Xia, Q., Tao, Z. & Li, Q. Defending against Byzantine attacks in quantum federated learning. 2021 17th International Conference On Mobility, Sensing And Networking (MSN). pp. 145-152 (2021)
37. Chehimi, M. & Saad, W. Quantum federated learning with quantum data. ICASSP 2022 - 2022 IEEE International Conference On Acoustics, Speech And Signal Processing (ICASSP). pp. 8617-8621 (2022)
38. Yu, K., Zhang, X., Ye, Z., Guo, G. & Lin, S. Quantum federated learning based on gradient descent. ArXiv Preprint ArXiv:2212.12913 (2022)
39. Qi, J. Federated quantum natural gradient descent for quantum federated learning. ArXiv Preprint ArXiv:2209.00564 (2022)
40. Blei, D., Ng, A. & Jordan, M. Latent Dirichlet allocation. Journal Of Machine Learning Research. 3, 993-1022 (2003)
41. Ronning, G. Maximum likelihood estimation of Dirichlet distributions. Journal Of Statistical Computation And Simulation. 32, 215-221 (1989)
42. Yang, J., Shi, R., Wei, D., Liu, Z., Zhao, L., Ke, B., Pfister, H. & Ni, B. MedMNIST v2 - A large-scale lightweight benchmark for 2D and 3D biomedical image classification. Scientific Data. 10, 41 (2023)
43. Islam, M. CT kidney dataset: Normal-cyst-tumor and stone. Kaggle (2021), https://www.kaggle.com/datasets/nazmul0087/ct-kidney-dataset-normal-cyst-tumor-and-stone
