
Quantum computing models for artificial neural networks

Stefano Mangini,1, 2, ∗ Francesco Tacchino,3 Dario Gerace,1 Daniele Bajoni,4 and Chiara Macchiavello1, 2, 5

1 Dipartimento di Fisica, Università di Pavia, Via Bassi 6, I-27100, Pavia, Italy
2 INFN Sezione di Pavia, Via Bassi 6, I-27100, Pavia, Italy
3 IBM Quantum, IBM Research - Zurich, Säumerstrasse 4, CH-8803 Rüschlikon, Switzerland
4 Dipartimento di Ingegneria Industriale e dell'Informazione, Università di Pavia, Via Ferrata 1, 27100, Pavia, Italy
5 CNR-INO - Largo E. Fermi 6, I-50125, Firenze, Italy
∗ stefano.mangini01@universitadipavia.it

(Dated: May 21, 2021)
arXiv:2102.03879v2 [quant-ph], 20 May 2021
Neural networks are computing models that have been leading progress in Machine Learning (ML) and Artificial Intelligence (AI) applications. In parallel, the first small scale quantum computing devices have become available in recent years, paving the way for the development of a new paradigm in information processing. Here we give an overview of the most recent proposals aimed at bringing together these ongoing revolutions, and particularly at implementing the key functionalities of artificial neural networks on quantum architectures. We highlight the exciting perspectives in this context, and discuss the potential role of near term quantum hardware in the quest for quantum machine learning advantage.

I. INTRODUCTION

Artificial Intelligence (AI) broadly refers to a wide set of algorithms that have shown in the last decade impressive and sometimes surprising successes in the analysis of large datasets [1]. These algorithms are often based on a computational model called Neural Network (NN), which belongs to the family of differential programming techniques.

The simplest realization of a NN is in a “feedforward” configuration, mathematically described as a concatenated application of affine transformations and element-wise nonlinearities, f_j(·) = σ_j(w_j · + b_j), where σ_j is a nonlinear activation function, w_j is a real matrix containing the so-called weights to be optimized, and b_j is a bias vector, which is also to be optimized. The action of the NN is defined through repeated applications of this function to an input vector, x, leading to an output ŷ = f_L ◦ f_{L−1} ◦ · · · ◦ f_1(x), where L denotes the number of layers in the NN architecture, as schematically shown in Fig. 1(a). The success of NNs as a computational model is primarily due to their trainability [2], i.e., the fact that the entries of the weight matrices can be adjusted to learn a given target task. The most common use case is supervised learning, where the algorithm is asked to reproduce the associations between some inputs x_i and the desired correct outputs (or labels) y_i. If properly trained, the NN should also be able to predict well on new data, i.e., data that were not used during training. This crucial property is called generalization, and it represents the key aspect of learning algorithms, which ultimately distinguishes them from standard fitting techniques. Generalization encapsulates the idea that learning models discover hidden patterns and relations in the data, instead of applying pre-programmed procedures [1, 3, 4].
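As a minimal illustration of the feedforward model just described, the following NumPy sketch composes a few layers f_j(x) = σ_j(W_j x + b_j) into ŷ = f_L(· · · f_1(x)); the layer sizes, the tanh activation, and the random weights are arbitrary choices made purely for illustration.

import numpy as np

rng = np.random.default_rng(0)

def layer(W, b, sigma):
    # A single feedforward layer f(x) = sigma(W x + b).
    return lambda x: sigma(W @ x + b)

# Arbitrary architecture: 4 inputs -> two hidden layers of 8 units -> 1 output.
sizes = [4, 8, 8, 1]
params = [(rng.normal(size=(m, n)), rng.normal(size=m))
          for n, m in zip(sizes[:-1], sizes[1:])]

def network(x):
    # Repeated application of the layers: y = f_L(...f_2(f_1(x))...).
    for i, (W, b) in enumerate(params):
        sigma = np.tanh if i < len(params) - 1 else (lambda z: z)  # linear read-out layer
        x = layer(W, b, sigma)(x)
    return x

x = rng.normal(size=4)      # an input vector
print("network output:", network(x))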
The size of the data and features elaborated by classical NNs, and the number of parameters defining their structure, have been steadily increasing over the last years. These advancements have come at the expense of a dramatic rise in the demand for energy and computational power [5] from the field of AI. In this regard, quantum computers may offer a viable route forward to continue such growth in the dimension of the treatable problems and, in parallel, to enable completely new functionalities. In fact, quantum information is elaborated by exploiting superpositions in vector spaces (the so-called Hilbert spaces) [6], and quantum processors are expected to soon surpass the computational capabilities of classical supercomputers. Moreover, genuinely quantum correlations represent by themselves a resource to be considered in building new forms of ML applications.

With respect to classical computational paradigms, quantum computing comes with its own peculiarities. One of these is that quantum computation is intrinsically linear. While this property could ensure, e.g., that an increase in computational power does not come with a parallel increase in energy costs, it also means that implementing the nonlinear functions forming the backbone of NNs is a non-trivial task. Also, information cannot be copied with arbitrary precision in quantum algorithms, meaning that tasks such as repeatedly addressing a variable are impossible when the latter is represented by a quantum state. As a consequence, classical NN algorithms cannot simply be ported to quantum platforms.

In this article, we briefly review the main directions that have been taken through the years to develop artificial NNs for quantum computing environments, the strategies used to overcome the possible difficulties, and the future prospects for the field.
[Figure 1]
FIG. 1. Schematic representation of classical and quantum learning models. (a) Classical feedforward NNs process information through consecutive layers of nodes, or neurons. (b) A quantum neural network in the form of a variational quantum circuit: the data re-uploading procedure (S_i(x)) is explicitly illustrated, and the non-parametrized gates (W_i) of Eq. (1) are incorporated inside the variational unitaries (U_i(θ_i)). (c) A possible model for a quantum perceptron [7], which uses operations U_i and U_w to load input data and perform the multiplication by weights, and a final measurement on an ancilla is used to emulate the neuron activation function. (d) Quantum Kernel Methods map input data to quantum states [8], and then evaluate their inner product ⟨φ(x)|φ(x′)⟩ to build the kernel matrix K(x, x′). (e) A Quantum Convolutional NN [9] consists of a repeated application of convolutional layers (CL) and pooling layers (PL), ultimately followed by a fully-connected layer (FCL), i.e., a general unitary on all remaining qubits applied before a final measurement (of an observable O). (f) In a Dissipative QNN [10], qubits in one layer are coupled to ancillary qubits from the following layer in order to load the result of the computation, while previous layers are discarded (i.e., dissipated).

II. QUANTUM NEURAL NETWORKS

In gate-based quantum computers, such as those realized with superconducting qubits, ion traps, spins in semiconductors, or photonic circuits [11], a quantum computation is fully described by its quantum circuit representation, i.e., the arrangement of subsequent operations (gates) applied to the qubits in the quantum processing unit. These gates can be parametrized by continuous variables, like a Pauli-X rotation, R_x(θ) = e^{iθX/2}. A quantum circuit that makes use of parametrized gates is known as a Parametrized Quantum Circuit (PQC) [12]. Variational Quantum Algorithms (VQAs) [13–15] are a large class of PQC-based protocols whose goal is to properly tune the parameters in order to solve a target task. For example, the task might be the minimization of the expectation value of an observable, ⟨O⟩ = ⟨0|U(θ)† O U(θ)|0⟩, where U(θ) is the unitary operator representing the quantum circuit parametrized by angles θ = (θ_1, θ_2, . . .) and acting on a quantum register initialized in a reference state |0⟩ = |0⟩^{⊗n}. VQAs are made of three main constituents: data encoding, a variational ansatz, and final measurements with a classical update of the parameters. The first stage accounts for loading input data (classical or quantum) into the quantum circuit using quantum operations that depend on the given input; the second stage consists of a variational circuit with a specific structure, called ansatz, which often consists of repeated applications of similar operations (layers). Finally, a measurement on the qubits is performed to infer some relevant information about the system. Given the outcome, the parameters in the circuit are updated through a classical optimizer to minimize the cost function defining the problem. Such algorithms generally require limited resources to be executed on real quantum devices, and they are thus expected to take full advantage of current noisy quantum hardware [16, 17].
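As a schematic illustration of this hybrid workflow, the following NumPy sketch simulates a two-qubit layered ansatz of R_y rotations and CNOTs, evaluates the cost ⟨0|U(θ)†OU(θ)|0⟩ for O = Z ⊗ I by exact statevector algebra, and updates the parameters with a crude finite-difference gradient descent; the circuit size, gate choices, and optimizer are arbitrary assumptions, not a prescription of any specific proposal.

import numpy as np

ID2 = np.eye(2, dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

def ry(theta):
    # Single-qubit rotation R_y(theta) = exp(-i*theta*Y/2).
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def ansatz(thetas):
    # Layered ansatz: each layer applies parametrized R_y rotations followed by a fixed CNOT.
    U = np.eye(4, dtype=complex)
    for t1, t2 in thetas.reshape(-1, 2):
        U = CNOT @ np.kron(ry(t1), ry(t2)) @ U
    return U

def cost(thetas, O):
    # C(theta) = <00| U(theta)^dagger O U(theta) |00>
    psi0 = np.zeros(4, dtype=complex)
    psi0[0] = 1.0
    psi = ansatz(thetas) @ psi0
    return float(np.real(np.vdot(psi, O @ psi)))

O = np.kron(Z, ID2)                          # measured observable: Z on the first qubit
rng = np.random.default_rng(1)
thetas = rng.uniform(0, 2 * np.pi, size=6)   # 3 layers x 2 parameters

# Classical update loop: crude finite-difference gradient descent on the measured cost.
for _ in range(200):
    grad = np.zeros_like(thetas)
    for k in range(len(thetas)):
        d = np.zeros_like(thetas)
        d[k] = 1e-4
        grad[k] = (cost(thetas + d, O) - cost(thetas - d, O)) / 2e-4
    thetas -= 0.2 * grad

print("optimized cost <O> =", cost(thetas, O))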

The definitions above highlight many similarities between PQCs and classical NNs. In both models, information is processed through a sequence of parametrized layers, which are iteratively updated using a classical optimizer. The key difference lies in the way information is processed. Quantum computers promise to possibly achieve some form of computational advantage, i.e., speedups or better performances [18], due to inherently quantum effects not available in classical learning scenarios.¹ While many different definitions of Quantum Neural Networks (QNN) are possible, the most general and currently accepted one is that of a parametrized quantum circuit whose structure is directly inspired by classical neural networks. Thus, we formally define a QNN as a PQC whose variational ansatz contains multiple repetitions of self-similar layers of operations, that is:

U_QNN(θ) = ∏_{i=L}^{1} U_i(θ_i) W_i = U_L(θ_L) W_L · · · U_1(θ_1) W_1 ,    (1)

where the U_i(θ_i) are variational gates, the W_i are (typically entangling) fixed operations that do not depend on any parameter, and L is the number of layers (see Fig. 1(b) for a schematic picture). Depending on the problem under investigation and on the connectivity of the quantum device (i.e., related to the native gate set available [11]), various realizations exist for both the entangling structure and the parametrized gates [15]. Note that this definition almost coincides with that of VQAs, and in fact the terms VQAs and QNNs are often used interchangeably.

¹ Earlier works on quantum generalizations of classical machine learning focused on the use of quantum computers to perform linear algebra exponentially faster compared to classical computers [18]. However, due to their unsuitability for NISQ devices and comparable performances with new classical quantum-inspired methods [19–21], interest in this field is declining, with most of the research now redirected towards parametrized quantum circuits as quantum counterparts of classical learning models.

While being rather general, a growing body of research has recently been highlighting that such a structure, in its naive realization, may be poorly expressive, in the sense that it can only represent a few functions — actually, mostly sine functions — of the input data [22–26]. A milestone of classical learning theory, the Universal Approximation Theorem [27], guarantees that classical NNs are powerful enough to approximate any function f(x) of the input data x. Interestingly, it was shown that a similar result also holds for QNNs, provided that input data are loaded into the quantum circuit multiple times throughout the computation. Intuitively, such data re-uploading can be seen as a necessary feature to counterbalance the effect of the no-cloning theorem of quantum mechanics [6]. In fact, while in classical NNs the output of a neuron is copied and transferred to every neuron in the following layer, this is impossible in the quantum regime, thus it is necessary to explicitly introduce some degree of classical redundancy. As a rule of thumb, the more often data are re-uploaded during the computation, the more expressive the quantum learning model becomes, as it is able to represent higher order features of the data. Thus, for a more effective construction, each layer in Eq. (1) should be replaced with U_i W_i → S(x) U_i W_i, where S(x) denotes the data encoding procedure mapping input x to its corresponding quantum state.

This formulation still leaves a lot of freedom in choosing how the encoding and the variational circuit are implemented [28, 29]. However, as in the classical domain, several different architectures for QNNs exist, and in the following we give a brief overview of those that have received the most attention.

1. Quantum Perceptron Models

Early works on quantum models for single artificial neurons (known as perceptrons) focused on how to reproduce the non-linearities of classical perceptrons with quantum systems, which, on the contrary, obey linear unitary evolutions [30–33]. In more recent proposals (e.g., Fig. 1(c)), quantum perceptrons were shown to be capable of carrying out simple pattern recognition tasks for binary [7] and gray scale [34] images, possibly employing variational unsampling techniques for training and to find adaptive quantum circuit implementations [35]. By connecting multiple quantum neurons, models for quantum artificial neural networks were also realized in proof-of-principle experiments on superconducting devices [36].
2. Kernel methods

Directly inspired by their classical counterparts, the main idea of kernel methods is to map input data into a higher dimensional space, and to leverage this transformation to perform classification tasks otherwise unfeasible in the original, low dimensional representation. Typically, any pair of data vectors, x, x′ ∈ X, is mapped into quantum states, |φ(x)⟩, |φ(x′)⟩, by means of an encoding mapping φ : X → H, where H denotes the Hilbert space of the quantum register. The inner product ⟨φ(x)|φ(x′)⟩, evaluated in an exponentially large Hilbert space, defines a similarity measure that can be used for classification purposes. The matrix K_ij = ⟨φ(x_i)|φ(x_j)⟩, constructed over the training dataset, is called the kernel, and can be used along with classical methods such as Support Vector Machines (SVMs) to carry out classification [8, 37] or regression tasks, as schematically displayed in Fig. 1(d). A particularly promising aspect of such quantum SVMs is the possibility of constructing kernel functions that are hard to compute classically, thus potentially leading to quantum advantage in classification.
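A minimal numerical sketch of this construction is given below, assuming a simple angle-encoding feature map |φ(x)⟩ = ⊗_k R_y(x_k)|0⟩ (one of many possible choices): it builds the kernel matrix from state overlaps, using the fidelity |⟨φ(x_i)|φ(x_j)⟩|² as the quantity that would actually be estimated on hardware.

import numpy as np

def feature_map(x):
    # |phi(x)> = R_y(x_1)|0> (x) ... (x) R_y(x_d)|0>, a simple angle-encoding feature map.
    state = np.array([1.0 + 0j])
    for xk in x:
        state = np.kron(state, np.array([np.cos(xk / 2), np.sin(xk / 2)], dtype=complex))
    return state

def kernel_matrix(X):
    # K_ij = |<phi(x_i)|phi(x_j)>|^2, the overlap estimated in practice on hardware.
    states = [feature_map(x) for x in X]
    return np.array([[abs(np.vdot(a, b)) ** 2 for b in states] for a in states])

rng = np.random.default_rng(2)
X = rng.uniform(0, np.pi, size=(6, 3))   # six toy data points with three features each
K = kernel_matrix(X)
print(np.round(K, 3))
# K can then be passed to a classical kernel method, e.g.
# sklearn.svm.SVC(kernel="precomputed").fit(K, y) for labels y.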

3. Generative Models

A generative model is a probabilistic algorithm whose goal is to reproduce an unknown probability distribution p_X over a space X, given a training set T = {x_i | x_i ∼ p_X, i = 1, . . . , M}. A common classical approach to this task leverages probabilistic NNs called Boltzmann Machines (BMs) [2, 38], which are physically inspired methods resembling Ising models. Upon replacing classical nodes with qubits, and substituting the energy function with a quantum Hamiltonian, it is possible to define Quantum Boltzmann Machines (QBMs) [39–41], dedicated to the generative learning of quantum and classical distributions. In parallel, a novel and surprisingly effective classical tool for generative modeling is represented by Generative Adversarial Networks (GANs). Here, the burden of the generative process is offloaded onto two separate neural networks called generator and discriminator, respectively, which are programmed to compete against each other. Through a careful and iterative learning routine, the generator learns to produce high-quality new examples, eventually fooling the discriminator. A straightforward translation to the quantum domain led to the formulation of Quantum GAN (QGAN) models [42–44], where the generator and discriminator consist of QNNs. Even if not inspired by any classical neural network model, it is also worth mentioning Born machines [45], where the probability of generating new data follows the inherently probabilistic nature of quantum mechanics represented by Born's rule.
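To make the role of Born's rule explicit, the short sketch below samples bitstrings from p(x) = |⟨x|ψ⟩|², which is the elementary generative step of a Born machine; a random normalized state is used here as a stand-in for the output of a trained parametrized circuit.

import numpy as np

rng = np.random.default_rng(3)
n = 3                                     # number of qubits
dim = 2 ** n

# Stand-in for the output state |psi(theta)> of a trained circuit: a random normalized state.
psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi /= np.linalg.norm(psi)

# Born's rule: p(x) = |<x|psi>|^2 over computational-basis bitstrings x.
probs = np.abs(psi) ** 2
samples = rng.choice(dim, size=10, p=probs)
print(["{:0{}b}".format(s, n) for s in samples])   # generated bitstrings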

4. Quantum Convolutional Neural Networks

As the name suggests, these models are inspired by the corresponding classical counterpart, nowadays ubiquitous in the field of image processing. In these networks, schematically shown in Fig. 1(e), inputs are processed through a repeated series of so-called convolutional and pooling layers. The former applies a convolution of the input — i.e., a quasi-local transformation in a translationally invariant manner — to extract some relevant information, the latter applies a compression of such information into a lower dimensional representation. After many iterations, the network outputs an informative and low dimensional enough representation that can be analyzed with standard techniques, such as fully connected feedforward NNs. Similarly, in Quantum Convolutional NNs [9, 46] a quantum state first undergoes a convolution layer, consisting of a parametrized unitary operation acting on individual subsets of the qubits, and then a pooling procedure obtained by measuring some of the qubits. This procedure is repeated until only a few qubits are left, where all relevant information is stored. This method proved particularly useful for genuinely quantum tasks such as quantum phase recognition and quantum error correction. Moreover, the use of only logarithmically many parameters with respect to the number of qubits is particularly important to achieve efficient training and effective implementations on near-term devices.
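A schematic statevector toy of one such convolution-plus-pooling step is sketched below: a two-qubit unitary (random here, in place of a trained one) is applied to neighbouring pairs of a four-qubit register, every second qubit is then measured and discarded, and a two-qubit state is left for further layers or a final fully-connected unitary and measurement. All specific choices are illustrative.

import numpy as np

def haar_unitary(dim, rng):
    # Haar-random unitary from the QR decomposition of a complex Gaussian matrix.
    z = (rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

def apply_two_qubit(psi, gate, q1, q2):
    # Apply a 4x4 gate to qubits (q1 < q2) of a statevector stored as a rank-n tensor.
    out = np.tensordot(gate.reshape(2, 2, 2, 2), psi, axes=([2, 3], [q1, q2]))
    return np.moveaxis(out, [0, 1], [q1, q2])

def measure_and_discard(psi, q, rng):
    # Pooling step: measure qubit q in the computational basis and keep the (renormalized)
    # post-measurement state of the remaining qubits.
    p0 = np.sum(np.abs(np.take(psi, 0, axis=q)) ** 2)
    outcome = 0 if rng.random() < p0 else 1
    reduced = np.take(psi, outcome, axis=q)
    return reduced / np.linalg.norm(reduced)

rng = np.random.default_rng(6)
n = 4
psi = np.zeros((2,) * n, dtype=complex)
psi[(0,) * n] = 1.0                         # input state |0000>

conv = haar_unitary(4, rng)                 # convolutional two-qubit unitary (random here)
psi = apply_two_qubit(psi, conv, 0, 1)      # convolution layer acting on pairs (0,1) and (2,3)
psi = apply_two_qubit(psi, conv, 2, 3)

psi = measure_and_discard(psi, 3, rng)      # pooling: measure and discard qubits 3 and 1
psi = measure_and_discard(psi, 1, rng)
print("qubits left after one conv + pool step:", psi.ndim)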
5. Quantum Dissipative Neural Networks

Dissipative QNNs, as proposed in [10], consist of models where each qubit represents a node, and edges connecting qubits in different layers represent general unitary operations acting on them. The use of the term dissipative refers to the progressive discarding of layers of qubits after a given layer has interacted with the following one (see, e.g., Fig. 1(f) for a schematic illustration). This model can implement and learn general quantum transformations, carrying out a universal quantum computation. Endowed with a backpropagation-like training algorithm, it represents a possible route towards the realization of quantum analogues of deep neural networks for analyzing quantum data.

III. TRAINABILITY OF QNN

For decades, AI methods remained merely academic tools, due to the lack of computational power and proper training routines, which made the use of learning algorithms practically unfeasible on available devices. This underlines a pivotal problem of all learning models, i.e., the ability to efficiently train them.

1. Barren plateaus

Unfortunately, quantum machine learning models were also shown to suffer from serious trainability issues, which are commonly referred to as barren plateaus [47]. In particular, using a parametrization as in Eq. (1), and assuming that such an architecture forms a so-called unitary 2-design², it can be shown that the derivative of any cost function of the form C(θ) = ⟨0|U_QNN(θ)† O U_QNN(θ)|0⟩ with respect to any of its parameters θ_k satisfies ⟨∂_k C⟩ = 0 and Var[∂_k C] ≈ 2^(−n), where n is the number of qubits. This means that, on average, there will be no clear direction of optimization in the cost function landscape upon random initialization of the parameters in a large enough circuit, the landscape being essentially flat everywhere except in close proximity to the minima.

² Roughly speaking, it is sufficiently random that sampling over its distribution matches the Haar distribution — a generalization of the uniform distribution in the space of unitary matrices — up to the second moment.

Moreover, barren plateaus were shown to be strongly dependent on the nature of the cost function driving the optimization routine [48]. In particular, while the use of a global cost function always hinders the trainability of the model (independently of the depth of the quantum circuit), a local cost function allows for efficient trainability up to circuit depths of order O(log n). Besides, entanglement also plays a major role in hindering the trainability of QNNs. In fact, it was shown that whenever the objective cost function is evaluated only on a small subset of the available qubits (visible units), with the remaining ones being traced out (hidden units), a barren plateau in the optimization landscape may occur [49]. This arises from the presence of entanglement, which tends to spread information across the whole network instead of localizing it in its single parts. Moreover, the presence of hardware noise during the execution of the quantum circuit can itself be the reason for the emergence of noise-induced barren plateaus [50].

It is worth noticing that most of these results stem from strong assumptions on the shape of the variational ansatz, for example its ability to approximate an exact 2-design. While these assumptions are often well matched by proposed quantum learning models, QNNs should be analyzed case by case to obtain precise statements about their trainability. As an example, dissipative QNNs were shown to suffer from barren plateaus [51] when global unitaries are used, while convolutional QNNs seem to be immune to the barren plateau phenomenon [52]. In general terms, however, recent results indicate that the trainability of quantum models is closely related to their expressivity, as measured by their distance from being an exact 2-design: highly expressive ansatzes generally exhibit flatter landscapes and will be harder to train [53].

Lastly, it was recently argued that the training of any VQA is by itself an NP-hard problem [54], thus being intrinsically difficult to tackle. Despite their theoretical value, these results should not represent a severe threat, as similar conclusions also hold in general for classical NNs, particularly in worst-case analyses, and did not prevent their successful application in many specific use cases.
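The vanishing-gradient behaviour summarized above can be probed numerically. In the toy experiment below, the 2-design assumption is approximated by Haar-random unitaries V and W surrounding a single R_y(θ) rotation, the cost is C(θ) = ⟨0|U†OU|0⟩ for the global observable O = Z⊗···⊗Z, and the gradient at θ = 0 is computed with the parameter-shift rule; its mean stays near zero and its variance shrinks rapidly as qubits are added. All specific choices (observable, circuit form, sample sizes) are illustrative.

import numpy as np

def haar_unitary(dim, rng):
    # Haar-random unitary from the QR decomposition of a complex Gaussian matrix.
    z = (rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))) / np.sqrt(2)
    q, r = np.linalg.qr(z)
    return q * (np.diag(r) / np.abs(np.diag(r)))

def ry_on_first_qubit(theta, n):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.kron(np.array([[c, -s], [s, c]], dtype=complex), np.eye(2 ** (n - 1)))

def cost(theta, V, W, O, psi0, n):
    psi = W @ ry_on_first_qubit(theta, n) @ (V @ psi0)
    return float(np.real(np.vdot(psi, O @ psi)))

rng = np.random.default_rng(4)
Z = np.diag([1.0, -1.0])
for n in range(2, 7):
    dim = 2 ** n
    O = np.array([[1.0]])
    for _ in range(n):
        O = np.kron(O, Z)                 # global observable Z (x) Z (x) ... (x) Z
    psi0 = np.zeros(dim, dtype=complex)
    psi0[0] = 1.0
    grads = []
    for _ in range(200):                  # random unitaries standing in for a 2-design
        V, W = haar_unitary(dim, rng), haar_unitary(dim, rng)
        # Parameter-shift rule at theta = 0: dC/dtheta = [C(+pi/2) - C(-pi/2)] / 2.
        grads.append(0.5 * (cost(np.pi / 2, V, W, O, psi0, n)
                            - cost(-np.pi / 2, V, W, O, psi0, n)))
    print("n = {}: mean grad = {:+.4f}, var grad = {:.2e}".format(n, np.mean(grads), np.var(grads)))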
2. Avoiding plateaus

Several strategies have been proposed to avoid, or at least mitigate, the emergence of barren plateaus. A possible idea is to initialize and train parameters in batches, in a procedure known as layerwise learning [55]. Similarly, another approach is to reduce the effective depth of the circuit by randomly initializing only a subset of the total parameters, with the others chosen such that the overall circuit implements the identity operation [56]. Introducing correlations between the parameters in multiple layers of the QNN, thus reducing the overall dimensionality of the parameter space [57], has also been proposed. Along a different direction, leveraging classical recurrent NNs to find good parameter initialization heuristics, such that the network starts close to a minimum, has also proven effective [58]. Finally, as previously mentioned, it is worth recalling that the choice of a local cost function and of a specific entanglement scaling between hidden and visible units is of key importance to ultimately avoid barren plateaus [48].

3. Optimization routines

The most common training routine in classical NNs is the backpropagation algorithm [1], which combines the chain rule for derivatives with stored values from intermediate layers to efficiently compute gradients. However, such a strategy cannot be straightforwardly generalized to the quantum case, because intermediate values from a quantum computation can only be accessed via measurements, which disturb the relevant quantum states [59]. For this reason, the typical strategy is to use classical optimizers leveraging gradient-based or gradient-free numerical methods. However, there exist scenarios where the quantum nature of the optimization can be taken into account, leading to strategies tailored to the quantum domain like the parameter-shift rule [25, 59], the Quantum Natural Gradient [60], and closed update formulas [26]. Finally, while a fully coherent quantum update of the parameters is certainly appealing [61], this remains out of reach for near-term hardware, which is limited in the number of available qubits and circuit depth [11].
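For a gate generated by a Pauli operator, G(θ) = exp(−iθP/2) with P² = 1, the parameter-shift rule gives the exact gradient of an expectation value from two shifted evaluations, ∂C/∂θ = [C(θ + π/2) − C(θ − π/2)]/2. The single-qubit check below verifies this on C(θ) = ⟨0|R_x(θ)†ZR_x(θ)|0⟩ = cos θ; it is only a minimal consistency check, not the general statement of Refs. [25, 59].

import numpy as np

Z = np.diag([1.0, -1.0]).astype(complex)

def rx(theta):
    # R_x(theta) = exp(-i*theta*X/2)
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def cost(theta):
    # C(theta) = <0| R_x(theta)^dagger Z R_x(theta) |0> = cos(theta)
    psi = rx(theta) @ np.array([1.0, 0.0], dtype=complex)
    return float(np.real(np.vdot(psi, Z @ psi)))

theta = 0.7
shift = 0.5 * (cost(theta + np.pi / 2) - cost(theta - np.pi / 2))   # parameter-shift gradient
finite = (cost(theta + 1e-6) - cost(theta - 1e-6)) / 2e-6           # finite-difference check
print(shift, finite, -np.sin(theta))                                # all approximately equal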
IV. QUANTUM ADVANTAGE

Quantum computers are expected to become more powerful than their classical counterparts on many specific applications [62–64], and particularly in sampling from complex probability distributions — a feature particularly relevant for generative modeling applications [65, 66]. It is therefore natural to ask whether this hierarchy also applies to learning models. While a positive answer is widely believed to be true, clear indications on how to achieve such a quantum advantage are still under study, with only a few results in this direction.

1. Power of Quantum Neural Networks

In the previous sections, we used the term expressivity quite loosely. In particular, we used it with two different meanings: the first to indicate the ability of QNNs to approximate any function of the inputs [23], the second to indicate their distance from being an exact 2-design, i.e., the ability to span the space of quantum states uniformly [53]. Each definition conveys a particular facet of an intuitive sense of power, highlighting the need for a unifying measure of such power, yet to be discovered. Nevertheless, some initial results towards a thorough analysis of the capacity (or power) of quantum neural networks have recently been put forward [67–69]. Starting from the Vapnik–Chervonenkis (VC) dimension, which is a classical measure of the richness of a learning model related to the maximum number of independent classifications that the model can implement, a universal metric of capacity, related to the maximum information that can be stored in the parameters of a learning model, be it classical or quantum, was defined [67].

Based on this definition, it was claimed that quantum learning models have the same overall capacity as any classical model having the same number of weights. However, as argued in Ref. [68], measures of capacity like the VC dimension can become vacuous and difficult — if not completely impractical — to measure, and therefore cannot be used as a general, faithful measure of the actual power of learning models. For this reason, the use of the effective dimension as a measure of power was proposed, motivated by information-theoretic arguments. This measure can be linked to generalization bounds of the learning models, leading to the result that quantum neural networks can be more expressive and more efficient during training than their classical counterparts, thanks to a more evenly spread spectrum of the Fisher Information. Some evidence was provided through the analysis of two particular QNN instances: one having a trivial input data encoding, and the other using the encoding scheme proposed in [8], which is thought to be classically hard to simulate and indeed shows remarkable performances. Despite such specificity, this is one of the first results drawing a clear separation between quantum and classical neural networks, highlighting the great potential of the former.

2. The role of data

In the same spirit, but with different tools, the role of data in analyzing the performances of quantum machine learning algorithms was taken into account [69], by devising geometric tests aimed at clearly identifying those scenarios where quantum models can achieve an advantage. In particular, focusing on kernel methods, a geometric measure was defined based exclusively on the training dataset and the kernel function induced by the learning model, independent of the actual classification or function to be learned. If this quantity is similar for the classical and the quantum learning case, then classical models provably perform similarly to or better than quantum models. However, if this geometric distance is large, then there exists some dataset on which the quantum model can outperform the classical one. Notably, one key result of the study is that poor performances of quantum models could be attributed precisely to the careless encoding of data into exponentially large quantum Hilbert spaces, which spreads them too far apart from each other and results in the kernel matrix approximating the identity. To avoid this setback, the authors propose the idea of projected quantum kernels, which aim at reducing the effective dimension of the encoded data set while keeping the advantages of quantum correlations. On engineered data sets, this procedure was shown to outperform all tested classical models in prediction error. From a higher perspective, the most important takeaway of these results is that data themselves play a major role in learning models, both quantum and classical.
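The kernel-concentration issue mentioned here can be seen in a toy numerical experiment: if the encoded states are scattered essentially at random over an exponentially large Hilbert space (modeled below, as an admittedly extreme case, by Haar-random states), the off-diagonal entries |⟨φ(x_i)|φ(x_j)⟩|² shrink as roughly 1/2^n and the kernel matrix approaches the identity as qubits are added.

import numpy as np

def random_state(dim, rng):
    # Haar-random pure state, used as a stand-in for |phi(x)> under an uninformative encoding.
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(5)
for n in [2, 4, 6, 8, 10]:
    dim = 2 ** n
    states = [random_state(dim, rng) for _ in range(20)]    # 20 "encoded" data points
    K = np.array([[abs(np.vdot(a, b)) ** 2 for b in states] for a in states])
    off_diag = K[~np.eye(len(K), dtype=bool)]
    print("n = {:2d}: mean off-diagonal entry = {:.2e} (1/2^n = {:.2e})".format(n, off_diag.mean(), 1 / dim))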
When discussing the effectiveness of quantum neural
3. Quantum data for quantum neural networks

Some data sets could be particularly suited for QNNs. An example is provided by those proposed in [70], based on the discrete logarithm problem. Also, it is not hard to suppose that quantum machine learning will prove particularly useful when analyzing data that are inherently quantum, for example data coming from quantum sensors, or obtained from quantum chemistry simulations of condensed matter physics. However, even if this sounds theoretically appealing, no clear separation has yet been found in this respect. A particularly relevant open problem concerns, for example, the identification and generation of interesting quantum datasets. Moreover, based on information-theoretic derivations in the quantum communication formalism, it was recently shown [71] that a classical learner with access to quantum data coming from quantum experiments performs similarly to quantum models with respect to sample complexity and average prediction error. Even if an exponential advantage is still attainable when the goal is to minimize the worst-case prediction error, classical methods are again found to be surprisingly powerful even when given access to quantum data.

4. Quantum-inspired methods

When discussing the effectiveness of quantum neural networks, and in order to achieve a true quantum advantage, one should compare not only to previous quantum models but also to the best available classical alternatives. For this reason, and until fault-tolerant quantum computation is achieved, quantum-inspired procedures on classical computers could pave the way towards the discovery of effective solutions [19–21]. One of the most prominent examples is represented by tensor networks [72], initially developed for the study of quantum many-body systems, which show remarkably good performances for the simulation of quantum computers in specific regimes [73]. In fact, many proposals for quantum neural networks are inspired by, or can be directly rephrased in, the language of tensor networks [9, 46], so that libraries developed for those applications could be used to run some instances of these quantum algorithms. Fierce competition is therefore to be expected in the coming years between quantum processors and advanced classical methods, including the so-called neural-network quantum states [74, 75] for the study of many-body physics.

V. OUTLOOKS

Quantum machine learning techniques, and particularly quantum neural network models, hold promise to significantly increase our computational capabilities, unlocking whole new areas of investigation. The main theoretical and practical challenges currently encompass trainability, the identification and suitable treatment of appropriate data sets, and a systematic comparison with classical counterparts. Moreover, with steady progress on the hardware side, significant attention will be devoted in the near future towards experimental demonstrations of practical relevance. Research evolves rapidly, with seminal results highlighting prominent directions to explore in the quest for quantum computational advantage in AI applications, and new exciting results are expected in the years to come.

VI. ACKNOWLEDGMENTS

This research was partly supported by the Italian Ministry of Education, University and Research (MIUR) through the “Dipartimenti di Eccellenza Program (2018-2022)”, Department of Physics, University of Pavia, and the PRIN-2017 project 2017P9FJBS “INPhoPOL”, and by the EU H2020 QuantERA ERA-NET Cofund in Quantum Technologies project QuICHE.

[1] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning (MIT Press, 2016).
[2] G. E. Hinton, S. Osindero, and Y.-W. Teh, A fast learning algorithm for deep belief nets, Neural Comput. 18, 1527 (2006).
[3] Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature 521, 436–444 (2015).
[4] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer Series in Statistics (Springer, 2009).
[5] A. S. G. Andrae and T. Edler, On global electricity usage of communication technology: Trends to 2030, Challenges 6, 117 (2015).
[6] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, 10th ed. (Cambridge University Press, Cambridge, UK, 2010).
[7] F. Tacchino, C. Macchiavello, D. Gerace, and D. Bajoni, An artificial neuron implemented on an actual quantum processor, npj Quantum Inf. 5, 26 (2019).
[8] V. Havlíček, A. D. Córcoles, K. Temme, A. W. Harrow, A. Kandala, J. M. Chow, and J. M. Gambetta, Supervised learning with quantum-enhanced feature spaces, Nature 567, 209–212 (2019).
[9] I. Cong, S. Choi, and M. D. Lukin, Quantum convolutional neural networks, Nat. Phys. 15, 1273–1278 (2019).
[10] K. Beer, D. Bondarenko, T. Farrelly, T. J. Osborne, R. Salzmann, D. Scheiermann, and R. Wolf, Training deep quantum neural networks, Nat. Commun. 11, 808 (2020).
[11] F. Tacchino, A. Chiesa, S. Carretta, and D. Gerace, Quantum computers as universal quantum simulators: State-of-the-art and perspectives, Adv. Quantum Technol. 3, 1900052 (2020).
[12] M. Benedetti, E. Lloyd, S. Sack, and M. Fiorentini, Parameterized quantum circuits as machine learning models, Quantum Sci. Technol. 4, 043001 (2019).
[13] J. R. McClean, J. Romero, R. Babbush, and A. Aspuru-Guzik, The theory of variational hybrid quantum-classical algorithms, New J. Phys. 18, 023023 (2016).
[14] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O'Brien, A variational eigenvalue solver on a photonic quantum processor, Nat. Commun. 5, 4213 (2014).
[15] M. Cerezo, A. Arrasmith, R. Babbush, S. C. Benjamin, S. Endo, K. Fujii, J. R. McClean, K. Mitarai, X. Yuan, L. Cincio, and P. J. Coles, Variational quantum algorithms (2020), arXiv:2012.09265 [quant-ph].
[16] J. Preskill, Quantum Computing in the NISQ era and beyond, Quantum 2, 79 (2018).
[17] K. Bharti, A. Cervera-Lierta, T. H. Kyaw, T. Haug, S. Alperin-Lea, A. Anand, M. Degroote, H. Heimonen, J. S. Kottmann, T. Menke, W.-K. Mok, S. Sim, L.-C. Kwek, and A. Aspuru-Guzik, Noisy intermediate-scale quantum (NISQ) algorithms (2021), arXiv:2101.08448 [quant-ph].
[18] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, Quantum machine learning, Nature 549, 195–202 (2017).
[19] S. Aaronson, Read the fine print, Nat. Phys. 11, 291–293 (2015).
[20] E. Tang, A quantum-inspired classical algorithm for recommendation systems, in Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019 (Association for Computing Machinery, New York, NY, USA, 2019), pp. 217–228.
[21] J. M. Arrazola, A. Delgado, B. R. Bardhan, and S. Lloyd, Quantum-inspired algorithms in practice, Quantum 4, 307 (2020).
[22] F. J. Gil Vidal and D. O. Theis, Input redundancy for parameterized quantum circuits, Front. Phys. 8, 297 (2020).
[23] M. Schuld, R. Sweke, and J. J. Meyer, Effect of data encoding on the expressive power of variational quantum-machine-learning models, Phys. Rev. A 103, 032430 (2021).
[24] A. Pérez-Salinas, A. Cervera-Lierta, E. Gil-Fuster, and J. I. Latorre, Data re-uploading for a universal quantum classifier, Quantum 4, 226 (2020).
[25] K. Mitarai, M. Negoro, M. Kitagawa, and K. Fujii, Quantum circuit learning, Phys. Rev. A 98, 032309 (2018).
[26] M. Ostaszewski, E. Grant, and M. Benedetti, Structure optimization for parameterized quantum circuits, Quantum 5, 391 (2021).
[27] K. Hornik, Approximation capabilities of multilayer feedforward networks, Neural Netw. 4, 251 (1991).
[28] R. LaRose and B. Coyle, Robust data encodings for quantum classifiers, Phys. Rev. A 102, 032420 (2020).
[29] S. Lloyd, M. Schuld, A. Ijaz, J. Izaac, and N. Killoran, (2020), arXiv:2001.03622 [quant-ph].
[30] M. Schuld, I. Sinayskiy, and F. Petruccione, The quest for a quantum neural network, Quantum Inf. Process. 13, 2567–2586 (2014).
[31] M. Schuld, I. Sinayskiy, and F. Petruccione, Simulating a perceptron on a quantum computer, Phys. Lett. A 379, 660 (2015).
[32] Y. Cao, G. G. Guerreschi, and A. Aspuru-Guzik, Quantum neuron: an elementary building block for machine learning on quantum computers (2017), arXiv:1711.11240 [quant-ph].
[33] E. Torrontegui and J. J. Garcia-Ripoll, Unitary quantum perceptron as efficient universal approximator, EPL 125, 30004 (2019).
[34] S. Mangini, F. Tacchino, D. Gerace, C. Macchiavello, and D. Bajoni, Quantum computing model of an artificial neuron with continuously valued input data, Mach. Learn.: Sci. Technol. 1, 045008 (2020).
[35] F. Tacchino, S. Mangini, P. K. Barkoutsos, C. Macchiavello, D. Gerace, I. Tavernelli, and D. Bajoni, Variational learning for quantum artificial neural networks, IEEE Transactions on Quantum Engineering 2, 1 (2021).
[36] F. Tacchino, P. Barkoutsos, C. Macchiavello, I. Tavernelli, D. Gerace, and D. Bajoni, Quantum implementation of an artificial feed-forward neural network, Quantum Sci. Technol. 5, 044010 (2020).
[37] M. Schuld and N. Killoran, Quantum machine learning in feature Hilbert spaces, Phys. Rev. Lett. 122, 040504 (2019).
[38] D. H. Ackley, G. E. Hinton, and T. J. Sejnowski, A learning algorithm for Boltzmann machines, Cogn. Sci. 9, 147 (1985).
[39] M. H. Amin, E. Andriyash, J. Rolfe, B. Kulchytskyy, and R. Melko, Quantum Boltzmann machine, Phys. Rev. X 8, 021050 (2018).
[40] C. Zoufal, A. Lucchi, and S. Woerner, Variational quantum Boltzmann machines, Quantum Machine Intelligence 3, 7 (2021).
[41] M. Kieferová and N. Wiebe, Tomography and generative training with quantum Boltzmann machines, Phys. Rev. A 96, 062327 (2017).
[42] C. Zoufal, A. Lucchi, and S. Woerner, Quantum generative adversarial networks for learning and loading random distributions, npj Quantum Inf. 5, 103 (2019).
[43] S. Lloyd and C. Weedbrook, Quantum generative adversarial learning, Phys. Rev. Lett. 121, 040502 (2018).
[44] P.-L. Dallaire-Demers and N. Killoran, Quantum generative adversarial networks, Phys. Rev. A 98, 012324 (2018).
[45] B. Coyle, D. Mills, V. Danos, and E. Kashefi, The Born supremacy: quantum advantage and training of an Ising Born machine, npj Quantum Inf. 6, 60 (2020).
[46] E. Grant, M. Benedetti, S. Cao, A. Hallam, J. Lockhart, V. Stojevic, A. G. Green, and S. Severini, Hierarchical quantum classifiers, npj Quantum Inf. 4, 65 (2018).
[47] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, Barren plateaus in quantum neural network training landscapes, Nat. Commun. 9, 4812 (2018).
[48] M. Cerezo, A. Sone, T. Volkoff, L. Cincio, and P. J. Coles, Cost function dependent barren plateaus in shallow parametrized quantum circuits, Nat. Commun. 12, 1791 (2021).
[49] C. O. Marrero, M. Kieferová, and N. Wiebe, Entanglement induced barren plateaus (2020), arXiv:2010.15968 [quant-ph].
[50] S. Wang, E. Fontana, M. Cerezo, K. Sharma, A. Sone, L. Cincio, and P. J. Coles, Noise-induced barren plateaus in variational quantum algorithms (2020), arXiv:2007.14384 [quant-ph].
[51] K. Sharma, M. Cerezo, L. Cincio, and P. J. Coles, Trainability of dissipative perceptron-based quantum neural networks (2020), arXiv:2005.12458 [quant-ph].
[52] A. Pesah, M. Cerezo, S. Wang, T. Volkoff, A. T. Sornborger, and P. J. Coles, Absence of barren plateaus in quantum convolutional neural networks (2020), arXiv:2011.02966 [quant-ph].
[53] Z. Holmes, K. Sharma, M. Cerezo, and P. J. Coles, Connecting ansatz expressibility to gradient magnitudes and barren plateaus (2021), arXiv:2101.02138 [quant-ph].
[54] L. Bittel and M. Kliesch, Training variational quantum algorithms is NP-hard – even for logarithmically many qubits and free fermionic systems (2021), arXiv:2101.07267 [quant-ph].
[55] A. Skolik, J. R. McClean, M. Mohseni, P. van der Smagt, and M. Leib, Layerwise learning for quantum neural networks (2020), arXiv:2006.14904 [quant-ph].
[56] E. Grant, L. Wossnig, M. Ostaszewski, and M. Benedetti, An initialization strategy for addressing barren plateaus in parametrized quantum circuits, Quantum 3, 214 (2019).
[57] T. Volkoff and P. J. Coles, Large gradients via correlation in random parameterized circuits, Quantum Sci. Technol. 6 (2021).
[58] G. Verdon, M. Broughton, J. R. McClean, K. J. Sung, R. Babbush, Z. Jiang, H. Neven, and M. Mohseni, Learning to learn with quantum neural networks via classical neural networks (2019), arXiv:1907.05415 [quant-ph].
[59] M. Schuld, V. Bergholm, C. Gogolin, J. Izaac, and N. Killoran, Evaluating analytic gradients on quantum hardware, Phys. Rev. A 99, 032331 (2019).
[60] J. Stokes, J. Izaac, N. Killoran, and G. Carleo, Quantum Natural Gradient, Quantum 4, 269 (2020).
[61] G. Verdon, J. Pye, and M. Broughton, A universal training algorithm for quantum deep learning (2018), arXiv:1806.09729 [quant-ph].
[62] A. W. Harrow and A. Montanaro, Quantum computational supremacy, Nature 549, 203–209 (2017).
[63] F. Arute, K. Arya, R. Babbush, D. Bacon, J. C. Bardin, R. Barends, R. Biswas, S. Boixo, F. G. S. L. Brandao, D. A. Buell, et al., Quantum supremacy using a programmable superconducting processor, Nature 574, 505–510 (2019).
[64] H.-S. Zhong, H. Wang, Y.-H. Deng, M.-C. Chen, L.-C. Peng, Y.-H. Luo, J. Qin, D. Wu, X. Ding, Y. Hu, P. Hu, X.-Y. Yang, W.-J. Zhang, H. Li, Y. Li, X. Jiang, L. Gan, G. Yang, L. You, Z. Wang, L. Li, N.-L. Liu, C.-Y. Lu, and J.-W. Pan, Quantum computational advantage using photons, Science 370, 1460 (2020).
[65] R. Sweke, J.-P. Seifert, D. Hangleiter, and J. Eisert, On the Quantum versus Classical Learnability of Discrete Distributions, Quantum 5, 417 (2021).
[66] Y. Du, M.-H. Hsieh, T. Liu, and D. Tao, Expressive power of parametrized quantum circuits, Phys. Rev. Research 2, 033125 (2020).
[67] L. G. Wright and P. L. McMahon, The capacity of quantum neural networks (2019), arXiv:1908.01364 [quant-ph].
[68] A. Abbas, D. Sutter, C. Zoufal, A. Lucchi, A. Figalli, and S. Woerner, The power of quantum neural networks (2020), arXiv:2011.00027 [quant-ph].
[69] H.-Y. Huang, M. Broughton, M. Mohseni, R. Babbush, S. Boixo, H. Neven, and J. R. McClean, Power of data in quantum machine learning, Nat. Commun. 12, 2631 (2021).
[70] Y. Liu, S. Arunachalam, and K. Temme, A rigorous and robust quantum speed-up in supervised machine learning (2020), arXiv:2010.02174 [quant-ph].
[71] H.-Y. Huang, R. Kueng, and J. Preskill, Information-theoretic bounds on quantum advantage in machine learning, Phys. Rev. Lett. 126, 190505 (2021).
[72] S. Montangero, Introduction to Tensor Network Methods (Springer Nature Switzerland AG, Cham, CH, 2018).
[73] Y. Zhou, E. M. Stoudenmire, and X. Waintal, What limits the simulation of quantum computers?, Phys. Rev. X 10, 041038 (2020).
[74] G. Carleo and M. Troyer, Solving the quantum many-body problem with artificial neural networks, Science 355, 602 (2017).
[75] G. Torlai, G. Mazzola, J. Carrasquilla, M. Troyer, R. Melko, and G. Carleo, Neural-network quantum state tomography, Nat. Phys. 14, 447 (2018).
