
Artificial Intelligence and Machine Learning for Quantum Technologies

Mario Krenn,1,∗ Jonas Landgraf,1,2 Thomas Foesel,1,2 and Florian Marquardt1,2

1 Max Planck Institute for the Science of Light, Erlangen, Germany.
2 Department of Physics, Friedrich-Alexander Universität Erlangen-Nürnberg, Germany.
∗ ML4qtech@mpl.mpg.de

(Dated: August 9, 2022)
arXiv:2208.03836v1 [quant-ph] 7 Aug 2022

In recent years, the dramatic progress in machine learning has begun to impact many areas of science and technology significantly. In the present perspective article, we explore how quantum technologies are benefiting from this revolution. We showcase in illustrative examples how scientists in the past few years have started to use machine learning and more broadly methods of artificial intelligence to analyze quantum measurements, estimate the parameters of quantum devices, discover new quantum experimental setups, protocols, and feedback strategies, and generally improve aspects of quantum computing, quantum communication, and quantum simulation. We highlight open challenges and future possibilities and conclude with some speculative visions for the next decade.

CONTENTS

I. Introduction
II. Basic Techniques of Machine Learning and Artificial Intelligence
   A. Evolutionary Algorithms
   B. Neural Networks: Structure
   C. Neural Networks: Training
   D. Unsupervised Learning
   E. Reinforcement Learning
   F. Automatic differentiation and gradient-based optimization
   G. How to get started
III. Applications of Machine Learning for Quantum Technologies
   A. Measurement data analysis and quantum state representation
   B. Parameter estimation: learning the properties of quantum systems
   C. Discovering strategies for hardware-level quantum control
   D. Discovering quantum experiments, protocols, and circuits
   E. Quantum Error Correction
IV. Outlook
V. Acknowledgments
References

I. INTRODUCTION

The fields of machine learning [1–3] and quantum technologies [4–7] have a lot in common: both started out with an amazing vision of applications (in the 1950s and 1980s, respectively), went through a series of challenges, and are currently extremely hot research topics. Of these two, machine learning has firmly taken hold beyond academia and beyond prototypes, triggering a revolution in technological applications during the past decade. This perspective article is concerned with shining a spotlight on how techniques of classical machine learning (ML) and artificial intelligence (AI) hold great promise for improving quantum technologies in the future. A wide range of ideas has been developed at the interface between the two fields during the past five years, see Fig. 1. Whether one tries to understand a quantum state through measurements, discover optimal feedback strategies or quantum error correction protocols, or design new quantum experiments, machine learning can yield efficient solutions, optimized performance and, in the best cases, even new insights.

With the present review, we aim to take physicists with a background in quantum technologies on a tour of this rapidly growing area at the interface to classical machine learning. Readers are not expected to have a background in machine learning and will get a state-of-the-art view of how machine learning techniques are applied to quantum physics. We should state right away that in this perspective article we pursue this goal by focusing, in each application domain, on a few selected illustrative examples. Our selection is necessarily subjective. We thus make no claim to providing a comprehensive list of the literature and apologize to anyone who misses a favorite work.

We hope that after seeing the examples discussed in our review, the reader will appreciate how useful machine learning techniques could be for quantum technologies. At the same time, we also want to make the reader aware of how crucial it is to choose the right AI approach for a problem. The most advanced AI method is by no means necessarily the best-suited tool for a given task. Often, modern deep-learning methods can be significantly outperformed by rather simple methods if applied to the wrong task, as seen in other areas [8]. Here it is crucial to analyze the scope of the problem and decide on the best-suited algorithms.

FIG. 1. Overview of tasks in the area of quantum technologies that machine learning and artificial intelligence can help solve better, as explained in this perspective article.

Once the reader wants to understand how to apply these tools in practice, we recommend the very educational lecture notes introducing machine learning techniques with a view to their application to quantum devices, both brief [9] and very extended [10]. In addition, there are several reviews from recent years with a somewhat different focus than ours, e.g. about machine learning applied to physics in general [11] or machine learning for quantum many-body physics [12].

We also remark that there is the whole field of quantum machine learning, which tries to discover potential quantum advantages when implementing new learning algorithms on quantum platforms. This also includes variational quantum circuits, which are very promising in the context of noisy intermediate-scale quantum computers (NISQ). We do not cover these developments here and refer the interested reader to reviews of quantum machine learning [13–16] and of variational quantum algorithms and NISQ devices [17, 18]. While we focus on quantum technology here, we want to mention that there is also considerable work on ML for foundational quantum science [19].

In the following section, we will briefly introduce the basics of neural networks and other machine learning techniques, aiming to set the stage for subsequent discussions. The bulk of our review is contained in section III, where we discuss the various applications of machine learning to quantum technologies. In each case, we aim to remark on some of the challenges and potential future research directions. Finally, in the outlook, we speculate about how machine learning might have transformed quantum technologies a dozen years from now.

II. BASIC TECHNIQUES OF MACHINE LEARNING AND ARTIFICIAL INTELLIGENCE

The purpose of this section is merely to provide a glimpse of the essential basics so that the subsequent discussion of applications becomes intelligible to the reader without prior exposure. Machine learning techniques have been around for several decades and involve many efficient approaches that predate the recent deep learning revolution, like "support vector machines" or "decision trees" [20]. However, the flexibility of neural networks has made them a popular general-purpose choice, so we focus on those in our brief introduction.

A. Evolutionary Algorithms

One class of algorithms that has often been used for optimization tasks in the domain of artificial intelligence is that of evolutionary (genetic) algorithms [21, 22]. There, the idea is to deal with a set of candidate solutions, each of them described via a suitable vector. These solutions can be randomly changed ("mutated"), two solutions can be combined to form a new candidate ("crossover"), and finally only the best solutions are kept for the next round of evolution ("selection"). Such methods can be surprisingly effective, and their application to new problems is often straightforward. As we will describe in further sections, genetic algorithms have successfully been applied to quantum-technology-related tasks.
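To make the loop of mutation, crossover, and selection concrete, the following minimal sketch (a toy example of our own, not code from the works cited above) evolves a population of real-valued vectors to minimize a generic cost function; in a quantum-technology setting, such a vector could encode, for instance, pulse amplitudes or beam-splitter angles.

```python
import numpy as np

rng = np.random.default_rng(0)

def cost(x):
    # Toy cost function standing in for, e.g., an infidelity returned by a simulation.
    return np.sum((x - 0.7) ** 2)

def evolve(pop_size=40, dim=8, generations=200, mutation_scale=0.1):
    # Random initial population of candidate solutions.
    pop = rng.uniform(-1, 1, size=(pop_size, dim))
    for _ in range(generations):
        # Selection: keep the better half of the population.
        pop = pop[np.argsort([cost(x) for x in pop])][: pop_size // 2]
        children = []
        for _ in range(pop_size - len(pop)):
            # Crossover: mix the entries of two randomly chosen parents.
            p1, p2 = pop[rng.integers(len(pop), size=2)]
            child = np.where(rng.random(dim) < 0.5, p1, p2)
            # Mutation: small random changes.
            child += mutation_scale * rng.normal(size=dim)
            children.append(child)
        pop = np.vstack([pop, children])
    return min(pop, key=cost)

best = evolve()
print(cost(best))
```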
B. Neural Networks: Structure

The structure of artificial neural networks, which can be trained to approximate arbitrary functions, is loosely motivated by neurons in the human brain. Each neuron receives multiple inputs and generates an output signal which serves as input for other neurons. During learning, the strength of connections between neurons changes. More precisely, each artificial neuron receives N inputs x_j which are summed according to some weights w_j, representing connection strengths, see Fig. 2a. After adding a bias (shift) b, a simple nonlinear "activation function" f is applied to yield the output y:

y = f(z)                          (1a)
z = Σ_{j=1}^{N} w_j x_j + b       (1b)

Without the nonlinear activation function between the layers, the whole neural network could be compressed into one single linear transformation, with very limited computational power. Popular activation functions are the "ReLU" (f(z) = z for z ≥ 0 and f(z) = 0 for z < 0) and the "sigmoid" function (f(z) = 1/(e^{-z} + 1)), which represents a smoothed step function.
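As a minimal illustration (our own toy sketch, not tied to any particular library), Eq. (1) amounts to a weighted sum followed by a nonlinearity; applied to a whole layer of neurons acting on a batch of inputs, the same operation becomes a matrix multiplication:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (np.exp(-z) + 1.0)

def neuron(x, w, b, f=relu):
    # Eq. (1): weighted sum of the inputs plus a bias, passed through the activation f.
    return f(np.dot(w, x) + b)

def dense_layer(X, W, b, f=relu):
    # The same operation for a whole layer acting on a batch:
    # X has shape (batch, n_in), W has shape (n_in, n_out), b has shape (n_out,).
    return f(X @ W + b)

rng = np.random.default_rng(1)
x = rng.normal(size=4)                                   # N = 4 inputs
print(neuron(x, w=rng.normal(size=4), b=0.1))            # output of a single neuron
print(dense_layer(rng.normal(size=(2, 4)), rng.normal(size=(4, 3)), np.zeros(3)).shape)
```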
Multiple connected neurons form a neural network, as shown in Fig. 2b. These networks are typically structured into several layers, where the output from each layer serves as input for the next layer. Overall, there is one input layer, one or more intermediate "hidden" layers, and an output layer. The input being fed into such a network could be, e.g., pixel data from an image, and after successful training, the output neurons encode, e.g., the category of the image (cat vs. dog). A network with only one hidden layer and suitably many neurons is already able to approximate arbitrary functions with arbitrary precision [23]. However, networks with more hidden layers (called "deep" neural networks) may be able to fulfill the same task with fewer neurons.

In the network in Fig. 2b, all outputs of one layer serve as input for all neurons of the next layer. Such a structure is called a dense layer, or "fully connected network." However, this is often not the best choice, especially when the input has certain symmetries. For example, an image classification task can have a translational symmetry, since the category of an object shown in a picture is independent of the object's precise location. Convolutional neural networks (CNNs) exploit this translational invariance by convolving the input with a "kernel" that can be learned.

Time series have a temporal structure that can be utilized using so-called "recurrent" neural networks (RNNs). The key idea is that the RNN receives the information for each subsequent time step sequentially as input but also keeps some memory about previous inputs. One of the most advanced and commonly used types of RNN are the so-called long short-term memory (LSTM) networks [24].

FIG. 2. Basics of Neural Networks and Machine Learning Techniques. (a) Operation of a single artificial neuron. (b) Structure of a neural network with dense layers. (c) Evolution of the network's parameters θ in the cost-function landscape using stochastic gradient descent. For every step (orange arrows), the gradients of the averaged cost function with respect to θ (gray dashed line) are approximated by the gradient averaged over a random batch. The parameters θ̂ minimize the averaged cost function. (d) Classification of unlabeled data. (e) Reinforcement learning problem modelled as a Markov decision process.

C. Neural Networks: Training

A network can be made to approximate any function by suitably choosing its weights and biases, often summarily denoted as a parameter vector θ. To produce correct predictions with a neural network, its parameters have to be trained for the given problem. Supervised learning (SL) is a training method to learn a desired target function f(x) from a dataset containing pairs ("samples") of inputs and associated outputs, (x, f(x)). A neural network represents an approximate input-output relation f_θ(x). The deviation between the network's prediction and the correct output in the dataset is quantified by a "cost" (or "loss") function. For any given input x, this cost function can be calculated, C(f_θ(x), f(x)), and eventually it is averaged over all samples in the dataset. The trainable parameters θ of the NN f_θ(x) have to be chosen so as to minimize this (sample-averaged) cost function.

The choice of cost function depends on the problem. Typical fields of application for neural networks are regression and classification tasks. For regression, the target function f(x) is a continuous function of the input x. Then, the so-called "mean-square error" is the canonical choice, which for n output neurons reads:

C_MSE(x) = Σ_{i=1}^{n} (f_{θ,i}(x) − f_i(x))²        (2)

On the other hand, in classification tasks, the input should be assigned to certain predefined classes (e.g. categories of images). This is solved by having each output neuron correspond to one of the n classes. Each neuron value (or "activation") can then be interpreted as the probability that the input is assigned to the corresponding class. In that situation, the typical cost function is the so-called "cross-entropy," a means to compare probability distributions.

In any case, neural-network training relies on the minimization of the cost function using gradient descent. However, evaluating the cost function averaged over all samples of the data set is infeasible. Rather, in each update step, the cost function is averaged over a batch of randomly selected inputs. In the simplest version of the resulting "stochastic gradient descent" scheme (see Fig. 2c), the parameters θ are updated in each step according to θ → θ − ηg, with the gradient

g = ∂⟨C⟩_batch / ∂θ        (3)

Here, ⟨·⟩_batch denotes the average over the random batch, and η is the "learning rate," controlling the update's size.
formation for each subsequent time step sequentially as A neural network has many parameters (even small
input but also keeps some memory about previous inputs. networks usually have thousands of parameters, the
One of the most advanced and commonly-used RNNs are largest published networks have hundreds of billions of
4

parameters). Therefore, it is crucial that there exists An MDP consists of an ”agent” (the controller) and
a highly efficient approach to calculating the gradient: an ”environment” (the world, or the system to be con-
the so-called ”backpropagation” scheme. As its name trolled), and both interact in multiple time steps. In
implies, this algorithm calculates the gradients layer by each time step t, the environment’s state st is observed.
layer, starting from the output layer. Remarkably, it is Solely based on this observation, the agent decides on
computationally not more demanding than the original its next action at , which will change the environment’s
evaluation of the neural network (also called the forward state. The agent’s behavior is defined by the ”policy”
propagation). A significant hardware advance for train- π(a|s), which denotes the probability of choosing the ac-
ing neural networks are GPUs (graphical processor units) tion a given the observation s. For each action at , the
that are optimized to perform highly-efficient manipula- agent receives a reward rt . For example, in a game, this
tions of large matrices. This efficiency is essential for the reward could be +1/ − 1 at the last time step, when the
success of ML in many applications. agent has won/lost the game, and otherwise 0 in all pre-
For more details on neural network architectures, back- vious steps. RL aims to maximize the cumulative reward
propagation, and gradient descent, the interested reader R (also called ”return”)
is referred to the many existing excellent introductions
T
[3, 9, 10, 25]. X
R= rt , (4)
t=1

D. Unsupervised Learning where T is the total number of time steps.


Three major branches of RL algorithms exist: pol-
Unsupervised learning approaches can learn the struc- icy gradient, Q-learning, and actor-critic methods. For
ture of unlabelled data sets on their own, for example, as policy-gradient methods, the agent directly sets the pol-
shown in Fig. 2d. Typical fields of application are fea- icy πθ (a|s). In deep RL, this agent is realized by a deep
ture learning, where the machine is asked to find a com- neural network with trainable parameters θ. To find the
pact representation of the data, and clustering, where optimal strategy, policy-gradient estimates the gradient
the computer has to sort on its own samples into classes of the average return hRi with respect to θ. Here, h·i
with similar properties. Generative models are also often denotes the average over all trajectories for the current
attributed to the field of unsupervised learning. Their policy. At first sight, it is unclear how to take the gra-
purpose is to stochastically create new samples that fol- dients through the reward without knowing the model.
low the same distribution as the previously observed data However, one can compute the gradients of the frequency
set (e.g. images of the same type, though never seen be- of a certain reward via the policy function πθ . Thus, the
fore). The simplest approaches to feature learning still gradient turns out to be:
rely on the techniques of supervised learning (namely, * T +
”self-supervised” learning in so-called ”autoencoders”). ∂hRi X ∂
However, clustering and generative models are typically g= = R ln πθ (at |st ) (5)
∂θ t=1
∂θ
implemented using distinct techniques (see, e.g. [3]).
It is important to note that R depends on the full trajec-
tory and the parameters θ (we suppressed these depen-
E. Reinforcement Learning dencies for brevity). Updates based on this equation will
increase (”reinforce”) the probabilities of actions that oc-
Some of the most important problems in artificial intel- cur predominantly in high-reward trajectories. Another
ligence and machine learning can be seen as the attempt approach to finding the optimal policy is so-called ”Q-
to find an optimal strategy to cope with a certain task, learning.” It employs the ”Q function” Qπ (s, a) that tries
where the optimal strategy is unknown (thus, supervised to estimate the quality of an action a: it is defined as
learning cannot be applied). This is the domain of Re- the average expected future return starting from a state
inforcement Learning (RL), which aims to discover the s and action a for a policy π. The optimal policy is
optimal action sequences in decision-making problems. then, by definition, to choose the action that maximizes
Its power has been famously illustrated in board games Q in a given state s. In practice, the Q function is ini-
like chess or Go [26] or video games[27], in all of which tially unknown. During training, an approximation to
RL is able to reach superhuman performance. In many this function is learned, often using a deep neural net-
applications, RL can find the optimal strategy without work to represent Q.
prior knowledge about the actual dynamics of the sys- Finally, the third group of RL algorithms are the so-
tem. When that is the case, we speak of ”model-free” called ”actor-critic” methods. These try to combine
RL. The goal in any RL task is encoded by choosing a the benefits of both policy-gradient and Q-learning ap-
suitable ”reward,” a quantity that measures how well the proaches. The basic idea behind actor-critic methods is
task has been solved. to estimate the expected reward given the current state,
The typical RL problem can be understood as a so- the so-called value function V . The success of any ac-
called ”Markov Decision Process” (MDP), see Fig. 2e. tion is then measured by comparing the resulting reward
5

against V . As might be expected, the value function is


represented by a neural network (the ”critic” network)
which is trained by SL to approximate the true value
function for the current policy.
For an introduction to the different RL algorithms, the
interested reader is referred to [28].

F. Automatic differentiation and gradient-based


optimization

Deep learning is efficient because of the backpropa-


gation technique that can efficiently calculate gradients
with respect to all the hundreds or millions of parame-
ters in a neural network. This technique more generally
leads to the concept of automatic differentiation. There,
the idea is to obtain the exact gradient with respect to
any variable appearing in any kind of numerical calcu- FIG. 3. State Estimation via Neural Networks. (a) Measure-
lation. As a numerical approach, this is distinct from ments on many identical copies of a quantum state can be
symbolic differentiation applied in computer algebra pro- processed to produce an estimate of the quantum state. (b)
grams. Modern frameworks used for neural networks of- A continuous weak measurement on a single quantum system
fer various modes of automatic differentiation. This of- can be used to update the estimated state. Both in (a) and
fers the chance to employ them for arbitrary gradient- (b), a single network is trained to estimate arbitrary states
based continuous optimization tasks, especially those in- correctly. (c) One can also train a network-based generative
volving many parameters, where efficiency is of concern. model to reproduce the statistics of a quantum state, i.e. to
sample from the probability distribution. Training requires
We will later show examples of how this can be used
many identical copies that can be measured, so the statistics
to discover new quantum experiments, quantum circuits, can be learned. Here, one network represents only a single
and in other contexts. quantum state. It can be extended to handle measurements
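As a small illustration (our own toy example, written here with JAX, although any framework offering automatic differentiation would serve), one can differentiate an arbitrary numerical computation — below, the infidelity of a rotated qubit state with respect to the rotation angle — and use the exact gradient for optimization:

```python
import jax
import jax.numpy as jnp

def infidelity(angle):
    # Any numerical computation can be differentiated. Here a qubit starting in |0>
    # is rotated about the y axis by `angle` and compared with the target state |1>.
    psi = jnp.stack([jnp.cos(angle / 2.0), jnp.sin(angle / 2.0)])
    target = jnp.array([0.0, 1.0])
    return 1.0 - jnp.dot(target, psi) ** 2

grad_fn = jax.grad(infidelity)   # exact gradient obtained by automatic differentiation

angle = 0.3
for _ in range(200):             # plain gradient descent on the single parameter
    angle = angle - 0.1 * grad_fn(angle)
print(angle)                     # approaches pi, i.e. a full flip to |1>
```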
G. How to get started

Only a few years ago, implementing neural networks still required quite some effort. Nowadays, libraries like TensorFlow, PyTorch, JAX, and many more offer powerful tools to set up and train neural networks. We have created an online collection of resources to start with ML that contains information on different popular frameworks, as well as helpful tutorials, lecture notes, and reviews for machine learning in general: https://github.com/ML4QTech/Collection. (We plan to update this collection continuously and are happy about contributions from the community.)

III. APPLICATIONS OF MACHINE LEARNING FOR QUANTUM TECHNOLOGIES

A. Measurement data analysis and quantum state representation

An important direct application of machine learning to quantum devices is the interpretation of measurement data. The applications range from an improved understanding of the measurement apparatus itself, to extracting high-level properties of the quantum system, to the full reconstruction of the measured quantum state. In many cases, this can be phrased as a supervised learning task. One example might be to extract an approximate description of a quantum state from a sequence of measurement results on identically prepared copies, see Fig. 3a. Provided the actual quantum state is known for each training example, the machine learning algorithm will learn to provide the best possible approximation to the quantum state. The same is true if the goal is to reconstruct certain properties of the state or the device instead of reconstructing the full quantum state. The choice of the cost function is essential, a simple example being the infidelity between the predicted and the actual state. Other choices will lead to slightly different optimal approximations predicted by the ML algorithm. Crucially, the algorithm can easily deal with distortions of the measurement data (such as extra technical noise), as it will learn the properties of these distortions and how to undo them.

FIG. 3. State Estimation via Neural Networks. (a) Measurements on many identical copies of a quantum state can be processed to produce an estimate of the quantum state. (b) A continuous weak measurement on a single quantum system can be used to update the estimated state. Both in (a) and (b), a single network is trained to estimate arbitrary states correctly. (c) One can also train a network-based generative model to reproduce the statistics of a quantum state, i.e. to sample from the probability distribution. Training requires many identical copies that can be measured, so the statistics can be learned. Here, one network represents only a single quantum state. It can be extended to handle measurements in arbitrary bases.

Interpreting Measurements – An interesting early example used machine learning to improve the readout fidelity of a qubit in a superconducting quantum device. There, the noisy measurement trace, obtained from a microwave signal passing through a readout resonator interacting with the qubit, can be used to deduce the qubit's logical state. However, classifying the qubit state is challenging. The authors of [29] use a basic machine learning technique called a support-vector machine (SVM) to perform clustering of measurement traces in an unsupervised fashion, outperforming classical clustering algorithms. The idea of a nonlinear SVM is to map data points to a higher-dimensional space and to find the best hyperplane for separating two classes of data points in that space. In this specific work, each measurement trajectory, which consists of hundreds of individual data points, is interpreted as a point in a high-dimensional space. The SVM's goal was to separate curves that originate from a zero-state from those that originate from a one-state. The readout fidelity was improved compared to non-ML clustering techniques. Furthermore, this analysis has shown that the main noise contribution comes from physical bit-flips (either heating or relaxation of the qubit). Without such events, the classification of the SVM becomes near-perfect. A similar approach has been demonstrated – using neural networks – to enhance the readout capability of trapped-ion qubits [30]. Here, the authors show that the readout fidelity improves significantly compared to a non-ML clustering method, especially when the effective amount of data per measurement increases. Similar techniques have also been applied to NV center quantum devices [31].
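The following sketch conveys the flavor of such trace classification on simulated data; it is a simplified, supervised stand-in of our own (using scikit-learn's SVC on labeled toy traces), whereas [29] worked with unsupervised clustering of real measurement records.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(4)

def simulate_trace(state, n_points=200):
    # Crude stand-in for a readout trace: the two qubit states differ only by a
    # small mean offset buried in Gaussian noise.
    mean = 0.3 if state == 1 else -0.3
    return mean + rng.normal(scale=2.0, size=n_points)

labels = rng.integers(0, 2, size=2000)
traces = np.array([simulate_trace(s) for s in labels])   # each trace = one point in R^200

X_train, X_test, y_train, y_test = train_test_split(traces, labels, test_size=0.25, random_state=0)

clf = SVC(kernel="rbf")          # nonlinear SVM separating the two classes of traces
clf.fit(X_train, y_train)
print("readout fidelity estimate:", clf.score(X_test, y_test))
```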
In a pioneering experimental work, a neural network reconstructed the quantum dynamics of a quantum system directly from measurement data [32]. There, the authors considered again a superconducting qubit coupled to a readout resonator, whose noisy measurement trace is fed as input into the network (together with the initial preparation state and the final measurement basis). The network's task was to predict the statistics of arbitrary measurements at some given time during the evolution, i.e. effectively predict the evolution of the density matrix given the measurement record, see Fig. 3b. The kind of network most suited to this task is a so-called recurrent network, i.e. a network able to process a time series (originally used for text or speech processing). The resulting fully trained network can map any measurement data to a quantum state.

Furthermore, machine learning techniques can be exploited to analyze the statistics of measurement outcomes in quantum experiments where the aim is to demonstrate the classical complexity of sampling from the quantum distribution. Boson sampling is the most well-known such scenario. In [33], unsupervised machine learning techniques (various clustering approaches) were employed to identify and rule out possible malfunction scenarios that would lead to a noticeably different distribution.

One exciting work from theoretical computer science connects the question of quantum state tomography with the theory of computational learning in the supervised setting [34]. Here, a learner uses the training dataset to produce a hypothesis about future measurements. Quantum state tomography, which requires a number of measurements that is exponential in the number of particles, can be seen as a learner that produces a hypothesis for every possible measurement on the quantum system. This might, however, not be necessary in most cases. The theory of probably approximately correct (PAC) models provides a hypothesis for every measurement close to the dataset. Surprisingly, it was found that this question can be solved with a linear scaling of measurements. This computational learning strategy was first demonstrated in quantum optics experiments with up to six photons. The experiment confirmed the scaling behavior of PAC, even in the presence of realistic experimental noise [35].

Another very interesting recent development shows that joint quantum measurements of several individual copies of a many-particle quantum state can lead to an exponential improvement over classical learning algorithms [36]. The authors show experimentally on a platform of 40 superconducting qubits that tasks such as predicting properties of the physical system can be significantly improved when the results of such joint measurements are fed as an input to a classical RNN. The result is particularly remarkable as it shows a clear advantage already for current, noisy, and not error-corrected quantum computers.

Many other interesting examples exist that use neural networks to analyze simulated or measured data of quantum systems. For example, it has been demonstrated that the Wigner negativity of a multimode quantum state can be approximated well even in the low-data regime [37], with important consequences for quantum technologies. Another vivid field is the neural-network-based detection of quantum phase transitions and classification of quantum phases in condensed matter physics. We will not go into detail here. Instead, we point to some exciting early works in this field [38–40], as well as some very modern applications that, for instance, use anomaly detection to find phase transitions in an unsupervised way [41]. Anomaly detection has already been applied to detect phases directly from experimental measurements [42]. See a recent review on this topic in [12].

Approximation of Quantum States – The direct application of numerical techniques to quantum devices requires, in many situations, the storage and processing of the system's full quantum state. As the quantum state grows exponentially with the number of particles, the memory requirements quickly become enormous, even for moderately large quantum systems. For example, storing the full quantum state of a 42-qubit system requires 35 TByte of memory. As demonstrated in the earliest quantum advantage experiments [43], this is directly related to the power of quantum computers and quantum simulators. However, it poses a significant problem for classical computational approaches that deal with large quantum systems and therefore for developing new large-scale quantum technologies.

To overcome this challenge, memory-efficient approximations of the quantum wave function are indispensable. Neural networks are one key candidate to approximate the quantum wave function. This approach has sometimes been called a Neural Quantum State (NQS).

A prominent approach tries to represent the quantum state in terms of a neural network [44]. This implies that for each new quantum state, another network will be trained, based on the associated measurement data for that state. In principle, that is considerably easier than asking a single network to be responsible for arbitrary states, i.e. the task considered above. As a consequence, much more complicated many-body states can be accessed. The whole approach can be seen as a neural-network-based version of quantum state tomography.

Several different ways exist to use a single neural network to represent a single quantum state. One straightforward approach, first introduced in [44] and then extended in subsequent works, employs a network that directly represents the wave function. Given a multi-particle configuration x as input, the network has to produce the wave function amplitude for that configuration as output: Ψ_θ(x).

Different structures can be used for the network, with a restricted Boltzmann machine (RBM) being a popular choice, since it also allows direct sampling from the probability distribution of observations [45]. In a traditional RBM, the aim is to learn to sample from some observed probability distribution, see Fig. 3(c). It consists of binary visible units and hidden units connected to each other, and the statistics of these units are sampled from a Boltzmann distribution with an energy E that contains interaction terms bilinear in the hidden (h) and visible (v) unit values: −E = Σ_j a_j v_j + Σ_k b_k h_k + Σ_{j,k} w_{jk} v_j h_k. During training, the coupling constants are updated to obtain the desired probability distribution of v (observed in samples provided during training). A simple physics example would be a 1D spin chain, whose configurations are identified as sample vectors v. More generally, other so-called generative deep learning methods (such as normalizing flows, variational auto-encoders, and generative adversarial networks) can be used to learn probability distributions, including those representing the statistics of observables in quantum states in a given basis.
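As an illustration of how compact such a representation is, the sketch below evaluates an (unnormalized) RBM-style wave-function amplitude for a spin configuration, obtained by summing the Boltzmann weight over hidden units taking values ±1. This is a toy version of our own with real-valued parameters (a complex-valued variant is needed for generic phases) and is not the implementation of [44, 45].

```python
import numpy as np

rng = np.random.default_rng(5)

n_visible, n_hidden = 6, 12                        # 6 spins, 12 hidden units
a = 0.01 * rng.normal(size=n_visible)              # visible biases a_j
b = 0.01 * rng.normal(size=n_hidden)               # hidden biases b_k
W = 0.01 * rng.normal(size=(n_visible, n_hidden))  # couplings w_jk

def amplitude(v):
    # Unnormalized amplitude for a spin configuration v in {-1, +1}^n_visible.
    # Summing exp(-E) over hidden units h_k = +-1 gives, in closed form,
    # Psi(v) = exp(sum_j a_j v_j) * prod_k 2 cosh(b_k + sum_j w_jk v_j).
    return np.exp(a @ v) * np.prod(2 * np.cosh(b + v @ W))

v = rng.choice([-1, 1], size=n_visible)
print(amplitude(v))
```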
Quantum state tomography using an RBM-style ansatz for the wave functions was introduced in [46]. Since one wants to keep the wave function's phase ϕ as well as the probability p, the ansatz is now of the type Ψ(x) = √(p(x)) e^{iϕ(x)}, where both p and ϕ are represented as networks and x corresponds to the visible units. A crucial idea in this approach is to match the probability distributions obtained from the experiment for observables in more than one basis (e.g. σ^z and σ^x etc. for qubits). The evaluation of different bases can be carried out via unitary transformations acting on the wave function Ψ that is expressed in a single reference basis.

In [47], this approach was applied to experimental data from snapshots of many-particle configurations taken in a Rydberg atom quantum simulator. The resulting network-based wave-function ansatz could then be used to reconstruct other expectation values and observables that were not directly accessed in the experiment.

The idea has been extended to mixed states [48–52]. In that way, NQS can be used to efficiently approximate open quantum systems, which are notoriously difficult to capture.

Approximating Quantum Dynamics – Once a suitable quantum state representation is available, it can be exploited to evolve the state in time. This enables potentially efficient simulation of quantum many-body time evolution, which is important for predicting and benchmarking the dynamics of quantum simulators and quantum computing platforms. For the general case of dissipative quantum many-body dynamics, i.e. the time evolution of mixed states, this has been explored in [50].

Rather than explicitly storing the entire quantum state, another technique shows how one can directly compute a quantum state's complex properties just from the state's construction rules, i.e. the quantum experimental circuit. In [53], the authors show how a recurrent neural network [24] can approximate the properties that emerge from quantum experiments without ever storing the intermediate quantum state directly. These systems could then directly be applied to complex quantum design tasks, a topic we cover in chapter III D. Another approach [54] also foregoes the representation of quantum states and instead trains a recurrent network to predict the evolution of observables under random external driving of a quantum many-body system (either based on simulated or possibly even experimental data). The trained network can then predict the evolution under arbitrary driving patterns (e.g. quenches). In a similar spirit, neural ordinary differential equations (neural ODEs) can be used to approximate the dynamics of quantum systems directly, again without storing the explicit information about the quantum wave function [55]. Interestingly, the approximation is of high enough quality that it is possible to rediscover some fundamental properties of quantum physics, such as the Heisenberg uncertainty relation.

Future Challenges and Opportunities – Improved data efficiency (both for the training and when applying ML to interpret the data) will be an important challenge for the future, especially when the devices scale to more complex quantum systems.

It will be interesting to co-discover the measurement strategy together with the data interpretation strategy. This might be particularly interesting if the AI algorithm is allowed to employ quantum measurements on numerous copies of the same state, as pioneered in [36].

When a neural network can find a suitable approximation for the computation of complex quantum systems, such as an NQS, it has learned a theoretical technique that might be useful for humans too. It will be interesting to learn how to extract the per-se inaccessible knowledge from the weights and biases of the neural network. One method is so-called symbolic regression.

The extension of NQS to complex quantum systems, such as higher dimensions and spins beyond qubits, will allow for more interesting applications.

FIG. 4. Machine Learning for Parameter Estimation in Quantum Devices. (a) A typical scenario, with the measurement result statistics depending both on some tuneable measurement setting and the unknown parameter(s), here represented as phase shifts in a Mach-Zehnder setup. (b) An adaptive measurement strategy can be illustrated as a tree, with branches on each level corresponding to different measurement outcomes. Depending on those outcomes, a certain next measurement setting (indicated as "α_j") needs to be selected. Finding the best strategy is a challenging task, as it corresponds to searching the space of all such trees. (c) Neural generative models can be used to randomly sample possible future measurement outcomes (here: 2D current-voltage maps as in [56]) that are compatible with previous measurement outcomes. This is helpful for selecting the optimal next measurement location. Different random locations in latent space result in different samples. (d) Measurement outcome vs. measurement setting for 5 possible underlying parameter values (different curves; measurement uncertainty indicated via thickness). We aim to maximize the information gain, i.e. choose the setting which best pinpoints the parameter (which is not equivalent to maximizing the uncertainty of the outcome).

B. Parameter estimation: learning the properties of quantum systems

In this section, we will discuss machine-learning-based approaches for estimating experimental parameters. These could either be system parameters for the calibration of quantum devices or external parameters estimated in the form of quantum metrology.

Quantum metrology – Machine learning can be helpful for various challenges in quantum metrology [57]. Quantum metrology deals with the resource-efficient measurement of external parameters acting on a quantum system, like magnetic fields or optical properties of a material sample. Broadly, the field can be separated into two different branches. First, non-adaptive approaches exploit complex quantum entanglement to reduce the required resources (for instance, the number of photons that interact with a sample) without changing the experimental setup or input states throughout the measurement sequence. The second class of approaches exploits feedback, meaning they employ adaptive strategies that change either the input or the measurement setting, depending on the previous measurement outcome. Naturally, adaptive strategies require more advanced experimental implementations, including fast switching or long-term stability of setups. For that reason, non-adaptive quantum metrology is so far much more explored in laboratories, while adaptive approaches are still at the stage of proof-of-principle experiments. AI has contributed to both approaches.

An example of the application of neural networks to the estimation of unknown parameters in a photonic experiment is provided in [58]. There the goal is to calibrate a device via neural-network training, such that it can later be used to estimate an unknown phase shift. The authors first accumulate a large set of calibration measurements using a controlled phase plate. In addition, the data is augmented to account for the statistical noise contribution. The neural network receives the measurement data (the number of detected photons) and has to estimate the corresponding phase. As soon as the neural network is able to model the connection between measurement data and calibration phases, it can be used to estimate unknown phases as well. This task shows that device calibration and the estimation of external parameters for quantum sensors are closely related.

In addition, still for non-adaptive approaches, various projects have explored the discovery of new experimental setups for the resource-efficient measurement of parameters of an external system [59, 60]. We will talk about this approach in section III D.

We now turn to feedback-based quantum metrology schemes and how they could be improved with ML algorithms. A pioneering non-ML approach for such a task is the BWB (Berry-Wiseman-Breslin) strategy, named after the authors of [61, 62]. In the original setting of that strategy, the authors consider a Mach-Zehnder interferometer with two detectors at the two outputs, see Fig. 4a. One of the arms contains an unknown phase shift, and the other arm contains a controllable phase that is modified depending on the previous measurement results. One can think of this decision-making problem as a decision tree (see Fig. 4b), where the actions must be chosen based on the measurement outcomes to maximize the information gain. In general, the expected information gain is defined as the reduction of entropy of the parameter distribution, averaged over possible measurement outcomes. The authors derived the feedback algorithm by applying Bayes' theorem to the distribution of the unknown phase, which is then used to choose subsequent measurement settings with a large information gain. This strategy has been found not only to be highly resource-efficient, but also practically implementable in the laboratory [63].
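The sketch below illustrates this kind of Bayesian adaptive loop in its most naive, greedy single-photon form (a toy version of our own, with arbitrary grid, detection model, and parameter choices — not the BWB protocol of [61, 62]): after each detection event, the phase distribution is updated with Bayes' theorem, and the next feedback phase is chosen to maximize the expected entropy reduction.

```python
import numpy as np

rng = np.random.default_rng(6)

phis = np.linspace(0.0, 2 * np.pi, 200, endpoint=False)   # grid for the unknown phase
posterior = np.ones_like(phis) / len(phis)                 # flat prior
alphas = np.linspace(0.0, 2 * np.pi, 50, endpoint=False)   # candidate feedback phases
true_phi = 1.234                                           # phase to be estimated

def p_click(phi, alpha):
    # Probability that the single photon exits through output port 0.
    return np.cos((phi - alpha) / 2.0) ** 2

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

for photon in range(30):
    # Greedy choice of the feedback phase: minimize the expected posterior entropy,
    # which is equivalent to maximizing the expected information gain.
    best_alpha, best_expected_H = None, np.inf
    for alpha in alphas:
        p0 = p_click(phis, alpha)
        m0 = np.clip(np.sum(posterior * p0), 1e-12, 1 - 1e-12)   # probability of outcome 0
        expected_H = (m0 * entropy(posterior * p0 / m0)
                      + (1 - m0) * entropy(posterior * (1 - p0) / (1 - m0)))
        if expected_H < best_expected_H:
            best_alpha, best_expected_H = alpha, expected_H

    # Perform the (simulated) measurement and update the distribution via Bayes' theorem.
    outcome0 = rng.random() < p_click(true_phi, best_alpha)
    likelihood = p_click(phis, best_alpha) if outcome0 else 1 - p_click(phis, best_alpha)
    posterior *= likelihood
    posterior /= posterior.sum()

print("phase estimate:", phis[np.argmax(posterior)])
```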
Interestingly, greedy adaptive strategies (i.e. choosing in every step the reference phase that yields the largest immediate information gain) do not necessarily lead to the largest information gain in the long run. To overcome this effect, one of the seminal early contributions to ML-based quantum metrology used particle swarm optimization of the feedback strategy [64]. In particle swarm optimization, a collection of different feedback strategies (each called a "particle") iteratively moves in the space of all possible strategies. In each iteration, a particle moves towards a combination of its best local optimum and the currently best global optimum known by the whole swarm. The experimental setting of [64, 65] is the same as for the BWB strategy – a Mach-Zehnder interferometer with an unknown phase in one arm and an adaptive phase in the other arm. Indeed, the swarm optimization algorithm finds (slightly) better strategies than the greedy Bayesian BWB approach. Interestingly, neither BWB nor swarm optimization can find the optimal strategy, which was identified for small photon numbers via an extensive computation of all possible strategies. Other early ML algorithms in this domain have applied evolutionary approaches to approximate the ideal feedback strategy [66, 67].

Discovering a strategy is a problem that can directly be formulated as a reinforcement learning task. An early application of RL to quantum parameter estimation was provided in [68], with frequency estimation of a qubit as a test case. In that work, the idea was to optimize the quantum Fisher information for the parameter of interest. This can be done by finding a sequence of suitable control pulses applied during the noisy evolution of the quantum probe. No feedback is involved in this simple setting, since the measurement itself is not part of the evolution controlled by RL.

However, the quantum Fisher information is only useful in cases where one is already fairly certain of the true parameter value. RL can be employed to study more complex situations, where updates are performed using the Bayes rule, starting from an arbitrary prior parameter distribution, and where the strategy is not greedy (i.e. more than a single step of the sequence is optimized). In [69], the authors provided information about the current Bayes distribution of the unknown parameter (as extracted from previous measurement results) and the previous measurement choices as input to an RL agent implemented by a neural network. It then has to suggest the next measurement. After the whole sequence of measurements, the agent is rewarded according to the total reduction in parameter variance. It was shown that this approach performs very competitively for an important test case, namely parameter estimation for a qubit of unknown frequency in the presence of dephasing.

Device calibration – Future large-scale quantum devices will consist of a large number of components with adjustable parameters that need to be characterized and tuned automatically. A complete characterization of the device via quantum process tomography quickly becomes impractical. To find the actual parameters or the ideal operating point of the quantum device, it is therefore necessary to extract the relevant data with a very limited number of measurements. This task can be formulated as a machine learning task, specifically applying "active learning" or "Bayesian optimal experimental design" [70], where the algorithm chooses the most informative measurement autonomously (see also [71] for a review of such methods in the context of quantum devices). Naturally, this is closely related to the estimation of external parameters in quantum metrology as discussed above.

We illustrate these techniques via a pioneering experimental application to quantum devices. This experiment [56] considered the calibration and measurement of a semiconductor quantum dot. Such a device can be tuned via applied gate voltages, and its resulting properties can be measured via a transport current. Here, the goal was to explore the properties of the quantum dot, as defined by its current-voltage map I(V1, V2), where the voltages include a bias voltage driving the current and a gate voltage deforming the dot's potential. In a naive approach, even if only two voltages were scanned with 100 discretization steps each, one would need to perform 10,000 measurements to get a suitably resolved device characteristic.

To reduce the required number of current measurements, the authors tried to estimate which measurement (in the 2D voltage space) would yield the "maximum amount of additional information." In practical terms, this would be the measurement that is expected to place the tightest constraints on the current-voltage maps that are still compatible with all the observed values of the current (observed in this and prior measurements). It is obvious how this setting translates to other quantum platforms, e.g. measurements of microwave transmission through superconducting circuits controlled via gate voltages and magnetic fields, or the optical response of tuneable atomic systems.

The first step towards this goal is to efficiently represent all current-voltage maps that might be observed, given the general physics of such a device, the assumed prior distribution of device parameters, and all previous measurement results. In general, this is the domain of "generative models," which can sample from a probability distribution that is learned. In the case of [56], the authors used such a generative model, in their case a so-called "constrained variational autoencoder" (cVAE), to randomly create realistic current-voltage maps that follow the probability distribution of the actual physical system, see Fig. 4c. Additional input into the generative model provides a constraint, in the form of a few initially existing measurement results, and guides the reconstruction to sample only maps compatible with those constraints. In each step, 100 different voltage maps are sampled. Those maps are used to find the next measurement point in voltage space that would lead to the maximum information gain, see Fig. 4d. With this technique, the total number of necessary measurements for the characterization of the device is reduced by a factor of 4. This clearly shows that the overhead of the deep-learning algorithm is more than compensated by its efficiency improvement compared to the naive approach. A benefit of this technique is that generating new samples with the cVAE is very efficient. Thus it can be scaled to much larger devices, where even more significant efficiency gains are expected.
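The logic of "sample candidate maps compatible with the data, then measure where they disagree most" can be conveyed in a few lines. In the toy sketch below (our own simplification: a 1D characteristic built from a small linear basis with Gaussian prior and noise, so that compatible maps can be sampled exactly instead of with a cVAE), each step samples 100 candidate curves and measures at the setting where their spread is largest.

```python
import numpy as np

rng = np.random.default_rng(7)

xs = np.linspace(0, 1, 100)                        # measurement settings (1D for brevity)
# Basis of smooth functions; the "device map" is assumed to be a combination of these.
basis = np.stack([np.sin(2 * np.pi * xs), np.cos(2 * np.pi * xs),
                  np.sin(4 * np.pi * xs), np.cos(4 * np.pi * xs)], axis=1)   # shape (100, 4)

true_map = basis @ rng.normal(size=4)              # the unknown characteristic to be explored
sigma = 0.05                                       # measurement noise

measured_idx, measured_val = [], []
for step in range(8):
    # Posterior over the basis coefficients given all measurements so far
    # (Gaussian prior and Gaussian noise, so the posterior is available in closed form).
    Phi = basis[measured_idx]
    cov = np.linalg.inv(np.eye(4) + Phi.T @ Phi / sigma**2)
    mean = cov @ Phi.T @ np.array(measured_val) / sigma**2 if measured_idx else np.zeros(4)
    # "Generative model": sample 100 candidate maps compatible with the previous data.
    samples = rng.multivariate_normal(mean, cov, size=100) @ basis.T
    # Measure next where the candidate maps disagree the most (largest expected information gain).
    next_idx = int(np.argmax(samples.std(axis=0)))
    measured_idx.append(next_idx)
    measured_val.append(true_map[next_idx] + sigma * rng.normal())

print("chosen settings:", xs[measured_idx])
```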
Another comparatively straightforward way to use machine learning in device characterization consists in training a network-based classifier to recognize "interesting" measurement results. This then allows tuning parameters until those results are obtained. Such an approach has been demonstrated in [72] for navigating charge-stability diagrams of multi-quantum-dot devices. In that setting, the algorithm's goal was to automatically tune the charge occupation of the double quantum dot. The task is reformulated as a classification task, where the algorithm recognizes individual charge transitions when presented with a charge-stability diagram. Since such a diagram constitutes an image, CNNs are a suitable choice for the task.

Quantum Hamiltonian Learning – Imagine the following parameter-estimation problem: one wants to estimate the parameters x0 that affect the evolution of a quantum state under a quantum many-body Hamiltonian H(x0) [73]. Unfortunately, even the task of computing the dynamics scales exponentially with the system size (number of qubits) when tackled using a classical machine. The idea of Quantum Hamiltonian Learning (QHL) is to enlist the help of a quantum simulator to overcome this problem. The parameters x0 can then be estimated with standard Bayesian methods. Thereby, the quantum simulator is used like a subroutine inside a classical ML approach. The first experimental implementation of this idea was demonstrated in 2017 [74]. In that work, the authors wanted to estimate the parameters of an electron spin in a nitrogen-vacancy center, and they used a quantum simulator on an integrated photonics platform to perform the QHL. Interestingly, not only did the approach lead to a high-quality estimation of the dynamic system parameters, but it also indicated when the initial Hamiltonian model had deficits. In these cases, the learning method informed the user that there are other dynamics in play that have not been considered, which inspires an improvement of the underlying Hamiltonian model.

While the QHL method indicates when the model Hamiltonian is not ideal, it cannot adapt it. To overcome this hurdle and to learn the entire Hamiltonian structure (not only its parameters), the authors of [75] have introduced the idea of a Quantum Model Learning Agent (QMLA). This agent not only finds the parameters of a predefined Hamiltonian, but discovers the whole Hamiltonian that describes the dynamics of a system. The approach iteratively refines the initial Hamiltonian and uses QHL as a subroutine for finding suitable parameter settings. This approach has also been demonstrated in a hybrid quantum system involving a nitrogen-vacancy center. The underlying learning mechanism is very general, and it thus could become a powerful tool for learning the dynamics of unknown quantum systems.

Future Challenges and Opportunities – For adaptive approaches, one needs to consider the trade-off between speed and the sophistication of the approach. In these tasks, the time between measurement and feedback is often very short, so the decision must be taken quickly.

An interesting future direction for advanced quantum metrology is to simultaneously co-design the experimental setup and the feedback strategy, rather than solving these tasks individually.

C. Discovering strategies for hardware-level quantum control

Challenges like quantum computing and quantum simulation are leading to rapidly increasing demands on the efficient and high-fidelity control of quantum systems. Tasks range from the preparation of complex quantum states and the synthesis of unitary gates via suitable control-field pulses all the way up to goals like feedback-based quantum state stabilization and continuously performed error correction. In trying to solve these tasks, the specific capabilities and restrictions of any hardware platform, from superconducting circuits to cavity quantum electrodynamics, need to be considered.

In this section, we will highlight specifically how reinforcement learning has come to help with many of these challenges. In the form of model-free RL, it promises to discover optimal strategies directly on an experiment, which can be treated as a black box, see Fig. 5a. All its unknowns and non-idealities will then be revealed only via its response to the externally imposed control drives. But even when used in a model-based way, using simulations, RL can be more flexible than simpler approaches. In particular, it offers ways to discover feedback strategies, i.e. strategies conditioned on measurement outcomes. These were not previously accessible to the usual numerical optimal control techniques.

The present section is firmly concerned with hardware-level control that is continuous in the time domain, discovering pulse shapes or feedback strategies based on time-continuous noisy measurement traces as they would emerge from weak measurements of quantum devices. There are some connections to the next section, but there we will be concerned with the discovery of protocols, control strategies, and whole experimental setups that are described on a higher level, composed of discrete building blocks like gates or experimental elements.
11

state fidelity). These direct optimization techniques in- an experiment whenever needed. In other words, there
clude gradient-based approaches, with GRAPE [76] and is no need for the agent to be running during the actual
the Krotov method [77, 78] the most prominent exam- experiment, which strongly relaxes requirements for the
ples, as well as approaches that do not rely on access hardware: no real-time control is necessary.
to gradients, such as CRAB [79]. At the time of writing, Quantum feedback control (closed-loop control)
these techniques still form the default toolbox for the case – The successful control of quantum systems subject to
of open-loop control, even while the first applications of noise, decay and decoherence requires either reservoir en-
machine learning (described below) are taking hold. Evo- gineering (autonomous feedback) or active feedback con-
lutionary algorithms define another class of (stochastic) trol. The space of active feedback strategies is expo-
approaches that have been used successfully to find op- nentially larger than that of open-loop control strategies
timal control sequences [80]. (i.e. without feedback), owing to the number of potential
State preparation is the most common quantum con- measurement outcome sequences growing exponentially
trol problem, and yet it can already be challenging, es- with time (each such sequence may require a different
pecially for multi-qubit settings. In probably the ear- response). It is here that it is almost inevitable to use
liest application of RL to quantum physics, pure-state the power of RL, particularly deep RL, with its ability
preparation with discrete control pulses was shown us- to process high-dimensional observations.
ing a version of Q-learning for a spin-1/2 system and a The first work to apply deep RL to feedback-based con-
three-level system [81]. A few years later, RL-based state trol of quantum systems was [88]. It employed discrete
preparation was demonstrated for a many-qubit system gates for quantum error correction, and we will discuss
[82], also using Q-learning and discrete bang-bang type it in sections III D and III E. State preparation and sta-
actions, with particular emphasis on analyzing the com- bilization in the presence of noise or an uncertain ini-
plexity of the control problem showing up in the form of a tial state are other natural candidates for RL feedback
glassy control landscape. Both of these works used some strategies. Examples include quantum state engineering
version of table-based Q-learning, which works well for a via feedback [89], as well as control of a quantum particle
restricted number of states and actions. The first work in an unstable potential [90] and a double-well potential
The first work to employ deep (i.e. neural-network-based) RL methods for open-loop control of quantum systems was [83], with both discrete and continuous controls and a recurrent network as an agent, applied to dynamical decoupling and again to state preparation, followed shortly afterwards by [84].

The RL approach can also be used successfully to find suitable pulse sequences for unitary gates and to optimize for the gate fidelity, as shown first in [85] and analyzed later also in [86].

Recently, deep RL has been applied for the first time to learn control strategies for a real quantum computing experiment [87]. The authors trained on a cloud-based quantum computing platform, collecting data for the current control policy, extracting rewards, and updating the policy. The goal was unitary gate synthesis, and the lack of real-time access to the device was not a concern since the task required only open-loop control. This first demonstration of RL-based quantum control on a real quantum experiment helped to illustrate the possibilities and challenges of this new approach.

Even though open-loop control pulse design means that the actual strategy in the experiment is not conditioned on any measurement outcomes, RL training for such tasks (when done on a computer simulation) may still benefit from the agent receiving input information like the current quantum state. Experience shows that this makes it easier to find a good strategy. Otherwise, only very sparse nominal information like the current time step and possibly the most recent selected action would be fed into the agent. In any case, however, once RL has found a control sequence, it could in principle be stored (e.g. as a waveform or pulse sequence) and sent to an experiment whenever needed. In other words, there is no need for the agent to be running during the actual experiment, which strongly relaxes the requirements on the hardware: no real-time control is necessary.

Quantum feedback control (closed-loop control) – The successful control of quantum systems subject to noise, decay and decoherence requires either reservoir engineering (autonomous feedback) or active feedback control. The space of active feedback strategies is exponentially larger than that of open-loop control strategies (i.e. without feedback), owing to the number of potential measurement outcome sequences growing exponentially with time (each such sequence may require a different response). It is here that it becomes almost inevitable to use the power of RL, particularly deep RL, with its ability to process high-dimensional observations.

The first work to apply deep RL to feedback-based control of quantum systems was [88]. It employed discrete gates for quantum error correction, and we will discuss it in sections III D and III E. State preparation and stabilization in the presence of noise or an uncertain initial state are other natural candidates for RL feedback strategies. Examples include quantum state engineering via feedback [89], as well as control of a quantum particle in an unstable potential [90] and a double-well potential [91]. In some quantum systems, control may be very limited (e.g. only linear manipulations), but measurements can introduce nonlinearity, and their exploitation through RL-based feedback strategies can enable powerful control, as shown in [92]. One challenge for model-free RL as applied to experiments is to make sure rewards can be extracted directly and reliably from experimental measurements, and to use a training procedure that really treats the quantum device as a black box (not relying, e.g., on simulations). These aspects were emphasized in [93], where state preparation in a cavity coupled to a qubit was analyzed.

Model-free vs model-based RL – Applying model-free RL techniques, as described above, has a great advantage: the experimental quantum device can be treated as a black box, and its inner parameters and distortions of the control and measurement signals need not be known a priori. However, this also means that part of the training effort is spent on effectively learning an implicit model of the quantum device, since that is the basis for a good control strategy.

In many situations relevant to modern quantum technologies, though, a good model is known, since the Hamiltonian and Lindblad dissipation terms have been carefully calibrated. This allows one to consider model-based techniques explicitly, see Fig. 5b. In principle, these can simply consist of applying model-free approaches to an RL environment that is represented by a simulation of the model. However, this is only useful if running the experiment often would be expensive or time-consuming. A more direct approach takes gradients directly through the model dynamics. In the absence of feedback, this is what well-known approaches like GRAPE offer.
In an interesting recent development, automatic differentiation (the cornerstone of deep-learning frameworks) has been used to easily get access to the gradients needed for model-based control optimization. This was first presented in [94] and then applied to various quantum control tasks, especially for qubit systems [95, 96], also employing neural networks to generate the control pulses [97, 98].
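To illustrate what "taking gradients through the model dynamics" means in its simplest form, the following toy sketch (ours, not the code of [94–98]) optimizes piecewise-constant sigma_x control amplitudes on a single qubit by gradient ascent on the state-transfer fidelity. For transparency the gradient is estimated by finite differences; GRAPE-style implementations use analytic gradients, and modern codes obtain them by automatic differentiation, but the structure of the optimization loop is the same. All parameter values below are arbitrary illustrative choices.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def step(H, dt):
    """Exact propagator exp(-i H dt) for a 2x2 Hermitian H."""
    evals, evecs = np.linalg.eigh(H)
    return evecs @ np.diag(np.exp(-1j * evals * dt)) @ evecs.conj().T

n_seg, dt = 20, 0.1
psi0 = np.array([1, 0], dtype=complex)     # start in |0>
target = np.array([0, 1], dtype=complex)   # transfer to |1>

def fidelity(amps):
    """Simulate piecewise-constant controls amps[k]*sigma_x on top of a sigma_z drift."""
    psi = psi0
    for a in amps:
        psi = step(sz + a * sx, dt) @ psi
    return abs(target.conj() @ psi) ** 2

# Gradient ascent on the control amplitudes; gradients via finite differences
# (a stand-in for the analytic GRAPE gradient or automatic differentiation).
rng = np.random.default_rng(0)
amps = rng.uniform(-1.0, 1.0, n_seg)       # random initial pulse shape
lr, delta = 1.0, 1e-6
for it in range(300):
    f0 = fidelity(amps)
    grad = np.zeros(n_seg)
    for k in range(n_seg):
        shifted = amps.copy()
        shifted[k] += delta
        grad[k] = (fidelity(shifted) - f0) / delta
    amps += lr * grad

print("final state-transfer fidelity:", round(fidelity(amps), 4))
```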
Until recently, it was unclear, however, how to extend these ideas naturally to situations with feedback. The reason is that the stochastic choice of measurement outcomes is not directly compatible with taking gradients through smooth dynamics, unless special care is taken. A first example, applied to feedback based on weak linear measurements, was provided in [99], using automatic differentiation. The continuous measurement outcomes in such situations can be written as a simple function of a given Gaussian noise process. Very recently, a fully general approach, termed 'feedback-GRAPE', was presented and analyzed in [100]. There, it was pointed out how the effect of discrete stochastic measurements can be properly considered in such a gradient-based setting. This enables model-based optimization for feedback involving arbitrarily strong discrete measurements, where the response to those outcomes can be represented via neural networks or trainable lookup tables.

FIG. 5. (a) The eventual goal of model-free reinforcement learning is the direct application to experiments, which then can be treated as a black box. Many actual implementations, however, use model-free RL techniques applied to model-based simulations. (b) Model-based reinforcement learning directly exploits the availability of a model, e.g. taking gradients through differentiable dynamics.

Future Challenges and Opportunities – As of the time of writing, experimental applications of reinforcement-learning-based quantum control are still in their infancy. Even for the easier, open-loop control case, one has to set up a full pipeline where control sequences are delivered to the setup and a suitable reward is extracted from experimental measurement data, before being processed (e.g. externally, in a PC, implementing the RL algorithm). In a quantum system, where measurements collapse the state, this often means the reward can only be obtained at the end of an experimental run, making it harder to guide training. By contrast, having an immediate reward after each time step would improve training success, because it assigns credit to the actions that immediately preceded a high reward.

The challenge becomes even larger when feedback control is called for. Then, we require an agent (a neural network) that can process in real time the incoming measurement signals and decide on the subsequent actions. Depending on the hardware platform, this imposes severe constraints: for superconducting qubits, the time afforded to one such evaluation may be on the order of only a few hundred nanoseconds. This kind of challenge is specific to real-time feedback and does not exist for any of the other machine learning applications to quantum devices.

Almost all of the applications of model-free RL techniques mentioned above are numerical and rely on some simulation of the quantum device (if only because realizing these approaches in an experiment is still technically very demanding). In many publications, access to the simulation is furthermore utilized to make learning easier, e.g. by feeding the current quantum state obtained from the simulation as an "observation" into the agent. Since that would not be available in a real experiment, one possible solution is the use of a stochastic master equation to deduce the quantum evolution of the state based on noisy measurement traces during the experimental run in real time (see e.g. [92] for comments on this). Another option is a "two-stage learning" procedure, where the successful RL agent, which still uses the state as input, is used for supervised training of another network that only uses measurement results as input. This new network can then be applied to an experiment [88]. However, both approaches require some calibration of experimental parameters, since the original training still relies on simulations.
D. Discovering quantum experiments, protocols, and circuits

Inventing a blueprint for a quantum experiment, circuit, or protocol requires ensuring that complex quantum phenomena play together to produce a quantum state or a quantum transformation from a limited set of basic building blocks. In this section we will discuss design questions that involve discrete building blocks, such as optical elements, superconducting circuit elements or quantum gates. While these systems also contain additional continuous parameters, the overarching discrete nature gives in some cases the possibility – as we will see – to understand and learn from the solutions.

Discovery of Quantum Experiments – An important and natural playground for discovery based on discrete building blocks is the invention of new experimental setups.

In the field of quantum optics, those building blocks contain lasers, nonlinear crystals, beam splitters, holograms, photon detectors, and the like. Put together in a specific way, those systems can lead to the generation of complex quantum entanglement or perform quantum teleportation – but exchange one single element, and the setup produces something entirely different and most likely not useful. Conventionally, experienced and creative scientists use their intuition and insights to design new quantum experiments. They translate an abstract task into a concrete layout that can be built in an experimental laboratory. However, human researchers are struggling to find suitable experimental setups for more complex quantum states and transformations.

This challenge was met recently with the introduction of automated discovery of quantum experiments from scratch [102, 103]. The general idea is that an algorithm combines building blocks from a toolbox to produce a suitable experimental setup while optimizing for certain desired characteristics. In the setting of [102, 103], the algorithm (here called Melvin) puts optical elements from a toolbox onto a virtual optical table. The toolbox consists of the experimental components available in the laboratory (such as lasers and crystals). The algorithm initially starts to put elements from the toolbox in random order on the table. If the candidate setup satisfies all sanity checks, a simulator computes the full experimental output. If the setup produces the desired quantum state, it is automatically simplified and reported to the user. In addition, the setup is then stored as a new part of the toolbox. In that way, over time, the algorithm learns useful macro components which it can use in subsequent iterations. Thereby, it can already access useful operations that significantly speed up the discovery process.

This algorithm has produced experimental blueprints that enabled the observation of numerous new quantum phenomena in laboratories [103]. Furthermore, new concepts and ideas have been discovered, understood, and generalized from some of the surprising solutions of the algorithm, such as an entirely new way to realize multi-photon interactions [104].

Sequentially building an experimental setup can be formulated as a reinforcement learning problem. This possibility has been explored in [105]. The approach led to the rediscovery of several experimental setups and to the automated simplification of experimental setups (which before was done using hand-crafted algorithms).

An alternative approach, called Theseus, which is orders of magnitude faster than the previous techniques, is based on a new abstract representation of quantum experiments [101]. Here, quantum experimental setups are translated into a graph-based representation, see Fig. 6. Any quantum optical experiment that can be built in the laboratory can be represented with a colored weighted graph, which translates an in principle infinite search space into a finite space with continuous (thus differentiable) parameters. In that way, the question of finding a certain quantum setup can be directly translated into discovering a graph with certain properties. As the parameters are continuous, highly efficient gradient-based optimization algorithms can be used to find the solutions. In addition, the graph's topology is eventually simplified, such that the human researcher is not only presented with a solution, but can immediately understand why and how the solution works. The algorithm has been used to answer numerous open questions and has led to new concepts. It showcases that – when a simulator (i.e. a model) is available and no feedback needs to be taken into account – gradient-descent optimization on a large abstract representation often outperforms approaches such as genetic algorithms or reinforcement learning.

FIG. 6. Discovery of Quantum Experiments. Quantum optics experiments can be represented by colored graphs. Using the most general, complete graph as a starting representation, the AI's goal is to extract the conceptual core of the solution, which can then be understood by human scientists. The solution can then be translated to numerous different experimental configurations [101].

A different approach, denoted Tachikoma, aims at the discovery of new experimental setups for quantum metrology [59, 60], using an evolutionary learning approach. The goal of Tachikoma is to find setups for quantum states that can measure phase shifts efficiently and with high precision. It uses a toolbox of optical elements from which it builds up a pool of candidate solutions. The next generation is produced from the best-performing parent setups. Those are merged and mutated to create the new pool of candidates. In that way, the population improves its performance over time and leads to numerous counter-intuitive and exotic solutions. One of the computationally expensive operations and bottlenecks is the computation of the fidelity of the quantum state. For that, the authors have extended the approach by using a neural network that can classify the quantum states from the setup. This combination of an evolutionary algorithm with a neural network has discovered experimental blueprints with yet unachieved quantum metrology advantage. While much effort has been put into constraining the system to realistic solutions, it remains to be seen whether these experiments can be built in the laboratory and achieve the expected quality.
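The evolutionary loop underlying Tachikoma (and, in spirit, Melvin's toolbox search) can be caricatured in a few lines. In the sketch below – our own illustration – the expensive photonic simulator and fidelity evaluation are replaced by a stand-in scoring function on sequences of element labels; everything else (selection of the best-performing parents, merging, mutation) follows the scheme described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "toolbox": each candidate setup is a fixed-length sequence of element ids.
n_elements, length, pop_size, n_generations = 8, 10, 60, 40

# Stand-in for the physics simulator: in the real algorithms this would compute
# the quantum state produced by the setup and its fidelity / metrology gain.
secret = rng.integers(n_elements, size=length)   # hypothetical "optimal" setup
def score(setup):
    return np.mean(setup == secret)

def mutate(setup):
    child = setup.copy()
    child[rng.integers(length)] = rng.integers(n_elements)   # swap one element
    return child

def crossover(a, b):
    cut = rng.integers(1, length)                            # merge two parents
    return np.concatenate([a[:cut], b[cut:]])

population = [rng.integers(n_elements, size=length) for _ in range(pop_size)]
for gen in range(n_generations):
    ranked = sorted(population, key=score, reverse=True)
    parents = ranked[: pop_size // 4]            # keep the best-performing setups
    population = list(parents)
    while len(population) < pop_size:            # refill pool by merging + mutating
        i, j = rng.choice(len(parents), size=2, replace=False)
        population.append(mutate(crossover(parents[i], parents[j])))

best = max(population, key=score)
print("best score after evolution:", round(float(score(best)), 3))
```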
An entirely different approach uses logical artificial intelligence for designing quantum optical experiments [106]. While the credo of the deep learning community is to build large neural networks that can solve arbitrary tasks given enough training examples and compute power, this is not the only way towards "intelligent" algorithms. An alternative is logic AI [107]. Here, the idea is to translate arbitrary problems into Boolean satisfiability expressions and solve them with powerful SAT solvers. In [106], the question of designing quantum experiments has been rephrased into logical expressions and solved with MiniSAT. It is shown that in some problems, a combination of Theseus and the logical approach is faster than the continuous optimization itself. The reason is that the unsatisfiability of candidate solutions is detected quickly with a logical approach, thus guiding the continuous optimizer towards more promising candidates. This approach is in its infancy. Given that the field of logic AI is growing fast due to computational and algorithmic advances, we expect a large increase in interest in this topic.
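To make the translation into Boolean satisfiability concrete, here is a deliberately small sketch with invented toy constraints; the encodings used in [106] are far richer, and a dedicated solver such as MiniSAT replaces the brute-force enumeration below.

```python
from itertools import product

# Toy illustration of the logic-AI route (hypothetical constraints, not those of [106]):
# Boolean variable i means "building block i appears in the candidate setup".
# Clauses are written in DIMACS-like form: a positive integer i stands for x_i,
# a negative integer -i for "not x_i"; each clause is a disjunction of its literals.
clauses = [
    [1, 2],        # at least one of two pump configurations is present
    [-3, -4],      # two incompatible elements must not appear together
    [-1, 3],       # element 1 requires element 3
    [-2, 4],       # element 2 requires element 4
]
n_vars = 4

def satisfied(assignment, clause):
    """assignment maps variable index -> bool; a clause holds if any literal is true."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

solutions = []
for bits in product([False, True], repeat=n_vars):   # brute force; SAT solvers do this cleverly
    assignment = {i + 1: b for i, b in enumerate(bits)}
    if all(satisfied(assignment, c) for c in clauses):
        solutions.append(assignment)

print(len(solutions), "satisfying designs, e.g.:", solutions[0] if solutions else None)
```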
Deep generative models such as variational autoencoders became a standard tool in fields such as material design [108]. Here, an encoder network transforms a (potentially discrete) representation into a continuous latent space. The decoder network is trained to take a point in the latent space and translate it back to the discrete structure. The encoder and decoder together are trained to perform an identity transformation, which by itself is not that interesting. However, as an exciting side effect, the system builds up an internal, continuous latent space that can be shaped during the training and used for gradient-based optimization. For the first time, such a system was demonstrated for quantum optics in [109]. The work focuses on understanding what the neural networks have learned and how they store the information in their internal latent space. The structure of the latent space shows surprising discrete features that were then identified with concrete properties of the experimental setups. It will be interesting to see more advanced ways to investigate, navigate and understand the high-dimensional internal representations of neural networks that are built autonomously during training.

A conceptually related task is the design of superconducting circuits. The quantum behavior of superconducting circuits is defined by a network of inductances, capacitances, and Josephson junctions. As with quantum optical experiments, those systems are conventionally designed by experienced human researchers who aim to find suitable configurations for complex quantum transformations, such as coupling between two well-defined qubits in quantum computers. The search space of possible structures grows exponentially with the number of elements, and thus it quickly becomes infeasible for humans to find solutions for complex tasks. In [110], the authors addressed the question of designing superconducting circuits for the first time with a fully automated closed-loop optimization approach and designed a 4-local coupler by which four superconducting flux qubits interact. The algorithm SCILLA starts with a discrete circuit topology. The best candidates are further parametrically optimized, either with a direct gradient-based optimization or with an evolutionary approach (to avoid local minima). The final design outperforms the only other (hand-crafted) 4-local coupler in terms of noise resilience and coupling strength.

Discovering Quantum Protocols and Discrete Feedback Strategies – Discrete building blocks occur not only naturally in the construction of experiments, but also as part of discrete temporal sequences, specifically sequences of quantum gates and other operations. These sequences can represent quantum protocols or higher-level control strategies for quantum devices. The hardware-level control discussed in the previous section III C could then be considered as a tool to implement the individual building blocks (e.g. an individual gate). In the following we will deal with protocols that also contain elements of feedback or other actions that are not merely unitary gates. The task of quantum circuit synthesis (building up unitaries out of elements) will be discussed further below.

Reinforcement learning is one suitable tool for the automated discovery of such sequences. This was first analyzed in [88], using deep RL, where the goal was to discover a strategy for quantum error correction in a quantum memory register made of a few qubits. This involves applying discrete unitary gates, which are conditioned on the outcomes of measurements, i.e. 'real-time' feedback executed during the control sequence, Fig. 7a. Since the aim of [88] was primarily to find quantum error correction strategies, we will discuss some more aspects separately in the upcoming section III E.

A reinforcement learning technique was subsequently also applied to the rediscovery of implementations for quantum communication protocols [111]. There, the authors set up the task as an RL problem and explain the similarity of quantum communication and RL with the following intuition: A quantum communication protocol is a sequence of operations that leads to the desired outcome. Similarly, an RL agent learns a policy, that is, to perform sequences of operations that maximize a reward function. The authors task the RL agent to rediscover several important quantum communication schemes such as quantum teleportation, quantum state purification, or entanglement swapping. Each of these tasks can be written as a simple network, where the nodes stand for the involved parties and edges indicate classical or quantum correlations between them. Let us take the quantum teleportation protocol as an example (the others follow similar ideas). The environment is a three-node network (the incoming unknown quantum state A, the sender B, and the receiver C). The environment starts with pre-shared entanglement between B and C. The agent now has to find a correct sequence of local measurements and classical communication steps that teleports the quantum state from A to C. After performing up to 50 operations, the transformations are evaluated, and the agent
gets a (binary) reward for whether it succeeded or not. Over 100k trials, the agent finds with high probability an action-efficient strategy to perform the task. As there is no feedback from the environment and a model of the system exists, RL agents are not the only option to solve these questions, and direct gradient-based methods can be used for discovering quantum protocols.

FIG. 7. Discovery of Quantum Circuits and Feedback Strategies with Discrete Gates. (a) A reinforcement-learning agent acts on a multi-qubit system by selecting gates, potentially conditioned on measurement outcomes, finding an optimized quantum circuit or quantum feedback strategy. (b) A fixed-layout quantum circuit with adjustable parameters that can be optimized via gradient ascent to achieve some goal like state preparation or variational ground state search (possibly including feedback).

Quantum Circuits – The design of quantum circuits has some relation to the design of quantum experiments. In both situations, a discrete set of parametrized elements is carefully connected to form the topology of the circuit or experiment, while the continuous parameters of the elements (such as phases) are optimized. In contrast to quantum circuits that can use a universal gate set, for experimental design or the quantum protocols considered above it is not clear in the beginning whether certain targets can be reached with the available resources.

Quantum circuit design problems broadly fall into two branches: First, quantum circuit synthesis (QCS), sometimes called quantum circuit compilation, addresses the problem of how to build from scratch a circuit that performs a specific task. Second, quantum circuit optimization (QCO) aims at turning a given circuit into a simplified, logically equivalent circuit.

The problem of quantum circuit synthesis is translating an algorithm into elementary gates from a finite universal set. There are two situations – the fully discrete case, where gates are fixed, and the case where gates can be tuned via continuous parameters (e.g. rotation angles). The first application of machine-learning techniques for the de-novo generation of quantum circuits used genetic algorithms [112]. The algorithm had access to a set of single- and two-qubit gates and was tasked to rediscover the quantum circuit for quantum teleportation. It indeed found the correct circuit with significantly fewer evaluations than an exhaustive search and could also present different solutions not discussed in the literature before. Genetic algorithms are a powerful tool for discrete discovery. Thus, related approaches are still being used two decades later [113, 114].

The problem of designing a quantum circuit from an unparameterized set of elementary gates can be formulated as a reinforcement learning problem. In [115], the authors introduce a deep reinforcement learning algorithm that translates an arbitrary single-qubit gate into a sequence of elementary gates from a finite universal set for a topological quantum computer. The authors apply their algorithms to Fibonacci anyons and discover high-quality braiding sequences. Similar machine learning techniques have been applied to the quantum circuits for gate-based quantum computers [116]. The design of unparameterized quantum circuits has an important application for fault-tolerant quantum computation, where only a finite list of unparameterized gates can be applied to the logical qubits.
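For the fully discrete case, the brute-force baseline that learning-based synthesis methods aim to improve upon can be written down directly: enumerate sequences over a finite universal gate set (here H and T) and keep the sequence whose product is closest to the target unitary. The sketch below is our own illustration, not the braiding-sequence algorithm of [115]; RL and related methods become essential once good approximations require sequences far too long for exhaustive enumeration.

```python
from itertools import product
import numpy as np

# Finite universal gate set for a single qubit: Hadamard and T.
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
T = np.array([[1, 0], [0, np.exp(1j * np.pi / 4)]], dtype=complex)
gates = {"H": H, "T": T}

def distance(U, V):
    """Distance between single-qubit unitaries, ignoring global phase (0 = identical)."""
    return 1 - abs(np.trace(U.conj().T @ V)) / 2

# Arbitrary target gate (a generic x-rotation), chosen just for illustration.
theta = 0.7
target = np.array([[np.cos(theta / 2), -1j * np.sin(theta / 2)],
                   [-1j * np.sin(theta / 2), np.cos(theta / 2)]])

best_seq, best_dist = None, np.inf
max_len = 12
for length in range(1, max_len + 1):
    for seq in product("HT", repeat=length):      # exhaustive enumeration up to max_len
        U = np.eye(2, dtype=complex)
        for name in seq:
            U = gates[name] @ U
        d = distance(target, U)
        if d < best_dist:
            best_seq, best_dist = "".join(seq), d

print("best sequence:", best_seq, " distance to target:", round(float(best_dist), 4))
```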
In general, quantum circuits can have tunable parameters – for instance, parameterized X-gates. In that case, the problem has both a discrete and a continuous element. An important application for this task is the hardware-aware design of circuits [117, 118]. Here, the algorithms consider the circuit's connectivity on the one hand and noise contributions on the other. The main goal is to find shallow circuits that are more noise-resistant than other (textbook) implementations of circuits. The authors of [117, 118] present an algorithm that can significantly outperform textbook solutions for various state generation tasks in terms of fidelity under realistic noise conditions. The design of discrete circuits with continuous parameters has also been approached with deep reinforcement learning [119].

A crucial and heavily investigated topic in the area of quantum circuit design is the parametric optimization of a constant circuit topology. This task is essential for hybrid quantum-classical variational quantum algorithms (VQA) that can run on near-term quantum computing hardware [17, 18], as well as for quantum machine learning, where such parameterized circuits are used as quantum neural networks [15]. We will not go into the details of these topics, but refer the reader to the excellent reviews on this subject.

We will mention only a few selected but important results and ideas for completeness. First, it has been discovered that the gradients in a randomly initialized parametrized quantum circuit vanish exponentially with the number of qubits [120], which has become one of the main challenges in the field of quantum machine learning and VQA, Fig. 7b. An important question then is how an expressive initial state of the circuit (ansatz) allows for the efficient machine-learning-based optimization of the circuits [121]. Some exciting approaches involve reinforcement learning that explores economic and expressive initial ansätze [122] or ideas that are inspired by neural network architecture search [123].
Besides the direct gradient-based optimization of parametrized quantum circuits, different approaches try to avoid the problem of vanishing gradients by employing reinforcement learning [124], using ML-based prediction of suitable initial parameters (rather than optimizing the parameters directly) [125], or advanced gradient-free approaches that are naturally not susceptible to the barren plateau problem [126].

An interesting recent application of VQA-based systems is the quantum-computer-aided design of quantum hardware [127–129]. As described at the beginning of this chapter, the AI-based design of new quantum hardware on a classical computer has the problem of memory requirements increasing exponentially with the system size. One way to overcome this problem is to outsource the computation of the expensive quantum system to a quantum computer. Here, the problem of designing new multi-qubit couplers for superconducting quantum computers or the design of new quantum optics hardware can be rephrased as a VQA-style problem. A classical AI algorithm changes the parameters of a parameterized quantum circuit to minimize a fidelity function computed from the outcome of the quantum computation. After convergence, a mapping translates the final parametrized quantum circuit into the specific quantum hardware. This approach has been experimentally demonstrated in a proof-of-principle three-qubit superconducting circuit [129].

In general, it is not guaranteed that a direct compilation of an algorithm already yields the most efficient implementation of a quantum circuit. A powerful classical method to simplify (compile) quantum circuits is the ZX formalism [130], which reformulates the circuit into a graph, where predefined rules identify simplifications. However, this and similar approaches have been formulated in a hardware-independent way, operating on a global level. Alternatively, this problem can be approached by RL algorithms [131] that can autonomously simplify circuits, for example, in terms of circuit depth or gate counts, and this enables easily taking into account concrete hardware constraints. In [131] this approach was developed and found superior to simulated annealing (tested for circuits of up to 50 qubits). It has the potential to become an important tool for simplifying quantum circuits in the future.

Future Challenges and Opportunities – The simulation of quantum experiments becomes expensive as soon as the system grows in size. Neural networks could autonomously find approximate predictions for the dynamics of the quantum system. Such supervised systems need a lot of training examples. Thus the trade-off between the creation of training examples and the computational benefit of an approximation needs to be investigated.

The design of new experiments or hardware can be seen not only as optimization (in the sense of making an existing structure better) but also as discovery, in which we create new ideas that did not exist before, as shown in [101]. This point of view shows how machines can creatively contribute to science and act as an inspiration for human scientists. There is great potential for expanding these ideas. Automatic extraction of understandable building blocks ('subroutines') can help with this challenge.

E. Quantum Error Correction

The ability to correct errors in a quantum computing device will be indispensable to realizing beneficial applications of quantum computation, since real-world devices are not coherent enough to run an error-free calculation. The basic conceptual ideas in this domain are known since the pioneering work of Shor [132] and subsequent developments, most notably the surface code. In any case, the idea is to encode logical qubit information in many physical qubits robustly and redundantly. The presence of errors (like qubit dephasing and decay) must be detected via measurement of so-called syndromes, i.e. suitably chosen observables (often multi-qubit operators). Finally, a good way to interpret the observed syndromes and apply some error correction procedure must be found. Despite the knowledge of good encodings and suitable syndromes, it remains a challenging problem how to best implement those in practice, for a given quantum device, with its available gate set and topology of connections between the qubits, and how to optimize them for a given noise model.

It has been recognized early on that machine learning methods could potentially be of great help in this domain. The tasks can naturally be divided into three categories.

Syndrome interpretation – On the simplest level, we already assume an existing encoding and a fixed set of syndromes. The task then is to find the optimal way to interpret the observed syndrome, e.g. deciding which qubits are likely erroneous and must be corrected, see Fig. 8. This can be phrased as a supervised learning problem, where some errors are simulated, the syndrome is fed into a network, and the network must announce the location of the errors. In practice, the surface code is the most promising QEC architecture, and deducing the error from the syndrome is not trivial, though non-ML algorithms exist. Multiple works therefore trained neural networks to yield "neural decoders" [133–136]. In one early example [133], a modified, restricted Boltzmann machine was used, with two types of visible units, corresponding to syndrome and underlying error configuration. This was then trained on a data set of such pairs. Afterward, the machine could be used to sample the errors compatible with an observed syndrome. It is also possible to use reinforcement learning to discover better strategies in more complicated situations. In [137], this was applied to the surface code, exemplified in a situation with faulty syndrome measurements.
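The supervised formulation of syndrome interpretation can be illustrated on the smallest possible example. The sketch below (a toy of ours, not one of the decoders of [133–136]) trains a single softmax layer to map the two parity checks of the three-qubit bit-flip repetition code to the most likely correction; for the surface code the same input-output structure applies, but the mapping is nontrivial and deep networks become worthwhile.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.1   # physical bit-flip probability per data qubit (arbitrary illustrative value)

def sample(n):
    """Simulate bit-flip errors on the 3-qubit repetition code and return
    (syndromes, index of the most plausible single-qubit correction)."""
    errors = (rng.random((n, 3)) < p).astype(int)
    # parity checks Z1Z2 and Z2Z3 play the role of the syndrome measurements
    syndromes = np.stack([errors[:, 0] ^ errors[:, 1],
                          errors[:, 1] ^ errors[:, 2]], axis=1)
    # label: which single qubit flipped (3 = no flip); rare multi-flip events
    # show up as label noise, just as they would in real training data
    labels = np.where(errors.sum(axis=1) == 1, errors.argmax(axis=1), 3)
    return syndromes.astype(float), labels

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# A minimal "neural decoder": one softmax layer mapping syndrome -> correction.
W = 0.01 * rng.standard_normal((2, 4))
b = np.zeros(4)
X, y = sample(20000)
for epoch in range(300):
    probs = softmax(X @ W + b)
    probs[np.arange(len(y)), y] -= 1.0               # gradient of the cross-entropy loss
    W -= 1.0 * (X.T @ probs) / len(y)
    b -= 1.0 * probs.mean(axis=0)

Xt, yt = sample(5000)
pred = np.argmax(softmax(Xt @ W + b), axis=1)
print("decoder accuracy on simulated syndromes:", round(float((pred == yt).mean()), 3))
```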
Code search – Going one step further, the question arises whether a machine can also find better codes. It is helpful to take an existing code and modify it. The surface code is usually formulated on a simple square lattice, but it can be implemented on more complicated geometries, both periodic and, even more generally, aperiodic. In [138], a reinforcement learning agent was asked to optimize the connectivity of a surface code given a number of data qubits. It was able to find the best-performing code in scenarios important for real experiments. These included biased noise (not all error channels equally strong) or spatially localized noise (higher error rates in the vicinity of some qubits). The agent found interesting nontrivial connectivities as optimal solutions. Due to the availability of highly efficient simulation tools for establishing the performance of surface codes, the authors of [138] were able to go up to 70 data qubits.

Autonomous quantum error correction consists of an experimental configuration that can intrinsically correct certain types of errors without active feedback. This idea can be implemented by carefully introducing additional drives and dissipation. The discovery of such mechanisms in real physical systems, under strict experimental constraints, is highly nontrivial. The authors of [139] show the automated discovery of an autonomous QEC scheme that could be applied to bosonic systems. The goal is to find an encoding of a logical qubit that is robust under the dynamics of the system. The algorithm, denoted AutoQEC, can then discover such an encoding by maximizing the average fidelity of the logical qubit. AutoQEC is further constrained to consider only systems within experimental capabilities. Indeed, the authors discover a new quantum code, denoted the √3-code, that has a longer lifetime than previously studied systems with the same concrete experimental constraints. The authors go on and, inspired by their numerical discovery, derive the analytical logical state and analyze the new autonomous QEC system further.

Full QEC protocol discovery – Finally, one can adopt the attitude that neither the code itself, the code family, nor any other ingredients are assumed. In that case, one starts from scratch, and the goal of the machine is to discover ways to preserve the quantum information with high fidelity for as long as possible. In other words, it (re-)discovers all aspects of QEC and error mitigation, adapted to the given platform and noise model. Such an ab-initio approach was demonstrated in [88], which we already mentioned above in the context of RL for feedback. Given a few qubits with arbitrary connectivity and gate set, as well as an arbitrary noise model, the agent is asked to preserve the quantum information as long as possible. To solve this challenging task, additional generally applicable insights were required, e.g. introducing a reward that can measure the amount of surviving quantum information without having discovered a proper decoding sequence. Beyond approaches that fall in the family of stabilizer codes, the same agent also discovered noise mitigation techniques based on adaptive measurements. The advantage of such an ab-initio approach is its flexibility, but the price to pay is that so far, it only works for a handful of qubits due to the effort required in simulations. A future challenge would be finding ways to make the RL work directly on the experiment.

FIG. 8. Quantum Error Correction. Syndrome interpretation in a surface code as a task that a neural network can be trained to perform.

Future Challenges and Opportunities – Regarding syndrome measurements, an important challenge for the future is ensuring that the neural networks interpreting those measurement results can be deployed in an actual device at sufficient speed: even for the classical algorithms, this is nontrivial. Another challenge is that ab-initio discovery of quantum error correction strategies still relies on simulations, whose numerical effort scales exponentially in the number of qubits (for general quantum dynamics). An interesting direction could also be the co-discovery of autonomous QEC experiments together with QEC feedback strategies in these systems.
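As a reminder of what such noise-model simulations look like in their most stripped-down form, the following purely classical toy (our own illustration) estimates by Monte Carlo how a three-bit repetition code with majority-vote correction compares to an unprotected bit. It is simulated environments of roughly this flavor – scaled up to genuinely quantum dynamics – in which the decoders and RL agents discussed above are trained before any contact with an experiment.

```python
import numpy as np

rng = np.random.default_rng(0)

def repetition_code_failure(p, n_rounds=10, n_trials=10000):
    """Monte-Carlo estimate of the logical failure probability of the classical
    3-bit repetition code under independent bit flips, with majority-vote
    correction applied after every round."""
    failures = 0
    for _ in range(n_trials):
        bits = np.zeros(3, dtype=int)                 # logical 0 encoded as 000
        for _ in range(n_rounds):
            bits ^= (rng.random(3) < p).astype(int)   # bit-flip error channel
            bits[:] = int(bits.sum() >= 2)            # majority-vote correction
        failures += int(bits[0] != 0)
    return failures / n_trials

def unprotected_failure(p, n_rounds=10, n_trials=10000):
    """The same noise acting on a single, unencoded bit."""
    flips = (rng.random((n_trials, n_rounds)) < p).sum(axis=1) % 2
    return flips.mean()

for p in [0.01, 0.05, 0.1]:
    print(f"p = {p:5}:  encoded {repetition_code_failure(p):.4f}"
          f"   unencoded {unprotected_failure(p):.4f}")
```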
IV. OUTLOOK

With all these promising ideas in mind, let us look forward to the year 2035: how do we imagine machine learning to contribute to quantum technologies by that time?

Fully controlled and error-corrected quantum systems – As quantum platforms scale towards ever greater numbers of components and connections, machine learning will provide a way to harness this complexity – by automatically calibrating and fine-tuning the resulting huge number of parameters adaptively, by discovering optimized experimental setups in the first place, by extracting the maximum amount of information from rather indirect measurements of the underlying phenomena via complex observables, and by finding smart control strategies. Errors in such systems could be corrected by fully automated quantum error correction schemes discovered by an AI. When we think of quantum computers with thousands of qubits or quantum simulators with even more degrees of freedom, all of this will be a crucial part of the community's toolbox.

Specifying goals, not algorithms – One important aspect here will be that instead of defining an algorithm that tells the computer how to achieve some goal, we will typically define the goal itself. Such goals could be to retain a large fidelity during quantum operations, produce highly entangled states, or have a strong sensitivity to some external signal.
The details of how to reach this goal will be left for the computer to discover. This change of perspective will enable a much higher level of description, which is one way to keep ahead of the growing complexity. Ultimately, one might expect that the machine has access to the scientific literature and suggests goals and new experiments autonomously, as demonstrated in material science [140].

Discovering new Algorithms – Rather than discovering experiments or feedback strategies, it will be very interesting to see whether ML agents can autonomously discover other higher-logic quantum programs such as quantum algorithms. This task has recently been tackled by large language models for classical algorithms [141, 142], and we expect that quantum algorithms can similarly be discovered with classical machine learning models.

How can the human learn? – Suppose that computers will be able to help us find solutions for many of the lower-level and even some higher-level, more conceptual tasks in the domain of quantum technologies. That raises the following notorious question, pervasive throughout machine learning and artificial intelligence: How can we human scientists understand what the machine has learned? Do we need to open the black box of neural networks, or can we use the algorithms as a source of inspiration in a different way [143]? We argue that while improved performance in the task at hand is great, being able to understand the essence of what the machine has discovered is crucial for the result to become of much wider applicability. In general, gaining understanding has been called the essential aim of science [144]. Here, approaches where the solution involves discrete steps (e.g. discrete actions of an agent) or logic-based AI seem to be easier to interpret, explain and understand than results from deep-learning-based methods. The field of symbolic regression (which extracts discrete explanations of neural network predictions) might be very fruitful in this respect.

What needs to be done? – To attain the visions described above, our community may adopt some proven methodologies from other areas. The idea of fair benchmarks and competitions is one of the powerful driving forces in the development of ML algorithms. One of the most famous examples is the ImageNet data set, which provided the basis for a revolution of ML-based computer vision systems [145]. This idea was adapted in other fields of science that apply AI methodologies, such as material discovery [146, 147]. In contrast, the field of AI in quantum technology, at the moment, appears more like the wild west. There are no clear ways to compare approaches from different papers, because most works apply their approaches to slightly different tasks, making them incomparable. We believe that fair and suitably curated benchmark data sets will steer the development of powerful and ever more generally applicable AI algorithms in quantum technology. The data sets could consist of simulated or (in the best case) experimental data for data interpretation tasks. Likewise, to facilitate the discovery of experimental setups and protocols, the community can develop a selection of well-curated objective functions and a set of simulated environments describing important prototypical quantum devices (see SciGym for a first attempt at this [148]). In a similar direction, we expect that cloud access to real quantum experiments will become available for significantly more systems. AI algorithms can then be trained on the data from these real machines with specific experimental constraints (such as connectivity or noise). This will boost the capabilities of algorithms that deal with important, real-world systems.

Finally, what Alan Turing remarked in his visionary article on intelligence and learning machines [149] is also valid here, in the field of machine learning applied to quantum technologies: "We can only see a short distance ahead, but we can see plenty there that needs to be done."

V. ACKNOWLEDGMENTS

The authors thank Sören Arlt and Xuemei Gu for helpful comments on the manuscript. F.M. acknowledges funding by the Munich Quantum Valley, which is supported by the Bavarian state government with funds from the Hightech Agenda Bayern Plus.
[1] M. I. Jordan and T. M. Mitchell, Machine learning: (2003).


Trends, perspectives, and prospects, Science 349, 255 [5] F. Flamini, N. Spagnolo, and F. Sciarrino, Photonic
(2015). quantum information processing: a review, Reports on
[2] Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Progress in Physics 82, 016001 (2018).
Nature 521, 436 (2015). [6] I. M. Georgescu, S. Ashhab, and F. Nori, Quantum sim-
[3] I. Goodfellow, Y. Bengio, and A. Courville, Deep learn- ulation, Reviews of Modern Physics 86, 153 (2014).
ing (MIT press, 2016). [7] J. Preskill, Quantum Computing in the NISQ era and
[4] J. P. Dowling and G. J. Milburn, Quantum technology: beyond, Quantum 2, 79 (2018).
the second quantum revolution, Philosophical Transac- [8] M. C. Angelini and F. Ricci-Tersenghi, Cracking nuts
tions of the Royal Society of London. Series A: Math- with a sledgehammer: when modern graph neural
ematical, Physical and Engineering Sciences 361, 1655 networks do worse than classical greedy algorithms,
arXiv:2206.13211 (2022). [29] E. Magesan, J. M. Gambetta, A. D. Córcoles, and J. M.


[9] F. Marquardt, Machine learning and quantum devices, Chow, Machine Learning for Discriminating Quan-
SciPost Physics Lecture Notes , 29 (2021). tum Measurement Trajectories and Improving Readout,
[10] A. Dawid, J. Arnold, B. Requena, A. Gresch, Physical Review Letters 114, 200501 (2015).
M. Plodzień, K. Donatella, K. A. Nicoli, P. Stornati, [30] A. Seif, K. A. Landsman, N. M. Linke, C. Figgatt,
R. Koch, M. Büttner, et al., Modern applications of C. Monroe, and M. Hafezi, Machine learning assisted
machine learning in quantum sciences, arXiv:2204.04198 readout of trapped-ion qubits, Journal of Physics B:
(2022). Atomic, Molecular and Optical Physics 51, 174006
[11] G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, (2018).
N. Tishby, L. Vogt-Maranto, and L. Zdeborová, Ma- [31] G. Liu, M. Chen, Y.-X. Liu, D. Layden, and P. Cappel-
chine learning and the physical sciences, Reviews of laro, Repetitive readout enhanced by machine learning,
Modern Physics 91, 045002 (2019). Machine Learning: Science and Technology 1, 015003
[12] J. Carrasquilla, Machine learning for quantum matter, (2020).
Advances in Physics: X 5, 1797528 (2020). [32] E. Flurin, L. S. Martin, S. Hacohen-Gourgy, and I. Sid-
[13] M. Schuld, I. Sinayskiy, and F. Petruccione, An intro- diqi, Using a Recurrent Neural Network to Reconstruct
duction to quantum machine learning, Contemporary Quantum Dynamics of a Superconducting Qubit from
Physics 56, 172 (2015). Physical Observations, Physical Review X 10, 011006
[14] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, (2020).
N. Wiebe, and S. Lloyd, Quantum machine learning, [33] I. Agresti, N. Viggianiello, F. Flamini, N. Spagnolo,
Nature 549, 195 (2017). A. Crespi, R. Osellame, N. Wiebe, and F. Sciarrino, Pat-
[15] V. Dunjko and P. Wittek, A non-review of Quantum tern Recognition Techniques for Boson Sampling Vali-
Machine Learning: trends and explorations, Quantum dation, Physical Review X 9, 011013 (2019).
Views 4, 32 (2020). [34] S. Aaronson, The learnability of quantum states, Pro-
[16] L. Lamata, Quantum machine learning and quantum ceedings of the Royal Society A: Mathematical, Physical
biomimetics: A perspective, Machine Learning: Science and Engineering Sciences 463, 3089 (2007).
and Technology 1, 033002 (2020). [35] A. Rocchetto, S. Aaronson, S. Severini, G. Carvacho,
[17] M. Cerezo, A. Arrasmith, R. Babbush, S. C. Benjamin, D. Poderini, I. Agresti, M. Bentivegna, and F. Sciar-
S. Endo, K. Fujii, J. R. McClean, K. Mitarai, X. Yuan, rino, Experimental learning of quantum states, Science
L. Cincio, et al., Variational quantum algorithms, Na- Advances 5, eaau1946 (2019).
ture Reviews Physics 3, 625 (2021). [36] H.-Y. Huang, M. Broughton, J. Cotler, S. Chen,
[18] K. Bharti, A. Cervera-Lierta, T. H. Kyaw, T. Haug, J. Li, M. Mohseni, H. Neven, R. Babbush, R. Kueng,
S. Alperin-Lea, A. Anand, M. Degroote, H. Heimonen, J. Preskill, et al., Quantum advantage in learning from
J. S. Kottmann, T. Menke, W.-K. Mok, S. Sim, L.-C. experiments, Science 376, 1182 (2022).
Kwek, and A. Aspuru-Guzik, Noisy intermediate-scale [37] V. Cimini, M. Barbieri, N. Treps, M. Walschaers,
quantum algorithms, Reviews of Modern Physics 94, and V. Parigi, Neural Networks for Detecting Multi-
015004 (2022). mode Wigner Negativity, Physical Review Letters 125,
[19] K. Bharti, T. Haug, V. Vedral, and L.-C. Kwek, Ma- 160504 (2020).
chine learning meets quantum foundations: A brief sur- [38] J. Carrasquilla and R. G. Melko, Machine learning
vey, AVS Quantum Science 2, 034101 (2020). phases of matter, Nature Physics 13, 431 (2017).
[20] S. J. Russell and P. Norvig, Artificial Intelligence: A [39] E. P. L. van Nieuwenburg, Y.-H. Liu, and S. D. Huber,
Modern Approach, 4th ed. (Pearson, 2021). Learning phase transitions by confusion, Nature Physics
[21] D. Whitley, A genetic algorithm tutorial, Statistics and 13, 435 (2017).
computing 4, 65 (1994). [40] S. J. Wetzel, Unsupervised learning of phase transitions:
[22] S. N. Sivanandam and S. N. Deepa, Introduction to Ge- From principal component analysis to variational au-
netic Algorithms (Springer Berlin, Heidelberg, 2008). toencoders, Physical Review E 96, 022140 (2017).
[23] G. Cybenko, Approximation by superpositions of a sig- [41] K. Kottmann, P. Huembeli, M. Lewenstein, and
moidal function, Mathematics of Control, Signals and A. Acı́n, Unsupervised Phase Discovery with Deep
Systems 2, 303 (1989). Anomaly Detection, Physical Review Letters 125,
[24] S. Hochreiter and J. Schmidhuber, Long Short-Term 170603 (2020).
Memory, Neural Computation 9, 1735 (1997). [42] N. Käming, A. Dawid, K. Kottmann, M. Lewenstein,
[25] M. A. Nielsen, Neural Networks and Deep Learning (De- K. Sengstock, A. Dauphin, and C. Weitenberg, Unsu-
termination Press, 2015). pervised machine learning of topological phase transi-
[26] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, tions from experimental data, Machine Learning: Sci-
G. van den Driessche, J. Schrittwieser, I. Antonoglou, ence and Technology 2, 035037 (2021).
V. Panneershelvam, M. Lanctot, et al., Mastering the [43] F. Arute, K. Arya, R. Babbush, D. Bacon, J. C. Bardin,
game of Go with deep neural networks and tree search, R. Barends, R. Biswas, S. Boixo, F. G. S. L. Brandao,
Nature 529, 484 (2016). D. A. Buell, et al., Quantum supremacy using a pro-
[27] O. Vinyals, I. Babuschkin, W. M. Czarnecki, M. Math- grammable superconducting processor, Nature 574, 505
ieu, A. Dudzik, J. Chung, D. H. Choi, R. Powell, (2019).
T. Ewalds, P. Georgiev, et al., Grandmaster level in [44] G. Carleo and M. Troyer, Solving the quantum many-
StarCraft II using multi-agent reinforcement learning, body problem with artificial neural networks, Science
Nature 575, 350 (2019). 355, 602 (2017).
[28] R. S. Sutton and A. G. Barto, Reinforcement Learning: [45] R. G. Melko, G. Carleo, J. Carrasquilla, and J. I. Cirac,
An Introduction (MIT Press, Cambridge, MA, 2018). Restricted Boltzmann machines in quantum physics,
Nature Physics 15, 887 (2019). mation, Physical Review A 63, 053804 (2001).
[46] G. Torlai, G. Mazzola, J. Carrasquilla, M. Troyer, [63] B. L. Higgins, D. W. Berry, S. D. Bartlett, H. M. Wise-
R. Melko, and G. Carleo, Neural-network quantum state man, and G. J. Pryde, Entanglement-free Heisenberg-
tomography, Nature Physics 14, 447 (2018). limited phase estimation, Nature 450, 393 (2007).
[47] G. Torlai, B. Timar, E. P. L. van Nieuwenburg, [64] A. Hentschel and B. C. Sanders, Machine Learning for
H. Levine, A. Omran, A. Keesling, H. Bernien, Precise Quantum Measurement, Physical Review Let-
M. Greiner, V. Vuletić, M. D. Lukin, et al., Integrating ters 104, 063603 (2010).
Neural Networks with a Quantum Simulator for State [65] A. Hentschel and B. C. Sanders, Efficient Algorithm for
Reconstruction, Physical Review Letters 123, 230504 Optimizing Adaptive Quantum Metrology Processes,
(2019). Physical Review Letters 107, 233601 (2011).
[48] M. Schuld, I. Sinayskiy, and F. Petruccione, Neural Net- [66] N. B. Lovett, C. Crosnier, M. Perarnau-Llobet, and
works Take on Open Quantum Systems, Physics 12, 74 B. C. Sanders, Differential Evolution for Many-Particle
(2019). Adaptive Quantum Metrology, Physical Review Letters
[49] A. Nagy and V. Savona, Variational Quantum Monte 110, 220501 (2013).
Carlo Method with a Neural-Network Ansatz for [67] P. Palittapongarnpim, P. Wittek, E. Zahedinejad,
Open Quantum Systems, Physical Review Letters 122, S. Vedaie, and B. C. Sanders, Learning in quantum
250501 (2019). control: High-dimensional global optimization for noisy
[50] M. J. Hartmann and G. Carleo, Neural-Network Ap- quantum dynamics, Neurocomputing 268, 116 (2017).
proach to Dissipative Quantum Many-Body Dynamics, [68] H. Xu, J. Li, L. Liu, Y. Wang, H. Yuan, and X. Wang,
Physical Review Letters 122, 250502 (2019). Generalizable control for quantum parameter estima-
[51] F. Vicentini, A. Biella, N. Regnault, and C. Ciuti, Vari- tion through reinforcement learning, npj Quantum In-
ational Neural-Network Ansatz for Steady States in formation 5, 82 (2019).
Open Quantum Systems, Physical Review Letters 122, [69] J. Schuff, L. J. Fiderer, and D. Braun, Improving the
250503 (2019). dynamics of quantum sensors with reinforcement learn-
[52] N. Yoshioka and R. Hamazaki, Constructing neural sta- ing, New Journal of Physics 22, 035001 (2020).
tionary states for open quantum many-body systems, [70] E. G. Ryan, C. C. Drovandi, J. M. McGree, and A. N.
Physical Review B 99, 214306 (2019). Pettitt, A Review of Modern Computational Algorithms
[53] T. Adler, M. Erhard, M. Krenn, J. Brandstetter, for Bayesian Optimal Design, International Statistical
J. Kofler, and S. Hochreiter, Quantum Optical Experi- Review 84, 128 (2016).
ments Modeled by Long Short-Term Memory, Photonics [71] V. Gebhart, R. Santagati, A. A. Gentile, E. Gauger,
8, 535 (2021). D. Craig, N. Ares, L. Banchi, F. Marquardt,
[54] N. Mohseni, T. Fösel, L. Guo, C. Navarrete-Benlloch, L. Pezze, and C. Bonato, Learning Quantum Systems,
and F. Marquardt, Deep Learning of Quantum Many- arXiv:2207.00298 (2022).
Body Dynamics via Random Driving, Quantum 6, 714 [72] R. Durrer, B. Kratochwil, J. V. Koski, A. J. Landig,
(2022). C. Reichl, W. Wegscheider, T. Ihn, and E. Greplova,
[55] M. Choi, D. Flam-Shepherd, T. H. Kyaw, and Automated Tuning of Double Quantum Dots into Spe-
A. Aspuru-Guzik, Learning quantum dynamics with la- cific Charge States Using Neural Networks, Physical Re-
tent neural ordinary differential equations, Physical Re- view Applied 13, 054019 (2020).
view A 105, 042403 (2022). [73] N. Wiebe, C. Granade, C. Ferrie, and D. G. Cory,
[56] D. T. Lennon, H. Moon, L. C. Camenzind, L. Yu, D. M. Hamiltonian Learning and Certification Using Quantum
Zumbühl, G. A. D. Briggs, M. A. Osborne, E. A. Laird, Resources, Physical Review Letters 112, 190501 (2014).
and N. Ares, Efficiently measuring a quantum device [74] J. Wang, S. Paesani, R. Santagati, S. Knauer, A. A.
using machine learning, npj Quantum Information 5, Gentile, N. Wiebe, M. Petruzzella, J. L. O’Brien, J. G.
79 (2019). Rarity, A. Laing, et al., Experimental quantum Hamil-
[57] E. Polino, M. Valeri, N. Spagnolo, and F. Sciarrino, tonian learning, Nature Physics 13, 551 (2017).
Photonic quantum metrology, AVS Quantum Science 2, [75] A. A. Gentile, B. Flynn, S. Knauer, N. Wiebe, S. Pae-
024703 (2020). sani, C. E. Granade, J. G. Rarity, R. Santagati, and
[58] V. Cimini, I. Gianani, N. Spagnolo, F. Leccese, F. Scia- A. Laing, Learning models of quantum systems from
rrino, and M. Barbieri, Calibration of Quantum Sen- experiments, Nature Physics 17, 837 (2021).
sors by Neural Networks, Physical Review Letters 123, [76] N. Khaneja, T. Reiss, C. Kehlet, T. Schulte-
230502 (2019). Herbrüggen, and S. J. Glaser, Optimal control of cou-
[59] P. A. Knott, A search algorithm for quantum state en- pled spin dynamics: design of NMR pulse sequences by
gineering and metrology, New Journal of Physics 18, gradient ascent algorithms, Journal of magnetic reso-
073033 (2016). nance 172, 296 (2005).
[60] R. Nichols, L. Mineh, J. Rubio, J. C. F. Matthews, and [77] V. F. Krotov, Global methods in optimal control theory,
P. A. Knott, Designing quantum experiments with a Monographs and textbooks in pure and applied mathe-
genetic algorithm, Quantum Science and Technology 4, matics, Vol. 195 (Marcel Dekker, New York, 1996).
045012 (2019). [78] J. Somlói, V. A. Kazakov, and D. J. Tannor, Controlled
[61] D. W. Berry and H. M. Wiseman, Optimal States dissociation of I2 via optical transitions between the
and Almost Optimal Adaptive Measurements for Quan- X and B electronic states, Chemical Physics 172, 85
tum Interferometry, Physical Review Letters 85, 5098 (1993).
(2000). [79] P. Doria, T. Calarco, and S. Montangero, Optimal
[62] D. W. Berry, H. Wiseman, and J. Breslin, Optimal in- Control Technique for Many-Body Quantum Dynamics,
put states and feedback for interferometric phase esti- Physical Review Letters 106, 190501 (2011).
[80] R. S. Judson and H. Rabitz, Teaching lasers to control Physical Review A 101, 022321 (2020).
molecules, Physical Review Letters 68, 1500 (1992). [97] F. Schäfer, M. Kloc, C. Bruder, and N. Lörch, A dif-
[81] C. Chen, D. Dong, H.-X. Li, J. Chu, and T.-J. Tarn, ferentiable programming method for quantum control,
Fidelity-Based Probabilistic Q-Learning for Control of Machine Learning: Science and Technology 1, 035009
Quantum Systems, IEEE Transactions on Neural Net- (2020).
works and Learning Systems 25, 920 (2013). [98] L. Coopmans, D. Luo, G. Kells, B. K. Clark, and J. Car-
[82] M. Bukov, A. G. R. Day, D. Sels, P. Weinberg, rasquilla, Protocol Discovery for the Quantum Con-
A. Polkovnikov, and P. Mehta, Reinforcement Learn- trol of Majoranas by Differentiable Programming and
ing in Different Phases of Quantum Control, Physical Natural Evolution Strategies, PRX Quantum 2, 020332
Review X 8, 031086 (2018). (2021).
[83] M. August and J. M. Hernández-Lobato, Taking Gra- [99] F. Schäfer, P. Sekatski, M. Koppenhöfer, C. Bruder, and
dients Through Experiments: LSTMs and Memory M. Kloc, Control of stochastic quantum dynamics by
Proximal Policy Optimization for Black-Box Quantum differentiable programming, Machine Learning: Science
Control, in High Performance Computing, edited by and Technology 2, 035004 (2021).
R. Yokota, M. Weiland, J. Shalf, and S. Alam (Springer [100] R. Porotti, V. Peano, and F. Marquardt, Gra-
International Publishing, Cham, 2018) pp. 591–613. dient Ascent Pulse Engineering with Feedback,
[84] X.-M. Zhang, Z.-W. Cui, X. Wang, and M.-H. Yung, arXiv:2203.04271 (2022).
Automatic spin-chain learning to explore the quantum [101] M. Krenn, J. S. Kottmann, N. Tischler, and A. Aspuru-
speed limit, Physical Review A 97, 052333 (2018). Guzik, Conceptual Understanding through Efficient
[85] M. Y. Niu, S. Boixo, V. N. Smelyanskiy, and H. Neven, Automated Design of Quantum Optical Experiments,
Universal quantum control through deep reinforcement Physical Review X 11, 031044 (2021).
learning, npj Quantum Information 5, 33 (2019). [102] M. Krenn, M. Malik, R. Fickler, R. Lapkiewicz, and
[86] Z. An and D. L. Zhou, Deep reinforcement learning for A. Zeilinger, Automated Search for new Quantum Ex-
quantum gate control, Europhysics Letters 126, 60002 periments, Physical Review Letters 116, 090405 (2016).
(2019). [103] M. Krenn, M. Erhard, and A. Zeilinger, Computer-
[87] Y. Baum, M. Amico, S. Howell, M. Hush, M. Liuzzi, inspired quantum experiments, Nature Reviews Physics
P. Mundada, T. Merkh, A. R. R. Carvalho, and M. J. 2, 649 (2020).
Biercuk, Experimental Deep Reinforcement Learning [104] X. Gao, M. Erhard, A. Zeilinger, and M. Krenn,
for Error-Robust Gate-Set Design on a Superconducting Computer-Inspired Concept for High-Dimensional Mul-
Quantum Computer, PRX Quantum 2, 040324 (2021). tipartite Quantum Gates, Physical Review Letters 125,
[88] T. Fösel, P. Tighineanu, T. Weiss, and F. Mar- 050501 (2020).
quardt, Reinforcement Learning with Neural Networks [105] A. A. Melnikov, H. P. Nautrup, M. Krenn, V. Dunjko,
for Quantum Feedback, Physical Review X 8, 031084 M. Tiersch, A. Zeilinger, and H. J. Briegel, Active learn-
(2018). ing machine learns to create new quantum experiments,
[89] J. Mackeprang, D. B. R. Dasari, and J. Wrachtrup, A Proceedings of the National Academy of Sciences 115,
reinforcement learning approach for quantum state en- 1221 (2018).
gineering, Quantum Machine Intelligence 2, 5 (2020). [106] A. Cervera-Lierta, M. Krenn, and A. Aspuru-Guzik, De-
[90] Z. T. Wang, Y. Ashida, and M. Ueda, Deep Reinforce- sign of quantum optical experiments with logic artificial
ment Learning Control of Quantum Cartpoles, Physical intelligence, arXiv:2109.13273 (2021).
Review Letters 125, 100401 (2020). [107] M. J. H. Heule and O. Kullmann, The science of brute
[91] S. Borah, B. Sarma, M. Kewming, G. J. Milburn, force, Communications of the ACM 60, 70 (2017).
and J. Twamley, Measurement-Based Feedback Quan- [108] B. Sanchez-Lengeling and A. Aspuru-Guzik, Inverse
tum Control with Deep Reinforcement Learning for a molecular design using machine learning: Generative
Double-Well Nonlinear Potential, Physical Review Let- models for matter engineering, Science 361, 360 (2018).
ters 127, 190403 (2021). [109] D. Flam-Shepherd, T. C. Wu, X. Gu, A. Cervera-Lierta,
[92] R. Porotti, A. Essig, B. Huard, and F. Marquardt, Deep M. Krenn, and A. Aspuru-Guzik, Learning interpretable
Reinforcement Learning for Quantum State Preparation representations of entanglement in quantum optics ex-
with Weak Nonlinear Measurements, Quantum 6, 747 periments using deep generative models, Nature Ma-
(2022). chine Intelligence 4, 544 (2022).
[93] V. V. Sivak, A. Eickbusch, H. Liu, B. Royer, I. Tsiout- [110] T. Menke, F. Häse, S. Gustavsson, A. J. Kerman, W. D.
sios, and M. H. Devoret, Model-Free Quantum Control Oliver, and A. Aspuru-Guzik, Automated design of su-
with Reinforcement Learning, Physical Review X 12, perconducting circuits and its application to 4-local cou-
011059 (2022). plers, npj Quantum Information 7, 49 (2021).
[94] N. Leung, M. Abdelhafez, J. Koch, and D. Schuster, [111] J. Wallnöfer, A. A. Melnikov, W. Dür, and H. J. Briegel,
Speedup for quantum optimal control from automatic Machine Learning for Long-Distance Quantum Commu-
differentiation based on graphics processing units, Phys- nication, PRX Quantum 1, 010301 (2020).
ical Review A 95, 042318 (2017). [112] C. P. Williams and A. G. Gray, Automated Design
[95] M. Abdelhafez, D. I. Schuster, and J. Koch, Gradient- of Quantum Circuits, in Quantum Computing and
based optimal control of open quantum systems us- Quantum Communications, edited by C. P. Williams
ing quantum trajectories and automatic differentiation, (Springer Berlin, Heidelberg, 1999) pp. 113–125.
Physical Review A 99, 052327 (2019). [113] R. B. McDonald and H. G. Katzgraber, Genetic braid
[96] M. Abdelhafez, B. Baker, A. Gyenis, P. Mundada, A. A. optimization: A heuristic approach to compute quasi-
Houck, D. Schuster, and J. Koch, Universal gates for particle braids, Physical Review B 87, 054414 (2013).
protected superconducting qubits using optimal control,
22

[114] U. Las Heras, U. Alvarez-Rodriguez, E. Solano, and [130] R. Duncan, A. Kissinger, S. Perdrix, and J. van de We-
M. Sanz, Genetic Algorithms for Digital Quantum Sim- tering, Graph-theoretic Simplification of Quantum Cir-
ulations, Physical Review Letters 116, 230504 (2016). cuits with the ZX-calculus, Quantum 4, 279 (2020).
[115] Y.-H. Zhang, P.-L. Zheng, Y. Zhang, and D.-L. Deng, [131] T. Fösel, M. Y. Niu, F. Marquardt, and L. Li, Quantum
Topological Quantum Compiling with Reinforcement circuit optimization with deep reinforcement learning,
Learning, Physical Review Letters 125, 170501 (2020). arXiv:2103.07585 (2021).
[116] L. Moro, M. G. A. Paris, M. Restelli, and E. Prati, [132] P. W. Shor, Scheme for reducing decoherence in quan-
Quantum compiling by deep reinforcement learning, tum computer memory, Physical Review A 52, R2493
Communications Physics 4, 178 (2021). (1995).
[117] L. Cincio, Y. Subaşı, A. T. Sornborger, and P. J. Coles, [133] G. Torlai and R. G. Melko, Neural Decoder for Topolog-
Learning the quantum algorithm for state overlap, New ical Codes, Physical Review Letters 119, 030501 (2017).
Journal of Physics 20, 113022 (2018). [134] S. Krastanov and L. Jiang, Deep Neural Network Prob-
[118] L. Cincio, K. Rudinger, M. Sarovar, and P. J. Coles, abilistic Decoder for Stabilizer Codes, Scientific reports
Machine Learning of Noise-Resilient Quantum Circuits, 7, 11003 (2017).
PRX Quantum 2, 010324 (2021). [135] S. Varsamopoulos, B. Criger, and K. Bertels, Decoding
[119] J. Yao, P. Kottering, H. Gundlach, L. Lin, and small surface codes with feedforward neural networks,
M. Bukov, Noise-Robust End-to-End Quantum Control Quantum Science and Technology 3, 015004 (2017).
using Deep Autoregressive Policy Networks, in Proceed- [136] P. Baireuther, T. E. O’Brien, B. Tarasinski, and
ings of the 2nd Mathematical and Scientific Machine C. W. J. Beenakker, Machine-learning-assisted correc-
Learning Conference, Proceedings of Machine Learning tion of correlated qubit errors in a topological code,
Research, Vol. 145, edited by J. Bruna, J. Hesthaven, Quantum 2, 48 (2018).
and L. Zdeborova (PMLR, 2022) pp. 1044–1081. [137] R. Sweke, M. S. Kesselring, E. P. L. van Nieuwenburg,
[120] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Bab- and J. Eisert, Reinforcement learning decoders for fault-
bush, and H. Neven, Barren plateaus in quantum neural tolerant quantum computation, Machine Learning: Sci-
network training landscapes, Nature Communications ence and Technology 2, 025005 (2021).
9, 4812 (2018). [138] H. P. Nautrup, N. Delfosse, V. Dunjko, H. J. Briegel,
[121] S. Sim, P. D. Johnson, and A. Aspuru-Guzik, Ex- and N. Friis, Optimizing Quantum Error Correction
pressibility and Entangling Capability of Parameterized Codes with Reinforcement Learning, Quantum 3, 215
Quantum Circuits for Hybrid Quantum-Classical Al- (2019).
gorithms, Advanced Quantum Technologies 2, 1900070 [139] Z. Wang, T. Rajabzadeh, N. Lee, and A. H. Safavi-
(2019). Naeini, Automated Discovery of Autonomous Quantum
[122] M. Ostaszewski, L. M. Trenkwalder, W. Masarczyk, Error Correction Schemes, PRX Quantum 3, 020302
E. Scerri, and V. Dunjko, Reinforcement learning for op- (2022).
timization of variational quantum circuit architectures, [140] V. Tshitoyan, J. Dagdelen, L. Weston, A. Dunn,
Advances in Neural Information Processing Systems, Z. Rong, O. Kononova, K. A. Persson, G. Ceder, and
34, 18182 (2021). A. Jain, Unsupervised word embeddings capture la-
[123] S.-X. Zhang, C.-Y. Hsieh, S. Zhang, and H. Yao, Neural tent knowledge from materials science literature, Nature
predictor based quantum architecture search, Machine 571, 95 (2019).
Learning: Science and Technology 2, 045027 (2021). [141] S. d’Ascoli, P.-A. Kamienny, G. Lample, and F. Char-
[124] J. Yao, L. Lin, and M. Bukov, Reinforcement Learning ton, Deep Symbolic Regression for Recurrent Sequences,
for Many-Body Ground-State Preparation Inspired by arXiv:2201.04600 (2022).
Counterdiabatic Driving, Physical Review X 11, 031070 [142] P. Veličković, A. P. Badia, D. Budden, R. Pascanu,
(2021). A. Banino, M. Dashevskiy, R. Hadsell, and C. Blun-
[125] G. Verdon, M. Broughton, J. R. McClean, K. J. Sung, dell, The CLRS Algorithmic Reasoning Benchmark,
R. Babbush, Z. Jiang, H. Neven, and M. Mohseni, arXiv:2205.15659 (2022).
Learning to learn with quantum neural networks via [143] M. Krenn, R. Pollice, S. Y. Guo, M. Aldeghi,
classical neural networks, arXiv:1907.05415 (2019). A. Cervera-Lierta, P. Friederich, G. d. P. Gomes,
[126] A. Anand, M. Degroote, and A. Aspuru-Guzik, Natural F. Häse, A. Jinich, A. Nigam, et al., On scientific under-
evolutionary strategies for variational quantum compu- standing with artificial intelligence, arXiv:2204.01467
tation, Machine Learning: Science and Technology 2, (2022).
045012 (2021). [144] H. W. de Regt, Understanding Scientific Understanding
[127] T. H. Kyaw, T. Menke, S. Sim, A. Anand, N. P. Sawaya, (Oxford University Press, 2017).
W. D. Oliver, G. G. Guerreschi, and A. Aspuru-Guzik, [145] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and
Quantum Computer-Aided Design: Digital Quantum L. Fei-Fei, Imagenet: A large-scale hierarchical image
Simulation of Quantum Processors, Physical Review database, in 2009 IEEE Conference on Computer Vi-
Applied 16, 044042 (2021). sion and Pattern Recognition (2009) pp. 248–255.
[128] J. S. Kottmann, M. Krenn, T. H. Kyaw, S. Alperin- [146] N. Brown, M. Fiscato, M. H. S. Segler, and A. C.
Lea, and A. Aspuru-Guzik, Quantum computer-aided Vaucher, GuacaMol: Benchmarking Models for de Novo
design of quantum optics hardware, Quantum Science Molecular Design, Journal of Chemical Information and
and Technology 6, 035010 (2021). Modeling 59, 1096 (2019).
[129] F.-M. Liu, M.-C. Chen, C. Wang, S.-W. Li, Z.-X. [147] D. Polykovskiy, A. Zhebrak, B. Sanchez-Lengeling,
Shang, C. Ying, J.-W. Wang, C.-Z. Peng, X. Zhu, C.- S. Golovanov, O. Tatanov, S. Belyaev, R. Kurbanov,
Y. Lu, et al., Quantum Design for Advanced Qubits, A. Artamonov, V. Aladinskiy, M. Veselov, et al., Molec-
arXiv:2109.00994 (2021). ular sets (MOSES): a benchmarking platform for molec-
23

ular generation models, Frontiers in Pharmacology 11, [149] A. Turing, Computing Machinery and Intelligence,
565644 (2020). Mind LIX, 433 (1950).
[148] Reinforcement learning for science, https://www.
scigym.net/, accessed: 2022-08-08.
