Artificial Intelligence and Machine Learning For Quantum Technologies
decade.
FIG. 1. Overview of tasks in the area of quantum technologies that machine learning and artificial intelligence can help solve
better, as explained in this perspective article.
on the best-suited algorithms.

Once the reader wants to understand how to apply these tools in practice, we recommend the very educational lecture notes introducing machine learning techniques with a view to their application to quantum devices, both brief [9] and very extended [10]. In addition, there are several reviews from recent years with a somewhat different focus than ours, e.g. about machine learning applied to physics in general [11] or machine learning for quantum many-body physics [12].

We also remark that there is the whole field of quantum machine learning, which tries to discover potential quantum advantages when implementing new learning algorithms on quantum platforms. This also includes variational quantum circuits, which are very promising in the context of noisy intermediate-scale quantum (NISQ) computers. We do not cover these developments here, and we refer the interested reader to reviews for quantum machine learning [13–16] and variational quantum algorithms and NISQ devices [17, 18]. While we focus on quantum technology here, we want to mention that there is also considerable work on ML for foundational quantum science [19].

In the following section, we will briefly introduce the basics of neural networks and other machine learning techniques, aiming to set the stage for subsequent discussions. The bulk of our review is contained in section III, where we discuss the various applications of machine learning to quantum technologies. In each case, we aim to remark on some of the challenges and potential future research directions. Finally, in the outlook, we speculate about how machine learning might have transformed quantum technologies a dozen years from now.

II. BASIC TECHNIQUES OF MACHINE LEARNING AND ARTIFICIAL INTELLIGENCE

The purpose of this section is merely to provide a glimpse of the essential basics so that the subsequent discussion of applications becomes intelligible to the reader without prior exposure. Machine learning techniques have been around for several decades and involve many efficient approaches that predate the recent deep learning revolution, like “support vector machines” or “decision trees” [20]. However, the flexibility of neural networks has made them a popular general-purpose choice, so we focus on those in our brief introduction.

A. Evolutionary Algorithms

One set of algorithms that have often been used in optimization tasks in the domain of artificial intelligence are evolutionary (genetic) algorithms [21, 22]. There, the idea is to deal with a set of candidate solutions, each of them described via a suitable vector. These solutions can be randomly changed (“mutated”), two solutions can be combined to form a new candidate (“crossover”), and finally only the best solutions are kept for the next round of evolution (“selection”). Such methods can be surprisingly effective, and their application to new problems is often straightforward. As we will describe in further sections, genetic algorithms have successfully been applied to quantum-technology-related tasks.

B. Neural Networks: Structure

The structure of artificial neural networks, which can be trained to approximate arbitrary functions, is loosely motivated by neurons in the human brain. Each neuron receives multiple inputs and generates an output signal which serves as input for other neurons. During learning, the strength of connections between neurons changes. More precisely, each artificial neuron receives $N$ inputs $x_j$ which are summed according to some weights $w_j$, representing connection strengths, see Fig. 2a. After adding a bias (shift) $b$, a simple nonlinear “activation function” $f$ is applied to yield the output $y$:

$$y = f(z) \tag{1a}$$
$$z = \sum_{j=1}^{N} w_j x_j + b \tag{1b}$$

Without the nonlinear activation function between each layer, the whole neural network could be compressed into one single linear transformation, with very limited computational power. Popular activation functions are the
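As a minimal sketch of Eqs. (1a) and (1b), the following Python snippet evaluates a single artificial neuron and checks numerically that two stacked neurons without a nonlinearity reduce to one linear map. The function names (`neuron`, `identity`), the choice of tanh as default activation, and the numerical values are illustrative assumptions, not taken from the article.

```python
import math


def neuron(inputs, weights, bias, activation=math.tanh):
    """Single artificial neuron: y = f(z) with z = sum_j w_j * x_j + b."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


def identity(z):
    """Stand-in for 'no activation function' (purely linear neuron)."""
    return z


# Illustrative (assumed) parameters for two one-input neurons in series.
w1, b1 = 0.5, 1.0
w2, b2 = -2.0, 0.25
x = 3.0

# Two stacked linear neurons: w2 * (w1 * x + b1) + b2 ...
stacked = neuron([neuron([x], [w1], b1, identity)], [w2], b2, identity)
# ... equal one linear neuron with weight w2 * w1 and bias w2 * b1 + b2.
collapsed = neuron([x], [w2 * w1], w2 * b1 + b2, identity)

# With a nonlinear activation in between, no such single linear neuron exists in general.
nonlinear = neuron([neuron([x], [w1], b1)], [w2], b2)
```

Comparing `stacked` and `collapsed` illustrates why the nonlinearity between layers is needed: without it, depth adds no expressive power.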
parameters). Therefore, it is crucial that there exists a highly efficient approach to calculating the gradient: the so-called “backpropagation” scheme. As its name implies, this algorithm calculates the gradients layer by layer, starting from the output layer. Remarkably, it is computationally no more demanding than the original evaluation of the neural network (also called the forward propagation). A significant hardware advance for training neural networks is the GPU (graphics processing unit), which is optimized to perform highly efficient manipulations of large matrices. This efficiency is essential for the success of ML in many applications.

For more details on neural network architectures, backpropagation, and gradient descent, the interested reader is referred to the many existing excellent introductions [3, 9, 10, 25].

An MDP consists of an “agent” (the controller) and an “environment” (the world, or the system to be controlled), and both interact in multiple time steps. In each time step $t$, the environment’s state $s_t$ is observed. Solely based on this observation, the agent decides on its next action $a_t$, which will change the environment’s state. The agent’s behavior is defined by the “policy” $\pi(a|s)$, which denotes the probability of choosing the action $a$ given the observation $s$. For each action $a_t$, the agent receives a reward $r_t$. For example, in a game, this reward could be +1/−1 at the last time step, when the agent has won/lost the game, and otherwise 0 in all previous steps. RL aims to maximize the cumulative reward $R$ (also called “return”)

$$R = \sum_{t=1}^{T} r_t , \tag{4}$$
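The agent-environment loop and the return of Eq. (4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy game, the function names (`run_episode`, `make_toy_game`, `random_policy`), and the win condition are hypothetical and not taken from the article. The reward follows the game example above: 0 at every step except the last, where it is +1 or −1.

```python
import random


def run_episode(policy, step, initial_state, num_steps):
    """Roll out one episode and accumulate the return R = sum_{t=1}^{T} r_t."""
    state = initial_state
    total_return = 0.0
    for t in range(1, num_steps + 1):
        action = policy(state)                  # agent picks a_t based on s_t only
        state, reward = step(state, action, t)  # environment yields next state and r_t
        total_return += reward
    return total_return


def make_toy_game(num_steps):
    """Hypothetical game: reward 0 until the last step, then +1 (win) or -1 (loss)."""
    def step(state, action, t):
        next_state = state + action
        if t < num_steps:
            return next_state, 0.0
        # Assumed win condition: the final state is positive.
        return next_state, (1.0 if next_state > 0 else -1.0)
    return step


def random_policy(state):
    """Stochastic policy pi(a|s): +1 or -1 with equal probability, ignoring s."""
    return random.choice([+1, -1])


if __name__ == "__main__":
    T = 5
    print(run_episode(random_policy, make_toy_game(T), initial_state=0, num_steps=T))
```

An RL algorithm would adjust the policy so that the average of this return over many episodes increases; the sketch only evaluates the return for a fixed policy.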
learning technique called support-vector machine (SVM) to perform clustering of measurement traces in an unsupervised fashion, outperforming classical clustering algorithms. The idea of a nonlinear SVM is to map data points to a higher-dimensional space and to find the best hyperplane for separating two classes of data points in that space. In this specific work, each measurement trajectory, which consists of hundreds of individual data points, is interpreted as a point in a high-dimensional space. The SVM’s goal was to separate curves that originate from a zero state from those that originate from a one state. The readout fidelity was improved compared to non-ML clustering techniques. Furthermore, this analysis has shown that the main noise contribution comes from physical bit-flips (either heating or relaxation of the qubit). Without such events, the classification of the SVM becomes near-perfect. A similar approach has been demonstrated – using neural networks – to enhance the readout capability of trapped-ion qubits [30]. Here, the authors show that the readout fidelity improves significantly compared to a non-ML clustering method, especially when the effective amount of data per measurement increases. Similar techniques have also been applied to NV center quantum devices [31].

In a pioneering experimental work, a neural network reconstructed the quantum dynamics of a quantum system directly from measurement data [32]. There, the authors considered again a superconducting qubit coupled to a readout resonator, whose noisy measurement trace is fed as input into the network (together with the initial preparation state and the final measurement basis). The network’s task was to predict the statistics of arbitrary measurements at some given time during the evolution, i.e. effectively predict the evolution of the density matrix given the measurement record, see Fig. 3b. The kind of network most suited to this task is a so-called recurrent network, i.e. a network able to process a time series (originally used for text or speech processing). The resulting fully trained network can map any measurement data to a quantum state.

Furthermore, machine learning techniques can be exploited to analyze the statistics of measurement outcomes in quantum experiments where the aim is to demonstrate the classical complexity of sampling from the quantum distribution. Boson sampling is the best-known such scenario. In [33], unsupervised machine learning techniques (various clustering approaches) were employed to identify and rule out possible malfunction scenarios that would lead to a noticeably different distribution.

One exciting work from theoretical computer science connects the question of quantum state tomography with the theory of computational learning in the supervised setting [34]. Here, a learner uses the training dataset to produce a hypothesis about future measurements. Quantum state tomography, which requires a number of measurements that is exponential in the number of particles, can be seen as a learner that produces a hypothesis for every possible measurement on the quantum system. This might, however, not be necessary in most cases. The theory of probably approximately correct (PAC) models provides a hypothesis for every measurement close to the dataset. Surprisingly, it was found that this question can be solved with a linear scaling of measurements. This computational learning strategy was first demonstrated in quantum optics experiments with up to six photons. The experiment confirmed the scaling behavior of PAC, even in the presence of realistic experimental noise [35].

Another very interesting recent development shows that joint quantum measurements of several individual copies of a many-particle quantum state can lead to an exponential improvement over classical learning algorithms [36]. The authors show experimentally on a platform of 40 superconducting qubits that tasks such as predicting properties of the physical system can be significantly improved when the results of such joint measurements are fed as an input to a classical RNN. The result is particularly remarkable as it shows a clear advantage already for current, noisy, and not error-corrected quantum computers.

Many other interesting examples exist that use neural networks to analyze simulated or measured data of quantum systems. For example, it has been demonstrated that the Wigner negativity of a multimode quantum state can be approximated well even in the low-data regime [37], with important consequences for quantum technologies. Another active field is the neural-network-based detection of quantum phase transitions and classification of quantum phases in condensed matter physics. We will not go into detail here. Instead, we point to some exciting early works in this field [38–40], as well as some very modern applications that, for instance, use anomaly detection to find phase transitions in an unsupervised way [41]. Anomaly detection has already been applied to detect phases directly from experimental measurements [42]. See a recent review on this topic in [12].

Approximation of Quantum States – The direct application of numerical techniques to quantum devices requires, in many situations, the storage and processing of the system’s full quantum state. As the quantum state grows exponentially with the number of particles, the memory requirements quickly become enormous, even for moderately large quantum systems. For example, storing the full quantum state of a 42-qubit system requires 35 TByte of memory. As demonstrated in the earliest quantum advantage experiments [43], this is directly related to the power of quantum computers and quantum simulators. However, it poses a significant problem for classical computational approaches that deal with large quantum systems and therefore for developing new large-scale quantum technologies.

To overcome this challenge, memory-efficient approximations of the quantum wave function are indispensable. Neural networks are one key candidate to approximate the quantum wave function. This approach has sometimes been called Neural Quantum State (NQS).

A prominent approach tries to represent the quantum state in terms of a neural network [44]. This implies that for each new quantum state, another network will be trained, based on the associated measurement data for that state. In principle, that is considerably easier than asking a single network to be responsible for arbitrary states, i.e. the task considered above. As a consequence, much more complicated many-body states can be accessed. The whole approach can be seen as a neural-network-based version of quantum state tomography.

Several different ways exist to use a single neural network to represent a single quantum state. One straightforward approach, first introduced in [44] and then extended in subsequent works, employs a network that directly represents the wave function. Given a multi-particle configuration $x$ as input, the network has to produce the wave function amplitude for that configuration as output: $\Psi_\theta(x)$.

Different structures can be used for the network, with a restricted Boltzmann machine (RBM) being a popular choice since it also allows direct sampling from the probability distribution of observations [45]. In a traditional RBM, the aim is to learn to sample from some observed probability distribution, see Fig. 3(c). It consists of binary visible units and hidden units connected to each other, and the statistics of these units are sampled from a Boltzmann distribution with an energy $E$ that contains interaction terms bilinear in the hidden ($h$) and visible ($v$) unit values: $-E = \sum_j a_j v_j + \sum_k b_k h_k + \sum_{j,k} w_{jk} v_j h_k$. During training, the coupling constants are updated to obtain the desired probability distribution of $v$ (observed in samples provided during training). A simple physics example would be a 1D spin chain, whose configurations are identified as sample vectors $v$. More generally, other so-called generative deep learning methods (such as normalizing flows, variational auto-encoders, and generative adversarial networks) can be used to learn probability distributions, including those representing the statistics of observables in quantum states in a given basis.

Quantum state tomography using an RBM-style ansatz for the wave functions was introduced in [46]. Since one wants to keep the wave function’s phase $\phi$ as well as the probability $p$, the ansatz is now of the type $\Psi(x) = \sqrt{p(x)}\, e^{i\phi(x)}$, where both $p$ and $\phi$ are represented as networks and $x$ corresponds to the visible units. A crucial idea in this approach is to match the probability distributions obtained from the experiment for observables in more than one basis (e.g. $\sigma^z$ and $\sigma^x$ etc. for qubits). The evaluation of different bases can be carried out via unitary transformations acting on the wave function $\Psi$ that is expressed in a single reference basis.

In [47], this approach was applied to experimental data from snapshots of many-particle configurations taken in a Rydberg atom quantum simulator. The resulting network-based wave-function ansatz could then be used to reconstruct other expectation values and observables that were not directly accessed in the experiment.

The idea has been extended to mixed states [48–52]. In that way, NQS can be used to efficiently approximate open quantum systems, which are notoriously difficult to capture.

Approximating Quantum Dynamics – Once a suitable quantum state representation is available, it can be exploited to evolve the state in time. This enables potentially efficient simulation of quantum many-body time evolution, which is important for predicting and benchmarking the dynamics of quantum simulators and quantum computing platforms. For the general case of dissipative quantum many-body dynamics, i.e. the time evolution of mixed states, this has been explored in [50].

Rather than explicitly storing the entire quantum state, another technique shows how one can directly compute a quantum state’s complex properties just from the state’s construction rules, i.e. the quantum experimental circuit. In [53], the authors show how a recurrent neural network [24] can approximate the properties that emerge from quantum experiments without ever storing the intermediate quantum state directly. These systems could then directly be applied for complex quantum design tasks, a topic we cover in section III D. Another approach [54] also forgoes the representation of quantum states and instead trains a recurrent network to predict the evolution of observables under random external driving of a quantum many-body system (based either on simulated or possibly even experimental data). The trained network can then predict the evolution under arbitrary driving patterns (e.g. quenches). In a similar spirit, neural ordinary differential equations (neural ODEs) can be used to approximate the dynamics of quantum systems directly, again without storing the explicit information about the quantum wave function [55]. Interestingly, the approximation is of high enough quality that it is possible to rediscover some fundamental properties of quantum physics, such as the Heisenberg uncertainty relation.

Future Challenges and Opportunities – Improved data efficiency (both for the training and when applying ML to interpret the data) will be an important challenge for the future, especially when the devices scale to more complex quantum systems.

It will be interesting to co-discover the measurement strategy together with the data interpretation strategy. This might be particularly interesting if the AI algorithm is allowed to employ quantum measurements on numerous copies of the same state, as pioneered in [36].

When a neural network can find a suitable approximation for the computation of complex quantum systems, such as an NQS, it has learned a theoretical technique that might be useful for humans too. It will be interesting to learn how to extract the per-se inaccessible knowledge from the weights and biases of the neural network. One method is so-called symbolic regression.

The extensions of NQS to complex quantum systems, such as higher dimensions and spins beyond qubits, will
in every step the reference phase that yields the largest immediate information gain) do not necessarily lead to the largest information gain in the long run. To overcome this effect, one of the seminal early contributions to ML-based quantum metrology used particle swarm optimization of the feedback strategy [64]. In particle swarm optimization, a collection of different feedback strategies (each called a “particle”) iteratively moves in the space of all possible strategies. In each iteration, each particle moves towards a combination of the best local optimum and the best global optimum currently known to the whole swarm. The experimental setting of [64, 65] is the same as for the BWB strategy – a Mach-Zehnder interferometer with an unknown phase in one arm and an adaptive phase in the other arm. Indeed, the swarm optimization algorithm finds (slightly) better strategies than the greedy Bayesian BWB approach. Interestingly, neither BWB nor swarm optimization can find the optimal strategy, which was identified for small photon numbers via an extensive computation of all possible strategies. Other early ML algorithms in this domain have applied evolutionary approaches to approximate the ideal feedback strategy [66, 67].

Discovering a strategy is a problem that can directly be formulated as a reinforcement learning task. An early application of RL to quantum parameter estimation was provided in [68], with frequency estimation of a qubit as a test case. In that work, the idea was to optimize the quantum Fisher information for the parameter of interest. This can be done by finding a sequence of suitable control pulses applied during the noisy evolution of the quantum probe. No feedback is involved in this simple setting since the measurement itself is not part of the evolution controlled by RL.

However, the quantum Fisher information is only useful in cases where one is already fairly certain of the true parameter value. RL can be employed to study more complex situations, where updates are performed using the Bayes rule, starting from an arbitrary prior parameter distribution, and where the strategy is not greedy (i.e. more than a single step of the sequence is optimized). In [69], the authors provided information about the current Bayes distribution of the unknown parameter (as extracted from previous measurement results) and the previous measurement choices as input to an RL agent implemented by a neural network. It then has to suggest the next measurement. After the whole sequence of measurements, the agent is rewarded according to the total reduction in parameter variance. It was shown that this approach performs very competitively for an important test case, namely parameter estimation for a qubit of unknown frequency in the presence of dephasing.

Device calibration – Future large-scale quantum devices will consist of a large number of components with adjustable parameters that need to be characterized and tuned automatically. A complete characterization of the device via quantum process tomography quickly becomes impractical. To find the actual parameters or the ideal operating point of the quantum device, it is therefore necessary to extract the relevant data with a very limited amount of information. This task can be formulated as a machine learning task, specifically applying “active learning” or “Bayesian optimal experimental design” [70], where the algorithm chooses the most informative measurement autonomously (see also [71] for a review of such methods in the context of quantum devices). Naturally, this is closely related to the estimation of external parameters in quantum metrology as discussed above.

We illustrate these techniques via a pioneering experimental application to quantum devices. This experiment [56] considered the calibration and measurement of a semiconductor quantum dot. Such a device can be tuned via applied gate voltages, and its resulting properties can be measured via a transport current. Here, the goal was to explore the properties of the quantum dot, as defined by its current-voltage map $I(V_1, V_2)$, where the voltages include a bias voltage driving the current and a gate voltage deforming the dot’s potential. In a naive approach, even if only two voltages were scanned with 100 discretization steps each, one would need to perform 10,000 measurements to get a suitably resolved device characteristic.

To reduce the required number of current measurements, the authors tried to estimate which measurement (in the 2D voltage space) would yield the “maximum amount of additional information.” In practical terms, this would be the measurement that is expected to place the tightest constraints on the current-voltage maps that are still compatible with all the observed values of the current (observed in this and prior measurements). It is obvious how this setting translates to other quantum platforms, e.g. measurements of microwave transmission through superconducting circuits controlled via gate voltages and magnetic fields or the optical response of tunable atomic systems.

The first step towards this goal is to efficiently represent all current-voltage maps that might be observed, given the general physics of such a device, the assumed prior distribution of device parameters, and all previous measurement results. In general, this is the domain of “generative models,” which can sample from a probability distribution that is learned. In the case of [56], the authors used such a generative model, in their case a so-called “constrained variational autoencoder” (cVAE), to randomly create realistic current-voltage maps that follow the probability distribution of the actual physical system, see Fig. 4c. Additional input into the generative model provides a constraint, in the form of a few initially existing measurement results, and guides the reconstruction to sample only maps compatible with those constraints. In each step, 100 different voltage maps are sampled. Those maps are used to find the next measurement point in voltage space that would lead to the maximum information gain, see Fig. 4d. With this technique, the total number of necessary measurements for the characterization of the device is reduced by a factor of 4. This clearly
shows that the overhead of the deep-learning algorithm is more than compensated by its efficiency improvement compared to the naive approach. A benefit of this technique is that generating new samples with the cVAE is very efficient. Thus it can be scaled to much larger devices, where even more significant efficiency gains are expected.

Another comparatively straightforward way to use machine learning in device characterization consists in training a network-based classifier to recognize “interesting” measurement results. This then allows tuning parameters until those results are obtained. Such an approach has been demonstrated in [72] for navigating charge-stability diagrams of multi-quantum-dot devices. In that setting, the algorithm’s goal was to automatically tune the charge occupation of the double quantum dot. The task is reformulated as a classification task, where the algorithm recognizes individual charge transitions when presented with a charge-stability diagram. Since such a diagram constitutes an image, CNNs are a suitable choice for the task.

Quantum Hamiltonian Learning – Imagine the following parameter-estimation problem: One wants to estimate the parameters $x_0$ that affect the evolution of a quantum state under a quantum many-body Hamiltonian $H(x_0)$ [73]. Unfortunately, even the task of computing the dynamics scales exponentially with the system size (number of qubits) when tackled using a classical machine. The idea of Quantum Hamiltonian Learning (QHL) is to enlist the help of a quantum simulator to overcome this problem. The parameters $x_0$ can then be estimated with standard Bayesian methods. Thereby, the quantum simulator is used like a subroutine inside a classical ML approach. The first experimental implementation of this idea was demonstrated in 2017 [74]. In that work, the authors wanted to estimate the parameters of an electron spin in a nitrogen-vacancy center, and they used a quantum simulator on an integrated photonics platform to perform the QHL. Interestingly, not only did the approach lead to a high-quality estimation of the dynamic system parameters, but it also indicated when the initial Hamiltonian model had deficits. In these cases, the learning method informed the user that there are other dynamics in play that have not been considered, which inspires an improvement of the underlying Hamiltonian model.

While the QHL method indicates when the model Hamiltonian is not ideal, it cannot adapt it. To overcome this hurdle and to learn the entire Hamiltonian structure (not only its parameters), the authors of [75] have introduced the idea of a Quantum Model Learning Agent (QMLA). This agent not only finds the parameters of a predefined Hamiltonian, but discovers the whole Hamiltonian that describes the dynamics of a system. The approach iteratively refines the initial Hamiltonian and uses QHL as a subroutine for finding suitable parameter settings. This approach has also been demonstrated in a hybrid quantum system involving a nitrogen-vacancy center. The underlying learning mechanism is very general, and it thus could become a powerful tool for learning the dynamics of unknown quantum systems.

Future Challenges and Opportunities – For adaptive approaches, one needs to consider the trade-off between speed and the sophistication of the approach. In these tasks, the time between measurement and feedback is often very short, so the decision must be taken quickly.

An interesting future direction for advanced quantum metrology is to simultaneously co-design the experimental setup and the feedback strategy, rather than solving these tasks individually.

C. Discovering strategies for hardware-level quantum control

Challenges like quantum computing and quantum simulation are leading to rapidly increasing demands on the efficient and high-fidelity control of quantum systems. Tasks range from the preparation of complex quantum states and the synthesis of unitary gates via suitable control-field pulses all the way up to goals like feedback-based quantum state stabilization and continuously performed error correction. In trying to solve these tasks, the specific capabilities and restrictions of any hardware platform, from superconducting circuits to cavity quantum electrodynamics, need to be considered.

In this section, we will highlight specifically how reinforcement learning has come to help with many of these challenges. In the form of model-free RL, it promises to discover optimal strategies directly on an experiment, which can be treated as a black box, see Fig. 5a. All its unknowns and non-idealities will then be revealed only via its response to the externally imposed control drives. But even when used in a model-based way, using simulations, RL can be more flexible than simpler approaches. In particular, it offers ways to discover feedback strategies, i.e. strategies conditioned on measurement outcomes. These were not previously accessible to the usual numerical optimal control techniques.

The present section is firmly concerned with hardware-level control that is continuous in the time domain, discovering pulse shapes or feedback strategies based on time-continuous noisy measurement traces as they would emerge from weak measurements of quantum devices. There are some connections to the next section, but there we will be concerned with the discovery of protocols, control strategies, and whole experimental setups that are described on a higher level, composed of discrete building blocks like gates or experimental elements.

Quantum control tasks without feedback (open-loop control) – Prior to the application of machine learning techniques in this field, the focus was essentially on tasks without feedback, which were solved by direct optimization techniques, adapting the shape of control pulses applied to the quantum system to maximize some quantity (like the
state fidelity). These direct optimization techniques include gradient-based approaches, with GRAPE [76] and the Krotov method [77, 78] the most prominent examples, as well as approaches that do not rely on access to gradients, such as CRAB [79]. At the time of writing, these techniques still form the default toolbox for the case of open-loop control, even while the first applications of machine learning (described below) are taking hold. Evolutionary algorithms define another class of (stochastic) approaches that have been used successfully to find optimal control sequences [80].

State preparation is the most common quantum control problem, and yet it can already be challenging, especially for multi-qubit settings. In probably the earliest application of RL to quantum physics, pure-state preparation with discrete control pulses was shown using a version of Q-learning for a spin-1/2 system and a three-level system [81]. A few years later, RL-based state preparation was demonstrated for a many-qubit system [82], also using Q-learning and discrete bang-bang type actions, with particular emphasis on analyzing the complexity of the control problem, which shows up in the form of a glassy control landscape. Both of these works used some version of table-based Q-learning, which works well for a restricted number of states and actions. The first work to employ deep (i.e. neural-network-based) RL methods for open-loop control of quantum systems was [83], with both discrete and continuous controls and a recurrent network as an agent, as applied to dynamical decoupling and again state preparation, followed shortly afterwards by [84].

The RL approach can be used successfully to find suitable pulse sequences for unitary gates and optimize for the gate fidelity, as shown first in [85], and analyzed later also in [86].

Recently, deep RL has been applied for the first time to learn control strategies for a real quantum computing experiment [87]. The authors trained on a cloud-based quantum computing platform, collecting data for the current control policy, extracting rewards, and updating the policy. The goal was unitary gate synthesis, and the lack of real-time access to the device was not a concern since the task required only open-loop control. This first demonstration of RL-based quantum control on a real quantum experiment helped to illustrate the possibilities and challenges in this new approach.

Even though open-loop control pulse design means that the actual strategy in the experiment is not conditioned on any measurement outcomes, RL training for such tasks (when done on a computer simulation) may still benefit from the agent receiving input information like the current quantum state. Experience shows that this makes it easier to find a good strategy. Otherwise, only very sparse nominal information like the current time step and possibly the most recently selected action would be fed into the agent. In any case, however, once RL has found a control sequence, it could in principle be stored (e.g. as a waveform or pulse sequence) and sent to an experiment whenever needed. In other words, there is no need for the agent to be running during the actual experiment, which strongly relaxes requirements for the hardware: no real-time control is necessary.

Quantum feedback control (closed-loop control) – The successful control of quantum systems subject to noise, decay and decoherence requires either reservoir engineering (autonomous feedback) or active feedback control. The space of active feedback strategies is exponentially larger than that of open-loop control strategies (i.e. without feedback), owing to the number of potential measurement outcome sequences growing exponentially with time (each such sequence may require a different response). It is here that it is almost inevitable to use the power of RL, particularly deep RL, with its ability to process high-dimensional observations.

The first work to apply deep RL to feedback-based control of quantum systems was [88]. It employed discrete gates for quantum error correction, and we will discuss it in sections III D and III E. State preparation and stabilization in the presence of noise or an uncertain initial state are other natural candidates for RL feedback strategies. Examples include quantum state engineering via feedback [89], as well as control of a quantum particle in an unstable potential [90] and a double-well potential [91]. In some quantum systems, control may be very limited (e.g. only linear manipulations), but measurements can introduce nonlinearity, and their exploitation through RL-based feedback strategies can enable powerful control, as shown in [92]. One challenge for model-free RL as applied to experiments is to make sure rewards can be extracted directly and reliably from experimental measurements, and to use a training procedure that really treats the quantum device as a black box (not relying, e.g., on simulations). These aspects were emphasized in [93], where state preparation in a cavity coupled to a qubit was analyzed.

Model-free vs model-based RL – Applying model-free RL techniques, as described above, has a great advantage: the experimental quantum device can be treated as a black box, and its inner parameters and distortions of the control and measurement signals need not be known a priori. However, this also means that part of the training effort is spent on effectively learning an implicit model of the quantum device since that is the basis for a good control strategy.

In many situations relevant to modern quantum technologies, though, a good model is known since the Hamiltonian and Lindblad dissipation terms have been carefully calibrated. This allows one to consider model-based techniques explicitly, see Fig. 5b. In principle, these can simply consist in applying model-free approaches to an RL environment that is represented by a simulation of the model. However, this is only useful if running the experiment often would be expensive or time-consuming. A more direct approach takes gradients directly through the model dynamics. In the absence of feedback, this is what well-known approaches like GRAPE offer. In an inter-
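To make the idea of table-based Q-learning with bang-bang actions more concrete, the following Python sketch trains a Q-table on a toy spin-1/2 state-preparation task. It is only an illustration under our own assumptions and does not reproduce the setups of [81] or [82]: the state is restricted to cos(θ/2)|0⟩ + sin(θ/2)|1⟩ with θ = kπ/4, each action rotates θ by ±π/4, the episode lasts four steps, and the only reward is the final fidelity with the target |1⟩. All names (`train`, `greedy_rollout`, `N_STEPS`) and hyperparameters are hypothetical.

```python
import math
import random

ACTIONS = (+1, -1)   # bang-bang choice: rotate theta by +pi/4 or -pi/4
N_STEPS = 4          # episode length (assumed for this toy problem)


def fidelity(k):
    """Fidelity with the target |1> for theta = k * pi / 4."""
    return math.sin(k * math.pi / 8) ** 2


def train(episodes=5000, alpha=0.5, gamma=1.0, eps=0.3, seed=0):
    """Tabular Q-learning; table states are (time step, angle index)."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        k = 0  # start in |0>
        for t in range(N_STEPS):
            s = (t, k)
            q.setdefault(s, {a: 0.0 for a in ACTIONS})
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda b: q[s][b])
            k += a
            last = t == N_STEPS - 1
            # sparse reward: fidelity is only evaluated at the final step
            reward = fidelity(k) if last else 0.0
            nxt = q.get((t + 1, k))
            future = max(nxt.values()) if (nxt and not last) else 0.0
            # standard Q-learning update
            q[s][a] += alpha * (reward + gamma * future - q[s][a])
    return q


def greedy_rollout(q):
    """Follow the learned greedy policy and return (actions, final fidelity)."""
    k, seq = 0, []
    for t in range(N_STEPS):
        a = max(ACTIONS, key=lambda b: q[(t, k)][b])
        seq.append(a)
        k += a
    return seq, fidelity(k)
```

Calling `greedy_rollout(train())` returns the learned pulse sequence and its fidelity. The table here has only ten states, which is exactly the regime where a Q-table is sufficient; for many-qubit problems the table would have to be replaced by a function approximator such as a neural network.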
An entirely different approach uses logical artificial intelligence for designing quantum optical experiments [106]. While the credo of the deep learning community is to build large neural networks that can solve arbitrary tasks given enough training examples and compute power, this is not the only way towards "intelligent" algorithms. An alternative is logic AI [107]. Here, the idea is to translate arbitrary problems to Boolean satisfiability expressions and solve them with powerful SAT solvers. In [106], the question of designing quantum experiments has been rephrased into logical expressions and solved with MiniSAT. It is shown that in some problems, a combination of Theseus and the logical approach is faster than the continuous optimization itself. The reason is that the unsatisfiability of candidate solutions is detected quickly with a logical approach, thus guiding the continuous optimizer towards more promising candidates. This approach is in its infancy. Given that the field of logic AI is growing fast due to computational and algorithmic advances, we expect a large increase in interest in this topic.

Deep generative models such as variational autoencoders have become a standard tool in fields such as material design [108]. Here, an encoder network transforms a (potentially discrete) representation into a continuous latent space. The decoder network is trained to take a point in the latent space and translate it back to the discrete structure. The encoder and decoder together are trained to perform an identity transformation, which by itself is not that interesting. However, as an exciting side effect, the system builds up an internal, continuous latent space that can be shaped during the training and used for gradient-based optimization. For the first time, such a system was demonstrated for quantum optics in [109]. The work focuses on understanding what the neural networks have learned and how they store the information in their internal latent space. The structure of the latent space shows surprising discrete structures that were then identified with concrete properties of the experimental setups. It will be interesting to see more advanced ways to investigate, navigate and understand the high-dimensional internal representations of neural networks that are built autonomously during training.

A conceptually related task is the design of superconducting circuits. The quantum behavior of superconducting circuits is defined by a network of inductances, capacitances, and Josephson junctions. As with quantum optical experiments, those systems are conventionally designed by experienced human researchers who aim to find suitable configurations for complex quantum transformations, such as coupling between two well-defined qubits in quantum computers. The search space of possible structures grows exponentially with the number of elements, and thus it quickly becomes infeasible for humans to find solutions for complex tasks. In [110], the authors addressed the question of designing superconducting circuits for the first time with a fully automated closed-loop optimization approach and designed a 4-local coupler by which four superconducting flux qubits interact. The algorithm SCILLA starts with a discrete circuit topology. The best candidates are further parametrically optimized, either with a direct gradient-based optimization or with an evolutionary approach (to avoid local minima). The final design outperforms the only other (hand-crafted) 4-local coupler in terms of noise resilience and coupling strength.

Discovering Quantum Protocols and Discrete Feedback Strategies – Discrete building blocks occur naturally not only in the construction of experiments, but also as part of discrete temporal sequences, specifically sequences of quantum gates and other operations. These sequences can represent quantum protocols or higher-level control strategies for quantum devices. The hardware-level control discussed in the previous section III C could then be considered as a tool to implement the individual building blocks (e.g. an individual gate). In the following we will deal with protocols that also contain elements of feedback or other actions that are not merely unitary gates. The task of quantum circuit synthesis (building up unitaries out of elements) will be discussed further below.

Reinforcement learning is one suitable tool for the automated discovery of such sequences. This was first analyzed in [88], using deep RL, where the goal was to discover a strategy for quantum error correction in a quantum memory register made of a few qubits. This involves applying discrete unitary gates, which are conditioned on the outcomes of measurements, i.e. 'real-time' feedback executed during the control sequence, Fig. 7a. Since the aim of [88] was primarily to find quantum error correction strategies, we will discuss some more aspects separately in the upcoming section III E.

A reinforcement learning technique was subsequently also applied to the rediscovery of implementations for quantum communication protocols [111]. There, the authors set up the task as an RL problem and explain the similarity of quantum communication and RL with the following intuition: A quantum communication protocol is a sequence of operations that leads to the desired outcome. Similarly, an RL agent learns a policy, that is, to perform sequences of operations that maximize a reward function. The authors task the RL agent to rediscover several important quantum communication schemes such as quantum teleportation, quantum state purification, or entanglement swapping. Each of these tasks can be written as a simple network, where the nodes stand for the involved parties and edges indicate classical or quantum correlations between them. Let us take the quantum teleportation protocol as an example (the others follow similar ideas). The environment is a three-node network (the incoming unknown quantum state A, the sender B, and the receiver C). The environment starts with pre-shared entanglement between B and C. The agent now has to find a correct sequence of local measurements and classical communication steps that teleports the quantum state from A to C. After performing up to 50 operations, the transformations are evaluated, and the agent
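For readers who want to see what "evaluating the transformations" could amount to in the teleportation example, here is a small Python sketch. It does not reproduce the environment of [111]; it only checks, with a hand-written three-qubit state-vector simulation, that the textbook teleportation sequence (CNOT from A to B, Hadamard on A, measurement of A and B, conditional X and Z on C) transfers the state of A to C with unit fidelity for every measurement outcome. The function names and the qubit ordering (A is the most significant bit) are our own assumptions.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]


def apply_1q(state, gate, q, n=3):
    """Apply a 2x2 gate to qubit q (qubit 0 is the most significant bit)."""
    shift = n - 1 - q
    new = [0j] * len(state)
    for i, amp in enumerate(state):
        bit = (i >> shift) & 1
        for out in (0, 1):
            j = (i & ~(1 << shift)) | (out << shift)
            new[j] += gate[out][bit] * amp
    return new


def apply_cnot(state, control, target, n=3):
    """Flip the target bit of every basis state whose control bit is 1."""
    cs, ts = n - 1 - control, n - 1 - target
    new = [0j] * len(state)
    for i, amp in enumerate(state):
        j = i ^ (1 << ts) if (i >> cs) & 1 else i
        new[j] += amp
    return new


def project(state, q, outcome, n=3):
    """Project qubit q onto a measurement outcome and renormalize."""
    shift = n - 1 - q
    kept = [a if ((i >> shift) & 1) == outcome else 0j
            for i, a in enumerate(state)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in kept))
    return [a / norm for a in kept]


def teleport_fidelities(alpha, beta):
    """Fidelity of C with alpha|0> + beta|1> for each outcome (m_a, m_b)."""
    # initial state: |psi>_A (x) (|00> + |11>)/sqrt(2) on B and C
    state = [0j] * 8
    for a, amp in ((0, alpha), (1, beta)):
        for bc in (0b00, 0b11):
            state[(a << 2) | bc] = amp * S
    state = apply_cnot(state, 0, 1)   # local operation on A and B
    state = apply_1q(state, H, 0)
    result = {}
    for m_a in (0, 1):
        for m_b in (0, 1):
            post = project(project(state, 0, m_a), 1, m_b)
            # classical communication: C corrects based on (m_a, m_b)
            if m_b:
                post = apply_1q(post, X, 2)
            if m_a:
                post = apply_1q(post, Z, 2)
            base = (m_a << 2) | (m_b << 1)
            c0, c1 = post[base], post[base | 1]
            overlap = complex(alpha).conjugate() * c0 + complex(beta).conjugate() * c1
            result[(m_a, m_b)] = abs(overlap) ** 2
    return result
```

For a normalized input such as `teleport_fidelities(0.6, 0.8j)`, the returned dictionary has one entry per measurement outcome. In an RL setting, a check of this kind could serve as the reward signal once the agent has finished its sequence of operations.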
rect gradient-based optimization of parametrized quantum circuits, different approaches try to avoid the problem of vanishing gradients, by employing reinforcement learning [124], using ML-based prediction of suitable initial parameters (rather than optimizing the parameters directly) [125] or advanced gradient-free approaches that are naturally not susceptible to the barren plateau problem [126].

An interesting recent application of VQA-based systems is the quantum-computer-aided design of quantum hardware [127–129]. As described at the beginning of this chapter, the AI-based design of new quantum hardware on a classical computer has the problem of memory requirements increasing exponentially with the system size. One way to overcome this problem is to outsource the computation of the expensive quantum system to a quantum computer. Here, the problem of designing new multi-qubit couplers for superconducting quantum computers or the design of new quantum optics hardware can be rephrased as a VQA-style problem. A classical AI algorithm changes the parameters of a parameterized quantum circuit to optimize a fidelity function computed from the outcome of the quantum computation. After convergence, a mapping translates the final parametrized quantum circuit into the specific quantum hardware. This approach has been experimentally demonstrated in a proof-of-principle three-qubit superconducting circuit [129].

In general, it is not guaranteed that a direct compilation of an algorithm already yields the most efficient implementation of a quantum circuit. A powerful classical method to simplify (compile) quantum circuits is the ZX formalism [130], which reformulates the circuit into a graph, where predefined rules identify simplifications. However, this and similar approaches have been formulated in a hardware-independent way, operating on a global level. Alternatively, this problem can be approached by RL algorithms [131] that can autonomously simplify circuits, for example, in terms of circuit depth or gate counts, which makes it easy to take concrete hardware constraints into account. In [131] this approach was developed and found superior to simulated annealing (tested for circuits of up to 50 qubits). It has the potential to become an important tool for simplifying quantum circuits in the future.

Future Challenges and Opportunities – The simulation of quantum experiments becomes expensive as soon as the system grows in size. Neural networks could autonomously find approximate predictions for the dynamics of the quantum system. Such supervised systems need a lot of training examples. Thus the trade-off between the creation of training examples and the computational benefit of an approximation needs to be investigated.

The design of new experiments or hardware can be seen not only as optimization (in the sense of making an existing structure better) but also as discovery in which we create new ideas that did not exist before, as shown in [101]. This point of view shows how machines can creatively contribute to science and act as an inspiration for human scientists. There will be a great potential for expanding these ideas. Automatic extraction of understandable building blocks ('subroutines') can help with this challenge.

E. Quantum Error Correction

The ability to correct errors in a quantum computing device will be indispensable to realizing beneficial applications of quantum computation since real-world devices are not coherent enough to run an error-free calculation. The basic conceptual ideas in this domain have been known since the pioneering work of Shor [132] and subsequent developments, most notably the surface code. In any case, the idea is to encode logical qubit information in many physical qubits robustly and redundantly. The presence of errors (like qubit dephasing and decay) must be detected via measurement of so-called syndromes, i.e. suitably chosen observables (often multi-qubit operators). Finally, a good way to interpret the observed syndromes and apply some error correction procedure must be found. Despite the knowledge of good encodings and suitable syndromes, it remains a challenging problem how to best implement those in practice, for a given quantum device, with its available gate set and topology of connections between the qubits, and how to optimize them for a given noise model.

It has been recognized early on that machine learning methods could potentially be of great help in this domain. The tasks can naturally be divided into three categories.

Syndrome interpretation – On the simplest level, we already assume an existing encoding and a fixed set of syndromes. The task then is to find the optimal way to interpret the observed syndrome, e.g. deciding which qubits are likely erroneous and must be corrected, see Fig. 8. This can be phrased as a supervised learning problem, where some errors are simulated, the syndrome is fed into a network, and the network must announce the location of the errors. In practice, the surface code is the most promising QEC architecture, and deducing the error from the syndrome is not trivial, though non-ML algorithms exist. Multiple works therefore trained neural networks to yield "neural decoders" [133–136]. In one early example [133], a modified, restricted Boltzmann machine was used, with two types of visible units, corresponding to syndrome and underlying error configuration. This was then trained on a data set of such pairs. Afterward, the machine could be used to sample the errors compatible with an observed syndrome. It is also possible to use reinforcement learning to discover better strategies in more complicated situations. In [137], this was applied to the surface code, exemplified in a situation with faulty syndrome measurements.

Code search – Going one step further, the question arises whether a machine can also find better codes. It is
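The supervised-learning formulation of syndrome interpretation described above (simulate errors, record the syndrome, learn to predict the error) can be illustrated with a deliberately tiny Python sketch. It is not a neural decoder and does not follow any of [133–137]: we assume a three-qubit bit-flip repetition code with independent flips of probability `p`, and a frequency-count lookup table stands in for the trained network. The names `syndrome` and `train_decoder` are hypothetical.

```python
import random
from collections import Counter, defaultdict


def syndrome(error):
    """Parities of neighbouring qubits (the Z_i Z_{i+1} checks of the code)."""
    return tuple(error[i] ^ error[i + 1] for i in range(len(error) - 1))


def train_decoder(n_qubits=3, p=0.1, n_samples=20000, seed=0):
    """Learn the most frequent error pattern for each observed syndrome."""
    rng = random.Random(seed)
    counts = defaultdict(Counter)
    for _ in range(n_samples):
        # simulate a labelled training pair (error, syndrome)
        error = tuple(int(rng.random() < p) for _ in range(n_qubits))
        counts[syndrome(error)][error] += 1
    # the "trained model": syndrome -> most likely underlying error
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}
```

After `decoder = train_decoder()`, looking up `decoder[s]` for a measured syndrome `s` returns the bit-flip pattern to undo. For the surface code, the number of syndromes is far too large for such a table, which is one motivation for replacing it with a neural network that generalizes from a limited training set.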
to some external signal. The details of how to reach this goal will be left for the computer to discover. This change of perspective will enable a much higher level of description, which is one way to keep ahead of the growing complexity. Ultimately, one might expect that the machine has access to the scientific literature and suggests goals and new experiments autonomously, as demonstrated in material science [140].

Discovering new Algorithms – Rather than discovering experiments or feedback strategies, it will be very interesting to see whether ML agents can autonomously discover other higher-logic quantum programs such as quantum algorithms. This task has recently been tackled by large language models for classical algorithms [141, 142], and we expect that similarly quantum algorithms can be discovered with classical machine learning models.

How can the human learn? – Suppose that computers will be able to help us find solutions for many of the lower-level and even some higher-level, more conceptual tasks in the domain of quantum technologies. That raises the following notorious question, pervasive throughout machine learning and artificial intelligence: How can we human scientists understand what the machine has learned? Do we need to open the black box of neural networks, or can we use the algorithms as a source of inspiration in a different way [143]? We argue that while improved performance in the task at hand is great, being able to understand the essence of what the machine has discovered is crucial for the result to become of much wider applicability. In general, gaining understanding has been called the essential aim of science [144]. Here, approaches where the solution involves discrete steps (e.g. discrete actions of an agent) or logic-based AI seem to be easier to interpret, explain and understand than results from deep-learning-based methods. The field of symbolic regression (which extracts discrete explanations of neural network predictions) might be very fruitful in this approach.

What needs to be done? – To attain the visions described above, our community may adopt some proven methodologies from other areas. The idea of fair benchmarks and competitions is one of the powerful driving forces in the development of ML algorithms. One of the most famous examples is the ImageNet data set, which provided the basis for a revolution of ML-based computer vision systems [145]. This idea was adapted in other fields of science that apply AI methodologies, such as material discovery [146, 147]. In contrast, the field of AI in quantum technology, at the moment, appears more like the wild west. There is no clear way to compare approaches from different papers, because most works apply their approaches to slightly different tasks, making them incomparable. We believe that fair and suitably curated benchmark data sets will steer the development of powerful and ever more generally applicable AI algorithms in quantum technology. The data sets could consist of simulated or (in the best case) experimental data for data interpretation tasks. Likewise, to facilitate the discovery of experimental setups and protocols, the community can develop a selection of well-curated objective functions and a set of simulated environments describing important prototypical quantum devices (see SciGym for a first attempt at this [148]). In a similar direction, we expect that cloud access to real quantum experiments will become available for significantly more systems. AI algorithms can then be trained on the data from these real machines with specific experimental constraints (such as connectivity or noise). This will boost the capabilities of algorithms that deal with important, real-world systems.

Finally, what Alan Turing remarked in his visionary article on intelligence and learning machines [149] is also valid here, in the field of machine learning applied to quantum technologies: "We can only see a short distance ahead, but we can see plenty there that needs to be done."

V. ACKNOWLEDGMENTS

The authors thank Sören Arlt and Xuemei Gu for helpful comments on the manuscript. F.M. acknowledges funding by the Munich Quantum Valley, which is supported by the Bavarian state government with funds from the Hightech Agenda Bayern Plus.
Nature Physics 15, 887 (2019).
[46] G. Torlai, G. Mazzola, J. Carrasquilla, M. Troyer, R. Melko, and G. Carleo, Neural-network quantum state tomography, Nature Physics 14, 447 (2018).
[47] G. Torlai, B. Timar, E. P. L. van Nieuwenburg, H. Levine, A. Omran, A. Keesling, H. Bernien, M. Greiner, V. Vuletić, M. D. Lukin, et al., Integrating Neural Networks with a Quantum Simulator for State Reconstruction, Physical Review Letters 123, 230504 (2019).
[48] M. Schuld, I. Sinayskiy, and F. Petruccione, Neural Networks Take on Open Quantum Systems, Physics 12, 74 (2019).
[49] A. Nagy and V. Savona, Variational Quantum Monte Carlo Method with a Neural-Network Ansatz for Open Quantum Systems, Physical Review Letters 122, 250501 (2019).
[50] M. J. Hartmann and G. Carleo, Neural-Network Approach to Dissipative Quantum Many-Body Dynamics, Physical Review Letters 122, 250502 (2019).
[51] F. Vicentini, A. Biella, N. Regnault, and C. Ciuti, Variational Neural-Network Ansatz for Steady States in Open Quantum Systems, Physical Review Letters 122, 250503 (2019).
[52] N. Yoshioka and R. Hamazaki, Constructing neural stationary states for open quantum many-body systems, Physical Review B 99, 214306 (2019).
[53] T. Adler, M. Erhard, M. Krenn, J. Brandstetter, J. Kofler, and S. Hochreiter, Quantum Optical Experiments Modeled by Long Short-Term Memory, Photonics 8, 535 (2021).
[54] N. Mohseni, T. Fösel, L. Guo, C. Navarrete-Benlloch, and F. Marquardt, Deep Learning of Quantum Many-Body Dynamics via Random Driving, Quantum 6, 714 (2022).
[55] M. Choi, D. Flam-Shepherd, T. H. Kyaw, and A. Aspuru-Guzik, Learning quantum dynamics with latent neural ordinary differential equations, Physical Review A 105, 042403 (2022).
[56] D. T. Lennon, H. Moon, L. C. Camenzind, L. Yu, D. M. Zumbühl, G. A. D. Briggs, M. A. Osborne, E. A. Laird, and N. Ares, Efficiently measuring a quantum device using machine learning, npj Quantum Information 5, 79 (2019).
[57] E. Polino, M. Valeri, N. Spagnolo, and F. Sciarrino, Photonic quantum metrology, AVS Quantum Science 2, 024703 (2020).
[58] V. Cimini, I. Gianani, N. Spagnolo, F. Leccese, F. Sciarrino, and M. Barbieri, Calibration of Quantum Sensors by Neural Networks, Physical Review Letters 123, 230502 (2019).
[59] P. A. Knott, A search algorithm for quantum state engineering and metrology, New Journal of Physics 18, 073033 (2016).
[60] R. Nichols, L. Mineh, J. Rubio, J. C. F. Matthews, and P. A. Knott, Designing quantum experiments with a genetic algorithm, Quantum Science and Technology 4, 045012 (2019).
[61] D. W. Berry and H. M. Wiseman, Optimal States and Almost Optimal Adaptive Measurements for Quantum Interferometry, Physical Review Letters 85, 5098 (2000).
[62] D. W. Berry, H. Wiseman, and J. Breslin, Optimal input states and feedback for interferometric phase estimation, Physical Review A 63, 053804 (2001).
[63] B. L. Higgins, D. W. Berry, S. D. Bartlett, H. M. Wiseman, and G. J. Pryde, Entanglement-free Heisenberg-limited phase estimation, Nature 450, 393 (2007).
[64] A. Hentschel and B. C. Sanders, Machine Learning for Precise Quantum Measurement, Physical Review Letters 104, 063603 (2010).
[65] A. Hentschel and B. C. Sanders, Efficient Algorithm for Optimizing Adaptive Quantum Metrology Processes, Physical Review Letters 107, 233601 (2011).
[66] N. B. Lovett, C. Crosnier, M. Perarnau-Llobet, and B. C. Sanders, Differential Evolution for Many-Particle Adaptive Quantum Metrology, Physical Review Letters 110, 220501 (2013).
[67] P. Palittapongarnpim, P. Wittek, E. Zahedinejad, S. Vedaie, and B. C. Sanders, Learning in quantum control: High-dimensional global optimization for noisy quantum dynamics, Neurocomputing 268, 116 (2017).
[68] H. Xu, J. Li, L. Liu, Y. Wang, H. Yuan, and X. Wang, Generalizable control for quantum parameter estimation through reinforcement learning, npj Quantum Information 5, 82 (2019).
[69] J. Schuff, L. J. Fiderer, and D. Braun, Improving the dynamics of quantum sensors with reinforcement learning, New Journal of Physics 22, 035001 (2020).
[70] E. G. Ryan, C. C. Drovandi, J. M. McGree, and A. N. Pettitt, A Review of Modern Computational Algorithms for Bayesian Optimal Design, International Statistical Review 84, 128 (2016).
[71] V. Gebhart, R. Santagati, A. A. Gentile, E. Gauger, D. Craig, N. Ares, L. Banchi, F. Marquardt, L. Pezze, and C. Bonato, Learning Quantum Systems, arXiv:2207.00298 (2022).
[72] R. Durrer, B. Kratochwil, J. V. Koski, A. J. Landig, C. Reichl, W. Wegscheider, T. Ihn, and E. Greplova, Automated Tuning of Double Quantum Dots into Specific Charge States Using Neural Networks, Physical Review Applied 13, 054019 (2020).
[73] N. Wiebe, C. Granade, C. Ferrie, and D. G. Cory, Hamiltonian Learning and Certification Using Quantum Resources, Physical Review Letters 112, 190501 (2014).
[74] J. Wang, S. Paesani, R. Santagati, S. Knauer, A. A. Gentile, N. Wiebe, M. Petruzzella, J. L. O’Brien, J. G. Rarity, A. Laing, et al., Experimental quantum Hamiltonian learning, Nature Physics 13, 551 (2017).
[75] A. A. Gentile, B. Flynn, S. Knauer, N. Wiebe, S. Paesani, C. E. Granade, J. G. Rarity, R. Santagati, and A. Laing, Learning models of quantum systems from experiments, Nature Physics 17, 837 (2021).
[76] N. Khaneja, T. Reiss, C. Kehlet, T. Schulte-Herbrüggen, and S. J. Glaser, Optimal control of coupled spin dynamics: design of NMR pulse sequences by gradient ascent algorithms, Journal of Magnetic Resonance 172, 296 (2005).
[77] V. F. Krotov, Global methods in optimal control theory, Monographs and textbooks in pure and applied mathematics, Vol. 195 (Marcel Dekker, New York, 1996).
[78] J. Somlói, V. A. Kazakov, and D. J. Tannor, Controlled dissociation of I2 via optical transitions between the X and B electronic states, Chemical Physics 172, 85 (1993).
[79] P. Doria, T. Calarco, and S. Montangero, Optimal Control Technique for Many-Body Quantum Dynamics, Physical Review Letters 106, 190501 (2011).
[80] R. S. Judson and H. Rabitz, Teaching lasers to control molecules, Physical Review Letters 68, 1500 (1992).
[81] C. Chen, D. Dong, H.-X. Li, J. Chu, and T.-J. Tarn, Fidelity-Based Probabilistic Q-Learning for Control of Quantum Systems, IEEE Transactions on Neural Networks and Learning Systems 25, 920 (2013).
[82] M. Bukov, A. G. R. Day, D. Sels, P. Weinberg, A. Polkovnikov, and P. Mehta, Reinforcement Learning in Different Phases of Quantum Control, Physical Review X 8, 031086 (2018).
[83] M. August and J. M. Hernández-Lobato, Taking Gradients Through Experiments: LSTMs and Memory Proximal Policy Optimization for Black-Box Quantum Control, in High Performance Computing, edited by R. Yokota, M. Weiland, J. Shalf, and S. Alam (Springer International Publishing, Cham, 2018) pp. 591–613.
[84] X.-M. Zhang, Z.-W. Cui, X. Wang, and M.-H. Yung, Automatic spin-chain learning to explore the quantum speed limit, Physical Review A 97, 052333 (2018).
[85] M. Y. Niu, S. Boixo, V. N. Smelyanskiy, and H. Neven, Universal quantum control through deep reinforcement learning, npj Quantum Information 5, 33 (2019).
[86] Z. An and D. L. Zhou, Deep reinforcement learning for quantum gate control, Europhysics Letters 126, 60002 (2019).
[87] Y. Baum, M. Amico, S. Howell, M. Hush, M. Liuzzi, P. Mundada, T. Merkh, A. R. R. Carvalho, and M. J. Biercuk, Experimental Deep Reinforcement Learning for Error-Robust Gate-Set Design on a Superconducting Quantum Computer, PRX Quantum 2, 040324 (2021).
[88] T. Fösel, P. Tighineanu, T. Weiss, and F. Marquardt, Reinforcement Learning with Neural Networks for Quantum Feedback, Physical Review X 8, 031084 (2018).
[89] J. Mackeprang, D. B. R. Dasari, and J. Wrachtrup, A reinforcement learning approach for quantum state engineering, Quantum Machine Intelligence 2, 5 (2020).
[90] Z. T. Wang, Y. Ashida, and M. Ueda, Deep Reinforcement Learning Control of Quantum Cartpoles, Physical Review Letters 125, 100401 (2020).
[91] S. Borah, B. Sarma, M. Kewming, G. J. Milburn, and J. Twamley, Measurement-Based Feedback Quantum Control with Deep Reinforcement Learning for a Double-Well Nonlinear Potential, Physical Review Letters 127, 190403 (2021).
[92] R. Porotti, A. Essig, B. Huard, and F. Marquardt, Deep Reinforcement Learning for Quantum State Preparation with Weak Nonlinear Measurements, Quantum 6, 747 (2022).
[93] V. V. Sivak, A. Eickbusch, H. Liu, B. Royer, I. Tsioutsios, and M. H. Devoret, Model-Free Quantum Control with Reinforcement Learning, Physical Review X 12, 011059 (2022).
[94] N. Leung, M. Abdelhafez, J. Koch, and D. Schuster, Speedup for quantum optimal control from automatic differentiation based on graphics processing units, Physical Review A 95, 042318 (2017).
[95] M. Abdelhafez, D. I. Schuster, and J. Koch, Gradient-based optimal control of open quantum systems using quantum trajectories and automatic differentiation, Physical Review A 99, 052327 (2019).
[96] M. Abdelhafez, B. Baker, A. Gyenis, P. Mundada, A. A. Houck, D. Schuster, and J. Koch, Universal gates for protected superconducting qubits using optimal control, Physical Review A 101, 022321 (2020).
[97] F. Schäfer, M. Kloc, C. Bruder, and N. Lörch, A differentiable programming method for quantum control, Machine Learning: Science and Technology 1, 035009 (2020).
[98] L. Coopmans, D. Luo, G. Kells, B. K. Clark, and J. Carrasquilla, Protocol Discovery for the Quantum Control of Majoranas by Differentiable Programming and Natural Evolution Strategies, PRX Quantum 2, 020332 (2021).
[99] F. Schäfer, P. Sekatski, M. Koppenhöfer, C. Bruder, and M. Kloc, Control of stochastic quantum dynamics by differentiable programming, Machine Learning: Science and Technology 2, 035004 (2021).
[100] R. Porotti, V. Peano, and F. Marquardt, Gradient Ascent Pulse Engineering with Feedback, arXiv:2203.04271 (2022).
[101] M. Krenn, J. S. Kottmann, N. Tischler, and A. Aspuru-Guzik, Conceptual Understanding through Efficient Automated Design of Quantum Optical Experiments, Physical Review X 11, 031044 (2021).
[102] M. Krenn, M. Malik, R. Fickler, R. Lapkiewicz, and A. Zeilinger, Automated Search for new Quantum Experiments, Physical Review Letters 116, 090405 (2016).
[103] M. Krenn, M. Erhard, and A. Zeilinger, Computer-inspired quantum experiments, Nature Reviews Physics 2, 649 (2020).
[104] X. Gao, M. Erhard, A. Zeilinger, and M. Krenn, Computer-Inspired Concept for High-Dimensional Multipartite Quantum Gates, Physical Review Letters 125, 050501 (2020).
[105] A. A. Melnikov, H. P. Nautrup, M. Krenn, V. Dunjko, M. Tiersch, A. Zeilinger, and H. J. Briegel, Active learning machine learns to create new quantum experiments, Proceedings of the National Academy of Sciences 115, 1221 (2018).
[106] A. Cervera-Lierta, M. Krenn, and A. Aspuru-Guzik, Design of quantum optical experiments with logic artificial intelligence, arXiv:2109.13273 (2021).
[107] M. J. H. Heule and O. Kullmann, The science of brute force, Communications of the ACM 60, 70 (2017).
[108] B. Sanchez-Lengeling and A. Aspuru-Guzik, Inverse molecular design using machine learning: Generative models for matter engineering, Science 361, 360 (2018).
[109] D. Flam-Shepherd, T. C. Wu, X. Gu, A. Cervera-Lierta, M. Krenn, and A. Aspuru-Guzik, Learning interpretable representations of entanglement in quantum optics experiments using deep generative models, Nature Machine Intelligence 4, 544 (2022).
[110] T. Menke, F. Häse, S. Gustavsson, A. J. Kerman, W. D. Oliver, and A. Aspuru-Guzik, Automated design of superconducting circuits and its application to 4-local couplers, npj Quantum Information 7, 49 (2021).
[111] J. Wallnöfer, A. A. Melnikov, W. Dür, and H. J. Briegel, Machine Learning for Long-Distance Quantum Communication, PRX Quantum 1, 010301 (2020).
[112] C. P. Williams and A. G. Gray, Automated Design of Quantum Circuits, in Quantum Computing and Quantum Communications, edited by C. P. Williams (Springer Berlin, Heidelberg, 1999) pp. 113–125.
[113] R. B. McDonald and H. G. Katzgraber, Genetic braid optimization: A heuristic approach to compute quasiparticle braids, Physical Review B 87, 054414 (2013).
[114] U. Las Heras, U. Alvarez-Rodriguez, E. Solano, and M. Sanz, Genetic Algorithms for Digital Quantum Simulations, Physical Review Letters 116, 230504 (2016).
[115] Y.-H. Zhang, P.-L. Zheng, Y. Zhang, and D.-L. Deng, Topological Quantum Compiling with Reinforcement Learning, Physical Review Letters 125, 170501 (2020).
[116] L. Moro, M. G. A. Paris, M. Restelli, and E. Prati, Quantum compiling by deep reinforcement learning, Communications Physics 4, 178 (2021).
[117] L. Cincio, Y. Subaşı, A. T. Sornborger, and P. J. Coles, Learning the quantum algorithm for state overlap, New Journal of Physics 20, 113022 (2018).
[118] L. Cincio, K. Rudinger, M. Sarovar, and P. J. Coles, Machine Learning of Noise-Resilient Quantum Circuits, PRX Quantum 2, 010324 (2021).
[119] J. Yao, P. Kottering, H. Gundlach, L. Lin, and M. Bukov, Noise-Robust End-to-End Quantum Control using Deep Autoregressive Policy Networks, in Proceedings of the 2nd Mathematical and Scientific Machine Learning Conference, Proceedings of Machine Learning Research, Vol. 145, edited by J. Bruna, J. Hesthaven, and L. Zdeborova (PMLR, 2022) pp. 1044–1081.
[120] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, Barren plateaus in quantum neural network training landscapes, Nature Communications 9, 4812 (2018).
[121] S. Sim, P. D. Johnson, and A. Aspuru-Guzik, Expressibility and Entangling Capability of Parameterized Quantum Circuits for Hybrid Quantum-Classical Algorithms, Advanced Quantum Technologies 2, 1900070 (2019).
[122] M. Ostaszewski, L. M. Trenkwalder, W. Masarczyk, E. Scerri, and V. Dunjko, Reinforcement learning for optimization of variational quantum circuit architectures, Advances in Neural Information Processing Systems, 34, 18182 (2021).
[123] S.-X. Zhang, C.-Y. Hsieh, S. Zhang, and H. Yao, Neural predictor based quantum architecture search, Machine Learning: Science and Technology 2, 045027 (2021).
[124] J. Yao, L. Lin, and M. Bukov, Reinforcement Learning for Many-Body Ground-State Preparation Inspired by Counterdiabatic Driving, Physical Review X 11, 031070 (2021).
[125] G. Verdon, M. Broughton, J. R. McClean, K. J. Sung, R. Babbush, Z. Jiang, H. Neven, and M. Mohseni, Learning to learn with quantum neural networks via classical neural networks, arXiv:1907.05415 (2019).
[126] A. Anand, M. Degroote, and A. Aspuru-Guzik, Natural evolutionary strategies for variational quantum computation, Machine Learning: Science and Technology 2, 045012 (2021).
[127] T. H. Kyaw, T. Menke, S. Sim, A. Anand, N. P. Sawaya, W. D. Oliver, G. G. Guerreschi, and A. Aspuru-Guzik, Quantum Computer-Aided Design: Digital Quantum Simulation of Quantum Processors, Physical Review Applied 16, 044042 (2021).
[128] J. S. Kottmann, M. Krenn, T. H. Kyaw, S. Alperin-Lea, and A. Aspuru-Guzik, Quantum computer-aided design of quantum optics hardware, Quantum Science and Technology 6, 035010 (2021).
[129] F.-M. Liu, M.-C. Chen, C. Wang, S.-W. Li, Z.-X. Shang, C. Ying, J.-W. Wang, C.-Z. Peng, X. Zhu, C.-Y. Lu, et al., Quantum Design for Advanced Qubits, arXiv:2109.00994 (2021).
[130] R. Duncan, A. Kissinger, S. Perdrix, and J. van de Wetering, Graph-theoretic Simplification of Quantum Circuits with the ZX-calculus, Quantum 4, 279 (2020).
[131] T. Fösel, M. Y. Niu, F. Marquardt, and L. Li, Quantum circuit optimization with deep reinforcement learning, arXiv:2103.07585 (2021).
[132] P. W. Shor, Scheme for reducing decoherence in quantum computer memory, Physical Review A 52, R2493 (1995).
[133] G. Torlai and R. G. Melko, Neural Decoder for Topological Codes, Physical Review Letters 119, 030501 (2017).
[134] S. Krastanov and L. Jiang, Deep Neural Network Probabilistic Decoder for Stabilizer Codes, Scientific Reports 7, 11003 (2017).
[135] S. Varsamopoulos, B. Criger, and K. Bertels, Decoding small surface codes with feedforward neural networks, Quantum Science and Technology 3, 015004 (2017).
[136] P. Baireuther, T. E. O’Brien, B. Tarasinski, and C. W. J. Beenakker, Machine-learning-assisted correction of correlated qubit errors in a topological code, Quantum 2, 48 (2018).
[137] R. Sweke, M. S. Kesselring, E. P. L. van Nieuwenburg, and J. Eisert, Reinforcement learning decoders for fault-tolerant quantum computation, Machine Learning: Science and Technology 2, 025005 (2021).
[138] H. P. Nautrup, N. Delfosse, V. Dunjko, H. J. Briegel, and N. Friis, Optimizing Quantum Error Correction Codes with Reinforcement Learning, Quantum 3, 215 (2019).
[139] Z. Wang, T. Rajabzadeh, N. Lee, and A. H. Safavi-Naeini, Automated Discovery of Autonomous Quantum Error Correction Schemes, PRX Quantum 3, 020302 (2022).
[140] V. Tshitoyan, J. Dagdelen, L. Weston, A. Dunn, Z. Rong, O. Kononova, K. A. Persson, G. Ceder, and A. Jain, Unsupervised word embeddings capture latent knowledge from materials science literature, Nature 571, 95 (2019).
[141] S. d’Ascoli, P.-A. Kamienny, G. Lample, and F. Charton, Deep Symbolic Regression for Recurrent Sequences, arXiv:2201.04600 (2022).
[142] P. Veličković, A. P. Badia, D. Budden, R. Pascanu, A. Banino, M. Dashevskiy, R. Hadsell, and C. Blundell, The CLRS Algorithmic Reasoning Benchmark, arXiv:2205.15659 (2022).
[143] M. Krenn, R. Pollice, S. Y. Guo, M. Aldeghi, A. Cervera-Lierta, P. Friederich, G. d. P. Gomes, F. Häse, A. Jinich, A. Nigam, et al., On scientific understanding with artificial intelligence, arXiv:2204.01467 (2022).
[144] H. W. de Regt, Understanding Scientific Understanding (Oxford University Press, 2017).
[145] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, Imagenet: A large-scale hierarchical image database, in 2009 IEEE Conference on Computer Vision and Pattern Recognition (2009) pp. 248–255.
[146] N. Brown, M. Fiscato, M. H. S. Segler, and A. C. Vaucher, GuacaMol: Benchmarking Models for de Novo Molecular Design, Journal of Chemical Information and Modeling 59, 1096 (2019).
[147] D. Polykovskiy, A. Zhebrak, B. Sanchez-Lengeling, S. Golovanov, O. Tatanov, S. Belyaev, R. Kurbanov, A. Artamonov, V. Aladinskiy, M. Veselov, et al., Molecular sets (MOSES): a benchmarking platform for molecular generation models, Frontiers in Pharmacology 11, 565644 (2020).
[148] Reinforcement learning for science, https://www.scigym.net/, accessed: 2022-08-08.
[149] A. Turing, Computing Machinery and Intelligence, Mind LIX, 433 (1950).