
Machine learning and the physical sciences

Giuseppe Carleo
Center for Computational Quantum Physics,
Flatiron Institute,
162 5th Avenue,
New York, NY 10010,
USA∗

Ignacio Cirac
Max-Planck-Institut für Quantenoptik,
Hans-Kopfermann-Straße 1,
D-85748 Garching,
Germany

Kyle Cranmer
Center for Cosmology and Particle Physics,
Center of Data Science,
New York University, 726 Broadway,
New York, NY 10003,
USA

Laurent Daudet
LightOn, 2 rue de la Bourse, F-75002 Paris,
France

Maria Schuld
University of KwaZulu-Natal,
Durban 4000, South Africa
National Institute for Theoretical Physics,
KwaZulu-Natal, Durban 4000, South Africa,
and Xanadu Quantum Computing,
777 Bay Street, M5B 2H7 Toronto,
Canada

Naftali Tishby
The Hebrew University of Jerusalem,
Edmond Safra Campus,
Jerusalem 91904,
Israel

Leslie Vogt-Maranto
Department of Chemistry,
New York University,
New York, NY 10003,
USA

Lenka Zdeborová
Institut de physique théorique,
Université Paris Saclay, CNRS, CEA,
F-91191 Gif-sur-Yvette,
France†

Machine learning encompasses a broad range of algorithms and modeling tools used for a vast array of data processing tasks, and it has entered most scientific disciplines in recent years. We review in a selective way the recent research on the interface between machine learning and the physical sciences. This includes conceptual developments in machine learning (ML) motivated by physical insights, applications of machine learning techniques to several domains in physics, and cross-fertilization between the two fields. After giving basic notions of machine learning methods and principles, we describe examples of how statistical physics is used to understand methods in ML. We then describe applications of ML methods in particle physics and cosmology, quantum many-body physics, quantum computing, and chemical and material physics. We also highlight research and development into novel computing architectures aimed at accelerating ML. In each of the sections we describe recent successes as well as domain-specific methodology and challenges.

CONTENTS

I. Introduction
   A. Concepts in machine learning
      1. Supervised learning and neural networks
      2. Unsupervised learning and generative modelling
      3. Reinforcement learning

II. Statistical Physics
   A. Historical note
   B. Theoretical puzzles in deep learning
   C. Statistical physics of unsupervised learning
      1. Contributions to understanding basic unsupervised methods
      2. Restricted Boltzmann machines
      3. Modern unsupervised and generative modelling
   D. Statistical physics of supervised learning
      1. Perceptron and GLMs
      2. Physics results on multi-layer neural networks
      3. Information Bottleneck
      4. Landscapes and glassiness of deep learning
   E. Applications of ML in Statistical Physics
   F. Outlook and Challenges

III. Particle Physics and Cosmology
   A. The role of the simulation
   B. Classification and regression in particle physics
      1. Jet Physics
      2. Neutrino physics
      3. Robustness to systematic uncertainties
      4. Triggering
      5. Theoretical particle physics
   C. Classification and regression in cosmology
      1. Photometric Redshift
      2. Gravitational lens finding and parameter estimation
      3. Other examples
   D. Inverse Problems and Likelihood-free inference
      1. Likelihood-free Inference
      2. Examples in particle physics
      3. Examples in Cosmology
   E. Generative Models
   F. Outlook and Challenges

IV. Many-Body Quantum Matter
   A. Neural-Network quantum states
      1. Representation theory
      2. Learning from data
      3. Variational Learning
   B. Speed up many-body simulations
   C. Classifying many-body quantum phases
      1. Synthetic data
      2. Experimental data
   D. Tensor networks for machine learning
   E. Outlook and Challenges

V. Quantum computing
   A. Quantum state tomography
   B. Controlling and preparing qubits
   C. Error correction

VI. Chemistry and Materials
   A. Energies and forces based on atomic environments
   B. Potential and free energy surfaces
   C. Materials properties
   D. Electron densities for density functional theory
   E. Data set generation
   F. Outlook and Challenges

VII. AI acceleration with classical and quantum hardware
   A. Beyond von Neumann architectures
   B. Neural networks running on light
   C. Revealing features in data
   D. Quantum-enhanced machine learning
   E. Outlook and Challenges

VIII. Conclusions and Outlook

Acknowledgements

References

I. INTRODUCTION

The past decade has seen a prodigious rise of machine-learning (ML) based techniques, impacting many areas in industry including autonomous driving, health-care, finance, manufacturing, energy harvesting, and more. ML is largely perceived as one of the main disruptive technologies of our age, much as computers were in the 1980s and 1990s. The general goal of ML is to recognize patterns in data, which inform the way unseen problems are treated. For example, in a highly complex system such as a self-driving car, vast amounts of data coming from sensors have to be turned into decisions of how to control the car by a computer that has "learned" to recognize the pattern of "danger".

The success of ML in recent times has been marked at first by significant improvements on some existing technologies, for example in the field of image recognition. To a large extent, these advances constituted the first demonstrations of the impact that ML methods can have in specialized tasks. More recently, applications traditionally inaccessible to automated software have been successfully enabled, in particular by deep learning technology. The demonstration of reinforcement learning techniques in game playing, for example, has had a deep impact on the perception that the whole field was moving a step closer to what is expected from a general artificial intelligence.

In parallel to the rise of ML techniques in industrial applications, scientists have increasingly become interested in the potential of ML for fundamental research, and physics is no exception. To some extent, this is not too surprising, since both ML and physics share some of their methods as well as goals. The two disciplines are both concerned with the process of gathering and analyzing data to design models that can predict the behaviour of complex systems. However, the fields prominently differ in the way their fundamental goals are realized. On the one hand, physicists want to understand the mechanisms of Nature, and are proud of using their own knowledge, intelligence and intuition to inform their models. On the other hand, machine learning mostly does the opposite: models are agnostic and the machine provides the "intelligence" by extracting it from data. Although often powerful, the resulting models are notoriously known to be as opaque to our understanding as the data patterns themselves. Machine learning tools in physics are therefore welcomed enthusiastically by some, while being eyed with suspicion by others. What is difficult to deny is that they produce surprisingly good results in some cases.

In this review, we attempt to provide a coherent, selected account of the diverse intersections of ML with physics. Specifically, we look at an ample spectrum of fields (ranging from statistical and quantum physics to high energy and cosmology) where ML recently made a prominent appearance, and discuss potential applications and challenges of "intelligent" data mining techniques in the different contexts. We start this review with the field of statistical physics in Section II, where the interaction with machine learning has a long history, drawing on methods in physics to provide a better understanding of problems in machine learning. We then turn the wheel in the other direction, using machine learning for physics. Section III treats progress in the fields of high-energy physics and cosmology, Section IV reviews how ML ideas are helping to understand the mysteries of many-body quantum systems, Section V briefly explores the promises of machine learning within quantum computing, and in Section VI we highlight some of the amazing advances in computational chemistry and materials design due to ML applications. In Section VII we discuss some advances in instrumentation leading potentially to hardware adapted to perform machine learning tasks. We conclude with an outlook in Section VIII.

∗ Corresponding author: gcarleo@flatironinstitute.org
† Corresponding author: lenka.zdeborova@cea.fr
A. Concepts in machine learning

For the purpose of this review we will briefly explain some fundamental terms and concepts used in machine learning. For further reading, we recommend a few resources, some of which have been targeted especially at a physics audience. For a historical overview of the development of the field we recommend Refs. (LeCun et al., 2015; Schmidhuber, 2014). An excellent recent introduction to machine learning for physicists is Ref. (Mehta et al., 2018), which includes notebooks with practical demonstrations. A very useful online resource is Florian Marquardt's course "Machine learning for physicists" (see https://machine-learning-for-physicists.org/). Useful textbooks written by machine learning researchers are Christopher Bishop's standard textbook (Bishop, 2006), as well as (Goodfellow et al., 2016), which focuses on the theory and foundations of deep learning and covers many aspects of current-day research. A variety of online tutorials and lectures are useful to get a basic overview and get started on the topic.

To learn about the theoretical progress made in the statistical physics of neural networks in the 1980s-1990s we recommend the rather accessible book Statistical Mechanics of Learning (Engel and Van den Broeck, 2001). For the details of the replica method and its use in computer science, information theory and machine learning we recommend the book of Nishimori (Nishimori, 2001). For the more recent statistical physics methodology the textbook of Mézard and Montanari is an excellent reference (Mézard and Montanari, 2009).

To get a basic idea of the type of problems that machine learning is able to tackle it is useful to define three large classes of learning problems: supervised learning, unsupervised learning and reinforcement learning. This will also allow us to state the basic terminology, building up the basic equipment needed to expose some of the fundamental tools of machine learning.

1. Supervised learning and neural networks

In supervised learning we are given a set of n samples of data; let us denote one such sample Xµ ∈ Rp, with µ = 1, . . . , n. To have something concrete in mind, each Xµ could be for instance a black-and-white photograph of an animal, and p the number of pixels. For each sample Xµ we are further given a label yµ ∈ Rd, most commonly d = 1. The label could encode for instance the species of the animal on the photograph. The goal of supervised learning is to find a function f so that when a new sample Xnew is presented without its label, the output of the function f(Xnew) approximates the label well. The data set {Xµ, yµ}µ=1,...,n is called the training set. In order to test the resulting function f one usually splits the available data samples into the training set used to learn the function and a test set to evaluate the performance.

Let us now describe the training procedure most commonly used to find a suitable function f. Most commonly the function is expressed in terms of a set of parameters, called weights w ∈ Rk, leading to fw. One then constructs a so-called loss function L[fw(Xµ), yµ] for each sample µ, with the idea of this loss being small when fw(Xµ) and yµ are close, and vice versa. The average of the loss over the training set is then called the empirical risk, R(fw) = Σ_{µ=1}^{n} L[fw(Xµ), yµ]/n.

During the training procedure the weights w are adjusted in order to minimize the empirical risk. The training error measures how well such a minimization is achieved. The most important notion of error is the generalization error, related to the performance in predicting labels ynew for data samples Xnew that were not seen in the training set. In applications, it is common practice to build the test set by randomly picking a fraction of the available data, and to perform the training using the remaining fraction as a training set. We note that in a part of the literature the generalization error is the difference between the performance on the test set and the one on the training set.

The algorithms most commonly used to minimize the empirical risk function over the weights are based on gradient descent with respect to the weights w. This means that the weights are iteratively adjusted in the direction of the gradient of the empirical risk

wt+1 = wt − γ∇w R(fw).    (1)

The rate γ at which this is performed is called the learning rate. A very commonly used and successful variant of gradient descent is stochastic gradient descent (SGD), where the full empirical risk function R is replaced by the contribution of just a few of the samples. This subset of samples is called a mini-batch and can be as small as a single sample. In physics terms, the SGD algorithm is often compared to the Langevin dynamics at finite temperature. Langevin dynamics at zero temperature is the gradient descent. Positive temperature introduces a thermal noise that is in certain ways similar to the noise arising in SGD, but different in others. There are many variants of the SGD algorithm used in practice. The initialization of the weights can change performance in practice, as can the choice of the learning rate and a variety of so-called regularization terms, such as weight decay that penalizes weights that tend to converge to large absolute values. The choice of the right version of the algorithm is important; there are many heuristic rules of thumb, and certainly more theoretical insight into the question would be desirable.
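To make these notions concrete, a minimal sketch of mini-batch SGD for a linear model with a squared loss might look as follows; the data, learning rate, batch size and all other choices below are illustrative toy choices, not prescriptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy supervised data: n samples X_mu in R^p with labels from a noisy linear rule.
n, p = 1000, 20
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + 0.1 * rng.normal(size=n)

def empirical_risk(w, X, y):
    """R(f_w): average squared loss over the given set of samples."""
    return np.mean((X @ w - y) ** 2)

def sgd(X, y, lr=0.05, batch_size=32, epochs=50):
    """Mini-batch stochastic gradient descent on the empirical risk, eq. (1)."""
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            # Gradient of the mean squared loss on the mini-batch.
            grad = 2 * X[idx].T @ (X[idx] @ w - y[idx]) / len(idx)
            w -= lr * grad  # step with learning rate gamma
    return w

w_hat = sgd(X, y)
print("training error:", empirical_risk(w_hat, X, y))
```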
One typical example of a task in supervised learning is classification, that is, when the labels yµ take values in a discrete set; the so-called accuracy is then measured as the fraction of times the learned function classifies the data point correctly. Another example is regression, where the goal is to learn a real-valued function, and the accuracy is typically measured in terms of the mean-squared error between the true labels and their learned estimates. Other examples would be sequence-to-sequence learning, where both the input and the label are vectors of dimension larger than one.

There are many methods of supervised learning and many variants of each. One of the most basic supervised learning methods is the widely known and used linear regression, where the function fw(X) is parameterized in the form fw(Xµ) = Xµw, with w ∈ Rp. When the data live in a high dimensional space and the number of samples is not much larger than the dimension, it is indispensable to use a regularized form of linear regression called ridge regression or Tikhonov regularization. Ridge regression is formally equivalent to assuming that the weights w have a Gaussian prior. A generalized form of linear regression, with parameterization fw(Xµ) = g(Xµw), where g is some output channel function, is also often used, and its properties are described in section II.D.1. Another popular way of regularization is based on separating the examples in a classification task so that the separate categories are divided by a clear gap that is as wide as possible. This idea stands behind the definition of the so-called support vector machine method.

A rather powerful non-parametric generalization of ridge regression is kernel ridge regression. Kernel ridge regression is closely related to Gaussian process regression. The support vector machine method is often combined with a kernel method, and as such is still the state-of-the-art method in many applications, especially when the number of available samples is not very large.
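A minimal sketch of both methods may help fix ideas; the sizes, regularization strength λ, and the Gaussian (RBF) kernel below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, lam = 50, 100, 0.1              # fewer samples than dimensions
X = rng.normal(size=(n, p))
y = X @ rng.normal(size=p) + 0.1 * rng.normal(size=n)

# Ridge regression: minimize ||y - Xw||^2 + lam * ||w||^2, solved in closed form.
# (Equivalent, in Bayesian terms, to a Gaussian prior on the weights w.)
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Kernel ridge regression: replace inner products by a kernel function, so that
# predictions never require explicit weights in the (implicit) feature space.
def rbf_kernel(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

K = rbf_kernel(X, X, sigma=np.sqrt(p))       # bandwidth scaled with dimension
alpha = np.linalg.solve(K + lam * np.eye(n), y)
y_fit = K @ alpha                            # fitted values on the training set
```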
Another classical supervised learning method is based on so-called decision trees. The decision tree is used to go from observations about a data sample (represented in the branches) to conclusions about the item's target value (represented in the leaves). The best known application of decision trees in the physical sciences is in the data analysis of particle accelerators, as discussed in Sec. III.B.

The supervised learning method that stands behind the machine learning revolution of the past decade is multi-layer feed-forward neural networks (FFNN), also sometimes called multi-layer perceptrons. This is also a very relevant method for the purpose of this review and we shall describe it briefly here. In L-layer fully connected neural networks the function fw(Xµ) is parameterized as follows

fw(Xµ) = g(L)(W(L) . . . g(2)(W(2) g(1)(W(1) Xµ))),    (2)

where w = {W(1), . . . , W(L)} and W(i) ∈ R^{ri×ri−1}, with r0 = p and rL = d, are the matrices of weights, and ri for 1 ≤ i ≤ L − 1 is called the width of the i-th hidden layer. The functions g(i), 1 ≤ i ≤ L, are the so-called activation functions, which act component-wise on vectors. We note that the inputs of the activation functions are often slightly more generic affine transforms of the output of the previous layer, rather than simple matrix multiplications, including e.g. biases. The number of layers L is called the network's depth. Neural networks with depth larger than some small integer are called deep neural networks, and machine learning based on deep neural networks is called deep learning.
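A forward pass through the parameterization of eq. (2), in the affine variant with biases mentioned above, can be sketched as follows; the widths and the ReLU/linear activations are illustrative choices:

```python
import numpy as np

def relu(z):
    # A common choice for the component-wise activation functions g^(i).
    return np.maximum(z, 0.0)

def feed_forward(x, weights, biases, activations):
    """Evaluate f_w(x) = g^(L)(W^(L) ... g^(1)(W^(1) x)) as in eq. (2),
    with affine maps W^(i) h + b^(i) instead of bare matrix products."""
    h = x
    for W, b, g in zip(weights, biases, activations):
        h = g(W @ h + b)
    return h

# Illustrative L = 2 network: input width r0 = p = 4, hidden width r1 = 8,
# output width r2 = d = 1.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 4)), rng.normal(size=(1, 8))]
biases = [np.zeros(8), np.zeros(1)]
activations = [relu, lambda z: z]   # non-linear hidden layer, linear output

x = rng.normal(size=4)
print(feed_forward(x, weights, biases, activations))
```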
The theory of neural networks tells us that without hidden layers (L = 1, corresponding to generalized linear regression) the set of functions that can be approximated this way is very limited (Minsky and Papert, 1969). On the other hand, already with one hidden layer, L = 2, that is wide enough, i.e. r1 large enough, and where the function g(1) is non-linear, a very general class of functions can be well approximated in principle (Cybenko, 1989). These theories, however, do not tell us what the optimal set of parameters is (the activation functions, the widths of the layers and the depth) in order for the learning of W(1), . . . , W(L) to be efficiently tractable. We know from the empirical success of the past decade that many tasks of interest are tractable with deep neural networks using gradient descent or the SGD algorithm. In deep neural networks the derivatives with respect to the weights are computed using the chain rule, leading to the celebrated back-propagation algorithm that takes care of efficiently scheduling the operations required to compute all the gradients (Goodfellow et al., 2016).

A very important and powerful variant of (deep) feed-forward neural networks are the so-called convolutional neural networks (Goodfellow et al., 2016), where the input into each of the hidden units is obtained via a filter applied to a small part of the input space. The filter is then shifted to different positions corresponding to different hidden units. Convolutional neural networks implement invariance to translation and are in particular suitable for the analysis of images. Compared to fully connected neural networks, each layer of a convolutional neural network has a much smaller number of parameters, which is in practice advantageous for the learning algorithms. There are many types and variants of convolutional neural networks; among them we will mention the residual neural networks (ResNets), which use shortcuts to jump over some layers.

Next to feed-forward neural networks there are the so-called recurrent neural networks (RNN), in which the outputs of units feed back into the input at the next time step. In RNNs the result is thus given not only by the set of weights, but also by the whole temporal sequence of states. Due to their intrinsically dynamical nature, RNNs are particularly suitable for learning on temporal data sets, such as speech, language, and time series. Again there are many types and variants of RNNs, but the ones that caused the most excitement in the past decade are arguably the long short-term memory (LSTM) networks (Hochreiter and Schmidhuber, 1997). LSTMs and their deep variants are the state-of-the-art in tasks such as speech processing, music composition, and natural language processing.

2. Unsupervised learning and generative modelling

Unsupervised learning is a class of learning problems where input data are obtained as in supervised learning, but no labels are available. The goal of learning here is to recover some underlying – and possibly non-trivial – structure in the dataset. A typical example of unsupervised learning is data clustering, where data points are assigned into groups in such a way that every group has some common properties.

In unsupervised learning one often seeks a probability distribution that generates samples that are statistically similar to the observed data samples; this is often referred to as generative modelling. In some cases this probability distribution is written in an explicit form and explicitly or implicitly parameterized. Generative models internally contain latent variables as the source of randomness. When the number of latent variables is much smaller than the dimensionality of the data we speak of dimensionality reduction. One path towards unsupervised learning is to search for values of the latent variables that maximize the likelihood of the observed data.

In a range of applications the likelihood associated with the observed data is not known, or computing it is itself intractable. In such cases, some of the generative models discussed below offer an alternative likelihood-free path. In Section III.D we will also discuss the so-called ABC method, which is a type of likelihood-free inference and turns out to be very useful in many contexts arising in the physical sciences.

Basic methods of unsupervised learning include principal component analysis and its variants. We will cover some theoretical insights into these methods that were obtained using physics in section II.C.1. Physically very appealing methods for unsupervised learning are the so-called Boltzmann machines (BM). A BM is basically an inverse Ising model where the data samples are seen as samples from a Boltzmann distribution of a pair-wise interacting Ising model. The goal is to learn the values of the interactions and magnetic fields so that the likelihood (probability in the Boltzmann measure) of the observed data is large. A restricted Boltzmann machine (RBM) is a particular case of BM where two kinds of variables – visible units, which see the input data, and hidden units – interact through effective couplings. The interactions are in this case only between visible and hidden units and are again adjusted in order for the likelihood of the observed data to be large. Given the appealing interpretation in terms of physical models, applications of BMs and RBMs are widespread in several physics domains, as discussed e.g. in section IV.A.
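In practice the couplings of an RBM are most often adjusted with the contrastive divergence algorithm based on Gibbs sampling (Hinton, 2002), discussed further in section II.C.2. A minimal sketch of one such update (CD-1) might look as follows; the sizes, learning rate, and toy binary data are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden, lr = 16, 8, 0.05

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Couplings between visible and hidden units, plus "magnetic field" biases.
W = 0.01 * rng.normal(size=(n_visible, n_hidden))
a, b = np.zeros(n_visible), np.zeros(n_hidden)

# Toy binary data: copies of a few prototype patterns.
prototypes = rng.integers(0, 2, size=(4, n_visible))
data = prototypes[rng.integers(4, size=200)].astype(float)

for epoch in range(50):
    for v in data:
        # Positive phase: hidden units conditioned on the data sample.
        ph = sigmoid(v @ W + b)
        h = (rng.random(n_hidden) < ph).astype(float)
        # Negative phase: one Gibbs step back to the visible layer and up again.
        pv = sigmoid(h @ W.T + a)
        v_model = (rng.random(n_visible) < pv).astype(float)
        ph_model = sigmoid(v_model @ W + b)
        # CD-1 update: raise the likelihood of data, lower that of model samples.
        W += lr * (np.outer(v, ph) - np.outer(v_model, ph_model))
        a += lr * (v - v_model)
        b += lr * (ph - ph_model)
```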
A very neat idea allowing one to perform unsupervised learning while still using all the methods and algorithms developed for supervised learning are auto-encoders. An autoencoder is a feed-forward neural network that has the input data on the input, but also on the output. It aims to reproduce the data while typically going through a bottleneck, in the sense that some of the intermediate layers have a very small width compared to the dimensionality of the data. The idea is then that the autoencoder aims to find a succinct representation of the data that still keeps the salient features of each of the samples. Variational autoencoders (VAE) (Kingma and Welling, 2013; Rezende et al., 2014) combine variational inference and autoencoders to provide a deep generative model for the data, which can be trained in an unsupervised fashion.

A further approach to unsupervised learning worth mentioning here are generative adversarial networks (GANs) (Goodfellow et al., 2014). GANs have attracted substantial attention in the past years, and constitute another fruitful way to take advantage of the progress made in supervised learning to do unsupervised learning. GANs typically use two feed-forward neural networks, one called the generator and another called the discriminator. The generator network is used to generate outputs from random inputs, and is designed so that the outputs look like the observed samples. The discriminator network is used to discriminate between true data samples and samples generated by the generator network. The discriminator aims at the best possible accuracy in this classification task, whereas the generator network is adjusted to make the accuracy of the discriminator the smallest possible. GANs are currently the state-of-the-art systems for many applications in image processing.

Other interesting methods to model distributions include normalizing flows and autoregressive models, which have the advantage of a tractable likelihood so that they can be trained via maximum likelihood (Larochelle and Murray, 2011; Papamakarios et al., 2017; Uria et al., 2016).

Hybrids between supervised and unsupervised learning that are important in applications include semi-supervised learning, where only some labels are available, or active learning, where labels can be acquired for a selected set of data points at a certain cost.

3. Reinforcement learning

Reinforcement learning (Sutton and Barto, 2018) is an area of machine learning where an (artificial) agent takes actions in an environment with the goal of maximizing a reward. The action changes the state of the environment in some way, and the agent typically observes some information about the state of the environment and the corresponding reward. Based on those observations the agent decides on the next action, refining the strategies of which action to choose in order to maximize the resulting reward. This type of learning is designed for cases where the only way to learn about the properties of the environment is to interact with it. A key concept in reinforcement learning is the trade-off between the exploitation of good strategies found so far, and exploration in order to find yet better strategies. We should also note that reinforcement learning is intimately related to the theory of control, especially optimal control theory.

One of the main types of reinforcement learning applied in many works is so-called Q-learning. Q-learning is based on a value matrix Q that assigns a quality to a given action when the environment is in a given state. This value function Q is then iteratively refined. In recent advanced applications of Q-learning the set of states and actions is so large that it is impossible to even store the whole matrix Q. In those cases deep feed-forward neural networks are used to represent the function in a succinct manner. This gives rise to deep Q-learning.
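A minimal tabular sketch may illustrate the idea; the chain environment, reward structure, and hyper-parameters below are all arbitrary toy choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2          # actions: 0 = step left, 1 = step right
Q = np.zeros((n_states, n_actions)) # the value matrix Q(state, action)
alpha, gamma, eps = 0.1, 0.9, 0.3   # learning rate, discount, exploration rate

for episode in range(300):
    s = 0
    while s != n_states - 1:        # the rightmost state is terminal
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
        a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0   # reward only at the goal
        # Iterative refinement of the value function Q.
        Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next

print(Q)   # after training, the "right" action dominates in every state
```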
The most well-known recent examples of the success of reinforcement learning are the computer programs AlphaGo and AlphaGo Zero, which for the first time in history reached super-human performance in the traditional board game of Go. Another well known use of reinforcement learning is the locomotion of robots.

II. STATISTICAL PHYSICS

A. Historical note

While machine learning as a wide-spread tool for physics research is a relatively new phenomenon, cross-fertilization between the two disciplines dates back much further. Especially statistical physicists made important contributions to our theoretical understanding of learning (as the term "statistical" unmistakably suggests).

The connection between statistical mechanics and learning theory started when statistical learning from examples took over from logic- and rule-based AI in the mid 1980s. Two seminal papers marked this transformation: Valiant's theory of the learnable (Valiant, 1984), which opened the way for rigorous statistical learning in AI, and Hopfield's neural network model of associative memory (Hopfield, 1982), which sparked the rich application of concepts from spin glass theory to neural network models. This was marked by the memory capacity calculation of the Hopfield model by Amit, Gutfreund, and Sompolinsky (Amit et al., 1985) and following works. A much tighter connection to learning models was made by the seminal work of Elizabeth Gardner, who applied the replica trick (Gardner, 1987, 1988) to calculate volumes in the weight space of simple feed-forward neural networks, for both supervised and unsupervised learning models.

Gardner's method made it possible to explicitly calculate learning curves, i.e. the typical training and generalization errors as a function of the number of training examples, for very specific one- and two-layer neural networks (Györgyi and Tishby, 1990; Seung et al., 1992a; Sompolinsky et al., 1990). These analytic statistical physics calculations demonstrated that the learning dynamics can exhibit much richer behavior than predicted by the worst-case, distribution-free PAC bounds (PAC stands for provably approximately correct) (Valiant, 1984). In particular, learning can exhibit phase transitions from poor to good generalization (Györgyi, 1990). Such rich learning dynamics and curves can appear in many machine learning problems, as was shown in various models, see e.g. the more recent review (Zdeborová and Krzakala, 2016). The statistical physics of learning reached its peak in the early 1990s, but had a rather minor influence on machine-learning practitioners and theorists, who were focused on general input-distribution-independent generalization bounds, characterized by e.g. the Vapnik-Chervonenkis dimension or the Rademacher complexity of hypothesis classes.
B. Theoretical puzzles in deep learning

Machine learning in the new millennium was marked by much larger scale learning problems: in input/pattern sizes, which moved from hundreds to millions in dimensionality, in training data sizes, and in the number of adjustable parameters. This was dramatically demonstrated by the return of large scale feed-forward neural network models with many more hidden layers, known as deep neural networks. These deep neural networks were essentially the same feed-forward convolutional neural networks proposed already in the 80s. But somehow, with the much larger scale inputs and big and clean training data (and a few more tricks and hacks), these networks started to beat the state-of-the-art in many different pattern recognition and other machine learning competitions, from roughly 2010 and on. The amazing performance of deep learning, trained with the same old stochastic gradient descent (SGD) error-back-propagation algorithm, took everyone by surprise.

One of the puzzles is that the existing learning theory (based on the worst-case PAC-like generalization bounds) is unable to explain this phenomenal success. The existing theory does not predict why deep networks, where the number/dimension of adjustable parameters/weights is way higher than the number of training samples, have good generalization properties. This lack of theory was pinpointed in a now classical article (Zhang et al., 2016), where the authors show numerically that state-of-the-art neural networks used for classification are able to perfectly classify randomly generated labels. In such a case existing learning theory does not provide any useful bound on the generalization error. Yet in practice we observe good generalization of the same deep neural networks when trained on the true labels.

Continuing with the open questions, we do not have a good understanding of which learning problems are computationally tractable. This is particularly important since, from the point of view of computational complexity theory, most of the learning problems we encounter are NP-hard in the worst case. Another open question that is central to current deep learning concerns the choice of hyper-parameters and architectures, which is so far guided by a lot of trial-and-error combined with the impressive experience of the researchers. At the same time, as applications of ML are spreading into many domains, the field calls for more systematic and theory-based approaches. In current deep learning, basic questions, such as what is the minimal number of samples we need in order to be able to learn a given task with a good precision, remain entirely open.

At the same time the current literature on deep learning is flourishing with interesting numerical observations and experiments that call for explanation. For a physics audience the situation could perhaps be compared to the state-of-the-art in fundamental small-scale physics just before quantum mechanics was developed. The field was full of unexplained experiments that were evading existing theoretical understanding. This is clearly a perfect time for some of the physics ideas for studying neural networks to be resurrected, and for some of the current questions and directions in machine learning to be revisited.

Given the long history of works done on neural networks in statistical physics, we will not aim at a complete review of this direction of research. We will focus in a selective way on recent contributions originating in physics that, in our opinion, are having an important impact on the current theory of learning and machine learning. For the purpose of this review we are also putting aside a large volume of work done in statistical physics on recurrent neural networks with biological applications in mind.

C. Statistical physics of unsupervised learning

1. Contributions to understanding basic unsupervised methods

One of the most basic tools of unsupervised learning across the sciences are methods based on a low-rank decomposition of the observed data matrix. Data clustering, principal component analysis (PCA), independent component analysis (ICA), matrix completion, and other methods are examples in this class.

In mathematical language the low-rank matrix decomposition problem is stated as follows: We observe n samples of p-dimensional data xi ∈ Rp, i = 1, . . . , n. Denoting X the n × p matrix of data, the idea underlying low-rank decomposition methods assumes that X (or some component-wise function of X) can be written as a noisy version of a rank-r matrix, where r ≪ p and r ≪ n, i.e. the rank is much lower than the dimensionality and the number of samples, therefore the name low-rank. A particularly challenging, yet relevant and interesting, regime is when the dimensionality p is comparable to the number of samples n, and when the level of noise is so large that perfect estimation of the signal is not possible. It turns out that low-rank matrix estimation in the high-dimensional noisy regime can be modelled as a statistical physics model of a spin glass with r-dimensional vector variables and a special planted configuration to be found.

Concretely, this model can be defined in the teacher-student scenario in which the teacher generates r-dimensional latent variables u∗i ∈ Rr, i = 1, . . . , n, taken from a given probability distribution Pu(u∗i), and r-dimensional latent variables v∗j ∈ Rr, j = 1, . . . , p, taken from a given probability distribution Pv(v∗j). The teacher then generates the components of the data matrix X from some given conditional probability distribution Pout(Xij | u∗i · v∗j). The goal of the student is then to recover the latent variables u∗ and v∗ as precisely as possible from the knowledge of X and the distributions Pout, Pu, Pv.

Spin glass theory can be used to obtain a rather complete understanding of this teacher-student model for low-rank matrix estimation in the limit p, n → ∞, n/p = α = Ω(1), r = Ω(1). One can compute with the replica method the information-theoretically best error in the estimation of u∗ and v∗ that the student can possibly achieve, as done decades ago for some special choices of r, Pout, Pu and Pv in (Barkai and Sompolinsky, 1994; Biehl and Mietzner, 1993; Watkin and Nadal, 1994). The importance of these early works in physics is acknowledged in some of the landmark papers on the subject in statistics, see e.g. (Johnstone and Lu, 2009). However, the lack of mathematical rigor and the limited understanding of algorithmic tractability caused the impact of these works in machine learning and statistics to remain limited.
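The teacher side of this model is straightforward to simulate; a minimal sketch with illustrative Gaussian choices for Pu, Pv and Pout, together with a naive PCA-like student as a baseline, might look as follows:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, r = 500, 400, 2        # dimensionality p comparable to sample number n
delta = 1.0                  # noise level (an illustrative choice)

# Teacher: latent variables u*_i ~ P_u and v*_j ~ P_v, here both Gaussian.
u_star = rng.normal(size=(n, r))
v_star = rng.normal(size=(p, r))

# Gaussian P_out: each X_ij is a noisy observation of u*_i . v*_j / sqrt(p).
X = u_star @ v_star.T / np.sqrt(p) + np.sqrt(delta) * rng.normal(size=(n, p))

# A naive student: keep the top-r part of the SVD of X (PCA-like estimate);
# the replica analysis characterizes how far such estimators are from optimal.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_hat = (U[:, :r] * s[:r]) @ Vt[:r]
```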
A resurrection of interest in the statistical physics approach to low-rank matrix decompositions came with the study of the stochastic block model for the detection of clusters/communities in sparse networks. The problem of community detection was studied heuristically and algorithmically extensively in statistical physics, for a review see (Fortunato, 2010). However, the exact solution and the understanding of algorithmic limitations in the stochastic block model came from spin glass theory in (Decelle et al., 2011a,b). These works computed (non-rigorously) the asymptotically optimal performance and sharply delimited the regions of parameters where this performance is reached by the belief propagation (BP) algorithm (Yedidia et al., 2003). Second order phase transitions appearing in the model separate a phase where clustering cannot be performed better than by random guessing from a region where it can be done efficiently with BP. First order phase transitions, and one of their spinodal lines, then separate regions where clustering is impossible, possible but not doable with the BP algorithm, and easy with the BP algorithm. Refs. (Decelle et al., 2011a,b) also conjectured that when the BP algorithm is not able to reach the optimal performance on large instances of the model, then no other polynomial algorithm will. These works attracted a large amount of follow-up work in the mathematics, statistics, machine learning and computer science communities.

The statistical physics understanding of the stochastic block model, and the conjecture that the belief propagation algorithm is optimal among all polynomial ones, inspired the discovery of a new class of spectral algorithms for sparse data (i.e. when the matrix X is sparse) (Krzakala et al., 2013b). Spectral algorithms are basic tools in data analysis (Ng et al., 2002; Von Luxburg, 2007), based on the singular value decomposition of the matrix X or of functions of X. Yet for sparse matrices X, the spectrum is known to have leading singular values with localized singular vectors unrelated to the latent underlying structure. A more robust spectral method is obtained by linearizing the belief propagation, thus obtaining the so-called non-backtracking matrix (Krzakala et al., 2013b). A variant of this spectral method, based on an algorithmic interpretation of the Hessian of the Bethe free energy, also originated in physics (Saade et al., 2014).
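As an illustration, the non-backtracking matrix B of a graph is indexed by its directed edges, with B_{(i→j),(k→l)} = 1 if and only if j = k and l ≠ i, and can be built directly from this definition; the toy graph below is an arbitrary choice:

```python
import numpy as np

# A small toy graph given by its undirected edges.
edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
directed = edges + [(j, i) for i, j in edges]   # each edge in both directions
m = len(directed)

# B[(i->j), (k->l)] = 1 iff the second edge continues the first one
# without immediately backtracking: j == k and l != i.
B = np.zeros((m, m))
for row, (i, j) in enumerate(directed):
    for col, (k, l) in enumerate(directed):
        if j == k and l != i:
            B[row, col] = 1.0

# On sparse graphs the spectrum of B remains informative even when the
# adjacency spectrum is dominated by localized eigenvectors.
print(np.sort(np.abs(np.linalg.eigvals(B)))[::-1][:4])
```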
This line of statistical-physics-inspired research is merging into the mainstream in statistics and machine learning. This is largely thanks to recent progress in: (a) our understanding of algorithmic limitations, due to the analysis of approximate message passing (AMP) algorithms (Bolthausen, 2014; Deshpande and Montanari, 2014; Javanmard and Montanari, 2013; Matsushita and Tanaka, 2013; Rangan and Fletcher, 2012) for low-rank matrix estimation, a generalization of the Thouless-Anderson-Palmer equations (Thouless et al., 1977) well known in the physics literature on spin glasses; and (b) progress in proving many of the corresponding results in a mathematically rigorous way. Some of the influential papers in this direction (related to low-rank matrix estimation) are (Barbier et al., 2016; Coja-Oghlan et al., 2018; Deshpande and Montanari, 2014; Lelarge and Miolane, 2016) for the proof of the replica formula for the information-theoretically optimal performance.

2. Restricted Boltzmann machines

Boltzmann machines, and in particular restricted Boltzmann machines, are another method for unsupervised learning often used in machine learning. As is apparent from the very name of the method, it has a strong relation with statistical physics. Indeed the Boltzmann machine is often called the inverse Ising model in the physics literature and is used extensively in a range of areas; for a recent review on the physics of Boltzmann machines see (Nguyen et al., 2017).

Concerning restricted Boltzmann machines, there are a number of studies in physics clarifying how these machines work and what structures they can learn. A model of random restricted Boltzmann machines, where the weights are imposed to be random and sparse rather than learned, is studied in (Cocco et al., 2018; Tubiana and Monasson, 2017). Rather remarkably, for a range of potentials on the hidden units this work unveiled that even the single-layer RBM is able to represent compositional structure. Insights from this work were more recently used to model protein families from their sequence information (Tubiana et al., 2018).

Analytical study of the learning process in RBMs, which is most commonly done using the contrastive divergence algorithm based on Gibbs sampling (Hinton, 2002), is very challenging. First steps were taken in (Decelle et al., 2017) for the beginning of the learning process, where the dynamics can be linearized. Another interesting direction coming from statistical physics is to replace the Gibbs sampling in the contrastive divergence training algorithm by the Thouless-Anderson-Palmer equations (Thouless et al., 1977). This has been done in (Gabrié et al., 2015; Tramel et al., 2018), where such training was shown to be competitive and applications of the approach were discussed. RBMs with random weights and their relation to the Hopfield model were clarified in (Barra et al., 2018; Mézard, 2017).

3. Modern unsupervised and generative modelling

The dawn of deep learning brought exciting innovations into unsupervised and generative-model learning. A physics-friendly overview of some classical and more recent concepts is e.g. (Wang, 2018).

Auto-encoders with linear activation functions are closely related to PCA. Variational autoencoders (VAE) (Kingma and Welling, 2013; Rezende et al., 2014) are variants much closer to a physicist's mindset, where the autoencoder is represented via a graphical model and is trained using a prior on the latent variables and variational inference. A VAE with a single hidden layer is closely related to other widely used techniques in signal processing such as dictionary learning and sparse coding. The dictionary learning problem has been studied with statistical physics techniques in (Kabashima et al., 2016; Krzakala et al., 2013a; Sakata and Kabashima, 2013).

Generative adversarial networks (GANs) – a powerful set of ideas that emerged with the work of (Goodfellow et al., 2014) – aim to generate samples (e.g. images of hotel bedrooms) that are of the same type as those in the training set. Physics-inspired studies of GANs are starting to appear, e.g. the work on a solvable model of GANs by (Wang et al., 2018) is an intriguing generalization of the earlier statistical physics works on online learning in perceptrons.
We also want to point the reader's attention to autoregressive generative models (Larochelle and Murray, 2011; Papamakarios et al., 2017; Uria et al., 2016). The main interest in autoregressive models stems from the fact that they are a family of explicit probabilistic models, for which direct and unbiased sampling is possible. Applications of these models have been realized for both statistical (Wu et al., 2018) and quantum physics problems (Sharir et al., 2019).

D. Statistical physics of supervised learning

1. Perceptron and GLMs

The arguably most basic method of supervised learning is linear regression, where one aims to find a vector of coefficients w so that its scalar product with the data point, Xi w, corresponds to the observed predicate y. This is most often solved by the least squares method, where ||y − Xw||₂² is minimized over w. In the Bayesian language, the least squares method corresponds to assuming Gaussian additive noise ξ so that yi = Xi w + ξi. In the high dimensional setting it is almost always indispensable to use regularization of the weights. The most common ridge regularization corresponds in the Bayesian interpretation to a Gaussian prior on the weights. This probabilistic thinking can be generalized by assuming a general prior PW(·) and a generic noise represented by a conditional probability distribution Pout(yi | Xi w). The resulting model is called generalized linear regression or the generalized linear model (GLM). Many other problems of interest in data analysis and learning can be represented as a GLM. For instance, sparse regression simply requires that PW has a large weight on zero, while for the perceptron with threshold κ the output has the special form Pout(y|z) = I(z > κ)δ(y − 1) + I(z ≤ κ)δ(y + 1). In the language of neural networks, the GLM represents a single-layer (no hidden variables) fully connected feed-forward network.

For a generic noise/activation channel Pout, traditional theories in statistics are not readily applicable in the regime of very limited data, where both the dimension p and the number of samples n grow large while their ratio n/p = α remains fixed. Basic questions, such as how the best achievable generalization error depends on the number of samples, remain open. Yet this regime and the related questions are of great interest, and understanding them well in the setting of the GLM seems to be a prerequisite for understanding more involved, e.g. deep learning, methods.

The statistical physics approach can be used to obtain specific results on the high-dimensional GLM by considering the data to be a random independent identically distributed (iid) matrix and modelling the labels as being created in the teacher-student setting. The teacher generates a ground-truth vector of weights w∗ so that w∗j ∼ Pw, j = 1, . . . , p. The teacher then uses this vector and the data matrix X to produce labels y taken from Pout(yi | Xi w∗). The student then knows X, y, Pw and Pout and is supposed to learn the rule the teacher uses, i.e. ideally to learn the w∗. Already this setting with random input data provides interesting insights into the algorithmic tractability of the problem as the number of samples changes.

This line of work was pioneered by Elizabeth Gardner (Gardner and Derrida, 1989) and actively studied in physics in the past for special cases of Pout and PW, see e.g. (Györgyi and Tishby, 1990; Seung et al., 1992a; Sompolinsky et al., 1990). The replica method can be used to compute the mutual information between X and y in this teacher-student model, which is related to the free energy in physics. One can then deduce the optimal estimation error of the vector w∗, as well as the optimal generalization error. Remarkable recent progress was made in (Barbier et al., 2019), where it was proven that the replica method yields the correct results for the GLM with random inputs for generic Pout and PW. Combining these results with the analysis of approximate message passing algorithms (Javanmard and Montanari, 2013), one can deduce cases where the AMP algorithm is able to reach the optimal performance and regions where it is not. The AMP algorithm is conjectured to be the best of all polynomial algorithms for this case. The teacher-student model could thus be used by practitioners to understand how far from optimality general purpose algorithms are in cases where only a very limited number of samples is available.
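A minimal simulation of this teacher-student GLM, with a Gaussian Pw, the perceptron channel Pout defined above (threshold κ = 0), and a naive least-squares student as a baseline, might look as follows; all concrete choices are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 300, 100
X = rng.normal(size=(n, p)) / np.sqrt(p)     # random iid data matrix

# Teacher: ground-truth weights w*_j ~ P_w (Gaussian), labels drawn from
# the perceptron channel P_out with threshold kappa = 0.
w_star = rng.normal(size=p)
y = np.where(X @ w_star > 0.0, 1.0, -1.0)

# A naive student: plain least squares on (X, y); the replica analysis tells
# us how far such simple estimators are from the optimal error at given n/p.
w_hat = np.linalg.lstsq(X, y, rcond=None)[0]
overlap = w_hat @ w_star / (np.linalg.norm(w_hat) * np.linalg.norm(w_star))
print(f"overlap with the teacher: {overlap:.3f}")
```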
2. Physics results on multi-layer neural networks

Statistical physics analysis of learning and generalization properties in deep neural networks is a challenging task. Progress has been made in several complementary directions.

One of the influential directions involves studies of linear deep neural networks. While linear neural networks do not have the expressive power to represent generic functions, the learning dynamics of the gradient descent algorithm bears a strong resemblance to the learning dynamics on non-linear networks. At the same time, the dynamics of learning in deep linear neural networks can be described via a closed form solution (Saxe et al., 2013). The learning dynamics of linear neural networks is also able to reproduce a range of facts about generalization and over-fitting as observed numerically in non-linear networks, see e.g. (Advani and Saxe, 2017).
Another special case that has been analyzed in great detail is called the committee machine, for a review see e.g. (Engel and Van den Broeck, 2001). The committee machine is a fully-connected neural network learning a teacher-rule on random input data, with only the first layer of weights being learned, while the subsequent ones are fixed. The theory is restricted to the limit where the number of hidden neurons k = O(1), while the dimensionality of the input p and the number of samples n both diverge, with n/p = α = O(1). Both the stochastic gradient descent (aka online) learning (Saad and Solla, 1995a,b) and the optimal batch-learning generalization error can be analyzed in closed form in this case (Schwarze, 1993). Recently the replica analysis of the optimal generalization properties has been established rigorously (Aubin et al., 2018). A key feature of the committee machine is that it displays the so-called specialization phase transition. When the number of samples is small, the optimal error is achieved by a weight configuration that is the same for every hidden unit, effectively implementing simple regression. Only when the number of samples exceeds the specialization threshold do the different hidden units learn different weights, resulting in an improvement of the generalization error. Another interesting observation about the committee machine is that the hard phase, where good generalization is achievable information-theoretically but not tractably, gets larger as the number of hidden units grows. The committee machine was also used to analyze the consequences of over-parametrization in neural networks in (Goldt et al., 2019a,b).

Another remarkable limit of two-layer neural networks was analysed in a recent series of works (Mei et al., 2018; Rotskoff and Vanden-Eijnden, 2018). In these works the networks are analysed in the limit where the number of hidden units is large, while the dimensionality of the input is kept fixed. In this limit the weights interact only weakly – leading to the term mean field – and their evolution can be tracked via an ordinary differential equation analogous to those studied in glassy systems (Dean, 1996). A related, but different, treatment of the limit when the hidden layers are large is based on a linearization of the dynamics around the initial condition, leading to a relation with Gaussian processes and kernel methods, see e.g. (Jacot et al., 2018; Lee et al., 2018).

3. Information Bottleneck

The information bottleneck (Tishby et al., 2000) is another concept stemming from statistical physics that has been influential in the quest for understanding the theory behind the success of deep learning. The theory of the information bottleneck for deep learning (Shwartz-Ziv and Tishby, 2017; Tishby and Zaslavsky, 2015) aims to quantify the notion that layers in a neural network are trading off between keeping enough information about the input so that the output labels can be predicted, while forgetting as much of the unnecessary information as possible in order to keep the learned representation concise.

One of the interesting consequences of this information theoretic analysis is that the traditional capacity, or expressivity dimension of the network, such as the VC dimension, is replaced by the exponent of the mutual information between the input and the compressed hidden layer representation. This implies that every bit of representation compression is equivalent to doubling the training data in its impact on the generalization error.

The analysis of (Shwartz-Ziv and Tishby, 2017) also suggests that such representation compression is achieved by stochastic gradient descent (SGD) through diffusion in the irrelevant dimensions of the problem. According to this, compression is achieved with any unit nonlinearity by reducing the SNR of the irrelevant dimensions, layer by layer, through the diffusion of the weights. An intriguing prediction of this insight is that the time to converge to good generalization scales like a negative power-law of the number of layers. The theory also predicts a connection between the hidden layers and the bifurcations, or phase transitions, of the information bottleneck representations.

While the mutual information of the internal representations is intrinsically hard to compute directly in large neural networks, none of the above predictions depend on an explicit estimation of mutual information values.

A related line of work in statistical physics aims to provide reliable scalable approximations and models where the mutual information is tractable. The mutual information can be computed exactly in linear networks (Saxe et al., 2018). It can be reliably approximated in models of neural networks where, after learning, the matrices of weights are close enough to rotationally invariant; this is then exploited within the replica theory in order to compute the desired mutual information (Gabrié et al., 2018).

4. Landscapes and glassiness of deep learning

Training a deep neural network is usually done via stochastic gradient descent (SGD) in the non-convex landscape of a loss function. Statistical physics has long experience in the study of complex energy landscapes and their relation to dynamical behaviour. Gradient descent algorithms are closely related to the Langevin dynamics often considered in physics. Some physics-inspired works (Choromanska et al., 2015) became popular but were somewhat naive in exploring this analogy.

An interesting insight into the relation between glassy dynamics and learning in deep neural networks is presented in (Baity-Jesi et al., 2018). In particular, the role of over-parameterization in making the landscape look less glassy is highlighted and contrasted with under-parametrized networks.
12

Another intriguing line of work that relates learning in neural networks to properties of landscapes is explored in (Baldassi et al., 2016, 2015). This work is based on the realization that, in the simple model of the binary perceptron, learning dynamics ends in a part of the weight space that has many low-loss configurations close by. It goes on to suggest that learning favours these wide parts in the space of weights, and argues that this might explain why algorithms are attracted to wide local minima and why their generalization properties improve as a result. An interesting spin-off of this theory is a variant of the stochastic gradient descent algorithm suggested in (Chaudhari et al., 2016).

E. Applications of ML in Statistical Physics

When a researcher in theoretical physics encounters deep neural networks where the early layers are learning to represent the input data at a finer scale than the later layers, she immediately thinks about the renormalization group, used in physics in order to extract macroscopic behaviour from microscopic rules. This analogy was explored for instance in (Bény, 2013; Mehta and Schwab, 2014). Analogies between the renormalization group and principal component analysis were reported in (Bradde and Bialek, 2017).

A natural idea is to use neural networks in order to learn new renormalization schemes. First attempts in this direction appeared in (Koch-Janusz and Ringel, 2018; Li and Wang, 2018). However, it remains to be shown whether this can lead to new physical discoveries in models that were not well understood previously.

Phase transitions are boundaries between different phases of matter. They are usually determined using order parameters, but in some systems it is not a priori clear how to determine the proper order parameter. A natural idea is that a neural network may be able to learn appropriate order parameters and locate the phase transition without a priori physical knowledge. This idea was explored in (Carrasquilla and Melko, 2017; Morningstar and Melko, 2018; Tanaka and Tomiya, 2017a; Van Nieuwenburg et al., 2017) in a range of models, using configurations sampled uniformly from the model of interest (obtained using Monte Carlo simulations) in different phases or at different temperatures, and using supervised learning in order to classify the configurations according to their phases. Extrapolating to configurations not used in the training set plausibly leads to a determination of the phase transitions in the studied models. These general guiding principles have been used in a large number of applications to analyze both synthetic and experimental data. Specific cases in the context of many-body quantum physics are detailed in Section IV.C.
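To make this strategy concrete, the following is a minimal sketch (ours, not the code of any of the cited works) for the 2D Ising model: Monte Carlo configurations sampled away from the exactly known critical temperature are labeled by phase, a small neural network is trained on them, and the transition is then located where the classifier's average output crosses one half. The lattice size, sweep counts, and network architecture are illustrative choices only.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
L, T_c = 16, 2.269  # lattice size; exact 2D Ising critical temperature (J = k_B = 1)

def sample_configuration(T, n_sweeps=100):
    """One spin configuration at temperature T via Metropolis updates."""
    s = rng.choice([-1, 1], size=(L, L))
    for _ in range(n_sweeps * L * L):
        i, j = rng.integers(L, size=2)
        nb = s[(i + 1) % L, j] + s[(i - 1) % L, j] + s[i, (j + 1) % L] + s[i, (j - 1) % L]
        dE = 2 * s[i, j] * nb
        if dE <= 0 or rng.random() < np.exp(-dE / T):
            s[i, j] *= -1
    return s.ravel()

# Training configurations labeled by phase (0: ordered, 1: disordered).
temps = np.concatenate([np.linspace(1.0, 2.0, 5), np.linspace(2.6, 3.6, 5)])
X = np.array([sample_configuration(T) for T in temps for _ in range(10)])
y = np.repeat((temps > T_c).astype(int), 10)

# A hidden layer is needed: by Z2 symmetry, no linear function of the raw
# spins can separate the ordered from the disordered phase.
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500).fit(X, y)

# Scan intermediate temperatures; the crossing of 1/2 estimates T_c.
for T in np.linspace(1.8, 2.8, 6):
    scores = clf.predict_proba(np.array([sample_configuration(T) for _ in range(10)]))[:, 1]
    print(f"T = {T:.2f}   mean P(disordered) = {scores.mean():.2f}")
```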
Detailed understanding of the limitations of these methods in terms of identifying previously unknown order parameters, as well as understanding whether they can reliably distinguish between a true thermodynamic phase transition and a mere crossover, is yet to be clarified. Experiments presented on the Ising model in (Mehta et al., 2018) provide some preliminary thoughts in that direction. Some underlying mechanisms are discussed in (Kashiwa et al., 2019). A kernel-based learning method for learning phases in frustrated magnetic materials, which is more easily interpretable and able to identify complex order parameters, is introduced and studied in (Greitemann et al., 2019; Liu et al., 2019).

Disordered and glassy solids, where identification of the order parameter is particularly challenging, were also studied. In particular, (Nussinov et al., 2016; Ronhovde et al., 2011) use multi-scale network clustering methods to identify spatial and spatio-temporal structures in glasses, (Cubuk et al., 2015) learn to identify structural flow defects, and (Schoenholz et al., 2017) identify a parameter that captures the history dependence of the disordered system.

In an ongoing effort to go beyond the limitations of supervised learning for classifying phases and identifying phase transitions, several directions towards unsupervised learning are being explored: for instance, in (Wetzel, 2017) for the Ising and XY models, and in (Wang and Zhai, 2017, 2018) for frustrated spin systems. The work of (Martiniani et al., 2019) explores the direction of identifying phases from simple compression of the underlying configurations.
Machine learning also provides an exciting set of tools to study, predict and control non-linear dynamical systems. For instance, (Pathak et al., 2018, 2017) used recurrent neural networks known as echo state networks or reservoir computers (Jaeger and Haas, 2004) to predict the trajectories of a chaotic dynamical system and of models used for weather prediction. The authors of (Reddy et al., 2016, 2018) used reinforcement learning to teach an autonomous glider to literally soar like a bird, using thermals in the atmosphere.
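As an illustration of the reservoir-computing idea, here is a minimal echo state network in the spirit of (Jaeger and Haas, 2004): a fixed random recurrent reservoir is driven by a chaotic input series (the logistic map is used below as a stand-in for the systems studied in the cited works), and only the linear readout is trained, here by ridge regression. All hyperparameters are illustrative rather than tuned.

```python
import numpy as np

rng = np.random.default_rng(1)

# Chaotic series to learn: the logistic map x_{t+1} = 4 x_t (1 - x_t).
T = 2000
u = np.empty(T)
u[0] = 0.3
for t in range(T - 1):
    u[t + 1] = 4.0 * u[t] * (1.0 - u[t])

N = 200                                   # reservoir size
W_in = rng.uniform(-0.5, 0.5, size=N)     # fixed random input weights
W = rng.normal(0.0, 1.0, size=(N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # set spectral radius to 0.9

# Drive the reservoir and record its states.
states = np.zeros((T, N))
x = np.zeros(N)
for t in range(T):
    x = np.tanh(W @ x + W_in * u[t])
    states[t] = x

# Train only the linear readout (ridge regression) to predict u[t+1].
washout, lam = 100, 1e-6
S, target = states[washout:-1], u[washout + 1:]
W_out = np.linalg.solve(S.T @ S + lam * np.eye(N), S.T @ target)

pred = states[washout:-1] @ W_out
print("one-step prediction RMSE:", np.sqrt(np.mean((pred - target) ** 2)))
```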
F. Outlook and Challenges

The described methods of statistical physics are quite powerful in dealing with high-dimensional data sets and models. The largest difference between traditional learning theories and the theories coming from statistical physics is that the latter are often based on toy generative models of data. This leads to solvable models, in the sense that quantities of interest such as achievable errors can be computed in a closed form, including constant terms. This is in contrast with the aims of mainstream learning theory, which seeks to provide worst-case bounds on error under general assumptions on the setting (data structure, or architecture). These two approaches are complementary and ideally will meet in the future, once we understand what the key conditions are under which practical cases are close to worst cases, and what the right models of realistic data and functions are.

The next challenge for the statistical physics approach is to formulate and solve models that are in some kind of universality class of the real settings of interest, meaning that they reproduce all important aspects of the behaviour observed in practical applications of neural networks. For this we need to model the input data no longer as iid vectors, but for instance as outputs from a generative neural network as in (Gabrié et al., 2018), or as perceptual manifolds as in (Chung et al., 2018). The teacher network that is producing the labels (in a supervised setting) needs to model suitably the correlation between the structure in the data and the label. We need to find out how to analyze the (stochastic) gradient descent algorithm and its relevant variants. Promising works in this direction, which rely on the dynamical mean-field theory of glasses, are (Mannelli et al., 2018, 2019). We also need to generalize the existing methodology to multi-layer networks with extensive width of hidden layers.

Going back to the direction of using machine learning for physics, the full potential of ML in the study of non-linear dynamical systems and statistical physics is yet to be uncovered. The above-mentioned works certainly provide an exciting appetizer.

III. PARTICLE PHYSICS AND COSMOLOGY

A diverse portfolio of on-going and planned experiments is well poised to explore the universe, from the unimaginably small world of fundamental particles to the awe-inspiring scale of the universe as a whole. Experiments like the Large Hadron Collider (LHC) and the Large Synoptic Survey Telescope (LSST) deliver enormous amounts of data to be compared to the predictions of specific theoretical models. Both areas have well-established physical models that serve as null hypotheses: the standard model of particle physics and ΛCDM cosmology, which includes cold dark matter and a cosmological constant Λ. Interestingly, most alternate hypotheses considered are formulated within the same theoretical frameworks, namely quantum field theory and general relativity. Despite such sharp theoretical tools, the challenge is still daunting, as the deviations from the null are expected to be incredibly tiny, and revealing such subtle effects requires a robust treatment of complex experimental apparatuses. Complicating the statistical inference is the fact that the most high-fidelity predictions for the data do not come in the form of simple closed-form equations, but instead from complex computer simulations.

Machine learning is making waves in particle physics and cosmology as it offers a suite of techniques to confront these challenges and a new perspective that motivates bold new strategies. The excitement spans the theoretical and experimental aspects of these fields and includes both applications with immediate impact as well as the prospect of more transformational changes in the longer term.

A. The role of the simulation

An important aspect of the use of machine learning in particle physics and cosmology is the use of computer simulations to generate samples of labeled training data $\{X_\mu, y_\mu\}_{\mu=1}^{n}$. For example, when the target y refers to a particle type, a particular scattering process, or a parameter appearing in the fundamental theory, it can often be specified directly in the simulation code, so that the simulation directly samples X ∼ p(·|y). In other cases, the simulation is not directly conditioned on y, but provides samples (X, Z) ∼ p(·), where Z are latent variables that describe what happened inside the simulation, but which are not observable in an actual experiment. If the target label can be computed from these latent variables via a function y(Z), then labeled training data $\{X_\mu, y(Z_\mu)\}_{\mu=1}^{n}$ can also be created from the simulation. The use of high-fidelity simulations to generate labeled training data has not only been the key to early successes of supervised learning in these areas, but also the focus of research addressing the shortcomings of this approach.
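The two situations can be illustrated with a deliberately trivial toy simulator (a stand-in, not any experiment's actual code): the label is either fixed in the simulation so that X ∼ p(·|y), or it is computed as a function y(Z) of the latent record the simulation keeps internally.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(y=None):
    """Toy forward model: latent variables Z = (particle type, true energy)
    produce a smeared observation X of four detector energy deposits."""
    particle = y if y is not None else int(rng.integers(2))  # Z_1
    energy = rng.exponential(10.0 + 5.0 * particle)          # Z_2
    X = energy / 4.0 + rng.normal(0.0, 1.0, size=4)          # observed X
    return X, (particle, energy)

# Case 1: the target is specified directly, so the simulation samples X ~ p(.|y).
X_signal = np.array([simulate(y=1)[0] for _ in range(1000)])

# Case 2: the simulation is unconditioned; the label is a function y(Z)
# of the latent variables recorded inside it.
runs = [simulate() for _ in range(1000)]
X = np.array([x for x, z in runs])
labels = np.array([z[0] for x, z in runs])   # y(Z): here, the particle type
print(X_signal.shape, X.shape, labels.mean())
```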
Particle physicists have developed a suite of high-fidelity simulations that are hierarchically composed to describe interactions across a huge range of length scales. The components of these simulations include the Feynman diagrammatic perturbative expansion of quantum field theory, phenomenological models for complex patterns of radiation, and detailed models for the interaction of particles with matter in the detector. While the resulting simulation has high fidelity, the simulation itself has free parameters to be tuned, and a number of residual uncertainties in the simulation must be taken into account in down-stream analysis tasks.

Similarly, cosmologists can simulate the evolution of the universe at different length scales using general relativity and the relevant non-gravitational effects of matter and radiation that become increasingly important during structure formation. There is a rich array of approximations that can be made in specific settings that provide enormous speedups compared to the computationally expensive N-body simulations of billions of massive objects that interact gravitationally, which become prohibitively expensive once non-gravitational feedback effects are included.
Cosmological simulations generally involve deterministic evolution of stochastic initial conditions due to primordial quantum fluctuations. The N-body simulations are very expensive, so there are relatively few simulations, but they cover a large space-time volume that is statistically isotropic and homogeneous at large scales. In contrast, particle physics simulations are stochastic throughout, from the initial high-energy scattering to the low-energy interactions in the detector. Simulations for high-energy collider experiments can run on commodity hardware in a parallel manner, but the physics goals require enormous numbers of simulated collisions.

Because of the critical role of the simulation in these fields, much of the recent research in machine learning is related to simulation in one way or another. The goals of these recent works are to:

• develop techniques that are more data efficient by incorporating domain knowledge directly into the machine learning models;

• incorporate the uncertainties in the simulation into the training procedure;

• develop weakly supervised procedures that can be applied to real data and do not rely on the simulation;

• develop anomaly detection algorithms to find anomalous features in the data without simulation of a specific signal hypothesis;

• improve the tuning of the simulation, reweight or adjust the simulated data to better match the real data, or use machine learning to model residuals between the simulation and the real data;

• learn fast neural network surrogates for the simulation that can be used to quickly generate synthetic data;

• develop approximate inference techniques that make efficient use of the simulation; and

• learn fast neural network surrogates that can be used directly for statistical inference.
B. Classification and regression in particle physics

Machine learning techniques have been used for decades in experimental particle physics to aid particle identification and event selection, which can be seen as classification tasks. Machine learning has also been used for reconstruction, which can be seen as a regression task. Supervised learning is used to train a predictive model based on a large number of labeled training samples $\{X_\mu, y_\mu\}_{\mu=1}^{n}$, where X denotes the input data and y the target label. In the case of particle identification, the input features X characterize localized energy deposits in the detector and the label y refers to one of a few particle species (e.g. electron, photon, pion, etc.). In the reconstruction task, the same type of sensor data X are used, but the target label y refers to the energy or momentum of the particle responsible for those energy deposits. These algorithms are applied to the bulk data processing of the LHC data.

Event selection refers to the task of selecting a small subset of the collisions that are most relevant for a targeted analysis task. For instance, in the search for the Higgs boson, supersymmetry, and dark matter, data analysts must select a small subset of the LHC data that is consistent with the features of these hypothetical "signal" processes. Typically these event selection requirements are also satisfied by so-called "background" processes that mimic the features of the signal, either due to experimental limitations or due to fundamental quantum mechanical effects. Searches in their simplest form reduce to comparing the number of events in the data that satisfy these requirements to the predictions of a background-only null hypothesis and a signal-plus-background alternate hypothesis. Thus, the more effective the event selection requirements are at rejecting background processes and accepting signal processes, the more powerful the resulting statistical analysis will be. Within high-energy physics, machine learning classification techniques have traditionally been referred to as multivariate analysis, to emphasize the contrast to traditional techniques based on simple thresholding (or "cuts") applied to carefully selected or engineered features.

In the 1990s and early 2000s, simple feed-forward neural networks were commonly used for these tasks. Neural networks were largely displaced by Boosted Decision Trees (BDTs) as the go-to technique for classification and regression tasks for more than a decade (Breiman et al., 1984; Freund and Schapire, 1997; Roe et al., 2005). Starting around 2014, techniques based on deep learning emerged and were demonstrated to be significantly more powerful in several applications (for a recent review of the history, see Refs. (Guest et al., 2018; Radovic et al., 2018)).

Deep learning was first used for an event-selection task targeting hypothesized particles from theories beyond the standard model. It not only out-performed boosted decision trees, but also did not require engineered features to achieve this impressive performance (Baldi et al., 2014). In this proof-of-concept work, the network was a deep multi-layer perceptron trained with a very large training set using a simplified detector setup. Shortly after, the idea of a parametrized classifier was introduced, in which the concept of a binary classifier is extended to a situation where the y = 1 signal hypothesis is lifted to a composite hypothesis that is parameterized continuously, for instance, in terms of the mass of a hypothesized particle (Baldi et al., 2016b).
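The idea of a parametrized classifier admits a compact sketch: append the hypothesized parameter (here a toy resonance mass) to the input features, train on a grid of signal hypotheses, and evaluate at masses never seen in training. The Gaussian signal, uniform background, and network below are illustrative stand-ins, not the setup of (Baldi et al., 2016b).

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
masses = [500.0, 750.0, 1000.0, 1250.0]   # training grid of signal hypotheses

Xs, ys = [], []
for m in masses:
    sig = rng.normal(m, 50.0, size=2000)            # toy resonance peak
    bkg = rng.uniform(200.0, 1600.0, size=2000)     # toy smooth background
    feats = np.concatenate([sig, bkg])
    theta = np.full_like(feats, m)                  # the parameter input
    Xs.append(np.column_stack([feats, theta]))
    ys.append(np.concatenate([np.ones(2000), np.zeros(2000)]))

X, y = np.vstack(Xs), np.concatenate(ys)
clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=300))
clf.fit(X, y)

# A single network now interpolates between hypotheses: evaluate at an
# intermediate mass (875) that was never part of the training grid.
test = np.column_stack([rng.normal(875.0, 50.0, 1000), np.full(1000, 875.0)])
print("mean signal score at m = 875:", clf.predict_proba(test)[:, 1].mean())
```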
1. Jet Physics

The most copious interactions at hadron colliders such as the LHC produce high-energy quarks and gluons in the final state. These quarks and gluons radiate more quarks and gluons that eventually combine into color-neutral composite particles due to the phenomenon of confinement. The resulting collimated spray of mesons and baryons that strikes the detector is collectively referred to as a jet. Developing useful characterizations of the structure of a jet that are theoretically robust and that can be used to test the predictions of quantum chromodynamics (QCD) has been an active area of particle physics research for decades. Furthermore, many scenarios for physics Beyond the Standard Model predict the production of particles that decay into two or more jets. If those unstable particles are produced with a large momentum, then the resulting jets are boosted such that they overlap into a single fat jet with nontrivial substructure. Distinguishing these boosted or fat jets from the much more copiously produced jets from standard model processes involving quarks and gluons is an area that can significantly improve the physics reach of the LHC. More generally, identifying the progenitor of a jet is a classification task that is often referred to as jet tagging.

Shortly after the first applications of deep learning for event selection, deep convolutional networks were used for the purpose of jet tagging, where the low-level detector data lends itself to an image-like representation (Baldi et al., 2016a; de Oliveira et al., 2016). While machine learning techniques have been used within particle physics for decades, the practice has always been restricted to input features X with a fixed dimensionality. One challenge in jet physics is that the natural representation of the data is in terms of particles, and the number of particles associated to a jet varies. The first application of a recurrent neural network in particle physics was in the context of flavor tagging (Guest et al., 2016). More recently, there has been an explosion of research into the use of different network architectures, including recurrent networks operating on sequences, trees, and graphs (see Ref. (Larkoski et al., 2017) for a recent review for jet physics). This includes hybrid approaches that leverage domain knowledge in the design of the architecture. For example, motivated by techniques in natural language processing, recursive networks were designed that operate over tree structures created from a class of jet clustering algorithms (Louppe et al., 2017a). Similarly, networks have been developed motivated by invariance to permutations of the particles presented to the network and stability to details of the radiation pattern of particles (Komiske et al., 2018b, 2019). Recently, comparisons of the different approaches for specific benchmark problems have been organized (Kasieczka et al., 2019).
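As a minimal illustration of the jet-image approach, the following convolutional network classifies toy 25×25 calorimeter images; the image size, architecture, and random stand-in data are ours and do not reproduce the networks of the cited works.

```python
import torch
import torch.nn as nn

class JetImageTagger(nn.Module):
    """A small CNN mapping a 1-channel jet image to a signal/background logit."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 25 -> 12
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                       # 12 -> 6
            nn.Flatten(),
            nn.Linear(32 * 6 * 6, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, x):
        return self.net(x).squeeze(-1)

model = JetImageTagger()
images = torch.rand(8, 1, 25, 25)        # stand-in jet images (energy deposits)
labels = torch.randint(0, 2, (8,)).float()
loss = nn.BCEWithLogitsLoss()(model(images), labels)
loss.backward()                          # an optimizer step would follow
print("loss:", loss.item())
```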
In addition to classification and regression, machine learning techniques have been used for density estimation and for modeling smooth spectra where an analytical form is not well motivated and the simulation has significant uncertainties (Frate et al., 2017). The work also allows one to model alternative signal hypotheses with a diffuse prior instead of a specific concrete physical model. More abstractly, the Gaussian process in this work is being used to model the intensity of an inhomogeneous Poisson point process, which is a scenario found in particle physics, astrophysics, and cosmology. One interesting aspect of this line of work is that the Gaussian process kernels can be constructed using compositional rules that correspond clearly to the causal model physicists intuitively use to describe the observation, which aids in interpretability (Duvenaud et al., 2013).
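The following sketch shows the basic mechanics of such a fit on a toy falling spectrum; the exponential intensity and the generic RBF kernel are illustrative placeholders for the physics-motivated kernels of the cited work.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
edges = np.linspace(500.0, 2000.0, 31)        # toy invariant-mass bins
centers = 0.5 * (edges[:-1] + edges[1:])
intensity = 5e4 * np.exp(-centers / 300.0)    # smooth Poisson intensity
counts = rng.poisson(intensity)               # observed bin counts

# Fit log-counts, so the per-bin Gaussian noise variance is roughly 1/counts.
y_obs = np.log(np.maximum(counts, 1))
kernel = ConstantKernel(1.0) * RBF(length_scale=200.0)
gp = GaussianProcessRegressor(kernel=kernel, alpha=1.0 / np.maximum(counts, 1))
gp.fit(centers[:, None], y_obs)

mean, std = gp.predict(centers[:, None], return_std=True)
pull = (y_obs - mean) / np.maximum(std, 1e-9)
# A localized excess over the smooth fit (a "bump") would show up as a
# large positive pull; here the toy spectrum is background-only.
print("max pull:", pull.max())
```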
2. Neutrino physics

Neutrinos interact very feebly with matter; thus, the experiments require large detector volumes to achieve appreciable interaction rates. Different types of interactions, whether they come from different species of neutrinos or from background cosmic-ray processes, leave different patterns of localized energy deposits in the detector volume. The detector volume is homogeneous, which motivates the use of convolutional neural networks.

The first application of a deep convolutional network in the analysis of data from a particle physics experiment was in the context of the NOνA experiment, which uses scintillating mineral oil. Interactions in NOνA lead to the production of light, which is imaged from two different vantage points. NOνA developed a convolutional network that simultaneously processed these two images (Aurisano et al., 2016). Their network improves the efficiency (true positive rate) of selecting electron neutrinos by 40% for the same purity. This network has been used in searches for the appearance of electron neutrinos and for the hypothetical sterile neutrino.

Similarly, the MicroBooNE experiment detects neutrinos created at Fermilab. It uses a 170-ton liquid-argon time projection chamber. Charged particles ionize the liquid argon, and the ionization electrons drift through the volume to three wire planes. The resulting data is processed and represented by a 33-megapixel image, which is dominantly populated with noise and only very sparsely populated with legitimate energy deposits. The MicroBooNE collaboration used a Faster-RCNN (Ren et al., 2015) to identify and localize neutrino interactions with bounding boxes (Acciarri et al., 2017). This success is important for future neutrino experiments based on liquid-argon time projection chambers, such as the Deep Underground Neutrino Experiment (DUNE).

In addition to the relatively low-energy neutrinos produced at accelerator facilities, machine learning has also been used to study high-energy neutrinos with the IceCube observatory located at the south pole. In particular, 3D convolutional and graph neural networks have been applied to a signal classification problem. In the latter approach, the detector array is modeled as a graph, where vertices are sensors and edges are a learned function of the sensors' spatial coordinates. The graph neural network was found to outperform both a traditional physics-based method and a classical 3D convolutional neural network (Choma et al., 2018).
3. Robustness to systematic uncertainties

Experimental particle physicists are keenly aware that the simulation, while incredibly accurate, is not perfect. As a result, the community has developed a number of strategies falling roughly into two broad classes. The first involves incorporating the effect of mis-modeling when the simulation is used for training. This involves propagating the underlying sources of uncertainty (e.g. calibrations, detector response, the quark and gluon composition of the proton, the impact of higher-order corrections from perturbation theory, etc.) through the simulation and analysis chain. For each of these sources of uncertainty, a nuisance parameter ν is included, and the resulting statistical model p(X|y, ν) is parameterized by these nuisance parameters. In addition, the likelihood function for the data is augmented with a term p(ν) representing the uncertainty in these sources of uncertainty, as in the case of a penalized maximum likelihood analysis. In the context of machine learning, classifiers and regressors are typically trained using data generated from a nominal simulation ν = ν₀, yielding a predictive model f(X|ν₀). Treating this predictive model as fixed, it is possible to propagate the uncertainty in ν through f(X|ν₀) using the model p(X|y, ν)p(ν). However, the down-stream statistical analysis based on this approach is not optimal, since the predictive model was not trained taking into account the uncertainty on ν.

In the machine learning literature, this situation is often referred to as covariate shift between two domains represented by the training distribution ν₀ and the target distribution ν. Various techniques for domain adaptation exist to train classifiers that are robust to this change, but they tend to be restricted to binary domains ν ∈ {train, target}. To address this problem, an adversarial training technique was developed that extends domain adaptation to domains parametrized by ν ∈ R^q (Louppe et al., 2016). The adversarial approach encourages the network to learn a pivotal quantity, for which p(f(X)|y, ν) is independent of ν, or equivalently p(f(X), ν|y) = p(f(X)|y)p(ν). This adversarial approach has also been used in the context of algorithmic fairness, where one desires to train a classifier or regressor that is independent of (or decorrelated with) specific continuous attributes or observable quantities. For instance, in jet physics one often would like a jet tagger that is independent of the jet invariant mass (Shimmin et al., 2017). Previously, a different algorithm called uBoost was developed to achieve similar goals for boosted decision trees (Rogozhnikov et al., 2015; Stevens and Williams, 2013).
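Schematically, the adversarial setup can be sketched as follows: a classifier is trained jointly against an adversary that tries to recover the nuisance parameter ν from the classifier output f(X); when the adversary fails, f(X) is approximately pivotal. The toy data, the use of a simple regression adversary in place of the density model of (Louppe et al., 2016), and all hyperparameters are simplifications for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n = 4096
nu = torch.randn(n, 1)                       # nuisance (e.g. an energy scale)
y = torch.randint(0, 2, (n, 1)).float()     # class label
X = 2.0 * y + nu + torch.randn(n, 1)        # features depend on y AND nu

clf = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
adv = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
opt_c = torch.optim.Adam(clf.parameters(), lr=1e-2)
opt_a = torch.optim.Adam(adv.parameters(), lr=1e-2)
bce, mse, lam = nn.BCEWithLogitsLoss(), nn.MSELoss(), 10.0

for step in range(500):
    # Adversary step: regress nu from the (detached) classifier output.
    f = torch.sigmoid(clf(X)).detach()
    loss_a = mse(adv(f), nu)
    opt_a.zero_grad(); loss_a.backward(); opt_a.step()

    # Classifier step: classify well while making f(X) uninformative about nu.
    f = torch.sigmoid(clf(X))
    loss_c = bce(clf(X), y) - lam * mse(adv(f), nu)
    opt_c.zero_grad(); loss_c.backward(); opt_c.step()

print("final classification loss:", bce(clf(X), y).item())
```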
The second general strategy used within particle physics to cope with systematic mis-modeling in the simulation is to avoid using the simulation for modeling the distribution p(X|y). In what follows, let R denote an index over various subsets of the data satisfying corresponding selection requirements. Various data-driven strategies have been developed to relate distributions of the data in control regions, p(X|y, R = 0), to distributions in regions of interest, p(X|y, R = 1). These relationships also involve the simulation, but the art of this approach is to base those relationships on aspects of the simulation that are considered robust. The simplest example is estimating the distribution p(X|y, R = 1) for a specific process y by identifying a subset of the data R = 0 that is dominated by y, with p(y|R = 0) ≈ 1. This is an extreme situation that is limited in applicability.

Recently, weakly supervised techniques have been developed that only involve identifying regions where the class proportions are known, or assuming that the relative probabilities p(y|R) are not linearly dependent (Komiske et al., 2018a; Metodiev et al., 2017). These techniques also assume that the distributions p(X|y, R) are independent of R, which is reasonable in some contexts and questionable in others. The approach has been used to train jet taggers that discriminate between quarks and gluons, which is an area where the fidelity of the simulation is no longer adequate and the assumptions for this method are reasonable. This weakly-supervised, data-driven approach is a major development for machine learning in particle physics, though it is limited to a subset of problems. For example, this approach is not applicable if one of the target categories y corresponds to a hypothetical particle that may not exist or be present in the data.
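A minimal version of this weak-supervision idea (in the spirit of the classification-without-labels setup of (Metodiev et al., 2017), with a one-dimensional toy feature in place of real jet observables) is to train an ordinary classifier to distinguish two mixed samples with different, even unknown, class proportions; the resulting score is monotonically related to the quark/gluon likelihood ratio.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def mixed_sample(n, f_quark):
    """n jets with quark fraction f_quark; toy feature = constituent count
    (quark jets tend to have fewer constituents than gluon jets)."""
    is_quark = rng.random(n) < f_quark
    counts = np.where(is_quark, rng.poisson(15, n), rng.poisson(25, n))
    return counts.astype(float), is_quark

# Two mixed regions with different (possibly unknown) quark fractions.
x0, truth0 = mixed_sample(20000, f_quark=0.8)
x1, truth1 = mixed_sample(20000, f_quark=0.2)

X = np.concatenate([x0, x1])[:, None]
R = np.concatenate([np.zeros(x0.size), np.ones(x1.size)])  # mixture labels only
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300).fit(X, R)

# Although trained only on the mixture label R, the classifier separates the
# underlying classes (the truth is kept here purely for validation).
scores = clf.predict_proba(X)[:, 1]
truth = np.concatenate([truth0, truth1])
print("mean score, quarks:", scores[truth].mean())
print("mean score, gluons:", scores[~truth].mean())
```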
4. Triggering

Enormous amounts of data must be collected by collider experiments such as the LHC, because the phenomena being targeted are exceedingly rare. The bulk of the collisions involve phenomena that have previously been studied and characterized, and the data volume associated with the full data stream is impractically large. As a result, collider experiments use a real-time data-reduction system referred to as a trigger. The trigger makes the critical decision of which events to keep for future analysis and which events to discard. The ATLAS and CMS experiments retain only about 1 out of every 100,000 events. Machine learning techniques are used to various degrees in these systems.
Essentially the same particle identification (classification) tasks appear in this context, though the computational demands and the performance in terms of false positives and negatives are different in the real-time environment.

The LHCb experiment has been a leader in using machine learning techniques in the trigger. Roughly 70% of the data selected by the LHCb trigger is selected by machine learning algorithms. Initially, the experiment used a boosted decision tree for this purpose (Gligorov and Williams, 2013), which was later replaced by the MatrixNet algorithm developed by Yandex (Likhomanenko et al., 2015).

Trigger systems often use specialized hardware and firmware, such as field-programmable gate arrays (FPGAs). Recently, tools have been developed to streamline the compilation of machine learning models for FPGAs in order to target the requirements of these real-time triggering systems (Duarte et al., 2018; Tsaris et al., 2018).

5. Theoretical particle physics

While the bulk of machine learning in particle physics and cosmology is focused on the analysis of observational data, there are also examples of using machine learning as a tool in theoretical physics. For instance, machine learning has been used to characterize the landscape of string theories (Carifio et al., 2017), to identify the phase transitions of quantum chromodynamics (QCD) (Pang et al., 2018), and to study the AdS/CFT correspondence (Hashimoto et al., 2018a,b). Some of this work is more closely connected to the use of machine learning as a tool in condensed matter or many-body quantum physics. Specifically, deep learning has been used in the context of lattice QCD (LQCD). In an exploratory work in this direction, deep neural networks were used to predict the parameters in the QCD Lagrangian from lattice configurations (Shanahan et al., 2018). This is needed for a number of multi-scale action-matching approaches, which aim to improve the efficiency of the computationally intensive LQCD calculations. This problem was set up as a regression task, and one of the challenges is that there are relatively few training examples. Additionally, machine learning techniques are being used to reduce the autocorrelation time in the Markov chains (Albergo et al., 2019; Tanaka and Tomiya, 2017b). In order to solve this task with few training examples, it is important to leverage the known space-time and local gauge symmetries in the lattice data. Data augmentation is not a scalable solution given the richness of the symmetries. Instead, the authors performed feature engineering that imposed gauge symmetry and space-time translational invariance. While this approach proved effective, it would be desirable to consider a richer class of networks that are equivariant (or covariant) to the symmetries in the data (such approaches are discussed in Sec. III.F). A continuation of this work is being supported by the Argonne Leadership Computing Facility. The new Intel-Cray system Aurora will be capable of over 1 exaflops and is specifically aimed at problems that combine traditional high-performance computing with modern machine learning techniques.

C. Classification and regression in cosmology

1. Photometric Redshift

Due to the expansion of the universe, distant luminous objects are redshifted, and the distance-redshift relation is a fundamental component of observational cosmology. Very precise redshift estimates can be obtained through spectroscopy; however, such spectroscopic surveys are expensive and time consuming. Photometric surveys based on broadband photometry, or imaging in a few color bands, give a coarse approximation to the spectral energy distribution. Photometric redshift refers to the regression task of estimating redshifts from photometric data. In this case, the ground truth training data comes from precise spectroscopic surveys.

The traditional approach to photometric redshift is based on template-fitting methods (Benítez, 2000; Brammer et al., 2008; Feldmann et al., 2006). For more than a decade, cosmologists have also used machine learning methods based on neural networks and boosted decision trees for photometric redshift (Carrasco Kind and Brunner, 2013; Collister and Lahav, 2004; Firth et al., 2003). One interesting aspect of this body of work is the effort that has been made to go beyond a point estimate for the redshift. Various approaches exist to determine the uncertainty on the redshift estimate and to obtain a posterior distribution.

While the training data are not generated from a simulation, there is still a concern that the distribution of the training data may not be representative of the distribution of data that the models will be applied to. This type of covariate shift results from various selection effects in the spectroscopic survey and from subtleties in the photometric surveys. The Dark Energy Survey considered a number of these approaches and established a validation process to evaluate them critically (Bonnett et al., 2016). Recently there has been work using hierarchical models to build additional causal structure into the models so that they are robust to these differences. In the language of machine learning, these new models aid in transfer learning and domain adaptation. The hierarchical models also aim to combine the interpretability of traditional template-fitting approaches and the flexibility of the machine learning models (Leistedt et al., 2018).
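A bare-bones version of the regression task reads as follows; the mapping from redshift to toy five-band magnitudes is a crude invented stand-in for a real spectroscopic training set, and the boosted-tree regressor is one of the several model classes cited above.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 5000
z = rng.uniform(0.0, 2.0, n)              # "spectroscopic" redshifts (truth)

# Toy photometry: magnitudes in five bands drift with redshift, plus noise.
band_slopes = np.linspace(0.5, 1.5, 5)
mags = 22.0 + np.outer(z, band_slopes) + rng.normal(0.0, 0.1, (n, 5))
colors = np.diff(mags, axis=1)            # adjacent-band colors as features

model = GradientBoostingRegressor().fit(colors[:3500], z[:3500])
z_pred = model.predict(colors[3500:])
sigma = np.std((z_pred - z[3500:]) / (1.0 + z[3500:]))
print("photo-z scatter sigma(dz/(1+z)) =", round(sigma, 4))
```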
2. Gravitational lens finding and parameter estimation

One of the most striking effects of general relativity is gravitational lensing, in which a massive foreground object warps the image of a background object. Strong gravitational lensing occurs, for example, when a massive foreground galaxy is nearly coincident on the sky with a background source. These events are a powerful probe of the dark matter distribution of massive galaxies and can provide valuable cosmological constraints. However, these systems are rare; thus, a scalable and reliable lens-finding system is essential to cope with large surveys such as LSST, Euclid, and WFIRST. Simple feedforward, convolutional, and residual neural networks (ResNets) have been applied to this supervised classification problem (Estrada et al., 2007; Lanusse et al., 2018; Marshall et al., 2009). In this setting, the training data came from simulation, using PICS (Pipeline for Images of Cosmological Strong lensing) (Li et al., 2016) for the strong-lensing ray-tracing and LensPop (Collett, 2015) for mock LSST observing. Once identified, characterizing the lensing object through maximum likelihood estimation is a computationally intensive non-linear optimization task. Recently, convolutional networks have been used to quickly estimate the parameters of the Singular Isothermal Ellipsoid density profile, commonly used to model strong lensing systems (Hezaveh et al., 2017).

3. Other examples

In addition to the examples above, in which the ground truth for an object can be established relatively unambiguously with a more labor-intensive approach, cosmologists are also leveraging machine learning to infer quantities that involve unobservable latent processes or the parameters of the fundamental cosmological model.

For example, 3D convolutional networks have been trained to predict fundamental cosmological parameters based on the dark matter spatial distribution (Ravanbakhsh et al., 2017) (see Fig. 1). In this proof-of-concept work, the networks were trained using computationally intensive N-body simulations for the evolution of dark matter in the universe, assuming specific values for the 10 parameters in the standard ΛCDM cosmology model. In real applications of this technique to visible matter, one would need to model the bias and variance of the visible tracers with respect to the underlying dark matter distribution. In order to close this gap, convolutional networks have been trained to learn a fast mapping between the dark matter and visible galaxies (Zhang et al., 2019), allowing for a trade-off between simulation accuracy and computational cost. One challenge of this work, which is common to applications in solid state physics, lattice field theory, and many-body quantum systems, is that the simulations are computationally expensive, and thus there are relatively few statistically independent realizations of the large simulations Xµ. As deep learning tends to require large labeled training data sets, various types of subsampling and data augmentation approaches have been explored to ameliorate the situation. An alternative approach to subsampling is the so-called Backdrop, which provides stochastic gradients of the loss function even on individual samples by introducing a stochastic masking in the backpropagation pipeline (Golkar and Cranmer, 2018).
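The architecture of such a network can be sketched in a few lines; the 32³ sub-cubes, the two regressed parameters, and the random stand-in data below are illustrative and much smaller than the published setup.

```python
import torch
import torch.nn as nn

class CubeRegressor(nn.Module):
    """3D CNN mapping a dark-matter density sub-cube to cosmological parameters."""
    def __init__(self, n_params=2):       # e.g. (Omega_m, sigma_8)
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),               # 32 -> 16
            nn.Conv3d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),               # 16 -> 8
            nn.Flatten(),
            nn.Linear(16 * 8 * 8 * 8, 64), nn.ReLU(),
            nn.Linear(64, n_params),
        )

    def forward(self, x):
        return self.net(x)

model = CubeRegressor()
cubes = torch.rand(4, 1, 32, 32, 32)      # stand-in density sub-cubes
params = torch.rand(4, 2)                 # stand-in target parameters
loss = nn.MSELoss()(model(cubes), params)
loss.backward()
print("loss:", loss.item())
```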
Inference on the fundamental cosmological model also appears in a classification setting. In particular, modified gravity models with massive neutrinos can mimic the predictions for weak-lensing observables made by the standard ΛCDM model. The degeneracies that exist when restricting the Xµ to second-order statistics can be broken by incorporating higher-order statistics or other rich representations of the weak-lensing signal. In particular, the authors of (Peel et al., 2018) constructed a novel representation of the wavelet decomposition of the weak-lensing signal as input to a convolutional network. The resulting approach was able to discriminate between previously degenerate models with 83%–100% accuracy.

Deep learning has also been used to estimate the mass of galaxy clusters, which are the largest gravitationally bound structures in the universe and a powerful cosmological probe. Much of the mass of these galaxy clusters comes in the form of dark matter, which is not directly observable. Galaxy cluster masses can be estimated via gravitational lensing, via X-ray observations of the intra-cluster medium, or through dynamical analysis of the cluster's galaxies. The first use of machine learning for a dynamical cluster mass estimate was performed using Support Distribution Machines (Póczos et al., 2012) on a dark-matter-only simulation (Ntampaka et al., 2015, 2016). A number of non-neural-network algorithms, including Gaussian process regression (kernel ridge regression), support vector machines, gradient boosted tree regressors, and others, have been applied to this problem using the MACSIS simulations (Henson et al., 2016) for training data. This simulation goes beyond the dark-matter-only simulations, incorporates the impact of various astrophysical processes, and allows for the development of a realistic processing pipeline that can be applied to observational data. The need for an accurate, automated mass-estimation pipeline is motivated by large surveys such as eBOSS, DESI, eROSITA, SPT-3G, ActPol, and Euclid. The authors found that, compared to the traditional σ−M relation, the scatter in the predicted-to-true mass ratio is reduced by a factor of 4 when using machine learning techniques (Armitage et al., 2019). Most recently, convolutional neural networks have been used to mitigate systematics in the virial scaling relation, further improving dynamical mass estimates (Ho et al., 2019).
Figure 1: Dark matter distribution in three cubes produced using different sets of parameters. Each cube is divided into small sub-cubes for training and prediction. Note that although cubes in this figure are produced using very different cosmological parameters in our constrained sampled set, the effect is not visually discernible. Reproduced from (Ravanbakhsh et al., 2017).
Convolutional neural networks have also been used to estimate cluster masses with synthetic (mock) X-ray observations of galaxy clusters, where the authors find the scatter in the predicted mass is reduced compared to traditional X-ray luminosity based methods (Ntampaka et al., 2018).

D. Inverse Problems and Likelihood-free inference

As stressed repeatedly, both particle physics and cosmology are characterized by well-motivated, high-fidelity forward simulations. These forward simulations are either intrinsically stochastic – as in the case of the probabilistic decays and interactions found in particle physics simulations – or deterministic – as in the case of gravitational lensing or N-body gravitational simulations. However, even deterministic physics simulators are usually followed by a probabilistic description of the observation, based on Poisson counts or a model for instrumental noise. In both cases, one can consider the simulation as implicitly defining the distribution p(X, Z|y), where X refers to the observed data, Z are unobserved latent variables that take on random values inside the simulation, and y are parameters of the forward model, such as coefficients in a Lagrangian or the 10 parameters of ΛCDM cosmology. Many scientific tasks can be characterized as inverse problems, where one wishes to infer Z or y from X = x. The simplest cases that we have considered are classification, where y takes on categorical values, and regression, where y ∈ R^d. The point estimates ŷ(X = x) and Ẑ(X = x) are useful, but in scientific applications we often require uncertainty on the estimate.

In many cases, the solution to the inverse problem is ill-posed, in the sense that small changes in X lead to large changes in the estimate. This implies that the estimator will have high variance. In some cases the forward model is equivalent to a linear operator, and the maximum likelihood estimate ŷ_MLE(X) or Ẑ_MLE(X) can be expressed as a matrix inversion. In that case, the instability of the inverse is related to the matrix for the forward model being poorly conditioned. While the maximum likelihood estimate may be unbiased, it tends to have high variance. Penalized maximum likelihood, ridge regression (Tikhonov regularization), and Gaussian process regression are closely related approaches to this bias-variance trade-off.
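The instability, and its regularization, fit in a few lines of linear algebra. In this sketch the forward model A is a Gaussian smearing matrix (a toy detector response); the naive inversion A⁻¹x amplifies the noise, while the Tikhonov-regularized solution (AᵀA + λI)⁻¹Aᵀx trades a small bias for a large reduction in variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
grid = np.arange(n)
A = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / 3.0) ** 2)  # smearing
A /= A.sum(axis=1, keepdims=True)

truth = np.exp(-grid / 10.0)                  # true falling spectrum
x = A @ truth + rng.normal(0.0, 1e-3, n)      # smeared, noisy observation

y_mle = np.linalg.solve(A, x)                 # unstable matrix inversion
lam = 1e-3
y_ridge = np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ x)  # Tikhonov

print(f"condition number of A: {np.linalg.cond(A):.1e}")
print("error of naive inverse:", np.linalg.norm(y_mle - truth))
print("error of ridge solution:", np.linalg.norm(y_ridge - truth))
```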
Within particle physics, this type of problem is often referred to as unfolding. In that case, one is often interested in the distribution of some kinematic property of the collisions prior to the detector effects, and X represents a smeared version of this quantity after folding in the detector effects. Similarly, estimating the parton density functions that describe quarks and gluons inside the proton can be cast as an inverse problem of this sort (Ball et al., 2015; Forte et al., 2002). Recently, both neural networks and Gaussian processes with more sophisticated, physically inspired kernels have been applied to these problems (Bozson et al., 2018; Frate et al., 2017). In the context of cosmology, an example inverse problem is to denoise the Laser Interferometer Gravitational-Wave Observatory (LIGO) time series to recover the underlying waveform from a gravitational wave (Shen et al., 2019). Generative Adversarial Networks (GANs) have even been used in the context of inverse problems, where they were used to denoise and recover images of galaxies beyond naive deconvolution limits (Schawinski et al., 2017). Another example involves estimating the image of a background object prior to its being gravitationally lensed by a foreground object. In this case, describing a physically motivated prior for the background object is difficult. Recently, recurrent inference machines (Putzky and Welling, 2017) have been introduced as a way to implicitly learn a prior for such inverse problems, and they have successfully been applied to strong gravitational lensing (Morningstar et al., 2018, 2019).

A more ambitious approach to inverse problems involves providing a detailed probabilistic characterization of y given X. In the frequentist paradigm one would aim to characterize the likelihood function L(y) = p(X = x|y), while in a Bayesian formalism one would wish to characterize the posterior p(y|X = x) ∝ p(X = x|y)p(y). The analogous situation happens for inference of latent variables Z given X. Both particle physics and cosmology have well-developed approaches to statistical inference based on detailed modeling of the likelihood: Markov Chain Monte Carlo (MCMC) (Foreman-Mackey et al., 2013), Hamiltonian Monte Carlo, and variational inference (Jain et al., 2018; Lang et al., 2016; Regier et al., 2018). However, all of these approaches require that the likelihood function is tractable.

1. Likelihood-free Inference

Somewhat surprisingly, the probability density, or likelihood, p(X = x|y) that is implicitly defined by the simulator is often intractable. Symbolically, the probability density can be written p(X|y) = ∫ p(X, Z|y) dZ, where Z are the latent variables of the simulation. The latent space of state-of-the-art simulations is enormous and highly structured, so this integral cannot be performed analytically. In simulations of a single collision at the LHC, Z may have hundreds of millions of components. In practice, the simulations are often based on Monte Carlo techniques and generate samples (Xµ, Zµ) ∼ p(X, Z|y), from which the density can be estimated. The challenge is that if X is high-dimensional it is difficult to accurately estimate those densities. For example, naive histogram-based approaches do not scale to high dimensions, and kernel density estimation techniques are only trustworthy up to around five dimensions. Adding to the challenge is that the distributions have a large dynamic range, and the interesting physics often sits in the tails of the distributions.

The intractability of the likelihood implicitly defined by the simulations is a foundational problem not only for particle physics and cosmology, but for many other areas of science as well, including epidemiology and phylogenetics. This has motivated the development of so-called likelihood-free inference algorithms, which only require the ability to generate samples from the simulation in the forward mode.

One prominent technique is Approximate Bayesian Computation (ABC). In ABC one performs Bayesian inference using MCMC or a rejection sampling approach in which the likelihood is approximated by the probability p(ρ(X, x) < ε), where x is the observed data to be conditioned on, ρ(x′, x) is some distance metric between x and the output of the simulator x′, and ε is a tolerance parameter. As ε → 0, one recovers exact Bayesian inference; however, the efficiency of the procedure vanishes. One of the challenges for ABC, particularly for high-dimensional x, is the specification of a distance measure ρ(x′, x) that maintains reasonable acceptance efficiency without degrading the quality of the inference (Beaumont et al., 2002; Marin et al., 2012; Marjoram et al., 2003; Sisson and Fan, 2011; Sisson et al., 2007). This approach to estimating the likelihood is quite similar to the traditional practice in particle physics of using histograms or kernel density estimation to approximate p̂(x|y) ≈ p(x|y). In both cases, domain knowledge is required to identify useful summary statistics in order to reduce the dimensionality of the data. An interesting extension of the ABC technique utilizes universal probabilistic programming. In particular, a technique known as inference compilation is a sophisticated form of importance sampling, in which a neural network controls the random number generation in the probabilistic program to bias the simulation to produce outputs x′ closer to the observed x (Le et al., 2017).
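Rejection ABC itself is only a few lines. In the sketch below the simulator is a toy Gaussian with unknown mean, the prior is uniform, and the distance ρ compares a single summary statistic (the sample mean); all of these are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulator(y, n=100):
    """Toy forward model: n measurements with unknown mean y, unit noise."""
    return rng.normal(y, 1.0, size=n)

x_obs = simulator(1.7)                       # the "observed" data

def rho(x_prime, x):
    return abs(x_prime.mean() - x.mean())    # distance on a summary statistic

eps, accepted = 0.05, []
for _ in range(20000):
    y = rng.uniform(-5.0, 5.0)               # draw a parameter from the prior
    if rho(simulator(y), x_obs) < eps:       # keep it if the simulation is close
        accepted.append(y)

post = np.array(accepted)
print(f"accepted {post.size} draws; approximate posterior mean = {post.mean():.2f}")
```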
The term ABC is often used synonymously with the more general term likelihood-free inference; however, there are a number of other approaches that involve learning an approximate likelihood or likelihood ratio that is used as a surrogate for the intractable likelihood (ratio). For example, neural density estimation with autoregressive models and normalizing flows (Larochelle and Murray, 2011; Papamakarios et al., 2017; Rezende and Mohamed, 2015) has been used for this purpose and scales to higher-dimensional data (Cranmer and Louppe, 2016; Papamakarios et al., 2018). Alternatively, training a classifier to discriminate between x ∼ p(x|y) and x ∼ p(x|y′) can be used to estimate the likelihood ratio r̂(x|y, y′) ≈ p(x|y)/p(x|y′), which can be used for inference in either the frequentist or Bayesian paradigm (Brehmer et al., 2018c; Cranmer et al., 2015; Hermans et al., 2019).
thy to around 5-dimensions. Adding to the challenge is Hermans et al., 2019).
that the distributions have a large dynamic range, and
the interesting physics often sits in the tails of the distri-
butions. 2. Examples in particle physics
The intractability of the likelihood implicitly defined
by the simulations is a foundational problem not only Thousands of published results within particle physics,
for particle physics and cosmology, but many other areas including the discovery of the Higgs boson, involve statis-
of science as well including epidemiology and phyloge- tical inference based on a surrogate likelihood p̂(x|y) con-
netics. This has motivated the development of so-called structed with density estimation techniques applied to
likelihood-free inference algorithms, which only require synthetic datasets generated from the simulation. These
the ability to generate samples from the simulation in typically are restricted to one- or two-dimensional sum-
the forward mode. mary statistics or no features at all other than the num-
One prominent technique, is Approximate Bayesian ber of events observed. While the term likelihood-free
Computation (ABC). In ABC one performs Bayesian in- inference is relatively new, it is core to the methodology
ference using MCMC or a rejection sampling approach of experimental particle physics.
in which the likelihood is approximated by the proba- More recently, a suite of likelihood-free inference tech-
21

[Figure: schematic labels — parameter θ, latent z, observable x.]
UB/65S9k6Uq1wAq09FXUBfbiIM9griXrrBVoCFs3xvImvwXyGZ5DWsLFDqb61S+Bj41uqWCMbUs3jK77TV+9uSinqeTLu0ZWPJuP9A8jiJQ7l7IvEp5GMQyissoDlJp6O60IVGFJDWtelK0/cu+HIL8rz0lfMDaCowuYnZSvMJVQ88H+g9xh6jwd7PU7nRa6uA4hnDIvBZ2yo/2nMoV5S1wGEG0PljpeBfjh5FDleBvrZEspNqd4+lu8dHCf/uhiCf/GswLcSA72+W+T+2DbccT/iHr40nIcUtlf4tg282wbkUQWEuwGdggUZuvDbF0OIIJ7D6ZuDbeu7IaWPn39d5HgZAkgRMgpWyVCF0DFfXKqfEmp/5JRPxg8P3LCo2qsftnKr0ah/32qDnVh4Gq3+MKDWvkx9PpPQrGh0e5DWlFobfNnO9JaIOrpd+BNHtxWGPId1VOtYm99p1G+i6+PCjta4zt/bM9s/wnVvXlpjczI2/zTZ+/zz8ge6d3Y+2Pntzp0dc+fhzuc7v995uvNix333fzdu3rh34+Pbf7/9z9v/uv1vDX37rVLmNzuNz+3//B+t8VDE</latexit>
sha1_base64="HjZ6RxRDdZu139wdkhmGLAXlGyY=">AAAoEHicpVpbc9vGFWbSW6q4TdI+5mVdxbHlgDQBUbacjDJuLpN2Jm4cW3bSCpJmASyJHeLmxUImhaI/otMf07dOX/sP+tbH/oe+9JxdkMSV0jj0GAR2v/Ods2fPnj0LykkCnsrx+N9vvPmjH//kpz976+c7b9/4xS/fefe9X71I40y47LkbB7H43qEpC3jEnksuA/Z9IhgNnYB958w/x/7vLphIeRwdy2XCTkM6i/iUu1RC0/l7b//NdtiMR7nk88uEuzITrDg53SHExpZULgOWu3EUMRcFiqOT1I+FZBH59MhMpCF97s4N4gn66sgJqDu/+WBskGH+6Unq0oAdmaeFQaLYY8SDwdDIZUd2FAV0CTaxpKFnmkXdWmAES7JSRVOfJFRKJqKjPI7IfiJJPJ0SK5HFdksa6iKWCaUMZQyXCzdg5YCmPAiOXvlcMiPkEQ+zkKT8UtmOg8H7BtmCrOn0TdUUg2wIG3LSZ5L2yHpUzIdOkLFSfv180xzf7CLzueeB07ZY0vQ4j2jwmpYvSEAdFhTkiJxItpDaPMG8bqMa4H5rulhngrGo03NdaHTR6Q6gbxEFQrlpDMvC9Ykd0ZCRe8ReEh6RfGyMRiPDLBCCM3tSnY9Tckc9DpXQHqGS3BkblhuSIYi74R7Ji0+UmsWVKqyNisWGflGnHpoNbkXuxAuWojROywnGP/OMcjnU4wQYh+PRfWM8muyR4RAomw/W6L5+GLafEPhJt54V8XB/JdHxtFY07HpEbDkg37zSXZONu2pBDYb4ZtVplWxiEHN0QNCDZONCUqq0fpBKq6rSultV2qFTqwzimVgzVhcaEGKfJtuvkRHw1UGNJTWIgEkg+NBJJYdjzTSpM1lrB/RLmp2S5pWSaadcxfZ+UdEtalUn63O93+jFC55txIve/FrLd90P+Vm25hV2DP9ks5FtlrZm2yPMmzEdXCi/17RlcQ07rNeyY3FtG3wTjbBG262YvJYVqLRhhrUyo2GFBVasgvsqMzoUWU1FajE0x4qNoKcMfxVOimtVI6wXkeZIVebajhHXwOBqKqN3O8xsumUVrN226mAba/kTh0UeCdhUHpnjcgl3m6bFzJaYNT6tGfFbKEsklayy7irb46lackO14uARVhyB/DCyDnTC+sBW2PPxB7WFu9giPDQ30ou6WHW717L7o3aeKFPmhlTZAS4mtk9lLooPOpJIhbOWP8hHVaa1AkWJbCmyvTaZVaX6YYaVG9SaTWo2G+a1VoCfv7s7Ho3Vh7RvzPJm9xH59r//GQwGT87fe9+1vdjNQhZJN6BpemKOE3maUyE5VLXFjp2BAVBxUQgjuMXNLD3N1fmhILegxSOwlOF/JIlqrUrkNEzTZegAMoQATZt92NjVd5LJ6eFpzqMkg1re1YqmWUBkTPAwAmcCAUsnWMINdQUHW4nrU0FdKO7rWpRNxkIbvLODyw7uUrA1SuFgAcMmr7iEg0EQS5D0GEwMU+h8VZYWuZg5RT4eHRroS7yYh+P9h5Z5cP/wvvVgcgD7SFsS66q16Hgteg1JVbTWRasy44fAofiKnVtt6VjQaLbSbGrx+5MH44PDQ2v/wHo4Mc3DUvoK4XLEkxLdNXuGl+LUG3gv4zhIaxGTpzyLuFzUG2eCJlAbNlrDLJBcxK8MJ47nkjqpAZcsoGJhTIOYynoootKjKBYhDdSZKk8zZ8pnDe2BLkUdQeesTpBP2TIKkyHNZFxnTuH8+KEbh3AaTiHSIyod7gDkCSyObxJMr+lx/KRk8ZeJz6K0yDMRFFWWxJvCscKAnIYn7Hlq4FX5+agSHQZx4XhUbdZTbxDgqzZjFCrHhPCUxgmLFKGM3SMaBKdoBxOCTetjZFEWAn8I60ctUugSQz3lHgHfBkzWFwt4vOGOnLourJF0RWH7kH7qMpB+6iKLFO4YSqibFPd2cGhII0+rQ5GAw6yIJSQamHPYrD1wg1DvFtIRDpJHM2jVvaMQdhm1eL9iEROQNzEJAI8E1I4dsVclfW5jDGJsFCfmKTzByS51812zKOowuE8gXahILHLYPQVk5QSMd6AqmdsOn6VzntTaophHsD/JBhPEJ7/AudAafZiE3Jcy+fjePdU1isXsHkTzPTBCGyQlDPp7fvExmtVgwxu2sh7cMIMUSEVup9MpDXmwtFPIdonEkFdczUSlKSuOwuwLq5LU1fCwQHrpizDnTRs8Pp1uur1mN5sVec5GtjEbFX9p9HHIHDmHPtbuYy8zfkEDHB2Mh4fspTL0jxAfUUxCunRYXWDJEkDC6oKNDY1B7zRTJBSKrs9ctSxazoQY8uJpP4dO0EBBU9kShqln/aLK2eqlAmyHpcufqWCvs7x8+TKjALXVN9FfRQuzBq1QPbANbg3USFD/xF+m3E1XM14XhgGsYkr6Lg3yx8W5CqCOtYEDr2G/Kc47YMFM1GFfb6EUXpPSxnL0zq5pCz7z5V5DwBGbCPzsaZMuTCCG6nODV2eaJ8X58ZleZnnIU5yaljBrCn95pQxsV+nLQn2f2R6dzRhkQnhowKBGgXB7dl6S4ZMOjeewCTZmhF2oZZCzF62wXXXN233hqu9xu2+26vuq3SdXfcftvqmju+C7uZw3XWf5sJ2tkrI7aUtuulaS4IYvoOAS3MlUlifN9YZ7eXFiVaLkD4V9E/6VkULsEE4IAfsz2bXIOmyQlgnILJJfQCoBLrXXZHCoiUXD54nHhNYwhVKRqLwvOWRJiNjqk9VyX1MQc2QppW/bIp0yZ1ZVKgexs5bktCXIAjiQlXL63tIO/SwOvO4FDy2eOpzVY52oDi2R69Nba1YREWVFvxh0doiwJOVBHG2Ru6BiBeqQXzQkyxW96MJedmMvu7DLbuyyC3vRjb3owspurOy0l4m4Gz7WE/mYST/2GnPoUigq10XM5/hkl7VVHSjiQGyAT/GpG5hCsbjcIJ+px34oj+I6GBu64S7FH0uq1qrnHnsb4KcNsNpMmYu/NpEYglR0x7gKXyx6y9tV3sWmVkKKJJ4OqRM0qgqSdxZSyTYGcR0GsY0hvQ5Duo1BXoehXdZUGC6vw/CnFkMQw1yFMSSia/lxNSlKTG+tML2/jySD4rs5o2AdpgH4IvZNopPkwv6kPYZFIqDyakFvd2J5C3fOO4HzNnDeBZzStpk6hyG6Bb9sYi97SFtAnTw6bW1hF8Q2qvwtESimGGvafbbf7eBpnIkWdtKPXbaxyz5s03TAdnsEbUihKgi6DDH6hlidxerWu0/msIPeseyE753tF30a+8QnVfFJp7hSn/SqT66jvk98UhXvVq/Ee/yVkHtkJaxcd4u8YC6WSKpkghUseOv0Il/FF8zVtYj+sT9RwMXd4sQ9JViS2aoYw9eQ665mIYROUTT7V9HAdX8rF45QUU2uQ6Wuk+3WlYyim1Jck7LlNnXe6jfyQyDUfB9eaaHyn+J7uJUPL/vaPLg7wMt9Ze4DvD3Ey8PtipTZWz3xOnaLbsPFDzIcwvfrOFLHMIeKxmHzggloLA+ccyYi
fHseZqoD/9ylbB2qVjiBVp6KqoA+XMQZ7JVEvfUCLQGLZnjexNdgPsMzSGPY2KFU39oh8LHxLVWskTWp+vEV3+mrdzelFPU8GXfoyofj0f4BZPESh3L2ReLTSMYhFFZZwHITT8dVoRUYUkNa1aUrT9y74cgvyvPSF8wNoKjC5m/KVphLqHjgf0/vMfQe9/Z6nM6KXF17EE8ZFoNPWV//k5hDvaSuPQg3hsodLz39cPIocrz09LMFlJtSvX0s3zs4Tv5l0Qf/7GmBbyV6en23yP2RbbijbsRdfGk4Cylsr/BtG3i3DcijFRDuenQKFmTowq+f9yGCeAanbw62re/6lD5+9mWR46UPIEXIKFglQxVCx3x+qX5KqPyRUz4ePThww2LVvvphK7dqjfr3rSbYiYWn0eoPAyrti9TnUwnNika3B2lFqbXBl+1Mb4moo92FP3G0W2HIM1hHlY61+a1G/Sa6Oi7saIzr/N1ds/kjXPvmhTUyxyPz2/Huo0cD/Xlr8P7gN4M7A3PwYPBo8LvBk8Hzgfv2/27cvHH3xke3/3r777f/cfufGvrmG6XMrwe1z+1//R+6gFM9</latexit>
sha1_base64="N0otbztN+sjGy9xqdVKMRIGUlNA=">AAAoEHicpVrdctvGFVbSn6SK2ybtZW7WVRxbDkgTEGXLySjj5mfSzsSNY8tOWkLSLIAlsUP8ebGQSaHoQ3T6ML3r9LZv0Bfo9B1603N2QRK/lMahxyCw+53vnD179uxZUE4S8FSORv9+480f/fgnP33r7Z/tvnPj57/45bvv/epFGmfCZc/dOIjF9w5NWcAj9lxyGbDvE8Fo6ATsO2f+OfZ/d8FEyuPoRC4TdhrSWcSn3KUSms7fe+dvtsNmPMoln18m3JWZYMXkdJcQG1tSuQxY7sZRxFwUKI4nqR8LySLy6bGZSEP63J0bxBP01bETUHd+88HIIIP800nq0oAdm6eFQaLYY8SDwdDIZcd2FAV0CTaxpKFnmkXdWmAES7JSRVOfJFRKJqLjPI7IQSJJPJ0SK5HFdksa6iKWCaUMZQyXCzdg5YCmPAiOX/lcMiPkEQ+zkKT8UtmOg8H7BtmCrOn0TdUUg2wIG3LSZ5L2yHpUzAdOkLFSfv180xzd7CLzueeB07ZY0vQ4j2jwmpYvSEAdFhTkmEwkW0htnmBet1ENcL81XawzwVjU6bkuNLrodBfQt4gCodw0hmXh+sSOaMjIPWIvCY9IPjKGw6FhFgjBmZ1U5+OU3FGPAyW0T6gkd0aG5YZkAOJuuE/y4hOlZnGlCmujYrGhX9SpB2aDW5E78YKlKI3TMsH4Z55RLod6nADjYDS8b4yG430yGABl88Ea3tcPg/YTAj/p1rMiHhysJDqe1ooGXY+ILQfkm1e6a7xxVy2owRDfrDqtkk0MYg4PCXqQbFxISpXWD1JpVVVad6tKO3RqlUE8E2vG6kIDQuzTZAc1MgK+OqyxpAYRMAkEHzqp5GCkmcZ1JmvtgH5Js1PSvFIy7ZSr2N4vKrpFrepkfa73G714wbONeNGbX2v5rvshP8vWvMKO4U82G9lmaWu2fcK8GdPBhfL7TVsW17DDei07Fte2wTfRCGu43Yrxa1mBShtmWCszGlZYYMUquK8yo0OR1VSkFkNzrNgIesrwV+GkuFY1wnoRaY5UZa7tGHENDK6mMnq3w8ymW1bB2m2rDraRlp84LPJIwKby2ByVS7jbNC1mtsSs0WnNiN9CWSKpZJV1V9keT9WSG6gVB4+w4gjkh6F1qBPWB7bCno8+qC3cxRbhgbmRXtTFqtu9lj0YtvNEmTI3pMoOcDGxfSpzUXzQkUQqnLX8QT6qMq0VKEpkS5HttcmsKtUPM6zcoNZsUrPZMK+1Avz83b3RcKQ+pH1jljd7j8i3//3P228dPjl/733X9mI3C1kk3YCm6cQcJfI0p0JyqGqLXTsDA6DiohBGcIubWXqaq/NDQW5Bi0dgKcP/SBLVWpXIaZimy9ABZAgBmjb7sLGrb5LJ6dFpzqMkg1re1YqmWUBkTPAwAmcCAUsnWMINdQUHW4nrU0FdKO7rWpRNxkIbvLuLyw7uUrA1SuFgAcMmr7iEg0EQS5D0GEwMU+h8VZYWuZg5RT4aHhnoS7yYR6ODh5Z5eP/ovvVgfAj7SFsS66q16Ggteg1JVbTWRasyo4fAofiK3Vtt6VjQaLbSbGrx++MHo8OjI+vg0Ho4Ns2jUvoK4XLE4xLdNXuGl+LUG3gv4zhIaxGTpzyLuFzUG2eCJlAbNlrDLJBcxK8MJ47nkjqpAZcsoGJhTIOYynoootLjKBYhDdSZKk8zZ8pnDe2BLkUdQeesTpBP2TIKkwHNZFxnTuH8+KEbh3AaTiHSIyod7gDkCSyObxJMr+lJ/KRk8ZeJz6K0yDMRFFWWxJvCscKAnIYn7Hlq4FX5+bgSHQZx4XhUbdZTbxDgqzZjFCrHhPCUxgmLFKGM3WMaBKdoBxOCTetjZFEWAn8I60ctUugSAz3lHgHfBkzWFwt4vOGOnLourJF0RWH7kH7qMpB+6iKLFO4YSqibFPd2cGhII0+rQ5GAw6yIJSQamHPYrD1wg1DvFtIhDpJHM2jVvcMQdhm1eL9iEROQNzEJAI8E1K4dsVclfW5jDGJsFBPzFJ7gZJe6+Z5ZFHUY3CeQLlQkFjnsngKycgLGO1CVzG2Hz9I5T2ptUcwj2J9kgwnik1/gXGiNPkxC7kuZfHzvnuoaxmJ2D6L5HhihDZISBv09v/gYzWqw4Q1bWQ9umEEKpCK30+mUhjxY2ilku0RiyCuuZqLSlBVHYfaFVUnqanhYIL30RZjzpg0en0433V6zm82KPGdD25gNi780+jhkjpxDH2v3sZcZv6ABjg7Gw0P2Uhn6R4iPKCYhXTqsLrBkCSBhdcHGhsagd5opEgpF12euWhYtZ0IMefG0n0MnaKCgqWwJw9SzflHlbPVSAbbD0uXPVLDXWV6+fJlRgNrqm+ivooVZg1aoHtgGtwZqJKh/4i9T7qarGa8LwwBWMSV9lwb54+JcBVDH2sCB17DfFOcdsGAm6rCvt1AKr0lpYzl6Z8+0BZ/5cr8h4IhNBH72tEkXJhBD9bnBqzPNk+L85EwvszzkKU5NS5g1hb+8Uga2q/Rlob7PbI/OZgwyITw0YFCjQLg9Oy/J8EmHxnPYBBszwi7UMsjZi1bYrrrm7b5w1fe43Tdb9X3V7pOrvpN239TRXfDdXM6brrN80M5WSdmdtCU3XStJcMMXUHAJ7mQqy5PmesO9vJhYlSj5Q2HfhH9lpBA7hBNCwP5M9iyyDhukZQIyi+QXkEqAS+01GRxqYtHweeIxoTVMoVQkKu9LDlkSIrb6ZLXc1xTEHFlK6du2SKfMmVWVykHsrCU5bQmyAA5kpZy+t7RDP4sDr3vBQ4unDmf1WCeqQ0vk+vTWmlVERFnRLwadHSIsSXkQR1vkLqhYgTrkFw3JckUvurCX3djLLuyyG7vswl50Yy+6sLIbKzvtZSLuho/0RD5m0o+9xhy6FIrKdRHzOT7ZZW1VB4o4EBvgU3zqBqZQLC43yGfqsR/Ko7gOxoZuuEvxx5Kqteq5x94G+GkDrDZT5uKvTSSGIBXdMa7CF4ve8naVd7GplZAiiadD6gSNqoLknYVUso1BXIdBbGNIr8OQbmOQ12FolzUVhsvrMPypxRDEMFdhDInoWn5cTYoS01srTO/vI8mg+G7OKFiHaQC+iH2T6CS5sD9pj2GRCKi8WtDbnVjewp3zTuC8DZx3Aae0babOYYhuwS+b2Mse0hZQJ49OW1vYBbGNKn9LBIopxpp2nx10O3gaZ6KFHfdjl23ssg/bNB2w3R5BG1KoCoIuQ4y+IVZnsbr1HpA57KB3LDvh+2cHRZ/GPvFxVXzcKa7UJ73qk+uo7xMfV8W71SvxHn8l5B5ZCSvX3SIvmIslkiqZYAUL3jq9yFfxBXN1LaJ/7E8UcHG3mLinBEsyWxVj+Bpy3dUshNApiubgKhq4HmzlwhEqqvF1qNR1vN26klF0U4prUrbcps5b/UZ+CISa78MrLVT+U3wPt/Lh5UCbB3eHeLmvzH2At0d4ebhdkTJ7qydex27Rbbj4QYZD+H4dR+oY5lDROGxeMAGN5YFzzkSE
b8/DTHXgn7uUrQPVCifQylNRFdCHiziDvZKot16gJWDRDM+b+BrMZ3gGaQwbO5TqW7sEPja+pYo1siZVP77iO3317qaUop4n4w5d+WA0PDiELF7iUM6+SHwayTiEwioLWG7i6bgqtAJDakirunTliXs3HPlFeV76grkBFFXY/E3ZCnMJFQ/87+k9gd6T3l6P01mRq2sP4inDYvAp6+t/EnOol9S1B+HGULnjpacfTh5FjpeefraAclOqt4/lewfHyb8s+uCfPS3wrURPr+8WuT+0DXfYjbiLLw1nIYXtFb5tA++2AXm0AsJdj07Bggxd+PXzPkQQz+D0zcG29V2f0sfPvixyvPQBpAgZBatkqELohM8v1U8JlT9yykfDB4duWKzaVz9s5VatUf++1QQ7sfA0Wv1hQKV9kfp8KqFZ0ej2IK0otTb4sp3pLRF1tLvwJ452Kwx5Buuo0rE2v9Wo30RXx4UdjXGdv7tnNn+Ea9+8sIbmaGh+O9p79GhHf97eeX/nNzt3dsydBzuPdn6382Tn+Y77zv9u3Lxx98ZHt/96+++3/3H7nxr65hulzK93ap/b//o/y+dThw==</latexit>

r(x, z|✓)
arg m n L g r̂(x|✓)
<latexit sha1_base64="kQLQGJ4Xe+WRfsBc6Tu8qUqoqpg=">AAAcoXicpVltc9y2Eb6kb6n65rQfow9INXbtzOl0d5IsKR3PZJx40szYtSpLjltR0oDkksQcSVAAeL4Tw/6C/pp+bf9I/00XIE/i61nTnkY8EPs8i8VisVzw7CRkUo3H//no4x/9+Cc//dknP9/4xS9/9evfPPj0t28lT4UDZw4PuXhnUwkhi+FMMRXCu0QAjewQvrdnX2v593MQkvH4VC0TuIioHzOPOVRh19WDR5aChTJ6MpeK2bYAN8/E48WQ3JAfiKUCUPRJfvVgazwamw9pNyZlY2tQfo6vPv3MsVzupBHEygmplOeTcaIuMioUc0LIN6xUQkKdGfXhHJsxjUBeZMaQnDzEHpd4XOB/rIjprTIyGkm5jGxERlQFsinTnV2y81R5hxcZi5NUQewUA3lpSBQn2jvEZQIcFS6xQR3B0FbiBFRQR6EPa6MYm4aLwuCNjYfE+FqirbHEZcNpk/dMBSQJuUKmCx6uUMvPvp1n49HhUPtSXyaH492j6WT/6eHT6cHe/iTvYNphCrfU8S31HkxfAMR1apUzPkIdRl++8bDN5oLG/mrkSUF/uncw3j88nO7uT4/2JpPDkv0BcjnjvRLdtXpDV+qlH+q24jyUtYjJJEtjphb1Tl/QJGBOozdKQ8UEfz+0OZ8passhXtKQisXQCzlV9VDUgz6LuYhoKNkNXGQytT3mN0bHgA7AHdqCzqCuIPNgGUfJNk0Vr2uWXKhHDo9we0qM9Jgqm9kIOcbN8TrRu1Ge8uNSS7BMAohlnqUizKtaEtfDDTsMmKu3/EwO9dX4+VklOobEYQqq3cXSDwnqq3brKDSOifBO8gRio1Bx5xkNwwttBwgBXn2OEKcR6o9w/5hNiiKxXSy5S9C3Iaj6ZkGPN9yRUcfBPSJXKqwgoKrOYbObOmUhsQWaYRqSsBi3WxTR2C2G05SQ4aqIJSYaXHM5JC66QZhkJ0d6kiz2sbeQjiJMbmbzfgsxCBqaJIB6FKI2rBjel+ozS8egjo38fHKRmZwpnWxrkud1GLYTTBcmEvPMwrYV8wSNtzEpzyyb+XLGklpfzFnsoisamjA+2VyvRTFigIuQBUolX+7sGNGIC38Ho3kHjSgMUgon/Y7Nv9RmNbTpBqysRzf4mAKpyCzpeTRi4dKSmO0SpUO+84FQqKw4Smdf3JWkPgyLcq1eBSLKWNMGl3nendhtisHPswxG1tAf5X9vyBhmjoyhDNoyuE7ZnIZ6djgfFsG1MfSvGB8xJxFd2lAnLCFBJO6uVIA2RnunmSIJsZwAHLMtWs7EGHK516+jSNCogkrVIuPSQz/VOBuZCuchS5e/McFe13J9fZ1ShFrmmxRfeQtzC1qhemB3uFtggcThj4OlZI5crXidjBNYxZQKHBpmr/IrE0Ade0NPvIZ9nV91wEJf1GEv16gUblOlFYKnHm9NLMH8QD1pEGxxF4HPT5rqogRjqL42+mp7WZJfnV4W2yyLmNRL0yJDk/zigxx8XMnr3HxfWi71fcBMiDcNGNYoGG5vrkpl+q4IjTN8CDZWBOZmG2TwthW2K9GsLYtWsldtmb+SfduWqZXstC3z7EKE383tfCe6zLbb2SopxUmbeSdaMdEN32DBJZidmixPmvtNP8vz82klSv6cW5/jXxkpxIqY64bwA9maktuw0WpBYGZRbI6pBHWZZ02qqOKi4fPEBVGM4GGpSEzeVwyzJEZs9W7acl+TqHNkySqabUon53JaZWVIu2wxvRYRQkVXvKI9LRz6nIdu94bHHtecCeqxToygYGTFoaG1qhoRp3k/DYUdFEgkC3m8hjenYgXq4C8azHJHL7qwN93Ymy7sshu77MLOu7HzLqzqxqpOe0Hwbvi4WMhXoALuNtbQoVhU3hYxX+s7q6yt6kDBQ3EHPNF33UCJxeLyDvnG3PZDWczrYN3RDXeoxB1btdbc99jbAJ80wOZhCo4+/hKOQSq6Y9yEry56y+Yq7+quVkKKlT4dUjxv16sKknUWUsk6DeI+GsQ6DfI+GuQ6Deo+GtplTUXDzX00/K2lIeS4VhHHRHQvP64WxdCKRysu73exAiy+myuK1uk0gF/E+pwUSXJh/bE9h0UisPJqQf/QiWUt3BXrBM7awFkX0KNtM4scptEt+E0Te9OjtAUskkenrS3sgljDqv4WBYspgKbdl7vdDvZ4KlrYvX7sso1d9mGbpiO22yPaBolVQdhlyLBvitVVrD56d8kMn6CPp1bCnlzu5n0j9tH3qvS9TroZPukdPrnP8H30vSq9e3hD7/FXQnbIimxc95C8BUeXSKZkwh0sWOv0ot7zOThFLWIDnkWzxAAXX+TnzgXRJZllijFA9K2oWQhppxg1ux9Sg9fdtbr0DI2qvfuoMte99daVGkW3SnFPlS23mfNWv5GPUGGh79EHLTT+M/qO1urTl93CPGzt68tTY+6Bbh7qy9H6gYzZaz3xv9gtug0X/5fhGL4veWyOYTYVjcPmHAR2lgfOGYiYTEb7UWoE+v172bttevEEWrnLq4TicMFTfFYS89YLRwkh9vV5U78GC0CfQRrT1gIz9MMNgh9Lv6XiBbLGqh9fsb94d1OyqOsq3jFWtj0e7e5jFi9xmmfNk4DGikdYWKUhZBN9Oq6SVmBMDbI6VlF56mc3HvlFeV76BpwQiyrd/brsxbXEigf/e6SnKD3tlbqM+nlmrj2IE9DF4An0yY85w3rJXHsQDsfKXV965HjyyDN96ZHDAstNZd4+lu8dbDt7kffBn5/k+q1EjzRw8iwYWUNn1I34Qr809COKj1f8toa6tQ7I4hUQWz1jCghT7cKXZ32IkPt4+mZo222rb9BXb17kmb70AZSIgKJVKjIhdMpmN+anBCuOYywM9dvJbDw62HeifNUf0iUICUk2rXXaEOrOBtjmwi3Q41G9fyED5insNmqK/lBWBp3e4ct+KB6Jeoy2SP/E0e7FKfu4jyqCW/NbncWb6Oq8tKAxr6sHW5Pmj3DtxtvpaDIeTf4y3vrqsPyB7pPBZ4PfDx4PJoODwVeDPw2OB2cDZ/CPwT8H/xr8e3Nr87vN482TAvrxRyXnd4PaZ/P8v/QgrgE=</latexit>
sha1_base64="n4ECwTYdCfuLwJXCZl6+ybiKm1E=">AAAcoXicpVltb9y4Ed67vl3dt1z7pcD5A6+G0+Sw3uyu7cS5IkCQu+B6QNK4jpNLa9kGJY0kYiVRJqnN7urUX9Bf06/tHynQH9MhpbX1ujHaNaylOM8zHA6Ho6HWTkIm1Xj8748+/sEPf/Tjn3zy062f/fwXv/zVnU9//VbyVDjwxuEhF+9sKiFkMbxRTIXwLhFAIzuE7+zZV1r+3RyEZDw+VcsEziPqx8xjDlXYdXnnrqVgoYyezKVitifAzTNxbzEkK/I9sVQAit7PL+/sjEdj8yHtxqRs7Dz97eo/A/wcX376mWO53EkjiJUTUinPJuNEnWdUKOaEkG9ZqYSEOjPqwxk2YxqBPM+MITnZxR6XeFzgf6yI6a0yMhpJuYxsREZUBbIp051dsrNUeUfnGYuTVEHsFAN5aUgUJ9o7xGUCHBUusUEdwdBW4gRUUEehD2ujGJuGi8Lgra1dYnwt0dZY4rLhtMl7pgKShFwh0wUPV6jlZ9/Os/HoaKh9qS+To/H+4+nk8OHRw+mjg8NJ3sG0wxSuqeNr6i2YvgCI69QqZ/wYdRh9+dZum80Fjf31yJOC/vDg0fjw6Gi6fzh9fDCZHJXsD5DLGR+U6K7VG7pSL/1QtxXnoaxFTCZZGjO1qHf6giYBcxq9URoqJvj7oc35TFFbDvGShlQshl7IqaqHoh70ScxFREPJVnCeydT2mN8YHQM6AHdoCzqDuoLMg2UcJXs0VbyuWXKh7jo8wu0pMdJjqmxmI+QYN8erRO9GecqPSy3BMgkglnmWijCvaklcDzfsMGCu3vIzOdRX4+cnlegYEocpqHYXSz8kqK/araPQOCbCO8kTiI1CxZ0nNAzPtR0gBHj1OUKcRqg/wv1jNimKxF6x5C5B34ag6psFPd5wR0YdB/eIXKuwgoCqOofNVnXKQmILNMM0JGExbrcoorFbDKcpIcNVEUtMNLjmckhcdIMwyU6O9CRZ7GNvIR1FmNzM5v0GYhA0NEkA9ShEbVkxvC/VZ5aOQR0b+dnkPDM5UzrZziTP6zBsJ5guTCTmmYVtK+YJGm9jUp5ZNvPljCW1vpiz2EVXNDRhfLK5XotixAAXIQuUSr588MCIRlz4DzCaH6ARhUFK4aTfsfmX2qyGNt2AtfXoBh9TIBWZJT2PRixcWhKzXaJ0yHc+EAqVFUfp7Iu7ktSHYVGu1atARBlr2uAyz7sRu00x+HmWwcga+qP8bw0Zw8yRMZRBWwZXKZvTUM8O58MiuDKG/gXjI+Ykoksb6oQlJIjE3ZUK0MZo7zRTJCGWE4BjtkXLmRhDLvf6dRQJGlVQqVpkXHropxpnI1PhPGTp8tcm2Otarq6uUopQy3yT4itvYa5Ba1QP7AZ3DSyQOPxxsJTMkesVr5NxAuuYUoFDw+xlfmkCqGNv6InXsK/yyw5Y6Is67MUGlcJtqrRC8NS9nYklmB+o+w2CLW4i8NlJU12UYAzV10ZfbS9L8svTi2KbZRGTemlaZGiSn3+Qg48reZWb7wvLpb4PmAnxpgHDGgXD7fVlqUzfFaHxBh+CjRWBudkGGbxthe1aNGvLorXsZVvmr2XftGVqLTttyzy7EOF3czvfiC6yvXa2Skpx0mbeiNZMdMPXWHAJZqcmy5PmftPP8vxsWomSP+XW5/hXRgqxIua6IXxPdqbkOmy0WhCYWRSbYypBXeZZkyqquGj4PHFBFCN4WCoSk/cVwyyJEVu9m7bc1yTqHFmyimab0sm5mFZZGdIuWkyvRYRQ0TWvaE8Lhz7jodu94bHHNWeCeqwTIygYWXFoaK2qRsRp3k9DYQcFEslCHm/gzalYgzr4iwaz3NGLLuyqG7vqwi67scsu7LwbO+/Cqm6s6rQXBO+Gj4uFfAkq4G5jDR2KReV1EfOVvrPK2qoOFDwUN8ATfdcNlFgsLm+Qr81tP5TFvA7WHd1wh0rcsVVrzX2PvQ3wSQNsHqbg6OMv4RikojvGTfjqordsrvOu7molpFjp0yHF83a9qiBZZyGVbNIgbqNBbNIgb6NBbtKgbqOhXdZUNKxuo+GvLQ0hx7WKOCaiW/lxvSiGVjxacXm/jRVg8d1cUbROpwH8ItbnpEiSC+sP7TksEoGVVwv6+04sa+EuWSdw1gbOuoAebZtZ5DCNbsFXTeyqR2kLWCSPTltb2AWxhlX9LQoWUwBNuy/2ux3s8VS0sAf92GUbu+zDNk1HbLdHtA0Sq4Kwy5Bh3xSrq1h99O6TGT5B702thN2/2M/7RuyjH1TpB510M3zSO3xym+H76AdVevfwht7jr4Q8IGuycd0ueQuOLpFMyYQ7WLDW6UW953NwilrEBjyLZokBLr7Iz5xzoksyyxRjgOhrUbMQ0k4xavY/pAav+xt16RkaVQe3UWWuB5utKzWKbpXilipbbjPnrX4j76LCQt/dD1po/Gf0Pd6oT1/2C/OwdagvD425j3TzSF8ebx7ImL3RE/+L3aLbcPF/GY7h+4LH5hhmU9E4bM5BYGd54JyBiMlkdBilRqDfv5e9e6YXT6CVu7xKKA4XPMVnJTFvvXCUEGJfnzf1a7AA9BmkMW0tMEPvbhH8WPotFS+QNVb9+Ir9xbubkkVdV/GOsbK98Wj/ELN4idM8a54ENFY8wsIqDSGb6NNxlbQGY2qQ1bGKylM/u/HIL8rz0tfghFhU6e5XZS+uJVY8+N8jPUXpaa/UZdTPM3PtQZyALgZPoE9+zBnWS+bag3A4Vu760iPHk0ee6UuPHBZYbirz9rF872Db2fO8D/7sJNdvJXqkgZNnwcgaOqNuxBf6paEfUXy84rc11K1NQBavgdjqGVNAmGoXvnjThwi5j6dvhrZdt/oGffn6eZ7pSx9AiQgoWqUiE0KnbLYyPyVYcRxjYajfTmbj0aNDJ8rX/SFdgpCQZNNapw2h7myAbS7cAj0e1fsXMmCewm6jpugPZWXQ6Q2+7IfikajHaIv0TxztXpyyj/uoIrg2v9VZvImuzksLGvO6vLMzaf4I1268nY4m49Hkz+Odp0eD4vPJ4LPB7wb3BpPBo8HTwR8Hx4M3A2fw98E/Bv8c/Gt7Z/vb7ePtkwL68Ucl5zeD2mf77L/dBa+v</latexit>
sha1_base64="yfNfOs7ZyWqkNT8RpYmkj2bn/mA=">AAAcoXicpVltb9zGEb6kb4n65rRfCsQfNhXs2sHpfHeSLCmFAcOJkQawa0WWHLeiJCzJIbk4kkvtLs93Ylj0B/TX5Gv7Rwr0x3R2yZP4ehbaE8Rb7jzP7Ozs7HCWZychk2o8/vcHH/7oxz/56c8++njj57/45a9+feeT37yRPBUOnDg85OKtTSWELIYTxVQIbxMBNLJD+M6efanl381BSMbjY7VM4Cyifsw85lCFXRd37lsKFsroyVwqZlsC3DwTDxZDckW+J5YKQNGH+cWdzfFobD6k3ZiUjc2nv7v6z8d//+HZ4cUnnzqWy500glg5IZXydDJO1FlGhWJOCPmGlUpIqDOjPpxiM6YRyLPMGJKTe9jjEo8L/I8VMb1VRkYjKZeRjciIqkA2ZbqzS3aaKm//LGNxkiqInWIgLw2J4kR7h7hMgKPCJTaoIxjaSpyACuoo9GFtFGPTcFEYvLFxjxhfS7Q1lrhsOG3yjqmAJCFXyHTBwxVq+dm382w82h9qX+rLZH+8fTCd7D7efzzd29md5B1MO0zhmjq+pt6C6QuAuE6tcsYHqMPoyzfutdlc0NhfjTwp6I939sa7+/vT7d3pwc5ksl+y30MuZ7xTortWb+hKvfRD3Vach7IWMZlkaczUot7pC5oEzGn0RmmomODvhjbnM0VtOcRLGlKxGHohp6oeinrQJzEXEQ0lu4KzTKa2x/zG6BjQAbhDW9AZ1BVkHizjKNmiqeJ1zZILdd/hEW5PiZEeU2UzGyGHuDleJXo3ymN+WGoJlkkAscyzVIR5VUvierhhhwFz9ZafyaG+Gj8/qUTHkDhMQbW7WPohQX3Vbh2FxjER3kmeQGwUKu48oWF4pu0AIcCrzxHiNEL9Ee4fs0lRJLaKJXcJ+jYEVd8s6PGGOzLqOLhH5EqFFQRU1TlsdlWnLCS2QDNMQxIW43aLIhq7xXCaEjJcFbHERINrLofERTcIk+zkSE+SxT72FtJRhMnNbN6vIQZBQ5MEUI9C1IYVw7tSfWbpGNSxkZ9OzjKTM6WTbU7yvA7DdoLpwkRinlnYtmKeoPE2JuWZZTNfzlhS64s5i110RUMTxieb67UoRgxwEbJAqeSLR4+MaMSF/wij+REaURikFE76LZt/oc1qaNMNWFmPbvAxBVKRWdLzaMTCpSUx2yVKh3znA6FQWXGUzr64K0l9GBblWr0KRJSxpg0u87wbsdsUg59nGYysoT/K/9aQMcwcGUMZtGVwmbI5DfXscD4sgktj6F8wPmJOIrq0oU5YQoJI3F2pAG2M9k4zRRJiOQE4Zlu0nIkx5HKvX0eRoFEFlapFxqWHfqpxNjIVzkOWLn9tgr2u5fLyMqUItcw3Kb7yFuYatEL1wG5w18ACicMfBkvJHLla8ToZJ7CKKRU4NMxe5hcmgDr2hp54Dfsqv+iAhb6ow16sUSncpkorBE892JxYgvmBetgg2OImAp8dNdVFCcZQfW301fayJL84Pi+2WRYxqZemRYYm+fl7Ofi4kpe5+T63XOr7gJkQbxowrFEw3F5flMr0XREaJ/gQbKwIzM02yOBNK2xXollbFq1kL9syfyX7ui1TK9lxW+bZhQi/m9v5RnSebbWzVVKKkzbzRrRiohu+woJLMDs1WZ4095t+luen00qU/Dm3PsO/MlKIFTHXDeF7sjkl12Gj1YLAzKLYHFMJ6jLPmlRRxUXD54kLohjBw1KRmLyvGGZJjNjq3bTlviZR58iSVTTblE7O+bTKypB23mJ6LSKEiq54RXtaOPQZD93uDY89rjkT1GOdGEHByIpDQ2tVNSJO834aCjsokEgW8ngNb07FCtTBXzSY5Y5edGGvurFXXdhlN3bZhZ13Y+ddWNWNVZ32guDd8HGxkC9BBdxtrKFDsai8LmK+1HdWWVvVgYKH4gZ4pO+6gRKLxeUN8rW57YeymNfBuqMb7lCJO7ZqrbnvsbcBPmqAzcMUHH38JRyDVHTHuAlfXfSWzVXe1V2thBQrfTqkeN6uVxUk6yykknUaxG00iHUa5G00yHUa1G00tMuaioar22j4a0tDyHGtIo6J6FZ+XC2KoRWPVlzeb2IFWHw3VxSt02kAv4j1GSmS5ML6Y3sOi0Rg5dWC/qETy1q4C9YJnLWBsy6gR9tmFjlMo1vwqyb2qkdpC1gkj05bW9gFsYZV/S0KFlMATbvPt7sd7PFUtLA7/dhlG7vswzZNR2y3R7QNEquCsMuQYd8Uq6tYffRukxk+QR9MrYQ9PN/O+0bso+9U6TuddDN80jt8cpvh++g7VXr38Ibe46+EPCIrsnHdPfIGHF0imZIJd7BgrdOLesfn4BS1iA14Fs0SA1x8np86Z0SXZJYpxgDR16JmIaSdYtRsv08NXrfX6tIzNKp2bqPKXHfWW1dqFN0qxS1Vttxmzlv9Rt5HhYW++++10PjP6DtYq09ftgvzsLWrL4+NuXu6ua8vB+sHMmav9cT/YrfoNlz8X4Zj+L7gsTmG2VQ0DptzENhZHjhnIGIyGe1GqRHo9+9l75bpxRNo5S6vEorDBU/xWUnMWy8cJYTY1+dN/RosAH0GaUxbC8zQ9zYIfiz9looXyBqrfnzF/uLdTcmirqt4x1jZ1ni0vYtZvMRpnjVPAhorHmFhlYaQTfTpuEpagTE1yOpYReWpn9145BfleekrcEIsqnT3q7IX1xIrHvzvkR6j9LhX6jLq55m59iCOQBeDR9AnP+QM6yVz7UE4HCt3femR48kjz/SlRw4LLDeVeftYvnew7ex53gd/dpTrtxI90sDJs2BkDZ1RN+Jz/dLQjyg+XvHbGurWOiCLV0Bs9YwpIEy1C1+c9CFC7uPpm6Ft162+QV++fp5n+tIHUCICilapyITQMZtdmZ8SrDiOsTDUbyez8Whv14nyVX9IlyAkJNm01mlDqDsbYJsLt0CPR/X+hQyYp7DbqCn6Q1kZdHqDL/uheCTqMdoi/RNHuxen7OM+qgiuzW91Fm+iq/PSgsa8Lu5sTpo/wrUbb6ajyXg0+Xa8+XR/UHw+Gnw6+P3gwWAy2Bs8HfxpcDg4GTiDfwx+GPxz8K+7m3e/uXt496iAfvhByfntoPa5e/pfaBOxLA==</latexit>

t(x, z|✓)
<latexit sha1_base64="GuTW9do3dzdd+I9ZbAmjhvNlqcg=">AAAcoXicpVltc9y2Eb6kb6n65rQfow9INXbtzOl0d5IsKR3PZJx40szYtSpLjltR0oDkksQcSVAAeL4Tw/6C/pp+bf9I/00XIE/i61nTnkY8EPs8i8VisVzw7CRkUo3H//no4x/9+Cc//dknP9/4xS9/9evfPPj0t28lT4UDZw4PuXhnUwkhi+FMMRXCu0QAjewQvrdnX2v593MQkvH4VC0TuIioHzOPOVRh19WDR5aChTJ6MpeK2bYAN8/U48WQ3JAfiKUCUPRJfvVgazwamw9pNyZlY2tQfo6vPv3MsVzupBHEygmplOeTcaIuMioUc0LIN6xUQkKdGfXhHJsxjUBeZMaQnDzEHpd4XOB/rIjprTIyGkm5jGxERlQFsinTnV2y81R5hxcZi5NUQewUA3lpSBQn2jvEZQIcFS6xQR3B0FbiBFRQR6EPa6MYm4aLwuCNjYfE+FqirbHEZcNpk/dMBSQJuUKmCx6uUNPPwrfzbDw6HGpf6svkcLx7NJ3sPz18Oj3Y25/kHUw7TOGWOr6l3oPpC4C4Tq1yxkeow+jLNx622VzQ2F+NPCnoT/cOxvuHh9Pd/enR3mRyWLI/QC5nvFeiu1Zv6Eq99EPdVpyHshYxmWRpzNSi3ukLmgTMafRGaaiY4O+HNuczRW05xEsaUrEYeiGnqh6KetBnMRcRDSW7gYtMprbH/MboGNABuENb0BnUFWQeLOMo2aap4nXNkgv1yOERbk+JkR5TZTMbIce4OV4nejfKU35cagmWSQCxzLNUhHlVS+J6uGGHAXP1lp/Job4aPz+rRMeQOExBtbtY+iFBfdVuHYXGMRHeSZ5AbBQq7jyjYXih7QAhwKvPEeI0Qv0R7h+zSVEktosldwn6NgRV3yzo8YY7Muo4uEfkSoUVBFTVOWx2U6csJLZAM0xDEhbjdosiGrvFcJoSMlwVscREg2suh8RFNwiT7ORIT5LFPvYW0lGEyc1s3m8hBkFDkwRQj0LUhhXD+1J9ZukY1LGRn08uMpMzpZNtTfK8DsN2gunCRGKeWdi2Yp6g8TYm5ZllM1/OWFLrizmLXXRFQxPGJ5vrtShGDHARskCp5MudHSMaceHvYDTvoBGFQUrhpN+x+ZfarIY23YCV9egGH1MgFZklPY9GLFxaErNdonTIdz4QCpUVR+nsi7uS1IdhUa7Vq0BEGWva4DLPuxO7TTH4eZbByBr6o/zvDRnDzJExlEFbBtcpm9NQzw7nwyK4Nob+FeMj5iSiSxvqhCUkiMTdlQrQxmjvNFMkIZYTgGO2RcuZGEMu9/p1FAkaVVCpWmRceuinGmcjU+E8ZOnyNybY61qur69TilDLfJPiK29hbkErVA/sDncLLJA4/HGwlMyRqxWvk3ECq5hSgUPD7FV+ZQKoY2/oidewr/OrDljoizrs5RqVwm2qtELw1OOtiSWYH6gnDYIt7iLw+UlTXZRgDNXXRl9tL0vyq9PLYptlEZN6aVpkaJJffJCDjyt5nZvvS8ulvg+YCfGmAcMaBcPtzVWpTN8VoXGGD8HGisDcbIMM3rbCdiWatWXRSvaqLfNXsm/bMrWSnbZlnl2I8Lu5ne9El9l2O1slpThpM+9EKya64RssuASzU5PlSXO/6Wd5fj6tRMmfc+tz/CsjhVgRc90QfiBbU3IbNlotCMwsis0xlaAu86xJFVVcNHyeuCCKETwsFYnJ+4phlsSIrd5NW+5rEnWOLFlFs03p5FxOq6wMaZctptciQqjoile0p4VDn/PQ7d7w2OOaM0E91okRFIysODS0VlUj4jTvp6GwgwKJZCGP1/DmVKxAHfxFg1nu6EUX9qYbe9OFXXZjl13YeTd23oVV3VjVaS8I3g0fFwv5ClTA3cYaOhSLytsi5mt9Z5W1VR0oeCjugCf6rhsosVhc3iHfmNt+KIt5Haw7uuEOlbhjq9aa+x57G+CTBtg8TMHRx1/CMUhFd4yb8NVFb9lc5V3d1UpIsdKnQ4rn7XpVQbLOQipZp0HcR4NYp0HeR4Ncp0HdR0O7rKlouLmPhr+1NIQc1yrimIju5cfVohha8WjF5f0uVoDFd3NF0TqdBvCLWJ+TIkkurD+257BIBFZeLegfOrGshbtincBZGzjrAnq0bWaRwzS6Bb9pYm96lLaARfLotLWFXRBrWNXfomAxBdC0+3K328EeT0ULu9ePXbaxyz5s03TEdntE2yCxKgi7DBn2TbG6itVH7y6Z4RP08dRK2JPL3bxvxD76XpW+10k3wye9wyf3Gb6Pvleldw9v6D3+SsgOWZGN6x6St+DoEsmUTLiDBWudXtR7PgenqEVswLNolhjg4ov83LkguiSzTDEGiL4VNQsh7RSjZvdDavC6u1aXnqFRtXcfVea6t966UqPoVinuqbLlNnPe6jfyESos9D36oIXGf0bf0Vp9+rJbmIetfX15asw90M1DfTlaP5Axe60n/he7Rbfh4v8yHMP3JY/NMcymonHYnIPAzvLAOQMRk8loP0qNQL9/L3u3TS+eQCt3eZVQHC54is9KYt564SghxL4+b+rXYAHoM0hj2lpghn64QfBj6bdUvEDWWPXjK/YX725KFnVdxTvGyrbHo919zOIlTvOseRLQWPEIC6s0hGyiT8dV0gqMqUFWxyoqT/3sxiO/KM9L34ATYlGlu1+XvbiWWPHgf4/0FKWnvVKXUT/PzLUHcQK6GDyBPvkxZ1gvmWsPwuFYuetLjxxPHnmmLz1yWGC5qczbx/K9g21nL/I++POTXL+V6JEGTp4FI2vojLoRX+iXhn5E8fGK39ZQt9YBWbwCYqtnTAFhql348qwPEXIfT98Mbbtt9Q366s2LPNOXPoASEVC0SkUmhE7Z7Mb8lGDFcYyFoX47mY1HB/tOlK/6Q7oEISHJprVOG0Ld2QDbXLgFejyq9y9kwDyF3UZN0R/KyqDTO3zZD8UjUY/RFumfONq9OGUf91FFcGt+q7N4E12dlxY05nX1YGvS/BGu3Xg7HU3Go8lfxltfHZY/0H0y+Gzw+8HjwWRwMPhq8KfB8eBs4Az+Mfjn4F+Df29ubX63ebx5UkA//qjk/G5Q+2ye/xcse64D</latexit>
sha1_base64="BMHppIHEY/0iNNG8Fhs+Hd4wokw=">AAAcoXicpVltb9y4Ed67vl3dt1z7pcD5A6+G0+Sw3uyu7cS5IkCQu+B6QNK4jpNLa9kGJY0kYiVRJqnN7urUX9Bf06/tHynQH9MhpbX1ujHaNaylOM8zHA6Ho6HWTkIm1Xj8748+/sEPf/Tjn3zy062f/fwXv/zVnU9//VbyVDjwxuEhF+9sKiFkMbxRTIXwLhFAIzuE7+zZV1r+3RyEZDw+VcsEziPqx8xjDlXYdXnnrqVgoYyezKVitifAzTN1bzEkK/I9sVQAit7PL+/sjEdj8yHtxqRs7Dz97eo/A/wcX376mWO53EkjiJUTUinPJuNEnWdUKOaEkG9ZqYSEOjPqwxk2YxqBPM+MITnZxR6XeFzgf6yI6a0yMhpJuYxsREZUBbIp051dsrNUeUfnGYuTVEHsFAN5aUgUJ9o7xGUCHBUusUEdwdBW4gRUUEehD2ujGJuGi8Lgra1dYnwt0dZY4rLhtMl7pgKShFwh0wUPV6jpZ+HbeTYeHQ21L/VlcjTefzydHD48ejh9dHA4yTuYdpjCNXV8Tb0F0xcAcZ1a5Ywfow6jL9/abbO5oLG/HnlS0B8ePBofHh1N9w+njw8mk6OS/QFyOeODEt21ekNX6qUf6rbiPJS1iMkkS2OmFvVOX9AkYE6jN0pDxQR/P7Q5nylqyyFe0pCKxdALOVX1UNSDPom5iGgo2QrOM5naHvMbo2NAB+AObUFnUFeQebCMo2SPporXNUsu1F2HR7g9JUZ6TJXNbIQc4+Z4lejdKE/5caklWCYBxDLPUhHmVS2J6+GGHQbM1Vt+Jof6avz8pBIdQ+IwBdXuYumHBPVVu3UUGsdEeCd5ArFRqLjzhIbhubYDhACvPkeI0wj1R7h/zCZFkdgrltwl6NsQVH2zoMcb7sio4+AekWsVVhBQVeew2apOWUhsgWaYhiQsxu0WRTR2i+E0JWS4KmKJiQbXXA6Ji24QJtnJkZ4ki33sLaSjCJOb2bzfQAyChiYJoB6FqC0rhvel+szSMahjIz+bnGcmZ0on25nkeR2G7QTThYnEPLOwbcU8QeNtTMozy2a+nLGk1hdzFrvoioYmjE8212tRjBjgImSBUsmXDx4Y0YgL/wFG8wM0ojBIKZz0Ozb/UpvV0KYbsLYe3eBjCqQis6Tn0YiFS0titkuUDvnOB0KhsuIonX1xV5L6MCzKtXoViChjTRtc5nk3YrcpBj/PMhhZQ3+U/60hY5g5MoYyaMvgKmVzGurZ4XxYBFfG0L9gfMScRHRpQ52whASRuLtSAdoY7Z1miiTEcgJwzLZoORNjyOVev44iQaMKKlWLjEsP/VTjbGQqnIcsXf7aBHtdy9XVVUoRaplvUnzlLcw1aI3qgd3groEFEoc/DpaSOXK94nUyTmAdUypwaJi9zC9NAHXsDT3xGvZVftkBC31Rh73YoFK4TZVWCJ66tzOxBPMDdb9BsMVNBD47aaqLEoyh+troq+1lSX55elFssyxiUi9NiwxN8vMPcvBxJa9y831hudT3ATMh3jRgWKNguL2+LJXpuyI03uBDsLEiMDfbIIO3rbBdi2ZtWbSWvWzL/LXsm7ZMrWWnbZlnFyL8bm7nG9FFttfOVkkpTtrMG9GaiW74GgsuwezUZHnS3G/6WZ6fTStR8qfc+hz/ykghVsRcN4Tvyc6UXIeNVgsCM4tic0wlqMs8a1JFFRcNnycuiGIED0tFYvK+YpglMWKrd9OW+5pEnSNLVtFsUzo5F9MqK0PaRYvptYgQKrrmFe1p4dBnPHS7Nzz2uOZMUI91YgQFIysODa1V1Yg4zftpKOygQCJZyOMNvDkVa1AHf9Fgljt60YVddWNXXdhlN3bZhZ13Y+ddWNWNVZ32guDd8HGxkC9BBdxtrKFDsai8LmK+0ndWWVvVgYKH4gZ4ou+6gRKLxeUN8rW57YeymNfBuqMb7lCJO7ZqrbnvsbcBPmmAzcMUHH38JRyDVHTHuAlfXfSWzXXe1V2thBQrfTqkeN6uVxUk6yykkk0axG00iE0a5G00yE0a1G00tMuaiobVbTT8taUh5LhWEcdEdCs/rhfF0IpHKy7vt7ECLL6bK4rW6TSAX8T6nBRJcmH9oT2HRSKw8mpBf9+JZS3cJesEztrAWRfQo20zixym0S34qold9ShtAYvk0WlrC7sg1rCqv0XBYgqgaffFfreDPZ6KFvagH7tsY5d92KbpiO32iLZBYlUQdhky7JtidRWrj959MsMn6L2plbD7F/t534h99IMq/aCTboZPeodPbjN8H/2gSu8e3tB7/JWQB2RNNq7bJW/B0SWSKZlwBwvWOr2o93wOTlGL2IBn0SwxwMUX+ZlzTnRJZpliDBB9LWoWQtopRs3+h9TgdX+jLj1Do+rgNqrM9WCzdaVG0a1S3FJly23mvNVv5F1UWOi7+0ELjf+Mvscb9enLfmEetg715aEx95FuHunL480DGbM3euJ/sVt0Gy7+L8MxfF/w2BzDbCoah805COwsD5wzEDGZjA6j1Aj0+/eyd8/04gm0cpdXCcXhgqf4rCTmrReOEkLs6/Omfg0WgD6DNKatBWbo3S2CH0u/peIFssaqH1+xv3h3U7Ko6yreMVa2Nx7tH2IWL3GaZ82TgMaKR1hYpSFkE306rpLWYEwNsjpWUXnqZzce+UV5XvoanBCLKt39quzFtcSKB/97pKcoPe2Vuoz6eWauPYgT0MXgCfTJjznDeslcexAOx8pdX3rkePLIM33pkcMCy01l3j6W7x1sO3ue98GfneT6rUSPNHDyLBhZQ2fUjfhCvzT0I4qPV/y2hrq1CcjiNRBbPWMKCFPtwhdv+hAh9/H0zdC261bfoC9fP88zfekDKBEBRatUZELolM1W5qcEK45jLAz128lsPHp06ET5uj+kSxASkmxa67Qh1J0NsM2FW6DHo3r/QgbMU9ht1BT9oawMOr3Bl/1QPBL1GG2R/omj3YtT9nEfVQTX5rc6izfR1XlpQWNel3d2Js0f4dqNt9PRZDya/Hm88/RoUHw+GXw2+N3g3mAyeDR4Ovjj4HjwZuAM/j74x+Cfg39t72x/u328fVJAP/6o5PxmUPtsn/0XFWCvsQ==</latexit>
sha1_base64="WhfdL6HX9XmJ7WXSrLsNwv05onI=">AAAcoXicpVltb9zGEb6kb4n65rRfCsQfNhXs2sHpfHeSLCmFAcOJkQawa0WWHLeiJCzJIbk4kkvtLs93Ylj0B/TX5Gv7Rwr0x3R2yZP4ehbaE8Rb7jzP7Ozs7HCWZychk2o8/vcHH/7oxz/56c8++njj57/45a9+feeT37yRPBUOnDg85OKtTSWELIYTxVQIbxMBNLJD+M6efanl381BSMbjY7VM4Cyifsw85lCFXRd37lsKFsroyVwqZlsC3DxTDxZDckW+J5YKQNGH+cWdzfFobD6k3ZiUjc2nv7v6z8d//+HZ4cUnnzqWy500glg5IZXydDJO1FlGhWJOCPmGlUpIqDOjPpxiM6YRyLPMGJKTe9jjEo8L/I8VMb1VRkYjKZeRjciIqkA2ZbqzS3aaKm//LGNxkiqInWIgLw2J4kR7h7hMgKPCJTaoIxjaSpyACuoo9GFtFGPTcFEYvLFxjxhfS7Q1lrhsOG3yjqmAJCFXyHTBwxVq+ln4dp6NR/tD7Ut9meyPtw+mk93H+4+nezu7k7yDaYcpXFPH19RbMH0BENepVc74AHUYffnGvTabCxr7q5EnBf3xzt54d39/ur07PdiZTPZL9nvI5Yx3SnTX6g1dqZd+qNuK81DWIiaTLI2ZWtQ7fUGTgDmN3igNFRP83dDmfKaoLYd4SUMqFkMv5FTVQ1EP+iTmIqKhZFdwlsnU9pjfGB0DOgB3aAs6g7qCzINlHCVbNFW8rllyoe47PMLtKTHSY6psZiPkEDfHq0TvRnnMD0stwTIJIJZ5loowr2pJXA837DBgrt7yMznUV+PnJ5XoGBKHKah2F0s/JKiv2q2j0DgmwjvJE4iNQsWdJzQMz7QdIAR49TlCnEaoP8L9YzYpisRWseQuQd+GoOqbBT3ecEdGHQf3iFypsIKAqjqHza7qlIXEFmiGaUjCYtxuUURjtxhOU0KGqyKWmGhwzeWQuOgGYZKdHOlJstjH3kI6ijC5mc37NcQgaGiSAOpRiNqwYnhXqs8sHYM6NvLTyVlmcqZ0ss1Jntdh2E4wXZhIzDML21bMEzTexqQ8s2zmyxlLan0xZ7GLrmhowvhkc70WxYgBLkIWKJV88eiREY248B9hND9CIwqDlMJJv2XzL7RZDW26ASvr0Q0+pkAqMkt6Ho1YuLQkZrtE6ZDvfCAUKiuO0tkXdyWpD8OiXKtXgYgy1rTBZZ53I3abYvDzLIORNfRH+d8aMoaZI2Mog7YMLlM2p6GeHc6HRXBpDP0LxkfMSUSXNtQJS0gQibsrFaCN0d5ppkhCLCcAx2yLljMxhlzu9esoEjSqoFK1yLj00E81zkamwnnI0uWvTbDXtVxeXqYUoZb5JsVX3sJcg1aoHtgN7hpYIHH4w2ApmSNXK14n4wRWMaUCh4bZy/zCBFDH3tATr2Ff5RcdsNAXddiLNSqF21RpheCpB5sTSzA/UA8bBFvcROCzo6a6KMEYqq+NvtpeluQXx+fFNssiJvXStMjQJD9/LwcfV/IyN9/nlkt9HzAT4k0DhjUKhtvri1KZvitC4wQfgo0VgbnZBhm8aYXtSjRry6KV7GVb5q9kX7dlaiU7bss8uxDhd3M734jOs612tkpKcdJm3ohWTHTDV1hwCWanJsuT5n7Tz/L8dFqJkj/n1mf4V0YKsSLmuiF8Tzan5DpstFoQmFkUm2MqQV3mWZMqqrho+DxxQRQjeFgqEpP3FcMsiRFbvZu23Nck6hxZsopmm9LJOZ9WWRnSzltMr0WEUNEVr2hPC4c+46HbveGxxzVngnqsEyMoGFlxaGitqkbEad5PQ2EHBRLJQh6v4c2pWIE6+IsGs9zRiy7sVTf2qgu77MYuu7Dzbuy8C6u6sarTXhC8Gz4uFvIlqIC7jTV0KBaV10XMl/rOKmurOlDwUNwAj/RdN1Bisbi8Qb42t/1QFvM6WHd0wx0qccdWrTX3PfY2wEcNsHmYgqOPv4RjkIruGDfhq4vesrnKu7qrlZBipU+HFM/b9aqCZJ2FVLJOg7iNBrFOg7yNBrlOg7qNhnZZU9FwdRsNf21pCDmuVcQxEd3Kj6tFMbTi0YrL+02sAIvv5oqidToN4BexPiNFklxYf2zPYZEIrLxa0D90YlkLd8E6gbM2cNYF9GjbzCKHaXQLftXEXvUobQGL5NFpawu7INawqr9FwWIKoGn3+Xa3gz2eihZ2px+7bGOXfdim6Yjt9oi2QWJVEHYZMuybYnUVq4/ebTLDJ+iDqZWwh+fbed+IffSdKn2nk26GT3qHT24zfB99p0rvHt7Qe/yVkEdkRTauu0fegKNLJFMy4Q4WrHV6Ue/4HJyiFrEBz6JZYoCLz/NT54zokswyxRgg+lrULIS0U4ya7fepwev2Wl16hkbVzm1UmevOeutKjaJbpbilypbbzHmr38j7qLDQd/+9Fhr/GX0Ha/Xpy3ZhHrZ29eWxMXdPN/f15WD9QMbstZ74X+wW3YaL/8twDN8XPDbHMJuKxmFzDgI7ywPnDERMJqPdKDUC/f697N0yvXgCrdzlVUJxuOApPiuJeeuFo4QQ+/q8qV+DBaDPII1pa4EZ+t4GwY+l31LxAllj1Y+v2F+8uylZ1HUV7xgr2xqPtncxi5c4zbPmSUBjxSMsrNIQsok+HVdJKzCmBlkdq6g89bMbj/yiPC99BU6IRZXuflX24lpixYP/PdJjlB73Sl1G/Twz1x7EEehi8Aj65IecYb1krj0Ih2Plri89cjx55Jm+9MhhgeWmMm8fy/cOtp09z/vgz45y/VaiRxo4eRaMrKEz6kZ8rl8a+hHFxyt+W0PdWgdk8QqIrZ4xBYSpduGLkz5EyH08fTO07brVN+jL18/zTF/6AEpEQNEqFZkQOmazK/NTghXHMRaG+u1kNh7t7TpRvuoP6RKEhCSb1jptCHVnA2xz4Rbo8ajev5AB8xR2GzVFfygrg05v8GU/FI9EPUZbpH/iaPfilH3cRxXBtfmtzuJNdHVeWtCY18WdzUnzR7h24810NBmPJt+ON5/uD4rPR4NPB78fPBhMBnuDp4M/DQ4HJwNn8I/BD4N/Dv51d/PuN3cP7x4V0A8/KDm/HdQ+d0//C6BfsS4=</latexit>
g
approx mate
augmented data ke hood
rat o ✓i
Simulation Machine Learning Inference

F gure 2 A schemat c o mach ne earn ng based approaches to ke hood- ree n erence n wh ch the s mu at on prov des tra n ng
data or a neura network that s subsequent y used as a surrogate or the ntractab e ke hood dur ng n erence Reproduced
rom (Brehmer et a 2018b)

niques based on neural networks have been developed and applied to models for physics beyond the standard model expressed in terms of effective field theory (EFT) (Brehmer et al., 2018a,b). EFTs provide a systematic expansion of the theory around the standard model that is parametrized by coefficients for quantum mechanical operators, which play the role of y in this setting. One interesting observation in this work is that even though the likelihood and likelihood ratio are intractable, the joint likelihood ratio r(x, z|y, y′) and the joint score t(x, z|y) = ∇y log p(x, z|y) are tractable and can be used to augment the training data (see Fig. 2) and dramatically improve the sample efficiency of these techniques (Brehmer et al., 2018c).
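To illustrate how the tractable joint ratio is exploited, the following minimal sketch (our own illustration, not code from the works cited) regresses a surrogate r̂(x) on joint ratios. The arrays x and r_joint are made-up stand-ins for simulator output at a fixed pair of parameter points (θ0, θ1); the key property is that the squared-error minimizer, in expectation over simulations at θ1, is the intractable marginal ratio r(x|θ0, θ1).

import torch
import torch.nn as nn

# Stand-ins for simulator output: observables x and joint ratios
# r(x, z | theta0, theta1), assumed drawn from the simulator at theta1.
x = torch.randn(1000, 5)
r_joint = torch.exp(0.3 * x.sum(dim=1, keepdim=True))

r_hat = nn.Sequential(nn.Linear(5, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(r_hat.parameters(), lr=1e-3)

for step in range(500):
    opt.zero_grad()
    # Squared error against the joint ratio; its minimizer over functions
    # of x alone is the marginal likelihood ratio r(x | theta0, theta1).
    loss = ((r_hat(x) - r_joint) ** 2).mean()
    loss.backward()
    opt.step()

The joint score t(x, z|y) can be used to augment this regression loss in an analogous way, which is where much of the reported gain in sample efficiency comes from.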
In addition, an inference compilation technique has been applied to inference of a tau-lepton decay. This proof-of-concept effort required developing a probabilistic programming protocol that can be integrated into existing domain-specific simulation codes such as SHERPA and GEANT4 (Baydin et al., 2018; Casado et al., 2017). This approach provides Bayesian inference on the latent variables p(Z|X = x) and deep interpretability, as the posterior corresponds to a distribution over complete stack-traces of the simulation, allowing any aspect of the simulation to be inspected probabilistically.

Another technique for likelihood-free inference that was motivated by the challenges of particle physics is known as adversarial variational optimization (AVO) (Louppe et al., 2017b). AVO parallels generative adversarial networks, where the generative model is no longer a neural network but instead the domain-specific simulation. Instead of optimizing the parameters of the network, the goal is to optimize the parameters of the simulation so that the generated data matches the target data distribution. The main challenge is that, unlike neural networks, most scientific simulators are not differentiable. To get around this problem, a variational optimization technique is used, which provides a differentiable surrogate loss function. This technique is being investigated for tuning the parameters of the simulation, which is a computationally intensive task in which Bayesian optimization has also recently been used (Ilten et al., 2017).

3. Examples in Cosmology

Within Cosmology, early uses of ABC include constraining the thick disk formation scenario of the Milky Way (Robin et al., 2014) and inferences on the rate of morphological transformation of galaxies at high redshift (Cameron and Pettitt, 2012), which aimed to track the Hubble parameter evolution from type Ia supernova measurements. These experiences motivated the development of tools such as CosmoABC to streamline the application of the methodology in cosmological applications (Ishida et al., 2015).

More recently, likelihood-free inference methods based on machine learning have also been developed, motivated by the experiences in cosmology. To confront the challenges of ABC for high-dimensional observations X, a data compression strategy was developed that learns summary statistics that maximize the Fisher information on the parameters (Alsing et al., 2018; Charnock et al., 2018). The learned summary statistics approximate the sufficient statistics for the implicit likelihood in a small neighborhood of some nominal or fiducial parameter value. This approach is closely connected to that of (Brehmer et al., 2018c). Recently, these approaches have been extended to learn summary statistics that are robust to systematic uncertainties (Alsing and Wandelt, 2019).
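For orientation, the basic ABC rejection loop that these ML-based refinements build on can be sketched in a few lines. This is our own toy illustration: the simulator, the prior, the tolerance, and the two-component summary are all hypothetical placeholders, whereas in the works cited above the summary function would itself be learned to preserve Fisher information.

import numpy as np

rng = np.random.default_rng(4)

def simulator(theta):
    # Hypothetical stand-in for an expensive forward simulation
    return theta + rng.standard_normal(100)

def summary(x):
    # Placeholder for a learned, information-preserving compression
    return np.array([x.mean(), x.std()])

s_obs = summary(simulator(1.5))           # "observed" data

accepted = []
for _ in range(20000):                    # ABC rejection sampling
    theta = rng.uniform(-5, 5)            # draw from the prior
    if np.linalg.norm(summary(simulator(theta)) - s_obs) < 0.2:
        accepted.append(theta)            # keep parameters matching the summary

print("approximate posterior mean:", np.mean(accepted))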

E. Generative Models

An active area in machine learning research involves using unsupervised learning to train a generative model to produce a distribution that matches some empirical distribution. This includes generative adversarial networks (GANs) (Goodfellow et al., 2014), variational autoencoders (VAEs) (Kingma and Welling, 2013; Rezende et al., 2014), autoregressive models, and models based on normalizing flows (Larochelle and Murray, 2011; Papamakarios et al., 2017; Rezende and Mohamed, 2015).

Interestingly, the same issue that motivates likelihood-free inference, the intractability of the density implicitly defined by the simulator, also appears in generative adversarial networks (GANs). If the density of a GAN were tractable, GANs would be trained via standard maximum likelihood, but because their density is intractable a trick was needed. The trick is to introduce an adversary – i.e., the discriminator network used to classify the samples from the generative model and samples taken from the target distribution. The discriminator is effectively estimating the likelihood ratio between the two distributions, which provides a direct connection to the approaches to likelihood-free inference based on classifiers (Cranmer and Louppe, 2016).
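This connection can be made concrete. A discriminator d(x) trained with the usual cross-entropy loss to separate samples of a density p0 from samples of a density p1 converges to d(x) = p1(x)/(p0(x) + p1(x)), so the likelihood ratio is recovered as d/(1 − d). The sketch below is our own minimal illustration, with two Gaussian samples standing in for model and data.

import torch
import torch.nn as nn

x0 = torch.randn(2000, 1)                 # samples from p0 (e.g., generative model)
x1 = torch.randn(2000, 1) + 1.0           # samples from p1 (e.g., target data)
x = torch.cat([x0, x1])
y = torch.cat([torch.zeros(2000, 1), torch.ones(2000, 1)])

d = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1), nn.Sigmoid())
opt = torch.optim.Adam(d.parameters(), lr=1e-2)
bce = nn.BCELoss()

for step in range(300):                   # standard classifier training
    opt.zero_grad()
    loss = bce(d(x), y)
    loss.backward()
    opt.step()

with torch.no_grad():
    xt = torch.linspace(-3.0, 4.0, 8).reshape(-1, 1)
    ratio = d(xt) / (1.0 - d(xt))         # estimate of p1(x) / p0(x)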
Operationally, these models play a similar role as traditional scientific simulators, though traditional simulation codes also provide a causal model for the underlying data generation process grounded in physical principles. However, traditional scientific simulators are often very slow, as the distributions of interest emerge from a low-level microphysical description. For example, simulating collisions at the LHC involves atomic-level physics of ionization and scintillation. Similarly, simulations in cosmology involve gravitational interactions among enormous numbers of massive objects and may also include complex feedback processes that involve radiation, star formation, etc. Therefore, learning a fast approximation to these simulations is of great value.
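Schematically, such a fast surrogate can be trained with the standard adversarial recipe. The sketch below is our own illustration, with a toy two-dimensional distribution standing in for expensive simulator output, and shows the alternating discriminator/generator updates of the kind used in the applications cited in the next paragraph.

import torch
import torch.nn as nn

real = torch.randn(4096, 2) * torch.tensor([1.0, 0.3]) + 2.0  # toy "simulator" output
G = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))
D = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())
g_opt = torch.optim.Adam(G.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(1000):
    z = torch.randn(256, 8)                        # latent noise
    fake = G(z)
    idx = torch.randint(0, real.shape[0], (256,))
    # Discriminator update: separate simulator samples from generated ones
    d_opt.zero_grad()
    d_loss = bce(D(real[idx]), torch.ones(256, 1)) + \
             bce(D(fake.detach()), torch.zeros(256, 1))
    d_loss.backward()
    d_opt.step()
    # Generator update: move generated samples toward the target distribution
    g_opt.zero_grad()
    g_loss = bce(D(fake), torch.ones(256, 1))
    g_loss.backward()
    g_opt.step()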
Within particle physics, early work in this direction included GANs for energy deposits from particles in calorimeters (Paganini et al., 2018a,b), which is being studied by the ATLAS collaboration (ATLAS Collaboration, 2018). In Cosmology, generative models have been used to learn the simulation for cosmological structure formation (Rodríguez et al., 2018). In an interesting hybrid approach, a deep neural network was used to predict the non-linear structure formation of the universe as a residual from a fast physical simulation based on linear perturbation theory (He et al., 2018).

In other cases, well-motivated simulations do not always exist or are impractical. Nevertheless, having a generative model for such data can be valuable for the purpose of calibration. An illustrative example in this direction comes from (Ravanbakhsh et al., 2016), see Fig. 3. The authors point out that the next generation of cosmological surveys for weak gravitational lensing relies on accurate measurements of the apparent shapes of distant galaxies. However, shape measurement methods require a precise calibration to meet the accuracy requirements of the science analysis. This calibration process is challenging as it requires large sets of high quality galaxy images, which are expensive to collect. Therefore, the GAN enables an implicit generalization of the parametric bootstrap.

F. Outlook and Challenges

While particle physics and cosmology have a long history in utilizing machine learning methods, the scope of topics that machine learning is being applied to has grown significantly. Machine learning is now seen as a key strategy for confronting the challenges of the upgraded High-Luminosity LHC (Albertsson et al., 2018; Apollinari et al., 2015) and is influencing the strategies for future experiments in both cosmology and particle physics (Ntampaka et al., 2019). One area in particular that has gathered a great deal of attention at the LHC is the challenge of identifying the tracks left by charged particles in high-luminosity environments (Farrell et al., 2018), which has been the focus of a recent Kaggle challenge.

In almost all areas where machine learning is being applied to physics problems, there is a desire to incorporate domain knowledge in the form of hierarchical structure, compositional structure, geometrical structure, or symmetries that are known to exist in the data or the data-generation process. Recently, there has been a spate of work from the machine learning community in this direction (Bronstein et al., 2017; Cohen and Welling, 2016; Cohen et al., 2018; Cohen et al., 2019; Kondor, 2018; Kondor et al., 2018; Kondor and Trivedi, 2018). These developments are being followed closely by physicists, and are already being incorporated into contemporary research in this area.

IV. MANY-BODY QUANTUM MATTER

The intrinsic probabilistic nature of quantum mechanics makes physical systems in this realm an effectively infinite source of big data, and a very appealing playground for ML applications. A paradigmatic example of this probabilistic nature is the measurement process in quantum physics. The position r of an electron orbiting around the nucleus can only be approximately inferred from measurements. An infinitely precise classical measurement device can only be used to record the outcome of a specific observation of the electron position. Ultimately, a complete characterization of the measurement process is given by the wave function

Figure 3 Samples from the GALAXY-ZOO dataset versus generated samples using a conditional generative adversarial network. Each synthetic image is a 128×128 colored image (here inverted) produced by conditioning on a set of features y ∈ [0, 1]^37. The pair of observed and generated images in each column correspond to the same y value. For details on these crowd-sourced y features see Willett et al. (2013). These instances are selected from the test-set and were unavailable to the model during training. Reproduced from (Ravanbakhsh et al., 2016).

Ψ(r), whose square modulus ultimately defines the probability P(r) = |Ψ(r)|² of observing the electron at a given position in space. While in the specific case of a single electron both theoretical predictions and experimental inference for P(r) are efficiently performed, the situation becomes dramatically more complex in the case of many quantum particles. For example, the probability of observing the positions of N electrons, P(r1, . . . , rN), is an intrinsically high-dimensional function that can seldom be exactly determined for N much larger than a few tens. The exponential hardness in estimating P(r1, . . . , rN) is itself a direct consequence of estimating the complex-valued many-body amplitudes Ψ(r1, . . . , rN) and is commonly referred to as the quantum many-body problem. The quantum many-body problem manifests itself in a variety of cases. These most chiefly include the theoretical modeling and simulation of complex quantum systems – most materials and molecules – for which only approximate solutions are often available. Other very important manifestations of the quantum many-body problem include an understanding and analysis of experimental outcomes, especially in relation with complex phases of matter. In the following, we discuss some of the ML applications focused on alleviating some of the challenging theoretical and experimental problems posed by the quantum many-body problem.

A. Neural-Network quantum states

Neural-network quantum states (NQS) are a representation of the many-body wave-function in terms of artificial neural networks (ANNs) (Carleo and Troyer, 2017). A commonly adopted choice is to parameterize wave-function amplitudes as a feed-forward neural network:

Ψ(r) = g^(L)(W^(L) . . . g^(2)(W^(2) g^(1)(W^(1) r))),    (3)

with notation similar to that introduced in Eq. (2). Early works have mostly concentrated on shallow networks, and most notably Restricted Boltzmann Machines (RBM) (Smolensky, 1986). RBMs with hidden units in {±1} and without biases on the visible units formally correspond to FFNNs of depth L = 2 and activations g^(1)(x) = log cosh(x), g^(2)(x) = exp(x). An important difference with respect to RBM applications for unsupervised learning of probability distributions is that, when used as NQS, RBM states are typically taken to have complex-valued weights (Carleo and Troyer, 2017). Deeper architectures have been consistently studied and introduced in more recent work, for example NQS based on fully-connected and convolutional deep networks (Choo et al., 2018; Saito, 2018; Sharir et al., 2019); see Fig. 4 for a schematic example. A motivation to use deep FFNN networks, apart from the practical success of deep learning in industrial applications, also comes from more general theoretical arguments in quantum physics. For example, it has been shown that deep NQS can sustain entanglement more efficiently than RBM states (Levine et al., 2019; Liu et al., 2017a). Other extensions of the NQS representation concern the representation of mixed states described by density matrices, rather than pure wave-functions. In this context, it is possible to define positive-definite RBM parametrizations of the density matrix (Torlai and Melko, 2018).

One of the specific challenges emerging in the quantum domain is imposing physical symmetries in the NQS representations. In the case of a periodic arrangement of matter, spatial symmetries can be imposed using convolutional architectures similar to what is used in image classification tasks (Choo et al., 2018; Saito, 2018; Sharir et al., 2019). Selecting high-energy states in different symmetry sectors has also been demonstrated (Choo et al., 2018). While spatial symmetries have analogous counterparts in other ML applications, satisfying more involved quantum symmetries often needs a deep rethinking of ANN architectures. The most notable case in this sense is the exchange symmetry. For bosons, this amounts to imposing the wave-function to be permutationally invariant with respect to exchange of particle indices.
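To make the RBM parametrization and the symmetrized constructions discussed above concrete, the following sketch (our own illustration with arbitrary random parameters, not the implementation of the works cited) evaluates a complex-valued RBM amplitude for a spin-1/2 configuration and imposes translation invariance on a one-dimensional ring by averaging over lattice shifts. The sizes N and M are hypothetical.

import numpy as np

rng = np.random.default_rng(0)
N, M = 16, 32                                   # visible spins, hidden units
a = 0.01 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))  # visible biases
b = 0.01 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))  # hidden biases
W = 0.01 * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))

def psi_rbm(sigma):
    # Complex RBM amplitude: exp(a . sigma) * prod_j 2 cosh(b_j + W_j . sigma)
    theta = b + W @ sigma
    return np.exp(a @ sigma) * np.prod(2.0 * np.cosh(theta))

def psi_translation_symmetric(sigma):
    # Impose translation invariance on a ring by averaging over all shifts
    return np.mean([psi_rbm(np.roll(sigma, k)) for k in range(N)])

sigma = rng.choice([-1.0, 1.0], size=N)          # a random spin configuration
print(psi_rbm(sigma), psi_translation_symmetric(sigma))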

[Figure 4 panels: (a), (b); a convolutional, complex, deep network maps the spin configuration σ through successive channels (12, 10, 8, 6, 4, 2) to a summed output Ξ_CNN(σ); bottom-panel colour scale 0.0–0.3; Heisenberg 2D.]

Figure 4 (Top) Example of a shallow convolutional neural network used to represent the many-body wave-function of a system of spin 1/2 particles on a square lattice. (Bottom) Filters of a fully-connected convolutional RBM found in the variational learning of the ground-state of the two-dimensional Heisenberg model, adapted from (Carleo and Troyer, 2017).

The Bose-Hubbard model has been adopted as a benchmark for ANN bosonic architectures, with state-of-the-art results having been obtained (Saito, 2017, 2018; Saito and Kato, 2017; Teng, 2018). The most challenging symmetry is, however, certainly the fermionic one. In this case, the NQS representation needs to encode the antisymmetry of the wave-function (exchanging two particle positions, for example, leads to a minus sign). Different approaches have been explored, mostly expanding on existing variational ansatz for fermions. A symmetric RBM wave-function correcting an antisymmetric correlator part has been used to study two-dimensional interacting lattice fermions (Nomura et al., 2017). Other approaches have tackled the fermionic symmetry problem using a backflow transformation of Slater determinants (Luo and Clark, 2018), or directly working in first quantization (Han et al., 2018a).
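A minimal sketch of the general strategy shared by these constructions (our own illustration with random placeholder orbitals, not the ansatz of the works cited): an antisymmetric Slater-determinant reference is multiplied by a permutation-symmetric learned correction, so that the overall amplitude remains antisymmetric under particle exchange.

import numpy as np

rng = np.random.default_rng(1)
n_orb, n_part = 8, 4                          # hypothetical lattice sites / particles
phi = rng.standard_normal((n_orb, n_part))    # stand-in single-particle orbitals

def slater(positions):
    # Antisymmetric reference: determinant of orbitals at the occupied sites
    return np.linalg.det(phi[positions, :])

def symmetric_correction(positions, v):
    # Permutation-symmetric factor; a real NQS would use a neural network here
    return np.exp(np.sum(v[positions]))

v = 0.1 * rng.standard_normal(n_orb)          # toy variational parameters
pos = np.array([0, 2, 5, 7])                  # occupied sites for 4 fermions
amp = slater(pos) * symmetric_correction(pos, v)
print(amp)

Since swapping two entries of pos exchanges two rows of the determinant, the amplitude changes sign while the symmetric factor is unchanged.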
The situation for fermions is certainly the most challenging for ML approaches at the moment, owing to the specific nature of the symmetry. On the applications side, NQS representations have been used so far along three main different research lines.

1. Representation theory

An active area of research concerns the general expressive power of NQS, as also compared to other families of variational states. Theoretical activity on the representation properties of NQS seeks to understand how large, and how deep, neural networks describing interesting interacting quantum systems should be. In connection with the first numerical results obtained with RBM states, entanglement was soon identified as a possible candidate for the expressive power of NQS. RBM states, for example, can efficiently support volume-law scaling (Deng et al., 2017b), with a number of variational parameters scaling only polynomially with system size. In this direction, the language of tensor networks has been particularly helpful in clarifying some of the properties of NQS (Chen et al., 2018b; Pastori et al., 2018). A family of NQS based on RBM states has been shown to be equivalent to a certain family of variational states known as correlator-product-states (Clark, 2018; Glasser et al., 2018a). The question of determining how large are the respective classes of quantum states belonging to the NQS form, Eq. (3), and to computationally efficient tensor networks is, however, still open. Exact representations of several intriguing phases of matter, including topological states and stabilizer codes (Deng et al., 2017a; Glasser et al., 2018a; Huang and Moore, 2017; Kaubruegger et al., 2018; Lu et al., 2018; Zheng et al., 2018), have also been obtained in closed RBM form. Not surprisingly, given its shallow depth, RBM architectures are also expected to have limitations, on general grounds. Specifically, it is not in general possible to write all possible physical states in terms of compact RBM states (Gao and Duan, 2017). In order to lift the intrinsic limitations of RBMs, and efficiently describe a very large family of physical states, it is necessary to introduce deep Boltzmann Machines (DBM) with two hidden layers (Gao and Duan, 2017). Similar network constructions have been introduced also as a possible theoretical framework, alternative to the standard path-integral representation of quantum mechanics (Carleo et al., 2018).

2. Learning from data

Parallel to the activity on understanding the theoretical properties of NQS, a family of studies in this field is concerned with the problem of understanding how hard it is, in practice, to learn a quantum state from numerical data. This can be realized using either synthetic data (for example coming from numerical simulations) or directly from experiments.

This line of research has been explored in the supervised learning setting, to understand how well NQS can represent states that are not easily expressed (in closed analytic form) as ANN. The goal is then to train a NQS network |Ψ⟩ to represent, as close as possible, a certain target state |Φ⟩ whose amplitudes can be efficiently computed. This approach has been successfully used to learn ground-states of fermionic, frustrated, and bosonic Hamiltonians (Cai and Liu, 2018). Those represent interesting study cases, since the sign/phase structure of the target wave-functions can pose a challenge to standard activation functions used in FFNN. Along the same lines, supervised approaches have been proposed to learn random matrix product states wave-functions both with shallow NQS (Borin and Abanin, 2019), and with generalized NQS including a computationally treatable DBM form (Pastori et al., 2018). While in the latter case these studies have revealed efficient strategies to perform the learning, in the former case hardness in learning some random MPS has been shown.
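A toy version of the supervised strategy described above (our own, deliberately simplified: real-valued, with a positive target wave-function so that log-amplitudes are well defined) fits a shallow log cosh network to target log-amplitudes by plain gradient descent.

import numpy as np

rng = np.random.default_rng(2)
N, M, S = 10, 8, 500
sigma = rng.choice([-1.0, 1.0], size=(S, N))        # training configurations
log_phi = -0.5 * np.abs(sigma.sum(axis=1))          # toy target log-amplitudes

W = 0.01 * rng.standard_normal((M, N))              # shallow real-valued NQS
b = np.zeros(M)

for step in range(500):                             # plain gradient descent
    theta = sigma @ W.T + b                         # (S, M) hidden activations
    log_psi = np.log(np.cosh(theta)).sum(axis=1)
    resid = log_psi - log_phi                       # regression residual
    t = np.tanh(theta)
    W -= 0.01 * (resid[:, None] * t).T @ sigma / S  # d loss / d W
    b -= 0.01 * (resid[:, None] * t).mean(axis=0)   # d loss / d b

print("mean squared log-amplitude error:", np.mean(resid ** 2))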

random MPS has been shown. At present, it is speculated that this hardness originates from the entanglement structure of the random MPS; however, it is unclear whether this is related to the hardness of the NQS optimization landscape or to an intrinsic limitation of shallow NQS.
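To make the supervised setting concrete, the following minimal sketch (our illustration, not code from the cited works) fits an RBM-style ansatz log Ψ_θ(σ) = Σ_j log cosh(b_j + W_j·σ) to the log-amplitudes of a known, positive target state by plain gradient descent; exact enumeration of the 2^N basis states stands in for the sampling needed at realistic system sizes.

```python
# Illustrative only: fit an RBM-style ansatz to a known positive target
# state by least squares on the log-amplitudes (exact enumeration, N = 8).
import numpy as np

rng = np.random.default_rng(0)
N, M, eta = 8, 16, 0.05                      # spins, hidden units, step size
sigma = np.array([[1 - 2*int(c) for c in f"{k:0{N}b}"] for k in range(2**N)])
phi = rng.random(2**N); phi /= np.linalg.norm(phi)   # target amplitudes
target = np.log(phi)

W = 0.01*rng.standard_normal((M, N)); b = np.zeros(M)
for step in range(2000):
    theta = sigma @ W.T + b                  # hidden-unit pre-activations
    logpsi = np.log(np.cosh(theta)).sum(1)   # log of ansatz amplitudes
    resid = logpsi - target
    t = np.tanh(theta)                       # d log cosh / d theta
    b -= eta*(resid[:, None]*t).mean(0)      # gradient of the squared loss
    W -= eta*(resid[:, None]*t).T @ sigma / len(sigma)

logpsi = np.log(np.cosh(sigma @ W.T + b)).sum(1)
psi = np.exp(logpsi - logpsi.max()); psi /= np.linalg.norm(psi)
print("overlap with target:", abs(psi @ phi))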
Besides supervised learning of given quantum states, data-driven approaches with NQS have largely concentrated on unsupervised settings. In this framework, only measurements from some target state |Φ⟩ or density matrix are available, and the goal is to reconstruct the full state, in NQS form, using such measurements. In the simplest setting, one is given a data set of M measurements r^(1), …, r^(M) distributed according to the Born-rule prescription P(r) = |Φ(r)|², where P(r) is to be reconstructed. In cases when the wave-function is positive definite, or when only measurements in a certain basis are provided, reconstructing P(r) with standard unsupervised learning approaches is enough to reconstruct all the available information on the underlying quantum state Φ. This approach has been demonstrated, for example, for ground-states of stoquastic Hamiltonians (Torlai et al., 2018) using RBM-based generative models. An approach based on deep VAE generative models has also been demonstrated for a family of quantum states that are classically hard to sample from (Rocchetto et al., 2018), for which network depth has been shown to be beneficial for compression.

In the more general setting, the problem is to reconstruct a generic quantum state, either pure or mixed, using measurements in more than a single basis of quantum numbers. Those are especially necessary to reconstruct also the complex phases of the quantum state. This corresponds to a well-known problem in quantum information, known as quantum state tomography, for which specific NQS approaches have been introduced (Carrasquilla et al., 2019; Torlai et al., 2018; Torlai and Melko, 2018). Those are discussed in more detail in the dedicated section V.A, also in connection with other ML techniques used for this task.

3. Variational Learning

Finally, one of the main applications of NQS representations is in the context of variational approximations for many-body quantum problems. The goal of these approaches is, for example, to approximately solve the Schrödinger equation using a NQS representation for the wave-function. In this case, the problem of finding the ground state of a given quantum Hamiltonian H is formulated in variational terms as the problem of learning NQS weights W minimizing E(W) = ⟨Ψ(W)|H|Ψ(W)⟩/⟨Ψ(W)|Ψ(W)⟩. This is achieved using a learning scheme based on variational Monte Carlo optimization (Carleo and Troyer, 2017). Within this family of applications, no external data representative of the quantum state is given, thus they typically demand a larger computational burden than supervised and unsupervised learning schemes for NQS.

Experiments on a variety of spin (Choo et al., 2018; Deng et al., 2017a; Glasser et al., 2018a; Liang et al., 2018), bosonic (Choo et al., 2018; Saito, 2017, 2018; Saito and Kato, 2017), and fermionic (Han et al., 2018a; Luo and Clark, 2018; Nomura et al., 2017) models have shown that results competitive with existing state-of-the-art approaches can be obtained. In some cases, improvements over existing variational results have been demonstrated, most notably for two-dimensional lattice models (Carleo and Troyer, 2017; Luo and Clark, 2018; Nomura et al., 2017) and for topological phases of matter (Glasser et al., 2018a; Kaubruegger et al., 2018).

Other NQS applications concern the solution of the time-dependent Schrödinger equation (Carleo and Troyer, 2017; Czischek et al., 2018; Fabiani and Mentink, 2019; Schmitt and Heyl, 2018). In these applications, one uses the time-dependent variational principle of Dirac and Frenkel (Dirac, 1930; Frenkel, 1934) to learn the optimal time evolution of the network weights. This can be suitably generalized also to open, dissipative quantum systems, for which a variational solution of the Lindblad equation can be realized (Hartmann and Carleo, 2019; Nagy and Savona, 2019; Vicentini et al., 2019; Yoshioka and Hamazaki, 2019).

In the great majority of the variational applications discussed here, the learning schemes used are typically higher-order techniques than standard SGD approaches. The stochastic reconfiguration (SR) approach (Becca and Sorella, 2017; Sorella, 1998) and its generalization to the time-dependent case (Carleo et al., 2012) have proven particularly suitable for variational learning of NQS. The SR scheme can be seen as a quantum analogue of the natural-gradient method for learning probability distributions (Amari, 1998), and builds on the intrinsic geometry associated with the neural-network parameters. More recently, in an effort to use deeper and more expressive networks than those initially adopted, learning schemes building on first-order techniques have been used more consistently (Kochkov and Clark, 2018; Sharir et al., 2019). These constitute two different philosophies of approaching the same problem. On one hand, early applications focused on small networks learned with very accurate but expensive training techniques. On the other hand, later approaches have focused on deeper networks and cheaper – but also less accurate – learning techniques. Combining the two philosophies in a computationally efficient way is one of the open challenges in the field.
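A minimal version of this variational loop is sketched below, under simplifying assumptions of our own: a real, positive RBM-style ansatz for the 1D transverse-field Ising chain, exact enumeration in place of Monte Carlo sampling, and plain gradient descent in place of SR.

```python
# Illustrative only: minimize E(W) = <Psi|H|Psi>/<Psi|Psi> for the 1D
# transverse-field Ising chain with a real RBM-style ansatz. Exact
# enumeration replaces Monte Carlo sampling; plain gradient descent
# replaces stochastic reconfiguration.
import numpy as np

rng = np.random.default_rng(1)
N, M, J, h, eta = 6, 12, 1.0, 1.0, 0.05
states = np.array([[1 - 2*int(c) for c in f"{k:0{N}b}"] for k in range(2**N)])

def logpsi(W, b, s):
    return np.log(np.cosh(s @ W.T + b)).sum(-1)

W, b = 0.05*rng.standard_normal((M, N)), np.zeros(M)
for step in range(500):
    lp = logpsi(W, b, states)
    p = np.exp(2*(lp - lp.max())); p /= p.sum()       # Born weights |Psi|^2
    eloc = -J*(states*np.roll(states, -1, 1)).sum(1)  # diagonal Ising term
    for i in range(N):                                # off-diagonal field term
        flip = states.copy(); flip[:, i] *= -1
        eloc = eloc - h*np.exp(logpsi(W, b, flip) - lp)
    E = p @ eloc
    t = np.tanh(states @ W.T + b)                     # O(sigma) = d log Psi
    b -= eta*2*p @ ((eloc - E)[:, None]*t)            # 2<(E_loc - E) O>
    W -= eta*2*(p[:, None]*(eloc - E)[:, None]*t).T @ states

print("variational energy per spin:", E/N)
```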
B. Speed up many-body simulations

The use of ML methods in the realm of quantum many-body problems extends well beyond the neural-network representation of quantum states.
A powerful family of techniques to study interacting models are Quantum Monte Carlo (QMC) approaches. These methods stochastically compute properties of quantum systems through a mapping to an effective classical model, for example by means of the path-integral representation. A practical issue often resulting from these mappings is that efficient sampling schemes for high-dimensional spaces (path integrals, perturbation series, etc.) require careful, often problem-dependent tuning. Devising general-purpose samplers for these representations is therefore a particularly challenging problem. Unsupervised ML methods can, however, be adopted as a tool to speed up Monte Carlo sampling for both classical and quantum applications. Several approaches in this direction have been proposed; they leverage the ability of unsupervised learning to closely approximate the target distribution being sampled in the underlying Monte Carlo scheme. Relatively simple energy-based generative models have been used in early applications to classical systems (Huang and Wang, 2017; Liu et al., 2017b). "Self-learning" Monte Carlo techniques have then been generalized also to fermionic systems (Chen et al., 2018a; Liu et al., 2017c; Nagai et al., 2017). Overall, it has been found that such approaches are effective at reducing autocorrelation times, especially when compared to families of less effective Markov chain Monte Carlo schemes with local updates. More recently, state-of-the-art generative ML models have been adopted to speed up sampling in specific tasks. Notably, (Wu et al., 2018) have used deep autoregressive models that may enable more efficient sampling from hard classical problems, such as spin glasses. The problem of finding efficient sampling schemes for the underlying classical models is then transformed into the problem of finding an efficient corresponding autoregressive deep network representation. This approach has also been generalized to the quantum case in (Sharir et al., 2019), where an autoregressive representation of the wave-function is introduced. This representation is automatically normalized and allows one to bypass Markov chain Monte Carlo in the variational learning discussed above.
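The key mechanism is short enough to sketch: in an autoregressive model, p(s) = Π_i p(s_i | s_1, …, s_{i-1}), so configurations are drawn site by site with an exactly known, normalized probability and no Markov chain. The parameterization below (a single linear layer with a causal mask) is a deliberately simple stand-in of ours for the deep networks used in the cited works.

```python
# Illustrative only: ancestral sampling from an autoregressive model
# over spins s_i in {-1, +1}; no Markov chain is needed.
import numpy as np

rng = np.random.default_rng(2)
N = 10
V = 0.3*rng.standard_normal((N, N))*np.tri(N, k=-1)  # strictly causal weights
c = np.zeros(N)

def sample(n):
    s = np.zeros((n, N))
    logp = np.zeros(n)
    for i in range(N):                       # site-by-site sampling
        field = s @ V[i] + c[i]              # depends only on earlier sites
        p_up = 1/(1 + np.exp(-2*field))      # p(s_i = +1 | s_<i)
        s[:, i] = np.where(rng.random(n) < p_up, 1.0, -1.0)
        logp += np.where(s[:, i] > 0, np.log(p_up), np.log1p(-p_up))
    return s, logp                           # exact normalized log-probability

configs, logp = sample(5)
print(configs, np.exp(logp))
```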
While exact for a large family of bosonic and spin systems, QMC techniques typically incur a severe sign problem when dealing with several interesting fermionic models, as well as with frustrated spin Hamiltonians. In this case, it is tempting to use ML approaches to attempt a direct or indirect reduction of the sign problem. While only in its first stages, this family of applications has been used to infer information about fermionic phases through hidden information in the Green's function (Broecker et al., 2017b).

Similarly, ML techniques can help reduce the burden of more subtle manifestations of the sign problem in dynamical properties of quantum models. In particular, the problem of reconstructing spectral functions from imaginary-time correlations is also a field in which ML can be used as an alternative to traditional maximum-entropy techniques to perform analytical continuations of QMC data (Arsenault et al., 2017; Fournier et al., 2018; Yoon et al., 2018).

C. Classifying many-body quantum phases

The challenge posed by the complexity of many-body quantum states manifests itself in many other forms. Specifically, several elusive phases of quantum matter are often hard to characterize and pinpoint, both in numerical simulations and in experiments. For this reason, ML schemes to identify phases of matter have become particularly popular in the context of quantum phases. In the following we review some of the specific applications to the quantum domain, while a more general discussion on identifying phases and phase transitions is found in II.E.

1. Synthetic data

Following the early developments in phase classification with supervised approaches (Carrasquilla and Melko, 2017; Van Nieuwenburg et al., 2017; Wang, 2016), many studies have since focused on analyzing phases of matter in synthetic data, mostly from simulations of quantum systems. While we do not attempt here to provide an exhaustive review of the many studies that have appeared in this direction, we highlight two large families of problems that have so far largely served as benchmarks for new ML tools in the field.

A first challenging test bench for phase classification schemes is the case of quantum many-body localization. This is an elusive phase of matter showing characteristic fingerprints in the many-body wave-function itself, but not necessarily emerging from more traditional order parameters [see for example (Alet and Laflorencie, 2018) for a recent review on the topic]. First studies in this direction have focused on training strategies aimed at the Hamiltonian or entanglement spectra (Hsu et al., 2018; Huembeli et al., 2018b; Schindler et al., 2017; Venderley et al., 2018; Zhang et al., 2019). These works have demonstrated the ability to very effectively learn the MBL phase transition in relatively small systems accessible with exact diagonalization techniques. Other studies have instead focused on identifying signatures directly in experimentally relevant quantities, most notably the many-body dynamics of local observables (Doggen et al., 2018; van Nieuwenburg et al., 2018). The latter schemes appear at present to be the most promising for applications to experiments, while the former have been used as a tool to identify the existence of an unexpected phase in the presence of correlated disorder (Hsu et al., 2018).
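The basic supervised pipeline shared by these studies is compact: train a classifier on labeled configurations, then read off the predicted label across the phase diagram. The sketch below uses synthetic stand-in data of ours (noisy ordered versus random spin configurations) rather than actual simulation output.

```python
# Illustrative only: supervised phase classification on raw spin
# configurations; synthetic ordered/disordered data stand in for
# Monte Carlo samples of a physical model.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(3)
n, L = 2000, 16                                    # samples, 16x16 lattice
disordered = rng.choice([-1, 1], size=(n//2, L*L))
flips = rng.random((n//2, L*L)) < 0.15             # thermal-like spin flips
ordered = rng.choice([-1, 1], size=(n//2, 1))*np.where(flips, -1, 1)
X = np.vstack([disordered, ordered]).astype(float)
y = np.array([0]*(n//2) + [1]*(n//2))              # 0 disordered, 1 ordered

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500,
                    random_state=0).fit(Xtr, ytr)
print("test accuracy:", clf.score(Xte, yte))
```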
Another very challenging class of problems is found when analyzing topological phases of matter. These are largely considered a non-trivial test for ML schemes, because these phases are typically characterized by non-local order parameters, which are hard to learn for the popular classification schemes used for images. This issue is already present when analyzing classical models featuring topological phase transitions. For example, in the presence of a BKT-type transition, learning schemes trained on raw Monte Carlo configurations are not effective (Beach et al., 2018; Hu et al., 2017). These problems can be circumvented by devising training strategies that use pre-engineered features (Broecker et al., 2017a; Cristoforetti et al., 2017; Wang and Zhai, 2017; Wetzel, 2017) instead of raw Monte Carlo samples. These features typically rely on important a-priori assumptions about the nature of the phase transition to be looked for, thus diminishing their effectiveness when searching for new phases of matter. Deeper in the quantum world, there has been research activity in the direction of learning topological invariants in a supervised fashion. Neural networks can be used, for example, to classify families of non-interacting topological Hamiltonians, using their discretized coefficients as input, either in real (Ohtsuki and Ohtsuki, 2016, 2017) or momentum space (Sun et al., 2018; Zhang et al., 2018c). In these cases, it is found that neural networks are able to reproduce the (already known beforehand) topological invariants, such as winding numbers, Berry curvatures, and more. The context of strongly-correlated topological matter is, to a large extent, more challenging than the case of non-interacting band models. Here, a common approach is to define a set of carefully pre-engineered features to be used on top of the raw data. One well-known example is the so-called quantum loop topography (Zhang and Kim, 2017), trained on local operators computed on single shots of sampled wave-function walkers, as done for example in variational Monte Carlo. It has been shown that this very specific choice of local features is able to distinguish strongly interacting fractional Chern insulators, as well as Z2 quantum spin liquids (Zhang et al., 2017). Similar efforts have been devoted to classifying more exotic phases of matter, including magnetic skyrmion phases (Iakovlev et al., 2018) and dynamical states in antiskyrmion dynamics (Ritzmann et al., 2018).

Despite the progress seen so far along the many directions described here, it is fair to say that topological phases of matter, especially for interacting systems, constitute one of the main challenges for phase classification. While some good progress has already been made (Huembeli et al., 2018a; Rodriguez-Nieva and Scheurer, 2018), future research will need to address the issue of finding training schemes that do not rely on pre-selection of data features.

2. Experimental data

Beyond extensive studies on data from numerical simulations, supervised schemes have also found their way as a tool to analyze experimental data from quantum systems. In ultra-cold atom experiments, supervised learning tools have been used to map out both the topological phases of non-interacting particles and the onset of Mott insulating phases in finite optical traps (Rem et al., 2018). In this specific case, the phases were already known and identifiable with other approaches. However, ML-based techniques combining a-priori theoretical knowledge with experimental data hold the potential for genuine scientific discovery.

For example, ML can enable scientific discovery in the interesting cases when experimental data has to be attributed to one of many available and equally likely a-priori theoretical models, but the experimental information at hand is not easily interpreted. Typically, interesting cases emerge when the order parameter is a complex, and only implicitly known, non-linear function of the experimental outcomes. In this situation, ML approaches can be used as a powerful tool to effectively learn the underlying traits of a given theory, and to provide a possibly unbiased classification of experimental data. This is the case for incommensurate phases in high-temperature superconductors, for which scanning tunneling microscopy images reveal complex patterns that are hard to decipher using conventional analysis tools. Using supervised approaches in this context, recent work (Zhang et al., 2018d) has shown that it is possible to infer the nature of spatial ordering in these systems; see also Fig. 5.

Figure 5 Example of a machine learning approach to the classification of experimental images from scanning tunneling microscopy of high-temperature superconductors. Images are classified according to the predictions of distinct types of periodic spatial modulations. Reproduced from (Zhang et al., 2018d).

A similar idea has been used for another prototypical interacting quantum system of fermions, the Hubbard model, as implemented in ultra-cold atom experiments in optical lattices. In this case the reference models provide snapshots of the thermal density matrix that can be pre-classified in a supervised learning fashion. The outcome of this study (Bohrdt et al., 2018) is that the experimental results are, with good confidence, compatible with one of the proposed theories, in this case a geometric string theory for charge carriers.

In the last two experimental applications described above, the outcome of the supervised approaches is to a large extent highly non-trivial, and hard to predict a priori on the basis of other information at hand. The inner bias induced by the choice of the theories to be classified is, however, one of the current limitations that these kinds of approaches face.

D. Tensor networks for machine learning

The research topics reviewed so far are mainly concerned with the use of ML ideas and tools to study
problems in the realm of quantum many-body physics. Complementary to this philosophy, an interesting research direction explores the inverse route, investigating how ideas from quantum many-body physics can inspire and devise new powerful ML tools. Central to these developments are tensor-network representations of many-body quantum states. These are very successful variational families of many-body wave functions, naturally emerging from low-entanglement representations of quantum states (Verstraete et al., 2008). Tensor networks can serve both as a practical and as a conceptual tool for ML tasks, in the supervised as well as in the unsupervised setting.

These approaches build on the idea of providing physics-inspired learning schemes and network structures as alternatives to the more conventionally adopted stochastic learning schemes and FFNN architectures. For example, matrix product state (MPS) representations, a work-horse for the simulation of interacting one-dimensional quantum systems (White, 1992), have been re-purposed to perform classification tasks (Liu et al., 2018; Novikov et al., 2016; Stoudenmire and Schwab, 2016), and have also recently been adopted as explicit generative models for unsupervised learning (Han et al., 2018b; Stokes and Terilla, 2019). It is worth mentioning that other related high-order tensor decompositions, developed in the context of applied mathematics, have been used for ML purposes as well (Acar and Yener, 2009; Anandkumar et al., 2014). Tensor-train decompositions (Oseledets, 2011), formally equivalent to MPS representations, have been introduced in parallel as a tool to perform various machine learning tasks (Gorodetsky et al., 2019; Izmailov et al., 2017; Novikov et al., 2016). Networks closely related to MPS have also been explored for time-series modeling (Guo et al., 2018).
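A forward pass of an MPS classifier in the spirit of (Stoudenmire and Schwab, 2016) can be sketched as follows: each input feature is embedded in a local two-dimensional vector, the site tensors are contracted from left to right, and a final map produces class scores. The tensors below are random and untrained; in practice they would be optimized by gradient descent or DMRG-like sweeps.

```python
# Illustrative only: forward pass of an MPS-based classifier with
# random (untrained) tensors.
import numpy as np

rng = np.random.default_rng(4)
n_sites, d, chi, n_class = 8, 2, 10, 3     # sites, local dim, bond dim, classes
A = [rng.standard_normal((chi, d, chi))/np.sqrt(chi) for _ in range(n_sites)]
W_out = rng.standard_normal((chi, n_class))/np.sqrt(chi)

def embed(x):                      # local feature map phi(x) = (cos, sin)
    return np.stack([np.cos(np.pi*x/2), np.sin(np.pi*x/2)], axis=-1)

def scores(x):                     # x: length n_sites, entries in [0, 1]
    phi = embed(x)                 # (n_sites, d)
    v = np.ones(chi)/np.sqrt(chi)  # left boundary vector
    for i in range(n_sites):
        M = np.einsum('s,asb->ab', phi[i], A[i])   # contract local feature
        v = v @ M                                   # absorb site i
    return v @ W_out               # one score per class

x = rng.random(n_sites)
print("class scores:", scores(x))
```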
In an effort to increase the amount of entanglement encoded in these low-rank tensor decompositions, recent works have concentrated on tensor-network representations alternative to the MPS form. One notable example is the use of tree tensor networks with a hierarchical structure (Hackbusch and Kühn, 2009; Shi et al., 2006), which have been applied with good success to classification (Liu et al., 2017a; Stoudenmire, 2018) and generative modeling (Cheng et al., 2019) tasks. Another example is the use of entangled plaquette states (Changlani et al., 2009; Gendiar and Nishino, 2002; Mezzacapo et al., 2009) and string bond states (Schuch et al., 2008), both showing sizable improvements in classification tasks over MPS states (Glasser et al., 2018b).

On the more theoretical side, the deep connection between tensor networks and complexity measures of quantum many-body wave-functions, such as the entanglement entropy, can be used to understand, and possibly inspire, successful network designs for ML purposes. The tensor-network formalism has proven powerful in interpreting deep learning through the lens of renormalization-group concepts. Pioneering work in this direction has connected MERA tensor network states (Vidal, 2007) to hierarchical Bayesian networks (Bény, 2013). In later analyses, convolutional arithmetic circuits (Cohen et al., 2016), a family of convolutional networks with product non-linearities, have been introduced as a convenient model to bridge tensor decompositions with FFNN architectures. Besides their conceptual relevance, these connections can help clarify the role of the inductive bias in modern and commonly adopted neural networks (Levine et al., 2017).

E. Outlook and Challenges

Applications of ML to quantum many-body problems have seen fast-paced progress in the past few years, touching a diverse selection of topics ranging from numerical simulation to data analysis. The potential of ML techniques has already surfaced in this context, with improved performance over existing techniques on selected problems. To a large extent, however, the real power of ML techniques in this domain has been demonstrated only partially, and several open problems remain to be addressed.

In the context of variational studies with NQS, for example, the origin of the empirical success obtained so far with different kinds of neural network quantum states is not as well understood as for other families of variational states, like tensor networks. Key open challenges also remain in the representation and simulation of fermionic systems, for which efficient neural-network representations are still to be found.
Tensor-network representations for ML purposes, as well as complex-valued networks like those used for NQS, play an important role in bridging the field back to the arena of computer science. Challenges for the future of this research direction consist in effectively interfacing with the computer-science community, while retaining the interests and the generality of the physics tools.

As far as ML approaches to experimental data are concerned, the field is largely still in its infancy, with only a few applications having been demonstrated so far. This is in stark contrast with other fields, such as high-energy physics and astrophysics, in which ML approaches have matured to a stage where they are often used as standard tools for data analysis. Moving towards the same goal in the quantum domain demands closer collaboration between theoretical and experimental efforts, as well as a deeper understanding of the specific problems where ML can make a substantial difference.

Overall, given the relatively short time span in which applications of ML approaches to many-body quantum matter have emerged, there are good reasons to believe that these challenges will be energetically addressed – and some of them solved – in the coming years.

V. QUANTUM COMPUTING

Quantum computing uses quantum systems to process information. In the most popular framework of gate-based quantum computing (Nielsen and Chuang, 2002), a quantum algorithm describes the evolution of an initial state |ψ0⟩ of a quantum system of n two-level systems called qubits to a final state |ψf⟩ through discrete transformations or quantum gates. The gates usually act only on a small number of qubits, and the sequence of gates defines the computation.

The intersection of machine learning and quantum computing has become an active research area in the last couple of years, and contains a variety of ways to merge the two disciplines (see also (Dunjko and Briegel, 2018) for a review). Quantum machine learning asks how quantum computers can enhance, speed up or innovate machine learning (Biamonte et al., 2017; Ciliberto et al., 2018; Schuld and Petruccione, 2018a) (see also Sections VII and V). Quantum learning theory highlights theoretical aspects of learning under a quantum framework (Arunachalam and de Wolf, 2017).

In this Section we are concerned with a third angle, namely how machine learning can help us to build and study quantum computers. This angle includes topics ranging from the use of intelligent data mining methods to find physical regimes in materials that can be used as qubits (Kalantre et al., 2019), to the verification of quantum devices (Agresti et al., 2019), learning the design of quantum algorithms (Bang et al., 2014; Wecker et al., 2016), facilitating classical simulations of quantum circuits (Jónsson et al., 2018), automated design of quantum experiments (Krenn et al., 2016; Melnikov et al., 2018), and learning to extract relevant information from measurements (Seif et al., 2018).

We focus on three general problems related to quantum computing which have been targeted by a range of ML methods: the problem of reconstructing and benchmarking quantum states via measurements; the problem of preparing a quantum state via quantum control; and the problem of maintaining the information stored in the state through quantum error correction. The first problem is known as quantum state tomography, and it is especially useful to understand and improve upon the limitations of current quantum hardware. Quantum control and quantum error correction solve related problems; however, the former usually refers to hardware-related solutions, while the latter uses algorithmic solutions to the problem of executing a computational protocol with a quantum system.

Similar to the other disciplines in this review, machine learning has shown promising results in all these areas, and will in the longer run likely enter the toolbox of quantum computing, to be used side-by-side with other well-established methods.

A. Quantum state tomography

The general goal of quantum state tomography (QST) is to reconstruct the density matrix of an unknown quantum state through experimentally available measurements. QST is a central tool in several fields of quantum information and quantum technologies in general, where it is often used as a way to assess the quality and the limitations of experimental platforms. The resources needed to perform full QST are however extremely demanding, and the number of required measurements scales exponentially with the number of qubits/quantum degrees of freedom [see (Paris and Rehacek, 2004) for a review on the topic, and (Haah et al., 2017; O'Donnell and Wright, 2016) for a discussion on the hardness of learning in state tomography].

ML tools were identified already several years ago as a way to improve upon the cost of full QST by exploiting some special structure in the density matrix. Compressed sensing (Gross et al., 2010) is one prominent approach to the problem, allowing the number of required measurements to be reduced from d² to O(rd log(d)²) for a density matrix of rank r and dimension d. Successful experimental realizations of this technique have been implemented, for example, for a six-photon state (Tóth et al., 2010) and for a seven-qubit system of trapped ions (Riofrío et al., 2017). On the methodology side, full QST has more recently seen the development of deep learning approaches, for example a supervised approach based on neural networks that take possible measurement
outcomes as input and output the full density matrix (Xu and Xu, 2018). The problem of choosing an optimal measurement basis for QST has also been addressed recently, using a neural-network based approach that optimizes the prior distribution on the target density matrix by means of Bayes rule (Quek et al., 2018). In general, while ML approaches to full QST can serve as a viable tool to alleviate the measurement requirements, they cannot provide an improvement over the intrinsic exponential scaling of QST.

The exponential barrier can typically be overcome only in situations when the quantum state is assumed to have some specific regularity properties. Tomography based on tensor-network parameterizations of the density matrix has been an important first step in this direction, allowing for tomography of large, low-entangled quantum systems (Lanyon et al., 2017). ML approaches to parameterization-based QST have emerged in recent times as a viable alternative, especially for highly entangled states. Specifically, assuming a NQS form (see Eq. 3 in the case of pure states), QST can be reformulated as an unsupervised ML task. A scheme to retrieve the phase of the wave-function, in the case of pure states, has been demonstrated in (Torlai et al., 2018). In these applications, the complex phase of the many-body wave-function is retrieved upon reconstruction of several probability densities associated with the measurement process in different bases. Overall, this approach has allowed the demonstration of QST for highly entangled states of up to about 100 qubits, unfeasible for full QST techniques. This tomography approach can be suitably generalized to the case of mixed states by introducing parameterizations of the density matrix based either on purified NQS (Torlai and Melko, 2018) or on deep normalizing flows (Cranmer et al., 2019). The former approach has also been demonstrated experimentally with Rydberg atoms (Torlai et al., 2019). An interesting alternative to the NQS representation for tomographic purposes has also been suggested recently (Carrasquilla et al., 2019). It is based on parameterizing the density matrix directly in terms of positive-operator valued measure (POVM) operators. This approach therefore has the important advantage of directly learning the measurement process itself, and it has been demonstrated to scale well on rather large mixed states. A possible inconvenience of this approach is that the density matrix is only implicitly defined, in terms of generative models, as opposed to the explicit parameterizations found in NQS-based approaches.
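For positive wave-functions, the reformulation of QST as unsupervised learning reduces to density estimation: fit a generative model p_θ(x) to measurement outcomes by maximum likelihood and set Ψ(x) = √p_θ(x). In the toy sketch below, a free softmax parameterization over three qubits stands in for an NQS ansatz.

```python
# Illustrative only: QST of a positive pure state as maximum-likelihood
# density estimation on computational-basis measurement outcomes.
import numpy as np

rng = np.random.default_rng(5)
psi = rng.random(8); psi /= np.linalg.norm(psi)     # unknown positive state
data = rng.choice(8, size=5000, p=psi**2)           # Born-rule measurements
counts = np.bincount(data, minlength=8)/len(data)

theta = np.zeros(8)                                 # logits of p_theta
for step in range(500):
    p = np.exp(theta - theta.max()); p /= p.sum()
    theta += 0.5*(counts - p)                       # log-likelihood gradient
p = np.exp(theta - theta.max()); p /= p.sum()
print("fidelity with target:", np.sqrt(p) @ psi)
```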
Other approaches to QST have explored the use of quantum states parameterized as ground-states of local Hamiltonians (Xin et al., 2018), or the intriguing possibility of bypassing QST to directly measure quantum entanglement (Gray et al., 2018). Extensions to the more complex problem of quantum process tomography are also promising (Banchi et al., 2018), while the scalability of ML-based approaches to larger systems still presents challenges.

Finally, the problem of learning quantum states from experimental measurements also has profound implications for the understanding of the complexity of quantum systems. In this framework, the PAC learnability of quantum states (Aaronson, 2007), experimentally demonstrated in (Rocchetto et al., 2017), and the "shadow tomography" approach (Aaronson, 2017) showed that even linearly sized training sets can provide sufficient information to succeed in certain quantum learning tasks. These information-theoretic guarantees come with computational restrictions, and learning is efficient only for special classes of states (Rocchetto, 2018).

B. Controlling and preparing qubits

A central task of quantum control is the following: given an evolution U(θ) that depends on parameters θ and maps an initial quantum state |ψ0⟩ to |ψ(θ)⟩ = U(θ)|ψ0⟩, which parameters θ* maximize the overlap |⟨ψ(θ)|ψtarget⟩|² between the prepared state and the target state (equivalently, minimize their distance)? To facilitate analytic studies, the space of possible control interventions is often discretized, so that U(θ) = U(s1, …, sT) becomes a sequence of steps s1, …, sT. For example, a control field could be applied at only two different strengths h1 and h2, and the goal is to find an optimal strategy st ∈ {h1, h2}, t = 1, …, T, that brings the initial state as close as possible to the target state using only these discrete actions.
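The following sketch sets up exactly this discretized problem for a single qubit; the Hamiltonian H(s_t) = -s_t σx - σz is an illustrative choice of ours, and random search over protocols stands in for the learning agent.

```python
# Illustrative only: bang-bang state preparation for one qubit,
# evaluated by the fidelity |<target|psi(protocol)>|^2.
import numpy as np
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
h1, h2, T, dt = -2.0, 2.0, 10, 0.2
U = {h: expm(-1j*dt*(-h*sx - sz)) for h in (h1, h2)}   # cached step unitaries

psi0 = np.array([1, 0], dtype=complex)                 # initial state |0>
target = np.array([1, 1], dtype=complex)/np.sqrt(2)    # target state |+>

rng = np.random.default_rng(6)
best, best_f = None, 0.0
for trial in range(2000):                              # random protocol search
    protocol = rng.choice([h1, h2], size=T)
    psi = psi0
    for s in protocol:
        psi = U[s] @ psi
    f = abs(np.vdot(target, psi))**2                   # protocol fidelity
    if f > best_f:
        best, best_f = protocol, f
print("best fidelity:", best_f)
```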
This setup generalizes directly to a reinforcement learning framework (Sutton and Barto, 2018), where an agent picks "moves" from the list of allowed control interventions, such as the two field strengths applied to the quantum state of a qubit. This framework has proven competitive with state-of-the-art methods in various settings, such as state preparation in non-integrable many-body quantum systems of interacting qubits (Bukov et al., 2018), or the use of strong periodic oscillations to prepare so-called "Floquet-engineered" states (Bukov, 2018). A recent study comparing (deep) reinforcement learning with traditional optimization methods such as stochastic gradient descent for the preparation of a single-qubit state shows that learning is advantageous if the "action space" is naturally discretized and sufficiently small (Zhang et al., 2019).

The picture becomes increasingly complex in slightly more realistic settings, for example when the control is noisy (Niu et al., 2018). In an interesting twist, the control problem has also been tackled by predicting future noise using a recurrent neural network that analyses the time series of past noise. Using the prediction, the anticipated future noise can be corrected (Mavadia et al., 2017).

An altogether different approach to state preparation with machine learning tries to find optimal strategies for evaporative cooling to create Bose-Einstein condensates
(Wigley et al., 2016). In this online optimization strategy based on Bayesian optimization (Frazier, 2018; Jones et al., 1998), a Gaussian process is used as a statistical model that captures the relationship between the control parameters and the quality of the condensate. The strategy discovered by the machine learning model allows for a cooling protocol that uses 10 times fewer iterations than pure optimization techniques. An interesting feature is that – contrary to the common reputation of machine learning – the Gaussian process makes it possible to determine which control parameters are more important than others.
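A minimal version of this loop is easy to write down; here a one-dimensional toy objective stands in for the condensate quality, and an upper-confidence-bound rule on a parameter grid plays the role of the acquisition strategy used in the experiment.

```python
# Illustrative only: Bayesian optimization with a Gaussian-process
# surrogate and an upper-confidence-bound acquisition rule.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

f = lambda x: np.exp(-(x - 0.7)**2/0.02)       # hidden "condensate quality"
grid = np.linspace(0, 1, 200)[:, None]
X, y = [[0.1], [0.9]], [f(0.1), f(0.9)]        # two initial evaluations

for it in range(10):
    gp = GaussianProcessRegressor(kernel=RBF(0.1)).fit(X, y)
    mu, sd = gp.predict(grid, return_std=True)
    x_next = grid[np.argmax(mu + 2*sd)][0]     # upper-confidence-bound pick
    X.append([x_next]); y.append(f(x_next))    # run the "experiment"
print("best setting found:", X[int(np.argmax(y))][0])
```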
Another angle is captured by approaches that 'learn' the sequence of optical instruments needed to prepare highly entangled photonic quantum states (Melnikov et al., 2018).

C. Error correction

One of the major challenges in building a universal quantum computer is error correction. During any computation, errors are introduced by physical imperfections of the hardware. While classical computers allow for simple error correction based on duplicating information, the no-cloning theorem of quantum mechanics requires more complex solutions. The most well-known proposal, surface codes, prescribes encoding one "logical qubit" into a topological state of several "physical qubits". Measurements on these physical qubits reveal a "footprint" of the chain of error events, called a syndrome. A decoder maps a syndrome to an error sequence which, once known, can be corrected by applying the same error sequence again, without affecting the logical qubits that store the actual quantum information. Roughly stated, the art of quantum error correction is therefore to predict errors from a syndrome – a task that naturally fits the framework of machine learning.

In the past few years, various models have been applied to quantum error correction, ranging from supervised to unsupervised and reinforcement learning, with the details of their application becoming increasingly complex. One of the first proposals deploys a Boltzmann machine trained on a data set of pairs (error, syndrome), which specifies the probability p(error, syndrome) and can then be used to draw samples from the desired distribution p(error|syndrome) (Torlai and Melko, 2017). This simple recipe shows, for certain kinds of errors, a performance comparable to common benchmarks. The relation between syndromes and errors can likewise be learned by a feed-forward neural network (Krastanov and Jiang, 2017; Maskara et al., 2019; Varsamopoulos et al., 2017). However, these strategies suffer from scalability issues, as the space of possible decoders explodes and data acquisition becomes an issue. More recently, neural networks have been combined with the concept of the renormalization group to address this problem (Varsamopoulos et al., 2018), and the significance of different hyper-parameters of the neural network has been studied (Varsamopoulos et al., 2019).
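The supervised variant of this idea fits in a few lines for the 3-qubit repetition code, a deliberately small stand-in of ours for the surface codes used in the cited works: a classifier is trained on sampled (syndrome, error) pairs and then predicts the most likely error for each syndrome.

```python
# Illustrative only: learn a decoder for the 3-qubit repetition code
# from sampled (syndrome, error) pairs of an i.i.d. bit-flip channel.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
p, n = 0.1, 20000
errors = (rng.random((n, 3)) < p).astype(int)        # physical bit flips
syndromes = np.stack([errors[:, 0] ^ errors[:, 1],   # parity checks
                      errors[:, 1] ^ errors[:, 2]], axis=1)
labels = errors @ np.array([4, 2, 1])                # error pattern as class

clf = LogisticRegression(max_iter=1000).fit(syndromes, labels)
for s in np.array([[0, 0], [1, 0], [0, 1], [1, 1]]):
    e = int(clf.predict(s.reshape(1, -1))[0])
    print("syndrome", s, "-> most likely error", format(e, "03b"))
```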
Besides scalability, an important problem in quantum error correction is that the syndrome measurement procedure can itself introduce an error, since it involves applying a small quantum circuit. This setting increases the problem complexity but is essential for real applications. Noise in the identification of errors can be mitigated by performing repeated cycles of syndrome measurements. To account for the additional time dimension, recurrent neural network architectures have been proposed (Baireuther et al., 2018). Another avenue is to treat decoding as a reinforcement learning problem (Sweke et al., 2018), in which an agent can choose consecutive operations acting on physical qubits (as opposed to logical qubits) to correct a syndrome, and gets rewarded if the sequence corrected the error.

While much of machine learning for error correction focuses on surface codes, which represent a logical qubit by physical qubits according to some set scheme, reinforcement agents can also be set up agnostic of the code (one could say they learn the code along with the decoding strategy). This has been done for quantum memories, systems in which quantum states are supposed to be stored rather than manipulated (Nautrup et al., 2018), as well as in a feedback control framework that protects qubits against decoherence (Fösel et al., 2018). Finally, beyond traditional reinforcement learning, novel strategies such as projective simulation can be used to combat noise (Tiersch et al., 2015).

In summary, machine learning for quantum error correction is a problem with several layers of complexity that, for realistic applications, requires rather complex learning frameworks. Nevertheless, it is a very natural candidate for machine learning, and especially for reinforcement learning.

VI. CHEMISTRY AND MATERIALS

Machine learning approaches have been applied to predict the energies and properties of molecules and solids, with the popularity of such applications increasing dramatically. The quantum nature of atomic interactions makes energy evaluations computationally expensive, so ML methods are particularly useful when many such calculations are required. In recent years, the ever-expanding applications of ML in chemistry and materials research include predicting the structures of related molecules, calculating energy surfaces based on molecular dynamics (MD) simulations, identifying structures that have desired material properties, and creating machine-learned density functionals. For these types of problems, input descriptors must account for differences in atomic
environments in a compact way. Much of the current work using ML for atomistic modeling is based on early work describing the local atomic environment with symmetry functions as input to an atom-wise neural network (Behler and Parrinello, 2007), representing atomic potentials using Gaussian process regression methods (Bartók et al., 2010), or using sorted interatomic distances weighted by the nuclear charge (the "Coulomb matrix") as a molecular descriptor (Rupp et al., 2012). The continuing development of suitable structural representations is reviewed by Behler (2016). A discussion of ML for chemical systems in general, including learning structure-property relationships, is found in the review by Butler et al. (2018), with additional focus on data-enabled theoretical chemistry reviewed by Rupp et al. (2018). In the sections below, we present recent examples of ML applications in chemical physics.

A. Energies and forces based on atomic environments

One of the primary uses of ML in chemistry and materials research is to predict the relative energies for a series of related systems, most typically to compare different structures of the same atomic composition. These applications aim to determine the structure(s) most likely to be observed experimentally, or to identify molecules that may be synthesizable as drug candidates. As examples of supervised learning, these ML methods employ various quantum chemistry calculations to label molecular representations Xµ with corresponding energies yµ, generating training (and test) data sets {Xµ, yµ}, µ = 1, …, n. For quantum chemistry applications, NN methods have had great success in predicting the relative energies of a wide range of systems, including constitutional isomers and non-equilibrium configurations of molecules, by using many-body symmetry functions that describe the local atomic neighborhood of each atom (Behler, 2016). Many successes in this area have been derived from this type of atom-wise decomposition of the molecular energy, with each element represented by a separate NN (Behler and Parrinello, 2007) (see Fig. 6(a)). For example, ANI-1 is a deep NN potential successfully trained to return the density functional theory (DFT) energies of any molecule with up to 8 heavy atoms (H, C, N, O) (Smith et al., 2017). In this work, atomic coordinates for the training set were selected using normal mode sampling to include some vibrational perturbations along with optimized geometries. Another example of a general NN for molecular and atomic systems is the Deep Potential Molecular Dynamics (DPMD) method, specifically created to run MD simulations after being trained on energies from bulk simulations (Zhang et al., 2018a). Rather than simply including non-local interactions via the total energy of a system, another approach was inspired by the many-body expansion used in standard computational physics; in this case, adding layers that allow interactions between atom-centered NNs improved the molecular energy predictions (Lubbers et al., 2018).
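The atom-wise architecture just described can be sketched compactly: the total energy is a sum of atomic contributions, each produced by an element-specific network evaluated on that atom's symmetry-function descriptor. Weights and descriptors below are random placeholders, not a trained potential.

```python
# Illustrative only: atom-wise energy decomposition in the spirit of
# Behler-Parrinello networks, with random weights and descriptors.
import numpy as np

rng = np.random.default_rng(8)
n_desc, n_hidden = 20, 10
elements = ["H", "H", "O"]                      # a water-like molecule
nets = {el: (rng.standard_normal((n_desc, n_hidden))/np.sqrt(n_desc),
             rng.standard_normal(n_hidden),
             rng.standard_normal(n_hidden)/np.sqrt(n_hidden))
        for el in set(elements)}

def atomic_energy(el, G):
    W1, b1, w2 = nets[el]                       # one network per element
    return np.tanh(G @ W1 + b1) @ w2            # E_i = NN_el(G_i)

G = rng.standard_normal((len(elements), n_desc))  # symmetry-function inputs
E_total = sum(atomic_energy(el, G[i]) for i, el in enumerate(elements))
print("total energy (arbitrary units):", E_total)
```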
The examples above use translation- and rotation-invariant representations of the atomic environments, thanks to the incorporation of symmetry functions in the NN input. For some applications, such as describing molecular reactions and materials phase transformations, atomic representations must also be continuous and differentiable. The smooth overlap of atomic positions (SOAP) kernels address all of these requirements by including a similarity metric between atomic environments (Bartók et al., 2013). Recent work to preserve symmetries in alternative molecular representations addresses this problem in different ways. To capitalize on known molecular symmetries for "Coulomb matrix" inputs, both bonding (rigid) and dynamic symmetries have been incorporated to improve the coverage of training data in configurational space (Chmiela et al., 2018). This work also includes forces in the training, allowing for MD simulations at the level of coupled cluster calculations for small molecules, which would traditionally be intractable. Molecular symmetries can also be learned, as shown in determining local environment descriptors that make use of continuous-filter convolutions to describe atomic interactions (Schütt et al., 2018). Further development of atom environment descriptors that are compact, unique, and differentiable will certainly facilitate new uses for ML models in the study of molecules and materials.

Machine learning has, however, also been applied in ways that are more closely integrated with conventional approaches, so as to be more easily incorporated in existing codes. For example, atomic charge assignments compatible with classical force fields can be learned, without the need to run a new quantum mechanical calculation for each new molecule of interest (Sifain et al., 2018). In addition, condensed-phase simulations of molecular species require accurate intra- and intermolecular potentials, which can be difficult to parameterize. To this end, local NN potentials can be combined with physically-motivated long-range Coulomb and van der Waals contributions to describe larger molecular systems (Yao et al., 2018). Local ML descriptions can also be successfully combined with many-body expansion methods to allow the application of ML potentials to larger systems, as demonstrated for water clusters (Nguyen et al., 2018). Alternatively, intermolecular interactions can be fitted to a set of ML models trained on monomers to create a transferable model for dimers and clusters (Bereau et al., 2018).

B. Potential and free energy surfaces

Machine learning methods are also employed to describe free energy surfaces. Rather than learning the
Figure 6 Several representations are currently used to describe molecular systems in ML models, including (a) atomic coordinates, with symmetry functions encoding local bonding environments, as inputs to element-based neural networks (Reproduced from (Gastegger et al., 2017)) and (b) nuclear potentials approximated by a sum of Gaussian functions as inputs to kernel ridge regression models for electron densities (Modified from (Brockherde et al., 2017)).

potential energy of each molecular conformation directly, as described above, an alternative approach is to learn the free energy surface of a system as a function of collective variables, such as global Steinhardt order parameters or a local dihedral angle for a set of atoms. A compact ML representation of a free energy surface (FES) using a NN allows improved sampling of the high-dimensional space when calculating observables that depend on an ensemble of conformers. For example, a learned FES can be sampled to predict the isothermal compressibility of solid xenon under pressure, or the expected NMR spin-spin J couplings of a peptide (Schneider et al., 2017). Small NNs representing a FES can also be trained iteratively using data points generated by on-the-fly adaptive sampling (Sidky and Whitmer, 2018). This promising approach highlights the benefit of using a smooth representation of the full configurational space when the ML models themselves are used to generate new training data. As the use of machine-learned FES representations increases, it will be important to determine the limits of accuracy for small NNs and how to use these models as a starting point for larger networks or other ML architectures.
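A minimal sketch of this kind of FES fit is given below; a noisy double-well of our own construction stands in for free energy estimates produced by enhanced sampling.

```python
# Illustrative only: fit a 1D free energy surface F(s) along a
# collective variable with a small NN, then locate its minima.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(9)
s = rng.uniform(-2, 2, size=(400, 1))                  # collective variable
F = (s**4 - 2*s**2).ravel() + 0.05*rng.standard_normal(400)

model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000,
                     random_state=0).fit(s, F)
grid = np.linspace(-2, 2, 401)[:, None]
pred = model.predict(grid)
is_min = (pred[1:-1] < pred[:-2]) & (pred[1:-1] < pred[2:])  # local minima
print("estimated minima near s =", grid[1:-1][is_min, 0])
```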
configurations, which comes at a considerable cost when
Once the relevant minima have been identified on a using large simulation cells and conventional methods.
FES, the next challenge is to understand the processes Recently, the structure and material properties of amor-
that take a system from one basin to another. For ex- phous silicon were predicted using molecular dynamics
ample, developing a Markov state model to describe con- (MD) with a ML potential trained on density functional
formational changes requires dimensionality reduction to theory (DFT) calculations for only small simulation cells
translate molecular coordinates into the global reaction (Deringer et al., 2018). Related applications of using ML
coordinate space. To this end, the power of deep learn- potentials to model the phase change between crystalline
ing with time-lagged autoencoder methods has been har- and amorphous regions of materials such as GeTe and
nessed to identify slowly changing collective variables in amorphous carbon are reviewed by Sosso et al. (2018).
peptide folding examples (Wehmeyer and Noé, 2018). A Generating a computationally-tractable potential that is
variational NN-based approach has also been used to sufficiently accurate to describe phase changes and the
identify important kinetic processes during protein fold- relative energies of defects on both an atomistic and ma-
ing simulations and provides a framework for unifying terial scale is quite difficult, however the recent success
coordinate transformations and FES surface exploration for silicon properties indicates that ML methods are up
(Mardt et al., 2018). A promising alternate approach to the challenge (Bartók et al., 2018).
is to use ML to sample conformational distributions di- Ideally, experimental measurements could also be in-
rectly. Boltzmann generators can sample the equilib- corporated in data-driven ML methods that aim to pre-
rium distribution of a collective variable space and sub- dict material properties. However, reported results are
34

too often limited to high-performance materials, with no counter-examples for the training process. In addition, noisy data are coupled with a lack of the precise structural information needed as input for the ML model. For organic molecular crystals, these challenges were overcome for predictions of NMR chemical shifts, which are very sensitive to local environments, by using a Gaussian process regression framework trained on DFT-calculated values of known structures (Paruzzo et al., 2018). Matching calculated values with experimental results prior to training the ML model enabled the validation of a predicted pharmaceutical crystal structure.

Other intriguing directions include the identification of structurally similar materials via clustering, and the use of convex hull construction to determine which of the many predicted structures should be most stable under certain thermodynamic constraints (Anelli et al., 2018). Kernel-PCA descriptors for the construction of the convex hull have been applied to identify crystalline ice phases, and were shown to cluster thousands of structures which differ only by proton disorder or stacking faults (Engel et al., 2018) (see Fig. 7). Machine-learned methods based on a combination of supervised and unsupervised techniques certainly promise to be a fruitful research area in the future. In particular, it remains an exciting challenge to identify, predict, or even suggest materials that exhibit a particular desired property.

D. Electron densities for density functional theory

In many of the examples above, density functional theory calculations have been used as the source of training data. It is fitting that machine learning is also playing a role in creating new density functionals. Machine learning is a natural choice for situations, such as in DFT, where we do not have knowledge of the functional form of an exact solution. The benefit of this approach to identifying a density functional was illustrated by approximating the kinetic energy functional of an electron distribution in a 1D potential well (Snyder et al., 2012). For use in standard Kohn-Sham based DFT codes, the derivative of the ML functional must also be used to find the appropriate ground-state electron distribution. Using kernel ridge regression without further modification can lead to noisy derivatives, but projecting the resulting energies back onto the learned space using PCA resolves this issue (Li et al., 2015). A NN-based approach to learning the exchange-correlation potential has also been demonstrated for 1D systems (Nagai et al., 2018). In this case, the ML method makes direct use of the derivatives generated during the NN training steps.
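The flavor of these functional-learning approaches can be conveyed with a small sketch: discretized 1D densities are mapped to a kinetic energy with kernel ridge regression. As a stand-in target of our own choosing, we use the von Weizsäcker functional T[n] = ∫ |n'|²/(8n) dx, evaluated numerically, rather than data from an actual quantum calculation.

```python
# Illustrative only: kernel ridge regression from discretized densities
# n(x) to a kinetic energy (von Weizsäcker stand-in target).
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(10)
x = np.linspace(0, 1, 100); dx = x[1] - x[0]

def density(c, w):                        # normalized Gaussian density
    n = np.exp(-(x - c)**2/(2*w**2))
    return n/(n.sum()*dx)

def T_vw(n):                              # numerical von Weizsacker energy
    dn = np.gradient(n, dx)
    return (dn**2/(8*n + 1e-12)).sum()*dx

X = np.array([density(c, w) for c, w in zip(rng.uniform(0.3, 0.7, 300),
                                            rng.uniform(0.05, 0.15, 300))])
y = np.array([T_vw(n) for n in X])

model = KernelRidge(kernel="rbf", alpha=1e-6, gamma=1e-2).fit(X[:200], y[:200])
print("mean test error:", np.abs(model.predict(X[200:]) - y[200:]).mean())
```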
It is also possible to bypass the functional derivative entirely by using ML to generate the appropriate ground-state electron density that corresponds to a nuclear potential (Brockherde et al., 2017), as shown in Fig. 6(b). Furthermore, this work demonstrated that the energy of a molecular system can also be learned with electron densities as input, enabling reactive MD simulations of proton transfer events based on DFT energies. Intriguingly, an approximate electron density, such as a sum of densities from isolated atoms, has also been successfully employed as the input for predicting molecular energies (Eickenberg et al., 2018). A related approach for periodic crystalline solids used local electron densities from an embedded atom method to train Bayesian ML models to return total system energies (Schmidt et al., 2018). Since the total energy is an extensive property, a scalable NN model based on a summation of local electron densities has also been developed to run large DFT-based simulations of 2D porous graphene sheets (Mills et al., 2019). With these successes, it has become clear that, given a density functional, machine learning offers new ways to learn both the electron density and the corresponding system energy.

Many human-based approaches to improving the approximate functionals in use today rely on imposing physically-motivated constraints. So far, including these types of restrictions in ML-based methods has met with only partial success. For example, requiring that a ML functional fulfill more than one constraint, such as a scaling law and size-consistency, improves overall performance in a system-dependent manner (Hollingsworth et al., 2018). Obtaining accurate derivatives, particularly for molecules with conformational changes, is still an open question for physics-informed ML functionals and potentials that have not been explicitly trained with this goal (Bereau et al., 2018; Snyder et al., 2012).

E. Data set generation

As for other applications of machine learning, the comparison of various methods requires standardized data sets. For quantum chemistry, these include the 134,000 molecules in the QM9 data set (Ramakrishnan et al., 2014) and the COMP6 benchmark data set composed of randomly-sampled subsets of other small-molecule and peptide data sets, with each entry optimized using the same computational method (Smith et al., 2018).

In chemistry and materials research, computational data are often expensive to generate, so the selection of training data points must be carefully considered. The input and output representations also inform the choice of data. Inspection of ML-predicted molecular energies for most of the QM9 data set showed the importance of choosing input data structures that convey conformer changes (Faber et al., 2017). In addition, dense sampling of the chemical composition space is not always necessary. For example, the initial ANI training set of 20 million molecules could be replaced with 5.5 million training points selected using an active learning method that
Figure 7 Clustering thousands of possible ice structures based on machine-learned descriptors identifies observed forms and groups similar structures together. Reproduced from (Engel et al., 2018).

added poorly predicted molecular examples from each training cycle (Smith et al., 2018). Alternative sampling approaches can also be used to build up a training set more efficiently. These range from active learning methods that estimate errors from multiple NN evaluations for new molecules (Gastegger et al., 2017), to generating new atomic configurations based on MD simulations using a previously-generated model (Zhang et al., 2018b). Interesting, statistical-physics-based insight into theoretical aspects of such active learning was presented in (Seung et al., 1992b).
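A query-by-committee loop of this kind, with a cheap one-dimensional toy function standing in for the expensive quantum chemistry calculation, can be sketched as follows.

```python
# Illustrative only: ensemble-disagreement active learning; candidates
# where a committee of models disagrees most are labeled and added.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(11)
f = lambda x: np.sin(5*x) + 0.5*np.sin(17*x)      # expensive "calculation"
pool = np.linspace(0, 1, 500)[:, None]            # candidate configurations
X = list(rng.uniform(0, 1, 5)[:, None]); y = [f(v[0]) for v in X]

for cycle in range(8):
    committee = [MLPRegressor(hidden_layer_sizes=(32,), max_iter=3000,
                              random_state=seed).fit(np.array(X), np.array(y))
                 for seed in range(4)]
    preds = np.stack([m.predict(pool) for m in committee])
    pick = pool[np.argmax(preds.std(0))]          # largest disagreement
    X.append(pick); y.append(f(pick[0]))          # label only that point
print("selected", len(X), "training points")
```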
Further work in this area is needed to identify the atomic compositions and configurations that are most important for differentiating candidate structures. While NNs have been shown to generate accurate energies, the amount of data required to prevent over-fitting can be prohibitively expensive in many cases. For specific tasks, such as predicting the anharmonic contributions to vibrational frequencies of the small molecule formaldehyde, Gaussian process methods were more accurate and used fewer points than a NN, although these points need to be selected more carefully (Kamath et al., 2018). Balancing the computational cost of data generation, the ease of model training, and the model evaluation time continues to be an important consideration when choosing the appropriate ML method for each application.

F. Outlook and Challenges

Going forward, ML models will benefit from including methods and practices developed for other problems in physics. While some of these ideas are already being explored, such as exploiting input-data symmetries for molecular configurations, there are still many opportunities to improve model training efficiency and regularization. Some of the more promising (and challenging) areas include applying methods for the exploration of high-dimensional landscapes to parameter/hyper-parameter optimization, and identifying how to include boundary behaviors or scaling laws in ML architectures and/or
input data formats. To connect more directly to experimental data, future physics-based ML methods should account for uncertainties and/or errors from calculations and measured properties to avoid over-fitting and improve transferability of the models.

VII. AI ACCELERATION WITH CLASSICAL AND QUANTUM HARDWARE

There are areas where physics can contribute to machine learning by other means than tools for theoretical investigations and domain-specific problems. Novel hardware platforms may help with expensive information processing pipelines and extend the number crunching facilities of CPUs and GPUs. Such hardware-helpers are also known as "AI accelerators", and physics research has a variety of devices to offer that could potentially enhance machine learning.

A. Beyond von Neumann architectures

When we speak of computers, we usually think of universal digital computers based on electrical circuits and Boolean logic. This is the so-called "von Neumann" paradigm of modern computing. But any physical system can be interpreted as a way to process information, namely by mapping the input parameters of the experimental setup to measurement results, the output. This way of thinking is close to the idea of analog computing, which has been – or so it seems (Ambs, 2010; Lundberg, 2005) – dwarfed by its digital cousin for all but very few applications. In the context of machine learning however, where low-precision computations have to be executed over and over, analog and special-purpose computing devices have found a new surge of interest. The hardware can be used to emulate a full model, such as neural-network inspired chips (Ambrogio et al., 2018), or it can outsource only a subroutine of a computation, as done by Field-Programmable Gate Arrays (FPGAs) and Application-Specific Integrated Circuits (ASICs) for fast linear algebra computations (Jouppi et al., 2017; Markidis et al., 2018).

In the following, we present selected examples from various research directions that investigate how hardware platforms from physics labs, such as optics, nanophotonics and quantum computers, can become novel kinds of AI accelerators.

B. Neural networks running on light

Processing information with optics is a natural and appealing alternative - or at least complement - to all-silicon computers: it is fast, it can be made massively parallel, and requires very low power consumption. Optical interconnects are already widespread, carrying information over short or long distances, but light interference properties can also be leveraged to provide more advanced processing. In the case of machine learning there is one more perk: some of the standard building blocks in optics labs bear a striking resemblance to the way information is processed with neural networks (Killoran et al., 2018; Lin et al., 2018; Shen et al., 2017), an insight that is by no means new (Lu et al., 1989). An example for both large bulk optics experiments and on-chip nanophotonics are networks of interferometers. Interferometers are passive optical elements made up of beam splitters and phase shifters (Clements et al., 2016; Reck et al., 1994). If we consider the amplitudes of light modes as an incoming signal, the interferometer effectively applies a unitary transformation to the input (see Figure 8 left). Amplifying or damping the amplitudes can be understood as applying a diagonal matrix. Consequently, by means of a singular value decomposition, an amplifier sandwiched by two interferometers implements an arbitrary matrix multiplication on the data encoded into the optical amplitudes, as the numerical sketch below illustrates. Adding a non-linear operation – which is usually the hardest to precisely control in the lab – can turn the device into an emulator of a standard neural network layer (Lin et al., 2018; Shen et al., 2017), but at the speed of light.
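The decomposition just described can be checked numerically. The sketch below is our own illustration, not code from the cited experiments: the two orthogonal factors of the singular value decomposition play the role of the interferometers, the diagonal factor plays the role of the gain/attenuation stage, and tanh stands in for the hard-to-implement optical nonlinearity.

```python
# Numerical check that "interferometer - amplifiers - interferometer" realizes
# an arbitrary weight matrix: W = U @ diag(s) @ Vt by singular value
# decomposition, with U and Vt unitary (orthogonal, in this real example).
import numpy as np

rng = np.random.default_rng(42)
W = rng.normal(size=(4, 4))        # weight matrix of one network layer
x = rng.normal(size=4)             # input encoded in optical amplitudes

U, s, Vt = np.linalg.svd(W)        # W = U @ np.diag(s) @ Vt

y_direct = np.tanh(W @ x)                          # conventional layer phi(W x)
y_optical = np.tanh(U @ (np.diag(s) @ (Vt @ x)))   # staged "optical" version

assert np.allclose(y_direct, y_optical)
print("layer output:", y_optical)
```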
An interesting question to ask is: what if we use quantum instead of classical light? For example, imagine the information is now encoded in the quadratures of the electromagnetic field. The quadratures are - much like position and momentum of a quantum particle - two non-commuting operators that describe light as a quantum system. We now have to exchange the setup for quantum optics components such as squeezers and displacers, and we get a neural network encoded in the quantum properties of light (Killoran et al., 2018). But there is more: using multiple layers, and choosing the 'nonlinear operation' as a "non-Gaussian" component (such as an optical "Kerr non-linearity", which is admittedly still an experimental challenge), the optical setup becomes a universal quantum computer. As such, it can run any computation a quantum computer can perform - a true quantum neural network. There are other variations of quantum optical neural nets, for example when information is encoded into discrete rather than continuous-variable properties of light (Steinbrecher et al., 2018). Investigations into what these quantum devices mean for machine learning, for example whether there are patterns in data that can be recognized more easily, have just begun.

C. Revealing features in data

One does not have to implement a full machine learning model on the physical hardware, but can outsource single components. An example which we will highlight as a second application is data preprocessing or feature extraction. This includes mapping data to another space where it is either compressed or 'blown up', in both cases revealing its features for machine learning algorithms.

Figure 8 Illustrations of the methods discussed in the text. 1. Optical components such as interferometers and amplifiers can emulate a neural network that layer-wise maps an input x to ϕ(W x), where W is a learnable weight matrix and ϕ a nonlinear activation. Using quantum optics components such as displacement and squeezing, one can encode information into quantum properties of light and turn the neural net into a universal quantum computer. 2. Random embedding with an Optical Processing Unit. Data is encoded into the laser beam through a spatial light modulator (here, a DMD), after which a diffusive medium generates the random features. 3. A quantum computer can be used to compute distances between data points, or "quantum kernels". The first part of the quantum algorithm uses routines S_x, S_{x'} to embed the data in Hilbert space, while the second part reveals the inner product of the embedded vectors. This kernel can be further processed in standard kernel methods such as support vector machines.

One approach to data compression or expansion with physical devices leverages the very statistical nature of many machine learning algorithms. Multiple light scattering can generate the very high-dimensional randomness needed for so-called random embeddings (see Figure 8 top right). In a nutshell, the multiplication of a set of vectors by the same random matrix is approximately distance-preserving (Johnson and Lindenstrauss, 1984). This can be used for dimensionality reduction, i.e., data compression, in the spirit of compressed sensing (Donoho, 2006), or for efficient nearest neighbor search with locality sensitive hashing. It can also be used for dimensionality expansion, where in the limit of large dimension it approximates a well-defined kernel (Saade et al., 2016). Such devices can be built in free-space optics, with coherent laser sources, commercial light modulators and CMOS sensors, and a well-chosen scattering material (see Fig. 8, panel 2). Machine learning applications range from transfer learning for deep neural networks and time series analysis - with a feedback loop implementing so-called echo-state networks (Dong et al., 2018) - to change-point detection (Keriven et al., 2018). For large-dimensional data, these devices already outperform CPUs or GPUs both in speed and power consumption.
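A minimal simulation of both uses of random embeddings is given below. The dimensions, the Gaussian matrices standing in for the scattering medium, and the intensity measurement |Mx|² are illustrative assumptions rather than a model of any specific device.

```python
# Minimal sketch of random embeddings. Part 1: a random projection roughly
# preserves pairwise distances (Johnson-Lindenstrauss), enabling compression.
# Part 2: a nonlinearity on complex random projections -- the intensity a
# camera would record behind a scattering medium -- yields random kernel
# features (Saade et al., 2016). All sizes here are toy choices.
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 20, 500, 50                       # points, input dim, compressed dim
X = rng.normal(size=(n, d))

R = rng.normal(size=(m, d)) / np.sqrt(m)    # compressive random matrix
Y = X @ R.T
print("distance before:", np.linalg.norm(X[0] - X[1]),
      "after projection:", np.linalg.norm(Y[0] - Y[1]))

D = 2500                                    # expanded dimension
M = (rng.normal(size=(D, d)) + 1j * rng.normal(size=(D, d))) / np.sqrt(2 * d)
features = np.abs(X @ M.T) ** 2             # intensities as random features
print("random-feature inner product:", features[0] @ features[1] / D)
```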
D. Quantum-enhanced machine learning

A fair amount of effort in quantum machine learning, a field that investigates intersections of quantum information and intelligent data mining (Biamonte et al., 2017; Schuld and Petruccione, 2018b), goes into applications of near-term quantum hardware for learning tasks (Perdomo-Ortiz et al., 2017). These so-called Noisy Intermediate-Scale Quantum or 'NISQ' devices are not only hoped to enhance machine learning applications in terms of speed, but may lead to entirely new algorithms inspired by quantum physics. We have already mentioned one such example above, a quantum neural network that can emulate a classical neural net, but go beyond. This model falls into a larger class of variational or parametrized quantum machine learning algorithms (McClean et al., 2016; Mitarai et al., 2018). The idea is to make the quantum algorithm, and thereby the device implementing the quantum computing operations, depend on parameters θ that can be trained with data. Measurements on the "trained device" represent new outputs, such as artificially generated data samples of a generative model, or classifications of a supervised classifier.
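A minimal simulated example of such a parametrized model is sketched below: a single qubit, data encoded by a rotation RY(x), one trainable rotation RY(θ), and the expectation ⟨Z⟩ as output, trained with the parameter-shift rule. The target function and every name in the code are illustrative assumptions; real variational algorithms use far richer circuits.

```python
# Minimal sketch of a variational quantum model: RY(theta) RY(x) |0>, with the
# measured expectation <Z> = cos(theta + x) as output. theta is trained by
# gradient descent using the parameter-shift rule, which is exact for rotation
# gates. All quantum operations are simulated with 2x2 linear algebra.
import numpy as np

def ry(angle):                                   # single-qubit Y rotation
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -s], [s, c]])

Z = np.diag([1.0, -1.0])

def model(x, theta):
    psi = ry(theta) @ ry(x) @ np.array([1.0, 0.0])   # prepare the state
    return psi @ Z @ psi                             # expectation value <Z>

xs = np.linspace(-1, 1, 20)
ys = np.cos(xs + 0.7)        # toy target, realizable exactly at theta = 0.7

theta, lr = 0.0, 0.2
for step in range(100):
    # parameter-shift rule: d<Z>/dtheta = [f(theta+pi/2) - f(theta-pi/2)] / 2
    grad = sum((model(x, theta) - y) *
               (model(x, theta + np.pi / 2) - model(x, theta - np.pi / 2)) / 2
               for x, y in zip(xs, ys)) / len(xs)
    theta -= lr * grad
print(f"trained theta = {theta:.3f} (target 0.7)")
```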

Another idea of how to use quantum computers to enhance learning is inspired by kernel methods (Hofmann et al., 2008) (see Figure 8 bottom right). By associating the parameters of a quantum algorithm with an input data sample x, one effectively embeds x into a quantum state |ψ(x)⟩ described by a vector in Hilbert space (Havlicek et al., 2018; Schuld and Killoran, 2018). A simple interference routine can measure overlaps between two quantum states prepared in this way. An overlap is an inner product of vectors in Hilbert space, which in the machine learning literature is known as a kernel, a distance measure between two data points. As a result, quantum computers can compute rather exotic kernels that may be classically intractable, and it is an active area of research to find interesting quantum kernels for machine learning tasks.
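The sketch below simulates this kernel construction classically for a deliberately simple feature map (one rotation angle per feature, product state of two qubits). The map, the toy labels, and the kernel ridge classifier built on the resulting Gram matrix are all assumptions for illustration; on hardware, the overlap would instead be estimated by an interference routine.

```python
# Minimal sketch of a quantum kernel: embed x -> |psi(x)> by angle encoding,
# take k(x, x') = |<psi(x)|psi(x')>|^2 as the kernel, and hand the Gram matrix
# to a standard kernel method (here, kernel ridge classification).
import numpy as np

def feature_map(x):
    # product state of two qubits, one angle per input feature
    qubits = [np.array([np.cos(xi / 2), np.sin(xi / 2)]) for xi in x]
    return np.kron(qubits[0], qubits[1])          # 4-dimensional state vector

def qkernel(a, b):
    return abs(feature_map(a) @ feature_map(b)) ** 2   # squared overlap

rng = np.random.default_rng(3)
X = rng.uniform(0, np.pi, (40, 2))
y = np.where(np.sin(X[:, 0]) * np.sin(X[:, 1]) > 0.5, 1.0, -1.0)  # toy labels

K = np.array([[qkernel(a, b) for b in X] for a in X])   # Gram matrix
alpha = np.linalg.solve(K + 1e-3 * np.eye(len(X)), y)   # kernel ridge fit

x_new = np.array([1.5, 1.5])
score = sum(a * qkernel(x_new, xi) for a, xi in zip(alpha, X))
print("score:", score, "predicted label:", np.sign(score))
```

This particular product-state map is classically trivial by construction; the research question flagged above is whether entangling feature maps yield kernels that are both hard to simulate and useful for learning.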
Beyond quantum kernels and variational circuits, quantum machine learning presents many other ideas that use quantum hardware as AI accelerators, for example as a sampler for training and inference in graphical models (Adachi and Henderson, 2015; Benedetti et al., 2017), or for linear algebra computations (Lloyd et al., 2014)². Another interesting branch of research investigates how quantum devices can directly analyze the data produced by quantum experiments, without making the detour of measurements (Cong et al., 2018). In all these explorations, a major challenge is the still severe limitations of current-day NISQ devices, which reduce numerical experiments on the hardware to proof-of-principle demonstrations, while theoretical analysis remains notoriously difficult in machine learning.

² Many quantum machine learning algorithms based on linear algebra acceleration have recently been shown to make unfounded claims of exponential speedups (Tang, 2018) when compared against classical algorithms for analysing low-rank datasets with strong sampling access. However, they are still interesting in this context where even constant speedups make a difference.
E. Outlook and Challenges

The above examples demonstrate one way in which physics research can contribute to machine learning, namely by investigating new hardware platforms to execute tiresome computations. While standard von Neumann technologies struggle to keep pace with Moore's law, this opens a number of opportunities for novel computing paradigms. In their simplest embodiment, these take the form of specialized accelerator devices, plugged onto standard servers and accessed through custom APIs. Future research focuses on the scaling-up of such hardware capabilities, hardware-inspired innovation to machine learning, and adapted programming languages as well as compilers for the optimized distribution of computing tasks on these hybrid servers.
VIII. CONCLUSIONS AND OUTLOOK

A number of overarching themes become apparent after reviewing the ways in which machine learning is used in or has enhanced the different disciplines of physics. First of all, it is clear that interest in machine learning techniques has suddenly surged in recent years. This is true even in areas such as statistical physics and high-energy physics, where the connection to machine learning techniques has a long history. We are seeing the research move from exploratory efforts on toy models towards the use of real experimental data. We are also seeing an evolution in the understanding of these approaches and their limitations, and of the situations in which their performance can be justified theoretically. A healthy and critical engagement with the potential power and limitations of machine learning includes an analysis of where these methods break and what they are distinctly not good at.

Physicists are notoriously hungry for a very detailed understanding of why and when their methods work. As machine learning is incorporated into the physicist's toolbox, it is reasonable to expect that physicists may shed light on some of the notoriously difficult questions machine learning is facing. Specifically, physicists are already contributing to issues of interpretability, techniques to validate or guarantee the results, and principled ways to choose the various parameters of neural network architectures.

One direction in which the physics community has much to learn from the machine learning community is the culture and practice of sharing code and developing carefully-crafted, high-quality benchmark datasets. Furthermore, physics would do well to emulate the practices of developing user-friendly and portable implementations of the key methods, ideally with the involvement of professional software engineers.

The picture that emerges from the level of activity and the enthusiasm surrounding the first success stories is that the interaction between machine learning and the physical sciences is merely in its infancy, and we can anticipate more exciting results stemming from this interplay between machine learning and the physical sciences.

ACKNOWLEDGEMENTS

This research was supported in part by the National Science Foundation under Grants Nos. NSF PHY-1748958, ACI-1450310, OAC-1836650, and DMR-1420073, the US Army Research Office under contract/grant number W911NF-13-1-0387, as well as the ERC under the European Union's Horizon 2020 Research and Innovation Programme Grant Agreement 714608-SMiLe. Additionally, we would like to acknowledge the support of the Moore and Sloan foundations, the Kavli Institute of Theoretical Physics of UCSB, and the Institute for Advanced Study. Finally, we would like to thank Michele Ceriotti, Yoav Levine, Andrea Rocchetto, Miles Stoudenmire, and Ryan Sweke.
REFERENCES

Aaronson, S. (2007), Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 463 (2088), 3089.
Aaronson, S. (2017), arXiv:1711.01053 [quant-ph].
Acar, E., and B. Yener (2009), IEEE Transactions on Knowledge and Data Engineering 21 (1), 6.
Acciarri, R., et al. (MicroBooNE) (2017), JINST 12 (03), P03011, arXiv:1611.05531 [physics.ins-det].
Adachi, S. H., and M. P. Henderson (2015), arXiv preprint arXiv:1510.06356.
Advani, M. S., and A. M. Saxe (2017), arXiv preprint arXiv:1710.03667.
Agresti, I., N. Viggianiello, F. Flamini, N. Spagnolo, A. Crespi, R. Osellame, N. Wiebe, and F. Sciarrino (2019), Physical Review X 9 (1), 011013.
Albergo, M. S., G. Kanwar, and P. E. Shanahan (2019), Phys. Rev. D100 (3), 034515, arXiv:1904.12072 [hep-lat].
Albertsson, K., et al. (2018), Proceedings, 18th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2017): Seattle, WA, USA, August 21-25, 2017, J. Phys. Conf. Ser. 1085 (2), 022008, arXiv:1807.02876 [physics.comp-ph].
Alet, F., and N. Laflorencie (2018), Comptes Rendus Physique, Quantum simulation / Simulation quantique, 19 (6), 498.
Alsing, J., and B. Wandelt (2019), arXiv:1903.01473 [astro-ph.CO].
Alsing, J., B. Wandelt, and S. Feeney (2018), Mon. Not. Roy. Astron. Soc. 477 (3), 2874, arXiv:1801.01497 [astro-ph.CO].
Amari, S.-i. (1998), Neural Computation 10 (2), 251.
Ambrogio, S., P. Narayanan, H. Tsai, R. M. Shelby, I. Boybat, C. Nolfo, S. Sidler, M. Giordano, M. Bodini, N. C. Farinha, et al. (2018), Nature 558 (7708), 60.
Ambs, P. (2010), Advances in Optical Technologies 2010.
Amit, D. J., H. Gutfreund, and H. Sompolinsky (1985), Physical Review A 32 (2), 1007.
Anandkumar, A., R. Ge, D. Hsu, S. M. Kakade, and M. Telgarsky (2014), Journal of Machine Learning Research 15, 2773.
Anelli, A., E. A. Engel, C. J. Pickard, and M. Ceriotti (2018), Phys. Rev. Mater. 2, 103804.
Apollinari, G., O. Brüning, T. Nakamoto, and L. Rossi (2015), CERN Yellow Report (5), 1, arXiv:1705.08830 [physics.acc-ph].
Armitage, T. J., S. T. Kay, and D. J. Barnes (2019), MNRAS 484, 1526, arXiv:1810.08430 [astro-ph.CO].
Arsenault, L.-F., R. Neuberg, L. A. Hannah, and A. J. Millis (2017), Inverse Problems 33 (11), 115007.
Arunachalam, S., and R. de Wolf (2017), ACM SIGACT News 48 (2), 41.
ATLAS Collaboration (2018), Deep generative models for fast shower simulation in ATLAS, Tech. Rep. ATL-SOFT-PUB-2018-001 (CERN, Geneva).
Aubin, B., A. Maillard, F. Krzakala, N. Macris, L. Zdeborová, et al. (2018), in Advances in Neural Information Processing Systems, pp. 3223–3234, preprint arXiv:1806.05451.
Aurisano, A., A. Radovic, D. Rocco, A. Himmel, M. D. Messier, E. Niner, G. Pawloski, F. Psihas, A. Sousa, and P. Vahle (2016), JINST 11 (09), P09001, arXiv:1604.01444 [hep-ex].
Baireuther, P., T. E. O'Brien, B. Tarasinski, and C. W. J. Beenakker (2018), Quantum 2, 48.
Baity-Jesi, M., L. Sagun, M. Geiger, S. Spigler, G. B. Arous, C. Cammarota, Y. LeCun, M. Wyart, and G. Biroli (2018), in ICML 2018, arXiv preprint arXiv:1803.06969.
Baldassi, C., C. Borgs, J. T. Chayes, A. Ingrosso, C. Lucibello, L. Saglietti, and R. Zecchina (2016), Proceedings of the National Academy of Sciences 113 (48), E7655.
Baldassi, C., A. Ingrosso, C. Lucibello, L. Saglietti, and R. Zecchina (2015), Physical review letters 115 (12), 128101.
Baldi, P., K. Bauer, C. Eng, P. Sadowski, and D. Whiteson (2016a), Physical Review D 93 (9), 094034.
Baldi, P., K. Cranmer, T. Faucett, P. Sadowski, and D. Whiteson (2016b), Eur. Phys. J. C76 (5), 235, arXiv:1601.07913 [hep-ex].
Baldi, P., P. Sadowski, and D. Whiteson (2014), Nature Commun. 5, 4308, arXiv:1402.4735 [hep-ph].
Ball, R. D., et al. (NNPDF) (2015), JHEP 04, 040, arXiv:1410.8849 [hep-ph].
Ballard, A. J., R. Das, S. Martiniani, D. Mehta, L. Sagun, J. D. Stevenson, and D. J. Wales (2017), Phys. Chem. Chem. Phys. 19 (20), 12585.
Banchi, L., E. Grant, A. Rocchetto, and S. Severini (2018), New Journal of Physics 20 (12), 123030.
Bang, J., J. Ryu, S. Yoo, M. Pawłowski, and J. Lee (2014), New Journal of Physics 16 (7), 073017.
Barbier, J., M. Dia, N. Macris, F. Krzakala, T. Lesieur, and L. Zdeborová (2016), in Advances in Neural Information Processing Systems, pp. 424–432.
Barbier, J., F. Krzakala, N. Macris, L. Miolane, and L. Zdeborová (2019), Proceedings of the National Academy of Sciences 116 (12), 5451.
Barkai, N., and H. Sompolinsky (1994), Physical Review E 50 (3), 1766.
Barra, A., G. Genovese, P. Sollich, and D. Tantari (2018), Physical Review E 97 (2), 022310.
Bartók, A. P., J. Kermode, N. Bernstein, and G. Csányi (2018), Physical Review X 8 (4), 041048.
Bartók, A. P., R. Kondor, and G. Csányi (2013), Phys. Rev. B 87, 184115.
Bartók, A. P., M. C. Payne, R. Kondor, and G. Csányi (2010), Phys. Rev. Lett. 104 (13), 136403.
Baydin, A. G., L. Heinrich, W. Bhimji, B. Gram-Hansen, G. Louppe, L. Shao, Prabhat, K. Cranmer, and F. Wood (2018), arXiv:1807.07706 [cs.LG].
Beach, M. J. S., A. Golubeva, and R. G. Melko (2018), Physical Review B 97 (4), 045207.
Beaumont, M. A., W. Zhang, and D. J. Balding (2002), Genetics 162 (4), 2025.
Becca, F., and S. Sorella (2017), Quantum Monte Carlo Approaches for Correlated Systems (Cambridge University Press, Cambridge, United Kingdom; New York, NY).
Behler, J. (2016), J. Chem. Phys. 145 (17), 170901.
Behler, J., and M. Parrinello (2007), Phys. Rev. Lett. 98 (14), 583.

Benedetti, M., J. Realpe-Gómez, R. Biswas, and A. Perdomo-Ortiz (2017), Physical Review X 7 (4), 041052.
Benítez, N. (2000), Astrophys. J. 536, 571, arXiv:astro-ph/9811189 [astro-ph].
Bény, C. (2013), in ICLR 2013, arXiv preprint arXiv:1301.3124.
Bereau, T., R. A. DiStasio, Jr., A. Tkatchenko, and O. A. von Lilienfeld (2018), J. Chem. Phys. 148 (24), 241706.
Biamonte, J., P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd (2017), Nature 549 (7671), 195.
Biehl, M., and A. Mietzner (1993), EPL (Europhysics Letters) 24 (5), 421.
Bishop, C. M. (2006), Pattern recognition and machine learning (Springer).
Bohrdt, A., C. S. Chiu, G. Ji, M. Xu, D. Greif, M. Greiner, E. Demler, F. Grusdt, and M. Knap (2018), arXiv:1811.12425 [cond-mat].
Bolthausen, E. (2014), Communications in Mathematical Physics 325 (1), 333.
Bonnett, C., et al. (DES) (2016), Phys. Rev. D94 (4), 042005, arXiv:1507.05909 [astro-ph.CO].
Borin, A., and D. A. Abanin (2019), arXiv:1901.08615 [cond-mat, physics:quant-ph].
Bozson, A., G. Cowan, and F. Spanò (2018), arXiv:1811.01242 [physics.data-an].
Bradde, S., and W. Bialek (2017), Journal of Statistical Physics 167 (3-4), 462.
Brammer, G. B., P. G. van Dokkum, and P. Coppi (2008), Astrophys. J. 686, 1503, arXiv:0807.1533 [astro-ph].
Brehmer, J., K. Cranmer, G. Louppe, and J. Pavez (2018a), Phys. Rev. D98 (5), 052004, arXiv:1805.00020 [hep-ph].
Brehmer, J., K. Cranmer, G. Louppe, and J. Pavez (2018b), Phys. Rev. Lett. 121 (11), 111801, arXiv:1805.00013 [hep-ph].
Brehmer, J., G. Louppe, J. Pavez, and K. Cranmer (2018c), arXiv:1805.12244 [stat.ML].
Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone (1984), Classification and Regression Trees (Wadsworth).
Brockherde, F., L. Vogt, L. Li, M. E. Tuckerman, K. Burke, and K.-R. Müller (2017), Nat. Commun. 8 (1), 872.
Broecker, P., F. F. Assaad, and S. Trebst (2017a), arXiv:1707.00663 [cond-mat].
Broecker, P., J. Carrasquilla, R. G. Melko, and S. Trebst (2017b), Scientific Reports 7 (1), 8823.
Bronstein, M. M., J. Bruna, Y. LeCun, A. Szlam, and P. Vandergheynst (2017), IEEE Sig. Proc. Mag. 34 (4), 18, arXiv:1611.08097 [cs.CV].
Bukov, M. (2018), Physical Review B 98 (22), 224305.
Bukov, M., A. G. Day, D. Sels, P. Weinberg, A. Polkovnikov, and P. Mehta (2018), Physical Review X 8 (3), 031086.
Butler, K. T., D. W. Davies, H. Cartwright, O. Isayev, and A. Walsh (2018), Nature 559, 547.
Cai, Z., and J. Liu (2018), Physical Review B 97 (3), 035116.
Cameron, E., and A. N. Pettitt (2012), MNRAS 425, 44, arXiv:1202.1426 [astro-ph.IM].
Carifio, J., J. Halverson, D. Krioukov, and B. D. Nelson (2017), JHEP 09, 157, arXiv:1707.00655 [hep-th].
Carleo, G., F. Becca, M. Schiro, and M. Fabrizio (2012), Scientific Reports 2, 243.
Carleo, G., Y. Nomura, and M. Imada (2018), Nature communications 9 (1), 5322.
Carleo, G., and M. Troyer (2017), Science 355 (6325), 602.
Carrasco Kind, M., and R. J. Brunner (2013), MNRAS 432, 1483, arXiv:1303.7269 [astro-ph.CO].
Carrasquilla, J., and R. G. Melko (2017), Nature Physics 13 (5), 431.
Carrasquilla, J., G. Torlai, R. G. Melko, and L. Aolita (2019), Nature Machine Intelligence 1 (3), 155.
Casado, M. L., et al. (2017), arXiv:1712.07901 [cs.AI].
Changlani, H. J., J. M. Kinder, C. J. Umrigar, and G. K.-L. Chan (2009), Physical Review B 80 (24), 245116.
Charnock, T., G. Lavaux, and B. D. Wandelt (2018), Phys. Rev. D97 (8), 083004, arXiv:1802.03537 [astro-ph.IM].
Chaudhari, P., A. Choromanska, S. Soatto, Y. LeCun, C. Baldassi, C. Borgs, J. Chayes, L. Sagun, and R. Zecchina (2016), in ICLR 2017, arXiv preprint arXiv:1611.01838.
Chen, C., X. Y. Xu, J. Liu, G. Batrouni, R. Scalettar, and Z. Y. Meng (2018a), Physical Review B 98 (4), 041102.
Chen, J., S. Cheng, H. Xie, L. Wang, and T. Xiang (2018b), Physical Review B 97 (8), 085104.
Cheng, S., L. Wang, T. Xiang, and P. Zhang (2019), arXiv:1901.02217 [cond-mat, physics:quant-ph, stat].
Chmiela, S., H. E. Sauceda, K.-R. Müller, and A. Tkatchenko (2018), Nat. Commun. 9, 3887.
Choma, N., F. Monti, L. Gerhardt, T. Palczewski, Z. Ronaghi, Prabhat, W. Bhimji, M. M. Bronstein, S. R. Klein, and J. Bruna (2018), arXiv e-prints, arXiv:1809.06166 [cs.LG].
Choo, K., G. Carleo, N. Regnault, and T. Neupert (2018), Physical Review Letters 121 (16), 167204.
Choromanska, A., M. Henaff, M. Mathieu, G. B. Arous, and Y. LeCun (2015), in Artificial Intelligence and Statistics, pp. 192–204.
Chung, S., D. D. Lee, and H. Sompolinsky (2018), Physical Review X 8 (3), 031003.
Ciliberto, C., M. Herbster, A. D. Ialongo, M. Pontil, A. Rocchetto, S. Severini, and L. Wossnig (2018), Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 474 (2209), 20170551.
Clark, S. R. (2018), Journal of Physics A: Mathematical and Theoretical 51 (13), 135301.
Clements, W. R., P. C. Humphreys, B. J. Metcalf, W. S. Kolthammer, and I. A. Walmsley (2016), Optica 3 (12), 1460.
Cocco, S., R. Monasson, L. Posani, S. Rosay, and J. Tubiana (2018), Physica A: Statistical Mechanics and its Applications 504, 45.
Cohen, N., O. Sharir, and A. Shashua (2016), in 29th Annual Conference on Learning Theory, Proceedings of Machine Learning Research, Vol. 49, edited by V. Feldman, A. Rakhlin, and O. Shamir (PMLR, Columbia University, New York, New York, USA) pp. 698–728.
Cohen, T., and M. Welling (2016), in International conference on machine learning, pp. 2990–2999.
Cohen, T. S., M. Geiger, J. Köhler, and M. Welling (2018), Proceedings of the 6th International Conference on Learning Representations (ICLR), arXiv preprint arXiv:1801.10130.
Cohen, T. S., M. Weiler, B. Kicanaoglu, and M. Welling (2019), arXiv e-prints, arXiv:1902.04615 [cs.LG].
Coja-Oghlan, A., F. Krzakala, W. Perkins, and L. Zdeborová (2018), Advances in Mathematics 333, 694.
Collett, T. E. (2015), The Astrophysical Journal 811 (1), 20.
Collister, A. A., and O. Lahav (2004), Publications of the Astronomical Society of the Pacific 116, 345, arXiv:astro-ph/0311058 [astro-ph].
Cong, I., S. Choi, and M. D. Lukin (2018), arXiv preprint arXiv:1810.03787.

Cranmer, K., S. Golkar, and D. Pappadopulo (2019), arXiv:1904.05903 [hep-lat, physics:physics, physics:quant-ph, stat].
Cranmer, K., and G. Louppe (2016), J. Brief Ideas 10.5281/zenodo.198541.
Cranmer, K., J. Pavez, and G. Louppe (2015), arXiv:1506.02169 [stat.AP].
Cristoforetti, M., G. Jurman, A. I. Nardelli, and C. Furlanello (2017), arXiv:1705.09524 [cond-mat, physics:hep-lat].
Cubuk, E. D., S. S. Schoenholz, J. M. Rieser, B. D. Malone, J. Rottler, D. J. Durian, E. Kaxiras, and A. J. Liu (2015), Physical review letters 114 (10), 108001.
Cybenko, G. (1989), Mathematics of control, signals and systems 2 (4), 303.
Czischek, S., M. Gärttner, and T. Gasenzer (2018), Physical Review B 98 (2), 024311.
Dean, D. S. (1996), Journal of Physics A: Mathematical and General 29 (24), L613.
Decelle, A., G. Fissore, and C. Furtlehner (2017), EPL (Europhysics Letters) 119 (6), 60001.
Decelle, A., F. Krzakala, C. Moore, and L. Zdeborová (2011a), Physical Review E 84 (6), 066106.
Decelle, A., F. Krzakala, C. Moore, and L. Zdeborová (2011b), Physical Review Letters 107 (6), 065701.
Deng, D.-L., X. Li, and S. Das Sarma (2017a), Physical Review B 96 (19), 195145.
Deng, D.-L., X. Li, and S. Das Sarma (2017b), Physical Review X 7 (2), 021021.
Deringer, V. L., N. Bernstein, A. P. Bartók, M. J. Cliffe, R. N. Kerber, L. E. Marbella, C. P. Grey, S. R. Elliott, and G. Csányi (2018), J. Phys. Chem. Lett. 9 (11), 2879.
Deshpande, Y., and A. Montanari (2014), in 2014 IEEE International Symposium on Information Theory.
Dirac, P. A. M. (1930), Mathematical Proceedings of the Cambridge Philosophical Society 26 (03), 376.
Doggen, E. V. H., F. Schindler, K. S. Tikhonov, A. D. Mirlin, T. Neupert, D. G. Polyakov, and I. V. Gornyi (2018), Physical Review B 98 (17), 174202.
Dong, J., S. Gigan, F. Krzakala, and G. Wainrib (2018), in 2018 IEEE Statistical Signal Processing Workshop (SSP) (IEEE) pp. 448–452.
Donoho, D. L. (2006), IEEE Transactions on information theory 52 (4), 1289.
Duarte, J., et al. (2018), JINST 13 (07), P07027, arXiv:1804.06913 [physics.ins-det].
Dunjko, V., and H. J. Briegel (2018), Reports on Progress in Physics 81 (7), 074001.
Duvenaud, D., J. Lloyd, R. Grosse, J. Tenenbaum, and G. Zoubin (2013), in International Conference on Machine Learning, pp. 1166–1174, arXiv:1302.4922.
Eickenberg, M., G. Exarchakis, M. Hirn, S. Mallat, and L. Thiry (2018), J. Chem. Phys. 148 (24), 241732.
Engel, A., and C. Van den Broeck (2001), Statistical mechanics of learning (Cambridge University Press).
Engel, E. A., A. Anelli, M. Ceriotti, C. J. Pickard, and R. J. Needs (2018), Nat. Commun. 9, 2173.
Estrada, J., J. Annis, H. T. Diehl, P. B. Hall, T. Las, H. Lin, M. Makler, K. W. Merritt, V. Scarpine, S. Allam, and D. Tucker (2007), Astrophys. J. 660, 1176, arXiv:astro-ph/0701383.
Faber, F. A., L. Hutchison, B. Huang, J. Gilmer, S. S. Schoenholz, G. E. Dahl, O. Vinyals, S. Kearnes, P. F. Riley, and O. A. von Lilienfeld (2017), J. Chem. Theory Comput. 13 (11), 5255.
Fabiani, G., and J. Mentink (2019), SciPost Physics 7 (1), 004.
Farrell, S., et al. (2018), in 4th International Workshop Connecting The Dots 2018 (CTD2018) Seattle, Washington, USA, March 20-22, 2018, arXiv:1810.06111 [hep-ex].
Feldmann, R., C. M. Carollo, C. Porciani, S. J. Lilly, P. Capak, Y. Taniguchi, O. Le Fèvre, A. Renzini, N. Scoville, M. Ajiki, H. Aussel, T. Contini, H. McCracken, B. Mobasher, T. Murayama, D. Sanders, S. Sasaki, C. Scarlata, M. Scodeggio, Y. Shioya, J. Silverman, M. Takahashi, D. Thompson, and G. Zamorani (2006), MNRAS 372, 565, arXiv:astro-ph/0609044 [astro-ph].
Firth, A. E., O. Lahav, and R. S. Somerville (2003), MNRAS 339, 1195, arXiv:astro-ph/0203250 [astro-ph].
Foreman-Mackey, D., D. W. Hogg, D. Lang, and J. Goodman (2013), Publ. Astron. Soc. Pac. 125, 306, arXiv:1202.3665 [astro-ph.IM].
Forte, S., L. Garrido, J. I. Latorre, and A. Piccione (2002), JHEP 05, 062, arXiv:hep-ph/0204232 [hep-ph].
Fortunato, S. (2010), Physics reports 486 (3-5), 75.
Fournier, R., L. Wang, O. V. Yazyev, and Q. Wu (2018), arXiv:1810.00913 [cond-mat, physics:physics].
Frate, M., K. Cranmer, S. Kalia, A. Vandenberg-Rodes, and D. Whiteson (2017), arXiv:1709.05681 [physics.data-an].
Frazier, P. I. (2018), arXiv preprint arXiv:1807.02811.
Frenkel, Y. I. (1934), Wave Mechanics: Advanced General Theory, The International series of monographs on nuclear energy: Reactor design physics No. v. 2 (The Clarendon Press).
Freund, Y., and R. E. Schapire (1997), J. Comput. Syst. Sci. 55 (1), 119.
Fösel, T., P. Tighineanu, T. Weiss, and F. Marquardt (2018), Physical Review X 8 (3), 031084.
Gabrié, M., A. Manoel, C. Luneau, N. Macris, F. Krzakala, L. Zdeborová, et al. (2018), in Advances in Neural Information Processing Systems, pp. 1821–1831, preprint arXiv:1805.09785.
Gabrié, M., E. W. Tramel, and F. Krzakala (2015), in Advances in Neural Information Processing Systems, pp. 640–648.
Gao, X., and L.-M. Duan (2017), Nature Communications 8 (1), 662.
Gardner, E. (1987), EPL (Europhysics Letters) 4 (4), 481.
Gardner, E. (1988), Journal of physics A: Mathematical and general 21 (1), 257.
Gardner, E., and B. Derrida (1989), Journal of Physics A: Mathematical and General 22 (12), 1983.
Gastegger, M., J. Behler, and P. Marquetand (2017), Chem. Sci. 8 (10), 6924.
Gendiar, A., and T. Nishino (2002), Physical Review E 65 (4), 046702.
Glasser, I., N. Pancotti, M. August, I. D. Rodriguez, and J. I. Cirac (2018a), Physical Review X 8 (1), 011006.
Glasser, I., N. Pancotti, and J. I. Cirac (2018b), arXiv preprint arXiv:1806.05964.
Gligorov, V. V., and M. Williams (2013), JINST 8, P02013, arXiv:1210.6861 [physics.ins-det].
Goldt, S., M. S. Advani, A. M. Saxe, F. Krzakala, and L. Zdeborová (2019a), arXiv preprint arXiv:1906.08632.
Goldt, S., M. S. Advani, A. M. Saxe, F. Krzakala, and L. Zdeborová (2019b), arXiv preprint arXiv:1901.09085.
Golkar, S., and K. Cranmer (2018), arXiv:1806.01337 [stat.ML].

Goodfellow, I., Y. Bengio, and A. Courville (2016), Deep learning (MIT press).
Goodfellow, I., J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio (2014), in Advances in neural information processing systems, pp. 2672–2680.
Gorodetsky, A., S. Karaman, and Y. Marzouk (2019), Computer Methods in Applied Mechanics and Engineering 347, 59.
Gray, J., L. Banchi, A. Bayat, and S. Bose (2018), Physical Review Letters 121 (15), 150503.
Greitemann, J., K. Liu, L. Pollet, et al. (2019), Physical Review B 99 (6), 060404.
Gross, D., Y.-K. Liu, S. T. Flammia, S. Becker, and J. Eisert (2010), Physical Review Letters 105 (15), 150401.
Guest, D., J. Collado, P. Baldi, S.-C. Hsu, G. Urban, and D. Whiteson (2016), Phys. Rev. D94 (11), 112002, arXiv:1607.08633 [hep-ex].
Guest, D., K. Cranmer, and D. Whiteson (2018), Ann. Rev. Nucl. Part. Sci. 68, 161, arXiv:1806.11484 [hep-ex].
Guo, C., Z. Jie, W. Lu, and D. Poletti (2018), Physical Review E 98 (4), 042114.
Györgyi, G. (1990), Physical Review A 41 (12), 7097.
Györgyi, G., and N. Tishby (1990), in W.T. Theumann and R. Kobrele (Editors), Neural Networks and Spin Glasses, 3.
Haah, J., A. W. Harrow, Z. Ji, X. Wu, and N. Yu (2017), IEEE Transactions on Information Theory 63 (9), 5628.
Hackbusch, W., and S. Kühn (2009), Journal of Fourier Analysis and Applications 15 (5), 706.
Han, J., L. Zhang, and W. E (2018a), arXiv:1807.07014 [physics].
Han, Z.-Y., J. Wang, H. Fan, L. Wang, and P. Zhang (2018b), Physical Review X 8 (3), 031012.
Hartmann, M. J., and G. Carleo (2019), arXiv:1902.05131 [cond-mat, physics:quant-ph].
Hashimoto, K., S. Sugishita, A. Tanaka, and A. Tomiya (2018a), Phys. Rev. D 98, 106014.
Hashimoto, K., S. Sugishita, A. Tanaka, and A. Tomiya (2018b), Phys. Rev. D98 (4), 046019, arXiv:1802.08313 [hep-th].
Havlicek, V., A. D. Córcoles, K. Temme, A. W. Harrow, J. M. Chow, and J. M. Gambetta (2018), arXiv preprint arXiv:1804.11326.
He, S., Y. Li, Y. Feng, S. Ho, S. Ravanbakhsh, W. Chen, and B. Póczos (2018), arXiv e-prints, arXiv:1811.06533 [astro-ph.CO].
Henson, M. A., S. T. Kay, D. J. Barnes, I. G. McCarthy, J. Schaye, and A. Jenkins (2016), Monthly Notices of the Royal Astronomical Society 465 (1), 213.
Hermans, J., V. Begy, and G. Louppe (2019), arXiv e-prints, arXiv:1903.04057 [stat.ML].
Hezaveh, Y. D., L. Perreault Levasseur, and P. J. Marshall (2017), Nature 548, 555, arXiv:1708.08842 [astro-ph.IM].
Hinton, G. E. (2002), Neural computation 14 (8), 1771.
Ho, M., M. M. Rau, M. Ntampaka, A. Farahi, H. Trac, and B. Poczos (2019), arXiv:1902.05950 [astro-ph.CO].
Hochreiter, S., and J. Schmidhuber (1997), Neural computation 9 (8), 1735.
Hofmann, T., B. Schölkopf, and A. J. Smola (2008), The Annals of Statistics, 1171.
Hollingsworth, J., T. E. Baker, and K. Burke (2018), J. Chem. Phys. 148 (24), 241743.
Hopfield, J. J. (1982), Proceedings of the national academy of sciences 79 (8), 2554.
Hsu, Y.-T., X. Li, D.-L. Deng, and S. Das Sarma (2018), Physical Review Letters 121 (24), 245701.
Hu, W., R. R. P. Singh, and R. T. Scalettar (2017), Physical Review E 95 (6), 062122.
Huang, L., and L. Wang (2017), Physical Review B 95 (3), 035105.
Huang, Y., and J. E. Moore (2017), arXiv:1701.06246 [cond-mat].
Huembeli, P., A. Dauphin, and P. Wittek (2018a), Physical Review B 97 (13), 134109.
Huembeli, P., A. Dauphin, P. Wittek, and C. Gogolin (2018b), arXiv:1806.00419 [cond-mat, physics:quant-ph].
Iakovlev, I., O. Sotnikov, and V. Mazurenko (2018), Physical Review B 98 (17), 174411.
Ilten, P., M. Williams, and Y. Yang (2017), JINST 12 (04), P04028, arXiv:1610.08328 [physics.data-an].
Ishida, E. E. O., S. D. P. Vitenti, M. Penna-Lima, J. Cisewski, R. S. de Souza, A. M. M. Trindade, E. Cameron, V. C. Busti, and COIN Collaboration (2015), Astronomy and Computing 13, 1, arXiv:1504.06129 [astro-ph.CO].
Izmailov, P., A. Novikov, and D. Kropotov (2017), arXiv:1710.07324 [cs, stat].
Jacot, A., F. Gabriel, and C. Hongler (2018), in Advances in neural information processing systems, pp. 8580–8589.
Jaeger, H., and H. Haas (2004), Science 304 (5667), 78.
Jain, A., P. K. Srijith, and S. Desai (2018), arXiv:1803.06473 [astro-ph.IM].
Javanmard, A., and A. Montanari (2013), Information and Inference: A Journal of the IMA 2 (2), 115.
Johnson, W. B., and J. Lindenstrauss (1984), Contemporary mathematics 26 (189-206), 1.
Johnstone, I. M., and A. Y. Lu (2009), Journal of the American Statistical Association 104 (486), 682.
Jones, D. R., M. Schonlau, and W. J. Welch (1998), Journal of Global Optimization 13 (4), 455.
Jouppi, N. P., C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, A. Borchers, et al. (2017), in Computer Architecture (ISCA), 2017 ACM/IEEE 44th Annual International Symposium on (IEEE) pp. 1–12.
Jónsson, B., B. Bauer, and G. Carleo (2018), arXiv:1808.05232 [cond-mat, physics:physics, physics:quant-ph].
Kabashima, Y., F. Krzakala, M. Mézard, A. Sakata, and L. Zdeborová (2016), IEEE Transactions on Information Theory 62 (7), 4228.
Kalantre, S. S., J. P. Zwolak, S. Ragole, X. Wu, N. M. Zimmerman, M. Stewart, and J. M. Taylor (2019), npj Quantum Information 5 (1), 6.
Kamath, A., R. A. Vargas-Hernández, R. V. Krems, T. Carrington, Jr, and S. Manzhos (2018), J. Chem. Phys. 148 (24), 241702.
Kashiwa, K., Y. Kikuchi, and A. Tomiya (2019), Progress of Theoretical and Experimental Physics 2019 (8), 083A04.
Kasieczka, G., T. Plehn, A. Butter, D. Debnath, M. Fairbairn, W. Fedorko, C. Gay, L. Gouskos, P. Komiske, S. Leiss, et al. (2019), arXiv:1902.09914 [hep-ph].
Kaubruegger, R., L. Pastori, and J. C. Budich (2018), Physical Review B 97 (19), 195136.

Keriven, N., D. Garreau, and I. Poli (2018), arXiv preprint arXiv:1805.08061.
Killoran, N., T. R. Bromley, J. M. Arrazola, M. Schuld, N. Quesada, and S. Lloyd (2018), "Continuous-variable quantum neural networks," arXiv:1806.06871.
Kingma, D. P., and M. Welling (2013), arXiv preprint arXiv:1312.6114.
Koch-Janusz, M., and Z. Ringel (2018), Nature Physics 14 (6), 578.
Kochkov, D., and B. K. Clark (2018), arXiv:1811.12423 [cond-mat, physics:physics].
Komiske, P. T., E. M. Metodiev, B. Nachman, and M. D. Schwartz (2018a), Phys. Rev. D98 (1), 011502, arXiv:1801.10158 [hep-ph].
Komiske, P. T., E. M. Metodiev, and J. Thaler (2018b), JHEP 04, 013, arXiv:1712.07124 [hep-ph].
Komiske, P. T., E. M. Metodiev, and J. Thaler (2019), JHEP 01, 121, arXiv:1810.05165 [hep-ph].
Kondor, R. (2018), CoRR abs/1803.01588, arXiv:1803.01588.
Kondor, R., Z. Lin, and S. Trivedi (2018), NeurIPS 2018, arXiv:1806.09231 [stat.ML].
Kondor, R., and S. Trivedi (2018), in International Conference on Machine Learning, pp. 2747–2755, arXiv:1802.03690.
Krastanov, S., and L. Jiang (2017), Scientific reports 7 (1), 11003.
Krenn, M., M. Malik, R. Fickler, R. Lapkiewicz, and A. Zeilinger (2016), Physical review letters 116 (9), 090405.
Krzakala, F., M. Mézard, and L. Zdeborová (2013a), in Information Theory Proceedings (ISIT), 2013 IEEE International Symposium on (IEEE) pp. 659–663.
Krzakala, F., C. Moore, E. Mossel, J. Neeman, A. Sly, L. Zdeborová, and P. Zhang (2013b), Proceedings of the National Academy of Sciences 110 (52), 20935.
Lang, D., D. W. Hogg, and D. Mykytyn (2016), "The Tractor: Probabilistic astronomical source detection and measurement," Astrophysics Source Code Library, ascl:1604.008.
Lanusse, F., Q. Ma, N. Li, T. E. Collett, C.-L. Li, S. Ravanbakhsh, R. Mandelbaum, and B. Póczos (2018), MNRAS 473, 3895, arXiv:1703.02642 [astro-ph.IM].
Lanyon, B. P., C. Maier, M. Holzäpfel, T. Baumgratz, C. Hempel, P. Jurcevic, I. Dhand, A. S. Buyskikh, A. J. Daley, M. Cramer, M. B. Plenio, R. Blatt, and C. F. Roos (2017), Nature Physics advance online publication, 10.1038/nphys4244.
Larkoski, A. J., I. Moult, and B. Nachman (2017), arXiv:1709.04464 [hep-ph].
Larochelle, H., and I. Murray (2011), in Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, pp. 29–37.
Le, T. A., A. G. Baydin, and F. Wood (2017), in Artificial Intelligence and Statistics, pp. 1338–1348, arXiv:1610.09900.
LeCun, Y., Y. Bengio, and G. Hinton (2015), Nature 521 (7553), 436.
Lee, J., Y. Bahri, R. Novak, S. S. Schoenholz, J. Pennington, and J. Sohl-Dickstein (2018), ICLR 2018, arXiv:1711.00165.
Leistedt, B., D. W. Hogg, R. H. Wechsler, and J. DeRose (2018), arXiv e-prints, arXiv:1807.01391 [astro-ph.CO].
Lelarge, M., and L. Miolane (2016), Probability Theory and Related Fields, 1.
Levine, Y., O. Sharir, N. Cohen, and A. Shashua (2019), Physical Review Letters 122 (6), 065301.
Levine, Y., D. Yakira, N. Cohen, and A. Shashua (2017), ICLR 2018, arXiv:1704.01552.
Li, L., J. C. Snyder, I. M. Pelaschier, J. Huang, U.-N. Niranjan, P. Duncan, M. Rupp, K.-R. Müller, and K. Burke (2015), Int. J. Quantum Chem. 116 (11), 819.
Li, N., M. D. Gladders, E. M. Rangel, M. K. Florian, L. E. Bleem, K. Heitmann, S. Habib, and P. Fasel (2016), The Astrophysical Journal 828 (1), 54.
Li, S.-H., and L. Wang (2018), Phys. Rev. Lett. 121, 260601.
Liang, X., W.-Y. Liu, P.-Z. Lin, G.-C. Guo, Y.-S. Zhang, and L. He (2018), Physical Review B 98 (10), 104426.
Likhomanenko, T., P. Ilten, E. Khairullin, A. Rogozhnikov, A. Ustyuzhanin, and M. Williams (2015), Proceedings, 21st International Conference on Computing in High Energy and Nuclear Physics (CHEP 2015): Okinawa, Japan, April 13-17, 2015, J. Phys. Conf. Ser. 664 (8), 082025, arXiv:1510.00572 [physics.ins-det].
Lin, X., Y. Rivenson, N. T. Yardimci, M. Veli, Y. Luo, M. Jarrahi, and A. Ozcan (2018), Science 361 (6406), 1004.
Liu, D., S.-J. Ran, P. Wittek, C. Peng, R. B. García, G. Su, and M. Lewenstein (2017a), arXiv:1710.04833 [cond-mat, physics:physics, physics:quant-ph, stat].
Liu, J., Y. Qi, Z. Y. Meng, and L. Fu (2017b), Physical Review B 95 (4), 041101.
Liu, J., H. Shen, Y. Qi, Z. Y. Meng, and L. Fu (2017c), Physical Review B 95 (24), 241104.
Liu, K., J. Greitemann, L. Pollet, et al. (2019), Physical Review B 99 (10), 104410.
Liu, Y., X. Zhang, M. Lewenstein, and S.-J. Ran (2018), arXiv:1803.09111 [cond-mat, physics:quant-ph, stat].
Lloyd, S., M. Mohseni, and P. Rebentrost (2014), Nature Physics 10, 631.
Louppe, G., K. Cho, C. Becot, and K. Cranmer (2017a), arXiv:1702.00748 [hep-ph].
Louppe, G., J. Hermans, and K. Cranmer (2017b), arXiv:1707.07113 [stat.ML].
Louppe, G., M. Kagan, and K. Cranmer (2016), arXiv:1611.01046 [stat.ME].
Lu, S., X. Gao, and L.-M. Duan (2018), arXiv:1810.02352 [cond-mat, physics:quant-ph].
Lu, T., S. Wu, X. Xu, and T. Francis (1989), Applied optics 28 (22), 4908.
Lubbers, N., J. S. Smith, and K. Barros (2018), J. Chem. Phys. 148 (24), 241715.
Lundberg, K. H. (2005), IEEE Control Systems 25 (3), 22.
Luo, D., and B. K. Clark (2018), arXiv:1807.10770 [cond-mat, physics:physics].
Mannelli, S. S., G. Biroli, C. Cammarota, F. Krzakala, P. Urbani, and L. Zdeborová (2018), arXiv preprint arXiv:1812.09066.
Mannelli, S. S., F. Krzakala, P. Urbani, and L. Zdeborová (2019), arXiv preprint arXiv:1902.00139.
Mardt, A., L. Pasquali, H. Wu, and F. Noé (2018), Nat. Commun. 9, 5.
Marin, J.-M., P. Pudlo, C. P. Robert, and R. J. Ryder (2012), Statistics and Computing, 1.
Marjoram, P., J. Molitor, V. Plagnol, and S. Tavaré (2003), Proceedings of the National Academy of Sciences 100 (26), 15324.
Markidis, S., S. W. Der Chien, E. Laure, I. B. Peng, and J. S. Vetter (2018), IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW).

Marshall, P. J., D. W. Hogg, L. A. Moustakas, C. D. Fassnacht, M. Bradač, T. Schrabback, and R. D. Blandford (2009), The Astrophysical Journal 694 (2), 924.
Martiniani, S., P. M. Chaikin, and D. Levine (2019), Phys. Rev. X 9, 011031.
Maskara, N., A. Kubica, and T. Jochym-O'Connor (2019), Physical Review A 99 (5), 052351.
Matsushita, R., and T. Tanaka (2013), in Advances in Neural Information Processing Systems, pp. 917–925.
Mavadia, S., V. Frey, J. Sastrawan, S. Dona, and M. J. Biercuk (2017), Nature Communications 8, 14106.
McClean, J. R., J. Romero, R. Babbush, and A. Aspuru-Guzik (2016), New Journal of Physics 18 (2), 023023.
Mehta, P., M. Bukov, C.-H. Wang, A. G. Day, C. Richardson, C. K. Fisher, and D. J. Schwab (2018), arXiv preprint arXiv:1803.08823.
Mehta, P., and D. J. Schwab (2014), arXiv preprint arXiv:1410.3831.
Mei, S., A. Montanari, and P.-M. Nguyen (2018), arXiv preprint arXiv:1804.06561.
Melnikov, A. A., H. P. Nautrup, M. Krenn, V. Dunjko, M. Tiersch, A. Zeilinger, and H. J. Briegel (2018), Proceedings of the National Academy of Sciences 115 (6), 1221.
Metodiev, E. M., B. Nachman, and J. Thaler (2017), JHEP 10, 174, arXiv:1708.02949 [hep-ph].
Mézard, M. (2017), Physical Review E 95 (2), 022117.
Mézard, M., and A. Montanari (2009), Information, physics, and computation (Oxford University Press).
Mezzacapo, F., N. Schuch, M. Boninsegni, and J. I. Cirac (2009), New Journal of Physics 11 (8), 083026.
Mills, K., K. Ryczko, I. Luchak, A. Domurad, C. Beeler, and I. Tamblyn (2019), Chem. Sci. 10 (15), 4129.
Minsky, M., and S. Papert (1969), Perceptrons: An Introduction to Computational Geometry (MIT Press, Cambridge, MA, USA).
Mitarai, K., M. Negoro, M. Kitagawa, and K. Fujii (2018), arXiv preprint arXiv:1803.00745.
Morningstar, A., and R. G. Melko (2018), Journal of Machine Learning Research 18 (163), 1.
Morningstar, W. R., Y. D. Hezaveh, L. Perreault Levasseur, R. D. Blandford, P. J. Marshall, P. Putzky, and R. H. Wechsler (2018), arXiv e-prints, arXiv:1808.00011 [astro-ph.IM].
Morningstar, W. R., L. Perreault Levasseur, Y. D. Hezaveh, R. Blandford, P. Marshall, P. Putzky, T. D. Rueter, R. Wechsler, and M. Welling (2019), arXiv e-prints, arXiv:1901.01359 [astro-ph.IM].
Nagai, R., R. Akashi, S. Sasaki, and S. Tsuneyuki (2018), J. Chem. Phys. 148 (24), 241737.
Nagai, Y., H. Shen, Y. Qi, J. Liu, and L. Fu (2017), Physical Review B 96 (16), 161102.
Nagy, A., and V. Savona (2019), arXiv:1902.09483 [cond-mat, physics:quant-ph].
Nautrup, H. P., N. Delfosse, V. Dunjko, H. J. Briegel, and N. Friis (2018), arXiv preprint arXiv:1812.08451.
Ng, A. Y., M. I. Jordan, and Y. Weiss (2002), in Advances in neural information processing systems, pp. 849–856.
Nguyen, H. C., R. Zecchina, and J. Berg (2017), Advances in Physics 66 (3), 197.
Nguyen, T. T., E. Székely, G. Imbalzano, J. Behler, G. Csányi, M. Ceriotti, A. W. Götz, and F. Paesani (2018), J. Chem. Phys. 148 (24), 241725.
Nielsen, M. A., and I. Chuang (2002), "Quantum computation and quantum information".
van Nieuwenburg, E., E. Bairey, and G. Refael (2018), Physical Review B 98 (6), 060301.
Nishimori, H. (2001), Statistical physics of spin glasses and information processing: an introduction, Vol. 111 (Clarendon Press).
Niu, M. Y., S. Boixo, V. Smelyanskiy, and H. Neven (2018), arXiv preprint arXiv:1803.01857.
Noé, F., S. Olsson, J. Köhler, and H. Wu (2019), Science 365 (6457).
Nomura, Y., A. S. Darmawan, Y. Yamaji, and M. Imada (2017), Physical Review B 96 (20), 205152.
Novikov, A., M. Trofimov, and I. Oseledets (2016), arXiv:1605.03795.
Ntampaka, M., H. Trac, D. J. Sutherland, N. Battaglia, B. Póczos, and J. Schneider (2015), Astrophys. J. 803, 50, arXiv:1410.0686 [astro-ph.CO].
Ntampaka, M., H. Trac, D. J. Sutherland, S. Fromenteau, B. Póczos, and J. Schneider (2016), Astrophys. J. 831, 135, arXiv:1509.05409 [astro-ph.CO].
Ntampaka, M., J. ZuHone, D. Eisenstein, D. Nagai, A. Vikhlinin, L. Hernquist, F. Marinacci, D. Nelson, R. Pakmor, A. Pillepich, P. Torrey, and M. Vogelsberger (2018), arXiv e-prints, arXiv:1810.07703 [astro-ph.CO].
Ntampaka, M., et al. (2019), arXiv:1902.10159 [astro-ph.IM].
Nussinov, Z., P. Ronhovde, D. Hu, S. Chakrabarty, B. Sun, N. A. Mauro, and K. K. Sahu (2016), in Information Science for Materials Discovery and Design (Springer) pp. 115–138.
O'Donnell, R., and J. Wright (2016), in Proceedings of the forty-eighth annual ACM symposium on Theory of Computing (ACM) pp. 899–912.
Ohtsuki, T., and T. Ohtsuki (2016), Journal of the Physical Society of Japan 85 (12), 123706.
Ohtsuki, T., and T. Ohtsuki (2017), Journal of the Physical Society of Japan 86 (4), 044708.
de Oliveira, L., M. Kagan, L. Mackey, B. Nachman, and A. Schwartzman (2016), JHEP 07, 069, arXiv:1511.05190 [hep-ph].
Oseledets, I. (2011), SIAM Journal on Scientific Computing 33 (5), 2295.
Paganini, M., L. de Oliveira, and B. Nachman (2018a), Phys. Rev. Lett. 120 (4), 042003, arXiv:1705.02355 [hep-ex].
Paganini, M., L. de Oliveira, and B. Nachman (2018b), Phys. Rev. D97 (1), 014021, arXiv:1712.10321 [hep-ex].
Pang, L.-G., K. Zhou, N. Su, H. Petersen, H. Stöcker, and X.-N. Wang (2018), Nature Commun. 9 (1), 210, arXiv:1612.04262 [hep-ph].
Papamakarios, G., I. Murray, and T. Pavlakou (2017), in Advances in Neural Information Processing Systems, pp. 2335–2344.
Papamakarios, G., D. C. Sterratt, and I. Murray (2018), arXiv e-prints, arXiv:1805.07226 [stat.ML].
Paris, M., and J. Rehacek, Eds. (2004), Quantum State Estimation, Lecture Notes in Physics (Springer-Verlag, Berlin Heidelberg).
Paruzzo, F. M., A. Hofstetter, F. Musil, S. De, M. Ceriotti, and L. Emsley (2018), Nat. Commun. 9, 4501.
Pastori, L., R. Kaubruegger, and J. C. Budich (2018), arXiv:1808.02069 [cond-mat, physics:physics, physics:quant-ph].
Pathak, J., B. Hunt, M. Girvan, Z. Lu, and E. Ott (2018), Physical review letters 120 (2), 024102.

Pathak, J., Z. Lu, B. R. Hunt, M. Girvan, and E. Ott (2017), Chaos: An Interdisciplinary Journal of Nonlinear Science 27 (12), 121102.
Peel, A., F. Lalande, J.-L. Starck, V. Pettorino, J. Merten, C. Giocoli, M. Meneghetti, and M. Baldi (2018), arXiv e-prints, arXiv:1810.11030 [astro-ph.CO].
Perdomo-Ortiz, A., M. Benedetti, J. Realpe-Gómez, and R. Biswas (2017), arXiv preprint arXiv:1708.09757.
Póczos, B., L. Xiong, D. J. Sutherland, and J. G. Schneider (2012), CoRR abs/1202.0302, arXiv:1202.0302.
Putzky, P., and M. Welling (2017), arXiv preprint arXiv:1706.04008.
Quek, Y., S. Fort, and H. K. Ng (2018), arXiv:1812.06693 [quant-ph].
Radovic, A., M. Williams, D. Rousseau, M. Kagan, D. Bonacorsi, A. Himmel, A. Aurisano, K. Terao, and T. Wongjirad (2018), Nature 560 (7716), 41.
Ramakrishnan, R., P. O. Dral, M. Rupp, and O. A. von Lilienfeld (2014), Sci. Data 1, 191.
Rangan, S., and A. K. Fletcher (2012), in Information Theory Proceedings (ISIT), 2012 IEEE International Symposium on (IEEE) pp. 1246–1250.
Ravanbakhsh, S., F. Lanusse, R. Mandelbaum, J. Schneider, and B. Poczos (2016), arXiv e-prints, arXiv:1609.05796 [astro-ph.IM].
Ravanbakhsh, S., J. Oliva, S. Fromenteau, L. C. Price, S. Ho, J. Schneider, and B. Poczos (2017), arXiv e-prints, arXiv:1711.02033 [astro-ph.CO].
Reck, M., A. Zeilinger, H. J. Bernstein, and P. Bertani (1994), Physical Review Letters 73 (1), 58.
Reddy, G., A. Celani, T. J. Sejnowski, and M. Vergassola (2016), Proceedings of the National Academy of Sciences 113 (33), E4877.
Reddy, G., J. Wong-Ng, A. Celani, T. J. Sejnowski, and M. Vergassola (2018), Nature 562 (7726), 236.
Regier, J., A. C. Miller, D. Schlegel, R. P. Adams, J. D. McAuliffe, and Prabhat (2018), arXiv e-prints, arXiv:1803.00113 [stat.AP].
Rem, B. S., N. Käming, M. Tarnowski, L. Asteria, N. Fläschner, C. Becker, K. Sengstock, and C. Weitenberg (2018), arXiv:1809.05519 [cond-mat, physics:quant-ph].
Ren, S., K. He, R. Girshick, and J. Sun (2015), in Advances in neural information processing systems, pp. 91–99.
Rezende, D., and S. Mohamed (2015), in International Conference on Machine Learning, pp. 1530–1538, arXiv:1505.05770.
Rezende, D. J., S. Mohamed, and D. Wierstra (2014), in Proceedings of the 31st International Conference on International Conference on Machine Learning-Volume 32 (JMLR.org) pp. II–1278.
Riofrío, C. A., D. Gross, S. T. Flammia, T. Monz, D. Nigg, R. Blatt, and J. Eisert (2017), Nature Communications 8, 15305.
Ritzmann, U., S. von Malottki, J.-V. Kim, S. Heinze, J. Sinova, and B. Dupé (2018), Nature Electronics 1 (8), 451.
Robin, A. C., C. Reylé, J. Fliri, M. Czekaj, C. P. Robert, and A. M. M. Martins (2014), Astronomy and Astrophysics 569, A13, arXiv:1406.5384 [astro-ph.GA].
Rocchetto, A. (2018), Quantum Information and Computation 18 (7&8).
Rocchetto, A., S. Aaronson, S. Severini, G. Carvacho, D. Poderini, I. Agresti, M. Bentivegna, and F. Sciarrino (2017), arXiv:1712.00127 [quant-ph].
Rocchetto, A., E. Grant, S. Strelchuk, G. Carleo, and S. Severini (2018), npj Quantum Information 4 (1), 28.
Rodríguez, A. C., T. Kacprzak, A. Lucchi, A. Amara, R. Sgier, J. Fluri, T. Hofmann, and A. Réfrégier (2018), Computational Astrophysics and Cosmology 5, 4, arXiv:1801.09070 [astro-ph.CO].
Rodriguez-Nieva, J. F., and M. S. Scheurer (2018), arXiv:1805.05961 [cond-mat].
Roe, B. P., H.-J. Yang, J. Zhu, Y. Liu, I. Stancu, and G. McGregor (2005), Nucl. Instrum. Meth. A543 (2-3), 577, arXiv:physics/0408124 [physics].
Rogozhnikov, A., A. Bukva, V. V. Gligorov, A. Ustyuzhanin, and M. Williams (2015), JINST 10 (03), T03002, arXiv:1410.4140 [hep-ex].
Ronhovde, P., S. Chakrabarty, D. Hu, M. Sahu, K. Sahu, K. Kelton, N. Mauro, and Z. Nussinov (2011), The European Physical Journal E 34 (9), 105.
Rotskoff, G., and E. Vanden-Eijnden (2018), in Advances in Neural Information Processing Systems, pp. 7146–7155.
Rupp, M., O. A. von Lilienfeld, and K. Burke (2018), J. Chem. Phys. 148 (24), 241401.
Rupp, M., A. Tkatchenko, K.-R. Müller, and O. A. von Lilienfeld (2012), Phys. Rev. Lett. 108, 058301.
Saad, D., and S. A. Solla (1995a), Physical Review Letters 74 (21), 4337.
Saad, D., and S. A. Solla (1995b), Physical Review E 52 (4), 4225.
Saade, A., F. Caltagirone, I. Carron, L. Daudet, A. Drémeau, S. Gigan, and F. Krzakala (2016), in Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on (IEEE) pp. 6215–6219.
Saade, A., F. Krzakala, and L. Zdeborová (2014), in Advances in Neural Information Processing Systems, pp. 406–414.
Saito, H. (2017), Journal of the Physical Society of Japan 86 (9), 093001.
Saito, H. (2018), Journal of the Physical Society of Japan 87 (7), 074002.
Saito, H., and M. Kato (2017), Journal of the Physical Society of Japan 87 (1), 014001.
Sakata, A., and Y. Kabashima (2013), EPL (Europhysics Letters) 103 (2), 28008.
Saxe, A. M., Y. Bansal, J. Dapello, M. Advani, A. Kolchinsky, B. D. Tracey, and D. D. Cox (2018).
Saxe, A. M., J. L. McClelland, and S. Ganguli (2013), in ICLR 2014, arXiv preprint arXiv:1312.6120.
Schawinski, K., C. Zhang, H. Zhang, L. Fowler, and G. K. Santhanam (2017), Monthly Notices of the Royal Astronomical Society: Letters, slx008, arXiv:1702.00403.
Schindler, F., N. Regnault, and T. Neupert (2017), Physical Review B 95 (24), 245134.
Schmidhuber, J. (2014), CoRR abs/1404.7828, arXiv:1404.7828.
Schmidt, E., A. T. Fowler, J. A. Elliott, and P. D. Bristowe (2018), Comput. Mater. Sci. 149, 250.
Schmitt, M., and M. Heyl (2018), SciPost Physics 4 (2), 013.
Schneider, E., L. Dai, R. Q. Topper, C. Drechsel-Grau, and M. E. Tuckerman (2017), Phys. Rev. Lett. 119 (15), 150601.
Schoenholz, S. S., E. D. Cubuk, E. Kaxiras, and A. J. Liu (2017), Proceedings of the National Academy of Sciences 114 (2), 263.
Schuch, N., M. M. Wolf, F. Verstraete, and J. I. Cirac (2008), Physical Review Letters 100 (4), 040501.

Schuld, M., and N. Killoran (2018), arXiv preprint arXiv:1803.07128v1.
Schuld, M., and F. Petruccione (2018a), Quantum computing for supervised learning (Springer).
Schuld, M., and F. Petruccione (2018b), Supervised Learning with Quantum Computers (Springer).
Schütt, K. T., H. E. Sauceda, P. J. Kindermans, A. Tkatchenko, and K. R. Müller (2018), J. Chem. Phys. 148 (24), 241722.
Schwarze, H. (1993), Journal of Physics A: Mathematical and General 26 (21), 5781.
Seif, A., K. A. Landsman, N. M. Linke, C. Figgatt, C. Monroe, and M. Hafezi (2018), Journal of Physics B: Atomic, Molecular and Optical Physics 51 (17), 174006, arXiv:1804.07718.
Seung, H., H. Sompolinsky, and N. Tishby (1992a), Physical Review A 45 (8), 6056.
Seung, H. S., M. Opper, and H. Sompolinsky (1992b), in Proceedings of the fifth annual workshop on Computational learning theory (ACM) pp. 287–294.
Shanahan, P. E., D. Trewartha, and W. Detmold (2018), Phys. Rev. D97 (9), 094506, arXiv:1801.05784 [hep-lat].
Sharir, O., Y. Levine, N. Wies, G. Carleo, and A. Shashua (2019), arXiv:1902.04057 [cond-mat].
Shen, H., D. George, E. A. Huerta, and Z. Zhao (2019), arXiv:1903.03105 [astro-ph.CO].
Shen, Y., N. C. Harris, S. Skirlo, M. Prabhu, T. Baehr-Jones, M. Hochberg, X. Sun, S. Zhao, H. Larochelle, D. Englund, et al. (2017), Nature Photonics 11 (7), 441.
Shi, Y.-Y., L.-M. Duan, and G. Vidal (2006), Physical Review A 74 (2), 022320.
Shimmin, C., P. Sadowski, P. Baldi, E. Weik, D. Whiteson, E. Goul, and A. Søgaard (2017), Phys. Rev. D96 (7), 074034, arXiv:1703.03507 [hep-ex].
Shwartz-Ziv, R., and N. Tishby (2017), arXiv preprint arXiv:1703.00810.
Sidky, H., and J. K. Whitmer (2018), J. Chem. Phys. 148 (10), 104111.
Sifain, A. E., N. Lubbers, B. T. Nebgen, J. S. Smith, A. Y. Lokhov, O. Isayev, A. E. Roitberg, K. Barros, and S. Tretiak (2018), J. Phys. Chem. Lett. 9 (16), 4495.
Sisson, S. A., and Y. Fan (2011), Likelihood-free MCMC (Chapman & Hall/CRC, New York).
Sisson, S. A., Y. Fan, and M. M. Tanaka (2007), Proceedings of the National Academy of Sciences 104 (6), 1760.
Stokes, J., and J. Terilla (2019), arXiv preprint arXiv:1902.06888.
Stoudenmire, E., and D. J. Schwab (2016), in Advances in Neural Information Processing Systems 29, edited by D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett (Curran Associates, Inc.) pp. 4799–4807.
Stoudenmire, E. M. (2018), Quantum Science and Technology 3 (3), 034003.
Sun, N., J. Yi, P. Zhang, H. Shen, and H. Zhai (2018), Physical Review B 98 (8), 085402.
Sutton, R. S., and A. G. Barto (2018), Reinforcement learning: An introduction (MIT press).
Sweke, R., M. S. Kesselring, E. P. van Nieuwenburg, and J. Eisert (2018), arXiv preprint arXiv:1810.07207.
Tanaka, A., and A. Tomiya (2017a), Journal of the Physical Society of Japan 86 (6), 063001.
Tanaka, A., and A. Tomiya (2017b), arXiv:1712.03893 [hep-lat].
Tang, E. (2018), arXiv preprint arXiv:1807.04271.
Teng, P. (2018), Physical Review E 98 (3), 033305.
Thouless, D. J., P. W. Anderson, and R. G. Palmer (1977), Philosophical Magazine 35 (3), 593.
Tiersch, M., E. Ganahl, and H. J. Briegel (2015), Scientific reports 5, 12874.
Tishby, N., F. C. Pereira, and W. Bialek (2000), arXiv preprint physics/0004057.
Tishby, N., and N. Zaslavsky (2015), in Information Theory Workshop (ITW), 2015 IEEE (IEEE) pp. 1–5.
Torlai, G., G. Mazzola, J. Carrasquilla, M. Troyer, R. Melko, and G. Carleo (2018), Nature Physics 14 (5), 447.
Torlai, G., and R. G. Melko (2017), Physical Review Letters 119 (3), 030501.
Torlai, G., and R. G. Melko (2018), Physical Review Letters 120 (24), 240503.
Torlai, G., B. Timar, E. P. L. van Nieuwenburg, H. Levine, A. Omran, A. Keesling, H. Bernien, M. Greiner, V. Vuletić, M. D. Lukin, R. G. Melko, and M. Endres (2019), arXiv:1904.08441 [cond-mat, physics:quant-ph].
Tóth, G., W. Wieczorek, D. Gross, R. Krischek, C. Schwemmer, and H. Weinfurter (2010), Physical review letters 105 (25), 250403.
Tramel, E. W., M. Gabrié, A. Manoel, F. Caltagirone, and F. Krzakala (2018), Physical Review X 8 (4), 041006.
Tsaris, A., et al. (2018), Proceedings, 18th International Work-
Smith, J. S., O. Isayev, and A. E. Roitberg (2017), Chem. shop on Advanced Computing and Analysis Techniques in
Sci. 8 (4), 3192. Physics Research (ACAT 2017): Seattle, WA, USA, August
Smith, J. S., B. Nebgen, N. Lubbers, O. Isayev, and A. E. 21-25, 2017, J. Phys. Conf. Ser. 1085 (4), 042023.
Roitberg (2018), J. Chem. Phys. 148 (24), 241733. Tubiana, J., S. Cocco, and R. Monasson (2018), arXiv
Smolensky, P. (1986), Chap. Information Processing in Dy- preprint arXiv:1803.08718.
namical Systems: Foundations of Harmony Theory (MIT Tubiana, J., and R. Monasson (2017), Physical review letters
Press, Cambridge, MA, USA) pp. 194–281. 118 (13), 138301.
Snyder, J. C., M. Rupp, K. Hansen, K.-R. Müller, and Uria, B., M.-A. Côté, K. Gregor, I. Murray, and H. Larochelle
K. Burke (2012), Phys. Rev. Lett. 108 (25), 1875. (2016), Journal of Machine Learning Research 17 (205), 1.
Sompolinsky, H., N. Tishby, and H. S. Seung (1990), Physical Valiant, L. G. (1984), Communications of the ACM 27 (11),
Review Letters 65 (13), 1683. 1134.
Sorella, S. (1998), Physical Review Letters 80 (20), 4558. Van Nieuwenburg, E. P., Y.-H. Liu, and S. D. Huber (2017),
Sosso, G. C., V. L. Deringer, S. R. Elliott, and G. Csányi Nature Physics 13 (5), 435.
(2018), Mol. Simulat. 44 (11), 866. Varsamopoulos, S., K. Bertels, and C. G. Almudever (2018),
Steinbrecher, G. R., J. P. Olson, D. Englund, and arXiv preprint arXiv:1811.12456.
J. Carolan (2018), “Quantum optical neural networks,” Varsamopoulos, S., K. Bertels, and C. G. Almudever (2019),
arxiv:1808.10047. arXiv preprint arXiv:1901.10847.
Stevens, J., and M. Williams (2013), JINST 8, P12013, Varsamopoulos, S., B. Criger, and K. Bertels (2017), Quan-
arXiv:1305.7248 [nucl-ex]. tum Science and Technology 3 (1), 015004.
47

Venderley, J., V. Khemani, and E.-A. Kim (2018), Physical Xu, Q., and S. Xu (2018), arXiv:1811.06654 [quant-ph]
Review Letters 120 (25), 257204. ArXiv: 1811.06654.
Verstraete, F., V. Murg, and J. I. Cirac (2008), Advances in Yao, K., J. E. Herr, D. W. Toth, R. Mckintyre, and J. Parkhill
Physics 57 (2), 143. (2018), Chem. Sci. 9 (8), 2261.
Vicentini, F., A. Biella, N. Regnault, and C. Ciuti Yedidia, J. S., W. T. Freeman, and Y. Weiss (2003), Explor-
(2019), arXiv:1902.10104 [cond-mat, physics:quant-ph] ing artificial intelligence in the new millennium 8, 236.
ArXiv: 1902.10104. Yoon, H., J.-H. Sim, and M. J. Han (2018), Physical Review
Vidal, G. (2007), Physical Review Letters 99 (22), 220405. B 98 (24), 245101.
Von Luxburg, U. (2007), Statistics and computing 17 (4), Yoshioka, N., and R. Hamazaki (2019), arXiv:1902.07006
395. [cond-mat, physics:quant-ph] .
Wang, C., H. Hu, and Y. M. Lu (2018), arXiv preprint Zdeborová, L., and F. Krzakala (2016), Advances in Physics
arxXiv:1805.08349. 65 (5), 453.
Wang, C., and H. Zhai (2017), Physical Review B 96 (14), Zhang, C., S. Bengio, M. Hardt, B. Recht, and O. Vinyals
144432. (2016), arXiv preprint arXiv:1611.03530.
Wang, C., and H. Zhai (2018), Frontiers of Physics 13 (5), Zhang, L., J. Han, H. Wang, R. Car, and W. E (2018a),
130507. Phys. Rev. Lett. 120 (14), 143001.
Wang, L. (2016), Physical Review B 94 (19), 195105. Zhang, L., D.-Y. Lin, H. Wang, R. Car, et al. (2018b), arXiv
Wang, L. (2018), “Generative models for physicists,”. preprint arXiv:1810.11890.
Watkin, T., and J.-P. Nadal (1994), Journal of Physics A: Zhang, P., H. Shen, and H. Zhai (2018c), Physical Review
Mathematical and General 27 (6), 1899. Letters 120 (6), 066401.
Wecker, D., M. B. Hastings, and M. Troyer (2016), Physical Zhang, W., L. Wang, and Z. Wang (2019), Physical Review
Review A 94 (2), 022309. B 99 (5), 054208.
Wehmeyer, C., and F. Noé (2018), J. Chem. Phys. 148 (24), Zhang, X., Y. Wang, W. Zhang, Y. Sun, S. He, G. Contardo,
241703. F. Villaescusa-Navarro, and S. Ho (2019), arXiv e-prints ,
Wetzel, S. J. (2017), Physical Review E 96 (2), 022140. arXiv:1902.05965arXiv:1902.05965 [astro-ph.CO].
White, S. R. (1992), Physical Review Letters 69 (19), 2863. Zhang, X.-M., Z. Wei, R. Asad, X.-C. Yang, and X. Wang
Wigley, P. B., P. J. Everitt, A. van den Hengel, J. W. Bastian, (2019), arXiv preprint arXiv:1902.02157.
M. A. Sooriyabandara, G. D. McDonald, K. S. Hardman, Zhang, Y., and E.-A. Kim (2017), Physical Review Letters
C. D. Quinlivan, P. Manju, C. C. N. Kuhn, I. R. Petersen, 118 (21), 216401.
A. N. Luiten, J. J. Hope, N. P. Robins, and M. R. Hush Zhang, Y., R. G. Melko, and E.-A. Kim (2017), Physical
(2016), Scientific Reports 6, 25890. Review B 96 (24), 245119.
Wu, D., L. Wang, and P. Zhang (2018), arXiv preprint Zhang, Y., A. Mesaros, K. Fujita, S. D. Edkins, M. H.
arXiv:1809.10606. Hamidian, K. Ch’ng, H. Eisaki, S. Uchida, J. C. S. Davis,
Xin, T., S. Lu, N. Cao, G. Anikeeva, D. Lu, J. Li, G. Long, E. Khatami, and E.-A. Kim (2018d), arXiv:1808.00479
and B. Zeng (2018), arXiv:1807.07445 [quant-ph] ArXiv: [cond-mat, physics:physics].
1807.07445. Zheng, Y., H. He, N. Regnault, and B. A. Bernevig (2018),
arXiv:1812.08171 .

You might also like