Encoding-independent optimization problem formulation for quantum computing
∗ f.dominguez@parityqc.com
† wolfgang@parityqc.com
Thus, also the encoding of real-valued problems to discrete optimization problems (discretization) as an intermediate step is discussed in Sec. VII D.

The document is structured as follows. After introducing the notation used throughout the text in Sec. II, we present a list of encodings in Sec. III. Sec. VI then functions as a manual on how to bring the optimization problems contained in this document into a form that can be solved with a quantum computer, and two explicit examples are used as illustration. Sec. VII contains a library of optimization problems which are classified into several categories, and Sec. VIII lists building blocks used in the formulation of these problems. The building blocks are also useful for handling many further problems. Finally, in Sec. IX we summarize the results and give an outlook on future projects.

II. DEFINITIONS AND NOTATION

In this section we give some basic definitions and settle the notation we will use throughout the whole text.

A discrete set of real numbers is a countable subset U ⊂ R that has no accumulation point. Discrete sets will be denoted by uppercase Latin letters, except the letter G, which we reserve for graphs. A discrete variable is a variable ranging over a discrete set R. Discrete variables will be represented by lowercase Latin letters, mostly v or w. The discrete set R in which a discrete variable v takes its values is called its range. Elements of R are denoted by lowercase Greek letters.

If a discrete variable has the range {0, 1} we call it binary or boolean. Binary variables will be denoted by the letter x. Similarly, a variable with range {−1, 1} will be called a spin variable, and the letter s will be reserved for spin variables. There is an invertible mapping from a binary variable x to a spin variable s:

x ↦ s := 2x − 1 .    (2)

For a variable v with range R we define the value indicator function to be

δvα = 1 if v = α, 0 if v ≠ α,    (3)

where α ∈ R.

We will also consider optimization problems for continuous variables. A variable v will be called continuous if its range is given by R^d for some d ∈ Z>0.

An optimization problem O is a triple (V, f, C), where

1. V := {vi}_{i=0,...,N−1} is a finite set of variables.

2. f := f(v0, ..., vN−1) is a real-valued function, called the objective or cost function.

3. C = {Ci}_{i=0,...,l} is a finite set of constraints Ci. A constraint C is either an equation

c(v0, ..., vN−1) = k    (4)

for some k ∈ R and a real-valued function c(v0, ..., vN−1), or it is an inequality

c(v0, ..., vN−1) ≤ k .    (5)

The goal for an optimization problem O = (V, f, C) is to find an extreme value yex of f such that all of the constraints are satisfied at yex.

Discrete optimization problems can often be stated in terms of graphs or hypergraphs. A graph is a pair (V, E), where V is a finite set of vertices or nodes and E ⊂ {{vi, vj} | vi ≠ vj ∈ V} is the set of edges. An element {vi, vj} ∈ E is called an edge between vertex vi and vertex vj. Note that a graph defined like this can neither have loops, i.e., edges beginning and ending at the same vertex, nor can there be multiple edges between the same pair of vertices. Given a graph G = (V, E), its adjacency matrix is the symmetric binary |V| × |V| matrix A with entries

Aij = 1 if {vi, vj} ∈ E, 0 else .    (6)

A hypergraph is a generalization of a graph in which we allow edges to be adjacent to more than two vertices. That is, a hypergraph is a pair H = (V, E), where V is a finite set of vertices and

E ⊆ ⋃_{i=1}^{|V|} { {vj1, ..., vji} | vjk ∈ V and vjk ≠ vjℓ ∀ k, ℓ = 1, ..., i }    (7)

is the set of hyperedges.

Throughout the whole paper we reserve the word qubit for actual physical qubits. To get from an encoding-independent Hamiltonian to a quantum program, binary or spin variables become Pauli-z matrices which act on the corresponding qubits.

III. ENCODINGS LIBRARY

For many important problems, the cost function and the problem constraints can be represented in terms of two fundamental building blocks: the value of the integer variable v and the value indicator δvα defined in Eq. (3). When expressed in terms of these building blocks, Hamiltonians are more compact and recurring terms can be identified across many different problems. Moreover, quantum operators are not present at this stage: an encoding-independent Hamiltonian is just a cost function in terms of discrete variables, which eases access to quantum optimization for a wider audience. The encoding of the variables and the choice of quantum algorithms can be done at later stages.

The representation of the building blocks in terms of Ising operators depends on the encoding we choose.
which means that v = α if xα = 1. The value of v is given by

v = Σ_{α=0}^{N−1} α xα .    (19)

The physically meaningful quantum states are those with a single qubit in state 1, and so dynamics must be re-

Footnote 1: Note that we only need N − 1 instead of N terms from the summation, since any configuration that makes the XOR chain return one and is not a valid one-hot string has at least three ones in it.

D. Domain-wall encoding

This encoding uses the position of a domain wall in an Ising chain to encode the values of a variable v. If the endpoints of an N + 1 spin chain are fixed in opposite states,
there must be at least one domain wall in the chain. Since the energy of a ferromagnetic Ising chain depends only on the number of domain walls it has and not on where they are located, an N + 1 spin chain with fixed opposite endpoints has N possible ground states, depending on the position of the single domain wall.

The encoding of a variable v = 1, . . . , N using domain-wall encoding requires the core Hamiltonian [30]:

Hcore = − [ −s1 + Σ_{α=1}^{N−2} sα sα+1 + sN−1 ] .    (24)

Since the fixed endpoints of the chain do not need a spin representation (s0 = −1 and sN = 1), N − 1 Ising variables {si}_{i=1}^{N−1} are sufficient for encoding a variable of N values. The minimum energy of Hcore is 2 − N, so the core term can alternatively be encoded as a sum constraint:

ccore = − [ −s1 + Σ_{α=1}^{N−2} sα sα+1 + sN−1 ] = 2 − N .    (25)

The value indicator checks whether there is a domain wall at position α:

δvα = (sα − sα−1)/2 ,    (26)

where s0 ≡ −1 and sN ≡ 1, and the variable v can be written as

v = Σ_{α=1}^{N} α δvα = [ (1 + N) − Σ_{i=1}^{N−1} si ] / 2 .    (27)

Quantum annealing experiments using domain-wall encoding have shown significant improvements in performance compared to one-hot encoding [31]. This is partly because the required search space is smaller (N − 1 versus N qubits for a variable of N values), but also because domain-wall encoding generates a smoother energy landscape: in one-hot encoding, the minimum Hamming distance between two valid states is two, whereas in domain-wall encoding this distance is one. This implies that every valid quantum state in one-hot is a local minimum, surrounded by energy barriers generated by the core energy of Eq. (21). As a consequence, the dynamics in domain-wall encoded problems freeze later in the annealing process, because only one spin flip is required to pass from one state to another.

E. Unary encoding

For this case, the value of v is encoded in the number of binary variables which are equal to one:

v = Σ_{i=0}^{N−1} xi ,    (28)

so N − 1 binary variables xi are needed for encoding an N-value variable. Unary encoding does not require a core term because every quantum state is a valid state. However, this encoding is not unique, in the sense that each value of v has multiple representations.

A drawback of unary encoding (and every dense encoding) is that it requires information from all binary variables to determine the value of v. The value indicator δvα is

δvα = Π_{i≠α} (v − i)/(α − i) ,    (29)

which involves 2^N interaction terms. This scaling is completely unfavorable, so unary encoding may only be convenient for variables that do not require value indicators δvα, but only the variable value v.

A performance comparison for the Knapsack problem using digital annealers showed that unary encoding can outperform binary and one-hot encoding and requires smaller energy scales [32]. The reasons for the high performance of unary encoding are still under investigation, but redundancy is believed to play an important role, because it makes it easier for the annealer to find the ground state. As for domain-wall encoding [35], the minimum Hamming distance between two valid states (i.e., the number of spin flips needed to pass from one valid state to another) could also explain the better performance of the unary encoding.

F. Block encodings

It is also possible to combine different approaches to obtain a balance between sparse and dense encodings [33]. Block encodings are based on B blocks, each consisting of g binary variables. Similar to one-hot encoding, the valid states for block encodings are those states where only a single block contains non-zero binary variables. The binary variables in block b, {xb,i}_{i=0}^{g−1}, define a block value wb using a dense encoding like binary, Gray or unary. For example, if wb is encoded using binary, we have

wb = Σ_{i=0}^{g−1} 2^i xb,i + 1 .    (30)

The discrete variable v is defined by the active block b and its corresponding block value wb,

v = Σ_{b=0}^{B−1} Σ_{α=1}^{2^g−1} v(b, wb = α) δ_{wb}^α ,    (31)

where v(b, wb = α) is the discrete value associated with the quantum state with active block b and block value α, and δ_{wb}^α is a value indicator that only needs to check the value of the binary variables in block b. For each block, there are 2^g − 1 possible values (assuming Gray or binary encoding
for the block), because the all-zero state is not allowed (otherwise block b is not active). If the block value wb is encoded using unary, then g values are possible. The expression of δ_{wb}^α depends on the encoding and is the same as the one already presented in the respective encoding section.

The value indicator for the variable v is the corresponding block value indicator. Suppose the discrete value v0 is encoded in the block b with a block variable wb = α:

v0 = v(b, wb = α) .    (32)

Then the value indicator δv^{v0} is

δv^{v0} = δ_{wb}^α .    (33)

A core term is necessary so that only qubits in a single block can be in the excited state. Defining tb = Σ_i xi,b, the core term results in

Hcore = Σ_{b≠b′} tb tb′ ,    (34)

or, as a sum constraint,

ccore = Σ_{b≠b′} tb tb′ = 0 .    (35)

The minimum value of Hcore is zero. If two blocks b and b′ have binary variables with value one, then tb tb′ ≠ 0 and the corresponding eigenstate of Hcore is no longer a ground state.

IV. PARITY ARCHITECTURE

The strong hardware limitations of noisy intermediate-scale quantum (NISQ) [36] devices have made sparse encodings (and especially one-hot encoding) the standard approach to problem formulation. This is mainly because the basic building blocks (value and value indicator) are of linear or quadratic order in the spin variables in these encodings. The low connectivity of qubit platforms requires Hamiltonians in the QUBO formulation, and high-order interactions are expensive when translated to QUBO [8]. However, different choices of encodings can significantly improve the performance of quantum algorithms [30–33], since a smart choice of encodings can reduce the search space or generate a smoother energy landscape.

One way this difference between encodings manifests itself is in the number of spin flips of physical qubits needed to change a variable into another valid value [35]. If this number is larger than one, there are local minima separated by invalid states penalized with a high cost, which can impede the performance of the optimization. On the other hand, such an energy landscape might offer some protection against errors (see also Ref. [23] and Ref. [37]). Furthermore, other fundamental aspects of the algorithms, such as circuit depth and energy scales, can be greatly improved outside QUBO [29, 38, 39], prompting us to look for alternative formulations to the current QUBO-based Hamiltonians. The Parity Architecture is a paradigm for solving quantum optimization problems [21, 24] that does not rely on the QUBO formulation, hence allowing a wide range of options for formulating Hamiltonians. The architecture is based on the Parity transformation, which remaps Hamiltonians onto a two-dimensional grid that requires only local connectivity of the qubits. The absence of long-range interactions enables high parallelizability of quantum algorithms and eliminates the need for costly and time-consuming SWAP gates, which helps to overcome two of the main obstacles of quantum computing: the limited coherence time of qubits and the poor connectivity of qubits within a quantum register.

The Parity transformation creates a single Parity qubit for each interaction term in the (original) logical Hamiltonian:

J_{i,j,...} σz(i) σz(j) · · · → J_{i,j,...} σz(i,j,...) ,    (36)

where the interaction strength J_{i,j,...} is now the local field of the Parity qubit σz(i,j,...). This facilitates addressing high-order interactions and frees the problem formulation from the QUBO approach. The equivalence between the original logical problem and the Parity-transformed problem is ensured by adding three-body and four-body constraints and placing them on a two-dimensional grid such that only neighboring qubits are involved in the constraints. The mapping of a logical problem onto the regular grid of a Parity chip can, for example, be realized by the Parity compiler [21]. Although the Parity compilation of the problem may require a larger number of physical qubits, the locality of interactions on the grid allows for higher parallelizability of quantum algorithms. This allows constant-depth algorithms [22, 40] to be implemented with a smaller number of gates [39].

The following toy example, summarized in Fig. 2, shows how a Parity-transformed Hamiltonian can be solved using a smaller number of qubits when the original Hamiltonian has high-order interactions. Given the logical Hamiltonian

H = σz(1) σz(2) + σz(2) σz(4) + σz(1) σz(5) + σz(1) σz(2) σz(3) + σz(3) σz(4) σz(5) ,    (37)

the corresponding QUBO formulation requires 5 + 2 = 7 qubits, including two ancilla qubits for decomposing the three-body interactions, and the total number of two-body interactions is 14. The embedding of the QUBO problem in the quantum hardware may require additional qubits and interactions depending on the chosen architecture. Instead, the Parity-transformed Hamiltonian only consists of six Parity qubits with local fields and two four-body interactions between close neighbors.

It is not yet clear what the best Hamiltonian representation is for an optimization problem. The answer
[Fig. 1, table: for each encoding (binary, one-hot, unary, domain-wall, block), an example register together with the expressions for the value v, the value indicator δvα, and the core term; the dense encodings (binary and unary) require no core term.]
FIG. 1. Summary of popular encodings of discrete variables in terms of binary xi or Ising si variables. Each encoding has a particular representation of the value of the variable v and of the value indicator δvα. If every quantum state in the register represents a valid value of v, the encoding is called dense. In contrast, sparse encodings must include a core term in the Hamiltonian in order to single out the states that represent a value of v. Sparse encodings have simpler representations of the value indicators than dense encodings, but the core-term implementation demands extra interaction terms between the qubits. The cost of a Hamiltonian in terms of the number of qubits and interaction terms strongly depends on the chosen encoding and should be evaluated for every specific problem.
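The value formulas summarized in Fig. 1 can be compared side by side in a few lines. The sketch below uses our own helper names (they are not from the paper) and follows Eqs. (19) and (28) together with the binary row of Fig. 1, where the value is offset by +1:

```python
# Value functions v(x) of three encodings from Fig. 1 (helper names are ours).
def v_binary(x):
    """Dense binary encoding: v = sum_i 2^i x_i + 1, using ~log2(N) bits."""
    return sum(2 ** i * xi for i, xi in enumerate(x)) + 1

def v_one_hot(x):
    """Sparse one-hot encoding, Eq. (19): v = sum_i i * x_i over N bits."""
    return sum(i * xi for i, xi in enumerate(x))

def v_unary(x):
    """Dense unary encoding, Eq. (28): v equals the number of ones."""
    return sum(x)

print(v_binary([1, 0, 1]))          # -> 6
print(v_one_hot([0, 0, 0, 1, 0]))   # -> 3
print(v_unary([1, 1, 1, 0, 0]))     # -> 3
```

The one-hot register needs a core term to forbid strings with zero or several ones, while the binary and unary registers accept every bit string, which is exactly the dense/sparse distinction drawn in the caption.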
will probably depend strongly on the particular use case we want to solve, and will take into account not only the number of qubits needed, but also the smoothness of the energy landscape, which has a direct impact on the performance of quantum algorithms [41]. The Parity architecture allows us to explore formulations beyond the standard QUBO, and this library aims to facilitate the exploration of new Hamiltonian formulations.

V. ENCODING CONSTRAINTS

Up to this point, we have presented the possible encodings of the discrete variables in terms of binary variables. In this section, we assume that the encodings of the variables have already been chosen and explain how to implement the hard constraints associated with the problem. Hard constraints c(v1, . . . , vN) = K often appear in optimization problems, limiting the search space and making problems even more difficult to solve. We consider polynomial constraints of the form

c(v1, . . . , vN) = Σ_i gi vi + Σ_{i,j} gi,j vi vj + Σ_{i,j,k} gi,j,k vi vj vk + . . . ,    (38)

which remain polynomial after replacing the discrete variables vi with any encoding. The coefficients gi, gi,j, . . . depend on the problem and its constraints.

In general, constraints can be implemented dynamically [9, 10] (exploring only quantum states that satisfy the constraints) or as extra terms Hcore in the Hamiltonian, such that eigenvectors of H are also eigenvectors of Hcore and the ground states of Hcore correspond to elements in the domain of f that satisfy the constraint. Even if the original problem is unconstrained, the use of sparse encodings such as one-hot or domain wall imposes hard constraints on the quantum variables.

Constraints arising from sparse encodings are usually incorporated as a penalty term Hcore into the Hamiltonian that penalizes any state outside the desired subspace:

Hcore = A [c(v1, . . . , vN) − K]^2 ,    (39)
[Fig. 2: the logical Hamiltonian of Eq. (37) in its QUBO formulation and after Parity compilation. Fig. 3: (a) a hard constraint implemented as an energy penalization over the full Hilbert space versus dynamically within the valid subspace; (b) a constrained logical problem and its Parity representation.]
FIG. 3. (a) Decision tree to encode the constraints, which can be implemented as energy penalizations in the Hamiltonian of the problem or dynamically by selecting a driver Hamiltonian that preserves the desired condition. When using energy penalties (left figure), the driver term needs to explore the entire Hilbert space. In contrast, the dynamic implementation of the constraints (right figure) reduces the search space to the subspace satisfying the hard constraint, thus improving the performance of the algorithms. (b) Constrained logical problem (left) and its Parity representation. In the Parity architecture, each term of the polynomial constraint is represented by a single Parity qubit (green qubits), so the polynomial constraints define a subspace in which the total magnetization of the involved qubits is preserved and which can be explored with an exchange driver σ+ σ− + h.c. that preserves the total magnetization. Figure originally published in Ref. [29].

Hcore = A [c(v1, . . . , vN) − K] ,    (40)

in the special case that c(x1, . . . , xN) ≥ K is satisfied. The constant A must be large enough to ensure that the ground state of the total Hamiltonian satisfies the constraint, but the implementation of large energy scales lowers the efficiency of quantum algorithms [42] and additionally imposes a technical challenge. Moreover, extra terms in the Hamiltonian imply additional overhead of computational resources, especially for squared terms like in Eq. (39).

Quantum algorithms for finding the ground state of Hamiltonians, like QAOA or quantum annealing, require driver terms Udrive = exp(−it Hdrive) that spread the initial quantum state over the entire Hilbert space. Dynamical implementation of constraints employs a driver term that only explores the subspace of the Hilbert space that satisfies the constraints. Given an encoded constraint in terms of Ising operators σz(i),

c(σz(1), . . . , σz(N)) = Σ_i gi σz(i) + Σ_{i,j} gi,j σz(i) σz(j) + · · · = K ,    (41)

a driver Hamiltonian Hdrive that commutes with c(σz(1), . . . , σz(N)) will restrict the dynamics to the valid subspace, provided the initial quantum state satisfies the constraint:

c|ψ0⟩ = K|ψ0⟩
Udrive c|ψ0⟩ = Udrive K|ψ0⟩    (42)
c [Udrive |ψ0⟩] = K [Udrive |ψ0⟩] .

In general, the construction of constraint-preserving drivers depends on the problem [9, 10, 25, 28, 30]. Approximate driver terms have been proposed, which admit some degree of leakage and may be easier to construct [28]. Within the Parity Architecture, each term of a polynomial constraint is a single Parity qubit [21, 24]. This implies that for the Parity architecture the polynomial constraints are simply the conservation of magnetization between the qubits involved:

Σ_i gi σz(i) + Σ_{i,j} gi,j σz(i) σz(j) + · · · = K  →  Σ_{u∈c} gu σz(u) = K ,    (43)

where the Parity qubit σz(u) represents the product of logical qubits σz(u1) σz(u2) . . . σz(un) and we sum over all these
products that appear in the constraint c. A driver Hamiltonian based on exchange (or flip-flop) terms σ+(u) σ−(w) + h.c., summed over all pairs of products u, w ∈ c, preserves the total magnetization and explores the complete subspace where the constraint is satisfied [29]. The decision tree for encoding constraints is presented in Fig. 3.

VI. ENCODING-INDEPENDENT FORMULATION

The problems given in this library consist of a cost function f and a set of constraints that define a subspace where the function f must be minimized. Both the cost function and the constraints are expressed in terms of discrete variables, and we call this the encoding-independent formulation. To generate a Hamiltonian suitable for quantum optimization algorithms, the discrete variables must be encoded in Ising operators, using the encodings presented in Sec. III. If sparse encodings are chosen for some variables, then the core constraints of these variables must be included in the constraint set of the problem.

After encoding the problem, it must be decided whether the constraints (either problem constraints or core constraints) are implemented as energy penalties or dynamically in the driver term (see Sec. V). The Hamiltonian of the problem will consist of the encoded cost function f plus the constraints ci that are implemented as energy penalties:

H = A f + B c1 + C c2 + . . . ,    (44)

where the energy scales A < B < C, . . . are selected such that the ground state of H is also the ground state of each of the terms representing constraints. This guarantees that it is never energetically favorable to violate a constraint in order to further reduce the cost function. If the energy scales are too different, the performance of the algorithms may deteriorate (in addition to the experimental challenge that this entails), so the parameters A, B, C, . . . cannot be set arbitrarily large. The determination of the optimal energy scales is an important open problem. For some of the problems in the library we provide an estimation of the energy scales (cf. also Sec. VIII D 2).

The problems included in the library are classified into four different categories: subsets, partitions, permutations, and continuous variables. These categories are defined by the role of the discrete variable and are intended to organize the library and make it easier to find problems, but also to serve as a basis for the formulation of similar use cases. An additional category in Sec. VII E contains problems that do not fit into the previous categories but may also be important use cases for quantum algorithms. In Sec. VIII we include a summary of recurrent encoding-independent building blocks that are used throughout the library and could be useful in formulating new problems.

We now present two examples, going from the encoding-independent problem to the Ising Hamiltonian and its constraints. This output can already be compiled using the Parity compiler [21] without the need for reducing the problem to QUBO or additional embedding.

Let us illustrate the encoding for two examples, the clustering problem and the traveling salesman problem.

a. Clustering problem: The clustering problem is a partitioning problem described in Sec. VII A 1. The process to go from the encoding-independent Hamiltonian to the spin Hamiltonian that has to be implemented in the quantum computer is outlined in Fig. 4. A problem instance is defined from the required inputs: the number of clusters K we want to create, and N objects with weights wi and distances di,j between the objects. Two different types of discrete variables are required: variables vi = 1, . . . , K (i = 1, . . . , N) indicate to which of the K possible clusters the object i is assigned, and variables yj = 1, . . . , Wmax track the weight in cluster j.

The variables vi and yj serve different purposes in the formulation of the clustering problem, and thus it might be beneficial to use different encodings. For the variables vi only the value indicator δ_{vi}^α is required; therefore sparse encodings with a simple δ_{vi}^α are recommended. In contrast, only the value of the variables yj is required for the problem formulation, and so dense encodings like binary or Gray will reduce the number of necessary boolean variables. In order to find the spin representation of the problem, it is necessary to choose an encoding and decide whether the core term (if there is one) is included as an energy penalization or via a constraint-preserving driver (see Sec. V).

Once the encodings have been chosen, the discrete variables are replaced in the cost function and in the constraint associated with the problem, Eqs. (49) and (47), respectively. The problem constraint can be encoded as an energy penalization in the Hamiltonian or implemented dynamically in the driver term of the quantum algorithm as a sum constraint. The resulting Hamiltonian and constraints are now in terms of spin operators and can be compiled using the Parity compiler to obtain a two-dimensional chip design including only local interactions.

b. Traveling salesperson problem: The second example we present is the traveling salesperson problem (TSP), described in Sec. VII C 2. The input to this problem is a (usually complete) graph G with a cost wi,j associated with each edge in the graph. We seek a path that visits all nodes of G without visiting the same node twice, and that also minimizes the total cost Σ wi,j of the edges of the path.

This problem requires N variables vi = 1, . . . , N, where N is the number of vertices in G. Each variable vi represents a node in the graph and indicates the stage of the travel at which node i is visited. If vi = α and vj = α + 1, the salesperson travels from node i to node j, and so the cost function adds wi,j to the total cost of the travel, as indicated in Eq. (96). This formulation requires the constraints of Eq. (93), vi ≠ vj for all i ≠ j, so that the salesperson visits every node once.

In Fig. 5, we present an example of a decision tree for
[Fig. 4, decision tree: a clustering problem instance (distances di,j, weights wi, number of clusters K) with the cost function f = Σ_k Σ_{i<j} di,j δ_{vi}^k δ_{vj}^k and the weight constraint yk = Σ_i wi δ_{vi}^k, encoded with one-hot vi and binary yk variables and implemented either as a penalization or in the driver term.]
FIG. 4. Example decision tree for the clustering problem. The entire process is presented in four different blocks (Define problem, Encode variables, Replace variables and Final Hamiltonian) and the decisions taken are highlighted in green. The formulation of a problem instance requires the discrete variables vi and yk to be encoded in terms of qubit operators. In this example, the vi variables are encoded using the one-hot encoding, while for the yk variables we use the binary encoding. The Hamiltonian is obtained by substituting in the encoding-independent expressions of Eqs. (47) and (49) the discrete variables vi, yk and the value indicators δvα according to the chosen encoding. Core terms associated with sparse encodings (one-hot in this case) and the problem constraints can be added to the Hamiltonian as energy penalizations or implemented dynamically in the driver term of the quantum algorithm as sum constraints. The final Hamiltonian and the sum constraints can be directly encoded on a Parity chip using the Parity Compiler, without the need to reduce the problem to its QUBO formulation.
this problem. The cost function and the constraint include products of value indicators δ_{vi}^α δ_{vk}^β, for which sparse encodings like domain wall are better suited.

VII. PROBLEM LIBRARY

A. Partitioning problems
[Fig. 5, decision tree: the TSP variables vi and value indicators δ_{vi}^α in the one-hot, domain-wall, unary and binary encodings; the domain-wall core terms cvi enter the final Hamiltonian H = A1 Σ_i cvi + A2 f as penalizations, while the problem constraint is kept as a sum constraint.]
FIG. 5. Example decision tree for the TSP problem. The Hamiltonian is obtained by substituting in the encoding-independent
expressions of Eqs. (96) and (93) the value indicators according to the chosen encoding. The choices made in the decision tree
are highlighted in green. For this problem instance we assume the graph G is complete (there is an edge connecting every pair
of nodes) and we choose domain-wall encoding. This sparse encoding is convenient since the cost function and the problem
constraint include products of value indicators δvαi δvβk , which would imply a large number of interaction terms using a dense
encoding. Sparse encodings require a core term for defining the valid states of the Hilbert space. In this case, we choose
to encode this core term as an energy penalization in the Hamiltonian while the problem constraint is considered as a sum
constraint. The final Hamiltonian and the sum constraints can be directly encoded on a Parity chip using the Parity Compiler,
without the need to reduce the problem to its QUBO formulation.
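The domain-wall bookkeeping used in the decision tree above can be checked classically with a few lines. This is a sketch with our own helper names (not code from the paper), following Eqs. (26) and (27) for a variable v = 1, . . . , N encoded in N − 1 spins with fixed virtual endpoints s0 = −1 and sN = +1:

```python
# Domain-wall encoding of a discrete variable, Eqs. (26)-(27).
def domain_wall_value(s):
    """Value v from Eq. (27): v = ((1 + N) - sum(s)) / 2 for a valid chain."""
    N = len(s) + 1
    return ((1 + N) - sum(s)) // 2

def value_indicator(s, alpha):
    """delta_v^alpha from Eq. (26): (s_alpha - s_{alpha-1}) / 2."""
    chain = [-1] + list(s) + [1]   # fixed endpoints s0 = -1 and sN = +1
    return (chain[alpha] - chain[alpha - 1]) // 2

s = [-1, -1, 1]                    # valid chain for N = 4, wall at position 3
print(domain_wall_value(s))                           # -> 3
print([value_indicator(s, a) for a in range(1, 5)])   # -> [0, 0, 1, 0]
```

Exactly one indicator is nonzero for a valid chain, which is why a single spin flip moves the wall to a neighboring value, the property behind the smoother energy landscape discussed in Sec. III.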
partition P of U is a set P := {Uk}_{k∈K} of subsets Uk ⊂ U such that U = ⋃_{k=1}^{K} Uk and Uk ∩ Uk′ = ∅ if k ≠ k′. Partitioning problems require a discrete variable vi for each element ui ∈ U. The value of the variable indicates to which subset the element belongs, so vi can take K different values. The values assigned to the subsets are arbitrary; therefore the value of the variable vi is usually not important in these cases, but only the value indicator δ_{vi}^α is needed, so sparse encodings may be convenient for these variables.
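A minimal classical sketch of this bookkeeping (identifiers are ours, not the paper's), assuming a one-hot encoding so that the value indicator is simply δ_{vi}^α = x_{i,α}:

```python
# A partition of five elements into K = 3 subsets, one discrete variable v_i
# per element, represented by one-hot registers x_{i,alpha}.
K, assignment = 3, [1, 3, 2, 1, 3]   # v_i for elements u_0 .. u_4

def one_hot(v, K):
    """One-hot register for v in {1..K}: x_alpha = 1 iff v = alpha."""
    return [1 if v == alpha else 0 for alpha in range(1, K + 1)]

registers = [one_hot(v, K) for v in assignment]
# Recover each subset from the value indicators alone.
subsets = {alpha: [i for i, x in enumerate(registers) if x[alpha - 1] == 1]
           for alpha in range(1, K + 1)}
print(subsets)  # -> {1: [0, 3], 2: [2], 3: [1, 4]}
```

Note that only the indicators are ever read; the numeric value of vi plays no role, which is why a sparse encoding with a cheap δ_{vi}^α is the natural choice here.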
which can be expressed as:

    y_k = \sum_{i} w_i \, \delta^{k}_{v_i} .    (47)

If the encoding for the auxiliary variables makes it necessary (e.g. a binary encoding and W_{\mathrm{max}} not a power of 2), this constraint must be shifted as described in Sec. VIII B 3.

d. Cost function  The sum of the distances of the elements of a subset is

    f_k = \sum_{\{i<j \,:\, u_i, u_j \in U_k\}} d_{i,j} = \sum_{i<j} d_{i,j} \, \delta^{k}_{v_i} \delta^{k}_{v_j} ,    (48)

and the resulting cost function is

    f = \sum_{k} f_k .    (49)

e. References  This problem is studied in Ref. [43] as part of the Capacitated Vehicle Routing problem.

a. Description  Given a set U of N real numbers u_i, we look for a partition of size K such that the sum of the numbers in each subset is as homogeneous as possible.

b. Variables  The problem requires N variables v_i \in [1, K], one per element u_i. The value of v_i indicates to which subset u_i belongs.

then enforces that l is at least as large as the maximum p_i (see Sec. VIII A 3), and the second term minimizes l. The theta step function can be expressed in terms of the value indicator functions according to the building block Eq. (150), where we either have to introduce auxiliary variables or express the value indicators directly according to the discussion in Sec. VIII D 1.

d. Special cases  For K = 2, the cost function that minimizes the difference between the partial sums is

    f = (p_2 - p_1)^2 = \left[ \sum_{i=1}^{N} \left( \delta^{1}_{v_i} - \delta^{2}_{v_i} \right) u_i \right]^2 .    (53)

The only two possible outcomes for \delta^{1}_{v_i} - \delta^{2}_{v_i} are \pm 1, so these factors can be trivially encoded using spin variables s_i = \pm 1, leading to

    f = \left( \sum_{i=1}^{N} s_i u_i \right)^2 .    (54)

e. References  The Hamiltonian formulation for K = 2 can be found in Ref. [11].

a. Description  In this problem, the nodes of a graph G = (V, E) are divided into K different subsets, each one representing a different color. We look for a partition in which two adjacent nodes are painted with different colors, and we can also look for a partition that minimizes the number of colors used.
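The K = 2 number-partitioning cost of Eqs. (53) and (54) lends itself to a quick brute-force check; the following sketch (numbers invented for illustration) verifies that the minimal squared cost corresponds to a balanced split:

```python
from itertools import product

def partition_cost(s, u):
    """Eq. (54): squared difference of the two partial sums, with spins s_i = +/-1."""
    return sum(si * ui for si, ui in zip(s, u)) ** 2

u = [4, 5, 6, 7, 8]  # numbers to split into two subsets

# Enumerate all spin assignments; 4 + 5 + 6 = 7 + 8, so a perfect split exists.
best = min(product([-1, 1], repeat=len(u)), key=lambda s: partition_cost(s, u))
assert partition_cost(best, u) == 0
```

On hardware the same enumeration would of course be replaced by minimizing the corresponding spin Hamiltonian.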
b. Variables  We define a variable v_i = 1, \dots, K for each node i = 1, \dots, N in the graph.

c. Constraints  We must penalize solutions for which two adjacent nodes are painted with the same color. The cost function of the graph partitioning problem presented in Eq. (60) can be used for the constraint of graph coloring. In that case, [1 - \delta(v_i - v_j)] A_{i,j} was used to count the number of adjacent nodes belonging to different subsets; now we use \delta(v_i - v_j) A_{i,j} to indicate if nodes i and j are painted with the same color. Therefore the constraint is (see building block VIII B 1)

    c = \sum_{i<j} \delta(v_i - v_j) A_{i,j} = 0 .    (55)

d. Cost function  The decision problem ("Is there a coloring that uses K colors?") can be answered by implementing the constraint as a Hamiltonian H = c (note that c \geq 0). The existence of a solution with zero energy implies that a coloring with K colors exists.

Alternatively, we can look for the coloring that uses the minimum number of colors. To check if a color \alpha is used, we can use the following term (see building block VIII B 2):

    u_\alpha = 1 - \prod_{i=1}^{N} \left( 1 - \delta^{\alpha}_{v_i} \right) ,    (56)

which is zero if and only if color \alpha is not used in the coloring, and one if at least one node is painted with color \alpha. The number of colors used in the coloring is the cost function of the problem:

    f = \sum_{\alpha=1}^{K} u_\alpha .    (57)

This objective function can be very expensive to implement, since u_\alpha includes products of \delta^{\alpha}_{v_i} of order N, so a good estimation of the minimum number K_{\mathrm{min}} would be useful to avoid using an unnecessarily large K.

e. References  The one-hot encoded version of this problem can be found in Ref. [11].

f. References  This Hamiltonian was formulated in Ref. [11] using one-hot encoding.

which is equal to 1 when nodes i and j belong to the same partition (v_i = v_j) and zero when not (v_i \neq v_j). In this way, the term

    [1 - \delta(v_i - v_j)] A_{i,j}    (59)

is equal to 1 if there is a cut edge connecting nodes i and j, where A_{i,j} is the adjacency matrix of the graph. The cost function for minimizing the number of cut edges is obtained by summing over all pairs of nodes:

    f = \sum_{i<j} [1 - \delta(v_i - v_j)] A_{i,j} ,    (60)

or alternatively -f to maximize the number of cut edges.

d. Constraints  A common constraint imposes that partitions have a specific size. The element counting building block VIII A 1 is defined as

    t_\alpha = \sum_{i=1}^{|V|} \delta^{\alpha}_{v_i} .    (61)

If we want the partition \alpha to have L elements, then

    t_\alpha = L    (62)

must hold. If we want two partitions \alpha and \beta to have the same size, then the constraint is

    t_\alpha = t_\beta ,    (63)

and all the partitions will have the same size when imposing

    \sum_{a<b} (t_a - t_b)^2 = 0 ,    (64)

which is only possible if |V|/K is a natural number.

e. References  The Hamiltonian for K = 2 can be found in Ref. [11].

f. Hypergraph partitioning  The problem formulation can be extended to hypergraphs. A hyperedge is cut when it contains vertices from at least two different subsets. Given a hyperedge e of G, the function

    \mathrm{cut}(e) = \prod_{i,j \in e} \delta(v_i - v_j)    (65)

f. References  This problem can be found in Ref. [11].
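The coloring constraint of Eq. (55) and the color-counting cost of Eqs. (56) and (57) can be evaluated classically on a small instance; the following sketch (toy graph, function names ours) checks them on a 2-colorable 4-cycle:

```python
def coloring_constraint(v, edges):
    """Eq. (55): number of monochromatic edges; zero for a proper coloring."""
    return sum(1 for i, j in edges if v[i] == v[j])

def colors_used(v, K):
    """Eq. (57): f = sum_alpha u_alpha, with u_alpha from Eq. (56)."""
    total = 0
    for alpha in range(K):
        prod = 1
        for vi in v:
            prod *= 1 - int(vi == alpha)
        total += 1 - prod  # u_alpha: 1 iff some node is painted with color alpha
    return total

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]  # a 4-cycle
v = [0, 1, 0, 1]                          # proper 2-coloring
assert coloring_constraint(v, edges) == 0
assert colors_used(v, K=3) == 2
```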
where \partial v is the set of edges connected to vertex v. Additionally, the matching should be maximal. For each vertex u, we define a variable y_u = \sum_{i \in \partial u} x_i, which is only zero if the vertex does not belong to an edge of C. If the first constraint is satisfied, this variable can only be 0 or 1. In this case, the constraint can be enforced by

    f_C = C \sum_{e = (u_1, \dots, u_k) \in E} \prod_{a=1}^{k} (1 - y_{u_a}) = 0 .    (84)

However, one has to make sure that the constraint implemented by f_B is not violated in favor of f_C, which could happen if for some v, y_v > 1 and for m neighboring vertices y_u = 0. Then the contributions from v are given by

    f_v = \frac{1}{2} B y_v (y_v - 1) + C (1 - y_v) m ,    (85)

and since m + y_v is bounded by the maximum degree \Delta of G times the maximum rank of the hyperedges k, we need to set B > (\Delta k - 2) C to ensure that the ground state of f_B + f_C does not violate the first constraint. Finally, one has to prevent f_C from being violated in favor of f_A, which entails C > A.

e. Resources  The maximum order of the interaction terms is k, and the number of terms scales roughly with (|V| + |E| 2^k) \Delta(\Delta - 1).

f. References  This problem can be found in Ref. [11].

6. Set cover

e. Special case: exact cover  If we want each element of U to appear once and only once in the cover, then y_\alpha = 1 for all \alpha, and the constraint of the problem reduces to

    c_\alpha = \sum_{i \,:\, \alpha \in V_i} x_i = 1 .    (89)

f. References  The one-hot encoded Hamiltonian can be found in Ref. [11].

7. Knapsack

a. Description  A set U contains N objects, each of them with a value d_i and a weight w_i. We look for a subset of U with the maximum value \sum d_i such that the total weight of the selected objects does not exceed the upper limit W.

b. Variables  We define a binary variable x_i = 0, 1 for each element in U that indicates if element i is selected or not. We also define an auxiliary variable y that indicates the total weight of the selected objects:

    y = \sum_{i \,:\, x_i = 1} w_i .    (90)

If the weights are natural numbers w_i \in \mathbb{N}, then y is also natural, and the encoding of this auxiliary variable is greatly simplified.

c. Cost function  The cost function is given by
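As a toy illustration of the knapsack variables and the auxiliary weight y of Eq. (90) (all numbers invented), a brute-force search over the binary variables x_i reads:

```python
from itertools import product

values  = [10, 13, 7, 11]
weights = [3, 4, 2, 5]   # natural numbers, so the auxiliary y is natural too
W = 7                    # upper weight limit

best_value, best_x = 0, None
for x in product([0, 1], repeat=len(values)):
    y = sum(w for w, xi in zip(weights, x) if xi)  # auxiliary variable of Eq. (90)
    if y <= W:
        val = sum(d for d, xi in zip(values, x) if xi)
        if val > best_value:
            best_value, best_x = val, x

# Items 0 and 1: weight 3 + 4 = 7 <= W, value 10 + 13 = 23.
assert best_value == 23 and best_x == (1, 1, 0, 0)
```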
Note that we can also combine the cost function for minimizing the number of shifts and this constraint by using the cost function

    f = \sum_{t=1}^{D} \left( \sum_{i=1}^{N} p_i v_{i,t} - n_{\mathrm{min},t} \right)^2 .    (113)

Finally, the constraint that a nurse i should work at most d_{\mathrm{max}} consecutive shifts reads

    c(i) = \sum_{t_2 - t_1 > d_{\mathrm{max}}} \prod_{t' = t_1}^{t_2} v_{i,t'} = 0 .    (114)

e. References  This problem can be found in Ref. [49].

for the projection to the i-th coordinate. A hyperrectangle R \subset \mathbb{R}^d is defined as

    R := \left\{ x \in \mathbb{R}^d \mid \pi_i(x) \in [a_i, b_i] \ \forall\, i \in D_n \right\} .    (116)

For the fixed set of chosen coordinates D_n, the Dirichlet processes are run m times to get m hyperrectangles \{R_1, \dots, R_m\}. For k = 1, \dots, m, we define binary projection maps

    z_k : \mathbb{R}^d \to \mathbb{Z}_2 , \quad x \mapsto z_k(x) := \begin{cases} 1, & \text{if } x \in R_k \\ 0, & \text{else.} \end{cases}    (117)

Then Random subspace coding is defined as a map
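Assuming for simplicity that D_n contains all d coordinates, the binary projections of Eqs. (116) and (117) can be sketched as follows (hyperrectangles invented for illustration):

```python
def in_rect(x, rect):
    """Eq. (116): membership in a hyperrectangle given as a list of (a_i, b_i) per coordinate."""
    return all(a <= xi <= b for xi, (a, b) in zip(x, rect))

def rsc_code(x, rects):
    """Eq. (117): one binary projection z_k(x) per hyperrectangle R_k."""
    return [int(in_rect(x, r)) for r in rects]

rects = [[(0, 1), (0, 1)],
         [(0.5, 2), (0, 2)],
         [(3, 4), (3, 4)]]   # three hyperrectangles in d = 2
assert rsc_code((0.7, 0.5), rects) == [1, 1, 0]
```

The resulting bit string is the random-subspace code of the point x.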
model postulates that without crashes the equity values V (such that v = \tilde{C} V) in equilibrium satisfy

    V = D p + C V \quad \rightarrow \quad v = \tilde{C} (1 - C)^{-1} D p .    (122)

Crashes are then modeled as abrupt changes in the prices of assets held by an institution, i.e., via

    v = \tilde{C} (1 - C)^{-1} \left( D p - b(v, p) \right) ,    (123)

where b_i(v, p) = \beta_i(p) \left( 1 - \Theta(v_i - v_i^c) \right) results in the problem being highly non-linear.

b. Variables  It is useful to shift the market values

where k is the highest order of the ansatz and the variables v_i depend on the continuous-to-discrete encoding. In terms of the indicator functions they are expressed as

    v_i = \sum_{\alpha} \alpha \, \delta^{\alpha}_{v_i} .    (127)

Alternatively, one could write the ansatz as a function of the indicator functions alone instead of the variables:

    y'(v, w) = \sum_{r=1}^{k} \sum_{\alpha_{i_1}, \dots, \alpha_{i_r} = 0, \dots, \Delta} w'_{i_1, \dots, i_r} \, \delta^{\alpha_{i_1}}_{v_{i_1}} \cdots \delta^{\alpha_{i_r}}_{v_{i_r}} .    (128)
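A one-hot instance of Eq. (127), recovering a variable value from its indicator vector, can be sketched as follows (zero-based values assumed; the snippet is illustrative):

```python
def value_from_indicators(delta):
    """Eq. (127): v = sum_alpha alpha * delta^alpha_v for a valid one-hot vector."""
    assert sum(delta) == 1  # exactly one indicator is set
    return sum(alpha * d for alpha, d in enumerate(delta))

assert value_from_indicators([0, 0, 1, 0]) == 2
assert value_from_indicators([1, 0, 0, 0]) == 0
```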
if they are negations of each other, i.e. y_i^l = \neg y_j^p, and solve the maximum independent set (MIS) problem for this graph [44]. If and only if this set has cardinality m, the SAT instance is satisfiable. The variables in this formulation are the m \times k boolean variables x_i, which are 1 if the vertex is part of the independent set and 0 otherwise.

c. Cost function  A clause-violation based cost function for a problem in CNF can be written as

    f_C = \sum_{j} (1 - \sigma_j(x))    (137)

with (see Sec. VIII C 2)

    \sigma_j = l_{j1} \vee l_{j2} \vee \dots \vee l_{jk} = 1 - \prod_{n=1}^{k} (1 - l_{jn}) .    (138)

A cost function for the MIS problem can be constructed from a term encouraging a higher cardinality of the independent set:

    f_B = -b \sum_{j=1}^{m \times k} x_j .    (139)

d. Constraints  In the MIS formulation one has to enforce that there are no connections between members of the maximally independent set:

    c = \sum_{(i,j) \in E} x_i x_j = 0 .    (140)

If this constraint is implemented as an energy penalty f_A = a c, it should have a higher priority. The minimal cost of a spin flip (cf. Sec. VIII D 2) from f_A is a(m - 1), and in f_B a spin flip could result in a maximal gain of b. Thus a/b \gtrsim 1/(m - 1).

e. Resources  The cost function f_C consists of up to O(2^k m k) terms with a maximal order of k in the spin variables. In the alternative approach the order is only quadratic, so it would naturally be in a QUBO formulation. However, the number of terms in f_{\mathrm{MIS}} can be as high as O(m^2 k^2) and thus scales worse in the number of clauses.

f. References  This problem can be found in Ref. [44].

3. Improvement of Matrix Multiplication

a. Description  Matrix multiplication is among the most used mathematical operations employed in informatics. For a streamlined presentation of the procedure we present the case of matrix multiplication over the real numbers.

Given a real n \times m matrix A and a real m \times p matrix B, the usual algorithm for matrix multiplication requires n p m multiplication operations. However, there are algorithms using fewer resources, e.g. in Ref. [56]. Strassen showed that there is an algorithm for multiplying two 2 \times 2 matrices involving only 7 multiplications at the expense of more addition operations, which are computationally less costly.

In a recent paper [57], machine learning methods were used to identify new effective algorithms for matrix multiplication. They formulated the search for more effective algorithms (using fewer multiplication operations) as an optimization problem. We will show that there is an equivalent Ising-Hamiltonian optimization problem.

Multiplying matrices A and B can be stated in terms of an nm \times mp \times np tensor \mathcal{M}. The usual algorithm for matrix multiplication corresponds to a sum decomposition of \mathcal{M} into nmp terms of the following form:

    \mathcal{M} = \sum_{\alpha=0}^{nmp-1} u^{(\alpha)} \otimes v^{(\alpha)} \otimes w^{(\alpha)} ,    (141)

where u^{(\alpha)} is an nm-vector with binary entries for any \alpha = 0, \dots, nmp - 1, and v^{(\alpha)}, w^{(\alpha)} are binary mp- and np-vectors. How to multiply two matrices using the tensor of Eq. (141) is described in Ref. [57, Algorithm 1]. One maps the matrices A, B and C to vectors of the respective length, e.g., an n \times m matrix A will be an nm-vector whose k-th entry is the component A_{rs} for k = rm + s. Then one starts by computing an intermediate nmp-vector m = (m_0, \dots, m_{nmp-1}), whose entries are

    m_i = \left( \sum_{j=0}^{nm-1} u_j^{(i)} A_j \right) \left( \sum_{k=0}^{mp-1} v_k^{(i)} B_k \right)    (142)

for all i = 0, \dots, nmp - 1. In a second step, the components \{C_i\}_{i=0,\dots,np-1} of the final matrix C = AB are determined via

    C_k = \sum_{r=0}^{nmp-1} w_k^{(r)} m_r ,    (143)

for all k = 0, \dots, np - 1. Making matrix multiplication more efficient corresponds to finding a sum decomposition of \mathcal{M} with fewer terms (lower rank), i.e., we want to find vectors a^{(\beta)}, b^{(\beta)}, c^{(\beta)}, where \beta = 0, \dots, B - 1 and B < nmp, such that

    \mathcal{M} = \sum_{\beta=0}^{B-1} a^{(\beta)} \otimes b^{(\beta)} \otimes c^{(\beta)}    (144)

holds. Performing the above algorithm with this shorter tensor decomposition corresponds to a matrix multiplication algorithm involving fewer scalar multiplications. Thus the new algorithm is more efficient. In general, we can allow the vectors

    \left\{ a^{(\beta)}, b^{(\beta)}, c^{(\beta)} \right\}_{\beta = 0, \dots, B-1}
to have real entries. To ease our lives we restrict ourselves to entries in the finite set K := \{-k, -k+1, \dots, k-1, k\} with k \in \mathbb{N}; Ref. [57] reports promising results for k = 2. Note that the tensor \mathcal{M} has binary entries though. Allowing more values for the decomposition vectors just increases the search space for possible solutions.

b. Variables  The variables of the problem are given by the entries of the vectors in a sum decomposition of \mathcal{M}. That is, there are vectors of variables

    \left\{ a^{(\alpha)} \right\}_{\alpha = 0, \dots, nmp-1} ,

where every entry a_i^{(\alpha)} of a^{(\alpha)} = (a_0^{(\alpha)}, \dots, a_{nm-1}^{(\alpha)}) is a discrete variable with range K. Similarly, we have vectors of variables \{ b^{(\alpha)} \}_{\alpha = 0, \dots, nmp-1}, with b^{(\alpha)} = (b_0^{(\alpha)}, \dots, b_{mp-1}^{(\alpha)}), where b_i^{(\alpha)} has range K.

A. Auxiliary functions

The simplest class of building blocks are auxiliary scalar functions of multiple variables that can be used directly in cost functions or constraints via penalties.

1. Element counting

One of the most common building blocks is the function t_a that simply counts the number of variables v_i with a given value a. In terms of the value indicator functions we have

    t_a = \sum_{i} \delta^{a}_{v_i} .    (148)
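Returning for a moment to the matrix-multiplication problem, the two-step algorithm of Eqs. (142) and (143) can be sanity-checked against the standard nmp-term decomposition of Eq. (141); the sketch below uses the row-major flattening k = rm + s from the text (all function names are ours):

```python
import itertools

def standard_terms(n, m, p):
    """The nmp-term decomposition of Eq. (141) behind the textbook algorithm."""
    terms = []
    for r, t, s in itertools.product(range(n), range(m), range(p)):
        u = [int(j == r * m + t) for j in range(n * m)]  # picks A_rt
        v = [int(k == t * p + s) for k in range(m * p)]  # picks B_ts
        w = [int(l == r * p + s) for l in range(n * p)]  # marks C_rs
        terms.append((u, v, w))
    return terms

def decomp_multiply(A, B, terms, n, m, p):
    """Evaluate C = AB via the intermediate vector of Eq. (142) and the sum of Eq. (143)."""
    a = [A[r][t] for r in range(n) for t in range(m)]    # row-major flattening, k = r*m + t
    b = [B[t][s] for t in range(m) for s in range(p)]
    ms = [sum(u[j] * a[j] for j in range(n * m)) *
          sum(v[k] * b[k] for k in range(m * p)) for u, v, _ in terms]
    return [[sum(w[r * p + s] * ms[i] for i, (_, _, w) in enumerate(terms))
             for s in range(p)] for r in range(n)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
assert decomp_multiply(A, B, standard_terms(2, 2, 2), 2, 2, 2) == [[19, 22], [43, 50]]
```

Replacing `standard_terms` by a shorter valid decomposition, as in Eq. (144), yields the same product with fewer scalar multiplications.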
4. Compare variables

Given two variables v, w \in [1, K], the following term indicates if v and w are equal:

    \delta(v - w) = \sum_{\alpha=1}^{K} \delta^{\alpha}_{v} \, \delta^{\alpha}_{w} .

If we have a set of variables \{v_i \in [1, K]\}_{i=1}^{N}, and two variables v_i, v_j are connected when the entry A_{ij} of the adjacency matrix A is 1, the following term has a minimum when v_i \neq v_j for all connected i \neq j:

    c = \sum_{i \neq j} \delta(v_i - v_j) A_{ij} = \sum_{i \neq j} \sum_{\alpha=1}^{K} \delta^{\alpha}_{v_i} \delta^{\alpha}_{v_j} A_{ij} .    (153)

The minimum value of c is zero, which is attained if and only if there is no connected pair i, j such that v_i = v_j. For all-to-all connectivity one can use this building block to enforce that all variables are different. If K = N, the condition v_i \neq v_j for all i \neq j is then equivalent to asking that each of the K possible values is reached by a variable.

2. Value \alpha is used

Given a set of N variables v_i, we want to know if at least one variable is taking the value \alpha. This is done by the term

    u_\alpha = \prod_{i=1}^{N} \left( 1 - \delta^{\alpha}_{v_i} \right) = \begin{cases} 0 & \text{if } \exists\, v_i = \alpha \\ 1 & \text{if } \nexists\, v_i = \alpha . \end{cases}    (154)

inequality v_i \leq a_0, one can enforce this with an energy penalization for all values that do not satisfy the inequality:

    c = \sum_{K \geq a > a_0} \delta^{a}_{v_i} .    (155)

so c is bound to be equal to some value of y, and the only possible values for y are those that satisfy the inequality.

If the auxiliary variable y is expressed in binary or Gray encoding, then Eq. (158) must be modified when 2^n < K < 2^{n+1} for some n \in \mathbb{N}:

    y - c(v_1, \dots, v_N) - 2^{n+1} + K = 0 ,    (159)

which ensures c(v_1, \dots, v_N) < K if y = 0, \dots, 2^{n+1} - 1.

4. Constraint preserving driver

When annealing-inspired algorithms like quantum annealing or QAOA are used, an alternative to adding penalty terms to the cost function that penalize states which do not satisfy constraints is to start the algorithm in a state that fulfills the constraints and to use driver Hamiltonians that do not lead out of the constraint-fulfilling subspace [6, 38]. As explained in Sec. V, we demand that the driver Hamiltonian commutes with the operator generating the constraints and that it reaches the entire valid search space. Arbitrary polynomial constraints c can for example be handled by using the Parity mapping, which brings constraints to the form \sum_{u \in c} g_u \sigma_z^{(u)}. Starting in a constraint-fulfilling state and employing constraint-dependent flip-flop terms constructed from \sigma_+, \sigma_- operators as driver terms on the mapped spins [29] automatically enforces the constraints.
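The element-counting and value-used building blocks, Eqs. (148) and (154), are easily checked classically; a minimal sketch (with our own function names) reads:

```python
def t(a, v):
    """Element counting, Eq. (148): how many variables take the value a."""
    return sum(1 for vi in v if vi == a)

def u(alpha, v):
    """Eq. (154): u_alpha = prod_i (1 - delta^alpha_{v_i}); 1 iff no variable equals alpha."""
    prod = 1
    for vi in v:
        prod *= 1 - int(vi == alpha)
    return prod

v = [1, 3, 3, 2, 3]
assert t(3, v) == 3 and t(5, v) == 0
assert u(2, v) == 0 and u(4, v) == 1
```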
1. Modulo 2 linear programming

For the set of linear equations

    x A = y ,    (160)

where A \in \mathbb{F}_2^{l \times n} and y \in \mathbb{F}_2^{n} are given, we want to solve for x \in \mathbb{F}_2^{l}. The cost function

    f = -\sum_{i=1}^{n} (1 - 2 y_i) \prod_{j=1}^{l} (1 - 2 x_j A_{ji})    (161)

minimizes the Hamming distance between x A and y, and thus the ground state represents a solution. If we consider the second factor for fixed i, we notice that it counts the number of 1s in x (mod 2) where A_{ji} does not vanish at the corresponding index. When acting with

    H_i = \prod_{j=1}^{l} \sigma_{z,j}^{A_{ji}}    (162)

on |x\rangle we find the same result, and thus the cost function

    f = -\sum_{i=1}^{n} (1 - 2 y_i) \prod_{j=1}^{l} s_j^{A_{ji}} ,    (163)

with spin variables s_j, has the solution to Eq. (160) as its ground state.

2. Representation of boolean functions

Given a boolean function f : \{0,1\}^n \to \{0,1\}, we want to express it as a cost function in terms of the n boolean variables. This is hard in general [58], but for (combinations of) local boolean functions there are simple expressions. In particular we have, e.g.,

    \neg x_1 = 1 - x_1 ,    (164)
    x_1 \wedge x_2 \wedge \dots \wedge x_n = \prod_{i=1}^{n} x_i ,    (165)
    x_1 \vee x_2 \vee \dots \vee x_n = 1 - \prod_{i=1}^{n} (1 - x_i) .    (166)

shown here for the ground state/solution to the problem at hand. For example, if we are interested in a satisfying assignment for x_1 \wedge \dots \wedge x_n, we might formulate the cost function in two different ways that both have x_1 = \dots = x_n = 1 as their unique ground state; namely f = -\prod_{i=1}^{n} x_i and f = \sum_{i=1}^{n} (1 - x_i). The two corresponding spin Hamiltonians will have vastly different properties: the first one will have, when expressed in spin variables, 2^n terms with up to n-th order interactions, whereas the second one has n linear terms. Additionally, configurations which are closer to a fulfilling assignment, i.e. have more of the x_i equal to one, have a lower cost in the second option, which is not the case for the first option. Note that for x_1 \vee \dots \vee x_n there is no similar trick, as we want to penalize exactly one configuration.

D. Meta optimization

Finally, we present several options for choices that arise in the construction of cost functions. This includes meta parameters like coefficients of building blocks, but also reoccurring methods and techniques in dealing with optimization problems.

1. Auxiliary variables

It is often useful to combine information about a set of variables v_i into one auxiliary variable y that is then used in other parts of the cost function. Examples include the element counting building block or the maximal value of a set of variables, where we showed a way to bind the value of an auxiliary variable to this function of the set of variables. If the function f can be expressed algebraically in terms of the variable values/indicator functions (as a finite polynomial), the natural way to do this is via adding the constraint term

    c \left( y - f(v_i) \right)^2    (169)

with an appropriate coefficient c. To avoid the downsides of introducing constraints, there is an alternative route that might be preferred in certain cases. This alternative consists in simply using the variable indicator \delta^{\alpha}_{y(v_i, \delta^{\beta}_{v_i})}
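The modulo-2 cost function of Eq. (161) can be verified by brute force on a small instance; in the sketch below (instance invented for illustration) the overall sign is chosen such that solutions of xA = y sit at the minimum:

```python
from itertools import product

def cost(x, A, y):
    """Eq. (161)-style parity-comparison cost; minimal exactly on solutions of xA = y over F_2."""
    l, n = len(A), len(A[0])
    total = 0
    for i in range(n):
        parity = 1
        for j in range(l):
            parity *= 1 - 2 * x[j] * A[j][i]  # equals (-1)^(sum_j x_j A_ji)
        total += (1 - 2 * y[i]) * parity      # +1 when bit i matches, -1 otherwise
    return -total                             # sign chosen so solutions sit at the minimum

A = [[1, 0, 1],
     [0, 1, 1]]                               # A in F_2^(l x n), l = 2, n = 3
x_true = (1, 1)
y = [sum(x_true[j] * A[j][i] for j in range(2)) % 2 for i in range(3)]  # y = xA over F_2

best = min(product([0, 1], repeat=2), key=lambda x: cost(x, A, y))
assert best == x_true and cost(best, A, y) == -3
```

Each of the n terms contributes +1 to the (negated) sum when the corresponding bit of xA matches y, reproducing the Hamming-distance argument in the text.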
where the x_{v_{m,t}, \beta} are the binary variables of the one-hot encoding of v_{m,t}.

2. Prioritization of cost function terms

If a cost function is constructed out of multiple terms,

    f = a_1 f_1 + a_2 f_2 ,    (171)

one often wants to prioritize one term over another, e.g., if the first term encodes a hard constraint. One strategy to ensure that f_1 is not "traded off" against f_2 is to evaluate the minimal cost \Delta_1 in f_1 and the maximal gain \Delta_2 in f_2 from flipping one spin. It can be more efficient to do this independently for both terms and assume a state close to an optimum. Then the coefficients can be set according to c_1 \Delta_1 \gtrsim c_2 \Delta_2. In general, this will depend on the encoding for two reasons: first, for some encodings one has to add core terms to the cost function, which have to be taken into account for the prioritization (they usually have the highest priority). Furthermore, in some encodings a single spin flip can cause the value of a variable to change by more than one.² Nonetheless, it can be an efficient heuristic to compare the "costs" and "gains" introduced above for pairs of cost function terms already in the encoding-independent formulation, by evaluating them for single variable changes by one and thereby fixing their relative coefficients. Then one only has to fix the coefficient for the core term after the encoding is chosen.

3. Problem conversion

Many decision problems associated to discrete optimization problems are NP-complete, meaning that one can always map them to any other NP-complete problem with only a polynomial overhead of classical runtime in the system size. Therefore, a new problem for which one does not have a cost function formulation could be classically mapped to another problem where such a formulation is at hand. However, since quantum algorithms are expected to deliver only a polynomial advantage for general NP-hard problems, it is advisable to carefully analyze the overhead. It is also possible that there are trade-offs, as in the example of k-SAT in Sec. VII E 2, where in the original formulation the order of terms in the cost function is k, while in the MIS formulation we naturally have a QUBO problem where the number of terms scales worse in the number of clauses.

² E.g. in the binary encoding a spin flip can cause a variable shift of up to K/2 if the variables take values 1, ..., K. This is the main motivation to use modifications like the Gray encoding.

IX. CONCLUSION AND OUTLOOK

In this paper we have collected and elaborated on a wide variety of optimization problems which can be formulated as a spin Hamiltonian, allowing them to be solved on a quantum computer. We first presented these optimization problems in an encoding-independent fashion. Only after choosing an encoding for the problem's variables do we obtain a spin Hamiltonian ready to be fed into algorithms such as quantum annealing or QAOA. Representing an optimization problem in terms of a spin Hamiltonian already comprises some choices, one of which is an encoding. Ultimately, this freedom arises from the fact that the solution to the problem is only encoded in the ground state of the Hamiltonian and the excited states are not fixed.

This is best illustrated with an example from the representation of the one-hot core term in Sec. III C, where two alternatives were given to enforce the one-hot condition. If we were restricted to converting problems to a QUBO form, the choice with lower order of interaction would be superior, but with the Parity mapping and architecture, higher order interaction terms can be handled better, and thus the Hamiltonian with fewer terms might be preferred.

We have thus highlighted that obtaining the spin Hamiltonian for an optimization problem involves several choices. Rather than a burden, we consider this a feature. It leaves room to improve the performance of quantum algorithms by making different choices. This is the second main contribution of this paper; we identified common, reoccurring building blocks in the formulation of optimization problems. These building blocks, like encodings, parts of cost functions or different choices for imposing constraints, constitute a construction kit for optimization problems. This paves the way to an automated preprocessing algorithm picking the best choices among the construction kit to facilitate the search for optimally performing quantum algorithms.

Finding the optimal spin Hamiltonian for a given hardware platform for a quantum computer is in itself an important problem. Optimal spin Hamiltonians for different hardware platforms are very likely to differ, thus it is important to have a procedure capable of tailoring a problem to any given platform. Our hardware-agnostic approach is a first key step in this direction.

In Sec. VI we also demonstrated with two examples how to use the building blocks and make choices when constructing the spin Hamiltonian. These examples can be used as a template to solve customized optimization problems in a systematic way.

The next step in this program towards automation is explicit testing for selected problems. In particular, we plan to benchmark different encodings for variables. Ultimately, the testing and benchmark results will be incorporated in a pipeline to automate the search for Hamiltonian formulations leading to better quantum algorithm performance.
[1] Boris Melnikov, "Discrete optimization problems-some new heuristic approaches," in Eighth International Conference on High-Performance Computing in Asia-Pacific Region (HPCASIA'05) (IEEE, 2005) pp. 73–82.
[2] Marco Dorigo and Gianni Di Caro, "Ant colony optimization: a new meta-heuristic," in Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406), Vol. 2 (IEEE, 1999) pp. 1470–1477.
[3] Nina Mazyavkina, Sergey Sviridov, Sergei Ivanov, and Evgeny Burnaev, "Reinforcement learning for combinatorial optimization: A survey," Comput. Oper. Res. 134, 105400 (2021).
[4] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, and Michael Sipser, "Quantum computation by adiabatic evolution," arXiv:quant-ph/0001106 (2000).
[5] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann, "A quantum approximate optimization algorithm," arXiv:1411.4028 (2014).
[6] Stuart Hadfield, Zhihui Wang, Bryan O'Gorman, Eleanor G. Rieffel, Davide Venturelli, and Rupak Biswas, "From the quantum approximate optimization algorithm to a quantum alternating operator ansatz," Algorithms 12, 34 (2019).
[7] Naeimeh Mohseni, Peter L. McMahon, and Tim Byrnes, "Ising machines as hardware solvers of combinatorial optimization problems," Nat. Rev. Phys. 4, 363–379 (2022).
[8] Gary Kochenberger, Jin-Kao Hao, Fred Glover, Mark Lewis, Zhipeng Lü, Haibo Wang, and Yang Wang, "The unconstrained binary quadratic programming problem: a survey," J. Comb. Optim. 28, 58–81 (2014).
[9] Itay Hen and Marcelo S. Sarandy, "Driver hamiltonians for constrained optimization in quantum annealing," Phys. Rev. A 93, 062312 (2016).
[10] Itay Hen and Federico M. Spedalieri, "Quantum annealing for constrained optimization," Phys. Rev. Applied 5, 034007 (2016).
[11] Andrew Lucas, "Ising formulations of many np problems," Front. Phys. 2, 5 (2014).
[12] Samuel A. Wilkinson and Michael J. Hartmann, "Superconducting quantum many-body circuits for quantum simulation and computing," Appl. Phys. Lett. 116, 230501 (2020).
[13] Clemens Dlaska, Kilian Ender, Glen Bigan Mbeng, Andreas Kruckenhauser, Wolfgang Lechner, and Rick van Bijnen, "Quantum optimization via four-body rydberg gates," Phys. Rev. Lett. 128, 120503 (2022).
[14] Niklas J. Glaser, Federico Roy, and Stefan Filipp, "Controlled-controlled-phase gates for superconducting qubits mediated by a shared tunable coupler," arXiv:2206.12392 (2022).
[15] N. Chancellor, S. Zohren, and P. A. Warburton, "Circuit design for multi-body interactions in superconducting quantum annealing systems with applications to a scalable architecture," Npj Quantum Inf. 3, 21 (2017).
[16] M. Schöndorf and F. K. Wilhelm, "Nonpairwise interactions induced by virtual transitions in four coupled artificial atoms," Phys. Rev. Appl. 12, 064026 (2019).
[17] Tim Menke, Florian Häse, Simon Gustavsson, Andrew J. Kerman, William D. Oliver, and Alán Aspuru-Guzik, "Automated design of superconducting circuits and its application to 4-local couplers," Npj Quantum Inf. 7, 49 (2021).
[18] Tim Menke, William P. Banner, Thomas R. Bergamaschi, Agustin Di Paolo, Antti Vepsäläinen, Steven J. Weber, Roni Winik, Alexander Melville, Bethany M. Niedzielski, Danna Rosenberg, Kyle Serniak, Mollie E. Schwartz, Jonilyn L. Yoder, Alán Aspuru-Guzik, Simon Gustavsson, Jeffrey A. Grover, Cyrus F. Hirjibehedin, Andrew J. Kerman, and William D. Oliver, "Demonstration of tunable three-body interactions between superconducting qubits," Phys. Rev. Lett. 129, 220501 (2022).
[19] Yao Lu, Shuaining Zhang, Kuan Zhang, Wentao Chen, Yangchao Shen, Jialiang Zhang, Jing-Ning Zhang, and Kihwan Kim, "Global entangling gates on arbitrary ion qubits," Nature 572, 363–367 (2019).
[20] G. Pelegrí, A. J. Daley, and J. D. Pritchard, "High-fidelity multiqubit rydberg gates via two-photon adiabatic rapid passage," Quantum Sci. Technol. 7, 045020 (2022).
[21] Kilian Ender, Roeland ter Hoeven, Benjamin E. Niehoff, Maike Drieb-Schön, and Wolfgang Lechner, "Parity quantum optimization: Compiler," arXiv:2105.06233 (2021).
[22] Wolfgang Lechner, "Quantum approximate optimization with parallelizable gates," IEEE Trans. Quantum Eng. 1, 1–6 (2020).
[23] Michael Fellner, Anette Messinger, Kilian Ender, and Wolfgang Lechner, "Universal parity quantum computing," Phys. Rev. Lett. 129, 180503 (2022).
[24] Wolfgang Lechner, Philipp Hauke, and Peter Zoller, "A quantum annealing architecture with all-to-all connectivity from local interactions," Sci. Adv. 1, e1500838 (2015).
[25] Stuart Hadfield, Zhihui Wang, Eleanor G. Rieffel, Bryan O'Gorman, Davide Venturelli, and Rupak Biswas, "Quantum approximate optimization with hard and soft constraints," Proceedings of the Second International Workshop on Post Moores Era Supercomputing, PMES'17, 15–21 (2017).
[26] Yingyue Zhu, Zewen Zhang, Bhuvanesh Sundar, Alaina M. Green, C. Huerta Alderete, Nhung H. Nguyen, Kaden R. A. Hazzard, and Norbert M. Linke, "Multi-round qaoa and advanced mixers on a trapped-ion quantum computer," arXiv:2201.12335 (2022).
[27] Franz Georg Fuchs, Kjetil Olsen Lye, Halvor Møll Nilsen, Alexander Johannes Stasik, and Giorgio Sartor, "Constraint preserving mixers for the quantum approximate optimization algorithm," Algorithms 15, 202 (2022).
[28] Nicolas P. D. Sawaya, Albert T. Schmitz, and Stuart Hadfield, "Encoding trade-offs and design toolkits in quantum algorithms for discrete optimization: coloring, routing, scheduling, and other problems," arXiv:2203.14432 (2022).
[29] Maike Drieb-Schön, Younes Javanmard, Kilian Ender, and Wolfgang Lechner, "Parity quantum optimization: Encoding constraints," arXiv:2105.06235 (2021).
[30] Nicholas Chancellor, "Domain wall encoding of discrete variables for quantum annealing and QAOA," Quantum Sci. Technol. 4, 045004 (2019).
[31] Jie Chen, Tobias Stollenwerk, and Nicholas Chancellor, "Performance of domain-wall encoding for quantum annealing," IEEE Trans. Quantum Eng. 2, 1–14 (2021).
[32] Kensuke Tamura, Tatsuhiko Shirai, Hosho Katsura, Shu Tanaka, and Nozomu Togawa, "Performance comparison of typical binary-integer encodings in an ising machine," IEEE Access 9, 81032–81039 (2021).
[33] Nicolas P. D. Sawaya, Tim Menke, Thi Ha Kyaw, Sonika Johri, Alán Aspuru-Guzik, and Gian Giacomo Guerreschi, "Resource-efficient digital quantum simulation of d-level systems for photonic, vibrational, and spin-s hamiltonians," Npj Quantum Inf. 6, 49 (2020).
[34] Alejandro Montanez-Barrera, Alberto Maldonado-Romo, Dennis Willsch, and Kristel Michielsen, "Unbalanced penalization: A new approach to encode inequality constraints of combinatorial problems for quantum optimization algorithms," arXiv:2211.13914 (2022).
[35] Jesse Berwald, Nicholas Chancellor, and Raouf Dridi, "Understanding domain-wall encoding theoretically and experimentally," arXiv:2108.12004 (2021).
[36] John Preskill, "Quantum computing in the nisq era and beyond," Quantum 2, 79 (2018).
[37] Fernando Pastawski and John Preskill, "Error correction for encoded quantum annealing," Phys. Rev. A 93, 052325 (2016).
[38] Kilian Ender, Anette Messinger, Michael Fellner, Clemens Dlaska, and Wolfgang Lechner, "Modular parity quantum approximate optimization," PRX Quantum 3, 030304 (2022).
[39] Michael Fellner, Kilian Ender, Roeland ter Hoeven, and Wolfgang Lechner, "Parity quantum optimization:
[43] … Mauerer, and Claudia Linnhoff-Popien, "A hybrid solution method for the capacitated vehicle routing problem using a quantum annealer," Front. ICT 6 (2019).
[44] Vicky Choi, "Adiabatic quantum algorithms for the np-complete maximum-weight independent set, exact cover and 3sat problems," arXiv:1004.2226 (2010).
[45] Krzysztof Kurowski, Jan Weglarz, Marek Subocz, Rafal Różycki, and Grzegorz Waligóra, "Hybrid quantum annealing heuristic method for solving job shop scheduling problem," in Computational Science – ICCS 2020, edited by Valeria V. Krzhizhanovskaya, Gábor Závodszky, Michael H. Lees, Jack J. Dongarra, Peter M. A. Sloot, Sérgio Brissos, and João Teixeira (Springer International Publishing, 2020) pp. 502–515.
[46] Davide Venturelli, Dominic J. J. Marchand, and Galo Rojo, "Quantum annealing implementation of job-shop scheduling," arXiv:1506.08479 (2015).
[47] David Amaro, Matthias Rosenkranz, Nathan Fitzpatrick, Koji Hirano, and Mattia Fiorentini, "A case study of variational quantum algorithms for a job shop scheduling problem," EPJ Quantum Technol. 9, 5 (2022).
[48] Costantino Carugno, Maurizio Ferrari Dacrema, and Paolo Cremonesi, "Evaluating the job shop scheduling problem on a d-wave quantum annealer," Sci. Rep. 12, 6539 (2022).
[49] Kazuki Ikeda, Yuma Nakamura, and Travis S. Humble, "Application of quantum annealing to nurse scheduling problem," Sci. Rep. 9, 12837 (2019).
[50] D. A. Rachkovskii, S. V. Slipchenko, E. M. Kussul, and T. N. Baidyk, "Properties of numeric codes for the scheme of random subspaces rsc," Cybern. Syst. Anal. 41, 509–520 (2005).
[51] Luc Devroye, Peter Epstein, and Jörg-Rüdiger Sack, "On generating random intervals and hyperrectangles," J. Comput. Graph. Stat. 2, 291–307 (1993).
[52] Matthew Elliott, Benjamin Golub, and Matthew O. Jackson, "Financial networks and contagion," Am. Econ. Rev. 104, 3115–53 (2014).
[53] Román Orús, Samuel Mugel, and Enrique Lizaso, "Forecasting financial crashes with quantum computing," Phys. Rev. A 99, 060301 (2019).
[54] Syun Izawa, Koki Kitai, Shu Tanaka, Ryo Tamura, and Koji Tsuda, "Continuous black-box optimization with quantum annealing and random subspace coding," arXiv:2104.14778 (2021).
Benchmarks,” arXiv:2105.06240 (2021). [55] Ching-Yi Lai, Kao-Yueh Kuo, and Bo-Jyun Liao, “Syn-
[40] Josua Unger, Anette Messinger, Benjamin E. Niehoff, drome decoding by quantum approximate optimization,”
Michael Fellner, and Wolfgang Lechner, “Low- arXiv:2207.05942 (2022), 10.48550/ARXIV.2207.05942.
depth circuit implementation of parity constraints [56] Volker Strassen, “Gaussian elimination is not optimal,”
for quantum optimization,” arXiv:2211.11287 (2022), Numer. Math. 13, 354–356 (1969).
10.48550/ARXIV.2211.11287. [57] Alhussein Fawzi, Matej Balog, Aja Huang, Thomas
[41] James King, Sheir Yarkoni, Jack Raymond, Isil Ozfidan, Hubert, Bernardino Romera-Paredes, Mohammadamin
Andrew D King, Mayssam Mohammadi Nevisi, Jeremy P Barekatain, Alexander Novikov, Francisco J R Ruiz, Ju-
Hilton, and Catherine C McGeoch, “Quantum annealing lian Schrittwieser, Grzegorz Swirszcz, et al., “Discovering
amid local ruggedness and global frustration,” J. Phys. faster matrix multiplication algorithms with reinforce-
Soc. Jpn. 88, 061007 (2019). ment learning,” Nature 610, 47–53 (2022).
[42] Martin Lanthaler and Wolfgang Lechner, “Minimal con- [58] Stuart Hadfield, “On the representation of boolean and
straints in the parity formulation of optimization prob- real functions as hamiltonians for quantum computing,”
lems,” New J. Phys. 23, 083039 (2021). ACM Trans. Quant. Comput. 2, 1–21 (2021).
[43] Sebastian Feld, Christoph Roch, Thomas Gabor, Chris- [59] Grigori S Tseitin, “On the complexity of derivation
tian Seidel, Florian Neukart, Isabella Galter, Wolfgang in propositional calculus,” in Automation of reasoning
(Springer, 1983) pp. 466–483.