
SIAM J. SCI. COMPUT. © 2023 Society for Industrial and Applied Mathematics
Vol. 45, No. 3, pp. A1214--A1238
A GLOBAL OPTIMIZATION APPROACH FOR MULTIMARGINAL OPTIMAL TRANSPORT PROBLEMS WITH COULOMB COST*
YUKUAN HU†, HUAJIE CHEN‡, AND XIN LIU†§
Abstract. In this work, we construct a novel numerical method for solving the multimarginal
optimal transport problems with Coulomb cost. This type of optimal transport problem arises in
quantum physics and plays an important role in understanding the strongly correlated quantum sys-
tems. With a Monge-like ansatz, we transfer the original high-dimensional problems into mathemati-
cal programmings with generalized complementarity constraints, and thus the curse of dimensionality
is surmounted. However, the latter ones are themselves hard to deal with from both theoretical and
practical perspectives. Moreover, in the presence of nonconvexity, brute-force searching for global
solutions becomes prohibitive as the problem size grows large. To this end, we propose a global
optimization approach for solving the nonconvex optimization problems, by exploiting an efficient
proximal block coordinate descent local solver and an initialization subroutine based on hierarchical
grid refinements. We conduct numerical simulations on some typical physical systems to show the
efficiency of our approach. The results match well with both theoretical predictions and physical
intuitions and provide indications for Monge solutions in two-dimensional contexts. In addition, we
give the first visualization of approximate optimal transport maps for some two-dimensional systems.
Key words. multimarginal optimal transport, Coulomb cost, Monge-like ansatz, mathematical
programming with generalized complementarity constraints, global optimization, grid refinement,
optimal transport maps
MSC codes. 49M37, 65K05, 81V05, 90C26, 90C30
DOI. 10.1137/21M1455164
1. Introduction. The aim of this paper is to provide an optimization method
for the multimarginal optimal transport (MMOT) problems [40, 48] arising in many-electron
physics [11, 13, 47]. Let $d \in \{1, 2, 3\}$ be the dimension of a system, $\Omega \subseteq \mathbb{R}^d$ be
a bounded domain where the electrons are located, $N \in \mathbb{N}$ with $N \geq 2$ be the number
of electrons, and $r_i \in \Omega$ ($i \in \{1, \dots, N\}$) be the position of the $i$th electron. For the
many-electron system, the MMOT problem with Coulomb cost reads
(1.1) $\min_{\Gamma \in \mathscr{P}(\Omega^N)} \int_{\Omega^N} c(r_1, \dots, r_N) \, d\Gamma(r_1, \dots, r_N)$ subject to (s.t.) $\Gamma \mapsto \rho$,
where $\mathscr{P}(\Omega^N)$ denotes the space of all $N$-point probability measures over $\Omega^N$, the cost
function $c(r_1, \dots, r_N)$ is determined by the electron-electron Coulomb interaction
* Submitted to the journal's Methods and Algorithms for Scientific Computing section October
25, 2021; accepted for publication (in revised form) December 20, 2022; published electronically June
12, 2023.
https://doi.org/10.1137/21M1455164
Funding: The work of the second author was supported by the National Natural Science Foundation
of China (11971066). The work of the third author was supported in part by the National
Natural Science Foundation of China (1212500491, 11971466, 11991021) and the Key Research Program
of Frontier Sciences, Chinese Academy of Sciences (ZDBS-LY-7022).
† State Key Laboratory of Scientific and Engineering Computing, Academy of Mathematics and
Systems Science, Chinese Academy of Sciences, and University of Chinese Academy of Sciences,
Beijing, China (ykhu@lsec.cc.ac.cn).
‡ School of Mathematical Sciences, Beijing Normal University, Beijing, China (chen.huajie@bnu.edu.cn).
§ Corresponding author. State Key Laboratory of Scientific and Engineering Computing, Academy
of Mathematics and Systems Science, Chinese Academy of Sciences, and University of Chinese
Academy of Sciences, Beijing, China (liuxin@lsec.cc.ac.cn).
Copyright © by SIAM. Unauthorized reproduction of this article is prohibited.

(1.2) $c(r_1, \dots, r_N) := \sum_{i<j} \frac{1}{|r_i - r_j|}$,
$\rho \in L^1(\Omega)$ refers to the single-electron density satisfying $\int_\Omega \rho = N$, and $\Gamma \mapsto \rho$ represents
the marginal constraints: for $i = 1, \dots, N$ and any open set $\mathscr{A}_i \subseteq \Omega$,
(1.3) $\int_{\Omega^{i-1} \times \mathscr{A}_i \times \Omega^{N-i}} d\Gamma(r_1, \dots, r_N) = \frac{1}{N} \int_{\mathscr{A}_i} \rho(r) \, dr.$
Note that the Coulomb interaction $1/|r_i - r_j|$ between the electrons in (1.2) can be
approximated or regularized, especially in the simulations of systems with $d < 3$ [4,
20]. Nevertheless, the approach constructed in this paper applies without modification
as long as the interaction between the electrons is repulsive (i.e., the cost decreases
with respect to $|r_i - r_j|$). Without loss of generality, we will focus on the Coulomb
interaction of the form (1.2).
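As a minimal illustration of the cost (1.2), the following sketch evaluates the pairwise repulsion for a given configuration of electrons. The function name `coulomb_cost` and the softening parameter `eps` are assumptions made for this example only; `eps > 0` stands in for one possible regularized repulsive interaction and is not taken from the paper.

```python
import itertools
import numpy as np

def coulomb_cost(positions, eps=0.0):
    """Evaluate sum_{i<j} 1/|r_i - r_j| as in (1.2).

    positions: array-like of shape (N, d), one row per electron.
    eps: hypothetical softening parameter; eps = 0 gives the plain
         Coulomb cost, eps > 0 a softened repulsive interaction.
    """
    r = np.asarray(positions, dtype=float)
    total = 0.0
    for i, j in itertools.combinations(range(len(r)), 2):
        dist = np.linalg.norm(r[i] - r[j])
        total += 1.0 / np.sqrt(dist**2 + eps**2)
    return total

# Three electrons on a line at 0, 1, and 3: cost = 1/1 + 1/3 + 1/2.
cost_three = coulomb_cost([[0.0], [1.0], [3.0]])
```

Both variants decrease as any pairwise distance grows, which is the only property of the interaction that the discussion above relies on.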
The MMOT problem (1.1) arises as the strictly correlated electrons (SCE) limit
in the density functional theory (DFT). The DFT has been most widely used for
electronic structure calculations in physics, chemistry, and materials science (see [3]
for a review). It depends on choosing an ansatz for an exact yet unknown density
functional. The SCE limit was first introduced in [45]. Later in [7, 12], it was
recognized that the limit is an MMOT problem. The SCE limit provides an alternative
route to derive the DFT energy functionals and has been exploited to extend the
capability of the DFT to treat strongly correlated quantum systems [9, 10, 23, 35, 37].
Direct discretization of the MMOT problem (1.1) leads to a linear program
whose size increases exponentially with respect to $N$ (the number of
electrons/marginals). There are several works devoted to reformulations of and
numerical methods for the MMOT problem (1.1). In [5], the Sinkhorn scaling algorithm
based on iterative Bregman projections was applied to an entropy-regularized
discretized MMOT problem of one-dimensional (1D) systems. In [36], the authors
proposed numerical methods based on the Kantorovich dual of the MMOT problem,
penalizing the nonsmooth reformulation of the original inequality constraints and
utilizing derivative-free methods. In [31, 32], a convex relaxation for the so-called N -
representability formulation was proposed by imposing certain necessary constraints
satisfied by the two-marginal, and the relaxed problem was then solved as a semi-
definite programming to obtain tight lower bounds for the optimal cost. In [1, 2],
the existence of sparse global solutions to semidiscrete formulation was established
and a constrained overdamped Langevin process was proposed to solve the moment
constrained relaxations. In [20, 21], the sparse extremal representations for global
solutions were rigorously justified, and an efficient numerical method was proposed
based on column generation and machine learning.
The starting point of this work is to approximate the $N$-point measure $\Gamma$ by the
ansatz
(1.4) $d\Gamma(r_1, \dots, r_N) = \frac{\rho(r_1)}{N} \gamma_2(r_1, r_2) \cdots \gamma_N(r_1, r_N) \, dr_1 \cdots dr_N$,
where, for any $n \in \{2, \dots, N\}$, $\gamma_n \in L^1(\Omega^2)$ is a transport plan fulfilling
(1.5) $\gamma_n(r, r') \geq 0$, $\int_\Omega \gamma_n(r, r') \, dr' = 1$, and $\int_\Omega \rho(r) \gamma_n(r, r') \, dr = \rho(r')$ $\forall r, r' \in \Omega$.
The condition (1.5) is derived from the marginal constraints (1.3). We sometimes call
the last equality in (1.5) the mass-preserving constraint. From a physical point of

view, $\gamma_n(r, r')$ represents the correlation between the first and the $n$th electron, which
gives the probability density of finding the $n$th electron at $r'$ while the first electron
is located at $r$. Under the ansatz (1.4), the MMOT problem (1.1) (with $N > 2$) can
be rewritten as
(1.6) $\min_{\gamma_2, \dots, \gamma_N} \Bigl\{ \sum_{2 \leq m < n \leq N} \int_\Omega \int_\Omega \int_\Omega \frac{\rho(r) \gamma_m(r, r') \gamma_n(r, r'')}{|r' - r''|} \, dr \, dr' \, dr'' + \sum_{2 \leq n \leq N} \int_\Omega \int_\Omega \frac{\rho(r) \gamma_n(r, r')}{|r - r'|} \, dr \, dr' : \gamma_2, \dots, \gamma_N \text{ satisfy (1.5)} \Bigr\}.$
We mention that in the case of $N = 2$, the first term in the objective of (1.6) vanishes;
(1.6) then reduces to a linear program and can be solved by standard algorithms
[10]. In this work, we focus our attention on the $N \geq 3$ settings. The formulation
(1.6) amounts to a spectacular dimension reduction, in that the unknowns are $N - 1$
transport plans on $\Omega^2$ instead of the $N$-point measure $\Gamma$ on $\Omega^N$. Therefore, the
degrees of freedom now scale linearly with respect to $N$ rather than exponentially
fast. Moreover, the ansatz (1.4) is related to the Monge state [38, 47] by taking
$\gamma_n(r, r') = \delta(r' - T_n(r))$ with $\delta$ being the Dirac measure and $T_n$ ($n \in \{2, \dots, N\}$)
being the so-called optimal transport map. The Monge formulation gives significant
information on the MMOT problem and enjoys physical interpretations; see more
discussions in subsection 1.3.
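To make the dimension reduction concrete, the sketch below compares unknown counts under the assumption that every marginal is discretized on the same $K$ cells, so that a direct discretization of $\Gamma$ carries $K^N$ unknowns while the ansatz (1.4) carries $N - 1$ plans with $K^2$ entries each. The function names are hypothetical.

```python
def dof_direct(K, N):
    """Unknowns of a directly discretized N-point measure on K cells per marginal."""
    return K ** N

def dof_ansatz(K, N):
    """Unknowns of N - 1 pairwise transport plans, each a K x K array."""
    return (N - 1) * K ** 2

# Illustrative sizes only (not taken from the paper's experiments).
for N in (3, 5, 7):
    print(N, dof_direct(100, N), dof_ansatz(100, N))
```

For a fixed mesh the second count grows linearly in $N$, whereas the first is multiplied by $K$ with every additional electron.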
In practical calculations, we need to discretize (1.6) into some finite-dimensional
problem. The discretization consists of three steps. First, we employ a finite-elements-like
mesh $\mathscr{T} = \{e_k\}_{k=1}^K$ to partition the domain $\Omega$ into $K$ nonoverlapping elements,
i.e., $\cup_{k=1}^K e_k = \Omega$ and $e_k \cap e_{k'} = \emptyset$ when $k \neq k'$. Let $e := [|e_1|, \dots, |e_K|]^\top \in \mathbb{R}^K_+$
denote the volumes of elements. Second, we approximate the marginal $\rho$ by a vector
$\boldsymbol{\varrho} := [\varrho_1, \dots, \varrho_K]^\top \in \mathbb{R}^K_+$, where the $k$th entry $\varrho_k := \frac{1}{|e_k|} \int_{e_k} \rho(r) \, dr$ gives the
marginal/electron mass on the $k$th element $e_k$. Finally, the Coulomb interactions and
transport plans $\gamma_n$ ($n = 2, \dots, N$) can be approximated by the effective interactions
and transports between elements, i.e., for any $i, j \in \{1, \dots, K\}$,
(1.7) $c_{ij} := \frac{1}{|e_i| \cdot |e_j|} \int_{e_j} \int_{e_i} \frac{1}{|r - r'|} \, dr \, dr'$ and $x_{n,ij} := \frac{1}{|e_i| \cdot |e_j|} \int_{e_j} \int_{e_i} \gamma_n(r, r') \, dr \, dr'$,
respectively, leading to $K \times K$ matrices $C := ((1 - a_{ij}) c_{ij})_{ij}$ and $X_n := (x_{n,ij})_{ij}$
($n = 2, \dots, N$). Here, $a_{ij}$ equals 1 if $i = j$ and 0 otherwise. With a slight abuse of
terminology, we also call $X_n$ ($n = 2, \dots, N$) transport plans in what follows. After
discretization, we can approximate (1.6) using the following optimization problem
with unknowns $\{X_n\}_{n=2}^N$:
(1.8) $\min_{X_2, \dots, X_N} f(X_2, \dots, X_N) := \sum_{2 \leq n \leq N} \langle X_n, \Lambda E C E \rangle + \sum_{2 \leq m < n \leq N} \langle X_n, E \Lambda X_m E C E \rangle$
s.t. $X_n e = \mathbf{1}$, $X_n^\top E \boldsymbol{\varrho} = \boldsymbol{\varrho}$, $\mathrm{Tr}(X_n) = 0$, $X_n \geq 0$, $n = 2, \dots, N$,
$\langle X_m, X_n \rangle = 0$, $\forall m \neq n$,
where $\mathbf{1}$ is the all-ones vector in $\mathbb{R}^K$, $\Lambda := \mathrm{Diag}(\boldsymbol{\varrho})$ and $E := \mathrm{Diag}(e)$ are $K \times K$
diagonal matrices formed by the entries in $\boldsymbol{\varrho}$ and $e$, respectively. More detailed
derivation of (1.8) is given in Appendix A. Note that the diagonal elements in matrix
$C$ are removed due to the integral divergence in (1.7). The extra constraints
$\mathrm{Tr}(X_n) = 0$, $n = 2, \dots, N$, and $\langle X_m, X_n \rangle = 0$ $\forall m \neq n$,
are hence accordingly added. From a physical point of view, these constraints keep
the electrons spatially away from each other in the case of Coulomb repulsion so that
unfavorable particle clustering can be avoided.
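The objective and constraints of (1.8) can be checked numerically on a toy instance. The sketch below is illustrative only: the helper names are hypothetical, and the matrix $C$ uses the point values $1/|i - j|$ as a stand-in for the element-averaged integrals in (1.7).

```python
import numpy as np

def repulsive_energy(X, C, e, rho):
    """f(X_2, ..., X_N) from (1.8); X is a list of K x K plans."""
    E, Lam = np.diag(e), np.diag(rho)
    ECE = E @ C @ E
    # <A, B> is the Frobenius inner product sum_ij a_ij * b_ij.
    val = sum(np.sum(Xn * (Lam @ ECE)) for Xn in X)
    for n in range(len(X)):
        for m in range(n):
            val += np.sum(X[n] * (E @ Lam @ X[m] @ ECE))
    return val

def constraint_violation(X, e, rho):
    """Largest violation of the constraints listed in (1.8)."""
    E = np.diag(e)
    worst = 0.0
    for Xn in X:
        worst = max(worst,
                    np.abs(Xn @ e - 1.0).max(),            # X_n e = 1
                    np.abs(Xn.T @ E @ rho - rho).max(),    # mass preservation
                    abs(np.trace(Xn)),                     # Tr(X_n) = 0
                    max(0.0, -Xn.min()))                   # X_n >= 0
    for n in range(len(X)):
        for m in range(n):
            worst = max(worst, abs(np.sum(X[m] * X[n])))   # <X_m, X_n> = 0
    return worst

# Toy setting: K = 3 unit cells, uniform density, N = 3 electrons.
K = 3
e, rho = np.ones(K), np.ones(K)
idx = np.arange(K)
D = np.abs(idx[:, None] - idx[None, :]).astype(float)
C = np.zeros((K, K))
C[D > 0] = 1.0 / D[D > 0]             # diagonal stays zero, as in (1.8)
X2 = np.roll(np.eye(K), 1, axis=1)    # cell i is sent to cell i + 1 (cyclically)
X3 = X2 @ X2                          # cell i is sent to cell i + 2 (cyclically)
energy = repulsive_energy([X2, X3], C, e, rho)
violation = constraint_violation([X2, X3], e, rho)
```

In this toy instance each plan is a permutation matrix, which is the discrete counterpart of a Monge-type map: the position of the first electron fixes the cells of the other two.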
In the case of $N = 3$, (1.8) is a mathematical programming with complementarity
constraints (MPCC) in view of the nonnegative constraints and $\langle X_2, X_3 \rangle = 0$.
Due to the disjunctive nature of the feasible set, a general MPCC violates commonly
used constraint qualifications at any feasible point [15]. The well-known Karush--Kuhn--Tucker
(KKT) conditions may even fail to be necessary for local minimizers. When
$N > 3$, the formulation of the constraints in (1.8) is more complicated than that of the
complementarity constraints. Since $\langle X_m, X_n \rangle = 0$ for any $m \neq n$ imposes the requirement
that, for each $n \in \{2, \dots, N\}$, the block variable $X_n$ complements all the other
blocks, we call (1.8) a mathematical programming with generalized complementarity
constraints (MPGCC).
In addition to its intrinsic difficulty, we are in search of the global solutions of
(1.8). This is a hard matter because both the repulsive energy $f$ and the feasible set
are nonconvex in variables $(X_n)_{n=2}^N$. Since the degrees of freedom $(N - 1)K^2$ grow
quickly as the meshes become finer, we cannot simply fall back on state-of-the-art
global optimization solvers.
1.1. Optimization background. Although little is known about MPGCC,
there exists a rich literature on MPCC. To overcome the intrinsic difficulties men-
tioned above, several tailored constraint qualifications have been provided for MPCC.
Under these constraint qualifications, points satisfying certain stationary systems are
shown to be the proper candidates of local minimizers. The related notions and
theoretical results are gathered in [41, 51] and the references within.
With these in place, researchers have proposed various numerical approaches,
among which those based on the original MPCC formulation are the top choices; they
employ modified nonlinear programming solvers. For example, the authors in [17] solved
MPCCs using sequential quadratic programming algorithms with filter techniques
[16]. The software introduced in [8, 49] incorporates a suite of nonlinear program-
ming algorithms to tackle MPCCs, including interior-point methods and sequential
quadratic programming algorithms, together with globalization techniques such as
line search and trust region.
Owing to the difficulty of handling complementarity constraints, methods
based on penalty functions have also gained popularity. Among others, we confine our
attention to the $\ell_1$ (complementarity) penalty function, which favors direct extension
to MPGCC (1.8) as
(1.9) $f(X_2, \dots, X_N) + \beta \sum_{m<n} \langle X_m, X_n \rangle$,
namely, penalizing merely the complementarity violation in $\ell_1$ form. Here, $f$ is the
repulsive energy defined in (1.8), and $\beta > 0$ is the penalty parameter. It can be verified
under certain conditions that the global solutions of (1.8) coincide with those globally
minimizing (1.9) over $\mathscr{S}^{N-1}$, where
(1.10) $\mathscr{S} := \{W \in \mathbb{R}^{K \times K} : W e = \mathbf{1}, \; W^\top E \boldsymbol{\varrho} = \boldsymbol{\varrho}, \; \mathrm{Tr}(W) = 0, \; W \geq 0\}$.
A direct consequence is that, if the global solutions of (1.8) are required, one can in
turn minimize (1.9) over $\mathscr{S}^{N-1}$ starting with proper initializations. However, we are
not aware of any existing method that fully exploits the special structure of (1.9). A
customized algorithm is thus needed, particularly in the large-scale contexts.
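One structural feature of (1.9) that can be read off from its definition is that every term is linear in each single block $X_n$ when the other blocks are fixed. The following sketch evaluates the penalized objective and its gradient with respect to one block; the function names are hypothetical, list indices are 0-based, and the gradient formula assumes that $C$ is symmetric, as it is when built from (1.7).

```python
import numpy as np

def penalized_objective(X, C, e, rho, beta):
    """Value of (1.9) for a list X of K x K blocks."""
    E, Lam = np.diag(e), np.diag(rho)
    ECE = E @ C @ E
    val = 0.0
    for n, Xn in enumerate(X):
        val += np.sum(Xn * (Lam @ ECE))
        for m in range(n):
            val += np.sum(Xn * (E @ Lam @ X[m] @ ECE))
            val += beta * np.sum(X[m] * Xn)
    return val

def block_gradient(X, n, C, e, rho, beta):
    """Gradient of (1.9) with respect to block n (0-based), C symmetric."""
    E, Lam = np.diag(e), np.diag(rho)
    ECE = E @ C @ E
    others = np.zeros_like(X[n])
    for m in range(len(X)):
        if m != n:
            others = others + X[m]
    # The gradient does not depend on X[n] itself: the objective is
    # affine in each block once the remaining blocks are frozen.
    return Lam @ ECE + E @ Lam @ others @ ECE + beta * others
```

Because of this blockwise affine structure, a finite increment in one block changes the objective by exactly the inner product of the increment with the block gradient, which makes block-coordinate schemes a natural fit.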

In addition, methods based on approximation (smoothing or regularization),
augmented Lagrangian functions, and full penalization are available as well. We refer
interested readers to [14, 24, 27, 28, 29, 34, 43, 44] and the references therein.
Compared with methods using modified nonlinear programming solvers or penalty
functions, these approaches require solving a sequence of subproblems of the same size to
stationarity or even optimality [30]. This weakness excludes them from our choices,
particularly when the number of grid points $K$ is tremendously large.
1.2. Contributions. Our contributions are threefold:

(1) A global optimization approach, equipped with a local solver and a hierarchical
initialization subroutine, is constructed for solving (1.8).
The initialization subroutine (Algorithm 2.2), derived from hierarchical grid
refinements, leads to a great chance for the local solver to start from the
attractive basins of global solutions, and hence serves as the core of the pro-
posed global optimization approach (Framework 2.1). The proposed approach
saves one from brute-force solving large-scale (1.8) via plain global optimiza-
tion methods. Remarkably in Framework 2.1, the optimal transport maps
can be directly approximated by the solutions, which is usually difficult in
the context of Coulomb cost.
(2) An inexact proximal block coordinate descent (PBCD) algorithm is proposed for
locally minimizing (1.9) over $\mathscr{S}^{N-1}$.
PBCD (Algorithm 2.3) acts as the local solver in Framework 2.1 and enjoys
global convergence guarantee in the presence of iterate infeasibility (Theorem
3.2), which is not covered by existing works.
(3) Simulations for some typical 1D and 2D systems.
We consider systems with up to 7 electrons and discretizations
with up to $6.1 \times 10^4$ grid points. The results are in line
with both theoretical predictions and physical intuitions (section 4), providing
indications for Monge solutions and the first visualization of approximate
optimal transport maps for some 2D systems.
1.3. Further remarks.

Monge formulation. The Monge formulation makes the ansatz
(1.11) $d\Gamma(r_1, \dots, r_N) = \frac{\rho(r_1)}{N} \delta(r_2 - T_2(r_1)) \cdots \delta(r_N - T_N(r_1)) \, dr_1 \cdots dr_N$,
where the transport map $T_n : \Omega \to \Omega$ ($n \in \{2, \dots, N\}$) (we can prescribe $T_1(r) = r$
for the completeness of notation) preserves the single-electron density $\rho$. The Monge
solution has a simple physical interpretation: the many-electron repulsive energy is
minimized at a state such that one electron at position $r$ can determine the positions
of all other $N - 1$ electrons via $\{T_n\}_{n=2}^N$. It is known that for 1D systems or systems
with $N = 2$ electrons, the Monge ansatz (1.11) accommodates the global solutions
of the MMOT problems [11, 12]. But in the general $d > 1$ and $N > 2$ cases, it
is unknown whether there exists a minimizer of (1.1) in the form (1.11). Thus far,
a counterexample in real physical scenarios has not been put forward; see [19] and
the references within. It is therefore of scientific interest to search for the Monge
solutions. The ansatz (1.4) is more flexible than (1.11), in that the transport maps
are replaced with transport plans. Unlike (1.11), (1.4) guides one to the problem (1.6)
that can be easily discretized and always admits a minimizer, with which we are able
to approximate the Monge solutions. More precisely, let $a_j \in \Omega$ be the barycenter
of element $e_j$; then $T_n(a_j)$ ($n = 2, \dots, N$) can be approximated by a given solution
$(X_n)_{n=2}^N$ through barycentric average
(1.12) $T_n^K(a_j) := \sum_{1 \leq k \leq K} a_k \, x_{n,jk} \, |e_k|$, $j = 1, \dots, K$, $n = 2, \dots, N$.
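The barycentric average (1.12) amounts to one weighted matrix product. A minimal sketch, assuming the plan, the barycenters, and the element volumes are stored as NumPy arrays (the function name is hypothetical):

```python
import numpy as np

def barycentric_map(Xn, barycenters, volumes):
    """Approximate transport map T_n^K(a_j) = sum_k a_k x_{n,jk} |e_k|.

    Xn:          (K, K) discrete transport plan
    barycenters: (K,) or (K, d) array whose kth row is a_k
    volumes:     (K,) array of element volumes |e_k|
    Returns an array whose jth row approximates T_n(a_j).
    """
    a = np.asarray(barycenters, dtype=float)
    w = np.asarray(volumes, dtype=float)
    return (Xn * w[None, :]) @ a

# Toy 1D example: three unit cells and a plan sending cell i to cell i + 1 (cyclically).
a = np.array([0.5, 1.5, 2.5])
Xn = np.roll(np.eye(3), 1, axis=1)
T = barycentric_map(Xn, a, np.ones(3))
```

When a row of the plan is concentrated on a single element, the average returns that element's barycenter; a spread-out row yields a point in between, which is one way to see how far a computed plan is from a map.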
Symmetric constraints. In physics, one is only interested in the measures that
are symmetric with respect to $\{r_i\}_{i=1}^N$ (as $\Gamma$ represents an $N$-point probability measure
of electrons, which is symmetric by the laws of quantum theory). More precisely, one
requires that for any permutation $P$ on $\{1, \dots, N\}$,
$\int_{\mathscr{A}_1 \times \cdots \times \mathscr{A}_N} d\Gamma = \int_{\mathscr{A}_{P(1)} \times \cdots \times \mathscr{A}_{P(N)}} d\Gamma$ $\forall$ open sets $\mathscr{A}_1, \dots, \mathscr{A}_N \subseteq \Omega$.
Although we do not have this symmetric restriction in the MMOT problem (1.1) and
the ansatz (1.11) is in general not symmetric, dropping the restriction does not alter
the minimum value. This is because we have a symmetric cost function $c$ in (1.2) and
equal marginal for any $i \in \{1, \dots, N\}$ in (1.3). Hence each nonsymmetric $\Gamma$ can give a
symmetric one with the same energy value by symmetrization. Consequently, we do
not have to impose the symmetric constraints in the optimization formulation (1.8).
Discretization. Most of the existing works discretize the MMOT problems
with real space methods [5, 10]. Particularly, this paper discretizes (1.6) into (1.8)
by representing the marginal $\rho$ with piecewise finite elements and using effective cost
coefficients obtained by integrating the continuous cost functions with respect to these
elements. To further reduce the computational cost (i.e., use fewer grid points where
the marginal is small), we choose the elements adaptively such that each element
carries approximately the same marginal mass.
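One simple way to obtain elements of approximately equal marginal mass in 1D is to invert the cumulative mass function at equally spaced levels. The sketch below is an assumption-laden illustration (trapezoidal quadrature on a fine auxiliary grid, hypothetical function name) and not necessarily the construction used in the paper.

```python
import numpy as np

def equal_mass_breakpoints(rho, a, b, K, n_fine=10001):
    """Breakpoints of K intervals on [a, b] carrying roughly equal mass of rho.

    rho: vectorized callable returning the nonnegative density.
    n_fine: size of the auxiliary grid used for quadrature (an assumption).
    """
    x = np.linspace(a, b, n_fine)
    w = rho(x)
    # Cumulative mass by the trapezoidal rule on the fine grid.
    cdf = np.concatenate([[0.0], np.cumsum(0.5 * (w[1:] + w[:-1]) * np.diff(x))])
    targets = np.linspace(0.0, cdf[-1], K + 1)
    # Invert the cumulative mass by linear interpolation.
    return np.interp(targets, cdf, x)

# Density 2x on [0, 1]: cumulative mass x^2, so breakpoints are sqrt(k/K).
breaks = equal_mass_breakpoints(lambda x: 2.0 * x, 0.0, 1.0, 4)
```

The resulting intervals are wide where the density is small and narrow where it is large, which is the effect described above.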
1.4. Outline. The rest of this paper is organized as follows. We introduce the
global optimization approach in section 2, where the initialization subroutine (subsec-
tion 2.1) and local solver (subsection 2.2) are detailed in order. Section 3 is dedicated
to the rough statements of the convergence properties of PBCD. We corroborate the
proposed approach with numerical simulations on several typical systems in section 4.
Finally, conclusions and discussions appear in section 5.
1.5. Notation. The adjoint and image of a linear operator $A$ are denoted by
$A^\ast$ and $\mathrm{Im}(A)$, respectively. The notation $\|X\|_p$ gives the $p$-norm of matrix $X$, while
$\|X\|_F$ yields its Frobenius norm. The components of matrices or vectors are indicated
by subscripts, e.g., $x_{ij}$. The inequality $X \geq 0$ means $x_{ij} \geq 0$ for any $i, j$.
For the multiblock objective functions in this work (such as (1.8)), we occasionally
adopt abbreviations in parentheses. For example, $f(X_{<n}, X_n, X_{>n})$ means
$f(X_2, \dots, X_{n-1}, X_n, X_{n+1}, \dots, X_N)$;
abbreviations like $X_{<n}$, $X_{(m,n)}$, and $X_{>n}$ represent the aggregation of blocks with
certain subscripts (clearly, $X_{<0}$, $X_{(n,n)}$, and $X_{>n}$ are null variable blocks, which may
be used for notational ease). For any $n$, we use $\nabla_n f$ to refer to the gradient of $f$ with
respect to the $n$th block variable.
Regarding the algorithm, we use double superscripts within brackets for iterates
in the inner loop; for instance, $X_n^{(\ell,k)}$ is the iterate in the $k$th inner iteration of the
$\ell$th outer iteration.

Framework 2.1 The GGR approach.

Require: Oracle returning $C$, $e$, and $\boldsymbol{\varrho}$ in proper dimensions; global solver; local
solver; GR subroutine; initial mesh with $K^{(0)}$ elements $\{e_k^{(0)}\}_{k=1}^{K^{(0)}}$.
1: Set $\ell := 0$.
2: GGR Init: use the global solver for (2.1) with size $K^{(0)}$ and get $(X_n^{(0,\star)})_{n=2}^N$.
3: while certain stopping criteria are not satisfied do
4:   Refine the last mesh $\{e_k^{(\ell)}\}_{k=1}^{K^{(\ell)}}$ to $\{e_k^{(\ell+1)}\}_{k=1}^{K^{(\ell+1)}}$ with $K^{(\ell+1)}$ elements.
5:   Modify $(X_n^{(\ell,\star)})_{n=2}^N$ using the GR subroutine to obtain $(X_n^{(\ell+1,0)})_{n=2}^N$.
6:   GGR LS($\ell + 1$): start the local solver from $(X_n^{(\ell+1,0)})_{n=2}^N$ for (2.1) with size
     $K^{(\ell+1)}$ and get $(X_n^{(\ell+1,\star)})_{n=2}^N$.
7:   Set $\ell := \ell + 1$.
8: end while
9: return $(X_n^{(\ell,\star)})_{n=2}^N \in (\mathbb{R}^{K^{(\ell)} \times K^{(\ell)}})^{N-1}$.
2. A global optimization approach for solving (1.8). In light of the ansatz
(1.4), the original MMOT problem with Coulomb cost (1.1) is approximated by
MPGCC (1.8). Violating commonly used constraint qualifications, MPGCC (1.8)
itself is a hard nut to crack in both algorithmic design and theoretical analyses.
Instead, we concentrate on the $\ell_1$ penalized MPGCC (1.8), i.e.,
(2.1) $\min_{X_2, \dots, X_N} f_\beta(X_2, \dots, X_N) := f(X_2, \dots, X_N) + \beta \sum_{m<n} \langle X_m, X_n \rangle$
s.t. $X_n \in \mathscr{S}$, $n = 2, \dots, N$,
where $f$ is the repulsive energy defined in (1.8) and $\mathscr{S}$ is defined in (1.10). The
problem (2.1) is a nonconvex quadratic programming problem, still NP-hard [39]. In
what follows, when we talk about (2.1) and its solution in space $(\mathbb{R}^{K \times K})^{N-1}$, we
simply say (2.1) and its solution with size $K$.
For practical purposes, a global solution of (2.1) is always demanded. Meanwhile,
we notice that the degrees of freedom in (2.1), $(N - 1)K^2$, grow fast with respect to $K$.
This prevents us from brute-force solving (2.1) by state-of-the-art global optimization
methods (e.g., branch-and-bound and cutting plane algorithms) due to exponentially
increasing running time.
Motivated by [5], we propose a global optimization approach, GGR; see Framework
2.1. Here, "G" and "GR" stand for global optimization and initialization based
on hierarchical grid refinements, respectively. GGR Init refers to the initial step
invoking a global solver and GGR LS to the subsequent step invoking a local solver.
Framework 2.1 progresses step by step along with the process of mesh refinements.
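The control flow of Framework 2.1 can be summarized in a few lines. In the sketch below every argument is a placeholder callable supplied by the user (all names are hypothetical), and a fixed number of refinement levels stands in for the unspecified stopping criteria.

```python
def ggr(global_solve, local_solve, refine_mesh, gr_init, mesh0, num_levels):
    """Skeleton of the GGR loop with user-supplied components.

    global_solve(mesh)          -> plans on the initial (coarse) mesh
    refine_mesh(mesh)           -> refined mesh
    gr_init(X, coarse, fine)    -> initial plans on the refined mesh
    local_solve(mesh, X0)       -> locally optimized plans started from X0
    """
    mesh = mesh0
    X = global_solve(mesh)               # GGR Init on the coarsest mesh
    for _ in range(num_levels):          # stand-in for the stopping criteria
        fine = refine_mesh(mesh)         # mesh refinement
        X0 = gr_init(X, mesh, fine)      # GR subroutine: build the initial point
        X = local_solve(fine, X0)        # GGR LS on the refined mesh
        mesh = fine
    return X, mesh
```

The global solver is called exactly once, on the smallest problem; all later levels rely on the local solver started from the refined initial point.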
Let us first justify the usage of a global solver in the initial step (line 2 in Framework
2.1). From the viewpoint of applicability, given an initial size $K^{(0)}$ of moderate
magnitude, globally solving (2.1) is amenable to state-of-the-art global optimization
methods. As for necessity, the quality of the constructed initial points largely
depends on the solutions in the previous step. Hence it is a natural choice for us to
invoke a global solver in the initial step. For our choices in implementation, please
refer to subsection 4.1.
Unless otherwise specified, the mesh refinements (line 4 in Framework 2.1) are done
such that the coarse meshes are always embedded into the refined meshes. For more
remarks, see subsection 1.3. Although the refinements are uniform in the numerical
simulations of the present work (subsections 4.2 and 4.3), practical implementations can focus
on the regions where the marginals vary rapidly. In the latter contexts, our
GGR approach still works.
In what follows, we elaborate on the initialization subroutine and local solver.

2.1. Initialization subroutine based on grid refinements. Brute-force optimizing
(2.1) becomes impracticable once $K$ grows large. One treatment for this
is arming a local solver with good initializations. Roughly speaking, if the energy
surface forms a basin around some global solution $(X_n^\star)_{n=2}^N$, the local solver is able to
find $(X_n^\star)_{n=2}^N$ provided that the initial point lies inside the basin near $(X_n^\star)_{n=2}^N$. This
subsection is devoted to the development of the subroutine, GR, for initializations
(line 5 in Framework 2.1). That is, the GR subroutine passes the solution information
of the previous step on to the current one such that good initializations can be
anticipated. Without this process, the point found by the local solver is very likely
not a global minimizer, resulting in bad solutions afterward.
We derive the GR subroutine from some 1D numerical experience: for a particular
problem (given oracle of $C$, $e$, and $\boldsymbol{\varrho}$), the solutions with different sizes share "similar"
patterns. This phenomenon suggests constructing an initial point based on the pattern
reflected in the solution with a smaller size. In the following, we try to understand
the "similarity" from the standpoint of optimal transport and then introduce the GR subroutine.
In principle, the GR subroutine applies to any dimension $d$.
Let us begin with 1D settings. Suppose that we already have a finite-elements
mesh $\{e_k\}_{k=1}^K$ and an approximate solution $(X_n)_{n=2}^N$ of (2.1). Then in the context
of optimal transport, for any $n \in \{2, \dots, N\}$, $x_{n,jj'} > 0$ indicates that mass of $x_{n,jj'}$
is transported from $e_j$ to $e_{j'}$ by $X_n$. For the problem with a doubly refined mesh
$\{\tilde{e}_k\}_{k=1}^{\tilde{K}}$, the original $e_j$, $e_{j'}$ correspond to $\tilde{e}_{2j-1}$ and $\tilde{e}_{2j}$, $\tilde{e}_{2j'-1}$ and $\tilde{e}_{2j'}$, respectively.
Let $j_1 = 2j - 1$, $j_2 = 2j$, $j_1' = 2j' - 1$, $j_2' = 2j'$. A reasonable speculation is that
there also exists certain mass transported from $\tilde{e}_{j_1}, \tilde{e}_{j_2}$ to $\tilde{e}_{j_1'}, \tilde{e}_{j_2'}$ by the new $\tilde{X}_n$, i.e.,
$\tilde{x}_{n,j_u j_v'} > 0$ for $u, v \in \{1, 2\}$. See Figure 2.1 for an illustration.
The above arguments apply to any $d \in \mathbb{N}$. Suppose that a finite-elements mesh
$\{e_k\}_{k=1}^K$ and an approximate solution $(X_n)_{n=2}^N$ are at hand. For any $n \in \{2, \dots, N\}$,
$x_{n,jj'} > 0$ means that mass of $x_{n,jj'}$ is transported from element $e_j$ to $e_{j'}$ by $X_n$.
Fig. 2.1. 1D case. The red block means there is mass transported from 3 to 4. Then in a doubly
refined mesh, there is mass transported from 5 and 6 to 7 and 8, as marked out by 4 blue blocks.

Fig. 2.2. 2D case ($7 \times 7$ rectangular mesh). The red block means there is mass transported from
(1,5) to (2,4). Then in a doubly refined mesh, there is mass transported from (1,9), (1,10), (2,9) and
(2,10) to (3,7), (3,8), (4,7) and (4,8), as marked out by 16 blue blocks.
After mesh refinement, the original $\{e_k\}_{k=1}^K$ becomes $\{\tilde{e}_k\}_{k=1}^{\tilde{K}}$; for each $k$, the original
element $e_k$ is divided into $s_k$ parts: $e_k = \cup_{t=1}^{s_k} \tilde{e}_{k_t}$ and $\tilde{e}_{k_{t_1}} \cap \tilde{e}_{k_{t_2}} = \emptyset$ when $k_{t_1} \neq k_{t_2}$.
It is reasonable to speculate that there also exists certain mass transported from $\tilde{e}_{j_u}$
to $\tilde{e}_{j'_v}$, where $u \in \{1, \dots, s_j\}$, $v \in \{1, \dots, s_{j'}\}$. Accordingly in $\tilde{X}_n$, there should be
$\tilde{x}_{n,j_u j'_v} > 0$, $s_j \times s_{j'}$ positive entries in total. We illustrate the 2D case in Figure 2.2.
Note that the coordinates in the transport plan are rearranged from the 2D coordinates
in mesh.
Based upon the above arguments, we derive the GR subroutine for initializations;
see Algorithm 2.2.
We shall mention that our strategy is completely different from the so-called
shielding neighborhood in the context of standard optimal transport problems [5, 22,
42]. The shielding neighborhood strategy adjusts the supports of transport plans
adaptively by using the strong duality of linear programming and restricts the re-
finement of plans on the adjusted supports. The strategy ensures the optimality of
the refined solutions without increasing too much computational and storage com-
plexities. Compared with the standard optimal transport problems, however, (2.1) is
nonconvex, which renders finding appropriate supports impracticable. Therefore, we
instead keep all the degrees of freedom and concentrate on constructing high-quality
initial points for the local solver.
2.2. Local solver. The first-step global optimization and GR subroutine remove
the need for brute-force global solution of large-scale (2.1). Instead, we only need to
devise a local solver (see line 6 in Framework 2.1). We assume that the procedure is in
the $\ell$th iteration of Framework 2.1; the same holds below whenever we discuss
the local solver. We define the linear operator $B^{(\ell)} : \mathbb{R}^{K^{(\ell)} \times K^{(\ell)}} \to \mathbb{R}^{2K^{(\ell)}+1}$ as
$B^{(\ell)}(W) := [\, e^{(\ell)\top} W^\top \;\; \boldsymbol{\varrho}^{(\ell)\top} E^{(\ell)} W \;\; \mathrm{Tr}(W) \,]^\top$ $\forall W \in \mathbb{R}^{K^{(\ell)} \times K^{(\ell)}}$,

Algorithm 2.2 The GR initialization subroutine.

Require: Coarse mesh with $K$ elements $\{e_k\}_{k=1}^K$; refined mesh with $\tilde{K}$ elements
$\{\tilde{e}_k\}_{k=1}^{\tilde{K}}$; approximate solution from the previous step $(X_n)_{n=2}^N$; scaling factor
$r > 0$.
1: for $n = 2, \dots, N$ do
2:   for $j = 1, \dots, K$ do
3:     for $j' = 1, \dots, K$ do
4:       if $x_{n,jj'} > 0$ then
5:         Find $\tilde{e}_{j_u}$, $u = 1, \dots, s_j$, such that $e_j = \cup_{u=1}^{s_j} \tilde{e}_{j_u}$.
6:         Find $\tilde{e}_{j'_v}$, $v = 1, \dots, s_{j'}$, such that $e_{j'} = \cup_{v=1}^{s_{j'}} \tilde{e}_{j'_v}$.
7:         Set $\tilde{x}_{n,j_u j'_v} = r \cdot x_{n,jj'}$ for $u \in \{1, \dots, s_j\}$ and $v \in \{1, \dots, s_{j'}\}$.
8:       end if
9:     end for
10:   end for
11: end for
12: return $(\tilde{X}_n)_{n=2}^N \in (\mathbb{R}^{\tilde{K} \times \tilde{K}})^{N-1}$, where $\tilde{X}_n = (\tilde{x}_{n,jj'})_{jj'}$ ($n = 2, \dots, N$).
Algorithm 2.3 PBCD for (2.1).

Require: $C^{(\ell)}$, $X_n^{(\ell,0)} \in \mathbb{R}^{K^{(\ell)} \times K^{(\ell)}}$, $n = 2, \ldots, N$; $e^{(\ell)}, \boldsymbol{\varrho}^{(\ell)} \in \mathbb{R}^{K^{(\ell)}}$; $\{\varepsilon^{(\ell,k)}\}_k \subseteq \mathbb{R}_+$;
    $\beta^{(\ell)}, \sigma^{(\ell)} > 0$.
 1: Set $k := 0$.
 2: while certain stopping criteria are not satisfied do
 3:   For $n = 2, \ldots, N$, inexactly solve
      \[
      \min_{X_n \in \mathcal{S}^{(\ell)}} \ f_{\beta^{(\ell)}}\bigl( X_{<n}^{(\ell,k+1)}, X_n, X_{>n}^{(\ell,k)} \bigr) + \frac{\sigma^{(\ell)}}{2} \| X_n - X_n^{(\ell,k)} \|_F^2 \tag{2.2}
      \]
      to obtain $X_n^{(\ell,k+1)}$, $\boldsymbol{\lambda}_n^{(\ell,k+1)} \in \mathbb{R}^{2K^{(\ell)}+1}$, and $\Phi_n^{(\ell,k+1)} \in \mathbb{R}_+^{K^{(\ell)} \times K^{(\ell)}}$ satisfying
      \[
      \sqrt{ r^{(\ell)}\bigl( X_n^{(\ell,k+1)}, \boldsymbol{\lambda}_n^{(\ell,k+1)}, \Phi_n^{(\ell,k+1)}, \tilde{X}_n^{(\ell,k)} \bigr) } \leq \varepsilon^{(\ell,k)}, \tag{2.3}
      \]
      where $\tilde{X}_n^{(\ell,k)}$ is computed as
      \[
      \tilde{X}_n^{(\ell,k)} := X_n^{(\ell,k)} - \frac{1}{\sigma^{(\ell)}} \nabla_n f_{\beta^{(\ell)}}\bigl( X_{<n}^{(\ell,k+1)}, X_{\geq n}^{(\ell,k)} \bigr) \in \mathbb{R}^{K^{(\ell)} \times K^{(\ell)}}.
      \]
 4:   Set $k := k + 1$.
 5: end while
 6: return $(X_n^{(\ell,\star)})_{n=2}^{N} := (X_n^{(\ell,k)})_{n=2}^{N} \in (\mathbb{R}^{K^{(\ell)} \times K^{(\ell)}})^{N-1}$.
and $b^{(\ell)} := [\mathbf{1}^\top, \ \boldsymbol{\varrho}^{(\ell)\top}, \ 0]^\top \in \mathbb{R}^{2K^{(\ell)}+1}$. The feasible set in (1.10) can then be rewritten
as $\mathcal{S}^{(\ell)} = \{ W : B^{(\ell)}(W) = b^{(\ell)}, \ W \geq 0 \}$.
The block structure of (2.1) suggests using splitting-type methods. One
natural choice is an $(N-1)$-block cyclic PBCD method; see Algorithm 2.3. In PBCD,
the $n$th block problem depends only on the $n$th block variable $X_n$, while the
other block variables are kept at their latest values. Moreover, the proximal term
$\| X_n - X_n^{(\ell,k)} \|_F^2$ is added to the objective function such that the block problem
admits a unique global solution, with $\sigma^{(\ell)} > 0$ being the proximal parameter. PBCD
invokes a subsolver for the block problem (2.2) until the inexact criterion (2.3) is
met. In (2.3), $r^{(\ell)}$ acts as a residual function measuring the violation of the KKT
conditions of (2.2), defined as$^1$
\[
\begin{aligned}
r^{(\ell)}(X_n, \boldsymbol{\lambda}_n, \Phi_n, \tilde{X}_n) := {} & \| X_n - \tilde{X}_n - B^{(\ell)\ast}(\boldsymbol{\lambda}_n) - \Phi_n \|_\infty \\
& + \max\bigl\{ \bigl\langle X_n, \ X_n - \tilde{X}_n - B^{(\ell)\ast}(\boldsymbol{\lambda}_n) - \Phi_n \bigr\rangle, \ 0 \bigr\} \\
& + \max\bigl\{ \bigl\langle \boldsymbol{\lambda}_n, \ B^{(\ell)}(X_n) - b^{(\ell)} \bigr\rangle, \ 0 \bigr\} + \max\{ \langle \Phi_n, X_n \rangle, \ 0 \} \\
& + \| B^{(\ell)}(X_n) - b^{(\ell)} \|_\infty + \| \max\{ -X_n, 0 \} \|_\infty;
\end{aligned}
\]
$\boldsymbol{\lambda}_n^{(\ell,k+1)} := [\boldsymbol{\lambda}_{n,1}^{(\ell,k+1)\top}, \boldsymbol{\lambda}_{n,2}^{(\ell,k+1)\top}, \lambda_{n,3}^{(\ell,k+1)}]^\top$$^2$ and $\Phi_n^{(\ell,k+1)}$ are respectively the Lagrange
multipliers associated with the equality and nonnegativity constraints, calculated by
the subsolver. Roughly speaking, (2.3) guarantees that the solution error at $X_n^{(\ell,k+1)}$
is at most some multiple of $\varepsilon^{(\ell,k)}$ (for details, please refer to the supplementary
material). One can then achieve the convergence of PBCD by imposing conditions on the
prescribed sequence $\{\varepsilon^{(\ell,k)}\}_k$.
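To make the operator $B^{(\ell)}$, its adjoint, and the residual $r^{(\ell)}$ concrete, the following NumPy sketch evaluates them for dense matrices. It is an illustration only: the function names are hypothetical, $E$ is taken to be $\mathrm{Diag}(e)$, and the adjoint formula $B^\ast(\boldsymbol{\lambda}) = \boldsymbol{\lambda}_1 e^\top + E\boldsymbol{\varrho}\,\boldsymbol{\lambda}_2^\top + \lambda_3 I$ is derived here from the definition of $B$ rather than quoted from the paper.

```python
import numpy as np


def B(W, e, rho):
    """B(W) = [(W e)^T, (W^T E rho)^T, Tr(W)]^T with E = Diag(e)."""
    return np.concatenate([W @ e, W.T @ (e * rho), [np.trace(W)]])


def B_adj(lam, e, rho):
    """Adjoint of B: lam1 e^T + (E rho) lam2^T + lam3 I."""
    K = e.size
    lam1, lam2, lam3 = lam[:K], lam[K:2 * K], lam[2 * K]
    return np.outer(lam1, e) + np.outer(e * rho, lam2) + lam3 * np.eye(K)


def residual(X, lam, Phi, X_tilde, e, rho, b):
    """KKT residual r(X, lam, Phi, X_tilde) of the projection subproblem."""
    G = X - X_tilde - B_adj(lam, e, rho) - Phi   # stationarity defect
    feas = B(X, e, rho) - b                      # equality-constraint defect
    return (np.abs(G).max()
            + max(np.sum(X * G), 0.0)
            + max(lam @ feas, 0.0)
            + max(np.sum(Phi * X), 0.0)
            + np.abs(feas).max()
            + np.maximum(-X, 0.0).max())


# Sanity check of the adjoint identity <B(W), lam> = <W, B*(lam)>.
rng = np.random.default_rng(0)
K = 4
e, rho = rng.uniform(0.5, 1.5, K), rng.uniform(0.1, 1.0, K)
W, lam = rng.normal(size=(K, K)), rng.normal(size=2 * K + 1)
adjoint_gap = abs(B(W, e, rho) @ lam - np.sum(W * B_adj(lam, e, rho)))
```

With this residual, the test in (2.3) amounts to comparing `np.sqrt(residual(...))` with the prescribed tolerance.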
Zooming in on (2.2) in Algorithm 2.3, we find that solving the block problems is
equivalent to projecting $\tilde{X}_n^{(\ell,k)}$ onto $\mathcal{S}^{(\ell)}$. There exist numerous algorithms for this
purpose. For instance, we can extend the semismooth Newton-CG (SSNCG) method
proposed in [33]; see more discussions in subsection 4.1. Since the projection does not
possess a closed-form expression, iterate infeasibility with respect to $(\mathcal{S}^{(\ell)})^{N-1}$ is
inevitable in Algorithm 2.3. This brings difficulties in analyzing the convergence of PBCD.
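The equivalence between the block problem (2.2) and a projection can be checked by completing the square. The short derivation below assumes that $f_{\beta^{(\ell)}}$ is affine in $X_n$ once the other blocks are fixed (as the bilinear form of the discretized energy suggests, although (2.1) is not restated here), and writes $G_n$ for the block gradient $\nabla_n f_{\beta^{(\ell)}}$ evaluated at the latest iterates.

```latex
\begin{aligned}
\langle G_n, X_n \rangle + \frac{\sigma^{(\ell)}}{2} \| X_n - X_n^{(\ell,k)} \|_F^2
&= \frac{\sigma^{(\ell)}}{2} \Bigl\| X_n - \Bigl( X_n^{(\ell,k)} - \frac{1}{\sigma^{(\ell)}} G_n \Bigr) \Bigr\|_F^2
   + \langle G_n, X_n^{(\ell,k)} \rangle - \frac{1}{2\sigma^{(\ell)}} \| G_n \|_F^2 \\
&= \frac{\sigma^{(\ell)}}{2} \| X_n - \tilde{X}_n^{(\ell,k)} \|_F^2 + \text{const}.
\end{aligned}
```

Since the constant does not depend on $X_n$, minimizing over $\mathcal{S}^{(\ell)}$ yields the Euclidean projection of $\tilde{X}_n^{(\ell,k)}$ onto $\mathcal{S}^{(\ell)}$, which is unique because $\mathcal{S}^{(\ell)}$ is closed and convex.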
3. Convergence analysis. In this section, we show the convergence of PBCD to
the KKT points or global solutions of (2.1) in different settings. The definition of the
KKT points for (2.1) can be found in the supplementary material. The convergence
depends upon the following conditions on the prescribed $\{\varepsilon^{(\ell,k)}\}_k$.
Condition 3.1.
(1) The sequence $\{\varepsilon^{(\ell,k)}\}_k$ is nonnegative and square summable.
(2) The sequence $\{\varepsilon^{(\ell,k)}\}_k$ is nonnegative and summable, and there exists $\theta \in (0, 1)$
such that $\{ k (\varepsilon^{(\ell,k)})^{2\theta} \}_k$ is summable.
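As a quick illustration of Condition 3.1 (a worked example added here, not taken from the paper), consider power-type tolerances with a constant $c > 0$ and an exponent $p > 0$, both hypothetical.

```latex
\varepsilon^{(\ell,k)} = c\,(k+1)^{-p}:
\qquad
\sum_{k} \bigl(\varepsilon^{(\ell,k)}\bigr)^2 < \infty \iff p > \tfrac{1}{2},
\qquad
\sum_{k} \varepsilon^{(\ell,k)} < \infty \iff p > 1,
\qquad
\sum_{k} k \bigl(\varepsilon^{(\ell,k)}\bigr)^{2\theta} < \infty \iff 2\theta p - 1 > 1 \iff p > \tfrac{1}{\theta}.
```

Hence such a sequence satisfies Condition 3.1(1) exactly when $p > 1/2$, and Condition 3.1(2) exactly when $p > 1$ (take any $\theta \in (1/p, 1)$). Geometrically decaying tolerances $\varepsilon^{(\ell,k)} = c\,q^k$ with $q \in (0, 1)$ satisfy both conditions for every $\theta \in (0, 1)$.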
Since the analysis is rather complicated, we give a rough statement of the
convergence results for PBCD below. The formal statement and proof are relegated to the
supplementary material.
Theorem 3.2. Suppose that $\sigma^{(\ell)} > 0$. Let $\{X^{(\ell,k)}\}_k$ be the sequence generated by
PBCD.
(1) If $\{\varepsilon^{(\ell,k)}\}_k$ fulfills Condition 3.1(1), then $\{X^{(\ell,k)}\}_k$ has at least one
accumulation point and each accumulation point is a KKT point of (2.1).
(2) If $\{\varepsilon^{(\ell,k)}\}_k$ fulfills Condition 3.1(2), then $\{X^{(\ell,k)}\}_k$ converges to a KKT point
of (2.1).
1 With a slight abuse of notation, we use $\| \cdot \|_\infty$ to denote an entrywise $\ell_\infty$-norm of a matrix or
vector.
2 $\boldsymbol{\lambda}_{n,1}^{(\ell,k+1)}, \boldsymbol{\lambda}_{n,2}^{(\ell,k+1)} \in \mathbb{R}^{K^{(\ell)}}$, and $\lambda_{n,3}^{(\ell,k+1)} \in \mathbb{R}$ correspond to the constraints $X_n e^{(\ell)} = \mathbf{1}$,
$X_n^\top E^{(\ell)} \boldsymbol{\varrho}^{(\ell)} = \boldsymbol{\varrho}^{(\ell)}$, and $\mathrm{Tr}(X_n) = 0$, respectively.

(3) If $\{\varepsilon^{(\ell,k)}\}_k$ fulfills Condition 3.1(2), $X^{(\ell,0)}$ is feasible and sufficiently close to
some global solution of (2.1), and $\sum_{k=0}^{\infty} \varepsilon^{(\ell,k)}$ and $\sum_{k=0}^{\infty} k (\varepsilon^{(\ell,k)})^{2\theta}$ are sufficiently
small, then $\{X^{(\ell,k)}\}_k$ converges to a global solution of (2.1).
Remark 3.3. Theorem 3.2 itself is of particular theoretical interest, in that the
iterates are allowed to be infeasible. This has not been covered by existing works
on PBCD (e.g., [50]) and should be credited to the inexact criterion (2.3). This also
provides a theoretical guarantee for the usage of efficient infeasible subsolvers for (2.2),
which is of significant importance in our context because the number of variables far
exceeds that of equality constraints.
4. Numerical experiments. In this section, we validate the proposed GGR
approach via numerical simulations on several typical systems, including both 1D and
2D systems. During the experiments, we mainly monitor the repulsive energy $f$ in
(1.8). We also calculate the approximate transport maps $\{T_n^K\}_{n=2}^{N}$ as in (1.12), and
evaluate the quality of solutions through the average error (denoted by err)
\[
\mathrm{err}(K, \Omega) := \frac{1}{K |\Omega|} \sum_{n=2}^{N} \sum_{k=1}^{K} \bigl| T_n(a_k) - T_n^K(a_k) \bigr|
\]
if the optimal transport maps $\{T_n\}_{n=2}^{N}$ in (1.11) are already available. Moreover,
we approximate the SCE potentials, which are crucial in the applications of electronic
structure calculations [46, 47], with $\tilde{\boldsymbol{\lambda}}^K := \boldsymbol{\lambda}^K - \min_{j=1}^{K} \{\lambda_j^K\} \cdot \mathbf{1} \in \mathbb{R}^K$. Here,
\[
\boldsymbol{\lambda}^K := \frac{1}{N-1} \sum_{n=2}^{N} \boldsymbol{\lambda}_{n,2}^K \in \mathbb{R}^K \tag{4.1}
\]
is the average of the $N-1$ Lagrange multipliers associated with the mass-preserving
constraints, computed by some subsolver. We refer interested readers to the
supplementary material for a numerical comparison of the local solvers proposed in
[6, 8, 17, 49] and PBCD.
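The two diagnostics above are simple to compute once the maps and multipliers are available. The sketch below is illustrative and restricted to the 1D case, where $|\cdot|$ is the absolute value; the function names and the list-of-lists data layout are assumptions made for this example.

```python
def average_error(T_exact, T_approx, omega_size):
    """err(K, Omega) = 1 / (K |Omega|) * sum_n sum_k |T_n(a_k) - T_n^K(a_k)|.

    T_exact, T_approx : lists (one per n = 2, ..., N) of K map values at a_k
    omega_size        : |Omega|, the length of the 1D domain
    """
    K = len(T_exact[0])
    total = sum(abs(t - s)
                for Tn, Sn in zip(T_exact, T_approx)
                for t, s in zip(Tn, Sn))
    return total / (K * omega_size)


def shifted_potential(multipliers):
    """Average the N-1 multiplier vectors as in (4.1), then subtract the minimum."""
    m = len(multipliers)
    K = len(multipliers[0])
    avg = [sum(lam[k] for lam in multipliers) / m for k in range(K)]
    lowest = min(avg)
    return [v - lowest for v in avg]


# Toy usage with made-up numbers (not data from the paper).
toy_err = average_error([[0.0, 1.0]], [[0.5, 1.0]], omega_size=2.0)
toy_potential = shifted_potential([[1.0, 3.0], [3.0, 5.0]])
```

The shift by the minimum entry fixes the additive constant of the potential, so that curves obtained on different meshes can be compared directly.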
All the numerical experiments presented here are run on a platform with an Intel
Xeon Gold 6242R CPU @ 3.10 GHz and 510 GB RAM running Matlab R2018b under
Ubuntu 20.04.
4.1. Default settings.
Global solver. Considering the applicability and efficiency, we take the stochas-
tic method, random multistart, as the global solver when d = 1, whose implementation
follows from [26]. Usually, for (2.1) with $(N-1)K^2 \leq 500$, random multistart entails
dozens of starts to achieve global optimality. Other strategies, such as branch-and-
bound and polynomial optimization, are not adequate due to high computational or
storage complexities. When d = 2, we employ the software BARON for global optimiza-
tion, where a hybrid strategy (random multistart and branch-and-bound) is adopted.
Given (2.1) with K around 200, it has been observed to produce satisfactory solutions
within a reasonable time. Version 21.1.13 of BARON is available in the downloadable
AMPL system [18].
Details in PBCD. We adapt the SSNCG method in [33] as the subsolver in PBCD.
A general iteration in SSNCG consists of approximately solving a sparse symmetric
positive definite linear system of the form
\[
\bigl( V^{(\ell,k,j)} + \tau^{(\ell,k,j)} I \bigr) d + s^{(\ell,k,j)} = 0, \quad d \in \mathrm{Im}\bigl( B^{(\ell)} \bigr),
\]
and then performing line searches along $d$ for a sufficient reduction in the dual
objective. Here, the third superscript $(j)$ indicates the iteration of SSNCG,
$V^{(\ell,k,j)} \in \mathbb{R}^{(2K^{(\ell)}+1) \times (2K^{(\ell)}+1)}$ is a positive semidefinite matrix, $s^{(\ell,k,j)} \in \mathbb{R}^{2K^{(\ell)}+1}$ is the
residual vector, and $\tau^{(\ell,k,j)} > 0$. In our context, the linear system can be solved quickly
to the desired accuracy by the preconditioned conjugate gradient method equipped with
a block Jacobi preconditioner.

Table 4.1
Values of $\beta$ for different $K$.
K      (0, 10)     [10, 36)     [36, 80)      [80, 160)     [160, 320)
beta   2^2         2^1          2^0           2^{-2}        2^{-3}
K      [320, 640)  [640, 1280)  [1280, 2560)  [2560, 5120)  [5120, \infty)
beta   2^{-4}      2^{-5}       2^{-6}        2^{-7}        2^{-8}
Parameter settings. In the GR subroutine, we set the scaling factor r = 1/2d.
For any $\ell$, we fix $\sigma^{(\ell)} \equiv 10^{-3}$ in PBCD. For different $K$, we choose $\beta$ according to Table
4.1. We start SSNCG from the origin in the first call; after that, we perform warm starts
for acceleration.
Stopping criteria. In SSNCG, we fix $\varepsilon^{(\ell,k)} \equiv 10^{-9}$ in (2.3) and terminate the
algorithm once the subiteration number reaches $10^4$. We stop PBCD when the scaled
difference of two consecutive iterates $\sqrt{\sigma^{(\ell)}} \, \| X^{(\ell,k+1)} - X^{(\ell,k)} \|_F$ is less than $10^{-4}$, or
when the absolute value of the difference between two consecutive energies is less than
$10^{-8}$, or once the iteration number reaches $10^6$.
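The three stopping rules for PBCD can be collected into a single predicate. The sketch below is an illustration with hypothetical names; it reads the scaled difference as $\sqrt{\sigma}\,\|X^{(\ell,k+1)} - X^{(\ell,k)}\|_F$ and uses the tolerances quoted above as default arguments.

```python
import math


def pbcd_should_stop(step_fro_norm, energy_prev, energy_new, k,
                     sigma=1e-3, tol_step=1e-4, tol_energy=1e-8,
                     max_iter=10**6):
    """Return True if any of the three PBCD stopping rules is triggered.

    step_fro_norm : Frobenius norm of the difference of consecutive iterates
    energy_prev   : energy at the previous iterate
    energy_new    : energy at the current iterate
    k             : current iteration counter
    """
    scaled_step = math.sqrt(sigma) * step_fro_norm
    return (scaled_step < tol_step
            or abs(energy_new - energy_prev) < tol_energy
            or k >= max_iter)
```

A driver loop would call this predicate once per outer iteration, after all $N-1$ blocks have been updated.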
4.2. Numerical results on 1D systems. We first consider some typical 1D

systems with our GGR approach. In the simulations, we use equimass discretization
of the marginals for the initial meshes, in the sense that each element in the mesh
carries the same marginal mass. This can be achieved cheaply and exactly for 1D
systems. The meshes are refined uniformly afterward.
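An equimass mesh in 1D can be obtained by inverting the cumulative mass function. The sketch below does this numerically by bisection, which is one possible realization and not necessarily the authors' procedure; the function names are hypothetical. As a test case it uses a density of the form $c(\cos(\pi x) + 1)$ on $[-1, 1]$ normalized to total mass 3, so that $c = 3/2$.

```python
import math


def equimass_breakpoints(cdf, a, b, K, tol=1e-12):
    """Return K + 1 breakpoints on [a, b] such that each cell carries mass cdf(b) / K.

    cdf : nondecreasing cumulative mass function with cdf(a) = 0
    """
    total = cdf(b)
    points = [a]
    for k in range(1, K):
        target = total * k / K
        lo, hi = points[-1], b
        while hi - lo > tol:          # bisection on the monotone cdf
            mid = 0.5 * (lo + hi)
            if cdf(mid) < target:
                lo = mid
            else:
                hi = mid
        points.append(0.5 * (lo + hi))
    points.append(b)
    return points


def cdf_rho1(x):
    """Cumulative mass of 1.5 * (cos(pi t) + 1) over [-1, x]."""
    return 1.5 * (math.sin(math.pi * x) / math.pi + x + 1.0)


mesh = equimass_breakpoints(cdf_rho1, -1.0, 1.0, K=12)
cell_masses = [cdf_rho1(mesh[i + 1]) - cdf_rho1(mesh[i]) for i in range(12)]
```

Cells are wider near the endpoints, where this density vanishes, and narrower around the origin, where it peaks.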
The first three systems under consideration all consist of three particles (N = 3),
whose single-electron densities (marginals) are given by
\[
\begin{aligned}
\rho_1(x) &= c_1 \bigl( \cos(\pi x) + 1 \bigr), & \Omega &= [-1, 1], \\
\rho_2(x) &= c_2 \bigl( 2 e^{-6(x+0.5)^2} + 1.5 e^{-4(x-0.5)^2} \bigr), & \Omega &= [-1.5, 1.5], \\
\rho_3(x) &= c_3 e^{-|x|}, & \Omega &= [-5, 5],
\end{aligned}
\]
respectively, with $c_i$ ($i = 1, 2, 3$) being the normalizing factors such that $\int_\Omega \rho_i(x) \, dx = 3$.
The number of grid points used for the initial meshes is $K^{(0)} = 12$ for all three
systems. Starting from the initial meshes, we have performed uniform mesh
refinements and invoked PBCD six times. Note that the explicit solutions of the original
MMOT problems are known for 1D systems [11]. We first list the output energies and
calculate average errors (the "err e" columns) at each step in Table 4.2(a), supporting
the efficiency of our approach. The evolution of the approximate SCE potentials $\tilde{\boldsymbol{\lambda}}^K$
(4.1) and comparison with the ground truth are illustrated in Figure 4.1. Finally, the
single-electron densities (marginals) and approximate transport maps $\{T_n^K\}_{n=2}^{N}$ (1.12)
are shown in Figure 4.2. From these results, the convergence of the GGR approach
can be observed as the meshes are refined. Moreover, our results match the theory
perfectly. To approximate the SCE potentials for electronic structure calculations, it

Table 4.2
Output energies and calculated average errors of the GGR approach on 1D systems. The column
"err s" (resp., "err e") lists the average errors of the initial points (resp., the converged solutions).

(a) N = 3
              System 1                       System 2                       System 3
Step          K    Energy   err s  err e     K    Energy   err s  err e     K    Energy   err s  err e
GGR Init      12   18.114   -      0.031     12   10.695   -      0.034     12   5.935    -      0.040
GGR LS(1)     24   18.911   0.049  0.013     24   11.301   0.053  0.016     24   6.275    0.053  0.018
GGR LS(2)     48   19.004   0.022  0.009     48   11.362   0.026  0.011     48   6.346    0.027  0.013
GGR LS(3)     96   19.019   0.014  0.004     96   11.370   0.016  0.007     96   6.356    0.019  0.012
GGR LS(4)     192  19.021   0.007  0.003     192  11.372   0.011  0.004     192  6.360    0.013  0.001
GGR LS(5)     384  19.022   0.007  0.002     384  11.373   0.006  0.002     384  6.361    0.003  0.000
GGR LS(6)     768  19.022   0.004  0.001     768  11.373   0.003  0.000     768  6.361    0.001  0.000

(b) N = 7
              System 4                       System 5                       System 6
Step          K    Energy   err s  err e     K    Energy   err s  err e     K    Energy   err s  err e
GGR Init      14   173.951  -      0.052     14   151.891  -      0.039     14   111.964  -      0.030
GGR LS(1)     28   181.474  0.045  0.019     28   158.797  0.037  0.028     28   117.223  0.030  0.010
GGR LS(2)     56   181.929  0.018  0.025     56   158.507  0.023  0.026     56   117.050  0.011  0.008
GGR LS(3)     112  181.989  0.019  0.012     112  158.317  0.019  0.011     112  116.914  0.007  0.008
GGR LS(4)     224  181.954  0.014  0.013     224  158.267  0.008  0.008     224  116.876  0.008  0.006
GGR LS(5)     448  181.942  0.007  0.002     448  158.255  0.010  0.004     448  116.864  0.005  0.003
GGR LS(6)     896  181.939  0.002  0.001     896  158.254  0.005  0.002     896  116.861  0.003  0.001
Fig. 4.1. Approximate SCE potentials (blue lines) and ground truths (red lines) for 1D systems
with N = 3.
seems that averaging the $N-1$ Lagrange multipliers given by SSNCG as in (4.1) is
enough and there is no need to solve extra partial differential equations [10].
The second set includes three systems, each of which contains seven particles
(N = 7). Note that this particle number is already intractable if one tries to solve the
original MMOT problem (1.1) directly. The single-electron densities (marginals) are
given by
\[
\begin{aligned}
\rho_4(x) &= c_4 e^{-x^2/\sqrt{\pi}}, & \Omega &= [-3, 3], \\
\rho_5(x) &= c_5 \bigl( e^{-(x+2)^2} + 5 e^{-2x^2} + e^{-(x-2)^2} \bigr), & \Omega &= [-4, 4], \\
\rho_6(x) &= c_6 \bigl( e^{-4(x+2)^2} + e^{-4(x+1.5)^2} + e^{-4(x+1)^2} + e^{-4(x+0.5)^2} \\
&\qquad\ + e^{-4(x-2/3)^2} + e^{-4(x-4/3)^2} + e^{-4(x-2)^2} \bigr), & \Omega &= [-3, 3],
\end{aligned}
\]
respectively, with $c_i$ ($i = 4, 5, 6$) being the normalizing factors such that $\int_\Omega \rho_i(x) \, dx = 7$.
The last two examples can be viewed as systems with localized electrons. The
number of grid points used for the initial meshes is $K^{(0)} = 14$ for these three systems.
Fig. 4.2. Marginals (the first row) and approximate transport maps (the remaining four rows)
in 1D systems with N = 3; left to right: systems with $\rho_1$, $\rho_2$, $\rho_3$. The maps in the last four rows
correspond to the rows in Table 4.2(a), where K = 12, 48, 192, 768. The blue and red dots are the
images of $T_2^K$ and $T_3^K$ over grid barycenters, respectively.
Starting from the initial meshes, we have performed uniform mesh refinements and
invoked PBCD six times. The output energies and calculated average errors (the "err e"
columns) are collected in Table 4.2(b). The approximate SCE potentials $\tilde{\boldsymbol{\lambda}}^K$ (4.1) as
well as the ground truths are depicted in Figure 4.3. We show the single-electron
densities (marginals) and approximate transport maps $\{T_n^K\}_{n=2}^{N}$ (1.12) in Figure 4.4.
We observe from the numerical results that the iterates as well as $\tilde{\boldsymbol{\lambda}}^K$ given by our
GGR approach converge well to the correct solutions and the true SCE potentials.

Fig. 4.3. Approximate SCE potentials (blue lines) and ground truths (red lines) for 1D systems
with N = 7.
To show that the GR subroutine (Algorithm 2.2) yields high-quality initializations,
we compute the average errors of the initial points (the "err s" columns) as
well (see Table 4.2). The notation "-" in the GGR Init step indicates that no initial
points are fed to the global solver. The decreasing values of err s underline the
efficacy of the GR subroutine, which boosts the GGR approach and helps PBCD find
global solutions. Incidentally, the comparison between err s and err e in the same row
highlights the improvements brought by PBCD. One can also find that err e is
sometimes slightly larger than err s. In these cases, PBCD eliminates infeasibility
while inheriting the high quality of the initial points.
Compared with naive random initializations, our GR strategy leverages past
information and helps PBCD reach lower energies in less running time. We show
this point with Table 4.3, where random initializations are realized by the built-in
function "rand" in Matlab.
4.3. Numerical results on 2D systems. We then consider some 2D systems
with the GGR approach. We use the finite elements package FreeFEM [25] to generate
the initial meshes for the marginal discretization. The meshes are nonuniform such
that every element carries almost the same mass. In the later steps of the GGR
approach, each element is refined uniformly.
The two systems under consideration both consist of three particles (N = 3),
whose single-electron densities (marginals) are given by
\[
\begin{aligned}
\rho_7(x, y) &= c_7 \bigl( e^{-2.5 |(x,y) - (-1.5, 0)|^2} + 0.5 e^{-2.5 |(x,y) - (1.5, 0)|^2} \bigr), & \Omega &= [-3, 3] \times [-2, 2], \\
\rho_8(x, y) &= c_8 \bigl( e^{-2.5 |(x,y) - (-1.032, -0.84)|^2} + e^{-2.5 |(x,y) - (0, 0.96)|^2} \\
&\qquad\ + e^{-2.5 |(x,y) - (1.032, -0.84)|^2} \bigr), & \Omega &= [-2.5, 2.5]^2,
\end{aligned}
\]
respectively, with $c_7$, $c_8$ being the normalizing factors such that $\int_\Omega \rho_i(x, y) \, dx \, dy = 3$
($i = 7, 8$). For the first 2D system, $\rho_7$ corresponds to a system that has two electrons
located on the left part of $\Omega$ (represented by the first Gaussian centered at $(-1.5, 0)$),
and the third electron located on the right part (represented by the second Gaussian
centered at $(1.5, 0)$). For the second 2D system, $\rho_8$ corresponds to a system that has
three electrons concentrated on three different sites $(-1.032, -0.84)$, $(0, 0.96)$, and
$(1.032, -0.84)$ (represented by three Gaussians), respectively. The electron densities
(marginals) and corresponding initial meshes (obtained by FreeFEM) are shown in the
first two rows of Figure 4.5. The numbers of grid points used for the initial meshes are
$K^{(0)} = 240$ for $\rho_7$ and $K^{(0)} = 170$ for $\rho_8$, respectively. After four steps in Framework
2.1, we reach $K^{(4)} = 61440$ for $\rho_7$ and $K^{(4)} = 43520$ for $\rho_8$.

Fig. 4.4. Marginals (the first row) and approximate transport maps (the remaining four rows)
in 1D systems with N = 7; left to right: systems with $\rho_4$, $\rho_5$, $\rho_6$. The maps in the last four rows
correspond to the rows in Table 4.2(b), where K = 14, 56, 224, 896. The blue, red, black, green, brown,
and purple dots are the images of $T_n^K$, $n = 2, \ldots, 7$, over grid barycenters, respectively.
We gather the output energies of the GGR approach in Table 4.4, where errors
are absent because no explicit solutions of the original MMOT problems are known in
these contexts. The evolution of the approximate SCE potentials $\tilde{\boldsymbol{\lambda}}^K$ (4.1) is shown
in Figure 4.6. The convergence of our GGR approach can be well observed from these
results. Moreover, to show the transport maps $T_2^K$, $T_3^K$ (1.12) approximated by the
obtained solutions, we plot in the remaining three rows of Figure 4.5 the images of
the barycenters of triangular elements within some subregions $\omega \subseteq \Omega$ under $T_2^K$ and
Table 4.3
Comparison between energies output and running time (Time) in seconds needed by PBCD
equipped with GR and random initializations on three 1D systems with N = 3 and K = 768. The
results given by random initializations are the average of 10 trials.
                  rho = rho_1          rho = rho_2          rho = rho_3
Initialization    Energy   Time        Energy   Time        Energy   Time
GR                19.022   197.20      11.373   209.30      6.361    98.55
Random            19.037   823.98      11.379   716.02      6.396    538.23
$T_3^K$. For the two-Gaussian system $\rho_7$, the pictures show that if the first electron is
around the left Gaussian center, then the third electron will go to the region near
the right Gaussian center, and the second electron will lie in the left part (to satisfy
the marginal constraints) but stay away from the first one ($\omega$ and $T_2^K(\omega)$ lie in two
different regions around the left Gaussian center); if one electron is located around the
right Gaussian center, then the other two electrons will be around the left Gaussian
center while keeping a distance away from each other. For the three-Gaussian system
$\rho_8$, we can see that if one electron is located around one of the Gaussian centers, the
other two electrons go to the other two Gaussian centers, respectively. Our simulations
match physical intuitions quite well and can support the reliability of our approach.
Although no existing works have shown the existence of Monge solutions for
these two systems, our results appear to provide some indications; see Figure 4.5. In
addition, we record the number of nonzero entries per row ($\mathrm{nnz}_{n,i}$) of the solutions
$(X_n)_{n=2}^{N}$ and the corresponding distance ($d_{n,i}$) defined respectively as
\[
\mathrm{nnz}_{n,i} := \# \{ j : x_{n,ij} > 0 \} \quad \text{and} \quad d_{n,i} := \sum_{\substack{j : x_{n,ij} > 0 \\ j' : x_{n,ij'} > 0}} | a_j - a_{j'} |.
\]
For any $n \in \{2, \ldots, N\}$ and $i \in \{1, \ldots, K\}$, the cardinality $\mathrm{nnz}_{n,i}$ corresponds to the
number of sites to which the $i$th piece of mass is transported by the $n$th plan and
$d_{n,i}$ measures the total distance between each two of these sites. We draw the
frequency percentage distributions of $\{\mathrm{nnz}_{n,i}\}_{n,i}$ and $\{d_{n,i}\}_{n,i}$ using the built-in function
"hist3" of Matlab; see Figure 4.7. One can observe that there are dominant values
in a few columns for most rows and the distances are short. By the definition of $\mathrm{nnz}_{n,i}$
and $d_{n,i}$, Figure 4.7 shows the localization of mass transportation. As $K$ grows, the
mesh becomes more refined and the sparsity of solutions becomes more evident.
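The two localization statistics can be computed row by row. The sketch below is illustrative (dense rows, hypothetical function name) and follows the displayed formula literally, so the double sum runs over ordered pairs $(j, j')$ and each unordered pair of sites contributes twice.

```python
import math


def row_localization(X, a):
    """Return [(nnz_i, d_i)] for every row i of a plan X.

    X : list of rows (lists of nonnegative floats)
    a : list of barycenter coordinates, one tuple per column index
    """
    stats = []
    for row in X:
        support = [j for j, x in enumerate(row) if x > 0]
        d = 0.0
        for j in support:
            for jp in support:       # ordered pairs, as in the formula
                d += math.dist(a[j], a[jp])
        stats.append((len(support), d))
    return stats


# Toy usage with made-up numbers (not data from the paper).
X_toy = [[0.0, 0.5, 0.5], [1.0, 0.0, 0.0]]
a_toy = [(0.0,), (1.0,), (3.0,)]
toy_stats = row_localization(X_toy, a_toy)
```

A Monge-type plan has $\mathrm{nnz}_{n,i} = 1$ and $d_{n,i} = 0$ in every row, which is the pattern that the histograms in Figure 4.7 are compared against.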
4.4. Scaling of the GGR approach. We finally investigate the scaling of GGR
numerically on the eight systems in the previous two subsections; see Figures 4.8 and
4.9.$^3$ Leveraging the sparsity of iterates, PBCD entails $O(K)$ expenditure for solving
(2.1) in each step of GGR, which leads to $O(K^2)$ cost in total for GGR.
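An empirical scaling exponent can be read off from a log-log fit of running time against $K$. The sketch below is illustrative: the function name is hypothetical, and the timings are synthetic values generated from $t = 10^{-4} K^2$ purely to exercise the code, not measurements from the paper.

```python
import math


def loglog_slope(Ks, times):
    """Least-squares slope of log(time) against log(K)."""
    xs = [math.log(K) for K in Ks]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Mesh sizes of the 1D runs with N = 3; synthetic timings t = 1e-4 * K^2.
Ks = [24, 48, 96, 192, 384, 768]
synthetic_times = [1e-4 * K ** 2 for K in Ks]
slope = loglog_slope(Ks, synthetic_times)
```

A slope close to 2 on measured timings would be consistent with the quadratic reference curves drawn in the scaling figures.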
5. Conclusions. In the present work, we consider the MMOT problem with
Coulomb cost arising in quantum physics. The Monge-like ansatz tides us over the
curse of dimensionality, in that the number of unknowns scales linearly with respect to
the number of electrons, but it results in an MPGCC. In the quest for global solutions,
3 The time consumed for the initial global optimization is not included.

Fig. 4.5. Contours of marginals (the first row), initial meshes (the second row), and slices of
approximate transport maps (the third to fifth rows) in 2D systems; left to right: systems with $\rho_7$, $\rho_8$.
For systems 7 and 8, the results shown are computed up to K = 15360 and K = 10880, respectively.
The gray, blue, and green circles are preimages $\omega \subseteq \Omega$, $T_2^K(\omega)$, and $T_3^K(\omega)$, respectively.
we propose a global optimization approach GGR for dealing with the derived MPGCC.
The GGR approach solves the problem step by step along with the process of mesh
refinements and is equipped with an initialization subroutine such that global solutions
are amenable to the proposed local solver PBCD. The convergence properties of PBCD
are established in the presence of iterate infeasibility. We corroborate the merits of
the GGR approach with numerical simulations on several typical 1D and 2D physical
systems. Notably, we obtain solutions with high resolution in the 1D cases, provide
indications for the Monge solutions, and visualize the approximate optimal transport

Table 4.4
Output energies of the GGR approach on 2D systems.
              System 7            System 8
Step          K       Energy      K       Energy
GGR Init      240     9.503       170     9.491
GGR LS(1)     960     9.577       680     9.533
GGR LS(2)     3840    9.598       2720    9.543
GGR LS(3)     15360   9.604       10880   9.546
GGR LS(4)     61440   9.606       43520   9.547
Fig. 4.6. Approximate SCE potentials for 2D systems. Upper: $\rho = \rho_7$ (from left to right:
K = 240, 960, 3840, 15360). Lower: $\rho = \rho_8$ (from left to right: K = 170, 680, 2720, 10880).
maps in the 2D contexts.

Appendix A. Discretization of (1.6). For the repulsive energy in (1.6), we
have for any $n \in \{2, \ldots, N\}$,
\[
\int_\Omega \int_\Omega \frac{\rho(r) \gamma_n(r, r')}{|r - r'|} \, dr \, dr' = \sum_{i,j} \int_{e_j} \int_{e_i} \frac{\rho(r) \gamma_n(r, r')}{|r - r'|} \, dr \, dr'.
\]
Note that when $i = j$, the integral explodes and hence we impose $x_{n,kk} = 0$ for any
$k \in \{1, \ldots, K\}$ as extra constraints to avoid numerical instability. In the subsequent
derivation, we take $\gamma_n(r, r') = 0$ whenever $r$ and $r'$ belong to the same element:
\[
\begin{aligned}
\int_\Omega \int_\Omega \frac{\rho(r) \gamma_n(r, r')}{|r - r'|} \, dr \, dr'
&= \sum_{i \neq j} \int_{e_j} \int_{e_i} \frac{\rho(r) \gamma_n(r, r')}{|r - r'|} \, dr \, dr' \\
&= \sum_{i \neq j} \varrho_i x_{n,ij} \int_{e_j} \int_{e_i} \frac{1}{|r - r'|} \, dr \, dr' + O(h) \\
&= \sum_{i \neq j} \varrho_i x_{n,ij} c_{ij} |e_i| |e_j| + O(h) = \langle X_n, \Lambda E C E \rangle + O(h),
\end{aligned} \tag{A.1}
\]

Fig. 4.7. Frequency percentage distribution of $\{\mathrm{nnz}_{n,i}\}_{n,i}$ and $\{d_{n,i}\}_{n,i}$ of solutions $(X_n)_{n=2}^{N}$.
Upper: $\rho = \rho_7$ (from left to right: K = 240, 960, 3840, 15360). Lower: $\rho = \rho_8$ (from left to right:
K = 170, 680, 2720, 10880).

Fig. 4.8. Scaling of GGR on 1D systems (running time in seconds versus K, with a $K^2$ reference
curve). Left (d = 1, N = 3): $\rho = \rho_1, \rho_2, \rho_3$. Right (d = 1, N = 7): $\rho = \rho_4, \rho_5, \rho_6$.
where $h := \| e \|_\infty$ represents the size of the largest element. By similar arguments, we
can write for any $m, n \in \{2, \ldots, N\}$ with $m \neq n$,
\[
\begin{aligned}
\int_\Omega \int_\Omega \int_\Omega \frac{\rho(r) \gamma_m(r, r') \gamma_n(r, r'')}{|r' - r''|} \, dr \, dr' \, dr''
&= \sum_{i,j,k : j \neq k} \varrho_i x_{m,ij} x_{n,ik} c_{jk} |e_i| |e_j| |e_k| + O(h) \\
&= \langle X_n, E \Lambda X_m E C E \rangle + O(h).
\end{aligned} \tag{A.2}
\]

[Figure: running time versus K, with a K^2 reference line. Left panel: d = 2, N = 3, \rho = \rho_7 (K = 960, 3840, 15360, 61440). Right panel: d = 2, N = 3, \rho = \rho_8 (K = 680, 2720, 10880, 43520).]
Fig. 4.9. Scaling of GGR on 2D systems. Left: \rho = \rho_7. Right: \rho = \rho_8.
Note that we have excluded the cases j = k and imposed \langle X_m, X_n \rangle = 0 as extra
complementarity constraints. By (A.1) and (A.2), the repulsive energy in (1.6) can be
approximated by

        \sum_{2 \leq n \leq N} \langle X_n, \Lambda E C E \rangle + \sum_{m < n} \langle X_n, E \Lambda X_m E C E \rangle

with error depending on the size of the largest element.
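
As an illustration, the matrix form of the approximated repulsive energy can be checked against the element-wise sums in (A.1) and (A.2). The following is a minimal Python sketch, not the authors' implementation. It assumes that E = Diag(|e_1|, ..., |e_K|) and \Lambda = Diag(\varrho), which is how the element-wise sums read, and that C has a zero diagonal so that the excluded terms i = j and j = k drop out of the matrix products. The 1D mesh, the choice c_{ij} = 1/|a_i - a_j| with element centers a_i, the random data, and the function names are hypothetical and serve only the check.

```python
import numpy as np

def repulsive_energy(Xs, rho, e, C):
    """Matrix form: sum_n <X_n, Lam E C E> + sum_{m<n} <X_n, E Lam X_m E C E>."""
    Lam, E = np.diag(rho), np.diag(e)   # assumed definitions of Lambda and E
    ECE = E @ C @ E
    total = 0.0
    for n, Xn in enumerate(Xs):
        total += np.sum(Xn * (Lam @ ECE))                # single-map term, cf. (A.1)
        for Xm in Xs[:n]:
            total += np.sum(Xn * (E @ Lam @ Xm @ ECE))   # pair term, cf. (A.2)
    return total

def repulsive_energy_loops(Xs, rho, e, C):
    """Same quantity, written as the explicit sums over i != j and j != k."""
    K = len(e)
    total = 0.0
    for n, Xn in enumerate(Xs):
        for i in range(K):
            for j in range(K):
                if i != j:
                    total += rho[i] * Xn[i, j] * C[i, j] * e[i] * e[j]
        for Xm in Xs[:n]:
            for i in range(K):
                for j in range(K):
                    for k in range(K):
                        if j != k:
                            total += (rho[i] * Xm[i, j] * Xn[i, k] * C[j, k]
                                      * e[i] * e[j] * e[k])
    return total

# Hypothetical test data: K elements on a line, N = 3 (two matrices X_2, X_3).
rng = np.random.default_rng(0)
K = 5
e = rng.uniform(0.5, 1.5, K)            # element sizes |e_i|
a = np.cumsum(e) - e / 2                # element centers (illustrative choice)
D = np.abs(a[:, None] - a[None, :])
off = ~np.eye(K, dtype=bool)
C = np.zeros((K, K))
C[off] = 1.0 / D[off]                   # Coulomb-type cost, zero diagonal
rho = rng.uniform(0.1, 1.0, K)          # discretized density values
Xs = [rng.random((K, K)) for _ in range(2)]

energy_matrix = repulsive_energy(Xs, rho, e, C)
energy_loops = repulsive_energy_loops(Xs, rho, e, C)
```

The two values agree up to rounding for any nonnegative matrices, since the check only concerns the algebraic identity between the sums and the inner products; it says nothing about the O(h) discretization error.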

Regarding the constraints in (1.5), we can see from a similar derivation that, for
any n \in \{2, \ldots, N\},

        1 = \frac{1}{|e_i|} \int_{e_i} 1 dr = \frac{1}{|e_i|} \int_{e_i} \int_\Omega \gamma_n(r, r') dr' dr = \sum_{j=1}^K x_{n,ij} |e_j|    \forall i,

        \varrho_j = \frac{1}{|e_j|} \int_{e_j} \rho(r') dr' = \frac{1}{|e_j|} \int_{e_j} \int_\Omega \rho(r)\gamma_n(r, r') dr dr' = \sum_{i=1}^K \varrho_i x_{n,ij} |e_i| + O(h)    \forall j.

Consequently, the constraints in (1.5) can be approximated using

        X_n e = \mathbf{1},    X_n^\top E \boldsymbol{\varrho} = \boldsymbol{\varrho}    \forall n \in \{2, \ldots, N\}.
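
To make the two discretized constraints concrete, the sketch below evaluates their residuals for a given matrix. It is an illustration under stated assumptions rather than part of the method: e is taken to be the vector of element sizes with E = Diag(e), the right-hand side 1 is read as the all-ones vector, and the function name and random data are hypothetical. The test matrix is the product coupling x_{ij} = \varrho_j / \sum_i \varrho_i |e_i|, which satisfies both marginal constraints by direct substitution; it does not satisfy the complementarity constraints and is used only because its feasibility is easy to verify by hand.

```python
import numpy as np

def constraint_residuals(X, rho, e):
    """Max-norm residuals of X e = 1 and X^T E rho = rho, with E = Diag(e)."""
    row_res = X @ e - np.ones(len(e))     # sum_j x_ij |e_j| - 1, for each i
    col_res = X.T @ (e * rho) - rho       # sum_i rho_i x_ij |e_i| - rho_j, for each j
    return np.max(np.abs(row_res)), np.max(np.abs(col_res))

# Hypothetical data: K elements with random sizes and density values.
rng = np.random.default_rng(1)
K = 6
e = rng.uniform(0.5, 1.5, K)
rho = rng.uniform(0.1, 1.0, K)
mass = rho @ e                            # sum_i rho_i |e_i|

# Product coupling: every row of X equals rho / mass.
X_prod = np.outer(np.ones(K), rho) / mass
res_row, res_col = constraint_residuals(X_prod, rho, e)

# A perturbed matrix violates the constraints.
X_bad = X_prod + 0.1 * np.eye(K)
bad_row, bad_col = constraint_residuals(X_bad, rho, e)
```

Both residuals vanish up to rounding for the product coupling, while the perturbed matrix gives residuals of order 0.1 times the element sizes.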
REFERENCES
[1] A. Alfonsi, R. Coyaud, and V. Ehrlacher, Constrained overdamped Langevin dynamics for
symmetric multimarginal optimal transportation, Math. Models Methods Appl. Sci., 32
(2022), pp. 403--455, https://doi.org/10.1142/S0218202522500105.
[2] A. Alfonsi, R. Coyaud, V. Ehrlacher, and D. Lombardi, Approximation of optimal trans-
port problems with marginal moments constraints, Math. Comp., 90 (2021), pp. 689--737,
https://doi.org/10.1090/mcom/3568.
[3] A. D. Becke, Perspective: Fifty years of density-functional theory in chemical physics, J.
Chem. Phys, 140 (2014), 18A301, https://doi.org/10.1063/1.4869598 (18 pages).
[4] S. Bednarek, B. Szafran, T. Chwiej, and J. Adamowski, Effective interaction for charge
carriers confined in quasi-one-dimensional nanostructures, Phys. Rev. B, 68 (2003),
045328, https://doi.org/10.1103/PhysRevB.68.045328, (9 pages).
[5] J.-D. Benamou, G. Carlier, and L. Nenna, A numerical method to solve multi-marginal
optimal transport problems with Coulomb cost, in Splitting Methods in Communication,

Imaging, Science, and Engineering, R. Glowinski, S. J. Osher, and W. Yin, eds., Springer,
Cham, 2016, pp. 577--601, https://doi.org/10.1007/978-3-319-41589-5 17.
[6] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed optimization and
statistical learning via the alternating direction method of multipliers, Found. Trends Mach.
Learn., 3 (2011), pp. 1--122, https://doi.org/10.1561/2200000016.
[7] G. Buttazzo, L. De Pascale, and P. Gori-Giorgi, Optimal-transport formulation of
electronic density-functional theory, Phys. Rev. A, 85 (2012), 062502, https://doi.org/
10.1103/PhysRevA.85.062502.
[8] R. H. Byrd, J. Nocedal, and R. A. Waltz, Knitro: An integrated package for nonlinear
optimization, in Large-Scale Nonlinear Optimization, G. Di Pillo and M. Roma, eds.,
Springer, Boston, 2006, pp. 35--59, https://doi.org/10.1007/0-387-30065-1 4.
[9] H. Chen and G. Friesecke, Pair densities in density functional theory, Multiscale Model.
Simul., 13 (2015), pp. 1259--1289, https://doi.org/10.1137/15M1014024.
[10] H. Chen, G. Friesecke, and C. B. Mendl, Numerical methods for a Kohn-Sham den-
sity functional model based on optimal transport, J. Chem. Theory Comput., 10 (2014),
pp. 4360--4368, https://doi.org/10.1021/ct500586q.
[11] M. Colombo, L. De Pascale, and S. Di Marino, Multimarginal optimal transport
maps for one-dimensional repulsive costs, Canad. J. Math., 67 (2015), pp. 350--368,
https://doi.org/10.4153/CJM-2014-011-x.
[12] C. Cotar, G. Friesecke, and C. Kluppelberg, \" Density functional theory and optimal
transportation with Coulomb cost, Comm. Pure Appl. Math., 66 (2013), pp. 548--599,
https://doi.org/10.1002/cpa.21437.
[13] C. Cotar, G. Friesecke, and B. Pass, Infinite-body optimal transport with Coulomb
cost, Calc. Var. Partial Differential Equations, 54 (2015), pp. 717--742, https://doi.org/
10.1007/s00526-014-0803-0.
[14] F. Facchinei, H. Jiang, and L. Qi, A smoothing method for mathematical programs with
equilibrium constraints, Math. Program., 85 (1999), pp. 107--134, https://doi.org/10.1007/
s10107990015a.
[15] M. L. Flegel and C. Kanzow, On the Guignard constraint qualification for mathematical pro-
grams with equilibrium constraints, Optimization, 54 (2005), pp. 517--534, https://doi.org/
10.1080/02331930500342591.
[16] R. Fletcher and S. Leyffer, Nonlinear programming without a penalty function, Math.
Program., 91 (2002), pp. 239--269, https://doi.org/10.1007/s101070100244.
[17] R. Fletcher and S. Leyffer, Solving mathematical programs with complementarity
constraints as nonlinear programs, Optim. Methods Softw., 19 (2004), pp. 15--40,
https://doi.org/10.1080/10556780410001654241.
[18] R. Fourer, D. M. Gay, and B. W. Kernighan, A modeling language for mathemat-
ical programming, Management Sci., 36 (1990), pp. 519--641, https://doi.org/10.1287/
mnsc.36.5.519.
[19] G. Friesecke, A. Gerolin, and P. Gori-Giorgi, The strong-interaction limit of density
functional theory, https://arxiv.org/abs/2202.09760, 2022.
[20] G. Friesecke, A. S. Schulz, and D. Vogler, \" Genetic column generation: Fast computation
of high-dimensional multimarginal optimal transport problems, SIAM J. Sci. Comput., 44
(2022), pp. A1632--A1654, https://doi.org/10.1137/21M140732X.
[21] \"
G. Friesecke and D. Vogler, Breaking the curse of dimension in multi-marginal Kantorovich
optimal transport on finite state spaces, SIAM J. Math. Anal., 50 (2018), pp. 3996--4019,
https://doi.org/10.1137/17M1150025.
[22] S. Gerber and M. Maggioni, Multiscale strategies for computing optimal transport, J. Mach.
Learn. Res., 18 (2017), pp. 2440--2471, http://jmlr.org/papers/v18/16-108.html.
[23] J. Grossi, D. P. Kooi, K. J. H. Giesbertz, M. Seidl, A. J. Cohen, P. Mori-Sanchez, \' and
P. Gori-Giorgi, Fermionic statistics in the strongly correlated limit of density functional
theory, J. Chem. Theory Comput., 13 (2017), pp. 6089--6100, https://doi.org/10.1021/
acs.jctc.7b00998.
[24] L. Guo and X.-J. Chen, Mathematical programs with complementarity constraints and a non-
Lipschitz objective: Optimality and approximation, Math. Program., 185 (2021), pp. 455--
485, https://doi.org/10.1007/s10107-019-01435-7.
[25] F. Hecht, New development in freefem++, J. Numer. Math., 20 (2012), pp. 251--265,
https://doi.org/10.1515/jnum-2012-0013.
[26] F. J. Hickernell and Y. Yuan, A simple multistart algorithm for global optimization,
Oper. Res. Trans., 1 (1997), pp. 1--12, http://citeseerx.ist.psu.edu/viewdoc/summary?doi=
10.1.1.46.1346.

[27] T. Hoheisel, C. Kanzow, and A. Schwartz, Theoretical and numerical comparison of relax-
ation methods for mathematical programs with complementarity constraints, Math. Pro-
gram., 137 (2013), pp. 257--288, https://doi.org/10.1007/s10107-011-0488-5.
[28] X. Hu and D. Ralph, Convergence of a penalty method for mathematical programming

with complementarity constraints, J. Optim. Theory Appl., 123 (2004), pp. 365--390,
https://doi.org/10.1007/s10957-004-5154-0.
[29] X. Jia, C. Kanzow, P. Mehlitz, and G. Wachsmuth, An augmented Lagrangian method
for optimization problems with structured geometric constraints, Math. Program. (2022),
https://doi.org/10.1007/s10107-022-01870-z.
[30] C. Kanzow and A. Schwartz, The price of inexactness: Convergence properties of relaxation
methods for mathematical programs with complementarity constraints revisited, Math.
Oper. Res., 40 (2015), pp. 253--275, https://doi.org/10.1287/moor.2014.0667.
[31] Y. Khoo, L. Lin, M. Lindsey, and L. Ying, Semidefinite relaxation of multimarginal optimal
transport for strictly correlated electrons in second quantization, SIAM J. Sci. Comput.,
42 (2020), pp. B1462--B1489, https://doi.org/10.1137/20M1310977.
[32] Y. Khoo and L. Ying, Convex relaxation approaches for strictly correlated density func-
tional theory, SIAM J. Sci. Comput., 41 (2019), pp. B773--B795, https://doi.org/10.1137/
18M1207478.
[33] X. Li, D. Sun, and K.-C. Toh, On the efficient computation of a generalized Jacobian
of the projector over the Birkhoff polytope, Math. Program., 179 (2020), pp. 419--446,
https://doi.org/10.1007/s10107-018-1342-9.
[34] G. Lin and M. Fukushima, A modified relaxation scheme for mathematical programs with
complementarity constraints, Ann. Oper. Res., 133 (2005), pp. 63--84, https://doi.org/
10.1007/s10479-004-5024-z.
[35] F. Malet and P. Gori-Giorgi, Strong correlation in Kohn-Sham density functional theory,
Phys. Rev. Lett., 109 (2012), 246402, https://doi.org/10.1103/PhysRevLett.109.246402.
[36] C. B. Mendl and L. Lin, Kantorovich dual solution for strictly correlated electrons
in atoms and molecules, Phys. Rev. B, 87 (2013), 125106, https://doi.org/10.1103/
PhysRevB.87.125106.
[37] C. B. Mendl, F. Malet, and P. Gori-Giorgi, Wigner localization in quantum dots from
Kohn-Sham density functional theory without symmetry breaking, Phys. Rev. B, 89 (2014),
125106, https://doi.org/10.1103/PhysRevB.89.125106.
[38] G. Monge, M\' emoire sur la Th\'
eorie des D\'eblais et des Remblais, Histoire de l'Academie Royale
des Sciences de Paris, 1781.
[39] P. M. Pardalos and S. A. Vavasis, Quadratic programming with one negative eigenvalue is
NP-hard, J. Global Optim., 1 (1991), pp. 15--22, https://doi.org/10.1007/BF00120662.
[40] F. Santambrogio, Optimal Transport for Applied Mathematicians, Birkh\" auser, Cham, 2015,
https://doi.org/10.1007/978-3-319-20828-2.
[41] H. Scheel and S. Scholtes, Mathematical programs with complementarity constraints:
Stationarity, optimality, and sensitivity, Math. Oper. Res., 25 (2000), pp. 1--22,
https://doi.org/10.1287/moor.25.1.1.15213.
[42] B. Schmitzer, A sparse algorithm for dense optimal transport, in Scale Space and Variational
Methods in Computer Vision, J.-F. Aujol, M. Nikolova, and N. Papadakis, eds., Springer,
Cham, 2015, pp. 629--641, https://doi.org/10.1007/978-3-319-18461-6 50.
[43] S. Scholtes, Convergence properties of a regularization scheme for mathematical pro-
grams with complementarity constraints, SIAM J. Optim., 11 (2001), pp. 918--936,
https://doi.org/10.1137/S1052623499361233.
[44] S. Scholtes and M. Stohr, \" Exact penalization of mathematical programs with equilibrium
constraints, SIAM J. Control Optim., 37 (1999), pp. 617--652, https://doi.org/10.1137/
S0363012996306121.
[45] M. Seidl, Strong-interaction limit of density-functional theory, Phys. Rev. A, 60 (1999),
pp. 4387--4395, https://doi.org/10.1103/PhysRevA.60.4387.
[46] M. Seidl, S. Di Marino, A. Gerolin, L. Nenna, K. J. H. Giesbertz, and P. Gori-Giorgi,
The strictly-correlated electron functional for spherically symmetric systems revisited,
https://arxiv.org/abs/1702.05022, 2017.
[47] M. Seidl, P. Gori-Giorgi, and A. Savin, Strictly correlated electrons in density-functional
theory: A general formulation with applications to spherical densities, Phys. Rev. A, 75
(2007), 042511, https://doi.org/10.1103/PhysRevA.75.042511.
[48] C. Villani, Optimal Transport: Old and New , Grundlehren Math. Wiss. 338, Springer, Berlin,
2009, https://doi.org/10.1007/978-3-540-71050-9.
[49] R. A. Waltz, J. L. Morales, J. Nocedal, and D. Orban, An interior algorithm for nonlinear
optimization that combines line search and trust region steps, Math. Program., 107 (2006),
pp. 391--408, https://doi.org/10.1007/s10107-004-0560-5.

[50] Y. Xu and W. Yin, A block coordinate descent method for regularized multiconvex optimization
with applications to nonnegative tensor factorization and completion, SIAM J. Imaging
Sci., 6 (2013), pp. 1758--1789, https://doi.org/10.1137/120887795.
[51] J. Ye, Necessary and sufficient optimality conditions for mathematical programs with equi-
librium constraints, J. Math. Anal. Appl., 307 (2005), pp. 350--369, https://doi.org/
10.1016/j.jmaa.2004.10.032.
