1 Introduction
Overdetermined systems of equations generally have no exact solution, so a solution must be sought in a way that will best satisfy all of the equations. Since the inception of the least squares method by
Gauss [3], it has been applied in countless scientific, engineering, and numerical contexts. When
investigating physical systems, it is not uncommon for non-negativity constraints to be applied
in order for the system to be physically meaningful. Such constraints have been necessary
in the computational Earth sciences when using nuclear magnetic resonance to determine pore
structures [7], formulating petrologic mixing models [1, 21], and addressing seismologic inversion
problems [28]. This constrained least squares problem is the non-negative least squares (NNLS)
problem, which can be stated as minx ||Ax − b||2 such that x ≥ 0, where A is the m × n matrix
of the system of equations, b is the vector of measured data, and x is the vector of parameters
that minimizes the 2-norm of the residual.
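For concreteness, this problem statement can be exercised directly in Matlab with the built-in lsqnonneg solver (an implementation of the Lawson and Hanson method [14]); the toy system below is purely illustrative:

    A = [1 2; 3 4; 5 6];   % m-by-n system of equations (m = 3, n = 2)
    b = [7; 8; 9];         % vector of measured data
    x = lsqnonneg(A, b);   % minimizes ||A*x - b||_2 subject to x >= 0
    res = norm(A*x - b);   % 2-norm of the solution residual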
Recent problems in rock magnetism have required solving NNLS problems composed of tens
of thousands of variables [8, 26, 25]. Without the restrictions of conventional computer systems
and algorithms, these problems could expand to hundreds of thousands of variables. Direct
spatial solutions to these problems using standard NNLS algorithms have required months of
execution time. Frequency domain methods [15] have been developed as an alternative fast
method to obtain representative solutions, but can produce non-physical artifacts (e.g., Gibbs
phenomenon at sharp interfaces) that violate the non-negativity constraint.
We present TNT-NN, a new active set strategy for solving large NNLS problems.
TNT-NN improves upon prior efforts by incorporating a more aggressive strategy for identifying
the active set of constraints and by using an improved solver to address the unconstrained least
squares sub-problem. We show that TNT-NN dramatically outperforms the established Fast NNLS
(FNNLS) active set algorithm on a wide variety of test problems, and that it extends the maximum
tractable problem size to the point where previously prohibitive problems are now feasible.
Common strategies for solving the unconstrained least squares problem, minx ||Ax − b||2 ,
include the normal equations [9, 11], the pseudoinverse [3], and methods using QR decompo-
sition [23]. Using the normal equations introduces excessive numerical round-off error, because forming AᵀA squares the condition number of the system. The
pseudoinverse provides a conceptually simple and numerically accurate approach, but can be
computationally expensive. QR methods are often a good compromise between the normal
equations and pseudoinverse methods, providing acceptable accuracy and numerical efficiency.
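As a brief sketch of this trade-off, the three approaches can be written side by side in Matlab; the random A and b below stand in for an arbitrary overdetermined system with full column rank:

    A = randn(500, 100);          % placeholder overdetermined system (m > n)
    b = randn(500, 1);
    x_ne = (A' * A) \ (A' * b);   % normal equations: cheap, but cond(A'*A) = cond(A)^2
    x_pi = pinv(A) * b;           % pseudoinverse (SVD): accurate, but expensive
    [Q, R] = qr(A, 0);            % economy-size QR decomposition
    x_qr = R \ (Q' * b);          % QR: acceptable accuracy at moderate cost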
In TNT-NN, the solution is always initialized at the origin. Other active set methods have no such
constraint inherent to their design. This is a justifiable design decision given the physics of
real-world problems, where solutions corresponding to low energy states are often desired.
The TNT-NN Matlab implementation used in this paper is archived in a GitHub reposi-
tory [18].
4 Test methodology
TNT-NN is tested for performance and numerical accuracy using a Matlab implementation.
All tests include FNNLS and TNT-NN. The reference FNNLS Matlab function [5] is used
throughout these tests, which are performed using Matlab r2016a on a host with dual
6-core 2.93 GHz Intel Xeon X5670 CPUs and 384 GB of memory.
The first test spans systems with dimensions of 200×200 to 5000×5000, in increments of 200
for both rows and columns. A complete combination of dimensions, condition numbers (10², 10⁵,
and 10⁹), and percentages of active constraints (5%, 20%, and 50%) is tested. The reduction and
expansion constants used in the TNT-NN algorithm for these tests are 0.5 and 2, respectively.
Timing performance is reported as speedup of TNT-NN vs. FNNLS. Numerical accuracy is
reported as solution error, calculated as the 2-norm of the solution residual (||b − Ax||2 ).
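The generator used to produce these test systems is not listed here, but a matrix with a prescribed condition number and a target fraction of active constraints can be constructed along the following lines (a hedged sketch assuming an SVD-based construction; all parameter names are illustrative):

    n = 1000;  kappa = 1e9;  activeFrac = 0.5;  % assumed test parameters
    [U, ~] = qr(randn(n));                 % random orthogonal factor
    [V, ~] = qr(randn(n));                 % second random orthogonal factor
    s = logspace(0, -log10(kappa), n);     % singular values giving cond(A) = kappa
    A = U * diag(s) * V';
    xTrue = rand(n, 1);
    xTrue(rand(n, 1) < activeFrac) = 0;    % zeroed entries become active constraints
    b = A * xTrue;                         % consistent measured data
    % each solver's computed x is then scored as norm(b - A*x)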
The NNLS solver included with Matlab (lsqnonneg, which uses LH-NNLS) is not included
in these tests due to prohibitively slow performance. For example, to solve 2000×2000 and
3600×3600 systems, lsqnonneg required 2.5 and 25 hours, respectively, while TNT-NN required
only 2.5 and 14 minutes. The exclusion of LH-NNLS on the grounds of prohibitive performance
is consistent with other studies of this nature [13].
The convergence of the TNT-NN algorithm is analyzed by varying the constants that govern
the number of variables that are added to and removed from the active set using two tests: one
that varies the reduction constant (0.1 to 0.9 in increments of 0.1) and one that varies the
expansion constant (1.1 to 1.9 in increments of 0.1). Default values for the reduction constant
rc and the expansion constant ec are 0.5 and 1.5, respectively. These tests solve 1500×1500
systems with 50% active constraints and κ = 10⁹.
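The exact update rules live in the reference implementation [18]; purely as an illustrative assumption, rc and ec can be read as geometric factors on the batch of variables moved into and out of the active set in one inner pass:

    rc = 0.5;  ec = 1.5;   % default reduction and expansion constants
    batch = 64;            % hypothetical number of variables moved per pass
    batchReduced = max(1, floor(rc * batch));  % reduction: shrink the batch
    batchExpanded = ceil(ec * batch);          % expansion: enlarge the batch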
Using the convergence test results, we test the performance of TNT-NN for large n × n
matrices, where n runs from 10000 to 25000 in increments of 5000, using a high condition number
(κ = 10⁹) as well as 5% and 50% active constraints. Each of these tests is repeated five times,
and the arithmetic mean of the five execution times is used to report speedup over FNNLS.
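A hedged sketch of this timing harness is shown below; the reference fnnls.m [5] operates on the normal-equation products A'*A and A'*b, while the tntnn interface is an assumption based on the reference implementation [18]:

    A = randn(2000);  b = randn(2000, 1);  % one test system (placeholder)
    nReps = 5;
    tF = zeros(nReps, 1);  tT = zeros(nReps, 1);
    for k = 1:nReps
        tic;  xF = fnnls(A' * A, A' * b);  tF(k) = toc;  % reference FNNLS [5]
        tic;  xT = tntnn(A, b);            tT(k) = toc;  % TNT-NN [18]; interface assumed
    end
    speedup = mean(tF) / mean(tT);  % ratio of arithmetic means over five runs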
The final test compares the performance of FNNLS and TNT-NN when solving an extremely
large (by current standards) 45000 × 45000 system. This test performs a complete combination
of condition numbers (κ = 10² and 10⁹) and active non-negativity constraints (5% and 50%).
5 Discussion of results
5.1 Execution time
Speedup results for all tests are shown in Figure 1. These results show that TNT-NN out-
performs FNNLS for almost all of the tested matrices (speedup increasing with system size).
TNT-NN does not outperform FNNLS (speedup ≤ 1) under the following conditions: 1) the
system of equations is small (at least one dimension is on the order of 102 ) and 2) the system of
equations is very ill-conditioned and underdetermined. When the system of equations is small,
the time TNT-NN spends intelligently constructing the active set offsets the benefits provided
by its least squares solver, limiting TNT-NN to roughly the same level of performance as
FNNLS. A system of equations
that is underdetermined can have an infinite number of valid solutions, and ill-conditioning
imparts additional difficulty. TNT-NN then spends more time evaluating the solution space
to find a valid solution because the gradient does not always provide an accurate prediction.
This additional time spent evaluating the underdetermined solution space reduces the relative
performance of TNT-NN compared to FNNLS.
Figure 1: Speedup of TNT-NN over FNNLS for all tests (panels (a)–(c)).
The tests with high values of κ do not exhibit speedup growth curves that are as steep as
the curves for tests using smaller values of κ. High condition numbers make it difficult for
TNT-NN to accurately predict what variables should be constrained. This property leads to
damped performance, particularly in the over- and well-determined regions, where TNT-NN
would receive the most benefit from prediction.
Note that the relative performance of TNT-NN increases with the size of the system it is
solving. While the difference in performance for current problems might be considered small for
some of these tests (speedups of less than five in some instances), the large problems that will
need to be addressed in the near future will benefit from the enhanced performance of TNT-NN.
5.2 Error
Histograms of log10(||b − Ax||2) for all tests are shown in Figure 2. Two features are
notable: 1) there are minimal differences between the FNNLS and TNT-NN error distributions
for over- and well-determined tests, and 2) the error in the underdetermined region is typically
lower than the error in the over- and well-determined regions for FNNLS tests.
Figure 2: Histograms of solution error (as log10 (|error|)) for (A) over- and well-determined
tests and (B) underdetermined tests. The TNT-NN error is never greater than the FNNLS error
for over- and well-determined tests, and it is lower on average for underdetermined tests.
Feature 1 is not unexpected, as over- and well-determined systems have unique solutions;
differences between TNT-NN and FNNLS solutions are attributable to machine round-off error.
Feature 2 is likewise not unexpected as TNT-NN produces solutions with small norms by default.
When an infinite number of solutions exist, as in underdetermined systems, it is customary to
choose a solution that has a small norm since those solutions can also be described as solutions
with low energy states. This design decision allows TNT-NN to have a lower error distribution
than FNNLS in all but one percent of the underdetermined tests.
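A toy underdetermined example (ignoring the non-negativity constraint for clarity) makes the small-norm preference concrete; both vectors below satisfy the system exactly, but the minimum-norm solution corresponds to the low energy state described above:

    A = [1 2 3; 4 5 6];  b = [1; 2];  % 2 equations, 3 unknowns
    x_min = pinv(A) * b;              % minimum 2-norm solution
    x_alt = x_min + 0.5 * null(A);    % another exact solution with a larger norm
    norm(A * x_min - b)               % ~0: fits the data
    norm(A * x_alt - b)               % ~0: also fits the data
    [norm(x_min), norm(x_alt)]        % the pinv solution has the smaller norm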
5.3 Convergence
Figure 3: Total unconstrained TNT-NN iterations to convergence due to variations in (A) the
reduction constant (ec fixed at 1.5) and (B) the expansion constant (rc fixed at 0.5).
The reduction constant has the greatest effect on total inner loop iterations to convergence,
spanning more than 80 iterations (Figure 3A). The expansion constant has a smaller effect,
exhibiting a range of just over 5 total inner loop iterations to convergence (Figure
3B). Although these are simple tests and analyses, they indicate that an acceptable value for
the reduction constant lies in the range of [0.1, 0.5] and that the expansion constant does not
have much effect in the range [1.1, 1.9]. To reduce the total number of inner loop iterations to
convergence, reasonable initial values for the reduction and expansion constants would be 0.2
and 1.2, respectively.
Figure 4: Performance trends of speedup beyond the original 5000 × 5000 tests (dashed lines)
are shown for square systems from 10000 × 10000 to 25000 × 25000 in increments of 5000 (solid
line with circle markers), where κ = 10⁹, using (A) 5% and (B) 50% active constraints.
The speedups provided by using TNT-NN over FNNLS for n × n matrices beyond the initial
5000 × 5000 space, where κ = 10⁹ and the active (non-negativity) constraints are 5% and 50%, are
shown in Figure 4. These results show consistent positive trends in performance with the size
of the square matrices. It is important to note that this trend comes from sparsely sampled
data. A densely sampled (but more time intensive) test suite may produce results that reveal
more details of this behavioral trend.
When solving extremely large (45000 × 45000) systems, TNT-NN outperforms FNNLS
by at least 40× (Figure 5). When FNNLS was used to solve these large systems, the mean
execution time was 6.63 days. The mean TNT-NN execution time for the same tests was 2.45
hours (a speedup of 65×).
Figure 5: Execution times (A) and speedups (B) are shown for large (45000 × 45000) systems
where κ = 10² and κ = 10⁹. Non-negativity constraints of 5% and 50% are applied for both
condition numbers. For these large tests the TNT-NN algorithm always outperforms the FNNLS
algorithm, by more than 50× in three cases and with a maximum speedup of 157×.
Looking beyond the mean performance of these algorithms, the relative performance gain
of TNT-NN decreases as the condition number increases. Again, high condition numbers make
it difficult for the TNT-NN algorithm to accurately make predictions about the active set. It
is important to note that appropriate reduction and expansion constants are problem-specific;
better performance might be obtained by tuning these constants to the problem at hand.
Overall, these results represent a drastic shift in NNLS solver performance: from FNNLS,
which requires more than a work-week to solve such large systems, to TNT-NN, with which
multiple “same workday” results are possible. Stated slightly differently, for large systems of this
scale, TNT-NN could potentially accomplish in a single day what might require more than a
month to accomplish with FNNLS. These trends hold promise for the types of large science and
engineering problems that may be addressed using TNT-NN.
6 Conclusions
While these tests are limited to 5000 × 5000 systems, TNT-NN should provide additional
performance gains as problem sizes grow much larger in the near future. The error of
TNT-NN solutions is less than or equal to the FNNLS error in all but 1% of the tests. We ap-
ply the TNT-NN algorithm to the rock magnetism inversion problem [25] where it outperforms
FNNLS with a speedup of 70× and produces a smaller error for the small low-resolution sample
tested. We expect that this difference in performance will be more pronounced for studies using
high-resolution data sets of natural samples. Such studies would benefit from this enhanced
performance accordingly.
7 Future work
We plan to continue developing TNT-NN to incorporate regularization methods and techniques
for solving sparse systems. Additional work will examine methods for enhancing the
current active set strategy by providing it with higher quality “guesses” for the active set, par-
ticularly at high condition numbers. A more in-depth analysis of the interplay between the
reduction and expansion constants governing the active set is also warranted. At present, these
heuristic constants should be considered problem-specific tuning parameters. Alternative av-
enues for enhancing performance will also be investigated. Native implementations of the linear
algebra routines used in TNT-NN should be analyzed for potential performance enhancement.
We also plan to investigate the performance benefit of implementing TNT-NN using parallel
and accelerator hardware, such as GPU or Intel Xeon Phi technology.
Finally, we plan to apply TNT-NN to real-world problems that have, until now, required
prohibitively large systems; the rock magnetism inversion problem [25] is of particular interest.
8 Acknowledgments
This work was supported in part by National Science Foundation (NSF) Grants EAR-0941666
and CCF-1438286. Any opinions, findings, and conclusions or recommendations expressed in
this material are those of the authors and do not necessarily reflect the views of the NSF. We
also thank Bill Tuohy, Joshua Feinberg, Ioan Lascu, Ben Weiss, and Eduardo Lima for helpful
discussions, and the Institute for Rock Magnetism and the Minnesota Supercomputing Institute
for computational resources.
References
[1] F. Albarede and A. Provost. Petrological and geochemical mass-balance equations: an algorithm
for least-square fitting and general error analysis. Comput. Geosci., 3(2):309–326, 1977.
[2] Richard Aster, Brian Borchers, and Clifford Thurber. Parameter Estimation and Inverse Problems
(International Geophysics). Academic Press, 1st edition, January 2005.
[3] Åke Björck. Numerical Methods for Least Squares Problems. SIAM, Philadelphia, 1996.
[4] S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
[5] R. Bro and L. Shure. Reference FNNLS Matlab implementation, 1996.
http://www.ub.edu/mcr/als/fnnls.m.
[6] Rasmus Bro and Sijmen De Jong. A fast non-negativity-constrained least squares algorithm.
Journal of Chemometrics, 11(5):393–401, 1997.
[7] David P. Gallegos and Douglas M. Smith. A NMR technique for the analysis of pore structure:
Determination of continuous pore size distributions. Journal of Colloid and Interface Science,
122(1):143–153, 1988.
[8] J. Gattacceca, M. Boustie, E. Lima, B.P. Weiss, T. De Resseguier, and J.P. Cuq-Lelandais. Un-
raveling the simultaneous shock magnetization and demagnetization of rocks. Physics of the Earth
and Planetary Interiors, 182(1):42–49, 2010.
[9] Gene H. Golub and Charles F. Van Loan. Matrix Computations. The Johns Hopkins University
Press, 3rd edition, October 1996.
[10] Karen H. Haskell and Richard J. Hanson. An algorithm for linear least squares problems
with equality and nonnegativity constraints. Mathematical Programming, 21:98–118, 1981.
doi:10.1007/BF01584232.
[11] M.T. Heath. Numerical methods for large sparse linear least squares problems. SIAM J. Sci. Stat.
Comput., 5:497–513, 1984.
[12] E. Jones, T. Oliphant, P. Peterson, et al. SciPy: Open source scientific tools for Python, 2001–.
[13] Dongmin Kim, Suvrit Sra, and Inderjit S Dhillon. Tackling box-constrained optimization via a
new projected quasi-newton approach. SIAM J. Sci. Comput, 32(6):3548–3563, 2010.
[14] C. L. Lawson and R. J. Hanson. Solving Least Squares Problems. SIAM, 2nd edition, 1995.
[15] E.A. Lima, B.P. Weiss, L. Baratchart, D.P. Hardin, and E.B. Saff. Fast inversion of magnetic
field maps of unidirectional planar geological magnetization. J. of Geophys. Res. B: Solid Earth,
118(6):2723–2752, 2013.
[16] Yuancheng Luo and Ramani Duraiswami. Efficient parallel nonnegative least squares on multicore
architectures. SIAM J. Sci. Comput., 33(5):2848–2863, October 2011.
[17] Katherine M. Mullen and Ivo H.M. van Stokkum. The Lawson-Hanson algorithm for non-negative
least squares (NNLS). Technical report, 2012.
[18] J.M. Myre, E. Frahm, D.J. Lilja, and M.O. Saar. TNT-NN reference implementation:
http://dx.doi.org/10.5281/zenodo.438158, 2017.
[19] J.M. Myre, E. Frahm, D.J. Lilja, and M.O. Saar. TNT: A preconditioned method for applying
conjugate gradient to dense least-squares problems. In Prep.
[20] R. L. Parker. Geophysical Inverse Theory. Princeton University Press, 1994.
[21] M.J. Reid, A.J. Gancarz, and A.L. Albee. Constrained least-squares analysis of petrologic problems
with an application to lunar sample 12040. Earth and Planet. Sci. Lett., 17(2):433–445, 1973.
[22] Yousef Saad. Iterative Methods for Sparse Linear Systems. SIAM, 2nd edition, 2003.
[23] Lloyd N. Trefethen and David Bau III. Numerical Linear Algebra, volume 50. SIAM, 1997.
[24] Mark H. Van Benthem and Michael R. Keenan. Fast algorithm for the solution of large-scale non-
negativity-constrained least squares problems. Journal of Chemometrics, 18(10):441–450, 2004.
[25] B. P. Weiss, E. A. Lima, L. E. Fong, and F. J. Baudenbacher. Paleomagnetic analysis using SQUID
microscopy. J. Geophys. Res., 112:B09105, 2007.
[26] B.P. Weiss, E.A. Lima, L.E. Fong, and F.J. Baudenbacher. Paleointensity of the Earth's magnetic
field using SQUID microscopy. Earth Planet. Sci. Lett., 264(1):61–71, 2007.
[27] Zonghou Xiong and Andreas Kirsch. Three-dimensional earth conductivity inversion. Journal of
Computational and Applied Mathematics, 42(1):109–121, 1992.
[28] Y. Yagi. Source rupture process of the 2003 Tokachi-oki earthquake determined by joint inversion
of teleseismic body wave and strong ground motion data. Earth, Planets Space, 56:311–316, March
2004.