
Neural Networks, Vol. 7, No. 4, pp. 629-641, 1994
Copyright © 1994 Elsevier Science Ltd
Printed in the USA. All rights reserved
0893-6080/94 $6.00 + .00
0893-6080(93)E0021-X

CONTRIBUTED ARTICLE

A Deterministic Annealing Neural Network
for Convex Programming

JUN WANG
University of North Dakota

(Received 19 October 1992; revised and accepted 15 November 1993)

Abstract--A recurrent neural network, called a deterministic annealing neural network, is proposed for solving convex programming problems. The proposed deterministic annealing neural network is shown to be capable of generating optimal solutions to convex programming problems. The conditions for asymptotic stability, solution feasibility, and solution optimality are derived. The design methodology for determining design parameters is discussed. Three detailed illustrative examples are also presented to demonstrate the functional and operational characteristics of the deterministic annealing neural network in solving linear and quadratic programs.

Keywords--Recurrent neural network, Convex programming, Convergence analysis.

1. INTRODUCTION

Convex programming deals with the problem of minimizing a convex function over a convex set of decision variables. Convex programming problems represent a class of widespread optimization problems arising in diverse design and planning contexts. Many large-scale and real-time applications, such as traffic routing and image restoration, require solving large-scale convex programming problems in real time. In such applications, because of the large number of variables and stringent time requirements, existing algorithmic solution methods are usually not efficient. In recent years, neural networks have been developed as viable alternatives for solving various optimization problems. Although the mainstream of the neural network approach to optimization has focused on nonconvex combinatorial optimization problems, a number of neural network paradigms have been developed for solving convex programming problems. For example, Tank and Hopfield (1986) demonstrated the potential of the Hopfield network for solving linear programming problems. Kennedy and Chua (1988a,b) analyzed the Tank-Hopfield linear programming circuit and the Chua-Lin nonlinear programming circuit from a circuit-theoretic viewpoint and reconciled a modified canonical nonlinear programming circuit that can also be applied to linear programming. Wang (1991) and Wang and Chankong (1992) developed a class of recurrent neural networks for optimization. Maa and Shanblatt (1992) presented an analysis of the stability properties of neural networks for linear and quadratic programming. Although the results of these investigations have demonstrated great potential, most of the proposed neural network models are not guaranteed to yield optimal solutions. Although the capability of the recurrent neural networks for solving convex programming problems (Wang, 1991; Wang & Chankong, 1992) is substantiated theoretically, the neural network models are not very easy to implement in hardware. Further investigations are deemed necessary.

In this paper, a deterministic annealing neural network for solving convex programming problems is proposed. The proposed deterministic annealing neural network is convergent and capable of solving convex programming problems. Compared with the recurrent neural networks (Wang, 1991; Wang & Chankong, 1992), the proposed deterministic annealing neural network is less restrictive in conditions for solution optimality, much simpler in configuration, and hence much easier for hardware implementation. Starting with problem reformulation, the configuration of the proposed deterministic annealing neural network is described. Then its convergence properties and the design principles are discussed. Finally, simulation results are given.

Acknowledgements: The author thanks the anonymous reviewers for the insightful comments and constructive suggestions on an earlier version of this paper. This work was partially supported by a summer research professorship award from the Graduate School of the University of North Dakota and by a research grant from the Advancing Science Excellence in North Dakota Program.

Requests for reprints should be sent to the author in the Department of Industrial Technology, University of North Dakota, Grand Forks, ND 58202-7118.

This paper is organized in eight sections. In Section 2, the reformulation of the convex program under study is delineated. In Section 3, the dynamics of the proposed deterministic annealing neural network are discussed. In Section 4, the architectures of the deterministic annealing neural network are described. In Section 5, the analytical results on the convergence properties of the proposed deterministic annealing neural network for convex programming are presented. In Section 6, design principles for selection of parameters in the deterministic annealing neural network are addressed. In Section 7, the performance of the proposed deterministic annealing neural network is demonstrated via illustrative examples. Finally, the paper is concluded in Section 8.

2. PROBLEM REFORMULATION

In this study, we consider a general constrained convex programming problem described as follows:

minimize   f(v),   (1)
subject to g(v) ≤ 0,   (2)
           h(v) = 0,   (3)

where v ∈ R^N is a column vector of decision variables, f: R^N → R is a convex objective function to be minimized, g: R^N → R^m is an m-tuple convex function used to define m inequality constraints, and h: R^N → R^n is an n-tuple linear function used to define n equality constraints. The inequality constraints (2) and equality constraints (3) characterize a convex feasible region in the N-dimensional Euclidean space. Let 𝒱 denote the convex feasible region and v* = [v_1*, v_2*, ..., v_N*]^T denote an optimal solution that minimizes f(v) over 𝒱, where the superscript T denotes the transpose operator; that is, 𝒱 = {v ∈ R^N | g(v) ≤ 0, h(v) = 0} and v* = arg min_{v∈𝒱} f(v). For an obvious reason, we assume that the feasible region 𝒱 is nonempty and that the objective function value f(v) is bounded from below over 𝒱. Furthermore, for operational reasons, we assume that f(v), g(v), and h(v) are differentiable with respect to v over R^N.

Two classes of important convex programming problems are the linear program (LP) and the quadratic program (QP). In the standard form of a linear program, f(v) = c^T v, g(v) = -v, and h(v) = Av - b, where c is an N-dimensional column vector, A is an n × N matrix, and b is an n-dimensional column vector. In a quadratic program, f(v) = v^T Q v / 2 + c^T v, g(v) = -v, and h(v) = Av - b, where Q is an N × N matrix, and A, b, and c are the same as those defined in an LP.

For a constrained optimization problem to be solved through neural computing, it is usually necessary to characterize the feasible set 𝒱 using a penalty function p(v) to penalize a constraint violation. A penalty function characterizes the feasible region of an optimization problem with a functional equality. The basic requirement for the penalty function is nonnegativity; precisely,

p(v) = 0 if v ∈ 𝒱, and p(v) > 0 otherwise.   (4)

Additional requirements for the penalty function p(v) are convexity and differentiability. If p(v) is convex and differentiable, then min_v p(v) = 0 and ∀v ∈ 𝒱, ∇p(v) = 0.

There are basically two ways to construct the penalty function based on the functional constraints: functional transformation and state augmentation.

(i) Functional transformation: Functional transformation involves a mapping of an inequality constraint or an equality constraint to an equality. Specifically,

p_g(v) = 0 if g(v) ≤ 0, and p_g(v) > 0 otherwise;   (5)
p_h(v) = 0 if h(v) = 0, and p_h(v) > 0 otherwise.   (6)

There are several logical choices of p_g(v) and p_h(v). For example, for g(v) ≤ 0, the integral of the Heaviside function can be used; for h(v) = 0, ||h(v)||_2^2 can be used, where || · ||_2 is the Euclidean norm. Namely,

p_g(v) = Σ_{i=1}^{m} ∫_0^{g_i(v)} x H(x) dx,   (7)

p_h(v) = ½ Σ_{j=1}^{n} h_j(v)^2,   (8)

where H(·) is the Heaviside function defined as follows:

H(x) = 0 if x ≤ 0, and H(x) = 1 if x > 0.   (9)

Because both g(v) and h(v) are differentiable, p_g(v) and p_h(v) are differentiable with respect to v. Based on the convexity of g(v) and h(v), the convexity of p_g(v) and p_h(v) can be proven (cf. the proof in the Appendix).

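
To make the functional-transformation penalties concrete, the following minimal Python sketch (not part of the original paper) evaluates p_g and p_h of eqns (7) and (8) together with their gradients; the constraint data A, b and the test point below are arbitrary illustrative assumptions.

import numpy as np

# Minimal sketch of the functional-transformation penalties of eqns (7)-(8):
#   p_g(v) = sum_i integral_0^{g_i(v)} x H(x) dx = (1/2) sum_i max(0, g_i(v))^2,
#   p_h(v) = (1/2) sum_j h_j(v)^2.
# The example data (A, b, and the test point v) are hypothetical.

def p_g(g_val):
    """Penalty for the inequality constraints g(v) <= 0."""
    return 0.5 * np.sum(np.maximum(0.0, g_val) ** 2)

def grad_p_g(g_val, g_jac):
    """Gradient of p_g: sum_i H[g_i(v)] g_i(v) grad g_i(v)."""
    return g_jac.T @ np.maximum(0.0, g_val)

def p_h(h_val):
    """Penalty for the equality constraints h(v) = 0."""
    return 0.5 * np.dot(h_val, h_val)

def grad_p_h(h_val, h_jac):
    """Gradient of p_h: sum_j h_j(v) grad h_j(v)."""
    return h_jac.T @ h_val

# Example with g(v) = -v (nonnegativity) and h(v) = A v - b:
A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([3.0, 5.0])
v = np.array([0.5, -0.2])

g_val, g_jac = -v, -np.eye(2)
h_val, h_jac = A @ v - b, A
print("p(v)      =", p_g(g_val) + p_h(h_val))
print("grad p(v) =", grad_p_g(g_val, g_jac) + grad_p_h(h_val, h_jac))

The sum p_g(v) + p_h(v) printed at the end is exactly the combined penalty of eqn (15) introduced below.
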
(ii) State augmentation: State augmentation is an approach through converting the inequality constraints to equality ones by adding slack or surplus variables. First, the inequality constraint, g(v) ≤ 0, can be converted to an equality by adding m slack variables; for example,

g(v) + v^+ = 0,   (10)

where v^+ ∈ R^m and v^+ ≥ 0. For example, in an LP problem, letting g(v) = Av - b, then g(v) + v^+ = Av + v^+ - b = 0. If g(v) = r(v) - s, where r(v) ≥ 0 and s ∈ R^m, then a surplus variable vector v^- ∈ R^m can also be applied as follows:

r(v) - s ∘ v^- = 0,   (11)

where 0 ≤ v_i^- ≤ 1 (i = 1, 2, ..., m) and ∘ denotes the operator for array multiplication (element-by-element product). Once all the inequality constraints are converted to equality constraints, a penalty function can be constructed in a way similar to that of eqn (8); that is,

p_g^(1)(v) = ½ [g(v) + v^+]^T [g(v) + v^+],   (12)

p_g^(2)(v) = ½ [r(v) - s ∘ v^-]^T [r(v) - s ∘ v^-],   (13)

p_h(v) = ½ h(v)^T h(v).   (14)

Obviously, p_g(v) and p_h(v) are also differentiable and convex with respect to v.

The penalty function can be constructed based on p_g(v) and p_h(v); for example,

p(v) = p_g(v) + p_h(v).   (15)

Because p_g(v) and p_h(v) are convex and differentiable, p(v) is also convex and differentiable. The convex programming problem described in eqns (1), (2), and (3) thus can be reformulated as the following equivalent convex programming problem:

minimize   f(v),   (16)
subject to p(v) = 0.   (17)

3. NETWORK DYNAMICS

The dynamical equations of the deterministic annealing neural network can be described as follows:

du(t)/dt = -T(t) ∇f[v(t)] - ∇p[v(t)],   (18)

v(t) = F[u(t)],   (19)

where u(t) ∈ R^N is a column vector of instantaneous net inputs to neurons, v(t) ∈ V ⊂ R^N is a column vector of activation states corresponding to the decision variables, V is the activation state space, u(0) and v(0) are unspecified, T(t) is a time-varying temperature parameter, F: u → v is an N-tuple activation function, and ∇ is the gradient operator.

Because the temperature parameter is a function of time, the proposed deterministic annealing neural network is virtually a time-varying dynamic system. Equation (18) defines the neural network dynamics, in which the first term of the right-hand side enforces the minimization of the objective function and the second term enforces penalization of violation of constraint (17). Equation (19) is the activation function that defines the sensitivity and range of the activation state v(t).

The major difference between the existing recurrent neural networks for optimization and the proposed deterministic annealing neural network lies essentially in the treatment of their penalization procedures for violation of constraints. In the Hopfield network for optimization, the penalization is realized by constant penalty parameters (Hopfield & Tank, 1985; Tagliarini, Christ, & Page, 1991). As evidenced by many empirical studies, the steady state of the Hopfield network may not represent a feasible solution or an optimal solution to the optimization problem. In the recurrent neural networks proposed earlier (Wang, 1991; Wang & Chankong, 1992), the penalization is realized by a monotonically increasing penalty parameter. Because the penalty parameter is supposed to be unbounded above, it is not easy to implement physically. In the proposed deterministic annealing neural network, the penalization is realized by using a time-varying temperature parameter. The role of the time-varying temperature parameter will be more apparent in Section 5.

4. NETWORK ARCHITECTURE

To solve an optimization problem via neural computation, the key is to cast the problem into the architecture of a recurrent neural network for which the stable state represents the solution to the optimization problem. As shown in eqn (18), the architecture of the deterministic annealing neural network depends on specific forms of f(v) and p(v) [i.e., f(v), g(v), h(v), and the method for constructing p(v)].

If the functional transformation method [e.g., eqns (7) and (8)] is used to construct a penalty function, then

∇p_g(v) = Σ_{i=1}^{m} H[g_i(v)] g_i(v) ∇g_i(v),   (20)

∇p_h(v) = Σ_{j=1}^{n} h_j(v) ∇h_j(v).   (21)

Obviously, ∀v ∈ 𝒱, ∇p_g(v) = ∇p_h(v) = 0. Furthermore, because x H(x) is continuous, ∇p_g(v) is also continuous.

If the state augmentation method is used to construct a penalty function based on g(v), then the dimension of the activation-state vector increases from N to N + m; so does the size of the neural network architecture. According to eqns (12) and (13),

∇p_g^(1)(v^+) = g(v) + v^+,   (22)

∇p_g^(2)(v^-) = -s ∘ [r(v) - s ∘ v^-].   (23)

In view of the fact that f(v) has nothing to do with the augmented-state variables, eqns (22) and (23) imply that there is no connection between the neurons corresponding to the augmented-state variables v^+(t) or v^-(t) in a neural network architecture.

In the standard form of LP, for example, f(v) = c^T v, g(v) = -v, and h(v) = Av - b. Because the nonnegativity constraint, v ≥ 0, can be realized by using a nonnegative activation function F[u(t)] ≥ 0, only h(v) needs to be used in the penalty function. Let p(v) = p_h(v) = h(v)^T h(v)/2 = (Av - b)^T (Av - b)/2; thus ∇p(v) = ∇p_h(v) = A^T A v - A^T b. Equation (18) becomes

du(t)/dt = -A^T A v(t) + A^T b - T(t) c.   (24)

In the standard form of QP, f(v) = v^T Q v/2 + c^T v, g(v) = -v, and h(v) = Av - b. Letting Q be symmetric and positive semidefinite, similar to the case of LP, eqn (18) becomes

du(t)/dt = -[A^T A + T(t) Q] v(t) + A^T b - T(t) c.   (25)

There are two ways to map the dynamical equations, eqn (24) or (25), to a neural network architecture. The first method results in a single-layer recurrent neural network. The single-layer architecture of the deterministic annealing neural network for convex programming consists of N laterally connected neurons, as shown in Figure 1. Each neuron represents one decision variable. The lateral connection weight matrix is defined as -A^T A for an LP and -A^T A - T(t) Q for a QP. The biasing threshold vector of the neurons is defined as A^T b - T(t) c for both LP and QP. In other words, the connection weight w_ij from neuron j to neuron i is defined as -Σ_{k=1}^{n} a_ki a_kj for an LP and -Σ_{k=1}^{n} a_ki a_kj - T(t) q_ij for a QP; the biasing threshold θ_i of neuron i is defined as Σ_{k=1}^{n} a_ki b_k - T(t) c_i, where a_ij and q_ij are the elements in the ith row and the jth column of A and Q, respectively, and b_i and c_i are the ith elements of b and c, respectively. Because Q is symmetric, the connection weight matrices are symmetric for both LP and QP (i.e., ∀i, j, w_ij = w_ji). If Q is positive semidefinite and T(t) ≥ 0, then the eigenvalues of the weight matrices are always real and nonpositive for both LP and QP.

FIGURE 1. Single-layer architecture of the deterministic annealing neural network.

Letting e(t) = Av(t) - b, eqns (24) and (25) can be decomposed, respectively, as

du(t)/dt = -A^T e(t) - T(t) c,   (26)

du(t)/dt = -A^T e(t) - T(t) Q v(t) - T(t) c.   (27)

The second approach for mapping the dynamical equations to a neural network architecture results in a two-layer architecture composed of an output layer and an input layer, as shown in Figure 2. The output layer consists of N neurons and the input layer consists of n neurons. The state vectors of the input and output layers are denoted by e(t) and v(t), respectively. The output state vector v(t) represents the decision variable vector v and the input state vector e(t) represents the error vector that measures the solution feasibility of the output-state vector. The connection weight matrix from the input layer to the output layer is defined as -A^T and the weight matrix from the output layer to the input layer is A. In the LP case, there is no lateral connection among input and output neurons. In the QP case, there is no lateral connection among input neurons, and the connection weight matrix among output neurons is -T(t) Q. The bias vectors are -b in the input layer and -T(t) c in the output layer, respectively, for both the LP and QP neural networks. The advantage of the second configuration is that the connection weight matrices and bias vectors can be determined directly from the coefficient matrices and vectors without involving matrix multiplication.

FIGURE 2. Two-layer architecture of the deterministic annealing neural network.

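
To make the single-layer mapping concrete, the short sketch below assembles the lateral connection weight matrix and biasing threshold vector described above and evaluates the right-hand side of eqn (25); the coefficient matrices are assumed illustrative data, not taken from the paper's examples, and setting Q to zero recovers the LP case of eqn (24).

import numpy as np

def single_layer_rhs(v, t, A, b, c, Q=None, T=lambda t: np.exp(-t)):
    """Right-hand side of eqn (25): du/dt = -[A^T A + T(t) Q] v + A^T b - T(t) c.
    With Q = None (treated as zero) this reduces to the LP dynamics of eqn (24)."""
    n, N = A.shape
    Q = np.zeros((N, N)) if Q is None else Q
    W = -(A.T @ A + T(t) * Q)          # lateral connection weight matrix
    theta = A.T @ b - T(t) * c         # biasing threshold vector
    return W @ v + theta

# Assumed (illustrative) data: n = 2 equality constraints, N = 3 variables.
A = np.array([[1.0, 2.0, 1.0],
              [0.0, 1.0, 3.0]])
b = np.array([4.0, 5.0])
c = np.array([1.0, 1.0, 1.0])
Q = np.diag([2.0, 2.0, 2.0])           # symmetric positive semidefinite

v = np.zeros(3)
print("du/dt at t = 0:", single_layer_rhs(v, t=0.0, A=A, b=b, c=c, Q=Q))

The same coefficients could equally be wired in the two-layer form of eqns (26) and (27), in which the input layer computes the error vector e(t) = Av(t) - b explicitly.
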
5. CONVERGENCE ANALYSIS

In this section, analytical results on the stability of the proposed deterministic annealing neural network and the feasibility and optimality of the steady-state solutions to convex programming problems are presented.

5.1. Asymptotic Stability

THEOREM 1. Assume that the Jacobian matrix of F[u(t)] exists and is positive semidefinite. If the temperature parameter T(t) is nonnegative, strictly monotone decreasing for t ≥ 0, and approaches zero as time approaches infinity, then the deterministic annealing neural network is asymptotically stable in the large; that is, ∀t ≥ 0, J{F[u(t)]} is p.s.d., T(t) ≥ 0, dT(t)/dt < 0, and lim_{t→∞} T(t) = 0 imply ∀v(0) ∈ V, ∃v̄ ∈ V such that lim_{t→∞} v(t) = v̄, where J{F[u(t)]} is the Jacobian of F[u(t)] and v̄ is a steady state of v(t).

Proof. Because f(v) is bounded from below, there exists M such that -∞ < M < inf_{v∈V} f(v). Let

E[v(t), T(t)] = T(t){f[v(t)] - M} + p[v(t)].   (28)

Because ∀t ≥ 0, T(t) ≥ 0, f[v(t)] - M > 0, and p[v(t)] ≥ 0, ∀v ∈ V, E[v(t), T(t)] ≥ 0. Furthermore, because lim_{t→∞} T(t) = 0 and min_v p(v) = 0, min_{v,T} E[v, T] = 0. To facilitate the stability analysis, let us consider T(t) as a dummy-state variable for the time being; that is, let v̂(t)^T = [v(t)^T, T(t)] be the augmented-state vector. In view of the fact that f(v) and p(v) are convex and T(t) ∈ [0, ∞), E → ∞ as ||v|| → ∞; that is, E is radially unbounded. Because -du(t)/dt = T(t)∇f[v(t)] + ∇p[v(t)] and v(t) = F[u(t)] according to eqns (18) and (19), dv(t)/dt = J{F[u(t)]} du(t)/dt and

dE[v(t), T(t)]/dt
  = {f[v(t)] - M} dT(t)/dt + T(t) df[v(t)]/dt + dp[v(t)]/dt
  = {f[v(t)] - M} dT(t)/dt + {T(t)∇f[v(t)] + ∇p[v(t)]}^T dv(t)/dt
  = {f[v(t)] - M} dT(t)/dt - (du(t)/dt)^T dv(t)/dt
  = {f[v(t)] - M} dT(t)/dt - (du(t)/dt)^T J{F[u(t)]} (du(t)/dt).

Because J{F[u(t)]} is positive semidefinite, (du(t)/dt)^T J{F[u(t)]} (du(t)/dt) ≥ 0. Furthermore, because f[v(t)] - M > 0 and dT(t)/dt < 0, dE[v(t), T(t)]/dt < 0. Therefore, E[v(t), T(t)] is positive definite and radially unbounded, and dE[v(t), T(t)]/dt is negative definite. According to Lyapunov's theorem, the deterministic annealing neural network is asymptotically stable in the large. ∎

5.2. Solution Feasibility

For the deterministic annealing neural network to realize a solution procedure for a convex programming problem, its steady state must represent, at least, a feasible solution to the problem. Toward this end, the following theorem is derived.

THEOREM 2. Assume that J{F[u(t)]} exists and is positive semidefinite. If T(t) ≥ 0, dT(t)/dt < 0, and lim_{t→∞} T(t) = 0, then the steady state of the deterministic annealing neural network represents a feasible solution to the convex programming problem described in eqns (16) and (17); that is, v̄ ∈ 𝒱.

Proof. The proof of Theorem 1 shows that the energy function E[v(t), T(t)] is positive definite and strictly monotone decreasing with respect to time t, which implies lim_{t→∞} E[v(t), T(t)] = 0. Because lim_{t→∞} T(t) = 0, lim_{t→∞} E[v(t), T(t)] = lim_{t→∞} T(t){f[v(t)] - M} + p[v(t)] = lim_{t→∞} p[v(t)]. Therefore, lim_{t→∞} E[v(t), T(t)] = lim_{t→∞} p[v(t)] = 0. Because p(v) is continuous, lim_{t→∞} p[v(t)] = p[lim_{t→∞} v(t)] = p(v̄) = 0. ∎

REMARK. Theorems 1 and 2 show that the asymptotic stability of the deterministic annealing neural network implies the solution feasibility of the steady state.

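
A quick numerical illustration of the Lyapunov argument behind Theorems 1 and 2 is to monitor the energy function of eqn (28) along a simulated trajectory and check that it decreases toward zero. The toy instance, the lower bound M, and the step size below are assumptions made only for this sketch.

import numpy as np

# Toy instance (assumed): f(v) = v1 + v2, p(v) = (1/2)(v1 + 2 v2 - 2)^2,
# linear activation F(u) = u, so J{F} is the identity and v = u.
f      = lambda v: v[0] + v[1]
grad_f = lambda v: np.array([1.0, 1.0])
p      = lambda v: 0.5 * (v[0] + 2.0 * v[1] - 2.0) ** 2
grad_p = lambda v: (v[0] + 2.0 * v[1] - 2.0) * np.array([1.0, 2.0])

T = lambda t: np.exp(-2.0 * t)   # nonnegative, strictly decreasing, T(t) -> 0
M = -10.0                        # a lower bound on f over the region explored here

dt, u = 1e-3, np.array([1.0, 1.0])
E_trace = []
for k in range(5000):
    t = k * dt
    v = u
    E_trace.append(T(t) * (f(v) - M) + p(v))     # energy function of eqn (28)
    u = u + dt * (-T(t) * grad_f(v) - grad_p(v))

monotone = all(b <= a + 1e-9 for a, b in zip(E_trace, E_trace[1:]))
print("E decreased monotonically:", monotone)
print("final E =", round(float(E_trace[-1]), 6), " final p =", round(float(p(u)), 9))

The final penalty value being (numerically) zero mirrors Theorem 2: as T(t) vanishes, the residual energy is exactly p[v(t)], so the steady state is feasible.
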
5.3. Solution Optimality

Theorems 1 and 2 ensure the asymptotic stability and solution feasibility. The optimality of the steady-state solutions, however, is not guaranteed and deserves further exploration.

There are two possibilities for the locality of the optimal solution v*: an interior point or a boundary point of the feasible region 𝒱. If v* is an interior point of 𝒱, then v* is the same as the global minimum of the objective function f(v); hence ∇f(v*) = ∇p(v*) = 0, because both f(v) and p(v) are convex and differentiable. In this case, according to eqn (18), a steady state of the deterministic annealing neural network is an optimal solution if the temperature parameter T(t) sustains sufficiently long. The solution process for this case is trivial. In most convex programming problems, such as linear programs, v* is a boundary point of 𝒱. In this case, ∀v ∈ 𝒱, ∇f(v) ≠ 0. Theorem 3 provides a sufficient condition for the deterministic annealing neural network to solve convex programming problems in which the optimal solutions are boundary points.

THEOREM 3. Assume that ∀t ≥ 0, J{F[u(t)]} ≠ 0 and is positive semidefinite, and ∇f[v(t)] ≠ 0. If dT(t)/dt < 0, lim_{t→∞} T(t) = 0, and

T(t) ≥ max{ 0, [∇p[v(t)]^T J{F[u(t)]} ∇p[v(t)] - ∇f[v(t)]^T J{F[u(t)]} ∇p[v(t)]] / [∇f[v(t)]^T J{F[u(t)]} ∇f[v(t)] - ∇p[v(t)]^T J{F[u(t)]} ∇f[v(t)]] },   (29)

then the steady state of the deterministic annealing neural network represents an optimal solution to the convex programming problem described in eqns (16) and (17); that is, v̄ = arg min_{v∈𝒱} f(v).

Proof. According to Theorems 1 and 2, the steady state of the deterministic annealing neural network represents a feasible solution to the convex programming problem; that is, lim_{t→∞} v(t) = v̄ ∈ 𝒱 and p(v̄) = 0. Because E[v(t), T(t)] = T(t){f[v(t)] - M} + p[v(t)] and ∇E[v(t), T(t)] = T(t)∇f[v(t)] + ∇p[v(t)], du(t)/dt = -∇E[v(t), T(t)], where ∇E[v(t), T(t)] = {∂E[v(t), T(t)]/∂v_1, ∂E[v(t), T(t)]/∂v_2, ..., ∂E[v(t), T(t)]/∂v_N}^T is the gradient of E[v(t), T(t)] with respect to v. Moreover, because v(t) = F[u(t)] and du(t)/dt = -∇E[v(t), T(t)], dv(t)/dt = J{F[u(t)]} du(t)/dt = -J{F[u(t)]} ∇E[v(t), T(t)]. In view of the fact that J{F[u(t)]} ≠ 0 and is positive semidefinite, the state v(t) of the deterministic annealing neural network approaches a local minimum of the energy function in a gradient descent manner. Furthermore, because T(t) ≥ 0 and both f(v) and p(v) are convex, E[v(t), T(t)] is also convex. The convexity property of E[v(t), T(t)] guarantees that a local minimum of E is always a global one. Because of the presence of T(t) in E[v(t), T(t)], the global minimum is time-varying. Equation (29) implies

T(t){∇f[v(t)]^T J{F[u(t)]} ∇f[v(t)] - ∇p[v(t)]^T J{F[u(t)]} ∇f[v(t)]}
  ≥ ∇p[v(t)]^T J{F[u(t)]} ∇p[v(t)] - ∇f[v(t)]^T J{F[u(t)]} ∇p[v(t)].

Rearranging the above inequality, we have

T(t)∇f[v(t)]^T J{F[u(t)]} ∇f[v(t)] + ∇f[v(t)]^T J{F[u(t)]} ∇p[v(t)]
  - {T(t)∇p[v(t)]^T J{F[u(t)]} ∇f[v(t)] + ∇p[v(t)]^T J{F[u(t)]} ∇p[v(t)]} ≥ 0;

that is,

∇f[v(t)]^T J{F[u(t)]} {T(t)∇f[v(t)] + ∇p[v(t)]} - ∇p[v(t)]^T J{F[u(t)]} {T(t)∇f[v(t)] + ∇p[v(t)]} ≥ 0.

Substituting eqn (18) and dv(t)/dt = J{F[u(t)]} du(t)/dt into the above inequality, we have

-∇f[v(t)]^T J{F[u(t)]} du(t)/dt + ∇p[v(t)]^T J{F[u(t)]} du(t)/dt ≥ 0;

that is,

∇f[v(t)]^T dv(t)/dt - ∇p[v(t)]^T dv(t)/dt ≤ 0.

Noting that ∇f[v(t)]^T dv(t)/dt = df[v(t)]/dt and ∇p[v(t)]^T dv(t)/dt = dp[v(t)]/dt, eqn (29) implies df[v(t)]/dt ≤ dp[v(t)]/dt. Furthermore, df[v(t)]/dt ≤ dp[v(t)]/dt implies f[v(t'')] - f[v(t')] ≤ p[v(t'')] - p[v(t')] for any t' ≤ t''. Let t* be the time associated with an optimal solution v*. We have f[v(∞)] - f[v(t*)] ≤ p[v(∞)] - p[v(t*)]; that is, f(v̄) - f(v*) ≤ p(v̄) - p(v*). Because p(v̄) = p(v*) = 0, f(v̄) ≤ f(v*). Also, because v̄ ∈ 𝒱 and v* = arg min_{v∈𝒱} f(v), f(v̄) ≥ f(v*) by definition of v*. Consequently, f(v̄) = f(v*) = min_{v∈𝒱} f(v). ∎

REMARK. Theorem 3 gives a lower bound on the temperature parameter T(t). Note that T(t) ≥ 0, dT(t)/dt < 0, and lim_{t→∞} T(t) = 0 imply T(t) ≠ 0. Because lim_{t→∞} p[v(t)] = p(v̄) = min_v p(v) = 0, p(v) is convex, and ∇p[v(t)] is continuous, ∇p(v̄) = lim_{t→∞} ∇p[v(t)] = 0. Hence

lim_{t→∞} [∇p[v(t)]^T J{F[u(t)]} ∇p[v(t)] - ∇f[v(t)]^T J{F[u(t)]} ∇p[v(t)]] / [∇f[v(t)]^T J{F[u(t)]} ∇f[v(t)] - ∇p[v(t)]^T J{F[u(t)]} ∇f[v(t)]] = 0.   (30)

In addition, Theorem 3 relaxes the infeasibility requirement for initial states of the recurrent neural networks proposed earlier (Wang, 1991; Wang & Chankong, 1992).

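
The sufficient condition of Theorem 3 can be monitored numerically along a trajectory. The helper below evaluates the right-hand side of eqn (29) at a given state under the simplifying assumption that J{F[u(t)]} is symmetric (so the two cross terms coincide); the Jacobian and gradient values are arbitrary illustrative numbers, and returning 0 when the denominator is nonpositive is a convention adopted here for the sketch rather than something prescribed by the theorem.

import numpy as np

def temperature_lower_bound(grad_f_v, grad_p_v, J):
    """Right-hand side of eqn (29) for symmetric positive semidefinite J:
    max{0, (gp'J gp - gf'J gp) / (gf'J gf - gp'J gf)}."""
    gf = np.asarray(grad_f_v, dtype=float)
    gp = np.asarray(grad_p_v, dtype=float)
    num = gp @ J @ gp - gf @ J @ gp
    den = gf @ J @ gf - gp @ J @ gf
    if den <= 0.0:
        return 0.0          # sketch-level convention; the bound is not informative here
    return max(0.0, num / den)

# Illustrative values (not from the paper): a diagonal Jacobian of an uncoupled
# activation function, and gradients of f and p at some state v(t).
J = np.diag([0.25, 0.25, 0.25])
grad_f_v = np.array([1.0, 2.0, 0.5])
grad_p_v = np.array([0.3, -0.8, 0.1])
print("T(t) should satisfy T(t) >=", round(temperature_lower_bound(grad_f_v, grad_p_v, J), 4))
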
6. DESIGN METHODOLOGY

The convergence analysis in the preceding section reveals the desirable properties of the deterministic annealing neural network for convex programming. In this section, design principles are discussed, based on the results of the convergence analysis above, for determining the activation function and temperature parameter of the deterministic annealing neural network.

6.1. Activation Function

In light of Theorems 1, 2, and 3, the activation function F should be differentiable and its Jacobian should be nonzero and positive semidefinite at any time. The existing activation functions are all uncoupled; that is, v_i(t) = F_i[u_i(t)] for all i. In this case, J{F[u(t)]} = diag{dF_i[u_i(t)]/du_i | i = 1, 2, ..., N} and the nonzero and positive semidefinite requirements on J{F[u(t)]} can be replaced by strictly monotone increasing F_i for all i. The activation function F determines the domain of the state variables v(t) (i.e., the state space V) and depends on the feasible region 𝒱. Specifically, it is necessary that 𝒱 ⊆ V unless prior knowledge on the locality of v* is available. Any explicit bound on the decision variable vector v can be realized by properly selecting the range of F. Figure 1 of Wang (1991) provides a set of potential choices of activation functions for different domains of state variables.

In the typical sigmoid activation function, F_i[u_i(t)] = v_max/[1 + exp(-ξ u_i(t))], where v_max > 0. To ensure increasing monotonicity, parameter ξ must be positive. Moreover, parameter ξ can be used to control the slope of the sigmoid activation function, as shown in Figure 3a. Because the activation sensitivity increases as ξ increases, a large value of parameter ξ is usually selected to expedite the convergence of the deterministic annealing neural network. The range of the sigmoid activation function is defined as [0, v_max], because the infimum and supremum of the activation state are 0 and v_max, respectively. The state space of the deterministic annealing neural network is then defined as a hypercube V = [0, v_max]^N, where v_max depends on the feasible region of decision variables. Furthermore, v_max is also a factor for the sensitivity of the sigmoid activation function, as illustrated in Figure 3b: the slope of the sigmoid activation function F_i[u_i(t)] increases as v_max increases, and the upper bound v_max need not be tight unless an explicit upper bound on v exists. A large upper bound could increase the sensitivity of activation and hence increase the convergence rate as long as the hypercube state space contains the feasible region. Therefore, the rule for determining v_max is to set it to a maximum such that 𝒱 ⊆ V. If there is no explicit bound on the decision variable vector v, then a linear activation function, F[u(t)] = ξ u(t), can be used, where ξ > 0 is also a scaling constant. In this case, V = R^N.

FIGURE 3. Effects of ξ and v_max on the sigmoid activation function.

6.2. Temperature Parameter

Because T(t) represents the annealing effect, the determination of the cooling temperature parameter T(t) is very important for solution optimality. Theorems 1 and 2 provide the necessary conditions for selecting the temperature parameter by requiring the nonnegativity and decreasing monotonicity of T(t) with respect to time. The following polynomial, exponential, and logarithmic functions of t are three simple examples of open-loop T(t):

T(t) = β (1 + t)^{-η},   (31)

T(t) = β α^{-η t},   (32)

T(t) = β [log_α(t + α)]^{-η},   (33)

where α > 1, β > 0, and η > 0 are constant parameters. Parameters β and η can be used to scale the temperature parameter.

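
The three open-loop schedules of eqns (31)-(33) are easy to tabulate; the parameter values in the sketch below are arbitrary illustrations chosen only to satisfy the stated requirements α > 1, β > 0, η > 0.

import numpy as np

beta, eta, alpha = 1.0, 0.5, np.e   # illustrative values only

T_poly = lambda t: beta * (1.0 + t) ** (-eta)                            # eqn (31)
T_exp  = lambda t: beta * alpha ** (-eta * t)                            # eqn (32)
T_log  = lambda t: beta * (np.log(t + alpha) / np.log(alpha)) ** (-eta)  # eqn (33)

for t in (0.0, 1.0, 10.0, 100.0):
    print(f"t={t:6.1f}  poly={T_poly(t):.4f}  exp={T_exp(t):.6f}  log={T_log(t):.4f}")

All three satisfy T(0) = β and decay monotonically to zero, but at very different rates, which is what the selection rules discussed next are meant to constrain.
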
Theorem 3 provides a sufficient condition for determining T(t). If J{F[u(t)]} is a symmetric matrix, such as the aforementioned diagonal matrix, then ∇f[v(t)]^T J{F[u(t)]} ∇p[v(t)] = ∇p[v(t)]^T J{F[u(t)]} ∇f[v(t)]; hence eqn (29) becomes

T(t) ≥ max{ 0, [∇p[v(t)]^T J{F[u(t)]} ∇p[v(t)] - ∇f[v(t)]^T J{F[u(t)]} ∇p[v(t)]] / [∇f[v(t)]^T J{F[u(t)]} ∇f[v(t)] - ∇f[v(t)]^T J{F[u(t)]} ∇p[v(t)]] }.   (34)

Because J{F[u(t)]} is positive semidefinite, ∇p[v(t)]^T J{F[u(t)]} ∇p[v(t)] ≥ 0 and ∇f[v(t)]^T J{F[u(t)]} ∇f[v(t)] ≥ 0. In the cases of LP and QP, if v(0) = 0 and J{F[u(t)]} is symmetric, as in most neural network models, then eqn (29) implies

max_{t≥0} T(t) = T(0) ≥ max{ 0, [b^T A J{F[u(0)]} (c + A^T b)] / [c^T J{F[u(0)]} (c + A^T b)] }.   (35)

In addition, if F(u) = [F_1(u_1), F_2(u_2), ..., F_N(u_N)]^T and F_1 = F_2 = ... = F_N, then eqn (35) becomes

max_{t≥0} T(t) = T(0) ≥ max{ 0, [b^T A (c + A^T b)] / [c^T (c + A^T b)] }.   (36)

As discussed in Section 3, the terms -T(t)∇f[v(t)] and -∇p[v(t)] in the right-hand side of the dynamical eqn (18) represent, respectively, the objective minimization effect and the constraint satisfaction effect. Because a feasible solution is not necessarily an optimal one, it is essential for the first term not to vanish until v(t) reaches an optimal solution. Theorem 3 suggests essentially that T(t) sustain sufficiently long to obtain an optimal solution. In other words, similar to the temperature parameter in the stochastic counterpart, the value of the temperature parameter in the deterministic annealing neural network should decrease slowly. A simple way to implement this requirement is to select a sufficiently small value of η in eqn (31), (32), or (33). Because eqn (29) is not a necessary condition, violation of this condition could still result in an optimal solution. As will be shown in the illustrative examples in the ensuing section, optimal solutions can still be obtained when df[v(t)]/dt > dp[v(t)]/dt.

An exponential function, β exp(-ηt), is used implicitly as the temperature parameter in the recurrent neural networks for solving the assignment problem and linear programming problems (Wang, 1992, 1993); that is, α = e in eqn (32), where e = 2.71828.... In this case, if the first term of the right-hand side of dynamical eqn (18) is dominant as time elapses, then the convergence rate of the recurrent neural network is exclusively dictated by η, and the deterministic annealing neural network needs approximately 5/η units of time to converge, because 1/η is the time constant of the temperature parameter. This feature enables one to control or determine the convergence rate of the deterministic annealing neural network.

Because β = T(0) in eqns (31)-(33), parameter β defines the maximal and initial value of the temperature parameter T(t). According to eqn (34),

β ≥ max{ 0, [∇p[v(0)]^T J{F[u(0)]} ∇p[v(0)] - ∇f[v(0)]^T J{F[u(0)]} ∇p[v(0)]] / [∇f[v(0)]^T J{F[u(0)]} ∇f[v(0)] - ∇p[v(0)]^T J{F[u(0)]} ∇f[v(0)]] }.   (37)

In addition, β is a design parameter related to the convergence rate of the deterministic annealing neural network and can be used to balance the penalization effect on constraint violation and the minimization effect on the objective function. In general, a small β results in an underpenalization and a large β results in an overpenalization of constraint violations. The convergence is slow in either case. An excessively small β could even result in convergence to a suboptimal solution, as will be shown in Example 2. In other words, a properly selected parameter β can make the deterministic annealing neural network converge rapidly. If p_g(v), p_h(v), and f(v) are normalized in the same scale, then a design rule is to set β ≈ 1.

7. ILLUSTRATIVE EXAMPLES

Simulations have been performed using a simulator for a number of sample problems. The following illustrative examples are to demonstrate the operating characteristics of the proposed deterministic annealing neural network in solving linear and quadratic programs. Specifically, Example 1 shows the performance of the deterministic annealing neural network with different initial states, Example 2 shows the effects of the design parameters β, η, and ξ on the performance of the deterministic annealing neural network, and Example 3 shows the performance of the neural network in solving a quadratic program.

EXAMPLE 1. Consider the following linear program with two decision variables and three functional constraints:

min  -2v_1 - 3v_2
s.t.  v_1 + 2v_2 ≤ 3,
      -v_1 + 2v_2 ≤ 2,
      3v_1 + v_2 ≤ 5,
      v_1 ≥ 0, v_2 ≥ 0.

The optimal solution of this problem is v* = [1.4, 0.8]^T with the minimal objective value f(v*) = -5.2. In this example, the inequality constraints are converted to equality constraints by adding three slack-state variables v_3, v_4, and v_5:

v_1 + 2v_2 + v_3 = 3,
-v_1 + 2v_2 + v_4 = 2,
3v_1 + v_2 + v_5 = 5,

where v_1, v_2, v_3, v_4, v_5 ≥ 0.

Numerical simulations have been performed using the developed simulator based on v_i(t) = v_max/[1 + exp(-ξ u_i(t))], T(t) = β(1 + t)^{-η}, v_max = 2, ξ = 10^5, β = 1, η = 10^3, and Δt = 10^{-6}. The results of the simulations show that the steady states of the deterministic annealing neural network can always accurately generate the optimal solution starting from the four different corners of the 2 × 2 square on the v_1-v_2 plane; that is, v̄ = [1.400000, 0.800001]^T and f(v̄) = -5.200001. Figures 4 and 5 illustrate, respectively, the convergent-state trajectories on the v_1-v_2 plane and the convergent states versus the number of time intervals (iterations).

FIGURE 4. Feasible region and state trajectories in Example 1.

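
The following sketch mimics the Example 1 simulation on the slack-variable formulation above, using the sigmoid activation and the polynomial schedule of eqn (31). The gain ξ, the values of β and η, and the step size are deliberately much milder than the reported ξ = 10^5, β = 1, η = 10^3, Δt = 10^{-6}, so that the explicit Euler sketch stays well behaved; the printed state is therefore only an approximation of the reported steady state [1.4, 0.8]^T.

import numpy as np

# Slack-variable formulation of Example 1: decision variables v1, v2 and slacks v3, v4, v5,
# equality constraints A v = b, objective c^T v (the slacks carry zero cost).
A = np.array([[ 1.0, 2.0, 1.0, 0.0, 0.0],
              [-1.0, 2.0, 0.0, 1.0, 0.0],
              [ 3.0, 1.0, 0.0, 0.0, 1.0]])
b = np.array([3.0, 2.0, 5.0])
c = np.array([-2.0, -3.0, 0.0, 0.0, 0.0])

v_max, xi = 2.0, 20.0                   # milder sigmoid gain than the reported 1e5
beta, eta = 2.0, 20.0                   # milder schedule than the reported values
T = lambda t: beta * (1.0 + t) ** (-eta)           # eqn (31)
F = lambda u: v_max / (1.0 + np.exp(-xi * u))      # sigmoid activation, range (0, v_max)

dt, u = 1e-4, np.zeros(5)               # u(0) = 0 places v(0) at the centre of the box
for k in range(30000):
    t = k * dt
    v = F(u)
    u += dt * (-T(t) * c - A.T @ (A @ v - b))      # eqn (18) with p = (1/2)||Av - b||^2
v = F(u)
print("v1, v2   =", np.round(v[:2], 3), "   f(v) =", round(float(c @ v), 3))
print("||Av-b|| =", round(float(np.linalg.norm(A @ v - b)), 5))
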
FIGURE 5. Transient states of the deterministic annealing neural network in Example 1, starting from the four corner initial states v(0) = (0.000001, 1.999999)^T, (1.999999, 1.999999)^T, (0.000001, 0.000001)^T, and (1.999999, 0.000001)^T.

FIGURE 6. Transient behaviors of the temperature parameter, objective function, penalty function, and energy function in Example 1.

Figures 4 and 5 indicate that the four trajectories merge in their final stages. This phenomenon indicates a deep valley of the energy function in the vicinities of the trajectories. Figure 5 also shows that the deterministic annealing neural network can converge in approximately 8000 time intervals or 0.008 time units. Figure 6 illustrates the convergence patterns of the values of the temperature parameter, objective function, penalty function, and energy function versus the number of time intervals with four different initial states, which shows that the values of the objective, penalty, and energy functions with different starting points make little difference after a few iterations.

EXAMPLE 2. Consider the following linear program with five variables and three constraints:

min  2.4v_1 + 1.6v_2 + 4.2v_3 + 5.2v_4 + 1.4v_5
s.t. 4.3v_1 + 5.3v_2 + 1.0v_3 + 0.5v_4 + 2.1v_5 = 12.5,
     7.1v_1 + 2.6v_2 + 2.4v_3 + 1.4v_4 + 1.9v_5 = 7.2,
     1.3v_1 + 1.2v_2 + 2.5v_3 + 4.1v_4 + 2.7v_5 = 6.3,
     v_1, v_2, v_3, v_4, v_5 ≥ 0.

The optimal solution of this problem is v* = [0.671138, 1.963131, 3.113311, 0, 0]^T and the corresponding value of the objective function is f(v*) = 17.827650. Simulation using the simulator shows that the steady state of the deterministic annealing neural network is v̄ = [0.671138, 1.963132, 3.113310, 0.000001, 0.000000]^T and f(v̄) = 17.827648, where v(0) = [2.5, 2.5, 2.5, 2.5, 2.5]^T, v_i(t) = v_max/[1 + exp(-ξ u_i(t))], T(t) = β exp(-ηt), v_max = 5, ξ = 10^4, β = 1, η = 10^3, and Δt = 10^{-6} unless otherwise indicated.

FIGURE 7. Transient states of the deterministic annealing neural network in Example 2.

FIGURE 8. Transient behaviors of the temperature parameter, objective function, penalty function, and energy function based on different β in Example 2.

Figure 7 illustrates the transient behavior of the activation states of the deterministic annealing neural network. Because η = 10^3 and Δt = 10^{-6}, 5/η = 0.005 and 5/(η Δt) = 5000. Figure 7 clearly shows that the deterministic annealing neural network can converge in approximately 5000 time intervals or 0.005 time units. Figures 8, 9, and 10 illustrate the transient behaviors of the values of the temperature parameter, objective function, penalty function, and energy function based on different values of the design parameters β, η, and ξ, respectively.

FIGURE 9. Transient behaviors of the temperature parameter, objective function, penalty function, and energy function based on different η in Example 2.

EXAMPLE 3. Consider the following convex quadratic program with four variables and three constraints:

min  1.5v_1^2 - v_1 v_2 + v_1 v_3 - 2v_1 v_4 + 2v_2^2 - 2v_2 v_3 + 2.5v_3^2 + v_3 v_4 + 3v_4^2 - 6v_1 + 15v_2 + 9v_3 + 4v_4
s.t. v_1 + 2v_2 + 4v_3 + 5v_4 = 12,
     3v_1 - 2v_2 - v_3 + 2v_4 = 8,
     2v_1 - 3v_2 + v_3 - 4v_4 = 6,
     v_1 ≥ 0, v_2 ≥ 0, v_3 ≥ 0, v_4 ≥ 0.

The optimal solution to this problem is v* = [2.967033, 0.000000, 1.736264, 0.417582]^T and the corresponding value of the objective function is f(v*) = 24.157710. Simulation using the simulator shows that the steady state of the deterministic annealing neural network is v̄ = [2.967033, 0.000000, 1.736263, 0.417583]^T and f(v̄) = 24.157701, where the initial state is v(0) = [2.5, 2.5, 2.5, 2.5]^T, v_i(t) = v_max/[1 + exp(-ξ u_i(t))], T(t) = β[ln(t + e)]^{-η}, v_max = 5, ξ = 10^4, β = 1, η = 10^4, and Δt = 10^{-6}. Figure 11 illustrates the transient behavior of the activation states, temperature parameter, objective function, penalty function, and energy function in Example 3.

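
For the quadratic program of Example 3, the single-layer dynamics of eqn (25) can be written down directly from Q, c, A, and b as reconstructed above. The sketch below uses the same initial state v(0) = [2.5, 2.5, 2.5, 2.5]^T and v_max = 5, but a much milder sigmoid gain, an exponential schedule in place of the logarithmic one, and a larger step size than reported, purely so that the explicit Euler sketch runs quickly; the printed state should therefore only approximate the reported solution.

import numpy as np

# Data of Example 3: f(v) = (1/2) v^T Q v + c^T v subject to A v = b, v >= 0.
Q = np.array([[ 3.0, -1.0,  1.0, -2.0],
              [-1.0,  4.0, -2.0,  0.0],
              [ 1.0, -2.0,  5.0,  1.0],
              [-2.0,  0.0,  1.0,  6.0]])
c = np.array([-6.0, 15.0, 9.0, 4.0])
A = np.array([[1.0,  2.0,  4.0,  5.0],
              [3.0, -2.0, -1.0,  2.0],
              [2.0, -3.0,  1.0, -4.0]])
b = np.array([12.0, 8.0, 6.0])

v_max, xi = 5.0, 10.0                            # milder gain than the reported 1e4
T = lambda t: np.exp(-2.0 * t)                   # assumed schedule (eqn (32) with alpha = e)
F = lambda u: v_max / (1.0 + np.exp(-xi * u))    # sigmoid activation; enforces 0 <= v <= 5

dt, u = 1e-4, np.zeros(4)                        # F(0) = 2.5, i.e., v(0) = [2.5, 2.5, 2.5, 2.5]
for k in range(50000):
    t = k * dt
    v = F(u)
    u += dt * (-(A.T @ A + T(t) * Q) @ v + A.T @ b - T(t) * c)   # eqn (25)
v = F(u)
print("v    =", np.round(v, 3))
print("f(v) =", round(float(0.5 * v @ Q @ v + c @ v), 4))
print("Av-b =", np.round(A @ v - b, 5))
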
8. CONCLUSIONS

In this paper, a deterministic annealing neural network for solving convex programming problems has been proposed. The convergence properties of the deterministic annealing neural network have been analyzed, the design principles have been discussed, and three illustrative examples have also been detailed. It has been substantiated that the proposed deterministic annealing neural network is capable of generating optimal solutions to convex programming problems. Having the same convergence properties as its predecessors (Wang, 1991; Wang & Chankong, 1992), the proposed deterministic annealing neural network possesses more desirable features. From a theoretical point of view, the infeasibility requirement for initial solutions is relaxed. From a practical point of view, the proposed deterministic annealing neural network is more suitable for hardware implementation, as shown in Wang (1992, 1993). The proposed deterministic annealing neural network thus can serve as an effective computational model for solving real-time and large-scale convex programming problems.

Many avenues are open for future work. The proposed deterministic annealing neural network for convex programming could be extended for solving nonconvex optimization problems. Further investigations have been aimed at extension of the proposed deterministic annealing neural network to solve nonconvex quadratic programs and integer programs, applications of the deterministic annealing neural network to specific problems of interest, and implementation of the deterministic annealing neural network in analog VLSI circuits.

FIGURE 10. Transient behaviors of the temperature parameter, objective function, penalty function, and energy function based on different ξ in Example 2.

FIGURE 11. Transient behaviors of the activation states, temperature parameter, objective function, penalty function, and energy function in Example 3.

REFERENCES

Hopfield, J. J., & Tank, D. W. (1985). Neural computation of decisions in optimization problems. Biological Cybernetics, 52, 141-152.

Kennedy, M. P., & Chua, L. O. (1988a). Unifying the Tank and Hopfield linear programming circuit and the canonical nonlinear programming circuit of Chua and Lin. IEEE Transactions on Circuits and Systems, 34(2), 210-214.

Kennedy, M. P., & Chua, L. O. (1988b). Neural networks for nonlinear programming. IEEE Transactions on Circuits and Systems, 35(5), 554-562.

Maa, C. Y., & Shanblatt, M. A. (1992). Linear and quadratic programming neural network analysis. IEEE Transactions on Neural Networks, 3, 580-594.

Tagliarini, G. A., Christ, J. F., & Page, E. W. (1991). Optimization using neural networks. IEEE Transactions on Computers, 40(12), 1347-1358.

Tank, D. W., & Hopfield, J. J. (1986). Simple neural optimization networks: An A/D converter, signal decision circuit, and a linear programming circuit. IEEE Transactions on Circuits and Systems, 33(5), 533-541.

Wang, J. (1991). On the asymptotic properties of recurrent neural networks for optimization. International Journal of Pattern Recognition and Artificial Intelligence, 5(4), 581-601.

Wang, J. (1992). Analog neural network for solving the assignment problem. Electronics Letters, 28(11), 1047-1050.

Wang, J. (1993). Analysis and design of a recurrent neural network for linear programming. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, 40(9), 613-618.

Wang, J., & Chankong, V. (1992). Recurrent neural networks for linear programming: Analysis and design principles. Computers & Operations Research, 19(3/4), 297-311.

APPENDIX

Proof of the convexity of p_g(v) in eqn (7): p_g(v) is convex if λ p_g(v') + (1 - λ) p_g(v'') ≥ p_g[λv' + (1 - λ)v''], ∀v', v'' ∈ R^N, λ ∈ [0, 1]. According to the definition of the Heaviside function in eqn (9), for i = 1, 2, ..., m,

∫_0^{g_i(v)} x H(x) dx = g_i(v)^2/2 if g_i(v) > 0, and 0 if g_i(v) ≤ 0.

Because 0 ≤ λ ≤ 1, if g_i(v') ≤ 0 and/or g_i(v'') ≤ 0, then the convexity is straightforward. Let us consider g_i(v') > 0 and g_i(v'') > 0:

λ p_g(v') + (1 - λ) p_g(v'') = ½ Σ_{i=1}^{m} [λ g_i(v')^2 + (1 - λ) g_i(v'')^2].

Because [g_i(v') - g_i(v'')]^2 ≥ 0, λ(1 - λ)[g_i(v')^2 + g_i(v'')^2] ≥ 2λ(1 - λ) g_i(v') g_i(v''); that is, λ g_i(v')^2 + (1 - λ) g_i(v'')^2 ≥ λ^2 g_i(v')^2 + (1 - λ)^2 g_i(v'')^2 + 2λ(1 - λ) g_i(v') g_i(v''). Summing the above inequality over i and multiplying both sides by ½, we have

½ Σ_{i=1}^{m} [λ g_i(v')^2 + (1 - λ) g_i(v'')^2] ≥ ½ Σ_{i=1}^{m} [λ^2 g_i(v')^2 + (1 - λ)^2 g_i(v'')^2 + 2λ(1 - λ) g_i(v') g_i(v'')].

Because

½ Σ_{i=1}^{m} [λ^2 g_i(v')^2 + 2λ(1 - λ) g_i(v') g_i(v'') + (1 - λ)^2 g_i(v'')^2] = ½ Σ_{i=1}^{m} [λ g_i(v') + (1 - λ) g_i(v'')]^2

and λ g_i(v') + (1 - λ) g_i(v'') ≥ g_i(λv' + (1 - λ)v'') due to the convexity of g_i(v) for i = 1, 2, ..., m,

½ Σ_{i=1}^{m} [λ g_i(v') + (1 - λ) g_i(v'')]^2 ≥ ½ Σ_{i=1}^{m} g_i(λv' + (1 - λ)v'')^2 = p_g[λv' + (1 - λ)v''].

Thus, the convexity property of p_g(v) is proven. Note that the convexity of p_g(v) does require the convexity of g(v). The proof of the convexity of p_h(v) is similar.

NOMENCLATURE

N     the number of elements in the decision variable vector
m     the number of inequality constraints
n     the number of equality constraints
v     the N-dimensional activation state vector and vector of decision variables
u     the N-dimensional net input vector to neurons
f(v)  the real-valued objective function
g(v)  the m-tuple function for inequality constraints
h(v)  the n-tuple function for equality constraints
p(v)  the real-valued penalty function
T(t)  the temperature parameter
