
Application of Discrete Hopfield-type Neural Network

for Max-Cut Problems


Ling-Yun Wu Xiang-Sun Zhang Ju-Liang Zhang
Academy of Mathematics and System Sciences
Chinese Academy of Sciences
Beijing 100080, P.R.China
wlyun@amath8.amt.ac.cn

Abstract  In this paper, we discuss the convergence property of the discrete Hopfield-type neural network (DHNN) running in asynchronous mode. Then a DHNN with a negative diagonal weight matrix is designed to solve the Max-Cut problem, which can approach good solutions.

Keywords  discrete Hopfield-type neural network, Max-Cut, combinatorial optimization.

1 Introduction

It has been more than ten years since artificial neural networks were first applied in the field of optimization. As is well known, this line of work was initiated by Hopfield and Tank in 1985: their seminal paper [13] demonstrated that the Travelling Salesman Problem (TSP) could be solved by using a Hopfield neural network. Since then, the Hopfield neural network has remained the most widely used neural network for solving optimization problems. A variety of feedback neural networks similar to the Hopfield neural network have been proposed to solve linear programming problems (for example, [21, 22, 25]) and quadratic programming problems (for example, [3, 18, 25]) as well as combinatorial optimization problems (for example, [1, 8, 13, 14, 16, 23]), because of the potential for extremely rapid computation that neural networks offer through hardware implementation. Cichocki and Unbehauen [6] and Zhang [24] are two comprehensive books on neural networks and optimization, and a concise review of neural networks for combinatorial optimization was written by Smith [20].

The discrete Hopfield-type neural network (DHNN), denoted by N = (W, T), can be described by the following equation:

    x(t + 1) = sgn{W x(t) − T},

where W = (w_ij) is an n × n transition matrix (or weight matrix), T = (t_1, ..., t_n)^T is a threshold vector, and

    sgn{x} = (sgn(x_1), ..., sgn(x_n))^T,

where the operator sgn is the signum function (or bipolar binary function) defined by

    sgn(y) = 1 if y ≥ 0,   sgn(y) = −1 otherwise.

Sometimes a variation of the signum function, called the unipolar binary function, is used in DHNN:

    sgn(y) = 1 if y ≥ 0,   sgn(y) = 0 otherwise.

There are two categories of DHNN, identified by their operating modes: synchronous mode (parallel mode) and asynchronous mode (serial mode). The general algorithms of the DHNN in the two modes are as follows.

Algorithm 1. (Synchronous DHNN)

1. given a stop criterion C
2. given an initial vector x(0), t = 0
3. x_i(t + 1) = sgn(W_i x(t) − t_i), i = 1, ..., n
4. if x(t + 1) satisfies C then stop, else t := t + 1 and go to step 3

Algorithm 2. (Asynchronous DHNN)

1. given a stop criterion C and a point-to-point map M : {1, ..., n} → {1, ..., n}
2. given an initial vector x(0) and set i(0) = 1, t = 0
3. take i(t) = M(i(t − 1)) ∈ {1, ..., n} if t > 0
4. x_i(t + 1) = sgn(W_i x(t) − t_i) if i = i(t), and x_i(t + 1) = x_i(t) if i ≠ i(t)
5. if x(t + 1) satisfies C then stop, else t := t + 1 and go to step 3

where W_i, i = 1, ..., n, are the rows of W.

In this paper, we only consider the DHNN running in asynchronous mode.
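To make the serial update rule concrete, the following Python sketch simulates Algorithm 2 with a cyclic visiting map M(i) = i + 1 (mod n) and a simple stop criterion C ("no component changed during a full sweep"); it is an illustrative sketch only, not an exact transcription of the algorithm in its full generality.

```python
import numpy as np

def async_dhnn(W, T, x0, max_sweeps=1000):
    """Asynchronous (serial) DHNN of Algorithm 2: neurons are visited
    cyclically (M(i) = i + 1 mod n) and updated one at a time; the run
    stops when a full sweep changes no component, i.e. x = sgn{W x - T}."""
    x = np.where(np.asarray(x0) >= 0, 1, -1)      # bipolar state in {-1, 1}^n
    n = len(x)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):                        # one sweep = n serial updates
            xi = 1 if W[i] @ x - T[i] >= 0 else -1    # sgn(W_i x(t) - t_i)
            if xi != x[i]:
                x[i] = xi
                changed = True
        if not changed:                           # stable state reached
            break
    return x
```

The energy function introduced in the next section is non-increasing along these serial updates whenever the diagonal of W satisfies the condition given there.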
2 Theoretical Results

Definition 1. A DHNN N = (W, T) is called convergent from a given initial state x(0) if there exists an integer k such that x(k) = sgn{W x(k) − T}. x* = x(k) is referred to as a stable state. A network is called convergent if it converges to a stable state from any given initial state. A DHNN is called convergent to a stable cycle from a given initial state x(0) if there exist integers k and q such that x(k + q) = sgn{W x(k) − T}.

The following theorem is given by Hopfield in [12].

Theorem 1. ([12]) Let W be a symmetric matrix with nonnegative diagonal. The DHNN N = (W, T), which runs in the asynchronous mode, will always converge to a stable state.

The asynchronous DHNN can be used to solve the combinatorial optimization problem

    (P)   min  −(1/2) x^T W x + T^T x
          s.t.  x ∈ {−1, 1}^n.

There were few applications of DHNN in combinatorial optimization before Zhuo et al. [26]. The main reason is that the weight matrix W does not always satisfy the condition in Theorem 1, so the network is not guaranteed to be convergent. Zhuo et al. [26] proposed a DHNN as a solver for the Four-Coloring Map problem by rearranging the weight matrix to have a zero diagonal.

It is interesting to notice that rearranging w_ii does not affect the solution set of the problem P. In fact,

    −(1/2) x^T (W + Diag(∆w_ii)) x + T^T x
      = −(1/2) x^T W x + T^T x − (1/2) Σ_{i=1}^n ∆w_ii x_i^2
      = −(1/2) x^T W x + T^T x − (1/2) Σ_{i=1}^n ∆w_ii,

where the last term (1/2) Σ_{i=1}^n ∆w_ii is a constant which does not contribute to the solution set. But according to the above theorem, the diagonal elements of the matrix W play a very important role in the convergence analysis. So we can simply assign proper values to the diagonal elements of W in order to make the network convergent.

On the other hand, the values of the diagonal elements are closely related to the number of stable states of the network. The following theorem tells us that smaller diagonal elements may result in fewer stable states of a network.

Theorem 2. ([15], also [24]) Let N^1 = (W^1, T) and N^2 = (W^2, T) be two DHNN, where w^1_ij = w^2_ij for i ≠ j, and w^1_ii ≤ w^2_ii for i = 1, ..., n. Then Ω_{N^1} ⊆ Ω_{N^2}, where Ω_{N^i} is the set of all stable states of the network N^i (i = 1, 2).

Generally speaking, since there is a finite number of states for a network, the network will always converge to a stable state or a stable cycle. However, when a network is used to solve an optimization problem, we are concerned with whether the energy function decreases as the network updates, rather than with whether the network converges to a stable state. Thus the following definition is proposed.

Definition 2. For a given energy function E(t), a network N is called E-convergent from a given initial state x(0) if it converges to a stable state or a stable cycle with ∆E(t) ≤ 0 for t = 0, 1, .... A network is called totally E-convergent if it is E-convergent from any given initial state.

By the above definition, we have the following theorem.

Theorem 3. Let N = (W, T) be a discrete Hopfield-type neural network running in asynchronous mode with symmetric matrix W. Let

    K_i = min{ |Σ_{j≠i} w_ij x_j − t_i| : Σ_{j≠i} w_ij x_j − t_i ≠ 0, x = (x_1, ..., x_n)^T ∈ {−1, 1}^n }.

If

    w_ii > −K_i,  i = 1, ..., n,

then the network is totally E-convergent with the energy function

    E(t) = −(1/2) x(t)^T W x(t) + T^T x(t).

Proof. Let x^t = x(t) and x^{t+1} = x(t + 1) be the network states at steps t and t + 1 respectively, with x^{t+1} ≠ x^t. Since the network is running in the asynchronous mode, we can suppose that x^{t+1} = x^t + ∆x with ∆x_k = −2 x^t_k and ∆x_i = 0 for i ≠ k.

Consider the energy function

    E(t) = −(1/2) x(t)^T W x(t) + T^T x(t).

We have

    ∆E(t) = E(x^{t+1}) − E(x^t)
          = [−(1/2)(x^t + ∆x)^T W (x^t + ∆x) + T^T (x^t + ∆x)] − [−(1/2)(x^t)^T W x^t + T^T x^t]
          = −∆x^T W x^t − (1/2) ∆x^T W ∆x + T^T ∆x
          = 2 x^t_k (Σ_{j=1}^n w_kj x^t_j) − 2 w_kk (x^t_k)^2 − 2 x^t_k t_k
          = 2 x^t_k (Σ_{j≠k} w_kj x^t_j − t_k).

If we can show that ∆E(t) ≤ 0, the theorem is proved.

For the case x^t_k = 1, x^{t+1}_k = −1, we have

    sgn(Σ_{j=1}^n w_kj x^t_j − t_k) = −1,

i.e.,

    Σ_{j=1}^n w_kj x^t_j − t_k < 0,

thus

    Σ_{j≠k} w_kj x^t_j − t_k < −w_kk.

Since w_kk > −K_k, we get

    Σ_{j≠k} w_kj x^t_j − t_k < K_k,

which, by the definition of K_k, implies that

    Σ_{j≠k} w_kj x^t_j − t_k ≤ 0.

So ∆E(t) = 2 x^t_k (Σ_{j≠k} w_kj x^t_j − t_k) ≤ 0. The case x^t_k = −1, x^{t+1}_k = 1 can be proved similarly.

Since we are interested in the global solution of a given optimization problem, it is not favorable if there is no stable state corresponding to a global solution. The following theorem gives a guarantee.

Theorem 4. ([26], also [24]) If a DHNN N = (W, T) is E-convergent with energy function

    E(t) = −(1/2) x(t)^T W x(t) + T^T x(t),

then there is at least one stable state (or states in a stable cycle) which corresponds to a global solution of the problem P.
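For small n, the constant K_i in Theorem 3 can be computed by direct enumeration over {−1, 1}^n. The following illustrative sketch (the brute-force helper and the example matrix are for exposition only, not part of the original experiments) checks the condition w_ii > −K_i for a small symmetric matrix with a negative diagonal.

```python
import itertools
import numpy as np

def K(W, T, i):
    """K_i of Theorem 3: the smallest nonzero |sum_{j != i} w_ij x_j - t_i|
    over all bipolar vectors x in {-1, 1}^n (brute force, small n only)."""
    n = W.shape[0]
    vals = []
    for x in itertools.product([-1, 1], repeat=n):
        s = sum(W[i, j] * x[j] for j in range(n) if j != i) - T[i]
        if s != 0:
            vals.append(abs(s))
    return min(vals)

# small symmetric example with a slightly negative diagonal (illustrative values)
W = np.array([[-0.4,  2.0, -1.0],
              [ 2.0, -0.4,  3.0],
              [-1.0,  3.0, -0.4]])
T = np.zeros(3)

Ks = [K(W, T, i) for i in range(3)]
print(Ks)                                          # K_1, K_2, K_3 = 1.0, 1.0, 2.0
print(all(W[i, i] > -Ks[i] for i in range(3)))     # True: totally E-convergent
```

Note that the diagonal entries −0.4 violate the nonnegativity required by Theorem 1, yet the network is still totally E-convergent by Theorem 3.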
3 Application to the Max-Cut Problem

Consider an undirected, simple graph (i.e., a graph with no loops or parallel edges) G = (V, E), where V = {1, ..., n} is the vertex set of G and E the edge set of G. An edge ij ∈ E connects vertices i and j. Let w_ij be the weight of edge ij, and let W = (w_ij) ∈ R^{n×n} be the corresponding weight matrix.

For S ⊆ V, the set δ(S) = {ij ∈ E : i ∈ S, j ∈ V\S} is called the cut determined by S. The Max-Cut problem on G is to find S ⊆ V such that

    w(δ(S)) := Σ_{ij∈δ(S)} w_ij    (1)

is maximized. We refer to w(δ(S)) as the weight of the cut δ(S).

It is well known that the Max-Cut problem is NP-complete; in fact it appears in Karp's original list of NP-complete problems [17]. The problem is very well studied in the literature, and numerous results on lower bounds and expected cut sizes for different classes of graphs are known. A large number of applications of the Max-Cut problem can be found in the literature of physics and VLSI design. We refer the interested reader to [7, 19] and the references therein for more details about the Max-Cut problem and its applications.

Since the Max-Cut problem is NP-complete and exact solutions are difficult to obtain, different heuristic or approximation algorithms have been proposed, for example, algorithms based on the SDP relaxation in Goemans and Williamson [10], Helmberg and Rendl [11], Benson et al. [2] and Choi and Ye [5], and the heuristic algorithm of Burer et al. [4].

Funabiki et al. [9] proposed a binary neural network for the Max-Cut problem. The network they used is a continuous-time neural network, while our model in this paper is a discrete-time network. The energy function they used is also quite different from ours. Since we could not obtain their test problems, we do not compare our network with their model in this paper.

The Max-Cut problem

    max_{S⊆V} Σ_{ij∈δ(S)} w_ij

can be formulated as an integer quadratic program by introducing a cut vector. A cut vector x ∈ {−1, 1}^n is defined by

    x_i = 1 if i ∈ S,   x_i = −1 if i ∈ V\S.

Then the Max-Cut problem is equivalent to the following problem:

    max_{x∈{−1,1}^n} Σ_{i<j} w_ij (1 − x_i x_j)/2.

By exploiting the symmetry of W = (w_ij) and the fact that x_i x_i = 1, we have

    Σ_{i<j} w_ij (1 − x_i x_j)/2
      = (1/4) Σ_{i,j} w_ij (1 − x_i x_j)
      = (1/4) (Σ_{i=1}^n Σ_{j=1}^n w_ij x_i x_i − Σ_{i=1}^n Σ_{j=1}^n w_ij x_i x_j)
      = (1/4) x^T W' x,

where W' = (w'_ij) with

    w'_ij = −w_ij,  i ≠ j,
    w'_ii = Σ_{j=1}^n w_ij − w_ii = Σ_{j≠i} w_ij.

For Ŵ = (1/2) W', the Max-Cut problem can be rewritten in the following more general form,

    max_{x∈{−1,1}^n} (1/2) x^T Ŵ x,

and is therefore equivalent to

    min_{x∈{−1,1}^n} −(1/2) x^T Ŵ x,

which can be solved by a DHNN N = (Ŵ, 0) running in asynchronous mode.
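The transformation above is straightforward to implement. The following illustrative sketch (the edge-list format and the toy graph are for exposition only) builds Ŵ = (1/2) W' from a weighted edge list and verifies that (1/2) x^T Ŵ x equals the weight of the cut encoded by the cut vector x.

```python
import numpy as np

def maxcut_matrix(n, edges):
    """Build W_hat = 1/2 * W' for Max-Cut: off-diagonal entries -w_ij / 2,
    diagonal entries (sum over j != i of w_ij) / 2."""
    W = np.zeros((n, n))
    for i, j, w in edges:                  # undirected edge ij with weight w
        W[i, j] -= 0.5 * w
        W[j, i] -= 0.5 * w
        W[i, i] += 0.5 * w
        W[j, j] += 0.5 * w
    return W

def cut_weight(edges, x):
    """Weight w(delta(S)) of the cut encoded by the bipolar cut vector x."""
    return sum(w for i, j, w in edges if x[i] != x[j])

# toy graph on 4 vertices with integer edge weights (vertices numbered from 0)
edges = [(0, 1, 1), (1, 2, 2), (2, 3, 1), (0, 3, 3), (0, 2, 1)]
W_hat = maxcut_matrix(4, edges)

x = np.array([1, -1, 1, -1])               # the cut vector of S = {0, 2}
print(cut_weight(edges, x))                # 1 + 2 + 1 + 3 = 7
print(0.5 * x @ W_hat @ x)                 # 7.0, the same cut weight
```

Running the asynchronous DHNN of Section 1 with this matrix and a zero threshold vector yields the Max-Cut solver studied in the next section.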
4 Computational Results

As discussed in the previous section, the diagonal elements of the matrix Ŵ must be rearranged in order to ensure that the network is E-convergent. Zhuo et al. [26] used a zero-diagonal weight matrix. By our new result, a negative diagonal can be used instead of a zero diagonal. According to Theorem 2, smaller diagonal elements may result in fewer stable states of a network, which reduces the possibility of being trapped in a state corresponding to a local minimum or a poor-quality solution, and may therefore lead to better solutions.

The set of test problems we used is the same as that used by Helmberg and Rendl [11], Benson et al. [2], Choi and Ye [5] and Burer et al. [4] to test their methods for the Max-Cut problem. All graphs are generated with rudy, a platform-independent graph generator written by Giovanni Rinaldi. In these test problems, the weights of the edges are all integers; the off-diagonal entries of Ŵ are then half-integers and the threshold vector is zero, so any nonzero value of Σ_{j≠i} ŵ_ij x_j has absolute value at least 0.5. Therefore we have

    K_i ≥ 0.5,  i = 1, ..., n.

We can thus reassign the diagonal elements of Ŵ to a negative value d satisfying −0.5 < d < 0.

The results obtained with different diagonal values d are reported in Table 1 and Table 2. The sizes of the graphs are given as (|V|, |E|), where |V| is the number of vertices and |E| is the number of non-zero weights. Table 1 shows the best results in 10 tests with random initial points, and Table 2 the average values over the same 10 tests. It is obvious that using a negative diagonal improves the solution quality dramatically in both the best and the average values.

Graph   Size            d = 0     d = −0.4
G1      (800, 19176)    11420     11497
G11     (800, 1600)       470       556
G22     (2000, 19990)   12963     13109
G32     (2000, 4000)     1150      1372

Table 1: The best value in 10 tests

Graph   Size            d = 0     d = −0.4
G1      (800, 19176)    11382.7   11444
G11     (800, 1600)       457.2     549.4
G22     (2000, 19990)   12901.2   13058.5
G32     (2000, 4000)     1134.2    1360.4

Table 2: The average value in 10 tests
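In outline, the experimental procedure amounts to the sketch below (random bipolar initial points, serial sgn updates on the matrix with diagonal d, best objective over 10 restarts); the function name, the restart count and the sweep limit are illustrative choices only, not the code that produced the tables.

```python
import numpy as np

def run_dhnn_maxcut(W_hat, d=-0.4, restarts=10, max_sweeps=1000, seed=0):
    """Set the diagonal of the Max-Cut matrix to the negative value d
    (-0.5 < d < 0, so that w_ii > -K_i for integer edge weights), then run
    the asynchronous DHNN from random bipolar initial points and return the
    best objective 1/2 x^T W_hat x (the cut weight) over all restarts."""
    rng = np.random.default_rng(seed)
    W = W_hat.copy()
    np.fill_diagonal(W, d)                       # network weights with diagonal d
    n = W.shape[0]
    best = -np.inf
    for _ in range(restarts):
        x = rng.choice([-1, 1], size=n)
        for _ in range(max_sweeps):
            changed = False
            for i in range(n):                   # serial sgn updates, threshold T = 0
                xi = 1 if W[i] @ x >= 0 else -1
                if xi != x[i]:
                    x[i] = xi
                    changed = True
            if not changed:                      # stable state reached
                break
        best = max(best, 0.5 * x @ W_hat @ x)    # evaluate with the original diagonal
    return best
```

Together with the matrix construction at the end of Section 3, this outlines the procedure behind Tables 1 and 2, although it will not necessarily reproduce the exact figures.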

In Table 3, our results are compared with those of the DSDP method [5], one of the latest algorithms for solving the Max-Cut SDP relaxation, and the CirCut method [4], a new heuristic algorithm for the Max-Cut problem. To our knowledge, DSDP and CirCut are two of the most efficient algorithms for the Max-Cut problem at present. The results in Table 3 are the best values obtained in 100 tests with random initial points. The quality of the solutions given by the DHNN is better than that of DSDP on all problems except G48, G49 and G50. On problem G49, all three algorithms obtained the global optimal value. Moreover, on problems G11, G13, G33, G55 and G62, the solutions found by the DHNN are the best of the three. Noting that our code is simple, without any further improvement technique, while CirCut is a well-developed heuristic code, the quality of the solutions obtained by the DHNN is competitive.

Graph   DSDP     CirCut    DHNN
G11       542       554      560
G12       540       552      548
G13       564       572      574
G14      2922      3053     3024
G15      2938      3039     3013
G20       838       939      895
G21       841       921      881
G22     12960     13331    13167
G23     13006     13269    13157
G24     12933     13287    13140
G30      3038      3377     3200
G31      2851      3255     3111
G32      1338      1380     1378
G33      1330      1352     1354
G34      1334      1358     1356
G48      6000      6000     5992
G49      6000      6000     6000
G50      5880      5856     5846
G55      9960     10240    11639
G56      3634      3943     3700
G57      3320      3412     3394
G60     13610     14081    13718
G61      5252      5690     5316
G62      4612      4740     4894
G64      7624      8575     8055

Table 3: Comparison with DSDP and CirCut

5 Conclusion

In this paper, we give new convergence conditions for the discrete Hopfield-type neural network. Based on these conditions, we construct a discrete Hopfield-type neural network (DHNN) with a negative diagonal weight matrix to solve the Max-Cut problem. Elementary numerical experiments show that the performance of such a DHNN is encouraging. Compared with other methods for the Max-Cut problem, for example the SDP-based method and the CirCut method, which are two of the most efficient methods at present, the solution quality of the DHNN is competitive or even better. Meanwhile, we note that although we only change the diagonal elements to negative values, the solution quality improves dramatically.

Unlike the complicated energy functions used to solve the TSP (see [1, 13, 16, 23] etc.), the energy function for the Max-Cut problem is simpler and well suited to a neural network. This is possibly another reason that the DHNN can produce good solutions for the Max-Cut problem. Many other combinatorial optimization problems can be formulated as simple integer programming problems. A further line of work is to extend the idea in this paper to such combinatorial optimization problems.

However, like the traditional Hopfield-type neural network, the network proposed in this paper converges to the first stable state it encounters, which can decrease the solution quality. There are many techniques to prevent a network from being trapped in a stable state corresponding to a poor-quality solution, for example "hill-climbing". If we combine these techniques with the DHNN, we expect that the quality of the solutions given by the network will be better. We shall study this problem further.

References

[1] S. V. B. Aiyer, M. Niranjan, and F. Fallside. A theoretical investigation into the performance of the Hopfield model. IEEE Trans. on Neural Networks, Vol. 1, No. 2, pp. 204-215, 1990.

[2] S. Benson, Y. Ye, and X. Zhang. Solving large-scale sparse semidefinite programs for combinatorial optimization. SIAM J. Optim., Vol. 10, No. 2, pp. 443-461, 2000.

[3] A. Bouzerdoum and T. R. Pattison. Neural network for quadratic optimization with bound constraints. IEEE Trans. on Neural Networks, Vol. 4, pp. 293-304, 1993.

[4] S. Burer, R. D. C. Monteiro, and Y. Zhang. Rank-two relaxation heuristics for max-cut and other binary quadratic programs. Technical Report TR00-33, Department of Computational and Applied Mathematics, Rice University, Texas, 2000.

[5] C. Choi and Y. Ye. Solving sparse semidefinite programs using the dual scaling algorithm with an iterative solver. Working paper, Department of Management Science, University of Iowa, Iowa, 2000.

[6] A. Cichocki and R. Unbehauen. Neural networks for optimization and signal processing, John Wiley & Sons, New York, 1993.

[7] M. Deza and M. Laurent. Geometry of cuts and metrics, Volume 15 of Algorithms and Combinatorics, Springer, 1997.

[8] N. Funabiki, Y. Takefuji, and K. C. Lee. A neural network model for finding a near-maximum clique. J. of Parallel and Distributed Computing, Vol. 14, pp. 340-344, 1992.

[9] N. Funabiki, S. Nishiawa, and S. Tajima. A binary neural network approach for max cut problems. ICONIP'96, Hong Kong, pp. 631-635, 1996.

[10] M. X. Goemans and D. P. Williamson. Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. Journal of the ACM, Vol. 42, pp. 1115-1145, 1995.

[11] C. Helmberg and F. Rendl. A spectral bundle method for semidefinite programming. SIAM J. Optim., Vol. 10, pp. 673-696, 2000.

[12] J. J. Hopfield. Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. USA, Vol. 79, pp. 2554-2558, 1982.

[13] J. J. Hopfield and D. W. Tank. Neural computation of decisions in optimization problems. Biological Cybernetics, Vol. 52, pp. 141-152, 1985.

[14] A. Jagota. Approximating maximum clique with a Hopfield network. IEEE Trans. on Neural Networks, Vol. 6, pp. 724-735, 1995.

[15] L. C. Jiao. System theory of neural networks, Xian Electronic Scientific University Publishing House, Xian, China, 1990. (in Chinese)

[16] A. Joppe, H. R. A. Cardon, and J. C. Bioch. A neural network for solving the Traveling Salesman Problem on the basis of city adjacency in the tour. IJCNN, Vol. 3, pp. 961-964, 1990.

[17] R. M. Karp. Reducibility among combinatorial problems. In R. E. Miller and J. W. Thatcher, editors, Complexity of Computer Computations, pp. 85-103. Plenum Press, New York, 1972.

[18] M. P. Kennedy and L. O. Chua. Neural networks for nonlinear programming. IEEE Trans. on Circuits and Systems, Vol. 35, No. 5, pp. 554-562, 1988.

[19] S. Poljak and Z. Tuza. Maximum cuts and large bipartite subgraphs. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Vol. 20, 1995.

[20] K. A. Smith. Neural networks for combinatorial optimization: a review of more than a decade of research. INFORMS Journal on Computing, Vol. 11, No. 1, 1999.

[21] J. Wang. Analysis and design of a recurrent neural network for linear programming. IEEE Trans. on Circuits and Systems, Vol. 40, pp. 613-618, 1993.

[22] Y. Xia. A new neural network for solving linear programming and its application. IEEE Trans. on Neural Networks, Vol. 7, No. 2, pp. 525-529, 1996.

[23] X. Xu and W. T. Tsai. Effective neural algorithms for the Traveling Salesman Problem. Neural Networks, Vol. 4, pp. 193-205, 1991.

[24] X. S. Zhang. Neural networks in optimization, Volume 46 of Nonconvex Optimization and Its Applications, Kluwer, 2000.

[25] X. S. Zhang and H. C. Zhu. A neural network model for quadratic programming with simple upper and lower bounds and its application to linear programming. Lecture Notes in Computer Science, 834, pp. 119-127, Springer-Verlag, 1994.

[26] X. J. Zhuo and X. S. Zhang. Hopfield-type neural network for solving four-coloring map problems. OR Transactions, Vol. 3, No. 3, pp. 35-43, 1999. (in Chinese)
