Comparison and Selection of Exact and Heuristic Algorithms

Joaquín Pérez O., Rodolfo A. Pazos R., Juan Frausto S., Guillermo Rodríguez O., Laura Cruz R., Héctor Fraire H.
1 Introduction
In the solution of many difficult combinatorial problems (such as the data distribution problem), both exact and heuristic algorithms have been used. Exact algorithms have been extensively studied and are considered adequate for moderately sized instances, whereas heuristic algorithms are considered promising for very large instances [1, 2, 3, 4]. To get the best of both, it is necessary to determine analytically up to what problem size it is convenient to use an exact algorithm and beyond which it is better to use a heuristic algorithm. However, the lack of mathematical methods for predicting the performance of each algorithm as a function of problem size makes this determination difficult.
z = \sum_{k}\sum_{j} f_{kj} \sum_{m}\sum_{t} q_{km}\, l_{km}\, c_{jt}\, x_{mt}
  + c_1 \sum_{k}\sum_{j} f_{kj} \sum_{t} y_{kt}
  + c_2 \sum_{t} w_t
  + \sum_{m}\sum_{t}\sum_{j} a_{mj}\, c_{jt}\, d_m\, x_{mt}    (1)
where
f_{kj} = emission frequency of query k from site j,
q_{km} = usage parameter, equal to 1 if query k uses attribute m and 0 otherwise,
l_{km} = number of communication packets needed to transport attribute m for query k,
c_{jt} = communication cost between sites j and t,
c_1 = access cost coefficient for the y_{kt} terms,
y_{kt} = binary variable equal to 1 if query k accesses one or more attributes located at site t,
c_2 = fixed cost coefficient for the w_t terms,
w_t = binary variable equal to 1 if at least one attribute is stored at site t,
a_{mj} = binary parameter equal to 1 if attribute m was previously located at site j,
d_m = size of attribute m, so that a_{mj} c_{jt} d_m x_{mt} is the cost of migrating attribute m from site j to site t,
x_{mt} = binary decision variable equal to 1 if attribute m is allocated to site t.
The model solutions are subject to five constraints: each attribute must be stored at one site only; each attribute must be stored at a site that executes at least one query using it; the variables w_t and y_kt are forced to adopt values compatible with those of x_mt (two constraints); and the storage capacity of each site must not be exceeded by the attributes stored there. A detailed description of this model can be found in [12, 13].
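As an illustration of these constraints, the sketch below checks a candidate allocation against them. The data structures (an attribute-to-site mapping, a query table, attribute sizes, site capacities) are hypothetical stand-ins for the model's parameters, not the paper's notation; in this formulation the w and y variables are derived from the allocation rather than stored independently, which makes the two compatibility constraints hold by construction.

```python
# Hypothetical feasibility check for the DFAR allocation constraints.
# x[m]    = site where attribute m is stored (constraint 1, one site per
#           attribute, is implied by x being a mapping)
# qu[k]   = (site issuing query k, set of attributes the query uses)
# size[m] = size of attribute m; cap[j] = storage capacity of site j
# All names are illustrative.

def feasible(x, qu, size, cap):
    # Constraint 2: each attribute must reside at a site that executes
    # at least one query using it.
    exec_sites = {}  # attribute -> sites whose queries use it
    for site, attrs in qu.values():
        for m in attrs:
            exec_sites.setdefault(m, set()).add(site)
    if any(x[m] not in exec_sites.get(m, set()) for m in x):
        return False
    # Constraints 3/4: w_j and y_kj compatibility with x_mj is implicit,
    # since both would be derived from x here.
    # Constraint 5: site storage capacity must not be exceeded.
    load = {}
    for m, j in x.items():
        load[j] = load.get(j, 0) + size[m]
    return all(load[j] <= cap.get(j, 0) for j in load)

x = {"a1": "s1", "a2": "s1"}
qu = {"q1": ("s1", {"a1", "a2"})}
print(feasible(x, qu, {"a1": 40, "a2": 60}, {"s1": 100}))
```

A solver would search over allocations x while Eq. (1) ranks the feasible ones.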
2.2 Solution Algorithms
Since the distribution problem modeled by DFAR is NP-complete [14], a heuristic method is needed for large instances. As the exact solution method, the Branch and Bound algorithm implemented in the Lindo 6.01 commercial software was used. As the heuristic method, a variation of the Simulated Annealing algorithm known as Threshold Accepting was implemented. According to the cases reported in the specialized literature, this variant consumes less computing time and generates better-quality solutions [15]. More details of the implementations are reported in [16].
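Threshold Accepting replaces Simulated Annealing's probabilistic acceptance rule with a deterministic one: a neighboring solution is accepted whenever it worsens the objective by less than a threshold, which shrinks over time. A minimal sketch follows; the toy objective, neighborhood, and geometric threshold schedule are illustrative assumptions, not the implementation reported in [16].

```python
import random

def threshold_accepting(z, neighbor, x0, t0=1.0, mu=0.9, iters=200, seed=0):
    """Deterministic-acceptance variant of Simulated Annealing:
    accept x' whenever z(x') - z(x) < T, then shrink T geometrically."""
    rng = random.Random(seed)
    x, best = x0, x0
    T = t0
    for _ in range(iters):
        y = neighbor(x, rng)
        if z(y) - z(x) < T:      # accept any move below the threshold
            x = y
            if z(x) < z(best):   # track the best solution seen
                best = x
        T *= mu                  # threshold schedule
    return best

# Toy objective: minimize (v - 7)^2 over the integers.
z = lambda v: (v - 7) ** 2
nb = lambda v, rng: v + rng.choice([-1, 1])
print(threshold_accepting(z, nb, 0))
```

Because acceptance needs no exponential or random draw per move, each iteration is cheaper than in classical Simulated Annealing, which is one source of the reduced computing time noted above.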
3 Evaluation of Algorithms
In this section a statistical method for comparing exact and heuristic algorithms is presented. Additionally, the steps for estimating algorithm performance and selecting the best algorithm are detailed.
3.1 Method for Comparison of Exact and Heuristic Algorithms
The following method was devised considering the notions presented in [17, 18]:
Step 2. Calculate the coefficients of the set of feasible polynomials using a fitting method such as least squares.
Step 3. Select the most adequate polynomial using statistical tests that quantify how well each candidate represents the relationship between performance and problem size. In order to increase the confidence level of the chosen function, three fit tests are recommended: estimation of the error variance, the global F test, and the Student t test. The first provides a preliminary assessment of the function's reliability, the second yields a subset of useful functions, and the third determines the usefulness of the candidate function's coefficients. Table 1 presents the equations and conditions used to determine the goodness of fit of the efficiency polynomials; those for the efficacy polynomials are similar.
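Step 2 can be sketched directly: a least-squares fit of a degree-g polynomial T(n) = b_0 + b_1 n + ... + b_g n^g to the tabulated (n_i, t_i) pairs via the normal equations. This pure-Python sketch is illustrative and independent of the paper's actual implementation.

```python
def polyfit(ns, ts, g):
    """Least-squares coefficients b_0..b_g of T(n) = sum(b_i * n**i)."""
    r = len(ns)
    # Design matrix X with X[i][j] = n_i ** j, then normal equations
    # (X^T X) b = X^T t.
    X = [[float(n) ** j for j in range(g + 1)] for n in ns]
    A = [[sum(X[k][i] * X[k][j] for k in range(r)) for j in range(g + 1)]
         for i in range(g + 1)]
    b = [sum(X[k][i] * ts[k] for k in range(r)) for i in range(g + 1)]
    # Gaussian elimination with partial pivoting.
    for col in range(g + 1):
        piv = max(range(col, g + 1), key=lambda row: abs(A[row][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for row in range(col + 1, g + 1):
            f = A[row][col] / A[col][col]
            for c in range(col, g + 1):
                A[row][c] -= f * A[col][c]
            b[row] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * (g + 1)
    for i in range(g, -1, -1):
        s = b[i] - sum(A[i][j] * coeffs[j] for j in range(i + 1, g + 1))
        coeffs[i] = s / A[i][i]
    return coeffs

# Data generated from t = 1 + 2n + 3n^2 should recover (1, 2, 3).
ns = [1, 2, 3, 4, 5]
ts = [1 + 2 * n + 3 * n * n for n in ns]
print([round(c, 6) for c in polyfit(ns, ts, 2)])
```

In practice one would fit every feasible degree g and pass each candidate through the tests of Step 3.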
Table 1. Goodness Tests

Error variance:
  s_e^2 = \sum_{i=1}^{r} (t_i - T(n_i))^2 / (r - (g+1))
  Condition: the polynomial is adequate if it has the smallest s_e value.

Global F test:
  R^2 = 1 - \sum_i (t_i - T(n_i))^2 / \sum_i (t_i - \bar{t})^2
  F = (R^2 / g) / ((1 - R^2) / (r - (g+1)))
  Condition: the polynomial is useful if F exceeds the critical value of the F distribution with g and r-(g+1) degrees of freedom.

Student t test:
  t = b_i / s_{e_{b_i}},  0 <= i <= g,  where s_{e_{b_i}} is the standard error of b_i calculated by least squares.
  Condition: coefficient b_i is useful if t < -t_a(r-(g+1)) or t > t_a(r-(g+1)).

Here t_i is the observed performance for instance i, T(n_i) the value predicted by the polynomial, r the number of observations, and g the polynomial degree.
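The first two statistics of Table 1 can be computed directly from the residuals of a fitted polynomial. In this sketch, ts holds the observed times t_i and fitted the predicted values T(n_i); the names and sample values are illustrative.

```python
def goodness(ts, fitted, g):
    """Error variance s_e^2, R^2, and the global F statistic (Table 1)."""
    r = len(ts)
    mean = sum(ts) / r
    sse = sum((t - f) ** 2 for t, f in zip(ts, fitted))  # residual SS
    sst = sum((t - mean) ** 2 for t in ts)               # total SS
    se2 = sse / (r - (g + 1))                            # error variance
    r2 = 1 - sse / sst                                   # determination
    f = (r2 / g) / ((1 - r2) / (r - (g + 1)))            # global F
    return se2, r2, f

# Illustrative observed vs. fitted values for a degree-1 polynomial.
print(goodness([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.0, 4.0], 1))
```

Comparing F and the per-coefficient t statistics against tabulated critical values then decides which candidate polynomials survive.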
When the heuristic algorithm is selected, its solution quality is improved through successive runs. The number of runs is determined by dividing the tolerance time by the estimated processing time. Finally, if neither algorithm is adequate, the procedure ends without a result. The procedure just described can be stated as follows:
Algorithm
Begin
   real t, e;      // time and quality tolerances
   integer n;      // problem size
   if only an optimal solution is acceptable then
      if TE(n) <= t then
         x = E(I);
      else finish without solution
      end_if
   else
      if TE(n) <= t then
         x = E(I);
      else
         if TA(n) <= t and EA(n) <= e then
            x = A(I);
            for i = 2 to t / TA(n)
               y = A(I);
               if z(y) < z(x) then
                  x = y
               end_if
            end_for
         else finish without solution
         end_if
      end_if
   end_if
End
Here E and A denote the exact and heuristic algorithms, TE(n) and TA(n) their estimated processing times, EA(n) the estimated error of the heuristic, and I the problem instance.
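The selection procedure above maps directly onto Python. In this sketch, E and A are the exact and heuristic solvers, TE/TA their fitted time polynomials, and EA the heuristic's fitted error, all supplied by the caller; the toy solvers at the bottom are illustrative.

```python
def select_and_solve(I, n, E, A, z, TE, TA, EA,
                     t_tol, e_tol, optimal_only=False):
    """Run the exact algorithm if its predicted time fits the tolerance;
    otherwise run the heuristic repeatedly and keep the best solution.
    Returns None when neither algorithm meets the tolerances."""
    if TE(n) <= t_tol:
        return E(I)                   # exact algorithm is fast enough
    if optimal_only:
        return None                   # finish without solution
    if TA(n) <= t_tol and EA(n) <= e_tol:
        x = A(I)
        runs = int(t_tol / TA(n))     # tolerance time / predicted run time
        for _ in range(2, runs + 1):
            y = A(I)
            if z(y) < z(x):           # keep the best of the runs
                x = y
        return x
    return None

# Toy instance: the exact solver is predicted too slow, so the heuristic
# is run int(4.0 / 1.0) = 4 times and the best value is kept.
vals = iter([14, 12, 11, 13])
best = select_and_solve("I", 1000,
                        E=lambda I: 10, A=lambda I: next(vals),
                        z=lambda s: s,
                        TE=lambda n: 99.0, TA=lambda n: 1.0,
                        EA=lambda n: 0.05, t_tol=4.0, e_tol=0.1)
print(best)  # best of the four heuristic runs
```

The same structure covers the optimal-only branch of the pseudocode: with optimal_only=True the heuristic is never attempted.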
4 Experimental Results
4.1 Results of Algorithm Behavior
In order to obtain a tabular description of the algorithms' behavior, 40 experiments were conducted for each instance. Seventeen instances covering a wide size range and with known optimal solutions were generated. These belong to the same class and were mathematically constructed using the Uncoupled Components Method [19].
Each test instance was solved using the Branch&Bound algorithm and the Threshold Accepting algorithm. Tables 2 and 3 show a subset of the results of these tests. The second and third columns of Table 2 show the problem sizes of the test cases, while the last two columns show the performance results. Table 3 shows the results obtained using the Threshold Accepting algorithm: the percentage difference with respect to the optimal value is shown in columns two through four, and the last column shows the execution time of the algorithm.
Table 2. Exact Solution Using Branch&Bound

Instance   Sites   Queries   Optimal Value   Time (sec.)
I1         2       2         302             0.05
I2         18      18        2719            1.15
I3         20      20        3022            3.29
I4         32      32        * 4835          **
I5         64      64        * 9670          **
I6         128     128       * 19340         **
I7         256     256       * 38681         **
I8         512     512       * 77363         **
Table 3. Heuristic Solution Using Threshold Accepting

           % Difference (deviation from optimal)
Instance   Best    Worst   Average   Time (sec.)
I1         0       0       0         0.03
I2         0       141     10        0.3
I3         0       0       0         0.4
I4         0       78      4         1.2
I5         0       100     20        6.1
I6         0       140     36        43.6
I7         0       405     88        381.2
I8         66      383     215       3063.4
Table 4. Polynomial Functions for Efficiency

Algorithm             Polynomial Function T(n)
Threshold Accepting   -0.31458651 + 6.7247624E-5 n + 4.3424044E-10 n^2 - 6.1504908E-17 n^3
Branch&Bound          0.0036190847 + 4.4856655E-4 n - 4.4942872E-7 n^2 + 2.5914131E-10 n^3 - 5.4339889E-14 n^4 + 2.5641303E-18 n^5 + 2.4019059E-22 n^6
Table 5. Polynomial Functions for Efficacy (Threshold Accepting): fitted functions for large instances (best, average, and worst cases) and for random problems (average case).
Figure 1 shows the efficiency functions for both algorithms. For Branch&Bound a sixth-degree polynomial was obtained, whereas a third-degree polynomial was found for Threshold Accepting. Notice that for small instances the first outperforms the second, whereas for large instances the situation is reversed; consequently, there is a crossing point between the two functions.
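The crossing point can be located numerically from the fitted polynomials. The coefficients below are transcribed from the table of efficiency polynomials above (the operator lost before the cubic Threshold Accepting term is read here as a minus sign, an assumption); the scan range and step are illustrative choices, not values from the paper.

```python
# Fitted efficiency polynomials, low-to-high coefficient order.
TA = [-0.31458651, 6.7247624e-5, 4.3424044e-10, -6.1504908e-17]   # degree 3
BB = [0.0036190847, 4.4856655e-4, -4.4942872e-7, 2.5914131e-10,
      -5.4339889e-14, 2.5641303e-18, 2.4019059e-22]               # degree 6

def horner(coeffs, n):
    """Evaluate a polynomial at n using Horner's rule."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * n + c
    return acc

def crossings(f, g, lo, hi, step):
    """Sizes where f - g changes sign: candidate crossing points."""
    out, prev = [], None
    n = lo
    while n <= hi:
        d = f(n) - g(n)
        if prev is not None and d * prev < 0:
            out.append(n)
        prev, n = d, n + step
    return out

points = crossings(lambda n: horner(BB, n), lambda n: horner(TA, n),
                   1, 20000, 1)
print(points)
```

Each sign change could then be refined by bisection; the resulting size separates the region where one algorithm's predicted time undercuts the other's.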
Fig. 1. Graph of the Efficiency Functions for the Branch&Bound and Threshold Accepting Algorithms (fitted curves together with experimental results from a B&B run and from a set of TA runs)

Fig. 2. Graph of the Efficacy Functions for the Threshold Accepting Algorithm (worst, average, and best cases)
Due to the large spread of the efficacy results for the Threshold Accepting algorithm, three polynomials were determined (Figure 2): for the best, average, and worst cases, the resulting polynomials were of first, third, and first degree, respectively.
5 Final Remarks
This paper shows that, by finding the performance functions that characterize exact and heuristic algorithms, it is possible to automatically determine the most adequate algorithm for a given problem size. The characterization also helps us to better understand algorithm behavior: for example, it defines the regions in which one algorithm outperforms the other, as opposed to traditional approaches, which oversimplify algorithm evaluation by claiming that one algorithm outperforms the other in all cases, which is not always true.
For demonstration purposes, the performance functions for Branch&Bound and Threshold Accepting were obtained when applied to the solution of the database distribution problem modeled by DFAR. The experimental results show that Branch&Bound is satisfactory for small problems, Threshold Accepting is promising for large problems, and there exists a crossing point that divides the two regions.
Future research plans include the following: exploring probability distributions for characterizing the behavior of the two types of algorithms, exact and heuristic, and integrating this work with another model, developed by us, for selecting the best among different heuristic algorithms.
References
1. K. G. Murty: Operations Research: Deterministic Optimization Models. Prentice Hall, New Jersey (1995) 581.
2. R. K. Ahuja, A. Kumar, K. Jha: Exact and Heuristic Algorithms for the Weapon Target Assignment Problem. Working paper (2003).
3. J. Gu: Efficient Local Search for Very Large-Scale Satisfiability Problems. SIGART Bulletin 3 (1992) 8-12.
4. B. Selman, H. A. Kautz, B. Cohen: Noise Strategies for Improving Local Search. Proceedings of AAAI-94, MIT Press (1994) 337-343.
5. C. Papadimitriou, K. Steiglitz: Combinatorial Optimization: Algorithms and Complexity. Prentice-Hall, New Jersey (1982) 496.
6. B. J. Borghetti: Inference Algorithm Performance and Selection Under Constrained Resources. M.S. Thesis, AFIT/GCS/ENG/96D-05 (1996).
7. J. N. Hooker: Testing Heuristics: We Have It All Wrong. Journal of Heuristics (1996).
8. H. H. Hoos, T. Stützle: Systematic vs. Local Search for SAT. Journal of Automated Reasoning, Vol. 24 (2000) 421-481.
9. I. P. Gent, E. MacIntyre, P. Prosser, T. Walsh: The Scaling of Search Cost. Proceedings of AAAI-97, MIT Press (1997) 315-320.
10. D. S. Johnson, M. A. Trick (eds.): Cliques, Coloring, and Satisfiability. DIMACS Series in Discrete Mathematics and Theoretical Computer Science, Vol. 16, AMS (1996).
11. A. Davenport: A Comparison of Complete and Incomplete Algorithms in the Easy and Hard Regions. Workshop on Studying and Solving Really Hard Problems, CP-95 (1995).
12. J. Pérez, R. A. Pazos, J. Frausto, D. Romero, L. Cruz: Vertical Fragmentation and Allocation in Distributed Databases with Site Capacity Restrictions Using the Threshold Accepting Algorithm. Lecture Notes in Computer Science, Vol. 1793. Springer-Verlag (2000) 75-81.
13. J. Pérez, R. A. Pazos, D. Romero, R. Santaolaya, G. Rodríguez, V. Sosa: Adaptive and Scalable Allocation of Data-Objects in the Web. Lecture Notes in Computer Science, Vol. 2667. Springer-Verlag (2003) 134-143.
14. J. Pérez, R. Pazos, D. Romero, L. Cruz: Análisis de Complejidad del Problema de la Fragmentación Vertical y Reubicación Dinámica en Bases de Datos Distribuidas. 7th International Congress on Computer Science Research, Cd. Madero (2000) 63-70.
15. L. Morales, R. Garduño, D. Romero: The Multiple-Minima Problem in Small Peptides Revisited: The Threshold Accepting Approach. Journal of Biomolecular Structure & Dynamics, Vol. 9, No. 5 (1992) 951-957.
16. J. Pérez, R. A. Pazos, L. Vélez, G. Rodríguez: Automatic Generation of Control Parameters for the Threshold Accepting Algorithm. Lecture Notes in Computer Science, Vol. 2313. Springer-Verlag (2002) 119-127.
17. R. Scheaffer, J. McClave: Probabilidad y Estadística para Ingeniería. Tr. V. González. Grupo Editorial Iberoamérica (1990) 690.
18. R. Walpole, R. Myers: Probabilidad y Estadística. Tr. G. Maldonado. McGraw-Hill (1990) 797.
19. L. Cruz: Automatización del Diseño de la Fragmentación Vertical y Ubicación en Bases de Datos Distribuidas Usando Métodos Heurísticos y Exactos. M.S. Thesis, Instituto Tecnológico y de Estudios Superiores de Monterrey (1999) 116.