Automatic Generation of Floating-Point Test Data
Abstract-For numerical programs, or more generally for programs with floating-point data, it may be that large savings of time and storage are made possible by using numerical maximization methods instead of symbolic execution to generate test data. Two examples, a matrix factorization subroutine and a sorting method, illustrate the types of data generation problems that can be successfully treated with such maximization techniques.

Index Terms-Automatic test data generation, branching, data constraints, execution path, software evaluation systems.

INTRODUCTION

Research in program evaluation and verification has only rarely (e.g., [1]) begun with the explicit requirement that the program deal with real numbers as opposed to integers. This may be an oversight since there are theoretical results which suggest the desirability of this assumption. Specifically, a general procedure of Tarski [2] shows that certain properties, undecidable (in the technical sense) for "integer" programs, are decidable for "numerical" programs. Examples of this phenomenon arise when one asks if there exists a set of data driving execution of a certain kind of program down a given path.

Moreover, there is practical evidence supporting the case for automatic verification of special properties of numerical programs. Proving "numerical correctness," i.e., verifying a satisfactory level of insensitivity to rounding error, is sometimes much easier than proving that the program performs properly in exact arithmetic. The ideal and contaminated results can often be meaningfully compared with only minimal understanding of the program. Simple, portable, general-purpose software [3], [4] can easily provide answers which have eluded specialists in roundoff analysis. This work [3], [4] also shows the possible advantage of using, e.g., numerical maximization methods to do the verification, avoiding the alternative of using, e.g., computer symbolic manipulation [5], [6].

This paper considers automatic test data generation, a problem which arises in such fields as automatic software evaluation systems [7] and in automatic roundoff analysis [4]. Our contention is that automatic test data generation is sometimes best formulated and solved as a numerical maximization problem. The reader should be warned that our scheme is only "heuristic" in that it is not guaranteed to produce a set of test data executing a given path whenever such data exist. (On the other hand, we know of no guaranteed data generation scheme whose execution time does not, in the worst case, grow at least exponentially with the length of the execution path.)

Manuscript received September 9, 1975; revised February 23, 1976. This work was supported in part by the National Science Foundation under Grant GJ-42968.
W. Miller is with the Department of Computer Science, Pennsylvania State University, University Park, PA 16802.
D. L. Spooner was with the Department of Computer Science, Pennsylvania State University, University Park, PA 16802. He is now with the Department of Computer Science, Cornell University, Ithaca, NY.

NUMERICAL MAXIMIZATION METHODS FOR GENERATING TEST DATA

Given the problem of generating floating-point test data, our approach begins by fixing all integer parameters of the given program (e.g., the dimensions of the data in a matrix program or the number of iterations in an iterative method) so that the only unresolved decisions controlling program flow are comparisons involving real values. Then, as will be seen, an execution path takes the form of a straight-line program of floating-point assignment statements interspersed with "path constraints" of the form c_i = 0, c_i > 0, or c_i ≥ 0. Each c_i is a data-dependent real value possibly defined in terms of previously computed results. For instance, a path which takes the true branch of a test "IF(X.NE.Y)" has a constraint c > 0, where, e.g., c = ABS(X - Y) or c = (X - Y)^2. (We will not discuss in any detail the philosophical and practical difficulties associated with equality tests when computation is contaminated by rounding error. Nor will we consider the problem of (automatically or manually) generating the straight-line program; we have nothing new to add on this subject.)

The situation is clarified by an example. Consider the following subprogram of Moler [8].

      SUBROUTINE DECOMP(N,NDIM,A,IP)
      REAL A(NDIM,NDIM),T
      INTEGER IP(NDIM)
C
C  MATRIX TRIANGULARIZATION BY GAUSSIAN ELIMINATION.
C  INPUT..
C     N = ORDER OF MATRIX.
C     NDIM = DECLARED DIMENSION OF ARRAY A.
C     A = MATRIX TO BE TRIANGULARIZED.
C  OUTPUT..
C     A(I,J), I.LE.J = UPPER TRIANGULAR FACTOR, U.
C     A(I,J), I.GT.J = MULTIPLIERS = LOWER TRIANGULAR FACTOR, I-L.
C     IP(K), K.LT.N = INDEX OF K-TH PIVOT ROW.
C     IP(N) = (-1)**(NUMBER OF INTERCHANGES) OR 0.
C  USE 'SOLVE' TO OBTAIN SOLUTION OF LINEAR SYSTEM.
C  DETERM(A) = IP(N)*A(1,1)*A(2,2)*...*A(N,N).
C  IF IP(N)=0, A IS SINGULAR, SOLVE WILL DIVIDE BY ZERO.
C  INTERCHANGES FINISHED IN U, ONLY PARTLY IN L.
224 IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, SEPTEMBER 1976
C
      IP(N) = 1
      DO 6 K = 1,N
      IF(K.EQ.N) GO TO 5
      KP1 = K+1
      M = K
      DO 1 I = KP1,N
      IF(ABS(A(I,K)).GT.ABS(A(M,K))) M = I
    1 CONTINUE
      IP(K) = M
      IF(M.NE.K) IP(N) = -IP(N)
      T = A(M,K)
      A(M,K) = A(K,K)
      A(K,K) = T
      IF(T.EQ.0.) GO TO 5
      DO 2 I = KP1,N
    2 A(I,K) = -A(I,K)/T
      DO 4 J = KP1,N
      T = A(M,J)
      A(M,J) = A(K,J)
      A(K,J) = T
      IF(T.EQ.0.) GO TO 4
      DO 3 I = KP1,N
    3 A(I,J) = A(I,J) + A(I,K)*T
    4 CONTINUE
    5 IF(A(K,K).EQ.0.) IP(N) = 0
    6 CONTINUE
      RETURN
      END

... c1, ..., c11 positive.

      c1 = ABS(A(2,1)) - ABS(A(1,1)) > 0
      c2 = ABS(A(3,1)) - ABS(A(2,1)) > 0
      T = A(3,1)
      A(3,1) = A(1,1)
      A(1,1) = T
      c3 = ABS(T) > 0
      A(2,1) = -A(2,1)/T
      A(3,1) = -A(3,1)/T
      T = A(3,2)
      A(3,2) = A(1,2)
      A(1,2) = T
      c4 = ABS(T) > 0
      A(2,2) = A(2,2) + A(2,1)*T
      A(3,2) = A(3,2) + A(3,1)*T
      T = A(3,3)
      A(3,3) = A(1,3)
      A(1,3) = T
      c5 = ABS(T) > 0
      A(2,3) = A(2,3) + A(2,1)*T
      A(3,3) = A(3,3) + A(3,1)*T
      c6 = ABS(A(1,1)) > 0
      c7 = ABS(A(3,2)) - ABS(A(2,2)) > 0
      T = A(3,2)
      A(3,2) = A(2,2)
      A(2,2) = T
      c8 = ABS(T) > 0
      A(3,2) = -A(3,2)/T
      T = A(3,3)
      A(3,3) = A(2,3)
      A(2,3) = T
      c9 = ABS(T) > 0
      A(3,3) = A(3,3) + A(3,2)*T
      c10 = ABS(A(2,2)) > 0
      c11 = ABS(A(3,3)) > 0

One method of test data generation [9]-[11] begins with symbolic execution of the program to find explicit representations for the c_i in terms of the data. For instance, to write c7 in this form we express the recomputed values A(2,2) and A(3,2) in terms of the original A(I,J), getting

$$c_7 = \mathrm{ABS}\bigl(A(1,2) - A(1,1) \cdot A(3,2) \cdot A(3,1)^{-1}\bigr) - \mathrm{ABS}\bigl(A(2,2) - A(2,1) \cdot A(3,2) \cdot A(3,1)^{-1}\bigr).$$

... such that f(c_1, ..., c_m) < 0 if at least one c_i is strictly negative and f(c_1, ..., c_m) > 0 if all c_i are strictly positive (by continuity this implies that f(c_1, ..., c_m) ≥ 0 whenever c_i ≥ 0 for all i). For instance, using the notation c^- = min(c, 0), pick one of

$$f_\infty(c_1, \ldots, c_m) = \min(c_1, \ldots, c_m)$$

$$f_2(c_1, \ldots, c_m) = \begin{cases} -\left( \sum_{i=1}^{m} (c_i^-)^2 \right)^{1/2} & \text{if at least one } c_i \text{ is negative} \\ \min(c_1, \ldots, c_m) & \text{if no } c_i \text{ is negative} \end{cases}$$

$$f_1(c_1, \ldots, c_m) = \begin{cases} \sum_{i=1}^{m} c_i^- & \text{if at least one } c_i \text{ is negative} \\ \min(c_1, \ldots, c_m) & \text{if no } c_i \text{ is negative.} \end{cases}$$
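As an illustration only, the straight-line program for the N = 3 path of DECOMP and two of the functions f can be sketched in Python. This is a transcription made for this example, not the authors' software: the names `path_constraints`, `f_inf`, `f_1`, `objective`, and `hill_climb` are hypothetical, returning minus infinity when a pivot T is exactly zero is an assumption, and the random-perturbation search is only a crude stand-in for a real numerical maximization routine.

```python
import random


def path_constraints(a):
    """Return [c1, ..., c11] for a 3x3 matrix `a` (list of rows).

    Mirrors the straight-line program for the N = 3 path; keys are 1-based
    (row, column) pairs so each statement reads like the listing.
    """
    A = {(i + 1, j + 1): float(a[i][j]) for i in range(3) for j in range(3)}
    c = []
    c.append(abs(A[2, 1]) - abs(A[1, 1]))   # c1
    c.append(abs(A[3, 1]) - abs(A[2, 1]))   # c2
    T = A[3, 1]; A[3, 1] = A[1, 1]; A[1, 1] = T
    c.append(abs(T))                        # c3
    A[2, 1] = -A[2, 1] / T
    A[3, 1] = -A[3, 1] / T
    T = A[3, 2]; A[3, 2] = A[1, 2]; A[1, 2] = T
    c.append(abs(T))                        # c4
    A[2, 2] += A[2, 1] * T
    A[3, 2] += A[3, 1] * T
    T = A[3, 3]; A[3, 3] = A[1, 3]; A[1, 3] = T
    c.append(abs(T))                        # c5
    A[2, 3] += A[2, 1] * T
    A[3, 3] += A[3, 1] * T
    c.append(abs(A[1, 1]))                  # c6
    c.append(abs(A[3, 2]) - abs(A[2, 2]))   # c7
    T = A[3, 2]; A[3, 2] = A[2, 2]; A[2, 2] = T
    c.append(abs(T))                        # c8
    A[3, 2] = -A[3, 2] / T
    T = A[3, 3]; A[3, 3] = A[2, 3]; A[2, 3] = T
    c.append(abs(T))                        # c9
    # Statement 3 of DECOMP with I = 3, J = 3, K = 2.
    A[3, 3] += A[3, 2] * T
    c.append(abs(A[2, 2]))                  # c10
    c.append(abs(A[3, 3]))                  # c11
    return c


def f_inf(c):
    """Smallest constraint value."""
    return min(c)


def f_1(c):
    """Sum of the negative parts if any c_i < 0, otherwise min(c)."""
    if any(ci < 0 for ci in c):
        return sum(min(ci, 0.0) for ci in c)
    return min(c)


def objective(x, f=f_1):
    """Score a flat list of 9 matrix entries (row-major order)."""
    try:
        return f(path_constraints([x[0:3], x[3:6], x[6:9]]))
    except ZeroDivisionError:
        # Assumption: an exactly zero pivot is scored as the worst value.
        return float("-inf")


def hill_climb(obj, x0, steps=2000, scale=1.0, seed=0):
    """Keep a random Gaussian perturbation only when it raises obj."""
    rng = random.Random(seed)
    best, best_val = list(x0), obj(x0)
    for _ in range(steps):
        cand = [xi + rng.gauss(0.0, scale) for xi in best]
        val = obj(cand)
        if val > best_val:
            best, best_val = cand, val
    return best, best_val
```

For example, `path_constraints([[1, 5, 3], [2, 1, 1], [4, 1, 2]])` yields eleven strictly positive values, so that matrix drives DECOMP down the path, whereas the identity matrix hits a zero pivot and is scored minus infinity by `objective`. A search such as `hill_climb(objective, x0)` stops being needed once the returned value is positive; nothing in this sketch guarantees that it gets there.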
MILLER AND SPOONER: GENERATION OF FLOATING-POINT TEST DATA 225
between four and seven seconds of CPU time on our IBM 370/168, at ten cents per second. However, this is in some ways pessimistic since we neither took the trouble nor had the appropriate software to explicitly generate the straight-line program. Instead, we essentially executed the Fortran program each of the, e.g., 1147 times to determine the constraints and the assignment statements along the given path.)

ACKNOWLEDGMENT

The authors wish to thank the referee who pointed out reference [9] and made other helpful suggestions. The final form of our Heapsort example was prompted by J. King's informal conjecture that our methods are not much more efficient than random generation of test data until a set is found which causes the given path to be executed.

REFERENCES

[1] T. Hull et al., "The correctness of numerical algorithms," in Proc. ACM Conf. Proving Assertions about Programs, New Mexico State University, Jan. 6-7, 1972.
[2] A. Tarski, A Decision Method for Elementary Algebra and Geometry. Berkeley, CA: University of California Press, 1951.
[3] W. Miller, "Software for roundoff analysis," Ass. Comput. Mach. Trans. Math. Software, vol. 1, pp. 108-128, 1975.
[4] W. Miller and D. Spooner, "Software for roundoff analysis, II," to be published in Ass. Comput. Mach. Trans. Math. Software.
[5] W. Kahan, "One numerical analyst's experience with one symbol manipulator," SIAM Rev., vol. 16, p. 129, 1974.
[6] D. Stoutemyer, "Automatic error analysis using computer algebraic manipulation," submitted for publication.
[7] C. Ramamoorthy and S.-B. Ho, "Testing large software with automated software evaluation systems," IEEE Trans. Software Eng., vol. 1, pp. 46-58, 1975.
[8] C. Moler, "Algorithm 423, linear equation solver," Commun. Ass. Comput. Mach., vol. 15, p. 274, Apr. 1972.
[9] R. Boyer, B. Elspas, and K. Levitt, "SELECT-A formal system for testing and debugging programs by symbolic execution," in Proc. 1975 Int. Conf. Reliable Software; also SIGPLAN Notices, vol. 10, pp. 234-245, June 1975.
[10] L. Clarke, "A system to generate test data and symbolically execute programs," Dept. Comp. Sci., Univ. of Colorado, Rep. CU-CS-060-75, Feb. 1975.
[11] J. King, "Symbolic execution and program testing," submitted for publication.
[12] P. Gill and W. Murray, Ed., Numerical Methods for Constrained Optimization. New York: Academic, 1974.
[13] W. Swann, "Direct search methods," in Numerical Methods for Unconstrained Optimization, W. Murray, Ed. New York: Academic, 1972, pp. 13-28.
[14] W. Swann, "Constrained optimization by direct search," in Numerical Methods for Unconstrained Optimization, W. Murray, Ed. New York: Academic, 1972, pp. 191-217.
[15] A. Aho, J. Hopcroft, and J. Ullman, The Design and Analysis of Computer Algorithms. Reading, MA: Addison-Wesley, 1974.

Webb Miller was born in Walla Walla, WA, on November 30, 1943. He received the B.S. degree in mathematics from Whitman College, Walla Walla, WA, in 1966 and the Ph.D. degree in mathematics from the University of Washington, Seattle, in 1969.
He is currently an Associate Professor in the Department of Computer Science, Pennsylvania State University, State College, and is trying to find time to pursue his interests in rounding error analysis and computational complexity.

David L. Spooner was born in State College, PA, on April 13, 1953. He received the B.S. degree in computer science from Pennsylvania State University, University Park, PA, in 1975.
He is currently a graduate student at Cornell University, Ithaca, NY. His major interests are in the areas of programming languages and compiler design.
Mr. Spooner is a member of the Association for Computing Machinery, Phi Kappa Phi, and Upsilon Pi Epsilon.