Control Structure Abstractions of The Backtracking Programming Technique

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, VOL. SE-2, NO. 4, DECEMBER 1976, p. 285

in Structured Programming, Dahl, Dijkstra, and Hoare. New in English literature from the University of
York: Academic, 1972. Texas, Austin, and the Ph.D. degree in compu-
[4] B. H. Liskov, "An introduction to CLU," Computation Struc- ter science from Carnegie-Mellon University,
tures Group Memo 136, Lab. Comput. Science, M.I.T., Cam- Pittsburgh, PA.
bridge, MA, Feb. 1976. Since 1973 she has been a member of the
[5] W. A. Wulf, R. London, and M. Shaw, "Abstraction and verifica- Carnegie-Mellon University Faculty and is cur-
tion in Alphard," Dep. Comput. Science, Carnegie-Mellon Univ., rently an Assistant Professor of Computer Sci-
Pittsburgh, PA, 1976. ence. Her current research is in the areas of
(6] J. B. Dennis and E. C. van Horn, "Programming for multipro- security enforcement in computer systems,
grammed computations," Commun. Ass. Comput. Mach., vol. 9, software engineering, and distributed compu-
pp. 143-155, Mar. 1966. tation systems.
[7] B. W. Lampson, "Protection," in Proc. Fifth Ann. Princeton
Conf Information Sciences and Systems, Princeton Univ., Prince-
ton, NJ, 1971, pp. 437-443.
[8] A. K. Jones, "Protection in programming systems," Ph.D disser-
tation, Carnegie-Mellon Univ., Pittsburgh, PA, 1973.
[9] J. McCarthy et al., LISP 1.5 Programmer's Manual. Cambridge,
MA: M.I.T. Press, 1962.
[10] N. Wirth, "The programming language PASCAL," Acta Infor- Barbara H. Liskov received the B.A. degree in
matica,vol. 1,pp. 335-363, 1971. mathematics from the University of California,
[11] A. K. Jones and B. H. Liskov, "An access control facility for pro- Berkeley, and the M.S. and Ph.D. degrees in
gramming languages," Computation Structures Group Memo 137, computer science from Stanford University,
Lab. Comput. Science, M.I.T., Cambridge, MA, Apr. 1976. Stanford, CA.
[12] A. K. Jones and W. A. Wulf, "Toward the design of a secure sys- From 1968 to 1972 she was associated with
tem," Software Practice and Experience, vol. 5, pp. 321-336, the Mitre Corporation, Bedford, MA, where
1975. she participated in the design and implementa-
tion of the Venus Machine and the Venus Op-
erating System. She is presently Associate Pro-
fessor of Electrical Engineering and Computer
Science at the Massachusetts Institute of Technology, Cambridge. Her
Anita K. Jones was born in Houston, TX. She received the A.B. degree research interests include programming methodology and the design of
in mathematics from Rice University, Houston, TX, the M.A. degree languages and systems to support structured programming.

Control Structure Abstractions of the Backtracking Programming Technique
SUSAN L. GERHART AND LAWRENCE YELOWITZ

Abstract-Backtracking is a well-known technique for solving combinatorial problems. It is of interest to programming methodologists because 1) correctness of backtracking programs may be difficult to ascertain experimentally and 2) efficiency is often of paramount importance. This paper applies a programming methodology, which we call control structure abstraction, to the backtracking technique. The value of control structure abstraction in the context of correctness is that proofs of general properties of a class of programs with similar control structures are separated from proofs of specific properties of individual programs of the class. In the context of efficiency, it provides sufficient conditions for correctness of an initial program which may subsequently be improved for efficiency while preserving correctness. The paper provides several abstract variations of backtracking programs, along with correctness statements and assertions, and an overall parameterization of the backtracking technique to facilitate selection of the appropriate variant abstraction for a concrete problem. The methodology is illustrated on the eight queens, knight's tour, malicious secretary, and good sequences problems. Also discussed are the amount of work involved in the control structure abstraction approach for this particular application area, its relationship to the data structure abstraction method, and its possible application to other areas.

Index Terms-Abstraction, backtracking, correctness-preserving transformations, program correctness, programming methodology, program schema.

Manuscript received May 1, 1976; revised August 15, 1976. This work was supported in part by NSF Grant MCS75-08146.
S. L. Gerhart is with the Department of Computer Science, Duke University, Durham, NC 27706.
L. Yelowitz is with the Department of Computer Science, University of Pittsburgh, Pittsburgh, PA 15260.

I. INTRODUCTION
THE backtracking programming technique has long been known and has been used both for solutions to practical problems and to intriguing puzzles and for programming pedagogy [3]-[5]. Wells [1] provides an extensive survey of backtracking as a fundamental technique of combinatorial computing. Several recent papers study various control
representations: recursion in Wirth [5], macros in Bitner and Reingold [7], and coroutines in Hanson [6]. Knuth [2] investigates a strategy for the important problem of run-time estimation.

The term "backtracking" is not defined uniformly in the literature. An example definition [7] is

    procedure backtrack (vector, i)
      if vector is a solution then record it;
      compute Si, a subset of fixed set Ai;
      forall X∈Si do backtrack (vector~<X>, i+1) od

which is called by backtrack (Λ, 1) where ~ denotes sequence catenation, <...> indicates a sequence, and Λ is the empty sequence. Knuth [2] requires fixed length solution vectors and an explicit intermediate test which defines the set Si. We prefer a more general definition using two functions

    Solution: sequence of T → {true, false}   (T, a data type)
    Generate: sequence of T → set of T

and the procedure

    procedure Backtrack (S: sequence of T)
      if Solution (S) then Action (S);
      forall X∈Generate (S) do Backtrack (S~<X>) od        (*)

where Action (S) may record S either selectively if the goal is the first or optimal solution or unselectively if the goal is all solutions. The term "backtrack" comes from the behavior of the procedure when the set Generate (S) becomes empty. The recursion then unravels either back to the call for sequence S' where an untried element of Generate (S') remains or to the first call.

Although the backtracking technique has been extensively studied and used, the following questions have not been fully answered.
1) What is the correctness criterion for a backtracking program?
2) Is there a standard set of proved correct variations which cover most of the known and imaginable applications to specific problems?
3) If there are such variations, how may they be used effectively and efficiently?

Question 1 is easily answered by defining some notation and specifications. In answer to question 2, our study of backtracking programs in the literature shows the "space" to be parameterized as follows.
1) Goal: all, first (if any), or optimal solutions;
2) Structure of Solution;
3) Structure of Generate;
4) Control structure representation;
5) Use of auxiliary data structures to compute Solution and Generate.

We will show certain combinations of parameter values that give the most common variations. General forms which we prove correct will be used to prove correctness of these standard variations by applying correctness-preserving transformations or by using similarity of proofs. These variations will then be specialized to several examples. Our purpose is to compare the tasks with and without the preceding abstraction. Finally, we will discuss this case study of the backtracking technique in the context of what we will call the control structure abstraction methodology. This methodology consists of organizing knowledge about a particular programming technique or task in terms of schemas (abstract programs) for which correctness is partially proved or derived by application of correctness-preserving transformations on other known correct schemas.

The overall purpose of the paper is twofold, 1) to advance the state of the knowledge about backtracking, especially for nonexperts, and 2) to further define and develop the Control Structure Abstraction Methodology.

II. ORGANIZATION OF BACKTRACKING KNOWLEDGE
We defined (*) as a general form of backtracking. Our next task is to determine what it means to say that this procedure is "correct." Ideally, the order would be reversed; we would first specify what we want to accomplish by backtracking and then find the procedure (*), but pragmatically we are accustomed to 1) think informally about what a program should do, 2) then formally state the program, and 3) sometimes state formally what the program does. It is step 3) that we are now concerned with.

Fig. 1 gives two definitions which are fundamental to backtracking. Successors (S) is S together with all the Successors of S with any element of Generate (S) appended to it. SolutionSuccessors (S) is simply the subset of Successors which are also Solutions. SubSuccessors (S, G) is the subset of Successors (S) which come from Successor applied to S with only elements of G appended to it. By convention, we systematically use QSubSuccessors (S, Generate (S)) to define QSuccessors (S).

    Let G ⊆ Generate(S).

    DEFINITIONS
    SubSuccessors(S,G) = {S} ∪ ⋃ (X∈G) Successors(S~<X>)
    SolutionSubSuccessors(S,G) = {s ∈ SubSuccessors(S,G) | Solution(s)}
    SavedSubSolutions(S,G) ≡
        AllSolutions = AllSolutions' ∪ SolutionSubSuccessors(S,G)
    SearchedSubSuccessors(S,G) ≡
        (0 = Cardinality(FirstSolution) ∧ { } = SolutionSubSuccessors(S,G))
        ∨ (1 = Cardinality(FirstSolution) ∧ FirstSolution ⊆ SolutionSuccessors(S))
    OptimalOfSubSuccessors(S,G) ≡
        (∀s ∈ SubSuccessors(S,G)) (f(Best) ≥ f(s))

    CONVENTION
    QSuccessors(S) = QSubSuccessors(S, Generate(S))
    where Q is any of the qualifiers: blank, Solution, Saved, Searched, OptimalOf
    Example: Successors(S) = SubSuccessors(S, Generate(S))

    Fig. 1. Backtracking notation.

The predicates of Fig. 1 are used in Fig. 2(b) to express assertions about the three forms of backtracking corresponding to the goals all, first (if any), and optimal solutions. Fig. 2(a) shows the schema and its call. Fig. 2(b) shows the interpretations of the abstract entities Btform, Exitform, Action, Assertion, and Initialization which yield programs for the
GERHART AND YELOWITZ: CONTROL STRUCTURE ABSTRACTIONS 287

    procedure Btform(S: sequence of T);
      entry: true
      exit: Exitform(S)
    begin X: T;
      if Solution(S) then Action(S) fi;
      forall X∈Generate(S)
        asserting Assertion
        do Btform(S~<X>) od
    end

    CALL
      true {Initialization; Btform(Λ)} Exitform(Λ)

    (a)

    Goal            All                         First                             Optimal
    Btform          BTAGGPN                     BTFGGPN                           BTOGGPN
    Exitform        SavedSolutions(S)           SearchedSuccessors(S)             OptimalOfSuccessors(S)
    Action          AllSolutions :=             if FirstSolution = { }            if f(Best) < f(S)
                      AllSolutions ∪ {S}          then FirstSolution := {S} fi      then Best := S fi
    Assertion       SavedSubSolutions(S,Γ)      SearchedSubSuccessors(S,Γ)        OptimalOfSubSuccessors(S,Γ)
    Initialization  AllSolutions: set of        FirstSolution: set of             Best: sequence of T
                      sequence of T               sequence of T                     initially Λ
                      initially { }               initially { }

    (b)

    Fig. 2. (a) General form of recursive backtracking procedures. (b) Recursive goal variations.
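
The schema of Fig. 2(a) and the three Action interpretations of Fig. 2(b) can be approximated in Python as follows. This is only a sketch under stated assumptions: sequences are modelled as tuples of integers, the names `btform`, `all_solutions`, `first_solution`, and `optimal_solution` are illustrative, and the toy `generate` and `solution` functions (bit strings of length 3 with no two adjacent ones) are invented for the example and do not come from the paper.

```python
from typing import Callable, Iterable, List, Tuple

Seq = Tuple[int, ...]  # stands in for "sequence of T"; T = int is an assumption


def btform(solution: Callable[[Seq], bool],
           generate: Callable[[Seq], Iterable[int]],
           action: Callable[[Seq], None],
           s: Seq = ()) -> None:
    """Visit s and every successor of s, applying action to each solution."""
    if solution(s):
        action(s)
    for x in generate(s):
        btform(solution, generate, action, s + (x,))  # recurse on s extended by x


# Toy interpretation (hypothetical): bit strings of length 3 without "11".
def generate(s: Seq) -> Iterable[int]:
    return (0, 1) if len(s) < 3 else ()


def solution(s: Seq) -> bool:
    return len(s) == 3 and all(not (a and b) for a, b in zip(s, s[1:]))


def all_solutions() -> List[Seq]:
    found: List[Seq] = []                 # AllSolutions, initially empty
    btform(solution, generate, found.append)
    return found


def first_solution() -> List[Seq]:
    first: List[Seq] = []                 # FirstSolution, initially empty

    def action(s: Seq) -> None:
        if not first:                     # record only the first solution met
            first.append(s)

    btform(solution, generate, action)    # keeps searching, as BTFGGPN does
    return first


def optimal_solution(f: Callable[[Seq], int]) -> Seq:
    best: List[Seq] = [()]                # Best, initially the empty sequence

    def action(s: Seq) -> None:
        if f(best[0]) < f(s):
            best[0] = s

    btform(solution, generate, action)
    return best[0]
```

Like BTFGGPN, `first_solution` continues the search after the first solution has been recorded; an early exit would be a separate refinement.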

three goals. In Fig. 2(b), BTFGGPN (a backtracking mechanism searching for the first general Solution, using a general Generator via a recursive procedure, with no auxiliary data structure) is correct but inefficient, since there is no need to continue searching once the first solution has been found. By using correctness-preserving transformations [15], it is straightforward to make BTFGGPN terminate when the first solution has been found, and to guarantee the correctness of the more efficient version.

We will skim over details of semantics and correctness formalisms. P{R}Q is the notation of Hoare [8] with the interpretation "Q will be true after proper termination of R if P was true before execution of R." Primed variables denote values upon entry. The verification rule for the forall X∈G statement is given in Hoare [9]. The nonprogram variable Γ denotes the set of all elements so far processed by the loop; initially it is { } and terminally it is G. Later we will use the exit-in-the-middle loop construct described in Knuth [10]. Other statements have the usual semantics and verification rules. The reader should be able to reason informally through the correctness arguments. Formal proofs proceed by deriving a list of properties of the definitions which form the basis for verification. Notation for sequence operations appears in the Appendix.

We now claim to have found the variations for the goal parameter, expressed their correctness criteria, and developed the basis for their proofs. As always, the difficult part of the task was finding the right definitions and assertions, those shown in Fig. 1. Once we had mastered one, the others came easily. It may seem disturbing that we do not know what Solution and Generate do, only how the algorithm uses them. However, we know of no better way to describe the effect of backtracking, namely a search through all the possible sequences produced by recursively applying Generate for those which are Solutions. The point is to exploit this abstraction.

Fig. 3 shows the overall parameterization for backtracking with some, but certainly not all, parameter values. The names for procedures are formed from the letters indicating parameter values. For example, BTAFFSN is the procedure which searches for all fixed length solutions produced by a Generate with a fixed set, written iteratively with a stack and no data structure. Conceptually, this space is quite large; commonly, only a few combinations occur. We will consider some of the more important parameter values.

    A procedure name is BT followed by one letter for each parameter, in the
    order Search Goal, Solution, Generator, Control, Auxiliary data.

    Search Goal:     A=All, F=First, O=Optimal
    Solution:        G=General, T=Truncated, F=Fixed Length,
                     B=Bounded Length (truncated), Y=Bounded Length (nontruncated),
                     A=Acceptance (truncated), Z=Acceptance (nontruncated)
    Generator:       G=General, K=Fixed size, F=Fixed set
    Control:         P=Recursive procedure, F=Recursive function,
                     S=Iterative Stack, N=Nested
    Auxiliary data:  Y=Yes, N=No

    Fig. 3. Parameters and procedure names.

Selection of control structures is somewhat dependent on the intended use of the procedure and on implementation constraints. It is easy to translate the procedures of Fig. 2 into functions, e.g., by adding a local variable AllSolutions to BTAGGPN which returns a value satisfying AllSolutions=SolutionSuccessors (S). The proofs of the functions are similar. It is not as easy to translate into iterative forms since the backtracking technique is inherently recursive. However, this implementation constraint is so frequent and the run-time efficiency is so important that it must be considered. Fig. 4(b) shows an iterative version, BTAGGSN, which uses G as a stack to contain the unused Generated sets for the initial parts of the sequence S. The assertions are monstrous since they must summarize the entire history of an inherently recursive process. Entirely different approaches are taken in
    Solution    Sufficient Conditions for SolutionSuccessors(S) ⊆ {S}
    T           Solution(S)
    F, Y        Length(S)=N
    B           Solution(S) ∨ Length(S)=N
    Z           ¬Accept(S)
    A           ¬Accept(S) ∨ Solution(S)

    Solution    Necessary Conditions for Solution(S)
    F           Length(S)=N
    B, Y        Length(S) ≤ N
    A, Z        Accept(S)

    (a)

    procedure BTAGGSN
      returns AllSolutions: set of sequence of T
      entry: true
      exit: AllSolutions=SolutionSuccessors(Λ)
    begin G: sequence of set of T initially Λ,
          S: sequence of T initially Λ,
          AllSolutions: set of sequence of T initially { };
      loop
        [ if Solution(S) then
            AllSolutions:=AllSolutions ∪ {S} fi;
          G:=G~<Generate(S)>; ]                          (boxed part)
        loop asserting A do
          while S≠Λ and Last(G)={ } do
          G:=OtherThanLast(G); S:=OtherThanLast(S)
        repeat
        asserting A ∧ (S=Λ or Last(G)≠{ }) do
        while Last(G)≠{ } do
        S:=S~<ElementFrom(Last(G))>
      repeat
    end

    where A ≡ Length(G)=Length(S)+1
              ∧ (∀j | 1 ≤ j ≤ Length(G)) UnUsed(j) ⊆ Generate(I(j))
              ∧ (AllSolutions = ⋃ (j=1..Length(G)) SolutionSubSuccessors(I(j), Used(j)))
          UnUsed(j) = if j=Length(G) then G(j) else G(j) ∪ {S(j)}
          Used(j) = Generate(I(j)) − UnUsed(j)
          I(j) = Initial(j−1, S)

    (b)

    Parameter s: Truncated (T)
      Replacement in BTAGGSN for BTAsGSN:
        if Solution(S) then
          AllSolutions:=AllSolutions ∪ {S};
          G:=G~<{ }>
        else G:=G~<Generate(S)>
        fie
      Transformation (applied to boxed part of Fig. 4(b)):
        P∧B1 {S1; S3} Q
        P {if B1 then S1 fi; S2} Q ⇒
        P {if B1 then S1; S3 else S2 fie} Q

    Parameter s: Fixed (F)
      Replacement in BTAGGSN for BTAsGSN:
        if Length(S)=N then
          if Solution(S) then
            AllSolutions:=AllSolutions ∪ {S} fi;
          G:=G~<{ }>
        else G:=G~<Generate(S)> fie
      Transformation (applied to BTATGSN):
        B1⊃B0, P∧B0∧¬B1 {S3} Q
        P {if B1 then S1; S3 else S2 fie} Q ⇒
        P {if B0 then if B1 then S1 fi; S3 else S2 fie} Q

    Parameter s: Acceptance test, Truncated (A)
      Replacement in BTAGGSN for BTAsGSN:
        if Accept(S) then
          if Solution(S) then
            AllSolutions:=AllSolutions ∪ {S};
            G:=G~<{ }>
          else G:=G~<Generate(S)> fie
        else G:=G~<{ }> fie
      Transformation (applied to BTATGSN):
        P∧¬B2 {S3} Q
        P {if B1 then S1; S3 else S2 fie} Q ⇒
        P {if B2 then if B1 then S1; S3 else S2 fie else S3 fie} Q

    (c) Replacements   (d) Transformations

    Fig. 4. (a) Properties of Solution. (b) Iterative version with stack G. (c) Modifications to BTAGGSN. (d) Correctness-preserving transformations supporting modifications.
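
The stack-based schema of Fig. 4(b), together with the acceptance-test replacement of Fig. 4(c), might be rendered in Python roughly as below. This is a hedged sketch, not the authors' program: lists stand in for the sequences S and G, the optional `accept` argument selects the replacement for parameter value A, and the toy `generate`, `no_11`, and `solution` functions are assumptions made only for the example.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Seq = Tuple[int, ...]


def bt_all_iterative(solution: Callable[[Seq], bool],
                     generate: Callable[[Seq], Iterable[int]],
                     accept: Optional[Callable[[Seq], bool]] = None) -> List[Seq]:
    """Collect all solutions using an explicit stack g of untried elements."""
    s: List[int] = []          # S, the current sequence
    g: List[List[int]] = []    # G, one list of untried elements per level
    found: List[Seq] = []
    while True:
        t = tuple(s)
        if accept is None:                     # general form (boxed part)
            if solution(t):
                found.append(t)
            g.append(list(generate(t)))
        elif accept(t):                        # acceptance-test replacement
            if solution(t):
                found.append(t)
                g.append([])                   # truncate below a solution
            else:
                g.append(list(generate(t)))
        else:
            g.append([])                       # rejected: no successors tried
        # Back up while the top of the stack is exhausted.
        while s and not g[-1]:
            g.pop()
            s.pop()
        if not g[-1]:                          # S is empty and nothing is left
            break
        s.append(g[-1].pop(0))                 # extend S by an untried element
    return found


# Toy interpretation (hypothetical): bit strings of length 3 without "11".
def generate(s: Seq) -> Iterable[int]:
    return (0, 1) if len(s) < 3 else ()


def no_11(s: Seq) -> bool:
    return all(not (a and b) for a, b in zip(s, s[1:]))


def solution(s: Seq) -> bool:
    return len(s) == 3 and no_11(s)
```

Within the inner loop, the list `g` is always one element longer than `s`, which mirrors the clause Length(G)=Length(S)+1 of assertion A.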

Hanson [6] and Floyd [11]. Some of the techniques for translating recursion to iteration in Burstall and Darlington [12] are of assistance.

There are many cases where Solution has specific properties which can be exploited to narrow the search. Fig. 4(a) identifies possible necessary conditions for a sequence to have no successors, except perhaps itself, which are solutions. A particularly important case is where there is a test predicate Accept(S) which decides whether any Successors(S) can be Solutions. Devising strong intermediate tests that filter out
    procedure BTAAFSN
      returns AllSolutions: set of sequence of T
      entry: true
      exit: AllSolutions=SolutionSuccessors(Λ)
    begin S: sequence of T initially Λ,
          AllSolutions: set of sequence of T initially { },
          g: set of T;
      loop
        if Accept(S) then
          if Solution(S) then
            AllSolutions:=AllSolutions ∪ {S}; Backup
          else S:=S~<g.first> fie
        else Backup fie
        asserting A while S≠Λ
      repeat
    end

    procedure Backup
      entry: B   exit: A
    begin
      loop asserting B
        while S≠Λ and Last(S)=g.last do
        S:=OtherThanLast(S)
      repeat;
      if S≠Λ then Last(S):=g.next(Last(S)) fi
    end

    where A ≡ AllSolutions = if S=Λ then SolutionSuccessors(Λ)
                             else PreviousSolutions
          B ≡ AllSolutions = SolutionSuccessors(S) ∪ PreviousSolutions
          PreviousSolutions =
              ⋃ (j=1..Length(S)) SolutionSubSuccessors(Initial(j−1,S), pred(S(j)))
          pred(k) = subset of g generated prior to k

    Fig. 5. Fixed set Generate version.
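
A Python approximation of the fixed-set schema of Fig. 5 is sketched below. It rests on assumptions: the fixed set g is passed as a tuple `alphabet` of distinct hashable elements in generation order, `g.next` is modelled by a dictionary, and the `accept` test shown is borrowed from the nonequal adjacent sequences problem of Section III purely as an illustration.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

Seq = Tuple[Hashable, ...]


def bt_fixed_set(accept: Callable[[Seq], bool],
                 solution: Callable[[Seq], bool],
                 alphabet: Sequence[Hashable]) -> List[Seq]:
    """All-solutions search where every Generate(S) is the same ordered set."""
    first, last = alphabet[0], alphabet[-1]
    nxt = dict(zip(alphabet, alphabet[1:]))   # plays the role of g.next
    s: List[Hashable] = []
    found: List[Seq] = []

    def backup() -> None:
        # Drop trailing elements that have no successor in the fixed set,
        # then advance the new last element, if any remains.
        while s and s[-1] == last:
            s.pop()
        if s:
            s[-1] = nxt[s[-1]]

    while True:
        t = tuple(s)
        if accept(t):
            if solution(t):
                found.append(t)
                backup()
            else:
                s.append(first)               # extend S by g.first
        else:
            backup()
        if not s:                             # exit test: S is empty again
            break
    return found


def accept(s: Seq) -> bool:
    """True if no two adjacent subsequences at the end of s are equal."""
    n = len(s)
    return all(s[n - j:] != s[n - 2 * j:n - j] for j in range(1, n // 2 + 1))
```

Because the set is fixed, the stack G of Fig. 4(b) is no longer needed: the untried elements at each level can be recomputed from the last element of S.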

blind alleys without requiring excessive computation to do so is the heart of efficiency activities. Alternatively, such tests can be built into Generate.

Now suppose we have the procedure BTAGGSN proved correct and we want to modify it to exploit some of these properties of Solution, without much additional proof. The method of correctness-preserving transformations formalized by Gerhart [15] allows us to do so. Fig. 4(c) shows the changes to the boxed portion of BTAGGSN which yield the procedures BTAsGSN, where s ranges over three of the parameter values. These are obtained by using the transformations of Fig. 4(d) which are represented as verification rules:

    conditions
    P{R}Q ⇒ P{R'}Q

The interpretation is that whenever the conditions can be proved and R is correct with respect to P-Q, then R' is also correct with respect to P-Q. If P-Q is the "context" of R in some program, i.e., P holds before R and Q holds after R, then R can be replaced by R' in the program. Gerhart [15] shows how to compute the context using forward and backward verification ideas.

In Fig. 4(d) the precondition P is formed from the path from entry through initialization and the path from the outer loop assertion back to the loop beginning. Q is the outer-loop assertion. The interpretations for B1, B0, and B2 are Solution(S), Length(S)=N, and Accept(S). S1, S2, and S3 are AllSolutions:=AllSolutions ∪ {S}, G:=G~<Generate(S)>, and G:=G~<{ }>, as can be read off the replacements in Fig. 4(c). The conditions for the transformations are not easy to prove, but once one proof is mastered, the results carry over to the other proofs. It would be even more difficult to prove the BTAsGSN in Fig. 4(c) without the transformations, which capture the essential differences from BTAGGSN. These programs may be subjected to further improvements.

One particular variation of Generate is especially important. Consider the case where Generate(S) is a fixed nonempty set g.all which can be generated with a first element g.first, applying a function g.next to any element from g.all to get another, until an element g.last is obtained, of course without repetition of elements. (This is a good place for a data structure abstraction.) The correctness could be obtained either by writing down a reasonable version similar to BTAAGSN and then using this similarity to devise a similar proof or by using the redundancy of G [in that the next element of G(j) can be generated from S(j)] in correctness-preserving transformations. An iterative version BTAAFSN is in Fig. 5.

One last point is the parameter for auxiliary data structures. The main idea is that a data structure D often allows a more efficient evaluation of Accept or Generate. Given a known correct version without data structures, we can add D in such a way that the operations on D do not interfere with the correct program, then prove that Test(D)=Accept(S) holds immediately before the use of Accept(S). This supports the correctness-preserving transformation

    B0≡B1
    P {if B0 then S1 else S2 fie} Q ⇒
    P {if B1 then S1 else S2 fie} Q

Alternatively, we can use the conditions B0⊃B1, P∧B1 {S2} Q for the same replacement.

This completes our brief analysis of general properties of backtracking programs. There is far more to do than we have space for here. It should be realized that for any program not covered here, the methodology is laid out. We claim that this methodology is easy to follow conceptually, although the details may be overwhelming if the task is large. Although it might look like more work to proceed up to the general and then back down to the specific, it saves work in the long run because proofs at an abstract level are easier to construct and understand and much less likely to contain errors than when the problem domain is mixed in as in a "start from scratch" proof.

III. EXAMPLES
We will now consider several examples. Our claim is that the control structure abstraction of the preceding section allows us to bypass a significant part of the construction and proof of backtracking programs. Only Example 1 will actually reach the program stage. For the other examples, the parameter values will be identified.

Example 1: Nonequal adjacent sequences. Generate all sequences of N>0 characters, chosen from an alphabet of three elements, e.g., {1,2,3}, such that no two adjacent subsequences are equal. (See [4, sec. 15.4].)

    T: {1,2,3}      Goal: All Solutions
    Solution: Length(S)=N ∧ (∀i | 1≤i≤N) Accept(Initial(i,S))
    Accept(S) ≡ (∀j | 1≤j≤length(S) div 2)(∃k | 0≤k<j)
                    S(length(S)−k) ≠ S(length(S)−j−k)
    Generate: {1,2,3}.

The applicable schema is BTAAFSN. Refinement of Accept is easy using the schemas of Fig. 6(a) which show how to construct loops which model the universal and existential quantifiers. See Fig. 6(b) where S is a vector indexed from 1 to N.

    ∀ LOOP
      true
      {p:=true; i:=b;
       loop asserting Γ ∧ (p ≡ (∀j | b≤j<i) q(j))
         while p ∧ i≤t do
         p:=q(i); i:=i+1
       repeat}
      p ≡ (∀j | b≤j≤t) q(j)

    ∃ LOOP
      true
      {i:=b;
       loop asserting Γ ∧ ¬(∃j | b≤j<i) q(j)
         while i≤t and ¬q(i) do
         i:=i+1
       repeat;
       p:=i≤t}
      p ≡ (∃j | b≤j≤t) q(j)

    where Γ ≡ b>t ∨ b≤i≤t+1

    (a)

    procedure NonequalSequences(N)
    begin
      i:=0;
      loop Accept:=true; j:=1;
        loop asserting ...
          while Accept and j ≤ i div 2 do
          k:=0;
          loop asserting ...
            while k ≤ j−1 and S(i−k)=S(i−j−k) do
            k:=k+1
          repeat;
          Accept:=k ≤ j−1; j:=j+1
        repeat;
        if Accept then
          if i=N then
            print(S);
            Backup
          else i:=i+1;
            S(i):=1 fie
        else Backup fie
        asserting ...
        while i≠0 do
      repeat
    end

    procedure Backup
    begin
      loop asserting ...
        while i≠0 and S(i)=3 do
        i:=i−1
      repeat;
      if i≠0 then
        S(i):=S(i)+1 fi
    end

    (b)

    Fig. 6. (a) Quantifier loops. (b) Example 1 program. (See Figs. 5 and 6(a) for assertions.)

It is interesting to compare this solution with that in [4]. The final programs are similar, but the construction methods differ considerably. We used the abstractions of Section II and the schemas for quantifiers while Wirth used the systematic construction method. There are some difficulties with Wirth's solution: a variable "good," introduced to play the role of Accept in our version, is used wrongly in a few places, being negated when it should not be and vice versa. Also that solution used a "trick" of padding an element at the start of the sequence to handle termination. Our purpose is not to cast aspersions on that solution (which probably occurred circa 1971), but to show how much progress is made by further analysis of the backtracking problem, as is done in the successor book [5].

Example 2: Knight's tour. Starting from some designated square (x,y) using the chess moves of a knight, move a piece around an N×N board so that every square is landed on exactly once (see [5, sec. 3.4]).

    T: {(a,b) | 1≤a,b≤N}      Goal: First (if any)
    Solution: Fixed Length N²
    Generate: if S=Λ then {(x,y)} else
              {(c,d) | c=u+a, d=v+b, 1≤c,d≤N, (a,b)∈Jumps, (c,d)∉S}
              where (u,v)=Last(S),
              Jumps={(a,b) | a,b∈{1,−1,2,−2} ∧ |a|≠|b|}
    Data Structure: H[1:N, 1:N] such that
              H(i,j)=0 ∧ (i,j)∉S or S(H(i,j))=(i,j).
    Alternatively, H can be built into Accept.

Example 3: Binary circle. Construct a circle of 2^N zeros and ones in which each of the possible subsequences of N binary digits occurs exactly once. (See [4, sec. 15.4, exercises].) Let

    IS =def <1,0,...>   (N elements).

IS must appear somewhere in any solution so assume it is at the beginning and that S is so initialized.

    T: {0,1}      Goal: First (if any)
    Solution: Length(S)=2^N ∧ Accept(S)
              ∧ (∀j | 1≤j≤N−1) Accept(rotate(j,S))
    Accept(S) ≡ (∀i | 1≤i≤Length(S)−N)
              SubSeq(i,i+N−1,S) ≠ SubSeq(Length(S)−N+1, Length(S), S)
    Generate: T
    Data Structure: B[1:2^N] to hold the decimal representations of subsequences.

Example 4: Malicious secretary. A secretary permutes the labels covering the typewriter keys in order to confound her boss, who will be coming in over the weekend to type up some
private correspondence. The boss is both a "hunt and peck" type person and a fan of permutations. Knowing this, the secretary predicts that the boss will first type M0, the desired message, getting M1; then, if M1 is not what he wants, try to type M1, getting M2, etc., until the desired message appears as Mp. The malicious secretary wants to maximize the number of iterations if this algorithm is followed as predicted. It can easily be shown that the maximum number of iterations, p, is lcm(X1,X2,...,Xk), where lcm=least common multiple, k is the number of cycles in the permutation, and Xi is the length of the ith cycle. So the general problem is:

    Maximize lcm(S), where S ⊆ {1,2,...,N},
    N=number of keys on the typewriter, and Σ (s∈S) s ≤ N.

    T: {1,...,N}      Goal: Maximize lcm(S)
    Solution: true
    Generate: {X | max(S)<X≤N ∧ X + Σ (s∈S) s ≤ N}.

Example 5: 8 Queens. Position 8 queens on an 8×8 board so that no queen can capture any other, i.e., no queen is in the same row, column, or diagonal as another.

    T: {1,...,8}      Goal: First (if any)
    Solution: Length(S)=8 ∧ Accept(S)
    Accept(S) ≡ (∀i,j | 1≤i,j≤length(S))(i≠j ⊃ ¬capture(i,j))
    capture(i,j) ≡ "the queen in column i, row S(i) can
                    take the queen in column j, row S(j)"
    Generate: T
    Data Structure: Boolean and integer vectors to designate queen positions.

The parameters help select the overall control structure of the backtracking programs. It remains to optimize the Accept and Generate functions.

IV. THE CONTROL STRUCTURE ABSTRACTION METHODOLOGY
In the last two sections, we have described a small case study of control structure abstraction of the backtrack programming technique. Obviously, we have only scratched the surface of backtracking knowledge, but let us now examine exactly what kinds of knowledge we have used so far.

Programming language semantics: We have steered away from any particular language, and instead used a few fundamental control structures and abstract data structures. We have made many assumptions about good behavior of these language constructs, e.g., no sharing of data between variables. We assume that the semantics and schemas of our vague language could be translated to those of any particular language. These details are important when it finally comes time to code programs from algorithms, but language details are one of the programming aspects from which we have tried to abstract.

Concrete knowledge about backtracking: We started out this case study by reading all we could readily find in the literature. Initially, we barely knew the concept. There was some abstraction in Wirth [5] and Wells [1] in the form of schemas, without correctness statements. Most of the literature consists of examples, many of which are hard to understand, especially if the correctness criterion is not clear. Often for pedagogical reasons, data structures and efficiency considerations dominate correctness and control structure considerations. It is probably impossible to get a good grasp of backtracking without studying these examples, but there is a large gap between examples themselves and the abstractions which get at the essence of backtracking. We have tried to partially close the gap in Section II.

Correctness statements: Nowhere in the literature could we find a statement of the correctness of backtracking which is as precise as ours. Correctness was an implicit, but not a major, concern of the various articles on backtracking. That is not to say that the programs are incorrect, only that correctness is not proved (although informal arguments are occasionally given). The notation which we give in Fig. 1 and correctness statements for the various backtracking search goals may look fairly simple. In fact, it took surprisingly long for us to get the notation in exactly the form where it could be used for both correctness statements of the backtracking procedures and for intermediate assertions. However, once the notation was mastered for one variant, it came easier for the others. It might be interesting to consider this point in regard to the article by Dijkstra [17].

Backtracking parameterization: This aspect came fairly easily from our study of the literature and simply from the abstract variations. We do not consider this parameterization complete. However, we claim to have gone far enough through different enough variations to say that the continuation is unlikely to present any major new concepts but instead will only lead to a great mass of details.

Correctness of backtracking schemas: Given our abstract statement of correctness, the semantics of our rather vague language, and standard correctness techniques (generate and prove verification lemmas), we were able to prove without difficulty the abstract correctness of our most general backtracking variants. In fact, with the notation properties of Fig. 1 the proofs are surprisingly easy; stating the assertions was the hard problem.

Correctness of general schemas: We also used some schemas which refined quantifiers into loops. The abstract correctness of these was proved using the standard correctness techniques and we found these exceedingly useful in refining Accept in Example 1.

Correctness-preserving program transformations: Given the general backtracking variants and the parametric properties of Solution, we applied the idea of program transformations. Here we stated and proved the transformations in abstract form with the conditions for their use in concrete forms and then proved the concrete conditions. Thus we obtained a much larger set of correct backtracking variants with only a small investment of proof effort.

Optimization: We have discussed optimization in only a few of the many ways it may be performed. We modified our general variations of backtracking to make use of specific properties of Solution and Generate and we stated both recursive and iterative forms. We show how data structure may be introduced into the variations in order to improve the
Accept test and what conditions must be maintained on the data structure. There is much more work to be done here in capturing the great number of known techniques which produce good Accept tests through data structure manipulation. We have not gone into this subject in any detail. In some sense, these "tricks" can be regarded as too domain specific to be considered part of the body of general knowledge about backtracking, but in another sense it is really this knowledge that makes backtracking a practical technique rather than a pedagogical toy.

Let us now try to define what we mean by the control structure abstraction methodology.

This methodology uses abstract programs which express some control flow and suppress details of some control operations, abstract specifications which express the essential facts about behavior of abstract programs, and abstract proofs of consistency between abstract programs and abstract specifications insofar as proof is possible given the nonsuppressed details. Whatever cannot be proved abstractly remains to be proved when the details for a particular program are supplied. The abstract and concrete proofs together guarantee the correctness of the program. It also uses abstractly expressed transformations, again with abstract proofs of correctness preservation which leave premises to be fulfilled when the correctness-preserving transformation is used on a concrete program. The thesis behind the methodology is that the basic difficulty in correctness proving is more the organization and expression of programming knowledge than the lower level details of proofs. It follows that the abstractions stimulated by correctness concerns may be applied to program construction.

The goals of managing complexity through abstraction and achieving reusability through generality are shared by the data structure abstraction methodology [16]. We expect to see many data structure abstractions in implementations of Solution and Generate. One of our goals is to collect a substantial number of schemas and transformations. Whenever possible we use a known one; otherwise we generalize, prove abstractly, add to the list, and apply back to the source. Thus our programming knowledge becomes progressively more formally expressed and organized. Other examples of use of the methodology are Wirth [5], Dershowitz and Manna [14], and Gerhart [15].

Proving the correctness of all backtracking programs might be considered an academic game or a folly. The practical applications in real-world problems and academic research suggest it is worthwhile. That we made considerable progress in this area suggests it is possible. But more important is the development of a methodology which supports this goal, no matter what the application area, and which produces results, such as the schemas and transformations of Figs. 4(c) and 6(a), with generality beyond the application area. Some candidate areas for further application of the method are operating systems, data structure manipulation, and data processing algorithms.

The fundamental questions in relation to software engineering are: Do we want the correctness of all programs stated and proved? Assuming so, do we want to start from scratch for every program? How can we exploit abstraction in every possible way?

APPENDIX

Sequence notation: -, catenation; A, empty; Last(S), S(Length(S)); OtherThanLast(S), S with Last(S) deleted; Initial(i,S), first i elements of S; rotate(i,S), left rotation i places; Subseq(i,j,S), <S(i), ..., S(j)>.

Set notation: { }, empty; ElementFrom, delete and return some element. Set notation applied to sequences refers to the elements in the sequence.

The and operator functions in this paper as the "conditional and" operator, in which "B1 and B2" has the value "true" if both operands are defined and are both "true," "false" if both operands are defined and are not both "true," and "false" if B1 is "false" and B2 is undefined.

REFERENCES

[1] M. Wells, Elements of Combinatorial Computing. New York: Pergamon, 1971.
[2] D. Knuth, "Estimating the efficiency of backtrack programs," Math. Comput., vol. 29, pp. 121-139.
[3] N. Wirth, "Program development by stepwise refinement," Commun. Ass. Comput. Mach., vol. 14, pp. 221-227.
[4] -, Systematic Programming. Englewood Cliffs, NJ: Prentice-Hall, 1973.
[5] -, Algorithms + Data Structures = Programs. Englewood Cliffs, NJ: Prentice-Hall, 1976.
[6] D. Hanson, "Procedure mechanism for backtrack programming," Univ. Arizona, Tucson, Tech. Note, 1976.
[7] J. Bitner and E. Reingold, "Backtrack programming," Commun. Ass. Comput. Mach., vol. 18, pp. 322-329.
[8] C. A. R. Hoare, "An axiomatic basis for computer programming," Commun. Ass. Comput. Mach., vol. 12, pp. 322-329.
[9] -, "A note on the for statement," BIT, vol. 12, pp. 334-341.
[10] D. Knuth, "Structured programming with goto statements," Comput. Surveys, vol. 6, pp. 261-302.
[11] R. Floyd, "Nondeterministic algorithms," J. Ass. Comput. Mach., vol. 14, pp. 636-644.
[12] R. Burstall and J. Darlington, "Some transformations for recursive programs," in Proc. 1975 Int. Conf. Reliable Software, pp. 465-472.
[13] S. Gerhart, "Proof theory of partial correctness verification systems," SIAM J. Comput., to be published.
[14] N. Dershowitz and Z. Manna, "On automating structured programming," in Proc. IRIA Conf. Proving and Improving Programs, 1975, pp. 167-193.
[15] S. Gerhart, "Knowledge about programs: A model and case study," in Proc. 1975 Int. Conf. Reliable Software, pp. 88-95.
[16] SIGPLAN Conference on Data: Abstraction, Definition, and Structure, Mar. 1976.
[17] E. W. Dijkstra, "Programming as a discipline of a mathematical nature," Comput. People, Oct. 1974, pp. 10-11; also Amer. Math. Mon., June-July 1974.

Susan L. Gerhart, for a photograph and biography, see page 207 of the September 1976 issue of this TRANSACTIONS.

Lawrence Yelowitz, for a photograph and biography, see page 207 of the September 1976 issue of this TRANSACTIONS.
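
Illustrative sketch (not part of the original paper). The idea of an abstract program that fixes the control flow while leaving Solution, Accept, and Generate as parameters can be sketched in Python as follows. The function signatures, the tuple representation of a candidate, and the n-queens instance are all assumptions made for illustration; the paper's own schemas appear in its figures, which are not reproduced in this section.

```python
from typing import Callable, Iterable, Iterator, Tuple

# Assumed representation: a partial solution is a tuple of choices made so far.
Candidate = Tuple[int, ...]


def backtrack(
    candidate: Candidate,
    solution: Callable[[Candidate], bool],
    accept: Callable[[Candidate], bool],
    generate: Callable[[Candidate], Iterable[Candidate]],
) -> Iterator[Candidate]:
    """Abstract program: the control flow is fixed, the three tests are left open."""
    if solution(candidate):
        yield candidate
        return
    for extension in generate(candidate):
        if accept(extension):  # prune extensions that cannot lead to a solution
            yield from backtrack(extension, solution, accept, generate)


def queens(n: int) -> Iterator[Candidate]:
    """Hypothetical concrete instance: place n queens, one column index per row."""

    def solution(c: Candidate) -> bool:
        return len(c) == n

    def generate(c: Candidate) -> Iterable[Candidate]:
        return (c + (col,) for col in range(n))

    def accept(c: Candidate) -> bool:
        # Only the newest queen is checked; earlier queens were accepted already.
        row, col = len(c) - 1, c[-1]
        return all(
            col != other and abs(col - other) != row - r
            for r, other in enumerate(c[:-1])
        )

    return backtrack((), solution, accept, generate)
```

Calling `list(queens(4))` enumerates the placements for a 4 by 4 board. In this sketch, the conditions a correctness proof would have to discharge for a concrete program are confined to the three functions supplied by `queens`, while `backtrack` itself is written once.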

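Illustrative sketch (not part of the original paper). The "conditional and" defined in the Appendix behaves like the short-circuit `and` of many current languages. The Python example below is a minimal sketch under that assumption: B1 is an index-range test and B2 is an element comparison that is undefined (raises `IndexError`) when B1 is false. The function names are hypothetical.

```python
from typing import Sequence


def matches_at(seq: Sequence[int], i: int, x: int) -> bool:
    """Conditional and: B1 = 'i is a valid index', B2 = 'seq[i] == x'.

    When B1 is false, B2 is never evaluated, so the result is False
    even though B2 on its own would be undefined.
    """
    return 0 <= i < len(seq) and seq[i] == x


def matches_at_strict(seq: Sequence[int], i: int, x: int) -> bool:
    """Contrast: both operands are evaluated before combining them."""
    b1 = 0 <= i < len(seq)
    b2 = seq[i] == x  # raises IndexError when i is past the end of seq
    return b1 and b2
```

With `seq = [1, 2, 3]` and `i = 5`, `matches_at` returns `False`, whereas `matches_at_strict` raises `IndexError`, which is the case the Appendix definition is meant to rule out.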