Peng-Analysis of Comm
for Non-progress
Wuxu Peng and S. Purushothaman
Abstract

Let $P$ and $Q$ be two processes (specified as finite state machines) communicating asynchronously with each other using send and receive commands over a set of message types $\Sigma$. We consider the problem of testing $P$ and $Q$ for two forms of non-progress: deadlock and unspecified reception. Since the non-progress problem is undecidable, we use a dataflow approach to obtain sufficient conditions under which the two processes are free of deadlock and unspecified reception. Our approximation analysis is based on weakening the receive operation, and we present polynomial time algorithms to perform the analysis. This problem arises in the context of dataflow analysis of processes that communicate by message passing and in the context of showing correctness of protocol specifications.

1 Introduction: Problem statement and background

One of the important characteristics of parallel programs is indeterminacy. Two different runs of a program with the same initial input might produce different traces. Thus it becomes difficult to debug them. Consequently, the burden of showing/proving that a parallel program is free of runtime errors such as deadlock rests with the programmer. We believe that in order for parallel programming languages to be widely accepted, their compilers should help users in detecting and exposing errors like deadlock. Our motivation is similar in spirit to type checking that is traditionally performed in the semantic analysis phase of a compiler for languages such as PASCAL. In fact, our approach is similar to polymorphic type-checking [2] of programs. In this paper we consider the problem of statically analyzing communicating processes for deadlock and unspecified reception of messages.

Let $P$ and $Q$ be two processes specified as finite state machines (possibly nondeterministic), which send and receive messages asynchronously over unbounded buffers. The events of the two machines are send and receive commands specifying a particular message type $g \in \Sigma$. A send event $s(g)$ in $P$ appends a message of type $g$ to the end of $Q$'s buffer. A receive event $r(g)$ in $P$ succeeds if there is a message of type $g$ at the front of $P$'s buffer; it causes the front of $P$'s buffer to be deleted. We assume interleaving execution of the two processes. A pair of processes $P$ and $Q$ reach deadlock provided both can only attempt to receive and both of the buffers are empty. A process $P$ reaches an unspecified reception state provided it reaches a state where it can only receive messages of particular types, and the message at the front of the queue is not one of the acceptable types. Both deadlock and unspecified reception contribute towards non-progress of the processes and hence should be avoided. The problem of checking two processes for absence of deadlock and unspecified reception is undecidable [3]. In this paper we will take a dataflow approach to these problems. We will weaken the model by using receive commands that do not depend on the type of the message received. Under these assumptions we will present polynomial space and time algorithms to check statically whether two processes $P$ and $Q$ are free of deadlock and unspecified receptions. We develop the algorithms in such a fashion that the proof of "semantic soundness" theorems falls through easily. Our algorithm for checking two processes for absence of deadlocks solves the problem left open by Gouda and Yu in [11]. On the other hand, the problem of checking for unspecified receptions has not been considered in the literature before. Our analysis for unspecified reception is set up in such a fashion that it is applicable when there are more than two participating processes.
© 1989 IEEE
CH2706-0/89/0000/02&?0$01.00
Our motivation for considering finite state machines as representations of processes is twofold. First, in order to perform dataflow analysis of communicating processes we will have to forget the results of tests and assignment statements, which would lead us to communicating finite state machines [9]. Our specific model of asynchronous communication with blocking receives and non-blocking sends arises in programming the Intel hypercube and the systolic Warp machine [6]. Our second motivation is the use of such machines in the literature for specifying and validating network protocols [1]. Moreover, a calculus for communicating processes [7] widely used for giving semantics to parallel constructs is based on the notion of finite state machines as processes.

In Section 2 we report on previous work, in Section 3 we provide necessary definitions, in Section 4 we show how deadlock analysis can be done, in Section 5 we consider the unspecified reception problem and we conclude in Section 6.

2 Previous work

Previous work on this problem is by Reif [9], by Gouda et al. [11,3], and by Kanellakis [5]. Reif considers the problem of reachability in a set of processes with multiple ports and a single message type. With static communication, which corresponds to our assumptions here, he shows that reachability requires at least exponential space, infinitely often. On the other hand, his approximate algorithm for reachability analysis is more tractable. Unfortunately, his work does not handle loops well. Furthermore, the emphasis in the approximate analysis is on checking for the presence of deadlocks rather than on absence of deadlocks.

The work of Gouda is more directly related to our work in that our models are the same. They provide an $O(m^3 n^3)$ algorithm that checks for freedom from deadlock when the two machines do not have any mixed nodes, that is, there are no vertices whose outgoing arcs include both sending and receiving events. We present an $O(m^4 n^4)$ algorithm that works even in the presence of mixed nodes, where $m$ and $n$ are the number of states in the two processes. They did not consider the problem of checking for unspecified reception.

Kanellakis and Smolka considered the problem in the context of synchronous communication. They consider the non-progress problem when the interconnection among the processes is restricted to a tree and when restricted to acyclic processes in the general interconnections case. The work of Taylor [10] discusses the importance of performing static analysis and presents a simulator for checking deadlocks whose termination is not guaranteed. The work of Holzmann [4] discusses an algebra for expressing protocols and presents a simulator to detect deadlock. As in the work of Taylor, the simulator is not guaranteed to terminate.

3 Communicating finite state machines

A communicating finite state machine (CFSM) is a labeled directed graph with a distinguished initial state, and wherein each edge is labeled by an event. The events of a CFSM are send and receive commands over a set of message types $\Sigma$. The communication between CFSMs is assumed to be asynchronous (i.e., non-blocking sends and blocking receives). Consequently, we assume the availability of an infinite buffer for each process. Furthermore, interleaving semantics is assumed.

Formally, let $\Sigma$ be a set of message types, then a CFSM $P$ can be specified as a four-tuple $(S, A, T, p_0)$, where
(1) $S$ is the set of local states,
(2) $A = \{ r(g), s(g) \mid g \in \Sigma \}$ is the set of events, where $s(g)$ denotes sending a message of type $g$ and $r(g)$ denotes receiving a message of type $g$,
(3) $T$ is a partial mapping, $T : S \times A \rightarrow 2^S$. $T(p_1, a)$ is the set of new states that the machine can possibly enter after performing event $a$ in state $p_1$,
(4) $p_0$ is the initial state.

Let $N = (P, Q)$ be a network of two communicating finite state machines (NCFSM), where both $P$ and $Q$ are CFSMs. We will assume from now on that $P$ and $Q$ communicate over the same set $\Sigma$ of message types. When $N$ starts execution, the message buffers for both $P$ and $Q$ are assumed to be empty. A global state of $N$ can be denoted by a 4-tuple $[p, q, x, y]$, where process $P$ is in state $p$, process $Q$ is in state $q$, $x$ is the string of messages in $P$'s buffer and $y$ is the string of messages in $Q$'s buffer. The initial state of the network is $[p_0, q_0, \epsilon, \epsilon]$, where $p_0$ is the initial state of $P$ and $q_0$ is the initial state of $Q$.

A node $p$ in either $P$ or $Q$ is said to be a sending (or receiving) node iff all of its outgoing edges are sending (receiving) edges. A node $p$ in $P$ or $Q$ is said to be a mixed node iff it has outgoing sending and receiving edges. These definitions have obvious generalizations to global states.

Let $e$ be an outgoing edge of node $p$ (in $P$) or $q$ (in $Q$), and let $label(e) = a$. Let juxtaposition of two strings denote concatenation. A global state $[p', q', x', y']$ is one-step-reachable from $[p, q, x, y]$, denoted as $[p, q, x, y] \xrightarrow{e} [p', q', x', y']$, if
(1) $e$ is a sending edge labeled by $s(g)$ from $p$ to $p'$ in $P$, $q' = q$, $x' = x$ and $y.g = y'$, or
(2) $e$ is a sending edge labeled by $s(g)$ from $q$ to $q'$ in $Q$, $p' = p$, $x.g = x'$ and $y' = y$, or
(3) $e$ is a receiving edge labeled by $r(g)$ from $p$ to $p'$ in $P$, $q = q'$, $g.x' = x$ and $y' = y$, or
(4) $e$ is a receiving edge labeled by $r(g)$ from $q$ to $q'$ in $Q$, $p = p'$, $x = x'$ and $g.y' = y$.

Define reachability as the reflexive and transitive closure of one-step-reachability. Define the reachability set to be the set of states reachable from the initial state $[p_0, q_0, \epsilon, \epsilon]$. We use the notation $[p, q, x, y] \xrightarrow{*} [p', q', x', y']$ if $[p', q', x', y']$ is reachable from $[p, q, x, y]$. Of course, $[p_0, q_0, \epsilon, \epsilon] \xrightarrow{*} [p, q, x, y]$ denotes that $[p, q, x, y]$ is a reachable global state.

A reachable state $[p, q, x, y]$ is a deadlock state iff $x = y = \epsilon$ and both $p$ and $q$ are receiving nodes.

notate all receive commands (irrespective of the message type expected) by $\bar{1}$. The Dyck set over the left brackets $a_1, a_2, \ldots, a_n$ and corresponding right brackets $b_1, b_2, \ldots, b_n$ will be notated as $DYCK(a_1, a_2, \ldots, a_n; b_1, b_2, \ldots, b_n)$ and the set of all prefixes over a Dyck set will be notated as $PDYCK(a_1, a_2, \ldots, a_n; b_1, b_2, \ldots, b_n)$. Note that both of these are context-free languages. For example the grammar $S ::= S(S)S \mid \epsilon$ specifies the Dyck set over "(" and ")".

4 Deadlock Analysis

In this section we will discuss the problem of checking for freedom from deadlock. Note that processes $P$ and $Q$ reach deadlock iff they are both in receiving states and their respective buffers are empty. Since we are assuming that the receive commands do not depend on the type of messages, the type of the messages is not significant. Consequently, we will consider two machines $P$ and $Q$ that communicate by sending and receiving messages of a single type. Let $m$ (or $n$) be the number of nodes in machine $P$ (or $Q$). We will refer to the event of sending a message by $1$ and the event of receiving a message by $\bar{1}$. Since the type of the messages is not significant, the global state of the two machines can be captured by the 4-tuple $[p, q, x, y]$, where $p$ is a state in $P$, $q$ is a state in $Q$, and $x$ and $y$ denote the number of pending messages in the buffers of $P$ and $Q$ respectively.

A reachable global state $[p, q, x, y]$ is fair if $x$ is equal to $y$. That is, the number of messages in the buffers for both $P$ and $Q$ are the same at all times. More formally, we define the fair reachable graph (FRG) for a pair of machines $P$ and $Q$ exchanging messages of a single type as follows, which captures the set of all fair states of the network $(P, Q)$.

Definition 4.1 Let $N = (P, Q)$ be an NCFSM which exchanges messages of a single type. The $FRG[P,Q] = (V, E)$ is a directed labelled graph defined as
(1) $[p_0, q_0] \in V$, where $p_0$ ($q_0$) is the start state for $P$ ($Q$).
Let $[p, q] \in V$.
(2) If $p \rightarrow p' \in P$, and $q \rightarrow q' \in Q$, then $[p, q] \rightarrow [p', q'] \in$

Intuitively, the edge label $1$ denotes an increase in the number of messages, $\bar{1}$ denotes a decrease in the number of messages, and $0$ denotes that no change has taken place in the number of messages in both the buffers.

A path $r$ in $FRG[P,Q]$ is legal iff any prefix of $r$ has at least as many $1$-labeled edges as $\bar{1}$-labeled edges. The length of a path $r$ (notated as $len(r)$) in $FRG[P,Q]$ is the number of $1$-labeled edges minus the number of $\bar{1}$-labeled edges in $r$.

The fair reachable graphs are important in that they capture only a subset of the entire reachable set of states. We state below that all fair reachable states are captured in the fair reachable graph. Our proof is based on a corresponding theorem appearing in [11], and is given in Appendix A. Formally, we have
Lemma 4.1 Assume that we are given an NCFSM $N = (P, Q)$ and that its fair reachable graph is $FRG[P,Q] = (V, E)$. $[p, q, k, k]$ is a reachable state in $N$ iff there exists at least a legal path $r$ from node $[p_0, q_0]$ to node $[p, q]$ in $FRG[P,Q]$ such that $len(r) = k$. In particular, $[p, q, 0, 0]$ is a reachable deadlock state in $N$ iff there exists at least a legal path $r$ of length zero from node $[p_0, q_0]$ to node $[p, q]$ in $FRG[P,Q]$.

Theorem 4.2 Let $N$ be an NCFSM over multiple message types. Consider $N'$ obtained from $N$ by identifying all message types to be a single message type. If $N'$ is free of deadlocks then $N$ is also free of deadlocks. □

Proof: Since $N'$ has all the execution sequences of $N$ (and of course some extra ones), if $N'$ is free of deadlocks then $N$ is free of deadlocks too. □
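The definitions of Sections 3 and 4 can be exercised on small networks with an explicit-state search. The sketch below is not the polynomial-time FRG algorithm of this paper: it is a bounded breadth-first exploration of the single-message-type network, so it can only report deadlock states reachable while each buffer holds at most `bound` messages. The dictionary encoding of a machine, the names `find_deadlocks`, `successors` and `is_receiving`, and the choice to treat a node with no outgoing edges as not receiving are all assumptions made for illustration.

```python
from collections import deque

SEND, RECV = "1", "r"  # single message type: "r" stands in for the barred 1


def is_receiving(machine, node):
    """True if `node` has outgoing edges and all of them are receives."""
    edges = machine.get(node, [])
    return bool(edges) and all(event == RECV for event, _ in edges)


def successors(P, Q, state):
    """Yield the one-step-reachable global states of [p, q, x, y].

    x and y are the numbers of pending messages in P's and Q's buffers.
    """
    p, q, x, y = state
    for event, p2 in P.get(p, []):
        if event == SEND:
            yield (p2, q, x, y + 1)   # P appends to Q's buffer
        elif x > 0:
            yield (p2, q, x - 1, y)   # P consumes from its own buffer
    for event, q2 in Q.get(q, []):
        if event == SEND:
            yield (p, q2, x + 1, y)   # Q appends to P's buffer
        elif y > 0:
            yield (p, q2, x, y - 1)   # Q consumes from its own buffer


def find_deadlocks(P, Q, p0, q0, bound):
    """Deadlock states reachable with at most `bound` messages per buffer."""
    start = (p0, q0, 0, 0)
    seen, todo, found = {start}, deque([start]), set()
    while todo:
        state = todo.popleft()
        p, q, x, y = state
        if x == 0 and y == 0 and is_receiving(P, p) and is_receiving(Q, q):
            found.add(state)
        for nxt in successors(P, Q, state):
            if nxt[2] <= bound and nxt[3] <= bound and nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return found


# A request/reply pair that never deadlocks.
PING = {0: [(SEND, 1)], 1: [(RECV, 0)]}
PONG = {0: [(RECV, 1)], 1: [(SEND, 0)]}

# After one exchange both machines wait forever on empty buffers.
P_STUCK = {0: [(SEND, 1)], 1: [(RECV, 2)], 2: [(RECV, 2)]}
Q_STUCK = {0: [(RECV, 1)], 1: [(SEND, 2)], 2: [(RECV, 2)]}

print(find_deadlocks(PING, PONG, 0, 0, bound=2))
print(find_deadlocks(P_STUCK, Q_STUCK, 0, 0, bound=2))
```

Because the search is bounded, an empty result does not by itself certify freedom from deadlock; that guarantee comes from the FRG-based analysis stated in Lemma 4.1.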
Given a shuffle-product the relative ordering of the operations that affect $P$'s buffer can be obtained by ignoring the operations that affect $Q$'s buffer. We will refer to this operation as projection and denote the projection of machine $P$ (or $Q$) over the shuffle-product $P \otimes Q$ by $(P \otimes Q)|_P$ (or $(P \otimes Q)|_Q$). Formally we have

Definition 5.2 Let $P \otimes Q = (S, A', T, s)$ be the shuffle-product of $N = (P, Q)$. The projection of $P \otimes Q$ over $P$ is $(P \otimes Q)|_P = (S_P, A_P, T_P, s_P)$ defined as:
(1) $S_P = S$.
(2) $s_P = s$.
(3) The alphabet $A_P$ contains all the send events of $Q$ and a special letter $\bar{1}$ which denotes the receiving action of $P$, viz., $A_P = \{ g \mid (\epsilon, s(g)) \in A' \} \cup \{ \bar{1} \}$.
(4) The transition function $T_P$ is a (sort of) restriction of the transition function of the composition $P \otimes Q$ to the events in $A_P$. Corresponding to transitions in $A'$ that affect $Q$'s buffer we use $\epsilon$ transitions in $(P \otimes Q)|_P$, and corresponding to every transition that affects $P$'s buffer we will use a transition labeled by an appropriate event from $A_P$. Formally, $T_P$ is defined as:
If $[p, q] \xrightarrow{(\epsilon, s(g))} [p, q'] \in T$, then $[p, q] \xrightarrow{g} [p, q'] \in T_P$.
If $[p, q] \xrightarrow{(r(g), \epsilon)} [p', q] \in T$, then $[p, q] \xrightarrow{\bar{1}} [p', q] \in T_P$.

state $[[p, q], x]$ in $(P \otimes Q)|_P$ is an unspecified reception state if none of the outgoing receiving edges of $[p, q]$ is labeled by the message type in front of the queue $x$. Let $e$ be the edge from state $[p_1, q_1]$ to state $[p_2, q_2]$. We say that $e$ is the cause of an unspecified reception at state $[p, q]$ if there is an execution path in which a message of type $g$ sent as a result of executing edge $e$ results in $g$ being in front of the queue at $[p, q]$, and $g$ is not acceptable at state $[p, q]$. As the receive commands do not respect the type of a message, $e$ can be the cause of an unspecified reception if there are $k$ messages in the buffer when execution reaches state $[p_1, q_1]$ and there are $k$ receive commands in a path from $[p_1, q_2]$ to the state $[p, q]$. To formalize this intuition, we need the following definitions in the context of projection $(P \otimes Q)|_P$.

Definition 5.3 The spectra of a state $[p, q]$ (notated as $spectra([p, q])$) is the set of all natural numbers $k$, such that $[[p, q], x]$ is a reachable state and the length of $x$ is $k$. The chops of two states $[p, q]$ and $[p', q']$ (notated $chops([p, q], [p', q'])$) is the set of all natural numbers $k$ such that there are $k$ receive commands on some path from $[p, q]$ to $[p', q']$. □
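Definition 5.3 can be made concrete on a small projected machine. The following is a minimal sketch under stated assumptions, not the construction used in this paper: the projection is given directly as a dictionary of labelled edges, `"g"` marks an edge that adds a message to $P$'s buffer, `"r"` stands in for the receive letter $\bar{1}$, `"eps"` marks an edge that leaves $P$'s buffer untouched, and both sets are cut off at a hypothetical `bound` because they can be infinite when the graph has cycles.

```python
from collections import deque

SEND, RECV, EPS = "g", "r", "eps"  # illustrative edge labels


def spectra(edges, start, bound):
    """Map each node to the buffer lengths k <= bound it can be reached with."""
    seen = {(start, 0)}
    todo = deque(seen)
    while todo:
        node, k = todo.popleft()
        for label, nxt in edges.get(node, []):
            if label == SEND:
                k2 = k + 1
            elif label == RECV:
                if k == 0:
                    continue          # blocking receive on an empty buffer
                k2 = k - 1
            else:
                k2 = k
            if k2 <= bound and (nxt, k2) not in seen:
                seen.add((nxt, k2))
                todo.append((nxt, k2))
    result = {}
    for node, k in seen:
        result.setdefault(node, set()).add(k)
    return result


def chops(edges, src, dst, bound):
    """Numbers k <= bound of receive edges along some path from src to dst."""
    seen = {(src, 0)}
    todo = deque(seen)
    while todo:
        node, k = todo.popleft()
        for label, nxt in edges.get(node, []):
            k2 = k + 1 if label == RECV else k
            if k2 <= bound and (nxt, k2) not in seen:
                seen.add((nxt, k2))
                todo.append((nxt, k2))
    return {k for node, k in seen if node == dst}


EDGES = {
    "a": [(SEND, "b")],
    "b": [(SEND, "c"), (RECV, "d")],
    "c": [(RECV, "d")],
    "d": [(RECV, "e")],
    "e": [],
}

sp = spectra(EDGES, "a", 3)
print(sp["b"], chops(EDGES, "c", "d", 3))
print(sp["b"] & chops(EDGES, "c", "d", 3))
```

A non-empty intersection of `sp["b"]` with `chops(EDGES, "c", "d", 3)` corresponds to the situation described above: the message sent on the edge from `b` to `c` can be at the front of the buffer when execution reaches `d`.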
The use of projections in the above development allows us to easily extend this technique to certifying freedom from unspecified reception in the case of more than two processes; we merely need to compute the shuffle-product of all the machines and then consider the projection on each of the machines. Now we are left with the tasks of computing the spectra of a node, computing the chops of two nodes and finding the intersection between them.

Let $P'$ denote the projection $(P \otimes Q)|_P = (S_P, A_P, T_P, s_P)$. Let $s_1$, $s_2$ and $s_3$ be states in $P'$ for which we wish to test if $spectra(s_1) \cap chops(s_2, s_3)$ is empty. Define $R'$ to be a FSA with $s_1$ as the final state, i.e., $R' = (S, A_P, T_P, s_P, \{s_1\})$. Define $R''$ as the FSA $P'$ with $s_2$ as the start state and $s_3$ as the final state, i.e. $R'' = (S, A_P, T_P, s_2, \{s_3\})$. Now $L(R')$ is a language characterizing all the paths in $P'$ from the start state $s_P$ to $s_1$, and $L(R'')$ is the set of paths in $P'$ from state $s_2$ to state $s_3$ in $P'$. Note that every legal execution sequence from $s_P$ to $s_1$ has the property that any prefix of it has at least as many sends as receives. Consequently, the set of all legal execution sequences can be obtained by intersecting $L(R')$ with $PDYCK(g_1, \ldots, g_n; \bar{1}, \ldots, \bar{1})$, where $\Sigma = \{g_1, \ldots, g_n\}$. As we are interested in the number of messages in the buffer at state $s_1$, i.e., the excess number of sends over receives in any legal execution sequence from start state $s_P$ to state $s_1$, define $L = \Phi_1(L(R')) \cap PDYCK(1; \bar{1})$ where $\Phi_1$ is a homomorphism specified by $\Phi_1(g \in \Sigma) = 1$. Let $\Psi$ be a Parikh's mapping [8]¹. Thus, $spectra(s_1) = \{ z \mid z = \#1 - \#\bar{1}, \text{ where } [\#1, \#\bar{1}] \in \Psi(L) \}$. Similarly $chops(s_2, s_3) = \{ z \mid z \in \Psi(\Phi_2(L(R''))) \}$, where $\Phi_2$ is specified as $\Phi_2(g \in \Sigma) = \epsilon$ and $\Phi_2(\bar{1}) = 1$. Instead of computing the Parikh's mapping for the two sets and then intersecting them, we will use a more direct approach.

The direct approach that we will look at in the following is possible due to a special property of CFLs (Parikh's theorem), which states that every CFL is letter-equivalent to a regular language. More formally, if $C$ is a CFL then there exists a regular language $R$, such that $\Psi(C) = \Psi(R)$. Consequently, in the following instead of explicitly constructing the spectra we will compute the regular language over the single letter $1$ which has the Parikh's mapping that we are interested in.

Consider two edges, one labeled by $1$ and the other labeled by $\bar{1}$ in $\Phi_1(R')$², lying on a path $\rho$ as follows:

$\rho : s_1 \xrightarrow{1} s_2 \xrightarrow{\epsilon} s_3 \xrightarrow{\epsilon} \cdots \xrightarrow{\epsilon} s_{n-1} \xrightarrow{\bar{1}} s_n$

Let $k$ be the number of messages in the queue when execution reaches the node $s_1$. If this path is taken, then the number of messages in the queue when execution reaches the node $s_n$ will be $k$ and for nodes $s_2, \ldots, s_{n-1}$ will be $k + 1$. Consequently, by adding an $\epsilon$-labeled edge from $s_1$ to $s_n$ we will not be changing the spectra of any node. Assume that we have added an $\epsilon$-edge from $s_1$ to $s_n$ for all paths $\rho$ as described above. Note that there can be at most $O(m^2 n^2)$ such additions. Since the addition of each such $\epsilon$ edge does not change the spectra of any node, we can drop all the receiving edges without changing the spectra relation! We will refer to this new machine as the normalized machine. The construction of the normalized machine $R$ from a machine $R'$ can be carried out in time $O(m^6 n^5)$ when $R'$ has $O(mn)$ states. The algorithm to perform this transformation is given in Appendix B³. The correctness of this algorithm can be established by induction on the number of edges of paths in the graph. Now, we have the following:

$spectra(s_1) \cap chops(s_2, s_3)$ is empty iff $L(R) \cap L(\Phi_2(R''))$ is empty.

Since each of the steps involved in computing the intersection can be performed in polynomial space and time, the test for unspecified reception can be carried out in polynomial space and time.

6 Discussion

We have shown how two communicating finite state processes can be tested for freedom from deadlock and unspecified reception in polynomial time. Our analysis for deadlock settles the problem left open in [11]. On the other hand, our analysis for unspecified reception is novel and has not been considered in the literature before. Moreover, our solution to the unspecified reception problem is scalable to networks of more than two processes. In Figures 1 and 2 we show some networks that can be certified to be free of unspecified receptions using our algorithms. The problem of testing for deadlock in more than two processes still remains open.

¹ Parikh's mapping may be defined as follows. Let $\Sigma = \{a_1, \ldots, a_n\}$ be the alphabet of a language $C$. Let $\#_a(w)$ be the number of times $a \in \Sigma$ appears in the word $w \in C$. Now, $\Psi(C) = \{ v \mid v = [\#_{a_1}(w), \ldots, \#_{a_n}(w)] \text{ and } w \in C \}$.
² It can be obtained by replacing every event $g$ by $1$. The language $L(\Phi_1(R'))$ is the same as $\Phi_1(L(R'))$.
³ This algorithm does not depend on the final state of $R'$. Thus it computes the spectra of every node in a projection at the same time.
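The Parikh mapping of footnote 1 and the spectra formula of Section 5 (sends minus receives over words in which every prefix has at least as many sends as receives) can be checked by hand on small words. The sketch below is only an illustration under assumptions: the helper names `parikh`, `is_dyck_prefix` and `excess` are hypothetical, the letter `"r"` stands in for $\bar{1}$, and letter-equivalence is shown only for the textbook pair $\{a^n b^n\}$ and $(ab)^*$ restricted to $n < 5$, which is not a language taken from this paper.

```python
from collections import Counter


def parikh(word, alphabet):
    """Parikh vector: how often each letter of `alphabet` occurs in `word`."""
    counts = Counter(word)
    return tuple(counts[a] for a in alphabet)


def is_dyck_prefix(word, left="1", right="r"):
    """True if every prefix has at least as many `left` as `right` letters."""
    depth = 0
    for ch in word:
        if ch == left:
            depth += 1
        elif ch == right:
            depth -= 1
        if depth < 0:
            return False
    return True


def excess(word, left="1", right="r"):
    """Number of sends minus number of receives, i.e. #1 - #(barred 1)."""
    n_left, n_right = parikh(word, (left, right))
    return n_left - n_right


# A context-free language and a regular language with the same Parikh image
# (checked here only for the first five words of each).
cfl_words = ["a" * n + "b" * n for n in range(5)]
reg_words = ["ab" * n for n in range(5)]
same_image = ({parikh(w, "ab") for w in cfl_words}
              == {parikh(w, "ab") for w in reg_words})

print(same_image)
print(is_dyck_prefix("11r1r"), excess("11r1r"))
```

For a legal execution sequence such as `"11r1r"`, `excess` gives the number of messages left in the buffer, which is the quantity that $spectra$ collects over all legal sequences reaching a state.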
Acknowledgements: We would like to thank Piotr Berman for his suggestions and encouragement.

References

[5] Paris Kanellakis and Scott Smolka. On the analysis of cooperation and antagonism in networks of communicating processes. In IV ACM Symposium on Principles of Distributed Computing, 1985.
[6] Monica Lam. Compiler Optimizations for Asynchronous Systolic Array Programs. In XV ACM Symposium on Principles of Programming Languages, 1988.

Figure 1. An NCFSM $N_1 = (P_1, Q_1)$
Appendix

Note that ECL and IECL can be computed in time $O(m^3 n^3)$. Now we are ready for the algorithm.