Distributed Averaging Using Periodic Gossiping

4282 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 62, NO.
8, AUGUST 2017
Distributed Averaging Using Periodic Gossiping

Changbin Yu, Brian D. O. Anderson, Shaoshuai Mou, Ji Liu, Fenghua He, and A. Stephen Morse
Abstract—The distributed averaging problem is a consensus at any instant of time commonly just one pair of agents interacts, i.e.,
problem whose objective is to devise a protocol which will enable all exchanges and averages their values. For analysis purposes, typically
the members of a group of autonomous agents to compute the av-
erage of the initial values of their individual consensus variables in an underlying graphical structure is assumed for a gossiping algorithm,
a distributed manner. Periodic gossiping is a deterministic method and the graph must be connected. Vertices correspond to agents of
for solving the distributed averaging problem by stipulating that course, and the edges of the graph join those vertex pairs correspond-
each pair of agents which are allowed to gossip, do so repeatedly ing to agent pairs which are permitted to gossip. (Variants involving a
in accordance with a prespecified periodic schedule. Agent pairs
which are allowed to gossip correspond to edges on a given con-
jointly connected property over a time interval are sometimes invoked,
nected, undirected graph. In general, the rate at which the agents’ to deal with time-varying topologies.) One can conceive of synchronous
consensus variables converge to the desired average value de- or asynchronous selection of edges, and random or deterministic se-
pends on the order in which the gossips occur over a period. The lection of edges. Random gossiping algorithms are studied in, e.g.,
main contributions of this paper are first to characterize the classes [1]–[4]. There are nevertheless clear reasons which may lead one on
of periodic gossip sequences which have the same convergence
rate and second to prove that if the graph of allowable gossips is occasion to prefer deterministic gossiping to random gossiping. Here
a tree with each edge restricted to gossiping once per period, the are the following three.
convergence rate is quite surprisingly, fixed and invariant over all 1) With random gossiping, there is only a convergence rate on av-
possible periodic gossip sequences the graph allows. To arrive at eraged quantities, including mean square error. There are in fact
these results, a new and unusual graph theoretic concept, namely
the transfer function of a node of an undirected graph, is used.
particular random sequences with arbitrarily slow convergence rate
Among all the trees with the same number of edges, optimal tree which occur with positive probability; for these sequences, the ac-
structures, which yield the fastest convergence rate, can then be tual squared error is greater than the mean squared error. In contrast,
sought. when using deterministic gossiping sequences, a convergence rate
Index Terms—Consensus, edge coloring, gossiping, multiagent may be explicitly and often easily computed (and of course applies
systems. to all sequences).
2) Deterministic gossiping offers the opportunity for further conver-
I. INTRODUCTION gence rate improvement using multigossiping; in multigossiping,
several pairs of agents, with no agent in more than one pair, can
Gossiping is a particular way of achieving average consensus, i.e.,
gossip at a given time instant. (There is a random but centralized
deriving from a collection of networked agents’ initial values of a
and synchronous algorithm for multigossiping mentioned in [1],
certain variable what the average value is. With a conventional gossip-
but it does not seem to have enjoyed much attention, perhaps due
ing algorithm (as opposed to one allowing multigossiping, see below),
to the requirements of centralization and synchronization.)
3) Distributed optimization algorithms [5] often rely on deterministic
Manuscript received December 23, 2016; revised January 6, 2017; distributed averaging algorithms, of which deterministic gossip
accepted March 9, 2017. Date of publication March 27, 2017; date of algorithms are a special case.
current version July 26, 2017. The work of C. Yu and B. D. O. Anderson
It is also relevant to ask what are the distinguishing features of dis-
was supported in part by the Australian Research Council under Grant
DP-130103610 and Grant DP- 160104500, in part by the Natural Science tributed gossiping which differentiate it from other types of distributed
Foundation of China under Grant 61375072, and in part by the Westlake averaging algorithms.
Education Foundation. The work of S. Mou was supported by a funding On the one hand, especially if there is an underlying clock, there is
from Northrop Grumman Corporation. The work of A. S. Morse was the almost certain likelihood that convergence will be slower, although
supported in part by grants from the US Air Force Office of Scientific
Research, in part by the National Science Foundation, and in part by a the possibility of multigossiping when the gossiping is synchronous
gift from the Xerox Corporation. Recommended by Associate Editor F. can mitigate the convergence loss. As against the speed disadvantage,
R. Wirth. (Corresponding author: Ji Liu.) there can, however, be advantages, starting with the simplicity of the
C. Yu is with Westlake Institute for Advanced Study, Hangzhou information that is exchanged, and the way it is processed. If one uses
310024, China, and also with the Australian National University, Can-
the methods for distributed averaging set out in, e.g., [6] and [7], lin-
berra ACT2601, Australia (e-mail: Yu@wias.org.cn).
B. D. O. Anderson is with Hangzhou Dianzi University, Hangzhou ear iterations with doubly stochastic matrices are used, requiring each
310018, China, with the Australian National University Canberra agent to know an upper bound on the number of neighbors of each
ACT2601, Australia, and also with DATA61-CSIRO, Canberra, ACT of its neighbors. A second possibility is to use double linear itera-
2601, Australia (Previously NICTA) (e-mail: Brian.Anderson@anu. tions (push-sum), e.g., [8]–[12]. Knowledge concerning the number of
edu.au).
S. Mou is with Purdue University, West Lafayette, IN 47906 USA neighbors of neighbors is no longer a requirement, but the algorithms
(e-mail: mous@purdue.edu). are more complicated than linear iterations, and require more data to
J. Liu is with the Coordinated Science Laboratory, University of Illinois be communicated between agents.
at Urbana-Champaign, Urbana, IL 61801 USA (e-mail: jiliu@illinois.edu). Yet another and old variant on averaging achieves computation in
F. He is with Harbin Institute of Technology, Harbin 150001, China (e-
finite time: values are flooded through the network in finite time and
mail: hefenghua@gmail.com).
A. S. Morse is with Yale University, New Haven, CT 06520 USA each node computes an average at the end of the flooding process.
(e-mail: as.morse@yale.edu). A moment’s reflection illustrates a key difference (but not the only
Digital Object Identifier 10.1109/TAC.2017.2688278 one of course) between this type of algorithm and the ones mentioned
0018-9286 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications standards/publications/rights/index.html for more information.
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 62, NO. 8, AUGUST 2017 4283
above, which exhibit exponential convergence; the latter algorithms are The next section is devoted to preliminaries and some definitions.
robust to noisy values, or to limited short-term loss of communication Section III states the main result for general graphs, and in a corol-
between pairs of agents, and this in part accounts for their popularity. lary, the application of the main result to trees. Section IV provides
The flooding algorithm in contrast has nowhere near the same level of a proof of the main result. Section V considers various examples of
robustness. tree and star graphs, with illustrations of convergence rate. There is
Periodic multigossiping for message passing is an old and well- also examination of the question of what trees for a fixed number of
understood idea in computer science, see, e.g., [13] and [14]; multi- nodes will give the fastest convergence rate for periodic gossiping with
gossiping (but not periodic multigossiping) for averaging was also no repeats in a period. The section also makes some observations on
mentioned in [1]. Deterministic periodic multigossiping is dealt with the distinction between random and periodic gossiping. Section VII
in more detail below. contains concluding remarks.
A criticism of deterministic gossiping algorithms is that they have
some overhead for implementation, in that before gossiping com- II. DEFINITIONS AND BACKGROUND
mences, a protocol must be supplied to all agents to schedule when
they gossip. However, such protocols can be very simple, and once Let F = (V, E) be an undirected simple graph with n nodes in the
gossiping commences, the gossiping process can then run in a very node set V and m edges in the edge set E. The vertices correspond
straightforward way [15]–[18]—no centralized activity (leaving aside to agents which store a variable of interest, and the edges define the
synchronization) is required. It seems indeed hard to rank logically two agent pairs between which a gossip potentially occurs. A pair of agents
algorithms where one is more complicated to initialize but easier to run with labels i and j are said to gossip at time t if both xi (t + 1) and
than the other. xj (t + 1) are set equal to the average of xi (t) and xj (t). Thus, F is
In this work, we largely focus on gossiping algorithms where there the allowed gossip graph. Let E = e1 , e2 , . . . , em with m = |E| be an
is a (deterministic) periodic protocol causing the gossip labeled by ordering of the edges of F , with each edge appearing just once. We call
each edge to occur exactly once in the underlying period. We do, such a sequence nonredundant. Call the sequence E complete if F is
however, briefly also consider situations where some edges gossip connected, and call it minimally complete if it is complete and F is a
more than once in a single period; the situation is roughly analogous to tree.
having unequal probabilities of gossiping of allowed pairs in a random Over one period of gossiping, the gossips defined by E occur in the
gossiping framework. Unsurprisingly, examples show one can achieve defined order. Over an infinite interval, periodic repetition is assumed
a speed-up, but we lack insights at this point that suggest how to to occur, giving rise to an infinite periodic gossip sequence.
systematically exploit the phenomenon. Any individual gossip whether or not in a periodic gossip sequence
Assuming convergence to a consensus average does occur, it is rea- can be described by a specially structured doubly stochastic matrix (call
sonable to suppose that the convergence rate associated with determin- it a single-gossip matrix). If edge i defines a gossip, and is incident on
istic periodic gossiping will be dependent on the order of the gossips nodes j, l the associated single-gossip matrix Si has entries of 1/2 in
within one period. It is easy to construct examples which verify this. the jj, jl, lj, ll entries, 1’s elsewhere on the main diagonal and zeros
Nevertheless it is not always the case. We establish here a result which elsewhere. It can be easily checked that if two edges are not incident,
we found very surprising, originally having observed it experimentally. the corresponding two single-gossip matrices commute. If x(k) is the
Consider a graph in which no edge gossips more than once per period. vector of stored values at different nodes at time k and edge i then
Then, any two successive gossips within the period can be interchanged gossips, there holds x(k + 1) = Si x(k). If in all there are m edges
without affecting the convergence rate if either the nodes on which the in a periodic gossiping sequence, the composition of the m successive
edges are incident are all distinct, or, when there is a common node for gossips corresponding to a product of such single-gossip matrices is
the two edges, that the edges are not within any cycle of the graph. As again a doubly stochastic matrix, call it T . Where we wish to emphasize
an immediate corollary, if the graph is a tree, the convergence rate is that T is indeed a product, we shall term it a composite gossip matrix.
independent of the order of the gossip sequence. Building on the main The term gossip matrix may denote a single-gossip or composite gossip
result, we can also establish conditions applicable to the case where matrix. Define ρ to be the magnitude of the eigenvalue of the second
within one period, one or more edges can gossip more than once. largest magnitude of the matrix T (the largest eigenvalue is 1). If ρ < 1
The discovery of the invariance of convergence rate with respect to (and conditions for this are reviewed below) then the convergence rate
the ordering of gossips over a period for tree graphs (given no repeats on a per period basis is ρ (in the sense that T k approaches a limit as
within the period) was originally announced in our paper [15], and k → ∞ as fast as ρk tends to zero). Since there are m gossips making
later in [16]. Neither paper contained a detailed proof of the claim. A up one period, then the convergence rate on a per gossip basis, which
complete proof of the invariance property for tree graphs was derived is the usual basis for describing convergence rates, is ρ1 / m .
in an unpublished technical report which was the basis of this paper; The composite gossip matrix defined by the sequence E is denoted
that report did not treat the case of general graphs addressed here. The by TE . Then,
work in this paper inspired the work in [18], which derived the result in TE = Sm Sm −1 . . . S1 . (1)
an alternative way through a detailed calculation of specific entries of
matrices arising in the gossip process, including products of matrices Call TE a (minimally) complete gossip matrix just when E is (mini-
used to characterize single gossips. In contrast to this paper, none of mally) complete. If the first gossip within a single period occurs at time
the cited works treated graphs more general than tree graphs, and it is k and x(k) denotes the vector of gossip variables, then after one period,
quite unclear how [18] could be extended beyond tree graphs. the new vector of gossip variables is x(k + m) = TE x(k). Now given
In the course of establishing the results, we introduce a form of trans- a periodic gossip sequence E, the completeness property, equivalently
fer function associated with a node and a periodic gossiping sequence connectedness of the associated graph F , is necessary and sufficient
in a graph. This concept appears to be of interest in its own right, and to ensure that ρ(TE ) < 1 and the convergence of the gossip process
is also used as a device for obtaining the characteristic polynomial of occurs, see, e.g., [17]. Indeed it is well known, see [19], that because
a gossiping matrix associated with a graph formed by joining two or ρ(TE ) < 1, then TEk → n1 11 as k → ∞. (Here, as usual, 1 denotes
more simpler graphs at a common node. a vector of all 1’s.)
4284 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 62, NO. 8, AUGUST 2017
Below, we shall be working with composite gossip matrices formed

from a subgraph of a given graph. On occasion, this subgraph may have
no edges, though it will have (a nonzero number of) vertices. In this
case, we simply take T = I.
An important tool in the following is provided by the transfer func-
tion of a gossip matrix, especially a complete gossip matrix, which is
defined in conjunction with identifying a particular node of the graph;
for convenience we take this to be the last numbered node, since node
reordering will deal with the case where a different node is used. Sup-
pose T is an n × n gossip matrix, and written in partitioned form
as
Fig. 1. (a) e1 and e2 are incident on the same node and are not part
A b of any cycle. (b) e1 and e2 are incident on the same node but e1 is in a
T = (2)
c d cycle.
with d a scalar. Then,
w(z) = d + c(zI − A)−1 b (3) 3) π is a transposition permutation which for some i < m inter-
changes i and i + 1 provided that in F , edges ei , ei + 1 are not
is defined to be the transfer function associated with T . We will always incident on the same node;
assume that w is given in reduced form, i.e., pole-zero cancellations 4) π is a transposition permutation which for some i < m inter-
are affected. To distinguish transfer functions associated with differ- changes i and i + 1 provided that neither ei nor ei + 1 is contained
ent gossip matrices, we shall usually write w(T ), suppressing the z in any cycle of F .
dependence. Note that if T = I, then w(z) = 1. Let TE and TE π be as defined in (1) and (5). Then, for any π in
Remark 1: An interpretation of w(z) can be easily given. Consider Π(E), these two matrices have the same eigenvalues.
a signal flow graph [20] in which the nodes correspond to nodes 1 That the eigenvalues of TE are left invariant by the first three trans-
through n − 1 of F , with two further nodes, an input node, label it formations is straightforward to demonstrate, as we now show. The
ni and an output node, labeled no , that can together be thought of as first claim is obvious. For the second claim, recall that it is well known
corresponding to node n of F . Let ai j , bi , ci denote, respectively, the that for two square matrices S1 , S2 of the same size, the eigenval-
ij entry of A, the ith entry of the column vector b, and the ith entry of ues of S2 S1 are identical with those of S1 S2 . From this it follows at
the row vector c. In the signal flow graph, there is a directed edge from once that the eigenvalues of Sm Sm −1 . . . S1 are identical with those
node j to node i which is labeled with the gain ai j z −1 for 1 ≤ i, j < n, of S1 Sm Sm −2 . . . S2 and Sm −1 Sm −2 . . . S1 Sm , and this yields the
a directed edge from node ni to node i with the gain bi z −1 , a directed second claim. For the third claim, observe that Si , Si + 1 each differ
edge from node i to node no with gain ci for 1 ≤ i < n, and a directed from the identity matrix in precisely four entries, which are of value
edge from node ni to node no with gain d. The transfer function w(z) 1/2, occurring in rows and columns corresponding to the nodes on
is the gain from ni to no . This is rather like a loop gain, obtained by which ei , ei + 1 are incident. Because all four nodes are distinct, Si and
constructing a signal flow graph corresponding to the gossip algorithm, Si + 1 commute. Thus, this permutation not only leaves the eigenvalues
and but with an opening of the loop at node n. invariant, but also the product matrix itself.
The characteristic polynomial |zI − A| is denoted by ip(T ) (short- It is the eigenvalue invariance property under the fourth permutation
hand for the ‘internal polynomial of T ”). It is straightforward to see type which is nontrivial to verify. Verification is provided in Section IV.
that Note that in the light of the third condition, we only need to consider
the situation where ei , ei + 1 are incident on a common node. Fig. 1
|zI − T | = ip(T )(z − w(T )). (4) illustrates (a) a case in which the transposition of e1 and e2 belongs to
the fourth permutation type; and (b) a case in which the transposition
Our key interest in this paper is in considering variations
of e1 and e2 belongs to neither the third nor the fourth permutation
of the gossip ordering within a period. To this end, let π :
types.
{1, 2, . . . , m} → {1, 2, . . . , m} denote a permutation map. Let Eπ =
eπ (1 ) , eπ (2 ) , . . . , eπ (m ) denote the edge sequence after acting on E
with the permutation π. Then, A. Result for Trees
We now indicate the most important consequence of the above
TE π = Sπ (m ) Sπ (m −1 ) . . . Sπ (1 ) . (5)
theorem, achieved by restricting the underlying graph to be a
tree.
III. MAIN RESULTS Theorem 2: Adopt the same hypothesis as for Theorem 1 and sup-
The first main result we will prove in the next section is as follows. pose additionally that E is a minimally complete sequence, or equiva-
Theorem 1: Let F = (V, E) be an arbitrary simple, connected, lently that F is a tree. Then, for any permutation π : {1, 2, . . . , m} →
undirected graph with n nodes and m edges. Let E be an order- {1, 2, . . . , m} the complete gossip matrices TE , TE π have the same
ing of the edges in E, defining a nonredundant complete gossip se- eigenvalues.
quence of length m. Consider the group Π(E) (under composition) of The proof is immediate: it is well known that every permutation of
permutations π : {1, 2, . . . , m} → {1, 2, . . . , m} generated by the per- a finite set is expressible as a composition of transpositions of adjacent
mutations of the following types: elements, see, e.g., [21]; since F is a tree, no edge is in a cycle, and so
1) π is the identity permutation; under the fourth category, all transpositions of adjacent elements, and
2) π preserves the cyclic order of the edges; therefore all permutations, are allowed.
B. Speed-Up of Convergence by Multigossiping corresponding edge appears more than once in the ordering defined
by E. If an edge, ej say, gives rise to q entries in E, then q − 1 extra
Finally, in this section, we remark on a consequence of Theorem 2
edges are inserted between the nodes on which it is incident. Call these
which allows speed-up for gossiping over trees. When two successive
additional edges ej 2 , ej 3 , . . . , ej q . An ordering E∗ is defined for F ∗ by
gossiping steps (involving two edges) do not involve any agent in
identifying the q entries of E corresponding to ej and assigning q − 1
common, the two edges are said to be nonadjacent, and then the gossips
of them arbitrarily to correspond to ej 2 , ej 3 , . . . , ej q . It is evident that
can be performed simultaneously [1].
E∗ is a nonredundant and complete gossip sequence of length m for
To state this more formally in relation to the graph F = (V, E),
F ∗ . As such, Theorem 1 applies. It is then straightforward to conclude
let us define a multigossiping sequence as a cyclic sequence
that condition 4) of this theorem applied to F corresponds to condition
{E1 , E2 , . . . , Ei ,E1 , . . . } of subsets Eiof individual gossiping edges,
4) of Theorem 1 applying to F ∗ .
such that 1) Ei Ej = ∅ (i = j); 2) Ei = E; and 3) no two edges
in the same Ei have a common node. Then at any clock pulse, a set of
edges Ei become active and gossiping is performed over these edges IV. ESTABLISHING THE MAIN RESULT
simultaneously. Our aim is now to establish Theorem 1, and in particular the eigen-
An optimal multigossiping schedule for a graph may be character- value invariance property under the fourth condition. The general ap-
ized as one where there is the least number of simultaneous gossips proach is inductive. An overview of the approach is as follows. Consider
required in one period; such a schedule can be identified by solving a graph in which there is a node of degree at least 2 with at least two
an edge-coloring problem on F = (V, E). In graph theory [22], edge incident edges which are not in any cycle. Decompose, in a manner
coloring is an assignment of different colors to the edges of a graph to be described and guided by the existence of this node, the overall
F , such that no two edges having the same color are incident on the graph into a union of spanning subgraphs (i.e., subgraphs with the
same node. The minimum required number of colors for a graph is same node set as the original graph) with disjoint edge sets. Relate the
called the chromatic index, denoted by χ (F ). Various results exist complete gossip matrix for the original graph to gossip matrices for
involving the chromatic index, but the most relevant one is Vizing’s the spanning subgraphs, and likewise relate the transfer functions and
theorem: the chromatic index of any graph equals either the maximum the characteristic polynomials. Then, prove the desired result that the
node degree, δ(F ), or δ(F ) + 1. There are also algorithms for solving permutations of the fourth type listed in the theorem statement do not
the edge-coloring problem in a distributed way, see, e.g., [23] and [24] change the eigenvalues of the composite gossip matrix over one period
While determining the exact chromatic index of a general graph for the original graph, assuming by induction that it is valid for the
is NP-complete, it is easy to determine two possible values which spanning subgraphs.
differ only by one, so the acceleration of the convergence rate for We first recall some further terminology which is useful for this
multigossiping over single gossiping is almost guaranteed by an order section. Let G be a simple, not necessarily connected graph with the
of δ(F ). For some common families of graphs, the chromatic index node set {1, 2, . . . , n}. By an isolated node of G is meant any node
is easily obtainable, see, e.g., [25]. In particular, for a path graph, with degree 0; all other nodes are nonisolated. Obviously, if i is an
the chromatic index is 2 (which is δ), for a ring graph it is 2 or 3 isolated node of G, then in any gossip matrix, G say, formed from G,
accordingly as the number of edges is even or odd, and for a tree graph, whether it is single or composite the ith row of G has a one in its ith
the chromatic index is δ. column and zeros elsewhere.
C. Allowing Redundant Gossip Sequences A. Background Lemmas

To this point, we have postulated that in one period, the gossiping We first state two straightforward lemmas to be used in the sequel;
sequence is defined by an ordering E of the edges in E which is their simple proofs are omitted.
nonredundant, i.e., no repetitions occur. We now extend Theorem 1 to Lemma 1: Let T be an n × n matrix and let π : {1, 2, . . . , n} →
relax this constraint. {1, 2, . . . , n} define a permutation which leaves n unchanged, with
Theorem 3: Let F = (V, E) be an arbitrary simple, connected, Q the associated permutation matrix. Then, the last row of QT is the
undirected graph with n nodes and m edges. Let E be an ordering same as that of T and there holds
of the edges in E, such that each edge occurs at least once, and suppose
there are m > m elements in E. Consider the group Π(E) (under com- w(QT Q ) = w(T ) and ip(QT Q ) = ip(T ). (6)
position) of permutations π : {1, 2, . . . , m } → {1, 2, . . . , m } acting
on ordered sets of m elements and generated by the following permu- Lemma 2: Let G = (V, E) be a simple graph with n nodes, and
tations types: suppose that for some integer r > 1, there exist spanning subgraphs
1) π is the identity permutation; G1 = (V, E1 ), G2 = (V, E2 ), . . . , Gr = (V, Er ) of G such that
2) π preserves cyclic ordering; 1) ∪Gi = G
3) π is a transposition permutation which interchanges two adjacent 2) For all 1 ≤ i < j ≤ r, subgraphs Gi and Gj have no nonisolated
elements of the ordering provided that in F , the corresponding node in common.
edges are not incident on the same node; Let G be a composite gossip matrix for G associated with a nonre-
4) π is a transposition permutation which interchanges two adjacent dundant sequence E. Then, there exist composite gossip matrices Gi
elements of the ordering provided that in F , each corresponding associated with nonredundant edge sequences of each Gi such that
edge is not contained in any cycle of F , nor is it a repeated edge, G = Gπ (1 ) Gπ (2 ) . . . Gπ (r ) (7)
i.e., it does not correspond to more than one element of E.
Let TE and TE π be as defined in (1) and (5), save that m is replaced for any permutation map π : {1, 2, . . . , r} → {1, 2, . . . , r}, and in the
by m . Then for any π in Π(E), these two matrices have the same event that Ei is empty, the associated Gi is the identity matrix.
eigenvalues. We emphasize that all nodes of G are nodes of Gi for every i;
Proof: Define a new graph F ∗ = (V, E ∗ ), where E ∗ is obtained from consequently, condition 1) is a condition on the union of the edge sets:
E by adjoining to it extra edges between any node pair where the ∪Ei = E. Conditions 1) and 2) together in fact imply that the edge
sets Ei of the Gi form a partition of the edge set E of G and if for Gπ (1 ) by {i1 , i2 , . . . , ik , n} and {ik + 1 , . . . , in −1 }, respectively. Re-
any node w of G, there is an incident edge, call it e, then e and all call v is chosen to be the node corresponding to index n. Since
other edges incident at w must lie in one and the same subgraph, Gi r = 2, n indexes the common nonisolated node to the two subgraphs.
say (depending on w), with w isolated in Gj for all j = i. Hence Let Q denote the permutation matrix representing the permutation
any connected component of G in fact must lie entirely in one Gi . {1, 2, . . . , n} → {i1 , i2 , . . . , in −1 , n}; note for use below that QT has
Lemma 2 is then an immediate consequence of Claim 3 of Theorem 1: the same last row as T since the permutation leaves n invariant and
if any two successive single-gossip matrices in a nonredundant gossip note that Q is not a representation of π. It follows that QGπ (2 ) Q takes
sequence for G correspond to edges from different Gi , then they may the following form for some A2 , b2 , c2 , d2 :
be interchanged, since they cannot share a common node. In this way, ⎡ ⎤
all the individual gossips corresponding to edges in the one Gi can be Ik ×k 0 0
⎢ ⎥
sequentially performed, and the sequence of gossips associated with QGπ (2 ) Q = ⎣ 0 A 2 b2 ⎦ . (11)
Gi can be done before or after the sequence associated with Gj for any 0 c2 d2
i = j.
The next lemma is of independent interest. When a graph G has with d2 scalar. Similarly, we can argue that for some A1 , b1 , c1 , d1
the structure defined in the hypothesis of the lemma and when the ⎡ ⎤
gossip sequence order is, in a sense made clear in the lemma state- A1 0 b1
ment, consistent with the graph structure, the transfer function and QGπ (1 ) Q = ⎣ 0 I(n −k −1 )×(n −k −1 ) 0 ⎦ .

(12)
internal polynomial can be related to those of the spanning subgraphs. c1 0 d1
Following the lemma, we shall specialize the structure to which it is
applicable. Evidently,
Lemma 3: Let G = (V, E) be a simple not necessarily connected
QGπ (1 ) Gπ (2 ) Q = QGπ (1 ) Q QGπ (2 ) Q
graph with n nodes, and suppose that for some r, there exist spanning
⎡ ⎤
subgraphs G1 = (V, E1 ), G2 = (V, E2 ), . . . , Gr = (V, Er ) of G such A 1 b1 c2 b1 d2
that ⎢ ⎥
=⎣ 0 A2 b2 ⎦ . (13)
1) ∪Gi = G
2) For all 1 ≤ i < j ≤ r, the spanning subgraphs Gi and Gj have c1 d1 c2 d1 d2
no nonisolated node in common with the exception of a single It is then straightforward algebra to verify that
nonisolated node v, which is nonisolated in at least two of the Gi .
Let G be a composite gossip matrix for G associated with a nonre- w(QGπ (1 ) Gπ (2 ) Q ) = w(QGπ (1 ) Q )w(QGπ (2 ) Q ) (14)
dundant sequence E and suppose it can be expressed in terms of com-
posite gossip matrices Gi associated with nonredundant sequences in and it is immediate that
each Gi as follows:
ip(QGπ (1 ) Gπ (2 ) Q ) = |zI − A1 ||zI − A2 |
G = Gπ (1 ) Gπ (2 ) . . . Gπ (r ) (8)
ip(QGπ (1 ) Q ) = (z − 1)(n −k −1 ) |zI − A1 |
for some particular permutation π : {1, 2, . . . , r} → {1, 2, . . . , r}.
Suppose that the nodes are ordered, and node v is the last node in ip(QGπ (2 ) Q ) = (z − 1)k |zI − A2 |. (15)
the sequence. Then, there holds
Now recall that QT for any T has the same last row as T . By

r
Lemma 1, it follows from these equations that
w(G) = w(Gi )
i= 1
w(Gπ (1 ) Gπ (2 ) ) = w(Gπ (1 ) )w(Gπ (2 ) )
and
1
1 ip(Gπ (1 ) Gπ (2 ) ) = ip(Gπ (1 ) )ip(Gπ (2 ) ). (16)
ip(G) = ip(G1 )ip(G2 ) . . . ip(Gr ) (9) (z − 1)(n −1 )
(z − 1) (r −1 )(n −1 )
and This establishes the basis step of the induction. We observe the fact,
used below, that had v been an isolated node of Gπ (2 ) , the same
|zI − G| = |zI − Gσ (1 ) Gσ (2 ) . . . Gσ (r ) | (10) equation (16) would result; in the derivation, one would simply have
for an arbitrary permutation σ : {1, 2, . . . , r} → {1, 2, . . . , r}. b2 = 0, c2 = 0, d2 = 1.
Proof: Note an immediate implication of the hypothesis 2), that no The inductive step is more straightforward: assume the result
edge can be in two different edge sets Ei . Hence the Ei form a partition holds for r = 1, 2, . . . , i, and suppose now r = i + 1. Then, by hy-
of E, so that in particular, Ei ∩ Ej = φ, 1 ≤ i < j ≤ r, a fact we will pothesis G can be written as a product Gπ (1 ) Gπ (2 ) . . . Gπ (i + 1 ) =
use below. [Gπ (1 ) Gπ (2 ) . . . Gπ (i ) ]Gπ (i + 1 ) of composite gossip matrices, for some
We will establish the formulae for w(G) and ip(G) by induction on permutation π. Define the union graph GΠ (i ) := ∪ik = 1 Gπ (k ) . By hy-
r. For the basis step, suppose that r = 2. pothesis, any nonisolated node of the union graph other than the node
Let Gπ (1 ) have k + 1 nonisolated nodes. We will consider two cases. labeled m is an isolated node of Gπ (i + 1 ) . Also, GΠ (i ) ∪ Gπ (i + 1 ) = G

Suppose first that k + 1 = n, implying all nodes of Gπ (1 ) are noniso- and Ei + 1 is disjoint from ik = 1 Eπ (k ) . Accordingly, the basis step of
lated. Condition 2’) implies that with r = 2, both G1 and G2 must the induction yields (whether or not m labels a nonisolated node of
have at least one edge. Hence Gπ (2 ) has at least two nonisolated nodes, Gπ (i + 1 ) )
which are evidently nonisolated for Gπ (1 ) as well. This is a contradic-
tion and so this case cannot arise. w(GΠ (i + 1 ) ) = w(GΠ (i ) Gπ (i + 1 ) ) = w(GΠ (i ) )w(Gπ (i + 1 ) )
For the second case, still with r = 2, suppose k + 1 < n, and 1
ip(GΠ (i ) Gπ (i + 1 ) ) = ip(GΠ (i ) )ip(Gπ (i + 1 ) )
denote the set of labels of the nonisolated and isolated nodes of (z − 1)(n −1 )
and then invoking the standard induction assumption

i
w(GΠ (i ) ) = w(Gπ (j ) )
j=1
1
i
ip(GΠ (i ) ) = ip(Gπ (j ) )
(z − 1)(n −1 )(i −1 ) j=1
it is immediate that
i
+1
w(GΠ (i + 1 ) ) = w(Gπ (j ) )
j=1
i+1
1
ip(GΠ (i + 1 ) ) = ip(Gπ (j ) ). Fig. 2. Illustration of the three spanning subgraphs Gi with disjoint
(z − 1)(n −1 )i j = 1 edge sets Ei , i = 1, 2, 3. Their union is G.
This completes the induction.

Finally, to establish the sequence independence property (10) for
|zI − G|, recall (8). By what has just been proved, for any permutation F . Define nonempty spanning subgraphs Gi , i = 1, 2, 3 of G (which
will then be spanning subgraphs also of F ) as follows: G1 contains
σ, the gossip matrix Ĝ = Gσ (1 ) Gσ (2 ) . . . Gσ (r ) has the same trans- the connected component of G containing node u; G2 contains the
fer function, viz., ri= 1 w(Gi ) and same internal polynomial, viz.,
r connected component of G containing node v; G3 contains the other
i = 1 ip(Gi ). Hence Ĝ and G have the same characteristic polyno-
mial by (4). This proves the result. edges of G (there will be none if ei and ei + 1 are the only edges incident
Remark 2: There is a helpful way to think about the last lemma. on n). Note that if ei or ei + 1 were in a cycle, there could not be three
Consider r graphs, Hi , i = 1, 2, . . . , r with disjoint node sets of size such nonempty subgraphs. The definition results in G, G1 , G2 , and G3
ni . Identify one node in each graph and connect the r graphs at this fulfilling the conditions of Lemma 2 with r = 3. By cyclic permutation,
we have
node, to form a new graph G with n = ri= 1 ni − (r − 1) nodes.
(Many graphs can be grown in this fashion, and all trees can be grown |zI − TE | = |zI − M Si + 1 Si | (18)
by a sequence of such operations.) Then, the transfer function and
internal polynomial for G can be found in terms of those of the Hi . for some M which is associated with a nonredundant edge sequence for
To see this, define first spanning subgraphs Gi of G by adding to G. By Lemma 2, there are corresponding gossip matrices Gi associated
Hi the set of nodes of ∪j = i Gj , except for the node which becomes with the Gi such that M = G3 G2 G1 , and so
the common node at which all the Hi are joined when forming G.
Clearly. gossip matrices for the Hi induce gossip matrices for the |zI − TE | = |zI − G3 G2 G1 Si + 1 Si |. (19)
Gi : up to node reordering, a gossip matrix for Gi is the direct sum Moreover, the edge set of G1 is incident on no nodes in common with
of an identity matrix In −n i and a gossip matrix for Hi . In obvious the pair associated with the gossip Si + 1 , so that G1 and Si + 1 commute.
notation, w(Hi ) = w(Gi ), ip(Gi ) = (z − 1)n −n i ip(Hi ). Therefore, Hence,
the lemma explains how to obtain the transfer function and internal
polynomial for a graph in terms of the corresponding quantities for |zI − TE | = |zI − G3 G2 Si + 1 G1 Si |. (20)
subgraphs (not spanning subgraphs) obtained by separating the given
graph at a node. Now define spanning graphs Hj , j = 1, 2 of F with Hj having
edge set as the union of the edge set of Gj and ei + j −1 . Notice that
F = H1 ∪ H2 ∪ G3 , that H1 , H2 , G3 have a single nonisolated node
B. Proof of Main Theorem
in common, viz., n, and that the edge sets of the three graphs are
We now have the machinery to complete the proof of Theorem 1. disjoint. Accordingly, Lemma 3 is applicable, with r = 3.
Permutations of the first three types listed in the theorem statement Now observe that G1 Si and G2 Si + 1 are composite gossip matrices
are assumed to have been verified as providing the eigenvalue result. associated with nonredundant edge sequences of H1 and H2 . Hence
Given that in particular the eigenvalue property has been verified for the by Lemma 3, (20) yields
third type, it is enough to consider the fourth type when ei , ei + 1 have
a common node. Without loss of generality, suppose that the nodes are |zI − TE | = |zI − G3 G1 Si G2 Si + 1 |. (21)
so ordered that this is the last node, with index n. Recalling that Sj Because G2 commutes with Si and G1 , there then holds
is the single-gossip matrix corresponding to edge ej , let π̄ denote the
permutation of the edges which simply interchanges edges i and i + 1, |zI − TE | = |zI − G3 G2 G1 Si Si + 1 |
so that, as defined in (1),
= |zI − M Si Si + 1 | = |zI − TE π̄ | (22)
TE = Sm Sm −1 . . . Si + 1 Si . . . S1
with the last step following by cyclic permutation again. This completes
while also the proof of the main theorem.
Remark 3: In [26], it is pointed out that the replacement of the usual
TE π̄ = Sm Sm −1 . . . Si Si + 1 . . . S1 . (17) gossip equations for every edge jl ∈ E by
We are required to show that |zI − TE | = |zI − TE π̄ |.
xj (k + 1) = wxj (k) + (1 − w)xl (k)
As illustrated in Fig. 2, let ei join nodes u, n and ei + 1 join v, n. Let
G denote the spanning subgraph of F when ei , ei + 1 are deleted from xl (k + 1) = (1 − w)xj (k) + wxl (k) (23)
1
with w ∈ (0, 1) still leads to convergence to an average, but the con- magnitude eigenvalue of the average gossip matrix is 1 − 2m
in an m-
vergence rate can be faster for some values of w. The main results of edge graph. For m = 10, this is .9500.
this paper and their proofs still hold with this replacement, there being
nothing special about w = 12 .
VI. CONCLUSION
V. EXAMPLES The main results of this paper are concerned with deterministic
periodic gossiping. We have shown that for a general graph, a complete
In this section, we present some examples of trees, including path gossip matrix obtained by multiplying together single-gossip matrices
and star graphs, and identify optimal tree topologies, i.e., those offering corresponding to each edge of the graph has a spectrum that remains
fastest convergence. Some comparison with random gossiping is also unaltered if time-adjacent gossips are interchanged, under the proviso
made. that the associated edges share no common node, or, if they do share a
common node, that they are not in any cycle of the graph.
A. Path and Star Graphs There remain some problems whose solutions would be of interest,
but they do seem challenging. First, is there a way of applying the order
Consider a path graph with m edges. A rather lengthy induction invariance results to graphs which contain cycles, that would yield
argument will show that the transfer function obtained when a leaf some insight as to the range of possible convergence rates which can
node is taken as the last vertex is expressible in terms of Chebyshev be obtained, and how they might be optimized? Equivalently, the latter
polynomials, and the eigenvalues of the characteristic polynomial of task is to identify for a nontree graph what complete periodic gossip
the composite gossip matrix Tm over one period are 1, cos2 mj+π 1 , j = sequence will result in the fastest convergence; such a sequence would
1, 2, . . . , m/2 , 0, . . . , 0. Then, ρ(Tm ) is cos2 m π+ 1 . This means that not necessarily have to include all edges in the graph, so long as those
1
without multigossiping, the convergence rate is ρ m , and for multigos- edges which were included defined a connected spanning subgraph. A
1
siping it is ρ 2 . For m = 10, there results .9918 and .9595, respectively second issue with periodic gossiping is to allow the possibility that over
For a star graph with m edges, with Tm the composite gossip matrix one period some edges gossip more than once, and to draw conclusions
over one period, it is straightforward to conclude using the transfer on what sort of improvement can be achieved in convergence rates (this
function concept that is equivalent to allowing graphs which are not simple, and having each
m edge gossip once in a period); ideally, one would like to understand
1 z m which edges in the graph should be repeated and which should not.
|zI − Tm | = z z − − . (24)
2 2
No analytic formula is known for the zeros. For m = 10, the conver- REFERENCES
gence rate for gossiping results as ρ1 / m (Tm ) = .9759. Multigossiping
[1] S. Boyd, A. Ghosh, B. Prabhakar, and D. Shah, “Randomized gossip
is not possible. algorithms,” IEEE Trans. Inf. Theory, vol. 52, no. 6, pp. 2508–2530,
Jun. 2006.
[2] M. Cao, D. A. Spielman, and E. M. Yeh, “Accelerated gossip algorithms
B. Optimal Tree Graphs
for distributed computation,” in Proc. 44th Annu. Allerton Conf. Commun.,
A table listing the number of nonisomorphic trees with vertex count Control, Comput., 2006, pp. 952–959.
[3] J. Liu, B. D. O. Anderson, M. Cao, and A. S. Morse, “Analysis of acceler-
up to 40 is available in [27], and a web page associated with Brendan
ated gossip algorithms,” in Proc. 48th IEEE Conf. Decis. Control, 2009,
McKay [28] describes a program “Trees” which lists all (unlabeled) pp. 871–876.
trees according to diameter and vertex count up to 22. Using this [4] F. Fagnani and S. Zampieri, “Randomized consensus algorithms over large
data, an exhaustive search of optimal trees giving the fastest gossip or scale networks,” IEEE J. Sel. Areas Commun., vol. 26, no. 4, pp. 634–649,
multigossip rates was conducted up to m = 21. For m = 10, the rates May 2008.
[5] A. Nedić and A. Ozdaglar, “Distributed sub-gradient methods for multi-
were .9259 and .8706, respectively. For m = 21, the figures were .9804 agent optimization,” IEEE Trans. Automat. Control, vol. 58, no. 6,
and .9532. The optimum shapes tended to involve a path of two or three pp. 48–61, Jan. 2009.
nodes, each of which was the node of a star. To clarify the terminology, [6] L. Xiao and S. Boyd, “Fast linear iterations for distributed averaging,”
which is peculiar to this paper: a double star graph comprises two Syst. Control Lett., vol. 53, pp. 65–78, 2004.
[7] L. Xiao, S. Boyd, and S. Lall, “A scheme for robust distributed sensor
connected vertices, and from each of the vertices a number of rays
fusion based on average consensus,” in Proc. 4th Int. Conf. Inf. Process.
spread out. A triple star graph comprises three vertices forming a path Sensor Netw., 2005, pp. 63–70.
subgraph, and from each of the vertices a number of rays spread out. [8] D. Kempe, A. Dobra, and J. Gehrke, “Gossip-based computation of aggre-
Optimal double stars tend to be balanced in the sense that the degrees gate information,” in Proc. 44th IEEE Symp. Found. Comput. Sci., 2003,
of the central vertices are roughly equal, but this is not so for triple pp. 482–491.
[9] A. Olshevsky and J. Tsitsiklis, “Convergence speed in distributed consen-
star graphs. We are not sure whether the triple star will become more sus and averaging,” SIAM J. Control Optim., vol. 48, no. 1, pp. 33–55,
balanced when n tends to be large. As far as we know there is no 2009.
existing algorithm to list all unlabeled trees for large n, which limits [10] F. Benezit, V. Blondel, P. Thiran, J. Tsitsiklis, and M. Vetterli, “Weighted
the possibility of extensive search for optimal trees even in simulations. gossip: Distributed averaging using non-doubly stochastic matrices,” in
Proc. IEEE Int. Symp. Inf. Theory, 2010, pp. 1753–1757.
The optimum graphs are not the same for gossiping and multigossiping
[11] K. Tsianos, S. Lawlor, and M. Rabbat, “Push-sum distributed averaging
however. for convex optimization,” in Proc. 51st IEEE Conf. Decis. Control, 2012,
pp. 5453–5458.
[12] J. Liu and A. S. Morse, “Asynchronous distributed averaging using double
C. Comparison With Random Gossiping linear iterations,” in Proc. Amer. Control Conf., 2012, pp. 6620–6625.
Any comparison with random gossiping is an apples and oranges [13] A. L. Liestman and D. Richards, “Network communication in edge-
coloured graphs: Gossiping,” IEEE Trans. Parallel Distrib. Syst., vol. 4,
comparison, as noted in the introduction. Random gossiping conver- no. 4, pp. 438–445, Apr. 1993.
gence rates are average and the optimum graph is that with the greatest [14] R. Labahn, S. T. Hedetniemi, and R. Laskar, “Periodic gossiping on trees,”
algebraic connectivity, which is always a star graph. The second largest Discrete Appl. Math., vol. 53, pp. 235–245, 1994.
[15] B. D. O. Anderson, C. Yu, and A. S. Morse, “Convergence of periodic [22] R. Diestel, Graph Theory. New York, NY, USA: Springer-Verlag, 1997.
gossiping algorithms,” in Perspectives in Mathematical System Theory, [23] D. A. Grable and A. Panconesi, “Nearly optimal distributed edge colouring
Control, and Signal Processing, J. C. Willems, S. Hara, Y. Ohta, and H. in O(log log n) rounds,” in Proc. 8th Annu. ACM-SIAM Symp. Discrete
Fujioka, Eds. Berlin, Germany: Springer, 2010, pp. 127–138. Algorithms, 1997, pp. 278–285.
[16] S. Mou, C. Yu, B. D. O. Anderson, and A. S. Morse, “Deterministic [24] S. Gandham, M. Dawande, and R. Prakash, “Link scheduling in wireless
gossiping with a periodic protocol,” in Proc. 49th IEEE Conf. Decis. sensor networks: Distributed edge-coloring revisited,” J. Parallel Distrib.
Control, 2010, pp. 5782–5787. Comput., vol. 68, no. 8, pp. 1122–1134, 2008.
[17] J. Liu, S. Mou, A. S. Morse, B. D. O. Anderson, and C. Yu, “Deterministic [25] J. Gross and J. Yellen, Graph Theory and Its Applications. New York, NY,
gossiping,” Proc. IEEE, vol. 99, no. 9, pp. 1505–1524, Sep. 2011. USA: Chapman and Hall, 2006.
[18] F. He, A. S. Morse, J. Liu, and S. Mou, “Periodic gossiping,” in Proc. 18th [26] O. Mangoubi, S. Mou, J. Liu, and A. S. Morse, “Towards optimal convex
IFAC World Congr., 2011, pp. 8718–8723. combination rules for gossiping,” in Proc. Amer. Control Conf., 2013,
[19] E. Seneta, Non-Negative Matrices and Markov Chains. New York, NY, pp. 1263–1267.
USA: Springer, 1981. [27] P. Stockmeyer, “Enumeration,” in Handbook of Graph Theory, J. L. Gross,
[20] Y. Chow and E. Cassignol, Linear Signal-Flow Graphs and Applications. J. Yellen, and P. Zhang, Eds. Boca Raton, FL, USA: CRC Press, 2014,
New York, NY, USA: Wiley, 1962. ch. 6.3.3, pp. 636–640.
[21] W. Ledermann, Introduction to the Theory of Finite Groups. Edinburgh, [28] B. McKay, “Trees sorted by diameter.” [Online]. Available: http://cs.anu.
U.K.: Oliver & Boyd, 1957. edu.au/ bdm/data/trees.html.

Distributed Averaging Using Periodic Gossiping

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Distributed Averaging Using Periodic Gossiping

Uploaded by

Copyright:

Available Formats

4282 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 62, NO.

Distributed Averaging Using Periodic Gossiping

Below, we shall be working with composite gossip matrices formed

with d a scalar. Then,

C. Allowing Redundant Gossip Sequences A. Background Lemmas

and then invoking the standard induction assumption

This completes the induction.

You might also like