Collecting Coupons on Trees, and the Analysis of Random Walks

(Preliminary version)

Uriel Feige April 26, 1994


Abstract

We consider the cover time $E_u[G]$, the expected time it takes a random walk that starts at $u$ to visit all vertices of a connected graph $G$. By combining a minimum weight spanning tree argument with a coupon collector argument, we obtain:

$$E_u[G] \le \frac{1}{2}\left(\min_T [W[T]] + \max_{v \in G} [D[u,v]]\right)$$

where $W[T]$ is the weight of spanning tree $T$, and $D[u,v] = H[u,v] - H[v,u]$ is the difference between the expected time it takes a random walk to hit $v$, and the expected time it takes it to afterwards get back to $u$. We use this bound to show:

1. $\max_G \min_u E_u[G] = (1 + o(1))2n^3/27$. This answers an open question of Aldous.
2. The $n$-path is the $n$-vertex tree on which the cover time is maximized. This confirms a conjecture of Brightwell and Winkler.
3. For regular graphs, $E_u[G] < 2n^2$. This improves the leading constant in previously known upper bounds.

We also provide upper bounds on $E_u^+[G]$, the expected time to cover $G$ and return to $u$.

Department of Applied Math., The Weizmann Institute, Rehovot, Israel. feige@wisdom.weizmann.ac.il. Supported by a Koret Foundation fellowship.

1 Introduction
Let $G$ be a connected undirected graph on $n$ vertices. We consider random walks on $G$, where at each step the random walk moves to a vertex chosen at random with uniform probability from the neighbors of the current vertex. For two vertices $u, v \in G$, the hitting time $H[u,v]$ is the expected number of steps it takes a walk that starts at $u$ to reach $v$, the commute time $C[u,v]$ is the expected number of steps that it takes a walk to go from $u$ to $v$ and back to $u$ (that is, $C[u,v] = H[u,v] + H[v,u]$), and the difference time (which may be negative) is $D[u,v] = H[u,v] - H[v,u]$. Let $G'$ be a subgraph of $G$, and let $u \in G'$. Then the cover time $E_u[G']$ denotes the expected number of steps it takes a walk that starts at $u$ to visit all vertices of $G'$, and the cover and return time $E_u^+[G']$ denotes the expected number of steps it takes a walk that starts at $u$ to visit all vertices of $G'$, and then return to $u$. We are interested in proving bounds on these quantities, especially when $G' = G$.

Hitting times between any two vertices are computable in polynomial time. Computing the cover time of a graph is more difficult. For any graph, the cover time can be estimated with arbitrary degree of accuracy by a randomized polynomial time algorithm that repeatedly takes random walks and measures their cover times. However, no deterministic polynomial time algorithm is known for computing the cover time, nor for approximating it. The lack of simple characterizations of the cover time also makes it difficult to obtain extremal results (e.g., for which $n$-vertex graph is the cover time maximal? minimal?).

One approach to upper bounding the cover time is by a "coupon collector" argument. Each time the random walk visits a new vertex it gets a coupon, and the random walk has to collect all $n - 1$ coupons. No restrictions are imposed on the order in which coupons are collected. Covering the complete graph $K_n$ is equivalent to the classical coupon collector problem with $n - 1$ coupons, giving $E_u[K_n] \simeq n \ln n$. For arbitrary graphs, the coupon collector argument was generalized by Matthews to give the upper bound $E_u[G] \le \max_{x,y}[H[x,y]] \ln n$ [8].
This bound is particularly useful for graphs that have very fast cover times, such as expander graphs.

A different approach for bounding the cover time was introduced by Aleliunas et al. [2]. In this approach, the coupons are to be collected in a particular favorable order. A convenient order to consider is one that agrees with Depth First Search of a minimum weight spanning tree of the graph. Let $T$ be a tree defined on the vertices of $G$ (though tree edges need not correspond to edges of $G$). Define the weight of the tree as $W[T] = \sum C[u,v]$, where summation is taken over the endpoints of the $n - 1$ edges of $T$. It is easy to see that $E_u[G] \le \min_T [W[T]]$ (and similarly, $E_u^+[G] \le \min_T [W[T]]$). The spanning tree approach is particularly useful for graphs that have slow cover time, such as the lollipop graph. It was used successfully to provide a tight bound (up to low order terms) of $4n^3/27$ on the cover time of any $n$-vertex graph [6].

In this paper we study cover problems where neither the coupon collector approach nor the spanning tree approach provides a satisfactory bound. Our research was motivated by a question of Aldous, namely, how high can $\max_G \min_u E_u[G]$ be [1]?
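The hitting, commute and difference times defined in this section can be computed exactly by solving one linear system per target vertex. The following is a minimal sketch under stated assumptions, not code from the paper: the adjacency-list representation and the helper names `solve` and `hitting_times` are hypothetical, and exact rational arithmetic is used only to keep the example free of rounding error.

```python
from fractions import Fraction

def solve(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination (a is assumed nonsingular)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]

def hitting_times(adj):
    """Return H with H[u][v] = expected steps for a walk from u to first reach v."""
    n = len(adj)
    H = [[Fraction(0)] * n for _ in range(n)]
    for v in range(n):
        others = [u for u in range(n) if u != v]
        index = {u: i for i, u in enumerate(others)}
        a = [[Fraction(0)] * (n - 1) for _ in others]
        b = [Fraction(1)] * (n - 1)
        for u in others:
            # h(u) = 1 + average of h over the neighbors of u, with h(v) = 0
            a[index[u]][index[u]] += 1
            for w in adj[u]:
                if w != v:
                    a[index[u]][index[w]] -= Fraction(1, len(adj[u]))
        x = solve(a, b)
        for u in others:
            H[u][v] = x[index[u]]
    return H

# Hypothetical example: the path on 5 vertices 0 - 1 - 2 - 3 - 4.
path = [[1], [0, 2], [1, 3], [2, 4], [3]]
H = hitting_times(path)
commute = H[0][4] + H[4][0]      # C[0, 4]
difference = H[2][0] - H[0][2]   # D[2, 0], middle vertex versus an endpoint
```

On this 5-vertex path the end-to-end hitting time is $(n-1)^2 = 16$ and the commute time is $2(n-1)^2 = 32$, in line with the tree values used in Section 3.2.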

1.1 Additional notation

We use the notation $H_T[u,v]$ to define hitting times relative to the tree $T$, rather than the graph $G$. For arbitrary $u, v \in G$, let $P_T[u,v]$ denote the set of directed edges on the simple path leading from $u$ to $v$ along the edges of tree $T$. We define $H_T[u,v] = \sum_{(x,y) \in P_T[u,v]} H[x,y]$, and $C_T[u,v] = H_T[u,v] + H_T[v,u]$. Observe that $H[u,v] \le H_T[u,v]$, and $C[u,v] \le C_T[u,v]$. It is interesting to note that there is no reason to define a concept of $D_T[u,v] = H_T[u,v] - H_T[v,u]$. It can be shown that regardless of the tree $T$ on which the path $P_T$ is defined, $D_T[u,v] = D[u,v]$. This follows from additivity of the difference time, $D[u,v] + D[v,w] = D[u,w]$ (for proof of the additivity property, see e.g. [5]).

1.2 A new technique and the main result

We provide a coupon collector based refinement to the spanning tree method. As in the spanning tree method, we construct a spanning tree of the graph. But unlike the spanning tree method, we do not enforce a particular order in which the spanning tree has to be traversed. Instead, we use a coupon collector argument, collecting the coupons from parts of the tree that happen to be covered first. The main contribution of our work is in showing that the above nonstandard coupon collector problem leads to an upper bound that is both simple to express and useful.

Theorem 1 Let $G$ be a connected graph on $n$ vertices. Let $T$ be a tree defined on a subset of the vertices of $G$, where the edges of $T$ need not be edges of $G$. Let $E_u[T]$ denote the expected time it takes a random walk that starts at $u$ to visit all vertices of $T$, where $u \in T$. Let $W[T]$ be the weight of $T$: the sum of $|T| - 1$ commute times between endpoints of edges of $T$. Then

$$E_u[T] \le \frac{1}{2}\left(W[T] + \max_{v \in T \setminus \{u\}} [D[u,v]]\right)$$

1.3 Modifications and corollaries

Our techniques apply also to the cover and return time.

Theorem 2 Let $G$ be a connected graph on $n$ vertices. Let $T$ be a tree defined on a subset of the vertices of $G$, and $u \in T$. Then

$$E_u^+[T] \le \frac{1}{2}\left(W[T] + \max_{v \in T} [C_T[u,v]]\right)$$

We conjecture that $C_T[u,v]$ in the above upper bound can be replaced by $C[u,v]$. However, this is of little relevance for our intended applications.

Our original motivation for obtaining Theorem 1 was to show that for any graph, there is a "good" starting point from which the expected cover time is at most $W[T]/2$. This follows from the fact that $\min_u \max_v [D[u,v]] \le 0$. Since the minimum weight spanning tree of any connected graph satisfies $W[T] \le (1 + o(1))4n^3/27$ [6], we obtain:

Corollary 3 For connected graphs on $n$ vertices, $\max_G \min_u [E_u[G]] \le (1 + o(1))2n^3/27$.

This answers an open question of Aldous. Using Theorem 2 we show:

Corollary 4 For connected graphs on $n$ vertices, $\max_G \min_u [E_u^+[G]] \le (1 + o(1))n^3/9$.

For some families of graphs we can show that $\max_{u,v} D[u,v] < cW[T]$, where $c$ is some constant, $0 < c < 1$. For these graphs, we obtain bounds on the cover time that are tighter than those that can be obtained by spanning tree arguments alone. One such family is that of trees. Brightwell and Winkler [3] showed that the $n$-vertex tree with least cover time is the star. They conjectured that the path has maximum cover time. Using Theorem 1, we prove this conjecture.

Corollary 5 For $n$-vertex trees $T$,

$$\max_T \max_u E_u[T] = \left\lfloor \frac{5(n-1)^2}{4} \right\rfloor$$

which is the cover time for a walk that starts at the middle point (or one of the two middle points) on the $n$-path.
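As a sanity check on the value in Corollary 5, the cover time of the $n$-path can be worked out directly. The sketch below relies on two standard facts about the simple random walk on a path that are not stated in the text (the gambler's ruin exit time $ab$ and the end-to-end hitting time $(n-1)^2$); the symbols $a$ and $b$ are introduced only for this illustration.

```latex
% Start at the vertex u with a edges to its left and b edges to its right, a + b = n - 1.
\begin{align*}
\text{expected time to reach the first endpoint} &= ab \\
\text{expected time to then cross to the other endpoint} &= (n-1)^2 \\
E_u[\text{$n$-path}] &= ab + (n-1)^2 \\
\max_u E_u[\text{$n$-path}] &= \left\lfloor \tfrac{(n-1)^2}{4} \right\rfloor + (n-1)^2
  = \left\lfloor \tfrac{5(n-1)^2}{4} \right\rfloor
  \qquad (a = \lfloor (n-1)/2 \rfloor)
\end{align*}
% Smallest case, n = 3, u the middle vertex and v an endpoint:
% W[T] = 2(n-1)^2 = 8, D[u,v] = H[u,v] - H[v,u] = 3 - 1 = 2,
% so Theorem 1 gives E_u[T] <= (8 + 2)/2 = 5, matching ab + (n-1)^2 = 1 + 4 = 5.
```

For $n = 3$ the bound of Theorem 1 is therefore attained with equality at the middle vertex.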

The condition $\max_{u,v} D[u,v] < cW[T]$ holds also for regular graphs. This allows us to improve the known upper bounds on their cover time.

Corollary 6 There exists a positive constant $\epsilon$, such that for any regular graph, $\max_v [E_v[G]] < (2 - \epsilon)n^2$.

For the cover and return time we obtain:

Corollary 7 For any connected $d$-regular graph, $E_u^+[G] \le 2n^2\left(1 + \frac{d-2}{(d+1)^2}\right)$. In particular, $E_u^+[G] \le 13n^2/6$.

1.4 Asymptotic growth of extremal cover times

In conjunction with previous works of the author [6, 7], we have the following results for connected graphs on $n$ vertices (where $\simeq$ denotes equality up to low order additive terms).

$$\begin{array}{ll}
\max_G \max_u E_u[G] \simeq 4n^3/27 & \max_G \max_u E_u^+[G] \simeq 4n^3/27 \\
\max_G \min_u E_u[G] \simeq 2n^3/27 & \max_G \min_u E_u^+[G] \simeq 3n^3/27 \\
\min_G \max_u E_u[G] \simeq n \ln n & \min_G \max_u E_u^+[G] \simeq n \ln n \\
\min_G \min_u E_u[G] \simeq n \ln n & \min_G \min_u E_u^+[G] \simeq n \ln n
\end{array}$$

The question of which are the extremal graphs for each of the above properties is open.

2 Collecting coupons on trees


In this section we prove Theorems 1 and 2. We shall use the following useful property of the expectation. Let $A$ and $B$ denote two events, $A \vee B$ denote the occurrence of one of the events (whichever happens first), $A \wedge B$ the occurrence of both events. Recall that $E[C]$ denotes the expected number of steps until event $C$ occurs.

Lemma 8 $E[A] + E[B] = E[A \vee B] + E[A \wedge B]$

Proof: By definition, $E[C] = \sum_i i \Pr[t(C) = i]$. Hence,

$$\begin{aligned}
E[A] + E[B] &= \sum_i i \Pr[t(A) = i] + \sum_i i \Pr[t(B) = i] \\
&= \sum_i i \left(\Pr[(t(A) = i) \wedge (t(B) = i)] + \Pr[(t(A) = i) \wedge (t(B) < i)] + \Pr[(t(A) = i) \wedge (t(B) > i)]\right) \\
&\quad + \sum_i i \left(\Pr[(t(B) = i) \wedge (t(A) = i)] + \Pr[(t(B) = i) \wedge (t(A) < i)] + \Pr[(t(B) = i) \wedge (t(A) > i)]\right) \\
&= \sum_i i \left(\Pr[(t(A) = i) \wedge (t(B) = i)] + \Pr[(t(A) = i) \wedge (t(B) > i)] + \Pr[(t(B) = i) \wedge (t(A) > i)]\right) \\
&\quad + \sum_i i \left(\Pr[(t(B) = i) \wedge (t(A) = i)] + \Pr[(t(A) = i) \wedge (t(B) < i)] + \Pr[(t(B) = i) \wedge (t(A) < i)]\right) \\
&= E[A \vee B] + E[A \wedge B]
\end{aligned}$$

In the last step, the first sum ranges over the outcomes in which step $i$ is the time of the earlier event, and the second sum over those in which step $i$ is the time of the later event.

Now we are ready to prove Theorem 1, that

$$E_u[T] \le \frac{1}{2}\left(W[T] + \max_{v \in T \setminus \{u\}} [D[u,v]]\right)$$

Proof: We prove the theorem for any graph $G$, by induction on the size of $T$. For the base case, $T$ contains only two vertices, $u$ and $v$. Then the left hand side evaluates to $E_u[T] = H[u,v]$, and the right hand side evaluates to

$$\frac{1}{2}(W[T] + D[u,v]) = \frac{1}{2}(C[u,v] + D[u,v]) = H[u,v]$$

and we have exact equality in the statement of the theorem.

For the inductive step, we make the inductive hypothesis that the theorem holds for any $u$, and any $T$ with at most $k$ vertices, and prove for $T$ with $k + 1$ vertices. We distinguish between two cases, according to the degree of $u$ in $T$.

If the degree of $u$ in $T$ is 1, then let $y$ denote the neighbor of $u$ in $T$. We have $E_u[T] \le H[u,y] + E_y[T \setminus u]$. The tree $T \setminus u$ has only $k$ vertices, so by the inductive hypothesis,

$$E_y[T \setminus u] \le \frac{1}{2}\left(W[T \setminus u] + \max_{v \in T \setminus \{u,y\}} [D[y,v]]\right)$$

Using $H[u,y] = \frac{1}{2}(C[u,y] + D[u,y])$, and $W[T] = C[u,y] + W[T \setminus u]$, and the relation $D[u,v] = D[u,y] + D[y,v]$, the proof follows.

If the degree of $u$ in $T$ is greater than 1, then we partition $T$ into two subtrees, $T_1$ and $T_2$, that intersect only on $u$. We want a bound on $E_u[T_1 \wedge T_2]$. Let $H[T_1,u]$ denote the expected time it takes a walk that starts at the worst possible vertex of $T_1$ to hit $u$, and define $H[T_2,u]$ similarly. Observe that

$$E_u[T_1 \wedge T_2] \le E_u[T_1 \vee T_2] + \max\left[H[T_2,u] + E_u[T_1],\; H[T_1,u] + E_u[T_2]\right]$$

W.l.o.g., we assume that $H[T_2,u] + E_u[T_1] \ge H[T_1,u] + E_u[T_2]$. Using Lemma 8 to eliminate $E_u[T_1 \vee T_2]$ we obtain:

$$E_u[T_1 \wedge T_2] \le E_u[T_1] + \frac{1}{2}\left(E_u[T_2] + H[T_2,u]\right)$$

Lemma 9 For any tree $T$ defined on the vertices of connected graph $G$, and for any starting point $u \in T$,

$$E_u[T] \le W[T] - H[T,u]$$

Proof: Let $x$ be the vertex that maximizes $H[T,u]$. That is, $H[T,u] = H[x,u]$. Rather than use $H[x,u]$, we shall use the upper bound $H[x,u] \le H_T[x,u]$. Hence we can assume that $x$ is a leaf of $T$. Then there is some Depth First Search order of visiting the vertices of $T$ that starts at $u$ and visits $x$ last. For this particular order, $E_u[T] \le W[T] - H_T[x,u]$.

Applying the above lemma to $T_2$, we obtain $E_u[T_2] + H[T_2,u] \le W[T_2]$. Using the inductive hypothesis $E_u[T_1] \le \frac{1}{2}(W[T_1] + \max_{x \in T_1} D[u,x])$, and the identity $W[T] = W[T_1] + W[T_2]$, the proof follows. □

We now prove Theorem 2, that for any connected graph $G$,

$$E_u^+[T] \le \frac{1}{2}\left(W[T] + \max_{v \in T} [C_T[u,v]]\right)$$

The proof is simpler than that of Theorem 1, and is given in less detail.

Proof: By induction on the size of $T$. If $T$ contains only one edge, we have exact equality in the statement of the theorem. Assume the theorem holds for any $u$, and any $T$ with at most $k$ vertices, and prove for $T$ with $k + 1$ vertices. We distinguish between two cases, according to the degree of $u$ in $T$.

If the degree of $u$ in $T$ is 1, and $(u,y) \in T$, then we have $E_u^+[T] \le C[u,y] + E_y^+[T \setminus u]$. The tree $T \setminus u$ has only $k$ vertices, so we can use the inductive hypothesis, and the proof follows by simple manipulations.

If the degree of $u$ in $T$ is greater than 1, then we partition $T$ into two subtrees, $T_1$ and $T_2$, that intersect only on $u$. Note that each such subtree contains at most $k$ vertices, and so it obeys the inductive hypothesis. We want a bound on $E_u^+[T_1 \wedge T_2]$. Observe that

$$E_u^+[T_1 \wedge T_2] \le E_u^+[T_1 \vee T_2] + \max\left[E_u^+[T_1],\; E_u^+[T_2]\right]$$

Assuming w.l.o.g. that $E_u^+[T_1] \ge E_u^+[T_2]$, and using Lemma 8 to eliminate $E_u^+[T_1 \vee T_2]$ we obtain:

$$E_u^+[T_1 \wedge T_2] \le E_u^+[T_1] + \frac{1}{2} E_u^+[T_2]$$

Using $E_u^+[T_1] \le \frac{1}{2}(W[T_1] + \max_{x \in T_1} C_T[u,x])$, and $E_u^+[T_2] \le W[T_2]$, the proof follows. □
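The bound of Theorem 1 can be checked numerically on small graphs by computing cover times exactly. The following is a minimal sketch under stated assumptions, not the paper's method: it computes $E_u[G]$ by a dynamic program over (visited set, current vertex) states, which takes exponential time and is meant only for tiny examples. The helper names (`solve`, `absorb_times`, `hitting_time`, `cover_times`, `theorem1_bound`) and the 5-vertex lollipop example are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def solve(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination (a is assumed nonsingular)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]

def absorb_times(adj, inside, exit_cost):
    """Expected remaining steps from each vertex of `inside`, where a step that
    leaves `inside` and lands on w costs that step plus exit_cost(w)."""
    index = {v: i for i, v in enumerate(inside)}
    k = len(inside)
    a = [[Fraction(0)] * k for _ in range(k)]
    b = [Fraction(1)] * k
    for v in inside:
        p = Fraction(1, len(adj[v]))
        a[index[v]][index[v]] += 1
        for w in adj[v]:
            if w in index:
                a[index[v]][index[w]] -= p
            else:
                b[index[v]] += p * exit_cost(w)
    x = solve(a, b)
    return {v: x[index[v]] for v in inside}

def hitting_time(adj, u, v):
    """H[u, v]: expected steps for a walk from u to first reach v."""
    if u == v:
        return Fraction(0)
    inside = [x for x in range(len(adj)) if x != v]
    return absorb_times(adj, inside, lambda w: 0)[u]

def cover_times(adj):
    """Return E with E[u] = expected steps for a walk from u to visit all vertices."""
    n = len(adj)
    full = (1 << n) - 1
    value = {(full, v): Fraction(0) for v in range(n)}
    for size in range(n - 1, 0, -1):          # larger visited sets first
        for subset in combinations(range(n), size):
            S = sum(1 << v for v in subset)
            times = absorb_times(adj, subset, lambda w: value[(S | (1 << w), w)])
            for v in subset:
                value[(S, v)] = times[v]
    return [value[(1 << u, u)] for u in range(n)]

def theorem1_bound(adj, tree_edges, u):
    """Right hand side of Theorem 1: (W[T] + max over v in T minus u of D[u, v]) / 2."""
    weight = sum(hitting_time(adj, x, y) + hitting_time(adj, y, x) for x, y in tree_edges)
    vertices = {x for edge in tree_edges for x in edge}
    worst = max(hitting_time(adj, u, v) - hitting_time(adj, v, u)
                for v in vertices if v != u)
    return (weight + worst) / 2

# Hypothetical example: a triangle {0, 1, 2} with a tail 2 - 3 - 4, and a spanning tree.
lollipop = [[1, 2], [0, 2], [0, 1, 3], [2, 4], [3]]
tree = [(0, 1), (1, 2), (2, 3), (3, 4)]
E = cover_times(lollipop)
bounds = [theorem1_bound(lollipop, tree, u) for u in range(len(lollipop))]
```

Comparing `E[u]` with `bounds[u]` for each start vertex shows how much slack the bound leaves on a given instance; on the 3-vertex path the two coincide.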

3 Applications
We present three applications of our bounds. The first uses the fact that for any connected graph $G$, there exists a spanning tree $T$ of weight $W[T] \le (1 + o(1))4n^3/27$ [6]. The second uses the fact that if $G$ is a tree, then $W[G] = W[T] = 2(n-1)^2$. The third uses the fact that for connected $d$-regular graphs, any spanning tree $T$ has weight $W[T] < 3dn^2/(d+1)$ [4]. In all three cases, we obtain bounds on $D[u,v]$ (or $C_T[u,v]$) that are smaller than the corresponding values of $W[T]$, and get bounds on the cover time that are stronger than those that one obtains by using $W[T]$ alone.

3.1 Where to start a random walk

We use Theorem 1 to prove Corollary 3, that for any connected graph there is a vertex $u$ with $E_u[G] \le (1 + o(1))2n^3/27$.

Proof: By the additivity property of the difference time, it follows that there exists a vertex $u$ such that for any vertex $v$, $D[u,v] \le 0$. From Theorem 1, we obtain that for this particular $u$ and any spanning tree $T$, $E_u[G] \le \frac{1}{2}W[T]$. The proof follows by choosing $T$ to be a minimum weight spanning tree of $G$, and the bound on its weight obtained in [6].

The above bound is best possible. Consider the graph $G$ composed of a path of length $n/3$, and a clique of size $2n/3$ hanging in the middle of the path. The expected hitting time between the two endpoints of the path is $2n^3/27 + o(n^3)$. Hence no matter which of the two endpoints is hit first, this lower bounds the expected cover time.

We use Theorem 2 to prove Corollary 4, that for any connected graph there is a vertex $u$ with $E_u^+[G] \le (1 + o(1))3n^3/27$.

Proof: For graph $G$, construct a minimum spanning tree $T$, using only edges of $G$. By [6], it has weight at most $(1 + o(1))4n^3/27$. Observe that between any two adjacent vertices on $T$, $C_T[x,y] = C[x,y] < n^2$. Let $z$ and $w$ be a pair of vertices that maximize $C_T[z,w]$. Then we choose a vertex $u$ along the path connecting $z$ and $w$ on $T$ such that $C_T[u,z] \le \frac{1}{2}(C_T[z,w] + n^2)$, and $C_T[u,w] \le \frac{1}{2}(C_T[z,w] + n^2)$. Such a vertex $u$ must exist, and satisfies $\max_{v \in G} C_T[u,v] \le \frac{1}{2}(W[T] + n^2)$. Hence $E_u^+[G] \le 3W[T]/4 + n^2/4 \le (1 + o(1))3n^3/27$. □

The above bound is best possible. Consider a graph $G$ composed of a path of length $n/3$, and a clique of size $2n/3$ hanging anywhere on the path. Let $u$ be the midpoint of the path, and let $x$ and $y$ be the two endpoints. It can be shown that $C[u,x] = C[u,y] \simeq 2n^3/27$. Using Lemma 8 and the exact equality in the relation $E_u^+[\{u,x,y\}] = E_u^+[\{u,x\} \vee \{u,y\}] + E_u^+[\{u,x\}]$, it follows that $E_u^+[\{u,x,y\}] \ge n^3/9 + o(n^3)$.
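The "good" starting vertex used in the proof of Corollary 3 can be located by brute force on a small instance. The following is a minimal sketch under stated assumptions: it computes all hitting times exactly and returns a vertex $u$ minimizing $\max_{v \ne u} D[u,v]$. The helper names and the 8-vertex example (a 5-vertex path with a 4-clique attached at its middle vertex, a scaled-down version of the graph described above) are hypothetical.

```python
from fractions import Fraction

def solve(a, b):
    """Solve a x = b exactly by Gauss-Jordan elimination (a is assumed nonsingular)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]

def hitting_times(adj):
    """Return H with H[u][v] = expected steps for a walk from u to first reach v."""
    n = len(adj)
    H = [[Fraction(0)] * n for _ in range(n)]
    for v in range(n):
        others = [u for u in range(n) if u != v]
        index = {u: i for i, u in enumerate(others)}
        a = [[Fraction(0)] * (n - 1) for _ in others]
        b = [Fraction(1)] * (n - 1)
        for u in others:
            a[index[u]][index[u]] += 1
            for w in adj[u]:
                if w != v:
                    a[index[u]][index[w]] -= Fraction(1, len(adj[u]))
        x = solve(a, b)
        for u in others:
            H[u][v] = x[index[u]]
    return H

def good_start(adj):
    """Return (u, max over v != u of D[u, v]) for a vertex u minimizing that maximum."""
    H = hitting_times(adj)
    n = len(adj)

    def worst(u):
        return max(H[u][v] - H[v][u] for v in range(n) if v != u)

    u = min(range(n), key=worst)
    return u, worst(u)

# Path 0 - 1 - 2 - 3 - 4 with a clique on {2, 5, 6, 7} attached at the middle vertex 2.
graph = [[1], [0, 2], [1, 3, 5, 6, 7], [2, 4], [3], [2, 6, 7], [2, 5, 7], [2, 5, 6]]
start, worst_difference = good_start(graph)
```

In this example the selected vertex is an endpoint of the path, the hardest vertex to reach, and its worst difference time is 0 (attained against the opposite endpoint).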

3.2 Trees

Of all $n$-vertex trees, the expected commute time is maximized for the end points of the $n$-path, where it is $2(n-1)^2$. Since the commute time is always bounded by $W[T]$, and $W[T] = 2(n-1)^2$ for any tree, it follows from Theorem 2 that the $n$-path is the tree with maximum cover and return time. Brightwell and Winkler [3] conjectured that of all trees, the $n$-path also maximizes the expected cover time, for a walk that starts at the midpoint of the path (or one of the two midpoints, for even $n$). We prove this conjecture.

Lemma 10 For any $n$-vertex tree,

$$\max_{u,v} [D[u,v]] \le \left\lfloor \frac{(n-1)^2}{2} \right\rfloor$$

Proof: Let $T$ be an arbitrary $n$-vertex tree, and consider the vertices that maximize $D[u,v]$. Let $\ell$ be the number of edges along the path connecting $u$ and $v$. Then $C[u,v] = 2\ell(n-1)$, $H[v,u] \ge \ell^2$, and $D[u,v] \le 2\ell(n-1) - 2\ell^2$. This is maximized when $\ell = (n-1)/2$ for odd $n$, and $\ell = n/2$ for even $n$. □

From Theorem 1 we obtain Corollary 5, that for any $n$-vertex tree $T$,

$$\max_u E_u[T] \le \left\lfloor \frac{5(n-1)^2}{4} \right\rfloor$$

This is exactly the cover time for the $n$-path. The $n$-path is the unique $n$-vertex tree with this cover time. This follows from the fact that for a tree to be extremal, we must have

$$D[u,v] = \left\lfloor \frac{(n-1)^2}{2} \right\rfloor$$

for the starting point $u$ and any leaf $v$ (this follows from requiring that all inequalities are tight in the proof of Theorem 1).

3.3 Regular graphs

The diameter bound for regular graphs can be used in order to bound the maximum commute time.

Lemma 11 For any connected $d$-regular graph, the diameter (number of edges in the shortest path between two most distant vertices) is smaller than $\frac{3n}{d+1} - 1$.

Proof: Let $v_0$ and $v_\ell$ be the vertices that are furthest apart, and let $v_1, v_2, \ldots, v_{\ell-1}$ be the shortest path connecting them. Let $V_{\text{path}}$ be the set of $\ell + 1$ vertices on this path, and let $\overline{V}_{\text{path}}$ be the set of $n - \ell - 1$ vertices that are not on this path. Since the degree of each vertex is $d$, and the path contains $\ell$ edges, there are $d(\ell + 1) - 2\ell$ edges connecting $V_{\text{path}}$ and $\overline{V}_{\text{path}}$. Each vertex in $\overline{V}_{\text{path}}$ is connected to at most 3 vertices in $V_{\text{path}}$, since otherwise it would serve as a shortcut contradicting the shortest path property. Hence

$$3(n - \ell - 1) \ge d(\ell + 1) - 2\ell$$

implying the lemma. □

Lemma 12 Let $G$ be a connected $d$-regular graph. Let $P$ be the shortest path connecting vertices $x$ and $y$. Then $C_P[x,y] \le \left(n^2 d(d+7)/(d+1)^2 - nd\right) < n^2(1 + 5/d)$, where commute times are measured relative to the path $P$ (as in Sect. 1.1).

Proof: By [4], for any set of $\ell$ edges, their total weight is at most $nd\left(\frac{[\ldots]}{d+1} - 1 + \frac{2\ell}{d+1}\right)$. Substitute an upper bound for $\ell$ from Lemma 11. □

Lemma 13 For any connected $d$-regular graph and vertex $u$, there exists a spanning tree $T$ of weight $W[T] < 3n^2 d/(d+1)$, and with maximum commute time $\max_{v \in T} [C_T[u,v]] < n^2 d(d+7)/(d+1)^2$.

Proof: Use Breadth First Search starting at $u$ to construct a spanning tree. Then the bound on the commute time follows from Lemma 12, and the bound on the weight of the tree follows from [4]. □

Now, from Theorem 2, we obtain Corollary 7, that for any connected $d$-regular graph, the maximum expected cover and return time is less than $2n^2\left(1 + \frac{d-2}{(d+1)^2}\right)$. Our bound obtains its maximum $13n^2/6$ at $d = 5$. The regular graph whose expected cover and return time is conjectured to be worst is the 3-regular necklace, for which the bound is $n^2(\frac{3}{2} + o(1))$. For large $d$, our bound is $n^2(2 + o(1))$, whereas the $d$-regular necklace has a cover and return time of $n^2(1 + o(1))$ (for odd $d$). We remark that our upper bound is not sensitive to the existence of a constant number of vertices of degree $d \pm 1$ (except for low order additive terms). In this respect it is nearly tight, for the path.

To bound the cover time, we would like to bound $\max_{u,v} [D[u,v]]$ over all $d$-regular graphs, and then use Theorem 1. From [9] it follows that for regular graphs $D[u,v] = \mathrm{ave}_w (C[v,w] - C[u,w])$. Now principles similar to those employed in Lemmas 11 and 12 can be used to bound the average commute time. We obtain (we omit the details of the proof) Corollary 6, that for any $d$-regular graph, $\max_v [E_v[G]] < (2 - \epsilon)n^2$ (for some universal $\epsilon$).

For a walk starting at the middle of the 3-regular necklace, the cover time is $15n^2/16$ (up to low order terms). This is conjectured to be the regular graph that is most difficult to cover [1].
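The claim that the Corollary 7 bound peaks at $d = 5$ can be checked by tabulating the coefficient of $n^2$. The following is a minimal sketch; the function name `cover_return_coefficient` and the scan range of degrees are assumptions made for this illustration.

```python
from fractions import Fraction

def cover_return_coefficient(d):
    """Coefficient of n^2 in the Corollary 7 bound 2 n^2 (1 + (d - 2) / (d + 1)^2)."""
    return 2 * (1 + Fraction(d - 2, (d + 1) ** 2))

# Scan a range of degrees (the upper limit 200 is arbitrary) and locate the maximum.
coefficients = {d: cover_return_coefficient(d) for d in range(2, 201)}
worst_degree = max(coefficients, key=coefficients.get)
worst_coefficient = coefficients[worst_degree]
```

The numerator of the derivative of $(d-2)/(d+1)^2$ with respect to $d$ is proportional to $5 - d$, so the scan agrees with the calculus: the coefficient rises up to $d = 5$ and decreases towards 2 afterwards.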

References
[1] D. J. Aldous. "Reversible Markov chains and random walks on graphs". Draft of first six chapters of book, January 26, 1993.
[2] R. Aleliunas, R. M. Karp, R. J. Lipton, L. Lovász, and C. Rackoff. "Random walks, universal traversal sequences, and the complexity of maze problems". In 20th Annual Symposium on Foundations of Computer Science, pages 218-223, San Juan, Puerto Rico, October 1979.
[3] G. Brightwell and P. Winkler. "Extremal Cover Times for Random Walks on Trees". Journal of Graph Theory, 14(5):547-554, 1990.
[4] D. Coppersmith, U. Feige, and J. Shearer. "Random Walks on Regular and Irregular Graphs". Technical report CS93-15, the Weizmann Institute, 1993.
[5] D. Coppersmith, P. Tetali, and P. Winkler. "Collisions among random walks on a graph". SIAM J. on Discrete Math., 6(3):363-374, August 1993.
[6] U. Feige. "A Tight Upper Bound on the Cover Time for Random Walks on Graphs". Technical report CS93-08, the Weizmann Institute, 1993.
[7] U. Feige. "A Tight Lower Bound on the Cover Time for Random Walks on Graphs". Technical report CS93-19, the Weizmann Institute, 1993.
[8] P. C. Matthews. "Covering Problems for Brownian Motion on Spheres". Ann. Probab., 16:189-199, 1988.
[9] P. Tetali. "Random walks and the effective resistance of networks". Journal of Theoretical Probability, 4:101-109, 1991.
