Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 19

The Load Distance Balancing

Problem
Eddie Bortnikov (Yahoo!)
Samir Khuller (Maryland)
Yishay Mansour (Google)
Seffi Naor (Technion)
The Load-Distance Balancing Problem

 Given n clients and k servers s1, s2,…sk we


need to assign each client to a server.
 Cost for client i assigned to server sj is as
follows:
Cost(i) = Distance(i,sj) + Delay(j,Lj)
Delay(j,Lj) is a FUNCTION of the number of
clients Lj assigned to sj.
OBJECTIVE: Min Max Cost(i)
An Example

s1 s2 s3

C E
D
A B

 Each server has its own delay function (can be arbitrary, we


just assume its non-decreasing).
 Note that C is closer to s1, but prefers to attach to s2 since
Dist(C,s2)+Delay(s2,2)<Dist(C,s1)+Delay(s1,3)

 Objective is to Minimize Max Cost for any client


Related Work

 Lots of research on locating facilities. Here


the facilities are all given; we just have to
compute assignment of clients to facilities.
 Notion of capacities has been used for
various covering problems such as Vertex
Cover, K Centers, Facility Location etc.
Main Results
 The problem is NP-hard.
 We develop a polynomial time 2 approx.
 We show that the bound 2 cannot be improved to 2-ε
unless NP=P.
 With triangle inequality in the distance function the
hardness reduces to 5/3-ε.
 When all clients and servers are on a line its solvable
in polynomial time.
 For Min Sum Cost(i), we can solve in polynomial
time using Min-Weight Matching.
NP-hardness by Exact Set Cover

 Given N elements and a collection S of K


sets (each set has size m). Does there exist
a subset S’ of S, such that each element
belongs to exactly one set in S’?
In other words, we need to pick exactly N/m
subsets from S, to cover each element once.
Example of Exact Cover

 Here we have N=16 elements and 9 sets (m=4).


 The FOUR blue sets form an EXACT COVER, and
we discard the FIVE orange sets.
Reduction from Exact Cover

 Each element is a client. In


d1
addition we create a
Dummy
collection of M(K-N/m)
clients
dummy clients.
 Subset Sj in S corresponds
to server sj.
 Dist(dummy,server)=d1 sj
d2
 Dist(i,sj) = d2 if i ε Sj, o.w. ∞
 d2 >> d1 i
Clients (elements)
Reduction from Exact Cover

 Delay functions for servers are basically a step


function.
 Delay(j,Lj) = Δ-d2, when load is at most m.
 Delay(j,Lj)= Δ-d1, when load exceeds m, but is at
most M.
Δ-d1

Δ-d2

m M
Reduction from Exact Cover

 Suppose there is a solution to exact cover, then


there is a solution to the LDB problem with delay at
most Δ.
 For each chosen subset Sj, the corresp. server sj
gets m clients each at distance d2.Since the delay is
Δ-d2, the total cost is at most Δ.
 For the remaining subsets, those dummies are all
assigned to the remaining servers (K-N/m), each
gets M dummies.
 The proof in the other direction requires some work!
Proof (cont.)
 Each server can support at most M dummy clients if
the total cost does not exceed Δ, and no more than
m real clients.
 Suppose a server supports both real and dummy
clients; then the total number of servers with real
clients is k’ > N/m.
 These serve at most (mk’-N) dummy clients, while
the rest can serve only M(k-k’) dummy clients.
 Adding the two shows (some algebra needed) that
we can only assign < M(k-N/m) dummy clients if
M>m.
Hardness Results follow….

 We can set d1=ε and d2=Δ-ε. A solution to


EXACT COVER exists if and only if a
solution with cost Δ exists for LDB.
 If there is no solution to EXACT COVER,
then every solution to LDB has cost 2(Δ-ε).
 However, here we do violate triangle
inequality in the distance function.
Hardness results with triangle
inequality

 We need to set d1=⅓ Δ, and d2= Δ-ε.


 With these parameters, the distance between
a (real) client i and a server sj such that i is
not in Sj, is at least 5/3Δ- ε.
 A solution to EXACT COVER exists if and
only if a solution with cost Δ exists for LDB.
 If there is no solution to EXACT COVER,
then every solution to LDB has cost 5/3Δ-ε.
Approximation Algorithm with factor 2

 Suppose a solution exists with maximum cost Δ.


 For each server sj, we can compute an upper bound
on the number of clients that can be served with a
delay of at most Δ (say L*j).
 For each client i we can compute the subset of
servers that are within distance Δ (S*i).
 Now its just a flow problem to check if an assignment
exists where each client i is assigned to a server
from S*i and each sj has load at most L*j.
 Minimizing Δ gives a trivial 2 approximation.
All servers and clients on a line

 Use dynamic programming!


Minimizing the Sum of Costs

 We reduce this to min cost


matching in a bipartite
graph.
 Let G=(X,Y,E) where nodes
in X correspond to n clients
and there are nk nodes in Y.
We have n nodes corresp.
to each server.
 We ask for a min cost
matching to find a solution.
Capacitated K Centers

 The related problem of choosing K facilities has been


considered (each client should be assigned to a
closeby facility and the load on the facility should not
be too high): [Khuller & Sussman] (K,L,5R) or
((2/c)K, cL, 2R) related to K-centers clustering.
Conclusions

 Can we improve the 2 approximation when


triangle inequality holds?
 Can we improve the 2 approximation when
Delay functions satisfy specific properties?
What is a natural delay function?
 Are there other special cases that can be
solved in polynomial time?

You might also like