
Asymptotically Optimal Control of Many-server Heterogeneous Service Systems with $H_2^*$ Service Times
Tolga Tezcan
Simon School of Business

University of Rochester
Rochester, NY
tolga.tezcan@simon.rochester.edu
June 16, 2011
Abstract
Optimal control of many-server heterogeneous service systems with service times that have a special hyper-exponential distribution, denoted by $H_2^*$, which is a mixture of an exponential distribution and a unit point mass at 0, is considered. A static priority policy that assigns priorities to server pools based on their service time distributions is proposed. This policy is shown to be asymptotically optimal in the many-server heavy traffic regime in minimizing the total number of customers in the system or in the queue under two different assumptions on service time distributions.
Keywords: Heterogeneous servers; Many servers; Heavy traffic; Halfin-Whitt regime; Asymptotic optimality
1 Introduction
The heavy-traffic analysis approach has been successfully used to analyze complex queueing systems that cannot be analyzed using standard queueing techniques for exact analysis; e.g., see [21, 8, 4]. In this approach, the queue lengths in a heavily loaded system are shown to be close to a diffusion process under reasonable scheduling policies. The many-server heavy-traffic analysis, initiated by the seminal paper [20], has been used to analyze queueing systems with many servers in heavy traffic. Unlike the conventional heavy-traffic analysis, the many-server analysis is more suitable for systems with several servers working in parallel; see [16] for a comparison of these two regimes.
In this paper we establish asymptotically optimal policies for a service system with heterogeneous servers in a many-server heavy traffic regime. We consider service times with a special hyper-exponential distribution in order to gain some insights for the optimal control of many-server systems under non-exponential service times.
The heterogeneous service systems we study consist of a single customer class and multiple server pools. In each server pool there are several servers, and two servers that belong to the same pool are assumed to have the same service time distribution. These systems are known as inverted-V systems; see [16] and [1]. Our main goal is to devise control policies that minimize the congestion in these systems. The control of inverted-V systems consists of two parts. First, if an arriving customer finds idle servers in different pools, the control policy must specify which server pool the arriving customer will be routed to. It is also possible to hold the customer in queue even though there are idle servers. Second, when a server finishes serving a customer and there are customers waiting in the queue at that time point, the control policy must specify whether the server should start serving another customer or idle.
Research supported by NSF Grant CMMI-0954126.
We assume that the service times have an $H_2^*$ distribution; that is, a customer routed to a server in pool $j$ has an exponential service time with rate $\mu_j$ with probability $p_j$, and otherwise the customer's service time is equal to zero. Although service times equal to zero are not possible in most real systems, they may be used to approximate very short service times. In addition, this model provides us with additional insights on the control of inverted-V systems when service times are not exponential.
Before we explain our proposed policy, we first introduce our assumption on service time distributions.
Assumption 1. One of the following conditions holds.
i. $X_j \le_{st} X_1$, for all $j = 2, \ldots, J$,
ii. $\mu_1 \le \mu_j$, for all $j = 2, \ldots, J$,
where $X_j$ is a random variable that has the same distribution as the service times in pool $j$.
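As an illustration of how the two conditions of Assumption 1 could be checked for given parameters, the following is a minimal sketch. It relies on the fact that an $H_2^*$ service time in pool $j$ exceeds $t \ge 0$ with probability $p_j e^{-\mu_j t}$. The function names and the grid-based check are hypothetical and not part of the paper; pool 1 is stored at list index 0, and passing the grid check is only a necessary condition for the stochastic order.

```python
import math
from typing import Sequence


def tail(p: float, mu: float, t: float) -> float:
    """P(X > t) for an H2* service time: zero w.p. 1 - p, else Exp(mu)."""
    return p * math.exp(-mu * t)


def holds_1i(p: Sequence[float], mu: Sequence[float]) -> bool:
    """Closed-form check of X_j <=_st X_1 for all j >= 2 (index 0 is pool 1)."""
    return all(mu[j] >= mu[0] and p[j] <= p[0] for j in range(1, len(p)))


def holds_1ii(mu: Sequence[float]) -> bool:
    """Check mu_1 <= mu_j for all j >= 2 (index 0 is pool 1)."""
    return all(mu[0] <= mu[j] for j in range(1, len(mu)))


def tails_ordered_on_grid(p: Sequence[float], mu: Sequence[float],
                          grid: Sequence[float]) -> bool:
    """Necessary condition for 1(i): tails of pools j >= 2 lie below pool 1 on the grid."""
    return all(
        tail(p[j], mu[j], t) <= tail(p[0], mu[0], t)
        for j in range(1, len(p))
        for t in grid
    )
```

For example, with $p = (0.7, 1)$ and $\mu = (3, 4)$ condition (ii) holds while condition (i) fails, because the tail of pool 2 exceeds that of pool 1 at $t = 0$.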
Although they look similar, these two conditions are different. Because
\[ P(X_j > t) = p_j \exp\{-\mu_j t\}, \tag{1.1} \]
it is easy to verify that Assumption 1(i) holds if and only if $\mu_j \ge \mu_1$ and $p_j \le p_1$, for all $j = 2, \ldots, J$. Therefore, the first assumption implies the second one, but the second one does not imply the first one in general. We propose the following static priority policy.
- At the time of a customer arrival, route the arriving customer to one of the server pools with $j \ge 2$ with available servers, if there is an available server in those pools. For asymptotic analysis it is immaterial which server pool is chosen, but for concreteness we choose the highest indexed server pool. Otherwise, if there is an available server in pool 1, route the customer to server pool 1. If all the servers are busy, the arriving customer joins the queue and starts waiting.
- At the time of a service completion, the server finishing service picks the longest waiting customer in the queue (so our policy is non-idling). If there are no customers waiting, the server idles after finishing service of a customer.
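The routing half of the policy above can be sketched as a small function. This is an illustrative sketch only; the function name and the list-based representation of idle servers are assumptions, not part of the paper.

```python
from typing import List, Optional


def route_arrival(idle: List[int]) -> Optional[int]:
    """Pick the pool for an arriving customer under the static priority policy.

    idle[k] is the number of idle servers in pool k + 1.
    Returns the 1-based pool index, or None if the customer must join the queue.
    """
    # Pools J, J-1, ..., 2 have priority; take the highest indexed one with an idle server.
    for j in range(len(idle), 1, -1):
        if idle[j - 1] > 0:
            return j
    # Fall back to pool 1 only when no pool j >= 2 has an idle server.
    if idle[0] > 0:
        return 1
    return None
```

The service-completion half needs no decision logic: the freed server takes the head-of-line customer whenever the queue is non-empty.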
We denote this policy by $\pi^*$. The main result of this paper is that, under Assumption 1, $\pi^*$ is asymptotically optimal as the arrival rate and the number of servers get large and the load on the system approaches its capacity at a certain rate (see Section 2 for a detailed definition). Specifically, we show that $\pi^*$ asymptotically minimizes the average number of customers in the system under Assumption 1(i) and in the queue under Assumption 1(ii) over any finite time interval in the sense of stochastic ordering or expected value in the limit.
In [1], inverted-V systems with exponential service times have been analyzed in an asymptotic regime similar to the one we consider. It is shown that the fastest-server-first (FSF) policy, which routes an arriving customer to one of the available servers with the smallest expected service time, is asymptotically optimal in minimizing the number of customers in the system over finite time intervals and in steady state. In our model, if the service times are exponential, i.e., $p_j = 1$ for each pool $j$, the proposed policy $\pi^*$ reduces to the FSF policy (if we assume in addition that servers with lower mean service time are given priority; for asymptotic optimality this is not necessary). Note that if service times are exponential, Assumption 1(i) and Assumption 1(ii) are equivalent. Optimality under Assumption 1(i) is very intuitive since service times are assumed to be stochastically ordered. On the other hand, only the distribution of the non-zero part of the service times matters in Assumption 1(ii) when $p_j < 1$. It is possible to construct examples where a server pool having priority over another one may have service times with higher mean and variance (see Section 4.2 for details).
Optimal control of queueing systems with non-exponential service times in the many-server heavy traffic regime has not been treated in the literature to the best of our knowledge (paper [28], which appeared after the first version of this paper, considers the control of V-model systems using fluid models). We take a first step to treat this problem. Although the domain of the systems we consider may seem to be too limited, the results and ideas in [1] for inverted-V systems with exponential service times have served as a stepping stone for understanding the control of complex systems with many servers; see for example [18] and [7].
The general idea of our proof of asymptotic optimality has now become standard in heavy traffic analysis; see [8, 9, 4, 13, 12]. We first establish an asymptotic lower bound for the total number of customers in the system (or the queue length) and then show that the relevant process under $\pi^*$ achieves the lower bound in the limit. When the service times have an exponential distribution, it is common in the literature to first determine an asymptotically optimal preemptive policy and then show that the value of the objective function under a similar non-preemptive policy converges to the same limit; see [6, 1]. Such an approach is not possible when the service times have hyper-exponential distributions; see Section 4.3 for more details. Hence, to study the optimal control of inverted-V models, we define a mapping with an appropriate domain and range and show that the total number of customers in the system (or the queue length) under this mapping is minimized among all feasible mappings. We only consider $H_2^*$ service times because a direct extension of our optimality proof to more general distributions such as phase-type does not seem possible. However, we believe that our results can be extended to the case with general hyper-exponential service times under Assumption 1(i) or under a slightly different version of Assumption 1(ii).
We close this section with a review of the related literature. The many-server asymptotic regime we focus on was first proposed in [20]. The optimal control of parallel-server systems in the many-server regime has been studied in [2, 3, 1, 13, 12, 18, 19, 17, 6, 5, 7]. The main difference between our work and the existing literature is that we do not restrict our attention to exponential service times. Although our research is the first to address the optimal control in systems with non-exponential service times, diffusion and fluid limits of systems with general service time distributions have been established; see [25] for the treatment of phase-type distributions and [26, 22] for general distributions. Also, [28] considers the control of V-model systems in the many-server regime with general service times using a fluid scaling.
The rest of this paper is organized as follows. In Section 2 we present the details of the queueing model and the asymptotic regime we study. We define a regulator mapping and establish its properties in Section 3. We present our main results in Section 4. In addition, we present the details of simulation results and discuss the differences between preemptive and non-preemptive scheduling in our setting. The remaining sections are devoted to the proof of our main results. In the next section we collect the notation and terminology used in the rest of this paper.
1.1 Notation
Let $|x|$ denote the max norm on $\mathbb{R}^d$ given by $|x| = \max_{i=1,2,\ldots,d} \{|x_i|\}$. For $x \in \mathbb{R}$, $x^- = (-x) \vee 0$ and $x^+ = x \vee 0$. For each positive integer $d$, $D^d[0, \infty)$ denotes the $d$-dimensional Skorohod path space; see [14]. For $x, y \in D^d[0, \infty)$ and $0 \le s < T$ we set
\[ \|x(\cdot) - y(\cdot)\|_{[s,T]} = \sup_{s \le t \le T} |x(t) - y(t)|, \]
and we write $\|x(\cdot) - y(\cdot)\|_T = \|x(\cdot) - y(\cdot)\|_{[0,T]}$ for notational convenience. We also set
\[ \|x(\cdot) - y(\cdot)\|_{(s,T]} = \sup_{s < t \le T} |x(t) - y(t)|. \]
For $f \in D^d[0, \infty)$, we let $f(s-) = \lim_{t \uparrow s} f(t)$ for $s > 0$ and $f(0-) = f(0)$ by convention.
The space $D^d[0, \infty)$ is endowed with the Skorohod $J_1$ topology and the weak convergence in this space is considered with respect to this topology. For a sequence of functions $\{x^r\}$ in $D^d[0, \infty)$, the sequence is said to converge uniformly on compact sets to $x \in D^d[0, \infty)$ as $r \to \infty$, denoted by $x^r \to x$ u.o.c., if for each $T > 0$
\[ \|x^r(\cdot) - x(\cdot)\|_T \to 0 \quad \text{as } r \to \infty. \]
2 The queueing model and asymptotic framework
In this section we first describe the details of the queueing model and the asymptotic framework we consider. Then we present the queueing equations.
We consider an inverted-V system with $J \ge 2$ server pools (we set $\mathcal{J} = \{1, \ldots, J\}$). We assume that servers in the same pool have the same service time distribution. The service times are assumed to have a special distribution, denoted by $H_2^*$ as in [30]. To recap, the service times in pool $j$ are mixtures of an exponential distribution with rate $\mu_j$ with probability $0 < p_j \le 1$ and a unit point mass at 0 with probability $1 - p_j$.
In our asymptotic analysis, we consider a sequence of inverted-V systems indexed by $r$ with the same structure. The arrival rate and the number of servers go to infinity in this sequence; in the $r$th system the arrival rate is given by
\[ \lambda^r = r \tag{2.1} \]
and the number of servers, $N_j^r$, in the $j$th pool is given by
\[ N_j^r = \beta_j r, \quad \text{for } j \in \mathcal{J}, \tag{2.2} \]
for some $\beta_j > 0$. (One might require $N_j^r$ to be an integer by rounding and this does not affect our analysis. For notational simplicity we do not use rounding.) Let $m_j$ $(= p_j/\mu_j)$ denote the average service time in pool $j$.
Let $Q^r(t)$ denote the number of customers in the queue, $Z_j^r(t)$ denote the number of customers being served in pool $j$ and $Y^r(t)$ denote the total number of customers in the system at time $t$ in the $r$th system. The diffusion scaled processes are defined by
\[ \hat{Q}^r(t) = \frac{Q^r(t)}{\sqrt{r}} \quad \text{and} \quad \hat{Z}_j^r(t) = \frac{Z_j^r(t) - N_j^r}{\sqrt{r}}, \quad \text{for } j \in \mathcal{J}, \tag{2.3} \]
and
\[ \hat{Y}^r(t) = \hat{Q}^r(t) + \sum_{j=1}^{J} \hat{Z}_j^r(t). \]
Obviously these quantities depend on the policy, $\pi$, used. When we want to make this dependence explicit we append $\pi$ to these quantities as a subscript.
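To make the diffusion scaling concrete, the following minimal sketch maps a raw system state to its scaled counterpart: the queue length is divided by $\sqrt{r}$, and the number of busy servers in each pool is centered at the pool size before being divided by $\sqrt{r}$. The function name and argument layout are hypothetical.

```python
import math
from typing import List, Tuple


def diffusion_scale(q: int, z: List[int], n: List[int],
                    r: float) -> Tuple[float, List[float], float]:
    """Return (Q_hat, Z_hat, Y_hat) for queue length q, busy servers z and pool sizes n."""
    root_r = math.sqrt(r)
    q_hat = q / root_r
    # Z_hat_j is nonpositive: it measures idleness in pool j on the sqrt(r) scale.
    z_hat = [(z_j - n_j) / root_r for z_j, n_j in zip(z, n)]
    y_hat = q_hat + sum(z_hat)
    return q_hat, z_hat, y_hat
```

For instance, with $r = 100$, a queue of 20 customers and 45 of 50 servers busy in pool 1 (pool 2 fully busy), the scaled state is $\hat{Q} = 2$, $\hat{Z} = (-0.5, 0)$ and $\hat{Y} = 1.5$.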
The arrival process in the $r$th system, $A^r(\cdot)$, is defined by
\[ A^r(t) = \sup\Big\{ m \ge 0 : \sum_{\ell=0}^{m} u(\ell) \le rt \Big\}, \tag{2.4} \]
where $\{u(\ell) : \ell = 1, 2, \ldots\}$ is a sequence of i.i.d. nonnegative random variables with mean 1 and variance $\sigma^2 \in [0, \infty)$ and $u(0)$ is an arbitrary nonnegative random variable. By convention, empty sums are set to be zero.
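The counting process in (2.4) can be evaluated directly from a realized sequence of primitives. The sketch below is illustrative only: the function name is hypothetical, the input list is assumed to be long enough that its partial sums eventually exceed $rt$, and returning 0 when no index qualifies is a convention chosen for this sketch.

```python
from itertools import accumulate
from typing import Sequence


def arrival_count(u: Sequence[float], r: float, t: float) -> int:
    """Largest m with u[0] + ... + u[m] <= r * t, following (2.4).

    Returns 0 if no m qualifies (a convention of this sketch).
    The entries of u are assumed nonnegative, so partial sums are nondecreasing.
    """
    count = 0
    for m, partial_sum in enumerate(accumulate(u)):
        if partial_sum > r * t:
            break
        count = m
    return count
```

With `u = [0.0, 1.0, 1.0, 1.0]`, `r = 1.0` and `t = 2.5`, the partial sums are 0, 1, 2, 3, so the function returns 2.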
As described in the introduction, control policies are needed to determine how inverted-V models operate. In search of an optimal control policy, we restrict our attention to admissible control policies that are non-idling, head-of-line, non-preemptive and Markovian as described in [13].
We make the following (so-called) many-server heavy traffic assumption: for $\theta \in \mathbb{R}$,
\[ \sum_{j=1}^{J} \frac{N_j^r}{m_j} = r + \theta\sqrt{r}, \quad \text{for } r \ge 2. \tag{2.5} \]
We also assume that the following holds for the sequence of initial conditions:
\[ (\hat{Q}^r(0), \hat{Z}^r(0)) \Rightarrow (\hat{Q}(0), \hat{Z}(0)), \quad \text{as } r \to \infty, \tag{2.6} \]
where $\Rightarrow$ denotes weak convergence, $(\hat{Q}(0), \hat{Z}(0))$ is a random vector and $\hat{Z}^r = (\hat{Z}_j^r, j \in \mathcal{J})$.
Queueing processes: Next we provide the details of the queueing processes. Let $S_j$ denote a Poisson process with rate $\mu_j$, $j \in \mathcal{J}$, and assume that the $S_j$'s, $j \in \mathcal{J}$, are independent. We use $A_j^r(t)$, $j \in \mathcal{J}$, to denote the number of customers whose service started in server pool $j$ before time $t$. We note that $A_j^r(t)$ is nondecreasing for all $j \in \mathcal{J}$ because we only consider non-preemptive policies. Let $D_j^r(t)$, $j \in \mathcal{J}$, denote the number of service completions by time $t$ in server pool $j$ whose service times were not equal to 0. We can write
\[ D_j^r(t) = S_j\Big( \int_0^t Z_j^r(s)\,ds \Big), \quad \text{for } j \in \mathcal{J}, \]
for any admissible policy; see Theorem 2.1 in [13] for details. We set
\[ \mathbb{A}^r = (A^r, A_j^r : j \in \mathcal{J}), \quad Z^r = (Z_j^r : j \in \mathcal{J}), \quad \text{and} \quad D^r = (D_j^r : j \in \mathcal{J}). \]
Let $\mathbb{X}^r = (\mathbb{A}^r, Z^r, Q^r, D^r)$. Note that $\mathbb{X}^r$ depends on the control policy, $\pi$, used. When we want to make this dependence explicit we append $\pi$ to the process $\mathbb{X}^r$ or its components.
Let $\{\phi_j(n), n \ge 1\}$ be a sequence of i.i.d. Bernoulli random variables such that $\phi_j(1)$ is equal to 1 with probability $p_j$ and equal to zero with probability $1 - p_j$. We can write the evolution of the queueing processes as follows:
\[ Q^r(t) = Q^r(0) + A^r(t) - \sum_{j=1}^{J} A_j^r(t), \tag{2.7} \]
\[ Z_j^r(t) = Z_j^r(0) + \sum_{n=1}^{A_j^r(t)} \phi_j(n) - D_j^r(t), \quad \text{for } j \in \mathcal{J}. \tag{2.8} \]
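Next, we write the queueing equations (2.7)-(2.8) in a form more amenable to analysis. The fluid scaling is defined by
\[ \bar{\mathbb{X}}^r = \mathbb{X}^r / r. \]
We define the diffusion scaled departure processes by
\[ \hat{D}_j^r(t) = \sqrt{r} \left( \frac{S_j\big( \int_0^t Z_j^r(s)\,ds \big)}{r} - \mu_j \int_0^t \bar{Z}_j^r(s)\,ds \right), \quad j \in \mathcal{J}, \tag{2.9} \]
and the diffusion scaled arrival process by
\[ \hat{A}^r(t) = r^{-1/2} \big( A^r(t) - \lambda^r t \big). \]
Let
\[ C_j^r(t) = \sum_{\ell=1}^{\lfloor t \rfloor} \phi_j(\ell) \quad \text{and} \quad \hat{C}_j^r(t) = r^{-1/2} \big( C_j^r(rt) - rtp_j \big), \quad j \in \mathcal{J}. \]
We set $\hat{C}^r = (\hat{C}_1^r, \ldots, \hat{C}_J^r)$ and define
\[ \hat{a}_j^r(t) = \hat{C}_j^r(\bar{A}_j^r(t)), \quad j \in \mathcal{J}. \tag{2.10} \]
Also let
\[ \hat{u}_j^r(t) = r^{-1/2} \Big( A_j^r(t) - \frac{N_j^r}{m_j} t \Big), \quad j \in \mathcal{J}, \tag{2.11} \]
and $\hat{u}^r = (\hat{u}_j^r; j \in \mathcal{J})$. The process $\hat{u}_j^r$ can be interpreted as the diffusion scaled deviation of the number of customers routed to pool $j$ from its nominal value. For notational convenience we set
\[ \hat{w}_j^r(t) = \hat{Z}_j^r(0) + \hat{a}_j^r(t) - \hat{D}_j^r(t), \ j \in \mathcal{J}, \quad \text{and} \quad \hat{w}_q^r(t) = \hat{Q}^r(0) + r^{-1/2} \big( A^r(t) - \lambda^r t \big) + \theta t. \tag{2.12} \]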
By (2.7)-(2.12),
\[ \hat{Z}_j^r(t) = \hat{w}_j^r(t) - \mu_j \int_0^t \hat{Z}_j^r(s)\,ds + p_j \hat{u}_j^r(t), \quad j \in \mathcal{J}, \tag{2.13} \]
\[ \hat{Q}^r(t) = \hat{w}_q^r(t) - \sum_{j \in \mathcal{J}} \hat{u}_j^r(t), \tag{2.14} \]
and
\[ 0 \le \hat{Q}^r(t) \quad \text{and} \quad \hat{Z}_j^r(t) \le 0, \quad \text{for } j \in \mathcal{J}. \tag{2.15} \]
The non-idling condition implies that
\[ \hat{Q}^r(t) \sum_{j=1}^{J} \hat{Z}_j^r(t) = 0, \quad \text{for all } t \ge 0. \tag{2.16} \]
Depending on the policy used, additional equations are added to (2.13)-(2.15). Under the proposed policy $\pi^*$ the following condition also holds:
\[ \sum_{j=j'}^{J} \big( A_j^r(t) - A_j^r(s) \big) = A^r(t) - A^r(s), \tag{2.17} \]
if $\sum_{j=j'}^{J} \hat{Z}_j^r(u) < 0$ for all $u \in [s, t]$ and for $j' \in \mathcal{J}$.
3 Mapping
In this section we define the mapping that we later use to characterize the optimal limiting queueing processes and to study the limit of queueing processes under the proposed policy $\pi^*$. Let $x = (x_q, x_j : j \in \mathcal{J}) \in D^{J+1}[0, \infty)$, $u = (u_j : j \in \mathcal{J}) \in D^J[0, \infty)$, $z = (z_j : j \in \mathcal{J}) \in D^J[0, \infty)$ and $q \in D[0, \infty)$. Consider the following equations
\[ z_j(t) = x_j(t) - \mu_j \int_0^t z_j(s)\,ds + p_j u_j(t), \quad \text{for all } j \in \mathcal{J}, \tag{3.1} \]
\[ q(t) = x_q(t) - \sum_{j=1}^{J} u_j(t), \tag{3.2} \]
\[ z_j(t) \le 0, \quad \text{for all } j \in \mathcal{J}, \tag{3.3} \]
\[ q(t) \ge 0, \tag{3.4} \]
\[ q(t) \sum_{j=1}^{J} z_j(t) = 0, \tag{3.5} \]
for $t \ge 0$. We highlight the fact that equations (2.13)-(2.16) are very similar to (3.1)-(3.5). The processes $A_j^r$, and hence $\hat{u}_j^r$ (see (2.11)), determine how a queueing system is controlled. In this setting the control is determined by $u$. We start by defining feasible controls in this context.
Definition 1. Given $x \in D^{J+1}[0, \infty)$, $u \in D^J[0, \infty)$ is said to be a feasible control if there exist $z \in D^J[0, \infty)$ and $q \in D[0, \infty)$ that satisfy (3.1)-(3.5).
Given $x$ and a feasible control $u$, the process $(z, q)$ satisfying (3.1)-(3.5) is unique; see Theorem 4.1 in [24]. In order to make this dependence explicit we write $(z^u, q^u)$ to denote the processes associated with a feasible control $u$ when the control is not clear from the context.
Next we define the mapping $\Phi$. Let $\Phi : D^{J+1}[0, \infty) \to D^J[0, \infty) \times D[0, \infty) \times D^J[0, \infty)$ be defined for $x \in D^{J+1}[0, \infty)$ by
\[ \Phi(x) = (\Phi_1(x), \Phi_2(x), \Phi_3(x)) = (z, q, u), \]
where $(z, q, u)$ satisfies (3.1)-(3.5) and the following condition
\[ z_j(t) = 0, \quad \text{for } j \ge 2 \text{ and } t \ge 0. \tag{3.6} \]
Note that by (3.1) this implies
\[ u_j(t) = -x_j(t)/p_j, \quad \text{for } j \ge 2 \text{ and } t \ge 0. \tag{3.7} \]
In words, the mapping $\Phi$ makes sure that all the servers in pools $j = 2, \ldots, J$ (those that have priority over the first pool) are busy at all times; see (3.6). Obviously $\Phi$ depends on $p$ but we do not explicitly indicate this dependence in our notation because we assume that $p$ is fixed throughout the paper. Next we show that $\Phi$ is well defined.
Lemma 2. For each $x \in D^{J+1}[0, \infty)$, there exists a unique $(z, q, u)$ that satisfies (3.1)-(3.6). Therefore, the mapping $\Phi$ is well defined. In addition, $\Phi$ is continuous provided that the function spaces $D^{J+1}[0, \infty)$ and $D^J[0, \infty) \times D[0, \infty) \times D^J[0, \infty)$ are endowed with either the topology of uniform convergence over compact intervals or with the Skorohod $J_1$ topology.
Proof. Let $x \in D^{J+1}[0, \infty)$. Clearly, $u_j$ for $j \ge 2$ is well defined by (3.7). Let $\Psi : D^{J+1}[0, \infty) \to D[0, \infty)$ be defined by $\Psi(x) = y_1$, where
\[ y_1(t) = y_1(0) + x_q(t) + \sum_{j=1}^{J} \frac{x_j(t)}{p_j} + \mu_1 \int_0^t (y_1(s))^-\,ds. \]
By Theorem 4.1 in [24], $\Psi$ is well defined, i.e., there exists a unique $y_1 \in D[0, \infty)$, and it is continuous provided that the function spaces $D^{J+1}[0, \infty)$ and $D[0, \infty)$ are endowed with either the topology of uniform convergence over compact intervals or the Skorohod $J_1$ topology. Let
\[ q(t) = (y_1(t))^+ \quad \text{and} \quad z_1(t) = -p_1 (y_1(t))^-. \tag{3.8} \]
Hence,
\[ y_1(t) = q(t) + \frac{z_1(t)}{p_1}. \tag{3.9} \]
Clearly $q$ and $z_1$ satisfy (3.3)-(3.5). Define
\[ u_1(t) = \frac{z_1(t) - x_1(t) + \mu_1 \int_0^t z_1(s)\,ds}{p_1}. \]
Note that $z_1$ and $u_1$ satisfy (3.1). Also $q$ satisfies (3.2) by (3.7) and (3.9). This proves existence. Continuity and measurability follow from the continuity of $\Psi$ and (3.8).
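The construction in the proof can be mimicked numerically on a time grid. The sketch below is not the paper's method: it approximates the integral equation for $y_1$ with an explicit Euler step, assumes the initial term $y_1(0)$ is already absorbed into the input paths, and uses hypothetical names throughout. It then recovers $q$ and $z_1$ as in (3.8) and the controls $u_j$, $j \ge 2$, as in (3.7).

```python
from typing import List, Sequence, Tuple


def phi_sketch(x_q: Sequence[float], x: Sequence[Sequence[float]],
               p: Sequence[float], mu1: float,
               dt: float) -> Tuple[List[float], List[float], List[List[float]]]:
    """Discretized sketch of the mapping: returns (q, z_1, [u_2, ..., u_J]).

    x_q[k] and x[j][k] are path values at time k * dt; index j = 0 is pool 1.
    """
    n_steps, n_pools = len(x_q), len(p)
    integral = 0.0  # running approximation of int_0^t (y_1(s))^- ds
    q: List[float] = []
    z1: List[float] = []
    for k in range(n_steps):
        base = x_q[k] + sum(x[j][k] / p[j] for j in range(n_pools))
        y1 = base + mu1 * integral  # explicit Euler: integral uses earlier steps only
        y_plus, y_minus = max(y1, 0.0), max(-y1, 0.0)
        q.append(y_plus)              # q = (y_1)^+, see (3.8)
        z1.append(-p[0] * y_minus)    # z_1 = -p_1 (y_1)^-, see (3.8)
        integral += y_minus * dt
    # Pools j >= 2 stay fully busy (z_j = 0), which forces u_j = -x_j / p_j, see (3.7).
    u_rest = [[-x[j][k] / p[j] for k in range(n_steps)] for j in range(1, n_pools)]
    return q, z1, u_rest
```

By construction at most one of `q[k]` and `z1[k]` is non-zero at each grid point, which is the discrete analogue of the complementarity condition (3.5).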
Now we prove that the mapping $\Phi$ is optimal in a certain sense. Given a feasible control $u$ and associated processes $q$, $z$ satisfying (3.1)-(3.5) with $u$, let
\[ y(t) = q(t) + \sum_{j=1}^{J} z_j(t). \]
We write $y^u$ instead of $y$ when we want to make the dependence on the control $u$ explicit.
Theorem 3. Fix $x \in D^{J+1}[0, \infty)$ and let $u^* = \Phi_3(x)$. Then for any other feasible control $u$ and fixed $T > 0$,
i. if Assumption 1(i) holds then
\[ y^u(t) \ge y^{u^*}(t), \quad t \in [0, T], \tag{3.10} \]
ii. if Assumption 1(ii) holds then
\[ (y^u(t))^+ \ge (y^{u^*}(t))^+, \quad t \in [0, T]. \tag{3.11} \]
Proof. Fix $x \in D^{J+1}[0, \infty)$. Let $u$ be a feasible control and denote by $(z, q)$ the associated processes satisfying (3.1)-(3.5). First we prove (ii). Define
\[ \tilde{y}^u(t) = q(t) + \sum_{j=1}^{J} \frac{z_j(t)}{p_j}. \]
By (3.3)-(3.5)
\[ (\tilde{y}^u(t))^+ = q(t) \quad \text{and} \quad (\tilde{y}^u(t))^- = -\sum_{j=1}^{J} \frac{z_j(t)}{p_j}. \tag{3.12} \]
We also have by (3.1) and (3.2)
\[ \tilde{y}^u(t) = \tilde{y}^u(0) + x_q(t) + \sum_{j=1}^{J} \frac{x_j(t)}{p_j} - \sum_{j=1}^{J} \mu_j \int_0^t \frac{z_j(s)}{p_j}\,ds. \]
We use Lemma 4.4 in [12] to complete the proof. We define
\[ r(t) = \int_0^t q(s)\,ds \quad \text{and} \quad v_j(t) = \int_0^t \frac{z_j(s)}{p_j}\,ds. \tag{3.13} \]
We also set
\[ w(t) = \tilde{y}(0) + x_q(t) + \sum_{j=1}^{J} \frac{x_j(t)}{p_j}. \]
(This quantity is not related to $\hat{w}_q^r$ defined in (2.12).) We have
\[ \tilde{y}^u(t) = w(t) - \sum_{j=1}^{J} \mu_j v_j(t). \]
By (3.12) and (3.13), $\tilde{y}$, $r$ and $v_j$, $j \in \mathcal{J}$, satisfy the conditions of Lemma 4.4 in [12], hence
\[ \tilde{y}^u(t) \ge \tilde{y}^{u^*}(t), \quad t \in [0, T]. \tag{3.14} \]
This inequality gives (3.11) by (3.5) and (3.12).
We prove (3.10) using (3.14), Assumption 1(i) and (3.11). Note that Assumption 1(i) implies by (1.1) that
\[ p_1 \ge p_j, \quad \text{for all } j = 2, \ldots, J. \tag{3.15} \]
Therefore,
\[ (y^u(t))^- \overset{(a)}{=} -\sum_{j=1}^{J} z_j(t) \overset{(b)}{\le} -p_1 \sum_{j=1}^{J} \frac{1}{p_j} z_j(t) \overset{(c)}{=} p_1 (\tilde{y}^u(t))^- \overset{(d)}{\le} p_1 (\tilde{y}^{u^*}(t))^- \overset{(e)}{=} (y^{u^*}(t))^-, \tag{3.16} \]
where (a) follows from (3.3) and (3.4), (b) follows from (3.3) and (3.15), (c) follows from (3.12), (d) follows from (3.14) and (e) follows from (3.6). Because Assumption 1(i) implies Assumption 1(ii), (3.11) still holds (see the discussion after Assumption 1). Hence, combined with (3.11), (3.16) gives (3.10).
4 Main results and Insights
In this section, we first present our main results. Then we present the results of simulation experiments in Section 4.2 and comment on our proof technique in Section 4.3.
4.1 Main Results
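Let $B = (W_q, W)$, where $W_q$ and $W = (W_j : j \in \mathcal{J})$ are independent Brownian motions with drifts $\theta$ and 0, variances $\sigma^2$ and $(\beta_j \mu_j (2 - p_j) : j \in \mathcal{J})$, respectively, and initial states $W_q(0) = \hat{Q}(0)$ and $W_j(0) = \hat{Z}_j(0)$ for $j \in \mathcal{J}$. Let
\[ (\hat{Z}^*, \hat{Q}^*, \hat{u}^*) = \Phi(B) \]
and
\[ \hat{Y}^*(t) = \hat{Q}^*(t) + \sum_{j=1}^{J} \hat{Z}_j^*(t). \]
First we prove that $\hat{Q}^*$ and $\hat{Y}^*$ provide a lower bound for all the admissible policies.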

Theorem 4. Consider a sequence of inverted-V systems. Assume that (2.1), (2.2), (2.5) and (2.6) hold. For any sequence of admissible policies $\{\pi^r\}$ and for any $T > 0$,
i. under Assumption 1(i)
\[ \liminf_{r \to \infty} P\Big\{ \int_0^T \hat{Y}^r_{\pi^r}(t)\,dt > x \Big\} \ge P\Big\{ \int_0^T \hat{Y}^*(t)\,dt > x \Big\} \tag{4.1} \]
for any $x > 0$ and
\[ \liminf_{r \to \infty} E\Big[ \int_0^T \hat{Y}^r_{\pi^r}(t)\,dt \Big] \ge E\Big[ \int_0^T \hat{Y}^*(t)\,dt \Big], \tag{4.2} \]
ii. under Assumption 1(ii)
\[ \liminf_{r \to \infty} P\Big\{ \int_0^T \hat{Q}^r_{\pi^r}(t)\,dt > x \Big\} \ge P\Big\{ \int_0^T \hat{Q}^*(t)\,dt > x \Big\} \tag{4.3} \]
for any $x > 0$ and
\[ \liminf_{r \to \infty} E\Big[ \int_0^T \hat{Q}^r_{\pi^r}(t)\,dt \Big] \ge E\Big[ \int_0^T \hat{Q}^*(t)\,dt \Big]. \tag{4.4} \]

| Setting | $\mu_1$ | $p_1$ | $\mu_2$ | $(E[X_1], E[X_2])$ | $(\mathrm{Var}[X_1], \mathrm{Var}[X_2])$ |
|---|---|---|---|---|---|
| 1 | 3 | 0.7 | 4 | (0.233, 0.25) | (0.101, 0.0625) |
| 2 | 3 | 0.5 | 4 | (0.166, 0.25) | (0.083, 0.0625) |
| 3 | 3 | 0.3 | 4 | (0.1, 0.25) | (0.056, 0.0625) |

Table 1: Service time parameters
Next we show that the cost under the proposed policy coincides with the lower bound in the limit.
Theorem 5. Consider a sequence of inverted-V systems. Assume that (2.1), (2.2), (2.5) and (2.6) hold. Under policy $\pi^*$, for any $T > 0$,
i. under Assumption 1(i)
\[ \lim_{r \to \infty} P\Big\{ \int_0^T \hat{Y}^r_{\pi^*}(t)\,dt > x \Big\} = P\Big\{ \int_0^T \hat{Y}^*(t)\,dt > x \Big\} \tag{4.5} \]
for any $x > 0$ and
\[ \lim_{r \to \infty} E\Big[ \int_0^T \hat{Y}^r_{\pi^*}(t)\,dt \Big] = E\Big[ \int_0^T \hat{Y}^*(t)\,dt \Big], \tag{4.6} \]
ii. under Assumption 1(ii)
\[ \lim_{r \to \infty} P\Big\{ \int_0^T \hat{Q}^r_{\pi^*}(t)\,dt > x \Big\} = P\Big\{ \int_0^T \hat{Q}^*(t)\,dt > x \Big\} \tag{4.7} \]
for any $x > 0$ and
\[ \lim_{r \to \infty} E\Big[ \int_0^T \hat{Q}^r_{\pi^*}(t)\,dt \Big] = E\Big[ \int_0^T \hat{Q}^*(t)\,dt \Big]. \tag{4.8} \]
Remark 6. The main difference between parts (i) and (ii) in Theorems 4 and 5 is that under Assumption 1(i) we prove a slightly stronger result: the total number of customers in the system is minimized (in the appropriate sense), whereas only the number of customers in queue is shown to be minimized under the proposed policy under Assumption 1(ii).
4.2 Simulation experiments
In this section we focus on inverted-V systems with $J = 2$ and carry out some simulation experiments. (The reader who wants to see the details of the proofs first can skip this section and go to Section 5.) We set $J = 2$, $p_2 = 1$, $\mu_1 = 3$, and $\mu_2 = 4$. Therefore, Assumption 1(ii) holds (irrespective of the value of $p_1 \in (0, 1]$) and the policy that gives priority to the second pool should be asymptotically optimal. In our experiments, we compare the proposed policy (which gives priority to the second pool) to the policy that gives priority to the first pool. We consider three different settings for service times by altering $p_1$ as presented in Table 1. Table 1 also presents the expected service times, under column $(E[X_1], E[X_2])$, and the variances of the service times, under column $(\mathrm{Var}[X_1], \mathrm{Var}[X_2])$, for pools 1 and 2, respectively, under the three different values of $p_1$. In all these settings, the expected service time for the first pool is smaller than that of the second pool. The variance of the service times for pool 1 is higher than that for the second pool in the first two settings and lower in the last setting.

| Exp No | $N$ | $\lambda$ | $\mu_1$ | $p_1$ | $\mu_2$ | $\Delta Q$ |
|---|---|---|---|---|---|---|
| 1 | (50,50) | 407 | 3 | 0.7 | 4 | 1.70 (1.31) |
| 2 | (50,50) | 480 | 3 | 0.5 | 4 | 0.63 (0.03) |
| 3 | (50,50) | 685 | 3 | 0.3 | 4 | 5.87 (5.31) |
| 4 | (200,200) | 1643 | 3 | 0.7 | 4 | 3.35 (2.08) |
| 5 | (200,200) | 1985 | 3 | 0.5 | 4 | 4.19 (3.27) |
| 6 | (200,200) | 2770 | 3 | 0.3 | 4 | 5.38 (3.97) |

Table 2: Simulation results
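The means and variances reported in Table 1 follow from the first two moments of an $H_2^*$ random variable: $E[X] = p/\mu$ and $E[X^2] = 2p/\mu^2$, so that $\mathrm{Var}[X] = 2p/\mu^2 - (p/\mu)^2$. The helper below is a small sketch (the function name is hypothetical) that recomputes these values.

```python
from typing import Tuple


def h2_star_moments(p: float, mu: float) -> Tuple[float, float]:
    """Mean and variance of X = 0 w.p. 1 - p, X ~ Exp(mu) w.p. p."""
    mean = p / mu
    second_moment = 2.0 * p / mu ** 2  # p times the second moment of Exp(mu)
    return mean, second_moment - mean ** 2


# Pool 1 in the three settings of Table 1, then pool 2 (p_2 = 1, mu_2 = 4).
for p1 in (0.7, 0.5, 0.3):
    print(p1, h2_star_moments(p1, 3.0))
print(1.0, h2_star_moments(1.0, 4.0))
```

The computed values agree with Table 1 up to the rounding used there.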
The details and results of the simulation experiments are presented in Table 2. Initially we consider mid-size systems in Experiments 1 through 3 in Table 2 with 50 servers in each pool and with the service time distributions given as in each setting in Table 1. We run three additional experiments, Experiments 4 through 6 in Table 2, using parameters similar to those of Experiments 1 through 3, respectively, except that the number of servers in each pool in these experiments is equal to 200. The column $N$ is the number of servers in each pool and $\lambda$ is the arrival rate. The last column, $\Delta Q$, is the difference between the average queue lengths throughout the simulation when we give priority to the first and the second pools, respectively. In parentheses, we display the 95% confidence interval we obtain from 100 replications. In each replication, we simulate the system to allow 1,000,000 arrivals and we start each simulation with an empty system.
The results of our experiments show that the proposed policy performs better than the policy that gives priority to the first pool and the differences between queue lengths are statistically significant in all the experiments. We emphasize the fact that in Experiments 3 and 6, both the mean and variance of service times in the second pool are larger than those in the first pool, but it is still optimal to give priority to the second pool.
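An experiment of this kind could be set up along the following lines. This is a minimal sketch and not the simulation code used for Table 2: it assumes Poisson arrivals (the arrival process used in the experiments is not specified in the text), exploits the memorylessness of the non-zero service times to simulate a Markov chain on the numbers of busy servers, and uses hypothetical names. Pools are 0-indexed, so `priority=[1, 0]` gives priority to the second pool.

```python
import random
from typing import Sequence


def simulate_avg_queue(lam: float, n: Sequence[int], mu: Sequence[float],
                       p: Sequence[float], priority: Sequence[int],
                       num_events: int, seed: int = 0) -> float:
    """Time-average queue length of an inverted-V system with H2* service times."""
    rng = random.Random(seed)
    pools = range(len(n))
    busy = [0] * len(n)
    queue, clock, area = 0, 0.0, 0.0
    for _ in range(num_events):
        rates = [busy[j] * mu[j] for j in pools]
        total = lam + sum(rates)
        dt = rng.expovariate(total)
        area += queue * dt
        clock += dt
        if rng.random() * total < lam:
            # Arrival: route to the first pool in priority order with an idle server.
            pool = next((j for j in priority if busy[j] < n[j]), None)
            if pool is None:
                queue += 1
            elif rng.random() < p[pool]:
                busy[pool] += 1
            # Otherwise the service time is zero and the customer leaves at once.
        else:
            # Service completion in a pool chosen proportionally to its departure rate.
            j = rng.choices(pools, weights=rates)[0]
            busy[j] -= 1
            # Non-idling: the freed server keeps taking head-of-line customers
            # until one of them has a non-zero service time or the queue empties.
            while queue > 0 and busy[j] < n[j]:
                queue -= 1
                if rng.random() < p[j]:
                    busy[j] += 1
    return area / clock
```

A comparison in the spirit of Experiment 1 would call the function twice with the same parameters, once with `priority=[1, 0]` and once with `priority=[0, 1]`, and average the difference over independent replications.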
4.3 Preemptive vs. nonpreemptive policies
We comment on our proof technique before we provide the details of the proofs. In the many-server asymptotic analysis, when service times have an exponential distribution, a common methodology to show that a non-preemptive policy is asymptotically optimal is to first show that a similar but preemptive policy is optimal and then to show that the performance of the non-preemptive policy coincides with that of the optimal preemptive policy in the limit. For example, in [1] the following preemptive policy is analyzed first: if a faster server idles, the service of a customer in the slower server pool (if there is any) is preempted and that customer is re-routed to the faster pool. Then it is shown that the policy that does not allow preemptions has the same asymptotic performance as this preemptive policy.
However, such an approach cannot be used in the current context. To illustrate, consider an inverted-V model with two server pools. If preemptive policies are allowed, it can be shown that the number of customers in queue converges to zero in the many-server heavy traffic regime for certain systems, as we explain below. Assume that we reserve one tagged server in the first pool and that the tagged server is used differently from the other servers, in a way we explain next. Upon a new customer arrival, we send the arriving customer to the tagged server. Note that with probability $1 - p_1$ the new customer's service time is zero and, if so, that customer will leave the system immediately. If that customer's service time is not zero, we preempt this customer's service. If possible we try to assign this customer to a server in pool 2 and then to a server (other than the tagged server) in pool 1. Otherwise, the customer starts waiting in the queue until a server other than the tagged server becomes available. (Although the policy we describe is idling, a non-idling version can be constructed similarly.)
Consider the following parameters: $\mu_1 = \mu_2 = 1$, $p_1 = 0.5$ and $p_2 = 1$, and $N_1^r = N_2^r = 50$. Therefore, under the preemptive policy described above 50% of the customers will leave the system immediately upon arrival. The total system capacity is 150 customers per unit time by (2.5). However, if preemption as described above is allowed, the system can handle 200 customers per unit time (50% of them will leave the system immediately and the rest will have average service time equal to 1). Therefore, even when (2.5) holds, the system is not under heavy traffic when preemptions are allowed. Using this fact, the diffusion scaled queue length can be shown to converge to zero at all times for large enough $r$ when (2.1), (2.2), (2.5) and (2.6) hold. This is not possible under any non-preemptive policy as shown in our main result. Therefore, we work directly with the mapping $\Phi$ defined in Section 3 instead of considering preemptive policies in our proof.
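The two capacity figures in this example can be reproduced with a few lines of arithmetic. The sketch below uses hypothetical function names; the preemptive figure follows the approximation in the text, which counts all 100 servers as available for the customers with non-zero service times and ignores the single tagged server.

```python
from typing import Sequence


def nonpreemptive_capacity(n: Sequence[int], mu: Sequence[float],
                           p: Sequence[float]) -> float:
    """Sum over pools of N_j / m_j with m_j = p_j / mu_j, the left side of (2.5)."""
    return sum(n_j * mu_j / p_j for n_j, mu_j, p_j in zip(n, mu, p))


def preemptive_throughput(n: Sequence[int], mu: Sequence[float],
                          p_screen: float) -> float:
    """Arrival rate sustainable when zero-length jobs are screened out first.

    Only a fraction p_screen of arrivals needs real (exponential) service.
    """
    return sum(n_j * mu_j for n_j, mu_j in zip(n, mu)) / p_screen


print(nonpreemptive_capacity([50, 50], [1.0, 1.0], [0.5, 1.0]))  # 150.0
print(preemptive_throughput([50, 50], [1.0, 1.0], 0.5))          # 200.0
```

The gap between the two numbers is what makes the preemptive system underloaded even when (2.5) holds.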
5 Asymptotic Bounds for admissible policies
The rest of the paper is devoted to the proofs of Theorems 4 and 5. In this section we provide bounds for $\hat{Q}^r$, $\hat{Z}^r$ and $\hat{u}^r$ under non-idling and non-preemptive policies. These bounds will be used in several places throughout the proofs of our main results. In Sections 6 and 7 we prove Theorems 4 and 5, respectively.
For a fixed admissible policy $\pi$, let
\[ \hat{w}^r = (\hat{w}_j^r, j \in \mathcal{J}), \quad \hat{b}^r = (\hat{w}_q^r, \hat{w}^r), \tag{5.1} \]
(recall (2.12)) and
\[ (\tilde{Z}^r, \tilde{Q}^r, \tilde{u}^r) = \Phi(\hat{b}^r), \tag{5.2} \]
where the mapping $\Phi$ is defined as in Section 3. Note that the process $\hat{b}^r$ depends on the policy $\pi$ but we suppress this dependence in our notation for simplicity.
Lemma 7. For any admissible policy $\pi$ and $T > 0$
\[ \|\hat{Q}^r(\cdot)\|_T \vee \|\hat{Z}^r(\cdot)\|_T \vee \|\hat{u}^r(\cdot)\|_T \le C_T \big( \|\hat{b}^r(\cdot)\|_T + \|\hat{A}^r(\cdot)\|_T \big), \]
for some $C_T > 0$, independent of the policy and for all $j \in \mathcal{J}$.
The following result can be proved similarly to Lemma 7; hence the proof is omitted.
Lemma 8. Fix an admissible policy $\pi$ and let $(\tilde Z^r, \tilde Q^r, \tilde u^r)$ be defined as in (5.2). For $T > 0$
$$\|\tilde Q^r(\cdot)\|_T \vee \|\tilde Z^r(\cdot)\|_T \vee \|\tilde u^r(\cdot)\|_T \le C_T\left(\|\hat b^r(\cdot)\|_T + \|\hat A^r(\cdot)\|_T\right),$$
for some $C_T > 0$, independent of the policy.
Proof of Lemma 7. Fix an admissible policy $\pi$ and $T > 0$. First note that by (2.15), (2.13) gives
$$p_j \hat u^r_j(t) \le -\hat w^r_j(t) + \mu_j \int_0^t \hat Z^r_j(s)\,ds \le -\hat w^r_j(t). \tag{5.3}$$
Also, if $\hat Q^r(t) = 0$, then by (2.14)
$$\sum_{j \in \mathcal{J}} \hat u^r_j(t) = \hat w^r_q(t).$$
Hence by (5.3), for $t \in [0, T]$
$$|\hat u^r_j(t)| \le \|\hat b^r(\cdot)\|_T \tag{5.4}$$
if $\hat Q^r(t) = 0$.
Now assume that $\hat Q^r(t) > 0$. Also assume there exists $0 < u < t$ such that $\hat Q^r(u) = 0$ (we
comment on the case $\hat Q^r(u) > 0$ for all $u \in [0, t]$ below) and define
$$s = \sup\{u < t : \hat Q^r(u) = 0\}. \tag{5.5}$$
First assume that $0 < s < t$ (we comment on the case $s = t$ below). Note that $\hat Q^r(\tau) > 0$ for all
$\tau \in (s, t]$, because it is right continuous by definition, and so $\hat Z^r(u) = 0$ for all $u \in [s, t]$ by (3.5).
By (2.13)
$$\hat Z^r_j(t) - \hat Z^r_j(s-) = \hat a^r_j(t) - \hat a^r_j(s-) - \left(\hat D^r_j(t) - \hat D^r_j(s-)\right) - \mu_j \int_s^t \hat Z^r_j(v)\,dv + p_j\left(\hat u^r_j(t) - \hat u^r_j(s-)\right). \tag{5.6}$$
We show below that for any $0 < u < T$
$$|\hat u^r_j(u) - \hat u^r_j(u-)| \le 2\left(\|\hat b^r(\cdot)\|_T + \|\hat A^r(\cdot)\|_T\right). \tag{5.7}$$
This inequality also implies, by (2.13) and the fact that $\hat Z^r_j \in D[0, T]$,
$$|\hat Z^r_j(u) - \hat Z^r_j(u-)| \le 3\left(\|\hat b^r(\cdot)\|_T + \|\hat A^r(\cdot)\|_T\right). \tag{5.8}$$
Note that because $\hat Z^r_j(s) = 0$, (5.8) implies that
$$|\hat Z^r_j(s-)| \le 3\left(\|\hat b^r(\cdot)\|_T + \|\hat A^r(\cdot)\|_T\right).$$
Because $\hat Z^r_j(u) = 0$ for all $u \in (s, t]$
$$\mu_j \int_s^t \hat Z^r_j(v)\,dv = 0. \tag{5.9}$$
Also by (5.4) and the fact that $\hat u^r_j$ has left limits we have
$$|\hat u^r_j(s-)| \le \|\hat b^r(\cdot)\|_T. \tag{5.10}$$
By combining (5.6)–(5.10), for some $C_T > 0$
$$p_j |\hat u^r_j(t)| \le |\hat Z^r_j(s-)| + 2\|\hat a^r_j(\cdot)\|_T + 2\|\hat D^r_j(\cdot)\|_T + p_j |\hat u^r_j(s-)| \le C_T\left(\|\hat b^r(\cdot)\|_T + \|\hat A^r(\cdot)\|_T\right).$$
This inequality gives the desired result for $\hat u^r$ (when there exists $0 < u < t$ such that $\hat Q^r(u) = 0$).
If $\hat Q^r(u) > 0$ for all $u > 0$, the result follows by setting $s = 0$ in (5.6) and from (2.6). If $s = t$ in
(5.5), the result follows from (5.7).
The result for $\hat Z^r$ follows from the bound for $\hat u^r$, (2.13) and Gronwall's inequality (see Corollary
11.2 in [23]). The result for $\hat Q^r$ follows trivially from the bounds for $\hat u^r$ and (2.14).
To prove (5.7), we note that the number of customers whose service started in pool $j$ (including
those whose service times are equal to zero) in an interval $[s, t]$ is bounded by the sum of the
number of customers whose service is completed by that server pool and the total number of arrivals
to the system in that interval. This follows from (2.7), (2.8) and the fact that under an admissible policy
$A_j$ is non-decreasing. Therefore
$$A^r_j(t) - A^r_j(s) \le A^r(t) - A^r(s) + D^r_j(t) - D^r_j(s) + \sum_{n=A^r_j(s)+1}^{A^r_j(t)} \left(1 - \phi_j(n)\right),$$
which implies
$$\sum_{n=A^r_j(s)+1}^{A^r_j(t)} \phi_j(n) \le A^r(t) - A^r(s) + D^r_j(t) - D^r_j(s).$$
This inequality gives (5.7) combined with (2.11) and the fact that
$$r^{-1/2} \sum_{n=A^r_j(s)+1}^{A^r_j(t)} \phi_j(n) = r^{-1/2} p_j \left(A^r_j(t) - A^r_j(s)\right) + \hat a^r_j(t) - \hat a^r_j(s),$$
which follows from (2.10).
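The step from the first counting inequality to the second is a one-line rearrangement. Writing $\phi_j(n)$ for the indicator that the $n$th customer started in pool $j$ has a nonzero service time (the symbol is a label assumed here for illustration), the sum of the complementary indicators splits as follows:

```latex
\sum_{n=A^r_j(s)+1}^{A^r_j(t)} \bigl(1-\phi_j(n)\bigr)
  = A^r_j(t)-A^r_j(s) \;-\; \sum_{n=A^r_j(s)+1}^{A^r_j(t)} \phi_j(n).
% Substituting this into the first inequality, A^r_j(t)-A^r_j(s) cancels
% on both sides and leaves
0 \le A^r(t)-A^r(s) + D^r_j(t)-D^r_j(s)
      \;-\; \sum_{n=A^r_j(s)+1}^{A^r_j(t)} \phi_j(n).
```

In words, only customers with nonzero service times occupy a server, so their number over an interval is controlled by arrivals plus departures.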
6 Proof of Theorem 4
The proof follows similarly to Theorem 3.2 in [29] using Lemma 7 (see also Step 1 of Proposition 1
in [5]).
Proof. Assume that (2.1), (2.2), (2.5) and (2.6) hold. We mainly focus on the case where Assumption 1(i) holds. Fix a sequence of admissible policies $\pi^r$. All the queueing processes below depend
on the policy $\pi^r$ but we drop it from our notation for simplicity. We show below that for any $T > 0$
$$\left\| \frac{\left(\hat Q^r(t), \hat Z^r(t)\right)}{\sqrt r} \right\|_T \to 0, \quad \text{as } r \to \infty. \tag{6.1}$$
Let $\hat b^r$ be defined as in (5.1). By (6.1), Theorem 4.4 in [10] and the random time change theorem (see
Theorem 5.3 in [11])
$$\hat b^r \Rightarrow B \quad \text{as } r \to \infty, \tag{6.2}$$
where $B$ is defined as in Section 4.
Let $(\tilde Z^r, \tilde Q^r, \tilde u^r)$ be defined as in (5.2). (We note that the process $\hat b^r$ depends on the policy $\pi$,
hence so does $(\tilde Z^r, \tilde Q^r, \tilde u^r)$.) Also set
$$\tilde Y^r(t) = \tilde Q^r(t) + \sum_{j=1}^J \tilde Z^r_j(t). \tag{6.3}$$
By Theorem 3(i), under Assumption 1(i) for any $T > 0$
$$\tilde Y^r(t) \le \hat Y^r(t) \quad \text{for all } t \in [0, T]. \tag{6.4}$$
Result (4.1) follows from (6.2), (6.4) and the continuous mapping theorem. Result (4.2) follows from
(4.1) and Fatou's lemma. Part (ii) of Theorem 4 follows similarly using Theorem 3(ii) instead of
Theorem 3(i) to arrive at
$$\left[\tilde Y^r(t)\right]^+ \le \left[\hat Y^r(t)\right]^+ \quad \text{for all } t \in [0, T]$$
instead of (6.4) and proceeding in a similar way as above.
We finish the proof by establishing (6.1) using Lemma 7. Fix $T > 0$. Recall the definition
of $\hat w^r_j$ and $\hat w^r_q$ in (2.12). By the functional strong law of large numbers (see Theorem 5.10 in [11]),
$r^{-1/2}\|\hat A^r(t)\|_T \to 0$ a.s. as $r \to \infty$. Therefore by Lemma 7, it is enough to show that
$$(\bar w^r, \bar w^r_q) \to 0, \quad \text{as } r \to \infty, \tag{6.5}$$
where $\bar w^r = \hat w^r/\sqrt r$ and $\bar w^r_q = \hat w^r_q/\sqrt r$. For $\bar w^r_q$, the result immediately follows from (2.6) and the
fact that $A^r$ is a delayed renewal process, see (2.4). Next we focus on $\bar w^r$.
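By (2.12),
$$\bar w^r_j(t) = \hat Z^r_j(0)/\sqrt r + \hat a^r_j(t)/\sqrt r - \hat D^r_j(t)/\sqrt r. \tag{6.6}$$
We have $\hat Z^r_j(0)/\sqrt r \to 0$ as $r \to \infty$ by (2.6). By (2.10)
$$\hat a^r_j(t)/\sqrt r = \frac{C^r_j\left(r \bar A^r_j(t)\right)}{r} - p_j \bar A^r_j(t).$$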
Because $\bar A^r_j(t) \le \bar A^r(t)$ for all $t$ and $\bar A^r(t) \le 2t$ a.s. for $r$ large enough, for any $T > 0$
$$\sup_{0 \le t \le T} \left| \frac{C^r_j\left(r \bar A^r_j(t)\right)}{r} - p_j \bar A^r_j(t) \right| \le \sup_{0 \le t \le T} \left| \frac{C^r_j(2rt)}{r} - 2 p_j t \right| \tag{6.7}$$
for $r$ large enough. The term on the RHS of (6.7) converges to 0 a.s. for any $T > 0$ as $r \to \infty$ by the
functional strong law of large numbers (see Theorem 5.10 in [11]). Because

Z
r
j
(t) N
j
/r
j
r,
see (2.2), we also have that $\sup_{0 \le t \le T} \left|\hat D^r_j(t)/\sqrt r\right| \to 0$ for any $T > 0$ by the functional strong law of
large numbers, completing the proof of (6.1) by (6.6).
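The law-of-large-numbers step behind (6.7) can be illustrated numerically. The following sketch is not part of the proof: it assumes that the counting process in (6.7) is a sum of i.i.d. Bernoulli(p) indicators (one per service start, equal to 1 when the service time is nonzero), and it evaluates the supremum only on the grid t = n/(2r). The names `thinning_error`, `horizon` and `seed` are hypothetical.

```python
import random


def thinning_error(r: int, p: float, horizon: float, seed: int = 0) -> float:
    """Return the max over n <= 2*r*horizon of |C(n) - p*n| / r.

    C(n) counts successes among the first n i.i.d. Bernoulli(p) trials,
    so this is sup_t |C(2rt)/r - 2pt| evaluated on the grid t = n/(2r).
    """
    rng = random.Random(seed)
    n_max = int(2 * r * horizon)
    count = 0
    worst = 0.0
    for n in range(1, n_max + 1):
        if rng.random() < p:
            count += 1
        worst = max(worst, abs(count - p * n) / r)
    return worst


if __name__ == "__main__":
    # The printed error should shrink roughly like 1/sqrt(r).
    for r in (100, 10_000, 100_000):
        print(r, thinning_error(r, p=0.5, horizon=1.0))
```

The fluctuation of the centered count is of order the square root of r, so dividing by r drives the supremum to zero, which is all that the fluid-scale argument needs.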
7 Proof of Theorem 5
We first establish two results and then prove Theorem 5 using these results.
Proposition 9. Assume that (2.1), (2.2), (2.5) and (2.6) hold. Under $\pi^*$, for any $0 < s < T$ and
$\epsilon > 0$,
$$\limsup_{r \to \infty} P\left\{ \sum_{j=2}^J \left\|\hat Z^r_j(\cdot)\right\|_{[s,T]} > \epsilon \right\} = 0.$$
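Proof. Assume that (2.1), (2.2), (2.5) and (2.6) hold. Fix $T > 0$. As in [27] we argue below that
for any $\epsilon > 0$ and $0 < s < T$
$$\limsup_{r \to \infty} P\left\{ \sum_{j=2}^J \left\|\hat Z^r_j(\cdot)\right\|_{[s,T]} > 2\epsilon \right\} \le \limsup_{r \to \infty} P\left\{ \sup_{\substack{s \le t_1 \le t_2 \le T \\ |t_2 - t_1| < \delta}} \sum_{j=2}^J \left( |\hat A^r(t_2) - \hat A^r(t_1)| + |\hat a^r_j(t_2) - \hat a^r_j(t_1)| + |\hat D^r_j(t_2) - \hat D^r_j(t_1)| \right) > \epsilon \right\}. \tag{7.1}$$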
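Let $\bar Z^r_j(t) = \hat Z^r_j(t)/\sqrt r$. By (2.13),
$$\bar Z^r_j(t) = \bar Z^r_j(0) + r^{-1/2} \hat a^r_j(t) - r^{-1/2} \hat D^r_j(t) - \mu_j \int_0^t \bar Z^r_j(s)\,ds + r^{-1/2} p_j \hat u^r_j(t).$$
By (6.1) and (6.5), this equation implies that
$$r^{-1/2} p_j \hat u^r_j \to 0 \quad \text{as } r \to \infty.$$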
Hence, for any T > 0, by (2.11)
sup
0tT
A
r
j
(t)

j
m
j
t
0 as r .
Therefore by the random time change theorem (see Theorem 5.3 in [11])
a
r
j
a
j
, as r , (7.2)
where a
j
is a driftless Brownian motion with variance (1 p
j
)
j
j
. Also by (6.1)
D
r
j

D
j
, as r , (7.3)
where

D
j
is a driftless Brownian motion with variance
j
j
. In addition
A
r

A, as r , (7.4)
where

A is a Brownian motion with variance
2
. We get the desired result from (7.1), (7.2)(7.4),
the fact that > 0 can be taken to be arbitrarily small and because Brownian motion is continuous
a.s.
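We prove (7.1) next. First assume that
$$\sum_{j=2}^J \hat Z^r_j(0) = 0.$$
Let
$$\tau^r_1 = \inf\left\{ t > 0 : \sum_{j=2}^J |\hat Z^r_j(t)| > 2\epsilon \right\}$$
and
$$\tau^r_0 = \sup\left\{ \tau^r_1 > t > 0 : \sum_{j=2}^J |\hat Z^r_j(t)| < \epsilon \right\}.$$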
Recall that

Z
r
j
(t) 0 for all t by (2.15). Also by (2.17), on [
r
0
,
r
1
],
J
j=2
p
j
u
r
j
(
r
1
)
J
j=2
p
j
u
r
j
(
r
0
)

A
r
(
r
1
)

A
r
(
r
0
) +
N
r
1
rm
1
(
r
1

r
0
) +(
r
1

r
0
).
Therefore, by (2.13)
J
j=2
Z
r
j
(
r
1
)
J
j=2
Z
r
j
(
r
0
)
J
j=2
a
r
j
(
r
1
)
J
j=2
a
r
j
(
r
0
)
_
_
J
j=2
D
r
j
(
r
1
)
J
j=2
D
r
j
(
r
0
)
_
_
+

A
r
(
r
1
)

A
r
(
r
0
) +
N
r
1
rm
1
(
r
1

r
0
) +(
r
1

r
0
).(7.5)
This inequality gives (7.1).
Now assume that $\sum_{j=2}^J \hat Z^r_j(0) < 0$.
Let
$$\tau^r = \inf\left\{ t \ge 0 : \sum_{j=2}^J \hat Z^r_j(t) = 0 \right\}.$$
Note that the argument (7.5) still holds if we focus on $[\tau^r, T]$ instead of $[0, T]$; therefore it is enough
to show that $\tau^r \to 0$ as $r \to \infty$. This follows by noticing that (7.5) also holds if we set $\tau^r_1 = \tau^r$ and
use 0 instead of $\tau^r_0$.
Proposition 10 (Convergence). Assume that (2.1), (2.2), (2.5) and (2.6) hold. Under
, for any
0 < s < T
sup
s<t<T
_
_
_
_
Z
r
(t),

Q
r
(t)
_
(t),

Q
(t)
__
_
_ 0 (7.6)
as r .
Proof. Assume that (2.1), (2.2), (2.5) and (2.6) hold. Fix 0 < s < T. By (7.2) and (7.3),

b
r
B
as r , for B dened as in Section 4. Let (
Z
r
,

Q
r
, u
r
) dened as in (5.2). Then,
_
Z
r
,

Q
r
_
,

Q
_
as $r \to \infty$. Therefore, by the convergence together theorem (see Theorem 4.1 in [10]), it is enough to
show that
$$\sup_{s<t<T} \left( \left\|\hat Z^r(t) - \tilde Z^r(t)\right\| \vee \left|\hat Q^r(t) - \tilde Q^r(t)\right| \right) \Rightarrow 0 \tag{7.7}$$
as $r \to \infty$.
Note that under
,

Z
r
j
and

Q
r
satisfy (2.13) and (2.14) and
J
j=2
Z
r
j
(t) =
r
(t), (7.8)
where
r
()
[s,T]
0 for any 0 < s < T as r by Proposition 9.
By (3.7), p
j
u
r
j
(t) = w
j
(t), for j 2 and t 0. Fix T > 0, for 0 < t T and j 2, by (2.13),
(2.15) and (7.8)
p
j
u
r
j
(t) + w
r
j
(t)
Z
r
j
(t)
_
t
0
Z
r
j
(s)ds
Therefore, for $0 < t \le T$ and $j \ge 2$, by (7.8), Lemma 7 and (6.2)
$$\sup_{s<u<T} \left|\hat u^r_j(u) - \tilde u^r_j(u)\right| \Rightarrow 0 \tag{7.9}$$
for any $0 < s < T$ as $r \to \infty$.
Let $\hat Y^r_1(t) = \hat Q^r(t) + \hat Z^r_1(t)/p_1$ and $\tilde Y^r_1(t) = \tilde Q^r(t) + \tilde Z^r_1(t)/p_1$. Note that by (2.15)
$$\left[\hat Y^r_1(t)\right]^+ = \hat Q^r(t) \quad \text{and} \quad \left[\hat Y^r_1(t)\right]^- = -\hat Z^r_1(t)/p_1. \tag{7.10}$$
Similarly, by (3.3)–(3.5)
$$\left[\tilde Y^r_1(t)\right]^+ = \tilde Q^r(t) \quad \text{and} \quad \left[\tilde Y^r_1(t)\right]^- = -\tilde Z^r_1(t)/p_1. \tag{7.11}$$
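The identities (7.10) and (7.11) say that the scalar Y = Q + Z_1/p_1 determines both components once Q >= 0, Z_1 <= 0 and Q * Z_1 = 0 are known. A minimal sketch of this decomposition follows; it assumes the convention that the negative part of y is max(-y, 0), and the function name `split_total` is hypothetical.

```python
def split_total(y: float, p1: float) -> tuple[float, float]:
    """Recover (q, z1) from y = q + z1 / p1.

    Assumes q >= 0, z1 <= 0 and q * z1 == 0, so that q is the positive
    part of y and -z1 / p1 is its negative part, max(-y, 0).
    """
    q = max(y, 0.0)
    z1 = -p1 * max(-y, 0.0)
    return q, z1


# A positive total is all queue; a negative total is all idleness in pool 1.
print(split_total(2.0, 0.5))
print(split_total(-3.0, 0.5))
```

Because both the positive and the negative part are Lipschitz with constant 1, closeness of the two scalar processes carries over to their queue and pool-1 components.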
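By (2.13) and (2.14)
$$\hat Y^r_1(t) = \hat w^r_q(t) + \frac{\hat w^r_1(t)}{p_1} - \frac{\mu_1}{p_1} \int_0^t \hat Z^r_1(s)\,ds - \sum_{j=2}^J \hat u^r_j(t). \tag{7.12}$$
By (3.1) and (3.2)
$$\tilde Y^r_1(t) = \hat w^r_q(t) + \frac{\hat w^r_1(t)}{p_1} - \frac{\mu_1}{p_1} \int_0^t \tilde Z^r_1(s)\,ds - \sum_{j=2}^J \tilde u^r_j(t). \tag{7.13}$$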
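By (7.9)–(7.12), and Gronwall's inequality (see (11.2) in [23])
$$\left|\hat Y^r_1(t) - \tilde Y^r_1(t)\right| \le \sum_{j=2}^J \left|\hat u^r_j(t) - \tilde u^r_j(t)\right| + C_1 \int_0^t \sum_{j=2}^J \left|\hat u^r_j(s) - \tilde u^r_j(s)\right| ds \tag{7.14}$$
for $C_1 = \frac{\mu_1}{p_1} \exp\left(\frac{\mu_1}{p_1} T\right)$. First note that $\tilde u^r_j(0) = 0$ and $|\hat u^r_j(0)| \le \sum_{j=1}^J |\hat Z^r_j(0)|$. Hence, (7.14)
implies (7.7), by (7.9)–(7.11), Lemma 7 and (6.2).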
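For reference, the form of Gronwall's inequality that produces a constant of the shape $c\,e^{cT}$ is sketched below. It is a standard statement written with generic symbols $f$, $g$ and $c$ (assumed here, not the paper's notation), where $f$ and $g$ are nonnegative, bounded and measurable on $[0, T]$ and $c \ge 0$.

```latex
f(t) \le g(t) + c \int_0^t f(s)\,ds \quad (0 \le t \le T)
\;\Longrightarrow\;
f(t) \le g(t) + c \int_0^t e^{c(t-s)}\, g(s)\,ds
     \le g(t) + c\,e^{cT} \int_0^t g(s)\,ds .
```

Taking $f$ to be the absolute difference of the two scalar processes, $g$ the sum of the differences of the $u$ terms, and $c$ the coefficient of the integral in (7.12), this yields a bound of the same shape as (7.14).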
We are now ready to prove Theorem 5.
Proof. Assume that (2.1), (2.2), (2.5) and (2.6) hold. We prove the result using (7.6) and Proposition 11 below. We only prove the result under Assumption 1(i); the proof under Assumption 1(ii) is
similar.
Fix > 0 and x > 0. By Proposition 11 we can nd > 0 such that for r large enough
P
__

0
Y
r
(s)ds >
_
< and P
__

0
(s)ds >
_
< . (7.15)
Also, by (7.6) and the continuity of the integral operator we have
P
__
T
Y
r
(s)ds > x
_
P
__
T
(s)ds > x
_
,
as r . By (7.15), for x > 0 and r large enough
P
__
T
0
Y
r
(s)ds > x
_
P
__
T
Y
r
(s)ds > x
_
+
P
__
T
(s)ds > x
_
+ 2 P
__
T
0
(s)ds > x
_
+ 3.
Since > 0 is arbitrary this proves (4.7) by Theorem 4. The result (4.8) follows from (4.7) and the
uniform integrability of
_
Y
r
(s)ds by Proposition 11 below.

Proposition 11. Assume that (2.1), (2.2), (2.5) and (2.6) hold. For any admissible policy $\pi$ and
$T > 0$
$$E\left[\|\hat Q^r(\cdot)\|^2_T\right] \vee E\left[\|\hat Z^r(\cdot)\|^2_T\right] \vee E\left[\|\hat u^r(\cdot)\|^2_T\right] \le C_T$$
and
$$E\left[\|\tilde Q^r(\cdot)\|^2_T\right] \vee E\left[\|\tilde Z^r(\cdot)\|^2_T\right] \vee E\left[\|\tilde u^r(\cdot)\|^2_T\right] \le C_T$$
for some $C_T > 0$, independent of the policy.
Proof. The first result follows from Lemma 7 and combining (52) in [15] with Lemmas 2 and 3 in
[5] and the fact that interarrival times are assumed to have finite mean and variance. The second
result follows from Lemma 8 and a similar argument.
References
[1] M. Armony. Dynamic routing in large-scale service systems with heterogeneous servers. Queueing Systems: Theory and Applications, 51:287–329, 2005.
[2] M. Armony and C. Maglaras. Contact centers with a call-back option and real-time delay
information. Operations Research, 52:527–545, 2004.
[3] M. Armony and C. Maglaras. On customer contact centers with a call-back option: Customer
decisions, routing rules and system design. Operations Research, 52:271–292, 2004.
[4] B. Ata and S. Kumar. Heavy traffic analysis of open processing networks with complete resource pooling: asymptotic optimality of discrete review policies. Annals of Applied Probability,
15:331–391, 2005.
[5] R. Atar. A diffusion model of scheduling control in queueing systems with many servers.
Annals of Applied Probability, 15:820–852, 2005.
[6] R. Atar. Scheduling control for queueing systems with many servers: asymptotic optimality
in heavy traffic. Annals of Applied Probability, 15:2606–2650, 2005.
[7] R. Atar, A. Mandelbaum, and M. Reiman. Scheduling a multi-class queue with many exponential servers: Asymptotic optimality in heavy-traffic. Annals of Applied Probability, 14:1084–1134, 2004.
[8] S. L. Bell and R. J. Williams. Dynamic scheduling of a system with two parallel servers in
heavy traffic with resource pooling: asymptotic optimality of a threshold policy. Annals of
Applied Probability, 11:608–649, 2001.
[9] S. L. Bell and R. J. Williams. Dynamic scheduling of a parallel server system in heavy traffic
with complete resource pooling: Asymptotic optimality of a threshold policy. Electronic J. of
Probability, 10:1044–1115, 2005.
[10] P. Billingsley. Convergence of probability measures. Wiley, New York, 1968.
[11] H. Chen and D. Yao. Fundamentals of queueing networks: Performance, asymptotics, and
optimization. Springer, New York, 2001.
[12] J. Dai and T. Tezcan. Optimal control of parallel server systems with many servers in heavy
traffic. Queueing Systems: Theory and Applications, 59:95–134, 2008.
[13] J. G. Dai and T. Tezcan. State space collapse in many server diffusion limits of parallel server
systems. To appear in Mathematics of Operations Research, School of Industrial and Systems Engineering, Georgia
Institute of Technology, 2011.
[14] S. Ethier and T. Kurtz. Markov Processes: Characterization and Convergence. John Wiley
and Sons, New York, 1986.
[15] D. Gamarnik and A. Zeevi. Validity of heavy traffic steady-state approximations in open
queueing networks. Annals of Applied Probability, 16:56–90, 2006.
[16] N. Gans, G. Koole, and A. Mandelbaum. Telephone call centers: Tutorial, review and research
prospects. Manufacturing and Service Operations Management, 5:79–141, 2003.
[17] I. Gurvich, M. Armony, and A. Mandelbaum. Service-Level Differentiation in Call Centers
with Fully Flexible Servers. Management Science, 54:279–294, 2008.
[18] I. Gurvich and W. Whitt. Service-level differentiation in many-server service systems: A
solution based on fixed-queue-ratio routing. Operations Research, 29:567–588, 2007.
[19] I. Gurvich and W. Whitt. Scheduling flexible servers with convex delay costs in many-server
service systems. Manufacturing & Service Operations Management, 11(2):237–253, 2009.
[20] S. Halfin and W. Whitt. Heavy-traffic limits for queues with many exponential servers. Operations Research, 29:567–588, 1981.
[21] J. M. Harrison. Brownian models of queueing networks with heterogeneous customer populations. In W. Fleming and P. L. Lions, editors, Stochastic Differential Systems, Stochastic
Control Theory and Their Applications, volume 10 of The IMA Volumes in Mathematics and
Its Applications, pages 147–186, New York, 1988. Springer-Verlag.
[22] H. Kaspi and K. Ramanan. Law of large numbers limits for many-server queues. Annals of
Applied Probability, 21:33–114, 2011.
[23] A. Mandelbaum, W. Massey, and M. Reiman. Strong approximations for Markovian service
networks. Queueing Systems: Theory and Applications, 30:149–201, 1998.
[24] G. Pang, R. Talreja, and W. Whitt. Martingale proofs of many-server heavy-traffic limits for
Markovian queues. Probability Surveys, 4:193–267, 2007.
[25] A. Puhalskii and M. Reiman. The multiclass GI/PH/N queue in the Halfin-Whitt regime.
Advances in Applied Probability, 32:564–595, 2000.
[26] J. E. Reed. The G/GI/N queue in the Halfin-Whitt regime. Annals of Applied Probability,
19:2211–2269, 2009.
[27] M. Reiman. Some diffusion approximations with state space collapse. In Proceedings International Seminar On Modeling And Performance Evaluation Methodology, pages 209–240, Berlin,
1983. Springer-Verlag.
[28] G. Shaikhet. A fluid control problem in queueing networks with general service times. Technical
report, Carleton University, 2010.
[29] T. Tezcan and J. G. Dai. Dynamic control of N-systems with many servers: Asymptotic
optimality of a static priority policy in heavy traffic. Operations Research, 58:94–110, 2010.
[30] W. Whitt. Heavy-traffic limits for the $G/H_2^*/n/m$ queue. Mathematics of Operations Research,
30:1–27, 2005.
Research supported by NSF Grant CMMI-0954126.
