Interleaved Weighted Round-Robin: A Network Calculus Analysis

Seyed Mohammadhossein Tabatabaee, EPFL, Lausanne, Switzerland, hossein.tabatabaee@epfl.ch
Jean-Yves Le Boudec, EPFL, Lausanne, Switzerland, jean-yves.leboudec@epfl.ch
Marc Boyer, ONERA/DTIS, University of Toulouse, F-31055 Toulouse, France, Marc.Boyer@onera.fr

arXiv:2003.08372v3 [cs.NI] 24 Apr 2020

Abstract—Weighted Round-Robin (WRR) is often used, due to its simplicity, for scheduling packets or tasks. With WRR, a number of packets equal to the weight allocated to a flow can be served consecutively, which leads to a bursty service. Interleaved Weighted Round-Robin (IWRR) is a variant that mitigates this effect. We are interested in finding bounds on worst-case delay obtained with IWRR. To this end, we use a network calculus approach and find a strict service curve for IWRR. The result is obtained using the pseudo-inverse of a function. We show that the strict service curve is the best obtainable one, and that delay bounds derived from it are tight (i.e., worst-case) for flows of packets of constant size. Furthermore, the IWRR strict service curve dominates the strict service curve for WRR that was previously published. We provide some numerical examples to illustrate the reduction in worst-case delays caused by IWRR compared to WRR.

I. INTRODUCTION

Weighted Round-Robin (WRR) is a scheduling algorithm that is often used for scheduling tasks, or packets, in real-time systems or communication networks. The capacity is shared between several clients or queues by giving each of them a weight, which is a positive integer, and by providing more service to those with larger weights. Specifically, every queue is visited one after the other, and when a queue $i$ with weight $w_i$ has an emission opportunity, it sends $w_i$ packets, or less if fewer packets are present. The advantage of WRR is that it is fair and simple. However, the service is bursty because up to $w_i$ packets can be served consecutively for queue $i$, which can cause a large worst-case waiting time for other queues. Interleaved Weighted Round-Robin (IWRR) mitigates this effect [1]. With IWRR, a queue $i$ with weight $w_i$ has $w_i$ emission opportunities per round and can send up to one packet at every emission opportunity. In contrast, with WRR, it has one emission opportunity per round and can send up to $w_i$ packets at every emission opportunity. Hence, IWRR spreads out emission opportunities of each queue in a round, which is expected to result in a smoother service and lower worst-case delays. There exist several versions of IWRR; we focus on the simplest one, where queue $i$ has emission opportunities in the first $w_i$ cycles within a round (see Section III for a formal description of IWRR and Section IV for WRR variants).

We are interested in delay bounds for the worst case, as is typical in the context of deterministic networking. To this end, a standard approach is network calculus. Specifically, with network calculus, the service offered to a flow of interest by a system is abstracted by means of a service curve. A bound on the worst-case delay is obtained by combining the service curve with an arrival curve for the flow of interest. An arrival curve is a constraint on the amount of data that the flow of interest can send; such a constraint is necessary to the existence of a finite delay bound. The exact definitions are recalled in Section II.

The network calculus approach was applied to WRR in [2, Sec. 8.2.4], where a strict service curve is obtained. As explained in Section II, a strict service curve is a special case of a service curve hence can be used to derive delay (and backlog) bounds. Our first contribution is to obtain a strict service curve for IWRR. Compared to WRR, the interleaving in IWRR makes the analysis more difficult, and the method of proof in [2] cannot easily be extended. To circumvent this difficulty, we rely heavily on the method of pseudo-inverse, recalled in Section II. As expected, the IWRR strict service curve dominates that of WRR, hence the resulting delay bounds for IWRR are always less than or equal to those for WRR.

The strict service curve enables us to obtain delay bounds by using network calculus, but such bounds might not always be tight, i.e., they might not always be equal to worst-cases. This is because the strict service curve is an abstraction of the system. Our second contribution is to show that, for flows with packets of constant sizes, the strict service curve obtained for IWRR provides tight delay bounds. We show that the same result holds for the existing strict service curve of WRR. Extending such results to flows with packets of variable sizes is left for further study.

The strict service curve obtained for IWRR has some description complexity, see also Fig. 3. Therefore, we provide simplified lower bounds that can be used, at the expense of sub-optimality, when analytic, closed-form expressions are important.

After giving some necessary background on network calculus and the lower-pseudo inverse technique in Section II, we describe our system model in Section III. We describe the state of the art in Section IV. In Section V, we present our strict service curve for IWRR, the proof of which we present in Section VI. In Section VII, we show that both the IWRR and WRR strict service curves are the best possible and that
they give tight delay bounds for a flow with constant packet sizes. We use numerical examples to illustrate the worst-case latency improvement of IWRR over WRR obtained with our method in Section VIII. Proofs of results other than Theorem 1 are in Appendix.

II. BACKGROUND

We use the framework of network calculus [2], [4], [5]. A flow is represented by a cumulative arrival function $R \in \mathcal{F}$, where $\mathcal{F}$ denotes the set of wide-sense increasing functions $f : \mathbb{R}^+ \to \mathbb{R}^+ \cup \{+\infty\}$ and $R(t)$ is the number of bits observed on the flow between times 0 and $t$. We say that a flow $R$ has $\alpha \in \mathcal{F}$ as arrival curve if for all $s \le t$, $R(t) - R(s) \le \alpha(t - s)$. A frequently used arrival curve is $\alpha = \gamma_{r,b}$, defined by $\gamma_{r,b}(t) = rt + b$ for $t > 0$ and $\gamma_{r,b}(t) = 0$ for $t = 0$ (token bucket arrival curve, with rate $r$ and burst $b$). An arrival curve $\alpha$ can always be assumed to be sub-additive, i.e., to satisfy $\alpha(s + t) \le \alpha(s) + \alpha(t)$ for all $s, t$.

For two functions $f$ and $g$ in $\mathcal{F}$, the min-plus convolution is defined by $(f \otimes g)(t) = \inf_{0 \le s \le t} \{f(t - s) + g(s)\}$. An example of min-plus convolution used in this paper is illustrated in Fig. 1.

[Figure 1: two panels, (a) $\nu_{a,b}(t)$ and (b) $(\lambda_1 \otimes \nu_{a,b})(t)$; vertical axis ticks $a, 2a, 3a, 4a$; horizontal axis $t$ with ticks $b, 2b, 3b, 4b$ (and $a$ in panel (b)).]
Fig. 1: Left: the stair function $\nu_{a,b} \in \mathcal{F}$ defined for $t \ge 0$ by $\nu_{a,b}(t) = a \lceil t/b \rceil$. Right: min-plus convolution of $\nu_{a,b}$ with the function $\lambda_1 \in \mathcal{F}$ defined by $\lambda_1(t) = t$ for $t \ge 0$, when $a \le b$. The discontinuities are smoothed, and replaced with a unit slope.

Consider a system $S$ and a flow through $S$ with input and output functions $R$ and $R^*$ and let $\beta \in \mathcal{F}$. We say that the system $S$ offers $\beta$ as a service curve to the flow if $R^* \ge R \otimes \beta$, which often means that for every $t \ge 0$ there exists some $s \le t$ such that $R^*(t) \ge R(s) + \beta(t - s)$ [2, Sec. 3.2.2]. We say that system $S$ offers a strict service curve $\beta \in \mathcal{F}$ to the flow if $R^*(t) - R^*(s) \ge \beta(t - s)$ whenever $(s, t]$ is a backlogged period (i.e., $R^*(\tau) < R(\tau)$ for all $\tau$ such that $s < \tau \le t$). If $\beta$ is a strict service curve, then it is a service curve, but the converse is not always true [4, Section 1.3]. A frequently used service curve is the rate-latency function $\beta_{r,T}$ that is the function in $\mathcal{F}$ defined by $\beta_{r,T}(t) = r[t - T]^+$, where we use the notation $[x]^+ = \max\{x, 0\}$. Saying that a system offers a service curve $\beta_{r,T}$ to a flow expresses that the flow is guaranteed a service rate $r$, except for possible interruptions that might impact the delay by at most $T$. Saying that a system offers a strict service curve $\beta_{r,T}$ to a flow expresses that the flow is guaranteed a service rate $r$, except for possible interruptions that might not exceed $T$ in total per backlogged period. A strict service curve $\beta$ can always be assumed to be super-additive, i.e., to satisfy $\beta(s + t) \ge \beta(s) + \beta(t)$ for all $s, t$ (otherwise, it can be replaced by its super-additive closure [2, Prop. 5.6]).

Assume that a flow, constrained by arrival curve $\alpha$, traverses a system that offers a service curve $\beta$ to the flow and that respects the ordering of the flow (FIFO per-flow). The delay of the flow is upper bounded by $h(\alpha, \beta)$ (horizontal deviation), defined by

$$h(\alpha, \beta) = \sup_{t \ge 0} \{\inf\{d \ge 0 \mid \alpha(t) \le \beta(t + d)\}\} \qquad (1)$$

Our technique of proof uses the lower pseudo-inverse. The lower pseudo-inverse $f^{\downarrow}$ of a function $f \in \mathcal{F}$ is defined by

$$f^{\downarrow}(y) = \inf\{x \mid f(x) \ge y\} = \sup\{x \mid f(x) < y\} \qquad (2)$$

We use the following property from [6, Sec. 10.1]:

$$\forall x, y \in \mathbb{R}^+, \quad y \le f(x) \Rightarrow x \ge f^{\downarrow}(y) \qquad (3)$$

III. SYSTEM MODEL

We consider a weighted round-robin subsystem that serves $n$ input flows, has one queue per flow, and uses a weighted round-robin algorithm (described later) to arbitrate between flows. The weighted round-robin subsystem is itself placed in a larger system, and can compete with other queuing subsystems. For example, consider the case of a constant-rate server with several priority levels, without preemption, and where the weighted round-robin subsystem is at a priority level that is not the highest. Assuming some arrival curve constraints for the higher priority traffic, the service received by the entire weighted round-robin subsystem can be modelled using a strict service curve [2, Section 8.3.2].

This motivates us to assume that the aggregate of all flows in the weighted round-robin subsystem receives a strict service curve, say $\beta \in \mathcal{F}$, that we call “aggregate strict service curve”. If the weighted round-robin subsystem has exclusive access to a transmission line of rate $c$, then $\beta(t) = ct$ for $t \ge 0$. We assume that $\beta(t)$ is finite for every (finite) $t$ and, without loss of generality, we assume $\beta$ to be super-additive. Furthermore, we need an additional technical assumption, primarily for establishing the tightness result: we assume that $\beta$ is Lipschitz-continuous, i.e., there exists a constant $K > 0$ such that $\frac{\beta(t) - \beta(s)}{t - s} \le K$ for all $0 \le s < t$; this does not appear to be a restriction as the rate at which data is served has a physical limit.

The arbitration algorithm assumed in this paper is IWRR, shown in Algorithm 1. When a packet of flow $i$ enters the weighted round-robin subsystem, it is put into queue $i$. The weight of flow $i$ is $w_i$. IWRR runs an infinite loop of rounds. In one round, each queue $i$ has $w_i$ emission opportunities; one packet can be sent during one emission opportunity. The inner loop defines a cycle, where each queue is visited but only those with a weight not smaller than the cycle number have an emission opportunity. The send instruction is assumed to be the only one with a non-null duration. Its actual duration depends on the packet size but also on the amount of service
available to the entire weighted round-robin subsystem. See Figure 2 for an illustration.

[Figure 2: rounds $n$ and $n + 1$, each made of cycles 1 to 5; queues with an emission opportunity in each cycle: cycle 1: 1 2 3; cycle 2: 1 2 3; cycle 3: 2 3; cycle 4: 3; cycle 5: 3.]
Fig. 2: Emission opportunities on two successive rounds for IWRR with three flows and $w_1 = 2$, $w_2 = 3$, $w_3 = 5$. Mind that this is not the temporal behaviour: each opportunity can lead to an empty interval if the queue is empty at this time. Furthermore, the duration of each non-empty interval depends on the packet size and the aggregate service available (we do not assume constant rate service).

Algorithm 1 Interleaved Weighted Round-Robin
Input: Integer weights w_1 ≤ w_2 ≤ ... ≤ w_n
 1: w_max = max{w_1, ..., w_n}
 2: while True do                      ▷ A round starts.
 3:   for C ← 1 to w_max do            ▷ A cycle starts.
 4:     for i ← 1 to n do
 5:       if C ≤ w_i then
 6:         if (not empty(i)) then
 7:           ▷ A service for queue i.
 8:           print(now, i);
 9:           send(head(i));
10:           removeHead(i);
11:         end if
12:       end if
13:     end for
14:   end for                          ▷ A cycle finishes.
15: end while                          ▷ A round finishes.

Here, we use the context of communication networks, but the results equally apply to real-time systems: simply map flow to task, packet to job, packet size to job execution time and strict service curve to “delivery curve” [7], [8].

IV. STATE OF THE ART

One of the first uses of round-robin scheduling in the network context appeared in [9], with a fairness objective, i.e., a fair way to share the bandwidth between sessions. It is also mentioned in [10] as a way to implement “fair queueing”. The term “Weighted Round-Robin” was coined in [1] as a generalisation of round-robin to share the bandwidth “in proportion to prescripted weights” in the context of ATM (i.e., with constant size packets). Two versions of the algorithm are presented in [1]. The former is presented in Algorithm 1: at cycle $C$ (with $C$ between 1 and $w_{max}$), only flows with weight $w_i \ge C$ can emit one packet. We call this version IWRR. The latter version assumes that there exists for each flow a bit-list of length $w_{max}$, $o_i \in \{0, 1\}^{w_{max}}$, such that $w_i = \sum_{k=1}^{w_{max}} o_i[k]$. A flow $i$ can emit a packet at cycle $C$ only if $o_i[C] = 1$. A strategy is given to build these vectors in [1] and is refined with fairness objectives in [11]. We call this version LIWRR (list-based IWRR).

IWRR is modified into WRR/SB in [12] to enable some flow to send slightly more packets than permitted in a cycle, and to decrease accordingly at the next cycle.

As mentioned in Section I, plain WRR (which we simply call “WRR”) enables each flow $i$ to send up to $w_i$ packets every time it is selected [13]. A “Multiclass WRR” is also defined in [13]. Surprisingly, the authors of [13] were not aware of [1] and have re-invented LIWRR. Note that even if WRR was designed for packets of constant size, it has been applied in networks with variable size packets such as Ethernet [14, Sec. 8.6, Sec. 8.6.8.3, Sec. 37], in request balancing in cloud infrastructures [15], in the LinuxVirtualServer scheduling [16], in networks on chip [17], and so on. In fact, looking for the expression “weighted round-robin” in the title or abstracts of papers indexed by Scopus returns more than 400 entries (March 2020), and Google references more than 4000 patents with this expression (March 2020). Unfortunately, when authors refer to WRR, they often do not make explicit which version of WRR it is.

A WRR server is also a latency-rate server, with latency and rates given in [18] for packets of constant size. The latency result is generalised to LIWRR in [19]. Even if the notion of latency-rate server is very close to the one of a service curve $\beta_{r,T}$ in network calculus, both notions are slightly different, and results cannot be directly imported from one theory to the other [20]. In [17], the authors consider a Network on Chip (NoC), with WRR arbitration at the flit level. A flit is the elementary data unit of the NoC; one flit is sent per CPU/NoC cycle. Assuming that the weights are such that packets are never fragmented by the arbiter, a strict service curve $\beta_{R_i,T_i}$ for flow $i$ is found, with $R_i = \frac{w_i}{\sum_k w_k}$, $T_i = \sum_{j \ne i} w_j$.

WRR arbitration in an Ethernet switch is also considered in [21], with the assumption that all flows of an output port have the same constant packet size. It then computes, in the network calculus framework, a residual service with service curve $\beta_{R_i,T_i}$ with $R_i = \frac{w_i}{\sum_k w_k} C$, $T_i = \frac{\sum_{j \ne i} w_j}{C}$, where $C$ is the link rate. We assume that the missing packet size in the $T_i$ term was a typo. This network calculus result on conventional WRR arbitration in Ethernet is refined in [22], considering packets of variable size, leading to residual service with strict service curve $\beta_{R_i,T_i}$ with $R_i = \frac{w_i l_i^{\min}}{w_i l_i^{\min} + \sum_{j \ne i} w_j l_j^{\max}} C$ and $T_i = \frac{\sum_{j \ne i} w_j l_j^{\max}}{C}$ (cf. eq. (1) and (2) in [22]) where $l_i^{\min}$, $l_i^{\max}$ are, respectively, lower and upper bounds on the size of the packets in the flow $i$. It refines this result by subtracting the part of the bandwidth not used by interfering flows (considering their arrival curves).

Observe that computing a residual service with a $\beta_{R,T}$ curve is pessimistic as it assumes that, once the worst latency is paid, each packet is served with the long-term residual rate, whereas, in reality, each packet, when it is selected for emission, is transmitted at full link speed up to completion. A residual service for the conventional WRR with a curve that is an alternation of full services and plateaus is given in [2, Sec. 8.2.4]. This effect of “full speed up to completion” can also be captured when computing the local delay of a server
with $\beta_{R,T}$ service curve [23].

V. STRICT SERVICE CURVES FOR IWRR

Our first result is a strict service curve for IWRR that, as we show in Section VII, is the best possible. We compare it to WRR and also give simpler, lower approximations.

Theorem 1 (Strict Service Curve of IWRR). Let $S$ be a server shared by $n$ flows that uses IWRR as explained in Section III, with weight $w_i$ for flow $i$. Recall that the server offers a strict service curve $\beta$ to the aggregate of the $n$ flows. For any flow $i$, $l_i^{\min}$ [resp. $l_i^{\max}$] is a lower [resp. upper] bound on the packet size.
Then, $S$ offers to every flow $i$ a strict service curve $\beta_i$ given by $\beta_i(t) = \gamma_i(\beta(t))$ with

$$\gamma_i = \lambda_1 \otimes U_i \qquad (4)$$
$$U_i(x) \overset{def}{=} \sum_{k=0}^{w_i - 1} \nu_{l_i^{\min}, L_{tot}}\left(\left[x - \psi_i(k l_i^{\min})\right]^+\right) \qquad (5)$$
$$L_{tot} = w_i l_i^{\min} + \sum_{j, j \ne i} w_j l_j^{\max} \qquad (6)$$
$$\psi_i(x) \overset{def}{=} x + \sum_{j, j \ne i} \phi_{i,j}\left(\left\lfloor \frac{x}{l_i^{\min}} \right\rfloor\right) l_j^{\max} \qquad (7)$$
$$\phi_{i,j}(x) \overset{def}{=} \left\lfloor \frac{x}{w_i} \right\rfloor w_j + [w_j - w_i]^+ + \min(x \bmod w_i + 1, w_j) \qquad (8)$$

In the above, $\nu_{a,b}$ is the stair function, $\lambda_1$ is the unit rate function and $\otimes$ is the min-plus convolution; all are described in Fig. 1.
Furthermore, $\beta_i$ is super-additive.

The proof is in Section VI. See Fig. 3 for some illustration of $\beta_i$. Observe that $\gamma_i$ in (4) is the strict service curve obtained when the aggregate strict service curve is $\beta = \lambda_1$ (i.e., when the aggregate is served at a constant, unit rate). In the common case where $\beta$ is equal to a rate-latency function, say $\beta_{c,T}$, we have $\beta_i(t) = \gamma_i(c(t - T))$ for $t \ge T$ and $\beta_i(t) = 0$ for $t \le T$, namely, $\beta_i$ is derived from $\gamma_i$ by a rescaling of the x axis and a right-shift.

As mentioned in Section II, any strict service curve that is not super-additive can be replaced by its super-additive closure. The last statement in the theorem guarantees that this does not occur here.

We now compare to WRR. The best known service curve for (non-interleaved) WRR is given in [2, Sec. 8.2.4] and is

$$\beta_i'(t) = (\lambda_1 \otimes \nu_{q_i, L_{tot}})\left([\beta(t) - Q_i]^+\right) \qquad (9)$$

with $q_i = w_i l_i^{\min}$ and $Q_i = \sum_{j, j \ne i} w_j l_j^{\max}$. In Section VII, we show that $\beta_i'(t)$ is indeed the best possible strict service curve for WRR. Furthermore, it is dominated by the strict service curve for IWRR:

Theorem 2. With the assumptions in Theorem 1 and in (9):

$$\beta_i' \le \beta_i \qquad (10)$$

The proof is in Appendix. Fig. 3 illustrates how the strict service curve for IWRR improves on that for WRR, by providing a smoother, and generally larger, service.

The service curve found in Theorem 1 is the best possible one but has a complex expression. If there is interest in a simpler expression, any lower bounding function is a strict service curve; in particular, the strict service curve $\beta_i'$ for WRR is also a valid, though suboptimal, strict service curve for IWRR. There is often interest in service curves that are rate-latency functions. Observe that, if the aggregate service curve $\beta$ is a rate-latency function, then replacing $\gamma_i$ by a rate-latency lower-bounding function also yields a rate-latency function for $\beta_i$, and vice-versa. Therefore, we are interested in rate-latency functions that lower bound $\gamma_i$.

Among all of these, there is not a single best one, as some have a smaller latency while others have a larger rate. We say that a rate-latency function $\beta_{r,T}$ that lower bounds $\gamma_i$ is non-dominated if there is no other rate-latency function $\beta_{r',T'}$ that lower bounds $\gamma_i$ and dominates $\beta_{r,T}$, i.e., such that $r' \ge r$ and $T' \le T$. The following result gives all such non-dominated rate-latency functions. Let $r^* = \frac{q_i}{L_{tot}} = \frac{w_i l_i^{\min}}{L_{tot}}$, $r_{w_i - 1} = 1$, and

$$r_k = \frac{l_i^{\min}}{\psi_i((k + 1) l_i^{\min}) - \psi_i(k l_i^{\min})}, \quad 0 \le k < w_i - 1 \qquad (11)$$
$$k^* = \min\{0 \le k < w_i \mid r_k \ge r^*\} \qquad (12)$$
$$r_k^* = \min(r_k, r^*), \quad 0 \le k \le k^* \qquad (13)$$

Theorem 3. With the assumptions in Theorem 1 and the definitions (11)-(13), a rate-latency function $\beta_{r,T}$ lower bounds $\gamma_i$ and is non-dominated if and only if $r = r_{k^*}^*$ and $T = \psi_i(k^* l_i^{\min}) - \frac{k^* l_i^{\min}}{r}$, or $r_{k-1}^* \le r < r_k^*$ and $T = \psi_i(k l_i^{\min}) - \frac{k l_i^{\min}}{r}$ for some integer $k$ with $0 < k \le k^*$. Among all such rate-latency functions, the one with lowest latency is $\beta_{r_0^*, T_0^*}$ and the one with largest rate is $\beta_{r_{k^*}^*, T_{k^*}^*}$.

The proof is in Appendix and Fig. 3 illustrates $\beta_{r_0^*, T_0^*}$ and $\beta_{r_{k^*}^*, T_{k^*}^*}$ in some examples. Observe that $k \mapsto r_k^*$ is wide-sense increasing with $k$ for $0 \le k \le k^*$, but the values of $r_k^*$ are not necessarily all distinct. It can also occur that $k^* = 0$ (as in the top panel of Fig. 3), in which case there is one optimal rate-latency service curve. In general, however, this does not occur, and a simple lower bounding approximation can be obtained with the supremum of all non-dominated rate-latencies, which can be shown to be equal to $\max\left(\beta_{r_0^*, T_0^*}, \ldots, \beta_{r_{k^*}^*, T_{k^*}^*}\right)$. When $\beta$ is a rate-latency function, this provides a convex piecewise linear function that has several good properties [2, Sec. 4.2].

VI. PROOF OF THEOREM 1

The idea of proof is as follows. We consider a backlogged period $(s, t]$ of flow of interest $i$, and we let $p$ be the number of packets of flow $i$ that are entirely served during this period. For every other flow $j$, the number of packets that are entirely served is upper bounded by a function of $p$, given in Lemma 3. $p$ is also upper bounded by a function of the amount of service received by flow $i$ in Lemma 5. Combining these two results

[Figure 3: Illustration of Theorems 1, 2, and 3. Two panels plotting bits (×1024) against time (µs). Top panel: IWRR strict service curve ($\beta_i$), WRR strict service curve ($\beta_i'$), and the optimal rate-latency service curve. Bottom panel: IWRR strict service curve ($\beta_i$), WRR strict service curve ($\beta_i'$), and the non-dominated rate-latency curves with maximum rate and with minimum latency.]
Fig. 3: Strict service curves obtained in Section V for an example with four input flows, weights = {4, 6, 7, 10}, $l^{\min}$ = {4096, 3072, 4608, 3072} bits, $l^{\max}$ = {8704, 5632, 6656, 8192} bits and $\beta(t) = ct$ with $c = 10$ Mb/s (i.e., the aggregate of all flows is served at a constant rate). The figure shows the IWRR service curve $\beta_i$ and the WRR strict service curve $\beta_i'$ for two of the flows; it also shows the non-dominated rate-latency strict service curves $\beta_{r_0^*, T_0^*}$ and $\beta_{r_{k^*}^*, T_{k^*}^*}$ of Theorem 3 (in the top panel both are equal).
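
As a rough numerical companion to Fig. 3, the following Python sketch evaluates $\gamma_i$ from (4)-(8) and the WRR curve of (9) for the example parameters of the caption, taking the aggregate curve as $\beta = \lambda_1$ so that the argument is the aggregate service in bits (for $\beta(t) = ct$ one would evaluate at $ct$). The function names are ours; the stair function is taken as $\nu_{a,b}(t) = a \lceil t/b \rceil$, as in Fig. 1; and the convolution with $\lambda_1$ is computed by inspecting only the jump points of $U_i$, which is an implementation shortcut and not the method of the paper.

```python
# Sketch: evaluate the IWRR curve gamma_i of (4)-(8) and the WRR curve of (9)
# for the example of Fig. 3, with beta = lambda_1 (argument = aggregate bits).
# All function names are illustrative assumptions, not taken from the paper.
W = [4, 6, 7, 10]                 # weights
LMIN = [4096, 3072, 4608, 3072]   # minimum packet sizes (bits)
LMAX = [8704, 5632, 6656, 8192]   # maximum packet sizes (bits)


def nu(a, b, y):
    """Stair function nu_{a,b}(y) = a * ceil(y / b) for an integer y >= 0."""
    return a * (-(-y // b))


def phi(i, j, x):
    """phi_{i,j}(x) of (8)."""
    return (x // W[i]) * W[j] + max(W[j] - W[i], 0) + min(x % W[i] + 1, W[j])


def psi(i, x):
    """psi_i(x) of (7)."""
    k = x // LMIN[i]
    return x + sum(phi(i, j, k) * LMAX[j] for j in range(len(W)) if j != i)


def q_and_Q(i):
    """q_i and Q_i as defined after (9); L_tot of (6) is their sum."""
    q = W[i] * LMIN[i]
    Q = sum(W[j] * LMAX[j] for j in range(len(W)) if j != i)
    return q, Q


def U(i, x):
    """U_i(x) of (5)."""
    q, Q = q_and_Q(i)
    return sum(nu(LMIN[i], q + Q, max(x - psi(i, k * LMIN[i]), 0))
               for k in range(W[i]))


def gamma_iwrr(i, x):
    """(lambda_1 conv U_i)(x); the infimum is reached at s = x or at a jump of U_i."""
    q, Q = q_and_Q(i)
    best = min(x, U(i, x))                    # candidates s = 0 and s = x
    for k in range(W[i]):
        s = psi(i, k * LMIN[i])               # U_i jumps just after s, s + L_tot, ...
        while s <= x:
            best = min(best, x - s + U(i, s))
            s += q + Q
    return best


def gamma_wrr(i, x):
    """WRR curve of (9) with beta = lambda_1."""
    q, Q = q_and_Q(i)
    m, r = divmod(max(x - Q, 0), q + Q)
    return m * q + min(r, q)                  # closed form of lambda_1 conv nu_{q,L}


if __name__ == "__main__":
    for x in (100864, 104960, 162304, 178688):
        print(x, gamma_iwrr(0, x), gamma_wrr(0, x))
```

For the first flow (weight 4), this sketch starts serving after 100864 bits of aggregate service under IWRR, against 162304 bits under WRR, and both curves meet again at $L_{tot} = 178688$ bits.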

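The rates of (11)-(13) can be tabulated for the example of Fig. 3 under similar assumptions. The Python sketch below recomputes $\psi_i$ at multiples of $l_i^{\min}$ and derives $r^*$, $r_k$, $k^*$ and $r_k^*$ with exact fractions, for a unit-rate aggregate ($\beta = \lambda_1$). The helper names are ours, and the latency paired with each $r_k^*$, namely $\psi_i(k l_i^{\min}) - k l_i^{\min} / r_k^*$, is our reading of the formula in Theorem 3 rather than a definition quoted from the paper.

```python
# Sketch: the rates of (11)-(13) for the Fig. 3 example, with beta = lambda_1.
# Helper names and the latency formula used for T are illustrative assumptions.
from fractions import Fraction

W = [4, 6, 7, 10]                 # weights
LMIN = [4096, 3072, 4608, 3072]   # minimum packet sizes (bits)
LMAX = [8704, 5632, 6656, 8192]   # maximum packet sizes (bits)


def phi(i, j, x):
    """phi_{i,j}(x) of (8)."""
    return (x // W[i]) * W[j] + max(W[j] - W[i], 0) + min(x % W[i] + 1, W[j])


def psi_at_packet(i, k):
    """psi_i(k * l_i^min) of (7)."""
    return k * LMIN[i] + sum(phi(i, j, k) * LMAX[j]
                             for j in range(len(W)) if j != i)


def rate_latency_candidates(i):
    """Return (r*, k*, [r*_0..r*_{k*}], [latencies]) for flow i."""
    l, w = LMIN[i], W[i]
    l_tot = w * l + sum(W[j] * LMAX[j] for j in range(len(W)) if j != i)
    r_star = Fraction(w * l, l_tot)
    # r_k of (11), completed with the convention r_{w_i - 1} = 1
    r = [Fraction(l, psi_at_packet(i, k + 1) - psi_at_packet(i, k))
         for k in range(w - 1)] + [Fraction(1)]
    k_star = min(k for k in range(w) if r[k] >= r_star)          # (12)
    r_k_star = [min(r[k], r_star) for k in range(k_star + 1)]    # (13)
    # Latency paired with each rate: T = psi_i(k l) - k l / r (assumed reading)
    lat = [psi_at_packet(i, k) - Fraction(k * l) / r_k_star[k]
           for k in range(k_star + 1)]
    return r_star, k_star, r_k_star, lat


if __name__ == "__main__":
    for i in range(len(W)):
        r_star, k_star, rates, lats = rate_latency_candidates(i)
        print(i, r_star, k_star,
              [float(x) for x in rates], [float(t) for t in lats])
```

With these parameters the flow of weight 4 gives $k^* = 0$, so a single rate-latency curve is obtained, while the flow of weight 10 gives $k^* = 5$ and hence several candidate curves.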
in an implicit inequality for the total amount of service (26). By using the technique of pseudo-inverse, this inequality is inverted and provides a lower bound for the amount of service received by the flow of interest.

A. Key Variables and Basic Properties

Let $(s, t]$ be a backlogged period of flow $i$. Let $(\tau_k, fl_k)$ be couples of (instant, flow), printed at line 8 of Algorithm 1. Note that $\tau_k < \tau_{k+1}$ as the send instruction has a non-null duration (because the aggregate service curve $\beta$ is Lipschitz continuous). Let $\sigma(0), \sigma(1), \ldots$ be the sequence of service opportunities for flow $i$ at or after $s$, i.e., $\sigma(0) = \min\{m \mid \tau_m \ge s, fl_m = i\}$ and $\sigma(k) = \min\{m \mid \tau_m > \tau_{\sigma(k-1)}, fl_m = i\}$. The $k$th service opportunity for flow $i$ occurs at time $\tau_{\sigma(k-1)}$; we say that it is “complete” if $\tau_{\sigma(k-1)+1} \le t$, i.e., the interval taken by this service is entirely in $[s, t]$. Let $p \ge 0$ be the number of complete service opportunities. Observe that it is possible that $p = 0$, and it might happen that $\tau_{\sigma(p)} < t$ or $\tau_{\sigma(p)} \ge t$.

[Timeline figure, two cases. Top: $s$, $\tau_{\sigma(0)-1}$, $\tau_{\sigma(0)}$, $\tau_{\sigma(0)+1}$, ..., $\tau_{\sigma(p-1)}$, $\tau_{\sigma(p-1)+1}$, $t$, $\tau_{\sigma(p)}$, with flow $i$ served from $\tau_{\sigma(0)}$ and from $\tau_{\sigma(p-1)}$. Bottom: $s$, $\tau_{\sigma(0)-1}$, $\tau_{\sigma(0)}$, $\tau_{\sigma(0)+1}$, ..., $\tau_{\sigma(p-1)}$, $\tau_{\sigma(p-1)+1}$, $\tau_{\sigma(p)}$, $t$, $\tau_{\sigma(p)+1}$, with flow $i$ served from $\tau_{\sigma(0)}$, from $\tau_{\sigma(p-1)}$ and from $\tau_{\sigma(p)}$.]

In each service of flow $i$, during a backlogged period, it sends one packet with a length $\ge l_i^{\min}$, thus, for all $k = 0 \ldots (p - 1)$, $R_i^*(\tau_{\sigma(k+1)}) - R_i^*(\tau_{\sigma(k)}) \ge l_i^{\min}$, therefore

$$R_i^*(\tau_{\sigma(p)}) - R_i^*(\tau_{\sigma(0)}) \ge p\, l_i^{\min} \qquad (14)$$

B. Amount of Service to Other Flows

In order to upper bound the number of emission opportunities for another flow $j$, we first find an expression, in Lemma 1, for the number of emission opportunities for flow $j$ between two consecutive emission opportunities for flow $i$. Lemma 2 then finds an upper bound on the number of emission opportunities for flow $j$ in $(s, \tau_{\sigma(p)})$, as a function of the cycle number (variable $C$ in Algorithm 1) at $\tau_{\sigma(0)}$. Lastly, Lemma 3 maximizes the previous upper bound over all values of $C$.

Lemma 1. The number of emission opportunities for flow $j \ne i$ between two consecutive emission opportunities for flow $i$, given that the latter emission opportunity for flow $i$ occurs at cycle $C$, is equal to

$$q_{i,j}(C) \overset{def}{=} \begin{cases} 0 & \text{if } 1 < C \le w_i \text{ and } w_j < C \\ 1 & \text{if } 1 < C \le w_i \text{ and } w_j \ge C \\ [w_j - w_i]^+ + 1 & \text{if } C = 1 \end{cases} \qquad (15)$$

Proof. According to Algorithm 1, flow $i$ has emission opportunities only in the first $w_i$ cycles of each round. Both emission opportunities are either in the same round (Case 1) or in two consecutive rounds (Case 2). As $C$ is the cycle number for the second emission opportunity for flow $i$, Case 1 can occur only when $1 < C \le w_i$, and Case 2 can occur when $C = 1$. For Case 1, we further differentiate between $w_j < C$ and $w_j \ge C$.
Case 1a: $1 < C \le w_i$ and $w_j < C$: Queue $j$ does not have an emission opportunity in cycle $C$ because $w_j < C$. Also, we must have $w_j < w_i$, thus queue $j$ does not have an emission opportunity after $i$ in cycle $C - 1$. Hence, $q_{i,j}(C) = 0$.
Case 1b: $1 < C \le w_i$ and $w_j \ge C$: If $w_j > w_i$, then queue $j$ has an emission opportunity after queue $i$ in cycle $C - 1$. If
wj = wi , then queue j has an emission opportunity before i number of emission opportunities in (τσ(p−1) , τσ(p) ). By the
in cycle C, or after i in cycle C − 1. Else, C ≤ wj < wi and induction hypothesis, N1 ≤ qi,j
0
(C, p − 1). Also, by Lemma
queue j has an emission opportunity in cycle C, before i. In 1, we have N2 ≤ qi,j (C (p)). Thus, by using (17) which was
0

all cases, qi,j (C) = 1. just shown to also hold for p, we obtain
Case 2: C = 1: The first emission opportunity for i is in p−1
the last cycle of a round that includes i (cycle wi ). If wj > wi ,
X
N≤ qi,j ((C + k − 1) mod wi + 1)
then queue j has an emission opportunity in the rest of cycle k=0
(19)
wi and also has emission opportunities during the next (wj − + qi,j ((C + p − 1) mod wi + 1)
wi ) cycles of the last round. In this case, qi,j (C) = wj −wi +1,
which is also the value in the last line of (15). Else if wj = wi , where the right-hand side is equal to qi,j
0
(C, p) as required.
queue j has an emission opportunity before i in this cycle or Lemma 3. For any backlogged period (s, t] of flow i with p
after i in cycle wi of the first round, thus qi,j (C) = 1, which complete services, the number of emission opportunities for
is also the value in the last line of (15). Else, wj < wi and flow j 6= i in (s, τσ(p) ) is upper bounded by φi,j (p), defined
queue j has an emission opportunity before i in this cycle. in (8).
Here too, qi,j (C) = 1, the value in the last line of (15).
Proof. Lemma 2 gives the number of emission opportunities
Lemma 2. The number of emission opportunities for flow j 6= for flow j 6= i in (s, τσ(p) ), for any backlogged period (s, t] of
i in (s, τσ(p) ), for any backlogged period (s, t] of flow i with flow i with p complete services, when the first service starts
p complete services, given that the first service starts at cycle at cycle number C (cycle number at time τσ(0) ). To obtain
number C (cycle number at time τσ(0) ) is upper bounded by the lemma, we maximize this result over C. We show the
def
p
X following properties.
0
qi,j (C, p) = qi,j ((C + k − 1) mod wi + 1) (16) (P1) For any integer C ∈ [1, wi ],
k=0
i −1
wX
Also, let C 0 (p) be the cycle number at τσ(p) . Then, qi,j ((C + k − 1) mod wi + 1) = wj (20)
k=0
C 0 (p) = (C + p − 1) mod wi + 1 (17)
The mapping k 7→ (C + k − 1) mod wi + 1 is one-to-one
Proof. By induction on p. from {0, ..., wi − 1} onto
Base Case: p = 0 Pwi{1, ..., wi }, thus the left-hand side
of (20) is equal to k=1 qi,j (k) that as we show now,
In this case, qi,j
0
(C, 0) is the number of emission oppor- +
is equal to wj . First, we have qi,j (1) = [wj − wi ] + 1.
tunities for flow j between two consecutive emission oppor- Also,
tunities for flow i that by Lemma 1, is equal to qi,j (C). As Pwi qi,j (k) = 1 when k > 1 and wj ≥ k + 1. Thus,
k=2 qi,j (k) = min(wi − 1, wj − 1) and finally the left-
1 ≤ C ≤ wi , (C − 1) mod wi + 1 = C thus qi,j (C) = +
hand side is equal to [wj − wi ] + min(wi − 1, wj − 1) + 1,
qi,j ((C − 1) mod wi + 1). This shows (16). Also, by defini- which is equal to wj .
tion, C 0 (0) = C; using again (C − 1) mod wi +1 = C shows (P2) For any integers C ∈ [1, wi ] and p ≥ 0, qi,j
0
(C, p) =
that (17) holds.
Induction step:   p mod
X wi
p
We assume that (16) and (17) hold for p − 1, and we want wj + qi,j ((C + k − 1) mod wi + 1) (21)
wi
to show that they also hold for p. k=0

First, let’s prove (17). There are two possible cases: (a) if qi,j is a periodic function with period wi . By (P1), the
0 ≤ C 0 (p − 1) < wi , then both (p − 1)th and pth emission sum over
j kone complete period is wj . Also,j wek can write
opportunities occur in the same round, thus C 0 (p) = C 0 (p − p = wpi wi + p mod wi . Thus, we have wpi complete
1)+1. By the induction hypothesis, (C + p − 2) mod wi +1 < rounds, and the sum in (21) is the remainder.
wi , i.e., (C + p − 2) mod wi < wi − 1. Note that, for any (P3) qi,j is a wide-sense decreasing function. This means that
integer x for any integer k ∈ [1, wi ), qi,j (k + 1) ≤ qi,j (k). If k = 1,
this follows from qi,j (1) ≥ 1 and qi,j (2) ≤ 1. Else if k ≤
(
(x mod w) + 1 if (x mod w) < w − 1
(x+1) mod w = wj < k + 1, then qi,j (k + 1) = 0 and qi,j (k) = 1. Else, they
0 otherwise
are equal. Hence, in all cases the property holds.
(18)
(P4) For any integer C ∈ [1, wi ] and p ≥ 0,
By using (18), we obtain that C 0 (p) is given by (17) as
required. (b) In the second case, C 0 (p − 1) = wi then the 0
qi,j 0
(C, p) ≤ qi,j (1, p) (22)
next emission opportunity occurs in the first cycle of the next
round, thus C 0 (p) = 1. Here too, applying (18) shows that By
Ppmod w using (P2), we should show that
C 0 (p) is given by (17) as required. k=0
i
qi,j ((C + k − 1) mod wi + 1) is upper bounded
Ppmod wi
Then, we prove (16). Let N be the number of emission by k=0 qi,j (k mod wi + 1). Note that here we have k
def
opportunities for flow j in [s, τσ(p) ). N is the sum of N1 , the mod wi = k. Both sides are the sum of a = p mod wi + 1
number of emission opportunities in [s, τσ(p−1) ), and N2 , the unique elements of the set {qi,j (k)}k∈[1,wi ] . By (P3), the

right-hand side is the maximum sum of $a$ unique elements of this set.

(P5) For any integer $p \ge 0$,
$$q'_{i,j}(1, p) = \phi_{i,j}(p) \tag{23}$$
We apply (P2) with $C = 1$ to compute $q'_{i,j}(1, p)$. Then, the sum in the right-hand side of (21) is equal to $\sum_{k=0}^{p \bmod w_i} q_{i,j}(k + 1)$, as $k \bmod w_i = k$. Then, by using the same argument after (20), it is equal to $[w_j - w_i]^+ + 1 + \min(p \bmod w_i, w_j - 1)$, which, by (8), is precisely $\phi_{i,j}(p)$.

The lemma then follows directly from (P4) and (P5).

Lemma 4. For every flow $j \ne i$,
$$R_j^*(t) \le R_j^*(\tau_{\sigma(p)}) \tag{24}$$
Proof. If $t \le \tau_{\sigma(p)}$, the result follows from $R_j^*$ being wide-sense increasing. Else, we have $t > \tau_{\sigma(p)}$; this implies that flow $i$ is served during $[\tau_{\sigma(p)}, t]$; thus for any other flow $j$, $R_j^*(t) = R_j^*(\tau_{\sigma(p)})$.

C. Amount of Service to Flow of Interest

Lemma 5. The number of complete services, $p$, of flow of interest, $i$, in $(s, t]$ is upper bounded by:
$$p \le \left\lfloor \frac{R_i^*(t) - R_i^*(s)}{l_i^{\min}} \right\rfloor \tag{25}$$
Proof. First, $R_i^*(s) \le R_i^*(\tau_{\sigma(0)})$, as $s \le \tau_{\sigma(0)}$ and $R_i^*$ is wide-sense increasing. Second, consider the two cases in VI-A. If $t \ge \tau_{\sigma(p)}$, the property holds. Else, the scheduler is not serving flow $i$ in $[\tau_{\sigma(p-1)+1}, \tau_{\sigma(p)})$, thus, $R_i^*(t) = R_i^*(\tau_{\sigma(p)})$. Hence, in both cases $R_i^*(t) \ge R_i^*(\tau_{\sigma(p)})$. By (14), $R_i^*(t) - R_i^*(s) \ge p\, l_i^{\min}$. Then, observe that $p$ is integer.

D. Total Amount of Service

Lemma 6. For any backlogged period $(s, t]$ of the flow of interest $i$,
$$\beta(t - s) \le \psi_i\left(R_i^*(t) - R_i^*(s)\right) \tag{26}$$
where $\psi_i$ is defined in (7).

Proof. As the interval $(s, t]$ is a backlogged period, by the definition of the strict service curve for the aggregate of flows, $\beta(t - s) \le \sum_j \left(R_j^*(t) - R_j^*(s)\right)$. We upper bound $R_j^*(t)$ for all $j \ne i$ by applying Lemma 4,
$$\beta(t - s) \le \left(R_i^*(t) - R_i^*(s)\right) + \sum_{j, j \ne i} \left(R_j^*(\tau_{\sigma(p)}) - R_j^*(s)\right) \tag{27}$$
Each flow $j$ has at most $\phi_{i,j}(p)$ emission opportunities during $[s, \tau_{\sigma(p)})$ (Lemma 3) and can send at most one packet of maximum size in each. Thus,
$$\beta(t - s) \le \left(R_i^*(t) - R_i^*(s)\right) + \sum_{j, j \ne i} \phi_{i,j}(p)\, l_j^{\max} \tag{28}$$
Also, Lemma 5 finds an upper bound on $p$. Thereby,
$$\beta(t - s) \le \left(R_i^*(t) - R_i^*(s)\right) + \sum_{j, j \ne i} \phi_{i,j}\left(\left\lfloor \frac{R_i^*(t) - R_i^*(s)}{l_i^{\min}} \right\rfloor\right) l_j^{\max} \tag{29}$$
where the right-hand side is equal to $\psi_i(R_i^*(t) - R_i^*(s))$.

E. Lower Pseudo-inverse of $\psi_i$

Our next step is to invert (26) by computing the lower-pseudo inverse of $\psi_i$. As the calculus of pseudo inverses applies to wide-sense increasing functions, we first show:

Lemma 7. $\psi_i$, defined in (7), is wide-sense increasing.

Proof. It is sufficient to show that $\phi_{i,j}$, defined in (8), is a wide-sense increasing function. For any non-negative integers $x$ and $y$ such that $y \le x$, we can write $x = k w_i + (x \bmod w_i)$ and $y = k' w_i + (y \bmod w_i)$, where $k, k'$ are non-negative integers. We must have $k' \le k$. If $k = k'$, we know that $(y \bmod w_i) \le (x \bmod w_i)$ and $\lfloor x / w_i \rfloor = \lfloor y / w_i \rfloor$. Hence, $\phi_{i,j}(y) \le \phi_{i,j}(x)$. Else, $k > k'$ and $\lfloor x / w_i \rfloor > \lfloor y / w_i \rfloor$. Thereby, the first term of $\phi_{i,j}(x)$ is at least $w_j$ larger than that of $\phi_{i,j}(y)$, whereas the $\min$ terms differ by at most $w_j - 1$. Hence, $\phi_{i,j}(y) < \phi_{i,j}(x)$.

Lemma 8. Let $g_0, g_1, \ldots, g_k, \ldots$ be a non-negative sequence such that $g_{k+1} - g_k \ge 1$. The sequence can be extended to a function in $\mathcal{F}$ by $g(x) = g_{\lfloor x \rfloor}$ and let $g^\downarrow$ be its lower pseudo-inverse, so that $g^\downarrow(y) = k + 1 \in \mathbb{N} \Leftrightarrow g_k < y \le g_{k+1}$. Define $f \in \mathcal{F}$ by $f(x) = g_{\lfloor x \rfloor} + x \bmod 1$. Then, $f^\downarrow = \lambda_1 \otimes g^\downarrow$.

Proof. Observe that convolving $g^\downarrow$ with $\lambda_1$ consists in smoothing the unit steps with a slope of 1 (Fig. 1). Thus $(\lambda_1 \otimes g^\downarrow)(y) = k + y - g_k$ whenever $g_k \le y \le g_k + 1$ and $(\lambda_1 \otimes g^\downarrow)(y) = k + 1$ whenever $g_k + 1 \le y \le g_{k+1}$. Also, $f$ is piecewise linear and can be inverted in closed form on every interval where it is linear. A direct calculation gives $f^\downarrow(y) = k + y - g_k$ whenever $g_k \le y \le g_k + 1$ and $f^\downarrow(y) = k + 1$ whenever $g_k + 1 \le y \le g_{k+1}$.

Lemma 9. Let $f \in \mathcal{F}$ and $l, m > 0$. Define $h \in \mathcal{F}$ by $h(x) = m f\left(\frac{x}{l}\right)$. Then, for all $y \ge 0$, $h^\downarrow(y) = l f^\downarrow\left(\frac{y}{m}\right)$.

Proof. Let $B(f, y) \overset{\text{def}}{=} \{x \ge 0, f(x) \ge y\}$ so that $f^\downarrow(y) = \inf B(f, y)$. Observe that $x \in B(h, y) \Leftrightarrow \frac{x}{l} \in B\left(f, \frac{y}{m}\right)$.

Lemma 10. Let $a \in \mathcal{F}$ and $l > 0$. Define $b \in \mathcal{F}$ by $b(x) = l\, a\left(\frac{x}{l}\right)$. Then, for all $x \ge 0$, $(\lambda_1 \otimes b)(x) = l (\lambda_1 \otimes a)\left(\frac{x}{l}\right)$.

Proof. Do the change of variable $u = lv$ in the expansion $(\lambda_1 \otimes b)(x) = \inf_{0 \le u \le x} (u + b(x - u))$ and obtain $(\lambda_1 \otimes b)(x) = \inf_{0 \le v \le \frac{x}{l}} \left(lv + l\, a\left(\frac{x}{l} - v\right)\right) = l (\lambda_1 \otimes a)\left(\frac{x}{l}\right)$.

We can now compute the lower-pseudo inverse of $\psi_i$. First, define the sequence $g$ by $g_k = \frac{1}{l_i^{\min}} \psi_i(k\, l_i^{\min})$. As in Lemma 8, $g$ can be extended to a piecewise constant function whose lower-pseudo inverse, $g^\downarrow$, can be directly computed:
$$g^\downarrow(x) = \frac{1}{l_i^{\min}} \sum_{k=0}^{w_i - 1} \nu_{l_i^{\min}, L_{tot}}\left(l_i^{\min} [x - g_k]^+\right) \tag{30}$$
Second, observe that for all $x \ge 0$, $\psi_i(x) = \psi_i\left(\left\lfloor \frac{x}{l_i^{\min}} \right\rfloor l_i^{\min}\right) + x \bmod l_i^{\min}$. Define $f$ and $h$ from $g$ as in Lemmas 8 and 9 with $l = m = l_i^{\min}$, so that $h = \psi_i$. Apply Lemmas 8 and 9 and obtain $\psi_i^\downarrow(x) = l_i^{\min} \left(\lambda_1 \otimes g^\downarrow\right)\left(\frac{x}{l_i^{\min}}\right)$. Now apply Lemma 10 with $a = g^\downarrow$, $l = l_i^{\min}$, and $b = U_i$ to obtain
$$\psi_i^\downarrow = \lambda_1 \otimes U_i \tag{31}$$
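The identity (31) can be checked numerically on a small instance. The following Python sketch is illustrative only and is not taken from the paper: it assumes integer packet sizes, takes $\phi_{i,j}$ and $\psi_i$ as in (8) and (7), assumes the stair-function convention $\nu_{h,P}(t) = h \lceil t/P \rceil$, and uses made-up weights and packet sizes. All function names are hypothetical.

```python
def phi(x, w_i, w_j):
    """phi_{i,j}(x) as in (8)."""
    return (x // w_i) * w_j + max(w_j - w_i, 0) + min(x % w_i + 1, w_j)


def psi(x, i, w, l_min, l_max):
    """psi_i(x) as in (7), for an integer amount of service x."""
    k = x // l_min[i]
    others = sum(phi(k, w[i], w[j]) * l_max[j] for j in range(len(w)) if j != i)
    return x + others


def psi_lower_inverse(y, i, w, l_min, l_max):
    """inf{x >= 0 : psi_i(x) >= y}, computed directly from the shape of psi_i.

    psi_i has slope 1 on [k*l, (k+1)*l) and jumps upward at multiples of l.
    """
    l = l_min[i]
    k = 0
    while True:
        p_k = psi(k * l, i, w, l_min, l_max)
        if y <= p_k:        # y is reached by the jump at k*l
            return k * l
        if y < p_k + l:     # y lies on the slope-1 segment
            return k * l + (y - p_k)
        k += 1


def stair(t, h, period):
    """Assumed stair function nu_{h,P}(t) = h * ceil(t / P), integer t >= 0."""
    return h * -(-t // period)


def u_i(x, i, w, l_min, l_max):
    """U_i(x) = sum_k nu_{l_i^min, L_tot}([x - psi_i(k * l_i^min)]^+)."""
    l = l_min[i]
    l_tot = w[i] * l + sum(w[j] * l_max[j] for j in range(len(w)) if j != i)
    return sum(stair(max(x - psi(k * l, i, w, l_min, l_max), 0), l, l_tot)
               for k in range(w[i]))


def conv_lambda1(y, i, w, l_min, l_max):
    """(lambda_1 conv U_i)(y) = inf over 0 <= u <= y of u + U_i(y - u).

    With integer data the infimum is reached at an integer u, so a scan suffices.
    """
    return min(u + u_i(y - u, i, w, l_min, l_max) for u in range(y + 1))


# Hypothetical two-flow instance; flow 0 is the flow of interest.
w, l_min, l_max = [2, 3], [3, 2], [3, 2]
mismatches = [y for y in range(60)
              if psi_lower_inverse(y, 0, w, l_min, l_max)
              != conv_lambda1(y, 0, w, l_min, l_max)]
print(mismatches)
```

The direct inversion and the closed form should agree on every tested value, so the printed list is expected to be empty on this instance.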

F. Proof of Theorem 1

Proof. Lemma 6 gives, in (26), an upper bound on the total amount of service as a function of the service received by the flow of interest. We invert (26) by the lower-pseudo inverse technique in (3) and obtain $R_i^*(t) - R_i^*(s) \ge \psi_i^\downarrow(\beta(t - s))$. The lower-pseudo inverse of $\psi_i$ is given by (31), thus
$$R_i^*(t) - R_i^*(s) \ge (\lambda_1 \otimes U_i)(\beta(t - s)) = \beta_i(t - s) \tag{32}$$
Lastly, we need to prove that $\beta_i$ is super-additive. This follows from the tightness result in Theorem 4 (the proof of which is independent of the rest of this proof). Indeed, the super-additive closure $\bar\beta_i$ of $\beta_i$ is also a strict service curve, and $\bar\beta_i(t) \ge \beta_i(t)$ for all $t$ [2, Prop. 5.6]. By Theorem 4, we also have $\bar\beta_i(t) \le \beta_i(t)$ for all $t$, hence $\bar\beta_i = \beta_i$.

VII. TIGHTNESS

We first show that the strict service curve we have obtained is the best possible. Proofs of results in this Section are in the Appendix.

A. Tightness of Strict Service Curve

Theorem 4 (Tightness of the IWRR Service Curve). Consider a weighted round-robin subsystem that uses the IWRR scheduling algorithm, as defined in Section III. Assume the following system parameters are fixed: the number of input flows, the weight $w_j$ allocated to every flow $j$, the bounds on packet sizes $l_j^{\min}$ and $l_j^{\max}$ for every flow $j$, and the strict service curve $\beta$ for the aggregate of all flows. Let $i$ be the index of one of the flows.
Assume that $b_i \in \mathcal{F}$ is a strict service curve for flow $i$ in any system that satisfies the specifications above. Then $b_i \le \beta_i$ where $\beta_i$ is given in Theorem 1.

Interestingly, we obtain a similar result for WRR. Recall that $\beta_i'$ is the strict service curve for flow $i$, described in (9), which was obtained in [2, Sec. 8.2.4].

Theorem 5 (Tightness of the WRR Service Curve). Theorem 4 is also valid if we replace IWRR with WRR. Specifically, using WRR as a scheduling policy, $\beta_i'$ is the largest possible strict service curve for flow $i$.

B. Tightness of Delay Bounds With Constant Packet Sizes

Having obtained the best-possible strict service curve does not guarantee that the delay bounds derived from it are tight, i.e., are worst-case delays. This is because a service curve is only an abstraction of the system; moreover, we have obtained a strict service curve, and non-strict service curves might provide better results. However, we show that, for flows of packets of constant size, we do obtain tight delay bounds. We show that this holds for IWRR and for WRR.

Recall that a delay bound requires the knowledge of an arrival curve $\alpha_i$ for the flow of interest. If this flow generates only packets of length $l$, then $\alpha_i$ can be assumed to be a multiple of $l$ and sub-additive. A delay bound for this flow is then equal to $h(\alpha_i, \beta_i)$ (see (1)).

Theorem 6 (Tightness of Delay Bound for IWRR with Constant Packet Size). Consider a system, as in Theorem 4, with the additional assumption that, for the flow of interest $i$, $l_i^{\min} = l_i^{\max} = l$.
Let $\alpha_i \in \mathcal{F}$ be a sub-additive function that is an integer multiple of $l$, and assume that flow $i$ has $\alpha_i$ as arrival curve. The network calculus delay bound is tight, i.e., there exists a trajectory where the delay of one packet of flow $i$ is equal to $h(\alpha_i, \beta_i)$.

Theorem 7 (Tightness of Delay Bound for WRR with Constant Packet Size). Theorem 6 is also valid for the WRR policy.

VIII. NUMERICAL EXAMPLES

To compare IWRR and WRR worst-case delays, we provide some numerical examples. First, we consider a system of 8 input flows $f_1, \ldots, f_8$ with respective weights $\{22, 27, 28, 30, 30, 34, 41, 45\}$ and $l^{\min} = l^{\max} = l = 7119$ bit. Let the aggregate service, $\beta$, be a constant bit rate of 10 Mb/s. For every flow $i$, we compute the IWRR and WRR strict service curves $\beta_i$, $\beta_i'$. Then, for every $i$, we generate $N = 1000$ leaky-bucket arrival curves $\gamma_{r, b_k}$, $k = 1 \ldots N$, with rate $r = 0.5$ Mb/s and burst $b_k$ picked uniformly at random in $[1, 20]$ packets. Then, we use $\alpha_i^k = \left\lceil \frac{\gamma_{r, b_k}}{l} \right\rceil l$ to satisfy the conditions of Theorems 6 and 7 and to compute $d_i^k = h(\alpha_i^k, \beta_i)$ and $\dot d_i^k = h(\alpha_i^k, \beta_i')$. Fig. 4 gives the box-and-whisker plots of the $\dot d_i^k - d_i^k$ series. The medians of the WRR delay bounds $\dot d_i^k$ are also provided to illustrate the improvement.

Second, we repeated the same study for $M = 10000$ sets of system parameters. For each system, we choose the weights of 8 flows by picking them uniformly at random between 10 and 50, and we pick a packet length $l$ uniformly at random between 64 and 1522 bytes. For each experiment, we call flow 1 the flow with the smallest weight, flow 2 the flow with the second smallest weight, and so on. As the scale of delay bounds depends on the choices of weights and the packet length, the $\dot d_i^k - d_i^k$ series are divided by $\dot d_i^{\bar m}$, the median of WRR delay bounds for flow $i$. Fig. 5 gives the box-and-whisker plots of the $\frac{\dot d_i^k - d_i^k}{\dot d_i^{\bar m}}$ series. Using IWRR improves worst-case delays, as expected, and the improvement is larger for flows with larger weights.

IX. CONCLUSION

IWRR is a variant of WRR with the same long-term rate and the same complexity. We have provided a residual strict service curve for IWRR and have shown that it is the best possible one under general assumptions. For flows with packets of constant size, we have shown that the delay bounds derived from it are worst-case. We have proved that the IWRR worst-case delay is not greater than that of WRR and shown through experiments that the gain is significant (20%-60%) in practice, which speaks in favour of using IWRR as a replacement for WRR. Our result assumes that the aggregate of all IWRR queues receives a strict service curve guarantee, and we find a strict service curve guarantee for every IWRR queue. Therefore, our results apply to hierarchical schedulers. In future research, we plan to improve the results with supplementary hypotheses on flows,

considering arrival curves and packet size distribution, with "packet curves" [24].

[Figure 4 plot. Title: Absolute Improvement of Delay Bounds of IWRR wrt WRR on one Configuration. x-axis: Flows (1 to 8). y-axis: WRR Worst-case Delay − IWRR Worst-case Delay, Time (ms). Overlay: Median of WRR Worst-case Delay.]
Fig. 4: Box-and-whisker plots of difference between WRR and IWRR delay bounds with weights {22, 27, 28, 30, 30, 34, 41, 45} and l = 7119 bit with random arrival curves. Median WRR delay bounds are also provided.

[Figure 5 plot. Title: Relative Improvement of Delay Bounds of IWRR wrt WRR on a Set of Configurations. x-axis: Flows (1 to 8). y-axis: (WRR Worst-case Delay − IWRR Worst-case Delay) / Median of WRR Worst-case Delay, Percentage (%).]
Fig. 5: Box-and-whisker plots of difference between WRR and IWRR delay bounds normalized to the median of WRR delay bounds, for several systems with weights picked uniformly at random in [10, 50], assigned to flow by increasing order, and a packet length picked uniformly at random in [64, 1522] bytes.

REFERENCES

[1] M. Katevenis, S. Sidiropoulos, and C. Courcoubetis, "Weighted round-robin cell multiplexing in a general-purpose ATM switch chip," IEEE Journal on Selected Areas in Communications, vol. 9, no. 8, pp. 1265–1279, 1991.
[2] A. Bouillard, M. Boyer, and E. Le Corronc, Deterministic Network Calculus: From Theory to Practical Implementation. Wiley-ISTE.
[3] S. M. Tabatabaee, J.-Y. L. Boudec, and M. Boyer, "Interleaved weighted round-robin: A network calculus analysis," 2020. [Online]. Available: https://arxiv.org/pdf/2003.08372.pdf
[4] J.-Y. Le Boudec and P. Thiran, Network Calculus: A Theory of Deterministic Queuing Systems for the Internet. Springer Science & Business Media, 2001, vol. 2050.
[5] C. S. Chang, Performance Guarantees in Communication Networks. New York: Springer-Verlag, 2000.
[6] J. Liebeherr, "Duality of the max-plus and min-plus network calculus," Foundations and Trends in Networking, vol. 11, no. 3-4, pp. 139–282, 2017.
[7] D. B. Chokshi and P. Bhaduri, "Modeling fixed priority non-preemptive scheduling with real-time calculus," in 2008 14th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, Aug 2008, pp. 387–392.
[8] L. Thiele, S. Chakraborty, and M. Naedele, "Real-time calculus for scheduling hard real-time systems," in 2000 IEEE International Symposium on Circuits and Systems (ISCAS), vol. 4, May 2000, pp. 101–104.
[9] E. L. Hahne and R. G. Gallager, "Round robin scheduling for fair flow control in data communication networks," in Proc. of the IEEE Int. Conf. on Communications (ICC 86), June 1986.
[10] J. Nagle, "On packet switches with infinite storage," Communications, IEEE Transactions on, vol. 35, no. 4, pp. 435–438, Apr 1987.
[11] Y.-T. Wang, T.-P. Lin, and K.-C. Gan, "An improved scheduling algorithm for weighted round-robin cell multiplexing in an ATM switch," in Proceedings of the International Conference on Communications (SUPERCOMM'94), vol. 2, May 1994, pp. 1032–1037.
[12] H. Shimonishi, M. Yoshida, R. Fan, and H. Suzuki, "An improvement of weighted round robin cell scheduling in ATM networks," in Proc. of the IEEE Global Telecommunications Conference (GLOBECOM 97), vol. 2, Nov 1997, pp. 1119–1123.
[13] H. M. Chaskar and U. Madhow, "Fair scheduling with tunable latency: A round robin approach," in Proc. of the IEEE Global Telecommunications Conference (GLOBECOM'99), vol. 2. IEEE, 1999, pp. 1328–1333.
[14] "IEEE standard for local and metropolitan area networks – bridges and bridged networks," IEEE, IEEE Standard 802.1Q, 2018.
[15] D. B. L.D. and P. V. Krishna, "Honey bee behavior inspired load balancing of tasks in cloud computing environments," Applied Soft Computing, vol. 13, no. 5, pp. 2292–2303, 2013. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1568494613000446
[16] Wensong, "Weighted round-robin scheduling, documentation of the linuxvirtualserver knowledge base," http://kb.linuxvirtualserver.org/wiki/Weighted_Round-Robin_Scheduling, 2005.
[17] Y. Qian, Z. Lu, and W. Dou, "Analysis of worst-case delay bounds for best-effort communication in wormhole networks on chip," in Proc. of the 3rd ACM/IEEE International Symposium on Networks-on-Chip (NoCS 2009). IEEE, 2009, pp. 44–53.
[18] D. Stiliadis and A. Varma, "Latency-rate servers: A general model for analysis of traffic scheduling algorithms," IEEE/ACM Trans. Netw., vol. 6, no. 5, pp. 611–624, October 1998. [Online]. Available: http://dx.doi.org/10.1109/90.731196
[19] S. Nananukul, "Latency of weighted round-robin scheduler," Electronics Letters, vol. 39, no. 2, pp. 256–257, 2003.
[20] Y. Jiang, "Relationship between guaranteed rate server and latency rate server," Computer Networks, vol. 43, no. 3, pp. 307–315, 2003. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1389128603002767
[21] J.-P. Georges, T. Divoux, and E. Rondeau, "Network calculus: application to switched real-time networking," in Proc. of the 5th Int. ICST Conf. on Performance Evaluation Methodologies and Tools, ser. VALUETOOLS '11. ICST, Brussels, Belgium: ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), 2011, pp. 399–407. [Online]. Available: http://dl.acm.org/citation.cfm?id=2151688.2151733
[22] A. Soni, X. Li, J.-L. Scharbarg, and C. Fraboul, "WCTT analysis of avionics switched ethernet network with WRR scheduling," in Proc. of the 26th International Conference on Real-Time Networks and Systems (RTNS). ACM, 2018, pp. 213–222.
[23] E. Mohammadpour, E. Stai, and J.-Y. Le Boudec, "Improved delay bound for a service curve element with known transmission rate," IEEE Networking Letters, pp. 1–1, 2019. [Online]. Available: http://infoscience.epfl.ch/record/267840
[24] A. Bouillard, N. Farhi, and B. Gaujal, "Packetization and packet curves in network calculus," in Performance Evaluation Methodologies and Tools (VALUETOOLS), 2012 6th International Conference on. IEEE, 2012, pp. 136–137.

APPENDIX

A. Proof of Theorem 2

The WRR strict service curve [2, Sec. 8.2.4] is defined by $\beta_i'(t) = \gamma_i'(\beta(t))$ with
$$\gamma_i'(t) = \left(\lambda_1 \otimes \nu_{q_i, L_{tot}}\right)\left([t - Q_i]^+\right) \tag{33}$$
$$\psi_i'(x) \overset{\text{def}}{=} x + \sum_{j, j \ne i} \phi_{i,j}'\left(\left\lfloor \frac{x}{l_i^{\min}} \right\rfloor\right) l_j^{\max} \tag{34}$$
$$\phi_{i,j}'(x) \overset{\text{def}}{=} \left(1 + \left\lfloor \frac{x}{w_i} \right\rfloor\right) w_j \tag{35}$$
$\gamma_i'$ is the lower-pseudo inverse of $\psi_i'$. We know that for IWRR, $\gamma_i$ is also the lower-pseudo inverse of $\psi_i$ (defined in (7)). We first show that $\psi_i \le \psi_i'$.

It is sufficient to prove that for all $j \ne i$ and for all $k \in \mathbb{N}$, $\phi_{i,j}(k) \le \phi_{i,j}'(k)$. From the definition of $\phi_{i,j}$ and as $\min(x \bmod w_i + 1, w_j) \le \min(w_i, w_j)$,
$$\phi_{i,j}(x) \le \left\lfloor \frac{x}{w_i} \right\rfloor w_j + [w_j - w_i]^+ + \min(w_i, w_j) \tag{36}$$
Observe that $[w_j - w_i]^+ + \min(w_i, w_j) = w_j$. Hence, the right-hand side is $\phi_{i,j}'(x)$. This shows that
$$\psi_i \le \psi_i' \tag{37}$$
In [6, Sec. 10.1], it is shown that
$$\forall f, g \in \mathcal{F},\ f \ge g \Rightarrow f^\downarrow \le g^\downarrow \tag{38}$$
Apply (38) to (37) to conclude the proof.

B. Proof of Theorem 3

Lemma 11. Consider some integers $w \ge 1$ and $0 \le k \le w - 1$, a finite sequence $g_0, g_1, \ldots, g_{w-1}$ and a number $a \in \mathbb{R}$ that satisfy:
1) $\forall \ell \in \mathbb{N}$, if $0 \le \ell \le w - 2$ then $g_{\ell+1} - g_\ell \ge 1$
2) $\forall \ell \in \mathbb{N}$, if $0 \le \ell \le w - 3$ then $g_{\ell+2} - g_{\ell+1} \le g_{\ell+1} - g_\ell$
3) if $k \le w - 2$ then $a \ge g_{k+1} - g_k$ else $a \ge 1$
4) if $k \ge 1$ then $a \le g_k - g_{k-1}$
Define $f : [0, w) \to \mathbb{R}$ by $f(x) = g_{\lfloor x \rfloor} + x \bmod 1$ and $h : [0, w) \to \mathbb{R}$ by $h(x) = a(x - k) + g_k$.
Then $h \ge f$.

Proof. First we show that
$$\forall \ell \in \{0, \ldots, w - 1\},\ g_k - g_\ell \ge a(k - \ell) \tag{39}$$
Case 1: $\ell < k$. Then $g_k - g_\ell = \sum_{k'=\ell}^{k-1} (g_{k'+1} - g_{k'})$. By 2) every term in the sum is $\ge g_k - g_{k-1}$, which by 4) is also $\ge a$, and there are $(k - \ell)$ terms; this shows (39).
Case 2: $\ell = k$. Then (39) is obvious.
Case 3: $\ell > k$. Then $g_\ell - g_k = \sum_{k'=k}^{\ell-1} (g_{k'+1} - g_{k'})$. By 2) every term in the sum is $\le g_{k+1} - g_k$; note that we must have $k \le w - 2$, thus by 3), every term in the sum is also $\le a$; also, there are $\ell - k$ terms. Thus $g_\ell - g_k \le a(\ell - k)$, which shows (39) in this case.

We now proceed with the proof of the lemma. Consider some arbitrary $x \in [0, w)$ and let $\ell = \lfloor x \rfloor$. Then
$$f(x) = x - \ell + g_\ell \tag{40}$$
$$h(x) = a(x - \ell) + a(\ell - k) + g_k \tag{41}$$
$$h(x) - f(x) = \underbrace{(a - 1)(x - \ell)}_{A} + \underbrace{g_k - g_\ell - a(k - \ell)}_{B} \tag{42}$$
Observe that we must have $a \ge 1$: if $k = w - 1$ this follows from 3), and if $k \le w - 2$ it follows from 3) and 1); thus $A \ge 0$. Also $B \ge 0$ by (39).

Lemma 12. Let $T > 0$ and $P$ a bounded, wide-sense increasing function $[0, T) \to \mathbb{R}$. Extend $P$ to a function $\bar P \in \mathcal{F}$ by $\forall x \ge 0$, $\bar P(x) = \left\lfloor \frac{x}{T} \right\rfloor P(T^-) + P(x \bmod T)$ where $P(T^-) \overset{\text{def}}{=} \sup_{0 \le t < T} P(t)$.
Also, consider an affine function $L$, defined by $L(x) = ax + b$ for some $a \ge \frac{P(T^-)}{T}$ and some $b \in \mathbb{R}$.
If $L(x) \ge P(x)$ for all $x$ in $[0, T)$ then $L \ge \bar P$.

Proof. Observe that, for $x \ge 0$, $L(x) = a \left\lfloor \frac{x}{T} \right\rfloor T + L(x \bmod T)$. Now $L(x \bmod T) \ge P(x \bmod T)$ by hypothesis. Thus
$$L(x) \ge a \left\lfloor \frac{x}{T} \right\rfloor T + P(x \bmod T) \tag{43}$$
$$\ge \frac{P(T^-)}{T} \left\lfloor \frac{x}{T} \right\rfloor T + P(x \bmod T) = \bar P(x) \tag{44}$$

Lemma 13. Let $f \in \mathcal{F}$ and a rate-latency function $\beta_{r,T}$ such that $r > 0$, $T > 0$, and $\beta_{r,T} \le f$. Assume that $\beta_{r,T}(x_1) = f(x_1)$ for $x_1 > T$.
Then there is no other rate-latency function $\beta_{r',T'}$ (i.e., with $(r', T') \ne (r, T)$) such that $\beta_{r,T} \le \beta_{r',T'} \le f$.

Proof. Assume that $\beta_{r,T} \le \beta_{r',T'} \le f$. The proof consists in showing that $(r, T) = (r', T')$.
First, we know that $\beta_{r,T}(x_1) = f(x_1)$ and $x_1 > T$; thus $r(x_1 - T) = f(x_1)$ and
$$T = x_1 - \frac{f(x_1)}{r} \tag{45}$$
Second, observe that we must have $T' \le T$, since otherwise $\beta_{r,T}(T') > 0 = \beta_{r',T'}(T')$.
Third, observe that $f(x_1) = \beta_{r,T}(x_1) \le \beta_{r',T'}(x_1) \le f(x_1)$, thus $\beta_{r',T'}(x_1) = f(x_1)$ and
$$T' = x_1 - \frac{f(x_1)}{r'} \tag{46}$$
Combining the last three paragraphs, it follows that $x_1 - \frac{f(x_1)}{r'} \le x_1 - \frac{f(x_1)}{r}$, i.e., $r' \le r$. Also, we must have $r' \ge r$, since otherwise $\forall x > x_0$, $\beta_{r,T}(x) > \beta_{r',T'}(x)$ with $x_0 = \frac{rT - r'T'}{r - r'}$. Thus, $r' = r$, and it follows from (45) and (46) that $T' = T$.

Now we proceed with the proof of Theorem 3.

1) We first show that $r_k \le r_{k+1}$ for $k = 0, \ldots, w_i - 2$. Define the sequence $g$ by $g_k = \frac{1}{l_i^{\min}} \psi_i(k\, l_i^{\min})$ for $k = 0, \ldots, w_i - 1$. By definition, we have
$$g_{k+1} - g_k = 1 + \frac{1}{l_i^{\min}} \sum_{j, j \ne i} \left(\min(k + 2, w_j) - \min(k + 1, w_j)\right) l_j^{\max} \tag{47}$$
Observe that $(\min(k + 2, w_j) - \min(k + 1, w_j))$ is equal to 1 if $k + 1 < w_j$, and equal to 0 otherwise. Thus, $g_{k+2} - g_{k+1} \le g_{k+1} - g_k$ for $0 \le k < w_i - 2$, which shows that $r_k \le r_{k+1}$ for $k = 0, \ldots, w_i - 3$. Also, observe that $g_{k+1} - g_k \ge 1$, i.e., $r_k \le 1$, for $0 \le k \le w_i - 2$. Hence, $r_{w_i - 2} \le r_{w_i - 1}$.

2) Let $r \in [r_0^*, r_{k^*}^*]$ and let $T(r)$ be the value of $T$ defined in the Theorem, namely, $T(r) \overset{\text{def}}{=} \psi_i(k\, l_i^{\min}) - \frac{k\, l_i^{\min}}{r}$, where $k$ is defined by $r_{k-1}^* \le r < r_k^*$ if $r \in [r_0^*, r_{k^*}^*)$ and $k = k^*$ if $r = r_{k^*}^*$. We now show that $\beta_{r, T(r)} \le \gamma_i$.

We consider two cases: $r_0^* \le r < r_{k^*}^*$ or $r = r_{k^*}^*$. For the former case, for any $r$, apply Lemma 11 with $w = w_i$, $g$ as defined in 1), $k$ as defined in the paragraph above, and $a = \frac{1}{r}$. As by construction $\frac{1}{r_k} < a \le \frac{1}{r_{k-1}}$ and $\frac{1}{r_{k-1}} = g_k - g_{k-1}$, 3) and 4) are satisfied. For the latter case, apply again Lemma 11 with the same $g$ and $w = w_i$ but now with $k = k^*$ and $a = \frac{1}{r_{k^*}^*}$. By construction, we have $\frac{1}{r} = \frac{1}{r_{k^*}^*} \ge \frac{1}{r_{k^*}} = g_{k^*+1} - g_{k^*}$ and $\frac{1}{r_{k^*}^*} \le \frac{1}{r_{k^*-1}} = g_{k^*} - g_{k^*-1}$. Thus, conditions 3) and 4) of Lemma 11 are satisfied. Let $f$ be the corresponding function $f$ in Lemma 11, i.e., $f(x) = g_{\lfloor x \rfloor} + x \bmod 1$ for $0 \le x < w_i$. Note that for both cases $f$ is the same. Also, let $f_r$ be the corresponding function $h$ in Lemma 11, i.e., $f_r(x) = \frac{1}{r}(x - k) + g_k$ for $0 \le x < w_i$. By Lemma 11, $f_r \ge f$.

Observe that $f(w_i^-) = \frac{1}{l_i^{\min}} \psi_i\left((w_i - 1)\, l_i^{\min}\right) + 1 = \frac{1}{l_i^{\min}} \left(w_i l_i^{\min} + \sum_{j, j \ne i} w_j l_j^{\max}\right) = \frac{L_{tot}}{l_i^{\min}} = \frac{w_i}{r^*}$. Then, as $f_r(x) \ge f(x)$ for $0 \le x < w_i$ and $\frac{1}{r} \ge \frac{1}{r^*} = \frac{f(w_i^-)}{w_i}$, we can apply Lemma 12 with $P = f$ and $L = f_r$. It gives us $\bar f$ defined by $\bar f(x) = \left\lfloor \frac{x}{w_i} \right\rfloor \frac{L_{tot}}{l_i^{\min}} + f(x \bmod w_i)$ such that $f_r \ge \bar f$.

Then, by using (38), $f_r^\downarrow \le \bar f^\downarrow$. Also, as $\bar f^\downarrow \ge 0$, we have $\left[f_r^\downarrow\right]^+ \le \bar f^\downarrow$. Note that for an increasing, linear function $L$, defined by $\forall x \ge 0$, $L(x) = ax + b$ with some $a > 0$ and $b > 0$, we have $\left[L^\downarrow\right]^+ = \beta_{\frac{1}{a}, b}$; and observe that $f_r(x) = \frac{x}{r} + g_k - \frac{k}{r} = \frac{x}{r} + \frac{T(r)}{l_i^{\min}}$. Hence, $\left[f_r^\downarrow\right]^+ = \beta_{r, \frac{T(r)}{l_i^{\min}}}$.

Until now, we have shown that $\beta_{r, \frac{T(r)}{l_i^{\min}}} \le \bar f^\downarrow$. Lastly, we show that $l_i^{\min} \bar f^\downarrow\left(\frac{x}{l_i^{\min}}\right) = \gamma_i(x)$ and $l_i^{\min} \beta_{r, \frac{T(r)}{l_i^{\min}}}\left(\frac{x}{l_i^{\min}}\right) = \beta_{r, T(r)}(x)$. Observe that $l_i^{\min} \bar f\left(\frac{x}{l_i^{\min}}\right) = \left\lfloor \frac{x}{w_i l_i^{\min}} \right\rfloor L_{tot} + \psi_i\left(\left(\frac{x}{l_i^{\min}} \bmod w_i\right) l_i^{\min}\right)$. Also, $\psi_i(x) = \left\lfloor \frac{x}{w_i l_i^{\min}} \right\rfloor L_{tot} + \psi_i\left(x \bmod w_i l_i^{\min}\right)$. Hence, we have $\psi_i(x) = l_i^{\min} \bar f\left(\frac{x}{l_i^{\min}}\right)$. By using Lemma 9 with $l = m = l_i^{\min}$, $l_i^{\min} \bar f^\downarrow\left(\frac{x}{l_i^{\min}}\right) = \psi_i^\downarrow(x) = \gamma_i(x)$. Also, observe that $l_i^{\min} \beta_{r, \frac{T(r)}{l_i^{\min}}}\left(\frac{x}{l_i^{\min}}\right) = \beta_{r, T(r)}(x)$.

Combine the last paragraphs to conclude that $\beta_{r, T(r)} \le \gamma_i$ for all $r$ in $[r_0^*, r_{k^*}^*]$.

3) We now show that for any $r \in [r_0^*, r_{k^*}^*]$, $\beta_{r, T(r)}$ is a non-dominated lower-bound of $\gamma_i$. Let $r' \ge 0$, $T' \ge 0$ such that $\beta_{r, T(r)} \le \beta_{r', T'} \le \gamma_i$. We have to show that $r' = r$ and $T' = T(r)$.

First, if $r$ is in $[r_0^*, r_{k^*}^*)$, observe that $\beta_{r, T(r)}(x) = \gamma_i(x)$ for $x = \psi_i(k\, l_i^{\min}) > \psi_i(k\, l_i^{\min}) - \frac{k\, l_i^{\min}}{r} = T(r)$. Then, apply Lemma 13 with $\beta_{r,T} = \beta_{r, T(r)}$ and $f = \gamma_i$ to conclude that $r' = r$ and $T' = T(r)$.

Second, if $r = r_{k^*}^*$, observe that $\beta_{r, T(r)}(x) = \gamma_i(x)$ for $x = \psi_i(k^* l_i^{\min}) + L_{tot} > T(r)$. Again, apply Lemma 13 with $\beta_{r,T} = \beta_{r, T(r)}$ and $f = \gamma_i$ to conclude that $r' = r$ and $T' = T(r)$.

4) We now show that there is no other non-dominated rate-latency function, $\beta_{r', T'}$, that is upper bounded by $\gamma_i$.

First, we must have $T' \ge T(r_0^*)$. This is because $\gamma_i(x) = 0$ for $x \le \psi_i(0) = T(r_0^*)$.

Second, we must have $r' \ge r_0^*$. Otherwise, we have $r' < r_0^*$ and we previously showed $T' \ge T(r_0^*)$. Thus, $\beta_{r', T'} \le \beta_{r_0^*, T(r_0^*)} \le \gamma_i$, which is in contradiction with $\beta_{r', T'}$ being non-dominated.

Third, we must have $r' \le r_{k^*}^*$. We proceed to prove this by contradiction. If $T' \ge T(r_{k^*}^*)$ and $r' > r_{k^*}^*$, observe that $\beta_{r', T'}(x_0) = \beta_{r_{k^*}^*, T(r_{k^*}^*)}(x_0)$ with $x_0 = \frac{r' T' - r_{k^*}^* T(r_{k^*}^*)}{r' - r_{k^*}^*}$ and $\forall x$, $x > x_0 \Rightarrow \beta_{r', T'}(x) > \beta_{r_{k^*}^*, T(r_{k^*}^*)}(x)$; for any arbitrary, non-negative integer $k$, let $x_k$ be defined by $x_k = \psi_i(k^* l_i^{\min}) + k L_{tot}$. Then observe that $\beta_{r_{k^*}^*, T(r_{k^*}^*)}(x_k) = \gamma_i(x_k)$. Choose some $k$ large enough such that $x_k > x_0$; then, $\beta_{r', T'}(x_k) > \beta_{r_{k^*}^*, T(r_{k^*}^*)}(x_k) = \gamma_i(x_k)$, which is in contradiction with $\beta_{r', T'} \le \gamma_i$. Also, if $T' < T(r_{k^*}^*)$ and $r' > r_{k^*}^*$, we have $\forall x$, $x > T' \Rightarrow \beta_{r', T'}(x) > \beta_{r_{k^*}^*, T(r_{k^*}^*)}(x)$. Choose some $k$ large enough such that $x_k > T'$; then, $\beta_{r', T'}(x_k) > \beta_{r_{k^*}^*, T(r_{k^*}^*)}(x_k) = \gamma_i(x_k)$, which is in contradiction with $\beta_{r', T'} \le \gamma_i$. Therefore, $r' > r_{k^*}^*$ is in contradiction with $\beta_{r', T'} \le \gamma_i$.

Therefore, we must have $r'$ in $[r_0^*, r_{k^*}^*]$. We now show that $T' = T(r')$. Otherwise, if $T' < T(r')$, we have $\beta_{r', T(r')} \le \beta_{r', T'} \le \gamma_i$, which is in contradiction with $\beta_{r', T(r')}$ being a non-dominated rate-latency function. Also, if $T' > T(r')$, we have $\beta_{r', T'} \le \beta_{r', T(r')} \le \gamma_i$, which is in contradiction with $\beta_{r', T'}$ being non-dominated.

C. Tightness Proofs

We use the following Lemma about the lower pseudo-inverse technique.

Lemma 14. For a right-continuous function $f$ in $\mathcal{F}$ and $x, y$ in $\mathbb{R}^+$, $f^\downarrow(y) = x$ if and only if $f(x) \ge y$ and there exists some $\varepsilon > 0$ such that $\forall x' \in (x - \varepsilon, x)$, $f(x') < y$.

Proof. $\Rightarrow$:
Let $S = \{x', f(x') \ge y\}$ so that $x = \inf S$ (2). From the definition of an inf, there exists a sequence $x_n$ such that $x_n \in S$ for all $n$, $x_n \ge x$, and $\lim_{n \to \infty} x_n = x$. Since $f$ is right-continuous, $\lim_{n \to \infty} f(x_n) = f(x)$, which shows that $f(x) \ge y$. Also, again by definition of an inf, any $x' < x$ does not belong to $S$, i.e. $\forall x' < x$, $f(x') < y$.
$\Leftarrow$:
By the first part of the hypothesis, $x \in S$, therefore $x \ge \inf S = f^\downarrow(y)$. Let also $S' = \{x', f(x') < y\}$ so that $f^\downarrow(y) = \sup S'$ (2). By the second part of the hypothesis, $S'$ contains the interval $(x - \varepsilon, x)$, hence $\sup S' \ge x$, which shows that $f^\downarrow(y) \ge x$. Combining the two shows that $f^\downarrow(y) = x$.

Proof of Theorem 4. We prove that, for any value of the system parameters, for any $\tau > 0$, and for any flow $i$, there exists one trajectory of a system such that
$$\exists s \ge 0,\ (s, s + \tau] \text{ is backlogged for flow } i \text{ and } R_i^*(s + \tau) - R_i^*(s) = \beta_i(\tau) \tag{48}$$
Step 1: Constructing the Trajectory
1) Flows are labeled in order of weights, i.e., $w_j \le w_{j+1}$.
2) At time 0, the input of every queue $j \ne i$ is a burst of size $\left\lceil \frac{\beta(\tau)}{l_j^{\max}} \right\rceil l_j^{\max} + w_j l_j^{\max}$.
3) Every flow, $j \ne i$, is packetized according to its maximum packet size, $l_j^{\max}$.
4) The output of the system is at rate $K$ (the Lipschitz constant of $\beta$) from time 0 to time $s$, which is defined as the time at which queue $i$ is visited at cycle $w_i$ in the first round, namely
$$s = \frac{1}{K} \sum_{j, j \ne i} \min(w_i - 1, w_j)\, l_j^{\max} \tag{49}$$
It follows that
$$\forall t \in [0, s],\ R^*(t) = Kt \tag{50}$$
5) The input of queue $i$ starts just after time $s$, with a burst of size $\left\lceil \frac{\beta(\tau)}{l_i^{\min}} \right\rceil l_i^{\min}$.
6) Flow $i$ is packetized according to its minimum packet size, $l_i^{\min}$.
7) After time $s$, the output of the system is equal to the guaranteed service; by 2) and 5), the busy period lasts for at least $\tau$, i.e.,
$$\forall t \in [s, s + \tau],\ R^*(t) = R^*(s) + \beta(t - s) \tag{51}$$
In particular,
$$R^*(s + \tau) - R^*(s) = \beta(\tau) \tag{52}$$
If we apply $\psi_i^\downarrow$ to both sides of (52), the right-hand side is equal to $\beta_i(\tau)$. Thereby, we should prove:
$$\psi_i^\downarrow\left(R^*(s + \tau) - R^*(s)\right) = R_i^*(s + \tau) - R_i^*(s) \tag{53}$$
Let $y = R^*(s + \tau) - R^*(s)$ and $x = R_i^*(s + \tau) - R_i^*(s)$. Our goal is now to prove that
$$\psi_i^\downarrow(y) = x \tag{54}$$
From 5), we know that the first packet of flow $i$ is served at the first cycle of a round ($C = 1$ in Algorithm 1). Thus, applying Lemma 2 and (P5) in Lemma 3, the number of services to each flow $j$ is equal to $\phi_{i,j}(p)$. From 2), flow $j$ sends packets with the maximum length. Thus:
$$\sum_{j, j \ne i} \left(R_j^*(s + \tau_{\sigma(p)}) - R_j^*(s)\right) = \sum_{j, j \ne i} \phi_{i,j}(p)\, l_j^{\max} \tag{55}$$
Now there are two cases for $s + \tau$ (VI-A).

Case 1: $s + \tau < \tau_{\sigma(p)}$. In this case the scheduler is not serving flow $i$ in $[\tau_{\sigma(p)}, s + \tau]$ and $x = p\, l_i^{\min}$. Thus $R_i^*(s + \tau) = R_i^*(\tau_{\sigma(p)})$. It follows that
$$\psi_i(x) = x + \underbrace{\sum_{j, j \ne i} \phi_{i,j}\left(\left\lfloor \frac{x}{l_i^{\min}} \right\rfloor\right) l_j^{\max}}_{\sum_{j, j \ne i} R_j^*(\tau_{\sigma(p)}) - R_j^*(s)} \tag{56}$$
$$y = x + \sum_{j, j \ne i} \left(R_j^*(s + \tau) - R_j^*(s)\right)$$
and thus
$$\psi_i(x) \ge y \tag{57}$$
Let $x - l_i^{\min} < x' < x$; flow $i$'s output becomes equal to $x'$ during the emission of packet $p - 1$, thus
$$\psi_i(x') = x' + \sum_{j, j \ne i} \left(R_j^*(\tau_{\sigma(p-1)}) - R_j^*(s)\right) \tag{58}$$
Hence
$$\forall x' \in (x - l_i^{\min}, x),\ \psi_i(x') < y \tag{59}$$
Combining (57) and (59) with Lemma 14 shows (54).

Case 2: $s + \tau \ge \tau_{\sigma(p)}$. In this case the scheduler is serving flow $i$ in $[\tau_{\sigma(p)}, s + \tau]$. For every other flow $j$, we have $R_j^*(s + \tau) = R_j^*(\tau_{\sigma(p)})$. Hence,
$$\psi_i(x) = R_i^*(s + \tau) - R_i^*(s) + \sum_{j, j \ne i} \phi_{i,j}(p)\, l_j^{\max} = y \tag{60}$$
As with case 1, for any $x' \in ((p - 1)\, l_i^{\min}, x)$, we have $\psi_i(x') < y$, which shows (54).

This shows that (48) holds. It remains to show that the system constraints are satisfied.

Step 2: Verifying the Trajectory
We need to verify that the service offered to the aggregate satisfies the strict service curve constraint. Our trajectory has one busy period, starting at time 0 and ending at some time $T_{max} \ge \tau$. We need to verify that
$$\forall t_1, t_2 \in [0, T_{max}] \text{ with } t_1 < t_2,\ R^*(t_2) - R^*(t_1) \ge \beta(t_2 - t_1) \tag{61}$$
Case 1: $t_2 < s$. Then $R^*(t_2) - R^*(t_1) = K(t_2 - t_1)$. Observe that, by the Lipschitz continuity condition on $\beta$, for all $t \ge 0$, $\beta(t) = \beta(t) - \beta(0) \le Kt$, thus $K(t_2 - t_1) \ge \beta(t_2 - t_1)$.

Case 2: $t_1 < s \le t_2$. Then $R^*(t_2) - R^*(t_1) = \beta(t_2 - s) + K(s - t_1)$. By the Lipschitz continuity condition:
$$\beta(t_2 - t_1) - \beta(t_2 - s) \le K(s - t_1) \tag{62}$$
thus $R^*(t_2) - R^*(t_1) \ge \beta(t_2 - t_1)$.

Case 3: $s \le t_1 < t_2$. Then, by (51), $R^*(t_2) - R^*(t_1) = \beta(t_2 - s) - \beta(t_1 - s) \ge \beta(t_2 - t_1)$ because $\beta$ is super-additive.
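The three cases of Step 2 can be checked by brute force on a toy instance. The Python sketch below is only an illustration under stated assumptions: the aggregate curve is a hypothetical rate-latency function (which is super-additive and Lipschitz with constant equal to its rate), the value of $s$ is made up rather than computed from (49), and the busy period is assumed to cover the whole integer grid.

```python
RATE, LATENCY = 2, 3   # hypothetical rate-latency parameters of beta
K = RATE               # Lipschitz constant of beta
S = 5                  # hypothetical first visit time of queue i
T_MAX = 40             # assumed end of the busy period


def beta(t):
    """Hypothetical aggregate strict service curve beta(t) = RATE * [t - LATENCY]^+."""
    return RATE * max(t - LATENCY, 0)


def r_star(t):
    """Aggregate output of the trajectory: (50) on [0, S], then (51)."""
    return K * t if t <= S else K * S + beta(t - S)


# Check (61) for every pair t1 < t2 on an integer grid.
violations = [(t1, t2)
              for t1 in range(T_MAX + 1)
              for t2 in range(t1 + 1, T_MAX + 1)
              if r_star(t2) - r_star(t1) < beta(t2 - t1)]
print(violations)
```

On this instance no pair should violate (61), in line with Cases 1 to 3.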

Proof of Theorem 5. The proof is very similar to the proof
of Theorem 4. The necessary changes in the proof are the
following:
1) s is the time of the first visit to flow i.
2) Instead of functions $\psi_i$ and $\phi_{i,j}$, use functions $\psi_i'$ and $\phi_{i,j}'$, defined in (34) and (35).
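The substitution in item 2) can be made concrete by tabulating both functions. This is a minimal sketch with hypothetical weights: `phi_iwrr` follows (8) and `phi_wrr` follows (35), and both function names are illustrative.

```python
def phi_iwrr(x, w_i, w_j):
    """phi_{i,j}(x) for IWRR, as in (8)."""
    return (x // w_i) * w_j + max(w_j - w_i, 0) + min(x % w_i + 1, w_j)


def phi_wrr(x, w_i, w_j):
    """phi'_{i,j}(x) for WRR, as in (35)."""
    return (1 + x // w_i) * w_j


# Hypothetical weights for the flow of interest i and another flow j.
w_i, w_j = 4, 6
rows = [(x, phi_iwrr(x, w_i, w_j), phi_wrr(x, w_i, w_j)) for x in range(2 * w_i)]
for x, iwrr, wrr in rows:
    print(x, iwrr, wrr)
```

The IWRR count never exceeds the WRR count, and the two coincide when $x \bmod w_i = w_i - 1$, i.e., at the end of each round.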
Proof of Theorem 6. The proof contains the following steps:
1) Consider the same trajectory as in the proof of Theorem 4, yet with one difference: the input of flow $i$ is $R_i(t) = \alpha_i(t - s)$ for $t \ge s$ and zero before $s$. Observe that, as $\alpha_i$ is sub-additive, $\forall t_1, t_2 : t_2 \ge t_1 \ge s \Rightarrow R_i(t_2) - R_i(t_1) = \alpha_i(t_2 - s) - \alpha_i(t_1 - s) \le \alpha_i(t_2 - t_1)$.
2) Define $s_0 = \inf\{u > 0 \mid \alpha_i(u) \le \beta_i(u)\}$. This is the first time after zero that the service curve meets the arrival curve. Note that $s_0$ can be infinite as well.
3) Then, it is guaranteed that flow $i$ is backlogged in $(s, s + s_0]$. Therefore, using (48), we have $R_i^*(t) = \beta_i(t - s)$ for $t \ge s$ and zero before $s$.
4) Combining 1) and 3), the horizontal deviation of $R_i$ and $R_i^*$ in $(s, s + s_0]$ is equal to the horizontal deviation of $\alpha_i$ and $\beta_i$ in $[0, s_0]$.
5) Using [2, Sec. 5.3.3], the horizontal deviation of $\alpha_i$ and $\beta_i$ can be restricted to $[0, s_0]$.
Thereby, we find a valid trajectory (verified in the proof of
Theorem 4) where the delay bound is achieved.
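The horizontal deviation used in this proof can be evaluated numerically for the setting of Section VIII. The Python sketch below is not the authors' code. It assumes a constant-rate aggregate $\beta(t) = ct$, a common packet size $l$ for all flows, and an arrival curve $\alpha(t) = \lceil (rt + b)/l \rceil l$ with burst $b$ given in packets. It also relies on an observation derived from the shape of $\psi_i$, which is an assumption of this sketch: flow $i$ has received $n$ packets once the aggregate has delivered $\psi_i((n-1)l) + l$ bits. The burst value is one made-up draw, and all function names are hypothetical.

```python
from math import inf


def phi_iwrr(x, w_i, w_j):
    """phi_{i,j}(x) for IWRR, as in (8)."""
    return (x // w_i) * w_j + max(w_j - w_i, 0) + min(x % w_i + 1, w_j)


def phi_wrr(x, w_i, w_j):
    """phi'_{i,j}(x) for WRR, as in (35)."""
    return (1 + x // w_i) * w_j


def service_needed(n, i, w, l, phi):
    """psi((n - 1) * l) + l: aggregate bits after which flow i has n packets out."""
    others = sum(phi(n - 1, w[i], w[j]) * l for j in range(len(w)) if j != i)
    return n * l + others


def delay_bound(i, w, l, c, r, burst_pkts, phi):
    """Horizontal deviation between alpha and the residual service curve of flow i."""
    if r >= c * w[i] / sum(w):       # arrival rate must stay below the long-term rate
        return inf
    worst = 0.0
    # Beyond burst_pkts + w[i] packets the deviation decreases round after round.
    for n in range(1, burst_pkts + w[i] + 2):
        t_n = max(0.0, (n - 1 - burst_pkts) * l / r)   # earliest time alpha reaches n*l
        worst = max(worst, service_needed(n, i, w, l, phi) / c - t_n)
    return worst


weights = [22, 27, 28, 30, 30, 34, 41, 45]
l, c, r = 7119, 10e6, 0.5e6          # bits, bit/s, bit/s
burst_pkts = 10                      # hypothetical draw from [1, 20]
bounds = [(delay_bound(i, weights, l, c, r, burst_pkts, phi_iwrr),
           delay_bound(i, weights, l, c, r, burst_pkts, phi_wrr))
          for i in range(len(weights))]
for i, (d_iwrr, d_wrr) in enumerate(bounds, start=1):
    print(f"flow {i}: IWRR {1e3 * d_iwrr:.2f} ms, WRR {1e3 * d_wrr:.2f} ms")
```

Because $\phi_{i,j} \le \phi_{i,j}'$, the IWRR value computed this way can never exceed the WRR value for the same flow and arrival curve.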
Proof of Theorem 7. The proof of Theorem 6 applies here as well, except that we use the trajectory defined in the proof of Theorem 5.
