Streaming Transmission of Poisson Traffic
Won S. Yoon
Cambridge, MA 02142 USA
Email: wonsyoon@alum.mit.edu
Fig. 2. A sample path of an adaptive-rate encoder with encoded bits U_t, and the subtrees corresponding to the time-T packet, U_T = 1.

B. The Decoder

The decoder has its own version of the code tree, on which it employs per-packet MAP decoding. For the packet encoded at time t, the decoder compares the aggregate likelihoods of groups of paths, each group containing all subtrees that share the same time-t branch. Figure 2 shows a sample path of a tree in which two groups of subtrees (corresponding to U_T = 0 and U_T = 1) each contain two subtrees. The decoder compares the aggregate likelihood of the two subtrees with U_T = 1 versus the aggregate likelihood of the two subtrees with U_T = 0.

At stage t + d, the decoder makes a decision on a packet that was transmitted back at stage t by choosing the branch whose corresponding group of subtrees has maximum likelihood. The probability of error for this single-branch decision, as a function of decoding delay d, is derived in the next section.

At any given time, there may be multiple packets in transmission (awaiting decisions). As soon as a packet accumulates enough reliability, it is decoded and released. The per-packet nature of MAP decoding allows packets to depart out of order. In contrast, sequential and ML algorithms decode the best overall path through the entire tree, and therefore data must be decoded in a serial first-in-first-out manner. For our assumption of binary encoding, it turns out that MAP decoding is also serial, and therefore all three algorithms have the same error-versus-delay performance. However, for general non-binary variable-rate transmission, MAP decoding achieves smaller delay than sequential and ML decoding by allowing a later low-rate stage to be decoded before an earlier high-rate stage.

If the queueing system is stable (the arrival rate is strictly less than the channel capacity), then it is known that the busy periods are finite with probability one. Since the number of packets in memory at any time is bounded by the number of packets transmitted in a busy period, we are guaranteed that the required memory and computational time are finite with probability one, although they may be arbitrarily large finite values. In practice, finite code memory can be handled by switching to block-coding mode and holding new data in queue until old data is reliably decoded.

III. ERROR PERFORMANCE FOR STREAM CODES

A fundamental issue for stream codes is the behavior of error probability as a function of decoding delay. For maximum-likelihood decoding, [1] showed that the error probability as a function of decoding delay satisfies the block random coding exponent. We show an analogous result for the case of time-varying code inputs with per-packet MAP decoding.

A. Error-versus-Delay for Real-Time Codes

Before analyzing the error performance of stream codes, we address the subtle but important concept of "error versus delay". Traditional random coding analysis assumes a fixed code length and proves the existence of a good code for that code length. However, this says nothing about the error performance when the same code is used at different code lengths (different decoding delays). This is an important constraint because, in practice, it is impractical to switch codes in the middle of a transmission. Therefore, we need to show the existence of codes that are uniformly good for all code lengths, not just for one particular code length. Similar results have been independently derived in [1] and [2].

Define a real-time optimal code (x_1^T, ..., x_M^T) as having the property that all truncations (x_1^t, ..., x_M^t), t = 1, ..., T, have a probability of error P_{e,m}^t that is upper-bounded by the block random coding error exponent.
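The grouped-likelihood decision rule of Section II-B can be made concrete with a toy sketch. This is an illustration only, not the paper's implementation: the function names are hypothetical, the channel is assumed to be a binary symmetric channel, and the "code tree" is the degenerate tree containing every binary path (a real decoder would sum only over valid code paths, i.e., over the subtrees emanating from each candidate branch).

```python
from itertools import product
import math

def path_likelihood(x, y, p=0.1):
    """Likelihood of observations y given input path x over a BSC(p)."""
    return math.prod((1 - p) if xi == yi else p for xi, yi in zip(x, y))

def map_decide_stage(y, tau, T):
    """Per-packet MAP decision for stage tau: aggregate the likelihoods of
    all length-T paths that share the same stage-tau bit, pick the maximum
    group. (Exhaustive over 2^T paths; a toy, exponential in T.)"""
    group = {0: 0.0, 1: 0.0}
    for x in product((0, 1), repeat=T):  # every path in the toy code tree
        group[x[tau]] += path_likelihood(x, y)
    return max(group, key=group.get)

# Decide the bit sent at stage 0 from four noisy observations.
print(map_decide_stage((0, 1, 1, 0), tau=0, T=4))  # → 0
```

Because this toy tree is unconstrained, paths that differ only in later stages cancel out of the comparison; the point of the sketch is the grouping-and-aggregation step itself, which is exactly the per-packet operation the decoder applies to subtree groups.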
Proposition 3.1. There exists a code (x_1^T, ..., x_M^T) with truncations (x_1^t, ..., x_M^t), t = 1, 2, ..., T, such that P_{e,m}^t is uniformly bounded for every block length t ≤ T and every message m = 1, ..., M,

    P_{e,m}^t ≤ exp{−t E_r(m/t)}   for t ≥ C⁻¹ log M,

where m = log M, E_r(R) = max_{ρ∈[0,1]} {E_0(ρ) − Rρ} is Gallager's reliability function for block codes, and C = E_0(ρ)/ρ is the channel capacity.

The proof uses the Markov inequality and the union bound to show that, for a random ensemble of codes, there is a probability strictly less than one that every code in the ensemble has a truncation that exceeds the average error probability by a specified amount.

B. Error Exponents for Real-Time MAP Decoding

We use the previous result to analyze the error-versus-delay performance of stream codes. Assume infinite code memory (infinite constraint length) for ease of analysis.

At time t, we decode stage τ < t using the observations y_1^t = (y_1, ..., y_t). Define the maximum a posteriori probability of stage τ as Λ(y_1^t) = max_x Pr(X_τ = x | Y_1^t = y_1^t). It can be shown that this is proportional to the sum of likelihoods for paths that share a common stage-τ branch,

    Λ(y_1^t) ∼ max_{x_τ} Σ_{x_1/x_τ} ··· Σ_{x_t} Pr(Y_1^t = y_1^t | X_1^t = x_1^t)

In other words, finding the X_τ that maximizes the a posteriori probability is equivalent to finding the X_τ that maximizes the aggregate likelihood of all subtrees that emanate from it.

Consider making a decision about time t just before reaching time t + d (a decoding delay of d slots). Let M_t denote the number of different inputs possible at time t (i.e., the total number of branches in stage t), which is equal to M_t = 2^{U_t m}, where U_t is again the number of packets encoded at time t. Assume that the correct path corresponds to an information sequence of all zeros. Although the comparison at time t is only amongst M_t − 1 other paths, there are actually many more paths that are potential adversaries. An adversarial path is one that diverged from the correct path at or before time t.

1) Stage t: there are M_t − 1 branches that diverge from the correct path, each with ∏_{i=1}^{d−1} M_{t+i} subsequent tails of length d − 1. The total number of length-d paths that diverged from the correct path at time t is therefore W_t = (M_t − 1) ∏_{i=1}^{d−1} M_{t+i}, which is upper-bounded by W_t ≤ ∏_{i=0}^{d−1} M_{t+i}.

2) Stage (t−1): there are M_{t−1} − 1 branches that diverge from the correct path, each with W_t subsequent tails of length d that are incorrect in stage t. The total number of diverging paths of length d + 1 that are incorrect in stage t is therefore W_{t−1} = (M_{t−1} − 1) W_t, which is upper-bounded by W_{t−1} ≤ ∏_{i=−1}^{d−1} M_{t+i}.

3) In general, in stage (t−k): the total number of diverging paths of length d + k that are incorrect in stage t is upper-bounded by W_{t−k} ≤ ∏_{i=−k}^{d−1} M_{t+i}.

The error contribution of each adversarial group can be upper-bounded by the random coding bound for block codes. Using Proposition 3.1, we know that there exists a single code for which the probability of error is upper-bounded by the random coding bound, for each decoding delay. Taking a union bound over all such adversarial paths, the overall probability of error for decoding stage t as a function of decoding delay d is upper-bounded by

    P_e ≤ (∏_{i=0}^{d−1} M_{t+i})^ρ e^{−dnE_0} + Σ_{k=1}^{∞} (∏_{i=−k}^{d−1} M_{t+i})^ρ e^{−(d+k)nE_0}
        = e^{−dnE_0 + Σ_{i=0}^{d−1} ρm_{t+i}} {1 + Σ_{k=1}^{∞} e^{−knE_0 + Σ_{i=−k}^{−1} ρm_{t+i}}},

where m_{t+i} = log M_{t+i}. Simplifying the notation yields a general expression for the probability of error for decoding stage t after a delay d,

    P_{e,t,d} ≤ [1 + Σ_{j=1}^{N_t} e^{−d_j E_0(ρ) + jmρ}] e^{−dE_0(ρ) + (1+N_d)mρ}    (1)

A lower bound can be derived by considering only the worst-case (highest-rate) past adversary. The difference between the upper and lower bounds appears as different multiplicative coefficients in front of the exponential term. In the asymptotic limit of large decoding delay, the difference in this coefficient becomes negligible.

The random variables N_t and N_d denote the total number of packets encoded in the past [0, t) and in the future (t, t+d], respectively. The random variable d_j is the backwards delay to the j-th previous packet, which is the sum of j inter-arrival times, d_j = Σ_{i=1}^{j} τ_i.

In order to guarantee an average probability of error no greater than ε, the transmission (decoding) delay d must satisfy the following reliability condition

    dE_0 − mρ − N_d mρ − log Σ_{j=0}^{N_t} e^{−d_j E_0 + jmρ} ≥ log(1/ε)

To simplify notation, define σ = log(1/ε)/E_0, µ = E_0/(mρ), and φ(d_1^{N_t}) = (1/E_0) log Σ_{j=0}^{N_t} e^{−d_j E_0 + jmρ}, and aggregate terms on the right-hand side to get a simplified reliability condition,

    d ≥ [σ + 1/µ] + N_d/µ + φ(d_1^{N_t})    (2)

The previous definitions have a useful interpretation: σ represents a setup cost for each new transmission, 1/µ represents the payload of a single packet (therefore µ is the maximum throughput of the channel in packets per slot), and φ represents the workload due to past encodings in the stream.

Therefore, reliable transmission of packet t is achieved when the decoding delay d exceeds a threshold. Examining the three terms on the right-hand side of Equation 2 more carefully: the first bracketed term represents the workload of
a single packet in isolation; the second term represents the workload due to packets encoded in the future; the third term represents the workload due to packets encoded in the past.

Note that if there is no data in the past nor the future (N_t = N_d = 0), then the workload on the right-hand side of Equation 2 is that of a single packet; this is not surprising, as there is only one branching point at the root of the tree and the stream decoder is comparing a set of completely disjoint paths, which are essentially codewords in a block code. As more packets are encoded in the past and future of the stream, the workload (and decoding delay) for packet t increases.

C. A Queueing Model

Using the reliability condition of Equation 2, a packet transmission can be described in queueing-theoretic terms:
1) A new packet enters service with an initial workload of σ + 1/µ + φ, where σ is a fixed setup cost, 1/µ is the packet payload, and φ is the additional work due to existing packets already in service.
2) The new packet causes all existing packets in service to incur an additional workload of 1/µ, equal to the new packet's payload.
3) Every packet in service receives one unit of service per slot. When a packet accumulates sufficient service to satisfy Equation 2, it departs the system.

This model is somewhat reminiscent of the queueing-theoretic notion of processor sharing, but there are two key differences here: (1) the joint workload of packets is less than the sum of their individual workloads, thanks to joint coding efficiency, and (2) packets encoded in the future impose more workload

denote the smallest such k, the optimal block-coding policy is to wait and collect k* packets per codeword. This results in a queueing delay equal to half the transmission delay,

    D*_block = (3/2) σµ/(µ − λ).

B. Stream Codes

As in the block-coding scenario, small values of throughput λ < 1/d_1 imply that each packet can be reliably transmitted before the next packet arrives. In this case, stream codes essentially behave like block codes by transmitting each packet independently with delay equal to D*_stream = d_1.

For larger values of throughput λ, it can be shown that the optimal stream code transmits each packet immediately upon arrival. We call such a policy "greedy" because it transmits each packet as quickly as possible, with minimal queueing. More precisely, for λ ≥ 1/d_1, the reliability condition in Equation 2 can be evaluated with N_d = ⌊dλ⌋ and d_j = j/λ to yield

    D*_stream = (σµ + 1)/(µ − λ) + log(1/(1 − β))/(mρ(µ − λ)),

where β = e^{−E_0/λ + mρ} is the single-stage error probability. The last log term represents the extra delay due to joint decoding with the past (and turns out to be negligible).

Figure 4 illustrates the necessity of a minimum number of packets per encoding, for both block codes and stream codes. In both cases, a sufficiently large number of packets is required per encoding in order to maintain stability for large values of throughput. Namely, the encoder must ensure that k/λ > d_k in order to guarantee that the number of undecoded packets in the system does not grow unbounded.

Fig. 4. (regions labeled "unstable", "unstable", and "stable")
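The three queueing rules of Section C above are easy to simulate directly. The following discrete-slot sketch makes simplifying assumptions: the joint-decoding workload φ is ignored, at most one (Bernoulli) arrival occurs per slot as an approximation of Poisson arrivals, and all parameter values are illustrative, not taken from the paper:

```python
import math
import random

# Simulate the workload queueing model: each arrival brings workload
# sigma + 1/mu (rule 1), charges every packet in service an extra 1/mu
# (rule 2), and every packet drains one unit of workload per slot (rule 3).
random.seed(0)
E0, m, rho, eps = 0.5, 3.0, 1.0, 1e-6   # illustrative assumptions
sigma = math.log(1 / eps) / E0          # setup cost, in slots
mu = E0 / (m * rho)                     # max throughput, packets per slot
lam = 0.5 * mu                          # arrival rate at 50% utilization

in_service = []                         # [remaining workload, arrival slot]
delays = []
for slot in range(200_000):
    if random.random() < lam:                      # Bernoulli arrival
        for pkt in in_service:                     # rule 2: coupling charge
            pkt[0] += 1 / mu
        in_service.append([sigma + 1 / mu, slot])  # rule 1: initial workload
    for pkt in in_service:                         # rule 3: unit service
        pkt[0] -= 1
    delays += [slot - born for w, born in in_service if w <= 0]
    in_service = [pkt for pkt in in_service if pkt[0] > 0]

mean = sum(delays) / len(delays)
predicted = (sigma * mu + 1) / (mu - lam)
print(f"simulated mean delay {mean:.1f} slots vs predicted {predicted:.1f}")
```

The simulated mean delay should land close to (σµ + 1)/(µ − λ): the single-packet workload σ + 1/µ, inflated by the 1/µ coupling charges accumulated from later arrivals during the packet's time in service.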
Fig. 5. Greedy block codes (dashed) compared with greedy stream codes (solid) for an AWGN channel with SNR A = 1, 0.1, and 0.01 from right to left. The lower bound for stream codes (finely dotted) is also plotted to show the near-optimality of greedy policies. (Axes: delay in slots versus utilization factor.)

nearly-optimal performance. Furthermore, the results indicate that stream codes achieve strictly smaller delay than block codes, and that the difference grows larger in the power-limited (wideband) regime. The analysis here closely mirrors that of [3] for block codes.

A. Lower Bound on Delay

A simple lower bound is obtained by ignoring the queueing delay and lower-bounding the streaming transmission delay in Equation 2. Using the randomization technique from [3], we assume that there exists a steady-state average number of packets encoded per slot, Pr(packet encoded in a slot) = α (where α > λ for stability of the queue itself). This results in a random decoding delay in Equation 2, of which we take the expected value with respect to the encoding distribution. On the right-hand side, we recognize that N_d = αd > λd and ignore the log term to obtain the lower bound

    E[D*_stream] ≥ (σµ + 1)/(µ − λ)    (3)
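To get a feel for the lower bound above and the greedy stream-code delay formula from the previous section, both can be evaluated numerically. The parameter values below are illustrative assumptions, not the paper's AWGN experiment values:

```python
import math

# Illustrative parameters (assumptions, not taken from the paper):
E0, m, rho, eps = 0.5, 3.0, 1.0, 1e-6
sigma = math.log(1 / eps) / E0   # setup cost sigma = log(1/eps)/E0, in slots
mu = E0 / (m * rho)              # max throughput mu = E0/(m*rho), packets/slot

def lower_bound(lam):
    """Equation 3: E[D*_stream] >= (sigma*mu + 1)/(mu - lam)."""
    return (sigma * mu + 1) / (mu - lam)

def greedy_delay(lam):
    """Greedy delay: the lower-bound term plus the joint-decoding log term."""
    beta = math.exp(-E0 / lam + m * rho)  # needs lam < mu so that beta < 1
    return lower_bound(lam) + math.log(1 / (1 - beta)) / (m * rho * (mu - lam))

lam = 0.8 * mu                            # operate at 80% utilization
print(f"lower bound {lower_bound(lam):.1f} slots, "
      f"greedy {greedy_delay(lam):.1f} slots")
```

With these numbers the extra log term adds only a few slots on top of a delay of well over a hundred, consistent with the claim that joint decoding with the past contributes negligibly to the greedy policy's delay.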
B. Upper Bound on Delay

An upper bound is obtained by analyzing a particular policy. Consider a greedy policy that encodes each packet immediately upon arrival, with no queueing. The transmission delay in Equation 2 is analyzed by substituting N_d = dλ,

    E[D*_stream] ≤ (σµ + 1)/(µ − λ) + E[(1/(mρ)) log Σ_{j=0}^{N_t} e^{−d_j E_0 + jmρ}]/(µ − λ)    (4)

The expected value is taken with respect to the distribution of the inter-arrival times τ_i, i = 1, ..., j, subject to the constraint Σ_{i=1}^{j} τ_i = d_j < t and the constraint that the interval [0, t) must be a busy period. As in the case of deterministic arrivals, this log term turns out to be negligible.

The results of these expressions are plotted in Fig. 5 for an AWGN channel with different values of SNR. It can be seen that the upper and lower bounds are very close, implying that greedy policies are nearly optimal for Poisson arrivals. This is similar to the result in [3], which showed that greedy policies are nearly optimal for rate-adaptive block codes. For comparison, we have independently obtained delay bounds for block codes,

    (3/2) σµ/(µ − λ) ≤ E[D*_block] ≤ (3/2)(σµ + 1)/(µ − λ)    (5)

Our results indicate that stream codes achieve strictly smaller delay than block codes, and that the gap increases for smaller values of SNR.

VI. CONCLUSIONS AND DISCUSSION

We have analyzed the use of streaming channel codes for transmitting bursty data. We developed a general random coding framework that encompasses both block codes and stream codes, and derived the error exponent as a function of delay for real-time MAP decoding. The fundamental quantity that governs the real-time error exponent is the block coding reliability function. This adds to the analogous results shown by other authors for ML decoding and sequential decoding, and suggests that Gallager's block coding error exponent determines the fundamental behavior of error versus delay for coding systems under a wide variety of decoding algorithms.

Under the assumption that the packet arrival times have either a deterministic or Poisson distribution, we have:
• Obtained bounds on the optimal delay of stream codes, and showed that they appear to be very tight.
• Analyzed the difference in delay between optimal stream codes and optimal block codes, and showed that the difference is more significant in the power-limited regime.
• Showed that greedy rate control policies achieve nearly optimal delay for stream codes.

In the result for greedy policies, Equation 4, the fact that the log term (representing the delay due to jointly decoding with the past) was negligible is likely due to the rather uniform nature of deterministic and Poisson arrivals. For more bursty arrivals with a high probability of short inter-arrivals, we suspect that this extra delay term will be more significant and will degrade the performance of greedy policies.

Finally, note that the common denominator appearing in Equations 3 through 5 is proportional to E_0(ρ) − λmρ, which is Gallager's reliability function. This implies a fundamental relationship between physical-layer reliability and network-layer delay: for a fixed channel coding strategy, the resulting error exponent determines the gap between the arrival rate and the system capacity, µ − λ, which is a fundamental measure of performance in queueing theory. This is analogous to the well-known information-theoretic fact that the error exponent represents the gap between code rate and channel capacity.

REFERENCES

[1] G. D. Forney, "Convolutional codes II: Maximum-likelihood decoding," Information and Control, vol. 25, pp. 222-266, 1974.
[2] A. Sahai, "Why block length and delay are not the same thing," submitted to IEEE Trans. Inform. Theory, 2006; preprint arXiv:cs.IT/0610138.
[3] S. Musy and E. Telatar, "On the transmission of bursty sources," in Proc. 2006 IEEE Int. Symp. on Information Theory, Seattle, WA, July 2006.