Professional Documents
Culture Documents
1311 2012 PDF
1311 2012 PDF
Abstract—This paper investigates the maximal achievable rate to infinity. In this case, the channel is said to be in outage.
for a given blocklength and error probability over quasi-static For fading distributions for which the fading coefficient can be
multiple-input multiple-output (MIMO) fading channels, with and arbitrarily small (such as for Rayleigh, Rician, or Nakagami
without channel state information (CSI) at the transmitter and/or
arXiv:1311.2012v2 [cs.IT] 16 Apr 2014
the receiver. The principal finding is that outage capacity, de- fading), the probability of an outage is positive. Hence, the
spite being an asymptotic quantity, is a sharp proxy for the finite- overall block error probability is bounded away from zero
blocklength fundamental limits of slow-fading channels. Specif- for every positive rate R > 0, in which case the Shannon
ically, the channel dispersion is shown to be zero regardless of capacity is zero. More generally, the Shannon capacity depends
whether the fading realizations are available at both transmitter on the fading probability density function (pdf) only through its
and receiver, at only one of them, or at neither of them. These re-
sults follow from analytically tractable converse and achievability support [6], [7].
bounds. Numerical evaluation of these bounds verifies that zero For applications in which a positive block error probability
dispersion may indeed imply fast convergence to the outage capac- > 0 is acceptable, the maximal achievable rate as a function of
ity as the blocklength increases. In the example of a particular 1×2 the outage probability (also known as capacity versus outage) [1,
single-input multiple-output (SIMO) Rician fading channel, the p. 2631], [8], may be a more relevant performance metric than
blocklength required to achieve 90% of capacity is about an order
of magnitude smaller compared to the blocklength required for an Shannon capacity. The capacity versus outage coincides with
AWGN channel with the same capacity. For this specific scenario, the -capacity C (which is the largest achievable rate under the
the coding/decoding schemes adopted in the LTE-Advanced stan- assumption that the block error probability is less than > 0) at
dard are benchmarked against the finite-blocklength achievability the points where C is a continuous function of [7, Sec. IV].
and converse bounds. For the sake of simplicity, let us consider for a moment a
single-antenna communication system operating over a quasi-
I. I NTRODUCTION static flat-fading channel. The outage probability as a function
of the rate R is defined by
Consider a delay-constrained communication system operat-
F (R) = P log(1 + |H|2 ρ) < R .
ing over a slowly-varying fading channel. In such a scenario, it (1)
is plausible to assume that the duration of each of the transmitted Here, H denotes the random channel gain and ρ is the signal-
codewords is smaller than the coherence time of the channel, so to-noise ratio (SNR). For a given > 0, the outage capacity
the random fading coefficients stay constant over the duration (or -capacity) C is the supremum of all rates R satisfying
of each codeword [1, p. 2631], [2, Sec. 5.4.1]. We shall refer to F (R) ≤ . The rationale behind this definition is that, for every
this channel model as quasi-static fading channel.1 realization of the fading coefficient H = h, the quasi-static
When communicating over quasi-static fading channels at a fading channel can be viewed as an AWGN channel with channel
given rate R, the realization of the random fading coefficient may gain |h|2 , for which communication with arbitrarily small block
be very small, in which case the block (frame) error probability error probability is feasible if and only if R < log(1 + |h|2 ρ),
is bounded away from zero even if the blocklength n tends provided that the blocklength n is sufficiently large. Thus, the
This work was supported in part by the Swedish Research Council under
outage probability can be interpreted as the probability that the
grant 2012-4571, by the Ericsson Research Foundation under grant FOSTIFT- channel gain H is too small to allow for communication with
12:022, by a Marie Curie FP7 Integration Grant within the 7th European arbitrarily small block error probability.
Union Framework Programme under Grant 333680, by the Spanish government
(TEC2009-14504-C02-01, CSD2008-00010, and TEC2012-38800-C03-01), and
A major criticism of this definition is that it is somewhat
by the National Science Foundation under Grant CCF-1253205. The material contradictory to the underlying motivation of the channel model.
of this paper was presented in part at the 2013 and 2014 IEEE International Indeed, while log(1 + |h|2 ρ) is meaningful only for codewords
Symposium on Information Theory.
W. Yang and G. Durisi are with the Department of Signals and Systems,
of sufficiently large blocklength, the assumption that the fading
Chalmers University of Technology, 41296, Gothenburg, Sweden (e-mail: {ywei, coefficient is constant during the transmission of the codeword is
durisi}@chalmers.se). only reasonable if the blocklength is smaller than the coherence
T. Koch is with the Signal Theory and Communications Department, Univer-
sidad Carlos III de Madrid, 28911, Leganés, Spain (e-mail: koch@tsc.uc3m.es).
time of the channel. In other words, it is prima facie unclear
Y. Polyanskiy is with the Department of Electrical Engineering and Computer whether for those blocklengths for which the quasi-static chan-
Science, MIT, Cambridge, MA, 02139 USA (e-mail: yp@mit.edu). nel model is reasonable, the outage capacity is a meaningful
1 The term “quasi-static” is widely used in the communication literature (see,
performance metric.
e.g., [2, Sec. 5.4.1], [3]). The quasi-static channel model belongs to the general
class of composite channels [1, p. 2631], [4] (also known as mixed channels [5, In order to shed light on this issue, we study the maximal
Sec. 3.3]). achievable rate R∗ (n, ) for a given blocklength n and block
2
error probability over a quasi-static multiple-input multiple- Fast convergence to the outage capacity provides mathematical
output (MIMO) fading channel, subject to a per-codeword power support to the observation reported by several researchers in
constraint. the past that the outage probability describes accurately the
Previous results: Building upon Dobrushin’s and Strassen’s performance over quasi-static fading channels of actual codes
asymptotic results, Polyanskiy, Poor, and Verdú recently showed (see [15] and references therein).
that for various channels with positive Shannon capacity C, the The following example supports our claims: for a 1 × 2 single-
maximal achievable rate can be tightly approximated by [9] input multiple-output (SIMO) Rician-fading channel with C =
r 1 bit/channel use and = 10−3 , the blocklength required to
V −1 log n
∗
R (n, ) = C − Q () + O . (2) achieve 90% of C for the perfect CSIR case is between 120 and
n n 320 (see Fig. 2 on p. 10), which is about an order of magnitude
−1
Here, Q (·) denotes the inverse of the Gaussian Q-function smaller compared to the blocklength required for an AWGN
Z ∞ channel with the same capacity (see [9, Fig. 12]).
1 −t2 /2 Fast convergence to the outage capacity further suggests that
Q(x) , √ e dt (3)
x 2π communication strategies that are optimal with respect to outage
and V is the channel dispersion [9, Def. 1]. The approxima- capacity may perform also well at finite blocklength. Note, how-
tion (2) implies that to sustain the desired error probability at ever, that−1 this need not be true for very small blocklengths, where
a finite blocklength n, one pays a penalty on the rate the O(n log n) term in (4) may dominate. Thus, for small n
√ (compared the derived achievability and converse bounds on R∗ (n, ) may
to the channel capacity) that is proportional to 1/ n.
Recent works have extended (2) to some ergodic fading chan- behave differently than the outage capacity. Table I summarizes
nels. Specifically, the dispersion of single-input single-output how the ∗
outage capacity and the achievability/converse bounds
(SISO) stationary fading channels for the case when channel state on R (n, ) derived in this paper depend on system parameters
information (CSI) is available at the receiver was derived in [10]. such as the availability of CSI and the number of antennas at
This result was extended to block-memoryless fading channels the transmitter/receiver. These observations may be relevant for
in [11]. Upper and lower bounds on the second-order coding delay-constrained communication over slowly-varying fading
rate of quasi-static MIMO Rayleigh-fading channels have been channels.
reported in [12] for the asymptotically ergodic setup when the Proof techniques: Our converse bounds on R∗ (n, ) are
number of antennas grows linearly with the blocklength. A lower based on the meta-converse theorem [9, Th. 30]. Our achievabil-
bound on R∗ (n, ) for the imperfect CSI case has been developed ity bounds on R∗ (n, ) are based on the κβ bound [9, Th. 25]
in [13]. The second-order coding rate of single-antenna quasi- applied to a stochastically degraded channel, whose choice is
static fading channels for the case of perfect CSI and long-term motivated by geometric considerations. The main tools used
power constraint has been derived in [14]. to establish (4) are a Cramer-Esseen-type central-limit theo-
rem [16, Th. √ VI.1] and a result on the speed of convergence
Contributions: We provide achievability and converse
of P[B > A/ n] to P[B > 0] for n → ∞, where A and B are
bounds on R∗ (n, ) for quasi-static MIMO fading channels. We
independent random variables.
consider both the case when the transmitter has full transmit CSI
Notation: Upper case letters such as X denote scalar ran-
(CSIT) and, hence, can perform spatial water-filling, and the case
dom variables and their realizations are written in lower case,
when no CSIT is available. Our converse results are obtained
e.g., x. We use boldface upper case letters to denote random
under the assumption of perfect receive CSI (CSIR), whereas
vectors, e.g., X, and boldface lower case letters for their real-
the achievability results are derived under the assumption of no
izations, e.g., x. Upper case letters of two special fonts are used
CSIR.
to denote deterministic matrices (e.g., Y) and random matrices
By analyzing the asymptotic behavior of our achievability
(e.g., Y). The superscripts T and H stand for transposition and
and converse bounds, we show that under mild conditions on
Hermitian transposition, respectively. We use tr(A) and det(A)
the fading distribution,2
to denote the trace and determinant of the matrix A, respectively,
log n and use span(A) to designate the subspace spanned by the
R∗ (n, ) = C + O . (4)
n column vectors of A.pThe Frobenius norm of a matrix A is
denoted by kAkF , tr(AAH ). The notation A 0 means
This results holds both for the case of perfect CSIT and for the
that the matrix A is positive semi-definite. The function resulting
case of no CSIT, and independently on whether CSIR is available
from the composition of two functions f and g is denoted by
at the receiver or not. By comparing (2) with √ (4), we observe g ◦ f , i.e., (g ◦ f )(x) = g(f (x)). For two functions f (x)
that for the quasi-static fading case, the 1/ n rate penalty is
and g(x), the notation f(x) = O(g(x)), x → ∞, means that
absent. In other words, the -dispersion (see [9, Def. 2] or (52) < ∞, and f (x) = o(g(x)), x → ∞,
lim supx→∞ f (x)/g(x)
below) of quasi-static fading channels is zero. This suggests
∗ means that lim x→∞
f (x)/g(x) = 0. We use Ia to denote the
that the maximal achievable rate R (n, ) converges quickly
identity matrix of size a × a, and designate by Ia,b (a > b)
to C as n tends to infinity, thereby indicating that the outage
the a × b matrix containing the first b columns of Ia . The
capacity is indeed a meaningful performance metric for delay-
distribution of a circularly-symmetric complex Gaussian random
constrained communication over slowly-varying fading channels.
vector with covariance matrix A is denoted by CN (0, A), the
2 These conditions are satisfied by the fading distributions commonly used in Wishart distribution [18, Def. 2.3] with n degrees of freedom and
the wireless communication literature (e.g., Rayleigh, Rician, Nakagami). covariance matrix A defined on matrices of size m×m is denoted
3
TABLE I
O UTAGE CAPACITY VS . FINITE BLOCKLENGTH WISDOM ; t IS THE NUMBER OF TRANSMIT ANTENNAS .
by Wm (n, A), and the Beta distribution [19, Ch. 25] is denoted no, tx, rx, and rt, respectively. Next, we introduce the notion of
by Beta(·, ·). The symbol R+ stands for the nonnegative real a channel code for each of these four settings.
line, Rm m
+ ⊂ R is the nonnegative orthant of the m-dimensional Definition 1 (no-CSI): An (n, M, )no code consists of:
real Euclidean spaces, and Rm m
≥ ⊂ R+ is defined by i) an encoder fno : {1, . . . , M } 7→ Cn×t that maps the mes-
sage J ∈ {1, . . . , M } to a codeword X ∈ {C1 , . . . , CM }.
Rm m
≥ , {x ∈ R+ : x1 ≥ · · · ≥ xm }. (5)
The codewords satisfy the power constraint
The indicator function is denoted by 1{·}, and [ · ]+ , 2
kCi kF ≤ nρ, i = 1, . . . , M. (9)
max{ · , 0}. Finally, log(·) is the natural logarithm.
Given two distributions P and Q on a common measurable ii) A decoder gno : Cn×r 7→ {1, . . . , M } satisfying a maximum
space W, we define a randomized test between P and Q as a probability of error constraint
random transformation PZ | W : W 7→ {0, 1} where 0 indicates
that the test chooses Q. We shall need the following performance max P[gno (Y) 6= J | J = j] ≤ (10)
1≤j≤M
metric for the test between P and Q:
Z where Y is the channel output induced by the transmitted
βα (P, Q) , min PZ | W (1 | w)Q(dw) (6) codeword X = fno (j) according to (8).
Definition 2 (CSIR): An (n, M, )rx code consists of:
where the minimum is over all probability distributions PZ | W i) an encoder fno : {1, . . . , M } 7→ Cn×t that maps the mes-
satisfying sage J ∈ {1, . . . , M } to a codeword X ∈ {C1 , . . . , CM }.
Z The codewords satisfy the power constraint (9).
PZ | W (1 | w)P (dw) ≥ α. (7) ii) A decoder grx : Cn×r × Ct×r 7→ {1, . . . , M } satisfying
max P[grx (Y, H) 6= J | J = j] ≤ . (11)
1≤j≤M
II. S YSTEM M ODEL
Definition 3 (CSIT): An (n, M, )tx code consists of:
We consider a quasi-static MIMO fading channel with t
transmit and r receive antennas. Throughout this paper, we i) an encoder ftx : {1, . . . , M } × Ct×r 7→ Cn×t that maps the
denote the minimum number of transmit and receive antennas message j ∈ {1, . . . , M } and the channel H to a codeword
by m, i.e., m , min{t, r}. The channel input-output relation is X = ftx (j, H) satisfying
given by 2 2
kXkF = kftx (j, H)kF ≤ nρ,
Y = XH + W. (8) ∀j = 1, . . . , M, ∀H ∈ Ct×r . (12)
Here, X ∈ Cn×t is the signal transmitted over n channel uses; ii) A decoder gno : Cn×r 7→ {1, . . . , M } satisfying (10).
Y ∈ Cn×r is the corresponding received signal; the matrix H ∈ Definition 4 (CSIRT): An (n, M, )rt code consists of:
Ct×r contains the complex fading coefficients, which are random i) an encoder ftx : {1, . . . , M } × Ct×r 7→ Cn×t that maps the
but remain constant over the n channel uses; W ∈ Cn×r denotes message j ∈ {1, . . . , M } and the channel H to a codeword
the additive noise at the receiver, which is independent of H X = ftx (j, H) satisfying (12).
and has independent and identically distributed (i.i.d.) CN (0, 1) ii) A decoder grx : Cn×r × Ct×r 7→ {1, . . . , M } satisfy-
entries. ing (11).
We consider the following four scenarios: The maximal achievable rate for the four cases listed above
1) no-CSI: neither the transmitter nor the receiver is aware of is defined as follows:
the realizations of the fading matrix H;
2) CSIT: the transmitter knows H; log M
Rl∗ (n, ) , sup : ∃(n, M, )l code ,
3) CSIR: the receiver knows H; n
4) CSIRT: both the transmitter and the receiver know H. l ∈ {no, rx, tx, rt}. (13)
To keep the notation compact, we shall abbreviate in mathemat- From Definitions 1–4, it follows that
ical formulas the acronyms no-CSI, CSIT, CSIR, and CSIRT as
∗ ∗ ∗
Rno (n, ) ≤ Rtx (n, ) ≤ Rrt (n, ) (14)
3 A channel is reciprocal for a given performance metric (e.g., outage capacity)
∗ ∗ ∗
if substituting H with HH does not change the metric. Rno (n, ) ≤ Rrx (n, ) ≤ Rrt (n, ). (15)
4
θǫ
IV. CSI AVAILABLE AT THE T RANSMITTER y′ y
√
A. Achievability w kw ′ k ≈ kwk ≈ n
w′
In this section, we consider the case where CSI is available h′ x1 hx1 hx1 ,w ′ i
≈ hx1 ,wi
≈0
kx1 kkw ′ k kx1 kkwk
at the transmitter but not at the receiver. Before establishing our
achievability bound in Section IV-A2, we provide some geomet-
ric intuition that will guide us in the choice of the decoder gno
(see Definition 3).
1) Geometric Intuition: Consider for simplicity a real-valued Fig. 1. A geometric illustration of the outage event for large blocklength n. In
quasi-static SISO channel (t = r = 1), i.e., a channel with the example, the fading realization h0 triggers an outage event, h does not.
input-output relation
Roughly speaking, the decoder of a C -achieving code may cos θk , max |ha, bi|,
a ∈ A, b ∈ B : kak = kbk = 1,
commit an error only when the channel is in outage. Pick now an ha, ai i = hb, bi i = 0, i = 1, . . . , k − 1
arbitrary codeword x1 from the hypersphere {x ∈ Rn : kxk2 = k = 1, . . . , a. (39)
nρ}, and let Y be the received signal corresponding to x1 .
Following [21], we analyze the angle θ(x1 , Y ) between x1 Here, ak and bk , k = 1, . . . , a, are the vectors that achieve the
and Y as follows. By the law of large numbers, the noise maximum in (39) at the kth recursion. The angle between the
vector W is approximately orthogonal to x1 if n is large, i.e., subspaces A and B is defined by
hx1 , W i a
→ 0, n → ∞. (36) Y
kx1 kkW k sin{A, B} , sin θk . (40)
k=1
Also by the law of large numbers, kW k2 /n → 1 as n → ∞.
Hence, for a given H and for large n, the angle θ(x1 , Y ) can With a slight abuse of notation, for two matrices A ∈ Cn×a
be approximated as and B ∈ Cn×b , we abbreviate sin{span(A), span(B)} with
kW k sin{A, B}. When the columns of A and B are orthonormal bases
θ(x1 , Y ) ≈ arcsin p (37) for span(A) and span(B), respectively, we have (see, e.g., [22,
H 2 kx1 k2 + kW k2
Sec. I])
1
≈ arcsin p (38)
ρH 2 + 1 sin2 {A, B} = det I − AH BBH A
(41)
= det I − BH AAH B .
where the first approximation follows by (36) and the second (42)
approximation follows because kW k2 ≈ n. It follows from (35)
and (38) that θ(x1 , Y ) is larger than θ , arcsin(e−C ) in the Some additional properties of the operator sin{·, ·} are listed in
outage case, and smaller than θ otherwise (see Fig. 1). Appendix I.
This geometric argument suggests the use of a threshold We are now ready to state our achievability bound.
decoder that, for a given received signal Y , declares xi to be Theorem 1: Let Λ1 ≥ · · · ≥ Λm be the m largest eigenvalues
the transmitted codeword if xi is the only codeword for which of HHH . For every 0 < < 1 and every 0 < τ < , there exists an
θ(xi , Y ) ≤ θ . If no codewords or more than one codeword (n, M, )tx code for the channel (8) that satisfies
meet this condition, the decoder declares an error. Thresholding
angles instead of log-likelihood ratios (cf., [9, Th. 17 and Th. 25]) log M 1 τ
≥ log Qr . (43)
j=1 Bj ≤ γn
appears to be a natural approach when CSIR is unavailable. n n P
Note that the proposed threshold decoder does neither require
CSIR nor knowledge of the fading distribution. As we shall Here, Bj ∼ Beta(n−t−j +1, t), j = 1, . . . , r, are independent
see, it achieves (4) and yields a tight achievability bound at Beta-distributed random variables, and γn ∈ [0, 1] is chosen so
6
that where
m
(n − 1)n e−(n−1) Γ(n, n − 1)
√
n np
P sin2 In,t , nIn,t diag v1∗ Λ1 , . . . , crt (n) , +
Γ(n) Γ(n)
H
p
∗ Λ , 0, . . . , 0
o o × EH det(It + ρHH ) (50)
vm m + W ≤ γn ≥ 1 − + τ (44)
and the scalar γn (v) is the solution of
| {z }
t−m
by analyzing the upper bound (49) in the limit n → ∞. We next is the capacity of the channel (8) when H = H (the water-filling
prove in Appendix V the achievability result power allocation values {vj∗ } in (60) are given in (45) and {λj }
are the eigenvalues of HH H), and
∗ log n
Rtx (n, ) ≥ Ctx + O (57) m
n X 1
V (H) = m − (61)
(1 + vj∗ λj )2
by expanding (43) for n → ∞. The desired result then follows j=1
by (14). is the dispersion of the channel (8) when H = H [23, Th. 78].
Remark 3: As mentioned in Section I, the quasi-static fading Theorem 3 and the expansion
channel considered in this paper belongs to the general class
of composite or mixed channels, whose -dispersion is known N 1
Rrt (n, ) = Ctx + O (62)
in some special cases. Specifically, the dispersion of a mixed n
channel with two states was derived in [24, Th. 7]. This result was (which follows from Lemma 17 in Appendix IV-C and Taylor’s
extended to channels with finitely many states in [25, Th. 4]. √ In theorem) suggest that this approximation is accurate, as con-
both cases, the rate of convergence to the -capacity is O(1/ n) firmed by the numerical results reported in Section VI-A. Note
(positive dispersion), as opposed to O(log(n)/n) in Theorem 3. that the same approximation has been concurrently proposed
Our result shows that moving from finitely many to uncountably in [27]; see also [24, Def. 2] and [25, Sec. 4] for similar approx-
many states (as in the quasi-static fading case) yields a drastic imations for mixed channels with finitely many states.
change in the value of the channel dispersion. For this reason,
our result is not derivable from [24] or [25]. V. CSI N OT AVAILABLE AT THE T RANSMITTER
Remark 4: It can be shown that the assumptions on the fading
A. Achievability
matrix in Theorem 3 are satisfied by most probability distri-
butions used to model MIMO fading channels, such as i.i.d. In this section, we shall assume that neither the transmitter
or correlated Rayleigh, Rician, and Nakagami. However, the nor the receiver have a priori CSI. Using the decoder described
(nonfading) AWGN MIMO channel, which can be seen as a in IV-A, we obtain the following achievability bound.
quasi-static fading channel with fading distribution equal to a Theorem 4: Assume that for a given 0 < < 1 there exists a
∗
step function, does not meet these assumptions and has, in fact, Q ∈ U t such that
positive dispersion [23, Th. 78].
Fno (Cno ) = inf P log det Ir + HH QH ≤ Cno (63)
While zero dispersion indeed may imply fast convergence Q∈Ut
= P log det Ir + HH Q∗ H ≤ Cno
to -capacity, this is not true anymore when the probability (64)
distribution of the fading matrix approaches a step function, in
which case the higher-order terms in the expansion (54) become i.e., the infimum in (63) is a minimum. Then, for every 0 < τ <
more dominant. Consider for example a SISO Rician fading there exists an (n, M, )no code for the channel (8) that satisfies
channel with Rician factor K. For < 1/2, one can refine (54) log M 1 τ
and show that [26] ≥ log Qr . (65)
j=1 Bj ≤ γn
n n P
√
log n c1 K + c2 1 ∗ Here, Bj ∼ Beta(n − t∗ − j + 1, t∗ ), j = 1, . . . , r, are
C − − +o ≤ Rtx (n, )
n n n independent Beta-distributed random variables, t∗ , rank(Q∗ ),
√
log n c̃1 K + c̃2
1 and γn ∈ [0, 1] is chosen so that
∗
≤ Rrt (n, ) ≤ C + − +o (58) √
n n n P sin2 {In,t∗ , nIn,t∗ UH + W} ≤ γn ≥ 1 − + τ (66)
where c1 , c2 , c̃1 and c̃2 are finite constants with c1 > 0 and ∗
with U ∈ Ct ×t satisfying UH U = Q∗ .
c̃1 > 0. As we let the Rician factor K become large, the fading Proof: The proof is identical to the proof of Theorem 1,
distribution converges to a step function and the third term in
both the left-most lower bound and the right-most upper bound √ the precoding matrix P(H) (defined
with the only difference that
in (108)) is replaced by nIn,t∗ U.
becomes increasingly large in absolute value. The assumption in (64) that the -capacity-achieving input
covariance matrix of the channel (8) exists is mild. A sufficient
D. Normal Approximation condition for the existence of Q∗ is given in the following
proposition.
N ∗
h i
We define the normal approximation Rrt (n, ) of Rrt (n, ) 2
Proposition 5: Assume that E kHkF < ∞ and that the
as the solution of
distribution of H is absolutely continuous with respect to the
Lebesgue measure on Ct×r . Then, for every R ∈ R+ , the
" !#
N
C(H) − Rrt (n, )
=E Q p . (59) infimum in (26) is a minimum.
V (H)/n
Proof: See Appendix VI.
Here, For the SIMO case, the RHS of (43) and the RHS of (65)
m
coincide, i.e.,
X
C(H) = log(1 + vj∗ λj ) (60) 1
Rtx (n, ), Rno (n, ) ≥ log
τ
(67)
j=1 n P[B ≤ γn ]
8
where B ∼ Beta(n − r, r), and γn ∈ [0, 1] is chosen so that 1) SIMO case: For the SIMO case, CSIT is not beneficial [26]
√ and the bounds (49) and (71) can be tightened as follows.
P[sin2 {e1 , nρe1 H T + W} ≤ γn ] ≥ 1 − + τ. (68) Theorem 7: Let
n
X 2
Here, e1 stands for the first column of the identity matrix In .
p p
Ln , n log(1 + ρG) + 1 − ρGZi − 1 + ρG
The achievability bound (67) follows from (43) and (65) by i=1
noting that the random
Qvariable B on the RHS of (67) has the (74)
r
same distribution as i=1 Bi , where Bi ∼ Beta(n − i, 1),
i = 1, . . . , r. and
√
ρGZi − 12
n !
X
Sn , n log(1 + ρG) + 1− (75)
i=1
1 + ρG
B. Converse
with G , kHk2 and Zi , i = 1, . . . , n, i.i.d. CN (0, 1) dis-
For the converse, we shall assume CSIR but not CSIT. The
tributed. For every n and every 0 < < 1, the maximal
counterpart of Theorem 2 is the following result.
achievable rate on the quasi-static fading channel (8) with one
Theorem 6: Let Ute be as in (28). For an arbitrary Q ∈ Ute ,
transmit antenna and with CSIR (with or without CSIT) is upper-
let Λ1 ≥ · · · ≥ Λm be the ordered eigenvalues of HH QH. Let
bounded by
n Xm
1 1
X 2 ∗ ∗
Lrx (n − 1, ) ≤ Rrt (n − 1, ) ≤
p p
n (Q) , log(1+Λ j )+1− Λ Z
j ij − 1 + Λ j
Rrx log (76)
n−1 P[Ln ≥ nγn ]
i=1 j=1
(69) where γn is the solution of
and
P[Sn ≤ nγn ] = . (77)
n X m Λj Zij − 12
p
Proof: See [26, Th. 1]. The main difference between the
X
rx
Sn (Q) , log(1 + Λj ) + 1 − (70)
i=1 j=1
1 + Λj proof of Theorem 7 and the proof of Theorem 2 and Theorem 6
is that the simple bound 0 ≥ 1 − 1/M on the maximal
where Zij , i = 1, . . . , n, j = 1, . . . , m, are i.i.d. CN (0, 1) error probability of the auxiliary channel in the meta-converse
distributed. Then, for every n ≥ r and every 0 < < 1, theorem [9, Th. 30] suffices to establish the desired result. The
the maximal achievable rate on the quasi-static MIMO fading more sophisticated bounds reported in Lemma 14 (Appendix III)
channel (8) with CSIR is upper-bounded by and Lemma 19 (Appendix VII) are not needed.
2) Converse for (n, M, )iso codes: In Theorem 8 below,
∗ 1 crx (n) we establish a converse bound on the maximal achievable rate
Rrx (n − 1, ) ≤ log . (71)
n−1 n (Q) ≥ nγn (Q)]
inf e P[Lrx of (n, M, )iso codes introduced in Section III. As such codes
Q∈Ut
consist of isotropic codewords chosen from the set Fiso in (32),
Here, CSIT is not beneficial also in this scenario.
b(r+1)2 /4c Theorem 8: Let Lrx rx
n (·) and Sn (·) be as in (69) and (70),
π r(r−1)
crx (n) , E 1 + ρ kHkF
2 respectively. Then, for every n and every 0 < < 1, the maximal
Γr (n)Γr (r) ∗
achievable rate Rrx,iso (n, ) of (n, M, )iso codes over the quasi-
r
static MIMO fading channel (8) with CSIR is upper-bounded by
Y
n+r−2i+1 −(n+r−2i)
× (n + r − 2i) e
∗ ∗ 1 1
i=1 Rrx,iso (n, ) ≤ Rrt,iso (n, ) ≤ log
n P[Ln ((ρ/t)It ) ≥ nγn ]
rx
+ Γ(n + r − 2i + 1, n + r − 2i) (72) (78)
where γn is the solution of
with Γ(·) (·) denoting the complex multivariate Gamma func-
tion [28, Eq. (83)], and γn (Q) is the solution of P[Snrx ((ρ/t)It ) ≤ nγn ] = . (79)
Proof: The proof follows closely the proof of Theorem 6.
P[Snrx (Q) ≤ nγn (Q)] = . (73)
As in the SIMO case, the main difference is that the simple bound
Proof: See Appendix VII. 0 ≥ 1 − 1/M on the maximal error probability of the auxiliary
channel in the meta-converse theorem [9, Th. 30] suffices to
The infimum in (71) makes the upper bound more diffi-
establish (79).
cult to evaluate numerically and to analyze asymptotically up
to O(log(n)/n) terms than the upper bound (49) that we estab-
lished for the CSIT case. In fact, even the simpler problem of C. Asymptotic Analysis
finding the matrix Q that minimizes lim P[Lrx n (Q) ≥ nγn ] To state our dispersion result, we will need the following
n→∞
is open. Next, we consider two special cases for which the definition of the gradient ∇g of a differentiable function g :
t×r
bound (71) can be tightened and evaluated numerically: the C 7→ R. Let L ∈ Ct×r , then we shall write ∇g(H) = L if
SIMO case and the case where all codewords are chosen from d
= Re tr AH L , ∀A ∈ Ct×r . (80)
g(H + tA)
the set Fiso . dt t=0
9
Theorem 9 below establishes the zero-dispersion result for the Conditions 1 and 2 of Theorem 9, the function Q 7→ FQ0 (Cno )
case of no CSIT. Because of the analytical intractability of the is continuous with respect to the same metric. By the extreme
minimization in the converse bound (71), Theorem 9 requires value theorem [29, p. 34], these two properties imply that the
more stringent conditions on the fading distribution compared to infimum on the RHS of (89) is a minimum. Then, one shows
the CSIT case (cf., Theorem 3), and its proof is more involved. that for Rayleigh, Rician, and Nakagami distributions
Theorem 9: Let fH be the pdf of the fading matrix H. Assume
that H satisfies the following conditions: FQ0 (Cno ) > 0, ∀Q ∈ Q . (90)
1) fH is a smooth function, i.e., it has derivatives of all orders. One way to prove (90) is to write FQ0 (Cno ) in integral form using
2) There exists a positive constant a such that Lemma 22 in Appendix VIII-A1 and to show that the resulting
−2tr−b(r+1)2 /2c−1 integral is positive.
fH (H) ≤ a kHkF (81)
−2tr−5
For the SIMO case, the conditions on the fading distribution
k∇fH (H)kF ≤ a kHkF . (82) can be relaxed and the following result holds.
3) The function Fno (·) satisfies Theorem 10: Assume that the pdf of kHk2 is continuously
differentiable and that the -capacity C is a point of growth for
Fno (Cno + δ) − Fno (Cno ) the outage probability function
lim inf > 0. (83)
δ→0 δ
Then, F (R) = P[log(1 + kHk2 ρ) < R] (91)
∗ ∗ log n i.e., F 0 (C ) > 0. Then,
(n, ) = Cno + O
Rno (n, ), Rrx . (84)
n
∗ ∗
log n
Hence, the -dispersion is zero for both the CSIR and the no-CSI Rno (n, ), Rrx (n, ) = C + O . (92)
n
case:
Proof: In the SIMO case, CSIT is not beneficial [26, Th. 5].
Vno = Vrx = 0, ∈ (0, 1)\{1/2}. (85)
Hence, the result follows directly from Theorem 3 and Proposi-
Proof: See Appendices VIII and IX. tion 23 in Appendix IX.
Remark 5: It can be shown that Conditions 1–3 in Theorem 9 Similarly, for the case of codes consisting of isotropic code-
are satisfied by the probability distributions commonly used to words, milder conditions on the fading distribution are sufficient
model MIMO fading channels, such as Rayleigh, Rician, and to establish zero dispersion, as illustrated in the following theo-
Nakagami. Condition 2 requires simply that fH has a polynomi- rem.
ally decaying tail. Condition 3 plays the same role as (53) in the Theorem 11: Assume that the joint pdf of the nonzero eigen-
CSIT case. The exact counterpart of (53) for the no-CSIT case values of HH H is continuously differentiable and that
would be 0
Fiso (Ciso ) > 0 (93)
0 no
Fno (C ) > 0. (86)
where Fiso is the outage probability function given in (31). Then,
However, different from (53), the inequality (86) does not neces- we have
sarily hold for the commonly used fading distributions. Indeed,
∗ ∗ iso log n
consider a MISO i.i.d. Rayleigh-fading channel. As proven {Rno,iso (n, ), Rrx,iso (n, )} = C + O . (94)
in [20], the -capacity-achieving covariance matrix for this n
case is given by (29). The resulting outage probability function Proof: See Appendix X.
Fno (·) may not be differentiable at the rates R for which the
infimum in (27) is achieved by two input covariance matrices
with different number of nonzero entries t∗ on the main diagonal. D. Normal Approximation
Next, we briefly sketch how to prove that Condition 3 holds For the general no-CSIT MIMO case, the unavailability of a
for Rayleigh, Rician, and Nakagami distributions. Let closed-form expression for the -capacity Cno in (25) prevents us
from obtaining a normal approximation for the maximum coding
FQ (R) , P[log det Ir + HH QH < R].
(87)
rate at finite blocklength. However, such an approximation can
Let Q be the set of all -capacity-achieving covariance matrices, be obtained for the SIMO case and for the case of isotropic
i.e., codewords. In both cases, CSIT is not beneficial and the outage
e no no
capacity can be characterized in closed form.
Q , {Q ∈ Ut : FQ (C ) = Fno (C )}. (88) For the SIMO case, the normal approximation follows directly
∗
By Proposition 5, the set Q is non-empty for the considered from (59)–(61) by setting m = 1, v1 = ρ and noting that λ1 =
2
fading distributions. It follows from algebraic manipulations that khk .
N
For (n, M, )iso codes, the normal approximation Rrx,iso (n, )
Fno (Cno + δ) − Fno (Cno ) 0 no ∗
lim inf = inf FQ (C ). (89) to the maximal achievable rate Rrx,iso (n, ) is obtained as the
δ→0 δ Q∈Q
solution of
To show that the RHS of (89) is positive, one needs to perform "
N
!#
two steps. First, one shows that the set Q is compact with Ciso (H) − Rrx,iso (n, )
=E Q p . (95)
respect to the metric d(A, B) = kA − BkF and that under Viso (H)/n
10
0.8
Rate, bit/(ch. use)
0.2 0.2
0 00 100 200 300 400 500 600 700 800 900 1000
0 100 200 300 400 500 600 700 800 900 1000
Blocklength, n Blocklength, n
Fig. 2. Achievability and converse bounds for a quasi-static SIMO Rician-fading Fig. 3. Achievability and converse bounds for (n, M, )iso codes over a
channel with K-factor equal to 20 dB, two receive antennas, SNR = −1.55 quasi-static MIMO Rayleigh-fading channel with two transmit and three receive
dB, and = 10−3 . Note that in the SIMO case Ctx = Cno = C . antennas, SNR = 2.12 dB, and = 10−3 .
Here, CSIRT case and in the range [120, 480] for the no-CSI case. For
m
X the AWGN channel, this number is approximately 1420. Hence,
Ciso (H) = log(1 + ρλj /t) (96) for the parameters chosen in Fig. 2, the prediction (based on
j=1 zero dispersion) of fast convergence to capacity is validated.
N
The gap between the normal approximation Rrt (n, ) defined
and
implicitly in (59) and both the achievability (CSIR) and the
m
X 1 converse bounds is less than 0.02 bit/(ch. use) for blocklengths
Viso (H) = m − 2
(97)
(1 + ρλ j /t) larger than 400.
j=1
Note that although the AWGN curve in Fig. 2 lies below the
h 28, 2014
where {λj } are the eigenvalues of HH H. A comparison between
March 31, 2014
achievability
DRAFT
bound for the quasi-static fading channel, this does D
N
Rrx,iso (n, ) and the bounds (65) and (78) is provided in the next not mean that “fading helps”. In Fig. 2, we chose the SNRs so
section. that both channels have the same -capacity. This results in the
received power for the quasi-static case being 1.45 dB larger
VI. N UMERICAL R ESULTS than that for the AWGN case.
N
In Fig. 3, we compare the normal approximation Rrx,iso (n, )
A. Numerical Results
defined (implicitly) in (95) with the achievability bound (65) and
In this section, we compute the bounds reported in Sec- the converse bound (78) on the maximal achievable rate with
N
tions IV and V. Fig. 2 compares Rrt (n, ) with the achievability (n, M, )iso codes over a quasi-static MIMO fading channel with
bound (67) and the converse bound (76) for a quasi-static SIMO t = 2 transmit and r = 3 receive antennas. The channel between
fading channel with two receive antennas. The channels between each transmit-receive antenna pair is Rayleigh-distributed, and
the transmit antenna and each of the two receive antennas the channels between different transmit-receive antenna pairs
are Rician-distributed with K-factor equal to 20 dB. The two are assumed to be independent. We set = 10−3 and choose
channels are assumed to be independent. We set = 10−3 ρ = 2.12 dB so that C iso = 1 bit/(ch. use). For this scenario,
and choose ρ = −1.55 dB so that C = 1 bit/(ch. use). We the blocklength required to achieve 90% of C iso is less than 500,
∗
also plot a lower bound on Rrt (n, ) obtained by using the κβ which again demonstrates fast convergence to C iso .
bound [9, Th. 25] and assuming CSIR.7 For reference, Fig. 2
shows also the approximation (2) for R∗ (n, ) corresponding
B. Comparison with coding schemes in LTE-Advanced
to an AWGN channel with C = 1 bit/(ch. use), replacing the
term O(log(n)/n) in (2) with log(n)/(2n) [9, Eq. (296)] [30].8 The bounds reported in Sections IV and V can be used to
The blocklength required to achieve 90% of the -capacity of benchmark the coding schemes adopted in current standards. In
the quasi-static fading channel is in the range [120, 320] for the Fig. 4, we compare the performance of the coding schemes used
in LTE-Advanced [31, Sec. 5.1.3.2] against the achievability and
7 Specifically, we took F = {x ∈ Cn : kxk2 = nρ}, and Q converse bounds for the same scenario as in Fig. 2. Specifically,
YH =
PH n H
Q
j=1 QYj | H where QYj | H=h = CN (0, Ir + ρhh ). Fig. 4 illustrates the performance of the family of turbo codes
8 The approximation reported in [9, Eq. (296)], [30] holds for a real AWGN
chosen in LTE-Advanced for the case of QPSK modulation. The
channel. Since a complex AWGN channel with blocklength n can be identified
as a real AWGN channel with the same SNR and blocklength 2n, the approxi- decoder employs a max-log-MAP decoding algorithm [32] with
ρ2 +2ρ 10 iterations. We further assume that the decoder has perfect CSI.
mation [9, Eq. (296)], [30] with C = log(1 + ρ) and V = (1+ρ) 2 is accurate
for the complex case. For the AWGN case, it was observed in [9, Fig. 12] that about
11
A PPENDIX I
0 AUXILIARY L EMMAS C ONCERNING THE P RODUCT OF
0 100 200 300 400 500 600 700 800 900 1000
Blocklength, n S INES OF P RINCIPAL A NGLES
In this appendix, we state two properties of the product of
Fig. 5. Comparison between achievability and converse bounds and rate
achievable with the coding schemes in LTE-Advanced. We consider a quasi-
principal sines defined in (40), which will be used in the proof
static SIMO Rayleigh-fading channel with two receive antennas, SNR = 2.74 of Theorem 3 and of Proposition 23. The first property, which is
dB, = 0.1, and CSIR. The star-shaped markers indicate the rates achievable referred to in [34] as “equalized Hadamard inequality”, is stated
by the turbo codes in LTE-Advanced (QPSK modulation and 10 iterations of a
max-log-MAP decoder [32]).
in Lemma 12 below.
Lemma 12: Let A = [A1 , A2 ] ∈ Cn×(a1 +a2 ) , where A1 ∈
n×a1
C and A2 ∈ Cn×a2 . If rank(A1 ) = a1 and rank(A2 ) = a2 ,
then
2
det(AH A) = det(AH H
1 A1 ) det(A2 A2 ) sin {A1 , A2 }. (98)
half of the gap between the rate achieved by√ the best available
channel codes9 and capacity is due to the 1/ n penalty in (2); Proof: The proof follows by extending [35, Th. 3.3] to the
the other half is due to the suboptimality of the codes. From complex case.
Fig. 4, we conclude that for quasi-static fading channels the The second property provides an upper bound on sin{A, B}
finite-blocklength penalty is significantly reduced because of the that depends on the angles between the basis vectors of the two
zero-dispersion effect. However, the penalty due to the code subspaces.
rch 28, 2014 DRAFT
suboptimality remains. In fact, we see that the gap between Lemma 13: Let A and B be subspaces of Cn with dim(A) =
the rate achieved by the LTE-Advanced turbo codes and the a and dim(B) = b. Let {a1 , . . . , aa } be an orthonormal basis
normal approximation Rrt N
(n, ) is approximately constant up for A, and let {b1 , . . . , bb } be an arbitrary basis (not necessarily
to a blocklength of 1000. orthonormal) for B. Then
min{a,b}
Y
9 The
codes used in [9, Fig. 12] are a certain family of multiedge low-density sin{A, B} ≤ sin{aj , bj }. (99)
parity-check (LDPC) codes. j=1
12
Proof: To keep notation simple, we define the following {Φj } are unitary, the codewords satisfy the power constraint (12).
function, which maps a complex matrix X of arbitrary size to its Motivated by the geometric considerations reported in Sec-
volume: tion IV-A1, we consider for a given input X(H) = ΦP(H) a
q physically degraded version of the channel (8), whose output is
vol(X) , det(XH X). (100) given by
Let A = [a1 , . . . , aa ] ∈ Cn×a and B = [b1 , . . . , bb ] ∈ Cn×b . ΩY = span(ΦP(H)H + W). (110)
If the vectors a1 , . . . , aa , b1 , . . . , bb are linearly dependent,
then the LHS of (99) vanishes, in which case (99) holds triv- Note that the subspace ΩY belongs with probability one to the
ially. In the following, we therefore assume that the vectors Grassmannian manifold Gn,r , i.e., the set of all r dimensional
a1 , . . . , aa , b1 , . . . , bb form a linearly independent set. Below, subspaces in Cn . Because (110) is a physically degraded version
we prove Lemma 13 for the case a ≤ b. The proof for the case of (8), the rate achievable on (110) is a lower bound on the rate
a > b follows from similar steps. achievable on (8).
Using Lemma 12, we get the following chain of (in)equalities: To prove the theorem, we apply the κβ bound [9, Th. 25]
to the channel (110). Following [9, Eq. (107)], we define the
sin{A, B}
following measure of performance for the composite hypothesis
vol([A, B])
= (101) test between an auxiliary output distribution QΩY defined on the
vol(A)vol(B) subspace ΩY and the collection of channel-output distributions
vol([A, B]) {PΩY | Φ=Φ }Φ∈Sn,t :
= (102)
vol(B) Z
1 κτ (Sn,t , QΩY ) , inf PZ | ΩY (1 | ΩY )QΩY (dΩY ) (111)
= ka1 k vol [a2 , . . . , aa , B]
vol(B) | {z }
=1 where the infimum is over all probability distributions PZ | ΩY :
· sin a1 , [a2 , . . . , aa , B] (103) Gn,t 7→ {0, 1} satisfying
.. Z
. ! PZ | ΩY (1 | ΩY )PΩY | Φ=Φ (dΩY ) ≥ τ, ∀Φ ∈ Sn,t . (112)
a
1 Y
= sin ai , [ai+1 , . . . , aa , B] vol(B) (104) By [9, Th. 25], we have that for every auxiliary distribution Q
vol(B) i=1
ΩY
a κτ (Sn,t , QΩY )
M≥ (113)
Y
≤ sin{ai , bi }. (105) supΦ∈Sn,t β1−+τ (PΩY | Φ=Φ , QΩY )
i=1
Here, (102) holds because the columns of A are orthonormal and, where β(·) (·, ·) is defined in (6). We next lower-bound the RHS
hence, det(AH A) = 1; (103) and (104) follow from Lemma 12; of (113) to obtain an expression that can be evaluated numerically.
(105) follows because Fix a Φ ∈ Sn,t and let
ZΦ (ΩY ) = 1{sin2 {span(Φ), ΩY } ≤ γn } (114)
sin ai , [ai+1 , . . . , aa , B] ≤ sin{ai , bi }. (106)
where γn ∈ [0, 1] is chosen so that
PΩY |Φ=Φ [ZΦ (ΩY ) = 1] ≥ 1 − + τ. (115)
A PPENDIX II
P ROOF OF T HEOREM 1 (CSIT ACHIEVABILITY B OUND ) Since the noise matrix W is isotropically distributed, the proba-
Given H = H, we perform a singular value decomposition bility distribution of the random variable sin2 {span(Φ), ΩY }
(SVD) of H to obtain (where ΩY ∼ PΩY |Φ=Φ ) does not depend on Φ. Hence, the
chosen γn satisfies (115) for all Φ ∈ Sn,t . Furthermore, ZΦ (ΩY )
H = LΣVH (107) can be viewed as a hypothesis test between PΩY | Φ=Φ and QΩY .
where L ∈ Ct×t and V ∈ Cr×r are unitary matrices, and Hence, by definition
Σ ∈ Ct×r is a (truncated) √ diagonal matrix
√ of dimension t × r,
β1−+τ (PΩY | Φ=Φ , QΩY ) ≤ QΩY [ZΦ (ΩY ) = 1] (116)
whose diagonal elements λ1 , . . . , λm , are the ordered sin-
gular values of H. It will be convenient to define the following for every Φ ∈ Sn,t .
t × t precoding matrix for each H: We next evaluate the RHS of (116), taking as the auxiliary
p p
P(H) , diag{ nv1∗ , . . . , nvm ∗ , 0, . . . , 0}LH . (108) output distribution the uniform distribution on Gn,r , which we de-
| {z } note by QuΩY . With this choice, QuΩY [sin2 {span(Φ), ΩY } ≤ γn ]
t−m
does not depend on Φ ∈ Sn,t . To simplify calculations, we can
We consider a code whose codewords Xj (H), j = 1, . . . , M , therefore set Φ = In,t . Observe that under QuΩY , the squares of
have the following structure the sines of the principle angles between span(In,t ) and ΩY have
the same distribution as the eigenvalues of a complex multivariate
Xj (H) = Φj P(H), Φj ∈ Sn,t (109)
Beta-distributed matrix B ∼ Betar (n−t, t) [36, Sec. 2]. By [37,
n×t H
where Sn,t , {A ∈ C : A A = It } denotes the set of all Cor.Q1], the distribution of det B coincides with the distribution
r
n × t unitary matrices, (i.e., the complex Stiefel manifold). As of i=1 Bi , where {Bi }, i = 1, . . . , r, are independent with
13
Bi ∼ Beta(n − t − i + 1, t). Using this result to compute the for all distributions PZ | ΩY that satisfy (123). This proves (122),
RHS of (116) we obtain since
κτ (Sn,t , QuΩY ) ≥ κuτ (Sn,t , QuΩY ) ≥ τ.
Yr (126)
sup β1−+τ (PΩY | Φ=Φ , QΩY ) ≤ P Bj ≤ γn (117)
Φ∈Sn,t j=1
A PPENDIX III
P ROOF OF T HEOREM 2 (CSIRT C ONVERSE B OUND )
where γn satisfies When CSI is available at both the transmitter and the receiver,
h i the MIMO channel (8) can be transformed into the following set
P sin2 In,t , In,t P(H)H + W ≤ γn ≥ 1 − + τ.
(118)
of m parallel quasi-static channels
p
Note that (118) is equivalent to (44). Indeed Yi = xi Λi + Wi , i = 1, . . . , m (127)
h n √ o i
by performing a singular value decomposition [17, Sec. 3.1].
P sin2 In,t , nIn,t P(H)H + W ≤ γn
Here, Λ1 ≥ · · · ≥ Λm denote the m largest eigenvalues of HHH ,
h n √ p
= P sin2 In,t , nIn,t diag
p ∗
v1 Λ1 , . . . , vm∗ Λ ,
m and Wi ∼ CN (0, In ), i = 1, . . . , m, are independent noise
H o i vectors.
0, . . . , 0 V + W ≤ γn (119) Next, we establish a converse bound for the channel (127).
| {z }
t−m Let X = [x1 · · · xm ] and fix an (n, M, )rt code. Note that (12)
h
2
n √ p ∗ p
∗ Λ , implies
= P sin In,t , nIn,t diag v1 Λ1 , . . . , vm m
m
X
kxi k2 ≤ nρ.
o i
(128)
0, . . . , 0 + WV ≤ γn (120)
| {z } i=1
t−m
h
2
n √ p ∗ p To simplify the presentation, we assume that the encoder ftx is
= P sin In,t , nIn,t diag v1 Λ1 , . . . , vm∗ Λ ,
m deterministic. Nevertheless, the theorem holds also if we allow
for randomized encoders. We further assume that the encoder ftx
o i
0, . . . , 0 + W ≤ γn (121)
| {z } acts on the pairs (j, λ) instead of (j, H) (cf., Definition 3). The
t−m channel (127) and the encoder ftx define a random transfor-
where V contains the right singular vectors of H (see (107)). mation PY,Λ | J from the message set {1, . . . , M } to the space
n×m
Here, (119) follows from (108); (120) follows because right- C × Rm +:
multiplying a matrix A by a unitary matrix does not change PY,Λ | J = PΛ PY | Λ,J (129)
the subspace spanned by the columns of A and hence, it does
not change sin{·, ·}; (121) follows because W is isotropically where Y = [Y1 , . . . , Ym ] and
distributed and hence WV has the same distribution as W. PY | Λ=λ,J=j , PY | Λ=λ,X=ftx (j,λ) . (130)
To conclude the proof, it remains to show that
We can think of PY,Λ | J as the channel law associated with
κτ (Sn,t , QuΩY ) ≥ τ. (122) J −→ Y, Λ. (131)
Once this is done, the desired lower bound (43) follows by using ∗
To upper-bound Rrt (n, ),
we use the meta-converse theorem [9,
the inequality (117) and (122) in (113), by taking the logarithm Th. 30] on the channel (131). We start by associating to each
of both sides of (113), and by dividing by the blocklength n. codeword X a power-allocation vector ṽ(X) whose entries ṽi (X)
To prove (122), we replace (112) with the less stringent are
constraint that 1
ṽi (X) , kxi k2 , i = 1, . . . , m. (132)
Z n
EPΦu PZ | ΩY (1 | ΩY )PΩY | Φ (dΩY ) ≥ τ (123) We take as auxiliary channel QY,Λ | J = PΛ QY | Λ,J , where
m
where PΦu is the uniform input distribution on Sn,t . Since
Y
QY | Λ=λ,J=j = QYi | Λ=λ,J=j (133)
replacing (112) by (123) enlarges the feasible region of the i=1
minimization problem (111), we obtain an infimum in (111) and
(denoted by κuτ (Sn,t , QuΩY )) that is no larger than κτ (Sn,t , QuΩY ).
The key observation is that the uniform distribution PΦu induces QYi | Λ=λ,J=j = CN 0, 1 + (ṽi ◦ ftx (j, λ))λi In . (134)
an isotropic distribution on Y. This implies that the induced By [9, Th. 30], we obtain
distribution on ΩY is the uniform distribution on Gn,r , i.e., QuΩY .
Therefore, it follows that min β1− (PYΛ | J=j , QYΛ | J=j ) ≤ 1 − 0 (135)
j∈{1,...,M }
Z
PZ | ΩY (1 | ΩY )QuΩY (dΩY ) where 0 is the maximal probability of error over QY,Λ | J .
Z We shall prove Theorem 2 in the following two steps: in Ap-
= EPΦu PZ | ΩY (1 | ΩY )PΩY | Φ (dΩY ) (124) pendix III-1, we evaluate β1− (PYΛ | J=j , QYΛ | J=j ); in Ap-
pendix III-2, we relate 0 to Rrt
∗
(n, ) by establishing a converse
≥τ (125) bound on the auxiliary channel QY,Λ | J .
14
1) Evaluation of β1− : Let j ∗ be the message that achieves the maximal error probability over the channel QU ,Λ | S defined
∗
the minimum in (135), let ftx (λ) , ftx (j ∗ , λ), and let by
∗ n
β1− (ftx ) , β1− (PY,Λ | J=j ∗ , QY,Λ | J=j ∗ ). (136) 1 + Si Λi X
Ui = |Wi,l |2 , i = 1, . . . , m. (144)
n
Using (136), we can rewrite (135) as l=1
Here, Γ(·, ·) denotes the (upper) incomplete Gamma function [39, and by noting that this γ satisfies
Sec. 6.5]. Substituting (154) into (153), we finally obtain that
for every code {c1 (Λ), . . . , cM (Λ)} ⊂ Vm , γ = Ctx + O(1/n) (165)
1 (n − 1)n e−(n−1)
Γ(n, n − 1)
m i.e., it belongs to the desired neighborhood of Ctx for sufficiently
1 − 0 ≤ + large n. Here, (165) follows by a Taylor series expansion [40,
M Γ(n) Γ(n)
"m # Th. 5.15] of Ftx (γ) around Ctx , and because Ftx (Ctx ) = and
0
×E
Y
(1 + ρΛi ) (155) Ftx (Ctx ) > 0 by assumption.
i=1
In the reminder of this appendix, we will prove (163). Our
crt (n) proof consists of the three steps sketched below.
= . (156) Step 1: Given v and Λ, the random variable Snrt (v, Λ)
M
(see (48) for its definition) is the sum of n i.i.d. random variables
This proves Lemma 14. with mean
m
A PPENDIX IV
X
µ(v, Λ) , log 1 + Λj vj (Λ) (166)
P ROOF OF THE C ONVERSE PART OF T HEOREM 3 j=1
As a first step towards establishing (56), we relax the upper and variance
bound (49) by lower-bounding its denominator. Recall that by m
definition (see Appendix III-1)
X 1
σ 2 (v, Λ) , m − 2 . (167)
1 + Λj vj (Λ)
P[Lrt
n (v, Λ) ≥ nγn (v)] = β1− (PY,Λ | J=j ∗ , QY,Λ | J=j ∗ ). j=1
If E |X1 |4 < ∞ and if sup|t|≥ζ |ϕ(t)| ≤ k0 for some k0 < 1, where ζ0 does not depend on λ and v. Through algebraic
where ζ , 1/(12E |X1 |3 ), then for every ξ and n manipulations, we can further show that the RHS of (181) is
upper-bounded by
Fn (ξ) − Q(−ξ) − k1 (1 − ξ 2 )e−ξ2 /2 √1
n
1
sup ϕTl (t) ≤ p , k0 < 1. (182)
1 + ζ02 /m
( n ) |t|>ζ0
E |X1 |4
6 1
≤ k2 + n k0 + . (173)
n 2n The inequalities (178) and (182) imply that the conditions in
√ Theorem 15 are met. Hence, we conclude that, by Theorem 15,
Here, k1 , E X13 /(6 2π), and k2 is a positive constant for every n, λ, and v(·),
independent of {Xi } and ξ. " #
n
Proof: The inequality (173) is a consequence of the tighter 1 X √ √
P √ Tl ≤ nu(v, λ) Λ = λ − Q − nu(v, λ)
inequality reported in [16, Th. VI.1]. n
Let l=1
E Tl3 | Λ = λ 2 9k2
1
m Λj vj (Λ)Zl,j − 12
p ! ≥ √ √ (1 − nu(v, λ)2 )e−nu(v,λ) /2 −
6 2π n n
X
Tl (v, Λ) , 1− (174)
σ(v, Λ) j=1 1 + Λj vj (Λ)
1
n
− k2 n6 k0 + (183)
2n
where Zl,j , l = 1, . . . , n and j = 1, . . . , m, are i.i.d. CN (0, 1)
distributed. The random variables T1 , . . . , Tn have zero mean where u(v, λ) was defined in (168). The inequality (169) follows
and unit variance, and are conditionally i.i.d. given Λ. Further- then by noting that
more, by construction h
3
i √
" n
# 0 ≥ E T l = λ ≥ − 2π (184)
√
Λ
1 X
P Snrt (v, Λ) ≤ nγ = P √
Tl (v, Λ) ≤ nu(v, Λ)
n and that
l=1
(175) n
6 1
sup n k2 n k0 + < ∞. (185)
where u(v, Λ) was defined in (168). We next show that the n≥1 2n
conditions under which Theorem 15 holds are satisfied by the
random variables {Tl }.
We start by noting that if λj vj (λ), j = 1, . . . , m, are identi- B. Proof of (171)
cally zero, then Snrt (v, Λ) = 0, so (169) holds trivially. Hence, For every fixed λ, we minimize qn (u(v, λ)) on the RHS
we will focus on the case where {λj vj (λ)} are not all identically of (169) over all power allocation functions v(·). With a slight
zero. Let abuse of notation, we use v ∈ Vm (where Vm was defined in (46))
itT to denote the vector v(λ) whenever no ambiguity arises. Since
ϕTl (t) , E e l Λ = λ (176) the function q (x) in (170) is monotonically increasing in x, the
n
We next show that there exists a k0 < 1 such that The minimization in (186) is difficult to solve since u(v, λ) is
sup|t|>ζ |ϕTl (t)| ≤ k0 for every λ ∈ Rm + and every func- neither convex nor concave in v. For our purposes, it suffices
tion v(·). We start by evaluating ζ. For every λ ∈ Rm + and to obtain a lower bound on (186), which is given in Lemma 16
every v(·) such that λj vj (λ), 1 ≤ j ≤ m, are not identically below. Together with (187) and the monotonicity of qn (·), this
zero, it can be shown through algebraic manipulations that then yields (171).
Lemma 16: Let v ∗ , µ(v, λ), σ(v, λ), and u(v, λ) be as
E |Tl |4 Λ = λ ≤ 9.
(178)
in (45), (166), (167), and (168), respectively. Moreover, let
By Lyapunov’s inequality [16, p. 18], this implies that µ∗ (λ) , µ(v ∗ , λ) and σ ∗ (λ) , σ(v ∗ , λ). Then, there exist δ >
0, δ̃ > 0 and k < ∞ such that for every γ ∈ (Ctx − δ̃, Ctx + δ̃)
3/4
E |Tl |3 Λ = λ ≤ E |Tl |4 Λ = λ ≤ 93/4 .
(179)
min u(v, λ)
v∈Vm
Hence, √
δ/ m, if µ∗ (λ) ≤ γ − δ
∗
1 1 γ − µ (λ)
ζ= ≥ , ζ0 . (180) ≥ û(λ) , ∗ (λ) + k(γ − µ∗ (λ))
, if |γ − µ∗ (λ)| < δ
12E |Tl |3 Λ = λ 12 × 93/4 σ
if µ∗ (λ) ≥ γ + δ.
−∞,
By (180), we have
(187)
sup ϕTl (t) ≤ sup ϕTl (t) (181)
|t|>ζ |t|>ζ0 Proof: See Appendix IV-D.
17
C. Proof of (163) Assume without loss of generality that n ≥ m/δ 2 (recall that
We shall need the following √
lemma, which concerns the speed we are interested in the asymptotic regime n → ∞). In this case,
of convergence of P[B > A/ n] to P[B > 0] as n → ∞ for the second term on the RHS of (195) is zero. Hence,
two independent random variables A and B.
Lemma 17: Let A be a real random variable with zero mean E[qn (û(Λ))1{Λ ∈
/ Kδ }]
√ δ
and unit variance. Let B be a real random variable independent = Q − n√ P µ∗ (Λ) ≤ γ − δ
(196)
of A with continuously differentiable pdf fB . Then m
2
≥ P µ (Λ) ≤ γ − δ − e−nδ /(2m) .
∗
(197)
P B ≥ √A − P[B ≥ 0] ≤ 1 2 + k1 + k1
n δ2 (188)
n δ 2 2
Here, (197) follows because Q(−t) ≥ 1 − e−t /2 for all t ≥ 0
where
and because P[µ∗ (Λ) ≤ γ − δ] ≤ 1.
sup max |fB (t)|, |fB0 (t)|
k1 , (189)
t∈(−δ,δ) We next lower-bound the second term on the RHS of (194).
and δ > 0 is chosen so that k1 is finite. If P[Λ ∈ Kδ ∩ Int(Wj )] = 0, we have
Proof: See Appendix IV-E.
E[qn (û(Λ))1{Λ ∈ Kδ ∩ Int(Wj )}] = 0 (198)
To establish (163), we lower-bound E[qn (û(Λ))] on the RHS
of (171) using Lemma 17. This entails technical difficulties since qn (·) is bounded. We thus assume in the following that
since the pdf of û(Λ) is not continuously differentiable due P[Λ ∈ Kδ ∩ Int(Wj )] > 0. Let Û denote the random variable
to the fact that the water-filling solution (45) may give rise û(Λ). To emphasize that Û depends on γ (see (187)), we write
to different numbers of active eigenmodes for different values Û (γ) in place of Û whenever necessary. Using this definition
of λ. To circumvent this problem, we partition Rm ≥ into m non- and (170), we obtain
intersecting subregions Wj , j = 1, . . . , m [15, Eq. (24)] h i
j E qn (Û )1{Λ ∈ Kδ ∩ Int(Wj )}
n 1 1X 1 ρ 1o
Wj , x ∈ Rm
≥ : > + ≥ , h
√ i
xj+1 j xl j xj = E Q(− nÛ ) | Λ ∈ Kδ ∩ Int(Wj )
l=1
j = 1, . . . , m − 1 (190) i
1 h + 2
− √ E 1 − nÛ 2 e−nÛ /2 Λ ∈ Kδ ∩ Int(Wj )
and 6 n
m
n 1 X 1 ρ 1 o × P Λ ∈ Kδ ∩ Int(Wj ) . (199)
Wm , x ∈ Rm
≥ : + ≥ . (191)
m xl m xm
l=1
Observe that the transformation
In the interior of Wj , j = 1, . . . , m,
Smthe pdf of û(Λ) is
continuously differentiable. Note that j=1 Wj = Rm ≥ . For (λ1 , . . . , λj , γ) 7→ (û(λ), λ2 , . . . , λj , γ) (200)
every λ ∈ Wj , the water-filling solution gives exactly j active
eigenmodes, i.e., is one-to-one and twice continuously differentiable with nonsin-
gular Jacobian for λ ∈ Kδ ∩ Int(Wj ), i.e., it is a diffeomorphism
v1∗ (λ) ≥ · · · ≥ vj∗ (λ) > vj+1
∗ ∗
(λ) = · · · = vm (λ) = 0. (192) of class C 2 [29, p. 147]. Consequently, the conditional pdf
Let fÛ (γ) | Λ∈Kδ ∩Int(Wj ) (t) of Û (γ) given Λ ∈ Kδ ∩ Int(Wj ) as
well as its first derivative are jointly continuous functions of γ
n o
∗
Kδ , λ ∈ Rm
≥ : |γ − µ (λ)| < δ . (193)
and t. Hence, they are bounded on bounded sets. It thus follows
Using (193) and the sets {Wj }, we express E[qn (û(Λ))] as that for every j ∈ {1, . . . , m}, every γ ∈ (Ctx − δ̃, Ctx + δ̃)
(where δ̃ is given by Lemma 16), and every δ̃1 > 0, there exists a
E[qn (û(Λ))]
k2 < ∞ such that the conditional pdf fÛ (γ) | Λ∈Kδ ∩Int(Wj ) and
= E[qn (û(Λ))1{Λ ∈ / Kδ }] its derivative satisfy
m
E[qn (û(Λ))1{Λ ∈ Kδ ∩ Int(Wj )}]
X
+ (194)
sup sup Û (γ) | Λ∈Kδ ∩Int(Wj ) (t) ≤ k2 (201)
f
j=1 t∈[−δ̃1 ,δ̃1 ] γ∈(Ctx −δ̃,Ctx +δ̃)
0
where Int(·) denotes the
Sminterior of a given set. To obtain (194), sup sup f
Û (γ) | Λ∈Kδ ∩Int(Wj )
(t) ≤ k2 . (202)
we used that Λ lies in j=1 Int(Wj ) almost surely, which holds t∈[−δ̃1 ,δ̃1 ] γ∈(Ctx −δ̃,Ctx +δ̃)
because the joint pdf of {Λj }m j=1 exists by assumption and We next apply Lemma 17 with A being a standard normal random
because the boundary of Wj has zero Lebesgue measure.
variable and B being the random variable Û conditioned on Λ ∈
We next lower-bound the two terms on the RHS of (194)
Kδ ∩ Int(Wj ). This implies that there exists a finite constant k3
separately. We first consider the first term. When µ∗ (λ) ≥ γ + δ,
∗ independent of γ and n such that the first term on the RHS
we have û(λ) = −∞ √and qn u1 (λ) = 0; when µ (λ) ≤ γ −δ, of (199) satisfies
we have û(λ) = δ/ m and
2
h √ i
√ δ [1 − nδ 2 /m]+ e−nδ /(2m)
E Q − nÛ (γ) Λ ∈ Kδ ∩ Int(Wj )
qn û(λ) = Q − n √ − √ .
m 6 n k3
≥ P µ∗ (Λ) ≤ γ | Λ ∈ Kδ ∩ Int(Wj ) + . (203)
(195) n
18
We next bound the second term on the RHS of (199) for n ≥ δ̃1−2 and that
as
σ ∗ (λ) − k|γ − µ∗ (λ)| > 0. (216)
1 h + 2
i
√ E 1 − nÛ 2 e−nÛ /2 Λ ∈ Kδ ∩ Int(Wj )
6 n The desired bound (213) follows then by lower-bounding
Z 1/√n σ(vmin , λ) in (214) by σ ∗ (λ) − k|γ − µ∗ (λ)| when µ∗ (λ) ≥ γ
k2 2
≤ √ (1 − nt2 )e−nt /2 dt (204) and by upper-bounding σ(vmin , λ) by σ ∗ (λ) + k|γ − µ∗ (λ)|
6 n −1/ n √
when µ∗ (λ) < γ.
k2 We first establish (215). By the mean value theorem, there
= √ (205)
3 en exist vj0 between vj∗ and vmin,j , j = 1, . . . , m, such that
where (204) follows from (201). Substituting (203) and (205) σ(vmin , λ) − σ ∗ (λ)
into (199) we obtain
m
h i X 2λ j ∗
E qn (Û )1{Λ ∈ Kδ ∩ Int(Wj )}
=
0 3
(vmin,j − vj ) (217)
j=1 (1 + λj vj )
k4
≥ P µ∗ (Λ) ≤ γ, Λ ∈ Kδ ∩ Int(Wj ) +
(206) m
n
X 2λj vmin,j − vj∗
≤ 0 3
(218)
for some finite k4 independent of γ and n. Using (197), (198) j=1
(1 + λj vj )
and (206) in (194), and substituting (194) into (171), we con- m
X
vmin,j − vj∗
clude that ≤ 2λ1 (219)
j=1
rt ∗ 1 √
P[Sn (v, Λ) ≤ nγ] ≥ P[µ (Λ) ≤ γ] + O (207) ≤ 2λ1 mkvmin − v ∗ k. (220)
n
1 Here, the last step for every a = [a1 , . . . , am ] ∈
= Ftx (γ) + O (208) Pmfollows because√
n Rm , we have j=1 |aj | ≤ mkak.
Next, we upper-bound λ1 and kvmin − v ∗ k separately. The
where the O(1/n) term is uniform in γ ∈ (Ctx − δ̃, Ctx + δ̃). variable λ can be bounded as follows. Because the water-filling
1
Here, the last step follows from (166) and (20). power levels {vl∗ } in (45) are nonincreasing, we have that
ρ
D. Proof of Lemma 16 ≤ v1∗ ≤ ρ. (221)
m
For an arbitrary λ ∈ Rm ≥ , the function µ(v, λ) in the numer- Choose δ1 > 0 and δ̃ > 0 such that δ1 + δ̃ < C tx . Using (221)
ator of (168) is maximized by the (unique) water-filling power together with
allocation vj = vj∗ defined in (45):
log(1 + λ1 v1∗ ) ≤ µ∗ (λ) ≤ m log(1 + λ1 v1∗ ) (222)
µ∗ (λ) = max µ(v, λ) = µ(v ∗ , λ). (209)
v∈Vm and the assumption that γ ∈ (Ctx − δ̃, Ctx + δ̃), we obtain that
The function σ(v, λ) on the denominator of (168) can be whenever |µ∗ (λ) − γ| < δ1
bounded as 1 (Ctx −δ1 −δ̃)/m
√ k0 , e −1
0 ≤ σ(v, λ) ≤ m. (210) ρ
m Ctx +δ1 +δ̃
Using (209) and (210) we obtain that for an arbitrary δ > 0 ≤ λ1 ≤ e − 1 , k1 . (223)
ρ
√ ∗
δ/ m, µ (λ) ≤ γ − δ ∗
The term kvmin − v k can be upper-bounded as follows.
min u(v, λ) ≥ (211)
v∈Vm −∞, µ∗ (λ) ≥ γ + δ. Since vmin is the minimizer of u(v, λ), it must satisfy the
Let vmin be the minimizer of u(v, λ) for a given λ. To prove Karush–Kuhn–Tucker (KKT) conditions [41, Sec. 5.5.3]:
Lemma 16, it remains to show that there exist δ > 0, δ̃ > 0 and ∂u(v, λ)
tx tx − = η, ∀ l for which vmin,l > 0 (224)
k < ∞ such that for every γ ∈ (C − δ̃, C + δ̃) and every ∂vl
vl =vmin,l
m ∗
λ ∈ R≥ satisfying |µ (λ) − γ| < δ, ∂u(v, λ)
− ≤ η, ∀ l for which vmin,l = 0 (225)
∂vl
min u(v, λ) = u(vmin , λ) (212) vl =vmin,l
v∈Vm
for some η. The derivatives in (224) and (225) are given by
γ − µ∗ (λ)
≥ ∗ . (213) ∂u(v, λ)
γ − µ(vmin , λ)
σ (λ) + k(γ − µ∗ (λ)) − = 1 +
∂vl (1 + λl vmin,l )2 σ 2 (vmin , λ)
vl =vmin,l
Since
1
γ − µ(vmin , λ) γ − µ∗ (λ) × . (226)
u(vmin , λ) = ≥ (214) (vmin,l + 1/λl )σ(vmin , λ)
σ(vmin , λ) σ(vmin , λ)
Let η̃ , 1/(σ(vmin , λ)η). Then, (224) and (225) can be rewrit-
it suffices to show that for every γ ∈ (Ctx − δ̃, Ctx + δ̃) and ten as
∗
every λ ∈ Rm ≥ satisfying |µ (λ) − γ| < δ, we have
+
γ − µ(vmin , λ)
1
vmin,l = η̃ 1 + − (227)
|σ(vmin , λ) − σ ∗ (λ)| ≤ k|γ − µ∗ (λ)| (215) (1 + λl vmin,l )2 σ 2 (vmin , λ) λl
19
Finally, |c2 (n)| can be bounded as Here, (261) follows from Lemma 13 (Appendix I) by let-
2 0
A |fB (A0 )| √
ting ej and Wj stand for the jth column of In,t and W, respec-
|c2 (n)| ≤ E · 1{|A| < δ n} (250) tively; (262) follows by symmetry. We next note that by (98),
2n
the random variable sin2 {e1 , nvj∗ Λj e1 + Wj } has the same
p
√ k1
≤ E A2 · 1{|A| < δ n}
(251) distribution as
2n Pn
k1 |2
i=2 |Wi,jP
≤ . (252) Tj , p ∗ n . (263)
2n | nvj Λj + W1,j | + i=2 |Wi,j |2
2
m
( " u
# )
X Y (u) We next choose γn as the solution of
≥ P Tj ≤ γ P[Λ ∈ Int(Wu )] . (269)
u=1 j=1 1 1
Pn 1 − Ftx (− log γn ) + O =1−+ . (278)
Here, because, by (192), Tj = ( i=2 |Wi,j |2 ) n n
Pn(268) follows
/( i=1 |Wi,j |2 ) for j = u + 1, . . . , m. 0
Since Ftx (Ctx ) = and Ftx (Ctx ) > 0, it follow from Taylor’s
The following lemma, built upon Lemma 17, allows us to theorem that
establish (266).
Lemma 18: Let G = [G1 , . . . , Gu ]T ∈ Ru≥ be a random tx 1
− log γn = C + O . (279)
vector with continuously differentiable joint pdf. Let n
Pn 2
i=2 |Wi,j | So, for sufficiently large n, γn in (279) belongs to the inter-
Dj , p , j = 1, . . . , u (270) tx tx
n val e−C −δa , e−C +δa . Hence, by (264), (277), and (278),
P
| nGj + W1,j |2 + i=2 |Wi,j |2
this γn satisfies (44). This concludes the proof.
where Wi,j , i = 1, . . . , n, j = 1, . . . , u, are i.i.d. CN (0, 1)-
distributed. Fix an arbitrary ξ0 ∈ (0, 1). Then, there exist a
δ > 0 and a finite constant k such that A. Proof of Lemma 18
" u # " u #!
Y Y 1 k Choose δ > 0 such that δ ≤ ξ0 /2. Throughout this appendix,
inf P Dj ≤ ξ − P ≤ξ > . we shall use const to indicate a finite constant that does neither
ξ∈(ξ0 −δ,ξ0 +δ)
j=1 j=1
1 + Gj n
depend on ξ ∈ (ξ0 − δ, ξ0 + δ) nor on n; its magnitude and sign
(271) may change at each occurrence.
Since the joint pdf of Λ is continuously differentiable by assump- 1) We show in Section V-A1 that for every ξ ∈ (ξ0 −δ, ξ0 +δ),
tion, the joint pdf of Λ(u) is also continuously differentiable. the term p1 in (282) can be lower-bounded as
Moreover, it can be shown that the transformation Λ(u) 7→ Λ e (u)
const
defined by (273) is a diffeomorphism of class C 2 [29, p. 147]. p1 ≥ 1 − . (283)
e (u) is continuously differentiable. n
Therefore, the joint pdf of Λ
We next use (272)in (269) to conclude that for every γ ∈ 2) Using Lemma 17 in Appendix IV-C, we show in Sec-
tx tx
e−C −δa , e−C +δa (where δa , min{δ1 , . . . , δm }) tion V-A2 that p2 can be lower-bounded as
" m # " u
#
Y 1 Y const
Tj ≤ γ p2 ≥ P Dj ≤ ξ G1 < gth − . (284)
P
1 + G1 j=2
n
u=1
m
( " u
# )
X Y 1 1 3) Reiterating Step 2 for D2 , . . . , Du , we conclude that (284)
≥ P ≤ γ P[Λ ∈ Int(Wu )] +O
u=1
e (u)
j=1 1 + Λ
n can be further lower-bounded as
j
(274)
" u #
Y 1 const
p2 ≥ P ≤ ξ G1 < gth − . (285)
" m #
Y 1 1 1 + G n
∗ (Λ) ≤ γ + O n
=P (275) j=1 j
j=1
1 + Λ v
j j
" m # 4) Finally, using (283) and (285) in (282), we show in Sec-
X
∗ 1 tion V-A3 that
=1−P log(1 + Λj vj (Λ)) ≤ − log γ + O (276)
j=1
n " u # " u #
Y Y 1 const
Dj ≤ ξ ≥ P ≤ξ − . (286)
1 P
= 1 − Ftx (− log γ) + O (277) j=1 j=1
1 + Gj n
n
where Ftx (·) is given in (20). This proves Lemma 18.
22
Subtracting (294) from (296) yields (271). In the following, we Next, we lower-bound the RHS of (306) using Lemma 17 in
2
shall focus exclusively on the case P[G1 ≥ gth ] < 1. Appendix IV-C. Let µW and σW be the mean and the variance
23
qP
n+1 √ q P
n+1
of the random variable i=2 |Wi,1 |2 . Let Z2 , Z −1 − 1. Here, (315) follows by using that 2 i=2 |Wi,1 |2 is χ-
Furthermore, let distributed with 2n degrees of freedom and by using [42,
v
un+1 ! Eq. (18.14)]; (316) follows from [43, Q
Sec. 2.2]; (318) follows
u
1 uX from the definition of Z and because j=2 Dj ≤ 1. Substitut-
K1 , p − Wre + Z2 t |Wi,1 |2 − µW Z2
2
1/2 + Z22 σW ing (316) and (319) into (313), we obtain
i=2
(307) const
f˜G1 (x | z2 , g2 . . . , gu ) ≤ . (320)
fG2 ,...,Gu | G1 <gth (g2 , . . . , gu )
and
p Following similar steps, we can also establish that
1 µW
G1 , p G1 − √ Z2 . (308) const
1/2 + Z22 σW
2 n ˜0
fG1 (x | z2 , g2 . . . , gu ) ≤
.
fG2 ,...,Gu | G1 <gth (g2 , . . . , gu )
Note that K1 is a zero-mean, unit-variance random variable
(321)
that is conditionally independent of G1 given Z2 . Using these
definitions, we can rewrite the RHS of (306) as Using (320)–(321) and Lemma 17, we obtain that
h √ i h √ i
P G1 ≥ K1 / nZ2 , G2 , . . . , Gu , G1 < gth . (309) P G1 ≥ K1 / nZ2 , G2 = g2 , . . . , Gu = gu , G1 < gth
h i
In order to use Lemma 17, we need to establish an upper bound on ≥ P G1 ≥ 0Z2 , G2 = g2 , . . . , Gu = gu , G1 < gth
the conditional pdf of G1 given Z2 , G2 , . . . , Gu and G1 < gth ,
which we denote by f˜G1 , and on its derivative. As fG1 ,...,Gu const 1
− 1+ . (322)
is continuously differentiable by assumption, fG1 ,...,Gu and its n fG2 ,...,Gu | G1 <gth (g2 , . . . , gu )
partial derivatives are bounded on bounded sets. Together with Returning to the analysis of (302), we combine (306), (309)
the assumption that P[G1 ≥ gth ] < 1, this implies that the con- and (322) to obtain
ditional pdf fG1 ,...,Gu | G1 <gth of G1 , . . . , Gu given G1 < gth
and its partial derivatives are all bounded on [0, gth )u . Namely, P D1 ≤ Z, Z < 1G1 < gth
for every {x1 , . . . , xu } ∈ [0, gth )u and every i ∈ {1, . . . , u}
"
≥ EZ,G2 ,...,Gu | G1 <gth 1{Z < 1}
fG1 ,...,Gu | G1 <gth (x1 , . . . , xu ) ≤ const (310)
∂fG1 ,...,Gu | G1 <gth (x1 , . . . , xu ) h i
≤ const. (311) × P G1 ≥ 0Z, G2 , . . . , Gu , G1 < gth
∂xi
P log det Ir + HH Ql H ≤ ξ
The first factor in (332) is equal to one because of (295). This
proves (286) and concludes the proof of Lemma 18. → P log det Ir + HH QH ≤ ξ as Ql → Q (345)
To upper-bound Re∗ (n, ), we use the meta-converse theorem [9, where γn (X0 ) is the solution of
Th. 30]. As auxiliary channel QYH | X , we take
PYH | X=X0 r(X0 ; YH) ≤ nγn (X0 ) = . (362)
QYH | X = PH × QY | XH (349)
It can be shown that under PYH | X=X0 , the random variable
where
r(X0 ; YH) has the same distribution as Snrx (U(X0 )) in (70), and
n
Y under QYH | X=X0 , it has the same distribution as Lrx n (U(X0 ))
QY | X=X,H=H = QYi | X=X,H=H (350)
in (69).
i=1
2) Converse on the auxiliary channel: To prove the theorem,
with Yi , i = 1, . . . , n denoting the rows of Y, and it remains to lower-bound 0 , which is the maximal probability
QYi | X=X,H=H = CN 0, Ir + HH U(X)H .
(351) of error over the auxiliary channel (349). The following lemma
serves this purpose.
By [9, Th. 30], we have Lemma 19: For every code with M codewords and block-
inf β1− PYH | X=X , QYH | X=X ≤ 1 − 0
(352) length n ≥ r, the maximum probability of error 0 over the
X∈Fn,t auxiliary channel (349) satisfies
where 0 is the maximal probability of error of the optimal code crx (n)
with M codewords over the auxiliary channel (349). To shorten 1 − 0 ≤ (363)
M
notation, we define
where crx (n) is given in (72).
n
β1− (X) , β1− PYH | X=X , QYH | X=X . (353) Substituting (361) into (352) and using (363), we then obtain
To prove the theorem, we proceed as in Appendix III: we first upon minimizing (361) over all matrices in Ute
n
evaluate β1− (X), then we relate 0 to Re∗ (n, ) by establishing 1 crx (n)
a converse bound on the channel QYH | X . Re∗ (n, ) ≤ . (364)
n (Q) ≥ nγn ]
n inf e P[Lrx
1) Evaluation of β1− (X): Let G be an arbitrary n×n unitary Q∈Ut
matrix. Let gi : Fn,t 7→ Fn,t and go : Cn×r × Ct×r 7→ Cn×r ×
Ct×r be two mappings defined as The final bound (71) follows by combining (364) with (347) and
by noting that the upper bound does not depend on the chosen
gi (X) , GX and go (Y, H) , (GY, H). (354) code.
Proof of Lemma 19: According to (351), given H = H, the
Note that
output of the auxiliary channel depends on X only through U(X).
PYH | X (go−1 (E) | gi (X)) = PYH | X (E | X) (355) In the following, we shall omit the argument of U(X) where it is
immaterial. Let V , U(Y). Then, (V, H) is a sufficient statistic
for all measurable sets E ⊂ Cn×r × Ct×r and X ∈ Fn,t , i.e., the for the detection of X from (Y, H). Therefore, to establish (363),
pair (gi , go ) is a symmetry [45, Def. 3] of PYH | X . Furthermore, it is sufficient to lower-bound the maximal probability of error 0
(350) and (351) imply that over the equivalent auxiliary channel
QYH | X=X = QYH | X=gi (X) (356)
QVH | U = PH × QV | U,H (365)
and that QYH | X=X is invariant under go for all X ∈ F. Hence,
by [45, Prop. 19], we have that where QV | U=U,H=H is the Wishart distribution [18, Def. 2.3]:
n n n
β1− (X) = β1− (gi (X)) = β1− (GX). (357) 1 H
QV | U=U,H=H = Wr n, (Ir + H UH) . (366)
n n
Since G is arbitrary, this implies that β1− (X) depends on X only
through U(X). Consider the QR decomposition [46, p. 113] of X Let B , Ir + HH UH, and let qV | B (V | B) be the pdf associated
X = VX0 (358) with (366), i.e., [18, Def. 2.3]
where V ∈ C n×n
is unitary and X0 ∈ C n×t
is upper triangular. det Vn−r −1
qV | B (V | B) = 1
n exp −tr n−1 B V .
By (357) and (358), Γr (n) det n B
n n (367)
β1− (X0 ) = β1− (X). (359)
Let It will be convenient to express qV | B (V | B) in the coordinate
dPYH | X=X0 system of the eigenvalue decomposition
r(X0 ; YH) , log . (360)
dQYH | X=X0 V = QDQH (368)
Under both PYH | X=X0 and QYH | X=X0 , the random variable
r(X0 ; YH) has absolutely continuous cdf with respect to the where Q ∈ Cr×r is unitary, and D is a diagonal matrix whose
Lebesgue measure. By the Neyman-Pearson lemma [38, p. 300] diagonal elements D1 , . . . , Dr are the eigenvalues of V in de-
scending order. In order to make the eigenvalue decomposi-
n
β1− (X0 ) = QYH | X=X0 r(X0 ; YH) ≥ nγn (X0 ) (361) tion (368) unique, we assume that the first row of Q is real
26
and non-negative. Thus, Q only lies in a submanifold Ser,r of the Let 0avg denote the average probability of error over the auxiliary
Stiefel manifold Sr,r . Using (368), we rewrite (367) as channel. Then,
Since B = Ir + HH UH, we have that Substituting (382) into (381) and using (374), we obtain
crx (n)
1 ≤ bi ≤ 1 + tr HH UH
(372) 1 − 0 ≤ . (383)
M
2
≤ 1 + kHkF tr (U) (373) Note that the RHS of (383) is valid for every code.
2
=1+ kHkF ρ , b0 (374)
A PPENDIX VIII
where (373) follows from the Cauchy-Schwarz inequality and P ROOF OF THE C ONVERSE PART OF T HEOREM 9
(374) follows because U ∈ Ute . Using (374), we can upper-bound
each factor on the RHS of (371) as follows: In this appendix, we prove the converse asymptotic expansion
for Theorem 9. More precisely, we show the following:
din+r−2i
di
Proposition 20: Let the pdf of the fading matrix H satisfy the
exp −n conditions in Theorem 9. Then
bni bi
n + r − 2i
n+r−2i
∗ log n
b0
[r−2i]+ −(n+r−2i)
e , Rrx (n, ) ≤ Cno + O . (384)
n
n
b0 (n+r−2i)
if di ≤ Proof: Proceeding as in (158)–(162), we obtain from The-
≤ gi (di ) , n+r−2i n (375)
di [r−2i]+ −ndi /b0 orem 6 that
b0 e ,
b
∗
0
(n − 1)Rrx (n − 1, )
b0 (n+r−2i)
if di > n .
≤ nγ − log inf e P[Snrx (Q) ≤ nγ] − + log crx (n) (385)
Q∈Ut
We are now ready to establish the desired converse result
for the auxiliary channel Q. Consider an arbitrary code for the where γ > 0 is arbitrary. The third term on the RHS of (385) is
auxiliary channel Q with encoding function f0 : {1, . . . , M } 7→ upper-bounded by
Ute . Let Dj (H) be the decoding set for the jth codeword f0 (j) b(r+1)2 /4c
r2
2
in the eigenvalue decomposition coordinate such that log crx (n) ≤ log n + log E 1 + ρ kHkF
2
M
[ + O(1) (386)
Dj (H) = Ser,r × Rr≥ . (376) r 2
j=1
= log n + O(1). (387)
2
27
Here, (386) follows from algebraic manipulations, and (387) Lemma 21: Let H have pdf fH satisfying Conditions 1 and 2
follows from the assumption (81), which ensures that the second in Theorem 9. Let ϕγ,Q : Ct×r 7→ R be defined as in (389)
term on the RHS of (386) is finite. and let U (γ, Q) with pdf fU (γ,Q) denote the random variable
To evaluate P[Snrx (Q) ≤ nγ] on the RHS of (385), we ϕγ,Q (H). Then, there exist δ1 ∈ (0, Cno ) and δ > 0 such that
note that given H = H, the random variable Snrx (Q) is the u 7→ fU (γ,Q) (u) is continuously differentiable on (−δ, δ) and
sum of n i.i.d. random variables. Hence, using Theorem 15 that
(Appendix IV-A) and following similar steps as the ones reported
sup sup sup fU (γ,Q) (u) < ∞ (395)
in Appendix IV-A, we obtain γ∈(Cno −δ1 ,Cno +δ1 ) Q∈Ute u∈(−δ,δ)
0
1 sup sup sup U (γ,Q) (u) < ∞.
rx
f (396)
P[Sn (Q) ≤ nγ | H = H] ≥ qn (ϕγ,Q (H)) + O (388) γ∈(Cno −δ1 ,Cno +δ1 ) Q∈Ute u∈(−δ,δ)
n
Proof: See Appendix VIII-A.
where the function ϕγ,Q : Ct×r 7→ R is given by
Using (392), (394), and Lemma 21 in (391), and then (391)
and (387) in (385), we obtain for every γ ∈ (Cno −δ1 , Cno +δ1 )
γ − log det Ir + HH QH
ϕγ,Q (H) , q (389) that
tr Ir − (Ir + HH QH)−2
∗
(n − 1)Rrx (n − 1, )
the function qn (·) was defined in (170), and the O(1/n) term is
≤ nγ − log inf e P[log det Ir + HH QH ≤ γ] −
uniform in Q, γ and H. Let Q∈Ut
U (γ, Q) , ϕγ,Q (H). (390) + O(1/n) + O(log n) (397)
Averaging (388) over H, we obtain = nγ − log Fno (γ) − + O(1/n) + O(log n) (398)
Using this metric, we define the gradient ∇g of an arbitrary To prove (407), we shall use that for an arbitrary δ > 0,
function g : Ml 7→ R as in (80). Note that the metric (403)
induces a norm on the tangent space of Ml , which can be f∗ (u + δ) − f∗ (u)
Z Z
identified with the Frobenius norm. dS dS
= f − f (411)
Our proof consists of two steps. Let fl (u) denote the pdf of the −1
ϕ (u+δ) k∇ϕk F −1
ϕ (u) k∇ϕk F
random variable U (γ, Q) conditioned on H ∈ Ml . We first show Z
dS
no
that there exist l0 ∈ N, δ > 0, and δ1 ∈ (0, C ) such that fl (u) = d f (412)
0 no no ϕ−1 ((u,u+δ)) k∇ϕkF
and fl (u) are uniformly bounded for every γ ∈ (C −δ1 , C + Z
δ1 ), every Q ∈ Ute , every u ∈ [−δ, δ], and every l ≥ l0 . We then = ψ1 dV (413)
show that u 7→ fU (γ,Q) (u) is continuously differentiable on ϕ−1 ((u,u+δ))
(−δ, δ), and that for every u ∈ (−δ, δ), the sequences {fl (u)}
where in (412) we used Stoke’s theorem [51, Th. III.7.2], that f is
and {fl0 (u)} converge uniformly to fU (γ,Q) (u) and fU0 (γ,Q) (u), dS
compactly supported, and that the restriction of the form f k∇ϕk
respectively, i.e., F
to ϕ−1 ((u, u + δ)) is also compactly supported; (413) follows
from the definition of ψ1 (see (408)). Equation (407) follows
lim sup fl (u) − fU (γ,Q) (u) = 0 (404)
l→∞ u∈(−δ,δ) then from similar steps as in (409)–(410).
Using Lemma 22, we obtain
lim sup fl (u) − fU0 (γ,Q) (u) = 0
0
(405)
l→∞ u∈(−δ,δ) Z
fH dS
fl (u) = (414)
from which Lemma 21 follows.
−1
ϕγ,Q (u)∩Ml P[H ∈ M l ] k∇ϕ γ,Q kF
where (410) follows from the smooth coarea formula [51, p. 160]. ?dϕγ,Q
dS = (421)
This implies (406). k∇ϕγ,Q kF
29
where ? denotes the Hodge star operator [50, p. 103] induced by which implies that
√
the metric (403). Using (421) and the definition of the Hodge no
−δ1 − rδ)/r
λmax (HH QH) ≥ e(C − 1 > 0. (433)
star operator, the RHS of (416) becomes
! Combing (433) with the second inequality in (428), we obtain
fH fH no √
∧ ?dϕγ,Q + 2 ∧ d ? dϕξ,Q tr Ir − Φ−2 ≥ 1 − e−2(C −δ1 − rδ)/r .
d 2 (434)
k∇ϕγ,Q kF k∇ϕγ,Q kF
2 We use (429) and (434) to lower-bound the smallest eigenvalue
h∇fH , ∇ϕγ,Q i fH h∇ k∇ϕγ,Q kF , ∇ϕγ,Q i of the matrix T defined in (427) as
= 2 − 4
k∇ϕγ,Q kF k∇ϕγ,Q kF
! λmin (T) = tr(Ir − Φ−2 ) λmin (Φ2 ) +(γ − log det Φ) (435)
fH · ∆ϕγ,Q | {z }
− 2 dV (422) √
≥1
k∇ϕγ,Q kF ≥ tr(Ir − Φ −2
)−δ r (436)
no √ √
where ∧ denotes the wedge product [29, p. 237] and ∆ denotes ≥ 1 − e−2(C −δ1 − rδ)/r
− δ r. (437)
the Laplace operator [50, Eq. (3.1.6)].11 From (422) we get
The RHS of (437) can be made positive if we choose δ suffi-
h∇fH , ∇ϕγ,Q i fH h∇ k∇ϕγ,Q k2F , ∇ϕγ,Q i
ciently small, in which case T is invertible. We can theorefore
|ψ1 | = 2 − 4 lower-bound k∇ϕγ,Q kF as
k∇ϕγ,Q kF k∇ϕγ,Q kF
fH · ∆ϕγ,Q 2
−3
− (423) k∇ϕξ,Q kF = 3/2
QHΦ T
F (438)
2
k∇ϕγ,Q kF tr Ir − Φ−2
2
1
2
≥ 3/2
QHΦ−3
F ·
fH
∇ k∇ϕγ,Q kF
(439)
k∇fH kF F fH · |∆ϕγ,Q | r kT−1 kF
≤ + 3 + 2
k∇ϕγ,Q kF k∇ϕγ,Q kF k∇ϕγ,Q kF 2 1 1
≥ 3/2 kQHkF · · −1
. (440)
(424) r kΦ kF kT kF
3
where the last step follows from the triangle inequality and the Here, we use the first inequality in (428) and the submultiplica-
Cauchy-Schwarz inequality. tivity of the Frobenius norm. The term kQHkF can be bounded
We proceed to lower-bound k∇ϕγ,Q kF . Using the definition as
H
of the gradient (80) together with the matrix identities [52, p. 29]
H QH
F
kQHkF ≥ (441)
det(I + εA) = 1 + εtr(A) + O(ε2 ), ε→0 (425) kHkF
(I + εA)−1 = I − εA + O(ε2 ), ε→0 (426) λmax (HH QH)
≥ (442)
kHkF
for every bounded square matrix A, we obtain no √
−δ1 − rδ)/r
e(C −1
2QHΦ−3 ≥ (443)
∇ϕγ,Q (H) = − kHkF
3/2
tr Ir − Φ−2 where (443)
follows
from (433).
The term
Φ3
F in (440) can be upper-bounded as
× tr(Ir − Φ−2 )Φ2 + (γ − log det Φ)Ir (427)
3
√
| {z }
Φ
≤ r(1 + λmax (HH QH))3 (444)
F
,T √
≤ r(1 + det Φ)3 (445)
where Φ , Ir + HH QH.
no ≤ const. (446)
√ an arbitrary δ1 ∈ (0, C ) and
Fix choose δ ∈ (0, (Cno −
−2 (446) follows from (429) and because γ ≤ Cno +δ. Finally,
δ1 )/ r). We first bound tr(Ir − Φ ) as Here,
−1
T
in (440) can be bounded as
r ≥ tr Ir − Φ−2 ≥ 1 − (1 + λmax (HH QH))−2 . (428)
F
√
−1
√ r
It follows from the first inequality in (428) and from (389) that
T
≤ rλmax (T−1 ) = . (447)
F λmin (T)
for every u ∈ (−δ, δ)
p √ The RHS of (447) is bounded because of (437). Substitut-
|γ − log det Φ| = |u| tr(Ir − Φ−2 ) ≤ δ r. (429) ing (443), (446) and (447) into (440), we conclude that
Using (429) and that the determinant is given by the product of k∇ϕγ,Q k−1 ≤ const · kHkF . (448)
the eigenvalues, we obtain that, for every γ ∈ (Cno − δ1 , Cno − Following similar steps as the ones reported in (425)–(448),
δ1 ) and every u ∈ (−δ, δ), we can show that
r log(1 + λmax (HH QH)) ≥ log det Φ
(430) 2
∇ k∇ϕγ,Q kF
< const · k∇ϕγ,Q kF (449)
√ F
≥ γ − rδ (431)
√ and
≥ Cno − δ1 − rδ > 0 (432)
|∆ϕγ,Q | < const. (450)
11 The Laplace operator used here and in [50, Eq. (3.1.6)] differs from the
usual one on Rn , as defined in calculus, by a minus sign. See [50, Sec. 3.1] for Substituting (448)–(450) into (424) and using the bounds (81)
a more detailed discussion. and (82), we obtain (418).
30
Proof of (419): We begin by observing that for every H ∈ We can therefore upper-bound Al (u) as
ϕ−1
γ,Q (u) ∩ Ml , every γ ∈ (Cno − δ1 , Cno + δ1 ), every u ∈ Z
(−δ, δ) and every Q ∈ Ute Al (u) = Al (u0 ) + ψ2 dV (465)
ϕ−1
γ,Q ((u0 ,u))∩Ml
2 tr(HH QH) Z
kHkF ≥ (451) ≤ const +
−2tr−1
const · kHkF dV (466)
tr(Q)
M0
1 Z ∞
≥ λmax (HH QH) (452) ≤ const + const · x−2 dx (467)
ρ √
k0
1 (Cno −δ1 −√rδ)/r
≥ e − 1 , k0 . (453) = const. (468)
ρ
Here, (465) follows from (462); (467) follows from (459), (464),
Here, (451) follows from Cauchy-Schwarz inequality; (452)
and (454). Note that the bound (468) is uniform in γ, Q, u, and l.
follows because tr(Q) = ρ for every Q ∈ Ute ; (453) follows
Following similar steps as the ones reported in (460)–(468), we
from (433). From (453) we conclude that
obtain the same result for u ∈ (−δ, u0 ). This proves (419).
2) Convergence of fl (u) and fl0 (u): In this section, we will
p
ϕ−1
γ,Q ((−δ, δ)) ∩ M 0
l ⊂ M , {H ∈ C
t×r
: kHkF ≥ k0 }.
prove (404) and (405). By Lemma 22,
(454) Z
fH dS
To upper-bound Al (u), we note that fU (γ,Q) (u) = . (469)
−1
ϕγ,Q (u) k∇ϕ γ,Q kF
Z δ Z
Al (u)du = kHkF
−2tr−3
dV (455) We have the following chain of inequalities
−δ ϕ−1
γ,Q ((−δ,δ))∩Ml
Z |fl (u) − fU (γ,Q) (u)|
−2tr−3
≤ kHkF dV (456) ≤ |P[H ∈ Ml ]fl (u) − fU (γ,Q) (u)|
M0
Z ∞
+ |(1 − P[H ∈ Ml ])fl (u)| (470)
= const · √ x−4 dx (457) Z
fH dS
k0 ≤
= const. (458) −1
ϕγ,Q (u)∩(C t×r \Ml ) k∇ϕ γ,Q kF
belongs to C 1 ([−δ, δ]). Moreover, following similar steps as eigenvalues of HH Q∗ H, and 0a,b denotes the all zero matrix of
in (470)–(474), we obtain that for all m > l > 0 size a × b. Conditioned on H = H, we have
√
sin2 {In,t∗ , nIn,t∗ UH + W}
0
lim sup |fm (u) − fl (u)| + |fm (u) − fl0 (u)| = 0. (476) √
l→∞ u∈[−δ,δ]
= sin2 In,t∗ L, ( nIn,t∗ UH + W)V
(483)
√
This means that {fl } restricted to [−δ, δ] is a Cauchy sequence, 2 e
= sin LIn,t∗ L, L( nIn,t∗ UH + W)V
e (484)
and, hence, converges in C 1 ([−δ, δ]) with respect to the C 1 2
√
= sin In,t∗ , nIn,t∗ Σ + W (485)
norm (475). Moreover, by (474) the limit of {fl } is fU (γ,Q) .
Therefore, we have fU (γ,Q) ∈ C 1 ([−δ, δ]), and {fl0 } converges where
to fU0 (γ,Q) with respect to the sup-norm k·k∞ . This proves (405). LH
0(n−t∗ )×t∗
L,
e (486)
0t∗ ×(n−t∗ ) In−t∗
A PPENDIX IX is unitary. Here, (483) follows because span(A) = span(AB)
P ROOF OF THE ACHIEVABILITY PART OF T HEOREM 9 for every invertable matrix B; (484) follows because the principal
We prove the achievability asymptotic expansion for Theo- angles between two subspaces are invariant under simultaneous
rem 9. More precisely, we prove the following: rotation of the two subspaces; (485) follows because W is
Proposition 23: Assume that there exists a Q∗ ∈ Ut satisfy- LWV has the same
isotropically distributed, which implies that e
ing (64). Let FQ∗ (·) be as in (87). Assume that the joint pdf of the distribution as W.
nonzero eigenvalues of HH Q∗ H is continuously differentiable Let ej and Wj be the jth column of In,t∗ and W, respectively.
and that FQ∗ (·) is differentiable and strictly increasing at Cno , Then
i.e., √
P sin2 In,t∗ , nIn,t∗ Σ + W ≤ γn
for γ in a certain neighborhood of Ciso . Here, the function qn (·) [13] C. Potter, K. Kosbar, and A. Panagos, “On achievable rates for MIMO
is given in (170); the function ũγ (·) : Rm
+ 7→ R is defined as
systems with imperfect channel state information in the finite length
regime,” IEEE Trans. Commun., vol. 61, no. 7, pp. 2772–2781, Jul. 2013.
Pm [14] W. Yang, G. Caire, G. Durisi, and Y. Polyanskiy, “Finite blocklength
γ − j=1 log(1 + ρλj /t)
ũγ (λ) , q ; (493) channel coding rate under a long-term power constraint,” in Proc. IEEE
Pm Int. Symp. Inf. Theory (ISIT), Honolulu, HI, USA, Jul. 2014.
m − j=1 (1 + ρλj /t)−2 [15] G. Caire, G. Taricco, and E. Biglieri, “Optimum power control over fading
channels,” IEEE Trans. Inf. Theory, vol. 45, no. 5, pp. 1468–1489, May
Λ = [Λ1 , . . . , Λm ]; and k1 is a finite constant independent of γ 1999.
and λ. A lower bound on P[Snrx ((ρ/t)It ) ≤ nγ] follows then by [16] V. V. Petrov, Sums of Independent Random Variates. Springer-Verlag,
1975, translated from the Russian by A. A. Brown.
averaging both sides of (492) with respect to Λ [17] İ. E. Telatar, “Capacity of multi-antenna Gaussian channels,” Eur. Trans.
k1 Telecommun., vol. 10, pp. 585–595, Nov. 1999.
P[Snrx ((ρ/t)It ) ≤ nγ] ≥ E qn ũγ (Λ) + . (494)
[18] A. M. Tulino and S. Verdú, “Random matrix theory and wireless communi-
n cations,” in Foundations and Trends in Communications and Information
Theory. Delft, The Netherlands: now Publishers, 2004, vol. 1, no. 1, pp.
Proceeding as in (199)–(206) and using the assumption that the 1–182.
joint pdf of Λ1 , . . . , Λm is continuously differentiable, we obtain [19] N. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distri-
that for all γ ∈ (Ciso − δ, Ciso + δ) butions, 2nd ed. New York: Wiley, 1995, vol. 2.
[20] E. Abbe, S.-L. Huang, and İ. E. Telatar, “Proof of the outage probability
m conjecture for MISO channels,” IEEE Trans. Inf. Theory, vol. 59, no. 5,
X k2 pp. 2596–2602, May 2013.
E qn ũγ (Λ) ≥ P log(1 + ρΛj /t) ≤ γ + (495) [21] C. E. Shannon, “Probability of error for optimal codes in a Gaussian
j=1
n
channel,” Bell Syst. Tech. J., vol. 38, no. 3, pp. 611–656, May 1959.
[22] A. Barg and D. Y. Nogin, “Bounds on packings of spheres in the Grassmann
for some δ > 0 and k2 > −∞. Substituting (495) into (494), manifold,” IEEE Trans. Inf. Theory, vol. 48, no. 9, pp. 2450–2454, Sep.
we see that for every γ ∈ (Ciso − δ, Ciso + δ) 2002.
[23] Y. Polyanskiy, “Channel coding: non-asymptotic fundamental limits,” Ph.D.
P[Snrx ((ρ/t)It ) ≤ nγ] dissertation, Princeton University, 2010.
[24] Y. Polyanskiy, H. V. Poor, and S. Verdú, “Dispersion of the Gilbert-Elliott
m channel,” IEEE Trans. Inf. Theory, vol. 57, no. 4, pp. 1829–1848, Apr.
X k1 + k2
≥ P log(1 + ρΛj /t) ≤ γ + (496) 2011.
j=1
n [25] M. Tomamichel and V. Y. F. Tan, “-capacities and second-order coding
rates for channels with general state,” IEEE Trans. Inf. Theory, May 2013,
k1 + k2 submitted. [Online]. Available: http://arxiv.org/abs/1305.6789
= Fiso (γ) + . (497) [26] W. Yang, G. Durisi, T. Koch, and Y. Polyanskiy, “Quasi-static SIMO fading
n channels at finite blocklength,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT
The proof of (490) is concluded by repeating the same steps as 2013), Istanbul, Turkey, Jul. 2013.
in (164)–(165). [27] E. MolavianJazi and J. N. Laneman, “On the second-order coding rate of
block fading channels,” in Proc. Allerton Conf. Commun., Contr., Comput.,
Monticello, IL, USA, Oct. 2013, to appear.
[28] A. T. James, “Distribution of matrix variates and latent roots derived from
R EFERENCES normal samples,” Ann. Math. Statist., vol. 35, pp. 475–501, 1964.
[1] E. Biglieri, J. Proakis, and S. Shamai (Shitz), “Fading channels: [29] J. R. Munkres, Analysis on Manifolds. Redwood City, CA: Addison-
Information-theoretic and communications aspects,” IEEE Trans. Inf. Wesley, 1991.
Theory, vol. 44, no. 6, pp. 2619–2692, Oct. 1998. [30] V. Y. F. Tan and M. Tomamichel, “The third-order term in the normal
[2] D. N. C. Tse and P. Viswanath, Fundamentals of Wireless Communication. approximation for the AWGN channel,” IEEE Trans. Inf. Theory, Dec.
Cambridge, U.K.: Cambridge Univ. Press, 2005. 2013, submitted. [Online]. Available: http://arxiv.org/abs/1311.2337
[3] G. J. Foschini and M. J. Gans, “On limits of wireless communications [31] 3GPP TS 36.212, “Technical specification group radio access network;
in a fading environment when using multiple antennas,” Wirel. Personal evolved universal terrestrial radio access (E-UTRA); multiplexing and
Commun., vol. 6, pp. 311–335, 1998. channel coding (release 10),” Dec. 2012.
[4] M. Effros and A. Goldsmith, “Capacity definitions and coding strategies [32] P. Robertson, E. Villebrun, and P. Hoeher, “A comparison of optimal and
for general channels with receiver side information,” in Proc. IEEE Int. sub-optimal MAP decoding algorithms operating in the log domain,” in
Symp. Inf. Theory (ISIT), Cambridge MA, Aug. 1998, p. p. 39. Proc. IEEE Int. Conf. Commun. (ICC), Seattle, USA, Jun. 1995, pp. 1009–
[5] T. S. Han, Information-Spectrum Methods in Information Theory. Berlin, 1013.
Germany: Springer-Verlag, 2003. [33] S. Sesia, I. Toufik, and M. Baker, Eds., LTE–The UMTS Long Term
[6] M. Effros, A. Goldsmith, and Y. Liang, “Generalizing capacity: New Evolution: From Theory to Practice, 2nd ed. UK: Wiley, 2011.
definitions and capacity theorems for composite channels,” IEEE Trans. [34] J. Miao and A. Ben-Israel, “On principal angles between subspaces in Rn ,”
Inf. Theory, vol. 56, no. 7, pp. 3069–3087, Jul. 2010. Linear Algebra Appl., vol. 171, pp. 81–98, 1992.
[7] S. Verdú and T. S. Han, “A general formula for channel capacity,” IEEE [35] S. Afriat, “Orthogonal and oblique projectors and the characteristics of
Trans. Inf. Theory, vol. 40, no. 4, pp. 1147–1157, Jul. 1994. pairs of vector spaces,” Proc. Cambridge Phil. Soc., vol. 53, no. 4, pp.
[8] L. H. Ozarow, S. S. (Shitz), and A. D. Wyner, “Information theoretic 800–816, 1957.
considerations for cellular mobile radio,” IEEE Trans. Inf. Theory, vol. 43, [36] P. Absil, P. Koev, and A. Edelman, “On the largest principal angle between
no. 2, pp. 359–378, May 1994. random subspaces,” Linear Algebra Appl., vol. 414, no. 1, pp. 288–294,
[9] Y. Polyanskiy, H. V. Poor, and S. Verdú, “Channel coding rate in the finite Apr. 2006.
blocklength regime,” IEEE Trans. Inf. Theory, vol. 56, no. 5, pp. 2307– [37] J. C. Roh and B. D. Rao, “Design and analysis of MIMO spatial multi-
2359, May 2010. plexing systems with quantized feedback,” IEEE Trans. Signal Process.,
[10] Y. Polyanskiy and S. Verdú, “Scalar coherent fading channel: dispersion vol. 54, no. 8, pp. 2874–2886, Aug. 2006.
analysis,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Saint Petersburg, [38] J. Neyman and E. S. Pearson, “On the problem of the most efficient tests
Russia, Aug. 2011, pp. 2959–2963. of statistical hypotheses,” Philosophical Trans. Royal Soc. A, vol. 231, pp.
[11] W. Yang, G. Durisi, T. Koch, and Y. Polyanskiy, “Diversity versus channel 289–337, 1933.
knowledge at finite block-length,” in Proc. IEEE Inf. Theory Workshop [39] M. Abramowitz and I. A. Stegun, Eds., Handbook of Mathematical Func-
(ITW), Lausanne, Switzerland, Sep. 2012, pp. 577–581. tions with Formulas, Graphs, and Mathematical Tables, 10th ed. New
[12] J. Hoydis, R. Couillet, and P. Piantanida, “The second-order coding rate York: Dover: Government Printing Office, 1972.
of the MIMO Rayleigh block-fading channel,” IEEE Trans. Inf. Theory, [40] W. Rudin, Principles of Mathematical Analysis, 3rd ed. Singapore:
2013, submitted. [Online]. Available: http://arxiv.org/abs/1303.3400 McGraw-Hill, 1976.
33
[41] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Giuseppe Durisi (S’02, M’06, SM’12) received the Laurea degree summa cum
Cambridge Univ. Press, 2004. laude and the Doctor degree both from Politecnico di Torino, Italy, in 2001 and
[42] N. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distri- 2006, respectively. From 2002 to 2006, he was with Istituto Superiore Mario
butions, 2nd ed. New York: Wiley, 1995, vol. 1. Boella, Torino, Italy. From 2006 to 2010 he was a postdoctoral researcher at
[43] F. Qi and Q.-M. Luo, “Bounds for the ratio of two gamma functions—from ETH Zurich, Switzerland. Since 2010 he has been with Chalmers University of
Wendel’s and related inequalities to logarithmically completely monotonic Technology, Gothenburg, Sweden, where is now associate professor. He held
functions,” Banach J. Math. Anal., vol. 6, no. 2, pp. 132–158, 2012. visiting researcher positions at IMST, Germany, University of Pisa, Italy, ETH
[44] G. Grimmett and D. Stirzaker, Probability and Random Processes, 3rd ed. Zurich, Switzerland, and Vienna University of Technology, Austria.
New York, USA: Oxford University Press, 2001. Dr. Durisi is a senior member of the IEEE. He is the recipient of the 2013
[45] Y. Polyanskiy, “Saddle point in the minimax converse for channel coding,” IEEE ComSoc Best Young Researcher Award for the Europe, Middle East, and
IEEE Trans. Inf. Theory, vol. 59, no. 5, pp. 2576–2595, May 2013. Africa Region, and is co-author of a paper that won a ”student paper award” at
[46] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge, U.K.: the 2012 International Symposium on Information Theory, and of a paper that
Cambridge Univ. Press, 1985. won the 2013 IEEE Sweden VT-COM-IT joint chapter best student conference
[47] A. Edelman, “Eigenvalues and condition numbers of random matrices,” paper award. He served as TPC member in several IEEE conferences, and is
Ph.D. dissertation, MIT, May 1989. currently publications editor of the IEEE Transactions on Information Theory.
[48] A. W. Marshall and I. Olkin, Inequalities: Theory of Majorization and its His research interests are in the areas of communication and information theory.
Application. Orlando, FL: Academic Press, Inc., 1979.
[49] C. Berge, Topological Spaces. Edinburg, UK: Oliver and Boyd, 1963.
[50] J. Jost, Riemannian Geometry and Geometric Analysis, 6th ed. Berlin,
Germany: Springer, 2011.
[51] I. Chavel, Riemannian Geometry: A Modern Introduction. Cambridge, Tobias Koch (S’02, M’09) is a Visiting Professor with the Signal Theory and
UK: Cambridge Univ. Press, 2006. Communications Department of Universidad Carlos III de Madrid (UC3M),
[52] H. Lütkepohl, Handbook of Matrices. Chichester, England: John Wiley Spain. He received the M.Sc. degree in electrical engineering (with distinction)
& Sons, 1996. in 2004 and the Ph.D. degree in electrical engineering in 2009, both from ETH
[53] J. K. Hunter and B. Nachtergaele, Applied Analysis. Singapore: World Zurich, Switzerland. From June 2010 until May 2012 he was a Marie Curie
Scientific Publishing Co., 2001. Intra-European Research Fellow with the University of Cambridge, UK. He was
also a research intern at Bell Labs, Murray Hill, NJ in 2004 and at Universitat
Pompeu Fabra (UPF), Spain, in 2007. He joined UC3M in 2012. His research
interests include digital communication theory and information theory.
Dr. Koch is serving as Vice Chair of the Spain Chapter of the IEEE Information
Theory Society in 2013–2014.
Wei Yang (S’09) received the B.E. degree in communication engineering and Yury Polyanskiy (S’08-M’10) is an Assistant Professor of Electrical Engineer-
M.E. degree in communication and information systems from the Beijing ing and Computer Science at MIT. He received the M.S. degree in applied
University of Posts and Telecommunications, Beijing, China, in 2008 and 2011, mathematics and physics from the Moscow Institute of Physics and Technology,
respectively. He is currently pursuing a Ph.D. degree in electrical engineering at Moscow, Russia in 2005 and the Ph.D. degree in electrical engineering from
Chalmers University of Technology, Gothenburg, Sweden. From July to August Princeton University, Princeton, NJ in 2010. In 2000-2005 he lead the develop-
2012, he was a visiting student at the Laboratory for Information and Decision ment of the embedded software in the Department of Surface Oilfield Equipment,
Systems, Massachusetts Institute of Technology, Cambridge, MA. Borets Company LLC (Moscow). Currently, his research focuses on basic
Mr. Yang is the recipient of a Student Paper Award at the 2012 IEEE questions in information theory, error-correcting codes, wireless communication
International Symposium on Information Theory (ISIT), Cambridge, MA, and and fault-tolerant circuits. Over the years Dr. Polyanskiy won the 2013 NSF
the 2013 IEEE Sweden VT-COM-IT joint chapter best student conference paper CAREER award, the 2011 IEEE Information Theory Society Paper Award and
award. His research interests are in the areas of information and communication Best Student Paper Awards at the 2008 and 2010 IEEE International Symposia
theory. on Information Theory (ISIT).