Download as pdf or txt
Download as pdf or txt
You are on page 1of 33

1

Quasi-Static Multiple-Antenna Fading Channels at


Finite Blocklength
Wei Yang, Student Member, IEEE, Giuseppe Durisi, Senior Member, IEEE,
Tobias Koch, Member, IEEE, and Yury Polyanskiy, Member, IEEE

Abstract—This paper investigates the maximal achievable rate to infinity. In this case, the channel is said to be in outage.
for a given blocklength and error probability over quasi-static For fading distributions for which the fading coefficient can be
multiple-input multiple-output (MIMO) fading channels, with and arbitrarily small (such as for Rayleigh, Rician, or Nakagami
without channel state information (CSI) at the transmitter and/or
arXiv:1311.2012v2 [cs.IT] 16 Apr 2014

the receiver. The principal finding is that outage capacity, de- fading), the probability of an outage is positive. Hence, the
spite being an asymptotic quantity, is a sharp proxy for the finite- overall block error probability  is bounded away from zero
blocklength fundamental limits of slow-fading channels. Specif- for every positive rate R > 0, in which case the Shannon
ically, the channel dispersion is shown to be zero regardless of capacity is zero. More generally, the Shannon capacity depends
whether the fading realizations are available at both transmitter on the fading probability density function (pdf) only through its
and receiver, at only one of them, or at neither of them. These re-
sults follow from analytically tractable converse and achievability support [6], [7].
bounds. Numerical evaluation of these bounds verifies that zero For applications in which a positive block error probability
dispersion may indeed imply fast convergence to the outage capac-  > 0 is acceptable, the maximal achievable rate as a function of
ity as the blocklength increases. In the example of a particular 1×2 the outage probability (also known as capacity versus outage) [1,
single-input multiple-output (SIMO) Rician fading channel, the p. 2631], [8], may be a more relevant performance metric than
blocklength required to achieve 90% of capacity is about an order
of magnitude smaller compared to the blocklength required for an Shannon capacity. The capacity versus outage coincides with
AWGN channel with the same capacity. For this specific scenario, the -capacity C (which is the largest achievable rate under the
the coding/decoding schemes adopted in the LTE-Advanced stan- assumption that the block error probability is less than  > 0) at
dard are benchmarked against the finite-blocklength achievability the points where C is a continuous function of  [7, Sec. IV].
and converse bounds. For the sake of simplicity, let us consider for a moment a
single-antenna communication system operating over a quasi-
I. I NTRODUCTION static flat-fading channel. The outage probability as a function
of the rate R is defined by
Consider a delay-constrained communication system operat-
F (R) = P log(1 + |H|2 ρ) < R .
 
ing over a slowly-varying fading channel. In such a scenario, it (1)
is plausible to assume that the duration of each of the transmitted Here, H denotes the random channel gain and ρ is the signal-
codewords is smaller than the coherence time of the channel, so to-noise ratio (SNR). For a given  > 0, the outage capacity
the random fading coefficients stay constant over the duration (or -capacity) C is the supremum of all rates R satisfying
of each codeword [1, p. 2631], [2, Sec. 5.4.1]. We shall refer to F (R) ≤ . The rationale behind this definition is that, for every
this channel model as quasi-static fading channel.1 realization of the fading coefficient H = h, the quasi-static
When communicating over quasi-static fading channels at a fading channel can be viewed as an AWGN channel with channel
given rate R, the realization of the random fading coefficient may gain |h|2 , for which communication with arbitrarily small block
be very small, in which case the block (frame) error probability  error probability is feasible if and only if R < log(1 + |h|2 ρ),
is bounded away from zero even if the blocklength n tends provided that the blocklength n is sufficiently large. Thus, the
This work was supported in part by the Swedish Research Council under
outage probability can be interpreted as the probability that the
grant 2012-4571, by the Ericsson Research Foundation under grant FOSTIFT- channel gain H is too small to allow for communication with
12:022, by a Marie Curie FP7 Integration Grant within the 7th European arbitrarily small block error probability.
Union Framework Programme under Grant 333680, by the Spanish government
(TEC2009-14504-C02-01, CSD2008-00010, and TEC2012-38800-C03-01), and
A major criticism of this definition is that it is somewhat
by the National Science Foundation under Grant CCF-1253205. The material contradictory to the underlying motivation of the channel model.
of this paper was presented in part at the 2013 and 2014 IEEE International Indeed, while log(1 + |h|2 ρ) is meaningful only for codewords
Symposium on Information Theory.
W. Yang and G. Durisi are with the Department of Signals and Systems,
of sufficiently large blocklength, the assumption that the fading
Chalmers University of Technology, 41296, Gothenburg, Sweden (e-mail: {ywei, coefficient is constant during the transmission of the codeword is
durisi}@chalmers.se). only reasonable if the blocklength is smaller than the coherence
T. Koch is with the Signal Theory and Communications Department, Univer-
sidad Carlos III de Madrid, 28911, Leganés, Spain (e-mail: koch@tsc.uc3m.es).
time of the channel. In other words, it is prima facie unclear
Y. Polyanskiy is with the Department of Electrical Engineering and Computer whether for those blocklengths for which the quasi-static chan-
Science, MIT, Cambridge, MA, 02139 USA (e-mail: yp@mit.edu). nel model is reasonable, the outage capacity is a meaningful
1 The term “quasi-static” is widely used in the communication literature (see,
performance metric.
e.g., [2, Sec. 5.4.1], [3]). The quasi-static channel model belongs to the general
class of composite channels [1, p. 2631], [4] (also known as mixed channels [5, In order to shed light on this issue, we study the maximal
Sec. 3.3]). achievable rate R∗ (n, ) for a given blocklength n and block
2

error probability  over a quasi-static multiple-input multiple- Fast convergence to the outage capacity provides mathematical
output (MIMO) fading channel, subject to a per-codeword power support to the observation reported by several researchers in
constraint. the past that the outage probability describes accurately the
Previous results: Building upon Dobrushin’s and Strassen’s performance over quasi-static fading channels of actual codes
asymptotic results, Polyanskiy, Poor, and Verdú recently showed (see [15] and references therein).
that for various channels with positive Shannon capacity C, the The following example supports our claims: for a 1 × 2 single-
maximal achievable rate can be tightly approximated by [9] input multiple-output (SIMO) Rician-fading channel with C =
r   1 bit/channel use and  = 10−3 , the blocklength required to
V −1 log n

R (n, ) = C − Q () + O . (2) achieve 90% of C for the perfect CSIR case is between 120 and
n n 320 (see Fig. 2 on p. 10), which is about an order of magnitude
−1
Here, Q (·) denotes the inverse of the Gaussian Q-function smaller compared to the blocklength required for an AWGN
Z ∞ channel with the same capacity (see [9, Fig. 12]).
1 −t2 /2 Fast convergence to the outage capacity further suggests that
Q(x) , √ e dt (3)
x 2π communication strategies that are optimal with respect to outage
and V is the channel dispersion [9, Def. 1]. The approxima- capacity may perform also well at finite blocklength. Note, how-
tion (2) implies that to sustain the desired error probability  at ever, that−1 this need not be true for very small blocklengths, where
a finite blocklength n, one pays a penalty on the rate the O(n log n) term in (4) may dominate. Thus, for small n
√ (compared the derived achievability and converse bounds on R∗ (n, ) may
to the channel capacity) that is proportional to 1/ n.
Recent works have extended (2) to some ergodic fading chan- behave differently than the outage capacity. Table I summarizes
nels. Specifically, the dispersion of single-input single-output how the ∗
outage capacity and the achievability/converse bounds
(SISO) stationary fading channels for the case when channel state on R (n, ) derived in this paper depend on system parameters
information (CSI) is available at the receiver was derived in [10]. such as the availability of CSI and the number of antennas at
This result was extended to block-memoryless fading channels the transmitter/receiver. These observations may be relevant for
in [11]. Upper and lower bounds on the second-order coding delay-constrained communication over slowly-varying fading
rate of quasi-static MIMO Rayleigh-fading channels have been channels.
reported in [12] for the asymptotically ergodic setup when the Proof techniques: Our converse bounds on R∗ (n, ) are
number of antennas grows linearly with the blocklength. A lower based on the meta-converse theorem [9, Th. 30]. Our achievabil-
bound on R∗ (n, ) for the imperfect CSI case has been developed ity bounds on R∗ (n, ) are based on the κβ bound [9, Th. 25]
in [13]. The second-order coding rate of single-antenna quasi- applied to a stochastically degraded channel, whose choice is
static fading channels for the case of perfect CSI and long-term motivated by geometric considerations. The main tools used
power constraint has been derived in [14]. to establish (4) are a Cramer-Esseen-type central-limit theo-
rem [16, Th. √ VI.1] and a result on the speed of convergence
Contributions: We provide achievability and converse
of P[B > A/ n] to P[B > 0] for n → ∞, where A and B are
bounds on R∗ (n, ) for quasi-static MIMO fading channels. We
independent random variables.
consider both the case when the transmitter has full transmit CSI
Notation: Upper case letters such as X denote scalar ran-
(CSIT) and, hence, can perform spatial water-filling, and the case
dom variables and their realizations are written in lower case,
when no CSIT is available. Our converse results are obtained
e.g., x. We use boldface upper case letters to denote random
under the assumption of perfect receive CSI (CSIR), whereas
vectors, e.g., X, and boldface lower case letters for their real-
the achievability results are derived under the assumption of no
izations, e.g., x. Upper case letters of two special fonts are used
CSIR.
to denote deterministic matrices (e.g., Y) and random matrices
By analyzing the asymptotic behavior of our achievability
(e.g., Y). The superscripts T and H stand for transposition and
and converse bounds, we show that under mild conditions on
Hermitian transposition, respectively. We use tr(A) and det(A)
the fading distribution,2
  to denote the trace and determinant of the matrix A, respectively,
log n and use span(A) to designate the subspace spanned by the
R∗ (n, ) = C + O . (4)
n column vectors of A.pThe Frobenius norm of a matrix A is
denoted by kAkF , tr(AAH ). The notation A  0 means
This results holds both for the case of perfect CSIT and for the
that the matrix A is positive semi-definite. The function resulting
case of no CSIT, and independently on whether CSIR is available
from the composition of two functions f and g is denoted by
at the receiver or not. By comparing (2) with √ (4), we observe g ◦ f , i.e., (g ◦ f )(x) = g(f (x)). For two functions f (x)
that for the quasi-static fading case, the 1/ n rate penalty is
and g(x), the notation f (x) = O(g(x)), x → ∞, means that
absent. In other words, the -dispersion (see [9, Def. 2] or (52) < ∞, and f (x) = o(g(x)), x → ∞,
lim supx→∞ f (x)/g(x)
below) of quasi-static fading channels is zero. This suggests
∗ means that lim x→∞
f (x)/g(x) = 0. We use Ia to denote the
that the maximal achievable rate R (n, ) converges quickly
identity matrix of size a × a, and designate by Ia,b (a > b)
to C as n tends to infinity, thereby indicating that the outage
the a × b matrix containing the first b columns of Ia . The
capacity is indeed a meaningful performance metric for delay-
distribution of a circularly-symmetric complex Gaussian random
constrained communication over slowly-varying fading channels.
vector with covariance matrix A is denoted by CN (0, A), the
2 These conditions are satisfied by the fading distributions commonly used in Wishart distribution [18, Def. 2.3] with n degrees of freedom and
the wireless communication literature (e.g., Rayleigh, Rician, Nakagami). covariance matrix A defined on matrices of size m×m is denoted
3

TABLE I
O UTAGE CAPACITY VS . FINITE BLOCKLENGTH WISDOM ; t IS THE NUMBER OF TRANSMIT ANTENNAS .

Wisdom C Bounds on R∗ (n, )


CSIT is beneficial only if t > 1 only if t > 1
CSIR is beneficial no [1, p. 2632] yes
With CSIT, waterfilling is optimal yes [17] no
With CSIT, the channel is reciprocal3 yes [17] only with CSIR

by Wm (n, A), and the Beta distribution [19, Ch. 25] is denoted no, tx, rx, and rt, respectively. Next, we introduce the notion of
by Beta(·, ·). The symbol R+ stands for the nonnegative real a channel code for each of these four settings.
line, Rm m
+ ⊂ R is the nonnegative orthant of the m-dimensional Definition 1 (no-CSI): An (n, M, )no code consists of:
real Euclidean spaces, and Rm m
≥ ⊂ R+ is defined by i) an encoder fno : {1, . . . , M } 7→ Cn×t that maps the mes-
sage J ∈ {1, . . . , M } to a codeword X ∈ {C1 , . . . , CM }.
Rm m
≥ , {x ∈ R+ : x1 ≥ · · · ≥ xm }. (5)
The codewords satisfy the power constraint
The indicator function is denoted by 1{·}, and [ · ]+ , 2
kCi kF ≤ nρ, i = 1, . . . , M. (9)
max{ · , 0}. Finally, log(·) is the natural logarithm.
Given two distributions P and Q on a common measurable ii) A decoder gno : Cn×r 7→ {1, . . . , M } satisfying a maximum
space W, we define a randomized test between P and Q as a probability of error constraint
random transformation PZ | W : W 7→ {0, 1} where 0 indicates
that the test chooses Q. We shall need the following performance max P[gno (Y) 6= J | J = j] ≤  (10)
1≤j≤M
metric for the test between P and Q:
Z where Y is the channel output induced by the transmitted
βα (P, Q) , min PZ | W (1 | w)Q(dw) (6) codeword X = fno (j) according to (8).
Definition 2 (CSIR): An (n, M, )rx code consists of:
where the minimum is over all probability distributions PZ | W i) an encoder fno : {1, . . . , M } 7→ Cn×t that maps the mes-
satisfying sage J ∈ {1, . . . , M } to a codeword X ∈ {C1 , . . . , CM }.
Z The codewords satisfy the power constraint (9).
PZ | W (1 | w)P (dw) ≥ α. (7) ii) A decoder grx : Cn×r × Ct×r 7→ {1, . . . , M } satisfying
max P[grx (Y, H) 6= J | J = j] ≤ . (11)
1≤j≤M
II. S YSTEM M ODEL
Definition 3 (CSIT): An (n, M, )tx code consists of:
We consider a quasi-static MIMO fading channel with t
transmit and r receive antennas. Throughout this paper, we i) an encoder ftx : {1, . . . , M } × Ct×r 7→ Cn×t that maps the
denote the minimum number of transmit and receive antennas message j ∈ {1, . . . , M } and the channel H to a codeword
by m, i.e., m , min{t, r}. The channel input-output relation is X = ftx (j, H) satisfying
given by 2 2
kXkF = kftx (j, H)kF ≤ nρ,
Y = XH + W. (8) ∀j = 1, . . . , M, ∀H ∈ Ct×r . (12)

Here, X ∈ Cn×t is the signal transmitted over n channel uses; ii) A decoder gno : Cn×r 7→ {1, . . . , M } satisfying (10).
Y ∈ Cn×r is the corresponding received signal; the matrix H ∈ Definition 4 (CSIRT): An (n, M, )rt code consists of:
Ct×r contains the complex fading coefficients, which are random i) an encoder ftx : {1, . . . , M } × Ct×r 7→ Cn×t that maps the
but remain constant over the n channel uses; W ∈ Cn×r denotes message j ∈ {1, . . . , M } and the channel H to a codeword
the additive noise at the receiver, which is independent of H X = ftx (j, H) satisfying (12).
and has independent and identically distributed (i.i.d.) CN (0, 1) ii) A decoder grx : Cn×r × Ct×r 7→ {1, . . . , M } satisfy-
entries. ing (11).
We consider the following four scenarios: The maximal achievable rate for the four cases listed above
1) no-CSI: neither the transmitter nor the receiver is aware of is defined as follows:
the realizations of the fading matrix H;  
2) CSIT: the transmitter knows H; log M
Rl∗ (n, ) , sup : ∃(n, M, )l code ,
3) CSIR: the receiver knows H; n
4) CSIRT: both the transmitter and the receiver know H. l ∈ {no, rx, tx, rt}. (13)
To keep the notation compact, we shall abbreviate in mathemat- From Definitions 1–4, it follows that
ical formulas the acronyms no-CSI, CSIT, CSIR, and CSIRT as
∗ ∗ ∗
Rno (n, ) ≤ Rtx (n, ) ≤ Rrt (n, ) (14)
3 A channel is reciprocal for a given performance metric (e.g., outage capacity)
∗ ∗ ∗
if substituting H with HH does not change the metric. Rno (n, ) ≤ Rrx (n, ) ≤ Rrt (n, ). (15)
4

III. A SYMPTOTIC R ESULTS AND P REVIEW where


Fno (R) , inf P log det Ir + HH QH < R
  
It was noted in [1, p. 2632] that the -capacity of quasi- (26)
Q∈Ut
static MIMO fading channel does not depend on whether CSI
is available at the receiver. Intuitively, this is true because the is the outage probability for the no-CSIT case. The matrix Q
channel stays constant during the transmission of a codeword, that minimizes the right-hand-side (RHS) of (26) is in general
so it can be accurately
√ estimated at the receiver through the not known, making this case more difficult to analyze and our
transmission of n pilot symbols with no rate penalty as n → ∞. nonasymptotic results less sharp and more difficult to evaluate
A rigorous proof of this statement follows by our zero-dispersion numerically. The minimization in (26) can be restricted to all Q
results (Theorems 3 and 9). In contrast, if CSIT is available and on the boundary of Ut , i.e.,
t > 1, then water-filling over space yields a larger -capacity [15]. Fno (R) = inf e P log det Ir + HH QH < R
  
(27)
Q∈Ut
We next define C for both the CSIT and the no-CSIT case.
Let Ut be the set of t × t positive semidefinite matrices whose where
trace is upper-bounded by ρ, i.e., U e , {A ∈ Ct×t : A  0, tr(A) = ρ}. (28)
t
Ut , {A ∈ Ct×t : A  0, tr(A) ≤ ρ}. (16) We lower-bound Rno ∗
(n, ) in Section V-A (Theorem 4), and

upper-bound Rrx (n, ) in Section V-B (Theorem 6). The asymp-
When CSI is available at the transmitter, the -capacity Ctx is totic analysis of the bounds provided in Section V-C (Theorem 9)
given by [15, Prop. 2]4 allows us to establish (4), although under slightly more strin-

Ctx = lim Rtx (n, ) (17) gent assumptions on the fading probability distribution than for
n→∞ the CSIT case.

= lim Rrt (n, ) (18) For the i.i.d. Rayleigh-fading model (without CSIT),
n→∞
= sup{R : Ftx (R) ≤ } (19) Telatar [17] conjectured that the optimal Q is of the form5
ρ
diag{1, . . . , 1, 0, . . . , 0}, 1 ≤ t∗ ≤ t (29)
where t∗ | {z } | {z }
  t∗ t−t∗
Ftx (R) , P max log det Ir + HH QH < R

(20) and that for small  values or for high SNR values, all available
Q∈Ut
transmit antennas should be used, i.e., t∗ = t. We define the
denotes the outage probability. Given H = H, the function -rate Ciso resulting from the choice Q = (ρ/t)It as
log det Ir + HH QH in (20) is maximized by the well-known
Ciso , sup{R : Fiso (R) ≤ } (30)
water-filling power-allocation strategy (see, e.g., [17]), which
results in where
h  ρ  i
m
X +
Fiso (R) , P log det Ir + HH H < R . (31)
max log det Ir + HH QH = t

[log(γ̄λj )] (21)
Q∈Ut
j=1 The -rate Ciso is often taken as an accurate lower bound on
the actual -capacity for the case of i.i.d Rayleigh fading and no
where the scalars λ1 ≥ · · · ≥ λm denote the m largest
CSIT. Motivated by this fact, we consider in Section V codes
eigenvalues of HH H, and γ̄ is the solution of
with isotropic codewords, i.e., chosen from the set
m  
X 1 H ρ
[γ̄ − 1/λj ]+ = ρ. (22) Fiso , X ∈ C n×t
: X X = It . (32)
j=1
n t
We indicate by (n, M, )iso a code with M codewords chosen
In Section IV, we study quasi-static MIMO channels with CSIT from Fiso and with a maximal error probability smaller than .
at finite blocklength. We present an achievability (lower) bound For this special class of codes, the maximal achievable rate

on Rtx (n, ) (Section IV-A, Theorem 1) and a converse (up- ∗
Rno,iso ∗
(n, ) for the no-CSI case and Rrx,iso (n, ) for the CSIR

per) bound on Rrt (n, ) (Section IV-B, Theorem 2). We show case can be characterized more accurately at finite blocklength
in Section IV-C (Theorem 3) that, under mild conditions on (Theorem 8) than for the general no-CSI case. Furthermore, we
the fading distribution, the two bounds match asymptotically show in Section V-C (Theorem 11) that under mild conditions
up to a O(log(n)/n) term. This allows us to establish the zero- on the fading distributions (weaker than the ones required for
dispersion result (4) for the CSIT case. the general no-CSI case)
When CSI is not available at the transmitter, the -  
capacity Cno is given by [17], [6] ∗ ∗ iso log n
{Rno,iso (n, ), Rrx,iso (n, )} = C + O . (33)
n

Cno = lim Rno (n, ) (23) A final remark on notation. For the single-transmit-antenna
n→∞
∗ case (i.e., t = 1), the -capacity does not depend on whether
= lim Rrx (n, ) (24)
n→∞
CSIT is available or not [15, Prop. 3]. Hence, we shall denote
= sup{R : Fno (R) ≤ } (25) the -capacity for this case simply as C .
4 More precisely, (19) and (25) hold provided that C tx and C no are continuous 5 This conjecture has recently been proved for the multiple-input single-output
 
functions of  [7, Th. 6]. case [20].
5

θǫ
IV. CSI AVAILABLE AT THE T RANSMITTER y′ y

A. Achievability w kw ′ k ≈ kwk ≈ n
w′
In this section, we consider the case where CSI is available h′ x1 hx1 hx1 ,w ′ i
≈ hx1 ,wi
≈0
kx1 kkw ′ k kx1 kkwk
at the transmitter but not at the receiver. Before establishing our
achievability bound in Section IV-A2, we provide some geomet-
ric intuition that will guide us in the choice of the decoder gno
(see Definition 3).
1) Geometric Intuition: Consider for simplicity a real-valued Fig. 1. A geometric illustration of the outage event for large blocklength n. In
quasi-static SISO channel (t = r = 1), i.e., a channel with the example, the fading realization h0 triggers an outage event, h does not.
input-output relation

Y = Hx + W (34) finite blocklength, provided that the threshold θ is chosen


appropriately.
where Y , x, and W are n-dimensional vectors, and H is a
In the following, we generalize the aforementioned threshold
(real-valued) scalar. As reviewed in Section I, the typical error
decoder to the MIMO case and present our achievability results.
event for the quasi-static fading channel (in the large blocklength
regime) is that the instantaneous channel gain H 2 is not large 2) The Achievability Bound: To state our achievability (lower)

enough to support the desired rate R, i.e., 12 log(1 + ρH 2 ) < R bound on Rtx (n, ), we will need the following definition, which
(outage event). For the channel in (34), the -capacity C , i.e., extends the notion of angle between real vectors to complex
the largest rate R for which the probability that the channel is subspaces.
in outage is less than , is given by Definition 5: Let A and B be subspaces in Cn with a =
    dim(A) ≤ dim(B) = b. The principal angles 0 ≤ θ1 ≤ · · · ≤
1 θa ≤ π/2 between A and B are defined recursively by
C = sup R : P log(1 + ρH 2 ) < R ≤  . (35)
2 March 28, 2014 DRAFT

Roughly speaking, the decoder of a C -achieving code may cos θk , max |ha, bi|,
a ∈ A, b ∈ B : kak = kbk = 1,
commit an error only when the channel is in outage. Pick now an ha, ai i = hb, bi i = 0, i = 1, . . . , k − 1
arbitrary codeword x1 from the hypersphere {x ∈ Rn : kxk2 = k = 1, . . . , a. (39)
nρ}, and let Y be the received signal corresponding to x1 .
Following [21], we analyze the angle θ(x1 , Y ) between x1 Here, ak and bk , k = 1, . . . , a, are the vectors that achieve the
and Y as follows. By the law of large numbers, the noise maximum in (39) at the kth recursion. The angle between the
vector W is approximately orthogonal to x1 if n is large, i.e., subspaces A and B is defined by
hx1 , W i a
→ 0, n → ∞. (36) Y
kx1 kkW k sin{A, B} , sin θk . (40)
k=1
Also by the law of large numbers, kW k2 /n → 1 as n → ∞.
Hence, for a given H and for large n, the angle θ(x1 , Y ) can With a slight abuse of notation, for two matrices A ∈ Cn×a
be approximated as and B ∈ Cn×b , we abbreviate sin{span(A), span(B)} with
kW k sin{A, B}. When the columns of A and B are orthonormal bases
θ(x1 , Y ) ≈ arcsin p (37) for span(A) and span(B), respectively, we have (see, e.g., [22,
H 2 kx1 k2 + kW k2
Sec. I])
1
≈ arcsin p (38)
ρH 2 + 1 sin2 {A, B} = det I − AH BBH A

(41)
= det I − BH AAH B .

where the first approximation follows by (36) and the second (42)
approximation follows because kW k2 ≈ n. It follows from (35)
and (38) that θ(x1 , Y ) is larger than θ , arcsin(e−C ) in the Some additional properties of the operator sin{·, ·} are listed in
outage case, and smaller than θ otherwise (see Fig. 1). Appendix I.
This geometric argument suggests the use of a threshold We are now ready to state our achievability bound.
decoder that, for a given received signal Y , declares xi to be Theorem 1: Let Λ1 ≥ · · · ≥ Λm be the m largest eigenvalues
the transmitted codeword if xi is the only codeword for which of HHH . For every 0 <  < 1 and every 0 < τ < , there exists an
θ(xi , Y ) ≤ θ . If no codewords or more than one codeword (n, M, )tx code for the channel (8) that satisfies
meet this condition, the decoder declares an error. Thresholding
angles instead of log-likelihood ratios (cf., [9, Th. 17 and Th. 25]) log M 1 τ
≥ log  Qr . (43)
j=1 Bj ≤ γn
appears to be a natural approach when CSIR is unavailable. n n P
Note that the proposed threshold decoder does neither require
CSIR nor knowledge of the fading distribution. As we shall Here, Bj ∼ Beta(n−t−j +1, t), j = 1, . . . , r, are independent
see, it achieves (4) and yields a tight achievability bound at Beta-distributed random variables, and γn ∈ [0, 1] is chosen so
6

that where
m
(n − 1)n e−(n−1) Γ(n, n − 1)


 n np
P sin2 In,t , nIn,t diag v1∗ Λ1 , . . . , crt (n) , +
Γ(n) Γ(n)
H
  
p
∗ Λ , 0, . . . , 0
o o × EH det(It + ρHH ) (50)
vm m + W ≤ γn ≥ 1 −  + τ (44)
and the scalar γn (v) is the solution of
| {z }
t−m

where P[Snrt (v, Λ) ≤ nγn (v)] = . (51)


The infimum on the RHS of (49) is taken over all power allocation
vj∗ = [γ̄ − 1/Λj ]+ , j = 1, . . . , r (45)
functions v : Rm + 7→ Vm .
are the water-filling power gains and γ̄ is defined in (22). Proof: See Appendix III.
Proof: The achievability bound is based on a decoder that Remark 1: The infimum on the RHS of (49) makes the con-
operates as follows: it first computes the sine of the angle between verse bound in Theorem 2 difficult to evaluate numerically. We
the subspace spanned by the received matrix Y and the subspace can further upper-bound the RHS of (49) by lower-bounding
spanned by each codeword; then, it chooses the first codeword for P[Lrt
n (v, Λ) ≥ nγn (v)] for each v(·) using [9, Eq. (102)]
which the squared sine of the angle is below γn . To analyze the and the Chernoff bound. After doing so, the infimum can be

performance of this decoder, we apply the κβ bound [9, Th. 25] computed analytically and the resulting upper bound on Rrt (n, )
to a physically degraded channel whose output is span(Y). See allows for numerical evaluations. Unfortunately, this bound is
Appendix II for the complete proof. in general loose.
Remark 2: As we shall discuss in Section V-B, the bound (49)
can be tightened and evaluated numerically in the SIMO case
B. Converse or when the codewords are isotropic, i.e., are chosen from the
set Fiso in (32). Note that in both scenarios CSIT is not beneficial.
In this section, we shall assume both CSIR and CSIT. Our
converse bound is based on the meta-converse theorem [9,
Th. 30]. Since CSI is available at both the transmitter and the C. Asymptotic Analysis
receiver, the MIMO channel (8) can be transformed into a set Following [9, Def. 2], we define the -dispersion of the chan-
∗ ∗
of parallel quasi-static channels. The proof of Theorem 2 below nel (8) with CSIT via Rtx (n, ) (resp. Rrt (n, )) as
builds on [23, Sec. 4.5], which characterizes the nonasymptotic 2
C − Rl∗ (n, )
 tx 
coding rate of parallel AWGN channels. Vl , lim sup n ,
Theorem 2: Let Λ1 ≥ · · · ≥ Λm be the m largest eigenvalues n→∞ Q−1 ()
of HHH , and let Λ , [Λ1 , . . . , Λm ]T . Consider an arbitrary  ∈ (0, 1)\{1/2}, l = {tx, rt}. (52)
power-allocation function v : Rm + 7→ Vm , where
Theorem 3 below characterizes the -dispersion of the quasi-
n Xm o
static fading channel (8) with CSIT.
Vm , [p1 , . . . , pm ] ∈ Rm
+ : pj ≤ ρ . (46)
j=1 Theorem 3: Assume that the fading channel H satisfies the
Let following conditions:
 
n X
m
1) the expectation EH det(It + ρHHH ) is finite;
X 2) the joint pdf of the ordered nonzero eigenvalues of HH H
Lrt

n (v, Λ) , log 1 + Λj vj (Λ) + 1
exists and is continuously differentiable;
i=1 j=1
q 2 ! 3) Ctx is a point of growth of the outage probability func-
tion (20) , i.e.,6
q
− Λj vj (Λ)Zi,j − 1 + Λj vj (Λ)
(47)
0
Ctx > 0.

Ftx (53)
and Then
n X
m
 
∗ ∗ log n
(n, ) = Ctx + O
X 
Snrt (v, Λ) Rtx (n, ), Rrt . (54)

, log 1 + Λj vj (Λ) + 1 n
i=1 j=1
Λj vj (Λ)Zij − 1 2
p ! Hence, the -dispersion is zero for both the CSIRT and the CSIT
− (48) case:
1 + Λj vj (Λ)
Vtx = Vrt = 0,  ∈ (0, 1)\{1/2}. (55)
where vj (·) is the jth coordinate of v(·), and Zij , i = 1, . . . , n,
Proof: To prove (54), we first establish in Appendix IV the
j = 1, . . . , m, are i.i.d. CN (0, 1) distributed random variables.
converse result
For every n and every 0 <  < 1, the maximal achievable rate  
∗ log n
on the channel (8) with CSIRT is upper-bounded by Rrt (n, ) ≤ Ctx + O (56)
n
∗ 1 crt (n)
Rrt (n, ) ≤ log (49) 6 Note that this condition implies that C tx is a continuous function of  (see
n inf P[Ln (v, Λ) ≥ nγn (v)]
rt 
v(·) Section III).
7

by analyzing the upper bound (49) in the limit n → ∞. We next is the capacity of the channel (8) when H = H (the water-filling
prove in Appendix V the achievability result power allocation values {vj∗ } in (60) are given in (45) and {λj }
  are the eigenvalues of HH H), and
∗ log n
Rtx (n, ) ≥ Ctx + O (57) m
n X 1
V (H) = m − (61)
(1 + vj∗ λj )2
by expanding (43) for n → ∞. The desired result then follows j=1
by (14). is the dispersion of the channel (8) when H = H [23, Th. 78].
Remark 3: As mentioned in Section I, the quasi-static fading Theorem 3 and the expansion
channel considered in this paper belongs to the general class  
of composite or mixed channels, whose -dispersion is known N 1
Rrt (n, ) = Ctx + O (62)
in some special cases. Specifically, the dispersion of a mixed n
channel with two states was derived in [24, Th. 7]. This result was (which follows from Lemma 17 in Appendix IV-C and Taylor’s
extended to channels with finitely many states in [25, Th. 4]. √ In theorem) suggest that this approximation is accurate, as con-
both cases, the rate of convergence to the -capacity is O(1/ n) firmed by the numerical results reported in Section VI-A. Note
(positive dispersion), as opposed to O(log(n)/n) in Theorem 3. that the same approximation has been concurrently proposed
Our result shows that moving from finitely many to uncountably in [27]; see also [24, Def. 2] and [25, Sec. 4] for similar approx-
many states (as in the quasi-static fading case) yields a drastic imations for mixed channels with finitely many states.
change in the value of the channel dispersion. For this reason,
our result is not derivable from [24] or [25]. V. CSI N OT AVAILABLE AT THE T RANSMITTER
Remark 4: It can be shown that the assumptions on the fading
A. Achievability
matrix in Theorem 3 are satisfied by most probability distri-
butions used to model MIMO fading channels, such as i.i.d. In this section, we shall assume that neither the transmitter
or correlated Rayleigh, Rician, and Nakagami. However, the nor the receiver have a priori CSI. Using the decoder described
(nonfading) AWGN MIMO channel, which can be seen as a in IV-A, we obtain the following achievability bound.
quasi-static fading channel with fading distribution equal to a Theorem 4: Assume that for a given 0 <  < 1 there exists a

step function, does not meet these assumptions and has, in fact, Q ∈ U t such that
positive dispersion [23, Th. 78].
Fno (Cno ) = inf P log det Ir + HH QH ≤ Cno (63)
  
While zero dispersion indeed may imply fast convergence Q∈Ut
= P log det Ir + HH Q∗ H ≤ Cno
  
to -capacity, this is not true anymore when the probability (64)
distribution of the fading matrix approaches a step function, in
which case the higher-order terms in the expansion (54) become i.e., the infimum in (63) is a minimum. Then, for every 0 < τ < 
more dominant. Consider for example a SISO Rician fading there exists an (n, M, )no code for the channel (8) that satisfies
channel with Rician factor K. For  < 1/2, one can refine (54) log M 1 τ
and show that [26] ≥ log  Qr . (65)
j=1 Bj ≤ γn
n n P
√  
log n c1 K + c2 1 ∗ Here, Bj ∼ Beta(n − t∗ − j + 1, t∗ ), j = 1, . . . , r, are
C − − +o ≤ Rtx (n, )
n n n independent Beta-distributed random variables, t∗ , rank(Q∗ ),

log n c̃1 K + c̃2
 
1 and γn ∈ [0, 1] is chosen so that

≤ Rrt (n, ) ≤ C + − +o (58) √
n n n P sin2 {In,t∗ , nIn,t∗ UH + W} ≤ γn ≥ 1 −  + τ (66)
 

where c1 , c2 , c̃1 and c̃2 are finite constants with c1 > 0 and ∗
with U ∈ Ct ×t satisfying UH U = Q∗ .
c̃1 > 0. As we let the Rician factor K become large, the fading Proof: The proof is identical to the proof of Theorem 1,
distribution converges to a step function and the third term in
both the left-most lower bound and the right-most upper bound √ the precoding matrix P(H) (defined
with the only difference that
in (108)) is replaced by nIn,t∗ U.
becomes increasingly large in absolute value. The assumption in (64) that the -capacity-achieving input
covariance matrix of the channel (8) exists is mild. A sufficient
D. Normal Approximation condition for the existence of Q∗ is given in the following
proposition.
N ∗
h i
We define the normal approximation Rrt (n, ) of Rrt (n, ) 2
Proposition 5: Assume that E kHkF < ∞ and that the
as the solution of
distribution of H is absolutely continuous with respect to the
Lebesgue measure on Ct×r . Then, for every R ∈ R+ , the
" !#
N
C(H) − Rrt (n, )
=E Q p . (59) infimum in (26) is a minimum.
V (H)/n
Proof: See Appendix VI.
Here, For the SIMO case, the RHS of (43) and the RHS of (65)
m
coincide, i.e.,
X
C(H) = log(1 + vj∗ λj ) (60)  1
Rtx (n, ), Rno (n, ) ≥ log
τ
(67)
j=1 n P[B ≤ γn ]
8

where B ∼ Beta(n − r, r), and γn ∈ [0, 1] is chosen so that 1) SIMO case: For the SIMO case, CSIT is not beneficial [26]
√ and the bounds (49) and (71) can be tightened as follows.
P[sin2 {e1 , nρe1 H T + W} ≤ γn ] ≥ 1 −  + τ. (68) Theorem 7: Let
n 
X 2
Here, e1 stands for the first column of the identity matrix In .
p p
Ln , n log(1 + ρG) + 1 − ρGZi − 1 + ρG
The achievability bound (67) follows from (43) and (65) by i=1
noting that the random
Qvariable B on the RHS of (67) has the (74)
r
same distribution as i=1 Bi , where Bi ∼ Beta(n − i, 1),
i = 1, . . . , r. and

ρGZi − 1 2
n !
X
Sn , n log(1 + ρG) + 1− (75)
i=1
1 + ρG
B. Converse
with G , kHk2 and Zi , i = 1, . . . , n, i.i.d. CN (0, 1) dis-
For the converse, we shall assume CSIR but not CSIT. The
tributed. For every n and every 0 <  < 1, the maximal
counterpart of Theorem 2 is the following result.
achievable rate on the quasi-static fading channel (8) with one
Theorem 6: Let Ute be as in (28). For an arbitrary Q ∈ Ute ,
transmit antenna and with CSIR (with or without CSIT) is upper-
let Λ1 ≥ · · · ≥ Λm be the ordered eigenvalues of HH QH. Let
bounded by
n Xm 
1 1

X 2 ∗ ∗
Lrx (n − 1, ) ≤ Rrt (n − 1, ) ≤
p p
n (Q) , log(1+Λ j )+1− Λ Z
j ij − 1 + Λ j
Rrx log (76)
n−1 P[Ln ≥ nγn ]
i=1 j=1
(69) where γn is the solution of
and
P[Sn ≤ nγn ] = . (77)
n X m  Λj Zij − 1 2 
p
Proof: See [26, Th. 1]. The main difference between the
X
rx
Sn (Q) , log(1 + Λj ) + 1 − (70)
i=1 j=1
1 + Λj proof of Theorem 7 and the proof of Theorem 2 and Theorem 6
is that the simple bound 0 ≥ 1 − 1/M on the maximal
where Zij , i = 1, . . . , n, j = 1, . . . , m, are i.i.d. CN (0, 1) error probability of the auxiliary channel in the meta-converse
distributed. Then, for every n ≥ r and every 0 <  < 1, theorem [9, Th. 30] suffices to establish the desired result. The
the maximal achievable rate on the quasi-static MIMO fading more sophisticated bounds reported in Lemma 14 (Appendix III)
channel (8) with CSIR is upper-bounded by and Lemma 19 (Appendix VII) are not needed.
2) Converse for (n, M, )iso codes: In Theorem 8 below,
∗ 1 crx (n) we establish a converse bound on the maximal achievable rate
Rrx (n − 1, ) ≤ log . (71)
n−1 n (Q) ≥ nγn (Q)]
inf e P[Lrx of (n, M, )iso codes introduced in Section III. As such codes
Q∈Ut
consist of isotropic codewords chosen from the set Fiso in (32),
Here, CSIT is not beneficial also in this scenario.
b(r+1)2 /4c  Theorem 8: Let Lrx rx
n (·) and Sn (·) be as in (69) and (70),
π r(r−1)

crx (n) , E 1 + ρ kHkF
2 respectively. Then, for every n and every 0 <  < 1, the maximal
Γr (n)Γr (r) ∗
achievable rate Rrx,iso (n, ) of (n, M, )iso codes over the quasi-
r
static MIMO fading channel (8) with CSIR is upper-bounded by
Y 
n+r−2i+1 −(n+r−2i)
× (n + r − 2i) e
∗ ∗ 1 1
i=1 Rrx,iso (n, ) ≤ Rrt,iso (n, ) ≤ log
 n P[Ln ((ρ/t)It ) ≥ nγn ]
rx
+ Γ(n + r − 2i + 1, n + r − 2i) (72) (78)
where γn is the solution of
with Γ(·) (·) denoting the complex multivariate Gamma func-
tion [28, Eq. (83)], and γn (Q) is the solution of P[Snrx ((ρ/t)It ) ≤ nγn ] = . (79)
Proof: The proof follows closely the proof of Theorem 6.
P[Snrx (Q) ≤ nγn (Q)] = . (73)
As in the SIMO case, the main difference is that the simple bound
Proof: See Appendix VII. 0 ≥ 1 − 1/M on the maximal error probability of the auxiliary
channel in the meta-converse theorem [9, Th. 30] suffices to
The infimum in (71) makes the upper bound more diffi-
establish (79).
cult to evaluate numerically and to analyze asymptotically up
to O(log(n)/n) terms than the upper bound (49) that we estab-
lished for the CSIT case. In fact, even the simpler problem of C. Asymptotic Analysis
finding the matrix Q that minimizes lim P[Lrx n (Q) ≥ nγn ] To state our dispersion result, we will need the following
n→∞
is open. Next, we consider two special cases for which the definition of the gradient ∇g of a differentiable function g :
t×r
bound (71) can be tightened and evaluated numerically: the C 7→ R. Let L ∈ Ct×r , then we shall write ∇g(H) = L if
SIMO case and the case where all codewords are chosen from d
= Re tr AH L , ∀A ∈ Ct×r . (80)
 
g(H + tA)

the set Fiso . dt t=0
9

Theorem 9 below establishes the zero-dispersion result for the Conditions 1 and 2 of Theorem 9, the function Q 7→ FQ0 (Cno )
case of no CSIT. Because of the analytical intractability of the is continuous with respect to the same metric. By the extreme
minimization in the converse bound (71), Theorem 9 requires value theorem [29, p. 34], these two properties imply that the
more stringent conditions on the fading distribution compared to infimum on the RHS of (89) is a minimum. Then, one shows
the CSIT case (cf., Theorem 3), and its proof is more involved. that for Rayleigh, Rician, and Nakagami distributions
Theorem 9: Let fH be the pdf of the fading matrix H. Assume
that H satisfies the following conditions: FQ0 (Cno ) > 0, ∀Q ∈ Q . (90)
1) fH is a smooth function, i.e., it has derivatives of all orders. One way to prove (90) is to write FQ0 (Cno ) in integral form using
2) There exists a positive constant a such that Lemma 22 in Appendix VIII-A1 and to show that the resulting
−2tr−b(r+1)2 /2c−1 integral is positive.
fH (H) ≤ a kHkF (81)
−2tr−5
For the SIMO case, the conditions on the fading distribution
k∇fH (H)kF ≤ a kHkF . (82) can be relaxed and the following result holds.
3) The function Fno (·) satisfies Theorem 10: Assume that the pdf of kHk2 is continuously
differentiable and that the -capacity C is a point of growth for
Fno (Cno + δ) − Fno (Cno ) the outage probability function
lim inf > 0. (83)
δ→0 δ
Then, F (R) = P[log(1 + kHk2 ρ) < R] (91)
 
∗ ∗ log n i.e., F 0 (C ) > 0. Then,
(n, ) = Cno + O

Rno (n, ), Rrx . (84)
n  
 ∗ ∗
log n
Hence, the -dispersion is zero for both the CSIR and the no-CSI Rno (n, ), Rrx (n, ) = C + O . (92)
n
case:
Proof: In the SIMO case, CSIT is not beneficial [26, Th. 5].
Vno = Vrx = 0,  ∈ (0, 1)\{1/2}. (85)
Hence, the result follows directly from Theorem 3 and Proposi-
Proof: See Appendices VIII and IX. tion 23 in Appendix IX.
Remark 5: It can be shown that Conditions 1–3 in Theorem 9 Similarly, for the case of codes consisting of isotropic code-
are satisfied by the probability distributions commonly used to words, milder conditions on the fading distribution are sufficient
model MIMO fading channels, such as Rayleigh, Rician, and to establish zero dispersion, as illustrated in the following theo-
Nakagami. Condition 2 requires simply that fH has a polynomi- rem.
ally decaying tail. Condition 3 plays the same role as (53) in the Theorem 11: Assume that the joint pdf of the nonzero eigen-
CSIT case. The exact counterpart of (53) for the no-CSIT case values of HH H is continuously differentiable and that
would be 0
Fiso (Ciso ) > 0 (93)
0 no
Fno (C ) > 0. (86)
where Fiso is the outage probability function given in (31). Then,
However, different from (53), the inequality (86) does not neces- we have
sarily hold for the commonly used fading distributions. Indeed,  
∗ ∗ iso log n
consider a MISO i.i.d. Rayleigh-fading channel. As proven {Rno,iso (n, ), Rrx,iso (n, )} = C + O . (94)
in [20], the -capacity-achieving covariance matrix for this n
case is given by (29). The resulting outage probability function Proof: See Appendix X.
Fno (·) may not be differentiable at the rates R for which the
infimum in (27) is achieved by two input covariance matrices
with different number of nonzero entries t∗ on the main diagonal. D. Normal Approximation
Next, we briefly sketch how to prove that Condition 3 holds For the general no-CSIT MIMO case, the unavailability of a
for Rayleigh, Rician, and Nakagami distributions. Let closed-form expression for the -capacity Cno in (25) prevents us
from obtaining a normal approximation for the maximum coding
FQ (R) , P[log det Ir + HH QH < R].

(87)
rate at finite blocklength. However, such an approximation can
Let Q be the set of all -capacity-achieving covariance matrices, be obtained for the SIMO case and for the case of isotropic
i.e., codewords. In both cases, CSIT is not beneficial and the outage
e no no
capacity can be characterized in closed form.
Q , {Q ∈ Ut : FQ (C ) = Fno (C )}. (88) For the SIMO case, the normal approximation follows directly

By Proposition 5, the set Q is non-empty for the considered from (59)–(61) by setting m = 1, v1 = ρ and noting that λ1 =
2
fading distributions. It follows from algebraic manipulations that khk .
N
For (n, M, )iso codes, the normal approximation Rrx,iso (n, )
Fno (Cno + δ) − Fno (Cno ) 0 no ∗
lim inf = inf FQ (C ). (89) to the maximal achievable rate Rrx,iso (n, ) is obtained as the
δ→0 δ Q∈Q
solution of
To show that the RHS of (89) is positive, one needs to perform "
N
!#
two steps. First, one shows that the set Q is compact with Ciso (H) − Rrx,iso (n, )
=E Q p . (95)
respect to the metric d(A, B) = kA − BkF and that under Viso (H)/n
10

Converse with CSIRT (76) Converse with CSIRT (78)


Outage capacity Cǫ Cǫiso
1
1

0.8
Rate, bit/(ch. use)

Rate, bit/(ch. use)


0.8
Normal approximation (59)
Normal approximation (95)
0.6 Achievability with CSIR
0.6
Achievability with no CSI (67)
R∗ (n, ǫ) over AWGN Achievability with no CSI (65)
0.4 0.4

0.2 0.2

0 00 100 200 300 400 500 600 700 800 900 1000
0 100 200 300 400 500 600 700 800 900 1000
Blocklength, n Blocklength, n

Fig. 2. Achievability and converse bounds for a quasi-static SIMO Rician-fading Fig. 3. Achievability and converse bounds for (n, M, )iso codes over a
channel with K-factor equal to 20 dB, two receive antennas, SNR = −1.55 quasi-static MIMO Rayleigh-fading channel with two transmit and three receive
dB, and  = 10−3 . Note that in the SIMO case Ctx = Cno = C . antennas, SNR = 2.12 dB, and  = 10−3 .

Here, CSIRT case and in the range [120, 480] for the no-CSI case. For
m
X the AWGN channel, this number is approximately 1420. Hence,
Ciso (H) = log(1 + ρλj /t) (96) for the parameters chosen in Fig. 2, the prediction (based on
j=1 zero dispersion) of fast convergence to capacity is validated.
N
The gap between the normal approximation Rrt (n, ) defined
and
implicitly in (59) and both the achievability (CSIR) and the
m
X 1 converse bounds is less than 0.02 bit/(ch. use) for blocklengths
Viso (H) = m − 2
(97)
(1 + ρλ j /t) larger than 400.
j=1
Note that although the AWGN curve in Fig. 2 lies below the
h 28, 2014
where {λj } are the eigenvalues of HH H. A comparison between
March 31, 2014
achievability
DRAFT
bound for the quasi-static fading channel, this does D
N
Rrx,iso (n, ) and the bounds (65) and (78) is provided in the next not mean that “fading helps”. In Fig. 2, we chose the SNRs so
section. that both channels have the same -capacity. This results in the
received power for the quasi-static case being 1.45 dB larger
VI. N UMERICAL R ESULTS than that for the AWGN case.
N
In Fig. 3, we compare the normal approximation Rrx,iso (n, )
A. Numerical Results
defined (implicitly) in (95) with the achievability bound (65) and
In this section, we compute the bounds reported in Sec- the converse bound (78) on the maximal achievable rate with
N
tions IV and V. Fig. 2 compares Rrt (n, ) with the achievability (n, M, )iso codes over a quasi-static MIMO fading channel with
bound (67) and the converse bound (76) for a quasi-static SIMO t = 2 transmit and r = 3 receive antennas. The channel between
fading channel with two receive antennas. The channels between each transmit-receive antenna pair is Rayleigh-distributed, and
the transmit antenna and each of the two receive antennas the channels between different transmit-receive antenna pairs
are Rician-distributed with K-factor equal to 20 dB. The two are assumed to be independent. We set  = 10−3 and choose
channels are assumed to be independent. We set  = 10−3 ρ = 2.12 dB so that C iso = 1 bit/(ch. use). For this scenario,

and choose ρ = −1.55 dB so that C = 1 bit/(ch. use). We the blocklength required to achieve 90% of C iso is less than 500,
∗ 
also plot a lower bound on Rrt (n, ) obtained by using the κβ which again demonstrates fast convergence to C iso .

bound [9, Th. 25] and assuming CSIR.7 For reference, Fig. 2
shows also the approximation (2) for R∗ (n, ) corresponding
B. Comparison with coding schemes in LTE-Advanced
to an AWGN channel with C = 1 bit/(ch. use), replacing the
term O(log(n)/n) in (2) with log(n)/(2n) [9, Eq. (296)] [30].8 The bounds reported in Sections IV and V can be used to
The blocklength required to achieve 90% of the -capacity of benchmark the coding schemes adopted in current standards. In
the quasi-static fading channel is in the range [120, 320] for the Fig. 4, we compare the performance of the coding schemes used
in LTE-Advanced [31, Sec. 5.1.3.2] against the achievability and
7 Specifically, we took F = {x ∈ Cn : kxk2 = nρ}, and Q converse bounds for the same scenario as in Fig. 2. Specifically,
YH =
PH n H
Q
j=1 QYj | H where QYj | H=h = CN (0, Ir + ρhh ). Fig. 4 illustrates the performance of the family of turbo codes
8 The approximation reported in [9, Eq. (296)], [30] holds for a real AWGN
chosen in LTE-Advanced for the case of QPSK modulation. The
channel. Since a complex AWGN channel with blocklength n can be identified
as a real AWGN channel with the same SNR and blocklength 2n, the approxi- decoder employs a max-log-MAP decoding algorithm [32] with
ρ2 +2ρ 10 iterations. We further assume that the decoder has perfect CSI.
mation [9, Eq. (296)], [30] with C = log(1 + ρ) and V = (1+ρ) 2 is accurate
for the complex case. For the AWGN case, it was observed in [9, Fig. 12] that about
11

Outage capacity Cǫ LTE-Advanced uses hybrid automatic repeat request (HARQ)


1 to compensate for packets loss due to outage events. When
HARQ is used, the block error rate that maximizes the average
0.8 throughput is about 10−1 [33, p. 218]. The performance of LTE-
Rate, bit/(ch. use)

Converse (76) Advanced codes for  = 10−1 is analyzed in Fig. 5. We set


Normal approximation (59) ρ = 2.74 dB1and consider Rayleigh fading (the other parameters
0.6 LTE-Advanced codes
are as in Fig. 4). Again, we observe that there is a constant gap
Achievability with CSIR between the rate achieved by LTE-Advanced turbo codes and
0.4 N
Rrt (n, ).

0.2 VII. C ONCLUSION


In this paper, we established achievability and converse
0 bounds on the maximal achievable rate R∗ (n, ) for a given
0 100 200 300 400 500 600 700 800 900 1000
Blocklength, n
blocklength n and error probability  over quasi-static MIMO
fading channels. We proved that (under some mild conditions
Fig. 4. Comparison between achievability and converse bounds and the rate on the fading distribution) the channel dispersion is zero for all
achievable with the coding schemes in LTE-Advanced. We consider a quasi-static four cases of CSI availability. The bounds are easy to evaluate
SIMO Rician-fading channel with K-factor equal to 20 dB, two receive antennas, when CSIT is available, when the number of transmit antennas
SNR = −1.55 dB,  = 10−3 , and CSIR. The star-shaped markers indicate the
rates achievable by the turbo codes in LTE-Advanced (QPSK modulation and is one, or when the code has isotropic codewords. In all these
10 iterations of a max-log-MAP decoder [32]). cases the outage-capacity-achieving distribution is known.
The numerical results reported in Section VI-A demonstrate
Converse (76)
that, in some scenarios, zero dispersion implies fast conver-
Outage capacity Cǫ gence to C as the blocklength increases. This suggests that
1
the outage capacity is a valid performance metric for communi-
cation systems with stringent latency constraints operating over
0.8 quasi-static fading channels. We developed an easy-to-evaluate
Rate, bit/(ch. use)

Normal approximation (59)


Achievability with CSIR approximation of R∗ (n, ) and demonstrated its accuracy by
0.6 LTE-Advanced codes comparison to our achievability and converse bounds. Finally,
we used our bounds to benchmark the performance of the coding
ch 28, 2014 0.4 schemesDRAFT
adopted in the LTE-Advanced standard. Specifically,
we showed that for a blocklength between 500 and 1000 LTE-
0.2
Advanced codes achieve about 85% of the maximal coding rate.

A PPENDIX I
0 AUXILIARY L EMMAS C ONCERNING THE P RODUCT OF
0 100 200 300 400 500 600 700 800 900 1000
Blocklength, n S INES OF P RINCIPAL A NGLES
In this appendix, we state two properties of the product of
Fig. 5. Comparison between achievability and converse bounds and rate
achievable with the coding schemes in LTE-Advanced. We consider a quasi-
principal sines defined in (40), which will be used in the proof
static SIMO Rayleigh-fading channel with two receive antennas, SNR = 2.74 of Theorem 3 and of Proposition 23. The first property, which is
dB,  = 0.1, and CSIR. The star-shaped markers indicate the rates achievable referred to in [34] as “equalized Hadamard inequality”, is stated
by the turbo codes in LTE-Advanced (QPSK modulation and 10 iterations of a
max-log-MAP decoder [32]).
in Lemma 12 below.
Lemma 12: Let A = [A1 , A2 ] ∈ Cn×(a1 +a2 ) , where A1 ∈
n×a1
C and A2 ∈ Cn×a2 . If rank(A1 ) = a1 and rank(A2 ) = a2 ,
then
2
det(AH A) = det(AH H
1 A1 ) det(A2 A2 ) sin {A1 , A2 }. (98)
half of the gap between the rate achieved by√ the best available
channel codes9 and capacity is due to the 1/ n penalty in (2); Proof: The proof follows by extending [35, Th. 3.3] to the
the other half is due to the suboptimality of the codes. From complex case.
Fig. 4, we conclude that for quasi-static fading channels the The second property provides an upper bound on sin{A, B}
finite-blocklength penalty is significantly reduced because of the that depends on the angles between the basis vectors of the two
zero-dispersion effect. However, the penalty due to the code subspaces.
rch 28, 2014 DRAFT
suboptimality remains. In fact, we see that the gap between Lemma 13: Let A and B be subspaces of Cn with dim(A) =
the rate achieved by the LTE-Advanced turbo codes and the a and dim(B) = b. Let {a1 , . . . , aa } be an orthonormal basis
normal approximation Rrt N
(n, ) is approximately constant up for A, and let {b1 , . . . , bb } be an arbitrary basis (not necessarily
to a blocklength of 1000. orthonormal) for B. Then
min{a,b}
Y
9 The
codes used in [9, Fig. 12] are a certain family of multiedge low-density sin{A, B} ≤ sin{aj , bj }. (99)
parity-check (LDPC) codes. j=1
12

Proof: To keep notation simple, we define the following {Φj } are unitary, the codewords satisfy the power constraint (12).
function, which maps a complex matrix X of arbitrary size to its Motivated by the geometric considerations reported in Sec-
volume: tion IV-A1, we consider for a given input X(H) = ΦP(H) a
q physically degraded version of the channel (8), whose output is
vol(X) , det(XH X). (100) given by
Let A = [a1 , . . . , aa ] ∈ Cn×a and B = [b1 , . . . , bb ] ∈ Cn×b . ΩY = span(ΦP(H)H + W). (110)
If the vectors a1 , . . . , aa , b1 , . . . , bb are linearly dependent,
then the LHS of (99) vanishes, in which case (99) holds triv- Note that the subspace ΩY belongs with probability one to the
ially. In the following, we therefore assume that the vectors Grassmannian manifold Gn,r , i.e., the set of all r dimensional
a1 , . . . , aa , b1 , . . . , bb form a linearly independent set. Below, subspaces in Cn . Because (110) is a physically degraded version
we prove Lemma 13 for the case a ≤ b. The proof for the case of (8), the rate achievable on (110) is a lower bound on the rate
a > b follows from similar steps. achievable on (8).
Using Lemma 12, we get the following chain of (in)equalities: To prove the theorem, we apply the κβ bound [9, Th. 25]
to the channel (110). Following [9, Eq. (107)], we define the
sin{A, B}
following measure of performance for the composite hypothesis
vol([A, B])
= (101) test between an auxiliary output distribution QΩY defined on the
vol(A)vol(B) subspace ΩY and the collection of channel-output distributions
vol([A, B]) {PΩY | Φ=Φ }Φ∈Sn,t :
= (102)
vol(B) Z
1  κτ (Sn,t , QΩY ) , inf PZ | ΩY (1 | ΩY )QΩY (dΩY ) (111)
= ka1 k vol [a2 , . . . , aa , B]
vol(B) | {z }
 =1 where the infimum is over all probability distributions PZ | ΩY :
· sin a1 , [a2 , . . . , aa , B] (103) Gn,t 7→ {0, 1} satisfying
.. Z
. ! PZ | ΩY (1 | ΩY )PΩY | Φ=Φ (dΩY ) ≥ τ, ∀Φ ∈ Sn,t . (112)
a
1 Y 
= sin ai , [ai+1 , . . . , aa , B] vol(B) (104) By [9, Th. 25], we have that for every auxiliary distribution Q
vol(B) i=1
ΩY
a κτ (Sn,t , QΩY )
M≥ (113)
Y
≤ sin{ai , bi }. (105) supΦ∈Sn,t β1−+τ (PΩY | Φ=Φ , QΩY )
i=1

Here, (102) holds because the columns of A are orthonormal and, where β(·) (·, ·) is defined in (6). We next lower-bound the RHS
hence, det(AH A) = 1; (103) and (104) follow from Lemma 12; of (113) to obtain an expression that can be evaluated numerically.
(105) follows because Fix a Φ ∈ Sn,t and let
ZΦ (ΩY ) = 1{sin2 {span(Φ), ΩY } ≤ γn } (114)

sin ai , [ai+1 , . . . , aa , B] ≤ sin{ai , bi }. (106)
where γn ∈ [0, 1] is chosen so that
PΩY |Φ=Φ [ZΦ (ΩY ) = 1] ≥ 1 −  + τ. (115)
A PPENDIX II
P ROOF OF T HEOREM 1 (CSIT ACHIEVABILITY B OUND ) Since the noise matrix W is isotropically distributed, the proba-
Given H = H, we perform a singular value decomposition bility distribution of the random variable sin2 {span(Φ), ΩY }
(SVD) of H to obtain (where ΩY ∼ PΩY |Φ=Φ ) does not depend on Φ. Hence, the
chosen γn satisfies (115) for all Φ ∈ Sn,t . Furthermore, ZΦ (ΩY )
H = LΣVH (107) can be viewed as a hypothesis test between PΩY | Φ=Φ and QΩY .
where L ∈ Ct×t and V ∈ Cr×r are unitary matrices, and Hence, by definition
Σ ∈ Ct×r is a (truncated) √ diagonal matrix
√ of dimension t × r,
β1−+τ (PΩY | Φ=Φ , QΩY ) ≤ QΩY [ZΦ (ΩY ) = 1] (116)
whose diagonal elements λ1 , . . . , λm , are the ordered sin-
gular values of H. It will be convenient to define the following for every Φ ∈ Sn,t .
t × t precoding matrix for each H: We next evaluate the RHS of (116), taking as the auxiliary
p p
P(H) , diag{ nv1∗ , . . . , nvm ∗ , 0, . . . , 0}LH . (108) output distribution the uniform distribution on Gn,r , which we de-
| {z } note by QuΩY . With this choice, QuΩY [sin2 {span(Φ), ΩY } ≤ γn ]
t−m
does not depend on Φ ∈ Sn,t . To simplify calculations, we can
We consider a code whose codewords Xj (H), j = 1, . . . , M , therefore set Φ = In,t . Observe that under QuΩY , the squares of
have the following structure the sines of the principle angles between span(In,t ) and ΩY have
the same distribution as the eigenvalues of a complex multivariate
Xj (H) = Φj P(H), Φj ∈ Sn,t (109)
Beta-distributed matrix B ∼ Betar (n−t, t) [36, Sec. 2]. By [37,
n×t H
where Sn,t , {A ∈ C : A A = It } denotes the set of all Cor.Q1], the distribution of det B coincides with the distribution
r
n × t unitary matrices, (i.e., the complex Stiefel manifold). As of i=1 Bi , where {Bi }, i = 1, . . . , r, are independent with
13

Bi ∼ Beta(n − t − i + 1, t). Using this result to compute the for all distributions PZ | ΩY that satisfy (123). This proves (122),
RHS of (116) we obtain since
κτ (Sn,t , QuΩY ) ≥ κuτ (Sn,t , QuΩY ) ≥ τ.
 
Yr (126)
sup β1−+τ (PΩY | Φ=Φ , QΩY ) ≤ P Bj ≤ γn  (117)
Φ∈Sn,t j=1
A PPENDIX III
P ROOF OF T HEOREM 2 (CSIRT C ONVERSE B OUND )
where γn satisfies When CSI is available at both the transmitter and the receiver,
h i the MIMO channel (8) can be transformed into the following set
P sin2 In,t , In,t P(H)H + W ≤ γn ≥ 1 −  + τ.

(118)
of m parallel quasi-static channels
p
Note that (118) is equivalent to (44). Indeed Yi = xi Λi + Wi , i = 1, . . . , m (127)
h n √ o i
by performing a singular value decomposition [17, Sec. 3.1].
P sin2 In,t , nIn,t P(H)H + W ≤ γn
Here, Λ1 ≥ · · · ≥ Λm denote the m largest eigenvalues of HHH ,
h n √ p
= P sin2 In,t , nIn,t diag
p ∗
v1 Λ1 , . . . , vm∗ Λ ,
m and Wi ∼ CN (0, In ), i = 1, . . . , m, are independent noise
H o i vectors.
0, . . . , 0 V + W ≤ γn (119) Next, we establish a converse bound for the channel (127).
| {z }
t−m Let X = [x1 · · · xm ] and fix an (n, M, )rt code. Note that (12)
h
2
n √ p ∗ p
∗ Λ , implies
= P sin In,t , nIn,t diag v1 Λ1 , . . . , vm m
m
X
kxi k2 ≤ nρ.
o i
(128)

0, . . . , 0 + WV ≤ γn (120)
| {z } i=1
t−m
h
2
n √ p ∗ p To simplify the presentation, we assume that the encoder ftx is
= P sin In,t , nIn,t diag v1 Λ1 , . . . , vm∗ Λ ,
m deterministic. Nevertheless, the theorem holds also if we allow
for randomized encoders. We further assume that the encoder ftx
o i
0, . . . , 0 + W ≤ γn (121)
| {z } acts on the pairs (j, λ) instead of (j, H) (cf., Definition 3). The
t−m channel (127) and the encoder ftx define a random transfor-
where V contains the right singular vectors of H (see (107)). mation PY,Λ | J from the message set {1, . . . , M } to the space
n×m
Here, (119) follows from (108); (120) follows because right- C × Rm +:
multiplying a matrix A by a unitary matrix does not change PY,Λ | J = PΛ PY | Λ,J (129)
the subspace spanned by the columns of A and hence, it does
not change sin{·, ·}; (121) follows because W is isotropically where Y = [Y1 , . . . , Ym ] and
distributed and hence WV has the same distribution as W. PY | Λ=λ,J=j , PY | Λ=λ,X=ftx (j,λ) . (130)
To conclude the proof, it remains to show that
We can think of PY,Λ | J as the channel law associated with
κτ (Sn,t , QuΩY ) ≥ τ. (122) J −→ Y, Λ. (131)
Once this is done, the desired lower bound (43) follows by using ∗
To upper-bound Rrt (n, ),
we use the meta-converse theorem [9,
the inequality (117) and (122) in (113), by taking the logarithm Th. 30] on the channel (131). We start by associating to each
of both sides of (113), and by dividing by the blocklength n. codeword X a power-allocation vector ṽ(X) whose entries ṽi (X)
To prove (122), we replace (112) with the less stringent are
constraint that 1
ṽi (X) , kxi k2 , i = 1, . . . , m. (132)
Z  n
EPΦu PZ | ΩY (1 | ΩY )PΩY | Φ (dΩY ) ≥ τ (123) We take as auxiliary channel QY,Λ | J = PΛ QY | Λ,J , where
m
where PΦu is the uniform input distribution on Sn,t . Since
Y
QY | Λ=λ,J=j = QYi | Λ=λ,J=j (133)
replacing (112) by (123) enlarges the feasible region of the i=1
minimization problem (111), we obtain an infimum in (111) and
(denoted by κuτ (Sn,t , QuΩY )) that is no larger than κτ (Sn,t , QuΩY ).    
The key observation is that the uniform distribution PΦu induces QYi | Λ=λ,J=j = CN 0, 1 + (ṽi ◦ ftx (j, λ))λi In . (134)
an isotropic distribution on Y. This implies that the induced By [9, Th. 30], we obtain
distribution on ΩY is the uniform distribution on Gn,r , i.e., QuΩY .
Therefore, it follows that min β1− (PYΛ | J=j , QYΛ | J=j ) ≤ 1 − 0 (135)
j∈{1,...,M }
Z
PZ | ΩY (1 | ΩY )QuΩY (dΩY ) where 0 is the maximal probability of error over QY,Λ | J .
Z  We shall prove Theorem 2 in the following two steps: in Ap-
= EPΦu PZ | ΩY (1 | ΩY )PΩY | Φ (dΩY ) (124) pendix III-1, we evaluate β1− (PYΛ | J=j , QYΛ | J=j ); in Ap-
pendix III-2, we relate 0 to Rrt

(n, ) by establishing a converse
≥τ (125) bound on the auxiliary channel QY,Λ | J .
14

1) Evaluation of β1− : Let j ∗ be the message that achieves the maximal error probability over the channel QU ,Λ | S defined

the minimum in (135), let ftx (λ) , ftx (j ∗ , λ), and let by
∗ n
β1− (ftx ) , β1− (PY,Λ | J=j ∗ , QY,Λ | J=j ∗ ). (136) 1 + Si Λi X
Ui = |Wi,l |2 , i = 1, . . . , m. (144)
n
Using (136), we can rewrite (135) as l=1

∗ Here, Ui denotes the ith entry of U , the random variables {Wi,l }


β1− (ftx ) ≤ 1 − 0 . (137)
are i.i.d. CN (0, 1)-distributed, and the input S = [S1 . . . Sm ]
Let now has nonnegative entries whose sum does not exceed ρ, i.e.,
dPY,Λ | J=j ∗ S ∈ Vm . Note that, given Si and Λi , the random variable Ui

r(ftx ; Y, Λ) , log . (138) in (144) is Gamma-distributed, i.e., its pdf qUi | Si ,Λi is given by
dQY,Λ | J=j ∗
qUi | Si ,Λi (ui | si , λi )
Note that, under both PY,Λ | J=j ∗ and QY,Λ | J=j ∗ , the random
nn
 

variable r(ftx ; Y, Λ) has absolutely continuous cumulative dis- n−1 nui
= u exp − . (145)
tribution function (cdf) with respect to the Lebesgue measure. (1 + si λi )n Γ(n) i 1 + si λi
By the Neyman-Pearson lemma [38, p. 300] Furthermore, the random variables U1 , . . . , Um are conditionally
∗ ∗ ∗ independent given S and Λ.
β1− (ftx ) = QY,Λ | J=j ∗ [r(ftx ; Y, Λ) ≥ nγn (ftx )] (139)
We shall use that qUi | Si ,Λi can be upper-bounded as

where γn (ftx ) is the solution of qUi | Si ,Λi (ui | si , λi )
∗ ∗
PY,Λ | J=j ∗ [r(ftx ; Y, Λ) ≤ nγn (ftx )] = . (140) ≤ gi (ui , λi ) (146)
n(n − 1)n−1 −(n−1)

∗ n−1
Let now v , ṽ◦ftx . Because of the power constraint (128), v is a 
 e , if ui ≤ n (1 + ρλi )
mapping from {1, . . . , M } to the set Vm defined in (46). Further-
 Γ(n)
, n n−1 −nui /(1+ρλi )
(147)
n ui e

more, under QY,Λ | J=j ∗ , the random variable r(ftx ; Y, Λ) has 
 , if ui > n−1
(1 + ρλ )
n i
the same distribution as Lrt n (v, Λ) in (47), and under PY,Λ | J=j ∗ , Γ(n)(1 + ρλi )n−1
rt
it has the same distribution as Sn (v, Λ) in (48). Thus, (137) is which follows because 1+si λi ≤ 1+ρλi , and because qU | S ,Λ
i i i
equivalent to is a unimodal function with maximum at
P[Lrt 0 n−1
n (v, Λ) ≥ nγn (v)] ≤ 1 −  (141) ui = (1 + si λi ). (148)
n
where γn (v) is the solution of (51). Note that this upper bound The bound in (147) is useful because it is integrable and does
depends on the chosen code only through the induced power not depend on the input s .
i
allocation function v. To conclude, we take the infimum of the Consider now an arbitrary code {c1 (Λ), . . . , cM (Λ)} ⊂ Vm
LHS of (141) over all power allocation functions v to obtain a for the channel Q
U ,Λ | S . Let Dj (Λ), j = 1, . . . , M , be the
bound that holds for all (n, M, )rt codes. (disjoint) decoding sets corresponding to the M codewords
2) Converse on the auxiliary channel: We next relate 0 to {cj (Λ)}. Let 0 be the average probability of error over the
∗ avg
Rrt (n, ). The following lemma, whose proof can be found at channel Q
U ,Λ | S . We have
the end of this appendix, serves this purpose.
0 0
Lemma 14: For every code with M codewords and block- 1 −  ≤ 1 − avg (149)
length n, the maximum probability of error 0 over the channel
 
M Z
1 X
QY,Λ | J satisfies = EΛ  qU | S,Λ (u | cj (Λ), Λ)du (150)
M j=1 D j (Λ)
crt (n)
1 − 0 ≤ (142)
 ! 
M Z m
M 1 X Y
≤ EΛ  gi (ui , Λi ) du (151)
where crt (n) is given in (50). M j=1 Dj (Λ) i=1
Using Lemma 14, we obtain "Z m
! #
1 Y
crt (n) = EΛ gi (ui , Λi ) du (152)
inf P[Lrtn (v, Λ) ≥ nγn (v)] ≤ . (143) M Rm
v(·) M "m Z + i=1
#
1 Y +∞
The desired lower bound (49) follows by taking the logarithm = EΛ gi (ui , Λi )dui (153)
on both sides of (143) and dividing by n. M i=1 0
Proof of Lemma 14: By (133), given Λ = λ, the output where (151) follows from (147), and where (152) follows be-
of the channel QY,Λ | J depends on the input J only through cause g (u , Λ ) is independent of the message j and because
i i i
S , ṽ ◦ ftx (J, λ), i.e., through the norm of each column of SM D (Λ) = Rm . After algebraic manipulations, we obtain
j=1 j +
the codeword matrix ftx (J, λ). Let U , ṽ(Y). In words, the Z ∞
entries of U are the square of the norm of the columns of Y gi (ui , λi )dui
normalized by the blocklength n. It follows that (U , Λ) is a 0
sufficient statistic for the detection of J from (Y, Λ). Hence, to (1 + ρλi ) h i
0 = (n − 1)n e−(n−1) + Γ(n, n − 1) . (154)
lower-bound  and establish (142), it suffices to lower-bound Γ(n)
15

Here, Γ(·, ·) denotes the (upper) incomplete Gamma function [39, and by noting that this γ satisfies
Sec. 6.5]. Substituting (154) into (153), we finally obtain that
for every code {c1 (Λ), . . . , cM (Λ)} ⊂ Vm , γ = Ctx + O(1/n) (165)

1 (n − 1)n e−(n−1)

Γ(n, n − 1)
m i.e., it belongs to the desired neighborhood of Ctx for sufficiently
1 − 0 ≤ + large n. Here, (165) follows by a Taylor series expansion [40,
M Γ(n) Γ(n)
"m # Th. 5.15] of Ftx (γ) around Ctx , and because Ftx (Ctx ) =  and
0
×E
Y
(1 + ρΛi ) (155) Ftx (Ctx ) > 0 by assumption.
i=1
In the reminder of this appendix, we will prove (163). Our
crt (n) proof consists of the three steps sketched below.
= . (156) Step 1: Given v and Λ, the random variable Snrt (v, Λ)
M
(see (48) for its definition) is the sum of n i.i.d. random variables
This proves Lemma 14. with mean
m
A PPENDIX IV
X 
µ(v, Λ) , log 1 + Λj vj (Λ) (166)
P ROOF OF THE C ONVERSE PART OF T HEOREM 3 j=1
As a first step towards establishing (56), we relax the upper and variance
bound (49) by lower-bounding its denominator. Recall that by m
definition (see Appendix III-1)
X 1
σ 2 (v, Λ) , m − 2 . (167)
1 + Λj vj (Λ)
P[Lrt
n (v, Λ) ≥ nγn (v)] = β1− (PY,Λ | J=j ∗ , QY,Λ | J=j ∗ ). j=1

Fix an arbitrary power allocation function v(·), and assume that


(157)
Λ = λ. Let
We shall use the following inequality: for every η > 0 [9, γ − µ(v, λ)
Eq. (102)] u(v, λ) , . (168)
σ(v, λ)
   
1 dP
β1− (P, Q) ≥ 1−P ≥η − . (158) Using the Cramer-Esseen theorem (see Theorem 15 below), we
η dQ show in Appendix IV-A that
Using (158) with P = PY,Λ | J=j ∗ , Q = QY,Λ | J=j ∗ , η = enγ , k3
and recalling that (see Appendix III-1) P[Snrt (v, Λ) ≤ nγ | Λ = λ] ≥ qn (u(v, λ)) + (169)
n
 
dPY,Λ | J=j ∗ nγ where
1 − PY,Λ | J=j ∗ ≥e = P[Snrt (v, Λ) ≤ nγ]
dQY,Λ | J=j ∗ √ [1 − nx2 ]+ e−nx
2
/2
(159) qn (x) , Q(− nx) − √ (170)
6 n
we obtain that for every γ > 0 and k3 is a finite constant independent of λ, v and γ.
 Step 2: We make the RHS of (169) independent of v by
β1− PY,Λ | J=j ∗ , QY,Λ | J=j ∗
minimizing qn (u(v, λ)) over v. Specifically, we establish in
≥ e−nγ P[Snrt (v, Λ) ≤ nγ] −  .

(160) Appendix IV-B the following result: for every γ in a certain
Using (160) and the estimate neighborhood of Ctx , we have that
m k3
log crt (n) = log n + O(1) (161) P[Snrt (v, Λ) ≤ nγ | Λ = λ] ≥ qn (û(λ)) + (171)
2 n
(which follows from (50), Assumption 1 in Theorem 3, and from where û(λ) is defined in (187).
algebraic manipulations), we upper-bound the RHS of (49) as Step 3: We average (171) over Λ and establish in Ap-
∗ 1   pendix IV-C the bound (163). This concludes the proof.
Rrt (n, ) ≤ γ − log inf P[Snrt (v, Λ) ≤ nγ] − 
n v(·)
 
m log n 1 A. Proof of (169)
+ +O . (162)
2 n n We need the following version of the Cramer-Esseen Theo-
To conclude the proof we show that for every γ in a certain rem.10
neighborhood of Ctx (recall that γ is a free optimization param- Theorem 15: Let X1 , . . . , Xn be a sequence of i.i.d. real ran-
eter), dom variables having zero mean and unit variance. Furthermore,
  let
rt 1
inf P[Sn (v, Λ) ≤ nγ] ≥ Ftx (γ) + O (163)
 
n
v(·) n  itX1  1 X
ϕ(t) , E e and Fn (ξ) , P √ Xj ≤ ξ  . (172)
where Ftx (·) is the outage probability defined in (20) and the n j=1
O(1/n) term is uniform in γ. The desired result (56) follows
then by substituting (163) into (162), setting γ as the solution of 10 The Berry-Esseen Theorem used in [9] to prove (2) yields an asymptotic

expansion in (163) up to a O(1/ n) term. This is not sufficient here, since we
Ftx (γ) −  + O(1/n) = 1/n (164) need an expansion up to a O(1/n) term (see (163)).
16

 
If E |X1 |4 < ∞ and if sup|t|≥ζ |ϕ(t)| ≤ k0 for some k0 < 1, where ζ0 does not depend on λ and v. Through algebraic
 
where ζ , 1/(12E |X1 |3 ), then for every ξ and n manipulations, we can further show that the RHS of (181) is
upper-bounded by
Fn (ξ) − Q(−ξ) − k1 (1 − ξ 2 )e−ξ2 /2 √1

n
1
sup ϕTl (t) ≤ p , k0 < 1. (182)

1 + ζ02 /m
(  n ) |t|>ζ0

E |X1 |4

6 1
≤ k2 + n k0 + . (173)
n 2n The inequalities (178) and (182) imply that the conditions in
  √ Theorem 15 are met. Hence, we conclude that, by Theorem 15,
Here, k1 , E X13 /(6 2π), and k2 is a positive constant for every n, λ, and v(·),
independent of {Xi } and ξ. " #
n
Proof: The inequality (173) is a consequence of the tighter 1 X √ √ 
P √ Tl ≤ nu(v, λ) Λ = λ − Q − nu(v, λ)

inequality reported in [16, Th. VI.1]. n
Let l=1 
E Tl3 | Λ = λ 2 9k2
1
m Λj vj (Λ)Zl,j − 1 2
p ! ≥ √ √ (1 − nu(v, λ)2 )e−nu(v,λ) /2 −
6 2π n n
X
Tl (v, Λ) , 1− (174)
σ(v, Λ) j=1 1 + Λj vj (Λ) 
1
n
− k2 n6 k0 + (183)
2n
where Zl,j , l = 1, . . . , n and j = 1, . . . , m, are i.i.d. CN (0, 1)
distributed. The random variables T1 , . . . , Tn have zero mean where u(v, λ) was defined in (168). The inequality (169) follows
and unit variance, and are conditionally i.i.d. given Λ. Further- then by noting that
more, by construction h
3
i √
" n
# 0 ≥ E T l = λ ≥ − 2π (184)

Λ
1 X
P Snrt (v, Λ) ≤ nγ = P √
 
Tl (v, Λ) ≤ nu(v, Λ)
n and that
l=1
(175)   n 
6 1
sup n k2 n k0 + < ∞. (185)
where u(v, Λ) was defined in (168). We next show that the n≥1 2n
conditions under which Theorem 15 holds are satisfied by the
random variables {Tl }.
We start by noting that if λj vj (λ), j = 1, . . . , m, are identi- B. Proof of (171)
cally zero, then Snrt (v, Λ) = 0, so (169) holds trivially. Hence, For every fixed λ, we minimize qn (u(v, λ)) on the RHS
we will focus on the case where {λj vj (λ)} are not all identically of (169) over all power allocation functions v(·). With a slight
zero. Let abuse of notation, we use v ∈ Vm (where Vm was defined in (46))
 itT  to denote the vector v(λ) whenever no ambiguity arises. Since
ϕTl (t) , E e l Λ = λ (176) the function q (x) in (170) is monotonically increasing in x, the
n

and vector v ∈ Vm that minimizes qn (u(v, λ)) is the solution of


1
ζ,  . (177) min u(v, λ). (186)
12E |Tl |3 Λ = λ v∈Vm

We next show that there exists a k0 < 1 such that The minimization in (186) is difficult to solve since u(v, λ) is
sup|t|>ζ |ϕTl (t)| ≤ k0 for every λ ∈ Rm + and every func- neither convex nor concave in v. For our purposes, it suffices
tion v(·). We start by evaluating ζ. For every λ ∈ Rm + and to obtain a lower bound on (186), which is given in Lemma 16
every v(·) such that λj vj (λ), 1 ≤ j ≤ m, are not identically below. Together with (187) and the monotonicity of qn (·), this
zero, it can be shown through algebraic manipulations that then yields (171).
Lemma 16: Let v ∗ , µ(v, λ), σ(v, λ), and u(v, λ) be as
E |Tl |4 Λ = λ ≤ 9.
 
(178)
in (45), (166), (167), and (168), respectively. Moreover, let
By Lyapunov’s inequality [16, p. 18], this implies that µ∗ (λ) , µ(v ∗ , λ) and σ ∗ (λ) , σ(v ∗ , λ). Then, there exist δ >
0, δ̃ > 0 and k < ∞ such that for every γ ∈ (Ctx − δ̃, Ctx + δ̃)
    3/4
E |Tl |3 Λ = λ ≤ E |Tl |4 Λ = λ ≤ 93/4 .

(179)
min u(v, λ)
v∈Vm
Hence,  √
 δ/ m, if µ∗ (λ) ≤ γ − δ


1 1  γ − µ (λ)
ζ=  ≥ , ζ0 . (180) ≥ û(λ) , ∗ (λ) + k(γ − µ∗ (λ))
, if |γ − µ∗ (λ)| < δ
12E |Tl |3 Λ = λ 12 × 93/4  σ
if µ∗ (λ) ≥ γ + δ.

−∞,

By (180), we have
(187)

sup ϕTl (t) ≤ sup ϕTl (t) (181)
|t|>ζ |t|>ζ0 Proof: See Appendix IV-D.
17

C. Proof of (163) Assume without loss of generality that n ≥ m/δ 2 (recall that
We shall need the following √
lemma, which concerns the speed we are interested in the asymptotic regime n → ∞). In this case,
of convergence of P[B > A/ n] to P[B > 0] as n → ∞ for the second term on the RHS of (195) is zero. Hence,
two independent random variables A and B.
Lemma 17: Let A be a real random variable with zero mean E[qn (û(Λ))1{Λ ∈
/ Kδ }]
√ δ
 
and unit variance. Let B be a real random variable independent = Q − n√ P µ∗ (Λ) ≤ γ − δ
 
(196)
of A with continuously differentiable pdf fB . Then m
2
≥ P µ (Λ) ≤ γ − δ − e−nδ /(2m) .
   ∗ 
(197)
P B ≥ √A − P[B ≥ 0] ≤ 1 2 + k1 + k1
 
n δ2 (188)
n δ 2 2
Here, (197) follows because Q(−t) ≥ 1 − e−t /2 for all t ≥ 0
where
and because P[µ∗ (Λ) ≤ γ − δ] ≤ 1.
sup max |fB (t)|, |fB0 (t)|

k1 , (189)
t∈(−δ,δ) We next lower-bound the second term on the RHS of (194).
and δ > 0 is chosen so that k1 is finite. If P[Λ ∈ Kδ ∩ Int(Wj )] = 0, we have
Proof: See Appendix IV-E.
E[qn (û(Λ))1{Λ ∈ Kδ ∩ Int(Wj )}] = 0 (198)
To establish (163), we lower-bound E[qn (û(Λ))] on the RHS
of (171) using Lemma 17. This entails technical difficulties since qn (·) is bounded. We thus assume in the following that
since the pdf of û(Λ) is not continuously differentiable due P[Λ ∈ Kδ ∩ Int(Wj )] > 0. Let Û denote the random variable
to the fact that the water-filling solution (45) may give rise û(Λ). To emphasize that Û depends on γ (see (187)), we write
to different numbers of active eigenmodes for different values Û (γ) in place of Û whenever necessary. Using this definition
of λ. To circumvent this problem, we partition Rm ≥ into m non- and (170), we obtain
intersecting subregions Wj , j = 1, . . . , m [15, Eq. (24)] h i
j E qn (Û )1{Λ ∈ Kδ ∩ Int(Wj )}
n 1 1X 1 ρ 1o
Wj , x ∈ Rm
≥ : > + ≥ ,  h
√ i
xj+1 j xl j xj = E Q(− nÛ ) | Λ ∈ Kδ ∩ Int(Wj )
l=1
j = 1, . . . , m − 1 (190) i
1 h + 2

− √ E 1 − nÛ 2 e−nÛ /2 Λ ∈ Kδ ∩ Int(Wj )

and 6 n
m  
n 1 X 1 ρ 1 o × P Λ ∈ Kδ ∩ Int(Wj ) . (199)
Wm , x ∈ Rm
≥ : + ≥ . (191)
m xl m xm
l=1
Observe that the transformation
In the interior of Wj , j = 1, . . . , m,
Smthe pdf of û(Λ) is
continuously differentiable. Note that j=1 Wj = Rm ≥ . For (λ1 , . . . , λj , γ) 7→ (û(λ), λ2 , . . . , λj , γ) (200)
every λ ∈ Wj , the water-filling solution gives exactly j active
eigenmodes, i.e., is one-to-one and twice continuously differentiable with nonsin-
gular Jacobian for λ ∈ Kδ ∩ Int(Wj ), i.e., it is a diffeomorphism
v1∗ (λ) ≥ · · · ≥ vj∗ (λ) > vj+1
∗ ∗
(λ) = · · · = vm (λ) = 0. (192) of class C 2 [29, p. 147]. Consequently, the conditional pdf
Let fÛ (γ) | Λ∈Kδ ∩Int(Wj ) (t) of Û (γ) given Λ ∈ Kδ ∩ Int(Wj ) as
well as its first derivative are jointly continuous functions of γ
n o

Kδ , λ ∈ Rm
≥ : |γ − µ (λ)| < δ . (193)
and t. Hence, they are bounded on bounded sets. It thus follows
Using (193) and the sets {Wj }, we express E[qn (û(Λ))] as that for every j ∈ {1, . . . , m}, every γ ∈ (Ctx − δ̃, Ctx + δ̃)
(where δ̃ is given by Lemma 16), and every δ̃1 > 0, there exists a
E[qn (û(Λ))]
k2 < ∞ such that the conditional pdf fÛ (γ) | Λ∈Kδ ∩Int(Wj ) and
= E[qn (û(Λ))1{Λ ∈ / Kδ }] its derivative satisfy
m
E[qn (û(Λ))1{Λ ∈ Kδ ∩ Int(Wj )}]
X
+ (194)

sup sup Û (γ) | Λ∈Kδ ∩Int(Wj ) (t) ≤ k2 (201)
f
j=1 t∈[−δ̃1 ,δ̃1 ] γ∈(Ctx −δ̃,Ctx +δ̃)
0
where Int(·) denotes the
Sminterior of a given set. To obtain (194), sup sup f
Û (γ) | Λ∈Kδ ∩Int(Wj )
(t) ≤ k2 . (202)
we used that Λ lies in j=1 Int(Wj ) almost surely, which holds t∈[−δ̃1 ,δ̃1 ] γ∈(Ctx −δ̃,Ctx +δ̃)
because the joint pdf of {Λj }m j=1 exists by assumption and We next apply Lemma 17 with A being a standard normal random
because the boundary of Wj has zero Lebesgue measure.
variable and B being the random variable Û conditioned on Λ ∈
We next lower-bound the two terms on the RHS of (194)
Kδ ∩ Int(Wj ). This implies that there exists a finite constant k3
separately. We first consider the first term. When µ∗ (λ) ≥ γ + δ,
∗ independent of γ and n such that the first term on the RHS
we have û(λ) = −∞ √and qn u1 (λ) = 0; when µ (λ) ≤ γ −δ, of (199) satisfies
we have û(λ) = δ/ m and
2
h √ i
√ δ [1 − nδ 2 /m]+ e−nδ /(2m)
  
 E Q − nÛ (γ) Λ ∈ Kδ ∩ Int(Wj )
qn û(λ) = Q − n √ − √ .
m 6 n  k3
≥ P µ∗ (Λ) ≤ γ | Λ ∈ Kδ ∩ Int(Wj ) + . (203)

(195) n
18

We next bound the second term on the RHS of (199) for n ≥ δ̃1−2 and that
as
σ ∗ (λ) − k|γ − µ∗ (λ)| > 0. (216)
1 h + 2
i
√ E 1 − nÛ 2 e−nÛ /2 Λ ∈ Kδ ∩ Int(Wj )

6 n The desired bound (213) follows then by lower-bounding
Z 1/√n σ(vmin , λ) in (214) by σ ∗ (λ) − k|γ − µ∗ (λ)| when µ∗ (λ) ≥ γ
k2 2
≤ √ (1 − nt2 )e−nt /2 dt (204) and by upper-bounding σ(vmin , λ) by σ ∗ (λ) + k|γ − µ∗ (λ)|
6 n −1/ n √
when µ∗ (λ) < γ.
k2 We first establish (215). By the mean value theorem, there
= √ (205)
3 en exist vj0 between vj∗ and vmin,j , j = 1, . . . , m, such that
where (204) follows from (201). Substituting (203) and (205) σ(vmin , λ) − σ ∗ (λ)

into (199) we obtain
m


h i X 2λ j ∗
E qn (Û )1{Λ ∈ Kδ ∩ Int(Wj )}

=
0 3
(vmin,j − vj ) (217)
j=1 (1 + λj vj )
 k4
≥ P µ∗ (Λ) ≤ γ, Λ ∈ Kδ ∩ Int(Wj ) +

(206) m
n
X 2λj vmin,j − vj∗

≤ 0 3
(218)
for some finite k4 independent of γ and n. Using (197), (198) j=1
(1 + λj vj )
and (206) in (194), and substituting (194) into (171), we con- m
X
vmin,j − vj∗

clude that ≤ 2λ1 (219)
  j=1
rt ∗ 1 √
P[Sn (v, Λ) ≤ nγ] ≥ P[µ (Λ) ≤ γ] + O (207) ≤ 2λ1 mkvmin − v ∗ k. (220)
n
 
1 Here, the last step for every a = [a1 , . . . , am ] ∈
= Ftx (γ) + O (208) Pmfollows because√
n Rm , we have j=1 |aj | ≤ mkak.
Next, we upper-bound λ1 and kvmin − v ∗ k separately. The
where the O(1/n) term is uniform in γ ∈ (Ctx − δ̃, Ctx + δ̃). variable λ can be bounded as follows. Because the water-filling
1
Here, the last step follows from (166) and (20). power levels {vl∗ } in (45) are nonincreasing, we have that
ρ
D. Proof of Lemma 16 ≤ v1∗ ≤ ρ. (221)
m
For an arbitrary λ ∈ Rm ≥ , the function µ(v, λ) in the numer- Choose δ1 > 0 and δ̃ > 0 such that δ1 + δ̃ < C tx . Using (221)

ator of (168) is maximized by the (unique) water-filling power together with
allocation vj = vj∗ defined in (45):
log(1 + λ1 v1∗ ) ≤ µ∗ (λ) ≤ m log(1 + λ1 v1∗ ) (222)
µ∗ (λ) = max µ(v, λ) = µ(v ∗ , λ). (209)
v∈Vm and the assumption that γ ∈ (Ctx − δ̃, Ctx + δ̃), we obtain that
The function σ(v, λ) on the denominator of (168) can be whenever |µ∗ (λ) − γ| < δ1
bounded as 1  (Ctx −δ1 −δ̃)/m 
√ k0 , e −1
0 ≤ σ(v, λ) ≤ m. (210) ρ
m  Ctx +δ1 +δ̃ 
Using (209) and (210) we obtain that for an arbitrary δ > 0 ≤ λ1 ≤ e − 1 , k1 . (223)
ρ
 √ ∗
δ/ m, µ (λ) ≤ γ − δ ∗
The term kvmin − v k can be upper-bounded as follows.
min u(v, λ) ≥ (211)
v∈Vm −∞, µ∗ (λ) ≥ γ + δ. Since vmin is the minimizer of u(v, λ), it must satisfy the
Let vmin be the minimizer of u(v, λ) for a given λ. To prove Karush–Kuhn–Tucker (KKT) conditions [41, Sec. 5.5.3]:
Lemma 16, it remains to show that there exist δ > 0, δ̃ > 0 and ∂u(v, λ)
tx tx − = η, ∀ l for which vmin,l > 0 (224)
k < ∞ such that for every γ ∈ (C − δ̃, C + δ̃) and every ∂vl

vl =vmin,l
m ∗
λ ∈ R≥ satisfying |µ (λ) − γ| < δ, ∂u(v, λ)
− ≤ η, ∀ l for which vmin,l = 0 (225)
∂vl

min u(v, λ) = u(vmin , λ) (212) vl =vmin,l
v∈Vm
for some η. The derivatives in (224) and (225) are given by
γ − µ∗ (λ)
≥ ∗ . (213) ∂u(v, λ)

γ − µ(vmin , λ)

σ (λ) + k(γ − µ∗ (λ)) − = 1 +
∂vl (1 + λl vmin,l )2 σ 2 (vmin , λ)

vl =vmin,l
Since
1
γ − µ(vmin , λ) γ − µ∗ (λ) × . (226)
u(vmin , λ) = ≥ (214) (vmin,l + 1/λl )σ(vmin , λ)
σ(vmin , λ) σ(vmin , λ)
Let η̃ , 1/(σ(vmin , λ)η). Then, (224) and (225) can be rewrit-
it suffices to show that for every γ ∈ (Ctx − δ̃, Ctx + δ̃) and ten as

every λ ∈ Rm ≥ satisfying |µ (λ) − γ| < δ, we have
+
γ − µ(vmin , λ)
  
1
vmin,l = η̃ 1 + − (227)
|σ(vmin , λ) − σ ∗ (λ)| ≤ k|γ − µ∗ (λ)| (215) (1 + λl vmin,l )2 σ 2 (vmin , λ) λl
19

where η̃ satisfies Let FA and√FB be the cdfs of A and B, respectively. We rewrite


m   + P[B ≥ A/ n] as follows:
γ − µ(vmin , λ)

X 1
η̃ 1 + − = ρ. (228) √ √
Z
(1 + λl vmin,l )2 σ 2 (vmin , λ) λl P[B ≥ A/ n] = √
P[B ≥ a/ n]dFA (a)
l=1
|a|≥δ n
Here, the equality in (228) follows because u(v, λ) is monoton-
| {z }
,c0 (n)
Pm implies that the minimizer vmin
ically decreasing in vj , which √
Z
of u(v, λ) must satisfy l=1 vmin,l = ρ. Comparing (227) + √
P[B ≥ a/ n] dFA (a). (238)
and (228) with (45) and (22), we obtain, after algebraic ma- |a|<δ n | {z

}
=1−FB (a/ n)
nipulations
We next expand the argument of the second integral in √(238)
kvmin − v ∗ k ≤ k2 |γ − µ(vmin , λ)| (229) by applying Taylor’s theorem [40, Th. 5.15] on F B (a/ n) as
√ √
follows: for all a ∈ (−δ n, δ n)
for some k2 < ∞ that does not depend on λ, vmin , v ∗ and γ.
To further upper-bound the RHS of (229), recall that vmin √ a f 0 (a0 ) a2
1 − FB (a/ n) = 1 − FB (0) − fB (0) √ − B (239)
minimizes u(v, λ) = (γ − µ(v, λ))/σ(v, λ) for a given λ and n 2 n
that µ∗ (λ) = maxv∈Vm µ(v, λ). Thus, if µ∗ (λ) ≥ γ then we √
for some a0 ∈ (0, a/ n). Averaging over A, we get
must have u(vmin , λ) ≤ u(v ∗ , λ) ≤ 0, which implies that

Z
1 − FB (a/ n)dFA (a)
0 ≤ µ(vmin , λ) − γ ≤ µ∗ (λ) − γ. (230) √
|a|<δ n

= (1 − FB (0)) P[|A| < δ n]
If µ∗ (λ) < γ then | {z }
=P[B≥0]
γ − µ(vmin , λ) γ − µ∗ (λ)
0≤ √ ≤ u(vmin , λ) ≤ (231) fB (0)  √ 
m σ ∗ (λ) − √ E A · 1{|A| < δ n}
n |
√ {z }
where in the second inequality we used that σ(vmin , λ) ≤ m ,c1 (n)
(see (210)). Using (221) and (223), we can lower-bound σ ∗ (λ) A2 fB0 (A0 ) √
 
as −E · 1{|A| < δ n} . (240)
2n
s | {z }
∗ 1 ,c2 (n)
σ (λ) ≥ 1 − (232)
(1 + λ1 v1∗ )2 Hence,
s √
1 P[B ≥ A/ n] − P[B ≥ 0] (241)
≥ 1− , k3 . (233)
(1 + ρk0 /m)2 √


= c0 (n) − P[B ≥ 0] · P[|A| ≥ δ n]
Substituting (233) into (231), we obtain
√ fB (0)
m − √ c1 (n) − c2 (n) (242)
γ − µ∗ (λ) . n

0 ≤ γ − µ(vmin , λ) ≤ (234)
k3 √ k1
√ ≤ c0 (n) + P[|A| ≥ δ n] + √ |c1 (n)| + |c2 (n)| (243)
Combining (234) with (230) and using that m/k3 > 1, we get n
√ √ k1
≤ 2P[|A| ≥ δ n] + √ |c1 (n)| + |c2 (n)| (244)
γ − µ(vmin , λ) ≤ m γ − µ∗ (λ) .

(235) n
k3 2 k1
≤ 2 + √ |c1 (n)| + |c2 (n)| . (245)
Finally, substituting (235) into√(229), then (229) and (223) δ n n
into (220), and writing k , k1 k2 m/k3 , we conclude that (215) Here, in (243) we used the triangle inequality together with (237)
holds for every γ ∈ (Ctx − δ̃, Ctx + δ̃) and every λ satisfying
|µ∗ (λ) − γ| < δ1 .
and the trivial bound P[B √ ≥ 0] ≤ 1; (244) follows because
c0 (n) ≤ P[|A| ≥ δ n];  (245) follows from Chebyshev’s
To prove (216), we choose 0 < δ < min{δ1 , k3 /k}. It then 
inequality and because E A2 = 1 by assumption.
follows that for every λ satisfying |µ∗ (λ) − γ| < δ we have To conclude the proof, we next upper-bound |c1 (n)|, and
σ ∗ (λ) − k|γ − µ∗ (λ)| ≥ k3 − kδ > 0. (236) |c2 (n)|. The term |c1 (n)| can be bounded as
√ 
|c1 (n)| = E A · 1{|A| ≥ δ n}

(246)
Here, in (236) we used the bound (233). This concludes the
1  √ √ 
proof. ≤ √ E δ n|A| · 1{|A| ≥ δ n} (247)
δ n
1 √ 
≤ √ E A2 · 1{|A| ≥ δ n}

E. Proof of Lemma 17 (248)
δ n
By assumption, there exist δ > 0 and k1 < ∞ such that 1
≤ √ (249)
δ n
sup max |fB (t)|, |fB0 (t)| ≤ k1 .

(237)
t∈(−δ,δ) where (246) follows because E[A] = 0 by assumption.
20

Finally, |c2 (n)| can be bounded as Here, (261) follows from Lemma 13 (Appendix I) by let-
 2 0
A |fB (A0 )| √
 ting ej and Wj stand for the jth column of In,t and W, respec-
|c2 (n)| ≤ E · 1{|A| < δ n} (250) tively; (262) follows by symmetry. We next note that by (98),
2n
the random variable sin2 {e1 , nvj∗ Λj e1 + Wj } has the same
p
√  k1
≤ E A2 · 1{|A| < δ n}

(251) distribution as
2n Pn
k1 |2
i=2 |Wi,jP
≤ . (252) Tj , p ∗ n . (263)
2n | nvj Λj + W1,j | + i=2 |Wi,j |2
2

Here, (251) follows because the support of A0 is contained in


Thus,
(0, δ) and from (237). Substituting (249) and (252) into (245), "m # "m #
we obtain the desired inequality (188). Y
2
n q

o Y
P sin e1 , nvj Λj e1 + Wj ≤ γn = P Tj ≤ γn .
A PPENDIX V j=1 j=1
P ROOF OF THE ACHIEVABILITY PART OF T HEOREM 3 (264)
In order to prove (57), we study the achievability bound (43) To evaluate the RHS of (264), we
in the large-n limit. We start by analyzing the denominator on Pnobserve that by the law
of large numbers, the noise term n1 i=2 |Wi,j |2 in (263) con-
the RHS of (43). Let α = n − t − r > 0. Then, centrates around 1 as n → ∞. Hence, we expect that for all
" r # " r #
Y Y
−α −α γ>0
P Bi ≤ γn = P Bi ≥ γn (253)    
i=1 m m
i=1
Qr Y Y 1
−α Tj ≤ γ  → P ∗Λ + 1 ≤ γ
 as n → ∞. (265)

i=1 Bi
E P
≤ (254) j=1 j=1
v j j
γn−α
r
Y h
−(n−t−r)
i We shall next make this statement rigorous by showing that, for
= γnn−t−r E Bi (255) tx
all γ in a certain neighborhood of e−C ,
i=1    
m m
where (254) follows from Markov’s inequality, and (255) follows
 
Y Y 1 +O 1
because the B1 , . . . , Br are independent. Recalling that Bi ∼ P Tj ≤ γ  ≥ P ≤ γ (266)
j=1
v∗ Λ + 1
j=1 j j
n
Beta(n − t − i + 1, t), we obtain that for every i ∈ {1, . . . , r}
h i
E Bi
−(n−t−r) where the term O(1/n) is uniform in γ. To this end, we build on
Z 1 Lemma 17 in Appendix IV-C. The technical difficulty is that the
Γ(n − i + 1) joint pdf of Λ1 v1∗ , . . . , Λm vm

is not continuously differentiable
= sr−i (1 − s)t−1 ds (256) ∗
Γ(n − t − i + 1)Γ(t) 0 because the functions {vj (·)} are not differentiable on the bound-
Γ(n − i + 1) ary of the nonintersecting regions W1 , . . . , Wm defined in (190)
≤ (257) and (191). To circumvent this problem, we study the asymptotic
Γ(n − t − i + 1)Γ(t)
≤ nt . (258) behavior of {Tj } conditioned on Λ ∈ Int(Wu ), in which
case the joint pdf of Λj vj∗ (Λ), j = 1, . . . , m, is continuously
Substituting (258) into (255), we get differentiable. This comes without loss of generality since Λ lies
Sm
in u=1 Int Wu ) almost surely (see also Appendix IV-C).
" r #
Y
P Bi ≤ γn ≤ nrt γnn−t−r . (259) (u)
To simplify notation, we use Tj to denote the random
i=1
variable Tj conditioned on the event Λ ∈ Int(Wu ), u =
Setting τ = 1/n and γn = exp(−Ctx + O(1/n)) in (43), and 1, . . . , m. We further denote by Λ(u) and Λ e (u) the random vec-
using (259), we obtain tors [Λ1 , . . . , Λu ] and [Λ1 v1 (Λ), . . . , Λu vu∗ (Λ)]T conditioned
T ∗

on the event Λ ∈ Int(Wu ), respectively. Using these definitions,


 
log M log n 1
≥ Ctx − (1 + rt) +O . (260)
n n n the LHS of (266) can be rewritten as
To conclude the proof, it remains to show that there exists
" m #
Y
a γn = exp(−Ctx + O(1/n)) satisfying (44). To this end, we P Tj ≤ γ
note that j=1
( " # )
√ m m
  np
p
P sin2 In,t , nIn,t diag v1∗ Λ1 , . . . , vm∗ Λ ,
X Y
= Tj ≤ γ Λ ∈ Int(Wu ) P[Λ ∈ Int(Wu )]

m P
o   u=1 j=1
0, . . . , 0 + W ≤ γn (267)
| {z } m
( " u m Pn #
2
Y (u)   Y |Wi,j |

t−m X
i=2
m = P Tj · Pn ≤γ
i=1 |Wi,j |
Y n q o  2
≥P sin2 ej , nvj∗ Λj ej + Wj ≤ γn (261) u=1 j=1
|
j=u+1
{z }
j=1 ≤1
Ym n q o  )
2 ∗
=P sin e1 , nvj Λj e1 + Wj ≤ γn . (262) × P[Λ ∈ Int(Wu )] (268)
j=1
21

m
( " u
# )
X Y (u) We next choose γn as the solution of
≥ P Tj ≤ γ P[Λ ∈ Int(Wu )] . (269)
 
u=1 j=1 1 1
Pn 1 − Ftx (− log γn ) + O =1−+ . (278)
Here, because, by (192), Tj = ( i=2 |Wi,j |2 ) n n
Pn(268) follows
/( i=1 |Wi,j |2 ) for j = u + 1, . . . , m. 0
Since Ftx (Ctx ) =  and Ftx (Ctx ) > 0, it follow from Taylor’s
The following lemma, built upon Lemma 17, allows us to theorem that
establish (266).  
Lemma 18: Let G = [G1 , . . . , Gu ]T ∈ Ru≥ be a random tx 1
− log γn = C + O . (279)
vector with continuously differentiable joint pdf. Let n
Pn 2
i=2 |Wi,j | So, for sufficiently large n, γn in (279) belongs to the inter-
Dj , p , j = 1, . . . , u (270) tx tx
n val e−C −δa , e−C +δa . Hence, by (264), (277), and (278),
P
| nGj + W1,j |2 + i=2 |Wi,j |2
this γn satisfies (44). This concludes the proof.
where Wi,j , i = 1, . . . , n, j = 1, . . . , u, are i.i.d. CN (0, 1)-
distributed. Fix an arbitrary ξ0 ∈ (0, 1). Then, there exist a
δ > 0 and a finite constant k such that A. Proof of Lemma 18
" u # " u #!
Y Y 1 k Choose δ > 0 such that δ ≤ ξ0 /2. Throughout this appendix,
inf P Dj ≤ ξ − P ≤ξ > . we shall use const to indicate a finite constant that does neither
ξ∈(ξ0 −δ,ξ0 +δ)
j=1 j=1
1 + Gj n
depend on ξ ∈ (ξ0 − δ, ξ0 + δ) nor on n; its magnitude and sign
(271) may change at each occurrence.

Proof: See Appendix V-A. Let gth , 2/ξ0 − 1 and let


(u)
Using Lemma 18 with Gj = Λ e , it follows that there
j
" u
Y


#
exist δu > 0 and 0 ≤ ku < ∞, such that for every γ ∈ p1 , P Dj ≤ ξ G1 ≥ gth

(280)
tx tx
e−C −δu , e−C +δu j=1

" u #
    Y
u u
p2 , P Dj ≤ ξ G1 < gth . (281)
 
Y (u) Y 1 1
P Tj ≤ γ  ≥ P (u)
≤ γ + O . (272) j=1

j=1 j=1 1 + Λj
e n
 Qu 
(u)
To prove Lemma 18, we decompose P j=1 Dj ≤ ξ as
To show that Λ e , j = 1, . . . , u, indeed satisfy the conditions
j " u #
in Lemma (18), we use (192), (45), and (22), to obtain Y
P Dj ≤ ξ = p1 P[G1 ≥ gth ] + p2 P[G1 < gth ] . (282)
(u) u
!
(u) Λj X 1 j=1
Λ
e =
j ρ+ (u)
− 1, j = 1, . . . , u. (273)
u Λ The proof consists of the following steps:
l=1 l

Since the joint pdf of Λ is continuously differentiable by assump- 1) We show in Section V-A1 that for every ξ ∈ (ξ0 −δ, ξ0 +δ),
tion, the joint pdf of Λ(u) is also continuously differentiable. the term p1 in (282) can be lower-bounded as
Moreover, it can be shown that the transformation Λ(u) 7→ Λ e (u)
const
defined by (273) is a diffeomorphism of class C 2 [29, p. 147]. p1 ≥ 1 − . (283)
e (u) is continuously differentiable. n
Therefore, the joint pdf of Λ
We next use (272)in (269) to conclude that for every γ ∈ 2) Using Lemma 17 in Appendix IV-C, we show in Sec-
tx tx
e−C −δa , e−C +δa (where δa , min{δ1 , . . . , δm }) tion V-A2 that p2 can be lower-bounded as
" m # " u
#
Y 1 Y const
Tj ≤ γ p2 ≥ P Dj ≤ ξ G1 < gth − . (284)

P
1 + G1 j=2
n
u=1
m
( " u
# )  
X Y 1 1 3) Reiterating Step 2 for D2 , . . . , Du , we conclude that (284)
≥ P ≤ γ P[Λ ∈ Int(Wu )] +O
u=1
e (u)
j=1 1 + Λ
n can be further lower-bounded as
j
(274)
" u #
Y 1 const
p2 ≥ P ≤ ξ G1 < gth − . (285)
" m   #
Y 1 1 1 + G n
∗ (Λ) ≤ γ + O n
=P (275) j=1 j
j=1
1 + Λ v
j j
" m #   4) Finally, using (283) and (285) in (282), we show in Sec-
X
∗ 1 tion V-A3 that
=1−P log(1 + Λj vj (Λ)) ≤ − log γ + O (276)
j=1
n " u # " u #
Y Y 1 const
Dj ≤ ξ ≥ P ≤ξ − . (286)
 
1 P
= 1 − Ftx (− log γ) + O (277) j=1 j=1
1 + Gj n
n
where Ftx (·) is given in (20). This proves Lemma 18.
22

1) Proof of (283): Let δ1 be an arbitrary √ real number in evaluate p2 in (282), we proceed as


2) Proof of (284): To Q
√ u
(1/(ξ0 − δ), 2/ξ0 ) and let δ2 , gth − δ1 − 1 > 0. Let follows. Defining Z , ξ/ j=2 Dj , we obtain
Wn+1,1 ∼ CN (0, 1) be independent of all other random vari- " u #
ables appearing in the definition of the {Dj } in (270). Finally, let Y
p2 = P Dj ≤ ξ G1 < gth (297)

Wre denote the real part of W1,1 . For every ξ ∈ (ξ0 − δ, ξ0 + δ)
j=1
 
  = P D1 ≤ Z G1 < gth (298)
p1 ≥ P D1 ≤ ξ G1 ≥ gth (287)  
" n = P D1 ≤ Z, Z ≥ 1 G1 < gth
p 2 1−ξ X  
≥ P nG1 + W1,1 ≥ |Wi,1 |2 , + P D1 ≤ Z, Z < 1 G1 < gth (299)

ξ i=2    
# = P Z ≥ 1 G1 < gth + P D1 ≤ Z, Z < 1 G1 < gth

√ (300)
Wre ≥ − nδ2 G1 ≥ gth (288)

" n
# where (300) follows because
p
2 1−ξ X
2
≥ P n( G1 − δ2 ) ≥ |Wi,1 | G1 ≥ gth  
P D1 ≤ Z Z ≥ 1, G1 < gth = 1. (301)
ξ i=2
 √ 
× P Wre ≥ − nδ2 (289) The second term on the RHS of (300) can be rewritten as
" n
#
1−ξ X √  
P D1 ≤ Z, Z < 1 G1 < gth

|Wi,1 |2 P Wre ≥ − nδ2

≥ P n(δ1 − 1) ≥
ξ i=2 h
= EZ,G2 ,...,Gu | G1 <gth 1{Z < 1}
(290)
" #  i
 n+1
X × P D1 ≤ Z Z, G2 , . . . , Gu , G1 < gth . (302)
≥ P n(δ1 − 1) ≥ 1/(ξ0 − δ) − 1 |Wi,1 |2
 √  i=2 Since events of measure zero do not affect (302), we can
× P |Wre | ≤ nδ2 (291) assume without loss of generality that the conditional joint
1 δ1 (ξ0 − δ) − 1
 2 ! 
1
 pdf of Z, G2 , . . . , Gu given G1 < gth is strictly positive. To
≥ 1− 1− (292) lower-bound (302), we first bound the  conditional probability
n 1 − (ξ0 − δ) 2nδ22 
P D1 ≤ Z Z, G2 , . . . , Gu , G1 < gth . Again, let Wre denote
const the real part of W1,1 , and let Wn+1,1 ∼ CN (0, 1) be independent
≥1− . (293)
n of all other random variables appearing in the definition of the
{Dj } in (270). Following similar steps as in (287)–(293), we
Here, (287) follows because Di ≤ 1, i = 2, . . . , u, with probabil- obtain for Z < 1

ity one (see (270)); (290) follows because δ1 −1 = ( gth −δ2 )2 ;
Pn+1
(291) follows because ξ > ξP 0 − δ and because i=2 |Wi,1 |
2 P[D1 ≤ Z | Z, G2 , . . . , Gu , G1 < gth ]
n 2
is stochastically larger than i=2 |Wi,1 | ; (292) follows from
" Pn
2
i=2 |Wi,1 |

= P √ ≤ Z

Chebyshev’s inequality applied to both probabilities in (291). 2 Pn
This proves (283).
nG1 + W1,1 +
i=2 |Wi,1 |2
#
Before proceeding to the next step, we first argue that, if
Z, G2 , . . . , Gu , G1 < gth (303)
P[G1 ≥ gth ] = 1, then (271) follows directly from (293). Indeed,
in this case we obtain from (293) and (282) that "
p 2 n


X
= P nG1 + W1,1 ≥ Z −1 − 1 |Wi,1 |2

" u
#
const

Y i=2
P Dj ≤ ξ = p1 ≥ 1 − . (294) #
j=1
n
Z, G2 , . . . , Gu , G1 < gth (304)

We further have, with probability one,


"
p 2
−1
 n+1
X
2
≥ P nG1 + Wre ≥ Z − 1 |Wi,1 |

u
i=2
Y 1 1 1 ξ0
≤ ≤ = ≤ ξ0 − δ < ξ (295)
#
j=1
1 + Gj 1 + G1 1 + gth 2 Z, G2 , . . . , Gu , G1 < gth (305)
" r
which gives p p Xn+1
≥P nG1 ≥ −Wre + Z −1 − 1 |Wi,1 |2

" # i=2
u
Y 1 #
P ≤ ξ = 1. (296)
j=1
1 + Gj Z, G2 , . . . , Gu , G1 < gth . (306)

Subtracting (294) from (296) yields (271). In the following, we Next, we lower-bound the RHS of (306) using Lemma 17 in
2
shall focus exclusively on the case P[G1 ≥ gth ] < 1. Appendix IV-C. Let µW and σW be the mean and the variance
23

qP
n+1 √ q P
n+1
of the random variable i=2 |Wi,1 |2 . Let Z2 , Z −1 − 1. Here, (315) follows by using that 2 i=2 |Wi,1 |2 is χ-
Furthermore, let distributed with 2n degrees of freedom and by using [42,
v
un+1 ! Eq. (18.14)]; (316) follows from [43, Q
Sec. 2.2]; (318) follows
u
1 uX from the definition of Z and because j=2 Dj ≤ 1. Substitut-
K1 , p − Wre + Z2 t |Wi,1 |2 − µW Z2
2
1/2 + Z22 σW ing (316) and (319) into (313), we obtain
i=2
(307) const
f˜G1 (x | z2 , g2 . . . , gu ) ≤ . (320)
fG2 ,...,Gu | G1 <gth (g2 , . . . , gu )
and
p  Following similar steps, we can also establish that
1 µW
G1 , p G1 − √ Z2 . (308) const
1/2 + Z22 σW
2 n ˜0
fG1 (x | z2 , g2 . . . , gu ) ≤

.
fG2 ,...,Gu | G1 <gth (g2 , . . . , gu )
Note that K1 is a zero-mean, unit-variance random variable
(321)
that is conditionally independent of G1 given Z2 . Using these
definitions, we can rewrite the RHS of (306) as Using (320)–(321) and Lemma 17, we obtain that
h √ i h √ i
P G1 ≥ K1 / n Z2 , G2 , . . . , Gu , G1 < gth . (309) P G1 ≥ K1 / n Z2 , G2 = g2 , . . . , Gu = gu , G1 < gth
h i
In order to use Lemma 17, we need to establish an upper bound on ≥ P G1 ≥ 0 Z2 , G2 = g2 , . . . , Gu = gu , G1 < gth

the conditional pdf of G1 given Z2 , G2 , . . . , Gu and G1 < gth ,  
which we denote by f˜G1 , and on its derivative. As fG1 ,...,Gu const 1
− 1+ . (322)
is continuously differentiable by assumption, fG1 ,...,Gu and its n fG2 ,...,Gu | G1 <gth (g2 , . . . , gu )
partial derivatives are bounded on bounded sets. Together with Returning to the analysis of (302), we combine (306), (309)
the assumption that P[G1 ≥ gth ] < 1, this implies that the con- and (322) to obtain
ditional pdf fG1 ,...,Gu | G1 <gth of G1 , . . . , Gu given G1 < gth  
and its partial derivatives are all bounded on [0, gth )u . Namely, P D1 ≤ Z, Z < 1 G1 < gth
for every {x1 , . . . , xu } ∈ [0, gth )u and every i ∈ {1, . . . , u}
"
≥ EZ,G2 ,...,Gu | G1 <gth 1{Z < 1}
fG1 ,...,Gu | G1 <gth (x1 , . . . , xu ) ≤ const (310)

∂fG1 ,...,Gu | G1 <gth (x1 , . . . , xu ) h i
≤ const. (311) × P G1 ≥ 0 Z, G2 , . . . , Gu , G1 < gth

∂xi

Let f˜G1 be the conditional pdf of G1 given G2 , . . . , Gu and const



1
!#
G1 < gth , and let fG2 ,...,Gu | G1 <gth be the conditional pdf of − 1+ (323)
n fG2 ,...,Gu | G1 <gth (G2 , . . . , Gu )
G2 , . . . , Gu given G1 < gth . Then, f˜G1 can be bounded as  
1 G1 < gth − const

≥P ≤ Z, Z < 1
f˜G1 (x | z2 , g2 . . . , gu ) 1 + nG1 /µ2W n
√ 2 Zgth Zgth
 q  !
= 2f˜G1 1/2 + z22 σW 2 x + z µ / n g , . . . , g
2 W 2 u fG2 ,...,Gu | G1 <gth (g2 , . . . , gu )
× 1 + ··· dg2 · · · dgu
fG2 ,...,Gu | G1 <gth (g2 , . . . , gu )

q q 
2 2 2 2 0 0
× 1/2 + z2 σW 1/2 + z2 σW x + z2 µW / n (312) (324)

 
p
const · gth 1/2 + σW 2 z2 1 const
2 ≥P ≤ Z, Z < 1 G1 < gth −
. (325)
≤ . (313) 1 + G1 n
fG2 ,...,Gu | G1 <gth (g2 , . . . , gu )
Here, in (324) we used (308), that 1{Z < 1} ≤ 1, that
Here, (312) follows from (308), and (313) follows from (310)
G1 , . . . , Gu are nonincreasing, and that const in (323) is posi-
and because we condition on the event that G1 < gth , so
tive; (325) follows because [42, Eq. (18.14)]
q √
2 x + z µ / n ≤ √g .
1/2 + z22 σW 2 W th (314) Γ(n + 1/2) √
µW = ≤ n (326)
To further upper-bound (313), we shall use that σW and Z2 are Γ(n)
bounded: and because the integral on the RHS of (324) is bounded. Sub-

Γ(n + 1/2)
2 stituting (325) into (300), we obtain
2
σW =n− (315)
Γ(n) p2 ≥ P[Z ≥ 1 | G1 < gth ]
≤ 1/4 (316)
 
1 const
+P ≤ Z, Z < 1 G1 < gth − (327)
and 1 + G1 n
 
1
Z22 = Z −1 − 1 (317) =P ≤ Z, Z ≥ 1 G1 < gth
1 + G1
≤ 1/ξ − 1 (318)  
1 const
−1 +P ≤ Z, Z < 1 G1 < gth −
(328)
≤ (ξ0 − δ) − 1. (319) 1 + G1 n
24

(Ir + HH QH)−1 ≤ kIr k = √r. Similarly, by replacing Ql


 
1 const
=P ≤ Z G1 < gth − (329) F F
1 + G1 n with Q in the above steps, we obtain
" u
#
1 Y const
=P Dj ≤ ξ G1 < gth − (330) det(Ir + HH QH)

1 + G1 j=2 n 2 √
≤ det(Ir + HH Ql H)(1 + kQl − QkF kHkF r)r .

(339)
where (328) follows because 1/(1 + G1 ) ≤ 1 with probability The inequalities (338) and (339) imply that
one. This proves (284).
log det(Ir + HH Ql H) − log det(Ir + HH QH)

3) Proof of (286): Set p0 , P[G1 ≥ gth ]. Substituting (283)
and (285) into (282), we obtain 2√
≤ r log(1 + kQl − QkF kHkF r) (340)
2
≤ r3/2 kQl − QkF kHkF .
" u #
Y (341)
P Dj ≤ ξ
j=1 Hence, for every c > 0
" #
u h i
Y1 const P log det(Ir + HH Ql H) − log det(Ir + HH QH) ≥ c

≥ p0 + (1 − p0 )P ≤ ξ G1 < gth −

j=1
1 + Gj n  
2 c 1
(331) ≤ P kHkF ≥ 3/2 (342)
" # r kQl − QkF
u
Y 1
r3/2
≤ ξ G1 ≥ gth p0
h i
=P 2

1 + Gj
≤ E kHkF · kQl − QkF (343)
j=1
c
| {z } → 0, as Ql → Q (344)
" =1u
#
Y 1 const where (343) follows from Markov’s i inequality and (344) follows
+ (1 − p0 )P ≤ ξ G1 < gth − (332)
h
2
1 + Gj n because, by assumption, E kHkF < ∞. Thus, the sequence of
j=1
" u  random variables {log det(Ir + HH Ql H)} converges in proba-
Y 1 const bility to log det(Ir + HH QH). Since convergence in probability
=P ≤ξ − . (333)
j=1
1 + Gj n implies convergence in distribution, we conclude that

P log det Ir + HH Ql H ≤ ξ
  
The first factor in (332) is equal to one because of (295). This
proves (286) and concludes the proof of Lemma 18. → P log det Ir + HH QH ≤ ξ as Ql → Q (345)
  

for every ξ ∈ R for which the cdf of log det(Ir + HH QH) is


A PPENDIX VI continuous [44, p. 308]. However, the cdf of log det(Ir +HH QH)
P ROOF OF P ROPOSITION 5 (E XISTENCE OF O PTIMAL is continuous for every ξ ∈ R since the distribution of H
C OVARIANCE M ATRIX ) is, by assumption, absolutely continuous with respect to the
Since the set Ut is compact, by the extreme value theorem [29, Lebesgue measure and the function H 7→ log det(Ir + HH QH)
p. 34], it is sufficient to show that, under the assumptionsin the is continuous. Consequently, (345) holds for every ξ ∈ R, thus

proposition, the function Q 7→ P log det Ir + HH QH ≤ ξ proving Proposition 5.
is continuous in Q ∈ Ut with respect to the metric d(A, B) =
kA − BkF . A PPENDIX VII
Consider an arbitrary sequence {Ql } in Ut that converges to Q. P ROOF OF T HEOREM 6 (CSIR C ONVERSE B OUND )
Then
For the CSIR case, the input of the channel (8) is X and the
H
det(Ir + H Ql H) output is the pair (Y, H). An (n, M, )e code is defined in a
similar way as the (n, M, )rx code in Definition 2, except that
= det(Ir + HH QH + HH (Ql − Q)H) (334)
each codeword satisfies the power constraint (9) with equality,
= det(Ir + HH QH) i.e., each codeword belongs to the set
× det Ir + HH (Ql − Q)H(Ir + HH QH)−1

(335) 2
H Fn,t , {X ∈ Cn×t : kXkF = nρ}. (346)
≤ det(Ir + H QH)
 r Denote by Re∗ (n, ) the maximal achievable rate with an
× 1 + HH (Ql − Q)H(Ir + HH QH)−1 F

(336)
(n, M, )e code. Then by [21, Sec. XIII] (see also [9, Lem. 39],
≤ det(Ir + HH QH) n

r Rrx (n − 1, ) ≤ R∗ (n, ). (347)

2
× 1 + kQl − QkF kHkF (Ir + HH QH)−1 F (337) n−1 e
2√
r
We next establish an upper bound on Re∗ (n, ). Consider an

≤ det(Ir + HH QH) 1 + kQl − QkF kHkF r . (338)
arbitrary (M, n, )e code. To each codeword X ∈ Fn,t , we
t×t
Here, (336) follows from Hadamard’s inequality; (337) fol- associate a matrix U(X) ∈ C :
lows from the sub-multiplicative property of the Frobenius 1
norm, namely, kABkF ≤ kAkF kBkF ; (338) follows because U(X) , XH X. (348)
n
25

To upper-bound Re∗ (n, ), we use the meta-converse theorem [9, where γn (X0 ) is the solution of
Th. 30]. As auxiliary channel QYH | X , we take  
PYH | X=X0 r(X0 ; YH) ≤ nγn (X0 ) = . (362)
QYH | X = PH × QY | XH (349)
It can be shown that under PYH | X=X0 , the random variable
where
r(X0 ; YH) has the same distribution as Snrx (U(X0 )) in (70), and
n
Y under QYH | X=X0 , it has the same distribution as Lrx n (U(X0 ))
QY | X=X,H=H = QYi | X=X,H=H (350)
in (69).
i=1
2) Converse on the auxiliary channel: To prove the theorem,
with Yi , i = 1, . . . , n denoting the rows of Y, and it remains to lower-bound 0 , which is the maximal probability
QYi | X=X,H=H = CN 0, Ir + HH U(X)H .

(351) of error over the auxiliary channel (349). The following lemma
serves this purpose.
By [9, Th. 30], we have Lemma 19: For every code with M codewords and block-
inf β1− PYH | X=X , QYH | X=X ≤ 1 − 0

(352) length n ≥ r, the maximum probability of error 0 over the
X∈Fn,t auxiliary channel (349) satisfies
where 0 is the maximal probability of error of the optimal code crx (n)
with M codewords over the auxiliary channel (349). To shorten 1 − 0 ≤ (363)
M
notation, we define
where crx (n) is given in (72).
n

β1− (X) , β1− PYH | X=X , QYH | X=X . (353) Substituting (361) into (352) and using (363), we then obtain
To prove the theorem, we proceed as in Appendix III: we first upon minimizing (361) over all matrices in Ute
n
evaluate β1− (X), then we relate 0 to Re∗ (n, ) by establishing 1 crx (n)
a converse bound on the channel QYH | X . Re∗ (n, ) ≤ . (364)
n (Q) ≥ nγn ]
n inf e P[Lrx
1) Evaluation of β1− (X): Let G be an arbitrary n×n unitary Q∈Ut
matrix. Let gi : Fn,t 7→ Fn,t and go : Cn×r × Ct×r 7→ Cn×r ×
Ct×r be two mappings defined as The final bound (71) follows by combining (364) with (347) and
by noting that the upper bound does not depend on the chosen
gi (X) , GX and go (Y, H) , (GY, H). (354) code.
Proof of Lemma 19: According to (351), given H = H, the
Note that
output of the auxiliary channel depends on X only through U(X).
PYH | X (go−1 (E) | gi (X)) = PYH | X (E | X) (355) In the following, we shall omit the argument of U(X) where it is
immaterial. Let V , U(Y). Then, (V, H) is a sufficient statistic
for all measurable sets E ⊂ Cn×r × Ct×r and X ∈ Fn,t , i.e., the for the detection of X from (Y, H). Therefore, to establish (363),
pair (gi , go ) is a symmetry [45, Def. 3] of PYH | X . Furthermore, it is sufficient to lower-bound the maximal probability of error 0
(350) and (351) imply that over the equivalent auxiliary channel
QYH | X=X = QYH | X=gi (X) (356)
QVH | U = PH × QV | U,H (365)
and that QYH | X=X is invariant under go for all X ∈ F. Hence,
by [45, Prop. 19], we have that where QV | U=U,H=H is the Wishart distribution [18, Def. 2.3]:
n n n  
β1− (X) = β1− (gi (X)) = β1− (GX). (357) 1 H
QV | U=U,H=H = Wr n, (Ir + H UH) . (366)
n n
Since G is arbitrary, this implies that β1− (X) depends on X only
through U(X). Consider the QR decomposition [46, p. 113] of X Let B , Ir + HH UH, and let qV | B (V | B) be the pdf associated
X = VX0 (358) with (366), i.e., [18, Def. 2.3]

where V ∈ C n×n
is unitary and X0 ∈ C n×t
is upper triangular. det Vn−r   −1 
qV | B (V | B) = 1
n exp −tr n−1 B V .
By (357) and (358), Γr (n) det n B
n n (367)
β1− (X0 ) = β1− (X). (359)
Let It will be convenient to express qV | B (V | B) in the coordinate
dPYH | X=X0 system of the eigenvalue decomposition
r(X0 ; YH) , log . (360)
dQYH | X=X0 V = QDQH (368)
Under both PYH | X=X0 and QYH | X=X0 , the random variable
r(X0 ; YH) has absolutely continuous cdf with respect to the where Q ∈ Cr×r is unitary, and D is a diagonal matrix whose
Lebesgue measure. By the Neyman-Pearson lemma [38, p. 300] diagonal elements D1 , . . . , Dr are the eigenvalues of V in de-
scending order. In order to make the eigenvalue decomposi-
n
 
β1− (X0 ) = QYH | X=X0 r(X0 ; YH) ≥ nγn (X0 ) (361) tion (368) unique, we assume that the first row of Q is real
26

and non-negative. Thus, Q only lies in a submanifold Ser,r of the Let 0avg denote the average probability of error over the auxiliary
Stiefel manifold Sr,r . Using (368), we rewrite (367) as channel. Then,

nrn exp −n · tr(B−1 QDQH )


 1 − 0
qQ,D | B (Q, D | B) = ≤ 1 − 0avg (377)
Γr (n) det Bn  
r M Z
1
Y
× det Dn−r (di − dj )2
X
(369) = EH  qQ,D | B=Ir +HH f0 (j)H (Q, D)dQdD (378)
i<j M j=1 Dj (H)
 
M Z r
where in (369) we used the fact
Qr that the Jacobian of the eigen- nrn X Y
value decomposition (368) is i<j (di − dj )2 (see [47, Th. 3.1]). ≤ EH  gi (di )dQdD (379)
Γr (n)M j=1 Dj (H) i=1
We next establish an upper bound on (369) that is integrable "Z #
r
and does not depend on B. To this end, we will bound each nrn Y
= EH gi (di )dQdD (380)
of the factors on the RHS of (369). To bound the argument Γr (n)M Ser,r ×Rr≥ i=1
of the exponential function, we apply the trace inequality [48, " r Z #
Th. 20.A.4] π r(r−1) nrn Y
≤ EH gi (xi )dxi (381)
Γr (r)Γr (n)M i=1 R+
r
X di
tr(B−1 QDQH ) ≥ (370) where (379) follows from (371) and (375); (380) follows
bi
i=1 from (376); (381) holds because the integrand does not depend
on Q, because Rr≥ ⊂ Rr+ and because the volume of Ser,r (with
for every unitary matrix Q, where b1 ≥ . . . ≥ br are the
respect to the Lebesgue measure on Ser,r ) is π r(r−1) /Γr (r). After
ordered eigenvalues of B. Using (370) in (369) and further upper-
algebraic manipulations, we obtain
bounding the terms (di − dj )2 in (369) with d2i , we obtain
Z [r−2i]+ +1 
r
( ) b0
nrn Y di n+r−2i 
di
 gi (xi )dxi = n+r−2i+1 Γ(n + r − 2i + 1, n + r − 2i)
qQD | B (Q, D | B) ≤ exp −n . R+ n
Γr (n) i=1 bin bi 
n+r−2i+1 −(n+r−2i)
(371) + (n + r − 2i) e . (382)

Since B = Ir + HH UH, we have that Substituting (382) into (381) and using (374), we obtain
crx (n)
1 ≤ bi ≤ 1 + tr HH UH

(372) 1 − 0 ≤ . (383)
M
2
≤ 1 + kHkF tr (U) (373) Note that the RHS of (383) is valid for every code.
2
=1+ kHkF ρ , b0 (374)
A PPENDIX VIII
where (373) follows from the Cauchy-Schwarz inequality and P ROOF OF THE C ONVERSE PART OF T HEOREM 9
(374) follows because U ∈ Ute . Using (374), we can upper-bound
each factor on the RHS of (371) as follows: In this appendix, we prove the converse asymptotic expansion
for Theorem 9. More precisely, we show the following:
din+r−2i

di
 Proposition 20: Let the pdf of the fading matrix H satisfy the
exp −n conditions in Theorem 9. Then
bni bi  

n + r − 2i
n+r−2i
∗ log n

b0
[r−2i]+ −(n+r−2i)
e , Rrx (n, ) ≤ Cno + O . (384)
n




 n
b0 (n+r−2i)

if di ≤ Proof: Proceeding as in (158)–(162), we obtain from The-

≤ gi (di ) ,  n+r−2i n (375)
 di [r−2i]+ −ndi /b0 orem 6 that

 b0 e ,
b


 0

(n − 1)Rrx (n − 1, )

b0 (n+r−2i)

if di > n .  
≤ nγ − log inf e P[Snrx (Q) ≤ nγ] −  + log crx (n) (385)
Q∈Ut
We are now ready to establish the desired converse result
for the auxiliary channel Q. Consider an arbitrary code for the where γ > 0 is arbitrary. The third term on the RHS of (385) is
auxiliary channel Q with encoding function f0 : {1, . . . , M } 7→ upper-bounded by
Ute . Let Dj (H) be the decoding set for the jth codeword f0 (j) b(r+1)2 /4c 
r2
 
2
in the eigenvalue decomposition coordinate such that log crx (n) ≤ log n + log E 1 + ρ kHkF
2
M
[ + O(1) (386)
Dj (H) = Ser,r × Rr≥ . (376) r 2

j=1
= log n + O(1). (387)
2
27

Here, (386) follows from algebraic manipulations, and (387) Lemma 21: Let H have pdf fH satisfying Conditions 1 and 2
follows from the assumption (81), which ensures that the second in Theorem 9. Let ϕγ,Q : Ct×r 7→ R be defined as in (389)
term on the RHS of (386) is finite. and let U (γ, Q) with pdf fU (γ,Q) denote the random variable
To evaluate P[Snrx (Q) ≤ nγ] on the RHS of (385), we ϕγ,Q (H). Then, there exist δ1 ∈ (0, Cno ) and δ > 0 such that
note that given H = H, the random variable Snrx (Q) is the u 7→ fU (γ,Q) (u) is continuously differentiable on (−δ, δ) and
sum of n i.i.d. random variables. Hence, using Theorem 15 that
(Appendix IV-A) and following similar steps as the ones reported
sup sup sup fU (γ,Q) (u) < ∞ (395)
in Appendix IV-A, we obtain γ∈(Cno −δ1 ,Cno +δ1 ) Q∈Ute u∈(−δ,δ)
  0
1 sup sup sup U (γ,Q) (u) < ∞.
rx
f (396)
P[Sn (Q) ≤ nγ | H = H] ≥ qn (ϕγ,Q (H)) + O (388) γ∈(Cno −δ1 ,Cno +δ1 ) Q∈Ute u∈(−δ,δ)
n
Proof: See Appendix VIII-A.
where the function ϕγ,Q : Ct×r 7→ R is given by
Using (392), (394), and Lemma 21 in (391), and then (391)
and (387) in (385), we obtain for every γ ∈ (Cno −δ1 , Cno +δ1 )

γ − log det Ir + HH QH
ϕγ,Q (H) , q  (389) that
tr Ir − (Ir + HH QH)−2

(n − 1)Rrx (n − 1, )
the function qn (·) was defined in (170), and the O(1/n) term is 
≤ nγ − log inf e P[log det Ir + HH QH ≤ γ] − 

uniform in Q, γ and H. Let Q∈Ut

U (γ, Q) , ϕγ,Q (H). (390) + O(1/n) + O(log n) (397)

Averaging (388) over H, we obtain = nγ − log Fno (γ) −  + O(1/n) + O(log n) (398)

P[Snrx (Q) ≤ nγ] where (398) follows from (27).


 √  We next set γ so that
≥ E Q(− nU (γ, Q))
" 2
# Fno (γ) −  + O(1/n) = 1/n. (399)
[1 − nU 2 (γ, Q)]+ e−nU (γ,Q)/2
 
1
−E √ +O . (391)
6 n n In words, we choose γ so that the argument of the logarithm
in (398) is equal to 1/n. Since the function (Q, R) 7→ FQ (R) is
We proceed to lower-bound the first two terms on the RHS continuous and Ute is compact, by the maximum theorem [49,
of (391). To this end, we show in Lemma 21 ahead that there Sec. VI.3] the function Fno (R) = inf Q∈Ute FQ (R) is continuous
exist δ1 ∈ (0, Cno ) and δ > 0 such that u 7→ fU (γ,Q) (u), in R. This guarantees that such a γ indeed exists. We next show
where fU (γ,Q) denotes the pdf of U (γ, Q), is continuously dif- that, for sufficiently large n, this γ satisfies
ferentiable on (−δ, δ), and that fU (γ,Q) (u) and fU0 (γ,Q) (u) are
uniformly bounded for every γ ∈ (Cno − δ1 , Cno + δ1 ), every |γ − Cno | ≤ O(1/n). (400)
e
Q ∈ Ut , and every u ∈ (−δ, δ). We then apply Lemma 17 in This implies that, for sufficiently large n, γ belongs to the interval
Appendix IV-C with A being a standard normal random variable (C no − δ , C no + δ ). We then obtain (384) by combining (398)
 1  1
and B = U (γ, Q) to lower-bound the first term on the RHS with (399) and (400), and dividing both sides of (398) by n − 1.
of (391) for every δ > 0 as To prove (400), we note that by (83) and the definition of

lim inf, there exists a δ2 ∈ (0, δ1 ) such that
 
E Q(− nU (γ, Q))
 1 2 Fno (γ) − Fno (Cno )
≥ P log det Ir + HH QH ≤ γ −
 
2 inf > 0. (401)
nδ γ∈(Cno −δ2 ,Cno +δ2 ) γ − Cno
1 1 1
  n o
sup max fU (γ,Q) (u), fU0 (γ,Q) (u) .

− + Substituting (401) into (399) and using that Fno (Cno ) = , we
n δ 2 u∈(−δ,δ)
obtain (400). This concludes the proof of Proposition 20.
(392)
We upper-bound the second term on the RHS of (391) for n > A. Proof of Lemma 21
δ −2 as Throughout this section, we shall use const to indicate a finite
1 − nU 2 (γ, Q) + e−nU 2 (γ,Q)/2
" #
constant that does not depend on any parameter of interest; its
E √ magnitude and sign may change at each occurrence. The proof
6 n
Z 1/√n of this lemma is technical and makes use of concepts from
1 2 Riemannian geometry.
≤ √ sup fU (γ,Q) (u) √ |
(1 − nt2 )e−nt /2 dt (393)
6 n u∈(−δ,δ) −1/ n {z } Denote by {Ml } the open subsets
≤1
1 Ml , {H ∈ Ct×r : kHkF < l} (402)
≤ sup fU (γ,Q) (u). (394)
3n u∈(−δ,δ) indexed by l ∈ N. We shall use the following flat Riemannian
The following lemma establishes that fU (γ,Q) and fU0 (γ,Q) are metric [50, pp. 13 and 165] on Ml
indeed uniformly bounded. hH1 , H2 i , Re tr HH
 
1 H2 . (403)
28

Using this metric, we define the gradient ∇g of an arbitrary To prove (407), we shall use that for an arbitrary δ > 0,
function g : Ml 7→ R as in (80). Note that the metric (403)
induces a norm on the tangent space of Ml , which can be f∗ (u + δ) − f∗ (u)
Z Z
identified with the Frobenius norm. dS dS
= f − f (411)
Our proof consists of two steps. Let fl (u) denote the pdf of the −1
ϕ (u+δ) k∇ϕk F −1
ϕ (u) k∇ϕk F
random variable U (γ, Q) conditioned on H ∈ Ml . We first show Z 
dS

no
that there exist l0 ∈ N, δ > 0, and δ1 ∈ (0, C ) such that fl (u) = d f (412)
0 no no ϕ−1 ((u,u+δ)) k∇ϕkF
and fl (u) are uniformly bounded for every γ ∈ (C −δ1 , C + Z
δ1 ), every Q ∈ Ute , every u ∈ [−δ, δ], and every l ≥ l0 . We then = ψ1 dV (413)
show that u 7→ fU (γ,Q) (u) is continuously differentiable on ϕ−1 ((u,u+δ))

(−δ, δ), and that for every u ∈ (−δ, δ), the sequences {fl (u)}
where in (412) we used Stoke’s theorem [51, Th. III.7.2], that f is
and {fl0 (u)} converge uniformly to fU (γ,Q) (u) and fU0 (γ,Q) (u), dS
compactly supported, and that the restriction of the form f k∇ϕk
respectively, i.e., F
to ϕ−1 ((u, u + δ)) is also compactly supported; (413) follows
from the definition of ψ1 (see (408)). Equation (407) follows
lim sup fl (u) − fU (γ,Q) (u) = 0 (404)
l→∞ u∈(−δ,δ) then from similar steps as in (409)–(410).
Using Lemma 22, we obtain
lim sup fl (u) − fU0 (γ,Q) (u) = 0
0
(405)

l→∞ u∈(−δ,δ) Z
fH dS
fl (u) = (414)
from which Lemma 21 follows.
−1
ϕγ,Q (u)∩Ml P[H ∈ M l ] k∇ϕ γ,Q kF

1) Uniform Boundness of {fl } and {fl0 }: To establish that


and
{fl } and {fl0 } are uniformly bounded, we shall need the follow- Z
ing lemma. ψ1 dS
fl0 (u) = (415)
Lemma 22: Let M be an oriented Riemannian manifold with −1
ϕγ,Q (u)∩Ml P[H ∈ M l ] k∇ϕ γ,Q kF
Riemannian metric (403) and let ϕ : M 7→ R be a smooth
function with k∇ϕkF 6= 0 on M. Let P be a random variable where ψ1 satisfies
on M with smooth pdf f . Then,  
dS
1) the pdf f∗ of ϕ(P ) at u is ψ1 dV = d fH . (416)
k∇ϕγ,Q kF
Z
dS
f∗ (u) = f (406) Since P[H ∈ Ml ] → 1 as l → ∞, there exists a l0 such that
ϕ−1 (u) k∇ϕkF P[H ∈ Ml ] ≥ 1/2 for every l ≥ l0 .
We next show that there exist δ > 0, 0 < δ1 < Cno , such that
−1
where ϕ (u) denotes the preimage {x ∈ M : ϕ(x) = u} for every γ ∈ (Cno − δ1 , Cno + δ1 ), every u ∈ (−δ, δ), every
and dS denotes the surface area form on ϕ−1 (u), chosen so Q ∈ Ute , every H ∈ ϕ−1 γ,Q (u) ∩ Ml , and every l ≥ l0
that dS(∇ϕ) > 0;
−2tr−3
2) if the pdf f is compactly supported, then the derivative of f∗ is fH (H) ≤ const · kHkF (417)
−2tr−3
Z |ψ1 (H)| ≤ const · kHkF (418)
dS
f∗0 (u) = ψ1 (407)
ϕ−1 (u) k∇ϕkF and
−2tr−3
where ψ1 is defined implicitly via kHkF
Z
dS
Al (u) , ≤ const. (419)
  −1
ϕξ,Q (u)∩Ml k∇ϕ k
γ,Q F
dS
ψ1 dV = d f (408)
k∇ϕkF The uniform boundedness of {fl } and {fl0 } follows then by using
the bounds (417)–(419) in (414) and (415).
with dV denoting the volume form on M and d(·) being
Proof of (417): Since fH (H) is continuous by assumption,
exterior differentiation [29, p. 256].
it is uniformly bounded for every H ∈ M1 . Hence, (417) holds
Proof: To prove (406), we note that for arbitrary a, b ∈ R for every H ∈ M1 . For H ∈ / M1 , we have by (81)
Z b Z −2tr−b(1+r)2 /2c−1 −2tr−3
fH (H) ≤ a kHkF ≤ a kHkF . (420)
f∗ (u)du = f dV (409)
a ϕ−1 ((a,b))
Z b Z ! This proves (417).
dS
= f du (410) Proof of (418): The surface area form dS on ϕ−1
γ,Q (u)∩Ml
a ϕ−1 (u) k∇ϕkF is given by

where (410) follows from the smooth coarea formula [51, p. 160]. ?dϕγ,Q
dS = (421)
This implies (406). k∇ϕγ,Q kF
29

where ? denotes the Hodge star operator [50, p. 103] induced by which implies that

the metric (403). Using (421) and the definition of the Hodge no
−δ1 − rδ)/r
λmax (HH QH) ≥ e(C − 1 > 0. (433)
star operator, the RHS of (416) becomes
! Combing (433) with the second inequality in (428), we obtain
fH fH no √
∧ ?dϕγ,Q + 2 ∧ d ? dϕξ,Q tr Ir − Φ−2 ≥ 1 − e−2(C −δ1 − rδ)/r .

d 2 (434)
k∇ϕγ,Q kF k∇ϕγ,Q kF
2 We use (429) and (434) to lower-bound the smallest eigenvalue
h∇fH , ∇ϕγ,Q i fH h∇ k∇ϕγ,Q kF , ∇ϕγ,Q i of the matrix T defined in (427) as
= 2 − 4
k∇ϕγ,Q kF k∇ϕγ,Q kF
! λmin (T) = tr(Ir − Φ−2 ) λmin (Φ2 ) +(γ − log det Φ) (435)
fH · ∆ϕγ,Q | {z }
− 2 dV (422) √
≥1
k∇ϕγ,Q kF ≥ tr(Ir − Φ −2
)−δ r (436)
no √ √
where ∧ denotes the wedge product [29, p. 237] and ∆ denotes ≥ 1 − e−2(C −δ1 − rδ)/r
− δ r. (437)
the Laplace operator [50, Eq. (3.1.6)].11 From (422) we get
The RHS of (437) can be made positive if we choose δ suffi-
h∇fH , ∇ϕγ,Q i fH h∇ k∇ϕγ,Q k2F , ∇ϕγ,Q i

ciently small, in which case T is invertible. We can theorefore
|ψ1 | = 2 − 4 lower-bound k∇ϕγ,Q kF as
k∇ϕγ,Q kF k∇ϕγ,Q kF

fH · ∆ϕγ,Q 2 −3

− (423) k∇ϕξ,Q kF = 3/2 QHΦ T F (438)
2
k∇ϕγ,Q kF tr Ir − Φ−2
2 1

2
≥ 3/2 QHΦ−3 F ·

fH ∇ k∇ϕγ,Q kF (439)

k∇fH kF F fH · |∆ϕγ,Q | r kT−1 kF
≤ + 3 + 2
k∇ϕγ,Q kF k∇ϕγ,Q kF k∇ϕγ,Q kF 2 1 1
≥ 3/2 kQHkF · · −1
. (440)
(424) r kΦ kF kT kF
3

where the last step follows from the triangle inequality and the Here, we use the first inequality in (428) and the submultiplica-
Cauchy-Schwarz inequality. tivity of the Frobenius norm. The term kQHkF can be bounded
We proceed to lower-bound k∇ϕγ,Q kF . Using the definition as
H
of the gradient (80) together with the matrix identities [52, p. 29] H QH
F
kQHkF ≥ (441)
det(I + εA) = 1 + εtr(A) + O(ε2 ), ε→0 (425) kHkF
(I + εA)−1 = I − εA + O(ε2 ), ε→0 (426) λmax (HH QH)
≥ (442)
kHkF
for every bounded square matrix A, we obtain no √
−δ1 − rδ)/r
e(C −1
2QHΦ−3 ≥ (443)
∇ϕγ,Q (H) = − kHkF
3/2
tr Ir − Φ−2 where (443) follows
from (433).
  The term Φ3 F in (440) can be upper-bounded as
× tr(Ir − Φ−2 )Φ2 + (γ − log det Φ)Ir (427) 3 √
| {z } Φ ≤ r(1 + λmax (HH QH))3 (444)
F
,T √
≤ r(1 + det Φ)3 (445)
where Φ , Ir + HH QH.
no ≤ const. (446)
√ an arbitrary δ1 ∈ (0, C ) and
Fix choose δ ∈ (0, (Cno −
−2 (446) follows from (429) and because γ ≤ Cno +δ. Finally,
δ1 )/ r). We first bound tr(Ir − Φ ) as Here,
−1
T in (440) can be bounded as
r ≥ tr Ir − Φ−2 ≥ 1 − (1 + λmax (HH QH))−2 . (428)

F

−1 √ r
It follows from the first inequality in (428) and from (389) that T ≤ rλmax (T−1 ) = . (447)
F λmin (T)
for every u ∈ (−δ, δ)
p √ The RHS of (447) is bounded because of (437). Substitut-
|γ − log det Φ| = |u| tr(Ir − Φ−2 ) ≤ δ r. (429) ing (443), (446) and (447) into (440), we conclude that
Using (429) and that the determinant is given by the product of k∇ϕγ,Q k−1 ≤ const · kHkF . (448)
the eigenvalues, we obtain that, for every γ ∈ (Cno − δ1 , Cno − Following similar steps as the ones reported in (425)–(448),
δ1 ) and every u ∈ (−δ, δ), we can show that
r log(1 + λmax (HH QH)) ≥ log det Φ

(430) 2
∇ k∇ϕγ,Q kF < const · k∇ϕγ,Q kF (449)

√ F
≥ γ − rδ (431)
√ and
≥ Cno − δ1 − rδ > 0 (432)
|∆ϕγ,Q | < const. (450)
11 The Laplace operator used here and in [50, Eq. (3.1.6)] differs from the
usual one on Rn , as defined in calculus, by a minus sign. See [50, Sec. 3.1] for Substituting (448)–(450) into (424) and using the bounds (81)
a more detailed discussion. and (82), we obtain (418).
30

Proof of (419): We begin by observing that for every H ∈ We can therefore upper-bound Al (u) as
ϕ−1
γ,Q (u) ∩ Ml , every γ ∈ (Cno − δ1 , Cno + δ1 ), every u ∈ Z
(−δ, δ) and every Q ∈ Ute Al (u) = Al (u0 ) + ψ2 dV (465)
ϕ−1
γ,Q ((u0 ,u))∩Ml
2 tr(HH QH) Z
kHkF ≥ (451) ≤ const +
−2tr−1
const · kHkF dV (466)
tr(Q)
M0
1 Z ∞
≥ λmax (HH QH) (452) ≤ const + const · x−2 dx (467)
ρ √
k0
1  (Cno −δ1 −√rδ)/r 
≥ e − 1 , k0 . (453) = const. (468)
ρ
Here, (465) follows from (462); (467) follows from (459), (464),
Here, (451) follows from Cauchy-Schwarz inequality; (452)
and (454). Note that the bound (468) is uniform in γ, Q, u, and l.
follows because tr(Q) = ρ for every Q ∈ Ute ; (453) follows
Following similar steps as the ones reported in (460)–(468), we
from (433). From (453) we conclude that
obtain the same result for u ∈ (−δ, u0 ). This proves (419).
2) Convergence of fl (u) and fl0 (u): In this section, we will
  p
ϕ−1
γ,Q ((−δ, δ)) ∩ M 0
l ⊂ M , {H ∈ C
t×r
: kHkF ≥ k0 }.
prove (404) and (405). By Lemma 22,
(454) Z
fH dS
To upper-bound Al (u), we note that fU (γ,Q) (u) = . (469)
−1
ϕγ,Q (u) k∇ϕ γ,Q kF
Z δ Z
Al (u)du = kHkF
−2tr−3
dV (455) We have the following chain of inequalities
−δ ϕ−1
γ,Q ((−δ,δ))∩Ml
Z |fl (u) − fU (γ,Q) (u)|
−2tr−3
≤ kHkF dV (456) ≤ |P[H ∈ Ml ]fl (u) − fU (γ,Q) (u)|
M0
Z ∞
+ |(1 − P[H ∈ Ml ])fl (u)| (470)
= const · √ x−4 dx (457) Z
fH dS
k0 ≤
= const. (458) −1
ϕγ,Q (u)∩(C t×r \Ml ) k∇ϕ γ,Q kF

+ const · (1 − P[H ∈ Ml ]) (471)


Here, (455) follows from the smooth coarea formula [51, −2tr−3
kHkF
Z
p. 160]; (456) follows from (454); (457) follows by writing the dS
≤ const ·
RHS of (456) in polar coordinates and by using that, by (433), −1
ϕγ,Q (u)∩(C t×r \Ml ) k∇ϕ k
γ,Q F
k0 > 0. By the mean value theorem, it follows from (458) that + const · (1 − P[H ∈ Ml ]). (472)
there exists a u0 ∈ (−δ, δ) satisfying
Rδ Here, (470) follows from the triangle inequality; (471) follows
Al (u)du from (414) and because {fl (u)} is uniformly bounded; (472)
Al (u0 ) = −δ ≤ const. (459) follows from (417). Following similar steps as the ones reported

in (455)–(468), we upper-bound the first term on the RHS
Next, for every u ∈ (u0 , δ) we have that of (472) as
−2tr−3
kHkF −2tr−3
Z
kHkF
Z
Al (u) − Al (u0 ) = dS dS const
k∇ϕγ,Q kF ≤ . (473)
ϕ−1
γ,Q (u)∩Ml
−1
ϕγ,Q (u)∩(Ct×r \Ml ) k∇ϕ γ,Q k F l
−2tr−3
kHkF
Z
− dS (460) Substituting (473) into (472), and using that P[H ∈ Ml ] → 1
ϕ−1
γ,Q (u0 )∩Ml
k∇ϕγ,Q kF
! as l → ∞, we obtain that
−2tr−3
kHkF
Z

= d dS (461) lim sup fl (u) − fU (γ,Q) (u) = 0. (474)
ϕ−1
γ,Q ((u0 ,u))∩Ml
k∇ϕγ,Q kF l→∞ u∈(−δ,δ)
Z
= ψ2 dV (462) This proves (404).
ϕ−1
γ,Q ((u0 ,u))∩Ml To prove (405), we proceed as follows. Let C 1 ([−δ, δ]) denote
the set of continuously differentiable functions on the compact
where ψ2 is defined implicitly via
interval [−δ, δ]. The space C 1 ([−δ, δ]) is a Banach space (i.e.,
a complete normed vector space) when equipped with the C 1
!
−2tr−3
kHkF
ψ2 dV = d dS . (463) norm [53, p. 92]
k∇ϕγ,Q kF
kf kC 1 ([−δ,δ]) = sup (|f (x)| + |f 0 (x)|). (475)
Here, (461) follows from Stokes’ theorem. Following similar x∈[−δ,δ]
steps as the ones reported in (421)–(450), we obtain that
Following similar steps as in (460)–(468), we obtain that {fl0 }
−2tr−1
|ψ2 | ≤ const · kHkF . (464) is continuous on [−δ, δ], i.e., the restriction of {fl } to [−δ, δ]
31

belongs to C 1 ([−δ, δ]). Moreover, following similar steps as eigenvalues of HH Q∗ H, and 0a,b denotes the all zero matrix of
in (470)–(474), we obtain that for all m > l > 0 size a × b. Conditioned on H = H, we have

sin2 {In,t∗ , nIn,t∗ UH + W}
 
0
lim sup |fm (u) − fl (u)| + |fm (u) − fl0 (u)| = 0. (476) √
l→∞ u∈[−δ,δ]
= sin2 In,t∗ L, ( nIn,t∗ UH + W)V

(483)

This means that {fl } restricted to [−δ, δ] is a Cauchy sequence, 2 e

= sin LIn,t∗ L, L( nIn,t∗ UH + W)V
e (484)
and, hence, converges in C 1 ([−δ, δ]) with respect to the C 1 2
 √
= sin In,t∗ , nIn,t∗ Σ + W (485)
norm (475). Moreover, by (474) the limit of {fl } is fU (γ,Q) .
Therefore, we have fU (γ,Q) ∈ C 1 ([−δ, δ]), and {fl0 } converges where
to fU0 (γ,Q) with respect to the sup-norm k·k∞ . This proves (405). LH
 
0(n−t∗ )×t∗
L,
e (486)
0t∗ ×(n−t∗ ) In−t∗
A PPENDIX IX is unitary. Here, (483) follows because span(A) = span(AB)
P ROOF OF THE ACHIEVABILITY PART OF T HEOREM 9 for every invertable matrix B; (484) follows because the principal
We prove the achievability asymptotic expansion for Theo- angles between two subspaces are invariant under simultaneous
rem 9. More precisely, we prove the following: rotation of the two subspaces; (485) follows because W is
Proposition 23: Assume that there exists a Q∗ ∈ Ut satisfy- LWV has the same
isotropically distributed, which implies that e
ing (64). Let FQ∗ (·) be as in (87). Assume that the joint pdf of the distribution as W.
nonzero eigenvalues of HH Q∗ H is continuously differentiable Let ej and Wj be the jth column of In,t∗ and W, respectively.
and that FQ∗ (·) is differentiable and strictly increasing at Cno , Then
i.e., √
P sin2 In,t∗ , nIn,t∗ Σ + W ≤ γn
  

FQ0 ∗ (Cno ) > 0.


 ∗ 
(477) m
Y
sin2 ej , nΛj ej + Wj ≤ γn  (487)
 p
≥ P
Let t∗ = rank(Q∗ ). Then, j=1
   
∗ log n 1 ∗
∗ no m
Rno (n, ) ≥ C − (1 + rt ) +O . (478) Y
sin2 e1 , nΛj e1 + Wj ≤ γn  . (488)
 p
n n = P
j=1
Note that the conditions on the distribution of the fading
matrix H under which Proposition 23 holds are less stringent Here, (487) follows from Lemma 13 (Appendix I) and (488)
than (and, because of Proposition 5 on p. 7 and Lemma 21 on follows by symmetry. By repeating the same steps as in (263)–
p. 27, implied by) the conditions under which Proposition 20 (279), we obtain from (488) that there exists a γn = exp(−Cno +
(converse part of Theorem 9) holds. O(1/n)) that satisfies (481).
Proof: The proof follows closely the proof of the achiev-
ability part of Theorem 3. Following similar steps as the ones A PPENDIX X
reported in (253)–(259), we obtain P ROOF OF T HEOREM 11 (D ISPERSION OF C ODES WITH
" r # I SOTROPIC C ODEWORDS )
∗ ∗
Using Proposition 23 with Q∗ replaced by (ρ/t)It , we obtain
Y
P Bi ≤ γn ≤ nrt γnn−t −r . (479)
 
i=1 ∗ log n
Rno,iso (n, ) ≥ Ciso + O . (489)
Setting τ = 1/n and γn = exp(−Cno +O(1/n)) in Theorem 4, n
and using (479), we obtain ∗
Since Rno,iso ∗
(n, ) ≤ Rrx,iso (n, ), the proof is completed by
log M
  showing that
no ∗ log n 1
≥ C − (1 + rt ) +O . (480)  
n n n ∗ iso log n
Rrx,iso (n, ) ≤ C + O . (490)
n
To conclude the proof, we show that there exists indeed a
γn = exp(−Cno + O(1/n)) satisfying To prove (490), we evaluate the converse bound (78) in the
√ large-n limit. This evaluation follows closely the proof of (56)
P sin2 {In,t∗ , nIn,t∗ UH + W} ≤ γn ≥ 1 −  + 1/n (481)
 
in Appendix IV. Let Λ1 ≥ · · · ≥ Λm be the ordered nonzero
∗ eigenvalues of HH H. Following similar steps as in (158)–(162),
where U ∈ Ct ×t satisfies UH U = Q∗ . Hereafter, we restrict
no no we obtain that for every γ > 0
ourselves to γn ∈ e−C −δ , e−C +δ for some δ ∈ (0, Cno ).

Let m∗ , min{t∗ , r}. Consider the SVD of UH Rrx,iso (n, )
 

Σm ∗ 0m∗ ×(r−m∗ )
 1   1
UH = L VH (482) ≤ γ − log P[Snrx ((ρ/t)It ) ≤ nγ] −  + O (491)
0(t∗ −m∗ )×m∗ 0(t∗ −m∗ )×(r−m∗ ) n n
with Snrx (·) defined in (70). To evaluate the second term on the
| {z }

RHS of (491), we proceed as in Appendix IV-A to obtain
∗ ∗
where√L ∈ Ct ×t√ and V ∈ Cr×r are unitary matrices, Σm∗ =  k1
diag{ λ1 , . . . , λm∗ } with λ1 , . . . , λm∗ being the m∗ largest P[Snrx ((ρ/t)It ) ≤ nγ | Λ = λ] ≥ qn ũγ (λ) + (492)
n
32

for γ in a certain neighborhood of Ciso . Here, the function qn (·) [13] C. Potter, K. Kosbar, and A. Panagos, “On achievable rates for MIMO
is given in (170); the function ũγ (·) : Rm
+ 7→ R is defined as
systems with imperfect channel state information in the finite length
regime,” IEEE Trans. Commun., vol. 61, no. 7, pp. 2772–2781, Jul. 2013.
Pm [14] W. Yang, G. Caire, G. Durisi, and Y. Polyanskiy, “Finite blocklength
γ − j=1 log(1 + ρλj /t)
ũγ (λ) , q ; (493) channel coding rate under a long-term power constraint,” in Proc. IEEE
Pm Int. Symp. Inf. Theory (ISIT), Honolulu, HI, USA, Jul. 2014.
m − j=1 (1 + ρλj /t)−2 [15] G. Caire, G. Taricco, and E. Biglieri, “Optimum power control over fading
channels,” IEEE Trans. Inf. Theory, vol. 45, no. 5, pp. 1468–1489, May
Λ = [Λ1 , . . . , Λm ]; and k1 is a finite constant independent of γ 1999.
and λ. A lower bound on P[Snrx ((ρ/t)It ) ≤ nγ] follows then by [16] V. V. Petrov, Sums of Independent Random Variates. Springer-Verlag,
1975, translated from the Russian by A. A. Brown.
averaging both sides of (492) with respect to Λ [17] İ. E. Telatar, “Capacity of multi-antenna Gaussian channels,” Eur. Trans.
 k1 Telecommun., vol. 10, pp. 585–595, Nov. 1999.
P[Snrx ((ρ/t)It ) ≤ nγ] ≥ E qn ũγ (Λ) + . (494)

[18] A. M. Tulino and S. Verdú, “Random matrix theory and wireless communi-
n cations,” in Foundations and Trends in Communications and Information
Theory. Delft, The Netherlands: now Publishers, 2004, vol. 1, no. 1, pp.
Proceeding as in (199)–(206) and using the assumption that the 1–182.
joint pdf of Λ1 , . . . , Λm is continuously differentiable, we obtain [19] N. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distri-
that for all γ ∈ (Ciso − δ, Ciso + δ) butions, 2nd ed. New York: Wiley, 1995, vol. 2.
  [20] E. Abbe, S.-L. Huang, and İ. E. Telatar, “Proof of the outage probability
m conjecture for MISO channels,” IEEE Trans. Inf. Theory, vol. 59, no. 5,
  X k2 pp. 2596–2602, May 2013.
E qn ũγ (Λ) ≥ P log(1 + ρΛj /t) ≤ γ  + (495) [21] C. E. Shannon, “Probability of error for optimal codes in a Gaussian
j=1
n
channel,” Bell Syst. Tech. J., vol. 38, no. 3, pp. 611–656, May 1959.
[22] A. Barg and D. Y. Nogin, “Bounds on packings of spheres in the Grassmann
for some δ > 0 and k2 > −∞. Substituting (495) into (494), manifold,” IEEE Trans. Inf. Theory, vol. 48, no. 9, pp. 2450–2454, Sep.
we see that for every γ ∈ (Ciso − δ, Ciso + δ) 2002.
[23] Y. Polyanskiy, “Channel coding: non-asymptotic fundamental limits,” Ph.D.
P[Snrx ((ρ/t)It ) ≤ nγ] dissertation, Princeton University, 2010.
  [24] Y. Polyanskiy, H. V. Poor, and S. Verdú, “Dispersion of the Gilbert-Elliott
m channel,” IEEE Trans. Inf. Theory, vol. 57, no. 4, pp. 1829–1848, Apr.
X k1 + k2
≥ P log(1 + ρΛj /t) ≤ γ  + (496) 2011.
j=1
n [25] M. Tomamichel and V. Y. F. Tan, “-capacities and second-order coding
rates for channels with general state,” IEEE Trans. Inf. Theory, May 2013,
k1 + k2 submitted. [Online]. Available: http://arxiv.org/abs/1305.6789
= Fiso (γ) + . (497) [26] W. Yang, G. Durisi, T. Koch, and Y. Polyanskiy, “Quasi-static SIMO fading
n channels at finite blocklength,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT
The proof of (490) is concluded by repeating the same steps as 2013), Istanbul, Turkey, Jul. 2013.
in (164)–(165). [27] E. MolavianJazi and J. N. Laneman, “On the second-order coding rate of
block fading channels,” in Proc. Allerton Conf. Commun., Contr., Comput.,
Monticello, IL, USA, Oct. 2013, to appear.
[28] A. T. James, “Distribution of matrix variates and latent roots derived from
R EFERENCES normal samples,” Ann. Math. Statist., vol. 35, pp. 475–501, 1964.
[1] E. Biglieri, J. Proakis, and S. Shamai (Shitz), “Fading channels: [29] J. R. Munkres, Analysis on Manifolds. Redwood City, CA: Addison-
Information-theoretic and communications aspects,” IEEE Trans. Inf. Wesley, 1991.
Theory, vol. 44, no. 6, pp. 2619–2692, Oct. 1998. [30] V. Y. F. Tan and M. Tomamichel, “The third-order term in the normal
[2] D. N. C. Tse and P. Viswanath, Fundamentals of Wireless Communication. approximation for the AWGN channel,” IEEE Trans. Inf. Theory, Dec.
Cambridge, U.K.: Cambridge Univ. Press, 2005. 2013, submitted. [Online]. Available: http://arxiv.org/abs/1311.2337
[3] G. J. Foschini and M. J. Gans, “On limits of wireless communications [31] 3GPP TS 36.212, “Technical specification group radio access network;
in a fading environment when using multiple antennas,” Wirel. Personal evolved universal terrestrial radio access (E-UTRA); multiplexing and
Commun., vol. 6, pp. 311–335, 1998. channel coding (release 10),” Dec. 2012.
[4] M. Effros and A. Goldsmith, “Capacity definitions and coding strategies [32] P. Robertson, E. Villebrun, and P. Hoeher, “A comparison of optimal and
for general channels with receiver side information,” in Proc. IEEE Int. sub-optimal MAP decoding algorithms operating in the log domain,” in
Symp. Inf. Theory (ISIT), Cambridge MA, Aug. 1998, p. p. 39. Proc. IEEE Int. Conf. Commun. (ICC), Seattle, USA, Jun. 1995, pp. 1009–
[5] T. S. Han, Information-Spectrum Methods in Information Theory. Berlin, 1013.
Germany: Springer-Verlag, 2003. [33] S. Sesia, I. Toufik, and M. Baker, Eds., LTE–The UMTS Long Term
[6] M. Effros, A. Goldsmith, and Y. Liang, “Generalizing capacity: New Evolution: From Theory to Practice, 2nd ed. UK: Wiley, 2011.
definitions and capacity theorems for composite channels,” IEEE Trans. [34] J. Miao and A. Ben-Israel, “On principal angles between subspaces in Rn ,”
Inf. Theory, vol. 56, no. 7, pp. 3069–3087, Jul. 2010. Linear Algebra Appl., vol. 171, pp. 81–98, 1992.
[7] S. Verdú and T. S. Han, “A general formula for channel capacity,” IEEE [35] S. Afriat, “Orthogonal and oblique projectors and the characteristics of
Trans. Inf. Theory, vol. 40, no. 4, pp. 1147–1157, Jul. 1994. pairs of vector spaces,” Proc. Cambridge Phil. Soc., vol. 53, no. 4, pp.
[8] L. H. Ozarow, S. S. (Shitz), and A. D. Wyner, “Information theoretic 800–816, 1957.
considerations for cellular mobile radio,” IEEE Trans. Inf. Theory, vol. 43, [36] P. Absil, P. Koev, and A. Edelman, “On the largest principal angle between
no. 2, pp. 359–378, May 1994. random subspaces,” Linear Algebra Appl., vol. 414, no. 1, pp. 288–294,
[9] Y. Polyanskiy, H. V. Poor, and S. Verdú, “Channel coding rate in the finite Apr. 2006.
blocklength regime,” IEEE Trans. Inf. Theory, vol. 56, no. 5, pp. 2307– [37] J. C. Roh and B. D. Rao, “Design and analysis of MIMO spatial multi-
2359, May 2010. plexing systems with quantized feedback,” IEEE Trans. Signal Process.,
[10] Y. Polyanskiy and S. Verdú, “Scalar coherent fading channel: dispersion vol. 54, no. 8, pp. 2874–2886, Aug. 2006.
analysis,” in Proc. IEEE Int. Symp. Inf. Theory (ISIT), Saint Petersburg, [38] J. Neyman and E. S. Pearson, “On the problem of the most efficient tests
Russia, Aug. 2011, pp. 2959–2963. of statistical hypotheses,” Philosophical Trans. Royal Soc. A, vol. 231, pp.
[11] W. Yang, G. Durisi, T. Koch, and Y. Polyanskiy, “Diversity versus channel 289–337, 1933.
knowledge at finite block-length,” in Proc. IEEE Inf. Theory Workshop [39] M. Abramowitz and I. A. Stegun, Eds., Handbook of Mathematical Func-
(ITW), Lausanne, Switzerland, Sep. 2012, pp. 577–581. tions with Formulas, Graphs, and Mathematical Tables, 10th ed. New
[12] J. Hoydis, R. Couillet, and P. Piantanida, “The second-order coding rate York: Dover: Government Printing Office, 1972.
of the MIMO Rayleigh block-fading channel,” IEEE Trans. Inf. Theory, [40] W. Rudin, Principles of Mathematical Analysis, 3rd ed. Singapore:
2013, submitted. [Online]. Available: http://arxiv.org/abs/1303.3400 McGraw-Hill, 1976.
33

[41] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Giuseppe Durisi (S’02, M’06, SM’12) received the Laurea degree summa cum
Cambridge Univ. Press, 2004. laude and the Doctor degree both from Politecnico di Torino, Italy, in 2001 and
[42] N. Johnson, S. Kotz, and N. Balakrishnan, Continuous Univariate Distri- 2006, respectively. From 2002 to 2006, he was with Istituto Superiore Mario
butions, 2nd ed. New York: Wiley, 1995, vol. 1. Boella, Torino, Italy. From 2006 to 2010 he was a postdoctoral researcher at
[43] F. Qi and Q.-M. Luo, “Bounds for the ratio of two gamma functions—from ETH Zurich, Switzerland. Since 2010 he has been with Chalmers University of
Wendel’s and related inequalities to logarithmically completely monotonic Technology, Gothenburg, Sweden, where is now associate professor. He held
functions,” Banach J. Math. Anal., vol. 6, no. 2, pp. 132–158, 2012. visiting researcher positions at IMST, Germany, University of Pisa, Italy, ETH
[44] G. Grimmett and D. Stirzaker, Probability and Random Processes, 3rd ed. Zurich, Switzerland, and Vienna University of Technology, Austria.
New York, USA: Oxford University Press, 2001. Dr. Durisi is a senior member of the IEEE. He is the recipient of the 2013
[45] Y. Polyanskiy, “Saddle point in the minimax converse for channel coding,” IEEE ComSoc Best Young Researcher Award for the Europe, Middle East, and
IEEE Trans. Inf. Theory, vol. 59, no. 5, pp. 2576–2595, May 2013. Africa Region, and is co-author of a paper that won a ”student paper award” at
[46] R. A. Horn and C. R. Johnson, Matrix Analysis. Cambridge, U.K.: the 2012 International Symposium on Information Theory, and of a paper that
Cambridge Univ. Press, 1985. won the 2013 IEEE Sweden VT-COM-IT joint chapter best student conference
[47] A. Edelman, “Eigenvalues and condition numbers of random matrices,” paper award. He served as TPC member in several IEEE conferences, and is
Ph.D. dissertation, MIT, May 1989. currently publications editor of the IEEE Transactions on Information Theory.
[48] A. W. Marshall and I. Olkin, Inequalities: Theory of Majorization and its His research interests are in the areas of communication and information theory.
Application. Orlando, FL: Academic Press, Inc., 1979.
[49] C. Berge, Topological Spaces. Edinburg, UK: Oliver and Boyd, 1963.
[50] J. Jost, Riemannian Geometry and Geometric Analysis, 6th ed. Berlin,
Germany: Springer, 2011.
[51] I. Chavel, Riemannian Geometry: A Modern Introduction. Cambridge, Tobias Koch (S’02, M’09) is a Visiting Professor with the Signal Theory and
UK: Cambridge Univ. Press, 2006. Communications Department of Universidad Carlos III de Madrid (UC3M),
[52] H. Lütkepohl, Handbook of Matrices. Chichester, England: John Wiley Spain. He received the M.Sc. degree in electrical engineering (with distinction)
& Sons, 1996. in 2004 and the Ph.D. degree in electrical engineering in 2009, both from ETH
[53] J. K. Hunter and B. Nachtergaele, Applied Analysis. Singapore: World Zurich, Switzerland. From June 2010 until May 2012 he was a Marie Curie
Scientific Publishing Co., 2001. Intra-European Research Fellow with the University of Cambridge, UK. He was
also a research intern at Bell Labs, Murray Hill, NJ in 2004 and at Universitat
Pompeu Fabra (UPF), Spain, in 2007. He joined UC3M in 2012. His research
interests include digital communication theory and information theory.
Dr. Koch is serving as Vice Chair of the Spain Chapter of the IEEE Information
Theory Society in 2013–2014.

Wei Yang (S’09) received the B.E. degree in communication engineering and Yury Polyanskiy (S’08-M’10) is an Assistant Professor of Electrical Engineer-
M.E. degree in communication and information systems from the Beijing ing and Computer Science at MIT. He received the M.S. degree in applied
University of Posts and Telecommunications, Beijing, China, in 2008 and 2011, mathematics and physics from the Moscow Institute of Physics and Technology,
respectively. He is currently pursuing a Ph.D. degree in electrical engineering at Moscow, Russia in 2005 and the Ph.D. degree in electrical engineering from
Chalmers University of Technology, Gothenburg, Sweden. From July to August Princeton University, Princeton, NJ in 2010. In 2000-2005 he lead the develop-
2012, he was a visiting student at the Laboratory for Information and Decision ment of the embedded software in the Department of Surface Oilfield Equipment,
Systems, Massachusetts Institute of Technology, Cambridge, MA. Borets Company LLC (Moscow). Currently, his research focuses on basic
Mr. Yang is the recipient of a Student Paper Award at the 2012 IEEE questions in information theory, error-correcting codes, wireless communication
International Symposium on Information Theory (ISIT), Cambridge, MA, and and fault-tolerant circuits. Over the years Dr. Polyanskiy won the 2013 NSF
the 2013 IEEE Sweden VT-COM-IT joint chapter best student conference paper CAREER award, the 2011 IEEE Information Theory Society Paper Award and
award. His research interests are in the areas of information and communication Best Student Paper Awards at the 2008 and 2010 IEEE International Symposia
theory. on Information Theory (ISIT).

You might also like