
Elements of Network Information Theory

Abbas El Gamal (Stanford University) and Young-Han Kim (UC San Diego)
Tutorial, ISIT 2011

Slides available at http://isl.stanford.edu/abbas

Introduction

Networked Information Processing System

Communication network

System: Internet, peer-to-peer network, sensor network, ...
Sources: Data, speech, music, images, video, sensor data
Nodes: Handsets, base stations, processors, servers, sensor nodes, ...
Network: Wired, wireless, or a hybrid of the two
Task: Communicate the sources, or compute/make decisions based on them

Introduction

Network Information Flow Questions

Communication network

What is the limit on the amount of communication needed?
What are the coding schemes/techniques that achieve this limit?
Shannon (1948): Noisy point-to-point communication
Ford–Fulkerson, Elias–Feinstein–Shannon (1956): Graphical unicast networks

Introduction

Network Information Theory


Simplistic model of network as graph with point-to-point links and forwarding nodes does not capture many important aspects of real-world networks:

Networked systems have multiple sources and destinations
The network task is often to compute a function or to make a decision
Many networks allow for feedback and interactive communication
The wireless medium is a shared broadcast medium
Network security is often a primary concern
Source–channel separation does not hold for networks
Data arrival and network topology evolve dynamically

Network information theory aims to answer the information flow questions while capturing some of these aspects of real-world networks


Introduction

Brief History
First paper: Shannon (1961), "Two-way communication channels"

He didn't find the optimal rates (capacity region); the problem remains open

Significant research activity in the '70s and early '80s with many new results and techniques, but

Many basic problems remained open
Little interest from information and communication theorists

Wireless communications and the Internet revived interest in the mid '90s


Some progress on old open problems and many new models and problems
Coding techniques, such as successive cancellation, superposition, Slepian–Wolf, Wyner–Ziv, successive refinement, writing on dirty paper, and network coding, are beginning to impact real-world networks


Introduction

Network Information Theory Book


The book provides comprehensive coverage of key results, techniques, and open problems in network information theory
The organization balances the introduction of new techniques and new models
The focus is on discrete memoryless and Gaussian network models
We discuss extensions (if any) to many users and large networks
The proofs use elementary tools and techniques
We use clean and unified notation and terminology


Introduction

Book Organization
Part I. Preliminaries (Chapters 2, 3): Review of basic information measures, typicality, Shannon's theorems; introduction of key lemmas
Part II. Single-hop networks (Chapters 4 to 14): Networks with single-round, one-way communication

Independent messages over noisy channels
Correlated (uncompressed) sources over noiseless links
Correlated sources over noisy channels

Part III. Multihop networks (Chapters 15 to 20): Networks with relaying and multiple communication rounds

Independent messages over graphical networks
Independent messages over general networks
Correlated sources over graphical networks

Part IV. Extensions (Chapters 21 to 24): Extensions to distributed computing, secrecy, wireless fading channels, and information theory and networking

Introduction

Tutorial Objectives
Focus on an elementary and unified approach to coding schemes

Typicality and simple universal lemmas for DM models

Lossless source coding as a corollary of lossy source coding

Extending achievability proofs from DM to Gaussian models

Illustrate the approach through proofs of several classical coding theorems


Introduction

Outline
1. Typical Sequences
2. Point-to-Point Communication
3. Multiple Access Channel
4. Broadcast Channel
5. Lossy Source Coding
6. Wyner–Ziv Coding
7. Gelfand–Pinsker Coding
8. Wiretap Channel
9. Relay Channel
10. Multicast Network

10-minute break

Typical Sequences
Empirical pmf (or type) of x^n ∈ X^n:

  π(x | x^n) = |{i : x_i = x}| / n   for x ∈ X

Typical set (Orlitsky–Roche 2001): For X ~ p(x) and ε > 0,

  T_ε^(n)(X) = {x^n : |π(x | x^n) − p(x)| ≤ ε·p(x) for all x ∈ X} = T_ε^(n)

Typical Average Lemma

Let x^n ∈ T_ε^(n)(X) and g(x) ≥ 0. Then

  (1 − ε) E(g(X)) ≤ (1/n) Σ_{i=1}^n g(x_i) ≤ (1 + ε) E(g(X))
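A quick numerical sanity check may help here. The sketch below is our own illustration (not from the tutorial): it draws an i.i.d. sequence, tests membership in T_ε^(n)(X), and verifies the typical average lemma for an arbitrary nonnegative g; all parameter values are illustrative.

```python
import numpy as np

# Illustrative check (not from the slides) of the typical-set definition
# and the typical average lemma for an i.i.d. source on {0, 1, 2}.
rng = np.random.default_rng(0)
p = np.array([0.5, 0.3, 0.2])           # source pmf p(x)
n, eps = 10_000, 0.1
x = rng.choice(3, size=n, p=p)          # x^n drawn i.i.d. ~ p

emp = np.bincount(x, minlength=3) / n   # empirical pmf pi(x | x^n)
in_typical = np.all(np.abs(emp - p) <= eps * p)

g = np.array([1.0, 4.0, 9.0])           # any nonnegative g(x)
sample_avg = g[x].mean()                # (1/n) sum g(x_i)
mean = g @ p                            # E g(X)
print("x^n typical:", in_typical)
print("lemma holds:", (1 - eps) * mean <= sample_avg <= (1 + eps) * mean)
```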


Typical Sequences

Properties of Typical Sequences


Let x^n ∈ T_ε^(n)(X) and p(x^n) = ∏_{i=1}^n p_X(x_i). Then

  2^{−n(H(X)+δ(ε))} ≤ p(x^n) ≤ 2^{−n(H(X)−δ(ε))},

where δ(ε) → 0 as ε → 0 (notation: p(x^n) ≐ 2^{−nH(X)})

|T_ε^(n)(X)| ≐ 2^{nH(X)} for n sufficiently large

Let X^n ~ ∏_{i=1}^n p_X(x_i). Then by the LLN, lim_{n→∞} P{X^n ∈ T_ε^(n)} = 1

[Figure: the typical set T_ε^(n)(X) inside X^n, with P{X^n ∈ T_ε^(n)} ≈ 1, |T_ε^(n)| ≐ 2^{nH(X)}, and p(x^n) ≐ 2^{−nH(X)} for each typical x^n]


Typical Sequences

Jointly Typical Sequences


Joint type of (x^n, y^n) ∈ X^n × Y^n:

  π(x, y | x^n, y^n) = |{i : (x_i, y_i) = (x, y)}| / n   for (x, y) ∈ X × Y

Jointly typical set: For (X, Y) ~ p(x, y) and ε > 0,

  T_ε^(n)(X, Y) = {(x^n, y^n) : |π(x, y | x^n, y^n) − p(x, y)| ≤ ε·p(x, y) for all (x, y)} = T_ε^(n)((X, Y))

Let (x^n, y^n) ∈ T_ε^(n)(X, Y) and p(x^n, y^n) = ∏_{i=1}^n p_{X,Y}(x_i, y_i). Then

  x^n ∈ T_ε^(n)(X) and y^n ∈ T_ε^(n)(Y)
  p(x^n) ≐ 2^{−nH(X)}, p(y^n) ≐ 2^{−nH(Y)}, and p(x^n, y^n) ≐ 2^{−nH(X,Y)}
  p(x^n | y^n) ≐ 2^{−nH(X|Y)} and p(y^n | x^n) ≐ 2^{−nH(Y|X)}


Typical Sequences

Conditionally Typical Sequences


Conditionally typical set: For x^n ∈ X^n,

  T_ε^(n)(Y | x^n) = {y^n : (x^n, y^n) ∈ T_ε^(n)(X, Y)};   |T_ε^(n)(Y | x^n)| ≤ 2^{n(H(Y|X)+δ(ε))}

Conditional Typicality Lemma

Let (X, Y) ~ p(x, y). If x^n ∈ T_{ε′}^(n)(X) and Y^n ~ ∏_{i=1}^n p_{Y|X}(y_i | x_i), then for ε > ε′,

  lim_{n→∞} P{(x^n, Y^n) ∈ T_ε^(n)(X, Y)} = 1

If x^n ∈ T_{ε′}^(n)(X) and ε > ε′, then for n sufficiently large,

  |T_ε^(n)(Y | x^n)| ≥ 2^{n(H(Y|X)−δ(ε))}

Let X ~ p(x), Y = g(X), and x^n ∈ T_ε^(n)(X). Then y^n ∈ T_ε^(n)(Y | x^n) iff y_i = g(x_i), i ∈ [1 : n]

Typical Sequences

Illustration of Joint Typicality


[Figure: T_ε^(n)(X) ⊆ X^n with |T_ε^(n)(X)| ≐ 2^{nH(X)}; T_ε^(n)(Y) ⊆ Y^n with |T_ε^(n)(Y)| ≐ 2^{nH(Y)}; the jointly typical set T_ε^(n)(X, Y) with |T_ε^(n)(X, Y)| ≐ 2^{nH(X,Y)}; slices give T_ε^(n)(Y | x^n) and T_ε^(n)(X | y^n)]

Typical Sequences

Another Illustration of Joint Typicality

[Figure: the spaces X^n and Y^n with typical sets T_ε^(n)(X) and T_ε^(n)(Y); for a fixed x^n, the conditionally typical set T_ε^(n)(Y | x^n) has size |T_ε^(n)(Y | x^n)| ≐ 2^{nH(Y|X)}]


Typical Sequences

Joint Typicality for Random Triples


Let (X, Y, Z) ~ p(x, y, z). The set of typical sequences is T_ε^(n)(X, Y, Z) = T_ε^(n)((X, Y, Z))

Joint Typicality Lemma

Let (X, Y, Z) ~ p(x, y, z) and ε′ < ε. Then for some δ(ε) → 0 as ε → 0:

  If (x̃^n, ỹ^n) is arbitrary and Z̃^n ~ ∏_{i=1}^n p_{Z|X}(z̃_i | x̃_i), then
    P{(x̃^n, ỹ^n, Z̃^n) ∈ T_ε^(n)(X, Y, Z)} ≤ 2^{−n(I(Y;Z|X)−δ(ε))}
  If (x̃^n, ỹ^n) ∈ T_{ε′}^(n) and Z̃^n ~ ∏_{i=1}^n p_{Z|X}(z̃_i | x̃_i), then for n sufficiently large,
    P{(x̃^n, ỹ^n, Z̃^n) ∈ T_ε^(n)(X, Y, Z)} ≥ 2^{−n(I(Y;Z|X)+δ(ε))}
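The exponent in the lemma can be seen numerically. The Monte Carlo sketch below is our own experiment with illustrative parameters: it takes X to be constant, fixes a balanced y^n, draws Z̃^n independently of it, and compares the empirical probability of joint typicality with 2^{−nI(Y;Z)}. Agreement is only to within the 2^{±nδ(ε)} slack in the lemma, so expect the same order of magnitude rather than a tight match.

```python
import numpy as np

# Monte Carlo illustration (ours) of the joint typicality lemma with X = const:
# an independently drawn Z^n is jointly typical with a fixed y^n with
# probability on the order of 2^{-n I(Y;Z)}.
rng = np.random.default_rng(1)
pyz = np.array([[0.45, 0.05], [0.05, 0.45]])   # joint pmf of (Y, Z)
py, pz = pyz.sum(1), pyz.sum(0)                # both Bern(1/2)
I = sum(pyz[y, z] * np.log2(pyz[y, z] / (py[y] * pz[z]))
        for y in range(2) for z in range(2))   # I(Y;Z) ~ 0.531 bits

n, eps, trials = 20, 0.3, 100_000
yn = np.array([0, 1] * (n // 2))               # a typical y^n
zn = rng.choice(2, size=(trials, n), p=pz)     # Z^n independent of y^n
cell = 2 * yn[None, :] + zn                    # joint symbol index 0..3
counts = np.stack([(cell == k).mean(1) for k in range(4)], axis=1)
ok = np.all(np.abs(counts - pyz.ravel()) <= eps * pyz.ravel(), axis=1)
print("empirical:", ok.mean(), "  2^{-nI}:", 2 ** (-n * I))
```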


Typical Sequences

Summary
Typical average lemma
Conditional typicality lemma
Joint typicality lemma

Point-to-Point Communication

Discrete Memoryless Channel (DMC)


Point-to-point communication system
[Block diagram: M → Encoder → X^n → p(y|x) → Y^n → Decoder → M̂]

Assume a discrete memoryless channel (DMC) model (X, p(y|x), Y)

  Discrete: finite alphabets
  Memoryless: when used over n transmissions with message M and input X^n,
    p(y_i | x^i, y^{i−1}, m) = p_{Y|X}(y_i | x_i)
  When used without feedback, p(y^n | x^n, m) = ∏_{i=1}^n p_{Y|X}(y_i | x_i)

A (2^{nR}, n) code for the DMC:

  Message set [1 : 2^{nR}] = {1, 2, . . . , 2^{nR}}
  Encoder: a codeword x^n(m) for each m ∈ [1 : 2^{nR}]
  Decoder: an estimate m̂(y^n) ∈ [1 : 2^{nR}] ∪ {e} for each y^n

Point-to-Point Communication

[Block diagram: M → Encoder → X^n → p(y|x) → Y^n → Decoder → M̂]

Assume M ~ Unif[1 : 2^{nR}]
Average probability of error: P_e^(n) = P{M̂ ≠ M}
Assume cost b(x) ≥ 0 with b(x_0) = 0
Average cost constraint: Σ_{i=1}^n b(x_i(m)) ≤ nB for every m ∈ [1 : 2^{nR}]

R achievable if there exist (2^{nR}, n) codes that satisfy the cost constraint with lim_{n→∞} P_e^(n) = 0

Capacity–cost function C(B) of the DMC p(y|x) with average cost constraint B on X: the supremum of all achievable rates

Channel Coding Theorem (Shannon 1948)

  C(B) = max_{p(x): E(b(X)) ≤ B} I(X; Y)

Point-to-Point Communication

Proof of Achievability
We use random coding and joint typicality decoding

Codebook generation:
  Fix p(x) that attains C(B/(1+ε))
  Randomly and independently generate 2^{nR} sequences x^n(m) ~ ∏_{i=1}^n p_X(x_i), m ∈ [1 : 2^{nR}]

Encoding:
  To send message m, the encoder transmits x^n(m) if x^n(m) ∈ T_ε^(n)
  (by the typical average lemma, Σ_{i=1}^n b(x_i(m)) ≤ nB)
  Otherwise it transmits (x_0, . . . , x_0)

Decoding:
  The decoder declares that m̂ is sent if it is the unique message such that (x^n(m̂), y^n) ∈ T_ε^(n)
  Otherwise it declares an error

Point-to-Point Communication

Analysis of the Probability of Error


Consider the probability of error P(E) averaged over M and codebooks
Assume M = 1 (symmetry of codebook generation)
The decoder makes an error iff one or both of the following events occur:

  E1 = {(X^n(1), Y^n) ∉ T_ε^(n)}
  E2 = {(X^n(m), Y^n) ∈ T_ε^(n) for some m ≠ 1}

Thus, by the union of events bound,

  P(E) = P(E | M = 1) = P(E1 ∪ E2) ≤ P(E1) + P(E2)


Point-to-Point Communication

Analysis of the Probability of Error


Consider the rst term P(E1 ) = P( X n (1), Y n ) T(n) p X ( xi )
n n

= P X n (1) T(n) , ( X n (1), Y n ) T(n) + P X n (1) T(n) , ( X n (1), Y n ) T(n)


x - T(-) i =1 y - T(-) (Y |x - ) i =1

pY | X ( yi | xi ) + P X n (1) T(n)

(x - , y - )T(-) i =1

p X (xi ) pY | X ( yi | xi ) + P X n (1) T(n)

By the LLN, each term 0 as n


Point-to-Point Communication

Analysis of the Probability of Error


Consider the second term:

  P(E2) = P{(X^n(m), Y^n) ∈ T_ε^(n) for some m ≠ 1}

For m ≠ 1, X^n(m) ~ ∏_{i=1}^n p_X(x_i), independent of Y^n ~ ∏_{i=1}^n p_Y(y_i)

[Figure: codewords X^n(2), . . . , X^n(m) in X^n, each independent of Y^n ∈ T_ε^(n)(Y)]

To bound P(E2), we use the packing lemma



Point-to-Point Communication

Packing Lemma
Let (U, X, Y) ~ p(u, x, y)
Let (Ũ^n, Ỹ^n) ~ p(ũ^n, ỹ^n) be arbitrarily distributed
Let X^n(m) ~ ∏_{i=1}^n p_{X|U}(x_i | ũ_i), m ∈ A, where |A| ≤ 2^{nR}, be pairwise conditionally independent of Ỹ^n given Ũ^n

Packing Lemma

There exists δ(ε) → 0 as ε → 0 such that

  lim_{n→∞} P{(Ũ^n, X^n(m), Ỹ^n) ∈ T_ε^(n) for some m ∈ A} = 0,   if R < I(X; Y | U) − δ(ε)


Point-to-Point Communication

Analysis of the Probability of Error


Consider the second term P(E2 ) = P( X n (m), Y n ) T(n) for some m = 1
n n For m = 1, X n (m) n i =1 p X ( xi ), independent of Y i =1 pY ( yi )

Hence, by the packing lemma with A = [2 : 2nR ] and U = , P(E2 ) 0 if R < I ( X ; Y ) ( ) = C (B /(1 + )) ( )

Since P(E ) 0 as n , there must exist a sequence of (2nR , n) codes with (n) limn Pe = 0 if R < C (B /(1 + )) ( ) By the continuity of C (B ) in B , C (B /(1 + )) C (B ) as 0, which implies the achievability of every rate R < C (B )


Point-to-Point Communication

Gaussian Channel
Discrete-time additive white Gaussian noise channel:

  Y = gX + Z

g: channel gain (path loss)
{Z_i}: WGN(N_0/2) process, independent of M
Assume N_0/2 = 1 and label the received power g²P as S (SNR)

Average power constraint: Σ_{i=1}^n x_i²(m) ≤ nP for every m

Theorem (Shannon 1948)

  C = max_{F(x): E(X²) ≤ P} I(X; Y) = (1/2) log(1 + S)

Point-to-Point Communication

Gaussian Channel

Proof of Achievability
We extend the proof for the DMC using a discretization procedure (McEliece 1977)

First note that the capacity is attained by X ~ N(0, P), i.e., I(X; Y) = C
Let [X]_j be a finite quantization of X such that E([X]_j²) ≤ E(X²) = P and [X]_j → X in distribution

[Block diagram: X → [X]_j → (add Z) → Y_j → [Y_j]_k]

Let Y_j = g[X]_j + Z and let [Y_j]_k be a finite quantization of Y_j
By the achievability proof for the DMC, I([X]_j; [Y_j]_k) is achievable for every j, k
By the data processing inequality and the maximum differential entropy lemma,

  I([X]_j; [Y_j]_k) ≤ I([X]_j; Y_j) = h(Y_j) − h(Z) ≤ h(Y) − h(Z) = I(X; Y)

By weak convergence and the dominated convergence theorem,

  lim inf_j lim_{k→∞} I([X]_j; [Y_j]_k) = lim inf_j I([X]_j; Y_j) ≥ I(X; Y)

Combining the two bounds, I([X]_j; [Y_j]_k) → I(X; Y) as j, k → ∞


Point-to-Point Communication

Gaussian Channel

Summary
Random coding
Joint typicality decoding
Packing lemma
Discretization procedure for the Gaussian channel

Multiple Access Channel

DM Multiple Access Channel (MAC)


Multiple access communication system (uplink)
[Block diagram: M1 → Encoder 1 → X1^n; M2 → Encoder 2 → X2^n; (X1^n, X2^n) → p(y|x1, x2) → Y^n → Decoder → (M̂1, M̂2)]

Assume a 2-sender DM-MAC model (X1 × X2, p(y|x1, x2), Y)

A (2^{nR1}, 2^{nR2}, n) code for the DM-MAC:

  Message sets: [1 : 2^{nR1}] and [1 : 2^{nR2}]
  Encoder j = 1, 2: x_j^n(m_j)
  Decoder: (m̂1(y^n), m̂2(y^n))

Assume (M1, M2) ~ Unif([1 : 2^{nR1}] × [1 : 2^{nR2}]): x1^n(M1) and x2^n(M2) independent
Average probability of error: P_e^(n) = P{(M̂1, M̂2) ≠ (M1, M2)}
(R1, R2) achievable if there exist (2^{nR1}, 2^{nR2}, n) codes with lim_{n→∞} P_e^(n) = 0
Capacity region: closure of the set of achievable (R1, R2)

Multiple Access Channel

Theorem (Ahlswede 1971, Liao 1972, Slepian–Wolf 1973b)

The capacity region of the DM-MAC p(y|x1, x2) is the set of rate pairs (R1, R2) such that

  R1 ≤ I(X1; Y | X2, Q)
  R2 ≤ I(X2; Y | X1, Q)
  R1 + R2 ≤ I(X1, X2; Y | Q)

for some pmf p(q) p(x1|q) p(x2|q), where Q is an auxiliary (time-sharing) r.v.

[Figure: pentagonal capacity region in the (R1, R2) plane, with corner points determined by C1, C2, and C12]

Individual capacities: C1 = max_{p(x1), x2} I(X1; Y | X2 = x2), C2 = max_{p(x2), x1} I(X2; Y | X1 = x1)
Sum-capacity: C12 = max_{p(x1) p(x2)} I(X1, X2; Y)
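As a concrete instance, the classical binary adder MAC Y = X1 + X2 can be evaluated directly; the sketch below is our own worked example (fixed Q, independent uniform inputs).

```python
import numpy as np

# Worked example (ours, not from the slides): rate constraints for the
# binary adder MAC Y = X1 + X2 in {0,1,2} with independent Bern(1/2) inputs.
def H(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

p1 = p2 = np.array([0.5, 0.5])
py = np.zeros(3)
for x1 in (0, 1):
    for x2 in (0, 1):
        py[x1 + x2] += p1[x1] * p2[x2]

# Y is a function of (X1, X2), so I(X1,X2;Y) = H(Y); given X2 = x2,
# Y determines X1, so I(X1;Y|X2) = H(X1) (and symmetrically for X2).
print("R1      <=", H(p1))   # 1 bit
print("R2      <=", H(p2))   # 1 bit
print("R1 + R2 <=", H(py))   # 1.5 bits
```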

Multiple Access Channel

Proof of Achievability (Han–Kobayashi 1981)

We use simultaneous decoding and coded time sharing

Codebook generation:
  Fix p(q) p(x1|q) p(x2|q)
  Randomly generate a time-sharing sequence q^n ~ ∏_{i=1}^n p_Q(q_i)
  Randomly and conditionally independently generate 2^{nR1} sequences x1^n(m1) ~ ∏_{i=1}^n p_{X1|Q}(x_{1i}|q_i), m1 ∈ [1 : 2^{nR1}]
  Similarly generate 2^{nR2} sequences x2^n(m2) ~ ∏_{i=1}^n p_{X2|Q}(x_{2i}|q_i), m2 ∈ [1 : 2^{nR2}]

Encoding:
  To send (m1, m2), transmit x1^n(m1) and x2^n(m2)

Decoding:
  Find the unique message pair (m̂1, m̂2) such that (q^n, x1^n(m̂1), x2^n(m̂2), y^n) ∈ T_ε^(n)


Multiple Access Channel

Analysis of the Probability of Error


Assume (M1, M2) = (1, 1)

Joint pmfs induced by different (m1, m2):

  m1   m2   Joint pmf
  1    1    p(q^n) p(x1^n|q^n) p(x2^n|q^n) p(y^n|x1^n, x2^n, q^n)
  ≠1   1    p(q^n) p(x1^n|q^n) p(x2^n|q^n) p(y^n|x2^n, q^n)
  1    ≠1   p(q^n) p(x1^n|q^n) p(x2^n|q^n) p(y^n|x1^n, q^n)
  ≠1   ≠1   p(q^n) p(x1^n|q^n) p(x2^n|q^n) p(y^n|q^n)

We divide the error event into the following 4 events:

  E1 = {(Q^n, X1^n(1), X2^n(1), Y^n) ∉ T_ε^(n)}
  E2 = {(Q^n, X1^n(m1), X2^n(1), Y^n) ∈ T_ε^(n) for some m1 ≠ 1}
  E3 = {(Q^n, X1^n(1), X2^n(m2), Y^n) ∈ T_ε^(n) for some m2 ≠ 1}
  E4 = {(Q^n, X1^n(m1), X2^n(m2), Y^n) ∈ T_ε^(n) for some m1 ≠ 1, m2 ≠ 1}

Then P(E) ≤ P(E1) + P(E2) + P(E3) + P(E4)


Multiple Access Channel

By the LLN, P(E1) → 0 as n → ∞

By the packing lemma (A = [2 : 2^{nR1}], U ← Q, X ← X1, Y ← (X2, Y)),
  P(E2) → 0 as n → ∞ if R1 < I(X1; X2, Y | Q) − δ(ε) = I(X1; Y | X2, Q) − δ(ε)

Similarly, P(E3) → 0 as n → ∞ if R2 < I(X2; Y | X1, Q) − δ(ε)


Multiple Access Channel

By the packing lemma (A = [2 : 2^{nR1}] × [2 : 2^{nR2}], U ← Q, X ← (X1, X2)),
  P(E4) → 0 as n → ∞ if R1 + R2 < I(X1, X2; Y | Q) − δ(ε)

Remark: the pairs (X1^n(m1), X2^n(m2)), m1 ≠ 1, m2 ≠ 1, are not mutually independent, but each of them is pairwise independent of Y^n (given Q^n)

Multiple Access Channel

Summary
Coded time sharing
Simultaneous decoding
Systematic procedure for decomposing the error event

Broadcast Channel

DM Broadcast Channel (BC)


Broadcast communication system (downlink):

[Block diagram: (M1, M2) → Encoder → X^n → p(y1, y2|x); Y1^n → Decoder 1 → M̂1; Y2^n → Decoder 2 → M̂2]

Assume a 2-receiver DM-BC model (X, p(y1, y2|x), Y1 × Y2)

A (2^{nR1}, 2^{nR2}, n) code for the DM-BC:

  Message sets: [1 : 2^{nR1}] and [1 : 2^{nR2}]
  Encoder: x^n(m1, m2)
  Decoder j = 1, 2: m̂_j(y_j^n)

Assume (M1, M2) ~ Unif([1 : 2^{nR1}] × [1 : 2^{nR2}])
Average probability of error: P_e^(n) = P{(M̂1, M̂2) ≠ (M1, M2)}
(R1, R2) achievable if there exist (2^{nR1}, 2^{nR2}, n) codes with lim_{n→∞} P_e^(n) = 0
Capacity region: closure of the set of achievable (R1, R2)

Broadcast Channel

Superposition Coding Inner Bound


The capacity region of the DM-BC is not known in general
There are several inner and outer bounds that are tight in some cases

Superposition Coding Inner Bound (Cover 1972, Bergmans 1973)

A rate pair (R1, R2) is achievable for the DM-BC p(y1, y2|x) if

  R1 < I(X; Y1 | U)
  R2 < I(U; Y2)
  R1 + R2 < I(X; Y1)

for some pmf p(u, x), where U is an auxiliary random variable

This bound is tight for several special cases, including

  Degraded: X → Y1 → Y2 physically or stochastically
  Less noisy: I(U; Y1) ≥ I(U; Y2) for all p(u, x)
  More capable: I(X; Y1) ≥ I(X; Y2) for all p(x)

Degraded ⊆ Less noisy ⊆ More capable
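For the degraded binary symmetric BC this bound can be evaluated in closed form; the sketch below is our own worked example with illustrative parameters, sweeping the superposition parameter α.

```python
import numpy as np

# Worked example (ours): superposition rates for the (degraded) binary
# symmetric BC with Y1 = X xor Bern(p1), Y2 = X xor Bern(p2), p1 < p2.
# Taking U ~ Bern(1/2) and X = U xor V, V ~ Bern(alpha), gives
#   R1 < H2(alpha * p1) - H2(p1),   R2 < 1 - H2(alpha * p2),
# where a * b := a(1-b) + (1-a)b is binary convolution.
def H2(q):
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return float(-q * np.log2(q) - (1 - q) * np.log2(1 - q))

def conv(a, b):
    return a * (1 - b) + (1 - a) * b

p1, p2 = 0.1, 0.2
for alpha in [0.0, 0.1, 0.3, 0.5]:
    print(f"alpha={alpha:.1f}  R1 < {H2(conv(alpha, p1)) - H2(p1):.3f}  "
          f"R2 < {1 - H2(conv(alpha, p2)):.3f}")
```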


Broadcast Channel

Proof of Achievability
We use superposition coding and simultaneous nonunique decoding

Codebook generation:
  Fix p(u) p(x|u)
  Randomly and independently generate 2^{nR2} sequences (cloud centers) u^n(m2) ~ ∏_{i=1}^n p_U(u_i), m2 ∈ [1 : 2^{nR2}]
  For each m2 ∈ [1 : 2^{nR2}], randomly and conditionally independently generate 2^{nR1} sequences (satellite codewords) x^n(m1, m2) ~ ∏_{i=1}^n p_{X|U}(x_i|u_i(m2)), m1 ∈ [1 : 2^{nR1}]

[Figure: cloud centers u^n(m2) in U^n, each surrounded by its satellite codewords x^n(m1, m2) in X^n]

Broadcast Channel

Proof of Achievability (cont.)

Encoding:
  To send (m1, m2), transmit x^n(m1, m2)

Decoding:
  Decoder 2 finds the unique message m̂2 such that (u^n(m̂2), y2^n) ∈ T_ε^(n)
  (by the packing lemma, P(E2) → 0 as n → ∞ if R2 < I(U; Y2) − δ(ε))
  Decoder 1 finds the unique message m̂1 such that (u^n(m2), x^n(m̂1, m2), y1^n) ∈ T_ε^(n) for some m2

Broadcast Channel

Analysis of the Probability of Error for Decoder 1


Assume (M1 , M2 ) = (1, 1) Joint pmfs induced by dierent (m1 , m2 ) m1 1 1 m2 1 1 Joint pmf
n n p(un , x n ) p( y1 |u ) n p(un , x n ) p( y1 ) n p(un , x n ) p( y1 ) n n p(u , x n ) p( y1 |x ) n

The last case does not result in an error So we divide the error event into the following 3 events: E12 = (U n (1), X n (m1 , 1), Y1n ) T(n) for some m1 = 1 E11 = (U n (1), X n (1, 1), Y1n ) T(n)

Then P(E1 ) P(E11 ) + P(E12 ) + P(E13 )


El Gamal & Kim (Stanford & UCSD)

E13 = (U n (m2 ), X n (m1 , m2 ), Y1n ) T(n) for some m1 = 1, m2 = 1


Elements of NIT Tutorial, ISIT 2011 41 / 118

Broadcast Channel

By the LLN, P(E11) → 0 as n → ∞

By the packing lemma (A = [2 : 2^{nR1}]), P(E12) → 0 as n → ∞ if R1 < I(X; Y1 | U) − δ(ε)

By the packing lemma (A = [2 : 2^{nR1}] × [2 : 2^{nR2}], U ← ∅, X ← (U, X)),
  P(E13) → 0 as n → ∞ if R1 + R2 < I(U, X; Y1) − δ(ε) = I(X; Y1) − δ(ε)

Remark: P(E14) = P{(U^n(m2), X^n(1, m2), Y1^n) ∈ T_ε^(n) for some m2 ≠ 1} → 0 as n → ∞ if R2 < I(U, X; Y1) − δ(ε) = I(X; Y1) − δ(ε)
Hence, the inner bound continues to hold when decoder 1 is also required to recover M2
Broadcast Channel

Summary
Superposition coding
Simultaneous nonunique decoding

Lossy Source Coding


Point-to-point compression system:

[Block diagram: X^n → Encoder → M → Decoder → (X̂^n, D)]

Assume a discrete memoryless source (DMS) (X, p(x)) and a distortion measure d(x, x̂), (x, x̂) ∈ X × X̂

Average per-letter distortion between x^n and x̂^n:

  d(x^n, x̂^n) = (1/n) Σ_{i=1}^n d(x_i, x̂_i)

A (2^{nR}, n) lossy source code:

  Encoder: an index m(x^n) ∈ [1 : 2^{nR}) := {1, 2, . . . , 2^{⌈nR⌉}}
  Decoder: an estimate (reconstruction sequence) x̂^n(m) ∈ X̂^n


Lossy Source Coding

[Block diagram: X^n → Encoder → M → Decoder → (X̂^n, D)]

Expected distortion associated with the (2^{nR}, n) code:

  D = E[d(X^n, X̂^n)] = Σ_{x^n} p(x^n) d(x^n, x̂^n(m(x^n)))

(R, D) achievable if there exist (2^{nR}, n) codes with lim sup_{n→∞} E[d(X^n, X̂^n)] ≤ D

Rate–distortion function R(D): infimum of R such that (R, D) is achievable

Lossy Source Coding Theorem (Shannon 1959)

  R(D) = min_{p(x̂|x): E(d(X, X̂)) ≤ D} I(X; X̂)   for D ≥ D_min = E[min_{x̂(x)} d(X, x̂(X))]

Lossy Source Coding

Proof of Achievability
We use random coding and joint typicality encoding

Codebook generation:
  Fix p(x̂|x) that attains R(D/(1+ε)) and compute p(x̂) = Σ_x p(x) p(x̂|x)
  Randomly and independently generate 2^{nR} sequences x̂^n(m) ~ ∏_{i=1}^n p_X̂(x̂_i), m ∈ [1 : 2^{nR}]

Encoding:
  Find an index m such that (x^n, x̂^n(m)) ∈ T_ε^(n)
  If more than one, choose the smallest index among them; if none, choose m = 1

Decoding:
  Upon receiving m, set the reconstruction sequence x̂^n = x̂^n(m)


Lossy Source Coding

Analysis of Expected Distortion


We bound the expected distortion averaged over codebooks Dene the encoding error event n (M )) T (n) = ( X n , X n (m)) T (n) for all m [1 : 2nR ] E = ( X n , X
n n n ( m ) n p ( x X i =1 X i ), independent of each other and of X i =1 p X ( xi )

n X n (1) X n (m) X

Xn

T(n) ( X )

Xn

To bound P(E ), we use the covering lemma


El Gamal & Kim (Stanford & UCSD) Elements of NIT Tutorial, ISIT 2011 47 / 118

Lossy Source Coding

Covering Lemma

Let (U, X, X̂) ~ p(u, x, x̂) and ε′ < ε
Let (U^n, X^n) ~ p(u^n, x^n) be arbitrarily distributed such that

  lim_{n→∞} P{(U^n, X^n) ∈ T_{ε′}^(n)(U, X)} = 1

Let X̂^n(m) ~ ∏_{i=1}^n p_{X̂|U}(x̂_i | u_i), m ∈ A, where |A| ≥ 2^{nR}, be conditionally independent of each other and of X^n given U^n

Then there exists δ(ε) → 0 as ε → 0 such that

  lim_{n→∞} P{(U^n, X^n, X̂^n(m)) ∉ T_ε^(n) for all m ∈ A} = 0,   if R > I(X; X̂ | U) + δ(ε)


Lossy Source Coding

Analysis of Expected Distortion


We bound the expected distortion averaged over codebooks Dene the encoding error event n (M )) T (n) = ( X n , X n (m)) T (n) for all m [1 : 2nR ] E = ( X n , X
n n n ( m ) n p ( x X i =1 X i ), independent of each other and of X i =1 p X ( xi ) By the covering lemma (U = ), P(E ) 0 as n if

) + ( ) = R(D /(1 + )) + ( ) R > I (X ; X

Now, by the law of total expectation and the typical average lemma, n ) = P(E ) E d ( X n , X n )| E + P(E c ) E d ( X n , X n )| E c E d ( X n , X )) P(E ) dmax + P(E c )(1 + ) E( d ( X , X n )] D and there must exist a sequence of Hence, lim supn E[ d ( X n , X nR (2 , n) codes that satises the asymptotic distortion constraint By the continuity of R(D ) in D , R(D /(1 + )) + ( ) R(D ) as 0
El Gamal & Kim (Stanford & UCSD) Elements of NIT Tutorial, ISIT 2011 49 / 118

Lossy Source Coding

Lossless Source Coding


Suppose we wish to reconstruct X^n losslessly, i.e., X̂^n = X^n
R achievable if there exist (2^{nR}, n) codes with lim_{n→∞} P{X̂^n ≠ X^n} = 0
Optimal rate R*: infimum of achievable R

Lossless Source Coding Theorem (Shannon 1948)

  R* = H(X)

We prove this theorem as a corollary of the lossy source coding theorem
Consider the lossy source coding problem for a DMS X, with X̂ = X and Hamming distortion measure (d(x, x̂) = 0 if x = x̂, and d(x, x̂) = 1 otherwise)
At D = 0, the rate–distortion function is R(0) = H(X)
We now show operationally that R* = R(0), without using the fact that R* = H(X)

Lossy Source Coding

Lossless Source Coding

Proof of the Lossless Source Coding Theorem


Proof of R* ≥ R(0):

  First note that

    lim_{n→∞} E(d(X^n, X̂^n)) = lim_{n→∞} (1/n) Σ_{i=1}^n P{X̂_i ≠ X_i} ≤ lim_{n→∞} P{X̂^n ≠ X^n}

  Hence, any sequence of (2^{nR}, n) codes with lim_{n→∞} P{X̂^n ≠ X^n} = 0 achieves D = 0

Proof of R* ≤ R(0):

  We can still use random coding and joint typicality encoding!
  Fix p(x̂|x) = 1 if x̂ = x and 0 otherwise (so that p(x̂) = p_X(x̂))
  As before, generate a random code x̂^n(m), m ∈ [1 : 2^{nR}]
  Then P(E) = P{(X^n, X̂^n) ∉ T_ε^(n)} → 0 as n → ∞ if R > I(X; X̂) + δ(ε) = R(0) + δ(ε)
  Now recall that if (x^n, x̂^n) ∈ T_ε^(n), then x̂^n = x^n (equivalently, if x̂^n ≠ x^n, then (x^n, x̂^n) ∉ T_ε^(n))
  Hence, P{X̂^n ≠ X^n} → 0 as n → ∞ if R > R(0) + δ(ε)


Lossy Source Coding

Lossless Source Coding

Summary
Joint typicality encoding
Covering lemma
Lossless as a corollary of lossy

Wyner–Ziv Coding

Lossy Source Coding with Side Information at the Decoder

Lossy compression system with side information:

[Block diagram: X^n → Encoder → M → Decoder → (X̂^n, D); side information Y^n available at the decoder]

Assume a 2-DMS (X × Y, p(x, y)) and a distortion measure d(x, x̂)

A (2^{nR}, n) lossy source code with side information available at the decoder:

  Encoder: m(x^n)
  Decoder: x̂^n(m, y^n)

Expected distortion, achievability, rate–distortion function: defined as before

Theorem (Wyner–Ziv 1976)

  R_{SI-D}(D) = min (I(X; U) − I(Y; U)) = min I(X; U | Y),

where the minimum is over all p(u|x) and functions x̂(u, y) such that E(d(X, X̂)) ≤ D

Wyner–Ziv Coding

Proof of Achievability

We use binning in addition to joint typicality encoding and decoding

[Figure: 2^{nR̃} sequences u^n(l) partitioned into bins B(1), . . . , B(2^{nR}); the encoder picks u^n(l) jointly typical with x^n; the decoder finds, within the received bin B(m), the u^n(l) jointly typical with y^n, i.e., in T_ε^(n)(U, Y)]


Wyner–Ziv Coding

Proof of Achievability (cont.)

Codebook generation:
  Fix p(u|x) and x̂(u, y) that attain R_{SI-D}(D/(1+ε))
  Randomly and independently generate 2^{nR̃} sequences u^n(l) ~ ∏_{i=1}^n p_U(u_i), l ∈ [1 : 2^{nR̃}]
  Partition [1 : 2^{nR̃}] into bins B(m) = [(m−1)2^{n(R̃−R)} + 1 : m2^{n(R̃−R)}], m ∈ [1 : 2^{nR}]

Encoding:
  Find l such that (x^n, u^n(l)) ∈ T_{ε′}^(n)
  If more than one, pick one of them uniformly at random; if none, choose l ∈ [1 : 2^{nR̃}] uniformly at random
  Send the index m such that l ∈ B(m)

Decoding:
  Upon receiving m, find the unique l̂ ∈ B(m) such that (u^n(l̂), y^n) ∈ T_ε^(n), where ε > ε′
  Compute the reconstruction sequence as x̂_i = x̂(u_i(l̂), y_i), i ∈ [1 : n]
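The binning idea itself can be demonstrated on a tiny scale. The sketch below is our own toy construction in the spirit of Slepian–Wolf (lossless recovery with side information, rather than the lossy Wyner–Ziv setting above): all x^n are hashed into 2^{nR} random bins with R > H(X|Y), and the decoder picks the sequence in the announced bin closest in Hamming distance to y^n, a small-n stand-in for joint typicality decoding.

```python
import numpy as np

# Toy binning demo (our construction): lossless coding of X^n with side
# information Y^n at the decoder, rate R > H(X|Y) = H2(0.1) ~ 0.469.
rng = np.random.default_rng(3)
n, R, q = 14, 0.7, 0.1
nbins = 2 ** int(np.ceil(n * R))
bin_of = rng.integers(nbins, size=2 ** n)    # random binning of all x^n

def to_int(bits):
    return int("".join(map(str, bits)), 2)

errors, trials = 0, 200
for _ in range(trials):
    x = rng.integers(2, size=n)
    y = x ^ (rng.random(n) < q)              # Y = X xor Bern(q)
    xi, yi = to_int(x), to_int(y)
    m = bin_of[xi]                           # transmitted: the bin index only
    cand = np.flatnonzero(bin_of == m)       # all sequences in bin B(m)
    best = min(cand, key=lambda v: bin(int(v) ^ yi).count("1"))
    errors += (best != xi)
print("decoding error rate:", errors / trials)
```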

Wyner–Ziv Coding

Analysis of Expected Distortion

We bound the distortion averaged over the random codebook and encoding
Let (L, M) denote the chosen indices and L̂ the index estimate at the decoder

Define the error event

  E = {(U^n(L̂), X^n, Y^n) ∉ T_ε^(n)}

and consider

  E1 = {(U^n(l), X^n) ∉ T_{ε′}^(n) for all l ∈ [1 : 2^{nR̃}]}
  E2 = {(U^n(L), X^n, Y^n) ∉ T_ε^(n)}
  E3 = {(U^n(l), Y^n) ∈ T_ε^(n) for some l ∈ B(M), l ≠ L}

The probability of error is bounded as

  P(E) ≤ P(E1) + P(E1^c ∩ E2) + P(E3)


Wyner–Ziv Coding

By the covering lemma, P(E1) → 0 as n → ∞ if R̃ > I(X; U) + δ(ε′)

Since E1^c ⊆ {(U^n(L), X^n) ∈ T_{ε′}^(n)}, ε > ε′, and

  Y^n | {U^n(L) = u^n, X^n = x^n} ~ ∏_{i=1}^n p_{Y|U,X}(y_i | u_i, x_i) = ∏_{i=1}^n p_{Y|X}(y_i | x_i),

by the conditional typicality lemma, P(E1^c ∩ E2) → 0 as n → ∞

To bound P(E3), it can be shown that

  P(E3) ≤ P{(U^n(l), Y^n) ∈ T_ε^(n) for some l ∈ B(1)}

Since each U^n(l) ~ ∏_{i=1}^n p_U(u_i), independent of Y^n, by the packing lemma, P(E3) → 0 as n → ∞ if R̃ − R < I(Y; U) − δ(ε)

Combining the bounds, we have shown that P(E) → 0 as n → ∞ if

  R > I(X; U) − I(Y; U) + δ(ε) + δ(ε′) = R_{SI-D}(D/(1+ε)) + δ(ε) + δ(ε′)

Wyner–Ziv Coding

Lossless Source Coding with Side Information

What is the minimum rate R*_{SI-D} needed to recover X losslessly?

Theorem (Slepian–Wolf 1973a)

  R*_{SI-D} = H(X | Y)

We prove the Slepian–Wolf theorem as a corollary of the Wyner–Ziv theorem
Let d be the Hamming distortion measure and consider the case D = 0; then R_{SI-D}(0) = H(X | Y)

As before, we can show operationally that R*_{SI-D} = R_{SI-D}(0):

  R*_{SI-D} ≤ R_{SI-D}(0) by Wyner–Ziv coding with X̂ = U = X
  R*_{SI-D} ≥ R_{SI-D}(0) since (1/n) Σ_{i=1}^n P{X̂_i ≠ X_i} ≤ P{X̂^n ≠ X^n}


Wyner–Ziv Coding

Summary

Binning
Application of the conditional typicality lemma
Channel coding techniques in source coding

Gelfand–Pinsker Coding

DMC with State Information Available at the Encoder

Point-to-point communication system with state:

[Block diagram: S^n ~ p(s); M → Encoder → X^n → p(y|x, s) → Y^n → Decoder → M̂; S^n available at the encoder]

Assume a DMC with DM state model (X × S, p(y|x, s) p(s), Y)

  DMC: p(y^n | x^n, s^n, m) = ∏_{i=1}^n p_{Y|X,S}(y_i | x_i, s_i)
  DM state: (S_1, S_2, . . .) i.i.d. with S_i ~ p_S(s_i)

A (2^{nR}, n) code for the DMC with state information available at the encoder:

  Message set: [1 : 2^{nR}]
  Encoder: x^n(m, s^n)
  Decoder: m̂(y^n)


Gelfand–Pinsker Coding

[Block diagram: S^n ~ p(s); M → Encoder → X^n → p(y|x, s) → Y^n → Decoder → M̂]

Expected average cost constraint: Σ_{i=1}^n E[b(x_i(m, S^n))] ≤ nB for every m ∈ [1 : 2^{nR}]

Probability of error, achievability, capacity–cost function: defined as for the DMC

Theorem (Gelfand–Pinsker 1980)

  C_{SI-E}(B) = max_{p(u|s), x(u,s): E(b(X)) ≤ B} (I(U; Y) − I(U; S))

Gelfand–Pinsker Coding

Proof of Achievability (Heegard–El Gamal 1983)

We use multicoding

[Figure: subcodebooks C(1), . . . , C(2^{nR}), each containing sequences u^n(l); encoding finds u^n(l) ∈ C(m) with (u^n(l), s^n) ∈ T_ε^(n)(U, S)]


Gelfand–Pinsker Coding

Proof of Achievability (cont.)

Codebook generation:
  Fix p(u|s) and x(u, s) that attain C_{SI-E}(B/(1+ε))
  For each m ∈ [1 : 2^{nR}], generate a subcodebook C(m) consisting of 2^{n(R̃−R)} randomly and independently generated sequences u^n(l) ~ ∏_{i=1}^n p_U(u_i), l ∈ [(m−1)2^{n(R̃−R)} + 1 : m2^{n(R̃−R)}]

Encoding:
  To send m ∈ [1 : 2^{nR}] given s^n, find u^n(l) ∈ C(m) such that (u^n(l), s^n) ∈ T_{ε′}^(n)
  Then transmit x_i = x(u_i(l), s_i) for i ∈ [1 : n]
  (by the typical average lemma, Σ_{i=1}^n b(x_i(m, s^n)) ≤ nB)
  If no such u^n(l) exists, transmit (x_0, . . . , x_0)

Decoding:
  Find the unique m̂ such that (u^n(l), y^n) ∈ T_ε^(n) for some u^n(l) ∈ C(m̂), where ε > ε′

Gelfand–Pinsker Coding

Analysis of the Probability of Error

Assume M = 1, and let L denote the index of the chosen U^n sequence for M = 1 and S^n
The decoder makes an error only if one or more of the following events occur:

  E1 = {(U^n(l), S^n) ∉ T_{ε′}^(n) for all U^n(l) ∈ C(1)}
  E2 = {(U^n(L), Y^n) ∉ T_ε^(n)}
  E3 = {(U^n(l), Y^n) ∈ T_ε^(n) for some U^n(l) ∉ C(1)}

Thus, the probability of error is bounded as

  P(E) ≤ P(E1) + P(E1^c ∩ E2) + P(E3)

Gelfand–Pinsker Coding

By the covering lemma, P(E1) → 0 as n → ∞ if R̃ − R > I(U; S) + δ(ε′)

Since ε′ < ε, E1^c ⊆ {(U^n(L), S^n) ∈ T_{ε′}^(n)} = {(U^n(L), X^n, S^n) ∈ T_{ε′}^(n)}, and

  Y^n | {U^n(L) = u^n, X^n = x^n, S^n = s^n} ~ ∏_{i=1}^n p_{Y|U,X,S}(y_i | u_i, x_i, s_i) = ∏_{i=1}^n p_{Y|X,S}(y_i | x_i, s_i),

by the conditional typicality lemma, P(E1^c ∩ E2) → 0 as n → ∞

Since each U^n(l) ∉ C(1) is distributed according to ∏_{i=1}^n p(u_i), independent of Y^n, by the packing lemma, P(E3) → 0 as n → ∞ if R̃ < I(U; Y) − δ(ε)
(Remark: Y^n is not i.i.d.)

Combining the bounds, we have shown that P(E) → 0 as n → ∞ if

  R < I(U; Y) − I(U; S) − δ(ε) − δ(ε′) = C_{SI-E}(B/(1+ε)) − δ(ε) − δ(ε′)

Gelfand–Pinsker Coding

Multicoding versus Binning

Multicoding:
  Channel coding technique
  Given a set of messages
  Generate many codewords for each message
  To communicate a message, send a codeword from its subcodebook

Binning:
  Source coding technique
  Given a set of indices (sequences)
  Map the indices into a smaller number of bins
  To communicate an index, send its bin index

Gelfand–Pinsker Coding

Wyner–Ziv versus Gelfand–Pinsker

Wyner–Ziv theorem: rate–distortion function for a DMS X with side information Y available at the decoder:

  R_{SI-D}(D) = min (I(U; X) − I(U; Y))

We proved achievability using binning, covering, and packing

Gelfand–Pinsker theorem: capacity–cost function of a DMC with state information S available at the encoder:

  C_{SI-E}(B) = max (I(U; Y) − I(U; S))

We proved achievability using multicoding, covering, and packing

Dualities:
  min ↔ max
  binning ↔ multicoding
  covering rate − packing rate ↔ packing rate − covering rate

Gelfand–Pinsker Coding

Writing on Dirty Paper

Gaussian channel with additive Gaussian state available at the encoder:

[Block diagram: M → Encoder → X^n; Y = X + S + Z; S^n known at the encoder; Y^n → Decoder → M̂]

Noise Z ~ N(0, N); state S ~ N(0, Q), independent of Z
Assume expected average power constraint: Σ_{i=1}^n E(x_i²(m, S^n)) ≤ nP for every m

For comparison:

  C = (1/2) log(1 + P/(N + Q))   (no state information)
  C_{SI-ED} = C_{SI-D} = (1/2) log(1 + P/N)   (state available at the decoder)

Writing on Dirty Paper (Costa 1983)

  C_{SI-E} = (1/2) log(1 + P/N) = C_{SI-ED}
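Costa's identity can be checked directly from jointly Gaussian second moments; the sketch below is our own numeric verification that, with U = X + αS and α = P/(P+N), the Gelfand–Pinsker expression is independent of the state power Q.

```python
import numpy as np

# Numeric check (ours) of Costa's result: with X ~ N(0, P) independent of S
# and U = X + alpha*S, alpha = P/(P+N), the Gelfand-Pinsker expression
# I(U;Y) - I(U;S) equals 0.5*log2(1 + P/N) for every state power Q.
def dirty_paper_rate(P, N, Q):
    a = P / (P + N)
    var_u = P + a * a * Q            # U = X + a S
    var_y = P + Q + N                # Y = X + S + Z
    cov_uy = P + a * Q
    cov_us = a * Q
    I_uy = 0.5 * np.log2(var_u * var_y / (var_u * var_y - cov_uy**2))
    I_us = 0.5 * np.log2(var_u * Q / (var_u * Q - cov_us**2))
    return I_uy - I_us

P, N = 2.0, 1.0
print("0.5*log2(1+P/N) =", 0.5 * np.log2(1 + P / N))
for Q in [0.5, 5.0, 50.0]:
    print(f"Q = {Q:5.1f}   I(U;Y) - I(U;S) = {dirty_paper_rate(P, N, Q):.4f}")
```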

Gelfand–Pinsker Coding

Writing on Dirty Paper: Proof of Achievability

The proof involves a clever choice of F(u|s), x(u, s) and a discretization procedure
Let X ~ N(0, P), independent of S, and U = X + αS, where α = P/(P + N). Then

  I(U; Y) − I(U; S) = (1/2) log(1 + P/N)

Let [U]_j and [S]_{j′} be finite quantizations of U and S
Let [X]_{jj′} = [U]_j − α[S]_{j′} and let [Y_{jj′}]_k be a finite quantization of the corresponding channel output Y_{jj′} = [U]_j − α[S]_{j′} + S + Z
We use Gelfand–Pinsker coding for the DMC with DM state p([y_{jj′}]_k | [x]_{jj′}, [s]_{j′}) p([s]_{j′}):

  Joint typicality encoding: R̃ − R > I(U; S) ≥ I([U]_j; [S]_{j′})
  Joint typicality decoding: R̃ < I([U]_j; [Y_{jj′}]_k)

Thus R < I([U]_j; [Y_{jj′}]_k) − I(U; S) is achievable for any j, j′, k
Following arguments similar to the discretization procedure for the Gaussian channel,

  lim_j lim_{j′} lim_k I([U]_j; [Y_{jj′}]_k) = I(U; Y)

Gelfand–Pinsker Coding

Summary

Multicoding
Packing lemma with non-i.i.d. Y^n
Writing on dirty paper

Wiretap Channel

DM Wiretap Channel (WTC)


Point-to-point communication system with an eavesdropper:

[Block diagram: M → Encoder → X^n → p(y, z|x); Y^n → Decoder → M̂; Z^n → Eavesdropper]

Assume a DM-WTC model (X, p(y, z|x), Y × Z)

A (2^{nR}, n) secrecy code for the DM-WTC:

  Message set: [1 : 2^{nR}]
  Randomized encoder: X^n(m) ~ p(x^n|m) for each m ∈ [1 : 2^{nR}]
  Decoder: m̂(y^n)

Wiretap Channel

[Block diagram: M → Encoder → X^n → p(y, z|x); Y^n → Decoder → M̂; Z^n → Eavesdropper]

Assume M ~ Unif[1 : 2^{nR}]
Average probability of error: P_e^(n) = P{M̂ ≠ M}
Information leakage rate: R_L^(n) = (1/n) I(M; Z^n)
(R, R_L) achievable if there exist (2^{nR}, n) codes with lim_{n→∞} P_e^(n) = 0 and lim sup_{n→∞} R_L^(n) ≤ R_L
Rate–leakage region R: closure of the set of achievable (R, R_L)
Secrecy capacity: C_S = max{R : (R, 0) ∈ R}

Theorem (Wyner 1975, Csiszár–Körner 1978)

  C_S = max_{p(u,x)} (I(U; Y) − I(U; Z))
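As a worked instance (our own example, not from the slides): for the degraded binary symmetric wiretap channel, the maximization is attained with U = X uniform, giving a closed-form secrecy capacity.

```python
import numpy as np

# Worked example (ours): degraded binary symmetric wiretap channel, main
# channel BSC(p1), eavesdropper BSC(p2) with p2 > p1. Choosing U = X ~
# Bern(1/2) gives I(U;Y) - I(U;Z) = (1 - H2(p1)) - (1 - H2(p2)),
# i.e., Cs = H2(p2) - H2(p1).
def H2(q):
    return float(-q * np.log2(q) - (1 - q) * np.log2(1 - q))

p1, p2 = 0.1, 0.3
print("Cs =", H2(p2) - H2(p1), "bits")   # ~0.412
```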

Wiretap Channel

Proof of Achievability

We use multicoding and two-step randomized encoding

Codebook generation:
  Assume C_S > 0 and fix p(u, x) that attains it (so I(U; Y) − I(U; Z) > 0)
  For each m ∈ [1 : 2^{nR}], generate a subcodebook C(m) consisting of 2^{n(R̃−R)} randomly and independently generated sequences u^n(l) ~ ∏_{i=1}^n p_U(u_i), l ∈ [(m−1)2^{n(R̃−R)} + 1 : m2^{n(R̃−R)}]

[Figure: 2^{nR̃} sequences u^n(l) arranged into subcodebooks C(1), C(2), . . . , C(2^{nR}), each of size 2^{n(R̃−R)}]

Encoding:
  To send m, choose an index L ∈ [(m−1)2^{n(R̃−R)} + 1 : m2^{n(R̃−R)}] uniformly at random
  Then generate X^n ~ ∏_{i=1}^n p_{X|U}(x_i|u_i(L)) and transmit it

Decoding:
  Find the unique m̂ such that (u^n(l), y^n) ∈ T_ε^(n) for some u^n(l) ∈ C(m̂)
  By the LLN and the packing lemma, P(E) → 0 as n → ∞ if R̃ < I(U; Y) − δ(ε)


Wiretap Channel

Analysis of the Information Leakage Rate


For each C(m), the eavesdropper has ≈ 2^{n(R̃−R−I(U;Z))} sequences u^n(l) jointly typical with z^n

[Figure: subcodebooks C(1), C(2), . . . , C(2^{nR}), each of size 2^{n(R̃−R)}]

If R̃ − R > I(U; Z), the eavesdropper has roughly the same number of such sequences in each subcodebook, providing it with no information about the message

Let M be the message sent and L be the randomly selected index
Every codebook C induces a pmf of the form

  p(m, l, u^n, z^n | C) = 2^{−nR} 2^{−n(R̃−R)} p(u^n | l, C) ∏_{i=1}^n p_{Z|U}(z_i | u_i)

In particular, p(u^n, z^n) = ∏_{i=1}^n p_{U,Z}(u_i, z_i)

Wiretap Channel

Analysis of the Information Leakage Rate


Consider the amount of information leakage averaged over codebooks:

  I(M; Z^n | C) = H(M | C) − H(M | Z^n, C)
               = nR − H(M, L | Z^n, C) + H(L | Z^n, M, C)
               = nR − H(L | Z^n, C) + H(L | Z^n, M, C)

The first equivocation term is lower bounded as

  H(L | Z^n, C) = H(L | C) − I(L; Z^n | C) ≥ nR̃ − I(U^n, L; Z^n | C)
               ≥ nR̃ − I(U^n, L, C; Z^n) =(a) nR̃ − I(U^n; Z^n) = nR̃ − nI(U; Z)

(a): (L, C) → U^n → Z^n form a Markov chain

Wiretap Channel

Analysis of the Information Leakage Rate


Consider the amount of information leakage averaged over codebooks:

  I(M; Z^n | C) ≤ nR − nR̃ + nI(U; Z) + H(L | Z^n, M, C)

The remaining equivocation term can be upper bounded as follows

Lemma

If R̃ − R ≥ I(U; Z), then

  lim sup_{n→∞} (1/n) H(L | Z^n, M, C) ≤ R̃ − R − I(U; Z) + δ(ε)

Substituting (recall that R̃ < I(U; Y) − δ(ε) for decoding), we have shown that

  lim sup_{n→∞} (1/n) I(M; Z^n | C) ≤ δ(ε)   if R < I(U; Y) − I(U; Z) − δ(ε)

Thus, there must exist a sequence of (2^{nR}, n) codes such that P_e^(n) → 0 and R_L^(n) ≤ δ(ε) as n → ∞

Wiretap Channel

Summary
Randomized encoding
Bound on equivocation (list size)

Relay Channel

DM Relay Channel (RC)


Point-to-point communication system with a relay:

[Block diagram: M → Encoder → X1^n → p(y2, y3|x1, x2) → Y3^n → Decoder → M̂; the relay encoder observes Y2^n and transmits X2^n]

Assume a DM-RC model (X1 × X2, p(y2, y3|x1, x2), Y2 × Y3)

A (2^{nR}, n) code for the DM-RC:

  Message set: [1 : 2^{nR}]
  Encoder: x1^n(m)
  Relay encoder: x_{2i}(y_2^{i−1}), i ∈ [1 : n]
  Decoder: m̂(y_3^n)

Probability of error, achievability, capacity: defined as for the DMC



Relay Channel

[Block diagram: as above]

The capacity of the DM-RC is not known in general
There are upper and lower bounds that are tight in some cases
We discuss two lower bounds: decode–forward and compress–forward


Relay Channel

Multihop Lower Bound

The relay recovers the message received from the sender in each block and retransmits it in the following block

[Figure: M → X1 → (Y2 : X2) → Y3 → M̂]

Multihop Lower Bound

  C ≥ max_{p(x1)p(x2)} min{I(X2; Y3), I(X1; Y2 | X2)}

Tight for a cascade of two DMCs, i.e., p(y2, y3|x1, x2) = p(y2|x1) p(y3|x2):

  C = min{max_{p(x2)} I(X2; Y3), max_{p(x1)} I(X1; Y2)}

The scheme uses block Markov coding, where codewords in a block can depend on the message sent in the previous block

Relay Channel

Proof of Achievability

Send b − 1 messages in b blocks using independently generated codebooks

[Figure: messages m1, m2, m3, . . . , m_{b−1}, 1 transmitted in blocks 1, 2, 3, . . . , b−1, b]

Codebook generation:
  Fix p(x1) p(x2) that attains the lower bound
  For each j ∈ [1 : b], randomly and independently generate 2^{nR} sequences x1^n(m_j) ~ ∏_{i=1}^n p_{X1}(x_{1i}), m_j ∈ [1 : 2^{nR}]
  Similarly generate 2^{nR} sequences x2^n(m_{j−1}) ~ ∏_{i=1}^n p_{X2}(x_{2i}), m_{j−1} ∈ [1 : 2^{nR}]
  Codebooks: C_j = {(x1^n(m_j), x2^n(m_{j−1})) : m_{j−1}, m_j ∈ [1 : 2^{nR}]}, j ∈ [1 : b]

Encoding:
  To send m_j in block j, transmit x1^n(m_j) from C_j

Relay encoding:
  At the end of block j, find the unique m̃_j such that (x1^n(m̃_j), x2^n(m̃_{j−1}), y2^n(j)) ∈ T_ε^(n)
  In block j + 1, transmit x2^n(m̃_j) from C_{j+1}

Decoding:
  At the end of block j + 1, find the unique m̂_j such that (x2^n(m̂_j), y3^n(j + 1)) ∈ T_ε^(n)

Relay Channel

Analysis of the Probability of Error

We analyze the probability of decoding error for M_j averaged over codebooks
Assume M_j = 1, and let M̃_j be the relay's decoded message at the end of block j

Since {M̂_j ≠ 1} ⊆ {M̃_j ≠ 1} ∪ {M̂_j ≠ M̃_j}, the decoder makes an error only if one of the following events occurs:

  Ẽ1(j) = {(X1^n(1), X2^n(M̃_{j−1}), Y2^n(j)) ∉ T_ε^(n)}
  Ẽ2(j) = {(X1^n(m_j), X2^n(M̃_{j−1}), Y2^n(j)) ∈ T_ε^(n) for some m_j ≠ 1}
  E1(j) = {(X2^n(M̃_j), Y3^n(j + 1)) ∉ T_ε^(n)}
  E2(j) = {(X2^n(m_j), Y3^n(j + 1)) ∈ T_ε^(n) for some m_j ≠ M̃_j}

Thus, the probability of error is upper bounded as

  P(E(j)) = P{M̂_j ≠ 1} ≤ P(Ẽ1(j)) + P(Ẽ2(j)) + P(E1(j)) + P(E2(j))

Relay Channel

By the independence of the codebooks, M̃_{j−1}, which is a function of Y2^n(j − 1) and codebook C_{j−1}, is independent of the codewords X1^n(1), X2^n(M̃_{j−1}) in C_j
Thus by the LLN, P(Ẽ1(j)) → 0 as n → ∞

By the packing lemma, P(Ẽ2(j)) → 0 as n → ∞ if R < I(X1; Y2 | X2) − δ(ε)

By the independence of the codebooks and the LLN, P(E1(j)) → 0 as n → ∞

By the same independence and the packing lemma, P(E2(j)) → 0 as n → ∞ if R < I(X2; Y3) − δ(ε)

Thus we have shown that under the given constraints on the rate, P{M̂_j ≠ M_j} → 0 as n → ∞ for each j ∈ [1 : b − 1]

Relay Channel

Coherent Multihop Lower Bound

In the multihop coding scheme, the sender knows what the relay transmits in each block

[Figure: M → X1 → (Y2 : X2) → Y3 → M̂]

Hence, the multihop coding scheme can be improved via coherent cooperation between the sender and the relay

Coherent Multihop Lower Bound

  C ≥ max_{p(x1,x2)} min{I(X2; Y3), I(X1; Y2 | X2)}

Relay Channel

Proof of Achievability

We again use a block Markov coding scheme: send b − 1 messages in b blocks using independently generated codebooks

Codebook generation:
  Fix p(x1, x2) that attains the lower bound
  For j ∈ [1 : b], randomly and independently generate 2^{nR} sequences x2^n(m_{j−1}) ~ ∏_{i=1}^n p_{X2}(x_{2i}), m_{j−1} ∈ [1 : 2^{nR}]
  For each m_{j−1} ∈ [1 : 2^{nR}], randomly and conditionally independently generate 2^{nR} sequences x1^n(m_j | m_{j−1}) ~ ∏_{i=1}^n p_{X1|X2}(x_{1i} | x_{2i}(m_{j−1})), m_j ∈ [1 : 2^{nR}]
  Codebooks: C_j = {(x1^n(m_j | m_{j−1}), x2^n(m_{j−1})) : m_{j−1}, m_j ∈ [1 : 2^{nR}]}, j ∈ [1 : b]

Relay Channel

Block:  1             2              3              . . .  b−1                     b
X1:     x1^n(m1|1)    x1^n(m2|m1)    x1^n(m3|m2)    . . .  x1^n(m_{b−1}|m_{b−2})   x1^n(1|m_{b−1})
Y2:     m̃1            m̃2             m̃3             . . .  m̃_{b−1}                 -
X2:     x2^n(1)       x2^n(m̃1)       x2^n(m̃2)       . . .  x2^n(m̃_{b−2})           x2^n(m̃_{b−1})
Y3:     -             m̂1             m̂2             . . .  m̂_{b−2}                 m̂_{b−1}

Encoding:
  In block j, transmit x1^n(m_j | m_{j−1}) from codebook C_j

Relay encoding:
  At the end of block j, find the unique m̃_j such that (x1^n(m̃_j | m̃_{j−1}), x2^n(m̃_{j−1}), y2^n(j)) ∈ T_ε^(n)
  In block j + 1, transmit x2^n(m̃_j) from codebook C_{j+1}

Decoding:
  At the end of block j + 1, find the unique message m̂_j such that (x2^n(m̂_j), y3^n(j + 1)) ∈ T_ε^(n)

Relay Channel

Analysis of the Probability of Error

We analyze the probability of decoding error for M_j averaged over codebooks
Assume M_{j−1} = M_j = 1, and let M̃_j be the relay's decoded message at the end of block j
The decoder makes an error only if one of the following events occurs:

  Ẽ(j) = {M̃_j ≠ 1}
  E1(j) = {(X2^n(M̃_j), Y3^n(j + 1)) ∉ T_ε^(n)}
  E2(j) = {(X2^n(m_j), Y3^n(j + 1)) ∈ T_ε^(n) for some m_j ≠ M̃_j}

Thus, the probability of error is upper bounded as

  P(E(j)) = P{M̂_j ≠ 1} ≤ P(Ẽ(j)) + P(E1(j)) + P(E2(j))

Following the same steps as in the multihop coding scheme, the last two terms → 0 as n → ∞ if R < I(X2; Y3) − δ(ε)

Relay Channel

To upper bound P(Ẽ(j)) = P{M̃_j ≠ 1}, define

  Ẽ1(j) = {(X1^n(1 | M̃_{j−1}), X2^n(M̃_{j−1}), Y2^n(j)) ∉ T_ε^(n)}
  Ẽ2(j) = {(X1^n(m_j | M̃_{j−1}), X2^n(M̃_{j−1}), Y2^n(j)) ∈ T_ε^(n) for some m_j ≠ 1}

Then

  P(Ẽ(j)) ≤ P(Ẽ(j − 1)) + P(Ẽ1(j) ∩ Ẽ^c(j − 1)) + P(Ẽ2(j))

Consider the second term:

  P(Ẽ1(j) ∩ Ẽ^c(j − 1)) = P{(X1^n(1 | M̃_{j−1}), X2^n(M̃_{j−1}), Y2^n(j)) ∉ T_ε^(n), M̃_{j−1} = 1}
                        ≤ P{(X1^n(1 | 1), X2^n(1), Y2^n(j)) ∉ T_ε^(n) | M̃_{j−1} = 1},

which, by the independence of the codebooks and the LLN, → 0 as n → ∞

By the packing lemma, P(Ẽ2(j)) → 0 as n → ∞ if R < I(X1; Y2 | X2) − δ(ε)

Since M̃_0 = 1, by induction, P(Ẽ(j)) → 0 as n → ∞ for every j ∈ [1 : b − 1]
Thus we have shown that under the given constraints on the rate, P{M̂_j ≠ M_j} → 0 as n → ∞ for every j ∈ [1 : b − 1]

Relay Channel

Decode–Forward Lower Bound

Coherent multihop can be further improved by combining the information received through the direct path with the information from the relay

[Figure: M → X1 → (Y2 : X2) → Y3 → M̂, with a direct path from X1 to Y3]

Decode–Forward Lower Bound (Cover–El Gamal 1979)

  C ≥ max_{p(x1,x2)} min{I(X1, X2; Y3), I(X1; Y2 | X2)}

Tight for a physically degraded DM-RC, i.e., p(y2, y3 | x1, x2) = p(y2 | x1, x2) p(y3 | y2, x2)

Relay Channel

Proof of Achievability (Zeng–Kuhlmann–Buzo 1989)

We use backward decoding (Willems–van der Meulen 1985)

Codebook generation, encoding, relay encoding:
  Same as coherent multihop
  Codebooks: C_j = {(x1^n(m_j | m_{j−1}), x2^n(m_{j−1})) : m_{j−1}, m_j ∈ [1 : 2^{nR}]}, j ∈ [1 : b]

Block:  1             2              3              . . .  b−1                     b
X1:     x1^n(m1|1)    x1^n(m2|m1)    x1^n(m3|m2)    . . .  x1^n(m_{b−1}|m_{b−2})   x1^n(1|m_{b−1})
Y2:     m̃1            m̃2             m̃3             . . .  m̃_{b−1}                 -
X2:     x2^n(1)       x2^n(m̃1)       x2^n(m̃2)       . . .  x2^n(m̃_{b−2})           x2^n(m̃_{b−1})
Y3:     -             m̂1             m̂2             . . .  m̂_{b−2}                 m̂_{b−1}

Decoding:
  Decoding at the receiver is done backwards, after all b blocks are received
  For j = b − 1, . . . , 1, the receiver finds the unique message m̂_j such that (x1^n(m̂_{j+1} | m̂_j), x2^n(m̂_j), y3^n(j + 1)) ∈ T_ε^(n), successively with the initial condition m̂_b = 1

Relay Channel

Analysis of the Probability of Error

We analyze the probability of decoding error for M_j averaged over codebooks
Assume M_j = M_{j+1} = 1
The decoder makes an error only if one or more of the following events occur:

  Ẽ(j) = {M̃_j ≠ 1}
  Ê(j + 1) = {M̂_{j+1} ≠ 1}
  E1(j) = {(X1^n(M̂_{j+1} | M̃_j), X2^n(M̃_j), Y3^n(j + 1)) ∉ T_ε^(n)}
  E2(j) = {(X1^n(M̂_{j+1} | m_j), X2^n(m_j), Y3^n(j + 1)) ∈ T_ε^(n) for some m_j ≠ M̃_j}

Thus, the probability of error is upper bounded as

  P(E(j)) = P{M̂_j ≠ 1} ≤ P(Ẽ(j)) + P(Ê(j + 1)) + P(E1(j) ∩ Ẽ^c(j) ∩ Ê^c(j + 1)) + P(E2(j))

As in the coherent multihop scheme, the first term → 0 as n → ∞ if R < I(X1; Y2 | X2) − δ(ε)

The third term is upper bounded as
$P(E_1(j) \cap \tilde{E}^c(j) \cap \hat{E}^c(j+1)) \le P\{(X_1^n(1 \mid 1), X_2^n(1), Y_3^n(j+1)) \notin T_\epsilon^{(n)} \mid \tilde{M}_j = 1, \hat{M}_{j+1} = 1\}$,
which, by the independence of the codebooks and the LLN, $\to 0$ as $n \to \infty$.

By the same independence and the packing lemma, the fourth term $P(E_2(j)) \to 0$ as $n \to \infty$ if $R < I(X_1, X_2; Y_3) - \delta(\epsilon)$.

Finally, for the second term, since $\hat{M}_b = M_b = 1$, by induction, $P\{\hat{M}_j \ne M_j\} \to 0$ as $n \to \infty$ for every $j \in [1:b-1]$ if the given constraints on the rate are satisfied.
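Collecting the two constraints established above (a recap in our words; this is the standard conclusion of the argument):

```latex
\begin{align*}
R &< I(X_1; Y_2 \mid X_2) - \delta(\epsilon)
    && \text{(relay decoding in each block)}\\
R &< I(X_1, X_2; Y_3) - \delta(\epsilon)
    && \text{(backward decoding at the receiver)}
\end{align*}
```

Since $b-1$ messages are sent over $b$ blocks, the effective rate $(b-1)R/b$ approaches $R$ as $b \to \infty$, and maximizing over $p(x_1, x_2)$ yields the decode-forward lower bound.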

Compress-Forward Lower Bound

In the decode-forward coding scheme, the relay recovers the entire message. If the channel from the sender to the relay is worse than the direct channel to the receiver, this requirement can reduce the rate below that of direct transmission (in which the relay is not used at all).

In the compress-forward coding scheme, the relay instead helps communication by sending a description of its received sequence to the receiver.

[Figure: relay channel; the relay forwards a description $\hat{Y}_2$ of $Y_2$, and the receiver decodes $\hat{M}$ from $Y_3$]

Compress-Forward Lower Bound (Cover–El Gamal 1979, El Gamal–Mohseni–Zahedi 2006):
$C \ge \max_{p(x_1)p(x_2)p(\hat{y}_2 \mid y_2, x_2)} \min\{ I(X_1, X_2; Y_3) - I(Y_2; \hat{Y}_2 \mid X_1, X_2, Y_3),\ I(X_1; \hat{Y}_2, Y_3 \mid X_2) \}$
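As with decode-forward, the bound can be evaluated numerically. The sketch below (our own, not from the tutorial; the channel is the same toy example used earlier, and the test channel $p(\hat{y}_2 \mid y_2, x_2)$ is an arbitrary BSC that ignores $x_2$) computes the two terms for one fixed choice of input and test-channel pmfs.

```python
import numpy as np

def entropy(p, axes):
    """H of the marginal of the joint pmf p on the given (nonempty) axes, in bits."""
    other = tuple(i for i in range(p.ndim) if i not in axes)
    q = p.sum(axis=other) if other else p
    q = q[q > 0]
    return float(-(q * np.log2(q)).sum())

def cmi(p, X, Y, Z=()):
    """Conditional mutual information I(X;Y|Z) = H(X,Z)+H(Y,Z)-H(X,Y,Z)-H(Z)."""
    X, Y, Z = tuple(X), tuple(Y), tuple(Z)
    hz = entropy(p, Z) if Z else 0.0
    return entropy(p, X + Z) + entropy(p, Y + Z) - entropy(p, X + Y + Z) - hz

# Joint pmf over (x1, x2, y2, y3, yh2); axes 0..4. Uniform inputs, the same
# toy channel as before, and test channel Yh2 = Y2 through a BSC(q).
q = 0.1
p = np.zeros((2, 2, 2, 2, 2))
for x1, x2, y2, y3, yh in np.ndindex(*p.shape):
    pch = (0.9 if y2 == x1 else 0.1) * (0.8 if y3 == (x1 ^ x2) else 0.2)
    p[x1, x2, y2, y3, yh] = 0.25 * pch * ((1 - q) if yh == y2 else q)

t1 = cmi(p, (0, 1), (3,)) - cmi(p, (2,), (4,), (0, 1, 3))  # I(X1,X2;Y3) - I(Y2;Yh2|X1,X2,Y3)
t2 = cmi(p, (0,), (3, 4), (1,))                            # I(X1; Yh2,Y3 | X2)
print(f"compress-forward rate for this choice: {min(t1, t2):.3f} bits/transmission")
```

Maximizing over $p(x_1)p(x_2)$ and over the test channel (for example, over the BSC parameter $q$) would give the full lower bound.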

Proof of Achievability

We use block Markov coding, joint typicality encoding, binning, and simultaneous nonunique decoding.

At the end of block $j$, the relay chooses a reconstruction sequence $\hat{y}_2^n(j)$ of the received sequence $y_2^n(j)$.
Since the receiver has the side information $y_3^n(j)$, we use binning to reduce the rate.
The bin index is sent to the receiver in block $j+1$ via $x_2^n(j+1)$.
At the end of block $j+1$, the receiver recovers the bin index, and then recovers $m_j$ and the compression index simultaneously.

Codebook generation:

Fix $p(x_1)p(x_2)p(\hat{y}_2 \mid y_2, x_2)$ that attains the lower bound.
For $j \in [1:b]$, randomly and independently generate $2^{nR}$ sequences $x_1^n(m_j) \sim \prod_{i=1}^n p_{X_1}(x_{1i})$, $m_j \in [1:2^{nR}]$.
Similarly generate $2^{nR_2}$ sequences $x_2^n(l_{j-1}) \sim \prod_{i=1}^n p_{X_2}(x_{2i})$, $l_{j-1} \in [1:2^{nR_2}]$.
For each $l_{j-1} \in [1:2^{nR_2}]$, randomly and conditionally independently generate $2^{n\hat{R}_2}$ sequences $\hat{y}_2^n(k_j \mid l_{j-1}) \sim \prod_{i=1}^n p_{\hat{Y}_2 \mid X_2}(\hat{y}_{2i} \mid x_{2i}(l_{j-1}))$, $k_j \in [1:2^{n\hat{R}_2}]$.
Partition the set $[1:2^{n\hat{R}_2}]$ into $2^{nR_2}$ equal-size bins $\mathcal{B}(l_j)$, $l_j \in [1:2^{nR_2}]$.
Codebooks: $\mathcal{C}_j = \{(x_1^n(m_j), x_2^n(l_{j-1})) : m_j \in [1:2^{nR}],\ l_{j-1} \in [1:2^{nR_2}]\}$, $j \in [1:b]$.
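A small piece of index bookkeeping that may help fix ideas (our own sketch; the block length and rates are toy values): the $2^{n\hat{R}_2}$ compression indices are mapped onto $2^{nR_2}$ equal-size bins.

```python
# Toy illustration of the Wyner-Ziv-style binning: only the bin index of the
# chosen compression index is forwarded; the receiver disambiguates within
# the bin using its side information y3^n.
n, R2_hat, R2 = 10, 0.6, 0.4           # hypothetical block length and rates
K = 1 << round(n * R2_hat)             # number of compression indices (64)
L = 1 << round(n * R2)                 # number of bins (16)
bin_size = K // L                      # indices per bin (4)

def bin_index(k):
    """Bin B(l) containing compression index k, for k in [0, K)."""
    return k // bin_size

print(bin_index(37))                   # compression index 37 lands in bin 9
```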

Block: 1 | 2 | 3 | ... | b-1 | b
$X_1$: $x_1^n(m_1)$ | $x_1^n(m_2)$ | $x_1^n(m_3)$ | ... | $x_1^n(m_{b-1})$ | $x_1^n(1)$
$Y_2$: $\hat{y}_2^n(k_1 \mid 1), l_1$ | $\hat{y}_2^n(k_2 \mid l_1), l_2$ | $\hat{y}_2^n(k_3 \mid l_2), l_3$ | ... | $\hat{y}_2^n(k_{b-1} \mid l_{b-2}), l_{b-1}$ | $\emptyset$
$X_2$: $x_2^n(1)$ | $x_2^n(l_1)$ | $x_2^n(l_2)$ | ... | $x_2^n(l_{b-2})$ | $x_2^n(l_{b-1})$
$Y_3$: $\emptyset$ | $\hat{l}_1, \hat{k}_1, \hat{m}_1$ | $\hat{l}_2, \hat{k}_2, \hat{m}_2$ | ... | $\hat{l}_{b-2}, \hat{k}_{b-2}, \hat{m}_{b-2}$ | $\hat{l}_{b-1}, \hat{k}_{b-1}, \hat{m}_{b-1}$

Encoding: transmit $x_1^n(m_j)$ from codebook $\mathcal{C}_j$.

Relay encoding: at the end of block $j$, find an index $k_j$ such that $(y_2^n(j), \hat{y}_2^n(k_j \mid l_{j-1}), x_2^n(l_{j-1})) \in T_{\epsilon'}^{(n)}$. In block $j+1$, transmit $x_2^n(l_j)$, where $l_j$ is the bin index of $k_j$.

Decoding: at the end of block $j+1$, find the unique $\hat{l}_j$ such that $(x_2^n(\hat{l}_j), y_3^n(j+1)) \in T_\epsilon^{(n)}$. Then find the unique $\hat{m}_j$ such that $(x_1^n(\hat{m}_j), x_2^n(\hat{l}_{j-1}), \hat{y}_2^n(\hat{k}_j \mid \hat{l}_{j-1}), y_3^n(j)) \in T_\epsilon^{(n)}$ for some $\hat{k}_j \in \mathcal{B}(\hat{l}_j)$.

Analysis of the Probability of Error

Assume $M_j = 1$, and let $L_{j-1}$, $L_j$, and $K_j$ denote the indices chosen by the relay. The decoder makes an error only if one or more of the following events occur:
$\tilde{E}_1(j-1) = \{\hat{L}_{j-1} \ne L_{j-1}\}$
$\tilde{E}_1(j) = \{\hat{L}_j \ne L_j\}$
$\tilde{E}_2(j) = \{(X_2^n(L_{j-1}), \hat{Y}_2^n(k_j \mid L_{j-1}), Y_2^n(j)) \notin T_{\epsilon'}^{(n)}$ for all $k_j \in [1:2^{n\hat{R}_2}]\}$
$E_2(j) = \{(X_1^n(1), X_2^n(L_{j-1}), \hat{Y}_2^n(K_j \mid L_{j-1}), Y_3^n(j)) \notin T_\epsilon^{(n)}\}$
$E_3(j) = \{(X_1^n(m_j), X_2^n(L_{j-1}), \hat{Y}_2^n(K_j \mid L_{j-1}), Y_3^n(j)) \in T_\epsilon^{(n)}$ for some $m_j \ne 1\}$
$E_4(j) = \{(X_1^n(m_j), X_2^n(L_{j-1}), \hat{Y}_2^n(k_j \mid L_{j-1}), Y_3^n(j)) \in T_\epsilon^{(n)}$ for some $k_j \in \mathcal{B}(L_j)$, $k_j \ne K_j$, $m_j \ne 1\}$

Thus, the probability of error is upper bounded as
$P(E(j)) = P\{\hat{M}_j \ne 1\} \le P(\tilde{E}_2(j)) + P(\tilde{E}_1(j-1)) + P(\tilde{E}_1(j)) + P(E_2(j) \cap \tilde{E}_2^c(j) \cap \tilde{E}_1^c(j-1)) + P(E_3(j)) + P(E_4(j) \cap \tilde{E}_1^c(j-1) \cap \tilde{E}_1^c(j))$.

By the independence of the codebooks and the covering lemma (with $U \leftarrow X_2$, $X \leftarrow Y_2$, $\hat{X} \leftarrow \hat{Y}_2$), the first term $\to 0$ as $n \to \infty$ if $\hat{R}_2 > I(\hat{Y}_2; Y_2 \mid X_2) + \delta(\epsilon')$.

As in the multihop coding scheme, the next two terms $P\{\hat{L}_{j-1} \ne L_{j-1}\} \to 0$ and $P\{\hat{L}_j \ne L_j\} \to 0$ as $n \to \infty$ if $R_2 < I(X_2; Y_3) - \delta(\epsilon)$.

The fourth term $P((X_1^n(1), X_2^n(L_{j-1}), \hat{Y}_2^n(K_j \mid L_{j-1}), Y_3^n(j)) \notin T_\epsilon^{(n)} \mid \tilde{E}_2^c(j)) \to 0$ by the independence of the codebooks and the conditional typicality lemma.

Covering Lemma

Let $(U, X, \hat{X}) \sim p(u, x, \hat{x})$ and $\epsilon' < \epsilon$. Let $(U^n, X^n) \sim p(u^n, x^n)$ be arbitrarily distributed such that
$\lim_{n \to \infty} P\{(U^n, X^n) \in T_{\epsilon'}^{(n)}(U, X)\} = 1$.
Let $\hat{X}^n(m) \sim \prod_{i=1}^n p_{\hat{X} \mid U}(\hat{x}_i \mid u_i)$, $m \in \mathcal{A}$, where $|\mathcal{A}| \ge 2^{nR}$, be conditionally independent of each other and of $X^n$ given $U^n$.

Then there exists $\delta(\epsilon) \to 0$ as $\epsilon \to 0$ such that
$\lim_{n \to \infty} P\{(U^n, X^n, \hat{X}^n(m)) \notin T_\epsilon^{(n)}$ for all $m \in \mathcal{A}\} = 0$,
if $R > I(X; \hat{X} \mid U) + \delta(\epsilon)$.
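A quick numerical illustration of the lemma (our own toy experiment, with $U = \emptyset$, a Bern(1/2) source, Hamming distortion, and arbitrary parameters): here $R(D) = 1 - H(D) \approx 0.28$ bits, so a codebook at rate 0.15 should rarely cover the source sequence, while one at rate 0.45 should almost always cover it; with $n = 20$ the threshold is soft, but the trend is visible.

```python
import numpy as np

rng = np.random.default_rng(0)
n, D, trials = 20, 0.2, 500
for R in (0.15, 0.45):                       # one rate below and one above R(D)
    M = int(round(2 ** (n * R)))             # codebook size |A| = 2^{nR}
    hits = 0
    for _ in range(trials):
        x = rng.integers(0, 2, n)            # source sequence X^n ~ Bern(1/2)
        cb = rng.integers(0, 2, (M, n))      # independent codewords Xhat^n(m)
        if (np.abs(cb - x).mean(axis=1) <= D).any():  # some codeword within distortion D
            hits += 1
    print(f"R = {R:.2f}: covered in {hits / trials:.0%} of trials")
```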


By the same independence and the packing lemma, $P(E_3(j)) \to 0$ as $n \to \infty$ if $R < I(X_1; X_2, \hat{Y}_2, Y_3) - \delta(\epsilon) = I(X_1; \hat{Y}_2, Y_3 \mid X_2) - \delta(\epsilon)$.

As in Wyner–Ziv coding, the last term is upper bounded by
$P\{(X_1^n(m_j), X_2^n(\hat{L}_{j-1}), \hat{Y}_2^n(\hat{k}_j \mid \hat{L}_{j-1}), Y_3^n(j)) \in T_\epsilon^{(n)}$ for some $\hat{k}_j \in \mathcal{B}(1)$, $m_j \ne 1\}$,
which, by the independence of the codebooks, the joint typicality lemma, and the union bound, $\to 0$ as $n \to \infty$ if $R + \hat{R}_2 - R_2 < I(X_1; Y_3 \mid X_2) + I(\hat{Y}_2; X_1, Y_3 \mid X_2) - \delta(\epsilon)$.
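Spelling out how the constraints combine (a routine algebra step, stated here in our words for completeness): substituting $\hat{R}_2 > I(\hat{Y}_2; Y_2 \mid X_2)$ and $R_2 < I(X_2; Y_3)$ into the last constraint, and dropping the $\delta$ terms for readability,

```latex
\begin{align*}
R &< I(X_1; Y_3 \mid X_2) + I(\hat{Y}_2; X_1, Y_3 \mid X_2) - \hat{R}_2 + R_2\\
  &< I(X_1; Y_3 \mid X_2) + I(\hat{Y}_2; X_1, Y_3 \mid X_2)
     - I(\hat{Y}_2; Y_2 \mid X_2) + I(X_2; Y_3)\\
  &= I(X_1, X_2; Y_3) - I(\hat{Y}_2; Y_2 \mid X_1, X_2, Y_3),
\end{align*}
```

where the last step uses $I(\hat{Y}_2; Y_2 \mid X_2) = I(\hat{Y}_2; X_1, Y_3 \mid X_2) + I(\hat{Y}_2; Y_2 \mid X_1, X_2, Y_3)$, which follows from the Markov chain $\hat{Y}_2 \to (X_2, Y_2) \to (X_1, Y_3)$. Together with the constraint $R < I(X_1; \hat{Y}_2, Y_3 \mid X_2)$ from $E_3(j)$, this yields the compress-forward lower bound.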

Summary

Block Markov coding
Coherent cooperation
Decode-forward
Backward decoding
Compress-forward

Multicast Network

DM Multicast Network (MN)

[Figure: an $N$-node network $p(y_1, \ldots, y_N \mid x_1, \ldots, x_N)$; source node 1 has message $M$, and each destination node $k \in \mathcal{D}$ forms an estimate $\hat{M}_k$]

Assume an $N$-node DM-MN model $(\prod_{j=1}^N \mathcal{X}_j, p(y^N \mid x^N), \prod_{j=1}^N \mathcal{Y}_j)$. The topology of the network is defined through $p(y^N \mid x^N)$.

A $(2^{nR}, n)$ code for the DM-MN:
Message set: $[1:2^{nR}]$
Source encoder: $x_{1i}(m, y_1^{i-1})$, $i \in [1:n]$
Relay encoder $j \in [2:N]$: $x_{ji}(y_j^{i-1})$, $i \in [1:n]$
Decoder $k \in \mathcal{D}$: $\hat{m}_k(y_k^n)$
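To make the causality constraints concrete, here is a schematic interface sketch (our own; the type names are invented and the model is deliberately minimal):

```python
from typing import Callable, List

Symbol = int
SourceEncoder = Callable[[int, List[Symbol]], Symbol]  # x_{1i}(m, y_1^{i-1})
RelayEncoder = Callable[[List[Symbol]], Symbol]        # x_{ji}(y_j^{i-1})
Decoder = Callable[[List[Symbol]], int]                # m_hat_k(y_k^n)

def run_source(enc: SourceEncoder, m: int, y1: List[Symbol], n: int) -> List[Symbol]:
    """Symbol i may depend only on the message and the strictly past outputs;
    the slicing y1[:i] enforces causality in this schematic model."""
    return [enc(m, y1[:i]) for i in range(n)]
```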

Assume $M \sim \mathrm{Unif}[1:2^{nR}]$.

Average probability of error: $P_e^{(n)} = P\{\hat{M}_k \ne M$ for some $k \in \mathcal{D}\}$

$R$ is achievable if there exists a sequence of $(2^{nR}, n)$ codes with $\lim_{n \to \infty} P_e^{(n)} = 0$. The capacity $C$ is the supremum of the achievable rates $R$.

Special cases:
DMC with feedback ($N = 2$, $Y_1 = Y_2$, $X_2 = \emptyset$, and $\mathcal{D} = \{2\}$)
DM-RC ($N = 3$, $X_3 = Y_1 = \emptyset$, and $\mathcal{D} = \{3\}$)
Common-message DM-BC ($X_2 = \cdots = X_N = Y_1 = \emptyset$ and $\mathcal{D} = [2:N]$)
DM unicast network ($\mathcal{D} = \{N\}$)

Network Decode-Forward

Decode-forward for the relay channel can be extended to the multicast network.

[Figure: a line of relays; in block $j$, node 2 decodes $m_j$ while forwarding $m_{j-1}$, node 3 forwards $m_{j-2}$, and so on]

Network Decode-Forward Lower Bound (Xie–Kumar 2005, Kramer–Gastpar–Gupta 2005):
$C \ge \max_{p(x^N)} \min_{k \in [1:N-1]} I(X^k; Y_{k+1} \mid X_{k+1}^N)$

For $N = 3$ and $X_3 = \emptyset$, this reduces to the decode-forward lower bound for the DM-RC.
Tight for a degraded DM-MN, i.e., when $p(y_{k+2}^N \mid x^N, y^{k+1}) = p(y_{k+2}^N \mid x_{k+1}^N, y_{k+1})$.
Holds for any $\mathcal{D} \subseteq [2:N]$.
Can be improved by removing some relay nodes and relabeling the nodes.

Proof of Achievability

We use block Markov coding and sliding window decoding (Carleial 1982). We illustrate this scheme for the DM-RC.

Codebook generation, encoding, and relay encoding: same as before.

Block: 1 | 2 | 3 | ... | b-1 | b
$X_1$: $x_1^n(m_1 \mid 1)$ | $x_1^n(m_2 \mid m_1)$ | $x_1^n(m_3 \mid m_2)$ | ... | $x_1^n(m_{b-1} \mid m_{b-2})$ | $x_1^n(1 \mid m_{b-1})$
$Y_2$: $\tilde{m}_1$ | $\tilde{m}_2$ | $\tilde{m}_3$ | ... | $\tilde{m}_{b-1}$ | $\emptyset$
$X_2$: $x_2^n(1)$ | $x_2^n(\tilde{m}_1)$ | $x_2^n(\tilde{m}_2)$ | ... | $x_2^n(\tilde{m}_{b-2})$ | $x_2^n(\tilde{m}_{b-1})$
$Y_3$: $\emptyset$ | $\hat{m}_1$ | $\hat{m}_2$ | ... | $\hat{m}_{b-2}$ | $\hat{m}_{b-1}$

Decoding:
At the end of block $j+1$, find the unique $\hat{m}_j$ such that $(x_1^n(\hat{m}_j \mid \hat{m}_{j-1}), x_2^n(\hat{m}_{j-1}), y_3^n(j)) \in T_\epsilon^{(n)}$ and $(x_2^n(\hat{m}_j), y_3^n(j+1)) \in T_\epsilon^{(n)}$ simultaneously.
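For contrast with backward decoding, a schematic sketch of the sliding window schedule (our own illustration; the two typicality tests are abstracted into callbacks, and the message-set size is a toy value):

```python
# A toy sketch of sliding window decoding: m_j is recovered at the end of
# block j+1 by intersecting a typicality test on block j (which uses the
# previous estimate) with one on block j+1.
def sliding_window_decode(b, msg_set_size, typ_j, typ_j1):
    """typ_j(j, m, m_prev) stands for (x1^n(m|m_prev), x2^n(m_prev), y3^n(j)) in T;
    typ_j1(j1, m) stands for (x2^n(m), y3^n(j1)) in T."""
    m_hat = {0: 1}                               # m_0 = 1 by convention
    for j in range(1, b):                        # decode m_j once block j+1 is in
        m_hat[j] = next(m for m in range(1, msg_set_size + 1)
                        if typ_j(j, m, m_hat[j - 1]) and typ_j1(j + 1, m))
    return [m_hat[j] for j in range(1, b)]

# toy run: a fake pair of tests that single out message 7 in every block
print(sliding_window_decode(b=5, msg_set_size=16,
                            typ_j=lambda j, m, mp: m == 7,
                            typ_j1=lambda j1, m: m == 7))
```

Unlike backward decoding, the decisions run forward with a delay of only one block, which is what makes the scheme extendable to networks with many hops.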

Analysis of the Probability of Error

Assume that $M_{j-1} = M_j = 1$. The decoder makes an error only if one or more of the following events occur:
$\tilde{E}(j-1) = \{\tilde{M}_{j-1} \ne 1\}$
$\tilde{E}(j) = \{\tilde{M}_j \ne 1\}$
$\hat{E}(j-1) = \{\hat{M}_{j-1} \ne 1\}$
$E_1(j) = \{(X_1^n(\tilde{M}_j \mid \tilde{M}_{j-1}), X_2^n(\tilde{M}_{j-1}), Y_3^n(j)) \notin T_\epsilon^{(n)}$ or $(X_2^n(\tilde{M}_j), Y_3^n(j+1)) \notin T_\epsilon^{(n)}\}$
$E_2(j) = \{(X_1^n(m_j \mid \tilde{M}_{j-1}), X_2^n(\tilde{M}_{j-1}), Y_3^n(j)) \in T_\epsilon^{(n)}$ and $(X_2^n(m_j), Y_3^n(j+1)) \in T_\epsilon^{(n)}$ for some $m_j \ne \tilde{M}_j\}$

Thus, the probability of error is upper bounded as
$P(E(j)) \le P(\tilde{E}(j-1) \cup \tilde{E}(j) \cup \hat{E}(j-1) \cup E_1(j) \cup E_2(j))$
$\le P(\tilde{E}(j-1)) + P(\tilde{E}(j)) + P(\hat{E}(j-1)) + P(E_1(j) \cap \tilde{E}^c(j-1) \cap \tilde{E}^c(j) \cap \hat{E}^c(j-1)) + P(E_2(j) \cap \tilde{E}^c(j))$.

By the independence of the codebooks, the LLN, the packing lemma, and induction, the first four terms tend to zero as $n \to \infty$ if $R < I(X_1; Y_2 \mid X_2) - \delta(\epsilon)$.

For the last term, consider
$P(E_2(j) \cap \tilde{E}^c(j)) = P\{(X_1^n(m_j \mid \tilde{M}_{j-1}), X_2^n(\tilde{M}_{j-1}), Y_3^n(j)) \in T_\epsilon^{(n)}$ and $(X_2^n(m_j), Y_3^n(j+1)) \in T_\epsilon^{(n)}$ for some $m_j \ne 1$, and $\tilde{M}_j = 1\}$
$\le \sum_{m_j \ne 1} P\{(X_1^n(m_j \mid \tilde{M}_{j-1}), X_2^n(\tilde{M}_{j-1}), Y_3^n(j)) \in T_\epsilon^{(n)}, (X_2^n(m_j), Y_3^n(j+1)) \in T_\epsilon^{(n)}$, and $\tilde{M}_j = 1\}$
$\stackrel{(a)}{=} \sum_{m_j \ne 1} P\{(X_1^n(m_j \mid \tilde{M}_{j-1}), X_2^n(\tilde{M}_{j-1}), Y_3^n(j)) \in T_\epsilon^{(n)}, \tilde{M}_j = 1\} \cdot P\{(X_2^n(m_j), Y_3^n(j+1)) \in T_\epsilon^{(n)} \mid \tilde{M}_j = 1\}$
$\le \sum_{m_j \ne 1} P\{(X_1^n(m_j \mid \tilde{M}_{j-1}), X_2^n(\tilde{M}_{j-1}), Y_3^n(j)) \in T_\epsilon^{(n)}\} \cdot P\{(X_2^n(m_j), Y_3^n(j+1)) \in T_\epsilon^{(n)} \mid \tilde{M}_j = 1\}$
$\stackrel{(b)}{\le} 2^{nR}\, 2^{-n(I(X_1; Y_3 \mid X_2) - \delta(\epsilon))}\, 2^{-n(I(X_2; Y_3) - \delta(\epsilon))}$,
which $\to 0$ as $n \to \infty$ if $R < I(X_1; Y_3 \mid X_2) + I(X_2; Y_3) - 2\delta(\epsilon) = I(X_1, X_2; Y_3) - 2\delta(\epsilon)$.

Here (a) follows since the events $\{(X_1^n(m_j \mid \tilde{M}_{j-1}), X_2^n(\tilde{M}_{j-1}), Y_3^n(j)) \in T_\epsilon^{(n)}\}$ and $\{(X_2^n(m_j), Y_3^n(j+1)) \in T_\epsilon^{(n)}\}$ are conditionally independent given $\tilde{M}_j = 1$ for $m_j \ne 1$, and (b) follows by the independence of the codebooks and the joint typicality lemma.

Noisy Network Coding

Compress-forward for the DM-RC can be extended to the DM-MN.

Theorem (Noisy Network Coding Lower Bound):
$C \ge \max \min_{k \in \mathcal{D}} \min_{S : 1 \in S,\, k \notin S} \big[ I(X(S); \hat{Y}(S^c), Y_k \mid X(S^c)) - I(Y(S); \hat{Y}(S) \mid X^N, \hat{Y}(S^c), Y_k) \big]$,
where the maximum is over all $\prod_{k=1}^N p(x_k)\, p(\hat{y}_k \mid y_k, x_k)$ with $\hat{Y}_1 = \emptyset$ by convention, $X(S)$ denotes the inputs in $S$, and $Y(S^c)$ denotes the outputs in $S^c$.

Special cases:
Compress-forward lower bound for the DM-RC ($N = 3$ and $X_3 = \emptyset$)
Network coding theorem for graphical MNs (Ahlswede–Cai–Li–Yeung 2000)
Capacity of deterministic MNs with no interference (Ratnakar–Kramer 2006)
Capacity of wireless erasure MNs (Dana–Gowaikar–Palanki–Hassibi–Effros 2006)
Lower bound for general deterministic MNs (Avestimehr–Diggavi–Tse 2011)

Can be extended to Gaussian networks (giving the best known gap result) and to multiple messages (Lim–Kim–El Gamal–Chung 2011).
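For the graphical special case (Ahlswede–Cai–Li–Yeung 2000), the cut-set structure of the theorem is easy to see numerically. The sketch below (our own toy network with made-up edge capacities) enumerates all cuts $S$ with the source in $S$ and the sink outside.

```python
from itertools import combinations

# Toy graphical multicast network: directed edges with capacities in bits/use.
cap = {(1, 2): 1.0, (1, 3): 2.0, (2, 3): 1.0, (2, 4): 2.0, (3, 4): 1.0}
nodes, source, sink = {1, 2, 3, 4}, 1, 4

best = float("inf")
middle = nodes - {source, sink}
for r in range(len(middle) + 1):
    for extra in combinations(middle, r):
        S = {source} | set(extra)        # cut: source in S, sink outside
        cut = sum(c for (u, v), c in cap.items() if u in S and v not in S)
        best = min(best, cut)
print(f"min-cut (= multicast capacity of this graphical network): {best} bits/use")
```

For general noisy networks the same enumeration applies, with each cut's value replaced by the mutual information difference in the theorem.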

Proof of Achievability

We use several new ideas beyond compress-forward for the DM-RC:
The source node sends the same message $m \in [1:2^{nbR}]$ over all $b$ blocks.
Relay node $j$ sends the index of the compressed version $\hat{Y}_j^n$ of $Y_j^n$ without binning.
Each receiver node performs simultaneous nonunique decoding of the message and the compression indices over all $b$ blocks.

We illustrate this scheme for the DM-RC.

Codebook generation:
Fix $p(x_1)p(x_2)p(\hat{y}_2 \mid y_2, x_2)$ that attains the lower bound.
For each $j \in [1:b]$, randomly and independently generate $2^{nbR}$ sequences $x_1^n(j, m) \sim \prod_{i=1}^n p_{X_1}(x_{1i})$, $m \in [1:2^{nbR}]$.
Randomly and independently generate $2^{nR_2}$ sequences $x_2^n(l_{j-1}) \sim \prod_{i=1}^n p_{X_2}(x_{2i})$, $l_{j-1} \in [1:2^{nR_2}]$.
For each $l_{j-1} \in [1:2^{nR_2}]$, randomly and conditionally independently generate $2^{nR_2}$ sequences $\hat{y}_2^n(l_j \mid l_{j-1}) \sim \prod_{i=1}^n p_{\hat{Y}_2 \mid X_2}(\hat{y}_{2i} \mid x_{2i}(l_{j-1}))$, $l_j \in [1:2^{nR_2}]$.
Codebooks: $\mathcal{C}_j = \{(x_1^n(j, m), x_2^n(l_{j-1}), \hat{y}_2^n(l_j \mid l_{j-1})) : m \in [1:2^{nbR}],\ l_j, l_{j-1} \in [1:2^{nR_2}]\}$, $j \in [1:b]$.

Block: 1 | 2 | 3 | ... | b-1 | b
$X_1$: $x_1^n(1, m)$ | $x_1^n(2, m)$ | $x_1^n(3, m)$ | ... | $x_1^n(b-1, m)$ | $x_1^n(b, m)$
$Y_2$: $\hat{y}_2^n(l_1 \mid 1), l_1$ | $\hat{y}_2^n(l_2 \mid l_1), l_2$ | $\hat{y}_2^n(l_3 \mid l_2), l_3$ | ... | $\hat{y}_2^n(l_{b-1} \mid l_{b-2}), l_{b-1}$ | $\hat{y}_2^n(l_b \mid l_{b-1}), l_b$
$X_2$: $x_2^n(1)$ | $x_2^n(l_1)$ | $x_2^n(l_2)$ | ... | $x_2^n(l_{b-2})$ | $x_2^n(l_{b-1})$
$Y_3$: (the receiver decodes $\hat{m}$ only after all $b$ blocks)

Encoding: to send $m \in [1:2^{nbR}]$, transmit $x_1^n(j, m)$ in block $j$.

Relay encoding: at the end of block $j$, find an index $l_j$ such that $(y_2^n(j), \hat{y}_2^n(l_j \mid l_{j-1}), x_2^n(l_{j-1})) \in T_{\epsilon'}^{(n)}$. In block $j+1$, transmit $x_2^n(l_j)$.

Decoding: at the end of block $b$, find the unique $\hat{m}$ such that $(x_1^n(j, \hat{m}), x_2^n(l_{j-1}), \hat{y}_2^n(l_j \mid l_{j-1}), y_3^n(j)) \in T_\epsilon^{(n)}$ for all $j \in [1:b]$, for some $l_1, l_2, \ldots, l_b$.

Analysis of the Probability of Error

Assume $M = 1$ and $L_1 = L_2 = \cdots = L_b = 1$. The decoder makes an error only if one or more of the following events occur:
$E_1 = \{(Y_2^n(j), \hat{Y}_2^n(l_j \mid 1), X_2^n(1)) \notin T_{\epsilon'}^{(n)}$ for all $l_j$, for some $j \in [1:b]\}$
$E_2 = \{(X_1^n(j, 1), X_2^n(1), \hat{Y}_2^n(1 \mid 1), Y_3^n(j)) \notin T_\epsilon^{(n)}$ for some $j \in [1:b]\}$
$E_3 = \{(X_1^n(j, m), X_2^n(l_{j-1}), \hat{Y}_2^n(l_j \mid l_{j-1}), Y_3^n(j)) \in T_\epsilon^{(n)}$ for all $j$, for some $l^b$ and $m \ne 1\}$

Thus, the probability of error is upper bounded as
$P(E) \le P(E_1) + P(E_2 \cap E_1^c) + P(E_3)$.

By the covering lemma and the union of events bound (over the $b$ blocks), $P(E_1) \to 0$ as $n \to \infty$ if $R_2 > I(\hat{Y}_2; Y_2 \mid X_2) + \delta(\epsilon')$.
By the conditional typicality lemma and the union of events bound, $P(E_2 \cap E_1^c) \to 0$ as $n \to \infty$.

Define $\tilde{E}_j(m, l_{j-1}, l_j) = \{(X_1^n(j, m), X_2^n(l_{j-1}), \hat{Y}_2^n(l_j \mid l_{j-1}), Y_3^n(j)) \in T_\epsilon^{(n)}\}$. Then
$P(E_3) = P\big( \bigcup_{m \ne 1} \bigcup_{l^b} \bigcap_{j=1}^b \tilde{E}_j(m, l_{j-1}, l_j) \big)$
$\le \sum_{m \ne 1} \sum_{l^b} P\big( \bigcap_{j=1}^b \tilde{E}_j(m, l_{j-1}, l_j) \big)$
$= \sum_{m \ne 1} \sum_{l^b} \prod_{j=1}^b P(\tilde{E}_j(m, l_{j-1}, l_j))$
$\le \sum_{m \ne 1} \sum_{l^b} \prod_{j=2}^b P(\tilde{E}_j(m, l_{j-1}, l_j))$.

If $l_{j-1} = 1$, then by the joint typicality lemma, $P(\tilde{E}_j(m, l_{j-1}, l_j)) \le 2^{-n(I_1 - \delta(\epsilon))}$ with $I_1 = I(X_1; \hat{Y}_2, Y_3 \mid X_2)$.
Similarly, if $l_{j-1} \ne 1$, then $P(\tilde{E}_j(m, l_{j-1}, l_j)) \le 2^{-n(I_2 - \delta(\epsilon))}$ with $I_2 = I(X_1, X_2; Y_3) + I(\hat{Y}_2; X_1, Y_3 \mid X_2)$.

Thus, if $l^{b-1}$ has $k$ 1s, then
$\prod_{j=2}^b P(\tilde{E}_j(m, l_{j-1}, l_j)) \le 2^{-n(k I_1 + (b-1-k) I_2 - (b-1)\delta(\epsilon))}$.

Continuing with the bound,
$\sum_{m \ne 1} \sum_{l^b} \prod_{j=2}^b P(\tilde{E}_j(m, l_{j-1}, l_j)) \le \sum_{m \ne 1} \sum_{l_b} \sum_{j=0}^{b-1} \binom{b-1}{j}\, 2^{n(b-1-j)R_2}\, 2^{-n(jI_1 + (b-1-j)I_2 - (b-1)\delta(\epsilon))}$
$= \sum_{m \ne 1} \sum_{l_b} \sum_{j=0}^{b-1} \binom{b-1}{j}\, 2^{-n(jI_1 + (b-1-j)(I_2 - R_2) - (b-1)\delta(\epsilon))}$
$\le \sum_{m \ne 1} \sum_{l_b} \sum_{j=0}^{b-1} \binom{b-1}{j}\, 2^{-n(b-1)(\min\{I_1,\, I_2 - R_2\} - \delta(\epsilon))}$
$\le 2^{nbR}\, 2^{nR_2}\, 2^b\, 2^{-n(b-1)(\min\{I_1,\, I_2 - R_2\} - \delta(\epsilon))}$,

which $\to 0$ as $n \to \infty$ if $R < \big((b-1)(\min\{I_1, I_2 - R_2\} - \delta(\epsilon)) - R_2\big)/b$.

Finally, by eliminating $R_2 > I(\hat{Y}_2; Y_2 \mid X_2) + \delta(\epsilon')$, substituting $I_1$ and $I_2$, and taking $b \to \infty$, we have shown that $P(E) \to 0$ as $n \to \infty$ if
$R < \min\{ I(X_1; \hat{Y}_2, Y_3 \mid X_2),\ I(X_1, X_2; Y_3) - I(\hat{Y}_2; Y_2 \mid X_1, X_2, Y_3) \} - \delta(\epsilon) - \delta(\epsilon')$.
This completes the proof of achievability for noisy network coding.
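The last two steps, written out (a recap in our notation):

```latex
\begin{align*}
R &< \frac{(b-1)\bigl(\min\{I_1,\, I_2 - R_2\} - \delta(\epsilon)\bigr) - R_2}{b}
  \;\longrightarrow\; \min\{I_1,\, I_2 - R_2\} - \delta(\epsilon)
  \quad \text{as } b \to \infty,
\end{align*}
and with $R_2 = I(\hat{Y}_2; Y_2 \mid X_2) + \delta(\epsilon')$,
\begin{align*}
I_2 - R_2 &= I(X_1, X_2; Y_3) + I(\hat{Y}_2; X_1, Y_3 \mid X_2)
            - I(\hat{Y}_2; Y_2 \mid X_2) - \delta(\epsilon')\\
          &= I(X_1, X_2; Y_3) - I(\hat{Y}_2; Y_2 \mid X_1, X_2, Y_3) - \delta(\epsilon'),
\end{align*}
```

where the second line again uses the Markov chain $\hat{Y}_2 \to (X_2, Y_2) \to (X_1, Y_3)$; this gives the stated rate.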

Summary

Network decode-forward
Sliding window decoding
Noisy network coding
Sending the same message multiple times using independent codebooks
Beyond the packing lemma

Conclusion

Presented a unified approach to achievability proofs for DM networks:
Typicality and elementary lemmas
Coding techniques: random coding, joint typicality encoding/decoding, simultaneous (nonunique) decoding, superposition coding, binning, multicoding

Results can be extended to Gaussian models via discretization procedures. Lossless source coding is a corollary of lossy source coding.

Network Information Theory book:
Comprehensive coverage of this approach
More advanced coding techniques and analysis tools
Converse techniques (DM and Gaussian)
Open problems

Although the theory is far from complete, we hope that our approach will:
Make the subject accessible to students, researchers, and communication engineers
Help in the quest for a unified theory of information flow in networks

References

Ahlswede, R. (1971). Multiway communication channels. In Proc. 2nd Int. Symp. Inf. Theory, Tsahkadsor, Armenian SSR, pp. 23–52.
Ahlswede, R., Cai, N., Li, S.-Y. R., and Yeung, R. W. (2000). Network information flow. IEEE Trans. Inf. Theory, 46(4), 1204–1216.
Avestimehr, A. S., Diggavi, S. N., and Tse, D. N. C. (2011). Wireless network information flow: A deterministic approach. IEEE Trans. Inf. Theory, 57(4), 1872–1905.
Bergmans, P. P. (1973). Random coding theorem for broadcast channels with degraded components. IEEE Trans. Inf. Theory, 19(2), 197–207.
Carleial, A. B. (1982). Multiple-access channels with different generalized feedback signals. IEEE Trans. Inf. Theory, 28(6), 841–850.
Costa, M. H. M. (1983). Writing on dirty paper. IEEE Trans. Inf. Theory, 29(3), 439–441.
Cover, T. M. (1972). Broadcast channels. IEEE Trans. Inf. Theory, 18(1), 2–14.
Cover, T. M. and El Gamal, A. (1979). Capacity theorems for the relay channel. IEEE Trans. Inf. Theory, 25(5), 572–584.
Csiszár, I. and Körner, J. (1978). Broadcast channels with confidential messages. IEEE Trans. Inf. Theory, 24(3), 339–348.
Dana, A. F., Gowaikar, R., Palanki, R., Hassibi, B., and Effros, M. (2006). Capacity of wireless erasure networks. IEEE Trans. Inf. Theory, 52(3), 789–804.
El Gamal, A., Mohseni, M., and Zahedi, S. (2006). Bounds on capacity and minimum energy-per-bit for AWGN relay channels. IEEE Trans. Inf. Theory, 52(4), 1545–1561.
Elias, P., Feinstein, A., and Shannon, C. E. (1956). A note on the maximum flow through a network. IRE Trans. Inf. Theory, 2(4), 117–119.
Ford, L. R., Jr. and Fulkerson, D. R. (1956). Maximal flow through a network. Canad. J. Math., 8(3), 399–404.
Gelfand, S. I. and Pinsker, M. S. (1980). Coding for channel with random parameters. Probl. Control Inf. Theory, 9(1), 19–31.
Han, T. S. and Kobayashi, K. (1981). A new achievable rate region for the interference channel. IEEE Trans. Inf. Theory, 27(1), 49–60.
Heegard, C. and El Gamal, A. (1983). On the capacity of computer memories with defects. IEEE Trans. Inf. Theory, 29(5), 731–739.
Kramer, G., Gastpar, M., and Gupta, P. (2005). Cooperative strategies and capacity theorems for relay networks. IEEE Trans. Inf. Theory, 51(9), 3037–3063.
Liao, H. H. J. (1972). Multiple access channels. Ph.D. thesis, University of Hawaii, Honolulu, HI.
Lim, S. H., Kim, Y.-H., El Gamal, A., and Chung, S.-Y. (2011). Noisy network coding. IEEE Trans. Inf. Theory, 57(5), 3132–3152.
McEliece, R. J. (1977). The Theory of Information and Coding. Addison-Wesley, Reading, MA.
Orlitsky, A. and Roche, J. R. (2001). Coding for computing. IEEE Trans. Inf. Theory, 47(3), 903–917.
Ratnakar, N. and Kramer, G. (2006). The multicast capacity of deterministic relay networks with no interference. IEEE Trans. Inf. Theory, 52(6), 2425–2432.
Shannon, C. E. (1948). A mathematical theory of communication. Bell Syst. Tech. J., 27(3), 379–423 and 27(4), 623–656.
Shannon, C. E. (1959). Coding theorems for a discrete source with a fidelity criterion. In IRE Int. Conv. Rec., vol. 7, part 4, pp. 142–163. Reprinted with changes (1960) in R. E. Machol (ed.), Information and Decision Processes, pp. 93–126. McGraw-Hill, New York.
Shannon, C. E. (1961). Two-way communication channels. In Proc. 4th Berkeley Symp. Math. Statist. Probab., vol. I, pp. 611–644. University of California Press, Berkeley.
Slepian, D. and Wolf, J. K. (1973a). Noiseless coding of correlated information sources. IEEE Trans. Inf. Theory, 19(4), 471–480.
Slepian, D. and Wolf, J. K. (1973b). A coding theorem for multiple access channels with correlated sources. Bell Syst. Tech. J., 52(7), 1037–1076.
Willems, F. M. J. and van der Meulen, E. C. (1985). The discrete memoryless multiple-access channel with cribbing encoders. IEEE Trans. Inf. Theory, 31(3), 313–327.
Wyner, A. D. (1975). The wire-tap channel. Bell Syst. Tech. J., 54(8), 1355–1387.
Wyner, A. D. and Ziv, J. (1976). The rate–distortion function for source coding with side information at the decoder. IEEE Trans. Inf. Theory, 22(1), 1–10.
Xie, L.-L. and Kumar, P. R. (2005). An achievable rate for the multiple-level relay channel. IEEE Trans. Inf. Theory, 51(4), 1348–1358.
Zeng, C.-M., Kuhlmann, F., and Buzo, A. (1989). Achievability proof of some multiuser channel coding theorems using backward decoding. IEEE Trans. Inf. Theory, 35(6), 1160–1165.
