
Code Length and Source Coding Theorem

September 11, 2013

• Source Coding

The conversion of the output of a Discrete Memoryless Source (DMS) into a sequence of binary symbols is called source coding.

The aim of source coding is to minimize the average bit rate required to represent the source, by reducing the redundancy of the information source.

• Average Code Length

For a DMS with alphabet {x_1, x_2, ..., x_m}, corresponding probabilities of occurrence {p_1, p_2, ..., p_m}, and code lengths {l_1, l_2, ..., l_m},

the average code length L per source symbol is

L = ∑_{i=1}^{m} p_i l_i
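
For concreteness, here is a minimal Python sketch of this formula; the probabilities and codeword lengths are made-up example values, not taken from these notes:

    # Average code length: L = sum over i of p_i * l_i
    probs = [0.5, 0.25, 0.125, 0.125]   # example symbol probabilities p_i (assumed)
    lengths = [1, 2, 3, 3]              # example codeword lengths l_i (assumed)

    L = sum(p * l for p, l in zip(probs, lengths))
    print(L)  # 1.75 bits per source symbol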

• Source Coding Theorem

For a DMS with finite entropy H, the average code length L per source symbol has a lower bound

L ≥ H

i.e.

L_min = H

• Code Efficiency η and Code Redundancy γ

η = L_min / L = H / L

γ = 1 − η = 1 − H / L
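
A short Python sketch tying these definitions together, reusing the made-up example source from above; that example code turns out to be optimal because l_i = log₂(1/p_i) for every symbol:

    import math

    # Entropy H, average length L, code efficiency and code redundancy
    # for the same made-up example source as above.
    probs = [0.5, 0.25, 0.125, 0.125]
    lengths = [1, 2, 3, 3]

    H = sum(p * math.log2(1 / p) for p in probs)    # entropy in bits/symbol
    L = sum(p * l for p, l in zip(probs, lengths))  # average code length

    eta = H / L        # code efficiency
    gamma = 1 - eta    # code redundancy
    print(H, L, eta, gamma)  # 1.75 1.75 1.0 0.0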

• Entropy Bound

For m symbols x_i with occurrence probabilities p_i, the entropy is

H = ∑ p_i I_i = ∑ p_i log₂(1/p_i)

where I_i = log₂(1/p_i) is the self-information of x_i. This entropy H has both an upper bound and a lower bound:

0 ≤ H ≤ log₂ m
Proof of the left-hand inequality 0 ≤ H:

p_i ∈ [0, 1]

⇒ 1/p_i ≥ 1

⇒ log₂(1/p_i) ≥ 0

⇒ p_i log₂(1/p_i) ≥ 0

⇒ ∑ p_i log₂(1/p_i) ≥ 0

⇒ H ≥ 0

Proof of the right-hand inequality H ≤ log₂ m:

Consider the inequality (easily proved by differentiation: f(x) = x − 1 − ln x satisfies f(1) = 0 and has its global minimum at x = 1)

ln x ≤ x − 1

Consider two probability distributions {p_i} and {q_i} on {x_i}. By the axioms of probability, each distribution must sum to 1:

∑ q_i = ∑ p_i = 1

Converting the logarithm to base e,

∑ p_i log₂(q_i/p_i) = ∑ p_i ln(q_i/p_i)/ln 2 = (1/ln 2) ∑ p_i ln(q_i/p_i)

Using the inequality ln x ≤ x − 1 with x = q_i/p_i,

∑ p_i log₂(q_i/p_i) = (1/ln 2) ∑ p_i ln(q_i/p_i)
                    ≤ (1/ln 2) ∑ p_i (q_i/p_i − 1)
                    = (1/ln 2) ∑ (q_i − p_i)
                    = (1/ln 2) (∑ q_i − ∑ p_i)
                    = 0

where the last step uses the axiom of probability distribution, ∑ q_i = ∑ p_i = 1.

Therefore

∑ p_i log₂(q_i/p_i) ≤ 0
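
This is Gibbs' inequality. A quick numeric sanity check in Python, with two arbitrary example distributions (values assumed):

    import math

    # Gibbs' inequality: sum(p_i * log2(q_i / p_i)) <= 0 for any two
    # probability distributions {p_i}, {q_i}.
    p = [0.5, 0.3, 0.2]
    q = [0.2, 0.4, 0.4]

    gibbs = sum(pi * math.log2(qi / pi) for pi, qi in zip(p, q))
    print(gibbs)  # about -0.34; it equals 0 only when q_i == p_i for all i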
Let

q_i = 1/m    (the equiprobable distribution over m messages)

Then

∑ p_i log₂(q_i/p_i) ≤ 0

⇔ ∑ p_i log₂(1/(p_i m)) ≤ 0

⇔ ∑ p_i log₂(1/p_i) − ∑ p_i log₂ m ≤ 0

⇔ ∑ p_i log₂(1/p_i) − log₂ m ∑ p_i ≤ 0

The first sum is H and ∑ p_i = 1, so this reduces to

H ≤ log₂ m

Thus

0 ≤ H ≤ log₂ m
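
A short Python check of both bounds on an example distribution (values assumed); the upper bound is attained only when all symbols are equiprobable:

    import math

    # Entropy bound check: 0 <= H <= log2(m) for an m-symbol source.
    probs = [0.5, 0.25, 0.125, 0.125]   # example distribution (assumed)
    m = len(probs)

    H = sum(p * math.log2(1 / p) for p in probs)
    print(H, math.log2(m))          # 1.75 2.0
    print(0 <= H <= math.log2(m))   # True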

• Source Coding Theorem

The entropy H is the optimal lower bound of the average code length:

L ≥ H

i.e., when the coding is optimal (using the shortest possible average amount of code to represent the information),

L_min = H

Consider the inequality

∑ p_i log₂(q_i/p_i) ≤ 0

and let

q_i = (1/2^{l_i}) / ∑_{i=1}^{m} (1/2^{l_i})

Notice that {q_i} is a valid probability distribution:

∑ q_i = ∑_{i=1}^{m} (1/2^{l_i}) / ∑_{i=1}^{m} (1/2^{l_i}) = 1
And thus

∑ p_i log₂(q_i/p_i) ≤ 0

⇔ ∑ p_i log₂( (1/2^{l_i}) / (p_i ∑_{i=1}^{m} 1/2^{l_i}) ) ≤ 0

⇔ ∑ p_i ( log₂(1/p_i) + log₂(1/2^{l_i}) − log₂ ∑_{i=1}^{m} 1/2^{l_i} ) ≤ 0

Since log₂(1/2^{l_i}) = −l_i,

⇔ ∑ p_i log₂(1/p_i) − ∑ p_i l_i − log₂( ∑_{i=1}^{m} 1/2^{l_i} ) ∑ p_i ≤ 0

The first sum is H, the second is L, and ∑ p_i = 1, so

⇔ H − L − log₂ ∑_{i=1}^{m} 1/2^{l_i} ≤ 0

Using the Kraft inequality

∑_{i=1}^{m} 1/2^{l_i} ≤ 1

we thus have

log₂ ∑_{i=1}^{m} 1/2^{l_i} ≤ log₂ 1 = 0
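
A one-line Python check of the Kraft inequality for a set of binary codeword lengths (example lengths assumed); by Kraft's theorem, a binary prefix code with the given lengths exists exactly when this sum is at most 1:

    # Kraft inequality: sum over i of 2^(-l_i) must not exceed 1
    lengths = [1, 2, 3, 3]            # example codeword lengths (assumed)
    kraft = sum(2 ** -l for l in lengths)
    print(kraft, kraft <= 1)          # 1.0 True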
Therefore

H − L ≤ log₂ ∑_{i=1}^{m} 1/2^{l_i} ≤ 0

Thus

L ≥ H

Equality holds when l_i = log₂(1/p_i) for every i, i.e. for an optimal code

L_min = H
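
As a closing check, a small Python sketch verifying L ≥ H for Shannon code lengths l_i = ⌈log₂(1/p_i)⌉, a standard length assignment that always satisfies the Kraft inequality (this construction is not in the notes, and the probabilities are assumed example values):

    import math

    # Verify L >= H using Shannon code lengths l_i = ceil(log2(1/p_i)).
    probs = [0.4, 0.3, 0.2, 0.1]                       # example p_i (assumed)
    lengths = [math.ceil(math.log2(1 / p)) for p in probs]

    H = sum(p * math.log2(1 / p) for p in probs)
    L = sum(p * l for p, l in zip(probs, lengths))
    print(lengths)        # [2, 2, 3, 4]
    print(H, L, L >= H)   # about 1.846, 2.4, True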

−EN D−
