Source Coding - TELECOMMUNICATION SYSTEMS


DIGITAL COMMUNICATIONS

Chapter 3: SOURCE CODING

Lectured by Assoc. Prof. Thuong Le-Tien


August 2014

1. Mathematical model for information sources

Assume that each letter in the alphabet {x1, x2, ..., xL} has a given probability pk of occurrence:

    pk = P(X = xk),   1 ≤ k ≤ L

where

    Σ_{k=1}^{L} pk = 1

Two mathematical models are the DMS (Discrete Memoryless Source) and the statistically dependent source.

2. Logarithmic measure of information

A suitable measure of the information that the event Y = yj provides about the event X = xi is the logarithm of the ratio of the conditional probability

    P(X = xi | Y = yj) = P(xi | yj)

to the probability

    P(X = xi) = P(xi)

That is, the information content provided by the occurrence of the event Y = yj about the event X = xi is defined as

    I(xi; yj) = log [ P(xi | yj) / P(xi) ]                                   (2-1)

I(xi; yj) is called the mutual information between xi and yj.

When X and Y are statistically independent, the occurrence of Y = yj provides no information about the event X = xi: P(xi | yj) = P(xi) and hence I(xi; yj) = 0. On the other hand, when the occurrence of the event Y = yj uniquely determines the occurrence of the event X = xi, the conditional probability in the numerator of (2-1) is unity and hence

    I(xi; yj) = log [ 1 / P(xi) ] = -log P(xi)                               (2-2)

But (2-2) is just the information of the event X = xi. For this reason, it is called the self-information of the event X = xi, and it is denoted as

    I(xi) = log [ 1 / P(xi) ] = -log P(xi)                                   (2-3)
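
As a quick illustration (not part of the original slides), the following Python snippet evaluates (2-1) and (2-3) for some assumed probability values.

```python
import math

def self_information(p, base=2):
    """Self-information I(x) = -log P(x), in bits for base 2 (Eq. 2-3)."""
    return -math.log(p, base)

def event_mutual_information(p_x_given_y, p_x, base=2):
    """Mutual information I(x; y) = log[ P(x|y) / P(x) ] (Eq. 2-1)."""
    return math.log(p_x_given_y / p_x, base)

# Illustrative numbers (not from the lecture): an event with P(x) = 1/8
print(self_information(1/8))                 # 3.0 bits
# If observing y raises P(x|y) from 1/8 to 1/2, y provides 2 bits about x
print(event_mutual_information(0.5, 1/8))    # 2.0 bits
# Independence: P(x|y) = P(x)  ->  zero mutual information
print(event_mutual_information(1/8, 1/8))    # 0.0 bits
```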

Average mutual information and entropy

The average mutual information between X and Y is

    I(X; Y) = Σ_i Σ_j P(xi, yj) I(xi; yj)
            = Σ_i Σ_j P(xi, yj) log [ P(xi, yj) / ( P(xi) P(yj) ) ]          (2-4)

The average self-information, denoted by H(X), is

    H(X) = Σ_{i=1}^{n} P(xi) I(xi) = -Σ_{i=1}^{n} P(xi) log P(xi)            (2-5)
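
The Python sketch below computes H(X) from (2-5) and I(X; Y) from (2-4) for an assumed joint pmf; the matrix values are illustrative, not from the lecture.

```python
import numpy as np

def entropy(p):
    """H(X) = -sum p(x) log2 p(x)  (Eq. 2-5)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def average_mutual_information(p_xy):
    """I(X;Y) = sum_ij P(xi,yj) log2[ P(xi,yj) / (P(xi) P(yj)) ]  (Eq. 2-4)."""
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x @ p_y)[mask]))

# Illustrative joint pmf (assumed values): a binary pair with some dependence
p_xy = np.array([[0.45, 0.05],
                 [0.05, 0.45]])
print(entropy(p_xy.sum(axis=1)))            # H(X) = 1 bit
print(average_mutual_information(p_xy))     # about 0.53 bits
```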

Information measures for continuous random variables

    I(X; Y) = ∫∫ p(x) p(y|x) log [ p(y|x) p(x) / ( p(x) p(y) ) ] dx dy       (2-6)

    H(X) = -∫ p(x) log p(x) dx                                               (2-7)

    H(X | Y) = -∫∫ p(x, y) log p(x|y) dx dy                                  (2-8)

The average mutual information may be expressed as

    I(X; Y) = H(X) - H(X | Y)

or, alternatively, as

    I(X; Y) = H(Y) - H(Y | X)

The mutual information provided about the event X = xi by the occurrence of the event Y = y is

    I(xi; y) = log [ p(y | xi) P(xi) / ( p(y) P(xi) ) ] = log [ p(y | xi) / p(y) ]       (2-9)

Then, the average mutual information between X and Y is

    I(X; Y) = Σ_{i=1}^{n} ∫ p(y | xi) P(xi) log [ p(y | xi) / p(y) ] dy                  (2-10)

Example

Suppose that X is a discrete random variable with two equally probable outcomes x1 = A and x2 = -A. Let the conditional pdfs p(y | xi), i = 1, 2, be Gaussian with mean xi and variance σ². That is,

    p(y | A)  = (1 / √(2πσ²)) exp[ -(y - A)² / (2σ²) ]
    p(y | -A) = (1 / √(2πσ²)) exp[ -(y + A)² / (2σ²) ]                       (2-11)

The average mutual information obtained from (2-10) becomes

    I(X; Y) = (1/2) ∫ [ p(y | A) log( p(y | A) / p(y) )
                      + p(y | -A) log( p(y | -A) / p(y) ) ] dy               (2-12)

where

    p(y) = (1/2) [ p(y | A) + p(y | -A) ]                                    (2-13)

The average mutual information I(X; Y) given by (2-12) represents the channel capacity of a binary-input additive white Gaussian noise channel.
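
As a numerical check (not part of the slides), the Python sketch below evaluates the integral in (2-12) on a fine grid; the values A = σ = 1 and the function name are assumptions chosen for illustration.

```python
import numpy as np

def binary_input_awgn_capacity(A=1.0, sigma=1.0):
    """Numerically evaluate I(X;Y) in Eq. (2-12) for equiprobable inputs +A, -A
    observed in Gaussian noise of variance sigma**2."""
    y = np.linspace(-A - 10 * sigma, A + 10 * sigma, 20001)
    gauss = lambda m: np.exp(-(y - m) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)
    p_pos, p_neg = gauss(A), gauss(-A)
    p_y = 0.5 * (p_pos + p_neg)                                  # Eq. (2-13)
    integrand = 0.5 * (p_pos * np.log2(p_pos / p_y) + p_neg * np.log2(p_neg / p_y))
    return np.sum(integrand) * (y[1] - y[0])                      # Riemann sum

# roughly 0.49 bits per channel use at A/sigma = 1
print(binary_input_awgn_capacity(A=1.0, sigma=1.0))
```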

3. Coding for discrete sources

* Coding for DMS sources:

Fixed-Length Code Words. First, we consider a block encoding scheme that assigns a unique set of R binary digits to each symbol. Since there are L possible symbols, the number of binary digits per symbol required for unique encoding when L is a power of 2 is

    R = log2 L                                                               (3-1)

and, when L is not a power of 2, it is

    R = ⌊log2 L⌋ + 1                                                         (3-2)

where ⌊x⌋ denotes the largest integer less than x. The code rate in bits per symbol is R and, since H(X) ≤ log2 L, it follows that R ≥ H(X).
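
In other words, (3-1) and (3-2) together give R = ⌈log2 L⌉, as in this small Python sketch (the alphabet sizes are chosen arbitrarily for illustration):

```python
import math

def fixed_length_bits(L):
    """Bits per symbol for a fixed-length binary code over L symbols
    (Eqs. 3-1 and 3-2), i.e. R = ceil(log2 L)."""
    return math.ceil(math.log2(L))

for L in (2, 8, 26, 100):            # illustrative alphabet sizes
    print(L, fixed_length_bits(L))   # 2 -> 1, 8 -> 3, 26 -> 5, 100 -> 7
```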

Source Coding Theorem I

Let X be the ensemble of letters from a DMS with finite entropy H(X). Blocks of J symbols from the source are encoded into code words of length N from a binary alphabet. For any ε > 0, the probability Pe of a block decoding failure can be made arbitrarily small if

    R = N / J ≥ H(X) + ε                                                     (3-3)

and J is sufficiently large. Conversely, if

    R ≤ H(X) - ε                                                             (3-4)

then Pe becomes arbitrarily close to 1 as J is made sufficiently large.

(Figures: code tree for code II in the table; code tree for code III in the table.)

Kraft Inequality. A necessary and sufficient condition for the existence of a binary code with code words having lengths n1 ≤ n2 ≤ ... ≤ nL that satisfy the prefix condition is

    Σ_{k=1}^{L} 2^(-nk) ≤ 1                                                  (3-5)
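
A direct check of (3-5) in Python (the code-word length sets are illustrative):

```python
def satisfies_kraft(lengths):
    """Check the Kraft inequality sum_k 2**(-n_k) <= 1 (Eq. 3-5)."""
    return sum(2 ** (-n) for n in lengths) <= 1

print(satisfies_kraft([1, 2, 3, 3]))   # True: 1/2 + 1/4 + 1/8 + 1/8 = 1
print(satisfies_kraft([1, 1, 2]))      # False: no prefix code has these lengths
```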

Source Coding Theorem II

Let X be the ensemble of letters from a DMS with finite entropy H(X) and input letters xk, 1 ≤ k ≤ L, with corresponding probabilities of occurrence pk, 1 ≤ k ≤ L. It is possible to construct a code that satisfies the prefix condition and has an average length R̄ that satisfies the inequalities

    H(X) ≤ R̄ < H(X) + 1                                                      (3-6)

Huffman Coding Algorithm

(Figure: an example of variable-length source coding for a DMS.)

(Figure: Huffman encoding example.)
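
Since the Huffman tables themselves did not survive extraction, here is a minimal heap-based Huffman sketch in Python; the symbol probabilities are assumptions, not the lecture's table, and the printed average length can be checked against the bound (3-6).

```python
import heapq
from math import log2

def huffman_code(probs):
    """Build a binary Huffman code for {symbol: probability} pairs.
    Returns a dict mapping each symbol to its codeword string."""
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)                       # tie-breaker for equal probabilities
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)       # two least probable nodes
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

# Illustrative DMS (assumed probabilities)
probs = {"x1": 0.35, "x2": 0.30, "x3": 0.20, "x4": 0.10, "x5": 0.05}
code = huffman_code(probs)
avg_len = sum(p * len(code[s]) for s, p in probs.items())
H = -sum(p * log2(p) for p in probs.values())
print(code)
print(f"H(X) = {H:.3f} bits, average length = {avg_len:.3f} bits")  # satisfies (3-6)
```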

Lempel-Ziv Algorithm

The Huffman coding algorithm yields optimal source codes, in the sense that the code words satisfy the prefix condition and the average code length is minimum, but it requires knowledge of the probabilities of occurrence of all the source letters. The Lempel-Ziv source coding algorithm is designed to be independent of the source statistics. Lempel-Ziv belongs to the class of universal source coding algorithms. It is a variable-to-fixed length algorithm. The Lempel-Ziv algorithm is widely used in the compression of computer files.

Lempel-Ziv Algorithm

The sequence from the discrete source is parsed into variable-length blocks, called phrases. A new phrase is introduced every time a block of letters from the source differs from a previous phrase in the last letter. The phrases are listed in a dictionary, which stores the location of the existing phrases. In encoding a new phrase, we simply specify the location of the existing phrase in the dictionary and append the new letter.

Consider the binary sequence:

    10101101001001110101000011001110101100011011

Parsing the sequence as described above produces the following phrases:

    1, 0, 10, 11, 01, 00, 100, 111, 010, 1000, 011, 001, 110, 101, 10001, 1011
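
A minimal Python sketch of this parsing rule (not part of the lecture), reproducing the phrase list above:

```python
def lz_parse(bits):
    """Parse a binary string into Lempel-Ziv phrases: each new phrase differs
    from an earlier phrase (its root) only in the last letter.
    Any incomplete trailing phrase would simply be dropped in this sketch."""
    phrases, seen, current = [], set(), ""
    for b in bits:
        current += b
        if current not in seen:        # a new phrase has been found
            seen.add(current)
            phrases.append(current)
            current = ""
    return phrases

seq = "10101101001001110101000011001110101100011011"
print(lz_parse(seq))
# ['1', '0', '10', '11', '01', '00', '100', '111', '010', '1000',
#  '011', '001', '110', '101', '10001', '1011']
```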

(Figure: dictionary for the Lempel-Ziv algorithm.)

LEMPEL-ZIV DECODER

The decoder is just as simple as the encoder. Specifically, it uses the pointer to identify the root subsequence and then appends the innovation symbol. Consider, for example, the binary encoded block 01010 in position 9. The last bit, 0, is the innovation symbol. The remaining bits, 0101, point to the root subsequence 01 in position 5. Hence, the block 01010 is decoded into 010, which is correct.
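
A decoder sketch in Python. It assumes 4-bit pointers for phrase positions 1-16 and an all-zero pointer (an empty root phrase) for the single-letter phrases; the encoded blocks below are reconstructed from the parsing above under those assumptions, with block 9 (01010) being the one given in the slide.

```python
def lz_decode(blocks, pointer_bits=4):
    """Decode fixed-length Lempel-Ziv blocks of the form
    (pointer to root phrase, innovation bit).
    Position 0 (pointer '0000') is taken to be the empty root phrase."""
    dictionary = [""]                        # position 0: empty phrase
    out = []
    for block in blocks:
        pointer, innovation = block[:pointer_bits], block[pointer_bits:]
        phrase = dictionary[int(pointer, 2)] + innovation
        dictionary.append(phrase)            # this phrase occupies the next position
        out.append(phrase)
    return "".join(out)

# Blocks for the phrases 1, 0, 10, 11, 01, ... in positions 1-16;
# e.g. '01010' in position 9 = root '01' at position 5 ('0101') + innovation 0 -> 010
blocks = ["00001", "00000", "00010", "00011", "00101", "00100",
          "00110", "01001", "01010", "01110", "01011", "01101",
          "01000", "00111", "10101", "11101"]
print(lz_decode(blocks))   # prints the original 44-bit sequence
```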

4. Coding for analog sources

Optimum quantization

An analog source emits a message waveform x(t) that is a sample function of a stochastic process. The sampling theorem allows us to represent x(t) by a sequence of uniform samples taken at the Nyquist rate.

Quantization of the amplitudes of the sampled signal results in data compression, but it also introduces some distortion of the waveform, or a loss of signal fidelity. The minimization of this distortion is considered in this section.

4.1 Rate-Distortion function

The distortion measure used here is the squared-error distortion, defined as

    d(xk, x̃k) = (xk - x̃k)²                                                  (4-1)

which is used to characterize the quantization error in PCM in Section 3-5-1. Other distortion measures may take the general form

    d(xk, x̃k) = |xk - x̃k|^p                                                 (4-2)

where p takes values from the set of positive integers. The case p = 2 has the advantage of being mathematically tractable.

If d(xk, x̃k) is the distortion measure per letter, the distortion between a sequence of n samples Xn and the corresponding n quantized values X̃n is the average over the n source output samples, i.e.

    d(Xn, X̃n) = (1/n) Σ_{k=1}^{n} d(xk, x̃k)                                  (4-3)
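
A direct computation of (4-3) in Python; the sample values and their quantized versions are assumed for illustration.

```python
import numpy as np

def average_distortion(x, x_quantized, p=2):
    """Per-letter distortion |x - x~|**p averaged over n samples (Eqs. 4-2, 4-3);
    p = 2 gives the squared-error distortion of Eq. (4-1)."""
    x, xq = np.asarray(x, float), np.asarray(x_quantized, float)
    return np.mean(np.abs(x - xq) ** p)

# Illustrative samples and their quantized values (assumed numbers)
x  = [0.12, -0.73, 0.40, 1.05]
xq = [0.0,  -0.75, 0.50, 1.00]
print(average_distortion(x, xq))   # mean squared error over the block
```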

The source output is a random process, and hence the n samples in Xn are random variables. Therefore, d(Xn, X̃n) is a random variable. Its expected value is defined as the distortion D, i.e.

    D = E[ d(Xn, X̃n) ] = (1/n) Σ_{k=1}^{n} E[ d(xk, x̃k) ] = E[ d(x, x̃) ]     (4-4)

where the last step follows from the assumption that the source output process is stationary.

Now suppose we have a memoryless source with a continuous-amplitude output X that has a pdf p(x), a quantized amplitude output alphabet X̃, and a per-letter distortion measure d(x, x̃), where x ∈ X and x̃ ∈ X̃. Then, the minimum rate in bits per source output that is required to represent the output X of the memoryless source with a distortion less than or equal to D is called the rate-distortion function R(D) and is defined as

    R(D) =          min           I(X; X̃)                                   (4-5)
           p(x̃|x): E[d(X,X̃)] ≤ D

where I(X; X̃) is the average mutual information between X and X̃.

Theorem: Rate-Distortion Function for a Memoryless Gaussian Source (Shannon, 1959a)

The minimum information rate necessary to represent the output of a discrete-time, continuous-amplitude memoryless Gaussian source based on a mean-square-error distortion measure per symbol (single-letter distortion measure) is

    Rg(D) = (1/2) log2( σx² / D ),   0 ≤ D ≤ σx²
          = 0,                       D > σx²                                 (4-6)

where σx² is the variance of the Gaussian source output.

Theorem: Source Coding with a Distortion Measure (Shannon, 1959a)

There exists an encoding scheme that maps the source output into code words such that for any given distortion D, the minimum rate R(D) bits per symbol (sample) is sufficient to reconstruct the source output with an average distortion that is arbitrarily close to D.
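
A small Python sketch of (4-6); the variance and distortion values below are illustrative.

```python
import numpy as np

def rate_distortion_gaussian(D, sigma2=1.0):
    """Rg(D) of Eq. (4-6): 0.5*log2(sigma_x**2 / D) for 0 <= D <= sigma_x**2, else 0."""
    D = np.asarray(D, dtype=float)
    return np.where(D < sigma2, 0.5 * np.log2(sigma2 / np.maximum(D, 1e-300)), 0.0)

# With sigma_x**2 = 1: one bit per sample corresponds to D = 0.25
for D in (1.0, 0.25, 0.01):
    print(D, rate_distortion_gaussian(D))   # 0.0, 1.0, about 3.32 bits/sample
```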


4.2 Scalar quantization

In source encoding, the quantizer can be optimized if we know the probability density function of the signal amplitude at the input to the quantizer. For example, suppose that the sequence {xn} at the input to the quantizer has a pdf p(x) and let L = 2^R be the desired number of levels. We wish to design the optimum scalar quantizer that minimizes some function of the quantization error q = x̃ - x, where x̃ is the quantized value of x. To elaborate, suppose that f(x̃ - x) denotes the desired function of the error. Then, the distortion resulting from quantization of the signal amplitude is

    D = ∫ f(x̃ - x) p(x) dx                                                  (4-7)

In general, an optimum quantizer is one that minimizes D by optimally selecting the output levels and the corresponding input range of each output level. This optimization problem has been considered by Lloyd (1982) and Max (1960), and the resulting optimum quantizer is usually called the Lloyd-Max quantizer.

For a uniform quantizer, the output levels are specified as x̃k = (2k - 1)Δ/2, corresponding to an input signal amplitude in the range (k - 1)Δ ≤ x ≤ kΔ, where Δ is the step size and L is the number of levels. When the uniform quantizer is symmetric with an even number of levels, the average distortion in (4-7) may be expressed as

    D = 2 Σ_{k=1}^{L/2 - 1} ∫_{(k-1)Δ}^{kΔ} f( (2k - 1)Δ/2 - x ) p(x) dx
        + 2 ∫_{(L/2 - 1)Δ}^{∞} f( (L - 1)Δ/2 - x ) p(x) dx                   (4-8)
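
The sketch below evaluates the mean-square distortion of a symmetric uniform quantizer on a zero-mean Gaussian input by numerically integrating (4-7) with f the squared error; the step sizes tried are illustrative, with 0.586 approximately the optimum value tabulated by Max (1960) for L = 8.

```python
import numpy as np

def uniform_quantizer(x, step, L):
    """Symmetric uniform (midrise) quantizer with L levels and step size delta:
    output levels are odd multiples of delta/2, clipped to the outermost level."""
    k = np.floor(x / step)
    k = np.clip(k, -L // 2, L // 2 - 1)
    return (k + 0.5) * step

def mse_distortion(step, L, sigma=1.0, grid=200001, span=10.0):
    """Numerically evaluate D = integral f(x~ - x) p(x) dx of Eq. (4-7)
    with f the squared error and p(x) a zero-mean Gaussian pdf."""
    x = np.linspace(-span * sigma, span * sigma, grid)
    p = np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)
    err2 = (uniform_quantizer(x, step, L) - x) ** 2
    return np.sum(err2 * p) * (x[1] - x[0])

# Illustrative: an 8-level (R = 3 bits) uniform quantizer on a unit-variance Gaussian
for step in (0.4, 0.586, 0.8):
    print(step, mse_distortion(step, L=8))
```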

4.3 Vector quantization

In the previous section, we considered the quantization of the output signal from a continuous-amplitude source when the quantization is performed on a sample-by-sample basis, i.e. by scalar quantization. In this section, we consider the joint quantization of a block of signal samples or a block of signal parameters. This type of quantization is called block or vector quantization. It is widely used in speech coding for digital cellular systems.

A fundamental result of rate-distortion theory is that better performance can be achieved by quantizing vectors instead of scalars, even if the continuous-amplitude source is memoryless. If, in addition, the signal samples or signal parameters are statistically dependent, we can exploit the dependency by jointly quantizing blocks of samples or parameters and thus achieve an even greater efficiency (lower bit rate) compared with that which is achieved by scalar quantization.

The vector quantization problem may be formulated as follows. We have an n-dimensional vector X = (x1, x2, ..., xn) with real-valued, continuous-amplitude components {xk, 1 ≤ k ≤ n}. We express the quantization as Q(·), so that

    X̃ = Q(X)

where X̃ is the output of the vector quantizer when the input vector is X. In general, quantization of the n-dimensional vector X into an n-dimensional vector X̃ introduces a quantization error or a distortion d(X, X̃). The average distortion over the set of input vectors X is

    D = Σ_{k=1}^{L} P(X ∈ Ck) E[ d(X, X̃) | X ∈ Ck ]
      = Σ_{k=1}^{L} ∫_{X ∈ Ck} d(X, X̃) p(X) dX

where P(X ∈ Ck) is the probability that the vector X falls in the cell Ck and p(X) is the joint pdf of the n random variables. As in the case of scalar quantization, we can minimize D by selecting the cells {Ck, 1 ≤ k ≤ L} for a given pdf p(X).
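
The cells {Ck} and the corresponding codevectors are commonly designed with the generalized Lloyd (LBG/k-means) iteration; the Python sketch below is a generic illustration on assumed data, not the lecture's algorithm.

```python
import numpy as np

def vector_quantizer_train(X, L, iterations=50, seed=0):
    """Train an L-level vector quantizer with the generalized Lloyd (LBG/k-means)
    iteration: assign vectors to the nearest codevector (the cells Ck), then
    move each codevector to the centroid of its cell."""
    rng = np.random.default_rng(seed)
    codebook = X[rng.choice(len(X), size=L, replace=False)]
    for _ in range(iterations):
        d2 = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        cells = d2.argmin(axis=1)                  # nearest-codevector partition
        for k in range(L):
            if np.any(cells == k):
                codebook[k] = X[cells == k].mean(axis=0)
    return codebook

# Illustrative correlated 2-D source; joint quantization exploits the dependency
rng = np.random.default_rng(1)
x1 = rng.normal(size=5000)
X = np.column_stack([x1, 0.9 * x1 + 0.44 * rng.normal(size=5000)])
codebook = vector_quantizer_train(X, L=16)
d2 = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
print("average distortion per vector:", d2.min(axis=1).mean())
```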

Example: Let x1 and x2 be two random variables with a uniform joint pdf

    p(x1, x2) = p(X) = 1/(ab)   for X ∈ C
    p(x1, x2) = 0               otherwise

where C is the rectangular region illustrated in the figure, with sides a and b rotated by 45° relative to the horizontal axis. If we quantize x1 and x2 separately by using uniform intervals of length Δ, the number of levels per component is

    L1 = L2 = (a + b) / (√2 Δ)

and hence the number of bits needed for coding the vector X = (x1, x2) is

    Rx = R1 + R2 = log2 L1 + log2 L2 = 2 log2 [ (a + b) / (√2 Δ) ]

Thus, scalar quantization of each component is equivalent to vector quantization with Lx = L1 L2 total levels. If instead we cover only the region for which p(X) ≠ 0 with squares having area Δ², the total number of levels is

    L'x = ab / Δ²

and the difference in bit rate between the scalar and vector quantization methods is

    Rx - R'x = log2 [ (a + b)² / (2ab) ]

For example, when a = 4b the difference in bit rate is 1.64 bits/vector; thus, vector quantization is 0.82 bits/sample better for the same distortion.

(Figure: a uniform pdf in two dimensions.)
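
A quick numerical check of the bit-rate difference for a = 4b:

```python
from math import log2

def vq_gain_bits_per_vector(a, b):
    """Rx - R'x = log2[ (a + b)**2 / (2*a*b) ] for the rotated uniform-pdf example."""
    return log2((a + b) ** 2 / (2 * a * b))

b = 1.0
a = 4 * b
gain = vq_gain_bits_per_vector(a, b)
print(gain, gain / 2)   # about 1.64 bits/vector, i.e. about 0.82 bits/sample
```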

Coding techniques for analog sources

Pulse Code Modulation (PCM): the quantized value, the signal, and the quantization error. (Figure: uniform quantization, with the quantization error modeled by a uniform pdf.)

Many source signals, such as speech waveforms, have the characteristic that small signal amplitudes occur more frequently than large ones, so a compressor is needed. The μ-law compressor with μ = 255 is used in North America, and the A-law compressor is used in Europe.

Differential PCM (DPCM): the predicted value of xn and the MSE of the prediction error, assuming that the source output is wide-sense stationary. (Figure: DPCM block diagram, (b) decoder.)
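
As an illustration of the compressor idea, here is a minimal Python sketch of the μ-law characteristic F(x) = sgn(x) ln(1 + μ|x|)/ln(1 + μ) and its inverse, with μ = 255; the sample amplitudes are arbitrary.

```python
import numpy as np

def mu_law_compress(x, mu=255.0):
    """mu-law compressor characteristic for |x| <= 1 (mu = 255 in North America)."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_law_expand(y, mu=255.0):
    """Inverse characteristic (expander) applied at the decoder."""
    y = np.asarray(y, dtype=float)
    return np.sign(y) * ((1.0 + mu) ** np.abs(y) - 1.0) / mu

x = np.array([-0.5, -0.01, 0.0, 0.01, 0.5])
y = mu_law_compress(x)
print(y)                                   # small amplitudes are boosted before quantization
print(np.allclose(mu_law_expand(y), x))    # True: expander undoes the compressor
```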
