SOURCE CODING
(MÃ HÓA NGUỒN)
Instructor:
Name: Đoàn Bảo Sơn
Office: Faculty of Electrical - Electronics Engineering
Phone: 0913 706061
Email: sondb@vaa.edu.vn
1
1. Introduction to Information Theory
1.1. Introduction
2
1. Introduction to Information Theory
1.2. Review of probabilities
3
1. Introduction to Information Theory
1.2. Review of probabilities
1.2.1. Discrete random variables
X takes its values in a discrete set AX.
AX may be infinite (for instance, AX = ℕ) or finite with size n: AX = {x1, x2, … , xn}
Each outcome is associated with a probability of occurrence, PX =
{p1, p2, … , pn}, with pi = Pr(X = xi) and Σi pi = 1
5
1. Introduction to Information Theory
1.2. Review of probabilities
1.2.1. Discrete random variables
Conditional probability: Pr(X = xi | Y = yj) = Pr(X = xi, Y = yj) / Pr(Y = yj), provided Pr(Y = yj) > 0
6
1. Introduction to Information Theory
1.2. Review of probabilities
1.2.1. Discrete random variables
Independence
Two discrete random variables X and Y are independent if and only if Pr(X = xi, Y = yj) = Pr(X = xi) · Pr(Y = yj) for all pairs (xi, yj)
7
1. Introduction to Information Theory
1.2. Review of probabilities
1.2.2. Continuous random variables
• The random variable X is continuous if its cumulative distribution
function FX(x) = Pr(X ≤ x) is continuous
• FX(x) is related to the probability density pX(x) by pX(x) = dFX(x)/dx
• Nth moment: E[X^N] = ∫ x^N pX(x) dx
8
1. Introduction to Information Theory
1.2. Review of probabilities
1.2.3. Random signals
A signal x(t) is deterministic if the function t ↦ x(t) is perfectly known
If the values taken by x(t) are not known in advance, the signal is modeled by a random process
X(t): random variable at time t; x(t): outcome (realization) of this random variable
Probability density: each random variable X(t) is described by its probability density function
9
1. Introduction to Information Theory
1.2. Review of probabilities
1.2.3. Random signals
Autocorrelation function RXX(τ) of a random process: RXX(τ) = E[X(t) · X(t + τ)] (wide-sense stationary process)
10
1. Introduction to Information Theory
1.2. Review of probabilities
1.2.3. Random signals
Autocorrelation function RXX(τ) :
11
1. Introduction to Information Theory
1.3. Entropy and mutual information
1.3.1. A logarithmic measure of information
Information associated with the event X = xi: h(xi)
12
1. Introduction to Information Theory
1.3. Entropy and mutual information
1.3.1. A logarithmic measure of information
The quantity of information h(xi) associated with the realization of
the event X = xi is defined as the logarithm of the inverse of its
probability:
h(xi) = log(1 / Pr(X = xi)) = −log Pr(X = xi)
Unit of h(xi):
• binary logarithm (log2): Shannon (Sh)
• natural logarithm (ln): natural unit (Nat)
13
1. Introduction to Information Theory
1.3. Entropy and mutual information
1.3.1. A logarithmic measure of information
Example:
A discrete source delivers bits (0 or 1) with Pr(X = 0) = Pr(X = 1) = 1/2
Quantity of information carried by X = 0 or X = 1: h(0) = h(1) = log2(2) = 1 Sh
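For illustration (not part of the original slides), a small Python sketch evaluating h(xi) = −log Pr(X = xi) in Shannons and in Nats:

    import math

    def self_information(p, base=2):
        """Quantity of information h = -log(p); base 2 gives Shannons, base e gives Nats."""
        return -math.log(p) / math.log(base)

    # Equiprobable binary source: each bit carries 1 Sh (= ln 2 ≈ 0.693 Nat)
    print(self_information(0.5))               # 1.0 Sh
    print(self_information(0.5, base=math.e))  # ≈ 0.693 Nat
    print(self_information(0.125))             # 3.0 Sh: a rarer event carries more information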
15
1. Introduction to Information Theory
1.3. Entropy and mutual information
1.3.2. Mutual information
Mutual information: the quantity of information that the
realization of the event Y = yj gives about the event X = xi
• difference between the quantity of information associated with the
realization of the event X = xi and the quantity of information associated
with the realization of the event X = xi conditionally to the event Y = yj:
i(xi; yj) = h(xi) − h(xi | yj) = log2 [ Pr(X = xi | Y = yj) / Pr(X = xi) ]
17
1. Introduction to Information Theory
1.3. Entropy and mutual information
1.3.3. Entropy and average mutual information
Source:
random variable: X
sample space: AX = {x1, x2, … , xn}
probabilities: PX = {p1, p2, … , pn}
Entropy
The entropy H(X) of the source is the average quantity of information
associated with the possible realizations of the event X = xi:
H(X) = Σi pi log2(1/pi) = −Σi pi log2 pi [Sh/symbol]
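A minimal Python sketch (added for illustration) that computes H(X) from a list of probabilities, following the definition above:

    import math

    def entropy(probs):
        """Entropy in Shannons per symbol: H = -sum(p * log2 p), skipping zero probabilities."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(entropy([0.5, 0.5]))                 # 1.0 Sh/symbol (equiprobable binary source)
    print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 Sh/symbol (source of Example 1, section 1.4.4)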
18
1. Introduction to Information Theory
1.3. Entropy and mutual information
1.3.3. Entropy and average mutual information
Entropy (cont.)
H(X) is a measure of the uncertainty on X
Properties:
• H(X) ≥ 0
• H(X) ≤ log2 n, with equality when the n symbols are equiprobable (pi = 1/n)
19
1. Introduction to Information Theory
1.3. Entropy and mutual information
Entropy (cont.)
Two random variables: X, Y
AX = {x1, x2, … , xn}; AY = {y1, y2, … , ym}
Joint entropy: H(X, Y) = −Σi Σj Pr(X = xi, Y = yj) log2 Pr(X = xi, Y = yj)
20
1. Introduction to Information Theory
1.3. Entropy and mutual information
Entropy (cont.)
Mutual information: I(X; Y) = H(X) − H(X | Y) = H(X) + H(Y) − H(X, Y)
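A short Python sketch (illustrative; the joint distribution below is an arbitrary example) computing H(X), H(Y), H(X, Y) and I(X; Y) from a joint probability table, using the relations above:

    import math

    def H(probs):
        # Entropy in Shannons of a probability list
        return -sum(p * math.log2(p) for p in probs if p > 0)

    # Example joint distribution Pr(X = xi, Y = yj); rows index x, columns index y
    joint = [[0.30, 0.10],
             [0.05, 0.55]]

    px = [sum(row) for row in joint]            # marginal distribution of X
    py = [sum(col) for col in zip(*joint)]      # marginal distribution of Y
    hxy = H([p for row in joint for p in row])  # joint entropy H(X, Y)
    print(H(px) + H(py) - hxy)                  # I(X; Y) = H(X) + H(Y) - H(X, Y)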
21
1. Introduction to Information Theory
1.3. Entropy and mutual information
1.3.4. Differential entropy
Continuous random variable: X
Probability density: p(x)
Differential entropy: HD(X) = −∫ p(x) log2 p(x) dx
22
1. Introduction to Information Theory
1.4. Lossless source coding theorems
1.4.1. Introduction
Source coding (entropy coding): represent the digital sequence delivered by the source with the shortest possible sequence of symbols, while keeping the ability of the source decoder to reconstruct it exactly
23
1. Introduction to Information Theory
1.4. Lossless source coding theorems
1.4.2. Entropy and source redundancy
Source: discrete, stationary
Output symbols: Q-ary symbols
Output: random variable X
Entropy H(X): average quantity of information per symbol at the output of the source
Maximum entropy, reached when the output symbols are de-correlated (memoryless source) and equiprobable:
HMAX = log2 Q
If the source has memory (correlated symbols): H(X) < HMAX; the gap between HMAX and H(X) measures the redundancy of the source
25
1. Introduction to Information Theory
1.4. Lossless source coding theorems
1.4.3. Fundamental theorem of source coding
THEOREM 1.1. (Shannon):
Let ε > 0. For any stationary source with entropy per symbol H(X), there
is a binary source coding method that associates with each message x
of length N a binary word of average length N·Rmoy such that:
H(X) ≤ Rmoy ≤ H(X) + ε
26
1. Introduction to Information Theory
1.4. Lossless source coding theorems
1.4.4. Lossless source coding
Introduction
28
1. Introduction to Information Theory
1.4. Lossless source coding theorems
1.4.4. Lossless source coding
Variable length coding
Example 1: A discrete source generates 4 different messages a1, a2, a3, a4
with respective probabilities Pr(a1) = 1/2, Pr(a2) = 1/4, Pr(a3) =
Pr(a4) = 1/8 (its entropy is computed in the sketch after this list).
Variable length code
- Satisfies the criterion of unique coding
- Does not allow unique decoding
- Ex: a1, a2, a1 = 1001 (coding)
- At the receiver, decoding: a1, a2, a1 or a4, a3 ?
⇒ unusable
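As a complement (not in the original slides), a Python sketch for the source of Example 1: it computes the source entropy and the average length of an assumed instantaneous (prefix) code with word lengths 1, 2, 3, 3, used here only for illustration:

    import math

    probs = {"a1": 0.5, "a2": 0.25, "a3": 0.125, "a4": 0.125}
    code  = {"a1": "0", "a2": "10", "a3": "110", "a4": "111"}   # assumed prefix code (illustrative)

    entropy = -sum(p * math.log2(p) for p in probs.values())
    avg_len = sum(probs[m] * len(code[m]) for m in probs)
    print(entropy)   # 1.75 Sh/message
    print(avg_len)   # 1.75 bits/message: Rmoy reaches the entropy, the lower bound of Theorem 1.1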
29
1. Introduction to Information Theory
1.4. Lossless source coding theorems
Variable length coding (cont.)
Example 2:
- Satisfies both unique coding and unique decoding
- Not instantaneous
- Ex: a3 is the beginning of a4
- After the sequence 11 → determine the parity
of the number of zeros → decode
⇒ more complex decoding
30
1. Introduction to Information Theory
1.4. Lossless source coding theorems
Variable length coding (cont.)
Example 3:
32
1. Introduction to Information Theory
1.4. Lossless source coding theorems
Fundamental theorem of source coding
Memoryless source with entropy per symbol H(X): it is possible to build
an instantaneous code for which the average length of the code words
Rmoy satisfies the following inequality:
H(X) ≤ Rmoy < H(X) + 1
33
1. Introduction to Information Theory
1.4. Lossless source coding theorems
Entropy rate
X: stationary and discrete source
Entropy per symbol: H(X) [Sh/symbol]
Symbol rate: DS [symbols per second]
Entropy rate: DI = DS · H(X) [Sh/second]
34
1. Introduction to Information Theory
1.4. Lossless source coding theorems
Entropy rate (cont.)
From the theorem of source coding: after lossless source coding, the binary rate can be brought as close as desired to the entropy rate DI, but cannot be made lower.
35
1. Introduction to Information Theory
1.5. Theorem for lossy source coding
1.5.1. Definitions
[Block diagram: a sequence x of N source samples enters the ENCODER, which describes it with a word of NR bits; from these bits the DECODER produces the reconstruction x̂.]
36
1. Introduction to Information Theory
1.5. Theorem for lossy source coding
1.5.1. Definitions
DEFINITION 1.1. The distortion per dimension between the sequences
x and x̂ of dimension N:
d(x, x̂) = (1/N) ‖x − x̂‖²
DEFINITION 1.2. The average distortion per dimension of the coder-
decoder:
DN = (1/N) ∫X ‖x − x̂‖² f(x) dx
37
1. Introduction to Information Theory
1.5. Theorem for lossy source coding
1.5.1. Definitions (cont.)
DEFINITION 1.3. A pair (R, D) is said to be achievable if there is a
coder-decoder of rate R such that:
lim N→∞ DN ≤ D
DEFINITION 1.4. For a given memoryless source, the rate distortion
function R(D) is:
R(D) = min over p(x̂|x) of I(X; X̂) subject to E[(X − X̂)²] ≤ D
38
1. Introduction to Information Theory
1.5. Theorem for lossy source coding
1.5.2. Lossy source coding theorem
THEOREM 1.4. The minimum number of bits per dimension R needed
to describe a sequence of real samples with a given average distortion
D must be greater than or equal to R(D).
For a Gaussian source with variance σx²: R(D) = (1/2) log2(σx²/D) for 0 < D ≤ σx², and R(D) = 0 otherwise
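For illustration, a small Python sketch evaluating the Gaussian rate distortion function recalled above (R(D) = 0.5·log2(σx²/D) for D ≤ σx²):

    import math

    def rate_distortion_gaussian(distortion, variance):
        """R(D) of a memoryless Gaussian source: 0.5*log2(variance/D) if D <= variance, else 0."""
        if distortion >= variance:
            return 0.0
        return 0.5 * math.log2(variance / distortion)

    # Unit-variance source: halving the allowed distortion costs 0.5 bit per dimension
    for d in (1.0, 0.5, 0.25, 0.1):
        print(d, rate_distortion_gaussian(d, 1.0))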
40
1. Introduction to Information Theory
1.6. Transmission channel models
1.6.1. Binary symmetric channel
The input and the output of the channel are binary
The channel is described by a single parameter p, the probability that a transmitted bit is received in error
42
1. Introduction to Information Theory
1.6. Transmission channel models
1.6.2. Binary erasure channel
Some bits can be lost (erased)
Compared to the binary symmetric channel, we add an output event Y = ε
corresponding to the case where a transmitted bit has been erased
Characterized by a single parameter:
p: erasure probability
Diagram:
43
1. Introduction to Information Theory
1.7. Capacity of a transmission channel
Problem:
Transmit equiprobable bits: Pr(X = 0) = Pr(X = 1) =1/2
Binary rate: 1000 bits per second
Binary symmetric channel: p = 0.1
What is the maximum information rate that can be transmitted ?
44
1. Introduction to Information Theory
1.7. Capacity of a transmission channel
1.7.1. Capacity of a transmission channel
DEFINITION: capacity of a transmission channel
C = max I(X; Y)
The capacity is the maximum of the average mutual information, taken over the probability distribution of the channel input
Units: [C] = Shannon/symbol or Shannon/second
Capacity per second: C′ = C × DS, where DS is the symbol rate of the source
Channel is noiseless
C = HMAX(X) = log2Q
Channel is noisy:
C < HMAX(X)
To compute the capacity of a transmission channel → calculate the
average quantity of information that is lost in the channel
45
1. Introduction to Information Theory
1.7. Capacity of a transmission channel
1.7.1. Capacity of a transmission channel (cont.)
H(X|Y): measure of the residual uncertainty on X knowing Y
Good transmission: H(X|Y) is zero or negligible
H(X|Y): average quantity of information lost in the channel
Noiseless channel:
H(X|Y) = H(X|X) = 0 ⇒ C = HMAX(X)
Completely noisy channel (X and Y independent):
H(X|Y) = H(X) ⇒ C = 0
(Diagrams: the case C = HMAX(X) and the case C = 0)
46
1. Introduction to Information Theory
1.7. Capacity of a transmission channel
1.7.1. Capacity of a transmission channel (cont.)
Communication system with channel coding:
47
1. Introduction to Information Theory
1.7. Capacity of a transmission channel
1.7.2. Fundamental theorem of channel coding
Channel coding: makes it possible to obtain an error rate as low as
desired, provided that the average quantity of information entering the
channel coder / channel / channel decoder block is less than the capacity
C of the channel:
H(U) < C
C = max I(X; Y): the highest number of information bits that can be
transmitted through the channel with an error rate as low as
desired
48
1. Introduction to Information Theory
1.7. Capacity of a transmission channel
1.7.3. Capacity of the binary symmetric channel
Average mutual information: I(X; Y) = H(Y) − H(Y|X), maximum for equiprobable inputs, which gives the capacity:
C = 1 − H2(p) = 1 + p log2 p + (1 − p) log2(1 − p) [Sh/symbol], where H2 is the binary entropy function
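A Python sketch (added for illustration) evaluating this capacity and applying it to the problem stated at the beginning of section 1.7 (1000 bits per second over a binary symmetric channel with p = 0.1):

    import math

    def binary_entropy(p):
        """H2(p) = -p*log2(p) - (1-p)*log2(1-p), with H2(0) = H2(1) = 0."""
        if p in (0.0, 1.0):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    def bsc_capacity(p):
        """Capacity of the binary symmetric channel, in Shannon per channel use."""
        return 1.0 - binary_entropy(p)

    c = bsc_capacity(0.1)
    print(c)          # ≈ 0.531 Sh/symbol
    print(1000 * c)   # ≈ 531 Sh/s: maximum information rate for the problem of section 1.7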
49
1. Introduction to Information Theory
1.7. Capacity of a transmission channel
1.7.4. Capacity of the erasure channel
Conditional entropy: H(X|Y) = p·H2(q), where H2 is the binary entropy function and q is the probability of one of the two input symbols
Average mutual information: I(X; Y) = H(X) − H(X|Y) = (1 − p)·H2(q), maximum for q = 1/2, which gives C = 1 − p [Sh/symbol]
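An illustrative numerical check in Python (maximizing (1 − p)·H2(q) over q, under the expressions above):

    import math

    def binary_entropy(q):
        return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

    def bec_mutual_information(p, q):
        """I(X; Y) = (1 - p) * H2(q) for a binary erasure channel with erasure probability p."""
        return (1 - p) * binary_entropy(q)

    p = 0.2
    best = max(bec_mutual_information(p, q / 1000) for q in range(1, 1000))
    print(best)   # ≈ 0.8 = 1 - p, reached for q = 1/2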
50
2. Source Coding
2.1. Introduction
Source coding: lossless source coding and lossy source coding
Lossless source coding (entropy coding): produce the shortest possible
sequence of symbols (bits) that still allows perfect reconstruction by the decoder
Lossy source coding: minimize a fidelity criterion under a constraint on the
binary rate
Algorithms implementing lossless source coding: Huffman's algorithm; Lempel–
Ziv coding
51
2. Source Coding
2.2. Algorithms for lossless source coding
2.2.1. Run length coding
Run length coding (RLC): exploits the repetition of consecutive symbols
Suited to source sequences containing many identical successive symbols
The sequence is described by couples (number of identical consecutive symbols, symbol)
Example:
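The slide's own example is not reproduced in this extract; the Python sketch below illustrates the principle on an arbitrary sequence:

    from itertools import groupby

    def run_length_encode(sequence):
        """Encode a sequence as a list of (run length, symbol) couples."""
        return [(len(list(group)), symbol) for symbol, group in groupby(sequence)]

    def run_length_decode(pairs):
        """Rebuild the original sequence from the (run length, symbol) couples."""
        return "".join(symbol * count for count, symbol in pairs)

    data = "AAAAABBBCCCCCCCCD"                 # arbitrary example sequence
    pairs = run_length_encode(data)
    print(pairs)                               # [(5, 'A'), (3, 'B'), (8, 'C'), (1, 'D')]
    print(run_length_decode(pairs) == data)    # True: the coding is lossless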
54
2. Source Coding
2.2. Algorithms for lossless source coding
2.2.2. Huffman’s algorithm (cont.)
Example:
- Huffman's encoding table:
- Note:
Huffman's algorithm provides an optimal source code under
the restriction that the probabilities of the messages
are of the form 2^-m (1/2, 1/4, …)
When the successive symbols are correlated:
group several symbols together to constitute the messages
⇒ higher complexity
- Used in image compression and audio compression
(JPEG, MP3, …)
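Since the encoding table of the slide is not reproduced here, the following Python sketch (illustrative) builds a Huffman code for the four-message source of Example 1 and checks that its average length reaches the entropy, as expected when the probabilities are powers of 1/2:

    import heapq
    import math

    def huffman_code(probs):
        """Build a Huffman code table {symbol: bit string} from {symbol: probability}."""
        # Heap entries: (probability, tie-breaker, partial code table)
        heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
        heapq.heapify(heap)
        counter = len(heap)
        while len(heap) > 1:
            p1, _, c1 = heapq.heappop(heap)   # two least probable groups
            p2, _, c2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in c1.items()}
            merged.update({s: "1" + c for s, c in c2.items()})
            heapq.heappush(heap, (p1 + p2, counter, merged))
            counter += 1
        return heap[0][2]

    probs = {"a1": 0.5, "a2": 0.25, "a3": 0.125, "a4": 0.125}
    table = huffman_code(probs)
    avg_len = sum(probs[s] * len(table[s]) for s in probs)
    entropy = -sum(p * math.log2(p) for p in probs.values())
    print(table)              # e.g. {'a1': '0', 'a2': '10', 'a3': '110', 'a4': '111'}
    print(avg_len, entropy)   # 1.75 1.75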
55
2. Source Coding
2.2. Algorithms for lossless source coding
2.2.3. Arithmetic coding
Rissanen (1976), Pasco (1976)
Source coding without any a priori knowledge of the statistics of the
source (memoryless or with memory)
Principle: associate with each binary sequence an interval on the
segment [0; 1[
Example:
0111 → [0.0111; 0.1000[ in binary or [0.4375; 0.5[ in decimal
The longer the sequence, the smaller the associated interval
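An illustrative Python sketch of this principle, assuming the interval of a binary string b1…bn is [0.b1…bn ; 0.b1…bn + 2^-n[, which reproduces the example above:

    def binary_interval(bits):
        """Sub-interval [low, high[ of [0; 1[ associated with the binary string 'bits'."""
        low = sum(int(b) * 2 ** -(i + 1) for i, b in enumerate(bits))
        high = low + 2 ** -len(bits)
        return low, high

    print(binary_interval("0111"))    # (0.4375, 0.5), i.e. [0.0111; 0.1000[ in binary
    print(binary_interval("011100"))  # a longer sequence gives a smaller interval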
56
2. Source Coding
2.2. Algorithms for lossless source coding
2.2.4. Algorithm LZ78
Lempel and Ziv (1978)
The algorithm uses a dictionary
Each dictionary entry: a pair composed of a pointer (or index) to a previous
element of the dictionary and a symbol
Each element of the dictionary is thus related to a string of symbols
57
2. Source Coding
2.2. Algorithms for lossless source coding
2.2.4. Algorithm LZ78 (cont.)
Example:
• Binary sequence:
001000001100010010000010110001000001000011000101010000
100000110000010110000
• Starting from the left, we take the shortest string that has not yet been
encountered; the first one is 0:
0,01000001100010010000010110001000001000011
• The second string different from 0 is 01
0,01,000001100010010000010110001000001000011
• The third string different from 0 and 01 is 00
0,01,00,0001100010010000010110001000001000011
• Finally, the sequence can be decomposed as follows:
0, 01, 00, 000, 1, 10, 001, 0010, 0000, 101, 100, 010, 00001, 000011
58
2. Source Coding
2.2. Algorithms for lossless source coding
2.2.4. Algorithm LZ78 (cont.)
Example:
• Finally, the sequence can be decomposed as follows:
0, 01, 00, 000, 1, 10, 001, 0010, 0000, 101, 100, 010, 00001, 000011
• Dictionary of strings
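Since the dictionary table itself is not reproduced in this extract, the Python sketch below (illustrative) performs the LZ78 parsing described above and prints, for each new string, its dictionary index, the index of its prefix and the appended symbol:

    def lz78_parse(sequence):
        """LZ78 parsing: list of (prefix index, new symbol) pairs; index 0 is the empty string."""
        dictionary = {"": 0}            # string -> index
        pairs = []
        current = ""
        for symbol in sequence:
            if current + symbol in dictionary:
                current += symbol       # extend until the string is new
            else:
                pairs.append((dictionary[current], symbol))
                dictionary[current + symbol] = len(dictionary)
                current = ""
        if current:                     # flush a trailing, already-known string
            pairs.append((dictionary[current[:-1]], current[-1]))
        return pairs

    # The 42-bit prefix of the example sequence whose decomposition is listed above
    bits = "001000001100010010000010110001000001000011"
    for index, (prefix, symbol) in enumerate(lz78_parse(bits), start=1):
        print(index, prefix, symbol)    # first line: 1 0 0 (string "0"); last line: 14 13 1 (string "000011")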