
Universal Codes and Channel Capacity

Arithmetic Coding

Problems of Huffman coding:


• The Huffman code table for blocks of n symbols over a q-ary alphabet has q^n entries, so memory and computation grow exponentially with the block length n.
• The code table must be transmitted to the receiver.
• The source statistics are assumed stationary. If they change, an adaptive scheme is required that re-estimates the probabilities and recalculates the Huffman code table.
• Encoding and decoding are performed per block; no code is produced until a block of n symbols has been received.
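The first drawback can be made concrete with a one-line calculation (the function name here is illustrative): a block-Huffman code needs one codeword per possible block, i.e. q^n table entries.

```python
# A Huffman code over blocks of n symbols from a q-ary alphabet needs
# one codeword per block, so the table has q**n entries.
def huffman_table_entries(q: int, n: int) -> int:
    """Number of codewords in a block-Huffman table."""
    return q ** n

for n in (1, 2, 4, 8):
    print(n, huffman_table_entries(4, n))  # 4, 16, 256, 65536 for q = 4
```

Even a modest alphabet (q = 4) and block length (n = 8) already require 65536 table entries.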

For arithmetic coding, the average codeword length approaches the entropy of the source.
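The interval-narrowing idea behind this claim can be sketched in a few lines (the probabilities and the sequence here are illustrative, not taken from the notes): each symbol scales the current interval by its probability, so the final interval width equals the product of the symbol probabilities, and roughly -log2(width) bits suffice to identify the sequence.

```python
from fractions import Fraction
from math import ceil, log2

def narrow_interval(sequence, probs):
    """Return the final [low, low + width) interval for the sequence.
    Exact arithmetic with Fractions avoids rounding issues."""
    # cumulative probability mass below each symbol
    cum, total = {}, Fraction(0)
    for s, p in probs.items():
        cum[s] = total
        total += p
    low, width = Fraction(0), Fraction(1)
    for s in sequence:
        low += width * cum[s]   # move to the symbol's sub-interval
        width *= probs[s]       # shrink by the symbol's probability
    return low, width

probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
low, width = narrow_interval("aab", probs)
# width = 1/2 * 1/2 * 1/4 = 1/16, so about -log2(1/16) = 4 bits are needed
print(low, width, ceil(-log2(width)))
```

The number of bits needed is within a couple of bits of the sequence's self-information, which is why the per-symbol average approaches the entropy for long sequences.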

Fig. 3.15: Arithmetic coding for Example 3.20.
Universal Codes and Channel Capacity
Lempel Ziv Coding

Universal Source Coding:

The coding algorithms seen so far require knowledge of the statistics of the source.

A universal source code needs no knowledge of the source statistics.

Example: the Lempel-Ziv coding algorithm.


The algorithm is very efficient.

• A variant of the LZ77 algorithm is used in gzip and png.
• A variant of the LZW algorithm is used in zip, pkzip, compress, gif and tiff.
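One of these variants is directly usable from the Python standard library: the zlib module implements DEFLATE, the LZ77-based scheme behind gzip and png, which makes the effect easy to observe.

```python
import zlib

# Highly repetitive input: an LZ77 coder replaces repeated substrings
# with back-references into the window, so it compresses very well.
data = b"abracadabra" * 200
compressed = zlib.compress(data)

assert zlib.decompress(compressed) == data   # lossless round trip
print(len(data), "->", len(compressed), "bytes")
```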


The algorithm

• It is a dictionary-based source coding technique.
• A substring of the source is replaced by a codeword that identifies the substring in the dictionary.
• Each codeword C_I, except the zeroth, is the concatenation of two parameters, L_I and m_I.
• m_I gives the position in the dictionary where the substring match starts.
• L_I gives the length of the text to be copied from the past.
• The previously coded string itself is used as the dictionary.
• The parameter n_W denotes the window size.

• Both encoder and decoder maintain a dictionary D_I containing the most recent n_W encoded source symbols.
• When encoding a substring starting with source symbol w_k, the dictionary contains the source symbols w_{k-n_W}, ..., w_{k-1}.
• Initially the dictionary contains the first n_W symbols of the source sequence w_1, ..., w_N.
• The first codeword C_0 is the initial dictionary, w_1, ..., w_{n_W}.
• The second codeword C_1 is found by searching for the longest substring Y_1, beginning with w_{n_W+1}, such that Y_1 starts in the dictionary and ends L_1 - 1 time units later.
• The index of the dictionary symbol where the copy of Y_1 begins is denoted by p, and m_1 = n_W - p.

• If multiple choices for m_1 exist, choose the smallest m_1.
• After determining C_1, the dictionary is updated by deleting the L_1 oldest entries and adding the L_1 most recently encoded symbols.
• This amounts to sliding the dictionary window to the right by L_1 positions.
• The third codeword C_2 is found by searching for the longest substring Y_2, beginning with w_{n_W + L_1 + 1}, such that Y_2 starts in the dictionary and ends L_2 - 1 time units later; m_2 is calculated in the same way.
• The dictionary is then updated.
• The process repeats until the full source sequence w_1, ..., w_N has been encoded.
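The matching procedure above can be sketched as follows (the function name is mine, and this is an illustrative implementation, not the notes' own): it applies the greedy longest-match rule with the smallest-m_I tie-break, and lets a match run past the dictionary end into the symbols currently being copied, which is what produces codewords whose length exceeds the remaining window.

```python
def lz_encode(source: str, n_w: int):
    """Sliding-window encoder: return the list of (L_I, m_I) pairs.
    The first n_w symbols form the initial dictionary C_0, sent verbatim."""
    dictionary = source[:n_w]
    i, codewords = n_w, []
    while i < len(source):
        best_len, best_p = 0, 0
        for p in range(1, n_w + 1):            # 1-based start position in dictionary
            ext = dictionary + source[i:]      # a copy may run past the dictionary end
            length = 0
            while (i + length < len(source)
                   and ext[p - 1 + length] == source[i + length]):
                length += 1
            if length >= best_len:             # ties -> largest p, i.e. smallest m
                best_len, best_p = length, p
        if best_len == 0:                      # sketch only: no literal-escape rule here
            raise ValueError("no match in window")
        codewords.append((best_len, n_w - best_p))
        # slide the window right by best_len positions
        dictionary = dictionary[best_len:] + source[i:i + best_len]
        i += best_len
    return codewords
```

Run on the 24-bit sequence of the worked example with n_W = 8, it reproduces the (L_I, m_I) pairs of the table on the next slide.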
I   D_I^{n_W}     Y_I        L_I   m_I   C_I         Dictionary and phrase locations
0   {00101001}    --         --    --    {00101001}  --
1   {00101001}    {00}        2     2    [2, 2]      {:00101001:0000011011011000}
2   {10100100}    {000}       3     0    [3, 0]      {00:10100100:00011011011000}
3   {00100000}    {1}         1     5    [1, 5]      {00101:00100000:11011011000}
4   {01000001}    {10}        2     6    [2, 6]      {001010:01000001:1011011000}
5   {00000110}    {110110}    6     2    [6, 2]      {00101001:00000110:11011000}
6   {10110110}    {00}        2     0    [2, 0]      {00101001000001:10110110:00}


Encoding of m_I and L_I:

A comma-free code is a binary encoding technique for L_I.

Wyner-Ziv binary encoding: for a positive integer k, denote the binary representation of k by b(k), and let |b(k)| = ceil(log2(k + 1)) be the number of bits in that representation.

Denote by u(k) the sequence of k - 1 zeros followed by a 1: u(k) = 0^{k-1} 1.

The comma-free encoding of L_I is then
e(L_I) = e^(|b(L_I)|) . b(L_I)
where e^(p_I) = u(|b(p_I)|) . b(p_I).

Encoding of m_I and L_I:

Example:

The comma-free encoding of L_I is
e(L_I) = e^(|b(L_I)|) . b(L_I), with e^(p_I) = u(|b(p_I)|) . b(p_I).

For L_I = 4: the binary representation of 4 is b(4) = 100, so |b(4)| = 3 and
e(4) = e^(3) . 100
The binary representation of 3 is b(3) = 11, so
e^(3) = u(|b(3)|) . b(3) = u(2) . 11 = 01 . 11 = 0111
and therefore
e(4) = 0111 . 100 = 0111100
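The worked example translates directly into code (a sketch; writing e^ as e_hat since the hat accent is not valid in identifiers):

```python
def b(k: int) -> str:
    """Binary representation of a positive integer k, no leading zeros."""
    return format(k, "b")

def u(k: int) -> str:
    """k - 1 zeros followed by a single 1."""
    return "0" * (k - 1) + "1"

def e_hat(p: int) -> str:
    """e^(p) = u(|b(p)|) . b(p)"""
    return u(len(b(p))) + b(p)

def e(L: int) -> str:
    """Comma-free encoding e(L) = e^(|b(L)|) . b(L)"""
    return e_hat(len(b(L))) + b(L)

print(e(4))  # 0111100, as in the example above
```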


• The decoder receives the encoded sequence and knows the window size n_W.
• The first bits are stored directly as the initial dictionary.
• To determine L_1, the decoder reads the run of 0s up to the first 1; this gives the length of the encoding of p_1, which in turn gives the number of bits representing the length of L_1, and subsequently L_1 itself.
• Having determined L_1, the decoder knows that the next binary bits represent m_1.
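This prefix parse can be sketched as follows (an illustrative helper, name mine): count the zeros to recover u(k), read k bits to recover |b(L)|, then read that many bits to recover L.

```python
def parse_L(bits: str):
    """Read one comma-free codeword e(L) from the front of a bit string.
    Returns (L, number of bits consumed); the remaining bits carry m."""
    k = bits.index("1") + 1              # u(k) is k-1 zeros followed by a 1
    nb = int(bits[k:2 * k], 2)           # next k bits are b(|b(L)|)
    L = int(bits[2 * k:2 * k + nb], 2)   # next nb bits are b(L)
    return L, 2 * k + nb

print(parse_L("0111100"))  # (4, 7): L = 4, all 7 bits of e(4) consumed
```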

