Information Theory: Dr. Muhammad Imran Farid


Information Theory

Dr. Muhammad Imran Farid


Arithmetic coding
▪ Arithmetic coding is a form of entropy encoding used in
lossless data compression. Normally, a string of
characters such as the words "hello there" is
represented using a fixed number of bits per character,
as in the ASCII code. When a string is converted to
arithmetic encoding, frequently used characters will be
stored with fewer bits and not-so-frequently occurring
characters will be stored with more bits, resulting in
fewer bits used in total. Arithmetic coding differs from
other forms of entropy encoding, such as Huffman
coding, in that rather than separating the input into
component symbols and replacing each with a code,
arithmetic coding encodes the entire message into a
single number, a fraction n where 0.0 ≤ n < 1.0.
Basics of Arithmetic coding (AC)

▪ A unique identifier, or tag, is generated for the sequence to be
encoded.
▪ The tag is then given a binary representation, which becomes the
binary code for the sequence.
▪ In practice, generating the tag and generating its binary
representation are the same process. However, AC is conceptually
easier to understand if we divide the approach into two phases:
1. Generation of the tag
2. Assignment of a unique binary codeword to the tag
Coding sequence in Arithmetic coding (AC)

To code a sequence, we follow this procedure:

▪ To distinguish one sequence of symbols from another, we need to
tag it with a unique identifier.
▪ One possible set of tags for representing sequences of symbols is
the set of numbers in the unit interval [0, 1).
▪ To do this, we need a function that maps sequences of symbols into
the unit interval. Such a function can be built from the cumulative
distribution function (c.d.f.).
▪ Let's use this function in developing AC.
Coding sequence in Arithmetic coding (AC)
▪ Now, how do we generate a tag?
▪ The procedure for generating the tag works by reducing the size
of the interval in which the tag lies, starting from [0, 1).
▪ Divide [0, 1) into subintervals [Fx(i-1), Fx(i)), i = 1, 2, …, m.
▪ The minimum value of the c.d.f. is 0 and the maximum value is 1,
so these subintervals exactly partition the unit interval.
▪ Associate the subinterval [Fx(i-1), Fx(i)) with the symbol si.
▪ The appearance of the first symbol in the sequence restricts the
interval containing the tag to one of these subintervals. Suppose
the first symbol is sk.
▪ Then the tag lies in the subinterval [Fx(k-1), Fx(k)).
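
The partition described on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `partition_unit_interval` and the three-symbol c.d.f. values are assumptions chosen for the sketch, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

def partition_unit_interval(cdf):
    """Return the subintervals [Fx(i-1), Fx(i)) for i = 1..m.

    `cdf` lists Fx(1), ..., Fx(m); Fx(0) = 0 is implied.
    Element i-1 of the result is the half-open interval for symbol si.
    """
    bounds = [Fraction(0)] + list(cdf)
    return [(bounds[i - 1], bounds[i]) for i in range(1, len(bounds))]

# Hypothetical c.d.f. for a three-symbol source (illustrative values only).
cdf = [Fraction(7, 10), Fraction(8, 10), Fraction(1)]
intervals = partition_unit_interval(cdf)

# If the first symbol is sk with k = 2, the tag must lie in intervals[k - 1].
k = 2
low, high = intervals[k - 1]
print(f"s{k}: [{float(low)}, {float(high)})")
```

Each tuple is a half-open interval, so the upper limit of one symbol's interval is the lower limit of the next and no tag value belongs to two symbols.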
Coding sequence in Arithmetic coding (AC)

▪ The subinterval [Fx(k-1), Fx(k)) is partitioned in the same
proportions as the original interval. The jth subinterval,
corresponding to the symbol sj, is given by

\[
\Bigl[\, F_X(k-1) + F_X(j-1)\,\bigl(F_X(k) - F_X(k-1)\bigr),\;\;
F_X(k-1) + F_X(j)\,\bigl(F_X(k) - F_X(k-1)\bigr) \Bigr)
\]

• If the second symbol in the sequence is sj, then the interval
containing the tag becomes this jth subinterval.
• Each succeeding symbol causes the tag to be restricted to
a sub interval that is further partitioned in the same
proportion.
• This process can be more clearly understood through an
example
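
The repeated narrowing can be written as a simple recurrence: if the current interval is [low, high) and the next symbol is sj, the new interval is [low + Fx(j-1)(high - low), low + Fx(j)(high - low)). Below is a minimal Python sketch of that recurrence. The function name `narrow_interval`, the 1-based symbol indices, and the example c.d.f. are assumptions made for illustration, and exact fractions are used to avoid rounding error.

```python
from fractions import Fraction

def narrow_interval(symbols, cdf):
    """Return the interval [low, high) containing the tag for `symbols`.

    `symbols` holds 1-based symbol indices (j means sj).
    `cdf` lists Fx(1), ..., Fx(m); Fx(0) = 0 is implied.
    """
    bounds = [Fraction(0)] + list(cdf)    # bounds[j] == Fx(j)
    low, high = Fraction(0), Fraction(1)  # start from the unit interval
    for j in symbols:
        width = high - low
        # Both new limits are computed from the old `low` and `width`.
        high = low + bounds[j] * width
        low = low + bounds[j - 1] * width
    return low, high

# Hypothetical three-symbol source with Fx = 0.7, 0.8, 1.
cdf = [Fraction(7, 10), Fraction(8, 10), Fraction(1)]
low, high = narrow_interval([1, 2], cdf)
print(float(low), float(high))
```

Any number inside the returned interval can serve as the tag, since no other sequence of the same length maps to that interval.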
Example: Arithmetic coding (AC)

▪ S = {s1, s2, s3}

P1 = 0.7, P2 = 0.1, P3 = 0.2

We use the Shannon-Fano-Elias mapping, i.e. X(si) = i, si ∈ S.

Therefore, the c.d.f. is Fx(1) = 0.7, Fx(2) = 0.8, Fx(3) = 1.
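
The c.d.f. values are simply running sums of the probabilities. As a small check, the sketch below computes them in Python with `itertools.accumulate`. The variable names are illustrative, and fractions are used so that the sums come out exactly.

```python
from fractions import Fraction
from itertools import accumulate

# Probabilities P1, P2, P3 of s1, s2, s3 from the example.
probs = [Fraction(7, 10), Fraction(1, 10), Fraction(2, 10)]

# Fx(i) = P1 + ... + Pi, since X(si) = i.
cdf = list(accumulate(probs))

print([float(f) for f in cdf])
```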

Intervals in which the tag may lie: [0.0, 0.7), [0.7, 0.8), [0.8, 1.0)

Boundary       Start   After s1
lower limit    0.0     0.00
s1 / s2        0.7     0.49
s2 / s3        0.8     0.56
upper limit    1.0     0.70

Restricting the interval containing the tag for the input sequence
s1, s2, s3, …
Intervals in which the tag may lie: [0.00, 0.49), [0.49, 0.56), [0.56, 0.70)

Boundary       Start   After s1   After s1, s2
lower limit    0.0     0.00       0.490
s1 / s2        0.7     0.49       0.539
s2 / s3        0.8     0.56       0.546
upper limit    1.0     0.70       0.560

Restricting the interval containing the tag for the input sequence
s1, s2, s3, …
Intervals in which the tag may lie: [0.49, 0.539), [0.539, 0.546), [0.546, 0.560)

Boundary       Start   After s1   After s1, s2   After s1, s2, s3
lower limit    0.0     0.00       0.490          0.5460
s1 / s2        0.7     0.49       0.539          0.5558
s2 / s3        0.8     0.56       0.546          0.5572
upper limit    1.0     0.70       0.560          0.5600

Restricting the interval containing the tag for the input sequence
s1, s2, s3, …
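
The boundary values in this example can be checked by hand. Writing [l, u) for the current interval (notation introduced here only for the check), each new symbol sj replaces it with [l + Fx(j-1)(u - l), l + Fx(j)(u - l)), with Fx(0) = 0:

\[
\begin{aligned}
s_1 &: [\,0 + 0 \times 1,\; 0 + 0.7 \times 1\,) = [0,\, 0.7) \\
s_2 &: [\,0 + 0.7 \times 0.7,\; 0 + 0.8 \times 0.7\,) = [0.49,\, 0.56) \\
s_3 &: [\,0.49 + 0.8 \times 0.07,\; 0.49 + 1 \times 0.07\,) = [0.546,\, 0.560)
\end{aligned}
\]

Partitioning the final interval, of width 0.014, in the same proportions gives the inner boundaries 0.546 + 0.7 × 0.014 = 0.5558 and 0.546 + 0.8 × 0.014 = 0.5572.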
In future

▪ We will study the tag generation process mathematically,
starting with sequences of length 1,
▪ and then we will extend this approach to longer sequences
by imposing what is known as a lexicographic ordering on
the sequences.
