10.7 Arithmetic Coding: Figure 10.9 Assignment of Ranges Between 0 and 1

10.7 Arithmetic Coding
Arithmetic coding differs from the other methods discussed in that it takes in the complete data
stream and outputs a single codeword. This codeword is a floating-point number between 0
and 1. The larger the input data set, the more digits the output number has. The number is
constructed so that decoding it reproduces the input data stream exactly. Arithmetic coding,
like Huffman coding, is a two-pass algorithm. The first pass computes each character's frequency
and generates a probability table. The second pass does the actual compression.
The probability table assigns a range between 0 and 1 to each input character. The size of each
range is directly proportional to the character's frequency. The order in which these ranges are
assigned is not as important as the fact that the same order must be used by both the encoder
and the decoder. Each range consists of a low value and a high value. These parameters are very
important to the encode/decode process. The more frequently occurring characters are assigned
wider ranges in the interval, so fewer bits are required to represent them. The less likely
characters are assigned narrower ranges, requiring more bits.
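As a rough illustration of the first pass, the following Python sketch counts symbol frequencies and assigns each symbol a half-open range [low, high). The function name build_ranges, the sample input, and the choice of sorted symbol order are assumptions made for this example; any order works as long as the encoder and decoder share it. Fractions are used instead of floats so the range boundaries stay exact.

```python
from collections import Counter
from fractions import Fraction


def build_ranges(data):
    """Map each symbol to a half-open range [low, high) sized by its frequency."""
    counts = Counter(data)
    total = len(data)
    ranges = {}
    low = Fraction(0)
    # Sorted order is an arbitrary choice for this sketch; the encoder and
    # decoder only need to agree on the same order.
    for symbol in sorted(counts):
        high = low + Fraction(counts[symbol], total)
        ranges[symbol] = (low, high)
        low = high
    return ranges


# Hypothetical five-symbol input: three G, one R, one B.
table = build_ranges("GGRGB")
for symbol, (low, high) in table.items():
    print(f"{symbol}: [{float(low):.1f}, {float(high):.1f})")
```

The ranges tile the interval from 0 to 1 without gaps or overlap, so every number in that interval belongs to exactly one symbol.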
With arithmetic coding, you start out with the range 0.0 to 1.0 (Figure 10.9). The first character
input will constrain the output number with its corresponding range. The range of the next
character input will further constrain the output number. The more input characters there are, the
more precise the output number will be.
Figure 10.9 Assignment of ranges between 0 and 1.
Suppose we are working with an image that is composed of only red, green, and blue pixels.
After computing the frequency of these pixels, we have a probability table that looks like

Pixel    Probability    Assigned Range
Red      0.2            [0.0, 0.2)
Green    0.6            [0.2, 0.8)
Blue     0.2            [0.8, 1.0)
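To make the link between range width and bit cost concrete, the sketch below applies the standard information-content estimate, -log2(p), to the probabilities in the pixel table. The text does not state this formula, so treat it as an assumption brought in for illustration; the helper name ideal_bits is also hypothetical.

```python
import math

# Probabilities taken from the pixel table.
PROBABILITIES = {"Red": 0.2, "Green": 0.6, "Blue": 0.2}


def ideal_bits(probability):
    """Information content of one symbol in bits: -log2(p)."""
    return -math.log2(probability)


for pixel, p in PROBABILITIES.items():
    print(f"{pixel:5s} p = {p:.1f}  about {ideal_bits(p):.2f} bits")
```

Under this estimate a Green pixel costs about 0.74 bits, while a Red or Blue pixel costs about 2.32 bits, which matches the idea that wider ranges are cheaper to encode.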
The algorithm to encode is very simple.

LOW = 0.0
HIGH = 1.0
WHILE not end of input stream
    get next CHARACTER
    RANGE = HIGH - LOW
    HIGH = LOW + RANGE * high range of CHARACTER
    LOW = LOW + RANGE * low range of CHARACTER
END WHILE
output LOW
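The pseudocode translates almost line for line into Python. The version below is a minimal sketch, not a production encoder: the function name encode and the PIXEL_RANGES dictionary (built from the pixel table) are assumptions for this example, and exact fractions stand in for the floating-point values so that no rounding occurs.

```python
from fractions import Fraction

# Ranges from the pixel table, written as exact fractions.
PIXEL_RANGES = {
    "R": (Fraction(0), Fraction(2, 10)),
    "G": (Fraction(2, 10), Fraction(8, 10)),
    "B": (Fraction(8, 10), Fraction(1)),
}


def encode(symbols, ranges):
    """Narrow [low, high) once per symbol and return the final interval."""
    low, high = Fraction(0), Fraction(1)
    for symbol in symbols:
        width = high - low
        sym_low, sym_high = ranges[symbol]
        # HIGH must be updated before LOW, because both use the old LOW.
        high = low + width * sym_high
        low = low + width * sym_low
    return low, high


low, high = encode("GR", PIXEL_RANGES)
print(float(low), float(high))
```

The order of the two updates matters: if LOW were overwritten first, the new HIGH would be computed from the wrong base.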
Figure 10.10 shows how the range for our output is reduced as we process two possible input
streams.
[Figure 10.10: each panel shows the interval with boundaries 0.0, 0.2, 0.8, and 1.0 divided into RED, GREEN, and BLUE, repeated for each successive input character.]
Figure 10.10 Reduced output range: (a) Green-Green-Red; (b) Green-Blue-Green.
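The two streams named in the caption can be traced numerically. The sketch below records the interval after every pixel; the function name trace and the single-letter pixel codes are assumptions for this example, and the ranges come from the pixel table.

```python
from fractions import Fraction

PIXEL_RANGES = {
    "R": (Fraction(0), Fraction(2, 10)),
    "G": (Fraction(2, 10), Fraction(8, 10)),
    "B": (Fraction(8, 10), Fraction(1)),
}


def trace(stream):
    """Return the (low, high) interval after each symbol of the stream."""
    low, high = Fraction(0), Fraction(1)
    steps = []
    for symbol in stream:
        width = high - low
        sym_low, sym_high = PIXEL_RANGES[symbol]
        # Tuple assignment evaluates both right-hand sides with the old low.
        low, high = low + width * sym_low, low + width * sym_high
        steps.append((low, high))
    return steps


for label, stream in (("a", "GGR"), ("b", "GBG")):
    for symbol, (low, high) in zip(stream, trace(stream)):
        print(f"({label}) {symbol}: [{float(low):.3f}, {float(high):.3f})")
```

Under these assumptions, stream (a) Green-Green-Red ends in the interval [0.320, 0.392), and stream (b) Green-Blue-Green ends in [0.704, 0.776).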
Let's encode the string ARITHMETIC. Our frequency analysis will produce the following
probability table.
Symbol Probability Range
A 0.100000 0.000000 - 0.100000
C 0.100000 0.100000 - 0.200000
E 0.100000 0.200000 - 0.300000
H 0.100000 0.300000 - 0.400000
I 0.200000 0.400000 - 0.600000
M 0.100000 0.600000 - 0.700000
R 0.100000 0.700000 - 0.800000
T 0.200000 0.800000 - 1.000000
Before we start, LOW is 0 and HIGH is 1. Our first input is A. RANGE = 1 - 0 = 1. HIGH will
be 0 + 1 x 0.1 = 0.1. LOW will be 0 + 1 x 0 = 0. These three calculations will be repeated until
the input stream is exhausted. As we process each character in the string, RANGE, LOW, and
HIGH will look like
A  range = 1.000000000  low = 0.0000000000  high = 0.1000000000
R  range = 0.100000000  low = 0.0700000000  high = 0.0800000000
I  range = 0.010000000  low = 0.0740000000  high = 0.0760000000
T  range = 0.002000000  low = 0.0756000000  high = 0.0760000000
H  range = 0.000400000  low = 0.0757200000  high = 0.0757600000
M  range = 0.000040000  low = 0.0757440000  high = 0.0757480000
E  range = 0.000004000  low = 0.0757448000  high = 0.0757452000
T  range = 0.000000400  low = 0.0757451200  high = 0.0757452000
I  range = 0.000000080  low = 0.0757451520  high = 0.0757451680
C  range = 0.000000016  low = 0.0757451536  high = 0.0757451552
Our output is then 0.0757451536.
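The table can be checked mechanically. The sketch below reruns the encoding loop on ARITHMETIC with Python's decimal module, so that values such as 0.1 are held exactly; binary floats would pick up small rounding errors over ten steps. The function name encode and the string-based RANGES layout are assumptions made for this example.

```python
from decimal import Decimal

# Ranges from the ARITHMETIC probability table, kept as decimal strings.
RANGES = {
    "A": ("0.0", "0.1"), "C": ("0.1", "0.2"), "E": ("0.2", "0.3"),
    "H": ("0.3", "0.4"), "I": ("0.4", "0.6"), "M": ("0.6", "0.7"),
    "R": ("0.7", "0.8"), "T": ("0.8", "1.0"),
}


def encode(message):
    """Print RANGE, LOW, and HIGH for each symbol and return the final LOW."""
    low, high = Decimal(0), Decimal(1)
    for symbol in message:
        width = high - low
        sym_low, sym_high = (Decimal(x) for x in RANGES[symbol])
        high = low + width * sym_high
        low = low + width * sym_low
        print(f"{symbol} range = {width:.9f} low = {low:.10f} high = {high:.10f}")
    return low


code = encode("ARITHMETIC")
print(code)
```

The final LOW equals 0.0757451536. Any number in the final interval would identify the same message; LOW is simply the value the algorithm chooses to output.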

The decoding algorithm is just the reverse process.
get NUMBER
DO
    find CHARACTER that has HIGH > NUMBER and LOW <= NUMBER
    set HIGH and LOW corresponding to CHARACTER
    output CHARACTER
    RANGE = HIGH - LOW
    NUMBER = NUMBER - LOW
    NUMBER = NUMBER / RANGE
UNTIL no more CHARACTERs
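A Python sketch of the decoder follows. The pseudocode leaves the stopping condition open ("UNTIL no more CHARACTERs"), so this version assumes the decoder is told the message length; that is one possible convention, not something the text specifies. The names decode and length are illustrative, and exact fractions are used so that boundary cases behave predictably.

```python
from fractions import Fraction

# Ranges from the ARITHMETIC probability table as exact fractions.
RANGES = {
    symbol: (Fraction(low), Fraction(high))
    for symbol, (low, high) in {
        "A": ("0.0", "0.1"), "C": ("0.1", "0.2"), "E": ("0.2", "0.3"),
        "H": ("0.3", "0.4"), "I": ("0.4", "0.6"), "M": ("0.6", "0.7"),
        "R": ("0.7", "0.8"), "T": ("0.8", "1.0"),
    }.items()
}


def decode(number, length):
    """Recover `length` symbols from a number in [0, 1)."""
    output = []
    for _ in range(length):
        # Find the symbol whose half-open range [low, high) holds the number.
        symbol, (low, high) = next(
            (s, r) for s, r in RANGES.items() if r[0] <= number < r[1]
        )
        output.append(symbol)
        # Remove this symbol's contribution and rescale back to [0, 1).
        number = (number - low) / (high - low)
    return "".join(output)


print(decode(Fraction("0.0757451536"), 10))
```

With exact arithmetic the last step lands precisely on 0.1, the lower boundary of C's range, which is why the lower-bound comparison has to be inclusive.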
As we decode 0.0757451536, we see
num = 0.075745153600  A  Range = 0.1  low = 0.0  high = 0.1
num = 0.757451536000  R  Range = 0.1  low = 0.7  high = 0.8
num = 0.574515360000  I  Range = 0.2  low = 0.4  high = 0.6
num = 0.872576800000  T  Range = 0.2  low = 0.8  high = 1.0
num = 0.362884000000  H  Range = 0.1  low = 0.3  high = 0.4
num = 0.628840000000  M  Range = 0.1  low = 0.6  high = 0.7
num = 0.288400000002  E  Range = 0.1  low = 0.2  high = 0.3
num = 0.884000000024  T  Range = 0.2  low = 0.8  high = 1.0
num = 0.420000000120  I  Range = 0.2  low = 0.4  high = 0.6
num = 0.100000000598  C  Range = 0.1  low = 0.1  high = 0.2
Arithmetic coding is one possible algorithm for use in the entropy coder during JPEG
compression (see the next part). It achieves slightly higher compression ratios than the
Huffman option but is computationally more intensive.
