
10
IMAGE COMPRESSION : ERROR FREE AND LOSSY COMPRESSION
Unit Structure :

10.0 Objectives
10.1 Error-free compression
10.2 Variable-length coding
10.2.1 Huffman coding
10.2.2 Arithmetic coding
10.3 LZW (Lempel-Ziv-Welch) Coding
10.4 Bit-Plane coding
10.4.1 Run-length Coding of bit-plane images
10.5 Lossless Predictive Coding
10.6 Summary
10.7 Unit end exercise
10.8 Further Reading

10.0 OBJECTIVES

Objectives of this chapter are:


 Define and measure source entropy.
 Measure the coding efficiency of an encoding scheme.
 State the basic principles of Huffman coding.
 Assign Huffman codes to a set of symbols of known probabilities.
 Explain why Huffman coding is not optimal.
 State the basic principles of arithmetic coding.
 Encode a sequence of symbols into an arithmetic coded bit stream.
 Decode an arithmetic coded bit stream.
 State the coding efficiency limitations of arithmetic coding.
 State the basic principles of Lempel-Ziv coding


 Encode a sequence of symbols into a Lempel-Ziv coded bit stream.


 Decode a Lempel-Ziv coded bit stream.
 Convert a gray-scale image into bit-plane images.
 Apply run-length coding on binary images.
 State the basic principles of lossless predictive coding.
 Reconstruct the original image from the error image and the predicted image.

10.1 ERROR-FREE COMPRESSION

 If the compression and decompression processes induce no information loss, then the
compression scheme is lossless; otherwise, it is lossy.
 Error-free compression techniques usually rely on entropy-based encoding algorithms.
 How do we measure information?
 What is the information content of an image?
 What is the minimum amount of data that is sufficient to describe completely an image
without loss of information?
 Modelling Information
 Information generation is assumed to be a probabilistic process.
 Idea: associate information with probability!
 A random event E with probability P(E) contains

  I(E) = log(1/P(E)) = -log P(E)

units of information. Note that I(E) = 0 when P(E) = 1: a certain event conveys no information.

 Suppose that gray-level values are generated by a random variable; then a gray level rk contains

  I(rk) = -log2 Pr(rk)

units of information per pixel.

 Therefore the average information content of an image is:

  E = sum over k = 0, ..., L-1 of I(rk) Pr(rk)

 Using I(rk) = -log2 Pr(rk), we obtain the entropy:

  H = - sum over k = 0, ..., L-1 of Pr(rk) log2 Pr(rk)  units/pixel

 Redundancy:

  R = Lavg - H

where Lavg is the average number of bits per pixel actually used by the code.
 If Lavg = H, then R = 0 (no redundancy).

 Entropy Estimation
 It is not easy to estimate H reliably!
 Image: Table 10.1 gives the gray levels of the test image and their relative frequencies.

 First-order estimate of H: use the relative frequencies of individual pixel values (Table 10.1).

 Second-order estimate of H: use the relative frequencies of 2-pixel blocks (Table 10.2).

 The first-order estimate provides only a lower bound on the compression that can be achieved.
 Differences between higher-order estimates of entropy and the first-order estimate indicate the presence of interpixel redundancy; the sketch below computes both estimates.
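 Both estimates are easy to compute from pixel frequencies. The following Python sketch (a rough illustration using NumPy; the 4x8 test image is an arbitrary stand-in, since Table 10.1 is not reproduced here) computes them:

import numpy as np

def entropy_estimates(img):
    """Return (first-order, second-order) entropy estimates in bits/pixel."""
    px = img.astype(np.int64).ravel()
    # First-order estimate: relative frequencies of individual gray levels.
    _, counts = np.unique(px, return_counts=True)
    p = counts / counts.sum()
    h1 = -(p * np.log2(p)).sum()
    # Second-order estimate: relative frequencies of 2-pixel blocks
    # (adjacent pairs), divided by 2 to express it per pixel.
    pairs = px[:-1] * 256 + px[1:]            # pack each pair into one integer
    _, counts2 = np.unique(pairs, return_counts=True)
    p2 = counts2 / counts2.sum()
    h2 = -(p2 * np.log2(p2)).sum() / 2
    return h1, h2

img = np.array([[21, 21, 21, 95, 169, 243, 243, 243]] * 4, dtype=np.uint8)
h1, h2 = entropy_estimates(img)
print(h1, h2)   # a gap h1 - h2 > 0 signals interpixel redundancy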


 Lossless Methods: Taxonomy

Figure 10.1 : Lossless Methods

10.2 VARIABLE-LENGTH CODING

 A variable-length code is a code which maps source symbols to a variable number of bits.
 Variable-length codes can allow sources to be compressed and decompressed with zero
error (lossless data compression)
 Most entropy-based encoding techniques rely on assigning variable-length codewords to
each symbol, with the most likely symbols assigned the shortest codewords.
 In the case of image coding, the symbols may be raw pixel values or the numerical values
obtained at the output of the mapper stage (e.g., differences between consecutive pixels,
run-lengths, etc.).
 The most popular entropy-based encoding technique is the Huffman code.
 When source symbols are coded one at a time, it provides the least number of information units (bits) per source symbol.

10.2.1 Huffman coding


 In 1952, D. A. Huffman developed a code construction method that can be used to perform
lossless compression.
 Huffman coding is based on the frequency of occurrence of a data item (pixel in images).
 The principle is to use a lower number of bits to encode the data that occurs more
frequently.
 Codes are stored in a Code Book which may be constructed for each image or a set of
images.
 In all cases the code book plus encoded data must be transmitted to enable decoding.
 Huffman coding yields the smallest possible number of code symbols per source symbol.


 Huffman Coding Algorithm:


 Order the gray levels according to their frequency of use (probability)
 Most frequent first
 Combine the two least used gray levels into one group. Combine their probabilities and
reorder the gray levels
 Continue until only two gray levels are left
 Now allocate a 0 to one of these gray level groups and 1 to the other
 Work back through the groupings: wherever two groups were combined to form a new,
larger group whose current code is ‘ccc’, code one of the smaller groups as ‘ccc0’ and
the other as ‘ccc1’

Figure 10.2 : Flowchart for Huffman algorithm

 Step I:
 Order the probabilities of the symbols under consideration.

 Combine the two lowest-probability symbols into a single symbol; reorder and repeat until only two symbols remain.

 Step II:
 Work back through the reductions, appending 0 and 1 to distinguish each pair of combined symbols.

1. For a six-symbol source with probabilities 0.4, 0.3, 0.1, 0.1, 0.06 and 0.04, the resulting code lengths are 1, 2, 3, 4, 5 and 5 bits, so:

Lavg= (0.4)(1) + (0.3)(2) + (0.1)(3) + (0.1)(4) + (0.06)(5) + (0.04)(5) = 2.2 bits/symbol
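 The same construction can be written compactly with a priority queue. The Python sketch below builds a Huffman code for the six-symbol source above (the symbol names a1-a6 are illustrative; under ties the individual code lengths may differ from a hand-worked reduction, but Lavg is the same):

import heapq

def huffman_code(probs):
    """Build a Huffman code for {symbol: probability}; returns {symbol: bitstring}."""
    # Heap entries: (probability, tie-breaker, {symbol: partial codeword}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)        # the two least-probable groups
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}         # prefix 0 to one group
        merged.update({s: "1" + c for s, c in c2.items()})   # and 1 to the other
        heapq.heappush(heap, (p1 + p2, tie, merged))
        tie += 1
    return heap[0][2]

probs = {"a1": 0.4, "a2": 0.3, "a3": 0.1, "a4": 0.1, "a5": 0.06, "a6": 0.04}
codes = huffman_code(probs)
lavg = sum(probs[s] * len(codes[s]) for s in probs)
print(codes, lavg)   # Lavg comes to 2.2 bits/symbol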

 Properties of Huffman Coding


1. Unique Prefix Property: No Huffman code is a prefix of any other Huffman code -
precludes any ambiguity in decoding.
2. Optimality: minimum redundancy code - proved optimal for a given data model (i.e., a
given, accurate, probability distribution):
• The two least frequent symbols will have the same length for their Huffman codes,
differing only at the last bit.
• Symbols that occur more frequently will have shorter Huffman codes than symbols
that occur less frequently.

Limitations of Huffman Coding


 To achieve the entropy of a DMS (Discrete Memoryless Source), the symbol probabilities
should be negative powers of 2 (i.e., -log2 pi should be an integer).


 Huffman coding uses an integer number (k) of bits for each symbol, and k is never less
than 1, i.e., it cannot assign fractional code lengths.
 It cannot efficiently adapt to changing source statistics.
 The number of entries in the Huffman table grows exponentially with block size.
 Sometimes, e.g., when coding a binary (1 bit/pixel) image symbol by symbol, compression becomes impossible.

10.2.2 Arithmetic coding


 Arithmetic coding solves many limitations of Huffman coding.
 Arithmetic encoders are better suited for adaptive models than Huffman coding.
 It is an entropy encoding technique in which frequently seen symbols are encoded with
fewer bits than rarely seen symbols.
 Unlike Huffman coding, it does not encode source symbols one at a time.
 Sequences of source symbols are encoded together.
 There is no one-to-one correspondence between source symbols and code words.
 Slower than Huffman coding but typically achieves better compression.
 A sequence of source symbols is assigned a single arithmetic code word which
corresponds to a sub-interval in [0,1].
 As the number of symbols in the message increases, the interval used to represent it
becomes smaller.
 Smaller intervals require more information units (i.e., bits) to be represented.


 Algorithm: see the flowchart in Figure 10.3.


Figure 10.3 : Flowchart for Arithmetic algorithm


Table 10.3

 Encode the message: a1 a2 a3 a3 a4

1) Assume the message occupies the interval [0, 1).

2) Subdivide [0, 1) in proportion to the probability of each symbol ai.

3) Update (narrow) the interval as each source symbol is processed.

 Example:
Table 10.4


 Encoding a1 a2 a3 a3 a4 yields the final interval [0.06752, 0.0688); any number inside it, e.g. 0.068, represents the message.

 The message a1 a2 a3 a3 a4 is thus encoded using 3 decimal digits, or 3/5 = 0.6 decimal digits
per source symbol.
 The entropy of this source is:

 -(3 x 0.2 log10(0.2) + 0.4 log10(0.4)) = 0.5786 digits/symbol

 Note: finite precision arithmetic might cause problems due to truncations!
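 The interval narrowing is straightforward to code. Below is a minimal Python sketch, using the probabilities implied by the entropy expression above (P(a1) = P(a2) = P(a4) = 0.2, P(a3) = 0.4, an assumption since Table 10.3 is not reproduced here); practical coders use finite-precision integer arithmetic with renormalization instead of floats:

def arithmetic_encode(message, probs):
    """Narrow [0, 1) once per symbol; return the final (low, high) interval."""
    # Cumulative probability marking where each symbol's sub-interval starts.
    cum, start = {}, 0.0
    for sym, p in probs.items():
        cum[sym] = start
        start += p
    low, high = 0.0, 1.0
    for sym in message:
        width = high - low                     # current interval width
        low, high = (low + width * cum[sym],
                     low + width * (cum[sym] + probs[sym]))
    return low, high

probs = {"a1": 0.2, "a2": 0.2, "a3": 0.4, "a4": 0.2}
print(arithmetic_encode(["a1", "a2", "a3", "a3", "a4"], probs))
# -> approximately (0.06752, 0.0688); 0.068 lies inside and encodes the message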

10.3 LZW (LEMPEL-ZIV-WELCH) CODING

 The Lempel-Ziv-Welch algorithm is named after the three inventors and is usually referred
to as the LZW algorithm.


 It is a lossless compression algorithm: no data is lost, and the original image can be entirely
reconstructed from the encoded data.
 It requires no prior knowledge of the pixel probability distribution.
 It is included in the GIF, TIFF, and PDF file formats.
 LZW uses fixed-length codewords to represent variable-length strings of
symbols/characters that commonly occur together.
 A codebook (or dictionary) needs to be constructed
 The LZW encoder and decoder build up the same dictionary dynamically while receiving
the data.
 LZW places longer and longer repeated entries into a dictionary, and then emits the code
for an element, rather than the string itself, if the element has already been placed in the
dictionary.
 Initially, the first 256 entries of the dictionary are assigned to the gray levels 0, 1, 2, ..., 255
(i.e., assuming 8 bits/pixel).
 Consider a 4x4, 8-bit image:
39 39 126 126
39 39 126 126
39 39 126 126
39 39 126 126

 As the encoder examines image pixels, gray level sequences (i.e., blocks) that are not in
the dictionary are assigned to a new entry.

Table 10.5


 Compression Algorithm:

BEGIN
  s = next input character;
  while not EOF
  {
    c = next input character;
    if s + c exists in the dictionary
      s = s + c;
    else
    {
      output the code for s;
      add string s + c to the dictionary with a new code;
      s = c;
    }
  }
  output the code for s;
END
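 A runnable Python version of this loop is sketched below for the 8-bit case (codes are kept as plain integers; packing them into fixed-width codewords is omitted). Run on the 4x4 image above, it reproduces the dictionary growth described in the text:

def lzw_encode(pixels):
    """LZW-encode a sequence of 8-bit values into a list of integer codes."""
    # Seed the dictionary with all single gray levels 0..255.
    dictionary = {(g,): g for g in range(256)}
    next_code = 256
    out, s = [], ()
    for c in pixels:
        if s + (c,) in dictionary:          # extend the current string
            s = s + (c,)
        else:
            out.append(dictionary[s])       # emit code for the longest match
            dictionary[s + (c,)] = next_code
            next_code += 1
            s = (c,)
    if s:
        out.append(dictionary[s])           # flush the last string
    return out

# The 4x4 image from the text, scanned row by row:
img = [39, 39, 126, 126] * 4
print(lzw_encode(img))  # [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]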


Figure 10.4 : LZW compression flowchart

 Decoding LZW
 The dictionary used for encoding need not be sent with the image.
 It can be built on the fly by the decoder as it reads the received code words.
 LZW Decompression Algorithm:

BEGIN
  s = NIL;
  while not EOF
  {
    k = next input code;
    entry = dictionary entry for k;
    output entry;
    if (s != NIL)
      add string s + entry[0] to dictionary with a new code;
    s = entry;
  }
END
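 A matching Python sketch of the decoder follows; it also handles the one special case the pseudocode above glosses over, where a received code refers to the dictionary entry still under construction:

def lzw_decode(codes):
    """Rebuild the pixel sequence from LZW codes, reconstructing the dictionary."""
    dictionary = {g: (g,) for g in range(256)}   # same seed as the encoder
    next_code = 256
    s = dictionary[codes[0]]
    out = list(s)
    for k in codes[1:]:
        if k in dictionary:
            entry = dictionary[k]
        else:                        # code not in the dictionary yet: it must be
            entry = s + (s[0],)      # s followed by the first symbol of s
        out.extend(entry)
        dictionary[next_code] = s + (entry[0],)  # mirror the encoder's new entry
        next_code += 1
        s = entry
    return out

codes = lzw_encode([39, 39, 126, 126] * 4)       # from the encoder sketch above
assert lzw_decode(codes) == [39, 39, 126, 126] * 4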


Figure 10.5 : LZW decompression flowchart

10.4 BIT-PLANE CODING

 A bit plane of a digital image is a set of bits corresponding to a given bit position in each of
the binary numbers representing the image.


 Converting a gray-scale image into bit-plane images:


 Any integer value s in the interval [0, 2^n - 1] can be represented by a polynomial of base 2,
as follows:

  s = a_{n-1} 2^{n-1} + a_{n-2} 2^{n-2} + ... + a_1 2^1 + a_0 2^0

where a_{n-1}, ..., a_0 are the n binary (0 or 1) coefficients associated with the corresponding
powers of 2.
 This basically converts the integer s into the binary number representation a_{n-1} a_{n-2} ... a_1 a_0,
with a_{n-1} as the most significant bit and a_0 as the least significant bit.
 Following this, all the pixel values of an image array having intensities in the range [0, 255]
can be converted into their 8-bit binary representations.

 If we now consider an array containing only the i-th bit of every pixel (i = 0, 1, ..., 7), we get
the i-th bit-plane image.
 As an illustration, consider a simple 4x4 array whose elements are integers in the range 0-15,
each written alongside its 4-bit binary representation; collecting the i-th bit of every element
yields the corresponding four bit planes.

 Bit-plane coding is an effective technique for reducing interpixel redundancy by processing each
bit plane individually:
1) Decompose an image into a series of binary images (sketched in code below).
2) Compress each binary image (e.g., using run-length coding)
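 With NumPy, the decomposition of step 1 takes one line per plane. The 4x4 array below is an arbitrary example with values in 0-15 (4 bits), in the spirit of the example above:

import numpy as np

def bit_planes(img, n_bits=8):
    """Split a gray-scale image into n_bits binary bit-plane images."""
    # Plane i holds bit i of every pixel (i = 0 is the least significant bit).
    return [(img >> i) & 1 for i in range(n_bits)]

img = np.array([[ 0,  5, 10, 15],
                [ 3,  6,  9, 12],
                [ 1,  4,  8, 13],
                [ 2,  7, 11, 14]], dtype=np.uint8)
for i, plane in enumerate(bit_planes(img, n_bits=4)):
    print(f"bit plane {i}:\n{plane}")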


10.4.1 Run-length Coding of Bit-Plane Images


 In one-dimensional run-length coding schemes for binary images, runs of continuous 1s or
0s in every row of an image are encoded together, resulting in substantial bit savings.
 A convention must indicate whether each row begins with a run of 1s or of 0s.
 Run-length encoding may be illustrated with the following example of a row of an image:
000110100011111
 Assuming the convention that each row begins with a run of 1s, the first run count in the given binary sequence is 0, since the sequence actually starts with 0s.
 Then we have a run of three 0s.
 Hence, the next count is 3.
 Proceeding in this manner, the reader may verify that the given binary sequence gets
encoded to: 0,3,2,1,1,3,5
 Run-length coding can compress any type of data, but it cannot achieve the high compression
ratios of other compression methods.
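 A short Python sketch of this scheme; the default that the first count refers to a run of 1s is an assumption, chosen to reproduce the example's output:

def run_lengths(row, first_run_of=1):
    """Run-length encode a binary row. By convention the first count is for a
    run of `first_run_of`; it is 0 when the row starts with the other symbol."""
    counts, current, run = [], first_run_of, 0
    for bit in row:
        if bit == current:
            run += 1
        else:
            counts.append(run)         # close the current run...
            current, run = bit, 1      # ...and start one for the other symbol
    counts.append(run)
    return counts

row = [int(c) for c in "000110100011111"]
print(run_lengths(row))   # [0, 3, 2, 1, 1, 3, 5]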

10.5 LOSSLESS PREDICTIVE CODING

 It’s an error-free compression approach.


 Does not require decomposition of an image into a collection of bit planes.
 Based on the interpixel redundancies of closely spaced pixels: only the new information in each
pixel is extracted and coded.
 The new information of a pixel is the difference between its actual and predicted values.

Figure 10.6 : Block diagram of Lossless predictive encoder


 The current pixel s(n1, n2) is predicted as ŝ(n1, n2) using a linear combination of previously
received pixels, which are spatial neighbors of the current pixel.
 The error in prediction, given by

  e(n1, n2) = s(n1, n2) - ŝ(n1, n2)

is encoded using any of the lossless compression schemes discussed so far.


 Since the predictor can only consider the pixels received so far, the prediction is based on the
set of pixels s(n1, 0), ..., s(n1, n2 - 1) from the current row and the set of pixels
s(n1 - 1, 0), ..., s(n1 - 1, N2 - 1) from the previous row, where N2 is the
number of columns.
 In practice, we normally consider the closest past neighbors of s(n1, n2) and obtain the
predicted value ŝ(n1, n2) as a linear combination of these pixels, as shown in the figure:

Figure 10.7 : Closest neighbors used to predict s(n1, n2)


 This is mathematically expressed as:

  ŝ(n1, n2) = a1 s(n1, n2 - 1) + a2 s(n1 - 1, n2 - 1) + a3 s(n1 - 1, n2) + a4 s(n1 - 1, n2 + 1)

where a1, a2, a3, a4 are the coefficients associated with the respective neighboring pixels,
such that a1 + a2 + a3 + a4 = 1.
 The above equation describes linear predictive coding.
 The corresponding lossless predictive decoder is shown in figure below :

Figure 10.8 : Block diagram of Lossless predictive decoder

 In the decoder, the same prediction mechanism is replicated.


 It receives the encoded error signal e(n1, n2) and reconstructs the current pixel using:

  s(n1, n2) = ŝ(n1, n2) + e(n1, n2)
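 A minimal end-to-end sketch in Python, using the simplest causal predictor (a1 = 1, i.e., ŝ(n1, n2) = s(n1, n2 - 1)) so the whole round trip fits in a few lines; the structure is the same for the four-coefficient predictor with rounding:

import numpy as np

def predictive_encode(img):
    """Emit e(n1, n2) = s(n1, n2) - s(n1, n2 - 1); the first pixel of each
    row is transmitted as-is (its prediction is taken to be 0)."""
    s = img.astype(np.int32)
    e = s.copy()
    e[:, 1:] = s[:, 1:] - s[:, :-1]
    return e

def predictive_decode(e):
    """Replicate the predictor: s(n1, n2) = s_hat(n1, n2) + e(n1, n2)."""
    return np.cumsum(e, axis=1)       # undoes the row-wise differencing

img = np.array([[100, 102, 105, 105],
                [101, 103, 106, 107]], dtype=np.uint8)
e = predictive_encode(img)
assert np.array_equal(predictive_decode(e), img)   # exact reconstruction: lossless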

10.6 SUMMARY

 If the compression and decompression processes induce no information loss, then the
compression scheme is lossless; otherwise, it is lossy.
 Information generation is assumed to be a probabilistic process.
 Variable-length codes can allow sources to be compressed and decompressed with zero
error (lossless data compression).
 Huffman coding is based on the frequency of occurrence of a data item (pixel in images).
 Arithmetic encoders are better suited for adaptive models than Huffman coding.
 LZW uses fixed-length codewords to represent variable-length strings of
symbols/characters that commonly occur together.
 LZW places longer and longer repeated entries into a dictionary, and then emits the code
for an element, rather than the string itself, if the element has already been placed in the
dictionary.
 A bit plane of a digital image is a set of bits corresponding to a given bit position in each of
the binary numbers representing the image.
 Bit-plane coding is an effective technique that reduces interpixel redundancy by processing
each bit plane individually.
 In one-dimensional run-length coding schemes for binary images, runs of continuous 1s or
0s in every row of an image are encoded together, resulting in substantial bit savings.

10.7 UNIT END EXERCISE

1. Define the entropy of a source of symbols. How is entropy related to uncertainty?


2. State the basic principles of Huffman coding.
3. Explain why arithmetic coding can achieve better coding efficiency than Huffman.
4. State the basic principles of arithmetic coding.
5. Make a comparison between Huffman coding and arithmetic coding.
6. State the basic principles of Lempel-Ziv coding.
7. Explain with example LZW coding technique.
8. How is it possible to convert a gray-scale image into bit planes? Explain briefly.


9. State the basic principles of run-length coding of binary images.


10. State the basic principles of lossless predictive coding.
11. What is lossy compression? Explain lossy predictive coding with block diagram.

10.8 FURTHER READING

1. R. C. Gonzalez and R. E. Woods, Digital Image Processing, Second Edition, Pearson
Education.




