06 Image Compression
Image processing (CoSc4151)
Chapter six
Image compression
Jimma University
Jimma Institute of Technology
For Computer Science students
Abel W.
Image Compression
CV & IP 02 2
Applications that require image compression are many and varied, such as:
1. Internet,
2. Businesses,
3. Multimedia,
4. Satellite imaging,
5. Medical imaging
Image Compression
Storage problems, plus the desire to exchange images over the Internet,
have led to a large interest in image compression algorithms.
Compression algorithms remove redundancy
If more data are used than is strictly necessary, then we say that there is
redundancy in the dataset.
Data redundancy is not an abstract concept but a mathematically quantifiable entity. If n1
and nc denote the number of information-carrying units in two data sets that represent
the same information, the relative data redundancy RD of the first data set (n1) can
be defined as
RD = 1 - 1/CR (1)
where the compression ratio CR is
CR = n1/nc (2)
Here n1 is the number of information-carrying units used in the uncompressed dataset
and nc is the number of units in the compressed dataset. The same units should be used
for n1 and nc; bits or bytes are typically used.
When nc << n1, CR takes a large value and RD approaches 1. Larger values of CR
indicate better compression.
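These two definitions can be checked with a short Python sketch (the function names are illustrative, not from the slides):

```python
def compression_ratio(n1, nc):
    """CR = n1 / nc, where n1 is the size of the uncompressed data set
    and nc the size of the compressed one (same units for both)."""
    return n1 / nc

def relative_redundancy(n1, nc):
    """RD = 1 - 1/CR, the relative data redundancy of eq. (1)."""
    return 1 - 1 / compression_ratio(n1, nc)

# Example: a 100,000-byte image compressed to 25,000 bytes.
cr = compression_ratio(100_000, 25_000)    # CR = 4.0
rd = relative_redundancy(100_000, 25_000)  # RD = 0.75
print(cr, rd)
```

Here 75 percent of the original data is redundant relative to the compressed representation.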
Entropy
Entropy in information theory is the measure of the
information content of a message.
H = -\sum_{i=0}^{2^b - 1} P(i) \log_2 P(i)
where b is the number of bits per pixel and P(i) is the probability of gray level i occurring in the image.
A general algorithm for data compression and image reconstruction

Input image f(x,y) → Source encoder (data redundancy reduction) → Channel encoder → Channel → Channel decoder → Source decoder (reconstruction) → Reconstructed image f'(x,y)
An input image is fed into the encoder, which creates a set of symbols from the input
data. After transmission over the channel, the encoded representation is fed to the
decoder, where a reconstructed output image f'(x,y) is generated. In general, f'(x,y)
may or may not be an exact replica of f(x,y). If it is, the system is error-free or
information-preserving; if not, some level of distortion is present in the reconstructed
image.
Redundancy: information and data
Data is not the same thing as information.
Data is the means with which information is
expressed.
The amount of data can be much larger than the
amount of information.
Redundant data doesn't provide additional
information.
Image coding or compression aims at reducing the
amount of data while keeping the information by
reducing the amount of redundancy.
Types of redundancy
Different Types of Redundancy
Coding Redundancy
Some gray levels are more common than
others.
Inter-pixel Redundancy
The same gray level may cover
a large area.
Psycho-Visual Redundancy
The eye can only resolve about
32 gray levels locally.
Coding redundancy
Our quantized data is represented using code-words
The code-words are ordered in the same way as the intensities that they
represent;
thus the bit pattern 00000000, corresponding to the value 0,
represents the darkest points in an image and the bit pattern
11111111, corresponding to the value 255, represents the brightest
points.
If the size of the code-word is larger than is necessary to
represent all quantization levels, then we have coding redundancy
Coding redundancy – Example (Huffman coding)

rk         Pr(rk)   Code 1   l1(rk)   Code 2   l2(rk)
r0 = 0      0.19     000       3       11        2
r1 = 1/7    0.25     001       3       01        2
r2 = 2/7    0.21     010       3       10        2
r3 = 3/7    0.16     011       3       001       3
r4 = 4/7    0.08     100       3       0001      4
r5 = 5/7    0.06     101       3       00001     5
r6 = 6/7    0.03     110       3       000001    6
r7 = 1      0.02     111       3       000000    6

L_{avg} = \sum_{k=0}^{7} l_2(r_k) Pr(r_k) = 2.7 bits/pixel
Using eq. (2), the resulting compression ratio CR is 3/2.7 ≈ 1.11. Thus approximately
10 percent of the data resulting from the use of Code 1 is redundant. The exact level
of redundancy is
RD = 1 - 1/1.11 ≈ 0.099
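A quick numeric check of this example, with the probabilities and Code 2 lengths taken from the table above:

```python
# Average length of Code 2, compression ratio relative to the 3-bit
# Code 1, and the resulting relative redundancy RD.
probs   = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]
lengths = [2, 2, 2, 3, 4, 5, 6, 6]                 # l2(rk) for Code 2

l_avg = sum(l * p for l, p in zip(lengths, probs))  # 2.7 bits/pixel
cr    = 3 / l_avg                                   # about 1.11
rd    = 1 - 1 / cr                                  # about 0.099
print(l_avg, round(cr, 2), round(rd, 3))
```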
Image compression
Reversible (lossless)
no loss of information.
The image after compression and decompression is identical to
the original image.
Often necessary in image analysis applications.
The compression ratio is typically 2 to 10 times.
Data compression
Data compression methods
Image Coding and Compression
LOSSLESS COMPRESSION METHODS
1. Huffman Coding
The Huffman code, developed by D. Huffman in
1952, is a minimum length code
This means that given the statistical distribution of
the gray levels (the histogram), the Huffman
algorithm will generate a code that is as close as
possible to the minimum bound, the entropy
Cont’d…
The method results in an unequal (or variable)
length code, where the size of the code words can
vary
For complex images, Huffman coding alone will
typically reduce the file by 10% to 50% (1.1:1 to
1.5:1), but this ratio can be improved to 2:1 or 3:1
by preprocessing for irrelevant information
removal
Cont’d…
The Huffman algorithm can be described in five steps:
1. Find the gray level probabilities for the image by finding the
histogram
2. Order the input probabilities (histogram magnitudes) from
smallest to largest
3. Combine the smallest two by addition
4. GOTO step 2, until only two probabilities are left
5. By working backward along the tree, generate code by
alternating assignment of 0 and 1
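The five steps above can be sketched in Python with a heap of partial trees. This is a minimal illustration, not the notes' own pseudocode; the symbols and probabilities are those of the six-symbol source tabulated later in this chapter:

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Build a Huffman code from a {symbol: probability} mapping by
    repeatedly merging the two least probable nodes (steps 2-4),
    then reading the bits back off the merges (step 5)."""
    tiebreak = count()  # avoids comparing dicts when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)   # smallest probability
        p2, _, codes2 = heapq.heappop(heap)   # second smallest
        # Prefix a 0 onto one subtree's codes and a 1 onto the other's.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

# The six-symbol source from the reduction table in this chapter:
code = huffman_code({"a2": 0.4, "a6": 0.3, "a1": 0.1,
                     "a4": 0.1, "a3": 0.06, "a5": 0.04})
print(code)
```

Because of ties, the individual code words may differ from the table, but the average length (2.2 bits/symbol here) is the same for every valid Huffman code of a source.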
Huffman Coding
• Huffman coding is the most popular technique for
removing coding redundancy.
• Unique prefix property: no code word is a prefix of any other code word
• Instantaneous decoding property: each code word can be decoded as soon as its last bit arrives
• Optimality: no other uniquely decodable code has a smaller average length
• Used in the JPEG standard
Huffman coding
Huffman coding assigns shorter codes to symbols that
occur more frequently and longer codes to those that occur
less frequently. For example, imagine we have a text file that
uses only five characters (A, B, C, D, E).
Before we can assign bit patterns to each character, we
assign each character a weight based on its frequency of use.
In this example, assume that the character frequencies are as shown in the following figure.
Huffman coding
A character’s code is found by starting at the root and
following the branches that lead to that character. The code
itself is the bit value of each branch on the path, taken in
sequence.
Final tree and code
Encoding
Let us see how to encode text using the code for our five
characters. Figure 15.6 shows the original and the
encoded text.
Huffman encoding
Decoding
The recipient has a very easy job in decoding the data
it receives. Figure 15.7 shows how decoding takes
place.
Huffman decoding
The table below shows the Huffman source-reduction process for a six-symbol source:
at each step the two smallest probabilities are combined, and the final code is read back
from the reductions.

Symbol   Probability   Step 1   Step 2   Step 3   Step 4   Code
a2          0.4         0.4      0.4      0.4      0.6     1
a6          0.3         0.3      0.3      0.3      0.4     00
a1          0.1         0.1      0.2      0.3              011
a4          0.1         0.1      0.1                       0100
a3          0.06        0.1                                01010
a5          0.04                                           01011
2. Run-length encoding
The general idea behind this method is to replace
consecutive repeating occurrences of a symbol by
one occurrence of the symbol followed by the
number of occurrences.
The method can be even more efficient if the data
uses only two symbols (for example 0 and 1) in its
bit pattern and one symbol is more frequent than the
other.
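The idea above can be sketched in a few lines of Python (symbol/run pairs; the function names are illustrative):

```python
def rle_encode(data):
    """Replace consecutive repeats of a symbol by (symbol, run_length)."""
    runs = []
    for sym in data:
        if runs and runs[-1][0] == sym:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([sym, 1])     # start a new run
    return [(s, n) for s, n in runs]

def rle_decode(runs):
    """Expand (symbol, run_length) pairs back into the original sequence."""
    return [s for s, n in runs for _ in range(n)]

row = [0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1]
encoded = rle_encode(row)   # [(0, 4), (1, 2), (0, 3), (1, 5)]
assert rle_decode(encoded) == row
print(encoded)
```

For two-symbol (binary) data, the symbols themselves can be dropped: it suffices to store the first symbol and the run lengths alone, which is what makes the method so effective on bilevel images.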
LOSSY COMPRESSION METHODS
Lossy Compression Methods
Lossy compression methods are required to achieve
high compression ratios with complex images
They provide tradeoffs between image quality and
degree of compression, which allows the compression
algorithm to be customized to the application
Several methods have been developed using lossy
compression techniques. JPEG (Joint Photographic
Experts Group) encoding is used to compress pictures and
graphics, MPEG (Moving Picture Experts Group)
encoding is used to compress video, and MP3 (MPEG
audio layer 3) for audio compression.
Lossy compression
With more advanced methods, images can be
compressed 10 to 20 times with virtually no visible
information loss, and 30 to 50 times with minimal
degradation
Newer techniques, such as JPEG2000, can achieve
reasonably good image quality with compression
ratios as high as 100 to 200
Image enhancement and restoration techniques can
be combined with lossy compression schemes to
improve the appearance of the decompressed image
Cont’d…
In general, a higher compression ratio results in a
poorer image, but the results are highly image
dependent – application specific
Lossy compression can be performed in both the
spatial and transform domains. Hybrid methods use
both domains.
Gray-Level Run Length Coding
The RLC technique can also be used for lossy
image compression, by reducing the number of
gray levels, and then applying standard RLC
techniques
As with the lossless techniques, preprocessing by
Gray code mapping will improve the compression
ratio
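The Gray code mapping mentioned above is g = b XOR (b >> 1). A small sketch shows why it helps run-length coding of bit planes:

```python
def binary_to_gray(value):
    """Map an intensity to its reflected Gray code: g = b XOR (b >> 1).
    Adjacent gray levels then differ in only one bit, so small intensity
    changes disturb only one bit plane, giving longer runs per plane."""
    return value ^ (value >> 1)

# Neighbouring intensities 127 and 128 differ in all 8 bits in plain
# binary (01111111 vs 10000000) but in a single bit after Gray mapping.
g127, g128 = binary_to_gray(127), binary_to_gray(128)
print(format(g127, "08b"), format(g128, "08b"))
```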
Lossy Bit plane Run Length Coding
Fidelity criteria
When lossy compression techniques are employed, the decompressed image will not be
identical to the original image. In such cases, we can define fidelity criteria that
measure the difference between these two images.
A good example of an objective fidelity criterion is the root-mean-square (RMS) error
between the input and output images. For any values of x and y, the error e(x,y) can be
defined as
e(x, y) = f'(x, y) - f(x, y)
The total error between the two images is
E = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [f'(x, y) - f(x, y)]
The root-mean-square error e_{rms} is
e_{rms} = \left[ \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} [f'(x, y) - f(x, y)]^2 \right]^{1/2}
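The RMS error formula translates directly into code; here images are represented as plain M×N lists of lists for illustration:

```python
import math

def rms_error(f, f_rec):
    """Root-mean-square error between an original image f and its
    reconstruction f_rec, both MxN nested lists of pixel values."""
    M, N = len(f), len(f[0])
    total = sum((f_rec[x][y] - f[x][y]) ** 2
                for x in range(M) for y in range(N))
    return math.sqrt(total / (M * N))

original      = [[10, 20], [30, 40]]
reconstructed = [[12, 20], [30, 36]]
print(rms_error(original, reconstructed))  # sqrt((4 + 0 + 0 + 16) / 4)
```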
Image compression – JPEG encoding
An image can be represented by a two-dimensional
array (table) of picture elements (pixels).
A grayscale picture of 307,200 pixels is represented
by 2,457,600 bits, and a color picture is represented
by 7,372,800 bits.
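The figures above follow from simple arithmetic; 307,200 pixels corresponds, for example, to a 640 × 480 image at 8 bits per pixel (grayscale) or 24 bits per pixel (color):

```python
pixels     = 640 * 480    # 307,200 pixels
gray_bits  = pixels * 8   # 8 bits/pixel  -> 2,457,600 bits
color_bits = pixels * 24  # 24 bits/pixel -> 7,372,800 bits
print(pixels, gray_bits, color_bits)
```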
Predictive encoding
In predictive encoding, the differences between samples are
encoded instead of encoding all the sampled values. This type
of compression is normally used for speech. Several standards
have been defined, such as GSM (13 kbps), G.729 (8 kbps), and
G.723.1 (6.3 or 5.3 kbps). Detailed discussions of these
techniques are beyond the scope of these notes.
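A toy sketch of the idea with the simplest possible predictor (the previous sample); real speech codecs such as those named above are far more sophisticated:

```python
def delta_encode(samples):
    """Store the first sample, then successive differences.
    For slowly varying signals the differences are small numbers,
    which can be encoded with fewer bits than the raw samples."""
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]

def delta_decode(deltas):
    """Rebuild the signal by accumulating the differences."""
    out = [deltas[0]]
    for d in deltas[1:]:
        out.append(out[-1] + d)
    return out

signal = [100, 102, 104, 103, 101, 101]
deltas = delta_encode(signal)   # [100, 2, 2, -1, -2, 0]
assert delta_decode(deltas) == signal
print(deltas)
```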
Perceptual encoding: MP3
The most common compression technique used to create CD-
quality audio is based on the perceptual encoding technique.
This type of audio needs at least 1.411 Mbps, which cannot be
sent over the Internet without compression. MP3 (MPEG audio
layer 3) uses this technique.
Any questions?