Image compression

Written by Ahmed Hesham Mostafa, Sec 1, Level 4, CS

Ahmed Hesham Mostafa Mostafa

Image compression is the application of data compression to digital images. The objective is to
reduce redundancy in the image data so that the data can be stored or transmitted in an efficient
form. Image compression may be lossy or lossless.

Lossless compression:-
Lossless compression is preferred for archival purposes and often for medical imaging, technical
drawings, clip art, or comics. This is because lossy compression methods, especially when used
at low bit rates, introduce compression artifacts.

Lossy compression:-
Lossy methods are especially suitable for natural images such as photographs in applications
where minor (sometimes imperceptible) loss of fidelity is acceptable to achieve a substantial
reduction in bit rate. The lossy compression that produces imperceptible differences may be
called visually lossless.

Methods for lossless image compression are:

1) Entropy encoding
In information theory, an entropy encoding is a lossless data compression scheme that is
independent of the specific characteristics of the medium. Two of the most common entropy
encoding techniques are Huffman coding and arithmetic coding.

a) Huffman coding

// TreeNode, root, PSEUDO_EOF and output stand for the coding-tree node class, its root,
// the pseudo-EOF character and the destination for decoded characters.
int bits;
TreeNode current = root;              // start at the root of the Huffman coding tree
while (true) {
    bits = input.readbits(1);
    if (bits == -1) {
        // ran out of input before the pseudo-EOF was seen
        System.err.println("should not happen! trouble reading bits");
        break;
    }
    // use the zero/one value of the bit read to traverse the Huffman coding tree
    if ((bits & 1) == 0) {
        current = current.left;       // read a 0, go left in tree
    } else {
        current = current.right;      // read a 1, go right in tree
    }
    // if a leaf is reached, decode the character and write it, UNLESS
    // the character is pseudo-EOF, in which case decompression is done
    if (current.isLeaf()) {
        if (current.value == PSEUDO_EOF) {
            break;                    // out of loop
        }
        output.write(current.value);  // write character stored in leaf-node
        current = root;               // start over at the root for the next character
    }
}

Huffman Coding of Images

In order to encode images:

 Divide the image up into 8x8 blocks
 Each block is a symbol to be coded
 Compute Huffman codes for the set of blocks
 Encode the blocks accordingly (a sketch of these steps in Java follows below)
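
Following these steps, a minimal Java sketch might look like the one below: every 8x8 block is
treated as one symbol (keyed here by a String of its pixel values, an assumption made purely for
the example), block frequencies are counted, and a Huffman tree is built with a priority queue;
each block's code is then its root-to-leaf path.

import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

class BlockHuffman {
    static class Node implements Comparable<Node> {
        final int frequency;
        final String symbol;          // null for internal nodes
        final Node left, right;
        Node(int frequency, String symbol, Node left, Node right) {
            this.frequency = frequency; this.symbol = symbol;
            this.left = left; this.right = right;
        }
        public int compareTo(Node other) { return frequency - other.frequency; }
    }

    static Node buildTree(int[][] image) {
        // divide the image into 8x8 blocks and count how often each block occurs
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + 8 <= image.length; i += 8) {
            for (int j = 0; j + 8 <= image[0].length; j += 8) {
                StringBuilder key = new StringBuilder();
                for (int y = 0; y < 8; y++)
                    for (int x = 0; x < 8; x++)
                        key.append(image[i + y][j + x]).append(',');
                counts.merge(key.toString(), 1, Integer::sum);
            }
        }
        // standard Huffman construction: repeatedly merge the two rarest symbols
        PriorityQueue<Node> queue = new PriorityQueue<>();
        for (Map.Entry<String, Integer> e : counts.entrySet())
            queue.add(new Node(e.getValue(), e.getKey(), null, null));
        while (queue.size() > 1) {
            Node a = queue.poll(), b = queue.poll();
            queue.add(new Node(a.frequency + b.frequency, null, a, b));
        }
        return queue.poll();          // codes are the root-to-leaf paths in this tree
    }
}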

b) Arithmetic coding

Encode

This is the pseudocode for the initialization:

 Get probabilities and scale them


 Save probabilities in the output file
 High = FFFFh (16 bits)
 Low = 0000h (16 bits)
 Underflow_bits = 0 (16 bits should be enough)

Where:

 High and Low define where the output number falls.
 Underflow_bits counts the bits that could have produced underflow and thus were
shifted out.

And the routine to encode a symbol:

 Range = ( high - low ) + 1
 High = low + ( ( range * high_values [ symbol ] ) / scale ) - 1
 Low = low + ( range * high_values [ symbol - 1 ] ) / scale
 Loop. (will exit when no more bits can be output or shifted)
 Msb of high = msb of low?
 Yes
o Output msb of low
o Loop while Underflow_bits > 0 (output the underflow bits pending for output):
 Output Not ( msb of low )
o Go to shift
 No
o Second msb of low = 1 and second msb of high = 0 ? (check for underflow)
o Yes
 Underflow_bits += 1 (here we shift to avoid underflow)
 Low = low & 3FFFh
 High = high | 4000h
 Go to shift
o No
 The routine for encoding a symbol ends here.

Shift:

 Shift low to the left one time. Now we have to put new bits into low and high
 Shift high to the left one time, and or the lsb with the value 1
 Return to the first loop.
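
Putting the initialization and the encoding routine together, a Java sketch of the above could
look as follows. The cumulative table highValues (with highValues[0] = 0 and symbols numbered
from 1) and the StringBuilder used to collect the output bits are assumptions made for this
example, not details from the source.

class ArithmeticEncoder {
    long low = 0x0000, high = 0xFFFF;            // Low = 0000h, High = FFFFh (16 bits)
    int underflowBits = 0;                       // Underflow_bits = 0
    int[] highValues;                            // scaled cumulative counts, highValues[0] = 0
    int scale;                                   // total of the scaled counts
    StringBuilder out = new StringBuilder();     // output bits, collected as text for simplicity

    void encodeSymbol(int symbol) {
        long range = (high - low) + 1;
        high = low + (range * highValues[symbol]) / scale - 1;
        low  = low + (range * highValues[symbol - 1]) / scale;
        while (true) {                           // exits when no more bits can be output or shifted
            if ((high & 0x8000) == (low & 0x8000)) {   // msb of high = msb of low?
                out.append((low >>> 15) & 1);          // output msb of low
                while (underflowBits > 0) {            // output the pending underflow bits
                    out.append(1 - ((low >>> 15) & 1)); // output Not(msb of low)
                    underflowBits--;
                }
            } else if ((low & 0x4000) != 0 && (high & 0x4000) == 0) {
                underflowBits++;                 // shift to avoid underflow
                low &= 0x3FFF;                   // Low = low & 3FFFh
                high |= 0x4000;                  // High = high | 4000h
            } else {
                return;                          // the routine for encoding a symbol ends here
            }
            low  = (low << 1) & 0xFFFF;          // shift: put a new bit into low
            high = ((high << 1) | 1) & 0xFFFF;   // shift high and or the lsb with 1
        }
    }
}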

Decoding
The first thing to do when decoding is to read the probabilities; because the encoder did the
scaling, you just have to read them and build the ranges. The process is the following: see in
what symbol our number falls, and extract the code of this symbol from the input. Before starting
we have to initialize "code"; this value will hold the bits from the input, so initialize it to the
first 16 bits of the input. And this is how it's done:

 Range = ( high - low ) + 1 See where the number lands
 Temp = ( ( ( code - low ) + 1 ) * scale - 1 ) / range
 See what symbol corresponds to temp.
 Range = ( high - low ) + 1 Extract the symbol code
 High = low + ( ( range * high_values [ symbol ] ) / scale ) - 1
 Low = low + ( range * high_values [ symbol - 1 ] ) / scale Note that these formulae are
the same ones the encoder uses
 Loop.
 Msb of high = msb of low?
 Yes
o Go to shift
 No
o Second msb of low = 1 and second msb of high = 0 ?
o Yes
 Code = code ^ 4000h
 Low = low & 3FFFh
 High = high | 4000h
 Go to shift
o No

 The routine for decoding a symbol ends here.

Shift:
Shift low to the left one time, and do the same with high and code. Now we have to put new bits
into low, high and code.
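
A matching decoder sketch, under the same assumptions as the encoder sketch above; code must
be initialized with the first 16 bits of the input, and readBit() is a hypothetical placeholder
for reading the next input bit (0 or 1, returning 0 once the input is exhausted):

class ArithmeticDecoder {
    long low = 0x0000, high = 0xFFFF;            // same initialization as the encoder
    long code;                                   // init with the first 16 bits of the input
    int[] highValues;                            // the probabilities read from the file
    int scale;

    int decodeSymbol() {
        long range = (high - low) + 1;
        long temp = (((code - low) + 1) * scale - 1) / range;   // see where the number lands
        int symbol = 1;
        while (highValues[symbol] <= temp) {     // find the symbol whose range contains temp
            symbol++;
        }
        high = low + (range * highValues[symbol]) / scale - 1;  // same formulae as the encoder
        low  = low + (range * highValues[symbol - 1]) / scale;
        while (true) {
            if ((high & 0x8000) == (low & 0x8000)) {
                // msb of high = msb of low: nothing extra to do, just shift below
            } else if ((low & 0x4000) != 0 && (high & 0x4000) == 0) {
                code ^= 0x4000;                  // underflow: drop the second msb
                low &= 0x3FFF;
                high |= 0x4000;
            } else {
                return symbol;                   // the routine for decoding a symbol ends here
            }
            low  = (low << 1) & 0xFFFF;          // shift low, high and code to the left,
            high = ((high << 1) | 1) & 0xFFFF;   // pulling a new input bit into code
            code = ((code << 1) | readBit()) & 0xFFFF;
        }
    }

    int readBit() { return 0; }                  // placeholder: next input bit, 0 when exhausted
}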

2) Run-length encoding

Encoding Strings

Traditional RLE

Encoding using traditional RLE is fairly simple:

Step 1. Set the previous symbol equal to an unmatchable value.

Step 2. Read the next symbol from the input stream.

Step 3. If the symbol is an EOF exit.

Step 4. Write out the current symbol.

Step 5. If the symbol does not match the previous symbol, set the previous symbol to the
current symbol, and go to step 2.

Step 6. Read and count additional symbols until a non-matching symbol is found. This is the run
length.

Step 7. Write out the run length.

Step 8. Write out the non-matching symbol.

Step 9. Set the previous symbol to the non-matching symbol, and go to step 2.

When actually implementing traditional RLE, a little attention to detail is required in Step 6. The
run length is stored in a finite number of bits (I used an unsigned char). Runs longer than the
amount that can be counted need to be broken up into smaller runs. When the maximum count
is reached, just write the count value and start the process of looking for a run all over again.
You also need to handle the case where a run is ended by an EOF. When a run is ended by an
EOF, write out the run length and exit. That's all there is to it.
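
The steps above translate into a short Java routine such as the following sketch. The one-byte
run length (MAX_RUN = 255) follows the unsigned-char counter mentioned above; the stream types
and method names are choices made for the sketch.

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class TraditionalRleEncoder {
    static void encode(InputStream in, OutputStream out) throws IOException {
        final int MAX_RUN = 255;                 // run length must fit in one unsigned byte
        int previous = -1;                       // step 1: an unmatchable value
        int symbol;
        while ((symbol = in.read()) != -1) {     // steps 2-3: read symbols until EOF
            out.write(symbol);                   // step 4: write the current symbol
            if (symbol != previous) {            // step 5: no run started yet
                previous = symbol;
                continue;
            }
            int runLength = 0;                   // step 6: count additional matching symbols
            int next;
            while ((next = in.read()) == symbol && runLength < MAX_RUN) {
                runLength++;
            }
            out.write(runLength);                // step 7: write the run length
            if (next == -1) {
                return;                          // run ended by EOF: write length and exit
            }
            // if the maximum count was reached, next still equals symbol: it is simply
            // written as a fresh symbol and run detection starts over from it
            out.write(next);                     // step 8: write the symbol that ended the run
            previous = next;                     // step 9
        }
    }
}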

Encoding using the PackBits variant is slightly more complicated than traditional RLE. The block
header cannot be written until the type of block and its length have been determined. Until then,
data must be held in a buffer of the maximum length that can be copied verbatim. The following
steps describe PackBits-style encoding:
Step 1. Read symbols from the input stream into the buffer until one of the following occurs:

A. The buffer is full (go to step 2)


B. An EOF is reached (go to step 3)
C. The last three symbols are identical (go to step 4)

Step 2. If the buffer is full:

A. write the buffer size - 1


B. write contents of the buffer
C. go to step 1

Step 3. If the symbol is an EOF:

A. Write the number of symbols in the buffer - 1


B. write contents of the buffer
C. exit

Step 4. If the last three symbols match, a run has been found. Determine the number of symbols
in the buffer prior to the start of the run (n).

Step 5. Write n - 1 followed by the contents of the buffer up to the start of the run.

Step 6. Set the run length to 3.

Step 7. Read additional symbols until a non-matching symbol is found. Increment the run length
for each matching symbol.

Step 8. Write out 2 - the run length, followed by the run symbol.

Step 9. Write the non-matching symbol to the buffer and go to step 1.

That's pretty much all there is. You need to stop counting your run length in Step 7 if it reaches
the maximum length you can account for in your header. My actual implementation is also a
little less greedy. When I reach the maximum number of symbols that can be copied verbatim, I
read an extra symbol or two in case the symbols at the end of a buffer are actually the start of a
run.
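
One possible Java sketch of the PackBits-style encoder is shown below. The 128-symbol literal
buffer and the signed-byte header are assumptions consistent with the decoding rules given later;
corner cases such as an empty literal prefix are handled in the simplest way rather than exactly
as the author's implementation does.

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class PackBitsEncoder {
    static void encode(InputStream in, OutputStream out) throws IOException {
        final int MAX_COPY = 128;                // most literals one header can describe
        final int MAX_RUN = 130;                 // header 2 - run must fit in a signed byte
        int[] buffer = new int[MAX_COPY];
        int count = 0;                           // symbols currently in the buffer
        while (true) {
            int symbol = in.read();              // step 1: fill the buffer
            if (symbol == -1) {                  // step 3: EOF, flush what is buffered
                if (count > 0) {
                    out.write(count - 1);
                    for (int i = 0; i < count; i++) out.write(buffer[i]);
                }
                return;
            }
            buffer[count++] = symbol;
            if (count == MAX_COPY) {             // step 2: buffer full, copy it verbatim
                out.write(count - 1);
                for (int i = 0; i < count; i++) out.write(buffer[i]);
                count = 0;
            } else if (count >= 3 && buffer[count - 1] == buffer[count - 2]
                                  && buffer[count - 2] == buffer[count - 3]) {
                int n = count - 3;               // step 4: literals before the run starts
                if (n > 0) {                     // step 5: flush them first
                    out.write(n - 1);
                    for (int i = 0; i < n; i++) out.write(buffer[i]);
                }
                int runLength = 3;               // step 6
                int next;                        // step 7: extend the run
                while ((next = in.read()) == symbol && runLength < MAX_RUN) {
                    runLength++;
                }
                out.write(2 - runLength);        // step 8: run header (a negative value) ...
                out.write(symbol);               // ... followed by the run symbol
                count = 0;
                if (next == -1) {
                    return;                      // run ended by EOF
                }
                buffer[count++] = next;          // step 9: keep the symbol that broke the run
            }
        }
    }
}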

Decoding traditionally encoded strings is even easier than encoding. Not only are there fewer
steps, but there are also no caveats. To decode a traditionally encoded stream:

Step 1. Set the previous symbol equal to an unmatchable value.

Step 2. Read the next symbol from the input stream.

Step 3. If the symbol is an EOF exit.

Step 4. Write out the current symbol.

Step 5. If the symbol does not match the previous symbol, set the previous symbol to the
current symbol, and go to step 2.

Step 6. Read the run length.

Step 7. Write out a run of the current symbol as long as indicated by the run length.

Step 8. Go to step 1.
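
In Java, a decoder following these eight steps might look like this sketch (stream types are again
an assumption):

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class TraditionalRleDecoder {
    static void decode(InputStream in, OutputStream out) throws IOException {
        int previous = -1;                       // step 1: an unmatchable value
        int symbol;
        while ((symbol = in.read()) != -1) {     // steps 2-3: read symbols until EOF
            out.write(symbol);                   // step 4: write the current symbol
            if (symbol != previous) {            // step 5: not the second symbol of a run
                previous = symbol;
                continue;
            }
            int runLength = in.read();           // step 6: read the run length
            for (int i = 0; i < runLength; i++) {  // step 7: expand the run
                out.write(symbol);
            }
            previous = -1;                       // step 8: go back to step 1
        }
    }
}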

If that wasn't easy enough, it is even easier to decode strings encoded by the variant PackBits
algorithm. To decode a variant PackBits encoded stream:

Step 1. Read the block header (n).

Step 2. If the header is an EOF exit.

Step 3. If n is non-negative, copy the next n + 1 symbols to the output stream and go to step 1.

Step 4. If n is negative, write 2 - n copies of the next symbol to the output stream and go to
step 1.
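
The same four steps in a Java sketch, assuming the block header is stored as a signed byte and the
stream is well formed:

import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class PackBitsDecoder {
    static void decode(InputStream in, OutputStream out) throws IOException {
        int header;
        while ((header = in.read()) != -1) {     // steps 1-2: read headers until EOF
            int n = (byte) header;               // reinterpret the header as a signed byte
            if (n >= 0) {                        // step 3: copy n + 1 literal symbols
                for (int i = 0; i <= n; i++) {
                    out.write(in.read());
                }
            } else {                             // step 4: a run of 2 - n copies
                int symbol = in.read();
                for (int i = 0; i < 2 - n; i++) {
                    out.write(symbol);
                }
            }
        }
    }
}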

3) DEFLATE

do
   read block header from input stream.
   if stored with no compression
      skip any remaining bits in current partially
         processed byte
      read LEN and NLEN (see next section)
      copy LEN bytes of data to output
   otherwise
      if compressed with dynamic Huffman codes
         read representation of code trees (see
            subsection below)
      loop (until end of block code recognized)
         decode literal/length value from input stream
         if value < 256
            copy value (literal byte) to output stream
         otherwise
            if value = end of block (256)
               break from loop
            otherwise (value = 257..285)
               decode distance from input stream
               move backwards distance bytes in the output
               stream, and copy length bytes from this
               position to the output stream.
      end loop
while not last block

Methods for lossy compression

1) Chroma subsampling
Because the human visual system is less sensitive to the position and motion of color than
luminance,[1] bandwidth can be optimized by storing more luminance detail than color detail. At
normal viewing distances, there is no perceptible loss incurred by sampling the color detail at a
lower rate. In video systems, this is achieved through the use of color difference components.
The signal is divided into a luma (Y') component and two color difference components (chroma).

Chroma subsampling deviates from color science in that the luma and chroma components are
formed as a weighted sum of gamma-corrected (tristimulus) R'G'B' components instead of linear
(tristimulus) RGB components. As a result, luminance and color detail are not completely
independent of one another. There is some "bleeding" of luminance and color information
between the luma and chroma components. The error is greatest for highly-saturated colors and

can be somewhat noticeable in between the magenta and green bars of a color bars test pattern
(that has chroma subsampling applied). This engineering approximation (by reversing the order
of operations between gamma correction and forming the weighted sum) allows color
subsampling to be more easily implemented.
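
For illustration only, the following Java sketch forms a luma plane as a weighted sum of the
gamma-corrected R'G'B' components and keeps the two color-difference planes at a quarter of the
resolution by averaging each 2x2 block. The BT.601 weights and the 4:2:0-style sampling ratio are
assumptions made for the example; the text does not prescribe a particular standard or ratio.

class ChromaSubsample420 {
    // Returns { Y', Cb, Cr }: full-resolution luma plus two quarter-resolution
    // color-difference planes. Inputs are gamma-corrected R'G'B' planes of even
    // height and width with values in [0, 1].
    static double[][][] subsample(double[][] rP, double[][] gP, double[][] bP) {
        int h = rP.length, w = rP[0].length;
        double[][] y  = new double[h][w];
        double[][] cb = new double[h / 2][w / 2];
        double[][] cr = new double[h / 2][w / 2];
        for (int i = 0; i < h; i++) {
            for (int j = 0; j < w; j++) {
                // luma: weighted sum of the gamma-corrected R'G'B' components
                y[i][j] = 0.299 * rP[i][j] + 0.587 * gP[i][j] + 0.114 * bP[i][j];
            }
        }
        for (int i = 0; i < h; i += 2) {
            for (int j = 0; j < w; j += 2) {
                // chroma: color-difference signals, averaged over each 2x2 block
                double sumCb = 0, sumCr = 0;
                for (int di = 0; di < 2; di++) {
                    for (int dj = 0; dj < 2; dj++) {
                        sumCb += 0.564 * (bP[i + di][j + dj] - y[i + di][j + dj]);
                        sumCr += 0.713 * (rP[i + di][j + dj] - y[i + di][j + dj]);
                    }
                }
                cb[i / 2][j / 2] = sumCb / 4;
                cr[i / 2][j / 2] = sumCr / 4;
            }
        }
        return new double[][][] { y, cb, cr };
    }
}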

2) Transform coding
The idea of transform coding is to transform the input into a different form which can then either
be
compressed better, or for which we can more easily drop certain terms without as much
qualitative
loss in the output. One form of transform is to select a linear set of basis functions (φi) that span
the
space to be transformed. Some common sets include sin, cos, polynomials, spherical harmonics,
Bessel functions, and wavelets. Figure 18 shows some examples of the first three basis functions
for discrete cosine, polynomial, and wavelet transformations. For a set of n values, transforms
can
be expressed as an n × n matrix T. Multiplying the input by this matrix T gives, the transformed
coefficients. Multiplying the coefficients by T−1 will convert the data back to the original form.
For example, the coefficients for the discrete cosine transform (DCT) are
Tij =
(p
1/n cos (2j+1)i_
p 2n i = 0, 0 ≤ j < n
2/n cos (2j+1)i_
2n 0 < i < n, 0 ≤ j < n
The DCT is one of the most commonly used transforms in practice for image compression,
more so than the discrete Fourier transform (DFT). This is because the DFT assumes periodicity,
which is not necessarily true in images. In particular to represent a linear function over a region
requires many large amplitude high-frequency components in a DFT. This is because the
periodicity assumption will view the function as a sawtooth, which is highly discontinuous at the
teeth requiring the high-frequency components. The DCT does not assume periodicity and will
only require much lower amplitude high-frequency components. The DCT also does not require
a phase, which is typically represented using complex numbers in the DFT.
For the purpose of compression, the properties we would like of a transform are (1) to
decorrelate the data, (2) have many of the transformed coefficients be small, and (3) have it so
that from the point of view of perception, some of the terms are more important than others.
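
A small Java sketch of the formula above: it builds the n × n DCT matrix T and multiplies it by an
input vector to obtain the transformed coefficients.

class Dct {
    // Builds the n x n DCT matrix T from the formula above.
    static double[][] dctMatrix(int n) {
        double[][] t = new double[n][n];
        for (int i = 0; i < n; i++) {
            double scale = Math.sqrt((i == 0 ? 1.0 : 2.0) / n);
            for (int j = 0; j < n; j++) {
                t[i][j] = scale * Math.cos((2 * j + 1) * i * Math.PI / (2 * n));
            }
        }
        return t;
    }

    // Multiplying the input by T gives the transformed coefficients.
    static double[] transform(double[][] t, double[] input) {
        int n = input.length;
        double[] coefficients = new double[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                coefficients[i] += t[i][j] * input[j];
            }
        }
        return coefficients;
    }
}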

3) Fractal compression

Fractal image representation can be described mathematically as an iterated function system
(IFS).

For Binary Images

We begin with the representation of a binary image, where the image may be thought of as a
subset of ℝ². An IFS is a set of contraction mappings ƒ1,...,ƒN, each a map from ℝ² to ℝ².

According to these mapping functions, the IFS describes a two-dimensional set S as the fixed
point of the Hutchinson operator H(A) = ƒ1(A) ∪ ƒ2(A) ∪ ... ∪ ƒN(A).

That is, H is an operator mapping sets to sets, and S is the unique set satisfying H(S) = S. The
idea is to construct the IFS such that this set S is the input binary image. The set S can be
recovered from the IFS by fixed point iteration: for any nonempty compact initial set A0, the
iteration Ak+1 = H(Ak) converges to S.

The set S is self-similar because H(S) = S implies that S is a union of mapped copies of itself:
S = ƒ1(S) ∪ ƒ2(S) ∪ ... ∪ ƒN(S). So we see the IFS is a fractal representation of S.

Extension to Grayscale

IFS representation can be extended to a grayscale image by considering the image's graph as a
subset of ℝ³. For a grayscale image u(x,y), consider the set S = {(x,y,u(x,y))}. Then, similar to
the binary case, S is described by an IFS using a set of contraction mappings ƒ1,...,ƒN, but in ℝ³,
i.e. each ƒi maps ℝ³ to ℝ³.

Encoding

A challenging problem of ongoing research in fractal image representation is how to choose the
ƒ1,...,ƒN such that its fixed point approximates the input image, and how to do this efficiently. A
simple approach[1] for doing so is the following:

1. Partition the image domain into blocks Ri of size s×s.
2. For each Ri, search the image to find a block Di of size 2s×2s that is very similar to Ri.
3. Select the mapping functions such that H(Di) = Ri for each i.

In the second step, it is important to find a similar block so that the IFS accurately represents the
input image, so a sufficient number of candidate blocks for Di need to be considered. On the
other hand, a large search considering many blocks is computationally costly. This bottleneck of
searching for similar blocks is why fractal encoding is much slower than for example DCT and
wavelet based image representations.
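
A brute-force Java sketch of the simple approach above: for each s × s range block Ri it scans the
image for the 2s × 2s domain block Di whose shrunken (2x2-averaged) version is closest to Ri in
mean squared error. The exhaustive scan, the squared-error metric, and the assumption that the
image dimensions are multiples of s are choices made for the sketch; the triple-nested search also
shows why this step is the bottleneck.

class FractalBlockSearch {
    // For each s x s range block Ri, find the 2s x 2s domain block Di whose
    // 2x2-averaged (shrunken) version is closest to Ri in mean squared error.
    static int[][] findDomainBlocks(double[][] image, int s) {
        int h = image.length, w = image[0].length;
        int[][] best = new int[(h / s) * (w / s)][2];   // (row, col) of the chosen Di per Ri
        int index = 0;
        for (int ri = 0; ri + s <= h; ri += s) {
            for (int rj = 0; rj + s <= w; rj += s) {    // step 1: each range block Ri
                double bestErr = Double.MAX_VALUE;
                for (int di = 0; di + 2 * s <= h; di += s) {
                    for (int dj = 0; dj + 2 * s <= w; dj += s) {   // step 2: candidate Di
                        double err = 0;
                        for (int y = 0; y < s; y++) {
                            for (int x = 0; x < s; x++) {
                                // shrink Di by averaging 2x2 pixels, compare with Ri
                                double d = (image[di + 2 * y][dj + 2 * x]
                                          + image[di + 2 * y][dj + 2 * x + 1]
                                          + image[di + 2 * y + 1][dj + 2 * x]
                                          + image[di + 2 * y + 1][dj + 2 * x + 1]) / 4;
                                double diff = d - image[ri + y][rj + x];
                                err += diff * diff;
                            }
                        }
                        if (err < bestErr) {            // keep the most similar block so far
                            bestErr = err;
                            best[index][0] = di;
                            best[index][1] = dj;
                        }
                    }
                }
                index++;                                // step 3 would now fit a mapping
            }                                           // fi taking this Di onto Ri
        }
        return best;
    }
}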

Resources
- http://en.wikipedia.org/wiki/Image_compression
- http://en.wikipedia.org/wiki/Entropy_encoding
- http://en.wikipedia.org/wiki/Huffman_coding
- http://en.wikipedia.org/wiki/Arithmetic_coding
- http://en.wikipedia.org/wiki/Run-length_encoding
- http://en.wikipedia.org/wiki/DEFLATE
- http://en.wikipedia.org/wiki/Chroma_subsampling
- http://en.wikipedia.org/wiki/Transform_coding
- http://en.wikipedia.org/wiki/Fractal_compression
- http://www.arturocampos.com/ac_arithmetic.html
- http://michael.dipperstein.com/rle/index.html
- http://www.ietf.org/rfc/rfc1951.txt
- Introduction to Data Compression, Guy E. Blelloch, Computer Science Department, Carnegie
Mellon University.
- A RAPID ENTROPY-CODING ALGORITHM, Wm. Douglas Withers, Department of
Mathematics, United States Naval Academy, Annapolis, MD 21402, and Pegasus Imaging
Corporation.
