Image Compression
Lossless compression
Lossless compression is preferred for archival purposes and often for medical imaging, technical
drawings, clip art, or comics. This is because lossy compression methods, especially when used
at low bit rates, introduce compression artifacts.
Lossy compression
Lossy methods are especially suitable for natural images such as photographs in applications
where minor (sometimes imperceptible) loss of fidelity is acceptable to achieve a substantial
reduction in bit rate. The lossy compression that produces imperceptible differences may be
called visually lossless.
1) Entropy encoding
In information theory an entropy encoding is a lossless data compression scheme that is
independent of the specific characteristics of the medium. Two of the most common entropy
encoding techniques are Huffman coding and arithmetic coding.
a) Huffman coding
int bits;
while (true) {
    if ((bits = input.readbits(1)) == -1) {
        System.err.println("should not happen! trouble reading bits");
        break;
    }
    // use the zero/one value of the bit read to traverse the Huffman
    // coding tree; if a leaf is reached, decode the character and write
    // it, UNLESS the character is pseudo-EOF, in which case
    // decompression is done
    if ((bits & 1) == 0)
        go left in tree;        // read a 0
    else
        go right in tree;       // read a 1
    if (at leaf-node in tree) {
        if (leaf-node stores pseudo-eof char)
            break;              // out of loop
        else
            write character stored in leaf-node;
    }
}
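The snippet above assumes a bit reader and a tree that already exist. As a self-contained sketch (the class and method names here are invented for the demo, not from the original), a Huffman tree can be built from symbol frequencies with a priority queue and then decoded by walking it bit by bit, exactly as the pseudocode describes:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

public class HuffmanDemo {
    static class Node implements Comparable<Node> {
        final int freq; final Character sym; final Node left, right;
        Node(int f, Character s, Node l, Node r) { freq = f; sym = s; left = l; right = r; }
        public int compareTo(Node o) { return Integer.compare(freq, o.freq); }
    }

    // Repeatedly merge the two least frequent nodes until one tree remains.
    static Node buildTree(Map<Character, Integer> freqs) {
        PriorityQueue<Node> pq = new PriorityQueue<>();
        for (Map.Entry<Character, Integer> e : freqs.entrySet())
            pq.add(new Node(e.getValue(), e.getKey(), null, null));
        while (pq.size() > 1) {
            Node a = pq.poll(), b = pq.poll();
            pq.add(new Node(a.freq + b.freq, null, a, b));
        }
        return pq.poll();
    }

    // Left edges are 0, right edges are 1; leaves carry the symbols.
    static void fillCodes(Node n, String prefix, Map<Character, String> codes) {
        if (n.sym != null) { codes.put(n.sym, prefix.isEmpty() ? "0" : prefix); return; }
        fillCodes(n.left, prefix + "0", codes);
        fillCodes(n.right, prefix + "1", codes);
    }

    // Walk the tree bit by bit, emitting a symbol each time a leaf is hit.
    static String decode(Node root, String bits) {
        StringBuilder out = new StringBuilder();
        Node n = root;
        for (char b : bits.toCharArray()) {
            n = (b == '0') ? n.left : n.right;
            if (n.sym != null) { out.append(n.sym); n = root; }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<Character, Integer> freqs = new HashMap<>();
        freqs.put('a', 5); freqs.put('b', 2); freqs.put('c', 1);
        Node root = buildTree(freqs);
        Map<Character, String> codes = new HashMap<>();
        fillCodes(root, "", codes);
        StringBuilder enc = new StringBuilder();
        for (char c : "abac".toCharArray()) enc.append(codes.get(c));
        System.out.println(decode(root, enc.toString())); // round-trips to "abac"
    }
}
```

A real decompressor would also reserve a pseudo-EOF symbol, as in the pseudocode, so the decoder knows where the bit stream ends.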
b) Arithmetic coding
Encode
Pseudo code
This is the pseudo code for encoding a symbol (scale is the total of the symbol counts, and low_range/high_range delimit the symbol's slice of it):

    Range = (high - low) + 1
    High = low + (range * high_range(symbol)) / scale - 1
    Low = low + (range * low_range(symbol)) / scale

and this is the pseudo code for the initialization:

    Low = 0
    High = FFFFh
    Underflow_bits = 0

Where:
    High and low define the interval where the output number falls.
    Underflow_bits counts the bits which could have produced underflow and
    thus were shifted out, pending output.

After the interval has been narrowed for a symbol, the encoder renormalizes it:

    Loop:
        Msb of low = msb of high?
        Yes:
            Output msb of low
            While underflow_bits > 0:       output the underflow bits pending for output
                Output NOT(msb of low)
                Underflow_bits -= 1
            Go to Shift
        No:
            Second msb of low = 1 and second msb of high = 0?       check for underflow
            Yes:
                Underflow_bits += 1         here we shift to avoid underflow
                Low = low AND 3FFFh
                High = high OR 4000h
                Go to Shift
            No:
                The routine for encoding a symbol ends here.
    Shift:
        Shift low to the left one time.     now we have to put new bits in low and high
        Shift high to the left one time, and OR the lsb with the value 1.
        Repeat from the first loop.
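The scheme above can be sketched as a working coder. This is a hedged illustration rather than the article's exact code: it is the classic 16-bit integer implementation, written with HALF/QUARTER comparisons (equivalent to the msb tests above), and the three-symbol model and its counts are made up for the demo:

```java
import java.util.ArrayList;
import java.util.List;

public class ArithDemo {
    static final int HALF = 0x8000, QUARTER = 0x4000, THREE_Q = 0xC000, TOP = 0xFFFF;
    static final int[] CUM = {0, 3, 5, 6};   // cumulative counts: a=3, b=2, c=1
    static final int TOTAL = 6;              // the "scale" of the pseudo code

    List<Integer> bits = new ArrayList<>();
    int low = 0, high = TOP, underflow = 0, value = 0, pos = 0;

    void outputBit(int b) {                  // emit b, then any pending underflow bits
        bits.add(b);
        while (underflow > 0) { bits.add(1 - b); underflow--; }
    }

    void encode(int sym) {                   // narrow the interval, then renormalize
        long range = (long) (high - low) + 1;
        high = low + (int) (range * CUM[sym + 1] / TOTAL) - 1;
        low  = low + (int) (range * CUM[sym] / TOTAL);
        while (true) {
            if (high < HALF)              outputBit(0);              // msbs both 0
            else if (low >= HALF)       { outputBit(1); low -= HALF; high -= HALF; }
            else if (low >= QUARTER && high < THREE_Q) {             // underflow case
                underflow++; low -= QUARTER; high -= QUARTER;
            } else break;
            low = 2 * low; high = 2 * high + 1;                      // the "shift" step
        }
    }

    void finish() {                          // flush enough bits to pin the interval down
        underflow++;
        outputBit(low >= QUARTER ? 1 : 0);
    }

    int nextBit() { return pos < bits.size() ? bits.get(pos++) : 0; }

    void startDecode() {                     // init "code" to the first 16 input bits
        low = 0; high = TOP; value = 0;
        for (int i = 0; i < 16; i++) value = 2 * value + nextBit();
    }

    int decode() {
        long range = (long) (high - low) + 1;
        int cum = (int) (((long) (value - low + 1) * TOTAL - 1) / range);
        int sym = 0;
        while (CUM[sym + 1] <= cum) sym++;   // see which symbol's range we fall in
        high = low + (int) (range * CUM[sym + 1] / TOTAL) - 1;
        low  = low + (int) (range * CUM[sym] / TOTAL);
        while (true) {                       // same checks, but bits go into "value"
            if (high < HALF) { }
            else if (low >= HALF)       { value -= HALF; low -= HALF; high -= HALF; }
            else if (low >= QUARTER && high < THREE_Q) {
                value -= QUARTER; low -= QUARTER; high -= QUARTER;
            } else break;
            low = 2 * low; high = 2 * high + 1; value = 2 * value + nextBit();
        }
        return sym;
    }

    public static void main(String[] args) {
        ArithDemo c = new ArithDemo();
        int[] msg = {0, 1, 0, 0, 2};         // "abaac" under the demo alphabet
        for (int s : msg) c.encode(s);
        c.finish();
        c.startDecode();
        for (int s : msg) System.out.print(c.decode() == s ? "ok " : "BAD ");
    }
}
```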
Decoding
The first thing to do when decoding is to read the probabilities; because the encoder did the scaling, you just have to read them and build the ranges. The process is the following: see into which symbol's range our number falls, then extract that symbol's code from the number. Before starting we have to initialise "code"; this value will hold the bits from the input, so initialise it to the first 16 bits of the input.
And this is how it's done: the checks are the same as in the encoder, except that no bits are output (so no underflow bits need to be counted); instead, new bits are shifted into code.

The routine for decoding a symbol ends here.
Shift:
    Shift low to the left one time. Now we have to put new bits in low, high and code.
    Shift high to the left one time, and OR the lsb with the value 1.
    Shift code to the left one time, and OR the lsb with the next bit from the input.
    Repeat from the first loop.
2) Run-length encoding
Encoding Strings
Traditional RLE
Step 5. If the symbol does not match the previous symbol, set the previous symbol to the
current symbol, and go to step 2.
Step 6. Read and count additional symbols until a non-matching symbol is found. This is the run
length.
Step 9. Set the previous symbol to the non-matching symbol, and go to step 2
When actually implementing traditional RLE, a little attention to detail is required in Step 6. The
run length is stored in a finite number of bits (I used an unsigned char). Runs longer than the
amount that can be counted need to be broken up into smaller runs. When the maximum count
is reached, just write the count value and start the process of looking for a run all over again.
You also need to handle the case where a run is ended by an EOF. When a run is ended by an
EOF, write out the run length and exit. That's all there is to it.
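The scheme can be sketched as follows. This is a simplified variant that always writes a (symbol, count) pair per run rather than the repeated-pair trigger of the steps above, but it shows the run counting and the splitting of runs longer than the count field can hold:

```java
import java.util.ArrayList;
import java.util.List;

public class RleDemo {
    static final int MAX_RUN = 255;          // largest count one unsigned byte can store

    // Encode: emit (symbol, run length) pairs; a run longer than MAX_RUN
    // is broken up into smaller runs, as noted above.
    static List<int[]> encode(String in) {
        List<int[]> out = new ArrayList<>();
        int i = 0;
        while (i < in.length()) {
            char sym = in.charAt(i);
            int run = 0;
            while (i < in.length() && in.charAt(i) == sym && run < MAX_RUN) { run++; i++; }
            out.add(new int[]{sym, run});
        }
        return out;
    }

    // Decode: expand each pair back into a run of identical symbols.
    static String decode(List<int[]> pairs) {
        StringBuilder sb = new StringBuilder();
        for (int[] p : pairs)
            for (int k = 0; k < p[1]; k++) sb.append((char) p[0]);
        return sb.toString();
    }

    public static void main(String[] args) {
        List<int[]> enc = encode("aaabccccc");
        System.out.println(enc.size());      // prints 3 (runs of a, b, c)
        System.out.println(decode(enc));     // prints aaabccccc
    }
}
```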
Encoding using the PackBits variant is slightly more complicated than traditional RLE. The block
header cannot be written until the type of the block and its length have been determined. Until
then, data must be held in a buffer of the maximum length that can be copied verbatim. The
following steps describe PackBits-style encoding:
Step 1. Read symbols from the input stream into the buffer until one of the following occurs:
Step 4. If the last three symbols match, a run has been found. Determine the number of symbols
in the buffer prior to the start of the run (n).
Step 5. Write n - 1 followed by the contents of the buffer up to the start of the run.
Step 6. Read additional symbols until a non-matching symbol is found. Increment the run length
for each matching symbol.
Step 7. Write out 2 - the run length, followed by the run symbol.
That's pretty much all there is. You need to stop counting your run length in Step 6 if it reaches
the maximum length you can account for in your header. My actual implementation is also a
little less greedy. When I reach the maximum number of symbols that can be copied verbatim, I
read an extra symbol or two in case the symbols at the end of a buffer are actually the start of a
run.
Decoding traditionally encoded strings is even easier than encoding. Not only are there fewer
steps, but there are no caveats. To decode a traditionally encoded stream:
Step 5. If the symbol does not match the previous symbol, set the previous symbol to the
current symbol, and go to step 2.
Step 7. Write out a run of the current symbol as long as indicated by the run length.
Step 8. Go to step 1.
If that wasn't easy enough, it is even easier to decode strings encoded by the variant PackBits
algorithm. To decode a variant PackBits encoded stream:
Step 3. If n is non-negative, copy the next n + 1 symbols to the output stream and go to step 1.
Step 4. If n is negative, write 2 - n copies of the next symbol to the output stream and go to
step 1.
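Following the two decode steps above, a variant PackBits decoder fits in a few lines. (The int[] stream layout and names here are invented for the demo; a real decoder would read signed bytes.)

```java
public class PackBitsDecode {
    // Header n >= 0 means "copy the next n + 1 symbols verbatim";
    // a negative n means "repeat the next symbol 2 - n times".
    static String decode(int[] in) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < in.length) {
            int n = in[i++];
            if (n >= 0) {                             // verbatim block of n + 1 symbols
                for (int k = 0; k <= n; k++) out.append((char) in[i++]);
            } else {                                  // run: 2 - n copies of one symbol
                char sym = (char) in[i++];
                for (int k = 0; k < 2 - n; k++) out.append(sym);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // header 2 -> copy 3 symbols "abc"; header -1 -> 2 - (-1) = 3 copies of 'x'
        int[] enc = {2, 'a', 'b', 'c', -1, 'x'};
        System.out.println(decode(enc));              // prints abcxxx
    }
}
```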
3) DEFLATE
do
   read block header from input stream.
   if stored with no compression
      skip any remaining bits in current partially
         processed byte
      read LEN and NLEN (see next section)
      copy LEN bytes of data to output
   otherwise
      if compressed with dynamic Huffman codes
         read representation of code trees (see
            subsection below)
      loop (until end of block code recognized)
         decode literal/length value from input stream
         if value < 256
            copy value (literal byte) to output stream
         otherwise
            if value = end of block (256)
               break from loop
            otherwise (value = 257..285)
               decode distance from input stream
               move backwards distance bytes in the output
                  stream, and copy length bytes from this
                  position to the output stream.
      end loop
while not last block
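Rather than hand-rolling the loop above, java.util.zip exposes DEFLATE directly; a round-trip sketch (note that Deflater by default wraps the raw DEFLATE stream in a zlib header and checksum):

```java
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DeflateDemo {
    // Compress then decompress with java.util.zip; returns the recovered bytes.
    static byte[] roundTrip(byte[] input) {
        Deflater def = new Deflater();
        def.setInput(input);
        def.finish();
        byte[] comp = new byte[input.length + 64];   // ample room even if data expands
        int clen = def.deflate(comp);
        def.end();

        Inflater inf = new Inflater();
        inf.setInput(comp, 0, clen);
        byte[] out = new byte[input.length];
        try {
            inf.inflate(out);
        } catch (DataFormatException e) {
            throw new RuntimeException(e);
        }
        inf.end();
        return out;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1000];
        Arrays.fill(data, (byte) 'a');               // highly repetitive input
        System.out.println(Arrays.equals(roundTrip(data), data));  // prints true
    }
}
```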
1) Chroma subsampling
Because the human visual system is less sensitive to the position and motion of color than
luminance,[1] bandwidth can be optimized by storing more luminance detail than color detail. At
normal viewing distances, there is no perceptible loss incurred by sampling the color detail at a
lower rate. In video systems, this is achieved through the use of color difference components.
The signal is divided into a luma (Y') component and two color difference components (chroma).
Chroma subsampling deviates from color science in that the luma and chroma components are
formed as a weighted sum of gamma-corrected (tristimulus) R'G'B' components instead of linear
(tristimulus) RGB components. As a result, luminance and color detail are not completely
independent of one another. There is some "bleeding" of luminance and color information
between the luma and chroma components. The error is greatest for highly saturated colors and
can be somewhat noticeable in between the magenta and green bars of a color bars test pattern
(that has chroma subsampling applied). This engineering approximation (by reversing the order
of operations between gamma correction and forming the weighted sum) allows color
subsampling to be more easily implemented.
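As a sketch of what 4:2:0 subsampling does to a chroma plane (the array layout and even plane dimensions are assumptions made for the demo), each 2x2 block of chroma samples is replaced by its average, quartering the chroma data, while the luma plane would be kept at full resolution:

```java
public class ChromaSubsample {
    // Replace each 2x2 block of chroma samples by its rounded mean.
    static int[][] subsample420(int[][] chroma) {
        int h = chroma.length, w = chroma[0].length;   // assumed even
        int[][] out = new int[h / 2][w / 2];
        for (int y = 0; y < h; y += 2)
            for (int x = 0; x < w; x += 2)
                out[y / 2][x / 2] = (chroma[y][x] + chroma[y][x + 1]
                                   + chroma[y + 1][x] + chroma[y + 1][x + 1] + 2) / 4;
        return out;
    }

    public static void main(String[] args) {
        int[][] cb = {{10, 12, 100, 100},
                      {14, 16, 100, 100}};
        int[][] small = subsample420(cb);
        System.out.println(small[0][0] + " " + small[0][1]);  // prints 13 100
    }
}
```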
2) Transform coding
The idea of transform coding is to transform the input into a different form which can then either
be
compressed better, or for which we can more easily drop certain terms without as much
qualitative
loss in the output. One form of transform is to select a linear set of basis functions (φi) that span
the
space to be transformed. Some common sets include sin, cos, polynomials, spherical harmonics,
Bessel functions, and wavelets. Figure 18 shows some examples of the first three basis functions
for discrete cosine, polynomial, and wavelet transformations. For a set of n values, transforms
can
be expressed as an n × n matrix T. Multiplying the input by this matrix T gives, the transformed
coefficients. Multiplying the coefficients by T−1 will convert the data back to the original form.
For example, the coefficients for the discrete cosine transform (DCT) are
    Tij = sqrt(1/n) · cos( (2j+1)·i·π / 2n )    for i = 0,      0 ≤ j < n
    Tij = sqrt(2/n) · cos( (2j+1)·i·π / 2n )    for 0 < i < n,  0 ≤ j < n
The DCT is one of the most commonly used transforms in practice for image compression,
more so than the discrete Fourier transform (DFT). This is because the DFT assumes periodicity,
which is not necessarily true in images. In particular, representing a linear function over a region
requires many large-amplitude high-frequency components in a DFT. This is because the
periodicity assumption will view the function as a sawtooth, which is highly discontinuous at the
teeth requiring the high-frequency components. The DCT does not assume periodicity and will
only require much lower amplitude high-frequency components. The DCT also does not require
a phase, which is typically represented using complex numbers in the DFT.
For the purpose of compression, the properties we would like of a transform are (1) to
decorrelate the data, (2) have many of the transformed coefficients be small, and (3) have it so
that from the point of view of perception, some of the terms are more important than others.
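Since the rows of T are orthonormal, T−1 is simply the transpose of T. A small sketch building the DCT matrix from the formula above and round-tripping a linear ramp:

```java
public class DctDemo {
    // Build the n x n DCT matrix T from the two-case formula above.
    static double[][] dctMatrix(int n) {
        double[][] t = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                t[i][j] = Math.sqrt((i == 0 ? 1.0 : 2.0) / n)
                        * Math.cos((2 * j + 1) * i * Math.PI / (2 * n));
        return t;
    }

    static double[] mul(double[][] m, double[] v) {
        double[] r = new double[v.length];
        for (int i = 0; i < v.length; i++)
            for (int j = 0; j < v.length; j++) r[i] += m[i][j] * v[j];
        return r;
    }

    static double[][] transpose(double[][] m) {
        double[][] r = new double[m.length][m.length];
        for (int i = 0; i < m.length; i++)
            for (int j = 0; j < m.length; j++) r[j][i] = m[i][j];
        return r;
    }

    public static void main(String[] args) {
        int n = 8;
        double[][] t = dctMatrix(n);
        double[] ramp = {0, 1, 2, 3, 4, 5, 6, 7};        // a linear function
        double[] coef = mul(t, ramp);                    // transformed coefficients
        double[] back = mul(transpose(t), coef);         // inverse = transpose
        System.out.println(Math.round(coef[0] * 1000) / 1000.0);  // prints 9.899 (DC term)
        System.out.println(Math.round(back[7] * 1000) / 1000.0);  // prints 7.0 (round trip)
    }
}
```

For a linear ramp the energy concentrates in the first few coefficients, which is exactly the decorrelation property (1)–(2) above that makes the DCT useful for compression.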
3) Fractal compression
We begin with the representation of a binary image, where the image may be thought of as a
subset of R². An IFS is a set of contraction mappings ƒ1,...,ƒN on R².
According to these mapping functions, the IFS describes a two-dimensional set S as the fixed
point of the Hutchinson operator

    H(A) = ƒ1(A) ∪ ƒ2(A) ∪ ... ∪ ƒN(A)

That is, H is an operator mapping sets to sets, and S is the unique set satisfying H(S) = S. The
idea is to construct the IFS such that this set S is the input binary image. The set S can be
recovered from the IFS by fixed point iteration: for any nonempty compact initial set A0, the
iteration Ak+1 = H(Ak) converges to S.
The set S is self-similar because H(S) = S implies that S is a union of mapped copies of itself:

    S = ƒ1(S) ∪ ƒ2(S) ∪ ... ∪ ƒN(S)
Extension to Grayscale
IFS representation can be extended to a grayscale image by considering the image's graph as a
subset of R³. For a grayscale image u(x,y), consider the set S = {(x,y,u(x,y))}. Then similar to the
binary case, S is described by an IFS using a set of contraction mappings ƒ1,...,ƒN, but in R³.
Encoding
A challenging problem of ongoing research in fractal image representation is how to choose the
ƒ1,...,ƒN such that its fixed point approximates the input image, and how to do this efficiently. A
simple approach[1] for doing so is the following:
1. Partition the image domain into blocks Ri of size s×s.
2. For each Ri, search the image to find a block Di of size 2s×2s that is very similar to Ri.
3. Select the mapping functions such that H(Di) = Ri for each i.
In the second step, it is important to find a similar block so that the IFS accurately represents the
input image, so a sufficient number of candidate blocks for Di need to be considered. On the
other hand, a large search considering many blocks is computationally costly. This bottleneck of
searching for similar blocks is why fractal encoding is much slower than for example DCT and
wavelet based image representations.
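The three steps can be sketched on a toy grayscale array. All block sizes and names here are illustrative (s = 2, domain blocks 2s×2s shrunk by averaging, and no rotations or flips among the candidate mappings); the search fits a contrast/brightness pair per candidate, which is the usual affine grayscale map:

```java
public class FractalSearch {
    // Shrink a 4x4 domain block at (y, x) to 2x2 by averaging 2x2 cells.
    static double[] shrink(double[][] img, int y, int x) {
        double[] d = new double[4];
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < 2; j++)
                d[2 * i + j] = (img[y + 2*i][x + 2*j]     + img[y + 2*i][x + 2*j + 1]
                              + img[y + 2*i + 1][x + 2*j] + img[y + 2*i + 1][x + 2*j + 1]) / 4;
        return d;
    }

    // Least-squares error of the best fit a*d + b ≈ r over the block's pixels.
    static double matchError(double[] d, double[] r) {
        int n = d.length;
        double sd = 0, sr = 0, sdd = 0, sdr = 0;
        for (int k = 0; k < n; k++) { sd += d[k]; sr += r[k]; sdd += d[k]*d[k]; sdr += d[k]*r[k]; }
        double den = n * sdd - sd * sd;
        double a = den == 0 ? 0 : (n * sdr - sd * sr) / den;   // contrast
        double b = (sr - a * sd) / n;                          // brightness
        double err = 0;
        for (int k = 0; k < n; k++) { double e = a * d[k] + b - r[k]; err += e * e; }
        return err;
    }

    // Exhaustive search (the costly step discussed above): best domain
    // block for the 2x2 range block at (ry, rx).
    static int[] bestDomain(double[][] img, int ry, int rx) {
        double[] r = {img[ry][rx], img[ry][rx+1], img[ry+1][rx], img[ry+1][rx+1]};
        double best = Double.MAX_VALUE;
        int[] at = {0, 0};
        for (int y = 0; y + 4 <= img.length; y++)
            for (int x = 0; x + 4 <= img[0].length; x++) {
                double e = matchError(shrink(img, y, x), r);
                if (e < best) { best = e; at = new int[]{y, x}; }
            }
        return at;
    }

    public static void main(String[] args) {
        double[][] img = new double[8][8];
        for (int y = 0; y < 8; y++)
            for (int x = 0; x < 8; x++) img[y][x] = x + y;   // a smooth gradient
        int[] at = bestDomain(img, 0, 0);
        System.out.println("best domain block at " + at[0] + "," + at[1]);
    }
}
```

The doubly nested scan over every domain position is the bottleneck the paragraph above describes: real encoders restrict or index the candidate set to keep this search tractable.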
Resources
- http://en.wikipedia.org/wiki/Image_compression
- http://en.wikipedia.org/wiki/Entropy_encoding
- http://en.wikipedia.org/wiki/Huffman_coding
- http://en.wikipedia.org/wiki/Arithmetic_coding
- http://en.wikipedia.org/wiki/Run-length_encoding
- http://en.wikipedia.org/wiki/DEFLATE
- http://en.wikipedia.org/wiki/Chroma_subsampling
- http://en.wikipedia.org/wiki/Transform_coding
- http://en.wikipedia.org/wiki/Fractal_compression
- http://www.arturocampos.com/ac_arithmetic.html
- http://michael.dipperstein.com/rle/index.html
- http://www.ietf.org/rfc/rfc1951.txt
- Introduction to Data Compression, Guy E. Blelloch, Computer Science Department, Carnegie
Mellon University.
- A Rapid Entropy-Coding Algorithm, Wm. Douglas Withers, Department of Mathematics,
United States Naval Academy, Annapolis, MD 21402, and Pegasus Imaging Corporation.