
Image Compression

CS474/674 – Prof. Bebis


Chapter 8 (except Sections 8.10-8.12)
Image Compression
• The goal of image compression is to reduce the amount
of data required to represent a digital image while at the
same time preserving as much information as possible.
• Lower memory requirements, faster transmission rates.
Data ≠ Information

• Data and information are not synonymous terms!

• Data is the means by which information is conveyed.

• The same information can be represented by different amounts of data!
Data ≠ Information - Example

Ex1: Your wife, Helen, will meet you at Logan Airport in Boston at 5 minutes past 6:00 pm tomorrow night.

Ex2: Your wife will meet you at Logan Airport at 5 minutes past 6:00 pm tomorrow night.

Ex3: Helen will meet you at Logan at 6:00 pm tomorrow night.
Image Compression (cont’d)

• Lossless
– Information preserving
– Low compression ratios

• Lossy
– Information loss
– High compression ratios

Trade-off: information loss vs compression ratio


Compression Ratio

Compression ratio: $C_R = \frac{n_1}{n_2}$

where $n_1$ and $n_2$ are the number of information-carrying units (e.g., bits) in the original and the compressed representation.

Relative Data Redundancy

$R_D = 1 - \frac{1}{C_R}$

Example: if $C_R = 10$ (i.e., 10:1), then $R_D = 0.9$, i.e., 90% of the data in the original representation is redundant.
Types of Data Redundancy

(1) Coding Redundancy


(2) Interpixel (or Spatial) Redundancy
(3) Psychovisual (or Irrelevant Information)
Redundancy

• The goal of data compression is to reduce one or more of these redundancy types.
Coding Redundancy
• A code is a system of rules to convert information (e.g.,
an image) into another form for efficient storage or
transmission.
– A code consists of a list of
symbols (e.g., letters, numbers,
bits etc.)
– A code word is a sequence of
symbols used to represent some
information (e.g., gray levels).
– The length of a code word is the
number of symbols in the code
word; could be fixed or variable.
Coding Redundancy (cont’d)
• Coding redundancy results from employing inefficient
coding schemes.

• To compare the efficiency of different coding schemes, we can compute the average number of symbols Lavg per code word (or average data content).

• Coding schemes with lower Lavg are more efficient since they save more memory.
Coding Redundancy (cont’d)
N x M image (symbols: bits)

r_k: k-th gray level
l(r_k): # of bits for representing r_k
P(r_k): probability of r_k

$L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, P(r_k)$

(average length of symbols per code word, or average data content in bits per pixel)

Average image size: $N M L_{avg}$ bits
Coding Redundancy - Example

• Case 1: l(rk) = fixed length

L = 8 gray levels, Lavg = 3 bits/pixel

Average image size: 3NM bits


Coding Redundancy – Example (cont’d)
• Case 2: l(rk) = variable length

L = 8 gray levels, Lavg = 2.7 bits/pixel

Average image size: 2.7NM bits
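
A minimal Python sketch (not from the slides) of this computation; the probabilities and code-word lengths below are the classic 8-level textbook example and are used here only for illustration:

```python
# Sketch: L_avg = sum_k l(r_k) * P(r_k) for a fixed-length and a variable-length code.
# The probabilities and code lengths are illustrative (classic 8-level example).

p = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]   # P(r_k), k = 0..7
l_fixed = [3] * 8                                       # 3-bit fixed-length code
l_var   = [2, 2, 2, 3, 4, 5, 6, 6]                      # a variable-length code

def average_length(lengths, probs):
    """Average number of bits per pixel: L_avg = sum l(r_k) * P(r_k)."""
    return sum(l * q for l, q in zip(lengths, probs))

N, M = 512, 512
print("L_avg (fixed)    =", average_length(l_fixed, p), "bits/pixel")   # 3.0
print("L_avg (variable) =", average_length(l_var, p), "bits/pixel")     # 2.7
print("image size (fixed)    =", average_length(l_fixed, p) * N * M, "bits")
print("image size (variable) =", average_length(l_var, p) * N * M, "bits")
```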


Interpixel redundancy
• Interpixel redundancy results from pixel correlations (i.e.,
a pixel value can be reasonably predicted by its neighbors).

(figure: example images with their histograms and the auto-correlation along one scanline)

Correlation: $f(x) \circ g(x) = \int f(a)\, g(x+a)\, da$

auto-correlation: f(x) = g(x)
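
A small sketch (an illustration using numpy, not from the slides) of how the normalized auto-correlation of one scanline could be computed to quantify how strongly neighboring pixels are correlated:

```python
import numpy as np

def normalized_autocorrelation(line, max_shift):
    """Normalized auto-correlation of a 1-D scanline for shifts 0..max_shift.
    A(dn) is the mean of line[y] * line[y + dn]; the result is A(dn) / A(0)."""
    line = line.astype(np.float64)
    N = len(line)
    A = [np.mean(line[:N - dn] * line[dn:]) for dn in range(max_shift + 1)]
    return np.array(A) / A[0]

# Hypothetical usage on row 100 of an image 'img' (2-D numpy array):
# gamma = normalized_autocorrelation(img[100, :], max_shift=50)
# Values close to 1 at small shifts indicate strong interpixel correlation.
```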
Interpixel redundancy (cont’d)
• To reduce interpixel redundancy, some transformation needs
to be applied on the data.

Example: gray-levels of line #100

thresholding transformation: 11……………0000……………………..11…..000…..

Original (grayscale): 1024 bytes
Thresholded (binary): 1024 bits
Psychovisual redundancy
• The human eye is more sensitive to the lower frequencies
than to the higher frequencies in the visual spectrum.
• Idea: discard data that is perceptually insignificant (e.g., using
quantization).
256 gray levels          16 gray levels          16 gray levels + random noise

Example: add a small random number to each pixel prior to quantization.

CR = 8/4 = 2:1
Data ≠ Information (revisited)

Goal: reduce the amount of data while preserving as much information as possible!

Question: What is the minimum amount of data that preserves the information content of an image?

We need some measure of information!


How do we measure information?
• We assume that information is generated by a
probabilistic process.

• Idea: associate information with probability!

• A random event E with probability P(E) contains $I(E) = \log\frac{1}{P(E)} = -\log P(E)$ units of information.

Note: when P(E)=1, I(E)=0 (a certain event carries no information)


How much information does a pixel contain?

• Suppose that pixel values are generated by a random process; then gray-level r_k contains

$I(r_k) = -\log P(r_k)$

units of information!

(assuming statistically independent random events)


How much information does an image contain?

• The average information content of an image is:

$E = \sum_{k=0}^{L-1} I(r_k)\, P(r_k)$

using $I(r_k) = -\log_2 P(r_k)$:

Entropy: $H = -\sum_{k=0}^{L-1} P(r_k)\, \log_2 P(r_k)$ units of info/pixel (e.g., bits/pixel)
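
A minimal sketch (assuming an 8-bit grayscale numpy image) of the first-order entropy estimate defined above:

```python
import numpy as np

def first_order_entropy(img, levels=256):
    """First-order entropy estimate H = -sum P(r_k) log2 P(r_k), in bits/pixel."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]                      # ignore gray levels that never occur
    return -np.sum(p * np.log2(p))

# Hypothetical usage:
# H = first_order_entropy(img)   # e.g., 1.81 bits/pixel for the example image
```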
Entropy – Example

H=1.6614 bits/pixel H=8 bits/pixel H=1.566 bits/pixel

The amount of entropy, and thus the information content of an image, is far from intuitive!
Data Redundancy
• Data redundancy can be computed by comparing data to information:

$R = L_{avg} - H$ (bits/pixel)

Do not confuse R with R_D (relative data redundancy)

where L_avg is the average number of bits per pixel and H is the entropy.

Note: if L_avg = H, then R = 0 (no data redundancy)


Data Redundancy - Example

Lavg = 8 bits/pixel, H = 1.81 bits/pixel

R = Lavg − H = 6.19 bits/pixel


Entropy Estimation
• Estimating H reliably is not easy!

First-order estimate of H: use relative frequencies of individual pixels.
Second-order estimate of H: use relative frequencies of pixel blocks (e.g., pairs of adjacent pixels).

Which estimate is more reliable?


Entropy Estimation (cont’d)

• In general, differences between first-order and higher-order entropy estimates indicate the presence of interpixel redundancy.

• As we have discussed earlier, some transformation needs to be applied on the data to address interpixel redundancy.
Estimating Entropy - Example
• Consider a simple transformation that subtracts
column i+1 from column i (starting at i=1):
Difference image:

• No information has been lost – the original image can be completely reconstructed from the difference image by adding column i to column i+1 (starting at i=0).
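
A hedged sketch of one possible version of this column-differencing transformation and its exact inverse (assuming an 8-bit numpy image; the slide's exact indexing convention may differ slightly):

```python
import numpy as np

def column_difference(img):
    """Difference image: keep column 0, replace each later column by
    its difference from the previous column."""
    diff = img.astype(np.int16).copy()
    diff[:, 1:] = img[:, 1:].astype(np.int16) - img[:, :-1].astype(np.int16)
    return diff

def reconstruct(diff):
    """Invert the transformation by cumulative summation along each row."""
    return np.cumsum(diff, axis=1).astype(np.uint8)

# No information is lost:
# assert np.array_equal(reconstruct(column_difference(img)), img)
```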
Estimating Entropy – Example (cont’d)

Entropy of difference image: less than the entropy of the original image (i.e., 1.81 bits/pixel).

• It is possible that an even better transformation could be found, since the 2nd-order entropy estimate is even lower!
General Image Compression and
Transmission Model

We will focus on the Source Encoder/Decoder only.


Encoder – Three Main Components

• Mapper: applies a transformation to the data to account for interpixel redundancies.
Encoder (cont’d)

• Quantizer: quantizes the data to account for psychovisual redundancies.
Encoder (cont’d)

• Symbol encoder: encodes the data to account for coding redundancies.
Decoder - Three Main Components

• The decoder applies the same steps in inverse order.

• Note: the quantization is irreversible in general!


Fidelity Criteria

• How close is the reconstructed image $\hat{f}(x,y)$ to the original image $f(x,y)$?

• Criteria
– Subjective: based on human observers.
– Objective: based on mathematically defined criteria.
Subjective Fidelity Criteria
Objective Fidelity Criteria

• Root mean square error (RMS):

$e_{rms} = \left[\frac{1}{NM}\sum_{x=0}^{N-1}\sum_{y=0}^{M-1}[\hat{f}(x,y) - f(x,y)]^2\right]^{1/2}$

• Mean-square signal-to-noise ratio (SNR):

$SNR_{ms} = \frac{\sum_{x}\sum_{y}\hat{f}(x,y)^2}{\sum_{x}\sum_{y}[\hat{f}(x,y) - f(x,y)]^2}$
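
A minimal numpy sketch of the two objective criteria above:

```python
import numpy as np

def rms_error(f, f_hat):
    """Root-mean-square error between the original f and the reconstruction f_hat."""
    e = f_hat.astype(np.float64) - f.astype(np.float64)
    return np.sqrt(np.mean(e ** 2))

def snr_ms(f, f_hat):
    """Mean-square signal-to-noise ratio of the reconstruction."""
    f_hat = f_hat.astype(np.float64)
    e = f_hat - f.astype(np.float64)
    return np.sum(f_hat ** 2) / np.sum(e ** 2)
```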


Objective Fidelity Criteria - Example

RMS = 5.17 RMS = 15.67 RMS = 14.17


Lossless Compression
Taxonomy of Lossless Methods

(Run-length encoding)

(see “Image Compression Techniques” paper)


Huffman Coding
(addresses coding redundancy)
• A variable-length coding technique.

• Source symbols (i.e., gray-levels) are encoded one at a time.


– There is a one-to-one correspondence between source symbols and
code words.
– It is optimal in the sense that it minimizes code word length per
source symbol.
Huffman Coding (cont’d)
• Forward Pass
1. Sort probabilities per symbol (e.g., gray-levels)
2. Combine the two lowest probabilities
3. Repeat Step 2 until only two probabilities remain.
Huffman Coding (cont’d)

• Backward Pass
Assign code symbols going backwards
Huffman Coding (cont’d)
• Lavg assuming binary coding:

• Lavg assuming Huffman coding:


Huffman Decoding

• Decoding can be performed unambiguously using a look-up table.
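
A minimal Python sketch (my own code, not the slides' implementation) of Huffman code construction by repeatedly merging the two lowest probabilities, plus table-based decoding:

```python
import heapq

def huffman_code(probabilities):
    """Build a Huffman code (symbol -> bit string) from {symbol: probability}.
    Minimal sketch; ties are broken arbitrarily."""
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, code1 = heapq.heappop(heap)     # two lowest probabilities
        p2, _, code2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in code1.items()}
        merged.update({s: "1" + c for s, c in code2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

def huffman_decode(bits, code):
    """Decode a bit string using the inverse look-up table (prefix-free code)."""
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

# Illustrative usage (probabilities are assumptions):
# probs = {'a1': 0.4, 'a2': 0.3, 'a3': 0.2, 'a4': 0.1}
# code = huffman_code(probs)
# L_avg = sum(p * len(code[s]) for s, p in probs.items())
```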
Arithmetic (or Range) Coding
(addresses coding redundancy)

• Huffman coding encodes source symbols one at a time, which might not be efficient overall.

• Arithmetic coding assigns sequences of source symbols to variable-length code words.
– There is no one-to-one correspondence between source symbols and code words.
– Slower than Huffman coding but can achieve higher compression.
Arithmetic Coding – Main Idea

• Maps a sequence of symbols to a real number (arithmetic code) in the interval [0, 1).

α1 α2 α3 α3 α4

• The mapping is built incrementally (i.e., as each source symbol arrives) and depends on the source symbol probabilities.
• The original sequence of symbols can be obtained by decoding
the arithmetic code.
Arithmetic Coding – Main Idea (cont’d)
symbol sequence: α1 α2 α3 α3 α4 (known probabilities P(αi))

– Start with the interval [0, 1).

– A sub-interval of [0, 1) is chosen to encode the first symbol α1 in the sequence (based on P(α1)).

– A sub-interval inside the previous sub-interval is chosen to encode the next symbol α2 in the sequence (based on P(α2)).

– Eventually, the whole symbol sequence is encoded by a number within the final sub-interval.
Arithmetic Coding - Example
Encode α1 α2 α3 α3 α4: subdivide [0, 1) based on P(αi), then repeatedly subdivide the sub-interval chosen for each symbol.

final sub-interval: [0.06752, 0.0688)

arithmetic code: 0.068
(can choose any number within the final sub-interval)

Warning: finite precision arithmetic might cause problems due to truncations!
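
A minimal sketch of the interval-subdivision encoder described above. The probabilities P(α1)=0.2, P(α2)=0.2, P(α3)=0.4, P(α4)=0.2 are assumptions (they reproduce the final sub-interval shown on the slide), and infinite-precision floats are assumed:

```python
def arithmetic_encode(sequence, prob):
    """Minimal arithmetic-encoding sketch.
    prob: {symbol: probability}; returns the final sub-interval [low, high)."""
    # cumulative distribution: symbol -> (cum_low, cum_high)
    cdf, c = {}, 0.0
    for sym, p in prob.items():
        cdf[sym] = (c, c + p)
        c += p
    low, high = 0.0, 1.0
    for sym in sequence:
        span = high - low
        c_low, c_high = cdf[sym]
        low, high = low + span * c_low, low + span * c_high
    return low, high          # any number in [low, high) encodes the sequence

# Illustrative usage (assumed probabilities):
# low, high = arithmetic_encode(['a1', 'a2', 'a3', 'a3', 'a4'],
#                               {'a1': 0.2, 'a2': 0.2, 'a3': 0.4, 'a4': 0.2})
# -> [0.06752, 0.0688)
```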


Arithmetic Coding - Example (cont’d)
• The arithmetic code 0.068 can be encoded using binary fractions:

0.068 ≈ 0.000100011 (9 bits) (subject to conversion error; the exact value of 0.000100011 is 0.068359375)
α1 α2 α3 α3 α4

• Huffman Code: 0100011001 (10 bits)

• Fixed Binary Code: 5 x 8 bits/symbol = 40 bits
Arithmetic Decoding - Example
Decode 0.572: subdivide [0, 1) based on P(αi); at each step, the sub-interval that contains 0.572 identifies the next symbol, and that sub-interval is itself subdivided for the following step.

Decoded sequence: α3 α3 α1 α2 α4

A special EOF symbol can be used to terminate the iterations.
LZW Coding
(addresses interpixel redundancy)

• Requires no prior knowledge of symbol probabilities.

• Assigns sequences of source symbols to fixed-length code words.
– There is no one-to-one correspondence between source symbols and code words.
– Included in the GIF, TIFF, and PDF file formats.
LZW Coding
• LZW builds a codebook (or dictionary) of symbol
sequences (i.e., gray-level sequences) as it processes
the image pixels.
• Each symbol sequence is encoded by its dictionary
location.
Dictionary Location    Entry
0                      …
1                      …
…                      …
255                    …
256                    10-120-51
…                      …
511                    -

For example, the sequence of gray-levels 10-120-51 (3 bytes) will be encoded by 256 (1 byte).
LZW Coding (cont’d)
• Initially, the first 256 entries of the dictionary are assigned
to the gray levels 0,1,2,..,255 (i.e., assuming 8 bits/pixel
images)
Initial Dictionary
Dictionary Location Entry

0 0
1 1
… …
255 255
256 -
… …
511 -
LZW Coding – Main Idea
Example: 4 x 4 image

39 39 126 126
39 39 126 126
39 39 126 126
39 39 126 126

As the encoder examines the image pixels, gray-level sequences that are not in the dictionary are added to the dictionary.

Dictionary Location    Entry
0                      0
1                      1
…                      …
255                    255
256                    39-39
…                      …
511                    -

- Is 39 in the dictionary? ……. Yes
- What about 39-39? …………. No
* Add 39-39 at location 256

So, 39-39 can be encoded by 256!


Example
4 x 4 image:
39 39 126 126
39 39 126 126
39 39 126 126
39 39 126 126

Concatenated Sequence: CS = CR + P
(CR: currently recognized sequence, P: next pixel)

CR = empty
repeat
    P = next pixel
    CS = CR + P
    if CS is in the dictionary D:
        (1) CR = CS
    else:
        (1) add CS to D
        (2) output D(CR)   (the dictionary location of CR)
        (3) CR = P
until no more pixels

Encoded output: 10 code words x 9 bits/code word = 90 bits vs 16 pixels x 8 bits/pixel = 128 bits
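
A minimal Python sketch of the LZW encoding loop above (my own code, not the slides'):

```python
def lzw_encode(pixels, bits_per_pixel=8):
    """Minimal LZW encoding sketch following the CS = CR + P loop above.
    In the slide example, the dictionary has 512 locations (9-bit code words)."""
    # initial dictionary: one entry per gray level
    D = {(g,): g for g in range(2 ** bits_per_pixel)}
    next_code = 2 ** bits_per_pixel           # first free location (256)
    output, CR = [], ()
    for P in pixels:
        CS = CR + (P,)                        # concatenated sequence
        if CS in D:
            CR = CS                           # keep growing the recognized sequence
        else:
            output.append(D[CR])              # output code for recognized sequence
            D[CS] = next_code                 # add the new sequence to the dictionary
            next_code += 1
            CR = (P,)
    if CR:
        output.append(D[CR])
    return output

# Usage for the 4x4 example image (row 39 39 126 126 repeated four times):
# codes = lzw_encode([39, 39, 126, 126] * 4)
# -> 10 code words: [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
```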
Decoding LZW

• Decoding can be done using the dictionary again.


• For image transmission, there is no need to transmit the dictionary for decoding.

• The dictionary can be built on the fly by the decoder as it reads the received code words.
Run-length coding (RLC)
(addresses interpixel redundancy)

• Represent sequences of repeating symbols (a “run”) using a compact representation (symbol, count):
(i) symbol: the symbol itself
(ii) count: the number of times the symbol repeats

111110000001 → (1, 5) (0, 6) (1, 1)
aaabbbbbbcc → (a, 3) (b, 6) (c, 2)

• Each pair (symbol, count) can be thought of as a “new” symbol.
– Huffman or arithmetic coding could be used to encode them.
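
A minimal sketch of run-length encoding and decoding:

```python
def run_length_encode(symbols):
    """Encode a sequence as (symbol, count) pairs."""
    runs = []
    for s in symbols:
        if runs and runs[-1][0] == s:
            runs[-1] = (s, runs[-1][1] + 1)
        else:
            runs.append((s, 1))
    return runs

def run_length_decode(runs):
    """Expand (symbol, count) pairs back into the original sequence."""
    return [s for s, count in runs for _ in range(count)]

# run_length_encode("111110000001") -> [('1', 5), ('0', 6), ('1', 1)]
# run_length_encode("aaabbbbbbcc")  -> [('a', 3), ('b', 6), ('c', 2)]
```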
Bit-plane coding
(addresses interpixel redundancy)

• Process each bit plane individually.

(1) Decompose an image into a series of binary images.


(2) Compress each binary image (e.g., using RLC)
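
A minimal numpy sketch of the decomposition step (1); each binary plane can then be fed to the run-length encoder sketched earlier:

```python
import numpy as np

def bit_planes(img):
    """Decompose an 8-bit image into 8 binary images (bit planes).
    Plane 0 is the least significant bit, plane 7 the most significant."""
    img = img.astype(np.uint8)
    return [(img >> b) & 1 for b in range(8)]

# Example of step (2), compressing each plane with RLC:
# planes = bit_planes(img)
# rle_per_plane = [run_length_encode(plane.flatten().tolist()) for plane in planes]
```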
Lossy Methods - Taxonomy

(see “Image Compression Techniques” paper)


Lossy Compression – Transform Coding
• Transform the image into some other domain to address
interpixel redundancy.

Quantization is irreversible in general!


Example: Fourier Transform

Approximate f(x,y) using fewer F(u,v) coefficients (K << N):

$\hat{f}(x,y) = \sum_{u=0}^{K-1}\sum_{v=0}^{K-1} F(u,v)\, e^{j2\pi(ux+vy)/N}$

Note that the magnitude of the FT decreases as u, v increase!
What transformations can be used?

• Various transformations T(u,v) are possible, for example:


– DFT
– DCT (Discrete Cosine Transform)
– KLT (Karhunen-Loeve Transformation)
– Principal Component Analysis (PCA)

• JPEG employs DCT for handling interpixel redundancy.


DCT (Discrete Cosine Transform)

Forward:

$C(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y)\, \cos\frac{(2x+1)u\pi}{2N}\, \cos\frac{(2y+1)v\pi}{2N}$

Inverse:

$f(x,y) = \sum_{u=0}^{N-1}\sum_{v=0}^{N-1} \alpha(u)\,\alpha(v)\, C(u,v)\, \cos\frac{(2x+1)u\pi}{2N}\, \cos\frac{(2y+1)v\pi}{2N}$

where $\alpha(u) = \sqrt{1/N}$ if u = 0 and $\alpha(u) = \sqrt{2/N}$ if u > 0 (similarly for $\alpha(v)$).
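
A direct (unoptimized) numpy sketch of the forward and inverse DCT formulas above; for real use, a library routine such as scipy's DCT is much faster:

```python
import numpy as np

def dct_basis(N):
    """B[u, x] = alpha(u) * cos((2x+1) u pi / (2N)), with alpha as defined above."""
    a = np.array([np.sqrt(1.0 / N)] + [np.sqrt(2.0 / N)] * (N - 1))
    x = np.arange(N)
    return a[:, None] * np.cos(np.outer(np.arange(N), 2 * x + 1) * np.pi / (2 * N))

def dct2(f):
    """Forward 2-D DCT of an N x N block: C = B f B^T."""
    B = dct_basis(f.shape[0])
    return B @ f @ B.T

def idct2(C):
    """Inverse 2-D DCT: f = B^T C B."""
    B = dct_basis(C.shape[0])
    return B.T @ C @ B

# Round-trip check on a random 8 x 8 block:
# block = np.random.rand(8, 8)
# assert np.allclose(idct2(dct2(block)), block)
```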
DCT (cont’d)

• Basis functions for a 4x4 image (i.e., cosines of different frequencies).
DCT (cont’d)
DFT WHT DCT

The image is divided into 8 x 8 sub-images (64 coefficients per sub-image).

Sub-images were reconstructed by truncating 50% of the coefficients.

The RMS error was computed between the original and reconstructed images.

RMS error: 2.32 (DFT)    1.78 (WHT)    1.13 (DCT)


DCT (cont’d)

• Sub-image size selection: (plot of RMS error vs. sub-image size)

Reconstructions (75% truncation of coefficients):

original    2 x 2 sub-images    4 x 4 sub-images    8 x 8 sub-images


JPEG Compression

(block diagram: the encoder ends in an entropy encoder and the decoder starts with an entropy decoder)

JPEG became an international image compression standard in 1992.
JPEG - Steps

1. Divide image into 8x8 subimages.

For each subimage do:


2. Shift the gray-levels to the range [-128, 127]
(i.e., reduces the dynamic range requirements of the DCT)

3. Apply DCT; yields 64 coefficients


1 DC coefficient: F(0,0)
63 AC coefficients: F(u,v)
Example

(figure: sub-image shifted to [-128, 127] and its DCT spectrum)

The low-frequency components are around the upper-left corner of the spectrum (not centered!).
JPEG Steps

4. Quantize the coefficients (i.e., reduce the amplitude of coefficients that do not contribute much):

$C_q(u,v) = round\left(\frac{C(u,v)}{Q(u,v)}\right)$

Q(u,v): quantization array


Computing Q[i][j] - Example

• Quantization array Q[i][j]


Example (cont’d)

(figure: DCT coefficients C(u,v), quantization array Q(u,v), and quantized coefficients Cq(u,v))

Small-magnitude coefficients have been truncated to zero!

The “quality” parameter controls how many of them will be truncated.
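
A sketch of the quantization step. The rule used below to generate Q[i][j] from the "quality" parameter is an assumption for illustration only; the actual array on the slide may differ:

```python
import numpy as np

def quantization_array(quality, N=8):
    """One simple way to generate Q[i][j] from a 'quality' parameter
    (assumed rule: Q[i][j] = 1 + (1 + i + j) * quality;
    with this rule, larger quality -> coarser quantization -> more zeros)."""
    i, j = np.indices((N, N))
    return 1 + (1 + i + j) * quality

def quantize(C, Q):
    """Cq(u,v) = round(C(u,v) / Q(u,v)); small coefficients become zero."""
    return np.round(C / Q).astype(np.int32)

def dequantize(Cq, Q):
    """De-quantized coefficients: multiply by Q(u,v)."""
    return Cq * Q

# Typical use on one 8x8 block (with dct2/idct2 from the DCT sketch):
# Q  = quantization_array(quality=2)
# Cq = quantize(dct2(block - 128), Q)
# reconstructed = idct2(dequantize(Cq, Q)) + 128
```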
JPEG Steps (cont’d)
5. Order the coefficients using zig-zag ordering
- Creates long runs of zeros (i.e., ideal for RLC)
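
A small sketch of zig-zag ordering for an 8x8 block of quantized coefficients:

```python
import numpy as np

def zigzag_order(block):
    """Return the 64 coefficients of an 8x8 block in zig-zag order
    (anti-diagonals, alternating direction), so trailing zeros form long runs."""
    N = block.shape[0]
    out = []
    for s in range(2 * N - 1):                 # s = u + v indexes the anti-diagonal
        idx = [(u, s - u) for u in range(N) if 0 <= s - u < N]
        if s % 2 == 0:
            idx = idx[::-1]                    # even diagonals are traversed in reverse
        out.extend(block[u, v] for u, v in idx)
    return np.array(out)

# zigzag_order(Cq) typically ends in a long run of zeros, ideal for run-length coding.
```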
JPEG Steps (cont’d)

6. Encode coefficients:

6.1 Form “intermediate” symbol sequence.

6.2 Encode “intermediate” symbol sequence into a binary sequence (e.g., using Huffman coding).
Intermediate Symbol Sequence – DC coefficient

symbol_1 (SIZE)    symbol_2 (AMPLITUDE)
(6)                (61)

SIZE: # bits needed to encode the amplitude

Example: a DC coefficient with amplitude 61 needs SIZE = 6 bits, so it is represented by the intermediate symbols (6)(61).
Amplitude of DC coefficient: symbol_1 (SIZE), symbol_2 (AMPLITUDE)

Predictive coding: the DC coefficient of every block other than the first is substituted by the difference between the DC coefficient of the current block and that of the previous block.
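
A tiny sketch of this predictive (differencing) step applied to the DC coefficients of consecutive blocks:

```python
def dc_differences(dc_coefficients):
    """Predictive coding of DC terms: keep the first DC value,
    replace each later one by its difference from the previous block's DC."""
    diffs = [dc_coefficients[0]]
    diffs += [dc_coefficients[i] - dc_coefficients[i - 1]
              for i in range(1, len(dc_coefficients))]
    return diffs

# Example (made-up DC values): [150, 155, 149, 152] -> [150, 5, -6, 3]
```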
Intermediate Symbol Sequence – AC coefficient

symbol_1 (RUN-LENGTH, SIZE)    symbol_2 (AMPLITUDE)    end-of-block

RUN-LENGTH: run of zeros preceding the coefficient
SIZE: # bits for encoding the amplitude of the coefficient

Note: If RUN-LENGTH > 15, use the symbol (15,0) as many times as needed, e.g.,
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 → (15, 0) (3, 2) (2)
AC Coefficient Encoding
Symbol_1: Variable Length Code (VLC), pre-computed Huffman codes
Symbol_2: Variable Length Integer (VLI), pre-computed codes

Idea: smaller (and more common) values use fewer bits and take up less space than larger (and less common) values.

DC coefficients are encoded similarly.

(1, 4) (12) → 111110110 1100
              (VLC)     (VLI)
Final Symbol Sequence
Effect of “Quality” parameter

(58k bytes) (21k bytes) (8k bytes)

lower compression higher compression


Effect of “Quality” parameter (cont’d)
Effect of Quantization:
homogeneous 8 x 8 block
Effect of Quantization:
homogeneous 8 x 8 block (cont’d)

Quantized coefficients → de-quantized coefficients (multiply by Q(u,v))
Effect of Quantization:
homogeneous 8 x 8 block (cont’d)

Reconstructed
Error is low!

Original
Effect of Quantization:
non-homogeneous 8 x 8 block
Effect of Quantization:
non-homogeneous 8 x 8 block (cont’d)

Quantized coefficients → de-quantized coefficients (multiply by Q(u,v))
Effect of Quantization:
non-homogeneous 8 x 8 block (cont’d)

Reconstructed

Error is high!

Original
Case Study: Fingerprint Compression

• The FBI is digitizing fingerprints at 500 dots per inch using 8 bits of grayscale resolution.

• A single fingerprint card (containing fingerprints from all 10 fingers) turns into about 10 MB of data.

A sample fingerprint image:
768 x 768 pixels = 589,824 bytes
Need to Preserve Fingerprint Details

The "white" spots in the middle of


the black ridges are sweat pores
which are admissible points of
identification in court.

These details are just a couple


pixels wide!
What compression scheme should be used?

• Lossless or lossy compression?

• In practice, lossless compression methods haven’t done better than 2:1 on fingerprints!

• Does JPEG work well for fingerprint compression?


Results using JPEG compression
file size 45853 bytes
compression ratio: 12.9

Fine details have been lost.

The image has an artificial “blocky” pattern superimposed on it.

These artifacts will affect the performance of fingerprint recognition.
WSQ Fingerprint Compression

• An image coding standard for digitized fingerprints employing the Discrete Wavelet Transform (Wavelet/Scalar Quantization, or WSQ).

• Developed and maintained by:
– FBI
– Los Alamos National Lab (LANL)
– National Institute of Standards and Technology (NIST)
Results using WSQ compression
file size 45621 bytes
compression ratio: 12.9

Fine details are better preserved.

No “blocky” artifacts.
WSQ Algorithm

Target bit rate can be set via a parameter, similar to the "quality" parameter in JPEG.
Compression ratio

• FBI’s target bit rate is around 0.75 bits per pixel (bpp)

• This corresponds to a compression ratio of 8/0.75 ≈ 10.7.

• Let’s compare WSQ with JPEG …


Varying compression ratio (cont’d)
0.9 bpp compression
WSQ image, file size 47619 bytes, JPEG image, file size 49658 bytes,
compression ratio 12.4 compression ratio 11.9
Varying compression ratio (cont’d)
0.75 bpp compression
WSQ image, file size 39270 bytes JPEG image, file size 40780 bytes,
compression ratio 15.0 compression ratio 14.5
Varying compression ratio (cont’d)
0.6 bpp compression
WSQ image, file size 30987 bytes, JPEG image, file size 30081 bytes,
compression ratio 19.0 compression ratio 19.6
JPEG Modes

• JPEG supports several different modes


– Sequential Mode
– Progressive Mode
– Hierarchical Mode
– Lossless Mode
(see “Survey” paper)
Sequential Mode

• Image is encoded in a single scan (left-to-right, top-to-bottom); this is the default mode.
Progressive JPEG
• Image is encoded in multiple scans.
• Produces a quick, roughly decoded image when
transmission time is long.
Progressive JPEG (cont’d)

• Main algorithms:
(1) Progressive spectral selection algorithm
(2) Progressive successive approximation algorithm
(3) Hybrid progressive algorithm
Progressive JPEG (cont’d)

(1) Progressive spectral selection algorithm


– Group DCT coefficients into several spectral bands
– Send low-frequency DCT coefficients first
– Send higher-frequency DCT coefficients next

Example:
Example
Progressive JPEG (cont’d)

(2) Progressive successive approximation algorithm


– Send all DCT coefficients but with lower precision.
– Refine DCT coefficients in later scans.

Example:
Example
Example
after 0.9s after 1.6s

after 3.6s after 7.0s


Progressive JPEG (cont’d)

(3) Hybrid progressive algorithm


– Combines spectral selection and successive approximation.
Hierarchical JPEG

• Hierarchical mode encodes the image at different resolutions.

• Image is transmitted in multiple passes with increased resolution at each pass.
Hierarchical JPEG (cont’d)

(pyramid figure: the N x N image is down-sampled to f2 at N/2 x N/2 and f4 at N/4 x N/4; lower-resolution levels are up-sampled when moving to the next, higher-resolution level)
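
A rough sketch of the resolution pyramid behind hierarchical mode. Simple 2x2 averaging for down-sampling and nearest-neighbor up-sampling are illustrative assumptions only; the actual standard specifies its own filters:

```python
import numpy as np

def build_pyramid(img, levels=3):
    """Build the hierarchical-mode pyramid by repeated down-sampling by 2
    (simple 2x2 averaging here). Assumes dimensions divisible by 2**(levels-1).
    The encoder sends the coarsest level first, then refinements."""
    pyramid = [img.astype(np.float64)]
    for _ in range(levels - 1):
        f = pyramid[-1]
        down = (f[0::2, 0::2] + f[1::2, 0::2] + f[0::2, 1::2] + f[1::2, 1::2]) / 4.0
        pyramid.append(down)
    return pyramid[::-1]      # coarsest level (e.g., N/4 x N/4) first

def upsample(f):
    """Nearest-neighbor up-sampling by 2, used when moving to the next finer level."""
    return np.repeat(np.repeat(f, 2, axis=0), 2, axis=1)
```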
Problem 1
Problem 2
Problem 3
Quiz #7
• When: 12/6/2021 @ 1pm
• What: Image Compression
• Duration: 10 minutes
