RATE-DISTORTION OPTIMIZED COLOR QUANTIZATION FOR COMPOUND IMAGE COMPRESSION

Wenpeng Ding*^a, Yan Lu^b, Feng Wu^b, Shipeng Li^b

^a Dept. of Computer Science, University of Science and Technology of China, Hefei 230027, China
^b Microsoft Research Asia, Beijing, 100080, China
ABSTRACT

In this paper, we present a new image compression scheme specially designed for computer-generated compound color images. First, we classify the image content into two kinds: text/graphic content and picture content. Then a rate-distortion optimized color quantization algorithm, designed for text/graphic content, introduces distortion to text content while minimizing the bit rate produced by the subsequent lossless entropy coding. The picture content is compressed using conventional image algorithms such as JPEG. The results show that the proposed scheme achieves better coding performance than other image compression algorithms such as JPEG2000 and DjVu.

1. INTRODUCTION

The fast development of computers and the Internet has led to the widespread use of computer-generated images; web pages, Flash animations, and screen desktops are typical examples. Photographic image compression has been well studied, and many mature algorithms such as JPEG [1] and JPEG2000 [2] are available. Another kind of image, the document image, has also been well studied, and existing algorithms such as JBIG and JBIG2 achieve good compression performance. Compression of computer-generated compound images, however, still needs more research.
One kind of approach for compound images is layered coding, which separates the image into different layers and codes each layer independently. Most layered coding algorithms use the standard three-layer mixed raster content representation [3]. One popular method is DjVu [4], which uses a wavelet-based codec (IW44) for the background and foreground layers and JB2 for the mask layer. Block-based approaches [5][8] for compound images have also been studied. Said et al. [5] proposed a simple block-based scheme that compresses text blocks using JPEG-LS and picture blocks using JPEG.
Layer-based approaches work well on simple compound images. However, when the content is very complex, for example when text overlaps the background or has shadows around it, layer-based approaches show poor performance. The reason is that it is difficult to separate text from the background when content is complex, and separating complex picture content into two layers is of no benefit. Sometimes the image even needs to be separated into more than two layers.
Block-based approaches have low complexity, but they also have lower compression efficiency. They use different algorithms for different types of blocks, and to achieve good compression performance we need algorithms specially designed for each type of content. Picture content can be effectively compressed using a conventional image codec such as JPEG. However, most of these approaches compress the text blocks losslessly, which leads to a high bit rate; the text blocks should be specially handled.
In this paper, we present a novel compression algorithm for computer-generated color compound images, which combines various compression algorithms for different content to achieve the best compression performance. The main contributions of our scheme are as follows:
a) A two-stage rate-distortion optimized segmentation method is proposed to segment the image into two types of content: text/graphic and picture.
b) A rate-distortion optimized color quantization algorithm is proposed to trade off the rate and distortion of text/graphic blocks.

* This work was done while the author was with Microsoft Research Asia.

Visual Communications and Image Processing 2007, edited by Chang Wen Chen, Dan Schonfeld, Jiebo Luo,
Proc. of SPIE-IS&T Electronic Imaging, SPIE Vol. 6508, 65082Q, © 2007 SPIE-IS&T · 0277-786X/07/$18

SPIE-IS&T/ Vol. 6508 65082Q-1


This paper is organized as follows. Section 2 overviews the proposed compound compression framework. The segmentation technique is introduced in Section 3. Section 4 presents the text/graphic block compression scheme, discussing the rate-distortion optimized color quantization and the lossless coding of color-quantized images in detail. Experimental results are shown in Section 5, followed by the conclusion in Section 6.

[Figure: block diagram — the input image is segmented; text/graphic blocks go through RDO color quantization and entropy coding, picture blocks through JPEG; both streams feed a multiplexer]

Fig. 1 Compound image compression framework

2. COMPOUND IMAGE COMPRESSION FRAMEWORK

We propose a block-based solution that classifies the image into blocks of different types, which are then compressed using different algorithms. Our scheme is shown in Fig. 1. The input image is split into 8x8 blocks, the blocks are classified into two types, text and picture, and each type of block is compressed using a different algorithm.
A two-stage segmentation method is proposed to segment the image into two types of content: text/graphic and picture. In the first stage, simple picture and text blocks, which are easy to distinguish, are classified. Then a rate-distortion optimization algorithm is used to find the best mode for each complex block.
How to compress the text/graphic blocks efficiently is the main problem we want to solve. First, we need to introduce distortion into text blocks; second, the color-quantized image needs to be compressed losslessly and efficiently. Based on these two points, a rate-distortion optimized color quantization algorithm is proposed to introduce distortion into text blocks, and a context-based arithmetic coder, which exploits the high-order entropy of text blocks, is proposed to compress the color-quantized image.
Color quantization algorithms can be used to introduce distortion into text/graphic blocks. Many existing color quantization algorithms [7] are available, such as K-means, popularity, median cut, and vector quantization. However, all these algorithms are designed for displays limited to a small number of colors; as a result, they are not efficient enough for compression. The proposed rate-distortion optimized color quantization algorithm trades off the requirements of rate and distortion, achieving better rate-distortion performance. It is a very effective way of introducing distortion to text blocks in the image domain.
We adopt a JPEG-like algorithm for picture blocks for simplicity. The difference from JPEG is that we need to handle boundary blocks that have neighboring blocks of different types. The picture-block compression algorithm can be replaced by better algorithms such as JPEG2000 or SPIHT to achieve higher compression performance.

3. SEGMENTATION

The segmentation algorithm classifies the image blocks into different types for the different compression algorithms. There are many features that we can use to classify the blocks by thresholding their values. However, this kind of segmentation algorithm is sensitive to the thresholds, especially for complex picture blocks and text/graphic blocks, which may have many features in common. For example, the color range is large and the number of high-gradient pixels is large in both complex picture blocks and complex text blocks. Sometimes it is even difficult to decide whether a block is a complex picture block or a text/graphics block, such as a block with text written on a picture. What's more, even a segmentation produced by a human is not guaranteed to be best for compression.
We propose a two-stage segmentation scheme that combines thresholding of block features and rate-distortion optimization. The segmentation flowchart is shown in Fig. 2. In the first stage, the blocks are classified into three classes: simple picture blocks, simple text/graphic blocks, and complex blocks. Then in the second stage we use a



rate-distortion optimization algorithm to further classify the complex blocks into two classes: complex picture blocks
and complex text/graphics blocks.
In the first stage, the classification algorithm identifies simple picture blocks and simple text blocks by thresholding block features such as gradient, number of colors, and color range. Smooth blocks and clean text blocks are easily distinguished with this method, and the complex blocks are left to the second stage.
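A first-stage classifier along these lines might look like the following sketch. The thresholds and the exact feature definitions are illustrative assumptions, not the paper's values:

```python
import numpy as np

def classify_simple(block, grad_thr=30.0, color_thr=8, range_thr=40.0):
    """First-stage classification of an 8x8 block by simple features.

    Returns 'picture', 'text', or 'complex'. Thresholds are placeholder
    values for illustration only.
    """
    block = np.asarray(block, dtype=float)
    grad = np.abs(np.diff(block, axis=0)).max()   # strongest vertical gradient
    n_colors = len(np.unique(block))              # number of distinct values
    crange = block.max() - block.min()            # color range
    if grad < grad_thr and crange < range_thr:
        return "picture"          # smooth block
    if n_colors <= color_thr and grad >= grad_thr:
        return "text"             # few colors with sharp edges
    return "complex"              # defer to the RD-optimized second stage
```

Real features would be computed per color channel; the point is only that easy blocks are settled by cheap thresholds and everything ambiguous is passed on.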

[Figure: flowchart — input image → simple classification → RD-optimized classification → picture content and text/graphic content]

Fig. 2 Two-stage block classification algorithm

In the second stage, a rate-distortion optimized (RDO) classification algorithm is used for the complex blocks. The objective function we want to minimize is J = D + λR, where D and R are the distortion and rate of each block, respectively. The cost J of a complex block in each mode is calculated by compressing it with the two different algorithms, and the mode with the minimum J is selected. The Lagrange multiplier λ can be calculated using the technique introduced in [6].

4. TEXT/GRAPHIC BLOCK COMPRESSION

Text/graphic blocks in computer-generated compound images typically contain richly colored text and complex backgrounds. How can this kind of content be compressed efficiently? For one thing, we need to reduce the number of colors in rich-color images while controlling the distortion introduced by color reduction. For another, the local and global similarities we have observed in compound images also need to be taken into account.

Here, we propose a compression scheme that consists of two parts: color quantization and lossless coding. The input image is first color quantized and converted to codebooks and labels, introducing constrained distortion into the quantized image. The generated labels and codebooks are then losslessly compressed separately. The details are described in the following sections.

4.1. Color Quantization for compression

Color quantization is an important problem in computer graphics and image processing, and many color quantization algorithms have been studied. However, these algorithms are designed for display devices capable of displaying only a limited number of colors. They do not take into account the requirements of the subsequent entropy coding stage; as a result, a quantized image with few colors may still result in a high bit rate.
Color quantization for compression is a different problem: we need to take into account both the distortion and the rate of the quantization. The rate of the quantized image depends on its high-order entropy. We propose a rate-distortion optimized (RDO) color quantization algorithm, which jointly minimizes the distortion and the resulting bit rate.

4.1.1. Classical color quantization

The classical color quantization problem can be defined as follows. Let the set of image pixels be X = {x_i | i = 1, 2, ..., N} and let the set of base colors to which the image pixels are mapped be Y = {y_v | v = 1, 2, ..., K}. A color quantization is an assignment of pixel colors x_i to base colors y_v, formalized by Boolean assignment variables M_iv ∈ {0, 1}, where M_iv = 1 (0) denotes that pixel x_i is (is not) quantized to y_v. All the assignments are summarized in a Boolean assignment matrix M ∈ S, where

S = { M ∈ {0,1}^(N×K) : ∑_{v=1}^{K} M_iv = 1, 1 ≤ i ≤ N }   (1)

The quantized image is then obtained by the matrix product MY. The cost function of classical color quantization is

H = ∑_{i=1}^{N} || x_i − ∑_{v=1}^{K} M_iv y_v ||² = ∑_{i=1}^{N} ∑_{v=1}^{K} M_iv || x_i − y_v ||²   (2)

The task of color quantization is to search for a parameter set (M, Y) that minimizes (2).
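For a fixed codebook Y, the optimal hard assignment M maps each pixel to its nearest base color, and the cost (2) reduces to a sum of nearest-neighbor squared distances. A minimal sketch:

```python
import numpy as np

def quantization_cost(X, Y):
    """Cost H of Eq. (2) under the optimal hard assignment M:
    each pixel x_i is mapped to its nearest base color y_v.
    Returns (H, assignment), where assignment[i] = v with M_iv = 1."""
    X = np.asarray(X, dtype=float)   # N x 3 pixel colors
    Y = np.asarray(Y, dtype=float)   # K x 3 base colors
    # squared distances ||x_i - y_v||^2, shape N x K
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    assign = d2.argmin(axis=1)
    return d2[np.arange(len(X)), assign].sum(), assign
```

Classical algorithms alternate this assignment step with re-estimating Y (as in K-means); the next subsection changes the objective, not this basic machinery.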

4.1.2. RD-Optimized color quantization

In the application of image compression, we need to minimize both the distortion and the bit rate. Hence we have two cost functions, Eq. (2) and

R = R(M) + R(Y)   (3)

We combine the two cost functions using the Lagrange multiplier method to get

C = H + λR   (4)

But the cost R(M) depends on the high-order entropy of M, which is hard to optimize locally or iteratively. To simplify this problem we add two constraints that reduce the problem dimension.
Atom region: A coherent, continuous region predefined to be quantized to the same color is called an atom region; every pixel in an atom region is required to be quantized to the same color.
Maximum allowed distortion: Given the maximum allowed distortion D_max, the error of every pixel is required to be no larger than D_max:

max{ || x_i − y_v ||² : M_iv = 1 } ≤ D_max   (5)



The original optimization problem is then converted into the following problem: given the atom regions {o_i, i = 1, 2, ..., n}, group them into K groups G = {g_v | v = 1, 2, ..., K}. Let y_v also denote the color assigned to the pixels in group g_v. The cost function is

C = H + λR = ∑_{v=1}^{K} D_v + λ( ∑_{i,j} R_ij + R(Y) ),  where D_v = ∑_{x ∈ g_v} || x − y_v ||²   (6)

D_v is the distortion of all the pixels in group g_v. R_ij is the cost of the edges between adjacent regions o_i, o_j belonging to different groups; it can be estimated by representing the edges with chain coding. R(Y) is the cost of compressing the codebook Y.

4.1.3. Optimization by iterative merging

The optimization of objective function (6) is still computationally hard, since most combinatorial optimization problems are NP-hard and have many local minima. We design a greedy algorithm that merges the regions one by one. The merging starts from the atom regions; at each step, the two groups whose merger decreases the cost function most are merged, and the merging costs of the affected group pairs are updated. The proposed algorithm has a running time of O(n³).

Algorithm 1: RD-optimized color quantization

Input: atom regions {o_i, i = 1, 2, ..., N}; number of output groups K
Output: groups g_i, i = 1, 2, ..., K, where each g_i is a union of atom regions, g_i ∩ g_j = ∅ for i ≠ j, and ∪_{i=1}^{K} g_i = ∪_{j=1}^{N} o_j

Initialize: number of groups k = N; g_i = o_i
do
    Calculate the decrease ΔC_ij of combining group g_i and group g_j, for every pair
    Merge the two groups g_i and g_j with minimum ΔC_ij: g_new = g_i ∪ g_j; k = k − 1
until k equals K

The distortion of a group is computed as follows. All the pixels in a group g_v are assigned the same color y_v, so the distortion is D_v = ∑_{x ∈ g_v} || x − y_v ||². Minimizing D_v gives y_v = ( ∑_{x ∈ g_v} x ) / N_v, where N_v is the number of pixels belonging to group g_v. To find the group pair that decreases the cost function most, we compute the decrease ΔC_ij for each group pair. The cost of g_i and g_j before merging is

C_ij = D_i + D_j + λR_ij

When they are merged, the cost of the edges between them is eliminated, so

C'_ij = D'_i + D'_j

and therefore

ΔC_ij = D'_i + D'_j − D_i − D_j − λR_ij
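The iterative merging above can be sketched as follows. This is a simplified illustration on scalar-color atom regions: `edge_cost` stands in for the chain-coding boundary rates R_ij, λ is the Lagrange multiplier, and the O(n³) pair scan is kept naive rather than cached:

```python
import itertools

def greedy_merge(atoms, edge_cost, K, lam):
    """Greedy RD merging of scalar-color atom regions down to K groups.

    atoms: list of lists of pixel values (one list per atom region).
    edge_cost: dict frozenset({i, j}) -> boundary rate R_ij (0 if absent).
    Returns the surviving groups as sets of atom-region indices.
    """
    # per-group sufficient statistics: (sum, sum of squares, pixel count)
    stats = [(sum(a), sum(x * x for x in a), len(a)) for a in atoms]
    groups = [{i} for i in range(len(atoms))]

    def dist(s):                        # D_v = sum (x - mean)^2
        sm, sq, n = s
        return sq - sm * sm / n

    while len(groups) > K:
        best = None
        for i, j in itertools.combinations(range(len(groups)), 2):
            r = sum(edge_cost.get(frozenset({a, b}), 0)
                    for a in groups[i] for b in groups[j])
            si, sj = stats[i], stats[j]
            merged = (si[0] + sj[0], si[1] + sj[1], si[2] + sj[2])
            # delta C_ij = D'_ij - D_i - D_j - lambda * R_ij
            dc = dist(merged) - dist(si) - dist(sj) - lam * r
            if best is None or dc < best[0]:
                best = (dc, i, j, merged)
        _, i, j, merged = best          # merge the minimum-cost pair
        groups[i] |= groups[j]
        stats[i] = merged
        del groups[j], stats[j]
    return groups
```

Keeping (sum, sum-of-squares, count) per group lets the merged distortion be evaluated in O(1), which is what makes the greedy scan tractable.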

4.2. Lossless coding

The labels and codebooks generated by color quantization are lossless compressed. The codebooks are simply
compressed one by one lossless. All the labels form a label map, which is compressed lossless, exploiting this local
correlation and long range similarity.
The label map shows up both local and global similarity. Context based compression scheme, adopt in JBIG, have
been approved to be a quite efficient to exploit long rang similarity such as text on document images. The generated
label map also shows up local similarity, which can be exploited by prediction from its causal neighbor. We use a
context based arithmetic coder to compress the label map.

Algorithm 2: Pseudocode of the label map compression algorithm

Input: label map
Output: compressed bit stream

for row = 1 to last row
    for column = 1 to last column
        get the label at the current position
        generate the prediction state for the current label
        calculate the context from the neighbors' prediction states:
            context := neighbors[0]*125 + neighbors[1]*25 + neighbors[2]*5 + neighbors[3]
        code the current prediction state using the context
        if the current prediction state is 4 (prediction failed) then
            code the current label value
        end if
    end for
end for

We use prediction to exploit the local correlation in the label map. The current label is predicted from its four causal neighbors: the left, top-left, top, and top-right labels. Due to the RDO color quantization, in most cases the current label equals one of these four causal labels. The prediction state takes five values: 0, 1, 2, and 3 indicate which neighbor the label equals, and 4 indicates that the label differs from all of its neighbors. The label map is scanned in scan-line order, and each label is predicted from its four causal neighbors. The prediction state is then arithmetic coded using the neighboring labels' states as context. Each neighbor has five prediction states, resulting in 5⁴ = 625 contexts for the current prediction state. If the prediction fails (state 4), the label is further entropy coded with an arithmetic coder.
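The prediction-state and context computations can be sketched as follows. This is a simplified illustration: out-of-bounds neighbors are treated as mismatches (state 4), and the arithmetic coder itself is omitted:

```python
def prediction_state(labels, r, c):
    """Prediction state of label (r, c): 0-3 = index of the first matching
    causal neighbor (left, top-left, top, top-right), 4 = no match."""
    w = len(labels[0])
    neigh = [
        labels[r][c - 1] if c > 0 else None,                    # left
        labels[r - 1][c - 1] if r > 0 and c > 0 else None,      # top-left
        labels[r - 1][c] if r > 0 else None,                    # top
        labels[r - 1][c + 1] if r > 0 and c + 1 < w else None,  # top-right
    ]
    for k, n in enumerate(neigh):
        if n == labels[r][c]:
            return k
    return 4

def state_context(states, r, c):
    """Context index from the four causal neighbors' states, read as a
    base-5 number: 5^4 = 625 possible contexts."""
    def s(rr, cc):
        in_map = 0 <= rr < len(states) and 0 <= cc < len(states[0])
        return states[rr][cc] if in_map else 4
    return (s(r, c - 1) * 125 + s(r - 1, c - 1) * 25
            + s(r - 1, c) * 5 + s(r - 1, c + 1))
```

The weights 125, 25, 5, 1 simply pack the four 5-valued neighbor states into one integer in [0, 624], matching the 625-context figure above.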



5. EXPERIMENTAL RESULTS

The proposed scheme has been implemented, and several experiments have been done to evaluate its coding performance objectively.
Fig. 3 shows a typical compound image containing text, graphic, and picture content. The test image is successfully separated into two groups of blocks with different characteristics: the picture content contains the smooth and textured blocks, while the text/graphic content contains the text and strong-edge blocks.
Fig. 4 compares the performance of different color quantization algorithms. The conventional color quantization used in the experiment is vector quantization, which quantizes all the pixels to a given number of colors. The proposed algorithm first quantizes locally into atom regions and then quantizes the generated atom regions to a given number of colors. We quantize the input image to different numbers of colors (8, 16, 32, ...) to obtain PSNR-rate curves for both methods; only the color quantization algorithms differ in these experiments. The proposed method shows much better coding performance.
To demonstrate the compression performance of the proposed scheme, we compare it with the state-of-the-art image compression standard JPEG2000 (J2K) and the widely used compound image compression algorithm DjVu. The decomposition level is set to five for J2K. Table I tabulates the compression results for eight compound color images at a PSNR of about 40 dB. The proposed algorithm achieves a much higher compression ratio than J2K; on average, its compression ratio is 2.76 times that of J2K. Fig. 5 compares the visual quality of reconstructed images compressed by our scheme, J2K, and DjVu. Text in our scheme is clearly much sharper than in J2K, and the ringing artifacts around edges are alleviated. DjVu, in contrast, removes all the shadows around the text, making the text look aliased, and it also suffers from a color-scattering problem. The proposed scheme preserves the shadows around text and achieves better visual quality than both J2K and DjVu.

Table I: Comparison of coding performance between JPEG2000 and proposed scheme in compression ratio
Test images 1 2 3 4 5 6 7 8
JPEG2000 60.0 21.8 40.0 30.0 18.5 34.3 30.0 21.8
Proposed 139.7 90.9 105.0 48.4 54.0 46.9 97.1 83.6

6. CONCLUSION

We present a new image compression scheme, which is specially designed for computer generated compound color
images. The propose scheme combining various algorithms to achieve the best compression performance for compound
images. A two stage segmentation algorithm combining thresholding block features and rate-distortion optimization is
introduced to decompose the screen into two different parts: text and picture. The rate-distortion optimization
segmentation algorithm assures each block is compressed using the best algorithm. Text/graphic blocks are efficiently
compressed using rate-distortion optimized color quantization and context-base lossless coding. The rate-distortion
optimized color quantization algorithm efficiently tradeoff the rate and distortion of text/graphics blocks. Our algorithm
is efficient on screen images and outperforms both J2K and DjVu.

REFERENCES

[1] W. B. Pennebaker and J. L. Mitchell, "JPEG: Still Image Data Compression Standard", Van Nostrand Reinhold, 1993.
[2] D. Taubman and M. Marcellin, "JPEG2000: Image Compression Fundamentals, Standards, and Practice", Kluwer Academic Publishers, 2001.
[3] "Mixed Raster Content (MRC)", ITU-T Recommendation T.44, Study Group 8 Contribution, 1998.
[4] P. Haffner, L. Bottou, P. G. Howard, P. Simard, Y. Bengio, and Y. LeCun, "High quality document image compression with DjVu", Journal of Electronic Imaging, pp. 410-425, July 1998.
[5] A. Said and A. Drukarev, "Simplified segmentation for compound image compression", Proc. ICIP 1999, pp. 229-233.
[6] G. J. Sullivan and T. Wiegand, "Rate-distortion optimization for video compression", IEEE Signal Processing Magazine, vol. 15, no. 6, pp. 74-90, Nov. 1998.
[7] J.-P. Braquelaire and L. Brun, "Comparison and optimization of methods of color image quantization", IEEE Transactions on Image Processing, vol. 6, no. 7, pp. 1048-1052, July 1997.



[8] T. Lin and P. Hao, "Compound image compression for real-time computer screen image transmission", IEEE Transactions on Image Processing, vol. 14, no. 8, pp. 993-1005, Aug. 2005.

[Figure: test image 2 and its segmentation result, showing the picture-content and text/graphics-content regions]

Fig. 3 Exemplar test image 2 and segmentation results



[Figure: PSNR-rate curves (PSNR roughly 20-50 dB over rates 0-70000) for global-only quantization and the proposed local+global quantization]

Fig. 4 Comparison of proposed color quantization and vector color quantization

[Figure: crops of the text "Microsoft Windows" from the original image and from reconstructions by J2K, DjVu, and the proposed scheme]

Fig. 5 Visual quality comparisons of part of reconstructed test image 1

