Efficient Codebooks For Vector Quantization Image Compression With An Adaptive Tree Search Algorithm
Abstract-This paper discusses some algorithms to be used for
the generation of an efficient and robust codebook for vector
quantization (VQ). Some of the algorithms reduce the required
codebook size by 4 or even 8 b to achieve the same level of
performance as some of the popular techniques. This helps in
greatly reducing the complexity of codebook generation and
encoding. We also present a new adaptive tree search algorithm
which improves the performance of any product VQ structure.
Our results show an improvement of nearly 3 dB over the fixed
rate search algorithm at a bit rate of 0.75 b/pixel.
I. INTRODUCTION
IMAGE DATA compression using vector quantization (VQ)
has received a lot of attention in the last decade because
of its simplicity and adaptability. The advantage of using
vectors over scalars was first shown by Shannon [1] in his rate
distortion theory. VQ requires the input image to be processed
as vectors or blocks of image pixels. The encoder takes in
a vector and finds the best or closest match, based on some
distortion criterion, from its stored codebook. The address of
the best match is then transmitted to the decoder. The decoder
accesses an entry from an identical codebook, thus obtaining
the reconstructed vector. Data compression is achieved in this
process because the transmission of the address requires fewer
bits than transmitting the vector itself.
The performance of encoding and decoding by VQ is
dependent on the available codebook and the distribution of the
source data relative to it. Hence, the design of an efficient and
robust codebook is of prime importance in VQ. Linde, Buzo,
and Gray first suggested a practical suboptimal clustering
analysis algorithm [2], now known as the LBG algorithm,
to generate a codebook based on some training set. The
drawback of this scheme is that the algorithm only guarantees
a locally optimum codebook relative to the source data used
(the training set). Some of the techniques which have appeared
in the literature to overcome this problem are [3]-[5]. The
simulated annealing (SA) method of generating a codebook
tries to obtain a global optimum by a stochastic relaxation
technique. Another algorithm, called deterministic annealing,
uses a fuzzy clustering and leads to a global optimum. Our
goal in developing the algorithms discussed in this paper is to
achieve the design of a universal codebook that is relatively
insensitive to the scenery changes in different images.
The size of codebook and the vector dimension also play a
major role in determining the overall performance. From Shannon's
rate distortion theory we know that the larger the vector
dimension, the better the potential performance. However, with
(3)
where N is the number of training vectors in the rth partition.
For a discussion on how the centroid condition is affected
by gain normalization, the reader is referred to [17]. With
this modified centroid condition, we now proceed to design
an optimal codebook by applying the LBG algorithm on the
training sequence {z_n, σ_n}, which is derived from the input
vector sequence {z_n}. Now the convergence to a local optimum is guaranteed. The performance results of this codebook
are discussed in the simulation results section.
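As a rough illustration of the LBG iteration referred to above, the following sketch alternates a nearest-neighbor partition with a centroid update. It uses the ordinary arithmetic-mean centroid and squared-error distortion, not the modified centroid condition of (3); the function name `lbg`, the random initialization, and the stopping tolerance are assumptions made for the example.

```python
import random

def sq_dist(u, v):
    """Squared-error distortion between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def lbg(training, codebook_size, tol=1e-6, max_iter=100, seed=0):
    """Plain LBG (generalized Lloyd) iteration with mean centroids.

    Assumption: codewords are initialized from randomly chosen training
    vectors. Returns the codebook and the last measured average distortion.
    """
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(training, codebook_size)]
    prev = float("inf")
    avg = prev
    for _ in range(max_iter):
        # Nearest-neighbor condition: assign each vector to its closest codeword.
        cells = [[] for _ in codebook]
        total = 0.0
        for v in training:
            dists = [sq_dist(v, c) for c in codebook]
            i = min(range(len(codebook)), key=dists.__getitem__)
            cells[i].append(v)
            total += dists[i]
        avg = total / len(training)
        # Stop when the drop in average distortion is relatively small.
        if prev - avg <= tol * max(avg, 1e-12):
            break
        prev = avg
        # Centroid condition: replace each codeword by the mean of its cell.
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its old codeword
                codebook[i] = [sum(x) / len(cell) for x in zip(*cell)]
    return codebook, avg
```

Like the algorithm it sketches, this iteration only reaches a local optimum for the given training set and initialization.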
Throughout our discussion so far, we have bypassed the
implementation issue. The calculation of gain on a computer
would normally use floating point representation and as such
it would be difficult to implement in a digital VLSI chip.
However, if we scale the residual vector by a constant a, and
the codebook is also scaled by the same a, we can use only integer
arithmetic. In our simulations we did both floating point and
integer implementations, and the difference in performance
was negligible.
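To illustrate the integer implementation described above, the sketch below scales both the residual vector and the codebook by the same constant a and rounds to integers before the nearest-codeword search. The value a = 256, the function names, and the toy data are assumptions for illustration only. Because of rounding, the integer search can differ from the floating point search only when two codewords are almost equally close.

```python
def nearest(vec, codebook):
    """Index of the codeword with the smallest squared error."""
    def dist(c):
        return sum((x - y) ** 2 for x, y in zip(vec, c))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))

def to_int(vec, a):
    """Scale by the constant a and round to the nearest integer."""
    return [int(round(a * x)) for x in vec]

A = 256  # assumed scale factor, not a value taken from the paper

# Toy codebook and residual vector (hypothetical data).
codebook = [[0.10, -0.20, 0.30, 0.40],
            [0.50, 0.50, -0.50, -0.50],
            [-0.30, 0.70, 0.10, -0.60]]
residual = [0.45, 0.55, -0.40, -0.55]

# Floating point search versus integer-only search on the scaled data.
float_index = nearest(residual, codebook)
int_index = nearest(to_int(residual, A), [to_int(c, A) for c in codebook])
```

In this toy case both searches select the same codeword, since scaling every vector by the same constant scales all distortions by a squared and leaves their ordering unchanged apart from rounding.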
B. Predictive VQ Codebook Design
buffer level. When the input vector arrives, the tree-structured
codebook is searched for the best match starting from the
root level and travelling down the tree one level at a time.
The distortion between the input vector and the best match is
evaluated at each level and compared with the threshold. If the
distortion is greater than the threshold, the search is continued
travelling down to the lower levels of the codebook. Since we
are dealing with a successive approximation tree structured
codebook, we can say that in general as we travel down the tree
we get a better representation of the input vector and a reduction
in the overall distortion. If the distortion is less than or equal to
the threshold, further searching of the codebook is terminated
and the index of the best match is transmitted along with a
prefix to specify the level of the tree where the match was
found. The buffer is then updated and the new corresponding
threshold is evaluated. By allowing the threshold to increase
as the buffer level at the encoder increases, the input image
suffers a smooth degradation in quality. We now describe the
algorithm in pseudocode form for VQ and later explain the
modification to it for product VQ.
1) Evaluate initial threshold, based on initial buffer level.
2) For each input vector do the following:
   a) currentlevel = 0;
   b) while (distortion > threshold)
        Search the current level of the tree-structured
        codebook for the best match of the input
        vector. Evaluate the distortion with respect
        to the best match.
        if (distortion ≤ threshold)
          Go to step c).
        else if (currentlevel < maxlevel)
          currentlevel = currentlevel + 1;
        else Go to step c).
   c) Evaluate the number of bits (nbits) required for
      the transmission of the best match. This depends
      on the level of the tree at which the distortion
      became less than threshold.
   d) Update the current buffer level (buflevel) based
      on the number of bits required for transmission
      and the number of allowed output bits per vector.
        buflevel = buflevel + nbits - outbits
      where outbits represents the number of bits that
      the encoder transmits for each vector. Note that
      since this does not depend on the input vector,
      our algorithm is a constant rate algorithm.
   e) Evaluate the new threshold.
The algorithm can be modified for product VQ to allow the
transmission of only mean (for MRVQ) and only gain (for
GSVQ). The criterion to decide this is the distortion between
the mean (or gain) and the input vector. If this distortion is
less than the current threshold, the encoder does not search
the vector codebook at all and transmits only the index of the
mean (or gain) as the representation of the input vector. This is
an adept way of encoding the vector and exploits the product
nature of VQ. This is also advantageous in two other ways:
one is that it saves the search time at the encoder and the other
is that only the mean (or gain) scalar codebook index needs
to be transmitted, which saves considerable bandwidth of the
channel. Note that as the output bit
rate decreases, a greater percentage of vectors are transmitted
as only mean or only gain.
This algorithm is different from the pruning algorithm for
classification and regression trees suggested by Breiman et al.
[18] and generalized for tree-structured source coding by Chou
et al. [16]. While the pruning algorithm prunes a subtree of
the tree-structured codebook, thereby restricting its search for
a possible match, our algorithm always uses a balanced tree
codebook. Another difference of our algorithm is that, while
the pruned tree algorithm gives a variable output rate at the
encoder, and as such cannot guarantee a certain average rate
per vector, the adaptive tree search algorithm gives a constant
rate at the output of the encoder.
IV. SIMULATION RESULTS
Extensive computer simulations were carried out to evaluate
the performance of each of the algorithms mentioned in the
previous section. We generated six tree structured codebooks
using a training set of about 1000 378 x 480, 8 b images. The
vector dimension used for all of the simulations was a sub-block
of 4 x 4 image pixels. A tree-structured codebook was
used to reduce the encoder and decoder complexity with large
size codebooks. The codebook is a four-level codebook with
16 branches in each node and was generated using [6]. All the
testing was done with these codebooks, with three different
images. It should be noted that the images used for testing
were not present in the training set. Since the codebooks are
tree structured, we used the same codebook for
evaluating the performance when using 4, 8, 12, and 16 b of
the codebook.
Fig. 3 shows the plot of the peak signal-to-noise ratio
(PSNR), versus the codebook size in bits, for MGVQ, MRVQ,
and GSVQ. The plots for MRVQ and GSVQ are shown for
comparison purposes only. The PSNR points on the plot are
the average for the three images on which testing was done,
and PSNR is defined by
$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} \ \mathrm{dB} \tag{4}$$
where MSE stands for the mean squared-error between the
decoded image and the original image. From the results, it
can be seen that there is considerable improvement in the
performance of MGVQ, as compared to GSVQ and MRVQ.
This improvement in performance is obtained at the additional
cost of a slight increase in complexity and an increase in side
information rate. The former problem is not very significant
and the latter problem is overcome by the adaptive tree search
algorithm.
The performance of PVQ is shown in Fig. 4. It is seen that
PVQ with gain normalization only (PVQG) performs better
than PVQ with mean removal only (PVQM). The reason
for this can be explained by observing the distribution of
[Figure 5 plot: curves for GSVQ, ADT-GSVQ, ADT-MRVQ, MGVQ, and ADT-MGVQ]
Fig. 5. Plot of PSNR versus the number of bits per vector for the fixed
rate and adaptive tree search algorithms. The vector dimension in each case
is 4 x 4.
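The PSNR values plotted here follow the definition in (4). A minimal sketch of that computation for 8 b images, assuming the images are supplied as equal-sized lists of pixel rows (the function name `psnr` is hypothetical), is:

```python
import math

def psnr(original, decoded, peak=255):
    """PSNR in dB between two equal-sized images given as lists of rows."""
    n = 0
    sq_err = 0.0
    for row_o, row_d in zip(original, decoded):
        for p, q in zip(row_o, row_d):
            sq_err += (p - q) ** 2
            n += 1
    mse = sq_err / n  # mean squared error over all pixels
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

Averaging this value over the test images gives one point on a PSNR curve of the kind shown in the figures.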
[Figure 6 plot: curves for PVQM, ADTPVQM, PVQG, ADTPVQG, PVQMG, and ADTPVQMG]
Fig. 6. Plot of PSNR versus the number of bits per vector for the fixed
rate and adaptive tree search algorithms. The vector dimension in each case
is 4 x 4.
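The adaptive tree search compared in these plots follows the pseudocode steps 1) and 2a) to 2e) given earlier. The sketch below is one possible reading of those steps, not the authors' implementation. The tree layout (children of codeword i stored at indices i*branch to i*branch + branch - 1 of the next level), the fixed-length level prefix, and the `threshold_of` function are all assumptions, since the paper's threshold rule is not reproduced in this text.

```python
import math

def sq_dist(u, v):
    """Squared-error distortion between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def encode_adaptive(vectors, levels, branch, outbits, threshold_of, buflevel=0):
    """Adaptive tree search over a balanced tree-structured codebook.

    levels[k] lists the codewords at depth k (assumed layout, see above).
    threshold_of maps the buffer level to a distortion threshold and is a
    placeholder for the threshold rule. branch is assumed a power of two.
    """
    bits_per_level = int(math.log2(branch))
    prefix_bits = math.ceil(math.log2(len(levels)))  # identifies the stopping level
    out = []
    threshold = threshold_of(buflevel)                 # step 1
    for v in vectors:                                  # step 2
        level, parent = 0, 0                           # step a
        while True:                                    # step b
            first = parent * branch
            cands = range(first, first + branch)
            best = min(cands, key=lambda i: sq_dist(v, levels[level][i]))
            distortion = sq_dist(v, levels[level][best])
            if distortion <= threshold or level == len(levels) - 1:
                break
            level, parent = level + 1, best            # descend one level
        nbits = prefix_bits + (level + 1) * bits_per_level  # step c
        buflevel = buflevel + nbits - outbits               # step d
        threshold = threshold_of(buflevel)                  # step e
        out.append((level, best, nbits))
    return out, buflevel

# Toy usage: a two-level binary tree of one-dimensional codewords.
levels = [[[0.0], [10.0]], [[-1.0], [1.0], [9.0], [11.0]]]
codes, final_buf = encode_adaptive(
    [[1.0], [0.1], [9.0]], levels, branch=2, outbits=2,
    threshold_of=lambda b: 0.5 * max(b, 0))
```

In the toy run the threshold rises once the buffer starts to fill, so the second vector stops at the root level and is sent with fewer bits, which is the intended smooth trade of quality for rate.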
TABLE I
PERFORMANCE OF ALGORITHMS ON IMAGE LENA. THE VECTOR DIMENSION IN EACH CASE IS 4 x 4

Algorithm   MRVQ (dB)   GSVQ (dB)   MGVQ (dB)   PVQM (dB)   PVQG (dB)   PVQMG (dB)
4           30.14       30.11       31.63       30.79       30.64       32.19
16          36.58       36.47       38.22       36.83       37.13       38.39
TABLE II
PERFORMANCE OF ADAPTIVE TREE SEARCH ON IMAGE LENA. THE VECTOR DIMENSION IN EACH CASE IS 4 x 4

Algorithm   MRVQ (dB)   GSVQ (dB)   MGVQ (dB)   PVQM (dB)   PVQG (dB)   PVQMG (dB)
V. CONCLUSION
In this paper we have discussed a number of algorithms
which achieve the design of an efficient and robust codebook.
It has also been shown that the same level of performance can
be achieved with a codebook that is smaller by 4 or even up to
8 b. These reductions in the codebook size are obtained with
only a slight increase in computational complexity. Reducing the
codebook size exponentially reduces the codebook generation
complexity and also the encoding complexity. The reduced
complexity can be used in several applications such as high-definition
television. On the other hand, if one can cope
with large size codebooks, the potential theoretical increase
in performance is significant.
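To make the exponential dependence concrete: a codebook addressed with b bits holds 2^b codewords, so a full search over k-dimensional vectors costs on the order of k * 2^b multiply-accumulate operations per input vector. The short sketch below shows the effect of removing 4 or 8 b from a 16 b codebook. The function name is hypothetical and the operation count is a simplifying assumption that ignores comparisons.

```python
def full_search_ops(bits, dim=16):
    """Approximate multiply-accumulates for one full-search encode.

    Assumes squared-error distortion: dim operations per codeword and
    2**bits codewords; dim=16 corresponds to 4 x 4 blocks.
    """
    return dim * 2 ** bits

for removed in (4, 8):
    ratio = full_search_ops(16) // full_search_ops(16 - removed)
    print(f"removing {removed} b cuts full-search work by a factor of {ratio}")
```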
The adaptive tree-search algorithm is a new approach and
provides considerable improvement over the fixed rate VQ.
The improvement in performance of the algorithm is more
pronounced for low bit rates, wherein the product nature of the
VQ is exploited. The algorithm is fast and is simple enough
to implement in hardware at video rates. Hence, it can be
effectively used in real life.
REFERENCES
[1] C. E. Shannon, "A mathematical theory of communication," Bell Syst.
SITARAM et al.: