
MEAN-GAIN-SHAPE VECTOR QUANTIZATION

Karen L. Oehler and Robert M. Gray


Information Systems Laboratory, Department of Electrical Engineering
Stanford University, Stanford, CA 94305-4055 USA

(This work was supported by the National Science Foundation under Grant MIP-9016974.)

ABSTRACT

Mean-gain-shape (MGS) is a product code formulation of vector quantization (VQ) which allows flexible bit allocation between the mean, gain, and shape features of a vector. Product codes provide memory and complexity advantages over non-product VQ. This makes the use of larger vectors more feasible, and increasing vector dimension improves compression efficiency. The product code presented here obtains the minimum distortion reproduction vector by successive encoding in each of the three codebooks. We use pruned tree-structured vector quantizers (PTSVQ) to provide variable rate codes at low encoding complexity. Simultaneous pruning of the three codebooks provides optimal bit allocation. Prediction and concatenation are used to take advantage of interblock correlation. The results compare favorably with other tree-structured VQ methods.

1. INTRODUCTION

A variety of VQ algorithms have been proposed and studied for image compression [1]. A properly designed product code structure can offer several advantages over unstructured VQ, including lower memory requirements, better representation of important features of the input vectors, and more robust performance. Typical product codebooks require an exhaustive search to determine the best combination of component codewords. The MGS algorithm, however, can be implemented so that the minimum distortion representation is found by successive encoding in each of the three codebooks. This algorithm works in the spatial domain, requiring no computationally intensive transforms or filtering.

Given a k-dimensional vector X = (x_1, x_2, ..., x_k)^T, we compute the vector mean m_X, gain σ_X, and shape S_X as

    m_X = \frac{1}{k} \sum_{i=1}^{k} x_i = \frac{1}{k} X^T \mathbf{1}, \quad \text{where } \mathbf{1} = (1, 1, \ldots, 1)^T,
    \sigma_X = \| X - m_X \mathbf{1} \|, \qquad S_X = \frac{X - m_X \mathbf{1}}{\sigma_X}.

These features can be combined to reconstruct the original vector: X = σ_X S_X + m_X 1, where we require that the gain σ_X is nonnegative whenever X is expressed in this form. Note that the shape vector has zero mean and unit gain. We develop a codebook for each feature, producing a product codebook C_m × C_σ × C_S.

MGS was first proposed for image compression applications by Murakami et al. [2], but they did not describe the detailed structure of the code design and the optimality properties of the sequential encoding techniques given here. Previous work in MGS encoding examined the same features as defined above, but coded the mean and gain features noiselessly and coded the shape feature with a full-search vector quantizer [2, 3]. Sitaram and Huang demonstrate how product codes, including MGS, save memory using a balanced 16-ary tree-structured vector quantizer (TSVQ) to encode shape features, but again do not investigate the bit allocation between the three features [4]. Here, variable rate PTSVQ are used for all three feature codebooks, which provides low complexity encoding, further compression of the mean and gain features, and a natural method of bit allocation. The optimum encoding algorithm is shown to decompose naturally into a sequence of separate quantization steps, as in the gain/shape VQ case [5]. The algorithms for codebook generation are extended to the tree-structured case.

In encoding, we wish to minimize the quantizer distortion d(X, X̂), defined as the squared error ‖X − X̂‖²:

    d(X, \hat{X}) = \| X - [\hat{\sigma} \hat{S} + \hat{m} \mathbf{1}] \|^2
                  = \|X\|^2 - 2\hat{\sigma} X^T \hat{S} - 2\hat{m} (X^T \mathbf{1}) + \hat{\sigma}^2 \|\hat{S}\|^2 + 2\hat{\sigma}\hat{m}\, \hat{S}^T \mathbf{1} + \hat{m}^2 k.

If the shape codewords in C_S are designed to be consistent with the shape training vectors, so that ‖Ŝ‖² = 1 and Ŝ^T 1 = 0, we have

    d(X, \hat{X}) = \|X\|^2 - 2\hat{\sigma} X^T \hat{S} - 2\hat{m} m_X k + \hat{\sigma}^2 + \hat{m}^2 k
                  = \sigma_X^2 + \left( \hat{\sigma}^2 - 2\hat{\sigma} X^T \hat{S} \right) + k(\hat{m} - m_X)^2.    (1)

The first term does not depend on any codeword; the second involves only the gain and shape codewords; the third involves only the mean codeword.
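As a concrete illustration (not part of the original paper), the following Python sketch checks the decomposition X = σ_X S_X + m_X 1 and the expanded distortion of Eq. (1) on an arbitrary block; the block values and the candidate codewords m̂, σ̂, Ŝ are made up for the check.

```python
# Numerical check of the mean-gain-shape decomposition and of Eq. (1).
import numpy as np

k = 16                                    # one 4x4 block, flattened
X = np.arange(k, dtype=float)             # illustrative block values
one = np.ones(k)

m_X = (X @ one) / k                       # vector mean
sigma_X = np.linalg.norm(X - m_X * one)   # gain
S_X = (X - m_X * one) / sigma_X           # shape: zero mean, unit norm

assert np.allclose(X, sigma_X * S_X + m_X * one)          # X = sigma_X S_X + m_X 1

# Candidate codewords: reuse S_X as the shape codeword (so ||S_hat|| = 1 and
# S_hat^T 1 = 0 hold) and perturb the mean and gain slightly.
m_hat, sigma_hat, S_hat = m_X + 0.5, 0.9 * sigma_X, S_X

d_direct = np.sum((X - (sigma_hat * S_hat + m_hat * one)) ** 2)
d_eq1 = sigma_X**2 + (sigma_hat**2 - 2 * sigma_hat * (X @ S_hat)) + k * (m_hat - m_X)**2
assert np.isclose(d_direct, d_eq1)                         # agrees with Eq. (1)
```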

Examination of this distortion equation shows that, given the mean, gain, and shape codebooks, the optimal reproduction choice can be obtained by encoding each feature successively. The variance σ_X² of the input vector X is obviously independent of the codewords chosen. The best mean codeword is independent of the choices of gain and shape. Additionally, the best gain and shape codewords can be found by first determining the best shape codeword and then using the result to determine the best gain codeword. Not all product codes have this characteristic; other structures may require an exhaustive search of the various codeword combinations to determine the optimal choice of codewords. Note that the shape codeword is selected by examining the inner product of the input vector and the codeword candidates. There is no need to compute the input vector's gain or normalized shape. For details on the gain/shape interaction see [5].
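To make the successive search concrete, the following Python sketch encodes one block against flat (exhaustively searched) codebooks. It is an illustration only: the paper uses pruned TSVQs rather than flat codebooks, and the function and argument names are assumptions of this sketch.

```python
# Sequential MGS encoding: mean, then shape, then gain (flat-codebook sketch).
import numpy as np

def mgs_encode(X, mean_cb, gain_cb, shape_cb):
    """X: 1-D block of k pixels.
    mean_cb: 1-D array of scalar mean codewords.
    gain_cb: 1-D array of nonnegative scalar gain codewords.
    shape_cb: (N, k) array of shape codewords with zero mean and unit norm.
    Returns the indices (i_m, i_g, i_s) of the minimum-distortion reproduction."""
    m_X = X.mean()

    # 1) Mean: independent of gain and shape; minimize k*(m_hat - m_X)^2.
    i_m = int(np.argmin((mean_cb - m_X) ** 2))

    # 2) Shape: maximize the clipped inner product [X^T S_hat]_+ ; the input's
    #    own gain and normalized shape never need to be computed.
    corr = shape_cb @ X
    i_s = int(np.argmax(np.maximum(corr, 0.0)))

    # 3) Gain: with the shape fixed, the ideal gain is [X^T S_hat]_+, and the
    #    best gain codeword is the one closest to it.
    ideal_gain = max(corr[i_s], 0.0)
    i_g = int(np.argmin((gain_cb - ideal_gain) ** 2))

    return i_m, i_g, i_s

def mgs_decode(i_m, i_g, i_s, mean_cb, gain_cb, shape_cb):
    """Reproduction vector X_hat = sigma_hat * S_hat + m_hat * 1."""
    return gain_cb[i_g] * shape_cb[i_s] + mean_cb[i_m]
```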
2. CODEBOOK GENERATION

An unbalanced tree-structured vector quantizer is greedily grown for each feature [6]. This provides a variable rate code which can be optimally pruned to provide a set of nested codebooks over a range of bit rates [7].

The distortion term in (1) can be rewritten to indicate the effect of each feature:

    d(X, \hat{X}) = d_m(m_X, \hat{m}) + d_s(X, \hat{S}) + d_\sigma(X^T \hat{S}, \hat{\sigma})

with

    d_m(m_X, \hat{m}) = k(\hat{m} - m_X)^2,
    d_s(X, \hat{S}) = \sigma_X^2 - ([X^T \hat{S}]_+)^2,
    d_\sigma(X^T \hat{S}, \hat{\sigma}) = \hat{\sigma}^2 - 2\hat{\sigma} X^T \hat{S} + ([X^T \hat{S}]_+)^2,

where [·]_+ denotes the larger of its argument and 0. The mean distortion, d_m(m_X, m̂), is simply mean squared error. The optimal shape codeword, Ŝ, minimizes the shape distortion, d_s(X, Ŝ), regardless of the gain codeword. Once Ŝ is chosen, the ideal gain is σ* = [X^T Ŝ]_+ and the optimal gain codeword minimizes the gain distortion, d_σ(X^T Ŝ, σ̂). The centroids used in codebook generation are derived to minimize distortion for each feature.

Although the optimal encoding process is sequential, the optimal codebook generation process is not. Because the distortion term for the mean is completely independent of the shape and gain, the mean codebook can be grown independently without loss of optimality. In contrast, the gain and shape codebooks do depend upon each other, and joint optimization of the codebooks is required for optimal codebook design. However, Sabin and Gray demonstrate that independently optimized algorithms give results very close to jointly optimized algorithms at reduced complexity and design time [5]. Hence, we adapt their independently optimized algorithm to the tree growing case. The mean codebook is constructed using a training set of vector means. The shape codebook is constructed using a training set of mean-removed vectors. Once the shape codebook is constructed, it is used to encode the training vectors, providing the data on which the gain codebook is trained.

We grow a large TSVQ for the shape features, and then optimally prune it to achieve the desired bit rate [7]. The tree-growing algorithm is greedy, optimizing only the current split. Once grown, the tree is optimally pruned by repeatedly removing branches (either terminal nodes or internal nodes together with their descendants). The pruning algorithm removes the branch which minimizes the increase in average distortion per decrease in bit rate. The resulting nested subtrees provide optimal TSVQs at various bit rates (given the initial tree).

Because the shape and gain codebooks are interdependent, we would in principle need to grow a unique gain TSVQ for each shape subtree we wish to use. However, we found experimentally that developing and pruning a single gain TSVQ from the large shape TSVQ provides gain coding equivalent to constructing unique gain codebooks for each shape bit rate. Hence, training a single gain codebook using the large shape TSVQ is satisfactory.

3. BIT ALLOCATION AND JOINT PRUNING

Since encoding requires three codebooks, we have to decide how many bits we should use (on average) to describe the mean, the gain, and the shape. One alternative is to optimally prune each of the three codebooks and try various combinations of the resulting subtrees. Better bit allocation can be obtained by jointly pruning the three codebook trees simultaneously. Figure 1 demonstrates how the simultaneously pruned codebooks outperform various combinations of individually pruned codebooks on the training set. Given sufficient experimentation, combined individual pruning can eventually match the simultaneously pruned performance, but simultaneous pruning provides a quick path to the best bit allocation.

Joint pruning is a straightforward extension of previous TSVQ pruning [7] in which all three feature trees are considered and branches are removed from each. This is possible because the overall distortion is simply the sum of the feature distortions and the quantizer functions of the three TSVQs are convex. For more details see Riskin [8], where a similar problem involving optimal bit allocation pruning for classified VQ is considered. While the pruning here minimizes the overall distortion, other criteria such as psychovisual knowledge could easily be incorporated during the pruning step. For instance, if the shape distortion were considered more noticeable than mean or gain distortion because of the sensitivity of the human eye to edges, the pruning algorithm could weight the shape distortion more heavily, resulting in codes with more bits allocated to shape and fewer bits allocated to mean and gain.
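The following Python sketch illustrates a greedy version of this joint pruning: at each step, the prunable branch, drawn from any of the three trees, with the smallest increase in distortion per decrease in rate is collapsed. The node layout, bookkeeping, and stopping rule are assumptions made for the illustration; the optimal nested-subtree algorithm is given in [7].

```python
# Greedy joint pruning across the mean, gain, and shape trees (illustrative sketch).
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    distortion: float            # training distortion contributed if this node is a leaf
    rate: float                  # average rate (bits) contributed if this node is a leaf
    children: List["Node"] = field(default_factory=list)

def leaves(node):
    if not node.children:
        return [node]
    result = []
    for child in node.children:
        result += leaves(child)
    return result

def subtree_cost(node):
    """Total distortion and rate of the subtree rooted at node, left unpruned."""
    ls = leaves(node)
    return sum(l.distortion for l in ls), sum(l.rate for l in ls)

def prune_candidates(root):
    """Each internal node, with the distortion increase and rate decrease that
    collapsing it into a leaf would cause."""
    candidates, stack = [], [root]
    while stack:
        node = stack.pop()
        if node.children:
            d_sub, r_sub = subtree_cost(node)
            candidates.append((node.distortion - d_sub, r_sub - node.rate, node))
            stack.extend(node.children)
    return candidates

def joint_prune(mean_tree, gain_tree, shape_tree, target_rate):
    """Remove, across all three trees, the branch with the smallest increase in
    distortion per decrease in rate until the combined rate reaches target_rate."""
    trees = [mean_tree, gain_tree, shape_tree]
    while sum(subtree_cost(t)[1] for t in trees) > target_rate:
        candidates = [c for t in trees for c in prune_candidates(t)]
        if not candidates:
            break
        _, _, node = min(candidates, key=lambda c: c[0] / max(c[1], 1e-12))
        node.children = []           # collapse the chosen branch into a leaf
    return trees
```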
4. PREDICTION

The means of adjacent blocks are highly correlated. Hence, one can use a simple linear predictor on the mean, predicting a block's mean by averaging the quantized means of the nearest upper and left blocks. The resulting prediction residual is encoded. The decoder adds the quantized residual to the predicted value. The quantized means are available to both the encoder and decoder, so no side information is required. Such prediction improves the encoded image quality with only a small increase in complexity. While adjacent gains are correlated to a lesser degree, their nonnegative nature and more complicated distortion term make prediction more complex, and it is not considered here.
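A minimal sketch of this mean predictor follows, assuming raster-scan order and simply skipping missing neighbors on the image border; the function names and the residual quantizer interface are assumptions of the example.

```python
# Predict each block's mean from the quantized means of its upper and left
# neighbors, and encode only the prediction residual (illustrative sketch).
import numpy as np

def predict_and_encode_means(block_means, quantize_residual):
    """block_means: 2-D array with one mean per image block.
    quantize_residual: callable returning the quantized value of a residual.
    Returns the reconstructed (quantized) means, which the decoder can rebuild
    from the transmitted residuals alone, so no side information is needed."""
    rows, cols = block_means.shape
    recon = np.zeros((rows, cols), dtype=float)
    for r in range(rows):
        for c in range(cols):
            neighbors = []
            if r > 0:
                neighbors.append(recon[r - 1, c])   # quantized mean of upper block
            if c > 0:
                neighbors.append(recon[r, c - 1])   # quantized mean of left block
            prediction = float(np.mean(neighbors)) if neighbors else 0.0
            residual = block_means[r, c] - prediction
            recon[r, c] = prediction + quantize_residual(residual)
    return recon
```

For example, `predict_and_encode_means(means, lambda r: round(r))` would quantize each residual to the nearest integer; in the codec the residuals are coded with a PTSVQ instead.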
5. CODING MEAN AND GAIN AS VECTORS

The scalar mean and gain often require a disproportionate number of bits for quality representation. Both the mean and gain show a significant degree of correlation between neighboring blocks. In Section 4, we used prediction to take advantage of the mean correlation. We can also use VQ to code both the mean and the gain values more effectively. The mean and gain values for each block are concatenated together into vectors before being encoded with a PTSVQ. Prediction of the means based on preceding vectors can still be used to reduce redundancy.
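As a sketch of the concatenation step, the per-block means (or gains) can be gathered over 2 x 2 groups of blocks, matching the concatenated vectors used in Section 7; the grouping and the assumption that the block grid divides evenly are illustrative.

```python
# Concatenate a per-block scalar feature (mean or gain) over 2x2 groups of blocks,
# producing 4-dimensional vectors to be coded with a PTSVQ (illustrative sketch).
import numpy as np

def concatenate_feature(per_block, group=2):
    """per_block: 2-D array with one scalar (mean or gain) per image block.
    Returns an (N, group*group) array, one concatenated vector per block group."""
    rows, cols = per_block.shape
    vectors = []
    for r in range(0, rows, group):
        for c in range(0, cols, group):
            vectors.append(per_block[r:r + group, c:c + group].reshape(-1))
    return np.asarray(vectors)
```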

[Figure 1. Simultaneously pruned codebooks compared with various combinations of individually pruned codebooks on the training set (PSNR in dB); see Section 3.]

6. OTHER IMPROVEMENTS

We have also implemented two heuristic extensions which improve quantizer performance. These extensions use the same codebooks as described before; they alter the encoding process, but not the codebook generation process. The first extension takes advantage of the fact that quality shape representation is much less important when the vector gain is small. We propose a simple alteration of the original algorithm. We encode the shape, then the gain, as before. However, if the quantized gain value (known to both encoder and decoder) falls below a given threshold, then fewer shape bits are sent. Because a TSVQ is used to encode the shape, this corresponds to sending only the first bits of the path through the tree. This can substantially reduce the bit rate without a significant increase in distortion.

The second extension improves the prediction of means. Mean distortion has a strong effect on both PSNR and perceived image quality. However, for those vectors for which the prediction is poor (e.g., around edges), the mean distortion can be quite high. We can use two methods of encoding the mean: the usual codebook if the resulting mean distortion is below a certain threshold, and a non-residual codebook otherwise (here, we simply sent the actual mean values as integers). We send one extra bit per concatenated vector to tell the decoder which codebook to use. When using mean vectors from 4 x 4 blocks concatenated into 2 x 2 vectors, this adds one bit per 64 pixels, a small increase in rate compared to the improvement in encoded mean quality. This allows us to use a low rate residual coder most of the time, but prevents large distortion in those vectors which are badly predicted.
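A minimal sketch of how these two encoder-side switches could be applied, using the thresholds reported in Section 7; the bit-string framing and helper names are assumptions of the example, not the paper's implementation.

```python
# Encoder-side extensions (illustrative): truncate the shape path when the quantized
# gain is small, and escape to raw integer means when the mean residual coder fails.
GAIN_THRESHOLD = 15.0          # threshold on the quantized gain (Section 7)
MEAN_DIST_THRESHOLD = 1000.0   # threshold on the mean distortion (Section 7)
TRUNCATED_SHAPE_BITS = 2       # shape path bits kept when the gain is below threshold

def shape_bits_to_send(shape_path_bits, quantized_gain):
    """shape_path_bits: bit string describing the path through the shape TSVQ."""
    if quantized_gain < GAIN_THRESHOLD:
        return shape_path_bits[:TRUNCATED_SHAPE_BITS]   # send only the first bits
    return shape_path_bits

def mean_bits_to_send(residual_code_bits, mean_distortion, raw_means):
    """One flag bit per concatenated mean vector selects the coding method."""
    if mean_distortion > MEAN_DIST_THRESHOLD:
        raw_bits = "".join(format(int(m) & 0xFF, "08b") for m in raw_means)
        return "1" + raw_bits                 # flag 1: actual means sent as integers
    return "0" + residual_code_bits           # flag 0: usual residual codebook
```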
7. RESULTS

The MGS algorithm was used to encode images from the USC data base using 4 x 4 pixel blocks. Results were compared to regular (non-product) predictive PTSVQ, also using 4 x 4 blocks. A training set of 10 512x512 grayscale images was used. The PSNR (dB) obtained by encoding a test image from outside the training sequence using various algorithms is shown in Figure 2. (PSNR is defined as 10 log_10[(255)^2 / mean squared error].) The predictive MGS is comparable to the predictive PTSVQ. At higher rates (above 0.9 bpp) the MGS algorithm is clearly superior, due to the robustness of the product code. This makes MGS particularly useful when high quality encoding is necessary. Encoding the mean and gain as concatenated 2 x 2 vectors, as described in Section 5, significantly improves the MGS performance at lower rates. The extensions mentioned in Section 6 further improve the quantizer performance with little additional complexity. Here, we show results where only the first 2 bits of the chosen shape codeword index are transmitted if the gain codeword is less than 15, and the actual mean values are transmitted instead of the quantized ones if the mean distortion is above 1000. The test image encoded using MGS with concatenation and these extensions is shown in Figure 3, with an overall bit rate of 0.58 bpp and a PSNR of 33.18 dB. The mean rate is 0.26 bpp (16.96 bits per concatenated vector), the gain rate is 0.07 bpp (4.57 bits per concatenated vector), and the shape rate is 0.24 bpp (3.88 bits per vector). In this example, 67% of the vectors have sufficiently small gains that their truncated indexes are sent, and 11% of the mean concatenated vectors have sufficient distortion to warrant alternate rendering. At the same overall rate of 0.58 bpp, the ordinary predictive PTSVQ produces a PSNR of 32.68 dB and its codebook requires 5 times the memory needed for all three MGS codebooks combined. The MGS rates could be further reduced by using entropy coding to encode the codeword indices.
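These rates are consistent with the block sizes: each concatenated mean or gain vector covers a 2 x 2 group of 4 x 4 blocks (64 pixels), so 16.96 bits per concatenated vector is 16.96/64 ≈ 0.26 bpp and 4.57 bits is 4.57/64 ≈ 0.07 bpp, while each shape codeword covers a single 4 x 4 block, giving 3.88/16 ≈ 0.24 bpp; the three contributions sum to approximately the reported overall rate of 0.58 bpp.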

Figure 2. Comparison of various MGS algorithms with ordinary PTSVQ: PSNR (dB) versus overall bit rate (bpp), from 0.2 to 1.4 bpp. Shown are predictive PTSVQ, predictive MGS, predictive MGS with concatenated mean and gain vectors, and predictive MGS with concatenated vectors and the extensions (gain threshold 15 and mean threshold 1000).
8. CONCLUSIONS

We have demonstrated an MGS product code using PTSVQ for the compression of digitized images. Simultaneous pruning of the tree-structured codebooks provides optimal bit allocation between the three features. Prediction and other enhancements are implemented to further improve the encoder efficiency. The algorithm produces quantized images of good quality with low encoding complexity and reduced memory requirements.

REFERENCES

[1] A. Gersho and R.M. Gray, Vector Quantization and Signal Compression, Kluwer Academic Publishers, Boston, 1992.
[2] T. Murakami, K. Asai and E. Yamazaki, Vector Quantiser of Video Signals, Electronics Letters, Vol. 18, No. 23, pp. 1005-1006, November 1982.
[3] H.J. Lee and D.T.L. Lee, A Gain-Shape Vector Quantizer for Image Coding, Proc. IEEE Int. Conf. Acoust., Speech and Signal Proc., Vol. 1, pp. 141-144, Tokyo, 1986.
[4] V.S. Sitaram and C.M. Huang, Efficient Codebooks for Vector Quantization, Proceedings of the 1992 IEEE Data Compression Conference, p. 396, Snowbird, Utah, March 1992.
[5] M.J. Sabin and R.M. Gray, Product code vector quantizers for waveform and voice coding, IEEE Trans. Acoust., Speech and Signal Proc., ASSP-32:474-488, June 1984.
[6] E.A. Riskin and R.M. Gray, A greedy tree growing algorithm for the design of variable rate vector quantizers, IEEE Trans. Signal Process., pp. 2500-2507, November 1991.
[7] P.A. Chou, T. Lookabaugh, and R.M. Gray, Optimal pruning with application to tree-structured source coding and modeling, IEEE Trans. Inform. Theory, pp. 299-315, March 1989.
[8] E.A. Riskin, IEEE Trans. Inform. Theory, Vol. 37, No. 2, pp. 400-402, March 1991.

Figure 3. Test image encoded at 0.58 bpp. PSNR = 33.18 dB.

