
Subband Coder Design Incorporating Recursive Quadrature Filters and Optimum ADPCM Coders

Abstract-This paper presents the results of an analytical and experimental study of the use of IIR quadrature mirror filters and forward adapting ADPCM with optimum quantizers for octave band subband coders for speech. It is shown that the underlying analysis-synthesis systems can be designed such that there is no interband aliasing distortion in the coded speech and such that the analysis-synthesis transfer function has no frequency distortion or no phase distortion, but not both. The experimental study showed that subband coders based on a mixture of infinite impulse response (IIR) and finite-duration impulse response (FIR) filters resulted in higher quality coding systems using fewer multiplies than for systems based on FIR filters alone. It was also found that a signal-to-noise ratio (SNR) gain of about 4 dB could be obtained using optimum quantizers for forward adapting ADPCM coders over linear quantizers in backward adapting APCM coders, but no perceptual quality gain was observed.

I. INTRODUCTION

OVER the past several years, a number of systems for the adaptive coding of speech in subbands have been reported in the literature. In all such systems (see Fig. 1), the speech is divided into channels by first passing it through a filter bank $[H_n(z)]$, and the output of each filter is decimated to a rate determined by the bandwidth of the channel. These individual channel outputs are then coded using adaptive PCM or adaptive DPCM coders. At the receiver, the decoded channel signals are each interpolated back to the original sampling rate using a set of interpolation filters $[G_n(z)]$, and the outputs are summed to give the reconstructed speech signal $\hat{x}(n)$. The quality improvement of the coded speech in subband coder systems over full band coder systems is attributable to the fact that the coding noise generated in a particular band is largely limited to that band in the reconstruction, and the noise is not allowed to spread to other bands where there may be less signal energy.

As is illustrated in Fig. 1, there are always two basic issues in such coding systems. First, if the individual channel coders are excluded, then a subband coder can be thought of as a frequency variant analysis followed by an interpolated resynthesis which, by itself, potentially introduces interband aliasing distortion, frequency distortion, and phase distortion. Clearly, the overall quality of subband coded speech can be no better than the intrinsic quality of this analysis-synthesis system. Second, additional distortion is added by the inclusion of the individual coders in the separate channels, and thus the coding distortion is a function both of the channel coders used and the stopband rejection characteristics of the individual channel filters. Clearly, the better the quality of the channel filters, the better the rejection of the coding noise between the bands.

In the original subband coding systems reported by Crochiere et al. [1], [2], the channel filters used were infinite impulse response (IIR) recursive filters, and the resulting coding system exhibited, to some degree, all four of the basic distortion types: interband aliasing, frequency distortion, phase distortion, and coding distortion. In a later work, Esteban et al. [3] introduced the concept of quadrature mirror filters, which can be used to realize two-band analysis-synthesis systems which have no interband aliasing. Further, Esteban showed that if equal length linear phase filters are used for the band splitting, then the overall analysis-synthesis transfer function is linear phase. Hence, quadrature mirror filters allow for an analysis-synthesis system which only exhibits frequency distortion.

Based on these results, Esteban et al. [4] and Crochiere et al. [5], [6] developed subband speech coding systems based on tree structures of quadrature mirror filters. These systems exhibit an overall system response which is linear phase, and as such, they may be used to code data signals as well as speech signals. The filters used by Crochiere et al. [5] were designed by Johnston [7] using an iterative approach which sought to minimize the frequency distortion inherent in the overall system. Another characteristic of all these later subband coders is that the band-splitting filters were realized using a two-band polyphase filter bank structure [8] to improve the computational efficiency.

This paper reports the results of an analytical and experimental study of the impact of using IIR filters for the band-splitting function in tree-structured subband coders for speech, and of using forward adapting ADPCM coders with optimum quantizers for the channel coding functions in the same systems. In the analytical study, it is shown that it is possible to design subband coder systems using either IIR or finite-duration impulse response (FIR) filters such that the analysis-synthesis system results in no interband aliasing among the channels and in which the overall analysis-synthesis transfer function has no phase distortion or no frequency distortion, but not both. In the experimental study, such filters were used to test if subband coders of equivalent quality to those already demonstrated can be achieved using IIR filters without forfeiting the computational improvements inherent in recursive filter realizations. The effects of the optimum ADPCM coders were also tested experimentally.

Manuscript received April 27, 1981; revised March 26, 1982. This work was supported by the Acoustics Research Department of Bell Laboratories, Murray Hill, NJ.
The author is with the School of Electrical Engineering, Georgia Institute of Technology, Atlanta, GA 30332.

0096-3518/82/1000-0751$00.75 © 1982 IEEE


IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-30, NO. 5, OCTOBER 1982

Fig. 1. N-band, analysis-synthesis system for speech coding.

Fig. 2. (a) Two-band analysis-synthesis band splitter. (b) Equivalent polyphase structure for a two-band analysis-synthesis band splitter. (c) Equivalent polyphase structure for analyzing the reconstruction.

II. THEORY

The octave band subband coder systems of interest in this study are formed as a tree structure of two-band analysis-synthesis pairs based on quadrature mirror filters. Fig. 2(a) shows the block diagram for such a two-band system. For the upper channel, the input signal $x(n)$ is first passed through a "half-band" low-pass filter with impulse response $h_0(n)$, and then is decimated at a rate of 2-to-1. For the lower channel, the signal is passed through the half-band high-pass filter with impulse response $h_1(n)$, followed by another 2-to-1 decimator. The outputs of these two channels are $y_0(n)$ and $y_1(n)$, respectively. These are the two signals which would be coded in a two-band subband coding system. For the synthesis, these two signals are padded with zeros at a 1-to-2 rate, passed through the two interpolation filters $g_0(n)$ and $g_1(n)$, respectively, and the outputs are summed to give the reconstructed signal $\hat{x}(n)$. One important point to note here is that if the filters $h_0(n)$ and $g_0(n)$ were removed [i.e., $h_0(n) = g_0(n) = \delta(n)$], if $h_1(n)$ and $g_1(n)$ were replaced with an ideal delay and an ideal advance, respectively [i.e., $h_1(n) = \delta(n-1)$ and $g_1(n) = \delta(n+1)$], and if there were no coders inserted for $y_0(n)$ and $y_1(n)$, then $\hat{x}(n) = x(n)$, and the reconstruction would be perfect.

Let us define the transfer functions of the two analysis filters as

$$H_0(z) = \sum_{n=0}^{\infty} h_0(n)\, z^{-n} \tag{1a}$$

$$H_1(z) = \sum_{n=0}^{\infty} h_1(n)\, z^{-n} \tag{1b}$$

and the Fourier transform of these two transfer functions by

$$H_0[\omega] = H_0(e^{j\omega}) = \sum_{n=0}^{\infty} h_0(n)\, e^{-j\omega n} \tag{2a}$$

$$H_1[\omega] = H_1(e^{j\omega}) = \sum_{n=0}^{\infty} h_1(n)\, e^{-j\omega n}. \tag{2b}$$

Then, for quadrature mirror filters, the high-pass transfer function can be defined in terms of the low-pass filter by

$$H_1[\omega] = H_0[\omega - \pi] \tag{3}$$

or equivalently by

$$H_1(z) = H_0(-z). \tag{4}$$

This leads to the relationship

$$h_1(n) = (-1)^n\, h_0(n) \tag{5}$$

or

$$H_1(z) = \sum_{n=0}^{\infty} h_0(n)\, (-1)^n\, z^{-n}. \tag{6}$$
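The mirror relationship of (3)-(6) can be checked numerically. The following is a minimal sketch, not taken from the paper: the 4-tap prototype `h0` and the helper names `qmf_highpass` and `freq_response` are hypothetical and chosen only for illustration. It forms $h_1(n) = (-1)^n h_0(n)$ as in (5) and compares $H_1[\omega]$ with $H_0[\omega - \pi]$ on a small frequency grid.

```python
import cmath
import math

def qmf_highpass(h0):
    """Return h1(n) = (-1)^n * h0(n), the mirror-image filter of eq. (5)."""
    return [((-1) ** n) * c for n, c in enumerate(h0)]

def freq_response(h, w):
    """Evaluate H[w] = sum_n h(n) * exp(-j*w*n), as in eq. (2), for a finite h."""
    return sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(h))

# Hypothetical low-pass prototype, used only as an example input.
h0 = [0.25, 0.75, 0.75, 0.25]
h1 = qmf_highpass(h0)

# Compare H1[w] with H0[w - pi] at nine frequencies between 0 and pi (eq. (3)).
freqs = [k * math.pi / 8 for k in range(9)]
max_err = max(
    abs(freq_response(h1, w) - freq_response(h0, w - math.pi)) for w in freqs
)

print("h1 =", h1)
print("max |H1[w] - H0[w - pi]| =", max_err)
```

The difference is at the level of floating-point rounding, which is what (3) predicts for any prototype, since shifting the frequency axis by $\pi$ multiplies the $n$th tap by $e^{j\pi n} = (-1)^n$.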
BARNWELL: SUBBAND CODER DESIGN

Fig. 2(b) gives the equivalent polyphase filter structure [8] for the two-band analysis-synthesis system of Fig. 2(a). In this structure, the input signal is delayed by one sample for the lower channel, and then both channels are decimated at a rate of 2-to-1. Following the decimation, each channel is passed through a polyphase filter, and then a two-point DFT is applied. To show that the systems of Fig. 2(a) and (b) can be made to be equivalent, we will write and compare the expressions for $Y_0[\omega]$ and $Y_1[\omega]$ for each of the two realizations. For the filter bank system of Fig. 2(a), we have

[Equation (7) is missing from the source text.]

and for the polyphase system of Fig. 2(b), we have

[Equation (8) is missing from the source text.]

Applying (3), these two expressions are equivalent if

$$H_0[\omega] = P_0[2\omega] + P_1[2\omega]\, e^{-j\omega} \tag{9}$$

or equivalently

$$H_0(z) = P_0(z^2) + P_1(z^2)\, z^{-1}. \tag{10}$$

Transforming this expression back into the time domain gives

$$\sum_{n=0}^{\infty} h_0(n)\, z^{-n} = \sum_{n=0}^{\infty} p_0(n)\, z^{-2n} + \sum_{n=0}^{\infty} p_1(n)\, z^{-(2n+1)} \tag{11}$$

or

$$p_0(n) = h_0(2n) \tag{12a}$$

$$p_1(n) = h_0(2n+1). \tag{12b}$$

This, of course, is the well-known polyphase result that the impulse response of the polyphase filters for a two-band system is formed by taking every second sample of the impulse response of the prototype filter.

There are two important points to be made about the polyphase structure. First, since the polyphase filters are applied at the decimated rate, the polyphase structure is more computationally efficient than the filter bank structures. Second, note that if no coding of the bandpass signals $y_0(n)$ and $y_1(n)$ is performed, then the two-point discrete Fourier transform (DFT) on the analysis side cancels with the inverse two-point DFT on the synthesis side, and the overall analysis-synthesis system is equivalent to the system of Fig. 2(c). This means that $\hat{X}[\omega]$ can be written in terms of the polyphase filters alone as

$$\hat{X}[\omega] = \tfrac{1}{2}\bigl[P_0[2\omega]\,Q_0[2\omega] + P_1[2\omega]\,Q_1[2\omega]\bigr] X[\omega] + \tfrac{1}{2}\bigl[P_0[2\omega]\,Q_0[2\omega] - P_1[2\omega]\,Q_1[2\omega]\bigr] X[\omega - \pi]. \tag{13}$$

In this expression, the term including $X[\omega - \pi]$ is the interband aliasing distortion. This term can be set to zero if

$$P_0[2\omega]\,Q_0[2\omega] = P_1[2\omega]\,Q_1[2\omega] \tag{14}$$

which results in a linear analysis-synthesis system in which

$$\hat{X}[\omega] = X[\omega]\, P_0[2\omega]\, Q_0[2\omega]. \tag{15}$$

Therefore, as long as the condition of (14) is met, there is no interband aliasing in the synthesized signal, and the overall analysis-synthesis can be considered to be a filter

$$\hat{X}[\omega] = C[\omega]\, X[\omega] \tag{16}$$

with the transfer function

$$C[\omega] = P_0[2\omega]\, Q_0[2\omega]. \tag{17}$$

In order for this two-band subband coding system to be effective, it must meet two conditions. First, the transfer function $C[\omega]$ must be as near to a distortionless system as possible, i.e., $c(n) = \delta(n - a)$ for some value of $a$. Second, the analysis filters $H_0[\omega]$ and $H_1[\omega]$ must approximate ideal half-band filters. The first condition can be met trivially by the solution $P_0[\omega] = P_1[\omega] = Q_0[\omega] = Q_1[\omega] = 1$. This results in a two-tap FIR filter for $H_0[\omega]$ where

$$H_0[\omega] = 2\, e^{-j(\omega/2)} \cos\frac{\omega}{2} \tag{18}$$

which is a very poor filter. All other filters used for $H_0[\omega]$ except this one and the unrealizable ideal half-band filter result in a $C[\omega]$ which exhibits phase distortion, frequency distortion, or both. If $H_0[\omega]$ is taken to be an ideal half-band low-pass filter, then

$$h_0(n) = \delta(n - a) \tag{19}$$

for some value of $a$. Note that there is no requirement here that $a$ be an integer. For this case, the corresponding polyphase filters $P_0[\omega]$ and $P_1[\omega]$ are

$$P_0[\omega] = \tfrac{1}{2}\, e^{-j(\omega/2)a} \tag{20a}$$

$$P_1[\omega] = \tfrac{1}{2}\, e^{-j(\omega/2)a}\, e^{-j(\omega/2)}. \tag{20b}$$

So the ideal polyphase filters are two ideal delays where the relative delay between the two is 1/2 sample.

Properties of the Polyphase Filters

Up to this point, we have made few assumptions about the nature of the filters to be used to do the subband splitting. Clearly, they must be realizable, causal, stable, and must have

reasonable computational characteristics if they are to be used for speech coding systems. A class of filters which meets these requirements is the IIR filters of the form

$$H_0(z) = h_0(0)\,\frac{\prod_{l=1}^{L}\bigl(1 - b(l)\,z^{-1}\bigr)}{\prod_{m=1}^{M}\bigl(1 - a(m)\,z^{-1}\bigr)} \tag{21}$$

where $L$ is the number of zeros and $M$ is the number of poles. These filters should be attractive because of their computational advantage over FIR filters for high quality filter realizations. In the past, however [4]-[6], FIR filters have been used most frequently in subband coding applications because of their better phase characteristics. To understand this phase advantage, consider the case where $H_0[\omega]$ is a half-band, linear phase, FIR low-pass filter. For this case, $M = 0$, and (21) becomes

$$H_0(z) = \sum_{n=0}^{L} h_0(n)\, z^{-n} = h_0(0) \prod_{l=1}^{L}\bigl(1 - b(l)\,z^{-1}\bigr) \tag{22}$$

and

$$h_0(L - n) = h_0(n). \tag{23}$$

This last condition means that $H_0(z)$ is a symmetric polynomial in $z$, and that, hence, the roots of $H_0(z)$ and $H_0(1/z)$ must be identical. This means that if $z_r$ is a root of $H_0(z)$, then $1/z_r$ must also be a root. A root may be its own inverse for the case of $z_r = \pm 1$.

For an FIR filter, the polyphase transfer functions defined in (12a) and (12b) can be written as

$$P_0(z) = \sum_{n=0}^{N_1} h_0(2n)\, z^{-n} = \sum_{n=0}^{N_1} p_0(n)\, z^{-n} \tag{24a}$$

$$P_1(z) = \sum_{n=0}^{N_2} h_0(2n+1)\, z^{-n} = \sum_{n=0}^{N_2} p_1(n)\, z^{-n} \tag{24b}$$

where for $L$ even [odd number of terms in $h_0(n)$], $N_1 = L/2$ and $N_2 = (L/2) - 1$, and for $L$ odd [even number of terms in $h_0(n)$], $N_1 = N_2 = (L - 1)/2$. For odd length filters, generally, $p_1(n) = 0$ for $n \neq L/2$, and

$$P_1(z) = \tfrac{1}{2}\, z^{-L/2}. \tag{25}$$

However, also for odd length filters, $P_0(z)$ and $P_1(z)$ will both be symmetric polynomials, and $P_0(z)$ must have an odd number of roots. This means that either $z = 1$ or $z = -1$ must be a root of $P_0(z)$. This requires, in turn, that the $P_0[\omega]$ polyphase filter must have at least one zero on the unit circle. Since $P_0(z)$ is ideally an all-pass (time delay) filter, this zero is in the passband, and from (17), must also occur in the passband of the overall analysis-synthesis system function $C[\omega]$. Hence, odd length FIR filters should not be used for subband splitting in two-band subband coders.

Even length filters, on the other hand, do not have this property, since neither $P_0(z)$ nor $P_1(z)$ is a symmetric polynomial. However, they do have the additional property that

$$p_0(n) = p_1\!\left(\frac{L-1}{2} - n\right) \tag{26}$$

which means that the roots of $P_0(z)$ are the reciprocals of the roots of $P_1(z)$.

Traditional FIR Polyphase Realizations

Traditionally [4], [5], FIR subband coders have used equal length half-band filters and have satisfied the interband aliasing conditions of (14) by the solution

$$Q_0(z) = P_1(z) = \sum_{n=0}^{(L-1)/2} h_0(2n+1)\, z^{-n} \tag{27a}$$

$$Q_1(z) = P_0(z) = \sum_{n=0}^{(L-1)/2} h_0(2n)\, z^{-n}. \tag{27b}$$

This results in an overall analysis-synthesis transfer function given by

$$C[\omega] = P_0[2\omega]\, P_1[2\omega]. \tag{28}$$

Recalling that the roots of $P_0(z)$ are the roots of $P_1(1/z)$, then $P_0(z)\,P_1(z)$ must be a symmetric polynomial in $z$. This means that $C(z)$ is a symmetric polynomial in $z^2$, and that

$$c(n) = c(2L - 1 - n) \tag{29}$$

which means in turn that $C[\omega]$ is a linear phase filter. Hence, these FIR solutions have two desirable properties. First, their frequency response is the frequency response obtained by concatenating two filters, both of whose ideals are all-pass delays. Second, the overall phase response of the entire system is always linear. Although moderate phase distortion is acceptable in speech communication systems, no phase distortion is desirable if data signals are to be transmitted over the same channel.

The main disadvantage of an FIR based subband coding system which might be improved by the use of IIR filters is that, if good quality filters are used for the band splitting, then the length of the FIR filters must be quite long. Also, it should be noted that FIR systems always result in some frequency distortion, and the human ear is very sensitive to frequency shaping. Finally, the strict linear phase condition maintained by these systems is not important if only speech is to be coded with the coding system.

Alternate IIR and FIR Reconstruction Filters

There are a number of other possible polyphase realizations involving both IIR and FIR filters which satisfy the no-interband aliasing condition of (14). In this section, three of these alternate realizations will be presented: one which has both frequency distortion and phase distortion, one which has only frequency distortion and no phase distortion like the traditional FIR realization, and one which has phase distortion but no frequency distortion. In all cases, these developments will be made in the context of two-band subband analysis-synthesis systems which may be combined to give octave band subband coders. The developments will also be done for the general case in which the band-splitting filters have both poles and zeros, as in (21). This does not mean that these results do not hold for FIR filters, because they do. The corresponding FIR results are obtained by allowing the number of poles to be zero.

The first point to note is that any IIR analysis filter of the

type previously discussed can be expressed in the form

$$H_0(z) = h_0(0)\,\frac{\prod_{l=1}^{L}\bigl(1 - b'(l)\,z^{-1}\bigr)}{\prod_{m=1}^{M}\bigl(1 - a'(m)\,z^{-2}\bigr)} \tag{30}$$

for some set of coefficients $b'(l)$ and $a'(m)$, and where the denominator is in terms of $z^2$. This can be accomplished either by directly designing the filter in this form [8] or by observing that any IIR filter may be transformed by the operation

$$H_0(z) = h_0(0)\,\frac{\prod_{l=1}^{L}\bigl(1 - b(l)\,z^{-1}\bigr)\,\prod_{m=1}^{M}\bigl(1 + a(m)\,z^{-1}\bigr)}{\prod_{m=1}^{M}\bigl(1 - a(m)\,z^{-1}\bigr)\,\prod_{m=1}^{M}\bigl(1 + a(m)\,z^{-1}\bigr)} \tag{31}$$

or

$$H_0(z) = h_0(0)\,\frac{\prod_{l=1}^{L}\bigl(1 - b(l)\,z^{-1}\bigr)\,\prod_{m=1}^{M}\bigl(1 + a(m)\,z^{-1}\bigr)}{\prod_{m=1}^{M}\bigl(1 - a'(m)\,z^{-2}\bigr)} \tag{32}$$

which is in the correct form, with $a'(m) = a^2(m)$ because $(1 - a z^{-1})(1 + a z^{-1}) = 1 - a^2 z^{-2}$. Rewriting $H_0(z)$ as

$$H_0(z) = \frac{\sum_{n} h_0'(n)\, z^{-n}}{\prod_{m=1}^{M}\bigl(1 - a'(m)\,z^{-2}\bigr)} \tag{33}$$

then the polyphase filters can be rewritten as

$$P_0(z) = \frac{\sum_{n=0}^{N_1} h_0'(2n)\, z^{-n}}{\prod_{m=1}^{M}\bigl(1 - a'(m)\,z^{-1}\bigr)} \tag{34a}$$

$$P_1(z) = \frac{\sum_{n=0}^{N_2} h_0'(2n+1)\, z^{-n}}{\prod_{m=1}^{M}\bigl(1 - a'(m)\,z^{-1}\bigr)} \tag{34b}$$

where the conditions on $N_1$ and $N_2$ are the same as for the FIR filter case already discussed. Rewriting the polyphase filters in terms of their poles and zeros, in general, gives

[Equation (35) is missing from the source text.]

where $P(z) = \prod_{m=1}^{M}\bigl(1 - a'(m)\,z^{-1}\bigr)$, $A_0(z)$ and $A_1(z)$ are the products of the minimum phase zeros of $P_0(z)$ and $P_1(z)$, respectively, and $B_0(z)$ and $B_1(z)$ are the products of the maximum phase zeros of $P_0(z)$ and $P_1(z)$, respectively. Under these conditions, there are three ways in which to satisfy the no-interband aliasing conditions. The first is the IIR version of the traditional FIR solution, which results in reconstruction filters given by

[Equation (36) is missing from the source text.]

where $G_1$ and $G_0$ are gain terms. It can be seen from (15) that this results in an analysis-synthesis transfer function given by

$$C(z) = \frac{h_0'(0)\,h_0'(1)\,G_0 G_1\,A_0(z^2)\,A_1(z^2)\,B_0(z^2)\,B_1(z^2)}{P(z^2)\,P(z^2)}. \tag{37}$$

This function should result in a relatively flat passband, but would always have both frequency distortion and phase distortion.

A second solution is given by

[Equation (38) is missing from the source text.]

where $G_0$ and $G_1$ are again gain terms. Using (15), this would result in an analysis-synthesis transfer function given by

$$C(z) = h_0'(0)\,h_0'(1)\,G_0 G_1\,A_0(z^2)\,A_1(z^2)\,B_0(z^2)\,B_1(z^2). \tag{39}$$

This transfer function could never be an all-pass function, but it could conceivably be a linear phase function if $A_0(z)$, $A_1(z)$, $B_0(z)$, and $B_1(z)$ could be constrained to have the proper characteristics. In particular, note that the analysis-synthesis transfer function given in (39) is a product of FIR filters, just as was true for the case of the analysis-synthesis transfer function resulting from the traditional FIR realization. Hence, if the same constraints could be placed on the numerator of the IIR prototype half-band filter as were placed on the z transform of the FIR prototype half-band filter, then the resulting analysis-synthesis transfer function could be linear phase. It would not, however, be close to an all-pass delay function, as was the case for the FIR realization.

The third solution is arrived at by first observing that a $Q_0(z)$ and $Q_1(z)$ given by

[Equation (40) is missing from the source text.]

is also a solution, and results in a $C(z)$ given by

$$C(z) = h_0'(0)\,h_0'(1)\,G_0 G_1\,B_0(z^2)\,B_1(z^2). \tag{41}$$

What has been done here is that all minimum phase components have been removed, leaving only the maximum phase terms. This solution itself is not interesting because it necessarily

results in considerable frequency distortion and considerable phase distortion. But it can be transformed into a solution which has no frequency distortion.

To see how this may be accomplished, consider a maximum phase FIR transfer function $D(z)$ where

$$D(z) = D_0 \prod_{l=1}^{L}\bigl(1 - d(l)\,z^{-1}\bigr). \tag{42}$$

Then, if we define $\hat{D}(z)$ such that

[Equation (43) is missing from the source text.]

then $\hat{D}(z)$ is a minimum phase transfer function and the ratio $D(z)/\hat{D}(z)$ of (44), whose full expression is not legible in the source text, is an all-pass filter with an exactly flat passband. Using this procedure, if $Q_0(z)$ and $Q_1(z)$ can be redefined as

[Equation (45) is missing from the source text.]

then for the correct choice of $G_0$ and $G_1$, the analysis-synthesis transfer function, given by

[Equation (46) is missing from the source text.]

is an all-pass filter with a passband which is exactly flat. This transfer function, of course, always contains phase distortion and is never even approximately linear phase.

Tree Structured Subband Coders

Two-band subband coders based on the two-band analysis-synthesis systems of the type we have been discussing are of relatively little interest for speech coding since they offer little gain over full-band ADPCMs. However, such two-band systems can be combined in tree structures to produce interesting subband coding structures for speech. This procedure has the advantage that it can take direct advantage of the properties of half-band quadrature mirror filters in a more complex filter environment. It has the specific disadvantage of stacking the frequency and phase distortion inherent in each separate band-splitting operation.

A structure which has been found to be effective for speech coding is the octave band structure illustrated in Fig. 3. The band splitters and band combiners shown in Fig. 3 are illustrated in Fig. 4 as polyphase structured half-band quadrature mirror filters.

Fig. 3. (a) Band-splitting module for octave band subband coder. (b) Band-merging module for octave band subband coder.

A key issue in the octave band structure is compensating for bands which have not been split. Recall that for all classes of two-band reconstruction discussed thus far, the overall effect of the ideal analysis-synthesis operation was that of a linear filter $C[\omega]$. Therefore, if one band is split further while the other band is not, as shown in Fig. 5, then the effect of the additional band splitting is just that of the linear filter $C[\omega]$. In order for the no-interband aliasing condition of (14) to be met under these conditions, a compensation filter must be included in the band which was not split. In particular, if the frequency response of the splitting operation is $C_s[\omega]$ and the compensation filter is $C_c[\omega]$, then the output $\hat{X}[\omega]$ becomes

[Equation (47) is missing from the source text.]

and to remove the second aliasing term

[Equation (48) is missing from the source text.]

all our previous results hold as long as $C_c[\omega] = C_s[\omega]$. This gives an overall frequency response for the system of Fig. 5 of

[Equation (49) is missing from the source text.]

In a tree structure such as that of Fig. 3, this effect is concatenated as the bands are split into successively lower levels. In particular, if $C_T[\omega]$ is the frequency response of the overall system, if we have an $N$ octave band system, as is shown in Fig. 3 for $N = 5$, and if $C_k[\omega]$ is the compensation filter which must be applied at the $k$th level, then

[Equation (50) is missing from the source text.]

Fig. 4. Octave band subband coding system.

Fig. 5. ADPCM showing forward and backward adaptation.

and

$$C_T[\omega] = \prod_{l=1}^{N-1} C_l[\,\ldots\,]. \tag{51}$$

(The frequency argument of $C_l$ in (51) is not legible in the source text.)

This, of course, is the requirement for exact interband aliasing removal. In previous work [7] using FIR band-splitting filters, the compensating filters were linear phase and had an almost flat passband, and hence the $C_k[\omega]$'s approximated ideal integer delays. Experimental results showed that using ideal delays rather than the true compensating filters gave good perceptual results in these systems, even though this allowed some interband aliasing distortion. Since such an approximation allows for considerable computational reductions, it is very important to test if this same approximation can be made when recursive band-splitting filters are used.

III. THE EXPERIMENTAL STUDY

The basic purpose of the experimental study was to quantify the effectiveness of recursive quadrature mirror filter techniques for the subband coding of speech, and also the effectiveness of optimum ADPCM coders in the same systems. In order to do this, two reference subband coders, a five-octave band coder which operated at slightly below 16 kbits/s [5] and a four-octave band coder which operated slightly below 9.6 kbits/s, were chosen. Both coders utilized FIR filters for the band-splitting functions which had been specifically designed for subband coding [7]. The experimental technique was to systematically replace the FIR filters with the IIR filters to be tested, and then to do back-to-back comparisons on the resulting coded speech. The characteristics of these FIR-based subband coders used as a reference are summarized in Table I.

In all cases, the experimental study consisted of changing modules in the basic FIR systems, and then assessing the improvement. In all, three issues were addressed in this way. First, and most important, was the potential of IIR filters used in subband coder structures for giving equivalent quality to the FIR filter structures with an associated computational improvement. Second, an alternate forward adapting ADPCM coder was considered to replace the Jayant APCM [9] coder used in the reference system. Finally, another forward adapting ADPCM coder in which the quantizer levels were specifically set to give minimum mean-square error was also considered.
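One common way to set quantizer levels for minimum mean-square error is the Lloyd-Max iteration. The sketch below is only an illustration under stated assumptions and is not the procedure of the study, which this section does not spell out: the synthetic Laplacian-like samples, the 4-level (2-bit) size, and the function names `uniform_levels`, `lloyd_max`, and `mse` are all hypothetical.

```python
import random

def uniform_levels(samples, levels):
    """Equal-step (linear quantizer) output levels spanning the sample range."""
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / levels
    return [lo + step * (k + 0.5) for k in range(levels)]

def lloyd_max(samples, outputs, iterations=50):
    """Refine output levels so the mean-square error on `samples` does not increase."""
    outputs = list(outputs)
    for _ in range(iterations):
        # Decision thresholds sit midway between neighbouring output levels.
        thresholds = [(a + b) / 2 for a, b in zip(outputs, outputs[1:])]
        cells = [[] for _ in outputs]
        for x in samples:
            # The cell index is the number of thresholds lying below x.
            cells[sum(x > t for t in thresholds)].append(x)
        # Each output level moves to the mean of its cell; empty cells keep their level.
        outputs = [sum(c) / len(c) if c else o for c, o in zip(cells, outputs)]
    return outputs

def mse(samples, outputs):
    """Mean-square error when each sample is mapped to its nearest output level."""
    return sum(min((x - o) ** 2 for o in outputs) for x in samples) / len(samples)

# Assumed test data: a two-sided exponential (Laplacian-like) source.
random.seed(0)
samples = [random.expovariate(1.0) * random.choice((-1, 1)) for _ in range(2000)]

uniform = uniform_levels(samples, 4)
optimized = lloyd_max(samples, uniform)

print("uniform   MSE:", mse(samples, uniform))
print("optimized MSE:", mse(samples, optimized))
```

Starting the iteration from the equal-step levels makes the comparison with a linear quantizer direct: each pass can only lower (or leave unchanged) the error measured on the training samples.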

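TABLE I
DESCRIPTIONS OF THE FIVE-BAND AND FOUR-BAND FIR-BASED SUBBAND CODERS USED FOR REFERENCE IN THE EXPERIMENTAL STUDY AT 16 KBITS/S AND 9.6 KBITS/S, RESPECTIVELY

Five-band coder (16 kbits/s)
Band No.               5           4          3         2         1
Frequency Range (Hz)   3200-1600   1600-800   800-400   400-200   200-100
Bit Allocation         2           2          4         4         4
Length of QMF          8, 16, 16, 16, 32 (assignment to bands not recoverable from the source text)

Four-band coder (9.6 kbits/s)
Band No.               4           3          2         1
Frequency Range (Hz)   2800-1400   1400-720   720-360   360-180
Length of QMF          16, 16, 32, 8 (assignment to bands not recoverable from the source text)
Bit Allocation         2, 1.58, 2, 3 (assignment to bands not recoverable from the source text)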
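The "slightly below 16 kbits/s" figure for the five-band reference coder can be cross-checked from the band edges and bit allocations in Table I. The short sketch below is a worked example, not part of the paper; it assumes that each band is critically sampled at twice its bandwidth and ignores any side information, and the names `bands_hz`, `bits_per_sample`, and `coder_bit_rate` are hypothetical.

```python
# Band edges (Hz) and bits per sample for the five-band reference coder,
# taken from Table I (bands 5 down to 1).
bands_hz = [(1600, 3200), (800, 1600), (400, 800), (200, 400), (100, 200)]
bits_per_sample = [2, 2, 4, 4, 4]

def coder_bit_rate(bands, bits):
    """Sum the per-band bit rates in bits per second."""
    total = 0
    for (low, high), b in zip(bands, bits):
        # Assumption: the decimated band is sampled at twice its bandwidth.
        sample_rate = 2 * (high - low)
        total += sample_rate * b
    return total

rate = coder_bit_rate(bands_hz, bits_per_sample)
print(rate, "bits/s")  # 6400 + 3200 + 3200 + 1600 + 800 = 15200
```

Under these assumptions the total is 15.2 kbits/s, which is consistent with the stated operating point of slightly below 16 kbits/s.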

IIR Filter Issues

The major underlying issue in the use of IIR filters in subband coding is the tradeoff between frequency distortion and phase distortion. As was discussed in the previous sections, either FIR or IIR filters can be made to have no phase distortion at all, no frequency distortion at all, or a mixture of both. If linear phase is required, it is probably true that FIR filters must be used since the polyphase filters for this case turn out to be FIR approximations to ideal time delays, which are themselves all-pass functions. However, as we shall see in the section on IIR filter design, it is possible to generate IIR-based systems with linear phase responses, and these were indeed tested.

If phase distortion is allowed, the key question is how much phase distortion is acceptable. It is well known that moderate amounts of phase distortion are not perceivable in coded speech, but when the relative group delays exceed about 50 samples, the speech signal begins to sound reverberant. One of the major problems with the tree-structured subband coder being considered here is that a 1 sample group delay in the innermost band translates into a $2^{N-1}$ sample group delay, where $N$ is the number of subbands, at the output (51). This suggests that hybrid systems which use linear phase, even length FIR filters for the innermost bands and IIR filters for the outer bands might be effective and computationally attractive.

Finally, there is the issue of the realization of the band-compensating filters in the octave band structure. Clearly, from (47), if the correct compensation filter is not used, then some interband aliasing results in the reconstruction. One of the major savings obtained in the FIR-based tree structure was that these compensators could be successfully approximated by simple delays. If something similar to this is not also true for the IIR case, any computational gains to be obtained from the intrinsic IIR efficiency will be lost to the implementation of the compensators.

The Design of the IIR Filters

As was shown in the previous sections [(31)-(35)], any half-band IIR filter can be used as a basis for a quadrature mirror, polyphase two-band subband coder. It is also true that if the IIR half-band filters have certain special characteristics, then the overall analysis-synthesis system will be linear phase. In particular, the denominator polynomial must be in powers of $z^2$, and the numerator must be an equal length linear phase FIR filter. If this condition is true, and if the polyphase filters are implemented as in (38), then the numerator of the polyphase filters will meet the criterion of (24), and the overall analysis-synthesis transfer function $C(z)$ given in (39) will be linear phase. Under such conditions, the underlying ideal prototype for the polyphase filters will not be all-pass delays, and considerable frequency distortion may be expected, but such filters would result, as for the FIR case, in exactly linear phase analysis-synthesis transfer functions.

The requirements for the IIR filters are similar to the requirements used by Martinez and Parks [10], [11] in which an iterative exchange algorithm was used to design IIR filters with powers of $z^N$ in the denominator and linear phase numerators. The problem is that their technique always designs odd length numerators, and the condition of (24) requires even length numerators. A program for designing low-pass filters based on the Martinez and Parks technique was available from Richards [12].

The Richards program was modified to give even length sequences as is required by (24). The Martinez and Parks design algorithm always places the stopband zeros on the unit circle. Since there is an odd number of terms in the numerator, there is an even number of zeros. Hence, one additional zero must be added, and to retain the linear phase constraints, that zero must be at $z = \pm 1$. Since we are designing low-pass filters, it cannot be at $z = +1$, so it must be at $z = -1$. Hence, the iterative exchange algorithm was modified to specifically take into account the zero at $z = -1$. The overall design procedure results in even length linear phase numerators, and like the original algorithm, is optimal in the equal ripple sense. Examples of some of the filter designs are given in the later sections.

Alternate APCM and ADPCM Coders

In all cases, the adaptive coders used in this study had both an APCM and an ADPCM form, depending on whether a predictor was present or not. All of the coders used here belong to the class of coders described in Fig. 6.

In the reference system, the APCM coder used was a backward adapting coder in the sense that the coder's current state is entirely a function of the previous code stream. In the terms of Fig. 6, the adaptive gain control (AGC) in such systems is entirely controlled from the code bit stream. This coder had the additional feature that the quantizers used were all linear (equal step size) quantizers.

One question of interest is whether the use of quantizers which are optimum in some statistical sense might improve the quality of the overall subband coding system. In particular, if the statistics of the signal being coded are known or if the statistics can be estimated from real data, then the quantizer limits and output values can be set to minimize the expected mean-square error [13]-[16]. In the case of the ADPCM system of Fig. 6, the signal statistics desired are those at the output of the AGC, $e_d(n)$. The problem is that with a backward predictor, the characteristics of the AGC, and hence the statistics of $e_d(n)$, are a function of the quantizer itself. This does not mean that such a system cannot be optimized, but it does mean that it must be an iterative process.

If an ADPCM coder is used which uses a forward predictor, then the optimization is no longer an iterative process. In this type of system, the AGC is controlled by a parameter which is estimated directly from the signal to be coded, $x(n)$. Since this parameter cannot be estimated at the receiver, it must be transmitted as side information, and this slightly increases

Fig. 6. General structure for all the 16 kbit/s subband coders tested.

[...] be referred to by the labels 8AF, 16AF, 16BF, 16CF, 32CF, 32DF, 64DF, and 64EF. The other eight were IIR filters, designed by the modified Martinez and Parks procedure, which have approximately the same specifications as the filters in the FIR set. These filters are labeled 8AR, 16AR, 14BR, 16CR, 32CR, 32DR, 64DR, and 64ER. A comparison of characteristics of the two filter sets is given in Table II. Also, Figs. 7 and 8 show the frequency response and pole-zero plots, respectively, for filter 32CR. Notice that since the denominator polynomials for these filters are in terms of $z^2$, the poles are symmetric around both the imaginary axis and the real axis.

During the course of the experimentation, a total of 96 different subband coders were implemented at each data rate. For each subband coder, a six-sentence test set was coded, and careful, but informal, pairwise comparisons were made among the systems. The only variables among the systems
the basic data rate and the complexity of the coder. If this is tested were filter variations, ADPCM coder types, and ADPCM
done, however, the signal at the output of the AGC, ed(n), coderquantizer characteristics. All of the systems operated
is no longer a function of the quantizer characteristics, and an at the same data rate and with the same bit assignment as the
optimal quantizer can be designed based on ed(n). corresponding FIR based subband coders described in Table I.
The quantizer design technique used here dates back t o Max A total of 16 different combinations of filters for the tree-
[ 131, and is based on work by Gimarc [16]. The algorithm structured band splitting in thesubbandcoders were tested
used is similar to an exchange algorithm suggested by Esteban in the experimental study, and these filter sets are summarized
[ 151 . Previous experiments have indicated that this algorithm in Table 111. The FIR setis the filter setused in the reference
converges well, and italways results in a reduction of the mean- system described in Table I. The IIR set is the filterset
square error. It is uncertain whether it converges in any par- which is approximately equivalent, from a filter specification
ticular case to the absolute optimal quantizer or only finds a point of view, totheFIRfilterset,and which uses the
local minimum. reconstruction polyphasefilters given in (36a)and (36b).
This is just the direct application of the same reconstruction
The Experimental Design method which is traditionally applied for FIR reconstruction,
All of the systems implemented as part of this study were and it results in an analysis-synthesis transfer function which
based on some combination of 16 filters. Eight of these were has both frequency and phase distortion.
FIR filters from the set designed by Johnston [ 7 ] , and shall TheIIRL filterset is identical to the IIR set, except
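
The quantizer design step described in this section, in which quantizer limits and output values are set to minimize the mean-square error, can be illustrated with the classic Lloyd-Max iteration. The sketch below is not the exchange-type algorithm of [15], [16] used in the study; it is the standard textbook iteration, and the Laplacian test data, the 4-level size, and all function names are assumptions made only for illustration.

```python
import numpy as np

def lloyd_max(samples, n_levels, n_iter=50):
    """Fit decision thresholds and output levels to the samples (Lloyd-Max iteration)."""
    samples = np.asarray(samples, dtype=float)
    # Initial guess: a linear (equal step size) quantizer over the data range.
    edges = np.linspace(samples.min(), samples.max(), n_levels + 1)
    levels = 0.5 * (edges[:-1] + edges[1:])
    for _ in range(n_iter):
        # Decision thresholds sit midway between neighbouring output levels.
        thresholds = 0.5 * (levels[:-1] + levels[1:])
        cells = np.searchsorted(thresholds, samples)
        # Each output level moves to the mean of the samples in its cell.
        for k in range(n_levels):
            members = samples[cells == k]
            if members.size:
                levels[k] = members.mean()
    thresholds = 0.5 * (levels[:-1] + levels[1:])
    return thresholds, levels

def quantize(samples, thresholds, levels):
    """Map each sample to the output level of the cell it falls in."""
    samples = np.asarray(samples, dtype=float)
    return levels[np.searchsorted(thresholds, samples)]

def mean_square_error(samples, thresholds, levels):
    """Empirical mean-square quantization error."""
    samples = np.asarray(samples, dtype=float)
    return float(np.mean((samples - quantize(samples, thresholds, levels)) ** 2))

# Hypothetical stand-in for the AGC output ed(n): unit-scale Laplacian samples.
rng = np.random.default_rng(0)
ed = rng.laplace(size=20000)

# A 4-level linear quantizer over the data range, for comparison.
edges = np.linspace(ed.min(), ed.max(), 5)
linear_levels = 0.5 * (edges[:-1] + edges[1:])
linear_thresholds = edges[1:-1]

opt_thresholds, opt_levels = lloyd_max(ed, n_levels=4)
print("linear  MSE:", mean_square_error(ed, linear_thresholds, linear_levels))
print("optimum MSE:", mean_square_error(ed, opt_thresholds, opt_levels))
```

Each pass can only lower or hold the empirical mean-square error, which matches the observation above that the design always reduces the error, while leaving open whether the result is the global optimum or a local minimum.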
760 IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-30, NO. 5, OCTOBER 1982

TABLE II
A COMPARISON OF THE CHARACTERISTICS OF THE FIR AND IIR FILTERS USED AS THE BASIS FOR THE SUBBAND CODERS TESTED IN THE EXPERIMENTAL STUDY. THE IIR FILTERS WERE DESIGNED USING A MODIFIED MARTINEZ AND PARKS APPROACH.

          Multiplies               Transition   Passband      Stopband
Filter    Numerator  Denominator   Band         Ripple (dB)   Ripple (dB)

8AF       8                        .14          .06           -31
8AR       6          2             .14          .008          -33
16AF      16                       .14          .008          -60
16AR      4          6             .14          .002          -47
16BF      16                       .10          .02           -44
16BR      6          6             .10          .014          -37
16CF      16                       .0625        .07           -30
16CR      6          4             .0625        .03           -30
32CF      32                       .0625        .009          -51
32CR      8          6             .0625        .006          -45
32DF      32                       .043         .025          -38
32DR      8          8             .043         .01           -39
64DF      64                       .043         .002          -65
64DR      12         10            .043         .0005         -65
64EF      64                       .023         .025          -40
64ER      10         8             .023         .0068         -43

[Plot; horizontal axis: frequency (kHz).]
Fig. 7. Frequency response for filter 32CR.

that the reconstruction polyphase filters are designed to give an analysis-synthesis transfer function which is exactly linear phase by using (38a) and (38b).

The IIRA filter set is also identical to the IIR set, except that the reconstruction polyphase filters are designed from (45a) and (45b) to give an analysis-synthesis transfer function which is exactly all-pass. The filter sets FIIR0-FIIR11 are all combinations of FIR and IIR filters where the FIR reconstruction filters are designed to give linear phase and the IIR reconstruction filters are designed to give an all-pass analysis-synthesis transfer function.

The filter sets IIRX1 and IIRX2 are also both identical to the IIR set, except they use a mixed linear phase/all-pass reconstruction philosophy. For poles which are far from

[Pole-zero plot.]
Fig. 8. Pole-zero pattern for filter 32CR.
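
The pole symmetry noted for these filters (a denominator polynomial in powers of z^2 places the poles symmetrically about both the real and the imaginary axis) can be checked numerically. The following is a minimal sketch; the denominator coefficients are hypothetical and are not those of filter 32CR.

```python
import numpy as np

# Hypothetical denominator D(z) = 1 + 0.3 z^-2 + 0.5 z^-4 (illustrative
# coefficients only). Only even powers of z appear.
denominator = [1.0, 0.0, 0.3, 0.0, 0.5]
poles = np.roots(denominator)

def contains(points, target, tol=1e-9):
    """True if target matches one of the points to within tol."""
    return bool(np.min(np.abs(points - target)) < tol)

# Mirror about the imaginary axis: p -> -conj(p).
# Mirror about the real axis:      p -> conj(p).
mirror_imag = all(contains(poles, -np.conj(p)) for p in poles)
mirror_real = all(contains(poles, np.conj(p)) for p in poles)

print(poles)
print("symmetric about imaginary axis:", mirror_imag)
print("symmetric about real axis:", mirror_real)
```

The real-axis symmetry follows from the real coefficients, and the imaginary-axis symmetry from the fact that D(z) is unchanged when z is replaced by -z.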
the unit circle, the linear phase design procedures of (38a) and (38b) are used. For poles which are close to the unit circle, the all-pass procedures of (45a) and (45b) are used. Hence, the poles which are far away from the unit circle and which have little effect on the magnitude of the transfer function are reconstructed for linear phase, while the poles which are close to the unit circle and heavily affect the frequency response are compensated to have an overall all-pass effect. The two filter sets included here have a different definition of close and far.

Most of the filter sets were used as the basis for six subband coder systems. These systems were formed by combining three ADPCM and APCM coder types, namely the backward adapting linear quantizer APCM, the forward adapting linear quantizer ADPCM, and the forward adapting optimum quantizer ADPCM, with two compensation filters: simple delays, or exact compensation as given by (37), (39), and (46). For each of these systems, an SNR was computed between the coded signal and an uncoded signal which had been passed through the same analysis-synthesis system, but with no ADPCM or APCM coding. Hence, this SNR represents the coding impact of the APCM or ADPCM coders and does not include any measure of the analysis-synthesis distortion. The SNRs are shown in Table IV.

For each of the systems considered, a careful informal listening test was performed between that system and the reference system of Table I. The differences were scored as follows: -3 means the test system is much worse than the reference system; -2 means the test system is not as good as the reference system, but is still a reasonable coding system; -1 means the test system is not as good as the reference system, but the two systems are very close in quality; 0 means the test system is the same quality as the reference system; +1 means the test system is better than the reference system, but is very close in quality; and +2 means that the test system is clearly better than the reference system, but still of comparable quality. Table V gives a compilation of these results.

IV. RESULTS

Four questions were addressed by the experimental portion of this study: the effectiveness of IIR based tree-structured analysis-synthesis systems for subband coders; the impact of correct compensation as compared to delay compensation for deleted bands; the effectiveness of forward adapting ADPCM for coding the bands in subband coders; and the effectiveness of optimum quantizer design in conjunction with forward adapting ADPCM for subband coders.

IIR Filter Sets
As previously discussed, there are three methods which can be used to design the synthesis filters for a band-splitting analysis-synthesis pair: the traditional quadrature mirror filter method of (36a) and (36b); the linear phase reconstruction method of (38a) and (38b); and the all-pass reconstruction method of (45a) and (45b). These approaches are compared by the SNR results in Table IV and the quality results in Table V by the IIR, IIRL, and IIRA filter sets, respectively.

The first point to note is that none of these methods interacts with the APCM or ADPCM coders in such a way as to dramatically increase or decrease the noise level, since all of the SNRs are comparable to one another. It is important to recall at this point that the SNRs reported in Table IV only reflect the coding distortion introduced by the APCM or ADPCM coders, and do not reflect any subjective distortion inherent in the analysis-synthesis systems.

The second point is that none of these systems performed well when compared to the reference FIR system. In particular, the IIR and IIRA systems were judged to be reasonable coding systems, but clearly not as good as the FIR system, while the IIRL system was found to be of poor quality. It is clear from separate listening tests and the SNR results that virtually all of the quality degradations in these three systems are due to inherent distortions in the underlying analysis-synthesis systems. It is also clear that the underlying reasons for the distortions are quite different for

TABLE III
A SUMMARY OF THE FILTERS USED IN THE FIVE-BAND (16 KBIT/S) AND FOUR-BAND (9.6 KBIT/S) SUBBAND CODERS TESTED IN THE EXPERIMENTAL STUDY

The Analysis-Synthesis Filter Sets

5-Band
              Band Split
Filter Set    #1     #2     #3     #4     #5
FIR           32DF   16CF   16CF   16CF   8AF    Linear Phase
IIR           32DR   16CR   16CR   16CR   8AR    Mixed
IIRL          32DR   16CR   16CR   16CR   8AR    Linear Phase
IIRA          32DR   16CR   16CR   16CR   8AR    All-Pass
FIIR0         32DR   16CR   16CR   16CF          Mixed
FIIR1         32DR   16CR   16CF   16CF          Mixed
FIIR2         32DR   16CF   16CF   16CF          Mixed
FIIR3         64DR   32CR   32CR   16CF          Mixed
FIIR4         64DR   32CR   16CF   16CF          Mixed
FIIR5         64DR   16CF   16CF   16CF          Mixed
FIIR6         64ER   32DR   32DR   16CF          Mixed
FIIR7         64ER   32DR   16CF   16CF          Mixed
FIIR8         64ER   16CF   16CF   16CF          Mixed
FIIR9         64DR   64DR   64DR   16CF          Mixed
FIIR10        64DR   64DR   16CF   16CF          Mixed
FIIR11        64DR   16CF   16CF   16CF          Mixed
IIRX1         32DR   16CR   16CR   16CR   8AR    Mixed
IIRX2         32DR   16CR   16CR   16CR   8AR    Mixed

4-Band
              Band Split
Filter Set    #1     #2     #3     #4
FIR           32DF   16CF   16CF   8AF    Linear Phase
IIR           32DR   16CR   16CR   8AR    Mixed
IIRL          32DR   16CR   16CR   8AR    Linear Phase
IIRA          32DR   16CR   16CR   8AR    All-Pass
FIIR0         32DR   16CR   16CR   8AF    Mixed
FIIR1         32DR   16CR   16CR   8AF    Mixed
FIIR2         32DR   16CF   16CF   8AF    Mixed
FIIR3         64DR   32CR   32CR   8AF    Mixed
FIIR4         64DR   32CR   16CF   8AF    Mixed
FIIR5         64DR   16CF   16CF   8AF    Mixed
FIIR6         64ER   32DR   32DR   8AF    Mixed
FIIR7         64ER   32DR   16CF   8AF    Mixed
FIIR8         64ER   16CF   16CF   8AF    Mixed
FIIR9         64DR   64DR   64DR   8AF    Mixed
FIIR10        64DR   64DR   16CF   8AF    Mixed
FIIR11        64DR   16CF   16CF   8AF    Mixed
IIRX1         32DR   16CR   16CR   8AR    Mixed
IIRX2         32DR   16CR   16CR   8AR    Mixed

TABLE IV TABLE V
SNR RESULTS
FOR THE FIVE-BAND (16 KBIT/ S) AND FOUR-BAND(16 KBIT/S) OF T H E QUALITY
RESULTS TESTS FOR THE FIVE-BAND (16 K B I T ~ S )A N D
SUBBANDCODERS TESTED IN THE EXPERIMENTAL STUDY.THISRESULT FOUR-BAND (9.6 K B I T / S ) SUBBAND CODERS. IN THISTABLE, -3
ONLY CODING
REFLECTS DISTORTION AND NOT THE PHASE OR MEAUST H E TESTSYSTEM WAS MUCHWORSETHAN THE
FREQUENCYDISTORTION OF THE ANALYSIS-SYNTHESIS SYSTEM. REFERENCE SYSTEM, -2 MEANSTHE TESTSYSTEM W A S
WORSETHAN BUTCOMPARABLE TO THE REFERENCE
Results for the5-Band SNR Study
SYSTEM, -1 MEAUS T H E TESTSYSTEM WAS VERY
SLIGHTLY WORSETHAN THE REFERENCE SYSTEM,
Forward Backward Forward Optimum 0 MEANS THE TESTSYSTEM AVD T H E
ADPCM APCM ADPCM REFERENCE SYSTEM HADTHE SAME
QUALITY, A R D +1 MEAKS THE TEST
Delay
Filter
Delay
Filter
Delay
Fllter SYSTEM WAS SLIGHTLY BETTER
Filter THAN THE REFERENCE SYSTEM.
Set
5-Band Systems
FIR 15.4415.44 Backwards Forward Forward Optimum
IIR 15.1715.17 ADPCM ADPCM ADPCM
IIRL 12.2
12.0
15.87
IIRA
15.88 Delay
Filter
Delay
Filter Delay
Filter
FIR0 15.51
15.50
18.13
18.14
19.92
19.93
FIIRl 15.23
15.20
17.95
17.95
19.89
19.92 Filter
FIIR2 15.63 15.63
18.42
18.43
20.25
20.22 Set
FIIR3 15.79 15.81
18.38
18.38
20.27
20.27 FIR - 0 0 0 0 0
FIIR4 15.59 15.58
18.28
18.29
20.07
20.09 IIR -2 -2 -2 -2 -2 -2
FIIR5 15.60 15.61
18.25
18.26
20.19
20.20 IIRL -3 -3 -3 -3 -3 -3
15.71
FIIR6 15.71
17.91
17.98
20.25
20.27 IIRA -2 -2 -2 -2 -2 -2
FIIR7 15.73 15.73
18.29
18.30
20.10
20.11 FIIRO -1 -1 -1 -1 -1 -1
FIIR8 15.84 15.80
18.62
18.62 20.48
20.48 FIIRl 0 0 0 0 0 0
FIIR9 15.65 15.66
18.01
18.02
20.06
20.07 FIIR2 0 0 0 0 0 0
FIIR10 15.74 15.73
18.20
18.21
20.22
20.23 FIIR3 -1 -1 -1 -1 -1 -1
FIIRll
15.94 15.94
18.25
18.26
20.19
20.20 FIIR4 0 0 0 0 0 0
14.83
IIRxl 14.88 FIIRS +1 +1 +1 +1 +1 +1
13.07
IIRx2 13.02 FIIR6 -2 -2 -2 -2 -2 -2
FIIR7 -1 -1 -1 -1 -1 -1
Results for the 4 Band SNR Study FIIR8 +1 +1 +l +1 +1 +1
FIIR9 -2 -2 -2 -2 -2 -2
FIIRlO -1
-1 -1 -1 -1 -1
Backward Forward
Forward
Optimum FIIRll +1 +1 +1 +l +1 +1
APCM ADPCM ADPCM -3IIRxl-3 -3 -3 -3 -3
IIRx2 -3 -3 -3 -3 -3 -3
Delay
Filter
Delay
Filter
Delay
Filter
Filter 4-Band Systems
Set ForwardBackwards Forward ODtimum
FIR 9.7 9.8 12.612.6 13.2
13.0
ADPCM ADPCM - ADPCM
IIRA 8.7 8.7 11.1
11.0 11.6
11.6 Delay
Filter
Delay
Filter
Delay
Filter
IIRB 9.89.8 12.412.3 13.0 13.0
5.1 IIR
5.0 4.2 4.0 7.2 7.1 Filter
FIIRO 8.7 8.8 11.1
11.2 11.6
11.6 Set
FIIRl 9.0 9.0 12.112.0
11.8
11.8 FIR 0 0 0 0 0
FIIRZ 9.49.5 13.2
13.2
12.6
12.6 IIR -2 -2 -2 -2 -2 -2
FIIR3
11.2
11.29.4 9.3 11.8
11.8 -311% -3 -3 -3 -3 -3
FIIR4 8.9 9.0 11.9
12.0 11.9
11.9 IIRA -2 -2 -2 -2 -2 -2
FIIRS 9.8 9.8 12.5
12.4 12.9
12.9 FIIRO -1 -1 -1 -1 -1 -1
FIIR6 8.88.8 10.6 10.6 11.3
11.2 FIIRl 0 0 0 0 0 0
FIIR7 8.7 8.811.9
11.8
11.4
11.4 FIIR2 0 0 0 0 0 0
FIIR8 9.5 9.6 12.4
12.4 13.1
13.1 FIIR3 -1 -1 -1 -1 -1 -1
FIIR9 8.28.2 10.9 10.9 11.0
11.0 FIIR4 0 0 0 0 0 0
FIIRlO 8.7 8.7 11.2
11.711.6
11.3 FIIRS +l +1 +1 +1 +1 +1
FIIR11 9.8 9.9 12.4
12.912.9
12.4 FIIR6 -2 -2 -2 -2 -2 -2
IIRxl 9.011.0 9.1 11.6
11.6
11.1 FIIR7 -1 -1 -1 -1 -1 -1
ITQV2 7.9 8.0 10.0 10.1 10.8 10.8 FIIR8 +1 +1 +l +1 +1 +1
FIIR9 -2 -2 -2 -2 -2 -2
FIIR10 -1 -1 -1 -1 -1 -1
FIIR11 +1 +1 +l +1 +1 +1
IIRX1 -3 -3 -3 -3 -3 -3
IIRX2 -3 -3 -3 -3 -3 -3

the different systems. For the IIRA system, which has no frequency distortion, the problem is too much phase distortion, which results in a slight reverberant quality in the coded speech. It is well known that moderate phase distortion is not important in speech quality, but the tree-structured analysis-synthesis systems result in considerable phase distortion. Hence, the majority of the phase distortion effects in the IIRA system are due to the phase distortions from the bands which have been split most often.

For the IIRL system, there is no phase distortion, but there is considerable frequency distortion. In particular, the resulting transfer function for the IIRL system as given by (41) cannot be argued to be an approximate all-pass function for IIR filters as it can for FIR filters (37). Hence, the linear phase system has bad frequency distortion, which destroys its quality.

The IIR system exhibits both of the above effects, that is, both phase and frequency distortion, but it is the frequency distortion which is most responsible for its overall quality. This might be improved by better IIR filter design techniques, but the phase distortion would still be present.

Two additional methods were investigated to improve the quality of the IIR filter based systems. The first was to generate the filter sets IIRX1 and IIRX2, in which the phase distortion was reduced by applying the all-pass compensation of (45a) and (45b) only to poles which were close to the unit circle, and by using the linear phase compensation of (38a) and (38b) for the poles which were far from the unit circle.
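
The trade-off described above for the all-pass (IIRA) reconstruction, no frequency distortion but noticeable phase distortion, can be seen on a single all-pass section in z^2. The sketch below is only an illustration: the section A(z) = (a + z^-2) / (1 + a z^-2) and the coefficient a = 0.5 are assumptions and are not taken from (45a), (45b), or any filter in the study.

```python
import numpy as np

a = 0.5  # hypothetical all-pass coefficient, not taken from the paper
w = np.linspace(0.0, np.pi, 2048)      # frequency in radians per sample
z2 = np.exp(-2j * w)                   # z^-2 evaluated on the unit circle
H = (a + z2) / (1.0 + a * z2)          # A(z) = (a + z^-2) / (1 + a z^-2)

magnitude = np.abs(H)
phase = np.unwrap(np.angle(H))
group_delay = -np.gradient(phase, w)   # numerical group delay, in samples

print("max deviation of |H| from 1:", np.max(np.abs(magnitude - 1.0)))
print("group delay range (samples):", group_delay.min(), group_delay.max())
```

The magnitude stays at unity across the band while the group delay varies strongly with frequency; cascading such sections through the stages of a tree structure accumulates the delay variation in the bands that are split most often.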
[Plot; horizontal axis: frequency (kHz).]
Fig. 9. Frequency response for analysis-synthesis system FIIR4.

As can be seen from the results of Table V, the frequency distortions introduced by the linear phase compensation far outweighed any perceptual improvement associated with the reduced phase distortion.

The second method considered was to mix FIR and IIR filters together in sets so that FIR filters were used on the innermost bands, where the phase characteristics are critical, and IIR filters were used on the outermost bands, where the frequency characteristics of the analysis filters are most important. The filter sets FIIR0-FIIR11 are examples of this technique.

The first point to note from the SNR results for these systems given in Table IV is that, like the other systems, these analysis-synthesis systems have little impact on the coding noise in the reconstructed speech. Second, however, the quality results of Table V show that several of these systems were found to be of equivalent quality to the basic FIR system, while three systems (FIIR5, FIIR8, and FIIR11) were judged to be of slightly better quality than the reference FIR system. An examination of the combination of filters used for the different filter sets reveals some general properties. First note that, for recursive filters, good frequency resolution usually results in bad phase distortion. Hence, whenever IIR filters are used on the inner bands, the overall phase distortion of the analysis-synthesis system is heavily impacted, and the reverberant quality becomes the dominant distortion. Hence, the simple rule becomes that IIR filters should not be used in any but the outer two bands, and further, an additional result is that if this rule is not followed, then the better the frequency resolving characteristics of the IIR filters used, the worse the overall distortion becomes. However, for the outer two bands, this effect is reversed. Here, the phase distortion is only moderate and is essentially not perceptually important, while the gains in controlling the coding error by using better filters result in perceivable improvements in quality. In order to show the level of allowable frequency and phase distortions, Figs. 9 and 10 show the overall analysis-synthesis transfer function frequency response and group delay for the FIIR4 system, which was scored as equal in quality to the reference system.

So, in summary, it can be said that IIR filters are effective for subband coding if they are used only in the outermost bands. The remaining question concerns the computational impact of these improvements on the system realizations. Table VI shows the multiplies per sample for each of the basic filters when they are used for the band-splitting function in a polyphase structure, while Table VII shows a comparison of multiplies per sample for each of the filter sets as compared to the perceived quality rating from Table V. As can be seen, equivalent quality was achieved with 86 percent of the multiplies required for the FIR system, while improved quality was obtained for systems which used 92 percent of the multiplies required for the FIR system. These gains cannot be said to be dramatic, and this is attributable to two facts. First, the computational gains for IIR over FIR filters are only apparent for higher quality filters. Second, the IIR filters can only be used in the outer two bands. It should be noted, however, that if a two-band subband coder or a four-band subband coder with equally spaced filters were implemented, the projected improvement in the number of multiplies would be on the order of 50 percent.

The Compensation Filters
One of the critical questions in tree-structured subband coder implementations is the method which is used to compensate for unsplit bands. If these bands are not properly compensated, then (48) is not satisfied, and interband aliasing results. However, if these bands are exactly compensated to remove the interband aliasing rather than by approximating the correct compensation by simple delays, then the computational impact is very great. Previous researchers have used delay compensation effectively for linear phase synthesis
[Plot; horizontal axis: frequency (kHz).]
Fig. 10. Group delay for analysis-synthesis system FIIR4.

[4], [5], but it is not at all clear that this would be effective when IIR filters are used.

In order to check the effectiveness of delay compensation, all the systems tested were implemented using both the correct compensation and the corresponding approximate delay compensation. The results of the quality tests for these systems are shown in Table V. The overwhelming result here is that, in all cases, the delay compensation performed equally as well as using the correct compensation filters. Hence, the conclusion is clearly that delays rather than compensation filters should always be used.

TABLE VI
A COMPILATION OF THE TOTAL MULTIPLIES PER SAMPLE USED IN THE BAND SPLITTING ANALYSIS OPERATIONS FOR EACH OF THE FILTERS USED IN THE STUDY

                        Synthesis       Synthesis
Filter    Analysis      Linear Phase    All-Pass
8AF       4 (8)
8AR       6.5 (11.5)
16AF      8
16AR      9
16BF      8
16BR      10.5
16CF      8
16CR      8.5
32CF      16
32CR      12
32DF      16
32DR      14
64DF      32
64DR      19
64EF      32
64ER      15.5

TABLE VII
MULTIPLIES PER SAMPLE AND QUALITY SCORES FOR THE FIVE-BAND (16 KBIT/S) AND FOUR-BAND (9.6 KBIT/S) SYSTEMS TESTED

Multiplies per Input Sample for the Analysis-Synthesis Systems

5-Band Systems
Filter Set    Multiplies per Sample    Quality
FIR           46.5
IIR           36.9                     -2
IIRL          36.9                     -3
IIRA          40.3                     -2
FIIR0         40.1                     -1
FIIR1         40.3                      0
FIIR2         40.5                      0
FIIR3         54                       -1
FIIR4         52.5                      0
FIIR5         49.5                     +1
FIIR6         41.5                     -2
FIIR7         42                       -1
FIIR8         43                       +1
FIIR9         66.1                     -2
FIIR10        64                       -1
FIIR11        49.5                     +1

4-Band Systems
Filter Set    Multiplies per Sample    Quality
FIR           42.5
IIR           32.9                     -2
IIRL          32.9                     -3
IIRA          36.3                     -2
FIIR0         36.1                     -1
FIIR1         36.3                      0
FIIR2         36.5                      0
FIIR3         50                       -1
FIIR4         48.2                      0
FIIR5         45.5                     +1
FIIR6         37.5                     -2
FIIR7         38                       -1
FIIR8         39                       +1
FIIR9         62.1                     -2
FIIR10        60                       -1
FIIR11        45.5                     +1
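
The percentages quoted for the computational comparison can be checked against the five-band entries of Table VII. The short sketch below assumes that the "equivalent quality" figure refers to FIIR1 (score 0) and the "improved quality" figure to FIIR8 (score +1); the text does not say which rows were used, so that pairing is an assumption.

```python
# Multiplies per input sample, five-band systems, as listed in Table VII.
multiplies = {"FIR": 46.5, "FIIR1": 40.3, "FIIR8": 43.0}

def percent_of_fir(name):
    """Multiplies of the named filter set as a percentage of the FIR reference."""
    return 100.0 * multiplies[name] / multiplies["FIR"]

for name in ("FIIR1", "FIIR8"):
    print(name, percent_of_fir(name))
```

Under that assumption the ratios come out near 87 and 92 percent, close to the 86 and 92 percent figures given in the text.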

The Forward Adapting ADPCM
The results for the forward adapting ADPCMs for both the linear and the optimum quantizer case are shown in Tables IV and V. The first point to note is that the linear quantizer forward adapting ADPCM gave an SNR improvement of about 2 dB over the backward predicting APCM, and that the optimum quantizer gave a further improvement in SNR of about 2 dB over the linear quantizer. However, results of the quality studies show that neither of the forward adapting ADPCMs resulted in quality improvements. The overall conclusion

which must be drawn here is that forward predicting ADPCMs, both optimum and nonoptimum, are of little value in the subband coding of speech at the 16 kbit/s or the 9.6 kbit/s data rate.

These results illustrate what is probably a fundamental effect in using optimal quantizers in waveform coders for speech. Clearly, a 4 dB improvement in SNR normally results in a clearly perceivable quality improvement. The problem here results from two facts. First, the objective measure used, namely SNR, is exactly the quantity which is being optimized. Second, it is well known that SNR is not a good objective measure for speech quality [17]. Hence, the optimizer has performed exactly as it was designed to do and has improved the SNR. However, because of the highly dependent relationship between the quantizer design algorithm and the objective quality measures used, and because of the poor overall performance of the objective quality measure in predicting perceived speech quality, the improved SNR results do not result in better quality speech.

The effect of getting an improved SNR by using an optimum quantizer with no improvement in speech quality was also observed by Gimarc [16], but was not reported by Esteban [15]. This effect points up once again the inappropriateness of mean-square error as a general measure of waveform coder performance.

V. SUMMARY

This paper presented the results of an analytical and experimental study of the use of recursive quadrature mirror filters and ADPCM coders with optimum quantizers in octave band subband coders for speech. It was shown analytically that octave band analysis-synthesis systems which had no interband aliasing and which had either no frequency distortion or no phase distortion could be designed. The experimental study centered on three areas: the perceptual effects induced by the use of IIR filters; the improvements in SNR and quality due to using optimum quantizers with forward adapting ADPCMs; and the effects of compensating for further band splitting in the octave band structure by using ideal delays.

The major results can be summarized as follows. The IIR filters are effective only when used in the outermost splits in a tree decimation structure. Given that this condition is met, subband coders with slightly better quality than FIR-based systems can be achieved with modest computational improvements. In tests using forward adapting ADPCMs with optimum quantizers, no perceptual quality improvement was observed. Finally, ideal delays were found to be always sufficient for unsplit band compensation.

ACKNOWLEDGMENT

The author would like to express his deep appreciation for the support given him by Bell Laboratories. He would also like to acknowledge Dr. R. Crochiere, Dr. R. Cox, and Dr. D. Malah for their help and encouragement in this research.

REFERENCES

[1] R. E. Crochiere, S. A. Webber, and J. L. Flanagan, "Digital coding of speech in sub-bands," Bell Syst. Tech. J., vol. 55, pp. 1069-1085, Oct. 1976.
[2] R. E. Crochiere, "On the design of sub-band coders for low-bit-rate speech communications," Bell Syst. Tech. J., vol. 56, pp. 747-770, May-June 1977.
[3] A. Croisier, D. Esteban, and C. Galand, "Perfect channel splitting by use of interpolation/decimation/tree decomposition techniques," presented at the 1976 Int. Conf. Inform. Sci. Syst., Patras, Greece, 1976.
[4] D. Esteban and C. Galand, "Application of quadrature mirror filters to split band voice coding schemes," in Proc. 1977 Int. Conf. Acoust., Speech, Signal Processing, Hartford, CT, May 1977, pp. 191-195.
[5] A. J. Barabell and R. E. Crochiere, "Sub-band coder design incorporating quadrature filters and pitch prediction," in Proc. 1979 Int. Conf. Acoust., Speech, Signal Processing, Washington, DC, Apr. 1979.
[6] R. E. Crochiere, "A novel approach for implementing pitch prediction in subband coding," in Proc. 1979 Int. Conf. Acoust., Speech, Signal Processing, Washington, DC, Apr. 1979.
[7] J. D. Johnston, "A filter family designed for use in quadrature mirror filter banks," in Proc. Int. Conf. Acoust., Speech, Signal Processing, Denver, CO, Apr. 1980.
[8] M. G. Bellanger, G. Bonnerot, and M. Coudreuse, "Digital filtering by polyphase network: Application to sample rate alteration and filter banks," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-24, pp. 109-114, Apr. 1976.
[9] P. Cummiskey, N. S. Jayant, and J. L. Flanagan, "Adaptive quantization in differential PCM coding of speech," Bell Syst. Tech. J., vol. 52, pp. 1105-1118, Sept. 1973.
[10] H. G. Martinez and T. W. Parks, "Design of recursive digital filters with optimum magnitude and attenuated poles on the unit circle," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-26, pp. 150-157, Apr. 1978.
[11] H. G. Martinez and T. W. Parks, "A class of infinite-duration impulse response filters for sampling rate reduction," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-27, pp. 154-163, Apr. 1979.
[12] M. Richards, private communication.
[13] J. Max, "Quantizing for minimum distortion," IRE Trans. Inform. Theory, vol. IT-6, pp. 7-12, Mar. 1960.
[14] M. D. Paez and T. H. Glisson, "Minimum mean-squared-error quantization in speech PCM and DPCM systems," IEEE Trans. Commun., vol. COM-20, pp. 225-230, Apr. 1972.
[15] D. J. Esteban, J. Menez, and F. Boeri, "Optimum quantizer algorithm for real-time block quantizing," in Conf. Rec., 1979 Int. Conf. Acoust., Speech, Signal Processing, Apr. 1979, pp. 980-983.
[16] C. E. Gimarc, "Application of an optimum quantizer algorithm to PCM and ADPCM speech coders," Master's thesis, Georgia Inst. Technol., Atlanta, June 1980.
[17] T. P. Barnwell, III and W. D. Voiers, "An analysis of objective measures for user acceptance of voice communications systems," Georgia Inst. Technol., Atlanta, Final Rep. DA100-78-C-0003, Sept. 1979.

Thomas P. Barnwell, III (76) received the S.B., S.M., and Ph.D. degrees from the Massachusetts Institute of Technology, Cambridge, in 1965, 1967, and 1970, respectively.
From 1965 to 1966 he was a National Science Foundation Fellow. He was a Teaching Assistant during the 1966-1967 school year, and a National Institutes of Health Fellow from 1967 to 1970. Since 1971 he has been at the Georgia Institute of Technology, Atlanta, where he is now a Professor. While at Georgia Tech, he has been involved in the development of the Digital Signal Processing Laboratory, and has introduced several courses in the areas of speech processing and digital systems. His research activities are in the areas of speech processing techniques, digital systems, and digital architecture for signal processing. He has been the principal investigator on numerous research programs in these areas, and is the author of numerous related papers and technical reports.
Dr. Barnwell is a member of Eta Kappa Nu, Tau Beta Pi, and Sigma Xi.
