Speech coder realization based on the ADSP-2181 DSP


I.Zolotuhin, O.Pavlov
Abstract
The main performance characteristics of a speech coder model based on the CELP algorithm,
and of its hardware and software realization on the Analog Devices EZ-KIT-LITE ADSP-2181
debugging unit, are presented. The presented data permit a comparative estimation of the
efficiency of realizing similar algorithms on signal processors from various leading companies,
an evaluation of the hardware and processing costs and reserves when the ADSP-2181 is used,
and identification of the most critical fragments of the algorithm to be realized in fixed-point
arithmetic and of the fragments to be realized in floating-point arithmetic.
Abstract (translated from Ukrainian)
The main results of a hardware and software realization of a speech coder on a single
Analog Devices ADSP-2181 signal processor are presented. The data given permit a comparative
assessment of the efficiency of realizing similar algorithms on signal processors from various
leading manufacturers, an evaluation of the hardware and software costs and reserves, and
identification of the most critical fragments of the algorithms to be realized in fixed-point
arithmetic and of the fragments that can be realized in floating-point arithmetic.
Introduction
The theory of low-bit-rate speech coding has been well developed over the last 20
years [1-7]. Most projects concentrated on reducing the processing complexity of the speech
codec models in use and on realizing them in real time. In many cases, however, the reduction
in processing complexity was obtained by using simpler models at the cost of speech
quality. Progress in Digital Signal Processing (DSP) technology, and new fast devices with large
internal memory, make it possible to realize more complex algorithms with better speech quality.
Model description
The model is a 4800 bps CELP (Code Excited Linear Prediction) algorithm. One pitch
Long Term Predictor (LTP) and an all-pole 10th-order Short Term Predictor (STP) are used. To
create the optimal excitation signal, a stochastic codebook with 75% zero elements is used. It
contains 512 excitation vectors with an overlap of 2 samples [4]. The encoder frame is 20 ms
(160 samples per frame at an 8 kHz sampling frequency). The data type used is a signed char (8 bit).
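The frame arithmetic and the overlapped sparse codebook can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "overlap of 2 samples" means consecutive vectors are shifted by 2 samples within one shared sequence, that each vector spans one 80-sample sub-frame, and that the non-zero elements are +1 or -1. The names `make_sparse_sequence` and `codebook_vector` are hypothetical.

```python
import random

SAMPLE_RATE_HZ = 8000
FRAME_MS = 20
BIT_RATE_BPS = 4800
SUBFRAME_LEN = 80      # assumed vector length: one sub-frame
CODEBOOK_SIZE = 512
SHIFT = 2              # assumed: neighbouring vectors are shifted by 2 samples

frame_len = SAMPLE_RATE_HZ * FRAME_MS // 1000      # samples per frame
bits_per_frame = BIT_RATE_BPS * FRAME_MS // 1000   # bit budget per frame


def make_sparse_sequence(length, zero_fraction=0.75, seed=1):
    """Ternary sequence with roughly `zero_fraction` zeros (fits a signed 8-bit type)."""
    rng = random.Random(seed)
    return [0 if rng.random() < zero_fraction else rng.choice((-1, 1))
            for _ in range(length)]


# One shared sequence holds all 512 overlapping vectors.
store_len = SHIFT * (CODEBOOK_SIZE - 1) + SUBFRAME_LEN
store = make_sparse_sequence(store_len)


def codebook_vector(index):
    """Return excitation vector `index` as a slice of the shared sequence."""
    start = SHIFT * index
    return store[start:start + SUBFRAME_LEN]
```

Under these assumptions the whole codebook occupies 1102 stored elements rather than 512 x 80, and neighbouring vectors share all but 2 samples, which a search routine can exploit.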
The speech frame being analyzed consists of two 80-sample sub-frames. The samples of the
speech frame are windowed with a Hamming window, and 11 values of the autocorrelation function
(ACF) are calculated. Then 10 LPC coefficients are obtained by Durbin's algorithm. Ten Line
Spectrum Pair (LSP) coefficients are determined from the LPC coefficients and quantized, each
with its own codebook. Their indexes are part of the encoder output.
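The windowing, ACF and Durbin steps can be sketched in Python as follows. This is a textbook floating-point formulation, not the authors' DSP code. The function names and the white-noise test frame are assumptions made for illustration, and the LSP conversion is omitted.

```python
import math
import random


def hamming(n):
    """Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def autocorrelation(x, max_lag):
    """ACF values r[0..max_lag] of the (already windowed) frame x."""
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve for A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.

    Returns (a, err), where err is the final prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:          # silent or degenerate frame: stop early
            break
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err          # reflection coefficient of stage m
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= 1.0 - k * k
    return a, err


# Illustrative 160-sample frame (white noise stands in for speech).
rng = random.Random(0)
frame = [rng.uniform(-1.0, 1.0) for _ in range(160)]
window = hamming(len(frame))
r = autocorrelation([s * w for s, w in zip(frame, window)], 10)
lpc, residual_energy = levinson_durbin(r, 10)
```

With an order of 10 the recursion needs exactly the 11 ACF values mentioned above (lags 0 to 10).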

Proc. of the 3-rd Int. Conf. on Radiocommunication, Audio and Television Broadcasting, Ukraine, Odessa, 1997 Sep. 9-12, P. 431-433

For each sub-frame, in both the encoder and the decoder, 10 LSP coefficients are calculated
by interpolating between the previous and current, and between the current and next, LSP
coefficient sets. The sub-frame LSP coefficients are used to obtain the STP parameters. The LTP
parameters are estimated from the linear prediction residual, which is obtained by inverse
filtering of the previous speech frame and has good correlation properties. The pitch obtained is
refined by an adaptive codebook search. The adaptive codebook is a memory of the LTP output
driven by the actual excitation vectors selected from the stochastic excitation codebook in the
previous frame, and it is built in both the encoder and the decoder. The output of the current
sub-frame STP driven by the linear prediction residual is used as the target signal for finding
the optimal pitch. The indexes of the optimal adaptive codebook vectors and of the LTP
coefficients (each quantized with its own codebook), obtained for each sub-frame, are also part
of the encoder output.
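The adaptive codebook search can be illustrated with a simplified Python sketch. It assumes the common CELP formulation in which each candidate is the past excitation delayed by a trial lag, and the winner maximizes the squared correlation with the target divided by the candidate energy. The filtering of candidates through the STP is omitted for brevity, and the function names are hypothetical; the paper does not give these details.

```python
def adaptive_vector(past_excitation, lag, length):
    """Past excitation delayed by `lag`; repeated periodically if lag < length."""
    start = len(past_excitation) - lag
    return [past_excitation[start + (i % lag)] for i in range(length)]


def search_adaptive_codebook(target, past_excitation, min_lag, max_lag):
    """Return (best_lag, best_gain) maximizing corr^2 / energy over the lag range."""
    best_lag, best_gain, best_score = None, 0.0, -1.0
    for lag in range(min_lag, max_lag + 1):
        v = adaptive_vector(past_excitation, lag, len(target))
        corr = sum(t * x for t, x in zip(target, v))
        energy = sum(x * x for x in v)
        if energy == 0.0:
            continue                      # candidate carries no signal
        score = corr * corr / energy
        if score > best_score:            # ties keep the shortest lag
            best_lag, best_gain, best_score = lag, corr / energy, score
    return best_lag, best_gain
```

A pitch estimate taken from the residual, as described above, would let `min_lag` and `max_lag` bracket a narrow range around that estimate instead of the full lag range, which reduces the search cost.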
To determine the excitation signal, the stochastic codebook is searched using the
decorrelated linear prediction residual as the target signal. The indexes of the optimal
stochastic excitation vectors and of the gain coefficients (quantized with their own codebook),
obtained for each sub-frame, are also part of the encoder output.
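A stochastic codebook search of this kind can be sketched in Python as an analysis-by-synthesis loop. The sketch assumes that each candidate vector is passed through the all-pole STP 1/A(z) with zero initial state and compared with the target by the usual squared-correlation-over-energy criterion. The paper does not state the exact criterion, so this is an illustration with hypothetical function names, not the authors' method.

```python
def synthesize(excitation, a):
    """All-pole filter 1/A(z), A(z) = 1 + a[1] z^-1 + ..., zero initial state."""
    out = []
    for n, x in enumerate(excitation):
        y = x - sum(a[k] * out[n - k] for k in range(1, len(a)) if n - k >= 0)
        out.append(y)
    return out


def search_stochastic_codebook(target, codebook, a):
    """Return (best_index, best_gain) for the filtered vector closest to target."""
    best_index, best_gain, best_score = None, 0.0, -1.0
    for index, vector in enumerate(codebook):
        y = synthesize(vector, a)
        corr = sum(t * s for t, s in zip(target, y))
        energy = sum(s * s for s in y)
        if energy == 0.0:
            continue                       # all-zero candidate
        score = corr * corr / energy
        if score > best_score:
            best_index, best_gain, best_score = index, corr / energy, score
    return best_index, best_gain


# Tiny illustrative codebook and first-order filter.
codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
a = [1.0, -0.5]
target = [2.0 * s for s in synthesize(codebook[1], a)]
index, gain = search_stochastic_codebook(target, codebook, a)
```

Every candidate has to be filtered and correlated in each sub-frame, so the cost grows with codebook size times vector length. Sparse (mostly zero) and overlapping vectors reduce the work per candidate.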
The initial complexity of the model was approximately 53 MIPS.
Main characteristics of real-time realization
At first, the results showed that the model required very high processing power: up to
53 MIPS, whereas the ADSP-2181 delivers 33 MIPS (30 ns instruction cycle). Processing one
20 ms speech frame took up to 32 ms, so real-time realization was not possible. With the help of
powerful design tools, the most critical fragments of the algorithm were identified and changed,
and the model was revised and optimized. At the same time, processing speed was increased by
returning from the fixed-point model to floating-point arithmetic, which also gives higher
accuracy. The results exceeded all expectations. The new speech codec model requires less than
17.2 MIPS (roughly a threefold speed-up). The real-time realization of the new model has
the following features:
• The real-time realization of the speech codec is based on the low-cost Analog Devices
EZ-KIT-LITE ADSP-2181 debugging unit, which includes one ADSP-2181 DSP chip, an AD1847
stereo codec and a 27C010 boot EPROM;
• Analog and digital interfaces are optional;
• Only internal Program Memory (PM) RAM and internal Data Memory (DM) RAM are
required;
• PM RAM required to process one speech frame: 12963 words:
1. PM RAM for the Encoder module: 3671 words;
2. PM RAM for the Decoder module: 1817 words;
3. PM RAM for the initialization and control module: 412 words;
4. PM RAM for codebooks and coefficient tables: 5344 words;
5. PM RAM for dynamic variables: 1719 words;
• DM RAM required to process one speech frame: 15361 words:
1. DM RAM for static variables: 2273 words;
2. DM RAM for dynamic variables: 13088 words.
• Time required by the Encoder to process one speech frame (20 ms): 9.888 ms:
1. Floating-point division operations: 3.6%
2. Calculation of ACF and LPC coefficients: 0.7%
3. Calculation of LSP coefficients: 4.3%
4. Quantization of LSP coefficients: 0.3%
5. Interpolation of LSP coefficients: 0.2%
6. Calculation of LPC coefficients from LSP coefficients: 0.4%
7. Linear prediction residual analysis: 12.1%
8. Adaptive codebook search: 22.1%
9. Stochastic codebook search: 42.3%
10. Other operations: 13.8%
• Time required by the Decoder to process one speech frame (20 ms): 0.542 ms:
1. Error correction: 3.0%
2. Unpacking and codebook lookup: 0.8%
3. Interpolation of LSP coefficients: 8.9%
4. Calculation of LPC coefficients from LSP coefficients: 8.6%
5. Speech synthesis (LTP and STP): 64.0%
6. Adaptive codebook modification: 3.8%
7. Other operations: 10.9%
• Total time required by the Encoder and Decoder to process one speech frame (20 ms): 10.430 ms
• Required computational power: 17.2 MIPS
1. Encoder: 16.3 MIPS;
2. Decoder: 0.9 MIPS.
The results obtained lead to the following conclusions:
1. Most of the time is spent on determining the pitch (linear prediction residual analysis and
adaptive codebook search) and the optimal excitation signal (stochastic codebook search).
2. A free computational resource of 15.8 MIPS remains, which can be used for other tasks.
3. By our estimates, the results can be improved by a further 5-15%, which would make it
possible to realize two speech coders on one ADSP-2181 DSP.
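The load and memory figures quoted above are mutually consistent, as the following short Python check shows. It assumes that the required MIPS is simply the 33 MIPS peak rate scaled by the fraction of the 20 ms frame spent processing; the paper does not state the formula, but the published numbers match it. The helper name `required_mips` is hypothetical.

```python
DSP_MIPS = 33.0     # ADSP-2181 peak rate quoted in the text
FRAME_MS = 20.0     # speech frame duration


def required_mips(processing_ms, frame_ms=FRAME_MS, dsp_mips=DSP_MIPS):
    """Average MIPS needed if a frame takes `processing_ms` on the DSP."""
    return dsp_mips * processing_ms / frame_ms


encoder_mips = required_mips(9.888)            # about 16.3
decoder_mips = required_mips(0.542)            # about 0.9
total_mips = required_mips(9.888 + 0.542)      # about 17.2
free_mips = DSP_MIPS - total_mips              # about 15.8
initial_mips = required_mips(32.0)             # 52.8, i.e. about 53

# Memory budgets: the itemized entries add up to the stated totals.
pm_words = 3671 + 1817 + 412 + 5344 + 1719     # 12963
dm_words = 2273 + 13088                        # 15361
```

Both memory totals fit within 16384 words, assuming the 16K-word on-chip PM and DM sizes of the ADSP-2181. This is consistent with the statement that only internal memory is required.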
The present results may be compared with those obtained by Analogical Systems (USA)
for the LD-CELP (Low-Delay Code Excited Linear Prediction) speech coding algorithm,
corresponding to the fixed-point CCITT G.728 standard, where fixed-point arithmetic is used.
According to [8], a full-duplex realization of the LD-CELP codec on the ADSP-2171 DSP
requires 31 MIPS of computational power. It is interesting to note that the hardware and
software realization of a modem conforming to the CCITT V.32/V.32 bis standard, made by VoCAL
Technologies Ltd (USA) on the ADSP-2105 DSP, requires 10.2 (9.7) MIPS of computational
capacity, 870 words of internal PM plus 7050 words of external PM, 500 words of internal
DM plus 6730 words of external DM, and 6 kB of external memory for the echo canceller [8].
Thus, the proposed speech coder realization makes it possible to design a single
ADSP-2181 device that performs the functions not only of a speech coder but also, for example,
of a V.29 or V.32/V.32 bis modem.
Another way to use the free computational resource of the ADSP-2181 is to realize other
speech coding algorithms, in order to build a single-chip gateway between different coding
systems without converting the signal to analog form.

References

1. Korotaev G.A. Zarubezhnaya Radioelektronika, 1990, No. 3.
2. Korotaev G.A. Zarubezhnaya Radioelektronika, 1991, No. 7.
3. Korotaev G.A. Zarubezhnaya Radioelektronika, 1996, No. 3.
4. Mark D. Grosen. Texas Instruments, Inc., “Digital Signal Processing Applications with the
TMS320 Family”, 1990, V3.
5. Grant Davidson, Mei Yong, and Allen Gersho. Proc. IEEE Inter. Conf. on ASSP, Dallas, Apr.
1987.
6. Kenneth Zeger and Allen Gersho. Proc. IEEE Inter. Conf. on Communications, Seattle, June
1987.
7. Karen Bryden, Andre Brind’Amour and Hisham Hassanein. IEEE Pacific Rim Conference on
Communications, Computers and Signal Processing, June 1st-2nd, 1989. Conf. Proc. - New
York, 1989.
8. “Third Party DSP Developer. Directory.”, published by Analog Devices, 1996.

National Technical University of Ukraine “Kyiv Polytechnic Institute”, Theoretical Radio
Engineering Department, 37, Peremohy Prospect, Kyiv, 252065, Ukraine. opmail@mail.ru
