
Real Time Implementation of 16 Kbps CVSD Codec

Abstract
Continuously Variable Slope Delta (CVSD) modulation has been in use for over 30 years, particularly in military wireless environments, and has now also been adopted by Bluetooth. CVSD is particularly suitable for Internet and mobile environments because of its robustness against transmission errors, its simplicity of implementation, and the absence of any need for synchronization. This paper describes a real-time implementation of a 16 kbps CVSD speech-coding algorithm on the TMS320C6416 digital signal processor.

Introduction
CVSD is essentially a delta modulator (DM) with an adaptive quantizer. Applying adaptive techniques to a DM quantizer allows continuous adjustment of the step size. By adjusting the quantization step size, the coder can represent low-amplitude signals with greater accuracy (where it is needed) without sacrificing performance on large-amplitude signals. A block diagram of the CVSD codec is shown in Figure 1. The comparator in the CVSD encoder compares the band-limited analog input speech signal with an analog feedback approximation signal generated at the output of the reconstruction integrator. This comparison produces the digital error signal that is fed to the 3-bit shift register. The slope overload detection algorithm operates on the output of the 3-bit shift register using the run-of-three coincidence algorithm, which flags overload when the last three bits are identical.

Figure 1. Block diagram of the CVSD codec. [Diagram blocks: input samples, comparator, encoded output, 3-bit shift register (Z-1 delays), slope overload detector, syllabic integrator (coefficients Tsyl and 1-Tsyl, limits Dmin and Dmax), reconstruction integrator (coefficients Trec and 1-Trec), reconstructed samples, reconstruction filter, decoder output.]

For clock rates of 16 kHz and below, the 3-bit algorithm is well suited. For clock rates of 32 kHz and above, the 4-bit algorithm is preferred.

The syllabic filter acts as a low-pass filter for the output signal from the overload algorithm. The step-function response of the syllabic filter is related to the syllabic rate of speech, is independent of the sampling rate, and is exponential in nature. When the overload algorithm output is true, a charging curve applies. When this output is false, a discharging curve applies.

The pulse amplitude modulator (PAM) operates with two input signals: the output signal from the syllabic filter and the digital signal from the 3-bit shift register. The syllabic filter output determines the amplitude of the PAM output signal, and the signal from the 3-bit shift register is the polarity control that determines its direction. The reconstruction integrator operates on the PAM output to produce an analog feedback signal to the comparator that approximates the analog input signal.

Typical values for the syllabic and reconstruction integrator time constants are 5 to 12 ms and 0.5 to 1.5 ms, respectively. Dmin and Dmax are chosen according to the dynamic range, the maximum frequency, and the sampling frequency of the input signal. A low-pass reconstruction filter at the output of the receiving circuit eliminates most of the quantizing noise. Generally, the lower the bit rate, the better the filter must be.
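The loop described above can be sketched in a few lines of Python. This is a minimal floating-point illustration, not the code used in the paper: the class and function names, the default values of d_min and d_max, the chosen time constants, and the mapping from a time constant to a one-pole coefficient, exp(-1/(tau*fs)), are all assumptions made for the example. The exact gain scaling of the integrators in Figure 1 may differ.

```python
import math

def cvsd_coeff(tau_s, fs_hz):
    """One-pole coefficient for time constant tau_s at rate fs_hz (assumed mapping)."""
    return math.exp(-1.0 / (tau_s * fs_hz))

class CvsdState:
    """Feedback loop shared by encoder and decoder (hypothetical helper)."""
    def __init__(self, fs_hz=16000, tau_syl=0.005, tau_rec=0.001,
                 d_min=0.001, d_max=0.2, run=3):
        self.beta = cvsd_coeff(tau_syl, fs_hz)    # syllabic integrator coefficient
        self.alpha = cvsd_coeff(tau_rec, fs_hz)   # reconstruction integrator leak
        self.d_min, self.d_max, self.run = d_min, d_max, run
        self.step = d_min       # syllabic integrator output (current step size)
        self.estimate = 0.0     # reconstruction integrator output
        self.history = []       # contents of the shift register

    def update(self, bit):
        """Advance the loop by one bit and return the new signal estimate."""
        self.history = (self.history + [bit])[-self.run:]
        # Run-of-three coincidence: overload when the last `run` bits agree.
        overload = len(self.history) == self.run and len(set(self.history)) == 1
        target = self.d_max if overload else self.d_min
        # Charge toward d_max on overload, discharge toward d_min otherwise.
        self.step = self.beta * self.step + (1.0 - self.beta) * target
        # PAM: amplitude from the syllabic filter, polarity from the bit,
        # followed by leaky integration in the reconstruction integrator.
        pam = self.step if bit else -self.step
        self.estimate = self.alpha * self.estimate + pam
        return self.estimate

def cvsd_encode(samples, **params):
    """Return one bit per input sample."""
    state, bits = CvsdState(**params), []
    for x in samples:
        bit = 1 if x >= state.estimate else 0   # comparator
        bits.append(bit)
        state.update(bit)
    return bits

def cvsd_decode(bits, **params):
    """Rebuild the signal estimate from the bit stream (before any output filter)."""
    state = CvsdState(**params)
    return [state.update(b) for b in bits]

if __name__ == "__main__":
    fs = 16000
    tone = [0.15 * math.sin(2 * math.pi * 820 * n / fs) for n in range(fs // 10)]
    decoded = cvsd_decode(cvsd_encode(tone))
    print(len(decoded), "samples decoded")
```

Because the decoder runs the same update as the encoder's feedback path, it needs only the bit stream and the same parameters. The low-pass reconstruction filter mentioned above is not included in this sketch.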

Simulation of Algorithm
The CVSD algorithm was simulated to verify its performance. A floating-point simulation was done first, followed by a fixed-point simulation of the algorithm, and the results were compared.

Figure 2. SNR versus input level for CVSD. [Plot: SNR in dB versus signal power in dBm0 (with reference to a sine wave at 0.15 V), signal frequency 820 Hz; curves for the floating-point and fixed-point models.]

Figure 3. SNR versus input frequency for CVSD. [Plot: SNR in dB versus input sine wave frequency in Hz, reference signal a 0.15 V sine wave; curves for the floating-point and fixed-point models.]

Ideally, all algorithms would be implemented on floating-point processors; the rounding error after each operation is then negligibly small, and the hazard of numeric overflow is virtually eliminated. Unfortunately, floating-point processors are relatively expensive because of the larger chip needed to support the more complex operations, and their power consumption is higher than that of fixed-point processors. For cost- and power-sensitive consumer appliances (e.g., the cellular handset), a fixed-point processor is almost the mandatory choice.
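The rounding and overflow concerns mentioned above can be made concrete with a small sketch of 16-bit fractional arithmetic. The Q15 format, the helper names, and the round-to-nearest and saturation policies are assumptions chosen for illustration; the paper does not state which fixed-point format its model uses.

```python
Q15_ONE = 1 << 15                 # scale factor: 1.0 maps to 32768
Q15_MAX = Q15_ONE - 1             # largest representable value, 32767
Q15_MIN = -Q15_ONE                # smallest representable value, -32768

def sat16(value):
    """Clamp an integer to the signed 16-bit range instead of wrapping."""
    return max(Q15_MIN, min(Q15_MAX, value))

def to_q15(x):
    """Quantize a float in [-1, 1) to Q15, saturating at the range limits."""
    return sat16(int(round(x * Q15_ONE)))

def from_q15(q):
    """Convert a Q15 integer back to a float."""
    return q / Q15_ONE

def q15_mul(a, b):
    """Multiply two Q15 values: wide product, round to nearest, rescale."""
    return sat16((a * b + (1 << 14)) >> 15)

def q15_add(a, b):
    """Add two Q15 values with saturation to avoid wrap-around on overflow."""
    return sat16(a + b)

if __name__ == "__main__":
    amplitude = to_q15(0.15)
    print("0.15 in Q15:", amplitude, "->", from_q15(amplitude))
    print("rounding error:", abs(from_q15(amplitude) - 0.15))
    print("saturated sum:", q15_add(30000, 10000))
```

Each conversion or multiplication introduces an error of at most half a least significant bit, which is the kind of per-operation rounding that separates a fixed-point model from its floating-point counterpart.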

Figure 4. Typical SNR versus speech input power characteristics. [Plot: signal to quantization noise ratio in dB (0 to 10) versus speech input in dB (-45 to -5).]

The adaptive behavior of CVSD results in an SNR versus input level characteristic similar to that of nonuniform quantization. This nonlinear SNR characteristic is due to companding, where the quantization level (slope) is adjusted to a larger or smaller value according to past pitch changes of the input signal. Figure 2 shows the SNR versus input level characteristic (820 Hz sine wave) for 16 kb/s CVSD. Figure 3 shows the SNR versus input frequency characteristic. Figure 4 shows a typical plot of SNR versus speech input power.

Although SNR is the most widely used objective measure of speech coding performance, segmental SNR (SSNR) is considered a better perceptual model because it evaluates the quantization noise with respect to the signal energy in each underlying speech segment. SSNR was compared for the floating-point and fixed-point simulations using a segment size of 160 and 60 segments. The SSNR values are comparable in both cases.

Table 1. Comparison of segmental SNR for the fixed-point and floating-point simulation models
Model                                               Segmental SNR (dB)
Floating-point model (no reconstruction filter)     8.43
Floating-point model (with reconstruction filter)   11.2
Fixed-point model (no reconstruction filter)        7.63
Fixed-point model (with reconstruction filter)      10.6
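A segmental SNR of the kind reported in Table 1 can be computed as the mean of the per-segment SNR values in dB. The sketch below assumes that the segment size of 160 refers to samples; the function name, the eps guard against silent segments, and the absence of per-segment clamping are choices made for this example and are not taken from the paper.

```python
import math

def segmental_snr_db(reference, decoded, seg_len=160, eps=1e-12):
    """Average the per-segment SNR (in dB) over all complete segments."""
    n_seg = min(len(reference), len(decoded)) // seg_len
    if n_seg == 0:
        raise ValueError("need at least one complete segment")
    total_db = 0.0
    for k in range(n_seg):
        ref = reference[k * seg_len:(k + 1) * seg_len]
        dec = decoded[k * seg_len:(k + 1) * seg_len]
        signal_energy = sum(r * r for r in ref)
        noise_energy = sum((r - d) ** 2 for r, d in zip(ref, dec))
        # eps keeps the ratio finite for silent or perfectly coded segments.
        total_db += 10.0 * math.log10((signal_energy + eps) / (noise_energy + eps))
    return total_db / n_seg

if __name__ == "__main__":
    clean = [1.0] * 320
    coded = [0.9] * 320
    print(round(segmental_snr_db(clean, coded), 3), "dB")
```

Averaging in the dB domain gives quiet segments the same weight as loud ones, which is why SSNR tracks perceived quality more closely than a single SNR computed over the whole utterance.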

Hardware Details
The CVSD algorithm has been implemented on a Texas Instruments DSP Starter Kit (DSK) with a TMS320C6416 DSP. The TMS320C6416 is a fixed-point DSP intended primarily for audio and video applications. A simplified block diagram of its internal architecture is shown in the figure below. The C6416 is a VLIW processor with a clock rate of 720 MHz. It has 64 general-purpose registers of 32-bit word length and eight highly independent functional units: two multipliers for a 32-bit result and six arithmetic logic units.

Internal architecture of the TMS320C6416. [Diagram blocks: VLIW CPU, L1P cache (16 Kbytes), L1D cache (16 Kbytes), L2 cache/internal memory (1024 Kbytes), EDMA controller (64 channels), EMIF, timer, McBSP, UTOPIA, GPIO, interrupt selector.]

The C64x uses a two-level cache-based architecture and has a powerful and diverse set of peripherals. The Level 1 program cache (L1P) is a 128-Kbit direct-mapped cache, and the Level 1 data cache (L1D) is a 128-Kbit 2-way set-associative cache. The Level 2 memory/cache (L2) consists of an 8-Mbit memory space that is shared between program and data space. The peripheral set includes McBSPs, an 8-bit UTOPIA port, three 32-bit general-purpose timers, and a user-configurable HPI, GPIO and PCI interface.

Implementation Scheme
The implementation scheme uses the McBSP (Multichannel Buffered Serial Port) and the EDMA (Enhanced Direct Memory Access) controller to handle data transfer efficiently without intervention from the DSP. Audio data is transferred to and from the codec through McBSP2, a bidirectional serial port, at a rate of 16 kHz, the CVSD sample rate. The EDMA controller takes incoming audio data directly from McBSP2 and places it in a memory buffer. It also takes data from a memory buffer and sends it to McBSP2 to generate the audio output. Separate EDMA channels are used to transmit and receive audio data.
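Implementation scheme of the CVSD codec. [Diagram: input speech enters the audio codec (ADC) and reaches the McBSP at 16 Ksps; EDMA_receive() fills the receive memory buffers RcvBuffPing and RcvBuffPong with 90 ms frames (1440 samples); Process_Buffer() runs the encoder and decoder; EDMA_transmit() sends the transmit memory buffers XmtBuffPing and XmtBuffPong back through the McBSP to the audio codec (DAC) as decoded speech.]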
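The ping and pong buffer pairs named in the implementation scheme suggest a double-buffering arrangement, in which the EDMA fills and drains one half while the CPU processes the other. The Python simulation below is a sketch of that idea only: the function run_ping_pong, its sequential model of one frame period, and the two-frame output latency are assumptions, and it does not use or imitate any TI driver API.

```python
from itertools import islice

FRAME_SAMPLES = 1440   # 90 ms at 16 kHz, the frame size used in the paper

def run_ping_pong(samples, process_buffer, frame=FRAME_SAMPLES):
    """Simulate double buffering; returns the stream sent to the DAC."""
    rcv = [[0] * frame, [0] * frame]   # stand-ins for RcvBuffPing, RcvBuffPong
    xmt = [[0] * frame, [0] * frame]   # stand-ins for XmtBuffPing, XmtBuffPong
    out = []
    dma = 0                            # half currently owned by the EDMA
    stream = iter(samples)
    while True:
        chunk = list(islice(stream, frame))
        if len(chunk) < frame:         # drop an incomplete trailing frame
            break
        cpu = 1 - dma                  # the other half belongs to the CPU
        out.extend(xmt[dma])           # transmit channel drains its half
        rcv[dma][:] = chunk            # receive channel fills its half
        # Meanwhile the CPU encodes and decodes the frame received earlier.
        xmt[cpu][:] = process_buffer(rcv[cpu])
        dma = cpu                      # swap roles at the frame boundary
    return out

if __name__ == "__main__":
    passthrough = lambda buf: list(buf)
    result = run_ping_pong(range(16), passthrough, frame=4)
    print(result)
```

In this model a frame received in one period is processed in the next and transmitted in the one after, so the processing routine has a full frame period to finish before its buffer is needed again.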

Implementation Results
Analyzing an algorithm means predicting the resources needed to run it. These resources are usually measured in terms of memory and computation, the two fundamental cost components of digital hardware. An input speech frame of 90 ms (corresponding to 1440 speech samples) was used. The performance of the CVSD codec is listed in Table 2.

Table 2. CVSD codec performance and code size results on the TMS320C6416 DSK

Codec    Execution time (ms)    PM (KB)    DM (KB)
CVSD     23                     14         34
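The reported figures can be turned into a simple real-time budget. The short calculation below uses only numbers stated in the paper (90 ms frame, 23 ms execution time, 16 kHz sample rate, 16 kbps bit rate); the function name and the derived quantities are illustrative and are not results reported by the authors.

```python
def realtime_budget(frame_ms, exec_ms, fs_hz, bit_rate_bps):
    """Derive per-frame sizes and processor load from the reported figures."""
    samples_per_frame = fs_hz * frame_ms // 1000
    bits_per_frame = bit_rate_bps * frame_ms // 1000
    return {
        "samples_per_frame": samples_per_frame,
        "encoded_bytes_per_frame": bits_per_frame // 8,
        "cpu_load": exec_ms / frame_ms,        # fraction of the frame period
        "headroom_ms": frame_ms - exec_ms,     # time left before the next frame
    }

if __name__ == "__main__":
    budget = realtime_budget(frame_ms=90, exec_ms=23, fs_hz=16000, bit_rate_bps=16000)
    for name, value in budget.items():
        print(name, value)
```

An execution time well below the frame period is what makes the implementation real time: processing of one frame finishes before the next frame has been received.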

Conclusion
The CVSD algorithm was simulated in both floating point and fixed point. SNR versus signal power, SNR versus input frequency, and segmental SNR were computed for both cases and found to be comparable. Finally, the fixed-point implementation of CVSD was carried out on a Texas Instruments TMS320C6416-based DSK.
CVSD has several attributes that make it well suited to digital coding of speech. One-bit words eliminate the need for complex framing schemes. Robust performance in the presence of bit errors makes error detection and correction hardware unnecessary. Despite this simplicity, CVSD has enough flexibility to allow digital encryption for secure applications. Finally, CVSD can operate over a wide range of data rates.

