4GV White Paper

4GV – Fourth Generation Vocoder

Overview
QUALCOMM’s Fourth Generation Vocoder (4GV™) is an integrated, core-based speech
coding technology that lets service providers dynamically prioritize voice quality or
network capacity. It is designed around CDMA2000® 1X and HRPD networks with
spectral efficiency in mind. 4GV can operate in narrowband and wideband modes and
can transition between circuit-switched and packet-switched voice services. It offers
notably higher network capacity (up to 40 percent compared with EVRC) and improves
the overall quality of end-to-end voice services, including delay-sensitive applications
such as Voice over Internet Protocol (VoIP). All of these features are delivered with just
a software upgrade. Capacity gains are even greater when 4GV is deployed in
conjunction with other capacity-saving schemes such as Receive Diversity.

It is a standardized technology (3GPP2 and IETF), published under the EVRC family of
codecs as shown in Figure 1. It is supported by major carriers and infrastructure vendors.
It is available today across many Qualcomm Mobile Station Modem™ (MSM™)
chipsets, with commercial network deployment scheduled for around early 2007.

[Figure 1 places the codecs on two axes, "Better Voice Quality (MOS)" and "Higher
Voice Capacity (%Erlangs)". Codecs shown: QC8k (8.55 kbps, SO1), QC13k (13.3 kbps,
SO17), EVRC (8.55 kbps, SO3), EVRC-B (8.55 kbps, SO68) and EVRC-WB (8.55 kbps,
SO70).]

Figure 1: 4GV-4th Generation Vocoder standardized as EVRC family of codecs


Emerging Market in Cellular Network

Technologies are continuously evolving to meet the emerging needs of wireless
communications. There is a growing demand for wireless communication that requires
greater capacity and higher network efficiency. Spectral efficiency is a key driver of the
economics of voice and data services.

Mobile communication is especially challenging because of its real-time constraints,
the time-varying transmission characteristics of wireless channels, and the dynamic
quality of service (QoS) requirements associated with speech signals, such as variable
frame rates, prioritized delivery of important units, and variable tolerance to packet
errors.

Rich multimedia content, advanced device capabilities and broadband user throughputs
are driving the demand for high-speed wireless data solutions. Sector throughput is a
factor in determining the types of services an operator can deliver, and data is becoming
a significant source of revenue for operators. Although the throughput of bursty
high-speed packet data transmission has improved considerably, the voice capacity of
circuit-switched dedicated channels has remained largely unchanged. As user demand
for data-intensive applications continues to grow, 4GV gives network operators the
ability to carry more voice traffic on their existing networks while leveraging their
investments in 3G networks.

Voice Capacity Trend

Operators around the world are witnessing exponential growth in subscribers, and
demand for conversational voice services remains very strong. Anticipating this demand,
operators are asking for greater efficiency and versatility in their voice-centric
networks so that they can absorb the growth without sacrificing overall
performance. The more subscribers and minutes/MBytes that can be carried over a cell
site, the lower the operator’s cost per minute/MByte.

Operators often face the dilemma of how to manage their investment while
expanding cell site coverage to keep up with subscriber growth. They are
often challenged to justify additional infrastructure costs while offering service at an
affordable Average Revenue Per User (ARPU) and remaining profitable.

Furthermore, IP phones are proliferating as the rollout of 3G wireless handsets
approaches. This technology evolution has created a tremendous opportunity to
provide better-than-toll-quality voice services, along with rich media call features, in the
next-generation telephone network. This improvement in audio performance is enabled
by the availability of more advanced Digital Signal Processors (DSPs) that can compress
wideband speech signals and supersede the existing band-limited 4 kHz telephone
network while maintaining transmission bandwidth efficiency.

Experience suggests that one near-term, cost-effective solution is to concentrate
on a spectrally efficient suite of speech codecs that operate at a lower average data rate
during active speech and that introduce enhanced voice services. This paper describes
such a technology in more detail.

4GV Technology

4GV™ is Qualcomm’s fourth-generation, core-based suite of voice codec technologies,
designed to meet the requirements of present and future CDMA2000® voice services.
4GV is a source-controlled variable-rate speech codec operating over the mature
Rate Set 1 (RS1) multiplex sub-layer [1] using full-rate (rate-1), half-rate (rate-1/2),
quarter-rate (rate-1/4) and eighth-rate (rate-1/8) frames, similar to the existing Enhanced
Variable Rate Codec (EVRC), service option 3 [2]. As shown in Figure 2, the 4GV
technology suite consists of the 4GV narrowband circuit-switched codec, the 4GV
wideband circuit-switched codec and the 4GV Voice over Internet Protocol (VoIP)
packet-switched codec. Each of these technologies is described in more detail below.

[Figure 2 shows the 4GV Core at the center, linked to three technologies: Circuit
Switched Narrow Band, Circuit Switched Wide Band, and VoIP (NB&WB).]

Figure 2: 4GV Core based suite of Technologies


4GV Narrowband (4GV-NB) Circuit Switched Codec (EVRC-B)

The 4GV-NB (EVRC-B) circuit-switched codec introduces a new paradigm: network
providers can control the codec’s Capacity Operating Points (COPs) dynamically, based
on geographical location (hotspots) and/or time of day (on/off-peak hours).

With its unique design, 4GV-NB fully utilizes the CDMA2000® rate configuration while
placing more emphasis on coding efficiency. By operating in different modes, it can
increase current network voice capacity by up to 40% compared with EVRC.

4GV-NB is a new 3GPP2 standard codec built on the existing EVRC-A codec. It is
published as C.S0014-B, Enhanced Variable Rate speech codec, service option 3 and 68
[3]. The term 4GV remains a Qualcomm trademark referring to the suite of codecs based
on the same core technology. In this document the term “4GV-NB” is used
interchangeably with its new standard name, “EVRC-B”. It is software upgradeable across
many Qualcomm MSM-1x products and is a cost-effective solution for operators.

4GV-NB introduces different capacity operating points (COPs) to trade off voice quality
against capacity. The major difference between 4GV-NB and EVRC is that EVRC never
uses rate-1/4 and essentially operates at a fixed capacity operating point, whereas
4GV-NB makes full use of rate-1/4 to code the active speech signal as efficiently as
possible. 4GV-NB provides many different capacity operating points through a unique
rate control scheme that varies the mix of rate configurations and their percentage usage
over a given period. The higher the percentage of lower bit-rate frames (i.e. rate-1/4), the
lower the average data rate and, as a result, the higher the network voice capacity.

The COP is controlled dynamically by the network resource manager entity according to
the instantaneous capacity requirement of the network, and it can be set even for an
individual user in a particular geographical location and/or for a given time of day. The
forward link COP can be set to any arbitrary value, but the reverse link COP is limited to
eight preset values, as described later. The network can request any of these
operating points by using existing CDMA2000 call processing protocols [3].

4GV-NB Forward link capacity

In a cdma2000® system, one of the major factors influencing the forward link capacity is
the availability of a base station’s (BS) transmitter (Tx) power. Assuming enough Walsh
codes are available, blocking occurs when the BS does not have sufficient Tx power to
support any additional users at their specified target Frame Erasure Rate (FER). For voice
calls, the instantaneous Tx power is proportional to the transmitted data rate.

Transmitting the data symbols of one half-rate frame (4800 bps) requires approximately
3 dB less power than transmitting one full-rate frame (9600 bps). Similarly, the Tx power
of a quarter-rate frame is lower by a further 3 dB, and so on. Thus, the average Tx power
required to support a voice call is proportional to the average data rate of the speech
codec. Because the total available BS Tx power is limited, a reduction in the Tx power
required for one user translates into more power available for supporting other users.
Thus, forward link capacity is inversely proportional to the BS Tx power needed for each
user.
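
The 3 dB steps follow directly from the halving of the bit rate. The short Python sketch below illustrates this under the assumption, stated in the text, that Tx power scales with the transmitted rate. The full-rate and half-rate values come from the paragraph above; the quarter-rate and eighth-rate values are assumed by continued halving, and the function names are illustrative only.

```python
import math

# Rate Set 1 channel rates in bits per second. The full- and half-rate values
# are quoted in the text; the quarter- and eighth-rate values are ASSUMED here
# by halving, which matches the "a further 3 dB" statement.
RS1_CHANNEL_RATE_BPS = {"full": 9600, "half": 4800, "quarter": 2400, "eighth": 1200}


def relative_tx_power_db(rate_bps, reference_bps=9600):
    """Tx power relative to a full-rate frame, assuming power scales with rate."""
    return 10.0 * math.log10(rate_bps / reference_bps)


def average_relative_power(frame_mix):
    """Average linear Tx power (full rate = 1.0) for a mix of frame types.

    frame_mix maps a frame type to the fraction of frames of that type.
    """
    return sum(
        share * RS1_CHANNEL_RATE_BPS[name] / RS1_CHANNEL_RATE_BPS["full"]
        for name, share in frame_mix.items()
    )


for name, rate in RS1_CHANNEL_RATE_BPS.items():
    print(f"{name:8s} {relative_tx_power_db(rate):6.2f} dB")
```

A call that spends half of its frames at full rate and half at half rate would, under this model, need three quarters of the average power of an all-full-rate call, which is the sense in which a lower average data rate frees BS power for other users.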

The forward link vocoder average data rate or its operating point is controlled and
managed by the network resource manager. The network resource manager can select any
arbitrary operating point for any user at any time, depending on the instantaneous forward
link capacity requirement. The forward link selected operating point for a particular user
is totally independent of the reverse link selected operating point.

Table 1 compares 4GV-NB and EVRC across a range of forward link operating points in
CDMA2000. Each operating point is expressed as an active speech average data rate
(silence periods, i.e. eighth-rate frames, are removed), which is further quantified by
the percentage usage of full-rate, half-rate and quarter-rate frames. The detailed analysis
is given in [5]; it is based on the forward link power gain for a given number of Erlangs
and is shown in incremental steps of one additional channel element (line).

Codec    Channel   Erlangs  Erlangs    Capacity  % Full-  % Half-  % Quarter-  Active Speech  Active Speech  Capacity Gain
         Elements           including  Gain Due  Rate     Rate     Rate        Vocoder        Actual         over EVRC
         (Lines)            handoff    to Power                                Average        Average        in % Erlangs
                                                                               Data Rate      Data Rate
EVRC     36.0      27.1     48.7       1.000     0.454    0.026    0.000       8.300          9.336          0.000
4GV Narrowband Continuous Capacity Operating Points:
         36.0      27.1     48.7       1.000     0.454    0.026    0.000       8.300          9.336          0.000
         37.0      28.3     50.9       1.028     0.428    0.026    0.026       7.946          8.947          4.428
         38.0      29.2     52.5       1.056     0.403    0.026    0.051       7.610          8.578          7.801
         39.0      30.1     54.1       1.083     0.380    0.026    0.074       7.292          8.228          11.181
         40.0      31.0     55.8       1.111     0.358    0.026    0.096       6.990          7.896          14.569
         41.0      31.9     57.4       1.139     0.337    0.026    0.117       6.702          7.580          17.964
         42.0      32.8     59.1       1.167     0.321    0.019    0.127       6.586          7.449          21.365
         43.0      33.8     60.8       1.194     0.302    0.019    0.146       6.317          7.154          24.773
         44.0      34.7     62.4       1.222     0.284    0.019    0.164       6.061          6.872          28.187
         45.0      35.6     64.1       1.250     0.266    0.019    0.182       5.816          6.603          31.607
         46.0      36.5     65.8       1.278     0.249    0.019    0.198       5.582          6.346          35.032
         47.0      37.5     67.4       1.306     0.233    0.019    0.214       5.358          6.099          38.463
         48.0      38.4     69.1       1.333     0.218    0.019    0.230       5.143          5.863          41.899

Table 1: Different 4GV ADR/COPs values as a function of increase in channel elements
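
The two average data rate columns of Table 1 can be approximated from the frame usage columns. The Python sketch below is one way to do so, under two assumptions that are not stated in the paper: the usage columns are read as fractions of all frames (they sum to less than 1 because eighth-rate frames are excluded), and the per-frame rates are 8.55/4.0/2.0 kbps for the vocoder and 9.6/4.8/2.4 kbps on the channel. Only the 8.55, 9.6 and 4.8 kbps figures appear in the paper; the others were chosen because they agree with the tabulated values to within about 0.01 kbps for the rows checked.

```python
# Per-frame bit rates in kbps. 8.55 (full-rate source) and 9.6 / 4.8 (channel)
# appear in the paper; the remaining values are ASSUMPTIONS that happen to
# agree with Table 1.
SOURCE_KBPS = {"full": 8.55, "half": 4.0, "quarter": 2.0}
CHANNEL_KBPS = {"full": 9.6, "half": 4.8, "quarter": 2.4}


def active_speech_adr(frame_share, rate_kbps):
    """Average data rate over active speech frames only.

    frame_share maps a frame type to its fraction of ALL frames; eighth-rate
    (silence) frames are left out, so the shares are renormalised here.
    """
    active_share = sum(frame_share.values())
    if active_share <= 0:
        raise ValueError("frame mix contains no active speech frames")
    weighted = sum(share * rate_kbps[name] for name, share in frame_share.items())
    return weighted / active_share


# Frame usage taken from the first and last rows of Table 1.
EVRC_MIX = {"full": 0.454, "half": 0.026, "quarter": 0.000}
LINES_48_MIX = {"full": 0.218, "half": 0.019, "quarter": 0.230}

for label, mix in (("EVRC, 36 lines", EVRC_MIX), ("4GV-NB, 48 lines", LINES_48_MIX)):
    vocoder = active_speech_adr(mix, SOURCE_KBPS)
    actual = active_speech_adr(mix, CHANNEL_KBPS)
    print(f"{label}: vocoder ADR {vocoder:.2f} kbps, actual ADR {actual:.2f} kbps")
```

Shifting usage from full-rate to quarter-rate frames is what moves the operating point down the table; the half-rate share barely changes.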


4GV-NB Reverse link capacity

The reverse link of a cdma2000 system is limited by the level of multiple access
interference. Because all users share a common frequency spectrum, each user’s signal
interferes with the signals of other users. Blocking occurs when the noise plus
interference level due to a particular user exceeds the background thermal noise level by
a specified level. Above this blocking interference-to-noise level, known as outage rise-
over-thermal, the addition of only one user produces a significant increase in
interference. This occurs when in response to the interference increase of one user, other
users, in turn, raise their Tx power, thereby increasing interference to other users. Such
an occurrence potentially results in system instability. Thus, to guarantee stability, the
outage rise-over-thermal level is typically limited to the range of 6 dB to 10 dB. The
Erlang capacity of the system is measured by the average traffic load corresponding to
the number of active users causing blocking with the designated blocking probability.
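
The paper does not give a formula for rise-over-thermal. A common textbook approximation, used here purely as an illustration and not taken from the source, relates it to the fractional reverse link load η as 1/(1 − η). The Python sketch below uses that assumed relation to show why the rise grows so quickly near full load; the function names are hypothetical.

```python
import math


def rise_over_thermal_db(load):
    """Rise-over-thermal in dB for a fractional load in [0, 1).

    Uses the ASSUMED textbook relation RoT = 1 / (1 - load).
    """
    if not 0.0 <= load < 1.0:
        raise ValueError("load must be in the range [0, 1)")
    return -10.0 * math.log10(1.0 - load)


def load_for_rise_db(rot_db):
    """Fractional load that produces a given rise-over-thermal in dB."""
    if rot_db < 0.0:
        raise ValueError("rise-over-thermal cannot be negative")
    return 1.0 - 10.0 ** (-rot_db / 10.0)


for rot_db in (6.0, 8.0, 10.0):
    print(f"{rot_db:4.1f} dB rise corresponds to a load of {load_for_rise_db(rot_db):.3f}")
```

Under this model the 6 dB to 10 dB outage range corresponds to loads of roughly 75% to 90%, a region where each additional user raises the interference level by a progressively larger amount.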

The vocoder average data rate on the reverse link is controlled by selecting one of eight
preset capacity operating points. The network controls the reverse link operating point
by selecting the desired COP using the existing CDMA2000 call processing signaling
protocol. The eight possible COPs are specified by the 3-bit RATE_REDUC field of the
Service Option Control Message (SOCM) [4]. Table 2 lists these eight operating points
along with their respective average data rates. The mobile station maintains the
requested operating point/average data rate until it is instructed to do otherwise.

RATE_REDUC   Encoder Capacity   Estimated average encoding   Estimated average encoding
(binary)     Operating Point    rate for active speech       rate for active speech
             (COP)              (channel encoding rates,     (source encoding rates,
                                kbps)                        kbps)
‘000’        0                  9.3                          8.3
‘001’        1                  8.5                          7.57
‘010’        2                  7.5                          6.64
‘011’        3                  7.0                          6.18
‘100’        4                  6.6                          5.82
‘101’        5                  6.2                          5.45
‘110’        6                  5.8                          5.08
‘111’        7 (1/2 rate max)   4.8                          4.0

Quarter-rate frames are not used when RATE_REDUC = ‘000’ or RATE_REDUC = ‘111’.
Therefore, these two settings could be used in IS-95 systems where quarter-rate frames
are sometimes disallowed.

Table 2: Service Option 68 Reverse Link Encoding Rate Control Parameters
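
For illustration, the mapping in Table 2 can be held as a simple lookup keyed by the 3-bit RATE_REDUC value. The Python sketch below copies the rates from the table; the type and function names are hypothetical, and it does not model the Service Option Control Message itself.

```python
from typing import NamedTuple


class OperatingPoint(NamedTuple):
    cop: int             # encoder capacity operating point
    channel_kbps: float  # estimated average channel encoding rate, active speech
    source_kbps: float   # estimated average source encoding rate, active speech


# Values copied from Table 2; the key is the 3-bit RATE_REDUC field value.
COP_TABLE = {
    0b000: OperatingPoint(0, 9.3, 8.3),
    0b001: OperatingPoint(1, 8.5, 7.57),
    0b010: OperatingPoint(2, 7.5, 6.64),
    0b011: OperatingPoint(3, 7.0, 6.18),
    0b100: OperatingPoint(4, 6.6, 5.82),
    0b101: OperatingPoint(5, 6.2, 5.45),
    0b110: OperatingPoint(6, 5.8, 5.08),
    0b111: OperatingPoint(7, 4.8, 4.0),  # half-rate maximum
}


def decode_rate_reduc(field):
    """Return the operating point for a 3-bit RATE_REDUC value."""
    if field not in COP_TABLE:
        raise ValueError("RATE_REDUC must be a 3-bit value (0..7)")
    return COP_TABLE[field]


def uses_quarter_rate(field):
    """Per the note under Table 2, '000' and '111' do not use quarter-rate frames."""
    decode_rate_reduc(field)  # validates the field
    return field not in (0b000, 0b111)


point = decode_rate_reduc(0b100)
print(f"COP {point.cop}: {point.source_kbps} kbps source, {point.channel_kbps} kbps channel")
```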


4GV-NB Experimental Result
Tests were performed in the lab to measure the achievable forward and reverse link
capacity gains of 4GV-NB (EVRC-B), relative to EVRC, for capacity operating points 0, 4
and 6 as given in Table 2. Please refer to [6] for more detailed information on this
experiment.

The lab setup consists of a phone under test connected to a base station simulator. For
each test, the BS initiates a voice call using either service option 3 (EVRC) or service
option 68 (EVRC-B). The BS reads frames from a pre-calculated packet file and transmits
the specified frame type (i.e. full, half, quarter, or eighth-rate) on the Forward
Fundamental Channel (F-FCH).

The forward link test uses forward power control (FPC) mode 0 to power control to a
target FER of 1%. Tests are run under various Radio Frequency (RF) channel conditions
with different speech databases. Table 3 shows an example forward link capacity test
with 3-path channel conditions in Rayleigh fading at a speed of 100 km/h. The source
speech file consists of approximately seven minutes of properly enunciated speech
spoken by several people, commonly referred to as “Harvard sentence pairs”.

Vocoder     F-FCH Gain (dB)   % Change Gain   % Change Capacity
EVRC        -20.02            0.00%           0.00%
4GV-COP0    -20.08            -1.28%          1.29%
4GV-COP4    -21.93            -26.94%         36.87%
4GV-COP6    -22.93            -35.61%         55.29%
Table 3: Forward Link Capacity Results, 1/8 rate gating disabled
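
The last two columns of Table 3 are consistent with the inverse proportionality between forward link capacity and per-user Tx power described earlier: if each user needs a fraction (1 + Δ) of the original power, the same power budget supports 1/(1 + Δ) times as many users. The Python sketch below applies that reading to the "% Change Gain" column. This relation is inferred from the tabulated numbers rather than stated in the paper, and the function name is illustrative.

```python
def capacity_change_pct(power_change_pct):
    """Capacity change implied by a per-user power change, both in percent.

    ASSUMES capacity is inversely proportional to per-user Tx power.
    """
    ratio = 1.0 + power_change_pct / 100.0
    if ratio <= 0.0:
        raise ValueError("power change must be greater than -100%")
    return (1.0 / ratio - 1.0) * 100.0


# "% Change Gain" values from Table 3.
for vocoder, change in (("4GV-COP0", -1.28), ("4GV-COP4", -26.94), ("4GV-COP6", -35.61)):
    print(f"{vocoder}: {capacity_change_pct(change):.1f}% capacity change")
```

The results agree with the "% Change Capacity" column to within rounding of the published figures.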

For the reverse link tests, the initial transmit power of the forward link is set to a high
value and forward power control is disabled in order to achieve 0% FER on the forward
link. Frames received on the forward link are looped back by the mobile and transmitted
on the reverse link. Having no frame errors on the forward link ensures that the reverse
link carries the same frame type distribution as that transmitted by the BS. Reverse
power control is enabled, with a target FER of 1%. Table 4 shows a reverse link capacity
test with 2-path channel conditions in Rayleigh-JTC fading at a speed of 100 km/h, using
the same source speech file as in the forward link experiment.

Outage level:   6 dB              7 dB              8 dB              9 dB              10 dB
Vocoder         Erlangs  % Gain   Erlangs  % Gain   Erlangs  % Gain   Erlangs  % Gain   Erlangs  % Gain
EVRC            34.34    0.00%    37.06    0.00%    39.22    0.00%    40.96    0.00%    42.31    0.00%
4GV-COP0        34.43    0.25%    37.15    0.24%    39.32    0.25%    41.06    0.25%    42.42    0.25%
4GV-COP4        43.33    26.19%   46.73    26.08%   49.43    26.03%   51.58    25.95%   53.30    25.97%
4GV-COP6        46.84    36.41%   50.49    36.25%   53.41    36.17%   55.73    36.08%   57.57    36.07%
Table 4: Reverse Link Erlang Capacity Gains 1/8 rate gating disabled
4GV Wideband (4GV-WB) Circuit Switched Codec (EVRC-WB)

Traditional telecommunication networks use narrowband speech signals, sampled at 8
kHz, with a frequency band in the range of 300 Hz to 3400 Hz. This bandwidth limitation
was inherited from the frequency response of the analog telephone hybrid coil used in
the Public Switched Telephone Network (PSTN). For years subscribers have been
accustomed to this band-limited voice quality, often referred to as “toll quality”. More
recently, a significant improvement in perceived speech quality and intelligibility has
been obtained by using wideband speech signal processing. A wideband speech codec
samples the speech signal at 16 kHz and covers a larger frequency band, from 50 Hz to
7-8 kHz. A new era is therefore underway that moves beyond the conventional PSTN
and its band-limited toll quality. However, coding such a broader spectrum generally
requires more transmission bandwidth than existing narrowband codecs, which to some
extent conflicts with operators’ desire to maintain their bandwidth efficiency.

4GV wideband (4GV-WB) is the part of our 4GV core-based suite of technologies that
addresses this requirement. It offers greater quality and intelligibility than a
narrowband codec, with a capacity impact no worse than that of the existing
narrowband (EVRC) codec. 4GV-WB operates over the mature Rate Set 1 (RS1) network
and uses the same rate configuration as EVRC, without the need for additional network
planning. It is designed around the 4GV core and as such reuses the majority of the
4GV-NB code. It takes bits from the narrowband allocation to form an embedded
Low-band (LB) and High-band (HB) configuration, as shown in Figure 3. 4GV-LB codes
the narrowband spectrum (0-4 kHz), while 4GV-HB codes the wideband spectrum
(4-7.2 kHz).

[Figure 3 shows two packet layouts. EVRC-B: 4GV-NB (170), followed by 1. EVRC-WB:
4GV-LB (154) and 4GV-HB (16), followed by 1.]
Figure 3: 4GV-WB Embedded Rate Configuration

Maintaining the wideband characteristics of the speech signal in a cellular network
requires a point-to-point (mobile-to-mobile) connection without any vocoder
transcoding in the network. The present CDMA2000-1x network relies on a 64 kbps
Pulse Code Modulation (PCM) A-interface between the Base Station Controller (BSC)
and the Mobile Switching Center (MSC), and as such will not work properly for wideband
circuit-switched services. Deployment of wideband technology over a cellular network
therefore requires transcoder free operation (TrFO) or tandem free operation (TFO) [7]
in the network, which is outside the scope of this paper.

4GV-WB is a new 3GPP2 standard codec built on the EVRC-A and EVRC-B codecs. It is
published as C.S0014-C, Enhanced Variable Rate speech codec, service option 3, 68 and
70 [4]. The Mobile Station (MS) is granted the wideband service option at call setup if
the network has TrFO capability for mobile-to-mobile calls. Unlike 4GV-NB, 4GV-WB
offers a single operating mode without any capacity/quality tradeoff, because of the
limited number of bits available in RS1 for coding a wideband signal. The embedded LB
and HB design makes network deployment of wideband voice services faster and more
robust, for example when handling occasional 3-way conference calls. In that situation
the network needs to implement only the LB portion of the 4GV-WB codec: it combines
the signals of the three calls using a narrowband PCM mixer and then converts all three
parties to narrowband calls. This is the only scenario in which mobile-to-mobile
wideband calls have to be converted to narrowband calls; otherwise, wideband voice is
preserved throughout mobile-to-mobile calls.

Furthermore, all call processing and supervisory tones can remain narrowband, because
the 4GV-WB decoder is able to recognize and decode 4GV-NB packets. Operators can,
however, optionally take advantage of the fact that 4GV-WB codes audio/music more
accurately and use it for services such as customized in-band ringback tones.

4GV Packet Switched VoIP (EVRCB and EVRCWB)

4GV Packet Switched Voice over Internet Protocol (4GV-VoIP) is the part of our 4GV
core-based suite of technologies in which the 4GV-NB and 4GV-WB circuit-switched
technologies are extended to address the specific Quality of Service (QoS) constraints
of packet-switched telephony. We have introduced the following technologies:

• Discontinuous Transmission (DTX), to eliminate transmission of background noise
and silent intervals
• A time warping algorithm, to better handle the time-varying delay and jitter of
packet-based transmission and the late packet arrivals caused by the slotted-mode
design of the CDMA2000 DO scheduler
• Better error concealment, to handle the larger packet loss caused by the
combination of the air interface and best-effort IP network delivery

All of these are fully integrated into the 4GV core technology from inception to achieve
the best possible overall performance and perceived voice quality for packet-switched
telephony networks.

Having a common core-based technology allows more consistent voice services across
both packet-switched and circuit-switched networks and provides more peak rate
scalability, with the potential to further enhance our packet-based voice services.

The proposed 4GV-VoIP algorithmic baseline design is combined with the 3GPP2
circuit-switched baseline text standard [3 & 4]. In addition, we have asked the IETF
standards body for the specific changes required to introduce new MIME subtypes for
use with the Session Description Protocol (SDP) and the Real-time Transport Protocol
(RTP).

So far, RFC 3558, the EVRC RTP payload format, has been extended by RFC 4788 to
incorporate the changes required for the EVRC-B RTP payload format. Several MIME
types are defined for EVRC-B VoIP: EVRCB is the standard RFC 3558 TOC bundle format
MIME type, EVRCB0 is the header-free format MIME type for VoIP, and EVRCB1 is the
compact bundle format MIME type for applications such as Push-To-Talk (PTT), where
increased transmission delay is tolerable. Correspondingly, EVRC-WB VoIP would
require similar formats under the EVRCWB MIME type; a new draft is currently under
review by the IETF AVT group, to be published as a new RFC for the combined EVRC
family of codecs.
Subjective & Objective Voice Quality Comparison

EVRC-B and EVRC-WB have gone through extensive objective and subjective voice
quality assessment and characterization based on 3GPP2 standardization test plans and
procedures. Please refer to the 3GPP2 website, 3gpp2.org, for more detailed
information. Figure 4 presents a sample of these results, showing the codecs’
performance when tested under actual deployment conditions. The chart on the left
compares EVRC with EVRC-B operating at M0, M4 and M6, which correspond to the 0%,
31% and 42% capacity saving modes respectively. The chart on the right compares
EVRC-WB with the AMR-WB and VMR-WB codecs under 1% FER and 1% D&B (Dim &
Burst in-band signaling) conditions.

In addition to the subjective voice quality measures, the codecs are compared
objectively using the Active Speech Average Data Rate (ADR) as a measure of their
network voice capacity gain. EVRC-B M6 corresponds to an ADR of around 5.08 kbps,
while EVRC is at 8.3 kbps, for a capacity gain of around 39%. Furthermore, the EVRC-WB
ADR is around 7.37 kbps, while AMR-WB is at 12.65 kbps, for a capacity gain of around
41%.
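
The quoted gains are close to the relative reduction in ADR between each pair of codecs. The Python sketch below computes that reduction from the ADR figures above. Treating "capacity gain" as the ADR reduction is an assumption made for this illustration, and the function name is hypothetical.

```python
def adr_reduction_pct(reference_kbps, codec_kbps):
    """Relative reduction in active speech ADR versus a reference codec, in percent."""
    if reference_kbps <= 0.0:
        raise ValueError("reference ADR must be positive")
    return (reference_kbps - codec_kbps) / reference_kbps * 100.0


# ADR figures quoted in the text, in kbps.
print(f"EVRC-B M6 vs. EVRC:  {adr_reduction_pct(8.3, 5.08):.1f}%")
print(f"EVRC-WB vs. AMR-WB:  {adr_reduction_pct(12.65, 7.37):.1f}%")
```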
Summary Benefit of 4GV

• A core-based suite of technologies with incremental software upgrades that
simplifies deployment while reducing the overall costs
• Supported by Qualcomm as a technology leader and backed by technical experts
and product support across MSM-1x devices
• 3GPP2 and IETF standard codecs that are supported by major carriers and
infrastructure manufacturers worldwide
• Network operators can offer flexible deployment with Voice Tiering option over
different bands to address varying market requirements
• 4GV Narrowband (EVRC-B)
o Provides up to 40% system capacity improvement through dynamic
prioritization while maintaining consistent voice quality
o Provides operators with the flexibility to dynamically prioritize voice
quality or network capacity
• 4GV Wideband (EVRC-WB)
o Offers superior voice quality enhancement for mobile users without
sacrificing existing network voice capacity
o Enables operators to introduce a new evolution in voice services with
voice quality superior to the existing telecommunication network
o Enables operators to use wideband voice quality as an in-network
promotion and eliminate churn
• 4GV VoIP (EVRCB, EVRCWB)
o Delivers better voice quality over packet networks through Quality of
Service (QoS) attributes integrated within the core of the codec
o Maintains a consistent voice quality between two different end points
o 4GV-WB is an ideal codec for VoIP with its intrinsic point-to-point
connectivity which preserves the overall wideband voice quality
o Allows future scalability and enhancement in packet switched services
o Enhances simultaneous voice data services and future multimedia services
such as Video Telephony, streaming, gaming and ringback tones

References
[1] Upper Layer (Layer 3) Signaling Standard for cdma2000 Spread Spectrum
Systems, C.S0005-A, July, 2000
[2] Enhanced Variable Speech Codec Service Option 3, C.S0014-A, May, 2004
[3] Enhanced Variable Speech Codec Service Option 3 & 68, C.S0014-B, February,
2006
[4] Enhanced Variable Speech Codec Service Option 3, 68 & 70, C.S0014-C,
February 2007
[5] Qualcomm internal paper on using 4GV to achieve arbitrary system capacity
operating point,
[6] Qualcomm internal paper 80-VA387-1 on Forward and Reverse Link Voice
Capacity Gains of 4GV, November 17, 2005
[7] TrFO-RTO TSG-A Recommendation for Requirements to support TrFO/RTO
in the IOS, 3GPP2 A.S0004-B V2.0
