Amr

You might also like

Download as rtf, pdf, or txt
Download as rtf, pdf, or txt
You are on page 1of 6

AMR-WB (9 )

6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85 /

GSM Audio Codec / Vocoder


- Audio / voice codecs and vocoders convert the voice signals required
to be transmitted over a GSM link into a compact digital format. Voice
codec technologies used with GSM include LPC-RPE, EFR, Full Rate,
Half Rate, AMR codec and AMR-WB codec & CELP, ACELP, VSELP,
speech codec technologies.
GSM TUTORIAL INCLUDES

GSM tutorial / introduction

Network architecture

Interfaces

Radio access network

Frames

Frequency bands and allocations

Power class, control & amplifiers

Physical & logical channels

Codecs / vocoders

Handover / handof
Audio codecs or vocoders are universally used within the GSM system. They reduce the bit
rate of speech that has been converted from its analogue for into a digital format to enable
it to be carried within the available bandwidth for the channel. Without the use of a speech
codec, the digitised speech would occupy a much wider bandwidth then would be
available. Accordingly GSM codecs are a particularly important element in the overall
system.
A variety of diferent forms of audio codec or vocoder are available for general use, and the
GSM system supports a number of specific audio codecs. These include the RPE-LPC, half
rate, and AMR codecs. The performance of each voice codec is diferent and they may be
used under diferent conditions, although the AMR codec is now the most widely used. Also
the newer AMR wideband (AMR-WB) codec is being introduced into many areas, including
GSM
Voice codec technology has advanced by considerable degrees in recent years as a result
of the increasing processing power available. This has meant that the voice codecs used in
the GSM system have large improvements since the first GSM phones were introduced.

Vocoder / codec basics


Vocoders or speech codecs are used within many areas of voice communications.
Obviously the focus here is on GSM audio codecs or vocoders, but the same principles
apply to any form of codec.
If speech were digitised in a linear fashion it would require a high data rate that would
occupy a very wide bandwidth. As bandwidth is normally limited in any communications
system, it is necessary to compress the data to send it through the available channel. Once
through the channel it can then be expanded to regenerate the audio in a fashion that is as
close to the original as possible.
To meet the requirements of the codec system, the speech must be captured at a high
enough sample rate and resolution to allow clear reproduction of the original sound. It must
then be compressed in such a way as to maintain the fidelity of the audio over a limited bit
rate, error-prone wireless transmission channel.
Audio codecs or vocoders can use a variety of techniques, but many modern audio codecs
use a technique known as linear prediction. In many ways this can be likened to a
mathematical modelling of the human vocal tract. To achieve this the spectral envelope of
the signal is estimated using a filter technique. Even where signals with many non-
harmonically related signals are used it is possible for voice codecs to give very large levels
of compression.
A variety of diferent codec methodologies are used for GSM codecs:

CELP: The CELP or Code Excited Linear Prediction codec is a vocoder


algorithm that was originally proposed in 1985 and gave a significant improvement
over other voice codecs of the day. The basic principle of the CELP codec has been
developed and used as the basis of other voice codecs including ACELP, RCELP,
VSELP, etc. As such the CELP codec methodology is now the most widely used
speech coding algorithm. Accordingly CELP is now used as a generic term for a
particular class of vocoders or speech codecs and not a particular codec.

The main principle behind the CELP codec is that is uses a principle known as
"Analysis by Synthesis". In this process, the encoding is performed by perceptually
optimising the decoded signal in a closed loop system. One way in which this could
be achieved is to compare a variety of generated bit streams and choose the one
that produces the best sounding signal.

ACELP codec: The ACELP or Algebraic Code Excited Linear Prediction


codec. The ACELP codec or vocoder algorithm is a development of the CELP model.
However the ACELP codec codebooks have a specific algebraic structure as
indicated by the name.

VSELP codec: The VSELP or Vector Sum Excitation Linear Prediction codec.
One of the major drawbacks of the VSELP codec is its limited ability to code non-
speech sounds. This means that it performs poorly in the presence of noise. As a
result this voice codec is not now as widely used, other newer speech codecs being
preferred and ofering far superior performance.

GSM audio codecs / vocoders


A variety of GSM audio codecs / vocoders are supported. These have been introduced at
diferent times, and have diferent levels of performance.. Although some of the early audio
codecs are not as widely used these days, they are still described here as they form part of
the GSM system.

CODEC NAME BIT RATE COMPRESSION TECHNOLOGY


(KBPS)
Full rate 13 RTE-LPC
EFR 12.2 ACELP
Half rate 5.6 VSELP
AMR 12.2 - 4.75 ACELP
AMR-WB 23.85 - 6.60 ACELP

GSM Full Rate / RPE-LPC codec


The RPE-LPC or Regular Pulse Excited - Linear Predictive Coder. This form of voice codec
was the first speech codec used with GSM and it chosen after tests were undertaken to
compare it with other codec schemes of the day. The speech codec is based upon the
regular pulse excitation LPC with long term prediction. The basic scheme is related to two
previous speech codecs, namely: RELP, Residual Excited Linear Prediction and to the MPE-
LPC, Multi Pulse Excited LPC. The advantages of RELP are the relatively low complexity
resulting from the use of baseband coding, but its performance is limited by the tonal noise
produced by the system. The MPE-LPC is more complex but provides a better level of
performance. The RPE-LPC codec provided a compromise between the two, balancing
performance and complexity for the technology of the time.
Despite the work that was undertaken to provide the optimum performance, as technology
developed further, the RPE-LPC codec was viewed as ofering a poor level of voice quality.
As other full rate audio codecs became available, these were incorporated into the system.

GSM EFR - Enhanced Full Rate codec


Later another vocoder called the Enhanced Full Rate (EFR) vocoder was added in response
to the poor quality perceived by the users of the original RPE-LPC codec. This new codec
gave much better sound quality and was adopted by GSM. Using the ACELP compression
technology it gave a significant improvement in quality over the original LPC-RPE encoder.
It became possible as the processing power that was available increased in mobile phones
as a result of higher levels of processing power combined with their lower current
consumption.

GSM Half Rate codec


The GSM standard allows the splitting of a single full rate voice channel into two sub-
channels that can maintain separate calls. By doing this, network operators can double the
number of voice calls that can be handled by the network with very little additional
investment.
To enable this facility to be used a half rate codec must be used. The half rate codec was
introduced in the early years of GSM but gave a much inferior voice quality when compared
to other speech codecs. However it gave advantages when demand was high and network
capacity was at a premium.
The GSM Half Rate codec uses a VSELP codec algorithm. It codes the data around 20 ms
frames each carrying 112 bits to give a data rate of 5.6 kbps. This includes a 100 bps data
rate for a mode indicator which details whether the system believes the frames contain
voice data or not. This allows the speech codec to operate in a manner that provides the
optimum quality.
The Half Rate codec system was introduced in the 1990s, but in view of the perceived poor
quality, it was not widely used.

GSM AMR Codec


The AMR, Adaptive Multi-rate codec is now the most widely used GSM codec. The AMR
codec was adopted by 3GPP in October 1988 and it is used for both GSM and circuit
switched UMTS / WCDMA voice calls.
The AMR codec provides a variety of options for one of eight diferent bit rates as described
in the table below. The bit rates are based on frames that are 20 millisceonds long and
contain 160 samples. The AMR codec uses a variety of diferent techniques to provide the
data compression. The ACELP codec is used as the basis of the overall speech codec, but
other techniques are used in addition to this. Discontinuous transmission is employed so
that when there is no speech activity the transmission is cut. Additionally Voice Activity
Detection (VAD) is used to indicate when there is only background noise and no speech.
Additionally to provide the feedback for the user that the connection is still present, a
Comfort Noise Generator (CNG) is used to provide some background noise, even when no
speech data is being transmitted. This is added locally at the receiver.
The use of the AMR codec also requires that optimized link adaptation is used so that the
optimum data rate is selected to meet the requirements of the current radio channel
conditions including its signal to noise ratio and capacity. This is achieved by reducing the
source coding and increasing the channel coding. Although there is a reduction in voice
clarity, the network connection is more robust and the link is maintained without dropout.
Improvement levels of between 4 and 6 dB may be experienced. However network
operators are able to prioritise each station for either quality or capacity.
The AMR codec has a total of eight rates: eight are available at full rate (FR), while six are
available at half rate (HR). This gives a total of fourteen diferent modes.

MODE BIT RATE FULL RATE (FR) /


(KBPS) HALF RATE (HR)
AMR 12.2 12.2 FR
AMR 10.2 10.2 FR
AMR 7.95 7.95 FR / HR
AMR 7.40 7.40 FR / HR
AMR 6.70 6.70 FR / HR
AMR 5.90 5.90 FR / HR
AMR 5.15 5.15 FR / HR
AMR 4.75 4.75 FR / HR
AMR codec data rates

AMR-WB codec
Adaptive Multi-Rate Wideband, AMR-WB codec, also known under its ITU designation of
G.722.2, is based on the earlier popular Adaptive Multi-Rate, AMR codec. AMR-WB also uses
an ACELP basis for its operation, but it has been further developed and AMR-WB provides
improved speech quality as a result of the wider speech bandwidth that it encodes. AMR-
WB has a bandwidth extending from 50 - 7000 Hz which is significantly wider than the 300
- 3400 Hz bandwidths used by standard telephones. However this comes at the cost of
additional processing, but with advances in IC technology in recent years, this is perfectly
acceptable.
The AMR-WB codec contains a number of functional areas: it primarily includes a set of
fixed rate speech and channel codec modes. It also includes other codec functions
including: a Voice Activity Detector (VAD); Discontinuous Transmission (DTX) functionality
for GSM; and Source Controlled Rate (SCR) functionality for UMTS applications. Further
functionality includes in-band signaling for codec mode transmission, and link adaptation
for control of the mode selection.
The AMR-WB codec has a 16 kHz sampling rate and the coding is performed in blocks of 20
ms. There are two frequency bands that are used: 50-6400 Hz and 6400-7000 Hz. These
are coded separately to reduce the codec complexity. This split also serves to focus the bit
allocation into the subjectively most important frequency range.
The lower frequency band uses an ACELP codec algorithm, although a number of additional
features have been included to improve the subjective quality of the audio. Linear
prediction analysis is performed once per 20 ms frame. Also, fixed and adaptive excitation
codebooks are searched every 5 ms for optimal codec parameter values.
The higher frequency band adds some of the naturalness and personality features to the
voice. The audio is reconstructed using the parameters from the lower band as well as
using random excitation. As the level of power in this band is less than that of the lower
band, the gain is adjusted relative to the lower band, but based on voicing information. The
signal content of the higher band is reconstructed by using an linear predictive filter which
generates information from the lower band filter.

BIT RATE NOTES


(KBPS)
6.60 This is the lowest rate for AMR-WB. It is used for circuit switched connections for GSM
and UMTS and is intended to be used only temporarily during severe radio channel
conditions or during network congestion.
8.85 This gives improved quality over the 6.6 kbps rate, but again, its use is only
recommended for use in periods of congestion or when during severe radio channel
conditions.
12.65 This is the main bit rate used for circuit switched GSM and UMTS, ofering superior
performance to the original AMR codec.
14.25 Higher bit rate used to give cleaner speech and is particularly useful when ambient
audio noise levels are high.
15.85 Higher bit rate used to give cleaner speech and is particularly useful when ambient
audio noise levels are high.
18.25 Higher bit rate used to give cleaner speech and is particularly useful when ambient
audio noise levels are high.
19.85 Higher bit rate used to give cleaner speech and is particularly useful when ambient
audio noise levels are high.
23.05 Not suggested for full rate GSM channels.
23.85 Not suggested for full rate GSM channels, and provides speech quality similar to that
of G.722 at 64 kbps.
Not all phones equipped with AMR-WB will be able to access all the data rates - the
diferent functions on the phone may not require all to be active for example. As a result, it
is necessary to inform the network about which rates are available and thereby simplify the
negotiation between the handset and the network. To achieve this there are three
diference AMR-WB configurations that are available:

Configuration A: 6.6, 8.85, and 12.65 kbit/s

Configuration B: 6.6, 8.85, 12.65, and 15.85 kbit/s

Configuration C: 6.6, 8.85, 12.65, and 23.85 kbit/s


It can be seen that only the 23.85, 15.85, 12.65, 8.85 and 6.60 kbit/s modes are used.
Based on listening tests, it was considered that these five modes were sufficient for a high
quality speech telephony service. The other data rates were retained and can be used for
other purposes including multimedia messaging, streaming audio, etc.
By Ian Poole

You might also like