Echo Cancellation in IP Networks
INTRODUCTION
Voice quality in the Public Switched Telephone Network (PSTN) represents the standard for
voice communications, and is referred to as toll quality. Packet-based transport is
steadily replacing these legacy communication systems. The implementation of Voice
over IP (VOIP) networks presents a new set of challenges in optimizing the voice quality
and providing the expected toll quality. For example, because bandwidth is limited, voice
compression is required. This compression is often lossy, so some of the fidelity is
potentially lost. In addition, packet loss, increased round-trip delay, and variability
in delay all contribute to a degradation in voice quality. Despite these limitations,
VOIP has the potential to surpass toll quality through wideband and super-wideband
audio and signal processing techniques such as echo cancellation, speech enhancement
and acoustic beamforming.
Echoes play an important role in the perception of voice quality. For example, in legacy
devices, most users unknowingly expect to hear an echo (as long as the delay is less
than 25 ms and the echo return loss is at least 45 dB). IP phones actually create an echo
called a sidetone to recreate this experience; otherwise users will think the line is dead.
In communications, silence can actually take away from the experience of a call, so
comfort noise generation is required in nearly all applications of acoustic echo
cancellation (AEC).
Due to the limited processing power of most mobile devices, the advanced signal
processing required for voice quality enhancement (VQE), such as echo
cancellation, has to be performed at a centralized network location.
Performing echo cancellation further back in the transmission channel
puts low bit-rate coded speech (e.g. GSM-AMR) in the echo path. Typically in
this scenario, the coded speech has to be decoded first, and then echo
cancellation and other voice quality enhancement techniques are applied. Finally,
the enhanced speech is re-encoded to be sent to the far-end user. Another
approach is to perform VQE directly on the coded speech parameters, as seen in
the figure below. The potential benefits of this method are reductions in the
complexity, delay, and quantization noise that result from the additional
decode and encode steps needed to work on uncoded data.
Echo Cancellation software enhances speech for Radio, Mobile, VoIP and Voice
applications
1.3.1 Features
ITU compliant
Passes rigorous proprietary tests
Rapid convergence
Double talk detection
Non-linear Processing (NLP)
Built-in Noise Reduction
Comfort Noise Generation (CNG)
Supports multiple channels
Supports user callable functions
Portable, re-entrant and re-locatable code
Optimized for ANSI C and leading DSP architectures
1.4 ACOUSTIC ECHO CANCELLATION (AEC) SOFTWARE
Acoustic Echo Cancellation (AEC) software is used in Radio, Mobile, VoIP and
Voice communications.
G.168 Line Echo Canceller software reduces line echo for superior speech
enhancement
G.168 Line Echo Canceller Software is fully compliant with ITU G.168
specification as well as satisfies our own rigorous proprietary tests. Our line
echo canceller software is designed to pass and exceed requirements using any
of the G.168-2012 echo path models as well as applicable customer
models. VOCAL provides robust echo cancellation and noise reduction
algorithms that can be configured to significantly reduce echoes and remove
residual noise from the voice signal. Contact us to discuss your voice application
requirements.
VOCAL has developed a range of line / network echo canceller solutions which
are installed world-wide and used in the public telephone system and TDM
echo cancellation circuits, VoIP applications, and many other types of
manufactured products. VOCAL’s experts can help you select and configure the
algorithms that best fit your needs.
G.168 Line Echo Canceller must be placed in the 4-wire portion of a circuit
(which may be an individual circuit path or a path carrying a multiplexed signal).
While intended for use in the PSTN, the extremely high performance and
versatility of G.168 compliant algorithms allow them to be used in a wide variety
of applications for line noise cancellation. For VoIP and embedded devices, the
line echo canceller may be placed after an audio codec (such as G.711, G.722,
G.723.1, G.726, G.728, G.729, G.729A, G.729AB, GSM, MELP, etc.) to reduce the
echo from local or customer hardware.
Like all of VOCAL’s software libraries, our G.168 line echo cancellation algorithm
is available in a variety of forms, including ANSI C and assembly language
implementations optimized for leading DSP architectures. The line echo
canceller libraries are modular and can be executed standalone with their own
microkernel or as a single task under a variety of operating systems.
1.5.2 Features
VOCAL’s line echo cancellation modules pass the standard ITU G.168 Test Suite:
Test No. 2 Convergence and Steady state residual and returned echo
level tests
Test 2A Convergence test with NLP enabled
Test 2B Convergence test with NLP disabled
Test 2C Convergence test in the presence of background noise
Test No. 3 Performance under conditions of double talk
Test 3A Double talk test with low cancelled-end levels
Test 3B Double talk test with high cancelled-end levels
Test 3C Double talk test under simulated conversation
Test No. 4 Leak rate test
Test No. 5 – Infinite return loss convergence test
Test No. 6 – Non-divergence on narrow-band signals
Test No. 7 – Stability test
Test No. 9 – Comfort noise test
Test No. 10 – Facsimile test during call establishment phase
Test No. 10A – Canceller operation on the calling station side
Test No. 10B – Canceller operation on the called station side
Test No. 14 – Performance with V.Series Low-speed Data Modems
Rapid convergence
Double talk detection
Low divergence during double talk
Fast re-convergence for echo path changes
Portable, re-entrant and re-locatable code
Configurable echo tail length (up to 256ms)
Automatic echo delay estimation
Non-linear Processing (NLP)
Built-in Noise Reduction
Anti-howling
Comfort Noise Generation (CNG)
User callable functions
Hands Free telephones and monitoring systems
Full / half duplex speakerphones
VOCAL’s G.167 acoustic echo canceller supports both standard 8kHz and
wideband audio applications and has been rigorously tested to meet G.167 ITU
compliance as well as satisfy our own proprietary testing. The software
estimates the transfer function of the acoustic environment from the enclosed
loudspeaker to the device microphone and cancels received echoes from the
microphone signal. Transfer function constraints include the required
bandwidth, enclosed volume and tolerable delay.
VOCAL’s G.165 Echo Canceller software is fully compliant with ITU G.165
and optimized for execution on ANSI C and leading DSP architectures. Our
echo cancellers have been rigorously tested, meeting ITU compliance tests as
well as our own. G.165 Echo Cancellation algorithm is available standalone, as
part of an embedded library, or with a VoIP stack for integration with developer
applications. Custom solutions are also available to meet application specific
requirements.
Our voice software is optimized for DSPs and conventional processors from TI,
ADI, ARM, AMD, Intel and other leading vendors. Our experts can provide a
custom, optimized line echo cancellation solution to meet most types of
processing and acoustic environment requirements. Please contact us for more
information or to arrange a demonstration.
The ITU G.165 Recommendation was initially developed to standardize line echo
canceller performance and ensure network interoperability. Unfortunately, the
focus on band-limited noise measurement and testing often resulted in
unsatisfactory speech performance. As a result, it was superseded by the ITU
G.168 Recommendation and subsequent addenda. In actual practice it is more
beneficial to use a stripped subset of the G.168 implementation to satisfy
most echo cancellation requirements rather than a G.165 Line Echo Canceller.
1.7.2 Features
Fully compliant with ITU G.165 Recommendation
Rapid convergence
Subjective low returned echo level during single talk
Low divergence during double talk
Configurable tail length up to 128 ms
Utilizes Normalized LMS (NLMS)
Low echo return level during single talk
Double talk detector avoids divergence during double talk
Tone detector and hold release logic
Non linear processor (NLP) with Comfort Noise Generator (CNG)
Supports user callable functions
Compliant with Test No. 1 Steady state residual and returned echo level
test
Compliant with Test No. 2 Convergence test
Compliant with Test No. 3 Performance under conditions of double talk
Compliant with Test No. 4 Leak rate test
Compliant with Test No. 5 Infinite return loss convergence test
Compliant with Test No. 6 Nondivergence on narrow-band signals
Compliant with Test No. 7 Nonconvergence of echo cancellers on mono
or bifrequency signals transmitted in a handshaking protocol
Compliant with Test No. 8 Overload test for Type A and Type D cancellers
Chapter 2
Not all echo cancellation solutions have cookie-cutter approaches and come in a black
box
Custom echo cancellation solutions do not all have cookie-cutter designs and
come in a shiny black box. VOCAL provides:
One of the first tests we perform is our noise burst test. A white noise signal is
generated to be played from the loudspeaker, which is echoed through the system and
the microphone signal is recorded. By observing both signals, the frequency response
of the echo path can be determined. This allows us to identify any frequency that has a
gain in the echo path, which must be avoided at all cost, since the returning echo signal
will be saturated and non-linear. The majority of the time this is caused by
inappropriate hardware gain settings, but sometimes the form factor of the product can
result in strong acoustic coupling at certain frequencies. Instead of doing a redesign,
our engineers can quickly create a custom software solution so that the effects of the
non-linearities are minimized and high-quality echo cancellation results. For example,
in a full-duplex home automation intercom system, the near-end speaker will most
likely be located a considerable distance from the loudspeaker and microphone. For the
near-end speaker to be heard and speech to be captured clearly, the gain on both the
loudspeaker and microphone will be set high. Unfortunately this creates a particularly
harsh echo environment, where the acoustic coupling will be strong and nonlinear
echoes present. The main challenge for the system designer is to handle the near- to
far-end speaker ratio (NFR). The NFR will be much less than 0dB and during
double-talk (both near- and far-end speakers talk simultaneously), the echo signal will
swamp the near-end signal. For this application double-talk detectors can be extremely
unreliable; fortunately the divergence caused by the near-end speaker will not be as
severe, but still will produce unpleasant results. Since the echo path is constantly
changing and the accuracy of a double talk detection algorithm cannot be guaranteed,
VOCAL has implemented a two-path echo canceller, a simple but very effective way to
improve the digital signal processing system performance for handling doubletalk and
echo path changes.
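The two-path idea can be illustrated with a minimal sketch. It is not VOCAL's implementation; the function name, the NLMS background update, the block length and the 0.5 copy margin are all assumptions chosen for illustration. A background filter adapts continuously, while a fixed foreground filter produces the output and is overwritten only when the background filter clearly produced less error over the last block.

```python
import random

def fir(w, x):
    """Dot product of filter weights and the recent far-end samples."""
    return sum(wi * xi for wi, xi in zip(w, x))

def two_path_canceller(far_end, mic, taps=16, mu=0.5, block=64,
                       margin=0.5, eps=1e-6):
    """Illustrative two-path canceller (all parameter values are assumptions)."""
    fg = [0.0] * taps          # foreground filter: fixed, produces the output
    bg = [0.0] * taps          # background filter: adapts on every sample
    x = [0.0] * taps           # recent far-end samples, newest first
    out = []
    fg_err = bg_err = 0.0      # error energy accumulated over the current block
    for n, (r, d) in enumerate(zip(far_end, mic), start=1):
        x = [r] + x[:-1]
        e_fg = d - fir(fg, x)
        e_bg = d - fir(bg, x)
        out.append(e_fg)
        fg_err += e_fg * e_fg
        bg_err += e_bg * e_bg
        # NLMS update of the background filter only
        g = mu * e_bg / (fir(x, x) + eps)
        bg = [wi + g * xi for wi, xi in zip(bg, x)]
        if n % block == 0:
            # copy only when the background filter was clearly better
            if bg_err < margin * fg_err:
                fg = list(bg)
            fg_err = bg_err = 0.0
    return out, fg, bg

def misalignment(w, h):
    """Largest absolute coefficient error against a reference path h."""
    h = list(h) + [0.0] * (len(w) - len(h))
    return max(abs(a - b) for a, b in zip(w, h))

# Demo with a hypothetical 3-tap echo path and double-talk in the last third.
random.seed(1)
N = 3000
far = [random.uniform(-1, 1) for _ in range(N)]
path = [0.6, -0.2, 0.1]
echo = [sum(path[k] * far[n - k] for k in range(len(path)) if n - k >= 0)
        for n in range(N)]
near = [0.0] * 2000 + [random.uniform(-1, 1) for _ in range(N - 2000)]
mic = [e + v for e, v in zip(echo, near)]
out, fg, bg = two_path_canceller(far, mic)
```

In this sketch the background filter is disturbed by the near-end signal during double-talk, but its block error is then no better than the foreground's, so the foreground filter keeps the coefficients it learned earlier. No explicit double-talk detector is involved.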
The voice and the echo are visualized as waves. Bryan can’t hear any echo because he is
behind the reflection point. The time it takes for the voice to travel from John and
Little-John of Figure 1 to the reflection point is called one-way delay. The time it takes
to travel to the reflection point and back to the person making the sounds is called the
total round-trip delay. In this illustration, the one-way delay is either 80 ms or 10 ms,
depending on how far John and Little-John are away from the reflection point.
Let’s assume two people are having a conversation over a network. The network can be
IP, satellite, or another network type with long delay. In Figure 2 John on the telephone
is 80 ms or more (one-way) delay away from the reflection point. This is an equivalent
situation to Figure 1. Bryan on the mountain in Figure 1 is in the house in Figure 2,
and John who is 80 ms away from the reflection point in Figure 1 is the person with the
telephone in Figure 2.
John would hear two different kinds of echoes of his own voice. One echo is the
electrical echo that occurs at the transition from 2- to 4-wire cable, called the hybrid (H).
The hybrid itself is typically located in the Central Office (CO); other locations are
possible, e.g. a Private Branch Exchange (PBX). The hybrid is the reflection point of the
electrical echo.
The reflection point of the voice in the house could be the wall, the furniture, etc. This
echo is called acoustic echo. The voice coming out of the loudspeaker bounces back
from the wall to the microphone. The wall is the reflection point of the acoustic echo.
In both cases electrical and acoustic echo, John can hear his own voice. It may be nice
to hear one’s own echo in the mountains, but not during a telephone call, when echoes
will definitely interfere with the conversation.
In Figure 1, Little-John who is very close to the reflection point doesn’t hear his echo,
because he is too close, only 10 ms away. The echo occurs, but the elapsed time is too
short to be perceived. However, John who is 80 ms away will hear the echo because of
the longer delay. People can hear an echo if the one-way delay exceeds 25 ms.
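As a small worked example, the two figures quoted above (10 ms and 80 ms one-way delay) can be checked against the 25 ms threshold. The function names are illustrative only.

```python
PERCEPTION_THRESHOLD_MS = 25  # one-way delay threshold quoted in the text

def round_trip_ms(one_way_ms):
    """Total round-trip delay: to the reflection point and back."""
    return 2 * one_way_ms

def echo_is_audible(one_way_ms, threshold_ms=PERCEPTION_THRESHOLD_MS):
    """True when the one-way delay exceeds the perception threshold."""
    return one_way_ms > threshold_ms

for name, one_way in (("Little-John", 10), ("John", 80)):
    print(name, round_trip_ms(one_way), "ms round trip,",
          "audible" if echo_is_audible(one_way) else "not perceived")
```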
The one-way delay is shown in Figure 2, and the echo tail-end delay is illustrated in
Figure 3. The echo tail-end delay is the time the voice needs to travel from the echo
canceller to the hybrid and back. The tail-end delay can differ depending on the
distance between the echo canceller and the hybrid, the quality of the hybrid
(dispersion), and the acoustic echo.
The capacity of an echo canceller is determined by the echo tail-end delay (e.g., 64 ms).
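To make the capacity figure concrete: assuming the canceller models the echo path with an FIR filter of one tap per sample, the tail length translates directly into a filter length. The sketch below assumes narrowband 8 kHz sampling by default; the helper name is illustrative.

```python
def taps_for_tail(tail_ms, sample_rate_hz=8000):
    """Number of FIR taps needed to span an echo tail of tail_ms milliseconds."""
    return tail_ms * sample_rate_hz // 1000

print(taps_for_tail(64))          # 64 ms tail at 8 kHz
print(taps_for_tail(128))         # 128 ms tail at 8 kHz
print(taps_for_tail(64, 16000))   # 64 ms tail at 16 kHz wideband
```

A 64 ms tail at 8 kHz therefore needs 512 taps, and doubling either the tail length or the sample rate doubles the filter length and, roughly, the per-sample processing cost.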
How does delay occur? The following examples show how delay is introduced
in networks:
• For satellite transmission, it takes several hundred ms for the voice to travel from
earth to the satellite and back to the earth (e.g., a satellite at 14,000 km altitude has a
120-ms one-way delay). The key factor here is the geographical distance, which differs
widely within and among countries. Also, in real-life applications, the routing in
international and national networks is often determined by factors other than
performance.
• An example of the transmission time in purely digital networks among local
exchanges is 4.8 ms for 400 km (250 miles).
• In VoIP applications, the delay becomes a big issue due to processing time. The
processing may include voice compression and decompression, packetizing and
depacketizing, and packet delay in IP networks. Delays of more than 200 ms are not
rare in IP networks. Figure 4 shows an example of how delays can add up in an IP
network.
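The way these contributions accumulate can be sketched as a simple delay budget. The component names and millisecond values below are placeholders invented for illustration; they are not taken from Figure 4 or from any measurement.

```python
# Hypothetical one-way delay budget in milliseconds (illustrative values only).
budget_ms = {
    "voice compression (encoder frame + look-ahead)": 25,
    "packetizing": 20,
    "IP network transit": 60,
    "jitter buffer / depacketizing": 40,
    "decompression": 5,
}

def one_way_delay_ms(budget):
    """Sum the individual delay contributions."""
    return sum(budget.values())

def round_trip_delay_ms(budget):
    """Round trip, assuming the same budget applies in both directions."""
    return 2 * one_way_delay_ms(budget)

print(one_way_delay_ms(budget_ms), "ms one-way")
print(round_trip_delay_ms(budget_ms), "ms round trip")
```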
2.4 WHAT DOES THE ECHO CANCELLER DO?
If echo is encountered in the network, echo cancellation is required. The echo canceller
simulates the end echo path with a special algorithm, an Echo Estimation Unit, using
the received signal.
The echo estimate calculated by the echo canceller is subtracted from the signal that is
reflected from the hybrid (Figure 3). Ideally, the signal at the output of the echo
canceller does not contain any echo.
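The estimate-and-subtract principle can be sketched with a Normalized LMS (NLMS) adaptive filter, one common choice for the echo estimation step. The function name, the filter length, the step size and the 5-tap echo path are assumptions made for illustration, not a description of any particular product.

```python
import random

def nlms_echo_canceller(far_end, returned, taps=32, mu=0.5, eps=1e-6):
    """Adaptive FIR echo estimator; returns the signal after echo subtraction."""
    w = [0.0] * taps            # current estimate of the echo-path impulse response
    x = [0.0] * taps            # most recent received samples, newest first
    out = []
    for r, d in zip(far_end, returned):
        x = [r] + x[:-1]
        echo_est = sum(wi * xi for wi, xi in zip(w, x))
        e = d - echo_est        # subtract the estimate from the returned signal
        norm = sum(xi * xi for xi in x) + eps
        g = mu * e / norm       # normalized step
        w = [wi + g * xi for wi, xi in zip(w, x)]
        out.append(e)
    return out

def power(sig):
    """Mean squared value of a signal segment."""
    return sum(s * s for s in sig) / len(sig)

# Demo: white-noise received signal through a hypothetical echo path.
random.seed(0)
far = [random.uniform(-1, 1) for _ in range(4000)]
path = [0.0, 0.0, 0.5, -0.3, 0.1]
echo = [sum(path[k] * far[n - k] for k in range(len(path)) if n - k >= 0)
        for n in range(len(far))]
residual = nlms_echo_canceller(far, echo)
```

Comparing `power(residual[-1000:])` with `power(echo[-1000:])` shows how much of the echo remains once the filter has converged. In this noiseless, single-talk demo the residual becomes negligible; real echo paths with noise, double-talk and non-linearities behave less ideally.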
• Because of the voice compression, packetizing in IP and routing, the delay for the
total round trip always exceeds 50 ms in VoIP applications.
To compete with the telephone system everybody is used to, the quality in IP networks
has to be at least as good. To achieve this quality, an echo canceller that has been
approved for PSTN systems should be used.
If a phone call is carried from an IP-based network to a TDM-based network, an
echo canceller is required on the border of the two networks. This could be an
Interworking Unit (IWU), a VoIP gateway, Cable Head End, Customer Premises Equipment
(CPE), etc. Figure 5 illustrates different situations where echo cancellation is required.
It gives a few examples of the environments in which the echo canceller has to be
implemented:
• Voice over IP gateways require echo cancellation to cancel the echo occurring in the
PSTN.
• Sending voice over a Digital Subscriber Loop (DSL) system into an IP network. A
Digital Subscriber Line Access Multiplexer (DSLAM) could also require echo
cancellation if the voice channels are connected to the PSTN.
• Another application is Private Branch Exchange (PBX) systems to send voice over IP.
In this case, the PBX has to take care of the echo cancellation.
If a phone call remains in the Local Area Network (LAN) or Wide Area Network
(WAN), no echo cancellation is required because hybrids are not present.
Different standards for objective and subjective testing of echo cancellers exist.
Specifications can cover only certain network conditions. They are not able to cover all
possibilities that can occur in a network or in gateways. The ITU recommendations
state which performance standards an echo canceller has to meet. The standards refer to
minimum levels of performance (in boldface).
Chapter 3
- The main advantage is the amount of money you end up saving on your phone bills as
compared to a traditional phone line.
- Inexpensive and easy to use. Since it is simple, upgrading is relatively simple too.
- You can integrate it with an existing phone connection.
- With VOIP PC-to-PC, calls are free no matter the distance and PC-to-Phone charges
are nominal.
- For a monthly fee you may make unlimited free calls within a geographic area.
- A virtual number enables you to make calls from anywhere as long as a broadband
connection is available.
- You may purchase a number in a geographic area of your choice, which works out
very cheap. If your relatives and friends live in Virginia and you moved to California,
you may purchase a Virginia number and make local calls to your loved ones.
- You may access your VOIP account, just like your email account, from anywhere in the
world as long as you have an internet phone. This makes it easy for those who travel
frequently to call those back at home at local call rates, no matter
where they are.
- You may call or message or do both at the same time with VOIP services.
- VOIP costs about half as much as traditional phone services, and the taxes
and surcharges appear to be much lower. Your bill is also easier to understand and can be
viewed via the Internet.
The disadvantages of VOIP can be annoying, but their effects are relatively limited. The
complaints about VOIP are usually tolerable if the callers are using a free service. As the
technology advances, we expect VOIP quality to match that of traditional
telephone technology.
These are some of the advantages and disadvantages of VOIP. All said and done, we could
say that the advantages of VOIP outweigh its disadvantages.
Chapter 4
APPLICATIONS
Telephone hybrid circuits convert a 4-wire interface into a 2-wire interface in POTS/PSTN
systems. Historically, the introduction of the 4-wire-to-2-wire interface was motivated
by the need to reduce the cost of copper loops connecting the POTS telephone line
facility (access network) to individual subscribers. Figure 1 illustrates a typical line
echo canceller operating at the near end of the voice connection.
4.3 LINE & ACOUSTIC ECHO CANCELLER COMBINED
Line & acoustic echo cancellation is used in applications where both line and
acoustic echo are present
A Line & Acoustic Echo Canceller Combined is an echo cancellation solution that is
suitable for applications where both types of echoes, line echo and acoustic echo, are
present, such as:
The figure illustrates a high-level configuration of LEC/AEC in the voice network access
sub-system of a TDM or VoIP communication network.
In order to perform adequately, the line and acoustic echo canceller combined (LAEC) is
expected to have all basic features of line and acoustic echo cancellers and, specifically,
to track echo path changes quickly as well as adequately cover echo path lengths
typically present in an acoustical environment such as a conference room, office or
meeting room.
VoIP or Voice over Internet Protocol has been largely heralded as the
telecommunications paradigm of the 21st century. VoIP transmits data from the
traditional analog Public Switched Telephone Network (PSTN) across an IP
network through the use of an Analog Telephone Adapter (ATA). The signal is
broken up into frames and the information in each frame is stored in digital
packets that are sent over the network. Each packet has header information that
gives the receiving end information about how to reconstruct the signal. This
header is essential as each packet traverses the network independently and
each may encounter different transmission scenarios.
Jitter effects are typically due to either clock slippage or network delay [1]. Clock
slippage occurs when a clock rate difference exists between the receiving and
transmitting side that can cause either lost packets or duplicate samples due to buffer
read errors. If the buffer is persistent and circular, and the receiver is sampling faster
than the transmitter then values will be duplicated, and similarly a slower receiver will
miss values.
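The duplicate and missed samples described above can be reproduced with a small integer simulation. It assumes an idealized model in which the receiver, at each of its own clock ticks, reads the most recent sample the transmitter has written. The function name and the clock rates are illustrative.

```python
def simulate_clock_slip(tx_rate_hz, rx_rate_hz, n_reads):
    """Count duplicated and skipped samples seen by a drifting receiver."""
    duplicates = skipped = 0
    prev = None
    for k in range(n_reads):
        # index of the latest sample written by time k / rx_rate_hz
        idx = (k * tx_rate_hz) // rx_rate_hz
        if prev is not None:
            step = idx - prev
            if step == 0:
                duplicates += 1        # same sample read twice
            elif step > 1:
                skipped += step - 1    # samples the receiver never saw
        prev = idx
    return duplicates, skipped

print(simulate_clock_slip(8000, 8001, 8001))  # receiver slightly fast
print(simulate_clock_slip(8001, 8000, 8001))  # receiver slightly slow
print(simulate_clock_slip(8000, 8000, 8001))  # matched clocks
```

From the echo canceller's point of view, each duplicated or missing sample shifts the apparent echo path by one sample, which is why even a small rate mismatch forces repeated re-convergence.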
Network delay shifts the body of the impulse response along the echo path. Each
packet may experience different levels of network traffic while en route, and thus may
arrive out of sequence. Since the speech signal processing needs to reconstruct the
signal at the other end, delay results.
Conversion of a speech frame into a packet is typically done with a low bit rate vocoder
for efficiency and ease of transmission. A vocoder essentially attempts to represent the
speech frame by a smaller set of parameters that will excite a speech production model
on the receiving end. Distortion is introduced by an inaccurate representation, pre- or
post-filtering, and by parameter quantization, and thus non-linearity is introduced into
the echo path which will degrade the performance of the linear echo canceller [2].
The effect packet loss has on the performance of an echo canceller depends on how
such a packet is recovered in each instance [2]. One such method is packet concealment,
in which the lost packet is somehow replaced on the receiving end. Typical replacement
possibilities are silence, noise, or the previous packet. Alternatively, one might
attempt an extrapolation from the previous packet.
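Two of the replacement options listed above, silence and repetition of the previous packet, can be sketched as follows. Frames are modelled as lists of samples and a lost packet as `None`; the function name and this representation are assumptions for illustration.

```python
def conceal(packets, frame_len, strategy="repeat"):
    """Replace lost packets (None) with silence or the previous frame."""
    out = []
    last = [0.0] * frame_len            # nothing received yet: treat as silence
    for p in packets:
        if p is None:
            if strategy == "repeat":
                p = list(last)          # reuse the previous (possibly concealed) frame
            else:
                p = [0.0] * frame_len   # silence substitution
        out.append(p)
        last = p
    return out

received = [[1.0, 2.0], None, [3.0, 4.0]]
print(conceal(received, 2, "repeat"))
print(conceal(received, 2, "silence"))
```

Whichever frame is substituted becomes part of the signal the echo canceller sees, so a substitute that differs strongly from the lost frame appears to the adaptive filter as a sudden change in the echo path.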
The SPMMax-MDF Algorithm needs to be done after the speech is decoded by the
vocoder. There are many algorithms available for sampled speech echo cancellation, as
generously expanded upon on this site. An alternative method is to use a LPC type filter
to directly filter the packets before they reach the vocoder. While you will run into much
the same issues, at least the size of the filters used will be smaller.
Delay effects can instead be mitigated by attempting to predict the delay caused by the
network. A Hidden Markov Model may be most appropriate for this task [4]. Clock
slippage can be reduced by resampling, or by using identical or good-quality hardware.
Vocoder distortion can be modeled through a non-linear processing block and thus
explicitly dealt with. Packet loss effects can be minimized by the proper selection of a
recovery technique. The best recovery techniques are the ones that introduce an
approximation to the lost packet, as this will throw the system distance off the least.
Whichever way is chosen, it is clear that VoIP will be increasingly important in this
century due to our growing reliance on digital networks, and will undoubtedly be
introduced into applications not readily apparent at the moment. Thus, it is important
that we monitor how the IP network evolves and assess how such evolution will change
the typical characteristics of the VoIP echo path. With due diligence, VoIP quality can be
assured.
Chapter 5
RESULT
5.1 Conclusion
This paper provides an introduction to the basics of echo cancellation. The
paper is kept general, but it emphasizes the emerging Voice over IP (VoIP) market.
This paper shows why and where echo cancellation is required, what echo is, and what
causes it. It also gives an overview of the standards involved in echo cancellation. The
SIDEC™, Infineon’s digital echo canceller, meets the requirements of the VoIP
market. This paper shows how it can be integrated into a VoIP system and the
advantages the solution offers.