
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/224234249

Noise and Feedback Suppression for In-car Communication Systems

Conference Paper · November 2008


Source: IEEE Xplore


2 authors, including:

Juergen Freudenberger
Cyberagentur - Germany's Cyber Security Innovation Agency

All content following this page was uploaded by Juergen Freudenberger on 08 March 2015.



Noise and Feedback Suppression for In-car Communication Systems
Jürgen Freudenberger, Johannes Pittermann
Fakultät für Informatik, HTWG Konstanz, 78462 Konstanz
E-Mail: {jfreuden,jpitterm}@htwg-konstanz.de
Web: www.edc.in.htwg-konstanz.de

Abstract
In cars, communication between seat rows is difficult because of road noise and the acoustic situation (seating positions and lines of sight). Drivers tend to turn their heads when talking to rear seat passengers, unconsciously endangering their own and the passengers' safety. In this paper, we address the enhancement of speech intelligibility, particularly for the scenario where the driver or co-driver talks to rear seat passengers. We propose an in-car communication system that amplifies the acoustic propagation from front to rear while taking into account the acoustic conditions in a car. In particular, we present a spectral subtraction approach for noise and feedback suppression, as well as frequency shift and feedback cancellation modules for such systems.

1 Introduction
Although luxury vehicles provide a noise-reduced environment in the cabin, communication between front seats and rear seats is difficult. In contrast to a normal conversation, there is road noise in cars, and the positions and lines of sight of the passenger seats are fixed, which impairs speech understanding. Passengers usually do not feel comfortable conducting long conversations. Frequently, the driver is tempted to turn his or her head in order to improve the communication. Thus, for safety and comfort reasons, a system which supports natural communication between passengers is desirable. In this paper we discuss an in-car communication system which improves speech communication in a vehicle. Such a system basically works as an intercom between the different passenger seats. Furthermore, it can serve as the acoustic front-end for other applications like hands-free telephony, voice-controlled devices, broadcast services, and dialog systems. Similar concepts are considered, for example, in [1], [7], [5], and [8].

Usually, communication systems are associated with bi-directional communication, i.e., an in-car communication system may be expected to amplify or improve the communication from front seat passengers to rear seat passengers and vice versa. However, practical experience and experiments have shown that the front-to-rear communication path requires more attention concerning signal improvement [3]. First, the directivity of a human head (mouth) is oriented forward, especially for higher frequencies. Second, upper medium class cars featuring hands-free telephony already provide all microphones and loudspeakers which are required for enhanced front-to-rear communication. And third, given typical microphone positions (no near microphones), measurements have shown that it is possible to achieve a gain of 10 dB from the front seats to the back seats by simply amplifying the signal without any further processing (noise reduction), whereas the prospective gain for the communication from rear to front is considerably lower. For the communication between driver and co-driver, the gain is not even significantly larger than 0 dB [5].

Figure 1: Seating position of passengers in a car and the acoustic propagation of speech from the driver to the right rear seat passenger.

In-car communication systems have tight restrictions on the maximum delay and the allowable amplification for the transmission from the speaking to the listening person; otherwise the sound of the system is usually not acceptable. For instance, if the system gain is too high and the delay too long, the speaking person perceives his or her own echo. Moreover, high gains and long delays may lead to situations where the listening person localizes the speaker in the direction of the loudspeaker. This localization mismatch is rather disturbing and should therefore be avoided. Typically, the overall delay (including A/D and D/A conversion) should not exceed 10 ms [8]. Due to this delay restriction, the signal processing for in-car communication systems is usually performed in the time domain and block processing is avoided. This work presents a spectral subtraction approach where the filter is calculated in the frequency domain whereas the actual filtering is performed in the time domain. The filter is used to suppress background noise as well as to prevent howling due to acoustic feedback.

In this paper, we propose a half-duplex communication system amplifying speech signals from the front seats to the rear seats in addition to the acoustic propagation of these signals. The scenario is outlined in Fig. 1. It should be noted that our system is especially designed for use with microphones and loudspeakers which are already provided by hands-free telephony and sound reproduction systems and, thus, can be seen as an efficient add-on not requiring a lot of extra hardware. As opposed to the systems described in [5] or [8], our system cannot avail itself of specially tailored microphones and loudspeakers at optimal positions, but, nevertheless, features an audible and
measurable enhancement of the communication from the front seats to the rear seats.

In our experiments, we limit our considerations to the communication from the driver to the right rear seat passenger. In Fig. 1, this acoustic path is indicated by a dotted line; further acoustic propagation paths are represented by solid lines. Speech signals are captured by two microphones located in the rear-view mirror and the processed signal is output by two speakers (one in the door and one in the back shelf). Although not explicitly presented in this illustration, we always examine both paths to the passenger's left and right ears. In the remainder of this paper, we describe the architecture of our proposed communication system and we discuss results of our experiments.

2 System Architecture

The architecture of our proposed system is depicted in Fig. 2. Speech signals captured by both microphones are first processed by a delay-and-sum beamformer. This beamformer achieves a gain in terms of signal-to-noise ratio of 2 to 3 dB for frequencies above 1 kHz, while the gain is only marginal for lower frequencies. The use of a fixed beamformer instead of adaptive algorithms reduces the complexity of the feedback canceler, because only one feedback canceler is required, in contrast to one canceler for each microphone with adaptive beamformers.

Figure 2: System architecture of the in-car communication system. (Block diagram with the blocks Beamformer, Spectral Subtraction, Frequency Shift, Feedback Canceler, AGC, and Equalizer; the AGC is driven by the noise level.)

Due to the acoustic paths from the loudspeakers to the microphones, the in-car communication system is a closed-loop system that may become unstable if the system gain is too large. The task of the feedback canceler is to estimate the acoustic feedback and subtract it from the beamformer output signal. However, feedback cancellation is extremely difficult due to the strong correlation between the speech signal and the loudspeaker output signal; in this respect, the conditions of our system resemble those of hearing aids [4]. Usually, additional means to suppress rising feedback tones are required. In [5] and [8], adaptive FIR filters are used to predict and suppress periodic signal components. In our system, we use the spectral subtraction algorithm to reduce the background noise level as well as to attenuate feedback frequencies.

Conventionally, noise reduction is performed in the frequency domain, e.g. employing spectral subtraction. The objective of the spectral subtraction algorithm is to estimate the noise proportion in the short-time signal spectrum and to calculate an appropriate filter attenuating noisy signal components. For the 16 kHz sampling rate of our system, the spectral subtraction typically operates on overlapping signal blocks of 256 or 512 samples from which the spectrum is calculated.

In order to estimate the noise proportion appropriately, a robust method for voice activity detection (VAD) is required. To avoid rapid toggling between pause and speech activity, which is very likely to occur with simple power-based VAD algorithms, we compare the signal power in different frequency bands (below 750 Hz, 750 Hz to 1875 Hz, above 1875 Hz), and we consider a modified power spectrum $\tilde{\Phi}_{xx}(f,k)$ which, at discrete time $k$, depends on previous values and is calculated as

\[ \tilde{\Phi}_{xx}(f,k) = (1-\delta) \cdot \tilde{\Phi}_{xx}(f,k-1) + \delta \cdot \Phi_{xx}(f,k) , \]

where $\delta \in [0,1]$ is around 0.3 and where $\Phi_{xx}(f,k)$ is the current power spectrum. Knowing the current signal power spectrum $\Phi_{xx}(f)$ and an estimate $\Phi_{nn}(f)$ of the noise power spectrum, the filters are determined according to

\[ \hat{H}^{ss}(f) = \sqrt{1 - \frac{\Phi_{nn}(f)}{\Phi_{xx}(f)}} . \]

Conventionally, the filter would be applied to the signal in the frequency domain before an inverse FFT is applied. The delay caused by this block processing is not tolerable for in-car communication. To overcome this restriction, we apply the inverse FFT to the transfer function of the filter and obtain the output signal by convolution in the time domain, where the filter coefficients are updated every 2 ms. To remove musical noise, the filter coefficients in the time domain are recursively smoothed, i.e. if $h^{ss}_i(k) = F^{-1}(\hat{H}^{ss}(f))$ is the $i$th filter coefficient at time $k$, we have

\[ \hat{h}^{ss}_i(k) = (1-\gamma) \cdot \hat{h}^{ss}_i(k-1) + \gamma \cdot h^{ss}_i(k) , \]

where $\gamma$ is a constant in the range [0.1, 0.9].

On the one hand, the frequent update of the noise reduction filter increases the computational complexity of the system. On the other hand, simulations show that a 64 or 128 point FFT is sufficient to obtain the required SNR gain of 3 to 5 dB, so that the overall complexity is similar to conventional noise reduction algorithms.

In order to avoid feedback signals, a non-linearity is introduced in the system. Here, we choose a frequency shift method similar to the one described in [6]: the single side-band signal is not shifted by a fixed frequency offset, e.g., 5 Hz, but by a variable offset changing from 0 Hz to 10 Hz at a frequency of 5 Hz ("frequency warbling"). The frequency shift is only used at high noise levels, as in this case a higher signal amplification is required (increasing the risk of feedback signals). At lower noise levels (where even the minor distortion introduced by the offset would disturb the auditory impression) the frequency shift is switched off.

The theory behind frequency shifting requires a single side-band signal $s_{\mathrm{SSB}}(k)$ which is multiplied by a complex exponential function, so that the output signal is

\[ s_{\mathrm{shifted}}(k) = \Re\left\{ s_{\mathrm{SSB}}(k) \cdot e^{\Omega_{\mathrm{shift}} \cdot k} \right\} . \]
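The single side-band shift in the equation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the analytic signal is computed offline with an FFT over the whole signal (which would not meet a low-delay requirement), the warbling offset is assumed to be a triangular sweep between 0 Hz and 10 Hz repeating 5 times per second, and the phase is accumulated from the time-varying offset rather than formed as a literal product with $k$. The function names `analytic_signal`, `warble_offset`, and `frequency_shift` are hypothetical.

```python
import numpy as np

def analytic_signal(x):
    """Return x + j*Hilbert{x}, computed offline via the FFT (assumption: whole signal available)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep DC once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep Nyquist once
        weights[1:n // 2] = 2.0           # double positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)  # negative frequencies are zeroed

def warble_offset(n, fs, f_max=10.0, f_warble=5.0):
    """Triangular offset between 0 Hz and f_max, repeating f_warble times per second (assumed shape)."""
    t = np.arange(n) / fs
    phase = (t * f_warble) % 1.0
    return f_max * (1.0 - np.abs(2.0 * phase - 1.0))

def frequency_shift(x, f_shift, fs):
    """Shift all spectral components of x upward by f_shift (scalar or per-sample array, in Hz)."""
    f_shift = np.broadcast_to(np.asarray(f_shift, dtype=float), np.shape(x))
    phase = 2.0 * np.pi * np.cumsum(f_shift) / fs   # accumulated phase of the offset
    return np.real(analytic_signal(x) * np.exp(1j * phase))

fs = 16000
k = np.arange(fs)
tone = np.cos(2 * np.pi * 1000 * k / fs)            # 1 kHz test tone, 1 s long

shifted_fixed = frequency_shift(tone, 5.0, fs)       # fixed 5 Hz offset
shifted_warbled = frequency_shift(tone, warble_offset(len(tone), fs), fs)

# Location of the spectral peak after the fixed shift (1 Hz bin spacing here)
peak_hz = np.argmax(np.abs(np.fft.rfft(shifted_fixed))) * fs / len(tone)
```

With the fixed offset, the 1 kHz test tone moves to 1005 Hz, which is easy to verify from `peak_hz`. The warbled variant spreads the tone over a small frequency range instead, which is the property that keeps a feedback tone from building up at a single frequency.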
In the frequency shift equation, $s_{\mathrm{SSB}}(k)$ is the single side-band input signal and $\Omega_{\mathrm{shift}} = 2\pi j \cdot f_{\mathrm{shift}}(k)/f_s$, with variable frequency offset $f_{\mathrm{shift}}(k)$ and sampling frequency $f_s$. In our implementation, we follow the approach shown in Figure 3, which uses a Hilbert filter. As the ideal Hilbert filter is an acausal filter of infinite impulse response length, we use an approximated Hilbert filter of length $2L+1$ samples in the lower branch and, for synchronicity reasons, a delay of $L$ samples in the upper branch. Then, the filtered or delayed signal in the branches is multiplied by a sine or cosine of the desired frequency offset. We choose $L = 16$, which constitutes a passable trade-off between system delay (only 1 ms) and quality of the filter's frequency response. Looking at the curve in Figure 4, it can be observed that the frequency response is not equal among all frequencies, especially above 7000 Hz, where the curve shows a significant fall-off leading to a slight distortion of the output signal.

Figure 3: Frequency shift block diagram. (Upper branch: delay $\delta(t - L)$, multiplied by $\cos(\Omega_{\mathrm{shift}} \cdot k)$; lower branch: Hilbert filter, multiplied by $\sin(\Omega_{\mathrm{shift}} \cdot k)$; both branches are combined to form $s_{\mathrm{shifted}}(k)$ from $s(k)$.)

Figure 4: Frequency response of our Hilbert filter. (Plot titled "Frequency response of the Hilbert Filter (L=16)"; amplification in dB over frequency in Hz, 200 Hz to 6400 Hz.)

In addition to the frequency shift, we employ a feedback canceler which uses linear prediction to detect and suppress harmonics (especially feedback signals) in the signal: the input signal is delayed by 3 ms, and an FIR filter of length $L = 30$ is adapted to minimize the difference between the original signal and the delayed signal using the LMS algorithm:

\[ \hat{h}^{fb}_i(k) = \hat{h}^{fb}_i(k-1) + \frac{\alpha}{\sum_i \|d(k-i)\|^2} \cdot s(k) \cdot d(k-i) , \]

where $\hat{h}^{fb}_i(k)$ is the $i$th filter coefficient of the FIR filter at time $k$ ($0 \le i \le L-1$), $\alpha$ determines the adaptation rate, and $s(k)$ and $d(k)$ are the original signal and the delayed signal at time $k$.

The output gain of the system has to be adjusted according to the driving situation. Based on the noise level determined during the spectral subtraction, the automatic gain control (AGC) is adapted such that the signal is only amplified if the noise level indicates that the system support is required. This AGC is implemented as a lookup table containing three noise level regions with respective amplifications of -4 dB, -2 dB and 0 dB relative to the maximum possible overall amplification.

After the automatic gain control, the signal is equalized with particular respect to the car cabin's acoustic properties. The equalizer is implemented as an IIR filter whose coefficients are determined on the basis of the relevant acoustic propagation paths between speakers and microphones.

A fixed delay of approximately 2 ms is inserted in our simulation to model the delay introduced by the A/D and D/A converters.

The processed signal is output via two speakers, one located in the door and one located in the back shelf. The balance between both speakers can be adjusted with the aid of a system variable in our simulations. With this variable defaulting to 0.5, the signal is output in equal measure by both speakers.

3 Experimental Results

For our experiments, we use impulse responses of a Mercedes-Benz S-Class cabin and background noise recorded in the same car driving at 100 km/h. In order to achieve better comparability, we scale speech signals and noise to obtain a predefined signal-to-noise ratio (SNR) at the left microphone of the rear-view mirror. The spectrograms of the signal at the passenger's left ear, with and without the support of our communication system, are shown in Fig. 5. In this simulation, we assumed an SNR of 10 dB.

Figure 5: Spectrograms of the speech signal at the passenger's left ear. (Two panels, "without support" and "with support"; frequency axis from 0 to 8000, time axis from 1 to 7.)

A comparison of both spectrograms shows that the noise level remains constant in both cases, which is due to the same background noise measured at the passenger's ear. However, it can be seen that the speech signal is noticeably amplified by the communication system.
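The scaling step used to set a predefined input SNR can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used in the experiments: only the noise is rescaled (the speech is left untouched), powers are measured over the whole signal without any voice activity detection, and random signals stand in for the recorded speech and road noise. The helper names `scale_noise_to_snr` and `measure_snr_db` are hypothetical.

```python
import numpy as np

def measure_snr_db(speech, noise):
    """Full-band SNR in dB from the mean power of both signals."""
    return 10.0 * np.log10(np.mean(np.square(speech)) / np.mean(np.square(noise)))

def scale_noise_to_snr(speech, noise, target_snr_db):
    """Return a scaled copy of `noise` such that the speech-to-noise power ratio equals target_snr_db."""
    speech_power = np.mean(np.square(speech))
    noise_power = np.mean(np.square(noise))
    target_noise_power = speech_power / (10.0 ** (target_snr_db / 10.0))
    return noise * np.sqrt(target_noise_power / noise_power)

rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)        # stand-in for speech at the microphone (assumption)
noise = 3.0 * rng.standard_normal(16000)   # stand-in for recorded road noise (assumption)

scaled_noise = scale_noise_to_snr(speech, noise, 10.0)
achieved_snr_db = measure_snr_db(speech, scaled_noise)
mixture = speech + scaled_noise            # microphone signal at the predefined SNR
```

Fixing the SNR at one reference point, here the microphone, makes results at different noise levels comparable, because any change measured at the listener's ear can then be attributed to the system rather than to the recording level.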
In addition to the acoustic impression (listening to the output signal) and the optical impression (looking at the spectrogram), we also include measurable metrics in our system evaluation. First, we consider the delay, which we measure with the aid of composite speech signals (speech signals containing white noise bursts of length 200 ms repeated every second) at a high SNR (engine turned off). Cross-correlation measurements with the in-car communication system turned off determine a delay of the acoustic path of 61 samples (3.8 ms). Including the system and subtracting the cross-correlation of the acoustic propagation, we obtain a system delay of 110 samples (7 ms). With such a small delay, the first wavefront at the rear seat passenger arrives from the front (and not from the loudspeakers behind him), leaving the correct impression that the signal is coming from the driver, according to the Haas effect [2].

Having determined the delays, we perform SNR measurements at different noise levels. As shown in Figure 6, we measure the reference SNR at the front microphone. Then we determine the signal-to-noise ratio at the rear seat passenger's left ear with the in-car communication system switched off (without support) and on (with support). The curves in the diagram show that our system features a minimum SNR improvement of 3 dB (even increasing for a lower overall SNR).

Figure 6: SNR (in dB) at front microphone and rear seat passenger. (Plot titled "SNR improvement at different background noise levels"; SNR at passenger's ear in dB over SNR at front microphone in dB, with curves "with support" and "without support".)

The distribution of the SNR improvement over the one-third octave bands is illustrated in Figure 7. Comparing the dashed (without system support) and solid (with support) lines, it can be observed that the SNR improvement introduced by our in-car communication system primarily occurs in the relevant bands between 300 Hz and 6400 Hz, supporting the acoustic impression of how the signal is actually enhanced.

Figure 7: SNR (in dB) in relevant one-third octave bands. (Plot titled "SNR derived from Power Spectral Densities @ 8dB (input)"; SNR at passenger's ear in dB over frequency in Hz, 200 Hz to 6400 Hz, with curves "with support" and "without support".)

4 Conclusion

In this paper we have presented a half-duplex in-car communication system enhancing the acoustic propagation from front passengers to rear passengers inside a car. The system has low computational complexity and has been implemented on the digital signal processor TMS320C6713 from Texas Instruments. Our experiments on real data show an audible enhancement of the speech signal which can not only be visualized in spectrograms as shown in Fig. 5 but can also be seen in the signal-to-noise ratio measurements outlined in Figures 6 and 7. For the half-duplex communication from front seat passengers to rear seat passengers, our system features an SNR improvement of 3 to 7 dB at typical noise levels in a car cabin.

Acknowledgements

Research for this article was supported by the German Federal Ministry of Education and Research (Grant No. 17 N11 08).

References

[1] B. M. Finn. Integrated vehicle voice enhancement system and hands-free cellular telephone system. European Patent EP 0 932 142 A2, Jul. 1999.

[2] H. Haas. The influence of a single echo on the audibility of speech. J. Audio Eng. Soc., 20:145–159, March 1972.

[3] E. Hänsler and G. Schmidt, editors. Topics in Acoustic Echo and Noise Control: Selected Methods for the Cancellation of Acoustical Echoes, the Reduction of Background Noise, and Speech Processing. Springer, 2006.

[4] J. M. Kates. Signal Processing for hearing aids. Kluwer Academic Publishers, 1998. Chapter 6 in M. Kahrs, K. Brandenburg: Applications of Digital Signal Processing to Audio and Acoustics.

[5] K. Linhard and J. Freudenberger. Passenger in-car communication enhancement. In Proc. EUSIPCO, Vienna, pages 21–24, 2004.

[6] G. Nishinimoya. Improvement of acoustic feedback stability of public address system by warbling. In Proceedings of the Sixth International Congress of Acoustics, pages 93–96, 1968.

[7] K. Schaaf, J. Schultz, and K. Tontch. Digital voice enhancement for improved in-car communication. In Proc. 3rd IFAC Workshop Advances in Automotive Control, Karlsruhe, Germany, March 2001.

[8] G. Schmidt and T. Haulick. Signal processing for in-car communication systems. Signal Processing, 86(6):1307–1326, 2006.
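As a supplementary illustration of the noise reduction filter described in Section 2, the following Python sketch computes the spectral subtraction gain in the frequency domain, converts it to FIR coefficients with an inverse FFT, smooths the coefficients recursively, and filters by time-domain convolution. It is a sketch under stated assumptions rather than the authors' implementation: the noise spectrum is averaged over a separate noise-only segment (standing in for VAD-gated estimation), the gain is clipped to [0, 1], the zero-phase filter is made causal by a circular rotation of half the FFT length, and only the final filter is used for the evaluation instead of updating it while filtering. All function names and the test signals are hypothetical.

```python
import numpy as np

N_FFT = 128   # FFT length (the text reports 64 or 128 points as sufficient)
DELTA = 0.3   # smoothing constant of the power spectrum (delta in the text)
GAMMA = 0.5   # smoothing constant of the coefficients (gamma, within [0.1, 0.9])
HOP = 32      # filter update interval: 2 ms at 16 kHz

def smooth_power_spectrum(prev_psd, frame, delta=DELTA):
    """Recursive estimate: (1 - delta) * previous spectrum + delta * current periodogram."""
    current = np.abs(np.fft.rfft(frame, N_FFT)) ** 2
    return (1.0 - delta) * prev_psd + delta * current

def subtraction_gain(psd_xx, psd_nn):
    """H(f) = sqrt(1 - Phi_nn / Phi_xx); clipping to [0, 1] is an added assumption."""
    ratio = psd_nn / np.maximum(psd_xx, 1e-12)
    return np.sqrt(np.clip(1.0 - ratio, 0.0, 1.0))

def gain_to_fir(gain):
    """Inverse FFT of the real gain, rotated by N_FFT/2 to obtain a causal FIR filter (assumption)."""
    return np.roll(np.fft.irfft(gain, N_FFT), N_FFT // 2)

def smooth_coefficients(prev_h, new_h, gamma=GAMMA):
    """Recursive smoothing of the time-domain coefficients against musical noise."""
    return (1.0 - gamma) * prev_h + gamma * new_h

fs = 16000
rng = np.random.default_rng(0)
tone = np.sin(2 * np.pi * 1000 * np.arange(fs) / fs)   # stand-in for the wanted signal
noise = 0.1 * rng.standard_normal(fs)                  # noise present during activity
noise_only = 0.1 * rng.standard_normal(fs)             # segment a VAD would mark as pause
noisy = tone + noise

# Noise spectrum: average periodogram over the noise-only segment
psd_nn = np.mean(
    [np.abs(np.fft.rfft(noise_only[s:s + N_FFT])) ** 2
     for s in range(0, fs - N_FFT, N_FFT)],
    axis=0,
)

psd_xx = np.zeros(N_FFT // 2 + 1)
h = np.zeros(N_FFT)
for start in range(0, fs - N_FFT, HOP):
    psd_xx = smooth_power_spectrum(psd_xx, noisy[start:start + N_FFT])
    h = smooth_coefficients(h, gain_to_fir(subtraction_gain(psd_xx, psd_nn)))

# Evaluate the final filter on both components separately
filtered_tone = np.convolve(tone, h)[:fs]
filtered_noise = np.convolve(noise, h)[:fs]
```

Because the gain is close to one only where the signal clearly exceeds the noise estimate, the tone passes nearly unchanged while the broadband noise power drops. Note that the circular rotation adds a delay of half the FFT length, which is a property of this sketch and not a statement about the delay of the described system.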
