
Speech & Audio Processing

Part - II

Marc Moonen
Dept. E.E./ESAT, K.U.Leuven
Speech & Audio Processing - II

Topics overview
Course notes/literature
Exercise sessions/project

Speech & Audio Version 2010-2011 Introduction p. 2


Part-I (H. Van hamme)

speech recognition
speech coding (+audio coding)
speech synthesis (TTS)

Part-II (M. Moonen) : audio/speech signal processing

digital audio recording/playback
microphone array processing
noise, echo, and feedback cancellation
loudspeaker arrays - 3D audio
PS: selection of 8 topics

Aim is 2-fold :
Speech & audio per se
S & A industry in Belgium/Europe/...
Basic signal processing theory/principles :
Optimal filters / Kalman filters (linear/nonlinear)
Advanced adaptive filter algorithms (APA, PBFDAF, Filtered-X LMS, ...)
Time-Frequency analysis (STFT, ...)
Oversampling D/A - A/D, etc.


Topics Overview : Topic-1

Digital audio recording & playback

Sampling, oversampling A/D - D/A conversion
Example: Super Audio CD (SACD)
Amplitude quantization: distortion & dither
Re-quantization (e.g. 24 bits -> 16 bits)
dithered re-quantization, noise shaping
Sampling rate conversion (e.g. 32 kHz, 44.1 kHz, 48 kHz, ...)

[Figure: two block diagrams of 24-bit to 16-bit re-quantization; the second one involves an error signal e'(t)]
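
The difference between plain rounding and dithered re-quantization can be illustrated with a minimal sketch. It assumes integer samples, a reduction by 8 bits (24 -> 16), and triangular-PDF (TPDF) dither spanning +/- 1 output LSB; the function names and the test level of 64 (a quarter of an output LSB) are hypothetical choices for illustration, and noise shaping is not included.

```python
import math
import random

def requantize_round(sample_24, shift=8):
    """Plain rounding: drop `shift` low-order bits, rounding to nearest."""
    step = 1 << shift
    q = (sample_24 + step // 2) >> shift      # floor((x + step/2) / step)
    return max(-32768, min(32767, q))         # clip to the 16-bit range

def requantize_tpdf(sample_24, rng, shift=8):
    """Dithered rounding: add triangular-PDF noise before rounding."""
    step = 1 << shift
    half = step / 2
    # Sum of two independent uniform variables gives a triangular PDF.
    dither = rng.uniform(-half, half) + rng.uniform(-half, half)
    q = math.floor((sample_24 + dither) / step + 0.5)
    return max(-32768, min(32767, q))

# A constant input below half an output LSB (assumed test signal).
rng = random.Random(0)
x = 64
plain = [requantize_round(x) for _ in range(20000)]
dithered = [requantize_tpdf(x, rng) for _ in range(20000)]
mean_plain = sum(plain) / len(plain)          # rounding erases the signal
mean_dithered = sum(dithered) / len(dithered) # average stays near x / 256
```

With plain rounding the low-level input disappears entirely (a signal-dependent error, i.e. distortion), whereas with dither the output is noisier but its average still tracks the input.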
Topics Overview : Topic-2

Microphone Array Processing

Spatial filtering - Beamforming
Fixed vs. adaptive beamforming
Example filter-and-sum beamformer :

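[Figure: filter-and-sum beamformer. A source signal S(ω) arrives from direction θ at M microphones. Microphone m, at distance d_m from the reference microphone (path-length difference d_m cos θ), records Y_m(ω,θ), which is filtered by F_m(ω). The M filter outputs are summed to give Z(ω,θ).]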

Application: hearing aids
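
As a rough illustration of spatial filtering, the sketch below evaluates a delay-and-sum beamformer, the simplest fixed filter-and-sum design, at a single frequency. It assumes a far-field source, a linear array, a speed of sound of 340 m/s, and the sign convention Y_m = S exp(-j ω d_m cos θ / c); the function names, the 5 cm spacing and the 2 kHz test frequency are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 340.0  # m/s (assumed)

def steering_vector(freq_hz, theta_rad, mic_positions):
    """Far-field microphone responses Y_m / S for a source at angle theta."""
    omega = 2 * math.pi * freq_hz
    return [cmath.exp(-1j * omega * d * math.cos(theta_rad) / SPEED_OF_SOUND)
            for d in mic_positions]

def delay_and_sum_filters(freq_hz, look_rad, mic_positions):
    """Filters F_m that undo the delays of the look direction, then average."""
    m = len(mic_positions)
    return [v.conjugate() / m
            for v in steering_vector(freq_hz, look_rad, mic_positions)]

def response(filters, freq_hz, theta_rad, mic_positions):
    """Magnitude |Z / S| for Z = sum_m F_m * Y_m."""
    y = steering_vector(freq_hz, theta_rad, mic_positions)
    return abs(sum(f * ym for f, ym in zip(filters, y)))

mics = [0.0, 0.05, 0.10, 0.15]          # microphone positions d_m in metres
look = math.pi / 2                      # steer towards broadside
f = delay_and_sum_filters(2000.0, look, mics)
on_target = response(f, 2000.0, look, mics)   # unit gain in the look direction
off_target = response(f, 2000.0, 0.0, mics)   # attenuated at endfire
```

An adaptive beamformer would update the filters F_m from the data instead of fixing them in advance.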


Topics Overview
Speech communication set-up :
- background noise -> noise suppression, source separation
- far-end echoes -> acoustic echo cancellation
- reverberation -> de-reverberation/deconvolution

Applications :
hands-free telephony
hearing aids, etc..
Topics Overview : Topic-3

Noise Reduction

microphone_signal[k] = speech[k] + noise[k]

Single-microphone noise reduction

Spectral Subtraction Methods (spectral filtering)
Iterative methods based on speech modeling
(Wiener & Kalman Filters)

Multi-microphone noise reduction

Beamforming revisited
Optimal filtering approach : spectral+spatial filtering
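
A minimal sketch of magnitude spectral subtraction on a single frame is given below, assuming the noise magnitude spectrum has already been estimated (for instance during speech pauses). The naive DFT, the `floor` parameter and all function names are illustrative assumptions; a practical system would use an FFT-based STFT with overlapping windows.

```python
import cmath
import math

def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * i * k / n) for k in range(n))
            for i in range(n)]

def idft(spectrum):
    """Naive inverse DFT, returning the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[i] * cmath.exp(2j * math.pi * i * k / n)
                for i in range(n)).real / n
            for k in range(n)]

def spectral_subtract(frame, noise_mag, floor=0.0):
    """Subtract a noise magnitude estimate per bin, keeping the noisy phase."""
    cleaned = []
    for x_bin, n_bin in zip(dft(frame), noise_mag):
        mag = abs(x_bin)
        new_mag = max(mag - n_bin, floor * mag)   # never go below the floor
        gain = new_mag / mag if mag > 0 else 0.0  # real-valued gain per bin
        cleaned.append(gain * x_bin)
    return idft(cleaned)

n = 32
frame = [math.sin(2 * math.pi * 4 * k / n) for k in range(n)]
unchanged = spectral_subtract(frame, [0.0] * n)   # zero noise estimate
silenced = spectral_subtract(frame, [1e3] * n)    # estimate exceeds every bin
```

Because only the magnitude is modified, the method is a form of spectral filtering with a real, signal-dependent gain per frequency bin.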


Topics Overview : Topic-4

Acoustic Echo Cancellation

Adaptive filtering problem:

non-stationary/wideband/... speech signals
non-stationary/long/... acoustic channels

Adaptive filtering algorithms

AEC Control
AEC Post-processing
Stereo AEC
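
The adaptive filtering problem can be sketched with NLMS, used here only as a simple stand-in for the more advanced algorithms named in this course (APA, PBFDAF). The sketch assumes a noise-free, time-invariant, three-tap echo path and white far-end input, which is far easier than real acoustic channels; all names and parameter values are hypothetical.

```python
import random

def nlms_echo_canceller(far_end, mic, num_taps, mu=0.5, eps=1e-8):
    """Adapt an FIR echo-path model; return final weights and residual signal."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps                 # most recent far-end samples
    residual = []
    for x, d in zip(far_end, mic):
        buf = [x] + buf[:-1]
        echo_estimate = sum(wi * bi for wi, bi in zip(w, buf))
        e = d - echo_estimate              # echo-cancelled output
        norm = eps + sum(b * b for b in buf)
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        residual.append(e)
    return w, residual

rng = random.Random(0)
far_end = [rng.gauss(0.0, 1.0) for _ in range(2000)]
echo_path = [0.5, -0.3, 0.1]               # assumed toy room response
mic = [sum(echo_path[j] * far_end[k - j]
           for j in range(len(echo_path)) if k - j >= 0)
       for k in range(len(far_end))]
w, residual = nlms_echo_canceller(far_end, mic, num_taps=4)
```

In this idealized setting the weights converge to the echo path (with the spare fourth tap near zero). With near-end speech present, adaptation has to be slowed or frozen, which is the role of AEC control.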


Topics Overview : Topic-5

Acoustic Feedback Cancellation

Ex: Hearing aids
Ex: PA systems
correlation between filter input (x) and near-end signal (n)
fixes : noise injection, pitch shifting, notch filtering, ...



Topics Overview : Topic-6

Reverb & De-reverberation

microphone_signal[k] = filter*speech[k] (+ noise[k])

Reverb = effect of acoustic channel in between speaker and microphone

Reverb has an impact on coding, speech recognition, etc.

Single-microphone de-reverberation
Cepstrum techniques
Multi-microphone de-reverberation:
Estimation of acoustic impulse responses
Inverse-filtering method
Matched filtering
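
The signal model and the idea of inverse filtering can be sketched as follows, where `*` in the model denotes convolution. The example assumes a known, noise-free, minimum-phase single-channel filter so that a causal recursive inverse is stable. Real room responses are generally not minimum phase, which is one motivation for multi-microphone methods. Function names and signal values are hypothetical.

```python
def convolve(h, s):
    """y[k] = sum_j h[j] * s[k - j], truncated to len(s) samples."""
    return [sum(h[j] * s[k - j] for j in range(len(h)) if k - j >= 0)
            for k in range(len(s))]

def inverse_filter(h, y):
    """Recursively solve y = h * s for s; stable only for minimum-phase h."""
    s_hat = []
    for k in range(len(y)):
        tail = sum(h[j] * s_hat[k - j] for j in range(1, len(h)) if k - j >= 0)
        s_hat.append((y[k] - tail) / h[0])
    return s_hat

speech = [1.0, -0.5, 0.25, 0.0, 2.0, -1.0]   # toy "speech" samples
channel = [1.0, 0.5, 0.25]                    # assumed minimum-phase channel
mic = convolve(channel, speech)               # reverberant microphone signal
recovered = inverse_filter(channel, mic)      # de-reverberated estimate
```

In practice the acoustic impulse response is unknown and must be estimated first, and added noise is amplified by the inverse filter.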


Topics Overview : Topic-7

3D Audio & Loudspeaker Arrays

Binaural synthesis
with headphones
head related transfer functions (HRTF)
with 2+ loudspeakers ("sweet spot")
crosstalk cancellation
(Wavefield synthesis)
with loudspeaker/microphone arrays


Topics Overview : Topic-7bis

Active Noise Control

Solution based on "filtered-X LMS"

Application : active headsets/ear defenders
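
A single-channel filtered-X LMS loop might look like the sketch below. It assumes a feedforward set-up with a clean reference signal, a perfectly known secondary path (loudspeaker to error microphone), and toy primary and secondary paths chosen so that an exact FIR controller exists; every name and value is a hypothetical illustration rather than a headset design.

```python
import random

def fir(h, buf):
    """Inner product of coefficients with the most recent samples."""
    return sum(hi * bi for hi, bi in zip(h, buf))

def fxlms(x, d, s_true, s_hat, num_taps, mu):
    """Adapt controller w so that d[k] + (s_true * y)[k] goes to zero."""
    w = [0.0] * num_taps
    x_buf = [0.0] * max(num_taps, len(s_hat))  # recent reference samples
    y_buf = [0.0] * len(s_true)                # recent controller outputs
    fx_buf = [0.0] * num_taps                  # recent filtered-reference samples
    errors = []
    for xk, dk in zip(x, d):
        x_buf = [xk] + x_buf[:-1]
        y_buf = [fir(w, x_buf)] + y_buf[:-1]   # anti-noise sent to loudspeaker
        e = dk + fir(s_true, y_buf)            # residual at the error microphone
        fx_buf = [fir(s_hat, x_buf)] + fx_buf[:-1]  # reference through path model
        w = [wi - mu * e * fi for wi, fi in zip(w, fx_buf)]
        errors.append(e)
    return w, errors

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
primary = [0.0, 0.0, 0.6, 0.3]     # assumed noise path to the error microphone
secondary = [0.0, 1.0, 0.5]        # assumed loudspeaker-to-microphone path
d = [sum(primary[j] * x[k - j] for j in range(len(primary)) if k - j >= 0)
     for k in range(len(x))]
w, errors = fxlms(x, d, secondary, secondary, num_taps=4, mu=0.01)
```

The reference is passed through the secondary-path model before it enters the LMS update, which is what distinguishes filtered-X LMS from plain LMS.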


Topics Overview : Topic-8

Editing : Time scaling and pitch shifting

Applications : Karaoke, playback speed control, ...
Application of time-frequency analysis/processing
Techniques : OLA, PSOLA, ...
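
Plain overlap-add (OLA) time scaling can be sketched as below: frames are read at one hop size and written at another. The frame length, hop sizes, Hann-type window and per-sample normalization are assumptions made for illustration. Plain OLA ignores waveform periodicity and therefore produces audible artifacts on pitched sounds, which PSOLA-type methods address.

```python
import math

def ola_time_scale(x, stretch, frame_len=256, synth_hop=64):
    """Time-stretch x by roughly `stretch`; needs len(x) >= frame_len."""
    ana_hop = synth_hop / stretch          # read hop (may be fractional)
    # Hann-type window with a half-sample offset, strictly positive everywhere.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * (n + 0.5) / frame_len)
              for n in range(frame_len)]
    num_frames = int((len(x) - frame_len) / ana_hop) + 1
    out_len = (num_frames - 1) * synth_hop + frame_len
    out = [0.0] * out_len
    norm = [0.0] * out_len                 # summed window, for normalization
    for m in range(num_frames):
        read = int(round(m * ana_hop))     # frame start in the input
        write = m * synth_hop              # frame start in the output
        for n in range(frame_len):
            out[write + n] += window[n] * x[read + n]
            norm[write + n] += window[n]
    return [o / w if w > 1e-12 else 0.0 for o, w in zip(out, norm)]

signal = [1.0] * 2048                      # constant test signal
stretched = ola_time_scale(signal, stretch=2.0)
```

A stretch factor above 1 slows playback down and a factor below 1 speeds it up, in both cases without resampling, so the pitch is nominally preserved.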


Aims/Scope (revisited)

Aim is 2-fold :
Speech & audio per se
Basic signal processing theory/principles :
Optimal filtering / Kalman filters (linear/nonlinear)
here : speech enhancement
other : automatic control, spectral estimation, ...
Advanced adaptive filter algorithms
here : acoustic echo cancellation
other : digital communications, ...
Filtered-X LMS
here : 3D audio
other : active noise/vibration control

H197 "Systeemtheorie - Regeltechniek" (Systems Theory and Control Engineering) (JVDW)

HJ09 "Digitale Signaalverwerking I" (Digital Signal Processing I) (PW)

signal transforms, sampling, multi-rate, DFT, ...

HC63 "Digitale Signaalverwerking II" (Digital Signal Processing II) (MM)

filter design, filter banks, optimal & adaptive filters


Literature (General) (available in DSP-II library)
Simon Haykin
"Adaptive Filter Theory" (Prentice Hall 1996)
P.P. Vaidyanathan
"Multirate Systems and Filter Banks" (Prentice Hall 1993)

Literature (specialized) (some available in DSP-II library)

S.L. Gay & J. Benesty
"Acoustic Signal Processing for Telecommunication" (Kluwer 2000)
M. Kahrs & K. Brandenburg (Eds)
"Applications of Digital Signal Processing to Audio and Acoustics"
B. Gold & N. Morgan
"Speech and Audio Signal Processing" (Wiley 2000)


Part-II Lectures

Lecture-1 : March 29 10.35-12.25

Lecture-2 : March 31 8.25-10.25
Lecture-3 : April 5 10.35-12.35
Lecture-4 : April 7 8.25-10.25

Lecture-5 : April 26 10.35-12.35

Lecture-6 : April 28 8.25-10.25
Lecture-7 : May 3 (or 5) 10.35-12.35 (or 8.25-10.25)
Lecture-8 : May 10 10.35-12.35
Lecture-9 : May 12 8.25-10.25


Exercise Sessions/Project

4 Matlab/Simulink Sessions

Session-1: Re-quantization
straightforward rounding vs dithered re-quantization
off-line experiments with test signals
Session-2 : Microphone Array Processing
Griffiths-Jim beamforming
off-line experiments with recorded data
Session 3-4: "Project Sessions"
TA: Romain Serizel (F/E), romain.serizel@esat


Exercise Sessions/Project

Complete speech enhancement set-up
builds on results from exercise session 2
single-/multi-microphone noise reduction +AEC
off-line experiments with recorded data

Deliverable : Matlab/Simulink software

+ Report (max. 5p)
Groups of 2
Time budget = 16hrs per person



Oral exam, with written preparation

Open book (limited)
2 insight/reasoning questions, no calculation exercises

7.5 for question-1

7.5 for question-2

+5 for project (software/report)

= 20

Contact: romain.serizel@esat

Slides (use "version 2011"!)

FAQs (send questions to marc.moonen@esat)
