SSP Sheets 2007 09


Speech Signal Processing David Weenink Bandfilter Analysis

Speech Signal Processing


David Weenink
Institute of Phonetic Sciences University of Amsterdam

First semester 2007


Filterbank Analysis

Bandfilter analysis



Mel Frequency Cepstral Coefficients References

Advantages: objective analysis
Disadvantage: F0 variations

Sound to MFCC


Analysis Frame: Steps


1. A short-time FT to calculate the power ($K$ spectral values $S_k$)
2. Filterbank with $J$ triangular filters $H_j$: $P_j = \sum_k H_{jk} S_k$ for $1 \le j \le J$, where $\sum_k H_{jk} = 1$ for all $j$
3. Cosine transform of the filterbank output: $c_m = \sum_{j=0}^{J-1} \cos\left(m \frac{\pi}{J} (j + 0.5)\right) \log P_j$

Praat datatype MelFilter
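The first step can be sketched in plain Python as follows. This is a minimal illustration, not Praat's implementation: the Hann window, the naive DFT, the 16000 Hz sampling rate, and the 25 ms frame length are all assumptions chosen for the example, and `power_spectrum` is a hypothetical helper name.

```python
import cmath
import math

def power_spectrum(frame):
    """Return K = N//2 + 1 power values S_k for one real-valued frame."""
    n = len(frame)
    # Hann window: an assumption, the slides do not name the window shape.
    windowed = [x * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        # Naive O(N^2) DFT bin; adequate for a demo, use an FFT in practice.
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum

# One 25 ms frame (400 samples) of a 1000 Hz sine sampled at 16000 Hz.
fs = 16000
frame = [math.sin(2.0 * math.pi * 1000.0 * i / fs) for i in range(400)]
S = power_spectrum(frame)

# The strongest bin should sit at the frequency of the sine.
peak_bin = max(range(len(S)), key=S.__getitem__)
peak_hz = peak_bin * fs / len(frame)
```

With these settings the bins are 40 Hz apart, so the sine falls exactly on one bin and `peak_hz` comes out at the sine frequency.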

The Filterbank: Hertz Scale


[Figure: the filterbank on a Hertz scale. x-axis: Frequency (Hz), 0 to 8000; y-axis: Amplitude.]

The Filterbank: Mel Scale


[Figure: the filterbank on a mel scale. x-axis: Frequency (mel), 0 to 3000; y-axis: Amplitude.]

Hertz to Mel

[Figure: mel as a function of Hertz. x-axis: Hertz, 0 to 8000; y-axis: mel, 0 to 3000.]

From Hz to mel
mel = 2595 \log_{10}(1 + Hertz/700)
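A minimal sketch of this conversion in Python, assuming the logarithm is base 10 (which agrees with the power of 10 in the inverse formula). The function name `hz_to_mel` is chosen for illustration only.

```python
import math

def hz_to_mel(hz):
    """Convert a frequency in Hertz to mel using the formula above."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

# A few sample points along the curve.
mel_at_0 = hz_to_mel(0.0)
mel_at_1000 = hz_to_mel(1000.0)
mel_at_8000 = hz_to_mel(8000.0)
```

With these constants 1000 Hz maps to very nearly 1000 mel, and 8000 Hz stays below 3000 mel, which matches the axis range of the plot.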

Mel to Hertz

[Figure: Hertz as a function of mel. x-axis: mel, 0 to 3000; y-axis: Hertz, 0 to 8000.]

From Mel to Hertz


Hertz = 700 (10^{mel/2595} - 1)
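The inverse mapping can be checked numerically with a short round trip. This is a sketch using hypothetical helper names `hz_to_mel` and `mel_to_hz`, not part of any library.

```python
import math

def hz_to_mel(hz):
    """Hertz to mel: 2595 * log10(1 + hz / 700)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    """Mel to Hertz: 700 * (10 ** (mel / 2595) - 1)."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

# Converting to mel and back should reproduce the input up to rounding.
test_frequencies = [0.0, 100.0, 1000.0, 4000.0, 8000.0]
round_trip = [mel_to_hz(hz_to_mel(f)) for f in test_frequencies]
```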
[Figure: two panels for the same sound, one with vertical axis Frequency (mel) and one with vertical axis Frequency (Hz), 0 to 8000; horizontal axis 0 to 3.]

train/dr1/mcpm0/sa1.wav
To MelFilter... 0.025 ...0.005 100 100 0
To Spectrogram... 0.025 ...8000 0.005 20 Gaussian

Sound: To MelFilter...


mel = 2595 \log_{10}(1 + Hertz/700)
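To illustrate why the filters are laid out on the mel scale, the sketch below places filter centres at equal mel steps and converts them back to Hertz. It assumes a first centre at 100 mel, a spacing of 100 mel, and an upper limit of 8000 Hz; reading the values `100 100` in the Praat command this way is an assumption, and the rule for where the last filter ends is a simplification that need not match Praat.

```python
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def filter_centres_mel(first_mel, distance_mel, max_hz):
    """Centres spaced equally in mel whose upper edge stays below max_hz."""
    max_mel = hz_to_mel(max_hz)
    centres = []
    centre = first_mel
    # Assumed rule: each filter extends one spacing above its centre.
    while centre + distance_mel <= max_mel:
        centres.append(centre)
        centre += distance_mel
    return centres

centres_mel = filter_centres_mel(100.0, 100.0, 8000.0)
centres_hz = [mel_to_hz(m) for m in centres_mel]

# Equal steps in mel become ever larger steps in Hertz.
gaps_hz = [b - a for a, b in zip(centres_hz, centres_hz[1:])]
```

The growing gaps in `gaps_hz` correspond to the widening triangles seen when the filterbank is drawn on a Hertz axis.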

[Figure: spectrum. x-axis: Frequency (Hz), 0 to 8000; y-axis: Sound pressure level (dB/Hz), 0 to 60.]

The bandfiltering process:
Filterbank with $J$ triangular filters $H_j$
$P_j = \sum_k H_{jk} S_k$ for $1 \le j \le J$, where $\sum_k H_{jk} = 1$ for all $j$
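The filterbank step can be sketched as follows. Several choices here are assumptions made for the illustration rather than details taken from the slides: the triangles are defined on the mel axis, each one spans one spacing on either side of its centre, and the spectral bins are spread evenly from 0 Hz to half the sampling rate. Lists are indexed from 0 in the code.

```python
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def triangular_filterbank(num_bins, fs, first_mel, distance_mel):
    """Return rows H_j over K spectral bins, each normalised to sum to 1."""
    nyquist = fs / 2.0
    max_mel = hz_to_mel(nyquist)
    # Mel position of every spectral bin k = 0 .. K-1.
    bin_mel = [hz_to_mel(k * nyquist / (num_bins - 1)) for k in range(num_bins)]
    filters = []
    centre = first_mel
    while centre + distance_mel <= max_mel:
        # Triangle: 1 at the centre, falling linearly to 0 one spacing away.
        row = [max(0.0, 1.0 - abs(m - centre) / distance_mel) for m in bin_mel]
        total = sum(row)
        if total > 0.0:
            # Enforce sum_k H_jk = 1 for this filter.
            row = [h / total for h in row]
        filters.append(row)
        centre += distance_mel
    return filters

def apply_filterbank(filters, S):
    """P_j = sum_k H_jk * S_k for every filter j."""
    return [sum(h * s for h, s in zip(row, S)) for row in filters]

H = triangular_filterbank(201, 16000.0, 100.0, 100.0)
flat_spectrum = [1.0] * 201
P = apply_filterbank(H, flat_spectrum)
```

Because every row of `H` sums to 1, a flat spectrum yields the same output for every filter, which is a quick sanity check on the normalisation.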

MelFilter: To MFCC...


Cosine transform of the MelFilter:
$c_m = \sum_{j=0}^{J-1} \cos\left(m \frac{\pi}{J} (j + 0.5)\right) \log P_j$
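A direct transcription of the cosine transform into Python might look like this. It assumes the natural logarithm and filter outputs indexed from 0 to J-1, and it also returns the coefficient for m = 0; the slides do not state the base of the logarithm or the range of m, so treat both as choices made for this sketch.

```python
import math

def cepstral_coefficients(P, num_coefficients):
    """c_m = sum_{j=0}^{J-1} cos(m * pi / J * (j + 0.5)) * log(P_j)."""
    J = len(P)
    log_p = [math.log(p) for p in P]  # natural log: an assumption
    return [
        sum(math.cos(m * math.pi / J * (j + 0.5)) * lp
            for j, lp in enumerate(log_p))
        for m in range(num_coefficients)
    ]

# Identical filter outputs have no spectral shape, so only c_0 is non-zero.
flat_outputs = [math.e] * 12
c = cepstral_coefficients(flat_outputs, 5)
```

For the flat input, `c[0]` equals the number of filters times the log of the common value, and the higher coefficients vanish up to rounding, since they only measure how the log spectrum varies across filters.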

References


R. Vergin & D. O'Shaughnessy (1999), Generalized Mel Frequency Cepstral Coefficients for Large-Vocabulary Speaker-Independent Continuous-Speech Recognition, IEEE Trans. on Speech and Audio Processing 7, 525-532.
