Download as pdf or txt
Download as pdf or txt
You are on page 1of 15

ELEN E4896 MUSIC SIGNAL PROCESSING

Lecture 5: Sinusoidal Modeling


1. 2. 3. 4. Sinusoidal Modeling Sinusoidal Analysis Sinusoidal Synthesis & Modication Noise Residual
Dept. Electrical Engineering, Columbia University
dpwe@ee.columbia.edu E4896 Music Signal Processing (Dan Ellis) http://www.ee.columbia.edu/~dpwe/e4896/ 2012-02-13 - 1 /15

Dan Ellis

Periodic sounds

1. Sinusoidal Modeling

ridges in spectrogram
Violin.arco.ff.A4
2012-02-13 - 2 /15

each ridge is a sinusoidal harmonic .. with smoothly-varying parameters .. an efcient & exible description?

E4896 Music Signal Processing (Dan Ellis)

Analogous to Fourier series


k

Sinusoid Modeling

model harmonics explicitly? e.g. x[n] = ak [n]cos( k [n]) ... for pitched signal with fundamental k [n] = k 0 [n] n harmonicity smoothness of ak [n]
0 [n]

Additional constraints

Arbitrarily accurate given enough sinusoids


E4896 Music Signal Processing (Dan Ellis) 2012-02-13 /15

Using Michael Klingbeils SPEAR


http://www.klingbeil.com/spear/

Examples

E4896 Music Signal Processing (Dan Ellis)

2012-02-13 - 4 /15

0.4 0.3 0.2 0.1 0

Extracted envelope reects analysis window


2000 1500 1000 500 0 10 20 30 40 50 60 70 0 0 0.5 1 Time 1.5 2
Frequency Frequency

Envelope Limitations

0.4 0.3 0.2 0.1 0 0 10 20 30 40 50 60 70

2000 1500 1000 500 0 0 0.5 1 Time 1.5 2

Sharp window violates assumptions

E4896 Music Signal Processing (Dan Ellis)

2012-02-13 - 5 /15

Sinusoids = peaks in spectrogram slices


N 1

2. Sinusoidal Analysis

= DFT frames X[k, m] =


n=0

x[n + mL] w[n]e

kn j2N

DFT length N

Hop advance L
E4896 Music Signal Processing (Dan Ellis)

window determines frequency resolution: X(ej ) W (ej ) long enough to see harmonics e.g. 2-3x longest pitch cycle typically 50-100 ms but: too long blurs amplitude envelope ak [n] choose N/2 or N/4 .. denser for simpler interpolation along time
2012-02-13 - 6 /15

Local maxima in DFT frames


8000 6000 4000 2000 0 0 20 0.02 0.04 0.06 0.08 0.1 0.12 0.14

Sinusoidal Peak Picking

freq / Hz

0.16

0.18

time / s

level / dB

0 -20 -40 -60 0 1000 2000 3000 4000 5000 6000 7000 freq / Hz

Quadratic t for sub-bin resolution


20

level / dB

10 0 -10 -20

ab2/4

y = ax(x-b) x b/2

phase / rad

-5

-10 400 600 800 freq / Hz

400

600

800 freq / Hz

E4896 Music Signal Processing (Dan Ellis)

2012-02-13 - 7 /15

Dont want every peak


just true sinusoids threshold?
20

Peak Selection

level / dB

-20 -40 -60 0 1000 2000 3000 4000 5000 6000 7000

freq / Hz

Look for stability


E4896 Music Signal Processing (Dan Ellis)

local shape - ts

) W (ej ) 0

of frequency & amplitude in successive time frames phase derivative in time/freq


2012-02-13 - 8 /15

Connect peaks in adjacent frames to form


sinusoids
can be ambiguous if large frequency changes
freq

Track Formation

birth

existing tracks

death

new time peaks

Unclaimed peak create new track No continuation of track termination


hysteresis
E4896 Music Signal Processing (Dan Ellis) 2012-02-13 - 9 /15

Extracted sinusoids could be anywhere


freq / Hz
4000 2000 0 0 0.05 0.1 0.15 0.2

Pitch Tracking
freq / Hz
700 650 600 550 0

but often expect them to be in harmonic series

6000

Find pitch by searching for common factor [n] = k [n]


can then regularize pitch
k 0
E4896 Music Signal Processing (Dan Ellis)

time / s

0.05

0.1 0.15

0.2

time / s

2012-02-13 - 10/15

Each sinusoid track {a [n],


drives an oscillator
3 2 1

3. Sinusoidal Synthesis
k k [n]}
3 2

level

a k[n]

a k[n]cos(

k[n]t)

1 0 -1

z q/H fre

0 700 600 500 0 k[n] 0.05 0.1 0.15 0.2

-2

time / s

-3 0

0.05

0.1 0.15

time / s

0.2

Faster method synthesizes DFT frames


then overlap-add trickier to achieve frequency modulation
E4896 Music Signal Processing (Dan Ellis) 2012-02-13 - 11/15

can interpolate amplitude, frequency samples

Sinusoidal description very easy to modify


e.g. changing time base of sample points
5000

Sinusoidal Modication

freq / Hz

4000 3000 2000 1000 0 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5

Frequency stretch
level / dB
40 30 20 10 0 0 1000 2000 3000 4000

time / s

preserve formant envelope?


level / dB
40 30 20 10 0 0 1000 2000 3000 4000

freq / Hz E4896 Music Signal Processing (Dan Ellis)

freq / Hz 2012-02-13 - 12/15

Some energy is not well t with sinusoids e.g. noisy energy Can just keep it as residual or model it some other way Leads to sinusoidal + noise model
x[n] =
k
20
mag / dB

4. Noise Residual

ak [n]cos(
sinusoids

k [n]n)

+ e[n]

0 -20 -40 -60 -80

original

residual LPC
0 1000 2000 3000 4000 5000 6000 7000 freq / Hz

E4896 Music Signal Processing (Dan Ellis)

2012-02-13 - 13/15

Sinusoids + Noise Decomposition

Removing sines reveals noise & transients


Guitar - original 4000 3000

Frequency

2000 1000 0

0.2

0.4

0.6

0.8

1 Time

1.2

1.4

1.6

1.8

Guitar - sinusoid reconstruction 4000 3000

Frequency

2000 1000 0

0.2

0.4

0.6

0.8

1 Time

1.2

1.4

1.6

1.8

Guitar - residual (original - sines) 4000 3000

Frequency

2000 1000 0

Different representation approaches...


E4896 Music Signal Processing (Dan Ellis)

0.2

0.4

0.6

0.8

1 Time

1.2

1.4

1.6

1.8

2012-02-13 - 14/15

Spectrogram shows sinusoid harmonics


in many sounds

Summary

Peak picking in spectrogram


can effectively extract them

Sinusoidal domain extremely exible


for modication

Noise residual can add even more realism


E4896 Music Signal Processing (Dan Ellis) 2012-02-13 - 15/15

You might also like