Professional Documents
Culture Documents
32 2 Lehtonen
32 2 Lehtonen
Timo I. Laakso*
Canceling and Selecting
*Helsinki University of Technology (TKK)
Department of Signal Processing and Acoustics
Partials from Musical Tones
P.O. Box 3000, FI-02015 TKK, Espoo, Finland
This article discusses canceling and extracting har- method, the signal is analyzed using the windowed
monics from a musical signal using digital filters. Fast Fourier transform (FFT), and the frequency and
This is an old technique that has been proposed in amplitude tracks are obtained by connecting data in
different forms by Moorer in the 1970s for pitch the neighboring analysis frames. This approach has
detection of speech signals (Moorer 1974) and for its roots in the phase vocoder technique and its
analyzing music data for additive synthesis (Moorer efficient transform-domain implementation. For
1977). The basic idea is to use a multiple-notch periodic or pseudo-periodic musical tones it is
filter to extract individual harmonic components as unnecessary to resort to an overly generic analysis
signals. The filter structure can be obtained as the method, because the frequencies of the harmonic
inverse transfer function of a comb filter (i.e., a components are known after the estimation of the
delay line in a feedback loop). fundamental frequency. Advantages of the proposed
In this article, we expand on a recently proposed filter-based analysis method (compared with the
idea that the delay line in the inverse comb filter more general FFT-based techniques) are simplicity,
(ICF) can be replaced with a high-order fractional- which follows mainly from the small number of
delay filter to obtain very accurate cancellation of parameters, and the possibility of designing filter
neighboring harmonics to select a single harmonic coefficients in closed form. Additionally, the result-
or to extract the residual signal by canceling all ing decomposition is obtained directly as a set of
harmonics (Vlimki, Ilmoniemi, and Huotilainen time-domain signals, and no separate synthesis
2004; Vlimki, Lehtonen, and Laakso 2007). The stage is required.
proposed signal analysis method is useful for many Other signal processing methods proposed for
practical cases. Many musical instruments, includ- analyzing the harmonic structure of musical signals
ing all woodwind, brass, and bowed string instru- include wavelets (Evangelista 1993) and high-
ments, produce a sound signal that is inherently resolution tracking methods (Badeau, David, and
harmonic, that is, the spectral components are Richard 2006). These methods provide excellent
integral multiples of a fundamental frequency. This frequency accuracy at the expense of a complicated
follows from the sound-production mechanism of algorithm and a high computational cost. The
these self-excited systems, which involves mode method proposed in this article can also provide
locking in the time domain (Fletcher and Rossing amplitude and frequency accuracy that is sufficient
1991). It forces the sustained tones of such instru- for musical signal analysis, but at the same time,
ments to be periodic. There is often a noise compo- the analysis method remains easy to apply.
nent in these musical tones, however, making them In this article, we first discuss the theory behind
pseudo-periodic in practice. canceling and selecting partials using fractional
Another method for this kind of signal decompo- delay filters. Then, we present three test cases to
sition is sinusoidal modeling (McAulay and Quat- demonstrate the power of this approach in musical
ieri 1986; Serra 1989; Serra and Smith 1990). In this signal analysis. Sound samples corresponding to
these examples are provided on the forthcoming
Computer Music Journal, 32:2, pp. 4356, Summer 2008 Computer Music Journal DVD (to be released with
2008 Massachusetts Institute of Technology. the Winter 2008 issue).
Lehtonen et al. 43
Figure 1. (a) Conventional (c) and (d) that allow the Figure 2. Magnitude ICF. The thick vertical
ICF and (b) a fractional- use of a fractional-delay response of the conven- lines indicate the har-
delay filter based ICF filter Hfd(z) of arbitrary tional (dashed line) and monic frequencies to be
(after Vlimki, Ilmoni- order for signals with both the fractional-delay canceled (f0 = 4,186 Hz,
emi, and Huotilainen (c) low and (d) high all-pass filter (solid line) fs = 44,100 Hz).
2004). Two other ICF fundamental frequencies.
structures are presented in
(a)
(b)
(c)
Lehtonen et al. 45
Figure 3. (a) Magnitude The standard Lagrange FIR Lagrange FIR filter is indi-
response, (b) phase delay filter is indicated with a cated with a solid line in
in samples, and (c) attenu- dashed line in (a) and (b) (a) and (b) and with plus
ation of harmonics of ICFs and with a dot at each signs in (c). The parameter
using standard and trun- attenuated harmonic in values are N = 160, M =
cated Lagrange FIR filters. (c). The truncated 1120, and d = 0.5.
formance of the standard Lagrange filter collapses in that seminal paper, under general assumptions
above 16 kHz. (stable all-pass functions) the overall numerator
polynomial of this structure is either symmetric or
antisymmetric (a mirror-image or anti-mirror-image
All-Pass Fractional-Delay Filter Design polynomial, respectively). These polynomials are
known to have their zeroes exactly on the unit circle.
For all-pass fractional-delay filter design, we can ex- Hence, our filter is known to have accurate zeroes
press the transfer function (Equation 1) in the form also in the fractional-delay ICF case. Note that the
all-pass filters are exactly all-pass (unity magnitude
1 1 z N D(z 1)
H(z) = 1 Hfd(z) = 1 at all frequencies) even if the phase (or phase delay
2 2 D(z) or group delay) is only approximately as desired. In
1 D(z) z N D(z 1) particular, as shown by Vlimki, Lehtonen, and
= (8)
2 D(z)
Laakso (2007), the phase approximation errors of the
all-pass filter tend to accumulate with increasing
where D(z) = 1 + a(1)z 1 + a(2)z 2 + . . . + a(N)z N and frequency, and the zeroes of the corresponding ICF
the numerator polynomial can be written as B(z) = further from the ideal places at higher frequencies.
D(z) z ND(z 1) = b(0) + b(1)z 1 + . . . + b(N)z N, and For good accuracy, a method is needed that allows
b(k) = a(k) a(Nk), k = 0, 1, . . . , N. for the design of high-order fractional-delay filters.
It is easy to verify that B(z) is an antisymmetric Three closed-form design methods are known for
polynomial, that is, bk = bN k. In fact, the ICF is a fractional-delay all-pass filters: the Thiran all-pass
special case of the parallel connection of two all- filter design (Thiran 1971; Fettweis 1972; Laakso
pass filters discussed by Saramki (1985). As shown et al. 1996), the truncated Thiran all-pass filter
(Vlimki 2000), and the Pei-Wang method (Pei and At low frequencies, this filter has a phase delay of
Wang 2004). Such methods are needed to increase N + d samples.
the all-pass filter order to be large enough for good The truncated Thiran design is obtained by modi-
wideband approximation in audio applications. fying Equation 9:
Both the standard and the truncated Thiran meth-
ods allow the filter order to be increased up to N = M M d + n
a ( k ) = (1)k for k = 1,2,3,...,N (10)
1,029 (when d = 0.5) using 64-bit double floating- k n= 0 d + k + n
point computing. An alternative method would be
the non-parametric all-pass filter design method where M is the prototype filter order (M > N) (Vli-
proposed by Abel and Smith (2006). This method mki 2000). By using a value for M that is much
provides numerical stability even with large filter larger than N in Equation 10, it is possible to extend
orders, and it is possible to match an arbitrary group the bandwidth of good approximation. This comes
delay as a function of frequency. at the expense of losing quality at low frequencies:
In this work, we concentrate on the standard and The approximation error is larger than in the origi-
the truncated Thiran filter design methods. The nal all-pass filter. This design technique allows a
Thiran design formula for the coefficients of the useful tradeoff between approximation accuracy
polynomial D(z) in Equation 8 can be expressed as and bandwidth, as discussed in Vlimki (2000);
Vlimki, Ilmoniemi, and Huotilainen (2004); and
N N d + n
a ( k ) = (1)k for k = 1,2,3,...,N (9) Vlimki, Lehtonen, and Laakso (2007).
k n= 0 d + k + n
Figure 4 compares the standard and truncated
where N is the filter order, and d is the fractional Thiran all-pass filters. The design parameters are
delay parameter (0.5 < d 0.5) (Laakso et al. 1996). N = 80 and d = 0.5 for both filters, and the proto-
Lehtonen et al. 47
type filter order for the truncated Thiran filter is must be used to ensure the maximum gain of the filter
M = 9N = 720. The magnitude response, which is to be unity (i.e., 0 dB). Then, the minimum gain of the
exactly flat in both cases, and the phase delays are filter, which occurs at the bottom of the notches at
displayed in Figures 4a and 4b, respectively. harmonic frequencies, is g0(1 rL), which we call A. We
Note from Figure 4b that the difference between can now solve for the required g0, and consequently
the fractional-delay approximations of the two the required r, when gain A is set to a given value. By
filters is microscopic below about 17 kHz, but the combining Equation 11 and the aforementioned defi-
relaxed accuracy allows for the truncated Thiran nition for A, we get a representation for the radius r,
filter to perform significantly better above 17 kHz. as shown in Vlimki, Lehtonen, and Laakso (2007):
A comparison of the two ICFs with Thiran and
r = L (1 A) (1 + A) (12)
truncated Thiran filters is shown in Figure 4c.
Again, the lengths of the additional delay-lines were The HEF transfer function can be written as
chosen to be L1 = L2 = 0. Note that below 17 kHz,
HHEF(z) = g1R(z)[1 r L Hfd(z)] (13)
the ICF with the truncated Thiran filter is worse
than the ICF using the standard Thiran filter, but Here, the scaling coefficient g1 that sets the maxi-
both are sufficiently good, because the attenuation mum gain at the bottom of the notches (without the
is more than 140 dB. Above 17 kHz, the perfor- resonator) to be unity is defined by
mance of the Thiran ICF collapses, but with the
g1 = 1 [ r(1 r L)] (14)
truncated version of the all-pass filter the ICF offers
an attenuation of 140 dB up to 20 kHz. and the transfer function of the resonant filter is
R(z) = b0 (1 + a1z 1 + a2 z 2) (15)
Extracting Harmonic Components with coefficients b0 = (1 r )sin(2fres / fs), which
2
(a) (c)
(b) (d)
In practice, the high-order all-pass filter does not mf0, because the frequency of the mth zero is offset
provide a perfect phase approximation. Thus, it may by the inaccuracy of the phase approximation of the
be necessary to set A to a smaller value, such as all-pass filter. In practice, this mismatch produces a
10 6. However, when the filter structure for select- kink around the main lobe of the band-pass filter,
ing a single harmonic is used, the resonant filter and the gain at the resonance frequency becomes
provides additional attenuation at frequencies away larger than 0 dB. A correction to the pole frequency
from the resonance, which further improves the is required to reduce this error.
attenuation at the notches. One way to correct the resonance frequency of
It was reported by Vlimki, Ilmoniemi, and the all-pole filter is to search for the minimum of
Huotilainen (2004) that for some musical instru- the ICFs magnitude response around the mth
ment tones with strong low-indexed harmonics, the notch. For example, computing the magnitude
filtering of the signal with the transfer function response at 10,000 points between 0.999990mf0 and
HHEF(z) is insufficient. Listening to the filtered signal 1.000010mf0, and selecting fres as the frequency,
reveals that the fundamental is still perceived, where the minimum occurs, reduces the mismatch
although one of the high-frequency partials is sufficiently. To reduce the number of magnitude-
strongly emphasized. Filtering the signal twice with response evaluations, the local minimum can be
a transfer function according to Equation 13 ad- estimated by using interpolation, as suggested by
equately attenuates the rest of the harmonics in Vlimki, Lehtonen, and Laakso (2007).
this case, because it is then possible to obtain an Figure 5a gives an example of the attenuation
attenuation that is better than 100 dB. obtained without and with the proposed correction
There is a minor mismatch in the cancellation of of the resonance frequency when the fundamental
the mth transfer function zero with the pole of the frequency is 69.2957 Hz, the harmonic #285 at
resonant filter having the resonance frequency fres = 19749.2 Hz is selected, the all-pass filter orders used
Lehtonen et al. 49
are N = 80 and M = 720, and attenuation is A = 10 5. Table 1. Results of the Comparison Between the
In this case, the pole radius is r = 1 31.4 10 9 = Linear-Phase FIR Filter, the Truncated Lagrange
0.999999969. The difference between the nominal Filter, and the Truncated Thiran Filter in Terms of
(mf0) and the corrected resonance frequency is the Computational Complexity
0.0135 Hz, but the attenuation of the partial is 0.8 Prototype
dB without and 0.0045 dB with the correction. This Filter Order order Multiplications Additions
difference is illustrated in Figure 5c. The difference
in attenuation is enough to make the correction Linear-phase 4246 4247 4246
FIR
worth the effort, because it makes the analysis filter
an accurate tool for signal analysis. Figures 5b and Truncated 130 911 131 130
5d display the magnitude response of the two filters Lagrange
around selected harmonic. It is seen that without Truncated 65 585 131 130
correction, the gain of the filter fluctuates between Thiran
about 2 dB and +2 dB near the resonance frequency.
Using the correction, the fluctuation is negligible,
not exceeding 0.010 dB (not visible in Figure 5d odd harmonics can be separated using a single
owing to limited image resolution; the kink is very filtering operation, as suggested by Vlimki,
small and very narrow). Ilmoniemi, and Huotilainen (2004) and Vlimki,
For comparison, we designed two ICFs with the Lehtonen, and Laakso (2007). The odd and even
truncated Lagrange and truncated Thiran all-pass harmonics can be separated by first canceling the
filters, and a linear-phase FIR band-pass filter that even harmonics using the fractional-delay inverse
imitates the obtained magnitude response. The comb filter and then subtracting the resulting signal
linear-phase FIR filter was designed using the from the original.
Chebyshev window with a sidelobe level of 90 dB. The structure of Figure 1c or 1d can be used, but
The width of the pass-band was equal to f0, that is, the delay to be approximated is half of that used in
69.2957 Hz. It was required that the magnitude canceling all harmonics, namely, fs / 2f0 samples.
error at the frequency of the selected harmonic not With this delay, the notches are located at the
exceed 0.1 dB, and that the minimum attenuation at multiples of the second harmonic, and the filter
the frequencies of the canceled harmonics is 100 dB. now cancels the even harmonics and preserves the
Table 1 shows the results of the comparison in odd harmonics. The signal containing even harmon-
terms of the computational complexity when all ics is then obtained by subtracting the estimated
filters are realized in direct form. odd harmonics from the original signal, as shown in
As can be seen in Table 1, the number of multi- Figure 6:
plications needed when the truncated Lagrange or
seven(n) = sorig (n) sodd (n) (16)
truncated Thiran filter is used is only 3.1% of that
of the FIR filter. The computational complexities of
the truncated Lagrange and truncated Thiran filters
are the same, but the orders of the prototype filters Case Studies
differ.
This section presents how the proposed filtering
algorithms perform with synthetic signals and
Separation of Odd and Even Partials recorded instrument tones. In addition, the ICF
structure based on either windowed-sinc interpola-
Although it is possible to cancel the harmonic tion, truncated Lagrange, or truncated Thiran
components one by one by applying the above HEF interpolation are compared against sinusoidal
structure multiple times, alternatively, even and modeling (McAulay and Quatieri 1986; Serra 1989;
Lehtonen et al. 51
Figure 7. Results of com- #76 (near 20 kHz) obtained Figure 8. Effects of tempo- windowed-sinc interpola-
parison in the frequency with an ICF based on (c) ral smearing. Here, (a) and tion. Also shown are
domain. Here, (a) and (b) windowed-sinc interpola- (b) show an excerpt of the corresponding signals
represent the synthetic test tion, (d) sinusoidal model- original synthetic signal, obtained with (e)(f)
signal. Selected harmonic ing, (e) truncated Lagrange and (c) and (d) show sinusoidal modeling;
components #1 (thick line interpolation, and (f) trun- excerpts of the harmonics (g)(h) a truncated
near 0 kHz in (c)(f)) and cated Thiran interpolation. #1 and #76 obtained with Lagrange filter; and (i)(j) a
an ICF based on the truncated Thiran filter.
Figure 7
Figure 8
(a) (c)
(b) (d)
signal, and Figures 8c and 8d present the cases Thiran filter. In Figures 9a and 9b, the tone is
where the first harmonic and the 76th harmonic presented in the time and frequency domains,
have been extracted with the ICF based on respectively. The fundamental frequency of the
windowed-sinc interpolation. Temporal smearing tone was measured to be f0 = 58.2306 Hz using the
in the sinusoidal modeling technique is depicted in YIN algorithm (de Cheveign and Kawahara 2002).
Figures 8e and 8f. The performances of the trun- It is now desired to remove all the harmonics
cated Lagrange FIR filter (see Figures 8g and 8h) and instead of preserving one. Depending on the funda-
the truncated Thiran filter (see Figures 8i and 8j) are mental frequency, a modified form of the transfer
also shown. It can be seen that the truncated function (Equation 3) can be used:
Lagrange and truncated Thiran filters perform best
H(z) = g1 1 r L Hfd (z) (18)
in terms of temporal smearing.
where the coefficients g1 and r are computed with
L
Lehtonen et al. 53
Figure 10. The structure of Figure 11. Envelopes of the
the kantele. The varras is first eight harmonics of a
magnified in order to kantele tone.
illustrate the termination
point. (Adapted from
Pakarinen, Vlimki, and
Karjalainen 2005.)
components are attenuated efficiently. Moreover, Figure 11 shows a 3-D time-frequency plot of the
the noise between the harmonics is preserved. envelopes of the first eight harmonics of a kantele
tone. The dominating fundamental frequency of the
tone was measured to be f0 = 413.445 Hz. The
Extraction of a Beating Harmonic truncated Thiran filter with parameters N = 80, M =
9N, and A = 10 6 was used. In this case, the resulting
Beating is a significant phenomenon present in attenuation is not quite 120 dB, owing to the two
many stringed instruments. It is characteristic to fundamental frequencies present in the signal. As
the piano and the kantele, for example. The kantele the notches of the inverse comb filter are very
is a plucked string instrument that usually has five sharp, part of the harmonic information does not hit
metal strings. The strings are terminated by metal the notch frequency but rather the slope of the
tuning pins at one end and wound once around a notch, where the attenuation is less than 120 dB.
horizontal metal bar called the varras at the other The obtained attenuation is still more than 75 dB,
end and then knotted (see Figure 10). The beating in which is sufficient for visualization, since the
the kantele sound is caused by the fact that the beating patterns are clear (see Figure 11). From these
vibrations in two polarization planes have different envelopes, it is easy to determine the frequency of
effective lengths, as the varras is the termination the beatingfor example, by fitting sinusoids to the
point for vibration in the horizontal plane and the envelopes and choosing the best match in the
knot is the termination point for the vibration in least-square sense (Erkut et al. 2002).
the vertical plane. This causes the tone to have two
slightly different fundamental frequencies, which
produces beating (Erkut et al. 2002). Conclusions and Future Work
Extraction of a beating harmonic is considered to
be somewhat problematic with sinusoidal model- Digital filtering techniques were proposed to obtain
ing, for example, because preservation of the useful decompositions of harmonic musical signals.
beating in the resynthesized deterministic signal The basic approach taken here is to subtract a
requires two or more spectral peaks to be extracted delayed copy of the signal from itself to cancel the
around a harmonic frequency (e.g., Tolonen 1999, harmonic components. A high-order fractional-
and Esquef, Karjalainen, and Vlimki 2003). The delay filter implements an accurate approximation
beating harmonics of the kantele can be easily of the required time delay. A truncated Lagrange
extracted from a recorded tone using the HEF and a truncated Thiran all-pass fractional-delay
structure (Equation 13). filter are well suited for this task. A harmonic-
Lehtonen et al. 55
Laakso, T. I., et al. 1996. Splitting the Unit DelayTools chastic Decomposition. Ph.D. Thesis and Report No.
for Fractional Delay Filter Design. IEEE Signal Pro- STAN-M-58, Stanford University, Stanford, California.
cessing Magazine 13(1):3060. Serra, X., and J. O. Smith. 1990. Spectral Modeling Syn-
McAulay, R. J., and T. F. Quatieri. 1986. Speech Anal- thesis: A Sound Analysis / Synthesis Based on a Deter-
ysis / Synthesis Based on a Sinusoidal Representation. ministic Plus Stochastic Decomposition. Computer
IEEE Transactions on Acoustics, Speech, and Signal Music Journal 14(4):1224.
Processing 34(4):744754. Steiglitz, K. 1996. A Digital Signal Processing Primer.
Moorer, J. A. 1974. The Optimum Comb Method Menlo Park, California: Addison-Wesley.
of Pitch Period Analysis of Continuous Digitized Thiran, J.-P. 1971. Recursive Digital Filters with Maxi-
Speech. IEEE Transactions on Acoustics, Speech, and mally Flat Group Delay. IEEE Transactions on Circuit
Signal Processing 22(5):330338. Theory 18(6):659664.
Moorer, J. A. 1977. Signal Processing Aspects of Com- Tolonen, T. 1999. Methods for Separation of Harmonic
puter Music: A Survey. Proceedings of the IEEE Sound Sources Using Sinusoidal Modeling. Proceed-
5(8):11081137. ings of the 106th AES Convention. New York: Audio
Orfanidis, S. J. 1996. Introduction to Signal Processing. Engineering Society, preprint 4958. Available online at
Upper Saddle River, New Jersey: Prentice Hall. lib.tkk.fi / Diss / 2000 / isbn9512251965 / article7.pdf.
Pakarinen, J., V. Vlimki, and M. Karjalainen. 2005. Tseng, C.-C., and S.-C. Pei. 2001. Stable IIR Notch Filter
Physics-Based Methods for Modeling Nonlinear Vi- Design with Optimal Pole Placement. IEEE Transac-
brating Strings. Acta Acustica United with Acustica tions on Signal Processing 49(11):26732681.
91(2):312325. Vlimki, V. 2000. Simple Design of Fractional Delay
Pei, S.-C., and C.-C. Tseng. 1997. IIR Multiple Notch Fil- All-Pass Filters. Proceedings of the 2000 European
ter Design Based on All-Pass Filter. IEEE Transactions Signal Processing Conference. Kessariani, Greece:
on Circuits and Systems Part II: Analog and Digital EURASIP, vol. 4, pp. 18811884.
Signal Processing 44(2):133136. Vlimki, V., and A. Haghparast. 2007. Fractional Delay
Pei, S.-C., and C.-C. Tseng. 1998. A Comb Filter Design Filter Design Based on Truncated Lagrange Interpola-
Using Fractional-Sample Delay. IEEE Transactions tion. IEEE Signal Processing Letters 14(11):816819.
on Circuits and Systems Part II: Analog and Digital Vlimki, V., M. Ilmoniemi, and M. Huotilainen. 2004.
Signal Processing 45(6):649653. Decomposition and Modification of Musical Instru-
Pei, S.-C., and P.-H. Wang. 2004. Closed-Form Design of ment Sounds Using a Fractional Delay All-Pass Filter.
All-Pass Fractional Delay Filters. IEEE Signal Pro- Proceedings of the 2004 Nordic Signal Processing
cessing Letters 11(10):788791. Symposium. Espoo, Finland: IEEE Finland Section,
Rauhala, J., and V. Vlimki. 2006. Tunable Dispersion pp. 208211.
Filter Design for Piano Synthesis. IEEE Signal Pro- Vlimki, V., and T. I. Laakso. 2001. Fractional Delay
cessing Letters 13(5):253256. Filters: Design and Applications. In F. Marvasti,
Rocchesso, D., and F. Scalcon. 1996. Accurate Disper- ed. Nonuniform Sampling: Theory and Practice.
sion Simulation for Piano Strings. Proceedings of the New York: Kluwer Academic / Plenum Publishers,
1996 Nordic Acoustical Meeting. Helsinki, Finland: pp. 835895.
NAM, pp. 407414. Vlimki, V., H.-M. Lehtonen, and T. I. Laakso. 2007.
Saramki, T. 1985. On the Design of Digital Filters as Musical Signal Analysis Using Fractional-Delay In-
a Sum of Two All-Pass Filters. IEEE Transactions on verse Comb Filters. Proceedings of the International
Circuits and Systems 32(11):11911193. Conference on Digital Audio Effects. Bordeaux, France:
Serra, X. 1989. A System for Sound Analysis / Transfor- LaBRI, University of Bordeaux 1, pp. 261268.
mation / Synthesis Based on a Deterministic Plus Sto-