Professional Documents
Culture Documents
SparseTQWT Slides
SparseTQWT Slides
SparseTQWT Slides
and
the tunable Q-factor wavelet transform
Ivan Selesnick
Polytechnic Institute of New York University
Brooklyn, New York
1
Introduction
Outline
5. Examples
References (MDCT, etc)
S. N. Levine and J. O. Smith III. A sines+transients+noise audio representation for data compression and time/pitch scale
modications. (1998)
L. Daudet and B. Torrésani. Hybrid representations for audiophonic signal encoding. (2002)
S. Molla and B. Torrésani. An hybrid audio coding scheme using hidden Markov models of waveforms. (2005)
M. E. Davies and L. Daudet. Sparse audio representations using the MCLT. (2006)
2
Oscillatory (rhythmic) and Transient Components in EEG
3
Signal resonance and Q-factor
PULSE 1
1 SPECTRUM
10
0 Q−factor = 1.15
5
−1
PULSE 2
1
40
0
Q−factor = 4.6
20
−1
PULSE 3
1
20
Q−factor = 1.15
0
10
−1
0
0 100 200 300 400 500 0 0.02 0.04 0.06 0.08 0.1 0.12
PULSE 4 100
1
Q−factor = 4.6
0 50
−1
0
0 100 200 300 400 500 0 0.02 0.04 0.06 0.08 0.1 0.12
TIME (SAMPLES) FREQUENCY (CYCLES/SAMPLE)
4
Resonance-based signal decomposition
2 2
TEST SIGNAL TEST SIGNAL
1 1
0 0
−1 −1
−2 −2
0 100 200 300 400 500 0 100 200 300 400 500
2 2
HIGH−RESONANCE COMPONENT LOW−PASS FILTERED
1 1
0 0
−1 −1
−2 −2
0 100 200 300 400 500 0 100 200 300 400 500
2 2
LOW−RESONANCE COMPONENT BAND−PASS FILTERED
1 1
0 0
−1 −1
−2 −2
0 100 200 300 400 500 0 100 200 300 400 500
2 2
RESIDUAL HIGH−PASS FILTERED
1 1
0 0
−1 −1
−2 −2
0 100 200 300 400 500 0 100 200 300 400 500
TIME (SAMPLES) TIME (SAMPLES)
Figure 2: Resonance- and frequency-based filtering. (a) Decomposition of a test signal into high- and low-resonance components. The high-resonance signal
component is sparsely represented using a high Q-factor WT. Similarly, the low-resonance signal component is sparsely represented using a low Q-factor WT. (b)
Decomposition of a test signal into low, mid, and high frequency components using LTI discrete-time filters.
5
Resonance-based signal decomposition must be nonlinear
= +
= +
= +
= +
= +
= +
= +
Figure 3: Resonance-based signal decomposition must be nonlinear: The signal in the bottom left panel is the sum of the signals above it; however, the low-resonance
component of a sum is not the sum of the low-resonance components. The same is true for the high-resonance component. Neither the low- nor high-resonance
components satisfy the superposition property.
6
Rational-dilation wavelet transform (RADWT)
2. P. Auscher (1992)
7
Low-pass Scaling
X(ω) X(ω)
1 1
0.5 0.5
0 0
−π −απ 0 απ π −π 0 π
FREQUENCY (ω) FREQUENCY (ω)
Y(ω) Y(ω)
1 1
0.5 0.5
0 0
−π 0 π −π −π/α 0 π/α π
FREQUENCY (ω) FREQUENCY (ω)
8
High-pass Scaling
X(ω) X(ω)
1 1
0.5 0.5
0 0
−π −(1−β)π 0 (1−β)π π −π 0 π
FREQUENCY (ω) FREQUENCY (ω)
Y(ω) Y(ω)
1 1
0.5 0.5
0 0
−π 0 π −π −(1−1/β)π 0 (1−1/β)π π
FREQUENCY (ω) FREQUENCY (ω)
9
Tunable Q-factor wavelet transform (TQWT)
v0(n) y0(n)
H0(ω) LPS α LPS 1/α H0∗(ω)
x(n) + y(n)
H1(ω) HPS β HPS 1/β H1∗(ω)
v1(n) y1(n)
|H0(ω)|2 X(ω) |ω| ≤ α π
Y0(ω) =
α π < |ω| ≤ π
0
0
|ω| < (1 − β) π
Y1(ω) =
|H (ω)|2 X(ω) (1 − β) π ≤ |ω| ≤ π
1
|H0(ω)|2 X(ω) |ω| < (1 − β) π
Y (ω) = (|H0(ω)|2 + |H1(ω)|2) X(ω) (1 − β) π ≤ |ω| < α π
|H1(ω)|2 X(ω)
α π ≤ |ω| ≤ π
10
For perfect reconstruction, the filters should satisfy
0.5
0
−π −απ −(1−β)π 0 (1−β)π απ π
FREQUENCY (ω)
H1(ω)
0.5
0
−π −απ −(1−β)π 0 (1−β)π απ π
FREQUENCY (ω)
11
The transition band of H0(ω) and H1(ω) can be constructed using any power-complementary function,
θ(ω),
θ2(ω) + θ2(π − ω) = 1, (2)
We use the Daubechies filter frequency response with two vanishing moments,
1 p
θ(ω) = (1 + cos(ω)) 2 − cos(ω), |ω| ≤ π. (3)
2
Scale and dilate θ(ω) to obtain transition bands for H0(ω) and H1(ω).
12
X(ω)
1
0.5
0
−π −απ −(1−β)π 0 (1−β)π απ π
ω
H0(ω) H1(ω)
1 1
0.5 0.5
0 0
−π −απ −(1−β)π 0 (1−β)π απ π −π −απ −(1−β)π 0 (1−β)π απ π
FREQUENCY (ω) FREQUENCY (ω)
0.5 0.5
0 0
−π −απ −(1−β)π 0 (1−β)π απ π −π −απ −(1−β)π 0 (1−β)π απ π
ω ω
V0(ω) V1(ω)
1 1
0.5 0.5
0 0
−π −0.5π 0 0.5π π −π −0.5π 0 0.5π π
ω ω
13
Iterated Filters
c(n)
stage 3
stage 2 d3(n)
x(n) stage 1 d2(n)
d1(n)
j − 1 stages
(j)
≡ H1 (ω) LPS αj−1 HPS β dj (n)
14
The equivalent filter is
j−1
0 |ω| < (1 − β) α π
j−2
Y
(j) j−1
H1 (ω) := H1(ω/α ) H0(ω/αm) (1 − β) αj−1 π ≤ |ω| ≤ αj−1 π (4)
m=0
αj−1 π < |ω| ≤ π.
0
H1j (ω)
0 (1 − β) αj−1 π αj−1 π π
Q-factor:
ωc 2−β
Q= =
BW β
Scaling factors α, β:
2 β
β= , α=1−
Q+1 r
15
LOW Q-FACTOR WT HIGH Q-FACTOR WT
FREQUENCY RESPONSES − 4 LEVELS FREQUENCY RESPONSES − 13 LEVELS
alpha = 0.67, beta = 1.00, Q = 1.00, R = 3.00 alpha = 0.87, beta = 0.40, Q = 4.00, R = 3.00
1 1
0.5 0.5
0 0
0 π/4 π/2 3π/4 π 0 π/4 π/2 3π/4 π
FREQUENCY (ω) FREQUENCY (ω)
WAVELET WAVELET
0.1
0.1
0.05
0.05
0 0
−0.05
−0.05
−0.1
−0.1
0 50 100 150 200 250 300 0 50 100 150 200 250 300
TIME (SAMPLES) TIME (SAMPLES)
Q-factor = 1 Q-factor = 4
Redundancy = 3 Redundancy = 3
16
Finite-length Signals...
The TWQT can be implemented for finite-length signals using the DFT (FFT)....
v0(n) y0(n)
H0(k) LPS N : N0 LPS N0 : N H0∗(k)
x(n) + y(n)
H1(k) HPS N : N1 HPS N1 : N H1∗(k)
v1(n) y1(n)
17
Low-pass scaling N : N0
X (k) X (k)
0 N/2 N −1 0 N/2 N −1
Y (k) Y (k)
0 N 0 /2 N0 − 1 0 N 0 /2 N0 − 1
18
High-pass scaling N : N1
X (k) X (k)
0 N/2 N −1 0 N/2 N −1
Y (k) Y (k)
0 N 1 /2 N1 − 1 0 N 1 /2 N1 − 1
19
X (k)
0 N/2 N −1
(a) DFT of input signal, X(k).
H 0 (k) H 1 (k)
0 T N/2 T N −1 0 T N/2 T N −1
(b) Filters H0 (k) and H1 (k). The transitions-bands are indicated by ‘T ’.
0 N/2 N −1 0 N/2 N −1
(c) DFT of input signal after filtering.
V 0 (k) V 1 (k)
0 N 0 /2 N0 − 1 0 N 1 /2 N1 − 1
20
Example: Low Q-factor vs High Q-factor WT
7 2.43%
1 1.89%
8 7.97%
2 6.14%
9 3.52%
3 11.72%
10 0.54%
SUBBAND
SUBBAND
4 16.33% 11 7.30%
12 16.03%
5 22.82%
13 4.72%
6 26.80%
14 1.74%
7 13.92%
15 20.12%
8 0.38%
16 29.69%
9 0.00%
17 5.75%
0 100 200 300 400 500 0 100 200 300 400 500
P = 2, Q = 3, S = 1, Levels = 10
TIME (SAMPLES) Dilation = 1.50, Redundancy = 3.00 P = 5, Q = 6, S = 2, Levels = 20
TIME (SAMPLES) Dilation = 1.20, Redundancy = 3.00
21
Low Q-factor vs High Q-factor WT after sparsification
7 0.01%
1 0.00%
8 15.64%
2 2.07%
9 0.11%
3 8.78%
10 0.00%
SUBBAND
SUBBAND
4 34.39% 11 0.02%
12 38.66%
5 0.79%
13 0.02%
6 53.29%
14 0.02%
7 0.68%
15 6.26%
8 0.00%
16 39.25%
9 0.00%
17 0.00%
0 100 200 300 400 500 0 100 200 300 400 500
P = 2, Q = 3, S = 1, Levels = 10
TIME (SAMPLES) Dilation = 1.50, Redundancy = 3.00 P = 5, Q = 6, S = 2, Levels = 20
TIME (SAMPLES) Dilation = 1.20, Redundancy = 3.00
22
Low Q-factor vs High Q-factor WT
6 1.82%
1 2.58% 7 2.75%
8 3.02%
2 6.31%
9 3.82%
3 10.70%
10 5.65%
11 6.82%
SUBBAND
SUBBAND
4 16.11%
12 6.95%
5 21.44%
13 8.72%
14 12.37%
6 22.50%
15 13.98%
7 13.51%
16 12.04%
17 8.44%
8 4.87%
18 5.41%
9 1.42%
19 3.07%
0 100 200 300 400 500 0 100 200 300 400 500
P = 2, Q = 3, S = 1, Levels = 10
TIME (SAMPLES) Dilation = 1.50, Redundancy = 3.00 P = 5, Q = 6, S = 2, Levels = 20
TIME (SAMPLES) Dilation = 1.20, Redundancy = 3.00
23
Low Q-factor vs High Q-factor WT after sparsification
6 2.43%
1 0.00% 7 3.78%
8 2.35%
2 3.12%
9 2.16%
3 10.99%
10 4.12%
11 5.87%
SUBBAND
SUBBAND
4 27.98%
12 6.25%
5 10.60%
13 4.01%
14 18.71%
6 40.59%
15 12.72%
7 3.53%
16 9.69%
17 19.51%
8 2.64%
18 1.36%
9 0.23%
19 3.11%
0 100 200 300 400 500 0 100 200 300 400 500
P = 2, Q = 3, S = 1, Levels = 10
TIME (SAMPLES) Dilation = 1.50, Redundancy = 3.00 P = 5, Q = 6, S = 2, Levels = 20
TIME (SAMPLES) Dilation = 1.20, Redundancy = 3.00
24
Constant-Q vs Constant-BW
Constant-Q Constant-BW
0.5 0.5
0 0
0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2 0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2
FREQUENCY (CYCLES/SAMPLE) FREQUENCY (CYCLES/SAMPLE)
0 0
−1 −1
0 50 100 150 200 250 300 0 50 100 150 200 250 300
TIME (SAMPLES) TIME (SAMPLES)
0 0
−1 −1
0 50 100 150 200 250 300 0 50 100 150 200 250 300
TIME (SAMPLES) TIME (SAMPLES)
25
Tunable Q-factor wavelet transform (TQWT) — Run Times
• Run times measured on a 2010 base-model Apple MacBook Pro (2.4 GHz Intel Core 2 Duo).
• C implementation.
26
Tunable Q-factor wavelet transform (TQWT)
Summary:
3. Adjustable Q-factor:
Can attain higher Q-factors than (or same low Q-factor of) the dyadic WT.
=⇒ Can achieve higher-frequency resolution needed for oscillatory signals.
4. Samples the time-frequency plane more densely in both time and frequency.
=⇒ Exactly invertible, fully-discrete approximation of the continuous WT.
5. FFT-based implementation
27
Morphological Component Analysis (MCA)
the goal of MCA is to estimate/determine x1 and x2 individually. Assuming that x1 and x2 can be
sparsely represented in bases (or frames) Φ1 and Φ2 respectively, they can be estimated by minimizing the
objective function,
J(w1, w2) = λ1kw1k1 + λ2kw2k1
Φ1w1 + Φ2w2 = x.
and
x̂2 = Φ2w2.
Reference:
Starck, Elad, Donoho. Image Decomposition via the Combination of Sparse Representations and a Variational Approach,
IEEE Trans. on Image Processing, Oct 2005.
28
Why not a quadratic cost function?
29
MCA as a linear inverse problem
The constrained optimization problem
min λ1kw1k1 + λ2kw2k1
w1 ,w2
subject to Hw = x
where
h i w1
H = Φ1 Φ2 , w=
w2
and denotes point-by-point multiplication.
This is ‘basis pursuit’, an `1-regularized linear inverse problem . . .
• Non-differentiable
• Convex
=⇒ We use a variant of SALSA (split augmented Lagrangian shrinkage algorithm).
Reference: Afonso, Bioucas-Dias, Figueiredo. Fast Image Recovery Using Variable Splitting and Constrained Optimization. IEEE Trans. on Image
Processing, 2010.
30
SALSA for MCA (bais pursuit form)
31
OBJECTIVE FUNCTION
ISTA
9.5 SALSA
8.5
7.5
6.5
5.5
0 10 20 30 40 50 60 70 80 90 100
ITERATION
Figure 4: Reduction of objective function during the first 100 iterations. SALSA converges faster than ISTA.
32
Example: Resonance-selective nonlinear band-pass filtering
2
(a) TEST SIGNAL (c) OUTPUT OF BAND−PASS FILTER 1
2
1
1
0 0
−1 −1
−2
−2
0 100 200 300 400 0 100 200 300 400
TIME (SAMPLES) TIME (SAMPLES)
2
(b) TWO BAND−PASS FILTERS (d) OUTPUT OF BAND−PASS FILTER 2
1
1
BAND−PASS FILTER 1
0
0.5 BAND−PASS FILTER 2
−1
0 −2
0 0.1 0.2 0.3 0.4 0.5 0 100 200 300 400
FREQUENCY (CYCLES/SAMPLE) TIME (SAMPLES)
Figure 5: LTI band-pass filtering. The test signal (a) consists of a sinusoidal pulse of frequency 0.1 cycles/sample and a transient. Band-pass filters 1 and 2 in
(b) are tuned to the frequencies 0.07 and 0.10 cycles/second respectively. The output signals, obtained by filtering the test signal with each of the two band-pass
filters, are shown in (c) and (d). The output of band-pass filter 1, illustrated in (c), contains oscillations due to the transient in the test signal. Moreover, the
transient oscillations in (c) have a frequency of 0.07 Hz even though the test signal (a) contains no sustained oscillatory behavior at this frequency.
33
2 2
(a) HIGH−RESONANCE COMPONENT (c) OUTPUT OF BAND−PASS FILTER 1
1 1
0 0
−1 −1
−2 −2
0 100 200 300 400 0 100 200 300 400
TIME (SAMPLES) TIME (SAMPLES)
2 2
(b) LOW−RESONANCE COMPONENT (d) OUTPUT OF BAND−PASS FILTER 2
1 1
0 0
−1 −1
−2 −2
0 100 200 300 400 0 100 200 300 400
TIME (SAMPLES) TIME (SAMPLES)
Figure 6: Resonance-based decomposition and band-pass filtering. When resonance-based analysis method is applied to the test signal in Fig. 5a, it yields the
high- and low-resonance components illustrated in (a) and (b). The output signals, obtained by filtering the high-resonance component (a) with each of the two
band-pass filters shown in Fig. 5b, are illustrated in (c) and (d). The transient oscillations in (c) are substantially reduced compared to Fig. 5c.
34
Example: Resonance-based decomposition of speech
0.2
−0.2
−0.4
0 0.05 0.1 0.15
0.4
(b) HIGH−RESONANCE COMPONENT
0.2
−0.2
−0.4
0 0.05 0.1 0.15
0.4
(c) LOW−RESONANCE COMPONENT
0.2
−0.2
−0.4
0 0.05 0.1 0.15
TIME (SECONDS)
Figure 7: Decomposition of a speech signal (“I’m”) into high- and low-resonance components. The high-resonance component (b) contains the sustained oscillations
present in the speech signal, while the low-resonance component (c) contains non-oscillatory transients. (The residual is not shown.)
35
0.01 (a) ORIGINAL SPEECH
0.005
0
0 500 1000 1500 2000 2500 3000
0.005
0
0 500 1000 1500 2000 2500 3000
0.005
0
0 500 1000 1500 2000 2500 3000
FREQUENCY (Hz)
Figure 8: Frequency spectra of the speech signal in Fig. 7 and of the extracted high- and low-resonance components. The spectra are computed using the 50 msec
segment from 0.05 to 0.10 seconds. The energy of each resonance component is widely distributed in frequency and their frequency-spectra overlap.
36
(a) RECONSTRUCTION FROM SUBBANDS 9, 10
0.2
−0.2
−0.2
−0.2
−0.2
Figure 9: Frequency decomposition of high-resonance component in Fig. 7. Reconstructing the high-resonance component from a few subbands of the high Q-factor
WT at a time, yields an efficient AM/FM decomposition.
37
Constant bandwidth + Constant Q-factor
CONSTANT BANDWITH
1
0.5
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
FREQUENCY
CONSTANT Q−FACTOR
1
0.5
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
FREQUENCY
A constant bandwidth and a constant Q-factor decomposition can have high coherence due to some
analysis functions, from each decomposition, having similar frequency support. This can degrade the
results of MCA in principle.
38
Constant Q-factor: High Q-factor + Low Q-factor
0.5
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
FREQUENCY
0.5
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
FREQUENCY
Two constant Q-factor decompositions with markedly different Q-factors will have low coherence because
no analysis functions from the two decompositions will have similar frequency support. This is beneficial
for the operation of MCA.
39
Small coherence between low and high Q-factor WTs
p Ψ2(f )
Q2/f2
f2/Q2
p Ψ1(f )
Q1/f1
f1/Q1
f
0 f1 f2
Figure 10: For reliable resonance-based decomposition, the inner product between the low-Q and high-Q wavelets should be small for all dilations and translations.
The computation of the maximum inner product is simplified by assuming the wavelets are ideal band-pass functions and expressing the inner product in the
frequency domain.
40
Constant Bandwidth: Narrow-band + Wide-band
0.5
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
FREQUENCY
0.5
0
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
FREQUENCY
Two constant bandwidth decompositions with markedly different bandwidths will also have low coherence
and are therefore also suitable transform for MCA-based signal decomposition. This gives a bandwidth-
based decomposition, rather than a resonance-based decomposition.
41
Conclusion: Resonance-based signal decomposition
High Q-factor WT used for sparse representation of the oscillatory (rhythmic) component.
Morphological component analysis (MCA) used to separate the two signal components.
• Oscillatory component not necessarily high-pass — contains both low and high frequencies.
• Transient component not necessarily a low-pass signal — contains sharp bumps and jumps.
42
Graphical User Interface: Facilitates interactive selection and tuning of parameters.
43
.
More slides...
44
Comparison: EMD & TQWT
We compare
A goal of EMD is to decompose a multicomponent signal into several narrow-band components (intrinsic
mode functions).
For some real signals, a sparse TQWT representation leads to a more reasonable decomposition than
EMD (next two slides).
45
Empirical mode decomposition (EMD)
7
4
IMF index
46
Sparse TQWT Representation
band 1
band 2
band 3
band 4
residual
recon
47
Morphological Component Analysis (MCA) — Noisy signal case
where n is noise, the components x1 and x2 can be estimated by minimizing the objective function,
x̂1 = Φ1w1
and
x̂2 = Φ2w2.
Reference:
Starck, Elad, Donoho. Image Decomposition via the Combination of Sparse Representations and a Variational Approach,
IEEE Trans. on Image Processing, Oct 2005.
48
Why not a quadratic cost function?
then, using Φ1Φt1 = Φ2Φt2 = I, the minimizing w1 and w2 can be found in closed form:
λ1
w1 = Φt1 x
λ1 + λ2 + λ1 λ2
λ2
w2 = Φt2 x
λ1 + λ2 + λ1 λ2
and the estimated components, x̂1 = Φ1w1 and x̂2 = Φ2w2, are given by
λ1
x̂1 = x
λ1 + λ2 + λ1 λ2
λ2
x̂2 = x
λ1 + λ2 + λ1 λ2
Both x̂1 and x̂2 are just scaled versions of x.
=⇒ No separation at all!
49
MCA as a linear inverse problem
can be written as
J(w) = kx − Hwk22 + kλ wk1
where
h i w1
H = Φ1 Φ2 , w=
w2
and denotes point-by-point multiplication.
• Non-differentiable
• Convex
=⇒ Use Iterative Soft Thresholding Algorithm (ISTA) or another algorithm to minimize J(w).
50
Split augmented Lagrangian shrinkage algorithm (SALSA)
Reference:
Afonso, Bioucas-Dias, Figueiredo.
Fast Image Recovery Using Variable Splitting and Constrained Optimization.
IEEE Trans. on Image Processing, 2010.
51
SALSA
52
NOISY SPEECH WAVEFORM
0.2
0.1
0
−0.1
−0.2
0.2
0.1
0
−0.1
−0.2
0.2
0.1
0
−0.1
−0.2
0.2
0.1
0
−0.1
−0.2
53