Frame Sparse

1
FROM ANALYSIS TO SPARSE SYNTHESIS
SPARSE REPRESENTATIONS
SPARSE REPRESENTATION — ATSI 2
PRESENTATION
▸ Mail: matthieu.kowalski@universite-paris-saclay.fr
▸ Web site: http://hebergement.universite-paris-saclay.fr/mkowalski/

3
REMINDER
FOURIER
‣ From time representation to frequency
- Spectral analysis : frequency content of a function (think about musical notes)
- Measure the similute (correlation, angle) between pure (complex) sine and a signal
- Sines are eigen signal of time invariant linear systems (filters)
- Fourier analysis computes the correlation between the signal s(t) and a pure (complex) sine of frequency ν:
⟨s(t), ϵν(t)⟩, with ϵν(t) = e i2πνt
‣ Limitation of Fourier analysis
- We obtain a pure frequency content from a pure temporal content
- What is the difference between a sum of sine, and a succession of sine ?

EXAMPLE: SUM OF 2 SINES

EXAMPLE: SEQUENCE OF 2 SINES

SHORT TIME FOURIER TRANSFORM (STFT)

▸ Idea: perform a local spectral analysis of the signal thanks to a sliding window
Let w(t) be a real smooth window localized around t = 0. Let the time-frequency atoms
φτ,ν(t) = w(t − τ)e i2πνt
The stft transform of a signal x(t) computes the correlation between x(t) and the time-frequency atoms φτ,ν(t) = w(t − τ)e i2πνt:
+∞
∫−∞
X(τ, ν) = ⟨x(t), φτ,ν(t)⟩ = x(t)w(t − τ)e −i2πνtdt
‣ It is a time-frequency transform
‣ Window choice: Heisenberg’s uncertainty principle

+∞ +∞
∫−∞ ∫−∞
−i2πντ
‣ It is invertible: x(t) = X(τ, ν)w(t − τ)e dτdν
‣ We have energy preservation

EXAMPLE: SUM OF 2 SINES

EXAMPLE: SEQUENCE OF 2 SINES

EXAMPLE: GLOCKENSPIEL
CONTINUOUS WAVELET TRANSFORM

▸ Idea: be sensitive to irregularities instead of oscillations
Let ψ(t) be an admissible "mother" wavelet, and its the dilated and translated version
a ( a )
1 t−b
ψa,b(t) = ψ
The continuous wavelet transform is given by:
( a )
+∞
1 t−b
a ∫−∞
Cx(a, b) = ⟨x(t), ψa,b(t)⟩ = x(t)ψ dt
‣ It is a times-scale transform
( )
+∞ +∞
1 t − b da db
‣ It is invertible: x(t) = c ∫ ∫
X(a, b)ψ
a a 2
ψ −∞ −∞
‣ We have energy preservation

TIME-FREQUENCY VS TIME-SCALE
▸ STFT tilling of the time-frequency plane ▸ Wavelet tilling of the time-frequency plane
FROM CONTINUOUS TRANSFORM TO DISCRETE TRANSFORM

▸ First idea: construction of orthogonal basis
▸ No STFT orthogonal basis (Balian-Low theorem)
▸ Time-frequency orthogonal basis: MDCT
▸ Time-scale orthogonal basis: multi-resolution analysis and dyadic wavelets
▸ Alternative to orthogonal basis ?

14
FRAME
EXAMPLE
{0 t ≠ k
1 t=k
▸ Let the Kronecker basis Δ = {δ0(t), …, δN−1(t) } , δk (t) =
1
Let the Fourier basis ℰ = {ϵ0(t), …, ϵN−1(t)}, ϵk(t) = e i2π Nk t
▸ N
▸ Let x(t) = δk(t) + ϵn(t)
▸ x needs N coefficients in Δ, and N coefficients in ℰ for a perfect representation
▸ x needs only 2 coefficients in 𝒟 = Δ ∪ ℰ for a perfect representation
▸ 𝒟 is not an orthogonal basis (and is often called a "dictionary")

FRAME: DEFINITION
▸ A frame is a system of discrete representation where inversion is stable
▸ Let a dictionary 𝒟 = {φn(t), φn(t) ∈ L (ℝ)} (φn(t) being called a atom of 𝒟). 𝒟 is a frame of
2
2 2
L (ℝ) iff it exists two constant A, B > 0 such that for all f ∈ L (ℝ)
+∞
2
2 2
∑
A∥f∥ ≤ ⟨f(t), φn(t)⟩ ≤ B∥f∥
n=−∞
‣ If A = B, the frame is a tight-frame
‣ If A = B = 1 the frame is a Parseval frame
‣ If A = B = 1 and ∥φn∥ = 1 ∀n , the frame is an orthogonal basis

FRAME: ANALYSIS AND SYNTHESIS OPERATORS

▸ Analysis operator Φ*
Φ* : L 2(ℝ) → ℓ 2(ℤ)
f(t) ↦ Φ*f = {⟨f(t), φn(t)⟩}
n∈ℤ
The coefficients {⟨f(t), φn(t)⟩} are called the analysis coefficients of f(t)
n∈ℤ
▸ Synthesis operator Φ
Φ : ℓ 2(ℤ) → L 2(ℝ)
+∞
∑
α ↦ Φα = αnφn(t)
n=−∞
The coefficients αn are called synthesis coefficients of f(t)

FRAME: THE FRAME OPERATOR

‣ The analysis and synthesis operators are adjoint
‣ The frame operator

+∞
∑
ℛ = ΦΦ* : ⟨f(t), φn(t)⟩φn(t)
n=−∞
+∞
∑
‣ We have f(t) = ℛ( f ) = ⟨f(t), φn (t)⟩φn(t) iff the frame is a Parseval frame
n=−∞
FRAME: THE DUAL FRAME

‣ Let 𝒟 = {φn(t), φn(t) ∈ L 2(ℝ)} be a frame with constants A, B. The dual frame of 𝒟 is given by
𝒟 = {φ̃n(t) = ℛ (φn)}
˜ −1
˜ is also a frame such that

‣ 𝒟
+∞
1 2 2 1 2
∑
∥f∥ ≤ ⟨f(t), φ̃(t)⟩ ≤ ∥f∥
B n=−∞
A
+∞ +∞
∑ ∑
‣ We have f(t) = ⟨f, φ̃ n(t)⟩φn (t) = ⟨f, φn(t)⟩ φ̃n (t)
n=−∞ n=−∞
1
‣ If 𝒟 is a tight-frame, then φ̃n(t) = φn(t)
A
FRAME: EXAMPLES
▸ An orthogonal basis is a Parseval-Frame
▸ An union of 2 orthogonal bases is a tight-frame with constant A = 2
▸ An union of K orthogonal bases is a tight-frame with constant A = K

M MN
▸ In finite dimension ℂ , every matrix U ∈ ℂ such that rank(U) = M is a frame
▸ Moreover, if UU* = A IdM , then U is a tight-frame with constant A

GABOR FRAME IN FINITE DIMENSION

T
▸ Let x[t] ∈ ℝ , then one can define the full STFT with the real, normalized, analysis window w[t]
T−1
−i2π Tν t
∑
X[τ, ν] = x[t]w[t − τ]e , for all τ, ν = 0..(T − 1)
t=0
‣ Moreover, we have
T−1 T−1
i2π Tν t
∑∑
x[t] = X[τ, ν]w[t − τ]e
τ=0 ν=0
‣ Very redundant: T 2 time-frequency coefficients

DISCRETE GABOR FRAME IN PRACTICE

T L
▸ Let x[t] ∈ ℝ and Let w[t] ∈ ℝ be a real, normalized, analysis window (with L ≤ T)
+ T
▸ Let t0, ν0 ∈ ℕ* and let the Gabor atoms φτ,ν[t] ∈ ℝ
[ t0 ]
L ν
i2π ν L t
φτ,ν[t] = w t − τ e 0
‣ Usual choices are ν0 = 1 (FFT of size L) or ν0 = 2 (FFT of size 2L), and t0 = 2 (overlap of 50 %
between 2 windows) or t0 = 4 (overlap of 75 % between 2 windows)
‣ The dual frame is still a Gabor frame, with the dual window w̃(t)
23
SPARSE SYNTHESIS
FROM ANALYSIS TO SYNTHESIS

T N
▸ Let x[t] ∈ ℝ and 𝒟 = {φn[t]}n=0 be an over complete dictionary (N ≥ T)
TN
▸ Let Φ ∈ ℂ the matrix associated to the dictionary (the synthesis operator). The k-th column of
Φ is then the atom φk
▸ The analysis of x is given by Φ*x = {⟨x[t], φn[t]⟩}n=0

N
How is a signal best synthesized?

SPARSE SYNTHESIS
▸ Back to the example: x(t) = δk(t) + ϵn(t)
▸ Goal: how can we synthesize x with the fewest possible coefficients ?

N−1
∑
Ie: find the synthesis coefficients α such that x[t] = αnφn[t] such that most of αn = 0
▸
n=0
▸ Definition of the quasi-norm ℓ0 : ∥α∥0 = #{αn ≠ 0}

SPARSE SYNTHESIS
▸ Goal: how can we synthesize x with the fewest possible coefficients ?
min∥α∥0 s.t. x = Φα
‣ NP Hard ! Can be solved by MILP programming when N is small
‣ Idea: replace the ℓ0 norm by something easier to minimize

SPARSE SYNTHESIS: THE FRAME METHOD

▸ Replace the ℓ0 norm by the ℓ2 norm
min ∥α∥2 s.t. x = Φα
‣ Solution: α = Φ*(ΦΦ*) x−1
‣ It is the dual frame
‣ No sparsity
SPARSE SYNTHESIS: THE BASIS PURSUIT

N−1
∑
Replace the ℓ0 norm by the ℓ1 norm: ∥α∥1 = | αn |
▸
n=0
min ∥α∥1 s.t. x = Φα
‣ Solution obtained by linear programming (the problem is convex and linear)
‣ Sparsity of the solution
‣ When this solution is the same as the true ℓ0 problem ? (Wait few classes)
SPARSE SYNTHESIS: THE MATCHING PURSUIT

▸ Idea: use a greedy approach
- Init: r (0) = x, x (0) = 0, k = 0
- Repeat
1. Find the optimal atom: λk = argmax | ⟨r, φλ⟩ |

λ
2. Update the approximation: x ( k+1)

= x ( k)
+ ( k)
⟨r , φλk⟩φλk
3. And the residual: r ( k+1)

=x− x ( k+1)
= r ( k)
− ( k)
⟨r , φλk⟩φλk

After K iterations (K ≥ 0), we have
K
(k) (K)
∑
x= ⟨r , φλk⟩φλk + r
▸
k=0
(K) 2 (K−1) 2 (K−1) 2

▸ ∥r ∥ = ∥r ∥ − | ⟨r , φλK−1⟩ |
Moreover, we have
(k)
lim ∥r ∥ = 0
k→+∞

‣ Shortcoming of the MP:
‣ It converges asymptotically
‣ An atome φλ can be chosen several times
‣ Solution: orthogonal matching pursuit

SPARSE SYNTHESIS: THE OMP

‣ Idea: orthogonal projection of the signal on the subspace spanned by the selected atoms
‣ Algorithm:
- Init: r (0) = x, x (0) = 0, k = 0
- Repeat
1. Find the optimal atom: λk = argmax | ⟨r, φλ⟩ |

λ
k−1
αjφλj with αk = (Φ*
Λ Λk)
−1
(k+1) k
∑
2. Update the approximation: x = PVk x = Φ Φ*
Λ
x and Λk = {λj}j=0
k k
j=0
3. And the residual: r ( k+1)

=x− x (k+1)
SPARSE SYNTHESIS: THE OMP

N
‣ Converge in N iterations (remember x ∈ ℝ )
‣ Orthogonal projection can be costly in computation time

34
SPARSE DENOISING
DENOISING BY SPARSE SYNTHESIS

T TN
▸ Let x ∈ ℝ be a signal and Φ ∈ ℝ a dictionary where x is sparse
T
▸ Let y ∈ ℝ be a noisy observation of x:
y=x+n
With n ∈ ℝT a gaussian white noise
‣ Let α ∈ ℝN some sparse synthesis coefficients of x:
y = Φα + n
‣ How to "denoise" y ? Or, how to estimate the sparse coefficients α ?

DENOISING BY SPARSE SYNTHESIS

y = Φα + n
‣ Proposed solutions: solve

2
min ∥α∥0 s.t. ∥y − Φα∥2 ≤σ
‣ One can use MP, OMP, or the convex relaxation by replacing the ℓ0 norm by the ℓ1 norm
‣ LASSO or Basis Pursuit Denoising:
1 2
min ∥y − Φα∥2 + λ∥α∥1
2
LASSO
1 2
min ∥y − Φα∥2 + λ∥α∥1
2
‣ It is a convex, non smooth problem
‣ Can be solved efficiently by proximal descent

LASSO: THE ORTHOGONAL CASE

1
min ∥y − Φα∥22 + λ∥α∥1
2
‣ Suppose that N = T and Φ*Φ = ΦΦ* = IdN .
‣ The problems reads, with z = Φ*y
1
min ∥z − α∥22 + λ∥α∥1
2
‣ It is the so-called proximal operator of λ∥ ⋅ ∥1
‣ Solution: soft-thresholding
+
( | zk | )
λ
αk = 𝒮λ(zk) = zk 1 − with (x)+ = max(x,0)
LASSO: ISTA
1 2
min ∥y − Φα∥2 + λ∥α∥1 = ℱ(α)
2
‣ Let L = ∥Φ∥ 2
‣ Iterative Shrinkage/Thresholding Algorithm (ISTA):
( )
(t+1) (t) 1 (t)
α = 𝒮λ/L α + Φ*(y − Φα )
L
(0) 2
(t) L ∥α − α*∥
‣ ℱ(α ) − ℱ(α*) ≤
2 t
‣ Fast version: FISTA (just use a relaxation step)
CONCLUSION
▸ Stable discretization of continuous transform: frame theory
▸ Sparse synthesis vs dual frame
▸ Sparse denoising
▸ Algorithms: Greedy (MP, OMP), Convex optimization (LASSO and proximal descent)
▸ Questions: matthieu.kowalski@universite-paris-saclay.fr

Frame Sparse

Uploaded by

Copyright:

Available Formats

You might also like

Frame Sparse

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Frame Sparse

Uploaded by

Copyright:

Available Formats

1

FROM ANALYSIS TO SPARSE SYNTHESIS

▸ Web site: http://hebergement.universite-paris-saclay.fr/mkowalski/

- Spectral analysis : frequency content of a function (think about musical notes)

- Sines are eigen signal of time invariant linear systems (filters)

⟨s(t), ϵν(t)⟩, with ϵν(t) = e i2πνt

‣ Limitation of Fourier analysis

- We obtain a pure frequency content from a pure temporal content

- What is the difference between a sum of sine, and a succession of sine ?

EXAMPLE: SUM OF 2 SINES

EXAMPLE: SEQUENCE OF 2 SINES

SHORT TIME FOURIER TRANSFORM (STFT)

φτ,ν(t) = w(t − τ)e i2πνt

‣ Window choice: Heisenberg’s uncertainty principle

‣ We have energy preservation

EXAMPLE: SUM OF 2 SINES

EXAMPLE: SEQUENCE OF 2 SINES

CONTINUOUS WAVELET TRANSFORM

The continuous wavelet transform is given by:

‣ We have energy preservation

FROM CONTINUOUS TRANSFORM TO DISCRETE TRANSFORM

▸ No STFT orthogonal basis (Balian-Low theorem)

▸ Time-frequency orthogonal basis: MDCT

▸ Time-scale orthogonal basis: multi-resolution analysis and dyadic wavelets

▸ Alternative to orthogonal basis ?

▸ Let x(t) = δk(t) + ϵn(t)

▸ x needs N coefficients in Δ, and N coefficients in ℰ for a perfect representation

▸ x needs only 2 coefficients in 𝒟 = Δ ∪ ℰ for a perfect representation

▸ 𝒟 is not an orthogonal basis (and is often called a "dictionary")

‣ If A = B, the frame is a tight-frame

‣ If A = B = 1 the frame is a Parseval frame

‣ If A = B = 1 and ∥φn∥ = 1 ∀n , the frame is an orthogonal basis

FRAME: ANALYSIS AND SYNTHESIS OPERATORS

The coefficients αn are called synthesis coefficients of f(t)

FRAME: THE FRAME OPERATOR

‣ The frame operator

FRAME: THE DUAL FRAME

˜ is also a frame such that

▸ An union of 2 orthogonal bases is a tight-frame with constant A = 2

▸ An union of K orthogonal bases is a tight-frame with constant A = K

▸ Moreover, if UU* = A IdM , then U is a tight-frame with constant A

GABOR FRAME IN FINITE DIMENSION

‣ Very redundant: T 2 time-frequency coefficients

DISCRETE GABOR FRAME IN PRACTICE

FROM ANALYSIS TO SYNTHESIS

▸ The analysis of x is given by Φ*x = {⟨x[t], φn[t]⟩}n=0

How is a signal best synthesized?

▸ Goal: how can we synthesize x with the fewest possible coefficients ?

▸ Definition of the quasi-norm ℓ0 : ∥α∥0 = #{αn ≠ 0}

‣ NP Hard ! Can be solved by MILP programming when N is small

‣ Idea: replace the ℓ0 norm by something easier to minimize

SPARSE SYNTHESIS: THE FRAME METHOD

min ∥α∥2 s.t. x = Φα

‣ Solution: α = Φ*(ΦΦ*) x−1

‣ It is the dual frame

SPARSE SYNTHESIS: THE BASIS PURSUIT

min ∥α∥1 s.t. x = Φα

‣ Solution: α = Φ(ΦΦ) x−1