Advanced Machine Learning (COMP 5328)
Sparse Coding and Regularisation
Tongliang Liu
Dictionary learning
What is a dictionary in machine learning?
Let $x \in \mathbb{R}^d$ and $D \in \mathbb{R}^{d \times k}$. The code of $x$ is

$$\alpha^* = \arg\min_{\alpha \in \mathbb{R}^k} \|x - D\alpha\|^2.$$

Note that $\|x\| = \sqrt{x^\top x}$ is the $\ell_2$ norm.
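Without further constraints this is an ordinary least-squares problem; a minimal numpy sketch (the random $D$ and $x$ here are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 8, 5
D = rng.standard_normal((d, k))   # dictionary with k atoms
x = rng.standard_normal(d)        # signal to encode

# Unconstrained minimiser of ||x - D a||^2: ordinary least squares.
alpha, *_ = np.linalg.lstsq(D, x, rcond=None)

# Optimality check: the residual is orthogonal to every atom of D.
assert np.allclose(D.T @ (x - D @ alpha), 0.0, atol=1e-8)
```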
Given $x_1, \ldots, x_n \in \mathbb{R}^d$,

$$\{D^*, \alpha_1^*, \ldots, \alpha_n^*\} = \arg\min_{D \in \mathbb{R}^{d \times k},\, \alpha_1, \ldots, \alpha_n \in \mathbb{R}^k} \frac{1}{n} \sum_{i=1}^{n} \|x_i - D\alpha_i\|^2.$$
Note that

$$\frac{1}{n} \sum_{i=1}^{n} \|x_i - D\alpha_i\|^2 = \frac{1}{n} \|X - DR\|_F^2,$$

where $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{d \times n}$ and $R = [\alpha_1, \alpha_2, \ldots, \alpha_n] \in \mathbb{R}^{k \times n}$.
$$\|X\|_F = \sqrt{\operatorname{trace}(X^\top X)} = \sqrt{\sum_{i=1}^{d} \sum_{j=1}^{n} X_{i,j}^2}$$

is the Frobenius norm of $X$.
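The two expressions for the Frobenius norm can be checked numerically; a small sketch with a random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 6))

fro = np.linalg.norm(X, "fro")                 # built-in Frobenius norm
via_trace = np.sqrt(np.trace(X.T @ X))         # sqrt(trace(X^T X))
via_sum = np.sqrt((X ** 2).sum())              # sqrt(sum of squared entries)

assert np.allclose(fro, via_trace)
assert np.allclose(fro, via_sum)
```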
Note that the problem can be written with constraint sets as

$$\arg\min_{D \in \mathcal{D},\, R \in \mathcal{R}} \|X - DR\|_F^2.$$
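One simple way to attack the joint factorisation problem is alternating least squares: fix $D$ and solve for $R$, then fix $R$ and solve for $D$. A minimal sketch (illustrative only — this is not K-SVD, and it imposes no sparsity constraint on $R$; atom normalisation is an assumed convention for $\mathcal{D}$):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 10, 5, 50
X = rng.standard_normal((d, n))
D = rng.standard_normal((d, k))
D /= np.linalg.norm(D, axis=0, keepdims=True)   # unit-norm atoms

for _ in range(50):
    # Fix D, solve for all codes R at once (least squares per column).
    R = np.linalg.lstsq(D, X, rcond=None)[0]
    # Fix R, solve for D: min_D ||X - D R||_F^2.
    D = np.linalg.lstsq(R.T, X.T, rcond=None)[0].T
    D /= np.linalg.norm(D, axis=0, keepdims=True)

R = np.linalg.lstsq(D, X, rcond=None)[0]        # final codes for the final D
err = np.linalg.norm(X - D @ R, "fro") ** 2 / n
```

Since $k < d$ here, $DR$ is a low-rank approximation of $X$, so the objective is strictly below the energy of $X$ itself.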
[Figure: dimension reduction — a natural image $X$ is approximated as $X \approx DR$.]
Image credit: Zaouali, Bouzidi, and Zagrouba: Review of multiscale geometric decompositions in a remote sensing context
Note that in

$$\arg\min_{D \in \mathcal{D},\, R \in \mathcal{R}} \|X - DR\|_F^2,$$

the constraint set $\mathcal{R}$ encodes sparsity of the codes, while $\mathcal{D}$ allows overcompleteness of the dictionary.
$\ell_p$ norm

$$\ell_p \text{ norm: } \|\alpha\|_p = \left( \sum_{j=1}^{k} |\alpha_j|^p \right)^{1/p}, \text{ where } \alpha \in \mathbb{R}^k.$$

In other words, $\|\alpha\|_p^p = \sum_{j=1}^{k} |\alpha_j|^p$.
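A quick numeric illustration of common cases; the "$\ell_0$ norm" (count of non-zeros, not a true norm) is included because sparse coding relies on it:

```python
import numpy as np

alpha = np.array([0.5, 0.0, -2.0, 0.0, 1.0])

l1 = np.abs(alpha).sum()            # p = 1: sum of absolute values
l2 = np.sqrt((alpha ** 2).sum())    # p = 2: Euclidean norm
l0 = np.count_nonzero(alpha)        # "p = 0": number of non-zero entries

assert np.isclose(l1, np.linalg.norm(alpha, 1))
assert np.isclose(l2, np.linalg.norm(alpha, 2))
assert l0 == 3
```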
$$\min_{D \in \mathcal{D},\, R \in \mathcal{R}} \|X - DR\|_F^2$$

[Figure: dictionary atoms $D_{:,1}$, $D_{:,2}$, $D_{:,3}$ — the columns of $D$.]
Image credit: Compression of facial images using the K-SVD algorithm, J. Vis. Commun. Image Represent.
Image credit: Low Bit-Rate Compression of Facial Images, IEEE Transactions on Image Processing.
Uniform slicing
Image credit: Compression of facial images using the K-SVD algorithm, J. Vis. Commun. Image Represent.
[Figures: color image restoration results at 50% and 80%.]
Image credit: Sparse representation for color image restoration. IEEE Transactions on Image Processing.
[Figure: constraint balls of radius $\mu$ in the $(\alpha_1, \alpha_2)$ plane.]
The $\ell_0$ norm based approach:

$$\min_{\alpha} \|\alpha\|_0 \quad \text{s.t.} \quad \|x - D\alpha\|_2 \le \epsilon.$$
$%
Greedy algorithms:
•OMP (Y. Pati, et al. 1993; J. Tropp 2004).
•Subspace pursuit (SP) (W. Dai and O. Milenkovic 2009),
CoSaMP (D. Needell and J. Tropp 2009).
•IHT (T. Blumensath and M. Davies 2009).
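As an illustration of the greedy idea, a minimal OMP sketch: repeatedly pick the atom most correlated with the residual, then refit by least squares on the selected support. This is an assumed simplification for unit-norm atoms, not the authors' reference implementations:

```python
import numpy as np

def omp(D, x, sparsity):
    """Orthogonal Matching Pursuit (minimal sketch)."""
    k = D.shape[1]
    support = []
    residual = x.copy()
    alpha = np.zeros(k)
    for _ in range(sparsity):
        # Atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # Least-squares refit restricted to the selected support.
        coef = np.linalg.lstsq(D[:, support], x, rcond=None)[0]
        alpha[:] = 0.0
        alpha[support] = coef
        residual = x - D @ alpha
    return alpha

rng = np.random.default_rng(0)
d, k = 20, 40
D = rng.standard_normal((d, k))
D /= np.linalg.norm(D, axis=0)        # unit-norm atoms (assumed)
true = np.zeros(k)
true[[3, 17]] = [2.0, -1.5]           # a 2-sparse ground truth
x = D @ true
alpha = omp(D, x, sparsity=2)
```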
Sparse coding learning algorithms

The $\ell_1$ norm based approaches:

$$\min_{\alpha} \|\alpha\|_1 \quad \text{s.t.} \quad \|x - D\alpha\|_2 \le \epsilon;$$

$$\min_{\alpha} \|x - D\alpha\|_2^2 + \lambda \|\alpha\|_1.$$
Bayesian approach:
• Relevance vector machine (RVM) (M. Tipping 2001).
• Bayesian compressed sensing (BCS) (S. Ji, et al. 2008).
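The penalised $\ell_1$ formulation can be minimised by proximal gradient descent (ISTA), alternating a gradient step on the quadratic term with soft-thresholding. A minimal sketch, assuming a fixed step size $1/L$ with $L$ the gradient's Lipschitz constant:

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(D, x, lam, iters=500):
    """Minimise ||x - D a||_2^2 + lam * ||a||_1 (ISTA sketch)."""
    L = 2.0 * np.linalg.norm(D.T @ D, 2)   # Lipschitz constant of the gradient
    alpha = np.zeros(D.shape[1])
    for _ in range(iters):
        grad = 2.0 * D.T @ (D @ alpha - x)
        alpha = soft_threshold(alpha - grad / L, lam / L)
    return alpha

rng = np.random.default_rng(0)
D = rng.standard_normal((15, 30))
D /= np.linalg.norm(D, axis=0)             # unit-norm atoms (assumed)
x = rng.standard_normal(15)
alpha = ista(D, x, lam=0.5)
```

With step size $1/L$, each iteration can only decrease the objective, so the result is never worse than the all-zeros code.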
Xu, H., Caramanis, C., & Mannor, S. (2012). Sparse algorithms are not stable: A no-free-lunch theorem. IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(1), 187–193.
$$S^i = \{(X_1, Y_1), \ldots, (X_{i-1}, Y_{i-1}), (X_i', Y_i'), (X_{i+1}, Y_{i+1}), \ldots, (X_n, Y_n)\}$$

$$h_S = \arg\min_{h \in \mathcal{H}} \frac{1}{n} \sum_{i=1}^{n} \ell(X_i, Y_i, h) + \|h\|_2^2$$
If the convex surrogate loss function is $L$-Lipschitz continuous w.r.t. $h$ and $\|X\|_2 \le B$, we have

$$|\ell(X, Y, h) - \ell(X, Y, h')| \le L\,|h(X) - h'(X)|.$$
$$f(w) \ge f(v) + \langle \nabla f(v), w - v \rangle + \frac{\mu}{2}\|v - w\|^2, \quad \forall v, w$$

$$\Leftrightarrow \quad \mu I \preceq \nabla^2 f(v), \quad \forall v.$$

[Figure: at the point $(v, f(v))$, the graph of $f$ lies above the quadratic lower bound $f(v) + \langle \nabla f(v), w - v \rangle + \frac{\mu}{2}\|v - w\|^2$, which in turn lies above the tangent $f(v) + \langle \nabla f(v), w - v \rangle$.]
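For a quadratic $f(w) = \frac{1}{2} w^\top A w$ the Hessian is $A$ everywhere, so the smallest eigenvalue of $A$ gives $\mu$. A small numeric check of the first-order inequality (illustrative, with a randomly built positive definite $A$):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + np.eye(5)            # symmetric positive definite Hessian

def f(w):
    return 0.5 * w @ A @ w

def grad(w):
    return A @ w

mu = np.linalg.eigvalsh(A)[0]      # smallest eigenvalue = strong convexity mu

for _ in range(100):
    v, w = rng.standard_normal(5), rng.standard_normal(5)
    lower = f(v) + grad(v) @ (w - v) + 0.5 * mu * np.dot(w - v, w - v)
    assert f(w) >= lower - 1e-9    # quadratic lower bound holds at every pair
```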
We then have