
INVITED PAPER

Image Decomposition and Separation Using Sparse Representations: An Overview

This overview paper points out that signal and image processing, as well as many other important areas of engineering, can benefit from the techniques it discusses.

By M. Jalal Fadili, Jean-Luc Starck, Jérôme Bobin, and Yassir Moudden

ABSTRACT | This paper gives essential insights into the use of sparsity and morphological diversity in image decomposition and source separation by reviewing our recent work in this field. The idea to morphologically decompose a signal into its building blocks is an important problem in signal processing and has far-reaching applications in science and technology. Starck et al. [1], [2] proposed a novel decomposition method, morphological component analysis (MCA), based on sparse representation of signals. MCA assumes that each (monochannel) signal is the linear mixture of several layers, the so-called morphological components, that are morphologically distinct, e.g., sines and bumps. The success of this method relies on two tenets: sparsity and morphological diversity. That is, each morphological component is sparsely represented in a specific transform domain, and the latter is highly inefficient in representing the other content in the mixture. Once such transforms are identified, MCA is an iterative thresholding algorithm that is capable of decoupling the signal content. Sparsity and morphological diversity have also been used as a novel and effective source of diversity for blind source separation (BSS), hence extending MCA to multichannel data. Building on these ingredients, we will provide an overview of the generalized MCA introduced by the authors in [3] and [4] as a fast and efficient BSS method. We will illustrate the application of these algorithms on several real examples. We conclude our tour by briefly describing our software toolboxes made available for download on the Internet for sparse signal and image decomposition and separation.

KEYWORDS | Blind source separation; image decomposition; morphological component analysis; sparse representations
I. INTRODUCTION
Although mathematics has its million-dollar problems, in the form of Clay Math Prizes, there are several billion-dollar problems in signal and image processing. Famous ones include the cocktail party problem (separating a speaker's voice from a mixture of other recorded voices and background sounds at a cocktail party). These signal-processing problems seem to be intractable according to orthodox arguments based on rigorous mathematics, and yet they keep cropping up in problem after problem.
One such fundamental problem involves decomposing a signal or image into superposed contributions from different sources; think of symphonic music, which may involve superpositions of acoustic signals generated by many different instruments, and imagine the problem of separating these contributions. More abstractly, we can see many forms of media content that are superpositions of contributions from different "content types," and we can imagine wanting to separate out the contributions from

Manuscript received March 10, 2009; revised May 29, 2009; accepted June 1, 2009. Date of publication September 29, 2009; date of current version May 19, 2010. This work was supported by NatImages ANR under Grant ANR-08-EMER-009.
M. J. Fadili is with GREYC CNRS, ENSICAEN, Image Processing Group, University of Caen, 14050 Caen Cedex, France (e-mail: Jalal.Fadili@greyc.ensicaen.fr).
J.-L. Starck is with AIM, CEA/DSM, CNRS, University Paris Diderot, F-91191 Gif-sur-Yvette Cedex, France (e-mail: jstarck@cea.fr).
J. Bobin is with the Department of Applied and Computational Mathematics, California Institute of Technology, Pasadena, CA 91125 USA (e-mail: bobin@acm.caltech.edu).
Y. Moudden is with DSM/IRFU/SEDI, CEA/Saclay, F-91191 Gif-sur-Yvette, France (e-mail: yassir.moudden@cea.fr).
Digital Object Identifier: 10.1109/JPROC.2009.2024776

0018-9219/$26.00 Ó 2009 IEEE Vol. 98, No. 6, June 2010 | Proceedings of the IEEE 983
Authorized licensed use limited to: CALIFORNIA INSTITUTE OF TECHNOLOGY. Downloaded on June 04,2010 at 22:14:32 UTC from IEEE Xplore. Restrictions apply.
Fadili et al.: Image Decomposition and Separation Using Sparse Representations

each. We can easily see a fundamental problem; for example, an N-pixel image created by superposing K different types offers us N data (the pixel values), but there may be as many as N × K unknowns (the contribution of each content type to each pixel). Traditional mathematical reasoning, in fact the fundamental theorem of linear algebra, tells us not to attempt this: there are more unknowns than equations. On the other hand, if we have prior information about the underlying object, there are some rigorous results showing that such separation might be possible, using a sparsity prior.

The idea to morphologically decompose a signal into its building blocks is an important problem in signal and image processing. Successful methods for signal or image separation can be applied in a broad range of areas in science and technology, including biomedical engineering, medical imaging, speech processing, astronomical imaging, remote sensing, and communication systems. An interesting and complicated image content separation problem is the one targeting the decomposition of an image into texture and piecewise-smooth (cartoon) parts. A functional-space characterization of oscillating textures was proposed in [5] and was used for variational cartoon + texture image decomposition [6]. Since then, we have witnessed a flurry of research activity in this application field.

In [1] and [2], the authors proposed a novel decomposition method, morphological component analysis (MCA), based on sparse representation of signals. MCA assumes that each signal is the linear mixture of several layers, the so-called morphological components, that are morphologically distinct, e.g., sines and bumps. The success of this method relies on the assumption that for every component behavior to be separated, there exists a dictionary of atoms that enables its construction using a sparse representation. It is then assumed that each morphological component is sparsely represented in a specific transform domain. And when all transforms (each one attached to a morphological component) are amalgamated in one dictionary, each one must lead to a sparse representation over the part of the signal it is serving, while being highly inefficient in representing the other content in the mixture. If such dictionaries are identified, the use of a pursuit algorithm searching for the sparsest representation leads to the desired separation. MCA is capable of creating atomic sparse representations containing, as a by-product, a decoupling of the signal content.

Over the last few years, the development of multichannel sensors motivated interest in methods for the coherent processing of multivariate data. Consider a situation where there is a collection of signals emitted by some physical sources. These could be, for example, different brain areas emitting electric signals; people speaking in the same room, thus emitting speech signals; or radiation sources emitting their electromagnetic waves. Assume further that there are several sensors or receivers. These sensors are in different positions, so that each records a mixture of the original source signals with different weights. The so-called blind source separation (BSS) problem is to find the original sources or signals from their observed mixtures, without prior knowledge of the mixing weights, and by knowing very little about the original sources. Some specific issues of BSS have already been addressed, as testified by the wide literature in this field. In this context, as clearly emphasized by previous work, it is fundamental that the sources to be retrieved present some quantitatively measurable diversity or contrast (e.g., decorrelation, independence, morphological diversity, etc.). The seminal work of [7] and [8] paved the way for the use of sparsity in BSS. Recently, sparsity and morphological diversity have emerged as a novel and effective source of diversity for BSS, for both underdetermined and overdetermined mixtures; see the comprehensive review in [4]. Building on the sparsity and morphological diversity ingredients, the authors proposed the generalized MCA (GMCA) as a fast and efficient multichannel sparse data-decomposition and BSS method [3], [4], [9].

A. Organization of the Paper
Our intent in this paper is to provide an overview of the recent work in monochannel image decomposition and multichannel source separation based on the concepts of sparsity and morphological diversity. The first part of this paper is devoted to monochannel sparse image decomposition, and the second part to blind sparse source separation. In this review, our goal is to highlight the essential concepts and issues, and to describe the main algorithms. Several applications to real data are given in each part to illustrate the capabilities of the proposed algorithms. We conclude our tour by providing pointers to our software toolboxes that implement our algorithms and reproduce the experiments on sparse signal and image decomposition and source separation.

B. Notations
The ℓ_p-norm of a (column or row) vector x is ‖x‖_p := (Σ_i |x[i]|^p)^{1/p}, with the usual adaptation when p = ∞, and ‖x‖_0 := lim_{p→0} ‖x‖_p^p is the ℓ_0 pseudonorm, i.e., the number of nonzero components. Bold symbols represent matrices, and X^T is the transpose of X. The Frobenius norm of X is ‖X‖_F = Trace(X^T X)^{1/2}. The kth entry of y_i (respectively, y^j) is y_i[k] (respectively, y^j[k]), where y_i is the ith row and y^j is the jth column of Y. An atom is an elementary signal-representing template. Examples might include sinusoids, monomials, wavelets, and Gaussians. A dictionary Φ = [φ_1, …, φ_L] defines an N × L matrix whose columns are unit ℓ_2-norm atoms φ_i. When the dictionary has more columns than rows, it is called overcomplete or redundant. We are mainly interested here in overcomplete dictionaries.
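The norms just defined are easy to check numerically; here is a minimal NumPy sketch (the function names are ours, not from the paper):

```python
import numpy as np

def lp_norm(x, p):
    # ||x||_p = (sum_i |x[i]|^p)^(1/p), for 0 < p < infinity
    return np.sum(np.abs(x) ** p) ** (1.0 / p)

def l0_pseudonorm(x):
    # ||x||_0 = lim_{p -> 0} ||x||_p^p = number of nonzero components
    return np.count_nonzero(x)

x = np.array([3.0, 0.0, -4.0])
print(lp_norm(x, 2))        # 5.0 (Euclidean norm)
print(lp_norm(x, 1))        # 7.0
print(l0_pseudonorm(x))     # 2
```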


II. MONOCHANNEL SPARSE IMAGE DECOMPOSITION

A. Morphological Component Analysis
Suppose that the N-sample signal or image x is the linear superposition of K morphological components, possibly contaminated with noise:

y = Σ_{k=1}^{K} x_k + ε,  σ_ε² = Var[ε] < +∞.

The MCA framework aims at solving the inverse problem that consists in recovering the components (x_k)_{k=1,…,K} from their observed linear mixture, as illustrated in the top of Fig. 1. MCA assumes that each component x_k can be sparsely represented in an associated basis Φ_k, i.e.,

x_k = Φ_k α_k,  k = 1, …, K

where α_k is a sparse coefficient vector (sparse means that only a few coefficients are large and most are negligible). Thus, a dictionary can be built by amalgamating several transforms (Φ_1, …, Φ_K) such that, for each k, the representation of x_k in Φ_k is sparse and not, or at least not as sparse, in the other Φ_l, l ≠ k. In other words, the subdictionaries (Φ_1, …, Φ_K) must be mutually incoherent. Thus, the dictionary Φ_k plays the role of a discriminant between content types, preferring the component x_k over the other parts. This is a key observation for the success of the separation algorithm. Owing to recent advances in computational harmonic analysis, many novel representations, including the wavelet transform, curvelet, contourlet, steerable, or complex wavelet pyramids, were shown to be very effective in sparsely representing certain kinds of signals and images. Thus, for decomposition purposes, the dictionary will be built by taking the union of one or several (sufficiently incoherent) transforms, generally each corresponding to an orthogonal basis or a tight frame.

However, the augmented dictionary Φ = [Φ_1 ⋯ Φ_K] will provide an overcomplete representation of x. Because there are more unknowns than equations, the system x = Φα is underdetermined. Sparsity can be used to find a unique solution, in some idealized cases; there is an extensive literature on the subject, and the

Fig. 1. Illustration of the (top) image decomposition and (bottom) BSS problems using sparsity and morphological diversity. For the bottom part, each source is itself a mixture of morphological components [see (4)] to be isolated.
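The sparsity-and-incoherence intuition behind the "sines and bumps" example can be made concrete with a toy signal: an oscillatory part that is exactly one atom of an orthonormal DCT basis, plus a few spikes that are sparse in the Dirac basis. Each basis is efficient on its own content type and inefficient on the other. A minimal NumPy sketch (our own illustration; the basis construction and amplitudes are arbitrary choices):

```python
import numpy as np

N = 256
n = np.arange(N)

# Orthonormal DCT-II basis: each column is one cosine atom.
Phi = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[:, None] + 1) * n[None, :] / (2 * N))
Phi[:, 0] /= np.sqrt(2)

sines = 4.0 * Phi[:, 10]                          # one cosine atom: sparse in Phi
bumps = np.zeros(N); bumps[[40, 90, 200]] = 3.0   # spikes: sparse in the Dirac basis
y = sines + bumps                                 # the mixture to be decomposed

c_sines = Phi.T @ sines                           # DCT analysis of each component
c_bumps = Phi.T @ bumps

print(np.count_nonzero(np.abs(c_sines) > 1e-8))   # 1: DCT is efficient on the oscillations
print(np.count_nonzero(np.abs(c_bumps) > 1e-8))   # large: the spikes spread over most DCT coefficients
```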


interested reader may refer to the comprehensive review paper [10].

In [1] and [2], it was proposed to solve this underdetermined system of equations and recover the morphological components (x_k)_{k=1,…,K} by solving the following constrained optimization problem:

min_{α_1,…,α_K} Σ_{k=1}^{K} ‖α_k‖_p^p  such that  ‖y − Σ_{k=1}^{K} Φ_k α_k‖_2 ≤ σ   (1)

where ‖·‖_p^p is the penalty quantifying sparsity (the most interesting regime is for 0 ≤ p ≤ 1) and σ is typically chosen as a constant times √N σ_ε. The constraint in this optimization problem accounts for the presence of noise and model imperfection. If there is no noise and the linear superposition model is exact (σ = 0), an equality constraint is substituted for the inequality constraint. This formulation is flexible enough to incorporate external forces that direct the morphological components to better suit their expected content; these forces will fine-tune the separation process to achieve its task. As an example of such a successful external force, [1] proposed to add a total variation penalty [11] to the cartoon part in order to direct this component to fit the piecewise-smooth model.

B. MCA Algorithm
Equation (1) is not easy to solve in general, especially for p < 1 (for p = 0, it is even NP-hard). Nonetheless, if all components x_l = Φ_l α_l but the kth are fixed, then it can be proved that the solution α_k is given by hard thresholding (for p = 0) or soft thresholding (for p = 1) the marginal residuals r_k = y − Σ_{l≠k} Φ_l α_l. These marginal residuals r_k are relieved from the other components and are likely to contain mainly the salient information of x_k. This intuition dictates a coordinate relaxation algorithm that cycles through the components at each iteration and applies a thresholding to the marginal residuals. This is what justifies the steps of the MCA algorithm summarized in Algorithm 1, where TH_λ(·) denotes component-wise thresholding with threshold λ: hard thresholding HT_λ(u) = u if |u| > λ and zero otherwise, or soft thresholding ST_λ(u) = u·max(1 − λ/|u|, 0).

Beside coordinate relaxation, another important ingredient of MCA is iterative thresholding with a varying threshold. Thus, MCA can be viewed as a stagewise hybridization of matching pursuit (MP) [12] with block-coordinate relaxation [13] to (approximately) solve (1). The adjective "stagewise" is used because MCA exploits the fact that the dictionary is structured (a union of transforms), and the atoms enter the solution by groups rather than individually, unlike MP. As such, MCA is a salient-to-fine process where, at each iteration, the most salient content of each morphological component is iteratively computed. These estimates are then progressively refined as the threshold λ decreases towards λ_min.

In the noiseless case, a careful analysis of the recovery properties (uniqueness and support recovery) of the MCA algorithm and its convergence behavior when all the Φ_k are orthobases can be found in [9] and [14].

C. Dictionary Choice
From a practical point of view, given a signal x, we will need to compute its forward (or analysis) transform by multiplying it by Φ^T. We also need to reconstruct any signal from its coefficients α. In fact, the matrix Φ and its adjoint Φ^T corresponding to each transform are never explicitly constructed in memory.
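The coordinate relaxation with a decreasing threshold described in Algorithm 1 can be sketched on a small noiseless toy problem. In this minimal NumPy sketch (our own; the bases, the exponential threshold schedule with rate 0.95, and the amplitudes are all illustrative choices, not the paper's settings), the two orthobases are the Dirac basis and an orthonormal DCT:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 128
k = np.arange(N)

# Two mutually incoherent orthobases: Dirac (spikes) and orthonormal DCT (sines).
Phi1 = np.eye(N)
Phi2 = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * k[:, None] + 1) * k[None, :] / (2 * N))
Phi2[:, 0] /= np.sqrt(2)

# Ground-truth morphological components: a few spikes plus a few cosines.
a1 = np.zeros(N); a1[rng.choice(N, 5, replace=False)] = 5.0
a2 = np.zeros(N); a2[[3, 17]] = 4.0
y = Phi1 @ a1 + Phi2 @ a2                   # noiseless mixture

# MCA: cycle over components, hard-threshold the marginal residuals,
# and let the threshold decrease exponentially towards lambda_min.
bases = [Phi1, Phi2]
x = [np.zeros(N), np.zeros(N)]
lam, lam_min = float(np.abs(y).max()), 1e-3
for _ in range(250):
    for j in (0, 1):
        r = y - x[1 - j]                    # marginal residual r_k
        c = bases[j].T @ r                  # analysis
        c[np.abs(c) <= lam] = 0.0           # hard thresholding HT_lam
        x[j] = bases[j] @ c                 # synthesis
    lam = max(0.95 * lam, lam_min)

# The two morphological components are recovered up to a small error.
print(np.linalg.norm(x[0] - Phi1 @ a1))     # small
print(np.linalg.norm(x[1] - Phi2 @ a2))     # small
```

Note how each component is re-estimated from the residual in which the other component has been subtracted, which is exactly the marginal-residual thresholding described above.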


Rather, they are implemented as fast implicit analysis and synthesis operators, taking a signal vector x and returning Tx = Φ^T x (analysis side), or taking a coefficient vector α and returning Φα (synthesis side). In the case of a simple orthogonal basis, the inverse of the analysis transform is trivially T^{−1} = Φ; whereas, assuming that Φ is a tight frame implies that the frame operator satisfies ΦΦ^T = cI, where c > 0 is the tight frame constant. Hence, T^T = Φ is the Moore–Penrose pseudoinverse transform (corresponding to the minimal dual synthesis frame), up to the constant c. In other words, the pseudoinverse reconstruction operator T^+ corresponds to c^{−1}Φ. It turns out that T^+ α is the reconstruction operation implemented by most implicit synthesis algorithms.

Choosing an appropriate dictionary is a key step towards a good sparse decomposition. Thus, to represent isotropic structures efficiently, a qualifying choice is the wavelet transform [15], [16]. The curvelet system [17] is a very good candidate for representing piecewise-smooth (C²) images away from C² contours. The ridgelet transform [18] has been shown to be very effective for sparsely representing global lines in an image. For locally oscillating textures, one can think of the local discrete cosine transform (DCT) [15], waveatoms [19], or brushlets [20]. These transforms are also computationally tractable, particularly in large-scale applications, and, as stated above, never explicitly implement Φ and T. The associated implicit fast analysis and synthesis operators have typical complexities of O(N), with N the number of samples or pixels (e.g., the orthogonal or biorthogonal wavelet transform), or O(N log N) (e.g., ridgelets, curvelets, local DCT, and waveatoms).

What happens if none of the known fixed transforms can efficiently sparsify a morphological component, e.g., a complex natural texture? In [21], the authors have extended the MCA algorithm to handle the case where the dictionary attached to each morphological component is not necessarily fixed a priori as above, but learned from a set of exemplars in order to capture complex textural patterns.

D. Thresholding Strategy
In practice, hard thresholding leads to better results. Furthermore, in [14], we empirically showed that the use of hard thresholding is likely to provide the ℓ_0-sparsest solution. As far as the threshold decreasing strategy is concerned, there are several alternatives. For example, in [1] and [2], linear and exponential decrease were advocated. In [14], a more elaborate strategy coined MOM (for mean-of-max) was proposed.

E. Handling Bounded Noise
MCA handles in a natural way data perturbed by additive noise ε with bounded variance σ_ε². Indeed, as MCA is a coarse-to-fine iterative procedure, bounded noise can be handled simply by stopping the iteration when the residual is at the noise level. Assuming that the noise variance σ_ε² is known, the algorithm may be stopped at iteration t when the ℓ_2-norm of the residual satisfies ‖r^{(t)}‖_2 ≤ √N σ_ε. Alternatively, one may use a strategy reminiscent of denoising methods by taking λ_min = τσ_ε, where τ is a constant, typically between three and four.

F. Applications
Fig. 2 shows examples of application of the MCA sparse decomposition algorithm to three real images: (a)–(c) Barbara, (d)–(f) an X-ray riser image, and (g)–(j) an astronomical image of the galaxy SBS 0335-052. The riser in Fig. 2(d) is made of a composite material layer, a layer of steel-made fibers having opposite lay angles, and lead-made markers used as a reference to calibrate the X-ray camera. The observed image is corrupted by noise. The structures of interest are the curvilinear fibers. The astronomical image of Fig. 2(g) is contaminated by noise and a stripping artifact; the galaxy of interest is hardly visible in the original data.

The dictionaries used for the three images are, respectively: local DCT + curvelets for Barbara, to decompose it into cartoon + texture parts; translation-invariant wavelets + curvelets for the riser image; and ridgelets + curvelets + translation-invariant wavelets for the astronomical image. The details of the experimental setup, including the parameters of the dictionaries for each image, are found in [22]. From Fig. 2(e) and (f), one can clearly see how MCA managed to get rid of the lead-made markers while reconstructing the curvilinear fibers structure. In Fig. 2(j), the galaxy has been well detected in the wavelet space, while the stripping artifact was remarkably captured and removed owing to ridgelets and curvelets.

III. MULTICHANNEL SPARSE SOURCE SEPARATION

A. The Blind Source Separation Problem
In the BSS setting, the instantaneous linear mixture model assumes that we are given m observations (channels) {y_1, …, y_m}, where each y_j is a row vector of size N; each channel is the linear mixture of n sources:

∀j ∈ {1, …, m},  y_j = Σ_{i=1}^{n} a_j[i] s_i + ε_j   (2)

or, equivalently, in matrix form,

Y = AS + E   (3)

where Y = [y_1^T, …, y_m^T]^T is the m × N measurement matrix whose rows are the channels y_j, S = [s_1^T, …, s_n^T]^T is the


Fig. 2. MCA of three real two-dimensional images. Barbara: (a) original, (b) cartoon component (curvelets), (c) texture (local DCT).
X-ray riser image: (d) observed, (e) isotropic structures and background (wavelets), (f) curvilinear fibers (curvelets).
Galaxy SBS 0335-052: (g) observed, (h) ridgelet component, (i) curvelet component, (j) detected galaxy (wavelets).
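The instantaneous linear mixture model (2) and (3) is straightforward to simulate; the following minimal NumPy sketch (the dimensions are our own illustrative choices) builds Y = AS + E with unit ℓ_2-norm mixing columns:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, N = 4, 2, 256                       # channels, sources, samples

S = rng.standard_normal((n, N))           # source matrix: one source per row
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=0)            # unit l2-norm columns of the mixing matrix
E = 0.01 * rng.standard_normal((m, N))    # instrumental noise / model imperfections

Y = A @ S + E                             # eq. (3): Y = AS + E
print(Y.shape)                            # (4, 256): one mixture channel per row
```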

n × N source matrix whose rows are the sources s_i, and A is the m × n mixing matrix. A defines the contribution of each source to each channel. An m × N matrix E is added to account for instrumental noise or model imperfections. See Fig. 1.

Source separation techniques aim at recovering the original sources S by taking advantage of some information contained in the way the signals are mixed. In the blind approach, both the mixing matrix A and the sources S are unknown. Source separation is overwhelmingly a question of contrast and diversity. Indeed, source separation boils down to devising quantitative measures of diversity or contrast to extricate the sources. Typical measures of contrast are statistical criteria such as independence (i.e., independent component analysis (ICA) [23]) or sparsity and morphological diversity; see [3], [4], [7], [24], and [25] and references therein.

B. Generalized Morphological Component Analysis
The GMCA framework assumes that the observed data Y is a linear instantaneous mixture of unknown sources S with an unknown mixing matrix A, as in (3). For notational convenience, the dictionaries in the multichannel case will be transposed versions of those considered in the single-channel case in Section II; each dictionary Φ_k is now a matrix whose rows are unit-norm atoms. Thus, we let Φ be the concatenation of K transforms: Φ = [Φ_1^T, …, Φ_K^T]^T.

The GMCA framework assumes a priori that the sources (s_i)_{i=1,…,n} are sparse in the dictionary Φ: ∀i, s_i = α_i Φ, where α_i is sparse (or compressible). More precisely, in the GMCA setting, each source is modeled as the linear combination of K morphological components, where each component is sparse in a specific basis:

∀i ∈ {1, …, n},  s_i = Σ_{k=1}^{K} x_{ik} = Σ_{k=1}^{K} α_{ik} Φ_k.   (4)

GMCA seeks an unmixing scheme, through the estimation of A, which leads to the sparsest sources S in the dictionary Φ. This is expressed by the following optimization problem:

min_{A,α} Σ_{i=1}^{n} Σ_{k=1}^{K} ‖α_{ik}‖_p^p  such that  ‖Y − AαΦ‖_F ≤ δ  and  ‖a^i‖_2 = 1 ∀i = 1, …, n   (5)

where typically p = 0, or a relaxed version with p = 1. But other sparsity regularization terms can be used in (5), e.g.,


mixed-norms [26]. The unit ℓ_2-norm constraint on the columns of A avoids the classical scale indeterminacy of the product AS in (3). The reader may have noticed that the MCA problem (1) is a special case of the GMCA problem (5) when there is only one source, n = 1, and one channel, m = 1 (no mixing). Thus GMCA is indeed a multichannel generalization of MCA.

Equation (5) is a difficult nonconvex optimization problem, even for convex penalties p ≥ 1. More conveniently, the product AS can be split into n × K multichannel morphological components: AS = Σ_{i,k} a^i x_{ik} = AαΦ = Σ_{i,k} a^i α_{ik} Φ_k. Based on this decomposition, GMCA yields an alternating minimization algorithm to estimate iteratively one term at a time. It has been shown in [3] that estimating the morphological component x_{ik} = α_{ik} Φ_k, assuming A and the x_{pq}, {p, q} ≠ {i, k}, are fixed, can be obtained through iterative thresholding for p = 0 and p = 1.

Now, considering fixed {a^p}_{p≠i} and S, updating the column a^i is then just a least-squares estimate:

a^i = (1/‖s_i‖_2²) (Y − Σ_{p≠i} a^p s_p) s_i^T.   (6)

This estimate is then projected onto the unit sphere to meet the unit ℓ_2-norm constraint in (5).

C. GMCA Algorithm
The GMCA algorithm is summarized in Algorithm 2. In the same vein as MCA, GMCA also relies on a salient-to-fine strategy. More precisely, GMCA is an iterative thresholding algorithm such that, at each iteration, it first computes coarse versions of the morphological components for a fixed source s_i. These raw sources are estimated from their most significant coefficients in Φ. Then, the corresponding column a^i is estimated from the most significant features of s_i. Each source and its corresponding column of A are then alternately and progressively refined as the threshold λ decreases towards λ_min. This particular iterative thresholding scheme provides robustness to noise, model imperfections, and initialization by working first on the most significant features in the data and then progressively incorporating smaller details to finely tune the model parameters. As a multichannel extension of MCA, GMCA is also robust to noise and can be used with either a linear or an exponential decrease of the threshold [3], [4]. Moreover, hard thresholding leads to its best practical performance.

D. Unknown Number of Sources
In BSS, the number of sources n is assumed to be a fixed, known parameter of the problem. This is the exception rather than the rule, and estimating n from the data is a crucial and strenuous problem. Only a few works have attacked this issue. One can think of using model selection criteria such as the minimum description length used in [27]. In [4], a sparsity-based method to estimate n within the GMCA framework was proposed. Roughly speaking, this selection procedure uses GMCA to solve a sequence of problems (5) for each constraint radius δ(q) with increasing q, 1 ≤ q ≤ m. In [4], it was argued to set δ(q) to the Frobenius norm of the error when approximating the data matrix Y with its largest q singular vectors; see [4] for further details.

E. Hyperspectral Data
In standard BSS, A is often seen as a mixing matrix of small size m × n. On the other hand, there are applications in which one deals with data from instruments with a very large number of channels m, which are well organized according to some physically meaningful index. A typical example is hyperspectral data, where images are collected in


a large number (i.e., hundreds) of contiguous regions of the mixtures [signal-to-noise ratio (SNR) ¼ 10 dB]. GMCA
electromagnetic spectrum. Regardless of other definitions was applied with a dictionary containing curvelets and
appearing in other scientific communities, the term local DCT. As a quantitative performance measure of BSS
Bhyperspectral[ is used here for multichannel data model methods, we use the mixing matrix criterion A ¼
(3) where the number of channels m is large and these kIn  P^Aþ Ak1 , where ^Aþ is the pseudoinverse of the
channels achieve a uniform sampling of some meaningful estimate of the mixing matrix A and P is a matrix that
physical index (e.g., wavelength, space, time), which we reduces the scale/permutation indeterminacy of the mix-
refer to as the spectral dimension. For such data, it then ing model. Fig. 3(c) compares GMCA to popular BSS
makes sense to consider the regularity of the spectral sig- techniques in terms of A as the SNR increases. We
natures ðai Þi¼1;...;n . For instance, these spectral signatures compare GMCA to ICA JADE [30], relative Newton algo-
may be known a priori to have a sparse representation in rithm (RNA) [31] that accounts for sparsity, and EFICA
some specified possibly redundant dictionary 8 of spectral [32]. Both RNA and EFICA were applied after
waveforms. Bsparsifying[ the data via an orthonormal wavelet trans-
In [4] and [28], the authors propose a modified GMCA algorithm capable of handling hyperspectral data. This is achieved by assuming that each rank-one matrix X_i = a_i s_i has a sparse representation in the multichannel dictionary \Psi \otimes \Phi [9], [29] (\otimes is the Kronecker product); i.e., s_i = \gamma_i \Phi and a_i = \Psi \lambda_i, where \gamma_i and \lambda_i are both sparse. The separation optimization problem for hyperspectral data is then

\min_{\Lambda,\Gamma} \; \frac{1}{2} \Big\| Y - \sum_{i=1}^{n} \Psi \lambda_i \gamma_i \Phi \Big\|_F^2 + \sum_{i=1}^{n} \kappa_i \| \lambda_i \gamma_i \|_1    (7)

where a Laplacian prior is imposed on each \gamma_i conditionally on \lambda_i, and vice versa. Remarkably, this joint prior preserves the scale invariance of (3). Equation (7) is again a nonconvex optimization problem for which no closed-form solution exists. In the line of the GMCA algorithm, thanks to the form of the \ell_1-penalty in (7), a block-relaxation iterative thresholding algorithm was proposed in [4] and [28] that alternately minimizes (7) with respect to \Lambda and \Gamma. It was shown by these authors that the update equations on the coefficient matrices are

\Lambda^{(t+1)} = \mathrm{ST}_{\delta^{(t)}}\Big( \Psi^T Y \Phi^T \, \Gamma^{(t)T} \big( \Gamma^{(t)} \Gamma^{(t)T} \big)^{-1} \Big)
\Gamma^{(t+1)} = \mathrm{ST}_{\delta'^{(t)}}\Big( \big( \Lambda^{(t)T} \Lambda^{(t)} \big)^{-1} \Lambda^{(t)T} \, \Psi^T Y \Phi^T \Big).    (8)

Here \delta^{(t)} is a vector of length n with entries \delta^{(t)}[i] = \mu^{(t)} \|\gamma_i^{(t)}\|_1 / \|\gamma_i^{(t)}\|_2^2; \delta'^{(t)} has length n with entries \delta'^{(t)}[j] = \mu^{(t)} \|\lambda_j^{(t)}\|_1 / \|\lambda_j^{(t)}\|_2^2; and \mu^{(t)} is a decreasing threshold. The multichannel soft-thresholding operator \mathrm{ST}_{\delta^{(t)}} acts on each column i with threshold \delta^{(t)}[i], and \mathrm{ST}_{\delta'^{(t)}} acts on each row j with threshold \delta'^{(t)}[j].

F. Applications

BSS: We first report a simple BSS application. Fig. 3 shows (a) two original sources and (b) the two noisy mixtures. Fig. 3(c) plots the mixing matrix criterion against the input SNR for the four compared methods. It can be seen that JADE performs rather badly, while RNA and EFICA behave quite similarly. GMCA seems to provide much better results, especially at high noise level.

Color Image Denoising: GMCA can be applied to color image denoising. This is illustrated in Fig. 4, where the original RGB image is shown in (a). Fig. 4(b) shows the RGB image obtained using classical undecimated wavelet-domain hard thresholding on each color plane independently. GMCA is applied to the RGB color channels using the curvelet dictionary. In the notation of Section III-A, we have m = 3 channels (the color planes), each color channel being y_j, j = 1, 2, 3; n = 3 sources; and A plays the role of the color space conversion matrix. Unlike classical color spaces (e.g., YUV, YCC), where the conversion matrix from RGB is fixed, the color space conversion matrix is here estimated by GMCA from the data. As such, GMCA is able to adaptively find the color space appropriate to the color image at hand. Once A is estimated by GMCA, we applied the same undecimated wavelet-based denoising to the estimated sources. The denoised data are obtained by returning to the RGB space via the estimated mixing matrix. Fig. 4(c) shows the GMCA-based denoised image. Clearly, denoising in the "GMCA color space" is substantially better than in the RGB space (or other color spaces such as YUV or YCC; see [3]).

Hyperspectral Data Processing: In this experiment, we consider m = 128 mixtures of n = 5 source images. The sources are drawn at random from a set of structured images shown in Fig. 5(a). For the spectra (i.e., the columns of A), we randomly generated sparse coefficient vectors \lambda_i (i = 1, ..., n) with independent Gaussian-distributed nonzero entries and then applied the inverse orthogonal wavelet transform to these sparse vectors to get the spectra. \Phi was chosen as the curvelet dictionary. Fig. 5(b) gives four typical noisy observed channels with SNR = 20 dB. The sources recovered using GMCA (Algorithm 2) and its hyperspectral extension (iteration of Section III-E) are shown, respectively, in Fig. 5(c) and (d). Visual inspection shows that GMCA is outperformed
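The alternating update (8) lends itself to a compact implementation. The following NumPy sketch performs one block-relaxation sweep; it is an illustration under our own conventions, not the authors' reference code. Here `C` denotes the precomputed coefficient matrix \Psi^T Y \Phi^T, `Lambda` stores the spectral coefficient vectors \lambda_i as columns, `Gamma` stores the spatial coefficient vectors \gamma_i as rows, and `mu` is the current value of the decreasing threshold.

```python
import numpy as np

def soft_threshold(x, t):
    """Entrywise soft-thresholding: sign(x) * max(|x| - t, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def gmca_hyperspectral_sweep(C, Lambda, Gamma, mu):
    """One alternate-minimization sweep for C ~ Lambda @ Gamma.

    C      : (T, P) coefficient matrix Psi^T Y Phi^T
    Lambda : (T, n) sparse spectral coefficients (columns lambda_i)
    Gamma  : (n, P) sparse spatial coefficients (rows gamma_i)
    mu     : current value of the decreasing threshold
    """
    # Update Lambda by least squares given Gamma (pinv for numerical safety),
    # then shrink each column i with threshold mu * ||gamma_i||_1 / ||gamma_i||_2^2.
    Lambda = C @ Gamma.T @ np.linalg.pinv(Gamma @ Gamma.T)
    scale = np.abs(Gamma).sum(axis=1) / np.maximum((Gamma ** 2).sum(axis=1), 1e-12)
    Lambda = soft_threshold(Lambda, mu * scale[None, :])

    # Update Gamma by least squares given the freshly updated Lambda
    # (Gauss-Seidel style), then shrink each row j with threshold
    # mu * ||lambda_j||_1 / ||lambda_j||_2^2.
    Gamma = np.linalg.pinv(Lambda.T @ Lambda) @ Lambda.T @ C
    scale = np.abs(Lambda).sum(axis=0) / np.maximum((Lambda ** 2).sum(axis=0), 1e-12)
    Gamma = soft_threshold(Gamma, mu * scale[:, None])
    return Lambda, Gamma
```

Repeating the sweep while decreasing `mu` toward the noise level mirrors the coarse-to-fine thresholding strategy used throughout GMCA.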

990 Proceedings of the IEEE | Vol. 98, No. 6, June 2010


Authorized licensed use limited to: CALIFORNIA INSTITUTE OF TECHNOLOGY. Downloaded on June 04,2010 at 22:14:32 UTC from IEEE Xplore. Restrictions apply.
Fadili et al.: Image Decomposition and Separation Using Sparse Representations

Fig. 3. Example of BSS with (a) two sources and (b) two noisy mixtures. (c) depicts the evolution of the mixing matrix criterion \Delta_A with input SNR [solid line: GMCA; dashed line: JADE; (+): RNA; (*): EFICA].

Fig. 4. (a) Original RGB image with additive Gaussian noise (SNR = 15 dB). (b) Wavelet-based denoising in the RGB space. (c) Wavelet-based denoising in the "GMCA color space."
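In outline, the GMCA color-denoising pipeline amounts to three steps: back-project the channels through the estimated mixing matrix, denoise each estimated source independently, and mix back. A minimal sketch follows; the function names are ours, and any per-plane denoiser (such as the undecimated wavelet thresholding used in the paper) can be plugged in as `denoise_channel`.

```python
import numpy as np

def denoise_in_estimated_space(Y, A, denoise_channel):
    """Denoise multichannel data Y (m, H, W) in the source space of A.

    A is the (m, n) mixing matrix estimated by GMCA (assumed square and
    invertible here, as for the 3x3 color-space conversion matrix).
    denoise_channel is any per-plane denoiser applied to each source.
    """
    m, H, W = Y.shape
    # Back-project the channels onto the estimated sources: S = A^+ Y.
    S = np.linalg.pinv(A) @ Y.reshape(m, -1)
    n = S.shape[0]
    S = S.reshape(n, H, W)
    # Denoise each estimated source independently.
    S = np.stack([denoise_channel(s) for s in S])
    # Return to the original (RGB) space through the mixing matrix.
    return (A @ S.reshape(n, -1)).reshape(m, H, W)
```

The design point is that the transform is data-adaptive: only `A` changes between denoising in a fixed color space (YUV, YCC) and denoising in the "GMCA color space."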

by hyperspectral GMCA, which better accounts for both spatial and spectral sparsity.

IV. REPRODUCIBLE RESEARCH SOFTWARE

Following the philosophy of reproducible research, two toolboxes, MCALab and GMCALab [22], are made available freely for download.1 MCALab and GMCALab have been developed to demonstrate key concepts of MCA and GMCA and make them available to interested researchers and technologists. These toolboxes are libraries of MATLAB routines that implement the decomposition

1 http://www.morphologicaldiversity.org.


Fig. 5. (a) Images used as sources. (b) Four noisy mixtures out of m = 128 (SNR = 20 dB). (c) Recovered sources using GMCA. (d) Recovered sources using hyperspectral GMCA.
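The synthetic spectra of the hyperspectral experiment can be reproduced in spirit: draw k-sparse coefficient vectors with Gaussian nonzero entries and map them back through an inverse orthogonal wavelet transform. In this sketch an orthonormal Haar transform stands in for the unspecified wavelet of the paper, the channel count m is assumed to be a power of two, and the function names are ours.

```python
import numpy as np

def inverse_haar_1d(coeffs):
    """Full inverse orthonormal Haar transform of a length-2^k vector.

    coeffs = [coarsest approximation, details from coarsest to finest].
    """
    x = coeffs[:1].astype(float)
    pos = 1
    while pos < coeffs.size:
        d = coeffs[pos:pos + x.size]
        # One inverse Haar step: interleave (a + d)/sqrt(2) and (a - d)/sqrt(2).
        up = np.empty(2 * x.size)
        up[0::2] = (x + d) / np.sqrt(2)
        up[1::2] = (x - d) / np.sqrt(2)
        x, pos = up, pos + d.size
    return x

def random_sparse_spectra(m, n, k, rng):
    """Draw n spectra of length m: k-sparse Gaussian wavelet coefficients
    mapped back to the signal domain by the inverse Haar transform."""
    A = np.empty((m, n))
    for i in range(n):
        lam = np.zeros(m)
        support = rng.choice(m, size=k, replace=False)
        lam[support] = rng.standard_normal(k)
        A[:, i] = inverse_haar_1d(lam)
    return A
```

Because the transform is orthonormal, the spectra inherit the energy of their sparse coefficient vectors, and sparsity in the wavelet domain yields the smooth, structured spectra the experiment calls for.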

and source separation algorithms overviewed in this paper. They contain a variety of scripts to reproduce the figures in our own papers, as well as other exploratory examples not included in the papers.

V. CONCLUSION

In this paper, we gave an overview of how sparsity and morphological diversity can be used advantageously to regularize image decomposition and blind source separation problems. We also reported several numerical experiments to illustrate the wide applicability of the algorithms described. We believe that this is an exciting field where many interesting problems are still open. Among them, we may cite, for instance, the theoretical guarantees of the sparsity-regularized BSS problem and sharper theoretical guarantees for the decomposition problem by exploiting geometry.


REFERENCES

[1] J.-L. Starck, M. Elad, and D. Donoho, "Image decomposition via the combination of sparse representations and a variational approach," IEEE Trans. Image Process., vol. 14, no. 10, pp. 1570-1582, 2005.
[2] J.-L. Starck, M. Elad, and D. Donoho, "Redundant multiscale transforms and their application for morphological component analysis," in Advances in Imaging and Electron Physics, vol. 132, P. Hawkes, Ed. New York: Academic/Elsevier, 2004.
[3] J. Bobin, J.-L. Starck, M. J. Fadili, and Y. Moudden, "Sparsity and morphological diversity in blind source separation," IEEE Trans. Image Process., vol. 16, pp. 2662-2674, Nov. 2007.
[4] J. Bobin, J.-L. Starck, Y. Moudden, and M. J. Fadili, "Blind source separation: The sparsity revolution," in Advances in Imaging and Electron Physics, P. Hawkes, Ed. New York: Academic/Elsevier, 2008, pp. 221-298.
[5] Y. Meyer, "Oscillating patterns in image processing and in some nonlinear evolution equations," in 15th Dean Jacqueline B. Lewis Memorial Lectures, 2001.
[6] L. A. Vese and S. J. Osher, "Modeling textures with total variation minimization and oscillating patterns in image processing," J. Sci. Comput., vol. 19, no. 1-3, pp. 553-572, 2003.
[7] M. Zibulevsky and B. Pearlmutter, "Blind source separation by sparse decomposition," Neural Comput., vol. 13, no. 4, 2001.
[8] P. Bofill and M. Zibulevsky, "Underdetermined blind source separation using sparse representations," Signal Process., vol. 81, no. 11, pp. 2353-2362, 2001.
[9] J. Bobin, Y. Moudden, M. J. Fadili, and J.-L. Starck, "Morphological diversity and sparsity for multichannel data restoration," J. Math. Imag. Vision, vol. 10, no. 2, pp. 149-168, 2009.
[10] A. M. Bruckstein, D. L. Donoho, and M. Elad, "From sparse solutions of systems of equations to sparse modeling of signals and images," SIAM Rev., vol. 51, no. 1, pp. 34-81, 2009.
[11] L. Rudin, S. Osher, and E. Fatemi, "Nonlinear total variation based noise removal algorithms," Phys. D, vol. 60, pp. 259-268, 1992.
[12] S. Mallat and Z. Zhang, "Matching pursuits with time-frequency dictionaries," IEEE Trans. Signal Process., vol. 41, no. 12, pp. 3397-3415, 1993.
[13] S. Sardy, A. Bruce, and P. Tseng, "Block coordinate relaxation methods for nonparametric wavelet denoising," J. Comput. Graph. Statist., vol. 9, no. 2, pp. 361-379, 2000.
[14] J. Bobin, J.-L. Starck, M. J. Fadili, Y. Moudden, and D. L. Donoho, "Morphological component analysis: An adaptive thresholding strategy," IEEE Trans. Image Process., vol. 16, pp. 2675-2681, Nov. 2007.
[15] S. Mallat, A Wavelet Tour of Signal Processing, 2nd ed. New York: Academic, 1998.
[16] J.-L. Starck, F. Murtagh, and A. Bijaoui, Image Processing and Data Analysis: The Multiscale Approach. Cambridge, U.K.: Cambridge Univ. Press, 1998.
[17] E. Candès, L. Demanet, D. Donoho, and L. Ying, "Fast discrete curvelet transforms," SIAM Multiscale Model. Simul., vol. 5, pp. 861-899, 2005.
[18] E. Candès and D. Donoho, "Ridgelets: The key to high dimensional intermittency?" Phil. Trans. Royal Soc. London A, vol. 357, pp. 2495-2509, 1999.
[19] L. Demanet and L. Ying, "Wave atoms and sparsity of oscillatory patterns," Appl. Comput. Harmon. Anal., vol. 23, no. 3, pp. 368-387, 2007.
[20] F. G. Meyer and R. R. Coifman, "Brushlets: A tool for directional image analysis and image compression," Appl. Comput. Harmon. Anal., vol. 5, pp. 147-187, 1997.
[21] G. Peyré, M. J. Fadili, and J.-L. Starck, "Learning adapted dictionaries for geometry and texture separation," in Proc. SPIE Conf. Wavelets XII, San Diego, CA, Aug. 2007.
[22] M. J. Fadili, J.-L. Starck, M. Elad, and D. L. Donoho, "MCALab: Reproducible research in signal and image decomposition and inpainting," IEEE Comput. Sci. Eng., to be published.
[23] A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis. New York: Wiley, 2001.
[24] P. G. Georgiev, F. Theis, and A. Cichocki, "Sparse component analysis and blind source separation of underdetermined mixtures," IEEE Trans. Neural Netw., vol. 16, no. 4, pp. 992-996, 2005.
[25] J. Bobin, Y. Moudden, J.-L. Starck, and M. Elad, "Morphological diversity and source separation," IEEE Signal Process. Lett., vol. 13, no. 7, pp. 409-412, 2006.
[26] M. Kowalski, E. Vincent, and R. Gribonval, "Under-determined source separation via mixed-norm regularized minimization," in Proc. EUSIPCO'08, Lausanne, Switzerland, Aug. 2008.
[27] R. Balan, "Estimator for number of sources using minimum description length criterion for blind sparse source mixtures," in Independent Component Analysis and Signal Separation, vol. 4666, M. E. Davies, C. J. James, S. A. Abdallah, and M. D. Plumbley, Eds. New York: Springer, 2007, pp. 333-340.
[28] Y. Moudden, J. Bobin, J.-L. Starck, and M. J. Fadili, "Dictionary learning with spatio-spectral sparsity constraints," in Proc. SPARS'09, St. Malo, France, Apr. 2009.
[29] R. Gribonval and M. Nielsen, "Beyond sparsity: Recovering structured representations by l1-minimization and greedy algorithms: Application to the analysis of sparse underdetermined ICA," Adv. Comput. Math., vol. 28, no. 1, pp. 23-41, 2008.
[30] J.-F. Cardoso, "Blind signal separation: Statistical principles," Proc. IEEE (Special Issue on Blind Identification and Estimation), vol. 86, no. 10, pp. 2009-2025, Oct. 1998.
[31] M. Zibulevsky, "Blind source separation with relative Newton method," in Proc. ICA2003, 2003, pp. 897-902.
[32] Z. Koldovský, P. Tichavský, and E. Oja, "Efficient variant of algorithm FastICA for independent component analysis attaining the Cramér-Rao lower bound," IEEE Trans. Neural Netw., vol. 17, pp. 1265-1277, 2006.

ABOUT THE AUTHORS


M. Jalal Fadili graduated from the Ecole Nationale Supérieure d'Ingénieurs (ENSI) de Caen, Caen, France. He received the M.Sc. and Ph.D. degrees in signal and image processing from the University of Caen. He was a Research Associate with the University of Cambridge, Cambridge, U.K., from 1999 to 2000, where he was a MacDonnel-Pew Fellow. He has been an Associate Professor of signal and image processing since September 2001 at ENSI. He was a Visitor at several universities (QUT-Australia, Stanford University, California Institute of Technology, EPFL). His research interests include statistical approaches in signal and image processing, inverse problems, computational harmonic analysis, optimization theory, and sparse representations. His areas of application include medical and astronomical imaging.

Jean-Luc Starck received the Ph.D. degree from the University Nice-Sophia Antipolis, France, and the Habilitation degree from the University Paris XI, Orsay, France. He was a Visitor with the European Southern Observatory (ESO) in 1993, the University of California, Los Angeles, in 2004, and the Statistics Department, Stanford University, in 2000 and 2005. He has been a Researcher with CEA, France, since 1994. His research interests include image processing, statistical methods in astrophysics, and cosmology. He is an expert in multiscale methods such as wavelets and curvelets. He is a Leader of the Multiresolution project at CEA and is a core Team Member of the PLANCK ESA project. He has published more than 100 papers in different areas in scientific journals. He is also author of Image Processing and Data Analysis: The Multiscale Approach (Cambridge, U.K.: Cambridge University Press, 1998) and Astronomical Image and Data Analysis, 2nd ed. (New York: Springer, 2006).


Jérôme Bobin graduated from the Ecole Normale Supérieure (ENS) de Cachan, France, in 2005. He received the M.Sc. degree in signal and image processing from ENS de Cachan and University Paris XI, Orsay, France, and the Ph.D. degree from Paris XI in 2008. He received the Agrégation de Physique degree in 2004. He is now a Postdoctoral Researcher with ACM, California Institute of Technology, Pasadena, CA. His research interests include statistics, information theory, multiscale methods, and sparse representations in signal and image processing.

Yassir Moudden graduated in electrical engineering from SUPELEC, Gif-sur-Yvette, France. He received the M.S. degree in physics from University Paris VII, France, in 1997 and the Ph.D. degree in signal processing from the University Paris XI, Orsay, France. He was a Visitor at the University of California, Los Angeles, in 2004 and is currently with CEA, France, working on applications of signal processing to astronomy. His research interests include signal and image processing, data analysis, statistics, and information theory.

