Nonnegative Least-Correlated Component Analysis For Separation of Dependent Sources by Volume Maximization
Abstract—Although significant efforts have been made in developing nonnegative blind source separation techniques, accurate
separation of positive yet dependent sources remains a challenging task. In this paper, a joint correlation function of multiple signals is
proposed to reveal and confirm that the observations after nonnegative mixing would have higher joint correlation than the original
unknown sources. Accordingly, a new nonnegative least-correlated component analysis (nLCA) method is proposed to design the
unmixing matrix by minimizing the joint correlation function among the estimated nonnegative sources. In addition to a closed-form
solution for unmixing two mixtures of two sources, the general algorithm of nLCA for the multisource case is developed based on an
iterative volume maximization (IVM) principle and linear programming. The source identifiability and required conditions are discussed
and proven. The proposed nLCA algorithm, denoted by nLCA-IVM, is evaluated with both simulation data and real biomedical data to
demonstrate its superior performance over several existing benchmark methods.
Index Terms—Nonnegative blind source separation, nonnegative least-correlated component analysis, dependent sources, joint
correlation function of multiple signals, iterative volume maximization.
1 INTRODUCTION
[19]. Possible circumstances under which the NMF yields unique decomposition can also be found in [20].

The nBSS based on the pure-source sample assumption (where a pure-source sample means a data point that is fully dominated by only one source) has been investigated for separation of dependent sources in biomedical imaging [21], [22], [23] and remote sensing applications [24], [25], [26]. In [27], a neural network algorithm, searching for the edges of the data scatter plot, exploits the existence of pure-source samples to iteratively update the unmixing matrix such that the sum of all the off-diagonal entries of the angular proximity matrix (analogous to the cross-correlation matrix) of the extracted edges is minimum. A recently reported nBSS method, convex analysis of mixtures of nonnegative sources (CAMNS) [22], based on the pure-source sample assumption, has been theoretically proven to achieve perfect separation by searching for all the extreme points of an observation-constructed polyhedral set. In remote sensing, many spectral unmixing algorithms [24], [25], [26], also based on the pure-source sample assumption, try to search for the purest observed pixels from the spectral data set and then identify the mixing matrix. The use of such spectral unmixing methods is usually followed by an inversion process, such as the nonnegative least-squares method, to recover the sources. Similar ideas for finding the mixing matrix can be found in our recently reported nBSS method, namely nLCA [23], which estimates the mixing matrix by an algebraic edge search algorithm. Herein, we rename the nLCA reported in [23] nLCA-edge search (nLCA-ES) so as to distinguish it from the one to be presented in this paper.

In this paper, we propose a joint correlation function of multiple signals for the design of the demixing matrix. The joint correlation function of the observations mixed by nonnegative combinations of the sources would be higher than that of the sources. Based on this idea and the nonnegativity constraint of the sources, a novel nLCA is proposed for which the nBSS problem is formulated into a problem of minimizing the joint correlation function of all the demixed sources under the nonnegativity constraint. A closed-form solution for the case of two sources with two observations is provided. For the general case of multiple sources, the proposed nLCA can be fulfilled by an iterative volume maximization algorithm (nLCA-IVM) using linear programming (LP), which finds the optimal demixing matrix by maximizing the volume of a solid region formed by the demixed source vectors regardless of whether the pure-source sample assumption is valid or not. However, the source identifiability of the nLCA can be proven under the existence of pure-source samples. In contrast to other spectral unmixing methods, the proposed nLCA-IVM, which is never a pure-source sample criterion-based nBSS algorithm, is able to separate all the (dependent or independent) sources from the given mixtures without involving any inversion process. Comparative experimental results using synthetic data and real biomedical data are presented to demonstrate the superior performance of nLCA-IVM over several existing benchmark methods.

Section 2 presents the nBSS problem formulation and the model assumptions. In Section 3, the novel nLCA is presented, followed by the closed-form solution of the optimum demixing matrix for the two-source case, the nLCA-IVM algorithm, and the source identifiability of the nLCA under the existence of pure-source samples. For real-world applications, where the number of mixtures is larger than the number of sources, Section 4 presents noise reduction and rank reduction prior to the nBSS processing using the nLCA-IVM algorithm. In Section 5, simulation results are presented with synthetic mixed human face images, infrared spectral signals, and dual-energy X-ray images to evaluate the performance of the proposed nLCA-IVM algorithm. Then a real data analysis of fluorescence microscopy imaging and that of dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) using the proposed nLCA-IVM algorithm are presented to show its effectiveness. Finally, some conclusions are drawn in Section 6.

2 PROBLEM FORMULATION

For ease of later use, let us define the following notations:

R, R^N, R^{M×N}         Set of real numbers, N-dimensional vectors, M×N matrices
R_+, R_+^N, R_+^{M×N}   Set of nonnegative real numbers, N-dimensional vectors, M×N matrices
1_N                     N×1 vector with all the entries equal to unity
I_N                     N×N identity matrix
e_i                     Unit vector of proper dimension with the ith entry equal to unity
0                       Zero vector of proper dimension
||·||                   Euclidean norm of a vector or Frobenius norm of a matrix
⪰                       Componentwise inequality
diag{λ_1, ..., λ_N}     N×N diagonal matrix with diagonal entries λ_1, ..., λ_N

Consider the following M×N linear mixing system:

    x[n] = A s[n],  n = 1, 2, ..., L,                                (1)

where x[n] = (x_1[n], x_2[n], ..., x_M[n])^T is the nth data point of the given M observations (or mixtures), A = [a_ij]_{M×N} ∈ R^{M×N} is an unknown mixing matrix, s[n] = (s_1[n], s_2[n], ..., s_N[n])^T is the nth source point consisting of N sources, and L > max{M, N} is the data length (i.e., the number of pixels in each observation and each source).

Our goal herein is to design a real demixing matrix W = [w_ij]_{N×M} such that

    y[n] = W x[n] = (WA) s[n] = P s[n],                              (2)

where y[n] = (y_1[n], y_2[n], ..., y_N[n])^T is the extracted (unmixed or demixed) source point and P = WA ∈ R^{N×N} is a permutation matrix, meaning that the extracted source point y[n] is equivalent to the true source point s[n] up to a permutation.

For ease of the ensuing derivations, we let

    x_i = (x_i[1], x_i[2], ..., x_i[L])^T   (ith observation),

where i = 1, 2, ..., M, and

    s_i = (s_i[1], s_i[2], ..., s_i[L])^T   (ith source),
    y_i = (y_i[1], y_i[2], ..., y_i[L])^T   (ith extracted source),
Authorized licensed use limited to: National Tsing Hua University. Downloaded on March 25,2010 at 01:50:03 EDT from IEEE Xplore. Restrictions apply.
WANG ET AL.: NONNEGATIVE LEAST-CORRELATED COMPONENT ANALYSIS FOR SEPARATION OF DEPENDENT SOURCES BY VOLUME... 877
so that

    x_i = Σ_{j=1}^{N} a_ij s_j,  i = 1, 2, ..., M,                   (3)

    y_i = Σ_{j=1}^{M} w_ij x_j = s_{t_i},  i = 1, 2, ..., N,         (4)

where t_i ∈ {1, ..., N} and t_i ≠ t_j for i ≠ j.

Fig. 1. Geometric illustrations of Theorem 1 for (a) M = N = 3 and (b) M = N = 2, respectively.

To be practical in imaging applications, the nLCA to be presented is based on the following general assumptions:

. (A1) (nonnegative sources): For each i, s_i ∈ R_+^L.
. (A2) (nonnegative mixing matrix): A ∈ R_+^{M×N}.
. (A3) M ≥ N and A is of full column rank, i.e., rank(A) = N.
. (A4) (unit row sum): Each row sum of A equals unity, i.e., A 1_N = 1_M.

Second, the affine hull of {z_1, z_2, ..., z_N} is defined as [30]

    aff{z_1, z_2, ..., z_N} = { z | z = Σ_{i=1}^{N} θ_i z_i, Σ_{i=1}^{N} θ_i = 1, θ_i ∈ R }.
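Under (A1) to (A4), the mixing model (1) is straightforward to simulate. The following minimal sketch (with hypothetical random sources and a hypothetical mixing matrix, not data from this paper) verifies that x[n] = A s[n] holds per data point and that the row sums of A equal unity:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, L = 3, 4, 500                      # sources, mixtures, data length

# Hypothetical nonnegative sources (A1): columns of S are s_1, ..., s_N.
S = rng.random((L, N))

# Nonnegative mixing matrix (A2) with full column rank (A3, generic here)
# and unit row sums (A4): A @ 1_N == 1_M.
A = rng.random((M, N))
A /= A.sum(axis=1, keepdims=True)

# Model (1): x[n] = A s[n] for every data point n; in matrix form X = S A^T.
X = S @ A.T

n = 7
assert np.allclose(X[n], A @ S[n])       # x[n] = A s[n]
assert np.allclose(A @ np.ones(N), np.ones(M))
```

Note that the nonnegativity of S and A makes every observation X[:, i] a nonnegative combination of the sources, which is the setting the joint correlation function below exploits.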
878 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 32, NO. 5, MAY 2010
    corr(x_1, x_2, ..., x_M) ≜ 1 / V(sol{x_1, x_2, ..., x_M}),       (8)

where

    V(sol{x_1, x_2, ..., x_M}) = ∫_{sol{x_1, x_2, ..., x_M}} dv      (9)

is the volume of sol{x_1, x_2, ..., x_M} and ∫_{sol{x_1, x_2, ..., x_M}} dv is the multiple integral over sol{x_1, x_2, ..., x_M}. The proposed joint correlation function of multiple signals also reflects the same behavior as the conventional pairwise correlation coefficient. Specifically, as shown in Fig. 1b, the larger the value of corr(x_1, x_2), the smaller the area of sol{x_1, x_2}. Since the distance from the origin to the hyperplane passing through x_1 and x_2 (i.e., aff(x_1, x_2)) is fixed, the smaller the area of sol{x_1, x_2}, the smaller the angle between x_1 and x_2. Hence, it can be inferred that the larger the value of corr(x_1, x_2), the larger the value of the pairwise correlation coefficient of x_1 and x_2.

Based on Theorem 1 and (9), we have V(sol{x_1, x_2, ..., x_M}) ≤ V(sol{s_1, s_2, ..., s_N}), which implies the following corollary:

Corollary 1. Suppose that (A2) and (A4) hold. Then,

    corr(s_1, s_2, ..., s_N) ≤ corr(x_1, x_2, ..., x_M).

Note that corr(x_1, x_2, ..., x_M) ≤ corr(y_1, y_2, ..., y_N) if the demixing matrix satisfies (A2) (i.e., W ∈ R_+^{N×M}) and (A4) (i.e., W 1_M = 1_N) by Corollary 1. However, W in the demixing matrix design to be presented below is a real N×M matrix, not necessarily in R_+^{N×M}, such that corr(y_1, y_2, ..., y_N) ≤ corr(x_1, x_2, ..., x_M). If W ∈ R^{N×M} can be properly designed, corr(y_1, y_2, ..., y_N) = corr(s_1, s_2, ..., s_N) can be achieved.

Corollary 1 is profound, which motivates the idea of designing the demixing matrix W ∈ R^{N×M} by minimizing the joint correlation function of the N extracted sources y_1, y_2, ..., y_N, that is,

    min_{W ∈ R^{N×M}} corr{y_1, y_2, ..., y_N},                      (10)

subject to (s.t.)

    y_i = Σ_{j=1}^{M} w_ij x_j ⪰ 0,  i = 1, 2, ..., N  (due to (A1)),   (11)

    W 1_M = W A 1_N = P 1_N = 1_N  (due to (A4)).                    (12)

From (8), the nBSS problem (10) subject to the two constraints given by (11) and (12) can be reformulated as

    max_{W ∈ R^{N×M}} V(sol{y_1, y_2, ..., y_N})
    s.t.  W 1_M = 1_N,  y_i = Σ_{j=1}^{M} w_ij x_j ⪰ 0, ∀i,          (13)

which is a nonlinear and nonconvex optimization problem, implying that finding a closed-form solution of (13) for any M and N is almost formidable.

Fig. 2. Geometric illustration of the observations x_1 and x_2 and the unmixed signals y_1 and y_2 for M = N = 2.

Next, let us present how to solve (13) for the case of M = N = 2 and the case of M = N ≥ 2, respectively, followed by the associated source identifiability.

3.1 Case of M = N = 2: Closed-Form Solution

Although problem (13) is a nonconvex problem, fortunately we can find a closed-form solution for the case of M = N = 2, as stated in the following proposition:

Proposition 1. Under (A1) to (A4), problem (13) for M = N = 2 has the closed-form solution W*, where

    w*_11 = max_n { −x_2[n] / (x_1[n] − x_2[n]) | x_1[n] > x_2[n] },   (14a)

    w*_21 = min_n { −x_2[n] / (x_1[n] − x_2[n]) | x_1[n] < x_2[n] },   (14b)

    w*_12 = 1 − w*_11,  w*_22 = 1 − w*_21.                           (14c)

Proof. By (11), (12), and M = N = 2,

    y_i = w_i1 x_1 + w_i2 x_2 = w_i1 (x_1 − x_2) + x_2,  i = 1, 2.   (15)

Thus, maximizing V(sol{y_1, y_2}) (i.e., the shaded area in Fig. 2) is equivalent to increasing |w_11| and |w_21|, where w_11 ≤ 0 and w_21 ≥ 0. Therefore, problem (13) can be decomposed into the following two subproblems:

    w*_11 = min_w w   s.t.  w ≤ 0,  w (x_1 − x_2) + x_2 ⪰ 0,         (16a)

    w*_21 = max_w w   s.t.  w ≥ 0,  w (x_1 − x_2) + x_2 ⪰ 0.         (16b)

Note that the inequality w (x_1 − x_2) + x_2 ⪰ 0 is equivalent to w (x_1[n] − x_2[n]) + x_2[n] ≥ 0 for n = 1, 2, ..., L. Therefore, the constraint w (x_1 − x_2) + x_2 ⪰ 0 itself implies

    w ≥ −x_2[n] / (x_1[n] − x_2[n])  for x_1[n] > x_2[n],            (17)

    w ≤ −x_2[n] / (x_1[n] − x_2[n])  for x_1[n] < x_2[n],            (18)

which further lead to

    max_n { −x_2[n] / (x_1[n] − x_2[n]) | x_1[n] > x_2[n] } ≤ w      (19)
and

    w ≤ min_n { −x_2[n] / (x_1[n] − x_2[n]) | x_1[n] < x_2[n] }.     (20)

One can easily see from (A1) and (A2) that x_i[n] ≥ 0 for all i and n. As a result, the optimal solutions of (16a) and (16b) are, respectively, given by

    w*_11 = max_n { −x_2[n] / (x_1[n] − x_2[n]) | x_1[n] > x_2[n] } ≤ 0,   (21)

    w*_21 = min_n { −x_2[n] / (x_1[n] − x_2[n]) | x_1[n] < x_2[n] } ≥ 0.   (22)

Then, one can obtain w*_12 = 1 − w*_11 and w*_22 = 1 − w*_21. Thus, the proposition has been proved. □

TABLE 1
The nLCA-IVM Algorithm

It is worth mentioning that the optimum solution W* given by (14) for M = N = 2 is exactly the same as the one obtained by the nLCA-ES [23]. For a more general case, where M = N ≥ 2, we resort to the nLCA-IVM algorithm, to be presented next.
    s.t.  w_i^T 1_N = 1,  y_i = Σ_{j=1}^{N} w_ij x_j ⪰ 0.

Note that the objective function in (26) is still nonconvex. Fortunately, the maximization problem (26) can be solved in a globally optimal manner by breaking it into two LPs:

    p* = max_{w_i} Σ_{j=1}^{N} (−1)^{i+j} w_ij det(W_ij)
         s.t.  w_i^T 1_N = 1,  y_i = Σ_{j=1}^{N} w_ij x_j ⪰ 0,       (27a)

where W_ij denotes the submatrix of W with its ith row and jth column removed. The feasible set of these LPs can be expressed as

    F = { w ∈ R^N | w^T 1_N = 1, w^T v ≥ 0, ∀ v ∈ conv{x[1], x[2], ..., x[L]} }.   (30)

The convex hull of all the data points is given by

    conv{x[1], x[2], ..., x[L]} = { Σ_{i=1}^{r} θ_i x[l_i] | Σ_{i=1}^{r} θ_i = 1, θ_i ≥ 0, ∀i },   (31)
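One row update of the kind described around (27a) can be sketched as follows, under two assumptions: the LP objective is taken to be the cofactor expansion of det(W) along the updated row, which is linear in that row, and scipy's `ConvexHull` (a Qhull implementation of quickhull, in the spirit of [34]) together with `linprog` stand in for the extreme-point search and the LP solver:

```python
import numpy as np
from scipy.optimize import linprog
from scipy.spatial import ConvexHull

def extreme_points(X):
    """Extreme points x[l_1], ..., x[l_r] of conv{x[1], ..., x[L]} (cf. (31)),
    found with scipy's Qhull wrapper.  X is N x L with data points as columns."""
    hull = ConvexHull(X.T)
    return X[:, hull.vertices]               # N x r, with r <= L

def update_row(W, i, Xr):
    """One nLCA-IVM row update (a sketch of the two-LP step).

    det(W) = sum_j (-1)^(i+j) w_ij det(W_ij) is linear in row i, so
    maximizing |det(W)| over the reduced constraint set reduces to one
    maximizing and one minimizing LP; the better of the two is kept.
    """
    N, r = Xr.shape
    Wdel = np.delete(W, i, axis=0)
    c = np.array([(-1.0) ** (i + j) * np.linalg.det(np.delete(Wdel, j, axis=1))
                  for j in range(N)])
    best = None
    for sign in (1.0, -1.0):                 # minimize c^T w and -c^T w
        res = linprog(sign * c,
                      A_ub=-Xr.T, b_ub=np.zeros(r),      # w^T x[l_k] >= 0
                      A_eq=np.ones((1, N)), b_eq=[1.0],  # w^T 1_N = 1
                      bounds=[(None, None)] * N)
        if res.success and (best is None or abs(c @ res.x) > abs(c @ best)):
            best = res.x
    W = W.copy()
    W[i] = best
    return W

# Demo on a hypothetical 2x2 example with pure-source samples:
S = np.array([[1.0, 0.0, 0.5, 0.2], [0.0, 1.0, 0.5, 0.3]])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
X = A @ S
Xr = extreme_points(X)                       # hull pruning keeps r <= L constraints
W = np.eye(2)
for i in range(2):
    W = update_row(W, i, Xr)
```

On this data, two row updates starting from W = I already yield W A equal to a permutation matrix, matching the closed-form result of Proposition 1.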
TABLE 2
Complexity Order Comparison of Various BSS Methods

In the table, N denotes the number of sources, L denotes the data length, and r ≤ L denotes the number of extreme points of the convex hull (31); the complexity orders of the iterative methods also involve the number of iterations.
where x[l_1], ..., x[l_r] are the extreme points of the convex hull and r ≤ L is the total number of the extreme points. Then, by (31) and (30), it follows that

    F = { w ∈ R^N | w^T 1_N = 1, Σ_{i=1}^{r} θ_i w^T x[l_i] ≥ 0, Σ_{i=1}^{r} θ_i = 1, θ_i ≥ 0, ∀i }   (32)

      = { w ∈ R^N | w^T 1_N = 1, w^T x[l_i] ≥ 0, i = 1, ..., r },    (33)

where x[l_1], ..., x[l_r] can be identified by using the quickhull algorithm [34], a well-known extreme point search method. Suppose that r extreme points, or equivalently the resultant r inequality constraints in (33), are found; the computational complexity of the nLCA-IVM then becomes O(N^2 r) instead of O(N^2 L) in each iteration. Due to the use of the quickhull algorithm, the computational efficiency improvement of the proposed nLCA-IVM can be measured by the constraint reduction ratio, defined as

    (L − r) / L.                                                     (34)

Surely, the larger this ratio, the less the computational complexity in solving (24).

The complexity orders of several BSS methods, namely nLCA-ES [23], CAMNS [22], nICA [13], FastICA [12], NMF [14], quasi-Newton NMF (QNMF) [18], and sparse NMF (sNMF) [17], can be easily verified for the case of M = N; the complexity order comparison for the above-mentioned BSS methods is listed in Table 2. Obviously, the proposed nLCA-IVM has a lower complexity order than nICA, FastICA, NMF, QNMF, and sNMF.

A geometrical illustration of how the proposed nLCA-IVM method works for the case of M = N = 3 is shown in Fig. 3. For each row vector update of W, the unmixed source, say y_1, is moved to the boundary of the feasible set, while fixing y_2 and y_3, such that V(sol{y_1, y_2, y_3}) increases maximally (since the LP always yields the optimal y_1 on the boundary of the feasible set). By repeating the procedure, the three sources will be literally identified.

Fig. 3. Geometric interpretation of nLCA-IVM for M = N = 3, where y_1 is updated with y_2 and y_3 fixed, such that V(sol{y_1, y_2, y_3}) increases maximally.

Similarly, for the M = N = 2 case as shown in Fig. 2, the nLCA-IVM algorithm involves only two row vector updates, implying that the globally optimal solution for W can be obtained in only one iteration (in addition to one iteration for the convergence check). Thus, we can conclude that the proposed nLCA-IVM algorithm is able to provide the globally optimal solution for M = N = 2 as given in Proposition 1 above, but there is no guarantee of global optimality for M = N > 2.

3.3 Source Identifiability of the nLCA

We address here the source identifiability problem of the nLCA concerning certain conditions (assumptions) under which the N extracted sources are identical to all of the N true sources. In many biomedical imaging [8], [21] and remote sensing applications [24], [29], the sources meet the assumption that pure-source samples exist:

. (A5) There exists at least one index set {l_1, l_2, ..., l_N} such that s[l_i] = s_i[l_i] e_i, i = 1, 2, ..., N.

Note that the corresponding data points x[n] for n ∈ {l_1, l_2, ..., l_N} become

    x[l_i] = a_i s_i[l_i],                                           (35)

where a_i is the ith column vector of A. Namely, the data point x[l_i] is only contributed from source s_i. For investigating the source identifiability of the nLCA, the following lemma is needed:

Lemma 1. Suppose that (A2), (A3), and (A4) hold and M = N. Then, |det(A)| ≤ 1, and the equality holds if and only if A is a permutation matrix.

The proof of Lemma 1 is given in Appendix B. Under the existence of pure-source samples, the source identifiability of the proposed nLCA is established in the following theorem:

Theorem 2 (Source identifiability of nLCA). Suppose that (A1)-(A5) hold. Then,

    W* A = P,                                                        (36)

where W* is the optimal solution of problem (24) and P is an N×N permutation matrix.

Proof. Substituting W* into the constraints of problem (24) yields

    W* 1_N = 1_N,                                                    (37a)

    W* x[n] ⪰ 0,  n = 1, 2, ..., L.                                  (37b)
By (1), (37b) can also be expressed as

    W* Σ_{i=1}^{N} s_i[n] a_i = Σ_{i=1}^{N} s_i[n] (W* a_i) ⪰ 0,  ∀n,   (38)

which can be further reduced to the following N inequalities:

    W* a_i ⪰ 0,  i = 1, 2, ..., N  (by (A5)).                        (39)

By letting B = W* A, it can be inferred from (37a), (A4), and (39) that the feasible set of B is given by

    B = { B | B 1_N = 1_N, B ∈ R_+^{N×N} }.

Now, we prove this theorem by contradiction. Suppose that B ∈ B but B ∉ P_N, where P_N denotes the set of N×N permutation matrices. Then one can infer, by Lemma 1, that

    1 > |det(B)| = |det(W* A)| = |det(W*)| |det(A)|.                 (40)

Furthermore, it can be inferred from (40) that

    |det(W*)| < 1 / |det(A)| = |det(A^{−1})| = |det(W_0)|,           (41)

where W_0 = A^{−1}. It can be easily shown that W_0 satisfies (37a) and (39). Therefore, (41) contradicts the optimality of (24). Thus, the theorem has been proved. □

Though the source identifiability is proven under (A5), we will show with experimental results (Section 5) that the proposed nLCA-IVM is still able to achieve accurate source estimation even if (A5) is not perfectly satisfied. Because the proposed nLCA-IVM above only considers the case of M = N without additive noise contamination, we next present the noise suppression and dimension reduction methods for the case of M > N to obtain the best N nonnegative rank- and noise-reduced observations prior to employing the proposed nLCA-IVM.

4 RANK REDUCTION AND NOISE SUPPRESSION BY PRINCIPAL COMPONENT ANALYSIS

As M > N, the rank-reducing approximation, the so-called principal component analysis (PCA) [35], can provide M approximations (of rank equal to N) to the given M observations. Then N rank-reduced approximated observations with the smallest approximation errors are considered to be processed by the proposed nLCA-IVM.

Now let us consider the noisy observations

    x_i = Σ_{j=1}^{N} a_ij s_j + ε_i,  i = 1, 2, ..., M,             (42)

where ε_i ∈ R^L is the additive noise vector. Another matrix expression of (42) is given by

    X = S A^T + E,                                                   (43)

where X = (x_1, x_2, ..., x_M) ∈ R^{L×M} is the observation matrix, S = (s_1, s_2, ..., s_N) ∈ R^{L×N} is the source matrix, and E = (ε_1, ε_2, ..., ε_M) ∈ R^{L×M} is the noise matrix.

It is generally true that rank(S) = N because the sources s_1, s_2, ..., s_N are linearly independent. It can be easily shown by the rank inequality of the product of matrices [36] that rank(S A^T) = N < M, which implies N ≤ rank(X) ≤ M as E ≠ 0. Note that X is usually of full column rank as E ≠ 0. The optimum rank-reducing approximation to the data matrix, denoted by X*, can be realized by minimizing the following approximation error:

    J = ||X − X*||^2 = Σ_{i=1}^{M} ||x_i − x*_i||^2,                 (44)

subject to the constraint rank(X*) = N. Each column of X* takes the following linear form:

    x*_i = μ + V α_i,  i = 1, 2, ..., M,                             (45)

where μ is an L×1 vector, V is an L×(N−1) semiunitary matrix (i.e., V^T V = I_{N−1}), and the α_i are (N−1)×1 vectors. It is straightforward to prove that the optimum μ, α_i, and V are given by μ* = (1/M) Σ_{i=1}^{M} x_i, α*_i = (V*)^T (x_i − μ*), and V* being a matrix consisting of the (N−1) principal left singular vectors of the matrix (x_1 − μ*, x_2 − μ*, ..., x_M − μ*). Note that, in the presence of additive noise, an additional advantage is noise mitigation as M > N, besides rank reduction.

After obtaining the optimum x*_i = μ* + V* α*_i, i = 1, 2, ..., M, we select the N noise-reduced (and rank-reduced) observations x*_i ∈ R^L with the smallest approximation errors ||x*_i − x_i||^2, say x*_1, ..., x*_N without loss of generality. To maintain the nonnegativity of the selected x*_i, let x̄_i = (x̄_i1, ..., x̄_iL)^T be the ith noise-reduced observation to be processed by the proposed nLCA-IVM, where

    x̄_ij = max{x*_ij, 0},  j = 1, 2, ..., L,                        (46)

in which x*_ij denotes the jth component of x*_i.

5 EXPERIMENTAL RESULTS

In this section, five sets of experimental results are presented to demonstrate the efficacy of the proposed nLCA-IVM (after removing redundant constraints of (27) using the quickhull algorithm [34]). In the first three experiments (for scenarios of different numbers of sources and existence/nonexistence of pure-source samples as in Experiment 1, sparse sources as in Experiment 2, and long data length as in Experiment 3), simulation data over 50 Monte Carlo runs are used to evaluate the performance and complexity of the nLCA-IVM and seven existing BSS algorithms: nLCA-ES [23], CAMNS [22], nICA [13], FastICA [12], NMF [14], QNMF [18], and sNMF [17]. At each run, all the entries of the mixing matrix were randomly generated with a uniform distribution over [0, 1], followed by normalization of each row sum to unity. The last two experiments are real data applications of the proposed nLCA-IVM to fluorescence microscopy and DCE-MRI images.

The parameter settings for each algorithm in the first three experiments are as follows: For nLCA-IVM and FastICA, the convergence tolerances were set to 10^−5 and 10^−4, respectively. For nICA, NMF, QNMF, and sNMF, the convergence rule was the maximum number of
    ρ = (1/N) Σ_{i=1}^{N} max_{c_i ∈ {−1,1}} [ (s_i − q(s_i))^T (c_i ŝ_i − c_i q(ŝ_i)) ] / [ ||s_i − q(s_i)|| ||ŝ_i − q(ŝ_i)|| ],

where ŝ_i denotes the estimate of source s_i.
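Read as an averaged, sign-resolved cross-correlation coefficient, the index above can be computed as follows. This sketch assumes q(·) is the sample-mean operator (its definition is not reproduced in this excerpt) and that sources and estimates are already permutation-aligned:

```python
import numpy as np

def perf_index(S, S_hat):
    """Averaged cross-correlation coefficient between sources s_i (columns
    of S) and their estimates (columns of S_hat); the max over
    c_i in {-1, +1} resolves the sign ambiguity."""
    N = S.shape[1]
    rho = 0.0
    for i in range(N):
        u = S[:, i] - S[:, i].mean()            # s_i - q(s_i), q(.) = mean (assumed)
        v = S_hat[:, i] - S_hat[:, i].mean()
        rho += max(c * (u @ v) for c in (-1.0, 1.0)) / (np.linalg.norm(u) * np.linalg.norm(v))
    return rho / N

# Perfect estimates (up to scale, offset, and sign) give an index of 1.
S = np.random.default_rng(3).random((100, 4))
assert abs(perf_index(S, 2.0 * S + 1.0) - 1.0) < 1e-9
assert abs(perf_index(S, -S) - 1.0) < 1e-9
```

By construction the index lies in [0, 1], with 1 indicating that each estimate matches its source up to an affine rescaling and a sign flip.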
TABLE 3
Averaged Cross-Correlation Coefficients (ρ_ave) between Sources and Their Estimates, Averaged Computation Time (T_ave), and Averaged Number of Iterations of Various BSS Methods over 50 Monte Carlo Runs for N = 6 in Experiment 1
original sources better than those obtained by the other algorithms.

Again, from Table 3, one can observe that the computational complexity in terms of the averaged computation time, denoted by T_ave, of the proposed nLCA-IVM is lower than that of all the other iterative algorithms, i.e., nICA, FastICA, NMF, QNMF, and sNMF. An interesting observation is that the T_ave (3.94 sec) associated with the proposed nLCA-IVM when pure-source samples exist is significantly smaller than that (T_ave = 22.95 sec) without pure-source samples. This is because more redundant inequality constraints of problem (24) can be effectively removed by the quickhull algorithm. On the other hand, the T_ave's associated with the two noniterative algorithms, nLCA-ES and CAMNS, are similar for all three cases. Moreover, the averaged constraint reduction ratios (between 0 and 1; see (34)) are 0.9385 for the noise-free case, 0.9504 for the noisy case, and 0.9853 for the case with artificially added pure-source samples, indicating that the computational efficiency with the use of the quickhull algorithm prior to the use of the nLCA-IVM is remarkably high in this scenario.

We also show the corresponding simulation results for N = 4 in Table 4, where the first four human face images in Fig. 4a were used. From this table, similar conclusions about the performance and computational complexity comparison of all the BSS algorithms under test can be drawn as from Table 3. Here, the averaged constraint reduction ratios are 0.9887 for the noise-free case, 0.9864 for the noisy case, and 0.9943 for the case with artificially added pure-source samples, indicating that the computational efficiency with the use of the quickhull algorithm prior to the use of the nLCA-IVM is also high in this scenario.

5.2 Experiment 2: Infrared Spectra Decomposition

In the field of analytical chemistry, infrared spectra are often used to identify and quantify a solvent in the liquid or gas state by its spectral signature. When multiple solvents are mixed, the measured infrared spectrum is a mixture of the spectra of these solvents. In this experiment, we used the gas-phase Fourier-transform infrared spectra of three pure materials (i.e., toluene, acetone, and dichloromethane) with 2 cm^−1 resolution from a public quantitative infrared database (i.e., http://webbook.nist.gov/chemistry/quant-ir/) to generate three mixtures, where L = 3,526. The values of ρ_ave, T_ave, and the averaged number of iterations of the BSS algorithms under test are shown in Table 5. One can observe from Table 5 that the proposed nLCA-IVM performs almost as well as nLCA-ES and CAMNS (where ρ_ave ≃ 1 does not justify the source identifiability of the former due to no pure-source samples), and its performance is superior to that of nICA, FastICA, NMF, QNMF, and sNMF, although no pure-source sample exists in the three original sources of this experiment. As for the computational complexity, the best four algorithms are nLCA-ES, FastICA, CAMNS, and nLCA-IVM. One typical realization among the 50 independent runs for this experiment is shown in Fig. 5, where all the separated spectra (intensity versus wave number (cm^−1)) have been
TABLE 4
Averaged Cross-Correlation Coefficients (ρ_ave) between Sources and Their Estimates, Averaged Computation Time (T_ave), and Averaged Number of Iterations of Various BSS Methods over 50 Monte Carlo Runs for N = 4 in Experiment 1
TABLE 5
Averaged Cross-Correlation Coefficients (ρ_ave) between Sources and Their Estimates, Averaged Computation Time (T_ave), and Averaged Number of Iterations of Various BSS Methods over 50 Monte Carlo Runs in Experiments 2 and 3
properly scaled and ordered for ease of illustrative comparison among all the BSS algorithms under test. One can see in this figure that the separated source spectra obtained from nLCA-IVM, nLCA-ES, and CAMNS match the original source spectra much better than those obtained by the other algorithms.

5.3 Experiment 3: Dual-Energy X-Ray Image Decomposition

Accurate detection of lung nodules (the early sign of lung cancers) using dual-energy chest X-ray imaging is an important diagnostic task. However, the presence of ribs or clavicles overlapped with soft tissue presents a significant challenge to detecting subtle nodules. Effective separation of bone and soft tissues in dual-energy chest X-ray imaging is therefore highly desirable. In this experiment, we applied the nLCA-IVM to process two mixed images of soft and bone tissues (L = 26,896) taken from [39], as shown in Fig. 6a. The values of ρ_ave of all the BSS algorithms under test are also given in Table 5. It is to be mentioned that two pure-source samples exist in the original two sources. It can be seen from the table that nLCA-IVM, nLCA-ES, and CAMNS all perform perfectly (i.e., ρ_ave = 1), justifying the source identifiability of the nLCA (as stated in Theorem 2) and the global optimum solution achieved by the proposed nLCA-IVM (see Proposition 1). On the other hand, the value of T_ave (0.46 second) associated with the proposed nLCA-IVM is larger than that (0.06 second) associated with
nLCA method is implemented by an iterative volume maximization algorithm (i.e., nLCA-IVM) that uses computationally efficient LP solvers. We also proved the source identifiability of the nLCA method under the existence of pure-source samples (i.e., (A5)); furthermore, it never relies on (A5) and thus exhibits good performance even when (A5) is not satisfied perfectly. Comparative experimental results using simulated data were presented to demonstrate the superior performance and computational efficiency of the nLCA-IVM over several existing benchmark methods. Evaluation of the proposed nLCA-IVM using real biomedical data was also presented, which provides some insights into the underlying biomedically relevant characteristics and physiologies. Due to a variety of potential biomedical imaging applications, nontrivial modifications of this algorithm may be needed for specific applications, and they are left as future research along with other applications where nBSS is needed.

APPENDIX A
PROOF OF THEOREM 1

1. By (3), (A2), and (A4), for any y ∈ conv{x_1, ..., x_M},

    y = Σ_{i=1}^{M} θ_i x_i,  with Σ_{i=1}^{M} θ_i = 1, θ_i ≥ 0,
      = Σ_{i=1}^{M} θ_i Σ_{j=1}^{N} a_ij s_j  (by (3)),              (47)

so that

    y = Σ_{j=1}^{N} β_j s_j ∈ conv{s_1, s_2, ..., s_N},              (48)

where β_j = Σ_{i=1}^{M} θ_i a_ij ≥ 0 and Σ_{j=1}^{N} β_j = Σ_{i=1}^{M} θ_i Σ_{j=1}^{N} a_ij = 1 by (A2) and (A4), and hence

    y = Σ_{j=1}^{N} β_j s_j ∈ sol{s_1, s_2, ..., s_N},               (49)

and the equality holds if and only if b = (b_1, b_2, ..., b_N)^T = b_l e_l for some l and b_l ≠ 0. By (A2)-(A4), (50), and (51), we have

    |det(A)| ≤ Π_{i=1}^{N} ( Σ_{j=1}^{N} a_ij^2 )^{1/2} ≤ Π_{i=1}^{N} ( Σ_{j=1}^{N} a_ij ) = 1,   (52)

where the first equality holds if and only if all the rows of A = (a_1, ..., a_N)^T are orthogonal, and the second equality holds if and only if a_i = e_l (since Σ_{j=1}^{N} a_ij = 1) for some l. Thus, it can be easily inferred that |det(A)| = 1 if and only if A is a permutation matrix.

ACKNOWLEDGMENTS

The authors would like to sincerely thank Dr. Wing-Kin Ma and Mr. A. ArulMurugan for their valuable advice, suggestions, and discussions during the preparation of the
manuscript. This work was supported partly by the National Science Council of the Republic of China (R.O.C.) under Grant NSC 96-2628-E007-003-MY3, Taiwan, R.O.C., and partly by the US National Institutes of Health under Grants EB000830 and CA109872.

REFERENCES

[1] N. Keshava and J. Mustard, "Spectral Unmixing," IEEE Signal Processing Magazine, vol. 19, no. 1, pp. 44-57, Jan. 2002.
[2] E. Mjolsness and D. DeCoste, "Machine Learning for Science: State of the Art and Future Prospects," Science, vol. 293, pp. 2051-2055, 2001.
[3] H.R. Herschman, "Molecular Imaging: Looking at Problems, Seeing Solutions," Science, vol. 302, pp. 605-608, 2003.
[4] D.M. McDonald and P.L. Choyke, "Imaging of Angiogenesis: From Microscope to Clinic," Nature Medicine, vol. 9, no. 6, pp. 713-725, 2003.
[5] M.E. Dickinson, G. Bearman, S. Tilie, R. Lansford, and S.E. Fraser, "Multi-Spectral Imaging and Linear Unmixing Add a Whole New Dimension to Laser Scanning Fluorescence Microscopy," Biotechniques, vol. 31, no. 6, pp. 1272-1278, Dec. 2001.
[6] K. Suhling and D. Stephens, Cell Imaging: Methods Express. Scion Publishing, 2005.
[7] A. Rabinovich, S. Agarwal, S. Krajewski, J.C. Reed, J.H. Price, and S. Belongie, "Accuracy of Unsupervised Spectral Decomposition for Densitometry of Histological Sections," IEEE Trans. Medical Imaging, to appear.
[8] Y. Wang, J. Xuan, R. Srikanchana, and P.L. Choyke, "Modeling and Reconstruction of Mixed Functional and Molecular Patterns," Int'l J. Biomedical Imaging, vol. 2006, pp. 1-9, 2006.
[9] D. Nuzillard and A. Bijaoui, "Blind Source Separation and Analysis of Multispectral Astronomical Images," Astronomy and Astrophysics Supplement Series, vol. 147, pp. 129-138, 2000.
[10] S.A. Astakhov, H. Stogbauer, A. Kraskov, and P. Grassberger, "Monte Carlo Algorithm for Least Dependent Non-Negative Mixture Decomposition," Analytical Chemistry, vol. 78, no. 5, pp. 1620-1627, 2006.
[11] S. Haykin, Neural Networks: A Comprehensive Foundation, second ed. Prentice-Hall, 2005.
[12] A. Hyvarinen, J. Karhunen, and E. Oja, Independent Component Analysis. John Wiley, 2001.
[13] E. Oja and M. Plumbley, "Blind Separation of Positive Sources by Globally Convergent Gradient Search," Neural Computation, vol. 16, pp. 1811-1825, 2004.
[14] D. Lee and H.S. Seung, "Learning the Parts of Objects by Non-Negative Matrix Factorization," Nature, vol. 401, pp. 788-791, Oct. 1999.
[15] J.S. Lee, D.D. Lee, S. Choi, K.S. Park, and D.S. Lee, "Non-Negative Matrix Factorization of Dynamic Images in Nuclear Medicine," IEEE Nuclear Science Symp. Conf. Record, vol. 4, pp. 2027-2030, 2001.
[16] M.W. Berry, M. Browne, A.N. Langville, V.P. Pauca, and R.J. Plemmons, "Algorithms and Applications for Approximate Nonnegative Matrix Factorization," Computational Statistics and Data Analysis, vol. 52, no. 1, pp. 155-173, 2007.
[17] W. Liu, N. Zheng, and X. Lu, "Nonnegative Matrix Factorization for Visual Coding," Proc. IEEE Int'l Conf. Acoustics, Speech, and Signal Processing, pp. 293-296, Apr. 2003.
[18] R. Zdunek and A. Cichocki, "Nonnegative Matrix Factorization with Constrained Second-Order Optimization," Signal Processing, vol. 87, no. 8, pp. 1904-1916, 2007.
[19] P. Hoyer, "Nonnegative Matrix Factorization with Sparseness Constraints," J. Machine Learning Research, vol. 5, pp. 1457-1469, 2004.
[20] H. Laurberg, M.G. Christensen, M.D. Plumbley, L.K. Hansen, and S.H. Jensen, "Theorems on Positive Data: On the Uniqueness of NMF," Computational Intelligence and Neuroscience, vol. 2008, pp. 1-9, 2008.
[21] J. Mansfield, K. Gossage, C. Hoyt, and R. Levenson, "Autofluorescence Removal, Multiplexing, and Automated Analysis Methods for In-Vivo Fluorescence Imaging," J. Biomedical Optics, vol. 10, no. 4, pp. 1-9, 2005.
[22] T.-H. Chan, W.-K. Ma, C.-Y. Chi, and Y. Wang, "A Convex Analysis Framework for Blind Separation of Non-Negative Sources," IEEE Trans. Signal Processing, vol. 56, no. 10, pp. 5120-5134, Oct. 2008.
[23] F.-Y. Wang, C.-Y. Chi, T.-H. Chan, and Y. Wang, "Blind Separation of Positive Dependent Sources by Non-Negative Least-Correlated Component Analysis," Proc. IEEE Int'l Workshop Machine Learning for Signal Processing, pp. 73-78, Sept. 2006.
[24] M.E. Winter, "N-Findr: An Algorithm for Fast Autonomous Spectral End-Member Determination in Hyperspectral Data," Proc. SPIE Conf. Imaging Spectrometry, pp. 266-275, Oct. 1999.
[25] J.M.P. Nascimento and J.M.B. Dias, "Vertex Component Analysis: A Fast Algorithm to Unmix Hyperspectral Data," IEEE Trans. Geoscience and Remote Sensing, vol. 43, no. 4, pp. 898-910, Apr. 2005.
[26] C.-I. Chang, C.-C. Wu, W.-M. Liu, and Y.-C. Ouyang, "A New Growing Method for Simplex-Based Endmember Extraction Algorithm," IEEE Trans. Geoscience and Remote Sensing, vol. 44, no. 10, pp. 2804-2819, Oct. 2006.
[27] A. Prieto, C.G. Puntonet, and B. Prieto, "A Neural Learning Algorithm for Blind Separation of Sources Based on Geometric Properties," Signal Processing, vol. 64, pp. 315-331, 1998.
[28] P. Santago and H.D. Gage, "Statistical Models of Partial Volume Effect," IEEE Trans. Image Processing, vol. 4, no. 11, pp. 1531-1540, Nov. 1995.
[29] D. Stein, S. Beaven, L. Hoff, E. Winter, A. Schaum, and A. Stocker, "Anomaly Detection from Hyperspectral Imagery," IEEE Signal Processing Magazine, vol. 19, no. 1, pp. 58-69, Jan. 2002.
[30] M.S. Bazaraa, H.D. Sherali, and C.M. Shetty, Nonlinear Programming: Theory and Algorithms, second ed. John Wiley, 1993.
[31] S. Lang, Undergraduate Analysis, second ed. Springer-Verlag, 1983.
[32] S.H. Friedberg, A.J. Insel, and L.E. Spence, Linear Algebra, third ed. Prentice-Hall, 1997.
[33] I.J. Lustig, R.E. Marsten, and D.F. Shanno, "Interior Point Methods for Linear Programming: Computational State of the Art," ORSA J. Computing, vol. 6, no. 1, pp. 1-14, 1994.
[34] C.B. Barber, D.P. Dobkin, and H. Huhdanpaa, "The Quickhull Algorithm for Convex Hulls," ACM Trans. Math. Software, vol. 22, pp. 469-483, Dec. 1996.
[35] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning. Springer, 2001.
[36] R.A. Horn and C.R. Johnson, Matrix Analysis. Cambridge Univ. Press, 1985.
[37] P. Tichavský and Z. Koldovský, "Optimal Pairing of Signal Components Separated by Blind Techniques," IEEE Signal Processing Letters, vol. 11, no. 2, pp. 119-122, Feb. 2004.
[38] A. Cichocki and S. Amari, Adaptive Blind Signal and Image Processing. John Wiley, 2002.
[39] K. Suzuki, R. Engelmann, H. MacMahon, and K. Doi, "Virtual Dual-Energy Radiography: Improved Chest Radiographs by Means of Rib Suppression Based on a Massive Training Artificial Neural Network (MTANN)," Radiology, vol. 238, http://suzukilab.uchicago.edu/research.htm, 2006.
[40] A.R. Padhani and J.E. Husband, "Dynamic Contrast-Enhanced MRI Studies in Oncology with an Emphasis on Quantification, Validation and Human Studies," Clinical Radiology, vol. 56, no. 8, pp. 607-620, 2001.
[41] T.F. Coleman and Y. Li, "A Reflective Newton Method for Minimizing a Quadratic Function Subject to Bounds on Some of the Variables," SIAM J. Optimization, vol. 6, no. 4, pp. 1040-1058, 1996.
[42] C.-Y. Chi, C.-C. Feng, C.-H. Chen, and C.-Y. Chen, Blind Equalization and System Identification. Springer-Verlag, 2006.

Fa-Yu Wang received the BS and MS degrees in power mechanical engineering and electrical engineering from National Tsing Hua University, Hsinchu, Taiwan, in 1993 and 1999, respectively. From 1996 to 2005, he was with Chunghwa Telecom. Currently, he is working toward the PhD degree at the Institute of Communications Engineering, National Tsing Hua University. His research interests focus on signal processing, pattern analysis, and machine intelligence.
Chong-Yung Chi received the PhD degree in electrical engineering from the University of Southern California in 1983. From 1983 to 1988, he was with the Jet Propulsion Laboratory, Pasadena, California. He has been a professor in the Department of Electrical Engineering since 1989 and the Institute of Communications Engineering (ICE) since 1999 (also the Chairman of ICE for 2002-2005) at National Tsing Hua University, Hsinchu, Taiwan. Currently, he is an associate editor for the IEEE Signal Processing Letters and the IEEE Transactions on Circuits and Systems I, and a member of the IEEE Signal Processing Committee on Signal Processing Theory and Methods. He coauthored a technical book, Blind Equalization and System Identification (Springer, 2006), and has published more than 150 technical papers. His research interests include signal processing for wireless communications, convex analysis and optimization for blind source separation, biomedical imaging, and hyperspectral imaging. He is a senior member of the IEEE.

Tsung-Han Chan received the BS degree from the Department of Electrical Engineering, Yuan Ze University, Taoyuan, Taiwan, in 2004 and the PhD degree from the Institute of Communications Engineering, National Tsing Hua University (NTHU), Hsinchu, Taiwan, in 2009. Currently, he is a postdoctoral research fellow with the Institute of Communications Engineering, NTHU. In 2008, he was a visiting doctoral graduate research assistant with Virginia Polytechnic Institute and State University, Arlington. His research interests are in signal processing, convex optimization, and pattern analysis, with a recent emphasis on dynamic medical imaging and remote sensing applications.

Yue Wang received the BS and MS degrees in electrical and computer engineering from Shanghai Jiao Tong University in 1984 and 1987, respectively, and the PhD degree in electrical engineering from the University of Maryland Graduate School in 1995. In 1996, he was a postdoctoral fellow at the Georgetown University School of Medicine. From 1996 to 2003, he was an assistant and later an associate professor of electrical engineering at the Catholic University of America. In 2003, he joined Virginia Polytechnic Institute and State University, where he is currently the endowed Grant A. Dove Professor of electrical and computer engineering. He became an elected fellow of the American Institute for Medical and Biological Engineering (AIMBE) and was named an ISI highly cited researcher by Thomson Scientific in 2004. His research interests focus on statistical pattern recognition, machine learning, and signal and image processing, with applications to computational bioinformatics and biomedical imaging.