
Blind Source Separation of Audio Signals Using Independent Component Analysis and Wavelets




Ma. Guadalupe López P. (1), Herón Molina Lozano (1), Luis P. Sánchez F. (1), L. Noé Oliva Moreno (2)
(1) Centro de Investigación en Computación-IPN, México
(2) Escuela Superior de Cómputo-IPN, México
lpachecob09@sagitario.cic.ipn.mx, hmolina@ipn.mx, lsanchez@cic.ipn.mx, loliva@ipn.mx



Abstract
In this work we propose a method that performs blind source
separation by applying the independent component analysis
algorithm known as FastICA in the wavelet domain, and we
observe its behavior on signals captured in a real
environment. The problem addressed by Blind Source
Separation (BSS) consists of recovering statistically
independent signals. Nevertheless, certain difficulties
appear when such a system is applied to real signals: on the
one hand, reverberation makes the mixtures captured by the
microphones convolutive; on the other hand, these mixtures
will not be totally independent. We carried out two
experiments. In the first experiment we separated 2 audio
signals with a very low percentage of error. In the second
experiment we recorded 3 different audio sources with an
array of 3 microphones, and then separated 3 signals from
one recorded mixture; we observed that in each recovered
component one signal was amplified and the other two were
attenuated. The results indicate that the proposed method is
able to separate 2 or even 3 independent signals from one
mixed audio signal.


1. Introduction

In the field of digital signal processing there is a problem
known as the cocktail party problem, which consists of
separating signals (voice or music) that were mixed
simultaneously, based only on their mixtures. Solving this
problem would enable many applications, for example in
mobile telephony, multiuser communication systems,
redundancy elimination and sparse coding, noise
cancellation, voice enhancement in noisy environments, as
well as in other important areas such as urban ecology,
specifically pollution caused by high sound levels.
Blind Source Separation (BSS) is a powerful technique
capable of solving this problem. It is based on the
following principle: assuming that the original signals are
mixed linearly and that these mixtures can be collected with
appropriate sensors, BSS is able to estimate the
coefficients that characterize this linear combination, and
therefore the original signals can be estimated [3, 5].

2. Blind Source Separation

In attempting to solve the cocktail party problem, at least
two difficulties arise. First, reverberation makes the
mixtures captured by the microphones convolutive and
nonlinear. Second, these mixtures are not fully independent,
because the signals propagate through a nonlinear
environment and parasitic effects are present.
Many studies have addressed the first problem, all of them
through the development of adaptive algorithms. However,
formal studies of the second problem are rare. In this
paper, we present a method that improves the quality of the
separation achieved by BSS when the observed mixtures are
not fully independent [2].
Blind Source Separation is a processing tool able to recover
signals that cannot be observed separately from their linear
mixtures or combinations. The simplest BSS model assumes the
existence of n statistically independent signals,
$s(t) = [s_1(t); \ldots; s_n(t)]$, and n observed mixtures
that are linear and instantaneous combinations of them,
$x(t) = [x_1(t); \ldots; x_n(t)]$, i.e.:

$$
\begin{aligned}
x_1(t) &= a_{11}\,s_1(t) + \cdots + a_{1n}\,s_n(t) \\
&\;\;\vdots \\
x_n(t) &= a_{n1}\,s_1(t) + \cdots + a_{nn}\,s_n(t)
\end{aligned}
\qquad (1)
$$



In compact form:

$$x(t) = A\,s(t) \qquad (2)$$

where A is a square n × n matrix that contains the
coefficients of the mixtures. Since BSS aims to recover s(t)
from the signals x(t), it is necessary to estimate the
matrix A. Once this estimate is available, we can compute
its inverse and obtain:

$$y(t) = W\,x(t) \qquad (3)$$

where y(t) contains the estimates of the independent source
signals and the matrix W is the inverse of A.
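
As a small illustration of Eqs. (2) and (3), the sketch below mixes two short signals with a known 2 × 2 matrix and recovers them with its inverse. The sample values, the matrix A, and the helper names `mix` and `invert_2x2` are assumptions made for this example only; in a real BSS problem A is unknown and has to be estimated.

```python
def mix(A, s):
    """Apply x(t) = A s(t) sample by sample; A is n-by-n, s is a list of n channels."""
    n, length = len(A), len(s[0])
    return [[sum(A[i][j] * s[j][t] for j in range(n)) for t in range(length)]
            for i in range(n)]


def invert_2x2(A):
    """Closed-form inverse of a 2-by-2 matrix (raises if A is singular)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


# Hypothetical sources and mixing coefficients, chosen only for illustration.
s = [[1.0, 0.0, -1.0, 0.0], [0.5, 0.5, -0.5, -0.5]]
A = [[0.6, 0.4], [0.3, 0.7]]

x = mix(A, s)        # Eq. (2): observed mixtures
W = invert_2x2(A)    # separation matrix, W = A^-1
y = mix(W, x)        # Eq. (3): recovered sources, equal to s up to rounding
```

With the true inverse the sources come back exactly (up to floating-point rounding); the difficulty of BSS lies entirely in estimating W from x alone.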
However, as already mentioned, the real mixtures captured by
the microphones are convolutive combinations of the
following form, where m denotes the discrete-time index:

$$x[m] = A[m] * s[m] = \sum_{k} A[k]\,s[m-k] \qquad (4)$$

As in the previous case, the separation is determined by the
impulse response matrix W[m], of size n × n. Thus, the
output vector is constructed as

$$y[m] = W[m] * x[m] = \sum_{k} W[k]\,x[m-k] \qquad (5)$$

It is straightforward to verify that the instantaneous
mixing model is a special case of the convolutive model in
which the mixing and separation systems have no memory [6].
The convolutive model is proposed because it gives a closer
approximation to reality.
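
The relation between the two models can be checked numerically. The following sketch implements the convolutive mixture of Eq. (4) as a finite list of matrix taps; the function name `convolutive_mix`, the tap values and the test signals are hypothetical. With a single tap it reduces to the instantaneous model, and a second tap adds a one-sample echo.

```python
def convolutive_mix(taps, s):
    """x[m] = sum_k taps[k] s[m - k]; taps is a list of n-by-n matrices (an FIR filter)."""
    n, length = len(s), len(s[0])
    x = [[0.0] * length for _ in range(n)]
    for k, A_k in enumerate(taps):
        for m in range(k, length):
            for i in range(n):
                x[i][m] += sum(A_k[i][j] * s[j][m - k] for j in range(n))
    return x


# Hypothetical impulse-like sources and mixing taps, for illustration only.
s = [[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]]
A0 = [[1.0, 0.5], [0.5, 1.0]]   # direct path
A1 = [[0.2, 0.0], [0.0, 0.2]]   # one-sample echo (memory)

x_memoryless = convolutive_mix([A0], s)        # same as x = A0 s
x_reverberant = convolutive_mix([A0, A1], s)   # echo leaks into later samples
```

In the memoryless case each output sample depends only on the current source samples; with the extra tap, energy from earlier samples appears later in time, which is what reverberation does to microphone recordings.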

3. Independent Component Analysis (ICA)

Independent component analysis, or simply ICA, was
introduced in 1986 by Jeanny Herault and Christian Jutten as
a neural network, based on the Hebbian learning rule,
capable of performing blind signal separation. Specifically,
this algorithm tries to separate a number of statistically
independent signals from the same number of input signals
that are linear sums of the former [4].
The method of independent component analysis (ICA) has
specific characteristics that should be considered [1]:
- ICA allows the separation of the signals whenever these
are statistically independent.
- Because ICA separates the sources by maximizing
non-Gaussianity, Gaussian sources cannot be separated.
In addition, the ICA method has two ambiguities: 1) ICA
cannot recover the original amplitude of the mixed sources,
and 2) the order of the outputs can be exchanged.
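
Both ambiguities follow directly from the linear mixing model. As a short worked step, let D be any invertible diagonal matrix (an arbitrary rescaling) and P any permutation matrix (an arbitrary reordering); both symbols are introduced here only for illustration. Then

$$x(t) = A\,s(t) = \left(A\,D^{-1}P^{-1}\right)\left(P\,D\,s(t)\right)$$

so the pair $(A\,D^{-1}P^{-1},\; P\,D\,s(t))$ explains the observed mixtures exactly as well as $(A,\; s(t))$, and the components of $P\,D\,s(t)$ are still statistically independent. Nothing in the mixtures alone distinguishes the two, which is why the amplitude and the order of the recovered signals are undetermined.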

FastICA is a fixed-point iterative algorithm that uses a
nonlinear function g(y) = tanh(a * y), which is applied to
the projection of the data onto the separation vector w,
which is recalculated at each iteration of the algorithm.
The fixed-point iteration is repeated until it converges to
an optimum of the contrast function. Once the vector w has
been determined, it points to one of the independent
components. This algorithm is more efficient than
gradient-based versions, with faster and more stable
convergence [7].

The input to the FastICA algorithm must first be whitened in
three steps: 1) center the data on its mean, 2) normalize
the variance, and 3) orthogonalize the data. The steps to
implement the FastICA algorithm, including this whitening,
are the following [6]:

FastICA Algorithm

1. Center the data to make its mean zero.
2. Whiten the data to give z.
3. Choose an initial (e.g., random) vector w of unit norm.
4. Let $w \leftarrow E\{z\,g(w^T z)\} - E\{g'(w^T z)\}\,w$, where g' is the derivative of g.
5. Let $w \leftarrow w / \|w\|$.
6. If not converged, go back to step 4.
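
The steps above can be sketched for the two-channel case using only the Python standard library. This is a minimal illustration under stated assumptions, not the implementation used in the experiments: the function names (`whiten`, `fastica_one_unit`, `separate_two`), the synthetic sine and sawtooth sources, the mixing coefficients, the fixed starting vector and the tolerance are all hypothetical. Whitening is done with the closed-form eigendecomposition of a 2 × 2 covariance matrix, and the second component is taken as the direction orthogonal to the first, which is only valid for two whitened channels.

```python
import math


def whiten(x1, x2):
    """Steps 1-2: center two channels and whiten them (identity covariance)."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    c1 = [v - m1 for v in x1]
    c2 = [v - m2 for v in x2]
    c11 = sum(a * a for a in c1) / n
    c22 = sum(b * b for b in c2) / n
    c12 = sum(a * b for a, b in zip(c1, c2)) / n
    # Rotation angle that diagonalizes the symmetric 2-by-2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * c12, c11 - c22)
    co, si = math.cos(theta), math.sin(theta)
    lam1 = c11 * co * co + 2.0 * c12 * co * si + c22 * si * si
    lam2 = c11 * si * si - 2.0 * c12 * co * si + c22 * co * co
    z1 = [(co * a + si * b) / math.sqrt(lam1) for a, b in zip(c1, c2)]
    z2 = [(-si * a + co * b) / math.sqrt(lam2) for a, b in zip(c1, c2)]
    return z1, z2


def fastica_one_unit(z1, z2, a=1.0, tol=1e-10, max_iter=200):
    """Steps 3-6: fixed-point iteration for one separation vector w."""
    n = len(z1)
    w1, w2 = 0.6, 0.8                      # step 3: arbitrary unit-norm start
    for _ in range(max_iter):
        e1 = e2 = eg = 0.0
        for p, q in zip(z1, z2):
            g = math.tanh(a * (w1 * p + w2 * q))
            e1 += p * g
            e2 += q * g
            eg += a * (1.0 - g * g)        # g'(y) = a (1 - tanh^2(a y))
        n1 = e1 / n - (eg / n) * w1        # step 4
        n2 = e2 / n - (eg / n) * w2
        norm = math.hypot(n1, n2)
        n1, n2 = n1 / norm, n2 / norm      # step 5
        # The sign of w may flip between iterations, so compare |w_new . w_old|.
        converged = abs(n1 * w1 + n2 * w2) > 1.0 - tol
        w1, w2 = n1, n2
        if converged:                      # step 6
            break
    return w1, w2


def separate_two(x1, x2):
    """Estimate one component with FastICA; the other is the orthogonal direction."""
    z1, z2 = whiten(x1, x2)
    w1, w2 = fastica_one_unit(z1, z2)
    y1 = [w1 * p + w2 * q for p, q in zip(z1, z2)]
    y2 = [-w2 * p + w1 * q for p, q in zip(z1, z2)]
    return y1, y2


# Synthetic test signals (not the recordings used in the experiments).
N = 4000
s1 = [math.sin(0.05 * i) for i in range(N)]                # sine
s2 = [2.0 * ((0.031 * i) % 1.0) - 1.0 for i in range(N)]   # sawtooth
x1 = [0.6 * p + 0.4 * q for p, q in zip(s1, s2)]
x2 = [0.3 * p + 0.7 * q for p, q in zip(s1, s2)]
y1, y2 = separate_two(x1, x2)
```

Each of `y1` and `y2` should follow one of the two sources closely, but with arbitrary sign, scale and order, in line with the two ambiguities of ICA.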

4. Pre-processing of the mixtures using
wavelets

The wavelet transform belongs to a family of signal analysis
techniques commonly called multi-resolution analysis. This
means that it is able to vary the resolution of the analyzed
parameters (scale, a concept related to frequency, and time)
throughout the analysis.

$$WT_x(a, \tau) = \frac{1}{\sqrt{a}} \int x(t)\, h\!\left(\frac{t - \tau}{a}\right) dt \qquad (6)$$

The main feature of this method is that it identifies which
frequencies make up a signal at each instant, with the
following resolutions:
- For high frequencies, it provides good time resolution,
which allows their exact location in time, at the cost of
losing frequency resolution.
- For low-frequency components, it is more important to know
the frequency, even at the expense of losing temporal
resolution.


Fig. 1. Pre-processing by Wavelet bands

Procedure
BSS algorithms are only able to separate statistically
independent signals, but most signals in the real world are
not fully independent. Because of this, the mixture signals
x(t) need to be pre-processed before the BSS algorithm is
applied (see Figure 1).
The wavelet transform analyzes the signals in both the time
domain and the frequency domain. It is appropriate for audio
signal analysis because it allows us to increase the
spectral resolution at the frequencies where most of the
sounds produced by human activities are concentrated.
The pre-processing consists of dividing each input signal
into several sub-signals by means of the wavelet transform
and applying the separation algorithm to the same frequency
band of each signal; therefore, the algorithm is applied as
many times as there are bands. Subsequently, the output
signals are composed by summing the estimated bands, taking
into account that the estimated signals can be exchanged
[8].
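
The band splitting can be illustrated with the Haar wavelet, the simplest member of the Daubechies family. The paper does not state which Daubechies order was used, so Haar is an assumption here, as are the function names (`haar_split`, `haar_merge`, `haar_bands`) and the test signals. The sketch splits a signal into four full-length bands (A3, D3, D2, D1) whose sum reproduces the signal, and shows that, because the transform is linear, each band of a mixture is the same linear mixture of the corresponding source bands. This is the property that allows the separation algorithm to be applied band by band.

```python
def haar_split(x):
    """One analysis step: pairwise averages (approximation) and differences (detail)."""
    approx = [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / 2.0 for i in range(0, len(x), 2)]
    return approx, detail


def haar_merge(approx, detail):
    """One synthesis step: exact inverse of haar_split."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out


def haar_bands(x, levels=3):
    """Return full-length band signals [A_levels, D_levels, ..., D1] that sum to x."""
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    bands = []
    approx = list(x)
    for level in range(1, levels + 1):
        approx, detail = haar_split(approx)
        # Reconstruct this detail band alone, with everything else set to zero.
        band = haar_merge([0.0] * len(detail), detail)
        for _ in range(level - 1):
            band = haar_merge(band, [0.0] * len(band))
        bands.append(band)
    band = approx
    for _ in range(levels):
        band = haar_merge(band, [0.0] * len(band))
    bands.append(band)
    return bands[::-1]


# Hypothetical short signals and mixing weights, for illustration only.
src1 = [float(i % 4) for i in range(16)]
src2 = [1.0 if i % 8 < 4 else -1.0 for i in range(16)]
mixture = [0.6 * p + 0.4 * q for p, q in zip(src1, src2)]

bands_mix = haar_bands(mixture)                       # [A3, D3, D2, D1]
rebuilt = [sum(col) for col in zip(*bands_mix)]       # sum of bands = mixture
```

In a full wavelet-FastICA pipeline, a separation algorithm would be run on band k of every mixture, and the separated bands would then be summed per source after resolving any permutation between bands.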

4. Experiments and Results

Experiment 1

Fig. 2. Original signals and linear mixtures with different
weightings
Two linear mixtures were formed from real signals: the sound
of an ambulance siren and the sound of a traffic-control
whistle. Fig. 2 shows the original signals and the two
linear mixtures.
Fig. 3 shows the 4 bands of the mixtures of the whistle and
the siren, obtained using the Daubechies wavelet transform.

Fig. 3. Decomposition of mix 1 into bands by applying the
Daubechies transform.

As can be seen in Fig. 4, a successful separation is
obtained when FastICA is applied to each band generated by
the wavelet transform.

Fig. 4. Result of applying FastICA to each band of the
wavelet transform: (a) whistle, (b) siren.

Fig. 5 shows that, after adding the bands of each signal
obtained through ICA and comparing the result with the
original signals, a good separation is obtained.
The separation obtained is not 100%, as both recovered
signals still contain samples of the original mixture. The
distribution of the siren samples versus the whistle samples
is shown in Fig. 6 (a) for the original signals and in
Fig. 6 (b) for the signals obtained by wavelet-FastICA. The
distribution of the data in both graphs is similar; however,
the recovered signals are not fully independent.

Fig. 5. Comparison between original signals and
recovered signals using wavelet-FastICA



Fig. 6. Distribution of the siren samples versus the whistle
samples. (a) Original signals, (b) recovered signals

Experiment 2

Mixtures of three signals (voice, piano and the sound of a
barking dog) were played on loudspeakers and recorded using
three directional microphones. The microphones were aligned
with the sources, each microphone pointing toward one of the
sources. The sources were placed 15 meters from the
microphones, with an angular separation of 60°. Fig. 7 shows
the mix of the piano, barking dog and voice signals (first
row) and, in the following rows, the 4 bands of the mix
obtained using the Daubechies wavelet transform.


Fig. 7. Mix 1 and band decomposition using the Daubechies
wavelet transform.
Fig. 8 plots the components obtained by applying the FastICA
algorithm to each band; for bands A3 and D3, three
independent components were obtained.


Fig. 8. Components obtained after applying FastICA: (a) in
bands A3 and D3, (b) in bands D2 and D1.

For the D2 band there are 3 independent components, whereas
for the D1 band only two components were obtained by
applying the FastICA algorithm (see Fig. 8b). Grouping the
independent components obtained above yields the components
shown in Fig. 9, corresponding to the 3 mixtures.

Fig. 9. Resulting components when applying wavelet-FastICA
The signals obtained are not independent; they still
represent a mixture of the three sources. In components 2
and 3 the voice signal prevails, while in component 1 the
barking dog and the piano can be heard and the voice is
present but at low amplitude.
Fig. 10 (a) shows the arrangement, at 60° intervals, of the
3 directional microphones, each pointing to one of the three
sources located 15 m from the center, where the microphones
are placed. The polar diagram in Fig. 10 (b) shows the
response of a directional microphone.


Figure 10. (a) Physical distribution of the directional
microphones, (b) polar diagram of the behavior of a
directional microphone

Using this arrangement of directional microphones, we
recorded signals generated from tones at frequencies of
500 Hz, 1 kHz and 2 kHz. From these recordings we calculated
the signal-to-noise ratio (SNR) obtained with FastICA and
with the Wavelet-FastICA method (see Table 1). The best
results are obtained with Wavelet-FastICA, because the
signal at 500 Hz can be recovered and the separation at the
other two frequencies improves. Both calculations use
mixtures of real tones recorded under the conditions of this
experiment. Plots of the frequencies obtained by
Wavelet-FastICA are shown in Figure 11.

Frequency    FastICA [dB]    Wavelet-FastICA [dB]
500 Hz       +6.02           -10.36
1 kHz        -7.95           -5.53
2 kHz        -6.02           -9.95
Table 1. Signal-to-noise ratio (SNR) for each frequency
using FastICA and Wavelet-FastICA.
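
The text does not state how the SNR values were computed. One common definition, given here only as an assumption, is ten times the base-10 logarithm of the ratio between signal power and noise power. The sketch below uses a hypothetical helper `snr_db`, a synthetic 500 Hz tone and an assumed sampling rate of 8000 Hz.

```python
import math


def snr_db(signal, noise):
    """SNR in dB: 10 log10 of mean signal power over mean noise power."""
    p_signal = sum(v * v for v in signal) / len(signal)
    p_noise = sum(v * v for v in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)


# Synthetic 500 Hz tone at an assumed 8000 Hz sampling rate (not measured data).
tone = [math.sin(2.0 * math.pi * 500.0 * n / 8000.0) for n in range(8000)]
residual = [0.5 * v for v in tone]   # a component with half the amplitude

ratio_db = snr_db(tone, residual)    # amplitude ratio of 2, about 6.02 dB
```

Under this definition a positive value means the wanted tone carries more power than the residual, and halving or doubling the amplitude ratio shifts the result by about 6.02 dB.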

5. Conclusions

In this work we have demonstrated the potential of
decomposing the mixtures with Daubechies wavelets, in
conjunction with Blind Source Separation, as a technique for
separating statistically dependent audio signals.
Subdividing the audio signals collected with three
directional microphones into different frequency bands using
the Daubechies wavelet transform, and subsequently applying
the FastICA algorithm, improves the performance of blind
source separation. Thus the initial hypotheses, that the
signals picked up by a microphone array are not completely
independent and that the separation can be improved by
pre-processing based on the wavelet transform, are supported
by the results. For linear mixtures, good results were
obtained using the wavelet-FastICA method.


Figure 11. Frequencies separated from real mixtures by
Wavelet-FastICA.

References

[1] C. Jutten and J. Herault, Independent Component Analysis versus Principal Component Analysis, EUSIPCO-1988, Signal Processing IV, pp. 643-646, Grenoble, France, 1988.
[2] A. Mansour and M. Kawamoto, ICA papers classified according to their applications and performances, IEICE Trans. Fundamentals, Vol. E86-A, No. 3, March 2003.
[3] A. Mansour, A. K. Barros and N. Ohnishi, Blind Separation of Sources: Methods, Assumptions and Applications, IEICE Trans. Fundamentals, Vol. E83-A, No. 8, August 2000.
[4] T. W. Lee, Independent Component Analysis, Kluwer Academic Publishers, Boston, 2000.
[5] E. Oja and A. Hyvärinen, Independent Component Analysis: Algorithms and Applications, 2000.
[6] Erkki Oja, Aapo Hyvärinen and Juha Karhunen, Independent Component Analysis, John Wiley & Sons, Inc., 2001.
[7] E. Oja and A. Hyvärinen, A Fast Fixed-Point Algorithm for Independent Component Analysis, Neural Computation, 1997.
[8] R. Alcaraz, C. Sanchez, J. Rieta, M. Fernández and J. Ballesteros, Separación Ciega de Fuentes en el Dominio Wavelet. Aplicación a señales de audio, Universidad de Castilla-La Mancha.
