
ADAPTIVE TRACKING OF SPEAKER'S DIRECTION USING TWO MICROPHONES

M. Hassanpour and M. H. Kahaei
Department of Electrical Engineering, Iran University of Science and Technology (IUST), Tehran, 16844, Iran
mahdi_hassanpour@ee.iust.ac.ir, kahaei@iust.ac.ir

ABSTRACT

In this paper, we propose an algorithm for tracking a speaker's direction using two microphones. The method minimizes a cost function that changes during adaptation, according to the convergence state, by reorganizing the set of frequencies it uses. Numerical simulations compare the performance of the proposed method with that of the conventional algorithm.

INTRODUCTION

Speaker direction estimation is an important part of many applications, such as teleconferencing, distance-learning systems and wireless applications. Many speaker direction estimation algorithms use microphone arrays to enhance the received signal. Some algorithms, such as the Coherent Signal Subspace (CSS) method, use the MUSIC algorithm. In this approach, a focusing process gathers the signal components of several frequency bands together. This process requires a pre-estimate of the signal direction, whose error strongly affects the final estimate. The algorithm proposed in [2] determines the polar angle of the speaker direction using two microphones and a pre-estimate of the signal direction. The method proposed in [1], on the other hand, uses three microphones to track the azimuth angle of the arriving signal. In this paper, we propose a method that reorganizes the set of frequencies of the received signal to track the polar angle of the signal with two microphones and without pre-estimation, by incorporating the cost function of [1] into [2].

SYSTEM MODELING

The Fourier transforms of the acoustic signals at the microphones are defined as [2]
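$$X_1(k) = X_s(k) + V_1(k), \qquad X_2(k) = X_s(k)\, e^{-j\omega_k \tau_\theta} + V_2(k) \tag{1}$$

where

$$\omega_k = k\,\omega_0, \qquad \tau_\theta = \frac{2r}{c}\sin\theta, \qquad \omega_0 = \frac{2\pi f_s}{N}, \qquad \theta \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$$

and where $f_s$ is the sampling frequency, $\theta$ is the polar angle of the received signal, $N$ is the number of samples, and $c$ denotes the acoustic velocity. Here $r$ is assumed to denote half the microphone spacing, which is consistent with the guarantee band ratio $\alpha$ derived later in the paper. The cost function used in [2] has many local minima. As a result, without a pre-estimate of the received signal direction, the selected initial value may prevent the algorithm from converging to the correct angle. Thus, we define the cost function $J_k(\varphi, \theta)$ at the $k$th frequency band as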

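$$J_k(\varphi, \theta) = E\left[\left|X_2(k) - X_1(k)\, e^{-j\omega_k \tau_\varphi}\right|^2\right] \tag{2}$$

where $\varphi$ is the estimate of the angle $\theta$ and $\tau_\varphi$ is the time delay applied to $X_1$ that corresponds to $\varphi$. Because $x_s(n)$, $v_1(n)$ and $v_2(n)$ are independent and the signal power of $x_s(n)$ is normalized, we may write:

$$J_k(\varphi, \theta) = 4\sin^2\!\left(\frac{k\,\omega_0\, r}{c}\,(\sin\theta - \sin\varphi)\right) + 2\sigma^2 \tag{3}$$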
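The step from the definition of the cost function to its closed form can be made explicit. The following is a sketch under stated assumptions: the noise terms are zero-mean, uncorrelated with each other and with the source, both have the same variance $\sigma^2$, and $E|X_s(k)|^2 = 1$. The equal-variance assumption and the delay model $\tau_\theta = (2r/c)\sin\theta$ are inferred from the form of the result rather than stated in the text.

```latex
\begin{align*}
J_k(\varphi,\theta)
 &= E\left[\left| X_s(k)\left(e^{-j\omega_k\tau_\theta} - e^{-j\omega_k\tau_\varphi}\right)
      + V_2(k) - V_1(k)\,e^{-j\omega_k\tau_\varphi} \right|^2\right] \\
 &= \left| e^{-j\omega_k\tau_\theta} - e^{-j\omega_k\tau_\varphi} \right|^2
      + E|V_2(k)|^2 + E|V_1(k)|^2 \\
 &= 2 - 2\cos\!\big(\omega_k(\tau_\theta - \tau_\varphi)\big) + 2\sigma^2 \\
 &= 4\sin^2\!\left(\frac{\omega_k(\tau_\theta - \tau_\varphi)}{2}\right) + 2\sigma^2
  = 4\sin^2\!\left(\frac{k\,\omega_0\,r}{c}\,(\sin\theta - \sin\varphi)\right) + 2\sigma^2 .
\end{align*}
```

The cross terms vanish in the second line because of the independence assumption, and the last equality substitutes $\omega_k = k\omega_0$ and the assumed delay model.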

Since most of the power of speech signals is concentrated at harmonic frequencies, the SNRs at these frequencies are rather high and, as a result, the harmonic components help improve the estimation accuracy. Thus, the general cost function is defined as

$$J_{m_i}(\varphi, \theta) = \frac{1}{|m_i|} \sum_{k \in m_i} J_k(\varphi, \theta) \tag{4}$$

where the set $m_i$ contains the frequencies whose SNRs are higher than the threshold $T$. The largest $k$ for which the cost function $J_k(\varphi, \theta)$ has only one local minimum for all values of $\theta$ is called $k_s$. The region $k < k_s$ is called the guarantee band, for which the guarantee band ratio $\alpha$ is defined as $\alpha = k_s / N$.
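To make the per-band and general cost functions concrete, here is a minimal Python sketch. It assumes the closed-form, noise-aware cost $4\sin^2(\omega_k(\tau_\theta - \tau_\varphi)/2) + 2\sigma^2$, a delay model $\tau = (2r/c)\sin(\cdot)$ with $r$ taken as half the microphone spacing, and a plain average over the selected bins. The function names, the SNR dictionary and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def delay(angle, r, c):
    """Assumed delay model: microphones 2*r apart, sound speed c."""
    return 2.0 * r * math.sin(angle) / c


def cost_k(k, phi, theta, fs, n, r, c, noise_var=0.0):
    """Closed-form cost at bin k for estimate phi and true angle theta."""
    omega_k = 2.0 * math.pi * k * fs / n
    arg = 0.5 * omega_k * (delay(theta, r, c) - delay(phi, r, c))
    return 4.0 * math.sin(arg) ** 2 + 2.0 * noise_var


def select_bins(snr_by_bin, threshold):
    """Keep the bins whose SNR exceeds the threshold T."""
    return sorted(k for k, snr in snr_by_bin.items() if snr > threshold)


def general_cost(bins, phi, theta, fs, n, r, c, noise_var=0.0):
    """Average of the per-bin costs over the selected set of bins."""
    total = sum(cost_k(k, phi, theta, fs, n, r, c, noise_var) for k in bins)
    return total / len(bins)


# Hypothetical setup: 16 kHz sampling, 512-point transform, r = 5 cm.
FS, N, R, C = 16000.0, 512, 0.05, 340.0
bins = select_bins({3: 5.0, 1: 12.0, 7: 20.0}, threshold=10.0)
value_at_truth = general_cost(bins, 0.3, 0.3, FS, N, R, C, noise_var=0.1)
```

When the estimate equals the true angle, every per-bin cost reduces to the noise floor $2\sigma^2$, so the averaged cost does too.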

0-7803-9521-2/06/$20.00 2006 IEEE.


REORGANIZING THE SET OF FREQUENCIES

For the cost function $J_k(\varphi, \theta)$ to have only one local minimum at each state, its domain depends on $\alpha$ and $T$. We now propose a method to calculate them at each state. Then, the set of frequencies indexed by $m_i$ is determined.

CALCULATION METHOD OF THRESHOLD T

We propose a method to calculate the threshold $T$. Suppose that we are at the $i$th state, where

$$k_{i1} = \max\{m_i\}, \qquad k_{i2} = \min\{m_i\}$$

$$m_{i+1} = m_i \cup k_i$$

Therefore, we obtain $T_{k_i}$ with $J_{k_i+1}(\varphi, \theta)$ as the maximum value:

$$T_{k_i} = J_{k_i}(\varphi_k, \theta) = 4 \times \ldots$$

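CALCULATION METHOD OF GUARANTEE BAND RATIO $\alpha$

To find $\alpha$, the cost function should have only one minimum for all values of $\theta$. The guarantee band ratio $\alpha$ is then derived as

$$\sin\!\left(\frac{k\,\omega_0\, r}{c}\,(\sin\theta - \sin\varphi)\right) = 0$$

$$\sin\varphi = \sin\theta - m\,\frac{\pi c}{r\,k\,\omega_0} \qquad (m = 0, \pm 1, \ldots)$$

and

$$k_s = \frac{\pi c}{2\, r\, \omega_0}$$

and

$$\alpha = \frac{c}{4\, r\, f_s}$$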
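As a numerical check of the guarantee band, the sketch below evaluates $\alpha = c/(4 r f_s)$ and $k_s = \alpha N$, and counts how many integers $m$ in $\sin\varphi = \sin\theta - m\,\pi c/(r k \omega_0)$ yield a valid $\sin\varphi$ in $[-1, 1]$. A count of one means the cost at that bin has a single zero. The parameter values ($f_s$ = 16 kHz, $N$ = 512, $r$ = 0.05 m, $c$ = 340 m/s) and the function names are illustrative assumptions, not values taken from the paper.

```python
import math


def guarantee_band(fs, n, r, c):
    """Return (k_s, alpha) with alpha = c / (4 r fs) and k_s = alpha * N."""
    alpha = c / (4.0 * r * fs)
    return alpha * n, alpha


def count_solutions(k, theta, fs, n, r, c):
    """Count integers m for which sin(theta) - m*pi*c/(r*k*omega_0) is a valid sine."""
    omega_0 = 2.0 * math.pi * fs / n
    step = math.pi * c / (r * k * omega_0)
    m_max = int(2.0 / step) + 1  # beyond this, |m| * step exceeds the width of [-1, 1]
    return sum(
        1
        for m in range(-m_max, m_max + 1)
        if abs(math.sin(theta) - m * step) <= 1.0 + 1e-12
    )


# Hypothetical setup, not taken from the paper.
FS, N, R, C = 16000.0, 512, 0.05, 340.0
k_s, alpha = guarantee_band(FS, N, R, C)

# Worst case for ambiguity is a source at the edge of the angular range.
inside = count_solutions(54, math.pi / 2, FS, N, R, C)   # k < k_s
outside = count_solutions(60, math.pi / 2, FS, N, R, C)  # k > k_s
```

With these values $k_s$ is about 54.4, so bin 54 admits only the true solution ($m = 0$), whereas bin 60 admits a second, spurious one.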

SIMULATION RESULTS

Using numerical simulations, the performance of the proposed algorithm is compared with that of the conventional one in [2]. Fig. 1 shows the cost function used in [2]. Because of its local minima, the convergence of the algorithm depends on the initial value of the estimate. Comparing Figs. 2 and 3 shows that the convergence of the proposed algorithm to the correct received signal direction outperforms that of the conventional method [2]. This leads to superior speaker direction finding and tracking.
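The dependence on the initial value can be reproduced with a small noise-free sketch. It runs plain gradient descent on the averaged closed-form cost $4\sin^2(a_k(\sin\theta - \sin\varphi))$ with $a_k = k\omega_0 r/c$, once on a single bin far above the guarantee band and once in two stages (guarantee band first, then a wider set). The two-stage schedule is a simplified stand-in for the reorganization of the frequency set, not the paper's threshold rule. The true angle is used directly to evaluate the cost, which a real tracker would estimate from the microphone signals. All parameter values, the step-size rule and the function names are assumptions made for illustration.

```python
import math


def gradient(phi, theta, a_values):
    """Mean over bins of dJ_k/dphi for J_k = 4 sin^2(a_k (sin theta - sin phi))."""
    d = math.sin(theta) - math.sin(phi)
    total = sum(-4.0 * a * math.sin(2.0 * a * d) * math.cos(phi) for a in a_values)
    return total / len(a_values)


def descend(phi0, theta, bins, fs, n, r, c, iterations=400):
    """Gradient descent on the averaged cost over the given bins."""
    omega_0 = 2.0 * math.pi * fs / n
    a_values = [k * omega_0 * r / c for k in bins]
    mean_a = sum(a_values) / len(a_values)
    mean_a2 = sum(a * a for a in a_values) / len(a_values)
    # Conservative step: half the inverse of a bound on the second derivative.
    step = 0.5 / (8.0 * mean_a2 + 4.0 * mean_a)
    phi = phi0
    for _ in range(iterations):
        phi -= step * gradient(phi, theta, a_values)
    return phi


# Hypothetical setup, not taken from the paper.
FS, N, R, C = 16000.0, 512, 0.05, 340.0
THETA = 0.9  # true angle in radians
k_s = N * C / (4.0 * R * FS)

# Single bin well above the guarantee band, started at 0 rad.
stuck = descend(0.0, THETA, [200], FS, N, R, C)

# Two stages: guarantee band first, then all bins up to 200.
coarse = descend(0.0, THETA, range(1, int(k_s) + 1), FS, N, R, C)
fine = descend(coarse, THETA, range(1, 201), FS, N, R, C)
```

With these values the single high bin settles near 0.24 rad, a spurious minimum, while the two-stage run reaches the true 0.9 rad from the same starting point.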
Fig. 1. $J(\varphi, \theta)$ in the conventional algorithm

Fig. 2. Convergence of the conventional algorithm

REFERENCES

[1] Y. Hioka and N. Hamada, "Tracking of speaker direction by integrated use of microphone pairs in equilateral-triangle," IEICE Trans. on Fundamentals, vol. E88-A, no. 3, pp. 633-641, Mar. 2005.
[2] H. Kawakami, M. Abe, and M. Kawamata, "A two-channel microphone array with adaptive target tracking using frequency domain generalized sidelobe cancellers," IEEE Int. Symp. on Intelligent Sign. Process. & Commun. Systems, pp. 291-296, Nov. 2002.



Fig. 3. Convergence of the proposed algorithm
