
2D Sound Source Localization in Azimuth & Elevation from Microphone Array by Using a Directional Pattern of Element

Atsushi IKEDA
Tokyo University of Science, TUS 2641, Yamazaki, Noda-shi, Chiba, 278-8510, Japan Email:

Tokyo University of Science, TUS 2641, Yamazaki, Noda-shi, Chiba, 278-8510, Japan Email:

Satoshi KAGAMI
Digital Human Research Center, AIST 2-41-6, Aomi, Koto-ku, Tokyo, 135-0064, Japan Email:

Tokyo University of Science, TUS 2641, Yamazaki, Noda-shi, Chiba, 278-8510, Japan Email:

The Kansai Electric Power Co., Inc. 3-11-20, NAKOJI, AMAGASAKI, HYOGO, 661-0974, Japan Email:

Abstract: This paper describes a two-dimensional sound source localization method that uses the directional pattern of each microphone to perform sharp beam forming in azimuth and elevation with a 32-channel distributed microphone array. The localization method is based on Delay and Sum Beam Forming (DSBF) and Frequency Band Selection (FBS). The microphone arrangement is determined by simulation of the proposed beam forming method. As an example application, a speaker detection system for a house environment is developed using the 32-channel microphone array unit, which is mounted on the ceiling of each room. Finally, the 2D directional localization accuracy of the system for multiple sound sources is evaluated.

I. INTRODUCTION

In order for a robot to live and act within human society, it must be capable of recognizing the surrounding situation. Much research has been directed towards this goal, such as using laser range finders or ubiquitous sensor networks to detect and locate objects in the environment. Microphone arrays have also been developed to determine the location of sound sources in the environment. Typically these arrays are arranged horizontally and the sound source is assumed to lie in the horizontal plane as well. However, sound sources are located in three-dimensional space, so determining only the horizontal angle to the sound source is not sufficient for accurate localization; the elevation angle is also required. With estimation of both azimuth and elevation angles, we believe the accuracy of sound source localization will increase. An additional problem is that sound source localization and separation performance is reduced as the distance between the speaker and the microphone array increases.

Systems that can localize in azimuth and elevation have been studied, and large-scale microphone arrays have been developed: [3] used 512 microphones, while [2] used 1020 microphones. The effectiveness of these microphone array systems, which use large numbers of microphone elements to localize sound sources in two dimensions, was confirmed. However, these systems position many microphone elements all over interior walls, and a compact system with similar capabilities of azimuth and elevation angle estimation has, to our knowledge, not yet been developed. We propose a compact microphone array system that can estimate both azimuth and elevation angles by using the directional pattern of the microphones in the localization phase. In the following sections, we describe the compact microphone array and the localization method using microphone directivity, and present some experimental results validating the system.

II. MICROPHONE ARRAY SYSTEM

Figure 1 shows the 32-channel microphone array which we developed. The outer diameter of the microphone array is 520 [mm] and the internal diameter is 300 [mm]. The component microphones used in the array are Primo omni-directional electret condenser microphones (EM-100PT). Figure 2 shows the interface board we developed for transferring the microphones' output signals to a computer. It has 32 A/D converters (AD7680BRM) and can sample 32 data channels simultaneously. Table I shows the specification of this board.

III. ALGORITHM

Sound source localization is achieved by the Delay and Sum Beam Forming (DSBF) algorithm. The DSBF method is a technique for forming a strong directional characteristic in the signal for each candidate direction by summing the time-shifted and amplitude-aligned sound waves input from each microphone. The Frequency Band Selection algorithm (described in [1]) is then applied to the resulting directional output. One problem in the implementation of this method is that each microphone has an inherent directivity, wherein the sensitivity of the microphone varies with direction.
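A minimal sketch of the delay-and-sum step described above, in Python with NumPy. The function names, the 343 m/s speed of sound, and the rounding of delays to whole samples are assumptions made for illustration, not details from the paper. Each channel is advanced by its extra propagation delay relative to the closest microphone so that the wavefronts line up before summation; the paper's own sign convention for the delay may differ.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value, not taken from the paper


def steering_delays(mic_positions, focus, speed=SPEED_OF_SOUND):
    """Extra propagation delay (seconds) of each microphone relative to the closest one."""
    mics = np.asarray(mic_positions, dtype=float)
    distances = np.linalg.norm(mics - np.asarray(focus, dtype=float), axis=1)
    return (distances - distances.min()) / speed


def delay_and_sum(signals, delays, sample_rate, weights=None):
    """Align each channel by its delay (rounded to whole samples) and sum.

    signals: array of shape (n_mics, n_samples)
    delays:  extra arrival delay of each channel, in seconds
    weights: optional per-channel gains (hypothetical amplitude alignment); default 1
    """
    signals = np.asarray(signals, dtype=float)
    n_mics, n_samples = signals.shape
    if weights is None:
        weights = np.ones(n_mics)
    shifts = np.rint(np.asarray(delays) * sample_rate).astype(int)
    usable = n_samples - shifts.max()  # samples available after the largest shift
    out = np.zeros(usable)
    for channel, shift, weight in zip(signals, shifts, weights):
        # This channel received the wavefront `shift` samples late, so read from there.
        out += weight * channel[shift:shift + usable]
    return out
```

When the focus coincides with the true source position, the aligned channels add coherently and the output power peaks; for other focus points the channels are misaligned and partially cancel, which is what gives the beam its directional selectivity.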

1-4244-1262-5/07/$25.00 2007 IEEE


IEEE SENSORS 2007 Conference

Fig. 1. Microphone array unit design: (a) 32ch microphone array (axes: X [mm], Y [mm]); (b) microphone

Fig. 2. Interface board

Table I. Specification of the interface board
board size        W=75, d=100, h=30 [mm]
input channel     32 channel
interface         IEEE-1394
data transfer     isochronous transfer
sampling speed    16 [kHz]
resolution        16 bit, programmable gain amp.
power supply      5 [V]

By taking account of the distortions to beam forming caused by the directivity of the microphones, we suggest that sound source localization in two dimensions becomes possible. To this end, each signal input at the DSBF phase is multiplied by a sensitivity coefficient that represents the directional pattern of the microphone.

Fig. 3. Localization algorithm

Let the acoustic speed be $V_s$ [m/s], and let the length from a focus to the $i$th ($i = 0, 1, 2, \ldots, N$) microphone be $L_i$. Let the minimum of $L_i$ be $L_{\min}$ and the standard length be $L_s$. The DSBF output $s(t)$ and the delay time $D_i$ of the $i$th microphone are expressed as eqs. (1) and (2):

$$s(t) = \sum_{i=0}^{N} x_i(t - D_i) \quad (1)$$

$$D_i = \frac{L_i - L_{\min}}{V_s} \quad (2)$$

The proposed method multiplies each microphone signal, at the summation step of DSBF-based localization, by the directional characteristic of the microphone element, which is measured beforehand for each frequency as a function of elevation. Let $\theta$ denote the candidate elevation angle of the sound source. The X- and Y-axes lie on the ceiling plane and the Z-axis is in the vertical direction. The origin is set at the array center. We assume that the plane of the microphone array is at 90 degrees and that the vertical direction is at 0 degrees. $W(\theta)$ is defined as a directional function. The method can then be expressed as

$$s(t) = \sum_{i=0}^{N} W(\theta)\, x_i(t - D_i) \quad (3)$$

Fig. 4. Frequency characteristic


Microphones have various directional characteristics (e.g., omni-directional and uni-directional patterns). Figure 4 shows the frequency characteristics of the microphone. This microphone acts as an omni-directional microphone for horizontal sound measurement, but we do not have data describing its response to variation in the elevation direction. We hypothesize that the sensitivity of the microphone differs between the frontal direction (0 degrees) and other directions. In order to measure the directional characteristics of an individual microphone's sensitivity, we moved a loud speaker in increments of 10 degrees in the elevation direction from the front of the microphone (0 degrees) to the back (180 degrees). The speaker was at a distance of 1 [m] from the microphone. The loud speaker used in the experiment was an MS101II made by Yamaha, and the sound sources used were sine waves of varying frequencies. The experiment was done in a general laboratory with background noise of about 35 [dBA], including noise from sources such as PC fans and air conditioners. The reverberation time ($T_{60}$) in the room was about 450 [msec]. The directivity in elevation angle measured for various sound frequencies is shown in Figure 5.

Fig. 5. Directional pattern

From the results of the directivity experiment, we conclude that the microphone element is not omni-directional in elevation angle, and that the sharpness of this directivity depends on the frequency. Especially with high-frequency sound sources, significant features stand out in the directional pattern. Therefore, if these directional characteristics are used to modify the directional signals in DSBF sound source localization, we think that the accuracy of directional localization in the elevation angle will improve.

V. EXPERIMENT

In this experiment we validate the system's performance in sound source localization for the elevation direction. The microphone array was mounted on the ceiling of an experimental house as shown in Figure 6, and the sound source was played from a loud speaker (YAMAHA MS101II). A male voice was used as one sound source. In this room, the reverberation time was approximately 360 to 420 [msec]. The sample frame length for each sound localization task is 4096 data samples. The sound source is localized by scanning the focus over five concentric circles centered on the microphone array. The horizontal angle resolution is 3 degrees. The elevation directions are 26, 45, 56, 63, and 68 degrees, five different conditions in total.

Fig. 6. Experimental room

A. Sound localization result

The results of the sound localization experiment were as follows. In Figure 7, the two large circular images show sound pressure maps of the area surrounding the microphone array. The left map shows the sound pressure without using the microphones' directional patterns; the right map shows the same sound source but uses the directional pattern in sound localization. The image to the left shows the configuration of the speakers and microphone array in the experiment. The circles display the strength of sound pressure power by intensity of grey-scale (black: high pressure). The images contain five concentric circles, and each concentric circle shows the sound power over horizontal angle (around the circumference of the circle) for one elevation angle (the radius of the circle). The red mark shows the position where the power is at the highest level, representing the current estimate of the sound source location. When the two sound pressure maps are compared, the large spread of the black gradation in the left image indicates wide uncertainty in both the horizontal angle and the elevation angle. The spread of the gradation is visibly reduced in both the horizontal and elevation directions in the right figure.
Fig. 7. Sound pressure map (legend: loud speaker, microphone array)
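The scan over five concentric circles at 3-degree horizontal steps, with the strongest cell taken as the source estimate, can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: the function names are hypothetical, and the axis orientation (elevation 0 along the vertical Z-axis, 90 in the array plane, azimuth measured from the X-axis) is an assumption consistent with the coordinate description in the text.

```python
import numpy as np

ELEVATIONS_DEG = np.array([26.0, 45.0, 56.0, 63.0, 68.0])  # the five scanned circles
AZIMUTHS_DEG = np.arange(0.0, 360.0, 3.0)                   # 3-degree horizontal steps


def direction_to_unit_vector(azimuth_deg, elevation_deg):
    """Unit vector of a scan direction; elevation 0 is vertical (Z), 90 is the array plane."""
    az = np.radians(azimuth_deg)
    el = np.radians(elevation_deg)
    return np.array([np.sin(el) * np.cos(az), np.sin(el) * np.sin(az), np.cos(el)])


def locate_peak(power_map):
    """Return (azimuth, elevation) in degrees of the strongest cell of the power map.

    power_map has shape (len(ELEVATIONS_DEG), len(AZIMUTHS_DEG)): one row per circle.
    """
    power_map = np.asarray(power_map)
    el_idx, az_idx = np.unravel_index(np.argmax(power_map), power_map.shape)
    return AZIMUTHS_DEG[az_idx], ELEVATIONS_DEG[el_idx]
```

Each row of `power_map` corresponds to one concentric circle of the pressure map, so a broad dark region in the map shows up here as many cells with power close to the maximum.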

Fig. 8. Elevation angle error (x-axis: experimental run number; legend: using directivity, not using directivity)

We ran sound localization experiments and compared the case in which the directional pattern was not considered with the case in which it was. Another experiment was conducted to test the accuracy of sound source localization in the azimuth and elevation directions, both without and with the directional patterns. First, the elevation estimates were compared for the two cases. Figures 8 and 9 plot the results against the experiment number (1-5) on the X-axis (corresponding to elevation angles of 29, 53, 56, 59, and 59 degrees); in Figure 8, the Y-axis is the localization error. The error is the difference (in degrees) between the true elevation angle and the average of the estimated elevation angles.
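The error measure defined above can be written down directly. The following Python sketch uses hypothetical function names and takes the absolute value of the difference, which is an assumption; the text only says "difference".

```python
import numpy as np


def elevation_error(true_elevation_deg, estimated_elevations_deg):
    """Absolute difference between the true elevation and the mean of the estimates."""
    estimates = np.asarray(estimated_elevations_deg, dtype=float)
    return abs(true_elevation_deg - estimates.mean())


def summarize_errors(errors_deg):
    """Maximum, minimum, and average error over a set of experimental runs."""
    errors = np.asarray(errors_deg, dtype=float)
    return errors.max(), errors.min(), errors.mean()
```

For example, with made-up estimates of 45, 56, 63, and 56 degrees for a true elevation of 56 degrees, the mean estimate is 55 degrees and the error is 1 degree.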

Fig. 9. Detection rate (x-axis: experimental run number; y-axis: detection rate; legend: using directivity, not using directivity)

Localization results (all values in [deg]; "using" = using directivity, "without" = without directivity)

Exp.#  Real value  Maximum error (azimuth)  Average error (azimuth)  Variance (elevation)  Variance (azimuth)
                   using     without        using     without        using    without      using    without
1      162.4       146.6     179.6          17.7      19.9           15.7     2.6          27.4     38.7
2      58          116       116            26        24.655         6.6      8.2          33.4     33.6
3      312.2       47.7      135.3          11.35     28.1           4.0      7.4          13.4     45.6
4      96.2        92.8      96.2           10.61     16.74          5.5      6.4          23.5     29.1
5      96.2        143.8     95.8           25.9      27.5           7.2      10.7         39.4     37.0

When the sound source was localized without using microphone directivity, the maximum elevation angle error was 17 [deg], the minimum error was 11 [deg], and the average error was 13 [deg]. When using directivity, on the other hand, the maximum elevation angle error was 14 [deg], the minimum error was 4 [deg], and the average error was 9 [deg]. The maximum and average differences in elevation error between the two cases are 8 [deg] and 4 [deg], respectively. Figure 9 shows the detection rate. If the localization result was within 9 [deg] of the true value, the system was considered to have detected the true value. The 9 [deg] threshold corresponds to three resolution steps, since the horizontal resolution is 3 [deg]. The detection rate is 86% when using the directional patterns and 84% when not using them. This suggests that the cases in which the true value was missed depend on the reverberation in the experimental room.
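The detection criterion above (an estimate counts as a detection when it lies within 9 degrees of the true value) can be sketched in Python as follows. The function names are hypothetical, and two details are assumptions rather than statements from the paper: the threshold is treated as inclusive, and angular differences wrap around at 360 degrees so that, for instance, 355 and 5 degrees are 10 degrees apart.

```python
import numpy as np


def angular_difference_deg(a, b):
    """Smallest absolute difference between two angles in degrees (wraps at 360)."""
    return np.abs((np.asarray(a, dtype=float) - b + 180.0) % 360.0 - 180.0)


def detection_rate(estimates_deg, true_deg, tolerance_deg=9.0):
    """Fraction of estimates within tolerance_deg of the true direction."""
    hits = angular_difference_deg(estimates_deg, true_deg) <= tolerance_deg
    return float(np.mean(hits))
```

With a 3-degree scan step, the default tolerance of 9 degrees accepts estimates up to three grid cells away from the true direction on either side.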

VI. CONCLUSION

In this paper, we proposed a compact microphone array capable of 2D sound source localization which uses a model of the microphone elements' directional pattern. This method can be used to localize in two directions: azimuth and elevation angle. Using this technique, the system has been demonstrated to be able to detect the power difference in the elevation direction. The maximum error between the true value and the measured value was smaller when using directional patterns than without them. The calculated average error of the estimated horizontal direction was also smaller when using directivity than when not using it, and the variance of the errors was smaller as well. Thus, estimation in two directions, azimuth and elevation, is improved if directivity is used for sound source localization. For these reasons, we conclude that the proposed method is effective.

In the future, we plan to extend the application so that the distance between the sound source and the microphone array can be calculated. In addition, the microphone elements of this array are arranged on a horizontal plane. Sound from the horizontal direction can be localized accurately, especially at small elevation angles; on the other hand, sound from the vertical direction cannot be localized as well, because it is difficult to align the phase difference for sounds coming from a vertical direction. We therefore plan to investigate microphone array configurations that are not restricted to the horizontal plane, for which we think the idea of directivity in elevation angle proposed in this paper will be important.

REFERENCES
[1] Yoko Sasaki, Satoshi Kagami, and Hiroshi Mizoguchi, "Multiple Sound Source Mapping for a Mobile Robot by Self-motion Triangulation," Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2006), pp. 380-385, Beijing, China, Oct. 2006.
[2] Eugene Weinstein, Kenneth Steele, Anant Agarwal, and James Glass, "LOUD: A 1020-Node Modular Microphone Array and Beamformer."
[3] Harvey F. Silverman, William R. Patterson III, and Joshua Sachar, "Measured Performance of a Large-Aperture Microphone Array System."

