Professional Documents
Culture Documents
Filtering For Speech: Enhancement
Filtering For Speech: Enhancement
873
wk lrPEEChlxk!
generator
UL - - - - - L
measurement
system
-
? - * -I vk- ,
l s k F /
fdter
U
is minimum. The estimation e m r e k is defined by the equation
ek =Xk -2k
(7)
with the initial condition XO= [OIn x l . The filter gain and error
variance equations are
Figure 1: The noisy speech generating and filtering mechanism.
K k = PkIk._1c[R.f CPkp-lCT]-l (8)
model [2]-[3] Pklk-1 = A P k - i A T + BQBT (9)
p k = [I- K k c ] P k I k - 1 (10)
where K k is a Kal" gain Vector, P k l k - 1 = E [ ( X k
Xk,k-l)*(+ -akp,, )] is a priori error covariance mamx, P k =
E [ ( x k - X k ) T ( X k - 8,')
is a posteriori error covariance matrix.
I 0
0
1
0 O1 ...
. . . O0
. ... . .
0 \
O
The initial condition PO = [ O l n x n . I is an n x n identity matrix.
The estimated speech sample e k can be obtained by
... . . i k =c x k (1 1)
0 0 0 ... 0 1 If the additive noise { U k } is a colored Gaussian process, the %an
an Un-1 an-2 -.. U? 01 filter algorithm for such speech estimation is given in [3].
BT = C=[OO ... 011 3.2. H , Filtering Algorithm
The objective of speech enhancement is to estimate Z k given
{ s i } (0 5 i 5 k). Previous studies have assumed that both W k Consider the state space model (3)-(4), we make no assumptionson
and U k are white, uncomlated Gaussian processes in order to de- the nature of unknown quantities W k and U k , and are interested not
veloping enhancement algorithms. i.e. to estimate X k [1]-[4]. How- necessarily in the estimation of x k but X k using the observations
ever neither the speech nor the noise may be Gaussian. This is be- {Si} (0 5 i 5 k). Let
cause W k could be a pulse train for voiced speech, random noise Zk =c x k (1 2)
for unvoiced speech or the modeling mor, U k could be any kind of Different from that of the Kalman filter which minimizes the vaxi-
noise. The Gaussian assumptions may provide an estimate which is ance of the estimation error, the design criterion for the H- filter
highly vulnerable to statistical outliers, i.e., a small number of large is to provide an uniformly small estimation error, e k = Z k - i k . for any
measurement errors would have a large influence on the resulting W k , U k E 22 and XO E 72". NO= that Z k i s equal to X k . The
estimate, so that the viability of the algorithms was checked by ex- measure of performance is then given by
periment [3]. In the following, we present an new approach based
on the Boofiltering algorithm for speech enhancement, where both
W k and V k may not be white or colored Gaussian processes. For
comparison, the Kalman filtering algorithm is briefly reviewed.
where((X0 - 2 o ) , W k , U k ) +
~,Xoisanapnbn'estimateof~o,
3. KALMANANDH,FIL!T"G Q 2 0 , ~ ; ' > 0,W > OandV > Oaretheweightingmatrices,
ALGORITHMS and are left to the choice of the designer and depend on perfor-
mance requirements. The notation IZk; 1 is defined as the square of
3.1. Kalman Filtering Algorithm the weighted (by Q) h n o m O f Z k , i.e., lZk1$ = z z Q z k . The irdo
filterwill search z k such that the optimal estimate Z k among ail pos-
In the Kalman filtering, the clean speech signal { Z k } is considered sible &(i.e. the worsetase perfonnance measure) should satisfy
to be a random process. Assuming that both exciting term W k and supJ<r2 (14)
observation noise U k (additive noise) are white Gaussian processes
with zero mean and uncomlated variances Q and R where "sup" stands for supremum and 7 > 0 is a prescribed level
of noise attenuation. The E, filtering can be interpreted as a
E { w k d } = Q mini- problem where the estimator strategy i k play against the
E{vkvr} = R exogenous inputs W k , U k and the initial sfate XO.The performance
criterion becomes
E{wkUr) = 0.
E means ezpectation. The design objective of "anfilter is to
determine the optimai estimate i k based on the (3;) (0 5 i 5 k) N
such that
Pk = E { e k e z ) (5)
874
where '"in" stands for minimization and "max" maximization. 4. EXPERIMENTAL RESULTS
Note that unlike the traditional minimum variance filtering ap-
proachwiener and/or Kalman filtering), the lTm filtering deals Both Kalman and IToo filtering algorithms described in Section 3
with deterministic disturbances and no a priori knowledge of the are applied to speech enhancement as follows.
noise statistics is required. Since the the observation s k is given, V k
can be uniquely detennined by (2) once the optimal values of W k Noisy speech is divided into equal-length segments. Within each
and xo are found, and z k = cxk,& = c x k , We C a n rewrke the segment, we first estimate the tap-gain parameters uj j = 1,...,n,
performance criterion (15 ) as then filter the noisy speech with the parameters {aj}. The Kalman
and H, filtering algorithms are initialized only for the first seg-
ment. In our experiment, we choose the state vector XO= 0, and
weight matrix po >> 0. In the subsequent segments, XOand f i
w are initialized using the corresponding last values from the previ-
ous segment. There exists a trade off in the choice of the length
of the segments. Large segments improve the accuracy of the pre-
diction parameters for stationary sounds (e.g. vowels) and short
where (3 = CTQC.In [9]-[lo], it has been proved that the follow- segments improve the accuracy for nonstationary sounds. In our
ing theorem presents a complete solution to the HODfiltering prob experiment, the segment length used for calculating the parameters
lem for the state-space model (3)-(4) with the performance criterion
(16).
.
{ u j , i = 1,2,.. n} is 128 samples, which corresponds to 16 ms
(with a sampling frequence of 8 KHz). The order of the all pole filter
is 10, which is a commonly used value in linear predictive analysis
of speech signal and the input S N R is 5 dB.Two types of noise are
Theorem : Let 7 > 0 be a p-rescribed level of noise attenuation. used: white noise (stationary) and helicopter noise (nonstationary).
Then, there exists an Hm filter for Z k if and only if there exists a The performance of both filtering algorithms is measured in tenns
stabilizing symmetric solution pk > 0 to the following discrete- of SNR, time domain speech representation, and listening evalua-
time Riccati equation tion. In the testing, the order of state space model is set to be equal
pk+1 = APkAT - APkCT(V + CPkCT)-'CPkAT (17) to the order of the all pole filter. Experiment results obtained for
the sentence "Woe beride the interviewee if he answered vaguely"
+BWBT - 7-'pk+iLT(Q-' + 7-'LA+1LT)-'LPk+1 are shown in the following tables. Table 1 shows the output S N R
Po = Cpc' + 7-*LQLT)-'. results for the testing where the weightings are W = 1, V = 3,the
attenuation parameter 7 = 1.02, andpo = 1OOO. The S N R values
If this is the case, then an lT, filter can be given by
throughout this paper are the global signal to noise ratios calculated
i k = c x k , k = l Y 2y . . . , N (18) by
where
&k = &k-1 + H k ( S k - C&k-l), 3 0 =0 (19)
where N is the total number of samples of each sentence, 2 k is
Hk is the gain of the ri, filter and given by the clean (noise-&) sequence and % is the enhanced speech. Ta-
ble l gives the performance comparison between the Kalman and
H k = APkCT(V + CP*CT)" (20) HoOfiltering algorithms for the situations that the speech signal is
Solving Riccati equation (17) for the solution Pk is not trivial contaminated by white Gaussian noise and helicopter noise respec-
due to its nonlinearity. Applying the following maaix inversion tively. In both cases, the IT, filtering algorithm has over 1.0 dB
le"a(MIL) advantage in the output S N R values over the Kalman filtering algo-
rithm [3]. Figure 2 shows the clean speech signal. Figures 3 and
- + +
A AB(C BTAB)"BTR = (A-' BC"BT)", (21) 4 show the original noisy speech signals (contaminated by white
(17) can be written as Gaussian noise and by helicopter noise respectively) as wed as the
corresponding noise suppressed signals in the time domain by us-
PF:l = [A(PF' +CTV"C)"AT+BWBT]-' - T ' ~ C ~ Q C ing the H,. filtering algorithm. Informal subjective tests c o n h
(22) the improvement of SNR values, intelligibility and voice quality.
so that we can obtain the solution of Pk from (22) recursively. The figures clearly show the effectiveness of the Hm filtering for
speech enhancement.
Comparing the Kalman filtering algorithm (7>( 10) and the H, fil-
tering algorithm (17>(20), we can observe
Table 1: Performance Comoarison with h u t Sm=5 dB
1) The Kalman filtering algorithm gives the minimum mean-square-
error estimate of xk based on the {si} ( 0 5 i5 k); Algorithm White Noise I Helicopter Noise
KalIXlan 8.7276 I 8.9119
2) The Ifmfiltering algorithm gives the optimal estimate of z k I EI, I 9.8781 I 10.0693 I
based on the {si} (0 5 i 5 k) such that the effect of the worst
disnrrbance(noises)on the estimation error is minimized.
875
Figure 2: The clean speech signal.
5. CONCLUSIONS
-x“
-4ooo0 0.5 1
Sample”&
15 2
x lo’
6. REFERENCES
1. J.S. Lim and A.V. Oppenheim, “All-Pole Modeling of De-
graded Speech’’, IEEE T m .Acowt., Speech, Signal P m e s s - Figure 3: The white Gaussian noise contaminated speech signal
ing, Vol. ASSP-26,197-210,1978. and the estimated speech signal using the H, filtering.
2. K.K. Paliwal and A. Basu, ‘A Speech Enhancement Method
Based on Kalman Filtering”, Proc. IEEE ICASSP, 177-180,
1987.
3. J.D.Gibson, B.Koo and S.D. Gray, “Fiitcring of Colored Noise
for Speech Enhancement and Coding”, IEEE Tiuns. S i g d
Proce$sing, Vol. SP-39,1732-1742,1991. B
U
4. Y. Ephraim. “A Mini” Mean Squan Error Approach for s o
Speech Enhancement”, Proc. IEEE ICASSP, 829-832,1990. E
4.
5. R.N. Banavar and JL.Speyer, “A Linear Quadratic Game The- -2ooo-
ory Approach to Estimation and Smoothing”. Pmc. IEEEACC,
2818-2822,1991.
6. M.J.Grimble and A. Elsayed, “Solution of the H, Optimal
Linear Fiitexing Problem for Discrete-Time Systems”, IEEE
Trans. Acout., Speech, Signal Processing, Vol. ASSP-38,
1092-1104,1990.
7. U. Shaked and Y. Theodor, “ H ~ O ~ t i mEstimation:
al A Tuto-
rial“, PWC.31st IEEE CDC, 2278-2286,1992.
8. KM. Nagpal and P.P. Khargonekar, “Filming and Smoothing
in an Hw Setting’’, IEEE Trans. Auto. Control, Vol. AC-36,
152-166,1991. -2wo.
876