Wiener Filter
Darcy Tsai
E-mail: raulshepherd@access.ee.ntu.edu.tw
Graduate Institute of Electronics Engineering
National Taiwan University, Taipei, Taiwan, ROC
Abstract
Wiener theory, formulated by Norbert Wiener in 1940, forms the foundation of
data-dependent linear least square error filters. Wiener filters play a central role in a
wide range of applications such as linear prediction, echo cancellation, signal
restoration, channel equalization and system identification. The coefficients of a
Wiener filter are calculated to minimize the average squared distance between the
filter output and a desired signal. In its basic form, the Wiener theory assumes that the
signals are stationary processes. However, if the filter coefficients are periodically
recalculated for every block of N signal samples then the filter adapts itself to the
average characteristics of the signals within the blocks and becomes block-adaptive. A
block-adaptive (or segment adaptive) filter can be used for signals such as speech and
image that may be considered almost stationary over a relatively small block of
samples. In this chapter, we study Wiener filter theory, and consider alternative
methods of formulation of the Wiener filter problem. We consider the application of
Wiener filters in channel equalization, time-delay estimation and additive noise
reduction. A case study of the frequency response of a Wiener filter, for additive noise
reduction, provides useful insight into the operation of the filter. We also deal with
some implementation issues of Wiener filters.
Keywords
Wiener filter, Optimum linear filters, Minimum mean squared error (MMSE)
Contents

1 Introduction
2 Linear Optimum Filtering
   2.1 Problem Statement
   2.2 Principle of Orthogonality
   2.3 Minimum Mean Squared Error
3 Wiener-Hopf Filters
   3.1 Wiener-Hopf Equations
   3.2 Matrix Formulation of the Wiener-Hopf Equations
   3.3 Error Performance Surface
   3.4 Numerical Example
4 Some Applications of Wiener Filters
   4.1 Wiener Filter for Additive Noise Reduction
   4.2 Wiener Channel Equalizer
   4.3 Time-Alignment of Signals in Multichannel Systems
5 Implementation of Wiener Filters
6 Further Comments
1 Introduction
Wiener filters are a class of optimum linear filters which involve linear
estimation of a desired signal sequence from another related sequence. In the
statistical approach to the solution of the linear filtering problem, we assume the
availability of certain statistical parameters (e.g. mean and correlation functions) of
the useful signal and unwanted additive noise. The problem is to design a linear filter
with the noisy data as input and the requirement of minimizing the effect of the noise
at the filter output according to some statistical criterion. A useful approach to this
filter-optimization problem is to minimize the mean-square value of the error signal
that is defined as the difference between some desired response and the actual filter
output. For stationary inputs, the resulting solution is commonly known as the Wiener filter. Its main purpose is to reduce the amount of noise present in a signal by comparison with an estimate of the desired noiseless signal.
We may characterize the filter as follows:
1. The filter is linear, which makes the mathematical analysis easy to handle.
2. The filter operates in discrete time, which makes it possible for the filter to be implemented using digital hardware/software.
The design of the filter is shaped by two further choices:
1. Whether the impulse response of the filter has finite or infinite duration.
2. The type of statistical criterion used for the optimization.
The choice of a finite-duration impulse response (FIR) or an infinite-duration impulse response (IIR) for the filter is dictated by practical considerations. The choice of a statistical criterion for optimizing the filter design is influenced by mathematical tractability. These two issues are considered in turn.
The theory, as initially developed, applies to IIR filters and includes that for FIR filters as a special case. However, for much of the material presented in this tutorial, we will confine our attention to the use of FIR filters. We do so for the following reason. An FIR filter is inherently stable, because its structure involves the use of forward paths only. In other words, the only mechanism for input-output interaction in the filter is via forward paths from the filter input to its output. Indeed, it is this form of signal transmission through the filter that limits its impulse response to a finite duration.
On the other hand, an IIR filter involves both feedforward and feedback. The presence of feedback means that portions of the filter output and possibly other internal variables in the filter are fed back to the input. Consequently, unless it is properly designed, feedback in the filter can indeed make it unstable, with the result that the filter oscillates; this kind of operation is clearly unacceptable when the requirement is that of filtering, for which stability is a must. By itself, the stability problem in IIR filters is manageable in both theoretical and practical terms. However, when the filter is required to be adaptive, bringing with it stability problems of its own, the inclusion of adaptivity combined with the feedback that is inherently present in an IIR filter makes a difficult problem that much more difficult to handle. It is for this reason that we find that, in the majority of applications requiring the use of adaptivity, the use of an FIR filter is preferred over an IIR filter, even though the latter is less demanding in computational requirements.
Turning next to the issue of what criterion to choose for statistical optimization, there are indeed several criteria that suggest themselves. Specifically, we may consider optimizing the filter design by minimizing a cost function, or index of performance, selected from the following short list of possibilities:
1. Mean-square value of the estimation error;
2. Expectation of the absolute value of the estimation error;
3. Expectation of third or higher powers of the absolute value of the estimation error.
Option 1 has a clear advantage over the other two, because it leads to tractable
mathematics. In particular, the choice of the mean-square error criterion results in
a second-order dependence of the cost function on the unknown coefficients in
the impulse response of the filter. Moreover, the cost function has a distinct
minimum that uniquely defines the optimum statistical design of the filter.
We may now summarize the essence of the filtering problem in making the following statement:
Design a linear discrete-time filter whose output y(n) provides an estimate of a desired response d(n), given a set of input samples x(0), x(1), x(2), ..., such that the mean-square value of the estimation error e(n), defined as the difference between the desired response d(n) and the actual response y(n), is minimized.
We may develop the mathematical solution to this statistical optimization
problem by following two entirely different approaches that are complementary.
One approach leads to the development of an important theorem commonly
known as the principle of orthogonality. The other approach highlights the error-performance surface that describes the second-order dependence of the cost function on the filter coefficients. We will proceed by deriving the principle of
orthogonality first, because the derivation is relatively simple and because the
principle of orthogonality is highly insightful.
2.2 Principle of Orthogonality

Consider a linear FIR estimator of length L whose output is

$$y(n) = w^T X(n) = \sum_{k=0}^{L-1} w_k x(n-k), \qquad w, X(n) \in \mathbb{R}^L, \quad n = 0, 1, 2, \ldots \tag{1}$$

and define the mean-squared-error (MSE) cost function

$$J_{MSE}(w) = E[e^2(n)] = E\left[\left(d(n) - y(n)\right)^2\right] \tag{2}$$
where $E[\cdot]$ is the expectation operator and e(n) is the estimation error. Then, the estimation problem can be seen as finding the vector w that minimizes the cost function $J_{MSE}(w)$. The solution to this problem is sometimes called the stochastic least squares solution. If we choose the MSE cost function (2), the optimal solution to the linear estimation problem can be presented as:

$$w_{opt} = \arg\min_{w \in \mathbb{R}^L} J_{MSE}(w) \tag{3}$$

whose closed form, derived in the following, is

$$w_{opt} = R_X^{-1} r_{Xd} \tag{4}$$

with $R_X = E[X(n)X^T(n)]$ and $r_{Xd} = E[X(n)d(n)]$.
As this is a quadratic form, the optimal solution will be at the point where the
cost function has zero gradient, i.e.,
$$\nabla_w J_{MSE}(w) = \frac{\partial J_{MSE}(w)}{\partial w} = 0_{L\times 1} \tag{5}$$
or in other words, the partial derivative of JMSE with respect to each coefficient wk
should be zero. Under this set of conditions the filter is said to be optimum in the
mean-squared-error sense. Using (1) in (2), we can compute the gradient as:
$$\frac{\partial J_{MSE}(w)}{\partial w} = 2E\left[e(n)\,\frac{\partial e(n)}{\partial w}\right] = -2E[e(n)X(n)] \tag{6}$$

Setting the gradient to zero, the condition for optimality is

$$E[e_{min}(n)X(n)] = 0_{L\times 1} \tag{7}$$

or equivalently

$$E[e_{min}(n)\,x(n-k)] = 0, \qquad k = 0, 1, 2, \ldots, L-1 \tag{8}$$
This is called the principle of orthogonality, and it implies that the optimal condition is achieved if and only if the error e(n) is decorrelated from the samples x(n-k), k = 0, 1, ..., L-1. Actually, the error will also be decorrelated from the estimate y(n), since:

$$E[e_{min}(n)\,y_{opt}(n)] = E[e_{min}(n)\,w_{opt}^T X(n)] = w_{opt}^T E[e_{min}(n)X(n)] = 0 \tag{9}$$
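The principle of orthogonality is easy to verify numerically. The following minimal sketch (all signals and parameters are invented for illustration) estimates $R_X$ and $r_{Xd}$ from samples, solves for the Wiener coefficients, and checks that the resulting error is uncorrelated with the input taps:

```python
import numpy as np

rng = np.random.default_rng(0)
L, N = 2, 100_000

# Correlated input x(n) and a desired signal d(n) linearly related to it.
x = np.convolve(rng.standard_normal(N), [1.0, 0.5], mode="same")
d = 0.8 * x - 0.4 * np.roll(x, 1) + 0.1 * rng.standard_normal(N)

# Input vectors X(n) = [x(n), x(n-1)]^T for every n (the circular shift
# is a negligible end effect for large N).
X = np.stack([x, np.roll(x, 1)])        # shape (L, N)

R = X @ X.T / N                         # sample estimate of R_X
r = X @ d / N                           # sample estimate of r_Xd

w_opt = np.linalg.solve(R, r)           # Wiener solution, cf. (4)
e_min = d - w_opt @ X                   # minimum estimation error

# Principle of orthogonality (8): E[e_min(n) x(n-k)] = 0 for k = 0, ..., L-1.
print(X @ e_min / N)                    # both entries close to zero
```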
Fig. 2 Illustration of the principle of orthogonality: the error $e_{min}(n)$ is orthogonal to the input samples and to the estimate $y_{opt}(n)$
2.3 Minimum Mean Squared Error

When the filter operates in its optimum condition, the estimation error is

$$e_{min}(n) = d(n) - y_{opt}(n) \tag{10}$$

or, rearranging,

$$d(n) = y_{opt}(n) + e_{min}(n) \tag{11}$$

and the minimum mean-squared error (MMSE) is

$$J_{MMSE} = E[e_{min}^2(n)] \tag{12}$$
Hence, evaluating the mean-square values of both sides of (11), and applying to it the corollary to the principle of orthogonality described by (9), we get:

$$\sigma_d^2 = \sigma_{y_{opt}}^2 + J_{MMSE} \tag{13}$$

where $\sigma_d^2$ is the variance of the desired response, and $\sigma_{y_{opt}}^2$ is the variance of the estimate $y_{opt}$; both of these random variables are assumed to be of zero mean. Solving (13) for the MMSE, we get:

$$J_{MMSE} = \sigma_d^2 - \sigma_{y_{opt}}^2 \tag{14}$$
This relation shows that, for the optimum filter, the MMSE equals the difference between the variance of the desired response and the variance of the estimate that the filter produces at its output.
It is convenient to normalize the expression in (14) in such a way that the minimum value of the mean-squared error always lies between zero and one. We may do this by dividing both sides of (14) by $\sigma_d^2$, obtaining
$$\frac{J_{MMSE}}{\sigma_d^2} = 1 - \frac{\sigma_{y_{opt}}^2}{\sigma_d^2} \tag{15}$$
Clearly, this is possible because $\sigma_d^2$ is never zero, except in the trivial case of a desired response d(n) that is zero for all n. Let

$$\varepsilon = \frac{J_{MMSE}}{\sigma_d^2} \tag{16}$$
The quantity $\varepsilon$ is called the normalized mean-squared error, in terms of which we may rewrite (15) in the form:

$$\varepsilon = 1 - \frac{\sigma_{y_{opt}}^2}{\sigma_d^2}$$
We note that the ratio $J_{MMSE}/\sigma_d^2$ can never be negative, and the ratio $\sigma_{y_{opt}}^2/\sigma_d^2$ is always positive. We therefore have

$$0 \le \varepsilon \le 1 \tag{17}$$
If $\varepsilon$ is zero, the optimum filter operates perfectly, in the sense that there is complete agreement between the estimate $y_{opt}(n)$ at the filter output and the desired response d(n). On the other hand, if $\varepsilon$ is unity, there is no agreement
whatsoever between these two quantities; this corresponds to the worst possible
situation.
3 Wiener-Hopf Filters

3.1 Wiener-Hopf Equations
Consider a signal x(n) as the input to a finite impulse response (FIR) filter of length L, as shown in Fig. 3, with coefficient vector $w_T = [w_{T,0}, w_{T,1}, \ldots, w_{T,L-1}]^T$. This filtering operation generates an output:

$$y(n) = w_T^T X(n) \tag{18}$$

with $X(n) = [x(n), x(n-1), \ldots, x(n-L+1)]^T$. As the output of the filter is observed, it can be corrupted by an additive measurement noise v(n), leading to a linear regression model for the observed output

$$d(n) = w_T^T X(n) + v(n) \tag{19}$$
Writing the orthogonality condition (7) explicitly at the optimal solution gives

$$E[e_{min}(n)X(n)] = E\left[\left(d(n) - w_{opt}^T X(n)\right)X(n)\right] = 0_{L\times 1} \tag{20}$$

From (20) we can conclude that, given the signals x(n) and d(n), we can always assume that d(n) was generated by the linear regression model (19). To do this, the system $w_T$ would be equal to the optimal filter $w_{opt}$, while v(n) would be associated with the residual error $e_{min}(n)$, which will be uncorrelated with the input x(n).
It should be noticed that (8) is not just a condition for the cost function to reach its minimum, but also a means of testing whether a linear filter is operating in the optimal condition. Here, the principle of orthogonality illustrated in Fig. 2 can be interpreted as follows: at time n, the input vector $X(n) = [x(n), x(n-1)]^T$ will pass through the optimal filter $w_{opt} = [w_{opt,0}, w_{opt,1}]^T$ to generate the output $y_{opt}(n)$. Given d(n), $y_{opt}(n)$ is the only element in the space spanned by x(n) that leads to an error e(n) that is orthogonal to x(n), x(n-1), and $y_{opt}(n)$.
Now we focus on the computation of the optimal solution. From (20), we have:

$$E[X(n)X^T(n)]\, w_{opt} = E[X(n)d(n)] \tag{21}$$

where we write $R_X = E[X(n)X^T(n)]$ and $r_{Xd} = E[X(n)d(n)]$ for the input autocorrelation matrix and the cross-correlation vector, respectively.
Expanding (21) row by row gives

$$\sum_{i=0}^{L-1} w_{opt,i}\, E[x(n-k)x(n-i)] = E[x(n-k)d(n)], \qquad k = 0, 1, \ldots, L-1 \tag{22}$$

The two expectations in (22) may be interpreted as follows:

1. The expectation $E[x(n-k)x(n-i)]$ is equal to the autocorrelation function of the filter input for a lag of $i-k$. We may thus express this expectation as

$$r(i-k) = E[x(n-k)x(n-i)] \tag{23}$$

2. The expectation $E[x(n-k)d(n)]$ is equal to the cross-correlation between the filter input $x(n-k)$ and the desired response d(n) for a lag of k. We may thus express this second expectation as

$$p(k) = E[x(n-k)d(n)] \tag{24}$$

Accordingly, (22) becomes

$$\sum_{i=0}^{L-1} w_{opt,i}\, r(i-k) = p(k), \qquad k = 0, 1, \ldots, L-1 \tag{25}$$
The system of equations (25) defines the optimum filter coefficients, in the most general setting, in terms of two correlation functions: the autocorrelation function of the filter input, and the cross-correlation between the filter input and the desired response. These equations are called the Wiener-Hopf equations.
3.2 Matrix Formulation of the Wiener-Hopf Equations

Let $R_X$ denote the L-by-L correlation matrix of the tap inputs x(n), x(n-1), ..., x(n-L+1) in the FIR filter of Fig. 3. In expanded form, we have
$$R_X = \begin{bmatrix} r(0) & r(1) & \cdots & r(L-1) \\ r(1) & r(0) & \cdots & r(L-2) \\ \vdots & \vdots & \ddots & \vdots \\ r(L-1) & r(L-2) & \cdots & r(0) \end{bmatrix}$$
Correspondingly, let $r_{Xd}$ denote the L-by-1 cross-correlation vector between the tap inputs of the filter and the desired response d(n). In expanded form, we have

$$r_{Xd} = [p(0), p(-1), \ldots, p(1-L)]^T$$

Note that, as the joint process is WSS, the matrix $R_X$ is symmetric, positive semidefinite and Toeplitz. Using these definitions, equation (25) can be put as:
$$R_X w_{opt} = r_{Xd} \tag{26}$$

so that

$$w_{opt} = R_X^{-1} r_{Xd} \tag{27}$$

which is known as the Wiener filter. An alternative way of arriving at it, via the error-performance surface, is presented next.
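Because $R_X$ is symmetric and Toeplitz, (26) can be solved by a Levinson-type recursion in $O(L^2)$ operations rather than by a general $O(L^3)$ solve. A minimal sketch, with invented correlation values, using SciPy's solve_toeplitz:

```python
import numpy as np
from scipy.linalg import solve_toeplitz, toeplitz

# Invented autocorrelation lags r(0..2) and cross-correlation values.
r = np.array([1.0, 0.5, 0.1])       # first column of the Toeplitz R_X
p = np.array([0.6, -0.2, 0.05])     # cross-correlation vector r_Xd

w_dense = np.linalg.solve(toeplitz(r), p)   # direct solution of (26)-(27)
w_fast = solve_toeplitz(r, p)               # Levinson recursion, O(L^2)

assert np.allclose(w_dense, w_fast)
print(w_fast)
```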
3.3 Error Performance Surface

Expanding the cost function (2) in terms of the second-order statistics gives the quadratic error surface

$$J_{MSE}(w) = E[d^2(n)] - 2w^T r_{Xd} + w^T R_X w \tag{28}$$

which can be rewritten, by completing the square, as

$$J_{MSE}(w) = J_{MMSE} + (w - w_{opt})^T R_X (w - w_{opt}) \tag{29}$$

with

$$w_{opt} = R_X^{-1} r_{Xd} \tag{30}$$

Using the fact that $R_X$ is positive definite (and therefore, so is its inverse), it turns out that the cost function reaches its minimum when the filter takes the form of (30), i.e., the Wiener filter. The minimum MSE value (MMSE) on the surface is:
$$J_{MMSE} = J_{MSE}(w_{opt}) = E[|d(n)|^2] - r_{Xd}^T R_X^{-1} r_{Xd} = E[|d(n)|^2] - E[|y_{opt}(n)|^2] \tag{31}$$

In particular, if d(n) is generated by the linear regression model (19), the minimum reduces to the variance of the measurement noise:

$$J_{MMSE} = E[|v(n)|^2] \tag{32}$$

This means that the Wiener solution will be able to identify the system $w_T$ with a residual error limited only by the noise v(n). To study the shape of the error surface, consider the eigendecomposition of the autocorrelation matrix,

$$R_X = Q \Lambda Q^T \tag{33}$$

where Q is the orthonormal matrix of eigenvectors of $R_X$ and $\Lambda$ is the diagonal matrix of its eigenvalues.
Define the weight-error vector

$$\tilde{w} = w - w_{opt} \tag{34}$$

and its transformed version

$$u = Q^T \tilde{w} \tag{35}$$

Using (27)-(31) and (33)-(35), the cost function can be written as

$$J_{MSE}(w) = J_{MMSE} + u^T \Lambda u \tag{36}$$
This is called the canonical form of the quadratic form $J_{MSE}(w)$, and it contains no cross-product terms. Since the eigenvalues are non-negative, it is clear that the surface describes an elliptic hyperparaboloid, with the eigenvectors being the principal axes of the hyperellipses of constant MSE value.
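The canonical form (36) can be checked numerically. The sketch below uses an arbitrary (invented) correlation matrix and cross-correlation vector, forms the eigendecomposition (33), and verifies that the quadratic cost at a trial weight vector matches $J_{MMSE} + u^T \Lambda u$:

```python
import numpy as np

# Invented second-order statistics.
R = np.array([[2.0, 0.7],
              [0.7, 1.5]])
r_xd = np.array([1.0, -0.3])
sigma_d2 = 2.0

w_opt = np.linalg.solve(R, r_xd)
J_mmse = sigma_d2 - r_xd @ w_opt             # cf. (31)

lam, Q = np.linalg.eigh(R)                   # R = Q diag(lam) Q^T, eq. (33)

w = np.array([0.4, 0.1])                     # arbitrary trial weight vector
u = Q.T @ (w - w_opt)                        # (34)-(35)
J_w = sigma_d2 - 2 * w @ r_xd + w @ R @ w    # quadratic cost, cf. (28)

assert np.isclose(J_w, J_mmse + u @ (lam * u))   # canonical form (36)
```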
3.4 Numerical Example

Consider the model shown in Fig. 4. The desired response d(n) is an AR process of order one, generated by applying a zero-mean white-noise process $v_1(n)$ of variance $\sigma_1^2 = 0.27$ to the all-pole transfer function

$$H_1(z) = \frac{1}{1 + 0.8458 z^{-1}} \tag{37}$$

The process d(n) is applied to a communication channel modeled by the all-pole transfer function

$$H_2(z) = \frac{1}{1 - 0.9458 z^{-1}} \tag{38}$$

The channel output x(n) is corrupted by an additive white-noise process $v_2(n)$ of zero mean and variance $\sigma_2^2 = 0.1$, so a sample of the received signal u(n) equals

$$u(n) = x(n) + v_2(n) \tag{39}$$

The white-noise processes $v_1(n)$ and $v_2(n)$ are uncorrelated. It is also assumed that d(n) and u(n), and therefore $v_1(n)$ and $v_2(n)$, are all real valued.
Fig. 4 (a) Autoregressive model of the desired response d(n); (b) model of the noisy communication channel
We begin the analysis by considering the difference equations that characterize the various processes described by the models of Fig. 4. First, the generation of the desired response d(n) is governed by the first-order difference equation

$$d(n) + 0.8458\, d(n-1) = v_1(n) \tag{40}$$

The variance of the process d(n) is then

$$\sigma_d^2 = \frac{\sigma_1^2}{1 - (0.8458)^2} = 0.9486 \tag{41}$$
The process d(n) acts as input to the channel. Hence, from Fig. 4, we find that the channel output x(n) is related to the channel input d(n) by the first-order difference equation

$$x(n) + b_1 x(n-1) = d(n) \tag{42}$$

where $b_1 = -0.9458$. We also observe from the two parts of Fig. 4 that the channel output x(n) may be generated by applying the white-noise process $v_1(n)$ to a second-order all-pole filter whose transfer function equals
$$H(z) = H_1(z) H_2(z) \tag{43}$$

$$H(z) = \frac{1}{1 + a_1 z^{-1} + a_2 z^{-2}} \tag{44}$$

where $a_1 = -0.1$ and $a_2 = -0.8$. Note that both AR processes d(n) and x(n) are WSS.
The received signal u(n) consists of the channel output x(n) plus the additive white noise $v_2(n)$. Since the processes x(n) and $v_2(n)$ are uncorrelated, it follows that the correlation matrix R equals the correlation matrix of x(n) plus the correlation matrix of $v_2(n)$. That is,

$$R = R_X + R_2 \tag{45}$$
For the two-tap filter considered here,

$$R_X = \begin{bmatrix} r_x(0) & r_x(1) \\ r_x(1) & r_x(0) \end{bmatrix}$$

where $r_x(0)$ and $r_x(1)$ are the autocorrelation values of the channel output x(n) for lags of 0 and 1, respectively. For the AR model (44), these evaluate to $r_x(0) = 1$ and $r_x(1) = 0.5$. Hence,

$$R_X = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix} \tag{46}$$
Next we observe that, since $v_2(n)$ is a white-noise process of zero mean and variance $\sigma_2^2 = 0.1$, the 2-by-2 correlation matrix $R_2$ of this process is

$$R_2 = \begin{bmatrix} 0.1 & 0 \\ 0 & 0.1 \end{bmatrix} \tag{47}$$
Thus, substituting (46) and (47) into (45), we find that the 2-by-2 correlation matrix of the received signal u(n) equals

$$R = \begin{bmatrix} 1.1 & 0.5 \\ 0.5 & 1.1 \end{bmatrix} \tag{48}$$
For the cross-correlation vector, we have

$$r_{Xd} = \begin{bmatrix} p(0) \\ p(-1) \end{bmatrix}$$

where p(0) and p(-1) are the cross-correlation values between d(n) and u(n) for lags of 0 and -1, respectively. Since these two processes are real valued, we have

$$p(-k) = p(k) = E[u(n-k)\,d(n)], \qquad k = 0, 1 \tag{49}$$
Substituting (42) into (49), and recognizing that the channel output x(n) is uncorrelated with the white-noise process $v_2(n)$, we get

$$p(k) = r_x(k) + b_1 r_x(k-1), \qquad k = 0, 1 \tag{50}$$

Putting $b_1 = -0.9458$ and using the element values of the correlation matrix $R_X$ given in (46), we obtain $p(0) = 0.5272$ and $p(-1) = -0.4458$. Hence,

$$r_{Xd} = \begin{bmatrix} 0.5272 \\ -0.4458 \end{bmatrix} \tag{51}$$
Error-Performance Surface

For this example, the error-performance surface (28) is

$$J(w_0, w_1) = 0.9486 - 2[0.5272, -0.4458]\begin{bmatrix} w_0 \\ w_1 \end{bmatrix} + [w_0, w_1]\begin{bmatrix} 1.1 & 0.5 \\ 0.5 & 1.1 \end{bmatrix}\begin{bmatrix} w_0 \\ w_1 \end{bmatrix}$$

$$= 0.9486 - 1.0544 w_0 + 0.8916 w_1 + w_0 w_1 + 1.1(w_0^2 + w_1^2)$$
Fig. 6 shows contour plots of the tap weight $w_1$ versus $w_0$ for varying values of the mean-squared error J. We see that the locus of $w_1$ versus $w_0$ for a fixed J takes the form of an ellipse. The elliptical locus shrinks in size as the mean-squared error J approaches the minimum value $J_{min}$. For $J = J_{min}$, the locus reduces to a point with coordinates $w_{o0}$ and $w_{o1}$.
Fig. 5 Error-performance surface of the two-tap FIR filter described in the numerical example
The inverse of the correlation matrix R of (48) is

$$R^{-1} = \begin{bmatrix} 1.1456 & -0.5208 \\ -0.5208 & 1.1456 \end{bmatrix} \tag{52}$$

Hence, substituting (51) and (52) into (27), we get the desired result:

$$w_{opt} = \begin{bmatrix} 1.1456 & -0.5208 \\ -0.5208 & 1.1456 \end{bmatrix}\begin{bmatrix} 0.5272 \\ -0.4458 \end{bmatrix} = \begin{bmatrix} 0.8360 \\ -0.7853 \end{bmatrix} \tag{53}$$
To evaluate the minimum value of the mean-squared error, $J_{min}$, which results from the use of the optimum tap-weight vector $w_{opt}$, we use (31). Hence, substituting (41), (51) and (53) into (31), we get

$$J_{MMSE} = 0.9486 - [0.5272, -0.4458]\begin{bmatrix} 0.8360 \\ -0.7853 \end{bmatrix} = 0.1579 \tag{54}$$
The point represented jointly by the optimum tap-weight vector $w_{opt}$ of (53) and the minimum mean-squared error of (54) defines the bottom of the error-performance surface in Fig. 5, or the center of the contour plots in Fig. 6.
The eigenvalues of the correlation matrix R follow from the characteristic equation

$$(1.1 - \lambda)^2 - (0.5)^2 = 0$$

which yields

$$\lambda_1 = 1.6, \qquad \lambda_2 = 0.6 \tag{55}$$
The locus of $u_2$ versus $u_1$, as defined in (55), traces an ellipse for a fixed value of $J_{MSE} - J_{MMSE}$. In particular, the ellipse has a minor axis of $[(J_{MSE} - J_{MMSE})/\lambda_1]^{1/2}$ along the $u_1$ coordinate and a major axis of $[(J_{MSE} - J_{MMSE})/\lambda_2]^{1/2}$ along the $u_2$ coordinate; this assumes that $\lambda_1 > \lambda_2$, which is how they are related here.
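All of the numbers in this example are easy to reproduce. The short numpy sketch below recomputes (53), (54) and the eigenvalues (55) from the correlation data in (41), (48) and (51):

```python
import numpy as np

R = np.array([[1.1, 0.5],
              [0.5, 1.1]])                    # (48)
r_xd = np.array([0.5272, -0.4458])            # (51)
sigma_d2 = 0.9486                             # (41)

w_opt = np.linalg.solve(R, r_xd)              # (53): [0.8360, -0.7853]
J_min = sigma_d2 - r_xd @ w_opt               # (54): 0.1579
print(w_opt, J_min)

print(np.linalg.eigvalsh(R))                  # (55): [0.6, 1.6]
```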
4 Some Applications of Wiener Filters

4.1 Wiener Filter for Additive Noise Reduction

Consider a signal x(m) observed in broadband additive noise n(m), modeled as:

$$y(m) = x(m) + n(m) \tag{56}$$
Assuming that the signal and the noise are uncorrelated, it follows that the autocorrelation matrix of the noisy signal is the sum of the autocorrelation matrices of the signal x(m) and the noise n(m):

$$R_{yy} = R_{xx} + R_{nn} \tag{57}$$

and

$$r_{xy} = r_{xx} \tag{58}$$

where $R_{yy}$, $R_{xx}$ and $R_{nn}$ are the autocorrelation matrices of the noisy signal, the noise-free signal and the noise, respectively, and $r_{xy}$ is the cross-correlation vector of the noisy signal and the noise-free signal. Substitution of (57) and (58) into the Wiener filter (27) yields

$$w = (R_{xx} + R_{nn})^{-1} r_{xx} \tag{59}$$

Equation (59) is the optimal linear filter for the removal of additive noise.
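As a rough illustration of (59), the sketch below builds sample estimates of $R_{xx}$, $R_{nn}$ and $r_{xx}$ for a synthetic signal. The AR process, noise level and filter length are all invented, and the clean-signal and noise statistics are assumed known separately, as in the derivation:

```python
import numpy as np
from scipy.linalg import toeplitz

rng = np.random.default_rng(1)
L, N = 8, 50_000

# Synthetic low-pass AR(1) signal plus white noise (invented parameters).
x = np.zeros(N)
for m in range(1, N):
    x[m] = 0.9 * x[m - 1] + rng.standard_normal()
n = 0.5 * rng.standard_normal(N)
y = x + n

def autocorr(s, L):
    """Biased sample autocorrelation estimates r(0), ..., r(L-1)."""
    return np.array([s[: len(s) - k] @ s[k:] / len(s) for k in range(L)])

Rxx = toeplitz(autocorr(x, L))
Rnn = toeplitz(autocorr(n, L))
rxx = autocorr(x, L)          # r_xy = r_xx: signal and noise uncorrelated, (58)

w = np.linalg.solve(Rxx + Rnn, rxx)   # time-domain Wiener weights, eq. (59)
x_hat = np.convolve(y, w)[:N]         # denoised estimate of x(m)
```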
In the following, a study of the frequency response of the Wiener filter provides useful insight into its operation. In the frequency domain, the noisy signal Y(f) is given by

$$Y(f) = X(f) + N(f) \tag{60}$$
where X(f) and N(f) are the signal and noise spectra. For a signal observed in additive random noise, the frequency-domain Wiener filter is obtained as

$$W(f) = \frac{P_{XX}(f)}{P_{XX}(f) + P_{NN}(f)} \tag{61}$$
where $P_{XX}(f)$ and $P_{NN}(f)$ are the signal and noise power spectra. Dividing the numerator and the denominator of Equation (61) by the noise power spectrum $P_{NN}(f)$ and substituting the variable $SNR(f) = P_{XX}(f)/P_{NN}(f)$ yields

$$W(f) = \frac{SNR(f)}{SNR(f) + 1} \tag{62}$$

where SNR is a signal-to-noise ratio measure. Note that the variable SNR(f) is expressed in terms of the power-spectral ratio, and not in the more usual terms of log power ratio. Therefore, SNR(f) = 0 corresponds to $-\infty$ dB.
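The gain curve (62) is trivial to tabulate. The following snippet evaluates W(f) over a range of power-ratio SNR values (the grid of values is arbitrary):

```python
import numpy as np

snr = np.logspace(-2, 2, 9)        # power ratios from 0.01 to 100 (arbitrary grid)
W = snr / (snr + 1.0)              # Wiener gain, eq. (62)
for s, g in zip(snr, W):
    print(f"SNR = {s:8.2f} ({10 * np.log10(s):+6.1f} dB)  ->  W(f) = {g:.3f}")
```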
From Fig. 7, the following interpretation of the Wiener filter frequency response W(f) in terms of the signal-to-noise ratio can be deduced. For additive noise, the Wiener filter frequency response is a real positive number in the range $0 \le W(f) \le 1$. Now consider the two limiting cases of (a) a noise-free signal, $SNR(f) = \infty$, and (b) an extremely noisy signal, SNR(f) = 0. At very high SNR, $W(f) \approx 1$, and the filter applies little or no attenuation to the noise-free frequency component. At the other extreme, when SNR(f) = 0, W(f) = 0. Therefore, for additive noise, the Wiener filter attenuates each frequency component in proportion to an estimate of the signal-to-noise ratio. Fig. 7 shows the variation of the Wiener filter response W(f) with the signal-to-noise ratio SNR(f).
Fig. 7 Variation of the gain of the Wiener filter frequency response with SNR(f)
An alternative illustration of the variations of the Wiener filter frequency
response with SNR(f) is shown in Fig. 8. It illustrates the similarity between the
Wiener filter frequency response and the signal spectrum for the case of an
additive white noise disturbance. Note that at a spectral peak of the signal
spectrum, where the SNR(f) is relatively high, the Wiener filter frequency
response is also high, and the filter applies little attenuation. At a signal trough,
the signal-to-noise ratio is low, and so is the Wiener filter response. Hence, for
additive white noise, the Wiener filter response broadly follows the signal
spectrum.
Fig. 8 Illustration of the variation of the Wiener frequency response with the signal spectrum for additive white noise. The Wiener filter response broadly follows the signal spectrum.
4.2 Wiener Channel Equalizer

Consider a channel whose output is

$$y(m) = \sum_{k} h_k x(m-k) + n(m) \tag{63}$$

where x(m) and y(m) are the transmitted and received signals, $[h_k]$ is the impulse response of a linear filter model of the channel, and n(m) models the channel noise. In the frequency domain, (63) becomes
$$Y(f) = X(f)H(f) + N(f) \tag{64}$$

where X(f), Y(f), H(f) and N(f) are the signal, noisy signal, channel and noise spectra, respectively. To remove the channel distortions, the receiver is followed by an equalizer. The equalizer input is the distorted channel output, and the desired signal is the channel input. It is easy to show that the Wiener equalizer in the frequency domain is given by

$$W(f) = \frac{P_{XX}(f)\, H^*(f)}{P_{XX}(f)\, |H(f)|^2 + P_{NN}(f)} \tag{65}$$

where it is assumed that the channel noise and the signal are uncorrelated. In the absence of channel noise, $P_{NN}(f) = 0$, and the Wiener filter is simply the inverse of the channel filter model: $W(f) = H^{-1}(f)$.
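A direct transcription of (65) into code follows. The channel, signal spectrum and noise level below are invented for illustration; the final assertion checks the noise-free limit $W(f) = 1/H(f)$:

```python
import numpy as np

def wiener_equalizer(Pxx, H, Pnn):
    """Frequency response of the Wiener equalizer, eq. (65)."""
    return Pxx * np.conj(H) / (Pxx * np.abs(H) ** 2 + Pnn)

# Invented example: flat signal spectrum, two-tap channel h = [1, 0.5].
H = np.fft.rfft([1.0, 0.5], 256)
Pxx = np.ones(H.shape)
W = wiener_equalizer(Pxx, H, Pnn=0.01)

# In the noise-free limit the equalizer reduces to the channel inverse 1/H(f).
assert np.allclose(wiener_equalizer(Pxx, H, Pnn=0.0), 1.0 / H)
```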
4.3 Time-Alignment of Signals in Multichannel Systems

Consider a signal x(m) received at two sensors:

$$y_1(m) = x(m) + n_1(m) \tag{66}$$

$$y_2(m) = A\, x(m - D) + n_2(m) \tag{67}$$
where $y_1(m)$ and $y_2(m)$ are the noisy observations from channels 1 and 2, $n_1(m)$ and $n_2(m)$ are uncorrelated noise in each channel, D is the time delay of arrival of the two signals, and A is an amplitude scaling factor. Now assume that $y_1(m)$ is used as the input to a Wiener filter and that, in the absence of the signal x(m), $y_2(m)$ is used as the desired signal. The error signal is given by

$$e(m) = y_2(m) - \sum_{k=0}^{P-1} w_k y_1(m-k) = \left[A\, x(m-D) - \sum_{k=0}^{P-1} w_k x(m-k)\right] - \sum_{k=0}^{P-1} w_k n_1(m-k) + n_2(m) \tag{68}$$
The Wiener filter strives to minimize the terms shown inside the square brackets in (68). Using the Wiener-Hopf equations (27), with $y_1(m)$ as input and $y_2(m)$ as desired signal, we have

$$w = (R_{xx} + R_{n_1 n_1})^{-1}\, r_{y_1 y_2} \tag{69}$$

which in the frequency domain becomes

$$W(f) = \frac{P_{XX}(f)}{P_{XX}(f) + P_{N_1 N_1}(f)}\, A\, e^{-j 2\pi f D} \tag{70}$$

Note that in the absence of noise, the Wiener filter becomes a pure phase (or pure delay) filter with a flat magnitude response.
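This suggests a simple delay estimator: form a frequency-domain estimate of the filter from $y_1$ to $y_2$ and locate the peak of its impulse response. A minimal sketch with invented signals (the single-realization spectral estimate is crude, but the delay peak is robust):

```python
import numpy as np

rng = np.random.default_rng(2)
N, D, A = 4096, 12, 0.9            # invented length, delay and scaling

x = rng.standard_normal(N)
y1 = x + 0.1 * rng.standard_normal(N)
y2 = A * np.roll(x, D) + 0.1 * rng.standard_normal(N)

# Frequency-domain estimate of the filter from y1 to y2, cf. eq. (70).
Y1, Y2 = np.fft.rfft(y1), np.fft.rfft(y2)
W = np.conj(Y1) * Y2 / (np.conj(Y1) * Y1)

# W is approximately A e^{-j 2 pi f D}; the peak of its impulse response
# gives the time delay of arrival.
w = np.fft.irfft(W)
print(np.argmax(np.abs(w)))        # -> 12
```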
6 Further Comments
The MSE defined in (2) uses the linear estimator y(n) defined in (1). If we relax the linearity constraint on the estimator and look for a general function of the input, i.e., y(n) = g(x(n)), the optimal estimator in the mean-square sense is given by the conditional expectation E[d(n)|x(n)]. Its calculation requires knowledge of the joint distribution of d(n) and x(n), and in general it is a nonlinear function of x(n) (unless certain symmetry conditions on the joint distribution are fulfilled, as is the case for Gaussian distributions). Moreover, once calculated, it might be very hard to implement. For all these reasons, linear estimators are usually preferred (and, as we have seen, they depend only on second-order statistics).
On a historical note, Norbert Wiener solved a continuous-time
prediction problem under causality constraints by means of an
elegant technique now known as the Wiener-Hopf factorization
technique. This is a much more complicated problem than the one
presented in (3). Later, Norman Levinson formulated the Wiener
filter in discrete time.
It should be noticed that the orthogonality principle used to derive the Wiener filter does not apply to FIR filters only; it can be applied to IIR (infinite impulse response) filtering, and even to noncausal filtering. For the general case, the output of the noncausal filter can be put as:

$$y(n) = \sum_{k=-\infty}^{\infty} w_k x(n-k) \tag{71}$$

and the orthogonality principle leads to the Wiener-Hopf equations

$$\sum_{i} w_{opt,i}\, r_x(k-i) = r_{xd}(k), \qquad \forall k \tag{72}$$
which can be solved using Z-transform methods. In addition, a general expression for the minimum mean square error is

$$J_{MMSE} = r_d(0) - \sum_{k} w_{opt,k}\, r_{xd}(k) \tag{73}$$

From this general case, we can recover the FIR filter studied before (index i in the summation and lag k in (72) go from 0 to L-1) and the causal IIR filter (index i in the summation and lag k in (72) go from 0 to infinity). Finally, we would like to comment on the stationarity of the processes. We assumed the input and reference processes were
on the stationary of the processes. We assume the input and reference processes were
WSS. If this were not the case, the statistics would be time-dependent. However, we
could still find the Wiener filter at each time n as the one that makes the estimation
error orthogonal to the input, i.e., the principle of orthogonality still holds. A less
costly alternative would be to recalculate the filter for every block of N signal
30
samples. However, nearly two decades after Wieners work, Rudolf Kalman
developed the Kalman filter, which is the optimum mean square linear filter for nonstationary processes (evolving under a certain state space model) and stationary ones
(converging in steady state to the Wieners solution).
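A rough sketch of the block-adaptive alternative mentioned above (all parameter choices are placeholders): within each block of N samples, local estimates of $R_X$ and $r_{Xd}$ are formed and (27) is re-solved.

```python
import numpy as np
from scipy.linalg import toeplitz

def block_wiener(x, d, L=4, N=512):
    """Recompute an L-tap Wiener filter for every block of N samples,
    treating the signals as stationary within each block (cf. (27))."""
    y = np.zeros_like(d, dtype=float)
    for start in range(0, len(x) - N + 1, N):
        xb, db = x[start:start + N], d[start:start + N]
        # Local estimates of R_X and r_Xd (circular shifts are a small
        # end effect for N >> L).
        X = np.stack([np.roll(xb, k) for k in range(L)])
        r = np.array([xb[: N - k] @ xb[k:] / N for k in range(L)])
        w = np.linalg.solve(toeplitz(r), X @ db / N)
        y[start:start + N] = w @ X
    return y
```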