
EE 225A

Digital Signal Processing

University of California, Berkeley


March 25, 2009

High Resolution Methods for Parametric Spectral Estimation

1 Sinusoidal Spectral Analysis

Consider a signal $y[n]$, composed of the sum of $K$ complex exponentials and noise:

$$y[n] = \sum_{k=1}^{K} \alpha_k e^{j(\omega_k n + \phi_k)} + v[n]$$

where $\alpha_k$ and $\omega_k$ are deterministic parameters, and $\phi_k$ are random variables uniformly distributed on $[-\pi, \pi]$. The problem is to estimate the unknown amplitudes $\alpha_k$ and frequencies $\omega_k$ from the set of data samples $\{y[n]\}_{n=0}^{N-1}$.
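
As a concrete illustration, the following NumPy sketch generates data from this model. The helper name `make_signal` is ours, and the use of circularly symmetric complex Gaussian noise is an assumption; the notes specify only i.i.d. zero-mean Gaussian noise with variance $\sigma^2$.

    import numpy as np

    def make_signal(N, amps, omegas, sigma, seed=0):
        """Draw y[n] = sum_k alpha_k exp(j(omega_k n + phi_k)) + v[n], n = 0..N-1,
        with phases drawn uniformly on [-pi, pi]."""
        rng = np.random.default_rng(seed)
        n = np.arange(N)
        phis = rng.uniform(-np.pi, np.pi, size=len(amps))
        y = sum(a * np.exp(1j * (w * n + p)) for a, w, p in zip(amps, omegas, phis))
        # Circularly symmetric complex Gaussian noise with total variance sigma^2
        # (an assumption; the notes say only i.i.d. zero-mean Gaussian).
        v = sigma * (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
        return y + v
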
In these notes we explore two subspace methods, the MUSIC algorithm [3] and the ESPRIT algorithm [2], that can be used to solve this problem. We also review the annihilating filter method [6], and examine the use of the annihilating filter with noisy data. In all of these frameworks, we first estimate the $\{\omega_k\}$, which are a nonlinear function of the data. This is the most challenging aspect of sinusoidal spectral analysis. Once the frequencies have been determined, it is straightforward to find the $\{\alpha_k\}$ via linear regression.
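
To make the linear-regression step concrete, here is a minimal sketch (the helper name is hypothetical): given frequency estimates, the model is linear in the complex amplitudes $b_k = \alpha_k e^{j\phi_k}$, so a least-squares fit recovers them.

    import numpy as np

    def estimate_amplitudes(y, omegas):
        """Least-squares fit of the complex amplitudes b_k = alpha_k exp(j phi_k),
        given frequency estimates; returns (alpha_k, phi_k)."""
        n = np.arange(len(y))
        E = np.exp(1j * np.outer(n, omegas))   # N x K matrix, columns e^{j omega_k n}
        b, *_ = np.linalg.lstsq(E, y, rcond=None)
        return np.abs(b), np.angle(b)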

2 Subspace Methods

Subspace methods are based on the common thread of separating the signal subspace from the noise
subspace. We begin by defining some notation that we will use in this section.
Let $y[n] = [y[n]\; y[n-1]\; \ldots\; y[n-M+1]]^T$ denote the vector of $M$ consecutive data samples up to time $n$. Similarly, let $v[n] = [v[n]\; v[n-1]\; \ldots\; v[n-M+1]]^T$ denote the noise vector. Let the signal vector be defined as $x[n] = [\alpha_1 e^{j(\omega_1 n + \phi_1)}\; \ldots\; \alpha_K e^{j(\omega_K n + \phi_K)}]^T$. Finally, we define the transfer vector $a(\omega) = [1\; e^{-j\omega}\; \ldots\; e^{-j(M-1)\omega}]^T$, and the matrix $A = [a(\omega_1)\; \ldots\; a(\omega_K)]$ whose columns are the transfer vectors of the unknown frequencies.

Using this notation, the received signal vector can be written as:

$$y[n] = A x[n] + v[n]$$

2.1 Covariance Matrix

We will assume that the noise samples $v[n]$ are i.i.d., zero-mean Gaussian with variance $\sigma^2$. Therefore, the covariance matrix of $y[n]$ can be expressed as

$$R = E[y[n] y^H[n]] = A E[x[n] x^H[n]] A^H + E[v[n] v^H[n]] = A R_x A^H + \sigma^2 I$$

where $R_x$ is a diagonal matrix with entries $\{\alpha_1^2, \ldots, \alpha_K^2\}$ and $I$ is the $M$ by $M$ identity matrix.

The positive semi-definite matrix $A R_x A^H$ has rank $K$, and therefore has $K$ strictly positive eigenvalues (which we denote by $\tilde{\lambda}_k$) and $M - K$ eigenvalues equal to zero. Thus, the eigenvalues of $R$ can be written as:

$$\lambda_k = \begin{cases} \tilde{\lambda}_k + \sigma^2 > \sigma^2 & k = 1, 2, \ldots, K \\ \sigma^2 & k = K+1, \ldots, M \end{cases}$$

Furthermore, let $q_1, q_2, \ldots, q_M$ denote the associated eigenvectors. Let $Q_s$ denote the $M$ by $K$ matrix whose columns are $q_i$, $1 \le i \le K$, the eigenvectors in the signal subspace, and let $Q_n$ denote the $M$ by $M-K$ matrix whose columns are $q_i$, $K+1 \le i \le M$, the eigenvectors in the noise subspace.

2.1.1 Estimating the Number of Exponentials

The eigenstructure of the covariance matrix provides a convenient method for estimating $K$, the number of exponential components of $y[n]$, when it is not known a priori. We compute and order the eigenvalues of $R$, and use the fact that the $M - K$ smallest eigenvalues are equal, to find $K$.
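
With a sample covariance the smallest eigenvalues are only approximately equal, so some heuristic is needed in practice. One simple choice (ours, not prescribed by the notes) is to place $K$ at the largest gap in the sorted eigenvalues:

    import numpy as np

    def estimate_K(R):
        """Heuristic estimate of the number of exponentials: sort the eigenvalues
        of R in descending order and split at the largest gap."""
        lam = np.sort(np.linalg.eigvalsh(R))[::-1]
        gaps = lam[:-1] - lam[1:]
        return int(np.argmax(gaps)) + 1   # gap after index K-1 => K components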

2.2 MUSIC

MUSIC, Multiple Signal Classification, was proposed by Schmidt in 1979 [3]. The intuition behind MUSIC is that the space spanned by the transfer vectors $a(\omega_k)$ is orthogonal to the noise subspace. The algorithm is based on computing the eigendecomposition of the covariance matrix $R$ and searching for transfer vectors orthogonal to the noise eigenvectors.

2.2.1 Mathematical Derivation

The fundamental fact exploited by MUSIC is that for any $i > K$, we know that $R q_i = \sigma^2 q_i$. We can also rewrite the left side of this equation as $R q_i = (A R_x A^H + \sigma^2 I) q_i$. Taken together, this implies that

$$(A R_x A^H + \sigma^2 I) q_i = \sigma^2 q_i$$
$$A R_x A^H q_i = 0$$

Because the matrix $A R_x$ has full column rank, it follows that $A^H q_i = 0$, which is equivalent to

$$q_i^H a(\omega_k) = 0 \qquad K+1 \le i \le M, \quad 1 \le k \le K$$

In words, this means that the eigenvectors corresponding to the noise eigenvalues (the noise subspace) are orthogonal to the transfer vectors of the components of the signal (the signal subspace). We can find the frequencies of the components by searching for transfer vectors that satisfy this equation for $K+1 \le i \le M$.

In practice, the transfer vectors will not be precisely orthogonal (due to errors in estimating the $q_i$), and so we cannot find frequencies that satisfy this equality exactly. However, we can obtain accurate estimates of the frequencies in the following manner. If this condition were satisfied for all eigenvectors in the noise subspace, then $a^H(\omega) Q_n$ would be equal to the zero vector. We compute the following function:

$$P_{\text{MUSIC}}(\omega) = \frac{1}{a^H(\omega) Q_n Q_n^H a(\omega)}$$

$P_{\text{MUSIC}}$ will have a peak when $\omega$ corresponds to the frequency of one of the exponentials. An example of $P_{\text{MUSIC}}$ is shown in Figure 1.

Figure 1: An example of the function $P_{\text{MUSIC}}$ for a case with $K = 3$ sinusoidal components.

2.2.2 MUSIC Algorithm

The MUSIC algorithm can be summarized as follows:

1. Take $N$ samples of $y[n]$
2. Compute the $M$ by $M$ sample covariance matrix $\hat{R} = \frac{1}{N-M+1} \sum_{n=M-1}^{N-1} y[n] y^H[n]$
3. Form the eigendecomposition of $\hat{R}$
4. Estimate the size of the signal and noise subspaces (if $K$ is not known a priori)
5. Find the $K$ highest peaks of $\dfrac{1}{a^H(\omega) Q_n Q_n^H a(\omega)}$

$M$ is a free parameter that we can adjust. MUSIC can be used as long as $M > K$. For a fixed $N$, a smaller value of $M$ allows more reliable estimation of $\hat{R}$, since the sample covariance matrix is averaged over more data vectors. However, for a given matrix $R$, the signal and noise subspaces can be more reliably estimated for larger $M$. Thus there is a tradeoff in selecting the optimum value of $M$. The main drawback in using MUSIC is that searching for the peaks of $P_{\text{MUSIC}}$ is computationally expensive.
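
A compact NumPy sketch of steps 1-5 is given below. The snapshot construction, the fixed frequency grid, and the crude peak picking are our own simplifications; a careful implementation would locate local maxima of the pseudospectrum rather than the $K$ largest grid values.

    import numpy as np

    def music(y, M, K, grid_size=4096):
        """MUSIC sketch: sample covariance, noise subspace, pseudospectrum peaks."""
        N = len(y)
        # Snapshot vectors y[n] = [y[n], y[n-1], ..., y[n-M+1]]^T, n = M-1..N-1.
        snaps = np.array([y[n - M + 1:n + 1][::-1] for n in range(M - 1, N)])
        R = snaps.T @ snaps.conj() / (N - M + 1)      # sample covariance R_hat
        _, Q = np.linalg.eigh(R)                      # eigenvalues in ascending order
        Qn = Q[:, :M - K]                             # noise subspace eigenvectors
        omega = np.linspace(-np.pi, np.pi, grid_size, endpoint=False)
        # Transfer vectors a(w) = [1, e^{-jw}, ..., e^{-j(M-1)w}]^T as columns.
        A = np.exp(-1j * np.outer(np.arange(M), omega))
        P = 1.0 / np.sum(np.abs(Qn.conj().T @ A) ** 2, axis=0)
        peaks = omega[np.argsort(P)[-K:]]             # crude: K largest grid values
        return np.sort(peaks)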

2.3 ESPRIT

ESPRIT, Estimation of Signal Parameters via Rotational Invariance Techniques, was proposed by Roy
and Kailath in 1986 [2]. Like MUSIC, it relies on forming the eigendecomposition of the covariance
matrix. ESPRIT is based on exploiting the rotational invariance property of the transfer vectors.

2.3.1 Mathematical Derivation

Rotational Invariance

We begin by defining $D$, a $K$ by $K$ diagonal matrix, whose diagonal elements are equal to $\{e^{j\omega_1}, \ldots, e^{j\omega_K}\}$. We also define $\underline{W}$ and $\overline{W}$ to be the matrices obtained by removing the first and last rows of a matrix $W$, respectively.

Considering $A$, the matrix of transfer vectors defined previously, we see that:

$$\overline{A} = \underline{A} D$$

This is called the rotational invariance property.

Signal Space is in Range Space of Transfer Vectors

At this point, we need to prove a lemma, demonstrating that we can write $Q_s$ in the form $Q_s = AC$, for some non-singular matrix $C$. We define $\Lambda$ to be a $K$ by $K$ diagonal matrix whose diagonal entries are $\{\lambda_1, \ldots, \lambda_K\}$. (Recall that $\lambda_k$ are the eigenvalues of $R$.) Similarly, we define $\tilde{\Lambda}$ to be a $K$ by $K$ diagonal matrix whose diagonal entries are $\{\tilde{\lambda}_1, \ldots, \tilde{\lambda}_K\}$, so that $\Lambda = \tilde{\Lambda} + \sigma^2 I$. Because the columns of $Q_s$ are eigenvectors of $R$, we can write:

$$R Q_s = Q_s \Lambda = A R_x A^H Q_s + \sigma^2 Q_s$$
$$Q_s \Lambda - \sigma^2 Q_s = A R_x A^H Q_s$$
$$Q_s \left( \Lambda - \sigma^2 I \right) = A R_x A^H Q_s$$
$$Q_s = A \left( R_x A^H Q_s \tilde{\Lambda}^{-1} \right)$$

This shows that $Q_s = AC$. Also, because both $A$ and $Q_s$ have full column rank, $C = R_x A^H Q_s \tilde{\Lambda}^{-1}$ is non-singular. Thus, we obtain the lemma that we desire. (We note in passing that the specific form of $C$ is not important, only that a non-singular $C$ exists.)

Using Rotational Invariance in Spectral Analysis

Returning to the problem of finding the $\omega_k$, we observe that we can write:

$$\overline{Q}_s = \overline{A} C = \underline{A} D C = \underline{Q}_s C^{-1} D C = \underline{Q}_s \Psi$$

where $\Psi = C^{-1} D C$. At this point, we note the first key fact for ESPRIT: because $\Psi$ and $D$ are related by a similarity transformation, they have the same eigenvalues.

In addition, we note that since $\overline{A}$ and $\underline{A}$ have full column rank, $\overline{Q}_s$ and $\underline{Q}_s$ also have full column rank. Thus, $\underline{Q}_s^H \underline{Q}_s$ is a $K$ by $K$ matrix with rank $K$, and can be inverted. We can obtain $\Psi$ as follows:

$$\underline{Q}_s^H \overline{Q}_s = \underline{Q}_s^H \underline{Q}_s \Psi$$
$$\Psi = \left( \underline{Q}_s^H \underline{Q}_s \right)^{-1} \underline{Q}_s^H \overline{Q}_s$$

This is the second key fact: $\Psi$ is a function of matrices that can be estimated from the eigendecomposition of $R$. After computing $\Psi$, we find its eigenvalues. Because $\Psi$ and $D$ have the same eigenvalues, and $D$ is diagonal, the eigenvalues of $\Psi$ are equal to $e^{j\omega_k}$, the entries in $D$. The unknown frequencies can be found as the arguments of the eigenvalues of $\Psi$.

2.3.2 ESPRIT Algorithm

The ESPRIT algorithm can be summarized as follows:

1. Take $N$ samples of $y[n]$
2. Compute the $M$ by $M$ sample covariance matrix $\hat{R} = \frac{1}{N-M+1} \sum_{n=M-1}^{N-1} y[n] y^H[n]$
3. Form the eigendecomposition of $\hat{R}$
4. Estimate the size of the signal and noise subspaces (if $K$ is not known a priori)
5. Form $Q_s$, the matrix of signal space eigenvectors, and compute $\Psi = \left( \underline{Q}_s^H \underline{Q}_s \right)^{-1} \underline{Q}_s^H \overline{Q}_s$
6. Find $\lambda_k$, the eigenvalues of $\Psi$
7. Compute the frequencies as $\omega_k = \arg(\lambda_k)$

As with MUSIC, $M$ is a free parameter that affects the accuracy of ESPRIT. While ESPRIT requires computing two separate eigendecompositions, it does not require a search over a continuous-valued parameter space as in MUSIC.
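
The following sketch mirrors steps 1-7. As in the MUSIC sketch above, the snapshot/covariance construction and the least-squares form of $\Psi$ are implementation choices of ours.

    import numpy as np

    def esprit(y, M, K):
        """ESPRIT sketch: frequencies from the eigenvalues of Psi."""
        N = len(y)
        snaps = np.array([y[n - M + 1:n + 1][::-1] for n in range(M - 1, N)])
        R = snaps.T @ snaps.conj() / (N - M + 1)
        _, Q = np.linalg.eigh(R)
        Qs = Q[:, -K:]                 # signal subspace: K largest eigenvalues
        Qs_last = Qs[:-1, :]           # last row removed  (overline Q_s)
        Qs_first = Qs[1:, :]           # first row removed (underline Q_s)
        # Psi = (Qs_first^H Qs_first)^{-1} Qs_first^H Qs_last, via least squares.
        Psi = np.linalg.lstsq(Qs_first, Qs_last, rcond=None)[0]
        return np.sort(np.angle(np.linalg.eigvals(Psi)))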

3 Annihilating Filter Method

This algorithm is based on calculating, and then factoring, an annihilating filter which zeros out the
observed data sequence. A variant of this method was proposed by Vetterli [6] for the problem of
sampling non-bandlimited signals. The basic scheme is relatively brittle in the presence of noise, and
hence care must be taken with noisy data.

3.1 Noiseless Case

The annihilating filter can be best understood by first looking at a setting with no noise. Changing notation from $y[n]$ to $y_n$, the observed data can be expressed as

$$y_n = \sum_{k=1}^{K} \alpha_k e^{j(\omega_k n + \phi_k)} = \sum_{k=1}^{K} b_k e^{j\omega_k n} = \sum_{k=1}^{K} b_k u_k^n$$

where $b_k = \alpha_k e^{j\phi_k}$ and $u_k = e^{j\omega_k}$. Let $h = [h_0\; h_1\; \ldots\; h_K]^T$ be a length $K+1$ vector with Z-transform given by:

$$H(z) = \sum_{k=0}^{K} h_k z^{-k} = \prod_{k=1}^{K} \left( 1 - u_k z^{-1} \right)$$

Convolving the sequences $\{y_n\}$ and $\{h_k\}_{k=0}^{K}$ produces

$$h_n * y_n = \sum_m h_m y_{n-m} = \sum_m h_m \sum_{k=1}^{K} b_k u_k^{n-m} = \sum_{k=1}^{K} b_k u_k^n \sum_m h_m u_k^{-m} = 0$$

because the inner summation is equal to $H(z)$ evaluated at $z = u_k$, which equals $0$.


Thus, the intuition behind this algorithm is to first find the annihilating filter $h$, and then factor $h$ to find the zeros $\{u_k\}$. Once we have the zeros, it is simple to find the frequencies $\{\omega_k\}$.

3.1.1 Estimating the Number of Exponentials

Looking at the preceding equations more closely, we can see that any nontrivial filter $\{h_k\}_{k=0,1,\ldots,L}$ with $L \ge K$ that has zeros at $\{u_k\}_{k=1,2,\ldots,K}$ will annihilate the data. The converse (which we do not prove here) is also true: any filter of degree $L$ which annihilates the data has zeros at $\{u_k\}$. We create the following rectangular Toeplitz matrix:

$$Y = \begin{bmatrix} y_L & y_{L-1} & \ldots & y_0 \\ y_{L+1} & y_L & \ldots & y_1 \\ \vdots & \vdots & \ddots & \vdots \\ y_{N-1} & y_{N-2} & \ldots & y_{N-L-1} \end{bmatrix} \qquad (1)$$

An annihilating vector $h = [h_0\; h_1\; \ldots\; h_L]^T$ will satisfy the equation $Yh = 0$.

If $L \ge K$, there are $L - K + 1$ linearly independent polynomials of degree at most $L$ with zeros at $\{u_k\}_{k=1,2,\ldots,K}$, and thus there are $L - K + 1$ linearly independent vectors that satisfy $Yh = 0$. Because the rank of $Y$ and the dimension of the null space of $Y$ must add up to $L + 1$ (the number of columns), the rank of $Y$ is always equal to $K$. This provides a simple method for determining $K$: find the smallest $L$ such that $Y$ is rank deficient; then $K = L$.

3.1.2 Annihilating Filter Algorithm

The annihilating filter algorithm for the noiseless case can be summarized as follows:

1. If $K$ is not known a priori, estimate $K$ by building Toeplitz matrices of various sizes, as described in Section 3.1.1
2. Form a system of equations and solve for the annihilating filter $h$
3. Factor the annihilating filter to find its zeros $\{u_k\}_{k=1,2,\ldots,K}$
4. Calculate the frequencies as $\omega_k = \arg(u_k)$
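
A noiseless sketch in NumPy follows, normalizing $h_0 = 1$, which is consistent with the factored form of $H(z)$ above (the function name is ours):

    import numpy as np

    def annihilating_filter(y, K):
        """Noiseless annihilating filter: solve Yh = 0 with h_0 = 1, then take
        the frequencies from the roots of H(z)."""
        L = K
        # Toeplitz matrix (1) with L = K; row r is [y_{L+r}, y_{L+r-1}, ..., y_r].
        Y = np.array([y[n - L:n + 1][::-1] for n in range(L, len(y))])
        # With h_0 = 1, Y h = 0 becomes Y[:, 1:] h_rest = -Y[:, 0].
        h_rest = np.linalg.lstsq(Y[:, 1:], -Y[:, 0], rcond=None)[0]
        h = np.concatenate(([1.0], h_rest))
        return np.sort(np.angle(np.roots(h)))   # zeros u_k = e^{j omega_k}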

3.2 Noisy Case

In the presence of noise, the data will not be perfectly annihilated. However, finding a vector $h$ that minimizes $\|Yh\|$ is still a reasonable way to approximate the annihilating filter. The following methods are described in the context of sampling non-bandlimited signals in [1].

3.2.1 Total Least Squares Annihilating Filter

Because the matrix $Y$ is noisy (it is composed of the noisy observed data), this is a total least squares minimization. Note that when we minimize $\|Yh\|$ we must include the constraint $\|h\| = 1$, to avoid the trivial solution $h = 0$.

While there are a variety of methods for solving total least squares problems, the singular value decomposition (SVD) is one well-known and robust method [5]. We begin by building the Toeplitz matrix from Section 3.1.1, with $L = K$. If the rank of $Y$ is equal to $K + 1$, then the only solution to $Yh = 0$ is the trivial solution. Thus, to find a non-trivial annihilating filter we must find a matrix of rank $K$ that approximates $Y$.

Given the SVD of $Y$ and the associated dyadic expansion:

$$Y = U S V^H = \sum_{k=1}^{K+1} \sigma_k u_k v_k^H$$

the Eckart-Young-Mirsky theorem states that the rank $r$ matrix $\hat{Y}$ which minimizes the Frobenius norm $\|Y - \hat{Y}\|_F$ is found by truncating the dyadic expansion to the first $r$ singular values of $Y$:

$$\hat{Y} = \sum_{k=1}^{r} \sigma_k u_k v_k^H$$

To find the annihilating filter, we choose $r = K$ to obtain an approximation of rank $K$. It should further be clear that $v_{K+1}$ is in the null space of $\hat{Y}$, because $v_i^H v_{K+1} = 0$ for $1 \le i \le K$. Therefore, the total least squares solution to $Yh = 0$ is given by $h = v_{K+1}$, the right singular vector associated with the smallest singular value. After finding the annihilating filter, we factor it and compute the frequencies $\omega_k$ from its zeros, as in the noiseless case.
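
In code, the TLS solution is one SVD call. NumPy returns the rows of $V^H$, so $v_{K+1}$ is the conjugate of the last row:

    import numpy as np

    def tls_annihilating_filter(y, K):
        """TLS annihilating filter for noisy data: h = v_{K+1}, the right singular
        vector of Y associated with the smallest singular value."""
        L = K
        Y = np.array([y[n - L:n + 1][::-1] for n in range(L, len(y))])
        _, _, Vh = np.linalg.svd(Y)
        h = Vh[-1].conj()
        return np.sort(np.angle(np.roots(h)))
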
3.2.2 Cadzow's Iterative Denoising

While the TLS annihilating filter works well when the noise power is small to moderate, it can be unreliable when the SNR is low. A more robust procedure is to first pre-process the data to remove some of the noise, and then apply the TLS scheme of Section 3.2.1.

One popular pre-processing step is Cadzow's iterative denoising method. It involves iteratively projecting the data onto the space of Toeplitz matrices and the space of rank $K$ matrices. It can be summarized as follows:

1. Choose an $L \ge K$ and build the Toeplitz matrix $Y$
2. Form the SVD $Y = U S V^H$
3. Form a new diagonal matrix $\hat{S}$ by keeping the $K$ largest singular values, and setting the other singular values to $0$
4. Compute a new matrix $\hat{Y} = U \hat{S} V^H$
5. $\hat{Y}$ will no longer be Toeplitz. The best Toeplitz approximation is found by averaging along its diagonals.

These steps are repeated until some stopping condition is met. For example, we might stop when the $(K+1)$th largest singular value is less than a given fraction of the $K$th largest singular value. In practice, usually only a few iterations are necessary.
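
A sketch of the iteration, run for a fixed number of passes (the choices of $L$ and the iteration count are left to the caller; a stopping condition could replace the fixed count):

    import numpy as np

    def cadzow_denoise(y, K, L, n_iter=10):
        """Cadzow denoising sketch: alternate rank-K truncation with the
        Toeplitz projection (averaging along diagonals)."""
        N = len(y)
        yd = np.asarray(y, dtype=complex)
        for _ in range(n_iter):
            # Toeplitz matrix with Y[r, c] = yd[L + r - c], as in (1).
            Y = np.array([yd[n - L:n + 1][::-1] for n in range(L, N)])
            U, s, Vh = np.linalg.svd(Y, full_matrices=False)
            Y = (U[:, :K] * s[:K]) @ Vh[:K]          # keep K largest singular values
            # The diagonal at offset L - n holds every copy of sample n.
            yd = np.array([Y.diagonal(L - n).mean() for n in range(N)])
        return yd
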
3.2.3 Annihilating Filter Algorithm

The annihilating filter algorithm for the noisy case can be summarized as follows:

1. Run the Cadzow denoising algorithm, either until a stopping condition is met or for a fixed number of iterations
2. Form a Toeplitz matrix $Y$ with $L = K$ from the denoised data, and compute the SVD
3. Approximate the annihilating filter with the right singular vector associated with the smallest singular value, $h = v_{K+1}$
4. Factor the annihilating filter to find its zeros $\{u_k\}_{k=1,2,\ldots,K}$
5. Calculate the frequencies as $\omega_k = \arg(u_k)$

Note that we could replace steps 2 and 3 with any other algorithm for solving the total least squares problem.
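
Putting the noisy-case pieces together, using the sketches defined earlier in these notes (all parameter values are illustrative):

    y = make_signal(N=128, amps=[1.0, 0.7, 0.5], omegas=[0.4, 1.1, 1.3], sigma=0.3)
    yd = cadzow_denoise(y, K=3, L=32, n_iter=5)
    omegas_hat = tls_annihilating_filter(yd, K=3)
    alphas_hat, phis_hat = estimate_amplitudes(y, omegas_hat)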

4 Direction of Arrival Estimation

One important application of MUSIC and ESPRIT, in fact the application which motivated the original papers, is the estimation of the direction of arrival with antenna arrays. Consider a uniform linear array of $M$ antennas, as shown in Figure 2. The array receives $K$ narrowband signals $s_k(t)$, each incident at angle $\theta_k$. (For simplicity, we only show one source in the figure.) Our objective is to estimate the angle of arrival of all $K$ sources. We require that $M > K$.

Figure 2: The direction of arrival estimation problem.


Assuming that the source is far from the array, the incoming signal at the receiver is a plane wave. The time delay of arrival at antenna $i$, relative to its arrival at antenna $0$, is $\tau_{k,i} = i \, d \sin(\theta_k)/c$.

In these notes, we will focus on arrays with antenna spacing $d = \lambda/2$. In this case, the time delay is $\tau_{k,i} = \frac{i \sin \theta_k}{2 f_c}$. Making the further assumption that all signals $s_k(t)$ are narrowband, the signal received at antenna $i$ is:

$$x_i(t) = \sum_{k=1}^{K} s_k(t) e^{-j 2\pi f_c \tau_{k,i}} + v_i(t) = \sum_{k=1}^{K} s_k(t) e^{-j \pi i \sin \theta_k} + v_i(t)$$

where $v_i(t)$ is additive noise. Clearly, for a fixed time $t$, the signals at the $M$ antennas are the sum of exponentials at unknown spatial frequencies $\omega_k = \pi \sin \theta_k$. We can use one of the high resolution methods described here to estimate the spatial frequencies, and then extract the angles of arrival.

There are several differences when MUSIC or ESPRIT is used for direction of arrival estimation. First, $M$ is now equal to the number of antennas in the array, and thus is no longer a free parameter that we can adjust. Second, the received signal is now a function of both time and space. The covariance matrix $R$ is estimated from $\{x(t)\}_{t=1}^{T}$, samples of the received vector, whereas in Section 2.3 the covariance matrix was estimated from overlapping blocks of the one-dimensional data.
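
As a sketch, MUSIC applied to array snapshots: the spatial covariance is averaged over $T$ time samples, and the estimated spatial frequencies are mapped back to angles via $\theta_k = \arcsin(\omega_k/\pi)$. The matrix layout and the crude peak picking are our own choices.

    import numpy as np

    def doa_music(X, K, grid_size=2048):
        """DOA sketch: X is M x T, one array snapshot x(t) per column."""
        M, T = X.shape
        R = X @ X.conj().T / T                        # spatial sample covariance
        _, Q = np.linalg.eigh(R)
        Qn = Q[:, :M - K]                             # noise subspace
        omega = np.linspace(-np.pi, np.pi, grid_size, endpoint=False)
        A = np.exp(-1j * np.outer(np.arange(M), omega))
        P = 1.0 / np.sum(np.abs(Qn.conj().T @ A) ** 2, axis=0)
        w_hat = np.sort(omega[np.argsort(P)[-K:]])
        return np.degrees(np.arcsin(w_hat / np.pi))   # angles of arrival in degrees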

References

[1] T. Blu, P.-L. Dragotti, M. Vetterli, P. Marziliano, and L. Coulot, "Sparse Sampling of Signal Innovations," IEEE Signal Processing Magazine, vol. 25, no. 2, pp. 31-40, March 2008.

[2] R. Roy and T. Kailath, "ESPRIT - Estimation of Signal Parameters via Rotational Invariance Techniques," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 7, pp. 984-995, July 1989.

[3] R. Schmidt, "Multiple Emitter Location and Signal Parameter Estimation," IEEE Transactions on Antennas and Propagation, vol. 34, no. 3, pp. 276-280, March 1986.

[4] P. Stoica and R. Moses, Introduction to Spectral Analysis, Prentice-Hall, 1997.

[5] S. Van Huffel and J. Vandewalle, The Total Least Squares Problem: Computational Aspects and Analysis, SIAM, 1991.

[6] M. Vetterli, P. Marziliano, and T. Blu, "Sampling Signals with Finite Rate of Innovation," IEEE Transactions on Signal Processing, vol. 50, no. 6, pp. 1417-1428, June 2002.
