504 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 39, NO. 2, MARCH 1993


Super-Exponential Methods for Blind Deconvolution

Ofir Shalvi and Ehud Weinstein, Senior Member, IEEE
Abstract—A class of iterative methods for solving the blind deconvolution problem, i.e., for recovering the input of an unknown, possibly nonminimum phase linear system by observation of its output, is presented. These methods are universal in the sense that they do not impose any restrictions on the probability distribution of the input process, provided that it is non-Gaussian. Furthermore, they do not require prior knowledge of the input distribution. These methods are computationally efficient, statistically stable (i.e., small error variance), and they converge to the desired solution regardless of initialization (no spurious local stationary points) at a very fast, nearly super-exponential (exponential to the power) rate. The effects of finite length of the data, finite length of the equalizer, and additive noise in the system on the attainable performance (intersymbol interference) are analyzed. It is shown that in many cases of practical interest the performance of the proposed methods is far superior to linear prediction methods, even for minimum phase systems. Recursive and sequential algorithms are also developed, which allow real-time implementation and adaptive equalization of time-varying systems.

Index Terms—Blind deconvolution, channel equalization, system identification, high-order cumulants/spectra.
I. INTRODUCTION
WE CONSIDER the blind deconvolution problem in which we observe the output of an unknown, possibly nonminimum phase linear system from which we want to recover its input using an adjustable linear filter (equalizer). This is a problem of considerable practical interest in diverse fields of engineering and applied sciences. For example, in digital communication the input process is a sequence of data symbols, where the unknown system represents the linear distortion caused by the transmission channel (e.g., a telephone line) between the information source and the receiver. In image restoration, the unobserved input represents the original image (scene), where the unknown system represents the blurring effects caused by the electronic or photographic medium.
Manuscript received October 16, 1990; revised July 13, 1992. This work was supported in part by the Wolfson Research Award administered by the Israel Academy of Sciences and Humanities at Tel-Aviv University, and in part by the Ministry of Communications Award administered by the Chief Scientist Office in the Israeli Ministry of Communications. This work was presented in part at SPIE '91, San Diego, CA, July 1991.
O. Shalvi is with the Department of Electrical Engineering-Systems, Faculty of Engineering, Tel-Aviv University, Ramat-Aviv, Tel-Aviv, 69978, Israel.
E. Weinstein is with the Department of Electrical Engineering-Systems, Faculty of Engineering, Tel-Aviv University, Ramat-Aviv, Tel-Aviv, 69978, Israel. He is also with the Woods Hole Oceanographic Institution, Woods Hole, MA 02543.
IEEE Log Number 9204354.
It is well known that conventional linear prediction methods based on second-order statistics (i.e., correlation or power spectrum) are insufficient for solving the problem, since they are incapable of identifying the phase of the unknown system's transfer function. For that reason, the problem cannot be solved when the input process is Gaussian. Consequently, numerous approaches to blind deconvolution based on high-order moments/cumulants/spectra have been proposed in the recent literature. A partial list of references is given by [1]-[5], [7]-[11], [13]-[18], [20]-[25].
In this paper, we present a class of deconvolution algorithms
that converge iteratively at a very fast nearly super-exponential
(that is, exponential to the power) rate to the desired solution,
regardless of initialization. As shown in [21], these methods
can be linked to the deconvolution criteria proposed in [5],
[10], [20], and [25]. For notational convenience, we shall
concentrate on the one-dimensional deconvolution problem as
appears in e.g., data communication. However, we note that
the basic approach presented in this paper can also be applied
to multidimensional deconvolution problems such as image deblurring.
II. PROBLEM FORMULATION

The basic problem of interest is illustrated in Fig. 1. We observe the output y_t of an unknown discrete time-invariant system H = {h_n} with input a_t being an unobserved realization (sample function) from a discrete-time stationary random process. We want to adjust the equalizer C = {c_n} so that its output z_t is equal to the input a_t up to a constant delay and possibly a constant phase shift. Thus, if we denote by

s_n = c_n * h_n = Σ_k c_k h_{n-k}    (1)

the unit sample response of the unknown system combined (convolved) with the equalizer, then we want to set c_n so that

s_n = e^{jθ} δ_{n-k}    (2)

or, equivalently, that the frequency response of the combined system be

S(ω) = e^{j(θ - ωk)}    (3)

where k stands for the delay, and θ stands for the phase shift. We note that a constant delay is inherent because of the stationarity of the input process, and the constant phase
0018-9448/93$03.00 © 1993 IEEE
Fig. 1. Blind deconvolution model.
shift is also unavoidable if the input distribution is invariant
under rotation.
Two common measures of equalization performance (e.g., see [18]) are the maximum distortion (MD), defined by

MD(s) = (Σ_n |s_n|) / |s_{n0}| - 1

and the intersymbol interference (ISI), defined by

ISI(s) = (Σ_n |s_n|²) / |s_{n0}|² - 1

where s_{n0} is the component of s_n having the maximal absolute value (the leading tap, or cursor). Clearly, MD and ISI are zero if s_n is of the form of (2), and a small value of MD/ISI indicates the proximity to the desired solution.
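As a quick numerical companion (the code and function names are ours, not the paper's), the two measures can be computed directly from a combined response:

```python
def max_distortion(s):
    """MD(s): sum of |s_n| over the leading-tap magnitude, minus 1."""
    mags = [abs(x) for x in s]
    return sum(mags) / max(mags) - 1.0

def isi(s):
    """ISI(s): sum of |s_n|^2 over the leading-tap power, minus 1."""
    pows = [abs(x) ** 2 for x in s]
    return sum(pows) / max(pows) - 1.0

# A perfect response of the form (2) gives MD = ISI = 0.
perfect = [0.0, 1.0, 0.0]
print(max_distortion(perfect), isi(perfect))  # -> 0.0 0.0

# A response with residual taps gives small positive values.
s = [0.1, 1.0, -0.2]
print(round(max_distortion(s), 3))  # (0.1 + 1 + 0.2)/1 - 1 -> 0.3
print(round(isi(s), 3))             # (0.01 + 1 + 0.04)/1 - 1 -> 0.05
```

Both functions return zero exactly at a response of the form (2) and grow with the residual taps.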
In the next section, we present a class of iterative algorithms for adjusting s_n that converge monotonically at a very fast rate to the desired response regardless of initialization. We then show how to convert these virtual s-domain algorithms into realizable c-domain algorithms in which we only need to adjust the unit sample response (taps) c_n of the equalizer.
III. THE ALGORITHM IN THE s-DOMAIN
Consider the following iterative procedure for adjusting the unit sample response coefficients (taps) of the combined system:

s_n' = s_n^p (s_n^*)^q    (4a)

s_n'' = s_n' / ||s'||    (4b)

where s_n are the tap values before the iteration, s_n' are intermediate values, and s_n'' are the tap values after the iteration. We denote by ||s|| = sqrt(s†s) = sqrt(Σ_n |s_n|²) the norm of the (infinite-dimensional) vector s, and by * and † the conjugate and conjugate-transpose operations, respectively.

If we choose p and q to be nonnegative integers such that p + q ≥ 2, then the transformation in (4a), followed by the normalization operation in (4b), causes the taps with smaller magnitudes to decrease more rapidly, forcing s_n to converge very quickly to the desired response in which one tap (the leading tap) approaches one in magnitude, while all other taps approach zero. The effect of the algorithm is illustrated in Fig. 2 for the case p = 2 and q = 1. Suppose we start with a normalized s_n proportional to (0.8)^{|n|}, which corresponds to an initial ISI of 3.56. After one iteration we obtain s_n = 0.76 (0.512)^{|n|}, which corresponds to ISI = 0.71. On the next iteration cycle we obtain s_n = 0.98 (0.134)^{|n|}, corresponding to ISI = 0.037, which is already quite close to the desired response.
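The iteration just described is easy to reproduce numerically. The sketch below (our own illustration; the truncation to |n| ≤ 60 is an implementation convenience) applies (4a), (4b) with p = 2, q = 1 to the initial response above and reproduces the quoted ISI values:

```python
import math

def iterate(s, p=2, q=1):
    """One cycle of (4a)-(4b): s'_n = s_n^p (s_n*)^q, then normalize."""
    s1 = [x ** p * x.conjugate() ** q for x in s]
    norm = math.sqrt(sum(abs(v) ** 2 for v in s1))
    return [v / norm for v in s1]

def isi(s):
    pows = [abs(x) ** 2 for x in s]
    return sum(pows) / max(pows) - 1.0

# Initial normalized response proportional to 0.8**|n|, truncated at |n| <= 60.
N = 60
s = [0.8 ** abs(n) for n in range(-N, N + 1)]
norm = math.sqrt(sum(v * v for v in s))
s = [v / norm for v in s]

isi0 = isi(s)
s = iterate(s)
isi1 = isi(s)
s = iterate(s)
isi2 = isi(s)
print(round(isi0, 2), round(isi1, 2), round(isi2, 3))  # 3.56 0.71 0.037
```

Note how each iteration cubes the geometric decay factor (0.8 → 0.512 → 0.134), which is exactly the tap-shrinking mechanism described above.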
Fig. 2. Effect of the algorithm in the time domain. (a) Before first iteration. (b) After first iteration. (c) After second iteration.
Transforming (4a) and (4b) to the frequency domain, we obtain

S'(ω) = (1/2π)^{p+q-1} [S * ··· * S * S̃ * ··· * S̃](ω)    (5a)

where the (p+q-1)-fold periodic convolution contains p terms S(ω) and q terms S̃(ω) = S^*(-ω), and

S''(ω) = S'(ω) / sqrt((1/2π) ∫ |S'(ν)|² dν)    (5b)

where S(ω), S'(ω), and S''(ω) are the Fourier transforms of s_n, s_n', and s_n'', respectively. To demonstrate the effect of the algorithm in the frequency domain, consider the case p = 2 and q = 0. Suppose the initial S(ω) has a narrow spectral null of width ε at frequency ω0:

S(ω) = 0,  |ω - ω0| ≤ ε/2
S(ω) = 1,  |ω - ω0| > ε/2.
After performing one iteration cycle we obtain (see Fig. 3)

S''(ω) ≈ 1 - (ε/2π)(1 - |ω - 2ω0|/ε),  |ω - 2ω0| ≤ ε
S''(ω) ≈ 1,                             |ω - 2ω0| > ε.

If we regard the difference between S(ω) and the desired response in (3) as a "disturbance," then a single iteration of the algorithm causes the width of the "disturbance" to be multiplied by a factor of 2, the location of the peak of the "disturbance" to be multiplied by a factor of 2, and the peak value to be multiplied by a factor of ε/2π, which essentially eliminates the
Fig. 3. Effect of the algorithm in the frequency domain. (a) Before the iteration. (b) After the iteration.
spectral null. With arbitrary p and q, we obtain similar effects: the width of the "disturbance" is multiplied by a factor of (p + q), and the peak value is multiplied by a factor smaller than ε^{p+q-1}.
To analyze the convergence behavior of the algorithm, we note that by (4a) and (4b),

|s_n''| = |s_n|^{p+q} / ||s'||.    (6)

Since p + q ≥ 2, then

|s_{n1}''| / |s_{n2}''| = (|s_{n1}| / |s_{n2}|)^{p+q} ≥ 1  whenever |s_{n1}| ≥ |s_{n2}|.    (7)

That is, the indices n0, n1, n2, ... of the ordered components of s are preserved throughout the iterations. In particular, the index of the leading tap is preserved along the iterations. Thus,

MD(s'') = Σ_{n≠n0} |s_n''| / |s_{n0}''| = Σ_{n≠n0} (|s_n| / |s_{n0}|)^{p+q} ≤ (Σ_{n≠n0} |s_n| / |s_{n0}|)^{p+q} = [MD(s)]^{p+q}    (8)

and

ISI(s'') = Σ_{n≠n0} (|s_n| / |s_{n0}|)^{2(p+q)} ≤ (Σ_{n≠n0} |s_n|² / |s_{n0}|²)^{p+q} = [ISI(s)]^{p+q}.    (9)
Performing (8) and (9) iteratively we obtain, respectively,

MD(s^(l)) ≤ [MD(s^(0))]^((p+q)^l)    (10)

ISI(s^(l)) ≤ [ISI(s^(0))]^((p+q)^l)    (11)

where s^(0) = (..., s_{-1}^(0), s_0^(0), s_1^(0), ...)^T denotes the initial value of s, and s^(l) denotes the value of s after l iterations of the algorithm. These inequalities indicate that MD(s^(l)) and ISI(s^(l)) converge to zero at a very fast, at least super-exponential (that is, exponential to the power) rate, provided that MD(s^(0)) and ISI(s^(0)) are smaller than 1. Since by (4b) ||s^(l)|| = 1, s^(l) must then converge to a vector having only one nonzero component whose magnitude equals one, i.e., to the desired response.
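The bounds (10) and (11) are easy to check numerically; in the sketch below (ours), the initial response is an arbitrary choice satisfying MD, ISI < 1, with p = 2 and q = 1:

```python
import math

def iterate(s, p=2, q=1):
    s1 = [x ** p * x.conjugate() ** q for x in s]
    norm = math.sqrt(sum(abs(v) ** 2 for v in s1))
    return [v / norm for v in s1]

def md(s):
    mags = [abs(x) for x in s]
    return sum(mags) / max(mags) - 1.0

def isi(s):
    pows = [abs(x) ** 2 for x in s]
    return sum(pows) / max(pows) - 1.0

s = [1.0, 0.2, -0.1, 0.05, 0.15]     # MD(s) = 0.5, ISI(s) = 0.075
md_bound, isi_bound = md(s), isi(s)
for l in range(1, 4):
    s = iterate(s)                    # p + q = 3
    md_bound = md_bound ** 3          # bound (10): MD^{(p+q)^l}
    isi_bound = isi_bound ** 3        # bound (11): ISI^{(p+q)^l}
    assert md(s) <= md_bound and isi(s) <= isi_bound
print("bounds (10) and (11) hold for l = 1, 2, 3")
```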
To prove convergence to the desired response under weaker conditions, we invoke (6) and (7) once again:

MD(s'') = Σ_{n≠n0} (|s_n| / |s_{n0}|)^{p+q} ≤ (|s_{n1}| / |s_{n0}|)^{p+q-1} Σ_{n≠n0} |s_n| / |s_{n0}| = (|s_{n1}| / |s_{n0}|)^{p+q-1} MD(s)    (12)

where n1 is the index of the second-largest component of s in absolute value. Performing (12) iteratively,

MD(s^(l)) ≤ [ Π_{t=0}^{l-1} (|s_{n1}^(t)| / |s_{n0}^(t)|)^{p+q-1} ] MD(s^(0)).    (13)

By (6),

|s_{n1}^(t)| / |s_{n0}^(t)| = (|s_{n1}^(0)| / |s_{n0}^(0)|)^((p+q)^t).    (14)
Thus, substituting (14) into (13) and following straightforward algebraic manipulations,

MD(s^(l)) ≤ γ^((p+q)^l - 1) MD(s^(0))    (15)

where

γ = |s_{n1}^(0)| / |s_{n0}^(0)|.

In a similar way, we obtain

ISI(s^(l)) ≤ γ^(2[(p+q)^l - 1]) ISI(s^(0)).    (16)

Since MD(s^(0)) ≥ γ and ISI(s^(0)) ≥ γ², the bounds in (15) and (16) are tighter than the bounds in (10) and (11), respectively. Furthermore, the convergence of these bounds to zero (and thus the convergence to the desired response) only requires the weaker condition that γ < 1, i.e., that the leading tap of the initial response is unique.
In the case where the leading tap of the initial response is not unique, i.e., there are r > 1 distinct taps that exactly achieve the maximal absolute value, it is easy to verify that the algorithm converges to a vector having r nonzero components of equal magnitude 1/sqrt(r). However, these convergence points are unstable equilibria, any small deviation from which is sufficient to ensure convergence to the desired response. For example, in case p + q = 3 and γ = 0.7, then by (16) we need l = 2 iterations to improve the initial ISI by a factor of at least 100. If γ = 0.9999, which represents a very small deviation from the indicated anomaly, we need l = 10 iterations for a factor 100 improvement in the initial ISI.
IV. THE ALGORITHM IN THE c-DOMAIN

The s-domain algorithm presented in the previous section is stated in terms of the taps s_n of the combined system which, of course, are unavailable for adjustment. In this section we show how to implement the algorithm in the c-domain using only the taps c_n of the equalizer and some statistical cumulants of the observed data. Since the equalizer being used is of finite length, the s-domain algorithm in (4a), (4b) with the associated super-exponential convergence rate can only be approximated. Since we use empirical cumulants, obtained by sample averages over the available data, instead of the exact cumulants, random errors are introduced, and the presence of additive noise can only add to these errors. All these effects must be taken into consideration.

The organization of this main section is as follows. In Section IV-A, we define cumulants and present some of their properties that will be used in the sequel. In Section IV-B, we present the c-domain algorithm subject to the finite-length restriction (still with the exact cumulants). In Section IV-C, we present a realizable algorithm obtained by replacing the unavailable cumulants by their empirical estimates, and analyze its statistical stability (error variance) for a few special cases. In Section IV-D, we briefly address the effect of additive noise in the system. In Section IV-E, we present simulation results to confirm the analysis. In Section IV-F, we develop sequential and adaptive approximations of the algorithm.
A. Mathematical Preliminaries

Let x1, x2, ..., xn be a set of real/complex random variables possessing the joint characteristic function

Φ(ω) = E{e^{jω^T x}}

where j = sqrt(-1) and E{·} stands for the expectation operation. The joint cumulant of x1, x2, ..., xn is defined by

cum(x1; x2; ...; xn) = (-j)^n ∂^n ln Φ(ω) / ∂ω1 ∂ω2 ··· ∂ωn |_{ω=0}

where ln(·) stands for the natural logarithm. An alternative definition of cumulants in terms of moments is given in [3, ch. 2, definition 2.3.1]. Thus, if x1, x2, ... are zero-mean random variables, then

cum(x1; x2) = E{x1 x2}
cum(x1; x2; x3) = E{x1 x2 x3}
cum(x1; x2; x3; x4) = E{x1 x2 x3 x4} - cum(x1; x2) cum(x3; x4)
                      - cum(x1; x3) cum(x2; x4) - cum(x1; x4) cum(x2; x3)

and so on.
For notational convenience, let

cum(x : p; y : q) = cum(x; ...; x; y; ...; y)

where x appears p times and y appears q times. The following properties can easily be verified (see [3]).

P1) Linearity: cum(Σ_i a_i x_i; y; ...) = Σ_i a_i cum(x_i; y; ...).
P2) If x1, x2, ..., xn can be divided into two statistically independent subsets, then their joint cumulant is zero.
P3) If x1, x2, ..., xn are jointly Gaussian, then their joint cumulant is zero whenever n > 2.
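Property P3) is easy to observe empirically. The sketch below (ours; sample size and seed are arbitrary choices) estimates the fourth-order auto-cumulant of zero-mean real data, cum(x : 4) = E{x⁴} - 3E²{x²}, which vanishes for Gaussian data but not for uniform data:

```python
import random

def cum4(xs):
    """Empirical fourth-order cumulant of zero-mean real samples."""
    n = len(xs)
    m2 = sum(x * x for x in xs) / n
    m4 = sum(x ** 4 for x in xs) / n
    return m4 - 3 * m2 * m2

random.seed(0)
N = 200_000
gauss = [random.gauss(0.0, 1.0) for _ in range(N)]
unif = [random.uniform(-1.0, 1.0) for _ in range(N)]

print(round(cum4(gauss), 2))   # close to 0 (P3)
print(round(cum4(unif), 2))    # close to -2/15, i.e., clearly nonzero
```

For the uniform distribution on [-1, 1], E{x²} = 1/3 and E{x⁴} = 1/5, so the exact cumulant is 1/5 - 3/9 = -2/15.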
B. Algorithm Implementation Using Finite-Length Equalizers

In this subsection, we show how to express the s-domain algorithm (4a), (4b) in terms of the equalizer taps and some data cumulants. We shall make the following assumptions.

1) The input a_t, t = 0, ±1, ±2, ..., is a sequence of real/complex independent identically distributed (i.i.d.) continuous/discrete type random variables. We assume the existence of the following nonzero cumulants of a_t:

σ_a² = E{|a_t|²} ≠ 0    (17)

cum(a_t : p; a_t^* : q + 1) ≠ 0    (18)

where p and q are the parameters appearing in (4a). Note that since p + q ≥ 2, then by P3) a_t must be non-Gaussian.

Comment: It is sufficient to assume that a_t, t = 0, ±1, ±2, ..., are statistically independent random variables. However, for the sake of simplicity, we further assume that they are also identically distributed.

2) The unknown system H = {h_n} is a stable, possibly nonminimum phase, linear time-invariant filter whose inverse (which may be noncausal) H^{-1} = {h_n^{-1}} exists.

3) The equalizer C = {c_n} is a tap-delay line of finite length L = L2 - L1 + 1.
The combined system response subject to the finite-length restriction is

s_n = Σ_{l=L1}^{L2} c_l h_{n-l}

or, in vector form,

s = Hc    (19)

where s is the (possibly infinite) vector of taps of the combined system, c is the (L × 1) vector of equalizer taps

c = (c_{L1}, c_{L1+1}, ..., c_{L2})^T

and H is the matrix with L columns and a possibly infinite number of rows, whose elements are

H_{n,l} = h_{n-l}.    (20)
In the deconvolution problem, we want to adjust c so that s = Hc is equal to the vector δ^(k) whose nth element is δ_{n-k}, for some fixed k (for simplicity, we shall ignore the constant phase shift). However, since c is of finite length, we may only require that c be chosen to minimize the distance (norm) between Hc and δ^(k). Assuming that H is known, this is a standard linear least squares problem whose solution is

c = (H†H)^{-1} H† δ^(k)
s = Hc = H(H†H)^{-1} H† δ^(k)    (21)

where we note that by assumption 2) H is full rank, so that H†H is invertible for any L. This is the optimal solution in the least squares sense, and also in the sense of minimizing the mean-square restoration error, since

E{|z_t - a_{t-k}|²} = σ_a² ||s - δ^(k)||².

No equalizer, with or without a training sequence, can do better than the solution in (21), even if H is known a priori. As L increases, the matrix product H(H†H)^{-1}H† approaches the identity matrix (more precisely, the kth column of H(H†H)^{-1}H† approaches δ^(k)), in which case the solution in (21) closely approaches the desired solution.
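The least squares solution (21) can be illustrated numerically. In the sketch below, the two-tap channel, the equalizer length, and the plain Gaussian-elimination solver are all our own illustrative choices:

```python
def convolve(h, c):
    out = [0.0] * (len(h) + len(c) - 1)
    for i, hv in enumerate(h):
        for j, cv in enumerate(c):
            out[i + j] += hv * cv
    return out

def solve(A, b):
    # Gaussian elimination with partial pivoting (helper, not from the paper).
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def isi(s):
    p = [v * v for v in s]
    return sum(p) / max(p) - 1.0

h = [1.0, 0.5]                       # simple minimum-phase channel (example)
L = 8                                # equalizer length (illustrative)
rows = len(h) + L - 1
H = [[h[n - l] if 0 <= n - l < len(h) else 0.0 for l in range(L)]
     for n in range(rows)]           # H_{n,l} = h_{n-l}, as in (20)
k = 0                                # target delay (illustrative)
delta = [1.0 if n == k else 0.0 for n in range(rows)]

# Normal equations of (21): (H'H) c = H' delta
HtH = [[sum(H[r][i] * H[r][j] for r in range(rows)) for j in range(L)]
       for i in range(L)]
Htd = [sum(H[r][i] * delta[r] for r in range(rows)) for i in range(L)]
c = solve(HtH, Htd)
s = convolve(h, c)
print(round(isi(s), 8))              # small residual ISI (finite equalizer length)
```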
Having this result in mind, we now turn to the algorithm specified by (4a), (4b). According to (4a), we want to set c so that s = Hc is equal to the vector d whose components are

d_n = s_n^p (s_n^*)^q.    (22)

However, since the number of equalizer taps available for adjustment is finite, we shall only require that c be chosen to minimize the distance (norm) between Hc and d, i.e.,

c' = (H†H)^{-1} H† d.    (23a)

The normalization operation in (4b) is equivalent to

c'' = c' / sqrt(c'† H†H c').    (23b)

Projecting the c-domain algorithm in (23a), (23b) back into the s-domain, we obtain

s' = Hc' = H(H†H)^{-1} H† d    (24a)

s'' = Hc'' = s' / ||s'||.    (24b)

Once again, if H(H†H)^{-1}H† approaches the identity matrix (i.e., if the equalizer is sufficiently long to allow good restoration), then the algorithm in (24a), (24b) closely approaches (4a), (4b) with the associated super-exponential path and global convergence properties. A method for adjusting the length of the equalizer jointly with its taps in order to guarantee global convergence is given in [21].

It is not difficult to prove that the point of convergence of the algorithm subject to the finite-length restriction is given approximately by

s^(∞) = H(H†H)^{-1} H† δ^(k) / sqrt((δ^(k))† H(H†H)^{-1} H† δ^(k))    (25)

(just substitute s^(∞) into (24a), (24b) and show that, to a first-order approximation, s'' = s^(∞) up to a constant phase shift), and the ISI at that point is

ISI(s^(∞)) = [1 - (δ^(k))† H(H†H)^{-1} H† δ^(k)] / (δ^(k))† H(H†H)^{-1} H† δ^(k).    (26)

We see that s^(∞) coincides with (21) up to a gain factor which is very close to unity. Thus, by iteratively performing the algorithm in (23a), (23b), we converge to the optimal attainable solution subject to the finite-length restriction. The residual ISI given by (26) is close to zero for sufficiently large L. There is a weak dependence on the index k of the leading tap, which can be exploited through initialization, since the index of the leading tap is preserved throughout the iterations.

The algorithm in (23a), (23b) is still implicit, since H and d are unknown. Our next step is to convert it into a realizable algorithm expressed in terms of joint cumulants of the input and the output of the equalizer. We start from

z_t = Σ_{l=L1}^{L2} c_l y_{t-l}.
Then, by (19),

z_t = Σ_n s_n a_{t-n}

and, invoking properties P1) and P2),

cum(z_t : p; z_t^* : q; y_{t-n}^*) = Σ_l s_l^p (s_l^*)^q h_{l-n}^* cum(a_t : p; a_t^* : q + 1)
                                   = cum(a_t : p; a_t^* : q + 1) (H†d)_n    (27)

where d is the vector defined in (22). Invoking P1) once again,

E{y_{t-l} y_{t-n}^*} = σ_a² Σ_m h_{m-l} h_{m-n}^* = σ_a² (H†H)_{n,l}.    (28)
Combining (27) and (28), the exact-statistics form of the c-domain algorithm in (23a), (23b) becomes

c' = R^{-1} d̃
c'' = c' / sqrt(σ_a^{-2} c'† R c')

where R is the L × L matrix whose elements are the second-order statistics of the observed data,

R_{n,l} = E{y_{t-l} y_{t-n}^*} = σ_a² (H†H)_{n,l}

and d̃ is the L × 1 vector whose elements are the joint cumulants of the equalizer input and output,

d̃_n = cum(z_t : p; z_t^* : q; y_{t-n}^*) = cum(a_t : p; a_t^* : q + 1) (H†d)_n.

The constant factors σ_a² and cum(a_t : p; a_t^* : q + 1) only scale c' and are eliminated by the normalization, so they affect the combined response only through an immaterial constant phase. Thus, to implement the algorithm we only need to know (estimate) the indicated correlations and joint cumulants of the observed data; neither the system H nor the input distribution needs to be known beyond the conditions in (17) and (18). We also note that R does not depend on the equalizer taps, so it needs to be computed (and inverted or factored) only once, and in each iteration cycle only the vector d̃ is updated.
C. The Algorithm with Empirical Cumulants

In practice, the exact cumulants are unknown and can only be estimated from the observed data. Therefore, instead of the exact correlations and cumulants above, we use

ĉ' = R̂^{-1} d̂    (37a)

ĉ'' = ĉ' / sqrt(σ_a^{-2} ĉ'† R̂ ĉ')    (37b)

where R̂ is the L × L matrix whose elements are

R̂_{n,l} = Ê{y_{t-l} y_{t-n}^*}    (38)

and d̂ is the L × 1 vector whose elements are

d̂_n = ĉum(z_t : p; z_t^* : q; y_{t-n}^*)    (39)

where Ê{·} and ĉum(·) denote empirical estimates obtained by sample averages over the N available data samples. Thus,

Ê{y_{t-l} y_{t-n}^*} = (1/N) Σ_t y_{t-l} y_{t-n}^*.    (40)

For p = 1 and q = 1 (third-order cumulants), the empirical cumulant of the zero-mean data is simply the third-order moment,

ĉum(z_t : 1; z_t^* : 1; y_{t-n}^*) = Ê{|z_t|² y_{t-n}^*}    (41)

and for p = 2 and q = 1 (fourth-order cumulants),

ĉum(z_t : 2; z_t^* : 1; y_{t-n}^*) = Ê{|z_t|² z_t y_{t-n}^*} - 2 Ê{|z_t|²} Ê{z_t y_{t-n}^*} - Ê{z_t²} Ê{z_t^* y_{t-n}^*}.    (42)
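The complete procedure can be sketched end-to-end. The following illustration (ours, not the authors' code) runs the empirical-cumulant iteration (37a), (37b) with p = 2, q = 1 on a real binary source through the FIR channel used later in Section IV-E; the block length, seed, and plain linear solver are implementation choices, and the real-data reduction of (42), ĉum = Ê{z³y} - 3Ê{z²}Ê{zy}, is used:

```python
import math, random

def solve(A, b):
    # Gaussian elimination with partial pivoting (helper, not from the paper).
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def isi(s):
    p = [v * v for v in s]
    return sum(p) / max(p) - 1.0

random.seed(7)
h = [0.4, 1.0, -0.7, 0.6, 0.3, -0.4, 0.1]   # channel from Section IV-E
N, L = 8000, 16                              # block length is our choice
a = [random.choice((-1.0, 1.0)) for _ in range(N)]
y = [sum(h[k] * a[t - k] for k in range(len(h)) if t - k >= 0)
     for t in range(N)]

T = range(L, N)
# Empirical correlation matrix (38), (40) -- computed only once.
R = [[sum(y[t - n] * y[t - l] for t in T) / len(T) for l in range(L)]
     for n in range(L)]

c = [0.0] * L
c[3] = 1.0                                   # spike initialization
for _ in range(5):
    z = [sum(c[l] * y[t - l] for l in range(L)) for t in T]
    Ez2 = sum(v * v for v in z) / len(z)
    d = []
    for n in range(L):
        yn = [y[t - n] for t in T]
        Ez3y = sum(v ** 3 * w for v, w in zip(z, yn)) / len(z)
        Ezy = sum(v * w for v, w in zip(z, yn)) / len(z)
        d.append(Ez3y - 3.0 * Ez2 * Ezy)     # real-data form of (42)
    c = solve(R, d)                          # (37a)
    power = sum(c[n] * R[n][l] * c[l] for n in range(L) for l in range(L))
    c = [v / math.sqrt(power) for v in c]    # (37b) with sigma_a^2 = 1

s = [sum(h[k] * c[n - k] for k in range(len(h)) if 0 <= n - k < L)
     for n in range(len(h) + L - 1)]
print("residual ISI:", round(isi(s), 4))     # small, bias-limited value
```

The residual ISI after a few iterations is dominated by the finite equalizer length, consistent with the analysis below.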
By standard limit theorems, these cumulant estimates are consistent with probability one, provided that E{|a_t|^{2(p+q+1)}} < ∞, in which case the empirical algorithm in (37a), (37b) approaches its exact-statistics counterpart as N grows. However, we note that there may be more efficient techniques for estimating cumulants from finite data records.

Equations (37a)-(42) completely specify the algorithm for the case p = 1, q = 1 and for the case p = 2, q = 1, as it operates on the observed data, i.e., on the input and output of the equalizer. Concerning the computational complexity of the algorithm, we note that the elements of R̂ (to be computed only once) are obtained by simple correlation operations, the computation of the equalizer output z_t requires 2L operations per data sample, and the estimation of the elements of d̂ requires O(L) operations per sample. Therefore, the number of operations per iteration cycle is essentially equivalent to that of linear prediction, and since the algorithm converges within a few iteration cycles, its overall computational complexity is not essentially larger than that of predictive deconvolution.

The use of empirical cumulants instead of the exact ones may deflect the point of convergence of the algorithm. A complete analysis is carried out in Appendix A. Assuming that the cumulant estimates are consistent, the point of convergence of (37a), (37b) (up to a constant delay and a constant phase shift) is given approximately by

s = H(H†H)^{-1} H† v / sqrt(v† H(H†H)^{-1} H† v)    (43)

where v is the vector whose elements are proportional to the empirical cumulants ĉum(a_t : p; a_t^* : q; a_{t-n}^*) of the (unobserved) input process, normalized so that E{v} = δ^(k). Using the expression in (43), and assuming that the cumulant estimates are unbiased, we obtain

E{s} ≈ H(H†H)^{-1} H† δ^(k) / sqrt((δ^(k))† H(H†H)^{-1} H† δ^(k))    (45)

i.e., the convergence point of the algorithm based on empirical cumulants coincides, on average, with the optimal attainable solution in (25), the remaining deviation being essentially due to the finite length of the equalizer. Taking the covariance of (43),

cov(s) ≈ H(H†H)^{-1} H† cov(v) H(H†H)^{-1} H†    (46)

where cov(s) and cov(v) stand for the covariance matrices of s and v, respectively.
The expected ISI at the point of convergence can be closely approximated by

E{ISI(s)} = E{ (1 / |s_{n0}|²) Σ_{n≠n0} |s_n|² } ≈ ISI(s̄) + (1 / |s̄_{n0}|²) [tr(cov(s)) - (cov(s))_{n0,n0}]    (47)

where s̄ = E{s} and tr(·) stands for the trace of the indicated matrix. The first term, ISI(s̄), given by (26), is the residual ISI due to the finite length of the equalizer, while the second term represents the contribution of the random estimation errors due to the finite length of the data.
We may now substitute (45) and (46) into (47) and carry out the indicated operations. The evaluation for the case p = 1, q = 1 (third-order cumulants) is carried out in Appendix B. For simplicity, we assume that a_t is a real-valued zero-mean random variable with E{a_t³} ≠ 0. Ignoring the bias terms, the result is of the form

E{ISI(s)} ≈ ISI(s̄) + ρ (L - 1)/N,    ρ = [E{a_t²} E{a_t⁴} - E²{a_t³}] / E²{a_t³}.    (49)

For the purpose of comparison, the asymptotic performance of predictive deconvolution based on the minimum phase assumption is given by

E{ISI(s)} ≈ ISI(s̄) + (L - 1)/N    (50)

so that ρ measures the relative efficiency of the two methods when L is large compared to the effective length of the inverse H^{-1} of H (so that the residual bias terms are negligible). For example, if a_t can take only the two values 1 and -α, with probabilities α/(1 + α) and 1/(1 + α), respectively, then a direct calculation yields ρ = α/(1 - α)². For α < 0.38 or α > 2.62 we have ρ < 1, indicating that in these ranges the asymptotic performance of the proposed
algorithm is better than that of linear prediction based on the minimum phase assumption. The reason is that third-order cumulants carry useful information that can be used to improve performance. However, in the range 0.38 < α < 2.62 the relative efficiency is greater than one, and as α approaches 1, ρ approaches infinity. This is since in this limit a_t becomes a binary symmetric random variable for which the third-order cumulant is identically zero and, therefore, it does not carry any statistical information. In this case we must use the information contained in the higher order cumulants.
The calculation of (47) for the case p = 2 and q = 1 (fourth-order cumulants) is carried out in Appendix B. In this case, we let a_t be either a real- or complex-valued random variable. For simplicity, we have assumed that E{a_t} and E{a_t³} are zero. In the complex case, we have further assumed that E{a_t²} and E{|a_t|² a_t} are zero, in which case E{z_t²} = 0, so that the third term on the right-hand side of (42) need not be estimated. These assumptions are satisfied for most signal constellations used in data communication. By (18), a_t is restricted to have a nonzero fourth-order cumulant:

E{|a_t|⁴} - λ E²{|a_t|²} ≠ 0

where λ = 3 in the real case and λ = 2 in the complex case. The result is

E{ISI(s)} ≈ ISI(s̄) + ([E{|a_t|²} E{|a_t|⁶} - E²{|a_t|⁴}] / [E{|a_t|⁴} - λ E²{|a_t|²}]²) (L - 1)/N.    (51)
We note that if E{|a_t|² a_t} ≠ 0, then (51) contains an additional term, as indicated in Appendix B. Ignoring the contribution of the bias terms in (51) and (50), their relative efficiency is

ρ = [E{|a_t|²} E{|a_t|⁶} - E²{|a_t|⁴}] / [E{|a_t|⁴} - λ E²{|a_t|²}]².    (52)
For example, if a_t is a uniformly distributed random variable, then ρ = 3/7, indicating that fourth-order cumulants provide useful information that can be used to improve performance (reduce the variance). For the V22 and V29 signal constellations that are commonly used in data communication (see Fig. 4), ρ is equal to 0.471 and 0.719, respectively, indicating that in both cases the proposed algorithm is superior to predictive deconvolution even for minimum phase systems.

Invoking the Cauchy-Schwarz inequality, it can be shown that

ρ ≥ 0    (53)

where equality holds if and only if the probability weight of a_t is zero unless a_t = 0 or |a_t| = A, for some positive real A. Some of the most popular signals used in communications (see [24]), and in particular in digital communications, e.g., BPSK, V27, and 4-QAM (see Fig. 4), satisfy this condition. For this class of signal constellations ρ = 0, implying that the statistical variance is zero even for finite data, in which case the expected ISI given by (51) is not zero only because of the bias effect due to the finite length of the equalizer.
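The value of ρ in (52) can be evaluated exactly for the examples above; a small sketch using rational arithmetic (our illustration):

```python
from fractions import Fraction

def rho(m2, m4, m6, lam):
    """Relative efficiency (52): (E|a|^2 E|a|^6 - E^2|a|^4) / (E|a|^4 - lam E^2|a|^2)^2."""
    return (m2 * m6 - m4 ** 2) / (m4 - lam * m2 ** 2) ** 2

# Uniform on [-1, 1] (real, lam = 3): E a^2 = 1/3, E a^4 = 1/5, E a^6 = 1/7.
print(rho(Fraction(1, 3), Fraction(1, 5), Fraction(1, 7), 3))  # -> 3/7

# Constant-modulus source, |a| = 1 w.p.1 (complex, lam = 2): all moments equal 1.
print(rho(Fraction(1), Fraction(1), Fraction(1), 2))  # -> 0
```

The second case reproduces the ρ = 0 property of constant-modulus constellations such as 4-QAM.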
Fig. 4. Typical data communications signal constellations: V22, V29, 4-QAM, and V27.
To explain this striking result, suppose that |a_t| = A with probability one (w.p.1). Then

|a_t|² a_t a_{t-n}^* - λ E{|a_t|²} a_t a_{t-n}^* = (1 - λ) A² a_t a_{t-n}^*  w.p.1

i.e., the random quantities entering the empirical fourth-order cumulant are, up to the deterministic factor (1 - λ)A², identical to those entering the empirical correlation, so the cumulant-based estimates incur no excess variance relative to second-order estimates. This rather interesting result is consistent with the observation by Godard [10] that blind deconvolution based on his criterion might be an easier task for signals having constant amplitude.
Comment: We may consider combining the algorithms based on third-order and fourth-order cumulants, and perhaps other algorithms in the class, in order to improve statistical stability. We may also consider incorporating linear prediction methods, particularly when the input distribution is close to Gaussian. By appropriately weighting and linearly combining the two methods, we obtain a method that is capable of identifying the phase ambiguities associated with predictive deconvolution, and whose expected ISI is lower than that of either one of the methods separately (and, in particular, lower than that of predictive deconvolution). Detailed considerations are beyond the scope of this paper.
D. Effect of Additive Noise

Fig. 5. Blind deconvolution in the presence of additive noise.

Consider the system illustrated in Fig. 5, in which

y_t = h_t * a_t + e_t

where e_t represents the additive noise at the input to the equalizer. Then,
if a_t and e_t are statistically independent (e_t need not be an i.i.d. process), then by P1) and P2),

E{y_{t-l} y_{t-n}^*} = σ_a² (H†H)_{n,l} + E{e_{t-l} e_{t-n}^*}    (54)

cum(z_t : p; z_t^* : q; y_{t-n}^*) = cum(a_t : p; a_t^* : q + 1) (H†d)_n + cum(ζ_t : p; ζ_t^* : q; e_{t-n}^*)    (55)

where ζ_t = Σ_l c_l e_{t-l} is the noise component at the output of the equalizer. The first terms on the right-hand sides of (54) and (55) represent the cumulants in the noise-free case, while the second terms represent the contribution due to the additive noise. If the noise cumulants are known a priori, or can be measured independently, their effect can be subtracted from (54) and (55), respectively. If e_t is a Gaussian process, then all its cumulants of order greater than 2 are zero, in which case it has no effect on the computation of (55).
In the actual implementation of the algorithm we use empirical cumulants, and therefore the effect of additive noise on the computation of (55) is not zero even in the Gaussian case. This issue must be explored in depth.
Since the equalizer converges to the inverse H^{-1} of H, it may emphasize certain components of the noise at its output, depending on the frequency response of the unknown system. Therefore, as a final step, we suggest performing one more iteration without subtracting the noise covariance from (54). As shown in [21], this corresponds to performing Wiener filtering, instead of inverse filtering, which minimizes the mean-square restoration error. If the input symbols are drawn from a known discrete source constellation, then we can minimize the mean-square restoration error by switching to a decision-directed mode [16], [18].
E. Simulation Results

To verify the results of the analysis, the algorithm in (37a), (37b) was examined for p = 2, q = 1. In this case, the elements of R̂ and d̂ are calculated by substituting (40) and (42) into (38) and (39), respectively. In all the examples that we have considered, E{a_t²} = 0 whenever a_t is complex valued, in which case E{z_t²} = 0 and the third term on the right-hand side of (42) is not estimated.

In all experiments, the unknown system is an FIR filter of length 7 with the tap values (0.4, 1, -0.7, 0.6, 0.3, -0.4, 0.1), corresponding to an initial ISI of 1.27. We have used
an equalizer of length L = 16, which was initialized to c = (0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0)^T.

[Fig. 6. Performance of the algorithm for a V29 source: average ISI per iteration for N = 2400 and N = 1600; predicted performance 0.0090 for N = 2400 and 0.0112 for N = 1600.]
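The quoted initial ISI follows directly from the definition of ISI applied to the channel taps; a one-line check (ours):

```python
# Initial ISI of the simulated channel: sum of squared taps over the
# leading-tap power, minus 1.
h = [0.4, 1.0, -0.7, 0.6, 0.3, -0.4, 0.1]
powers = [v * v for v in h]
initial_isi = sum(powers) / max(powers) - 1.0
print(round(initial_isi, 2))   # -> 1.27
```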
In the first set of experiments the input symbols were
independent realizations from a V.29 source (see Fig. 4). The
algorithm was tested in 50 Monte Carlo trials and the average
ISI, denoted by <ISI>, is tabulated and plotted in Fig. 6.
We have used a logarithmic (dB) scale because of the very
fast convergence rate. When using data blocks of N =
2400 samples, the algorithm converged in all trials within
2-4 iterations to an average ISI which is very close to the
asymptotic performance predicted by (51). For data blocks
of N = 1600 samples, the algorithm converged within 2-5
iterations, but there was a deviation of approximately 25%
(1 dB) from the asymptotic result, which may be caused by
second-order stochastic effects due to the finite length of the
data.
In Figs. 7 and 8, the input symbols were independent
realizations from a V.27 source (see Fig. 4) and from a binary
source, respectively. In both cases, p = 0, so that according to
our analysis the algorithm is relatively insensitive to stochastic
effects due to the finite length of the data, and the residual ISI
at the point of convergence is essentially due to the finite
length of the equalizer. Thus, in the case of a binary source
with N = 800 and N = 400, which are considered very
short data blocks, the algorithm converged in all trials within
2-4 iterations to the predicted performance level. To push the
algorithm to its limit, it was tested for data blocks of N = 200
points, in which case it still converged in 88% of the trials to
a sufficiently small ISI level. These results suggest that the
algorithm may be potentially useful for communication over
fast fading channels.
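The ISI figure used throughout these experiments measures the residual interference of the combined channel-equalizer response. A minimal sketch of how such a figure can be computed, assuming the usual definition ISI = (sum_n |s_n|^2 - max_n |s_n|^2) / max_n |s_n|^2 with s = h * c:

```python
import numpy as np

def isi(channel, equalizer):
    """Residual intersymbol interference of the combined response s = h * c:
    ISI = (sum_n |s_n|^2 - max_n |s_n|^2) / max_n |s_n|^2."""
    s = np.convolve(channel, equalizer)
    p = np.abs(s) ** 2
    return (p.sum() - p.max()) / p.max()

h = np.array([0.4, 1.0, -0.7, 0.6, 0.3, -0.4, 0.1])  # FIR channel used in the experiments
print(round(isi(h, np.array([1.0])), 2))              # ISI with a trivial equalizer -> 1.27
```

With a trivial (pass-through) equalizer this reproduces the initial ISI of 1.27 quoted for the test channel.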
SHALVI AND WEINSTEIN: SUPER-EXPONENTIAL METHODS FOR BLIND DECONVOLUTION
[Table in Fig. 7: average ISI versus iteration for N = 800 (converging to 0.0052) and N = 400 (converging to 0.0080); predicted performance 0.0045.]
Fig. 7. Performance of the algorithm for V.27 source.
[Table in Fig. 8: average ISI versus iteration for N = 800, N = 400, and N = 200; predicted performance 0.0045.]
Fig. 8. Performance of the algorithm for binary source.
In Fig. 9, the experiment was repeated for a binary source
with N = 400, where the equalizer input is contaminated by
[Table in Fig. 9: average ISI versus iteration for SNR = 20 dB (converging to 0.0056) and SNR = 10 dB (converging to 0.0125); predicted performance in the absence of noise: 0.0045.]
Fig. 9. Performance of the algorithm for binary source in the presence of noise.
additive white Gaussian noise with SNR = 20 dB and SNR =
10 dB (where the SNR is the ratio of the average power of
the signal y_t to the average power of the noise). In both cases the algorithm
converged in all trials, and the effect of the additive noise on the
convergence rate was small.
In the last set of experiments, the input symbols were
independent realizations from a uniformly distributed random
variable. The results are shown in Fig. 10. For data
blocks of N = 1600 points the algorithm converged within
2-4 iterations very close to the expected ISI predicted by
(51). For data blocks of N = 800 points, the algorithm
converged within 2-5 iterations, and the presence of additive
white Gaussian noise with SNR = 20 dB did not have any
significant effect.
F. Recursive and Sequential Algorithms
In a variety of applications, it is desirable to process
the data recursively/sequentially. The advantage of a recursive/sequential
algorithm over a batch algorithm is not necessarily
in the final result, but in the computational efficiency,
reduced storage requirements, and in the fact that an outcome
may be provided without having to wait for all the data to be
processed. Moreover, if the underlying system (channel) exhibits
time changes, processing all the available data jointly is
not desirable, even if we can accommodate the computational
and storage load of the batch algorithm, since different data
segments correspond to different channel responses. In that
case, we want to have an adaptive algorithm that is capable of
tracking the varying characteristics of the channel.
[Table in Fig. 10: average ISI versus iteration for N = 1600 (converging to 0.0098), N = 800 (converging to 0.0130), and N = 800 with SNR = 20 dB; predicted performance 0.0085 for N = 1600 and 0.0110 for N = 800.]
Fig. 10. Performance of the algorithm for uniformly distributed source.
Consider the algorithm in (37a), (37b) for the case p =
1, q = 1 (third-order cumulants). Following the development
in Appendix C, we obtain the following recursion for
computing c at each iteration cycle:

c_t = c_{t-1} + beta_t Q_t y_t^* ( |z_t|^2 - y_t^T c_{t-1} ),   (56)

Q_t = (1/(1 - beta_t)) [ Q_{t-1} - beta_t Q_{t-1} y_t^* y_t^T Q_{t-1} / ( 1 - beta_t + beta_t y_t^T Q_{t-1} y_t^* ) ],   (57)

where c_t is the value of c based on sample averages to time t,
and

y_t = (y_t, y_{t-1}, ..., y_{t-L+1})^T,   z_t = y_t^T c,

where c are the values of the equalizer taps before entering
the iteration.
Similarly, in the case p = 2, q = 1 (fourth-order cumulants),
we obtain

c_t = c_{t-1} + beta_t Q_t y_t^* [ z_t ( |z_t|^2 - rho_t ) - y_t^T c_{t-1} ],   (58)

where rho_t is a running estimate of 2E{|z_t|^2} (in the complex
case with E{z_t^2} = 0). If beta_t = 1/t and Q_t is appropriately
initialized, then the outcome c_t at the end of the recursion
coincides with c obtained by calculating (37a) in one step using
the sample averages in (40) and (41) in the case p = 1, q = 1, or
the sample averages in (40) and (42) in the case p = 2, q = 1. If
beta_t = beta (a constant), it corresponds to exponential weighting
that gives more weight to current data in the sample averaging
operations. The normalization operation required by (37b) can
be performed at the end of the recursion by normalizing c_N,
or by scaling the output of the equalizer to have the same
average power as a_t.
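One iteration cycle of the recursion in (56), (57) can be sketched as follows (a sketch under assumptions: real-valued data, beta_t approximately 1/t, and a diagonal initialization of Q whose constant `delta` is a hypothetical choice, not from the paper):

```python
import numpy as np

def recursion_cycle(y, c_prev, delta=1e2):
    """One iteration cycle of (56)-(57) for p = 1, q = 1.
    y: observed samples; c_prev: equalizer taps fixed over this cycle."""
    L = len(c_prev)
    c = c_prev.copy()
    Q = delta * np.eye(L)                    # Q_0: assumed initialization
    for t in range(L, len(y)):
        y_t = y[t - L + 1:t + 1][::-1]       # regressor (y_t, ..., y_{t-L+1})
        beta = 1.0 / (t + 1)
        z = np.dot(y_t, c_prev)              # output with taps fixed over the cycle
        # (57): matrix-inversion-lemma update of Q_t = R_t^{-1}
        Qy = Q @ np.conj(y_t)
        denom = (1 - beta) + beta * np.dot(y_t, Qy)
        Q = (Q - beta * np.outer(Qy, y_t @ Q) / denom) / (1 - beta)
        # (56): update of the tap estimate
        c = c + beta * (Q @ np.conj(y_t)) * (abs(z) ** 2 - np.dot(y_t, c))
    return c
```

With beta_t = 1/t and exact initialization the recursion reproduces the batch sample-average solution; with a constant beta it becomes the exponentially weighted, adaptive variant described above.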
To convert these iterative-recursive algorithms, in which
each iteration is performed recursively, into sequential algorithms,
we suggest replacing the iteration index by the
time index t, which is a common procedure in stochastic
approximation, and using the most current state of the equalizer
to generate z_t. Under these approximations, (56) becomes

c_t = c_{t-1} + beta_t Q_t y_t^* ( |z_t|^2 - z_t ),   (59)

and (58) becomes

c_t = c_{t-1} + beta_t Q_t y_t^* [ z_t ( |z_t|^2 - rho_t ) - z_t ],   (60)

where

z_t = y_t^T c_{t-1},   (61)

and where the additional normalization operation (scaling z_t to
have the same average power as a_t) may be required. Both
(59) and (60) are fully sequential algorithms in which we
perform one iteration per symbol. Using well-known results
from stochastic approximation methods [6], [12], it can be
shown that if beta_t -> 0 and sum_t beta_t = infinity,
e.g., beta_t = 1/t, then under certain stationary-ergodic conditions
these algorithms converge almost surely (a.s.) and in the mean
square (m.s.) to the desired solution, and the limit distribution
at the point of convergence can also be evaluated. However,
such an analysis is beyond the scope of this paper. If we choose
beta_t = beta (a constant), it corresponds to exponential weighting
that reduces the effect of past data relative to new data, and
we effectively obtain adaptive algorithms.
An alternative approach for generating sequential algorithms
is based on the Robbins-Monro first-order stochastic approximation
method [19], [12], in which expectations are replaced by
current realizations, and the iteration index is substituted by the
time index. The algorithm development is given in Appendix
D. In the case p = 1, q = 1, we obtain

c^(t) = c^(t-1) + beta_t y_t^* ( |z_t|^2 / cum(a_t : 1; a_t^* : 2) - z_t / E{|a_t|^2} ),   (62)
[Fig. 11 plots the average ISI versus the number of data samples for the sequential algorithms.]
Fig. 11. Performance of the recursive algorithms.
and in the case p = 2, q = 1, we obtain

c^(t) = c^(t-1) + (beta_t / cum(a_t : 2; a_t^* : 2)) y_t^* z_t ( |z_t|^2 - E{|a_t|^4} / E{|a_t|^2} ).   (63)
This algorithm is recognized as Godard's algorithm proposed
in [10] and [24]. The return to Godard's algorithm should
not be surprising, since the super-exponential methods can
be viewed as iterative algorithms for optimizing a family of
deconvolution criteria (see [21]), a special case of which is
Godard's criterion. We can now go back and analyze the
convergence behavior of Godard's algorithm based on the
observation that it is a first-order stochastic approximation of
the super-exponential method whose net action in the s-domain
is well understood. In the case where cum(a_t : 2; a_t^* : 2) <
0, the algorithm in (63) converges to the desired solution.
However, in the general case, the algorithms in (62), (63)
should be modified to maintain the normalization operation,
as explained in [21].
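For a concrete picture, Godard's update has the familiar constant-modulus form; a minimal sketch (equalizer length, step size, and initialization are illustrative assumptions; the explicit negative step sign absorbs the negative fourth-order cumulant of a sub-Gaussian source):

```python
import numpy as np

def godard_cma(y, L=8, mu=5e-3, R2=1.0):
    """Godard (constant-modulus) update: c <- c - mu * y_t^* z_t (|z_t|^2 - R2),
    where R2 = E{|a|^4}/E{|a|^2} (R2 = 1 for a binary +-1 source)."""
    c = np.zeros(L); c[L // 2] = 1.0          # center-spike initialization (assumption)
    for t in range(L, len(y)):
        y_t = y[t - L + 1:t + 1][::-1]        # regressor (y_t, ..., y_{t-L+1})
        z = np.dot(y_t, c)                    # equalizer output
        c = c - mu * np.conj(y_t) * z * (abs(z) ** 2 - R2)
    return c
```

The sketch performs one tap update per received symbol, with no matrix multiplication, which is the 2L-operations-per-sample complexity discussed below.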
The difference between (59) and (62), and between (60)
and (63), appears only in the multiplying matrix Q_t. Since
the algorithms in (59) and (60) attempt to perform cumulative
averaging, they are expected to be statistically more stable than
the algorithms in (62) and (63). Since Q_t is proportional to
the inverse of the estimated covariance of y_t, the multiplication
by Q_t corresponds to a spectral whitening operation
and, as pointed out in [2], [21], this may significantly improve
the convergence rate. To illustrate that, we have considered the
previous scenario in which the unknown system is an FIR filter
of length 7 whose tap values are (0.4, 1, -0.7, 0.6, 0.3, -0.4,
0.1), corresponding to an initial ISI of 1.27. The input symbols
were independent random variables uniformly distributed in
(-0.5, 0.5). We have used an equalizer of length L = 16,
initialized to c^(0) = (0 0 0 1 0 ... 0)^T. The matrix Q_t needed in
(60) was initialized by calculating the covariance matrix R
of y_t using the first 100 data samples, and taking its inverse.
The algorithms were tested in 50 Monte Carlo trials, and the
average ISI is plotted in Fig. 11 as a function of the number of
data samples. In applying (63), that is, Godard's algorithm, we
have used beta_t = 1.0 x 10^{-2}, and in applying (60) we have used
beta_t = 5 x 10^{-3}. These are the largest step sizes for which the
algorithms converged in all trials. We see that the algorithm in
(60) converged almost twice as fast as Godard's algorithm (to
reach an ISI level of 0.05 it required 1140 symbols, compared
with 2270 required by Godard's algorithm). The normalization
operation was not necessary in implementing the sequential
algorithms. As a baseline, we have also plotted the expected
performance of the iterative batch algorithm as predicted by
(51).
In terms of computational complexity, the algorithms in
(62) and (63) require 2L operations per data sample, while
the algorithms in (59) and (60) require additional O(L^2)
operations. As a compromise, we need not update Q_t every
data sample, or we may compute the data covariance based on
a sufficiently large data segment only once, and then proceed
with the algorithm.
V. THE ALGORITHM IN THE FREQUENCY DOMAIN
In this section, we relax the assumption that a_t is an i.i.d.
process. More generally, let a_t be a discrete stationary process,
and define its (p, q)-order spectrum S_{a:p; a*:q+1}(w_1, ..., w_{p+q})
as the multidimensional Fourier transform of its (p + q + 1)-order
cumulant sequence. We note that the (1,0)-order spectrum is the power spectrum
of the process, the (1,1)-order spectrum is the bispectrum, and
the (2,1)-order spectrum is the trispectrum.
Suppose that

0 < S_{a:1; a*:1}(w) < infinity   (64)

and

S_{a:p; a*:q+1}(w_1, ..., w_{p+q}) != 0   (65)

for some combination of p and q such that p + q >= 2. We
note that if a_t is an i.i.d. process then (64) and (65) reduce
to the conditions in (17) and (18), respectively, since in this
case, for all frequencies,

S_{a:p; a*:q+1}(w_1, ..., w_{p+q}) = cum(a_t : p; a_t^* : (q + 1)).
The relation between the (p, q)-order spectra of the input
a_t and the output y_t of the unknown system is given in [3]:
the (p, q)-order spectrum of y_t is the (p, q)-order spectrum of
a_t multiplied by a factor H(.) for each unconjugated variable
and a factor H^*(.) for each conjugated variable, each evaluated
at the corresponding frequency argument (66), where H(w) =
sum_t h_t e^{-jwt} is the frequency response of the unknown
system. In the special case where (p, q) = (1, 0), the relation
between the power spectra of a_t and y_t is given by

S_{y:1; y*:1}(w) = |H(w)|^2 S_{a:1; a*:1}(w).   (67)

Substituting the spectra of z_t, expressed through C(w) and the
spectra of y_t, into (5a), (5b) and using the relations
in (66) and (67), we obtain
C'(w) = (1/(2 pi)^{p+q-1}) int ... int C(w_1) ... C(w_{p-1}) C^*(-w_p) ... C^*(-w_{p+q-1})
        [ S_{y:p; y*:q+1}(w_1, ..., -w_{p+q-1}, w) / S_{y:1; y*:1}(w) ] dw_1 ... dw_{p+q-1},   (68)

C(w) = C'(w) / [ (1/(2 pi)) int |C'(v)|^2 S_{y:1; y*:1}(v) dv ]^{1/2},   (69)

where C(w) = sum_t c_t e^{-jwt} is the frequency response of the
equalizer. This algorithm does not require that a_t be an i.i.d.
process. Instead, it requires prior knowledge of its power
spectrum and some other nonzero higher order spectrum.
In the actual implementation of the algorithm, the power
spectrum and the (p, q)-order spectrum of y_t are replaced by
their sample estimates. Since the equalizer being used is of
finite length L, its frequency response can be adjusted only at
L independent frequencies, say the discrete Fourier transform
(DFT) frequencies w_k = 2 pi k / L, k = 0, 1, ..., L - 1, and
therefore, it has a finite resolution. The resulting algorithm
is given by
C'(w_k) = (1/L^{p+q-1}) sum_{k_1=0}^{L-1} ... sum_{k_{p+q-1}=0}^{L-1} C(w_{k_1}) ... C(w_{k_{p-1}}) C^*(-w_{k_p}) ... C^*(-w_{k_{p+q-1}})
          S^_{y:p; y*:q+1}(w_{k_1}, ..., -w_{k_{p+q-1}}, w_k) / S^_{y:1; y*:1}(w_k),   (70)

C(w_k) = C'(w_k) / [ (1/L) sum_{l=0}^{L-1} |C'(w_l)|^2 S^_{y:1; y*:1}(w_l) ]^{1/2},   (71)
where S^_{y:1; y*:1}(w) and S^_{y:p; y*:q+1}(w_1, ..., w_{p+q}) denote the
estimates of the power spectrum and the (p, q)-order spectrum
of y_t at the indicated sets of frequencies. A statistical analysis of
the algorithm, similar to that performed in the time domain,
can also be carried out here.
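The normalization step of the frequency-domain iteration only requires the power-spectrum estimate on the DFT grid; a minimal sketch of that step (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def normalize_equalizer(C_prime, S_y, sigma_a2=1.0):
    """Scale the frequency response so that the equalizer output power,
    (1/L) * sum_k |C(w_k)|^2 * S_y(w_k), equals the source power sigma_a2.
    S_y: power-spectrum estimate of y_t on the DFT grid."""
    out_power = np.mean(np.abs(C_prime) ** 2 * S_y)
    return C_prime * np.sqrt(sigma_a2 / out_power)
```

After scaling, the equalizer output power matches the source power, mirroring the role of the time-domain normalization (37b).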
VI. THE SYSTEM IDENTIFICATION ASPECT
The approach presented in the paper can also be used for
the purpose of system identification. Since the equalizer c is
directed at finding the inverse of the unknown system h, it can
therefore be viewed as modeling h^{-1}. Since c is an all-zero
FIR filter, it corresponds to modeling the unknown system h
by an all-pole (AR) filter, where the taps c_n of the equalizer
can be regarded as the direct estimates of the AR parameters
of that model. Closed-form expressions for the resulting bias
and error variance can easily be extracted from (45) and (46),
where we note that the bias term reflects the possible mismatch
between the actual h and its assumed AR model. The main
feature of this system identification approach is that it uses a
non-Gaussian input distribution and therefore it is capable of
identifying nonminimum phase systems, unlike the classical
approaches for system identification that are based on the
Gaussian assumption and therefore are incapable of identifying
nonminimum phase characteristics. The method is universal
in the sense that it does not impose any restrictions on the
probability distribution of the input process (driving noise) and
it only requires prior knowledge of its average power. Since
the proposed method uses information contained in the high-order
statistics of the observed data, it often exhibits better
performance compared with second-order methods, even for
minimum phase systems. Finally, we note that the proposed
technique for reducing the effect of additive noise by cumulant
subtraction can also be applied here.
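The AR interpretation above can be made concrete: with the converged taps c regarded as AR parameters, an impulse-response estimate of the channel follows from the all-pole recursion (a sketch under the assumption that the equalization delay has been removed so that c[0] != 0; names are illustrative):

```python
import numpy as np

def ar_channel_estimate(c, n=16):
    """Interpret equalizer taps c (with c[0] != 0) as AR parameters:
    the channel model is H(z) ~ 1/C(z).  Returns the first n samples of the
    impulse response of the all-pole model via the recursion
    h[t] = (delta[t] - sum_{k>=1} c[k] h[t-k]) / c[0]."""
    h = np.zeros(n, dtype=complex)
    for t in range(n):
        acc = 1.0 if t == 0 else 0.0
        for k in range(1, min(t, len(c) - 1) + 1):
            acc -= c[k] * h[t - k]
        h[t] = acc / c[0]
    return h

# e.g. ar_channel_estimate(np.array([1.0, -0.5]), 4) -> [1, 0.5, 0.25, 0.125]
```

Since the filter is all-pole, the estimate is valid only when the modeled channel is stable under the fitted AR parameters.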
VII. CONCLUSION
We have presented a class of computationally efficient
iterative algorithms for blind deconvolution that converge
monotonically, regardless of initialization (no spurious local
stationary points), at a very fast, nearly super-exponential
rate to the desired solution, in which the inverse of the
unknown system is identified and its input is recovered up to a
constant delay and possibly a constant phase shift. We assume
that the input process is a sequence of i.i.d. real/complex
random variables with a non-Gaussian but otherwise completely
arbitrary continuous/discrete probability distribution. When
implementing the algorithms in the frequency domain, the
i.i.d. assumption can be relaxed; instead we require prior
knowledge of the power spectrum and some higher order
spectrum of the input process. The effects of the finite length
of the equalizer and the finite length of the data are analyzed,
and the results are confirmed via Monte Carlo simulations.
It is shown that in many cases of interest the performance
(intersymbol interference) of the proposed methods is far
superior to linear prediction methods, even for minimum phase
systems. Recursive and sequential forms of the algorithms are
also presented.
APPENDIX A
DERIVATION OF (43)
Under the consistency requirement, the cumulant estimates
asymptotically preserve the linearity property of cumulants. Thus,
by (27)-(29), the estimates cum^(z_t; y^*_{t-n}) (A.1),
cum^(z_t : p; z_t^* : q) (A.2), and cum^(z_t : p; z_t^* : q; y^*_{t-n}) (A.3)
are expressed as linear combinations of the corresponding cumulant
estimates of the input a_t, with coefficients determined by the
combined response s. Note that the cumulant estimates in, e.g., (40)-(42) satisfy
(A.1)-(A.3) for any N without any approximation.
Near the point of convergence, s_n is approximately 0 for n != 0, in which
case (A.3) is closely approximated by

cum^(z_t : p; z_t^* : q; y^*_{t-n}) ~= s_0^p (s_0^*)^q cum^(a_t : p; a_t^* : q; y^*_{t-n}).   (A.4)
Hence, recalling (38) and (39),

H^ = A H,   (A.5)

where A is the (approximately identity) matrix whose elements are
the ratios between the estimated and the true cumulants of the
input (A.6), and g^ is the vector whose elements are

g^_n = cum^(a_t : p; a_t^* : q; a^*_{t-n}) / cum(a_t : p; a_t^* : (q + 1)).   (A.7)

Neglecting small-order terms,

(H^+ H^)^{-1} ~= (H^+ H)^{-1} - (H^+ H)^{-1} H^+ (A - I) H (H^+ H)^{-1}.   (A.9)

Thus,

c^' ~= (H^+ H)^{-1} H^+ u,   (A.10)

where

u = g^ - (A - I) H (H^+ H)^{-1} H^+ g^.   (A.11)

Since A ~= I, and since for large L the product H (H^+ H)^{-1} H^+
approaches the identity matrix, the components of u are given to
a very good approximation by (A.12): u_0 ~= 1 and, for n != 0, u_n
is proportional to cum^(a_t : p; a_t^* : q; a^*_{t-n}) / cum(a_t : p; a_t^* : (q + 1)).
Moreover,

c^'^+ H^+ H c^' ~= u^+ H (H^+ H)^{-1} H^+ u.   (A.13)

Hence, up to a constant phase,

c^ = c^' / (c^'^+ R^ c^')^{1/2} ~= (H^+ H)^{-1} H^+ u / [ u^+ H (H^+ H)^{-1} H^+ u ]^{1/2},   (A.14)

where the limiting vector has elements equal to 1 for n = 0 and to
the perturbation terms of (A.12) for n != 0 (A.15). Thus,

s^ = H c^ ~= s-bar,   (A.16)

where s-bar is given by (43). Since s-bar is independent of the iteration
index, it is approximately the stationary point of the
algorithm.
APPENDIX B
DERIVATION OF (48) AND (51)
In analogy with (40)-(42), the sample cumulant estimates of z_t
satisfy (B.1)-(B.3), where gamma = 3 in the real case and gamma = 2
in the complex case, and where in the transition to the second line
of (B.3) we have neglected zero-mean terms whose variance is of the
order of (1/N)^2.
Substituting (B.1)-(B.3) into (44) yields (B.4) and (B.5).
Invoking the assumption that {a_t, t = 1, 2, ...} is a sequence
of zero-mean i.i.d. random variables, the second-order moments of
the estimation errors are given by (B.6), with the four cases
n = m = 0, n = m != 0, n != m = 0, and n != m != 0, where in (B.10)
we have assumed that if a_t is complex valued then
E{a_t^2} = E{|a_t|^2 a_t} = 0.
Arranging the components of (B.6) in a matrix, the covariance of
the vector of estimation errors can be written in the form (B.11),
where the matrix W and the scalars gamma_1, ..., gamma_4 are given
by (B.12)-(B.15). Substituting (B.11) into (47) and carrying out the
indicated matrix manipulations,

E{ISI(s^)} = ISI(s-bar) + gamma_1 [1^T W 1 - |1^T W delta|^2] + gamma_2 (L - delta^T W delta)
             + 2 Real[gamma_3 delta^T W 1 (1 - delta^T W delta)] + gamma_4 delta^T W delta (1 - delta^T W delta),   (B.16)

where 1 denotes the all-ones vector and delta the unit vector. For
sufficiently large L, the first column of W approaches the vector
delta, in which case delta^T W delta ~= 1, 1^T W delta ~= 1, and
(B.16) is closely approximated by

E{ISI(s^)} ~= ISI(s-bar) + gamma_1 (1^T W 1 - 1) + gamma_2 (L - 1).   (B.17)

Substituting (B.12) and (B.13) into (B.17) and using
(B.7)-(B.10), we obtain (48) and (51).
APPENDIX C
DERIVATION OF THE ITERATIVE-RECURSIVE ALGORITHMS
By (37a), the value of c based on sample averages to time
t is given by

c_t = Q_t d_t,   (C.1)

where

Q_t^{-1} = (1 - beta_t) Q_{t-1}^{-1} + beta_t y_t^* y_t^T,   (C.2)

where in the case p = 1, q = 1,

d_t = (1 - beta_t) d_{t-1} + beta_t y_t^* |z_t|^2,   (C.3)

and in the case p = 2, q = 1,

d_t = (1 - beta_t) d_{t-1} + beta_t y_t^* (|z_t|^2 - rho_t) z_t,   (C.4)

where beta_t = 1/t in the case of the sample averages as in (40)-(42),
where y_t, z_t, and Q_t are defined after (56)-(57), and where rho_t is
defined after (58). Inverting (C.2) by the matrix inversion lemma, we
obtain (57). Substituting (C.2) and (C.3) into (C.1),

c_t = Q_t [ (1 - beta_t) d_{t-1} + beta_t y_t^* |z_t|^2 ]
    = Q_t [ (1 - beta_t) Q_{t-1}^{-1} c_{t-1} + beta_t y_t^* |z_t|^2 ]
    = Q_t [ (Q_t^{-1} - beta_t y_t^* y_t^T) c_{t-1} + beta_t y_t^* |z_t|^2 ]
    = c_{t-1} + beta_t Q_t y_t^* ( |z_t|^2 - y_t^T c_{t-1} ),   (C.5)

which coincides with (56).
Similarly, substituting (C.2) and (C.4) into (C.1), we obtain
(58).
APPENDIX D
DEVELOPMENT OF FIRST-ORDER STOCHASTIC
APPROXIMATION ALGORITHMS
Substituting c' = c into (33a) and ignoring the normalization
operation in (33b), the set of stationary points of the algorithm
must satisfy the equation:

c = R^{-1} d,   i.e.,   d - R c = 0.   (D.1)

In the case p = 1, q = 1, (D.1) becomes

E{ y_t^* ( |z_t|^2 / cum(a_t : 1; a_t^* : 2) - z_t / E{|a_t|^2} ) } = 0,   (D.2)

and in the case p = 2, q = 1, (D.1) becomes

(1 / cum(a_t : 2; a_t^* : 2)) E{ y_t^* z_t ( |z_t|^2 - E{|a_t|^4} / E{|a_t|^2} ) } = 0,   (D.3)

where we used the fact that at the stationary points
E{|z_t|^2} = E{|a_t|^2}.
Gradient-search algorithms for solving (D.2) and (D.3) are
given, respectively, by

c^(l) = c^(l-1) + beta_l E{ y_t^* ( |z_t|^2 / cum(a_t : 1; a_t^* : 2) - z_t / E{|a_t|^2} ) } |_{c = c^(l-1)},   (D.4)

c^(l) = c^(l-1) + (beta_l / cum(a_t : 2; a_t^* : 2)) E{ y_t^* z_t ( |z_t|^2 - E{|a_t|^4} / E{|a_t|^2} ) } |_{c = c^(l-1)}.   (D.5)
In the case p = 2, q = 1,
d_n = E{ y^*_{t-n} z_t |z_t|^2 } - 2 E{|a_t|^2} E{ y^*_{t-n} z_t },
where in the transition to the second line we have used that
E{|z_t|^2} = E{|a_t|^2} because of the normalization operation. In this
setting, d_t is obtained by recursively performing (C.4).
Replacing the iteration index l by the time index t, and
substituting expectations with current realizations, we obtain
(62) and (63), respectively.
ACKNOWLEDGMENT
We would like to thank the reviewers for their useful and
insightful comments.
REFERENCES
[1] S. Bellini and F. Rocca, "Asymptotically efficient blind deconvolution," Signal Processing, vol. 20, pp. 193-209, 1990.
[2] A. Benveniste, M. Goursat, and G. Ruget, "Robust identification of a nonminimum phase system: Blind adjustment of a linear equalizer in data communication," IEEE Trans. Automat. Contr., vol. AC-25, pp. 385-399, June 1980.
[3] D. R. Brillinger, Time Series: Data Analysis and Theory. San Francisco, CA: Holden-Day, 1981.
[4] P. Comon, "Independent components analysis," in Int. Signal Processing Workshop on Higher Order Statistics, Chamrousse, France, 1991, pp. 111-120.
[5] D. L. Donoho, "On minimum entropy deconvolution," in Applied Time Series Analysis II, D. F. Findley, Ed. New York: Academic Press, 1981.
[6] A. Dvoretzky, "On stochastic approximation," Proc. 3rd Berkeley Symp. on Math. Statist. Probab., 1956, pp. 35-56.
[7] B. Friedlander and B. Porat, "Asymptotically optimal estimation of MA and ARMA parameters of non-Gaussian processes from high-order moments," IEEE Trans. Automat. Contr., vol. 35, pp. 27-35, Jan. 1990.
[8] G. B. Giannakis and A. Swami, "On estimating noncausal nonminimum phase ARMA models of non-Gaussian processes," IEEE Trans. Acoust., Speech, Signal Processing, vol. 38, pp. 478-495, Mar. 1990.
[9] G. B. Giannakis and M. K. Tsatsanis, "A unifying maximum-likelihood view of cumulants and polyspectral measures for non-Gaussian signal classification and estimation," IEEE Trans. Inform. Theory, vol. 38, pp. 386-406, Mar. 1992.
[10] D. Godard, "Self-recovering equalization and carrier tracking in two-dimensional data communication systems," IEEE Trans. Commun., vol. COM-28, pp. 1867-1875, Nov. 1980.
[11] D. Hatzinakos and C. L. Nikias, "Blind equalization using a tricepstrum-based algorithm," IEEE Trans. Commun., vol. 39, pp. 669-682, May 1991.
[12] H. J. Kushner, Stochastic Approximation Methods for Constrained and Unconstrained Systems. New York: Springer-Verlag, 1978.
[13] K. S. Lii and M. Rosenblatt, "Non-Gaussian linear processes, phase, and deconvolution," in Statistical Signal Processing, E. J. Wegman and J. G. Smith, Eds., 1984, pp. 51-58.
[14] L. Ljung, System Identification: Theory for the User. Englewood Cliffs, NJ: Prentice-Hall, 1987.
[15] T. Matsuoka and T. J. Ulrych, "Phase estimation using the bispectrum," Proc. IEEE, vol. 72, pp. 1403-1411, 1984.
[16] J. E. Mazo, "Analysis of decision-directed equalizer convergence," Bell Syst. Tech. J., vol. 59, pp. 1857-1876, Dec. 1980.
[17] C. L. Nikias and M. R. Raghuveer, "Bispectrum estimation: A digital signal processing framework," Proc. IEEE, vol. 75, pp. 869-891, July 1987.
[18] J. G. Proakis, Digital Communications, 2nd ed. New York: McGraw-Hill, 1989.
[19] H. Robbins and S. Monro, "A stochastic approximation method," Ann. Math. Statist., vol. 22, pp. 400-407, 1951.
[20] O. Shalvi and E. Weinstein, "New criteria for blind deconvolution of nonminimum phase systems (channels)," IEEE Trans. Inform. Theory, vol. 36, pp. 312-321, Mar. 1990.
[21] ---, "Universal methods for blind deconvolution," in Blind Deconvolution, S. Haykin, Ed. Englewood Cliffs, NJ: Prentice-Hall, to be published.
[22] ---, "Methods and apparatus for blind deconvolution," Israeli patent application 97,345, Feb. 1991, and corresponding foreign applications.
[23] M. I. Sezan and A. M. Tekalp, "Survey of recent developments in digital image restoration," Opt. Eng., vol. 29, pp. 393-404, May 1990.
[24] J. R. Treichler and B. G. Agee, "A new approach to multipath correction of constant modulus signals," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-31, pp. 349-372, Apr. 1983.
[25] R. A. Wiggins, "Minimum entropy deconvolution," Geoexploration, vol. 16, pp. 21-35, 1978.