
Digital Signal Processing 96 (2020) 102589


Robust distributed Lorentzian adaptive filter with diffusion strategy in impulsive noise environment

Annet Mary Wilson a,∗, Trilochan Panigrahi a, Ankit Dubey b

a Dept. of Electronics & Communication Engineering, National Institute of Technology, Goa, India
b Dept. of Electrical Engineering, Indian Institute of Technology Jammu, India

Article history: Available online 30 September 2019

Keywords: Distributed; Lorentzian; Impulse noise; Channel estimation

Abstract: Many works of literature support the potential of distributed channel estimation resorting to the traditional LMS algorithm and its variants. But these conventional LMS algorithms fail in an impulsive noise environment, which is undeniable in many communication systems. Hence in this paper, we study distributed channel estimation with robust cost functions. Most of the robust adaptive algorithms are less efficient in terms of convergence rate. To deal with this, we propose the use of the window-based Lorentzian norm in a distributed framework to gain the merit of improved convergence rate in terms of both distribution and data reuse. The performance of the proposed algorithm is validated using simulation results. Our contribution in this work is the application of the Lorentzian norm in sensor networks with diffusion cooperation and the stability analysis of the same.

© 2019 Elsevier Inc. All rights reserved.

1. Introduction

Wireless sensor networks (WSNs) have emerged as a trending technology, with applications in diverse fields such as industry, environmental monitoring, and the military. This has stirred interest among researchers in dealing with the limited power and computational capability of nodes in a WSN [1], in tackling noisy links [2], and so on. The focus of this paper is channel estimation in a WSN under an impulsive noise environment. The presence of impulsive noise in communication systems has been reported in many works of literature and is duly noted later in this section.

Much research on channel estimation has elucidated its importance in wireless communication systems. Due to the multipath propagation of a signal in a wireless channel, the received signal may suffer from intersymbol interference (ISI). To mitigate the effect of ISI, inverse filtering (equalization) is performed at the receiver front end, which requires exact channel information [3].

Consider a common source signal that is transmitted and received by sensor nodes distributed over the destination area with a certain topology. We consider an additive impulse noise channel. It can be mathematically described as a linear finite impulse response (FIR) filter connected serially with a noise source modeled as a Gaussian mixture [4,5]. We aim to estimate the wireless channel parameters at the network nodes from the noisy received signals collected. There are mainly two approaches to estimating the desired quantity in a WSN: centralized estimation and distributed estimation. In centralized estimation, the data collected over the nodes are sent to a central processor, which filters out the desired quantity from the noise [6]. This requires a powerful central processor and is more susceptible to link failure. In the latter approach, an adaptive filter at each node processes the noisy data according to certain rules, and the estimated data is shared with its neighboring nodes. This approach is more robust to link failure and adaptive to environmental change [7]. The discussion of distributed estimation is not complete without mentioning the two main modes of operation: incremental and diffusion. Unlike the consensus strategy of distribution, which only takes care of temporal information, these incorporate spatial information into the estimation along with temporal information [8]. In the incremental approach, the nodes are linked in a cyclic manner, with node k being the only neighbor of node k − 1 and node 1 being the only neighbor of node N [9–11]. Thus the incremental network topology is simpler than that of the diffusion distributed network, where each node is a member of a neighborhood with many neighbors connected in a certain topology [7,8,12,13]. But diffusion is more robust to link failure than incremental.

Adaptive filtering is at the core of statistical signal processing and is implemented in this paper under a diffusion distributed framework, owing to the fact that it improves the convergence rate and steady-state performance [8] as compared to simple non-cooperative filtering.

* Corresponding author.
E-mail addresses: annetmarywilson@gmail.com (A.M. Wilson), tpanigrahi@nitgoa.ac.in (T. Panigrahi), ankit.dubey@iitjammu.ac.in (A. Dubey).

https://doi.org/10.1016/j.dsp.2019.102589
1051-2004/© 2019 Elsevier Inc. All rights reserved.

Due to the mathematical tractability of the ℓ2 norm, the least mean square (LMS) approach is the most common and reliable adaptive filtering rule when the system is linear and the noise is Gaussian. The LMS algorithm was developed by Widrow and Hoff [14] in 1959 and soon became popular among researchers working in the field of statistical signal processing. But the inability of this algorithm to deal with outliers or impulsive noise (heavy-tailed distributions) led to further research on robust adaptive algorithms. Many works of literature support the presence of impulsive noise in communication systems [15]. In an indoor wireless communication channel, electrical equipment such as microwave ovens, or office equipment like printers and copying machines, can cause impulsive noise [16], whereas in an outdoor environment it is caused by switching transients in power lines or automobile ignition [17]. In a radar system, natural causes like thunderstorms and lightning can produce impulsive noise [18]. Reinforcing this, in the recent past numerous research works in communication systems, such as source localization based on direction of arrival (DOA) and time difference of arrival (TDOA) estimation [19,20] and speech signal recovery using a variational Bayesian algorithm [21], have been carried out considering impulsive interference. An equivalence of the minimum mean squared error (MMSE) and maximum signal-to-noise-ratio (MSNR) criteria in Bayesian estimation over an additive non-Gaussian channel is proved in [22]. In [23], the authors proposed a method based on the empirical characteristic function (ECF) to estimate the signal level in a binary communication system under impulsive noise. An impulse noise reduction technique in telecommunications based on MMSE/maximum a posteriori (MAP) estimation is proposed in [24].

It has been noted that a lower-order statistical measure of the error signal, like the ℓp norm with 0 ≤ p ≤ 1, works better in an impulsive noise environment. The sign algorithm (SA) [25] uses the ℓ1 norm of the error as the cost function and is shown to work in an impulsive environment. Lower-order statistics, however, are found to have a detrimental effect on the convergence rate. To incorporate the attractive features of the ℓ2 norm, like a faster convergence rate and better steady-state error, while remaining less sensitive to outliers, the ℓ1 and ℓ2 norms are combined with a threshold in the Huber cost function [26]. But properly setting the threshold is found to be tiresome. Other robust cost functions which can improve the convergence rate include the mixed norm [27,28], which uses both ℓ1 and ℓ2 terms, the least logarithmic absolute difference (LLAD) [29], etc. Another way to tackle a slow convergence rate is by reusing data, as in the affine projection sign algorithm (APSA) [30], which combines the feature of multiple projections with ℓ1 norm optimization.

Another class of robust cost functions is developed by making use of the saturation property of error nonlinearities, such as the maximum correntropy criterion (MCC) [31–33], the Lorentzian adaptive filter (LAF) [34] and the sigmoid cost function [35]. In [36], a study of robust adaptive algorithms with error nonlinearity is carried out in which the existing algorithms are classified as V-shaped and Λ-shaped based on their weighting functions, and the authors propose a new family of robust M-shaped algorithms which brings together the merits of both the V-shaped and Λ-shaped algorithms.

Lately, much research has been going on in distributed estimation under an impulsive noise environment. An adapt-then-combine diffusion strategy is employed in [37] using the ℓ1 norm of the error as the cost function at each node. A distributed 1-bit compressed sensing algorithm is applied to sparse channel estimation in impulsive noise in [38]. A weighted least mean p-power (LMP) algorithm is proposed in [39] for a non-uniform noise environment with distributed optimization. The efficacy of the correntropy criterion in the presence of outliers is extended to distributed estimation in [40]. While these works proved their efficacy in the presence of impulsive noise, not much discussion is carried out in terms of convergence rate. The data-reuse feature of affine projection (AP) improves the convergence rate and is used together with ℓ1 norm optimization to achieve robustness, as mentioned earlier. It is extended to distributed parameter estimation in [41].

To achieve the merits of data reuse along with strong outlier rejection in distributed estimation, we extend the LAF filter in [34] to a diffusion framework and compare its performance with the Huber cost function and MCC in a distributed network, and with the diffusion affine projection sign algorithm (DAPSA). In the LAF, a sliding-window-based Lorentzian cost function is used, where the trade-off between convergence rate and steady-state error is controlled by the window length. The Lorentzian norm is convex near the origin and, unlike the Huber norm, is continuous everywhere [34]. The main attraction of the Lorentzian norm is its ability to reduce the influence of outliers considerably. Also, the lack of nonlinear mathematical operations in the filter update equation makes it a suitable candidate for hardware implementation. Our main contribution is the extension of the LAF filter to distributed network channel estimation and the convergence analysis of the same in an impulsive noise environment.

The remainder of the paper is organized as follows. Section 2 presents the adaptive solution derived for channel estimation in a distributed framework. The distributed robust adaptive algorithms used in this paper are discussed in Section 3. The proposed Combine-then-Adapt diffusion LAF is discussed in Section 4. The performance analysis of the proposed diffusion LAF algorithm in a global framework is presented in Section 5. Section 6 illustrates the simulation results and the performance comparison of the proposed Combine-then-Adapt diffusion LAF with other robust algorithms. Section 7 provides the concluding remarks.

2. Distributed channel estimation

The channel estimation scenario is described as follows. Consider a single source transmitting a signal through a wireless channel to a receiver which is modeled as a distributed network with N nodes. At each node, there are spatially independent noise sources (a mixture of Gaussian and impulsive distributions), as shown in Fig. 1.

Fig. 1. Networked channel estimation model.

Mathematically, the channel can be modeled as a linear FIR filter of order M, connected to a noise source at each node.


The estimation problem is mathematically described below. From the collection of noisy measurements across the nodes of the sensor network, the M × 1 unknown channel coefficient vector w^(o) is to be estimated. Each node k, where k = 1, ..., N, has access to time realizations of scalar data d_k(i) and the M × 1 regressor vector u_{k,i} = [u_k(i) u_k(i−1) ... u_k(i−M+1)]^T of zero-mean random data {d_k, u_k}, which is related to w^(o) by the model [8]:

d_k(i) = u_{k,i}^T w^(o) + v_k(i)    (1)

where the background noise v_k(i) is independent of u_k(j) for any i, j.

2.1. Local optimization

Local optimization of a diffusion distributed network to identify the unknown parameter is elaborated in [7]. For ease of reference, a slightly modified set of the relevant equations is written here.

The local cost function is given as

J_k^loc(w) = Σ_{ℓ∈N_k} o_{ℓ,k} E|d_ℓ(i) − u_{ℓ,i}^T w|²    (2)

where the o_{ℓ,k} are the coefficients of a matrix O which satisfy the condition

o_{ℓ,k} = 0 if ℓ ∉ N_k, and 1^T O = 1^T    (3)

where 1 is an N × 1 vector with unit entries. Applying the stochastic gradient descent solution [42] to the cost function in (2), we get

w_k^i = w_k^{i−1} − μ_k (∇J_k^loc(w))*    (4)
      = w_k^{i−1} + μ_k Σ_{ℓ∈N_k} o_{ℓ,k} u_{ℓ,i} (d_ℓ(i) − u_{ℓ,i}^T w_k^{i−1})    (5)

where the factor of 2 arising from the gradient is absorbed into the step size μ_k. There are two schemes of diffusion estimation: Adapt-then-Combine (ATC) and Combine-then-Adapt (CTA) [7,8]. In ATC the first stage of operation is adaptation, which is then followed by diffusion of the neighborhood estimates, whereas in CTA a local diffusion step precedes the adaptation step. The CTA diffusion scheme is given below:

φ_k^{i−1} = Σ_{ℓ∈N_k} c_{ℓ,k} w_ℓ^{i−1}    (6)

w_k^i = φ_k^{i−1} + μ_k Σ_{ℓ∈N_k} o_{ℓ,k} u_{ℓ,i} (d_ℓ(i) − u_{ℓ,i}^T φ_k^{i−1})    (7)

Here c_{ℓ,k} denotes the entries of a combiner matrix C, which is a mathematical representation of the network topology; i.e., if nodes k and ℓ are not connected, then c_{ℓ,k} = 0. A popular combiner matrix rule is the Metropolis rule, which is used in this paper. Let n_k and n_ℓ denote the degrees of nodes k and ℓ, respectively. The Metropolis rule is defined as [8]:

c_{ℓ,k} = 1 / max(n_k, n_ℓ)   if k ≠ ℓ are linked    (8)
c_{ℓ,k} = 0                   if k and ℓ are not linked    (9)
c_{k,k} = 1 − Σ_{ℓ∈N_k\{k}} c_{ℓ,k}   for k = ℓ    (10)

An uncomplicated CTA scheme with O = I leads to the simple CTA diffusion algorithm:

φ_k^{i−1} = Σ_{ℓ∈N_k} c_{ℓ,k} w_ℓ^{i−1}    (11)

w_k^i = φ_k^{i−1} + μ_k u_{k,i} (d_k(i) − u_{k,i}^T φ_k^{i−1})    (12)

This paper follows the uncomplicated CTA diffusion scheme, but with a robust adaptive algorithm, as the LMS-based distributed adaptive algorithm fails in an impulsive environment. The following section elaborates on the robust adaptive algorithms implemented in this paper.

3. Robust distributed algorithms

Least-mean-square-based distributed adaptive algorithms, built upon the assumptions of Gaussian noise and linearity, are unable to cope with an impulsive environment, which causes them to fail to converge. This has prompted researchers to work on robust adaptive algorithms.

In this paper, we have implemented the Lorentzian cost function in a diffusion distributed framework and compared it with the Combine-then-Adapt diffusion maximum correntropy criterion (CTA-DMCC) [40], the Combine-then-Adapt diffusion affine projection sign algorithm (CTA-DAPSA) [41], and the Combine-then-Adapt Huber diffusion LMS (CTA-HuberDLMS). Recently, two papers were published on the diffusion Huber adaptive filter, [43] and [44]. The former deals with a normalized Huber adaptive algorithm in a diffusion framework with the Adapt-then-Combine (ATC) strategy, while the latter introduces a variable step size and a variable threshold, also using the ATC strategy but without the normalization of the former. Hence, to have a similar framework for comparison with CTA-DLAF and CTA-DMCC, an uncomplicated CTA strategy for the diffusion Huber adaptive algorithm with constant step size is presented in this paper.

The distributed framework in Section 2 with the data model in (1) is used for the following algorithms.

3.1. Huber diffusion LMS algorithm

The Huber cost function is a fusion of the ℓ1 and ℓ2 norms which switches dynamically according to the relation between e(i) and a threshold δ [43]:

J(i) = { E[½ e²(i)],       if |e(i)| ≤ δ
       { δ|e(i)| − ½ δ²,   if |e(i)| > δ    (13)

In a distributed network, the local cost function at each node, which is a linear combination of the cost in (13) over the neighborhood of the local node k, is minimized to arrive at the optimal solution w^(o). The local cost function is given by

J_k^local(w) = Σ_{ℓ∈N_k} o_{ℓ,k} { E[½ e_{ℓ,k}²(i)],   if |e_k(i)| ≤ δ
                                 { δ|e_k(i)| − ½ δ²,   if |e_k(i)| > δ    (14)

where e_{ℓ,k}(i) = d_ℓ(i) − u_{ℓ,i}^T w_k^{i−1} and e_k(i) = e_{k,k}(i). The coefficients o_{ℓ,k} satisfy the condition in (3). Choosing O = I_N, a simple uncomplicated CTA strategy can be derived by minimizing (14) with the gradient descent method, which is given below:

φ_k^{(i−1)} = Σ_{ℓ∈N_k} c_{ℓ,k} w_ℓ^{i−1}    (15)

w_k^{(i)} = { φ_k^{(i−1)} + μ_k e_k(i) u_{k,i},           if |e_k(i)| ≤ δ
            { φ_k^{(i−1)} + μ_k δ csgn(e_k(i)) u_{k,i},   otherwise    (16)

where e_k(i) = d_k(i) − u_{k,i}^T φ_k^{(i−1)}. The mathematical operator csgn{·} for a complex number x is defined as

csgn{x} = sign(real(x)) + j · sign(imag(x))
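The Huber diffusion update (15)–(16) can likewise be sketched for real-valued data, where csgn reduces to the ordinary sign. This is a minimal illustration under assumed names and dimensions, not the authors' implementation.

```python
import numpy as np

def huber_cta_step(W, C, U, d, mu, delta):
    """One CTA Huber diffusion-LMS iteration, eqs (15)-(16), real data.
    W: M x N node estimates; C: N x N combiner matrix; U: M x N
    regressors u_{k,i}; d: length-N measurements d_k(i);
    mu: step size; delta: Huber threshold (a tuning parameter)."""
    Phi = W @ C                                   # combine step (15)
    e = d - np.einsum('mk,mk->k', U, Phi)         # local errors e_k(i)
    # LMS-type update inside the threshold, clipped (sign) update outside
    g = np.where(np.abs(e) <= delta, e, delta * np.sign(e))
    return Phi + mu * U * g                       # adapt step (16)
```

The clipping at delta is what bounds the influence of an impulsive sample on the weight update.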

3.2. Diffusion maximum correntropy criterion algorithm

The DMCC algorithm proposed in [40] is a correntropy-based robust adaptive algorithm in a diffusion distributed framework. The Gaussian kernel, the most popular kernel used in correntropy, leads to the following instantaneous MCC cost:

G_σ^MCC(e(i)) = (1/(σ√(2π))) exp(−e²(i)/(2σ²))    (17)

which is used in [40] to derive the DMCC algorithm.

In a distributed network with N nodes, the local cost function at each node k is given by

J_k^local = Σ_{ℓ∈N_k} o_{ℓ,k} G_σ^MCC(e_ℓ(i))    (18)

where the o_{ℓ,k} satisfy (3). A gradient-based CTA-DMCC algorithm is derived by taking the derivative of (18), which is expressed below:

φ_k^{(i−1)} = Σ_{ℓ∈N_k} c_{ℓ,k} w_ℓ^{i−1}
w_k^{(i)} = φ_k^{(i−1)} + (μ_k/σ²) Σ_{ℓ∈N_k} o_{ℓ,k} G_σ^MCC(e_ℓ(i)) e_ℓ(i) u_{ℓ,i}    (19)

where e_ℓ(i) = d_ℓ(i) − u_{ℓ,i}^T φ_k^{(i−1)}. Keeping O = I_N, the uncomplicated CTA version is arrived at and is given by

φ_k^{(i−1)} = Σ_{ℓ∈N_{k,i−1}} c_{ℓ,k} w_ℓ^{i−1}    (20)

w_k^{(i)} = φ_k^{(i−1)} + (μ_k/σ²) G_σ^MCC(e_k(i)) e_k(i) u_{k,i}    (21)
          = φ_k^{(i−1)} + μ'_k G_σ^MCC(e_k(i)) e_k(i) u_{k,i}    (22)

where μ'_k = μ_k/σ².

3.3. Diffusion affine projection sign algorithm

A distributed data-reuse algorithm with diffusion strategy, which is robust against impulsive interference, is given in [41]. With the projection order L given, the regressor matrix is defined as

U_{k,i} = [u_{k,i} u_{k,i−1} ... u_{k,i−L+1}]^T    (23)

The CTA-DAPSA is stated below:

φ_k^{(i−1)} = Σ_{ℓ∈N_{k,i−1}} c_{ℓ,k} w_ℓ^{i−1}    (24)

w_k^{(i)} = φ_k^{(i−1)} + μ_k U*_{k,i} csgn(e_{k,i}) / √([U*_{k,i} csgn(e_{k,i})]* [U*_{k,i} csgn(e_{k,i})] + δ)    (25)

where e_{k,i} = d_{k,i} − U_{k,i} φ_k^{i−1}.

4. Proposed diffusion LAF algorithm

In this section, a distributed solution with diffusion strategy is implemented for the sliding-window-based Lorentzian adaptive filter algorithm [34].

The sliding-window-based cost function for the Lorentzian adaptive filter algorithm [34] is given as

J_{w_k^{(i)}}(i) = Σ_{τ=0}^{L−1} ψ_γ(e_{w_k^{(i)}}(i − τ)) ≡ ‖d_{k,i} − U_{k,i} w_k^{(i)}‖_{LL2,γ}    (26)

where the Lorentzian function is ψ_γ(x) = log(1 + (x/γ)²) and U_{k,i} is the L × M regressor matrix

U_{k,i} = [u_{k,i} u_{k,i−1} ... u_{k,i−L+1}]^T    (27)

and d_{k,i} = [d_k(i) d_k(i−1) ... d_k(i−L+1)]^T.

e_{w_k^{(i)}}(n) = d_k(n) − u_{k,n}^T w_k^{(i)},   for i − L + 1 ≤ n ≤ i    (28)

Applying a gradient descent search on the linear combination of the cost function (26) at each node k within the neighborhood N_k, the weight update equation in [34] is extended to the diffusion distributed framework as follows.

The local cost function is defined as

J_k^local(w_k) = Σ_{ℓ∈N_k} o_{ℓ,k} J_{w_ℓ^{(i)}}(i)    (29)

Taking the derivative of (29) and using (26), we get

∇J_k^local(w_k) = −(2/γ²) Σ_{ℓ∈N_k} o_{ℓ,k} U*_{ℓ,i} W_{ℓ,i} (d_{ℓ,i} − U_{ℓ,i} w_k)    (30)

where W_{k,i} = diag{g_{k,i}(0), g_{k,i}(1), ..., g_{k,i}(L−1)} is an L × L diagonal matrix with

g_{k,i}(τ) = γ² / (γ² + e²_{w_k^{(i)}}(i − τ)),   τ = 0, 1, ..., L − 1

and the coefficients o_{ℓ,k} follow the relation in (3). The gradient-based algorithm at node k to obtain the estimate of w^(o) at time i is given as follows:

w_k^{(i)} = w_k^{(i−1)} + μ'_k Σ_{ℓ∈N_k} o_{ℓ,k} U*_{ℓ,i} W_{ℓ,i} (d_{ℓ,i} − U_{ℓ,i} w_k^{(i−1)})    (31)

with μ'_k = 2μ_k/γ² (the parameter γ² is absorbed in the step size). The mathematical framework for diffusion LAF is expressed below, where we use a linear combiner to diffuse the data in the spatial domain, along with a sliding-window-based Lorentzian LMS adaptive rule at each node. In this paper, we follow the Combine-then-Adapt (CTA) rule with O = I_N, which yields this uncomplicated CTA-DLAF algorithm:

φ_k^{(i−1)} = Σ_{ℓ∈N_{k,i−1}} c_{ℓ,k} w_ℓ^{i−1}
w_k^{(i)} = φ_k^{(i−1)} + μ_k U*_{k,i} W_{k,i} (d_{k,i} − U_{k,i} φ_k^{(i−1)})    (32)

The outline of the proposed CTA-DLAF is given in Algorithm 1.

Algorithm 1: Diffusion LAF algorithm.
1 Initialize: w_k^{(0)} = 0;
2 for i = 1, 2, ... do
3   for k = 1, 2, ..., N do
4     φ_k^{(i−1)} = Σ_{ℓ∈N_{k,i−1}} c_{ℓ,k} w_ℓ^{i−1}
5     e_{k,i} = d_{k,i} − U_{k,i} φ_k^{(i−1)}
6     g_{k,i}(j) = γ² / (γ² + e²_{k,i}(j)), for j = 0, 1, ..., L − 1
7     W_{k,i} = diag{g_{k,i}(0), g_{k,i}(1), ..., g_{k,i}(L−1)}
8     w_k^{(i)} = φ_k^{(i−1)} + μ_k U*_{k,i} W_{k,i} e_{k,i}
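A minimal NumPy sketch of one iteration of Algorithm 1 for real-valued data is given below. The function name, dimensions, and parameter values are illustrative assumptions, not part of the paper.

```python
import numpy as np

def cta_dlaf_step(W, C, U_blocks, d_blocks, mu, gamma):
    """One CTA-DLAF iteration following Algorithm 1 (real data).
    W: M x N node estimates; C: N x N combiner (e.g. Metropolis);
    U_blocks[k]: L x M windowed regressor matrix U_{k,i};
    d_blocks[k]: length-L measurement window d_{k,i};
    mu, gamma: step size and Lorentzian scale (tuning parameters)."""
    _, N = W.shape
    Phi = W @ C                                       # combine (spatial diffusion)
    W_new = np.empty_like(W)
    for k in range(N):
        e = d_blocks[k] - U_blocks[k] @ Phi[:, k]     # windowed error e_{k,i}
        g = gamma**2 / (gamma**2 + e**2)              # Lorentzian weights g_{k,i}(j)
        # adapt: w_k = phi_k + mu * U^T diag(g) e   (eq. (32))
        W_new[:, k] = Phi[:, k] + mu * U_blocks[k].T @ (g * e)
    return W_new
```

Note how the diagonal weights g fall toward zero for large (impulsive) errors, which is precisely how the Lorentzian rule de-emphasizes outliers without any nonlinear operation beyond a division.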

Table 1
Computational complexity of CTA-HuberDLMS, CTA-DAPSA, CTA-DMCC, and CTA-DLAF for each node per iteration. n_k denotes the cardinality (neighborhood size) of node k.

Algorithm | Multiplications | Additions | Exponentials | Sgn(·) | Sqrt(·)
CTA-HuberDLMS, |e_k(i)| ≤ δ | 2M + 1 + n_k M | n_k M + M | 0 | 0 | 0
CTA-HuberDLMS, |e_k(i)| > δ | 2M + 2 + n_k M | n_k M + M | 0 | 1 | 0
CTA-DAPSA | 2LM + L + M + n_k M | 2LM + n_k M | 0 | 0 | 1
CTA-DMCC | 2M + 4 + n_k M | n_k M + M | 1 | 0 | 0
CTA-DLAF | 2LM + 4L + n_k M | L + 2LM + n_k M − M | 0 | 0 | 0

4.1. Complexity analysis

In Table 1, we provide a complexity comparison of our proposed algorithm with the other robust algorithms, considering real data. The slight increase in the computational complexity of the proposed algorithm is attributed to the window order L. Nevertheless, the absence of exponential computation gives our proposed algorithm an upper hand compared to the diffusion MCC algorithm.

5. Performance analysis of the diffusion LAF algorithm

The diffusion of adaptive filters in the spatial domain makes the performance analysis complex compared to a single adaptive filter. Nevertheless, resorting to some assumptions on the random data and background noise, we proceed with the stability analysis by following the procedure adopted in [8]. The background noise is modeled as a Gaussian mixture in this analysis, different from the Gaussian noise scenario in [8].

The global matrices collecting the data and filter updates across the local nodes are given below:

w^i ≜ col{w_1^{(i)}, w_2^{(i)}, ..., w_N^{(i)}}   (NM × 1)
φ^{i−1} ≜ col{φ_1^{(i−1)}, φ_2^{(i−1)}, ..., φ_N^{(i−1)}}   (NM × 1)
U_i ≜ diag{U_{1,i}, U_{2,i}, ..., U_{N,i}}   (NL × NM)
d_i ≜ col{d_{1,i}, d_{2,i}, ..., d_{N,i}}   (NL × 1)
W_i = diag{W_{1,i}, W_{2,i}, ..., W_{N,i}}   (NL × NL)

where U_i and W_i are block diagonal. The local step sizes are collected into an NM × NM block diagonal matrix

D = diag{μ_1 I_M, μ_2 I_M, ..., μ_N I_M}    (33)

The desired data available at the local nodes as expressed in (1) is used to write the global vector of measurements d_i:

d_i = U_i w^o + v_i    (34)

where w^o = Q w^(o) and v_i = col{v_{1,i}, v_{2,i}, ..., v_{N,i}}, in which v_{k,i} = [v_k(i) v_k(i−1) ... v_k(i−L+1)]^T. Q is an NM × M matrix defined as Q = col{I_M, I_M, ..., I_M}.

Using the above relations, the global representation of the adaptive algorithm in (32) is given as

φ^{i−1} = G w^{i−1}
w^i = φ^{i−1} + D U*_i W_i (d_i − U_i φ^{i−1})    (35)

Replacing φ^{i−1} by G w^{i−1}, we get

w^i = G w^{i−1} + D U*_i W_i (d_i − U_i G w^{i−1})    (36)

where G = C ⊗ I_M is the NM × NM transition matrix and C is the N × N diffusion combination matrix with entries [c_{k,ℓ}]; ⊗ is the Kronecker product. The global weight error vector is defined as

w̃^i = w^o − w^i    (37)

Also note that

G w^o = w^o    (38)

5.1. Mean stability analysis

The mean behavior of the CTA-DLAF algorithm is derived here. Subtracting the left-hand side of the global update equation in (36) from w^o and the right-hand side from G w^o [8], we get

w̃^i = (I_{NM} − D U*_i W_i U_i) G w̃^{i−1} − D U*_i W_i v_i    (39)

Taking the expectation on both sides of (39) and assuming spatial and temporal independence of the regressor data u_{k,i} and the background noise, we get

E w̃^i = [I_{NM} − D E(U*_i W_i U_i)] G E w̃^{i−1}    (40)

We need to evaluate the NM × NM data moment E(U*_i W_i U_i) prior to establishing the condition for mean stability. Assume U*_{k,i} U_{k,i} and W_{k,i} are statistically independent [34]. The (k, ℓ) block of E(U*_i W_i U_i) is given by

A_{kℓ} = [E(U*_i W_i U_i)]_{kℓ} = { R_{u,k} Tr(E W_{k,i}),   for k = ℓ
                                  { 0,                       for k ≠ ℓ    (41)

where

R_{u,k} = E[u_{k,i} u*_{k,i}]
R_{u,k} Tr(E W_{k,i}) = E[U*_{k,i} W_{k,i} U_{k,i}]

The above evaluation reduces E(U*_i W_i U_i) to

E(U*_i W_i U_i) = R_U S    (42)

where

R_U = diag{R_{u,1}, R_{u,2}, ..., R_{u,N}}
S = diag{Tr(E W_{1,i}) I_M, ..., Tr(E W_{N,i}) I_M}

For stability in the mean we must have

|λ_max(P G)| < 1    (43)

where λ_max(·) denotes the maximum eigenvalue of the matrix and P = (I_{NM} − D R_U S). In [8] it is shown that cooperation improves stability, i.e. |λ_max(P G)| ≤ |λ_max(P)|. Hence, to show that the CTA-DLAF is mean stable, it only requires showing that

|λ_max(P)| < 1

i.e. P = (I_{NM} − D R_U S) must be a stable matrix. As time i → ∞, S → L I_{NM}. Keeping this in mind, the condition for stability is

0 < μ_k < 2 / (L λ_max(R_{u,k}))

5.2. Mean square transient analysis

The mean square transient analysis is carried out in this section. The analysis is based on the weighted energy conservation relation and the variance relation [8,42]. The error vector of length L at each node k is defined as

e_{k,i} = d_{k,i} − U_{k,i} φ_k^{i−1}    (44)

The NL × 1 global error vector e_i = col{e_{1,i}, e_{2,i}, ..., e_{N,i}} is written as

e_i = d_i − U_i G w^{i−1}
    = U_i G w̃^{i−1} + v_i
    = e^a_{G,i} + v_i    (45)

where

e^a_{G,i} = U_i G w̃^{i−1}    (46)

The global a priori and a posteriori weighted estimation errors are defined as

e^{aΣ}_{DG,i} ≜ U_i D Σ G w̃^{i−1}    (47)
e^{pΣ}_{D,i} ≜ U_i D Σ w̃^i    (48)

for some arbitrary NM × NM matrix Σ ≥ 0 [8]. The weight error vector recursion given below is obtained by following the same path as that used in deriving (39):

w̃^i = G w̃^{i−1} − D U*_i W_i e_i    (49)

Substituting (45) into (49) and performing a weighted energy balance on the resulting equation leads to equation (50):

‖w̃^i‖²_Σ = ‖w̃^{i−1}‖²_{G*ΣG} − (e^{aΣ}_{DG,i})* W_i e^a_{G,i} − (e^{aΣ}_{DG,i})* W_i v_i
  − (e^a_{G,i})* W_i e^{aΣ}_{DG,i} + (e^a_{G,i})* W_i U_i D Σ D U*_i W_i e^a_{G,i}
  + (e^a_{G,i})* W_i U_i D Σ D U*_i W_i v_i − v*_i W_i e^{aΣ}_{DG,i}
  + v*_i W_i U_i D Σ D U*_i W_i e^a_{G,i} + v*_i W_i U_i D Σ D U*_i W_i v_i    (50)

An additional assumption, beyond those in [8], is adopted in this convergence analysis: the dependency of the weight error vector on the past noise [45]. Substituting (46) and (47) into (50) and taking the expectation on both sides, the variance relation is obtained as

E‖w̃^i‖²_Σ = E‖w̃^{i−1}‖²_{Σ'} + E[v*_i C*_i D Σ D C_i v_i]
  + 2 Σ_{n=1}^{L−1} E[ v*_i C*_i D Σ ( Π_{l=0}^{n−1} G[I_{NM} − D P_{i−l}] ) D C_{i−n} v_{i−n} ]

Σ' = G* Σ G − G* Σ D U*_i W_i U_i G − G* U*_i W_i U_i D Σ G
  + G* U*_i W_i U_i D Σ D U*_i W_i U_i G    (51)

where C_i = U*_i W_i and P_i = U*_i W_i U_i. The extra noise term in (51) accounts for the dependency of the weight error vector on the past noise, and the derivation follows as in [45].

Since Σ' itself is a random quantity, as it depends on the input data, the transient analysis becomes complex. Hence, for the sake of mathematical tractability of (51), some simplifying assumptions are taken into consideration. The independence of the regressor vector from w̃^{i−1} is a common assumption [42] in adaptive filter analysis. But since U_i consists of successive regressors, this is a rather strong condition compared to the usual independence assumption. Hence we consider a weaker assumption, that U*_i W_i U_i is independent of w̃^{i−1}. This assumption helps us to replace the random weighting matrix Σ' by its mean Σ̄' = E Σ', a deterministic quantity:

E‖w̃^i‖²_Σ = E‖w̃^{i−1}‖²_{Σ̄'} + E[v*_i C*_i D Σ D C_i v_i]
  + 2 Σ_{n=1}^{L−1} E[ v*_i C*_i D Σ ( Π_{l=0}^{n−1} G[I_{NM} − D P_{i−l}] ) D C_{i−n} v_{i−n} ]

Σ̄' = G* Σ G − G* Σ D E[U*_i W_i U_i] G − G* E[U*_i W_i U_i] D Σ G
  + G* E[U*_i W_i U_i D Σ D U*_i W_i U_i] G    (52)

By looking into (52) we can see that we need to evaluate certain data moments, namely

E[ v*_i C*_i D Σ ( Π_{l=0}^{n−1} G[I_{NM} − D P_{i−l}] ) D C_{i−n} v_{i−n} ],    (53)
E[U*_i W_i U_i], E[v*_i W_i U_i D Σ D U*_i W_i v_i] and E[U*_i W_i U_i D Σ D U*_i W_i U_i]

prior to establishing the learning behavior of the CTA-DLAF algorithm. The fourth-order data moment in (53) is difficult to evaluate in closed form for arbitrary distributions, and hence Gaussian-distributed data is considered. The evaluation of the data moments and the subsequent analysis is simplified if we use remodeled variables by appealing to the eigendecomposition [42] of E[U*_i W_i U_i] = R_U S. The eigendecomposition is given by

E[U*_i W_i U_i] = R_U S = V Λ V*    (54)

where Λ = diag{Λ_1, Λ_2, ..., Λ_N}. The eigendecomposition of R_{u,k} is given by

R_{u,k} = V_k Λ_k V*_k    (55)

The remodeling of the input regressor autocorrelation matrix by the Lorentzian weight matrix in (54), compared to the diffusion LMS case in [8], calls for a slight change in the eigendecomposition. From (54) and (55) we vectorize Λ_k into

vec{Λ_k} Tr(E W_{k,i}) = λ̄_{k,i} = Tr(E W_{k,i}) λ_k    (56)

where λ_k = vec{Λ_k}. The new transformed variables are defined below:

w̄^i = V* w̃^i,  Ū_i = U_i V,  Ḡ = V* G V,  Σ̄ = V* Σ V,
Σ̄' = V* Σ' V,  D̄ = V* D V = D,  W̄_i = V* W_i V = W_i

The variance relation in (52) is rewritten in terms of the transformed variables:

E‖w̄^i‖²_{Σ̄} = E‖w̄^{i−1}‖²_{Σ̄'} + E[v*_i C̄*_i D Σ̄ D C̄_i v_i]
  + 2 Σ_{n=1}^{L−1} E[ v*_i C̄*_i D Σ̄ ( Π_{l=0}^{n−1} Ḡ[I_{NM} − D P̄_{i−l}] ) D C̄_{i−n} v_{i−n} ]

Σ̄' = Ḡ* Σ̄ Ḡ − Ḡ* Σ̄ D E[Ū*_i W_i Ū_i] Ḡ − Ḡ* E[Ū*_i W_i Ū_i] D Σ̄ Ḡ
  + Ḡ* E[Ū*_i W_i Ū_i D Σ̄ D Ū*_i W_i Ū_i] Ḡ    (57)

To evaluate the data moments, we closely follow the procedure in [8]. A block matrix-to-vector conversion operator bvec{·} is used [8]. For ease of reference, note that bvec{Σ̄} = σ̄.


operator at hand, the next step is to evaluate the transformed data ¯ DŪi Wi v i ] = xT Bd σ̄
E [ v ∗i Wi Ūi D (67)
moments in the remodeled variance relation (57), namely,

E[Ūᵢ*WᵢŪᵢ],  E[vᵢ*WᵢŪᵢDΣ̄DŪᵢ*Wᵢvᵢ],  E[vᵢ*C̄ᵢDΣ̄Ḡ ∏_{l=0}^{n−1}[I_{NM} − DP̄_{i−l}] DC̄_{i−n}v_{i−n}]  and  E[Ūᵢ*WᵢŪᵢDΣ̄DŪᵢ*WᵢŪᵢ].   (58)

By substituting Ūᵢ = UᵢV and from (54), the first data moment in (58) evaluates to

E[Ūᵢ*WᵢŪᵢ] = Λ̄.   (59)

The second term in (58) is expressed as

E[vᵢ*WᵢŪᵢDΣ̄DŪᵢ*Wᵢvᵢ] = Tr{E[Wᵢvᵢvᵢ*WᵢŪᵢDΣ̄DŪᵢ*]}.   (60)

Assuming Wᵢvᵢvᵢ*Wᵢ and ŪᵢDΣ̄DŪᵢ* are independent, (60) is written as

E[vᵢ*WᵢŪᵢDΣ̄DŪᵢ*Wᵢvᵢ] = Tr{E[Wᵢvᵢvᵢ*Wᵢ] E[ŪᵢDΣ̄DŪᵢ*]}.   (61)

For noise with low power we may assume that Wᵢ and vᵢ are independent, but this assumption fails when the noise power is appreciable. Hence the term E[Wᵢvᵢvᵢ*Wᵢ] is not approximated further, which would be mathematically intractable, and is retained as it is for the impulsive noise scenario. Let

B ≜ E[ŪᵢDΣ̄DŪᵢ*].

The L × L (k, ℓ) block of B is given by

B_{kℓ} = μ_k² Tr(Λ_k Σ̄_{kk}) I_L  for k = ℓ,  and  B_{kℓ} = 0  for k ≠ ℓ.   (62)

Now express B as B = [B₁ B₂ ... B_ℓ ... B_N], where B_ℓ is the ℓth block column,

B_ℓ = col{B_{1,ℓ}, B_{2,ℓ}, ..., B_{k,ℓ}, ..., B_{N,ℓ}}.   (63)

In order to vectorize B we apply the vec{·} operator on each block B_{kℓ} and denote the result by b_{kℓ} = vec{B_{kℓ}}, so that

b_{kℓ} = vec{I_L} μ_k² λ_kᵀ σ̄_{kk}  for k = ℓ,  and  b_{kℓ} = 0  for k ≠ ℓ.   (64)

Finally bvec{B} = col{b₁, b₂, ..., b_ℓ, ..., b_N}, where

b_ℓ = col{b_{1,ℓ}, b_{2,ℓ}, ..., b_{N,ℓ}} = col{0 σ̄_{1ℓ}, 0 σ̄_{2ℓ}, ..., vec{I_L} μ_ℓ² λ_ℓᵀ σ̄_{ℓℓ}, ..., 0 σ̄_{Nℓ}} ≜ B_ℓ σ̄_ℓ   (65)

where 0 is the L² × M² null matrix, σ̄_ℓ = col{σ̄_{1ℓ}, σ̄_{2ℓ}, ..., σ̄_{Nℓ}} and B_ℓ = diag{0, 0, ..., vec{I_L} μ_ℓ² λ_ℓᵀ, ..., 0}. Thus we get

bvec{B} = col{B₁ σ̄₁, B₂ σ̄₂, ..., B_N σ̄_N} = B_d σ̄   (66)

where B_d = diag{B₁, B₂, ..., B_N}. Thus we get (61) as

E[vᵢ*WᵢŪᵢDΣ̄DŪᵢ*Wᵢvᵢ] = xᵀ B_d σ̄   (67)

where x = bvec{E[Wᵢvᵢvᵢ*Wᵢ]ᵀ} and σ̄ = col{σ̄₁, σ̄₂, ..., σ̄_N}.

By invoking the properties of the trace operator just as done previously, the third term in (58) evaluates to

E[vᵢ*C̄ᵢDΣ̄Ḡ ∏_{l=0}^{n−1}[I_{NM} − DP̄_{i−l}] DC̄_{i−n}v_{i−n}] = xₙᵀ σ̄   (68)

where

xₙ = bvec{ E[Ḡ ∏_{l=0}^{n−1}[I_{NM} − DP̄_{i−l}] DC̄_{i−n}v_{i−n}vᵢ*C̄ᵢD]ᵀ }.   (69)

No further simplification of this expectation is possible due to the dependency between Wᵢ and vᵢ.

The fourth-order moment in (58) is solved by appealing to the Gaussian factorization theorem [8,42]. Since both Ūᵢ*WᵢŪᵢ and D are block diagonal, we can write

E[Ūᵢ*WᵢŪᵢDΣ̄DŪᵢ*WᵢŪᵢ] = D E[Ūᵢ*WᵢŪᵢ Σ̄ Ūᵢ*WᵢŪᵢ] D.   (70)

Let

F = E[Ūᵢ*WᵢŪᵢ Σ̄ Ūᵢ*WᵢŪᵢ].   (71)

Its M × M (k, ℓ) block is given by

F_{kℓ} = E[Tr{W_{k,i}W_{ℓ,i}} ū_{k,i}ū_{k,i}* Σ̄_{kℓ} ū_{ℓ,i}ū_{ℓ,i}*] + Σ_{τ₂=0}^{L−1} Σ_{τ₁=0, τ₁≠τ₂}^{L−1} E[ū_{k,i−τ₁}ū_{k,i−τ₁}* Σ̄_{kℓ} ū_{ℓ,i−τ₂}ū_{ℓ,i−τ₂}* W_{k,i}(τ₁,τ₁) W_{ℓ,i}(τ₂,τ₂)].   (72)

Taking the statistical independence of UᵢUᵢ* and Wᵢ into account, (72) is rewritten as

F_{kℓ} = E[Tr{W_{k,i}W_{ℓ,i}}] E[ū_{k,i}ū_{k,i}* Σ̄_{kℓ} ū_{ℓ,i}ū_{ℓ,i}*] + Σ_{τ₂=0}^{L−1} Σ_{τ₁=0, τ₁≠τ₂}^{L−1} E[ū_{k,i−τ₁}ū_{k,i−τ₁}* Σ̄_{kℓ} ū_{ℓ,i−τ₂}ū_{ℓ,i−τ₂}*] E[W_{k,i}(τ₁,τ₁) W_{ℓ,i}(τ₂,τ₂)].   (73)

Applying the fourth-moment lemma for Gaussian variables in [42], the M × M block F_{kℓ} in (73) evaluates to

F_{kℓ} = Γ_{k,k,i} (Λ_k Tr(Λ_k Σ̄_{kk}) + ξ Λ_k Σ̄_{kk} Λ_k) + Λ_k Σ̄_{kk} Λ_k Σ_{τ₂=0}^{L−1} Σ_{τ₁=0, τ₁≠τ₂}^{L−1} E[W_{k,i}(τ₁,τ₁) W_{k,i}(τ₂,τ₂)]  for k = ℓ,
F_{kℓ} = Γ_{k,ℓ,i} Λ_k Σ̄_{kℓ} Λ_ℓ  for k ≠ ℓ,   (74)

where Γ_{k,ℓ,i} = Tr{E[W_{k,i}W_{ℓ,i}]}, and ξ = 1 for complex data and ξ = 2 for real data. This result differs from that in [8] by the presence of Tr{E[W_{k,i}W_{ℓ,i}]} and by the summation term, which accounts for the previous L error vectors. For small step sizes the summation term can be neglected.

For a compact representation of the recursive equation in (57), the block matrix Σ̄' is converted to a vector by applying the bvec{·} operator, the vectorization operator for block matrices that is explained in detail in [8]. For any three block matrices P, Σ and Q the following relation holds:

bvec{PΣQ} = (Qᵀ ⊙ P)σ.   (75)
Now we proceed to express bvec{Σ̄'} keeping all of the above relations in mind. Σ̄' is rewritten here:

Σ̄' = Ḡ*Σ̄Ḡ − Ḡ*Σ̄DΛ̄Ḡ − Ḡ*Λ̄DΣ̄Ḡ + Ḡ*E[Ūᵢ*WᵢŪᵢDΣ̄DŪᵢ*WᵢŪᵢ]Ḡ.   (76)

The first term is vectorized as

bvec{Ḡ*Σ̄Ḡ} = (Ḡᵀ ⊙ Ḡ*)σ̄.   (77)

The second and third terms in (76) are vectorized respectively as

bvec{Ḡ*Σ̄DΛ̄Ḡ} = (Ḡᵀ ⊙ Ḡ*) bvec{I_{NM} Σ̄ DΛ̄} = (Ḡᵀ ⊙ Ḡ*)(Λ̄D ⊙ I_{NM})σ̄   (78)

and

bvec{Ḡ*Λ̄DΣ̄Ḡ} = (Ḡᵀ ⊙ Ḡ*)(I_{NM} ⊙ Λ̄D)σ̄.   (79)

Finally we apply the bvec{·} operator on the fourth term:

bvec{Ḡ*E[Ūᵢ*WᵢŪᵢDΣ̄DŪᵢ*WᵢŪᵢ]Ḡ} = (Ḡᵀ ⊙ Ḡ*)(D ⊙ D) bvec{E[Ūᵢ*WᵢŪᵢ Σ̄ Ūᵢ*WᵢŪᵢ]}.   (80)

By following the same procedure used in [8] for the bvec{·} operator, bvec{E[Ūᵢ*WᵢŪᵢ Σ̄ Ūᵢ*WᵢŪᵢ]} evaluates to

bvec{E[Ūᵢ*WᵢŪᵢ Σ̄ Ūᵢ*WᵢŪᵢ]} = F σ̄   (81)

where F = diag{F₁, F₂, ..., F_N} and σ̄ = col{σ̄₁, σ̄₂, ..., σ̄_N}, with F_ℓ obtained from (74) and defined as

F_ℓ = diag{ Γ_{1,ℓ}(Λ₁ ⊗ Λ_ℓ), Γ_{2,ℓ}(Λ₂ ⊗ Λ_ℓ), ..., Γ_{ℓ,ℓ}(λ_ℓ λ_ℓᵀ + ξ Λ_ℓ ⊗ Λ_ℓ) + (Λ_ℓ ⊗ Λ_ℓ) Σ_{τ₂=0}^{L−1} Σ_{τ₁=0, τ₁≠τ₂}^{L−1} E[W_{ℓ,i}(τ₁,τ₁) W_{ℓ,i}(τ₂,τ₂)], ..., Γ_{N,ℓ}(Λ_N ⊗ Λ_ℓ) }.   (82)

The time index i is dropped for compactness. Here σ̄_ℓ = col{σ̄_{1ℓ}, σ̄_{2ℓ}, ..., σ̄_{Nℓ}}, and σ̄_{kℓ} is obtained by applying the vec{·} operator on each block Σ̄_{kℓ} of the block matrix Σ̄. Hence

bvec{Ḡ*E[Ūᵢ*WᵢŪᵢDΣ̄DŪᵢ*WᵢŪᵢ]Ḡ} = (Ḡᵀ ⊙ Ḡ*)(D ⊙ D) F σ̄.   (83)

Summarizing,

bvec{Σ̄'} = σ̄' = F̄ σ̄   (84)

where

F̄ = (Ḡᵀ ⊙ Ḡ*)[I_{N²M²} − (I_{NM} ⊙ Λ̄D) − (Λ̄D ⊙ I_{NM}) + (D ⊙ D)F].   (85)

By applying the inverse operation bvec⁻¹{·} we express (57) as

E‖w̃ᵢ‖²_{bvec⁻¹{σ̄}} = E‖w̃_{i−1}‖²_{bvec⁻¹{F̄σ̄}} + (xᵀB_d + Σ_{n=1}^{L−1} xₙᵀ)σ̄.   (86)

Dropping the inverse notation for compactness, the recursive relation for the diffusion LAF algorithm is given as

E‖w̃ᵢ‖²_σ̄ = E‖w̃_{i−1}‖²_{F̄σ̄} + (xᵀB_d + Σ_{n=1}^{L−1} xₙᵀ)σ̄.   (87)

The global learning curves are derived by iterating (87) as in [8]:

E‖w̃ᵢ‖²_σ̄ = E‖w̃_{i−1}‖²_σ̄ + (xᵀB_d + Σ_{n=1}^{L−1} xₙᵀ) F̄ⁱ σ̄ − ‖w̃₀‖²_{F̄ⁱ(I−F̄)σ̄}.   (88)

The global MSD and EMSE curves then follow as in [8]. The global mean square deviation is defined as η(i) = (1/N) E‖w̃ᵢ‖². Choosing σ̄ = (1/N) bvec{I_{NM}} ≜ κ_η in (88) yields the global MSD curve

η(i) = η(i−1) + (xᵀB_d + Σ_{n=1}^{L−1} xₙᵀ) F̄ⁱ κ_η − ‖w̃₀‖²_{F̄ⁱ(I−F̄)κ_η}.   (89)

In a similar way, choosing σ̄ = (1/N) bvec{Λ} ≜ λ_ζ, where Λ = diag{Λ₁, Λ₂, ..., Λ_k, ..., Λ_N}, gives the global EMSE learning curve

ζ(i) = ζ(i−1) + (xᵀB_d + Σ_{n=1}^{L−1} xₙᵀ) F̄ⁱ λ_ζ − ‖w̃₀‖²_{F̄ⁱ(I−F̄)λ_ζ}.   (90)

As the learning process reaches steady state, the global steady-state MSD and EMSE are defined for i → ∞ as [8]

η = (1/N) E‖w̃_∞‖²   (MSD)   (91)
ζ = (1/N) E‖w̃_∞‖²_Λ   (EMSE)   (92)

which are further solved to

η = (1/N)(xᵀB_d + Σ_{n=1}^{L−1} xₙᵀ)(I − F̄)⁻¹ bvec{I_{NM}}   (MSD)   (93)
ζ = (1/N)(xᵀB_d + Σ_{n=1}^{L−1} xₙᵀ)(I − F̄)⁻¹ bvec{Λ}   (EMSE)   (94)

The theoretical expressions derived here are verified in the next section.

6. Simulation results

In this section the performance of the proposed CTA-DLAF is compared with CTA-DMCC, CTA-DAPSA, and CTA-HuberDLMS for different percentages of impulsive noise. The theoretical results for the proposed CTA-DLAF are also compared with simulation results. Fig. 2 [46] shows the topology of the WSN adopted in this paper.

In this work we assume the wireless channel to be Rayleigh fading and time invariant over the block of transmitted pilot symbols. Also, since the real and imaginary parts of the channel taps are independent, we have assumed real channel
coefficients for the simulation study. The 3-path Rayleigh channel coefficients w^o are generated as w^o = randn(1, 3)/norm(w^o). The pilot symbols u_i at each node are considered the same and Gaussian distributed, and the noise source is a mixture of Gaussian and impulsive noise. The impulsive noise is modeled by a Bernoulli-Gaussian distribution [29]:

v(i) = v_G(i) + b(i) v_I(i)   (95)

where v_G(i) is zero-mean Gaussian noise with variance σ_G², v_I(i) is zero-mean Gaussian noise with a large variance σ_I², and b(i) follows a Bernoulli distribution, i.e. P(b(i) = 1) = p_i and P(b(i) = 0) = 1 − p_i. The ratio of the impulsive noise variance to the Gaussian noise variance is defined as

η = σ_I² / σ_G².   (96)

In the experimental results that follow, the same noise power is considered at each node, with SNR 30 dB and η chosen as 10⁶. The threshold value for CTA-HuberDLMS is chosen as [29] δ_opt = 1/α_opt, where α_opt = (1/σ_G)√(p_i/(1 − p_i)), and the kernel width σ of CTA-DMCC is chosen as 1 in all experiments.

The analysis of the robustness performance in this work is based on three parameters: the mean square deviation (MSD), the excess mean square error (EMSE) and the absolute percentage of deviation (APD). In this paper the global MSD and EMSE, i.e. the averages of the local performance measures over the nodes, are used. The ensemble averages of the global MSD and EMSE in dB over N runs of independent experiments are given as

MSD(i) = 10 log₁₀( (1/N) Σ_{l=1}^{N} (1/N) Σ_{k=1}^{N} ‖w^o − w_{k,l}^{i−1}‖² )

EMSE(i) = 10 log₁₀( (1/N) Σ_{l=1}^{N} (1/N) Σ_{k=1}^{N} |u_{k,l}(i)(w^o − w_{k,l}^{i−1})|² )

where w_{k,l}^{i−1} is the estimate at node k in the lth run of the experiment. The absolute percentage of deviation is given as

APD = (1/(NM)) Σ_{l=1}^{NM} |(w^o(l) − w_∞(l)) / w^o(l)| × 100%   (97)

where w_∞ is the global estimate of the unknown parameter.

Fig. 2. Network topology.

6.1. Comparison of analytical and simulation results

To confirm the theoretical analysis of the proposed CTA-DLAF, the mathematical learning curve is compared with simulation results averaged over 50 trials in a 10% impulsive noise environment. The step size is taken as 0.01 and the tuning parameter γ for CTA-DLAF is chosen as 1. The result is plotted for window lengths L = 1 and 2, and shows how the parameter L controls the convergence and the steady state error. The expectation terms E[Wᵢvᵢvᵢ*Wᵢ], E[W_{k,i}W_{ℓ,i}] and E[W_{k,i}] are evaluated by ensemble averaging over the time index i.

Fig. 3 shows the comparison of the analytical and simulated learning curves; a mismatch is observed in the transient part due to the ensemble averaging of E[W_{k,i}W_{ℓ,i}] and E[W_{k,i}]. The time dependency of E[W_{k,i}W_{ℓ,i}] and E[W_{k,i}], which stems from the error term they contain, is lost in the ensemble averaging. This, unlike in the simulation, leads to constant matrices and causes the mismatch in the transient curve.

6.2. Effect of L and μ on the steady state performance of CTA-DLAF

The theoretical expression for the steady state MSD given in (93) is verified by varying the step size μ and the window length L, respectively, in Fig. 4. As i → ∞ the error tends to zero, and hence E[W_{k,i}] is approximated by the identity matrix [34].

The behavior of CTA-DLAF with different step sizes is plotted in Fig. 4(a) with window length L = 2 and γ = 1, in a 10% impulsive noise environment. Fig. 4(b) shows the steady state behavior of CTA-DLAF with varying window length L for step size μ = 0.01 and γ = 1 in a 10% impulsive noise environment. While increasing L leads to faster convergence at the expense of steady state error, choosing L > 2 leads to a larger memory requirement; since memory is a constraint at WSN nodes, L = 2 is an optimum choice.
Fig. 3. Comparison of global theoretical and simulated curve for CTA-DLAF: (a) MSD, (b) EMSE.
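The Bernoulli-Gaussian model of (95) and (96) is straightforward to simulate. A minimal sketch follows (NumPy; the function name and the particular values of σ_G, p and η are illustrative choices of ours, not taken from the paper):

```python
import numpy as np

def impulsive_noise(n, sigma_g=0.03, eta=1e6, p=0.1, seed=None):
    """Bernoulli-Gaussian noise v(i) = v_G(i) + b(i) v_I(i) as in (95),
    with variance ratio eta = sigma_I**2 / sigma_G**2 as in (96)."""
    rng = np.random.default_rng(seed)
    v_g = rng.normal(0.0, sigma_g, n)                  # background Gaussian noise
    b = rng.random(n) < p                              # Bernoulli gate, P(b = 1) = p
    v_i = rng.normal(0.0, sigma_g * np.sqrt(eta), n)   # impulses, std = sigma_g * sqrt(eta)
    return v_g + b * v_i

v = impulsive_noise(10000, seed=0)
print(np.mean(np.abs(v) > 3 * 0.03))   # fraction of samples beyond 3 sigma_G: roughly p
```

With η = 10⁶ an impulse is a thousand times larger than the background standard deviation, which is what makes the plain LMS update diverge in the experiments below.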
Fig. 4. Comparison of global steady state MSD for CTA-DLAF: (a) varying μ, (b) varying L.

Fig. 5. Steady state comparison of CTA-DLAF and CTA-DMCC with σ = 1 and varying γ in 10% impulsive noise: (a) steady state MSD (dB) of CTA-DLAF and CTA-DMCC, (b) steady state EMSE (dB) of CTA-DLAF and CTA-DMCC.

Table 2
Comparison of steady state MSD and EMSE (dB).

Adaptive        10% impulsive noise    20% impulsive noise    30% impulsive noise
algorithm       MSD       EMSE         MSD       EMSE         MSD       EMSE
CTA-DLAF        -48.2225  -48.0786     -48.2919  -48.2642     -49.0365  -49.0098
CTA-DMCC        -39.5700  -39.7806     -37.7340  -37.7235     -36.0198  -36.1820
CTA-HuberDLMS   -27.8501  -27.1291     -28.7468  -28.2109     -29.7635  -29.6070
CTA-DAPSA       -29.0925  -29.6178     -28.5562  -29.1192     -28.1170  -28.6086
CTA-DLMS          8.8471    8.6681      11.6021   11.6546      13.4244   13.2722

Hence in the following sections the window length L for CTA-DLAF is chosen as 2.

6.3. Steady state performance comparison with varying γ

In this experiment the tuning parameter γ of CTA-DLAF is varied and the steady state performance is compared with CTA-DMCC in a 10% impulsive noise environment. The step sizes of CTA-DLAF and CTA-DMCC are adjusted to provide the same convergence rate for each value of γ. Fig. 5 shows the comparable performance of CTA-DMCC and CTA-DLAF for γ > 0.4, with CTA-DLAF having a slight upper hand in the 10% impulsive noise environment. Fig. 5 also shows the necessity of properly tuning the parameter γ to provide robustness in an impulsive noise environment: for γ < 0.4, CTA-DMCC shows better performance than CTA-DLAF. In the following experiments a time-varying γ is chosen, i.e., γ = 10√(EMSE(i)).

6.4. Steady state performance comparison

In this experiment the steady state performance of CTA-DLAF, CTA-DMCC, CTA-DAPSA, CTA-HuberDLMS, and CTA-DLMS is compared. The step size of each algorithm is adjusted to give the same convergence rate, and the steady state errors are compared. The projection order L is chosen as 2 for CTA-DAPSA and CTA-DLAF. The step sizes for CTA-DLMS, CTA-HuberDLMS, CTA-DAPSA, CTA-DMCC and CTA-DLAF are taken as 0.1, 0.5, 0.03, 0.2, and 0.05 respectively. The performance results in 10, 20, and 30% impulsive noise environments are given in Table 2. The robustness of the four robust algorithms remains intact as the impulsive noise percentage increases. Fig. 6 shows the comparison of the learning curves in 30% impulsive noise. The comparable performance of CTA-HuberDLMS and CTA-DAPSA in 30% impulsive noise is attributed to the projection order L of CTA-DAPSA. Table 2 and Fig. 6 show the improvement in the performance of CTA-DLAF when choosing γ = 10√(EMSE(i)).
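The sensitivity to γ can be read off the Lorentzian cost of [34], J(e) = log(1 + e²/γ²), whose gradient 2e/(γ² + e²) is bounded and peaks at |e| = γ: errors far above γ are heavily down-weighted, while too small a γ also suppresses useful Gaussian-scale errors, matching the γ < 0.4 behavior seen above. A small sketch (our illustration, not the paper's code):

```python
def lorentzian_grad(e, gamma):
    # d/de log(1 + e**2 / gamma**2) = 2 e / (gamma**2 + e**2):
    # bounded in |e| (maximum 1/gamma at |e| = gamma), unlike the LMS gradient 2 e.
    return 2.0 * e / (gamma**2 + e**2)

gamma = 1.0
print(lorentzian_grad(0.1, gamma))   # small error: close to the LMS slope 2 e = 0.2
print(lorentzian_grad(1e3, gamma))   # impulse-sized error: update collapses toward 0
```

Scaling γ with the error level, as in the time-varying choice γ = 10√(EMSE(i)), keeps nominal errors on the quadratic part of the cost while still clipping impulses.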
Fig. 6. Learning curves for CTA-DLAF and other algorithms in 30% impulsive noise: (a) MSD, (b) EMSE.

Fig. 7. Transient performance of CTA-DLAF, CTA-DMCC, CTA-HuberDLMS and CTA-DLMS in 30% impulsive noise: (a) MSD, (b) EMSE.

6.5. Transient performance comparison

In this experiment the transient performance of CTA-DLAF is analyzed in the channel estimation scenario at 10, 20 and 30% impulsive noise with γ = 10√(EMSE(i)). Table 3 compares the algorithms in terms of approximate convergence rate.

Table 3
Comparison of convergence rate (approximate iterations) in impulsive noise.

Adaptive        10%        20%        30%
algorithm
CTA-DLAF        123        159        183
CTA-DMCC        447        612        956
CTA-HuberDLMS   1062       1077       1498
CTA-DAPSA       1390       1570       1925
CTA-DLMS        Diverging  Diverging  Diverging

In the 10% impulsive noise scenario, the step sizes are 0.06, 0.051, 0.0015, 0.03, and 0.05 for CTA-DLAF, CTA-DMCC, CTA-DAPSA, CTA-HuberDLMS and CTA-DLMS respectively. The kernel width σ of CTA-DMCC is chosen as 1. In the 20% scenario the step sizes are 0.06, 0.038, 0.0013, 0.03, and 0.05, and in the 30% scenario 0.06, 0.03, 0.0013, 0.03, and 0.05, for the same ordering of algorithms. The learning curves in the 30% impulsive noise environment are given in Fig. 7, which confirms that CTA-DLAF provides a faster convergence rate with window length L = 2 compared to the other algorithms.

The absolute percentage of deviation (APD) of CTA-DLMS, CTA-HuberDLMS, CTA-DAPSA, CTA-DMCC and CTA-DLAF in 0, 10, 20 and 30% impulsive noise is given in Table 4. The parameters are kept the same as in Section 6.4. The APD is calculated using (97), averaged over 100 independent experiments, with the estimated weight taken as the average over the last 500 iterations.

Table 4
Absolute percentage deviation (η = 10⁶).

Adaptive        0%        10%       20%       30%
algorithm
CTA-DLAF        0.0097    0.0154    0.0181    0.0086
CTA-DMCC        0.0110    0.0390    0.0178    0.0462
CTA-HuberDLMS   0.0477    0.1181    0.0389    0.0831
CTA-DAPSA       0.0228    0.0526    0.0331    0.0600
CTA-DLMS        0.0119    8.9863    3.5590    14.0859

Table 4 clearly shows the failure of diffusion LMS in an impulsive noise environment. In the 0% impulsive noise environment, i.e. when the channel is affected by Gaussian noise only, CTA-DAPSA gives a slightly degraded performance compared with CTA-DLMS, while CTA-DMCC performs on par with CTA-DLMS. CTA-DLAF outperforms all of these algorithms in estimating the AWGN channel when the tuning parameter γ is chosen from the a priori excess mean square error.
7. Conclusion

In this paper we proposed CTA-DLAF, a faster-converging channel estimation algorithm for sensor networks affected by impulsive noise. The proposed algorithm differs from the Lorentzian adaptive filter in the sense that it is extended to the sensor network setting and follows a diffusion strategy. Numerical simulations show the comparable performance of CTA-DLAF and CTA-DMCC in terms of both steady state error and convergence rate when a constant tuning parameter γ is chosen for CTA-DLAF. The simulation studies show an improved trade-off between convergence rate and steady state error compared to CTA-DMCC when γ² is taken 100 times larger than the a priori EMSE, i.e. when γ is time varying. In both cases CTA-DLAF outperforms CTA-HuberDLMS and CTA-DAPSA. Also, the steady state expression derived for CTA-DLAF shows close agreement with the simulation results.

Declaration of competing interest

We wish to confirm that there are no known conflicts of interest associated with this publication and there has been no significant financial support for this work that could have influenced its outcome.

Acknowledgments

This work was supported in part by the Science and Engineering Research Board (SERB), Govt. of India (Ref. No. SB/S3/EECE/210/2016 dated 28/11/2016).

References

[1] T. Panigrahi, P.M. Pradhan, G. Panda, B. Mulgrew, Block least mean squares algorithm over distributed wireless sensor network, J. Comput. Netw. Commun. 2012 (2012) 5564-5569.
[2] W. Xia, Y. Wang, A variable step-size diffusion LMS algorithm over networks with noisy links, Signal Process. 148 (2018) 205-213.
[3] A. Goldsmith, Wireless Communication, Cambridge University Press, 2005.
[4] J.G. Proakis, Digital Communications, McGraw Hill, 2000.
[5] S.R. Kim, A. Efron, Adaptive robust impulse noise filtering, IEEE Trans. Signal Process. 43 (8) (1995) 1855-1866.
[6] R. Abdolee, B. Champagne, Centralized adaptation for parameter estimation over wireless sensor networks, IEEE Commun. Lett. 19 (9) (2015) 1624-1627.
[7] F.S. Cattivelli, A.H. Sayed, Diffusion LMS strategies for distributed estimation, IEEE Trans. Signal Process. 58 (3) (2010) 1035-1048.
[8] C.G. Lopes, A.H. Sayed, Diffusion least-mean squares over adaptive networks: formulation and performance analysis, IEEE Trans. Signal Process. 56 (7) (2008) 3122-3136.
[9] C.G. Lopes, A.H. Sayed, Incremental adaptive strategies over distributed networks, IEEE Trans. Signal Process. 55 (8) (2007) 4064-4077.
[10] B.P. Mishra, T. Panigrahi, A. Dubey, Robust distributed estimation of wireless channel, in: IEEE International Conference on Applied Electromagnetics, Signal Processing and Communication, 2019, presented.
[11] T. Panigrahi, G. Panda, B. Mulgrew, Error saturation nonlinearities for robust incremental LMS over wireless sensor networks, ACM Trans. Sens. Netw. 11 (2) (2014) 27, 20 pp.
[12] J. Chen, C. Richard, A.H. Sayed, Multitask diffusion adaptation over networks, IEEE Trans. Signal Process. 62 (16) (2014) 4129-4144.
[13] J. Chen, C. Richard, A.H. Sayed, Diffusion LMS over multitask networks, IEEE Trans. Signal Process. 63 (11) (2015) 2733-2748.
[14] B. Widrow, Adaptive Filters I: Fundamentals, Tech. rep., Stanford Electronic Laboratories, 1966.
[15] A.M. Zoubir, V. Koivunen, Y. Chakhchoukh, M. Muma, Robust estimation in signal processing: a tutorial-style treatment of fundamental concepts, IEEE Signal Process. Mag. 29 (4) (2012) 61-80.
[16] T.K. Blankenship, D.M. Kriztman, T.S. Rappaport, Measurements and simulation of radio frequency impulsive noise in hospitals and clinics, in: IEEE 47th Vehicular Technology Conference. Technology in Motion, 1997.
[17] D. Middleton, Non-Gaussian noise models in signal processing for telecommunications: new methods and results for class A and class B noise models, IEEE Trans. Inf. Theory 45 (4) (1999) 1129-1149.
[18] Y. Abramovich, P. Turcaj, Impulsive noise mitigation in spatial and temporal domains for surface-wave over-the-horizon radar, in: 9th Annual MIT Workshop on Adaptive Sensor Array Processing, ASAP, 2001.
[19] J. Zhang, T. Qiu, S. Luan, H. Li, Bounded non-linear covariance based ESPRIT method for noncircular signals in presence of impulsive noise, Digit. Signal Process. 87 (2019) 104-111.
[20] Y. Liu, Y. Zhang, T. Qiu, J. Gao, S. Na, Improved time difference of arrival estimation algorithms for cyclostationary signals in α-stable impulsive noise, Digit. Signal Process. 76 (2018) 94-105.
[21] H. Wan, X. Ma, X. Li, Variational Bayesian learning for removal of sparse impulsive noise from speech signals, Digit. Signal Process. 73 (2018) 106-116.
[22] L. Rugini, P. Banelli, On the equivalence of maximum SNR and MMSE estimation: applications to additive non-Gaussian channels and quantized observations, IEEE Trans. Signal Process. 64 (23) (2016) 6190-6199.
[23] S.B. Babarsad, S.M. Saberali, M. Majidi, Analytic performance investigation of signal level estimator based on empirical characteristic function in impulsive noise, Digit. Signal Process. 92 (2019) 20-25.
[24] P.A. Lopes, J.A. Gerald, Iterative MMSE/MAP impulsive noise reduction for OFDM, Digit. Signal Process. 69 (2017) 252-258.
[25] V. Mathews, S.H. Cho, Improved convergence analysis of stochastic gradient adaptive filters using the sign algorithm, IEEE Trans. Acoust. Speech Signal Process. 35 (4) (1987) 450-454.
[26] P. Petrus, Robust Huber adaptive filter, IEEE Trans. Signal Process. 47 (4) (1999) 1129-1133.
[27] J. Chambers, A. Avlonitis, A robust mixed-norm adaptive filter algorithm, IEEE Signal Process. Lett. 4 (2) (1997) 46-48.
[28] E.V. Papoulis, T. Stathaki, A normalized robust mixed-norm adaptive algorithm for system identification, IEEE Signal Process. Lett. 11 (1) (2004) 56-59.
[29] M.O. Sayin, N.D. Vanli, S.S. Kozat, A novel family of adaptive filtering algorithms based on the logarithmic cost, IEEE Trans. Signal Process. 62 (17) (2014) 4411-4424.
[30] T. Shao, Y.R. Zheng, J. Benesty, An affine projection sign algorithm robust against impulsive interferences, IEEE Signal Process. Lett. 17 (4) (2010) 327-330.
[31] B. Chen, L. Xing, H. Zhao, N. Zheng, J.C. Príncipe, Generalized correntropy for robust adaptive filtering, IEEE Trans. Signal Process. 64 (13) (2016) 3376-3387.
[32] A. Singh, J.C. Príncipe, Using correntropy as a cost function in linear adaptive filters, in: International Joint Conference on Neural Networks, 2009, pp. 2950-2955.
[33] V.C. Gogineni, S. Mula, Improved proportionate-type sparse adaptive filtering under maximum correntropy criterion in impulsive noise environments, Digit. Signal Process. 79 (2018) 190-198.
[34] R.L. Das, M. Narwaria, Lorentzian based adaptive filters for impulsive noise environments, IEEE Trans. Circuits Syst. I, Regul. Pap. 64 (6) (2017) 1529-1539.
[35] F. Huang, J. Zhang, S. Zhang, A family of robust adaptive filtering algorithms based on sigmoid cost, Signal Process. 149 (2018) 179-192.
[36] S. Zhang, W.X. Zheng, J. Zhang, H. Han, A family of robust M-shaped error weighted least mean square algorithms: performance analysis and echo cancellation application, IEEE Access 5 (2017) 14716-14727.
[37] J. Ni, J. Chen, X. Chen, Diffusion sign-error LMS algorithm: formulation and stochastic behavior analysis, Signal Process. 128 (2016) 142-149.
[38] H. Zayyani, M. Korki, F. Marvasti, A distributed 1-bit compressed sensing algorithm robust to impulsive noise, IEEE Commun. Lett. 20 (6) (2016) 1132-1135.
[39] M. Korki, H. Zayyani, Weighted diffusion continuous mixed p-norm algorithm for distributed estimation in non-uniform noise environment, Signal Process. 164 (2019) 225-233.
[40] W. Ma, B. Chen, J. Duan, H. Zhao, Diffusion maximum correntropy criterion algorithms for robust distributed estimation, Digit. Signal Process. 58 (2016) 10-19.
[41] W. Huang, L. Li, Q. Li, X. Yao, Distributed affine projection sign algorithms against impulsive interferences, Tien Tzu Hsueh Pao/Acta Electron. Sin. 44 (7) (2016) 1555-1560.
[42] A.H. Sayed, Fundamentals of Adaptive Filtering, Wiley, 2003.
[43] Z. Li, S. Guan, Diffusion normalized Huber adaptive filtering algorithm, J. Franklin Inst. 355 (8) (2018) 3812-3825.
[44] W. Huang, L. Li, Q. Li, X. Yao, Diffusion robust variable step-size LMS algorithm over distributed networks, IEEE Access 6 (2018) 47511-47520.
[45] S. Kim, J. Lee, W. Song, A theory on the convergence behaviour of the affine projection algorithm, IEEE Trans. Signal Process. 59 (12) (2011) 6233-6239.
[46] C. Yu, L. Xie, Y.C. Soh, Blind channel and source estimation in networked systems, IEEE Trans. Signal Process. 62 (17) (2014) 4611-4626.

Annet Mary Wilson received her M.Tech. in Electronics and Communication Engineering from Calicut University, India, in 2015. She is currently working as a Senior Research Fellow (SRF) in the Department of Electronics and Communication Engineering, National Institute of Technology, Goa, India. Her research includes adaptive signal processing and signal processing for sensor networks.

Trilochan Panigrahi received his M.Tech. in Electronics and Communication Engineering from Biju Patnaik University of Technology Rourkela,
India, in 2005, and the Ph.D. in Electronics and Communication Engineering from the National Institute of Technology, Rourkela, India, in 2012. He is currently an Associate Professor in the Department of Electronics and Communication Engineering, National Institute of Technology Goa, India. His research interests include signal processing for wireless sensor networks and wireless communication, the application of evolutionary algorithms in signal processing, and source localization.

Ankit Dubey received the B.E. degree in Electronics and Telecommunication Engineering from the Chhattisgarh Swami Vivekanand Technical University, Bhilai, India, in 2009 and the Ph.D. degree in Electrical Engineering from the Indian Institute of Technology, Delhi, India, in 2014. Since January 2019, he has been with the faculty of the Department of Electrical Engineering, Indian Institute of Technology, Jammu, India, where he is currently an Assistant Professor. His research interests are in diversity combining, multi-hop transmission, and physical layer security for power line and wireless communications and smart grid communications.
