Random Neural Network Decoder for Error Correcting Codes

Hossam Abdelbaki, Erol Gelenbe
School of Computer Science
University of Central Florida
Orlando, FL 32816
{ahossam, erolg}@cs.ucf.edu

Said E. El-Khamy
Department of Electrical Engineering
Alexandria University
Alexandria, Egypt 21544
elkhamy@alex.eun.eg

Abstract

This paper presents a novel Random Neural Network (RNN) based soft decision decoder for block codes. One advantage of the proposed decoder over conventional serial algebraic decoders is that noisy codewords arriving in non-binary form can be corrected without first rounding them to binary form. Another advantage is that the RNN, after being trained, has a simple hardware realization that makes it a candidate for implementation as a VLSI chip. This seems to make the neural network decoding inherently more accurate, faster, and more robust than conventional decoding. The proposed decoder is tested on Hamming linear codes and the results are compared with those of the optimum soft decision decoder and the conventional hard decision decoder. Extensive simulations show that the RNN based decoder reduces the error probability to zero within the error correcting capacity of the code used, and that it performs much better than the hard decision decoder for codewords corrupted with more errors.

1 Introduction

In a digital communication system, significant performance improvement may be obtained if error control coding is properly applied. One type of error control coding scheme is forward error control (FEC). In this type, extra bits are inserted into the symbol stream emitted by the source for the purpose of detecting and correcting transmission errors at the receiver. Coding with FEC can be used to lower the signal to noise ratio (SNR) required at the receiver to achieve a given bit error rate (BER). The cost of using FEC coding is an increase in complexity and bandwidth, which may be preferable to the cost of a larger final power amplifier or a more sensitive receiver. The two most popular ways of implementing FEC coding are known as block coding and convolutional coding.

Since the decoding of any code, even in the linear case, can be classified as an NP-complete problem [1], it is reasonable to consider heuristics, including neural networks. Many authors have addressed this subject in the literature. For example, in [2] a Hopfield network based decoder was proposed for binary symmetric channels, and in [3, 4] a feed-forward model was used. The Hopfield model suffers from its very low storage capacity and from the difficulty of choosing between synchronous and asynchronous updating rules, a choice which is hard to justify in the context of error correcting codes. On the other hand, feed-forward models can produce an appropriate mapping between input and output patterns, but they can hardly encode the relations among the elements of each input pattern.

In this paper, we consider the application of the Random Neural Network (RNN) model [5, 6] to the decoding of block coded messages transmitted through an additive white Gaussian noise (AWGN) channel. The proposed decoder consists of two layers: the input layer, which can be called the association layer, and the output layer, which acts as a classification layer.

The sections of the paper are organized as follows. Section 2 provides a brief overview of the techniques used for error correction of block codes; it also presents the Hamming code family which we use throughout the paper. Section 3 introduces the RNN model. The structure of the proposed RNN based decoder is presented in Section 4. Section 5 presents an experimental evaluation of the proposed decoder as well as the conventional decoders. Finally, Section 6 offers concluding comments and suggestions for further work.
2 Conventional Techniques for Error Correction

In this section, the basic concepts behind the construction of block codes are given. The techniques used for error correction are then reviewed [7, 8], followed by a description of the specific code family used in the simulation.

2.1 Basic Concepts

For the purpose of encoding messages for error protection, long messages are broken into smaller message blocks consisting of k bits of information. Redundant bits, normally known as parity bits, are added to these k bits according to certain rules, and the codewords of length n, inclusive of (n - k) parity bits, are transmitted through the communication channel. Due to the physical nature of channels, they quite often introduce noise and perturbations, spoiling the transmitted data. With k bits of information per block, there are 2^k possible distinct messages, out of the 2^n words that can be formed with n bits. The set of 2^k codewords is called a block code. At the receiver, the k message bits are decoded from the noisy received blocks using a suitable process. This decoding can be done in two different ways, as illustrated in the next subsections.

2.2 Hard Decision Decoding (HDD)

In this technique, a decision is made about each individual bit, the entire codeword is then assembled, and this codeword is compared with each valid codeword. The codeword with minimum Hamming distance to the reconstructed received codeword is then selected. This method of HDD is simple but computationally inefficient; a more efficient technique is to use the parity check matrix to compute the syndrome.

Although the hardware required for HDD is simple, it can only correct (d_min - 1)/2 errors, where d_min is the minimum Hamming distance between the codewords. If the coding scheme can correct t errors, then the probability that an n-bit codeword is in error is the probability that more than t errors have occurred in n bits, i.e.

    P_e = \sum_{i=t+1}^{n} \binom{n}{i} p^{i} (1 - p)^{n-i},

where p is the probability that an individual bit is in error,

    p = \frac{1}{2}\,\mathrm{erfc}\!\left(\sqrt{\frac{E_b}{\eta}\cdot\frac{k}{n}}\right),

E_b is the energy needed to transmit a bit (E_b = 1 for bipolar signaling), \eta is the one-sided noise spectral density, and \mathrm{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_{x}^{\infty} e^{-u^{2}}\, du is the complementary error function.
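As a numerical illustration of the two expressions above, the short Python sketch below evaluates p and P_e for the (7, 4) code. The conversion of an SNR given in dB into the ratio E_b/\eta is our assumption for the example and is not taken from the paper.

```python
import math

def hdd_word_error_probability(snr_db, n=7, k=4, t=1):
    """Word error probability of hard decision decoding:
    P_e = sum_{i=t+1..n} C(n, i) * p^i * (1 - p)^(n - i)."""
    eb_over_eta = 10.0 ** (snr_db / 10.0)                # assumed: SNR(dB) = 10 log10(E_b / eta)
    p = 0.5 * math.erfc(math.sqrt(eb_over_eta * k / n))  # bit error probability
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(t + 1, n + 1))

for snr_db in range(1, 11):
    print(snr_db, hdd_word_error_probability(snr_db))
```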
2.3 Soft Decision Decoding (SDD)

In SDD, the decoding of block codes is carried out by seeking the maximum correlation between the incoming corrupted signal and the locally generated codewords. Such a technique does not make a decision on each bit, as in HDD, but only one final decision. It is complex to implement, since the received vector has to be correlated with all possible codewords, and the codeword yielding the highest correlation (the closest codeword to the transmitted one) is then selected. When the number of codewords is large, implementing such a receiver is impractical.

2.4 Hamming Code

This is a class of (n, k) single error correcting block codes with the property (n, k) = (2^r - 1, 2^r - 1 - r), where r is the number of parity bits. Thus for r = 3 we have the (7, 4) code with generator matrix

    G = \begin{pmatrix}
    1 & 0 & 0 & 0 & 0 & 1 & 1 \\
    0 & 1 & 0 & 0 & 1 & 0 & 1 \\
    0 & 0 & 1 & 0 & 1 & 1 & 0 \\
    0 & 0 & 0 & 1 & 1 & 1 & 1
    \end{pmatrix}

Given the four binary information symbols b_1, b_2, b_3 and b_4, the corresponding codeword is b = (b_1, b_2, b_3, b_4, b_2 ⊕ b_3 ⊕ b_4, b_1 ⊕ b_3 ⊕ b_4, b_1 ⊕ b_2 ⊕ b_4). In the {+1, -1} representation this becomes x = (x_1, x_2, x_3, x_4, x_2 x_3 x_4, x_1 x_3 x_4, x_1 x_2 x_4), where x_j = (-1)^{b_j - 1}. The resulting codeword is in systematic form because the information bits appear in the first four positions of each codeword. Since d_min = 3 for the (7, 4) Hamming code, this code is capable of correcting a single error.
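The encoding rule of Section 2.4 and the correlation receiver of Section 2.3 can be summarized in a few lines. The following sketch is illustrative only (it is not the authors' implementation); it uses the bit-to-bipolar convention of the text (bit 1 maps to +1, bit 0 to -1).

```python
import itertools
import numpy as np

def encode_hamming74(bits):
    """Systematic (7, 4) Hamming encoding: the four data bits followed by the
    parity bits b2^b3^b4, b1^b3^b4 and b1^b2^b4."""
    b1, b2, b3, b4 = bits
    return [b1, b2, b3, b4, b2 ^ b3 ^ b4, b1 ^ b3 ^ b4, b1 ^ b2 ^ b4]

def to_bipolar(codeword):
    """{+1, -1} representation used in the paper: bit 1 -> +1, bit 0 -> -1."""
    return np.array([1.0 if b else -1.0 for b in codeword])

# the 16 codewords in bipolar form
CODEBOOK = np.array([to_bipolar(encode_hamming74(list(m)))
                     for m in itertools.product((0, 1), repeat=4)])

def sdd_decode(received):
    """Soft decision decoding: return the codeword with maximum correlation
    to the real-valued received vector."""
    correlations = CODEBOOK @ np.asarray(received, dtype=float)
    return CODEBOOK[int(np.argmax(correlations))]
```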
3 The Random Neural Network (RNN) Model

In the random neural network model [5], signals in the form of spikes of unit amplitude circulate among the neurons. Positive signals represent excitation and negative signals represent inhibition. Each neuron's state is a non-negative integer called its potential, which increases when an excitation signal arrives and decreases when an inhibition signal arrives. An excitatory spike is interpreted as a "+1" signal at a receiving neuron, while an inhibitory spike is interpreted as a "-1" signal. A neuron i emitting a spike, whether it is an excitation or an inhibition, loses one unit of potential, going from a state of value k_i to the state of value k_i - 1. The state of the n-neuron network at time t is represented by the vector of non-negative integers k(t) = (k_1(t), ..., k_n(t)), where k_i(t) is the potential, or integer state, of neuron i. We denote by k and k_i arbitrary values of the state vector and of the i-th neuron's state. Neuron i will "fire" (i.e. become excited) if its potential is positive. Spikes are then sent out at a rate r_i, with independent, identically and exponentially distributed inter-spike intervals. Spikes go out to some neuron j with probability p^+_{ij} as excitatory signals, or with probability p^-_{ij} as inhibitory signals. A neuron may also send signals out of the network with probability d_i, where d_i + \sum_{j=1}^{n} [p^+_{ij} + p^-_{ij}] = 1. Let w^+_{ij} = r_i p^+_{ij} and w^-_{ij} = r_i p^-_{ij}. Here the "w's" play a role similar to that of the synaptic weights in connectionist models, though they specifically represent rates of excitatory and inhibitory spike emission; they are non-negative. Exogenous excitatory and inhibitory signals arrive at neuron i at rates Λ_i and λ_i, respectively.

This is a "recurrent network" model which may have feedback loops of arbitrary topology. Computations are based on the probability distribution of the network state, p(k, t) = Pr[k(t) = k], or on the marginal probability that neuron i is excited, q_i(t) = Pr[k_i(t) > 0]. As a consequence, the time-dependent behavior of the model is described by an infinite system of Chapman-Kolmogorov equations for discrete state-space, continuous time Markov systems. Information in this model is carried by the frequency at which spikes travel. Thus neuron j, if it is excited, sends spikes to neuron i at a frequency w_{ij} = w^+_{ij} + w^-_{ij}. These spikes are emitted at exponentially distributed random intervals. In turn, each neuron behaves as a non-linear frequency demodulator, since it transforms the rates of the incoming excitatory and inhibitory spike trains into an "amplitude". Let q_i(t) be the probability that neuron i is excited at time t. The stationary probability distribution associated with the model is given by

    p(k) = \lim_{t \to \infty} p(k, t), \qquad q_i = \lim_{t \to \infty} q_i(t), \qquad i = 1, \ldots, n,    (1)

and

    \lambda_i^{+} = \sum_{j} q_j w_{ji}^{+} + \Lambda_i, \qquad \lambda_i^{-} = \sum_{j} q_j w_{ji}^{-} + \lambda_i,

    q_i = \frac{\lambda_i^{+}}{r_i + \lambda_i^{-}}, \quad \text{if } q_i < 1.    (2)
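For a given set of weights, the stationary probabilities in equations (1)-(2) can be computed by a simple fixed-point iteration, as sketched below. This is only an illustration of the signal-flow equations, not the gradient-based training algorithm of [6]; the assumption that no spikes leave the network (d_i = 0, so that r_i is the sum of the outgoing weights of neuron i) is ours.

```python
import numpy as np

def rnn_steady_state(w_plus, w_minus, Lambda, lam, iters=200):
    """Fixed-point iteration for the stationary excitation probabilities q_i of
    equations (1)-(2).  Convention: w_plus[j, i] = w^+_{ji}, w_minus[j, i] = w^-_{ji}."""
    # firing rates r_i, assuming no signal leaves the network (d_i = 0)
    r = w_plus.sum(axis=1) + w_minus.sum(axis=1)
    q = np.zeros(w_plus.shape[0])
    for _ in range(iters):
        lam_plus = q @ w_plus + Lambda        # lambda^+_i = sum_j q_j w^+_{ji} + Lambda_i
        lam_minus = q @ w_minus + lam         # lambda^-_i = sum_j q_j w^-_{ji} + lambda_i
        q = np.minimum(lam_plus / np.maximum(r + lam_minus, 1e-12), 1.0)
    return q
```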
4 RNN Based Decoder (RNND)

The model presented here is designed to decode the (7, 4) Hamming code discussed in Section 2. The RNND, as shown in Figure 1, consists of two layers: the input layer, which has 7 neurons, is a fully interconnected layer, and it is connected to the output layer, which consists of 14 neurons, through feed-forward connections. Thanks to the general training algorithm for the recurrent RNN [6], we are able to train the network with such general connections between neurons. The input layer accepts the bits of the received perturbed codeword at the end of the transmission channel and acts as an association layer which, through training, extracts the relations between the different bits of each codeword and encodes these relations into the weight matrix. The output layer acts as a classification layer, in which the index of the output neuron producing the minimum value indicates the index of the decoded codeword. We should note here that the two codewords [0000000] and [1111111] are excluded from the training and testing sets applied to this structure, because this was found to produce much better results than if they were included. Another similar structure, with 7 inputs and 2 outputs, has been successfully trained to decode the all-0's and all-1's codewords. In the next subsection we introduce the classification technique and the training set used in designing the proposed RNND.

Figure 1: Structure of the RNND (an output layer of m neurons, 1 ... m, fed through feed-forward connections by a fully interconnected input layer of n neurons, 1 ... n).

4.1 Classification Technique

Let T = {(x, c)} be a training set of N vectors, where x ∈ R^n represents a codeword of the block code family under consideration and c ∈ I is its class label from an index set I. Let y ∈ R^m be the desired output vector associated with the input vector x. In our application we relate c to y via the relation c = {j : y_j < y_k, for all k ≠ j}; that is, c is the index of the output neuron that produces the minimum value. The RNN decoder acts as a mapping R^n → R^m, which assigns a vector in R^m to each vector in R^n; equivalently, it can be considered as a mapping C : R^n → I, which assigns a class label in I to each vector in R^n. The RNN specifies a partitioning of the codewords into regions R_j ≡ {x ∈ R^n : C(x) = j}, where ∪_j R_j ≡ R^n and ∩_j R_j ≡ ∅. It also induces a partitioning of the training set into sets T_j ⊂ T, where T_j ≡ {(x, c) : x ∈ R_j, (x, c) ∈ T}. A training pair (x, c) ∈ T is misclassified if C(x) ≠ c. The performance measure of interest is the empirical error fraction E of the classifier, i.e. the fraction of the training set or testing set which is misclassified:

    E = \frac{1}{N} \sum_{(x, c) \in T} \delta(c, C(x)) = \frac{1}{N} \sum_{j \in I} \sum_{(x, c) \in T_j} \delta(c, j),

where \delta(c, j) = 1 if c ≠ j and 0 otherwise.
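The classification rule and the empirical error fraction E translate directly into code; a minimal sketch follows (the function names are ours).

```python
import numpy as np

def classify(outputs):
    """Class label = index of the output neuron producing the minimum value
    (1-based, to match the index set I = {1, ..., m})."""
    return int(np.argmin(outputs)) + 1

def empirical_error_fraction(true_labels, predicted_labels):
    """E = (1/N) * sum_{(x, c) in T} delta(c, C(x)), with delta(c, j) = 1 if c != j."""
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    return float(np.mean(true_labels != predicted_labels))
```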
4.2 Training Patterns

In the training process, all the codewords of the (7, 4) Hamming code are used, excluding the all-0's and the all-1's codewords as mentioned before. Thus we have n = 7 (inputs), N = 14 (patterns), m = 14 (outputs) and I = {1, 2, ..., 14}. As an example, the codeword x_4 = (-1, 1, -1, -1, 1, -1, 1) is associated with the target output y_4 = (1, 1, 1, 0.1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), and hence its class label is c = 4. The training is carried out using the RNNSIM v1.0 package¹ (Random Neural Network Simulator), after some slight modifications to incorporate the general recurrent training algorithm into the program.

¹ Free GUI-based software, available via anonymous ftp at ftp://ftp.mathworks.com/pub/contrib/v5/nnet/rnnsim
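A sketch of how such a training set can be assembled is given below, reusing the encoder helpers from the Section 2.4 sketch. The binary-counting order of the rows is an assumption on our part, although it is consistent with the x_2 and x_4 examples quoted in the text.

```python
import itertools
import numpy as np

def build_training_set():
    """The 14 training patterns of the RNND: bipolar (7, 4) Hamming codewords with
    the all-0's and all-1's words removed.  Each target vector is 1.0 everywhere
    except 0.1 at the position of the correct class, so that the correct output
    neuron is the one producing the minimum value."""
    codewords = []
    for msg in itertools.product((0, 1), repeat=4):
        cw = encode_hamming74(list(msg))          # helper from the Section 2.4 sketch
        if sum(cw) in (0, 7):                     # skip [0000000] and [1111111]
            continue
        codewords.append(to_bipolar(cw))
    inputs = np.array(codewords)                  # shape (14, 7): one codeword per row
    targets = np.full((len(codewords), len(codewords)), 1.0)
    np.fill_diagonal(targets, 0.1)                # class label of row i is i + 1
    return inputs, targets
```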
5 Simulation Results

After training the RNN, the simulation is performed for the (7, 4) Hamming code transmitted over an AWGN channel. The whole communication system has been simulated with the aid of an interactive JAVA program². To evaluate the performance of the proposed RNND and compare it to the HDD and SDD, two different simulations have been carried out.

In the first simulation, the probability of word error is recorded versus the SNR in dB, as shown in Figure 2. The SNR is related to the channel variance σ² through the relation E_b = 1/(2σ²). The vertical axis represents, on a logarithmic scale, the probability of word error associated with HDD, SDD, and RNND. The RNND shows coding gain relative to the HDD, and its performance approaches that of the optimum SDD.

Figure 2: Probability of word error for the (7, 4) Hamming code in an AWGN channel (word error probability, on a logarithmic scale from 10^0 down to 10^-6, versus SNR from 1 to 10 dB, for HDD, RNND, and SDD).

² http://www.cs.ucf.edu/~ahossam/rnndecoder/HammRnndApp.html
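Under one reading of the relation E_b = 1/(2σ²) quoted above (with the SNR in dB taken as 10 log10 E_b), additive noise of the appropriate variance can be generated as follows. This is a hedged sketch of the channel model, not the authors' JAVA simulator.

```python
import numpy as np

def awgn_channel(codeword, snr_db, rng=None):
    """Add white Gaussian noise to a bipolar codeword.  With the relation
    E_b = 1 / (2 sigma^2) used above, sigma follows from the SNR in dB."""
    rng = rng if rng is not None else np.random.default_rng()
    sigma = np.sqrt(1.0 / (2.0 * 10.0 ** (snr_db / 10.0)))
    return np.asarray(codeword, dtype=float) + rng.normal(0.0, sigma, size=len(codeword))
```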
In the second simulation, three groups of testing codewords, each consisting of 100000 randomly chosen (7, 4) Hamming codewords, are applied to the HDD, SDD and RNND. AWGN is added to each testing group so as to produce a single error, two errors, and three errors in the codewords of the first, second, and third testing groups respectively. The noisy testing codewords are then decoded using each of the considered techniques, and the number of erroneously decoded codewords is recorded for each of the testing groups. The results are shown in Tables 1, 2, and 3.

    SNR (dB)   HDD   RNND    SDD
      -1        0    0.071   0.057
       0        0    0.043   0.036
       1        0    0.027   0.022
       2        0    0.006   0.004

Table 1: Probability of word error for the first group.

    SNR (dB)   HDD   RNND    SDD
      -1        1    0.409   0.363
       0        1    0.314   0.297
       1        1    0.253   0.201
       2        1    0.217   0.190

Table 2: Probability of word error for the second group.

    SNR (dB)   HDD   RNND    SDD
      -1        1    0.774   0.772
       0        1    0.686   0.648

Table 3: Probability of word error for the third group.
From the previous tables, it is clear that the probability of word error of the RNND approaches that of the SDD. As an example of the decoding process for the three techniques under consideration, consider the codeword x_2 = (-1, -1, 1, -1, 1, 1, -1), transmitted through an AWGN channel with SNR = 0 dB. The resulting noisy codeword, (-1.05, 1.24, 1.04, -2.63, 2.47, -0.88, -1.32), which has 2 erroneous bits, is applied to the HDD, SDD and RNND. The corresponding outputs of the decoders are:

HDD:  (5, 2, 4, 2, 4, 3, 1, 6, 4, 3, 5, 3, 5, 2).
SDD:  (-3.60, 6.38, -4.68, 5.90, -3.41, 1.28, 5.37, -5.37, -1.28, 3.41, -5.90, 4.68, -6.38, 3.60).
RNND: (1, 0.52, 1, 0.59, 1, 0.88, 0.62, 1, 0.86, 0.73, 1, 0.65, 1, 0.64).

From this example we notice that the HDD produces its minimum Hamming distance (1) for x_7, so the transmitted codeword is decoded incorrectly. The SDD produces its maximum correlation value (6.38) for x_2, and the RNND gives its minimum output (0.52), also indicating x_2. Thus the RNND and SDD decode the noisy codeword correctly even though it has 2 bits in error. The results of the simulations can be summarized as follows:

- The HDD has the best performance in the case of a single error. However, it is almost ineffectual for correcting two or three errors, because the (7, 4) Hamming code is a single error correcting code.

- The optimum SDD performs well for decoding words containing one or two errors and has moderate performance for three errors. This comes, of course, at the cost of high implementation complexity.

- As for the RNND, its performance is very close to that of the SDD. It should be noted that the proposed RNND may have a very simple hardware implementation.

6 Conclusions

This paper presents a recurrent random neural network based error correcting decoder for linear block codes. The proposed decoder reduces the error probability to zero within the error correcting capacity of the code used. It has been shown through simulation that the neural based decoder is much better than the hard decision decoder for decoding codewords corrupted with more errors than the correction capability of the code. Although the optimum soft decision decoding technique has the minimum probability of error, the proposed model can be considered a near optimum decoder. The neural based model has a simple hardware realization that eliminates the need to multiply the received vector by the parity check matrix or to seek the maximum correlation between the incoming noisy signal and the locally generated codewords. With sufficient noise, neither the neural network nor the soft decision decoder will produce the correct decoding; however, it is shown that for higher signal to noise ratios the RNND is unlikely to produce an error. One drawback of the proposed model is that the number of output neurons grows with the number of codewords, i.e. exponentially with the message length, which is not practical for long codes. In future work, we may consider using the RNN model as a dynamical system in which the codewords are the constant attractors of the system.

References

[1] E. R. Berlekamp, R. J. McEliece, and H. C. A. van Tilborg, "On the inherent intractability of certain coding problems," IEEE Transactions on Information Theory, vol. 24, no. 3, pp. 384-386, May 1978.

[2] J. Bruck and M. Blaum, "Neural networks, error-correcting codes, and polynomials over the binary n-cube," IEEE Transactions on Information Theory, vol. 35, no. 5, pp. 976-987, September 1989.

[3] S. E. El-Khamy, E. A. Yousef, and H. M. Abdelbaki, "Soft-decision decoding of block codes using artificial neural networks," in Proc. IEEE Symposium on Computers and Communications (ISCC '95), June 1995, pp. 234-240.

[4] J. S. Sagrista, "Neural networks as decoders in error-correcting systems for digital data transmission," in Neural Networks and Their Applications, J. G. Taylor, Ed. New York: John Wiley, 1996, pp. 19-33.

[5] E. Gelenbe, "Random neural networks with negative and positive signals and product form solution," Neural Computation, vol. 1, no. 4, pp. 502-511, 1989.

[6] E. Gelenbe, "Learning in the recurrent random neural network," Neural Computation, vol. 5, no. 1, pp. 154-164, 1993.

[7] F. J. MacWilliams and N. J. A. Sloane, The Theory of Error-Correcting Codes. New York: North-Holland, 1977.

[8] W. W. Peterson and E. J. Weldon, Error-Correcting Codes. Cambridge, MA: MIT Press, 1972.
