VHDL ESA Unibo Noordwijk.v1
Foreword
NASA low-density parity-check (LDPC) codes will be used in next-generation space missions.
Moreover, they are under evaluation to become part of the CCSDS recommendation (revision of CCSDS 131.0-B-1, Sept. 2003).
The codes are the AR4JA family of LDPC codes, with code rates R=1/2, 2/3, 4/5 and frame lengths k=1024, 4096, 16384, plus the C2 (shortened) code with rate ~7/8 and frame length 7136.
Foreword
To guarantee the necessary cross-support, the other space agencies have to be able to encode and decode NASA LDPC codes.
The Univ. of Bologna (UniBo) is developing a VHDL implementation of NASA LDPC codes. This represents a necessary step towards FPGA implementation.
The developed VHDL code will be made available to ESA/ESOC as a support to develop an FPGA prototype of NASA LDPC code encoders and decoders.
Status: the encoder and decoder of the NASA AR4JA codes with rates 1/2, 2/3 and 4/5 have been completed.
Encoder implementation
NASA LDPC codes are quasi-cyclic (QC) codes whose parity-check matrix H is block circulant.
As such, the encoding operation may be performed by simple circuits based on shift registers.
The encoder implemented by UniBo in VHDL is the systematic encoder proposed in [1].
[1] Z. Li, L. Chen, L. Zeng, S. Lin and W. Fong, Efficient encoding of quasi-cyclic low-density parity-check codes, IEEE Trans. Commun., vol. 54, no. 1, pp. 71-81, Jan. 2006.
The encoder architecture comprises a bank of shift register adder accumulator (SRAA) modules coordinated by a Controller.
CCSDS Fall 2009 Meeting Noordwijk October 2009
V1- 11 Oct 2009
Encoder implementation
Following [1], the methodology consists of obtaining a systematic generator matrix for the QC LDPC code such that the submatrix corresponding to the parity bits is block circulant (usually dense).
Example: AR4JA NASA code, k=1024, R=1/2 (figure not reproduced here).
Encoder implementation
The generic codeword is now expressed as [equation not recovered], where [definitions not recovered].
Each vector p_j is generated by a shift register adder accumulator (SRAA) module.
The encoding circuit consists of c SRAAs working in parallel and coordinated by a Controller.
Punctured bits are not transmitted.
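The SRAA principle can be illustrated with a small bit-level model. This is only a sketch under stated assumptions, not the UniBo VHDL design: the function names (`sraa`, `encode`), the list-of-bits representation, and the convention that each row of a circulant is the right cyclic shift of the row above it are all choices made here for illustration. Puncturing is ignored.

```python
def cyclic_shift_right(bits):
    """Rotate a list of bits one position to the right."""
    return bits[-1:] + bits[:-1]


def sraa(message_sections, generators):
    """Model one SRAA producing one parity section.

    message_sections: list of b-bit message sections (lists of 0/1).
    generators: for each message section, the first row of the
        corresponding circulant (hypothetical input format).
    """
    b = len(generators[0])
    acc = [0] * b  # accumulator register, cleared before encoding
    for section, gen in zip(message_sections, generators):
        reg = list(gen)  # shift register loaded with the circulant's first row
        for bit in section:
            if bit:
                # adder: accumulate the current circulant row (XOR over GF(2))
                acc = [x ^ y for x, y in zip(acc, reg)]
            reg = cyclic_shift_right(reg)  # next row of the circulant
    return acc


def encode(message_sections, generator_rows):
    """Systematic encoding with one SRAA per parity section.

    generator_rows[i][j] is the first row of the circulant linking
    message section i to parity section j (assumed layout).
    """
    c = len(generator_rows[0])  # number of parity sections, i.e. of SRAAs
    parity = [
        sraa(message_sections, [row[j] for row in generator_rows])
        for j in range(c)
    ]
    systematic = [bit for section in message_sections for bit in section]
    return systematic + [bit for p in parity for bit in p]
```

In hardware the c SRAAs run in parallel on the same message stream; the list comprehension in `encode` evaluates them one after another, which gives the same result.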
[Figure: encoder block diagram showing the Controller and the SRAAs]
Decoder architecture
Some literature relevant to the subject:
[2] K. Zhang, X. Huang and Z. Wang, High-throughput layered decoder implementation for quasi-cyclic LDPC codes, IEEE J. Selected Areas Commun., vol. 27, no. 6, Aug. 2009
[3] Y. Dai, N. Chen and Z. Yan, Memory efficient decoder architectures for quasi-cyclic LDPC codes, IEEE Trans. Circuits and Systems I, vol. 55, no. 9, Oct. 2008
[4] Z. Wang, Z. Cui, A memory efficient partially parallel decoder architecture for quasi-cyclic LDPC codes, IEEE Trans. VLSI Systems, vol. 15, no. 4, Apr. 2007
[5] Z. Wang, Z. Cui, Low-complexity high-speed decoder design for quasi-cyclic LDPC codes, IEEE Trans. VLSI Systems, vol. 15, no. 1, Jan. 2007
Decoder architecture
The chosen decoder architecture is known in the literature as a partially parallel architecture.
Let M and N be the number of rows and the number of columns of the parity-check matrix H, respectively (N includes punctured bits). Let m be the dimension of each circulant matrix composing H.
Then, the partially parallel decoder architecture is composed of:
h=M/m check processing units (CPUs)
p=N/m variable processing units (VPUs)
h×p memory banks, each one for m quantized soft messages
a syndrome calculation circuit
a Controller
A CPU is a module performing check node operations. A VPU is a module performing variable node operations.
Memory banks are used to store the messages exchanged between variable nodes and check nodes.
The Controller synchronizes the components and times the different phases of the algorithm. It enables the CPUs and VPUs to write to and read from the memory banks, and it enables the syndrome calculation circuit.
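The resource counts above follow directly from M, N and m. The helper below is a minimal sketch of that arithmetic; the function name `decoder_resources` and the dictionary keys are hypothetical, and the numbers in the usage note are illustrative values chosen here, not figures taken from the slides.

```python
def decoder_resources(M, N, m):
    """Count the processing units and memory banks of the
    partially parallel architecture for an M x N matrix H
    built from m x m circulants."""
    if M % m or N % m:
        raise ValueError("M and N must both be multiples of m")
    h = M // m  # check processing units (CPUs)
    p = N // m  # variable processing units (VPUs)
    return {
        "cpus": h,
        "vpus": p,
        "memory_banks": h * p,   # one bank per circulant position
        "messages_per_bank": m,  # quantized soft messages per bank
    }
```

For instance, with the illustrative values M=1536, N=2560 and m=128, the helper reports 12 CPUs, 20 VPUs and 240 memory banks of 128 messages each.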
Architecture Overview
[Figure: block diagram of the decoder architecture]
Decoder Flow
Each CPU is associated with a block of m rows of the parity-check matrix H.
Analogously, each VPU is associated with a block of m columns of H.
The horizontal step (check-to-variable) of the algorithm is split into m phases. During the i-th phase, a CPU plays the role of the check node corresponding to the i-th row of the block it is associated with.
Each CPU reads messages from the appropriate memory bank (addressed by the Controller) and writes messages back to it.
Similarly, the vertical step (variable-to-check) of the algorithm is split into m phases. During the i-th phase, a VPU plays the role of the variable node corresponding to the i-th column of the block it is associated with.
Each VPU reads messages from the appropriate memory bank (addressed by the Controller) and writes messages back to it.
At the end of each iteration, the syndrome of the current hard-decision sequence is tested. The syndrome calculation circuit is composed of a bank of M/m SRAAs.
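The phase structure and the end-of-iteration syndrome test can be sketched in software as follows. This is a behavioural illustration only, not the SRAA-based circuit mentioned above. It assumes a hypothetical description of H as a grid `shifts[r][q]` holding the list of cyclic offsets of the circulant in block row r and block column q (an empty list for an all-zero block), with row i of a circulant having a 1 in column (i + s) mod m for each offset s.

```python
def columns_of_row(shifts, block_row, i, m):
    """Columns of H holding a 1 in the i-th row of a block row.

    These are the variable nodes whose messages the CPU for
    `block_row` handles during phase i of the horizontal step.
    """
    cols = []
    for q, offsets in enumerate(shifts[block_row]):
        for s in offsets:  # empty list: all-zero block, nothing to read
            cols.append(q * m + (i + s) % m)
    return cols


def syndrome(shifts, hard_bits, m):
    """Syndrome of a hard-decision sequence, one bit per row of H."""
    result = []
    for r in range(len(shifts)):  # one block row per CPU
        for i in range(m):        # one phase per row of the block
            bit = 0
            for col in columns_of_row(shifts, r, i, m):
                bit ^= hard_bits[col]  # parity check over GF(2)
            result.append(bit)
    return result


def is_codeword(shifts, hard_bits, m):
    """True when every parity check is satisfied."""
    return not any(syndrome(shifts, hard_bits, m))
```

A decoder loop built on this sketch would stop iterating as soon as `is_codeword` returns True, and would otherwise continue until a maximum number of iterations is reached.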
Controller
clk: system clock
new_word: when at '1', denotes the beginning of a new sequence to be decoded
count_cpu: memory address for the CPUs
count_vpu: memory address for the VPUs
wren: when at '1', enables writing on the memory banks
sel: when at '0', the memory controllers are VPU-driven; when at '1', they are CPU-driven
ps_enable: when at '1', enables the entity which serializes the bits to be input to the syndrome calculation block
sraa_enable: when at '1', enables the syndrome calculation
check_enable: when at '1', enables the syndrome check
failed: when at '1', denotes a detected error (maximum number of iterations reached without finding any valid codeword)
iteration: a counter denoting the number of iterations