
Received December 7, 2019, accepted December 16, 2019, date of publication December 19, 2019,

date of current version December 31, 2019.


Digital Object Identifier 10.1109/ACCESS.2019.2960839

A High Throughput Implementation of QC-LDPC Codes for 5G NR
HAO WU 1,2 , (Member, IEEE), AND HUAYONG WANG 3
1 Department of Wireless Product Research and Design Institute, ZTE Corporation, Shenzhen 518055, China
2 State Key Laboratory of Mobile Network and Mobile Multimedia Technology, ZTE Corporation, Shenzhen 518055, China
3 Microelectronics Research Institute, ZTE Corporation, Shenzhen 518055, China

Corresponding author: Hao Wu (wu.hao19@zte.com.cn)

ABSTRACT Quasi-cyclic low-density parity-check (QC-LDPC) codes are the choice for data channels in
the fifth generation (5G) new radio (NR). At the transmitter side, code bits from the QC-LDPC encoder
are delivered to the rate matcher. The task of the rate matcher is to select an appropriate number of code
bits via puncturing and/or repetition. Code bits that are not selected do not need to be encoded. At the
receiver side, the de-rate matcher combines code bits of different transmission attempts and sends them to
the QC-LDPC decoder. The output of the QC-LDPC decoder only needs to include necessary systematic
bits. Unnecessary systematic bits and parity bits can be completely removed from the decoding process.
Taking these considerations into account, a smaller sub-base matrix instead of a full-base matrix can be used
in the encoding and decoding process. In this paper, we propose an efficient implementation of QC-LDPC
codes for 5G NR. The full-base matrix is pruned before being used. Compared to the traditional schemes,
the proposed scheme improves the throughput of QC-LDPC codes in 5G NR.

INDEX TERMS 5G NR, QC-LDPC, encoder, decoder, base matrix.

The associate editor coordinating the review of this manuscript and approving it for publication was Martin Reisslein.

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/
VOLUME 7, 2019 185373
H. Wu, H. Wang: High Throughput Implementation of QC-LDPC Codes for 5G NR

I. INTRODUCTION

Fifth generation (5G) new radio (NR) is the next generation of mobile networks beyond the fourth generation (4G) long term evolution (LTE) [1], [2]. 5G NR supports three scenarios: enhanced mobile broadband (eMBB), ultra-reliable and low-latency communications (uRLLC) and massive machine type communications (mMTC). These three scenarios have requirements that include low latency and high throughput [3]. The peak throughput requirement is 10 Gbps for uplink and 20 Gbps for downlink. The user plane latency requirement is 4 ms for eMBB and 1 ms for uRLLC. The control plane latency requirement is 20 ms. Taking these requirements into consideration, quasi-cyclic low-density parity-check (QC-LDPC) codes are adopted by the 5G NR standard for data channels [4], [5]. To simplify the implementation, QC-LDPC codes have the core part with a dual-diagonal structure and the extension part with a diagonal structure. Two base matrices, BG1 and BG2, are defined to guarantee the decoding performance for full ranges of transport block sizes and code rates. For BG1, the mother code rate is 1/3. For BG2, the mother code rate is 1/5.

In 5G NR, the physical downlink shared channel (PDSCH) and the physical uplink shared channel (PUSCH) are used for unicast data transmissions. The data from the medium access control layer to the physical layer is organized in the form of the transport block. Fig. 1 illustrates the processing of the transport block in 5G NR [6]. At the transmitter side, the following steps are carried out for a transport block: cyclic redundancy check (CRC) attachment, code block segmentation, QC-LDPC encoding, rate matching, bit interleaving and code block concatenation. The task of the rate matcher is to select an appropriate number of code bits from the output of the QC-LDPC encoder via puncturing and/or repetition to match the available radio resources. The redundancy version determines the exact set of code bits to be selected. As a consequence, code bits that are not selected do not need to be encoded. At the receiver side, the following steps are carried out for a transport block: code block segmentation, bit de-interleaving, de-rate matching, QC-LDPC decoding, code block concatenation and transport block CRC check. The de-rate matcher combines code bits of different transmission attempts and sends them to the QC-LDPC decoder. The output of the QC-LDPC decoder only needs to include necessary systematic bits. As a consequence, unnecessary systematic bits and parity bits can be completely removed from the decoding process. Taking these considerations into account, a smaller sub-base matrix instead of a full-base matrix can be used in the encoding and decoding process.

FIGURE 1. The processing of the transport block in 5G NR.

The architectures of the QC-LDPC encoder include: dual-diagonal [7]–[10], Richardson-Urbanke [11]–[13], LU decomposition [14], [15], etc. In 5G NR, the dual-diagonal architecture is usually used to reduce the complexity of the encoder. The throughput of the dual-diagonal architecture is approximately inversely proportional to the number of rows of the base matrix. The architectures of the QC-LDPC decoder include: block parallel [16]–[18], row parallel [19], [20], full parallel [21], [22], etc. [23]. In 5G NR, the block parallel architecture is usually used to reduce the complexity of the decoder. The throughput of the block parallel architecture is approximately inversely proportional to the number of circulant blocks in the base matrix. As a consequence, the throughput of QC-LDPC codes can be improved by using a smaller sub-base matrix instead of a full-base matrix.

Reference [24] only describes the encoding procedure for QC-LDPC codes in 5G NR. The scheme of constructing the sub-base matrix is not described. Reference [7] uses the full-base matrix in the encoding process. It is a waste of resources for the transmitter to encode unnecessary code bits. The sub-base matrix constructed in references [25]–[30] is a leading sub-base matrix [31]. That is, the sub-base matrix is obtained by selecting the intersection of the first i rows and the first j columns of the full-base matrix. The sub-base matrix constructed in [25]–[30] can be further pruned to increase the throughput.

This paper proposes an efficient implementation of QC-LDPC codes for 5G NR. Unlike the traditional schemes, the sub-base matrix constructed by the proposed scheme is not restricted to the leading sub-base matrix. In addition, the proposed scheme takes into account the difference in the construction of the sub-base matrix between the encoder and the decoder. Compared to the traditional schemes, the proposed scheme improves the throughput of QC-LDPC codes in 5G NR. The rest of the paper is organized as follows: Section II describes the processing of the transport block in 5G NR. Section III gives the scheme of constructing the sub-base matrix for the encoder and the decoder. Section IV presents the architecture of the proposed scheme. Numerical results and computational complexity are shown in Section V and Section VI respectively. Finally, the conclusion is given in Section VII.

II. TRANSPORT BLOCK PROCESSING IN 5G NR

In this section, we focus on the processing of the transport block in 5G NR [6]. Let T be the transport block size. An A-bit CRC is attached at the end of the transport block, where A is equal to 24 if T > 3824 and 16 otherwise. The transport block, including the CRC, is divided into C equal size code blocks. C is equal to

$$C = \begin{cases} 1, & T \le \zeta \\ \lceil B/(\zeta - 24) \rceil, & T > \zeta \end{cases} \tag{1}$$

where B is equal to T + A and $\zeta$ is equal to 8448 for BG1 and 3840 for BG2. The size of each code block is equal to

$$K = B/C + \tau \tag{2}$$

where $\tau$ is equal to 0 for C = 1 and 24 otherwise. Note that the procedure of the transport block size determination guarantees that B is divisible by C [32].

Let R be the target code rate for the initial transmission. BG1 is used if T > 3824 and R > 0.25 or T > 292 and R > 0.67. Otherwise, BG2 is used. Let $\omega$ be the number of systematic columns. For BG1, $\omega$ is equal to 22. For BG2, $\omega$ is determined based on the value of B. That is, $\omega = 10$ if $B \in (640, \infty)$; $\omega = 9$ if $B \in (560, 640]$; $\omega = 8$ if $B \in (192, 560]$ and $\omega = 6$ otherwise. The lifting size Z is equal to

$$Z = \min\{\eta : \eta \in \Phi,\ \eta \ge B/\omega\} \tag{3}$$

where $\Phi$ is the set of supported lifting sizes. $\Phi$ includes all values of the form $j \times 2^i$ for $j \in \{2, 3, 5, 7, 9, 11, 13, 15\}$ and $i \in \{0, 1, 2, 3, 4, 5, 6, 7\}$ that range from 2 to 384. $\Phi$ is categorized into eight sets according to j. There is an exponent matrix $P_j$ associated with j for each base matrix. We assume that Z can be expressed as $m \times 2^n$. The full-base matrix $P_m$ is pruned to the sub-base matrix $P'_m$. The detail of the pruning is described in Section III. The parity check matrix H is constructed by lifting the sub-base matrix $P'_m$. That is, each non-negative element of $P'_m$ is replaced by a


$Z \times Z$ permutation matrix and each negative element of $P'_m$ is replaced by a $Z \times Z$ zero matrix. If the size of the sub-base matrix $P'_m$ is $R \times C$, the size of the parity check matrix H is $RZ \times CZ$. The task of the encoding is to find a codeword c that satisfies the following equation

$$H c^T = 0^T \tag{4}$$

where $(\cdot)^T$ denotes the transpose of the enclosed vector. Let the codeword c be

$$c = [s_0, s_1, \ldots, s_{CZ-RZ-1}, p_0, p_1, \ldots, p_{RZ-1}] \tag{5}$$

where $s = [s_0, s_1, \ldots, s_{CZ-RZ-1}]$ is the systematic bits and $p = [p_0, p_1, \ldots, p_{RZ-1}]$ is the parity bits. c is written into a circular buffer of length N. To reduce the complexity of the implementation, the limited buffer rate matching (LBRM) is introduced in 5G NR. If LBRM is disabled, N is equal to $\Lambda$, where $\Lambda$ is equal to 66Z for BG1 and 50Z for BG2. If LBRM is enabled, N is equal to

$$N = \min\left(\Lambda, \left\lfloor \frac{1}{C} \frac{T_{LBRM}}{R_{LBRM}} \right\rfloor\right) \tag{6}$$

where $R_{LBRM} = 2/3$. $T_{LBRM}$ is a function of the maximum number of layers $L_M$, the maximum modulation order $Q_M$ and the maximum number of physical resource blocks $P_M$ [6], [32]. The values of $T_{LBRM}$ are listed in Table 1. LBRM is usually enabled to reduce the size of the circular buffer and increase the throughput of the QC-LDPC codes.

TABLE 1. The value of $T_{LBRM}$ as a function of $L_M$, $Q_M$ and $P_M$.

Let G be the number of code bits available for transmission of the transport block. Let L be the number of layers. Let Q be the modulation order. If code block groups (CBGs) are not supported, the rate matching output length E of the first $C - \mathrm{mod}(G/(LQ), C)$ code blocks is equal to

$$E_m = LQ \left\lfloor \frac{G}{LQC} \right\rfloor \tag{7}$$

and that of the last $\mathrm{mod}(G/(LQ), C)$ code blocks is equal to

$$E_p = LQ \left\lceil \frac{G}{LQC} \right\rceil \tag{8}$$

Let $d = [d_0, d_1, \ldots, d_{N-1}]$ be the circular buffer. The initial values of the circular buffer are nulls. After writing the codeword c into the circular buffer, the values of d are as follows

$$d_i = \begin{cases} s_{2Z+i}, & 0 \le i < K - 2Z \\ \text{null}, & K - 2Z \le i < \xi \\ p_{i-\xi}, & \xi \le i < \beta \\ \text{null}, & \beta \le i < N \end{cases} \tag{9}$$

where $\beta = \min(N, \xi + RZ)$ and $\xi$ is equal to 20Z for BG1 and 8Z for BG2. Then code bits $e = [e_0, e_1, \ldots, e_{E-1}]$ are read out from the starting position k in the circular buffer, skipping the code bits with a value of null. k is a function of the redundancy version rv and is given by Table 2.

TABLE 2. The value of k as a function of the redundancy version rv.

Next, bit interleaving is carried out with a row-column interleaver. The number of rows is equal to the modulation order. Code bits of each code block are written row-by-row into the interleaver and read out column-by-column. This increases the reliability of systematic bits and improves the performance of QC-LDPC codes [2], [33]. Let $f = [f_0, f_1, \ldots, f_{E-1}]$ be the output of the interleaver. e and f are related by

$$f_{i+jQ} = e_{iE/Q+j} \tag{10}$$

where $i \in \{0, 1, \ldots, Q-1\}$ and $j \in \{0, 1, \ldots, E/Q-1\}$. Finally, code block concatenation collects the output of the rate matching.

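The circular buffer in (9), the read-out from the starting position k and the row-column interleaver in (10) can be sketched as follows. This is a minimal sketch with hypothetical function names; nulls are represented by `None`, and the toy parameters in the usage example are chosen for readability and do not correspond to a real base matrix.

```python
# Sketch of the circular buffer (9), bit selection and interleaver (10).
# Function names and the toy parameters below are illustrative assumptions.

def fill_circular_buffer(s, p, K, Z, xi, N):
    """Write systematic bits s and parity bits p into a buffer of length N."""
    beta = min(N, xi + len(p))
    d = [None] * N                        # the buffer starts as all nulls
    for i in range(N):
        if i < K - 2 * Z:
            d[i] = s[2 * Z + i]           # the first 2Z systematic bits are punctured
        elif xi <= i < beta:
            d[i] = p[i - xi]
    return d


def read_out(d, k, E):
    """Read E code bits starting at position k, skipping nulls and wrapping around."""
    if all(bit is None for bit in d):
        raise ValueError("the circular buffer holds no code bits")
    e = []
    i = k
    while len(e) < E:
        bit = d[i % len(d)]
        if bit is not None:
            e.append(bit)
        i += 1
    return e


def interleave(e, Q):
    """Row-column interleaver: f[i + j*Q] = e[i*E/Q + j]."""
    E = len(e)
    if E % Q != 0:
        raise ValueError("E must be a multiple of Q")
    f = [None] * E
    for i in range(Q):
        for j in range(E // Q):
            f[i + j * Q] = e[i * (E // Q) + j]
    return f


# Toy usage: Z = 2, K = 6, xi = 8 and N = 12 are made-up values.
s = ["s%d" % n for n in range(8)]
p = ["p%d" % n for n in range(6)]
d = fill_circular_buffer(s, p, K=6, Z=2, xi=8, N=12)
first = read_out(d, k=0, E=5)             # starts at the beginning of the buffer
wrapped = read_out(d, k=10, E=4)          # wraps around to the systematic bits
```

In the toy run, the read-out starting at k = 10 takes the two remaining parity bits and then wraps around to the systematic bits, which is the wrap-around case analyzed in Section III.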

III. THE SCHEME OF CONSTRUCTING THE SUB-BASE MATRIX

In this section, we give the scheme of constructing the sub-base matrix for the encoder and the decoder. Before delving into the details, let us explore the general structure of the base matrix. The structure of BG2 is illustrated in Fig. 2. The structure of BG1 is similar. The base matrix consists of five submatrices: A, C, B, I and O. A and C are non-zero matrices. B has a dual-diagonal structure. I is an identity matrix. O is a zero matrix. A and B constitute the core part. C, I and O constitute the extension part. This structure is similar to QC-LDPC codes introduced in [34]. The core part cannot be pruned [4]. The first two columns of the base matrix are not transmitted. This procedure improves the performance of the decoding [35]. The rows of C are designed to be orthogonal or quasi-orthogonal. Since the layered decoding is widely used in QC-LDPC codes [36], [37], this design reduces the decoding latency and improves the system throughput [38], [39].

FIGURE 2. The general structure of the base matrix.

A. CONSTRUCTION OF THE SUB-BASE MATRIX FOR THE ENCODER

The output of the QC-LDPC encoder should consist of $[s_0, s_1, \ldots, s_{K-1}]$ and some parity bits. In the following, we derive the parity bits that need to be included. In order to gain some insight, let us first focus on the case where k = 0. That is, the starting point of the selection is at the beginning of the circular buffer. Since the first 2Z systematic bits are punctured, the number of systematic bits to be selected is

$$N_s = K - 2Z \tag{11}$$

and the number of parity bits to be selected is

$$N_p = E - (K - 2Z) \tag{12}$$

As a result, parity bits should consist of $[p_0, p_1, \ldots, p_{\delta-1}]$, where $\delta$ is equal to

$$\delta = \min(N_p, N - \xi) \tag{13}$$

If $N_p$ is larger than $N - \xi$, the selection of code bits wraps around to the beginning of the circular buffer.

Now, let us not restrict the value of k. That is, the starting point of the selection may be anywhere in the circular buffer. The number of systematic bits to be selected is

$$N_s = \max(K - 2Z - k, 0) \tag{14}$$

and the number of parity bits to be selected is

$$N_p = E - \max(K - 2Z - k, 0) \tag{15}$$

If $k < \xi$, the starting point of the selection is in the middle of the systematic bits. If $k \ge \xi$, the starting point of the selection is in the middle of the parity bits. Let $\lambda = N - k + K - 2Z - \theta$, where

$$\theta = \begin{cases} 0, & k \ge \xi \\ \xi - k, & K - 2Z \le k < \xi \\ \xi - (K - 2Z), & 0 \le k < K - 2Z \end{cases} \tag{16}$$

The ending point of the selection in the circular buffer is $s_{K-1}$ when E is equal to $\lambda$. If $E \le \lambda$, parity bits should consist of $[p_{\delta_s}, p_{\delta_s+1}, \ldots, p_{\delta_e-1}]$, where $\delta_s$ is equal to

$$\delta_s = \max(0, k - \xi) \tag{17}$$

and $\delta_e$ is equal to

$$\delta_e = \min(N_p + \delta_s, N - \xi) \tag{18}$$

If $E > \lambda$, parity bits should consist of $[p_{\delta'_s}, p_{\delta'_s+1}, \ldots, p_{N-\xi-1}]$ and $[p_0, p_1, \ldots, p_{\delta'_e-1}]$, where $\delta'_s$ is equal to

$$\delta'_s = \max(0, k - \xi) \tag{19}$$

and $\delta'_e$ is equal to

$$\delta'_e = \min(E - \lambda, \delta'_s) \tag{20}$$

The size of the systematic bits is at most $\xi + 2Z$. The last $\xi + 2Z - K$ systematic bits are filler bits and are not transmitted over the air. Since filler bits are set to zeros, the values of parity bits are not affected by the last $\xi + 2Z - K$ systematic bits. As a result, the columns corresponding to these systematic bits can be pruned.

If the full-base matrix $P_m$ is pruned to a sub-base matrix $P'_m$, $P_m$ and $P'_m$ are related by

$$P'_m = P_m(\mu, \nu) \tag{21}$$

$$\mu = \begin{cases} \left[\left\lfloor \frac{\delta_s}{Z} \right\rfloor, \left\lfloor \frac{\delta_s}{Z} \right\rfloor + 1, \ldots, \left\lfloor \frac{\delta_e - 1}{Z} \right\rfloor\right] \cup [0, 1, 2, 3], & E \le \lambda \\ \left[\left\lfloor \frac{\delta'_s}{Z} \right\rfloor, \left\lfloor \frac{\delta'_s}{Z} \right\rfloor + 1, \ldots, \left\lfloor \frac{N - \xi - 1}{Z} \right\rfloor\right] \cup \left[0, 1, \ldots, \max\left(3, \left\lfloor \frac{\delta'_e - 1}{Z} \right\rfloor\right)\right], & E > \lambda \end{cases} \tag{22}$$


where $\mu$ is given by (22) and $\nu$ is equal to

$$\nu = \left[0, 1, \ldots, \left\lfloor \frac{K - 1}{Z} \right\rfloor, \mu + 2 + \frac{\xi}{Z}\right] \tag{23}$$

The symbol $\cup$ means the union of two vectors [40]. The expression (21) means that $P'_m$ is obtained by selecting the intersection of the rows $\mu$ and the columns $\nu$ of $P_m$. Note that B cannot be pruned.

Two examples of constructing the sub-base matrix for the encoder are shown in Fig. 3 and Fig. 4. From these figures, we see that $P'_m$ is only a small portion of $P_m$. Code bits that are not selected are not encoded. The throughput of the QC-LDPC encoder is expected to be increased.

FIGURE 3. The construction of the sub-base matrix for the encoder. $P'_2$ is obtained by selecting the intersection of the rows [0, 1, ..., 5] and the columns [0, 1, ..., 5, 10, 11, ..., 15] of $P_2$. Here rv = 0, T = 80, G = 160 and $T_{LBRM}$ = 1277992.

FIGURE 4. The construction of the sub-base matrix for the encoder. $P'_2$ is obtained by selecting the intersection of the rows [0, 1, ..., 3, 17, 18, ..., 26] and the columns [0, 1, ..., 5, 10, 11, ..., 13, 27, 28, ..., 36] of $P_2$. Here rv = 2, T = 80, G = 160 and $T_{LBRM}$ = 1277992.

B. CONSTRUCTION OF THE SUB-BASE MATRIX FOR THE DECODER

Hybrid automatic repeat request (HARQ) is widely used to improve the transmission efficiency [41]–[43]. Log likelihood ratios (LLRs) of the transport block received in error are stored in a buffer. The receiver generates the positive acknowledgement or the negative acknowledgement to drive the retransmission. When the retransmission is received, the decoding is performed by the buffered LLRs combined with the retransmission LLRs. As a result, the sub-base matrix used in the last transmission should be considered when constructing the sub-base matrix in the retransmission.

Let $\mu'$ be equal to $\emptyset$ for the initial transmission and to the selected rows in the last transmission for the retransmission. Let $\nu'$ be equal to $\emptyset$ for the initial transmission and to the selected columns in the last transmission for the retransmission. If the full-base matrix $P_m$ is pruned to a sub-base matrix $P''_m$, $P_m$ and $P''_m$ are related by

$$P''_m = P_m(\mu \cup \mu', \nu \cup \nu') \tag{24}$$

where $\mu$ and $\nu$ are given in the preceding subsection.

Two examples of constructing the sub-base matrix for the decoder are shown in Fig. 5 and Fig. 6. From these figures, we see that $P''_m$ is only a small portion of $P_m$. Unnecessary systematic bits and parity bits are removed from the decoding process. The throughput of the QC-LDPC decoder is expected to be increased. In the initial transmission, the sizes of the sub-base matrices of the encoder and decoder are the same. In the retransmission, the size of the sub-base matrix of the decoder is larger than that of the encoder.

FIGURE 5. The construction of the sub-base matrix for the decoder. $P''_2$ is obtained by selecting the intersection of the rows [0, 1, ..., 5] and the columns [0, 1, ..., 5, 10, 11, ..., 15] of $P_2$. Here rv = 0, T = 80, G = 160 and $T_{LBRM}$ = 1277992.

FIGURE 6. The construction of the sub-base matrix for the decoder. $P''_2$ is obtained by selecting the intersection of the rows [0, 1, ..., 5, 17, 18, ..., 26] and the columns [0, 1, ..., 5, 10, 11, ..., 15, 27, 28, ..., 36] of $P_2$. Here rv = 2, T = 80, G = 160 and $T_{LBRM}$ = 1277992.

IV. THE ARCHITECTURE OF THE PROPOSED SCHEME

In this section, the architecture of the proposed scheme is described. In the processing of the transport block, we need to know the lifting size, the size of the circular buffer, the type of the base matrix, etc. These parameters are derived from


the scheduling information [2], [6]. However, this derivation is not appropriate to be done in the hardware. Usually, these parameters are calculated by the software and then are passed to the hardware through the configuration [17], [29], [30], [44], [45]. Based on this configuration, the control signals are generated by the controller. The size of the configuration needs to be as small as possible to reduce the memory. It is clear that the configuration includes the following fields: E, K, N, Z, the type of the base matrix, etc. Compared to the traditional schemes, the proposed scheme mainly affects the QC-LDPC encoding module at the transmitter and the QC-LDPC decoding module at the receiver. For the sake of brevity, we only focus on these modules in this section.

A. ARCHITECTURE OF THE ENCODER

Before giving the proposed architecture, the number of bits needed to indicate the sub-base matrix $P'_m$ is derived. These bits should be added to the configuration. In the following, we show that 13 bits are needed to indicate the sub-base matrix $P'_m$. Let $b^{\dagger} = [b^{\dagger}_0, b^{\dagger}_1, b^{\dagger}_2]$, where $b^{\dagger}_0$ is equal to 1 if E is smaller than or equal to $\lambda$ and 0 otherwise, $b^{\dagger}_1$ is equal to

$$b^{\dagger}_1 = \begin{cases} \left\lfloor \frac{\delta_s}{Z} \right\rfloor, & b^{\dagger}_0 = 1 \\ \left\lfloor \frac{\delta'_s}{Z} \right\rfloor, & b^{\dagger}_0 = 0 \end{cases} \tag{25}$$

and $b^{\dagger}_2$ is equal to

$$b^{\dagger}_2 = \begin{cases} \left\lfloor \frac{\delta_e - 1}{Z} \right\rfloor, & b^{\dagger}_0 = 1 \\ \max\left(3, \left\lfloor \frac{\delta'_e - 1}{Z} \right\rfloor\right), & b^{\dagger}_0 = 0 \end{cases} \tag{26}$$

$b^{\dagger}_0$, $b^{\dagger}_1$ and $b^{\dagger}_2$ can be represented by 1 bit, 6 bits and 6 bits respectively. From (22), we see that $\mu$ can be easily obtained from the fields that already exist in the configuration and $b^{\dagger}$. From (23), we see that $\nu$ can be easily obtained from the fields that already exist in the configuration and $\mu$. As a result, only $b^{\dagger}$ needs to be added to the configuration to indicate the sub-base matrix $P'_m$.

There are many dual-diagonal encoder architectures that offer a variety of parallelism orders [7]–[10]. The architecture with the high parallelism provides the high throughput. However, this benefit comes at the cost of increased area. The architecture supports various base matrices specified by the protocol [6], [46], [47]. The controller of the architecture obtains the particular base matrix $P^{\dagger}_m$ used for encoding through the configuration and then generates the control signals based on $P^{\dagger}_m$, Z, etc. Then the codeword c is produced under the control of these signals. The existing architectures can easily support the proposed scheme by modifying the configuration and the controller. That is, $b^{\dagger}$ that indicates the sub-base matrix $P'_m$ is added to the configuration and the logic that generates the sub-base matrix $P'_m$ is added to the controller. Compared to the overall area of the architecture, the area of the logic is negligible. Since the logic that generates the sub-base matrix $P'_m$ is not on the critical path, this modification does not affect the operating frequency of the architecture.

For example, one proposed architecture is shown in Fig. 7. The difference between the proposed architecture and the reference architecture [7] is in the configuration and the controller, which are marked in red in the figure. The area of this architecture can be reduced by using fewer cyclic shifts at the expense of lower throughput. The proposed architecture is briefly discussed as follows. First, the intermediate variable t is obtained by accumulating the cyclic shifts of the systematic bits s in 4 clock cycles. t is stored in the core unit and s is stored in the systematic buffer. Next, $p' = [p_0, p_1, \ldots, p_{4Z-1}]$ is generated by the core unit in 2 clock cycles. $p'$ is stored in parity buffer 1. Finally, $p'' = [p_{4Z}, p_{4Z+1}, \ldots, p_{RZ-1}]$ is obtained by accumulating the cyclic shifts of the systematic bits s and $p'$ in (R − 4) clock cycles. $p''$ is stored in parity buffer 2. The proposed scheme needs a total of (R + 2) clock cycles.

FIGURE 7. The proposed architecture of the encoder.

B. ARCHITECTURES OF THE DECODER

Before giving the proposed architecture, the number of bits needed to indicate the sub-base matrix $P''_m$ is derived. These bits should be added to the configuration. In the following, we show that 42 bits are needed to indicate the sub-base


matrix $P''_m$. Let $b^{\ddagger} = [b^{\ddagger}_0, b^{\ddagger}_1, \ldots, b^{\ddagger}_{41}]$ be the bit mask of length 42. The ith element of $b^{\ddagger}$ is equal to 1 if i is a member of $\mu \cup \mu' - [0, 1, 2, 3]$ and 0 otherwise, where the symbol − means the difference of two vectors [48]. From (22), we see that $\mu \cup \mu'$ can be easily obtained from $b^{\ddagger}$ since B cannot be pruned. From (23), we see that $\nu \cup \nu'$ can be easily obtained from the fields that already exist in the configuration and $\mu \cup \mu'$. As a result, only $b^{\ddagger}$ needs to be added to the configuration to indicate the sub-base matrix $P''_m$.

There are many block parallel architectures [16]–[18]. The architecture supports various base matrices specified by the protocol [46], [49], [50]. The controller of the architecture obtains the particular base matrix $P^{\ddagger}_m$ used for decoding through the configuration and then generates the control signals based on $P^{\ddagger}_m$, Z, etc. Then the systematic bits s are obtained under the control of these signals. The existing architectures can easily support the proposed scheme by modifying the configuration and the controller. That is, $b^{\ddagger}$ that indicates the sub-base matrix $P''_m$ is added to the configuration and the logic that generates the sub-base matrix $P''_m$ is added to the controller. Compared to the overall area of the architecture, the area of the logic is negligible. Since the logic that generates the sub-base matrix $P''_m$ is not on the critical path, this modification does not affect the operating frequency of the architecture.

For example, one proposed architecture is shown in Fig. 8. The difference between the proposed architecture and the reference architecture [17] is in the configuration and the controller, which are marked in red in the figure. The proposed architecture is briefly discussed as follows. First, the log likelihood ratio (LLR) buffer is initialized to LLR values of each bit and the extrinsic buffer is initialized to zeros. Then, at each time instant, the LLR values corresponding to a non-negative element of the sub-base matrix $P''_m$ are read from the LLR buffer and the extrinsic information corresponding to the same non-negative element of the sub-base matrix $P''_m$ is read from the extrinsic buffer. The difference between the cyclic shifted LLR values and the extrinsic information is fed to the min-sum unit. When all non-negative elements in a row of the sub-base matrix $P''_m$ are processed, the min-sum unit and the adder output the updated extrinsic information and the updated LLR values respectively. The updated extrinsic information is written back to the extrinsic buffer. After passing through the de-cyclic shifter, the updated LLR values are written back to the LLR buffer. The decoded bits can be obtained by the hard decision on the LLR values. This decoding process continues until the parity check passes or the maximum number of iterations is reached.

FIGURE 8. The proposed architecture of the decoder.

V. NUMERICAL RESULTS

Numerical results are given in this section to compare the throughput and the block error rate (BLER) of the traditional schemes and the proposed scheme. The detailed configurations are listed below. There is a total of 14 orthogonal frequency division multiplexing (OFDM) symbols in the slot. Demodulation reference signal (DMRS) is located in the first 2 symbols of the slot. Physical uplink shared channel (PUSCH) is located in the last 12 symbols of the slot. The transport block is mapped to the resource elements (REs) in a frequency-first manner. Note that the front-loaded DMRS and the frequency-first mapping enable the transport block to be processed on the fly. It is not necessary to collect all the symbols in the slot before starting to decode the transport block. The number of physical resource blocks is $P_a$. The number of layers is 1. The transport block size T is determined according to the procedure in 5.1.3.2 in [32]. The LBRM is applied and $T_{LBRM}$ is equal to 1277992. Let $\psi$ be the index of the modulation and code scheme, which is obtained from Table 5.1.3.1-2 in [32]. The target code rate R and the modulation order Q are derived from $\psi$. The sequence of redundancy versions is 0, 2, 3, 1. The parameters of the initial transmission and the retransmission are the same except for the redundancy version.

A. THROUGHPUT PERFORMANCE

In this subsection, the throughput of the proposed scheme and the traditional schemes are compared. In 5G NR, the throughput is usually defined as the number of systematic bits


(excluding filler bits) that the system can process within a specified time [51], [52]. This definition is used in this subsection.

1) DECODER

The pipeline block parallel architecture is usually used to reduce the complexity of the decoder. The pipeline stall due to the pipeline hazard can be kept to a minimum level [53], [54]. If one circulant block in the sub-base matrix needs one clock cycle for processing, the decoding throughput $d_t$ is approximately equal to

$$d_t = K \frac{f_d}{i_d u_d} \tag{27}$$

where $f_d$ is the operating frequency of the decoder, $i_d$ is the number of iterations and $u_d$ is the number of circulant blocks in the sub-base matrix. Note that this scheme is widely used in evaluating the decoding throughput [52], [55].

Let $d_t^f$ be the decoding throughput of the full-base matrix scheme. Let $d_t^l$ be the decoding throughput of the leading sub-base matrix scheme. Let $d_t^p$ be the decoding throughput of the proposed scheme. These three schemes mainly differ in the logic that generates the sub-base matrix used for decoding. Since this logic is not on the critical path, the operating frequencies of these three schemes are the same. Fig. 9 to Fig. 12 illustrate $d_t^p/d_t^f$ and $d_t^l/d_t^f$ as a function of $\psi$ for different $P_a$ when the number of iterations for these three schemes is the same. From these figures, we see that the full-base matrix scheme yields the lowest decoding throughput. The effective code rate decreases as the number of transmissions increases, and thus $d_t^p/d_t^f$ and $d_t^l/d_t^f$ approach one as the number of transmissions increases.

FIGURE 9. $d_t^p/d_t^f$ and $d_t^l/d_t^f$ as a function of $\psi$ for different $P_a$. The redundancy version is equal to 0.

FIGURE 10. $d_t^p/d_t^f$ and $d_t^l/d_t^f$ as a function of $\psi$ for different $P_a$. The redundancy version is equal to 2.

FIGURE 11. $d_t^p/d_t^f$ and $d_t^l/d_t^f$ as a function of $\psi$ for different $P_a$. The redundancy version is equal to 3.

FIGURE 12. $d_t^p/d_t^f$ and $d_t^l/d_t^f$ as a function of $\psi$ for different $P_a$. The redundancy version is equal to 1.

Now, we compare the leading sub-base matrix scheme and the proposed scheme. In the initial transmission, $d_t^p$ is slightly larger than $d_t^l$ in a few cases. The main reason is that the proposed scheme prunes the systematic bits but the leading sub-base matrix scheme does not. For example, $d_t^p$ gets up to a 1.05 fold improvement as compared to $d_t^l$ when $\psi$ is equal to 27 and $P_a$ is equal to 6. In the retransmission, $d_t^p$ is much larger than $d_t^l$ in most cases. The main reason is that the number of parity bits pruned by the proposed scheme is larger than the number of parity bits pruned by the leading sub-base matrix scheme. For example, $d_t^p$ gets up to a 1.40 fold improvement as compared to $d_t^l$ when $\psi$ is equal to 27, $P_a$ is equal to 6 and rv is equal to 2.

The gain of the proposed scheme can be clearly observed in the retransmission. The decoder provided to implement the peak throughput requirement may not be able to decode


p p
FIGURE 13. et /eft and elt /eft as a function of ψ for different Pa . FIGURE 15. et /dtf and elt /eft as a function of ψ for different Pa .
The redundancy version is equal to 0. The redundancy version is equal to 3.

p p
FIGURE 14. et /eft and elt /eft as a function of ψ for different Pa . FIGURE 16. et /eft and elt /eft as a function of ψ for different Pa .
The redundancy version is equal to 2. The redundancy version is equal to 1.

transport blocks with the lower code rate over the same radio resources in the same time duration [51]. The effective code rate decreases as the number of transmissions increases. The peak throughput requirement is difficult to meet in the retransmission [28]. The proposed scheme can alleviate this problem to a large extent.

2) ENCODER
We assume that the dual-diagonal architecture in reference [7] is used to reduce the complexity of the encoder. This architecture needs a total of $(m_b + 2)$ clock cycles to encode the systematic bits $\mathbf{s}$, where $m_b$ is the number of rows of the sub-base matrix. Then, the encoding throughput $e_t$ is equal to

$$ e_t = \frac{f_e}{m_b + 2} K \tag{28} $$

where $f_e$ is the operating frequency of the encoder.

Let $e_t^f$ be the encoding throughput of the full-base matrix scheme. Let $e_t^l$ be the encoding throughput of the leading sub-base matrix scheme. Let $e_t^p$ be the encoding throughput of the proposed scheme. These three schemes mainly differ in the logic that generates the sub-base matrix used for encoding. Since this logic is not on the critical path, the operating frequencies of these three schemes are the same. Fig. 13 to Fig. 16 illustrate $e_t^p/e_t^f$ and $e_t^l/e_t^f$ as a function of $\psi$ for different $P_a$. From these figures, we see that the full-base matrix scheme yields the lowest encoding throughput. The effective code rate decreases as the number of transmissions increases, and thus $e_t^p/e_t^f$ and $e_t^l/e_t^f$ approach one as the number of transmissions increases.

The encoding throughput comparison between the leading sub-base matrix scheme and the proposed scheme is similar to the corresponding decoding throughput comparison, and it is omitted for the sake of brevity.

B. BLER PERFORMANCE
In this subsection, the BLER performances of the proposed scheme and the traditional schemes are compared. The parameters of the simulation are as follows. The decoding algorithm is the layered normalized min-sum [56]. The scaling factor is equal to 0.85. The maximum iteration number is equal to 12. $\psi$ is equal to 27. From Table 5.1.3.1-2 in [32], we can derive that $R$ is equal to 0.9258 and $Q$ is equal to 8. The encoded bits are transmitted over the additive white Gaussian noise (AWGN) channel. Based on the scheduling information, we can derive that $T$ is equal to 295176 and $G$ is equal to 314496 when $P_a$ is equal

to 273 and $T$ is equal to 6400 and $G$ is equal to 6912 when $P_a$ is equal to 6.

The BLER performances of the proposed scheme and the traditional schemes as a function of the signal-to-noise ratio (SNR) are illustrated in Fig. 17 and Fig. 18. From these figures, we see that the BLER performances of the proposed scheme and the traditional schemes are almost the same. The difference between the BLER performance of the proposed scheme and that of the traditional schemes is negligible. These simulation results verify the correctness of the proposed scheme for the encoder and decoder.

FIGURE 17. Comparison of the BLER performance between the proposed scheme and the traditional schemes. Here, $P_a$ is equal to 6.

FIGURE 18. Comparison of the BLER performance between the proposed scheme and the traditional schemes. Here, $P_a$ is equal to 273.

VI. COMPUTATIONAL COMPLEXITY
The difference between the proposed scheme and the traditional schemes is mainly the sub-base matrix used for encoding and decoding. In this section, we consider the amount of computation required to obtain the sub-base matrix used for encoding and decoding. It is clear that the full-base matrix scheme can directly obtain the sub-base matrix used for encoding and decoding. To obtain the sub-base matrix used for the encoding, the leading sub-base matrix scheme requires up to 2 divisions, 4 comparisons and 6 additions, and the proposed scheme requires up to 5 divisions, 8 comparisons and 17 additions. To obtain the sub-base matrix used for the decoding, the leading sub-base matrix scheme requires up to 2 divisions, 5 comparisons and 6 additions, and the proposed scheme requires up to 5 divisions, 8 comparisons, 17 additions and 2 union operations. Note that the amount of computation required to obtain the sub-base matrix used for encoding and decoding is amortized over $C$ code blocks and is negligible when $C$ is large.

VII. CONCLUSION
In many applications, the code rate of the data transmission is larger than the mother code rate. In these cases, substantial throughput improvement can be achieved by pruning the full-base matrix of QC-LDPC codes to the desired size. In this paper, a scheme is developed keeping in mind the difference between the pruning of the full-base matrix for the encoder and that for the decoder. The transport block of higher code rate is encoded and decoded by using a smaller sub-base matrix instead of a full-base matrix. As a consequence, the computational efficiency and energy efficiency are improved. These features make the proposed scheme attractive for QC-LDPC codes in 5G NR.

ACKNOWLEDGMENT
This article was presented at the 25th International Telecommunication Networks and Application Conference (ITNAC), Auckland, New Zealand, November 2019 [28].

REFERENCES
[1] S. Parkvall, E. Dahlman, A. Furuskar, and M. Frenne, “NR: The new 5G radio access technology,” IEEE Commun. Standards Mag., vol. 1, no. 4, pp. 24–30, Dec. 2017.
[2] E. Dahlman, S. Parkvall, and J. Sköld, 5G NR: The Next Generation Wireless Access Technology. London, U.K.: Elsevier, 2018.
[3] M. Shafi, A. F. Molisch, P. J. Smith, T. Haustein, P. Zhu, P. D. Silva, F. Tufvesson, A. Benjebbour, and G. Wunder, “5G: A tutorial overview of standards, trials, challenges, deployment, and practice,” IEEE J. Sel. Areas Commun., vol. 35, no. 6, pp. 1201–1221, Jun. 2017.
[4] T. J. Richardson and S. Kudekar, “Design of low-density parity check codes for 5G new radio,” IEEE Commun. Mag., vol. 56, no. 3, pp. 28–34, Mar. 2018.
[5] D. Hui, S. Sandbeg, Y. Blankenship, M. Andersson, and L. Grosjean, “Channel coding in 5G new radio: A tutorial overview and performance comparison with 4G LTE,” IEEE Veh. Technol. Mag., vol. 13, no. 4, pp. 60–69, Dec. 2018.
[6] NR; Multiplexing and Channel Coding (Release 15), document TS 38.212, V15.4.0, 3GPP, Dec. 2018.
[7] T. T. B. Nguyen, T. N. Tan, and H. Lee, “Efficient QC-LDPC encoder for 5G new radio,” Electronics, vol. 8, no. 6, p. 668, Jun. 2019.
[8] J. M. Perez and V. Fernandez, “Low-cost encoding of IEEE 802.11n,” Electron. Lett., vol. 44, no. 4, pp. 307–308, Feb. 2008.
[9] Z. Cai, J. Hao, P. Tan, S. Sun, and P. Chin, “Efficient encoding of IEEE 802.11n LDPC codes,” Electron. Lett., vol. 42, no. 25, pp. 1471–1472, Dec. 2006.
[10] Y. Jung, C. Chung, J. Kim, and Y. Jung, “7.7 Gbps encoder design for IEEE 802.11n/ac QC-LDPC codes,” in Proc. Int. SoC Design Conf. (ISOCC), Jeju Island, South Korea, Nov. 2012, pp. 1–4.

[11] H. Zhang, J. Zhu, H. Shi, and D. Wang, “Layered approx-regular LDPC: Code construction and encoder/decoder design,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 55, no. 2, pp. 572–585, Mar. 2008.
[12] J. K. Kim, H. Yoo, and M. H. Lee, “Efficient encoding architecture for IEEE 802.16e LDPC codes,” IEICE Trans. Fundam., vol. E91-A, no. 12, pp. 3607–3611, Dec. 2006.
[13] T. J. Richardson and R. L. Urbanke, “Efficient encoding of low-density parity-check codes,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 638–656, Feb. 2001.
[14] A. Mahdi and V. Paliouras, “A low complexity-high throughput QC-LDPC encoder,” IEEE Trans. Signal Process., vol. 62, no. 10, pp. 2696–2708, May 2014.
[15] Y. Kaji, “Encoding LDPC codes using the triangular factorization,” IEICE Trans. Fundam., vol. E89-A, no. 10, pp. 2510–2518, Oct. 2006.
[16] C. Liu, S. Yen, C. Chen, H. Chang, C. Lee, Y. Hsu, and S. Jou, “An LDPC decoder chip based on self-routing network for IEEE 802.16e applications,” IEEE J. Solid-State Circuits, vol. 43, no. 3, pp. 684–694, Mar. 2008.
[17] P. Murugappa, V. Lapotre, A. Baghdadi, and M. Jezequel, “Rapid design and prototyping of a reconfigurable decoder architecture for QC-LDPC codes,” in Proc. Int. Symp. Rapid Syst. Prototyping (RSP), Montreal, QC, Canada, Oct. 2013, pp. 87–93.
[18] Y. Sun, G. Wang, and J. R. Cavallaro, “Multi-layer parallel decoding algorithm and VLSI architecture for quasi-cyclic LDPC codes,” in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), Rio de Janeiro, Brazil, May 2011, pp. 1776–1779.
[19] Y. S. Park, D. Blaauw, D. Sylvester, and Z. Zhang, “Low-power high-throughput LDPC decoder using non-refresh embedded DRAM,” IEEE J. Solid-State Circuits, vol. 49, no. 3, pp. 783–794, Mar. 2014.
[20] S. W. Yen, S. Y. Hung, C. H. Chen, H. C. Chang, S. J. Jou, and C. Y. Lee, “A 5.79-Gb/s energy-efficient multirate LDPC codec chip for IEEE 802.15.3c applications,” IEEE J. Solid-State Circuits, vol. 47, no. 9, pp. 2246–2256, Sep. 2012.
[21] A. J. Blanksby and C. J. Howland, “A 690-mW 1-Gb/s 1024-b, rate-1/2 low-density parity-check code decoder,” IEEE J. Solid-State Circuits, vol. 37, no. 3, pp. 404–412, Mar. 2002.
[22] T. Mohsenin and B. M. Baas, “Split-row: A reduced complexity, high throughput LDPC decoder architecture,” in Proc. Int. Conf. Comput. Design (ICCD), San Jose, CA, USA, Oct. 2006, pp. 1–6.
[23] C.-W. Sham, X. Chen, F. C. M. Lau, Y. Zhao, and W. M. Tam, “A 2.0 Gb/s throughput decoder for QC-LDPC convolutional codes,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 60, no. 7, pp. 1857–1869, Jul. 2013.
[24] S. K. Ahn, K. J. Kim, S. Myung, S. L. Park, and K. Yang, “Comparison of low-density parity-check codes in ATSC 3.0 and 5G standards,” IEEE Trans. Broadcast., vol. 65, no. 3, pp. 489–495, Sep. 2019.
[25] D. F. Zhao, H. Tian, and R. Xue, “Adaptive rate-compatible non-binary LDPC coding scheme for the B5G mobile systems,” Sensors, vol. 19, no. 5, p. 1067, Mar. 2019.
[26] Y. Zhang, K. Peng, X. Wang, and J. Song, “Performance analysis and code optimization of IDMA with 5G new radio LDPC code,” IEEE Commun. Lett., vol. 8, no. 22, pp. 1552–1555, Jun. 2018.
[27] H. Li, B. Bai, X. Mu, J. Zhang, and H. Xu, “Algebra-assisted construction of quasi-cyclic LDPC codes for 5G new radio,” IEEE Access, vol. 6, pp. 50229–50244, 2018.
[28] H. Wu and H. Wang, “Decoding latency of LDPC codes in 5G NR,” in Proc. Int. Telecommun. Netw. Appl. Conf. (ITNAC), Auckland, New Zealand, Nov. 2019, pp. 1–5.
[29] LDPC Encoder/Decoder V2.0, Logicore IP Product Guide. Accessed: Dec. 5, 2019. [Online]. Available: https://china.xilinx.com/member/ldpc-enc-dec/v2_0/pg281-ldpc.pdf
[30] 5G LDPC Intel FPGA IP User Guide, Updated for Intel Quartus Prime Design Suite: 18.1. Accessed: Dec. 5, 2019. [Online]. Available: https://www.intel.cn/content/dam/www/programmable/us/en/pdfs/literature/ug/ug_5g_ldpc.pdf
[31] G. A. F. Seber, A Matrix Handbook for Statisticians. Hoboken, NJ, USA: Wiley, 2007.
[32] NR; Physical Layer Procedures for Data (Release 15), document TS 38.214, V15.4.0, 3GPP, Dec. 2018.
[33] F. H. Sepehr, A. Nimbalker, and G. Ermolaev, “Analysis of 5G LDPC codes rate-matching design,” in Proc. IEEE Veh. Technol. Conf. (VTC), Porto, Portugal, Jun. 2018, pp. 1–5.
[34] T. Chen, K. Vakilinia, D. Divsalar, and R. D. Wesel, “Protograph based raptor-like LDPC codes,” IEEE Trans. Commun., vol. 63, no. 5, pp. 1522–1532, May 2015.
[35] D. Divsalar, S. Dolinar, C. Jones, and K. Andrews, “Capacity approaching protograph codes,” IEEE J. Sel. Areas Commun., vol. 27, no. 6, pp. 876–888, Aug. 2009.
[36] D. E. Hocevar, “A reduced complexity decoder architecture via layered decoding of LDPC codes,” in Proc. IEEE Workshop Signal Process. Syst. (SIPS), Austin, TX, USA, Oct. 2004, pp. 107–112.
[37] Q. Lu, Z. Shen, C. W. Sham, and F. C. M. Lau, “A parallel-routing network for reliability inferences of single-parity-check decoder,” in Proc. Int. Conf. Adv. Technol. Commun. (ATC), Ho Chi Minh City, Vietnam, Oct. 2015, pp. 1–6.
[38] J. H. Bae, A. Abotabl, H. P. Lin, K. B. Song, and J. Lee, “An overview of channel coding for 5G NR cellular communications,” APSIPA Trans. Signal Inf. Process., vol. 8, no. 1, pp. 1–14, Jan. 2019.
[39] LDPC Design for eMBB Data, document R1-1706970, 3GPP, Huawei, HiSilicon, Hangzhou, China, May 2017.
[40] Union, Set Union of Two Arrays. Accessed: Dec. 5, 2019. [Online]. Available: https://www.mathworks.com/help/matlab/ref/double.union.html
[41] A. Anand and G. de Veciana, “Resource allocation and HARQ optimization for URLLC traffic in 5G wireless networks,” IEEE J. Sel. Areas Commun., vol. 36, no. 11, pp. 2411–2421, Nov. 2018.
[42] S. R. Khosravirad, G. Berardinelli, K. I. Pedersen, and F. Frederiksen, “Enhanced HARQ design for 5G wide area technology,” in Proc. IEEE 83rd Veh. Technol. Conf. (VTC Spring), Nanjing, China, May 2016, pp. 1–5.
[43] T. Bai, C. Xu, R. Zhang, A. F. A. Rawi, and L. Hanzo, “Performance of HARQ-assisted OFDM systems contaminated impulsive noise: Finite-length LDPC code analysis,” IEEE Access, vol. 7, pp. 14112–14123, 2019.
[44] P. Murugappa, A. Baghdadi, and M. Jézéquel, “Parameterized area-efficient multi-standard turbo decoder,” in Proc. Design, Automat. Test Eur. Conf. Exhib. (DATE), Grenoble, France, Mar. 2013, pp. 1–6.
[45] M. Karkooti, P. Radosavljevic, and J. R. Cavallaro, “Configurable, high throughput, irregular LDPC decoder architecture: Tradeoff analysis and implementation,” in Proc. IEEE 17th Int. Conf. Appl.-Specific Syst., Archit. Process. (ASAP), Steamboat Springs, CO, USA, Sep. 2006, pp. 1–8.
[46] Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Amendment 5: Enhancements for Higher Throughput, Standard IEEE 802.11n-2009, 2009.
[47] Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Amendment 4: Enhancements for Very High Throughput for Operation in Bands Below 6 GHz, Standard IEEE 802.11ac-2013, 2013.
[48] Set Difference of Two Arrays. Accessed: Dec. 5, 2019. [Online]. Available: https://www.mathworks.com/help/matlab/ref/setdiff.html
[49] Part 16: Air Interface for Fixed and Mobile Broadband Wireless Access Systems, Amendment 2: Physical and Medium Access Control Layers for Combined Fixed and Mobile Operation in Licensed Bands and Corrigendum 1, Standard IEEE 802.16e-2005, 2005.
[50] Digital Video Broadcasting (DVB); Second Generation Framing Structure, Channel Coding and Modulation Systems for Broadcasting, Interactive Services, News Gathering and Other Broadband Satellite Applications (DVB-S2), Standard ETSI EN 302 307, ETSI, 2009.
[51] Throughput Requirements of LDPC Decoder, document R1-1707069, 3GPP, Ericsson, Hangzhou, China, May 2017.
[52] Efficient Channel Coding Implementations for EMMB, document R1-1610139, 3GPP, Qualcomm Incorporated, Lisbon, Portugal, Oct. 2016, pp. 10–14.
[53] M. Rovini, G. Gentile, F. Rossi, and L. Fanucci, “Techniques and architectures for hazard-free semi-parallel decoding of LDPC codes,” EURASIP J. Embedded Syst., vol. 2009, no. 8, pp. 1–15, Jan. 2009.
[54] Z. Wu, D. Liu, and Y. Zhang, “Matrix reordering techniques for memory conflict reduction for pipelined QC-LDPC decoder,” in Proc. IEEE/CIC Int. Conf. Commun. China (ICCC), Shanghai, China, Oct. 2014, pp. 354–359.
[55] Complexity, Throughput and Latency Analysis on LDPC Codes for EMMB, document R1-1700246, 3GPP, ZTE Microelectronics, Spokane, WA, USA, Jan. 2017, pp. 16–20.
[56] J. Chen, A. Dholakia, E. Eleftheriou, M. Fossorier, and X. Hu, “Reduced-complexity decoding of LDPC codes,” IEEE Trans. Commun., vol. 53, no. 8, pp. 28–34, Aug. 2005.

HAO WU received the M.S. degree in communication and information systems from Tianjin University, in 2009. He is currently a Senior Engineer with ZTE Corporation, where he develops technologies to improve the performance of wireless broadband communication systems. He has authored a number of articles. He holds a number of patents. His research interests include wireless communication systems, digital signal processing, and error control coding.

HUAYONG WANG received the M.S. degree in IC design engineering from Peking University, in 2007. He is currently a Senior Engineer with ZTE Corporation, Shenzhen, China. His research interests include wireless communications systems and VLSI design.
