Download as pdf or txt
Download as pdf or txt
You are on page 1of 49

Kalman Filtering Based Combining for MIMO Systems with

Hybrid ARQ

Journal: Transactions on Signal Processing

Manuscript ID T-SP-27858-2021

Manuscript Type: Regular Paper

Date Submitted by the


17-May-2021
Author:

Complete List of Authors: Park, Sangjoon; Kyonggi University, Department of Electronic Engieering

114. SPC-DETC Detection, estimation, and demodulation < SPC SIGNAL


PROCESSING FOR COMMUNICATIONS, 122. SPC-MIMO Multiple-input
EDICS:
multiple-output communication systems < SPC SIGNAL PROCESSING
FOR COMMUNICATIONS
Page 1 of 48
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. XX, NO. XX, XXX 2021 1

1
2
3 Kalman Filtering Based Combining for MIMO
4
5
6
Systems with Hybrid ARQ
7 Sangjoon Park, Member, IEEE,
8
9
10
11
12 Abstract—This paper proposes a Kalman filtering (KF)-based high flexibility and can be applied to any system employing
combining scheme for multiple-input multiple-output (MIMO) an HARQ regardless of the number of HARQ processes [7]
13
systems with hybrid automatic repeat request (ARQ). In the and HARQ retransmission strategy [8]. In addition to BLC,
14 state-space model, the change of a transmit signal vector in
15 consecutive transmission time intervals (TTIs) is interpreted symbol-level combining (SLC) is a combining scheme for
16 as the transition of a state vector. Accordingly, the proposed HARQ, which combines the received signals related to a
17 KF operation performs symbol-level combining (SLC) based on packet and calculates the LLR for FEC decoding from the
linear minimum mean-square-error (LMMSE) detection using combined signal model [9]–[23]. In particular, in multiple-
18 the information aggregated up to the current TTI. The state-
19 input multiple-output (MIMO) systems, SLC performance can
space model can represent various configurations for MIMO
20 systems with hybrid ARQ (HARQ) via a simple modification be considerably improved from BLC. However, the direct
21 of a transition indicating vector. Therefore, compared to the SLC scheme with a brute-force aggregation of all related
22 existing SLC schemes designed for a specific system in a dedicated information of a packet is impractical in MIMO systems
manner, the proposed scheme can be applied to various MIMO- with HARQ owing to the large computational complexity and
23 HARQ systems, similar to bit-level combining (BLC). In addition,
24 memory loads. Therefore, several alternative SLC schemes
by employing a low-complexity correction step for the KF
25 operation, the computational complexity of the proposed scheme have been investigated for MIMO systems with HARQ.
26 is lower than that of existing SLC schemes and is comparable Despite the simplification of the direct SLC scheme, existing
27 to that of BLC. Simulation results and analyses confirm that the SLC schemes still require higher computational complexity
proposed scheme outperforms BLC and achieves near-identical and memory units than BLC. In addition, the existing SLC
28 error performance to the LMMSE-based direct SLC scheme with
29 schemes are dedicated for a specific MIMO system with
a brute-force aggregation of all related information up to the
30 current TTI for the transmitted packet(s). HARQ according to the number of HARQ processes (e.g.,
31 MIMO single ARQ (MSARQ) and MIMO multiple ARQ
Index Terms—HARQ, MIMO systems, Kalman filtering, (MMARQ) systems) and the HARQ retransmission strategy
32 packet combining, LMMSE, SLC, BLC.
33 (e.g., Chase combining (CC) and incremental redundancy
34 (IR)). Thus, employing these strategies to practical systems
35 I. I NTRODUCTION supporting various configurations in terms of the number
36 N wireless communication systems, the receiver can re- of HARQ processes and HARQ retransmission strategies is
37
38
I quest retransmission of a packet with detection errors. By
retransmitting the forward error correction (FEC) encoded
infeasible. In contrast, despite its flexibility for various sys-
tems and low overhead, BLC shows significantly worse error
39 packet, the hybrid automatic repeat request (HARQ), which performance and system throughput than SLC.
40 is a combined scheme of FEC and ARQ, can overcome Therefore, to overcome the limitations of conventional SLC
41 such detection errors and can achieve an extremely low error and BLC schemes, this paper proposes a Kalman filter-
42 probability [1]. In addition, because the packet in each HARQ ing (KF)-based combining scheme for MIMO systems with
43 round can have a higher code rate than the original mother HARQ. The Kalman filter [24] has been widely studied for sig-
44 codeword [2], HARQ can enhance the system throughput nal detection in MIMO systems, although the applications have
45 in comparison with a system without HARQ. Therefore, as usually been limited to equalization under frequency selective
46 an essential technique for packet transmission, HARQ is fading channels [25]. In the proposed KF-based combining
47 employed in most modern communication standards [3]–[5]. (KC) scheme, considering HARQ retransmissions, changes in
48 To completely utilize HARQ, the receiver must combine a transmitted signal vector from two consecutive transmission
49 the information of the packet obtained throughout the HARQ time intervals (TTIs) are modeled as the transition of a state
50 rounds. The basic combining scheme for HARQ is bit-level vector, which has not been considered in previous studies for
51 combining (BLC) [6]. In BLC, the log-likelihood ratio (LLR) MIMO systems with HARQ. Then, the KF operation is per-
52 of a packet in each HARQ round is separately calculated, formed to obtain an estimate of each transmitted signal vector
53 followed by combining the LLRs obtained from the HARQ according to the corresponding state-space model. Because
54 rounds of the packet for FEC decoding. This BLC scheme has KF can process all information collected up to a given point
55 and obtain a linear minimum mean-square-error (LMMSE)
56 S. Park is with the Department of Electronic Engineering, Kyonggi Uni- estimate of a given state vector [24], the proposed KC scheme
versity, Suwon 16227, Korea (e-mail: sj.park@kgu.ac.kr). enables LMMSE detection of the transmit symbols based on
57 This work was supported by the Ministry of Science and ICT (NRF-
58 2019R1C1C1003202). the SLC by a single KF operation. In this operation, the related
59 Manuscript received May 17, 2021. information (e.g., channel matrices and received signal vectors
60
Page 2 of 48
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. XX, NO. XX, XXX 2021 2

1
2 related to the transmitted packet(s)) for SLC is aggregated by blanking. In [21], an SLC scheme based on QR decomposition
3 the state transition for the KF operation. with IC was proposed for CC. Meanwhile, in [22], an SLC
4 In the proposed scheme, the sizes of the state and ob- scheme with IC of all terminated packets was investigated for
5 servation vectors are equal to the numbers of transmit and CC. In [23], the SLC scheme without IC was developed for CC
6 receive antennas of the system, respectively. In addition, a by considering the interference from the terminated packets as
7 low-complexity correction step is developed for the KF oper- noise.
8 ation of the proposed scheme. Therefore, the computational
9 complexity of the proposed KC scheme is similar to that B. Notations
10 of the conventional LMMSE-based BLC scheme, which is
Vectors and matrices are denoted by lower-case and upper-
11 significantly lower than that of the conventional LMMSE-
case boldface letters, respectively. The superscripts T , ∗, H,
12 based SLC schemes. Furthermore, the state transition model
and −1 are the transpose, conjugate, conjugate-and-transpose,
13 for each transmit signal vector is determined by system con-
and inverse operators, respectively. ⊙ is an element-wise
14 figurations, such as the number of HARQ processes, HARQ
multiplication operator for vectors and matrices. Ij is the j ×j
15 retransmission strategy, bit-to-symbol mapping for modula-
identity matrix, and 0i×j and 1i×j are the i × j all-zero and
16 tion, and spatial multiplexing of modulated symbols. These
all-one matrices, respectively. A = diag(a) is the diagonal
17 configurations should already be known to the receiver to
matrix A whose diagonal elements are a. Finally, E[·] is the
18 decode the packet(s). Therefore, by modifying the transition
mathematical expectation.
19 indicating vector for a given system configuration, the pro-
20 posed scheme can be applied to various MIMO systems with
HARQ, regardless of the number of HARQ processes and the II. S YSTEM M ODEL
21
22 retransmission strategy, similar to BLC. We consider a spatially multiplexed MIMO system with
23 The remainder of this paper is organized as follows. Sec- packet transmission and retransmission by HARQ, where N
24 tion II describes the system model including MSARQ and and M denote the numbers of transmit and receive anten-
25 MMARQ systems with CC and IR. Section III explains the nas, respectively. R is a system limit for the number of
26 proposed scheme in detail, including the state-space model for transmissions (HARQ round) of a packet, and each packet
27 the KF operation, and compares the computational complexity has an independent ARQ process. Each transmit symbol of
28 and memory requirements of the proposed scheme with the a packet is modulated

Q
∑ using2 a 2 -ary constellation S with
29 conventional BLC and SLC schemes. Section IV presents s∈S s = 0 and s∈S |s| = 2 Q
. Each packet contains
30 the simulation results for various system configurations, and J transmit symbols for its r(1 ≤ r ≤ R)th HARQ round;
31 Section V concludes the paper. these J transmit symbols are classified as data and parity
32 symbols, including the data and parity bits of the packet for
33 the current HARQ round, respectively. Let C be the code
A. Related Work rate of a packet for each HARQ round. Then, assuming
34
For MSARQ systems, the modeling of HARQ procedures an integer JC for simplicity, each packet contains JC data
35
is straightforward and several SLC schemes have been devel- and J(1 − C) parity symbols for each HARQ round. For
36
oped. In [9], pre-combining and post-combining schemes with CC, by sending the same codeword for every retransmission,
37
linear detection were proposed. The pre-combining scheme both data and parity symbols become identical regardless
38
was extended to maximal-ratio combining to be used with of the HARQ round. Meanwhile, for IR, JC data symbols
39
maximum-likelihood (ML) detection and linear detection for are identical regardless of the HARQ round, whereas the
40
CC in [10]. In [11], an SLC scheme with sphere decoding was remaining J(1 − C) parity symbols are regenerated according
41
developed for CC. In [12], SLC schemes with QR decompo- to the HARQ round because new redundancy information is
42
sition of the aggregated channel matrix were investigated for sent for retransmission.
43
CC. In [13], a joint utilization of SLC and turbo equalization Let P denote the number of ARQ processes in each TTI. We
44
under frequency selective fading channels was proposed for consider the following two cases: P = 1 for MSARQ systems
45
CC. Meanwhile, in [14], an SLC scheme with ML and linear with single-packet transmission and P = N for MMARQ
46
detections was investigated for IR, which was extended to systems with multiple-packet transmission. When P = 1, the
47
the system with acknowledgement feedback bundling [15]. transmit symbols of a packet are spatially multiplexed across
48
In [16], an SLC scheme under an interference channel was the transmit antennas, and the number of the transmit signal
49
proposed for CC. In [17], an SLC scheme with ML detection vectors for each TTI L is J/N , assuming P is a multiple of
50
was proposed for partial retransmission considering orthogonal N for simplicity. Meanwhile, when P = N , each transmit
51
space-time block codes. In [18], an SLC scheme utilizing a antenna is utilized to send transmit symbols of each packet;
52
subset of the channel matrix was studied for CC. therefore, L = J. Then, for both MSARQ and MMARQ
53
In MMARQ systems, because of multiple ARQ processes, systems, the l(1 ≤ l ≤ L)th receive signal vector at the tth
54
SLC schemes become significantly complex compared to those TTI can be expressed as follows:
55
56 in MSARQ systems. In [19], a joint detection approach for CC
yt,l = Ht,l xt,l + nt,l . (1)
57 was proposed based on the interference cancellation (IC) of
58 successfully decoded packets. In [20], SLC schemes for CC In (1), xt,l is the N ×1 transmit signal vector, yt,l is the M ×1
59 were investigated according to the availability of IC and packet receive signal vector, Ht,l is the M × N channel matrix, and
60
Page 3 of 48
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. XX, NO. XX, XXX 2021 3

1
nt,l is the M ×1 additive white Gaussian noise (AWGN) vector TABLE I: Examples of transition indicating vector dt,l
2
3 with zero-mean and a covariance matrix E[nt,l nH 2
t,l ] = σ IM . r=1 r>1
4 At the receiver, LLRs for the packet(s) transmitted in the MSARQ CC
0N ×1
1N ×1
5 tth TTI are generated considering the HARQ rounds of the IR [11×N C , 01×N (1−C) ]T
6 packet(s), and the generated LLRs are utilized for decoding 1 ≤ l ≤ JC JC + 1 ≤ l ≤ J
7 each packet. A packet that fails at decoding when its HARQ MMARQ CC pt
pt
8 round is less than R is retransmitted at the next (t + 1)th TTI. IR 0N ×1
9 Otherwise, if a packet is successfully decoded or its HARQ
10 round is R, the packet is terminated, and a new packet is
11 transmitted from the (t + 1)th TTI. III. P ROPOSED K ALMAN F ILTERING BASED C OMBINING
12 In the following subsections, a multiplexing strategy for A. State-Space Model
13 mapping the transmit symbols of the packet(s) to the transmit
First, we develop a state-space model for the proposed KC
14 signal vectors is explained for MSARQ and MMARQ systems.
scheme based on the system model described in Section II. In
15 this model, each transmit signal vector xt,l is considered to be
16 A. MSARQ Systems with Single-Packet Transmission the state vector, and the state is transited from xt−1,l to xt,l
17 Let sr = [sr (1), sr (2), · · · , sr (J)] denote the 1×J transmit by packet transmission and retransmission at the consecutive
18 symbol vector of the packet for its r(1 ≤ r ≤ R)th HARQ TTIs. Therefore, the process equation for xt,l (1 ≤ l ≤ L) can
19 round, where sr (j) is the jth transmit symbol of sr . Therefore, be written as follows:
20 {sr (1), · · · , sr (JC)} and {sr (JC+1), · · · , sr (J)} denote the
21 xt,l = Ft,l xt−1,l + wt,l , (4)
data and parity symbols of the packet for the rth HARQ round,
22 respectively. Then, the elements in sr can be multiplexed to where Ft,l and wt,l are the N × N state transition matrix
23 transmit signal vectors of the current tth TTI as follows: and N × 1 process noise vector, respectively. Let dt,l =
24 [dt,l (1), · · · , dt,l (N )]T = E[xt−1,l ⊙ x∗t,l ] be the N × 1
25 xt,l =[xt,l (1), · · · , xt,l (N )]T =[sr (l), · · · , sr (l+(N −1)L)]T . transition indicating vector for xt,l , where dt,l (n) = 1 if
26 (2) xt−1,l (n) is identical to xt,l (n) (e.g., a retransmitted data
27 In (2), xt,l (n)(= sr (l + (n − 1)L)) is the nth element of xt,l , symbol) and 0 otherwise. Then, Ft,l and wt,l can be written
28 which is sent from the nth transmit antenna. Therefore, xt,l (n) as follows:
29 with 1 ≤ l + (n − 1)L ≤ JC is the repeatedly transmitted Ft,l = diag(dt,l ) (5)
30 data symbol, where E[xt−1,l (n)x∗t,l (n)] = 1, regardless of
31 the HARQ retransmission strategy, if r > 1. Meanwhile, and
32 xt,l (n) for JC + 1 ≤ l + (n − 1)L ≤ J is the parity wt,l = xt,l ⊙ (1N ×1 − dt,l ). (6)
33 symbol. Therefore, if r > 1, E[xt−1,l (n)x∗t,l (n)] = 0 for In Table I, examples of dt,l according to the system
34 IR, whereas E[xt−1,l (n)x∗t,l (n)] = 1 for CC. Furthermore, if configuration are shown, where an integer N C is assumed
35 r = 1, indicating that a new packet is transmitted from the tth for MSARQ systems. pt = [pt (1), · · · , pt (N )]T denotes the
36 TTI, E[xt−1,l (n)x∗t,l (n)] = 0, regardless of l, n, and HARQ retransmission status in MMARQ systems, where pt (n) is 0
37 retransmission strategy. for rt (n) = 1 and 1 for 2 ≤ rt (n) ≤ R. As shown in Table
38 I, dt,l can be easily determined according to the system con-
39 B. MMARQ Systems with Multiple-Packet Transmission figuration that should be shared between the transmitter and
40 receiver for successful packet transmission.1 Therefore, the
Let st,n = [st,n (1), st,n (2), · · · , st,n (J)] denote the 1 × J
41 process equation in (4) can represent the transition behavior
transmit symbol vector of the packet transmitted from the nth
42 of xt,l in both MSARQ and MMARQ systems with CC or IR.
transmit antenna at the tth TTI. Because L = J, the elements
43 Further, the observation equation is equivalent to (1), which
in {st,1 , · · · , st,N } can be multiplexed to {xt,1 , · · · , xt,L } as
44 is
follows:
45 yt,l = Ht,l xt,l + nt,l , (7)
46 xt,l = [xt,l (1), · · · , xt,l (N )]T = [st,1 (l), · · · , st,N (l)]T . (3)
47 where yt,l and nt,l become the M × 1 observation vector and
48 Therefore, xt,l (n) = st,n (l), regardless of t, n, and l. Let observation noise vector, respectively, and Ht,l becomes the
49 rt = [rt (1), · · · , rt (N )]T denote the N × 1 vector, indicating M × N observation matrix.
50 the HARQ round of the active packets at the tth TTI, where Note that the state-space model in (4)–(7) is valid for a
51 rt (n) is the HARQ round of the packet from the nth transmit general MIMO system, while each term in (4)–(7) can be
52 antenna with 1 ≤ rt (n) ≤ R. Therefore, if rt (n) > 1 and changed according to the specific system configuration.
53 1 ≤ l ≤ JC, E[xt−1,l (n)x∗t,l (n)] = 1, regardless of the
1 For normal functioning of reception procedures, in addition to the number
54 HARQ retransmission strategy. In contrast, if rt (n) > 1 and
of HARQ processes and the retransmission strategy, the receiver should
55 JC + 1 ≤ l ≤ J, E[xt−1,l (n)x∗t,l (n)] = 1 for CC and 0 know (i) the position in the mother codeword of the coded bits consisting
56 for IR. Furthermore, similar to the case of MSARQ systems, each symbol and (ii) the packet to which each symbol belongs. For packet
if rt (n) = 1, indicating that a new packet is sent from the transmission, this information is usually shared between the transmitter and
57 receiver via control channel. Therefore, the receiver and transmitter can
58 nth transmit antenna, E[xt−1,l (n)x∗t,l (n)] = 0 for the given n, calculate dt,l = E[xt−1,l ⊙ x∗t,l ] by deciding whether each xt−1,l (n) is
59 regardless of l and HARQ retransmission strategy. retransmitted as xt,l (n), as described in Sections II-A and II-B.

60
Page 4 of 48
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. XX, NO. XX, XXX 2021 4

1
2 B. Proposed Kalman Filtering Based Combining Algorithm 1: KF Stage with Low-Complexity Correction Step.
3 The proposed KC scheme includes three sequentially per- Input: l, σ 2 , yt,l , Ht,l , Ft,l , Wt,l , x̂t−1,l , Pt−1,l
4 formed stages: KF stage, LLR calculation stage, and optional (−) (−)
1: Prediction: x̂t,l = Ft,l x̂t−1,l , Pt,l = Ft,l Pt−1,l FTt,l + Wt,l ;
5 LLR combining stage for IR. The output LLRs of the proposed (−) (−)
2: Correction Step Initialization: x̂t,l,0 = x̂t,l , Pt,l,0 = Pt,l ;
6 scheme are used as the input for decoding. The following 3: for m = 1 to M do
7 section presents the detailed procedures of each stage. 4: vt,l,m = Pt,l,m−1 h∗t,l,m ;
8 1) Kalman filtering stage: Let x̂t,l denote the N ×1 filtered 5: kt,l,m = vt,l,m /(hT 2
t,l,m vt,l,m +σ );
9 estimate of xt,l using all observations up to the tth TTI, and 6: x̂t,l,m = x̂t,l,m−1 + kt,l,m (yt,l (m) − hT t,l,m x̂t,l,m−1 );
10 let Pt,l = E[(xt,l − x̂t,l )(xt,l − x̂t,l )H ] denote its N × N 7: Pt,l,m = Pt,l,m−1 − kt,l,m vt,l,mH ;
11 error covariance matrix. x0,l and P0,l can be initialized to all- 8: end for
12 (−)
zero for 1 ≤ l ≤ L. In addition, let x̂t,l denote the N × 1
9: Output: x̂t,l = x̂t,l,M , Pt,l = Pt,l,M ;
13 predicted estimate of xt,l using all observations up to the (t −
14 (−) (−) (−)
1)th TTI and Pt,l = E[(xt,l − x̂t,l )(xt,l − x̂t,l )H ]. Then,
15 (−) (−)
to xt,l,M is IN . Consequently, the prediction steps for the
without a priori information regarding xt,l , x̂t,l and Pt,l can sub-state transitions can be omitted, except xt−1,l → xt,l,1 ,
16
be obtained from x̂t−1,l and Pt−1,l as [24], [25] where the prediction of xt,l,1 (= xt,l ) from xt−1,l can be
17
18 (−)
(8) performed via the prediction step in (8) and (9). Therefore,
x̂t,l = Ft,l x̂t−1,l (−) (−)
19 with x̂t,l,0 = x̂t,l and Pt,l,0 = Pt,l , the low-complexity
20 and correction step is sequentially performed from m = 1 to M
(−)
21 Pt,l = Ft,l Pt−1,l FTt,l + Wt,l , (9) as follows:
22 x̂t,l,m = x̂t,l,m−1 + kt,l,m (yt,l (m) − hTt,l,m x̂t,l,m−1 ) (14)
where Wt,l = E[wt,l wt,lH
] = IN − Ft,l is the N × N process
23
noise covariance matrix.
24 and
After the prediction step in (8) and (9), a correction step is
25 (−) (−) Pt,l,m = Pt,l,m−1 − kt,l,m hTt,l,m Pt,l,m−1 , (15)
performed to obtain x̂t,l and Pt,l from x̂t,l and Pt,l . Based
26
on (7), the correction step can be written as follows [24]: where kt,l,m is the N × 1 Kalman gain vector expressed as
27
28 x̂t,l =
(−)
x̂t,l + Kt,l (yt,l −
(−)
Ht,l x̂t,l ) (10) Pt,l,m−1 h∗t,l,m
kt,l,m = . (16)
29 hTt,l,m Pt,l,m−1 h∗t,l,m + σ 2
30 and
(−) (−)
31 Pt,l = Pt,l − Kt,l Ht,l Pt,l , (11) The calculated x̂t,l,M and Pt,l,M are then set to x̂t,l and
32 Pt,l , respectively. Therefore, the matrix-matrix multiplications
where Kt,l is the N × M Kalman gain matrix given by
33 and matrix inversion in (10)–(12) are replaced by matrix-
(−) (−) −1 vector multiplications and vector-scalar division, respectively.
34 Kt,l = Pt,l HH H 2
t,l (Ht,l Pt,l Ht,l + σ IM ) . (12)
35 Furthermore, because the observation equation in (7) is linear
Therefore, x̂t,l is obtained by the KF operation in (8)–(12), and the observation noise nt,l is AWGN, x̂t,l and Pt,l from
36
which is an LMMSE estimate of xt,l using the information the low-complexity correction step in (14)–(16) are identical
37
aggregated up to the tth TTI by the state transition. to those from the correction step in (10)–(12) [24].
38
Although the correction step in (10)–(12) provides an In Alg. 1, the proposed KF stage with the low-complexity
39
40
LMMSE estimate of xt,l based on SLC, it requires several correction step is summarized. vt,l,m = Pt,l,m−1 h∗t,l,m in
matrix multiplication and inverse operations. To avoid such Alg. 1 is defined to reduce the number of calculations of
41
42
matrix operations, a low-complexity correction step is devel- Pt,l,m−1 h∗t,l,m in both Pt,l,m in (15) and kt,l,m in (16).
oped, which generates the same x̂t,l and Pt,l to the correction Next, we compare x̂t,l to the estimate of other combining
43
step in (10)–(12). schemes with LMMSE detection. When Ft,l = 0N ×N regard-
44
For the low-complexity correction step, first, the single less of t and l (e.g., full IR), x̂t,l is equivalent to the estimate of
45
M × 1 observation vector yt,l in (7) is modeled as the M xt,l by BLC with LMMSE detection because Kt,l becomes
46 −1 H
consecutive scalar observations, i.e., (HH 2
t,l Ht,l + σ IN ) Ht,l . In addition, for general Ft,l , we
47
48 yt,l (m) = hTt,l,m xt,l + nt,l (m), 1 ≤ m ≤ M, (13) have the following lemma:
49 Lemma 1 (Equivalence of the estimated xt,l in the proposed
50 where yt,l (m) and nt,l (m) are the mth elements of yt,l and and direct SLC schemes): x̂t,l is identical to the corresponding
51 nt,l , respectively, and hTt,l,m is the 1×N vector corresponding estimate of xt,l obtained by the direct SLC scheme with
52 to the mth row of Ht,l . LMMSE detection at the tth TTI.
53 Then, the state transition of xt−1,l → xt,l with an M × 1 Proof: See Appendix A.
54 observation vector based on (7) can be interpreted as M sub- Therefore, the proposed and direct SLC schemes at the
55 state transitions of xt−1,l → xt,l,1 → · · · → xt,l,M with tth TTI obtain the identical estimate for xt,l regardless of
56 M scalar observations based on (13), where xt,l,m is the the number of HARQ processes and retransmission strategy.
57 mth sub-state vector considering the mth scalar observation However, the direct SLC scheme requires a significantly
58 yt,l (m). Because xt,l,m = xt,l for 1 ≤ m ≤ M , the larger computational complexity and has to be designed in
59 state-transition matrix for each sub-state transition from xt,l,1 a dedicated manner for a specific system configuration.
60
Page 5 of 48
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. XX, NO. XX, XXX 2021 5

1
TABLE II: Computational complexity, memory requirement, and performance of the combining schemes
2
3 System applicability Combining scheme Computational complexity Memory requirement Performance
4
5 Proposed KC O(2N 2 M ) N (N + 1)g high
Applicable to all
Conventional BLC O(N 3 + 2N 2 M ) -h low
6
Direct SLC O(N 3 + 2rN 2 M ) rM (N + 1) high
7 Pre-combining [9] O(N 3 + (r + 1)N 2 M ) N (N + 1) high
8 MSARQ-CC Maximal-ratio combining [10] O(4N 3 + N 2 M ) N (N + 1) high
9 SLC w/ direct QR [12] O(3N 3 + rN 2 M ) rM (N + 1) high
10 SLC w/ incremental QR [12] O(3N 3 + N 2 (M + αr ))a N (N + 3)/2 high
11 Direct SLC O(βr3 N 3 + 2rβr2 N 2 M )b,j rM (N + 1) high
MSARQ-IR
12 Extended detection [14] O(γr N 3 + δr N 2 M )c,j rM (N + 1) + rN 2 C(1 − C) high
13 Direct SLC O(Nt3 + 2Nt2 Mt )d tM (N + 1)i high
14 MMARQ-CC SLC w/ packet elimination [19], [22] O(N 3 + 2rt∗ N 2 M )e,f rt∗ M (N + 1) high
Aggregation-assisted combining [23] O(N 3 + 2rt∗ N 2 M )e rt∗ M (N + 1) moderate
15
MMARQ-IR Direct SLC O(Nt3 + 2Nt2 Mt )d,j tM (N + 1)i high
16
17 a αr = 0 for r = 1 and N for r ≥ 2.
18
b βr = 1 + (r − 1)(1 − C).
c γ = 1 and δ = 2 for r = 1, and γ = C 3 + (1 − C)3 and δ = 1 + (1 − C)2 + rβ C for r ≥ 2.
19 r r r r r
d N and M are the numbers of columns and rows in the utilized aggregated channel matrix, respectively. See Appendix A for details.
20 t t
e r ∗ = max(r ).
21 t t
f The complexity of the terminated packet elimination is additionally required.
22 g The LLRs of the previously transmitted parity bits for the current active packet(s) must be stored for IR.
23 h The LLRs obtained at the previous HARQ rounds of the current active packet(s) need to be stored.

24 i This can be reduced to (t−t∗+1)M (N +1), where t∗−1(< t) is the last TTI in which all active packets are terminated. See Appendix A for details.

25 j The complexity for LLR calculation is higher than the other cases by the increased number of transmit symbols (> N ) for LLR calculation.

26
27
28 2) LLR calculation stage: After the KF operation, LLRs of and compared with those of other combining schemes.
29 the bits consisting of each element of xt,l must be calculated By the structure of Ft,l , the prediction step in (8) and
30 from x̂t,l and Pt,l for FEC decoding. The residual interference (9) becomes the selection of subsets in x̂t−1,l and Pt−1,l .
31 with noise in x̂t,l (n), the nth element of x̂t,l , can be approxi- Therefore, the correction step in (14)–(16) mainly yield the
32 mated as a Gaussian distribution with variance pt,l (n), which complexity. Considering the highest-order terms only, the com-
33 is the nth diagonal element of Pt,l [25]. Let bqt,l,n denote the plexity is governed by calculations of Pt,l,m−1 h∗t,l,m (= vt,l,m
q
34 q(1 ≤ q ≤ 2Q )th bit comprising xt,l (n). Then, lt,l,n , the LLR in Alg. 1) and kt,l,m hTt,l,m Pt,l,m−1 (= kt,l,m vt,l,m
H
), and each
q
35 of bt,l,n can be calculated as follows: of them entails the complexity of O(N ). Because (14)–(16)
2

36 ∑ ( ) should be performed M times, the complexity of the KF stage


−|x̂t,l (n)−s|2
exp is O(2N 2 M ) for each xt,l . Meanwhile, the complexity with
37 ∀s∈S,bqt,l,n =1
pt,l (n)
38 q
lt,l,n = log ( ). (17) the correction step in (10)–(12) is O(2N 2 M + 2N M 2 + M 3 )
∑ −|x̂t,l (n)−s|2
39 exp pt,l (n)
for each xt,l . Therefore, the low-complexity correction step
40 ∀s∈S,bqt,l,n =0 can significantly reduce the complexity of the KF stage.
41 3) LLR combining stage: For LLRs of the bits comprising Next, the memory requirement for the KF stage is derived,
42 repeatedly transmitted symbols included in xt,l , e.g., the in which the memory size to store one complex number is
43 transmit symbols of the packet(s) with CC and data symbols denoted as one memory unit. At the KF stage, only x̂t−1,l and
44 of the packet(s) with IR, LLR combining is not required. Pt−1,l need to be stored for the current tth TTI. Therefore,
45 Meanwhile, the parity symbols of the packet sent until the N (N + 1) memory units are required for each transmit signal
46 last (t − 1)th TTI are not included in xt,l if IR is employed. vector, which can be further reduced because Pt,l is Hermitian.
47 Therefore, for IR, the LLRs of the coded bits comprising the Table II compares the computational complexity, memory
48 parity symbols sent until the (t − 1)th TTI should be stored to units, and error performance of the combining schemes with
49 be combined with the currently obtained LLRs for decoding.2 LMMSE detection. The existing SLC schemes require lower
50 complexity than the brute-force direct SLC scheme; however,
51 C. Computational Complexity and Memory Units they require additional overhead from BLC to aggregate in-
52 In this subsection, the complexity and memory size of the formation. In contrast, the complexity of the proposed scheme
53 proposed scheme for detection (i.e., the KF stage) are derived is comparable to that of BLC and is significantly smaller
54 than that of the conventional SLC schemes.3 In addition,
2 In general, all LLRs of the bits comprising the symbols that are not sent at
55 the next TTI should be stored in the proposed scheme for combining in future 3 Although the complexity based on the highest-order terms of the proposed
56 TTIs in which the packet is retransmitted. For example, if the symbol xt,l (n) scheme is lower than that of BLC, the proposed scheme has several low-
57
q
for a packet is sent at the (t − 3)th, (t − 2)th, and tth TTIs, then lt−2,l,n order terms, whereas BLC has few low-order terms. In that sense, despite the
q
58 should be pre-stored and combined with lt,l,n at the tth TTI to utilize the advantage in terms of complexity for a large N , we consider the complexity
information for xt,l (n) obtained at the (t − 3)th and (t − 2)th TTIs. of both schemes to be similar in this paper.
59
60
Page 6 of 48
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. XX, NO. XX, XXX 2021 6

1
2
10 -1 10 0
3
4
5 10 -2
10 -1
6

Average BLERs
Average BERs

7 10 -3
8 r=1
10 -2
r=1
r=2 r=2
9 10 -4 r=3 r=3
10 KC KC
11 -5
BLC 10 -3 BLC
10 DSLC DSLC
12
N=M=4 N=M=4
13 N = M = 16 N = M = 16
-6 -4
14 10
-12 -8 -4 0 4 8
10
-12 -8 -4 0 4 8
15 SNR [dB] SNR [dB]
16 (a) (b)
17
18 Fig. 1: Error performance of combining schemes in MSARQ-CC systems: (a) average BER and (b) average BLER.
19
20 10 -1 10 0
21
22
10 -2
23 10 -1
24
Average BLERs
Average BERs

25 10 -3
r=1 r=1
26 r=2 10 -2 r=2
27 10 -4 r=3 r=3
28 KC KC
29 BLC 10 -3 BLC
10 -5 DSLC DSLC
30 N=M=4 N=M=4
31 -6
N = M = 16
-4
N = M = 16
32 10
-12 -8 -4 0 4 8
10
-12 -8 -4 0 4 8
33 SNR [dB] SNR [dB]
34 (a) (b)
35
Fig. 2: Error performance of combining schemes in MSARQ-IR systems: (a) average BER and (b) average BLER.
36
37
38 the proposed scheme requires lesser memory units than most forcefully terminated for every 100 TTIs. Two antenna con-
39 conventional SLC schemes. Even with the low computational figurations of N = M = 4 and N = M = 16 are
40 complexity and memory units, the proposed scheme can be considered, and R = 3, i.e., maximum two retransmissions
41 applied to both MSARQ and MMARQ systems regardless of for each packet. A low-density parity-check (LDPC) code
42 the retransmission strategy, unlike conventional SLC schemes. in [3] with a rate of 0.5 and a codeword length of 1440
43 Furthermore, the proposed scheme can outperform BLC and is considered as the mother code. As C = 3/4, 960 bits
44 achieve near-identical error performance to the direct SLC among the mother codeword of a packet are transmitted for
45 scheme, as shown in Section IV and Appendices A and B. each HARQ round. The quadrature phase shift keying (QPSK)
46 modulation with Q = 2 is considered; therefore, J = 480. The
47 IV. S IMULATION R ESULTS independent Rayleigh fading channel in which the channel
48 response independently varies for transmit signal vectors is
49 In this section, the average bit-error ratio (BER) and block-
error ratio (BLER) of the combining schemes are evaluated. considered. Furthermore, the maximum number of decoding
50 iterations for a packet at each TTI is set to 20. After decoding,
51 A block error is defined as the case in which at least one
bit-error occurs among the decoded data bits of a packet. In the syndromes of cyclic redundancy check (CRC) and LDPC
52 codes are used to detect errors in the decoded packets for a
53 addition to the proposed scheme, the BLC and direct SLC
schemes are considered as reference schemes. In MMARQ retransmission request, where CRC-32 is considered.
54
55 systems, the size of the aggregated channel matrix for the Figs. 1 and 2 depict the error performance of the combining
56 direct SLC scheme increases unless all active packets are schemes in MSARQ systems with CC and IR, respectively.
57 simultaneously terminated at the same TTI; this leads to As analyzed in Appendix B, compared with the direct SLC
58 an excessively large complexity for simulations. To prevent scheme, the proposed scheme shows identical error perfor-
59 such situations, in MMARQ systems, all active packets are mance for CC regardless of the antenna configuration. Mean-
60
Page 7 of 48
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. XX, NO. XX, XXX 2021 7

1
2
10 -1 10 0
3
4
5 10 -2
10 -1
6

Average BLERs
Average BERs

7 10 -3
8 r=1
10 -2
r=1
r=2 r=2
9 10 -4 r=3 r=3
10 KC KC
11 -5
BLC 10 -3 BLC
10 DSLC DSLC
12
N=M=4 N=M=4
13 N = M = 16 N = M = 16
-6 -4
14 10
-12 -8 -4 0 4 8
10
-12 -8 -4 0 4 8
15 SNR [dB] SNR [dB]
16 (a) (b)
17
18 Fig. 3: Error performance of combining schemes in MMARQ-CC systems: (a) average BER and (b) average BLER.
19
20 10 -1 10 0
21
22
10 -2
23 10 -1
24
Average BLERs
Average BERs

25 10 -3
r=1 r=1
26 r=2 10 -2 r=2
27 10 -4 r=3 r=3
28 KC KC
29 BLC 10 -3 BLC
10 -5 DSLC DSLC
30 N=M=4 N=M=4
31 -6
N = M = 16
-4
N = M = 16
32 10
-12 -8 -4 0 4 8
10
-12 -8 -4 0 4 8
33 SNR [dB] SNR [dB]
34 (a) (b)
35
Fig. 4: Error performance of combining schemes in MMARQ-IR systems: (a) average BER and (b) average BLER.
36
37
38 while, the performance for IR slightly degrades with retrans- of having an extremely huge aggregated system model. In
39 mission (r ≥ 2) for both N = M = 4 and N = M = 16, addition, the proposed and direct SLC schemes outperform
40 because the LLRs of the parity bits transmitted at the previous BLC, and they achieve a slightly better error performance than
41 HARQ rounds only are not updated in the future HARQ BLC even for the initially transmitted packets (r = 1). This is
42 rounds, unlike the direct SLC scheme. Nevertheless, the gain because, unlike SLC in MSARQ systems, SLC in MMARQ
43 in the signal-to-noise ratio (SNR) of the direct SLC scheme systems can provide a better estimation result for initially
44 over the proposed scheme is negligible compared with that transmitted packets as well as retransmitted packets [22].
45 of the proposed scheme over BLC. For both CC and IR
46 with r ≥ 2, the proposed scheme outperforms BLC with V. C ONCLUSION
47 a similar complexity. Further, as all combining schemes for
48 In this study, we proposed and investigated an SLC scheme
r = 1 in MSARQ systems become equivalent to the LMMSE based on KF operation for MIMO systems with HARQ. The
49 detection in MIMO systems without retransmission, the error
50 proposed scheme exhibits system applicability and computa-
performance of the combining schemes for r = 1 is identical tional complexity similar to the BLC scheme with degraded
51 in MSARQ systems.
52 error performance. The error performance of the proposed
53 Figs. 3 and 4 show the error performance of the combining scheme is nearly identical to that of the dedicated direct SLC
54 schemes in MMARQ systems with CC and IR, respectively. scheme for a specific system configuration with high com-
55 Again, as analyzed in Appendix B, the proposed scheme putational complexity. Therefore, the proposed KC scheme is
56 achieves identical error performance to the direct SLC scheme considered an effective and powerful combining scheme for
57 for both CC and IR regardless of the antenna configuration MIMO systems with HARQ.
58 and r. Meanwhile, the direct SLC scheme is impractical Notably, the proposed scheme can be employed for other
59 specifically for MMARQ systems, because of the possibility system models not considered in this paper. For example,
60
Page 8 of 48
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. XX, NO. XX, XXX 2021 8

1
2 through proper calculation of the transition indicating vec- Then, the aggregated receive signal vector for H̃t,l can be
3 tor dt,l , the proposed scheme can be applied to MIMO- represented as follows:
4 HARQ systems with different spatial multiplexing (e.g., xt,l =
ỹt,l = H̃t,l x̃t,l + ñt,l , (19)
5 [sr ((l − 1)N + 1), · · · , sr (lN )]T for MSARQ systems or
6 a packet transmitting from multiple transmit antennas for where ỹt,l = T
[y1,l ,··· T T
, yt,l ]
and ñt,l = [nT1,l , · · · , nTt,l ]T
7 MMARQ systems with 1 < P < N ), different HARQ retrans- denote the Mt × 1 receive signal and AWGN vectors for H̃t,l ,
8 mission strategies (e.g., full IR without repeatedly transmit- respectively.
9 ted symbols [8]), acknowledgement feedback bundling (e.g., Based on (19), the direct SLC scheme with LMMSE detec-
10 MSARQ systems with multiple packets forced to have the tion at the tth TTI can be performed as follows:
11 same ARQ process [15]), and packet blanking (e.g., sending ˆ t,l = (H̃H
x̃ 2 −1 H
(20)
t,l H̃t,l + σ INt ) H̃t,l ỹt,l = Gt,l ỹt,l ,
12 no new packets until all the remaining packets are terminated
13 [20]). In practical systems, the repeatedly transmitted symbols where x̃ ˆ t,l is the Nt × 1 LMMSE estimate of x̃t,l and Gt,l is
14 can be included in different transmit signal vectors or transmit the Nt × Mt LMMSE filter matrix.
15 antennas for each HARQ round [4], [5], and the proposed Next, we consider the aggregated KF operation based on a
16 scheme can support such cases by simple modifications. In state-space model using the process equation x̃t,l = F̃t,l x̃0,l +
17 addition, the proposed scheme can be extended to employ w̃t,l and observation equation (19), where F̃t,l = 0Nt ×Nt ,
18 iterative detection procedures for better error performance. x̃0,l = 0Nt ×1 , and w̃t,l = x̃t,l . As the output of the
19 These topics will be investigated in future works. prediction step {x̃ ˆ (−) , P̃(−) } is {0N ×1 , IN } and the Kalman
t,l t,l t t

20 gain matrix is Gt,l , the filtered output of the aggregated KF


21 A PPENDIX A operation at the tth TTI becomes identical to that of the direct
22 P ROOF OF L EMMA 1 SLC scheme, x̃ ˆ t,l , in (20).
23 Let H̆t∗ ,l denote the Mt∗ × Nt∗ aggregated channel The aggregated KF operation for x̃0,l → x̃t,l can be divided
24 matrix including all channel matrices up to the t∗ th TTI into several small KF operations for x0,l → x1,l → · · · →
25 {H1,l , · · · , Ht∗ ,l }, where Mt∗ and Nt∗ are the numbers of ag- xt,l , which correspond to consecutive KF operations of the
26 gregated receive and transmit antennas in H̆t∗ ,l , respectively. proposed scheme from the first to tth TTI. As (19) is a linear
27 We assume that all packets transmitted at the (t∗ − 1)th TTI equation and every element of ñt,l is AWGN, the elements
28 are simultaneously terminated after the reception procedures. in x̂t,l (the estimate after several small KF operations) are
29 Then, all elements of xt∗ ,l are new transmit symbols, regard- identical to the corresponding elements in x̃ ˆ t,l (the estimate
30 less of the number of HARQ processes and retransmission after the aggregated KF operation) [24], i.e., x̂t,l (n) = x̃ ˆt,l (n)
31 strategy, which should be regarded as those sent from addi- with 1 ≤ n ≤ a for the repeatedly transmitted symbols and
32 tional transmit antennas in the aggregated system model [14], x̂t,l (n) = x̃ˆt,l (n + Nt − N ) with a + 1 ≤ n ≤ N for the newly
33 [19]–[23]. Therefore, H̆t∗ ,l can be written as follows: transmitted symbols. Therefore, the proposed and direct SLC
34 [ ] schemes at the tth TTI obtain the identical estimate for xt,l .
35 H̆t∗ −1,l 0Mt∗ −1 ×N
H̆t∗ ,l = . (18) A PPENDIX B
36 0M ×Nt∗ −1 Ht∗ ,l
E RROR PERFORMANCE OF THE PROPOSED AND DIRECT
37
As the two column partitions of H̆t∗ ,l in (18) are orthogonal, SLC SCHEMES
38
39 the estimate of xt,l using H̆t∗ ,l is identical to that using Ht∗ ,l , Based on the analysis in Appendix A, the error performance
40 regardless of the detection criterion. Therefore, if all active of the proposed KC and direct SLC schemes according to the
41 packets are simultaneously terminated at (t∗ − 1)th TTI, H̆t∗ ,l system configuration can be compared as follows:
42 can be initialized to Ht∗ ,l of a much smaller size, and the 1) MSARQ-CC: Because there are no newly transmitted
43 aggregation can be performed from the initialized H̆t∗ ,l at symbols for retransmission, the proposed and direct SLC
44 future TTIs until the condition is satisfied again. Note that the schemes always obtain LLRs for decoding from the estimate
45 packet termination in MSARQ systems always satisfies the of xt,l only, where the estimate of xt,l is identical in both
46 above condition. schemes. Therefore, the error performance of the proposed
47 Based on the above strategy, instead of H̆t,l , we consider scheme is identical to that of the direct SLC scheme.
48 the effective aggregated channel matrix H̃t,l , which includes 2) MSARQ-IR: The estimate for data symbols is identical in
49 the channel matrices {Ht∗ ,l , · · · , Ht,l }, i.e., the (t∗ − 1)th both combining schemes, as in MSARQ-CC systems. Mean-
50 TTI with t∗ ≤ t was the last TTI satisfying the condition while, the parity symbols transmitted at the previous TTIs are
51 of simultaneous packet termination. For simple notations, we not re-estimated in the proposed scheme, while the direct SLC
52 assume that i) t∗ = 1 and ii) a symbols are repeatedly scheme re-estimates all the previously transmitted symbols.
53 transmitted from the first to the ath transmit antennas, and Therefore, the error performance of the proposed scheme
54 b(= N − a) symbols are always newly transmitted from for retransmission can be slightly degraded from that of the
55 the (a + 1) to N th transmit antennas. Then, the numbers of direct SLC scheme. However, this performance degradation
56 aggregated transmit and receive antennas are Nt = a + bt and is marginal, because the performance improvement of parity
57 Mt = M t, respectively, and H̃t,l becomes the Mt ×Nt matrix. symbols by retransmissions in the direct SLC scheme is
58 Let x̃t,l = [x1,l (1), · · · , x1,l (a), x1,l (a + 1), · · · , xt,l (N )]T limited in contrast to the significant improvement of the data
59 denote the Nt × 1 aggregated transmit signal vector for H̃t,l . symbols [14].
60
Page 9 of 48
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. XX, NO. XX, XXX 2021 9

1
2 3) MMARQ-CC: Similar to MSARQ-CC systems, the pro- [19] C. Bai, W. A. Krzymien, and I. J. Fair, “Hybrid-ARQ for layered space
time MIMO systems with channel state information only at the receiver,”
3 posed scheme obtains LLRs for decoding from the current
IET Commun., vol. 4, no. 14, pp. 1765–1773, Sep. 2010.
4 estimate x̂t,l in which the elements are identical to the [20] N. Prasad and X. Wang, “Efficient combining techniques for multi-
5 corresponding elements in x̃ ˆ t,l of the direct SLC scheme. input multi-output multi-user systems employing hybrid automatic repeat
request,” IET Commun., vol. 5, no. 13, pp. 1785–1796, Sep. 2011.
6 Therefore, the error performance of the proposed scheme is
[21] T. Yamamoto, K. Adachi, S. Sun, and F. Adachi, “Recursive QR packet
7 identical to that of the direct SLC scheme. combining for uplink single-carrier multi-user MIMO HARQ using near
8 4) MMARQ-IR: The estimated data symbols in the pro- ML detection,” in Proc. Int. Wirel. Commun. Mobile Comput. Conf.,
posed and direct SLC schemes are identical, as in MMARQ- Limassol, Cyprus, Aug. 2012, pp. 407–412.
9 [22] S. Park and S. Choi, “Performance of symbol-level combining and
10 CC systems. Meanwhile, as Ft,l = 0N ×N for the transmit bit-level combining in MIMO multiple ARQ systems,” IEEE Trans.
11 signal vectors of parity symbols, their estimate in the proposed Commun., vol. 64, no. 4, pp. 1517–1528, Apr. 2016.
scheme is identical to BLC. In addition, as the aggregated [23] S. Park, “Aggregation-assisted combining for MIMO multiple ARQ
12 systems,” IEEE Commun. Lett., vol. 22, no. 2, pp. 348–351, Feb. 2018.
13 channel matrix for the transmit signal vectors of parity sym- [24] D. Simon, Optimal State Estimation: Kalman, H∞ , and Nonlinear
14 bols is block-diagonal with the t′ th diagonal block Ht′ ,l for Approaches, New York: Wiley, 2006.
15 1 ≤ t′ ≤ t, the direct SLC scheme also provides an estimate of [25] S. Park and S. Choi, “Iterative equalizer based on Kalman filtering and
smoothing for MIMO-ISI channels,” IEEE Trans. Signal Process., vol.
16 the parity symbols identical to BLC. Therefore, the proposed 63, no. 19, pp. 5111–5120, Oct. 2015.
17 and direct SLC schemes exhibit identical error performance.
18
R EFERENCES
19
20 [1] D. Chase, “Code combining—A maximum-likelihood decoding approach
for combining an arbitrary number of noisy packets,” IEEE Trans.
21 Commun., vol. 33, no. 5, pp. 385–393, May 1985.
22 [2] D. N. Rowitch and L. B. Milstein, “On the performance of hybrid
23 FEC/ARQ systems using rate compatible punctured turbo (RCPT) codes,”
IEEE Trans. Commun., vol. 48, no. 6, pp. 948–959, June 2000.
24 [3] IEEE P802.16e, IEEE Standard for Local and Metropolitan Area Net-
25 works Part 16: Air Interface for Fixed and Mobile Broadband Wireless
26 Access Systems, Feb. 2006.
[4] 3GPP, TR 36.212: Evolved Universal Terrestrial Radio Access (E-UTRA);
27 Multiplexing and channel coding, v8.6.0, Mar. 2009.
28 [5] 3GPP, TR 38.212: NR; Multiplexing and channel coding, v15.2.0, June
29 2018.
[6] J. Lee, H. Lou, D. Toumpakaris, E. W. Jang, and J. M. Cioffi, “Transceiver
30 design for MIMO wireless systems incorporating hybrid ARQ,” IEEE
31 Commun. Mag., vol. 47, no. 1, pp. 32–40, Jan. 2009.
32 [7] H. Zheng, A. Lozano, and M. Haleem, “Multiple ARQ processes for
MIMO systems,” EURASIP J. App. Sig. Proc., vol. 5, pp. 772–782, 2004. Sangjoon Park was born in Seoul, Korea, in 1981.
33 [8] P. Frenger, S. Parkvall, and E. Dahlman, “Performance comparison of He received the B.S., M.S., and Ph.D. degrees in
34 HARQ with Chase combining and incremental redundancy for HSDPA,” electrical and electronic engineering from Yonsei
35 in Proc. IEEE Veh. Technol. Conf., vol. 3, Oct. 2001, pp. 1829–1833. University, Seoul, Korea, in 2004, 2006, and 2011,
[9] E. N. Onggosanusi, A. G. Dabak, Y. Hui, and G. Jeong, “Hybrid ARQ respectively. From August 2011 to February 2014,
36 transmission and combining for MIMO systems,” in Proc. IEEE Int. Conf. he was with LG Electronics, as a Senior Research
37 Commun., Anchorage, AK, USA, May 2003, pp. 3205–3209. Engineer participating in research about telematics
38 [10] E. W. Jang, J. Lee, H. Lou, and J. M. Cioffi, “On the combining schemes and vehicular communication systems. From April
for MIMO systems with hybrid ARQ,” IEEE Trans. Wireless Commun., 2014 to August 2016, he was a Research Professor
39 vol. 8, no. 2, pp. 836–842, Feb. 2009. with the School of Electrical and Electronic Engi-
40 [11] H. Samra and Z. Ding, “New MIMO ARQ protocols and joint detection neering, Yonsei University. From September 2016 to
41 via sphere decoding,” IEEE Trans. Signal Process., vol. 54, no. 2, pp.
473–482, Feb. 2006.
February 2019, he was an Assistant Professor with the Department of Infor-
mation and Communication Engineering, Wonkwang University. From March
42 [12] E. W. Jang, J. Lee, L. Song, and J. M. Cioffi, “An efficient symbol-level 2019, he has been an Assistant Professor with the Department of Electronic
43 combining scheme for MIMO systems with hybrid ARQ,” IEEE Trans. Engineering, Kyonggi University, Suwon, Korea. His main research interests
Wireless Commun., vol. 8, no. 5, pp. 2443–2451, May 2009.
44 [13] T. Ait-Idir and S. Saoudi, “Turbo packet combining strategies for the
include communication signal processing techniques, such as detection and
equalization algorithms, hybrid ARQ and MIMO applications for wireless
45 MIMO-ISI ARQ channel,” IEEE Trans. Commun., vol. 57, no. 12, pp. communications, error correcting codes, algorithms for physical layer security,
46 3782–3793, Dec. 2009. and other signal processing techniques.
[14] S. Park, Y. Whang, and S. Choi, “Extended detection for MIMO systems
47 with partial incremental redundancy based hybrid ARQ,” IEEE Trans.
48 Wireless Commun., vol. 11, no. 10, pp. 3714–3722, Oct. 2012.
49 [15] S. Park, “Multiple retransmission strategy-based HARQ for MIMO
systems with acknowledgment bundling,” Electron. Lett., vol. 50, no. 8,
50 pp. 637–639, Apr. 2014.
51 [16] H. Kwon, J. Lee, and I. Kang, “Symbol-level combining for hybrid
52 ARQ on interference-aware successive decoding,” in Proc. IEEE Global
Commun. Conf. (GLOBECOM), Atlanta, GA, USA, Dec. 2013, pp. 3650–
53 3654.
54 [17] M. Zia and Z. Ding, “Bandwidth efficient variable rate HARQ under
55 orthogonal space-time block codes,” IEEE Trans. Signal Process., vol.
62, no. 13, pp. 3360–3370, July 2014.
56 [18] Y. Whang, S. Park, and H. Liu, “Low-complexity symbol-level combin-
57 ing for hybrid automatic repeat request in multiple-input multiple-output
58 systems with linear detection,” IET Commun., vol. 10, no. 10, pp. 1131–
1137, July 2016.
59
60
Page 10 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 1


2
3
4
5
6 Summary of Changes
7
8
9
10 Title: Kalman Filtering Based Combining for MIMO Systems with Hybrid ARQ
11
12 Previous Manuscript ID: T-SP-27517-2021
13
14
Manuscript Type: Regular Paper
15 Author: Sangjoon Park
16
17 * Department of Electronic Engineering, Kyonggi University, Suwon 16227, Korea (sj.park@kgu.ac.kr)
18
19
Date of Initial Submission: February 16, 2021.
20 Date and Result of 1st Evaluation: April 18, 2021. [Rejected for publication]
21
22 Date of Resubmission: May 17, 2021.
23
24
25 Dear Prof. Sangarapillai Lambotharan and Reviewers:
26
27
28
29
First of all, I sincerely appreciate all of your time and efforts for handling the paper. I have revised
30 the manuscript to address all comments. Also, I made a point-by-point response to all the comments in
31
32 this response letter. Again, thank you for your precious time and valuable efforts spent in reviewing the
33
34
manuscript as well as your expertise comments. I hope the revision will meet your approval and I’m
35 looking forward to any further advice you may have about the manuscript.
36
37
38
39
Sincerely yours,
40
41
42 Sangjoon Park
43
44
45
46
47 * In this response letter, red is used to indicate the revisions in the manuscript for each individual comment, and
48
underline is used to indicate the existing phrases related to the comment.
49
50 * In the revised manuscript, Yellow is used to highlight the major revisions explained in this response letter. Further,
51
52 Green is used for the other minor revisions (e.g., citation numbers, English usage, etc.), which are omitted in this
53
response letter.
54
55
56
57
58
59
60
Page 11 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 2


2
3
4
5
6 Response to Reviewer 1
7
8 Recommendation: AQ - Publish With Minor, Required Changes
9
10
11
12 [Comment #1]
13
14
This manuscript focused on hybrid ARQ (HARQ), and proposed a Kalman filtering-based combining
15 scheme for MIMO systems. The authors proposed a state-space model representing various configurations
16
17 for the systems with HARQ, and utilized the KF operation to perform symbol-level combining based on
18
19
LMMSE detection using the information aggregated up to the current TTI. However, some comments
20 are provided, and this reviewer expects a considerate response with respect to the comments.
21
22 Major Comments 1) In Section II-A, how to determine the transition indicating vector? Since the
23
24
receiver can only obtain the observed signal, how can it derive the expectation related to the transmitted
25 signal of neighboring TTIs?
26
27
28
29
[Author’s Response]
30 The calculation of the transition indicating vector is performed based on the equivalence of two symbols
31
32 based on the retransmission property, without using the exact value of each symbol. At the receiver, each
33
34
element of the transition indicating vector (dt,l (n) = E[xt−1,l (n)xt,l (n)∗ ]) can be obtained by determining
35 whether the current symbol (xt,l (n)) is identical to the previously transmitted symbol(xt−1,l (n)). Based
36
37 on the system model of the paper, this can be decided by checking (i) xt,l (n) is a data symbol or belongs
38
39
to the packet with CC & (ii) the packet including xt,l (n) is at least in its 2nd HARQ round at the tth
40 TTI. In general cases (besides the system model of the manuscript), there may be a data symbol which is
41
42 retransmitted from different l and n. However, for normal functioning of reception procedures regardless
43
44
of the combining strategy, the receiver should know (iii) the position in the mother codeword of the
45 coded bits consisting each symbol and (iv) the packet to which each symbol belongs. This kind of side
46
47 information should be shared between the transmitter and receiver via control channel, otherwise the
48
49
LLRs of the coded bits obtained from each symbol cannot be correctly located for decoding. Therefore,
50 similar to the case of the transmitter, the receiver can calculate the transition indicating vector dt,l prior
51
52 to decoding even though it does not know the exact value of xt,l (n) until the successful decoding of
53
54
packets.
55 Based on the above observation, the footnote 1 is added to clearly express that the side-information
56
57
58
59
60
Page 12 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 3


2
3
4
to determine dt,l should be known to the receiver. Because the calculation of E[xt−1,l (n)xt,l (n)∗ ] at the
5 transmitter was already explained in Sections II-A and II-B, the calculation at the receiver (which is
6
7 identical to the transmitter) is not added in the current manuscript.
8
9
10 [Section II-A: after (2)] In (2), xt,l (n)(= sr (l+(n−1)L)) is the nth element of xt,l , which is sent from the nth transmit an-
11
12 tenna. Therefore, xt,l (n) with 1 ≤ l+(n−1)L ≤ JC is the repeatedly transmitted data symbol, where E[xt−1,l (n)x∗t,l (n)] = 1,
13
regardless of the HARQ retransmission strategy, if r > 1. Meanwhile, xt,l (n) for JC +1 ≤ l+(n−1)L ≤ J is the parity symbol.
14
15 Therefore, if r > 1, E[xt−1,l (n)x∗t,l (n)] = 0 for IR, whereas E[xt−1,l (n)x∗t,l (n)] = 1 for CC. Furthermore, if r = 1, indicating
16
17 that a new packet is transmitted from the tth TTI, E[xt−1,l (n)x∗t,l (n)] = 0, regardless of l, n, and HARQ retransmission strategy.
18
19
20 [Section II-B: after (3)] Therefore, xt,l (n) = st,n (l), regardless of t, n, and l. Let rt = [rt (1), · · · , rt (N )]T denote
21
22 the N × 1 vector, indicating the HARQ round of the active packets at the tth TTI, where rt (n) is the HARQ round of the
23
packet from the nth transmit antenna with 1 ≤ rt (n) ≤ R. Therefore, if rt (n) > 1 and 1 ≤ l ≤ JC, E[xt−1,l (n)x∗t,l (n)] = 1,
24
25 regardless of the HARQ retransmission strategy. In contrast, if rt (n) > 1 and JC + 1 ≤ l ≤ J, E[xt−1,l (n)x∗t,l (n)] = 1 for
26
27 CC and 0 for IR. Furthermore, similar to the case of MSARQ systems, if rt (n) = 1, indicating that a new packet is sent from
28
the nth transmit antenna, E[xt−1,l (n)x∗t,l (n)] = 0 for the given n, regardless of l and HARQ retransmission strategy.
29
30
31
32 [Section III-A: footnote 1] For normal functioning of reception procedures, in addition to the number of HARQ processes
33
and the retransmission strategy, the receiver should know (i) the position in the mother codeword of the coded bits consisting each
34
35 symbol and (ii) the packet to which each symbol belongs. For packet transmission, this information is usually shared between
36
37 the transmitter and receiver via control channel. Therefore, the receiver and transmitter can calculate dt,l = E[xt−1,l x∗t,l ] by
38
deciding whether each xt−1,l (n) is retransmitted as xt,l (n), as described in Sections II-A and II-B.
39
40
41
42 In addition to the above revision, the following revisions are performed. The conjugate operator was
43
44
missing for the definition of the transition indicating vector in Section III-A, and it is revised (xt,l → x∗t,l ),
45 in addition to the conjugator operator newly defined in Section I-B, which are shown in below. Also,
46
47 the conjugate-and-transpose operators in Sections II-A and II-B are replaced by the conjugate operators
48 ∗
49
(xH
t,l (n) → xt,l (n)) for element-wise operations (this revision in Sections II-A and II-B is indicated in
50 red in the above.).
51
52
53
[Section III-A: after (4)] Let dt,l = [dt,l (1), · · · , dt,l (N )]T = E[xt−1,l xt,l x∗t,l ]
54
55
56
57
58
59
60
Page 13 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 4


2
3
[Section I-B] The superscripts T , ∗, H, and −1 are the transpose, conjugate, conjugate-and-transpose, and inverse operators,
4
5 respectively.
6
7
8
9
[Comment #2]
10 2) In Section III, the authors mentioned that the estimation of the transmit signal vector via KF is
11
12 equivalent to that obtained by the direct SLC scheme with LMMSE detection. To this description, why did
13
14
the authors choose to adopt KF scheme rather than other generally utilized data detection methods? To the
15 best knowledge of this reviewer, since there are matrix inverse operation in KF process, its computation
16
17 complexity have little advantage comparing to other conventional methods.
18
19
20 [Author’s Response]
21
22 As the reviewer pointed out, in general MIMO systems, the KF operation requires much higher
23
24
complexity than the conventional LMMSE detector, although both obtain the same detection quality.
25 Nevertheless, because of the channel aggregation in MIMO systems with HARQ, the direct SLC with
26
27 LMMSE detection can require much higher complexity than the proposed KF operation. For example,
28
29
in M × N MIMO systems with MSARQ-CC, the aggregated channel matrix used in the direct SLC
30 scheme becomes rM × N , so the complexity of the direct SLC scheme with LMMSE detection can be
31
32 even higher than that of the conventional KF operation for MIMO systems. Meanwhile, because of no
33
34
channel matrix aggregation, the BLC uses M × N channel matrix, which is identical to the conventional
35 LMMSE detector in MIMO systems without HARQ, and thereby the BLC with the LMMSE detection
36
37 can require less complexity than conventional KF operation for MIMO systems. Therefore, the low-
38
39
complexity correction step is developed in (13)–(16), and the proposed scheme with the low-complexity
40 correction step has similar complexity to BLC and much less complexity than direct SLC, as derived in
41
42 Section III-C.
43
44
In addition, as shown in Section IV (simulation results) and Appendices, the proposed scheme based
45 on KF outperforms BLC and has near-identical performance to the LMMSE-based direct SLC scheme
46
47 with huge complexity even higher than that of the conventional KF operation. Therefore, the proposed
48
49
KC scheme has (i) the complexity similar to the low-complexity low-performance BLC and (ii) the
50 performance similar to the high-complexity high-performance direct SLC, which is one of the main
51
52 reasons that we consider the KF operation for detection in MIMO-HARQ systems.
53
54
Another main reason is that, as explained in Section III, by changing the transition indicating vector,
55 the state-space model in KF can represent various MIMO-HARQ system configurations. That is, the
56
57
58
59
60
Page 14 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 5


2
3
4
proposed scheme can be employed to various MIMO-HARQ system configurations by obtaining a
5 different transition indicating vector, without changing the algorithm. Therefore, the proposed scheme
6
7 has the similar system applicability to BLC.
8
9
In the previous manuscript, the complexity and system applicability were expressed in detail in Section
10 III-C. Thus, to clearly indicate the high performance characteristic of the proposed scheme with other
11
12 advantages over the direct SLC scheme, the following revision is done in Section III-A including the
13
14
new Lemma 1:
15
16
17 [Section III-B-1: after (12)] Therefore, x̂t,l is obtained by the KF operation in (8)–(12), which is an LMMSE estimate of
18
xt,l using the information aggregated up to the tth TTI by the state transition. .Note that x̂t,l is equivalent to the estimate of
19
20 xt,l obtained by the direct SLC scheme with LMMSE detection, as shown in Appendix. Further, when Ft,l = 0N ×N (e.g., full
21
−1 H
22 IR), x̂t,l is equivalent to the estimate of xt,l by BLC with LMMSE detection because Kt,l becomes (HH 2
t,l Ht,l + σ IN ) Ht,l .
23
24
25 [Section III-B-1: last part] Next, we compare x̂t,l to the estimate of other combining schemes with LMMSE detection.
26 When Ft,l = 0N ×N regardless of t and l (e.g., full IR), x̂t,l is equivalent to the estimate of xt,l by BLC with LMMSE detection
27 −1 H
because Kt,l becomes (HH 2
t,l Ht,l + σ IN ) Ht,l . In addition, for general Ft,l , we have the following lemma:
28
29 Lemma 1 (Equivalence of the estimated xt,l in the proposed and direct SLC schemes): x̂t,l is identical to the corresponding
30 estimate of xt,l obtained by the direct SLC scheme with LMMSE detection at the tth TTI.
31 Proof: See Appendix A.
32
33 Therefore, the proposed and direct SLC schemes at the tth TTI obtain the identical estimate for xt,l regardless of the number
34
35 of HARQ processes and retransmission strategy. However, the direct SLC scheme requires a significantly larger computational
36 complexity and has to be designed in a dedicated manner for a specific system configuration.
37
38
39
40 In addition to the above revision, to improve the readability, Appendix in the previous manuscript
41 is divided to “A. Proof of Lemma 1 (equivalence of the estimated xt,l in the proposed and direct SLC
42
43 schemes)” and “B. Error performance of the proposed and direct SLC schemes” in the revised manuscript.
44
45 To sum up, Appendix A is to analyze the estimation quality of the transmit signal vector in the proposed
46 KF scheme compared to the direct SLC scheme with the LMMSE detection during the operation in
47
48 each TTI, and Appendix B is to extend the analysis in Appendix A to compare the error performance
49
50 (BER, BLER) of the proposed and direct SLC schemes according to the system configuration. The major
51 changes can be summarized as follows:
52
53 (* To remove the redundant information, the current Appendices A and B are greatly simplified from
54
55 the previous Appendix. For details, please refer to the response to Reviewer 4.)
56
57
58
59
60
Page 15 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 6


2
3
4
5 [Appendix: previous manuscript] [Relationship between direct SLC and proposed schemes]
6
7 → [Appendix A: current manuscript] [Proof of Lemma 1]
8
9
→ [Appendix A: current manuscript, last paragraph] ... for the newly transmitted symbols. Therefore, the proposed and
10 direct SLC schemes at the tth TTI obtain the identical estimate for xt,l .
11
12 → [Appendix B: current manuscript] [Error performance of the proposed and direct SLC schemes]
13
14
→ [Appendix B: current manuscript, 1st paragraph] Based on the analysis in Appendix A, the error performance of the
15 proposed KC and direct SLC schemes according to the system configuration can be compared as follows:
16
17
18
19
As the reviewer pointed out, there can be some other data detection methods, which can be modified
20 to have similar characteristics to the proposed scheme (high system applicability, high performance, low
21
22 complexity). However, such characteristics by using other methods needs to be verified separately, and
23
24
the research on other methods maybe beyond the scope of the manuscript which focuses on the KF based
25 detection scheme, so no revisions related to the other detection methods instead of KF are performed in
26
27 the current manuscript. I hope the reviewer may understand the intent.
28
29
30 [Comment #3]
31
32 3) The authors should double-check the writings in the whole paper more carefully to avoid spelling
33
34
errors.
35
36
37 [Author’s Response]
38
39
As the reviewer advised, the current manuscript has been carefully checked before resubmission, includ-
40 ing math equations and English usages. The followings are the typos revised in the current manuscript:
41
42 (* The typos in which the sentences and phrases are revised by English Proofreading or removed for
43
44
a simpler explanation are not included.)
45
46
47 [Section III-A: Table I] r = 01 & r > 01
48
49
50 [Section III-A: after (4)] Let dt,l = [dt,l (1), · · · , dt,l (N )]T = E[xt−1,l xt,l x∗t,l ]
51
52
53 [Section III-B: (16)]
54 Pt,l,m−1 h∗t,l,m
kt,l kt,l,m = . (16)
55 ht,l,m Pt,l,m−1 h∗t,l,m
T
+ σ2
56
57
58
59
60
Page 16 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 7


2
3
4
5 [Section III-B-3] ... the LLRs of the coded bits comprising the parity symbols are sent ..
6
7
8
[Section III-C: Table II, Maximal-ratio combining] O(54N 3 + N 2 M )
9
10
11
12 [Section V: 1st paragraph] ... Therefore, the proposed KC combining scheme is considered an ..
13
14
15 In addition, the entire manuscript is revised based on the comments from an English proofreading
16
17 service, which are highlighted in green . Because of the length, the related revisions are omitted here
18
19
and included in the revised manuscript only.
20
21
22 [Comment #4]
23
24
As a summary, the topic of the work is interesting. The modeling and the derivations are very novel
25 and thoughtful. In the end, it is suggested that the authors should check and improve their work carefully
26
27 with the above suggestions.
28
29
30 [Author’s Response]
31
32 Thank you for the kind comment on the manuscript as well as your expertise suggestions. I hope this
33
34
revision will meet your approval.
35
36
37 Again, I would like to thank the reviewer for the time and effort for this manuscript.
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
Page 17 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 8


2
3
4
5
6 Response to Reviewer 2
7
8 Recommendation: RQ - Review Again After Major Changes
9
10
11
12 [Comment #1]
13
14
The paper proposes a combining scheme for the MIMO systems with hybrid automatic repeat request
15 (ARQ). The proposed technique performs symbol-level combining (SLC) based on the Kalman filtering
16
17 (KF) for an effective combining scheme. The paper is the timely and well written. Also, the proposed
18
19
methods work well, but I have some comments on the paper. Please check the detailed comments below.
20 Major comments: (1) In Section III, please explain when we can adopt the state-space model for the
21
22 packet transmission and retransmission. In other words, what is the assumption for modeling the packet
23
24
transmission as the state-space model?
25
26
27 [Author’s Response]
28
29
The state-space model of Kalman filter is described by the process equation and the observation
30 equation. (i) In cases of the observation equation (yt,l = Ht,l xt,l + nt,l ), as shown in (1) and (7), it
31
32 is identical to the input-output relationship of a general MIMO system. This can be applied for the
33
34
system which employs the techniques not considered in this paper (e.g., precoding, STBC, etc.). Thus,
35 the observation equation of the state-space model is valid for general MIMO systems. (ii) Meanwhile, in
36
37 cases of the process equation (xt,l = Ft,l xt−1,l + wt,l ), by setting the transmit signal vector as the state
38
39
vector, it describes the relationship between two transmit signal vectors, such that whether each element
40 of a transmit signal vector (xt,l ) is identical to the corresponding element of the previous signal vector
41
42 (xt−1,l ) or not, which can be applied to general MIMO systems similar to the case of the observation
43
44
equations. Although the specific relationship between the two vectors (described by Ft,l and wt,l ) becomes
45 different according to the system configuration (e.g., retransmission strategy, spatial multiplexing, etc.),
46
47 but the validity of the process equation does not depend on the system configuration.
48
49
Based on the above observation, to emphasize the validity of the state-space model in general MIMO
50 systems, the following description is added to the end of Section III-A.
51
52
53
[Section III-A: after (7)] Note that the state-space model in (4)–(7) is valid for a general MIMO system, while each term
54
55 in (4)–(7) can be changed according to the specific system configuration.
56
57
58
59
60
Page 18 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 9


2
3
4
5 Meanwhile, in the situation such that the main system parameters (e.g., the numbers of transmit
6
7 and receive antennas) are changed in each TTI, the state-space model with a fixed size will not valid
8
9
anymore. However, such situations are rarely considered, and such main system parameters have already
10 been specified in the system model in Section II. Thus, the related revision on this point is not considered
11
12 in the current manuscript.
13
14
15 [Comment #2]
16
17 (2) Please provide the pseudo code for overall Kalman filtering based combining procedure in Section
18
19
III.
20
21
22 [Author’s Response]
23
24
In the revised manuscript, the pseudo code for the KF stage with low-complexity correction step is
25 added with the additional paragraph as follows:
26
27
28
[Section III-B: Algorithm 1]
29
30
31
Algorithm 1: KF Stage with Low-Complexity Correction Step.
32
33
Input: l, σ 2 , yt,l , Ht,l , Ft,l , Wt,l , x̂t−1,l , Pt−1,l
34
35 (−) (−)
1: Prediction: x̂t,l = Ft,l x̂t−1,l , Pt,l = Ft,l Pt−1,l FTt,l + Wt,l ;
36 (−) (−)
37 2: Correction Step Initialization: x̂t,l,0 = x̂t,l , Pt,l,0 = Pt,l ;
38 3: for m = 1 to M do
39
40 4: vt,l,m = Pt,l,m−1 h∗t,l,m ;
41
5: kt,l,m = vt,l,m /(hTt,l,m vt,l,m +σ 2 );
42
43 6: x̂t,l,m = x̂t,l,m−1 + kt,l,m (yt,l (m) − hTt,l,m x̂t,l,m−1 );
44 H
7: Pt,l,m = Pt,l,m−1 − kt,l,m vt,l,m ;
45
46 8: end for
47
48 9: Output: x̂t,l = x̂t,l,M , Pt,l = Pt,l,M ;
49
50
51
52
53 [Section III-B: after (16)] In Alg. 1, the proposed KF stage with the low-complexity correction step is summarized.
54
55 vt,l,m = Pt,l,m−1 h∗t,l,m in Alg. 1 is defined to reduce the number of calculations of Pt,l,m−1 h∗t,l,m in both Pt,l,m in
56
57
58
59
60
Page 19 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 10


2
3
(15) and kt,l,m in (16).
4
5
6
7 Meanwhile, the pseudo code for the overall Kalman filtering based combining procedure is not included.
8
9
The main reason is that, to explicitly describe the overall procedure, we need to define another parameters
10 and notations (e.g., puncturing patterns from the mother codeword to the codeword for each HARQ round
11
12 of a packet, bit-to-symbol mapping table, packet index). Although such information is important and
13
14
should be shared between the transmitter and receiver for valid combining process in MIMO-HARQ
15 systems, they are not directly related to the main part of the proposed scheme, and the additional
16
17 definitions are likely to significantly reduce the readability of the manuscript. Furthermore, because the
18
19
combining process for each system configuration (MSARQ/MMARQ, CC/IR) is different, the pseudo
20 code for the overall procedure will be necessary for all four configurations, which will be too lengthy
21
22 and also degrade the readability of the manuscript. Therefore, the pseudo code is described for the KF
23
24
stage with low-complexity correction step, which is the main part of the proposed scheme. I hope the
25 reviewer may understand the intent.
26
27
28
29
[Comment #3]
30 (3) In Fig. 2, why the performance of the proposed scheme is degraded? (compare to the performance of
31
32 Direct SLC scheme) Is it caused by the proposed scheme (KF) or MSARQ-IR structure? Please elaborate.
33
34
35 [Author’s Response]
36
37 Thank you for providing this comment. It is related to the structure of MSARQ-IR as well as the
38
39
proposed scheme. The proposed scheme is designed to have a fixed state size, so the estimation is
40 performed for the currently transmitted symbols only, while the direct SLC scheme re-estimates all the
41
42 related symbols for decoding transmitted until the current TTI. In the aggregated channel matrix for the
43
44
direct SLC scheme in MSARQ-IR systems, a channel partition for the parity symbols from the previous
45 HARQ round is not orthogonal to the channel partitions for the current transmit signal vectors. Therefore,
46
47 by re-estimating the previous parity symbols, the direct SLC scheme can improve the estimation quality
48
49
of the previous parity symbols at future HARQ rounds. Meanwhile, the proposed scheme does not re-
50 estimate the previous parity symbols of the active packet not included in the current transmit signal
51
52 vectors, and the estimation quality of the previous parity symbols not transmitted more is not improved
53
54
in the proposed scheme at future HARQ rounds.
55 On the other hand, because the data symbols are always retransmitted, the direct SLC and proposed
56
57
58
59
60
Page 20 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 11


2
3
4
schemes obtain the same estimate on the data symbols. Further, in MSARQ-IR systems, the performance
5 improvement of data symbols by SLC is much significant than that of parity symbols, as proved in [14].
6
7 As a result, the direct SLC scheme can have a slightly better error performance than the proposed scheme
8
9
in MSARQ-IR systems. To resolve this issue, if the proposed scheme is modified to consider the previous
10 symbols by increasing the state size, this performance degradation in MSARQ-IR systems can be fixed,
11
12 but it will result the complex state-space modeling and increased computational complexity. So, the state
13
14
size is fixed to the length of each transmit signal vector N .
15 In MSARQ-CC and MMARQ-CC, because all the previously transmitted symbols of the active packets
16
17 are retransmitted in the current TTI, there are no symbols re-estimated by the direct SLC scheme but not
18
19
estimated by the proposed scheme. In MMARQ-IR systems, (i) the data symbols of the active packets are
20 retransmitted, so their estimate is identical in both proposed and direct SLC scheme. Also, (ii) although
21
22 the previous parity symbols are not included in the current transmit signal vector (so the proposed scheme
23
24
does not re-estimate them but the direct SLC re-estimates), because of the block-diagonal structure of the
25 aggregated channel matrix for the parity symbols (i.e., a channel partition for the parity symbols from
26
27 the previous HARQ round is orthogonal to the channel partitions for the current transmit signal vectors),
28
29
the re-estimation by the direct SLC scheme does not give any better estimation quality compared with
30 the estimation by the direct SLC scheme at the previous TTIs. As a result, the proposed and direct SLC
31
32 scheme show the identical performance in MSARQ-CC, MMARQ-CC, and MMARQ-IR systems.
33
34
Based on the above observations, the following revisions are performed. First, although the above
35 observation was described in Appendix in the previous manuscript, to improve the readability, Appendix
36
37 is divided to “A. Proof of Lemma 1 (equivalence of the estimated xt,l in the proposed and direct SLC
38
39
schemes)” and “B. Error performance of the proposed and direct SLC schemes” in the current manuscript.
40 To sum up, Appendix A is to analyze the estimation quality of the transmit signal vector in the proposed
41
42 KF scheme compared to the direct SLC scheme with the LMMSE detection during the operation in
43
44
each TTI, and Appendix B is to extend the analysis in Appendix A to compare the error performance
45 (BER, BLER) of the proposed and direct SLC schemes according to the system configuration. The major
46
47 changes in the revised manuscript for Appendix can be summarized as follows:
48
49
(* In addition, for more clear explanations, the current Appendices A and B are greatly simplified
50 from the previous Appendix. For details, please refer to the response to Reviewer 4.)
51
52
53
[Appendix: previous manuscript] [Relationship between direct SLC and proposed schemes]
54
55 → [Appendix A: current manuscript] [Proof of Lemma 1]
56
57
58
59
60
Page 21 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 12


2
3
4
→ [Appendix A: current manuscript, last paragraph] ... for the newly transmitted symbols. Therefore, the proposed and
5 direct SLC schemes at the tth TTI obtain the identical estimate for xt,l .
6
7 → [Appendix B: current manuscript] [Error performance of the proposed and direct SLC schemes]
8
9
→ [Appendix B: current manuscript, 1st paragraph] Based on the analysis in Appendix A, the error performance of the
10 proposed KC and direct SLC schemes according to the system configuration can be compared as follows:
11
12
13
14
Second, following the above revision, to clearly express that the proposed scheme and direct SLC
15 scheme at the tth TTI obtains the same estimation results for the elements in the current transmit signal
16
17 vector, Lemma 1 and the related descriptions are added as follows.
18
19
20 [Section III-B-1: after (12)] Therefore, x̂t,l is obtained by the KF operation in (8)–(12), which is an LMMSE estimate of
21
22 xt,l using the information aggregated up to the tth TTI by the state transition. .Note that x̂t,l is equivalent to the estimate of
23
xt,l obtained by the direct SLC scheme with LMMSE detection, as shown in Appendix. Further, when Ft,l = 0N ×N (e.g., full
24
25 IR), x̂t,l is equivalent to the estimate of xt,l by BLC with LMMSE detection because Kt,l becomes (HH 2
t,l Ht,l + σ IN )
−1 H
Ht,l .
26
27
28 [Section III-B-1: last part] Next, we compare x̂t,l to the estimate of other combining schemes with LMMSE detection.
29
When Ft,l = 0N ×N regardless of t and l (e.g., full IR), x̂t,l is equivalent to the estimate of xt,l by BLC with LMMSE detection
30
−1 H
31 because Kt,l becomes (HH 2
t,l Ht,l + σ IN ) Ht,l . In addition, for general Ft,l , we have the following lemma:
32 Lemma 1 (Equivalence of the estimated xt,l in the proposed and direct SLC schemes): x̂t,l is identical to the corresponding
33
estimate of xt,l obtained by the direct SLC scheme with LMMSE detection at the tth TTI.
34
35 Proof: See Appendix A.
36 Therefore, the proposed and direct SLC schemes at the tth TTI obtain the identical estimate for xt,l regardless of the number
37
38 of HARQ processes and retransmission strategy. However, the direct SLC scheme requires a significantly larger computational
39
40 complexity and has to be designed in a dedicated manner for a specific system configuration.
41
42
43 Finally, in Section IV, the paragraph to explain the simulation results in MSARQ-IR systems is revised.
44
45 Because the details are described in Appendix B, the reason is relatively simply explained in the paragraph
46 as follows:
47
48
49
50 [Section IV: 2nd paragraph] As analyzed in Appendix B, compared with the direct SLC scheme, the proposed scheme
51 shows identical error performance for CC regardless of the antenna configuration. Meanwhile, the performance for IR slightly
52
53 degrades with retransmission (r ≥ 2) for both N = M = 4 and N = M = 16, because the LLRs of the parity bits transmitted
54
55 at the previous HARQ rounds only are not updated in the future HARQ rounds, unlike the direct SLC scheme. Nevertheless,
56
57
58
59
60
Page 22 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 13


2
3
...
4
5
6
7 [Comment #4]
8
9
(4) Please evaluate the simulation result with the large number of antennas. (at least N = M = 8)
10
11
12 [Author’s Response]
13
14
As the reviewer advised, the simulation results with the large number of antennas (N = M = 16) is
15 added to Figs. 1–4. Because the results for the 8x8 systems appear to be too overlapping with the results
16
17 for the 4x4 systems in the figures, the results for N = M = 16 are added. Please check the revised
18
19
figures in the following pages. In addition, because there is no difference in the tendency of simulation
20 results in 4x4 and 16x16 systems, the related descriptions in Section IV are revised as follows:
21
22
23
[Section IV: 1st paragraph] N = M = 4 Two antenna configurations of N = M = 4 and N = M = 16 are considered,
24
25 and R = 3, ...
26
27 [Section IV: 2nd paragraph] ..., the proposed scheme shows identical error performance for CC regardless of the antenna
28
configuration. Meanwhile, the performance for IR slightly degrades with retransmission (r ≥ 2) for both N = M = 4 and
29
30 N = M = 16, because ...
31
32 [Section IV: 3rd paragraph] ..., the proposed scheme achieves identical error performance to the direct SLC scheme for
33
both CC and IR regardless of the antenna configuration and r. Meanwhile, ...
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
Page 23 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 14


2
3
4
5
6
7
8
9
10
11 10 -1 10 0
12
13 10 -2
10 -1
14

Average BLERs
Average BERs

15 10 -3
16 10 -2
17 10 -4

18
19 10 -3
10 -5 Proposed Proposed
20 BLC BLC
Direct SLC Direct SLC
21 10 -6 10 -4
22 -6 -4 -2 0 2 4 6 8 -6 -4 -2 0 2 4 6 8
SNR [dB] SNR [dB]
23
24 (a) (b)
25
26 Fig. 1: [Previous] Error performance of combining schemes in MSARQ-CC systems: (a) average BER
27
28 and (b) average BLER.
29
30
31 10 -1 10 0

32
33 10 -2
10 -1
34
Average BLERs
Average BERs

35 10 -3
r=1 r=1
36 r=2 10 -2 r=2
37 10 -4 r=3 r=3

38 KC KC
BLC 10 -3 BLC
39 10 -5 DSLC DSLC
40 N=M=4 N=M=4
41 10 -6
N = M = 16
10 -4
N = M = 16

42 -12 -8 -4 0 4 8 -12 -8 -4 0 4 8
SNR [dB] SNR [dB]
43
44 (a) (b)
45
46
Fig. 1: [Revised] Error performance of combining schemes in MSARQ-CC systems: (a) average BER
47
48 and (b) average BLER.
49
50
51
52
53
54
55
56
57
58
59
60
Page 24 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 15


2
3
4
5
6
7
8
9
10
11 10 -1 10 0
12
13 10 -2
10 -1
14

Average BLERs
Average BERs

15 10 -3
16 10 -2
17 10 -4

18
19 10 -3
10 -5 Proposed Proposed
20 BLC BLC
21
Direct SLC Direct SLC
10 -6 10 -4
22 -6 -4 -2 0 2 4 6 8 -6 -4 -2 0 2 4 6 8
SNR [dB] SNR [dB]
23
24 (a) (b)
25
26 Fig. 2: [Previous] Error performance of combining schemes in MSARQ-IR systems: (a) average BER
27
28 and (b) average BLER.
29
30
31 10 -1 10 0

32
33 10 -2
10 -1
34
Average BLERs
Average BERs

35 10 -3
r=1 r=1
36 r=2 10 -2 r=2
37 10 -4 r=3 r=3

38 KC KC
BLC 10 -3 BLC
39 10 -5 DSLC DSLC
40 N=M=4 N=M=4
41 10 -6
N = M = 16
10 -4
N = M = 16

42 -12 -8 -4 0 4 8 -12 -8 -4 0 4 8
SNR [dB] SNR [dB]
43
44 (a) (b)
45
46
Fig. 2: [Revised] Error performance of combining schemes in MSARQ-IR systems: (a) average BER and
47
48 (b) average BLER.
49
50
51
52
53
54
55
56
57
58
59
60
Page 25 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 16


2
3
4
5
6
7
8
9
10
11 10 -1 10 0
12
13 10 -2
10 -1
14

Average BLERs
Average BERs

15 10 -3
16 10 -2
17 10 -4

18
19 Proposed 10 -3
10 -5 Proposed
20 BLC
Direct SLC BLC
21 10 -6 10 -4
Direct SLC

22 -6 -4 -2 0 2 4 6 8 -6 -4 -2 0 2 4 6 8
SNR [dB] SNR [dB]
23
24 (a) (b)
25
26 Fig. 3: [Previous] Error performance of combining schemes in MMARQ-CC systems: (a) average BER
27
28 and (b) average BLER.
29
30
31 10 -1 10 0

32
33 10 -2
10 -1
34
Average BLERs
Average BERs

35 10 -3
r=1 r=1
36 r=2 10 -2 r=2
37 10 -4 r=3 r=3

38 KC KC
BLC 10 -3 BLC
39 10 -5 DSLC DSLC
40 N=M=4 N=M=4
41 10 -6
N = M = 16
10 -4
N = M = 16

42 -12 -8 -4 0 4 8 -12 -8 -4 0 4 8
SNR [dB] SNR [dB]
43
44 (a) (b)
45
46
Fig. 3: [Revised] Error performance of combining schemes in MMARQ-CC systems: (a) average BER
47
48 and (b) average BLER.
49
50
51
52
53
54
55
56
57
58
59
60
Page 26 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 17


2
3
4
5
6
7
8
9
10
11 10 -1 10 0
12
13 10 -2
10 -1
14

Average BLERs
Average BERs

15 10 -3
16 10 -2
17 10 -4

18
19 10 -3
10 -5 Proposed Proposed
20 BLC
Direct SLC
BLC
Direct SLC
21 10 -6 10 -4
22 -6 -4 -2 0 2 4 6 8 -6 -4 -2 0 2 4 6 8
SNR [dB] SNR [dB]
23
24 (a) (b)
25
26 Fig. 4: [Previous] Error performance of combining schemes in MMARQ-IR systems: (a) average BER
27
28 and (b) average BLER.
29
30
31 10 -1 10 0

32
33 10 -2
10 -1
34
Average BLERs
Average BERs

35 10 -3
r=1 r=1
36 r=2 10 -2 r=2
37 10 -4 r=3 r=3

38 KC KC
BLC 10 -3 BLC
39 10 -5 DSLC DSLC
40 N=M=4 N=M=4
41 10 -6
N = M = 16
10 -4
N = M = 16

42 -12 -8 -4 0 4 8 -12 -8 -4 0 4 8
SNR [dB] SNR [dB]
43
44 (a) (b)
45
46
Fig. 4: [Revised] Error performance of combining schemes in MMARQ-IR systems: (a) average BER
47
48 and (b) average BLER.
49
50
51
52
53
54
55
56
57
58
59
60
Page 27 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 18


2
3
4
5 [Comment #5]
6
7 Minor comments: (1) Since ht,l,m is the row vector in the paper, please consider hH
t,l,m (or use transpose)
8
9
as a column vector (ht,l,m ) for the consistency.
10
11
12 [Author’s Response]
13
14
Thank you for the valuable comment. According to the reviewer’s comment, the related revisions
15 are performed as ht,l,m → hTt,l,m . Meanwhile, because (hTt,l,m )H looks complicated, by defining the
16
17 conjugated operator, (hTt,l,m )H is denoted as h∗t,l,m . Further, the pseudo code is also written by using the
18
19
same notation, as described in the response to your comment #2.
20
21
22 [Section I-B] The superscripts T , ∗, H, and −1 are the transpose, conjugate, conjugate-and-transpose, and inverse operators,
23
respectively.
24
25
26
[Section III-B: (13)]
27
28 yt,l (m) = hTt,l,m xt,l + nt,l (m), 1 ≤ m ≤ M, (13)
29
30 where yt,l (m) and nt,l (m) are the mth elements of yt,l and nt,l , respectively, and hTt,l,m is the 1 × N vector corresponding
31
32 to the mth row of Ht,l .
33
34
35 [Section III-B: (14)]
36 x̂t,l,m = x̂t,l,m−1 + kt,l,m (yt,l (m) − hTt,l,m x̂t,l,m−1 ) (14)
37
38
39
40 [Section III-B: (15)]
41 Pt,l,m = Pt,l,m−1 − kt,l,m hTt,l,m Pt,l,m−1 , (15)
42
43
44
[Section III-B: (16)]
45 Pt,l,m−1 h∗t,l,m
46 kt,l,m =
hTt,l,m Pt,l,m−1 h∗t,l,m + σ 2
. (16)
47
48
49
50 [Section III-C: 2nd paragraph] Considering the highest-order terms only, the complexity is governed by calculations of
51
Pt,l,m−1 h∗t,l,m (= vt,l,m in Alg. 1) and kt,l,m hTt,l,m Pt,l,m−1 (= kt,l,m vt,l,m
H
), ...
52
53
54
55
56
57
58
59
60
Page 28 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 19


2
3
4
5 [Comment #6]
6
7 (2) There are some typos:
8
9
A. In (16), kt,l should be kt,l,m
10 B. In III-C, “... by calculations of ht,l,m Pt,l,m−1 ...” → “... by calculations of ht,l,m Pt,l,m−1 hH
t,l,m ...”.
11
12
13
14
[Author’s Response]
15 Thank you for the detailed review. The typo in (16) is revised as follows:
16
17
18 [Section III-B: (16)]
19 Pt,l,m−1 h∗t,l,m
kt,l kt,l,m = . (16)
20 ht,l,m Pt,l,m−1 h∗t,l,m
T
+ σ2
21
22
23
Meanwhile, ht,l,m Pt,l,m−1 (assuming a row vector ht,l,m ) was a correct expression, because ht,l,m Pt,l,m−1
24
25 (or Pt,l,m−1 hH H
t,l,m = (ht,l,m Pt,l,m−1 ) ) is utilized in both Pt,l,m in (15) and kt,l,m in (16), as mentioned
26
27 in the response to your comment #2. To clearly indicate this point, according to the row vector hTt,l,m , the
28
notation vt,l,m (= Pt,l,m−1 h∗t,l,m ) is used in the pseudo code, and the related parts are revised as follows:
29
30
31
32 [Section III-B: after (16)] In Alg. 1, the proposed KF stage with the low-complexity correction step is summarized.
33
vt,l,m = Pt,l,m−1 h∗t,l,m in Alg. 1 is defined to reduce the number of calculations of Pt,l,m−1 h∗t,l,m in both Pt,l,m in
34
35 (15) and kt,l,m in (16).
36
37
38
[Section III-C: 2nd paragraph] Considering the highest-order terms only, the complexity is governed by calculations of
39
40 Pt,l,m−1 h∗t,l,m (= vt,l,m in Alg. 1) and kt,l,m hTt,l,m Pt,l,m−1 (= kt,l,m vt,l,m
H
), ...
41
42
43
In addition, the current manuscript has been carefully checked before submission, including math
44
45 equations and English usages. The followings are the other typos revised in the current manuscript:
46
47 (* The typos in which the sentences and phrases are revised by English Proofreading or removed for
48
a simpler explanation are not included.)
49
50
51
52 [Section III-A: Table I] r = 01 & r > 01
53
54
55 [Section III-A: after (4)] Let dt,l = [dt,l (1), · · · , dt,l (N )]T = E[xt−1,l xt,l x∗t,l ]
56
57
58
59
60
Page 29 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 20


2
3
4
5 [Section III-B-3] ... the LLRs of the coded bits comprising the parity symbols are sent ..
6
7
8
[Section III-C: Table II, Maximal-ratio combining] O(54N 3 + N 2 M )
9
10
11
12 [Section V: 1st paragraph] ... Therefore, the proposed KC combining scheme is considered an ..
13
14
15 Also, the entire manuscript is revised based on the comments from an English proofreading service,
16
17 including the simplification of expressions for a better readability. Because of the length, the related
18
19
revisions are not included in this response letter, and they are highlighted in green in the manuscript.
20
21
22 Again, I would like to thank the reviewer for the time and effort for this manuscript.
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
Page 30 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 21


2
3
4
5
6 Response to Reviewer 3
7
8 Recommendation: R - Reject
9
10
11
12 [Comment #1]
13
14
This paper proposes a Kalman filtering (KF)-based combining scheme for multiple-input multiple-
15 output (MIMO) systems with hybrid automatic repeat request (ARQ). The change of a transmit signal
16
17 vector in consecutive transmission time intervals (TTIs) in the state-space model for the proposed scheme
18
19
is interpreted as the transition of a state vector. Based on this state-space model, the KF operation in the
20 proposed scheme can perform symbol-level combining (SLC) based on linear minimum mean-square-
21
22 error (LMMSE) detection using the information aggregated up to the current TTI, without increasing the
23
24
size of the state-space model for each TTI.
25 1. This reviewer thinks that the contribution of this paper is not enough for being published in IEEE
26
27 TSP. This reviewer thinks that both KF and SLC are two well-examined methods. The authors just
28
29
merge them. Moreover, this paper is lack of analytical results. Too many paragraphs are used to survey
30 the existing works and present the complexity.
31
32
33
34
[Author’s Response]
35 [(i) the contribution of the paper: summary] I believe the proposed scheme is more than just merging
36
37 the well-known KF and SLC. The details can be summarized as follows:
38
39
40 (a) In the state-space model for the proposed scheme, the change of a transmit signal vector in
41
42 consecutive TTIs is interpreted as the transition of a state vector. With a simple modification of a transi-
43
44
tion indicating vector, the state-space model can represent various MIMO-HARQ system configurations
45 regardless of the number of HARQ processes (e.g., MIMO single and multiple ARQ systems) and HARQ
46
47 retransmission strategy (e.g., Chase combining and incremental redundancy). To the best of the author’s
48
49
knowledge, except the proposed state-space model, there have been no such state-space models that
50 can be applied to various MIMO-HARQ system configurations. Further, there have been no other SLC
51
52 schemes with a similar approach that can be employed for various MIMO-HARQ system configurations.
53
54
Therefore, unlike the existing SLC scheme developed for a specific system configuration in a dedicated
55 manner, the proposed scheme can be applied to various MIMO-HARQ systems.
56
57
58
59
60
Page 31 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 22


2
3
4
(b) Then, by the KF operation using the state-space model, the information for the transmit signal
5 vector obtained until the previous TTIs is naturally utilized to obtain the estimate of the transmit signal
6
7 vector at the current TTI. Because the Kalman filter is the LMMSE detector for MIMO systems with
8
9
AWGN, the KF operation of the proposed scheme can perform the LMMSE detection based SLC using
10 the information aggregated up to the current TTI, without increasing the size of the state-space model for
11
12 each TTI. In particular, it is derived that the proposed scheme has the near-identical error performance
13
14
to the LMMSE-based direct SLC scheme with a brute-force aggregation of all the related information
15 (channel matrices & receive signal vectors) up to the current TTI in a dedicated manner for each system
16
17 configuration, which results in very high complexity and memory loads. This is proved in Appendices
18
19
A and B of the current manuscript (Appendix of the previous manuscript).
20 (c) Meanwhile, the KF operation for MIMO systems requires a higher computational complexity
21
22 than conventional LMMSE detection. Therefore, using the linearity of the observation equation and
23
24
characteristic of the observation noise (AWGN), a low-complexity correction step is developed as well.
25 With the low-complexity correction step, the complexity of the proposed scheme is comparable to that
26
27 of BLC with conventional LMMSE detection. Again, to the best of the author’s knowledge, this low-
28
29
complexity KF correction step has not been developed for MIMO detection problems.
30 (d) Simulation results confirm the above observations and show that the proposed scheme can achieve
31
32 the near-identical performance to the LMMSE-based direct SLC scheme (high performance but high
33
34
complexity & low system applicability) and outperform the LMMSE-based BLC scheme (high system
35 applicability & low complexity but low performance).
36
37 Therefore, I believe that this study makes a significant contribution to the literature because the proposed
38
39
scheme has (1) high performance comparable to SLC, (2) high system applicability comparable to BLC,
40 and (3) low computational complexity also comparable to BLC, which have not been considered together
41
42 in the existing studies for MIMO-HARQ combining schemes until now.
43
44
45 [(ii) related revision: Abstract and Section I] To clearly express the above characteristics and
46
47 contributions, the manuscript has been extensively revised. A main reason why these points were not
48
49
observed well seems to be the readability of the previous manuscript, especially Abstract and Section I
50 (Introduction), e.g., the long explanations on the existing works as the reviewer pointed out. Therefore,
51
52 to improve the readability, the entire manuscript has been revised to remove the redundant explanations
53
54
and to simplify the descriptions, with the comments from an English proofreading service. As a result,
55 the length of the manuscript is reduced from 10 pages to 9 pages. The full details are not included
56
57
58
59
60
Page 32 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 23


2
3
4
in this response letter because of the length, and some major revisions (including the simplification of
5 introducing the existing works) on Abstract and Section I are summarized as follows:
6
7
8
[Abstract] This paper proposes a Kalman filtering (KF)-based combining scheme for multiple-input multiple-output (MIMO)
9
10 systems with hybrid automatic repeat request (ARQ). The In the state-space model, the change of a transmit signal vector in
11
12 consecutive transmission time intervals (TTIs) in the state-space model for the proposed scheme is interpreted as the transition
13
of a state vector. Based on this state-space model Accordingly, the proposed KF operation in the proposed scheme can perform
14
15 performs symbol-level combining (SLC) based on linear minimum mean-square-error (LMMSE) detection using the information
16
17 aggregated up to the current TTI , without increasing the size of the state-space model for each TTI. Furthermore, via simple
18
modification of a transition indicating vector, the. The state-space model can represent various configurations for MIMO systems
19
20 with hybrid ARQ (HARQ) regardless of the number of HARQ processes (e.g., MIMO single and multiple ARQ systems) and
21
22 the HARQ retransmission strategy (e.g., Chase combining and incremental redundancy). via a simple modification of a transition
23
indicating vector. Therefore, unlike compared to the existing SLC schemes which are designed for a specific system in a dedicated
24
25 manner, the proposed scheme can be applied to various MIMO-HARQ systems like, similar to bit-level combining (BLC). In
26
27 addition, because it employs by employing a low-complexity correction step for the KF operation, the computational complexity
28
of the proposed scheme is lower than that of existing SLC schemes and is comparable to that of BLC. Simulation results and
29
30 analyses confirm that the proposed scheme outperforms BLC, and the error performance of the proposed scheme is almost
31
32 identical to that of the direct SLC scheme with aggregates and achieves near-identical error performance to the LMMSE-based
33
direct SLC scheme with a brute-force aggregation of all related information up to the current TTI for the transmitted packet(s).
34
35
36
37 [Section I: 1st paragraph] In wireless communication systems, the receiver can request retransmission of a packet with
38
detection errors due to issues such as channel distortions and link adaptation failures. By retransmitting the forward error
39
40 correction (FEC) encoded packet, the hybrid automatic repeat request (HARQ), which is a combined scheme of FEC and ARQ,
41
42 can overcome such detection errors and can achieve an extremely low error probability [1]. In general, the HARQ procedure
43
of a packet is repeated by retransmission until the packet is successfully decoded or the HARQ round of the packet, i.e., the
44
45 number of transmissions of the packet, reaches a given limit set for the system. Based on techniques such as puncturing, the
46
47 packet that will be sent In addition, because the packet in each HARQ round can have a higher code rate than the original
48
mother codeword [2]. Thus, by allowing HARQ to successfully decode the packet with less redundancy information, HARQ
49
50 can also enhance the system throughput in comparison with a system without HARQ. Therefore, HARQ is considered to be one
51
52 of the essential techniques for wireless communication systems and as an essential technique for packet transmission, HARQ is
53
employed in most modern communication standards [3]–[5].
54
55
56
57
58
59
60
Page 33 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 24


2
3
[Section I: 2nd paragraph] To take full advantage of completely utilize HARQ, the receiver needs tomust combine the
4
5 information of the packet obtained throughout the HARQ rounds. The basic combining scheme for HARQ is bit-level combining
6
7 (BLC) [6], which corresponds to . In BLC, the log-likelihood ratio (LLR) combining. In BLC, the LLR of a packet in each
8
HARQ round is separately calculated, and then,followed by combining the LLRs obtained from the HARQ rounds of the packet
9
10 are combined for FEC decoding. This BLC scheme has high flexibility and can be applied to any system employing an HARQ
11
12 regardless of the number of HARQ processes [7] and the HARQ retransmission strategy [8], e.g., Chase combining (CC), which
13
sends identical signals throughout HARQ rounds, and incremental redundancy (IR), which sends new redundancy information
14
15 for retransmission.. In addition to BLC, symbol-level combining (SLC), also known as joint detection, is a combining scheme
16
17 for HARQ, which combines the received signals related to a packet and calculates the LLR for FEC decoding from the combined
18
signal model [9]–[23]. In particular, in multiple-input multiple-output (MIMO) systems, SLC performance can be considerably
19
20 improved from BLC by obtaining higher diversity order and average post-processing signal-to-interference-plus-noise ratio
21
22 (SINR) [10], [14], [22]. However, the direct SLC scheme with a brute-force aggregation of all related information of a packet is
23
impractical in MIMO systems with HARQ because ofowing to the large computational complexity and memory loads. Therefore,
24
25 numerousseveral alternative SLC schemes have been investigated for MIMO systems with HARQ [9]–[23].
26
27
28 [Section I: removed paragraph] In the proposed KC scheme, after the KF operation, LLR calculation is performed using
29
the estimated state vector and the error covariance matrix. If IR is employed as the retransmission strategy, the coded bits already
30
31 sent until a given HARQ round of a packet may not be sent from the next HARQ round. Therefore, after the KF operation
32 and the following LLR calculation, the calculated LLRs are combined with the LLRs for the current packet(s) obtained in the
33
previous TTIs, if the coded bits corresponding to these previous LLRs are not sent in the current TTI. Consequently, the proposed
34
35 KC scheme performs SLC based on LMMSE detection with optional LLR combining for the coded bits that are unsent in the
36 current TTI.
37
38 The remainder of the paper is organized as follows. Section II ...
39
40
41 [Section I-A: existing works (MSARQ), previous manuscript ] For MSARQ systems, i.e., MIMO systems with a single
42
43 ARQ process for each TTI [7], HARQ transmission and retransmission modeling becomes straightforward, and thus, several
44 SLC schemes have been investigated. In [9], pre-combining and post-combining schemes were proposed for MSARQ systems; in
45
46 the pre-combining scheme, cumulative combining for HARQ rounds is performed prior to linear detection for CC, whereas, the
47
48 post-combining is equivalent to BLC performing detection prior to combining for both CC and IR. The pre-combining scheme
49 was extended to maximal-ratio combining to be used with maximum-likelihood (ML) detection as well as linear detection for
50
51 MSARQ-CC systems with soft-decoding in [10]. In [11], an SLC scheme with sphere decoding was developed for MSARQ-CC
52
53 systems utilizing symbol mapping diversity for retransmissions. Furthermore, in [12], SLC schemes with QR decomposition
54 of the aggregated channel matrix throughout the HARQ rounds were investigated for MSARQ-CC systems, which reduce the
55
56
57
58
59
60
Page 34 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 25


2
3
computational complexity from the direct SLC scheme in case of ML detection. In [13], a turbo packet combining scheme with
4
5 joint utilization of SLC and turbo equalization was proposed for MSARQ-CC systems under frequency selective fading channels.
6
7 Meanwhile, in [14], an SLC scheme applicable to both ML and linear detections was investigated for MSARQ-IR systems, which
8
was further extended to MSARQ systems with joint utilization of CC and IR by acknowledgement feedback bundling [15]. In
9
10 [16], an SLC scheme for MSARQ-CC systems with an interference channel was proposed based on interference-aware successive
11
12 decoding. In [17], the SLC scheme with ML detection was proposed for MSARQ systems with a CC based partial retransmission
13
protocol considering orthogonal space-time block codes. Furthermore, a low-complexity SLC scheme that utilizes only a subset
14
15 of row vectors in the channel matrix was studied in [18] for MSARQ-CC systems with linear detection.
16
17 → [Section I-A: existing works (MSARQ), current manuscript ] For MSARQ systems, the modeling of HARQ procedures is
18
straightforward and several SLC schemes have been developed. In [9], pre-combining and post-combining schemes with linear
19
20 detection were proposed. The pre-combining scheme was extended to maximal-ratio combining to be used with maximum-
21
22 likelihood (ML) detection and linear detection for CC in [10]. In [11], an SLC scheme with sphere decoding was developed for
23
CC. In [12], SLC schemes with QR decomposition of the aggregated channel matrix were investigated for CC. In [13], a joint
24
25 utilization of SLC and turbo equalization under frequency selective fading channels was proposed for CC. Meanwhile, in [14],
26
27 an SLC scheme with ML and linear detections was investigated for IR, which was extended to the system with acknowledgement
28
feedback bundling [15]. In [16], an SLC scheme under an interference channel was proposed for CC. In [17], an SLC scheme
29
30 with ML detection was proposed for partial retransmission considering orthogonal space-time block codes. In [18], an SLC
31
32 scheme utilizing a subset of the channel matrix was studied for CC.
33
34
35 [Section I-A: existing works (MMARQ), previous manuscript ] In addition to the aforementioned studies for MSARQ
36
37 systems, SLC schemes for MMARQ systems, i.e., MIMO systems with multiple ARQ process for each TTI [7], have been
38
investigated. Because of the simultaneous transmission of multiple packets with independent ARQ processes, the combining and
39
40 reception procedures of SLC for MMARQ systems become significantly complicated compared with those in MSARQ systems.
41
42 In [19], a joint detection approach for MMARQ-CC systems was proposed based on the cancellation of successfully decoded
43
packets, whereas the process of handling the packets consecutively failed at decoding until a given system limit of the HARQ
44
45 round was not considered. Furthermore, several combining techniques for uplink multi-user (MU) MMARQ-CC systems were
46
47 investigated in [20] according to the availability of interference cancellation (IC) and blanking, where blanking implies that no
48
new ARQ process is allowed until all existing processes are terminated. In [21], an SLC scheme based on QR decomposition
49
50 with IC was proposed for uplink single-carrier MU-MMARQ-CC systems, which is an application of the SLC scheme in [12] to
51
52 the given system model. Meanwhile, in [22], an SLC scheme with IC of all terminated packets including the failed packets was
53
proposed and analyzed, including the performance comparison of MMARQ-CC systems with SLC and MMARQ-IR systems
54
55 with BLC. In [23], the SLC scheme without IC was investigated for MMARQ-CC systems by considering the interference from
56
57
58
59
60
Page 35 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 26


2
3
the terminated packets as noise.
4
5 → [Section I-A: existing works (MMARQ), current manuscript ] In MMARQ systems, because of multiple ARQ processes,
6
7 SLC schemes become significantly complex compared to those in MSARQ systems. In [19], a joint detection approach for CC
8
was proposed based on the interference cancellation (IC) of successfully decoded packets. In [20], SLC schemes for CC were
9
10 investigated according to the availability of IC and packet blanking. In [21], an SLC scheme based on QR decomposition with
11
12 IC was proposed for CC. Meanwhile, in [22], an SLC scheme with IC of all terminated packets was investigated for CC. In
13
[23], the SLC scheme without IC was developed for CC by considering the interference from the terminated packets as noise.
14
15
16
17 [(iii) related revision: Section III-C (complexity)] As the reviewer advised, Section III-C are simpli-
18
19
fied as well. Meanwhile, because the manuscript claims that the proposed scheme has lower computational
20 complexity than the existing SLC schemes, it is necessary to compare the complexity of the proposed
21
22 scheme to the existing SLC schemes. Therefore, Table II is maintained, and the descriptions in Section
23
24
III-C are revised as follows:
25
26
27 [Section III-C: 1st paragraph] In this subsection, the required computational complexity and memory size of the proposed
28
scheme at the detection stage for detection (i.e., the KF stage) are derived and are compared with those of the conventional
29
30 BLC and SLC other combining schemes.
31
32
33
[Section III-C: 2nd paragraph] The matrix-vector multiplications of the correction step in (14)–(16) mainly yield the
34
35 computational complexity of the KF stage, because By the structure of Ft,l , the prediction step in (8) and (9) is basically
36
37 becomes the selection of subsets in x̂t−1,l and Pt−1,l by the structure of Ft,l . Therefore, the correction step in (14)–(16)
38
mainly yield the complexity. Considering the highest-order terms only, the computational complexity is governed by calculations
39
40 of ht,l,m Pt,l,m−1 and kt,l,m ht,l,m Pt,l,m−1 Pt,l,m−1 h∗t,l,m (= vt,l,m in Alg. 1) and kt,l,m hTt,l,m Pt,l,m−1 (= kt,l,m vt,l,m
H
),
41
42 and each of them entails the complexity of O(N 2 ). Because the correction step in (14)–(16) should be performed M times,
43
the computational complexity of the KF stage is O(2N 2 M ) for each transmit signal vector xt,l . Meanwhile, only considering
44
45 the highest-order terms, the computational the complexity of the KF stage with the original correction step in (10)–(12) is
46
47 O(2N 2 M +2N M 2 +M 3 ) for each transmit signal vector xt,l . Therefore, the low-complexity correction step, the computational
48
complexity of the KF stage can be significantly reduced. Therefore, the low-complexity correction step can significantly reduce
49
50 the complexity of the KF stage.
51
52
53
[Section III-C: 3rd paragraph] Next, the memory requirement for the KF stage is derived, in which the memory size to
54
55 store one complex number is denoted as one memory unit. At the KF stage, only x̂t−1,l and Pt−1,l need to be stored for the
56
57
58
59
60
Page 36 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 27


2
3
current tth TTI. As shown in Section III-B, x̂t,l and Pt,l need to be stored for the next TTI, i.e., the (t + 1)th TTI, instead
4
5 of ŷt,l and Ht,l , whereas the previous filtered estimates x̂t0 ,l and Pt0 ,l for t0 < t are not required during the KF stage at the
6
7 (t + 1)th TTI. Therefore, N (N + 1) memory units are required in the KF stage for each transmit signal vector. Note that this
8
could be further reduced because Pt,l is Hermitian with real numbers on diagonal, which can be further reduced because Pt,l
9
10 is Hermitian..
11
12
13
[Section III-C: 4th paragraph] In Table II, the computational complexity and memory units as well as the error performance
14
15 of the combining schemes are compared, where LMMSE detection is considered for conventional combining schemes. Table
16
17 II compares the computational complexity, memory units, and error performance of the combining schemes with LMMSE
18
detection. Similar to the case of the proposed scheme, the computational complexity and memory units of each combining
19
20 scheme required at the detection stage for one transmit signal vector are evaluated. The existing SLC schemes require lower
21
22 computational complexity than the brute-force direct SLC schemes; however, they require additional overhead from BLC to
23
aggregate the information, e.g., noise whitening procedure [10], QR decomposition [12], and the terminated packet elimination
24
25 with IC [19], [22]. In contrast, the computational complexity of the proposed scheme is comparable to that of BLC and is
26
27 significantly smaller than that of the conventional SLC schemes.3 In addition, the proposed scheme requires lesslesser memory
28
units than most conventional SLC schemes. Even with the lowerlow computational complexity and memory units, by obtaining
29
30 the state transmission matrix Ft,l for the given system configuration, the proposed scheme can be applied to both MSARQ and
31
32 MMARQ systems regardless of the retransmission strategy, unlike the conventional SLC schemes designed for a specific MIMO
33
system with HARQ. Furthermore, the proposed scheme can outperform BLC and achieve near-identical error performance to
34
35 the direct SLC scheme, as shown in Section IV and Appendix Appendices A and B.
36
37
38
39
[(iv) related revision: analysis] In the previous manuscript, the performance of the proposed scheme
40 was analyzed in Appendix, but it was not clearly indicated in the main text. To clearly show that the
41
42 performance is analyzed, in the revised manuscript, Lemma 1 and the related descriptions are added, and
43
44
Appendix is divided to “A. Proof of Lemma 1 (equivalence of the estimated xt,l in the proposed and
45 direct SLC schemes)” and “B. Error performance of the proposed and direct SLC schemes”. In specific,
46
47 Appendix A is to analyze the estimation quality of the transmit signal vector in the proposed KF scheme
48
49
compared to the direct SLC scheme with the LMMSE detection, and Appendix B is to compare the
50 error performance (BER, BLER) of the proposed and direct SLC schemes according to the analysis in
51
52 Appendix A.
53
54
(* To remove the redundant information, the current Appendices A and B are greatly simplified from
55 the previous Appendix. For details, please refer to the response to Reviewer 4.)
56
57
58
59
60
Page 37 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 28


2
3
4
5 [Section III-B-1: after (12)] Therefore, x̂t,l is obtained by the KF operation in (8)–(12), which is an LMMSE estimate of
6
7 xt,l using the information aggregated up to the tth TTI by the state transition. .Note that x̂t,l is equivalent to the estimate of
8
xt,l obtained by the direct SLC scheme with LMMSE detection, as shown in Appendix. Further, when Ft,l = 0N ×N (e.g., full
9
10 IR), x̂t,l is equivalent to the estimate of xt,l by BLC with LMMSE detection because Kt,l becomes (HH 2
t,l Ht,l + σ IN )
−1 H
Ht,l .
11
12
13 [Section III-B-1: last part] Next, we compare x̂t,l to the estimate of other combining schemes with LMMSE detection.
14
When Ft,l = 0N ×N regardless of t and l (e.g., full IR), x̂t,l is equivalent to the estimate of xt,l by BLC with LMMSE detection
15
−1 H
16 because Kt,l becomes (HH 2
t,l Ht,l + σ IN ) Ht,l . In addition, for general Ft,l , we have the following lemma:
17 Lemma 1 (Equivalence of the estimated xt,l in the proposed and direct SLC schemes): x̂t,l is identical to the corresponding
18
estimate of xt,l obtained by the direct SLC scheme with LMMSE detection at the tth TTI.
19
20 Proof: See Appendix A.
21 Therefore, the proposed and direct SLC schemes at the tth TTI obtain the identical estimate for xt,l regardless of the number
22
23 of HARQ processes and retransmission strategy. However, the direct SLC scheme requires a significantly larger computational
24
25 complexity and has to be designed in a dedicated manner for a specific system configuration.
26
27
28 [Appendix: previous manuscript] [Relationship between direct SLC and proposed schemes]
29
30 → [Appendix A: current manuscript] [Proof of Lemma 1]
31 → [Appendix A: current manuscript, last paragraph] ... for the newly transmitted symbols. Therefore, the proposed and
32
33 direct SLC schemes at the tth TTI obtain the identical estimate for xt,l .
34
35 → [Appendix B: current manuscript] [Error performance of the proposed and direct SLC schemes]
36 → [Appendix B: current manuscript, 1st paragraph] Based on the analysis in Appendix A, the error performance of the
37
38 proposed KC and direct SLC schemes according to the system configuration can be compared as follows:
39
40
41 Again, I would like to thank the reviewer for the time and effort for this manuscript.
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
Page 38 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 29


2
3
4
5
6 Response to Reviewer 4
7
8 Recommendation: R - Reject
9
10
11
12 [Comment #1]
13
14
In this paper the author proposes a Kalman filtering based technique for MIMO systems. While the
15 paper is technically correct and the contribution is novel, I am concerned about volume of technical
16
17 content in this paper. The sentences seem excessive and the mathematical expressions are sparse, making
18
19
the paper feel stretched. While there are multiple addenda (including a theoretical performance analysis of
20 the proposed approach, the extension of the proposed approach to the case where one or more parameters
21
22 of the state space model become erroneous, etc. ) that can potentially increase the technical content of
23
24
this paper to something that is expected out of a full length Transactions on Signal Processing paper, I
25 will not suggest those since those will go beyond the scope of the six week ”RQ” window. Instead, I
26
27 would suggest that the author should trim this paper to a five page letter and submit at an appropriate
28
29
venue.
30
31
32 [Author’s Response]
33
34
[(i) suitability as a full paper] First of all, I sincerely appreciate your comment about the contribution
35 of the paper. Meanwhile, to fully describe the characteristic and novelty of the proposed scheme, I think
36
37 the following points need to be explained:
38
39
40 (1) The proposed scheme can be applied to various MIMO-HARQ system configurations. Thus, we need
41
42 to model and describe such various MIMO-HARQ system configurations as well as the corresponding
43
44
state-space model, which are explained in Sections II and III-A.
45 (2) To reduce the complexity of the proposed scheme, a low-complexity correction step is developed
46
47 in the manuscript. To explain how it is derived and how it can have the same estimate to the original
48
49
correction step, both original and low-complexity correction step need to be described, which are explained
50 in Section III-B.
51
52 (3) To show the effectiveness of the proposed scheme (high performance, low complexity, high
53
54
system applicability), it needs to be compared with the other existing SLC schemes in various system
55 configurations, in terms of receiver overhead (complexity, memory unit), error performance, and system
56
57
58
59
60
Page 39 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 30


2
3
4
applicability. This is described in Section III-C.
5 (4) Because the existing SLC schemes are simplified from the direct SLC scheme, they usually have
6
7 the same performance to the direct SLC scheme. Thus, we need to compare the performance of the
8
9
proposed scheme and LMMSE-based direct SLC scheme. This is explained in Appendices (analytic) and
10 Section IV (simulation results).
11
12
13
14
Therefore, to fully explain the above points, I believe a five page letter is insufficient, so the revised
15 manuscript is prepared as a full paper. I hope the reviewer may understand the intent.
16
17
18
19
[(ii) simplification of redundant descriptions] I fully agree with the reviewer’s comment that the
20 previous sentences look excessive, which can make the reader feels that the paper is stretched. Therefore,
21
22 the entire manuscript has been extensively revised with the comments from an English proofreading
23
24
service, so the redundant descriptions and extra information are significantly simplified. As a result, even
25 with new Algorithm 1 and Lemma 1 (for revision according to the comments from other reviewers), the
26
27 length of the manuscript is reduced from 10 to 9 pages. Because of the length, the full list of revisions
28
29
is not included in this response letter, and some major revisions are shown at the followings.
30 (* The simplifications not included in this response letter are highlighted in green in the revised
31
32 manuscript, while the revisions included in here are highlighted in yellow .)
33
34
(** Regarding the simplification of Appendices, please see the response (iii).)
35
36
37 [(ii-1) simplification of redundant descriptions: Abstract and Section I] The followings are major
38
39
revisions for simplification in Abstract and Section I:
40
41
42 [Abstract] This paper proposes a Kalman filtering (KF)-based combining scheme for multiple-input multiple-output (MIMO)
43
systems with hybrid automatic repeat request (ARQ). The In the state-space model, the change of a transmit signal vector in
44
45 consecutive transmission time intervals (TTIs) in the state-space model for the proposed scheme is interpreted as the transition
46
47 of a state vector. Based on this state-space model Accordingly, the proposed KF operation in the proposed scheme can perform
48
performs symbol-level combining (SLC) based on linear minimum mean-square-error (LMMSE) detection using the information
49
50 aggregated up to the current TTI , without increasing the size of the state-space model for each TTI. Furthermore, via simple
51
52 modification of a transition indicating vector, the. The state-space model can represent various configurations for MIMO systems
53
with hybrid ARQ (HARQ) regardless of the number of HARQ processes (e.g., MIMO single and multiple ARQ systems) and
54
55 the HARQ retransmission strategy (e.g., Chase combining and incremental redundancy). via a simple modification of a transition
56
57
58
59
60
Page 40 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 31


2
3
indicating vector. Therefore, unlike compared to the existing SLC schemes which are designed for a specific system in a dedicated
4
5 manner, the proposed scheme can be applied to various MIMO-HARQ systems like, similar to bit-level combining (BLC). In
6
7 addition, because it employs by employing a low-complexity correction step for the KF operation, the computational complexity
8
of the proposed scheme is lower than that of existing SLC schemes and is comparable to that of BLC. Simulation results and
9
10 analyses confirm that the proposed scheme outperforms BLC, and the error performance of the proposed scheme is almost
11
12 identical to that of the direct SLC scheme with aggregates and achieves near-identical error performance to the LMMSE-based
13
direct SLC scheme with a brute-force aggregation of all related information up to the current TTI for the transmitted packet(s).
14
15
16
17 [Section I: 1st paragraph] In wireless communication systems, the receiver can request retransmission of a packet with
18
detection errors due to issues such as channel distortions and link adaptation failures. By retransmitting the forward error
19
20 correction (FEC) encoded packet, the hybrid automatic repeat request (HARQ), which is a combined scheme of FEC and ARQ,
21
22 can overcome such detection errors and can achieve an extremely low error probability [1]. In general, the HARQ procedure
23
of a packet is repeated by retransmission until the packet is successfully decoded or the HARQ round of the packet, i.e., the
24
25 number of transmissions of the packet, reaches a given limit set for the system. Based on techniques such as puncturing, the
26
27 packet that will be sent In addition, because the packet in each HARQ round can have a higher code rate than the original
28
mother codeword [2]. Thus, by allowing HARQ to successfully decode the packet with less redundancy information, HARQ
29
30 can also enhance the system throughput in comparison with a system without HARQ. Therefore, HARQ is considered to be one
31
32 of the essential techniques for wireless communication systems and as an essential technique for packet transmission, HARQ is
33
employed in most modern communication standards [3]–[5].
34
35
36
37 [Section I: 2nd paragraph] To take full advantage of completely utilize HARQ, the receiver needs tomust combine the
38
information of the packet obtained throughout the HARQ rounds. The basic combining scheme for HARQ is bit-level combining
39
40 (BLC) [6], which corresponds to . In BLC, the log-likelihood ratio (LLR) combining. In BLC, the LLR of a packet in each
41
42 HARQ round is separately calculated, and then,followed by combining the LLRs obtained from the HARQ rounds of the packet
43
are combined for FEC decoding. This BLC scheme has high flexibility and can be applied to any system employing an HARQ
44
45 regardless of the number of HARQ processes [7] and the HARQ retransmission strategy [8], e.g., Chase combining (CC), which
46
47 sends identical signals throughout HARQ rounds, and incremental redundancy (IR), which sends new redundancy information
48
for retransmission.. In addition to BLC, symbol-level combining (SLC), also known as joint detection, is a combining scheme
49
50 for HARQ, which combines the received signals related to a packet and calculates the LLR for FEC decoding from the combined
51
52 signal model [9]–[23]. In particular, in multiple-input multiple-output (MIMO) systems, SLC performance can be considerably
53
improved from BLC by obtaining higher diversity order and average post-processing signal-to-interference-plus-noise ratio
54
55 (SINR) [10], [14], [22]. However, the direct SLC scheme with a brute-force aggregation of all related information of a packet is
56
57
58
59
60
Page 41 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 32


2
3
impractical in MIMO systems with HARQ because ofowing to the large computational complexity and memory loads. Therefore,
4
5 numerousseveral alternative SLC schemes have been investigated for MIMO systems with HARQ [9]–[23].
6
7
8 [Section I: removed paragraph] In the proposed KC scheme, after the KF operation, LLR calculation is performed using
9
the estimated state vector and the error covariance matrix. If IR is employed as the retransmission strategy, the coded bits already
10
11 sent until a given HARQ round of a packet may not be sent from the next HARQ round. Therefore, after the KF operation
12 and the following LLR calculation, the calculated LLRs are combined with the LLRs for the current packet(s) obtained in the
13
previous TTIs, if the coded bits corresponding to these previous LLRs are not sent in the current TTI. Consequently, the proposed
14
15 KC scheme performs SLC based on LMMSE detection with optional LLR combining for the coded bits that are unsent in the
16 current TTI.
17
18 The remainder of the paper is organized as follows. Section II ...
19
20
21 [Section I-A: existing works (MSARQ), previous manuscript ] For MSARQ systems, i.e., MIMO systems with a single
22
23 ARQ process for each TTI [7], HARQ transmission and retransmission modeling becomes straightforward, and thus, several
24 SLC schemes have been investigated. In [9], pre-combining and post-combining schemes were proposed for MSARQ systems; in
25
26 the pre-combining scheme, cumulative combining for HARQ rounds is performed prior to linear detection for CC, whereas, the
27
28 post-combining is equivalent to BLC performing detection prior to combining for both CC and IR. The pre-combining scheme
29 was extended to maximal-ratio combining to be used with maximum-likelihood (ML) detection as well as linear detection for
30
31 MSARQ-CC systems with soft-decoding in [10]. In [11], an SLC scheme with sphere decoding was developed for MSARQ-CC
32
33 systems utilizing symbol mapping diversity for retransmissions. Furthermore, in [12], SLC schemes with QR decomposition
34 of the aggregated channel matrix throughout the HARQ rounds were investigated for MSARQ-CC systems, which reduce the
35
36 computational complexity from the direct SLC scheme in case of ML detection. In [13], a turbo packet combining scheme with
37
38 joint utilization of SLC and turbo equalization was proposed for MSARQ-CC systems under frequency selective fading channels.
39 Meanwhile, in [14], an SLC scheme applicable to both ML and linear detections was investigated for MSARQ-IR systems, which
40
41 was further extended to MSARQ systems with joint utilization of CC and IR by acknowledgement feedback bundling [15]. In
42
43 [16], an SLC scheme for MSARQ-CC systems with an interference channel was proposed based on interference-aware successive
44 decoding. In [17], the SLC scheme with ML detection was proposed for MSARQ systems with a CC based partial retransmission
45
46 protocol considering orthogonal space-time block codes. Furthermore, a low-complexity SLC scheme that utilizes only a subset
47
48 of row vectors in the channel matrix was studied in [18] for MSARQ-CC systems with linear detection.
49 → [Section I-A: existing works (MSARQ), current manuscript ] For MSARQ systems, the modeling of HARQ procedures is
50
51 straightforward and several SLC schemes have been developed. In [9], pre-combining and post-combining schemes with linear
52
53 detection were proposed. The pre-combining scheme was extended to maximal-ratio combining to be used with maximum-
54 likelihood (ML) detection and linear detection for CC in [10]. In [11], an SLC scheme with sphere decoding was developed for
55
56
57
58
59
60
Page 42 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 33


2
3
CC. In [12], SLC schemes with QR decomposition of the aggregated channel matrix were investigated for CC. In [13], a joint
4
5 utilization of SLC and turbo equalization under frequency selective fading channels was proposed for CC. Meanwhile, in [14],
6
7 an SLC scheme with ML and linear detections was investigated for IR, which was extended to the system with acknowledgement
8
feedback bundling [15]. In [16], an SLC scheme under an interference channel was proposed for CC. In [17], an SLC scheme
9
10 with ML detection was proposed for partial retransmission considering orthogonal space-time block codes. In [18], an SLC
11
12 scheme utilizing a subset of the channel matrix was studied for CC.
13
14
15 [Section I-A: existing works (MMARQ), previous manuscript ] In addition to the aforementioned studies for MSARQ
16
17 systems, SLC schemes for MMARQ systems, i.e., MIMO systems with multiple ARQ process for each TTI [7], have been
18
investigated. Because of the simultaneous transmission of multiple packets with independent ARQ processes, the combining and
19
20 reception procedures of SLC for MMARQ systems become significantly complicated compared with those in MSARQ systems.
21
22 In [19], a joint detection approach for MMARQ-CC systems was proposed based on the cancellation of successfully decoded
23
packets, whereas the process of handling the packets consecutively failed at decoding until a given system limit of the HARQ
24
25 round was not considered. Furthermore, several combining techniques for uplink multi-user (MU) MMARQ-CC systems were
26
27 investigated in [20] according to the availability of interference cancellation (IC) and blanking, where blanking implies that no
28
new ARQ process is allowed until all existing processes are terminated. In [21], an SLC scheme based on QR decomposition
29
30 with IC was proposed for uplink single-carrier MU-MMARQ-CC systems, which is an application of the SLC scheme in [12] to
31
32 the given system model. Meanwhile, in [22], an SLC scheme with IC of all terminated packets including the failed packets was
33
proposed and analyzed, including the performance comparison of MMARQ-CC systems with SLC and MMARQ-IR systems
34
35 with BLC. In [23], the SLC scheme without IC was investigated for MMARQ-CC systems by considering the interference from
36
37 the terminated packets as noise.
38
39
→ [Section I-A: existing works (MMARQ), current manuscript ] In MMARQ systems, because of multiple ARQ processes,
40 SLC schemes become significantly complex compared to those in MSARQ systems. In [19], a joint detection approach for CC
41
42 was proposed based on the interference cancellation (IC) of successfully decoded packets. In [20], SLC schemes for CC were
43
investigated according to the availability of IC and packet blanking. In [21], an SLC scheme based on QR decomposition with
44
45 IC was proposed for CC. Meanwhile, in [22], an SLC scheme with IC of all terminated packets was investigated for CC. In
46
47 [23], the SLC scheme without IC was developed for CC by considering the interference from the terminated packets as noise.
48
49
50 [(ii-2) simplification of redundant descriptions: Section III] The followings are major revisions for
51
52 simplification in Section III:
53
54
55 [Section III-B-3] Because the KF operation performs LMMSE detection based on SLC, additional LLR combining is not
56
57
58
59
60
Page 43 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 34


2
3
required for For LLRs of the coded bits comprising repeatedly transmitted symbols included in xt,l for 1 ≤ l ≤ L, e.g., the
4
5 transmit symbols of the packet(s) with CC and data symbols of the packet(s) with IR. Therefore, such LLRs also do not have to
6
7 be stored for the reception procedures at future TTIs., LLR combining is not required. In contrast, if at least one retransmitted
8
packet with IR exists, Meanwhile, the parity symbols of the packet sent until the last (t − 1)th TTI are not included in xt,l if IR
9
10 is employed. Therefore, if IR is employed for IR, the LLRs of the coded bits comprising the parity symbols are sent until the
11
12 (t − 1)th TTI should be stored to be combined with the currently obtained LLRs for decoding.2 That is, after combining, the
13
packet at its rth HARQ round with IR will have the non-zero LLRs of length (JC + rJ(1 − C))Q. Therefore, the effective
14
15 code rate of the packet after combining is C/(C + r(1 − C)), which is identical to HARQ-IR systems with BLC. Note that
16
17 the LLRs of the coded bits comprising the symbols that will not be sent at the next TTI but belong to active (non-terminated)
18
packets need be stored in the proposed scheme, whereas all LLRs calculated up to the current TTI for the active packets should
19
20 be stored in BLC.2
21
22
23
[Section III-C: 1st paragraph] In this subsection, the required computational complexity and memory size of the proposed
24
25 scheme at the detection stage for detection (i.e., the KF stage) are derived and are compared with those of the conventional
26
27 BLC and SLC other combining schemes.
28
29
30 [Section III-C: 2nd paragraph] The matrix-vector multiplications of the correction step in (14)–(16) mainly yield the
31
32 computational complexity of the KF stage, because By the structure of Ft,l , the prediction step in (8) and (9) is basically
33
becomes the selection of subsets in x̂t−1,l and Pt−1,l by the structure of Ft,l . Therefore, the correction step in (14)–(16)
34
35 mainly yield the complexity. Considering the highest-order terms only, the computational complexity is governed by calculations
36
37 of ht,l,m Pt,l,m−1 and kt,l,m ht,l,m Pt,l,m−1 Pt,l,m−1 h∗t,l,m (= vt,l,m in Alg. 1) and kt,l,m hTt,l,m Pt,l,m−1 (= kt,l,m vt,l,m
H
),
38
and each of them entails the complexity of O(N 2 ). Because the correction step in (14)–(16) should be performed M times,
39
40 the computational complexity of the KF stage is O(2N 2 M ) for each transmit signal vector xt,l . Meanwhile, only considering
41
42 the highest-order terms, the computational the complexity of the KF stage with the original correction step in (10)–(12) is
43
O(2N 2 M +2N M 2 +M 3 ) for each transmit signal vector xt,l . Therefore, the low-complexity correction step, the computational
44
45 complexity of the KF stage can be significantly reduced. Therefore, the low-complexity correction step can significantly reduce
46
47 the complexity of the KF stage.
48
49
50 [Section III-C: 3rd paragraph] Next, the memory requirement for the KF stage is derived, in which the memory size to
51
52 store one complex number is denoted as one memory unit. At the KF stage, only x̂t−1,l and Pt−1,l need to be stored for the
53
current tth TTI. As shown in Section III-B, x̂t,l and Pt,l need to be stored for the next TTI, i.e., the (t + 1)th TTI, instead
54
55 of ŷt,l and Ht,l , whereas the previous filtered estimates x̂t0 ,l and Pt0 ,l for t0 < t are not required during the KF stage at the
56
57
58
59
60
Page 44 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 35


2
3
(t + 1)th TTI. Therefore, N (N + 1) memory units are required in the KF stage for each transmit signal vector. Note that this
4
5 could be further reduced because Pt,l is Hermitian with real numbers on diagonal, which can be further reduced because Pt,l
6
7 is Hermitian..
8
9
10 [Section III-C: 4th paragraph] In Table II, the computational complexity and memory units as well as the error performance
11
12 of the combining schemes are compared, where LMMSE detection is considered for conventional combining schemes. Table
13
II compares the computational complexity, memory units, and error performance of the combining schemes with LMMSE
14
15 detection. Similar to the case of the proposed scheme, the computational complexity and memory units of each combining
16
17 scheme required at the detection stage for one transmit signal vector are evaluated. The existing SLC schemes require lower
18
computational complexity than the brute-force direct SLC schemes; however, they require additional overhead from BLC to
19
20 aggregate the information, e.g., noise whitening procedure [10], QR decomposition [12], and the terminated packet elimination
21
22 with IC [19], [22]. In contrast, the computational complexity of the proposed scheme is comparable to that of BLC and is
23
significantly smaller than that of the conventional SLC schemes.3 In addition, the proposed scheme requires lesslesser memory
24
25 units than most conventional SLC schemes. Even with the lowerlow computational complexity and memory units, by obtaining
26
27 the state transmission matrix Ft,l for the given system configuration, the proposed scheme can be applied to both MSARQ and
28
MMARQ systems regardless of the retransmission strategy, unlike the conventional SLC schemes designed for a specific MIMO
29
30 system with HARQ. Furthermore, the proposed scheme can outperform BLC and achieve near-identical error performance to
31
32 the direct SLC scheme, as shown in Section IV and Appendix Appendices A and B.
33
34
35 [(ii-3) simplification of redundant descriptions: References] Also, the Kalman filter related refer-
36
37 ences not directly cited in the main text except Section I are omitted. Thus, the number of the related
38
39
references are reduced from 8 (previously [24]–[31]) to 2 ([24] and [25], which were previously [26]
40 and [31], respectively), and the related citations in the main text are changed accordingly.
41
42
43
44
[(iii) suggestions for a full-paper: analysis] In the previous manuscript, the performance of the
45 proposed scheme was already analyzed in Appendix by comparing the performance to the direct SLC
46
47 scheme. To clearly show the presence of the analysis, in the revised manuscript, Lemma 1 and the related
48
49
descriptions are added, and Appendix is divided to “A. Proof of Lemma 1 (equivalence of the estimated
50 xt,l in the proposed and direct SLC schemes)” and “B. Error performance of the proposed and direct SLC
51
52 schemes”. In specific, Appendix A is to analyze the estimation quality of the transmit signal vector in
53
54
the proposed KF scheme compared to the direct SLC scheme with the LMMSE detection, and Appendix
55 B is to compare the error performance (BER, BLER) of the proposed and direct SLC schemes according
56
57
58
59
60
Page 45 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 36


2
3
4
to the analysis in Appendix A. In addition, Appendices in the current manuscript is extensively revised
5 for simplification, as shown in the followings:
6
7
8
[Section III-B-1: after (12)] Therefore, x̂t,l is obtained by the KF operation in (8)–(12), which is an LMMSE estimate of
9
10 xt,l using the information aggregated up to the tth TTI by the state transition. .Note that x̂t,l is equivalent to the estimate of
11
12 xt,l obtained by the direct SLC scheme with LMMSE detection, as shown in Appendix. Further, when Ft,l = 0N ×N (e.g., full
13 −1 H
IR), x̂t,l is equivalent to the estimate of xt,l by BLC with LMMSE detection because Kt,l becomes (HH 2
t,l Ht,l + σ IN ) Ht,l .
14
15
16 [Section III-B-1: last part] Next, we compare x̂t,l to the estimate of other combining schemes with LMMSE detection.
17
18 When Ft,l = 0N ×N regardless of t and l (e.g., full IR), x̂t,l is equivalent to the estimate of xt,l by BLC with LMMSE detection
19 because Kt,l becomes (HH 2
t,l Ht,l + σ IN )
−1 H
Ht,l . In addition, for general Ft,l , we have the following lemma:
20 Lemma 1 (Equivalence of the estimated xt,l in the proposed and direct SLC schemes): x̂t,l is identical to the corresponding
21
estimate of xt,l obtained by the direct SLC scheme with LMMSE detection at the tth TTI.
22
23 Proof: See Appendix A.
24
Therefore, the proposed and direct SLC schemes at the tth TTI obtain the identical estimate for xt,l regardless of the number
25
26 of HARQ processes and retransmission strategy. However, the direct SLC scheme requires a significantly larger computational
27
28 complexity and has to be designed in a dedicated manner for a specific system configuration.
29
30
31 [Appendix: previous manuscript] [Relationship between direct SLC and proposed schemes]
32
33 → [Appendix A: current manuscript] [Proof of Lemma 1]
34 → [Appendix B: current manuscript] [Error performance of the proposed and direct SLC schemes]
35
36
37 [Appendix A] In aggregated system models for the direct SLC scheme, a new receive signal vector is interpreted as additional
38
receive antennas [9]–[23], whereas new transmit symbols are interpreted as additional transmit antennas [14], [19]–[23]. Let
39
40 H̆t∗ ,l denote the Mt∗ × Nt∗ aggregated channel matrix including all channel matrices up to the t∗ th TTI {H1,l , · · · , Ht∗ ,l } for
41 the direct SLC scheme, where Mt∗ and Nt∗ are the numbers of aggregated receive and transmit antennas in H̆t∗ ,l , respectively.
42
Assume We assume that all packets transmitted at the (t∗ −1)th TTI are simultaneously terminated after the reception procedures,
43
44 i.e., all packets are either at the Rth HARQ round or are decoded successfully. Then, no transmit symbols in xt∗ −1,l are sent
45 again at the t∗ th TTI regardless of the number of HARQ processes and retransmission strategies. That is, Then, all elements
46
of xt∗ ,l are new transmit symbols, regardless of the number of HARQ processes and retransmission strategy, which should be
47
48 regarded as those sent from additional transmit antennas in the aggregated system model [14], [19]–[23]. Therefore, in such
49 cases, H̆t∗ ,l can be written as follows:
50  
51 H̆t∗ ,l = 
H̆t∗ −1,l 0Mt∗ −1 ×N 
. (18)

52 0M ×Nt∗ −1 Ht∗ ,l
53
54 Because As the two column partitions of H̆t∗ ,l in (18) are orthogonal, the estimate of xt,l using H̆t∗ ,l is identical to that
55 using Ht∗ ,l regardless of the detection criterion, e.g., (HH 2
t∗ ,l Ht∗ ,l + σ IN )
−1 H
Ht∗ ,l yt∗ ,l is identical to the last N elements of
56
57
58
59
60
Page 46 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 37


2
3 −1 H
(H̆H 2
t∗ ,l H̆t∗ ,l + σ INt∗ )
T
H̆t∗ ,l y̆t∗ ,l with y̆t∗ ,l = [y1,l , · · · , ytT∗ ,l ]T in the case of LMMSE detection. Therefore, if the condition
4
5 of simultaneous termination of all packets is satisfied if all active packets are simultaneously terminated at (t∗ − 1)th TTI, H̆t∗ ,l
6 can be initialized to Ht∗ ,l of a much smaller size, and the aggregation for the direct SLC scheme can be performed from the
7
initialized H̆t∗ ,l at future TTIs until the condition is satisfied again. Note that the packet termination in MSARQ systems always
8
9 satisfies the above condition. Because only a single packet is sent in each TTI in MSARQ systems, its termination satisfies the
10 above condition. In that sense, the SLC schemes for MSARQ systems are based on an effective aggregated channel model which
11 includes channel matrices for the current active packet only [9]–[18], e.g., the rM × N effective aggregated channel matrix of
12
13 [HTt−r+1,l , · · · , HTt,l ]T for CC.
14 Based on the above strategy, instead of H̆t,l , we consider the effective aggregated channel matrix H̃t,l , for the direct
15 SLC scheme which includes the channel matrices {Ht∗ ,l , · · · , Ht,l }, i.e., the (t∗ − 1)th TTI with t∗ ≤ t was the last
16
TTI satisfying the condition of simultaneous packet termination. For simple notations, we assume that i) t∗ = 1 and ii) a
17
18 symbols are repeatedly transmitted from the first to the ath transmit antennas, and b(= N − a) symbols are always newly
19 transmitted from the (a + 1) to N th transmit antennas. Then, the numbers of aggregated transmit and receive antennas are
20
Nt = a + bt and Mt = M t, respectively, and the effective aggregated channel matrix H̃t,l becomes the Mt × Nt matrix. Let
21
22 x̃t,l = [x1,l (1), · · · , x1,l (a), x1,l (a + 1), · · · , xt,l (N )]T denote the Nt × 1 aggregated transmit signal vector for H̃t,l . Then, the
23 aggregated receive signal vector for H̃t,l can be represented as follows:
24
25 ỹt,l = H̃t,l x̃t,l + ñt,l , (19)
26
27 T
where ỹt,l = [y1,l T T
, · · · , yt,l ] and ñt,l = [nT1,l , · · · , nTt,l ]T denote the Mt × 1 receive signal and AWGN vectors for H̃t,l ,
28 respectively.
29
30 Based on the aggregated system model in (19), the direct SLC scheme with LMMSE detection at the tth TTI can be performed
31 as follows:
32 ˆ t,l = (H̃H
x̃ 2 −1 H
(20)
t,l H̃t,l + σ INt ) H̃t,l ỹt,l = Gt,l ỹt,l ,
33
34 ˆ t,l is the Nt × 1 LMMSE estimate of x̃t,l and Gt,l is the Nt × Mt LMMSE filter matrix.
where x̃
35
Consider Next, we consider the aggregated KF operation based on a state-space model using the process equation of x̃t,l =
36
37 F̃t,l x̃0,l + w̃t,l and observation equation of (19), where F̃t,l = 0Nt ×Nt , x̃0,l = 0Nt ×1 , and w̃t,l = x̃t,l . Thus, because As
38 the output of the prediction step {x̃ˆ (−) , P̃(−) } is {0Nt ×1 , INt } and the Kalman gain matrix is Gt,l , the filtered output of the
t,l t,l
39 ˆ t,l , in (20).
aggregated KF operation at the tth TTI becomes identical to that of the direct SLC scheme, x̃
40
41 The aggregated KF operation for x̃0,l → x̃t,l can be divided asinto several small KF operations for x0,l → x1,l → · · · → xt,l ,
42
43 which correspond to consecutive KF operations of the proposed scheme ((10)–(12)) or (14)–(16)) from the first to tth TTI.
44 Because As (19) is a linear equation and every element of ñt,l is AWGN, the elements in x̂t,l (the estimate after several small
45
46 ˆ t,l (the estimate after the aggregated KF operation) [26], i.e.,
KF operations) are identical to the corresponding elements in x̃
47
ˆt,l (n) with 1 ≤ n ≤ a for the repeatedly transmitted symbols and x̂t,l (n) = x̃
ˆt,l (n + Nt − N ) with a + 1 ≤ n ≤ N
48 x̂t,l (n) = x̃
49 for the newly transmitted symbols. Because of the reduction in state size, the elements in x̂t−1,l (the filtered estimates of the
50
51 ˆ t−1,l (the output of the direct SLC
proposed scheme at the (t − 1)th TTI) are identical to the corresponding elements in x̃
52
ˆ t,l . Therefore, the proposed and direct SLC schemes at the tth TTI obtain the
scheme at the (t − 1)th TTI) instead of those in x̃
53
54 identical estimate for xt,l .
55
56
57
58
59
60
Page 47 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 38


2
3
4
5 [Appendix B] Consequently, based on the above observations, the relationship between the proposed KC scheme and the
6 direct SLC scheme can be summarized as follows: Based on the analysis in Appendix A, the error performance of the proposed
7
KC and direct SLC schemes according to the system configuration can be compared as follows:
8
9 1) MSARQ-CC: There Because there are no newly transmitted symbols for the retransmission, of a packet. Thus, as
10 described in Section III-B, without using the past estimates x̂t0 ,l for t0 < t, the proposed scheme obtains LLRs for decoding
11 from the current estimate x̂t,l only. the proposed and direct SLC schemes always obtain LLRs for decoding from the estimate
12
13 of xt,l only, where the estimate of xt,l is identical in both schemes. Therefore, using the same estimate for LLR calculation,
14 the error performance of the proposed scheme is identical to that of the direct SLC scheme.
15 2) MSARQ-IR: Because the estimate of data symbols for MSARQ-IR systems is obtained from x̂t,l , the estimated data
16
symbols in the proposed and direct SLC schemes are identical. Meanwhile, because of the fixed state transition size of N , the
17
18 estimate of the parity symbols transmitted at the previous t0 (< t)th TTI is obtained from x̂t0 ,l , which is used for decoding at
19 the tth TTI by LLR combining. Thus, information from the (t0 + 1)th TTI to the tth TTI is not utilized to obtain the LLRs
20
of the parity bits transmitted at the t0 th TTI. The estimate for data symbols is identical in both combining schemes, as in
21
22 MSARQ-CC systems. Meanwhile, the parity symbols transmitted at the previous TTIs are not re-estimated in the proposed
23 scheme, while the direct SLC scheme re-estimates all the previously transmitted symbols. Therefore, the error performance of
24
the proposed scheme for retransmission can be slightly degraded from that of the direct SLC scheme. However, this performance
25
26 degradation is marginal not significant as shown in Section IV, because the performance improvement of the parity symbols by
27 retransmissions in the direct SLC scheme is limited in contrast to the significant improvement of the data symbols [14].
28 3) MMARQ-CC: Similar to MSARQ-CC systems, the proposed scheme obtains LLRs for decoding from the current
29
ˆ t,l of the direct SLC scheme. Therefore,
estimate x̂t,l only in which the elements are identical to the corresponding elements in x̃
30
31 the error performance of the proposed scheme is identical to that of the direct SLC scheme.
32
33 4) MMARQ-IR: According to the multiplexing strategy given in Section II-B, the estimate for repeatedly transmitted data
34 symbols in the proposed scheme becomes equivalent to that in MMARQ-CC systems. Therefore, the The estimated data symbols
35
36 in the proposed and direct SLC schemes are identical, as in MMARQ-CC systems. Meanwhile, becauseas Ft,l = 0N ×N for
37
38 the transmit signal vectors of the newly transmitted parity symbols, their estimate in the proposed scheme is identical to BLC.
39 In addition, becauseas the aggregated channel matrix for the transmit signal vectors of parity symbols is block-diagonal with
40
41 the t0 th diagonal block Ht0 ,l for 1 ≤ t0 ≤ t, the direct SLC scheme also provides an estimate of the parity symbols identical
42
43 to BLC. Therefore, the proposed and direct SLC schemes haveexhibit identical error performance.
44
45
46 [(iv) suggestions for a full-paper: dealing with parameter errors for the state-space model] The
47
48 state-space model consists of the process equation (xt,l = Ft,l xt−1,l + wt,l ) and observation equation
49 (yt,l = Ht,l xt,l + nt,l ). In the equations, (a) the noise nt,l remains unknown, (b) yt,l should be known by
50
51 receiving the signal, (c) Ht,l also should be estimated by the receiver in any MIMO systems, and (d) xt,l
52
53 and xt−1,l are the state vectors estimated by the proposed scheme. Therefore, the terms that may contain
54 errors are Ft,l and wt,l in the process equation, which is decided by the transition indicating vector dt,l ,
55
56
57
58
59
60
Page 48 of 48

1 IEEE TRANSACTIONS ON SIGNAL PROCESSING 39


2
3
4
because Ft,l = diag(dt,l ) and wt,l = xt,l (1N ×1 − dt,l ). Meanwhile, the estimated Ht,l can contain
5 errors as well, but the receiver can define the observation equation using the estimated Ht,l regardless
6
7 of errors, so the channel estimation error would not be a problem to develop the state-space model.
8
At the receiver, each element of the transition indicating vector (dt,l (n) = E[xt−1,l (n)xt,l (n)∗ ]) can
9
10 be obtained by determining that the current symbol (xt,l (n)) is identical to the previously transmitted
11
12 symbol (xt−1,l (n)). Based on the system model of the paper, this can be decided by checking (1) xt,l (n)
13
is a data symbol or belongs to the packet with CC and (2) the packet including xt,l (n) is at least in its
14
15 2nd HARQ round at the tth TTI. In general cases (besides the system model of the manuscript), there
16
17 may be a data symbol which is retransmitted from different l and n. However, for normal functioning of
18
reception procedures regardless of the combining strategy, the receiver should know (3) the position in
19
20 the mother codeword of the coded bits consisting each symbol and (4) the packet to which each symbol
21
22 belongs. This kind of side information should be shared between the transmitter and receiver via control
23
channel, otherwise the LLRs of the coded bits obtained from each symbol cannot be correctly located
24
25 for decoding. Therefore, similar to the case of the transmitter, the receiver can calculate the transition
26
27 indicating vector dt,l prior to decoding even though it does not know the exact value of xt,l (n) until the
28
successful decoding of packets.
29
30 To sum up, considering the conventional wireless communication systems with control channel, the
31
32 errors in the state-space model occur when the control channel information is not shared or is received
33
with errors. This blind/semi-blind estimation problem of the control channel information is an interesting
34
35 and important topic, but it is not directly related to MIMO-HARQ systems, which maybe the beyond the
36
37 scope of this manuscript proposing the combining scheme for MIMO-HARQ systems. Thus, the cases of
38
errors in parameters for the state-space model are not considered in the revised manuscript. Instead, the
39
40 following footnote is added to indicate that the information to develop the state-space model (in detail,
41
42 the transition indicating vector) is shared between the transmitter and receiver:
43
44
45 [Section III-A: footnote 1] For normal functioning of reception procedures, in addition to the number of HARQ processes
46
47 and the retransmission strategy, the receiver should know (i) the position in the mother codeword of the coded bits consisting each
48
symbol and (ii) the packet to which each symbol belongs. For packet transmission, this information is usually shared between
49
50 the transmitter and receiver via control channel. Therefore, the receiver and transmitter can calculate dt,l = E[xt−1,l x∗t,l ] by
51
52 deciding whether each xt−1,l (n) is retransmitted as xt,l (n), as described in Sections II-A and II-B.
53
54
55 Again, I would like to thank the reviewer for the time and effort for this manuscript.
56
57
58
59
60

You might also like