MACHINE LEARNING FOR RELIABLE MMWAVE SYSTEMS: BLOCKAGE PREDICTION AND PROACTIVE HANDOFF

Ahmed Alkhateeb†, Iz Beltagy‡, and Sam Alex⋆

† Arizona State University, Email: alkhateeb@asu.edu
‡ Allen Institute for Artificial Intelligence, Email: beltagy@allenai.org
⋆ Facebook, Email: sampalex@fb.com

ABSTRACT

The sensitivity of millimeter wave (mmWave) signals to blockages is a fundamental challenge for mobile mmWave communication systems. The sudden blockage of the line-of-sight (LOS) link between the base station and the mobile user normally leads to disconnecting the communication session, which highly impacts the system reliability. Further, reconnecting the user to another LOS base station incurs high beam training overhead and critical latency problems. In this paper, we leverage machine learning tools and propose a novel solution for these reliability and latency challenges in mmWave MIMO systems. In the developed solution, the base stations learn how to predict that a certain link will experience blockage in the next few time frames using their observations of adopted beamforming vectors. This allows the serving base station to proactively hand-over the user to another base station with a highly probable LOS link. Simulation results show that the developed deep learning based strategy successfully predicts blockage/hand-off in close to 95% of the times. This reduces the probability of communication session disconnection, which ensures high reliability and low latency in mobile mmWave systems.

Index Terms— Millimeter wave, machine learning, beamforming, blockages, hand-off.

1. INTRODUCTION

Reliability and latency are two main challenges for millimeter wave (mmWave) wireless systems [1–3]: (i) the high sensitivity of mmWave signal propagation to blockages and the large signal-to-noise ratio gap between LOS and non-LOS links greatly affect the link reliability, and (ii) the frequent search for new base stations (BSs) after link disconnections causes critical latency overhead [4]. This paper leverages machine learning tools to efficiently address these challenges in mobile mmWave systems.

The coordination among multiple BSs to serve the mobile user has been the main approach for enhancing the reliability of mmWave communication links [1–3]. In [1], extensive measurements were done for coordinated multi-point transmission at 73 GHz, and showed that simultaneously serving the user by a number of BSs noticeably improves the network coverage. This coverage performance gain was also confirmed by [2] in heterogeneous mmWave cellular networks using stochastic geometry tools. To overcome the large training overhead and increase the effective achievable rate in coordinated transmissions, especially for highly-mobile applications, [3] proposed to use machine learning tools to predict the beamforming directions at the coordinating BSs from low-overhead features. Despite the interesting coverage gains of coordinated transmission shown in [1–3], BS coordination is associated with high cooperation overhead and difficult synchronization challenges.

In this paper, we develop a novel solution that enhances mmWave system reliability in highly-mobile applications without requiring the high cooperation overhead of coordinated transmission. In our strategy, the serving BS uses the sequence of beams that it used to serve a mobile user over the past period of time to predict if a hand-off/blockage is going to happen in the next few moments. This allows the user and its serving BS to pro-actively hand-over the communication session to the next BS, which prevents sudden link disconnections due to blockage, improves the system reliability, and reduces the latency overhead. To do that, we develop a machine learning model based on gated recurrent neural networks, which are best suited for dealing with variable-length sequences. Simulation results show that the proposed solution predicts blockages/hand-off with almost 95% success probability and significantly improves the reliability of mmWave large antenna array systems.

Notation: We use the following notation: A is a matrix, a is a vector, a is a scalar, and A is a set. A^T and A^* are the transpose and Hermitian (conjugate transpose) of A. [a]n is the nth entry of a. A ◦ B is the Hadamard product of A and B. N(m, R) is a complex Gaussian random vector with mean m and covariance R.

2. SYSTEM AND CHANNEL MODELS

In this section, we describe the adopted mmWave system and channel models. Consider the communication setup in Fig. 1, where a mobile user is moving along a trajectory. At every step in this trajectory, the mobile user gets connected to one out of N candidate base stations (BSs). For simplicity, we assume that the mobile user has a single antenna while the BS is equipped with M antennas. Extending the results of this paper to the case of multi-antenna users is straightforward. Let hn,k denote the M × 1 uplink channel vector from the user to the nth BS at the kth subcarrier. If the user is connected to the nth BS, this BS applies a beamforming vector fn to serve this user. In the downlink transmission, the received signal at the mobile user on the kth subcarrier can then be expressed as

yk = h*n,k fn s + v,   (1)

where the data symbol s ∈ C satisfies E[|s|^2] = P, with P the total transmit power, and v ~ N_C(0, σ^2) is the receive noise at the mobile user. Due to the high cost and power consumption of mixed-signal components in mmWave large antenna array systems, beamforming processing is normally done in the analog domain using networks of phase shifters [5].
Fig. 1. The system model considers one user moving in a trajectory, and is served by one out of N candidate BSs at every step in the trajectory.

The constraints on these phase shifters limit the beamforming vectors to be selected from quantized codebooks. Therefore, we assume that the BS beamforming vector fn is selected from a quantized codebook F with size/cardinality |F| = MCB. The codewords of this codebook are denoted as gm, m = 1, 2, ..., MCB. Further, we assume that the beamforming vector fn is selected from the codebook F to maximize the received signal power, i.e., according to the criterion

fn = arg max_{gm ∈ F} Σk |h*n,k gm|^2.   (2)

We adopt a wideband geometric mmWave channel model [4] with L clusters. Each cluster ℓ, ℓ = 1, ..., L, is assumed to contribute with one ray that has a time delay τℓ ∈ R and azimuth/elevation angles of arrival (AoA) θℓ, φℓ. Further, let ρn denote the path-loss between the user and the nth BS, and p(τ) represent a pulse shaping function for TS-spaced signaling evaluated at τ seconds. With this model, the delay-d channel between the user and the nth BS follows

hn,d = √(M/ρn) Σ_{ℓ=1}^{L} αℓ p(d TS − τℓ) an(θℓ, φℓ),   (3)

where an(θℓ, φℓ) is the array response vector of the nth BS at the AoAs θℓ, φℓ. Given the delay-d channel in (3), the frequency domain channel vector at subcarrier k, hn,k, can be written as

hn,k = Σ_{d=0}^{D−1} hn,d e^{−j 2πkd/K}.   (4)

Considering a block-fading channel model, {hn,k}_{k=1}^{K} are assumed to stay constant over the channel coherence time, denoted TC [6].
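To illustrate the codebook-based beam selection in (2), the following minimal NumPy sketch builds a toy frequency-domain channel for a ULA following the structure of (3)–(4) and then picks the beamsteering codeword that maximizes the received power summed over subcarriers. All sizes and the random channel parameters are illustrative assumptions, not the simulation parameters used later in this paper (the path-loss scaling and the exact pulse shape are also simplified).

```python
import numpy as np

# Illustrative sizes (assumptions, not the paper's simulation values)
M, K, L, D = 32, 64, 3, 8          # antennas, subcarriers, clusters, delay taps
M_CB = 64                          # codebook size |F|
rng = np.random.default_rng(0)

def ula_response(theta, M):
    """Array response of a half-wavelength-spaced ULA at azimuth angle theta."""
    return np.exp(1j * np.pi * np.arange(M) * np.sin(theta))

# Toy delay-domain channel in the spirit of Eq. (3): one ray per cluster,
# sinc used as a simple stand-in for the pulse p(dT_S - tau); sqrt(M/rho_n) omitted
alpha = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
tau = rng.uniform(0, D, L)         # delays in units of T_S
theta = rng.uniform(-np.pi / 2, np.pi / 2, L)
h_d = np.zeros((D, M), dtype=complex)
for d in range(D):
    for l in range(L):
        h_d[d] += alpha[l] * np.sinc(d - tau[l]) * ula_response(theta[l], M)

# Frequency-domain channel per subcarrier, Eq. (4)
h_k = np.array([(h_d * np.exp(-1j * 2 * np.pi * k * np.arange(D)[:, None] / K)).sum(axis=0)
                for k in range(K)])                  # shape (K, M)

# Uniformly quantized beamsteering codebook F
angles = np.arcsin(np.linspace(-1, 1, M_CB, endpoint=False))
F = np.stack([ula_response(a, M) / np.sqrt(M) for a in angles])   # (M_CB, M)

# Beam selection criterion, Eq. (2): maximize received power summed over subcarriers
rx_power = np.abs(h_k.conj() @ F.T) ** 2             # (K, M_CB)
best_beam = int(np.argmax(rx_power.sum(axis=0)))
f_n = F[best_beam]
print("selected beam index:", best_beam)
```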
3. PROBLEM DEFINITION AND FORMULATION

Maintaining good link reliability is a key challenge for mmWave communication systems, especially with mobility. This is mainly due to the high sensitivity of mmWave signals to blockages, which can frequently cause link disconnections. Further, when the link to the current BS is blocked, the mobile user incurs critical latency overhead to get connected to another BS. To overcome these challenges, can we predict that a link blockage is going to happen in the next few moments? Successful blockage prediction can be very helpful for mmWave system operation as it allows for proactive hand-off to the next BS. This proactive hand-off enhances the system reliability by ensuring session continuity and avoids the latency overhead that results from link disconnection. In this section, we formulate the mmWave blockage prediction and proactive hand-off problem that we tackle in the next section.

Beam sequence and hand-off status: To formulate the problem, we first define two important quantities, namely the beam sequence and the hand-off status. Due to the user mobility, the current/serving BS needs to frequently update its beamforming vector fn ∈ F. The frequency of updating the beams depends on a number of parameters including the user speed and the beam width. A good approximation for the period after which the BS needs to update its beam is the beam coherence time, TB, which is defined as [6]

TB = (D / (vs sin(α))) (Θn / 2),   (5)

where vs denotes the user speed, D is the distance between the user and the BS, α is the angle between the direction of travel and the direction of the main scatterer/reflector (or the BS in the case of LOS), and Θn defines the beam-width of the beams used by BS n. Now, assuming that the current/serving BS n updates its beamforming vector every beam-coherence time and calling it a time step, we define fn^(t) as the beamforming vector selected by the nth BS to serve the mobile user in the tth time step, with t = 1 representing the first time step after handing-over to the current BS. With this, we define the beam sequence of BS n until time step t, denoted Bt, as

Bt = {fn^(1), fn^(2), ..., fn^(t)}.   (6)

Further, we define st ∈ {1, 2, ..., N} as the hand-off status at the tth time step, with st = n indicating that the user will stay connected to the current BS n in the next time step, and st ≠ n indicating that the mobile user will hand-off to another BS in the t + 1 time step. It is important to note here that predicting a hand-off in the next time step is more general than predicting a blockage, as the hand-off can happen due to a sudden blockage or a better SNR. Therefore, we will generally adopt the hand-off prediction, which implicitly includes link blockage prediction.

In this paper, we will focus on stationary blockages, which leaves the user mobility as the main factor of the time-varying behavior in the channel. To complete the formulation, we note that the beamforming vector that maximizes the received signal power, as defined in (2), heavily relies on the user location and the surrounding environment geometry (including blockages) [3]. Therefore, we formulate the problem as a prediction of the hand-off status at time t + 1, i.e., ŝt, given the beam sequence at time t, Bt. Formally, the objective of this paper is to maximize the probability of successful blockage/hand-off prediction, defined as

P[ŝt = st | Bt].   (7)

In the next section, we leverage machine learning tools to address this problem.
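As a small illustration of these definitions, the sketch below evaluates the beam coherence time in (5) for assumed mobility parameters and arranges a logged beam sequence and hand-off status into (Bt, st) pairs of the kind the formulation in (6)–(7) operates on. All numerical values and the example logs are hypothetical.

```python
import numpy as np

def beam_coherence_time(D, v_s, alpha, theta_n):
    """Beam coherence time from Eq. (5): T_B = D / (v_s * sin(alpha)) * theta_n / 2."""
    return D / (v_s * np.sin(alpha)) * theta_n / 2.0

# Assumed example: 30 m BS distance, 10 m/s user speed,
# 60 deg travel/scatterer angle, 5 deg beam-width
T_B = beam_coherence_time(D=30.0, v_s=10.0, alpha=np.deg2rad(60), theta_n=np.deg2rad(5))
print(f"beam coherence time ~ {T_B * 1e3:.1f} ms")   # beams are updated once per T_B

# A logged episode: beam indices b_t used by the serving BS at each time step,
# and the hand-off status s_t (the BS index of the next time step) per the text above
beam_log = [17, 17, 18, 20, 23, 23, 24]   # assumed example beam indices
bs_log = [1, 1, 1, 1, 1, 2, 2]            # assumed next-step serving BS per time step

# Pairs (B_t, s_t): beams observed up to step t (Eq. (6)) and the label to predict (Eq. (7))
pairs = [(beam_log[:t + 1], bs_log[t]) for t in range(len(beam_log))]
for B_t, s_t in pairs[-3:]:
    print("B_t =", B_t, "-> s_t =", s_t)
```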
4. DEEP LEARNING BASED PROACTIVE HAND-OFF

In this section, we explain our proposed solution that uses deep learning tools, and more specifically recurrent neural networks, to efficiently predict hand-off/blockages. First, we highlight the key idea of the proposed solution and the system operation before delving into the exact machine learning modeling in Section 4.2.

4.1. Main Idea and System Operation

In this subsection, we briefly describe the key idea of the proposed solution as well as the learning/communications system operation. The main idea is to model the problem of predicting hand-off/blockage as a sequence labeling problem: at every time step, given the sequence of previous beams Bt, the serving BS predicts the most likely base station that the user will connect with in the next time step. If the predicted base station is different from the current one, a hand-off needs to happen. As we mentioned in Section 3, by adopting the problem of predicting the hand-off, our system will also predict link blockages if they are going to happen in the next time step.

System operation: The proposed learning/communications system operates in two phases. In the first phase (learning), the mmWave communication system operates as normal: at every beam coherence time, the current BS will update its beamforming vector that serves the user, and if the link is blocked, the user will follow the initial access process to hand-off to a new BS. During this process, the serving BS will feed the beam sequence Bt and the hand-off status st at every time step (beam coherence time) to its machine learning model that will use it for training. It is important to note here that these beam sequences have different lengths depending on the speed of the user, its trajectory, the time period it spends connected to this BS, etc. As will be explained shortly, our deep learning model is designed to carefully handle this variable length challenge. After the machine learning model is well-trained, the serving BS will leverage it to predict if the link to the user will face a blockage/hand-off in the next time step. If a hand-off is predicted, the user and its serving BS will pro-actively initiate the hand-off process to the next BS to ensure session continuity and avoid latency problems.

Fig. 2. The proposed deep learning model that leverages recurrent neural networks to predict the hand-off BS in the next time step, ŝt, given the past beam sequence.

4.2. Machine Learning Model

Next, we describe the key elements of the proposed machine learning model, which is based on recurrent neural networks.

Input representation: At every time step t, the input to our deep learning model is the tth beam index in the beam sequence Bt. Let bt ∈ {1, 2, ..., MCB} denote the index of the tth beam; then our model starts with an embedding layer that maps every beam index bt to a vector xt,

xt = embd(bt),   (8)

where embd is a lookup table that we learn during the training.

Sequence processing: The central component of our model is a Gated Recurrent Unit (GRU) [7] — a form of neural network that is best suited for processing variable length sequences. The GRU is a recursive network that runs at every time step of the input, and it maintains a hidden state qt which is a function of the previous state and the current input. In addition, it has a special gating mechanism that helps with long sequences. More formally, the GRU is implemented as depicted in Fig. 2, and is described by the following model equations:

rt = σg(Wr xt + Ur qt−1 + cr),   (9)
zt = σg(Wz xt + Uz qt−1 + cz),   (10)
qt = (1 − zt) ◦ qt−1 + σq(Wq xt + Uq(rt ◦ qt−1) + cq),   (11)

where xt is the input vector, qt is the hidden state, and rt and zt are "gates" that help the model learn from long-distance dependencies if needed. The weight matrices Wr, Wz, Ur, Uz and the bias vectors cr, cz, cq are model parameters that are learned during the machine learning training phase. Finally, σg and σq are nonlinear activation functions; the first is the sigmoid and the second is tanh. Using nonlinear activation functions allows the model to represent complex non-linear functions.

Output: At every time step t, the model has a hidden state qt which encompasses what the model learned about the sequence of beams until time step t, and the next step is to use qt to predict the most likely base station ŝt for the next time step. For this, we use a fully-connected layer with output size N, which equals the number of possible base stations, followed by a softmax activation that normalizes the outputs into probabilities (they sum up to 1). The predicted hand-off BS ŝt (the final output of the model) can then be expressed as

ŝt = arg max_{n ∈ {1,2,...,N}} softmax(Wf qt + cf)n,   (12)

where Wf and cf are the weights and biases of the fully-connected layer. The softmax(.)n function takes a vector of length N and outputs the probability that the nth BS is going to be the BS of the next time step t + 1. It is defined as follows:

softmax(a)n = e^{an} / Σ_{d=1}^{N} e^{ad}.   (13)

Training: The training objective in our supervised learning model is minimizing the cross entropy loss between our model predictions ŝt and the actual base station of the following time step st. We compute this loss at every time step t and then sum over all time steps. The cross entropy loss at every time step t is computed as follows:

loss(ŝt, st) = − Σi [p]i,t log [p̂]i,t,   (14)

where the reference prediction vector p has 1 in the entry corresponding to the index of the correct BS in st, and zero otherwise. Further, the model prediction vector p̂ has the dth entry equal to softmax(a)d, ∀d. The final goal of training the model is to find the parameters embd, Wr, Wz, Wq, Ur, Uz, Uq, Wf, cr, cz, cq, cf that minimize this loss for all training instances.
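To make equations (8)–(14) concrete, the following minimal NumPy sketch runs one beam sequence through an embedding lookup, the GRU update, the fully-connected softmax output, and the summed cross-entropy loss, exactly as the equations are written above. The dimensions, the random initialization, and the example sequence are illustrative assumptions, not the trained model of this paper.

```python
import numpy as np

rng = np.random.default_rng(1)
M_CB, E, H, N = 128, 20, 32, 2      # codebook size, embedding dim, hidden dim, #BSs (assumed)

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
def softmax(a):                      # Eq. (13)
    e = np.exp(a - a.max())
    return e / e.sum()

# Model parameters (randomly initialized here; learned during training in practice)
embd = rng.standard_normal((M_CB, E)) * 0.1               # lookup table, Eq. (8)
W_r, W_z, W_q = (rng.standard_normal((H, E)) * 0.1 for _ in range(3))
U_r, U_z, U_q = (rng.standard_normal((H, H)) * 0.1 for _ in range(3))
c_r, c_z, c_q = np.zeros(H), np.zeros(H), np.zeros(H)
W_f, c_f = rng.standard_normal((N, H)) * 0.1, np.zeros(N)

def model_step(b_t, q_prev):
    """One time step of the model, following Eqs. (8)-(12) as written in the text."""
    x_t = embd[b_t]                                        # Eq. (8)
    r_t = sigmoid(W_r @ x_t + U_r @ q_prev + c_r)          # Eq. (9)
    z_t = sigmoid(W_z @ x_t + U_z @ q_prev + c_z)          # Eq. (10)
    q_t = (1 - z_t) * q_prev + np.tanh(W_q @ x_t + U_q @ (r_t * q_prev) + c_q)  # Eq. (11)
    p_hat = softmax(W_f @ q_t + c_f)                       # fully-connected + softmax
    s_hat = int(np.argmax(p_hat))                          # Eq. (12)
    return q_t, p_hat, s_hat

# Assumed beam sequence b_t and correct next-step BS labels s_t
beam_seq, labels = [17, 17, 18, 20], [0, 0, 0, 1]
q, total_loss = np.zeros(H), 0.0
for b_t, s_t in zip(beam_seq, labels):
    q, p_hat, s_hat = model_step(b_t, q)
    p = np.eye(N)[s_t]                                     # one-hot reference vector p
    total_loss += -np.sum(p * np.log(p_hat + 1e-12))       # Eq. (14), summed over time steps
print("summed cross-entropy loss:", round(total_loss, 3))
```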
Fig. 3. This figure illustrates the considered simulation setup where two candidate BSs, each with a ULA, serve one vehicle moving in a street.

Fig. 4. This figure shows that the proposed deep-learning proactive hand-off solution successfully predicts blockages/hand-off with high probabilities when the machine learning model is well-trained.
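Fig. 4 reports the empirical counterpart of the success probability defined in (7). As an illustration only — the function and the example data below are assumptions, not the authors' evaluation code — such a metric can be computed by comparing the predicted and true next-step BS indices over held-out beam sequences:

```python
def handoff_success_probability(predicted, actual):
    """Empirical estimate of P[s_hat_t = s_t | B_t] from Eq. (7).

    predicted, actual: lists of per-time-step BS indices, one pair of lists per
    held-out beam sequence (assumed example data below).
    """
    hits = total = 0
    for s_hat_seq, s_seq in zip(predicted, actual):
        hits += sum(int(s_hat == s) for s_hat, s in zip(s_hat_seq, s_seq))
        total += len(s_seq)
    return hits / total

# Assumed example: two held-out sequences of predictions vs. true labels
pred = [[1, 1, 1, 2], [1, 1, 2]]
true = [[1, 1, 2, 2], [1, 1, 2]]
print("success probability:", handoff_success_probability(pred, true))  # 6/7 ~ 0.857
```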

5. SIMULATION RESULTS

In this section, we evaluate the performance of the proposed deep-learning based proactive hand-off solution.

Simulation setup: We adopt the mmWave system and channel models in Section 2, with two candidate BSs serving one vehicle/mobile user moving in a street, as depicted in Fig. 3. To generate realistic data for the channel parameters (AoAs/delays/etc.), we use the commercial ray-tracing simulator Wireless InSite [8], which is widely used in mmWave research [6, 9] and is verified with channel measurements [9]. Each BS is installed on a lamp post at a height of 4 m and employs a 32-element uniform linear array (ULA) facing the street. The mobile user moves in straight-line trajectories in the street, which can be anywhere across the street width, with a maximum trajectory length of 160 m. The trajectory starting point is randomly selected from the first 40 m of the street, and the user speed is randomly selected from {8, 16, 24, 32, 40} km/hr. The BS selects its beamforming vector from a uniformly quantized beamsteering codebook with an oversampling factor of 4, i.e., the codebook size is MCB = 128. At every beam coherence time, the BS beamforming vector is updated to maximize the receive SNR at the user. During the uplink training, the MS is assumed to use 30 dBm transmit power, and the noise variance corresponds to a 1 GHz system bandwidth. The system is assumed to operate at a 60 GHz carrier frequency.

We consider the deep learning model described in Section 4.2. The neural network model has an embedding layer that outputs vectors of length 20 to the GRU unit, with a maximum sequence length of 454. Since we have only 2 BSs in our experiment, the fully-connected layer has only two outputs that go to the softmax function. We use the Adam optimizer [10]. In the deep learning experimental work, we used the Keras library [11] with a TensorFlow backend.
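To make the described pipeline concrete, the following is a minimal tf.keras sketch consistent with the architecture reported above: an embedding of length 20 over the MCB = 128 beam indices, a GRU layer producing a hidden state per time step, and a two-output softmax layer, trained with the Adam optimizer on a per-time-step cross-entropy loss. The GRU hidden size, batch size, number of epochs, and the randomly generated stand-in data are assumptions; the paper trains on ray-tracing data instead.

```python
import numpy as np
import tensorflow as tf

M_CB = 128        # beamsteering codebook size (reported above)
EMBED_DIM = 20    # embedding length (reported above)
MAX_LEN = 454     # maximum sequence length (reported above)
N_BS = 2          # two candidate BSs in the considered setup
HIDDEN = 64       # GRU hidden size: an assumption (not stated in the paper)

model = tf.keras.Sequential([
    # mask_zero allows zero-padding of variable-length beam sequences;
    # beam indices are therefore shifted to 1..M_CB and 0 is reserved for padding
    tf.keras.layers.Embedding(input_dim=M_CB + 1, output_dim=EMBED_DIM, mask_zero=True),
    tf.keras.layers.GRU(HIDDEN, return_sequences=True),      # hidden state q_t at every step
    tf.keras.layers.Dense(N_BS, activation="softmax"),       # per-step softmax over the BSs
])
model.compile(optimizer="adam",                               # Adam optimizer [10]
              loss="sparse_categorical_crossentropy",         # per-step cross-entropy, Eq. (14)
              metrics=["accuracy"])                           # accuracy ~ success probability (7)

# Random stand-in data only, to show the expected shapes (padded sequences of beam indices
# and the next-step BS label at every time step)
num_seq = 256
X = np.random.randint(1, M_CB + 1, size=(num_seq, MAX_LEN))
y = np.random.randint(0, N_BS, size=(num_seq, MAX_LEN))
model.fit(X, y, batch_size=32, epochs=2, validation_split=0.2, verbose=0)
model.summary()
```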
Hand-off/Blockage prediction: To evaluate the performance of the proposed deep-learning based proactive hand-off solution, Fig. 4 plots the blockage/hand-off successful prediction probability, defined in (7), versus the training dataset size. Fig. 4 shows that with a sufficient dataset size (larger than 12 thousand samples), the machine learning model successfully predicts the hand-off with more than 90% probability, given only the sequence of past beams Bt. This illustrates the potential of the proposed solution in enhancing the reliability of next-generation mmWave systems.

6. CONCLUSION

In this paper, we proposed a novel solution for the reliability and latency problem in mobile mmWave systems. Our solution leveraged deep learning tools to efficiently predict blockage and the need for hand-off. This allows the mobile user to proactively hand-off to the next BS without disconnecting the session or suffering from high latency overhead due to sudden link disconnections. The simulation results showed that the developed proactive hand-off strategy can successfully predict blockage/hand-off with high probability when the machine learning model is trained with reasonable dataset sizes. In the future, it is interesting to extend the developed solution to multi-user systems and to account for mobile blockages.

7. REFERENCES

[1] G. R. MacCartney, T. S. Rappaport, and A. Ghosh, "Base station diversity propagation measurements at 73 GHz millimeter-wave for 5G coordinated multipoint (CoMP) analysis," in 2017 IEEE Globecom Workshops (GC Wkshps), Dec. 2017, pp. 1–7.

[2] D. Maamari, N. Devroye, and D. Tuninetti, "Coverage in mmWave cellular networks with base station co-operation," IEEE Transactions on Wireless Communications, vol. 15, no. 4, pp. 2981–2994, April 2016.

[3] A. Alkhateeb, S. Alex, P. Varkey, Y. Li, Q. Qu, and D. Tujkovic, "Deep learning coordinated beamforming for highly-mobile millimeter wave systems," IEEE Access, pp. 1–1, 2018.

[4] R. W. Heath, N. González-Prelcic, S. Rangan, W. Roh, and A. M. Sayeed, "An overview of signal processing techniques for millimeter wave MIMO systems," IEEE Journal of Selected Topics in Signal Processing, vol. 10, no. 3, pp. 436–453, April 2016.

[5] A. Alkhateeb, J. Mo, N. González-Prelcic, and R. W. Heath, "MIMO precoding and combining solutions for millimeter-wave systems," IEEE Communications Magazine, vol. 52, no. 12, pp. 122–131, Dec. 2014.

[6] V. Va, J. Choi, T. Shimizu, G. Bansal, and R. W. Heath, "Inverse multipath fingerprinting for millimeter wave V2I beam alignment," IEEE Transactions on Vehicular Technology, vol. PP, no. 99, pp. 1–1, 2017.

[7] K. Cho, B. Van Merriënboer, D. Bahdanau, and Y. Bengio, "On the properties of neural machine translation: Encoder-decoder approaches," in Proc. of SSST, 2014.

[8] Remcom, "Wireless InSite," http://www.remcom.com/wireless-insite.

[9] Q. Li, H. Shirani-Mehr, T. Balercia, A. Papathanassiou, G. Wu, S. Sun, M. K. Samimi, and T. S. Rappaport, "Validation of a geometry-based statistical mmWave channel model using ray-tracing simulation," in 2015 IEEE 81st Vehicular Technology Conference (VTC Spring), May 2015, pp. 1–5.

[10] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv preprint arXiv:1412.6980, 2014.

[11] F. Branchaud-Charron, F. Rahman, and T. Lee, "Keras."
