This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TCCN.2020.3003036, IEEE Transactions on Cognitive Communications and Networking
Mushu Li, Jie Gao, and Xuemin (Sherman) Shen are with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada, N2L 3G1 (email: {m475li, jie.gao, sshen}@uwaterloo.ca). Lian Zhao is with the Department of Electrical, Computer and Biomedical Engineering, Ryerson University, Toronto, ON, Canada, M5B 2K3 (email: l5zhao@ryerson.ca).

Abstract—Mobile edge computing (MEC) is a promising technology to support mission-critical vehicular applications, such as intelligent path planning and safety applications. In this paper, a collaborative edge computing framework is developed to reduce the computing service latency and improve service reliability for vehicular networks. First, a task partition and scheduling algorithm (TPSA) is proposed to decide the workload allocation and schedule the execution order of the tasks offloaded to the edge servers given a computation offloading strategy. Second, an artificial intelligence (AI) based collaborative computing approach is developed to determine the task offloading, computing, and result delivery policy for vehicles. Specifically, the offloading and computing problem is formulated as a Markov decision process. A deep reinforcement learning technique, i.e., deep deterministic policy gradient, is adopted to find the optimal solution in a complex urban transportation network. With our approach, the service cost, which includes the computing service latency and the service failure penalty, can be minimized via optimal workload assignment and server selection in collaborative computing. Simulation results show that the proposed AI-based collaborative computing approach can adapt to a highly dynamic environment with outstanding performance.

Index Terms—Mobile edge computing, Internet of Vehicles, task scheduling, deep deterministic policy gradient

I. INTRODUCTION

Vehicular communication networks have drawn significant attention from both academia and industry in the past decade. Conventional vehicular networks aim to improve the driving experience and enable safety applications via data exchange in vehicle-to-everything (V2X) communications. In the era of 5G, the concept of vehicular networks has been extended to the Internet of Vehicles (IoV), in which intelligent and interactive applications are enabled by communication and computation technologies [1]. A myriad of on-board applications can be implemented in the context of IoV, such as assisted/autonomous driving and platooning, urban traffic management, and on-board infotainment services [2], [3].

Although IoV technologies are promising, realizing IoV applications still faces challenges. One of the obstacles is the limited on-board computation capability of vehicles. For example, a self-driving car with ten high-resolution cameras may generate 2 gigapixels per second of data, while 250 trillion computation operations per second are required to process the data promptly [4]. Processing such computation-intensive applications on vehicular terminals is energy-inefficient and time-consuming. To overcome this limitation, mobile edge computing (MEC) has emerged as a paradigm that provides fast and energy-efficient computing services for vehicle users [5]–[7]. Via vehicle-to-infrastructure (V2I) communications, resource-constrained vehicle users are allowed to offload their computation-intensive tasks to highly capable edge servers co-located with roadside units (RSUs) for processing. Meanwhile, compared to conventional mobile cloud computing, the network delay caused by task offloading can be significantly reduced in MEC due to the proximity of edge servers to vehicles [8]. Consequently, applications that require high computing capability, such as path navigation, video stream analytics, and object detection, can be implemented in vehicular networks with edge servers [9].

Despite the advantages brought by MEC-enabled vehicular networks, new challenges have emerged in task offloading and computing. One critical problem in MEC is to decide to which edge servers vehicles' computing tasks should be offloaded. In vehicular networks, the highly dynamic communication topology leads to unreliable communication links [10]. Due to the non-negligible computing time and the limited communication range of vehicles, a vehicle may travel out of the coverage area of an edge server during a service session, resulting in a service disruption. To support reliable computing services for high-mobility users, a service migration scheme has been introduced in [11]. Under this scheme, when a user moves out of the communication area of the edge server to which the computing task was offloaded, the computing process is interrupted, and the corresponding virtual machine (VM) is migrated to a new edge server according to the radio association. In urban areas, where infrastructure is densely deployed, frequent service interruptions would occur due to the dynamically changing radio association, which can significantly increase the overall computing service latency.

Alternatively, computing service reliability can be achieved by cooperation among edge servers. Different from service migration, which achieves service reliability by migrating the computing service according to the vehicle's trajectory, service cooperation improves service reliability by accelerating task processing. The computing task can be divided and computed by multiple servers in parallel, or fully offloaded to a server with high computing capability, at the cost of communication overhead [12], [13]. In this regard, the computing task can be forwarded to an edge server that is out of the
2332-7731 (c) 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
user's communication range. Compared to service migration, in which edge servers only execute the tasks offloaded by the vehicles under their communication coverage, service cooperation allows edge servers to process the tasks offloaded by vehicles out of their coverage, reducing the overall computing time. Nevertheless, multi-hop communications could result in significant transmission delay and waste communication spectrum resources in the task offloading process. The tradeoff between the communication overhead and the computing capability increases the complexity of the server assignment problem. In addition, although computing service latency can be reduced by cooperative computing, it is hard to guarantee service reliability for vehicles with high mobility. The uncertainty of vehicle moving trajectories poses significant challenges in computing result delivery.

Motivated by the issues in the existing service migration and computing cooperation schemes, we present a computing collaboration framework to provide reliable low-latency computing in an MEC-enabled vehicular network. Once an edge server receives the computing tasks offloaded by a vehicle, it may partially or fully distribute the computing workload to another edge server to reduce computing latency. Furthermore, by selecting proper edge servers to deliver the computing results, vehicle users are able to obtain computing results without service disruption caused by mobility. Under this framework, we propose a novel task offloading and computing approach that reduces the overall computing service latency and improves service reliability. To achieve this objective, we first formulate a task partition and scheduling optimization problem, which allows all received tasks in the network to be executed with minimized latency given the offloading strategy. A heuristic task partition and scheduling approach is developed to obtain a near-optimal solution of the non-convex integer problem. In addition, we formulate the radio and computing association problem as a Markov decision process (MDP). By characterizing stochastic state transitions in the network, the MDP is able to provide a proactive offloading policy for vehicles. An artificial intelligence (AI) approach, deep reinforcement learning (DRL), is adopted to cope with the curse of dimensionality in the MDP and the unknown network state transitions caused by vehicle mobility. Specifically, a convolutional neural network (CNN) based DRL is developed to handle the high-dimensional state space, and the deep deterministic policy gradient (DDPG) algorithm is adopted to handle the high-dimensional action space in the proposed problem. The major contributions of this paper are:

1) We develop an efficient collaborative computing framework for MEC-enabled vehicular networks to provide low-latency and reliable computing services. To overcome the complexity brought by the dynamic network topology, we propose a location-aware task offloading and computing strategy to guide MEC server collaboration.
2) We devise a task partition and scheduling scheme to divide the computing workload among edge servers and coordinate the execution order for tasks offloaded to the servers. Given the offloading strategy, our scheme can minimize the computing time by finding a near-optimal task scheduling solution with low time-complexity.
3) We further propose an AI-based collaborative computing approach, which utilizes a model-free method to find the optimal offloading strategy and MEC server assignment in a 2-dimensional transportation system. A CNN-based DDPG technique is developed to capture the correlation of the state and action among different zones and accelerate the learning speed.

The remainder of the paper is organized as follows. In Section II, we present the related works. Section III describes the system model. Section IV formulates the service delay minimization problem. In Section V, we present the task partition and scheduling scheme, followed by an AI-based collaborative computing approach in Section VI. Section VII presents simulation results, and Section VIII concludes the paper.

II. RELATED WORKS

A. Mobile Edge Computing

As proposed by ETSI in [14], the main objective of MEC is to reduce the computing task offloading and computing latency by utilizing the computing resources located in edge devices, such as base stations and access points. In the context of edge computing, one of the main problems is to determine the computing task offloading mechanism. The edge server selection problem has been evaluated in [15] and [16]. In [15], Cheng et al. propose a user association strategy to jointly minimize the computing delay, user energy consumption, and the server computing cost in a space-air-ground integrated network. A model-free approach is proposed in that work to deal with the complex offloading decision-making problem. In [6], Liu et al. investigate the user-server association policy, which takes into account the communication link quality and server computing capability. In both works, the computing association follows the radio association, i.e., the computing task is processed within the edge server to which the task is offloaded. To further reduce the computation time, task partition has been considered in [17]–[19]. Computing tasks can be split and computed by multiple servers in parallel. Cooperative computing has been investigated in [17]–[19] under different network environments, while the impact of user mobility has not been addressed. Additionally, in [12], [20], and [21], task scheduling, i.e., ordering the task execution sequences, is also evaluated in the offloading decision-making process. In those works, the task scheduling problem is formulated as a mixed-integer programming problem, and heuristic algorithms are proposed to obtain near-optimal solutions efficiently. Different from the above works, we investigate task partition and scheduling under the collaborative computing framework, in which adjusting the workload allocation for one task can affect the performance of other tasks, which makes the problem more complex.

B. MEC-enabled Vehicular Networks

The problem of computing offloading has been investigated in many research works in the context of vehicular networks
[22]–[25]. In those works, the main objective is to minimize the service time by selecting the optimal edge server, while service reliability in the presence of vehicle mobility is not taken into account. In [26]–[28], machine learning techniques are adopted to obtain reliable offloading decisions for vehicles by predicting the vehicle trajectories. In [26], Sun et al. focus on task offloading and execution utilizing the computing resources on vehicles, i.e., the vehicular edge. An online learning algorithm, i.e., multi-armed bandit, is utilized to determine the computing and communication association among vehicles. In [27], Ning et al. apply a DRL approach to jointly allocate the communication, caching, and computing resources in a dynamic vehicular network. Furthermore, to deal with service disruption when a vehicle leaves the server coverage, service migration was first proposed in [11]. According to the vehicle moving trajectory, the corresponding computing services can be migrated to another edge server that may associate with the vehicle in the future. Proactive service migration strategies have been investigated in [29] and [30], where an MDP is utilized to make the migration decision in a proactive manner. To alleviate service interruption and network overheads in virtual machine migration, server cooperation has been studied in [5], [31], and [32]. The works [5] and [31] consider that vehicles divide and offload their computing tasks to multiple servers according to the predicted traveling traces. Vehicle-to-vehicle communication is used to disseminate the computing result if the edge server cannot connect with the vehicle at the end of a service session. In [32], neural networks are utilized to predict the computing demand in the vehicular network, and MEC servers are clustered to compute the offloaded tasks cooperatively. Different from the above works, our proposed approach improves service reliability through collaboration and task scheduling among edge servers without cooperative transmission, which reduces the communication overhead of result delivery.

III. SYSTEM MODEL

A. Collaborative Edge Computing Framework

An MEC-enabled vehicular network is illustrated in Fig. 1. A row of RSUs, equipped with computing resources, provides seamless communication and computing service coverage for vehicles on the road. An RSU can also communicate with other RSUs within its communication range via wireless links. The set of RSUs is denoted by R, where the index of an RSU is denoted by r ∈ R. We assume that a global controller has full knowledge of the transportation network and makes offloading and computing decisions for all the vehicles in a centralized manner. In our model, a computing session for a task includes three steps:

1) Offloading: When a computing task is generated at a vehicle, the vehicle selects an RSU within its communication range and immediately offloads the computing data of the task to that RSU. In the example shown in Fig. 1, RSU r is selected to receive the computing load. This RSU is referred to as the receiver RSU for the task.
2) Computing: After the computing task is fully offloaded, the receiver RSU can process the whole computing task or select another RSU to share the computing load. The RSU selected to process the task collaboratively with the receiver RSU is referred to as the helper RSU for the task.
3) Delivering: A vehicle may travel out of the communication range of its receiver RSU. Therefore, the controller may select an RSU that can connect with the vehicle at the end of the service session to gather and transmit the computing results. This RSU is referred to as the deliver RSU. To reduce the overhead, we limit the deliver RSU to be either the receiver RSU or the helper RSU of the task. In the example shown in Fig. 1, RSU r+1 acts as both the helper RSU and the deliver RSU for the computing task offloaded by the vehicle.

To reduce the decision space in task offloading and scheduling, instead of providing the offloading and computing policy to individual vehicles, we consider a location-based offloading and computing policy. We divide each road into several zones of equal length, where the set of zones is denoted by Z. The index of a zone is denoted by z = (a, b) ∈ Z. The terms a and b represent the index of the road and the index of the segment on the road, respectively, where a ∈ {1, . . . , A} and b ∈ {1, . . . , B}. As a vehicle drives along the road, it traverses the zones consecutively. We assume that all vehicles in the same zone follow the same offloading and computing policy.¹ For simplicity, we evaluate the aggregated tasks for the vehicles in each zone at a time slot, and refer to the tasks offloaded by zone z as task z in the remainder of the paper. We suppose that a vehicle does not travel out of a zone during the time duration of a time slot, and vehicles can complete the offloading process of a task generated in a zone before traveling out of the zone. Denote the set of vehicles in zone z and time slot t ∈ T as V_{z,t}. The offloading decision for vehicles in zone z and time slot t is represented by a vector α_{z,t} ∈ Z_+^{|R|}, where Σ_{r=1}^{|R|} α_{z,r,t} = 1. The element α_{z,r,t} is 1 if RSU r is selected as the receiver RSU for the vehicles in zone z and time slot t, and 0 otherwise. Similarly, the collaborative computing decision for vehicles in zone z and time slot t is represented by a vector β_{z,t} ∈ Z_+^{|R|}, where Σ_{r=1}^{|R|} β_{z,r,t} = 1. The element β_{z,r,t} is 1 if RSU r is selected as the helper RSU for the vehicles in zone z and time slot t, and 0 otherwise. In addition, the decision on result delivery is denoted by a binary variable γ_{z,r,t}, where γ_{z,r,t} is 1 if the computing results of task z in time slot t are delivered by RSU r, and 0 otherwise.

¹The accuracy of vehicle locations improves as the zone length is reduced. In consideration of the length of a car, the zone length is set to be larger than 5 m.

B. Cost Model

In this paper, the system cost includes two parts: the service delay and the penalty caused by service failure.

1) Service Delay: We adopt the task partition technique during task processing. Once a receiver RSU receives the offloaded task from vehicles in a zone, it immediately divides
[Fig. 1: An MEC-enabled vehicular network. A vehicle travels along road i through zone z = (a, b), crossing the overlapping coverage areas of RSUs r−1, r, r+1, and r+2.]
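To make the zone-based decision variables in Section III-A concrete, the following is a minimal sketch (not from the paper's implementation; the sizes A, B, |R| and the selected RSU indices are hypothetical) of the one-hot vectors α_{z,t}, β_{z,t}, γ_{z,t} and the restriction that the deliver RSU must be the receiver or helper RSU:

```python
# Toy sketch of the location-based decision variables; sizes and indices are
# hypothetical, only the one-hot structure follows the system model.
A, B, NUM_RSUS = 2, 3, 4
zones = [(a, b) for a in range(1, A + 1) for b in range(1, B + 1)]  # z = (a, b)

def one_hot(r, n):
    """One-hot vector of length n selecting (0-based) RSU index r."""
    return [1 if i == r else 0 for i in range(n)]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# alpha: receiver RSU, beta: helper RSU, gamma: deliver RSU (one vector per zone).
alpha = {z: one_hot(0, NUM_RSUS) for z in zones}
beta = {z: one_hot(1, NUM_RSUS) for z in zones}
gamma = {z: one_hot(1, NUM_RSUS) for z in zones}

for z in zones:
    # Each decision selects exactly one RSU, mirroring constraint (14c).
    assert sum(alpha[z]) == sum(beta[z]) == sum(gamma[z]) == 1
    # The deliver RSU must coincide with the receiver RSU or the helper RSU.
    assert dot(gamma[z], alpha[z]) + dot(gamma[z], beta[z]) >= 1
```

Here every zone shares one policy, which is exactly what shrinks the decision space from per-vehicle to per-zone.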
denoted by x_{z,t} and 1 − x_{z,t}, respectively. Thus, the delay for forwarding the data to the deliver RSU is

T^{R}_{z,t} = Σ_{r∈R} Σ_{r'∈R} α_{z,r,t} β_{z,r',t} (1 − x_{z,t}) W_{z,t} / R_{r,r'}.    (7)

Furthermore, after the task is offloaded to the edge servers, a queuing delay may be experienced. Let the set Z_{r,t} denote the zones which have tasks offloaded to RSU r, i.e., {z | α_{z,r,t} = 1} ∪ {z | β_{z,r,t} = 1}, and let i(z) represent the index of zone z in the set Z_{r,t}. We denote N_{r,t} as the number of tasks offloaded in time slot t and assigned to RSU r, where N_{r,t} = Σ_z α_{z,r,t} + β_{z,r,t}. Then, a matrix I^{(r,t)} ∈ Z_+^{N_{r,t}×N_{r,t}} can be defined to indicate the processing order of the tasks offloaded to RSU r in time slot t, where I^{(r,t)}_{i(z),j} = 1 if the task offloaded from zone z is scheduled as the j-th task to be processed among the tasks offloaded in the same time slot. As shown in Fig. 2, the queuing delay of a task depends on the computing time of the tasks scheduled before it. For the first task to be processed among the tasks offloaded in time slot t, the queuing delay stems from the computing time for the tasks offloaded in previous time slots. Thus, the queuing delay of task z in RSU r can be formulated as follows:

T^{Q}_{z,r,t} = { T^{Q0}_{r,t},  if I^{(r,t)}_{i(z),1} = 1;
               Σ_{z'} Σ_{j} I^{(r,t)}_{i(z),j} I^{(r,t)}_{i(z'),j−1} T^{C}_{z',r,t},  otherwise.    (8)

The term T^{Q0}_{r,t} represents the latency for finishing the tasks offloaded in the previous time slots {1, . . . , t−1}, where

T^{Q0}_{r,t} = max{ Σ_{z'} I^{(r,t−1)}_{i(z'),N_{r,t−1}} T^{C}_{z',r,t−1} − τ, 0 },    (9)

where τ is the length of a time slot.

We consider that data transmission and task processing run in parallel. After the task is offloaded and the tasks scheduled before it are completed, the task can be processed by the dedicated server. The delay for processing task z offloaded to RSU r in time slot t can be formulated as

T^{P}_{z,r,t} = χ W_{z,t} [α_{z,r,t} x_{z,t} + β_{z,r,t} (1 − x_{z,t})] / C_r,    (10)

where C_r denotes the computing capability (CPU-cycle frequency) of RSU r, and χ denotes the number of computation cycles needed to execute 1 bit of data.

Given the offloading delay, queuing delay, and processing delay, the computing delay for task z on RSU r can be formulated as follows:

T^{C}_{z,r,t} = max{ T^{T}_{z,t} + β_{z,r,t} T^{R}_{z,t}, T^{Q}_{z,r,t} } + T^{P}_{z,r,t}.    (11)

Denote the overall service delay for the task offloaded from zone z in time slot t as T^{service}_{z,t}. As shown in Fig. 2, the overall service delay depends on the longer computing time between the receiver RSU and the helper RSU. Thus, we have

T^{service}_{z,t} = max{ Σ_r α_{z,r,t} T^{C}_{z,r,t}, Σ_r β_{z,r,t} T^{C}_{z,r,t} }.    (12)

2) Service Failure Penalty: The mobility of vehicles brings uncertainty to result downloading. A service failure may occur if a vehicle is out of the coverage of its deliver RSU during the service session. Denote the zone where vehicle v is located when its computing result is delivered as m_v, i.e., the location of vehicle v ∈ V_{z,t} in time slot t + T^{service}_{z,t}. Also, we denote the signal-to-noise ratio threshold for result delivery as δ^{D}. We introduce a variable 1_{z,t} to indicate whether the computing service for task z offloaded in time slot t is successful or not, where

1_{z,t} = { 1,  if P^{R} 10^{−L(D_{m_v,r})/10} ≥ σ^2_r γ_{z,r,t} δ^{D}, ∀v ∈ V_{z,t};
           0,  otherwise.    (13)

IV. PROBLEM FORMULATION

Our objective is to minimize the weighted sum of the overall computing service delay for vehicle users and the service failure penalty. The corresponding optimization problem can be formulated as follows:

min_{{α,β,γ,x,{I^{(r,t)},∀r,t}}}  lim_{T→∞} (1/T) Σ_{t=0}^{T−1} Σ_{z∈Z} { T^{service}_{z,t} 1_{z,t} + λ W_{z,t} (1 − 1_{z,t}) }    (14a)
s.t.  (3), (6),    (14b)
      Σ_{r∈R} α_{z,r,t} = 1,  Σ_{r∈R} β_{z,r,t} = 1,  Σ_{r∈R} γ_{z,r,t} = 1,    (14c)
      Σ_{i=1}^{N_{r,t}} I^{(r,t)}_{i,j} = 1,  Σ_{j=1}^{N_{r,t}} I^{(r,t)}_{i,j} = 1,    (14d)
      0 ≤ x_{z,t} ≤ 1,    (14e)
      α_{z,t}, β_{z,t} ∈ Z_+^{|R|},    (14f)
      I^{(r,t)} ∈ Z_+^{N_{r,t}×N_{r,t}},    (14g)

where λ represents the per-unit penalty for the case when the computing offloading service fails. The optimization variables include three aspects: edge server selection, i.e., {α, β, γ}; task partition, i.e., x; and task scheduling, i.e., {I^{(r,t)}, ∀r, t}. It can be seen that Problem (14) is a mixed-integer nonlinear optimization problem. Solving the above problem directly by conventional optimization methods is challenging. Furthermore, the decision dimension of the problem is too large to apply model-free techniques directly. Taking the variable of task execution order as an example, i.e., I^{(r,t)}, there are N_{r,t} × N_{r,t} decisions to be determined for a server in a time slot. The number of combinations of scheduling decisions is at least (|Z|/|R|)! × |R| × |T| when tasks are evenly assigned to servers and each task is processed by only one server. Thus, to reduce the decision dimension of the problem, we divide Problem (14) into two sub-problems: i) the task partition and scheduling problem, and ii) the edge server selection problem. In the task partition and scheduling problem, we aim to obtain the optimal task partition ratio and execution order to minimize the computing latency given the offloading policy
x̂_z = { [T^{Q}_{z,h(z)} − max{T^{Q}_{z,r(z)}, T^{T}_{z}} + χW_z/C_{h(z)}] / [χW_z/C_{r(z)} + χW_z/C_{h(z)}],
          if T^{Q}_{z,h(z)} − max{T^{Q}_{z,r(z)}, T^{T}_{z}} ≥ χW_z/C_{r(z)} − χR_{r(z),h(z)} (T^{Q}_{z,h(z)} − T^{T}_{z}) (1/C_{r(z)} + 1/C_{h(z)});
        [T^{T}_{z} − max{T^{Q}_{z,r(z)}, T^{T}_{z}} + χW_z/C_{h(z)} + W_z/R_{r(z),h(z)}] / [χW_z/C_{r(z)} + χW_z/C_{h(z)} + W_z/R_{r(z),h(z)}],
          otherwise.    (15)
{α, β}. After that, we re-formulate the edge server selection problem as an MDP and utilize the DRL technique to obtain the optimal offloading and computing policy.

V. TASK PARTITION AND SCHEDULING

Multiple tasks offloaded from different zones can be received by an edge server in a time slot. A computing task can only be processed after the tasks scheduled before it have been executed. As a result, the overall computing time may vary depending on the task execution order in the edge servers. In addition, the workload of a task can be divided and offloaded to two edge servers, i.e., the receiver and helper RSUs. The workload allocation for a task also affects the overall service time. Therefore, we study task partition and scheduling to minimize the service latency given the offloading policy {α, β}. Based on Problem (14), the delay minimization problem can be formulated as follows:

min_{x,{I^{(r,t)},∀r,t}}  Σ_{z∈Z} T^{service}_{z,t}    (16a)
s.t.  (14d), (14e), (14g).    (16b)

Problem (16) is a mixed-integer program, which involves a continuous variable x and an integer matrix variable {I^{(r,t)}, ∀r, t}. Moreover, even if x is known, the remaining integer problem is a variation of the traveling salesman problem, which is NP-hard. To reduce the time-complexity of problem-solving, we exploit the properties of task partition and scheduling and develop a heuristic algorithm to obtain an approximate result efficiently. To simplify the notation, we omit the time index t in the remainder of this section, since we consider the scheduling scheme for the tasks offloaded in one time slot. We further denote r(z) and h(z) as the indices of the receiver and helper RSUs for task z, respectively.

Lemma 1. If no task is queued after task z at both the receiver RSU and the helper RSU, the optimal partition ratio for the task is x*_z = min{max{0, x̂_z}, 1}, where x̂_z is determined by Eq. (15).

Proof. Without considering the tasks queued later, the service time of task z can be minimized by solving the following problem:

min  max{T^{C}_{z,r(z)}, T^{C}_{z,h(z)}}  s.t. (14e).    (17)

Given that 0 < x_z < 1, the optimal task partition exists when T^{C}_{z,r(z)} = T^{C}_{z,h(z)}, and the optimal partition ratio is x_z = x̂_z. In addition, x*_z = max{0, x̂_z} = 0 when the helper RSU can fully process task z in a shorter service time than the queuing time in the receiver RSU, i.e., max{T^{Q}_{z,r(z)}, T^{T}_{z}} ≥ max{T^{Q}_{z,h(z)}, T^{T}_{z} + W_z/R_{r(z),h(z)}} + χW_z/C_{h(z)}. Otherwise, x*_z = min{1, x̂_z} = 1 when the receiver RSU can process task z by itself in a shorter service time than the queuing time in the helper RSU, i.e., max{T^{Q}_{z,r(z)}, T^{T}_{z}} ≤ T^{Q}_{z,h(z)} − χW_z/C_{r(z)}.

Lemma 1 gives the optimal partition ratio from the perspective of an individual task. However, multiple tasks could be offloaded from different zones to an RSU, and the role of the RSU could differ across those tasks. The task partition strategy for a single task could affect the computing latency of the tasks queued after it. Therefore, we investigate the optimality of the task partition scheme in Lemma 1 in terms of minimizing the overall service time for all tasks z ∈ Z.

Lemma 2. Assume that the following conditions are met:
• The computing capability C_r is identical for all edge servers.
• The receiver RSU and helper RSU are different for each task, i.e., r(z) ≠ h(z).
• For the helper RSU of every task, the queuing time is not shorter than the offloading time, i.e., T^{Q}_{z,h(z)} ≥ T^{T}_{z,r(z)} + T^{R}_{r(z),h(z)}, ∀z, r.
Then, given the execution order of the tasks, the optimal solution of Problem (16) follows the result in Lemma 1, i.e., x*_z = min{max{0, x̂_z}, 1}, ∀z.

Proof. See Appendix A.

We have proved that, given the task execution order, the partition ratio in Lemma 1 is the optimal solution of Problem (16) under certain assumptions. Next, we explore the optimal scheduling order given the workload allocation policy.

Lemma 3. Consider only one available RSU in the system, i.e., r(z) = h(z). Under the assumption that the offloading time is proportional to the size of the task, the optimal task execution order is to schedule the task with the shortest service time first.

Proof. See Appendix B.

According to the properties provided in Lemmas 1-3, we design a heuristic algorithm to schedule the task execution order and allocate the workload among RSUs. The full algorithm is presented in Algorithm 1. In the algorithm, we allocate the task that has the shortest service time first. For each task, we divide the workload between the receiver RSU and the helper RSU according to the optimal partition ratio in Lemma 1. In the worst case, in which all zones have tasks to offload in a time slot, the algorithm requires |Z|(|Z|+1)/2 iterations to compute the task partition and scheduling results, which can still provide fast responses in a dynamic environment.
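As a numerical sanity check on the partition rule of Eq. (15) and Lemma 1 (a sketch with toy values, not the paper's implementation), the split x̂_z equalizes the completion times of the receiver and the helper when 0 < x̂_z < 1; clipping to [0, 1] recovers the boundary cases where one server takes the whole task:

```python
# Sketch of Eq. (15): optimal split of a task of W bits between receiver
# (capability C_r) and helper (C_h), with inter-RSU rate R. All values are toy.
def x_hat(chi, W, C_r, C_h, R, T_T, TQ_r, TQ_h):
    """Candidate partition ratio x_hat_z; the first branch applies when the
    helper's queuing delay dominates the forwarding delay."""
    lhs = TQ_h - max(TQ_r, T_T)
    rhs = chi * W / C_r - chi * R * (TQ_h - T_T) * (1 / C_r + 1 / C_h)
    if lhs >= rhs:
        return (TQ_h - max(TQ_r, T_T) + chi * W / C_h) / (
            chi * W / C_r + chi * W / C_h)
    return (T_T - max(TQ_r, T_T) + chi * W / C_h + W / R) / (
        chi * W / C_r + chi * W / C_h + W / R)

chi, W, C_r, C_h, R = 1.0, 10.0, 1.0, 1.0, 1000.0  # hypothetical parameters
T_T, TQ_r, TQ_h = 0.0, 0.0, 4.0                    # offloading / queuing delays
x = min(max(0.0, x_hat(chi, W, C_r, C_h, R, T_T, TQ_r, TQ_h)), 1.0)  # Lemma 1 clip

# With this split, both servers finish at the same time (Eq. (11)-style terms):
finish_r = max(T_T, TQ_r) + chi * x * W / C_r
finish_h = max(T_T + (1 - x) * W / R, TQ_h) + chi * (1 - x) * W / C_h
print(x, finish_r, finish_h)  # both completion times coincide (about 7.0 here)
```

Intuitively, the helper receives just enough work that its head start in queuing delay is used up exactly when the receiver finishes its own share.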
2332-7731 (c) 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: University of Glasgow. Downloaded on June 30,2020 at 02:11:54 UTC from IEEE Xplore. Restrictions apply.
Algorithm 1 Task Partition and Scheduling Algorithm (TPSA)
 1: At time slot t, initialize the set S = {z | W_{z,t} ≠ 0}.
 2: Initialize ψ_r = T^{Q0}_{r,t}, I^{(r,t)} = 0, and j_r = 1, ∀r.
 3: while |S| ≠ 0 do
 4:   Initialize Q_z = 0, ∀z ∈ S.
 5:   for task z = 1 : |S| do
 6:     Update r(z) = {r | α_{z,r,t} = 1} and h(z) = {r | β_{z,r,t} = 1}.
 7:     Update the partition ratio x_z = min{max{0, x̂_z}, 1}, where x̂_z is obtained by (15).
 8:     Update ψ̂_{z,r(z)} = ψ_{r(z)} + T^C_{z,r(z)}.
 9:     Update ψ̂_{z,h(z)} = ψ_{h(z)} + T^C_{z,h(z)}.
10:     If x_z = 1, then Q_z = ψ̂_{z,h(z)}.
11:     If x_z = 0, then Q_z = ψ̂_{z,r(z)}.
12:     If 0 < x_z < 1, then Q_z = (ψ̂_{z,r(z)} + ψ̂_{z,h(z)})/2.
13:   end for
14:   Find z* = argmin_z Q_z.
15:   Update ψ_{r(z*)} = ψ̂_{z*,r(z*)} and ψ_{h(z*)} = ψ̂_{z*,h(z*)}.
16:   Update the order matrix: I^{r(z*),t}_{z*,j_{r(z*)}} = 1 and I^{h(z*),t}_{z*,j_{h(z*)}} = 1.
17:   Update j_{r(z*)} = j_{r(z*)} + 1 and j_{h(z*)} = j_{h(z*)} + 1.
18:   S = S \ {z*}.
19: end while
20: T^{Q0}_{r,t+1} = ψ_r − , ∀r.

VI. AI-BASED COLLABORATIVE COMPUTING APPROACH

To deal with the server selection problem, we utilize a DRL technique to handle the complex decision-making problem in a dynamic environment. To implement the DRL method, we first re-formulate the problem as an MDP. An MDP can be defined by a tuple (S, A, T, C), where S represents the set of system states; A represents the set of actions; T = {p(s_{t+1} | s_t, a_t)} is the set of transition probabilities; and C is the set of real-valued cost functions. The term C(s, a) represents the cost when the system is at state s ∈ S and an action a ∈ A is taken. A policy π represents a mapping from S to A. In our problem, the state space, action space, and cost model of the MDP are summarized as follows:

1) State space: In time slot t, the network state, s_t, includes the computing data amount in the zones, i.e., {W_{z,t}, ∀z}, the average vehicle speed, i.e., {v_{z,t}, ∀z}, and the delay for the edge servers to finish the tasks offloaded in previous time slots {1, ..., t−1}, i.e., {T^{Q0}_{r,t}, ∀r}.

2) Action space: For zone z and time slot t, the action taken by the network includes three elements: the indices of the receiver RSU, helper RSU, and deliver RSU, which can be represented by {a^1_{z,t}, a^2_{z,t}, a^3_{z,t}}, respectively.

3) Cost model: Given the state-action pair, the overall service time can be obtained by the TPSA algorithm. Thus, according to the objective function (14), the cost function can be formulated as

    C(s_t, a_t) = Σ_{z∈Z} { T^{service}_{z,t} 1_{z,t} + λ W_{z,t} (1 − 1_{z,t}) }.   (18)

Then, to obtain the expected long-term discounted cost, the value function V of state s is

    V(s, π) = E[ Σ_{t=0}^{∞} γ^t C(s_t, a_t) | s_0 = s, π ],   (19)

where the parameter γ is a discount factor. By minimizing the value function of each state, we can obtain the optimal offloading and computing policy π*; that is,

    π*(s) = argmin_a Σ_{s'} p(s' | s, a) [ C(s, a) + γ V(s', π*) ].   (20)

Due to the limited knowledge of the transition probabilities between the states and the sizeable state-action space in the network, traditional dynamic programming is not able to find the optimal policy efficiently. Therefore, we adopt DRL to solve the proposed server selection problem. There are three common DRL algorithms: deep Q network (DQN), actor-critic (AC), and DDPG. DQN is a powerful tool to obtain the optimal policy when the state space is high-dimensional. Besides an online neural network (evaluation network) to learn the Q value, a frozen network (target network) and the experience replay technique are applied to stabilize the learning process. However, the method is inefficient for networks with a high-dimensional action space, while in our problem, the large number of zones leads to high dimensions in both the state and action spaces. On the other hand, both AC
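The loop of Algorithm 1 can be sketched in simplified form. The sketch below is not the paper's exact procedure: it omits the inter-RSU transmission delay and the order matrix I, and assumes each task arrives with its post-partition compute times for the receiver and helper halves already known; all field names are illustrative:

```python
# Simplified sketch of the TPSA loop (Algorithm 1): repeatedly commit the
# pending task with the smallest estimated finish time Q_z, updating the
# busy time psi of its receiver and helper servers.
def tpsa_schedule(tasks, num_servers):
    """tasks: list of dicts with keys 'id', 'recv', 'help' (server indices)
    and 't_recv', 't_help' (compute times of the two task halves after a
    Lemma-1 style partition). Returns the committed execution order and
    the final per-server busy times."""
    psi = [0.0] * num_servers           # current busy time per server
    pending = list(tasks)
    order = []
    while pending:                      # |Z|(|Z|+1)/2 iterations at worst
        best = None
        for task in pending:
            fin_r = psi[task['recv']] + task['t_recv']
            fin_h = psi[task['help']] + task['t_help']
            q = (fin_r + fin_h) / 2     # step 12 of Algorithm 1 (0 < x < 1)
            if best is None or q < best[0]:
                best = (q, task, fin_r, fin_h)
        _, task, fin_r, fin_h = best
        psi[task['recv']] = fin_r       # step 15: commit server busy times
        psi[task['help']] = fin_h
        order.append(task['id'])
        pending.remove(task)
    return order, psi
```

Consistent with Lemma 3, the inner loop always commits the cheapest remaining task first, so a short task is never delayed behind a long one on the same pair of servers.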
and DDPG tackle the problem of a high action dimension by the policy gradient technique. Two networks, i.e., the actor and critic networks, are adopted, in which the critic evaluates the Q value, and the actor updates the policy parameters in the direction suggested by the critic. Moreover, DDPG combines the characteristics of DQN on top of the AC algorithm, learning the Q value and the deterministic policy with experience replay and frozen networks, thereby helping reach fast convergence [34]. In this paper, we exploit the DDPG algorithm to obtain the optimal collaborative computing policy in vehicular networks.

The illustration of our AI-based collaborative computing approach is shown in Fig. 3. The system states are observed from the MEC-enabled vehicular network. After state s_t is obtained, the optimal server selection policy can be computed by the DDPG algorithm. According to the server selection results, the corresponding task partition and scheduling policy can be obtained by the proposed TPSA algorithm. Then, the cost of the corresponding state-action pair and the next system state can be observed from the environment. The state transition tuple (s_t, a_t, r_t, s_{t+1}) is stored in the replay memory for training the neural networks. In DDPG, four neural networks are employed. Two of the four networks are evaluation networks, whose weights are updated when the neural network is trained, and the other two are target networks, whose weights are replaced periodically from the evaluation networks. For both the evaluation and target networks, two neural networks, i.e., actor and critic networks, are adopted to evaluate the optimal policy and the Q value, respectively. The weights of the evaluation and target critic networks are denoted by θ^Q and θ^{Q'}, and the weights of the evaluation and target actor networks are denoted by θ^μ and θ^{μ'}, respectively.

In each training step, a batch of experience tuples is extracted from the experience replay memory, where the number of tuples in a mini-batch is denoted by N. The critics in both the evaluation and target networks approximate the value function and compute the loss function L, where

    L(θ^Q) = E[ (y_t − Q(s_t, a_t | θ^Q))^2 ].   (21)

The term Q(s_t, a_t | θ^Q) represents the Q function approximated by the evaluation network. The value of y_t is obtained from the value function approximated by the target network, where

    y_t = C(s_t, a_t) + γ Q'(s_{t+1}, μ'(s_{t+1} | θ^{μ'}) | θ^{Q'}).   (22)

The term μ'(s_{t+1} | θ^{μ'}) represents the action taken at s_{t+1} given by the target actor network. By minimizing the loss function (21), the weights of the evaluation critic, i.e., θ^Q, can be updated. On the other hand, to update the weights of the evaluation actor network, the policy gradient can be represented as

    ∇_{θ^μ} J ≈ (1/N) Σ_t ∇_a Q(s, a | θ^Q)|_{s=s_t, a=μ(s_t)} ∇_{θ^μ} μ(s | θ^μ)|_{s=s_t}.   (23)

From (23), it can be seen that the actor weights are updated in each training step according to the direction suggested by the critic.

Although the DDPG algorithm is able to tackle problems with high-dimensional state and action spaces, it is inefficient to apply the DDPG algorithm directly in our problem due to the 2-dimensional transportation network and the multiple dimensions of the input. A huge number of neurons would be deployed if a conventional neural network with fully connected layers were adopted. To improve the algorithm efficiency, we utilize a CNN in both the actor and critic networks to exploit the correlation of states and actions among different zones. The structure of the actor and critic networks is shown in Fig. 4. Before the fully connected layers, convolution layers and pooling layers are applied to learn the relevant features of the inputs among zones. Due to the weight-sharing feature of CNN filters, the number of training parameters can be significantly reduced compared to a network with fully connected layers [35]. After several convolution and pooling layers, the output of the CNN is combined with the state of the edge servers and forwarded to the fully connected layers.

The proposed AI-based collaborative computing approach is provided in Algorithm 2, where τ is a small number less than 1. In our algorithm, to learn the environment efficiently, the system continuously trains the parameters N_t times after every N_e time steps, where N_e > N_t.

VII. PERFORMANCE EVALUATION

In this section, we first present the efficiency of the proposed TPSA algorithm in task partition and scheduling. Then, we evaluate the performance of the proposed AI-based collaborative computing approach in a vehicular network simulated by VISSIM [36], where TPSA is applied to schedule computing tasks according to the policy given by the DDPG algorithm.

A. Task Partition and Scheduling Algorithm

We first evaluate the performance of the proposed TPSA algorithm. In the simulation, we consider that tasks can be offloaded to five edge servers with an identical offloading rate of 6 Mbits/s. The communication rate among the servers is 8 Mbits/s. We set the computing capability of the servers to 8 GC/s, and the number of computation cycles needed for processing 1 Mbit is 4 GC. The computing data amount of the tasks is uniformly distributed in the range of [1, 21] Mbits. For each task, the receiver and helper RSUs are randomly selected from the five servers. We compare the proposed TPSA algorithm with brute-force and random schemes. In the brute-force scheme, we utilize an exhaustive search to find the optimal scheduling order. In the random scheme, we randomly assign the scheduling order of the tasks. Note that, for both the brute-force and random schemes, we adopt the optimal task partition ratio in workload allocation. The simulation results presented in Figs. 5(a) and 5(b) are averaged over 200 rounds of Monte Carlo simulations.

The service delay performance of the proposed algorithm is shown in Fig. 5(a). It can be seen that an increase in the task number leads to an increasing overall service time, and the increasing rate of the random scheme is the highest among the three schemes. The proposed TPSA algorithm can achieve a performance very close to the brute-force scheme. Moreover, …
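The critic update in (21) and (22) can be sketched numerically with toy linear function approximators standing in for the critic and target networks (the cost C(s_t, a_t) plays the role that the reward plays in the standard DDPG formulation); all names, weights, and dimensions below are illustrative assumptions:

```python
import numpy as np

# Toy linear stand-ins for the DDPG networks; not the paper's CNNs.
def q_value(theta, s, a):                  # critic: Q(s, a | theta)
    return float(theta @ np.concatenate([s, a]))

def target_policy(theta_mu_t, s):          # target actor: mu'(s | theta_mu')
    return theta_mu_t @ s

def critic_loss(theta_q, theta_q_t, theta_mu_t, batch, gamma=0.9):
    """Mean squared TD error of Eq. (21); each target follows Eq. (22):
    y_t = C(s_t, a_t) + gamma * Q'(s_{t+1}, mu'(s_{t+1}))."""
    errs = []
    for s, a, cost, s_next in batch:       # one experience tuple per entry
        a_next = target_policy(theta_mu_t, s_next)
        y = cost + gamma * q_value(theta_q_t, s_next, a_next)
        errs.append((y - q_value(theta_q, s, a)) ** 2)
    return sum(errs) / len(errs)
```

Minimizing this loss with gradient descent on theta_q corresponds to the evaluation-critic update; the frozen target weights theta_q_t and theta_mu_t are only refreshed periodically, which is what stabilizes the bootstrapped targets.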
[Fig. 4: Structure of the actor and critic networks. The state input (server queuing times T^{Q0}_1, ..., T^{Q0}_R, vehicle speeds, and zone data amounts W_{1,1}, ..., W_{I,J}) and the action input (a^1, a^2, a^3 for the receiver, helper, and deliver RSUs) pass through convolution and pooling layers before the fully connected layers that produce the outputs.]
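The parameter saving from the weight sharing of CNN filters, compared to a fully connected layer over the zone grid, can be illustrated with a quick count under assumed layer sizes (not the exact configuration in Table II):

```python
# Back-of-envelope parameter count: fully connected layer vs. a
# shared-weight convolution over an I x J grid of zones. All sizes here
# are illustrative assumptions.
I, J, H = 10, 10, 128          # zone grid and a hidden layer of width H
fc_params = I * J * H          # dense layer: one weight per input-unit pair
K, F = 10, 5                   # K convolution filters with F weights each
conv_params = K * F            # filter weights are shared across the grid
print(fc_params, conv_params)  # 12800 vs. 50
```

Because the same F filter weights are reused at every grid position, the convolutional front end grows with the filter size rather than with the number of zones, which is what makes the approach tractable for large transportation networks.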
TABLE II
NEURAL NETWORK STRUCTURE

Actor Network
Layer  | Number of neurons      | Activation function
CONV1  | 5×1×2×10, stride 1     | relu

Fig. 5. (a) Average service delay among the three task partition and scheduling schemes with respect to the number of tasks. (b) Average computation runtime among the three task partition and scheduling schemes with respect to the number of tasks.

Fig. 7. Average weighted computing delay cost versus computing task arrival rate per vehicle.

[Simulation scenario figure residue: 800 m × 400 m area, zone z = (a, b), RSU spacings of 40 m and 400 m.]

… RSU with the highest SNR, and the received computing tasks will not be collaboratively computed with other RSUs. In the …
Fig. 8. Average percentage of service failure versus computing task arrival rate per vehicle.

Fig. 9. Average amount of computing data successfully computed versus computing task arrival rate per vehicle.

Fig. 10. Average service delay per 1 Mbit of successfully computed data versus computing task arrival rate per vehicle.

Fig. 11. Convergence performance of the proposed algorithm, where the task arrival rate is 0.1 request/s.
APPENDIX B: PROOF OF LEMMA 3

Suppose the tasks in edge server r are scheduled by the shortest-task-first rule, and task 2 is queued after task 1. Then, we have

    max{T^Q_{1,r}, T^T_{1,r}} + T^P_{1,r} ≤ max{T^Q_{1,r}, T^T_{2,r}} + T^P_{2,r}.   (29)

If the order of task 1 and task 2 is switched, the service time of task 2 will be decreased by

    D = max{max{T^Q_{1,r}, T^T_{1,r}} + T^P_{1,r}, T^T_{2,r}} − max{T^Q_{1,r}, T^T_{2,r}}.   (30)

On the other hand, the service time of task 1 will be increased by

    I = max{T^Q_{1,r}, T^T_{2,r}} + T^P_{2,r} − max{T^Q_{1,r}, T^T_{1,r}}.   (31)

From (29), we can derive that I ≥ T^P_{1,r}. Then, the overall service time of tasks 1 and 2 will be increased by

    I − D ≥ T^P_{1,r} − max{max{T^Q_{1,r}, T^T_{1,r}} + T^P_{1,r}, T^T_{2,r}} + max{T^Q_{1,r}, T^T_{2,r}}.   (32)

We then list the three scenarios on T^T_{2,r}:

• Case 1: T^T_{2,r} ≥ max{T^Q_{1,r}, T^T_{1,r}} + T^P_{1,r}. In this case, I − D ≥ T^P_{1,r} ≥ 0.
• Case 2: T^Q_{1,r} ≤ T^T_{2,r} ≤ max{T^Q_{1,r}, T^T_{1,r}} + T^P_{1,r}. In this case, I − D ≥ T^T_{2,r} − max{T^Q_{1,r}, T^T_{1,r}}. According to the assumption that T^T_{1,r} ≤ T^T_{2,r}, we then have I − D ≥ 0.
• Case 3: T^T_{2,r} ≤ T^Q_{1,r}. In this case, I − D ≥ T^Q_{1,r} − max{T^Q_{1,r}, T^T_{1,r}} = 0.

Therefore, we can conclude that the overall service time will be increased if the task execution order does not follow the shortest-task-first rule under the stated assumptions.

REFERENCES

[1] N. Cheng, N. Zhang, N. Lu, X. Shen, J. W. Mark, and F. Liu, "Opportunistic spectrum access for CR-VANETs: A game-theoretic approach," IEEE Trans. Veh. Technol., vol. 63, no. 1, pp. 237–251, 2014.
[2] H. Peng and X. Shen, "Deep reinforcement learning based resource management for multi-access edge computing in vehicular networks," IEEE Trans. Netw. Sci. Eng., 2020, to appear.
[3] J. Gao, M. Li, L. Zhao, and X. Shen, "Contention intensity based distributed coordination for V2V safety message broadcast," IEEE Trans. Veh. Technol., vol. 67, no. 12, pp. 12288–12301, Dec. 2018.
[4] "Self driving safety report," Tech. Rep., Nvidia, Santa Clara, CA, USA, 2018.
[5] K. Zhang, Y. Zhu, S. Leng, Y. He, S. Maharjan, and Y. Zhang, "Deep learning empowered task offloading for mobile edge computing in urban informatics," IEEE Internet Things J., vol. 6, no. 5, pp. 7635–7647, Oct. 2019.
[6] J. Liu, J. Wan, B. Zeng, Q. Wang, H. Song, and M. Qiu, "A scalable and quick-response software defined vehicular network assisted by mobile edge computing," IEEE Commun. Mag., vol. 55, no. 7, pp. 94–100, Jul. 2017.
[7] N. Zhang, S. Zhang, P. Yang, O. Alhussein, W. Zhuang, and X. Shen, "Software defined space-air-ground integrated vehicular networks: Challenges and solutions," IEEE Commun. Mag., vol. 55, no. 7, pp. 101–109, Jul. 2017.
[8] J. Xu, L. Chen, and S. Ren, "Online learning for offloading and autoscaling in energy harvesting mobile edge computing," IEEE Trans. Cogn. Commun. Netw., vol. 3, no. 3, pp. 361–373, Sep. 2017.
[9] J. Zhang and K. B. Letaief, "Mobile edge intelligence and computing for the internet of vehicles," Proc. IEEE, vol. 108, no. 2, pp. 246–261, 2020.
[10] F. Lyu, H. Zhu, H. Zhou, L. Qian, W. Xu, M. Li, and X. Shen, "MoMAC: Mobility-aware and collision-avoidance MAC for safety applications in VANETs," IEEE Trans. Veh. Technol., vol. 67, no. 11, pp. 10590–10602, Nov. 2018.
[11] T. Taleb, A. Ksentini, and P. A. Frangoudis, "Follow-me cloud: When cloud services follow mobile users," IEEE Trans. Cloud Comput., vol. 7, no. 2, pp. 369–382, Apr. 2019.
[12] J. Cao, L. Yang, and J. Cao, "Revisiting computation partitioning in future 5G-based edge computing environments," IEEE Internet Things J., vol. 6, no. 2, pp. 2427–2438, Apr. 2019.
[13] L. Lin, X. Liao, H. Jin, and P. Li, "Computation offloading toward edge computing," Proc. IEEE, vol. 107, no. 8, pp. 1584–1607, Aug. 2019.
[14] Y. C. Hu, M. Patel, D. Sabella, N. Sprecher, and V. Young, "Mobile edge computing—a key technology towards 5G," White paper, ETSI, 2014.
[15] N. Cheng, F. Lyu, W. Quan, C. Zhou, H. He, W. Shi, and X. Shen, "Space/aerial-assisted computing offloading for IoT applications: A learning-based approach," IEEE J. Sel. Areas Commun., vol. 37, no. 5, pp. 1117–1129, May 2019.
[16] C. Liu, M. Bennis, M. Debbah, and H. V. Poor, "Dynamic task offloading and resource allocation for ultra-reliable low-latency edge computing," IEEE Trans. Commun., vol. 67, no. 6, pp. 4132–4150, Jun. 2019.
[17] L. Chen, S. Zhou, and J. Xu, "Computation peer offloading for energy-constrained mobile edge computing in small-cell networks," IEEE/ACM Trans. Netw., vol. 26, no. 4, pp. 1619–1632, Aug. 2018.
[18] C. You and K. Huang, "Exploiting non-causal CPU-state information for energy-efficient mobile cooperative computing," IEEE Trans. Wireless Commun., vol. 17, no. 6, pp. 4104–4117, Jun. 2018.
[19] M. Li, N. Cheng, J. Gao, Y. Wang, L. Zhao, and X. Shen, "Energy-efficient UAV-assisted mobile edge computing: Resource allocation and trajectory optimization," IEEE Trans. Veh. Technol., 2020, to appear.
[20] H. A. Alameddine, S. Sharafeddine, S. Sebbah, S. Ayoubi, and C. Assi, "Dynamic task offloading and scheduling for low-latency IoT services in multi-access edge computing," IEEE J. Sel. Areas Commun., vol. 37, no. 3, pp. 668–682, Mar. 2019.
[21] J. Feng, Z. Liu, C. Wu, and Y. Ji, "AVE: Autonomous vehicular edge computing framework with ACO-based scheduling," IEEE Trans. Veh. Technol., vol. 66, no. 12, pp. 10660–10675, 2017.
[22] Y. He, N. Zhao, and H. Yin, "Integrated networking, caching, and computing for connected vehicles: A deep reinforcement learning approach," IEEE Trans. Veh. Technol., vol. 67, no. 1, pp. 44–55, Jan. 2018.
[23] Q. Qi, J. Wang, Z. Ma, H. Sun, Y. Cao, L. Zhang, and J. Liao, "Knowledge-driven service offloading decision for vehicular edge computing: A deep reinforcement learning approach," IEEE Trans. Veh. Technol., vol. 68, no. 5, pp. 4192–4203, May 2019.
[24] X. Wang, Z. Ning, and L. Wang, "Offloading in internet of vehicles: A fog-enabled real-time traffic management system," IEEE Trans. Ind. Informat., vol. 14, no. 10, pp. 4568–4578, Oct. 2018.
[25] M. Li, L. Zhao, and H. Liang, "An SMDP-based prioritized channel allocation scheme in cognitive enabled vehicular ad hoc networks," IEEE Trans. Veh. Technol., vol. 66, no. 9, pp. 7925–7933, Sep. 2017.
[26] Y. Sun, X. Guo, J. Song, S. Zhou, Z. Jiang, X. Liu, and Z. Niu, "Adaptive learning-based task offloading for vehicular edge computing systems," IEEE Trans. Veh. Technol., vol. 68, no. 4, pp. 3061–3074, Apr. 2019.
[27] Z. Ning, K. Zhang, X. Wang, M. S. Obaidat, L. Guo, X. Hu, B. Hu, Y. Guo, B. Sadoun, and R. Y. K. Kwok, "Joint computing and caching in 5G-envisioned internet of vehicles: A deep reinforcement learning-based traffic control system," IEEE Trans. Intell. Transp. Syst., 2020, to appear.
[28] K. A. Hafeez, L. Zhao, Z. Liao, and B. N. Ma, "A fuzzy-logic-based cluster head selection algorithm in VANETs," in 2012 IEEE Int. Conf. Commun. (ICC), Jun. 2012, pp. 203–207.
[29] S. Wang, R. Urgaonkar, M. Zafer, T. He, K. Chan, and K. K. Leung, "Dynamic service migration in mobile edge computing based on Markov decision process," IEEE/ACM Trans. Netw., vol. 27, no. 3, pp. 1272–1288, Jun. 2019.
[30] H. Liang, X. Zhang, J. Zhang, Q. Li, S. Zhou, and L. Zhao, "A novel adaptive resource allocation model based on SMDP and reinforcement learning algorithm in vehicular cloud system," IEEE Trans. Veh. Technol., vol. 68, no. 10, pp. 10018–10029, Oct. 2019.
[31] J. Zhou, D. Tian, Y. Wang, Z. Sheng, X. Duan, and V. C. M. Leung, "Reliability-optimal cooperative communication and computing in connected vehicle systems," IEEE Trans. Mobile Comput., vol. 19, no. 5, pp. 1216–1232, 2020.
[32] Z. Xiao, X. Dai, H. Jiang, D. Wang, H. Chen, L. Yang, and F. Zeng, "Vehicular task offloading via heat-aware MEC cooperation using game-theoretic method," IEEE Internet Things J., vol. 7, no. 3, pp. 2038–2052, 2020.
[33] J. Chen, W. Xu, N. Cheng, H. Wu, S. Zhang, and X. Shen, "Reinforcement learning policy for adaptive edge caching in heterogeneous vehicular network," in 2018 IEEE Global Commun. Conf. (GLOBECOM), Dec. 2018, pp. 1–6.
[34] T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, "Continuous control with deep reinforcement learning," arXiv preprint arXiv:1509.02971, 2015.
[35] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems 25, 2012, pp. 1097–1105.
[36] "VISSIM," http://vision-traffic.ptvgroup.com/.

Lian Zhao (S'99-M'03-SM'06) received the Ph.D. degree from the Department of Electrical and Computer Engineering, University of Waterloo, Canada, in 2002. She joined the Department of Electrical, Computer, & Biomedical Engineering at Ryerson University, Toronto, Canada, in 2003 and has been a Professor since 2014. Her research interests are in the areas of wireless communications, radio resource management, mobile edge computing, caching and communications, and vehicular ad-hoc networks. She has been selected as an IEEE Communication Society (ComSoc) Distinguished Lecturer (DL) for 2020 and 2021; received the Best Land Transportation Paper Award from the IEEE Vehicular Technology Society in 2016, the Top 15 Editor Award in 2015 for IEEE Transactions on Vehicular Technology, the Best Paper Award from the 2013 International Conference on Wireless Communications and Signal Processing (WCSP), and the Canada Foundation for Innovation (CFI) New Opportunity Research Award in 2005. She has been serving as an Editor for IEEE Transactions on Vehicular Technology, IEEE Transactions on Wireless Communications, and IEEE Internet of Things Journal. She served as a co-Chair of the Wireless Communication Symposium for IEEE Globecom 2020 and IEEE ICC 2018; Local Arrangement co-Chair for IEEE VTC Fall 2017 and IEEE INFOCOM 2014; and co-Chair of the Communication Theory Symposium for IEEE Globecom 2013. She served as a committee member for the NSERC (Natural Sciences and Engineering Research Council of Canada) Discovery Grants Evaluation Group for Electrical and Computer Engineering from 2015 to 2018. She is a licensed Professional Engineer in the Province of Ontario and a senior member of the IEEE Communication Society and Vehicular Technology Society.