Dynamic Computation Offloading for Mobile Cloud Computing: A Stochastic Game-Theoretic Approach

Article in IEEE Transactions on Mobile Computing · June 2018

DOI: 10.1109/TMC.2018.2847337
Authors: Jianchao Zheng (Nanjing University of Science and Technology), Yueming Cai (PLA University of Science and Technology), Yuan Wu (The Hong Kong University of Science and Technology), Xuemin Sherman Shen (University of Waterloo)

IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 18, NO. 4, APRIL 2019 771
Dynamic Computation Offloading for Mobile Cloud Computing: A Stochastic Game-Theoretic Approach
Jianchao Zheng, Member, IEEE, Yueming Cai, Senior Member, IEEE,
Yuan Wu, Senior Member, IEEE, and Xuemin Shen, Fellow, IEEE
Abstract—Driven by the growing popularity of mobile applications, mobile cloud computing has been envisioned as a promising
approach to enhance the computation capability of mobile devices and reduce their energy consumption. In this paper, we investigate the
problem of multi-user computation offloading for mobile cloud computing in a dynamic environment, wherein mobile users become
active or inactive dynamically, and the wireless channels for mobile users to offload computation vary randomly. As mobile users are
self-interested and selfish in offloading computation tasks to the mobile cloud, we formulate the mobile users’ offloading decision
process in a dynamic environment as a stochastic game. We prove that the formulated stochastic game is equivalent to a weighted
potential game, which has at least one Nash Equilibrium (NE). We quantify the efficiency of the NE and further propose a multi-agent
stochastic learning algorithm that reaches the NE with a guaranteed convergence rate (which is also analytically derived). Finally, we
conduct simulations to validate the effectiveness of the proposed algorithm and evaluate its performance in a dynamic environment.
Index Terms—Mobile cloud computing, multi-user computation offloading, dynamic environment, stochastic game, multi-agent stochastic
learning
J. Zheng is with the National Innovation Institute of Defense Technology, Academy of Military Sciences PLA, Beijing 100010, China, and the College of Communications Engineering, PLA Army Engineering University, Nanjing 210007, China. E-mail: longxingren.zjc.s@163.com.
Y. Cai is with the College of Communications Engineering, PLA Army Engineering University, Nanjing 210007, China. E-mail: caiym@vip.sina.com.
Y. Wu is with the College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China. E-mail: iewuy@zjut.edu.cn.
X. Shen is with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON N2L 3G1, Canada. E-mail: sshen@uwaterloo.ca.
Manuscript received 10 Aug. 2016; revised 2 Apr. 2018; accepted 7 June 2018. Date of publication 14 June 2018; date of current version 4 Mar. 2019. (Corresponding author: Jianchao Zheng.)

1 INTRODUCTION
With the proliferation of smart wireless devices and mobile internet services, more and more mobile applications such as interactive gaming, face recognition, and augmented reality have emerged and drawn increasing interest [2]. These sophisticated applications usually require significant amounts of computation resources and energy, which, however, cannot be directly afforded by most mobile devices due to their limited computation resources and battery capacities [3], [4]. Therefore, mobile cloud computing, which enables mobile devices to offload their computation tasks to resource-rich cloud infrastructures (such as Amazon EC2, Microsoft Azure, and Google App Engine) via wireless links, has been envisioned as a promising approach to address this challenge [5]. In the cloud infrastructure, each mobile device is associated with a system-level cloud clone running a virtual machine that executes mobile applications on behalf of the mobile device [6].

Offloading mobile users’ computation tasks to the mobile cloud infrastructure usually involves considerable communication burdens between the cloud and mobile devices, which necessitates a careful design of a multi-user computation offloading strategy to improve wireless access efficiency [7]. A motivating example is as follows. When many mobile devices aggressively offload their computation tasks to a mobile cloud over the same wireless channel, they may generate severe co-channel interference to each other. Such severe interference leads to lower offloading rates (i.e., mobile users’ achievable data rates for sending computation tasks to the mobile cloud over the wireless link) and higher energy consumption for mobile devices, which consequently compromises the benefit of offloading computation tasks. Therefore, it is very important to achieve efficient computation offloading coordination when many mobile devices compete for a limited number of wireless channels to offload computation tasks to the mobile cloud infrastructure.

Game theory is a widely adopted mathematical tool to model and analyze complicated decision-making processes among a group of rational decision-makers with conflicting objectives [8], [9], [10], [11], [12]. Since different mobile devices are usually owned by different users, it is natural to adopt game theory to analyze the computation offloading process for multiple mobile users who exploit a common set of wireless channels to offload their computation tasks. Specifically, each user is modeled as a rational game player that observes and reacts to other users’ offloading strategies in the best response manner. Such an interactive decision process is expected to reach an equilibrium point (also referred to as Nash Equilibrium, NE), at which no individual user will change its offloading strategy unilaterally. Moreover, by leveraging the intelligence of mobile users, game theory is useful for designing decentralized
mechanisms with low complexity, which help to ease the heavy controlling and signaling overhead of complex centralized management [10].

However, applying game theory to model the multi-user computation offloading process requires careful handling of the complex real network environment. Specifically, mobile users may become active or inactive¹ dynamically, and wireless channels are also time-varying. To capture the dynamics of the network environment, in this paper, we adopt a novel stochastic game-theoretic approach to analyze the users’ computation offloading decision-making process under dynamic conditions, and then propose a multi-agent stochastic learning algorithm to reach the NE of the stochastic game. The main contributions of this paper are summarized as follows:

- We formulate a stochastic game to model and analyze the multi-user computation offloading problem in a dynamic environment, wherein both mobile users’ activeness and wireless channel gains are time-varying. To show the existence of NE in the formulated stochastic game, we prove its equivalence to a weighted potential game which has at least one NE. Moreover, we analyze the performance bound of the NE in terms of the system cost and the number of mobile users who can benefit from cloud computing.
- To reach the NE of the formulated stochastic game, we propose a multi-agent stochastic learning algorithm for multi-user computation offloading in a dynamic environment. The proposed algorithm runs in a fully distributed manner without any information exchange, i.e., each user independently adjusts its offloading strategy based on its received action-reward instead of knowing other users’ detailed offloading strategies.
- As an important technical contribution of this paper, we theoretically derive the convergence rate of the multi-agent stochastic learning algorithm. It is technically challenging to prove the convergence property of the designed learning algorithms with multi-user interactions in a dynamic environment, and our study here is the first one successfully addressing this issue.

The rest of this paper is organized as follows. In Section 2, we give a brief review of the related works. In Section 3, our system model is introduced. In Section 4, we propose a stochastic game to investigate the problem of dynamic computation offloading. In Section 5, the performance of the NE of the game is analyzed. In Section 6, we propose a multi-agent stochastic learning algorithm to find the NE in a dynamic environment. Section 7 presents simulation results and discussions. Conclusions are drawn in Section 8.

1. A mobile user is active if it has a computation task to be executed, while a mobile user is inactive (or silent) if it does not have a computation task to be executed.

2 RELATED WORK
In the literature, many existing works have studied the computation offloading problem from the perspective of a single mobile user. Rudenko et al. [17] used experimental results to show that computation offloading can save significant energy. In [18], the authors designed an adaptive timeout scheme for computation offloading to improve the energy savings on mobile devices. Wen et al. [6] proposed an optimization scheme for energy-efficient application execution on the cloud-assisted mobile application platform. Huertacanepa and Lee [19] proposed an adaptive application offloading mechanism based on both the current system conditions and the execution history of applications. By invoking Lyapunov optimization, [20] and [21] studied dynamic computation offloading policies for minimizing CPU and network energy consumption in a real network environment. In [22], the authors modeled the unstable network as an alternating renewal process and proposed an offloading decision model for mobile cloud applications.

Only a few works have discussed the computation offloading problem in the multi-user case. Yang et al. [23] proposed a genetic algorithm to solve the partition problem of wireless network bandwidth among multiple users, which achieves high throughput in processing the streaming data. In [3], the authors devised a low-complexity heuristic method to perform energy-efficient task offloading for multiuser mobile cloud computing, while satisfying the delay requirements. Sardellitti et al. [4] proposed an iterative algorithm to perform the joint optimization of radio and computational resources for multi-cell mobile-edge computing, under latency and power budget constraints.

The above studies all belong to the centralized computation offloading mechanisms, which do not consider the interactions among multiple self-organizing users when they independently choose their computation offloading strategies. The works in [10], [11], [12], [13], [14], [15], [16] modeled mobile users as self-interested game players and proposed decentralized mechanisms to solve the multi-user computation offloading problems. These previous studies mainly focused on the computation offloading problems in a relatively static environment. However, in a real network environment, due to dynamic mobile users’ activeness and time-varying wireless channels, the utility of each player is dynamically varying, and thus the equilibrium solution of the static game model may never be reached. In this paper, we study the multi-user computation offloading problem with consideration of the dynamics of users’ behaviors and time-varying channels, and adopt the theory of stochastic games, which accounts for all the possible states of the dynamic process, to solve the problem.

The task of achieving NE solutions of the stochastic game in the distributed and dynamic environment is challenging. Most existing algorithms, such as the best (or better) response [24], fictitious play [25], spatial adaptive play [26], and no-regret learning [27], require enormous information exchanges for users’ strategy updating and require the environment to remain unchanged until the algorithms converge. For the distributed and dynamic environment, some efficient algorithms have been proposed by invoking stochastic learning automata (SLA) [28], [29], [30]. Specifically, the convergence of SLA-based algorithms to NE has been established for coordination games [28] and exact potential games [29], [30]. In this paper, we show that the formulated stochastic game is equivalent to a weighted potential game, and then prove the convergence of the designed multi-agent stochastic learning algorithm to the NE of the weighted potential game. Moreover, to the best of our knowledge, our work is the first to establish the convergence rate of the SLA-based algorithm in the distributed and dynamic environment.
Fig. 1. An illustration of the system model (a group of mobile users offload computation tasks to a mobile cloud via a set of available wireless channels. Each mobile user could dynamically change its activeness, and the wireless channels are time-varying).

TABLE 1
Summary of Notation
Notation | Description
$\mathcal{N}$ | set of mobile users
$\mathcal{M}$ | set of wireless channels
$s_i$ | user $i$’s offloading strategy
$\theta_i$ | user $i$’s active probability
$p_i$ | user $i$’s transmit power
$g_{i,o}$ | instantaneous channel gain
$\bar{g}_{i,o}$ | expected channel gain
$R_i$ | user $i$’s data rate
$I_i$ | user $i$’s received interference
$V_i^{clo}$ | cloud computing cost
$V_i^{loc}$ | local computing cost
$u_i^0$ | payoff function for static game $G_{0,\Lambda}$
$u_i$ | payoff function for static game $G_{1,\Lambda}$
$\bar{u}_i$ | payoff function for stochastic game $G_2$
$\mathcal{A}$ | set of active users
$s_{\mathcal{A}}$ | strategy profile of all active users
$s_{\mathcal{A}\setminus\{i\}}$ | strategy profile of all active users excluding $i$
$s$ | strategy profile of all users
$s_{-i}$ | strategy profile of all users excluding $i$
$w_i$ | user $i$’s strategy selection probability vector
$b$ | learning step-size of the proposed algorithm
$r_i^t$ | user $i$’s action-reward at time $t$

3 SYSTEM MODEL
As shown in Fig. 1, we consider a group of mobile users $\mathcal{N} = \{1, 2, \dots, N\}$, where each user $i$ may have a computationally intensive task $\mathcal{T}_i$ to be completed. There exists a wireless access point (AP) through which the mobile users can offload their computation tasks to the cloud center deployed by the telecom operator. Here, the wireless AP could be a WiFi access point, 3G/4G macro-cell or small-cell base station. Suppose that there are $M$ available wireless channels denoted as $\mathcal{M} = \{1, 2, \dots, M\}$. We use $s_i \in \{0\} \cup \mathcal{M}$ to denote mobile user $i$’s computation offloading strategy. Specifically, $s_i > 0$ denotes that user $i$ chooses to offload the computation task to the mobile cloud via wireless channel $s_i$; conversely, $s_i = 0$ denotes that user $i$ decides to compute its task locally without offloading to the mobile cloud. Notice that for each user $i$, choosing different channels will lead to different offloading rates when it sends its computation task to the AP, which consequently yields different costs.

3.1 Dynamics of Mobile Users’ Activeness and Wireless Channels
Communication systems operate in a time-slotted fashion over time slots of equal duration (e.g., several microseconds or milliseconds [32]). A computation offloading period (e.g., several seconds [10]) usually consists of multiple time slots. In this paper, we consider a general and practical case in which mobile users may become active or inactive dynamically within different time slots. Specifically, a mobile user is active if it has a computation task to be executed. Otherwise, the mobile user is inactive. We use the on-off distribution to model mobile user $i$’s activeness, i.e., mobile user $i$ is active (or inactive) with probability $\theta_i$ (or $1 - \theta_i$).

To model the time-varying wireless channels, we assume that the channels between mobile users and the AP follow Rayleigh fading, which is a realistic and widely adopted mobile channel model [33], [34]. Specifically, the instantaneous channel power gain from user $i$ to the AP is given by $g_{i,o} = d_{i,o}^{-\alpha} \beta_{i,o}$, where $d_{i,o}$ is the distance from user $i$ to the AP (for clear presentation, we use “o” to denote the AP), $\alpha$ is the path loss exponent, and $\beta_{i,o}$ is the Rayleigh fading factor. Notice that the instantaneous random coefficient $\beta_{i,o}$ varies from time slot to time slot.

We consider a more practical model in which all system parameters (i.e., channel power gains and the users’ active probabilities) are unknown. For convenience of analysis, we define a probability space as $(\Omega, \mathcal{H}, P)$, where $\Omega$ is the sample space over all system states, $\mathcal{H}$ is a minimal $\sigma$-algebra on subsets of $\Omega$, and $P$ is a probability measure on $(\Omega, \mathcal{H})$. Let $\Lambda$ denote an event in the sample space $\Omega$. $\Theta(\Lambda) = [a(\Lambda), g(\Lambda)] : \Omega \to 2^{N} \times \mathbb{R}^{N}$ is a random vector, where $a = [a_i]_{\forall i \in \mathcal{N}}$, $a_i \in \{0, 1\}$ denotes user $i$’s state (0 for inactive, and 1 for active), which satisfies the on-off distribution with probability $\theta_i$, and $g = [g_{i,o}]_{\forall i \in \mathcal{N}}$ follows Rayleigh fading. For better reading, Table 1 summarizes the main notation used in this paper.

3.2 Communication Model for Active Mobile Users
To make a clear presentation, we first consider one realization of the stochastic system state, which is denoted by $\Lambda$. Given $\Lambda$, we define the set of active users as $\mathcal{A} = \{i \in \mathcal{N} : a_i = 1\}$. Suppose that user $i$ chooses to offload its computation task to the cloud via wireless channel $s_i > 0$. Given the strategy profile $s_{\mathcal{A}} = [s_i]_{\forall i \in \mathcal{A}}$ of all active mobile users, the uplink data rate of user $i \in \mathcal{A}$ can be computed by²
$$R_i(s_{\mathcal{A}}, \Lambda) = B \log_2\left(1 + \frac{p_i g_{i,o}}{\sum_{j \in \mathcal{A}\setminus\{i\}: s_j = s_i} p_j g_{j,o} + \sigma_0}\right), \tag{1}$$
where $B$ is the channel bandwidth, $p_i$ is the transmit power of user $i$, $g_{i,o}$ is the channel gain from user $i$ to the AP, and $\sigma_0$ denotes the background noise power. As shown in (1), if too many active users choose the same channel to offload their computation tasks to the AP, they may incur severe interference to each other, resulting in low offloading rates.

2. In this paper, we focus on the wireless interference model given in (1), which is widely adopted in the literature. Note that we can also adopt some media access control protocols (such as CSMA) in which the multiple access among users for the shared spectrum is carried out at the packet level. As shown in [11], the analysis could be very similar to our adopted wireless interference model.

3.3 Computation Model for Active Mobile Users
We consider that each active user $i \in \mathcal{A}$ has a computation task $\mathcal{T}_i$ that needs to be executed either locally on the mobile device or remotely on the telecom cloud through computation offloading. The computation task can be expressed as $\mathcal{T}_i = (C_i, D_i^{loc}, D_i^{clo})$, where $C_i$ is the size of all input computation
data (e.g., the mobile system settings, the program codes, and the input parameters) involved in the task, and $D_i^{loc}$, $D_i^{clo}$ are the total number of CPU cycles required to complete the computation task on the mobile device and the telecom cloud, respectively.³ For the sake of clear presentation, we use the letters “loc” and “clo” to represent LOCal computing and CLOud computing, respectively. The detailed models of the computation cost are as follows.

3.3.1 Computation Cost When Choosing to Perform Cloud Computing
If active user $i$ chooses to offload its computation task $\mathcal{T}_i$ to the cloud, it would incur the cost for transmitting the input data to the cloud via wireless access. According to the communication model (1), the transmission time and energy consumption for offloading the input data of size $C_i$ can be respectively computed by
$$T_{i,1}^{clo}(s_{\mathcal{A}}, \Lambda) = \frac{C_i}{R_i(s_{\mathcal{A}}, \Lambda)}, \quad \text{and} \quad E_i^{clo}(s_{\mathcal{A}}, \Lambda) = \frac{p_i C_i}{R_i(s_{\mathcal{A}}, \Lambda)}.$$
After the transmission, the cloud expends the execution time $T_{i,2}^{clo} = D_i^{clo} / F_i^{clo}$ to finish mobile user $i$’s task, where $F_i^{clo}$ denotes the computation capability (i.e., CPU cycles per second) assigned to user $i$ by the cloud.⁴ Then, considering both the processing time⁵ and the energy consumption, the total cost⁶ when mobile user $i$ chooses to perform cloud computing (i.e., $s_i > 0$) can be given by
$$V_i^{clo}(s_{\mathcal{A}}, \Lambda) = \mu_i^T \left(T_{i,1}^{clo}(s_{\mathcal{A}}, \Lambda) + T_{i,2}^{clo}\right) + \mu_i^E E_i^{clo}(s_{\mathcal{A}}, \Lambda) = \frac{(\mu_i^E p_i + \mu_i^T) C_i}{R_i(s_{\mathcal{A}}, \Lambda)} + \mu_i^T T_{i,2}^{clo}, \quad i \in \mathcal{A}, \text{ and } s_i > 0, \tag{2}$$
where $\mu_i^T$ and $\mu_i^E \in (0, 1)$ denote the weights of computational time and energy for mobile user $i$’s strategy decision, respectively. Here, the units of $\mu_i^T$ and $\mu_i^E$ are Second$^{-1}$ and Joule$^{-1}$, respectively.

3. A mobile user $i$ can obtain the information of $C_i$, $D_i^{loc}$, and $D_i^{clo}$ by applying the methods (e.g., call graph analysis) in [35], [36]. If we consider the related overhead, it would be added just as a constant in the system model, which would not affect the following mathematical analysis. Besides, since the mobile devices and the remote cloud computing servers have different instruction set architectures, the numbers of CPU cycles for the two architectures (i.e., $D_i^{loc}$ and $D_i^{clo}$) are different.
4. $F_i^{clo}$ is considered to be preset for each user. In this paper, we do not study the allocation/scheduling of computation resources to different users from the perspective of the mobile cloud, since this issue has been widely investigated in the literature, e.g., [4], [20].
5. Since the AP is connected to the mobile cloud via a high-speed fiber link, the transmission time cost between them can be neglected compared with the much higher wireless access time cost (resulting from the constrained wireless spectrum resource). Besides, since the size of the computation outcome is usually much smaller than the size of the input computation data for many applications (e.g., face recognition), we neglect the time cost for the cloud to send the computation outcome back to the mobile user. A similar assumption also appears in many previous works [11], [17], [18], [19], [21].
6. It would also be interesting to consider the users’ economic cost. Since $F_i^{clo}$ is considered to be preset for each user in our system model, the economic cost is just a constant added into Eq. (2), and thus it will not affect the following game-theoretic solution. However, if $F_i^{clo}$ is considered as an optimization variable, how to optimally decide the price for the computing resources would also be a key problem for the service providers. In this case, economic game models such as the market model, bargaining model, bidding model, auction model, duopoly model, and Stackelberg model could be adopted. Since this economic consideration will lead to a significant change of the current game model, we consider this issue as an important future direction to extend our work.

3.3.2 Computation Cost When Choosing to Perform Local Computing
Each active mobile user $i$ can also choose to execute its computation task $\mathcal{T}_i$ locally by itself (i.e., without invoking any computation offloading to the cloud). Let $F_i^{loc}$ be the computation capability of mobile user $i$. The computation execution time of task $\mathcal{T}_i$ by local computing is then given by $T_i^{loc} = D_i^{loc} / F_i^{loc}$. The corresponding computational energy can be computed by $E_i^{loc} = \eta_i D_i^{loc}$, where $\eta_i$ is the coefficient denoting the energy consumption per CPU cycle. Then, by taking into account both the processing time and the energy consumption, we can compute mobile user $i$’s total cost when it chooses to perform local computation (i.e., $s_i = 0$) as follows:
$$V_i^{loc} = \mu_i^T T_i^{loc} + \mu_i^E E_i^{loc}, \quad i \in \mathcal{A}, \text{ and } s_i = 0. \tag{3}$$

Based on the communication and computation models above, we see that, when choosing to offload its computation task to the cloud, each mobile user’s cost depends not only on its own offloading strategy, but also on those of all the other active peers. Specifically, as shown in the cost function (2), if too many mobile users are active and use the same strategy (i.e., choose the same wireless channel) to offload their computation tasks to the cloud, they may experience low offloading rates, which will incur more computation cost (including longer transmission time and higher energy consumption). In this case, it would be more beneficial for the mobile user to compute the task locally by itself. Due to such an inter-dependence among different mobile users, game theory is a suitable mathematical tool to model and analyze users’ decision making for computation offloading. However, because mobile users’ activeness varies dynamically, each user may not be able to know other peers’ active/inactive states. Moreover, mobile users prefer better channel conditions for offloading their computation tasks, but the wireless channels are time-varying, which makes the problem more challenging.

4 STOCHASTIC COMPUTATION OFFLOADING GAME
In this section, we formulate a stochastic game to model the decision process for mobile users’ computation offloading to the cloud. For the sake of clear presentation, we first describe the game model in a static case, and then illustrate the dynamic case.

4.1 Game Models
4.1.1 Static Case
Given other users’ strategy decisions, each active user $i \in \mathcal{A}$ independently adjusts its computation offloading strategy to minimize its own computation cost. Specifically, given a realization $\Lambda$ in the probability space $(\Omega, \mathcal{H}, P)$, the (state-based) payoff function of each active user $i$ is naturally defined by
$$u_i^0\left(s_i, s_{\mathcal{A}\setminus\{i\}}, \Lambda\right) = \begin{cases} V_i^{loc}, & \text{if } s_i = 0, \\ V_i^{clo}\left(s_i, s_{\mathcal{A}\setminus\{i\}}, \Lambda\right), & \text{if } s_i > 0, \end{cases} \tag{4}$$

where $s_i$ denotes active user $i$’s strategy, $s_{\mathcal{A}\setminus\{i\}}$ denotes the strategy profile of all the active users excluding user $i$, and $V_i^{clo}$ and $V_i^{loc}$ are defined in (2) and (3), respectively. Then, the game can be formulated as $G_{0,\Lambda} = [\mathcal{A}, \Lambda, \{\mathcal{S}_i\}_{i \in \mathcal{N}}, \{u_i^0\}_{i \in \mathcal{N}}]$, where $\mathcal{A}$ is the set of active users and $\mathcal{S}_i = \{0, 1, \dots, M\}$ denotes active user $i$’s strategy space. Each active user autonomously chooses its strategy $s_i$ to minimize its own payoff, i.e.,
$$G_{0,\Lambda}: \min_{s_i \in \mathcal{S}_i} u_i^0\left(s_i, s_{\mathcal{A}\setminus\{i\}}, \Lambda\right), \quad \forall i \in \mathcal{A}. \tag{5}$$

Definition 1. Given a computation offloading strategy profile $s_{\mathcal{A}} = [s_i]_{\forall i \in \mathcal{A}}$, for an active user $i \in \mathcal{A}$ that chooses the cloud computing approach (i.e., $s_i > 0$), if the cloud computing does not yield a higher cost than the local computing (i.e., $V_i^{clo}(s_{\mathcal{A}}, \Lambda) \le V_i^{loc}$), we say that the cloud computing is beneficial to user $i$.

In particular, from (1), (2), and (3), we observe that whether offloading the computation task to the cloud is beneficial to mobile user $i$ strongly depends on its suffered interference, i.e., $I_i(s_{\mathcal{A}}, \Lambda) = \sum_{j \in \mathcal{A}\setminus\{i\}: s_j = s_i} p_j g_{j,o}$. Following a proof similar to that in [11], we obtain Lemma 1 as follows:

Lemma 1. Given a computation offloading strategy profile $s_{\mathcal{A}}$, cloud computing is beneficial to an active user $i \in \mathcal{A}$ if its suffered interference $I_i(s_{\mathcal{A}}, \Lambda) = \sum_{j \in \mathcal{A}\setminus\{i\}: s_j = s_i} p_j g_{j,o}$ on the selected wireless channel $s_i > 0$ satisfies $I_i(s_{\mathcal{A}}, \Lambda) \le Q_i$, with the threshold $Q_i = \frac{p_i g_{i,o}}{2^{c_i} - 1} - \sigma_0$, where the parameter
$$c_i = \frac{(\mu_i^E p_i + \mu_i^T) C_i}{B\left(\mu_i^T T_i^{loc} + \mu_i^E E_i^{loc} - \mu_i^T T_{i,2}^{clo}\right)}.$$

Proof. According to Definition 1, cloud computing is beneficial to user $i$ only if $V_i^{clo}(s_{\mathcal{A}}, \Lambda) \le V_i^{loc}$. Based on the computing models in (2) and (3), this condition corresponds to
$$\frac{(\mu_i^E p_i + \mu_i^T) C_i}{R_i(s_{\mathcal{A}}, \Lambda)} + \mu_i^T T_{i,2}^{clo} \le \mu_i^T T_i^{loc} + \mu_i^E E_i^{loc},$$
which after some manipulations leads to the following condition:
$$R_i(s_{\mathcal{A}}, \Lambda) \ge \frac{(\mu_i^E p_i + \mu_i^T) C_i}{\mu_i^T T_i^{loc} + \mu_i^E E_i^{loc} - \mu_i^T T_{i,2}^{clo}}.$$
Based on the communication model (1), the above condition can be equivalently transformed into
$$\sum_{j \in \mathcal{A}\setminus\{i\}: s_j = s_i} p_j g_{j,o} \le \frac{p_i g_{i,o}}{2^{\frac{(\mu_i^E p_i + \mu_i^T) C_i}{B\left(\mu_i^T T_i^{loc} + \mu_i^E E_i^{loc} - \mu_i^T T_{i,2}^{clo}\right)}} - 1} - \sigma_0.$$
This completes the proof of Lemma 1. □

Based on Lemma 1, we design the following game model $G_{1,\Lambda}$ (which will be shown to be equivalent to $G_{0,\Lambda}$ with proofs in Lemma 2 and Theorem 2):
$$G_{1,\Lambda}: \min_{s_i \in \mathcal{S}_i} u_i\left(s_i, s_{\mathcal{A}\setminus\{i\}}, \Lambda\right), \quad \forall i \in \mathcal{A}, \tag{6}$$
where each user’s payoff function is given by
$$u_i\left(s_i, s_{\mathcal{A}\setminus\{i\}}, \Lambda\right) = \begin{cases} Q_i, & \text{if } s_i = 0, \\ I_i(s_{\mathcal{A}}, \Lambda), & \text{if } s_i > 0. \end{cases} \tag{7}$$

4.1.2 Dynamic Case
We next extend the static game $G_{1,\Lambda}$ (under a given $\Lambda$) into a corresponding stochastic game that experiences all possible $\Lambda$. Specifically, in the dynamic and stochastic environment, we define an expected payoff function for each user $i \in \mathcal{N}$ as follows:
$$\bar{u}_i(s_i, s_{-i}) = \mathbb{E}_{\Theta}\left[u_i(s_i, s_{-i}, \Theta)\right] = \begin{cases} \bar{Q}_i, & \text{if } s_i = 0, \\ \mathbb{E}_{\Theta}\left[I_i(s_i, s_{-i}, \Theta)\right], & \text{if } s_i > 0, \end{cases} \tag{8}$$
where $s_{-i} = [s_j]_{\forall j \in \mathcal{N}\setminus\{i\}}$ denotes the strategy profile of all users excluding user $i$, $\bar{Q}_i = \frac{p_i \bar{g}_{i,o}}{2^{c_i} - 1} - \sigma_0$, and $\bar{g}_{i,o}$ is the expected channel gain from mobile user $i$ to the AP. Based on (8), we formulate a stochastic game denoted by $G_2 = [\mathcal{N}, \Theta, \{\mathcal{S}_i\}_{i \in \mathcal{N}}, \{\bar{u}_i\}_{i \in \mathcal{N}}]$. Each mobile user independently adjusts its strategy to minimize its individual expected payoff function, which can be expressed as
$$(G_2): \min_{s_i \in \mathcal{S}_i} \bar{u}_i(s_i, s_{-i}), \quad \forall i \in \mathcal{N}. \tag{9}$$
Fig. 2 illustrates the connections among the game models $G_{0,\Lambda}$, $G_{1,\Lambda}$, and $G_2$.

Fig. 2. Graphical representation of relationships among the three proposed game models.

4.2 Analysis of Nash Equilibrium
In game theory, Nash Equilibrium (NE) is the most important solution concept for analyzing the outcome of the strategic interaction of multiple decision-makers. In this section, we first investigate the existence of NE for the static game models $G_{0,\Lambda}$ and $G_{1,\Lambda}$. Moreover, by proving the equivalence between $G_{0,\Lambda}$ and $G_{1,\Lambda}$, we illustrate the rationality of designing the game model $G_{1,\Lambda}$. Then, based on the analysis in the static case, we derive the existence of NE in the stochastic game $G_2$. To proceed, we first introduce the definition of NE for game models in both static and dynamic cases.

Definition 2 (NE of $G_{0,\Lambda}$). For a realization $\Lambda \in \Omega$, a computation offloading strategy profile $s_{\mathcal{A}}^* = [s_i^*]_{\forall i \in \mathcal{A}}$ is a (pure-strategy) NE of the game $G_{0,\Lambda}$ if and only if no active mobile user can further reduce its payoff function $u_i^0$ by unilaterally deviating, i.e.,
$$u_i^0\left(s_i^*, s_{\mathcal{A}\setminus\{i\}}^*, \Lambda\right) \le u_i^0\left(s_i, s_{\mathcal{A}\setminus\{i\}}^*, \Lambda\right), \quad \forall i \in \mathcal{A}, \forall s_i \in \mathcal{S}_i. \tag{10}$$

Definition 3 (NE of $G_{1,\Lambda}$). For a realization $\Lambda \in \Omega$, a computation offloading strategy profile $s_{\mathcal{A}}^* = [s_i^*]_{\forall i \in \mathcal{A}}$ is a (pure-strategy) NE of the game $G_{1,\Lambda}$ if and only if no active mobile user can further reduce its payoff function $u_i$ by unilaterally deviating, i.e.,
$$u_i\left(s_i^*, s_{\mathcal{A}\setminus\{i\}}^*, \Lambda\right) \le u_i\left(s_i, s_{\mathcal{A}\setminus\{i\}}^*, \Lambda\right), \quad \forall i \in \mathcal{A}, \forall s_i \in \mathcal{S}_i. \tag{11}$$
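To make the static model concrete, the following Python sketch evaluates the rate in (1), the costs in (2) and (3), the threshold of Lemma 1, and the payoff in (7), and then searches exhaustively for pure-strategy NE of $G_{1,\Lambda}$ for a single realization. It is an illustrative sketch rather than the authors' implementation: the function names, the list-based data layout, and every numeric value are assumptions, all users are taken to be active (i.e., $\mathcal{A} = \mathcal{N}$), and the threshold assumes $V_i^{loc} > \mu_i^T T_{i,2}^{clo}$ so that $c_i$ is positive.

```python
import itertools
import math

def interference(i, s, p, g):
    """I_i: received power from other active users sharing user i's channel."""
    return sum(p[j] * g[j] for j in range(len(s)) if j != i and s[j] == s[i])

def rate(i, s, p, g, B, sigma0):
    """Uplink data rate R_i of Eq. (1); only meaningful when s[i] > 0."""
    return B * math.log2(1 + p[i] * g[i] / (interference(i, s, p, g) + sigma0))

def cloud_cost(i, s, p, g, B, sigma0, C, mu_T, mu_E, T2):
    """Cloud computing cost V_i^clo of Eq. (2)."""
    r = rate(i, s, p, g, B, sigma0)
    return (mu_E[i] * p[i] + mu_T[i]) * C[i] / r + mu_T[i] * T2[i]

def local_cost(i, mu_T, mu_E, T_loc, E_loc):
    """Local computing cost V_i^loc of Eq. (3)."""
    return mu_T[i] * T_loc[i] + mu_E[i] * E_loc[i]

def threshold(i, p, g, B, sigma0, C, mu_T, mu_E, T2, V_loc):
    """Interference threshold Q_i of Lemma 1 (assumes V_loc[i] > mu_T[i] * T2[i])."""
    c = (mu_E[i] * p[i] + mu_T[i]) * C[i] / (B * (V_loc[i] - mu_T[i] * T2[i]))
    return p[i] * g[i] / (2 ** c - 1) - sigma0

def payoff_g1(i, s, p, g, Q):
    """Payoff u_i of Eq. (7): Q_i for local computing, else received interference."""
    return Q[i] if s[i] == 0 else interference(i, s, p, g)

def is_nash(s, M, p, g, Q):
    """True if no user can strictly lower its G1 payoff by deviating alone."""
    for i in range(len(s)):
        for alt in range(M + 1):
            trial = s[:i] + (alt,) + s[i + 1:]
            if payoff_g1(i, trial, p, g, Q) < payoff_g1(i, s, p, g, Q):
                return False
    return True

def pure_nash_profiles(n, M, p, g, Q):
    """Enumerate all pure-strategy NE of G1 by exhaustive search (small n only)."""
    profiles = itertools.product(range(M + 1), repeat=n)
    return [s for s in profiles if is_nash(s, M, p, g, Q)]

# Toy realization: three active users, two channels (all values are illustrative).
n, M, B, sigma0 = 3, 2, 1.0, 1.0
p, g = [1.0, 1.0, 1.0], [1.0, 2.0, 3.0]
C, T2 = [1.0] * n, [0.1] * n
mu_T, mu_E = [0.5] * n, [0.5] * n
T_loc, E_loc = [2.0] * n, [2.1] * n
V_loc = [local_cost(i, mu_T, mu_E, T_loc, E_loc) for i in range(n)]
Q = [threshold(i, p, g, B, sigma0, C, mu_T, mu_E, T2, V_loc) for i in range(n)]
equilibria = pure_nash_profiles(n, M, p, g, Q)
print(equilibria)
```

Exhaustive enumeration costs $(M+1)^N$ payoff evaluations, so it is only practical as a sanity check on small instances; it also requires every user to know the full strategy profile, which is exactly the information exchange that a distributed learning algorithm is meant to avoid.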
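Definition 4 (Expected NE of $\mathcal{G}_2$). A computation offloading strategy profile $\mathbf{s}^* = [s_i^*]_{\forall i \in \mathcal{N}}$ is an expected (pure-strategy) NE of the stochastic game $\mathcal{G}_2$ if and only if no mobile user can further reduce its expected payoff function $\bar{u}_i$ by deviating unilaterally, i.e.,

$$\bar{u}_i\left(s_i^*, \mathbf{s}_{-i}^*\right) \le \bar{u}_i\left(s_i, \mathbf{s}_{-i}^*\right), \quad \forall i \in \mathcal{N}, \forall s_i \in \mathcal{S}_i. \tag{12}$$

Theorem 1. For an arbitrary realization $\Lambda \in \Omega$, $\mathcal{G}_{1,\Lambda}$ is a weighted potential game which has at least one NE.

Proof. The key of the proof is to show that for each user $k \in \mathcal{A}$, the change of its payoff function (due to its unilateral change of strategy) is proportional to the change in a carefully chosen potential function for the whole system. The details are as follows.

We first construct a state-based potential function as follows:

$$\Phi(\mathbf{s}_{\mathcal{A}}, \Lambda) = \frac{1}{2} \sum_{i \in \mathcal{A}} \sum_{j \in \mathcal{A}\setminus\{i\}} p_i g_{i,o}\, p_j g_{j,o}\, \mathbb{1}_{\{s_j = s_i\}} \mathbb{1}_{\{s_i > 0\}} + \sum_{i \in \mathcal{A}} p_i g_{i,o}\, Q_i\, \mathbb{1}_{\{s_i = 0\}}, \tag{13}$$

where $\mathbb{1}_{\{\text{condition}\}}$ is an indicator function, and it is equal to 0 (resp., 1) when the condition is false (resp., true). The above Eq. (13) can be equivalently written as follows:

$$\begin{aligned}
\Phi(\mathbf{s}_{\mathcal{A}}, \Lambda) = \frac{1}{2} \Bigg( & \sum_{j \in \mathcal{A}\setminus\{k\}} p_k g_{k,o}\, p_j g_{j,o}\, \mathbb{1}_{\{s_j = s_k\}} \mathbb{1}_{\{s_k > 0\}} + \sum_{i \in \mathcal{A}\setminus\{k\}} p_i g_{i,o}\, p_k g_{k,o}\, \mathbb{1}_{\{s_k = s_i\}} \mathbb{1}_{\{s_i > 0\}} \\
& + \sum_{i \in \mathcal{A}\setminus\{k\}} \sum_{j \in \mathcal{A}\setminus\{i,k\}} p_i g_{i,o}\, p_j g_{j,o}\, \mathbb{1}_{\{s_j = s_i\}} \mathbb{1}_{\{s_i > 0\}} \Bigg) \\
& + p_k g_{k,o}\, Q_k\, \mathbb{1}_{\{s_k = 0\}} + \sum_{i \in \mathcal{A}\setminus\{k\}} p_i g_{i,o}\, Q_i\, \mathbb{1}_{\{s_i = 0\}}.
\end{aligned} \tag{14}$$

In particular, the following result always holds:

$$\sum_{j \in \mathcal{A}\setminus\{k\}} p_k g_{k,o}\, p_j g_{j,o}\, \mathbb{1}_{\{s_j = s_k\}} \mathbb{1}_{\{s_k > 0\}} = \sum_{i \in \mathcal{A}\setminus\{k\}} p_i g_{i,o}\, p_k g_{k,o}\, \mathbb{1}_{\{s_k = s_i\}} \mathbb{1}_{\{s_i > 0\}}. \tag{15}$$

Using (14) and (15), we can derive the following result:

$$\Phi(\mathbf{s}_{\mathcal{A}}, \Lambda) = \sum_{j \in \mathcal{A}\setminus\{k\}} p_k g_{k,o}\, p_j g_{j,o}\, \mathbb{1}_{\{s_j = s_k\}} \mathbb{1}_{\{s_k > 0\}} + p_k g_{k,o}\, Q_k\, \mathbb{1}_{\{s_k = 0\}} + \Xi\left(\mathbf{s}_{\mathcal{A}\setminus\{k\}}, \Lambda\right), \tag{16}$$

where $\Xi\left(\mathbf{s}_{\mathcal{A}\setminus\{k\}}, \Lambda\right) = \frac{1}{2} \sum_{i \in \mathcal{A}\setminus\{k\}} \sum_{j \in \mathcal{A}\setminus\{i,k\}} p_i g_{i,o}\, p_j g_{j,o}\, \mathbb{1}_{\{s_j = s_i\}} \mathbb{1}_{\{s_i > 0\}} + \sum_{i \in \mathcal{A}\setminus\{k\}} p_i g_{i,o}\, Q_i\, \mathbb{1}_{\{s_i = 0\}}$ is independent of user $k$'s strategy $s_k$.

Besides, based on (7), we have the following equation:

$$u_k\left(s_k, \mathbf{s}_{\mathcal{A}\setminus\{k\}}, \Lambda\right) = I_k(\mathbf{s}_{\mathcal{A}}, \Lambda)\, \mathbb{1}_{\{s_k > 0\}} + Q_k\, \mathbb{1}_{\{s_k = 0\}} = \sum_{j \in \mathcal{A}\setminus\{k\}} p_j g_{j,o}\, \mathbb{1}_{\{s_j = s_k\}} \mathbb{1}_{\{s_k > 0\}} + Q_k\, \mathbb{1}_{\{s_k = 0\}}. \tag{17}$$

Therefore, for each user $k \in \mathcal{A}$ and its two different strategies $s_k$ and $s_k'$, we have the following equation:

$$\begin{aligned}
& \Phi\left(s_k', \mathbf{s}_{\mathcal{A}\setminus\{k\}}, \Lambda\right) - \Phi\left(s_k, \mathbf{s}_{\mathcal{A}\setminus\{k\}}, \Lambda\right) \\
& = p_k g_{k,o} \sum_{j \in \mathcal{A}\setminus\{k\}} p_j g_{j,o}\, \mathbb{1}_{\{s_j = s_k'\}} \mathbb{1}_{\{s_k' > 0\}} + p_k g_{k,o}\, Q_k\, \mathbb{1}_{\{s_k' = 0\}} \\
& \quad - p_k g_{k,o} \sum_{j \in \mathcal{A}\setminus\{k\}} p_j g_{j,o}\, \mathbb{1}_{\{s_j = s_k\}} \mathbb{1}_{\{s_k > 0\}} - p_k g_{k,o}\, Q_k\, \mathbb{1}_{\{s_k = 0\}} \\
& = p_k g_{k,o} \left( u_k\left(s_k', \mathbf{s}_{\mathcal{A}\setminus\{k\}}, \Lambda\right) - u_k\left(s_k, \mathbf{s}_{\mathcal{A}\setminus\{k\}}, \Lambda\right) \right).
\end{aligned} \tag{18}$$

As stated before, (18) essentially means that for each user $k \in \mathcal{A}$, the change of its payoff function (due to its unilateral change of strategy) is proportional to the change in the potential function (13) for the whole system. Thus, according to the potential game theory in [24], $\mathcal{G}_{1,\Lambda}$ is a weighted potential game (with weight-factor $p_k g_{k,o}$) which has at least one NE. This concludes the proof. □

According to the proof in [11], $\mathcal{G}_{0,\Lambda}$ is an ordinal potential game which also has at least one pure-strategy NE point. In the following, we will investigate the relationship between $\mathcal{G}_{0,\Lambda}$ and $\mathcal{G}_{1,\Lambda}$, and illustrate the rationality of designing the game model $\mathcal{G}_{1,\Lambda}$.

Lemma 2. In $\mathcal{G}_{0,\Lambda}$ and $\mathcal{G}_{1,\Lambda}$, all users' strategy preferences are the same. That is, for an arbitrary realization $\Lambda \in \Omega$, $\forall i \in \mathcal{A}$, for any $s_i' \neq s_i$,

$$u_i'\left(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda\right) \le u_i'\left(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda\right) \Leftrightarrow u_i\left(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda\right) \le u_i\left(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda\right). \tag{19}$$

Proof. The proof is essentially based on our previous Lemma 1. Specifically, we consider the following three cases:

(1) $s_i' > 0$, $s_i = 0$: According to the definition of payoff functions, $u_i'(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) = V_i^{clo}(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda)$, $u_i'(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) = V_i^{loc}$, and $u_i(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) = I_i(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda)$, $u_i(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) = Q_i$. Based on Lemma 1, $V_i^{clo}(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \le V_i^{loc} \Leftrightarrow I_i(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \le Q_i$. Therefore, $u_i'(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \le u_i'(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \Leftrightarrow u_i(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \le u_i(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda)$.

(2) $s_i' = 0$, $s_i > 0$: According to the definition of payoff functions, $u_i'(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) = V_i^{loc}$, $u_i'(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) = V_i^{clo}(\mathbf{s}_{\mathcal{A}}, \Lambda)$, and $u_i(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) = Q_i$, $u_i(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) = I_i(\mathbf{s}_{\mathcal{A}}, \Lambda)$. Based on Lemma 1, $V_i^{clo}(\mathbf{s}_{\mathcal{A}}, \Lambda) \ge V_i^{loc} \Leftrightarrow I_i(\mathbf{s}_{\mathcal{A}}, \Lambda) \ge Q_i$. Therefore, $u_i'(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \le u_i'(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \Leftrightarrow u_i(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \le u_i(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda)$.

(3) $s_i' > 0$, $s_i > 0$: According to the definition of payoff functions, $u_i'(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) = V_i^{clo}(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda)$, $u_i'(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) = V_i^{clo}(\mathbf{s}_{\mathcal{A}}, \Lambda)$, and $u_i(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) = I_i(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda)$, $u_i(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) = I_i(\mathbf{s}_{\mathcal{A}}, \Lambda)$. Since $V_i^{clo} = \frac{(\mu_i^E p_i + \mu_i^T) C_i}{R_i} + \mu_i^T T_{i,2}^{clo}$, we can get the following result: $V_i^{clo}(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \le V_i^{clo}(\mathbf{s}_{\mathcal{A}}, \Lambda) \Leftrightarrow R_i(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \ge R_i(\mathbf{s}_{\mathcal{A}}, \Lambda)$. As $R_i$ is a decreasing function of the received interference $I_i$, $R_i(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \ge R_i(\mathbf{s}_{\mathcal{A}}, \Lambda) \Leftrightarrow I_i(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \le I_i(\mathbf{s}_{\mathcal{A}}, \Lambda)$. Therefore, $u_i'(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \le u_i'(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \Leftrightarrow u_i(s_i', \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda) \le u_i(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}, \Lambda)$.

By summarizing the above three cases, we can obtain the results in Lemma 2. □

Theorem 2. Each NE of game $\mathcal{G}_{0,\Lambda}$ is an NE of game $\mathcal{G}_{1,\Lambda}$, and each NE of game $\mathcal{G}_{1,\Lambda}$ is also an NE of game $\mathcal{G}_{0,\Lambda}$.

Proof. Denote the set of NE of $\mathcal{G}_{0,\Lambda}$ by $\Psi_0$, and the set of NE of $\mathcal{G}_{1,\Lambda}$ by $\Psi_1$. $\forall \mathbf{s}^* \in \Psi_0$, according to the definition of NE, $u_i'(s_i^*, \mathbf{s}_{\mathcal{A}\setminus\{i\}}^*, \Lambda) \le u_i'(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}^*, \Lambda)$, $\forall i \in \mathcal{A}, \forall s_i \in \mathcal{S}_i$. Then, based on (19), we have $u_i(s_i^*, \mathbf{s}_{\mathcal{A}\setminus\{i\}}^*, \Lambda) \le u_i(s_i, \mathbf{s}_{\mathcal{A}\setminus\{i\}}^*, \Lambda)$, $\forall i \in \mathcal{A}, \forall s_i \in \mathcal{S}_i$, which implies that $\mathbf{s}^* \in \Psi_1$. Following the similar argument, we also have $\forall \mathbf{s}^* \in \Psi_1 \Rightarrow \mathbf{s}^* \in \Psi_0$. Therefore, we can conclude that $\Psi_0$ and $\Psi_1$ are the same. □

Lemma 2 and Theorem 2 together mean that $\mathcal{G}_{0,\Lambda}$ and $\mathcal{G}_{1,\Lambda}$ are essentially equivalent. Thus, by deriving the NE of $\mathcal{G}_{1,\Lambda}$, we can also get the NE for $\mathcal{G}_{0,\Lambda}$. Moreover, $\mathcal{G}_{1,\Lambda}$ can be used for designing the stochastic game model $\mathcal{G}_2$. Based on the property of $\mathcal{G}_{1,\Lambda}$ as shown in Theorem 1, we can obtain the following result.

Theorem 3. The stochastic game $\mathcal{G}_2$ is a weighted potential game with the expected potential function given by

$$\bar{\Phi}(\mathbf{s}) = \mathbb{E}_{\Theta}\left[\Phi(\mathbf{s}, \Theta)\right] = \frac{1}{2} \sum_{i \in \mathcal{N}} \sum_{j \in \mathcal{N}\setminus\{i\}} \theta_i \theta_j\, p_i \bar{g}_{i,o}\, p_j \bar{g}_{j,o}\, \mathbb{1}_{\{s_j = s_i\}} \mathbb{1}_{\{s_i > 0\}} + \sum_{i \in \mathcal{N}} \theta_i\, p_i \bar{g}_{i,o}\, \bar{Q}_i\, \mathbb{1}_{\{s_i = 0\}}, \tag{20}$$

where $\mathbf{s} = [s_i]_{\forall i \in \mathcal{N}}$ denotes the strategy profile of all mobile users, and $\theta_i$ is the active probability of mobile user $i \in \mathcal{N}$.

Proof. By taking the operation of expectation for (18), we can obtain the following result:

$$\bar{\Phi}\left(s_k', \mathbf{s}_{-k}\right) - \bar{\Phi}\left(s_k, \mathbf{s}_{-k}\right) = p_k \bar{g}_{k,o} \left( \bar{u}_k\left(s_k', \mathbf{s}_{-k}\right) - \bar{u}_k\left(s_k, \mathbf{s}_{-k}\right) \right), \tag{21}$$

where $\bar{u}_k(s_k, \mathbf{s}_{-k})$ is the payoff function defined in (8). Thus, according to the potential game theory in [24], $\mathcal{G}_2$ is a weighted potential game with weight-factor $p_k \bar{g}_{k,o}$. □

As proved in [24], every weighted potential game possesses the finite improvement property, and thus $\mathcal{G}_2$ has at least one pure-strategy NE. That is, the investigated mobile users' decision-making process for computation offloading is guaranteed to have a pure-strategy NE under the dynamic environment. However, each user in the game selfishly minimizes its own payoff (without caring about those of its peers), which may lead to an inefficient NE. In the following section, we will analyze the achievable performance of NE for the stochastic game $\mathcal{G}_2$ under dynamic environment.

5 PERFORMANCE ANALYSIS OF NASH EQUILIBRIUM

To evaluate the performance of the NE, we first study the metric of system-wide computation cost, and then analyze the number of mobile users who benefit from cloud computing.

5.1 Metric I: System-Wide Computation Cost

In dynamic and stochastic environment, (2) cannot characterize the cost of cloud computing, since the transmission rate $R_i(\mathbf{s})$ is dynamically varying. Therefore, we compute the expected cost of offloading computation task to mobile cloud in dynamic environment as follows:⁷

$$\bar{V}_i^{clo}(\mathbf{s}) = \frac{(\mu_i^E p_i + \mu_i^T) C_i}{\mathbb{E}_{\Theta}\left[R_i(\mathbf{s}, \Theta)\right]} + \mu_i^T T_{i,2}^{clo}. \tag{22}$$

In comparison, each mobile user's total cost for performing local computing is still given by (3), since the dynamic environment (channel dynamics, users' dynamic activeness) does not impact the local computing. In summary, we evaluate the computation cost in dynamic environment by the following metric:

$$\Gamma_i(s_i, \mathbf{s}_{-i}) = \begin{cases} V_i^{loc}, & \text{if } s_i = 0, \\ \bar{V}_i^{clo}(\mathbf{s}), & \text{if } s_i > 0. \end{cases} \tag{23}$$

Definition 5 (Price of Anarchy [38]). Price of Anarchy (PoA) is the ratio of system-wide computation costs between the worst (expected) NE and the globally optimal solution in centralized schemes, i.e.,

$$\mathrm{PoA} = \frac{\max_{\mathbf{s}^* \in \Psi} \sum_{i \in \mathcal{N}} \theta_i \Gamma_i(\mathbf{s}^*)}{\sum_{i \in \mathcal{N}} \theta_i \Gamma_i(\hat{\mathbf{s}})}, \tag{24}$$

where $\theta_i$ is mobile user $i$'s active probability, $\Psi$ is the set of expected NE of the stochastic game $\mathcal{G}_2$, and $\hat{\mathbf{s}}$ denotes the centralized optimal solution that minimizes the system-wide computation cost, i.e., $\hat{\mathbf{s}} = \arg\min_{\mathbf{s} \in \mathcal{S}} \sum_{i \in \mathcal{N}} \theta_i \Gamma_i(\mathbf{s})$, where $\mathcal{S}$ denotes the strategy space of all the users. Notice that the PoA provides a meaningful metric that indicates how good the NE is compared to the centralized optimal solution for minimizing the system cost.

Lemma 3. For an arbitrary expected NE $\mathbf{s}^*$ of the stochastic game $\mathcal{G}_2$, if $s_i^* > 0$, user $i$'s expected transmission rate $\mathbb{E}_{\Theta}[R_i(\mathbf{s}^*, \Theta)]$ for computation offloading is lower bounded by

$$R_i^{\inf} = \mathbb{E}_{g_{i,o}}\left[ B \log_2\left(1 + \frac{p_i g_{i,o}}{\sigma_0}\right) \right] - B \log_2\left(1 + \frac{\sum_{j \in \mathcal{N}\setminus\{i\}} p_j \bar{g}_{j,o} \theta_j}{M \sigma_0}\right). \tag{25}$$

Proof. According to Definition 4 for the expected Nash Equilibrium, $\forall i \in \mathcal{N}$, $\bar{u}_i(s_i^*, \mathbf{s}_{-i}^*) \le \bar{u}_i(s_i, \mathbf{s}_{-i}^*)$, which along with (8) leads to

$$\mathbb{E}_{\Theta}\left[I_i\left(s_i^*, \mathbf{s}_{-i}^*, \Theta\right)\right] \le \mathbb{E}_{\Theta}\left[I_i\left(s_i, \mathbf{s}_{-i}^*, \Theta\right)\right], \quad \forall s_i \in \mathcal{M} \subset \mathcal{S}_i. \tag{26}$$

By summing up the two sides of (26) over $s_i \in \mathcal{M}$, we can derive

$$M\, \mathbb{E}_{\Theta}\left[I_i\left(s_i^*, \mathbf{s}_{-i}^*, \Theta\right)\right] \le \sum_{s_i \in \mathcal{M}} \mathbb{E}_{\Theta}\left[I_i\left(s_i, \mathbf{s}_{-i}^*, \Theta\right)\right]. \tag{27}$$

7. An alternative choice of the expected cost can be $\bar{V}_i^{clo}(\mathbf{s}) = \mathbb{E}_{\Theta}\left[\frac{(\mu_i^E p_i + \mu_i^T) C_i}{R_i(\mathbf{s}, \Theta)}\right] + \mu_i^T T_{i,2}^{clo}$. In this paper, we choose to use (22) for the convenience of analysis.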
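As a numerical aside on footnote 7, the two candidate definitions of the expected cloud cost can be compared by Monte Carlo sampling. The sketch below is a minimal illustration under stated assumptions: the channel gain is drawn from an exponential distribution (Rayleigh fading power), the interference is held fixed, and the function name `expected_cloud_costs` and all of its parameters (`work` stands for $(\mu_i^E p_i + \mu_i^T) C_i$, `t_clo` for $\mu_i^T T_{i,2}^{clo}$) are hypothetical.

```python
import math
import random

def expected_cloud_costs(p, g_mean, interference, sigma0, B, work, t_clo,
                         samples=20000, seed=0):
    """Return (cost as in (22), cost as in the footnote-7 alternative).

    (22) divides by the mean rate; the alternative averages the per-sample
    cost work / rate. Gains are exponential with mean g_mean (assumption).
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(samples):
        g = rng.expovariate(1.0 / g_mean)
        rate = B * math.log2(1.0 + p * g / (interference + sigma0))
        if rate > 0.0:                      # skip degenerate zero-rate draws
            rates.append(rate)
    mean_rate = sum(rates) / len(rates)
    cost_eq22 = work / mean_rate + t_clo
    cost_alt = sum(work / r for r in rates) / len(rates) + t_clo
    return cost_eq22, cost_alt
```

By Jensen's inequality the alternative is never smaller than (22). Under exponentially distributed gains its sample average is dominated by rare deep fades, so the estimate can vary widely between seeds, which is one practical reason to prefer the form in (22).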
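Obviously, $\mathbb{E}_{\Theta}\left[I_i\left(s_i, \mathbf{s}_{-i}^*, \Theta\right)\right] = \sum_{j \in \mathcal{N}\setminus\{i\}:\, s_j^* = s_i} p_j \bar{g}_{j,o} \theta_j$. Thus

$$\begin{aligned}
\sum_{s_i \in \mathcal{M}} \mathbb{E}_{\Theta}\left[I_i\left(s_i, \mathbf{s}_{-i}^*, \Theta\right)\right] & = \sum_{s_i \in \mathcal{M}} \sum_{j \in \mathcal{N}\setminus\{i\}:\, s_j^* = s_i} p_j \bar{g}_{j,o} \theta_j \\
& = \sum_{s_i \in \mathcal{M}} \sum_{j \in \mathcal{N}\setminus\{i\}} p_j \bar{g}_{j,o} \theta_j\, \mathbb{1}_{\{s_j^* = s_i\}} \\
& = \sum_{j \in \mathcal{N}\setminus\{i\}} p_j \bar{g}_{j,o} \theta_j \sum_{s_i \in \mathcal{M}} \mathbb{1}_{\{s_j^* = s_i\}}.
\end{aligned} \tag{28}$$

If $s_j^* = 0$, $\sum_{s_i \in \mathcal{M}} \mathbb{1}_{\{s_j^* = s_i\}} = 0$; if $s_j^* \in \mathcal{M}$, $\sum_{s_i \in \mathcal{M}} \mathbb{1}_{\{s_j^* = s_i\}} = 1$. Thus

$$\sum_{s_i \in \mathcal{M}} \mathbb{E}_{\Theta}\left[I_i\left(s_i, \mathbf{s}_{-i}^*, \Theta\right)\right] \le \sum_{j \in \mathcal{N}\setminus\{i\}} p_j \bar{g}_{j,o} \theta_j, \tag{29}$$

which along with (27) yields

$$\mathbb{E}_{\Theta}\left[I_i\left(s_i^*, \mathbf{s}_{-i}^*, \Theta\right)\right] \le \frac{1}{M} \sum_{j \in \mathcal{N}\setminus\{i\}} p_j \bar{g}_{j,o} \theta_j. \tag{30}$$

Besides,

$$\begin{aligned}
\mathbb{E}_{\Theta}\left[R_i(\mathbf{s}^*, \Theta)\right] & = \mathbb{E}_{\Theta}\left[ B \log_2\left(1 + \frac{p_i g_{i,o}}{I_i(\mathbf{s}^*, \Theta) + \sigma_0}\right) \right] \\
& = \mathbb{E}_{\Theta}\left[ B \log_2\left(1 + \frac{I_i(\mathbf{s}^*, \Theta) + p_i g_{i,o}}{\sigma_0}\right) \right] - \mathbb{E}_{\Theta}\left[ B \log_2\left(1 + \frac{I_i(\mathbf{s}^*, \Theta)}{\sigma_0}\right) \right] \\
& = \mathbb{E}_{g_{i,o}}\left[ B \log_2\left(1 + \frac{p_i g_{i,o}}{\sigma_0}\right) \right] - \mathbb{E}_{g_{i,o}}\left[ B \log_2\left(1 + \frac{p_i g_{i,o}}{\sigma_0}\right) \right] \\
& \quad + \mathbb{E}_{\Theta}\left[ B \log_2\left(1 + \frac{I_i(\mathbf{s}^*, \Theta) + p_i g_{i,o}}{\sigma_0}\right) \right] - \mathbb{E}_{\Theta}\left[ B \log_2\left(1 + \frac{I_i(\mathbf{s}^*, \Theta)}{\sigma_0}\right) \right].
\end{aligned} \tag{31}$$

Notably,

$$\mathbb{E}_{g_{i,o}}\left[ B \log_2\left(1 + \frac{p_i g_{i,o}}{\sigma_0}\right) \right] \le \mathbb{E}_{\Theta}\left[ B \log_2\left(1 + \frac{I_i(\mathbf{s}^*, \Theta) + p_i g_{i,o}}{\sigma_0}\right) \right]. \tag{32}$$

Moreover, according to Jensen's inequality [41] and the upper bound of expected interference given in (30), we have

$$\mathbb{E}_{\Theta}\left[ B \log_2\left(1 + \frac{I_i(\mathbf{s}^*, \Theta)}{\sigma_0}\right) \right] \le B \log_2\left(1 + \frac{\mathbb{E}_{\Theta}\left[I_i(\mathbf{s}^*, \Theta)\right]}{\sigma_0}\right) \le B \log_2\left(1 + \frac{\sum_{j \in \mathcal{N}\setminus\{i\}} p_j \bar{g}_{j,o} \theta_j}{M \sigma_0}\right). \tag{33}$$

Then, (31), (32) and (33) lead to

$$\mathbb{E}_{\Theta}\left[R_i(\mathbf{s}^*, \Theta)\right] \ge \mathbb{E}_{g_{i,o}}\left[ B \log_2\left(1 + \frac{p_i g_{i,o}}{\sigma_0}\right) \right] - B \log_2\left(1 + \frac{\sum_{j \in \mathcal{N}\setminus\{i\}} p_j \bar{g}_{j,o} \theta_j}{M \sigma_0}\right). \tag{34}$$

□

Lemma 3 characterizes the lower bound of expected transmission rate of each user for computation offloading at any NE point. First, according to (25), the lower bound $R_i^{\inf}$ increases with the number of available channels $M$. The reason is that as the number of channels increases, mobile users can avoid mutual interference by choosing different channels for computation offloading. Second, when mobile users' active probabilities are lower, the lower bound $R_i^{\inf}$ becomes larger, which implies that higher offloading rate can be achieved. With Lemma 3, the following result can be achieved.

Theorem 4. For the multi-user stochastic computation offloading game $\mathcal{G}_2$, the PoA of the system-wide computation cost satisfies that

$$1 \le \mathrm{PoA} \le \frac{\sum_{i=1}^{N} \theta_i \max\left\{ V_i^{loc},\ \frac{(\mu_i^E p_i + \mu_i^T) C_i}{R_i^{\inf}} + \mu_i^T T_{i,2}^{clo} \right\}}{\sum_{i=1}^{N} \theta_i \min\left\{ V_i^{loc},\ \frac{(\mu_i^E p_i + \mu_i^T) C_i}{R_i^{\sup}} + \mu_i^T T_{i,2}^{clo} \right\}}, \tag{35}$$

where $R_i^{\sup} = B \log_2\left(1 + \frac{p_i \bar{g}_{i,o}}{\sigma_0}\right)$.

Proof. 1) Let $\mathbf{s}^* \in \Psi$ be an arbitrary expected NE of the game $\mathcal{G}_2$. If $s_i^* = 0$, user $i$ chooses local computing with the cost $V_i^{loc}$; if $s_i^* > 0$, user $i$ chooses cloud computing with the cost $\bar{V}_i^{clo}(\mathbf{s}^*)$. Thus, the following result always holds:

$$\Gamma_i(\mathbf{s}^*) \le \max\left\{ V_i^{loc},\ \bar{V}_i^{clo}(\mathbf{s}^*) \right\}. \tag{36}$$

Besides, according to Lemma 3, we have

$$\bar{V}_i^{clo}(\mathbf{s}^*) = \frac{(\mu_i^E p_i + \mu_i^T) C_i}{\mathbb{E}_{\Theta}\left[R_i(\mathbf{s}^*, \Theta)\right]} + \mu_i^T T_{i,2}^{clo} \le \frac{(\mu_i^E p_i + \mu_i^T) C_i}{R_i^{\inf}} + \mu_i^T T_{i,2}^{clo}. \tag{37}$$

Therefore

$$\Gamma_i(\mathbf{s}^*) \le \max\left\{ V_i^{loc},\ \frac{(\mu_i^E p_i + \mu_i^T) C_i}{R_i^{\inf}} + \mu_i^T T_{i,2}^{clo} \right\}. \tag{38}$$

2) For the centralized optimal solution $\hat{\mathbf{s}}$, the expected transmission rate for computation offloading is

$$\mathbb{E}_{\Theta}\left[R_i(\hat{\mathbf{s}}, \Theta)\right] = \mathbb{E}_{\Theta}\left[ B \log_2\left(1 + \frac{p_i g_{i,o}}{I_i(\hat{\mathbf{s}}, \Theta) + \sigma_0}\right) \right] \le \mathbb{E}_{g_{i,o}}\left[ B \log_2\left(1 + \frac{p_i g_{i,o}}{\sigma_0}\right) \right] \le B \log_2\left(1 + \frac{p_i \bar{g}_{i,o}}{\sigma_0}\right), \tag{39}$$

where the last inequality in (39) is based on Jensen's inequality [41]. Then, we obtain the following result:

$$\bar{V}_i^{clo}(\hat{\mathbf{s}}) = \frac{(\mu_i^E p_i + \mu_i^T) C_i}{\mathbb{E}_{\Theta}\left[R_i(\hat{\mathbf{s}}, \Theta)\right]} + \mu_i^T T_{i,2}^{clo} \ge \frac{(\mu_i^E p_i + \mu_i^T) C_i}{R_i^{\sup}} + \mu_i^T T_{i,2}^{clo}. \tag{40}$$

Since user $i$'s computation cost is either $V_i^{loc}$ or $\bar{V}_i^{clo}(\hat{\mathbf{s}})$, we have

$$\Gamma_i(\hat{\mathbf{s}}) \ge \min\left\{ V_i^{loc},\ \bar{V}_i^{clo}(\hat{\mathbf{s}}) \right\} \ge \min\left\{ V_i^{loc},\ \frac{(\mu_i^E p_i + \mu_i^T) C_i}{R_i^{\sup}} + \mu_i^T T_{i,2}^{clo} \right\}, \tag{41}$$

which along with (38) leads to the upper bound of the PoA given in (35).

Besides, since the centralized optimum $\hat{\mathbf{s}}$ minimizes the system-wide computation cost, we hence have $\mathrm{PoA} \ge 1$. Therefore, we complete the proof of Theorem 4. □

According to Theorem 4 and Lemma 3, more available channels and lower active probabilities help increase the lower bound of expected offloading rate, which thus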
decreases the gap between the worst NE and the centralized optimal solution.

5.2 Metric II: Beneficial Cloud Computing Users

For an arbitrary NE $\mathbf{s}^*$, let us denote the number of users who benefit from cloud computing by $N^c(\mathbf{s}^*)$. Next, we will analyze the bound of $N^c(\mathbf{s}^*)$.

Letting $Z_{\max} \triangleq \max_{i \in \mathcal{N}} \{p_i \bar{g}_{i,o}\}$, $Z_{\min} \triangleq \min_{i \in \mathcal{N}} \{p_i \bar{g}_{i,o}\}$, $\bar{Q}_{\max} \triangleq \max_{i \in \mathcal{N}} \{\bar{Q}_i\}$, $\bar{Q}_{\min} \triangleq \min_{i \in \mathcal{N}} \{\bar{Q}_i\}$, $\theta_{\max} \triangleq \max_{i \in \mathcal{N}} \{\theta_i\}$, and $\theta_{\min} \triangleq \min_{i \in \mathcal{N}} \{\theta_i\}$, we can derive the following theorem.

Theorem 5. Suppose $0 < N^c(\mathbf{s}^*) < N$, then for the multi-user dynamic computation offloading game, the total number of users who benefit from cloud computing at any NE point satisfies

$$\frac{M \bar{Q}_{\min}}{Z_{\max} \theta_{\max}} \le N^c(\mathbf{s}^*) \le M \left( \frac{\bar{Q}_{\max}}{Z_{\min} \theta_{\min}} + 1 \right). \tag{42}$$

Proof. 1) Since $N^c(\mathbf{s}^*) < N$, there exists at least one user $k$ that chooses the local computing manner, i.e., $s_k^* = 0$. Since $\mathbf{s}^*$ is an NE, we know that the user cannot reduce its payoff by choosing computation offloading via any channel $m \in \mathcal{M}$. According to (8), we have $\mathbb{E}_{\Theta}\left[I_k^m(\mathbf{s}^*, \Theta)\right] = \sum_{j \in \mathcal{N}\setminus\{k\}} p_j \bar{g}_{j,o} \theta_j \mathbb{1}_{\{s_j^* = m\}} \ge \bar{Q}_k$, $\forall m \in \mathcal{M}$. Then, let $N_m^c(\mathbf{s}^*) = \sum_{i=1}^{N} \mathbb{1}_{\{s_i^* = m\}}$ denote the number of users on channel $m$, and we have

$$N_m^c(\mathbf{s}^*)\, Z_{\max} \theta_{\max} \ge \sum_{j \in \mathcal{N}\setminus\{k\}} p_j \bar{g}_{j,o} \theta_j \mathbb{1}_{\{s_j^* = m\}} \ge \bar{Q}_k \ge \bar{Q}_{\min}. \tag{43}$$

Thus, $N_m^c(\mathbf{s}^*) \ge \frac{\bar{Q}_{\min}}{Z_{\max} \theta_{\max}}$, $\forall m \in \mathcal{M}$. Then, we can obtain

$$N^c(\mathbf{s}^*) = \sum_{m=1}^{M} N_m^c(\mathbf{s}^*) \ge \frac{M \bar{Q}_{\min}}{Z_{\max} \theta_{\max}}. \tag{44}$$

2) Since $N^c(\mathbf{s}^*) > 0$, there exists at least one user $\tilde{k}$ that chooses the cloud computing manner, i.e., $s_{\tilde{k}}^* > 0$. Without loss of generality, suppose user $\tilde{k}$ is on the channel $m$, which is occupied by most users, i.e., $N_m^c(\mathbf{s}^*) \ge N_{\tilde{m}}^c(\mathbf{s}^*)$, $\forall \tilde{m} \in \mathcal{M}$. Since $\mathbf{s}^*$ is an NE, we know that the user cannot reduce its payoff by choosing local computation. According to (8), we have $\mathbb{E}_{\Theta}\left[I_{\tilde{k}}(\mathbf{s}^*, \Theta)\right] \le \bar{Q}_{\tilde{k}}$. That is, $\sum_{j \in \mathcal{N}\setminus\{\tilde{k}\}} p_j \bar{g}_{j,o} \theta_j \mathbb{1}_{\{s_j^* = m\}} \le \bar{Q}_{\tilde{k}}$. Then, the following result always holds:

$$\left( N_m^c(\mathbf{s}^*) - 1 \right) Z_{\min} \theta_{\min} \le \sum_{j \in \mathcal{N}\setminus\{\tilde{k}\}} p_j \bar{g}_{j,o} \theta_j \mathbb{1}_{\{s_j^* = m\}} \le \bar{Q}_{\tilde{k}} \le \bar{Q}_{\max}, \tag{45}$$

which leads to $N_m^c(\mathbf{s}^*) \le \frac{\bar{Q}_{\max}}{Z_{\min} \theta_{\min}} + 1$. Then, we have

$$N^c(\mathbf{s}^*) = \sum_{\tilde{m}=1}^{M} N_{\tilde{m}}^c(\mathbf{s}^*) \le \sum_{\tilde{m}=1}^{M} N_m^c(\mathbf{s}^*) \le M \left( \frac{\bar{Q}_{\max}}{Z_{\min} \theta_{\min}} + 1 \right). \tag{46}$$

Therefore, Theorem 5 is proved. □

Theorem 5 provides a quantitative characterization about how many users can eventually benefit from offloading computation tasks to the mobile cloud, by playing the stochastic game $\mathcal{G}_2$.

Remark 1. The above analysis indicates that the NE points might enable mobile users to achieve desirable and attractive performance by playing the proposed stochastic game for computation offloading. It is very interesting since mobile users' selfish and competitive behaviors lead to desirable game outcomes. The reasons can be explained as follows. If too many mobile users are using the same wireless channel to offload their computation tasks to the cloud, they may experience low offloading rates. In order to reduce the computation cost, some mobile users will choose other wireless channels for offloading or compute their tasks locally. Consequently, it leads to balanced occupation of wireless channels for computation offloading, which is beneficial for the whole system.

6 MULTI-AGENT STOCHASTIC LEARNING UNDER DYNAMIC ENVIRONMENT

Although the NE points exhibit desirable and attractive performance, it is challenging for mobile users to reach the NE in a distributed manner and under dynamic environment. Most existing game-theoretic algorithms (e.g., best (or better) response [24], spatial adaptive play [26]) update users' strategies based on their received instantaneous utility/payoff. However, due to the dynamics of users' activeness and wireless channels in our computation offloading problem, each user might receive different utilities/payoffs in different time slots, even if it chooses to use the same strategy. Thus, the existing game-theoretic algorithms may never reach NE. This motivates us to incorporate the idea of stochastic learning [28], [29], [30] into the design of an efficient yet distributed algorithm under dynamic environment in order to reach the NE of our proposed stochastic game $\mathcal{G}_2$.

6.1 Proposed Multi-Agent Stochastic Learning Algorithm

The details are shown in the Multi-Agent Stochastic Learning Algorithm (referred to as the MASL-Algorithm). Specifically, each mobile user acts as a learning automaton that independently and automatically selects its offloading strategy according to a probability vector over the strategy space, and updates the probability vector based on the action-reward received from the dynamic environment. For the sake of clear presentation, we denote the strategy selection probability vector for an arbitrary user $i$ as $\mathbf{w}_i = (w_{i0}, w_{i1}, \ldots, w_{iM})$, where $w_{i0}$ denotes the probability to select the strategy of local computing, and $w_{im}$ ($m \in \mathcal{M}$) denotes the probability to select offloading the computation task to the mobile cloud via wireless channel $m$.

As the listing of the MASL-Algorithm shows, the proposed algorithm is operated in an iterative manner. Within each round of iteration, each active user independently selects its offloading strategy based on a probability vector over the strategy space, and receives an action-reward from the dynamic environment. The computation offloading strategy with less cost is given larger action-reward, and the strategy with larger reward value will be assigned with larger probability. By continuously interacting with the random environment, each mobile user will finally choose its optimal offloading strategy with probability one. We emphasize that during the operation of MASL-Algorithm, each mobile user operates entirely based
on its own strategies and the consequently received reward, without requiring any knowledge from other users and any prior knowledge of probability space $(\Omega, \mathcal{H}, P)$ of the dynamic environment. Therefore, the proposed MASL-Algorithm is fully distributed which makes itself attractive for a practical implementation.

MASL-Algorithm To Reach the NE of the Stochastic Game $\mathcal{G}_2$

Initialization: At the initial time $t = 0$, each mobile user $i \in \mathcal{N}$ sets its strategy selection probability vector as a uniform distribution, i.e., $\mathbf{w}_i^t = \left(\frac{1}{M+1}, \ldots, \frac{1}{M+1}\right)$.

Loop for $t = 0, 1, 2, \ldots$

1) Updating computation offloading strategy: In the $t$th time slot, each active user $i \in \mathcal{A}^t$ selects an offloading strategy $s_i^t$ according to its current strategy selection probability vector $\mathbf{w}_i^t$. The inactive users $\mathcal{N}\setminus\mathcal{A}^t$ keep silent and take no action.

2) Measuring instantaneous payoff:⁸ Each active user evaluates its respective payoff $u_i^t$ according to (7), namely, active user $i$ evaluates its received interference $I_i^t$ if $s_i^t > 0$; otherwise, active user $i$ directly computes $Q_i^t$ (which is given in Lemma 1). Besides, the inactive users $\mathcal{N}\setminus\mathcal{A}^t$ keep silent and take no action.

3) Updating strategy selection probability: Each active user updates its strategy selection probability vector for the next time slot according to the following rule:

$$\mathbf{w}_i^{t+1} = \mathbf{w}_i^t + b\, r_i^t \left( \mathbf{e}_{s_i^t} - \mathbf{w}_i^t \right), \tag{47}$$

where $0 < b < 1$ is the learning step-size, $\mathbf{e}_{s_i^t}$ is an $(M+1)$-dimensional unit vector with the $s_i^t$th element being one, and $r_i^t$ is the received action-reward defined by $r_i^t = 1 - \gamma_i u_i^t$. (The computation offloading strategy with less cost is given larger action-reward. Here $\gamma_i$ is a scaling factor, and we require $\gamma_i \le \frac{1}{\max_t \{u_i^t\}}$ to guarantee the action-reward $r_i^t$ positive.) Besides, the inactive mobile users $\mathcal{N}\setminus\mathcal{A}^t$ keep their strategy selection probability vectors unchanged, i.e.,

$$\mathbf{w}_i^{t+1} = \mathbf{w}_i^t. \tag{48}$$

End loop until all users do not adjust their respective offloading strategies.

Notice that due to the dynamics of users' activeness and wireless channels, user $i$ might receive different action-rewards in different time slots, even if it chooses to use the same strategy. This imposes the key challenge to establish the convergence of MASL-Algorithm. Although there exist several previous studies [28], [29], [30] investigating the convergence of some stochastic learning algorithms, our proposed MASL-Algorithm differs from those algorithms in terms of taking into account the dynamics of both users' activeness and wireless channels. Moreover, the definition of payoff function is application-dependent, and different payoff functions (adopted by even the same learning mechanism) will lead to different learning solutions [31]. Therefore, the previous analyses are not applicable to our case in this study, which motivates us to perform a deep analysis about the convergence property of the MASL-Algorithm in the next section.

6.2 Convergence Properties of MASL-Algorithm

We first re-write the updating rule in Step 3 of the proposed algorithm as follows:

$$\mathbf{w}_i^{t+1} = \mathbf{w}_i^t + b\, a_i^t r_i^t \left( \mathbf{e}_{s_i^t} - \mathbf{w}_i^t \right), \quad \forall i \in \mathcal{N}, \tag{49}$$

where $a_i^t$ denotes the activeness of user $i$ in the $t$th time slot. Let $\mathbf{W}^t = \left(\mathbf{w}_1^t, \ldots, \mathbf{w}_N^t\right)^T$ denote the strategy selection probability vector of all the users, and thus we can express the evolution of the strategy selection probability vector of the game $\mathcal{G}_2$ as follows:

$$\mathbf{W}^{t+1} = \mathbf{W}^t + b\, f\left(\mathbf{W}^t, \mathbf{a}^t, \mathbf{s}^t, \mathbf{r}^t\right), \tag{50}$$

where $\mathbf{a}^t = (a_1^t, \ldots, a_N^t)$, $\mathbf{s}^t = (s_1^t, \ldots, s_N^t)$, $\mathbf{r}^t = (r_1^t, \ldots, r_N^t)$, and $f(\cdot)$ represents the updating rule specified by (49). Then, according to Theorem 3.1 in [28], we can derive the following lemma.

Lemma 4. With a sufficiently small step-size $b$, i.e., $b \to 0$, the sequence $\{\mathbf{W}^t\}$ will converge weakly to the solution of the following ordinary differential equation (ODE)

$$\frac{d\mathbf{W}}{dt} = h(\mathbf{W}), \tag{51}$$

with the initial state $\mathbf{W}_0 = \left[\frac{1}{M+1}\right]_{N \times (M+1)}$, and $h(\mathbf{W}) = \mathbb{E}_{\Theta}\left[ f\left(\mathbf{W}^t, \mathbf{a}^t, \mathbf{s}^t, \mathbf{r}^t, \Theta\right) \mid \mathbf{W}^t = \mathbf{W} \right]$.

Lemma 5. With a sufficiently small step-size $b$, our proposed MASL-Algorithm converges to a stable stationary point of the ODE given in (51).

Proof. Let $\bar{r}_i(\mathbf{s}) = \mathbb{E}_{\Theta}\left[r_i(\mathbf{s}, \Theta)\right]$ denote user $i$'s expected reward function under strategy profile $\mathbf{s}$, and let $\Xi_i(m, \mathbf{W}_{-i})$ denote user $i$'s probabilistic reward function when it adopts pure strategy $m$ and other users adopt probability vector (for strategy selection) $\mathbf{W}_{-i} = (\mathbf{w}_1, \ldots, \mathbf{w}_{i-1}, \mathbf{w}_{i+1}, \ldots, \mathbf{w}_N)$. Specifically, define

$$\Xi_i(m, \mathbf{W}_{-i}) = \bar{r}_i(m, \mathbf{W}_{-i}) = \sum_{\mathbf{s}_{-i} \in \mathcal{S}_{-i}} \bar{r}_i(m, \mathbf{s}_{-i}) \prod_{j \neq i} w_{j,s_j}, \tag{52}$$

where $\mathcal{S}_{-i}$ denotes the strategy space of all the users excluding $i$, and $w_{j,s_j}$ is the probability of user $j$ to choose pure strategy $s_j$. In addition, define the probabilistic potential function

$$\Upsilon(\mathbf{W}) = \bar{\Phi}(\mathbf{W}) = \sum_{\mathbf{s} \in \mathcal{S}} \bar{\Phi}(\mathbf{s}) \prod_{i \in \mathcal{N}} w_{i,s_i}, \tag{53}$$

and

$$\Upsilon_i(m, \mathbf{W}_{-i}) = \frac{\partial \Upsilon(\mathbf{W})}{\partial w_{i,m}} = \sum_{\mathbf{s}_{-i} \in \mathcal{S}_{-i}} \bar{\Phi}(m, \mathbf{s}_{-i}) \prod_{j \neq i} w_{j,s_j}, \tag{54}$$

where $\bar{\Phi}$ is the expected potential function defined in (20), and $\mathcal{S}$ denotes the strategy space of all the users.

8. As we discussed before, the received payoff $u_i^t$ depends not only on other users' activeness, but also on the current channel condition. Specifically, other users' activeness impacts its received interference $I_i^t$, while the current channel condition impacts both $I_i^t$ and $Q_i^t$.
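As an aside to the analysis, the point of footnote 8 can be made concrete with a small simulation of Steps 1) to 3) of the MASL-Algorithm. The following is a minimal sketch under stated assumptions, not the authors' implementation: received powers $p_j g_{j,o}$ are drawn from an exponential distribution, the threshold $Q_i$ is held constant per user, and the reward is clamped at zero instead of enforcing the bound on $\gamma_i$. The function name `masl` and its parameters (`theta`, `mean_pg`, `Q`) are hypothetical.

```python
import random

def masl(theta, mean_pg, Q, M, b=0.1, gamma=1.0, slots=3000, seed=1):
    """Run the update rules (47)-(48) and return the probability vectors w_i."""
    rng = random.Random(seed)
    N = len(theta)
    # Initialization: uniform distribution over {0 (local), 1..M (channels)}
    w = [[1.0 / (M + 1)] * (M + 1) for _ in range(N)]
    for _ in range(slots):
        # Users become active independently with probability theta[i]
        active = [i for i in range(N) if rng.random() < theta[i]]
        # Step 1: each active user samples a strategy from its own vector
        choice = {i: rng.choices(range(M + 1), weights=w[i])[0] for i in active}
        # Instantaneous received powers p_j g_j (exponential fading, assumption)
        pg = {i: rng.expovariate(1.0 / mean_pg[i]) for i in active}
        for i in active:
            s = choice[i]
            # Step 2: payoff is co-channel interference, or Q_i for local computing
            if s > 0:
                u = sum(pg[j] for j in active if j != i and choice[j] == s)
            else:
                u = Q[i]
            r = max(0.0, 1.0 - gamma * u)   # action-reward, clamped (assumption)
            # Step 3: rule (47); the vector stays on the probability simplex
            w[i] = [wk + b * r * ((1.0 if k == s else 0.0) - wk)
                    for k, wk in enumerate(w[i])]
        # Inactive users keep their vectors unchanged, rule (48)
    return w
```

Each user only touches its own vector and its own observed payoff, which mirrors the fully distributed operation described in Section 6.1.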
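According to Lemma 4, we have

$$\begin{aligned}
\frac{dw_{i,m}}{dt} & = \theta_i \Bigg( w_{i,m}\left(1 - w_{i,m}\right) \bar{r}_i(m, \mathbf{W}_{-i}) - \sum_{m' \neq m} w_{i,m'} w_{i,m}\, \bar{r}_i(m', \mathbf{W}_{-i}) \Bigg) \\
& = \theta_i w_{i,m} \Bigg( \bar{r}_i(m, \mathbf{W}_{-i}) - \sum_{m' \in \mathcal{S}_i} w_{i,m'}\, \bar{r}_i(m', \mathbf{W}_{-i}) \Bigg) \\
& = \theta_i w_{i,m} \Bigg( \Xi_i(m, \mathbf{W}_{-i}) - \sum_{m' \in \mathcal{S}_i} w_{i,m'}\, \Xi_i(m', \mathbf{W}_{-i}) \Bigg) \\
& = \theta_i w_{i,m} \sum_{m' \in \mathcal{S}_i} w_{i,m'} \left( \Xi_i(m, \mathbf{W}_{-i}) - \Xi_i(m', \mathbf{W}_{-i}) \right).
\end{aligned} \tag{55}$$

Then

$$\begin{aligned}
\frac{d\Upsilon(\mathbf{W})}{dt} & = \sum_{i,m} \frac{\partial \Upsilon(\mathbf{W})}{\partial w_{i,m}} \frac{dw_{i,m}}{dt} \\
& = \sum_{i,m} \Upsilon_i(m, \mathbf{W}_{-i})\, \theta_i w_{i,m} \sum_{m' \in \mathcal{S}_i} w_{i,m'} \left( \Xi_i(m, \mathbf{W}_{-i}) - \Xi_i(m', \mathbf{W}_{-i}) \right) \\
& = \sum_{i,m,m'} \theta_i w_{i,m} w_{i,m'}\, \Upsilon_i(m, \mathbf{W}_{-i}) \left( \Xi_i(m, \mathbf{W}_{-i}) - \Xi_i(m', \mathbf{W}_{-i}) \right).
\end{aligned} \tag{56}$$

Notably

$$\sum_{i,m,m'} \theta_i w_{i,m} w_{i,m'}\, \Upsilon_i(m, \mathbf{W}_{-i}) \left( \Xi_i(m, \mathbf{W}_{-i}) - \Xi_i(m', \mathbf{W}_{-i}) \right) = \sum_{i,m,m'} \theta_i w_{i,m'} w_{i,m}\, \Upsilon_i(m', \mathbf{W}_{-i}) \left( \Xi_i(m', \mathbf{W}_{-i}) - \Xi_i(m, \mathbf{W}_{-i}) \right). \tag{57}$$

Therefore, we can obtain the following result:

$$\frac{d\Upsilon(\mathbf{W})}{dt} = \frac{1}{2} \sum_{i,m,m'} \theta_i w_{i,m} w_{i,m'} \left( \Upsilon_i(m, \mathbf{W}_{-i}) - \Upsilon_i(m', \mathbf{W}_{-i}) \right) \left( \Xi_i(m, \mathbf{W}_{-i}) - \Xi_i(m', \mathbf{W}_{-i}) \right). \tag{58}$$

As given in Step 3 of the proposed algorithm, $\forall t$, $r_i^t = 1 - \gamma_i u_i^t$, thus the expected reward $\bar{r}_i = 1 - \gamma_i \bar{u}_i$. Then, (21) yields

$$\bar{\Phi}\left(s_i', \mathbf{s}_{-i}\right) - \bar{\Phi}\left(s_i, \mathbf{s}_{-i}\right) = \frac{p_i \bar{g}_{i,o}}{\gamma_i} \left( \bar{r}_i(s_i, \mathbf{s}_{-i}) - \bar{r}_i(s_i', \mathbf{s}_{-i}) \right). \tag{59}$$

By using Eqs. (52), (54), and (59), we can derive that $\forall m, m' \in \mathcal{S}_i$,

$$\Upsilon_i(m, \mathbf{W}_{-i}) - \Upsilon_i(m', \mathbf{W}_{-i}) = \frac{p_i \bar{g}_{i,o}}{\gamma_i} \left( \Xi_i(m', \mathbf{W}_{-i}) - \Xi_i(m, \mathbf{W}_{-i}) \right), \tag{60}$$

which, together with (58), yields

$$\frac{d\Upsilon(\mathbf{W})}{dt} = -\frac{1}{2} \sum_{i,m,m'} \theta_i w_{i,m} w_{i,m'} \frac{p_i \bar{g}_{i,o}}{\gamma_i} \left( \Xi_i(m, \mathbf{W}_{-i}) - \Xi_i(m', \mathbf{W}_{-i}) \right)^2 \le 0. \tag{61}$$

(61) indicates that $\Upsilon(\mathbf{W})$ monotonously decreases when the algorithm iterates. Moreover, since $\Upsilon(\mathbf{W})$ is lower bounded by $\Upsilon(\mathbf{W}) \ge 0$, we know $\Upsilon(\mathbf{W})$ will converge to a stationary point when $\frac{d\Upsilon(\mathbf{W})}{dt} = 0$, and

$$\frac{d\Upsilon(\mathbf{W})}{dt} = 0 \ \Rightarrow\ w_{i,m} w_{i,m'} \left( \Xi_i(m, \mathbf{W}_{-i}) - \Xi_i(m', \mathbf{W}_{-i}) \right) = 0. \tag{62}$$

Then, according to (55), we have $\frac{dw_{i,m}}{dt} = 0$, $\forall i, m$, and thus $\frac{d\mathbf{W}}{dt} = 0$. Hence, $\mathbf{W}$ converges to a stationary point of ODE (51). This completes the proof of Lemma 5. □

It has been proved by Theorem 3.2 in [28] that all pure-strategy NE of $\mathcal{G}_2$ coincide with the stable stationary points of the ODE given in (51). Thus, based on Lemma 5, we can derive the following theorem.

Theorem 6. With a sufficiently small step-size $b$, our proposed MASL-Algorithm converges to a pure-strategy NE point of $\mathcal{G}_2$.

Moreover, the convergence rate of the MASL-Algorithm can be characterized as follows:

Theorem 7. The average convergence rate of the proposed MASL-Algorithm is given by

$$r_{ave} = \sqrt{r_0 r_\infty}, \tag{63}$$

with

$$r_0 = 1 - \frac{b \sum_{i \in \mathcal{N}} \sum_{m \in \mathcal{S}_i} \sum_{m' \in \mathcal{S}_i} \sum_{\mathbf{s}_{-i} \in \mathcal{S}_{-i}} \theta_i \gamma_i p_i \bar{g}_{i,o} \left( \bar{u}_i(m', \mathbf{s}_{-i}) - \bar{u}_i(m, \mathbf{s}_{-i}) \right)^2}{2 (M+1)^N \sum_{\mathbf{s} \in \mathcal{S}} \left( \bar{\Phi}(\mathbf{s}) - \bar{\Phi}(\mathbf{s}^*) \right)}, \tag{64}$$

and

$$r_\infty = 1 - \frac{b \sum_{i \in \mathcal{N}} \sum_{s_i \in \mathcal{S}_i} \theta_i \gamma_i p_i \bar{g}_{i,o} \left( \bar{u}_i(s_i, \mathbf{s}_{-i}^*) - \bar{u}_i(\mathbf{s}^*) \right)^2}{\sum_{i \in \mathcal{N}} \sum_{s_i \in \mathcal{S}_i} \left( \bar{\Phi}(s_i, \mathbf{s}_{-i}^*) - \bar{\Phi}(\mathbf{s}^*) \right)}, \tag{65}$$

where $\mathcal{S}_{-i}$ denotes the strategy space of all the users excluding $i$, $\mathcal{S}$ denotes the strategy space of all the users, and $\mathbf{s}^*$ is a pure-strategy NE.

Proof. As shown in (61), the probabilistic potential function $\Upsilon(\mathbf{W}) = \bar{\Phi}(\mathbf{W})$ monotonously converges. Specifically, at iteration index $t$, the corresponding convergence rate is given by

$$r^t = \frac{\bar{\Phi}\left(\mathbf{W}^{t+1}\right) - \bar{\Phi}\left(\mathbf{W}^*\right)}{\bar{\Phi}\left(\mathbf{W}^t\right) - \bar{\Phi}\left(\mathbf{W}^*\right)} = 1 + \frac{\Upsilon\left(\mathbf{W}^{t+1}\right) - \Upsilon\left(\mathbf{W}^t\right)}{\Upsilon\left(\mathbf{W}^t\right) - \Upsilon\left(\mathbf{W}^*\right)}, \tag{66}$$

where $\mathbf{W}^*$ is a stationary point of the ODE, i.e., an NE. As stated in [39], $r^t$ indicates how close $\bar{\Phi}(\mathbf{W}^{t+1})$ is to $\bar{\Phi}(\mathbf{W}^*)$, compared with $\bar{\Phi}(\mathbf{W}^t)$. Moreover,

$$\begin{aligned}
\Upsilon\left(\mathbf{W}^{t+1}\right) - \Upsilon\left(\mathbf{W}^t\right) & \approx \sum_{i,m} \left. \frac{\partial \Upsilon(\mathbf{W})}{\partial w_{i,m}} \right|_{\mathbf{W}^t} \Delta w_{i,m} \\
& = \sum_{i,m} \Upsilon_i\left(m, \mathbf{W}_{-i}^t\right) \theta_i\, b\, w_{i,m}^t \sum_{m' \in \mathcal{S}_i} w_{i,m'}^t \left( \Xi_i\left(m, \mathbf{W}_{-i}^t\right) - \Xi_i\left(m', \mathbf{W}_{-i}^t\right) \right) \\
& = b \sum_{i,m,m'} \theta_i w_{i,m}^t w_{i,m'}^t\, \Upsilon_i\left(m, \mathbf{W}_{-i}^t\right) \left( \Xi_i\left(m, \mathbf{W}_{-i}^t\right) - \Xi_i\left(m', \mathbf{W}_{-i}^t\right) \right) \\
& = -\frac{b}{2} \sum_{i,m,m'} \theta_i w_{i,m}^t w_{i,m'}^t \frac{p_i \bar{g}_{i,o}}{\gamma_i} \left( \Xi_i\left(m, \mathbf{W}_{-i}^t\right) - \Xi_i\left(m', \mathbf{W}_{-i}^t\right) \right)^2.
\end{aligned} \tag{67}$$

Besides, [28], [40] show that only the pure-strategy NE is stable, thus we only study the convergence to a pure-strategy NE $\mathbf{s}^*$. Obviously, $\Upsilon(\mathbf{W}^*)$ can be equivalently written as $\bar{\Phi}(\mathbf{s}^*)$. Therefore

$$r^t = 1 - \frac{\frac{b}{2} \sum_{i,m,m'} \theta_i w_{i,m}^t w_{i,m'}^t \frac{p_i \bar{g}_{i,o}}{\gamma_i} \left( \Xi_i\left(m, \mathbf{W}_{-i}^t\right) - \Xi_i\left(m', \mathbf{W}_{-i}^t\right) \right)^2}{\sum_{\mathbf{s} \in \mathcal{S}} \bar{\Phi}(\mathbf{s}) \prod_{i \in \mathcal{N}} w_{i,s_i}^t - \bar{\Phi}(\mathbf{s}^*)}, \tag{68}$$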
TABLE 2
Parameters Setting

Parameter | Value
Coverage of AP | 50 m
Number of mobile users $N$ | [20, 45]
User's active probability $\theta_i$ | (0, 1]
Number of channels $M$ | [4, 14]
Bandwidth of channel $B$ | 5 MHz
Transmit power $p_i$ | 100 mW
Path loss exponent $\alpha$ | 4
Background noise $\sigma_0$ | −100 dBm
Data size $C_i$ | 5000 KB
Number of CPU cycles for local computing $D_i^{loc}$ | 1000 Megacycles
Number of CPU cycles for cloud computing $D_i^{clo}$ | 1200 Megacycles
Local computational capability $F_i^{loc}$ | Randomly set from {0.5, 0.8, 1.0} GHz
Cloud computational capability $F_i^{clo}$ | 12 GHz
Weight of computational energy $\mu_i^E$ | Randomly set from {0, 0.5, 1.0}
Weight of computational time $\mu_i^T$ | $1 - \mu_i^E$
Computing energy efficiency $1/\eta_i$ | Randomly set from {400, 500, 600} Megacycles/J
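The settings in Table 2 can be collected into a small configuration sketch. The dictionary keys, `dbm_to_watt`, and `draw_user` below are hypothetical names chosen for illustration. The derived quantities also assume the usual relations local time = cycles / CPU frequency and local energy = cycles / energy efficiency, which should be checked against the local computing model in (3).

```python
import random

# Values copied from Table 2; key names are illustrative only.
TABLE2 = {
    "coverage_m": 50,
    "bandwidth_hz": 5e6,
    "tx_power_w": 0.1,                      # 100 mW
    "path_loss_exponent": 4,
    "noise_dbm": -100,
    "data_size_kb": 5000,
    "cycles_local_mcycles": 1000,
    "cycles_cloud_mcycles": 1200,
    "f_cloud_ghz": 12,
    "f_loc_ghz_choices": (0.5, 0.8, 1.0),
    "mu_e_choices": (0.0, 0.5, 1.0),
    "efficiency_mcycles_per_joule_choices": (400, 500, 600),
}

def dbm_to_watt(dbm):
    """Convert a power level in dBm to watts."""
    return 10.0 ** ((dbm - 30.0) / 10.0)

def draw_user(rng, cfg=TABLE2):
    """Draw the per-user random parameters listed in Table 2."""
    mu_e = rng.choice(cfg["mu_e_choices"])
    f_loc = rng.choice(cfg["f_loc_ghz_choices"])
    eff = rng.choice(cfg["efficiency_mcycles_per_joule_choices"])
    return {
        "theta": 1.0 - rng.random(),        # uniform on (0, 1]
        "mu_e": mu_e,
        "mu_t": 1.0 - mu_e,
        "f_loc_ghz": f_loc,
        # Assumed relations: time = cycles / frequency, energy = cycles / efficiency
        "t_loc_s": cfg["cycles_local_mcycles"] / (f_loc * 1000.0),
        "e_loc_j": cfg["cycles_local_mcycles"] / eff,
    }
```

For instance, `dbm_to_watt(TABLE2["noise_dbm"])` gives the background noise power in watts, and `draw_user(random.Random(0))` returns one user's randomly drawn settings.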
which along with (52) yields mobile users are randomly scattered in the coverage of the
AP. For each user i, we randomly set its active probability ui
Q
P P p g
b
2 i;m;m0 ui ig i;o riðm; si Þ
si 2Si riðm0; si Þ 2 wti;m wti;m0 j6¼i wtj;sj 2
rt ¼ 1 P Q
i

s 2S Fðs Þ i2N wi;si Fðs Þ
t
P P 2 Q Q
b
2 i;m;m0 i;o uiðm0; si Þ uiðm; si Þ
si 2Si u i g i pi g
t
i2N:si ¼m wi;si
t
i2N:si ¼m0 wi;si
¼1 P ðs Þ Q ðs Þ : (69)
F
s 2S i2N wt F
i;si
At t ¼ 0, w0i;si ¼ Mþ1 1
, 8i 2 N ; 8si 2 S i , thus, we can according to a uniform distribution within (0,1]. The time-
derive r as (64). At t ¼ 1, let the probability in NE
0 varying channels follow Rayleigh fading, where the random
Q 1 fading coefficient b is exponentially distributed with unit-
i2N wi;si ¼ 1 ", the probability for only one user devi-
Q mean. Table 2 summarizes the key parameters used in simula-
ating NE w1 j;sj
1
i2N nfjg wi;s ¼ NM , sj 6¼ sj , and the proba-
"
tions, which are set similar to [10], [11]. If not otherwise speci-
i
bility for more than one user deviating NE to be 0. As fied, the default setting for the number of users is 30, and that
for the number of channels is 5. Moreover, in the proposed
" ! 0, we obtain r1 as (65). Then, according to the defi-
MASL-Algorithm, the default value of the step-size b is 0.1,
nition in [39], the average rate of convergence can be and that of the scaling factor g i is 105 (the impact will be dis-
pffiffiffiffiffiffiffiffiffiffiffi
computed by rave ¼ r0 r1 . Notably, the time taken for cussed later on).

W0 F
F ðW Þ to decrease L times its value is
7.1 Convergence Analysis
T ave ¼ log rave
log L
: u
t To evaluate convergence of the proposed MASL-Algorithm,
we plot the strategy selection probabilities of one arbitrarily
Remark 2. (The impact of step-size b) It is noted that the con-
selected user in Fig. 3. At the very beginning, this target user
vergence rate is higher when rave is smaller [39]. Nota-
randomly selects its computation offloading strategy accord-
bly, rave decreases linearly with the value of b. Thus, ing to a uniform distribution. As the MASL-Algorithm oper-
when b is larger, the convergence gets faster. However, ates, this target user’s strategy selection probabilities keep on
the approximation of the iterative process to the ODE updating and finally converge after around 300 iterations.9
requires a sufficiently small b, as shown in Lemma 4. [40] Specifically, after convergence, the probability of choosing off-
has characterized the accuracy of approximation as loading computation task via channel 1 is equal to 1, while the
h i pffiffiffi
~ bt ¼ O b ; probabilities for other strategies decrease to 0. This result
E Wt W (70)
means that, after convergence, the target user will only choose
bt
t
where W is the iterative process of algorithm, and W ~ rep- channel 1 to offload computation task to the mobile cloud.
resents the value of the trajectory of ODE at time bt. There- Fig. 4 plots the dynamics of system-wide computation cost
fore, there is a tradeoff between the convergence rate and the accuracy of approximation to the NE of the stochastic game, and the setting of the parameter b is application-dependent in practice. Fig. 5 in the next section verifies the above discussion.

9. Since the typical length of a time slot in wireless systems is at the time scale of microseconds (e.g., 70 microseconds for a time slot in the LTE system [32]), the time used by the computation offloading decision process is actually very short (e.g., 21 milliseconds for 300 iterations in the LTE system). Such a short duration is negligible compared with the computation execution process, which is typically at the scale of seconds (e.g., the execution time for mobile gaming applications is typically several hundred milliseconds [37]).

7 SIMULATION RESULTS
In this section, we conduct simulations to validate the proposed MASL-Algorithm and its performance under a dynamic environment. We set up a scenario network where a group of

for our MASL-Algorithm. For comparison, the best-response algorithm (referred to as the BR-Algorithm) in [11] is also plotted; it likewise runs in an iterative manner. At each iteration, the BR-Algorithm allows the user who received the update-permission message to select its optimal strategy based on its instantaneous computation cost, while the other users keep their strategies unchanged. We can see that the MASL-Algorithm greatly reduces the computation cost, reaching its minimum (i.e., the NE) after convergence, while the BR-Algorithm yields a fluctuating computation cost that is much greater than that of the MASL-Algorithm. This is because the BR-Algorithm always conducts a myopic play based on the instantaneous computation cost, while the environment is dynamically varying.

Fig. 3. Convergence of strategy selection probabilities.
Fig. 4. Convergence of system-wide computation cost.

We then analyze the computational complexity of the proposed algorithm. Since most operations involve only basic arithmetic, the dominating part is the update of the strategy selection probability in Step 3, which involves 2 vector-vector sums, 1 scalar-vector product, and 1 scalar-scalar product. Thus, the proposed MASL-Algorithm runs with a low complexity of O(3M + 1). In contrast, [11] has shown that the computational complexity of the BR-Algorithm is O(M log M).

Fig. 5 shows the impact of the step-size on the convergence speed of the proposed MASL-Algorithm. We vary the step-size b over 0.05, 0.1, 0.2, 0.3, and 0.4. Fig. 5 shows that as the step-size increases, the algorithm converges faster but obtains an inferior solution (not an NE). As shown in Remark 2 (Section 6), there is a tradeoff between the convergence speed and the accuracy of convergence to the NE, both of which are affected by the step-size b. For our case, it is preferable to set the step-size to 0.1, which lets the algorithm converge to an NE within about 350 iterations. Note that setting the step-size to 0.05 also reaches the NE, but the convergence is very slow (more than 500 iterations).

Fig. 5. Impact of different values of step-size b.

Moreover, Fig. 6 shows the impact of the scaling factor g_i on the performance of the proposed MASL-Algorithm. We vary the parameter g_i over 10^3, 10^4, 10^5, and 10^6. Fig. 6 shows that when g_i = 10^5, the algorithm achieves the best performance in terms of reducing the system-wide computation cost. A larger g_i strengthens users' response to the computation cost, and thus makes users adjust their strategies more sensitively towards the optimal one. That is why we observe from the figure that 10^5 is better than 10^3 and 10^4 for the setting of g_i. However, g_i cannot be too large, because the action-reward r_i^t must remain positive. As a result, the performance of the algorithm gets worse when g_i = 10^6.

Fig. 6. Impact of different values of scaling factor g_i.
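As an illustration of the Step 3 probability update and of the roles played by the step-size b and the scaling factor g_i, the following Python sketch may help. It is not taken from the paper: it assumes the update has the standard linear reward-inaction form from learning automata [39], p <- p + b r (e_s - p), and that the reward is obtained from the observed cost as r = 1 - g_i * cost. The function names and all numeric values are hypothetical.

```python
import numpy as np

def reward(cost, g):
    # Assumed mapping from cost to reward (hypothetical): r = 1 - g * cost.
    return 1.0 - g * cost

def update_probabilities(p, chosen, r, b):
    # Assumed linear reward-inaction step: p <- p + b * r * (e_chosen - p).
    e = np.zeros_like(p)
    e[chosen] = 1.0  # unit vector of the strategy that was just played
    return p + (b * r) * (e - p)

# Toy case 1: the scaled cost stays below 1, so the reward is positive.
p = np.array([0.25, 0.25, 0.5])
r_ok = reward(cost=2e-6, g=1e5)
p_next = update_probabilities(p, chosen=1, r=r_ok, b=0.1)
print(p_next, p_next.sum())

# Toy case 2: the same scaling factor with a larger cost makes the reward negative.
r_bad = reward(cost=2e-5, g=1e5)
p_bad = update_probabilities(np.array([0.2, 0.8]), chosen=0, r=r_bad, b=0.4)
print(p_bad)
```

Under these assumptions the update uses one scalar-scalar product (b * r), one scalar-vector product, and two vector-vector sums, which matches the operation count quoted in the complexity discussion. The second toy case shows that a scaled cost above 1 drives the reward negative and can push a probability below zero, which is one way to read the requirement that the action-reward stay positive.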
Fig. 7. Performance comparison in terms of system cost for different numbers of mobile users.
Fig. 8. Performance comparison in terms of system cost for different numbers of channels.
Fig. 9. Performance comparison in terms of beneficial cloud computing users for different numbers of users.
Fig. 10. Performance comparison in terms of beneficial cloud computing users for different numbers of channels.
7.2 Performance Evaluation
We further evaluate the performance of the proposed MASL-Algorithm in comparison with the BR-Algorithm [11] and the random strategy selection algorithm (referred to as the RSS-Algorithm). Specifically, in the RSS-Algorithm, each mobile user randomly selects a strategy in each time slot. The results presented below are obtained by simulating 500 independent trials and then taking the average value.

Fig. 7 shows the performance of our proposed MASL-Algorithm versus different numbers of mobile users. The MASL-Algorithm always incurs a significantly lower total cost than the BR-Algorithm and the RSS-Algorithm. In addition, the total cost increases as the number of mobile users increases, which is consistent with intuition. Fig. 8 further shows the advantage of using the MASL-Algorithm for different numbers of channels. Fig. 8 again shows that our proposed MASL-Algorithm always outperforms the BR-Algorithm and the RSS-Algorithm.

Figs. 9 and 10 evaluate our proposed MASL-Algorithm in terms of the number of mobile users who benefit from performing cloud computing (note that the decimals in the results arise from averaging). Both figures demonstrate that our proposed MASL-Algorithm achieves better performance than the BR-Algorithm and the RSS-Algorithm. Fig. 9 shows that the number of users who benefit from cloud computing increases when the number of users becomes larger. Intuitively, when the total number of users increases, more users may choose cloud computing. However, because the available channels are limited, the number of beneficial cloud computing users is also limited, since users would generate severe interference to each other, leading to lower offloading rates. Fig. 10 shows that the number of beneficial cloud computing users increases when the number of available channels increases.

8 CONCLUSION
In this paper, we have investigated the problem of multi-user computation offloading for mobile cloud computing under the practical dynamic environment. By formulating this problem as a stochastic game, we proved that such a dynamic offloading decision process always leads to a pure-strategy NE. Moreover, we have analyzed the performance bounds of the NE in terms of both the system cost and the number of users who can benefit from cloud computing. To reach the NE, we proposed a fully distributed algorithm (i.e., the MASL-Algorithm) with a guaranteed convergence rate under a dynamic environment. Simulation results have been presented to validate the effectiveness of our proposed algorithm and show its significant performance advantage. In our future work, we will study the joint optimization of dynamic offloading decision-making and transmit power control, which will be an important and technically challenging problem. Another interesting direction is to investigate mobile computation offloading from the perspective of economics, and specifically, to consider mobile users' economic expenses for offloading computation tasks to the mobile cloud.

ACKNOWLEDGMENTS
This work is supported by the Jiangsu Provincial Natural Science Foundation of China under Grant BK20170755, by the National Postdoctoral Program for Innovative Talents of China under Grant BX201700109, by the National Natural Science Foundations of China under Grants 61801505, 61572440, and 61771487, by the Zhejiang Provincial Natural Science Foundation of China under Grant LR17F010002, and by the Natural Science and Engineering Research Council (NSERC), Canada. This paper has been presented in part at the IEEE/CIC ICCC Conference [1], July 2016, Chengdu, China.

REFERENCES
[1] J. Zheng, Y. Cai, Y. Wu, and X. Shen, “Stochastic computation offloading game for mobile cloud computing,” in Proc. IEEE/CIC Int. Conf. Commun. China, 2016, pp. 1–6.
[2] Y. Xu and S. Mao, “A survey of mobile cloud computing for rich media applications,” IEEE Wireless Commun., vol. 20, no. 3, pp. 46–53, Jun. 2013.
[3] Y. Zhao, S. Zhou, T. Zhao, and Z. Niu, “Energy-efficient task offloading for multiuser mobile cloud computing,” in Proc. IEEE/CIC Int. Conf. Commun. China, 2015, pp. 1–5.
[4] S. Sardellitti, G. Scutari, and S. Barbarossa, “Joint optimization of radio and computational resources for multicell mobile-edge computing,” IEEE Trans. Signal Inf. Process. Over Netw., vol. 1, no. 2, pp. 89–103, Jun. 2015.
[5] Y. Cai, F. R. Yu, and S. Bu, “Cloud computing meets mobile wireless communications in next generation cellular networks,” IEEE Netw., vol. 28, no. 6, pp. 54–59, Nov./Dec. 2014.
[6] Y. Wen, W. Zhang, and H. Luo, “Energy-optimal mobile application execution: Taming resource-poor mobile devices with cloud clones,” in Proc. IEEE INFOCOM, 2012, pp. 2716–2720.
[7] M. V. Barbera, S. Kosta, A. Mei, and J. Stefa, “To offload or not to offload? The bandwidth and energy costs of mobile cloud computing,” in Proc. IEEE INFOCOM, 2013, pp. 1285–1293.
[8] J. Zheng, Y. Wu, N. Zhang, H. Zhou, Y. Cai, and X. Shen, “Optimal power control in ultra-dense small cell networks: A game-theoretic approach,” IEEE Trans. Wireless Commun., vol. 16, no. 7, pp. 4139–4150, Jul. 2017.
[9] D. Niyato, E. Hossain, and Z. Han, “Dynamics of multiple-seller and multiple-buyer spectrum trading in cognitive radio networks: A game-theoretic modeling approach,” IEEE Trans. Mobile Comput., vol. 8, no. 8, pp. 1009–1022, Aug. 2009.
[10] X. Chen, “Decentralized computation offloading game for mobile cloud computing,” IEEE Trans. Parallel Distrib. Syst., vol. 26, no. 4, pp. 974–983, Apr. 2015.
[11] X. Chen, L. Jiao, W. Li, and X. Fu, “Efficient multi-user computation offloading for mobile-edge cloud computing,” IEEE/ACM Trans. Netw., vol. 24, no. 5, pp. 2795–2808, Oct. 2016.
[12] T. Lin, T. Alpcan, and K. Hinton, “A game-theoretic analysis of energy efficiency and performance for cloud computing in communication networks,” IEEE Syst. J., vol. 11, no. 2, pp. 649–660, Jun. 2017.
[13] V. Cardellini, V. D. N. Persone, V. D. Valerio, F. Facchinei, V. Grassi, F. L. Presti, and V. Piccialli, “A game-theoretic approach to computation offloading in mobile cloud computing,” Math. Program., vol. 157, no. 2, pp. 421–449, Jun. 2016.
[14] M. Chen, M. Dong, and B. Liang, “Multi-user mobile cloud offloading game with computing access point,” in Proc. IEEE Int. Conf. Cloud Netw., 2016, pp. 64–69.
[15] Y. Wang, S. Meng, Y. Chen, R. Sun, X. Wang, and K. Sun, “Multi-leader multi-follower Stackelberg game based dynamic resource allocation for mobile cloud computing environment,” Wireless Personal Commun., vol. 93, no. 2, pp. 461–480, Mar. 2017.
[16] Y. Liu, C. Xu, Y. Zhan, Z. Liu, J. Guan, and H. Zhang, “Incentive mechanism for computation offloading using edge computing: A Stackelberg game approach,” Comput. Netw., vol. 129, no. 2, pp. 399–409, Dec. 2017.
[17] A. Rudenko, P. Reiher, G. J. Popek, and G. H. Kuenning, “Saving portable computer battery power through remote process execution,” Mobile Comput. Commun. Rev., vol. 2, no. 1, pp. 19–26, 1998.
[18] C. Xian, Y. Lu, and Z. Li, “Adaptive computation offloading for energy conservation on battery-powered systems,” in Proc. IEEE Int. Conf. Parallel Distrib. Syst., 2007, pp. 1–8.
[19] G. Huertacanepa and D. Lee, “An adaptable application offloading scheme based on application behavior,” in Proc. 22nd Int. Conf. Adv. Inf. Netw. Appl.-Workshops, 2008, pp. 387–392.
[20] J. Kwak, Y. Kim, J. Lee, and S. Chong, “DREAM: Dynamic resource and task allocation for energy minimization in mobile cloud systems,” IEEE J. Sel. Areas Commun., vol. 33, no. 12, pp. 2510–2523, Dec. 2015.
[21] D. Huang, P. Wang, and D. Niyato, “A dynamic offloading algorithm for mobile computing,” IEEE Trans. Wireless Commun., vol. 11, no. 6, pp. 1991–1995, Jun. 2012.
[22] H. Wu, D. Huang, and S. Bouzefrane, “Making offloading decisions resistant to network unavailability for mobile cloud collaboration,” in Proc. IEEE Int. Conf. Collaborative Comput.: Netw. Appl. Worksharing, 2013, pp. 168–177.
[23] L. Yang, J. Cao, S. Tang, T. Li, and A. T. S. Chan, “A framework for partitioning and execution of data stream applications in mobile cloud computing,” in Proc. IEEE Int. Conf. Cloud Comput., 2012, pp. 794–802.
[24] D. Monderer and L. S. Shapley, “Potential games,” Games Econ. Behavior, vol. 14, pp. 124–143, 1996.
[25] J. Marden, G. Arslan, and J. Shamma, “Joint strategy fictitious play with inertia for potential games,” IEEE Trans. Automat. Control, vol. 54, no. 2, pp. 208–220, Feb. 2009.
[26] J. Zheng, Y. Cai, Y. Liu, Y. Xu, B. Duan, and X. Shen, “Optimal power allocation and user scheduling in multicell networks: Base station cooperation using a game-theoretic approach,” IEEE Trans. Wireless Commun., vol. 13, no. 12, pp. 6928–6942, Dec. 2014.
[27] S. Hart and A. Mas-Colell, “A simple adaptive procedure leading to correlated equilibrium,” Econometrica, vol. 68, no. 5, pp. 1127–1150, 2000.
[28] P. Sastry, V. Phansalkar, and M. Thathachar, “Decentralized learning of Nash equilibria in multi-person stochastic games with incomplete information,” IEEE Trans. Syst. Man Cybern., vol. 24, no. 5, pp. 769–777, May 1994.
[29] J. Zheng, Y. Cai, N. Lu, Y. Xu, and X. Shen, “Stochastic game-theoretic spectrum access in distributed and dynamic environment,” IEEE Trans. Veh. Technol., vol. 64, no. 10, pp. 4807–4820, Oct. 2015.
[30] Y. Xu, J. Wang, Q. Wu, A. Anpalagan, and Y. Yao, “Opportunistic spectrum access in unknown dynamic environment: A game-theoretic stochastic learning solution,” IEEE Trans. Wireless Commun., vol. 11, no. 4, pp. 1380–1391, Apr. 2012.
[31] M. Bennis, S. M. Perlaza, P. Blasco, Z. Han, and H. V. Poor, “Self-organization in small cell networks: A reinforcement learning approach,” IEEE Trans. Wireless Commun., vol. 12, no. 7, pp. 3202–3212, Jul. 2013.
[32] T. Innovations, “LTE in a nutshell,” White Paper, 2010.
[33] T. S. Rappaport, Wireless Communications: Principles and Practice, 2nd ed. Englewood Cliffs, NJ, USA: Prentice Hall, 2001.
[34] B. Sklar, “Rayleigh fading channels in mobile digital communication systems Part I: Characterization,” IEEE Commun. Mag., vol. 35, no. 7, pp. 90–100, Jul. 1997.
[35] E. Cuervo, A. Balasubramanian, D. Cho, A. Wolman, S. Saroiu, R. Chandra, and P. Bahl, “MAUI: Making smartphones last longer with code offload,” in Proc. ACM 8th Int. Conf. Mobile Syst. Appl. Serv., 2010, pp. 49–62.
[36] B. Chun, S. Ihm, P. Maniatis, M. Naik, and A. Patti, “CloneCloud: Elastic execution between mobile device and cloud,” in Proc. ACM 6th Conf. Comput. Syst., 2011, pp. 301–314.
[37] S. Dey, Y. Liu, S. Wang, and Y. Lu, “Addressing response time of cloud-based mobile applications,” in Proc. 1st Int. Workshop Mobile Cloud Comput. Netw., 2013, pp. 3–10.
[38] T. Roughgarden, Selfish Routing and the Price of Anarchy. Cambridge, MA, USA: MIT Press, 2005.
[39] K. S. Narendra and M. A. L. Thathachar, Learning Automata: An Introduction. Englewood Cliffs, NJ, USA: Prentice-Hall, 1989.
[40] M. A. L. Thathachar and P. S. Sastry, Networks of Learning Automata: Techniques for Online Stochastic Optimization. Berlin, Germany: Springer, 2011.
[41] O. E. Ayach and R. W. Heath, “Interference alignment with analog channel state feedback,” IEEE Trans. Wireless Commun., vol. 11, no. 2, pp. 626–636, Feb. 2012.

Jianchao Zheng (M’16) received the BS degree in communications engineering, and the PhD degree in communications and information systems from the College of Communications Engineering, PLA University of Science and Technology, Nanjing, China, in 2010 and 2016, respectively. From 2015 to 2016, he was a visiting scholar with the Broadband Communications Research Group, Department of Electrical and Computer Engineering, University of Waterloo, Canada. He is currently a research associate with the National Innovation Institute of Defense Technology, Academy of Military Sciences PLA China. He is also a post-doctoral research associate with the College of Communications Engineering, PLA Army Engineering University. He has published several papers in international conferences and reputed journals in his research area. His research interests include interference mitigation techniques, green communications and computing networks, game theory, learning theory, and optimization techniques. He is a member of the IEEE.

Yueming Cai (M’05-SM’12) received the BS degree in physics from Xiamen University, Xiamen, China, in 1982, and the MS degree in micro-electronics engineering and the PhD degree in communications and information systems both from Southeast University, Nanjing, China, in 1988 and 1996, respectively. His current research interests include cooperative communications, signal processing in communications, wireless sensor networks, and physical layer security. He is a senior member of the IEEE.

Yuan Wu (M’10-SM’16) received the PhD degree in electronic and computer engineering from the Hong Kong University of Science and Technology, Hong Kong, in 2010. He is currently an associate professor with the College of Information Engineering, Zhejiang University of Technology, Hangzhou, China. During 2010-2011, he was a postdoctoral research associate with the Hong Kong University of Science and Technology. During 2016-2017, he was a visiting scholar with the Broadband Communications Research (BBCR) Group, Department of Electrical and Computer Engineering, University of Waterloo, Canada. His research interests focus on radio resource allocations for wireless communications and networks, and smart grid. He is the recipient of the Best Paper Award in the IEEE International Conference on Communications (ICC) in 2016. He is a senior member of the IEEE.

Xuemin Shen (M’97-SM’02-F’09) received the PhD degree in electrical engineering from Rutgers University, New Brunswick, New Jersey, in 1990. He is currently a university professor with the Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, ON, Canada. His research focuses on resource management in interconnected wireless/wired networks, wireless network security, social networks, smart grid, and vehicular ad hoc and sensor networks. He is a registered professional engineer of Ontario, Canada, an Engineering Institute of Canada fellow, a Canadian Academy of Engineering fellow, a Royal Society of Canada fellow, and a distinguished lecturer of the IEEE Vehicular Technology Society and Communications Society. He received the Joseph LoCicero Award in 2015 and the Education Award in 2017 from the IEEE Communications Society. He has also received the Excellent Graduate Supervision Award in 2006 and the Outstanding Performance Award in 2004, 2007, 2010, and 2014, respectively, from the University of Waterloo and the Premier’s Research Excellence Award (PREA) in 2003 from the Province of Ontario, Canada. He served as the technical program committee chair/co-chair for the IEEE Globecom’16, the IEEE Infocom’14, the IEEE VTC’10 Fall, the IEEE Globecom’07, the symposia chair for the IEEE ICC’10, the tutorial chair for the IEEE VTC’11 Spring, the chair for the IEEE Communications Society Technical Committee on Wireless Communications, and P2P Communications and Networking. He is the editor-in-chief of the IEEE Internet of Things Journal and the vice president of publications for the IEEE Communications Society. He is a fellow of the IEEE.

For more information on this or any other computing topic, please visit our Digital Library at www.computer.org/publications/dlib.