Dynamic Computation Offloading for Mobile Cloud Computing: A Stochastic Game-Theoretic Approach
Abstract—Driven by the growing popularity of mobile applications, mobile cloud computing has been envisioned as a promising approach to enhance the computation capability of mobile devices and reduce their energy consumption. In this paper, we investigate the problem of multi-user computation offloading for mobile cloud computing under a dynamic environment, wherein mobile users become active or inactive dynamically, and the wireless channels over which mobile users offload computation vary randomly. As mobile users are self-interested in offloading computation tasks to the mobile cloud, we formulate the mobile users' offloading decision process under the dynamic environment as a stochastic game. We prove that the formulated stochastic game is equivalent to a weighted potential game, which has at least one Nash Equilibrium (NE). We quantify the efficiency of the NE, and further propose a multi-agent stochastic learning algorithm to reach the NE with a guaranteed convergence rate (which is also analytically derived). Finally, we conduct simulations to validate the effectiveness of the proposed algorithm and evaluate its performance under the dynamic environment.

Index Terms—Mobile cloud computing, multi-user computation offloading, dynamic environment, stochastic game, multi-agent stochastic learning
1 INTRODUCTION
mechanisms with low complexity, which help to ease the heavy controlling and signaling overhead of complex centralized management [10].

However, applying game theory to model the multi-user computation offloading process should carefully deal with the complex real network environment. Specifically, mobile users may become active or inactive¹ dynamically, and wireless channels are also time-varying. To capture the dynamics of the network environment, in this paper, we adopt a novel stochastic game-theoretic approach to analyze the users' computation offloading decision-making process under dynamic conditions, and then propose a multi-agent stochastic learning algorithm to reach the NE of the stochastic game. The main contributions of this paper are summarized as follows:

- We formulate a stochastic game to model and analyze the multi-user computation offloading problem under dynamic environment, wherein both mobile users' activeness and wireless channel gains are time-varying. To show the existence of NE in the formulated stochastic game, we prove its equivalence to a weighted potential game, which has at least one NE. Moreover, we analyze the performance bound of the NE in terms of the system cost and the number of mobile users who can benefit from cloud computing.
- To reach the NE of the formulated stochastic game, we propose a multi-agent stochastic learning algorithm for the multi-user computation offloading under dynamic environment. The proposed algorithm runs in a fully distributed manner without any information exchange, i.e., each user independently adjusts its offloading strategy based on its received action-reward instead of knowing other users' detailed offloading strategies.
- As an important technical contribution in this paper, we theoretically derive the convergence rate of the multi-agent stochastic learning algorithm. It is technically challenging to prove the convergence property of the designed learning algorithms with multi-user interactions under dynamic environment, and our study here is the first one to successfully address this issue.

The rest of this paper is organized as follows. In Section 2, we give a brief review of the related works. In Section 3, our system model is introduced. In Section 4, we propose a stochastic game to investigate the problem of dynamic computation offloading. In Section 5, the performance of the NE of the game is analyzed. In Section 6, we propose a multi-agent stochastic learning algorithm to find the NE under dynamic environment. Section 7 presents simulation results and discussions. Conclusions are drawn in Section 8.

1. A mobile user is active if it has a computation task to be executed, while a mobile user is inactive (or silent) if it does not have a computation task to be executed.

2 RELATED WORK

In the literature, many existing works have studied the computation offloading problem from the perspective of a single mobile user. Rudenko et al. [17] used experimental results to show that computation offloading can save significant energy. In [18], the authors designed an adaptive timeout scheme for computation offloading to improve the energy savings on mobile devices. Wen et al. [6] proposed an optimization scheme for energy-efficient application execution on the cloud-assisted mobile application platform. Huerta-Canepa and Lee [19] proposed an adaptive application offloading mechanism based on both the current system conditions and the execution history of applications. By invoking the Lyapunov optimization, [20] and [21] studied dynamic computation offloading policies for minimizing CPU and network energy consumption under real network environment. In [22], the authors modeled the unstable network as an alternating renewal process and proposed an offloading decision model for mobile cloud applications.

Only a few works have discussed the computation offloading problem in the multi-user case. Yang et al. [23] proposed a genetic algorithm to solve the partition problem of wireless network bandwidth among multiple users, which achieves high throughput of processing the streaming data. In [3], the authors devised a low-complexity heuristic method to perform energy-efficient task offloading for multi-user mobile cloud computing, while satisfying the delay requirements. Sardellitti et al. [4] proposed an iterative algorithm to perform the joint optimization of radio and computational resources for multi-cell mobile-edge computing, under latency and power budget constraints.

The above studies all belong to centralized computation offloading mechanisms, which did not consider the interactions among multiple self-organizing users when they independently chose their computation offloading strategies. [10], [11], [12], [13], [14], [15], [16] modeled mobile users as self-interested game players and proposed decentralized mechanisms to solve the multi-user computation offloading problem. These previous studies mainly focused on the computation offloading problem under a relatively static environment. However, in real network environment, due to the dynamics of mobile users' activeness and time-varying wireless channels, the utility of each player is dynamically varying, and thus the equilibrium solution of the static game model may never be reached. In this paper, we study the multi-user computation offloading problem with consideration of the dynamics of users' behaviors and time-varying channels, and adopt the theory of stochastic games, which accounts for all the possible states of the dynamic process, to solve the problem.

The task of achieving NE solutions of the stochastic game in the distributed and dynamic environment is challenging. Most existing algorithms, such as the best (or better) response [24], fictitious play [25], spatial adaptive play [26], and no-regret learning [27], require enormous information exchange for users' strategy updating, and require the environment to remain unchanged until the algorithms converge. For the distributed and dynamic environment, some efficient algorithms have been proposed by invoking stochastic learning automata (SLA) [28], [29], [30]. Specifically, the convergence of SLA-based algorithms to NE has been established for coordination games [28] and exact potential games [29], [30]. In this paper, we show that the formulated stochastic game is equivalent to a weighted potential game, and then prove the convergence of the designed multi-agent stochastic learning algorithm to the NE of the weighted potential game. Moreover, to the best of our knowledge, our work is the first to establish the convergence rate of an SLA-based algorithm in the distributed and dynamic environment.
ZHENG ET AL.: DYNAMIC COMPUTATION OFFLOADING FOR MOBILE CLOUD COMPUTING: A STOCHASTIC GAME-THEORETIC APPROACH 773
TABLE 1
Summary of Used Notations

... parameters (i.e., channel power gains and the users' active probabilities) are unknown. For convenience of analysis, we define a probability space as $(\Omega, \mathcal{H}, P)$, where $\Omega$ is the sample space over all system states, $\mathcal{H}$ is a minimal $\sigma$-algebra on subsets of $\Omega$, and $P$ is a probability measure on $(\Omega, \mathcal{H})$. Let $\Lambda$ denote an event in the sample space $\Omega$, and let $\Theta(\Lambda) = [a(\Lambda), g(\Lambda)]: \Omega \to$ ...

2. In this paper, we focus on the wireless interference model given in (1), which is widely adopted in the literature. Note that we can also adopt some media access control protocols (such as CSMA) in which the multiple access among users for the shared spectrum is carried out at the packet level. As shown in [11], the analysis would be very similar to that for our adopted wireless interference model.
774 IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 18, NO. 4, APRIL 2019
data (e.g., the mobile system settings, the program codes, and the input parameters) involved in the task, and $D_i^{loc}$ and $D_i^{clo}$ are the total numbers of CPU cycles required to complete the computation task on the mobile device and the telecom cloud, respectively.³ For the sake of clear presentation, we use the letters "loc" and "clo" to represent LOCal computing and CLOud computing, respectively. The detailed modelings of the computation cost are as follows.

3.3.1 Computation Cost When Choosing to Perform Cloud Computing

If active user $i$ chooses to offload its computation task $\mathcal{T}_i$ to the cloud, it would incur the cost for transmitting the input data to the cloud via wireless access. According to the communication model (1), the transmission time and energy consumption for offloading the input data of size $C_i$ can be respectively computed by

$$T_{i,1}^{clo}(s_{\mathcal{A}}, \Lambda) = \frac{C_i}{R_i(s_{\mathcal{A}}, \Lambda)}, \quad \text{and} \quad E_i^{clo}(s_{\mathcal{A}}, \Lambda) = \frac{p_i C_i}{R_i(s_{\mathcal{A}}, \Lambda)}.$$

After the transmission, the cloud expends the execution time $T_{i,2}^{clo} = \frac{D_i^{clo}}{F_i^{clo}}$ to finish mobile user $i$'s task, where $F_i^{clo}$ denotes the computation capability (i.e., CPU cycles per second) assigned to user $i$ by the cloud.⁴ Then, considering both the processing time⁵ and the energy consumption, the total cost⁶ when mobile user $i$ chooses to perform cloud computing (i.e., $s_i > 0$) can be given by

$$V_i^{clo}(s_{\mathcal{A}}, \Lambda) = \mu_i^T \left( T_{i,1}^{clo}(s_{\mathcal{A}}, \Lambda) + T_{i,2}^{clo} \right) + \mu_i^E E_i^{clo}(s_{\mathcal{A}}, \Lambda) = \frac{(\mu_i^E p_i + \mu_i^T) C_i}{R_i(s_{\mathcal{A}}, \Lambda)} + \mu_i^T T_{i,2}^{clo}, \quad i \in \mathcal{A}, \text{ and } s_i > 0, \tag{2}$$

where $\mu_i^T$ and $\mu_i^E \in (0, 1)$ denote the weights of computational time and energy for mobile user $i$'s strategy decision, respectively. Here, the units of $\mu_i^T$ and $\mu_i^E$ are Second$^{-1}$ and Joule$^{-1}$, respectively.

3.3.2 Computation Cost When Choosing to Perform Local Computing

Each active mobile user $i$ can also choose to execute its computation task $\mathcal{T}_i$ locally by itself (i.e., without invoking any computation offloading to the cloud). Let $F_i^{loc}$ be the computation capability of mobile user $i$. The computation execution time of task $\mathcal{T}_i$ by local computing is then given by $T_i^{loc} = \frac{D_i^{loc}}{F_i^{loc}}$. The corresponding computational energy can be computed by $E_i^{loc} = \eta_i D_i^{loc}$, where $\eta_i$ is the coefficient denoting the energy consumption per CPU cycle. Then, by taking into account both the processing time and the energy consumption, we can compute mobile user $i$'s total cost when it chooses to perform local computation (i.e., $s_i = 0$) as follows:

$$V_i^{loc} = \mu_i^T T_i^{loc} + \mu_i^E E_i^{loc}, \quad i \in \mathcal{A}, \text{ and } s_i = 0. \tag{3}$$

Based on the communication and computation models above, we see that, when choosing to offload the computation task to the cloud, each mobile user's cost depends not only on its own offloading strategy, but also on all the other active peers'. Specifically, as shown in the cost function (2), if too many mobile users are active and using the same strategy (i.e., choosing to use the same wireless channel) to offload their computation tasks to the cloud, they may experience low offloading rates, which will incur more computation cost (including longer transmission time and higher energy consumption). In this case, it would be more beneficial for the mobile user to compute the task locally by itself. Due to such an inter-dependence among different mobile users, game theory is a suitable mathematical tool to model and analyze users' decision making for computation offloading. However, due to the dynamic variation of mobile users' activeness, each user may not be able to know other peers' active/inactive states. Moreover, mobile users prefer better channel conditions for offloading their computation tasks, but the wireless channels are time-varying, which makes the problem more challenging.

3. A mobile user $i$ can obtain the information of $C_i$, $D_i^{loc}$, and $D_i^{clo}$ by applying the methods (e.g., call graph analysis) in [35], [36]. If we consider the related overhead, it would be added just as a constant in the system model, which would not affect the following mathematical analysis. Besides, since the mobile devices and the remote cloud computing servers have different instruction set architectures, the numbers of CPU cycles for the two architectures (i.e., $D_i^{loc}$ and $D_i^{clo}$) are different.

4. $F_i^{clo}$ is considered to be preset for each user. In this paper, we do not study the allocation/scheduling of computation resources to different users from the perspective of the mobile cloud, since this issue has been widely investigated in the literature, e.g., [4], [20].

5. Since the AP is connected to the mobile cloud via a high-speed fiber link, the transmission time between them can be neglected compared with the much higher wireless access time (resulting from the constrained wireless spectrum resource). Besides, since the size of the computation outcome is usually much smaller than the size of the input computation data for many applications (e.g., face recognition), we neglect the time cost for the cloud to send the computation outcome back to the mobile user. A similar assumption also appears in many previous works [11], [17], [18], [19], [21].

6. It would also be interesting to consider the users' economic cost. Since $F_i^{clo}$ is considered to be preset for each user in our system model, the economic cost is just a constant added into Eq. (2), and thus it will not affect the following game-theoretic solution. However, if $F_i^{clo}$ is considered as an optimization variable, how to optimally decide the price for the computing resources would also be a key problem for the service providers. In this case, economic game models such as the market model, bargaining model, bidding model, auction model, duopoly model, and Stackelberg model could be adopted. Since this economic consideration will lead to a significant change of the current game model, we leave it for future work.

4 STOCHASTIC COMPUTATION OFFLOADING GAME

In this section, we formulate a stochastic game to model the decision process for mobile users' computation offloading to the cloud. For the sake of clear presentation, we first describe the game model in a static case, and then illustrate the dynamic case.

4.1 Game Models

4.1.1 Static Case

Given other users' strategy decisions, each active user $i \in \mathcal{A}$ independently adjusts its computation offloading strategy to minimize its own computation cost. Specifically, given a realization $\Lambda$ in the probability space $(\Omega, \mathcal{H}, P)$, the (state-based) payoff function of each active user $i$ is naturally defined by

$$u_i\left(s_i, s_{\mathcal{A}\setminus\{i\}}, \Lambda\right) = \begin{cases} V_i^{loc}, & \text{if } s_i = 0, \\ V_i^{clo}\left(s_i, s_{\mathcal{A}\setminus\{i\}}, \Lambda\right), & \text{if } s_i > 0. \end{cases} \tag{4}$$
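To make the cost models concrete, the following Python sketch evaluates (2) and (3) for a single user under the interference model (1). All numeric parameter values are invented placeholders for illustration only; they are not the paper's simulation settings.

```python
import math

# Illustrative placeholder parameters for one mobile user i
# (NOT the paper's simulation values).
B = 5e6            # channel bandwidth (Hz)
p_i = 0.1          # transmit power (W)
g_io = 1e-6        # channel power gain to the AP
sigma0 = 1e-10     # background noise power (W)
C_i = 5e6          # input data size (bits)
D_loc = 1e9        # CPU cycles needed on the mobile device
D_clo = 5e9        # CPU cycles needed on the cloud
F_loc = 1e9        # local computation capability (cycles/s)
F_clo = 1e10       # cloud capability assigned to user i (cycles/s)
eta_i = 1e-9       # energy consumption per CPU cycle (J/cycle)
mu_T, mu_E = 0.5, 0.5  # weights on time and energy

def offload_rate(interference: float) -> float:
    """Uplink rate R_i under the interference model (1)."""
    return B * math.log2(1 + p_i * g_io / (sigma0 + interference))

def cost_local() -> float:
    """Eq. (3): V_i^loc = mu_T * T_i^loc + mu_E * E_i^loc."""
    return mu_T * (D_loc / F_loc) + mu_E * (eta_i * D_loc)

def cost_cloud(interference: float) -> float:
    """Eq. (2): V_i^clo = (mu_E p_i + mu_T) C_i / R_i + mu_T * T_{i,2}^clo."""
    return (mu_E * p_i + mu_T) * C_i / offload_rate(interference) \
        + mu_T * (D_clo / F_clo)

# With no co-channel interference, offloading is cheaper for these numbers;
# as interference from peers on the same channel grows, it stops paying off.
print(cost_cloud(0.0) < cost_local())   # True
print(cost_cloud(1e-6) < cost_local())  # False: channel too congested
```

This directly mirrors the inter-dependence discussed above: the cloud cost grows with the co-channel interference term, while the local cost stays constant.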
Proof. According to Definition 1, cloud computing is beneficial to user $i$ only if $V_i^{clo}(s_{\mathcal{A}}, \Lambda) \le V_i^{loc}$. Based on the computing models in (2) and (3), this condition corresponds to

$$\frac{(\mu_i^E p_i + \mu_i^T) C_i}{R_i(s_{\mathcal{A}}, \Lambda)} + \mu_i^T T_{i,2}^{clo} \le \mu_i^T T_i^{loc} + \mu_i^E E_i^{loc},$$

which after some manipulations leads to the following condition:

$$R_i(s_{\mathcal{A}}, \Lambda) \ge \frac{(\mu_i^E p_i + \mu_i^T) C_i}{\mu_i^T T_i^{loc} + \mu_i^E E_i^{loc} - \mu_i^T T_{i,2}^{clo}}.$$

Based on the communication model (1), the above condition can be equivalently transformed into

$$\sum_{j \in \mathcal{A}\setminus\{i\}:\, s_j = s_i} p_j g_{j,o} \ \le\ \frac{p_i g_{i,o}}{2^{\frac{(\mu_i^E p_i + \mu_i^T) C_i}{B\left(\mu_i^T T_i^{loc} + \mu_i^E E_i^{loc} - \mu_i^T T_{i,2}^{clo}\right)}} - 1} - \sigma_0.$$

This completes the proof of Lemma 1. □

Based on Lemma 1, we design the following game model $\mathcal{G}_{1,\Lambda}$ (which will be shown to be equivalent to $\mathcal{G}_{0,\Lambda}$ with proofs in Lemma 2 and Theorem 2) as follows:

$$\mathcal{G}_{1,\Lambda}: \quad \min_{s_i \in \mathcal{S}_i} u_i\left(s_i, s_{\mathcal{A}\setminus\{i\}}, \Lambda\right), \quad \forall i \in \mathcal{A}. \tag{6}$$

Fig. 2 illustrates the connections among the game models $\mathcal{G}_{0,\Lambda}$, $\mathcal{G}_{1,\Lambda}$, and $\mathcal{G}_2$.

4.2 Analysis of Nash Equilibrium

In game theory, the Nash Equilibrium (NE) is the most important solution concept for analyzing the outcome of the strategic interaction of multiple decision-makers. In this section, we first investigate the existence of NE for the static game models $\mathcal{G}_{0,\Lambda}$ and $\mathcal{G}_{1,\Lambda}$. Moreover, by proving the equivalence between $\mathcal{G}_{0,\Lambda}$ and $\mathcal{G}_{1,\Lambda}$, we illustrate the rationality of designing the game model $\mathcal{G}_{1,\Lambda}$. Then, based on the analysis in the static case, we derive the existence of NE in the stochastic game $\mathcal{G}_2$. To proceed, we first introduce the definition of NE for game models in both static and dynamic cases.

Definition 2 (NE of $\mathcal{G}_{0,\Lambda}$). For a realization $\Lambda \in \Omega$, a computation offloading strategy profile $s_{\mathcal{A}}^* = [s_i^*]_{i \in \mathcal{A}}$ is a (pure-strategy) NE of the game $\mathcal{G}_{0,\Lambda}$ if and only if no active mobile user can further reduce its payoff function $u_i^0$ by unilaterally deviating, i.e.,

$$u_i^0\left(s_i^*, s_{\mathcal{A}\setminus\{i\}}^*, \Lambda\right) \le u_i^0\left(s_i, s_{\mathcal{A}\setminus\{i\}}^*, \Lambda\right), \quad \forall i \in \mathcal{A}, \ \forall s_i \in \mathcal{S}_i. \tag{10}$$

Definition 3 (NE of $\mathcal{G}_{1,\Lambda}$). For a realization $\Lambda \in \Omega$, a computation offloading strategy profile $s_{\mathcal{A}}^* = [s_i^*]_{i \in \mathcal{A}}$ is a (pure-strategy) NE of the game $\mathcal{G}_{1,\Lambda}$ if and only if no active mobile user can further reduce its payoff function $u_i$ by unilaterally deviating, i.e.,

$$u_i\left(s_i^*, s_{\mathcal{A}\setminus\{i\}}^*, \Lambda\right) \le u_i\left(s_i, s_{\mathcal{A}\setminus\{i\}}^*, \Lambda\right), \quad \forall i \in \mathcal{A}, \ \forall s_i \in \mathcal{S}_i. \tag{11}$$
where $u_k(s_k, s_{-k})$ is the payoff function defined in (8). Thus, according to the potential game theory in [24], $\mathcal{G}_2$ is a weighted potential game with weight-factor $p_k g_{k,o}$. □

As proved in [24], every weighted potential game possesses the finite improvement property, and thus $\mathcal{G}_2$ has at least one pure-strategy NE. That is, the investigated mobile users' decision-making process for computation offloading is guaranteed to have a pure-strategy NE under the dynamic environment. However, the behavior of users in the game is selfish: each user minimizes its own payoff without caring about the other peers', which may lead to an inefficient NE. In the following section, we will analyze the achievable performance of NE for the stochastic game $\mathcal{G}_2$ under dynamic environment.

Here, $R_i^{inf}$ denotes the lower bound of user $i$'s expected offloading rate at any NE:

$$R_i^{inf} = \mathbb{E}_g\left[ B \log_2\left(1 + \frac{p_i g_{i,o}}{\sigma_0 + \frac{1}{M}\sum_{j \in \mathcal{N}\setminus\{i\}} p_j g_{j,o} \theta_j}\right)\right]. \tag{25}$$

Proof. According to Definition 4 for the expected Nash Equilibrium, $\forall i \in \mathcal{N}$, $u_i(s_i^*, s_{-i}^*) \le u_i(s_i, s_{-i}^*)$, which along with (8) leads to

$$\mathbb{E}_\Theta\left[I_i\left(s_i^*, s_{-i}^*, \Theta\right)\right] \le \mathbb{E}_\Theta\left[I_i\left(s_i, s_{-i}^*, \Theta\right)\right], \quad \forall s_i \in \mathcal{M}. \tag{26}$$

By summing up the two sides of (26) over $s_i \in \mathcal{M}$, we can derive

$$M \, \mathbb{E}_\Theta\left[I_i\left(s_i^*, s_{-i}^*, \Theta\right)\right] \le \sum_{s_i \in \mathcal{M}} \mathbb{E}_\Theta\left[I_i\left(s_i, s_{-i}^*, \Theta\right)\right]. \tag{27}$$

Obviously, $\mathbb{E}_\Theta[I_i(s_i, s_{-i}, \Theta)] = \sum_{j \in \mathcal{N}\setminus\{i\}:\, s_j = s_i} p_j g_{j,o} \theta_j$. Thus

$$\sum_{s_i \in \mathcal{M}} \mathbb{E}_\Theta\left[I_i\left(s_i, s_{-i}, \Theta\right)\right] = \sum_{s_i \in \mathcal{M}} \sum_{j \in \mathcal{N}\setminus\{i\}:\, s_j = s_i} p_j g_{j,o} \theta_j = \sum_{j \in \mathcal{N}\setminus\{i\}} p_j g_{j,o} \theta_j \sum_{s_i \in \mathcal{M}} \mathbb{1}_{\{s_j = s_i\}}. \tag{28}$$

If $s_j = 0$, $\sum_{s_i \in \mathcal{M}} \mathbb{1}_{\{s_j = s_i\}} = 0$; if $s_j \in \mathcal{M}$, $\sum_{s_i \in \mathcal{M}} \mathbb{1}_{\{s_j = s_i\}} = 1$. Thus

$$\sum_{s_i \in \mathcal{M}} \mathbb{E}_\Theta\left[I_i\left(s_i, s_{-i}, \Theta\right)\right] \le \sum_{j \in \mathcal{N}\setminus\{i\}} p_j g_{j,o} \theta_j, \tag{29}$$

which implies that a higher offloading rate can be achieved. With Lemma 3, the following result can be achieved.

Theorem 4. For the multi-user stochastic computation offloading game $\mathcal{G}_2$, the PoA of the system-wide computation cost satisfies

$$1 \le \mathrm{PoA} \le \frac{\sum_{i=1}^{N} \theta_i \max\left\{V_i^{loc},\ \frac{(\mu_i^E p_i + \mu_i^T) C_i}{R_i^{inf}} + \mu_i^T T_{i,2}^{clo}\right\}}{\sum_{i=1}^{N} \theta_i \min\left\{V_i^{loc},\ \frac{(\mu_i^E p_i + \mu_i^T) C_i}{R_i^{sup}} + \mu_i^T T_{i,2}^{clo}\right\}}, \tag{35}$$

where $R_i^{sup} = B \log_2\left(1 + \frac{p_i g_{i,o}}{\sigma_0}\right)$.
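Numerically, the PoA bound in Theorem 4 (Eq. (35)) is just a ratio of worst-case to best-case expected per-user costs built from $R_i^{inf}$ and $R_i^{sup}$. The sketch below assumes a per-user record of parameters; the field names and all values are our own illustrative choices, not the paper's.

```python
import math

def poa_upper_bound(users, B, sigma0):
    """Evaluate the right-hand side of Eq. (35) (sketch).
    Each user record holds illustrative fields: theta (active probability),
    p, g (power and channel gain), V_loc, C, mu_T, mu_E, T_clo2, and
    exp_interference (the expected co-channel interference used in R_i^inf)."""
    num = den = 0.0
    for u in users:
        r_sup = B * math.log2(1 + u["p"] * u["g"] / sigma0)
        r_inf = B * math.log2(
            1 + u["p"] * u["g"] / (sigma0 + u["exp_interference"]))
        coef = u["mu_E"] * u["p"] + u["mu_T"]
        worst = coef * u["C"] / r_inf + u["mu_T"] * u["T_clo2"]
        best = coef * u["C"] / r_sup + u["mu_T"] * u["T_clo2"]
        num += u["theta"] * max(u["V_loc"], worst)
        den += u["theta"] * min(u["V_loc"], best)
    return num / den

user = dict(theta=0.5, p=0.1, g=1e-6, V_loc=1.0, C=5e6,
            mu_T=0.5, mu_E=0.5, T_clo2=0.5, exp_interference=1e-8)
print(poa_upper_bound([user], B=5e6, sigma0=1e-10) >= 1.0)  # True
```

Since $R_i^{inf} \le R_i^{sup}$, the numerator dominates the denominator term by term, so the computed bound can never drop below one.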
decreases the gap between the worst NE and the centralized optimal solution.

5.2 Metric II: Beneficial Cloud Computing Users

For an arbitrary NE $s^*$, let us denote the number of users who benefit from cloud computing by $N^c(s^*)$. Next, we will analyze the bound of $N^c(s^*)$.

Letting $Z_{max} \triangleq \max_{i \in \mathcal{N}}\{p_i g_{i,o}\}$, $Z_{min} \triangleq \min_{i \in \mathcal{N}}\{p_i g_{i,o}\}$, $Q_{max} \triangleq \max_{i \in \mathcal{N}}\{Q_i\}$, $Q_{min} \triangleq \min_{i \in \mathcal{N}}\{Q_i\}$, $\theta_{max} \triangleq \max_{i \in \mathcal{N}}\{\theta_i\}$, and $\theta_{min} \triangleq \min_{i \in \mathcal{N}}\{\theta_i\}$, we can derive the following theorem.

Theorem 5. Suppose $0 < N^c(s^*) < N$; then for the multi-user dynamic computation offloading game, the total number of users who benefit from cloud computing at any NE point satisfies

$$\frac{M Q_{min}}{Z_{max}\, \theta_{max}} \le N^c(s^*) \le M\left(\frac{Q_{max}}{Z_{min}\, \theta_{min}} + 1\right). \tag{42}$$

Proof. 1) Since $N^c(s^*) < N$, there exists at least one user $k$ that chooses the local computing manner, i.e., $s_k^* = 0$. Since $s^*$ is an NE, we know that user $k$ cannot reduce its payoff by choosing computation offloading via any channel $m \in \mathcal{M}$. According to (8), we have $\mathbb{E}_\Theta[I_k(s^*, \Theta)] = \sum_{j \in \mathcal{N}\setminus\{k\}} p_j g_{j,o} \theta_j \mathbb{1}_{\{s_j^* = m\}} \ge Q_k$, $\forall m \in \mathcal{M}$. Then, let $N_m^c(s^*) = \sum_{i=1}^{N} \mathbb{1}_{\{s_i^* = m\}}$ denote the number of users on channel $m$, and we have

$$N_m^c(s^*)\, Z_{max}\, \theta_{max} \ge \sum_{j \in \mathcal{N}\setminus\{k\}} p_j g_{j,o} \theta_j \mathbb{1}_{\{s_j^* = m\}} \ge Q_k \ge Q_{min}. \tag{43}$$

Thus, $N_m^c(s^*) \ge \frac{Q_{min}}{Z_{max}\theta_{max}}$, $\forall m \in \mathcal{M}$. Then, we can obtain

$$N^c(s^*) = \sum_{m=1}^{M} N_m^c(s^*) \ge \frac{M Q_{min}}{Z_{max}\, \theta_{max}}. \tag{44}$$

2) Since $N^c(s^*) > 0$, there exists at least one user $\tilde{k}$ that chooses the cloud computing manner, i.e., $s_{\tilde{k}}^* > 0$. Without loss of generality, suppose user $\tilde{k}$ is on the channel $m$ which is occupied by the most users, i.e., $N_m^c(s^*) \ge N_{\tilde{m}}^c(s^*)$, $\forall \tilde{m} \in \mathcal{M}$. Since $s^*$ is an NE, we know that user $\tilde{k}$ cannot reduce its payoff by choosing local computation. According to (8), we have $\mathbb{E}_\Theta[I_{\tilde{k}}(s^*, \Theta)] \le Q_{\tilde{k}}$. That is, $\sum_{j \in \mathcal{N}\setminus\{\tilde{k}\}} p_j g_{j,o} \theta_j \mathbb{1}_{\{s_j^* = m\}} \le Q_{\tilde{k}}$. Then, the following result always holds:

$$\left(N_m^c(s^*) - 1\right) Z_{min}\, \theta_{min} \le \sum_{j \in \mathcal{N}\setminus\{\tilde{k}\}} p_j g_{j,o} \theta_j \mathbb{1}_{\{s_j^* = m\}} \le Q_{\tilde{k}} \le Q_{max}, \tag{45}$$

which leads to $N_m^c(s^*) \le \frac{Q_{max}}{Z_{min}\theta_{min}} + 1$. Then, we have

$$N^c(s^*) = \sum_{\tilde{m}=1}^{M} N_{\tilde{m}}^c(s^*) \le \sum_{\tilde{m}=1}^{M} N_m^c(s^*) \le M\left(\frac{Q_{max}}{Z_{min}\, \theta_{min}} + 1\right). \tag{46}$$

Therefore, Theorem 5 is proved. □

Theorem 5 provides a quantitative characterization of how many users can eventually benefit from offloading computation tasks to the mobile cloud by playing the stochastic game $\mathcal{G}_2$.

Remark 1. The above analysis indicates that the NE points might enable mobile users to achieve desirable and attractive performance by playing the proposed stochastic game for computation offloading. This is very interesting, since mobile users' selfish and competitive behaviors lead to desirable game outcomes. The reasons can be explained as follows. If too many mobile users are using the same wireless channel to offload their computation tasks to the cloud, they may experience low offloading rates. In order to reduce the computation cost, some mobile users will then choose other wireless channels for offloading, or compute their tasks locally. Consequently, this leads to a balanced occupation of the wireless channels for computation offloading, which is beneficial for the whole system.

6 MULTI-AGENT STOCHASTIC LEARNING UNDER DYNAMIC ENVIRONMENT

Although the NE points exhibit desirable and attractive performance, it is challenging for mobile users to reach the NE in a distributed manner and under dynamic environment. Most existing game-theoretic algorithms (e.g., best (or better) response [24], spatial adaptive play [26]) update users' strategies based on their received instantaneous utility/payoff. However, due to the dynamics of users' activeness and wireless channels in our computation offloading problem, each user might receive different utilities/payoffs in different time slots, even if it chooses to use the same strategy. Thus, the existing game-theoretic algorithms may never reach the NE. This motivates us to incorporate the idea of stochastic learning [28], [29], [30] into the design of an efficient yet distributed algorithm under dynamic environment, in order to reach the NE of our proposed stochastic game $\mathcal{G}_2$.

6.1 Proposed Multi-Agent Stochastic Learning Algorithm

The details are shown in the Multi-Agent Stochastic Learning Algorithm (referred to as the MASL-Algorithm). Specifically, each mobile user acts as a learning automaton that independently and automatically selects its offloading strategy according to a probability vector over the strategy space, and updates the probability vector based on the action-reward received from the dynamic environment. For the sake of clear presentation, we denote the strategy selection probability vector of an arbitrary user $i$ as $w_i = (w_{i0}, w_{i1}, \ldots, w_{iM})$, where $w_{i0}$ denotes the probability of selecting the strategy of local computing, and $w_{im}$ ($m \in \mathcal{M}$) denotes the probability of selecting to offload the computation task to the mobile cloud via wireless channel $m$.

The proposed MASL-Algorithm operates in an iterative manner. Within each round of iteration, each active user independently selects its offloading strategy based on a probability vector over the strategy space, and receives an action-reward from the dynamic environment. The computation offloading strategy with less cost is given a larger action-reward, and the strategy with a larger reward value is assigned a larger probability. By continuously interacting with the random environment, each mobile user will finally choose its optimal offloading strategy with probability one. We emphasize that during the operation of the MASL-Algorithm, each mobile user operates entirely based
on its own strategies and the consequently received rewards, without requiring any knowledge from other users or any prior knowledge of the probability space $(\Omega, \mathcal{H}, P)$ of the dynamic environment. Therefore, the proposed MASL-Algorithm is fully distributed, which makes it attractive for practical implementation.

MASL-Algorithm to Reach the NE of the Stochastic Game $\mathcal{G}_2$

Initialization: At the initial time $t = 0$, each mobile user $i \in \mathcal{N}$ sets its strategy selection probability vector as a uniform distribution, i.e., $w_i^0 = \left(\frac{1}{M+1}, \ldots, \frac{1}{M+1}\right)$.

Loop for $t = 0, 1, 2, \ldots$

1) Updating computation offloading strategy: In the $t$th time slot, each active user $i \in \mathcal{A}^t$ selects an offloading strategy $s_i^t$ according to its current strategy selection probability vector $w_i^t$. The inactive users $\mathcal{N}\setminus\mathcal{A}^t$ keep silent and take no action.

2) Measuring instantaneous payoff⁸: Each active user evaluates its respective payoff $u_i^t$ according to (7); namely, active user $i$ evaluates its received interference $I_i^t$ if $s_i^t > 0$; otherwise, active user $i$ directly computes $Q_i^t$ (which is given in Lemma 1). Besides, the inactive users $\mathcal{N}\setminus\mathcal{A}^t$ keep silent and take no action.

3) Updating strategy selection probability: Each active user updates its strategy selection probability vector for the next time slot according to the following rule:

$$w_i^{t+1} = w_i^t + b\, r_i^t \left(e_{s_i^t} - w_i^t\right), \tag{47}$$

where $0 < b < 1$ is the learning step-size, $e_{s_i^t}$ is an $(M+1)$-dimensional unit vector with the $s_i^t$th element being one, and $r_i^t$ is the received action-reward defined by $r_i^t = 1 - \gamma_i u_i^t$. (The computation offloading strategy with less cost is given a larger action-reward. Here $\gamma_i$ is a scaling factor, and we require $\gamma_i \le \frac{1}{\max_t \{u_i^t\}}$ to guarantee that the action-reward $r_i^t$ is positive.) Besides, the inactive mobile users $\mathcal{N}\setminus\mathcal{A}^t$ keep their strategy selection probability vectors unchanged, i.e.,

$$w_i^{t+1} = w_i^t. \tag{48}$$

End loop until all users no longer adjust their respective offloading strategies.

Notice that due to the dynamics of users' activeness and wireless channels, user $i$ might receive different action-rewards in different time slots, even if it chooses to use the same strategy. This imposes the key challenge in establishing the convergence of the MASL-Algorithm. Although several previous studies [28], [29], [30] have investigated the convergence of stochastic learning algorithms, our proposed MASL-Algorithm differs from those algorithms in taking into account the dynamics of both users' activeness and wireless channels. Moreover, the definition of the payoff function is application-dependent, and different payoff functions (adopted by even the same learning mechanism) will lead to different learning solutions [31]. Therefore, the previous analyses are not applicable to our case in this study, which motivates us to perform a deep analysis of the convergence property of the MASL-Algorithm in the next section.

8. As we discussed before, the received payoff $u_i^t$ depends not only on other users' activeness, but also on the current channel condition. Specifically, other users' activeness impacts the received interference $I_i^t$, while the current channel condition impacts both $I_i^t$ and $Q_i^t$.

6.2 Convergence Properties of MASL-Algorithm

We first re-write the updating rule in Step 3 of the proposed algorithm as follows:

$$w_i^{t+1} = w_i^t + b\, a_i^t r_i^t \left(e_{s_i^t} - w_i^t\right), \quad \forall i \in \mathcal{N}, \tag{49}$$

where $a_i^t$ denotes the activeness of user $i$ in the $t$th time slot. Let $W^t = \left(w_1^t, \ldots, w_N^t\right)^T$ denote the strategy selection probability vectors of all the users; thus, we can express the evolution of the strategy selection probabilities of the game $\mathcal{G}_2$ as follows:

$$W^{t+1} = W^t + b\, f\left(W^t, a^t, s^t, r^t\right), \tag{50}$$

where $a^t = (a_1^t, \ldots, a_N^t)$, $s^t = (s_1^t, \ldots, s_N^t)$, $r^t = (r_1^t, \ldots, r_N^t)$, and $f(\cdot)$ represents the updating rule specified by (49). Then, according to Theorem 3.1 in [28], we can derive the following lemma.

Lemma 4. With a sufficiently small step-size $b$, i.e., $b \to 0$, the sequence $\{W^t\}$ converges weakly to the solution of the following ordinary differential equation (ODE)

$$\frac{dW}{dt} = h(W), \tag{51}$$

with the initial state $W^0 = \left[\frac{1}{M+1}\right]_{N(M+1)}$, and $h(W) = \mathbb{E}_\Theta\left[f\left(W^t, a^t, s^t, r^t, \Theta\right) \,\middle|\, W^t = W\right]$.

Lemma 5. With a sufficiently small step-size $b$, our proposed MASL-Algorithm converges to a stable stationary point of the ODE given in (51).

Proof. Let $\bar{r}_i(s) = \mathbb{E}_\Theta[r_i(s, \Theta)]$ denote user $i$'s expected reward function under strategy profile $s$, and let $X_i(m, W_{-i})$ denote user $i$'s probabilistic reward function when it adopts pure strategy $m$ and the other users adopt the probability vectors (for strategy selection) $W_{-i} = (w_1, \ldots, w_{i-1}, w_{i+1}, \ldots, w_N)$. Specifically, define

$$X_i(m, W_{-i}) = \bar{r}_i(m, W_{-i}) = \sum_{s_{-i} \in \mathcal{S}_{-i}} \bar{r}_i(m, s_{-i}) \prod_{j \neq i} w_{j, s_j}, \tag{52}$$

where $\mathcal{S}_{-i}$ denotes the strategy space of all the users excluding $i$, and $w_{j, s_j}$ is the probability of user $j$ choosing pure strategy $s_j$. In addition, define the probabilistic potential function

$$Y(W) = \bar{F}(W) = \sum_{s \in \mathcal{S}} \bar{F}(s) \prod_{i \in \mathcal{N}} w_{i, s_i}, \tag{53}$$

and

$$Y_i(m, W_{-i}) = \frac{\partial Y(W)}{\partial w_{i,m}} = \sum_{s_{-i} \in \mathcal{S}_{-i}} \bar{F}(m, s_{-i}) \prod_{j \neq i} w_{j, s_j}, \tag{54}$$

where $\bar{F}$ is the expected potential function defined in (20), and $\mathcal{S}$ denotes the strategy space of all the users.
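The probability update in Step 3 (and its compact form in (49)) is a linear reward-inaction automaton step. Below is a minimal self-contained sketch with our own function names:

```python
import random

def masl_step(w, s, r, b):
    """One update of Eq. (47): w <- w + b * r * (e_s - w), where e_s is
    the unit vector of the played strategy s and r in (0, 1) is the
    action-reward. Inactive users simply skip this call (Eq. (48))."""
    return [wk + b * r * ((1.0 if k == s else 0.0) - wk)
            for k, wk in enumerate(w)]

def sample_strategy(w):
    """Draw a strategy index (0 = local, m >= 1 = channel m) from w."""
    u, acc = random.random(), 0.0
    for k, wk in enumerate(w):
        acc += wk
        if u < acc:
            return k
    return len(w) - 1

# The update keeps w a valid distribution: a step after playing strategy 2
# raises w[2], shrinks the rest, and the entries still sum to one.
w = masl_step([0.25, 0.25, 0.25, 0.25], s=2, r=0.8, b=0.1)
print(round(sum(w), 10), w[2] > w[0])  # 1.0 True
```

Because the step is a convex combination of the old vector and the unit vector $e_s$, each entry stays in $[0, 1]$ and the vector remains on the probability simplex, which is what lets the ODE analysis above apply.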
According to Lemma 4, we have

dw_{i,m}/dt = \theta_i ( w_{i,m} (1 - w_{i,m}) r_i(m, W_{-i}) - \sum_{m' \neq m} w_{i,m'} w_{i,m} r_i(m', W_{-i}) )
            = \theta_i w_{i,m} ( r_i(m, W_{-i}) - \sum_{m' \in S_i} w_{i,m'} r_i(m', W_{-i}) )
            = \theta_i w_{i,m} ( X_i(m, W_{-i}) - \sum_{m' \in S_i} w_{i,m'} X_i(m', W_{-i}) ).

Then, according to (55), we have dw_{i,m}/dt = 0, \forall i, m, and thus dW/dt = 0. Hence, W converges to a stationary point of the ODE (51). This completes the proof of Lemma 5. □

It has been proved by Theorem 3.2 in [28] that all pure-strategy NE of G2 coincide with the stable stationary points of the ODE given in (51). Thus, based on Lemma 5, we can derive the following theorem.

Combining the above with (52) yields

r^t = 1 - [ (b/2) \sum_{i, m, m'} \theta_i \gamma_i p_i g_{i,o} \sum_{s_{-i} \in S_{-i}} ( r_i(m, s_{-i}) - r_i(m', s_{-i}) )^2 w^t_{i,m} w^t_{i,m'} \prod_{j \neq i} w^t_{j,s_j} ]
          / [ \sum_{s \in S} \hat{F}(s) \prod_{i \in N} w^t_{i,s_i} - \hat{F}(s^*) ]
    = 1 - [ (b/2) \sum_{i, m, m'} \theta_i \gamma_i p_i g_{i,o} \sum_{s_{-i} \in S_{-i}} ( u_i(m', s_{-i}) - u_i(m, s_{-i}) )^2 \prod_{i \in N: s_i = m} w^t_{i,s_i} \prod_{i \in N: s_i = m'} w^t_{i,s_i} ]
          / [ \sum_{s \in S} \hat{F}(s) \prod_{i \in N} w^t_{i,s_i} - \hat{F}(s^*) ].   (69)
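The ODE manipulated in the proof of Lemma 5 is a replicator-style dynamic: each action's probability grows in proportion to how much its expected reward exceeds the current average. A minimal forward-Euler sketch, under the simplifying assumption that one user faces a fixed expected-reward vector X (in the game, X_i depends on the other users' time-varying mixed strategies), shows the trajectory leaving the uniform point and settling on a pure-strategy stationary point:

```python
def replicator_step(w, X, theta, dt):
    """One Euler step of dw_m/dt = theta * w_m * (X_m - sum_k w_k * X_k),
    the ODE form appearing in the proof of Lemma 5."""
    avg = sum(wk * xk for wk, xk in zip(w, X))
    return [wm + dt * theta * wm * (xm - avg) for wm, xm in zip(w, X)]

def integrate(w, X, theta=1.0, dt=0.01, steps=5000):
    """Forward-Euler integration of the ODE from an initial mixed strategy."""
    for _ in range(steps):
        w = replicator_step(w, X, theta, dt)
    return w

# Uniform initial strategy over 3 actions; hypothetical expected rewards X.
w0 = [1.0 / 3] * 3
X = [0.2, 0.9, 0.5]
w_final = integrate(w0, X)
# The trajectory settles on the pure strategy with the largest reward,
# i.e., a stationary point of the ODE (all dw_m/dt = 0).
```

Note that the step preserves the probability simplex: when sum(w) = 1, the increments sum to zero, mirroring the continuous-time dynamic.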
At t = 0, w^0_{i,s_i} = 1/(M+1), \forall i \in N, \forall s_i \in S_i; thus, we can derive r^0 as in (64). At t = \infty, let the probability of the NE be \prod_{i \in N} w_{i,s^*_i} = 1 - \varepsilon, the probability of only one user j deviating from the NE be w_{j,s_j} \prod_{i \in N \setminus \{j\}} w_{i,s^*_i} = \varepsilon / (NM), s_j \neq s^*_j, and the probability of more than one user deviating from the NE be 0. As \varepsilon \to 0, we obtain r^\infty as in (65). Then, according to the definition in [39], the average rate of convergence can be computed by r_{ave} = \sqrt{r^0 r^\infty}. Notably, the time taken for \hat{F}(W^t), starting from W^0, to decrease L times its value is T_{ave} = \log L / \log r_{ave}. □
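The closing steps above reduce to simple arithmetic: r_ave is the geometric mean of r^0 and r^infinity, and the iteration count for an L-fold decrease follows from a per-iteration contraction by r_ave. A sketch with hypothetical ratio values (the sign convention below assumes a contraction ratio r_ave < 1, so the count comes out positive):

```python
import math

def average_rate(r0, r_inf):
    """Average convergence ratio r_ave = sqrt(r0 * r_inf) (geometric mean)."""
    return math.sqrt(r0 * r_inf)

def iterations_to_shrink(r_ave, L):
    """Iterations T_ave solving r_ave**T = 1/L, i.e., the tracked quantity
    decreases L times; equivalently |log L / log r_ave| for r_ave < 1."""
    return math.log(L) / (-math.log(r_ave))

# Hypothetical example: with r^0 = r^inf = 0.99, a tenfold decrease
# takes roughly 229 iterations.
T = iterations_to_shrink(average_rate(0.99, 0.99), 10.0)
```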
Remark 2 (The impact of step-size b). It is noted that the convergence rate is higher when r_ave is smaller [39]. Notably, r_ave decreases linearly with the value of b. Thus, when b is larger, the convergence gets faster. However, the approximation of the iterative process to the ODE requires a sufficiently small b, as shown in Lemma 4. [40] has characterized the accuracy of the approximation as

E[ \| W_t - \tilde{W}^{bt} \| ] = O(\sqrt{b}),   (70)

where W_t is the iterative process of the algorithm, and \tilde{W}^{bt} represents the value of the trajectory of the ODE at time bt. Therefore, there is a tradeoff between the convergence rate and the accuracy of approximation to the NE of the stochastic game, and the parameter setting of b is application-dependent in practice. Fig. 5 in the next section verifies the above discussions.

7 SIMULATION RESULTS

In this section, we conduct simulations to validate the proposed MASL-Algorithm and its performance under dynamic environment. We set up a scenario network where a group of mobile users are randomly scattered in the coverage of the AP. For each user i, we randomly set its active probability \theta_i according to a uniform distribution within (0, 1]. The time-varying channels follow Rayleigh fading, where the random fading coefficient is exponentially distributed with unit mean. Table 2 summarizes the key parameters used in simulations, which are set similar to [10], [11]. If not otherwise specified, the default setting for the number of users is 30, and that for the number of channels is 5. Moreover, in the proposed MASL-Algorithm, the default value of the step-size b is 0.1, and that of the scaling factor \gamma_i is 10^5 (the impact will be discussed later on).

TABLE 2. Parameters Setting.

7.1 Convergence Analysis

To evaluate convergence of the proposed MASL-Algorithm, we plot the strategy selection probabilities of one arbitrarily selected user in Fig. 3. At the very beginning, this target user randomly selects its computation offloading strategy according to a uniform distribution. As the MASL-Algorithm operates, this target user's strategy selection probabilities keep on updating and finally converge after around 300 iterations.9 Specifically, after convergence, the probability of choosing to offload the computation task via channel 1 is equal to 1, while the probabilities for other strategies decrease to 0. This result means that, after convergence, the target user will only choose channel 1 to offload its computation task to the mobile cloud.

Fig. 4 plots the dynamics of the system-wide computation cost for our MASL-Algorithm. For comparison, the best-response algorithm (referred to as the BR-Algorithm) in [11], which also runs in an iterative manner, is plotted.

9. Since the typical length of a time slot in wireless systems is at the time scale of microseconds (e.g., 70 microseconds for a time slot in the LTE system [32]), the time used by the computation offloading decision process is actually very short (e.g., 21 milliseconds for 300 iterations in the LTE system). Such a short duration is negligible, compared with the computation execution process, which is typically at the scale of seconds (e.g., the execution time for mobile gaming applications is typically several hundred milliseconds [37]).
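The channel model in the setup above can be sketched directly: drawing power gains as unit-mean exponential variates (so amplitudes are Rayleigh distributed) is one standard way to generate Rayleigh block fading per time slot.

```python
import math
import random

def rayleigh_power_gain():
    """Random fading power gain: exponentially distributed with unit mean,
    as specified in the simulation setup."""
    return random.expovariate(1.0)

def rayleigh_amplitude():
    """The corresponding amplitude |h| = sqrt(power) is Rayleigh distributed."""
    return math.sqrt(rayleigh_power_gain())

# One independent realization per user per time slot would rescale
# each user's channel gain in the dynamic environment.
gains = [rayleigh_power_gain() for _ in range(5)]
```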
Fig. 3. Convergence of strategy selection probabilities.
Fig. 5. Impact of different values of step-size b.

At each iteration, the BR-Algorithm allows the user who received the update-permission message to select its optimal strategy based on its instantaneous computation cost, while other users keep their strategies unchanged. We can see that the MASL-Algorithm can greatly reduce the computation cost to its minimum (i.e., the NE) after convergence, while the BR-Algorithm yields a fluctuating computation cost which is much greater than that of the MASL-Algorithm. This is because the BR-Algorithm always conducts a myopic play based on the instantaneous computation cost, while the environment is dynamically varying.

We then analyze the computational complexity of the proposed algorithm. Since most operations only involve some basic arithmetical calculations, the dominating part is the updating of the strategy selection probability in Step 3, which involves the operations of 2 vector-vector sums, 1 scalar-vector product, and 1 scalar-scalar product. Thus, the proposed MASL-Algorithm runs in a low complexity of O(3M + 1). In contrast, [11] has shown that the computational complexity of the BR-Algorithm is O(M log M).

Fig. 5 shows the impact of the step-size on the convergence speed of the proposed MASL-Algorithm. We vary the step-size b to be 0.05, 0.1, 0.2, 0.3, and 0.4, respectively. Fig. 5 shows that as the step-size increases, the algorithm can speed up its convergence, but obtains an inferior solution (not an NE). As shown in Remark 2 (Section 6), there is a tradeoff between the convergence speed and the accuracy of convergence to the NE, which are both impacted by the step-size b. For our case, it is preferable to set the step-size to 0.1, which can lead the algorithm to converge to an NE within about 350 iterations. It is noted that setting the step-size to 0.05 can also reach the NE, but the convergence speed is very slow (more than 500 iterations).

Moreover, Fig. 6 shows the impact of the scaling factor \gamma_i on the performance of the proposed MASL-Algorithm. We vary the parameter \gamma_i to be 10^3, 10^4, 10^5, and 10^6, respectively. Fig. 6 shows that when \gamma_i = 10^5, the algorithm achieves the best performance in terms of reducing the system-wide computation cost. A larger \gamma_i setting could enhance users' response to the computation cost, and thus lead to users' sensitive strategy adjustment towards the optimal one. That is why we observe from the figure that 10^5 is better than 10^3 and 10^4 for the setting of \gamma_i. However, \gamma_i cannot be too large, in order to guarantee that the action reward r_i^t remains positive. As a result, the performance of the algorithm gets worse when \gamma_i = 10^6.
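The Step-3 update counted in the complexity analysis above can be sketched as a linear reward-inaction rule, which matches the stated operation count (two vector-vector sums, one scalar-vector product, one scalar-scalar product, hence O(3M + 1)). The exact normalization of the reward via \gamma_i is abstracted here into a reward already scaled into [0, 1], which is an assumption of this sketch:

```python
import random

def masl_update(w, action, reward, b):
    """One strategy-probability update: w <- w + b * reward * (e_action - w),
    where reward is assumed already scaled into [0, 1]."""
    e = [1.0 if m == action else 0.0 for m in range(len(w))]  # unit vector
    scale = b * reward                       # 1 scalar-scalar product
    # (e - w): vector-vector sum; scale * (...): scalar-vector product;
    # w + (...): vector-vector sum  ->  roughly O(3M + 1) work per slot.
    return [wm + scale * (em - wm) for wm, em in zip(w, e)]

def sample_action(w):
    """Draw the next offloading strategy from the current mixed strategy."""
    return random.choices(range(len(w)), weights=w)[0]

w1 = masl_update([0.5, 0.5], action=0, reward=1.0, b=0.1)  # -> [0.55, 0.45]
```

The update leaves the probabilities on the simplex, nudges mass toward the played action in proportion to its reward, and leaves the vector unchanged when the reward is zero; a reward that went negative (too large \gamma_i) could push probabilities outside [0, 1], which is consistent with the degradation observed at \gamma_i = 10^6.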
784 IEEE TRANSACTIONS ON MOBILE COMPUTING, VOL. 18, NO. 4, APRIL 2019
the NE in terms of both the system cost and the number of users who can benefit from cloud computing. To reach the NE, we proposed a fully distributed algorithm (i.e., the MASL-Algorithm) with a guaranteed convergence rate under dynamic environment. Simulation results have been presented to validate the effectiveness of our proposed algorithm and show its significant performance advantage. In our future work, we will study the joint optimization of dynamic offloading decision-making and transmit power control, which will be an important and technically challenging problem. Another interesting direction is to investigate mobile computation offloading from the perspective of economics and, specifically, to consider mobile users' economic expenses for offloading computation tasks to the mobile cloud.

ACKNOWLEDGMENTS

This work is supported by the Jiangsu Provincial Natural Science Foundation of China under Grant BK20170755, by the National Postdoctoral Program for Innovative Talents of China under Grant BX201700109, by the National Natural Science Foundations of China under Grants 61801505, 61572440, and 61771487, by the Zhejiang Provincial Natural Science Foundation of China under Grant LR17F010002, and by the Natural Science and Engineering Research Council (NSERC), Canada. This paper has been presented in part at the IEEE/CIC ICCC Conference [1], July 2016, Chengdu, China.

REFERENCES

[1] J. Zheng, Y. Cai, Y. Wu, and X. Shen, "Stochastic computation offloading game for mobile cloud computing," in Proc. IEEE/CIC Int. Conf. Commun. China, 2016, pp. 1-6.
[2] Y. Xu and S. Mao, "A survey of mobile cloud computing for rich media applications," IEEE Wireless Commun., vol. 20, no. 3, pp. 46-53, Jun. 2013.
[3] Y. Zhao, S. Zhou, T. Zhao, and Z. Niu, "Energy-efficient task offloading for multiuser mobile cloud computing," in Proc. IEEE/CIC Int. Conf. Commun. China, 2015, pp. 1-5.
[4] S. Sardellitti, G. Scutari, and S. Barbarossa, "Joint optimization of radio and computational resources for multicell mobile-edge computing," IEEE Trans. Signal Inf. Process. Over Netw., vol. 1, no. 2, pp. 89-103, Jun. 2015.
[5] Y. Cai, F. R. Yu, and S. Bu, "Cloud computing meets mobile wireless communications in next generation cellular networks," IEEE Netw., vol. 28, no. 6, pp. 54-59, Nov./Dec. 2014.
[6] Y. Wen, W. Zhang, and H. Luo, "Energy-optimal mobile application execution: Taming resource-poor mobile devices with cloud clones," in Proc. IEEE INFOCOM, 2012, pp. 2716-2720.
[7] M. V. Barbera, S. Kosta, A. Mei, and J. Stefa, "To offload or not to offload? The bandwidth and energy costs of mobile cloud computing," in Proc. IEEE INFOCOM, 2013, pp. 1285-1293.
[8] J. Zheng, Y. Wu, N. Zhang, H. Zhou, Y. Cai, and X. Shen, "Optimal power control in ultra-dense small cell networks: A game-theoretic approach," IEEE Trans. Wireless Commun., vol. 16, no. 7, pp. 4139-4150, Jul. 2017.
[9] D. Niyato, E. Hossain, and Z. Han, "Dynamics of multiple-seller and multiple-buyer spectrum trading in cognitive radio networks: A game-theoretic modeling approach," IEEE Trans. Mobile Comput., vol. 8, no. 8, pp. 1009-1022, Aug. 2009.
[10] X. Chen, "Decentralized computation offloading game for mobile cloud computing," IEEE Trans. Parallel Distrib. Syst., vol. 26, no. 4, pp. 974-983, Apr. 2015.
[11] X. Chen, L. Jiao, W. Li, and X. Fu, "Efficient multi-user computation offloading for mobile-edge cloud computing," IEEE/ACM Trans. Netw., vol. 24, no. 5, pp. 2795-2808, Oct. 2016.
[12] T. Lin, T. Alpcan, and K. Hinton, "A game-theoretic analysis of energy efficiency and performance for cloud computing in communication networks," IEEE Syst. J., vol. 11, no. 2, pp. 649-660, Jun. 2017.
[13] V. Cardellini, V. D. N. Persone, V. D. Valerio, F. Facchinei, V. Grassi, F. L. Presti, and V. Piccialli, "A game-theoretic approach to computation offloading in mobile cloud computing," Math. Program., vol. 157, no. 2, pp. 421-449, Jun. 2016.
[14] M. Chen, M. Dong, and B. Liang, "Multi-user mobile cloud offloading game with computing access point," in Proc. IEEE Int. Conf. Cloud Netw., 2016, pp. 64-69.
[15] Y. Wang, S. Meng, Y. Chen, R. Sun, X. Wang, and K. Sun, "Multi-leader multi-follower Stackelberg game based dynamic resource allocation for mobile cloud computing environment," Wireless Personal Commun., vol. 93, no. 2, pp. 461-480, Mar. 2017.
[16] Y. Liu, C. Xu, Y. Zhan, Z. Liu, J. Guan, and H. Zhang, "Incentive mechanism for computation offloading using edge computing: A Stackelberg game approach," Comput. Netw., vol. 129, no. 2, pp. 399-409, Dec. 2017.
[17] A. Rudenko, P. Reiher, G. J. Popek, and G. H. Kuenning, "Saving portable computer battery power through remote process execution," Mobile Comput. Commun. Rev., vol. 2, no. 1, pp. 19-26, 1998.
[18] C. Xian, Y. Lu, and Z. Li, "Adaptive computation offloading for energy conservation on battery-powered systems," in Proc. IEEE Int. Conf. Parallel Distrib. Syst., 2007, pp. 1-8.
[19] G. Huertacanepa and D. Lee, "An adaptable application offloading scheme based on application behavior," in Proc. 22nd Int. Conf. Adv. Inf. Netw. Appl.-Workshops, 2008, pp. 387-392.
[20] J. Kwak, Y. Kim, J. Lee, and S. Chong, "DREAM: Dynamic resource and task allocation for energy minimization in mobile cloud systems," IEEE J. Sel. Areas Commun., vol. 33, no. 12, pp. 2510-2523, Dec. 2015.
[21] D. Huang, P. Wang, and D. Niyato, "A dynamic offloading algorithm for mobile computing," IEEE Trans. Wireless Commun., vol. 11, no. 6, pp. 1991-1995, Jun. 2012.
[22] H. Wu, D. Huang, and S. Bouzefrane, "Making offloading decisions resistant to network unavailability for mobile cloud collaboration," in Proc. IEEE Int. Conf. Collaborative Comput.: Netw. Appl. Worksharing, 2013, pp. 168-177.
[23] L. Yang, J. Cao, S. Tang, T. Li, and A. T. S. Chan, "A framework for partitioning and execution of data stream applications in mobile cloud computing," in Proc. IEEE Int. Conf. Cloud Comput., 2012, pp. 794-802.
[24] D. Monderer and L. S. Shapley, "Potential games," Games Econ. Behavior, vol. 14, pp. 124-143, 1996.
[25] J. Marden, G. Arslan, and J. Shamma, "Joint strategy fictitious play with inertia for potential games," IEEE Trans. Automat. Control, vol. 54, no. 2, pp. 208-220, Feb. 2009.
[26] J. Zheng, Y. Cai, Y. Liu, Y. Xu, B. Duan, and X. Shen, "Optimal power allocation and user scheduling in multicell networks: Base station cooperation using a game-theoretic approach," IEEE Trans. Wireless Commun., vol. 13, no. 12, pp. 6928-6942, Dec. 2014.
[27] S. Hart and A. Mas-Colell, "A simple adaptive procedure leading to correlated equilibrium," Econometrica, vol. 68, no. 5, pp. 1127-1150, 2000.
[28] P. Sastry, V. Phansalkar, and M. Thathachar, "Decentralized learning of Nash equilibria in multi-person stochastic games with incomplete information," IEEE Trans. Syst. Man Cybern., vol. 24, no. 5, pp. 769-777, May 1994.
[29] J. Zheng, Y. Cai, N. Lu, Y. Xu, and X. Shen, "Stochastic game-theoretic spectrum access in distributed and dynamic environment," IEEE Trans. Veh. Technol., vol. 64, no. 10, pp. 4807-4820, Oct. 2015.
[30] Y. Xu, J. Wang, Q. Wu, A. Anpalagan, and Y. Yao, "Opportunistic spectrum access in unknown dynamic environment: A game-theoretic stochastic learning solution," IEEE Trans. Wireless Commun., vol. 11, no. 4, pp. 1380-1391, Apr. 2012.
[31] M. Bennis, S. M. Perlaza, P. Blasco, Z. Han, and H. V. Poor, "Self-organization in small cell networks: A reinforcement learning approach," IEEE Trans. Wireless Commun., vol. 12, no. 7, pp. 3202-3212, Jul. 2013.
[32] T. Innovations, "LTE in a nutshell," White Paper, 2010.
[33] T. S. Rappaport, Wireless Communications: Principles and Practice, 2nd ed. Englewood Cliffs, NJ, USA: Prentice Hall, 2001.
[34] B. Sklar, "Rayleigh fading channels in mobile digital communication systems Part I: Characterization," IEEE Commun. Mag., vol. 35, no. 7, pp. 90-100, Jul. 1997.
[35] E. Cuervo, A. Balasubramanian, D. Cho, A. Wolman, S. Saroiu, R. Chandra, and P. Bahl, "MAUI: Making smartphones last longer with code offload," in Proc. ACM 8th Int. Conf. Mobile Syst. Appl. Serv., 2010, pp. 49-62.
[36] B. Chun, S. Ihm, P. Maniatis, M. Naik, and A. Patti, "CloneCloud: Elastic execution between mobile device and cloud," in Proc. ACM 6th Conf. Comput. Syst., 2011, pp. 301-314.
[37] S. Dey, Y. Liu, S. Wang, and Y. Lu, "Addressing response time of cloud-based mobile applications," in Proc. 1st Int. Workshop Mobile Cloud Comput. Netw., 2013, pp. 3-10.
[38] T. Roughgarden, Selfish Routing and the Price of Anarchy. Cambridge, MA, USA: MIT Press, 2005.
[39] K. S. Narendra and M. A. L. Thathachar, Learning Automata: An Introduction. Englewood Cliffs, NJ, USA: Prentice-Hall, 1989.
[40] M. A. L. Thathachar and P. S. Sastry, Networks of Learning Automata: Techniques for Online Stochastic Optimization. Berlin, Germany: Springer, 2011.
[41] O. E. Ayach and R. W. Heath, "Interference alignment with analog channel state feedback," IEEE Trans. Wireless Commun., vol. 11, no. 2, pp. 626-636, Feb. 2012.

Yuan Wu (M'10-SM'16) received the PhD degree in electronic and computer engineering from the Hong Kong University of Science and Technology, Hong Kong, in 2010. He is currently an associate professor with the College of Information Engineering, Zhejiang University of Technology, Hangzhou, China. During 2010-2011, he was a postdoctoral research associate with the Hong Kong University of Science and Technology. During 2016-2017, he was a visiting scholar with the Broadband Communications Research (BBCR) Group, Department of Electrical and Computer Engineering, University of Waterloo, Canada. His research interests focus on radio resource allocations for wireless communications and networks, and smart grid. He is the recipient of the Best Paper Award at the IEEE International Conference on Communications (ICC) in 2016. He is a senior member of the IEEE.