Multi-Objective Workflow Scheduling in Cloud System Based On Cooperative Multi-Swarm Optimization Algorithm
Abstract: In order to improve the performance of multi-objective workflow scheduling in cloud systems, a multi-swarm multi-objective optimization algorithm (MSMOOA) is proposed to satisfy multiple conflicting objectives. Inspired by the way a species in nature divides into multiple swarms for different objectives while sharing information among these swarms, each physical machine in the data center is considered a swarm and employs an improved multi-objective particle swarm optimization to find non-dominated solutions with respect to one objective in MSMOOA. The particles in each swarm are divided into two classes that adopt different strategies to evolve cooperatively. One class of particles can communicate with several swarms simultaneously to promote information sharing among swarms, whereas the other class can only exchange information with particles located in the same swarm. Furthermore, in order to avoid the influence of the elastically available resources, a manager server is adopted in the cloud data center to collect the available resources for scheduling. The proposed method is evaluated against other related approaches using hybrid and parallel workflow applications. The experimental results show that the MSMOOA performs better than the compared algorithms.
Key words: multi-objective workflow scheduling; multi-swarm optimization; particle swarm optimization (PSO); cloud computing
system
Foundation item: Project(61473078) supported by the National Natural Science Foundation of China; Project(2015−2019) supported by the Program for
Changjiang Scholars from the Ministry of Education, China; Project(16510711100) supported by International Collaborative Project of
the Shanghai Committee of Science and Technology, China; Project(KJ2017A418) supported by Anhui University Science Research,
China
Received date: 2015−06−17; Accepted date: 2016−09−30
Corresponding author: DING Yong-sheng, Professor, PhD; Tel: +86−21−67792323; E-mail: ysding@dhu.edu.cn
J. Cent. South Univ. (2017) 24: 1050−1062 1051
[...] different between customers and cloud service providers. Customers are usually interested in minimizing the makespan and cost of their applications, whereas cloud service providers are often interested in maximizing resource utilization, minimizing energy consumption or improving user fairness. In these circumstances, the scheduling must be formulated as a multi-objective optimization problem (MOOP) that aims at optimizing multiple, possibly conflicting criteria, where it is impossible to find a solution that is globally optimal with respect to all objectives. Moreover, the cloud data center offers its services to customers in the form of virtual machines (VMs) through virtualization technology, and the running VMs can scale up and down dynamically according to the workloads in the system. So, the scheduling strategy should be able to check the available computational resources as quickly as possible after a change has happened.

Recently, some related works [15−17] have proposed methods for multi-objective workflow scheduling in cloud or grid systems. In Ref. [15], the problem was simplified to a single-objective problem by aggregating all the objectives in one analytical function. The main drawback of these approaches is that the computed solution depends on the selected weights, which are usually decided a priori, without any knowledge about the workflow, the infrastructure, or in general the problem being solved. Therefore, the computed solution may not be satisfactory for the solved problem if the weights do not capture the user preferences accurately. Other approaches are based on sorting the different objectives in a sequential fashion [16]. Once an objective has been optimized and no further improvement is possible, the next objective is considered. The optimization of this new objective is carried out so that none of the constraints imposed on the previous criteria are violated. ZHAN et al [17] took this kind of approach to optimize makespan and economic cost. However, for the above approaches, the number of objectives is limited and the order in which the objectives are optimized requires some sort of preferential information, which may be difficult to derive.

Recently, some Pareto-based approaches have been used for multi-objective task scheduling. TAO et al [18] proposed a case library and Pareto solution based hybrid genetic algorithm to find Pareto solutions for makespan and energy consumption optimization. DURILLO et al [19] designed a Pareto-based heuristic list scheduling that provided the customers with a set of tradeoff solutions for makespan and energy consumption. They also proposed a similar multi-objective workflow scheduling method for makespan and cost [20]. However, all of the above works have focused on two objectives. As for three objectives (makespan, cost and energy consumption), FARD et al [21] and YASSA et al [22] presented a heuristic list scheduling and a hybrid particle swarm optimization (PSO) algorithm for these objectives, respectively. But none of the above methods has been integrated with the structure of the cloud data center, which is composed of multiple physical machines (PMs) and where information can be shared among these PMs through the Intranet. The above methods also did not consider the dynamic change of computational resources in the context of cloud computing.

In this work, we also take three objectives (makespan, cost and energy consumption) into consideration, and design a multi-swarm multi-objective optimization algorithm (MSMOOA) for workflow scheduling in cloud computing. In order to obtain the available computational resources for the scheduling, a data center model is designed first. In this model, a manager server is adopted to collect the information of available computational resources after accepting the workflows submitted by customers, which effectively avoids the influence of the elastic resources on the scheduling results. Then, the MSMOOA is executed. Different from previous algorithms [18−22], the MSMOOA takes advantage of the structure of the cloud data center to search for non-dominated scheduling solutions. In the MSMOOA, each available PM is considered a swarm and employs the improved multi-objective particle swarm optimization algorithm (MOPSOA) to find non-dominated solutions with respect to one objective. Through the Intranet connection among PMs, some particles in one swarm can get information from other swarms, and the velocity update of these particles is also influenced by the states of other swarms, which promotes information sharing and cooperation among swarms. Some new update strategies are designed to improve the particles' search capability. We compare the MSMOOA with another multi-objective scheduling algorithm in cloud systems and further analyze the quality of the solutions computed by these algorithms. Simulation results on well-known hybrid and parallel workflow applications highlight the performance of the proposed approach.

The main contributions of this work are as follows:
1) The MSMOOA is proposed by introducing a new multi-swarm cooperative mechanism and modifying the update of the particles' velocity. The update of a particle's velocity at each iteration is affected not only by the personal and global best but also by the optimal swarm best.
2) The proposed MSMOOA is used for multi-objective workflow scheduling in cloud systems, and it is the first workflow scheduling algorithm that takes the structural characteristics of the cloud data center into consideration.
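The velocity update named in the first contribution, which is pulled by the personal best, the global best and a swarm best, can be illustrated with a minimal sketch. It assumes the textbook PSO update extended by a third attraction term; the function name `update_velocity`, the parameters `w`, `c1`, `c2`, `c3` and the additive form are assumptions made for illustration, not the update rule defined by the authors.

```python
import random


def update_velocity(position, velocity, personal_best, global_best, swarm_best,
                    w=0.7, c1=1.5, c2=1.5, c3=1.5, rng=None):
    """Return a new velocity vector pulled towards three attractors.

    Hypothetical form (an assumption, not the paper's equation):
        v' = w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x) + c3*r3*(sbest - x)
    where r1, r2, r3 are uniform random numbers in [0, 1).
    """
    rng = rng or random.Random()
    new_velocity = []
    for x, v, pb, gb, sb in zip(position, velocity,
                                personal_best, global_best, swarm_best):
        r1, r2, r3 = rng.random(), rng.random(), rng.random()
        new_velocity.append(
            w * v                    # inertia: keep part of the old velocity
            + c1 * r1 * (pb - x)     # pull towards the particle's own best
            + c2 * r2 * (gb - x)     # pull towards the global best
            + c3 * r3 * (sb - x)     # pull towards the best of a cooperating swarm
        )
    return new_velocity


# A particle that already sits on all three attractors only keeps its inertia.
print(update_velocity([1.0, 2.0], [0.5, -0.5],
                      [1.0, 2.0], [1.0, 2.0], [1.0, 2.0], w=0.5))
```

In this sketch a particle that cannot see other swarms would simply use `c3=0`, which reduces the rule to the standard two-attractor PSO update.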
2 Problem modeling

[...] where $x \in X$ is an m-dimensional decision vector; $X$ is the search space; $y \in Y$ is the objective vector and $Y$ is the objective space.

Because there are multiple objectives involved in the MOOP, there is no single solution that is optimal with regard to all objectives. Instead, solutions that offer a good trade-off among all objectives should be found, for which Pareto optimality is usually adopted. Some Pareto concepts [24] are given as follows (without loss of generality, supposing that the objectives are to be minimized).

Definition 1: Pareto dominance. The vector $x_1$ dominates the vector $x_2$ (denoted by $x_1 \prec x_2$), if and only if the following statement holds:

$$\forall i \in \{1, 2, \ldots, n\},\ f_i(x_1) \le f_i(x_2) \ \wedge\ \exists i \in \{1, 2, \ldots, n\},\ f_i(x_1) < f_i(x_2) \qquad (2)$$

[...] to the solutions in PS.

$$P_f = \{F(x) \mid x \in PS\} \qquad (5)$$

[...] which has different performance and prices as shown in Fig. 1. The number of running VMs and PMs can scale up and down dynamically according to the workloads in the system. If the running VMs and PMs change, the corresponding information is sent to the manager server immediately. When the manager server accepts a workflow from customers, it first checks the information of available VMs and PMs. Based on this information, the multi-objective scheduling, which will be described in the next section, is executed without being influenced by changes in the available computational resources.

Pre-emption is not allowed in our model, which means that each task must be completed without interruption once started. It is also assumed that each VM cannot perform more than one task at a time.
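The Pareto dominance relation of Definition 1 can be checked directly in code. The following is a minimal sketch for minimization; the function names `dominates` and `non_dominated`, and the three example schedules given as (makespan, cost, energy) tuples, are hypothetical and chosen only for illustration.

```python
def dominates(f1, f2):
    """Return True if objective vector f1 Pareto-dominates f2 (minimization).

    f1 dominates f2 when it is no worse in every objective and strictly
    better in at least one objective.
    """
    no_worse = all(a <= b for a, b in zip(f1, f2))
    strictly_better = any(a < b for a, b in zip(f1, f2))
    return no_worse and strictly_better


def non_dominated(points):
    """Keep the objective vectors that no other vector in `points` dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical schedules as (makespan, cost, energy).
a = (10.0, 5.0, 3.0)
b = (12.0, 4.0, 3.5)
c = (11.0, 6.0, 3.2)

print(dominates(a, c))           # a is no worse than c everywhere and better somewhere
print(non_dominated([a, b, c]))  # a and b trade makespan against cost
```

The quadratic scan in `non_dominated` is the simplest possible filter; it is adequate for small solution sets, and faster sorting-based filters exist for larger ones.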
Fig. 5 Comparative analysis of Montage workflow: (a) Makespan; (b) Cost; (c) Energy; (d) Hypervolume
[...] MSMOOA for the three other workflows. Although MOHEFT does not perform worse than MSMOOA in makespan, MOHEFT is initialized with the solution computed by HEFT, whereas MSMOOA is initialized with stochastic solutions. So, the MSMOOA has the better search ability. As for cost and energy consumption, both MOHEFT and MSMOOA improve significantly, and the average gain of MSMOOA is better than that of MOHEFT in most cases. This is because HEFT is a single-objective scheduling algorithm that does not consider other objectives, such as cost and energy consumption, and because MSMOOA designs the multi-objective workflow scheduling by integrating the structural characteristics of the cloud data center, which MOHEFT leaves out of consideration.

A performance analysis of one selected solution from MOHEFT and MSMOOA is given in Fig. 5 for the Montage workflow. The selected solution is the one closest to the result of HEFT in the sense of Euclidean distance, and the gain in makespan, cost and energy consumption over HEFT achieved by the selected solutions of MOHEFT and MSMOOA is displayed in Figs. 5(a)−(c). MOHEFT and MSMOOA obtain a makespan similar to that of HEFT, as shown in Fig. 5(a). As for cost and energy consumption, MSMOOA and MOHEFT perform better than HEFT, as shown in Figs. 5(b) and (c), and the improvement becomes more and more obvious as the number of workflow tasks increases. In the case of Montage_25 (the number of tasks is 25), the gain over HEFT in cost is 13.30% and 15.21% for MOHEFT and MSMOOA, respectively, and the gain over HEFT in energy consumption is 11.10% and 13.06%. When the number of tasks is 1000 (Montage_1000), the gain in cost is 17.40% and 21.20%, respectively, and the gain in energy consumption is 14.10% and 19.77%. We can also see that the performance of MSMOOA is better than that of MOHEFT, as illustrated in Fig. 5(d). The explanation for this behavior is that multiple swarms are designed to find the solutions collaboratively in MSMOOA, and each swarm corresponds to one of the objectives of the workflow scheduling. At the same time, two classes of particles are designed to promote information sharing among these swarms, and different strategies for updating LEA and GEA are adopted. So, the selected solution of MSMOOA is better than that of MOHEFT.

Furthermore, we also compare the running time used by MOHEFT, MSMOOA and MOPSO [32] for each Montage workflow, and the results are presented in Fig. 6. The compared time of each algorithm is again the average over ten runs of the corresponding algorithm. The running times of all algorithms increase with the number of tasks, as shown in Fig. 6. Fig. 6 also shows that the running times of MSMOOA and MOPSO are longer than that of MOHEFT. The reason for this phenomenon is that both MSMOOA and MOPSO are meta-heuristic methods, whereas MOHEFT is a list scheduling algorithm based on HEFT. Considering the previously compared results, it is worthwhile to spend this additional time to obtain better scheduling results, especially in cost and energy consumption. Fig. 6 also indicates that the running-time results of MSMOOA are better than those of MOPSO. This is due to the fact that MSMOOA needs fewer iterations than MOPSO, although each iteration of MSMOOA takes more time than an iteration of MOPSO.

Fig. 6 Average running time of algorithms about Montage workflow

As for the epigenomics workflow, the results summarized in Fig. 7, Fig. 8 and Table 2 are similar to those of the previous experiment and confirm the findings for the Montage workflow in terms of the compared metrics. The gain over HEFT in makespan, cost and energy consumption is shown in Figs. 7(a), (b) and (c), respectively. The hypervolume of MSMOOA and MOHEFT is indicated in Fig. 7(d). The comparison of running time is presented in Fig. 8.

5 Conclusions

1) A manager server is adopted to avoid the influence of the elastic characteristic of the cloud system on the scheduling results and to collect the information of available computational resources when the system accepts a workflow submitted by customers.
2) The proposed MSMOOA can effectively find better non-dominated solutions, which has been demonstrated by experiments.
3) Compared with HEFT and MOHEFT by simulating them with both hybrid and parallel workflow applications having different structures, the MSMOOA [...]
Fig. 7 Comparative analysis of epigenomics workflow: (a) Makespan; (b) Cost; (c) Energy; (d) Hypervolume
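The comparison in Figs. 5 and 7 rests on two steps: picking, from each algorithm's solution set, the solution closest to the HEFT result in Euclidean distance, and reporting its percentage gain over HEFT. A minimal sketch of both steps follows. The gain formula (relative reduction with respect to HEFT), the function names and all numbers are assumptions for illustration; the sketch also measures distance on raw objective values, whereas objectives on very different scales would normally be normalized first.

```python
import math


def closest_to_reference(front, reference):
    """Return the solution in `front` with the smallest Euclidean distance
    to `reference` in objective space."""
    return min(front, key=lambda s: math.dist(s, reference))


def gain_over_reference(solution, reference):
    """Percentage reduction of each objective relative to the reference.

    Assumed definition: gain = 100 * (reference - solution) / reference,
    so a positive value means the solution improves on the reference.
    """
    return [100.0 * (r - s) / r for s, r in zip(solution, reference)]


# Hypothetical objective vectors as (makespan, cost, energy).
heft = (100.0, 50.0, 200.0)
front = [(100.0, 45.0, 180.0), (90.0, 70.0, 260.0)]

selected = closest_to_reference(front, heft)
print(selected)
print(gain_over_reference(selected, heft))
```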
Cite this article as: YAO Guang-shun, DING Yong-sheng, HAO Kuang-rong. Multi-objective workflow scheduling in
cloud system based on cooperative multi-swarm optimization algorithm [J]. Journal of Central South University, 2017,
24(5): 1050−1062. DOI: 10.1007/s11771-017-3508-7.