
Chapter- 4

CLOUD SCHEDULING

4.1 Introduction
Scheduling is the mapping, or assignment, of tasks to resources subject to some constraint or
objective function. It should make the best use of the available resources, so that the objective
function is maximized or minimized while the desired quality of service is achieved. Scheduling
decides which task will run on which resource, and when [39]. Because no known algorithm
produces an optimal schedule in polynomial time, the aim is to find solutions that schedule
tasks quickly even if they are not the best possible.

4.2 Scheduling architecture


• Consumers submit tasks to the Data Center Broker.
• The broker is responsible for scheduling these tasks on virtual machines.
• A data center consists of a number of hosts, each of which runs a number of virtual
machines. Tasks are assigned to these virtual machines according to the scheduling policy
chosen by the Data Center Broker.
• The Data Center Broker communicates with the cloud controller and schedules the submitted
tasks.
Fig. 4.1: Cloud scheduling architecture [40]
4.3 Classification of scheduling:
1. Independent scheduling: Sequencing of the tasks does not matter as the tasks to be
scheduled are independent of each other [41].
2. Dependent scheduling: There is a hierarchical relationship amongst the tasks and a task
cannot be executed before its parent task has been executed successfully. Scheduling of
such tasks comes under the category of workflow scheduling [42].
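
As a small illustration of dependent scheduling, the sketch below derives one valid execution order for a workflow in which a task may only run after its parents. The task names and the `workflow` dictionary are hypothetical; the ordering uses Python's standard `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each task maps to the set of parent tasks it depends on.
workflow = {
    "t1": set(),
    "t2": {"t1"},
    "t3": {"t1"},
    "t4": {"t2", "t3"},
}

def valid_execution_order(dependencies):
    """Return one order in which every task appears after all of its parents."""
    return list(TopologicalSorter(dependencies).static_order())

order = valid_execution_order(workflow)
print(order)  # t1 comes first and t4 last; t2 and t3 may appear in either order
```

Independent tasks need no such ordering step, since any sequence is valid for them.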

4.4 Optimization metrics

A cloud platform delivers services from a service provider to a consumer over the internet, so
every deployment involves both a customer and a service provider. Customers submit their tasks
to the service provider and pay for them on a pay-per-use basis. The customer uses the cloud to
save money by not having to build infrastructure locally, whereas the service provider rents
out its resources and makes a profit through efficient resource provisioning. Objective
functions express how each party can reach its goals, by minimizing or maximizing the relevant
function [43].

The optimization metrics can be classified into two types:

1. Consumer-Desired: These criteria are based on the objectives or goals preferred by the
consumer [44]:

Makespan: The most important optimization criterion. It is the finishing time of the last task
to complete on the data center, and it is a minimization objective [43].

$\text{Makespan} = \max_{i \in \text{tasks}} F_i$, where $F_i$ denotes the finishing time of task $i$ [43].

Economic cost: The total amount a customer must pay the service provider for the resources it
uses on the cloud over a specified period of time [43].

$\text{Economic Cost} = \sum_{i \in \text{resources}} C_i \times T_i$ [43]

where $C_i$ denotes the cost of resource $i$ per unit time and $T_i$ denotes the time for which
resource $i$ is utilized.
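
The makespan and economic cost formulas can be sketched in a few lines. The task names, resource names, and numeric values below are made up purely for illustration.

```python
def makespan(finish_times):
    """Finishing time of the last task: max over all F_i."""
    return max(finish_times.values())

def economic_cost(cost_per_unit, usage_time):
    """Sum of C_i * T_i over all resources."""
    return sum(cost_per_unit[r] * usage_time[r] for r in cost_per_unit)

# Hypothetical example data.
finish_times = {"t1": 4.0, "t2": 9.0, "t3": 6.0}
cost_per_unit = {"vm1": 0.5, "vm2": 2.0}
usage_time = {"vm1": 9.0, "vm2": 6.0}

print(makespan(finish_times))                    # 9.0
print(economic_cost(cost_per_unit, usage_time))  # 16.5
```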
Flowtime: In contrast to makespan, flowtime is the sum of the finishing times of all tasks.
Shortest-job-first ordering is expected to give the best flowtime [43].

$\text{Flowtime} = \sum_{i \in \text{tasks}} F_i$, where $F_i$ denotes the finishing time of task $i$ [43].
Flowtime is directly related to response time, so minimizing it also reduces the response
time of the schedule.
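
To see why shortest job first helps flowtime, the sketch below compares two orderings of the same tasks. It assumes, for simplicity, a single resource that runs tasks back to back starting at time 0; the task lengths are hypothetical.

```python
from itertools import accumulate

def flowtime(execution_times):
    """Sum of finishing times when tasks run back to back in the given order."""
    return sum(accumulate(execution_times))

lengths = [8, 3, 5]                  # hypothetical execution times
submitted_order = flowtime(lengths)  # finish times 8, 11, 16
sjf_order = flowtime(sorted(lengths))  # finish times 3, 8, 16

print(submitted_order)  # 35
print(sjf_order)        # 27
```

The makespan is 16 in both cases; only the flowtime changes with the ordering.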

Tardiness: The delay between a task's deadline and the time at which it actually finished.
The value should be as close to 0 as possible.

$\text{Tardiness}_i = F_i - D_i$ [43]

where $F_i$ and $D_i$ are the finishing time and the deadline of task $i$ respectively.

Waiting time: The time a task has to wait before it starts execution. In short, it is the
difference between the start time and the submission time.

$\text{Waiting Time}_i = S_i - B_i$ [43]

where $S_i$ and $B_i$ are the start time and submission time of task $i$ respectively.

Turnaround time: The total time a task takes to complete, measured from submission to finish.
In short, it is the difference between the finish time and the submission time of the task.

$\text{Turnaround Time}_i = W_i + E_i$ [43]

where $W_i$ and $E_i$ are the waiting time and execution time of task $i$ respectively.
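
The per-task metrics above (tardiness, waiting time, turnaround time) all follow from four timestamps. The sketch below is one way to compute them; the `TaskTimes` class and its field names are hypothetical, chosen to mirror the symbols in the formulas.

```python
from dataclasses import dataclass

@dataclass
class TaskTimes:
    submission: float  # B_i
    start: float       # S_i
    finish: float      # F_i
    deadline: float    # D_i

    @property
    def waiting(self):
        return self.start - self.submission   # W_i = S_i - B_i

    @property
    def execution(self):
        return self.finish - self.start       # E_i

    @property
    def turnaround(self):
        return self.waiting + self.execution  # W_i + E_i

    @property
    def tardiness(self):
        return self.finish - self.deadline    # F_i - D_i

task = TaskTimes(submission=2.0, start=5.0, finish=12.0, deadline=10.0)
print(task.waiting, task.turnaround, task.tardiness)  # 3.0 10.0 2.0
```

Note that the turnaround time computed as waiting plus execution equals finish time minus submission time, as the definition states.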

Fairness: Each task is given an equal share of resource time rather than a share based on some
other criterion. Lack of fairness can result in starvation of some tasks.

2. Provider-Desired: The following two criteria are based on the objectives set by the provider
to meet its goals:
Resource utilization: Resources are costly and must be used optimally. This criterion measures
the percentage of time for which a resource was in use. The value should be as high as
possible, with a maximum of 100.

$\text{Average Resource Utilization} = \dfrac{\sum_{i=1}^{n} \text{Time taken by resource } i \text{ to finish all jobs}}{\text{Makespan} \times n}$ [43]

where $n$ is the number of resources.


Throughput: The number of tasks completed per unit of time.
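
The two provider-side metrics can be sketched as follows. The busy times are hypothetical, and the utilization is multiplied by 100 here so that it reads as a percentage, which is an assumption made to match the "maximum of 100" wording rather than part of the formula itself.

```python
def average_utilization(busy_times, makespan):
    """Average percentage of the makespan during which the resources were busy."""
    n = len(busy_times)
    return 100.0 * sum(busy_times) / (makespan * n)

def throughput(completed_tasks, elapsed_time):
    """Number of tasks completed per unit of time."""
    return completed_tasks / elapsed_time

busy = [10.0, 6.0, 8.0]  # hypothetical time each resource spent on its jobs
utilization = average_utilization(busy, makespan=max(busy))
rate = throughput(completed_tasks=12, elapsed_time=4.0)

print(utilization)  # 80.0
print(rate)         # 3.0
```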
4.5 Constraints
Optimization is never free of constraints. Constraints specify the limits that an objective
function or solution must adhere to. An objective function can also be based on multiple QoS
parameters, in which case it is called a multi-objective function. Constraints and objective
functions are linked to each other. Some of the scheduling constraints are listed below:

Priority constraint: The importance of a task, which can be defined in terms of the time at
which the task was submitted, the execution deadline set for it, or the amount paid for it.
The priority can also be set on the basis of multiple parameters. It should be computable, so
that a numeric value can be assigned to it [43].

Dependency constraint: It applies when the tasks are inter-related and there is a hierarchical
relationship between them. The tasks at upper levels must execute successfully before the
tasks at lower levels can be executed. A dependency graph is used to show the order in which
the tasks can be executed [43].

Deadline constraint: The time by which a single task or a set of tasks must finish
execution [43].

Budget constraint: A task, or the set of all tasks, must finish within a fixed budget [43].
This is useful when the number of resources is limited and the provider wants to earn the
maximum profit from them: tasks that pay less can be pushed back so that tasks paying more
are executed first.
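
A rough sketch of how such constraints might be checked against a candidate schedule is given below. The task records, the choice of payment as the numeric priority, and the helper names are all assumptions made for illustration.

```python
# Hypothetical task records: payment acts as the numeric priority.
tasks = [
    {"id": "t1", "payment": 5.0, "finish": 8.0, "deadline": 10.0},
    {"id": "t2", "payment": 9.0, "finish": 12.0, "deadline": 11.0},
    {"id": "t3", "payment": 2.0, "finish": 4.0, "deadline": 6.0},
]

def by_priority(task_list):
    """Order tasks so that higher-paying tasks come first."""
    return sorted(task_list, key=lambda t: t["payment"], reverse=True)

def missed_deadlines(task_list):
    """Return the ids of tasks that finish after their deadline."""
    return [t["id"] for t in task_list if t["finish"] > t["deadline"]]

def within_budget(costs, budget):
    """Check that the total cost of the schedule does not exceed the budget."""
    return sum(costs) <= budget

print([t["id"] for t in by_priority(tasks)])  # ['t2', 't1', 't3']
print(missed_deadlines(tasks))                # ['t2']
print(within_budget([3.0, 4.0], budget=10.0))  # True
```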
4.6 Ant Colony Optimization for Cloud Scheduling
Ant Colony Optimization (ACO) is a metaheuristic algorithm based on how real ants search for
food. Ants travel from their colony to a food source, and by following the paths taken by
earlier ants they are able to find the shortest path between the two. Ants deposit pheromone
as they walk. Initially, ants select paths at random. The pheromone gradually evaporates, but
a shorter path is traversed more often in the same time, so its pheromone is reinforced faster
than it evaporates. After some time the shortest path is therefore the one with the highest
pheromone intensity, which leads the other ants to follow it. Eventually all ants choose that
path, which happens to be the shortest one [45].
Table 4.1: Notation used to illustrate the working of ants [46]:

Symbol Representation

Ant representation (1 represent serial number of ant)

Pheromone representation

Food source (act as destination)


Fig. 4.2: Illustration of how ants work initially [46]:

In Fig. 4.2, (a) shows two paths that the ants can take to reach the food source. Initially
there is no pheromone on either path, so the ants choose a path at random, as shown in (b).
Ants deposit pheromone as they move along each path, so the pheromone value is higher on the
path travelled by more ants. All three ants then leave with a fraction of the food
(represented by OD in (c)). Once the ants have returned to the initial state, they choose the
path with the higher pheromone density, as can be seen in Fig. 4.3, (a) and (b).

Fig. 4.3: Illustration of how remaining ants select the path.

In the end, the ants move away with the leftover food and further increase the pheromone
density on the path they have chosen.
This natural behaviour can be used to solve the problem of scheduling tasks in the cloud; all
that is required is to map the computing problem onto Ant Colony Optimization. A number of
ant-based solutions have already been proposed. The procedure starts by selecting a number of
ants less than or equal to the number of tasks. To begin with, each ant places a random task
Ti on a resource Rj. Next, the remaining tasks and the resources on which they are executed
are selected using a probability function like the one below [43]:

Each ant builds a full solution step by step by assigning tasks to the available resources.
The initial pheromone value can be any positive constant, and it is then updated by the ants
at the end of each iteration. At the end of each iteration, the best solution found by any ant
is compared with the previous best; if it is better, it becomes the new best solution.
ACO algorithm (implementation of ACO using ETC) for task scheduling[49]
“Step1: Collect information about the tasks (n) and virtual machines (m)
Step 2: Initialize Expected Time To Compute (ETC) values
Step 3: Initialize the following parameters
Step 3.1: Set α=1,β=1,Q=100 and pheromone evaporation rate (ρ ) =0.5, number of ants=100
Step 3.2: Set optimal_solution=null and epoch=0
Step 3.3: Initialize pheromone trial value τij= c
Step 4: Repeat until each ant k in the colony finds VM for running all tasks
Step 4.1 Put the starting VM in tabu_k
Step 4.2 Calculate the probability for selecting VM_j ∈ allowed_k for a task_i
$P_{ij} = \dfrac{[\tau_{ij}(t)]^{\alpha} \cdot [\eta_{ij}(t)]^{\beta}}{\sum_{x \in allowed_k} [\tau_{ix}(t)]^{\alpha} \cdot [\eta_{ix}(t)]^{\beta}}$
Where tabu_k contains a list of VMs that has been already used by ant k for assigning tasks,
and allowed_k maintains a list of VMs that are available for task execution.
$\tau_{ij}(t)$ is the pheromone trial value on the edge (i,j) at time t. $\eta_{ij}(t)$ which affects the visibility
of pheromone trial is calculated as follows.
$\eta_{ij}(t) = 1 / d_{ij}$ where
$d_{ij} = \text{Tasklength}_j / (\text{pe\_num}_j \times \text{pe\_mips}_j) + \text{Tasklength}_i$
Here, Tasklength_j is the length of all tasks assigned to the VM_j and Tasklength_i is the length of
task_i.
Step 4.3 Select a VM_j with highest probability
Step 5: Calculate the makespan of schedule built by each ant and find the best schedule based
on the makespan
$\text{Makespan} = \max_{j \in J} L_j$ where J is the set of Virtual Machines and $L_j = \sum_{k \in J} ETC[k][j]$
Step 6: Update pheromone trial value
Step 6.1: Compute the quantity of pheromone deposited on the edge (i,j) as below
$\delta\tau_{ij}^{k}(t) = Q / (\text{makespan of the Schedule prepared by ant } k)$ where Q is the adaptive parameter
Step 6.2: Refresh pheromone trial value
$\tau_{ij}(t) = \tau_{ij}(t) + \delta\tau_{ij}^{k}(t)$
Step 6.3: Perform a global pheromone update on the edges belongs to the best schedule
$\tau_{ij}(t) = \tau_{ij}(t) + Q / (\text{makespan of the best schedule in the colony})$
Step 7 : If the maximum epoch is reached, then Optimal_solution= schedule with optimal
makespan
Display the schedule, optimalmakespan
else Go to step 4.”
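
The quoted steps can be sketched roughly as below. This is not the implementation from [49] but a simplified illustration under several assumptions: `etc[i][j]` is the expected time to compute task i on VM j, the visibility uses the finish time of the task on the VM instead of the d_ij formula, VMs are chosen by roulette-wheel sampling rather than always taking the highest probability, no tabu list is kept, and evaporation is applied explicitly as a factor of (1 - ρ). All function and parameter names are hypothetical.

```python
import random

def makespan_of(etc, assignment):
    """Makespan of a schedule: the largest total load on any VM."""
    load = [0.0] * len(etc[0])
    for i, j in enumerate(assignment):
        load[j] += etc[i][j]
    return max(load)

def aco_schedule(etc, n_ants=20, epochs=30, alpha=1.0, beta=1.0,
                 rho=0.5, q=100.0, seed=0):
    rng = random.Random(seed)
    n_tasks, n_vms = len(etc), len(etc[0])
    tau = [[1.0] * n_vms for _ in range(n_tasks)]  # constant initial pheromone
    best, best_ms = None, float("inf")
    for _ in range(epochs):
        solutions = []
        for _ in range(n_ants):
            load = [0.0] * n_vms
            assignment = []
            for i in range(n_tasks):
                # Weight = pheromone^alpha * visibility^beta, where visibility
                # is the inverse of the task's finish time on that VM.
                weights = [tau[i][j] ** alpha
                           * (1.0 / (load[j] + etc[i][j])) ** beta
                           for j in range(n_vms)]
                j = rng.choices(range(n_vms), weights=weights)[0]
                assignment.append(j)
                load[j] += etc[i][j]
            solutions.append((max(load), assignment))
        # Evaporate, then let every ant deposit Q / makespan on its edges.
        for i in range(n_tasks):
            for j in range(n_vms):
                tau[i][j] *= (1.0 - rho)
        for ms, assignment in solutions:
            for i, j in enumerate(assignment):
                tau[i][j] += q / ms
        ms, assignment = min(solutions)
        if ms < best_ms:
            best_ms, best = ms, assignment
        # Global update on the edges of the best schedule found so far.
        for i, j in enumerate(best):
            tau[i][j] += q / best_ms
    return best, best_ms

# Hypothetical ETC matrix: 4 tasks, 2 VMs.
etc = [[4.0, 8.0], [6.0, 3.0], [5.0, 5.0], [2.0, 9.0]]
assignment, ms = aco_schedule(etc)
print(assignment, ms)
```

Because the search is randomized, the result is a good schedule rather than a guaranteed optimum; fixing `seed` only makes a run repeatable.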
4.7 Particle Swarm Optimization for Cloud Scheduling

Particle Swarm Optimization (PSO) also belongs to the class of metaheuristic algorithms.
Like ACO, PSO is a problem-independent optimization technique that can be applied to a wide
range of large, complex problems that are not solvable in polynomial time [37].
PSO is based on the behaviour of animals that form a group and seek the best position within
it, and is inspired by flocks of birds and schools of fish. A single fish or bird in the
swarm is called a particle, and each particle moves in a certain direction at a certain
speed. The next position of a particle is therefore determined by the direction and speed of
its movement. PSO must be mapped onto the computing problem before it can be applied to it.
It is best suited to problems that are continuous in nature.
Because PSO is designed for problems with continuous values, the first step in using it for
cloud task scheduling is to encode the problem so that it can deal with discrete values. A
particle in the solution space can be represented by a vector of size 1 * n, where n is the
number of tasks to be scheduled and each element of the vector holds the index of the
resource assigned to the corresponding task [50-57]. In other words, this vector represents
the mapping of all tasks onto the available resources. Instead of a one-dimensional vector,
a matrix can also be used as the encoding scheme. The matrix has size m * n, where m is the
number of available resources and n is the number of tasks to be scheduled. Each cell holds
0 or 1, indicating whether the task in that column has been assigned to the resource in that
row. Velocity, which captures both speed and direction, can also be represented as a matrix.

Fig. 4.4: Representation of position vector
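
The two encodings described above carry the same information, as the following sketch shows. It assumes 0-based resource indices, with rows of the matrix indexing resources and columns indexing tasks; the example position vector is hypothetical.

```python
def vector_to_matrix(position, n_resources):
    """Convert a 1 x n position vector into an m x n 0/1 assignment matrix."""
    n_tasks = len(position)
    matrix = [[0] * n_tasks for _ in range(n_resources)]
    for task, resource in enumerate(position):
        matrix[resource][task] = 1
    return matrix

# Hypothetical particle: task 0 -> resource 0, task 1 -> resource 2, and so on.
position = [0, 2, 1, 0]
matrix = vector_to_matrix(position, n_resources=3)
for row in matrix:
    print(row)
# [1, 0, 0, 1]
# [0, 0, 1, 0]
# [0, 1, 0, 0]
```

Every column contains exactly one 1, because each task is assigned to exactly one resource.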


Procedure of PSO

PSO Task Scheduling Algorithm [43]


1. “Set particle dimension as equal to the size of ready tasks T.
2. Initialize particles position randomly from PC = 1.....j and velocity vi randomly.
3. For each particle, calculate its fitness value.
4. If the fitness value is better than the previous best pbest, set the current fitness value as
the new pbest.
5. Perform Steps 3 and 4 for all particles and select the best particle as gbest.
6. For all particles, calculate velocity and update their positions.
7. If the stopping criteria or maximum iteration is not satisfied, repeat Step 3 & 4.”
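
A minimal sketch of these steps is shown below, under stated assumptions: positions are kept continuous and rounded to the nearest VM index when a schedule is evaluated, fitness is the makespan computed from a hypothetical `etc[i][j]` matrix (expected time of task i on VM j), and the inertia weight `w` and acceleration coefficients `c1`, `c2` are conventional PSO parameters that the quoted algorithm does not specify.

```python
import random

def pso_schedule(etc, n_particles=20, iterations=50,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    n_tasks, n_vms = len(etc), len(etc[0])

    def decode(pos):
        # Round each continuous coordinate to a valid VM index.
        return [min(n_vms - 1, max(0, int(round(x)))) for x in pos]

    def fitness(pos):
        # Makespan of the schedule encoded by this position (lower is better).
        load = [0.0] * n_vms
        for i, j in enumerate(decode(pos)):
            load[j] += etc[i][j]
        return max(load)

    pos = [[rng.uniform(0, n_vms - 1) for _ in range(n_tasks)]
           for _ in range(n_particles)]
    vel = [[rng.uniform(-1, 1) for _ in range(n_tasks)]
           for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda k: pbest_fit[k])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(iterations):
        for k in range(n_particles):
            for d in range(n_tasks):
                r1, r2 = rng.random(), rng.random()
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                # Keep the position inside the range of VM indices.
                pos[k][d] = min(n_vms - 1.0, max(0.0, pos[k][d] + vel[k][d]))
            f = fitness(pos[k])
            if f < pbest_fit[k]:
                pbest[k], pbest_fit[k] = pos[k][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[k][:], f
    return decode(gbest), gbest_fit

# Hypothetical ETC matrix: 4 tasks, 2 VMs.
etc = [[4.0, 8.0], [6.0, 3.0], [5.0, 5.0], [2.0, 9.0]]
assignment, ms = pso_schedule(etc)
print(assignment, ms)
```

Rounding is only one possible way to bridge continuous PSO and discrete task-to-resource mappings; other encodings would change `decode` but not the overall loop.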

4.8 Genetic Algorithm for Cloud Scheduling


In a genetic algorithm (GA), a solution is represented by an individual of the population,
called a chromosome. A chromosome is represented as a string of genes. The first step of the
algorithm generates the initial population at random. The suitability of a chromosome is
judged by the fitness function, whose choice depends on the QoS parameter being optimized; it
can also be a multi-objective function. Chromosomes are selected on the basis of this fitness
function, which is designed for the problem at hand, and operations such as crossover and
mutation are then carried out to produce offspring for the new population.

Procedure GA
Initialization: Generate initial population P of chromosomes.
Fitness: Apply fitness function to calculate the fitness value for each chromosome.
do
Selection: Apply selection operation to select chromosomes for generating the next
generation.
Crossover: Apply crossover function on the pair of chromosomes.
Mutation: Apply mutation function on the chromosomes.
Fitness: Compute the fitness value of new chromosomes.
Replacement: Remove the poor chromosomes and add fit chromosomes.
while (stopping condition is not met) // stop condition can be maximum iterations or no change
in fitness value.
return Best chromosome.
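
The procedure above can be sketched for task scheduling as follows. The encoding (a chromosome is a list holding one VM index per task), tournament selection, single-point crossover, per-gene mutation, and elitist replacement are all assumptions chosen for brevity; the fitness is the makespan computed from a hypothetical `etc[i][j]` matrix.

```python
import random

def ga_schedule(etc, pop_size=30, generations=50, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    n_tasks, n_vms = len(etc), len(etc[0])

    def makespan(chrom):
        # Fitness: largest total load on any VM (lower is better).
        load = [0.0] * n_vms
        for i, j in enumerate(chrom):
            load[j] += etc[i][j]
        return max(load)

    def tournament(pop):
        # Selection: the fitter of two randomly drawn chromosomes.
        a, b = rng.sample(pop, 2)
        return a if makespan(a) <= makespan(b) else b

    # Initialization: random population of task-to-VM assignments.
    pop = [[rng.randrange(n_vms) for _ in range(n_tasks)]
           for _ in range(pop_size)]
    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            # Crossover: single cut point.
            cut = rng.randrange(1, n_tasks) if n_tasks > 1 else 0
            child = p1[:cut] + p2[cut:]
            # Mutation: reassign a gene to a random VM with small probability.
            for i in range(n_tasks):
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n_vms)
            children.append(child)
        # Replacement: keep the fittest chromosomes of parents and children.
        pop = sorted(pop + children, key=makespan)[:pop_size]
    best = min(pop, key=makespan)
    return best, makespan(best)

# Hypothetical ETC matrix: 4 tasks, 2 VMs.
etc = [[4.0, 8.0], [6.0, 3.0], [5.0, 5.0], [2.0, 9.0]]
best, ms = ga_schedule(etc)
print(best, ms)
```

With elitist replacement the best makespan in the population can never get worse from one generation to the next, which is why a fixed number of generations is a reasonable stopping condition here.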
