Download as pdf or txt
Download as pdf or txt
You are on page 1of 10

European Journal of Scientific Research ISSN 1450-216X Vol.65 No.3 (2011), pp. 434-443 EuroJournals Publishing, Inc.

. 2011 http://www.europeanjournalofscientificresearch.com

An Economic Allocation of Resources for Divisible Workloads in Grid Computing Paradigm


G. Murugesan Research Scholar, Anna University, Chennai, India E-mail: murugesh02@gmail.com Tel: +91-9884254492 C. Chellappan Professor, Anna University, Chennai, India E-mail: rcc@annauniv.edu Abstract Grid computing is already a mainstream paradigm for resource-intensive scientific applications, but it also becomes the useful model for enterprise applications. The grid enables resource sharing and dynamic allocation of computational resources, thus increasing access to distributed data, promoting operational flexibility and collaboration and allowing service providers to scale efficiently to meet variable demands. Grid computing requires an effective allocation for the better utilization of the dynamic resources. The execution of user processes must simultaneously satisfy both job execution constraints and system usage policies. Although many scheduling techniques for various computing system exist, traditional scheduling systems are inappropriate for scheduling tasks into grid resources. This paper develops a framework for resource allocation for divisible workloads in an grid computing environment, subject to a set of constraints. Here the resource allocation strategy can control the task assignment to grid resources with the aim to minimize the cost of the grid users. We have developed a mathematical model to allocate the best suitable resources from the resource pool with the objective of minimizing the cost of the grid user with respect to the deadline and budget specified by the user. In this paper, we present an economical model to make it complete user task and consider the quality of service (QOS) requirements, and we analyze the performance of this algorithm at the same time.

Keywords: Divisible workloads, Economical Model, Grid Scheduling, Resource Allocation Model, Computational Grid

1. Introduction
Grid computing is becoming a popular way of providing high performance computing for many process intensive, scientific and business applications. Grid computing consists of large sets of diverse, geographically distributed resources that are collected into a virtual computer for high performance computation. Thus, it is a heterogeneous computing platform that contains different network link and processor capacities. Furthermore, the grid system is not a dedicated system, which leads to load

An Economic Allocation of Resources for Divisible Workloads in Grid Computing Paradigm

435

variations in links and processors. Due to dynamic nature of system, this introduces many difficulties including resource allocation problem. To achieve the promising potentials of tremendous distributed resources, we require an effective and efficient algorithm for resource allocation. Unfortunately most of the resource allocation algorithms in traditional parallel and distributed systems are designed for homogeneous and dedicated resources. Grid computing requires an effective allocator for the better utilization of the dynamic resources. Divisible load theory is designed to solve the challenging problem of allocating and scheduling computing resources on a grid for thousands of independent tasks from large number of users. Divisible load theory involves the optimization of distributed computing problems where both communication and computation load is fractionable. Most divisible load theory uses a continuous mathematics, linear model which admits solution time optimization through linear equations or recursions, allows equivalent elements and other linear model features. It is necessary to consider the scheduling problem in the grid computing system as an economic model. One reason is that for a computer utility that charges customers for access to distributed and networked computing resources. The grid user always interested to utilize the cheapest processor to compute their loads at the same time the load has to be processed within the deadline assigned for each load. Under the network environment, because the resources provider and the resources user are different in the target, strategy and the supply-demand mode, particularly in the geography distribute characteristic, the resources management style of traditional concentrated control is no longer valid. But the present Network Resource Management Scheduling System adopts the conservative of strategy: the parts responsible for scheduling decide the operations how to be assigned to the processor according to the cost function driven by system core parameter. The strategy doesnt consider the resource access, that is, it considers the value of the dealt operations is same. But this doesnt correspond with the reality. Considering the use of machine should pay the cost. Under the network environment, Resource Scheduling can be considered as an optimization problem under limited conditions. In this paper, we present an economical model to make it complete its task and consider the quality of service (QOS) requirements, and we analyze the performance of this algorithm at the same time. The existing scheduler used in TeraGrid and other notable computing grids are dedicated for research purpose and based on deadlines, resource availability, and the description of resources required by a job. Because TeraGrid and most other grid computing implementations are not employed in profit seeking firms, the dollar cost of running jobs is not taken into account in arriving at the optimal schedule. Instead, the scheduling is evolved by matching resources and applications with the objective of improving the hardware performance. As grid computing migrates from scientific to business uses, the allocation of workloads to resources to meet business objectives, such as overall time, cost, or revenue becomes important aspect. This paper introduces a novel frame work for economic scheduling in Grid computing.

2. Related Works
As economic models are introduced into Grid computing, new research opportunities arise. Because the economic cost and profit are considered by Grid users and resource providers respectively, new objective functions and scheduling algorithms optimizing them are proposed. The economic problem exists only when the resources and the participants, namely consumers and producers, maximize their utility by deciding among needs that cannot concurrently satisfied and lead to different utility levels [6]. There are various economic models for Grid scheduling has been proposed with different techniques and objectives such as Commodity market model, Bargaining model, Tender/Contract-net model, Auction model, Community model, Monopoly/Oligopoly and so on proposed in [3]. In [7, 9], a deadline scheduling algorithm that supports load correction and fallback mechanisms to improve the algorithms performance in an environment that is based on client-server model is proposed. It submits

436

G. Murugesan and C. Chellappan

each job to a resource that can finish it in time less than or equal than the time specified for completing the job. The aim of this work is to decrease the number of jobs that dont meet their deadlines. The resources are priced according to their performance. It uses bricks simulator which is written in Java after extending it to assess the performance of the implemented deadline algorithm. In [2], several algorithms called deadline and budget constrained (DBC) scheduling algorithms are presented which consider the cost and makespan of a job simultaneously. These algorithms implement different strategies. For example, guarantee the deadline and minimize the cost or guarantee the budget and minimize the completion time. The difficulties to optimize these two parameters in an algorithm lie in the fact that the units for cost and time are different, and these two goals usually have conflicts (for example, high performance resources are usually expensive). We can also find such examples in [8] and [10]. The most straight forward method of an early grid and matching price model is the Sun Grid [11]. The Sun Grid offers a pool of pre-installed applications as well as options for users to deploy their own applications on Suns resources. The pricing model for this service is straight forward: $1/CPU-hour. The effective workload allocation model with single source has been proposed [5] for data grid system [1]. The time and cost trade-off has been proposed [4] with two meta-scheduling heuristics algorithms, that minimize and manage the execution cost and time of user applications. Also they have presented a cost metric to manage the trade-off between the execution cost and time.

3. Resource Allocation Model


The divisible load distribution model considered here goes through the following process. The load to be processed arrives at the node, called the originator or the cluster head. The cluster consists of collection of processing elements or processing nodes or processors. It can be a normal desktop PC connected in the network. Each processing element can be dedicated one or non-dedicated on which depends on the resource architecture under consideration. The role of the cluster head is, whenever it receives the load from the grid broker, it divides the load into m fractions called task and each task could be assigned to a processing element. The cluster head also involve in the process and it has its own fraction of loads. There are two possible approaches to partition the entire workload received from the grid broker. In the first approach the entire load can be equally divides with respect to the number of processing elements involved in the scheduling process. The other way is the intelligent approach. Here the cluster head allocate the fraction of workload with respect to the processing capacity. Both the approach are purely depends on the architecture used in the computing process. If the processing elements are homogeneous: ie., all the processing elements are equipped with same operating system, memory capacity, processing speed and so on, then the cluster head divides the load into equal division. Otherwise unequal fraction of load is considered. There are two possible approaches to allocate a fraction of load; in the first case the cluster head partition the entire workload into m equal fractions, starts the computation on its own load fraction and simultaneously starts distributing the other load fractions to other processing elements one at a time in a predetermined order. On the other hand, the cluster head first distributes the workload fractions to the rest of the processing elements and then it computes its own load fraction. The topology considered here is the tree topology. In that topology the cluster head is the root node and all the other processing elements are its child node. There is a direct communication link between the processing nodes with the originator. There is no direct link between the processing nodes. Here we are assuming that there is no communication failure and computation failure. But in real situation it is not possible. To reduce the computation complexity we considered as failure free communication and computation process. In the tree network topology, the workload distribution can be of random order or sequential order among the child processors or nodes. In sequential order, consider there is m number of child nodes then m! sequences are possible. An optimal sequence is that in which the load distribution follows the order in which the link speed decreases. However, from the network designers perspective, the best way allocate the workload fraction can be arrange the faster processors and links

An Economic Allocation of Resources for Divisible Workloads in Grid Computing Paradigm

437

to connect the faster processor to faster link and the next faster processor to the next processor link and so on then follow the optimal sequence of load distribution. In this study we assume that the resources are in cluster form. The cluster consist of a cluster head called as root processor or originator who receive the workload from the grid user or the broker and a set of nodes called processing elements or simply processor. Whenever the originator receives the workload it divides into fraction of loads with respect to the computing capacity of the processor. We referred processor and processing element and also originator and cluster head interchangeably. Also we assume that the originator knows that the computing and communication capacity of each processors. The originator becomes an intelligent node with high processing and storage capacity. It can divide the workload in short time duration and has the capacity of simultaneous communication. So we assume that there is no communication delay to transmit a workload fraction to the processor.

4. Single Divisible Load


Scheduling for single source with single resource is very simple compared to other categories. The major problem with this category is to select the best set of processing elements (PE)/processors. Whenever the originator receives the workload it selects the set of processor with respect to the deadline and the budget requirements of the grid user. If the processors are homogeneous (the computation and communication time and cost is same for all the processors) then the originator divide the entire workload into equal number of fractions called task and each task to be assigned to a processor. In the real grid environment a processor state may be dynamic ie., at anytime the processor can be participate the scheduling process or it can leave from scheduling. If a processor leaves in between the scheduling process the assigned task of that processor has to be reallocated to another suitable processor which is available in that time. So we need a reallocation mechanism in the scheduling process. When reallocating the task into other processor definitely the processing time will be increased. To compensate this situation need some flexibility in the deadline of the workload. The following notations are used to solve the problem total workload Wt Dt deadline for the workload Bt budget of the workload Ti time taken to process ith task tj computation time for the jth processor to compute unit workload computation startup time of jth processor tsj zj communication time of a unit workload transfer from the originator to jth processor pj communication cost of a unit workload transfer from the originator to jth processor cj computation cost of the jth processor to compute a unit workload available time for the jth processor Aj N available number of processors wij ith fraction of workload (task) assigned to the jth processor xij =1; if ith fraction of workload is assigned to the jth processor =0; otherwise Normally there are two optimization criteria in resource scheduling; finish time optimization and computing and communication cost optimization. Using the finish time optimization, we are trying to study or improve the performance of the resource, but in the other case aims to reduce or optimize the grid usage cost. Sometimes these two optimizing criteria were conflicting. While there are a variety of ways to trade off between the finish time and cost, a natural way is to minimize the cost such that for a given sequence of load distribution finish time is minimized. In our case the distribution model become a single level tree network shown in Fig.1.

438

G. Murugesan and C. Chellappan


Figure. 1: Workload distribution model

The originator receives the workload it divides the workload into task and each task is assigned to a processor. Here we assume that the processors will start computation the task as they start receiving them and that the communication delay between the originator and the processors is negligibly smaller than the computation time owing to high speed links so that no processor starve for load. In order to derive an optimal solution it is assumed that the processors involved in the computation process will available in the entire scheduling period. Also we assume that each processor has adequate memory/buffer space to accommodate and process the loads received from the originator.
Figure. 2: Timing diagram for task processing

Fig. 2 shows the timing diagram for task processing. From the figure at time t=0 all the processors are in idle. The processors are start computation after receive its workload portion from the originator. T1 is the time taken to process the entire workload portion for the first processor. The process time is the sum of communication time (tcomm ) and computation time (tcomp). Now the equations that govern the relations among various variables and parameters in the network can be written as follows. ticomm = ziwij (1) ticomp = tiwij (2) T1= t1comm + t1comp (3) Let ts1 be the computation startup time then it can be expressed as follows ts1= t1comm (4) 2 ts2 = ts1 + t comm (5) tsN = tsN-1 + tNcomm (6) Now T1 can expressed as

An Economic Allocation of Resources for Divisible Workloads in Grid Computing Paradigm

439

T1 = ts1 + t1comp (7) T2 = ts2 + t2comp (8) N TN = tsN + t comp (9) Let Ai be the ith processor available time then the processing time (Ti) should be always less than or equal to the available time of that processor. It can be expressed as Ti < = Ai (10) Let Wt be the total workload received by the originator from the grid user. The originator select the set of processor to process the workload with respect to the cost and the deadline assigned by the broker. The total workload has to be divided into tasks and each task is assigned to a processor. We are assuming that once the processor is selected to do the process the entire duration of the processing time the resources is dedicated to only that process and there is no failure in the processor. Let N be the number of task; the maximum value of N become the number of processors selected for the allocation process. Let wij be the ith task assigned to the jth processor. ie., first task may be assigned into first processor or second processor or third one and so on. Let xij be a binary variable; the value of xij=1 if the ith task is assigned to jth processor, zero otherwise. There is no overlap between the processor and the task. Only one task is assigned to one processor. So we can form the equation as follows;
N N ij

w
i=i
N

x ij = W t

j=i

(11)

j =1

x ij = 1 ; j = 1,2,,N

(12)

The equation (11) is used to confirm that the entire task has been assigned to the processors. ie., the sum of all the individual task could be the total workload. Equation (12) is for avoiding overlapping to ensure that only one task is assigned to a processor at particular time duration. Let cj be the payment to process one unit of task in the jth processor. Let wij be the ith fraction of task to be allotted for the jth processor. The total workload may be divided into maximum of N task. Here N is the number of processors selected from the resource pool for the scheduling process. Then the total cost incurred to process wj unit of task become cj*wj. The total cost to process the entire workload must be less than or equal to the budget allotted for the workload. So the equation become
N N j ij ij

cw x
i =1 j =1

Bt

(13) Equation (13) is the budget constraint; ie. the amount spent to complete the complete the process could be less than the budget. Also let tj be the time required to process a unit task by the jth processor. The total time required to process wij task become tj*wij. Now we can form the equation as follows:
N i =1 N

j =1

t jw ijx ij D t

(14) Let zj be the communication time required to transfer a unit of workload to the jth processor. The total time required to transfer wij unit of task become zj*wij. For all tasks the total communication time can be represented as
N i=1 N


j=1

t jw ijx ij

(15)

440

G. Murugesan and C. Chellappan

Sum of communication time and the computation time become the process time of a unit workload and that could be less than or equal to the dead line. So we combine the eqn. (14) and eqn. (15) to get the total time duration to complete the process of entire workload shown in eqn. (16).
N N j ij ij + N N j ij ij

zw x tw x
i =1 j =1 i =1 j =1

Dt

(16)

Let pj be the communication cost incurred to transfer a unit workload from originator to the processor. so to transfer wij unit of task from originator to the processor become pj*wij. For all tasks the total communication cost can be represented as
N i =1 N

j =1

p j w ij x ij

(17)

Therefore the total cost is expressed as the sum of communication cost and computation cost. So we combine the eqn. (13) and eqn. (17) we get the new eqn. (18)
N N j ij ij N N j ij ij

p w x + c w x
i =1 j =1 i =1 j =1

Bt

(18)

Let wij be the task assigned to the jth processor and mj is the available memory for the processor. So the amount of task assigned to the processor should be less than or equal to the memory capacity of the resource. It can be represented in eqn. (19). (19) The above equations lead to form a non-linear programming problem with the objective of minimizes the grid user cost. To summarize, the complete formulation of the model become Minimize
N N N N

wij mj

Z=
N

pjwijxij + cjwijxij
i =1 j =1 i =1 j =1
N N N

Subject to

zjwijxij + tjwijxij Dt
i =1 j =1 i =1 j =1

pjwijxij + cjwijxij Bt
i =1 j =1 i =1 j =1

Ti Ai ; i=1, 2, ,N, j=1, 2, . . ., N wij mj ; i=1, 2, ,N, j=1, 2, . . ., N


N N ij ij

w x
i =i j =i N

= Wt ;

i=1, 2, ,N, j=1, 2, . . ., N

; i=1, 2, ,N wij 0 ; i=1, 2, ,N, j=1, 2, . . ., N xij = {0,1} ; i=1, 2, ,N, j=1, 2, . . ., N The proposed model lies in the framework of mixed integer nonlinear programming model, even though the structure of nonlinearity is quite simple. The model can be characterized by large scale
j =1

ij

=1

An Economic Allocation of Resources for Divisible Workloads in Grid Computing Paradigm

441

dimensions (depending on the number of processors to manage) and requires, typically, a fast solution since the time horizon of the allocation is quite short.

5. Computational Experiments
Since the quite original nature of the proposed models, the computational experiments have been planned and carried out with the basic aim to access the suitability, reliability and semantic behavior of the same models. To this end, we have considered several test problems designed to face all possible situations that can occur in practice, with respect to the intersection of the supply and demand curve.
Table 1:
Resource R1 R2 R3 R4 R5

Resource availability
Time/Unit Load 2 3 2 2 3 Cost/Unit Load 6 4 6 6 4 Available Time 300 300 300 300 300

The objective is to investigate rate at which the resource users and the resource providers are matched by the systems dynamic allocation mechanism as they emerge. The Table.1 shows the properties of the resources available in the grid system. From the table we are assuming that there are five resources which is in the first column of the table. The second column specifies the amount of time required to execute/process one unit of workload per unit time with respect to the resources. The third column data specifies the cost required to execute/process a unit of workload in the various resources. And the data in the last column shows the resource available time in the schedule.
Table 2:
Resource R1 R2 R3 R4 R5

Workload allocation for heterogeneous resources


Work Load Allotted 100 0 150 100 150 Time taken to Process 300 0 300 300 300

To perform the experiment we have consider the total workload as 500 units with the dead line of 1200 time unit and the budget allotted to complete the workload is 2600 cost units with five resources are in the grid resource. We first perform the operation which focuses on cost minimization. From Table.2 the total workload of 500 workload unit is divided into five tasks such as 100, 0, 150, 100 and 150 and which is allotted into the resources R1 to R5 respectively. From the table the resource R2 is not utilized due to high cost. The sequence of allocation is random. The grid system completes the job with the budget of 2600 cost unit and it takes 1200 time units to complete the entire workload. Suppose all the resources are homogeneous resources with process time is 2 time unit to process a unit workload and the cost is 6 cost units to compute a unit workload then the allocation is shown in the table 3.
Table 3:
Resource R1 R2

Workload allocation for homogeneous resources


Work Load Allotted 6 125 Time taken to Process 12 250

442
Table 3:
R3 R4 R5

G. Murugesan and C. Chellappan


Workload allocation for homogeneous resources - continued
125 119 125 250 238 250

From the table all the resources are participated in the scheduling process and the total time taken to process the entire workload is 1000 time unit. The total amount to spent utilize the grid system is 3000 unit cost. The experiment can be extended with very large number of resources. IBM offers On Demand Business Service [12] similar to Sun Grid, where the customers are well versed in pay-as-you-go service so that the cost incurred towards the excess hardware and software investment and maintenance become minimized. The actual pricing model has not been disclosed as in the case with Sun but is rather on per-customer basis. Beyond the computational grids, research grids such as TeraGrid [13] and SURAgrid [14] exist where the cost is not directly tied to currency but rather to allocation time. Regardless of how one approaches the grid, there is a cost associated with the available services.

6. Conclusion
In this paper, solutions for optimum allocation of tasks to processors over a single tree network are obtained via optimization problem that uses linear programming. The objective is to find the minimum processing cost by distributing processing loads that originate from a source; it is constrained by the processing time and the cost. This algorithm can meet the users QOS need. It spends less time as possible and dont excess the budget at the same time. We can also take other principles, such as: we always choose the resource that expends the least cost. When the cost is same, we choose the one spends the less time. Currently, we are in the process of extending the single source model to consider the sequential and concurrent load distribution strategy with multiple sources model in the single level tree network.

References
[1] M.Abdullah, M.Othman, H.Ibrahim and S.Subramaniam, Load Allocation Model for Scheduling Divisible Data Grid Applications, Journal of Computer Science 5 (10): 760-763, 2009 R.Buyya, J. Giddy, and D. Abramson, An Evaluation of Economy-based Resource Trading and Scheduling on Computational Power Grids for Parameter Sweep Applications, in Proc. of the 2nd International Workshop on Active Middleware Services (AMS 2000), pp. 221-230, , Pittsburgh, USA, August 2000 R.Buyya, D. Abramson, J.Giddy, and H.Stockinger. Economic models for resource management and scheduling in grid computing. Journal of Concurrency and Communication: Practice and Experience, pages 1507-1542, 2002. S.K.Garg, R.Buyya and H.J.Siegel. Scheduling Parallel Applications on Utility Grids: Time and Cost Trade-off Management, Proceedings of the Thirty-Second Australasian Computer Science Conference (ACSC2009), Wellington, Australia, Conferences in Research and Practice in Information Technology (CRPIT), Vol. 91 K.Ger, B. De, Resource Allocation in Grid Computing, Journal of Scheduling 11:163-173,2008 J.Nakai. Reading between the lines and beyond. Technical Report NAS-01-010, NASA Ames Research Center, 2002. A.Takefusa, S.Matsuoka, H.Casanova, and F.Berman. A study of deadline scheduling for client-server systems on the computational grid. In HPDC01: Proceedings of the 10th IEEE

[2]

[3]

[4]

[5] [6] [7]

An Economic Allocation of Resources for Divisible Workloads in Grid Computing Paradigm

443

[8]

[9]

[10]

[11] [12] [13] [14]

International Symposium on High Performance Distributed Computing (HPDC-1001), pages 406-415. IEEE Computer Society, 2001. S.Venugopal and R. Buyya, A Deadline and Budget Constrained Scheduling Algorithm for eScience Applications on Data Grids, in Proc. of 6th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP-2005),pp.60-72, Melbourne, Australia, Oct. 2-5, 2005. S.Viswanathan, B.Veeravalli and T.G.Robertazzi, Resource-aware distributed scheduling strategies for large-scale computational cluster/grid systems. IEEE Transactions on Parallel and Distributed Systems,18: 1450-1461, 2007 J.Yu, R. Buyya, and C. K. Tham, QoS-based Scheduling of Workflow Applicationson Service Grids, in Proc. of the 1st IEEE International Conference on e-Science and Grid Computing (eScience05), Melbourne, Australia, December 2005 Sun Grid http://www.sun.com/service/sungrid/. IBM On Demand Business, http://www.ibm.com/e-business/ondemand/us/index.html TeraGrid, http://www.teragrid.org SURAgrid, http://www.sura.org

You might also like