Professional Documents
Culture Documents
Ejsr 65 3 15
Ejsr 65 3 15
. 2011 http://www.europeanjournalofscientificresearch.com
Keywords: Divisible workloads, Economical Model, Grid Scheduling, Resource Allocation Model, Computational Grid
1. Introduction
Grid computing is becoming a popular way of providing high performance computing for many process intensive, scientific and business applications. Grid computing consists of large sets of diverse, geographically distributed resources that are collected into a virtual computer for high performance computation. Thus, it is a heterogeneous computing platform that contains different network link and processor capacities. Furthermore, the grid system is not a dedicated system, which leads to load
435
variations in links and processors. Due to dynamic nature of system, this introduces many difficulties including resource allocation problem. To achieve the promising potentials of tremendous distributed resources, we require an effective and efficient algorithm for resource allocation. Unfortunately most of the resource allocation algorithms in traditional parallel and distributed systems are designed for homogeneous and dedicated resources. Grid computing requires an effective allocator for the better utilization of the dynamic resources. Divisible load theory is designed to solve the challenging problem of allocating and scheduling computing resources on a grid for thousands of independent tasks from large number of users. Divisible load theory involves the optimization of distributed computing problems where both communication and computation load is fractionable. Most divisible load theory uses a continuous mathematics, linear model which admits solution time optimization through linear equations or recursions, allows equivalent elements and other linear model features. It is necessary to consider the scheduling problem in the grid computing system as an economic model. One reason is that for a computer utility that charges customers for access to distributed and networked computing resources. The grid user always interested to utilize the cheapest processor to compute their loads at the same time the load has to be processed within the deadline assigned for each load. Under the network environment, because the resources provider and the resources user are different in the target, strategy and the supply-demand mode, particularly in the geography distribute characteristic, the resources management style of traditional concentrated control is no longer valid. But the present Network Resource Management Scheduling System adopts the conservative of strategy: the parts responsible for scheduling decide the operations how to be assigned to the processor according to the cost function driven by system core parameter. The strategy doesnt consider the resource access, that is, it considers the value of the dealt operations is same. But this doesnt correspond with the reality. Considering the use of machine should pay the cost. Under the network environment, Resource Scheduling can be considered as an optimization problem under limited conditions. In this paper, we present an economical model to make it complete its task and consider the quality of service (QOS) requirements, and we analyze the performance of this algorithm at the same time. The existing scheduler used in TeraGrid and other notable computing grids are dedicated for research purpose and based on deadlines, resource availability, and the description of resources required by a job. Because TeraGrid and most other grid computing implementations are not employed in profit seeking firms, the dollar cost of running jobs is not taken into account in arriving at the optimal schedule. Instead, the scheduling is evolved by matching resources and applications with the objective of improving the hardware performance. As grid computing migrates from scientific to business uses, the allocation of workloads to resources to meet business objectives, such as overall time, cost, or revenue becomes important aspect. This paper introduces a novel frame work for economic scheduling in Grid computing.
2. Related Works
As economic models are introduced into Grid computing, new research opportunities arise. Because the economic cost and profit are considered by Grid users and resource providers respectively, new objective functions and scheduling algorithms optimizing them are proposed. The economic problem exists only when the resources and the participants, namely consumers and producers, maximize their utility by deciding among needs that cannot concurrently satisfied and lead to different utility levels [6]. There are various economic models for Grid scheduling has been proposed with different techniques and objectives such as Commodity market model, Bargaining model, Tender/Contract-net model, Auction model, Community model, Monopoly/Oligopoly and so on proposed in [3]. In [7, 9], a deadline scheduling algorithm that supports load correction and fallback mechanisms to improve the algorithms performance in an environment that is based on client-server model is proposed. It submits
436
each job to a resource that can finish it in time less than or equal than the time specified for completing the job. The aim of this work is to decrease the number of jobs that dont meet their deadlines. The resources are priced according to their performance. It uses bricks simulator which is written in Java after extending it to assess the performance of the implemented deadline algorithm. In [2], several algorithms called deadline and budget constrained (DBC) scheduling algorithms are presented which consider the cost and makespan of a job simultaneously. These algorithms implement different strategies. For example, guarantee the deadline and minimize the cost or guarantee the budget and minimize the completion time. The difficulties to optimize these two parameters in an algorithm lie in the fact that the units for cost and time are different, and these two goals usually have conflicts (for example, high performance resources are usually expensive). We can also find such examples in [8] and [10]. The most straight forward method of an early grid and matching price model is the Sun Grid [11]. The Sun Grid offers a pool of pre-installed applications as well as options for users to deploy their own applications on Suns resources. The pricing model for this service is straight forward: $1/CPU-hour. The effective workload allocation model with single source has been proposed [5] for data grid system [1]. The time and cost trade-off has been proposed [4] with two meta-scheduling heuristics algorithms, that minimize and manage the execution cost and time of user applications. Also they have presented a cost metric to manage the trade-off between the execution cost and time.
437
to connect the faster processor to faster link and the next faster processor to the next processor link and so on then follow the optimal sequence of load distribution. In this study we assume that the resources are in cluster form. The cluster consist of a cluster head called as root processor or originator who receive the workload from the grid user or the broker and a set of nodes called processing elements or simply processor. Whenever the originator receives the workload it divides into fraction of loads with respect to the computing capacity of the processor. We referred processor and processing element and also originator and cluster head interchangeably. Also we assume that the originator knows that the computing and communication capacity of each processors. The originator becomes an intelligent node with high processing and storage capacity. It can divide the workload in short time duration and has the capacity of simultaneous communication. So we assume that there is no communication delay to transmit a workload fraction to the processor.
438
The originator receives the workload it divides the workload into task and each task is assigned to a processor. Here we assume that the processors will start computation the task as they start receiving them and that the communication delay between the originator and the processors is negligibly smaller than the computation time owing to high speed links so that no processor starve for load. In order to derive an optimal solution it is assumed that the processors involved in the computation process will available in the entire scheduling period. Also we assume that each processor has adequate memory/buffer space to accommodate and process the loads received from the originator.
Figure. 2: Timing diagram for task processing
Fig. 2 shows the timing diagram for task processing. From the figure at time t=0 all the processors are in idle. The processors are start computation after receive its workload portion from the originator. T1 is the time taken to process the entire workload portion for the first processor. The process time is the sum of communication time (tcomm ) and computation time (tcomp). Now the equations that govern the relations among various variables and parameters in the network can be written as follows. ticomm = ziwij (1) ticomp = tiwij (2) T1= t1comm + t1comp (3) Let ts1 be the computation startup time then it can be expressed as follows ts1= t1comm (4) 2 ts2 = ts1 + t comm (5) tsN = tsN-1 + tNcomm (6) Now T1 can expressed as
439
T1 = ts1 + t1comp (7) T2 = ts2 + t2comp (8) N TN = tsN + t comp (9) Let Ai be the ith processor available time then the processing time (Ti) should be always less than or equal to the available time of that processor. It can be expressed as Ti < = Ai (10) Let Wt be the total workload received by the originator from the grid user. The originator select the set of processor to process the workload with respect to the cost and the deadline assigned by the broker. The total workload has to be divided into tasks and each task is assigned to a processor. We are assuming that once the processor is selected to do the process the entire duration of the processing time the resources is dedicated to only that process and there is no failure in the processor. Let N be the number of task; the maximum value of N become the number of processors selected for the allocation process. Let wij be the ith task assigned to the jth processor. ie., first task may be assigned into first processor or second processor or third one and so on. Let xij be a binary variable; the value of xij=1 if the ith task is assigned to jth processor, zero otherwise. There is no overlap between the processor and the task. Only one task is assigned to one processor. So we can form the equation as follows;
N N ij
w
i=i
N
x ij = W t
j=i
(11)
j =1
x ij = 1 ; j = 1,2,,N
(12)
The equation (11) is used to confirm that the entire task has been assigned to the processors. ie., the sum of all the individual task could be the total workload. Equation (12) is for avoiding overlapping to ensure that only one task is assigned to a processor at particular time duration. Let cj be the payment to process one unit of task in the jth processor. Let wij be the ith fraction of task to be allotted for the jth processor. The total workload may be divided into maximum of N task. Here N is the number of processors selected from the resource pool for the scheduling process. Then the total cost incurred to process wj unit of task become cj*wj. The total cost to process the entire workload must be less than or equal to the budget allotted for the workload. So the equation become
N N j ij ij
cw x
i =1 j =1
Bt
(13) Equation (13) is the budget constraint; ie. the amount spent to complete the complete the process could be less than the budget. Also let tj be the time required to process a unit task by the jth processor. The total time required to process wij task become tj*wij. Now we can form the equation as follows:
N i =1 N
j =1
t jw ijx ij D t
(14) Let zj be the communication time required to transfer a unit of workload to the jth processor. The total time required to transfer wij unit of task become zj*wij. For all tasks the total communication time can be represented as
N i=1 N
j=1
t jw ijx ij
(15)
440
Sum of communication time and the computation time become the process time of a unit workload and that could be less than or equal to the dead line. So we combine the eqn. (14) and eqn. (15) to get the total time duration to complete the process of entire workload shown in eqn. (16).
N N j ij ij + N N j ij ij
zw x tw x
i =1 j =1 i =1 j =1
Dt
(16)
Let pj be the communication cost incurred to transfer a unit workload from originator to the processor. so to transfer wij unit of task from originator to the processor become pj*wij. For all tasks the total communication cost can be represented as
N i =1 N
j =1
p j w ij x ij
(17)
Therefore the total cost is expressed as the sum of communication cost and computation cost. So we combine the eqn. (13) and eqn. (17) we get the new eqn. (18)
N N j ij ij N N j ij ij
p w x + c w x
i =1 j =1 i =1 j =1
Bt
(18)
Let wij be the task assigned to the jth processor and mj is the available memory for the processor. So the amount of task assigned to the processor should be less than or equal to the memory capacity of the resource. It can be represented in eqn. (19). (19) The above equations lead to form a non-linear programming problem with the objective of minimizes the grid user cost. To summarize, the complete formulation of the model become Minimize
N N N N
wij mj
Z=
N
pjwijxij + cjwijxij
i =1 j =1 i =1 j =1
N N N
Subject to
zjwijxij + tjwijxij Dt
i =1 j =1 i =1 j =1
pjwijxij + cjwijxij Bt
i =1 j =1 i =1 j =1
w x
i =i j =i N
= Wt ;
; i=1, 2, ,N wij 0 ; i=1, 2, ,N, j=1, 2, . . ., N xij = {0,1} ; i=1, 2, ,N, j=1, 2, . . ., N The proposed model lies in the framework of mixed integer nonlinear programming model, even though the structure of nonlinearity is quite simple. The model can be characterized by large scale
j =1
ij
=1
441
dimensions (depending on the number of processors to manage) and requires, typically, a fast solution since the time horizon of the allocation is quite short.
5. Computational Experiments
Since the quite original nature of the proposed models, the computational experiments have been planned and carried out with the basic aim to access the suitability, reliability and semantic behavior of the same models. To this end, we have considered several test problems designed to face all possible situations that can occur in practice, with respect to the intersection of the supply and demand curve.
Table 1:
Resource R1 R2 R3 R4 R5
Resource availability
Time/Unit Load 2 3 2 2 3 Cost/Unit Load 6 4 6 6 4 Available Time 300 300 300 300 300
The objective is to investigate rate at which the resource users and the resource providers are matched by the systems dynamic allocation mechanism as they emerge. The Table.1 shows the properties of the resources available in the grid system. From the table we are assuming that there are five resources which is in the first column of the table. The second column specifies the amount of time required to execute/process one unit of workload per unit time with respect to the resources. The third column data specifies the cost required to execute/process a unit of workload in the various resources. And the data in the last column shows the resource available time in the schedule.
Table 2:
Resource R1 R2 R3 R4 R5
To perform the experiment we have consider the total workload as 500 units with the dead line of 1200 time unit and the budget allotted to complete the workload is 2600 cost units with five resources are in the grid resource. We first perform the operation which focuses on cost minimization. From Table.2 the total workload of 500 workload unit is divided into five tasks such as 100, 0, 150, 100 and 150 and which is allotted into the resources R1 to R5 respectively. From the table the resource R2 is not utilized due to high cost. The sequence of allocation is random. The grid system completes the job with the budget of 2600 cost unit and it takes 1200 time units to complete the entire workload. Suppose all the resources are homogeneous resources with process time is 2 time unit to process a unit workload and the cost is 6 cost units to compute a unit workload then the allocation is shown in the table 3.
Table 3:
Resource R1 R2
442
Table 3:
R3 R4 R5
From the table all the resources are participated in the scheduling process and the total time taken to process the entire workload is 1000 time unit. The total amount to spent utilize the grid system is 3000 unit cost. The experiment can be extended with very large number of resources. IBM offers On Demand Business Service [12] similar to Sun Grid, where the customers are well versed in pay-as-you-go service so that the cost incurred towards the excess hardware and software investment and maintenance become minimized. The actual pricing model has not been disclosed as in the case with Sun but is rather on per-customer basis. Beyond the computational grids, research grids such as TeraGrid [13] and SURAgrid [14] exist where the cost is not directly tied to currency but rather to allocation time. Regardless of how one approaches the grid, there is a cost associated with the available services.
6. Conclusion
In this paper, solutions for optimum allocation of tasks to processors over a single tree network are obtained via optimization problem that uses linear programming. The objective is to find the minimum processing cost by distributing processing loads that originate from a source; it is constrained by the processing time and the cost. This algorithm can meet the users QOS need. It spends less time as possible and dont excess the budget at the same time. We can also take other principles, such as: we always choose the resource that expends the least cost. When the cost is same, we choose the one spends the less time. Currently, we are in the process of extending the single source model to consider the sequential and concurrent load distribution strategy with multiple sources model in the single level tree network.
References
[1] M.Abdullah, M.Othman, H.Ibrahim and S.Subramaniam, Load Allocation Model for Scheduling Divisible Data Grid Applications, Journal of Computer Science 5 (10): 760-763, 2009 R.Buyya, J. Giddy, and D. Abramson, An Evaluation of Economy-based Resource Trading and Scheduling on Computational Power Grids for Parameter Sweep Applications, in Proc. of the 2nd International Workshop on Active Middleware Services (AMS 2000), pp. 221-230, , Pittsburgh, USA, August 2000 R.Buyya, D. Abramson, J.Giddy, and H.Stockinger. Economic models for resource management and scheduling in grid computing. Journal of Concurrency and Communication: Practice and Experience, pages 1507-1542, 2002. S.K.Garg, R.Buyya and H.J.Siegel. Scheduling Parallel Applications on Utility Grids: Time and Cost Trade-off Management, Proceedings of the Thirty-Second Australasian Computer Science Conference (ACSC2009), Wellington, Australia, Conferences in Research and Practice in Information Technology (CRPIT), Vol. 91 K.Ger, B. De, Resource Allocation in Grid Computing, Journal of Scheduling 11:163-173,2008 J.Nakai. Reading between the lines and beyond. Technical Report NAS-01-010, NASA Ames Research Center, 2002. A.Takefusa, S.Matsuoka, H.Casanova, and F.Berman. A study of deadline scheduling for client-server systems on the computational grid. In HPDC01: Proceedings of the 10th IEEE
[2]
[3]
[4]
443
[8]
[9]
[10]
International Symposium on High Performance Distributed Computing (HPDC-1001), pages 406-415. IEEE Computer Society, 2001. S.Venugopal and R. Buyya, A Deadline and Budget Constrained Scheduling Algorithm for eScience Applications on Data Grids, in Proc. of 6th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP-2005),pp.60-72, Melbourne, Australia, Oct. 2-5, 2005. S.Viswanathan, B.Veeravalli and T.G.Robertazzi, Resource-aware distributed scheduling strategies for large-scale computational cluster/grid systems. IEEE Transactions on Parallel and Distributed Systems,18: 1450-1461, 2007 J.Yu, R. Buyya, and C. K. Tham, QoS-based Scheduling of Workflow Applicationson Service Grids, in Proc. of the 1st IEEE International Conference on e-Science and Grid Computing (eScience05), Melbourne, Australia, December 2005 Sun Grid http://www.sun.com/service/sungrid/. IBM On Demand Business, http://www.ibm.com/e-business/ondemand/us/index.html TeraGrid, http://www.teragrid.org SURAgrid, http://www.sura.org