Software Quality of Service in Composite Applications Built With Web Services PHD Thesis

Software Quality of Service in Composite Applications Built with Web Services
Shelly Saunders
A thesis submitted in fulfilment of the requirements for the degree of Doctor of Philosophy of Nottingham Trent University and Southampton Solent University November 2010
Abstract
Over recent years businesses have evolved their enterprise architectures to include packaged applications, legacy systems, and bespoke line-of-business (LOB) applications that are integrated using an architectural approach called service-oriented architecture (SOA). The architecture promotes agile and reconfigurable application development, which is ideal in 21st century businesses. Modern SOAs now extend into the Cloud using pay-per-use Software as a Service (SaaS). Composite applications built using Web Services are one way in which SOA principles are being introduced into enterprise architectures. Although there are many business reasons for this type of architecture, the methods and techniques for estimating overall Quality of Service (QoS) in a composite application built using Web Services do not exist at the moment. This thesis attempts to address a number of questions in this area. Firstly, how can we predict the performance of a composite application? For example, what is the effect on performance of replacing one Web Service with another one of equivalent functionality, or of dynamically changing the steps in a workflow? Secondly, how can we maximise the performance of that application through effective use and exploitation of the resources available to it? Thirdly, what strategies are there for improving our ability to meet QoS metrics in a composite application? As the provider of a composite application to one or more clients, how can we manage situations where resources are overloaded? Under these conditions it would be useful to be able to selectively admit or reject requests from clients based on some criteria that maximise the provider's profits or business objectives.
Fourthly, how can we define and manage SLAs for performance metrics in composite applications? This thesis makes the following five contributions. The first is detailed test results to demonstrate that the Mean Value Analysis (MVA) algorithm can be applied to a queuing network description of a composite application using Web Services. The second is the demonstration of the MVA algorithm as the fitness function for a Genetic Algorithm (GA). The third is a practical example of applying a GA to dynamic management of a real workflow implemented as a set of Web Services across multiple servers. The fourth is the demonstration of strategies for meeting QoS metrics under a number of different real-life overload conditions. The fifth is a proposal for improvements that could be made to existing SLA design methodologies and SLA languages to define QoS metrics for composite applications.
Dedication
To Greta 1993-2008 and Jenny 1970-2009 May flights of angels sing thee to thy rest.
Acknowledgements
I would like to acknowledge the encouragement and support of my supervisors at Southampton Solent University: Eur Ing Professor Margaret Ross MBE, Eur Ing Geoff Staples and Dr Sean Wellington, Head of the Technology Research Centre. I would also like to thank Professor Mike Barnett and Professor John Rees, who both provided helpful advice and direction. The final version of this thesis benefited greatly from the comments and input made by Edwin Gray during the viva. Steve White, of IBM's Autonomic Computing laboratory at the Thomas J Watson Research Centre in Hawthorne, gave up time to read and comment on some of this work in a very valuable session. I have also had useful conversations about SOA in general with colleagues at my former employer, ACE Group, as well as IBM staff from the Hursley labs near Winchester. I would also like to thank Marlborough Stirling plc, who gave permission for me to use the results of a performance testing exercise conducted on their systems.
Table of Contents
Chapter 1 Introduction
1.1 Motivation
1.2 Hypothesis
1.3 Methodology
1.4 Contributions
1.5 Thesis Roadmap
Chapter 2 Analysis of the Problem
Chapter 3 Related Work
3.1 Background
3.1.1 Discovery and Negotiation
3.1.2 Service Level Agreement
3.1.3 Service Provision
3.1.4 Monitoring
3.2 Adaptive Control of Web Applications and Services
3.3 Queuing Theory
3.3.1 Dynamic Resource Configuration
3.3.2 Admission Control
3.3.3 Dynamic Provisioning of Idle Resources
3.3.4 Extending to Multiple Tiers
3.4 Control Theory
3.4.1 Admission Control
3.4.2 Degraded Service
3.4.3 Extending to Multiple Tiers
3.4.4 Fuzzy Controllers
3.5 Combined Approaches
3.6 Solving Optimization Problems
3.6.1 Utility Functions
3.6.2 Integer Linear Programming
3.6.3 Genetic Algorithms
3.7 Concluding Remarks
3.8 Publications
Chapter 4 Designing SLAs for Composite Applications
4.1 Service Level Management
4.1.1 Service Monitoring
4.1.2 Key Quality Indicators and Key Performance Indicators
4.2 Service Level Agreement Design
4.2.1 COSMA
4.2.2 MoDe4SLA
4.2.3 Differential QoS Support
4.3 Proposals
Chapter 5 An MVA Performance Model for a SOA
5.1 Introduction
5.2 Performance Requirements
5.3 Business Demand Modelling
5.4 Workload Characterisation
5.4.1 Task Distribution
5.4.2 Arrival Time Distribution
5.4.3 Service Time Distribution
5.4.4 Load-Dependence of Service Times
5.5 Modelling the Application
5.5.1 Mean Value Analysis
5.5.2 The Queuing Network Model of an N-Tier Application
5.6 Management Software
5.6.1 Capturing Application Metrics
5.6.2 Statistical Analysis of Raw Metrics
5.6.3 MVA Modeller
5.7 Results
5.7.1 Accuracy of the Model
5.8 Using the Model to Predict the Performance of a New Workflow
5.9 Summary and Discussion
5.10 Publications
Chapter 6 A Genetic Algorithm with an MVA Fitness Function for Runtime Performance Improvements of a Composite Application Built Using Web Services
6.1 Introduction to Genetic Algorithms
6.2 Comparison with Other Techniques
6.3 A GA for the Sample Application
6.3.1 Chromosome Encoding
6.3.2 Initial Population
6.3.3 Fitness Evaluation
6.3.4 Fitness Selector
6.3.5 Constraints
6.3.6 Crossover
6.3.7 Mutation
6.3.8 Population Evolution
6.4 Technical Design of the Management Solution
6.4.1 The ESB
6.4.2 Dynamic Routing
6.4.3 Logical Design
6.5 Test Harness
6.5.1 Sample Workflows
6.5.2 Baseline Performance Test Results
6.5.3 Post GA Results
6.6 Summary and Discussion
Chapter 7 Strategy for a QoS-aware Composite Applications in the Cloud
7.1 Enterprise SOA and Cloud Computing
7.1.1 Cloud Computing and Software as a Service
7.1.2 A Unified Architecture
7.1.3 Challenges with SLA Management
7.2 Modelling a Composite Application in the Cloud
7.3 Strategies for Automated QoS Control in the Cloud
7.3.1 Changes in Workload
7.3.2 Loss of Service
7.3.3 Increases in Latency
7.3.4 Differentiated Services
7.4 Conclusions
7.5 Publications
Chapter 8 Evaluation and Conclusions
8.1 Discussion of Results
8.2 Evaluation of Results and Methodologies
8.3 Contributions of this Thesis
8.4 Limitations of this Thesis
8.5 Future Work
References
Appendix A Publications Linked to This Thesis
Journals
Conferences
Appendix A2 Other Research Outputs Not Directly Relevant To This Thesis
Software Engineering
Optoelectronics
Patents
List of Figures
Figure 1.1 Service-oriented application integration.
Figure 1.2 A Virtual Enterprise.
Figure 4.1 Service Level Management
Figure 4.2 Composite Web Services
Figure 4.3 Simplified COSMAdoc schema
Figure 5.1 Probability of client requesting each job class.
Figure 5.2 Distribution of tasks across three tiers for each of the 26 classes of work.
Figure 5.3 Inter-arrival time distribution from web logs.
Figure 5.4 The tail of the distribution for inter-arrival times beyond two seconds.
Figure 5.5 Inter-arrival times of tasks on the job queue.
Figure 5.6 Service time distributions for all tasks executing in under 2 sec.
Figure 5.7 Service time distributions for all tasks executing in over 2 sec
Figure 5.8 Increase in task service time with load
Figure 5.9 Queuing Network Model of an N-tier Application
Figure 5.10 An ESB executing a sequence of tasks via Web Services
Figure 5.11 Execution times of each class in a simple workflow, together with the total response time
Figure 5.12 Comparison of the response times predicted by the model and the actual response times at different loads.
Figure 5.13 Accuracy of the model
Figure 5.14 The differences between the predicted and observed results when the model is used in a predictive manner
Figure 6.1 Layers of services become increasingly more coarse-grained, with the top layer of orchestration services providing a standards based aggregation and process framework.
Figure 6.2 Logical Design
Figure 6.3 Sample Workflows
Figure 6.4 Baseline Results
Figure 7.1 SOA and SaaS used to create a composite application
Figure 7.2 Generalised example queuing network model including SaaS services
List of Tables
Table 5.1 MVA Algorithm
Table 6.1 Example Chromosome Encoding
Table 6.2 Logical Design
Table 6.3 Sample Workload
Table 6.4 Job Distribution
Table 6.5 Measured Execution Times
Table 6.6 Optimised Job Distribution
Glossary of Terms
Admission Control - a QoS procedure which determines the rate at which jobs are accepted into a system or network, or indeed, whether the jobs will be accepted at all.
Artificial Intelligence (AI) - a branch of computer science dealing with simulating intelligent behaviour in computers.
Autonomic Computing - a term created by IBM to describe self-managing computer systems.
BPEL - an XML language for describing business processes.
Cloud Computing - the provision of dynamically scalable and often virtualised resources as a service over the Internet. Cloud computing services often provide common business applications online that are accessed from a web browser, while the software and data are stored on the servers.
Control Theory - a technique from engineering whereby one or more input variables are tracked by a controller in order to manipulate one or more output variables.
Decision Theory - a branch of AI concerned with decision making. In particular, decision theory addresses problems such as how to measure the outcome of a decision to ensure that it is optimal and how to make decisions with incomplete knowledge (choice under uncertainty).
E-Commerce Transaction - a business transaction that occurs over a network between two partners. The transaction is likely to consist of a number of discrete business processes that automatically engage other IT systems.
ESB (Enterprise Service Bus) - a layer of abstraction on top of a messaging service stack that supports Web services standards, synchronous and asynchronous messaging patterns, content-based routing, rules-based content filtering or enrichment, XML transformation services, and standards-based adapters (such as JCA, JMS).
Event-driven architecture (EDA) - a software architecture pattern promoting the production, detection, consumption of, and reaction to events. Event-driven architecture can complement service-oriented architecture (SOA) because services can be activated by triggers fired on incoming events.
Fuzzy Logic - a form of multi-valued logic derived from fuzzy set theory to deal with reasoning that is approximate rather than precise.
Genetic Algorithm - a genetic algorithm (GA) is a search technique used in computing to find exact or approximate solutions to optimization and search problems. Genetic algorithms are categorized as global search heuristics. Genetic algorithms are a particular class of evolutionary algorithms (also known as evolutionary computation) that use techniques inspired by evolutionary biology such as inheritance, mutation, selection, and crossover (also called recombination).
Grid Computing - an emerging architecture whereby many networked computers are used to parallel process work by packaging the work up into many small jobs.
J2EE - a multi-platform framework that provides software developers a huge number of pre-coded solutions for common tasks in the Java language.
Kendall Notation - a system for describing the characteristics of a queuing system. Letters are used to describe the shape of a distribution: M for Markovian, G for general. The first letter defines the job interarrival distribution and the second letter describes the service time distribution. Then a number is used to give the number of servers, so M/M/1 is a queue where job interarrival and service times have a Markovian distribution and there is a single server.
Linear Programming - in mathematics, linear programming (LP) problems involve the optimization of a linear objective function, subject to linear equality and inequality constraints.
Mean Value Analysis (MVA) - a technique for analysing closed multichain queuing networks.
.NET - a framework for the Windows operating system that provides software developers with a huge number of pre-coded solutions for common tasks. It supports development in multiple languages but C#.NET and VB.NET are the most popular.
PI Controller - in control theory, a controller with both proportional and integral feedback control. It is popular because it can have a nonzero constant value under steady-state conditions even when the error signal is zero.
Queuing Theory - the mathematical analysis of queues.
QoS - Quality of Service, in a software application sense, refers to non-functional attributes such as response time, availability, and reliability. QoS attributes are used to provide measurable constraints in an SLA.
SLA - a Service Level Agreement is a formal contract between an IT service provider and a service consumer.
Service-oriented Architecture (SOA) - "a style of multi-tier computing that helps organizations share logic and data among multiple applications and usage modes." [Natis and Schulte, 1996]
SOAP - a protocol for exchanging XML-based messages between software components.
Software as a Service (SaaS) - an element of Cloud Computing, SaaS is a model of software deployment whereby a provider licenses an application to customers for use as a service on demand. SaaS software vendors may host the application on their own web servers or download the application to the consumer device, disabling it after use or after the on-demand contract expires.
UDDI - an XML-based registry for listing the WSDL and URLs of Web services.
UML - Unified Modelling Language.
Utility Functions - utility is a measure of preference, expressed through utility functions. Utility functions assign numbers to members of a choice set in order to rank the choices.
URL (URI) - a unique identifier to the location of a Web service, application or site (on a corporate network or on the Internet).
Web Service - a Web service is commonly defined as a software service that uses WSDL to define its interface and SOAP envelopes for message exchange.
Workflow - a business process implemented as a composite Web Service comprised of a number of steps each consuming finer-grained Web Services.
Workload - in e-commerce terms, this is the rate at which requests are made to system resources.
WSDL - an XML format for describing the public interface of Web Services.
Chapter 1 Introduction
It is commonplace for organisations to automate complex business processes using service-oriented architectures (SOA) [Erl, 2008]. A service-oriented architecture is a distributed architecture that models components as services. It is built upon a collection of open standards, including Web Services Description Language (WSDL) [W3C, 2001], SOAP [W3C, 2007], WS-Security [OASIS, 2006a], WS-Policy [IBM, 2006], WS-ReliableMessaging [OASIS, 2006b], and Business Process Execution Language (BPEL) [OASIS, 2007]. A SOA encourages enterprise application integration and composite application development because it is intrinsically loosely coupled [Erl, 2008]. For example, Web Services can provide wrappers to applications built on legacy systems, allowing the functionality of those legacy applications to be integrated with new functionality, all of which is delivered via a single portal (Figure 1.1).
Figure 1.1 Service-oriented application integration. The legacy functionality of back-end systems is exposed via Web Services. An integration layer provides business process orchestration and the portal layer provides the user interface.
This type of composite application built using Web Services not only helps companies maximise their investment in legacy systems, but also helps streamline business processes. The business and technical benefits of such applications collectively offer what is often called a virtual enterprise or virtual organisation [Khoshafian, 2002] (Figure 1.2).
[Figure 1.2 diagram labels: Service Consumer; UDDI; Internet; SOAP, WS-Addressing, WSDL, WS-Policy; WS-Transaction, WS-Security; CRM; BPEL/OWL-S; Service Providers; Intranet; ERP; Internet Service Providers; Legacy Apps; In-house services; B2B Integration via On Demand Service Providers; Enterprise Application Integration]
Figure 1.2 A Virtual Enterprise (adapted from [Khoshafian, 2002]). A service-oriented architecture can flexibly integrate applications, functionality and data not only across legacy applications on the organisation's own intranet, but also across enterprise boundaries to consume external third-party services. Furthermore, the organisation can expose its own composite services to its own clients.
Cloud Computing allows SOAs to reach out across the globe, consuming software services from around the world, a concept usually referred to as Software as a Service (SaaS) [Lakshmanan, 2009]. The ability to automatically discover services, compose those services into a business process and invoke them as part of a workflow in an on-demand fashion opens up some of the most exciting features of dynamic e-business. The Universal Description, Discovery and Integration (UDDI) initiative provides an aggregation of metadata about registered services that consumers can dynamically query [OASIS, 2005]. Known as on-demand or utility computing, this next step differentiates itself from simply leasing a service by allowing the consumer to demand exactly what they want, when they want it, perhaps on a pay-per-use basis [IBM, 2004]. SOA provides the architectural platform for utility computing. The potential applications of dynamic, distributed, virtual applications like these in e-commerce scenarios are broad, and significant competitive advantage could be available to those organisations that make the best strategic use of these technologies. For example, just-in-time introduction of new processes could realise time and cost savings. Scalability could be improved through the use of alternative services during periods of high demand. Business processes and the supply chain could be optimized through the use of automation.
1.1 Motivation
A number of interesting scenarios have motivated this research. These are described below in order to explain the rationale. The primary motivator is to provide, at the application level, the ability to monitor, predict and optimize Quality of Service (QoS) metrics relating to the performance of the workflows implemented by the composite application. Note that we use the term workflow in this thesis to describe a business transaction implemented by a composite application built using Web Services. The Software Engineering Institute reported in their review paper on Service Level Agreements in Service-Oriented Architectures that one of the most important areas for further research is the need to understand and determine the QoS of composite services [Bianco et al., 2008]. Hence, the problems that we wish to address include the following potential scenarios:
Managing Overload Conditions: It is a common scenario when providing workflows to multiple clients that resources can become overloaded. Under these conditions it would be useful to be able to selectively admit or reject requests from clients based on some criteria that maximise the provider's profits or business objectives.
Performance Prediction: How can we predict the effect on performance of replacing one Web Service with another one of equivalent functionality, or of dynamically changing the steps in a workflow?
Performance Improvement: How can an organisation improve the performance of a composite application through effective use and exploitation of the resources available to it?
Service Level Agreement (SLA) Management: How can we define and manage SLAs for performance metrics in composite applications?
SLA Strategies: What strategies are there for improving our ability to meet SLA performance targets in a composite application?
1.2 Hypothesis
This thesis examines the following hypothesis: There exist solutions and strategies that will allow providers of composite applications built using Web Services to manage the QoS metrics of that application in such a way that they can ensure they meet SLA targets containing those metrics. We make no attempt to determine the best solutions in this thesis as the scope of the work involved would be too broad for a thesis. Instead we attempt to provide evidence that such solutions exist. As suggested by the Software Engineering Institute, this is itself extremely valuable. Within the financial services industry, in which the thesis author has worked since 1997, there is widespread belief among system management professionals that these are potentially intractable problems. This thesis aims to demonstrate that solutions do exist.
1.3 Methodology
It is the aim of this research to explore how the scenarios outlined above could be addressed at the application level through the construction of QoS-aware software components that could be offered as a generic management service in a typical composite application built using Web Services. Fundamental to this effort are two major pieces of work: firstly, the creation of a model for measuring and predicting the essential QoS metrics, response time and throughput; and secondly, the development of a methodology for efficiently solving the optimization problem that results. We will use a qualitative, empirical approach for the major pieces of software engineering involved, in which we will attempt to apply candidate solutions to a real insurance application built using Web Services. In electing to use this application we have chosen to follow an exploratory case-study methodology. The results of exploratory research such as this are not useful for decision-making by themselves, but they can provide significant insight into a given situation and this is therefore considered a good approach to address our hypothesis. We also believe that the composite application used in the study is very typical of a general class of applications used in the insurance and financial services industries. This view is based on the thesis author's many years' experience working as a technical architect in this sector. In selecting the software engineering aspects of this thesis we are attempting to generate ideas for a design space and evaluate our design choices through prototyping the proposed design solutions in real use with actual components of the case-study application.
1.4 Contributions
This thesis makes the following main contributions to the subject:
1. This is the first time that detailed test results have been published to prove that the Mean Value Analysis (MVA) algorithm can be applied to a queuing network description of a composite application using Web Services.
2. This thesis is the first published work to use MVA as the fitness function for a Genetic Algorithm (GA).
3. This thesis is the first published work to apply a GA to dynamic run-time QoS management of a real insurance application implemented as a set of Web Services across multiple servers. Previous published work on using GAs to optimize service composition has restricted itself to numerical simulations.
4. This thesis demonstrates strategies for meeting QoS targets under a number of different real-life overload conditions.
5. This thesis discusses improvements that could be made to existing SLA design methodologies and SLA languages to incorporate QoS metrics for composite applications.
1.5 Thesis Roadmap

Chapter 2 provides a background discussion to the issues of QoS in service-oriented architectures as well as introducing related work in the field to the two main components of the thesis: the model and the optimization methodology. In order to derive and use a model for a composite application using Web Services we need to undertake the following steps:
1. Define the performance requirements of the composite application.
2. Model the business demand of the composite application.
3. Build a performance model by characterising the workload of real systems.
Chapter 3 describes these steps as applied to a real insurance application and shows how a performance model was developed based on queuing network theory. The report then shows how the model can be used for adaptive control of a composite application using Web Services by addressing some simple performance prediction problems. In Chapter 4, a Genetic Algorithm is introduced for performance management of composite applications using Web Services. The key aspects of the GA are the chromosome encoding and the crossover strategy. It is shown how the GA can optimize the overall response times of workflows using the MVA model as a fitness function to identify whether the workflow suggested by each chromosome will meet the QoS targets defined. In Chapter 5 we demonstrate from an architectural perspective how the models and optimization techniques introduced in this thesis can be applied to workflows for enterprise applications built using service-oriented architectures that extend beyond the local enterprise and consume third-party services in the Cloud. Finally, in Chapter 6 we review the most recent proposals for SLA management of composite applications and identify areas where these proposals could be extended to include provision for the adaptive strategies described in the previous chapters.
Chapter 2 Analysis of the Problem

The Software Engineering Institute reported in their review paper on Service Level Agreements in Service-Oriented Architectures that one of the most important areas for further research is the need to understand and determine the QoS of composite services [Bianco et. al. 2008]. The same problems have been raised with respect to applications built using services in the Cloud by Panzieri et. al. [2010], who observe that QoS in clouds has not yet been sufficiently investigated but that there is growing interest in both industry and academia. In terms of the SOA methodology, composition of services allows the business to realize flexibility, reusability and adaptability of its software assets. However, the application must still meet such important QoS attributes as performance. Since the components may be provided by multiple stakeholders and the configuration could change at run-time, these are important additional issues to consider over a more traditional distributed architecture. Menasce [2002] first highlighted the need for a QoS definition in Web Services and identified the need to take into consideration both the needs of the service provider and the service consumer. QoS requirements for Web Services include the following [Yu et. al., 2007]: Performance, Reliability, Scalability, Transactions, Capacity, Accuracy and Integrity, Regulatory, Availability, Interoperability and Security.
Performance: Service time is the length of time taken by a service to provide a response to various types of requests [Bhoj et al, 2000; Chandrasekaran et al, 2002; Menasce, 2002; Agarwal et al, 2005]. Response time is the total time required to complete a service request [Mani and Nagarajan, 2002; Papazoglou and Georgakopoulos, 2003; Looker et al, 2004; D'Ambrogio, 2006].
Reliability refers to the capability of maintaining the service and service quality [Jin et al, 2002; Silver et al, 2003; Cardoso et al, 2004; Burstein et al, 2005].
Security refers to authentication mechanisms, message encryption and access control, confidentiality, non-repudiation and resilience to denial-of-service attacks [Sahai et al, 2002; Ran, 2003; Wang et al, 2004; D'Ambrogio, 2006].
Accessibility refers to the capability of satisfying a web service request [Gu et al, 2002; Mani and Nagarajan, 2002; Looker et al, 2004; Mathijssen, 2005].
Transactions relates typically to properties such as the transactional durability and consistency of results [Mani and Nagarajan, 2002; Menasce, 2002; Ran, 2003; Schmit and Dudstdar, 2005].
Capacity is the maximum number of concurrent requests that a server can process while guaranteeing performance, or the number of concurrent connections that is permitted by the service [Al-Ali et al, 2002; Ran, 2003; Mathijssen, 2005].
Accuracy and Integrity refers to the maintaining of correct and consistent interaction [Mani and Nagarajan, 2002; Papazoglou and Georgakopoulos, 2003; Looker et al, 2004].
Regulatory refers to conformance and compliance with rules, laws, standards and specifications [Mani and Nagarajan, 2002; Ran, 2003; Looker et al, 2004].
Availability is the time, as a percentage, that the composite application is available to service requests [Hu et. al. 2009].
Interoperability is the ability of the composite application to interoperate with systems in a way that is agnostic of the platform they run on or the programming language used to write them.
Many of these are now well addressed, for example, Security through WS-Security [OASIS, 2006a] and Interoperability [OASIS 2010]. Performance in the context of Composite Web Services remains a challenge, however [Dyachuk et. al. 2007]. This is reflected in the fact that within the SaaS (Software as a Service) industry many vendors, e.g. Amazon, only make SLA statements that cover availability and reliability. SLA assurances about performance metrics such as response times are not widely available. The thesis author raised this topic on the discussion forum of the SaaS group on the LinkedIn business networking site. Despite the fact that this group has almost 6000 members worldwide (as of December 2009), just two SaaS vendors voluntarily offered performance-related SLA metrics for their services. Of these two companies only one (Intacct) publicly displays those figures on its website. Within the financial services industry, the author has noted through her work as a consultant that many companies recognise this as a problem without any readily available automation solution. Instead, the state of the art today is to display the availability of each individual resource in a composite application on a large monitor that is visually inspected by Help Desk staff, whilst performance metrics of individual resources are only analysed offline on a periodic basis (daily, weekly) from web logs. There is no published literature on this issue from these companies as the subject is, for obvious reasons, commercially sensitive. Where solutions exist to monitor performance metrics and to pro-actively take remedial action, these are based primarily on the use of redundant virtual machines. Lodi et. al.
[2007] is an example of an approach using large-scale clustering of available Virtual Machines (VMs) and adaptive load-balancing that has been trialled on J2EE application servers. Two particular problems with this approach are that a large number of VMs may give rise to scalability problems in collateral subsystems (e.g. a shared database may become a bottleneck), and that VM allocation time may cause SLA violations. An alternative solution would be to make better use of the resources that are available. There are two aspects to this. Firstly, monitoring individual resources to capture live performance metrics and using a model of the composite application to understand, in real time, the impact on the workflows being executed. Secondly, using this data to automatically take remedial action where SLA targets are in danger of not being met. We attempt to address our hypothesis with a focus on these two pieces of software engineering. There have been many published strategies for modelling the QoS performance of Composite Web Services, for example, through the use of integer programming [Cardoso, 2002; Zeng, et. al., 2004; Gao et. al., 2005; Kelly, 2003], as a multiple choice knapsack problem [Yu, et al. 2007], probability theory [Hwang et. al., 2007], event-driven rule-based programming [Zeng et. al., 2010], numerical simulation [Silver et. al., 2003], game theory [Esmaeilsabzali et. al., 2005], layered queuing network models [D'Ambrogio et. al., 2007], fuzzy logic [Lin et. al., 2005; Diao et. al. 2002b, 2003], analytical models using queuing networks and hill-climbing algorithms [Menasce and Bennani 2003], utility functions based on simple queuing networks [Pacifici et. al., 2003], approximate Mean Value Analysis [Menasce et. al., 2004], exact Mean Value Analysis [Urgaonkar et. al. 2005a], job scheduling [Urgaonkar and Shenoy 2005], control theory [Abdelzaher et. al. 2001; Lu et. al., 2002; Diao et. al., 2002a; Lu et. al., 2004; Wand et. al., 2004] and genetic algorithms [Canfora et. al. 2005; Zomaya and Teh 2001; Page and Naughton 2006; Canfora et. al. 2008]. There is no published work that attempts to compare and contrast these different approaches.
Chapter 3 Related Work
3.1 Background
Although there has been an enormous amount of published material regarding SOA, the quality of service aspects have only more recently been addressed [Bichler and Lin, 2006]. Furthermore, the typical software QoS challenges of any Web Service [Mani and Nagarajan, 2002], namely availability, integrity, performance (throughput and latency), reliability, interoperability, regulatory factors and security, are compounded in more advanced SOA by their dynamic nature. Some of these issues are introduced below.
3.1.1 Discovery and Negotiation
For all but the simplest of agreements, some form of negotiation is required if services are to be consumed on demand [Stantchev et. al. 2009]. WS-Negotiation was proposed [Hung et. al., 2004] with the principal goals of describing a negotiation process and publishing an XML negotiation language through the use of Web Services architecture technologies. The proposed standard leaves the negotiation decision-making process to some internal algorithm that could be based on metrics such as price, service level objectives, or business policy. From the service provider's perspective, a decision must also take into consideration the resources that are available at the time. Many proposals have recently appeared regarding this complex problem. Suggested solutions include modelling the problem as a multi-constraint knapsack [Yu and Lin, 2005], as a fuzzy constraint problem [Lin et. al., 2005] or using integer programming models [Gao et. al., 2005].
3.1.2 Service Level Agreement
The first attempt at describing a machine-readable (i.e. XML) language for specifying service-level agreements was IBM's Web Service Level Agreement (WSLA) language [Ludwig et. al., 2003]. The rationale behind the specification [Keller and Ludwig, 2003] was the desire to create a flexible but formal language that could be applied end-to-end. WSLA is still the most actively cited SLA specification [Patel et. al. 2009] for researchers in the area of SOA and Cloud Computing. Another commonly cited specification is a joint proposal put forward via the Global Grid Forum, entitled WS-Agreement [Andrieux et. al., 2007]. However, WS-Agreement's primary focus is the management of Grid architectures [Foster et. al., 2004]. It does not, therefore, directly address all of the needs of SLA definition from a business user's perspective. Recently, modifications to WS-Agreement were proposed to allow it to better model composite business services [Di Modica et. al. 2009]. A much simpler approach is described by the Web Services Offering Language (WSOL) [Tosic et. al., 2003]. The objective of WSOL is to create a series of classes of service in a standard format that would sit alongside a service's WSDL file. The WSOL descriptions act as advertisements for a matchmaking engine to examine. The service consumer, via their matchmaking engine, selects the offering that is most appropriate. The WSOL team suggest that classes of service could differ in terms of usage privileges, priorities or response times. The value of WSOL is that it vastly simplifies the negotiation process. Additionally, the authors suggest that the management infrastructure required to support WSOL is also much simplified [Tosic et. al., 2004]. Many other SLA languages have been suggested [Greiner and Rahm, 2004; Tian et. al., 2004; Sahai et. al., 2002; Lamanna et. al., 2003], each with their own relative merits.
More recently these initial attempts at SLA definition have been enhanced to cater for composite applications. Two such examples are MoDe4SLA [Bodenstaff et. al. 2008] and COSMA [Ludwig et. al. 2008]. These two specifications will be described in more detail in the following chapter.
3.1.3 Service Provision
Resource provisioning is one of the biggest challenges for a service provider in an on-demand environment. Amongst the considerations are the identity of the client, the service being requested, the SLA, business policy, the measurement service, specific service provisioning operations, and the concurrent activities of requests from other clients [Dan et. al., 2004]. In an on-demand, e-commerce environment where requests are stochastic and the physical resources satisfying those requests behave non-linearly, are spread across multiple application tiers and could also be distributed around the globe, this is potentially a major exercise. Workflow management must be able to distinguish requests based on performance objectives [Dan et. al., 2003]. Service provisioning is at the heart of autonomic computing [IBM, 2003] and much of the recent research into these two fields is related. A primer on control theoretic techniques for resource provisioning can be found in Diao et. al. [2004]. A detailed overview of the work that has been conducted in this area is provided in Section 3.2.
3.1.4 Monitoring
The monitoring of QoS metrics must consider what to measure, how to measure, who does it (service provider, consumer etc), and where the measurements are taken [Menasce, 2004]. The task of monitoring is implicit to the task of provisioning and it is assumed that a provider will need to have in place mechanisms for efficiently collecting and storing resource metrics as a basis for any adaptive provisioning. However, it is also in the consumer's interest to undertake monitoring activities: the question of trust is one issue, but perhaps more importantly, the consumer might also be acting as a composite service provider to someone else. In this scenario, the service consumer must not only monitor the quality of third-party services he/she consumes but must also monitor the quality of his/her own offering. Not only could monitoring become a fairly complex activity, it could in itself become a resource-intensive activity. To this end, it has been proposed that a new breed of service provider could become a reality: one that exists to provide monitoring services independently of both service provider and consumer, easing trust issues and limiting performance penalties [Benjamin et. al., 2004]. An example of how monitoring and provisioning can be used to solve dynamic resource allocation problems is the WebQ framework developed by Patel et. al. [2004]. WebQ dynamically monitors QoS parameters from more than one provider of a given service. To begin with, the framework distributes requests equally across all of the providers. As the metrics database grows, the framework dynamically shifts load to the better performing services using a weighted algorithm. Since it continues to send some of its load to the slower services (these could, in fact, be test messages), the framework can re-adapt itself should the performance of the slower service improve again. The authors have used multi-level rule modelling in OWL-S to create a flexible framework that can manage complex QoS requirements involving large numbers of parameters.
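The load-shifting behaviour described for WebQ can be illustrated with a minimal sketch. This is not the WebQ implementation: the inverse-response-time weighting, the `min_share` floor that keeps some traffic flowing to slow providers, and all names are assumptions made purely for illustration.

```python
def routing_weights(mean_response_times, min_share=0.05):
    """Return the fraction of requests to send to each provider.

    mean_response_times: dict of provider name -> observed mean response time (s).
    min_share: hypothetical floor so that slow providers still receive probe traffic.
    """
    n = len(mean_response_times)
    if n * min_share > 1.0:
        raise ValueError("min_share too large for the number of providers")
    # Faster providers get proportionally more load (weight = 1 / response time).
    inverse = {p: 1.0 / t for p, t in mean_response_times.items()}
    total = sum(inverse.values())
    # Reserve min_share for every provider, share the remainder by inverse weight.
    spare = 1.0 - n * min_share
    return {p: min_share + spare * w / total for p, w in inverse.items()}
```

Because every provider keeps at least `min_share` of the traffic, fresh measurements continue to arrive for the slower providers, so the weights can shift back if their performance recovers.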
3.2 Adaptive Control of Web Applications and Services

Research into adaptive management has concentrated on two main modelling techniques: those using queuing theory [Kleinrock, 1976] and those using feedback control theory [Franklin et. al., 2002]. Combining the two, it is also possible to use queuing theory to derive the system model for a control theoretic approach. Derived from research into decision theory in artificial intelligence [Russell and Norvig, 2003], optimization problems can be approached using utility functions to mathematically model preference. Given a certain event, the action to take is determined by ranking all possible actions and choosing the action with the best expected outcome. An alternative is the use of genetic algorithms (GAs) [Holland, 1992]. GAs are adaptive techniques for search and optimization problems that were inspired by some of the processes involved in natural evolution and specifically the notion of survival of the fittest. The goals of published work on adaptive management fall into one of four categories:
1. Dynamically reconfigure available resources to optimise the throughput or response time of the current workload. This is a common goal of queuing related techniques and utility driven approaches.
2. Admission control through the rejection of excess requests. Ideally the controller should also attempt to service only the important requests, rather than rejecting requests at random. Both queuing and control theoretic techniques have used this approach.
3. Dynamically provision idle resources, if these are available. Queuing models have been used to predict when to do this, based on a demand threshold being exceeded.
4. Degrade the performance of admitted requests, possibly paying penalties to the client. Control theoretic techniques have been applied to providing relative guarantees between service classes, rather than absolute guarantees.
3.3 Queuing Theory

3.3.1 Dynamic Resource Configuration
An example of the use of queuing theory for adaptive resource configuration is Welsh's SEDA architecture [Welsh et. al., 2001]. SEDA applications consist of event-driven stages connected by queues. Dynamic resource controllers keep stages within their operating regimes during load changes via thread pool management. A stage is a self-contained component consisting of an event handler, a simple incoming event queue and a thread pool. The handler processes events and dispatches them onto successive stages. Multiple stages can share the same thread pool; hence the architecture can dynamically adjust the pool based on the load at each stage. SEDA can be found at the heart of the Mule [Mule, 2010] open source message bus. Dynamic thread adaptation was shown to be effective in dealing with bursty Internet traffic but had limitations when it came to dealing with overload conditions. Menasce and Bennani [2003] used an analytical performance model to design controllers that run periodically to determine the best current resource configuration of a web server given its current workload. The authors used a QoS controller that monitors system performance, including the utilisation of server resources, and periodically executes an algorithm to determine the appropriate reconfiguration commands. Data is collected from metrics such as CPU utilisation, which allows the current service demand to be calculated as the ratio of the resource utilisation to the system throughput. Mean Value Analysis (MVA) [Lazowska et. al., 1984] is used to create a model in which average response time, the probability of rejection, and average throughput can be predicted. In this particular paper, the network model is extremely simple, assuming only that an incoming request is serviced by one of m threads. When all m threads are busy, the request is rejected. A hill-climbing search algorithm is used to find a close-to-optimal configuration by constantly re-applying the algorithm to all possible configurations. Pacifici et. al. [2003] extended this work to clusters of servers, supporting multiple classes of web traffic. The content of the inbound request's SOAP header is examined to determine its class of service and the server farm is partitioned into clusters, each one managing different classes of traffic.
A utility function is defined for each class of traffic, which is simply a construct to weight the deviation of the actual response times from the desired response time. A combined utility function is also derived to calculate how to allocate tasks across all of the resources in the farm. In this work, a traffic class set is created for each tuple <customer, service, operation, grade>. An M/M/1 queuing model (in Kendall notation) is used to predict the average response time. The dynamic model is used to allocate resources and dynamically load-balance work across the available resources.
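Two relationships mentioned in this section can be written down directly: service demand as the ratio of utilisation to throughput, and the mean response time of an M/M/1 queue. The sketch below uses the standard textbook forms; the function names are illustrative and are not taken from the cited papers.

```python
def service_demand(utilisation, throughput):
    """Service Demand Law: D = U / X (seconds of resource time per request)."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return utilisation / throughput


def mm1_response_time(arrival_rate, service_time):
    """Mean response time of an M/M/1 queue: R = S / (1 - rho), rho = lambda * S."""
    rho = arrival_rate * service_time
    if rho >= 1.0:
        raise ValueError("queue is unstable (utilisation >= 1)")
    return service_time / (1.0 - rho)
```

For example, a CPU measured at 60% utilisation while the system completes 30 requests per second has a service demand of 0.02 s per request; feeding that demand and arrival rate into the M/M/1 formula predicts a mean response time of 0.05 s.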
3.3.2 Admission Control
Menasce et. al. [2004a] used an analytical model to make real-time admission control decisions. Every time a new request is received the performance network model, using approximate MVA, solves a closed loop queuing network. An algorithm then determines whether the request can be serviced or not based on the current commitments and the possible solutions suggested by the model. Each client session is modelled as an individual class. Urgaonkar and Shenoy [2005] discuss a policy mechanism that emphasises the need to ensure that the policy mechanism itself does not create a significant performance overhead. Requests are mapped to a service class and then scheduled either FIFO (first-in-first-out) or shortest job first. Requests of a lower class are deliberately delayed. This prevents them from denying access to more important requests. Requests of a higher class are subject to the admission control tests first. If the highest class fails, there is obviously no need to test lower classes. Requests are admitted so long as the system believes it has sufficient capacity to meet the SLA. Furthermore, batching requests reduces policing overhead. Buckets are defined in each class, with a range of service times. All requests in a bucket are then treated as equal. When admission control is invoked it considers each non-empty bucket in the class it is testing and conducts an all-or-nothing test on those requests. A predictive technique is also used to further reduce overhead. The number of requests to admit can be pre-computed if there is a good estimate of how many requests will arrive in the next time interval.
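The class-ordered, bucketed admission test described above can be sketched as follows. This is a simplified illustration and not the algorithm of Urgaonkar and Shenoy: the tuple layout, the single `capacity` budget of service time for the next interval, and the rule for stopping at the first refused class are assumptions.

```python
def admit(buckets, capacity):
    """All-or-nothing admission of request buckets, most important class first.

    buckets: list of (class_priority, service_time_per_request, request_count);
             a lower class_priority number means a more important class.
    capacity: service time (s) assumed to be available in the next interval.
    Returns the list of admitted buckets in the order they were tested.
    """
    admitted = []
    refused_class = None
    # Test important classes first; within a class, cheaper buckets first.
    for bucket in sorted(buckets, key=lambda b: (b[0], b[1])):
        priority, service_time, count = bucket
        if refused_class is not None and priority > refused_class:
            break  # a more important class was refused, so skip lower classes
        cost = service_time * count
        if cost <= capacity:
            admitted.append(bucket)  # the whole bucket fits
            capacity -= cost
        else:
            refused_class = priority  # the whole bucket is refused
    return admitted
```

Treating each bucket as a unit means one comparison is made per bucket rather than per request, which is the source of the reduced policing overhead.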
3.3.3 Dynamic Provisioning of Idle Resources
Urgaonkar and Shenoy [2005] also use a G/G/1 queuing model in conjunction with online measurements to determine the need to replicate applications across idle, virtual servers if the number of requests gets so high that a threshold is breached. This threshold is simply the known bound on the job arrival rate of a G/G/1 queue [Kleinrock, 1976].
3.3.4 Extending to Multiple Tiers
Urgaonkar et. al. [2005a] have specifically considered the problems associated with tackling bottlenecks in a multi-tier distributed application. They show that independent per-tier provisioning is not sufficient as it can fail to capture the way in which bottlenecks can shift across tiers. In a related paper, Urgaonkar et. al. [2005b] present a multi-tier model based on MVA. The model deals with scenarios such as a single request in the web tier spawning multiple tasks on the application tier through the use of closed loops that create multiple visits to each resource. They also deal with long-lived sessions using an infinite queue at the front of the model, which also serves as the re-entry point for requests that have been completed, thus forming a completely closed loop. This models think time at the client. The model uses an exact MVA algorithm. They suggest it can be extended in several ways: to deal with scenarios where service times increase with load, where resources are replicated on the same tier (load-balancing), for overload conditions at a given tier causing dropped requests, and for multiple session classes, but provide no specific details. Liu et. al. [2005] developed an approximate MVA model for a three-tiered architecture that uses a multistation queuing centre to model the ability of web servers to multi-thread incoming requests.
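To make the idea of a closed-loop model with client think time concrete, the following is a minimal sketch of the textbook exact MVA recursion for a single class of customers. It is not the multi-tier model of Urgaonkar et. al.: each tier is reduced to one load-independent queuing centre, and the function and parameter names are illustrative.

```python
def exact_mva(demands, think_time, customers):
    """Exact single-class MVA for a closed network of queuing centres.

    demands: service demand (s) at each queuing centre (visits x service time).
    think_time: mean client think time (s), modelled as a delay centre.
    customers: number of concurrent sessions circulating in the closed loop.
    Returns (throughput, response_time, queue_lengths).
    """
    queue = [0.0] * len(demands)  # mean queue lengths with zero customers
    throughput, response = 0.0, 0.0
    for n in range(1, customers + 1):
        # Arrival theorem: an arriving job sees the network with n - 1 customers.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        # Response time law for a closed system with think time.
        throughput = n / (think_time + response)
        # Little's law applied to each centre.
        queue = [throughput * r for r in residence]
    return throughput, response, queue
```

With a single customer there is no queuing, so the predicted response time is simply the sum of the demands; as the population grows, throughput approaches but never exceeds the reciprocal of the largest demand, which identifies the bottleneck tier.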
3.4 Control Theory

3.4.1 Admission Control
The first published work on the use of control theory for admission control appears to be Abdelzaher et. al. [2001]. In this paper, the authors attempted to keep the utilisation of a web server at a fixed percentage where the web server was known to achieve optimum performance. A simple linear expression was derived relating the utilisation to the number of admitted job requests and the bandwidth of pages being served. A PI controller was used. Also using a PI controller, Kihl et. al. [2003] used an M/G/1 queue for their model, with a non-linear approximation for the utilisation of the server expressed in terms of the number of requests in the system and the service time distribution. Lu et. al. [2002] presented an approach using two SISO (single-input single-output) controllers. The controlled variables were the deadline miss ratio and the CPU utilisation. The adaptive system is characterised in terms of the following performance metrics: stability (the miss ratio and utilisation are bounded at all times), transient state response (overshoot and settling time), steady-state error and sensitivity to workload variations. As an alternative to using multiple SISO controllers, Diao et. al. [2002a] constructed a true MIMO (multiple-input multiple-output) controller. They controlled the Keep Alive and Max Clients parameters of an Apache Web Server in order to optimise its CPU and memory utilisation. They conclude that MIMO design techniques such as the Linear Quadratic Regulator (LQR) [Franklin et. al., 2002] are beneficial for balancing design trade-offs.
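The role a PI controller plays in utilisation-based admission control can be shown with a generic discrete-time sketch. It does not reproduce the controller of any of the papers cited above: the gains, the sampling interval and the interpretation of the output as a change in admitted request rate are all assumptions.

```python
class PIController:
    """Discrete PI controller that nudges the admitted request rate."""

    def __init__(self, kp, ki, target_utilisation):
        self.kp = kp                      # proportional gain (assumed, untuned)
        self.ki = ki                      # integral gain (assumed, untuned)
        self.target = target_utilisation  # utilisation set point, e.g. 0.8
        self.integral = 0.0               # accumulated error

    def update(self, measured_utilisation, dt=1.0):
        """Return the change to apply to the admitted rate for this interval."""
        error = self.target - measured_utilisation
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral
```

A negative output means the server is above its set point and fewer requests should be admitted in the next interval. The integral term removes the steady-state error that a purely proportional controller would leave.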
3.4.2 Degraded Service
An alternative approach is to degrade the service levels of admitted requests. Instead of offering customers absolute delay guarantees, Lu et. al. [2001] describe an approach that offers a differentiated service. Only the relative delay between two service classes is guaranteed, e.g. the ratio of gold response time to silver response time. They point out that under conditions of heavy load many of the reported approaches will only ensure better service to premium customers, but do not provide any guarantees as to how much better the service will be. Their proportional delay model specifies a fixed ratio between the delays seen by each service class. They also introduce a hybrid policy, one that uses proportional delay in normal operating conditions and switches to absolute delay under very heavy load. This is because extreme load could lead to very long response times even for the premium customers if the target is simply to maintain a fixed proportional delay. They show how the relative delays can be used as the control variable for a proportional feedback controller.
3.4.3 Extending to Multiple Tiers
Lu et. al. [2004] extended control theoretic techniques to multi-tier distributed systems. Their paper presents the EUCON (End-to-end Utilization CONtrol) algorithm, which adaptively manages CPU utilisation using feedback control and a MIMO model predictive controller. They point out that most papers on feedback control methods assume a single CPU operating on a single task, while most applications consist of tasks spawning multiple other tasks and are deployed on multi-CPU platforms. The performance of one task is coupled to the performance of other tasks. Changing the rate of one task affects the utilisation of dependent tasks on the processors that they are using. This paper derived a dynamic model to capture coupling amongst processors, developed a model predictive controller approach for QoS control and designed a distributed MIMO feedback control loop. When the number of servers is large, the overhead of a centralised controller could become significant. For this reason, an enhanced version, DEUCON, was presented in [Wand et. al., 2004]. This is the distributed controller version of EUCON. A peer-to-peer control structure and localised utilisation control algorithm are used, based on distributed model predictive controller theory, where a controller for each CPU cooperates only with local neighbours, i.e. only those that are executing sub-tasks. Simulation results show that the overhead is much lower than that of a centralised solution.
3.4.4 Fuzzy Controllers
One of the major limitations of control theoretic approaches is the need to derive a suitable model of the system. Diao et. al. [2002b, 2003] demonstrated that fuzzy controllers offer significant advantages. They defined a set of simple business related metrics to describe revenue, cost (penalty), and profit and then adapted their MIMO controller [2002a] to use a set of fuzzy rules, such as:
IF change_in_MaxUsers IS neglarge AND change_in_profit IS neglarge THEN next_change_in_MaxUsers IS poslarge
They show that a PI controller achieves better results in the region of the workload for which it was designed, but for all other workloads, the fuzzy controller outperforms it. It is frequently the case, in any engineering discipline, that the derivation of a suitable model [Franklin et. al., 2002] can be a challenging task. In the case of complex distributed IT systems, the results presented demonstrate that it is particularly challenging. The conclusion is that the less rigorous demands of deriving a fuzzy model mean that this could be an extremely valuable approach.
3.5 Combined Approaches

Although control theory has been successfully used to provide improvements in the throughput and response times of web applications, the technique is limited due to the highly non-linear behaviour of such systems. Queuing models, on the other hand, are very good at modelling these systems due to their statistical approach. Liu et. al. [2006] applied a simple queuing model to an adaptive control algorithm [Astrom and Wittenmark, 1994] to demonstrate its applicability as an admission control in an overloaded web site. They compare their technique with three other approaches:
A queuing model only
Adaptive control only
A queuing model with a PI controller (an approach proposed by Kamra et. al. [2004]).
They show how their approach provided the smallest difference between the target response time and the actual response time. Their controller did not exploit their previous work using MVA [Liu et. al., 2005] and they express their intent to extend it with this in mind.
3.6 Solving Optimization Problems

3.6.1 Utility Functions
Utility functions have been commonly used in artificial intelligence (AI) as a means of expressing preference. Recent research has begun to explore their application to self-optimisation problems in autonomic computing [Walsh et. al., 2004]. Utility is the measure of the desirability of an outcome. It is usually measured in terms of the cost, benefit or risk of an action. A utility function assigns a cardinal number to the desirability of an outcome and can depend on one or more dimensions. These could be related to business level objectives as well as service level objectives. Expected utility is the combined utility of combinations of actions. By defining the optimal decision to be when the maximum expected utility is achieved, the regret of a decision can be defined as the difference between the maximum expected utility and the actual expected utility. A common AI algorithm known as minimax attempts to minimise the maximum possible regret [Wang and Boutilier, 2003]. One of the key advantages of this approach is that decisions can be taken in the absence of a complete description of the constraints that define a utility function [Boutilier et. al., 2004]. The methodology has been demonstrated in an autonomic, self-optimising application architecture at IBM [Tesauro et. al., 2004].
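The minimax regret idea described above can be illustrated with a minimal sketch. It assumes a small table of utilities in which each action is scored under a handful of scenarios; the scenarios stand in for the candidate utility functions that remain feasible, and the action names, scenario names and numbers are hypothetical values chosen only for illustration, not results from the cited papers.

```python
def minimax_regret(utilities):
    """Pick the action whose worst-case regret is smallest.

    utilities[action][scenario] is the utility of taking `action`
    if `scenario` turns out to be the true one.
    """
    scenarios = list(next(iter(utilities.values())))
    # Best utility achievable in each scenario (the "maximum" in the regret definition).
    best = {s: max(u[s] for u in utilities.values()) for s in scenarios}
    # Worst-case regret of each action across all scenarios.
    worst = {a: max(best[s] - u[s] for s in scenarios) for a, u in utilities.items()}
    choice = min(worst, key=worst.get)
    return choice, worst[choice]


# Hypothetical admission-control actions scored under two load scenarios.
utilities = {
    "admit_all": {"light_load": 10, "heavy_load": 2},
    "admit_gold_only": {"light_load": 6, "heavy_load": 7},
    "reject_all": {"light_load": 0, "heavy_load": 0},
}

print(minimax_regret(utilities))  # ('admit_gold_only', 4)
```

In this example no action is best in both scenarios, so the sketch selects the action that gives up the least in its worst case.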
3.6.2 Integer Linear Programming
Several papers have been published that propose the use of Integer Linear Programming (IP) methods for web service composition and resource allocation problems [see for example Gao et. al., 2005 and Kelly, 2003]. In terms of the current discussion, the most relevant is the work of Zeng et. al. [2004] who have applied IP to the problem of finding an optimal execution plan for a sequence of tasks in a composite web service. IP problems are a form of linear programming where the variables are integers (usually 0 and 1). In this case, the variables are 1 if a service x can execute task y and 0 otherwise. The objective function is a linear weighted calculation of the QoS using parameters such as price, availability, service time etc. IP attempts to maximize or minimize the value of the objective function by adjusting the values of the variables while enforcing any known constraints. The output of an IP problem is the maximum (or minimum) value of the objective function and the values of variables at this maximum (minimum). As the authors discuss, though, IP has a large computation cost, especially as the number of services and tasks increases, because IP problems are generally NP-hard.
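To make the shape of such a problem concrete, the following sketch selects one service per task so as to minimise a linear weighted objective subject to a price constraint. It is not an IP solver: it simply enumerates every 0/1 assignment, which is only feasible for tiny instances and illustrates why the computation cost grows so quickly. The task names, service names, QoS values and weights are all hypothetical.

```python
from itertools import product

# Hypothetical candidate services for each task, with illustrative QoS values.
CANDIDATES = {
    "quote": {"s1": {"price": 2, "time": 3}, "s2": {"price": 1, "time": 5}},
    "book": {"s3": {"price": 4, "time": 2}, "s4": {"price": 2, "time": 5}},
}
WEIGHTS = {"price": 1.0, "time": 2.0}


def best_plan(candidates, weights, max_price=None):
    """Return (objective value, {task: service}) minimising the weighted sum."""
    tasks = list(candidates)
    best = None
    # Each tuple in the product assigns exactly one service to every task.
    for plan in product(*(candidates[t] for t in tasks)):
        price = sum(candidates[t][s]["price"] for t, s in zip(tasks, plan))
        time = sum(candidates[t][s]["time"] for t, s in zip(tasks, plan))
        if max_price is not None and price > max_price:
            continue  # constraint violated, plan is infeasible
        cost = weights["price"] * price + weights["time"] * time
        if best is None or cost < best[0]:
            best = (cost, dict(zip(tasks, plan)))
    return best


print(best_plan(CANDIDATES, WEIGHTS))               # unconstrained optimum
print(best_plan(CANDIDATES, WEIGHTS, max_price=5))  # optimum under a price cap
```

The number of plans enumerated is the product of the number of candidates per task, so adding tasks or services multiplies the work.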
3.6.3 Genetic Algorithms
Whilst GAs have been applied to multiple and diverse applications [Goldberg, 1989], their use in software systems optimization problems is currently quite limited, focussing primarily on the rather different problem of Job Shop Scheduling [see for example Mahmood, 2000, Fayad and Petrovic, 2005, Petrovic and Fayad, 2005, Montana et. al., 1998, Wang et. al. 1997]. An exception is the work of Canfora et. al. [2005] who attempt to solve an optimization problem for a set of Web Services comprising a complex workflow using a GA. To evaluate the fitness of their solution, they use a weighted combination of parameters including Availability, Response Time, Cost and Reliability, although there is no discussion of how these parameters might be evaluated in real-time, based on real measurements. Using a numerical simulation, they do, however, provide an interesting comparison with the Integer Programming method of Zeng et. al. described above, and demonstrate that the GA provides a faster solution as the number of Web Services increases. In a related paper [Canfora et. al., 2004] the authors take a step towards considering the dynamic use of GAs for service composition by discussing the question of service re-planning, and propose adding a trigger to the workflow engine to re-evaluate the optimum service composition. A similar piece of work was presented by Jaeger and Mühl [2007] who also use numerical simulation to describe the effectiveness of GAs in this problem domain. They provide detailed results comparing the impact of different parameters (e.g. mutation rate, fitness function) on the optimisation capability of the genetic algorithm. From the world of task scheduling, two papers provide useful background into the problem of using a GA in a dynamic scenario. Zomaya and Teh [2001] used a cycle crossover to load balance a discrete set of tasks over a set of resources, whilst Page and Naughton [2006] added a heuristic on the mutation operator to improve the performance.
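As a rough illustration of how a GA can be applied to service selection, the sketch below evolves a plan in which gene i is the index of the service chosen for task i. It is a generic GA with elitist selection, single-point crossover and per-gene mutation, not the algorithm of any of the cited papers; the fitness uses only response time and cost, and the candidate values, weights and parameter defaults are hypothetical.

```python
import random

# Hypothetical candidates: OPTIONS[i] lists the services able to run task i.
OPTIONS = [
    [{"time": 3.0, "cost": 2.0}, {"time": 5.0, "cost": 1.0}],
    [{"time": 2.0, "cost": 4.0}, {"time": 4.0, "cost": 2.0}, {"time": 6.0, "cost": 1.0}],
    [{"time": 1.0, "cost": 5.0}, {"time": 2.0, "cost": 2.0}],
]
WEIGHTS = {"time": 1.0, "cost": 1.0}


def fitness(plan, options=OPTIONS, weights=WEIGHTS):
    """Weighted sum of response time and cost for one plan (lower is better)."""
    return sum(
        weights["time"] * options[i][g]["time"] + weights["cost"] * options[i][g]["cost"]
        for i, g in enumerate(plan)
    )


def evolve(options=OPTIONS, weights=WEIGHTS, pop_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    """Return (best plan found, its fitness); a seeded RNG keeps runs repeatable."""
    rng = random.Random(seed)
    n = len(options)
    population = [[rng.randrange(len(options[i])) for i in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda p: fitness(p, options, weights))
        parents = population[: pop_size // 2]           # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0    # single-point crossover
            child = a[:cut] + b[cut:]
            for i in range(n):                           # per-gene mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(len(options[i]))
            children.append(child)
        population = parents + children
    best = min(population, key=lambda p: fitness(p, options, weights))
    return best, fitness(best, options, weights)


plan, score = evolve()
print(plan, score)
```

Because the fitter half of each generation is carried forward unchanged, the best fitness in the population never gets worse from one generation to the next, although a GA gives no guarantee of reaching the global optimum.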
3.7 Concluding Remarks

In conclusion, there is a large body of previous work to draw inspiration from. Approaches using control theory appear to be difficult to apply to distributed systems, as is evidenced by the complexity of the DEUCON model [Wand et. al., 2004]. Statistical approaches using queuing theory have proven successful in related disciplines such as network performance analysis and telecommunications queuing. For this reason, the queuing model approach is used in this thesis.
In terms of choosing an optimization strategy, the discussion has focussed on general search and optimization techniques rather than heuristic methods confining themselves to a narrow domain. The results of Canfora et. al. [2005] suggest that GAs offer a promising candidate, particularly compared with Integer Programming. GAs are traditionally strong in problem spaces where heuristic approaches are too complex to be practical. Weise et. al. [2007], in a review of web service composition challenges, conclude that "especially in practical applications, additional requirements will be imposed onto a service composition engine. Such requirements could include quality of service (QoS) ... or the generation of complete BPEL processes ... In this case, heuristic search will most probably become insufficient but genetic algorithms and genetic programming will still be able to deliver good results".
3.8 Publications
Parts of this chapter were published in the Software Quality Journal:
Shelly Saunders, Margaret Ross, Geoff Staples, and Sean Wellington, 2006. The Software Quality Challenges of Service Oriented Architectures in e-Commerce, Software Quality Journal 14 (1) 65-76, March 2006.
This article had been cited at least 11 times by the start of 2011. It was also presented at the SQM 2005 conference as:
Shelly Saunders, Margaret Ross, Geoff Staples, and Sean Wellington, 2005. The Software Quality Challenges of Service Oriented Architectures in e-Commerce, In: Current Issues in Software Quality, Thirteenth International Conference on Software Quality Management (SQM 2005), pp. 87-100.
Chapter 4 Designing SLAs for Composite Applications
4.1 Service Level Management

In this chapter we consider in more detail the design of SLAs for composite applications. Our primary interest lies in the end-to-end QoS that we are able to offer the customer of our enterprise applications. That means we need to consider not only the SLA with the customers, but also internal SLAs and SLAs with third-party suppliers. In fact, there will be a tier of SLAs involving not just software services but also network services. However, our focus will be on those parts of the SLAs which deal with statements about the levels of service quality on which the service requestor and the providers reach an agreement, and we will concern ourselves only with software services. We are also interested in being able to identify external vendors who have caused us to violate SLA terms with our own customers. The SLA management process needs to ensure that we are able to identify the causes of our own SLA failures and provide evidence to the vendor concerned if penalty payments are due to us. In general terms we propose that SLAs will be managed based on the enterprise SLA process as defined by the TeleManagement Forum and the Open Group [TMForum, 2004] due to the widespread acceptance of the TMForum's work in this regard. Our goal is a Service Level Agreement with the enterprise application's clients that is as simple as possible whilst being complete enough to define all expectations of the quality of service to be delivered as well as what happens when the application fails to meet those expectations. In a general enterprise application with multiple services, the end-to-end Service Level Agreement is composed of the following relationships [TMForum, 2004]:
[Figure 4.1 diagram labels: Application (KQI); Service Level Agreement; Service Performance Indicators (KPI); Service Level Monitoring; Monitoring Instrumentation]
Figure 4.1 Service Level Management
Key Quality Indicators (KQI) of the application are derived from the performance metrics of the underlying composite services. These performance metrics are known as Key Performance Indicators (KPI). For each service these will be obtained from monitoring instrumentation which is the core of the whole process.
4.1.1 Service Monitoring
Many Cloud vendors who offer Web Services that can be composed into composite enterprise applications are only just beginning to provide actual data about the performance of their services.
Even where this data is provided, the question of trust will always be an issue. Furthermore, the complete round trip time of a particular service from a particular vendor is also dependent on multiple elements lying between the composite application and the service itself, including ISPs hardware, communication links etc. For these reasons, we propose that even if all services provide their own performance metrics, service monitoring is also performed centrally. In our work we have used the Enterprise Service Bus to achieve this. Apart from raw data obtained from live monitoring of data, we can also use information from service registries, test data, SLA statements on contracted QoS values and feedback from other service consumers. However, the most weight should be put on the service execution history data as the most reliable source of information. This process has been termed service profiling [Abramowicz et. al., 2006].
4.1.2 Key Quality Indicators and Key Performance Indicators
Whilst there are many QoS metrics applicable to services [Kritikos and Plexousakis, 2009] the Key Quality Indicator of interest to this thesis is primarily the end-to-end time required to execute a particular workflow. This KQI can be mapped directly into the SLA. The KQI is derived from KPIs of each service used by that workflow. The KPIs we are interested in are the execution times of each service call. For differentiated services based on priority sessions, we would also be interested in the cost of each service as another KPI. Further, from a business intelligence perspective, understanding the costs involved in operating a composite service is also very desirable. Once KPIs are defined, we can generate our end-to-end workflow execution time KQIs from the KPIs, for example, using the techniques described by Menasce [2004]. In the example of Figure 4.2, Service A invokes B with probability $p_1$ and it invokes C with probability $p_2 = 1 - p_1$. Likewise C invokes D with probability $p_3$ and it invokes E with probability $p_4 = 1 - p_3$. Finally F is invoked when either D or E finishes, or when B finishes. In this example the total execution time, $T$, is given by:
$$T = t_A + p_1 t_B + p_2 (t_C + p_3 t_D + p_4 t_E) + t_F$$
where $t_X$ is the execution time of service X and each $p_i$ is the probability of that execution path being chosen.
Figure 4.2 Composite Web Services
Likewise, the total cost will have exactly the same form. The KQI is based on the value of $T$ that the application is expected to meet. Likewise KPIs are based on values of $t_n$ that each service is expected to be able to meet. Since the KQI is a composite measure each individual KPI can be defined with a certain degree of tolerance. An individual service could exceed its KPI whilst the overall application execution time remains within its SLA targets. This allows us to add flexibility to the KPIs by adding performance thresholds. There could be a warning threshold as well as an error threshold.
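The expected end-to-end time for the workflow of Figure 4.2, together with the warning and error thresholds suggested above, can be sketched as follows. The service times, branch probabilities and threshold values are hypothetical, and the function names are illustrative only.

```python
def expected_total(t, p1, p3):
    """Expected end-to-end time for the workflow of Figure 4.2.

    t maps service name ("A".."F") to its execution time; p1 is the
    probability that A invokes B, p3 the probability that C invokes D.
    """
    p2 = 1 - p1  # A invokes C
    p4 = 1 - p3  # C invokes E
    return t["A"] + p1 * t["B"] + p2 * (t["C"] + p3 * t["D"] + p4 * t["E"]) + t["F"]


def kpi_status(observed, warning, error):
    """Classify an observed KPI value against its two thresholds."""
    if observed >= error:
        return "error"
    if observed >= warning:
        return "warning"
    return "ok"


# Hypothetical service times in seconds.
times = {"A": 1, "B": 2, "C": 1, "D": 4, "E": 2, "F": 1}
T = expected_total(times, p1=0.5, p3=0.5)
print(T)                                          # 5.0
print(kpi_status(T, warning=4.5, error=6.0))      # warning
```

The same function can be used for cost by passing per-service costs in place of execution times, since the cost expression has the same form.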
4.2 Service Level Agreement Design

4.2.1 COSMA
The collection of service execution data and its use in the definition of KPIs has also been suggested as an important method of service profiling of composite services [Ludwig et. al. 2009a] based on COSMA, an approach for managing SLAs in composite services [Ludwig et. al. 2008]. The concept behind COSMA is the integration of contractual information from atomic services into a composite SLA management document which is written in XML. COSMA differs from previous proposals for XML SLA languages such as WSLA [Ludwig et. al. 2003], WSOL [Tosic et. al., 2003] and WS-Agreement [Andrieux et. al., 2007] because these prior offerings focussed on bilateral agreements between service requesters and providers. COSMA, while based on WS-Agreement, focuses on the contribution of third-party atomic services in the management of a provided composite service and is, therefore, much more suitable to the goals and intentions of this thesis. COSMA consists of the following parts:
COSMAdoc: an information model for defining contractual data, service SLA management data and the dependencies between services
COSMAframe: a conceptual framework to describe the management of the composite SLA lifecycle
COSMAlife: SLA management practices that use COSMAdoc instances to cover the phases of the SLA lifecycle
COSMAdoc is the core of COSMA and consists of a set of SLA documents. For each composite service a new COSMAdoc instance is created. Referring to Figure 4.3, at the top of each COSMAdoc is a Header element that defines basic information such as the service description, version etc. The ServiceComposition element expresses the relationships and structure between all involved services. The SlaSetAssembly element captures individual SLAs for the individual services involved. Each Sla element is based on and extends the WS-Agreement model. The SlaSetUsageValidation and the SlaSetDataValidation sections allow the XML Sla elements to be connected with SLA management data. They can be used to provide specific constraints on the SLA elements in the SlaSetAssembly. Composition-specific aggregation formulas are defined in the AggregationFormulas section of the COSMAdoc instance, and referenced by the SlaSetDataValidation and SlaSetUsageValidation sections.
Figure 4.3 Simplified COSMAdoc schema [from Ludwig et. al., 2008]
4.2.2 MoDe4SLA
MoDe4SLA is an approach to understanding the dependencies in a composite service [Bodenstaff et. al. 2008]. Like COSMA it seeks to define the vertical dependencies between atomic services that make up the composite service being provided. However, unlike COSMA it does not define a language but instead focuses on important conceptual issues in monitoring SLAs. Like ourselves, the authors identify response time and cost as two of the most important metrics to monitor. We, therefore, find it very useful to review MoDe4SLA and compare it with the work we have presented in this thesis so far. The MoDe4SLA approach begins with a dependency model which for our purposes is similar to what we have produced already in Figure 4.2. We can produce models of this sort not only for response times but also for cost dependencies. Next the approach advocates that we analyse the impact the dependent services have on the composite service. An example is a service that is called repeatedly. If a workflow calls service A three times and its response time is 3 seconds and it calls service B once and its response time is 4 seconds, we could represent the impact on the workflow of service A as 3 x 3 s = 9 s and the impact of service B as 1 x 4 s = 4 s. Additional measures of impact might also be desirable. We mentioned in section 5.3.4 that some services could have a far greater impact on our composite service than other services, for example, if only one external vendor could supply that service. The MoDe4SLA approach does not cover this kind of scenario so we propose that a uniqueness impact is also derived for each service. If a service can only be sourced from one location it has a uniqueness of one. If we can source the service from 2 locations it has a uniqueness of 0.5. In both the impact derivation and the uniqueness, the important thing to understand is that we are at this stage simply creating a method which allows us to rank services as being important or less important to us in meeting our own service level objectives. The actual values are of no importance as long as we are consistent with how we derive them. Note also that MoDe4SLA was extended in a recent paper [Bodenstaff et. al., 2009a] to study availability as a metric alongside response time and cost, rather than as an impact. This is also an important consideration, especially if it is considered in conjunction with uniqueness.
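The impact and uniqueness measures described above can be sketched in a few lines. Impact is taken as the number of calls multiplied by the response time, as in the worked example. Uniqueness is taken as the reciprocal of the number of sources, which is an assumption that matches the two values given in the text (one source gives 1, two sources give 0.5). The number of sources assigned to services A and B below is hypothetical.

```python
def impact(calls, response_time_s):
    """Impact of a service on the workflow: calls multiplied by response time."""
    return calls * response_time_s


def uniqueness(num_sources):
    """Assumed form: 1 for a single source, 0.5 for two sources, and so on."""
    return 1 / num_sources


# Calls and response times from the worked example; sources are hypothetical.
services = {
    "A": {"calls": 3, "response_time_s": 3, "sources": 1},
    "B": {"calls": 1, "response_time_s": 4, "sources": 2},
}

# Rank services from highest to lowest impact.
ranking = sorted(
    services,
    key=lambda name: impact(services[name]["calls"], services[name]["response_time_s"]),
    reverse=True,
)
for name in ranking:
    s = services[name]
    print(name, impact(s["calls"], s["response_time_s"]), uniqueness(s["sources"]))
```

Only the ordering matters here, which is consistent with the remark that the actual values are unimportant provided they are derived consistently.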
Next, MoDe4SLA suggests that we structure our monitoring results. All of the data indicated by MoDe4SLA is captured by our management solution described in section 4.3.3 and it consists of the following:
An audit trail of all the messages exchanged
The services invoked
Which workflow the service invocation belonged to, e.g. New Business, Renewal
Which internal resource or external vendor processed the request
Finally, MoDe4SLA suggests that runtime support is achieved by comparing the dependencies between services with their impact and assessing these against the runtime results. For example, if we are exceeding response time targets to our customers and a high impact service is shown to be exceeding its own response time targets then this particular service becomes a key candidate for action. Here MoDe4SLA is complete and it does not concern itself with what we do with the results. Its concern is simply to provide the framework by which we can achieve this level of insight. However, our own management solution can dynamically respond by re-routing requests to an alternative provider. We achieve this by temporarily adding a constraint on this service instance to the GA. MoDe4SLA is currently being evaluated across many industry sectors for its general applicability [Bodenstaff et. al., 2009b] so it is very encouraging that our solution is consistent with the framework and able to react dynamically to the service violations that the method is able to detect.
4.2.3 Differential QoS Support
Neither COSMA nor MoDe4SLA discusses service differentiation and its specification in an SLA. However, service differentiation has been considered in the broader context of web service management [Erradi et. al., 2006]. The authors conclude that existing web service management standards and practices are inadequate to deal with a technique that promises to provide greater flexibility and achieve higher levels of reuse and adaptability. Further, in the business world differentiated services are very common, for example, gold card holders expect better service than ordinary card holders. So it is clearly an omission in existing practices that QoS differentiation is not offered to customers purchasing automated business services. The WSOL XML SLA language offers support for QoS differentiation via its class of service offering but, as we have noted, it is intended primarily for the specification of atomic services. WSLA has a complex structure for specifying admission policy; however, neither WSLA nor WSOL offers a complete solution.
4.3 Proposals
Although considerable effort has been made to understand and define the requirements for SLAs for composite services we feel that there is still room for additional considerations. The process by which the SLA KPIs and KQIs are to be derived has been quite well explored. However, we make the following specific recommendations for further research:
1. COSMAdoc is a well structured XML language that extends the already popular WS-Agreement to support composite applications. However, it does not address the provision of differentiated services to end users of composite applications. In a static environment a separate COSMAdoc instance for each grade of service could be created. However, for dynamic environments this approach would be cumbersome and something like WSOL's classes of service would be a useful extension to the language.
2. The conceptual framework offered by MoDe4SLA can be extended with an analysis of the uniqueness of an atomic service. Where the composite application can source a service from multiple vendors, the impact of a service failure is much reduced. The current framework does not take this into consideration.
Chapter 5 An MVA Performance Model for a SOA
5.1 Introduction
Modelling of single-tier applications such as static user-browser-proxy-server architectures is well studied [Doyle et. al., 2003; Menasce, 2003; Slothouber et. al., 1996]. In contrast, modelling of multi-tier applications is less well studied. Extending single-tier models to multi-tier scenarios is a complex undertaking. In a composite application built using Web Services the model must consider Web Servers serving SOAP requests, Web Services executing business logic, and database servers. Each tier has vastly different performance characteristics. Further, in a composite application built using Web Services it is common to replicate resources using clustering and load-balancing technologies to reduce the chance of downtime. Finally, workflows executing on the composite application are session-based, where each session comprises a sequence of requests with think times in between. For instance, a session at an insurance application consists of a sequence of user requests to browse.
An insurance application was analysed in order to characterise the workload and service attributes of a real-world composite Web Service application. Based on the author's 10 years of experience working in the financial services sector, this application was chosen because it is an industry-typical example of a workflow that incorporates services provided by multiple systems. The ability for an insurance company to book a policy on their bookkeeping system is a core activity in the industry. The architecture is a textbook [Erl, 2008] implementation of a SOA. The application won plaudits across the insurance sector when it went live.
The application in question provides an online quotation service to multiple clients in the insurance brokerage industry. Whilst using the application, the client (a broker or financial advisor) is likely to be engaged with a customer looking to purchase an insurance policy. Should the customer accept the quote, the application further provides functionality to allow the policy to be issued in real-time. This involves straight-through processing messages to a number of different back-end systems. The architecture was built on SOA principles, with each major piece of business functionality being delivered as an autonomous and discrete service. These services provided such functionality as Login, Product Selection, Quote Request, Documentation Request, Policy Booking etc. The complete e-commerce workflow resulting in a policy being issued to the customer and confirmed on the back-end policy booking database consisted of 26 distinct steps. Each step is referred to in the following analysis as a class of work. A given Web Service can participate in one or more classes of work (as an example consider the Web Service that wraps the legacy bookkeeping system; it participates in multiple steps because policy data must be read and updated in several different parts of the business process). Furthermore, the Web Services that make up the complete application are deployed across multiple resources. Although components are duplicated across these multiple resources to provide load-balancing and failover capability, the number of resources is independent of the number of Web Services since it is common, for cost reasons, to find multiple services deployed on a single server.
5.2 Performance Requirements

During the course of designing and delivering a number of e-commerce applications to global blue-chip clients, the author has noted that the primary QoS metrics that interest the client are as follows:
The total response time of the entire e-commerce workflow.
The response time of certain steps in the e-commerce workflow. For example, whilst issuing an insurance policy the client might want the quotation calculation to take less than 4 seconds because he/she could have a customer on the other end of the telephone waiting for the figure.
The response times are independent of throughput. From a client's perspective, the fact that other users might be using the service is irrelevant. The clients expect response times to meet the service level objectives at all times.
5.3 Business Demand Modelling

The reality of any e-commerce workflow is that not all clients will complete the workflow end-to-end. Consider the case of our sample application. The end customer, on being given a quotation, might not decide to take up the offer. In this case, the final steps in the process which submit the bound policy data to a number of back-end systems will not take place, resulting in considerably lower resource requirements. The take-up rate is likely to be heavily influenced by the competitiveness of the company's quotes, and might be expected, therefore, to vary over time in a predictable manner, depending on how the company adjusts its rates in comparison with its competitors. This is the role of the actuary. To understand the impact of these business decisions, it is necessary to monitor the statistical likelihood of a class of work being requested. Figure 5.1 plots the probability, with respect to the login of step A, of the client requesting each class of work in the sample application. These results were drawn from the production system over a period of one week of live operation.
[Figure 5.1 plot. Vertical axis: Probability of Request (0 to 1); horizontal axis: Job Class (A to Z)]
Figure 5.1 Probability of client requesting each job class
The majority of brokers complete classes A to M. These are a sequence of activities that provide the salient information required for a general insurance quotation. Depending on whether the end customer approves of the quote that is provided, there is then a lower probability that classes N to V are requested. The classes T to V are a set of tasks that capture additional data required to make an underwriting decision and update the quote. Because underwriting decisions often require additional information that cannot be provided online, there is a probability that the broker will end the transaction at this point (the policy will be issued manually at a later date once the relevant information has been provided). Furthermore, since answers to the underwriting questions might lead to additional, more detailed questions, steps T-V can be requested multiple times during the course of a single transaction. One other point of interest is the difference between step A, the initial login page load, and step B, the confirmation that the login credentials were acceptable. It appears that 30% of all transactions terminate after loading the login page. This seems to be caused by brokers who like to periodically check that the service is available but are not currently ready to submit any customer details! About 1 in 5 transactions are completed end-to-end.
In order to model business demand an estimate is made of how this probability distribution might vary if the actuaries predict a greater take-up rate. Further, to cater for unusually high peak-time demand, it is necessary to multiply up the average demand by some factor. A factor of three has been commonly used by the author in her employment. This is somewhat arbitrary but appears to be a good rule-of-thumb.
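The peak-demand estimate described above can be sketched as a simple scaling of the per-class request probabilities. The session rate is a hypothetical figure, and the three probabilities are illustrative values only: 0.7 for class B and 0.2 for the final class loosely reflect the statements that about 30% of transactions end after the login page and about 1 in 5 complete end-to-end.

```python
def peak_demand(sessions_per_hour, class_probability, peak_factor=3.0):
    """Expected peak requests per hour for each class of work.

    class_probability[c] is the probability, relative to the initial
    login (class A), that a session requests class c.
    """
    return {
        cls: sessions_per_hour * p * peak_factor
        for cls, p in class_probability.items()
    }


# Hypothetical average load and illustrative probabilities for three classes.
probabilities = {"A": 1.0, "B": 0.7, "Z": 0.2}
demand = peak_demand(sessions_per_hour=100, class_probability=probabilities)
for cls, rate in demand.items():
    print(cls, round(rate, 1))
```

To explore a higher take-up rate predicted by the actuaries, the probabilities of the later classes can be raised and the function re-run with the same peak factor.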
5.4 Workload Characterisation

5.4.1 Task Distribution
The sample application undertakes very little work on the web servers. The role of this tier is to render pages for returning to the client, and to undertake simple validation logic. These servers do not undertake processing of complex application logic or the execution of database queries. The contribution to the total response time and resource time of the web server responding to HTTP GETs for image files is very small. A handful of very simple classes of work are serviced entirely on the web tier, e.g. loading the login page. The rest, though, spawn additional tasks on other tiers. The performance test environment used in this analysis used two additional tiers. The first one, designated App Tier 1, is a cluster of load-balanced servers that handle business logic and requests to a local database supporting the application. The demands of straight-through processing also mean that another tier, App Tier 2, is required to farm out work to external back-end systems. The servers on this tier are essentially brokers acting as intermediaries between the application and the legacy systems. Figure 5.2 plots the number of tasks raised on each tier for each class of work.
[Figure 5.2 chart. Vertical axis: Number of Tasks (0 to 100); horizontal axis: class of work (A to Z). Data table from the chart:]
Class    A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
App 2    0  0  0  0  0  0  0  0  0  0  6 14  3  0  0  0  0  0  0  7  0  9  0  0 99  0
App 1    0  2  2  0  1  2  0  0  0  1  2  7  4  2  1  2  0  1  2  6  2  7  6  5  5  0
Web      1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1
Figure 5.2 Distribution of tasks across three tiers for each of the 26 classes of work
At this point it is worth defining the unit task on each tier:
Web Tier: A task is the code that receives an HTTP POST or GET for a page, performs any page level logic and validation and renders a response back to the client. The total response time of a task on the Web Tier includes the response time of tasks executed synchronously on other tiers. It does not include the response time of any subsequent tasks that are executed asynchronously. It is quite common to execute tasks asynchronously in an application such as this in order to improve responsiveness.
App Tier 1: A task is a unit of business logic that could optionally include a database query on a local database. The tasks can be composed of additional tasks running on the same tier and the following tier, and so, like the web tier, the total response time of the tasks can include the response times of these further tasks if they are executing synchronously.
App Tier 2: A task is a unit of work dispatched to an external system, for example this might be the submission of the newly captured policy details to a book-keeping system, which returns a response acknowledging that the details have been successfully received.
5.4.2 Arrival Time Distribution

It is common for queuing network models to assume that the distribution of inter-arrival times at a queue is a Poisson distribution (with a mean equal to the variance). However, previous studies have demonstrated that the distribution of inter-arrival times for web site traffic is heavy-tailed and bursty in nature [Crovella and Bestavros, 1997; Arlitt and Jin, 1999]. One reason for this is that a page request from a browser typically generates multiple automatic requests for additional objects such as image files and JavaScript files. As described above, the focus in this study is on the tasks that are subsequently raised across other tiers as a result of the initial request to the web server. To assess the difference between the two distributions, the web logs of the sample application were compared with the rate of arrival of tasks at the job queues of the application tiers behind the web server tier. Figure 5.3 plots the inter-arrival time distribution for the web logs; this includes all HTTP POSTs and GETs for all objects. It is clearly not Poisson, and indeed, in Figure 5.4, which shows the tail of the distribution for inter-arrival times beyond two seconds, the Pareto fit provides good evidence that it is long-tailed. The variance is several times larger than the mean of the distribution.
[Histogram: count (0 to 12000) against inter-arrival time (s), 0 to 2 s]
Figure 5.3 Inter-arrival time distribution from web logs.
[Histogram with Pareto fit: count (0 to 500) against inter-arrival time (s), 2 to 22 s]
Figure 5.4 The tail of the distribution for inter-arrival times beyond two seconds
When the results from the job queue are analysed (Figure 5.5), however, the distribution is very close indeed to being Poisson, with a variance that only slightly exceeds the mean. For the purposes of this work, a Poisson distribution will now be assumed.
[Histogram: count (0 to 200) against inter-arrival time (s), 0 to 2 s]
Figure 5.5 Inter-arrival times of tasks on the job queue.
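As a rough illustration of the kind of comparison made above, one common way to check whether arrivals look Poisson is to bin the arrival timestamps into fixed windows and compare the variance of the per-window counts with their mean. For a Poisson process the ratio is close to 1, whereas bursty traffic gives a ratio well above 1. The sketch below is an assumption about how such a check might be coded and is not necessarily the procedure used in this study; the function name and the window size are hypothetical.

```python
from statistics import mean, pvariance


def dispersion_index(arrival_times, window=1.0):
    """Variance-to-mean ratio of arrival counts per fixed time window.

    arrival_times: arrival timestamps in seconds (hypothetical input format).
    A value near 1 is consistent with Poisson arrivals; values well above 1
    indicate bursty traffic.
    """
    if not arrival_times:
        raise ValueError("need at least one arrival")
    start = min(arrival_times)
    n_windows = int((max(arrival_times) - start) // window) + 1
    counts = [0] * n_windows
    for t in arrival_times:
        counts[int((t - start) // window)] += 1
    return pvariance(counts) / mean(counts)
```

Perfectly regular arrivals give a ratio of 0, so the index separates regular, Poisson-like and bursty streams on a single scale.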
5.4.3 Service Time Distribution

The service times of each task are also commonly assumed to have a Poisson distribution. A task's service time is the total amount of time it is busy. This can be obtained from the sample application's audit trail, which stamps the start and end time of each execution of each task into a database. Figure 5.6 shows that the Poisson assumption is far from adequate for this application. Indeed, in Figure 5.7, which plots the distribution for all service times that exceed two seconds, it is also clear that the distribution has a very heavy tail. If each request consisted of a single task running on a single resource then the distribution could be expected to be very uniform. However, because a single request is constructed from a number of different tasks running on different resources, the service times for real-world applications are not exponentially distributed. Analysis reveals that just 10% of jobs account for 55% of all response time. Similar results have been observed on UNIX systems [Bansal and Harchol-Balter, 2001]. In summary, our system is best described by an M/G/m queue.
[Histogram: count (0 to 2000) against execution time (s), 0 to 2 s]
Figure 5.6 Service time distribution for all tasks executing in under 2 sec
[Histogram: count (0 to 120) against execution time (s), 2 to 32 s]

Figure 5.7 Service time distribution for all tasks executing in over 2 sec
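The observation that a small fraction of jobs accounts for most of the response time can be checked directly from a list of recorded service times. The following is a minimal sketch, assuming the audit trail has already been reduced to a plain list of durations in seconds; the function name is hypothetical.

```python
def share_of_top_jobs(service_times, top_fraction=0.10):
    """Fraction of the total service time consumed by the longest-running jobs.

    top_fraction=0.10 asks what share of all time is used by the slowest 10%.
    """
    if not service_times:
        raise ValueError("need at least one service time")
    ordered = sorted(service_times, reverse=True)
    k = max(1, round(len(ordered) * top_fraction))
    return sum(ordered[:k]) / sum(ordered)
```

For example, nine jobs of 0.5 s and one job of 5.5 s give a share of 0.55 for the top 10%, the same proportion as reported above for the sample application.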
5.4.4 Load-Dependence of Service Times

Another factor that must be taken into consideration is the load dependence of the service times. In queuing theory, a load-dependent device is one where the service time changes as the queue length increases. This is often the case when resources are being shared. In the sample application many tasks exhibit load-dependent characteristics, particularly those that consume common low-level components such as XML parsers. Figure 5.8 plots the service time of one particular task as the total load on the application increases (in terms of messages/unit time period).
Figure 5.8 Increase in task service time with load
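The model in Table 5.1 represents this load dependence by a straight line, service time = load . slope + intercept, obtained from a least squares fit. A minimal sketch of such a fit is given below, assuming paired observations of load and service time are available; the function and variable names are illustrative only.

```python
def fit_service_time(loads, service_times):
    """Ordinary least-squares fit of service_time = slope * load + intercept."""
    n = len(loads)
    if n < 2 or n != len(service_times):
        raise ValueError("need two or more paired observations")
    mean_x = sum(loads) / n
    mean_y = sum(service_times) / n
    sxx = sum((x - mean_x) ** 2 for x in loads)
    if sxx == 0:
        raise ValueError("all loads are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(loads, service_times))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

The intercept can be read as the service time of the task on an otherwise idle resource, and the slope as the extra time incurred per unit of additional load.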
5.5 Modelling the Application

5.5.1 Mean Value Analysis

Mean-Value Analysis (MVA) is a well-known technique for analysing closed multichain queuing networks [Reiser and Lavenberg, 1980]. An MVA model can handle applications with an arbitrary number of tiers, including tiers with significantly different performance characteristics, which makes it particularly suitable for composite applications built using Web Services. It is also algorithmically quick. In the case study application there are 26 classes of service and these are all load-balanced, making 52 services in total. The MVA algorithm executes in just a few milliseconds in this example, whereas the response times for many of these services are 2 or 3 orders of magnitude higher.
The sample application is an example of a closed network because the e-commerce transaction consists of a sequence of steps that require the user to wait for each one to complete before starting the next step. MVA is based on a set of what are termed operational laws for queuing networks derived by Denning and Buzen [1978]. These laws include Little's Law, which states that the mean number of jobs in a system is equal to the arrival rate of those jobs multiplied by the mean time each job spends in the system. In MVA, Little's Law is applied to the queuing network as a whole, and to each service centre in the network individually. Reiser and Lavenberg [1980] showed how to derive an expression from the operational laws for the response time of each service centre based on the mean queue length at each centre. Algorithms are available for computing the mean queue length. In simple scenarios, an exact calculation is available in which an iterative technique is used to calculate the queue length for n customers based on the result for n-1 customers. For more complex cases, or cases where the number of customers and/or classes is high, approximations must be used. MVA can be applied to networks with a variety of service time distributions. It has been shown that MVA can relatively easily be applied to multi-tier e-commerce applications [Urgaonkar et al., 2005b]. MVA handles scenarios such as a single request in the web tier spawning multiple tasks on the application tier by using closed queuing networks in which a request makes multiple visits to each resource. In particular, it has been demonstrated how MVA can deal with long-lived transactions using an infinite queue at the front of the model, which also serves as the re-entry point for requests that have been completed. Also using a queuing network modelling technique, Kounev and Buchmann [2003] applied their analysis to a J2EE application and concluded that the errors in its throughput estimates were only a few percent.
Errors in response time estimates were 10-30%. For an e-commerce application this is probably an acceptable result. If the total response time is 10 seconds then the error is up to 3 seconds, which is not a bad result given the complexity of distributed applications. Ensuring that the SLA is drawn up with some headroom in the agreed response times could cater for this error. Kounev and Buchmann conclude that the model is certainly useful for capacity planning purposes since it allows the performance to be predicted based on assumed scenarios involving the addition of extra resources. Prompted by these results, a multi-class load-dependent MVA calculation class was developed by the author in C#.NET based on the algorithm of Menasce et al. [2004b]. This algorithm assumes that all classes have load-dependent behaviour such that the service times increase at the same rate with load. The algorithm was amended by the author to include a service-rate multiplier for each class of work. A correction given by Lazowska et al. [1984] was also added to cater for high service time variability but then withdrawn when results indicated that the service time variability had little effect when the load-dependency of the service times was taken into account. The algorithm is described in Table 5.1 below.

Inputs from statistics:
K = number of queues
R = number of classes
$\vec{N} = (N_1, \ldots, N_R)$ = number of current requests of each class
$\mathrm{equiv}_{i,r}$ = visit ratio for queue i, class r
$\mathrm{intercept}_{i,r}$, $\mathrm{slope}_{i,r}$ = coefficients from a least squares fit to service time data (service time = load . slope + intercept)

Desired outputs:

$X_r$ = throughput of class r customers
$R_{i,r}$ = average response time of class r customers at queue i

Define:

$$|\vec{N}| = \sum_{r=1}^{R} N_r$$

Calculate service demands:

$D_{i,r}$ = service demand of class r at queue i when there are $N_r$ customers = visit ratio multiplied by the service time at this load. So,

$$D_{i,r} = \mathrm{equiv}_{i,r} \cdot (\mathrm{slope}_{i,r} \cdot \mathrm{equiv}_{i,r} + \mathrm{intercept}_{i,r})$$

Calculate the service rate multipliers. The multiplier $\alpha_{i,r,j}$ when there are j jobs in total is defined as $\mu_{i,r,j} / \mu_{i,r,1}$, where $\mu_{i,r,j}$ is the service rate of device i for class r when there are j customers:

$$\alpha_{i,r,j} = \frac{j \cdot \mathrm{slope}_{i,r} + \mathrm{intercept}_{i,r}}{\mathrm{slope}_{i,r} + \mathrm{intercept}_{i,r}}$$

Initialize working variables by taking a first guess at the queue length probabilities of finding j jobs at queue i given the workload $\vec{N}$:

$$P_i(0 \mid \vec{N}) = 0, \qquad P_i(j \mid \vec{N}) = \frac{1}{|\vec{N}|}$$

Take a first guess for the throughput:

$$X^{prev}_{0,r} = \min\left\{ \frac{N_r}{\sum_{i=1}^{K} D_{i,r} + R_{0,r}},\ \frac{1}{\max_i \{ D_{i,r} + R_{0,r} \}} \right\}$$

Set the initial error to an arbitrary large number and perform the main iterative loop until the error is smaller than some arbitrary percentage, which can be chosen as a trade-off between the number of iterations and accuracy:

Estimate the response time:

$$R_{i,r} = D_{i,r} \sum_{j=1}^{N_r} \frac{j}{\alpha_{i,r,j}} \, P_i(j-1 \mid \vec{N})$$

Estimate the new queue length probabilities:

$$P_i(j \mid \vec{N}) = P_i(j-1 \mid \vec{N}) \sum_{r=1}^{R} \frac{D_{i,r} \, X^{prev}_{0,r}}{\alpha_{i,r,j}}$$

Re-estimate the throughput:

$$X^{curr}_{0,r} = \frac{N_r}{\sum_{i=1}^{K} R_{i,r} + R_{0,r}}$$

Calculate the error by comparing the latest estimate to the previous one:

$$\mathrm{error} = \frac{X^{curr}_{0,r} - X^{prev}_{0,r}}{X^{prev}_{0,r}}$$

Prepare for the next iteration:

$$X^{prev}_{0,r} = X^{curr}_{0,r}$$

Loop while the error exceeds the desired bound, or until a maximum number of iterations is reached (a failsafe in case the error does not converge).
Table 5.1 MVA algorithm
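To make the recursion behind MVA concrete, the sketch below implements the exact single-class, load-independent form of the algorithm, in which the result for n customers is built from the result for n-1 customers. It is deliberately simpler than the approximate multi-class, load-dependent variant in Table 5.1 and is not the implementation used in this work; the function name, argument names and the think-time parameter are assumptions made for illustration.

```python
def exact_mva(demands, n_customers, think_time=0.0):
    """Exact MVA for a closed, single-class network of load-independent queues.

    demands[i] is the service demand (visit ratio * service time) at queue i.
    Returns (throughput, residence_times, mean_queue_lengths).
    """
    queue_len = [0.0] * len(demands)
    residence = [0.0] * len(demands)
    throughput = 0.0
    for n in range(1, n_customers + 1):
        # An arriving job sees each queue as it was with n - 1 customers.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue_len)]
        # Little's Law applied to the whole network gives the throughput.
        throughput = n / (think_time + sum(residence))
        # Little's Law applied to each queue gives the new mean queue lengths.
        queue_len = [throughput * r for r in residence]
    return throughput, residence, queue_len
```

With demands of 0.1 s and 0.2 s and a single customer, the throughput is 1/0.3 jobs per second. As more customers are added, the throughput approaches but does not reach 1/0.2, the limit imposed by the busiest queue.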
5.5.2 The Queuing Network Model of an N-Tier Application

A multi-tier application is modelled by treating each tier of hardware resources as a finite queue (Figure 5.9). Each resource processes jobs of one or more classes, and can instantiate additional tasks to complete that job. Those tasks might execute on the same resource, on another resource in the same horizontal tier or on the tier below it. The clients are treated as an infinite queue.
51
Clients
Web Farm
App Servers Tier 1
App Servers Tier 2
Figure 5.9 Queuing Network Model of an N-tier Application
In a widely distributed architecture it could be envisaged that each task is deployed as a Web Service executing on its own resource (or load-balanced cluster of resources). In this scenario, the number of resources (queues) could equal or exceed the number of classes. The effect of n load-balanced resources being able to process work in parallel is equivalent to dividing the workload so that each of n queues processes 1/n of the requests, and this is easily catered for in MVA. The primary advantage of MVA is that, by measuring the number of visits of each task belonging to each job class at each resource, the details of which task invokes which other task become irrelevant. It is an entirely statistical technique, which means that it can be applied to any application without needing to understand in detail how the application actually works. It can also cater for scenarios where multiple application servers are deployed in load-balancing configurations simply by adding more finite queues to the model.
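The treatment of load balancing described above can be sketched as follows: each load-balanced resource is expanded into n identical queues, each receiving 1/n of the visits and therefore 1/n of the service demand. The data layout and names below are hypothetical and chosen only to illustrate the idea.

```python
def expand_load_balanced(demands, replicas):
    """Split each resource's service demand across its load-balanced replicas.

    demands:  {resource name: total service demand in seconds}
    replicas: {resource name: number of load-balanced servers}, default 1
    Returns a list of (queue name, demand per queue) pairs for use as MVA queues.
    """
    queues = []
    for name, demand in demands.items():
        n = replicas.get(name, 1)
        if n < 1:
            raise ValueError("a resource needs at least one server")
        for k in range(n):
            queues.append((f"{name}#{k + 1}", demand / n))
    return queues
```

For instance, a demand of 0.4 s on a two-server application tier becomes two queues with a demand of 0.2 s each, while a single web server keeps its full demand.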
5.6 Management Software

5.6.1 Capturing Application Metrics

In order to obtain the necessary statistics to use such a model, a service-oriented application must capture, for each unit of executable work (task), the source of a request (the client), the class of the job, the resource that executes the task, and the elapsed time that the task took to execute. This is relatively easy to do if the service-oriented application is constructed around an Enterprise Service Bus (ESB). To test the model, a simple ESB, built by the author in C#.NET and designed around a Pipes and Filters pattern [Hohpe and Woolf, 2004], was used. The Pipes and Filters pattern allows the service-oriented bus to sequentially process multiple tasks in a loosely coupled way, with the XML output from each task becoming the input to the next task, and with each filter encapsulating the logic of a single task (Figure 5.10). Each task is addressed via a Web Service interface and can be located on the same resource as the ESB, on another resource in the domain or on an external resource. The necessary measurements for the MVA model can be extracted from the bus by using a Wire Tap pattern [Hohpe and Woolf, 2004] to extract pertinent details about the original request message from the pipe, and by simply logging the start and end times of the messages arriving at and leaving each filter.
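The measurement approach described above can be illustrated with a small sketch in which a pipeline passes a message through a sequence of filters and records the start and end time of each one. Plain Python functions stand in for the Web Service calls and a list stands in for the audit database; these, together with all names used, are assumptions made for illustration and do not reproduce the C#.NET ESB.

```python
import time


def run_pipeline(message, filters, audit_log, clock=time.perf_counter):
    """Pass `message` through each (name, task) filter in order.

    The output of one task becomes the input of the next, and the start and
    end time of every task is appended to `audit_log`.
    """
    for name, task in filters:
        started = clock()
        message = task(message)
        finished = clock()
        audit_log.append({"task": name, "start": started, "end": finished})
    return message
```

A call such as `run_pipeline("quote", [("upper", str.upper)], log)` returns the transformed message and leaves one timing record per filter in `log`, from which elapsed times can be derived offline.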
Figure 5.10 An ESB executing a sequence of tasks via Web Services
5.6.2 Statistical Analysis of Raw Metrics

The Statistics Generator was written as a Windows Service in C#.NET and runs on a management server. It executes periodically offline to generate statistics from the recorded metrics. The raw metrics are grouped into time slices and then, for each job class executing on each resource, it computes the following:
- The average service time of each task for each class on each resource during that time slice
- The variance of the service times
- The load during the time slice in terms of jobs per second
- The ratio of visits by tasks of each class to each resource, with respect to the number of inbound requests of that class
- The average service time of each request for each class as a function of load
To avoid problems with clustering of data points leading to skewing of the results [Asawa, 1998], a k-means clustering algorithm written in C#.NET is used to estimate the mean of each set of points [Deichmann, 2006].
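The grouping of raw metrics into time slices can be sketched as below. This is a simplified illustration rather than the Statistics Generator itself: the record layout and names are hypothetical, and the visit ratios and the k-means step are omitted.

```python
from collections import defaultdict
from statistics import mean, pvariance


def slice_statistics(records, slice_seconds=60.0):
    """Summarise service times per (time slice, job class, resource).

    records: iterable of (start_time, job_class, resource, service_time),
    with times in seconds.
    """
    groups = defaultdict(list)
    for start, job_class, resource, service_time in records:
        key = (int(start // slice_seconds), job_class, resource)
        groups[key].append(service_time)
    stats = {}
    for key, times in groups.items():
        stats[key] = {
            "mean_service_time": mean(times),
            "variance": pvariance(times),
            "load_jobs_per_second": len(times) / slice_seconds,
        }
    return stats
```

Each key in the result identifies one time slice for one class on one resource, so the pairs of load and mean service time needed for the least squares fit can be read off directly.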
5.6.3 MVA Modeller

The MVA algorithm presented in Table 5.1 was implemented in C#.NET. It can operate in two modes:
- By retrieving statistics produced by the Statistics Generator, it can produce an estimate of the current Quality of Service.
- By substituting statistics for a given task, it can simulate the effects of dynamically swapping one Web Service for another.

The outputs from the Modeller are:
a) The throughput of each class
b) The response time of each class at each queue
5.7 Results
The experimental results presented here demonstrate how the model can be used in a predictive mode to assess the impact of a change in the business process or the task distribution. An ESB was used to configure a workflow that invoked Web Services which provided some logical processing and in some cases accessed a back-end database. To create a baseline, a simple workflow was configured through the ESB that invoked four C#.NET Web Services sequentially. Three of these Web Services made calls to a common SQL Server database. The Web Services are designated A, B, C and D and, for the purposes of the model, they are similarly designated as class A, B, C and D respectively. The number of concurrent workflows was gradually increased and the total response time of the workflow was measured using an open source Web Service load test tool (soapUI 2.0.2 [Eviware]). The response times of each class were measured using the ESB's audit logs. The total response time includes the response times of each class, plus the round-trip time between the load test tool client and the ESB (which, it turns out, is negligible). The results are plotted in Figure 5.11.
[Plot: response time (s) against load for Class A, Class B, Class C, Class D and Total]

Figure 5.11 Response times of each class in a simple workflow, together with the total response time
The results for the load-dependent execution time of each class were entered into the model and the model was then run to verify that it provided the same results for the response times. Figure 5.12 plots the predicted results against the actual results for the response times at different loads.
[Three bar charts, for Load = 3, Load = 5 and Load = 7, each comparing Model and Actual response times (s) for Class A, Class B, Class C, Class D and Total]
Figure 5.12 Comparison of the response times predicted by the model and the actual response times at different loads.
5.7.1 Accuracy of the Model

Figure 5.13 plots the difference between the model's results and the actual results as a percentage for the case Load = 7. The error is only about 8% on average, which is considerably better than the 10-30% response time errors reported by Kounev and Buchmann [2003].
[Bar chart: difference (Model - Actual)/Actual as a percentage (axis 0 to 16) for Class A, Class B, Class C, Class D and Total]
Figure 5.13 Accuracy of the Model
5.8 Using the Model to Predict the Performance of a New Workflow

Next, a new workflow was conceived which invoked a different sequence of the same Web Services. This new workflow was A, B, C, D, D, A, C, and it is modelled simply by changing the visit ratio of classes A, C and D from 1 to 2. The predicted total response times were then calculated by the model for different loads. Next, the actual response times were measured by re-configuring the workflow of the ESB and re-running the load test. Figure 5.14 shows the differences between the predicted and observed results. The hardware resources available during these tests limit the load that can be applied when many more jobs are introduced to the application by the extended workflow.
[Plot: predicted and actual total response time (s) against load, for loads of 3, 5 and 6]
Figure 5.14 The differences between the predicted and observed results when the model is used in a predictive manner
What is particularly noteworthy about these results is that they predict the response of a workflow without recourse to any prior historical measurements of that particular workflow. Instead, we have simply used the historical measurements from the components that make up that workflow.
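Since the new workflow is modelled purely by changing visit ratios, the ratios can be derived by counting how often each service appears in the workflow sequence. The following is a minimal sketch, assuming a workflow is represented as a sequence of service labels; the function name is hypothetical.

```python
from collections import Counter


def visit_ratios(workflow):
    """Number of visits to each service per execution of the workflow."""
    return dict(Counter(workflow))
```

Applied to the baseline workflow A, B, C, D this gives a ratio of 1 for every class, and applied to A, B, C, D, D, A, C it gives 2 for classes A, C and D and 1 for class B, matching the change described above.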
5.9 Summary and Discussion

The performance model successfully predicts the response times of our composite service-oriented application and is at least as good as other reported models, e.g. Kounev and Buchmann [2003]. Furthermore, it is shown that it can be used in a predictive manner, with test results being presented for how the model can deal with changes in the workflow. As a specific example, this scenario could arise in our sample application if the company changed their rates and a greater take-up of quotes occurred.

A specific limitation of the model in predictive scenarios arises where the workflow becomes heavily skewed in favour of one particular service. If, for example, business demand meant that service D was called many times and service D happened to be particularly resource hungry, the historical data would be unreliable if there was no historical precedent for that scenario. The actual performance might be significantly affected by resource contention not previously observed and, therefore, not adequately captured by the statistical method of applying a fit to the historical response time changes under load. However, we can attempt to limit this problem through the standard capacity planning technique of base-lining each service with stress tests on each resource, in isolation from any other services using that resource. In the limit where one service is processing significantly more tasks than any other service, the performance can be expected to tend towards the performance observed during the base-line test, since that task will be responsible for most of the work at that resource. The example above used four job classes, but because the multi-class formalism is based upon an entirely statistical method of analysis, it can be expected to work for many more classes. The only limitation is in computation time. A lack of hardware resources prevents us from demonstrating this. This is the first time, to the best of our knowledge, that detailed test results have been obtained to show that the MVA algorithm can be applied to a queuing network description of a SOA implemented as a workflow on an ESB. The final formulation of the multi-class MVA algorithm used in this work has been extended by us from an algorithm given by Menasce et al. [2004b]. All software components were developed by us, unless otherwise stated.
5.10 Publications
A summary of this chapter was published at the SQM 2006 conference:
Shelly Saunders, Margaret Ross, Geoff Staples, and Sean Wellington, 2006. A Quality of Service Aware Model for Service-Oriented E-Commerce, In: Perspectives in Software Quality, Fourteenth International Conference on Software Quality Management (SQM 2006), pp51-62.
Chapter 6 A Genetic Algorithm with an MVA Fitness Function for Runtime Performance Improvements of a Composite Application Built Using Web Services
In this chapter a Genetic Algorithm (GA) is introduced for runtime performance improvement of QoS response time metrics of a composite application built using Web Services. The key aspects of the GA are the chromosome encoding and the crossover strategy. It is shown how the GA can optimize the overall response times of workflows using the MVA model as a fitness function to identify whether the workflow suggested by each chromosome will meet the SLA targets defined. The key elements of a management solution for autonomous performance improvement of a composite application built using Web Services are described. Finally results are produced to demonstrate the GA in action.
6.1 Introduction to Genetic Algorithms

As was discussed in Chapter 2, the results of Zomaya and Teh [2001], Canfora et al. [2005] and Page and Naughton [2006], as well as the comparative analysis of Weise et al. [2007], suggest that genetic algorithms offer a promising avenue of research in the field of web service composition. Genetic algorithms are search algorithms based on the mechanics of genetics. They have been proven to be robust and applicable to a wide variety of problems. Of merit in this particular problem space is the fact that GAs are less susceptible than search methods such as hill-climbing to the problems posed by functions that have multiple potentially good solutions rather than one single solution. GAs differ from other optimization algorithms in four key ways [Goldberg, 1989]:

1. They can use general algorithmic procedures regardless of the problem space in question by encoding the parameters of the problem rather than using the parameters themselves. In fact, most of the work involved in using a GA lies in identifying how to encode the parameters in the first place. The parameters are coded as a string of characters known as a chromosome.

2. They start from a large and random population of points in the search space, rather than a single point. This helps GAs avoid locating false peaks in problem spaces with many peaks. Each valid (as allowed by the encoding rules) sequence of characters in the chromosome is a valid point in the search space. The initial population of chromosomes is usually chosen at random, creating a diverse population of possible solutions. As the genetic algorithm progresses, the population retains its diversity but becomes more and more adapted to the problem space.

3. They use an objective measure to identify the value of a particular solution. Each string (chromosome) in the population is evaluated against this measure, known as the fitness function, to identify its payoff value. The fitness function is unique to the type of optimization problem being solved. The payoff value of each chromosome in the population is used to determine the fittest chromosomes in the problem space. The parents of the next generation of chromosomes are chosen from the fittest individuals in the current generation, a process known as reproduction.

4. They use probabilistic rules rather than deterministic rules to help create successive populations of chromosomes. This ensures that the pool of chromosomes remains diverse and the full extent of the problem space is successfully explored. The two rules that are most commonly encountered in GAs are crossover and mutation. Crossover, at its simplest, involves taking two random parent chromosomes and exchanging some of the characters in the chromosome of one with the other, to create two new strings that will become part of the population pool of the next generation. Mutation usually plays a secondary role to crossover. It is simply the mutation of a single character in a random string to a random alternative value. Mutation occurs in a typical GA at a much lower rate than crossover. Its value is that it prevents the population pool from stagnating. Without mutation, it is possible that useful sequences of characters in a string (genes) could become extinct.
6.2 Comparison with Other Techniques

There are no complete comparative studies of GAs against other methodologies for QoS-aware frameworks. However, work has been conducted by Canfora et al. [2005] to compare a GA with linear integer programming. They show that linear integer programming out-performs GAs when the number of services is small, but that GAs are faster when there are large numbers of services (> 15). In our own case study insurance application there were 26 concrete services and, as these were load-balanced, the total number of services was 52. For this reason, GAs were considered a good candidate for testing our hypothesis.
6.3 A GA for the Sample Application

This section presents in detail how a composite application built using Web Services can be encoded into a chromosome for use in a GA and how the MVA model presented in Chapter 3 can be used as a fitness function for that GA.
6.3.1 Chromosome Encoding

The chromosome encoding follows the approach of Page and Naughton [2006]. Suppose we have 3 resources R1, R2 and R3, which are 3 physical servers. On each server we have 3 Web Services which each process requests for 3 classes of work, where the use of the word class is identical to that in Chapter 3. Suppose also that under the current load Service 1 has three jobs in total to be executed, Service 2 has four jobs and Service 3 has four jobs. We start by forming a 1D string in which we denote each job by its class identifier and use -1 to delimit the jobs running on each resource. For example:

String 1: 1, 2, 2, -1, 1, 1, 2, 3, 3, 3, -1, 2, 3

Reading left to right, String 1 says that one class 1 job and two class 2 jobs are running on R1. Then it states that R2 has two jobs of class 1, one of class 2 and three of class 3. Finally, it states that R3 has one job of class 2 and one job of class 3. In total there are three jobs of class 1, four of class 2 and four of class 3, which is what was required. Within the bounds of the delimiters, the order of the characters is unimportant since the string captures a snapshot of the workflow load at a given moment in time. Next, we translate each character in String 1 into a set of unique task IDs where each ID indexes one of the jobs above. We will call this encoded version of the string Parent 1:

Parent 1: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13

Note that the resource delimiters also receive an ID. A complete mapping between the characters in Parent 1 and their unique IDs is shown in Table 6.1 below.
Unique ID   Type
1           Class 1
2           Class 2
3           Class 2
4           Resource 1-2 delimiter
5           Class 1
6           Class 1
7           Class 2
8           Class 3
9           Class 3
10          Class 3
11          Resource 2-3 delimiter
12          Class 2
13          Class 3
Table 6.1 Example Chromosome Encoding
The unique ID represents, in the language of GAs, a unique gene. If there are n jobs in total running on m resources then there will be (n + m - 1) unique genes in the chromosome. Further, each chromosome will be (n + m - 1) characters in length. Each character in the chromosome is a unique gene and genes are never repeated in a given chromosome.
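The encoding step can be sketched as follows: every position in the job string, including the delimiters, receives a unique 1-based gene ID, and a lookup table records what each ID stands for. This is an illustrative sketch of the scheme described above rather than the author's implementation; the names are hypothetical.

```python
DELIMITER = -1  # marks the boundary between two resources in the job string


def encode(job_string):
    """Assign a unique gene ID (1-based position) to every job and delimiter.

    Returns (chromosome, mapping), where mapping[gene_id] is either a job
    class identifier or DELIMITER.
    """
    mapping = {gene_id: symbol for gene_id, symbol in enumerate(job_string, start=1)}
    chromosome = list(mapping)
    return chromosome, mapping
```

Encoding String 1 yields the chromosome 1 to 13 (Parent 1) and a mapping equivalent to Table 6.1, with genes 4 and 11 standing for the two resource delimiters.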
6.3.2 Initial Population

Now we are ready to create an initial random sample population of parent strings. These are nothing more than variations on Parent 1 in terms of the sequencing of their genes. For example, Parent 2 might be:

Parent 2: 7, 6, 10, 4, 8, 1, 3, 13, 2, 11, 5, 9, 12

Decoding this chromosome using the mapping table above shows that it corresponds to the following set of jobs:

Parent 2 (decoded): 2, 1, 3, -1, 3, 1, 2, 3, 2, -1, 1, 3, 2

- Class 1 has 1 job running on R1, 1 on R2 and 1 on R3
- Class 2 has 1 job running on R1, 2 on R2 and 1 on R3
- Class 3 has 1 job running on R1, 2 on R2 and 1 on R3

A second example might be this chromosome, in which the positions of genes 12 and 11 have been swapped:

Parent 3: 7, 6, 10, 4, 8, 1, 3, 13, 2, 12, 5, 9, 11

Decoded:

Parent 3 (decoded): 2, 1, 3, -1, 3, 1, 2, 3, 2, 2, 1, 3, -1

- Class 1 has 1 job running on R1 and 2 on R2
- Class 2 has 1 job running on R1 and 3 on R2
- Class 3 has 1 job running on R1 and 3 on R2

No jobs run on R3 since the delimiter between resources 2 and 3 is the last gene in the sequence.
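Decoding a chromosome and generating a random initial population can be sketched as below, using the gene mapping of Table 6.1. The function names and the representation of the result as one list of job classes per resource are assumptions made for illustration.

```python
import random

DELIMITER = -1
# Gene ID -> job class, as in Table 6.1 (DELIMITER marks a resource boundary).
GENE_TABLE = {1: 1, 2: 2, 3: 2, 4: DELIMITER, 5: 1, 6: 1, 7: 2,
              8: 3, 9: 3, 10: 3, 11: DELIMITER, 12: 2, 13: 3}


def decode(chromosome, gene_table=GENE_TABLE):
    """Return one list of job classes per resource, in resource order."""
    resources = [[]]
    for gene in chromosome:
        symbol = gene_table[gene]
        if symbol == DELIMITER:
            resources.append([])  # start collecting jobs for the next resource
        else:
            resources[-1].append(symbol)
    return resources


def random_population(size, gene_table=GENE_TABLE, rng=random):
    """Create `size` chromosomes, each a random permutation of all genes."""
    genes = list(gene_table)
    return [rng.sample(genes, len(genes)) for _ in range(size)]
```

Decoding Parent 3 in this way returns an empty list for R3, which reflects the observation that a trailing delimiter leaves the last resource without jobs.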
6.3.3 Fitness Evaluation

The fitness of each chromosome is evaluated using the approximate MVA algorithm with the load for each resource determined by decoding the chromosome under scrutiny. The fitness is calculated from the results of the MVA algorithm in the following way:

1. Identify the classes for the workflow under scrutiny
2. Retrieve the SLA target for the workflow
3. Use the MVA results to get the predicted average response time for each class in the workflow on the resource indicated by the chromosome
4. Total the response times and compare to the target SLA
The last step uses the method proposed by Menasce [2004], who demonstrates by simple examples how to calculate the total execution time for a composite web service. Referring to Figure 4.2, Service A invokes B with probability $p_1$ and invokes C with probability $p_2 = 1 - p_1$. Likewise, C invokes D with probability $p_3$ and invokes E with probability $p_4 = 1 - p_3$. Finally, F is invoked when either D or E finishes, or when B finishes. In this example the total execution time, T, is given by:

$$T = t_A + p_1 t_B + p_2 (t_C + p_3 t_D + p_4 t_E) + t_F$$

where $t_X$ is the response time of service X.
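The expression for the total execution time of this example workflow can be evaluated directly. The sketch below assumes the response times are supplied in a dictionary keyed by service name; the function name and argument layout are hypothetical.

```python
def total_execution_time(t, p1, p3):
    """Expected execution time of the example workflow A -> (B | C -> (D | E)) -> F.

    t:  response times keyed by service name "A" to "F"
    p1: probability that A invokes B (A invokes C with probability 1 - p1)
    p3: probability that C invokes D (C invokes E with probability 1 - p3)
    """
    p2 = 1.0 - p1
    p4 = 1.0 - p3
    return t["A"] + p1 * t["B"] + p2 * (t["C"] + p3 * t["D"] + p4 * t["E"]) + t["F"]
```

With every service taking 1 s and both branch probabilities equal to 0.5, the expected total is 3.5 s: A and F always run, B contributes 0.5 s on average, and the C branch contributes 0.5 x 2 s.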
Once the total execution time is calculated from the MVA results, the fitness of a chromosome is defined here as:

fitness = (SLA response time for workflow) - (MVA predicted total execution time for workflow)
This means that the better solutions in terms of response times lead to a higher fitness value. Negative fitness values indicate chromosomes that produce results where the predicted response time is outside the SLA targets. The chromosomes simply have to be ranked by fitness value in order for the fitness selector to choose the next population pool.
6.3.4 Fitness Selector

The fitness selector (i.e. the mechanism that decides which chromosomes will survive to bear children) uses a standard weighted roulette wheel. Each individual is assigned a slot of size $s_i$ on the wheel, where $s_i$ is given by:

$$s_i = \frac{F_i}{F_1 + F_2 + F_3 + \cdots + F_n}$$

where $F_i$ is the fitness of individual i as calculated above, and the total population size is n. The wheel is then spun n times to determine how many copies of each parent will breed. Parents with a large slot size (large fitness value) are more likely to be selected than parents with a small slot size.
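A weighted roulette wheel of this kind can be sketched as follows. The text does not say how negative fitness values are placed on the wheel, so the sketch makes the explicit assumption that they are clamped to zero and therefore receive no slot; the function name and that clamping rule are illustrative and not taken from the author's implementation.

```python
import random


def roulette_select(population, fitnesses, rng=random):
    """Spin a fitness-weighted wheel once per individual in the population.

    Assumption: negative fitness values are clamped to zero, so those
    individuals get no slot on the wheel.
    """
    weights = [max(f, 0.0) for f in fitnesses]
    total = sum(weights)
    if total <= 0:
        raise ValueError("no individual has positive fitness")
    slots = [w / total for w in weights]  # slot size for each individual
    return rng.choices(population, weights=slots, k=len(population))
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable, which is useful when comparing selection strategies.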
6.3.5 Constraints The fitness evaluation function can also be used to apply constraints easily. An example of a constraint is the situation where a Web Service is unavailable on a given resource. In this case, a chromosome that represents a job of that type running on that resource is clearly representing an unphysical situation. Chromosomes such as these can be given a fitness penalty: a large negative value that effectively removes the chromosome from the population pool by ensuring that it ranks at the bottom of the fitness league and is never chosen by the fitness selector.
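As a minimal sketch of the penalty idea, the function below returns a large negative fitness whenever a decoded chromosome places a job on a resource where its service is not installed. The penalty value, the registry layout and the function name are assumptions for illustration only.

```python
PENALTY = -1.0e9  # hypothetical "large negative value"


def constrained_fitness(assignments, registry, unconstrained_fitness):
    """Return the penalty if any job is placed where its service is not installed.

    assignments : iterable of (service, resource) pairs decoded from a chromosome
    registry    : dict mapping service -> set of resources that host it
    """
    for service, resource in assignments:
        if resource not in registry.get(service, set()):
            return PENALTY
    return unconstrained_fitness


# Illustrative registry: service F is assumed to be installed on R1 only.
registry = {"A": {"R1", "R3"}, "F": {"R1"}}
print(constrained_fitness([("A", "R1"), ("F", "R1")], registry, 3.25))
print(constrained_fitness([("A", "R1"), ("F", "R3")], registry, 3.25))
```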
6.3.6 Crossover There are many ways of performing the crossover operation in GAs. In the present work many of the most popular are ruled out because they produce offspring that do not represent real alternative solutions to the redistribution of jobs across resources to produce the same workflow. For example, simple crossover (Goldberg p12) involves splicing one segment of a chromosome into the other, and vice versa. Consider Parents 1 and 2 from section 6.2.2 above.
Parent 1: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
Parent 2: 7, 6, 10, 4, 8, 1, 3, 13, 2, 11, 5, 9, 12
If these had the segment from locus 2 to 4 swapped they would become
Offspring 1: 1, 6, 10, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
Offspring 2: 7, 2, 3, 4, 8, 1, 3, 13, 2, 11, 5, 9, 12
Then, once decoded, it will be seen that neither offspring represents a solution to the original workflow. For example, in offspring 1, the task represented by gene 6 occurs twice whilst the task represented by gene 2 never occurs at all. However, crossover operators are known that preserve the genes in a chromosome. They are known as permutation crossover operators. A well-known example of a problem that requires the use of a permutation crossover operator is the travelling salesman problem, in which the salesman must visit each city once and once only by the shortest possible route. The chromosome to solve this problem with a GA is also a set of unique genes, each one representing a unique city. Two popular permutation crossover operators for solving this problem are the cycle crossover [Oliver et al 1987] and the greedy crossover [Grefenstette et al 1985]. Following Zomaya and Teh [2001], we can demonstrate how cycle crossover works. The first step is to randomly choose two parents from the population pool. For example, take Parents 1 and 3 from section 6.2.2 above:
Parent 1: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
Parent 3: 7, 6, 10, 4, 8, 1, 3, 13, 2, 12, 5, 9, 11
To derive their children we start with a random position in parent 1, e.g. position 8, and fix the value of the gene here in offspring A:
Offspring A: + + + + + + + 8 + + + + +
Now we also fix whatever gene is in the same position in parent 3 in offspring B:
Offspring B: + + + + + + + 13 + + + + +
Next take this value in offspring B, locate its occurrence in parent 1 and fix this value at the same position in offspring A:
Offspring A: + + + + + + + 8 + + + + 13
Then match parent 3's value at this position to offspring B:
Offspring B: + + + + + + + 13 + + + + 11
Continue cycling in this way until the value of the gene that we first started with (8) is reached in offspring B:
Offspring A: + + + + + + + 8 + + 11 + 13
Offspring B: + + + + + + + 13 + + 5 + 11

Offspring A: + + + + 5 + + 8 + + 11 + 13
Offspring B: + + + + 8 + + 13 + + 5 + 11
Now we swap and start cycling with parent 3 using the first value that has so far not been considered. In this case the gene value 1 has not been used yet so we start with that. It occurs in position 6 in parent 3 so we place it in position 6 in offspring A. Offspring B receives gene 6 from parent 1.
Offspring A: + + + + 5 1 + 8 + + 11 + 13
Offspring B: + + + + 8 6 + 13 + + 5 + 11
Gene 6 is in position 2 in parent 3 so offspring A receives gene 6 at position 2 and offspring B gets the gene that was in this position in parent 1 (gene 2).
Offspring A: + 6 + + 5 1 + 8 + + 11 + 13
Offspring B: + 2 + + 8 6 + 13 + + 5 + 11
Continuing:
Offspring A: + 6 + + 5 1 + 8 2 + 11 + 13
Offspring B: + 2 + + 8 6 + 13 9 + 5 + 11

Offspring A: + 6 + + 5 1 + 8 2 + 11 9 13
Offspring B: + 2 + + 8 6 + 13 9 + 5 12 11

Offspring A: 7 6 10 + 5 1 3 8 2 12 11 9 13
Offspring B: 1 2 3 + 8 6 7 13 9 10 5 12 11
This brings us back to the gene value we started with (1) and so we start again from parent 1. There is only one more gene value left:
Offspring A: 7, 6, 10, 4, 5, 1, 3, 8, 2, 12, 11, 9, 13
Offspring B: 1, 2, 3, 4, 8, 6, 7, 13, 9, 10, 5, 12, 11
Decoding these offspring, they represent the following cases:
Offspring A: 2, 1, 3, -1, 1, 1, 2, 3, 2, 2, -1, 3, 3
Class 1 has 1 job running on R1 and 2 on R2
Class 2 has 1 job running on R1 and 3 on R2
Class 3 has 1 job running on R1, 1 on R2 and 2 on R3
Offspring B: 1, 2, 2, -1, 3, 1, 2, 3, 3, 3, 1, 2, -1
Class 1 has 1 job running on R1 and 2 on R2
Class 2 has 2 jobs running on R1 and 2 on R2
Class 3 has 4 jobs running on R2
No jobs run on R3.
In conclusion, the permutation crossover operator has produced two offspring which are both valid solutions to the workflow we started with.
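The cycle crossover walk-through can be sketched in a few lines. This is an illustrative implementation, not the JGAP-derived code used in the management solution; the function name and the `start` parameter are assumptions. Positions are 0-based in the code, so position 8 in the text is index 7.

```python
def cycle_crossover(parent1, parent2, start=0):
    """Cycle crossover for two permutations of the same genes.

    Cycles are traced one at a time. For the first cycle, offspring A copies
    parent1 and offspring B copies parent2; the roles swap on every new cycle.
    """
    n = len(parent1)
    pos_in_p1 = {gene: i for i, gene in enumerate(parent1)}
    child_a = [None] * n
    child_b = [None] * n
    take_from_p1 = True
    # Trace the cycle through `start` first, then any positions still unfilled.
    for first in [start] + list(range(n)):
        if child_a[first] is not None:
            continue
        i = first
        while child_a[i] is None:
            if take_from_p1:
                child_a[i], child_b[i] = parent1[i], parent2[i]
            else:
                child_a[i], child_b[i] = parent2[i], parent1[i]
            # Follow the gene just seen in parent2 to its position in parent1.
            i = pos_in_p1[parent2[i]]
        take_from_p1 = not take_from_p1
    return child_a, child_b


parent_1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
parent_3 = [7, 6, 10, 4, 8, 1, 3, 13, 2, 12, 5, 9, 11]
offspring_a, offspring_b = cycle_crossover(parent_1, parent_3, start=7)
print(offspring_a)
print(offspring_b)
```

Because each cycle is a closed set of positions, every gene appears exactly once in each offspring, which is the property that simple crossover fails to preserve.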
6.3.7 Mutation Mutation is applied to z% of the offspring (where z is typically < 1). The mutation is a positional change of one element: the element is chosen at random and moved to a random new position. For example:
Before mutation: 1, 2, 3, 4, 8, 6, 7, 13, 9, 10, 5, 12, 11
After mutation: 1, 2, 3, 4, 8, 6, 5, 7, 13, 9, 10, 12, 11
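A positional-change mutation of this kind might look like the sketch below. The helper names are hypothetical; `move_gene` is deterministic so that the example in the text (gene 5 moved from position 11 to position 7, i.e. index 10 to index 6) can be reproduced, while `mutate` picks both positions at random.

```python
import random


def move_gene(chromosome, src, dst):
    """Return a copy with the gene at index src moved to index dst."""
    mutated = list(chromosome)
    gene = mutated.pop(src)
    mutated.insert(dst, gene)
    return mutated


def mutate(chromosome, rng=None):
    """Move one randomly chosen gene to a randomly chosen new position."""
    rng = rng or random.Random()
    src = rng.randrange(len(chromosome))
    dst = rng.randrange(len(chromosome))
    return move_gene(chromosome, src, dst)


before = [1, 2, 3, 4, 8, 6, 7, 13, 9, 10, 5, 12, 11]
print(move_gene(before, 10, 6))
print(mutate(before, random.Random(1)))
```

The operator only reorders genes, so a mutated chromosome remains a valid permutation.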
6.3.8 Population Evolution Once all offspring have been created, the new population pool is evaluated for the fitness of each individual. The process of fitness evaluation, selection, reproduction, crossover and mutation is then repeated m times. After m evolutions the fittest individual is taken as the optimal solution. Determination of m is a trade-off determined experimentally. It will depend on the computation time available and how quickly the GA converges to an optimal solution.
6.4 Technical Design of the Management Solution

6.4.1 The ESB Orchestration is a phrase given to a compound design pattern that attempts to centralize process logic and service repository information using an orchestration platform. It wraps a number of other common enterprise integration design patterns, such as Message Translation and Content-Based Routing [Hohpe and Woolf, 2004], to enable web services to be efficiently composed into workflows representing complex business processes spanning the enterprise. Orchestration services are commonly provided by an Enterprise Service Bus (ESB), a term originally coined by Gartner Group and defined by the CBDI [CBDI Forum Ltd, 2004] as a uniform service integration architecture of infrastructure services that provides consistent support to business services across a defined ecosystem. The orchestration service provides a global view of all of the services that are used to compose a business process. It thus offers a single point from which a complex service-oriented architecture can be monitored in terms of the response times and throughput of all the individual services that make up the process. Indeed, this monitoring capability is provided off-the-shelf by most ESB vendors. However, the collected statistics are used offline for performance analysis, not as part of an automated runtime management solution. Of benefit to this thesis is the fact that it offers a point of access to inject dynamic routing commands as a result of optimization changes suggested by the GA.
6.4.2 Dynamic Routing Dynamic routing in an ESB is usually used to select a service endpoint at runtime. The list of available services is maintained in a service repository, often a UDDI store, and a rules engine is queried to identify the most suitable service to choose. This feature is extended in the design of the management solution to use the results of the GA for dynamically routing service requests. The decoded results of the GA are stored in a database table and at run-time they are used to select an endpoint for a given job class by using a weighted algorithm.
For example, if the GA proposes that for a given workflow under a given load, 5 jobs of class A should be processed on Server 1 and 3 jobs on Server 2 then the weighted algorithm will forward 5 jobs out of every 8 to Server 1 and 3 out of every 8 to Server 2.
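The text does not specify which weighted algorithm the selector uses. One possibility, sketched here purely as an assumption, is a smooth weighted round-robin, which sends exactly 5 of every 8 consecutive jobs to Server 1 and 3 of every 8 to Server 2 while interleaving the two. The class and method names are hypothetical.

```python
class WeightedEndpointSelector:
    """Smooth weighted round-robin over the endpoints for one job class."""

    def __init__(self, weights):
        # weights: dict mapping endpoint name -> number of jobs proposed by the GA
        self.weights = dict(weights)
        self.current = {endpoint: 0 for endpoint in self.weights}
        self.total = sum(self.weights.values())

    def next_endpoint(self):
        # Raise every endpoint's running score by its weight, pick the highest,
        # then charge the chosen endpoint the total weight.
        for endpoint, weight in self.weights.items():
            self.current[endpoint] += weight
        chosen = max(self.current, key=self.current.get)
        self.current[chosen] -= self.total
        return chosen


selector = WeightedEndpointSelector({"Server 1": 5, "Server 2": 3})
routed = [selector.next_endpoint() for _ in range(8)]
print(routed)
print(routed.count("Server 1"), routed.count("Server 2"))
```

A random weighted choice would give the same proportions on average; the deterministic variant is shown because it matches the ratio exactly within each block of 8 jobs.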
6.4.3 Logical Design The management solution is composed of the components shown in Figure 6.2. All components are written in C#.NET 3.5 and are described in Table 6.2:
[Figure 6.2: component diagram (QASAR) showing the UI, Monitor, ESB, Statistics, Controller, ESB Audit Data, SLA, Weighted Endpoint Selector, MVA, GA and Service Registry components]
Figure 6.2 Logical Design
Component: UI
Purpose: This provides a simple UI for viewing results and initiating an optimization check.

Component: Monitor
Purpose: The Monitor extracts the raw data from the ESB audit database for each and every job that has run through the ESB, then inserts average execution times and job counts into a table for each distinct resource and web service.

Component: Controller
Purpose: The Controller assesses the data provided by the Monitor using the MVA model and the SLA component to determine if the workflows are currently operating outside their SLA targets. If they are, it initiates the GA to find a better solution and writes the results back to the ESB database ready for the ESB's Weighted Endpoint Selector to use.

Component: Statistics
Purpose: This component provides useful statistical functions to the Controller, such as linear and curve-fitting analysis, to allow the load-dependency of the response times for each service on each resource to be calculated (this is needed by the MVA model).

Component: SLA
Purpose: This component maintains SLA data for each workflow. SLA data is stored in the WS-QoS XML language [Tian et al 2004].

Component: MVA
Purpose: This is the MVA model introduced in Chapter 3.

Component: GA
Purpose: This is the GA introduced above. Much of the core code is based on the excellent open-source JGAP project [JGAP, 2008], which was then partially ported to .NET and amended to suit the specific needs of this thesis. The GA uses the MVA model as a fitness function. It uses the Service Registry to identify constraints; for example, services might not be installed on all resources. The best solution derived by the GA is returned to the Controller. The parameters of the GA are configurable but for the purposes of these tests, both the population size and maximum number of evolutions were set to 50. The mutation rate was 0.1%. The population size and number of evolutions were chosen to provide a rapid calculation, returning a solution in < 1 second on the Web Server hosting the ESB.

Component: ESB
Purpose: This is a simple ESB implementation by the thesis author allowing simple composite web service applications to be built and providing full audit capabilities for the workflows running through it.

Component: ESB Audit Data
Purpose: This is a database containing the audit data of every job that passes through the ESB. It is extended from the basic ESB audit database to include tables for capturing the GA results.

Component: Weighted Endpoint Selector
Purpose: This is a dynamic router that extends the basic functionality of the ESB to dynamically route service requests based on the results of the GA. It uses the Service Registry to identify valid endpoints.

Component: Service Registry
Purpose: This is a registry of all the services available to participate in the workflow, and the resources that they are installed on.

Table 6.2 Logical Design
6.5 Test Harness

6.5.1 Sample Workflows The test harness consisted of four different workflows implemented on the ESB, representing four types of insurance transaction. Each workflow consists of a simple linear sequence of jobs executed by 6 web services, A to F. These are shown in Figure 6.3.
Figure 6.3 Sample Workflows
The job type referred to as A in the diagram is a common authentication and authorisation function. Job B is a service that inserts new policy data into a bookkeeping system. Job C is a service that generates transaction details that are posted to a queue for further processing (for example, policy document generation). Job D is a service that adds an endorsement to an existing policy in the bookkeeping system. Job E is a service that creates a renewal policy based on the previous year's policy. Job F is a service that cancels a live or lapsed policy in the bookkeeping system.
Each Web Service is implemented in .NET and involves a query to a local SQL Server database. The baseline tests were conducted against an arbitrary workload and arbitrary initial distribution of the web services across the available resources. The workload is defined by the ratio of workflows to each other. For the baseline, for each new business workflow there is one endorsement, one renewal and one cancellation. Therefore the baseline workload consists of 4 workflows and generates the following 11 jobs, shown in Table 6.3:
Service   A   B   C   D   E   F
# Jobs    4   1   3   1   1   1
Table 6.3 Sample Workload
The initial distribution of these services is arbitrary and the aim is to run 15 concurrent cycles of each workflow to ensure that the servers are put under genuine load rather than idling. Hence the total workload creates about 165 jobs. Three Web Servers were available for the performance tests. Web Server 2 is chosen only to host the service bus which is redirecting the jobs to other servers. This is applied as a constraint to the GA to prevent any concurrency issues between the operation of the ESB and the services interfering with the baseline data.
Web Servers 1 and 3 handle the jobs in the ratio shown in the table. For example, all jobs of type E and F are processed on Web Server 1. However, 1 out of every 4 jobs of type A is processed on Web Server 3, as shown in Table 6.4.
Web Server   Service A   Service B   Service C   Service D   Service E   Service F
1            45 Jobs     0 Jobs      30 Jobs     0 Jobs      15 Jobs     15 Jobs
3            15 Jobs     15 Jobs     15 Jobs     15 Jobs     0 Jobs      0 Jobs
Table 6.4 Job Distribution
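The workload arithmetic can be checked with a short sketch. It takes the per-cycle job counts from Table 6.3 and the baseline distribution from Table 6.4 and confirms that 15 cycles give 165 jobs and that the distribution accounts for all of them. The variable names are illustrative.

```python
# Jobs generated by one cycle of the four workflows (Table 6.3).
jobs_per_cycle = {"A": 4, "B": 1, "C": 3, "D": 1, "E": 1, "F": 1}
CYCLES = 15  # concurrent cycles of each workflow

# Baseline distribution of jobs across the two servers (Table 6.4).
baseline = {
    "Web Server 1": {"A": 45, "B": 0, "C": 30, "D": 0, "E": 15, "F": 15},
    "Web Server 3": {"A": 15, "B": 15, "C": 15, "D": 15, "E": 0, "F": 0},
}

expected = {service: n * CYCLES for service, n in jobs_per_cycle.items()}
distributed = {
    service: sum(server[service] for server in baseline.values())
    for service in jobs_per_cycle
}
total_jobs = sum(expected.values())
share_of_a_on_server_3 = baseline["Web Server 3"]["A"] / expected["A"]

print(total_jobs)
print(distributed == expected)
print(share_of_a_on_server_3)
```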
Each Web Server is a VMWare instance of a Windows 2003 Server (with SP1) running IIS 6 and SQL Server 2005 (with SP2). The VMWare instances are hosted by VMWare Workstation 6 [VMWare, 2008].
6.5.2 Baseline Performance Test Results The baseline results for a sample service on each web server are shown in Figure 6.4. Both are on the same scale for easy comparison. These results were obtained over a period of four hours of continuous operation during which the load was increased from 1 thread to 120 concurrent threads using soapUI [Eviware, 2008].
[Figure 6.4: two panels, "Web Server 1 - Web Service A" and "Web Server 3 - Web Service A", each plotting Exec Time (ms), 0 to 10000, against Jobs/min, 0 to 400]
Figure 6.4 Baseline Results
Web Server 1 exhibits little performance degradation as the load increases; however, Web Server 3 performs poorly under load as a result of its RAM being artificially constrained using VMWare Workstation. The results for the other Web Services are almost identical to those for Web Service A and are therefore omitted for brevity.
Individual baseline load tests were also conducted for each Web Service on each Web Server. This helps fill in some of the gaps in the GA's knowledge; for example, the initial workload sends no job requests to Web Service F on Server 3. With the test harness running at approximately 120 jobs/min, the total workflow execution times were measured using the data collected. 120 jobs/min is chosen as a useful load for testing purposes because it is evident from the graphs above that Server 3 is showing considerable distress at these loads. Further, it represents approximately the initial target load of 15 concurrent threads. The measured execution times are shown below in Table 6.5 with an SLA target for the workflow in the right-hand column. The SLA targets have been deliberately chosen to ensure that the GA will be triggered. So, Workflows A and D are slightly missing their targets, whilst Workflow B is further out. Only Workflow C is inside its SLA target.
Workflow     Total Execution Time (s)   SLA Target (s)
Workflow A   20.7                       20.0
Workflow B   19.9                       18.0
Workflow C   14.6                       15.0
Workflow D   9.3                        9.0
Table 6.5 Measured Execution Times
6.5.3 Post GA Results At this point, the GA is triggered and returns an improved solution, redirecting a number of jobs that were running on Server 3 to run on Server 1 instead, see Table 6.6:
Web Server   Service A   Service B   Service C   Service D   Service E   Service F
1            50 Jobs     15 Jobs     30 Jobs     15 Jobs     0 Jobs      0 Jobs
3            10 Jobs     0 Jobs      15 Jobs     0 Jobs      15 Jobs     15 Jobs
Table 6.6 Optimised Job Distribution
Following the GA's intervention, the total execution times were again measured following a period of operating the test harness at continuous load, as shown in Table 6.7:
Workflow     Total Execution Time (s)
             BEFORE   AFTER
Workflow A   20.7     7.5
Workflow B   19.9     7.4
Workflow C   14.6     5.7
Workflow D   9.3      4.9
Table 6.7 Optimised Execution Times
6.6 Summary and Discussion

The results demonstrate that the GA coupled with the MVA fitness function is capable of finding improved solutions for our case study composite application implemented as a set of web services distributed across multiple resources. Indeed, so successful in this case was the GA that the operational load was doubled to 240 jobs/min whilst still meeting the SLA targets. Lack of client hardware to increase the concurrent threads further prevented a limit being exposed. To conclude, this chapter presents the first published work to use MVA as the fitness function for a Genetic Algorithm. It is also the first published work to apply a GA to dynamic performance improvement of a real workflow implemented as a set of Web Services across multiple servers. Previous published work on using GAs to improve service composition has restricted itself to numerical simulations.
Chapter 7 Strategy for QoS-aware Composite Applications in the Cloud

The primary concern of this chapter will be to discuss the architecture of an enterprise application (such as supply chain management, SCM) in terms of how it might be built using a QoS-aware composition of services, both internal to the organisation and external to the organisation [Candido et. al. 2009]. Our assumption is that external services would be available via SaaS (Software as a Service) vendors in the Cloud. Since the SaaS vendors offer services from multi-tenant architectures, the enterprise application needs to monitor, model and react to changing latencies in the services that it is consuming and take appropriate action where necessary, such as switching to use the service of an alternative SaaS vendor. We will discuss how the MVA model and the optimisation algorithm could form the basis of an effective consumer strategy to manage the composite QoS of the enterprise application.
7.1 Enterprise SOA and Cloud Computing

7.1.1 Cloud Computing and Software as a Service Over recent years businesses have evolved their enterprise architectures to include packaged applications, legacy systems, and bespoke line-of-business (LOB) applications that are integrated using Service-oriented architectures. The advent of Cloud Computing allows these existing enterprise SOA initiatives to be extended, creating new business opportunities and alleviating existing business pressures [Lakshmanan et. al. 2009]. Cloud Computing allows a business to focus on its core capabilities whilst outsourcing other aspects to external vendors. Cloud Computing is generally understood to consist of the following broad types of service offered to a business on a pay per usage basis:
Infrastructure as a Service (IaaS): the business hires server infrastructure as a service. The Cloud provider scales up or down the amount of server resource available to the consumer. There is a strong focus on virtualization technologies.
Platform as a Service (PaaS): the business hires infrastructure and a complete development environment and computing platform for application development and deployment. Initiatives such as Microsoft Azure are an example.
Software as a Service (SaaS): the business hires web-based software services. These are exposed to the business by the vendor using SOAP, WSDL and WS-* based standards, which makes them SOA friendly.
Our focus in this thesis is on the last of these, Software as a Service, and how SaaS integrates with the local SOA to present interesting QoS challenges to the business. As has been discussed in the introduction to this thesis, SOA strategies allow a business to reuse existing assets (legacy applications and data) in new LOB applications. The architecture promotes agile and reconfigurable application development which is ideal in 21st century businesses. SaaS opens up new avenues for enterprise architects to contribute additional business agility:
Non-core Services: Using SaaS the business can now start to streamline its application architecture. For example, any non-core business services are very good candidates for outsourcing. This then allows the business to focus on development of local business functions that support its core business and provide differentiation over its competitors. Examples of common applications that could be outsourced include CRM systems and portals for enterprise content management.
Workload Management: A business can use SaaS to assist with short-term increases in workload. For example, if postal address validation (a non-core activity for most businesses) was outsourced then demand peaks (e.g. a batch run of contact addresses) could be met using a second SaaS vendor.
Centralised Applications: Many large businesses have duplicate applications across the business. This situation arises very easily, especially where a business has grown through acquisitions or mergers. The SaaS model allows duplicate functionality to be consolidated.
7.1.2 A Unified Architecture A unifying vision of how SaaS and SOA can be used to create a composite application was described by Schneider [2007]. In his example he describes the construction of a composite application built using SOA principles that integrates a simple CRM service from a SaaS vendor with a company's own finance system using Web Services to streamline their internal operations. In more general terms, Figure 7.1 depicts how a composite application might be built using a local SOA and SaaS in the Cloud. The local SOA uses an ESB to orchestrate the business process and utilises packaged applications, in-house LOB applications and legacy file drops via Web Service wrappers. It also uses two services from the Cloud. The standardised Web Service technology means that, for the ESB, there is no difference between integrating with a local application and integrating with a remote SaaS service. The interface is simply a REST or SOAP interface presenting data to the ESB as JSON or XML, just as the in-house interfaces do.
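[Figure 7.1: diagram showing two services in the Cloud and the on-premises SOA, in which an Enterprise Service Bus connects to local applications through adapters and to a file drop; numbered steps 1 to 5 mark the flow]
Figure 7.1 SOA and SaaS used to create a composite application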
7.1.3 Challenges with SLA Management Many SaaS vendors, e.g. Amazon, are beginning to offer SLAs that cover availability and reliability. However, SLA assurances about performance metrics such as response times are not widely available. The thesis author raised this topic on the discussion forum of the SaaS group on the LinkedIn business networking site. Despite the fact that this group has almost 6000 members worldwide (as of December 2009), just two SaaS vendors voluntarily offered performance-related SLA metrics for their services. Of these two companies only one (Intactt) publicly displays those figures on its website.
So it is clear that if a business is serious about exploiting the architectural advantages of SaaS it will need to have in place processes and mechanisms to ensure that the SLA targets it has for delivery of services to its own customers are not hampered by problems in the service it receives from the SaaS vendors. Not only are the SaaS vendors not yet advertising QoS metrics, the methods and techniques for estimating overall QoS in a composite application simply do not exist at the moment. This problem was recognised in a recent review by the Software Engineering Institute [Bianco et. al. 2008] as being one of the major hurdles still to be addressed for SLA management in SOAs. They state: "More research is needed to understand and determine the QoS of composite services. For example, if the performance measure for a set of services is known, how can the performance for a composite service that used this set of services be determined?" In the following section we demonstrate how the MVA developed in Chapter 5 and the Genetic Algorithm developed in Chapter 6 can be used to address these issues.
7.2 Modelling a Composite Application in the Cloud

The model developed in Chapter 5 is easily extended to the Cloud. Each SaaS vendor is simply treated as a load-independent resource with its own queue. In Figure 7.2, as an example, we show a very general and simple queuing network model with a single internal server and two SaaS vendors, each modelled as its own queue.
[Figure 7.2: queuing network in which multiple workflows managed by the ESB are served by three queues: Internal Resource, Vendor X and Vendor Y]
Figure 7.2 Generalised example queuing network model including SaaS services
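To illustrate how a SaaS vendor enters the model as one more queue, the sketch below applies the textbook exact MVA recursion for a single-class closed network of load-independent queues. This is a deliberate simplification and an assumption of this example: the thesis model is a multi-class approximate MVA with load-dependent response times. The service demands are made-up values.

```python
def exact_mva(demands, customers, think_time=0.0):
    """Exact single-class MVA for load-independent queueing stations.

    demands   : dict mapping station name -> service demand (seconds per job)
    customers : number of jobs circulating in the closed network
    Returns (residence times, throughput, mean queue lengths).
    """
    queue = {name: 0.0 for name in demands}
    residence = {name: 0.0 for name in demands}
    throughput = 0.0
    for n in range(1, customers + 1):
        # An arriving job sees the queue left by a network with n - 1 jobs.
        residence = {name: d * (1.0 + queue[name]) for name, d in demands.items()}
        response_time = sum(residence.values())
        throughput = n / (think_time + response_time)
        queue = {name: throughput * r for name, r in residence.items()}
    return residence, throughput, queue


# Illustrative service demands for the three queues of Figure 7.2.
demands = {"Internal Resource": 0.20, "Vendor X": 0.10, "Vendor Y": 0.30}
residence, throughput, queue = exact_mva(demands, customers=5)
print(sum(residence.values()), throughput)
print(queue)
```

Swapping one vendor for another amounts to changing a single entry in `demands` and re-running the recursion, which is the kind of what-if question the model is intended to answer.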
7.3 Strategies for Automated QoS Control in the Cloud

Armed with a model and a performance improvement algorithm we are now in a position to understand how they can be applied to the autonomous management of our composite application in terms of ensuring it meets its SLA targeted response times for each workflow. To do so we consider in turn a number of different problems that can arise in a widely distributed architecture such as that being considered.
7.3.1 Changes in Workload If the workload were fixed, we would only ever need to run the GA once. However, the workload is constantly changing. Suppose in any given time frame we had 4 users, one of them creating a policy, one adding an endorsement, one renewing a policy and one cancelling a policy, as shown in the test workflows of Figure 6.3. In terms of the work requests we would be processing 4 work requests of type A, 3 of type C and one each of types B, D, E and F. If instead we had two users creating a policy and two endorsing a policy we would be processing 4 work requests of type A, 4 of type C, 2 of type B and 2 of type D. Hence we need to run the GA at regular intervals and continuously reapply the latest optimised solution to the ESB's routing plan.
7.3.2 Loss of Service The complete loss of service from one particular vendor or resource is the easiest situation to deal with. The GA is instructed to create solutions in which the unavailable resource does not receive any requests at all. This can be achieved simply by adding a large penalty to the fitness of any solution that sends work requests to that resource. The GA will rapidly remove such solutions from its pool and will prefer solutions which obtain the same service from an alternative vendor.
7.3.3 Increases in Latency Increases in latency at a resource are handled in the same way as workload changes. We simply need to ensure that we continuously update the GA with the latest performance data captured by the ESB. If latency increases become excessive we might choose to temporarily remove the resource from consideration, as above. To do this we need to be able to recognise when a particular resource is underperforming. If we continuously check the average response time against the baseline data then this can be achieved quite simply. The only decision we need to make is what percentage increase in response times should be taken as the trigger point for considering the resource to be significantly underperforming. This figure is best determined through performance testing of the particular application in question.
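A trigger of this kind could be as simple as the following sketch, which compares the average of recent response times with a baseline average. The function names, the sample timings and the 20% trigger are illustrative assumptions; as noted above, the actual trigger point should come from performance testing.

```python
def percent_increase(recent_times, baseline_avg):
    """Percentage by which the recent average exceeds the baseline average."""
    current_avg = sum(recent_times) / len(recent_times)
    return 100.0 * (current_avg - baseline_avg) / baseline_avg


def is_underperforming(recent_times, baseline_avg, trigger_pct):
    """True when the recent average is more than trigger_pct above baseline."""
    if not recent_times:
        return False  # no recent data, so nothing to judge
    return percent_increase(recent_times, baseline_avg) > trigger_pct


# Made-up response times in milliseconds against a 2000 ms baseline.
recent = [2400, 2600, 2500]
print(percent_increase(recent, baseline_avg=2000))
print(is_underperforming(recent, baseline_avg=2000, trigger_pct=20.0))
```

A resource flagged in this way could then be given the same fitness penalty as an unavailable resource until its response times recover.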
7.3.4 Differentiated Services What happens if the GA cannot find any solution where all the workflows meet the SLA targets? Suppose we instruct the GA to complete a maximum of 100 cycles in its attempt to find an optimised solution and it fails to find any solution where all workflows meet their SLA targets. This could happen if a resource suffered performance degradation and it happened to be a single point of supply of a service. Or we might simply be trying to handle too many work requests and all workflows are suffering as a result. Our only option in this kind of scenario is to remove some of the work requests from our application's load. To do this we need to investigate admission control of work requests and the policies around which this could be applied. This strategy is similar to differentiated services models used in network policy management.
Priority Customers: Admission control could work in several ways. Firstly, we might provide different QoS levels, e.g. Platinum, Gold, Silver or Bronze, and implement an admission control policy around the level of service that the customer has paid for. This is a simple solution for us to implement using the GA/MVA model. We rank all of the customers currently using the application by grade of service from poorest to best, and gradually remove customers, one grade at a time. At each step we can reapply the GA with the reduced workload simply by removing all job requests for those lower ranked customers. We can repeat the process of removing customers until the SLA targets for the remaining customers are being met again. Using the architecture of Figure 7.1, we can instruct the ESB either to queue requests from blocked customers until the ESB is handling fewer requests, or to return an error message to the customer requesting the service.
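The grade-by-grade removal can be sketched as a simple loop. Here `meets_sla` is a hypothetical stand-in for re-running the GA/MVA model on the reduced workload, and the grade names follow the example in the text; nothing in this sketch is taken from the prototype.

```python
GRADES = ["Bronze", "Silver", "Gold", "Platinum"]  # poorest to best


def admit_by_grade(requests, meets_sla):
    """Block customers one grade at a time until the SLA check passes.

    requests  : list of (customer, grade) pairs currently using the application
    meets_sla : callable taking the admitted list and returning True when all
                remaining workflows are predicted to meet their SLA targets
    Returns (admitted, blocked).
    """
    admitted = list(requests)
    blocked = []
    for grade in GRADES:
        if meets_sla(admitted):
            break
        blocked += [r for r in admitted if r[1] == grade]
        admitted = [r for r in admitted if r[1] != grade]
    return admitted, blocked


requests = [("c1", "Bronze"), ("c2", "Gold"), ("c3", "Silver"),
            ("c4", "Platinum"), ("c5", "Bronze")]
# Toy predictor: pretend the SLA is met once at most three customers remain.
admitted, blocked = admit_by_grade(requests, lambda a: len(a) <= 3)
print(admitted)
print(blocked)
```

The blocked list is what the ESB would either queue or reject with an error message. The same loop shape applies to workflows ranked by priority instead of customers ranked by grade.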
Priority Workflows: Another strategy would be to prioritise the workflows on offer. As an example, if one of the workflows was "month-end reports" we might agree with our customers that it was less business critical than "create a new policy", which is revenue earning. This time we rank all our workflows by priority from lowest to highest, and successively remove job requests from the GA/MVA model which relate to each workflow. Again, we can repeat the process until the SLA targets for the remaining workflows are met.
Priority Sessions: Another option is to employ a session-based admission control policy. In this strategy we give priority to users who have almost completed a workflow over those who have just started. This is a useful strategy when pay-per-use pricing models, which are popular with SaaS vendors, are in use. Consider in Figure 6.3 a user executing the create policy workflow who has reached the step represented by Service C. If the other services in the workflow have all been processed we have already paid for them. So, from a financial perspective it makes sense to allow the user to complete the workflow rather than to admit, for example, a request for Service A on this workflow from a completely new user. The GA/MVA model can handle priority sessions by ranking all requests by proximity to the end of the workflow, from those furthest from the end to those at the end.
7.4 Conclusions
We have discussed some of the QoS challenges surrounding a new breed of composite enterprise application architectures in which the benefits of SOA are combined with the benefits of SaaS services in the Cloud. We believe that guaranteeing SLA targets for performance-based metrics such as response times is challenging, as the Software Engineering Institute has rightly pointed out. However, we hope that this chapter demonstrates that those challenges can be met. In our own research we have attempted to provide solutions that model these architectures and might lead to autonomous QoS-aware management, and we have demonstrated how our solutions could be applied to common problems.
7.5 Publications
This chapter was presented at the SQM 2010 Conference: Shelly Saunders, Margaret Ross, Geoff Staples, and Sean Wellington, 2010. Meeting SLA targets with Composite Applications in the Cloud, In: Perspectives in Software Quality, Eighteenth International Conference on Software Quality Management (SQM 2010).
Chapter 8 Evaluation and Conclusions

The hypothesis was that there exist solutions and strategies that will allow providers of composite applications built using Web Services to manage the QoS metrics of those applications in such a way that they can meet SLA targets containing those metrics. To demonstrate this we have used a case study of an application that is architecturally typical of the applications used in the financial services industry and attempted to prototype solutions using it. In particular we have tried to explore how scenarios such as the following could be addressed through the construction of QoS-aware software components that could be offered as a generic management service in a typical application like this:
Managing Overload Conditions: It is a common scenario when providing workflows to multiple clients that resources can become overloaded. Under these conditions it would be useful to be able to selectively admit or reject requests from clients based on some criteria that maximise the provider's profits or business objectives.
Performance Prediction: How can we predict the effect on performance of replacing one Web Service with another one of equivalent functionality, or of dynamically changing the steps in a workflow?
Performance Improvement: How can an organisation maximise the performance of the application through effective use and exploitation of the resources available to it?
Service Level Agreement (SLA) Management: How can we define and manage SLAs for performance metrics in composite applications?
QoS Improvement: What strategies are there for improving our ability to meet SLA performance targets in a composite application?
Practical work and experiments were carried out as part of the author's role as Technical Architect, firstly at Marlborough Stirling and latterly at ACE Insurance Ltd. The case study application was an insurance bookkeeping system with an online front-end that allowed real-time creation of new policies that were added to the legacy mainframe back-end database. Latterly, the methods described in this thesis have been applied with equal success to a second application in a second company. This application was architecturally very similar, although the operating systems, software languages, and databases were a mix of different technologies. This provides confidence that the methods used in this thesis could be more generally applicable to other applications built using the same architectural style (SOA) in this industry sector.
8.1 Discussion of Results

In Chapters 2 and 3 the general problems of defining SLAs and, in particular, QoS attributes for a composite application built using Web Services were introduced. Firstly, a review of the wider problem was given, covering areas such as the discovery and negotiation of SLAs, the definition of the SLA itself, service provisioning in on-demand environments, and QoS monitoring. Next, Chapter 3 gave a more detailed overview of adaptive control of web applications and services. Work related to the two main components of the thesis was described: the model and the performance improvement methodology. In particular, techniques using control theory and queuing networks were compared as methods of model-based adaptive control, followed by optimization techniques such as utility functions (from decision theory), integer linear programming, and genetic algorithms.
Chapter 4 attempted to understand how adaptive management techniques might contribute to, and be defined in, a formal SLA document, with an emphasis on the end-to-end QoS that we are able to offer the customer of our enterprise applications. The Key Quality Indicator (KQI) of interest to this thesis is primarily the end-to-end time required to execute a particular workflow. This KQI can be readily mapped into an SLA. The KQI is derived from the Key Performance Indicators (KPIs) of each service used by that workflow. The KPIs we are interested in are the execution times of each service call. For differentiated services based on priority sessions, we would also be interested in the cost of each service as another KPI. The discussion then focussed on two recent proposals, COSMAdoc and MoDe4SLA, and how they can be used for SLA design. We concluded with discussions of how our adaptive management strategies could be incorporated into SLA documents using these proposed standards.
Chapter 5 began to address the central questions posed by the thesis, by deriving a model for a composite application built using Web Services through the following steps:
1. It defined the performance requirements of the application using a real insurance application as an example. This thesis has focused on the commonly requested QoS attribute of total response time for the execution of a multi-part workflow.
2. It modelled the business demand of the application in terms of how it is actually used throughout the day in different situations, with particular focus on the fact that different workflows place different demands on the application's components.
3. It characterised the workload of the application on different tiers of resources and investigated the applicability of queuing network models based on the distribution of interarrival times of jobs, and the service times of those jobs on each resource.
4. It built a performance model using Mean-Value Analysis (MVA), which was then applied to the application and found to provide a good prediction of the overall end-to-end response time of the workflows being executed. An MVA model was chosen as it can be used to model applications with an arbitrary number of tiers and those with significantly different performance characteristics, which makes it particularly
suitable for composite applications built using Web Services. It is also algorithmically quick. It has been shown that MVA can relatively easily be applied to multi-tier e-commerce applications [Urgaonkar et al., 2005b]. MVA handles scenarios such as a single request in the web tier spawning multiple tasks on the application tier by using closed queuing networks in which a request makes multiple visits to each resource. In particular, it has been demonstrated how MVA can deal with long-lived transactions using an infinite queue at the front of the model, which also serves as the re-entry point for requests that have been completed. This type of application is very typical in the financial services industry, where an online front-end is integrated with multiple back-end legacy systems in a workflow type application.
In Chapter 6, an introductory attempt was made to provide a practical solution to the problem of automatically adapting the workflows of a composite application built using Web Services in real time. It used a GA to search for solutions to an optimisation problem. The use of a GA was motivated by previous research using numerical simulations that demonstrated its applicability. This thesis applied a GA to a real system. It discussed the chromosome encoding strategy that could be used in such a system and how the MVA model can be used as the GA's fitness function. It also described the crossover strategy used to create new chromosomes as potential solutions to the problem of improving the QoS performance metrics. It showed how the total response time of a workflow could be modelled and used to determine from the fitness function whether the GA had found a solution that was compatible with the SLA targets.
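For readers unfamiliar with the MVA recursion summarised above, the following Python sketch shows the exact single-class algorithm for a closed queuing network. It is deliberately simpler than the load-dependent, multi-class formulation used in this thesis. The function name and parameters are assumptions made for illustration, and the think_time term is one common way of representing an infinite-server station at the front of the model.

```python
from typing import List, Tuple


def mva(
    demands: List[float],
    n_customers: int,
    think_time: float = 0.0,
) -> Tuple[float, float, List[float]]:
    """Exact single-class MVA for a closed network of load-independent queues.

    demands[k] is the service demand at resource k in seconds, that is, the
    number of visits to the resource multiplied by the service time per visit.
    think_time models a delay (infinite-server) station with no queuing.
    Returns (response_time, throughput, queue_lengths) for n_customers.
    """
    queue = [0.0] * len(demands)   # mean queue length at each resource
    response, throughput = 0.0, 0.0
    for n in range(1, n_customers + 1):
        # An arriving request sees the queue left by a network of n - 1 requests.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)            # Little's law, whole network
        queue = [throughput * r for r in residence]         # Little's law, per resource
    return response, throughput, queue


if __name__ == "__main__":
    # Hypothetical three-tier demands (web, application, database) in seconds.
    r, x, q = mva([0.05, 0.10, 0.20], n_customers=20, think_time=1.0)
    print(f"response time {r:.3f} s, throughput {x:.2f} requests/s")
```

With a single request in the network there is no queuing, so the predicted response time is simply the sum of the service demands. As the population grows, the resource with the largest demand becomes the bottleneck and bounds the throughput.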
The GA is configured to find a solution that meets the SLA-defined response time targets for all workflows. It does not need to find an optimum solution, so it can operate extremely efficiently with little tuning of variable parameters such as mutation rate and crossover. The
use of the GA was demonstrated with practical results derived from a prototype management solution constructed on top of an ESB which provided dynamic re-routing of web service requests.
In Chapter 7 this knowledge of how the workflow response times of composite applications built using Web Services can be automatically managed was used to discuss techniques that could address the types of scenarios described at the start of this chapter. Using the recent migration of enterprise SOA into the Cloud as a case study, it was shown, firstly, how to model a composite application in the Cloud using the same queuing network method as in Chapter 5. Next, it was shown how the adaptive management solution described in Chapter 6 could be applied to changes in workload, loss of service, and increases in latency. It was also shown how differentiated services could be used under conditions of such high load that no solution is available to meet the SLA requirements of all workflows simultaneously.
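The "good enough" search strategy summarised above for Chapter 6 can be illustrated with a short Python sketch. It is not the prototype built for this thesis. The chromosome encoding (one gene per workflow step, selecting one of several candidate services), the single-point crossover, and all names are assumptions made for illustration. The predict_response_time callback stands in for the MVA model used as the fitness function, and the sketch stops as soon as any chromosome meets the SLA target.

```python
import random
from typing import Callable, List, Optional, Sequence

Chromosome = List[int]  # gene i = index of the candidate service chosen for step i


def find_sla_compliant_binding(
    candidates_per_step: Sequence[int],
    predict_response_time: Callable[[Chromosome], float],
    sla_target: float,
    pool_size: int = 20,
    generations: int = 50,
    mutation_rate: float = 0.05,
    rng: Optional[random.Random] = None,
) -> Optional[Chromosome]:
    """Return the first chromosome found that meets the SLA target, else None."""
    if rng is None:
        rng = random.Random()

    def random_chromosome() -> Chromosome:
        return [rng.randrange(n) for n in candidates_per_step]

    pool = [random_chromosome() for _ in range(pool_size)]
    for _ in range(generations):
        pool.sort(key=predict_response_time)        # fittest (fastest) first
        if predict_response_time(pool[0]) <= sla_target:
            return pool[0]                          # good enough, stop searching
        parents = pool[: max(2, pool_size // 2)]    # keep the better half
        children: List[Chromosome] = []
        while len(children) < pool_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(1, len(mum)) if len(mum) > 1 else 0
            child = mum[:cut] + dad[cut:]           # single-point crossover
            for i, n in enumerate(candidates_per_step):
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n)     # random mutation of one gene
            children.append(child)
        pool = children
    best = min(pool, key=predict_response_time)
    return best if predict_response_time(best) <= sla_target else None


if __name__ == "__main__":
    # Hypothetical per-step response times in seconds for each candidate service.
    times = [[0.30, 0.12, 0.20], [0.25, 0.40], [0.15, 0.10, 0.50]]
    predict = lambda c: sum(times[i][g] for i, g in enumerate(c))
    print(find_sla_compliant_binding([3, 2, 3], predict, sla_target=0.6))
```

Returning None when no chromosome meets the target corresponds to the high-load situation in which differentiated services are needed because no binding satisfies the SLA requirements of all workflows.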
8.2 Evaluation of Results and Methodologies

We wanted to show through an authentic case study an actual example of software engineering technologies solving real-world problems in the run-time operation of composite applications built using Web Services. Case studies are useful for exploratory examinations of a problem space [Perry et al., 2004], and so they are particularly relevant for this thesis, as the published literature is scarce whereas the problems are very real. As an example, a composite application architected by the thesis author was being used to sell shares for employees of a number of blue chip companies. The same platform was also being used by these companies' administrative staff to produce monthly stock reports, implemented as a separate workflow using a number of common services on the same hardware. One particular day, following very bad results on the New York Stock Exchange, the stock price for one company fell dramatically and a large number of its employees started to sell their shares. Another company, using the reporting workflow intensively (for a non-critical reporting operation), was preventing the first company's employees from selling shares as the hardware resources were fully stretched. This is precisely the type of scenario that the thesis attempts to address.
Composite applications built using Web Services in the financial services industry are usually built using common architecture patterns [Fowler, 2002] on one of two similar but competing software platforms: Microsoft .NET on Windows or J2EE on Unix/Windows. This means that a case study using a typical application built along these architectural principles is likely to be relevant to many applications. Whilst architectural principles are common in these applications, software languages and operating systems can vary. For this reason, we attempted to use software engineering solutions that were agnostic in this regard. So, for example, a model of the composite application could have been attempted using control theoretic techniques with an operating system parameter as a control variable. However, this would have made the solution platform dependent. In choosing MVA and GAs we have used two techniques that are established across many different problem spaces and are independent of the operating system or language used.
MVA is a widely accepted technique for computer capacity planning [Menasce et al., 2004b] and has been shown to be applicable to modelling distributed n-tier applications [Urgaonkar et al., 2005b]. Composite applications built using Web Services are a specific example of distributed n-tier applications that exploit Web Service standards to provide platform and operating system interoperability. By applying a load-dependent, multi-class formulation of MVA we have demonstrated results with errors in response time of about 8% on average, which improves on the only other published result in the literature, that of Kounev and Buchmann [2003], who achieved errors in the range 10-30%.
The improvement we have obtained is due to the fact that a load-dependent calculation is used and this is achieved through the continuous statistical analysis of
response time measurements taken from each and every service running on each and every resource in the application. Compared to published results using other techniques, these results are also good. For example, Stewart and Shen [2005] use offline constructed application profiles to achieve an error of 14% for the average response time. The hybrid queuing model of Stewart et al. [2007] achieved errors in the range 10-16%, which were reduced to 4-14% after calibration. Chen et al. [2007] used an MVA formulation and claim that their analytic model predicts the performance of TPC-W accurately (TPC-W is an industry standard e-commerce web application), although precise figures are hard to obtain from their graphs.
The results of the GA as a means of improving the performance of the composite application are much more difficult to assess, as there is little published work in which such a technique has been attempted against a real application. Numerical simulations have been used to demonstrate the effective use of GAs for service composition problems, for example Canfora et al. [2005] and Zhang [2011]. Jiang et al. [2011] have very recently released experimental results demonstrating that their GA for QoS-aware service composition is both effective and scalable. This, then, is an area where much more study could be undertaken.
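The prediction errors quoted in this section are percentage differences between modelled and measured response times. As an illustration of how such a figure can be computed, the sketch below averages the relative error over a set of workflows. The function name is hypothetical, and the exact error definition used by each cited study may differ from this one.

```python
from typing import Sequence


def mean_relative_error(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Mean of |predicted - measured| / measured over paired observations."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - m) / m for p, m in zip(predicted, measured)) / len(measured)
```

For example, predictions of 1.1 s and 0.9 s against measurements of 1.0 s in both cases give a mean relative error of about 0.1, that is, 10%.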
8.3 Contributions of this Thesis

This thesis makes the following main contributions to the subject:
1. This is the first time that detailed test results have been published to prove that the Mean Value Analysis (MVA) algorithm can be applied to a queuing network description of a composite application built using Web Services. These results were presented in Chapter 5.
2. This thesis is the first published work to use MVA as the fitness function for a Genetic Algorithm. This method was presented in Chapter 6.
3. This thesis is the first published work to apply a GA to dynamic optimization of a real workflow implemented as a set of Web Services across multiple servers. Previous published work on using GAs to optimize service composition has restricted itself to numerical simulations. These results were presented in Chapter 6.
4. This thesis demonstrates strategies for meeting SLA performance targets under a number of different real-life overload conditions. These methods were presented in Chapter 7.
5. This thesis discusses improvements that could be made to existing SLA design methodologies and SLA languages for composite applications. These proposals were made in Chapter 4.
8.4 Limitations of this Thesis

The thesis has presented only a single technique that could be used for adaptive management, the GA, and the full capabilities and limitations of using a GA have not been rigorously researched at this stage. There were two major reasons for this:
a) The thesis itself was seeking to address the wider problem space rather than focussing on this particular aspect of the problem. For example, no exhaustive study has been carried out of how parameters such as chromosome encoding, crossover strategy, mutation rate and pool size affect the efficiency of the GA.
b) To conduct a rigorous assessment of the use of a GA would have required access to more hardware resources than were available. For example, questions such as how complex the SOA can be before the GA is no longer capable of finding a solution that meets the SLA targets remain unanswered.
Furthermore, by focussing on a single case study, albeit one typical of the type of SOA designed applications in use in the financial services industry today, we cannot make any claim that the
results presented here are more generally applicable to all composite applications built using Web Services and SOA design methodologies.
8.5 Future Work

The work presented in this thesis could be extended in several ways:
a) A rigorous assessment of the use of GAs in this type of problem space would be valuable. The initial results presented here are encouraging and it would be extremely interesting to investigate this more thoroughly.
b) The GA could be compared with other methods such as integer linear programming, to understand which approach is best, or which approach suits which type of application.
c) The MVA model could be compared with other techniques such as those from control theory, again to understand which approach suits which type of application.
d) It would be useful to bring adaptive management techniques and service differentiation into SLA proposals such as COSMAdoc and MoDe4SLA.
e) The solution could be applied to more case studies to assess its more general applicability.
References
Abdelzaher, T. F., Shin, K. G., Bhatti, N., 2001. Performance Guarantees for Web Server End-Systems: A Control-Theoretical Approach. IEEE Transactions on Parallel and Distributed Systems, 13 (1), 80-96.
Abramowicz, W., Kaczmarek, M., Kowalkiewicz, M., Zyskowski, D., 2006. Architecture for service profiling. In Modelling, Design and Analysis for Service-Oriented Architecture Workshop in conjunction with the 2006 IEEE International Conferences on Services Computing (SCC 2006) and Web Services (ICWS 2006), Chicago.
Agarwal V., Dasgupta K., Karnik N., Kumar A., Kundu A., Mittal S. and Srivastava B., 2005. A Service Creation Environment Based on End to End Composition of Web Services, In: Proceeding of the 14th International World Wide Web Conference, Japan, 2005.
Al-Ali R.J., Rana O.F., Walker D.W., Jha S. and Sohail S., 2002. G-QoSM: Grid Service Discovery Using QoS Properties. Available from: http://www.cse.unsw.edu.au/~sjha/publications/unswcardiff.pdf [Accessed April 2011]
Andrieux, A., Czajkowski, K., Dan, A., Keahey, K., Ludwig, H., Nakata., N., Pruyne, J., Rofrano, J., Tuecke, S., Xu, M., 2007. Web Services Agreement Specification (WS-Agreement) [online]. Available from: www.ogf.org/documents/GFD.107.pdf [Accessed March 2010]
Arlitt, M., Jin, T., 1999. Workload Characterization of the 1998 World Cup Web Site, [online]. Available from: http://www.hpl.hp.com/techreports/1999/HPL-1999-35R1.pdf [Accessed May 2006]
Asawa, M., 1998. Measuring and analyzing service levels: a scalable passive approach. In: Sixth International Workshop on Quality of Service, 1998. (IWQoS 98) 18-20 May 1998. Pages 3-12.
Astrom, K. J., Wittenmark, B., 1994. Adaptive Control (2nd edition). Prentice Hall. ISBN: 0201558661
Bansal, N., and Harchol-Balter, M., 2001. Analysis of SRPT Scheduling: Investigating Unfairness. In: Proc. SIGMETRICS 2001, Cambridge, Massachusetts, June 2001. ISSN:0163-5999, 279-290.
Benjamin, A.C., Suave, J., Cirne, W., Carelli, M., 2004. Independently Auditing Service Level Agreements in the Grid [online]. Available from: http://www.hpovua.org/PUBLICATIONS/PROCEEDINGS/11_HPOVUAWS/HPOVUA%202004%20Papers/Monday%20June%2021,%202004/1.4%20BIT%20Session%20%20Business-IT%20Alignment/BIT_2_Independently%20Auditing%20Service.pdf [Accessed November 2004]
Bhoj P., Ramanathan S. and Singhal S. 2006. Web2K: Bringing QoS to Web Servers, [online] Available from: http://nclab.kaist.ac.kr/lecture/te628_2001_Fall/seminar/papers/HPL-200061.pdf [Accessed April 2011]
Bianco, P., Lewis, G. A., Merson P., 2008, Service Level Agreements in Service-Oriented Architecture Environments, Software Engineering Institute, Technical Note CMU/SEI-2008-TN-021, Sept 2008.
Bichler, M., Lin, K-J., 2006. Service-Oriented Computing. IEEE Computer. 39 (3) 99-101.
Bodenstaff, L., Wombacher, A., Reichert, M., Jaeger, M. C., 2008. Monitoring Dependencies for SLAs: The MoDe4SLA Approach. In: IEEE 5th Int'l Conference on Services Computing (SCC 2008), Honolulu, Hawaii.
Bodenstaff, L., Wieringa, R., Wombacher, A., Reichert, M., 2009a. Towards Management of Complex Service Compositions. In: Int'l Workshop on Service Computing for B2B (SC4B2B'09), Bangalore.
Bodenstaff, L., Wombacher, A., Wieringa, R., Jaeger, M., Reichert, M., 2009b. Monitoring Service Compositions In MoDe4SLA: Design of Validation. In: Proc. 11th Int'l Conf. on Enterprise Information Systems (ICEIS'09), Milan, Italy.
Boutilier, C., Patrascu, R., Poupart, P., Schuurmans, D., 2004. Regret Based Utility Elicitation in Constraint-based Decision Problems [online]. Available from: http://citeseer.ist.psu.edu/636031.html [Accessed December 2009]
Burstein M., Bussler C., Finin T., Huhns M.N., Paolucci M., Sheth A.P., Williams S. and Singh M.P. 2005. A Semantic Web Services Architecture. IEEE Internet Computing, 2005, pp. 72-81.
Candido, G., Barata, J., Colombo, A.W., Jammes, F., 2009. SOA in reconfigurable supply chains: A research roadmap. Engineering Applications of Artificial Intelligence. Volume 22, Issue 6, September 2009, Pages 939-949. Special Issue: Artificial Intelligence Techniques for Supply Chain Management.
Canfora, G., Di Penta, M., Esposito, R., and Villani, M., 2004. A Lightweight Approach for QoS Aware Service Composition [online]. Available from: http://www.rcost.unisannio.it/mdipenta/papers/tr-qos.pdf [Accessed Jul 2008]
Canfora, G., Di Penta, M., Esposito, R., and Villani, M., 2005. An Approach for QoS-Aware Service Composition based on Genetic Algorithms. In Proceedings of the 2005 conference on Genetic and Evolutionary Computation Conference, Washington DC, USA.
Canfora, G., Di Penta, M., Esposito, R., and Luisa Villani, M., 2008. A framework for QoS-aware binding and re-binding of composite web services. Journal of Systems and Software, 81 (10) 1754, 2008.
Cardoso, J., 2002. Quality of Service and Semantic Composition of Workflows. PhD thesis, Univ. of Georgia, 2002.
Cardoso J., Sheth A., Miller J., Arnold J. and Kochut K., 2004. Quality of Service for Workflows and Web Service Processes [online]. Available from: http://lsdis.cs.uga.edu/lib/download/CSM+QoS-WebSemantics.pdf [Accessed April 2011]
CBDI Forum Ltd, 2004. Market Trends. Time to Board the Enterprise Service Bus? CBDI Journal, July/August 2004.
Chandrasekaran S., Miller J.A., Silver G.S., Arpinar B. and Sheth A.P., 2003. Performance Analysis and Simulation of Composite Web Services, Electronic Markets: The International Journal of Electronic Commerce and Business Media, 13(2) 2003.
Chen, Y., Iyer, S., Liu, X., Milojicic, D, Sahai, A., 2007. SLA Decomposition: Translating Service Level Objectives to System Level Thresholds. In: Fourth International Conference on Autonomic Computing (ICAC'07), 2007.
Crovella, M. E., Bestavros, A., 1997. Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes. IEEE/ACM Transactions on Networking, 5(6) 835-846.
D'Ambrogio A., 2006. A Model-driven WSDL Extension for Describing the QoS of Web Services, In: IEEE International Conference on Web Services, 2006.
D'Ambrogio A., Bocciarelli, P., 2007. A model-driven approach to describe and predict the performance of composite services, In: WOSP '07 Proceedings of the 6th international workshop on Software and performance, ACM, New York USA 2007.
Dan, A., Davis, D., Kearney, R., Keller, A., King, R., Kuebler, D., Ludwig, H., Polan, M., Spreitzer, M., Youssef, A., 2004. Web Services On Demand: WSLA-driven automated management. IBM Systems Journal Special Issue on Utility Computing, 43 (1), 136-158.
Dan, A., Ludwig, H., Pacifici, G., 2003. Web Services Differentiation with Service Level Agreements [online]. Available from: ftp://ftp.software.ibm.com/software/websphere/webservices/webserviceswithservicelevelsupport.pdf [Accessed October 2004]
Deichmann, A., 2006. Re: K-Means Clustering Algorithm Tutorial, [online]. Available from: http://www.kdkeys.net/forums/thread/3538.aspx [Accessed May 2006]
Di Modica, G., Tomarchio O., Vita, L., 2009. Dynamic SLAs management in service oriented environments. Journal of Systems and Software. Volume 82, Issue 5, May 2009, Pages 759-771.
Denning, P. J., Buzen, J. P., 1978. The Operational Analysis of Queueing Network Models, ACM Computing Surveys, 10 (3) 225-261.
Diao, Y., Gandhi, N., Hellerstein, J. L., Parekh, S., Tilbury, D. M., 2002a. Using MIMO Feedback Control to Enforce Policies for Interrelated Metrics with Application to the Apache Web Server. In: Proceedings of the Network Operations and Management Symposium (NOMS), 15-19 Apr 2002, Florence, Italy.
Diao, Y., Hellerstein, J. L., Parekh, S., 2002b. Using Fuzzy Control to Maximize Profits in Service Level Management, IBM Systems Journal, 41 (3), 403-420.
Diao Y., Hellerstein, J. L., Parekh, S., Bigus, J. P., 2003, Managing Web Server Performance With AutoTune Agents, IBM Systems Journal, 42 (1), 136-149.
Diao, Y., Hellerstein, J. L., Kaiser, G., Parekh, S., Phung, D., 2004, Self-Managing Systems: A Control Theory Foundation, IBM Technical Report RC23374(W0410-080) [online]. Available from: http://www.research.ibm.com/PM/rc23374.pdf [Accessed April 2006]
Doyle, R., Chase, J., Asad, O., Jin, W., and Vahdat, A., 2003. Model-Based Resource Provisioning in a Web Service Utility. In Proceedings of the 4th USITS, Mar. 2003.
Dyachuk, D., Deters, R., 2007. Improving Performance of Composite Web Services. SOCA 2007, 147-154.
Erradi, A., Padmanabhuni, S., Varadharajan, N., 2006. Differential QoS support in Web Services Management. In: IEEE International Conference on Web Services (ICWS'06), 2006.
Erl, T., 2008. SOA Design Patterns. Prentice Hall. ISBN 0136135161
Erl, T., 2005. Service-Oriented Architecture: Concepts, Technology and Design. Prentice Hall. ISBN 0-13-185858-0.
Esmaeilsabzali, S., and Larson, K., 2005. Service allocation for composite Web services based on quality attributes, In: ECommerce Technology Workshops 2005 Seventh IEEE International Conference.
Esper, 2009. Event Stream Intelligence with Esper and NEsper [online]. Available from: http://esper.codehaus.org/ [Accessed Jan 2009]
Eviware, 2008. Soapui; the Web Services Testing Tool [online], Available from: http://www.soapui.org [Accessed Nov 2008]
Fayad, C., and Petrovic, S., 2005. A Fuzzy Genetic Algorithm for Real-World Job Shop Scheduling, Lecture Notes in Computer Science, Springer Berlin/Heidelberg, Vol 3553 pp524-533 ISBN 978354026551
Foster, I., Kesselman, C., Nick, J., Tuecke, S., 2004. The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Architecture, [online]. Available from: http://www.globus.org/research/papers/ogsa.pdf [Accessed Oct 2004]
Fowler, M., 2002. Patterns of Enterprise Application Architecture, Addison-Wesley Professional, ISBN 978-0321127426
Franklin, G. F., Powell, J. D., Abbas, E-N., 2002. Feedback Control of Dynamic Systems, 4th Ed. Pearson Education, ISBN 0-130-32393-4
Gao, A., Yang, D., Tang, S., Zhang, M., 2005. Web Service Composition Using Integer Programming-based Models, In: IEEE International Conference on e-Business Engineering, ICEBE 2005. 12-18 Oct. 2005, Pages: 603-606.
Goldberg, D. E., 1989. Genetic Algorithms in Search, Optimization and Machine Learning. Addison Wesley, ISBN 0201157675
Grefenstette, J, et al. 1985, Genetic Algorithms for the Traveling Salesman Problem, Proc. Intern. Conf. of Genetic Algorithms and their applications. 160-165.
Greiner, U., and Rahm, E. 2004. Quality-Oriented Handling of Exceptions in Web-Service-Based Cooperative Processes. In: Proc. of EAI-Workshop 2004 - Enterprise Application Integration, Oldenburg. GITO-Verlag, Berlin, Feb. 2004, 11-18.
Gu X., Nahrstedt K., Yuan W. and Wichadakul D., 2002. An XML-based Quality of Service Enabling Language for the Web. Journal of Visual Languages and Computing, 2002, pp 1-39.
Hohpe, G., Woolf, B., 2004. Enterprise Integration Patterns, Pearson Education, ISBN 0-321-20068-3
Holland, J. H., 1992. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control and Artificial Intelligence. MIT Press, ISBN 0262581116
Hu, T., Gou, S., Gou, M., Tang, F., and Dong, M. 2009. Analysis of the Availability of Composite Web Services. In: Frontier of Computer Science and Technology, 2009. FCST '09. Fourth International Conference on, Dec 2009.
Hung, P., Li. H., and Jeng, J. J. 2004. WS-Negotiation: An Overview of Research Issues. In: Proceedings of the 37th Hawaii International Conference on Systems Science 5-8 Jan 2004 Big Island, Hawaii, IEEE Computer Society, 10033.2 ISBN:0-7695-2056-1
Hwang, S-Y., Wang, H., Tang, J., and Srivastava, J., 2007, A probabilistic approach to modeling and estimating the QoS of web-services-based workflow, Journal of Information Sciences, 177 (23), 2007.
IBM, 2003. An Architectural Blueprint for Autonomic Computing, [online]. Available from: http://www-03.ibm.com/autonomic/pdfs/ACwpFinal.pdf [Accessed April 2006]
IBM, 2004. IBM Systems Journal Special Issue on Utility Computing. 43 (1)
IBM, 2006. Web Services Policy Framework (WS-Policy) Version 1.2 [online]. Available from: http://download.boulder.ibm.com/ibmdl/pub/software/dw/specs/ws-polfram/ws-policy-200603-01.pdf [Accessed April 2006]
Jaeger, M. C., and Mühl, G., 2007. QoS-based selection of services: The implementation of a genetic algorithm, In KiVS 2007 Workshop: Service-Oriented Architectures and Service Oriented Computing (SOA/SOC), 26 Feb - 2 Mar, 2007 Bern.
Jain, R.K., 1991. The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation and Modelling. John Wiley & Sons. ISBN 0471503363
JGAP, 2008. Java Genetic Algorithms Package [online]. Available from: http://jgap.sourceforge.net/ [Accessed Nov 2008]
Jiang, H., X. Yang, K. Yin, S. Zhang and J.A. Cristoforo, 2011. Multi-path QoS-aware web service composition using variable length chromosome genetic algorithm. Inform. Technol. J., 10, 113-119.
Jin L.J., Machiraju V. and Sahai A. 2006. Analysis on Service Level Agreement of Web Services [online]. Available from http://athena.union.edu/~hemmendd/Gradseminar/hpl.pdf [Accessed April 2011]
Kamra, A., Misra, V., Nahum, E., 2004. Yaksha: A controller for managing the performance of 3-tiered websites. In: The Twelfth IEEE International Workshop on Quality of Service (IWQoS 2004), Toronto, Canada, 2004.
Keller, A., Ludwig, H., 2003. The WSLA Framework: Specifying and Monitoring Service Level Agreements for Web Services. Journal of Network and Systems Management 11 (1), 57-81.
Kelly, T., 2003, Utility-Directed Allocation, Online at: www.hpl.hp.com/techreports/2003/HPL 2003115.pdf [Accessed Aug 2005]
Khoshafian, S., 2002. Web Services and Virtual Enterprises [online]. Available from: http://www.webservicesarchitect.com/content/articles/khoshafian01.asp [Accessed November 2004]
Kihl, M., Robertsson, A., and Wittenmark, B. 2003. Analysis of Admission Control Mechanisms Using Non-linear Control Theory, In: Eighth IEEE International Symposium on Computers and Communications, Jun 30-Jul 03 2003, Kemer-Antalya, Turkey.
Kleinrock, L., 1976. Queueing Systems Volume 2: Computer Applications, John Wiley & Sons, Inc., ISBN 0-471-49111-X
Kounev, S., Buchmann, A., 2003. Performance Modeling and Evaluation of Large-Scale J2EE Applications. In: Proc. of the 29th International Conference of the Computer Measurement Group (CMG) on Resource Management and Performance Evaluation of Enterprise Computing Systems - CMG2003, December 2003.
Kritikos, K., Plexousakis, D., 2009. Requirements for QoS-Based Web Service Description and Discovery. IEEE Transactions on Services Computing 2(4): 320-337 (2009).
Lakshmanan, G. and Pande, M. 2009. How the Cloud Stretches the SOA Scope. Microsoft Architecture Journal, 21 36-41.
Lamanna, D., Skene, J., Emmerich, W. 2003. SLAng: A Language for Defining Service Level Agreements. In: The Ninth IEEE Workshop on Future Trends of Distributed Computing Systems (FTDCS'03) May 28 - 30, 2003 San Juan, Puerto Rico, IEEE 100-106.
Lazowska, E. D., Zahorjan, J., Graham, G. S., Sevcik, K., 1984, Quantitative System Performance: Computer System Analysis Using Queueing Network Models, Prentice-Hall Inc, ISBN: 0137469756
Lin, M., Xie, J., Guo, H., Wang, H., 2005. Solving QoS-driven Web Service Dynamic Composition as Fuzzy Constraint Satisfaction, In: Proceedings of the 2005 IEEE International Conference on e-Technology, e-Commerce and e-Service, EEE '05, 29 March-1 April 2005, Pages: 9-14.
Liu, X., Lui, J., Sha, L., 2005. Modeling 3-Tiered Web Applications. In: 13th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2005), Atlanta, Georgia, 2005.
Liu, X., Heo, J., Sha, L., Zhu, X., 2006. Adaptive Control of Multi-Tiered Web Application Using Queueing Predictor. In: 10th IEEE/IFIP Network Operations and Management Symposium (NOMS 2006), Vancouver, Canada, 2006.
Lodi, G., Panzieri, F., Rossi D., Turrini, E., 2007. SLA-Driven Clustering of QoS-Aware Application Servers, IEEE Trans. on Soft. Eng. 33(3), pp.186-197, 2007.
Looker N., Munro M. and Xu J., 2004. Simulating Errors in Web Services. International Journal of Simulation, 5(5), 2004, pp. 29-37.
Lu, C., Abdelzaher, T. F., Stankovic, J.A., Son, S. H., 2001. A Feedback Control Approach for Guaranteeing Relative Delays in Web Servers, In: IEEE Real-Time Technology and Applications Symposium (RTAS'01), June 2001.
Lu, C., Stankovic, J. A., Tao, G., Son, S. H., 2002, Feedback Control Real-Time Scheduling: Framework, Modeling, and Algorithms, Real-Time Systems Journal, Special Issue on Control-Theoretic Approaches to Real-Time Computing, 23(1/2) 85-126.
115
Lu, C., Wang, X., Koutsoukos, X., 2004, Feedback Utilization Control in Distributed Real-Time Systems with End-to-End Tasks, Washington University Department of Computer Science and Engineering, [online]. Available from http://www.cse.seas.wustl.edu/techreportfiles/getreport.asp?354 [Accessed Apr 2005] Luckham, D., 2002. The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems, Pearson Education Inc, 2002, ISBN 0201727897 Ludwig, A., Franckzyk, B., 2008. COSMA - An Approach for Managing SLAs in Composite Services. In Proceedings of the 6th International Conference on Service-Oriented Computing, Sydney Ludwig, A., Kowalkiewicz, M. 2009a. Supporting Service Level Agreement Creation with Past Service Behavior Data. In Proceedings of the 1st Workshop on Service Discovery and Selection in SOA Ecosystems (SDS-SOA 2009), Poznan Ludwig, A. Hering, T., Kluge R., Franckzyk, B., 2009b. A Case Study on Managing SLAs in Composite Services with COSMA. In Business Process, Services Computing and Intelligent Service Management (BPSC 2009), Leipzig, Germany Ludwig, H., Keller, A., Dan, A., King, R. P., Franck, R., 2003. Web Service Level Agreement (WSLA) Language Specification [online]. Available from: http://www.research.ibm.com/wsla/WSLASpecV1-20030128.pdf [Accessed March 2010] Mahmood, A., 2000. A Hybrid Genetic Algorithm for Task Scheduling in Multiprocessor Systems, Studies in Informatics and Control, 9 (3) Mani, A., Nagaranjan, A., 2002. Understanding Quality of Service for Web Services [online]. Available from: http://www-106.ibm.com/developerworks/library/ws-quality.html [Accessed November 2004]
116
Mathijssen S., 2005. A Fair Model for Quality of Web Services [online]. Available from: referaat.cs.utwente.nl/.../2005_03_B_Mathijssen,S.J.E.A_Fair_Model_for_Quality_of_Web_Services.pdf [Accessed April 2011] Menasce, D.A., 2002. QoS Issues in Web Services. IEEE Internet Computing, 6 (6) 72-75 Menasce, D.A., 2003. Web Server Software Architectures. In IEEE Internet Computing, 7, November/December 2003. Menasce, D.A., 2004. Composing Web Services: A QoS View. IEEE Internet Computing, 8 (6) 88-90 Mensace, D. A., Bennani, M. N., 2003. On the Use of Performance Models to Design Self-Managing Computer Systems, In: Proc. 2003 Computer Measurement Group Conf., Dallas Texas, Dec 7-12, 2003 Menasce, D. A., Ruan H., Gomas, H., 2004a. A Framework for QoS-Aware Software Components, ACM SIGSOFT Software Engineering Notes 29 (1) 186-196 Menasce, D. A., Almeida, V.A.F., Dowdy, L.W., 2004b. Performance By Design: Computer Capacity Planning by Example. Prentice Hall Professional Technical Reference 0-13-090673-5 Montana, D., Brinn, M., Bidwell, G., and Moore, S., 1998. Genetic Algorithms for Complex, RealTime Scheduling, IEEE Conference on Systems, Man, and Cybernetics, 1998. Mule, 2010, Mule ESB [online], Available from: http://www.mulesoft.org/ [Accessed Oct 2010] Natis, Y. V., and Schulte, R. W., 1996, Service Oriented Architectures, Part1, Gartner Research Note SPA-401-068, 12 April 1996. OASIS, 2005, UDDI v3.0 [online]. Available from: http://www.uddi.org [Accessed April 2006] OASIS, 2006a. Web Services Security: SOAP Message Security 1.1 [online]. Available from: http://www.oasis-open.org/committees/download.php/16790/wss-v1.1-spec-osSOAPMessageSecurity.pdf [Accessed Nov 2008]
117
OASIS, 2006b. Web Services Reliable Messaing Protocol (WS-Reliable Messaging) [online]. Available from: http://docs.oasis-open.org/ws-rx/wsrm/200608/wsrm-1.1-spec-cd-04.html [Accessed Nov 2008] OASIS 2007. Business Process Execution Language for Web Services version 2.0[online]. Available from: http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=wsbpel [Accessed Nov 2008] OASIS 2010. Web Services Interoperability Basic Profile Version 2.0 [online]. Available from: http://ws-i.org/Profiles/BasicProfile-2.0-2010-11-09.html [Accessed April 2011] Oliver I. M., Smith D. J., and Holland J., 1987, A Study of Permutation Crossover Operators on the Traveling Salesman Problem, Proc. of the Second Int. Conf. on Genetic Algorithms, 224-230 Pacifici, G., Spreitzer, M., Tantawi, A., Youssef, A., 2003, Performance Management for Cluster Based Web Services [online]. Available from: http://www.research.ibm.com/autonomic/research/papers/pacifici_TechReport.pdf [Accessed Nov 2004] Page, A. J., and Naughton, T. J., 2006, Dynamic Task Scheduling Using Genetic Algorithms for Heterogeneous Distributed Computing, 8th International Workshop on Nature Inspired Distributed Computing, Denver, Colorado, USA, 2005 Panzieri, F., Pellegrini, M., and Turrini, E., 2010. QoS-Aware Clouds, [online] . Available from: http://research.microsoft.com/en-us/events/cloudfutures2010/panzieri.pdf [Accessed April 2011] Papazoglou M.P. and Georgakopoulos D., 2003. Service-Oriented Computing. Communications of the ACM, 46(10), 2003, pp. 25-28. Patel, C., Supekar, K, Lee, Y., 2004. Provisioning Resilient, Adaptive Web Services-based Workflow: A Semantic Modeling Approach, In: Proceedings of the IEEE Int Conf on Web Services, San Diego, USA. June 6-9, 2004, 480-487.
118
Patel, K., Pagurek, B., and Tosic, V., 2003. Improvements in WSOL Grammar and Premier WSOL Parser, Research Report SCE-03-25, Department of Systems and Computer Engineering, Carelton University, Ottawa, Canada, [online]. Available from: www.sce.carleton.ca/netmanage/papers/Improvements%20in%20WSOL%20Grammar%20and %20Premier%20WSOL%20Parser.pdf [Accessed July 2006] Patel, P., Ranabahu, A., Sheth A., 2009. Service Level Agreement in Cloud Computing [online]. Available from: http://knoesis.wright.edu/aboutus/visitors/summer2009/PatelReport.pdf [Accessed Dec 2009] Perry, D. Sim, S. Easterbrook, S., 2004. Case Studies for Software Engineers, In: International Conference on Software Engineering 26 p736-8 2004 Petrovic, S., and Fayad, C., 2005. A Genetic Algorithm for Job Shop Scheduling with Load Balancing, Lectures Notes in Computer Science, Springer Berlin/Heidelberg, Vol 3809 pp339-348 Ran S., 2003. A Model for Web Services Discovery with QoS, ACM Inc., 4(1), 2003, pp. 1-10 Reiser, M., Lavenberg S. S., 1980. Mean-Value Analysis of Closed Mulitchain Queuing Networks, Journal of the ACM, 27 (2) 313-322 Russel, S., Norvig, P., 2003. Artificial Intelligence: A Modern Approach, Second Edition. Pearson Education International. ISBN 0-13-080302-2 Sahai, A., Machiraju, V., Sayal, M., Jin. L. J., Casati, F. 2002. Automated SLA Monitoring for Web Services [online]. Available from: http://www.hpl.hp.com/techreports/2002/HPL-2002-191.pdf [Accessed November 2004] Schmit B.A. and Dustdar S., 2005. Model-driven Development of Web Service Transactions. International Journal of Enterprise Modeling and Information Systems, 1(1), 2005.
119
Schneider, R. D., 2007. SaaS, Composite Applications, and SOA: Understanding their Differences and Making Them Work Together. SOA Magazine Issue IX: July/August 2007 [online]. Available from: http://www.soamag.com/I9/0707-2.pdf [Accessed Dec 2009] Silver, G., Maduko,, A., Jafri, R., Miller, J., Sheth, A., 2003. Modeling and Simulation of Quality of Service for Composite Web Services, [online] Available from:
knoesis.wright.edu/library/download/SMJ_03-sci.pdf [Accessed April 2011] Slothouber, L., 1996. A Model of Web Server Performance. In Proceedings of the 5th International World Wide Web Conference, 1996. Stantchev , V., Schrpfer , V., 2009. Negotiating and Enforcing QoS and SLAs in Grid and Cloud Computing. In: Advances in Grid and Pervasive Computing In GPC '09: Proceedings of the 4th International Conference on Advances in Grid and Pervasive Computing (2009) Stewart, C., Shen, K., 2005. Performance modeling and system management for multi-component online services. In: Proceeding NSDI'05 Proceedings of the 2nd conference on Symposium on Networked Systems Design & Implementation - Volume 2, 2005. Stewart, C., Kelly, T., Zhang, A., 2007. Exploiting nonstationarity for performance prediction. In: Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007 Tesauro, G., Chess, D. M., Walsh, W. E., Das, R., Segal, A., Whalley, I., Kephart, J. O., White, S. R., 2004. A Multi-Agent Systems Approach to Autonomic Computing. In: Third International Joint Conference on Autonomous Agents and Multi Agent Systems, 464-471 Tian, M., Gramm, A., Ritter, H., Schiller, J., 2004. Efficient Selection and Monitoring of QoS-aware Web services with the WS-QoS Framework, In: IEEE/WIC/ACM International Conference on Web Intelligence (WI'04), Sep. 2004, Beijing, China
120
TMForum and the Open Group, 2004. SLA Management Handbook Volume 4: Enterprise Perspective, ISBN 1931624518 Tosic, V., Pagurek, B., Patel, K., 2003. WSOL A Language for the Formal Specification of Classes of Service for Web Services. In: Proceedings of the IEEE International Conference on Web Services, Las Vegas, USA, June 23-26, 2003, IEEE. 375-381. Tosic, V., Ma, W., Pagurek, B., Esfandiari, B., 2004. Web Services Offerings Infrastructure (WSOI) - A Management Infrastructure for XML Web Services. In: Proc. of NOMS (IEEE/IFIP Network Operations and Management Symposium) 2004, Seoul, South Korea, April 19-23, 2004, IEEE, 2004. Urgaonkar, B., and Shenoy, P., 2005. Cataclysm: Policing Extreme Overloads in Internet Services, In: Proceedings of the Fourteenth International World Wide Web Conference (WWW 2005), Chiba, Japan, May 2005 Urgaonkar, B., Shenoy, P., Chandra, A., Goyal, P., 2005a. Agile, Dynamic Provisioning of Multi-tier Internet Applications, In: 2nd IEEE International Conference on Autonomic Computing (ICAC) 116 June 2005 Seattle, Washington Urgaonkar, B., Pacifici, G., Shenoy, P., Spreitzer, M., and Tantawi, A., 2005b. An Analytical Model for Multi-tier Internet Services and its Applications, In: Proc. SIGMETRICS 2005, Banff, Canada, June 2005 VMWare, 2008. VMWare Workstation [online]. Available from: http://www.vmware.com/products/ws [Accessed Nov 2008] W3C, 2001. Web Services Description Language (WSDL), 1.1[online]. Available from: http://www.w3.org/TR/wsdl [Accessed Nov 2008]
121
W3C, 2007. SOAP Version 1.2 [online]. Available from: http://www.w3.org/TR/soap/ [Accessed Nov 2008] Walsh, W. E., Tesauro, G., Kephart, J. O., Das, R., 2004. Utility Functions in Autonomic Computing. In: Proceedings of the International Conference on Autonomic Computing, 70-77, 2004 Wand, X., Lu, C., Koutsoukos, X., 2004. DEUCON: Distributed End-to-End Utilization Control for Real-Time Systems, Washington University Department of Computer Science and Engineering, [online]. Available from http://www.cse.seas.wustl.edu/techreportfiles/getreport.asp?383 [Accessed Apr 2005] Wang G., Chen A., Wang C., Fung C. and Uczekaj S. , 2004. Integrated Quality of Service (QoS) Management in Service-Oriented Enterprise Architectures, In: Proceedings of the 8th IEEE International Enterprise Distributed Object Computing Conference, 2004 Wang, L., Siegel, H. J., Roychowdhury V. P., and Maciejewski, A., 1997. Task Matching and Scheduling in Heterogeneous Computing Environments Using a Genetic-Algorithm-Based Approach, Journal of Parallel and Distributed Computing 47, 8-22. Wang, T., and Boutilier, C., 2003. Incremental Utility Elicitation with the Minimax Regret Decision Criterion. In: Proceedings of the Eighteenth International Joint Conference on Artificial Intelligence, Acapulco, 309-316 Weise, T., Bleul, S., and Geihs, K., 2007. Web Service Composition Systems for the Web Service Challenge A Detailed Review [online]. Available from http://www.vs.unikassel.de/~bleul/publications/Technicalreport20077.pdf [Accessed Jul 2008] Welsh, M., Culler, D., Brewer, E. 2001. SEDA: An Architecture for Well-Conditioned, Scalable Internet Services. In: Eighteenth Symposium on Operating Systems Principles (SOSP'01), Lake Louise, Canada, October 24, 2001
122
Yu, T., Lin, K-J., 2005. Service Selection Algorithms for Composing Complex Services with Multiple QoS Constraints In: Proc Third International Conference on Service Oriented Computing ICSOC 05, Amsterdam, The Netherlands, December 12-15, 2005, Lecture Notes in Computer Science 3826 Springer 2005, ISBN 3-540-30817-2 pp130-143 Yu, W.D., Radhakrishna, R. B., Pingali, S., and Kolluri, V., 2007. Modeling the Measurements of QoS Requirements in Web Service Systems, Simulation 83 (1) 75-91 2007 Chen, Y., Iyer, S., Liu, X., Milojicic, D., Sahai, A., 2007. SLA Decomposition: Translating Service Level Objectives to System Level Thresholds. In: Fourth International Conference on Autonomic Computing (ICAC'07), 2007 Zeng, L., Bentallah, B., Ngu, A., Dumas, M., Kalagnaman, J., and Chang, H., 2004. QoS-Aware Middleware for Web Services Composition. IEEE Trans Software Engineering, 30 (5) pp311-327 Zeng, L., Lei, H., Chang, H., 2010. Monitoring the QoS for Web Services In: Service-Oriented Computing ICSOC 2007 Lecture Notes in Computer Science, 2007, Volume 4749/2007 Zhang, C., 2011. Adaptive Genetic Algorithm for QoS-aware Service Selection. In: IEEE Workshops of International Conference on Advanced Information Networking and Applications (WAINA), 2011 Zomaya, A. Y., and Teh, Y.-H., 2001. Observations on Using Genetic Algorithms for Dynamic LoadBalancing. IEEE Trans on Parallel and Distributed Systems, 12 (9) 899-911 2001
123
Appendix A Publications Linked to This Thesis
Journals
Shelly Saunders, Margaret Ross, Geoff Staples, and Sean Wellington, 2006. The Software Quality Challenges of Service Oriented Architectures in e-Commerce, Software Quality Journal 14 (1) 65-76 March 2006
Conferences
Shelly Saunders, Margaret Ross, Geoff Staples, and Sean Wellington, 2005. The Software Quality Challenges of Service Oriented Architectures in e-Commerce, In: Current Issues in Software Quality, Thirteenth International Conference on Software Quality Management (SQM 2005), pp87-100.
Shelly Saunders, Margaret Ross, Geoff Staples, and Sean Wellington, 2006. A Quality of Service Aware Model for Service-Oriented E-Commerce, In: Perspectives in Software Quality, Fourteenth International Conference on Software Quality Management (SQM 2006), pp51-62.
Shelly Saunders, Margaret Ross, Geoff Staples, and Sean Wellington, 2010. Meeting SLA targets with Composite Applications in the Cloud, In: Perspectives in Software Quality, Eighteenth International Conference on Software Quality Management (SQM 2010).
Appendix A2 Other Research Outputs Not Directly Relevant To This Thesis
Software Engineering
Saunders, S., 2008. .NET Development & the IBM WebSphere Portal Server, Dr Dobbs Software Journal 2008
Optoelectronics
Selected papers where the thesis author was the first author or sole author.
Saunders, S., 1993. Wear-out failure mechanism in fused-fibre couplers, Elec Lett, 29 (12)
Saunders, S., 1991. Birefringence control in close-spaced fused-fiber wavelength-division multiplexers: a comparison of three models, Optics, 16 (15)
Saunders, S. et al., 1991. Application of rare-earth doped fibres as lowpass filters in passive optical networks, Elec Lett, 27 (3)
Patents
Selected patents where the thesis author is a named inventor.
Patent Number: US5594578, 1997. Optical communications system including doped optical fiber filter
Patent Number: US5342425, 1994. Fabrication of fused fibre devices
Patent Number: WO9102276, 1991. Fibre Modulators

[Figure labels: Service Providers; B2B Integration via On Demand Service Providers; Enterprise Application Integration]

1.5 Thesis Roadmap

Chapter 2 Analysis of the Problem

Chapter 3 Related Work

3.2 Adaptive Control of Web Applications and Services

3.3 Queuing Theory

3.4 Control Theory

3.5 Combined Approaches

3.6 Solving Optimization Problems

3.7 Concluding Remarks

Chapter 4 Designing SLAs for Composite Applications

4.1 Service Level Management

Service Level Agreement

Service Performance Indicators (KPI)

Service Level Monitoring

Figure 4.1 Service Level Management

Figure 4.2 Composite Web Services

4.2 Service Level Agreement Design

SlaSetDataValidation and SlaSetUsageValidation sections.

Chapter 5 An MVA Performance Model for a SOA

5.2 Performance Requirements

5.3 Business Demand Modelling

[Plot: y-axis Probability of Request]

Figure 5.1 Probability of client requesting each job class

5.4 Workload Characterisation

App 2: 0 0 0 0 0 0 0 0 0 0 6 14 3 0 0 0 0 0 0 7 0 9 0 0 99 0
App 1: 0 2 2 0 1 2 0 0 0 1 2 7 4 2 1 2 0 1 2 6 2 7 6 5 5 0
Web 1: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Figure 5.3 Inter-arrival time distribution from web logs.

Figure 5.5 Inter-arrival times of tasks on the job queue.

[Plot: x-axis Execution Time (s), 0 to 2]

[Plot: x-axis Execution Time (s), 0 to 32]

Figure 5.8 Increase in task service time with load

5.5 Modelling the Application

$X_r$ = throughput of class $r$ customers
$R_{i,r}$ = average response of class $r$ customers at queue $i$

$(\,\cdot\,)_{i,r,j} = \dfrac{j \cdot \mathrm{slope}_{i,r} + \mathrm{intercept}_{i,r}}{\mathrm{slope}_{i,r} + \mathrm{intercept}_{i,r}}$
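The scaling expression above can be read as a multiplier that equals 1 when a single customer is present and grows linearly with the number of customers j. The following is a minimal sketch of that reading, assuming slope and intercept come from a straight-line fit of service time against load; the function name and the numeric values are hypothetical and are not taken from the thesis measurements.

```python
def load_scaling_factor(j: int, slope: float, intercept: float) -> float:
    """Fitted service time with j customers divided by fitted service time with one.

    Assumes service time is modelled as slope * j + intercept (an assumption
    made for this sketch).
    """
    if j < 1:
        raise ValueError("j must be at least 1")
    return (j * slope + intercept) / (slope + intercept)


# Illustrative values only.
factors = [load_scaling_factor(j, slope=0.5, intercept=1.5) for j in (1, 2, 4)]
```

With these illustrative values the factor is 1.0 for one customer and rises to 1.75 for four, so a queue's service demand would be stretched by 75% at that load.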

Take a first guess for the throughput:

$X^{prev}_{0,r} = \min\{\, N_r / (D_{i,r} + R_{0,r}),\ 1 / \max_i\{ D_{i,r} + R_{0,r} \} \,\}$

Estimate the response time:

Estimate the new queue length probabilities:

Re-estimate the throughput:

Prepare for the next iteration

Table 5.1 MVA algorithm
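For orientation, the general shape of a mean-value analysis iteration can be sketched with the textbook exact MVA recursion for a closed network. This is deliberately the simpler single-class, load-independent form and is not the multi-class, load-dependent algorithm of Table 5.1; the function name, parameter names, and example demands are assumptions made for illustration.

```python
from typing import List, Tuple


def exact_mva(demands: List[float], think_time: float,
              n_customers: int) -> Tuple[float, float, List[float]]:
    """Textbook exact MVA for a single-class closed queueing network.

    demands[i] is the service demand (seconds) at queue i.
    Returns (throughput, response_time, mean_queue_lengths).
    """
    queue = [0.0] * len(demands)   # mean queue lengths with zero customers
    throughput = 0.0
    response = 0.0
    for n in range(1, n_customers + 1):
        # An arriving customer sees the queue lengths of the network
        # with one customer fewer (arrival theorem).
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        # Little's law applied to the whole network.
        throughput = n / (think_time + response)
        # Little's law applied to each queue.
        queue = [throughput * r for r in residence]
    return throughput, response, queue


# Hypothetical two-tier example: 0.1 s demand at each tier, no think time.
x, r, q = exact_mva([0.1, 0.1], think_time=0.0, n_customers=2)
```

Each pass estimates response time, re-estimates throughput, and updates queue lengths before the next population size, which mirrors the estimate and re-estimate steps listed above.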

App Servers Tier 1

App Servers Tier 2

Figure 5.9 Queuing Network Model of an N-tier Application

5.6 Management Software

Figure 5.10 An ESB executing a sequence of tasks via Web Services

[Plot: x-axis Response time (s); series Class A, Class B, Class C, Class D, Total]

[Plot: y-axis Difference, (Model - Actual)/Actual as a percentage; categories Class A, Class B, Class C, Class D, Total]

Figure 5.13 Accuracy of the Model

5.8 Using the Model to Predict the Performance of a New Workflow

[Plot: Total Response Time (s) against Load; series Predicted, Actual]

5.9 Summary and Discussion

6.1 Introduction to Genetic Algorithms

6.2 Comparison with Other Techniques

6.3 A GA for the Sample Application

Table 6.1 Example Chromosome Encoding

Decoded: Parent 3 (decoded): 2, 1, 3, -1, 3, 1, 2, 3, 2, 2, 1, 3, -1

where tX is the response time of service X

Then match parent 3's value at this position to offspring B: Offspring 2: + + + + + + + 13 + + + + 11

Continuing:
Offspring A: + 6 + + 5, 1 + 8, 2 + 11 + 13
Offspring B: + 2 + + 8, 6 + 13, 9 + 5 + 11
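The offspring fragments above suggest a crossover in which values are matched between parents position by position. As a point of comparison, here is a minimal sketch of the standard partially mapped crossover (PMX) for permutation-encoded chromosomes. It is a textbook operator swapped in for illustration and may differ from the operator used in the thesis; the function name, cut points, and example parents are hypothetical.

```python
from typing import List, Optional, Sequence


def pmx(parent_a: Sequence[int], parent_b: Sequence[int],
        cut1: int, cut2: int) -> List[int]:
    """Partially mapped crossover for two permutations of the same genes.

    The child copies parent_a[cut1:cut2] and takes every other gene from
    parent_b, relocating genes that would otherwise be duplicated.
    """
    size = len(parent_a)
    child: List[Optional[int]] = [None] * size
    child[cut1:cut2] = parent_a[cut1:cut2]
    copied = set(parent_a[cut1:cut2])

    # Genes in parent_b's segment that were displaced by the copy must be
    # placed by following the mapping until a free position is reached.
    for i in range(cut1, cut2):
        gene = parent_b[i]
        if gene in copied:
            continue
        pos = i
        while cut1 <= pos < cut2:
            pos = parent_b.index(parent_a[pos])
        child[pos] = gene

    # Every remaining position inherits directly from parent_b.
    return [parent_b[i] if g is None else g for i, g in enumerate(child)]


# Hypothetical parents: permutations of 1..9 with a segment at positions 3..6.
a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
b = [9, 3, 7, 8, 2, 6, 5, 1, 4]
child = pmx(a, b, 3, 7)
```

The child is always a valid permutation, which matters when each gene identifies a distinct task or service and duplicates would make the schedule meaningless.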