P.A. Castillo Et Al - Optimisation of Multilayer Perceptrons Using A Distributed Evolutionary Algorithm With SOAP

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 10

Optimisation of Multilayer Perceptrons Using a Distributed Evolutionary Algorithm with SOAP

P.A. Castillo1 , M.G. Arenas1 , J.G. Castellano1 , J.J. Merelo1 , V.M. Rivas2 , and G. Romero1
Department of Architecture and Computer Technology University of Granada. Campus de Fuentenueva. E. 18071 Granada (Spain) 2 Department of Computer Science University of Jan. Avda. Madrid, 35. E. 23071 Jan (Spain) e e pedro@atc.ugr.es
1

Abstract. SOAP (simple object access protocol) is a protocol that allows the access to remote objects independently of the computer architecture and the language. A client using SOAP can send or receive objects, or access remote object methods. Unlike other remote procedure call methods, like XML-RPC or RMI, SOAP can use many dierent transport types (for instance, it could be called as a CGI or as sockets). In this paper an approach to evolutionary distributed optimisation of multilayer perceptrons (MLP) using SOAP and language Perl has been done. Obtained results show that the parallel version of the developed programs obtains similar or better results using much less time than the sequential version, obtaining a good speedup. Also it can be shown that obtained results are better than those obtained by other authors using dierent methods.

Introduction

SOAP is a standard protocol proposed by the W3C ([36], [4]) that extends the remote procedure call, to allow the remote access to objects [9]. SOAP is the evolution of XML-RPC protocol that allows remote procedure call using XML using HTTP as transport protocol. A SOAP client can access remote services using the interface of resident objects in remote servers, using any programming language. At the moment complete implementations of SOAP are available in Perl, Java, Python, C++ and other languages [31]. Opposed to other remote procedure call methods, such as RMI (remote method invocation, used by the Java language) and XML-RPC, SOAP has two main advantages: it can be used with any programming language, and it can use any type of transport (HTTP, SHTTP, TCP, SMTP, POP and other protocols). SOAP sends and receives messages using XML [27,16,6], wrapped HTTPlike headings. SOAP services specify the method interfaces that can be accessed using language WSDL (Web Services Description Language) [30,33]. WSDL is an interface description language that species those calls that can be made to the SOAP server and the response it should return.
J.J. Merelo Guervs et al. (Eds.): PPSN VII, LNCS 2439, pp. 676685, 2002. o c Springer-Verlag Berlin Heidelberg 2002

Optimisation of Multilayer Perceptrons

677

Using a WSDL le, it can be specied a service for dierent languages, so that a Java client can access a Perl server. In some cases, if the client and the server are written using the same language, a WSDL specication is not needed, SOAP main advantages are: it it it it it it it it is is is is is is is is a lightweight protocol simple and extensible used for application communications designed to use HTTP as transport protocol not bound to any components technology not bound to any programming language based on XML and coordinated by the W3C the core of the Microsofts .NET technology

One of the main interests in the knowledge and use of this protocol is its relation with the Microsofts initiative .NET [5], that bases most of its software supplies in remote services, using the SOAP protocol to carry out the communications (although it is Microsofts version, incompatible in some cases with the standard). This must probably means that in the near future there will be many clients and servers oering services and using them, sharing SOAP and the .NET specication as common protocol. The use of SOAP for distributed computation has not been proposed yet, except in some exceptions [20]. Nevertheless, SOAP is a high level protocol, which makes easy the task of distributing objects between dierent servers, without having to worry about the message formats, nor the explicit call to remote servers. In this work we propose using SOAP for distributed computation, and we demonstrate how it could be used for evolutionary computation. A future in which dierent remote computers oer services to the scientic community can be imagined: by example, all the services available at the moment by means of HTML forms could be implemented easily as SOAP services. SOAP could be also used for distributed P2P (Peer to Peer) optimisation, where all the computers act as clients and servers to each other, interchanging population elements. In this paper we intend to explore these abilities, implementing a distributed evolutionary algorithm (EA) using Perl and SOAP, to tune learning parameters and to set the initial weights and hidden layer size of a MLP, based on an EA and Quick Propagation [11] (QP). This paper continues the research on evolutionary optimisation of MLP (G-Prop method) presented in [25,8]. This method leverages the capabilities of two classes of algorithms: the ability of EA to nd a solution close to the global optimum, and the ability of the back-propagation algorithm (BP) to tune a solution and reach the nearest local minimum by means of local search from the solution found by the EA. Instead of using a pre-established topology, the population is initialised with dierent hidden layer sizes, with some specic operators designed to change them (mutation, multi-point crossover, addition and elimination of hidden units, and QP training applied as operator).

678

P.A. Castillo et al.

The EA searches and optimises the architecture (number of hidden units), the initial weight setting for that architecture and the learning rate for that net. The remainder of this paper is structured as follows: in section 2 it is explained how to implement an EA using SOAP and Perl. Section 3 describes the experiments, and section 4 presents the results obtained, followed by a brief conclusion in Section 5.

Distributed Evolutionary Algorithms Implementation Using SOAP and Perl

There are many implementations of distributed genetic algorithms [13], usually using PVM or MPI [17], thus this paper does not intend to innovate in that sense, but in the implementation. There are many ways to implement a distributed genetic algorithm, one of which is the global paralelization (Farming), in which, as Fogarty and Huang propose [12], Abramson and Abela [2], or Hauser and Mnner [26], individual a evaluation and/or genetic operator application are parallelized. A master processor supervises the population and select individuals to mate; then slave processors receive the individuals to evaluate them and to apply genetic operators. Another way to implement paralelization is the migration: the population is divided into small subpopulations of the same size assigned to dierent processors. From time to time each processor selects the best individuals in its subpopulation and it sends them to his nearer processors, receiving as well copies of the best individuals of his neighbours (migration of individuals). All processors replace the worst individuals of their populations. This kind of algorithms is also known as distributed evolutionary algorithms (Tanese [32], Pettey et al. [14], Cant-Paz and Goldberg [7]). . u An ideal client-server implementation of a distributed evolutionary algorithm could be a server process with several threads. Each thread would include a population, and would communicate with other threads through the shared code among them. Each thread would use an own tail of individuals to send to other threads. Each thread would evaluate its individuals in dierent remote computers, carrying out the communication using a SOAP server. As we cannot use a threaded version of Perl, our implementation is based on a ring topology, with N populations (computers) sending individuals to the next population. The parallel algorithm code has been split in two processes: the evolutionary algorithm and the code that shares individuals (migrator) between populations (islands). The system uses asynchronous communications, that, between EA process and migrator process within an island (as well as between migrators within dierent islands) are carried out using SOAP on HTTP transport protocol (migrators act as web servers). Implementation was carried out using the SOAP::Lite module [19] for the Perl programming language, for its stability and the familiarity of the authors with this language. In addition, servers are easy to implement using the computer infrastructure that exists in our department.

Optimisation of Multilayer Perceptrons

679

Fig. 1. Architecture: each computer runs two processes, one of them contains the population (represented as a group of people) and the other (Migrator) acts as transporter, being used to write elements of the population; thus, elements of the population are received from other computers; readings and writings are done by the Migrator object, that abstracts the physical architecture of the application.

SOAP was included from the beginning in the OPEAL EC library, and so far, several distributed evolutionary algorithms congurations have been tested on EC benchmark problems such as MaxOnes and tide [23,18]. 2.1 Evolutionary Optimisation of MLP

The evolutionary algorithm has been implemented using the OPEAL library [22], available at http://opeal.sourceforge.net under GPL license. G-Prop method has been fully described and analysed out in previous papers (see [25,8]), thus we refer to these papers for a full description. In most cases, evolved MLP should be coded into chromosomes to be handled by the genetic operators, however, G-Prop uses no binary codication, instead, the initial parameters of the network are evolved using specic variation operators such as mutation, multi-point crossover, addition and elimination of hidden units, and QP training applied as operator to the individuals of the population. The EA optimises the classication ability of the MLP, and at the same time it searches for the number of hidden units (architecture), the initial weight setting and the learning rate for that net. Although evolved MLP are not coded as bit strings nor other kind of codication (they are made evolve directly) when they are sent from an island to another one, they must be coded as an XML document to be sent using SOAP. The migration policy is as follows: each n generations the algorithm sends to the migrator e individuals, and takes i individuals. Usually, e >= i, in order that always are available individuals so that other islands can take them. A steady state algorithm was used because it was empirically found to be faster at obtaining solutions than other selection algorithms [35]. For each generation, the best 30% individuals of the population are selected to mate using the genetic operators. After 5 generations, several individuals are taken from

680

P.A. Castillo et al.

the migrator, and they are put together with the new ospring that replace the worst individual in the population. Only default parameters listed above have been used. Genetic operators were applied using the same application rate. No parameter tuning has been done, because our aim is to prove how speedup improves as the number of islands grows.

Experiments

The tests used to assess the accuracy of a method must be chosen carefully, because some of them (toy problems) are not suitable for certain capacities of the BP algorithm, such as generalization [10]. Our opinion, along with Prechelt [24], is that, in order to test an algorithm, real world problems should be used. 3.1 Glass

This problem consists of the classication of glass types, and is also taken from [24]. The results of a chemical analysis of glass splinters (percent content of 8 dierent elements) plus the refractive index are used to classify the sample to be either oat processed or non oat processed building windows, vehicle windows, containers, tableware, or head lamps. This task is motivated by forensic needs in criminal investigation. This dataset was created based on the glass problem dataset from the UCI repository of machine learning databases. The data set contains 214 instances. Each sample has 9 attributes plus the class attribute: refractive index, sodium, magnesium, aluminium, silicon, potassium, calcium, barium, iron, and the class attribute (type of glass). The main data set was divided into three disjoint parts, for training, validating and testing. In order to obtain the tness of an individual, the MLP is trained with the training set and its tness is established from the classication error with the validating set. Once the EA is nished (when it reaches the limit of generations), the classication error with the testing set is calculated: this is the result shown. Up to six computers have been used to run the algorithm and to obtain results both in sequential and parallel versions of the program. Computer speeds range from 400 Mhz to 800 Mhz and are connected using the 10Mbits Ethernet network of the University (with a high communication latency). No experiments using homogeneous computer network have been done, because our aim is to demonstrate potential of distributed asynchronous EA using web services. The problem has been studied in the sequential and parallel case using a ring migration scheme: sending some individuals to the next island (computer), and getting the best individuals from the previous island. Each single-computer EA was executed using the parameters shown in Table 1. Each generation 30 new individual are generated (100 generations), thus 3100 individuals are generated each run. Taking into account that 2 up to 6 computers

Optimisation of Multilayer Perceptrons

681

Table 1. List of parameters used to execute the EA that runs in a computer (island). Parameter Value number of generations 100 individuals in the population 100 % of the population replaced 30% hidden units ranging from 2 to 90 BP epochs to calculate tness 300 Table 2. Equivalence between number of computers in parallel runs and number of generations in sequential runs (to generate the same number of individuals). Number of Total number Number of generations islands of individuals (sequential version) 1 3100 100 2 6200 200 3 9300 300 4 12400 400 5 15500 500 6 18600 600

have been used, then 3100 to 18600 individual are created and evaluated each run (depending on the number of computers used). To compare with the sequential version of the method, the sequential EAs have been executed using the number of generations shown in Table 2, so that the number of individuals generated in each run is equivalent to the number of individuals generated using the parallel version:

Results

Time was measured using the Unix time command. Sequential version of the program was run in the faster machine; and in parallel runs, time spent by the faster machine too was taken and used in results shown. With sequential version, 10 simulations were run, and 5 runs with 2, 3, 4, 5 and 6 machines. Each simulation was carried out adding a new computer from the set of computers used previously. Results obtained using the sequential version can be shown in Table 3: Results obtained using the parallel version can be shown in Table 4. As can be seen, better results in time and classication error are obtained dividing the problem between several computers. Figure 2 shows that speedup does not equals the number of computers used; however, simulation time is improved using several computers. Moreover, up to 6 computers in the ring, the speedup grows, and as can be seen in Figure 2 it could continue growing for a higher number of computers. Results could be better if a dedicated communication network was used, however, the University Ethernet network is overloaded and that implies a high latency in communications between processes.

682

P.A. Castillo et al.

Table 3. Results (error % and time) obtained using the sequential version of the method. Generations Error (%) Time (seconds) 100 35 1 20 2 200 35 1 30 5 300 33 2 44 8 400 32 2 70 9 500 32 2 91 4 600 30 3 104 2 Table 4. Results (error % and time) obtained using the parallel version of the method, and the speedup for each experiment. Islands Error (%) Time (seconds) Speedup 1 35 1 20 2 1 2 33 2 18 1 1.6 3 33 1 20 1 2.2 4 32 1 20 2 3.5 5 31 2 22 3 4.1 6 30 1 19 2 5.5
6

speedup
3 2 1 1 2 3 4 Number of computers 5 6

Fig. 2. Plot of the speedup (solid line) and f (x) = x function (dashed line). Although speedup is not lineal, it can be seen that simulation time is improved using several computers. Moreover, up to 6 computers in the ring, the speedup grows, and it could continue growing for a higher number of computers.

Although it is not the aim of this paper to compare the G-Prop method with those of other authors, we do so in order to prove the capacity of both versions of G-Prop (parallel and sequential) to solve pattern classication problems, and how it outperforms other methods. Thus, obtained results (% of error in test) are better than those presented by Prechelt [24] using RPROP [28] (32.08), Grnroos o [15] using a hybrid algorithm (32 5), and Castillo et al. [25] using a previous version of the G-Prop method (31 2).

Optimisation of Multilayer Perceptrons

683

Conclusions and Work in Progress

This paper presents a new research line on parallel-distributed computation using SOAP that shows the useful this new protocol can be in the eld of the evolutionary computation. To implement and use communications using SOAP it is not necessary running virtual machines (as in Java programming), nor daemons, just only to install several libraries available for almost any programming language. More over, if a ring topology is used, an arbitrary number of computers (islands) can be added to the network, making the system more ecient. In these experiments, we have demonstrated that SOAP can be used as communication protocol for distributed evolutionary programming, obtaining a good speedup using a ring topology and adding new computers to the ring. Up to 6 computers in the ring, the speedup grows, and as can be seen in Figure 2 it could continue growing for a higher number of computers. Results could improve using a dedicated communication network instead of the overloaded network of the University. Although PVM or MPI communications add fewer overheads than SOAP communications since they are done at a lower level, SOAP provides a common interface that can be called from almost any programming language. Thus programs can be written in any language and can share data without the need of worrying about the message formats or communication protocols. At the same time, as it is a P2P system, it does not overload too much the network. Other distributed systems, such as Jini [34,21], network trac is so high that when a high number of computers are used, communication becomes impossible. As future research, it is very important adding support for SOAP to existing distributed evolutionary algorithm libraries, such as JEO [3], EO [29], and libraries in other languages, in order to allow the implementation of multilanguage evolutionary algorithms. It would also be of interest to use data structure description protocols, such as WSDL, to describe data to be evolved, in order to allow using them by any program that understands XML, and be able to send them easily using SOAP. Another possibility is to test true P2P architectures, where each computer communicates only with one or two computers in the network. It would be very interesting to test asynchronous evolutionary algorithms using random topologies, in such a way that a servent (server/client) can enter or leave the network at any moment.

Acknowledgements
This work has been supported in part by projects CICYT TIC-1999-0550, INTAS-9730950 and IST-1999-12679.

684

P.A. Castillo et al.

References
1. Actas XII Jornadas de Paralelismo. Universidad Politcnica de Valencia, 2001. e 2. Abramson; Abela J. A. Parallel genetic algorithm for solving the school timetabling problem. In Proceedings of the Fifteenth Australian Computer Science Conference (ACSC-15), vol. 14, p.1-11, 1992. 3. M.G. Arenas, L. Foucart, J.J. Merelo, and P. A. Castillo. Jeo: a framework for evolving objects in java. In Actas Jornadas de Paralelismo [1]. 4. Paco Avila. SOAP: revoluci=n en la red. Linux actual, (19):5559, 2001. 5. K. Ballinger, J. Hawkins, and P. Kumar. SOAP in the microsoft .NET framework and visual Studio.NET. Available from http://msdn.microsoft.com/library/techart/Hawksoap.htm. 6. D. Box. Inside SOAP. Available from http://www.xml.com/pub/a/2000/02/09/feature/index.html. 7. E. Cant-Paz and D. E. Goldberg. Modeling idealized bounding cases of parallel u genetic algorithms. In Koza J., Deb K., Dorigo M., Fogel D., Garz0n M., Iba H., Riolo R. Eds. Genetic Programming 1997: Proceedings of the Second Annual Conference, Morgan Kaufmann (San Francisco. CA), 1997. 8. P. A. Castillo, J. J. Merelo, V. Rivas, G. Romero, and A. Prieto. G-Prop: Global Optimization of Multilayer Perceptrons using GAs. Neurocomputing, Vol.35/1-4, pp.149-163, 2000. 9. DevelopMentor. SOAP Frequently Asked Questions. Available from http://www.develop.com/soap/soapfaq.htm. 10. S. Fahlman. An empirical study of learning speed in back-propagation networks. Technical report, Carnegie Mellon University, 1988. 11. S.E. Fahlman. Faster-Learning Variations on Back-Propagation: An Empirical Study. Proceedings of the 1988 Connectionist Models Summer School, Morgan Kaufmann, 1988. 12. T. Fogarty and R. Huang. Implementing the genetic algorithm on transputer based parallel processing systems. Parallel Problem Solving From Nature,p.145-149, 1991. 13. David E. Goldberg. Genetic Algorithms in search, optimization and machine learning. Addison Wesley, 1989. 14. C. B. Pettey; M. R. Leuze; J. J. Grefenstette. A parallel genetic algorithm. In J. J. Grefenstette (Ed.), Proceedings of the Second International Conference on Genetic Algorithms, Lawrence Erlbaum Associates, pp. 155-162, 1987. 15. M.A. Grnroos. Evolutionary Design of Neural Networks. Master of Science Thesis o in Computer Science. Dept. of Mathematical Sciences. University of Turku, 1998. 16. Elliotte Rusty Harold. XML Bible. IDG Books worldwide, 1991. 17. J.G.Castellano, M.Garc a-Arenas, P.A.Castillo, J.Carpio, M.Cillero, J.J.Merelo, A.Prieto, V.Rivas, and G.Romero. Objetos evolutivos paralelos. In Universidad de Granada Dept. ATC, editor, XI Jornadas de Paralelismo, pages 247252, 2000. 18. J.J.Merelo, J.G.Castellano, and P.A.Castillo. Algoritmos evolutivos P2P usando SOAP. pages 3137. Universidad de Extremadura, Febrero 2002. 19. P. Kuchenko. SOAP::Lite. Available from http://www.soaplite.com. 20. D. Marcato. Distributed computing with soap. Available from http://www.devx.com/upload/free/features/vcdj/2000/04apr00/dm0400/dm0400.asp. 21. J. Atienza; M. Garc J. Gonzlez; J. J. Merelo. Jenetic: a distributed, nea; a grained, asynchronous evolutionary algorithm using jini. pages 10871089, 2000. ISBN: 0-9643456-9-2.

Optimisation of Multilayer Perceptrons

685

22. J. J. Merelo. OPEAL, una librer de algoritmos evolutivos. Actas del Primer a Congreso Espaol de Algoritmos Evolutivos y Bioinspirados. ISBN:84-607-3913-9. n pp.54-59. Mrida, Spain, febrero, 2002. e 23. J. J. Merelo, J.G. Castellano, P.A. Castillo, and G. Romero. Algoritmos gen ticos distribuidos usando soap. In Actas Jornadas de Paralelismo [1]. 24. Lutz Prechelt. PROBEN1 A set of benchmarks and benchmarking rules for neural network training algorithms. Technical Report 21/94, Fakultt a fr Informatik, Universitt Karlsruhe, D-76128 Karlsruhe, Germany. (Also in: u a http://wwwipd.ira.uka.de/prechelt/), 1994. 25. P.A. Castillo; J. Carpio; J. J. Merelo; V. Rivas; G. Romero; A. Prieto. Evolving multilayer perceptrons. Neural Processing Letters, 12:115127, October 2000. 26. Hauser R.; Mnner R. Implementation of standard genetic algorithm on mimd a machines. In Davidor Y., Schwefel H. P., Mnner R., Eds., Parallel Problem a Solving from Nature, PPSN III, p. 504-513, Springer-Verlag (Berlin), 1994. 27. Erik T. Ray. Learning XML: creating self-describing data. OReilly, January 2001. 28. M. Riedmiller and H. Braun. A Direct Adaptive Method for Faster Backpropagation Learning: The RPROP Algorithm. In Ruspini, H., (Ed.) Proc. of the ICNN93, San Francisco, pp. 586-591, 1993. 29. M. Keijzer; J. J. Merelo; G. Romero; and M. Schoenauer. Evolving objects: a general purpose evolutionary computation library. Springer-Verlag, October 2001. 30. A. Ryman. Understanding web services. Available from http://www7.software.ibm.com/vad.nsf/Data/Document4362?OpenDocument&p=1&BCT=1&Footer=1. 31. soaprpc.com. SOAP software. Available from http://www.soaprpc.com/software. 32. R. Tanese. Parallel genetic algorithms for a hypercube. In J. J. Grefenstette (Ed.), Proceedings of the second International Conference on Genetic Algorithms, Lawrence Erlbaum Associates, pp. 177-184, 1987. 33. V. Vasudevan. A web services primer. Available from http://www.xml.com/pub/a/2001/04/04/webservices/index.html. 34. Bill Venners. Jini FAQ (frequently asked questions). Available from http://www.artima.com/jini/faq.html. 35. D. Whitley. The GENITOR Algorithm and Selection Presure: Why rank-based allocation of reproductive trials is best. in J.D. Schaer (Ed.), Proceedings of The Third International Conference on Genetic Algorithms, Morgan Kaumann, Publishers, 116-121, 1989. 36. D. Box; D. Ehnebuske; G. Kakivaya; A. Layman; N. Mendelsohn; H.F. Nielsen; S. Thatte; D. Winer. Simple Object Access Protocol (SOAP) 1.1, W3C Note 08 May 2000. Available from http://www.w3.org/TR/SOAP.

You might also like